Deploying an HAProxy Load Balancer on CentOS 6

Overview

HAProxy is a free and open-source Linux application used for load balancing network traffic. This tutorial will guide you through deploying it for both simple web applications and large, complex websites.

Load Balancing Concepts

Depending on the complexity of your needs, balancing can be accomplished in many ways using different techniques. One of the first things you’ll need to decide is whether to use layer 4 or layer 7 balancing. Each solves a different problem, so it’s very important to understand your actual requirements before settling on a load balancer configuration.

Layer 4

The lower you are on the network stack, the faster you will be able to process requests. The caveat is that the extra performance comes at the cost of features. Setting aside features you may find in advanced enterprise-ready load balancers, you can’t beat the performance of layer 4 balancing. It works by modifying the destination IP and MAC address in every packet’s header to forward it to one of the available nodes, which requires very little processing.

Layer 4 web server load balancing

The biggest downside to this method of balancing is that your nodes must host every component of your application (PHP, Java, CSS, JavaScript, images, etc.), and the application files on each node must be the exact same versions. Otherwise, your users’ experience will differ each time they access your website.

Layer 4 balancing is ideal for small websites or those with few complexities.
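
As a rough illustration of what a layer 4 setup can look like in HAProxy itself, the sketch below writes a minimal TCP-mode section to a scratch file. The section name web_l4, the node names, and the 192.0.2.x addresses are placeholders invented for this example, and the round-robin algorithm is an assumption rather than something this tutorial prescribes.

```shell
# Write a hypothetical layer 4 (TCP mode) section to a scratch file.
# All names and addresses below are placeholders, not values from this tutorial.
l4_cfg=$(mktemp)
cat > "$l4_cfg" <<'EOF'
listen web_l4
        bind *:80
        mode tcp
        balance roundrobin
        server node1 192.0.2.11:80 check
        server node2 192.0.2.12:80 check
EOF

# Show the result for review before merging it into a real configuration.
cat "$l4_cfg"
```

In TCP mode HAProxy never looks at the HTTP request, which is why every node has to be able to serve the whole application.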

Layer 7

Layer 7 balancing may require more horsepower to process, but what it offers in return for large and complex websites is worth the extra CPU time. This type of balancing allows you to separate every component onto its own server, and it does so transparently to the user. Another benefit is being able to move heavily accessed areas of your website onto separate servers. For example, maybe you’re hosting a news website that also has forums where users can discuss the stories. Given enough popularity, the forum may cause other areas of your website to become sluggish or inaccessible. Using layer 7, you could move your forum onto its own server or server group and scale it out independently.

Layer 7 load balancing

Layer 7 works by analyzing the application request carried in the traffic and matching it against a set of policies or rules. For a web server, it looks at the HTTP request to identify the action and destination URI. Using the news website with a forum from the example above, if an HTTP GET or POST request is submitted for a path starting with /forum, the traffic will be routed to the servers hosting the forum application, alleviating stress on the news part of the site.
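
The /forum rule described above could be expressed in HAProxy roughly as follows. This is only a sketch written to a scratch file: the frontend and backend names, the ACL name is_forum, and the 192.0.2.x addresses are all made up for illustration.

```shell
# Write a hypothetical layer 7 (HTTP mode) routing example to a scratch file.
# Section names and 192.0.2.x addresses are placeholders for illustration only.
l7_cfg=$(mktemp)
cat > "$l7_cfg" <<'EOF'
frontend www
        bind *:80
        mode http
        # Match any request whose path begins with /forum
        acl is_forum path_beg /forum
        use_backend forum_servers if is_forum
        default_backend news_servers

backend forum_servers
        mode http
        server forum1 192.0.2.21:80 check

backend news_servers
        mode http
        server news1 192.0.2.31:80 check
EOF

# Show the result for review before merging it into a real configuration.
cat "$l7_cfg"
```

Requests that do not match the ACL fall through to default_backend, so the news servers handle everything except the forum.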

Installing HAProxy

HAProxy isn’t available in the default repositories for CentOS or Red Hat. To install it, we need to either compile it from source (preferred) or add the EPEL repository to our server and install it using Yum.

Installing from EPEL

  1. Download the EPEL repository RPM.
    wget http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
  2. Install the EPEL repository RPM to add its Yum configuration file to your server.
    rpm -ivh epel-release-6-8.noarch.rpm
  3. Install HAProxy.
    yum install haproxy
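
The download and install commands have to refer to the same file name. One way to keep them in sync, sketched here, is to derive the file name from the URL. The function name install_haproxy_from_epel is hypothetical, and the -y flag (answer yes to Yum prompts) is an addition not used in the steps above.

```shell
# URL taken from step 1 above.
epel_url="http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm"

# Derive the local file name from the URL so both commands always agree.
epel_rpm=$(basename "$epel_url")
echo "Package file: $epel_rpm"

# Hypothetical helper wrapping the three steps; run it as root on the server.
# It is only defined here, not called.
install_haproxy_from_epel() {
    wget "$epel_url" &&
    rpm -ivh "$epel_rpm" &&
    yum install -y haproxy
}
```

Calling install_haproxy_from_epel on the server runs the steps in order and stops at the first one that fails.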

Installing from Source

This method is best for optimizing the binaries for your hardware. However, you’ll need compilers and development libraries installed. It’s recommended that you compile the binaries on a separate computer with the same CPU architecture and then move them over to your server.

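  1. Install development tools. On CentOS, the Yum package group below is one way to get them.
    yum groupinstall "Development Tools"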
  2. Download the latest source version of haproxy. At the time of this writing, version 1.4.24 was out.
    wget http://haproxy.1wt.eu/download/1.4/src/haproxy-1.4.24.tar.gz
  3. Untar the downloaded file.
    tar xvzf haproxy-1.4.24.tar.gz
  4. Navigate into the new haproxy directory.
    cd ./haproxy-1.4.24
  5. Compile HAProxy. To optimize the binaries for your hardware, there are a few options you should specify. Use one of the following, depending on what you are running.
    1. Compile for Linux kernel 2.6.28 or later and optimize the binaries for the install CPU architecture. 
      make TARGET=linux2628 ARCH=native
    2. Compile for Linux kernel 2.6 or later and optimize the binaries for i386 CPU architecture. 
      make TARGET=linux26 ARCH=i386
  6. Install the newly compiled binary.
    make install
  7. Copy the haproxy binary to /usr/sbin.
    cp /usr/local/sbin/haproxy /usr/sbin/
  8. Copy the example init file for haproxy to the init directory. The path below assumes the source was extracted in /root; adjust it if you extracted it elsewhere.
    cp /root/haproxy-1.4.24/examples/haproxy.init /etc/init.d/haproxy
  9. Modify the file permissions of the init file.
    chmod 755 /etc/init.d/haproxy
  10. Create a user for haproxy.
    useradd --system haproxy
  11. Create an haproxy /etc directory to store configuration files.
    mkdir -p /etc/haproxy
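
The choice of TARGET in step 5 depends on the kernel version. A small helper along these lines could pick between the two targets listed above. The name pick_target is hypothetical, and the sketch assumes a kernel of version 2.6 or newer with a numeric version string.

```shell
# Print the HAProxy build target for a given kernel release string.
# Assumes a 2.6 or newer kernel; older kernels are not handled.
pick_target() {
    # Strip everything after the first "-" (for example "-358.el6.x86_64").
    ver=${1%%-*}
    major=$(echo "$ver" | cut -d. -f1)
    minor=$(echo "$ver" | cut -d. -f2)
    patch=$(echo "$ver" | cut -d. -f3)
    patch=${patch:-0}

    if [ "$major" -gt 2 ] ||
       { [ "$major" -eq 2 ] && [ "$minor" -gt 6 ]; } ||
       { [ "$major" -eq 2 ] && [ "$minor" -eq 6 ] && [ "$patch" -ge 28 ]; }; then
        echo linux2628
    else
        echo linux26
    fi
}

# Example with a fixed release string.
pick_target 2.6.32-358.el6.x86_64
```

On the build machine, the helper could be combined with uname, for example: make TARGET="$(pick_target "$(uname -r)")"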

Configuring HAProxy

There are a few base settings that should be configured before we move on to creating load balance clusters. If you installed HAProxy using Yum, a lot of the defaults are preset. If you compiled from source, there is no configuration file and you need to create one.

Base Configuration

  1. Open the existing file or create a new haproxy configuration file.
    vi /etc/haproxy/haproxy.cfg
  2. The first section of the configuration file is for global properties. The following is an example of a basic configuration.
    global
            user    haproxy
            group   haproxy
            nbproc  1
    user: The user account from the local passwd file that HAProxy will run as.
    group: The group that HAProxy will run as.
    nbproc: The number of processes HAProxy will start. It is strongly recommended that you set this value to 1.
  3. Below the global section there can be a defaults section. This sets default parameters for the proxy sections that follow it.
    defaults
            mode http
            timeout connect 5000ms
            timeout client 50000ms
            timeout server 50000ms
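    mode: The protocol the proxies operate in. Use http for layer 7 balancing or tcp for layer 4 balancing.
    timeout connect: The maximum time to wait for a connection attempt to a server to succeed (5 seconds here).
    timeout client: The maximum time a client may remain inactive before the connection times out (50 seconds here).
    timeout server: The maximum time a server may remain inactive before the connection times out (50 seconds here).
  4. Save your changes to the haproxy configuration file. We're now ready to move on to creating our layer 4 and/or layer 7 load balance clusters.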
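
Put together, the global and defaults sections from the steps above form a file like the one this sketch writes to a scratch location. The optional check at the end uses HAProxy's -c flag to parse the file without starting the service. Treat its verdict as informational only, since the file has no listeners yet and the haproxy user may not exist on the machine running the check.

```shell
# Assemble the global and defaults sections from the steps above in a scratch file.
base_cfg=$(mktemp)
cat > "$base_cfg" <<'EOF'
global
        user    haproxy
        group   haproxy
        nbproc  1

defaults
        mode http
        timeout connect 5000ms
        timeout client 50000ms
        timeout server 50000ms
EOF

# If haproxy is installed, ask it to parse the file without starting (-c).
if command -v haproxy >/dev/null 2>&1; then
    haproxy -c -f "$base_cfg" || echo "haproxy reported a problem (see above)"
fi
```

Once the content looks right, it can be copied to /etc/haproxy/haproxy.cfg.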

Conclusion

Our server now has HAProxy installed with a base configuration. However, it is not balancing anything yet. Use the tutorials below to continue configuring the type of load balancing your application requires.

Tutorials in this series:

  1. Deploying an HAProxy Load Balancer on CentOS 6
  2. Layer 4 Load Balancing with HAProxy
  3. Layer 7 Load Balancing with HAProxy