How to Install HAProxy Load Balancer on CentOS
Load balancing is a common solution for distributing web applications horizontally across multiple hosts while providing users with a single point of access to the service. It aims to optimize resource usage, maximize throughput, minimize response time, and avoid overloading any single resource. HAProxy is one of the most popular open-source load balancing applications, and it also offers high availability and proxy functionality. It is available on many Linux distributions as well as on Solaris and FreeBSD.
HAProxy is particularly suited for very high traffic websites and is therefore often used to improve web service reliability and performance in multi-server configurations. This guide lays out the steps for setting up HAProxy as a load balancer on its own CentOS 7 cloud host, which then directs the traffic to your web servers.
As a prerequisite for following this guide, you'll need a minimum of two servers with a basic web service such as Apache2 or httpd installed and running, plus a third server that will act as the load balancer and on which you'll install HAProxy.
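If you want to quickly confirm this prerequisite, you could, for example, request each web server's front page from the load balancer host with curl; the address below is just a placeholder for your own back-end server's private IP.
# Replace the placeholder with a back-end server's private IP; a 200 OK response header means the web service is up
curl -I http://<private IP of web server>/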
Installing HAProxy 1.6
As HAProxy is a fast-developing open-source application, the version available in the default CentOS repositories might not be the latest release. To find out what version number is offered through the official channels, enter the following command
sudo yum info haproxy
HAProxy always has three active stable releases: the two latest versions under development plus a third, older version that still receives critical updates. You can always check the newest stable version listed on the HAProxy website and then decide which version you wish to go with.
In this guide we'll be installing the latest stable version, 1.6, which is not yet available in the standard repositories. Instead you'll need to build it from source, so first check that you have the prerequisites for downloading and compiling the program.
sudo yum install wget gcc pcre-static pcre-devel -y
Download the source code with the command below. You can check if there's a newer version available at the HAProxy download page and then replace the download link in the wget command with the latest.
wget http://www.haproxy.org/download/1.6/src/haproxy-1.6.3.tar.gz -O ~/haproxy.tar.gz
Once the download is complete, extract the files using the following
tar xzvf ~/haproxy.tar.gz -C ~/
Change into the directory.
cd ~/haproxy-1.6.3
Then compile the program for your system.
make TARGET=linux2628
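If you are compiling on a multi-core host, you could optionally speed up the build by letting make run parallel jobs, for example:
# Optional: run one build job per available CPU core
make -j $(nproc) TARGET=linux2628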
And finally install HAProxy itself.
sudo make install
To complete the installation, use the following commands to copy the binary and the init script into place.
sudo cp /usr/local/sbin/haproxy /usr/sbin/
sudo cp ~/haproxy-1.6.3/examples/haproxy.init /etc/init.d/haproxy
sudo chmod 755 /etc/init.d/haproxy
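If you also want HAProxy to start automatically at boot, you could additionally register the init script with the service manager; on CentOS 7 the following commands should do it, though this step is optional.
# Optional: register the init script and enable HAProxy at boot
sudo chkconfig --add haproxy
sudo chkconfig haproxy on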
Create these directories and the statistics file for HAProxy to record in.
sudo mkdir -p /etc/haproxy
sudo mkdir -p /run/haproxy
sudo mkdir -p /var/lib/haproxy
sudo touch /var/lib/haproxy/stats
Then add a new user for HAProxy.
sudo useradd -r haproxy
After the installation you can double check the installed version number with the following
sudo haproxy -v
HA-Proxy version 1.6.3 2015/12/25
Copyright 2000-2015 Willy Tarreau
In this case the version is 1.6.3, as shown in the example output above. If you are building your own HAProxy Docker image, this step is as far as you need to go. One way to build the image is to do the work inside a container and then generate the image with commit; the other is to generate it from a Dockerfile. Note, however, that the command entry in the instructor's docker-compose.yml needs to be changed to haproxy for it to take effect.
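As a rough sketch of the two image-building approaches mentioned above (the container name and image tag are just placeholders, not taken from the course material):
# Approach 1: install HAProxy inside a running container, then save the container as an image
docker commit my_haproxy_container my_haproxy:1.6.3
# Approach 2: build the image from a Dockerfile in the current directory
docker build -t my_haproxy:1.6.3 .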
Configuring the load balancer
Setting up HAProxy for load balancing is a quite straightforward process. Essentially, all you need to do is tell HAProxy what kind of connections it should listen for and which servers it should relay the connections to. This is done by creating a configuration file /etc/haproxy/haproxy.cfg with the defining settings. You can read about the configuration options in the HAProxy documentation if you wish to find out more.
Open the configuration file for editing, for example using vi with the following command
sudo vi /etc/haproxy/haproxy.cfg
Add the following sections to the file. Replace the <server name> and <private IP> placeholders with your own server names and private IP addresses.
global
log /dev/log local0
log /dev/log local1 notice
chroot /var/lib/haproxy
stats socket /run/haproxy/admin.sock mode 660 level admin
stats timeout 30s
user haproxy
group haproxy
daemon
defaults
log global
mode http
option httplog
option dontlognull
timeout connect 5000
timeout client 50000
timeout server 50000
frontend http_front
bind *:80
stats uri /haproxy?stats
default_backend http_back
backend http_back
balance roundrobin
server <server name> <private IP>:80 check
server <server name> <private IP>:80 check
This defines a layer 4 load balancer with a front end named http_front listening on port 80, which then directs the traffic to the default back end named http_back. The additional stats uri /haproxy?stats enables the statistics page at that specified address. Configuring the servers in the back-end section allows HAProxy to use those servers for load balancing, whenever available, according to the roundrobin algorithm.
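For example, with two hypothetical back-end hosts named web1 and web2 on private addresses 10.0.8.11 and 10.0.8.12, the back-end section could be filled in like this (replace the names and addresses with your own):
backend http_back
balance roundrobin
server web1 10.0.8.11:80 check
server web2 10.0.8.12:80 check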
The balancing algorithm is used to decide which back-end server each connection is transferred to. Some of the useful options include the following (see the example after this list for how to select one):
- Roundrobin: Each server is used in turns according to their weights. This is the smoothest and fairest algorithm when the servers' processing time remains equally distributed. This algorithm is dynamic, which allows server weights to be adjusted on the fly.
- Leastconn: The server with the lowest number of connections is chosen. Round-robin is performed between servers with the same load. Using this algorithm is recommended with long sessions, such as LDAP, SQL, TSE, etc., but it is not very well suited for short sessions such as HTTP.
- First: The first server with available connection slots receives the connection. The servers are chosen from the lowest numeric identifier to the highest, which defaults to the server's position in the farm. Once a server reaches its maxconn value, the next server is used.
- Source: The source IP address is hashed and divided by the total weight of the running servers to designate which server will receive the request. This way the same client IP address will always reach the same server while the servers stay the same.
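For instance, switching the same back end to the leastconn algorithm only requires changing the balance line, with everything else left as it is (the server names and addresses below are the same hypothetical examples as above):
backend http_back
balance leastconn
server web1 10.0.8.11:80 check
server web2 10.0.8.12:80 check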
Another possibility is to configure the load balancer to work on layer 7, which is useful when parts of your web application are located on different hosts. This can be accomplished by conditioning the connection transfer, for example, on the URL.
frontend http_front
bind *:80
stats uri /haproxy?stats
acl url_blog path_beg /blog
use_backend blog_back if url_blog
default_backend http_back
backend http_back
balance roundrobin
server <server name> <private IP>:80 check
server <server name> <private IP>:80 check
backend blog_back
server <server name> <private IP>:80 check
The front end declares an ACL rule named url_blog that applies to all connections whose path begins with /blog, and use_backend defines that connections matching the url_blog condition should be served by the back end named blog_back.
On the back-end side, the configuration sets up two server groups: http_back as before, and a new one called blog_back that specifically serves connections to domain.com/blog.
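The same mechanism extends to other routing rules as well. As a purely illustrative sketch, the two lines below added to the front-end section would send a hypothetical /static path to its own back end, together with a matching backend static_back section defined like the others (the path and back-end name are assumptions, not part of this guide's setup):
acl url_static path_beg /static
use_backend static_back if url_static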
After making the configurations, save the file and restart HAProxy with the following
sudo systemctl restart haproxy
If you get any errors or warnings at startup, check the configuration for typos and make sure you have created all the necessary files and directories, then try restarting again.
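A convenient way to catch configuration mistakes before restarting is HAProxy's built-in configuration check mode:
# Validate the configuration file without starting the service
sudo haproxy -c -f /etc/haproxy/haproxy.cfg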
Testing the setup
With the HAProxy configured and running, open your load balancer server's public IP in a web browser and check that you get connected to your back-end correctly. The parameter stats uri in the configuration enables the statistics page at the defined address.
http://<load balancer public IP>/haproxy?stats
When you load the statistics page and all of your servers are listed in green, your configuration was successful!
In case your load balancer does not reply, check that HTTP connections are not getting blocked by the firewall. Since you most likely deployed a fresh install of CentOS 7 for this project, the host is rather restrictive by default. You can use the following commands to add the necessary rules and to reload the firewall.
sudo firewall-cmd --permanent --zone=public --add-service=http
sudo firewall-cmd --permanent --zone=public --add-port=8181/tcp
sudo firewall-cmd --reload
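You can then verify that the rules were applied by listing the active configuration of the public zone:
# Show the currently active rules in the public zone
sudo firewall-cmd --zone=public --list-all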
The statistics page contains some helpful information to keep track of your web hosts including up- and downtimes and session counts. If a server is listed in red, check that the server is powered on and that you can ping it from the load balancer machine.
Having the statistics page exposed on the front end like this, however, leaves it publicly open for anyone to view, which might not be such a good idea. Instead, you can set it up on its own port number by adding the example below to the end of your haproxy.cfg file. Replace the username and password with something secure.
listen stats
bind *:8181
stats enable
stats uri /
stats realm Haproxy\ Statistics
stats auth username:password
After adding the new listen group, remove the old stats uri reference from the frontend section. When done, save the file and restart HAProxy again.
sudo systemctl restart haproxy
Then open the load balancer again with the new port number, and log in with the username and password you set in the configuration file.
http://<load balancer public IP>:8181
Check that your servers are still reporting all green and then open just the load balancer IP without any port numbers on your web browser.
http://<load balancer public IP>/
If your two back-end servers have at least slightly different landing pages, you'll notice that each time you reload the page you get the reply from a different host. You can try out the different balancing algorithms listed in the configuration part of this article, or take a look at the full documentation for HAProxy 1.6.
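If you prefer the command line, you could also watch the round robin in action with a small curl loop; the grep pattern below is only an example and assumes the landing pages contain a title tag that differs between hosts.
# Request the front page a few times and print a line that identifies each back end
for i in 1 2 3 4; do curl -s http://<load balancer public IP>/ | grep -i "<title>"; done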
Conclusions
Congratulations on successfully configuring HAProxy! With a basic load balancer setup you can considerably increase your web application's performance and availability. This guide is, however, just an introduction to load balancing with HAProxy, which is capable of much more than could be covered in first-time setup instructions. We recommend experimenting with different configurations with the help of the extensive documentation available for HAProxy, and then start planning the load balancing for your production environment.
While using multiple hosts protects your web service with redundancy, the load balancer itself can still be a single point of failure. You can further improve high availability by setting up a floating IP between multiple load balancers. You can find out more about this in our article on floating IPs on UpCloud.