This blog post is part of a series about how to deploy a Django app to AWS. If you haven't deployed your app to EC2 and RDS yet, I recommend reading the AWS EC2 and AWS RDS posts first.
So your website's getting so much traffic that you're ready to scale your web app to multiple servers? Great! In this post, we'll walk through how to prepare our single EC2 server for scalability, create a template from it to spin up more instances, and create an AWS Application Load Balancer (ALB) to distribute traffic amongst them. Let's get started!
01 Move Logging to AWS CloudWatch
At the moment, our EC2 instance logs to its own hard drive. This won't scale because each server will have its own logs, and we'll have a hard time aggregating them to find bugs or anomalies. Instead, we'll stream our logs to AWS CloudWatch so they're stored in a centralized location.
First, let's grant permissions to our EC2 machine to stream logs to CloudWatch:
- If you don't already have an IAM role for your EC2 machines, create a new role for sending metrics to CloudWatch called web-server. Give your new (or existing) role the permission called CloudWatchAgentServerPolicy.
- Attach the role to our EC2 instance if you haven't already. Go to EC2 > Click on the instance > Actions > Security > Modify IAM role > Attach the new role.
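If you prefer to script these IAM steps, here's a minimal boto3 sketch under the same assumptions (a role and matching instance profile both named web-server; the instance ID is a placeholder):

import boto3

iam = boto3.client('iam')
ec2 = boto3.client('ec2')

# Attach the managed CloudWatch agent policy to the role
iam.attach_role_policy(
    RoleName='web-server',
    PolicyArn='arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy',
)

# Associate the role's instance profile with our instance
ec2.associate_iam_instance_profile(
    IamInstanceProfile={'Name': 'web-server'},
    InstanceId='i-0123456789abcdef0',  # placeholder: your instance's ID
)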
Next, let's install and configure CloudWatch Agent on our EC2 machine to stream logs to CloudWatch.
- SSH into your EC2 machine.
- Run the following commands to install the CloudWatch Agent and launch its configuration wizard:

sudo apt install collectd amazon-cloudwatch-agent
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard
- Say yes to all defaults except "Do you want to store the config in the SSM parameter store?" and "Do you want the CloudWatch agent to also retrieve X-ray traces?"
- Also say yes to "Do you want to monitor any log files?" and add our log files, which should look something like these:
/home/ubuntu/my-public-repo/myapp/myapp.log
/var/log/nginx/access.log
/var/log/nginx/error.log
/var/log/gunicorn/gunicorn.log
/var/log/gunicorn/gunicorn-err.log
When the wizard asks how long to retain each log group, note that a value of -1 means "never delete."
The wizard will generate a configuration file stored at /opt/aws/amazon-cloudwatch-agent/bin/config.json. Note that our configuration is also saved to AWS Systems Manager > Parameter Store for us. When we launch the CloudWatch Agent, we can choose to launch it with a local JSON file or by fetching a config from the Parameter Store. Fetching from the Parameter Store is considered best practice, but for the simplicity of this tutorial we're using the local file. If we ever did want to fetch a config from the Parameter Store, we would use -c ssm:<parameterName> below instead.
- Finally, start the CloudWatch agent using:
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json -s
Great! Our CloudWatch Agent is now streaming our web server's logs to CloudWatch. We can check the status of the agent on any of our EC2 machines at any time using amazon-cloudwatch-agent-ctl -a status. If we navigate to AWS CloudWatch > Log groups, we should see our logs streaming in. We now have a scalable way to monitor our web servers' usage.
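Once logs are flowing, we can also query them from code instead of the console. Here's a rough boto3 sketch that prints recent nginx errors (assuming a log group named nginx-error.log, matching the one we configure in our launch template later; adjust to whatever names your wizard produced):

import boto3

logs = boto3.client('logs', region_name='us-west-1')

# Pull the 50 most recent events from the centralized nginx error log
response = logs.filter_log_events(
    logGroupName='nginx-error.log',
    limit=50,
)
for event in response['events']:
    print(event['timestamp'], event['message'])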
02 Move Secrets to AWS Secrets Manager
Currently, our secrets (such as database password, API keys, etc.) might be stored as environment variables on our EC2 machine. This also won't scale, since creating, modifying, or deleting these secrets can get messy, especially as we increase the number of servers. More importantly, this also isn't a secure way to store secrets. Instead, we'll use AWS Secrets Manager to store them in a centralized location where it's easier to manage access.
First, let's grant our EC2 machine access to Secrets Manager:
- Navigate to the AWS IAM console and select Roles.
- If you don't already have an IAM role for your EC2 machines, create a new role for access to Secrets Manager called web-server. Give your new (or existing) role the permission called SecretsManagerReadWrite.
- Attach the role to the EC2 instance if you haven't already. Go to EC2 > Click on the instance > Actions > Security > Modify IAM role > Attach the new role.
Now, let's create a test secret in Secrets Manager so we can try reading it from Django on our EC2 machine:
- Navigate to AWS Secrets console and select Store a new secret.
- Select Other type of secret.
- Give it a key, such as MYSECRET, and a value, such as MYVALUE, then select Next.
- Give it a readable name, such as my-secret, then select Next.
- On the final page, we are shown some sample code for accessing our new secret. Select Next.
Now that our secret is created, let's access it from Django. Keep in mind that good practice mandates using different secrets for development vs. production, e.g. different API keys or database passwords. With that in mind, modify settings.py to look like the following:
import json
import os

import boto3

if DEBUG:
    MYSECRET = os.environ['MYSECRET']
else:
    session = boto3.session.Session()
    client = session.client(
        service_name='secretsmanager',
        region_name='us-west-1'
    )
    response = client.get_secret_value(SecretId='my-secret')
    secrets = json.loads(response['SecretString'])
    MYSECRET = secrets['MYSECRET']
Great! Now we have access to our secrets in settings.py in a scalable way.
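As the number of secrets grows, it can help to wrap this logic in a small helper so settings.py stays readable. A minimal sketch, assuming it lives in settings.py where DEBUG is defined and uses the same my-secret store as above (the get_secret name and module-level cache are our own convention, not an AWS API):

import json
import os

import boto3

_secrets_cache = None

def get_secret(key):
    """Return a secret from the environment (dev) or Secrets Manager (prod)."""
    global _secrets_cache
    if DEBUG:
        return os.environ[key]
    if _secrets_cache is None:  # fetch the secret store only once per process
        client = boto3.session.Session().client(
            service_name='secretsmanager',
            region_name='us-west-1',
        )
        response = client.get_secret_value(SecretId='my-secret')
        _secrets_cache = json.loads(response['SecretString'])
    return _secrets_cache[key]

MYSECRET = get_secret('MYSECRET')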
03 Create an EC2 Launch Template
Great! Now that our EC2 machine is ready for scaling, let's create a template so that we can quickly launch more EC2 instances without having to painstakingly set each one up manually. There are a couple of strategies for this, such as creating a new Amazon Machine Image (AMI), which is a snapshot of our EC2 machine's disk, or creating a launch template. For this tutorial, we'll use the simpler option: a launch template.
A launch template lets us launch new EC2 instances with one click, preconfigured with our latest code and other configuration files.
- Navigate to the AWS EC2 console.
- Navigate to Launch templates > Create launch template.
- Give it a name such as web-server-template.
- You might notice a checkbox called Provide guidance to help me set up a template that I can use with EC2 Auto-scaling. Leave this unchecked for now and we'll come back to it later.
- Choose an instance type, such as t2.micro for the cheapest option or m6i.large for a more powerful option.
- Choose the key pair you typically use for your EC2 machine.
- Choose the security group you used for your EC2 machine.
- Choose the subnet you typically use for your EC2 machine.
- Add a storage volume of at least 10GB (or more depending on the size of your web app).
- Finally, in Advanced details > IAM instance profile, select the role you created for your EC2 machine.
- In Advanced details > User data, add the following script to install and configure our web app:
#!/bin/bash
cd /home/ubuntu
sudo apt update
sudo apt -y upgrade
sudo apt install -y collectd nginx gunicorn supervisor python3-pip postgresql-client-common postgresql-client-14

# code
cd /home/ubuntu
git clone https://github.com/your-username/your-public-repo.git
cd your-public-repo
pip install -r requirements.txt

# nginx
cd /etc/nginx/sites-available/
cat <<- EOF | sudo tee example.com
server {
    listen 80;
    server_name example.com www.example.com;

    location / {
        include proxy_params;
        proxy_pass http://localhost:8000;
        client_max_body_size 100m;
        proxy_read_timeout 3600s;
    }
}
EOF
cd /etc/nginx/sites-enabled/
sudo rm default
sudo ln -s /etc/nginx/sites-available/example.com example.com
sudo nginx -s reload

# gunicorn
cd /home/ubuntu
mkdir .gunicorn
cat <<- EOF > /home/ubuntu/.gunicorn/config.py
"""Gunicorn config file"""

# Django WSGI application path in pattern MODULE_NAME:VARIABLE_NAME
wsgi_app = "example.wsgi:application"
# The granularity of Error log outputs
loglevel = "debug"
# The number of worker processes for handling requests
workers = 4
# The socket to bind
bind = "0.0.0.0:8000"
# Restart workers when code changes (development only!)
#reload = True
# Write access and error info to /var/log
accesslog = errorlog = "/var/log/gunicorn/gunicorn.log"
# Redirect stdout/stderr to log file
capture_output = True
# PID file so you can easily fetch process ID
#pidfile = "/var/run/gunicorn/dev.pid"
# Daemonize the Gunicorn process (detach & enter background)
#daemon = True
# Workers silent for more than this many seconds are killed and restarted
timeout = 600
# Restart workers after this many requests
max_requests = 10
# Stagger reloading of workers to avoid restarting at the same time
max_requests_jitter = 30
# Restart workers after this much resident memory
EOF
chown -R ubuntu:ubuntu /home/ubuntu/.gunicorn

# supervisor
cat <<- EOF | sudo tee /etc/supervisor/conf.d/gunicorn.conf
[program:django_gunicorn]
directory=/home/ubuntu/my-public-repo/
command=/usr/bin/gunicorn -c /home/ubuntu/.gunicorn/config.py
autostart=true
autorestart=true
stdout_logfile=/var/log/supervisor/django-gunicorn-out.log
stderr_logfile=/var/log/supervisor/django-gunicorn-err.log
EOF
sudo systemctl restart supervisor

# logging
cd /home/ubuntu
wget https://amazoncloudwatch-agent.s3.amazonaws.com/ubuntu/amd64/latest/amazon-cloudwatch-agent.deb
sudo apt install -y ./amazon-cloudwatch-agent.deb
cd /opt/aws/amazon-cloudwatch-agent/bin
# Quote the delimiter so bash doesn't try to expand the ${aws:...} placeholders
cat << 'EOF' | sudo tee config.json
{
  "agent": {
    "metrics_collection_interval": 60,
    "run_as_user": "root"
  },
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/var/log/nginx/access.log",
            "log_group_class": "STANDARD",
            "log_group_name": "nginx-access.log",
            "log_stream_name": "prod",
            "retention_in_days": -1
          },
          {
            "file_path": "/var/log/nginx/error.log",
            "log_group_class": "STANDARD",
            "log_group_name": "nginx-error.log",
            "log_stream_name": "prod",
            "retention_in_days": -1
          },
          {
            "file_path": "/var/log/gunicorn/gunicorn.log",
            "log_group_class": "STANDARD",
            "log_group_name": "gunicorn.log",
            "log_stream_name": "prod",
            "retention_in_days": -1
          },
          {
            "file_path": "/home/ubuntu/my-public-repo/myapp.log",
            "log_group_class": "STANDARD",
            "log_group_name": "myapp.log",
            "log_stream_name": "prod",
            "retention_in_days": -1
          }
        ]
      }
    }
  },
  "metrics": {
    "aggregation_dimensions": [["InstanceId"]],
    "append_dimensions": {
      "AutoScalingGroupName": "${aws:AutoScalingGroupName}",
      "ImageId": "${aws:ImageId}",
      "InstanceId": "${aws:InstanceId}",
      "InstanceType": "${aws:InstanceType}"
    },
    "metrics_collected": {
      "collectd": { "metrics_aggregation_interval": 60 },
      "disk": {
        "measurement": ["used_percent"],
        "metrics_collection_interval": 60,
        "resources": ["*"]
      },
      "mem": {
        "measurement": ["mem_used_percent"],
        "metrics_collection_interval": 60
      },
      "statsd": {
        "metrics_aggregation_interval": 60,
        "metrics_collection_interval": 10,
        "service_address": ":8125"
      }
    }
  }
}
EOF
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json -s
- Finally, select Create launch template.
Great! Now we can easily launch multiple new web servers by selecting our template > Actions > Launch instance from template. Feel free to modify this template as you go, for example by adding extra setup instructions for npm by selecting our template > Actions > Modify template (Create new version).
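The same launch can be scripted. A hedged boto3 sketch, assuming the template name above ($Latest resolves to the newest template version):

import boto3

ec2 = boto3.client('ec2', region_name='us-west-1')

# Launch one instance from the latest version of our launch template
response = ec2.run_instances(
    LaunchTemplate={
        'LaunchTemplateName': 'web-server-template',
        'Version': '$Latest',
    },
    MinCount=1,
    MaxCount=1,
)
print(response['Instances'][0]['InstanceId'])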
04 Add an AWS Application Load Balancer
Before creating our load balancer, we need to create a target group for our web servers, which is simply the group of EC2 instances our load balancer will point to. But first, we'll need an endpoint where the target group can ping our web servers to check that they're still responding. If a server stops responding with a 200 status, the target group will automatically mark it unhealthy. We could add a custom middleware to intercept these requests and put less load on our server, but for simplicity let's just add a regular view at the endpoint /status (a middleware sketch follows below):
- Add a new view to our top-level views.py that returns a simple HTTP response:

from django.http import HttpResponse

def status_view(request):
    return HttpResponse("Healthy.")

- Add a new URL to our top-level urls.py that points to this view:

from django.urls import path

from . import views

urlpatterns = [
    path('status/', views.status_view, name='status'),
]
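If you'd rather go the middleware route mentioned above, so health checks skip URL routing and the rest of the middleware stack, a minimal sketch could look like this (HealthCheckMiddleware is our own name, not a Django built-in; it would need to be added near the top of MIDDLEWARE in settings.py):

from django.http import HttpResponse

class HealthCheckMiddleware:
    """Answer load balancer health checks before any other processing."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        if request.path == '/status/':
            return HttpResponse("Healthy.")
        return self.get_response(request)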
Now that our endpoint is created, we can create a target group for our web servers that periodically sends health checks to this endpoint:
- Navigate to the AWS EC2 console and select Target Groups > Create target group.
- Select a target type of Instances.
- Give it a name such as web-server-tg.
- Select a protocol of HTTP, which defaults to port 80.
- Under Health checks, select HTTP for the protocol and /status/ for the path. (Our instances serve plain HTTP on port 80 behind the ALB, so an HTTPS health check would fail; the trailing slash avoids Django's APPEND_SLASH redirect, which the health check would count as a failure.)
- Select Next. On the next page, select the instance(s) to add to this target group, then select Include as pending below.
- Select Create target group.
Now that our instance(s) are added to the target group, we should see each one receiving a ping to /status/ every 30 seconds. If we select the newly created target group, we should see a count of healthy and unhealthy web servers. By default, a server is marked unhealthy when it fails 2 pings in a row, and is marked healthy again once it passes 5 in a row (all of this can be configured as desired in the target group). The target group uses this information to automatically instruct our load balancer to stop sending requests to unhealthy servers, and in the future we can even pair it with auto scaling to spin up new web servers when the number of healthy ones drops below some threshold.
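To check target health from a script rather than the console, here's a rough boto3 sketch (the TargetGroupArn is a placeholder; copy the real one from your target group's Details tab):

import boto3

elbv2 = boto3.client('elbv2', region_name='us-west-1')

# Print each registered instance and its current health state
response = elbv2.describe_target_health(
    TargetGroupArn='arn:aws:elasticloadbalancing:us-west-1:123456789012:targetgroup/web-server-tg/abc123',  # placeholder
)
for target in response['TargetHealthDescriptions']:
    print(target['Target']['Id'], target['TargetHealth']['State'])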
Next, let's modify our security groups. Note that we no longer want our EC2 instances exposed to public web traffic; only our load balancer should be. First, let's create a new security group for our ALB:
- Navigate to the AWS EC2 console and select Security Groups > Create security group.
- Give it a name such as alb-sg.
- Add two inbound rules that allow HTTP and HTTPS traffic from anywhere.
Next, let's modify the existing security group for our web servers to only allow web traffic from our ALB, not the open web:
- Navigate to the AWS EC2 console and select Security Groups > web-server-sg.
- Select Actions > Edit inbound rules.
- Remove all inbound rules except for the one that allows SSH traffic.
- Add a new inbound rule that allows HTTP traffic from the security group of our ALB we just created.
Finally, we're ready to create the AWS Application Load Balancer (ALB) which will distribute traffic to the web servers in our new target group:
- Navigate to the AWS EC2 console and select Load Balancers > Create load balancer. Select Application Load Balancer and Create.
- Give it a name like web-server-alb.
- Select a scheme of Internet-facing.
- Select any two or more Availability Zones.
- Select the security group we created earlier.
- Select the target group we created earlier.
- Select Create.
Note down the DNS name of our new load balancer (something like dualstack.alb-<random-id>.us-west-1.elb.amazonaws.com); unlike an EC2 instance, an ALB doesn't expose a fixed IPv4 address. Now that everything's hooked up, the final step is to point our domain to our ALB instead of to our single EC2 instance.
- Select the ALB we just created and navigate to the Details tab. Copy the DNS name of the ALB.
- Navigate to your domain registrar, remove the previous A records that pointed to the EC2 machine, and add a CNAME record that points to the DNS name of the ALB, such as dualstack.alb-<random-id>.us-west-1.elb.amazonaws.com. (A plain A record won't work here because the ALB has no fixed IP address; if your DNS is on Route 53, create an alias A record instead.)
- Save the changes and test your domain by navigating to it in your browser. You should see your web app.
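If your DNS happens to be hosted on Route 53, the alias record can also be created with boto3. A hedged sketch (both hosted zone IDs and the domain are placeholders; the ALB's hosted zone ID is shown next to its DNS name in the console):

import boto3

route53 = boto3.client('route53')

# Upsert an alias A record at the domain apex pointing at the ALB
route53.change_resource_record_sets(
    HostedZoneId='Z0000000EXAMPLE',  # placeholder: your domain's hosted zone ID
    ChangeBatch={
        'Changes': [{
            'Action': 'UPSERT',
            'ResourceRecordSet': {
                'Name': 'example.com',
                'Type': 'A',
                'AliasTarget': {
                    'HostedZoneId': 'Z0000000ALBZONE',  # placeholder: the ALB's hosted zone ID
                    'DNSName': 'dualstack.alb-<random-id>.us-west-1.elb.amazonaws.com',
                    'EvaluateTargetHealth': False,
                },
            },
        }],
    },
)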
Well done! We've now enabled multiple web servers and created a load balancer to distribute traffic amongst them.
05 Next Steps
From here, we could enable tools like EC2 Auto Scaling, which can automatically terminate and spin up servers using our launch template to maintain a certain number of healthy servers at all times. It can also be configured to scale servers up and down based on average usage (memory, CPU, or network traffic) across the group to keep up with traffic.
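As a taste of what that looks like, here's a rough boto3 sketch of an Auto Scaling group built from our launch template and attached to our target group (the ARN and subnet IDs are placeholders):

import boto3

autoscaling = boto3.client('autoscaling', region_name='us-west-1')

# Keep between 2 and 6 instances running, registered with our target group
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName='web-server-asg',
    LaunchTemplate={
        'LaunchTemplateName': 'web-server-template',
        'Version': '$Latest',
    },
    MinSize=2,
    MaxSize=6,
    TargetGroupARNs=['arn:aws:elasticloadbalancing:...'],  # placeholder: your target group's ARN
    VPCZoneIdentifier='subnet-aaa111,subnet-bbb222',  # placeholder: your subnet IDs
)

# Add and remove instances to keep average CPU across the group near 50%
autoscaling.put_scaling_policy(
    AutoScalingGroupName='web-server-asg',
    PolicyName='cpu-target-tracking',
    PolicyType='TargetTrackingScaling',
    TargetTrackingConfiguration={
        'PredefinedMetricSpecification': {
            'PredefinedMetricType': 'ASGAverageCPUUtilization',
        },
        'TargetValue': 50.0,
    },
)

Congratulations on completing this series of blog posts!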