
Deploying many sites in ECS using one ALB

My current go-to deployment strategy is AWS Elastic Container Service (ECS) on Fargate behind an Application Load Balancer (ALB). Each site is its own stateless Docker container, persisting dynamic data in RDS and/or S3. When I make a change, I build a new container image, push it to ECR, create a new task definition revision, and ECS deploys the site for me.

I've now set this up a couple of times and each time I struggle to recollect all the steps along the way, so it's high time I write it down so that I can look it up next time. And now that I understand this a bit better, I was also able to consolidate my infrastructure, since my original approach wasn't necessarily the most cost-efficient setup.

Aside from remembering/reverse engineering all the pieces needed, the part I always got stuck on was the apparent catch-22 of a load balancer wanting a target group, a target group wanting an IP, while the ECS Service wants to set up a load balancer before providing said IP.

Setting up the Application Load Balancer (ALB)

We're going to be using a single Application Load Balancer for all the ECS hosted sites and will be hosting them under https:// with http:// redirecting to the former.

Multi-domain certificate

Before we can create the load balancer, we need a certificate that covers all the domains we want to host. The certificate can be swapped out later, so the list of domains defined here is not set in stone. AWS lets you generate multi-domain and wildcard certificates directly via Certificate Manager (ACM), which allows us to host different domains behind the load balancer's single endpoint, rather than requiring a separate IP and certificate for each domain.

  1. Go to Certificate Manager
  2. Select Request
  3. Request a public certificate
  4. Add all the domains and sub-domains that the load balancer will handle
    • You can provide up to 10 different domains in one certificate by default, i.e. 10 different sites with one Application Load Balancer. You can request more, but I haven't tried yet.
    • foo.com and *.foo.com have to be declared separately, so that might cut you down to 5 different sites.
  5. Keep the remaining settings at their defaults
  6. Follow the instructions for verifying each domain by setting up the appropriate CNAMEs in DNS

The only caveat is that the certificate's common name (CN) is the first domain you declare. For example, if you create it for foo.com, bar.com and baz.com, visitors inspecting the certificate on the latter two will see foo.com as the name on the certificate, which, depending on the sites you are hosting, may be a deal breaker.
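For reference, the same request can be made with the AWS CLI; this is a sketch, and the domain names here are placeholders for your own:

```shell
# Request one certificate covering several sites (bare + wildcard per domain).
# Domains are examples only; DNS validation records still need to be created.
aws acm request-certificate \
  --domain-name foo.com \
  --subject-alternative-names "*.foo.com" bar.com "*.bar.com" \
  --validation-method DNS
```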

Security Groups

We only want the containers to accept traffic from the ALB, so we need a pair of security groups: the first defining who can talk to the ALB, and the second allowing only the ALB to talk to the containers.

  1. Go to EC2
  2. Select Security Groups from Network & Security
  3. Select Create Security Group
    1. Create security group for the load balancer itself
      1. Inbound: Ports 80 & 443 inbound from anywhere
      2. Outbound: Default anywhere rule
    2. Create security group for ECS services to be reachable only by the load balancer
      1. Inbound: Port 8000 (or whatever port your docker containers listen for HTTP on) with a source being the previous security group
      2. Outbound: Default anywhere rule

I run the container with a public IP (required in my setup so the task can fetch env files from S3 without a NAT gateway) and add a Home only security group that opens up the container to my IP, so that I can sanity check it without the load balancer.
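The two security groups above can also be sketched with the CLI; the VPC id, group names, and port 8000 are assumptions to replace with your own values:

```shell
# Security group for the ALB itself: open HTTP/HTTPS to the world.
ALB_SG=$(aws ec2 create-security-group \
  --group-name sites-alb --description "ALB for hosted sites" \
  --vpc-id vpc-0123456789abcdef0 --query GroupId --output text)
aws ec2 authorize-security-group-ingress --group-id "$ALB_SG" \
  --protocol tcp --port 80 --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress --group-id "$ALB_SG" \
  --protocol tcp --port 443 --cidr 0.0.0.0/0

# Security group for the ECS services: only the ALB's group may
# reach the container port.
SVC_SG=$(aws ec2 create-security-group \
  --group-name sites-ecs --description "ECS tasks behind the ALB" \
  --vpc-id vpc-0123456789abcdef0 --query GroupId --output text)
aws ec2 authorize-security-group-ingress --group-id "$SVC_SG" \
  --protocol tcp --port 8000 --source-group "$ALB_SG"
```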

Creating the ALB

The ALB, once configured, is given an AWS-generated DNS name, which we will use as the target of the CNAMEs for the sites we're hosting. When someone hits a host name that we haven't set up routing for, we want to return a 503, rather than expose one of the sites at a non-canonical host. Unfortunately, the console does not let you define a static response as the initial default action but requires a Target Group instead, so we will create a target group for the first site we want to host and change the default rule afterwards.

  1. Go to EC2
  2. Select Load Balancers from Load Balancing
  3. Select Create Application Load Balancer
  4. Pick Application Load Balancer
    1. Internet Facing
    2. Remove default security group and add group created above
    3. Select HTTPS as the listener
    4. Create an IP Target Group for the first site to be hosted
      1. Name it for the first site to be hosted
      2. Select HTTP (not HTTPS) and the port that the docker container listens on
        • The ALB terminates HTTPS and everything inside our network is just HTTP
      3. Define a healthcheck path on the container
        • Some path on the container that returns 200 when things are ok.
      4. Click Create without specifying an IP
    5. Add the ACM certificate created above
    6. Click Create for the ALB
  5. Add HTTP listener to ALB
    1. Set Routing Action to Redirect to URL
    2. Keep the URI parts and HTTPS target defaults
  6. Modify default HTTPS rule
    1. Select HTTPS listener
    2. Check default rule
    3. Select Edit Rule from Actions
    4. Set Routing Action to Return fixed response
    5. Set Response Code to 503 and text/plain
    6. Add No such site as the Response body

The load balancer is now ready to direct traffic to the sites we create in ECS.
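The fixed-response default action from step 6 can also be set via the CLI once you know the HTTPS listener's ARN (the ARN below is a placeholder; look yours up with `aws elbv2 describe-listeners`):

```shell
# Replace the default rule's action with a 503 for unknown hosts.
# The listener ARN is a placeholder for your own.
aws elbv2 modify-listener \
  --listener-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:listener/app/sites/abc123/def456 \
  --default-actions 'Type=fixed-response,FixedResponseConfig={StatusCode=503,ContentType=text/plain,MessageBody=No such site}'
```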

Creating Services

Each site is a docker container running in ECS (or if the container handles multi-tenancy one container may service multiple sites). This provides us with immutable infrastructure, rather than servers we update as we make changes. If a container is compromised it is simply terminated and replaced. Ideally the container is configured to require no local changes at runtime, even for ephemeral state, so that we can use read-only file systems further reducing the chance of a compromise. This assumes, of course, that you do not permit executable code to be stored in your persistence layer.

Add a routing rule to the load balancer

  1. Go to EC2
  2. Select Load Balancers from Load Balancing
  3. Select the load balancer
  4. Select HTTPS listener
  5. Select Add Rule
  6. Click Next
  7. Add Host header condition for the target domain
  8. Confirm and click Next
  9. Set Routing action to Forward to target groups
  10. Create a target group as described above or pick the previously created target group if this is the first site we're setting up
  11. Pick some priority and click Next
    • Priority doesn't matter here, since all our rules are host header rules that either match or don't
  12. Click Create
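A sketch of the equivalent CLI call; the ARNs are placeholders, foo.com stands in for your domain, and the priority just needs to be unique per listener:

```shell
# Route requests for a specific host to that site's target group.
# $HTTPS_LISTENER_ARN and $TARGET_GROUP_ARN are assumed to be set.
aws elbv2 create-rule \
  --listener-arn "$HTTPS_LISTENER_ARN" \
  --priority 10 \
  --conditions 'Field=host-header,HostHeaderConfig={Values=[foo.com]}' \
  --actions "Type=forward,TargetGroupArn=$TARGET_GROUP_ARN"
```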

Upload Application Container to ECR

  1. Go to ECR
  2. Create a private repository for the application
  3. Upload the container (see View Push Commands for details)
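The push commands follow this shape; the account id, region, and repository name below are placeholders (View Push Commands in the console shows your exact values):

```shell
# Authenticate Docker against your private ECR registry.
aws ecr get-login-password --region us-east-1 \
  | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com

# Build, tag, and push the application image.
docker build -t sample-site .
docker tag sample-site:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/sample-site:latest
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/sample-site:latest
```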

Create IAM Policies & Roles

Every service requires two roles:

  • A Task Role, providing the permissions that your application needs, e.g. appropriate S3 permissions for persistence. For illustration, we'll grant full S3 access to the publicly readable bucket that the site will use for its media storage.
  • A Task Execution Role, providing permission to run the task. Since we'll use env files, we'll need appropriate S3 access to the private bucket we keep env files in.

Public S3 Policy

This assumes that your site uses S3 for file storage that is exposed publicly. If it doesn't, skip to the next section. Assuming a bucket sample-site-public in which objects can be public, we create a policy so that our container can write files to it.

  1. Go to IAM
  2. Select Policies
  3. Click Create Policy
  4. Select the JSON policy editor and add the following:
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": [
                    "s3:PutObject",
                    "s3:GetObject",
                    "s3:ListBucket",
                    "s3:DeleteObject",
                    "s3:GetBucketLocation"
                ],
                "Resource": [
                    "arn:aws:s3:::sample-site-public/*",
                    "arn:aws:s3:::sample-site-public"
                ]
            }
        ]
    }
    
  5. Set the name to sample-site.s3.editor
  6. Create the policy

Task Role

  1. Go to IAM
  2. Select Roles
  3. Click Create role
  4. Select AWS Service as Trusted Entity Type
  5. Select Elastic Container Service as Use case
  6. Select Elastic Container Service Task
  7. Attach sample-site.s3.editor to the role (if you are using S3 for public file storage)
  8. Set the name to sample-site.task
  9. Create role
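The same role can be sketched via the CLI; the trust policy is what lets ECS tasks assume the role, and the account id in the policy ARN is a placeholder:

```shell
# Create the task role with a trust policy for ECS tasks.
aws iam create-role --role-name sample-site.task \
  --assume-role-policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Principal": {"Service": "ecs-tasks.amazonaws.com"},
      "Action": "sts:AssumeRole"
    }]
  }'

# Attach the S3 policy created above (account id is a placeholder).
aws iam attach-role-policy --role-name sample-site.task \
  --policy-arn arn:aws:iam::123456789012:policy/sample-site.s3.editor
```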

Private S3 Policy

Assuming private bucket sample-site-private, we create a policy so that our task can read our environment file prod.env. We use separate buckets, so that we can't accidentally expose our secrets with a rogue permission change on the media bucket.

  1. Go to IAM
  2. Select Policies
  3. Click Create Policy
  4. Select the JSON policy editor and add the following:
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject"
                ],
                "Resource": [
                    "arn:aws:s3:::sample-site-private/prod.env"
                ]
            },
            {
                "Effect": "Allow",
                "Action": [
                    "s3:GetBucketLocation"
                ],
                "Resource": [
                    "arn:aws:s3:::sample-site-private"
                ]
            }
        ]
    }
    
  5. Set the name to sample-site.task-files
  6. Create the policy

Task Execution Role

  1. Go to IAM
  2. Select Roles
  3. Click Create role
  4. Select AWS Service as Trusted Entity Type
  5. Select Elastic Container Service as Use case
  6. Select Elastic Container Service Task
  7. Attach the AmazonECSTaskExecutionRolePolicy policy
  8. Attach the sample-site.task-files policy
  9. Set the name to sample-site.task-runner
  10. Create role

Create Task Definition

Note

The assumption is that there is already a cluster you can create services in. If not, create one using FARGATE, just for simplicity. Since this setup is primarily for low-traffic sites, even the minimum FARGATE resources seem like a waste, so despite the console describing EC2 as the choice for high-throughput work, I may benefit from buying a single small EC2 instance and running multiple low-traffic ECS services on it. But that's a future topic. For now, all services are containers running in FARGATE.

  1. Go to ECS
  2. Select Task definitions
  3. Click Create new task definition
  4. Select Launch Type FARGATE
  5. Select sample-site.task for the Task role
  6. Select sample-site.task-runner for the Task execution role
  7. Under Container details, link to your application container
  8. Under Port Mappings expose the port that the container listens on
  9. Click Add environment file
  10. Set Location to arn:aws:s3:::sample-site-private/prod.env
    • More versatile than env vars in the definition, and safer if the env contains secrets
    • Alternatively could use Parameter Store. I'll write that up once I figure it out.
  11. Create task definition
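The resulting definition can be sketched as a CLI call; the image URI, CPU/memory sizes, and account ids are placeholders, while the env file ARN matches the private bucket set up above:

```shell
# Register a Fargate task definition with the two roles and the
# env file pulled from the private S3 bucket.
aws ecs register-task-definition \
  --family sample-site \
  --requires-compatibilities FARGATE \
  --network-mode awsvpc \
  --cpu 256 --memory 512 \
  --task-role-arn arn:aws:iam::123456789012:role/sample-site.task \
  --execution-role-arn arn:aws:iam::123456789012:role/sample-site.task-runner \
  --container-definitions '[{
    "name": "sample-site",
    "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/sample-site:latest",
    "portMappings": [{"containerPort": 8000, "protocol": "tcp"}],
    "environmentFiles": [{"type": "s3", "value": "arn:aws:s3:::sample-site-private/prod.env"}]
  }]'
```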

Create Service

  1. Go to ECS
  2. Select Clusters
  3. Select your cluster
  4. Click Create in the Services tab
  5. Leave the Capacity provider strategy at its default
  6. Leave Application Type as Service
  7. Select the Task definition created above
  8. Under Networking:
  9. Remove the default security group
  10. Add the service-specific security group created above
  11. (Optional) Add a Home only security group for direct access
  12. Under Load balancing, select previously created ALB
  13. Pick HTTPS listener
  14. Pick previously created Target Group
  15. As the service is deployed, it will automatically fill in targets in the target group
  16. Update DNS with a CNAME pointing the host at the ALB's DNS name
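The steps above can be sketched as a single CLI call; the cluster name, subnet, security group, and target group ARN are placeholders, and the public IP is enabled per the note in the Security Groups section:

```shell
# Create the Fargate service behind the ALB target group.
# All ids/ARNs below are placeholders for your own resources.
aws ecs create-service \
  --cluster sites \
  --service-name sample-site \
  --task-definition sample-site \
  --desired-count 1 \
  --launch-type FARGATE \
  --network-configuration 'awsvpcConfiguration={subnets=[subnet-0123456789abcdef0],securityGroups=[sg-0123456789abcdef0],assignPublicIp=ENABLED}' \
  --load-balancers 'targetGroupArn=arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/sample-site/abc123,containerName=sample-site,containerPort=8000'
```

As the service deploys, ECS registers the task's IP in the target group automatically, which resolves the catch-22 mentioned at the start.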

Add more sites

Repeat the Creating Services instructions to add additional sites to the ALB.

What's next

The above setup is for low-traffic sites. I use it to run a couple of WordPress blogs and some other code projects. If there were more than rudimentary traffic, it would be time to revisit the deployment types and possibly add replicas. But the nice thing is that you can easily scale this up and down or add elastic scaling rules, and deploying code changes is just a matter of pushing containers and creating new revisions of task definitions.

The next thing I do want to take a look at is setting up an EC2 instance to run as my launch target instead of FARGATE, because even at 0.25 vCPU my current sites are over-provisioned, costing more than the single dedicated EC2 instance (with its Elastic IP) that I previously updated code on by hand.