2022-03-18

Get direct traffic to ECS Fargate containers with LetsEncrypt, CDK and AWS Lambda

Expose each Fargate task on its own subdomain — scale-to-zero clusters, per-customer isolation, no load balancer and no NAT gateway — with wildcard certificates and an event-driven DNS lambda.

With a small amount of code, you can expose running Fargate tasks to the internet directly, each on an individual subdomain — for example ${taskId}.eu-west-2.browser.reflow.io.

This is almost always a bad idea. If you really need to do it, this guide should help.

Why you shouldn’t do this

AWS provides battle-tested patterns for containerized workloads that are easier to configure, less likely to break, and right for the vast majority of use-cases:

Application Load Balanced Fargate Service: a Fargate service fronted by an application load balancer.
1. Health checks come built in, diverting traffic from unhealthy instances before replacing them.
2. Deployments via CodePipeline monitor newly provisioned services before gradually shifting traffic.
3. Traffic distributes across availability zones for data-center resilience.
Queue Processing Fargate Service: a Fargate service auto-scaled from an SQS queue.
1. Failed jobs retry automatically.
2. Scales up and down with asynchronous workload.
3. Handles long-lived jobs.

Why we do it anyway

Reflow runs web browsers that record and execute end-to-end tests. Recording needs a websocket to a server holding transient state. ECS on Fargate fits — no server fleet to operate, none of the complexity of Kubernetes — and our first design used the ALB-fronted pattern above. It ran into three walls:

Isolation. We run untrusted customer code, which demands that customer workloads be physically isolated from each other on zero-privilege servers. ALB-fronted services make per-customer physical isolation non-trivial.
Cost. We wanted multi-region, but keeping one warm instance plus a NAT gateway in every region is significant money for a bootstrapped startup — and there was no way to let clusters scale to zero when unused.
Affinity. Team members can share one browser instance to collaborate on a recording. Behind a load balancer, we couldn’t guarantee two users in the same team reach the same server.

Routing directly to tasks gives us:

Clusters that scale to zero when not in use — no warm-instance fee per region.
Physical isolation per customer workload, while still sharing transient state within a team.
Expected customer ids baked into each server process’s environment, simplifying authentication.
No NAT gateway per availability zone.

The costs:

DNS propagation means the first use of a recording instance waits roughly a minute before the server is reachable.
More moving parts to monitor.

Logical components

1. A LetsEncrypt wildcard certificate on the task domain, e.g. *.eu-west-2.browser.reflow.io.
2. A Lambda that renews the certificate monthly and alerts on failure.
3. A Lambda that creates and destroys DNS records for every task in the cluster, driven by ECS state-change events.
4. An ECS cluster and task configuration that runs the service on demand with a public IP.
5. Application logic to deliver task hostnames to clients. We store the hostname in DynamoDB when a server boots and read it from web clients over AppSync. We also bake a TeamId into the server’s environment; the same TeamId is a Cognito custom attribute that must be present in the signed client JWT on every request.

Components 1–3 are generic — once configured, every task in the target cluster is exposed via DNS — so they are what this guide covers. Components 4 and 5 are domain-specific; reach out if you have questions about them.
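For orientation only, the TeamId check in component 5 can be quite small. The following is a minimal sketch, not our production code: it assumes the JWT signature has already been verified elsewhere, and it assumes the attribute shows up under the hypothetical claim name custom:TeamId (Cognito prefixes custom attributes with custom: in ID tokens, but the real attribute name is not given in this guide).

```typescript
// Hypothetical claim name; adjust to whatever your user pool actually defines.
const TEAM_CLAIM = 'custom:TeamId';

type JwtClaims = Record<string, unknown>;

// Returns true only when the already-verified claims carry the team id
// that this server process was started for.
function isRequestForThisTeam(claims: JwtClaims, expectedTeamId: string): boolean {
  const claimed = claims[TEAM_CLAIM];
  return typeof claimed === 'string' && claimed.length > 0 && claimed === expectedTeamId;
}

// Example: the expected id would normally come from the task's environment.
console.log(isRequestForThisTeam({ 'custom:TeamId': 'team-a' }, 'team-a'));
```

Because each task serves exactly one team, the comparison is a single equality check rather than a lookup.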

LetsEncrypt certificates

CDK

We manage all infrastructure with CDK. This construct owns the renewal lambda. It creates:

an S3 bucket holding the issued certificates
an SNS topic notifying us on renewal
the Lambda function itself. ReamplifyLambdaFunction is our wrapper that pre-compiles code outside CDK; a NodejsFunction works just as well.

It references the hosted zone for task DNS records and the domain suffix, a workspace parameter (dev / prod) so multiple instances can live in one account, a notification email, and the target region and account.

import { Construct } from 'constructs';
import { Duration, RemovalPolicy, StackProps, Tags } from 'aws-cdk-lib';
import { BlockPublicAccess, Bucket, BucketEncryption, ObjectOwnership } from 'aws-cdk-lib/aws-s3';
import { Topic } from 'aws-cdk-lib/aws-sns';
import { EmailSubscription } from 'aws-cdk-lib/aws-sns-subscriptions';
import { ReamplifyLambdaFunction } from './reamplifyLambdaFunction';
import { PolicyStatement } from 'aws-cdk-lib/aws-iam';
import { IHostedZone } from 'aws-cdk-lib/aws-route53';
import { Rule, Schedule } from 'aws-cdk-lib/aws-events';
import { LambdaFunction } from 'aws-cdk-lib/aws-events-targets';

interface CertbotProps {
  adminNotificationEmail: string;
  hostedZone: IHostedZone;
  domain: string;
  workspace: string;
  env: {
    region: string;
    account: string;
  };
}

export class Certbot extends Construct {
  public readonly certBucket: Bucket;
  constructor(scope: Construct, id: string, props: StackProps & CertbotProps) {
    super(scope, id);
    Tags.of(this).add('construct', 'Certbot');
    const certBucket = new Bucket(this, 'bucket', {
      bucketName: `certs.${props.env.region}.${props.workspace}.reflow.io`,
      objectOwnership: ObjectOwnership.BUCKET_OWNER_PREFERRED,
      removalPolicy: RemovalPolicy.DESTROY,
      autoDeleteObjects: true,
      versioned: true,
      lifecycleRules: [
        {
          enabled: true,
          abortIncompleteMultipartUploadAfter: Duration.days(1),
        },
      ],
      encryption: BucketEncryption.S3_MANAGED,
      enforceSSL: true,
      blockPublicAccess: BlockPublicAccess.BLOCK_ALL,
    });
    this.certBucket = certBucket;

    const topic = new Topic(this, 'CertAdminTopic');
    topic.addSubscription(new EmailSubscription(props.adminNotificationEmail));

    const fn = new ReamplifyLambdaFunction(this, 'LambdaFn', {
      workspace: props.workspace,
      lambdaConfig: 'deploy/browserCerts.ts',
      timeout: Duration.minutes(15),
      environment: {
        NOTIFY_EMAIL: props.adminNotificationEmail,
        CERTIFICATES: JSON.stringify([
          {
            domains: [`*.${props.domain}`],
            zoneId: props.hostedZone.hostedZoneId,
            certStorageBucketName: certBucket.bucketName,
            certStoragePrefix: 'browser/',
            successSnsTopicArn: topic.topicArn,
            failureSnsTopicArn: topic.topicArn,
          },
        ]),
      },
    });

    fn.addToRolePolicy(
      new PolicyStatement({
        actions: ['route53:ListHostedZones'],
        resources: ['*'],
      })
    );
    fn.addToRolePolicy(
      new PolicyStatement({
        actions: ['route53:GetChange', 'route53:ChangeResourceRecordSets'],
        resources: ['arn:aws:route53:::change/*'].concat(props.hostedZone.hostedZoneArn),
      })
    );
    fn.addToRolePolicy(
      new PolicyStatement({
        actions: ['ssm:GetParameter', 'ssm:PutParameter'],
        resources: ['*'],
      })
    );
    certBucket.grantWrite(fn);
    topic.grantPublish(fn);

    new Rule(this, 'trigger', {
      schedule: Schedule.cron({ minute: '32', hour: '17', day: '3', month: '*', year: '*' }),
      targets: [new LambdaFunction(fn)],
    });
  }
}

The renewal lambda

Dependencies: acme-client@4.2.3.

This leans heavily on acme-client, with a scattering of logic to:

maintain SSM parameters so one LetsEncrypt account is reused across runs, while still bootstrapping cleanly in a fresh environment
answer LetsEncrypt DNS-01 challenges with Route53 records to prove domain ownership
store issued certificates in S3
notify an admin of success or failure
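To make the DNS-01 step concrete, the sketch below derives the TXT record that answers a challenge for a given domain. The name dns01ChallengeRecord is hypothetical. In the lambda that follows, acme-client passes authz.identifier.value with the wildcard prefix already removed, so the stripping here only matters if you start from the raw certificate domain.

```typescript
interface TxtChallengeRecord {
  name: string; // record name to UPSERT in the hosted zone
  value: string; // TXT value, wrapped in double quotes as Route53 expects
}

function dns01ChallengeRecord(domain: string, keyAuthorization: string): TxtChallengeRecord {
  // A wildcard name and its base domain are validated at the same record name.
  const base = domain.startsWith('*.') ? domain.slice(2) : domain;
  return {
    name: `_acme-challenge.${base}`,
    value: `"${keyAuthorization}"`,
  };
}

const record = dns01ChallengeRecord('*.eu-west-2.browser.reflow.io', 'example-digest');
console.log(record.name); // _acme-challenge.eu-west-2.browser.reflow.io
console.log(record.value); // "example-digest"
```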

import AWS from 'aws-sdk';
import acme from 'acme-client';

const route53 = new AWS.Route53();
const s3 = new AWS.S3();
const sns = new AWS.SNS();

export function assertEnv(key: string): string {
  if (process.env[key] !== undefined) {
    console.log('env', key, 'resolved by process.env as', process.env[key]!);
    return process.env[key]!;
  }
  throw new Error(`expected environment variable ${key}`);
}

export const assertEnvOrSSM = async (key: string, shouldThrow = true): Promise<string> => {
  const workspace = assertEnv('workspace');

  if (process.env[key] !== undefined) {
    // Do not echo the value: on a warm start this branch can return the cached account key
    console.log('env', key, 'resolved by process.env');
    return Promise.resolve(process.env[key]!);
  } else {
    const SSMLocation = `/${workspace}/${key}`;
    console.log('env', key, 'resolving via SSM at', SSMLocation);

    const SSM = new AWS.SSM();
    try {
      const ssmResponse = await SSM.getParameter({
        Name: SSMLocation,
      }).promise();
      if (!ssmResponse.Parameter || !ssmResponse.Parameter.Value) {
        throw new Error(`env ${key} missing`);
      }
      // Log the key only; the value may be the LetsEncrypt account private key
      console.log('env', key, 'resolved by SSM');
      process.env[key] = ssmResponse.Parameter.Value;
      return ssmResponse.Parameter.Value;
    } catch (e) {
      console.error(`SSM.getParameter({Name: ${SSMLocation}}):`, e);
      if (shouldThrow) {
        throw e;
      }
      return '';
    }
  }
};

export const writeSSM = async (key: string, value: string): Promise<void> => {
  const workspace = assertEnv('workspace');

  const SSMLocation = `/${workspace}/${key}`;
  // Log the location only; the value may be the LetsEncrypt account private key
  console.log('env', key, 'writing to SSM at', SSMLocation);

  const SSM = new AWS.SSM();
  await SSM.putParameter({
    Name: SSMLocation,
    Value: value,
    Overwrite: true,
    DataType: 'text',
    Tier: 'Standard',
    Type: 'String',
  }).promise();
};

async function getOrCreateAccountPrivateKey() {
  let accountKey = await assertEnvOrSSM('LETSENCRYPT_ACCOUNT_KEY', false);
  if (accountKey) {
    return accountKey;
  }
  console.log('Generating Account Key');
  accountKey = (await acme.forge.createPrivateKey()).toString();
  await writeSSM('LETSENCRYPT_ACCOUNT_KEY', accountKey);
  return accountKey;
}

export const handler = async function (event) {
  const maintainerEmail = assertEnv('NOTIFY_EMAIL');
  const accountURL = await assertEnvOrSSM('LETSENCRYPT_ACCOUNT_URL', false);
  const certificates = JSON.parse(assertEnv('CERTIFICATES'));
  const accountPrivateKey = await getOrCreateAccountPrivateKey();

  acme.setLogger(console.log);
  const client = new acme.Client({
    directoryUrl: acme.directory.letsencrypt.production,
    accountKey: accountPrivateKey,
    accountUrl: accountURL ? accountURL : undefined,
  });

  const certificateRuns = certificates.map(async (certificate) => {
    const { domains, zoneId, certStorageBucketName, certStoragePrefix, successSnsTopicArn, failureSnsTopicArn } =
      certificate;

    try {
      const [certificateKey, certificateCsr] = await acme.forge.createCsr({
        commonName: domains[0],
        altNames: domains.slice(1),
      });

      const certificate = await client.auto({
        csr: certificateCsr,
        email: maintainerEmail,
        termsOfServiceAgreed: true,
        challengeCreateFn: async (authz, challenge, keyAuthorization) => {
          console.log(authz, challenge, keyAuthorization);
          const dnsRecord = `_acme-challenge.${authz.identifier.value}`;

          if (challenge.type !== 'dns-01') {
            throw new Error('Only DNS-01 challenges are supported');
          }
          const changeReq = {
            ChangeBatch: {
              Changes: [
                {
                  Action: 'UPSERT',
                  ResourceRecordSet: {
                    Name: dnsRecord,
                    ResourceRecords: [
                      {
                        Value: '"' + keyAuthorization + '"',
                      },
                    ],
                    TTL: 60,
                    Type: 'TXT',
                  },
                },
              ],
            },
            HostedZoneId: zoneId,
          };
          console.log('Sending create request', JSON.stringify(changeReq));
          const response = await route53.changeResourceRecordSets(changeReq).promise();
          const changeId = response.ChangeInfo.Id;
          console.log(`Create request sent for ${dnsRecord} (Change id ${changeId}); waiting for it to complete`);
          const waitRequest = route53.waitFor('resourceRecordSetsChanged', { Id: changeId });
          const waitResponse = await waitRequest.promise();
          console.log(
            `Create request complete for ${dnsRecord}: (Change id ${waitResponse.ChangeInfo.Id}) ${waitResponse.ChangeInfo.Status}`
          );
        },
        challengeRemoveFn: async (authz, challenge, keyAuthorization) => {
          const dnsRecord = `_acme-challenge.${authz.identifier.value}`;

          const deleteReq = {
            ChangeBatch: {
              Changes: [
                {
                  Action: 'DELETE',
                  ResourceRecordSet: {
                    Name: dnsRecord,
                    ResourceRecords: [
                      {
                        Value: '"' + keyAuthorization + '"',
                      },
                    ],
                    TTL: 60,
                    Type: 'TXT',
                  },
                },
              ],
            },
            HostedZoneId: zoneId,
          };
          console.log('Sending delete request', JSON.stringify(deleteReq));
          const response = await route53.changeResourceRecordSets(deleteReq).promise();
          const changeId = response.ChangeInfo.Id;
          console.log(`Delete request sent for ${dnsRecord} (Change id ${changeId}); waiting for it to complete`);
          const waitRequest = route53.waitFor('resourceRecordSetsChanged', { Id: changeId });
          const waitResponse = await waitRequest.promise();
          console.log(
            `Delete request complete for ${dnsRecord}: (Change id ${waitResponse.ChangeInfo.Id}) ${waitResponse.ChangeInfo.Status}`
          );
        },
        challengePriority: ['dns-01'],
      });

      // Write private key & certificate to S3
      const certKeyWritingPromise = s3
        .putObject({
          Body: certificateKey.toString(),
          Bucket: certStorageBucketName,
          Key: certStoragePrefix + 'key.pem',
          ServerSideEncryption: 'AES256',
        })
        .promise();
      const certChainWritingPromise = s3
        .putObject({
          Body: certificate,
          Bucket: certStorageBucketName,
          Key: certStoragePrefix + 'cert.pem',
        })
        .promise();

      await Promise.all([certKeyWritingPromise, certChainWritingPromise]);
      console.log('Completed with certificate for ', domains);

      // after client.auto, an account should be available
      if (!accountURL) {
        await writeSSM('LETSENCRYPT_ACCOUNT_URL', client.getAccountUrl());
      }

      if (successSnsTopicArn) {
        await sns
          .publish({
            TopicArn: successSnsTopicArn,
            Message: `Certificate for ${JSON.stringify(domains)} issued`,
            Subject: 'Certificate Issue Success',
          })
          .promise();
      }
    } catch (err) {
      console.log('Error ', err);
      if (failureSnsTopicArn) {
        await sns
          .publish({
            TopicArn: failureSnsTopicArn,
            Message: `Certificate for ${JSON.stringify(domains)} issue failure\n${err}`,
            Subject: 'Certificate Issue Failure',
          })
          .promise();
      }
      throw err;
    }
  });

  await Promise.all(certificateRuns);
};

Automatic DNS records

CDK

This wires an EventBridge rule to a lambda. It references the clusterArn whose task state-change events we want, the serviceDiscoveryTLD to suffix records with (for us, ${props.env.region}.browser.reflow.io, so that hostnames fall under the wildcard certificate), and the hosted zone to write records into.

import { Rule } from 'aws-cdk-lib/aws-events';
import { LambdaFunction } from 'aws-cdk-lib/aws-events-targets';
import { PolicyStatement } from 'aws-cdk-lib/aws-iam';

// ...

const eventRule = new Rule(this, 'ECSChangeRule', {
  eventPattern: {
    source: ['aws.ecs'],
    detailType: ['ECS Task State Change'],
    detail: {
      clusterArn: [cluster.clusterArn],
    },
  },
});

const ecsChangeFn = new ReamplifyLambdaFunction(this, 'ECSStreamLambda', {
  ...props,
  lambdaConfig: 'stream/ecsChangeStream.ts',
  unreservedConcurrency: true,
  memorySize: 128,
  environment: {
    DOMAIN_PREFIX: props.serviceDiscoveryTLD,
    HOSTED_ZONE_ID: props.hostedZone.hostedZoneId,
  },
});

eventRule.addTarget(new LambdaFunction(ecsChangeFn));

ecsChangeFn.addToRolePolicy(
  new PolicyStatement({
    actions: ['route53:GetChange', 'route53:ChangeResourceRecordSets', 'route53:ListResourceRecordSets'],
    resources: ['arn:aws:route53:::change/*'].concat(props.hostedZone.hostedZoneArn),
  })
);
ecsChangeFn.addToRolePolicy(
  new PolicyStatement({
    actions: ['ec2:DescribeNetworkInterfaces'],
    resources: ['*'],
  })
);

The DNS lambda

The function sanity-checks each event, then:

if the task is currently RUNNING and desired RUNNING: looks up the task’s public IP and upserts an A record at ${taskId}.${DOMAIN_PREFIX}
otherwise: deletes the task’s A record
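The decision itself can be separated from the AWS calls, which makes it easy to unit test. The sketch below is a simplified restatement of the rule above, not the handler we run: planDnsAction, TaskStateChange and DnsAction are hypothetical names, and the upsert branch still needs the public IP lookup that the full handler performs.

```typescript
type DnsAction =
  | { kind: 'upsert'; hostname: string }
  | { kind: 'delete'; hostname: string }
  | { kind: 'ignore' };

// Only the event fields the decision depends on.
interface TaskStateChange {
  taskArn?: string;
  lastStatus?: string;
  desiredStatus?: string;
}

function planDnsAction(task: TaskStateChange, domainSuffix: string): DnsAction {
  // Task ARNs end in ".../<cluster-name>/<task-id>".
  const taskId = task.taskArn?.split('/').pop();
  if (!taskId || !task.lastStatus || !task.desiredStatus) {
    return { kind: 'ignore' }; // malformed event: do nothing
  }
  const hostname = `${taskId}.${domainSuffix}`;
  const isServing = task.lastStatus === 'RUNNING' && task.desiredStatus === 'RUNNING';
  return isServing ? { kind: 'upsert', hostname } : { kind: 'delete', hostname };
}

const example = planDnsAction(
  { taskArn: 'arn:aws:ecs:eu-west-2:123456789012:task/demo/abc123', lastStatus: 'RUNNING', desiredStatus: 'RUNNING' },
  'eu-west-2.browser.reflow.io'
);
console.log(example); // { kind: 'upsert', hostname: 'abc123.eu-west-2.browser.reflow.io' }
```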

import type { EventBridgeHandler } from 'aws-lambda';
import AWS from 'aws-sdk';
import { Task } from 'aws-sdk/clients/ecs';

export function assertEnv(key: string): string {
  if (process.env[key] !== undefined) {
    console.log('env', key, 'resolved by process.env as', process.env[key]!);
    return process.env[key]!;
  }
  throw new Error(`expected environment variable ${key}`);
}

const ec2 = new AWS.EC2();
const route53 = new AWS.Route53();
const DOMAIN_PREFIX = assertEnv('DOMAIN_PREFIX');
const HOSTED_ZONE_ID = assertEnv('HOSTED_ZONE_ID');

export const handler: EventBridgeHandler<string, Task, unknown> = async (event) => {
  console.log('event', JSON.stringify(event));
  const task = event.detail;
  const clusterArn = task.clusterArn;
  const lastStatus = task.lastStatus;
  const desiredStatus = task.desiredStatus;

  if (!clusterArn) {
    return;
  }

  if (!lastStatus) {
    return;
  }

  if (!desiredStatus) {
    return;
  }

  const taskArn = task.taskArn;
  if (!taskArn) {
    return;
  }
  const taskId = taskArn.split('/').pop();
  if (!taskId) {
    return;
  }

  const clusterName = clusterArn.split(':cluster/')[1];
  if (!clusterName) {
    return;
  }
  const containerDomain = `${taskId}.${DOMAIN_PREFIX}`;

  if (lastStatus === 'RUNNING' && desiredStatus === 'RUNNING') {
    const eniId = getEniId(task);
    if (!eniId) {
      return;
    }

    const taskPublicIp = await fetchEniPublicIp(eniId);
    if (!taskPublicIp) {
      return;
    }

    const recordSet = createRecordSet(containerDomain, taskPublicIp);

    await updateDnsRecord(clusterName, HOSTED_ZONE_ID, recordSet);

    console.log(`DNS record update finished for ${taskId} (${taskPublicIp})`);
  } else {
    const recordSet = await route53
      .listResourceRecordSets({
        HostedZoneId: HOSTED_ZONE_ID,
        StartRecordName: containerDomain,
        StartRecordType: 'A',
      })
      .promise();
    console.log('listRecordSets', JSON.stringify(recordSet));
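    // Match on type as well as name so a non-A record at the same name is never selected
    const found = recordSet.ResourceRecordSets.find(
      (record) => record.Name === containerDomain + '.' && record.Type === 'A'
    );
    if (found && found.ResourceRecords?.[0]?.Value) {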
      await route53
        .changeResourceRecordSets({
          HostedZoneId: HOSTED_ZONE_ID,
          ChangeBatch: {
            Changes: [
              {
                Action: 'DELETE',
                ResourceRecordSet: {
                  Name: containerDomain,
                  Type: 'A',
                  ResourceRecords: [
                    {
                      Value: found.ResourceRecords[0].Value,
                    },
                  ],
                  TTL: found.TTL,
                },
              },
            ],
          },
        })
        .promise();
    }
  }
};

function getEniId(task: Task): string | undefined {
  // attachments and details are optional on the Task shape, so guard before searching
  const eniAttachment = task.attachments?.find((attachment) => attachment.type === 'eni');
  if (!eniAttachment) {
    return undefined;
  }
  const networkInterfaceIdDetail = eniAttachment.details?.find((detail) => detail.name === 'networkInterfaceId');
  if (!networkInterfaceIdDetail) {
    return undefined;
  }
  return networkInterfaceIdDetail.value;
}

async function fetchEniPublicIp(eniId: string): Promise<string | undefined> {
  const data = await ec2
    .describeNetworkInterfaces({
      NetworkInterfaceIds: [eniId],
    })
    .promise();
  console.log(data);

  // Guard each index: an empty array would otherwise throw on property access
  return data.NetworkInterfaces?.[0]?.PrivateIpAddresses?.[0]?.Association?.PublicIp;
}

function createRecordSet(domain, publicIp) {
  return {
    Action: 'UPSERT',
    ResourceRecordSet: {
      Name: domain,
      Type: 'A',
      TTL: 60,
      ResourceRecords: [
        {
          Value: publicIp,
        },
      ],
    },
  };
}

async function updateDnsRecord(clusterName, hostedZoneId, changeRecordSet) {
  let param = {
    ChangeBatch: {
      Comment: `Auto generated Record for ECS Fargate cluster ${clusterName}`,
      Changes: [changeRecordSet],
    },
    HostedZoneId: hostedZoneId,
  };
  await route53.changeResourceRecordSets(param).promise();
}

Running this in production

We have run this in production for two months. It is not perfect, but it works well.

Things we worried about unnecessarily:

Record accumulation. We expected error conditions to leak DNS records that never get removed. Many thousands of records later, it hasn’t been an issue.
Route53 throttling. We’ve seen it a handful of times; the lambdas retry automatically and the change eventually lands.
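If leaked records ever did become a problem, a periodic reconciliation pass would be one way to catch them. The sketch below only computes which records look stale; it is not something we run, and findStaleTaskRecords and ZoneRecord are hypothetical names. It assumes you have already listed the zone's records and the ids of the running tasks, and that record names carry the trailing dot Route53 returns.

```typescript
interface ZoneRecord {
  name: string; // fully qualified, with trailing dot
  type: string; // 'A', 'TXT', ...
}

function findStaleTaskRecords(records: ZoneRecord[], runningTaskIds: string[], domainSuffix: string): string[] {
  const tail = `.${domainSuffix}.`;
  const live = new Set(runningTaskIds.map((id) => `${id}${tail}`));
  return records
    // Only A records under the task suffix are candidates; the
    // _acme-challenge TXT record lives under the same suffix and must be kept.
    .filter((r) => r.type === 'A' && r.name.endsWith(tail) && !live.has(r.name))
    .map((r) => r.name);
}

const stale = findStaleTaskRecords(
  [
    { name: 'abc.eu-west-2.browser.reflow.io.', type: 'A' },
    { name: 'old.eu-west-2.browser.reflow.io.', type: 'A' },
    { name: '_acme-challenge.eu-west-2.browser.reflow.io.', type: 'TXT' },
  ],
  ['abc'],
  'eu-west-2.browser.reflow.io'
);
console.log(stale); // [ 'old.eu-west-2.browser.reflow.io.' ]
```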

Real negatives:

Browsers sometimes refuse to see a new DNS record until refresh, even past the TTL — this surfaced as flakiness in our end-to-end tests, and we had to automate around it.
Orchestration logic is considerably more complex when you manage individual ECS tasks instead of a service behind a load balancer.
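Both the propagation delay and the stale-DNS flakiness come down to "try again until the host answers". The helper below is one possible shape for that retry, offered as a sketch rather than a description of what we shipped. waitUntilReachable and RetryOptions are hypothetical names, and the probe is supplied by the caller: in a test it might reload the page and check that the websocket connects.

```typescript
interface RetryOptions {
  attempts: number; // maximum number of probes
  initialDelayMs: number; // wait after the first failed probe
  maxDelayMs: number; // cap for the exponential backoff
}

async function waitUntilReachable(
  probe: () => Promise<boolean>,
  { attempts, initialDelayMs, maxDelayMs }: RetryOptions
): Promise<boolean> {
  let delay = initialDelayMs;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    // A rejected probe (for example a failed lookup) counts as "not yet".
    const ok = await probe().catch(() => false);
    if (ok) {
      return true;
    }
    if (attempt < attempts) {
      await new Promise((resolve) => setTimeout(resolve, delay));
      delay = Math.min(delay * 2, maxDelayMs);
    }
  }
  return false;
}

// Example: a probe that starts succeeding on its third call.
let calls = 0;
waitUntilReachable(async () => ++calls >= 3, { attempts: 5, initialDelayMs: 10, maxDelayMs: 40 }).then((ok) =>
  console.log('reachable:', ok, 'after', calls, 'probes')
);
```

Capping the delay keeps the total wait bounded, which matters when the caller is a test with its own timeout.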