Part 2: Deploy a Scalable Video Transcoder on AWS with Serverless Architecture


This article continues from the design of the scalable video transcoder (Part-1), which was the first article in this series. This article focuses on hosting the entire backend on AWS, using the ECS, ECR, Lambda and SQS services.

The backend flow goes as follows:

  1. An Application Load Balancer (ALB) listens for HTTPS traffic on port 443. The ALB spans at least two Availability Zones (AZs) and is publicly accessible via the provided ALB DNS URL. The ALB forwards incoming traffic to a target group.

  2. A target group is a group of containers or EC2 instances to which the ALB sends traffic. In our case, the targets are ECS Fargate containers configured to expose their endpoint on port 3000. Incoming traffic is therefore sent to port 3000 of one of the targets in the target group, depending on the ALB's routing.

  3. One of the containers uploads the video file to S3, the VideoStorageS3 bucket, where the file is stored as-is. Upon upload, the S3 bucket is configured to publish an event, which invokes a Lambda function.

  4. The Lambda function is a Python program that converts the video to the required format and resolution. Under the hood, it uses an additional ffmpeg layer as the key tool responsible for the actual video conversion.

  5. Once the Lambda function completes and the output video file is generated, it is uploaded to the output bucket, VideoOutputS3.

  6. When the output video upload is complete, this bucket too publishes an event, this time to an SQS queue.

  7. The SQS queue receives this event and makes it available to consumers. The ECS containers from Step 2 keep polling the queue, so upon receiving the message they are notified that the video transcoding is done.

  8. Finally, the containers update the record for the particular video ID in the RDS PostgreSQL database, marking the conversion process as complete. (A rough CLI sketch of the queue-polling step follows below.)
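To make step 7 concrete, here is a rough sketch of the polling loop expressed as AWS CLI calls; the queue URL is a placeholder built from the queue name that appears later in the IAM policy (video-output-queue):

# Long-poll the output queue for transcoding-complete events
aws sqs receive-message \
  --queue-url https://sqs.ap-south-1.amazonaws.com/<AWS_ACCOUNT_ID>/video-output-queue \
  --wait-time-seconds 20 \
  --max-number-of-messages 10

# After updating the video record in RDS, delete the message so it is not redelivered
aws sqs delete-message \
  --queue-url https://sqs.ap-south-1.amazonaws.com/<AWS_ACCOUNT_ID>/video-output-queue \
  --receipt-handle <RECEIPT_HANDLE>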

Let's take a look at the detailed steps to set up the backend:

Setting up AWS ECR private repository

ECR is a container registry service where we can upload our Docker images to be used by other AWS services, such as EC2 instances or ECS containers.

Part-1 also discussed how to create the Dockerfile for our application; please refer to the last part of that article to know more about it.

We first need to build the Docker image from the Dockerfile so that we can push it to the ECR registry.

To build the Docker image, we use the following command:

docker build --platform linux/arm64 -t video-transcoder-app .

The platform flag is necessary if you're using a Mac with an M-series chip, because in some cases there are compatibility issues when running a Docker image built on a Mac in Linux environments. This ensures that the image will run without any issues in an arm64-based Linux Docker container.

docker images

Run the above command to list the images.

We can see that the video-transcoder-app image has been built with the latest tag.

Now, we need to push this image to AWS ECR.

  1. Go to AWS and search for “Elastic Container Registry”.

  2. Under the private registry, go to the Repositories section and click on Create repository.

  3. Keep all other settings as-is and give the repository a name.

  4. Finally, click on Create (a CLI equivalent is shown after this list).

  5. Once done, we need to push the image we built locally to this newly created registry on AWS.

  6. Before this, ensure you have properly configured your AWS credentials locally: either static credentials (an access key and secret key in the ~/.aws directory), a token fetched temporarily using sts, or temporary credentials retrieved via SSO login.

  7. In my case, I have AWS SSO configured, so the following command uses a particular profile where I have already configured SSO.

  8. Use the command below to log in to the newly created registry. It fetches the login password (assuming you have the credentials in place) and authenticates to the registry using AWS as the username, the registry being <AWS_ACCOUNT_ID>.dkr.ecr.ap-south-1.amazonaws.com

  9. aws ecr get-login-password --region ap-south-1 --profile my-dev-profile | docker login --username AWS --password-stdin <AWS_ACCOUNT_ID>.dkr.ecr.ap-south-1.amazonaws.com

  10. After this, we need to tag the image we just built so that we can push it to our private registry. Grab the Image ID of the Docker image we built and use the command below.

  11. docker tag d51663649ad5 <AWS_ACCOUNT_ID>.dkr.ecr.ap-south-1.amazonaws.com/video-transcoder

  12. Once the image is tagged, we can now push it.

  13. docker push <AWS_ACCOUNT_ID>.dkr.ecr.ap-south-1.amazonaws.com/video-transcoder

  14. We will now be able to see the image successfully pushed to AWS ECR.
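As an aside, steps 1-4 can also be done with a single CLI call; a minimal sketch, assuming the same repository name we tagged the image with:

aws ecr create-repository \
  --repository-name video-transcoder \
  --region ap-south-1 --profile my-dev-profile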

Setting up the ECS Fargate service

AWS ECS offers Fargate, which helps run containers on demand. Fargate is an AWS-managed capacity provider that handles the provisioning of compute resources to run the containers. Since we have already created the Docker image of our application, we can utilise this AWS service, which offers a serverless way to run containers and easy-to-configure auto scaling to meet dynamic load.

We will begin by creating an ECS cluster, where we define the cluster type as Fargate. Then we create an intermediate ECS service that takes care of auto scaling and deployment: it starts and stops containers on demand, maintaining an optimal number of containers to serve the traffic based on the load. The containers created by the ECS service are called “tasks”.

Since ECS is a paid AWS service, it is important to understand the cost impact of this setup. I was able to experiment with three types of deployment strategies, each with its own associated costs.

  1. Placing the ECS tasks inside a public subnet, and assigning a public IP to the task

  2. Placing the ECS tasks inside a private subnet, and enabling routing from the private subnet to the internet via a NAT gateway

  3. Placing the ECS tasks inside a private subnet, and enabling S3 gateway endpoint along with VPC endpoint for ECR

Let's discuss each of these approaches in detail.

Placing the ECS tasks inside a public subnet, and assigning a public IP to the task

This is the simplest approach and does not require a lot of setup. Before starting, we need to ensure a few things on the VPC side of the setup.

A VPC (Virtual Private Cloud) is a logical isolation of resources in AWS, such as EC2 instances, databases and containers. Inside a VPC, we can define a subnet, another level of grouping based on an address range, which defines the range of IP addresses any entity can take when it is placed inside the subnet.

For example, if a VPC has a CIDR block of 10.0.0.0/16, then we can create two subnets, 10.0.1.0/24 and 10.0.2.0/24; each subnet then allows approximately 256 IP addresses (AWS reserves a few in each subnet, including the network and broadcast addresses). The number 256 is derived from the fact that a CIDR block such as 10.0.1.0/24 means the subnet mask is 24 bits long, i.e., 255.255.255.0, which leaves 8 host bits and therefore 2^8 = 256 possible addresses.

For subnets to route the traffic occurring within them, it is important that each subnet is associated with a route table. A route table is like a lookup table that helps packets (traffic) reach their destination by looking up the target against their destination IP.

A default route table in AWS looks like this:

Destination       Target
172.31.0.0/16     local
0.0.0.0/0         igw-xxxxxxxx (internet gateway)

This route table says that the destination 0.0.0.0/0, meaning anywhere, has the internet gateway as its target. The internet gateway is an AWS entity that provides a route towards the internet; thus, within this subnet, any traffic destined for the outside world will be routed to the internet. The other destination, 172.31.0.0/16, is the VPC's own IP range: any traffic whose destination IP falls within the VPC's range is routed locally, allowing communication with resources present in the same subnet or other subnets.

There are two types of subnets: public and private. A public subnet simply means that a resource inside it has a route (via its route table) to the outside world, i.e., the internet. To communicate with the internet, a resource must have a static IPv4 address, and thus resources in a public subnet must have a public IP from the list of available IPs.

A private subnet, on the other hand, is cut off from internet access; resources spawned inside a private subnet should not have a public IP assigned, and the subnet's route table must not contain a route pointing to the internet gateway.

There are many advantages to keeping business resources like databases, EC2 instances and tasks inside a private subnet: we get logical isolation from the outside world, and we can define various other firewall rules using security groups that prevent unwanted access to our resources.

A route table associated with a private subnet, by contrast, does not have a route to the internet gateway like the one shown above.

In our current strategy, we aim to do a simple setup that places our ECS tasks inside a public subnet and assigns them a public IP.

Cluster creation for ECS

To create an ECS cluster, go to the ECS section of the AWS console and click on Create cluster.

Provide a cluster name and the cluster type. As mentioned, this time we are creating a Fargate-based cluster to abstract the capacity provisioning away to AWS itself.

Click on Create; after some time, the cluster will be created.
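For reference, the equivalent CLI call is a one-liner (the cluster name is a placeholder of my choosing):

aws ecs create-cluster \
  --cluster-name video-transcoder-cluster \
  --capacity-providers FARGATE \
  --region ap-south-1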

Creating a task definition

A task definition holds information about our task: which Docker image the task runs, its capacity requirements in terms of the CPU and memory needed to run optimally, the environment variables or other files the task requires to run properly, and where to store the logs for the task.

Go to the task definitions section in ECS to create a new task definition.

Give the task definition an appropriate name, then move to the infrastructure requirements section.

Select Fargate as the launch type, as our cluster supports it. Use Linux/ARM64 as the platform, as this is what we used during the Docker image build step. We can give the task 1 vCPU and 2 GB of memory as its requirements.

Move to the roles section. Here we define the IAM roles our tasks need to assume in order to access other AWS resources. In our case, since the backend uploads videos to an S3 bucket and also listens to an SQS queue, we need to create an IAM role that grants exactly those permissions. The task role will be assumed by the ECS task, and AWS IAM will take care of fetching the credentials for operations such as uploading videos to S3 and receiving messages from SQS.

Go to IAM in the AWS console, click on Policies, and click Create policy.

Click on JSON and add the below permissions to allow access to other AWS resources

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "sqs:DeleteMessage",
                "s3:PutObject",
                "s3:GetObject",
                "sqs:GetQueueUrl",
                "sqs:ListDeadLetterSourceQueues",
                "sqs:ListMessageMoveTasks",
                "sqs:ReceiveMessage",
                "sqs:GetQueueAttributes",
                "sqs:ListQueueTags"
            ],
            "Resource": [
                "arn:aws:sqs:ap-south-1:<AWS_ACCOUNT_ID>:video-output-queue",
                "arn:aws:s3:::videostorage/*"
            ]
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": "sqs:ListQueues",
            "Resource": "*"
        },
        {
            "Sid": "Statement1",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject"
            ],
            "Resource": [
                "arn:aws:s3:::videooutput/*"
            ]
        }
    ]
}

Click on Next to review, and then click on Create. This creates the policy that defines access to our video storage and output buckets, as well as the SQS permissions to read and delete messages from the queue.

Next, go to IAM in the AWS console, click on Roles, and click Create role.

In the trusted entity section, as we want an ECS task to be able to use this role, we have to select Elastic Container Service Task.

Go to the next step and choose the policy we just created

Give the role a name and click on create role.
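The same policy and role can also be created from the CLI. A sketch, assuming the policy JSON above is saved as policy.json, with role and policy names that are placeholders of my choosing:

# Trust policy allowing ECS tasks to assume the role (save as trust.json)
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": { "Service": "ecs-tasks.amazonaws.com" },
    "Action": "sts:AssumeRole"
  }]
}

aws iam create-policy --policy-name video-transcoder-task-policy \
  --policy-document file://policy.json
aws iam create-role --role-name video-transcoder-task-role \
  --assume-role-policy-document file://trust.json
aws iam attach-role-policy --role-name video-transcoder-task-role \
  --policy-arn arn:aws:iam::<AWS_ACCOUNT_ID>:policy/video-transcoder-task-policy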

Once we are done with the above, let's resume the task definition setup.

Choose the newly created IAM role as the task role. Under the Task execution role section, select “Create new role”; this creates a basic AWS-managed role that can push logs to CloudWatch.

In the next step, we define the docker image and runtime variables for our task.

Give the container a name and specify the image to use. Since we have already pushed our image to ECR, we can simply copy the URI from ECR to point to the latest image version for this container. We also mark this container as essential, so that the deployment fails when this container is not healthy. You can copy the image URI from the ECR console.

In the environment variables section, you can either add custom variables, or add an environment file. You can refer to Part-1 of the article where I have explained the example environment variables, or check out this link to get an idea of the environment variables we require at runtime.

Add the default options in the log collection part.

Leave storage at the defaults and click on Create. Our task definition is now done.
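For reference, the console steps above correspond roughly to a task definition JSON like the sketch below (names and ARNs are placeholders), which can be registered with aws ecs register-task-definition --cli-input-json file://taskdef.json:

{
  "family": "video-transcoder-task",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "1024",
  "memory": "2048",
  "runtimePlatform": { "cpuArchitecture": "ARM64", "operatingSystemFamily": "LINUX" },
  "taskRoleArn": "arn:aws:iam::<AWS_ACCOUNT_ID>:role/video-transcoder-task-role",
  "executionRoleArn": "arn:aws:iam::<AWS_ACCOUNT_ID>:role/ecsTaskExecutionRole",
  "containerDefinitions": [
    {
      "name": "video-transcoder-app",
      "image": "<AWS_ACCOUNT_ID>.dkr.ecr.ap-south-1.amazonaws.com/video-transcoder:latest",
      "portMappings": [{ "containerPort": 3000, "protocol": "tcp" }],
      "essential": true
    }
  ]
}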

Service creation in the ECS cluster

We will create the ECS service inside the newly created cluster: go to the cluster, open the Services tab, and click on Create.

As we are using Fargate, the first option in the service creation step is simplified for us: just use the default capacity provider, and AWS will use Fargate to provision capacity for the containers to run.

In the deployment configuration section, select the application type as a service, and choose the task definition family that we created previously. Use the latest version.

In the replica configuration, turn off AZ rebalancing and set desired tasks to 1. This will ensure the service always maintains one instance of the task

Leave other options as is, to the defaults.

In the Networking section, choose the default VPC.

Choose a public subnet that is available by default, and use the default security group. Ensure that you select at least two subnets across two AZs, since in the next steps we will create a load balancer, which needs at least two AZs for availability. Set Public IP to “Turned on”, as we need to give each task a public IP.

For the purposes of this tutorial, I have used the default security group. However, this can be improved by using a security group that references the load balancer's security group as its source; that ensures only traffic coming from the load balancer is allowed to reach the task. The default security group allows connections from anywhere and outbound connections to anywhere, simplifying the firewall access in our use case.

As an example of using a source security group: the security group used by every ECS task would attach the SG of the load balancer as its source.

In the optional load balancing section, we configure the load balancer for our tasks.

Choose an “Application Load Balancer”. This load balancer will listen on HTTP port 80 and forward the traffic to the target group, routing it to port 3000 on all targets registered there. During the task definition creation step, we mapped port 3000, so host port 3000 maps to container port 3000, which is where our backend app hosts its endpoints.

To let our tasks scale out and in dynamically based on the load, enable auto scaling on the service. The ECS service also takes care of registering and de-registering the newly created tasks with the target group.

This configuration sets up an auto scaling policy that always keeps a minimum of 1 task and scales up to a maximum of 5 tasks when the count of requests per target on the load balancer exceeds 100. Thus, the moment the number of requests reaches 100 or more, the ECS service deploys a new task to handle the increase in load. This is simple metric-based scaling, chosen for simplicity.
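If you'd rather script this policy, a roughly equivalent target-tracking rule can be set up with the Application Auto Scaling CLI once the service exists; the cluster/service names and the ALB ResourceLabel below are placeholders:

aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --scalable-dimension ecs:service:DesiredCount \
  --resource-id service/video-transcoder-cluster/video-transcoder-service \
  --min-capacity 1 --max-capacity 5

aws application-autoscaling put-scaling-policy \
  --service-namespace ecs \
  --scalable-dimension ecs:service:DesiredCount \
  --resource-id service/video-transcoder-cluster/video-transcoder-service \
  --policy-name alb-request-count-scaling \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration '{
    "TargetValue": 100.0,
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ALBRequestCountPerTarget",
      "ResourceLabel": "app/<ALB_NAME>/<ALB_ID>/targetgroup/<TG_NAME>/<TG_ID>"
    }
  }'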

We can leave the other defaults as-is and click on Create. It will take some time before the ECS service is created successfully.
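Equivalently, the service creation itself can be scripted; a minimal sketch with placeholder subnet, security group, and target group identifiers:

aws ecs create-service \
  --cluster video-transcoder-cluster \
  --service-name video-transcoder-service \
  --task-definition video-transcoder-task \
  --desired-count 1 \
  --launch-type FARGATE \
  --network-configuration 'awsvpcConfiguration={subnets=[subnet-aaa,subnet-bbb],securityGroups=[sg-xxxx],assignPublicIp=ENABLED}' \
  --load-balancers 'targetGroupArn=<TARGET_GROUP_ARN>,containerName=video-transcoder-app,containerPort=3000'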

If everything is done correctly, you should see a service in the cluster with 1 task up and running.

The task logs should also appear successfully in CloudWatch.

Cost analysis

Each container uses some CPU and memory, both of which AWS charges for based on the region. For our example, the ap-south-1 region charges Fargate roughly 0.0238$ per vCPU per hour and 0.0026$ per GB of memory per hour.

For my use case, I am using 0.25 vCPU and 0.5 GB of memory per task. Running one task for an hour costs me:

  • CPU - 0.0238×0.25 = 0.00595$

  • Memory - 0.0026×0.5 = 0.0013$

Each task is given a public IP, and public IPv4 addresses are charged at 0.005$ per hour.

If I want to run 10 tasks for 7 days, the total cost comes to:

  • CPU - 0.00595×10(tasks)×24(hours)×7(days) = 9.996$

  • Memory - 0.0026×0.5×10×24×7 = 2.184$

  • IPv4 - 0.005×10×24×7 = 8.4$

As you can see, the cost of the public IPs alone is 8.4$, and that is with just 10 tasks running for 7 days. We can optimise the network side to reduce this cost, as we cannot really do much on the compute side.

Placing the ECS tasks inside a private subnet, and enabling routing from the private subnet to the internet via a NAT gateway

We can improve the security posture of our setup and also reduce network costs by using private subnets with a NAT gateway. A NAT gateway performs network address translation: it acts as a bridge between private subnets and the internet, allowing resources in private subnets to initiate outbound connections while remaining unreachable from the outside.

We need to create a new service; just this time, we will use a private subnet.

To create a private subnet, go to VPC section, select your region, and go to subnets. Click on create subnet

Choose your current VPC

Select an AZ where you want to keep your private subnet; it is better to dedicate some AZs to private subnets and the rest to public subnets. Give it an appropriate CIDR range from the available range of the VPC's address space. In this example, I have used 172.31.1.0/24 to allow approximately 256 IPs within this private subnet. Click on Create subnet.
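The equivalent CLI call would look roughly like this (the VPC ID and AZ are placeholders):

aws ec2 create-subnet \
  --vpc-id <VPC_ID> \
  --cidr-block 172.31.1.0/24 \
  --availability-zone ap-south-1a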

Go to the subnet you just created and edit the route table associations. Remove any route definition whose target is an internet gateway, so as to make this subnet private.

The subnet's route table should now contain no entries for an internet gateway. However, we need to create a gateway endpoint to Amazon S3, to allow our containers in the private subnet to access Amazon S3. Check out the Access S3 through gateway endpoint section in the AWS docs to know why we need a gateway endpoint.

Gateway endpoints are free of cost, and not charged by AWS.

Check out the tutorial on how to create a simple S3 gateway endpoint. Once that is in place, you can associate the endpoint with the private subnet, and AWS will automatically create a route entry with a prefix list, containing the destination IPs of the Amazon S3 service, whose target is the endpoint we just created. Any traffic towards Amazon S3 then gets routed via this endpoint.
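A sketch of creating the same gateway endpoint via the CLI, assuming you know the private subnet's route table ID:

aws ec2 create-vpc-endpoint \
  --vpc-id <VPC_ID> \
  --vpc-endpoint-type Gateway \
  --service-name com.amazonaws.ap-south-1.s3 \
  --route-table-ids <PRIVATE_ROUTE_TABLE_ID>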

With everything in place, we have just one more step: selecting this private subnet in the service creation step.

Ensure the Public IP option is turned off, so that no public IP is assigned to the task, which is exactly what we want.

The next step is to create a NAT gateway to allow the tasks in the private subnet to reach the internet.

Go to the VPC section, and under NAT gateways, click on Create NAT gateway.

Choose a public subnet to place the NAT gateway in. Select the public connectivity type so that the NAT gateway receives a public IP, and use an Elastic IP address so that AWS manages provisioning the IP for the NAT.
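Via the CLI, this step would look roughly as follows (the subnet and allocation IDs are placeholders):

# Allocate an Elastic IP for the NAT gateway
aws ec2 allocate-address --domain vpc

# Create the NAT gateway in a public subnet using that allocation
aws ec2 create-nat-gateway \
  --subnet-id <PUBLIC_SUBNET_ID> \
  --allocation-id <ALLOCATION_ID> \
  --connectivity-type public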

Because private subnets can access public subnets within the same VPC, they can also access the NAT gateway. Check out this article to know more about this way of setting up the network.

Next, go to the route table section and click on Create route table. Add the routes below (see the CLI sketch after this list):

  • VPC traffic → local

  • S3 traffic (S3 prefix list) → S3 gateway endpoint

  • Outbound traffic (0.0.0.0/0) → NAT gateway

Click on save changes and associate this route table with the private subnet we created.
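A CLI sketch of the same route table setup; the local VPC route is created automatically, and the S3 prefix-list route appears when the gateway endpoint is associated with this route table:

aws ec2 create-route-table --vpc-id <VPC_ID>

# Default route for outbound traffic via the NAT gateway
aws ec2 create-route \
  --route-table-id <ROUTE_TABLE_ID> \
  --destination-cidr-block 0.0.0.0/0 \
  --nat-gateway-id <NAT_GATEWAY_ID>

aws ec2 associate-route-table \
  --route-table-id <ROUTE_TABLE_ID> \
  --subnet-id <PRIVATE_SUBNET_ID>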

Now, our ECS service will launch tasks in the private subnet without assigning them any public IP; traffic leaving the subnet reaches the public NAT gateway, which rewrites outbound packets with its own IP and takes care of the routing process.

Cost analysis

In our previous strategy, the network costs scaled with the number of tasks, which was increasing our total cost by a lot. Because we use a single NAT gateway here, the charge applies only to this gateway, not to each task.

For Asia Pacific (Mumbai), the NAT gateway is priced at roughly 0.056$ per hour plus 0.056$ per GB of data processed; see the NAT gateway pricing page.

If we use the NAT gateway for 7 days and transfer 20 GB of data through it, the costs are as follows:

  • Fixed per hour charge for 7 days - 0.056×24×7 = 9.4$

  • Data transfer charge - 0.056×20 = 1.12$

But this does not seem very cost efficient compared to the first approach, right?

The efficiency shows when we look at the number of tasks. In the previous example, just 10 running tasks cost us 8.4$ for public IPs alone; in this case, even if 100 tasks are running and each transfers 200 MB of data (totalling 20 GB), our total network cost is still only about 10.5$.

Thus, the only variable here is the data transfer cost. This is a good strategy when the data transfer is not very heavy, and it lets us achieve a good amount of horizontal scaling.

Placing the ECS tasks inside a private subnet, along with VPC endpoint for ECR

We can optimise our costs even further if we only require outbound access to AWS services like Amazon ECR and CloudWatch.

In a private subnet, the ECS tasks cannot pull images from Amazon ECR or push logs to CloudWatch, as these services live outside the VPC. The previous NAT gateway approach works, but it is too costly if we don't really need true internet access, just access to AWS services as mentioned.

We can use AWS PrivateLink, also called VPC endpoints, for this.

We can create VPC interface endpoints for Amazon ECR and CloudWatch, so that any resource residing within our VPC can access these AWS services via the interface endpoints.

To create a VPC endpoint, head to the VPC section on AWS, and go to endpoints tab. Click create endpoint.

Select the ecr.dkr endpoint which is the docker registry endpoint. Note that we can only select one service per VPC endpoint.

Be sure to enable DNS name resolution and also check if your existing VPC supports DNS resolution

Select at least one subnet to place the VPC endpoint in. In the security groups section, select the default group to ease firewall access.

Select Full access in the policy section. Note that if you select a custom policy, it does not override the existing policies on your resources (e.g., our ECS tasks); it is merely an additional filter.

Click on create endpoint.

Repeat the same steps for the services below (a scripted version follows the list):

  • com.amazonaws.ap-south-1.logs (or com.amazonaws.ap-south-1.monitoring if logs does not work)

  • com.amazonaws.ap-south-1.ecr.api

Note that ap-south-1 here is the VPC region; if yours is different, adjust these names accordingly.
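A scripted version of the three interface endpoints, with placeholder subnet and security group IDs:

for svc in ecr.dkr ecr.api logs; do
  aws ec2 create-vpc-endpoint \
    --vpc-id <VPC_ID> \
    --vpc-endpoint-type Interface \
    --service-name com.amazonaws.ap-south-1.$svc \
    --subnet-ids <PRIVATE_SUBNET_ID> \
    --security-group-ids <DEFAULT_SG_ID> \
    --private-dns-enabled
done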

Once these are in place, we no longer need the NAT gateway; we can simply de-provision it and rely on the VPC endpoints. There is no other setup needed: since we enabled DNS resolution, AWS PrivateLink automatically takes care of resolving the DNS names for Amazon ECR and CloudWatch.

Cost analysis

VPC interface endpoints are charged 0.013$ per hour in the ap-south-1 region as of Jan 2025 (see the AWS PrivateLink pricing page).

On the network side, this reduces our total cost irrespective of the number of tasks deployed by ECS.

Running the VPC endpoints for 7 days, we have 3 interface endpoints in total: two for ECR and one for CloudWatch. The S3 gateway endpoint is free of cost.

  • Cost for 7 days for 3 endpoints - 0.013×24×7×3 ≈ 6.55$

  • Data transfer cost - 0.01$ per GB for the first 1 PB

Thus, even our data transfer costs are optimised.

Deploying an RDS PostgreSQL database

For our application, we also require a PostgreSQL database to keep track of completed video transcoding processes. Let's go through the RDS setup.

There are no complex steps here; simply use Easy create to configure a PostgreSQL database.

I am using the RDS database instance that comes under the AWS Free Tier.

Create the master username and password for the RDS instance, and don't make it publicly accessible, since our resources within the subnets (public/private) can talk to the database directly.
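For reference, an Easy create setup corresponds roughly to the CLI sketch below; the identifier, instance class, and credentials are placeholders, and you should check the free-tier eligibility of the instance class in your region:

aws rds create-db-instance \
  --db-instance-identifier video-transcoder-db \
  --engine postgres \
  --db-instance-class db.t4g.micro \
  --allocated-storage 20 \
  --master-username <MASTER_USERNAME> \
  --master-user-password <MASTER_PASSWORD> \
  --no-publicly-accessible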

Thus with this step, the RDS setup is complete.


Conclusion

Thanks for reading this article. I hope you enjoyed going into the depths of these important AWS services, learning how to do cost optimisation, and seeing how easy it is to set up a serverless architecture for any backend application.

In the next article, we'll discuss the Lambda function and the event-driven architecture in which the S3 bucket invokes the Lambda function upon video upload and the events are routed to SQS.

I hope you enjoyed reading this. Please find the relevant links to the project and my profile below.

GitHub Link

Live

My LinkedIn

X