Amazon CloudFront Origin and Distribution

To reduce latency, AWS provides CloudFront, a content delivery network that serves data to clients quickly. As content originators and distributors, we create content and store it in an origin, for example an S3 bucket or an EC2 instance, and then map those origins to a CloudFront distribution.

Low latency, consistency, and high throughput over the AWS global network: that is the tagline for fast-performing applications.

Edge Location and Edge Cache

CloudFront caches content at edge locations and regional edge caches. At the time of writing there are 12 regional edge caches and 210 edge locations. A regional edge cache has a larger cache than an edge location, so it can hold more cached objects.

TTL

By default, objects are cached for 24 hours. If CloudFront finds the requested object in an edge location or regional edge cache, it is a hit; otherwise it is a miss. We can control the cache time with the TTL settings: lower the TTL for dynamic content and raise it for static content that does not change often.

We can also set the TTL per file type or path pattern.

After the TTL expires, CloudFront checks the origin for the latest version of the content.

The origin can also control caching with the Cache-Control: max-age=(seconds) header, which sets how long CloudFront keeps the object cached before it checks the origin again.
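To make the interaction between the TTL settings and Cache-Control more concrete, here is a minimal Python sketch. It is a simplified model and not CloudFront's actual implementation: the function name `effective_ttl` is made up for illustration, the default values are assumptions (24 hours default, 365 days maximum), and it ignores `s-maxage`, `no-cache`, `no-store`, and the `Expires` header.

```python
import re
from typing import Optional


def effective_ttl(cache_control: Optional[str],
                  min_ttl: int = 0,
                  default_ttl: int = 86400,
                  max_ttl: int = 31536000) -> int:
    """Approximate how many seconds an object stays cached (simplified model)."""
    if cache_control:
        match = re.search(r"max-age=(\d+)", cache_control, re.IGNORECASE)
        if match:
            # The origin asked for a lifetime: clamp it between min and max TTL.
            return max(min_ttl, min(int(match.group(1)), max_ttl))
    # No max-age from the origin: fall back to the default TTL.
    return max(min_ttl, default_ttl)


print(effective_ttl("public, max-age=60"))       # origin asks for 60 seconds
print(effective_ttl(None))                       # no header, default applies
print(effective_ttl("max-age=10", min_ttl=300))  # raised to the minimum TTL
```

The clamping is the useful part: a short max-age from the origin cannot go below the minimum TTL, and a very long one cannot exceed the maximum TTL.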

Invalidation

We can invalidate cached objects at any time, but it can cost money: the first 1,000 invalidation paths each month are free, and additional paths are charged. A wildcard path counts as one path even if it invalidates many files. When invalidating files individually, up to 3,000 files can have invalidations in progress at one time, in any combination (for example, 100 requests for 30 files each, or 30 requests for 100 files each). When using wildcards, up to 15 wildcard paths can be in progress at one time, and this limit is separate from the 3,000-file limit.
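As a rough illustration, an invalidation request is essentially a list of paths plus a unique caller reference. The sketch below builds the `InvalidationBatch` structure that boto3's `create_invalidation` call takes. The helper name `build_invalidation_batch`, the example paths, and the distribution ID are assumptions made for this example; the actual API call is left as a comment so the snippet runs without AWS credentials.

```python
import time
from typing import Dict, List


def build_invalidation_batch(paths: List[str]) -> Dict:
    """Build an InvalidationBatch dict for a list of object paths."""
    for path in paths:
        if not path.startswith("/"):
            raise ValueError(f"invalidation path must start with '/': {path!r}")
    return {
        "Paths": {"Quantity": len(paths), "Items": list(paths)},
        # CallerReference must be unique per request; a timestamp is one option.
        "CallerReference": f"invalidation-{time.time_ns()}",
    }


# "/images/*" is one path, even if it matches many files.
batch = build_invalidation_batch(["/images/*", "/index.html"])
print(batch["Paths"])

# With boto3 installed and credentials configured, the request would be:
# boto3.client("cloudfront").create_invalidation(
#     DistributionId="EXAMPLE_DISTRIBUTION_ID",  # hypothetical ID
#     InvalidationBatch=batch,
# )
```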

Headers

CloudFront can forward request headers to the origin and then cache multiple versions of an object based on the values of those headers.

We can configure:

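  • Forward all headers to the origin (CloudFront does not cache the objects)

  • Forward a whitelist of headers (CloudFront caches a separate version for each combination of values in those headers)

  • Forward only the default headers (CloudFront does not cache based on header values)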
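One way to picture why a whitelist produces multiple cached versions is to think of the whitelisted header values as part of the cache key. The sketch below is a simplified mental model, not CloudFront's real key format; `cache_key` and its parameters are hypothetical names.

```python
from typing import Dict, Iterable, Tuple


def cache_key(path: str,
              request_headers: Dict[str, str],
              forwarded_headers: Iterable[str]) -> Tuple:
    """Combine the path with the values of the whitelisted headers only."""
    normalized = {name.lower(): value for name, value in request_headers.items()}
    selected = tuple(
        (name.lower(), normalized.get(name.lower(), ""))
        for name in sorted(forwarded_headers, key=str.lower)
    )
    return (path, selected)


english = {"Accept-Language": "en", "User-Agent": "browser-a"}
german = {"Accept-Language": "de", "User-Agent": "browser-b"}

# Whitelisting Accept-Language gives each language its own cached version.
print(cache_key("/home", english, ["Accept-Language"]))
print(cache_key("/home", german, ["Accept-Language"]))

# With no whitelisted headers, both requests share one cached object.
print(cache_key("/home", english, []) == cache_key("/home", german, []))
```

The more headers are whitelisted, the more distinct keys there are, which is why forwarding many headers lowers the cache hit ratio.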

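Signed URLs and Signed Cookies

A signed URL gives more control over who can access content: it can specify an expiration date and the IP address or IP range allowed to make the request. A signed URL grants access to one file at a time and is useful when the client does not support cookies.

Signed cookies can grant access to multiple files.

Both can use either a canned policy, which covers a single file and only sets an expiration date, or a custom policy, which can add a start date, IP restrictions, and wildcards that match multiple files.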
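The restrictions attached to a signed URL or signed cookie (expiry, allowed IP range) are expressed as a small JSON policy statement. Here is a minimal sketch that builds such a statement in the shape CloudFront documents for custom policies. The function name `build_policy_statement`, the example domain, and the IP range are assumptions for illustration. Signing the policy with the CloudFront private key is deliberately left out, since it needs a crypto library outside the standard library.

```python
import json
from typing import Optional


def build_policy_statement(resource: str,
                           expires_epoch: int,
                           source_ip_cidr: Optional[str] = None) -> str:
    """Return the policy JSON (compact, no whitespace) for a URL or cookie."""
    condition = {"DateLessThan": {"AWS:EpochTime": expires_epoch}}
    if source_ip_cidr:
        # Optional: only allow requests from this IP address or range.
        condition["IpAddress"] = {"AWS:SourceIp": source_ip_cidr}
    policy = {"Statement": [{"Resource": resource, "Condition": condition}]}
    return json.dumps(policy, separators=(",", ":"))


# Hypothetical distribution domain; the wildcard covers many files.
policy_json = build_policy_statement(
    "https://d111111abcdef8.cloudfront.net/videos/*",
    expires_epoch=1767225600,
    source_ip_cidr="203.0.113.0/24",
)
print(policy_json)
```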

Origin access identities

If the origin is S3, we can secure the bucket with an origin access identity (OAI). An OAI is a principal that S3 can authenticate, and it is attached to the CloudFront distribution. With a bucket policy like the following in place, we can turn on "Block all public access" so that the bucket is only reachable through CloudFront.

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "StmtCFOAI",
                "Effect": "Allow",
                "Principal": {
                    "AWS": "arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity <OAI ID>"
                },
                "Action": "s3:GetObject",
                "Resource": "arn:aws:s3:::<s3 bucket name>/*"
            }
        ]
    }

Ref

If the origin is an EC2 instance, we cannot use an OAI. Instead, we can restrict access by adding the CloudFront IP address ranges that AWS publishes to the instance's security groups. Ref

Another way to improve the performance of the application is to serve compressed content with Content-Encoding: gzip. S3 does not compress objects on its own, but we can upload a gzipped file and set its Content-Encoding metadata to gzip.
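A small sketch of the preparation step, using only the Python standard library: compress the body before upload and record the metadata the object needs. The helper name `gzip_for_upload` and the sample CSS are made up for this example, and the upload itself is shown only as a comment because it needs boto3 and AWS credentials.

```python
import gzip
from typing import Dict, Tuple


def gzip_for_upload(body: bytes) -> Tuple[bytes, Dict[str, str]]:
    """Compress a body and return it with the metadata S3 should store."""
    compressed = gzip.compress(body)
    metadata = {"ContentEncoding": "gzip"}
    return compressed, metadata


original = b"body { margin: 0; padding: 0; }\n" * 200
compressed, metadata = gzip_for_upload(original)
print(len(original), "bytes before,", len(compressed), "bytes after")

# With boto3, the upload could look like this (bucket and key are hypothetical):
# boto3.client("s3").put_object(
#     Bucket="example-bucket", Key="css/site.css",
#     Body=compressed, ContentType="text/css", **metadata,
# )
```

Browsers that receive the object with Content-Encoding: gzip decompress it transparently.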

CloudFront SSL/TLS and SNI

For HTTPS access we can use AWS Certificate Manager (ACM) or import third-party certificates. We can set the behavior to redirect HTTP to HTTPS or to serve only HTTPS content. Because CloudFront is a global service, ACM certificates used with it must be in us-east-1.

SNI (Server Name Indication) is an extension of the TLS protocol that allows multiple domains to be served over the same IP address by including the hostname in the TLS handshake. With the SNI method, CloudFront associates IP addresses with each domain, but these are not dedicated IP addresses. A browser that supports SNI automatically takes the domain name from the requested URL and adds it to the handshake request.

When CloudFront receives the request, it reads the domain name and responds with the applicable SSL/TLS certificate.

The viewer and CloudFront then complete the SSL/TLS negotiation, and CloudFront returns the requested resource.
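To see SNI from the client side, the sketch below uses Python's standard `ssl` module to produce the first handshake message (the ClientHello) in memory, without opening a network connection, and checks that the hostname is carried in it. The function name `client_hello_for` and the hostname are assumptions for this example.

```python
import ssl


def client_hello_for(hostname: str) -> bytes:
    """Return the raw ClientHello bytes a TLS client would send for hostname."""
    context = ssl.create_default_context()
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    # server_hostname is what gets placed in the SNI extension.
    tls = context.wrap_bio(incoming, outgoing, server_hostname=hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the client has written its ClientHello and now waits
        # for a server reply that never arrives in this offline sketch.
        pass
    return outgoing.read()


hello = client_hello_for("www.example.com")
print(b"www.example.com" in hello)
```

Because the hostname is present in that first message, the server can choose the matching certificate before the encrypted session is established.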

Lambda@Edge

This is an important feature to understand, and a great way to make CloudFront dynamic. We can write Lambda functions in Node.js or Python and run them before and after CloudFront's request and response handling:

  • Viewer request - after CloudFront receives the request from the viewer

  • Origin request - before CloudFront forwards the request to the origin

  • Origin response - after CloudFront receives the response from the origin

  • Viewer response - before CloudFront returns the response to the viewer
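As a small example of what such a function can look like, here is a Python handler meant for the origin response trigger that adds a security header to the response. It follows the Lambda@Edge event layout, where the response sits under `Records[0].cf.response` and headers are keyed by lowercase name. The header value and the sample event are assumptions chosen for illustration.

```python
def lambda_handler(event, context):
    """Add a Strict-Transport-Security header to the origin's response."""
    response = event["Records"][0]["cf"]["response"]
    headers = response["headers"]
    # Keys are lowercase header names; each value is a list of key/value pairs.
    headers["strict-transport-security"] = [
        {"key": "Strict-Transport-Security",
         "value": "max-age=63072000; includeSubDomains"}
    ]
    return response


# Local check with a trimmed-down, hand-written event.
sample_event = {
    "Records": [{"cf": {"response": {
        "status": "200",
        "statusDescription": "OK",
        "headers": {"content-type": [{"key": "Content-Type", "value": "text/html"}]},
    }}}]
}
result = lambda_handler(sample_event, None)
print(sorted(result["headers"]))
```

Returning the modified response object is what hands it back to CloudFront to continue processing.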

Use cases: aws.amazon.com/lambda/edge