Amazon Cloudfront Issues / Enhancements

Amazon recently (May 2012) made a significant improvement to Amazon Cloudfront by adding support for dynamic content, i.e. they treat URLs with querystrings as being unique.

Previously Cloudfront treated the following

as if the querystring didn’t exist

I am pleased to say they have listened to users and now have an option to treat querystrings and the order of parameters within them as being unique, i.e. the following are treated as separate cache objects.
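Because parameter order now matters, two URLs that differ only in the order of their parameters are cached as separate objects. A minimal sketch, using hypothetical URLs and a hypothetical helper name, of sorting the parameters before generating links so that equivalent requests share one cache object:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def normalise_query(url: str) -> str:
    """Return url with its query parameters sorted by name, then value."""
    parts = urlsplit(url)
    params = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical URLs: same parameters, different order.
a = "http://images.example.com/1.jpg?w=225&g=1"
b = "http://images.example.com/1.jpg?g=1&w=225"

print(a == b)                                    # different strings, so different cache objects
print(normalise_query(a) == normalise_query(b))  # the same string once normalised
```

This only helps for links you generate yourself; requests typed or built by third parties would still arrive in whatever order they choose.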

There are however still issues or enhancements that need to be addressed when Cloudfront is used in conjunction with ELB.

CNAME / Hostname

Cloudfront does not pass the hostname / CNAME of the request to the origin.
For example if a client requests the following URLs

  • images.x.com/1.jpg
  • images.y.com/2.jpg

the origin receives the ELB name not the hostname so it doesn’t know what the request pertains to.

This is a fatal issue for origin systems that rely on the hostname to look up configuration settings.
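To illustrate why this matters, here is a rough sketch of the kind of hostname lookup an origin might do. The settings table, the function name and the ELB hostname are all made up for the example; the two image hostnames are the ones used above.

```python
from typing import Optional

# Hypothetical per-customer settings, keyed by the hostname the client requested.
SITE_CONFIG = {
    "images.x.com": {"customer": "x"},
    "images.y.com": {"customer": "y"},
}


def lookup_config(headers: dict) -> Optional[dict]:
    """Find the customer's settings from the Host header, ignoring case and port."""
    host = headers.get("Host", "").split(":")[0].lower()
    return SITE_CONFIG.get(host)


# What the origin needs to see in order to find the right settings:
print(lookup_config({"Host": "images.x.com"}))
# What it sees instead (a made-up ELB name), so the lookup finds nothing:
print(lookup_config({"Host": "my-elb-123.eu-west-1.elb.amazonaws.com"}))
```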

A suggested workaround was to create a Distribution per customer but this didn’t work either. There is a limit of 100 Distributions. Another limit worth mentioning is a Distribution can have no more than 10 CNAMEs.

SSL CNAME / Hostname

Cloudfront doesn’t support vanity hostnames when serving traffic via HTTPS.

Referrer

Cloudfront appears to pass a blank Referrer to the origin. It is therefore not possible to use this data for reporting purposes.

UserAgent

Cloudfront passes itself to the origin as the UserAgent, overwriting the original UserAgent information. It is therefore not possible to use this data for reporting purposes.

Caching Non 200 responses

Cloudfront documentation claims “If your custom origin server responds to a CloudFront request with any of the following status codes, CloudFront caches the code for five minutes and writes the results to the access logs.”

  • 204 No Content
  • 305 Use Proxy
  • 400 Bad Request
  • 403 Forbidden
  • 404 Not Found
  • 405 Method Not Allowed
  • 414 Request-URI Too Large
  • 500 Internal Server Error
  • 501 Not Implemented
  • 502 Bad Gateway
  • 503 Service Unavailable
  • 504 Gateway Time-out

5 minutes is not long enough. It should be possible to configure Cloudfront to honour cache-control headers I send it. The whole point of Cloudfront after all is to take the load off the origin.

Worse, it doesn’t appear that Cloudfront even honours the 5 minutes for the various 400 responses I have just tested.

Verbosity

Cloudfront stuffs a load of unnecessary information in the response headers, e.g.
X-Cache: Hit from cloudfront
X-Amz-Cf-Id: eLe2RUVe3An_sGU9fJC83_BFZ2HCZh2Pfk8bXd4AErkCcTFFH7ga4A==
Via: 1.0 a51e82ee73758b3d189f044e592793b7.cloudfront.net (CloudFront), 1.0 rrba-ip-pcache-6 (NetCache NetApp/6.1.1D8), 1.1 wbs-ip-ccache-2 (NetCache NetApp/6.1.1D8)

Client IP

Cloudfront does not pass the Client IP address to the origin.
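A comment further down this post suggests the client IP may be available to the origin in the X-Forwarded-For header. Assuming that is the case, a minimal sketch of how an origin might read it; the function name and the fallback argument are my own for illustration. Bear in mind that a client can set this header itself, so the value is only a hint.

```python
def client_ip(headers: dict, remote_addr: str) -> str:
    """Return the left-most X-Forwarded-For entry, else the socket peer address."""
    forwarded = headers.get("X-Forwarded-For", "")
    first = forwarded.split(",")[0].strip()
    return first or remote_addr


# Hypothetical request headers as seen at the origin.
print(client_ip({"X-Forwarded-For": "41.215.233.146, 10.0.0.1"}, "10.0.0.2"))
print(client_ip({}, "10.0.0.2"))
```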

Cloudfront logging

Cloudfront log format does not include original CNAME requested.

#Version: 1.0
#Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) cs-uri-stem sc-status cs(Referer) cs(User-Agent) cs-uri-query
2012-06-25    13:44:21    LHR5    6184    41.215.233.146    GET    d2o181sy3p9pq1.cloudfront.net    /NexTUmvuchVSXe7Y3ThHidh9lGcj.jpg?w=225&g=1&m=1    200    -    Mozilla/5.0%20(Windows%20NT%206.1)%20AppleWebKit/536.5%20(KHTML,%20like%20Gecko)%20Chrome/19.0.1084.56%20Safari/536.5    w=225&g=1&m=1
2012-06-25    13:08:45    LHR5    7375    41.215.233.146    GET    d2o181sy3p9pq1.cloudfront.net    /NexTUmvuchVSXe7Y3ThHidh9lGcj.jpg?w=225    200    -    Mozilla/5.0%20(Windows%20NT%206.1)%20AppleWebKit/536.5%20(KHTML,%20like%20Gecko)%20Chrome/19.0.1084.56%20Safari/536.5    w=225

I also can’t tell from the log file whether a request was made over http or https, nor can I see how long CloudFront took to process the request.
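For anyone who wants to check this against their own logs, a small sketch of reading one record into named fields. It assumes the fields are whitespace-separated and in the order given by the #Fields line above; the function name is mine, and the sample record is the second log line above.

```python
from urllib.parse import unquote

# Field names copied from the #Fields line of the log sample.
FIELDS = ("date time x-edge-location sc-bytes c-ip cs-method cs(Host) "
          "cs-uri-stem sc-status cs(Referer) cs(User-Agent) cs-uri-query").split()


def parse_line(line: str) -> dict:
    """Split one whitespace-separated log record into a field-name -> value dict."""
    record = dict(zip(FIELDS, line.split()))
    # The user agent is percent-encoded in the log, so decode it for reading.
    record["cs(User-Agent)"] = unquote(record["cs(User-Agent)"])
    return record


sample = (
    "2012-06-25 13:08:45 LHR5 7375 41.215.233.146 GET "
    "d2o181sy3p9pq1.cloudfront.net /NexTUmvuchVSXe7Y3ThHidh9lGcj.jpg?w=225 200 - "
    "Mozilla/5.0%20(Windows%20NT%206.1)%20AppleWebKit/536.5%20(KHTML,%20like%20Gecko)"
    "%20Chrome/19.0.1084.56%20Safari/536.5 w=225"
)

rec = parse_line(sample)
print(rec["cs(Host)"])        # the cloudfront.net name, not the CNAME requested
print(rec["cs(User-Agent)"])
```

Note that none of the twelve fields carries the protocol or the time taken, which is the point made above.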

Other issues

These have been reported by other people and if the issues still exist it would be great if they were fixed.

“The bottom line is that requests that send an If-Modified-Since to CloudFront and get a 304 back will essentially lose the Cache-Control hints. If your Expires header is missing, or in the past, the resource will be conditionally validated on every page navigation until it gets evicted from the cache. That can cause a lot of unnecessary requests and will slow down your visitor’s page loads.”

If this behaviour is still the norm, can we have an option so that CloudFront can be configured not to do this, i.e. a client visiting on the 29th day would still receive an instruction to cache for 30 days?

“Let’s assume you have just implemented CloudFront for your newly launched website, and decided to use Cache-Control: max-age=2592000, thus allowing CloudFront (and browsers) to cache your object for 30 days. The Date header is cached at CloudFront, the max-age value also remains the same, what changes is the current time at your visitors’ browser. 5 days after launch, your object would only be browser cacheable for 25 days. After 29 days, it would be cacheable for 1 day. If a user visited your site on day 29, then again on day 31 then both these visits would result in a request being made to CloudFront since the first visit had a cache_ttl of one day only.”
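The arithmetic in that quote can be sketched as follows. This only models the behaviour as the quote describes it (browser lifetime = max-age minus the time since the edge cached the Date header); it is not a statement of how CloudFront behaves today, and the function name is my own.

```python
DAY = 24 * 60 * 60  # seconds in a day


def browser_ttl_days(max_age: int, days_since_edge_cached: int) -> int:
    """Whole days a browser may keep the object, given how old the cached Date is."""
    remaining = max_age - days_since_edge_cached * DAY
    return max(remaining, 0) // DAY


max_age = 2592000  # 30 days, the value used in the quote
for day in (0, 5, 29):
    print(day, browser_ttl_days(max_age, day))
```

So a visitor arriving on day 29 is told to cache for one day only, which is why a return visit on day 31 goes back to CloudFront.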

Summary

If you agree please fill out the Cloudfront survey

http://aws.qualtrics.com/SE/?SID=SV_e4eM1cRblPaccFS

9 Responses to “Amazon Cloudfront Issues / Enhancements”

  1. Jakob Says:

    Do you know if CloudFront has changed it in meantime, so I get ClientIP and referrer passed on to my origin?

  2. richardyeo Says:

    Looks like the Client IP address is passed if using X-Forwarded-For.
    Referrer and UserAgent sadly not.

  3. Dan Dunckel Says:

    Thanks for the article! I think some things have updated with CF lately. Anyway, check this out: http://aws.amazon.com/cloudfront-request/

  4. richardyeo Says:

    Hi Dan
    Most if not all of the issues in this post are still relevant.
    I have raised several cloudfront requests via the link you included. I have also spoken to Amazon CloudFront people at various events in London.
    Elasticera did have a customer on CloudFront but the performance and reliability just were not good enough in our experience. Our customer switched to using our Tier 1 CDN. Whilst the tier 1 CDN is more expensive it is much faster and more reliable.
    Rich

  5. richardyeo Says:

    Amazon have just announced support for custom / vanity SSL certificates.

    http://aws.amazon.com/about-aws/whats-new/2013/06/11/announcing-custom-ssl-certificates-and-zone-apex-support-for-cloudfront/

    CloudFront does not support cache clear / invalidation using wildcards.

  6. Amazon CloudFront Wildcard CNAME Support | The diary of a CTO Says:

    […] Amazon Cloudfront Issues / Enhancements […]

  7. richardyeo Says:

    Amazon has today announced error handling and caching enhancements for CloudFront

    https://richardyeo.wordpress.com/2013/09/24/amazon-cloudfront-error-handling-enhancements/

    Whilst this goes some way it doesn’t solve all the caching issues with CloudFront mentioned above.

  8. Amazon CloudFront wildcard cache invalidation | The diary of a CTO Says:

    […] https://richardyeo.wordpress.com/2012/06/24/amazon-cloudfront-issues-enhancements/ […]

  9. richardyeo Says:

    In June 2014 Amazon added support for multi-site hosting.

    If you host multiple applications on the same origin web server, Amazon CloudFront can now be configured to pass the Host Header to your custom origin server and cache responses specific to each host name.

    http://aws.amazon.com/about-aws/whats-new/2014/06/26/amazon-cloudfront-device-detection-geo-targeting-host-header-cors/
