Serving Index Pages From Non-Root Locations With AWS CloudFront

Note: Adapted from someguyontheinter.net. I grabbed the content from web caches, as the site appears to have been taken offline; I found it useful, so thought it might be worth re-creating.

So, I was doing a quick experiment with hosting this site in static form in AWS S3. Details on how that works are readily available, so I’ll not go into that here. Once you’ve got a static website, it’s not hard to add a CloudFront distribution in front of it for content caching and other CDN stuff.

Once set up and with the DNS entries in place, the CloudFront distribution will present cached copies of your website in S3, and if you’ve got a flat site structure, such as this example below:

http://website-bucket.s3-website-eu-west-1.amazonaws.com/content.html

this will work fine.

However, if you have data in subfolders, i.e. non-root locations, for example if there was a folder in the bucket called “subfolder”, such as the example here:

http://website-bucket.s3-website-eu-west-1.amazonaws.com/subfolder/

and you want to be able to browse to

https://your-site.tld/subfolder/

and have the server automatically serve out the index page from within this folder, you’ll find you get a 403 error from CloudFront. This problem comes about because S3 doesn’t really have a folder structure; it has a flat structure of keys and values, with lots of cleverness that enables it to simulate a hierarchical folder structure. So your request to CloudFront gets converted into, “hey S3, give me the object whose key is subfolder/”, to which S3 correctly replies, “that doesn’t exist”. (S3 reports a missing key as a 403 rather than a 404 when the requester isn’t allowed to list the bucket, which is why the error shows up as access denied.)

When you enable S3’s static website hosting mode, however, some additional transformations are performed on inbound requests; these transformations include the ability to translate requests for a “directory” to requests for the default index page inside that “directory”, which is what we want to happen, and this is the key to the solution.
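To make the difference concrete, here is a minimal sketch in Python that imitates the two behaviours with a plain dictionary. It does not talk to AWS at all; the bucket contents, the function names and the index document name (index.html) are assumptions made up for illustration.

```python
# Hypothetical in-memory stand-in for a bucket: a flat mapping of keys to bodies.
# There are no real folders here, only keys that happen to contain "/".
BUCKET = {
    "content.html": "<h1>root page</h1>",
    "subfolder/index.html": "<h1>subfolder index</h1>",
}

INDEX_DOCUMENT = "index.html"  # assumed index document name


def rest_style_get(path):
    """Look up the exact key, with no folder logic at all."""
    key = path.lstrip("/")
    return BUCKET.get(key)  # None stands in for "no such object"


def website_style_get(path):
    """Append the index document to directory-like paths, then look up the key."""
    key = path.lstrip("/")
    if key == "" or key.endswith("/"):
        key += INDEX_DOCUMENT
    return BUCKET.get(key)
```

With this toy bucket, rest_style_get("/subfolder/") finds nothing because no object has the key "subfolder/", while website_style_get("/subfolder/") rewrites the request to "subfolder/index.html" and finds the page.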

In brief: when setting up your CloudFront distribution, don’t set the origin to the name of the S3 bucket; instead, set the origin to the static website endpoint that corresponds to that S3 bucket. Amazon are clear there is a difference here, between REST API endpoints and static website endpoints, but they’re only looking at 403 errors coming from the root in that document.
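As a rough sanity check, the two kinds of hostname can be told apart by name alone. The sketch below is my own helper, not anything AWS provides; it assumes the standard amazonaws.com naming, where website endpoints contain "s3-website" followed by either a dash or a dot and then the region.

```python
import re

# Website endpoints look like "<bucket>.s3-website-<region>.amazonaws.com"
# or "<bucket>.s3-website.<region>.amazonaws.com", depending on the region.
_WEBSITE_RE = re.compile(r"\.s3-website[-.][a-z0-9-]+\.amazonaws\.com$")


def is_website_endpoint(hostname):
    """Return True if the hostname looks like an S3 static website endpoint."""
    return bool(_WEBSITE_RE.search(hostname.lower()))
```

For example, is_website_endpoint("website-bucket.s3-website-eu-west-1.amazonaws.com") is true, while the REST-style "website-bucket.s3.amazonaws.com" is not.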

So, assuming you’ve already created the static site in S3 and that it can be accessed on the usual http://website-bucket.s3-website-eu-west-1.amazonaws.com URL, it’s example time:

  1. Create a new CloudFront distribution.
  2. When creating the CloudFront distribution, set the origin hostname to the static website endpoint and do NOT let the AWS console autocomplete an S3 bucket name for you, and do not follow the instructions that say “For example, for an Amazon S3 bucket, type the name in the format bucketname.s3.amazonaws.com”.
  3. Also, do not configure a default root object for the CloudFront distribution; we’ll let S3 handle this.
  4. Configure the desired hostname for your site, such as your-site.tld as an alternate domain name for the CloudFront distribution.
  5. Finish creating the CloudFront distribution; you’ll know you’ve done it correctly if the Origin Type of the origin is listed as “Custom Origin”, not “S3 Origin”.
  6. While the CloudFront distribution is deploying, set up the necessary DNS entries, either directly to the CloudFront distribution in Route 53 or as a CNAME in whatever DNS provider is hosting the zone for your domain.
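If you would rather script this than click through the console, the steps above might translate into something like the following. This is a sketch only: it just builds a dictionary in the shape I understand boto3’s CloudFront create_distribution call to take as its DistributionConfig, and I have not run it against AWS. The function name and origin ID are made up, the "http-only" origin policy is my addition (S3 website endpoints only speak HTTP), and the certificate settings you would need for the alternate domain name are left out.

```python
import uuid


def build_distribution_config(website_endpoint, alias):
    """Build a DistributionConfig-shaped dict using a custom (non-S3) origin."""
    origin_id = "s3-website-origin"  # hypothetical label, any unique string works
    return {
        "CallerReference": str(uuid.uuid4()),  # must be unique per request
        "Comment": "Static site served via the S3 website endpoint",
        "Enabled": True,
        "DefaultRootObject": "",  # step 3: leave index handling to S3
        "Aliases": {"Quantity": 1, "Items": [alias]},  # step 4
        "Origins": {
            "Quantity": 1,
            "Items": [{
                "Id": origin_id,
                "DomainName": website_endpoint,  # step 2: the website endpoint
                # A custom origin rather than an S3 origin (step 5).
                "CustomOriginConfig": {
                    "HTTPPort": 80,
                    "HTTPSPort": 443,
                    "OriginProtocolPolicy": "http-only",  # assumption, see above
                },
            }],
        },
        "DefaultCacheBehavior": {
            "TargetOriginId": origin_id,
            "ViewerProtocolPolicy": "redirect-to-https",
            # Legacy-style cache settings, chosen here only for brevity.
            "ForwardedValues": {"QueryString": False, "Cookies": {"Forward": "none"}},
            "MinTTL": 0,
        },
    }
```

The important part is that the origin carries a CustomOriginConfig and not an S3OriginConfig, which is the scripted equivalent of seeing “Custom Origin” in the console.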

Once your distribution is fully deployed and the DNS record has propagated, browse around in your site and you should see all of your content, and it’ll be served out from CloudFront. Essentially what’s happening is CloudFront is acting as a simple caching reverse proxy, and all of the request routing logic is being implemented at S3, so you get the best of both worlds.

Note: nothing comes without a cost, and in this case the cost is that you must make all of your content visible to the public Internet, as though you were serving direct from S3, which means that it will be possible for others to bypass the CloudFront CDN and pull content directly from S3. So be careful to not put anything in the S3 bucket that you don’t want to publish.

If you need to use the feature of CloudFront that enables you to leave your S3 bucket with restricted access, using CloudFront as the only point of entry, then this method will not work for you.
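For reference, the “visible to the public Internet” requirement usually comes down to a bucket policy that grants anonymous read access to every object. Here is a small sketch that generates such a policy as JSON; the function name is made up and the bucket name is just the example one used in this post, so adjust to suit.

```python
import json


def public_read_policy(bucket_name):
    """Return a bucket policy (as a JSON string) allowing anyone to read objects."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "PublicReadGetObject",
            "Effect": "Allow",
            "Principal": "*",  # anyone, which is exactly the trade-off described above
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket_name}/*",
        }],
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    print(public_read_policy("website-bucket"))
```

Note that the policy only grants s3:GetObject and not permission to list the bucket, so people can fetch objects they know the key for but cannot enumerate the contents.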

Experimenting with & Moving to AWS – Part 2

This is a follow-up to my previous post, Experimenting with moving to AWS.

All went well with AWS Lightsail, it’s a very serviceable VPS solution, but now I’ve had a bit of time in AWS I’ve migrated the site further to EC2. It was a simple enough process: snapshot the Lightsail machine, export that as an EC2 AMI and EBS snapshot, and then clone the whole lot from London to Ireland. The move of regions was because I have some other data already in Ireland and wanted to keep the site in the same region now.

Off the back of all that I’ve got my IPv6 connectivity back to the site again, as Lightsail did not support IPv6 addressing at the time of writing, which was a bit of a negative point for Lightsail. EC2 instances, however, most certainly do support IPv6.

I’ve also gone as far as migrating DNS management into Route 53 from Google Domains, mainly to simplify managing the domain zone.

The instance type the site is now running on is also one of the newer AMD EPYC EC2 instance types, which work out slightly cheaper than the equivalent Intel instances, so keep an eye on the instances suffixed with “a”, as you can save a bit of money there.

Experimenting with & Moving to AWS – Part 1

So, in my work environment, I’ve been heavily based in the VMware and “traditional data centre” world, covering all the usual stuff, as well as some very modern technologies like vSAN.

However, a need has now arisen for me to start skilling up in AWS technologies. So as of last week, my journey into cloud technologies has begun, and I’ve been using the fantastic A Cloud Guru site for their great courses on AWS. I’m starting from the ground up, with very little experience of AWS, so it should be an interesting path for me.

On a related note, as an easy way in to AWS, I’ve migrated this site to now live in AWS via their Lightsail platform. For what you get, it’s very cheap and has allowed me to start to experiment with AWS technologies. I’d recommend it to anyone looking at self-hosting WordPress sites. Give it a go, you can get a free month and try things out. Overall, even though the specs of the basic entry-level server look very diminutive, I’ve found the performance to be great in reality.

I’ll report back when I’m a little further on with the learning, but just for your information, the path I’ve started down is the AWS Certified Solutions Architect, starting with the associate level and hopefully working up to professional level eventually.

Wish me luck!!!