AWS – Mark Gilbert's Tech Blog

Adding root certificates to the system trust store in Amazon Linux 2023 in AWS

As AWS are constantly notifying customers about the need to update RDS certificates, I thought now would be a good time both to do it and to ensure that WordPress is connecting to RDS via TLS. Sounds simple enough: updating the certificates is simply a case of modifying the RDS instance and picking the new certificate. Getting the new root and intermediate CAs into the AL2023 system certificate store was another matter.

Once you’ve had a read of this page and downloaded the relevant bundles, you need to add them to the system store, at least in the case of WordPress. I had a good look around the internet and couldn’t find much information on this at all, but knowing AL2023 is based on Fedora, I checked and they do have documentation on it, so kudos to them for that. The basic process is as follows:

# Grab the correct bundle for your region
wget https://truststore.pki.rds.amazonaws.com/eu-west-1/eu-west-1-bundle.pem
# Copy the bundle to the trust anchors store
sudo cp eu-west-1-bundle.pem /etc/pki/ca-trust/source/anchors/
# This ensures the relevant certificate bundles in app specific formats are updated
sudo update-ca-trust
# You should then be able to run this and see the newly added certificate bundles
trust list | grep eu-west-1
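
If the trust list check comes back empty, one thing worth ruling out is a bad download. The sketch below is my own illustration rather than part of the Fedora procedure: it counts the certificates in a PEM file by counting BEGIN markers. The function name count_pem_certs is made up for the example, and the file name assumes the eu-west-1 bundle fetched above is in the current directory.

```shell
#!/bin/sh
# Count the certificates in a PEM bundle by counting BEGIN markers.
# count_pem_certs is a hypothetical helper name, not a system tool.
count_pem_certs() {
    # grep -c prints the number of matching lines; it exits non-zero
    # when that number is 0, so "|| true" keeps the function from failing.
    grep -c -- '-----BEGIN CERTIFICATE-----' "$1" || true
}

# Assumes the bundle was downloaded into the current directory
bundle="eu-west-1-bundle.pem"
if [ -f "$bundle" ]; then
    echo "$bundle contains $(count_pem_certs "$bundle") certificate(s)"
fi
```

A count of 0 would suggest the file is not the PEM bundle you expected, for example an error page saved under the same name.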

After this you can test connectivity to RDS with the MariaDB or MySQL CLI client to confirm the new certificate is being picked up. If it is, this command should connect:

#  Connect to the database
mariadb -h DBHost --ssl-verify-server-cert -P 3306 -u DBUsername -p

Next we can test PHP connectivity

# Drop into a php prompt
php -a

// Define the database variables
$host = 'DBHost';
$username = 'DBUsername';
$password = 'DBPassword';
$db_name = 'DBName';

// Initializes MySQLi
$conn = mysqli_init();

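// Tell MySQLi which CA bundle to verify the server against
// (arguments after $conn: key, certificate, CA file, CA path, cipher list)
mysqli_ssl_set($conn, NULL, NULL, "/etc/pki/ca-trust/source/anchors/eu-west-1-bundle.pem", NULL, NULL);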

// Establish the connection
mysqli_real_connect($conn, $host, $username, $password, $db_name, 3306, NULL, MYSQLI_CLIENT_SSL);

// If connection failed, show the error
if (mysqli_connect_errno())
{
    die('Failed to connect to MySQL: '.mysqli_connect_error());
}

// Run the Select query which should dump a list of WordPress users out if the connection was successful
printf("Reading data from table: \n");
$res = mysqli_query($conn, 'SELECT * FROM wp_users');
while ($row = mysqli_fetch_assoc($res))
{
    var_dump($row);
}

If that’s successful, the next step is to make sure WordPress uses TLS to connect to the database, which means adding the following to the wp-config.php file:

define( 'MYSQL_CLIENT_FLAGS', MYSQLI_CLIENT_SSL );

With that in place, the last step was to enforce SSL connectivity on the RDS instance, which simply needed the require_secure_transport parameter enabling in the RDS parameter group.

I hope this helps someone enforce SSL connectivity, or at the very least update the root certificates in AL2023 for another reason, as I found very little AL2023-specific documentation on that, but luckily the upstream Fedora docs work.
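
For anyone who prefers the CLI to the console, the parameter change might look something like the sketch below. Treat it as an outline under assumptions rather than exactly what I ran: the parameter group name wordpress-db-params is a placeholder, the value 1 and ApplyMethod=immediate are my assumptions about how to switch the option on, and the run wrapper is a made-up helper that only prints the command unless APPLY=1 is set.

```shell
#!/bin/sh
# Placeholder name: replace with the custom parameter group attached
# to your RDS instance.
PARAM_GROUP="wordpress-db-params"

# Dry-run wrapper: prints the command unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# require_secure_transport is the parameter named above; the value 1 and
# the immediate apply method are assumptions to check against your engine.
run aws rds modify-db-parameter-group \
    --db-parameter-group-name "$PARAM_GROUP" \
    --parameters "ParameterName=require_secure_transport,ParameterValue=1,ApplyMethod=immediate"
```

Running it as-is just echoes the command so you can check it over; prefix it with APPLY=1 once you are happy with the group name.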

Benchmarking AWS ARM Instances Part 1

As AWS have made the t4g.micro instance free until the end of June, it’d be a crime not to test it out. My comparisons will be between two small burstable instances, the sort you might use for a small website, a t3a.micro and a t4g.micro, and, for pure CPU benchmarking, two larger general purpose instances, an m5.2xlarge and an m6g.2xlarge. The ARM instances here run on the AWS Graviton processors, and AnandTech have a fantastic writeup on them.

So, diving straight into my admittedly small and basic benchmarks, the first test I ran was an ApacheBench test against a website being served from each instance. In this case I replicated this site to a LEMP stack on the instance and ran against that. The command I ran is below:

ab -t 120 -n 100000 -c 100 https://www.test-site.com/

t3a.micro ApacheBench
Requests per second: 14.95 [#/sec] (mean)
Time per request: 6690.257 [ms] (mean)
Time per request: 66.903 [ms] (mean, across all concurrent requests)
Transfer rate: 1626.65 [Kbytes/sec] received

t4g.micro ApacheBench
Requests per second: 24.65 [#/sec] (mean)
Time per request: 4057.414 [ms] (mean)
Time per request: 40.574 [ms] (mean, across all concurrent requests)
Transfer rate: 2680.52 [Kbytes/sec] received

m5.2xlarge ApacheBench
Requests per second: 67.22 [#/sec] (mean)
Time per request: 1487.566 [ms] (mean)
Time per request: 14.876 [ms] (mean, across all concurrent requests)
Transfer rate: 6876.10 [Kbytes/sec] received

m6g.2xlarge ApacheBench
Requests per second: 67.88 [#/sec] (mean)
Time per request: 1473.144 [ms] (mean)
Time per request: 14.731 [ms] (mean, across all concurrent requests)
Transfer rate: 7502.57 [Kbytes/sec] received

The performance difference on the smaller instance types is incredible really: the ARM instance handles about 65% more requests per second (24.65 against 14.95), and these instances cost about 10-20% less than the equivalent Intel or AMD powered instance. For the larger instances the figures are so similar I suspect we’re hitting another limiting factor somewhere, possibly database performance.
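
For anyone wanting to check the arithmetic, the percentage gain is just the difference in requests per second divided by the baseline figure. A minimal sketch using awk follows; pct_gain is a name I have made up for the example, and the inputs are the requests-per-second figures from the runs above.

```shell
#!/bin/sh
# Percentage increase from a baseline figure to a new figure,
# printed to one decimal place.
pct_gain() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new - base) / base * 100 }'
}

# Requests per second from the ApacheBench runs above
pct_gain 14.95 24.65    # t3a.micro -> t4g.micro
pct_gain 67.22 67.88    # m5.2xlarge -> m6g.2xlarge
```

The same function works for any pair of benchmark figures where higher is better.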

Next up was a basic Sysbench CPU benchmark, which verifies prime numbers by calculating primes up to a limit and gives a raw CPU performance figure. This test also covers the larger general purpose instances. The command used is below, with --threads set to match the vCPU count of the instance (2 for the micro instances, as shown):

sysbench cpu --cpu-max-prime=20000 --threads=2 --time=120 run
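
Rather than editing the thread count by hand on each instance, it can be picked up from the machine itself. This is a small sketch of that idea and not how the original runs were launched; it assumes nproc (GNU coreutils) is available, with getconf as a fallback, and it only prints a message if sysbench is not installed.

```shell
#!/bin/sh
# Use the number of online processors as the thread count.
# nproc comes from GNU coreutils; getconf is a fallback.
vcpus=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)

if command -v sysbench >/dev/null 2>&1; then
    # Same options as the command above, with threads taken from the host
    sysbench cpu --cpu-max-prime=20000 --threads="$vcpus" --time=120 run
else
    echo "sysbench not installed; would run with --threads=$vcpus"
fi
```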

t3a.micro Sysbench
CPU speed: events per second: 500.22

t4g.micro Sysbench
CPU speed: events per second: 2187.58

m5.2xlarge Sysbench
CPU speed: events per second: 2693.14

m6g.2xlarge Sysbench
CPU speed: events per second: 8774.13

As with the ApacheBench tests, the ARM instances absolutely wipe the floor with the Intel and AMD instances, and they’re 10-20% cheaper. It’s no wonder AWS are saying that the Graviton2 based ARM instances represent a 40% price-performance improvement over other instance types.

Based on this I’ll certainly be looking at trying to use ARM instances where I can.

Backing Up Data Centre Hosted Data To AWS

Currently at work, we back up the majority of our critical on-premises data to Azure, with some local retention onsite, as the majority of restores are needed for data from within the last week or so. This is done using a combination of Microsoft Azure Backup Server (MABS) and the standalone Microsoft Azure Recovery Services (MARS) software and agents.

In time the majority of the on-prem data is likely to move to the cloud, but this takes time, with various products and business functions to move, so for now we still have a large data set that we need to back up from our data centres. The cloud all this on-prem data is moving to is, and will continue to be, AWS, but the backups go into Azure, mainly for historical reasons.

We started looking at whether we could move this from Azure to AWS to reduce complexity, simplify billing and potentially reduce cost, because as I said, a lot of our estate runs in AWS already. We fairly quickly found out that the actual “AWS Backup” solution doesn’t really cover on-premises data directly, and from there things started to get complicated.

So we looked at the various iterations of Storage Gateway, including file gateway and volume gateway.

Tape gateways are pretty much ruled out as we don’t have any enterprise backup software that will write to tape storage, so we would incur additional cost to purchase licences for that.

File Gateway does most of what we need, as we can write backup output from things like MSSQL or MySQL servers running on-prem to the volume the file gateway presents, and have that written back into S3 and backed up from there. However, as this can’t be throttled in terms of bandwidth and we don’t have a Direct Connect available for it, we can’t risk annihilating our data centre egress bandwidth.

Volume gateways would do what we need in terms of being able to present storage to a VM that backups are written to, and that can then be throttled and sent into S3. From there we’d have to pick that data up and move it via AWS Backup into a proper backup with proper retention policies attached. However, as this bills as EBS rather than S3 storage, when we priced it all out it worked out considerably more expensive than our current solution of backing up into Azure, which again pretty much rules this out as an option.

Now, if only Amazon would add an on-prem option for AWS Backup, we’d be laughing – oh well, we can dream.

Serving Index Pages From non-root Locations With AWS CloudFront

Note: Adapted from someguyontheinter.net, I grabbed the content from web caches as the site appears to have been taken offline, but I did find it useful, so thought it might be worth re-creating.

So, I was doing a quick experiment with hosting this site in static form in AWS S3. Details on how that works are readily available, so I’ll not go into that here. Once you’ve got a static website it’s not hard to add a CloudFront distribution in front of it for content caching and other CDN stuff.

Once set up and with the DNS entries in place, the CloudFront distribution will present cached copies of your website in S3, and if you’ve got a flat site structure, such as this example below:

http://website-bucket.s3-website-eu-west-1.amazonaws.com/content.html

this will work fine.

However, if you have data in subfolders, i.e. non-root locations, for example a folder in the bucket called “subfolder” as in the example here:

http://website-bucket.s3-website-eu-west-1.amazonaws.com/subfolder/

and you want to be able to browse to

https://your-site.tld/subfolder/

and have the server automatically serve out the index page from within this folder, you’ll find you get a 403 error from CloudFront. This problem comes about as S3 doesn’t really have a folder structure, but rather has a flat structure of keys and values with lots of cleverness that enables it to simulate a hierarchical folder structure. So your request to CloudFront gets converted into, “hey S3, give me the object whose key is subfolder/”, to which S3 correctly replies, “that doesn’t exist”.

When you enable S3’s static website hosting mode, however, some additional transformations are performed on inbound requests; these transformations include the ability to translate requests for a “directory” to requests for the default index page inside that “directory”, which is what we want to happen, and this is the key to the solution.

In brief: when setting up your CloudFront distribution, don’t set the origin to the name of the S3 bucket; instead, set the origin to the static website endpoint that corresponds to that S3 bucket. Amazon are clear there is a difference here, between REST API endpoints and static website endpoints, but they’re only looking at 403 errors coming from the root in that document.

So, assuming you’ve already created the static site in S3 and it can be accessed on the usual http://website-bucket.s3-website-eu-west-1.amazonaws.com URL, it’s example time:

1. Create a new CloudFront distribution.
2. When creating the CloudFront distribution, set the origin hostname to the static website endpoint. Do NOT let the AWS console autocomplete an S3 bucket name for you, and do not follow the instructions that say “For example, for an Amazon S3 bucket, type the name in the format bucketname.s3.amazonaws.com”.
3. Also, do not configure a default root object for the CloudFront distribution; we’ll let S3 handle this.
4. Configure the desired hostname for your site, such as your-site.tld, as an alternate domain name for the CloudFront distribution.
5. Finish creating the CloudFront distribution; you’ll know you’ve done it correctly if the Origin Type of the origin is listed as “Custom Origin”, not “S3 Origin”.
6. While the CloudFront distribution is deploying, set up the necessary DNS entries, either directly to the CloudFront distribution in Route 53 or as a CNAME in whatever DNS provider is hosting the zone for your domain.
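
The difference between the two origin types in the steps above comes down to the hostname you type in. As a rough illustration, the sketch below builds both forms for a bucket. The function names are made up for the example, the website form assumes the dash style seen in the URLs in this post (s3-website-REGION; some regions use a dot after s3-website instead), and the REST form shown is the regional style rather than necessarily what the console suggests.

```shell
#!/bin/sh
# Build the two kinds of S3 hostname for a bucket and region.
# Both helpers are hypothetical names used only for this illustration.
rest_endpoint()    { printf '%s.s3.%s.amazonaws.com\n' "$1" "$2"; }
website_endpoint() { printf '%s.s3-website-%s.amazonaws.com\n' "$1" "$2"; }

# REST API style endpoint: CloudFront treats this as an "S3 Origin"
rest_endpoint website-bucket eu-west-1

# Static website endpoint: type this as the origin to get a "Custom Origin"
website_endpoint website-bucket eu-west-1
```

Only the second form goes through S3’s static website handling, which is what turns a request for a folder into a request for its index page.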

Once your distribution is fully deployed and the DNS record has propagated, browse around in your site and you should see all of your content, served out from CloudFront. Essentially what’s happening is CloudFront is acting as a simple caching reverse proxy, and all of the request routing logic is being implemented at S3, so you get the best of both worlds.

Note: nothing comes without a cost, and in this case the cost is that you must make all of your content visible to the public Internet, as though you were serving direct from S3, which means that it will be possible for others to bypass the CloudFront CDN and pull content directly from S3. So be careful to not put anything in the S3 bucket that you don’t want to publish.

If you need to use the feature of CloudFront that enables you to leave your S3 bucket with restricted access, using CloudFront as the only point of entry, then this method will not work for you.

Experimenting with & Moving to AWS – Part 2

This is a follow up to my previous post – Experimenting with moving to AWS

All went well with AWS Lightsail, it’s a very serviceable VPS solution, but now I’ve had a bit of time in AWS I’ve migrated the site further to EC2. It was a simple enough process: I took a snapshot of the Lightsail machine, exported that as an EC2 AMI and EBS snapshot, and then cloned the whole lot from London to Ireland. The move of regions was because I have some other data already in Ireland and wanted to keep the site in the same region.

Off the back of all that I’ve got my IPv6 connectivity back to the site again, as Lightsail does not support IPv6 addressing, which is a bit of a negative point for it. EC2 instances, however, most certainly do support IPv6.

I’ve also gone as far as migrating DNS management into Route 53 from Google Domains, mainly to simplify managing the domain zone.

The instance type the site is now running on is also one of the newer AMD EPYC EC2 instance types, which work out slightly cheaper than the equivalent Intel instances, so keep an eye on the instances suffixed with “a”, as you can save a bit of money there.

Experimenting with & Moving to AWS – Part 1

So, in my work environment, I’ve been heavily based in the VMware and “traditional data centre” world, covering all the usual stuff, as well as some very modern technologies like vSAN.

However, a need has now arisen for me to start skilling up in AWS technologies. So as of last week, my journey into cloud technologies has begun, and I’ve been using the fantastic A Cloud Guru site for their great courses on AWS. I’m starting from the ground up, with very little experience of AWS, so it should be an interesting path for me.

On a related note, for an easy way in to AWS, I’ve migrated this site to live in AWS via their Lightsail platform. For what you get, it’s very cheap and has allowed me to start to experiment with AWS technologies. I’d recommend it to anyone looking at self-hosting WordPress sites. Give it a go, you can get a free month and try things out. Overall, even though the specs of the basic entry-level server look very diminutive, I’ve found the performance to be great in reality.

I’ll report back when I’m a little further on with the learning, but just for your information, the path I’ve started down is the AWS Certified Solutions Architect, starting with the associate level and hopefully working up to professional level eventually.

Wish me luck!!!