Benchmarking AWS ARM Instances

AWS have made the t4g.micro instance free until the end of June, so it’d be a crime not to test it out. I’m comparing two pairs of instances: two small burstable instances, the sort you might use for a small website (a t3a.micro, which is AMD powered, and a t4g.micro, which is ARM), and, for pure CPU benchmarking, two larger general purpose instances (an m5.2xlarge, which is Intel powered, and an m6g.2xlarge, which is ARM). The ARM instances here run on the AWS Graviton processors, and AnandTech have a fantastic writeup on them.

Diving into my admittedly small and basic benchmarks, the first test was an ApacheBench run against a website served from each instance. I replicated this site to a LEMP stack on each instance and ran the test against that. The command I ran is below:

ab -t 120 -n 100000 -c 100 https://www.mark-gilbert.co.uk/
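
When reading ab’s output, it helps to know how its two “Time per request” figures relate to the requests per second figure and the -c concurrency level. A minimal Python sketch of that relationship follows; the function names are my own for illustration and are not part of ab, and the 50 requests per second input is an example value, not a measurement.

```python
def time_per_request_ms(requests_per_second: float, concurrency: int) -> float:
    """Mean latency seen by a single client: concurrency / throughput."""
    return concurrency / requests_per_second * 1000.0


def time_per_request_across_all_ms(requests_per_second: float) -> float:
    """Mean time per request across all concurrent clients: 1 / throughput."""
    return 1000.0 / requests_per_second


# Example values only: 50 requests/sec at the -c 100 used in the command above.
print(time_per_request_ms(50.0, 100))        # 2000.0
print(time_per_request_across_all_ms(50.0))  # 20.0
```

So with 100 concurrent clients the first figure is always 100 times the second, and both are just another view of the throughput number.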

t3a.micro ApacheBench
Requests per second: 14.95 [#/sec] (mean)
Time per request: 6690.257 [ms] (mean)
Time per request: 66.903 [ms] (mean, across all concurrent requests)
Transfer rate: 1626.65 [Kbytes/sec] received

t4g.micro ApacheBench
Requests per second: 24.65 [#/sec] (mean)
Time per request: 4057.414 [ms] (mean)
Time per request: 40.574 [ms] (mean, across all concurrent requests)
Transfer rate: 2680.52 [Kbytes/sec] received

m5.2xlarge ApacheBench
Requests per second: 67.22 [#/sec] (mean)
Time per request: 1487.566 [ms] (mean)
Time per request: 14.876 [ms] (mean, across all concurrent requests)
Transfer rate: 6876.10 [Kbytes/sec] received

m6g.2xlarge ApacheBench
Requests per second: 67.88 [#/sec] (mean)
Time per request: 1473.144 [ms] (mean)
Time per request: 14.731 [ms] (mean, across all concurrent requests)
Transfer rate: 7502.57 [Kbytes/sec] received

The performance difference on the smaller instance types is incredible: the ARM instance handles about 65% more requests per second (24.65 against 14.95), and these instances cost about 10-20% less than the equivalent Intel or AMD powered instance. For the larger instances the figures are so similar (67.88 against 67.22 requests per second) that I suspect we’re hitting another limiting factor somewhere, possibly database performance.
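
For anyone wanting to check the percentages, here is a small Python sketch that works them out from the requests per second figures above. The helper name percent_faster is mine, purely for illustration.

```python
def percent_faster(baseline: float, candidate: float) -> float:
    """How much higher the candidate's throughput is than the baseline's, in percent."""
    return (candidate / baseline - 1.0) * 100.0


# Requests per second from the ApacheBench runs above.
results = {
    "micro": {"x86": 14.95, "arm": 24.65},    # t3a.micro vs t4g.micro
    "2xlarge": {"x86": 67.22, "arm": 67.88},  # m5.2xlarge vs m6g.2xlarge
}

for size, rps in results.items():
    gain = percent_faster(rps["x86"], rps["arm"])
    print(f"{size}: ARM handled {gain:.1f}% more requests per second")
```

That gives roughly 65% for the micro pair and roughly 1% for the 2xlarge pair, which is why I think the larger instances are bottlenecked on something other than CPU.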

Next up was a basic Sysbench CPU benchmark, which repeatedly tests numbers for primality up to a limit and gives a raw CPU performance figure. I ran this on all four instances. The command used is below, shown with --threads=2 for the micro instances; I set the thread count to match the vCPU count of each instance, so 8 on the 2xlarge instances:

sysbench cpu --cpu-max-prime=20000 --threads=2 --time=120 run
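
To give a feel for what is being measured, here is a rough Python rendering of what I understand one Sysbench CPU “event” to be: a single pass checking every number below --cpu-max-prime for primality by trial division. Sysbench itself is written in C, so treat this as an approximation of its behaviour rather than its actual source; the function name cpu_event is my own.

```python
import math


def cpu_event(max_prime: int) -> int:
    """One benchmark 'event': count the primes in [3, max_prime) by trial division."""
    count = 0
    for candidate in range(3, max_prime):
        limit = math.isqrt(candidate)
        divisor = 2
        # Try every divisor up to the square root; stop at the first one that divides.
        while divisor <= limit and candidate % divisor != 0:
            divisor += 1
        if divisor > limit:  # no divisor found, so candidate is prime
            count += 1
    return count


# Small example: primes from 3 up to (but not including) 30.
print(cpu_event(30))  # 9
```

The “events per second” figure is then how many of these passes the instance completes each second across all threads, so it is a measure of integer and branch throughput and not much else.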

t3a.micro Sysbench
CPU speed: events per second: 500.22

t4g.micro Sysbench
CPU speed: events per second: 2187.58

m5.2xlarge Sysbench
CPU speed: events per second: 2693.14

m6g.2xlarge Sysbench
CPU speed: events per second: 8774.13

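As with the ApacheBench tests on the micro instances, the ARM instances absolutely wipe the floor with the Intel and AMD instances here: the t4g.micro manages roughly 4.4 times the events per second of the t3a.micro, and the m6g.2xlarge roughly 3.3 times that of the m5.2xlarge, and they’re 10-20% cheaper. It’s no wonder AWS are saying that the Graviton2 based ARM instances represent a 40% price-performance improvement over other instance types.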
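
To put performance and price together, here is a short Python sketch of a price-performance calculation using the Sysbench figures above. The price ratios of 0.9 and 0.8 are assumptions taken from my rough “10-20% cheaper” figure and not from actual AWS pricing, and the function name is hypothetical.

```python
def price_performance_gain(perf_ratio: float, price_ratio: float) -> float:
    """Percent improvement in performance per unit of cost.

    perf_ratio:  ARM throughput divided by x86 throughput
    price_ratio: ARM price divided by x86 price (0.8 means 20% cheaper)
    """
    return (perf_ratio / price_ratio - 1.0) * 100.0


# Sysbench events per second from the runs above: (x86, ARM).
pairs = {
    "micro": (500.22, 2187.58),     # t3a.micro vs t4g.micro
    "2xlarge": (2693.14, 8774.13),  # m5.2xlarge vs m6g.2xlarge
}

for name, (x86, arm) in pairs.items():
    perf_ratio = arm / x86
    for price_ratio in (0.9, 0.8):  # assumed: 10% and 20% cheaper
        gain = price_performance_gain(perf_ratio, price_ratio)
        print(f"{name}: {perf_ratio:.2f}x the events/sec at {price_ratio:.0%} "
              f"of the price is a {gain:.0f}% price-performance gain")
```

Note that at 20% cheaper, even identical performance would work out as a 25% price-performance improvement, so the price difference alone accounts for a good chunk of a figure like AWS’s 40%.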

Based on this I’ll certainly be looking at trying to use ARM instances where I can.