Saving 90% vs. GitHub Codespaces with EC2 Spot Instances

By auto-starting and stopping EC2 instances, we get on-demand VMs for VSCode and more!

GitHub Codespaces offers a great option for thin-client development. Because I run a lot of obscure software for my research and use an ARM MacBook, having remote machines with extra resources, standardized settings, and (in some cases) x86-64 CPUs is really beneficial. Offerings like Codespaces also provide better data durability and centralization through the cloud, and make it easy to transition between machines.

While Codespaces is a compelling offering for thin-client development, it has some major drawbacks: price ($0.09 per hour per CPU!) and customization (only a few instance types available). What we’d like to do is take advantage of spare cloud computing capacity, while still only paying for the time we use. There’s some setup work involved, but once done we get major price reductions over Codespaces. Here’s a compute price comparison (based on average spot prices in us-east-2b):

CPUs  Memory  EC2 SKU      Codespaces  On-Demand         Spot
2     4GB     c6a.large    $0.18/hr    $0.077/hr (-57%)  $0.02/hr (-89%)
4     8GB     c6a.xlarge   $0.36/hr    $0.153/hr (-57%)  $0.04/hr (-89%)
8     16GB    c6a.2xlarge  $0.72/hr    $0.306/hr (-57%)  $0.09/hr (-87%)
16    32GB    c6a.4xlarge  $1.44/hr    $0.612/hr (-57%)  $0.16/hr (-89%)
32    64GB    c6a.8xlarge  $2.88/hr    $1.224/hr (-57%)  $0.32/hr (-89%)

A few other prices should also be considered:

  • Storage: GitHub bills $0.07/GB and AWS (gp3) bills $0.08/GB each month. This is probably in the noise.
  • Data transfer: AWS bills up to $0.09/GB of data transfer (ouch!). This means if you transfer less than roughly 1GB/CPU/hr you’re still coming out ahead. AWS free tier covers the first 100GB of this.
  • Other costs: Our setup relies on an always-on server running Wireguard to start instances on demand. This can be run under the EC2 free tier on a t2.micro instance, but otherwise can run on a t4g.nano instance at roughly $3/month.
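The "roughly 1GB/CPU/hr" break-even figure for data transfer can be checked with a little arithmetic. The sketch below uses only prices quoted in this post; the function name is made up for illustration, and it assumes (as the comparison above implies) that transfer adds no cost on the Codespaces side and that the AWS free tier is ignored.

```python
# Prices (USD) taken from the comparison in this post.
CODESPACES_PER_CPU_HR = 0.09  # Codespaces price per CPU per hour
SPOT_C6A_LARGE_HR = 0.02      # average spot price for c6a.large (2 CPUs)
TRANSFER_PER_GB = 0.09        # upper bound on AWS data transfer pricing


def break_even_gb_per_cpu_hr(codespaces_cpu_hr, spot_instance_hr, cpus, transfer_per_gb):
    """GB of billed transfer per CPU-hour at which the spot savings are used up."""
    spot_cpu_hr = spot_instance_hr / cpus
    return (codespaces_cpu_hr - spot_cpu_hr) / transfer_per_gb


if __name__ == "__main__":
    gb = break_even_gb_per_cpu_hr(
        CODESPACES_PER_CPU_HR, SPOT_C6A_LARGE_HR, 2, TRANSFER_PER_GB
    )
    print(f"break-even: {gb:.2f} GB per CPU-hour")
```

With these inputs the spot instance saves about $0.08 per CPU-hour, which pays for a bit under 1GB of transfer at the $0.09/GB rate.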

Clearly moving workloads from Codespaces to EC2 provides major cost savings. But this only works if we shut down instances when they’re not in use. To do this, we’ll need to be able to tell when instances need to start and stop. We’ll tackle these in two separate steps: (1) a Wireguard VPN into AWS which auto-starts instances, and (2) a daemon that stops inactive instances.

EC2+Wireguard = Autostart

While we could easily start and stop EC2 instances via the AWS console each time we want to use them, this is hardly convenient and requires giving console access to anyone who wants to run instances. Instead, we’ll put all our EC2 instances behind a Wireguard VPN and allow that server to boot instances on demand when clients try to connect. Not only does this bypass the need for console access, but it also further protects cloud resources because only the Wireguard endpoint needs to allow outside access.

First, you’ll need to configure a Wireguard server for access to AWS. I’d recommend following the guide here. Note that, because we’re giving this Wireguard host an EC2 role, you’ll also need to make sure clients can’t reach the EC2 metadata service and obtain these credentials. You can do this using the following modified iptables rules (you’ll need to update ens5 to the actual interface on your Wireguard host):

PostUp   = iptables -A FORWARD -i wg0 -d -j ACCEPT; iptables -A FORWARD -s -o wg0 -j ACCEPT; iptables -P FORWARD DROP; iptables -t nat -A POSTROUTING -o ens5 -j MASQUERADE
PostDown = iptables -D FORWARD -i wg0 -d -j ACCEPT; iptables -D FORWARD -s -o wg0 -j ACCEPT; iptables -t nat -D POSTROUTING -o ens5 -j MASQUERADE

Once Wireguard is working, we can set up the server to auto-start EC2 instances. I’ve written a systemd service for this, ec2-autostart, which monitors the Wireguard interface, identifies traffic destined to EC2 instances, and boots them as needed. Tag EC2 instances with autostart=true and give the Wireguard server an IAM role that can start instances (see README), and instances will be started when you connect. You can build and save autostart to /usr/bin/autostart, then add autostart.service to /etc/systemd/system/autostart.service and run systemctl enable autostart.service to make sure it runs on boot.
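To make the idea concrete, here is a minimal sketch of the decision such a service has to make: given the destination addresses seen on the Wireguard interface, which instances should be started? This is not the ec2-autostart code. The Instance class, the function name, and the sample IDs and addresses are all hypothetical, and a real service would get instance data from the EC2 DescribeInstances API and act on the result with StartInstances.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical, simplified view of an EC2 instance."""
    instance_id: str
    private_ip: str
    state: str  # e.g. "stopped" or "running"
    tags: dict = field(default_factory=dict)


def instances_to_start(dest_ips, instances):
    """Return IDs of stopped, autostart-tagged instances that traffic is addressed to."""
    wanted = set(dest_ips)
    return [
        inst.instance_id
        for inst in instances
        if inst.private_ip in wanted
        and inst.state == "stopped"
        and inst.tags.get("autostart") == "true"
    ]


if __name__ == "__main__":
    # Made-up inventory for illustration only.
    fleet = [
        Instance("i-aaa", "10.0.0.10", "stopped", {"autostart": "true"}),
        Instance("i-bbb", "10.0.0.11", "running", {"autostart": "true"}),
        Instance("i-ccc", "10.0.0.12", "stopped", {}),
    ]
    print(instances_to_start(["10.0.0.10", "10.0.0.11", "10.0.0.12"], fleet))
```

Only the first instance qualifies here: the second is already running, and the third lacks the autostart=true tag, so traffic to it never triggers a boot.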

Auto-stopping instances

Next, we want our instances to stop automatically when we’re no longer using them. We’ll use another part of the ec2-autostart repo, autostop, that checks for ssh and screen sessions, then automatically shuts down the instance if neither is present for 10 minutes. Download a build of autostop to /usr/bin/autostop, then copy autostop.service to /etc/systemd/system/autostop.service. You can start or stop this with systemctl [start|stop] autostop.service, and enable it on boot using systemctl enable autostop.service.
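The idle rule described above (no ssh or screen sessions for 10 minutes) amounts to a small timer that resets whenever a session is seen. The sketch below shows that logic in isolation; it is not the autostop source, and the class name, method name, and polling scheme are assumptions made for illustration. How sessions are counted and how the shutdown is issued are left to the caller.

```python
IDLE_LIMIT_SECONDS = 10 * 60  # the 10-minute window described above


class IdleTracker:
    """Tracks how long the machine has had no ssh or screen sessions."""

    def __init__(self, now, idle_limit=IDLE_LIMIT_SECONDS):
        self.idle_limit = idle_limit
        self.last_active = now  # treat startup as the last moment of activity

    def observe(self, now, ssh_sessions, screen_sessions):
        """Record one poll; return True once the idle limit has been reached."""
        if ssh_sessions > 0 or screen_sessions > 0:
            self.last_active = now
            return False
        return now - self.last_active >= self.idle_limit


if __name__ == "__main__":
    tracker = IdleTracker(now=0)
    # Timestamps are seconds since start; session counts are made up.
    for t, ssh, screen in [(60, 1, 0), (300, 0, 0), (660, 0, 0)]:
        print(t, tracker.observe(t, ssh, screen))
```

Starting the timer at boot gives a freshly started instance a full window before it can be shut down, which avoids stopping a machine before the first ssh connection arrives.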

With autostart running on the Wireguard server and autostop running on the spot instance, the instance will automatically boot when you ssh in and shut down when no longer in use, ensuring you’re only billed for instance hours you actually use.

Using this setup, we can start and stop arbitrary EC2 instances on the fly, paying a fraction of the cost of GitHub Codespaces and having far more flexible instance options. We can also work with software that isn’t easily containerized, since we’re running a full Linux VM instead of just a VSCode container.

Eric Pauley
PhD Student & NSF Graduate Research Fellow

My current research focuses on practical security for the public cloud.