WireGuard VPN setup in Terraform CDK for Road Warriors

Or: A good enough reason to play with Terraform CDK is setting up a disposable WireGuard VPN on cheap cloud providers


I’ve been working a lot with CDK over the last months, although almost exclusively with the one from AWS. I think it is again time to look into what Terraform has cooking, so to speak.

Yeah I know, AWS CDK version 2.0 has just been released and I should probably write about that. Though, while I appreciate the merging of the gazillion @aws-cdk/* libraries into one, there was nothing new that really stood out to me and made me want to write about it. Then again, I have not really spent a lot of time looking into it. Only enough to see that it’s not at all hard to migrate - which I am thankful for, of course.

TL;DR: skip ahead to the practical part

Use-case: Road Warrior VPN

I’ve recently moved - for the first time in my life - outside of my country of birth (Germany). It’s a longer story. To cut it short: I can’t complain about being bored.

One of the things that came up a bit unexpected for me was limited access to some of the services I got used to using in Germany. Services, that - as I now found out - do geofencing, which means: they restrict access to their services from specific geographic regions.

In addition, I find myself in possibly insecure - or at least untrustworthy - networks, through which my internet access is routed. Networks, whose maintainers possibly log or otherwise spy on my internet traffic. Sure, about everything goes through encrypted transport and there is no malicious attempt - as far as I know - but I don’t appreciate the sentiment for whatever reason. It makes my toes itch.

So, I am in need of a VPN solution that can move my apparent geographic location to where I want it to be - at least until I have switched some services to different providers. In addition, it would make me feel better to know that all traffic leaving or entering my machine flows through a secure network, which also makes my internet use opaque to the outside.

All of that I want to have on-demand. I probably won’t need it 24/7. But maybe. We’ll see.

Just use some VPN-as-a-service?

There are a great many out there. I know. I see their advertising everywhere. Or maybe I am just caught up in some weird ad-targeting loop. Who knows? From what I gather VPN-as-a-service is a very high-margin market - if likely in a deeply dark red ocean. Hence everyone is primarily competing by pushing out more ads than the competition, hence I see the ads everywhere. Just a hunch.

Anyway. I’ve never tried any of them, nor would I. I don’t trust them on principle. I know, that sounds a bit paranoid, but when I need to use a VPN (for the reasons I laid out above - or any other that I can think of), then I am especially not in the mood to put an unknown party in the middle of my network. Someone who specializes in VPN hosting and has specialized tooling to monitor VPN networks - aka my internet traffic. That is just counterintuitive to me.

Sure, if I set up my own VPN on some cloud provider, they also see my network traffic coming through and they can also profile me - or whatever my paranoia thinks those as-a-service providers are doing. But somehow I deem that far less likely. Mainly because VPN is not the primary use-case for cloud infrastructure. Far from it. Hence cloud-infra providers will likely have far more generalized tooling / monitoring / network analysis. In short: I suspect they don’t care. I might be wrong, of course. I am likely wrong about the VPN providers, too: they probably don’t care about my doings either. It would be bad for their business if they did.

Or maybe I just needed to cook up a reason to play around with some tech I am interested in. We might never know.

WireGuard - an incredibly simple to use VPN

I’ve often read about WireGuard since it was added to the mainline Linux kernel in early 2020 and was always intrigued. Sadly, no actual use-case has crossed my professional path. Sure, I played around with it a bit, but without actually needing it - there are just so many other things to play around with that get in the way.

Anyway, that is water under the bridge, for now I have an actual use-case! Let me tell you, after some first practical experience: WireGuard is indeed incredibly easy to use, compared to other VPNs I’ve used before.

If you ever had the great displeasure of connecting IPsec-based VPNs between devices from multiple vendors, you have a reference for what I mean by: ouch. Granted, it must be something like ten years back for me, so maybe things have changed and IPsec is now easy as pie - although I’ve never heard anything along those lines.

The other comparison I have is good ol’ OpenVPN. I don’t think I have ever had an it-just-works experience with it, but I can distinctly remember thinking to myself: at least it’s not OpenSwan again (no offense intended - for someone with more of a knack for this kind of thing, I am sure it’s a great solution). However, OpenVPN can become rather cumbersome rather fast as well. There are just so many settings to fiddle with, so many configurations to tune, so many ways things can go wrong.

None of that with WireGuard. It’s unbelievably simple. The configuration files are so short you would not believe that they are complete. I certainly did not in the beginning (and searched and searched to figure out what was missing, but nothing was - I am still a bit stunned). Well, at least for the kind of setup that I need - and really, this is all that interests me at this point.

However, this article is not about WireGuard. At least not in depth. I am simply not enough of a network engineer, as you might have deduced, to provide anything of value here. It’s about Terraform CDK. So let’s get to it:

Terraform - one tool to manage them all

Terraform has been around for quite a while. Since 2014 to be exact - so forever, in internet terms. It’s one of the first generalized Infrastructure-as-Code (IaC) frameworks and also the biggest gorilla in the room.

That is mainly (imo) because Terraform is fully multi-cloud: You can manage resources in about any cloud infrastructure there is. Actually more so. You can manage your own private cloud infrastructure, if you want.

Actually there is still more: You can manage about any programmable service, anything that provides some kind of API. Well, as long as there is a Terraform provider, of course. But there are so many by now: I count 35 official, 166 “verified” and 1447 community-maintained ones at the time of writing - that is 1648 in total.

If that is not enough: You can also develop your own Terraform provider, in case your service - or one you want to use - is missing from the list.

You get the gist. You can manage & instrument about any kind of infrastructure. Whether you want to maintain your compute instances on AWS, GCP or DigitalOcean, manage your organization in GitHub or your F5 BIG-IP load balancer in your data center: Terraform has your back.

Terraform old school - HCL

All that management is done by writing a huge amount of configuration files in so called Terraform HCL (HashiCorp Configuration Language), that looks like this:

resource "aws_instance" "web" {
  ami           = "ami-123123"
  instance_type = "t3.micro"

  tags = {
    Name = "HelloWorld"
  }
}

In short, you can think of HCL as YAML or JSON with additional programming-language elements, like variables and simplistic loops. There is a full specification, if you are interested, which states that HCL is one language made up of three sub-languages: a structural language (think YAML/JSON), an expression language (think awk-like programming) and a template language (think mustache or so).

Terraform new school - CDK

The new, cool things in the IaC world are Cloud Development Kits - or CDKs for short. Also, since Terraform HCL became stable in June of this year after only seven years, it’s high time to replace it! I’ve written about AWS CDK in the past and detailed the properties of CDK frameworks, so I will only provide a short summary of Terraform CDK here.

Terraform CDK allows you to use common, high-level programming languages - like Typescript, Python, Go, Java and so on - to write Infrastructure as Code. The advantages - in the very shortest way - are:

Since I never was a big fan of HCL, Terraform going CDK is really amazing to me.
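To make the "programming language instead of configuration language" idea a bit more tangible, here is a minimal sketch in plain TypeScript. It deliberately does not use the cdktf library: it only builds the Terraform JSON equivalent of the HCL example from above, once per name, in a loop. The helper name awsInstances is hypothetical, and real Terraform CDK gives you typed constructs rather than hand-built objects.

```typescript
// Plain-TypeScript sketch: no cdktf import, only the Terraform JSON shape.
type Tags = Record<string, string>;

interface InstanceConfig {
  ami: string;
  instance_type: string;
  tags: Tags;
}

// Hypothetical helper: builds the JSON equivalent of
// `resource "aws_instance" "<name>" { ... }` for every given name.
function awsInstances(names: string[], ami: string): object {
  const instances: Record<string, InstanceConfig> = {};
  for (const name of names) {
    instances[name] = { ami, instance_type: "t3.micro", tags: { Name: name } };
  }
  return { resource: { aws_instance: instances } };
}

// Two instances from one function call, something HCL needs extra syntax for.
const terraformJson = JSON.stringify(
  awsInstances(["web", "worker"], "ami-123123"),
  null,
  2
);
console.log(terraformJson);
```

The point is not the JSON itself, but that loops, functions and types come for free once the infrastructure is described in a general-purpose language.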

WireGuard as VPN with Terraform CDK

Ok, now let me show you what I cooked up and how to use it.

The code can be found on GitHub: https://github.com/ukautz/roadwarrior-vpn

If you want to use the tooling, please keep in mind:

Setup and configure

The first step is to clone the Git repository into a local folder. As always, I will showcase the examples on the command line (and maybe some GUI screenshots, this time).

# go to your project directory - or wherever you want the code to be
$ cd ~/MyProjects

# download the repo
$ git clone https://github.com/ukautz/roadwarrior-vpn
$ cd roadwarrior-vpn

Now, you need to configure the appropriate token.

DigitalOcean

First create a Personal Access Token in the web console, then:

$ export CDKTF_CONTEXT_provider=digitalocean
$ export CDKTF_CONTEXT_digitalOceanToken="your-token"

Hetzner Cloud

Create an API Token in the Hetzner Cloud Console, then:

$ export CDKTF_CONTEXT_provider=hetzner
$ export CDKTF_CONTEXT_hcloudToken="your-token"

Hint: If you plan on using the VPN on a regular basis, consider persisting those environment variables with direnv or similar.
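If you want to double-check your environment before running anything, the two variants above boil down to a small rule: a provider must be selected and the matching token must be set. A minimal sketch of that check follows. The variable names are the ones from the export commands above; the helper checkProviderEnv is hypothetical and not part of the repository.

```typescript
// Variable names taken from the export commands above; the helper is hypothetical.
const tokenVariable: Record<string, string> = {
  digitalocean: "CDKTF_CONTEXT_digitalOceanToken",
  hetzner: "CDKTF_CONTEXT_hcloudToken",
};

type ProviderEnv = Record<string, string | undefined>;

// Returns a list of human-readable problems; an empty list means the setup looks complete.
function checkProviderEnv(env: ProviderEnv): string[] {
  const provider = env["CDKTF_CONTEXT_provider"];
  if (!provider) {
    return ["CDKTF_CONTEXT_provider is not set"];
  }
  const variable = tokenVariable[provider];
  if (!variable) {
    return [`unknown provider "${provider}", expected hetzner or digitalocean`];
  }
  return env[variable] ? [] : [`${variable} is not set`];
}

// Example: provider selected, but the token is missing.
console.log(checkProviderEnv({ CDKTF_CONTEXT_provider: "hetzner" }));
```

In a real script you would pass process.env instead of the literal object.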

Start the VPN

To start the VPN, you just need to run a single make command - wait a minute, though. The first execution creates a key pair for the server and a key pair for your local machine (the client), that you can find in the keys/ folder.

This is not organization-level security, but should be sufficient for the Road Warrior use-case.

Before you execute the command, let me explain what it will do:

  1. First run setup:
    • Create a key pair for the VPN server and the client (your machine) - unless they already exist in the keys/ folder
    • Resolve NPM package dependencies
    • Resolve Terraform CDK provider dependencies
  2. Upload your local public SSH key (~/.ssh/id_ecdsa.pub - if you want to use a different one, specify the path via CDKTF_CONTEXT_sshKeyPath=/path/to/key.pub) so you can access the VPN server via SSH as well
  3. Create a VPN server - or however the cloud provider calls the VM; the smallest, cheapest size available will be used (at the time of writing, no guarantees)
  4. The VPN server is set up: WireGuard is installed and configured, including the private server key and your public client key that were created locally

This is what it should look like (depending on which cloud provider you used):

# start VPN for the first time
#   type in `yes` when asked to deploy the VPN server
$ make up

added 349 packages, and audited 399 packages in 1s

28 packages are looking for funding
  run `npm fund` for details

found 0 vulnerabilities
Generated typescript constructs in the output directory: .gen
Deploying Stack: VpnStack
Resources
 ✔ CLOUDINIT_CONFIG     WireguardCloudInit  cloudinit_config.WireguardCloudInit
 ✔ HCLOUD_SERVER        VpnServer           hcloud_server.VpnServer
 ✔ HCLOUD_SSH_KEY       SshKey              hcloud_ssh_key.SshKey

Summary: 3 created, 0 updated, 0 destroyed.

Output: client-vpn-address = 192.168.6.10/24
        server-id = 123123123
        server-ip = 123.123.123.123
        server-status = running
        server-vpn-network = 192.168.6.1/24
        server-vpn-port = 51397

Use private key for client: mDj4yZdqDJBVo3hxF53Xv0aADGjiF0HJxxxppXl9sF8=
Use public key for server:  YwOMtZ0HnjcOEfKnTzSkZXvwXV5KJIcXGpfWl10KQlI=

Well, you will of course have a different server-ip and unique private and public keys. Make note of the output, you will need it in the next step.

Connect to VPN

To connect to your freshly started VPN server, you need to create a configuration file that should look like the following (mind that you change the values according to the output of your previous make up!):

[Interface]
PrivateKey = mDj4yZdqDJBVo3hxF53Xv0aADGjiF0HJxxxppXl9sF8=
Address = 192.168.6.10/24
DNS = 1.1.1.1, 8.8.8.8

[Peer]
PublicKey = YwOMtZ0HnjcOEfKnTzSkZXvwXV5KJIcXGpfWl10KQlI=
AllowedIPs = 0.0.0.0/0
Endpoint = 123.123.123.123:51397

Let’s walk quickly through that:

That’s it! That is the whole configuration you need! And, to me unfathomably, the server-side configuration is equally small. Since your SSH key is installed on the server, you can log in and have a look at /etc/wireguard/wg0.conf, if you are interested.

Of course, if you set up more sophisticated topologies, the configuration size grows - but it will always be a far cry from the complexity and size of anything IPsec or even OpenVPN. I love it!
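Since the client configuration is just a handful of values from the make up output, it can also be rendered by a few lines of code. The following is a minimal sketch of that mapping, using the example values from above. The interface and function names are hypothetical and not part of the repository.

```typescript
// Hypothetical shape: one field per value printed by `make up`, plus the DNS servers.
interface ClientConfig {
  clientPrivateKey: string; // "Use private key for client"
  clientAddress: string;    // client-vpn-address
  dns: string[];
  serverPublicKey: string;  // "Use public key for server"
  allowedIps: string[];     // 0.0.0.0/0 routes all IPv4 traffic through the tunnel
  serverIp: string;         // server-ip
  serverPort: number;       // server-vpn-port
}

// Renders the same INI-style file as shown above.
function renderClientConfig(c: ClientConfig): string {
  return [
    "[Interface]",
    `PrivateKey = ${c.clientPrivateKey}`,
    `Address = ${c.clientAddress}`,
    `DNS = ${c.dns.join(", ")}`,
    "",
    "[Peer]",
    `PublicKey = ${c.serverPublicKey}`,
    `AllowedIPs = ${c.allowedIps.join(", ")}`,
    `Endpoint = ${c.serverIp}:${c.serverPort}`,
    "",
  ].join("\n");
}

const exampleConfig = renderClientConfig({
  clientPrivateKey: "mDj4yZdqDJBVo3hxF53Xv0aADGjiF0HJxxxppXl9sF8=",
  clientAddress: "192.168.6.10/24",
  dns: ["1.1.1.1", "8.8.8.8"],
  serverPublicKey: "YwOMtZ0HnjcOEfKnTzSkZXvwXV5KJIcXGpfWl10KQlI=",
  allowedIps: ["0.0.0.0/0"],
  serverIp: "123.123.123.123",
  serverPort: 51397,
});
console.log(exampleConfig);
```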

You can verify that you are indeed behind a VPN, e.g. by:

$ curl ifconfig.me

That should print out your VPN server’s public IP address.

On a Mac with the GUI

If you are on the Mac - and also use the GUI - you can do it from there:

Open Manage Tunnels, then choose Add Empty Tunnel.

Then add the config from above (changed to your values).

Stop the VPN

To shut the VPN server down, and thereby stop the whole VPN, I recommend first disconnecting from it (if you are connected); then you can just execute make down. Again, depending on which provider you use, the output may vary slightly. It should look “like” this:

# Type in `yes` when asked to destroy the existing resources
$ make down
Destroying Stack: VpnStack
Resources
 ✔ CLOUDINIT_CONFIG     WireguardCloudInit  cloudinit_config.WireguardCloudInit
 ✔ DIGITALOCEAN_DROPLET VpnServer           digitalocean_droplet.VpnServer
 ✔ DIGITALOCEAN_SSH_KEY SshKey              digitalocean_ssh_key.SshKey

Summary: 3 destroyed.

Note: Make sure you have the respective CDKTF_CONTEXT_provider and token environment variables (still) exported - as described above. They are needed.

Digging into it

If you are interested in the underlying Terraform CDK implementation, here are some highlights:

Are all Terraform HCL providers available for Terraform CDK?

This is an important question. One that came up immediately for me, because this is really the primary argument for using Terraform in the first place: the sheer amount of supported infrastructure.

In short, the answer is a clear and resounding: Yes.

For a slightly longer answer: The cdktf CLI provides the immensely powerful get command, which both downloads the Terraform (HCL) providers and generates CDK constructs - in the language you are using - for you. These constructs are generated directly from the HCL, which means: Any Terraform provider is supported, not only the official ones.

For a long explanation and excellent tutorial, have a look in the documentation.

Modularized server configuration via Cloud Init

Modularization and encapsulation are among the huge advantages that CDK-style frameworks offer, because they can use what the underlying high-level language already provides: functions, classes, packages/libraries. Sure, “classical” Terraform has a module concept, but if you have ever worked with that - and then compare it to the ease and flexibility that simple functions and classes grant you…

To give you an example: I encapsulated the cloud-init configuration in a dedicated Level 2 CDK construct, which inherits from the cloudinit_config directive that you might already know if you have used Terraform HCL. An instance of it is then passed to the respective server constructs in DigitalOcean or Hetzner Cloud.

Side note: A surprise here was that DigitalOcean does not support base64, gzipped cloud-init files. So I needed to introduce a parameter to allow plain, uncompressed user data.
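To illustrate what such a switch amounts to, here is a minimal sketch using Node's zlib module. It is not the repository's implementation: the function names and the plain flag are hypothetical, and the cloud-init content is just a placeholder.

```typescript
import { gzipSync, gunzipSync } from "node:zlib";

// Hypothetical helper: `plain` stands for the "plain, uncompressed user data" parameter.
function encodeUserData(cloudInit: string, plain: boolean): string {
  if (plain) {
    // For providers that expect the cloud-init text as-is.
    return cloudInit;
  }
  // Otherwise: gzip first, then base64-encode the compressed bytes.
  return gzipSync(Buffer.from(cloudInit, "utf8")).toString("base64");
}

// Reverse of the compressed variant, handy for checking what a server would receive.
function decodeUserData(encoded: string): string {
  return gunzipSync(Buffer.from(encoded, "base64")).toString("utf8");
}

// Placeholder content, not the actual WireGuard setup from the repository.
const cloudInit = "#cloud-config\npackages:\n  - wireguard\n";
const compressed = encodeUserData(cloudInit, false);
console.log(decodeUserData(compressed) === cloudInit);
```

Keeping the encoding decision in one place means the server constructs only need to say which variant they want.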

Dedicated stack per cloud provider

When you deploy the VPN infra, you can control which provider is being used by setting the CDKTF_CONTEXT_provider variable to hetzner or digitalocean. Ideally, I would like to write the CDK code in such a way, that I just create “a server”. Under the hood, that should use some kind of adapter implementation that the selected provider supplies. Like so:

import * as cdktf from "cdktf";
// --%<--
cdktf.setProvider("hetzner");
new cdktf.Server(...);

That is not how it works. In the same way that Terraform (HCL) has providers that offer their own resources, Terraform CDK needs you to use the specific constructs offered by the providers. That means the code looks much more like this (if not exactly):

import * as hetzner from "./gen/provider/hetzner";
import * as digitalocean from "./gen/provider/digitalocean";
// --%<--
new hetzner.Server(...);
new digitalocean.Droplet(...);

All providers come with their own resources. If those resources “look the same”, then they do so incidentally. In the case of the server/VM DigitalOcean calls their construct Droplet, whereas Hetzner goes with Server. Both, btw, call their SSH key construct actually SshKey.

This is why I needed to create a dedicated Terraform CDK stack for each cloud provider in use:

If you look them over, you’ll find they do look suspiciously alike - but, as written above, they only “look alike”. The actual constructs in use are vastly different under the hood - as are their HCL “origins”.

Maybe in the future, there will be a “Terraform Hosting CDK” (or so) framework that provides semantics closer to my first code example and allows you to define infrastructure independently of the provider. Just switching out the credentials used would then switch the provider. Kind of how multi-cloud was pitched to me, back in the day. Or maybe I misunderstood.

Anyway, such a framework could only ever provide the “smallest common denominator” of functionality that all the supported cloud providers offer - so I am not sure whether it would be of any use. Well, it’s a dream.
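For what it’s worth, the dream could be sketched as a classic adapter pattern. The following is plain TypeScript without any cdktf imports and purely hypothetical: the interfaces, the adapters map and createServer do not exist anywhere, and the adapters only return a description of the provider-specific resource instead of creating real constructs.

```typescript
// Provider-neutral description of the server: the "smallest common denominator".
interface ServerSpec {
  name: string;
  userData: string;
}

// What a hypothetical adapter hands back: the provider-specific resource type and its arguments.
interface ResourceDescription {
  type: string;
  args: Record<string, string>;
}

interface ServerAdapter {
  describeServer(spec: ServerSpec): ResourceDescription;
}

// One adapter per provider; the argument names are illustrative only.
const adapters: Record<string, ServerAdapter> = {
  hetzner: {
    describeServer: (spec) => ({
      type: "hcloud_server",
      args: { name: spec.name, user_data: spec.userData },
    }),
  },
  digitalocean: {
    describeServer: (spec) => ({
      type: "digitalocean_droplet",
      args: { name: spec.name, user_data: spec.userData },
    }),
  },
};

// The calling code only says "a server"; the selected provider decides what that means.
function createServer(provider: string, spec: ServerSpec): ResourceDescription {
  const adapter = adapters[provider];
  if (!adapter) {
    throw new Error(`no adapter for provider "${provider}"`);
  }
  return adapter.describeServer(spec);
}

console.log(createServer("hetzner", { name: "vpn", userData: "#cloud-config\n" }));
```

Everything a provider offers beyond ServerSpec is unreachable through such an interface, which is exactly the limitation described above.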

Terraform CDK vs parameter

Not unlike when I started to dig into AWS CDK, the “parameter situation” in Terraform CDK is somewhat unclear to me.

If you have worked with Terraform HCL, you probably came across input variables. You could set them via environment variables or command line parameters at runtime, e.g.:

$ export TF_VAR_some_var=foo
$ terraform apply -var="other_var=bar"

These variables are still around in the CDK as TerraformVariable constructs. However, they are of limited use - and not only because the cdktf deploy command (basically equivalent to terraform apply) does not have a -var parameter anymore (nor does it care about TF_VAR_anything).

The situation is somewhat similar in AWS CDK, which has its origin in CloudFormation. With CloudFormation you define parameters, which can then be set on the command line (similar to the -var above):

$ aws cloudformation deploy --parameter-overrides SomeParam=Foo

While those parameters are also available in AWS CDK as CfnParameter constructs, they are of limited use, because they are resolved at deployment time, not at synthesis time. That means: They are available when the CloudFormation template is actually executed (i.e. deployed), but their value cannot be used in your CDK code, e.g. in control structures. To be clear, you cannot do the following:

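import { CfnParameter } from "aws-cdk-lib";
// --%<-- (inside a Stack constructor)
const someParam = new CfnParameter(this, "SomeParam", { type: "String" });
// valueAsString is only a placeholder token while this code runs, so this never matches
if (someParam.valueAsString === "Foo") {
    // do something
}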

Within that above if condition the someParam value would be something like ${TOKEN[0]}. The same goes for TerraformVariable constructs. Their value is “not available at code execution time”. Hence: they are of limited use.
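The timing problem can be reproduced with a toy model in a few lines. To be clear about the assumptions: this is not how AWS CDK or Terraform CDK implement tokens internally, the class and function names are hypothetical, and the placeholder format is only modeled after the example above.

```typescript
// Toy model only: real CDK tokens are more elaborate, but the timing problem is the same.
class DeployTimeValue {
  private static counter = 0;
  readonly placeholder: string;

  constructor(readonly name: string) {
    // Each value gets a numbered stand-in string instead of its real content.
    this.placeholder = `\${TOKEN[${DeployTimeValue.counter++}]}`;
  }
}

// "Synthesis": the program runs and only ever sees the placeholder.
const someParam = new DeployTimeValue("SomeParam");
const branchTaken = someParam.placeholder === "Foo";

// "Deployment": the engine substitutes real values long after the code above has finished.
function resolve(
  template: string,
  values: Record<string, string>,
  params: DeployTimeValue[]
): string {
  return params.reduce(
    (out, p) => out.split(p.placeholder).join(values[p.name] ?? ""),
    template
  );
}

const template = `name = ${someParam.placeholder}`;
console.log(branchTaken, resolve(template, { SomeParam: "Foo" }, [someParam]));
```

The comparison runs during "synthesis" and sees only the placeholder, while the real value "Foo" shows up only when the template is resolved.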

So what do you do instead?

Again, going back to AWS CDK, which is quite mature and which people have already put some thought into: AWS CDK recommends using Context instead. Context can be provided in multiple ways:

Once it is provided, it can be easily used in the code:

import * as cdk from "aws-cdk-lib";
import { Construct } from "constructs";

export class SomeStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);
    // context values are plain values, available while this code runs
    if (this.node.tryGetContext("someName") === "Foo") {
      // do something
    }
  }
}

Since both AWS CDK and Terraform CDK are built on the Constructs package, the this.node.tryGetContext accessor is actually also usable in Terraform CDK. However, the ways to provide context in Terraform CDK are far more limited. Currently, you can set context only under the context key within the cdktf.json file.

Theoretically, you could also set a JSON-encoded value in the CDKTF_CONTEXT_JSON environment variable - but that is overwritten when the cdktf.json file is read. So: no, you can’t.

To address that issue, I went ahead and wrote an experimental solution that allows you to provide context in additional ways. You may have noticed the above instruction to use CDKTF_CONTEXT_<name>=<value> or modify the provider.json file. Both of these variants are part of the experimental solution. I’ve extended the Terraform CDK App construct so that it automatically loads:

Check out the implementation here and especially the usage in the main.ts file.
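To give a rough idea of the environment variable part, here is a minimal sketch of how CDKTF_CONTEXT_<name>=<value> pairs could be collected into a context object. It is an illustration under my own assumptions, not a copy of the implementation in the repository: the function name contextFromEnv is hypothetical, and skipping CDKTF_CONTEXT_JSON is a choice made for this sketch.

```typescript
type ContextEnv = Record<string, string | undefined>;

// Hypothetical helper: collects CDKTF_CONTEXT_<name>=<value> pairs into { name: value }.
function contextFromEnv(
  env: ContextEnv,
  prefix = "CDKTF_CONTEXT_"
): Record<string, string> {
  const context: Record<string, string> = {};
  for (const [key, value] of Object.entries(env)) {
    // CDKTF_CONTEXT_JSON is the variable mentioned above; it is not treated as a context entry here.
    if (key === `${prefix}JSON`) {
      continue;
    }
    if (key.startsWith(prefix) && key.length > prefix.length && value !== undefined) {
      context[key.slice(prefix.length)] = value;
    }
  }
  return context;
}

const envContext = contextFromEnv({
  CDKTF_CONTEXT_provider: "hetzner",
  CDKTF_CONTEXT_hcloudToken: "your-token",
  HOME: "/home/me",
});
console.log(envContext);
```

In real use you would pass process.env and make the result available to the app, so that this.node.tryGetContext("provider") can find it.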

Whether this experiment pans out, we’ll see. For this particular use-case it seems to be working well enough. It allows me to be extremely upfront about all dependencies - aka input parameters - by having them all declared in main.ts. That is very important to me. I could also imagine how this would work in a typical staging scenario, in which I deploy the same stack into multiple stages / environments (like testing, staging, qa, production, you-name-it).

Alternatively, you can always use process.env["VAR"] directly - although it was considered bad practice within AWS CDK.

Testing

I did look into unit and snapshot testing in this stint, but not in great depth.

Have a look into the tests folder. If you are familiar with the Jest testing framework - which is also used when writing AWS CDK tests - you should have an easy time understanding it.

Especially interesting might be how writing tests for stacks (synth) differs slightly from writing tests for (other) constructs (synthScope).

Extend for a different cloud provider

The implementation fulfills my use-case. I likely won’t extend it. If you are interested in providing an implementation for a different cloud provider, feel free to open a PR. I will be sure to look it over and am happy to merge it (please mind some testing).

Fin

That’s it. Hope this helps someone with the same problem I had - or at least furthers the interest in Terraform CDK.