Accessing an AWS hosted backend via IPv6

Reaching my API via IPv6 was not something I considered a high priority, but from my prior experience with AWS I took the ability for granted. Recently I found out that it's not as simple as I thought: new AWS accounts do not get access to EC2-Classic, and to my surprise EC2-VPC has only recently started supporting IPv6, and only in us-east-1 with several limitations.

While not as trivial as I thought, there are simple solutions available, so I will describe the how and why of a couple of approaches we recently used at Takumi.

Background

This summer Apple announced that they would require all apps submitted to the App Store to support IPv6. At Takumi we were not concerned: in the past I had noticed that Amazon ELBs have AAAA records as well as a handy “dualstack” DNS name carrying both A and AAAA records for the underlying hosts, and the Takumi API is accessed through an AWS ELB. Since our app did not hardcode any reliance on IPv4 (such as literal addresses or low-level packet construction), we felt no pressure to reassess our connectivity.

Then about a month ago we got our first app rejection because our API was not reachable in an IPv6-only network Apple has set up for testing. To my surprise and despair, none of our ELBs had AAAA records or a “dualstack” DNS name. I was also surprised the rejection happened at all: since our DNS setup did not include any AAAA records, it seemed Apple was testing this without using a DNS64/NAT64 network.
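Verifying this kind of thing is straightforward with dig. The hostname and addresses below are illustrative, but the empty AAAA response is exactly the symptom we saw:

```
$ dig +short A api.takumi.com
203.0.113.10
203.0.113.20
$ dig +short AAAA api.takumi.com
$
```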

I quickly found out that the reason for the missing AAAA records is that the Virtual Private Cloud offering of AWS does not support IPv6 (some support has since been rolled out in select regions), and I surmise that any resources created to wrap publicly accessible VPC instances are thus IPv4-only as well. All of our ELBs and ALBs have only A records associated with their DNS names, and no AAAA records. So the only way for an IPv6-only device to access them is via (local) DNS64/NAT64.

To sum up: because we are a new customer of AWS, we are forced to use the VPC offering, which replaces EC2-Classic (as it's now known), and this newer compute cloud does not support IPv6, even as pressure to support it grows across the internet. I was not pleased, and we are not the first app developers to be frustrated by this. And with DNS64/NAT64 in place, this of course shouldn't cause a rejection unless your app has some unusual reliance on IPv4.

However, I try to avoid getting submissions approved by explaining to the app reviewer that they should be testing differently, and to be fair: if someone attempted to access our app from an IPv6-only network, it genuinely wouldn't work. So regardless of Apple's testing procedures or their fairness, fixing the connectivity issue seemed like the quickest and best solution.

First solution

A quick solution was to find a cloud provider offering publicly accessible (IPv4 and IPv6) machines, and Digital Ocean became the provider of choice. I set up a Debian machine there with an nginx proxy listening on the IPv6 interface, proxying to our IPv4 backends via the ELB's DNS name.

Workaround using Digital Ocean proxy node

Do those A records smell a little funky?

This simple solution only took an hour or so to set up, and got our app approved without any lengthy dialogue with Apple. The nginx configuration is short, since it simply functions as a TCP proxy. A minimal sketch, assuming nginx is built with the stream module (the ELB hostname is a placeholder):
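```nginx
# /etc/nginx/nginx.conf -- stream{} lives at the top level, not inside http{}
stream {
    server {
        # Listen on the IPv6 interface only; IPv4 clients keep going
        # straight to the ELB via the existing A records.
        listen [::]:443 ipv6only=on;

        # Proxy raw TCP to the ELB; TLS is terminated at the ELB, so no
        # certificates are needed on this node. Note that nginx resolves
        # this hostname once, when the configuration is loaded.
        proxy_pass api-1234567890.us-east-1.elb.amazonaws.com:443;
    }
}
```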

Caveats

While quick and easy for most engineers to implement, this solution both adds complexity, with more moving parts, and introduces a single point of failure in another datacenter with another provider – although the proxy only carries the IPv6 traffic. More alarmingly though, because a CNAME record cannot coexist with any other record type at the same name, I had to hardcode the ELB IP addresses as A records on api.takumi.com, while the AAAA record pointed to the IPv6 address of the proxy.
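The resulting record set looked roughly like this (the addresses are illustrative placeholders):

```
api.takumi.com.  300  IN  A     203.0.113.10   ; hardcoded ELB address
api.takumi.com.  300  IN  A     203.0.113.20   ; hardcoded ELB address
api.takumi.com.  300  IN  AAAA  2001:db8::1    ; Digital Ocean proxy
```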

This meant that if AWS changed the IP addresses of our ELB (which they might do at any moment; they actively discourage any reliance on those addresses and instead provide their own DNS names, meant to be referenced via CNAME entries from customer hostnames), our users would be stuck accessing outdated IP addresses which might not even answer, and certainly wouldn't route traffic to our actual cloud servers.

A real solution would be a single DNS hostname from Amazon carrying both A and AAAA records, reliably reaching our load balancers without the need to monitor for IP address changes, or to add new proxy machines to our operational environment!

CloudFront to the Rescue

After consulting with a friend who's an engineer at Amazon, the simplest solution (with the added benefits of lower latencies and DDoS protection) turned out to be a non-caching CloudFront distribution whose origin is our production Elastic Load Balancer. When creating a CloudFront distribution it is possible to enable IPv6, and by allowing all HTTP methods, forwarding all HTTP headers, and forwarding all cookies and query string parameters, you end up with a non-caching, IPv6-accessible CloudFront distribution.

CloudFront Default Cache Behavior Settings

Relevant sections highlighted in red

CloudFront Distribution Settings

Relevant sections highlighted in red
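The same distribution can also be created programmatically. Below is a rough boto3 sketch of the settings described above; the domain names are placeholders and the field shapes follow the classic create_distribution API, so treat it as an illustration rather than a drop-in script:

```python
import time
import boto3

cloudfront = boto3.client("cloudfront")

# Sketch of a non-caching, IPv6-enabled distribution fronting an ELB.
response = cloudfront.create_distribution(DistributionConfig={
    "CallerReference": str(time.time()),  # any unique string
    "Comment": "Non-caching IPv6 front for the API",
    "Enabled": True,
    "IsIPV6Enabled": True,  # the "Enable IPv6" setting in the console
    "Origins": {
        "Quantity": 1,
        "Items": [{
            "Id": "api-elb",
            "DomainName": "api-1234567890.us-east-1.elb.amazonaws.com",
            "CustomOriginConfig": {
                "HTTPPort": 80,
                "HTTPSPort": 443,
                "OriginProtocolPolicy": "match-viewer",
            },
        }],
    },
    "DefaultCacheBehavior": {
        "TargetOriginId": "api-elb",
        "ViewerProtocolPolicy": "redirect-to-https",
        # Allow every method so the API behaves normally behind the CDN.
        "AllowedMethods": {
            "Quantity": 7,
            "Items": ["GET", "HEAD", "OPTIONS", "PUT",
                      "POST", "PATCH", "DELETE"],
            "CachedMethods": {"Quantity": 2, "Items": ["GET", "HEAD"]},
        },
        # Forward everything; forwarding all headers effectively
        # disables caching, which is exactly what we want here.
        "ForwardedValues": {
            "QueryString": True,
            "Headers": {"Quantity": 1, "Items": ["*"]},
            "Cookies": {"Forward": "all"},
        },
        "TrustedSigners": {"Enabled": False, "Quantity": 0},
        "MinTTL": 0,
    },
})
print(response["Distribution"]["DomainName"])  # e.g. dxxxx.cloudfront.net
```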

Migrating

The new setup means having a CDN in front of our API, which essentially means deploying a distributed caching system in front of a service that is, almost by definition, dangerous to cache accidentally. We also rely on our own custom HTTP headers for identifying different client platforms and versions.

In short: we don’t want to switch all of our traffic onto CloudFront in one go, as some insidious bugs might surface; and because many large ISPs and DNS providers cache aggressively and ignore TTLs, the fallout could be rather painful.

To migrate gradually to our new CloudFront distribution, I decided to change our main API DNS entry into a set of weighted records, starting with 99% of lookups receiving the workaround setup and 1% the new CloudFront distribution.

Migrating to the new setup

Weighted DNS records allow a gradual switch
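In Route 53, weighted records are multiple record sets sharing a name, each with its own SetIdentifier and Weight; resolvers receive each answer in proportion to its weight. A boto3 sketch of the 99/1 split (the zone ID, addresses, and CloudFront domain are placeholders; the AAAA records are handled the same way):

```python
import boto3

route53 = boto3.client("route53")

route53.change_resource_record_sets(
    HostedZoneId="Z1EXAMPLE",
    ChangeBatch={"Changes": [
        {
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "api.takumi.com.",
                "Type": "A",
                "SetIdentifier": "workaround",  # hardcoded ELB addresses
                "Weight": 99,
                "TTL": 60,
                "ResourceRecords": [{"Value": "203.0.113.10"},
                                    {"Value": "203.0.113.20"}],
            },
        },
        {
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "api.takumi.com.",
                "Type": "A",
                "SetIdentifier": "cloudfront",
                "Weight": 1,
                # Alias to the distribution; Z2FDTNDATAQYW2 is the fixed
                # hosted zone ID that CloudFront distributions live under.
                "AliasTarget": {
                    "HostedZoneId": "Z2FDTNDATAQYW2",
                    "DNSName": "dxxxx.cloudfront.net.",
                    "EvaluateTargetHealth": False,
                },
            },
        },
    ]},
)
```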

In a few days, if we see no issues with the new setup, we can change the weights on those records to move more traffic over to CloudFront, either gradually or all at once.

Final Setup

On top of getting the IPv6 support we need to avoid further random rejections at Apple, we’ll also benefit from better latencies and be in a stronger position to deal with any DDoS attacks. Although this was an annoying problem, caused by inconsistent testing at Apple and a cloud provider that has not fully embraced IPv6, I am very pleased with the solution, and hope it can be of help to others.

Final Setup with CloudFront

Simple and resilient

Note that in the last diagram actual IPs have been replaced with a.b.c.d, as we no longer depend on knowledge of those specifics further up in the stack.

This is an example of the same principle of building clean abstractions that most developers know from writing software, applied to systems design, where it is of course equally important.

The chain leading up to the ELB should not need to know or care about the actual machines and IPs comprising the load balancer. The fragility and complexity of the first solution can mostly be attributed to that leaky abstraction, which we have now gotten rid of completely.

Conclusions

IPv6 is coming, and I (we?) can no longer avoid dealing with it and learning more about it. Albeit random and potentially unfair, this occurrence did trigger exactly that, and in the end our API is more robust and flexible than before.

I hope our experience can help other app developers with AWS backends, although the CloudFront solution could be used by anyone with an HTTP/HTTPS-accessible API.