posted in DevOps on 2016-10-17 16:14:58 EDT by Dave Martorana
One of the most baffling things about Amazon Web Services is that, while Route 53 supports AAAA records, almost none of AWS supports IPv6 addresses. That’s slooooowly changing (CloudFront, AWS’s CDN, just received support, for instance), but which services do and don’t support it is unclear at best, and disastrous at worst.
We run DNS through Route 53 (R53), and build our server stacks on top of CloudFormation (CF), AWS’s stack-description templating service. Our CF template describes the R53 DNS records and how they should point to the Elastic Load Balancers (ELBs) that sit in front of our public-facing servers. Our ELBs - our entire stack, in fact - sit inside a Virtual Private Cloud (VPC), which is not only the best way to run AWS resources but, it turns out, the only way AWS will let new accounts run servers at all.
For those of us who have been using AWS for years, there is also AWS “Classic”, which runs EC2 servers and ELBs as stand-alone servers outside any VPC. Bizarrely, ELBs in “Classic” mode will answer requests over an IPv6 address, while newer ELBs inside a VPC won’t. Even more bizarrely, CloudFormation and Route 53 will still let you set up AAAA records pointing to ELBs inside a VPC.
See where this breaks? AWS’s own services will gladly set up an IPv6 DNS entry, point it to an ELB, and then… drop all connections across that IP address.
Yeah. So, just about every mobile operating system looks up and uses IPv4 addresses. Many resources now have both IPv4 and IPv6 addresses, but those IPv6 addresses, despite being the future of the internet, are largely ignored.
This fall, Apple changed networking in the iOS 10 release so that it prefers IPv6 addresses. The DNS servers on AT&T’s mobile network (unlike those of Verizon, Sprint, T-Mobile, etc.) have been upgraded to reply with IPv6 addresses where available.
Because CloudFormation had created AAAA records in DNS for our ELBs, DNS answered iOS 10 devices on AT&T’s mobile network with those IPv6 addresses. However, when our software then tried to call our backend API over them, AWS unceremoniously dropped the connections outright.
So for at least a couple of weeks, as people quietly upgraded their phones to iOS 10, we had tens of thousands of players who could play our multi-player games on their home Wi-Fi, but not while using mobile data.
It took us some time to track down the issue, because it showed up only on iOS 10, only on AT&T, and only on mobile data. While I don’t have exact figures, there was a reduction in play, and most certainly a direct impact on revenue.
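In hindsight, a check along these lines might have pointed at the problem sooner. It is only a minimal sketch in Python, not something we ran at the time: it resolves a host separately for IPv4 and IPv6, tries a plain TCP connection to every address it gets back, and flags any address family that has DNS records but never accepts a connection. The function names and the 3-second timeout are my own choices for illustration. Note that it has to run from a network with working IPv6 to say anything useful about AAAA records.

```python
import socket


def probe_families(host, port, timeout=3.0):
    """Resolve host for IPv4 and IPv6 and try a TCP connect to each address.

    Returns {"IPv4": [(address, reachable), ...], "IPv6": [...]}.
    An empty list means no address of that family was returned.
    """
    results = {}
    for name, family in (("IPv4", socket.AF_INET), ("IPv6", socket.AF_INET6)):
        try:
            infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
        except socket.gaierror:
            results[name] = []  # nothing resolved for this address family
            continue
        checked = []
        for fam, socktype, proto, _canonname, sockaddr in infos:
            try:
                with socket.socket(fam, socktype, proto) as sock:
                    sock.settimeout(timeout)
                    sock.connect(sockaddr)
                reachable = True
            except OSError:
                # Timeouts, refusals and "no route" all count as unreachable.
                reachable = False
            checked.append((sockaddr[0], reachable))
        results[name] = checked
    return results


def broken_families(results):
    """Families that have DNS records but where no address accepted a connection."""
    return [
        name
        for name, checked in results.items()
        if checked and not any(ok for _addr, ok in checked)
    ]


# Example (hypothetical hostname):
#   broken_families(probe_families("api.example.com", 443))
# A result containing "IPv6" would describe the situation above: an AAAA
# record exists, but nothing answers on the address it points to.
```

A result of this kind only says that the TCP connection failed from wherever the check ran, so it is a hint to investigate rather than proof of where the fault lies.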
What should happen?
Route 53 should reject AAAA records that use AWS’s proprietary pointer-type record (the alias record) to target ELB endpoints in a VPC - or, for that matter, any endpoint among the services AWS provides and supports that doesn’t yet play well with IPv6.
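Until AWS does something like that, one option is to check the template yourself before deploying. The following is a rough sketch in Python, assuming a JSON CloudFormation template: it walks the `Resources` section and reports every `AWS::Route53::RecordSet` (or entry in an `AWS::Route53::RecordSetGroup`) whose `Type` is `AAAA` and which has an `AliasTarget`. The function name, the sample template, and the resource names in it are made up for illustration, and the check cannot tell whether the alias target actually supports IPv6. It just lists the records worth a second look.

```python
import json


def find_aaaa_alias_records(template):
    """Return (logical_id, record_name) for each AAAA record using AliasTarget."""
    flagged = []
    for logical_id, resource in template.get("Resources", {}).items():
        props = resource.get("Properties", {})
        rtype = resource.get("Type")
        if rtype == "AWS::Route53::RecordSet":
            records = [props]
        elif rtype == "AWS::Route53::RecordSetGroup":
            records = props.get("RecordSets", [])
        else:
            continue
        for record in records:
            if record.get("Type") == "AAAA" and "AliasTarget" in record:
                flagged.append((logical_id, record.get("Name")))
    return flagged


# Hypothetical template fragment: one A alias and one AAAA alias to the same ELB.
TEMPLATE = json.loads("""
{
  "Resources": {
    "ApiRecord": {
      "Type": "AWS::Route53::RecordSet",
      "Properties": {
        "Name": "api.example.com.",
        "Type": "A",
        "AliasTarget": {"DNSName": "my-elb.example.com.", "HostedZoneId": "ZEXAMPLE"}
      }
    },
    "ApiV6Record": {
      "Type": "AWS::Route53::RecordSet",
      "Properties": {
        "Name": "api.example.com.",
        "Type": "AAAA",
        "AliasTarget": {"DNSName": "my-elb.example.com.", "HostedZoneId": "ZEXAMPLE"}
      }
    }
  }
}
""")

for logical_id, name in find_aaaa_alias_records(TEMPLATE):
    print("AAAA alias record to review:", logical_id, name)
```

Templates written in YAML with CloudFormation's short-form intrinsic functions would need a different loader, which this sketch does not attempt.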
BE VERY CAREFUL rolling out support for IPv6 if you run your server stack on AWS - inside or outside a VPC. Support is spotty at best, and Amazon doesn’t seem to understand how severe the issues are when it lackadaisically applies IPv6 addresses to services that have little-to-no support for them. The result can be huge headaches.