Note: this is a subgraph on the Slipbox repo, and you can check out the live version in the link to see all my notes, saved links, and database:
Introduction
re:Invent 2021 was my first time visiting the conference, despite having worked with AWS for several years. I was dumbstruck by the sheer amount of preparation, effort, care, thoughtfulness, planning, design, and empathy that the community and Amazonians put into the conference. I had a great time with old and new friends, met lots of people, and learned a ton.
Key Themes
This year, a few important themes emerged from the conference, either as commonalities across multiple product/feature announcements, as through-lines in the messaging, or as explicitly stated ideas (such as The Everywhere Cloud).
The Relentless Expansion of Serverless
Over the years the Serverless offerings at AWS have become increasingly mature. The most striking development this year was the expansion of Serverless as the compute modality for existing managed services. Before we jump into some of the new developments, I'll just leave this here:
MSK Serverless is a powerful example here. Ultimately, the Kafka API (and Kafka Connect ecosystem) is the "thing of value" that we usually want. For some organizations, running their own Kafka cluster will be mission critical, but for everyone else who can offload it to a trillion-dollar company and only pay for data throughput, that's very compelling. I've read enough gripes about the pain of running your own Kafka cluster to understand that this can be not just a point of technical leverage, but a real quality-of-life benefit.
Jonathan Ellis recently did an excellent deep-dive into Pulsar and how to reason about the Kafka ecosystem on Software Daily. In it, he pointed out that many users are interacting with Kafka through prebuilt connectors, rather than the APIs, and for users like this the underlying implementation is not relevant.
One important thing to note about this development is that the product design embodies the trend: the Serverless offering is more opinionated, with less configuration exposed to the end user. Namely, broker settings are fully locked in, and only a constrained set of topic configs is available.
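To make the "Kafka API is the thing of value" point concrete, here is a minimal sketch of client configuration, assuming the standard MSK IAM-auth settings (`SASL_SSL` / `AWS_MSK_IAM`) used by the MSK IAM client libraries; the endpoint strings are placeholders, not real clusters. The application-facing config has the same shape either way; only the bootstrap endpoint and auth settings change when you move to MSK Serverless:

```python
def producer_config(bootstrap_servers, iam_auth=False):
    """Build a Kafka client config dict for a self-managed or MSK cluster."""
    config = {"bootstrap.servers": bootstrap_servers}
    if iam_auth:
        # MSK IAM authentication settings (as used by the aws-msk-iam-auth
        # client libraries); check the MSK docs for your specific client.
        config.update({
            "security.protocol": "SASL_SSL",
            "sasl.mechanism": "AWS_MSK_IAM",
        })
    return config

# Placeholder endpoints for illustration only:
self_managed = producer_config("broker-1.internal:9092")
serverless = producer_config(
    "boot-abc123.kafka-serverless.us-east-1.amazonaws.com:9098",
    iam_auth=True,
)
```

Everything downstream (producers, consumers, connectors) is written against the same client interface, which is why the underlying cluster implementation stops mattering for many users.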
EMR Serverless: managed Spark with a Serverless bent is obviously very compelling. Running a Spark cluster is not particularly fun. Spark shares a lot in common with Kafka in that we often only want the cluster for the sake of the Spark APIs and ecosystem. Having a near-zero-ops Spark cluster is therefore "the dream".
It bears mentioning that Serverless Spark offerings are not necessarily new. Having said that, as the largest player in the market it's a different proposition for AWS to add feature parity with the other cloud vendors.
One key point here is that the underlying infrastructure is not fully abstracted, because the worker compute and storage sizes are configurable. (The service is essentially magic already, so this shouldn't be interpreted as a gripe — I'm only pointing it out as "less-Serverless" on the spectrum of things.)
Serverless Inference for SageMaker Model Endpoints: this one is big news for a certain class of SageMaker users. Prior to this offering, deploying a model meant standing up at least one inference instance with sufficient resources to serve the API requests. GPU instances are pricey.
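As a sketch of what actually changes in a deployment, here is (to my understanding of the boto3 SageMaker API at launch) the shape of a serverless endpoint configuration; the config and model names are hypothetical placeholders, and the request is only constructed here, not sent:

```python
# Hypothetical request body for sagemaker_client.create_endpoint_config(**request).
# Instead of InstanceType/InitialInstanceCount, a serverless variant declares
# memory and concurrency limits and AWS handles the capacity.
request = {
    "EndpointConfigName": "my-serverless-config",  # placeholder name
    "ProductionVariants": [
        {
            "VariantName": "AllTraffic",
            "ModelName": "my-model",  # placeholder model name
            "ServerlessConfig": {
                "MemorySizeInMB": 2048,
                "MaxConcurrency": 5,
            },
        }
    ],
}
```

The notable absence is any instance type at all: the capacity-planning decision becomes two scalar limits, which is the whole quality-of-life pitch.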
To be sure, users with deployed SageMaker models under constant load can achieve high utilization and better unit economics on those instances, but that's a headwind for adopters. Essentially, each new endpoint deployment would have a marginal cost associated with it. Now, it will be interesting to see what new usage patterns can be enabled when that marginal cost is vanishingly small.
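To illustrate the marginal-cost point, here is a back-of-the-envelope break-even calculation using entirely hypothetical prices (not actual AWS pricing):

```python
# Hypothetical prices for illustration only -- not actual AWS rates.
HOURLY_INSTANCE_COST = 0.75       # always-on inference instance, $/hour
SERVERLESS_COST_PER_REQ = 0.0002  # serverless inference, $/request

def monthly_cost_always_on(hours=730):
    """Cost of keeping one inference instance running all month."""
    return HOURLY_INSTANCE_COST * hours

def monthly_cost_serverless(requests):
    """Pure pay-per-request cost with no idle charge."""
    return SERVERLESS_COST_PER_REQ * requests

def break_even_requests(hours=730):
    """Monthly request volume at which the two options cost the same."""
    return monthly_cost_always_on(hours) / SERVERLESS_COST_PER_REQ
```

Below the break-even volume, serverless is cheaper; above it, the always-on instance wins on unit economics, which is exactly the constant-load case described above. The interesting new territory is all the low-volume endpoints for which the marginal cost is now close to zero.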
There were even more Serverless announcements:
Instances and Chips
AWS continues to develop custom silicon. Graviton3 is their latest ARM offering, and this year they've also launched a partner program to validate software that's optimized for Graviton. This, combined with Apple's M1 line of system-on-chip offerings, is a compelling reason for vendors to support ARM chips, or cross-compiled languages that can easily be built for different architectures (e.g. Golang).
The partner program is an extension of the Service Ready program, and a pretty clear signal to the market. If your customers are interested in the price performance of your software in the cloud, and you don't have a plan for supporting ARM architectures, AWS is dropping you a hint. Optimizing for Graviton specifically is an interesting topic which I would like to learn more about.
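As a small illustration of what multi-architecture support means in practice, here is a sketch (my own hypothetical helper, not an AWS or container-tooling API) that normalizes the machine strings reported by different hosts into the platform tags used for multi-arch builds; Graviton instances typically report `aarch64`, while Apple M1 macOS reports `arm64`:

```python
import platform

# Hypothetical mapping from platform.machine() values to the "os/arch"
# strings used when building multi-architecture container images.
ARCH_ALIASES = {
    "x86_64": "linux/amd64",   # typical Intel/AMD Linux hosts
    "amd64": "linux/amd64",
    "aarch64": "linux/arm64",  # Graviton instances usually report this
    "arm64": "linux/arm64",    # Apple M1 macOS reports this
}

def container_platform(machine=None):
    """Return the container platform tag for a machine string."""
    machine = (machine or platform.machine()).lower()
    if machine not in ARCH_ALIASES:
        raise ValueError(f"unsupported architecture: {machine}")
    return ARCH_ALIASES[machine]
```

Vendors shipping prebuilt binaries or images need some mapping like this so a single release pipeline can target both x86 fleets and Graviton/M1 hosts, which is the gap the partner program is nudging them to close.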
The catalog of instances continued to expand, including some machine-learning-optimized instances (more on that in a moment). Speaking of the Apple M1, they added an instance class for those as well. AMD got much less fanfare, but it did get its own instance launch with significant price-performance improvements.
AWS is an iceberg, so the aforementioned chip development is likely just the public-facing arm of a lot of internal development. (Nick McKeown and Martin Casado recently discussed chip development in hyperscaler datacenters on a podcast, worth a listen.)
Sustainability and Education
Sustainability became a pillar of the AWS Well-Architected Framework (not to be confused with their WAF service). Given Amazon's historical focus on operational excellence, they may very well be better at executing on sustainability goals for their data centers than other operators. They also commissioned a study which found that moving workloads to the cloud can reduce carbon footprint by >80%. Chip development comes into play here too, because power consumption is a factor in chip price performance.
Education was also prominently highlighted. There were multiple projects aimed at making AWS capabilities more attainable for underrepresented communities, including scholarships, training tools, a training headquarters in Seattle, and a high-school-targeted training and participation program for the Deep Racer league. (I've been on hiatus from Deep Racer this year, but I would love to find a way to get involved coaching an interested high-school team.)
The Everywhere Cloud
AWS had a variety of announcements that extend their networking capabilities, but the broader view as articulated in the keynotes was of an AWS network that can go anywhere, thus an "Everywhere Cloud".
Wide Area Network (WAN) is perhaps the biggest development in this theme. It is a higher-level abstraction for global networks, and you could be forgiven for looking at WAN and seeing a set of capabilities that are already delivered through Transit Gateway, CloudFront, PrivateLink, and Direct Connect SiteLink. But! A global network control point/plane can be a locus of all kinds of features in the future. It will be interesting to see whether, through WAN, AWS can launch options to manage at the WAN level services that are currently bound to individual VPCs and regional management. Complementary control-plane announcements like the improved Inspector and the Network Access Analyzer bolster this case.
I'm very eager to play with their new private 5G network offering. There was a little confusion initially, but it appears this service is not (yet) usable for something like municipal broadband. More predictably, it is aimed first at the hyper-instrumented, advanced industrial facility use case, and at launch it targets aligned segments that AWS is pursuing, like telecoms, energy, and manufacturing, which of course had their own tracks at re:Invent.
Also announced in this category were rack-ready form factors for Outposts. This might be the thing that gets me to invest more time in a homelab... but my languishing collection of old computers and Raspberry Pis which I don't have time to fiddle with says otherwise.
All of this would already be a very compelling “Everywhere Cloud” story, but there were also automotive, IoT, and Satellite tracks!
Machine Learning Democratized and Industrialized
I caught Bratin Saha's recap of AWS ML on TWIML and I think the framing was very good. AI/ML is undergoing two arguably orthogonal movements, to "democratize" and to "industrialize", and AWS is positioning itself to serve both.
Industrialization, Saha argued, is primarily focused on three areas: infrastructure, tooling, and MLOps. In the infrastructure space, they launched new Inferentia and Trainium instances. They also introduced an Apache TVM-based compiler that's on by default for model serving. (Arguably this is tooling, but the infrastructure is supposed to compile the model for you, so I categorize it as infrastructure after all.) In terms of tooling, they launched many quality-of-life features in SageMaker and CodeGuru. I'm most interested in the Neuron SDK and the Syne-Tune optimization library. Neuron is particularly interesting from a product standpoint because it is a vertical integration: purpose-built for running on Inferentia and Trainium with a unified interface. Inference Recommender, in the vein of "automate all the things", and Serverless Inference for SageMaker ML Models would be the MLOps highlights.
Meanwhile, the splashy ML Democratization launches were very welcome. I'm looking forward to SageMaker Studio Lab, which is on my list to test drive very soon. Canvas is also very compelling, and Saha noted that a way to eject to SageMaker is already on the roadmap. (In these no-code solutions, the inability to hand a project off seamlessly to a practitioner is a known failure mode.)
Strategic Outlook
Incremental Progress: Many analysts characterized re:Invent as more incremental, with a dual focus on enterprise/vertical offerings and "beginner-friendly" offerings. (With some very interesting counterpoints.) It's true that there were (arguably) no new major product-line launches, but there were major launches in existing product lines.
Sensory Overload: I was struck just by how BIG Amazon and AWS are.
AWS clearly understands this, but how do you make something so huge and sprawling and complex digestible, legible, or manageable? AWS shipped some new tools to help, like CDK Constructs Hub for CDK users, and re:Post, a StackOverflow sort of thing. Terraform Account Factory, and from earlier in the year Cloud Control API, should get a mention in this category too.
The Moat: AWS shipped numerous changes that they clearly hope will give them a competitive advantage over (or parity with) other offerings. The chips and serverless offerings are the most obvious manifestation of this, where they straightforwardly work to improve the unit economics of your cloud footprint by offering you more cost-effective compute. There were more targeted areas of investment as well, with Redshift getting upgrades to put pressure on Databricks, Snowflake, and BigQuery, ECR getting upgrades that happen to help customers sidestep Docker's pricing changes, and CloudFront getting a price cut that seems like a direct response to Cloudflare's R2 offering.
Government: Only briefly mentioned was AWS’ deep journey into government. Government accounts remain big business for the cloud hyperscalers and their partners. Many of the announcements may have landed dully for some attendees but rang in the ears of our government counterparts (IL5 support in GovCloud, anybody?) The post-re:Invent announcement of a new top-secret region was also overshadowed by the outages.
On a Personal Note
re:Invent this year was my first time away from home in a long time (like many attendees). It was stressful, it was overwhelming, I learned a tremendous amount, and I would love to go back. There are a lot of things I would have done differently in hindsight BUT... I got a chance to spend time with friends and colleagues who I otherwise would never see. I met a bunch of people from the community who I admire the hell out of, and I did my best not to try their patience with questions and nerding out. I missed my family a lot. None of it would have been possible were it not for the efforts of literally thousands of event planners, Amazonians, and partners, and the support of my family and colleagues. I’m really grateful to them for making it happen.
See you next time!