When Andy Jassy, then head of Amazon Web Services, announced Amazon Aurora in 2014, the pitch was bold but metered: Aurora would be a relational database built for the cloud. As such, it would provide access to cost-effective, fast, and scalable computing infrastructure.
In essence, he explained, Aurora would combine the cost effectiveness and simplicity of MySQL with the speed and availability of high-end commercial databases, the kind that firms typically managed on their own. In numbers, Aurora promised five times the throughput (e.g., the number of transactions, queries, read/write operations) of MySQL at one-tenth the price of commercial database solutions, all while offloading costly management challenges and maintaining performance and availability.
Aurora launched a year later, in 2015. Significantly, it decoupled computation from storage, a distinct contrast to traditional database architectures where the two are entwined. This fundamental innovation, along with automated backups and replication and other improvements, enabled easy scaling for both computational tasks and storage, while meeting reliability demands.
“Aurora's design preserves the core transactional consistency strengths of relational databases. It innovates at the storage layer to create a database built for the cloud that can support modern workloads without sacrificing performance,” explained Werner Vogels, Amazon’s CTO, in 2019.
“To start addressing the limitations of relational databases, we reconceptualized the stack by decomposing the system into its fundamental building blocks,” Vogels said. “We recognized that the caching and logging layers were ripe for innovation. We could move these layers into a purpose-built, scale-out, self-healing, multitenant, database-optimized storage service. When we began building the distributed storage system, Amazon Aurora was born.”
Within two years, Aurora became the fastest-growing service in AWS history. Tens of thousands of customers — including financial-services companies, gaming companies, healthcare providers, educational institutions, and startups — turned to Aurora to help carry their workloads.
In the intervening years, Aurora has continued to evolve to suit the needs of a changing digital landscape. Most recently, in 2024, Amazon announced Aurora DSQL. A major step forward, Aurora DSQL is a serverless approach designed for global scale and enhanced adaptability to variable workloads.
Today, International Data Corporation (IDC) research estimates that firms using Aurora see a three-year return on investment of 434 percent and an operational cost reduction of 42 percent compared to other database solutions.
But what lies behind those figures? How did Aurora become so valuable to its users? To understand that, it’s useful to consider what came before.
A time for reinvention
In 2015, as cloud computing was gaining popularity, legacy firms began migrating workloads away from on-premises data centers to save money on capital investments and in-house maintenance. At the same time, mobile and web app startups were calling for remote, highly reliable databases that could scale in an instant. The theme was clear: computing and storage needed to be elastic and reliable. The reality was that, at the time, most databases simply hadn’t adapted to those needs.
Amazon engineers recognized that the cloud could enable virtually unlimited, networked storage and, separately, compute.
That rigidity makes sense considering the origin of databases and the problems they were invented to solve. The 1960s saw one of their earliest uses: NASA engineers had to navigate a complex list of parts, components, and systems as they built spacecraft for moon exploration. That need inspired the creation of the Information Management System, or IMS, a hierarchically structured solution that allowed engineers to more easily locate relevant information, such as the sizes or compatibilities of various parts and components. While IMS was a boon at the time, it was also limited. Finding parts meant engineers had to write batches of specially coded queries that would then move through a tree-like data structure, a relatively slow and specialized process.
In 1970, the idea of relational databases made its public debut when E. F. Codd coined the term. Relational databases organized data according to how it was related: customers and their purchases, for instance, or students in a class. Relational databases meant faster search, since data was stored in structured tables, and queries didn’t require special coding knowledge. With programming languages like SQL, relational databases became a dominant model for storing and retrieving structured data.
By the 1990s, however, that approach began to show its limits. Firms that needed more computing capabilities typically had to buy and physically install more on-premises servers. They also needed specialists to manage new capabilities, such as the influx of transactional workloads — as, for instance, when increasing numbers of customers added more and more pet supplies to virtual shopping carts. By the time AWS arrived in 2006, these legacy databases were the most brittle, least elastic component of a company’s IT stack.
The emergence of cloud computing promised a better way forward with more flexibility and remotely managed solutions. Amazon engineers recognized that the cloud could enable virtually unlimited, networked storage and, separately, computation.
The Amazon Relational Database Service (Amazon RDS) debuted in 2009 to help customers set up, operate, and scale a MySQL database in the cloud. And while that service expanded to include Oracle, SQL Server, and PostgreSQL, as Jeff Barr noted in a 2014 blog post, those database engines “were designed to function in a constrained and somewhat simplistic hardware environment.”
AWS researchers challenged themselves to examine those constraints and “quickly realized that they had a unique opportunity to create an efficient, integrated design that encompassed the storage, network, compute, system software, and database software”.
“The central constraint in high-throughput data processing has moved from compute and storage to the network,” wrote the authors of a SIGMOD 2017 paper describing Aurora’s architecture. Aurora researchers addressed that constraint via “a novel, service-oriented architecture”, one that offered significant advantages over traditional approaches. These included “building storage as an independent fault-tolerant and self-healing service across multiple data centers … protecting databases from performance variance and transient or permanent failures at either the networking or storage tiers.”'
The serverless era is now
In the years since its debut, Amazon engineers and researchers have ensured Aurora has kept pace with customer needs. In 2018, Aurora Serverless provided an on-demand autoscaling configuration that allowed customers to adjust computational capacity up and down based on their needs. Later versions further optimized that process by automatically scaling based on customer needs. That approach relieves the customer of the need to explicitly manage database capacity; customers need to specify only minimum and maximum levels.
Achieving that sort of “resource elasticity at high levels of efficiency” meant Aurora Serverless had to address several challenges, wrote the authors of a VLDB 2024 paper. “These included policy issues such as how to define ‘heat’ (i.e., resource usage features on which to base decision making)” and how to determine whether remedial action may be required. Aurora Serverless meets those challenges, the authors noted, by adapting and modifying “well-established ideas related to resource oversubscription; reactive control informed by recent measurements; distributed and hierarchical decision making; and innovations in the DB engine, OS, and hypervisor for efficiency.”
As of May 2025, all of Aurora’s offerings are now serverless. Customers no longer need to choose a specific server type or size or worry about the underlying hardware or operating system, patching, or backups; all that is completely managed by AWS. “One of the things that we’ve tried to design from the beginning is a database where you don’t have to worry about the internals,” Marc Brooker, AWS vice president and Distinguished Engineer, said at AWS re:Invent in 2024.
These are exactly the capabilities that Arizona State University needs, says John Rome, deputy chief information officer at ASU. Each fall, the university’s data needs explode when classes for its more than 73,000 students are in session across multiple campuses. Aurora lets ASU pay for the computation and storage it uses and helps it to adapt on the fly.
We see Amazon Aurora Serverless as a next step in our cloud maturity.
“We see Amazon Aurora Serverless as a next step in our cloud maturity,” Rome says, “to help us improve development agility while reducing costs on infrequently used systems, to further optimize our overall infrastructure operations.”
And what might the next step in maturity look like for the now 10-year-old Aurora service? The authors of that 2024 paper outlined several potential paths. Those include “introducing predictive techniques for live migration”; “exploiting statistical multiplexing opportunities stemming from complementary resource needs”, and “using sophisticated ML/RL-based techniques for workload prediction and decision making.”
