{"id":340910,"date":"2021-05-27T07:23:23","date_gmt":"2021-05-27T14:23:23","guid":{"rendered":"https:\/\/css-tricks.com\/?p=340910"},"modified":"2021-05-27T07:23:25","modified_gmt":"2021-05-27T14:23:25","slug":"rethinking-postgres-in-a-post-server-world","status":"publish","type":"post","link":"https:\/\/css-tricks.com\/rethinking-postgres-in-a-post-server-world\/","title":{"rendered":"Rethinking Postgres in a Post-Server World"},"content":{"rendered":"\n
Serverless architectures have brought engineering teams a great number of benefits. We get simpler deployments, automatic and infinite scale, better concurrency, and a stateless API surface. It\u2019s hard to imagine going back to the world of managed services, broken local environments, and SSHing into servers. When I started doing web development, moving from servers in a closet to rackspace was a revolution.<\/p>\n\n\n\n
It\u2019s not just hosting and how we deploy applications that have changed under this new paradigm. These advantages of serverless have presented challenges to the traditional MVC architecture that has been so ubiquitous. Thinking back to those early days, frameworks like Zend, Laravel, Django, and Rails were incredible productivity boosters. They not only influenced the way we built applications, but also the way we think about solving problems with the web. These were your \u201cmajestic monoliths\u201d and they solidified the MVC pattern as the defacto standard for most of the web applications we use today.<\/p>\n\n\n\n
In many ways, the rise of microservices and with it the idea of hexagonal architectures (aka ports and adaptors) led naturally to this new serverless world. It started as creating and hosting standalone APIs organized by a shared context that were still backed by the classic frameworks we already knew and loved.<\/p>\n\n\n\n\n\n\n\n
The popularity of NodeJS led to the express framework where we now had a less rigid microframework enabling us to more flexibly organize our code. The final metamorphosis of this pattern is individual pieces of business logic that can be executed on demand in the cloud. We no longer manage machines or even multiple API servers. Each piece of business logic only exists when it is needed, for as long as it\u2019s needed and no longer. They are lightweight, and you only pay for the individual parts of your application that get used.<\/p>\n\n\n\n
Today, we hardly realize there is a server at all–even the terminology, serverless, is a misnomer designed to highlight just how far we\u2019ve come from the days of setting up XAMP or VirtualBox and Vagrant. The benefits are clear, the hours saved, the headaches avoided, and the freedom to just solve business problems with code bring building software closer than ever to the simple act of writing prose.<\/p>\n\n\n
The classic MVC frameworks codified not only the pattern of working in three distinct tiers (data, application, and presentation) but also the technology for each. You were able to choose some options at each layer, for instance Postgres or MySQL as the data layer, but the general idea is these decisions are made for you. You implicitly adopt the idea of convention over configuration.<\/p>\n\n\n\n
Postgres as a data layer solution makes a lot of sense. It\u2019s robust, fast, supports ACID transactions, and has over thirty years of development behind it. It is also open-source, can be hosted almost anywhere, and is likely to be around for another thirty years. You could do much worse than stake your company’s future on Postgres. Add to that all the work put into integrating it into these equally battle-tested frameworks and the story for choosing Postgres becomes very strong.<\/p>\n\n\n\n
However, when we enter a serverless context, this type of architecture presents a number of challenges particularly when it comes to handling our data.<\/p>\n\n\n\n
Common issues include:<\/p>\n\n\n\n
Traditional systems like Postgres were never designed for this purpose. To start, Postgres operates on the assumption of a stateful connection. What this means is that Postgres will hold open a connection with a server in order to optimize the response time. In a traditional monolithic application, if your server had to open a new connection every single time it requested data this would be quite inefficient. The actual network request would in many times be the primary bottleneck. By keeping this connection cached Postgres removes this bottleneck. As you scale your application you will likely have multiple machines running, and a single Postgres database can handle many such connections, but this number isn\u2019t infinite. In fact, in many cases, you have to set this number at the time of provisioning the database.<\/p>\n\n\n\n
In a serverless context, each request is effectively a new machine and a new connection to the database. As Postgres attempts to hold open these connections we can quickly run up against our connection limit and the memory limits of the machine. This also introduces another issue with the traditional Postgres use case, which is provisioned resources. <\/p>\n\n\n\n
With Postgres we have to decide the size of the database, the capacity of the machine it runs on, where that machine is located, and the connection limit at the time of creation. This puts us in a situation where our application can scale automatically but we must watch our database closely and scale it ourselves. This can be even trickier when we are dealing with spikes in traffic that are not consistent in both time and location. Ultimately by moving to serverless we have reduced the operational overhead of our application layer, but created some increased operational overhead in our database. Would it not be better if both our application and our data layer could scale together without us having to manage it?<\/p>\n\n\n\n
The complexity required to make traditional systems like Postgres work in a serverless environment can often be enough to abandon the architecture all together. Serverless requires on-demand, stateless execution of business logic. This allows us to create lighter, more scalable programs but does not allow us to preserve things like network connections and can be slowed down by additional dependencies like ORMs and middleware. <\/p>\n\n\n
It\u2019s time we begin thinking about a new type of database, one more in line with the spirit of serverless and one that embraces iterative development and more unified tooling. We want this database to have the same automatic, on-demand scale as the rest of our application as well as handle global distribution that are hallmarks of the serverless promise. This ideal solution should be:<\/p>\n\n\n\n
If we are truly to embrace the serverless architecture, we need to ensure that our database scales along with the rest of the application. In this case, we have a variety of solutions some of which involve sticking with Postgres. Amazon Aurora is one example of a Postgres cloud solution that gives us automatic scalability and backups, and gives us some global distribution of data. However, Amazon Aurora is hardly easy to set up and doesn\u2019t free us from all operational overhead. We also are not able to securely access our data from the client without building an accompanying API as it still follows the traditional Postgres security model.<\/p>\n\n\n\n
Another option here are services like Hasura, that allow us to leverage Postgres but access our data by way of a GraphQL API. This solves our security issues when accessing data from the client and gets us much closer to the ease of use we have with many of our other serverless services. However, we are left to manage our database ourselves and this merely adds another layer on top of our database to manage the security. While the Hasura application layer is distributed, our database is not so we don\u2019t get true global distribution with this system.<\/p>\n\n\n\n
I think at this point we should turn toward some additional solutions that really hit all the points above. When looking at solutions outside of Postgres we have to add two additional requirements that put the solutions on par with the power of Postgres:<\/p>\n\n\n\n
When we typically step outside of relational database systems into the world of schemaless solutions, ACID transactions and relational, normalized data are often things we sacrifice. So we want to make sure that when we optimize for serverless we are not losing the features that have made Postgres such a strong contender for so long.<\/p>\n\n\n\n
Azure\u2019s CosmosDB supports a variety of databases (both SQL and NoSQL) under the hood. CosmosDB also provides us with libraries that can work on both the client and server freeing us from an additional dependency like Hasura. We get some global distribution as well as automatic scale. However, we are still left with a lot of choices to make and are not free entirely from database management. We still have to manage our database size effectively and choose from many database options that all have their pros and cons.<\/p>\n\n\n\n
What we really want is a fully managed solution where the operational overhead of choosing database size and the type of database can be abstracted away. In a general sense having to research many types of databases and estimate scale would be things that matter a lot less if we have all of the features we need. Fauna<\/a> is a solution where we don\u2019t have to worry about the size of the database nor do we have to select the type of database under the hood. We get the support of ACID transactions, global distribution, and no data loss without having to figure out the best underlying technology to achieve that. We also can freely access our database on the client or the server with full support for serverless functions. This allows us to flexibly create different types of applications in a variety of architectures such as JAMstack clients, serverless APIs, traditional long-running backends, or combinations of these styles.<\/p>\n\n\n\n When it comes to schemaless databases, we gain flexibility but are forced to think differently about our data model to most efficiently query our data. When moving away from Postgres this is often a subtle but large point of friction. With Fauna<\/a>, we have to move into a schemaless design as you cannot opt into another database type. However, Fauna makes use of a unique document-relational database model. This allows us to utilize relational database knowledge and principles when modeling our data into collections and indexes. This is why I think it\u2019s worth considering for people used to Postgres as the mental overhead is not the same as with other NoSql options.<\/p>\n\n\n Systems like Postgres have been powerful allies for building applications for over thirty years. The rise of agile and iterative development cycles led us to the serverless revolution before us today. We are able to build increasingly more complex applications with less operational overhead than ever before. That power requires us to think about databases differently and demand a similar level of flexibility and ease of management. We want to preserve the best qualities of a Postgres, like ACID transactions, but ditch the more unsavory aspects of working with the database like connection pooling, provisioning resources, security, distribution and managing scale, availability and reliability.<\/p>\n\n\n\n Solutions such as Amazon\u2019s Aurora Serverless v2 create a serverless solution that works in this new world. There are also solutions like Hasura that sit on top of this to further fulfill the promise of serverless. We also have solutions like Cosmos DB and Fauna<\/a> that are not based in Postgres but are built for serverless while supporting important Postgres functionality.<\/p>\n\n\n\n While Cosmos DB gives us a lot of flexibility in terms of what database we use, it still leaves us with many decisions and is not completely free of operational overhead. Fauna<\/a> has made it so you don\u2019t have to compromise on ACID transactions, relational modeling or normalized data\u2009\u2014\u2009while still alleviating all the operational overhead of database management. Fauna is a complete rethinking of a database that is truly free of operational overhead. By combining the best of the past with the needs of the future Fauna has built a solution that behaves more like a data API and feels natural in a serverless future.<\/p>\n\n\n\nConclusion<\/h2>\n\n\n
\n\n\n\n