Serverless prototyping - A case study

urmade

Tobias Urban

Posted on October 26, 2019


Serverless computing is an exciting way of hosting your apps. It offers you nearly unmatched scalability while charging you only for what you actually use. In this article I will go through a case study of what an architecture for a serverless web app prototype could look like (on Microsoft Azure), some best practices, and what it would cost you in the end.

What will be built?

So you decided to use serverless computing, but what now? The good news first: you're not really locked into any particular programming language. The Azure Functions runtime offers a set of preferred languages, but thanks to its open-source character you could modify it to work with any language of your choice.

For the web app itself, you will most likely have a front-end and a back-end. On the front-end side, the classic HTML / CSS / JS stack gives you the most flexibility in terms of where your app can run (mobile, desktop, web, …). Plus, if you're fairly new to programming and don't have deep knowledge of another specific language, those three are very easy to get into. For the back-end, you can leverage the rich ecosystems of Node.js, .NET, Java or Python to get started quickly.

How to design your functions?

Now to the interesting part: What could an architecture for our app look like? How do we define a function in our system? Is the operation a+b worth its own function, or should we split along logical entities (e.g. search all entries of a database table, bring them into the right format and then return them)? As always, the truth lies somewhere in the middle.
From a hosting perspective, two metrics determine how expensive a function will be for us: execution time and memory consumption. So when splitting functions, our goal should always be to minimize those two factors overall while keeping network traffic at a fair level, without introducing excessive communication latency into the system.
And note the "minimize overall": every communication between two functions adds a sending (packaging) and a receiving (parsing) step, as well as security checks, all of which take time (negligible at first, but a real factor if overdone).

We can at least optimize the networking and deployment bit by identifying functions that are highly dependent on each other. To understand why this is important, we have to peek "under the hood" of Azure Functions:
Whenever we deploy our Function App code to Azure, that code is stored somewhere with a reference to an always-available URL that marks our Function App's API endpoint. When someone hits that API, Azure starts looking for an available VM where it can deploy your app. Once found, it takes your code, puts it onto that VM and starts your main process. Once this is done, all functions in the app are available.
When traffic gets too high and the compute power of the current VM isn't enough to run your app, the code gets deployed to a second VM and requests get distributed by a load balancer.
That means for us: all functions in a single Function App live and scale together. If you have one function that is called once a month and another that is called 50 times a second, both of them will have to be deployed and initialized together in each scaling step.
So to optimize scalability we can split our functions into "high-demand" functions and "just-sometimes" functions. Again, the additional time it takes Azure Functions to mount a few extra functions onto a VM won't make a difference, but with dozens of functions it may. On the other hand, functions that are often called together should be kept in the same deployment; this way you save a lot of networking overhead.

Services besides your functions

If you're developing a web app, you will most likely have a front-end that shows HTML and loads a lot of scripts and static files. As we've already learned, Azure Functions always takes your whole codebase and shifts it onto an available VM. What happens if a gigabyte of images has to be moved every time? Exactly: the startup time of your Functions App gets tremendously slower.
So it is advisable to outsource as many bits and bytes as possible, and this is where Azure Storage kicks in. With Azure (Blob) Storage you can keep all your static files outside of your Functions App, where they are always available and don't block the scaling of your web application, while also being optimized for high-frequency delivery (think CDNs and the like; you have those options with Blob Storage, but not if all your files are baked into your app). Plus, the files are easily interchangeable, and your app doesn't have to be redeployed every time you alter a picture.
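
As an illustration, here is a minimal sketch of pushing an asset into Blob Storage with the @azure/storage-blob SDK. The container name and the connection-string variable are assumptions on my part:

```typescript
import { BlobServiceClient } from "@azure/storage-blob";

async function uploadStaticAsset(name: string, content: Buffer): Promise<string> {
  const service = BlobServiceClient.fromConnectionString(
    process.env.AZURE_STORAGE_CONNECTION_STRING! // assumed to be configured
  );
  const container = service.getContainerClient("static-assets"); // hypothetical container
  await container.createIfNotExists({ access: "blob" }); // blobs are publicly readable
  const blob = container.getBlockBlobClient(name);
  await blob.uploadData(content); // overwrites an existing blob of the same name
  return blob.url; // reference this URL in your HTML instead of bundling the file
}
```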

And we have a second storage issue: how do we persist dynamic data (like user input)? We need some kind of database to store all of that, and databases need VMs, and VMs cost a lot of money per month, and we don't have money. So is going bankrupt inevitable after all? Again, Azure Storage to the rescue. Azure Table Storage provides a very cheap "pseudo-SQL" database where you can store all kinds of data. I call it pseudo-SQL because it takes objects of key-value pairs as input and flattens them all into one big table, where every distinct key that exists in at least one object gets a column. It doesn't offer a rich query language either, so you are mostly limited to storing and retrieving data. As long as you create collections (the NoSQL equivalent of tables) with some sort of meaning, you're good to go (for a start).
And the best thing: the APIs for Table Storage and Cosmos DB (Azure's high-performance database) are exactly the same, so if advanced queries, analytics and single-digit-millisecond response times ever become a thing at your organization, having all that is only a parameter away.
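
As a sketch of what storing and retrieving data can look like, here is a to-do-list example with the modern @azure/data-tables SDK (which postdates this article); the table name and entity shape are illustrative:

```typescript
import { TableClient, odata } from "@azure/data-tables";

interface TodoEntity {
  partitionKey: string; // e.g. the user id, so all items of one user live together
  rowKey: string;       // unique id of the item within that partition
  title: string;
  done: boolean;
}

const todos = TableClient.fromConnectionString(
  process.env.AZURE_STORAGE_CONNECTION_STRING!, // assumed to be configured
  "todos" // hypothetical table name
);

export async function addTodo(userId: string, id: string, title: string): Promise<void> {
  await todos.createEntity<TodoEntity>({ partitionKey: userId, rowKey: id, title, done: false });
}

export async function listTodos(userId: string): Promise<TodoEntity[]> {
  // No rich query language: simple filters like this are about all you get.
  const result: TodoEntity[] = [];
  for await (const item of todos.listEntities<TodoEntity>({
    queryOptions: { filter: odata`PartitionKey eq ${userId}` },
  })) {
    result.push(item);
  }
  return result;
}
```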

Are we done with theory now? Yes. You can finally start developing your own Azure Functions driven web app. I would recommend the following structure for your project: you will need one function for every page you want to display to the customer (e.g. yourdomain/ and yourdomain/about), where you render the website and send the final file out to the user (which is also a great way to monitor traffic and latency for each page), and of course one function for every API endpoint that you provide (e.g. GET yourdomain/api/user, POST yourdomain/api/user, …). If you only have simple CRUD operations on your database, I would leave it at this setup.
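
To make that layout tangible, here is a minimal sketch using the newer v4 Node.js programming model for Azure Functions (which postdates this article; back then bindings lived in function.json files). All names, routes and payloads are illustrative:

```typescript
import { app, HttpRequest, HttpResponseInit } from "@azure/functions";

// One function per page: render the HTML and send the final file to the user.
// (To serve pages at the domain root instead of under /api, set
// extensions.http.routePrefix to "" in host.json.)
app.http("aboutPage", {
  methods: ["GET"],
  route: "about",
  authLevel: "anonymous",
  handler: async (): Promise<HttpResponseInit> => ({
    headers: { "Content-Type": "text/html" },
    body: "<html><body><h1>About us</h1></body></html>",
  }),
});

// One function per API endpoint, e.g. GET yourdomain/api/user/{id}.
app.http("getUser", {
  methods: ["GET"],
  route: "user/{id}",
  authLevel: "function",
  handler: async (req: HttpRequest): Promise<HttpResponseInit> => ({
    jsonBody: { id: req.params.id, name: "Jane Doe" }, // hypothetical payload
  }),
});
```
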
If you however want to run more complex calculations that involve parallel computing (e.g. "for every item in my array, do something wildly complex"), you could split up your functions API further. And if you have long-running orchestration functions that have to wait for other functions, familiarize yourself with durable functions.
Whenever these orchestrators trigger other functions, they go to sleep and wake up once all "sub-"functions are done executing. This way, you only pay for the functions that are actively doing work.
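
A minimal fan-out/fan-in sketch with the durable-functions npm package (the v3-style API, which also postdates this article) could look like the following; processItem is a hypothetical activity:

```typescript
import * as df from "durable-functions";

// The orchestrator is a generator: at every yield it goes to sleep and is
// unloaded, so you only pay while the activity functions are actually running.
df.app.orchestration("processAll", function* (context) {
  const items = context.df.getInput() as string[];
  // Fan out: start one activity per item, all running in parallel...
  const tasks = items.map((item) => context.df.callActivity("processItem", item));
  // ...and fan in: wake up again once every "sub-"function has finished.
  const results: string[] = yield context.df.Task.all(tasks);
  return results;
});

df.app.activity("processItem", {
  handler: async (item: string) => `processed ${item}`, // something wildly complex
});
```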

Architecture done, but why does my app get horribly slow sometimes?

Remember the whole "spin up a new VM and deploy my code to it" process? This is called the startup of your Functions App, and it's one of the main principles of serverless computing (only run your code when it is really needed).
Your app transitions from a "cold" state (meaning it isn't deployed) to a "hot" state (meaning it is deployed and ready to serve requests immediately). The Azure developers are constantly optimizing this process to go faster, but at the time of writing it can still take up to ten seconds until your app responds. After receiving a call and bringing your app into a hot state, Azure Functions keeps it hot for five minutes (again, at the time of writing). Whenever it receives a new call, the clock starts ticking again for five minutes. After that, it clears your code from the VM and sends your app back into the cold state.
So theoretically you would need at least one user every five minutes to keep your App alive, which will probably not always be the case.

But there is an (unofficial) trick to simulate this effect: there are so-called timer-triggered functions which execute on a fixed schedule. If you add one of these to your project and set it to execute every 4 minutes and 30 seconds (it doesn't matter what the function does, just let it log a character or something like that), you have an activity that keeps your app constantly in a hot state.
And as Functions is optimized to be cheap for a huge number of small compute activities, those roughly 9,000 calls per month that you "waste" actually cost you absolutely nothing (the first million calls in a Function App per month are free). So this way you will always keep one VM occupied with your code and have your system up and running at all times.
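
Such a keep-alive function could look like this (again a sketch in the v4 Node.js model; NCRONTAB cannot express "every 4 minutes 30 seconds" in a single expression, so this version fires every 4 minutes, which achieves the same effect):

```typescript
import { app } from "@azure/functions";

app.timer("keepAlive", {
  schedule: "0 */4 * * * *", // NCRONTAB: second, minute, hour, day, month, day-of-week
  handler: async (_timer, context) => {
    context.log("ping"); // the body is irrelevant; the invocation itself keeps the app hot
  },
});
```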

Do I have to ping an .azuresomething URL now?

As you have surely noticed, Functions provides you with a standard URL from which you can call your service. This URL usually consists of the name of your Functions App followed by .azurewebsites.net, and it is quite impractical if actual customers are supposed to engage with it (but at least it's HTTPS-enabled by default, so you've got that going for you).
If you want to release your application, you will most likely want to bring in a custom domain from which your app is reachable. And with Azure Functions you can do that in just a few clicks: Function Apps are just basic Web Apps in Azure, which can be configured in exactly the same way. To access the Web App settings, just click on "Platform features" on the start page of your Functions App and you're good to go. See this documentation on how to add your custom domain.

And it gets even better. Usually when you host something on your own domain, you don't have an SSL certificate to enable HTTPS communication, which means that your users will most likely see a warning stating that your site is insecure and should not be trusted. With Azure Web Apps you have the possibility to configure a Let's Encrypt plugin that automatically creates and renews SSL certificates for your custom domain. To do this, just follow this article.

Great, but let's get down to business. How much will all of that cost?

(Before we start: all pricing details reference the West Europe datacenter, are given in euros and were taken in March 2019.) Remember that we have three parts in our application: a Functions App for the logic, and an Azure Storage account with a Blob Storage and a Table Storage to host our website files and our data. So let's break down the individual components:

Calculating cost for the Functions App

The Functions App is designed to support massively concurrent workloads. It also offers a very generous free grant of 1,000,000 API hits and 400,000 GB-seconds per month.

Wait, what are GB-seconds? Besides tracking how often your APIs are hit, Azure Functions also measures the RAM consumption of your application. That means: for every second that your app is running, its RAM usage is metered. If your app needs 1 GB of RAM or less in a second, at most one GB-s will be charged; if it uses e.g. 1.3 GB of RAM in that second, the consumption is rounded up accordingly (keep that in mind when implementing compute-intensive functions).
But due to the free grant, a lightweight function at the 128 MB metering minimum could run non-stop through the whole month (roughly 324,000 GB-s) without you paying a dime. And remember, Functions only charges for the time your function is actually running, meaning idle time in the hot state is completely free of charge.
Some technical details about this metric: measurement is done in 128 MB increments, and one function in the standard plan can consume a maximum of 1,536 MB of RAM.
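
To make the metering concrete, here is a small worked example (my own sketch, based on the 128 MB increments just mentioned):

```typescript
// Bill for one function invocation in GB-seconds, assuming the metering
// rules described above: RAM rounded up in 128 MB increments, per second.
function gbSeconds(memoryMb: number, durationSeconds: number): number {
  const billedMb = Math.ceil(memoryMb / 128) * 128; // round up to the next 128 MB step
  return (billedMb / 1024) * durationSeconds;
}

console.log(gbSeconds(200, 3));   // 0.75  (200 MB rounds up to 256 MB = 0.25 GB, for 3 s)
console.log(gbSeconds(1024, 60)); // 60    (a full 1 GB workload for a minute)
```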

Although compute cost is the one that kicks in earlier, we still have a second metric: API hits. So let's look at that one.
As you might remember, we are using a timer-triggered function every 4 minutes and 30 seconds to keep our app alive. That adds up to roughly 9,000 calls a month, leaving 991,000 free calls per month. If we assume that a typical user checks into your app twice a day, every day of the week, and uses 50 function calls per session (page loads and interactions with the API like creating and deleting elements), each user consumes about 2,800 calls per month (50 × 2 × 28), so you could support nearly 354 active users within the free contingent.

But what happens when you exceed your free budget? Is this where the price trap finally snaps shut? Not really.

Let's calculate with 500 active users for your app. Using the scenario described above, these users would trigger 1,400,000 function calls each month, so we would pay 0.17€ for an additional million function executions. We would also have to pay for 1,000,000 additional GB-s, which costs us 14€. For 500 active users in our system, this leaves us with 14.17€ of hosting cost, which is not even 3 cents per user!

As we scale up, the cost per user stays roughly the same: supporting 1,000 users in our scenario would cost 34.11€, which is 3.4 cents per user; 10,000 users would cost us 390.97€, roughly 4 cents per user; and 100,000 users would be 3,961.55€, which again is roughly 4 cents per user.
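
For transparency, here is a small sketch that reproduces the cost curve above. The prices are inferred from the article's own figures (roughly 0.169€ per million executions beyond the free grant and 0.000014€ per GB-s), together with the simplifying assumption that an average call consumes about one GB-s:

```typescript
// Back-of-the-envelope cost model for the scenario above (illustrative only).
const FREE_EXECUTIONS = 1_000_000; // free executions per month
const FREE_GB_SECONDS = 400_000;   // free GB-s per month
const PRICE_PER_MILLION_EXECUTIONS = 0.169; // €, inferred from the figures above
const PRICE_PER_GB_SECOND = 0.000014;       // €, inferred from the figures above

function monthlyCost(users: number, callsPerUserPerMonth = 2_800): number {
  const calls = users * callsPerUserPerMonth;
  const executionCost =
    (Math.max(0, calls - FREE_EXECUTIONS) / 1_000_000) * PRICE_PER_MILLION_EXECUTIONS;
  // Simplifying assumption: every call consumes roughly one GB-second.
  const computeCost = Math.max(0, calls - FREE_GB_SECONDS) * PRICE_PER_GB_SECOND;
  return executionCost + computeCost;
}

console.log(monthlyCost(1_000).toFixed(2));   // ~33.90€
console.log(monthlyCost(10_000).toFixed(2));  // ~390.96€
console.log(monthlyCost(100_000).toFixed(2)); // ~3961.55€
```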

Calculating storage cost

But don't forget: we are not done yet. There is still the storage. Remember, we had two parts in there: a Blob Storage for static files and a Table Storage for persistent data. Both storages have the same pricing model: you pay once for the amount of data you've stored (per GB) and again for the number of CRUD transactions you perform on that data (per 10,000 operations).
For Blob Storage, you pay 0.0166€ per GB per month (in the "hot", i.e. fast-access, tier), as well as 0.004€ per 10,000 read operations and 0.0456€ per 10,000 write operations.
Let's say we have 4 pages with 15 static assets loaded per page, our whole front-end asset base has a size of 100 MB, and we update it four times a month. An average user visits all 4 pages per session and has the option to upload a profile picture, which he of course does. Assume that an unoptimized 4K profile picture weighs 3.5 MB; it is loaded on every page change and replaced once a month.
With these assumptions and the fact that we support 500 users browsing our site twice a day, we get: 0.30€ for hosting roughly 2 GB of assets, 0.07€ for loading the pages and 0.02€ for re-deploying our front-end and storing the user profile pictures, adding up to 0.39€ for file storage.

Azure Table Storage has the same metrics, yet different prices: per GB of data you pay 0.0591€ (in locally redundant storage, which should be enough for our case), as well as 0.000304€ per 10,000 transactions, regardless of their type. Let's say we host a user database where every entry holds 0.5 KB of data (that's roughly 10 columns of 15-character strings) and where an entry changes once a month. Furthermore, we log everything our users do, at 0.05 KB per log entry. We also have four tables in which a user on average stores a hundred elements of 1 KB each, and which a user reads in full once per session (imagine a to-do list or something like that). Adding that up for our 500 users: we'll pay 0.18€ for 3 GB of data storage as well as 1.97€ for over 64.3 million transactions per month, adding up to 2.15€ in database costs.

Let's put that all together: we've built a killer app and managed to convert 500 users to use it on a daily basis. Per month we're paying 14.17€ for hosting our logic, 2.15€ for hosting and maintaining our data and 0.39€ for storing our static files. This adds up to monthly costs of 16.71€ for IT infrastructure, or 0.034€ per user in our system. So for roughly the price of a Netflix family subscription you can bring value to 500 users through your Functions App. And the best thing: as the app is built for scalability, usage growth and costs scale linearly, making it really easy to calculate your own pricing.

That is awesome!

I know, I was thinking just the same when deciding to write this article. But please keep in mind: this approach is meant for when you are starting out with your product. You will always have limitations that sooner or later become a bottleneck in your IT organization, and if you don't shift your hosting approach soon enough, you'll end up building up a lot of technical debt. Here are the most striking limitations:

  • You have a RAM limit of 1.5 GB per function in the standard plan, meaning RAM-intensive processes like rendering, complex image processing or AI are hard to implement and in any case very slow in execution.
  • Your Functions App is very good at scaling, which makes you vulnerable to DDoS attacks: instead of simply going down, your app scales out and runs up your bill. You can mitigate that risk by putting your app behind an (Azure) firewall (and Azure keeps an eye out for you by default as well).
  • Your attack surface increases significantly, as you have to protect dozens or hundreds of small functions. You have to validate input, output, access rights and more in every function you write.
  • Though Azure Functions scale really well, they don't scale infinitely. Right now, a maximum of 10 VMs can be occupied by your Functions App in the standard plan, and these VMs only offer a fixed amount of RAM, meaning there is a hard cap on how much workload a Functions App can handle.

As already said, these are all risks that can be partially mitigated or worked around, but only at the cost of bending your code base into something it isn't meant to be.
Despite all the downsides and grumpiness at the end of this article, I hope it gives you a good introduction to quickly setting up a scalable, reliable and cheap service that you can use to kick-start your product.
