Function as a service: behind the scenes
Amir Keshavarz
Posted on March 28, 2021
It's been months since I first decided to write this post, but I honestly didn't know how to structure all the material. With that in mind, I started writing anyway, so let's see how it goes.
This article is going to be a little unfair because I won't cover all topics equally; some things simply interest me more than others.
Introduction to Function as a service
As the name suggests, in this model we deploy units of "function" or "code". These platforms are designed to make developers stop worrying about the underlying infrastructure needed to run their code.
In this model, the FaaS platform is language-aware, so the only thing developers need to do is expose a function to the FaaS runtime, and the rest will be taken care of by the platform. So no servers, only an entry point function!
This doesn't mean your code is magically deployed without any kind of server. The servers are still there, but they're provided and managed by the FaaS platform so you don't have to worry about them.
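To make that concrete, here's roughly what the programming model looks like from the developer's side. This is only a sketch: the handler shape is modeled on AWS Lambda's Node.js runtime, and the event and response types are simplified placeholders.

```ts
// A FaaS deployment is usually just an exported entry-point function.
// The shape below follows AWS Lambda's Node.js handlers; the event and
// response types are simplified for illustration.
interface ApiEvent {
  path: string;
  body?: string;
}

interface ApiResponse {
  statusCode: number;
  body: string;
}

// The platform calls this function once per event. There is no server,
// no listener, and no process lifecycle to manage on the user's side.
export async function handler(event: ApiEvent): Promise<ApiResponse> {
  const name = event.body ? JSON.parse(event.body).name : "world";
  return {
    statusCode: 200,
    body: JSON.stringify({ message: `Hello, ${name}!` }),
  };
}
```

That file is essentially the whole deployment unit; everything else is the platform's problem.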
By using a FaaS platform, you can achieve serverless, the buzzword of the year! In a true serverless platform, concepts like servers, containers, scaling, and reserved resources simply don't exist in the user's world. In most platforms, billing is also done on a per-request basis, which is awesome :)
These things sound amazing, and I do believe they are. But most legacy systems are not designed to be deployed using this model, and lots of them for good reasons. It's worth noting that most FaaS platforms assume your software is stateless, so their use cases might be limited for now.
Anyway, we're not here to discuss the pros and cons of FaaS. So let's dive in and find out how a FaaS platform is made.
FaaS is used for a variety of use cases including:
- Data Processing
- IoT Backend Services
- Cron Jobs
- All kinds of web access control
- Middlewares
- Caching
- Serving static sites
Categorization
Depending on the context, we can define multiple categories for FaaS.
Locality
- Central and local deployments
- Edge Computing
Central and local deployments are what most of us picture when we first think of FaaS platforms: a centralized system in a single location that runs our code.
Edge computing truly deserves its own blog post, but to keep it short: edge computing is when our code is deployed to multiple regions, and when a user calls our code, we serve them from the nearest location. This model is usually well integrated into CDNs.
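To give a feel for the edge model, here's a sketch of an edge function written in the style of Cloudflare Workers' module syntax (Workers come up again later in this post). The details are simplified, and the `cf.colo` field is just one example of the location metadata such platforms may expose.

```ts
// A Cloudflare Workers-style edge function (module syntax, simplified).
// The same code is deployed to every edge location; the platform routes
// each request to the data center closest to the caller.
export default {
  async fetch(request: Request): Promise<Response> {
    // On Workers, request.cf carries metadata about the serving location.
    const colo = (request as any).cf?.colo ?? "unknown";
    return new Response(`Served from edge location: ${colo}`, {
      headers: { "content-type": "text/plain" },
    });
  },
};
```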
Infrastructure
- VM-based
- Container-based
- Sandbox-based
No matter what kind of infrastructure is used, one of the most important goals is isolation. This can be achieved in different ways, which we'll discuss in the next section.
Infrastructures
VM-based Platforms
VMs provide a great deal of isolation between instances without any noticeable performance penalty, but they have one huge flaw: the overhead of a full OS!
FaaS platforms are event-based systems, so they need an extremely fast orchestrator. It's quite a challenge to design a VM-based orchestrator that's fast!
VM-based orchestrators are usually sluggish and don't perform well under stress, yet you just might be able to design one suitable for FaaS platforms if you play your cards well :)
But the appealing level of isolation and performance in VMs is not something we can simply forget about. Here's where microVMs enter the game!
A microVM is a lightweight, stripped-down version of a VM. The goal is to reduce the overhead associated with a full VM.
Here are some links to get you started:
Announcing the Firecracker Open Source Technology: Secure and Fast microVM for Serverless Computing
microVM: Another Level of Abstraction for Serverless Computing
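To show how small the moving parts are, here's a rough sketch of booting a Firecracker microVM by talking to its API socket from Node.js. The endpoints follow Firecracker's documented REST API as I understand it, but the kernel and rootfs paths are placeholders and error handling is minimal, so treat it as an illustration rather than a recipe.

```ts
// Sketch: booting a Firecracker microVM through its API socket.
// Assumes Firecracker is already running, e.g.:
//   firecracker --api-sock /tmp/firecracker.socket
import * as http from "node:http";

const SOCKET = "/tmp/firecracker.socket";

// Firecracker's API is plain REST over a unix socket.
function put(path: string, body: unknown): Promise<void> {
  return new Promise((resolve, reject) => {
    const req = http.request(
      {
        socketPath: SOCKET,
        path,
        method: "PUT",
        headers: { "content-type": "application/json" },
      },
      (res) => {
        res.resume(); // drain the response body
        if (res.statusCode && res.statusCode < 300) {
          resolve();
        } else {
          reject(new Error(`${path} failed: ${res.statusCode}`));
        }
      }
    );
    req.on("error", reject);
    req.end(JSON.stringify(body));
  });
}

async function bootMicroVM(): Promise<void> {
  await put("/machine-config", { vcpu_count: 1, mem_size_mib: 128 });
  await put("/boot-source", {
    kernel_image_path: "./vmlinux", // placeholder kernel image
    boot_args: "console=ttyS0 reboot=k panic=1",
  });
  await put("/drives/rootfs", {
    drive_id: "rootfs",
    path_on_host: "./rootfs.ext4", // placeholder root filesystem
    is_root_device: true,
    is_read_only: false,
  });
  await put("/actions", { action_type: "InstanceStart" });
}

bootMicroVM().catch(console.error);
```

A VM with one vCPU and 128 MiB of memory that can boot in a fraction of a second is a very different proposition for an orchestrator than a traditional VM.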
Container-based Platforms
Some platforms settle for containers to reduce overhead. On Linux, containers use cgroups and namespaces to provide a "good enough" level of isolation between instances and the host. The small size of container instances and their native performance have turned this method into the most desirable way to provide infrastructure for FaaS.
But there's a catch: containers are not sandboxed! They don't have the same level of isolation as VMs or sandboxes. That's why we've been seeing new projects that provide sandboxing for containers, like gVisor.
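To show how thin this layer is, here's the cgroup half of container isolation done by hand (container runtimes like runc do this for you, and add namespaces on top, which this sketch skips). It assumes Linux with cgroup v2 mounted at the usual path, root privileges, and a made-up group name.

```ts
// Sketch: limiting resources with cgroup v2 directly from Node.js.
// A cgroup is, at the bottom, just a directory of files under /sys/fs/cgroup.
import { mkdirSync, writeFileSync } from "node:fs";

const cg = "/sys/fs/cgroup/faas-fn-demo"; // illustrative group name

mkdirSync(cg, { recursive: true });
writeFileSync(`${cg}/memory.max`, `${64 * 1024 * 1024}`); // cap memory at 64 MiB
writeFileSync(`${cg}/cpu.max`, "50000 100000"); // 50 ms of CPU per 100 ms period

// Moving the current process into the group applies the limits to it and to
// anything it spawns afterwards, e.g. the user's function.
writeFileSync(`${cg}/cgroup.procs`, `${process.pid}`);
```

Namespaces are the other half of the story: they give the process its own view of the filesystem, network, PIDs, and so on, which is what makes it feel like a separate machine.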
Sandbox-based Platforms
Sandboxes provide a limited space for processes to play in, like the one you played in as a kid. (Fun fact: we didn't have sandboxes in my country while I was growing up.)
Processes that run in a sandbox can't interact with the OS directly; instead, they rely on the interface that the sandbox runtime provides. This allows us to trace interface calls and prevent malicious processes from accessing the host.
Sandboxes are hard! It's not easy to design and implement a sandbox with decent performance, but in the end, the result is worth it.
Some of the most notable sandbox implementations currently used in the FaaS world:
- WebAssembly runtimes
- V8 JavaScript engine
- gVisor
FaaS Sandbox Implementations
In this section, we review two sandbox approaches used in FaaS: WebAssembly and V8.
WebAssembly
WebAssembly is a portable binary instruction format for virtual stack machines.
It aims to execute at native or near-native speed by assuming a few characteristics about the execution environment.
WebAssembly is designed to be executed in a fully sandboxed environment with linear memory. Because of that design, the binary is completely blind to the host environment by default. That said, we can set up some kind of communication by reading the module's memory or by importing host functions.
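Here's a small sketch of what that looks like from the host side, using the standard WebAssembly JavaScript API that Node.js/V8 expose. The module file name and its `handle`, `log`, and `memory` imports/exports are assumptions made up for illustration; a real module would define its own interface.

```ts
// Sketch: embedding a WASM module with the standard WebAssembly JS API.
// "fn.wasm" stands in for a module compiled from Rust/C/AssemblyScript/etc.
import { readFileSync } from "node:fs";

async function run(): Promise<void> {
  const bytes = readFileSync("./fn.wasm");

  // The guest sees nothing of the host except what we explicitly import.
  const imports = {
    env: {
      // A host function the guest is allowed to call, e.g. for logging.
      log: (code: number) => console.log("guest says:", code),
    },
  };

  const { instance } = await WebAssembly.instantiate(bytes, imports);

  // Exported functions are the only way in...
  const handle = instance.exports.handle as (input: number) => number;
  console.log("result:", handle(21));

  // ...and linear memory is the only shared state. If the module exports
  // its memory, the host can read or write it directly.
  const memory = instance.exports.memory as WebAssembly.Memory | undefined;
  if (memory) {
    const view = new Uint8Array(memory.buffer, 0, 16);
    console.log("first bytes of guest memory:", view);
  }
}

run().catch(console.error);
```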
I've written about WebAssembly before. You can read about it before continuing:
WebAssembly, Wasmer And Embedding Wasmer in C program
WebAssembly is extremely appealing for FaaS since it's designed with sandboxing in mind, it's lightweight, and it has many compiler implementations, including an LLVM backend, which means any language with an LLVM-based compiler can be compiled to WebAssembly.
There are a few challenges though:
- Running WASM modules is inherently a blocking operation. You should consider this and find a solution suitable for your environment if you plan to use WASM for long-running executions.
- Memory is completely isolated, so input data has to be copied into the WASM module's memory.
- CPU limitation! You'll need a plan to limit CPU usage, like timeouts or metering; there's no native support for that (see the sketch after this list).
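Since a call into a WASM module blocks its thread and there's no built-in CPU limit, one common workaround is to run the module on a worker thread and kill it on a deadline. Here's a minimal sketch with Node's worker_threads; `wasm-runner.js` is a hypothetical worker script that instantiates the module and posts its result back, and the wall-clock timeout is only a crude stand-in for real metering.

```ts
// Sketch: enforcing a crude time budget on a WASM invocation by running it
// in a worker thread and terminating the worker when the deadline passes.
import { Worker } from "node:worker_threads";

function invokeWithTimeout(input: unknown, timeoutMs: number): Promise<unknown> {
  return new Promise((resolve, reject) => {
    // "./wasm-runner.js" (hypothetical) instantiates the module with
    // workerData as input and posts the result back via parentPort.
    const worker = new Worker("./wasm-runner.js", { workerData: input });

    // Wall-clock deadline: not true CPU metering, but it stops runaway guests.
    const timer = setTimeout(() => {
      worker.terminate();
      reject(new Error(`function exceeded ${timeoutMs} ms`));
    }, timeoutMs);

    worker.once("message", (result) => {
      clearTimeout(timer);
      resolve(result);
    });
    worker.once("error", (err) => {
      clearTimeout(timer);
      reject(err);
    });
  });
}
```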
Even though these challenges may limit the use of WASM for some use cases, in my opinion it's still the best candidate for FaaS platforms.
There's currently one implementation in the wild that I know of: Fastly Serverless uses Lucet to run WASM modules at the edge.
V8
Chrome V8 is a JavaScript engine developed by Google and used in Google Chrome, Node.js, and more. In addition to JavaScript, V8 can also run WASM modules. V8 is a huge project, and it's one of the best JavaScript engines you can find in terms of performance and security. Its continued development is also guaranteed by the fact that it's used on hundreds of millions of devices. (Java installation flashback :D)
Isolation in V8 is achieved by using Isolates. An Isolate is basically an isolated environment for your code to run in.
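V8 itself is a C++ library, but if you want to play with Isolates from Node.js, the third-party isolated-vm package exposes them fairly directly. Here's a rough sketch; the API details are from memory of that package, so treat them as assumptions.

```ts
// Sketch: running untrusted code in its own V8 Isolate via the third-party
// "isolated-vm" package. Each Isolate gets its own heap, separate from the host.
import ivm from "isolated-vm";

async function runUntrusted(source: string): Promise<unknown> {
  const isolate = new ivm.Isolate({ memoryLimit: 128 }); // heap cap in MB
  const context = await isolate.createContext();

  // Nothing from the host is visible unless explicitly placed on the global.
  await context.global.set("name", "world");

  const script = await isolate.compileScript(source);
  try {
    return await script.run(context, { timeout: 50 }); // 50 ms wall-clock cap
  } finally {
    isolate.dispose();
  }
}

runUntrusted("`Hello, ${name}!`").then(console.log).catch(console.error);
```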
Implementing a FaaS platform on top of V8 is a complex task, but the reward is an easy and fast entryway for end users, since they can write plain JavaScript and it just works.
Cloudflare Workers is the only user of V8 at the edge that I know of.
Commercial Products
- Amazon Lambda: VM-based (Firecracker microVMs)
- Cloudflare Workers: Sandbox / V8-based
- Fastly Serverless: Sandbox / WASM-based
There are many more, like Google Cloud Functions, Fly.io, and Stackpath EdgeEngine, but I wasn't able to confirm what kind of infrastructure they use.
Future
It's hard to say what will happen in the future. FaaS brings many advantages, but migrating legacy and existing code may not be easy. In my mind its future is bright, but in reality it might turn out to be nothing more than hype. Even if it turns out to be a flop, though, I'd say we can apply the good parts of FaaS to other serverless-style services.
People don't like to manage their infrastructure, and that's just a fact. They don't want to know what a container is, they don't want to know how to scale their service; they only want to worry about their code, and I think that's fair. Current PaaS products don't have enough abstraction, and the success of services like Vercel shows just that.
Conclusion
In this article, I tried to categorize the different kinds of FaaS implementations and the technologies used in them. I tried to keep it short and light on details because, with all the details included, it had the potential to become a book! And as I said at the beginning, this has been an unfair post since I didn't cover all topics equally.
Thank you for your time.