Lessons learned from building a WebSocket server

Torsten Dittmann

Posted on September 14, 2021

Appwrite is an open-source, self-hosted Backend-as-a-Service that aims to make app development easier with SDKs available in a variety of programming languages.

Before the release of a Realtime API with version 0.10.0, applications were only able to communicate with our REST API.

Why did we build a Realtime API?

REST APIs have been a popular architecture for data delivery in the past. So why do we need Realtime now?

Our REST API works great and is very simple, but in order for us to allow more flexibility, and to allow developers to create new use-cases such as game development and reactive applications, we needed to add a new API layer for realtime interaction.

Rather than the API client only getting new data on their next query, the new data is pushed to them immediately. If a developer is already polling the REST API for data changes, it doesn't just mean they want to access the data faster, but is a strong indication that they really want a realtime API.

Realtime APIs provide a more enjoyable developer experience that can significantly reduce application processing overhead and code complexity. Once the data is transferred to the system in real time, you allow the developers to focus on adding value to the product.

Architecture

Since the Realtime Service is implemented on top of an already existing REST API, all messages sent over Realtime are triggered from the HTTP Server. This means that if a resource is created or updated, the WebSocket Server will be triggered to send this action to its subscribers.

The backbone of the data exchange between REST and WebSocket is a Redis instance. We use a single Pub/Sub channel, which is the source of truth for the WebSocket Server. If a new resource is added via the REST API, the HTTP Server publishes the payload alongside metadata to this channel. The WebSocket Server subscribes to the channel, processes each message, decides which clients are allowed to receive it, and sends it to those clients.
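To make the publish step more concrete, here is a rough Python sketch (Appwrite itself is not written in Python). It assumes an envelope with hypothetical field names (`project`, `channels`, `roles`, `payload`) and a hypothetical channel name `realtime`. The `InMemoryPubSub` class is only a stand-in so the example runs without Redis; it is not a Redis client.

```python
import json
from collections import defaultdict


class InMemoryPubSub:
    """Stand-in for a single Redis Pub/Sub channel (assumption: not a real client)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, channel, handler):
        self._handlers[channel].append(handler)

    def publish(self, channel, message):
        # Deliver the raw message to every subscriber of the channel.
        for handler in self._handlers[channel]:
            handler(message)
        return len(self._handlers[channel])


def publish_event(bus, project, channels, roles, payload):
    """HTTP Server side: wrap the payload in metadata and publish it."""
    envelope = {
        "project": project,    # hypothetical field names
        "channels": channels,
        "roles": roles,
        "payload": payload,
    }
    return bus.publish("realtime", json.dumps(envelope))


# WebSocket Server side: subscribe once, then decode every incoming event.
received = []
bus = InMemoryPubSub()
bus.subscribe("realtime", lambda raw: received.append(json.loads(raw)))

delivered = publish_event(
    bus, "project-a", ["cars"], ["user:123"], {"id": "car-1", "color": "red"}
)
```

The point of the envelope is that the WebSocket Server can decide who receives an event from the message alone, without asking the HTTP Server again.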

Data Flow

At Appwrite, resources from the REST API are separated by projects, secured by permissions, and the events are categorized in channels. When a client establishes a connection to the realtime server, a project identifier is sent along with information to authenticate the connection to a user and channels via which the client will receive messages. In the following, we take the Car resource as an example, and tell the WebSocket Server to subscribe to the Cars channel.

The WebSocket Server now allocates all roles of the user, the project, and the channels to the unique connection identifier for the client.

If the Car resource now gets updated via the REST API, the HTTP Server publishes this event to the Redis channel with its payload. The WebSocket server will then receive this event and start checking who the receiver of this event will be.

A client must meet all of the following conditions:

  • The Project ID must be equal.
  • The Permissions of the resource must meet the user's roles.
  • The Channel must be subscribed to.

The WebSocket Server will then send the payload of the resource to all clients that meet the conditions.
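The three conditions above can be sketched as a single predicate. This is an illustration only: the `Connection` type and the `should_receive` function are hypothetical names, and we assume that "meeting" the permissions means the user holds at least one role that the resource grants read access to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    project_id: str
    roles: frozenset      # roles of the connected user
    channels: frozenset   # channels the client subscribed to


def should_receive(conn, event_project, event_roles, event_channels):
    """Return True if the connection meets all three conditions."""
    return (
        conn.project_id == event_project                  # same project
        and not conn.roles.isdisjoint(event_roles)        # at least one permitted role
        and not conn.channels.isdisjoint(event_channels)  # subscribed to a channel
    )


conn = Connection(
    project_id="project-a",
    roles=frozenset({"user:123", "role:member"}),
    channels=frozenset({"cars"}),
)
allowed = should_receive(conn, "project-a", {"role:member"}, {"cars"})
```

Checking every connection with a predicate like this is linear in the number of connections, which is why the server keeps a dedicated lookup structure instead.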

Data Structure

Speed is vital for applications with realtime updates. Our data structure must let us decide as quickly as possible which clients are subscribed to an event and allowed to receive it. For this, we maintain two hash maps in memory: one holding all subscriptions, the other all connections.

Subscriptions

Looking at the previous conditions, we can see the pattern reflected in this tree. You may realize that this structure has a disadvantage, namely, there are many duplicate data entries of the connection ID. However, this disadvantage is intentional and has a specific reason – speed.

The tradeoff of memory for speed is essential in a WebSocket server. This structure allows us, even with a high number of subscribers, to quickly identify them and forward the message to them, even though this might use up more memory.
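A minimal sketch of such a tree, assuming the nesting order project, then role, then channel, with a set of connection IDs at the leaves. The function names are hypothetical and the real implementation may differ in detail.

```python
from collections import defaultdict

# project ID -> role -> channel -> set of connection IDs
subscriptions = defaultdict(lambda: defaultdict(lambda: defaultdict(set)))


def subscribe(connection_id, project_id, roles, channels):
    """Store the connection ID once per (role, channel) pair: duplication on purpose."""
    for role in roles:
        for channel in channels:
            subscriptions[project_id][role][channel].add(connection_id)


def get_subscribers(project_id, roles, channels):
    """Collect receivers with plain hash lookups, never scanning all connections."""
    receivers = set()
    by_role = subscriptions.get(project_id)
    if by_role is None:
        return receivers
    for role in roles:
        by_channel = by_role.get(role)
        if by_channel is None:
            continue
        for channel in channels:
            receivers |= by_channel.get(channel, set())
    return receivers


subscribe("c1", "project-a", ["user:1", "role:member"], ["cars"])
subscribe("c2", "project-a", ["role:member"], ["cars", "bikes"])
receivers = get_subscribers("project-a", ["role:member"], ["cars"])
```

The lookup cost depends on the number of roles and channels attached to the event, not on the total number of connections, which is what the extra memory buys.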

Below are benchmark results for our implementation, with subscribers distributed evenly across 20 different channels and a single event used to gather all subscribers for that event.

Subscriptions    Time used    Memory used
10,000           0.022 ms     11 MB
100,000          0.238 ms     90 MB
500,000          1.525 ms     427 MB
1,000,000        3.678 ms     852 MB
5,000,000        19.334 ms    4,289 MB

These numbers are more than fast enough for everyday applications, especially considering that a single WebSocket server is unlikely to maintain more than a million connections simultaneously. Since the WebSocket server is stateless and only manages its own subscriptions, it can easily scale horizontally and share the load across instances.

Now we come to our next data structure and the reason why we need it in the first place.

Let's assume a client connects to our WebSocket server and subscribes to some channels. After some time, the client disconnects and we have to clean up after them and remove their connection from all channels.

Connections

To avoid looping through the entire subscription tree to find every entry that belongs to a connection, we keep an auxiliary table that holds the project and roles of each connection, easily accessible for us. Using this data, we can remove the connection from the subscriptions without much searching.
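The cleanup can be sketched as follows, under the same assumed nesting (project, role, channel). The `SubscriptionRegistry` class and its method names are hypothetical; the idea is only that the connections table tells us which branches of the tree to visit.

```python
class SubscriptionRegistry:
    def __init__(self):
        self.subscriptions = {}  # project -> role -> channel -> set of connection IDs
        self.connections = {}    # connection ID -> {"project": ..., "roles": [...]}

    def subscribe(self, connection_id, project_id, roles, channels):
        for role in roles:
            for channel in channels:
                (self.subscriptions
                     .setdefault(project_id, {})
                     .setdefault(role, {})
                     .setdefault(channel, set())
                     .add(connection_id))
        # Auxiliary table: remember where this connection was stored.
        self.connections[connection_id] = {"project": project_id, "roles": list(roles)}

    def unsubscribe(self, connection_id):
        info = self.connections.pop(connection_id, None)
        if info is None:
            return
        by_role = self.subscriptions.get(info["project"], {})
        for role in info["roles"]:
            by_channel = by_role.get(role, {})
            # Only the channels under this connection's own roles are visited.
            for channel in list(by_channel):
                by_channel[channel].discard(connection_id)
                if not by_channel[channel]:
                    del by_channel[channel]
            if role in by_role and not by_channel:
                del by_role[role]
        if info["project"] in self.subscriptions and not by_role:
            del self.subscriptions[info["project"]]


registry = SubscriptionRegistry()
registry.subscribe("a", "p1", ["user:1"], ["cars"])
registry.subscribe("b", "p1", ["user:1", "role:member"], ["cars", "bikes"])
registry.unsubscribe("b")
```

Empty branches are pruned on the way out so that memory does not grow with connections that have long since disconnected.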

Stumbling Blocks

Of course, we didn't get everything right the first time. Every time we encountered and solved one hurdle, the next one was already there waiting for us.

Change of Permissions

One of the first hurdles we encountered was: What happens if a user's permissions change while they are connected? What if a user is deactivated and the connection is still open?

The WebSocket server would not know about this change and would continue to send all the messages that the user was allowed to receive at the beginning of the connection. This would result in exposing a resource to someone who is not authorized to read it.

To prevent this phenomenon, we have added a flag to the message sent to the WebSocket server, which indicates whether the permissions for a particular user have changed. When the WebSocket Server receives this message, it checks if this user is currently connected and matches their roles with those in the backend.
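One way to picture this check, as a hedged sketch: the field names `permissionsChanged` and `userId`, the shape of the connections table, and the `fetch_roles` callback (standing in for a lookup in the backend) are all assumptions made for illustration.

```python
def handle_permission_change(message, connections, fetch_roles):
    """Refresh roles of a user's open connections when the flag is set.

    Returns {connection_id: new_roles} for every connection that changed.
    """
    if not message.get("permissionsChanged"):
        return {}

    user_id = message["userId"]
    current_roles = sorted(set(fetch_roles(user_id)))  # roles according to the backend
    updated = {}
    for connection_id, info in connections.items():
        if info["user_id"] != user_id:
            continue  # user is connected elsewhere or not at all
        if sorted(set(info["roles"])) != current_roles:
            info["roles"] = list(current_roles)
            updated[connection_id] = info["roles"]
    return updated


connections = {
    "c1": {"user_id": "u1", "roles": ["role:member", "user:u1"]},
    "c2": {"user_id": "u2", "roles": ["user:u2"]},
}
updated = handle_permission_change(
    {"permissionsChanged": True, "userId": "u1"},
    connections,
    lambda user_id: ["user:u1"],  # backend says: no longer a member
)
```

In a full server, each updated connection would also be removed from the subscriptions and re-added with its new roles, so that later events are matched against the current permissions.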

The Operating System

Linux’s networking stack comes with sane defaults for many workloads, but the stack isn't tuned for 1+ million concurrent connections. We expected to face some form of the C10k problem, so we prepared our systems in advance [1][2][3]:

  • Increased the default TCP buffer sizes for the system
  • Increased the default IPv4 port range
  • Increased the limit for open files and file handles

Despite this tuning, we hit a limit of around 260k connections. Past that point, the HTTP server stopped responding to our clients. We observed that our server wasn't completing the TCP 3-way handshake: it would receive SYN packets from the client (as observed with tcpdump) but wasn't responding with SYN-ACK.

After hours of fruitless debugging, we tapped other maintainers to lend their eyes to the problem. Through the power of open-source collaboration, we had our culprit in a matter of minutes:

$ cat /proc/sys/net/netfilter/nf_conntrack_max
262144

Because WebSocket connections are long-lived, we needed to increase the connection tracking limit in the networking stack. Once increased, we cruised all the way to 1 million connections with ease.
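A small pre-flight check along these lines could have saved us those hours. The sketch below reads the same file as the command above and compares it with a target connection count; the helper names are hypothetical, and the file only exists on Linux hosts where the conntrack module is loaded.

```python
from pathlib import Path

CONNTRACK_MAX = Path("/proc/sys/net/netfilter/nf_conntrack_max")


def read_conntrack_max(path=CONNTRACK_MAX):
    """Return the connection tracking limit, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def headroom(limit, target_connections):
    """Positive result: room to spare. Negative: the limit is hit before the target."""
    return limit - target_connections


limit = read_conntrack_max()
if limit is not None and headroom(limit, 1_000_000) < 0:
    print(f"nf_conntrack_max={limit} is below the 1,000,000 connection target")
```

The limit itself is raised through the `net.netfilter.nf_conntrack_max` sysctl key; the value to choose depends on available memory and is not something this sketch can decide.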

Asynchronous Delivery

When we checked the performance of sending messages, everything went well until we ran larger-scale tests and were surprised by very poor results. The culprit was that we sent each message serially instead of in parallel.

Fortunately, the solution was only a few lines of code away.
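To illustrate the difference, here is a sketch using Python's asyncio rather than our actual server code. `FakeConnection` is a hypothetical stand-in that simulates network latency with a short sleep. Note that asyncio gives concurrency on one thread rather than true parallelism, but the effect on a slow send is the same: one slow client no longer delays everyone behind it.

```python
import asyncio


class FakeConnection:
    """Hypothetical client connection; sleep() simulates network latency."""

    def __init__(self, delay):
        self.delay = delay
        self.received = []

    async def send(self, message):
        await asyncio.sleep(self.delay)
        self.received.append(message)


async def send_serial(connections, message):
    # Each send waits for the previous one: total time grows with client count.
    for connection in connections:
        await connection.send(message)


async def send_concurrent(connections, message):
    # All sends are in flight at once; failures do not cancel the others.
    results = await asyncio.gather(
        *(connection.send(message) for connection in connections),
        return_exceptions=True,
    )
    return sum(1 for result in results if not isinstance(result, Exception))


connections = [FakeConnection(0.01) for _ in range(50)]
delivered = asyncio.run(send_concurrent(connections, "update"))
```

With 50 clients at 10 ms each, the serial version needs roughly half a second, while the concurrent version finishes in about the time of a single send.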

Authentication with Cookies

The first implementation of the WebSocket Server only communicated one way: it sent updates to clients. This turned out to be a problem in retrospect, as our current implementation uses the HTTP-only cookie, which is transmitted to the WebSocket Server with the handshake.

Later, when developing a demo application, we noticed that under certain circumstances this cookie is not sent, for example, when the client and server are on different domains.

After a bit of research, we learned that the handshake is not intended as a method for authentication at all. The reasoning has been explained by one of the maintainers of Chrome's WebSocket implementation. We solved this by additionally authenticating via a message over the WebSocket protocol: if the user was not authenticated via the cookie, we fall back to authentication via a message and send the token of the cookie to the WebSocket Server.
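The fallback can be sketched like this. The cookie name `session`, the message shape `{"type": "authentication", "data": {"session": ...}}`, and the `verify_session` callback are assumptions made for the example, not a description of the actual protocol.

```python
import json


def authenticate(handshake_cookies, first_message, verify_session):
    """Return the user ID, or None if the client stays unauthenticated."""
    # 1. Preferred path: the cookie that arrived with the handshake.
    token = handshake_cookies.get("session")

    # 2. Fallback: an explicit authentication message sent over the socket.
    if token is None and first_message is not None:
        try:
            message = json.loads(first_message)
        except json.JSONDecodeError:
            return None
        if isinstance(message, dict) and message.get("type") == "authentication":
            data = message.get("data")
            if isinstance(data, dict):
                token = data.get("session")

    if token is None:
        return None
    return verify_session(token)


sessions = {"abc": "user-1"}  # hypothetical session store
verify = sessions.get

via_cookie = authenticate({"session": "abc"}, None, verify)
via_message = authenticate(
    {}, json.dumps({"type": "authentication", "data": {"session": "abc"}}), verify
)
```

Both paths end in the same token verification, so the rest of the server does not need to know how the client authenticated.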

So, relying on the handshake for authentication alone was obviously a bad idea.

Takeaway

Of course, the above approaches might not apply to every use case, but they work for us at this point. As Donald Knuth said in his 1974 Turing Award lecture, Computer Programming as an Art:

“The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming.”

We could micro-optimize our data structure to achieve even better results with many more subscribers. However, it is easier to add another instance of the WebSocket Server behind a Load Balancer and scale horizontally.

As long as this works for us, we’ll follow Donald’s advice.

Credits

Thank you for your attention and we hope you enjoyed this article!

Here are some handy links for more information about Appwrite:
