Server can handle 10 million users


koki kitamura

Posted on May 1, 2022


Overview

I built a highly scalable API server that can handle 10 million users: an SNS (social networking service) like Twitter.
The implementation is published on GitHub.

The development environment is as follows.

  • Node 16.14
  • Express 4.17.3
  • DynamoDB 2012-08-10

The functional requirements are as follows.

  • Post a tweet
  • Post a comment on a tweet
  • Follow a user
  • Get the timeline

Introduction

Services with hundreds of millions of users, such as Facebook, Amazon, and YouTube, need to handle a lot of traffic. A commonly used approach to handling heavy traffic is scale-out rather than scale-up. Scale-up is expensive because it relies on high-performance servers, and a single server has a hard performance ceiling.

Let's talk about scale-out. An application can be broadly divided into three layers.

  • Client layer
  • Server layer
  • Database layer

When handling a large amount of traffic, the server layer only processes data; it does not store it. Therefore, it is easy to scale out.


The database layer, on the other hand, becomes difficult to scale out: as data is distributed across nodes, consistency and availability are harder to maintain. You also need logic to decide which node stores which data, and data must be relocated when nodes are added or removed. Since relational databases do not provide these features out of the box, we will use NoSQL.

Typical databases that support scale-out include BigTable, HBase, DynamoDB, Cassandra, etc.

| Database | Description |
| --- | --- |
| BigTable, HBase | Reads return consistent, up-to-date data. On the other hand, data cannot be read while a lock is held for an update. |
| DynamoDB, Cassandra | Data is always accessible. On the other hand, stale data may be read while replicas synchronize. |

This time, we are creating an API server for an SNS, so availability matters more than consistency. Therefore, we use DynamoDB.

What is DynamoDB?

DynamoDB is a key-value database. You can create tables, and each table stores items; each item has a key and a value.

You can specify a partition key and a sort key as the item key. The partition key determines which node in the DynamoDB cluster stores the item. The sort key acts like an index within a partition and is used for sorting.

An item's value can hold multiple attribute/value pairs, and the attributes can differ from item to item.


DynamoDB queries are limited: you basically narrow items down by partition key and sort key only. Querying on other attributes requires scanning all items, so it gets slower as the number of items grows.
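As a concrete illustration, here is a minimal sketch of what such a key-condition query looks like in the parameter shape used by the AWS SDK's `DocumentClient`. The table and attribute names (`SnsTable`, `ID`, `DataType`) are assumptions for illustration; the helper only builds the parameter object and does not call DynamoDB.

```typescript
// Sketch: narrow items down by partition key plus a sort-key prefix.
// Table and attribute names ("SnsTable", "ID", "DataType") are assumptions.
interface QueryParams {
  TableName: string;
  KeyConditionExpression: string;
  ExpressionAttributeValues: Record<string, string>;
}

function buildPrefixQuery(userId: string, dataTypePrefix: string): QueryParams {
  return {
    TableName: "SnsTable", // hypothetical table name
    KeyConditionExpression: "ID = :id AND begins_with(DataType, :prefix)",
    ExpressionAttributeValues: {
      ":id": userId,
      ":prefix": dataTypePrefix,
    },
  };
}
```

An object of this shape can be passed to `DocumentClient.query` in the AWS SDK for JavaScript.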

When you want to treat another attribute as a partition key, use a GSI (Global Secondary Index). When you want to treat another attribute as a sort key, use an LSI (Local Secondary Index).

Database Design

DynamoDB table design differs from RDB design. The query flexibility of an RDB lets you design normalized tables first, without considering how the data will be accessed. DynamoDB, on the other hand, supports only limited query patterns, so you first determine the access patterns and then design the tables around them. Specifically, we will proceed in the following flow.

  1. Modeling
  2. Create use case list
  3. Design Table
  4. Create query definition

Modeling

The ER diagram is as follows.


The timeline shows the tweets of the users you follow. In an SNS, how fast the timeline loads has a great influence on usability, so we want a database design that can render the timeline quickly.

Read Heavy / Write Light on the timeline

With a normalized table design, writing a tweet is light because data is written only to the Tweets table. Reading the timeline, however, is heavy. The main flow for reading the timeline is as follows.

  1. Get a list of IDs of users you are following
  2. Get tweets from each user you follow
  3. Merge the retrieved tweets
  4. Sort merged tweets

The SQL for getting the timeline is as follows.

```sql
SELECT
  *
FROM
  tweets
WHERE
  userId IN (
    SELECT followeeId FROM follows WHERE followerId = 'user id'
  )
ORDER BY
  postDate DESC
```

With this method, the more users you follow, the heavier the timeline read becomes. It is a Read Heavy / Write Light method.
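The merge-and-sort half of the flow above (steps 3 and 4) can be sketched as follows; the `Tweet` shape is an assumption for illustration, and fetching the followees and their tweets (steps 1 and 2) is left out.

```typescript
// Sketch of steps 3-4 of the read-heavy timeline flow.
// The Tweet shape is an assumption for illustration.
interface Tweet {
  userId: string;
  postDate: number; // unix time
  text: string;
}

// Merge each followee's tweet list and sort newest first.
function mergeTimeline(tweetsPerFollowee: Tweet[][]): Tweet[] {
  return tweetsPerFollowee.flat().sort((a, b) => b.postDate - a.postDate);
}
```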

Read Light / Write Heavy on the timeline

Now consider a Read Light / Write Heavy technique. Create a Timeline table; to read the timeline, just query that table. In exchange, when a user tweets, the tweet must be written to the timeline of every follower.
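A minimal sketch of this fan-out write, assuming hypothetical item shapes: when a user tweets, one timeline item is produced per follower (the actual batch write to DynamoDB is omitted).

```typescript
// Sketch: fan a new tweet out to every follower's timeline partition.
// Item shapes are assumptions for illustration.
interface NewTweet {
  id: string;
  userId: string;   // the author
  postDate: number; // unix time
}

interface TimelineItem {
  userId: string;      // the follower who will read this item
  tweetId: string;
  tweetUserId: string;
  tweetPostDate: number;
}

function fanOutTweet(tweet: NewTweet, followerIds: string[]): TimelineItem[] {
  return followerIds.map((followerId) => ({
    userId: followerId,
    tweetId: tweet.id,
    tweetUserId: tweet.userId,
    tweetPostDate: tweet.postDate,
  }));
}
```

In practice these items would go to DynamoDB in batched writes, which is exactly why a tweet by a heavily followed user is expensive.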


The SQL for getting the timeline is as follows.

```sql
SELECT
  *
FROM
  timelines
WHERE
  userId = 'user id'
ORDER BY
  tweetPostDate DESC
```

This time, we will use this Read Light / Write Heavy method.

Create use case list

Create a list of data use cases based on the functional requirements to identify how the data will be accessed.

| Entity | Use Case | Screen |
| --- | --- | --- |
| Tweet | getTimelineByUserId | Home |
| User | getUserByUserName | User Detail |
| Follow | getFolloweesByUserId | User Detail |
| Follow | getFollowersByUserId | User Detail |
| Follow | getCountFolloweeByUserId | User Detail |
| Follow | getCountFollowerByUserId | User Detail |
| Tweet | getTweetsByUserId | User Detail |
| Tweet | getTweetByTweetId | Tweet Detail |
| Comment | getCommentsByTweetId | Tweet Detail |

Design Table

We will design the table and indexes based on the use case list. DynamoDB supports only limited query patterns, but a technique called GSI overloading allows flexible queries.


Include the ID in the sort key, and make the order of IDs match record creation order. Then you can sort posts by date without using an LSI.
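One way to realize this, as a sketch: zero-pad the numeric ID inside the sort key so that lexicographic order equals numeric (and therefore creation-time) order. The `tweet#` prefix and the pad width are assumptions for illustration.

```typescript
// Sketch: a sort key whose string order matches ID (creation-time) order.
// The "tweet#" prefix and the 19-digit pad width are assumptions.
function tweetSortKey(id: number): string {
  return `tweet#${id.toString().padStart(19, "0")}`;
}
```

Without the padding, `"tweet#40"` would sort before `"tweet#5"` as a string, breaking the date order.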

Create query definition

Finally, write out the query conditions. The database access code will be implemented based on these.

| Entity | Use Case | Parameters | Table / Index | Key Condition |
| --- | --- | --- | --- | --- |
| Tweet | getTimelineByUserId | {UserId} | Primary Key | Query (ID=userId AND begins_with(DataType, timeline)) |
| User | getUserByUserName | {Username} | GSI-1 | Query (DataValue=username AND DataType=userProfile) |
| Follow | getFolloweesByUserId | {UserId} | Primary Key | Query (ID=userId AND begins_with(DataType, followee)) |
| Follow | getFollowersByUserId | {UserId} | Primary Key | Query (ID=userId AND begins_with(DataType, follower)) |
| Follow | getCountFolloweeByUserId | {UserId} | Primary Key | Select COUNT / Query (ID=userId AND begins_with(DataType, followee)) |
| Follow | getCountFollowerByUserId | {UserId} | Primary Key | Select COUNT / Query (ID=userId AND begins_with(DataType, follower)) |
| Tweet | getTweetsByUserId | {UserId} | Primary Key | Query (ID=userId AND begins_with(DataType, tweet)) |
| Tweet | getTweetByTweetId | {TweetId} | GSI-1 | Query (DataValue=tweetId AND begins_with(DataType, tweet)) |
| Comment | getCommentsByTweetId | {TweetId} | GSI-1 | Query (DataValue=tweetId AND begins_with(DataType, comment)) |

Design API Server

Software Design

The design is based on Domain-Driven Design (DDD). Layer names and directory names correspond one-to-one.

| Directory Name | DDD Layer | Components |
| --- | --- | --- |
| src/domain | Domain Layer | Entity / Value Object / Repository Interface |
| src/application | Application Layer | Application Service / Serializer |
| src/infrastructure | Infrastructure Layer | Repository / AWS Config |
| src/presentation | Presentation Layer | API Server |

ID generation method

IDs must be generated so that their order matches record creation order. A numbering (counter) table could handle this, but it does not scale. Instead, we use Snowflake, a scalable ID-generation method.

This method divides the bit string into three parts. The ID is the decimal number of this bit string.

| Part | Description |
| --- | --- |
| Epoch time | The number of seconds elapsed since a chosen epoch. |
| Sequence | Counts up each time an ID is generated; reset every second. |
| Node number | The number assigned to each node. |

An implementation of Snowflake in Node.js is as follows.

```typescript
import { config } from "@src/config";
import { dateToUnixTime } from "./time";

const workerIDBits = 10;
const sequenceBits = 12;

// Use snowflake
// See: https://blog.twitter.com/engineering/en_us/a/2010/announcing-snowflake
export class IdGenerator {
  private workerId: number;
  private lastGenerateAt: number;
  private sequence: number;

  constructor(workerId?: number) {
    // Fall back to the configured worker ID when none is passed in.
    this.workerId = workerId ?? config.snowflakeWorkerId;
    this.lastGenerateAt = dateToUnixTime(new Date());
    this.sequence = 0;
  }

  generate(): number {
    const now = dateToUnixTime(new Date());

    if (now == this.lastGenerateAt) {
      this.sequence++;
    } else {
      this.sequence = 0;
    }
    this.lastGenerateAt = now;

    // The bit operators ('<<' and '|') only handle numbers within the
    // range of a signed 32-bit integer, so compose the ID arithmetically.
    return (
      now * 2 ** (workerIDBits + sequenceBits) +
      this.workerId * 2 ** sequenceBits +
      this.sequence
    );
  }
}
```

FAQ

Is the user's profile information duplicated?

Yes, it is duplicated. When a profile is updated, trigger a Lambda function via DynamoDB Streams to propagate the change asynchronously and keep the copies eventually consistent.
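As a sketch of that stream handler, with simplified record and attribute shapes (a real handler receives the AWS `DynamoDBStreamEvent` type, and the attribute names depend on the table design):

```typescript
// Sketch: collect profile changes from DynamoDB Stream records so they can
// be copied to the duplicated items. Record/attribute shapes are assumptions.
interface StreamRecord {
  eventName: "INSERT" | "MODIFY" | "REMOVE";
  newImage?: { ID: string; DataType: string; Username?: string };
}

function profileUpdates(records: StreamRecord[]) {
  return records
    .filter(
      (r) => r.eventName === "MODIFY" && r.newImage?.DataType === "userProfile"
    )
    .map((r) => ({ userId: r.newImage!.ID, username: r.newImage!.Username }));
}
```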

Isn't a tweet by a user with many followers expensive to write?

Yes, it is. Only for users with a very large number of followers, some countermeasure is needed: for example, skip writing to each follower's timeline and instead merge that user's tweets in dynamically when the timeline is read.

Shouldn't we cache?

We should, eventually. It is not too late to add caching after monitoring reveals actual bottlenecks.

Conclusion

In this article, I explained how to create a highly scalable API server. Keep in mind that excessive performance optimization can backfire when there are no actual performance issues.

The implementation is published on GitHub, so please take a look.
