The Tech Stack of a Simple SaaS for AWS Cloud

server_kota

Mark K

Posted on October 16, 2024


Introduction


Note 1: Here is the hosted interactive demo: demo.saasconstruct.com

Note 2: My bill for each SaaS setup is $3-5 per month, and it's mostly CI/CD costs.

Note 3: Template is here: saasconstruct.com.


I've done several AI PoCs and MVPs on AWS, and it's always similar things:

  • Host the frontend somewhere
  • Make a call to the backend
  • The backend gets/updates data from binary storage/database
  • The backend does some AI logic or calls another service and sends back the result
  • There are two isolated AWS accounts: dev and prod
  • CI/CD for deployments
  • Infrastructure-as-code for cloud resources declaration

So, I thought I'd build a simple solution to bootstrap such things on AWS. And write a blog post about it.

I decided to add some features, like Stripe payments (and LemonSqueezy payments if you don't want to worry about sales tax/VAT) and payment management, authentication, traffic alarms, and others. I also thought it needed to be configurable, like replacing API Gateway and AWS Lambda with ELB and ECS for longer tasks.

Frontend

I picked the framework that is commonly described as the easiest to start with: Vue. As far as I understand, it is also the second most popular framework out there. Besides being easy, I already had some experience with it.

The website is a standard SPA with Vite as the build tool. For styling, I use Bootstrap because it is also very easy to work with and does not cause much pain when migrating from one version of the frontend framework to another.

Frontend Hosting

There are two options:

  • S3 and CloudFront (CDN)
  • AWS Amplify Hosting, which is a wrapper around S3 and CloudFront. It is easy to work with but less configurable. E.g., you can't change anything on the CloudFront distribution, as it is not visible to you. You also can't geo-block your application except by using redirects.

I went with Amplify Hosting because it is AWS's primary focus among its frontend hosting solutions and because it is easy to set up, attach a domain, etc.

Since it is billed on a pay-as-you-go basis, I have set up a traffic alarm: if there are more than a certain number of hits per 10 seconds, I get a notification.

Backend

The backend is the API Gateway, which does the rate limiting, and AWS Lambda (Python), which does the business and general logic:

  • Checking if the user is authenticated
  • Processing payments and managing subscriptions (customer portal)
  • Sending emails
  • etc.

I also have another AWS Lambda function that creates a user in the database after signing up in Cognito.
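The sign-up Lambda described above can be sketched roughly as follows. This is a minimal illustration, not the template's actual code: it assumes the function is attached as a Cognito post-confirmation trigger, and the record fields (`user_id`, `plan`, and so on) and the injected `put_item` writer are hypothetical. In a real function, `put_item` would wrap a DynamoDB write.

```python
from typing import Any, Callable, Dict

# Hypothetical item writer, injected so the sketch runs without AWS access.
# In a deployed Lambda this would wrap a DynamoDB put operation.
PutItem = Callable[[Dict[str, Any]], None]


def build_user_item(event: Dict[str, Any]) -> Dict[str, Any]:
    """Map a Cognito post-confirmation event to a user record (assumed schema)."""
    attributes = event["request"]["userAttributes"]
    return {
        "user_id": attributes["sub"],          # Cognito's unique user identifier
        "email": attributes.get("email", ""),
        "username": event["userName"],
        "plan": "free",                        # hypothetical default for new sign-ups
    }


def handle_post_confirmation(event: Dict[str, Any], put_item: PutItem) -> Dict[str, Any]:
    """Store the new user, then hand the event back to Cognito unchanged."""
    # React only to sign-up confirmation, not to forgot-password confirmation.
    if event.get("triggerSource") == "PostConfirmation_ConfirmSignUp":
        put_item(build_user_item(event))
    return event  # Cognito expects the trigger to return the event object
```

Keeping the mapping in `build_user_item` separate from the write makes the record shape easy to unit test without touching the database.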

There are shared utilities where I put common functionality, for example, emailing and logging. For instance, an email is sent to me if there is a payment error.

Authentication

Authentication is a pain, I know, and I did not want to use a third-party service. So I stayed with AWS Cognito. It is pretty cheap.

You could say, just use AWS Amplify Auth (which is a wrapper around AWS Cognito), but I had some problems with it. I even wrote a post on Reddit:

My list of problems with Amplify for authentication

There is also an older post from another frustrated user with an even bigger list.

Besides, if you use Amplify only, you are stuck with the whole ecosystem with no chance to change something. E.g., if you want to have access to the CloudFront distribution (e.g., when you want to geo-block certain regions), tough luck, you can't see it with Amplify Hosting. I also had other issues: one of the examples being CDK creation from Amplify resources, which was a pain point for me.

So what I did is a hybrid approach (which is somewhat popular according to Reddit): the AWS Amplify JS library allows you to import cloud resources you create yourself, like user pools, so I created them with CDK and then just used the Amplify JS library for authentication.

In this case, I can always change whatever I want, swap cloud resources (for example, I could go from Amplify Hosting to CloudFront + S3 if I need access to CloudFront distribution).

Emails

AWS SES. It is the main AWS Email service. It sends everything, including Cognito authentication emails, requests from the contact form, etc. The only thing you need to understand is that in your dev AWS account, you will need to create verified identities first to be able to send (I automated it via IaC), and in the production AWS account, you will need to request production access (which is just a couple of clicks).

Using AWS SES, email notifications are sent in the following scenarios:

  • When payment errors occur.
  • In case of spikes in web traffic.
  • If the CI/CD rollout fails.
  • For other situations, such as authentication emails and inquiries from the contact form, etc.
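As an illustration of the payment-error notification, here is a minimal sketch that builds a request in the shape that boto3's SES client `send_email` call accepts. The addresses and the `notify_payment_error` helper are assumptions for the example. The client is passed in, so the code can be exercised with a stand-in object instead of a real SES client.

```python
from typing import Any, Dict, List


def build_email_request(sender: str, recipients: List[str],
                        subject: str, body: str) -> Dict[str, Any]:
    """Build keyword arguments for an SES `send_email` call (plain-text body)."""
    return {
        "Source": sender,
        "Destination": {"ToAddresses": recipients},
        "Message": {
            "Subject": {"Data": subject, "Charset": "UTF-8"},
            "Body": {"Text": {"Data": body, "Charset": "UTF-8"}},
        },
    }


def notify_payment_error(ses_client: Any, customer_id: str, reason: str) -> Dict[str, Any]:
    """Send a payment-error alert to the owner (hypothetical addresses)."""
    request = build_email_request(
        sender="alerts@example.com",        # assumed verified SES identity
        recipients=["owner@example.com"],   # assumed recipient
        subject=f"Payment error for {customer_id}",
        body=f"Payment failed: {reason}",
    )
    ses_client.send_email(**request)
    return request
```

In the dev account, both the sender and the recipients would need to be verified identities while SES is still in sandbox mode.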

Storage

DynamoDB as a database. Easy, fast, and managed. Yes, I had to think about access patterns, but generally, it is good to work with and also does not cost me anything while I validate/build. Since I plan to work on several products and want to keep them isolated, I can't put RDS/DocumentDB in dev and prod accounts per project (it costs way too much).
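To make "thinking about access patterns" concrete, here is a small sketch of one common approach: composite keys chosen so that everything about a user lives in one partition. The key formats (`USER#...`, `SUB#...`) and the in-memory `query` stand-in are purely illustrative assumptions, not the schema used in the template.

```python
from typing import Dict, List, Tuple

Item = Dict[str, str]


def user_key(user_id: str) -> Tuple[str, str]:
    """Partition and sort key for a user's profile item (hypothetical format)."""
    return (f"USER#{user_id}", "PROFILE")


def subscription_key(user_id: str, subscription_id: str) -> Tuple[str, str]:
    """Subscriptions share the user's partition, so one query fetches them all."""
    return (f"USER#{user_id}", f"SUB#{subscription_id}")


def query(table: List[Item], pk: str, sk_prefix: str = "") -> List[Item]:
    """In-memory stand-in for a DynamoDB Query: one partition, optional
    sort-key prefix, results ordered by sort key."""
    matches = [item for item in table if item["pk"] == pk and item["sk"].startswith(sk_prefix)]
    return sorted(matches, key=lambda item: item["sk"])
```

With this layout, "get a user's profile" and "list a user's subscriptions" are both single-partition lookups, which is the kind of question that has to be answered up front with DynamoDB.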

Payments

I added two payment systems, and it is possible to choose which one to use because they work similarly:

  • Stripe is popular and easy to integrate, plain and simple. When a user buys a product, I use Stripe checkout, and for managing subscriptions, I use the Stripe Customer portal.
  • LemonSqueezy is very similar to Stripe, but it is also a Merchant of Record, meaning that it handles sales tax/VAT tax for you. It also has a checkout for buying a subscription and a Customer portal for managing them.

There are endpoints I wrote for the Stripe/LemonSqueezy webhooks, which handle all the logic.
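A webhook endpoint should check that a request really comes from the payment provider before acting on it. The sketch below shows the idea for Stripe's documented signature scheme: an HMAC-SHA256 over `timestamp.payload`, sent in the `Stripe-Signature` header as `t=...,v1=...`. It is written with the standard library for illustration only. The function name and the `now` parameter are my own, and in practice the official `stripe` library offers `stripe.Webhook.construct_event` for this.

```python
import hashlib
import hmac
import time
from typing import List, Optional


def verify_stripe_signature(payload: bytes, header: str, secret: str,
                            tolerance_seconds: int = 300,
                            now: Optional[float] = None) -> bool:
    """Return True if the header carries a valid, recent v1 signature for payload."""
    timestamp: Optional[str] = None
    signatures: List[str] = []
    # Header format: "t=<unix timestamp>,v1=<hex signature>[,v1=<hex signature>]"
    for part in header.split(","):
        key, _, value = part.strip().partition("=")
        if key == "t":
            timestamp = value
        elif key == "v1":
            signatures.append(value)
    if timestamp is None or not timestamp.isdigit() or not signatures:
        return False

    # Reject timestamps outside the tolerance window to limit replay attacks.
    current = time.time() if now is None else now
    if abs(current - int(timestamp)) > tolerance_seconds:
        return False

    signed_payload = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed_payload, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return any(hmac.compare_digest(expected.encode(), candidate.encode())
               for candidate in signatures)
```

The payload must be the raw request body exactly as received. Re-serializing parsed JSON would change the bytes and invalidate the signature.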

Infrastructure as Code

So there are a lot of things to choose from:

  • Something like Terraform or OpenTofu (fully open-source alternative which is based on Terraform)
  • Pulumi
  • CDK
  • CloudFormation

I chose AWS CDK, and here are my reasons:

  • It is easy to work with
  • It is popular and mature enough
  • It is way better than AWS CloudFormation, in my opinion
  • It is an AWS library, and I use AWS
  • I can write it in Python, TypeScript, or other languages. Since I use Python on the backend and TypeScript on the frontend, it is a good choice.

The reason I did not choose Terraform is that CDK is easier; it allows creating resources in a simple manner, at least in my opinion. I like OOP and try to construct my cloud infrastructure accordingly. A big benefit is that CI/CD is included (CDK pipelines), so I don't have to invent that.

CI/CD

I chose CDK pipelines because it is, again, easy. Just connect the pipeline to the GitHub repository, and you are good to go. Git push to the development branch -> it will be rolled out to the development account. Git push to the main (or pull request) -> production rollout.

Alarms and Rate Limits

I've set up Rate Limiting to prevent getting spammed through the API gateway. I've set up two CloudWatch alarms:

  • To alert me when the hosted website is getting spammed with requests.
  • To alert me when the API Gateway is getting spammed with requests.

I've also set up billing alarms to inform me if I am about to spend too much.
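For intuition about the rate limiting mentioned above: API Gateway throttling is based on a token bucket, with a steady-state rate and a burst limit. The class below is a simplified local model of that idea, not how the limits are configured in AWS. The `TokenBucket` name and its parameters are chosen for the example.

```python
class TokenBucket:
    """Simplified token bucket: `rate` tokens refill per second, up to `burst`."""

    def __init__(self, rate: float, burst: float, start: float = 0.0) -> None:
        self.rate = rate            # steady-state requests per second
        self.burst = burst          # maximum number of requests in a short burst
        self.tokens = burst         # the bucket starts full
        self.last_refill = start

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` (seconds) is admitted."""
        elapsed = max(0.0, now - self.last_refill)
        # Refill in proportion to elapsed time, never beyond the bucket capacity.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False                # the caller would answer with HTTP 429
```

For example, with `TokenBucket(rate=2, burst=3)` a client can send three requests at once, after which it is admitted at two requests per second.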

Logging

CloudWatch logs the events. You can see them both in the AWS Console and directly in the IDE via extensions.

AI

The choice was between using either OpenAI (with GPT models) or AWS Bedrock (with Claude models). This decision was challenging because, while AWS Bedrock with Claude integrates easily with AWS, OpenAI is more commonly used. Both companies offer top-tier AI models. For now, I have chosen to stick with AWS Bedrock. This might change in the future, but for now, I appreciate the simplicity. For the vector database, I use Pinecone, which has serverless indexes.

An example of the AI application I built here is a RAG system, which is essentially a chatbot that can answer questions based on your data. You store information in a vector database, and on the query, you do a similarity search, and then just use LLM to give an answer based on the result of that search. I currently use simple models to avoid costs, but switching to different models is as simple as changing a line of code.
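The retrieval step of such a RAG system can be sketched in a few lines. This toy version keeps the "index" in a dictionary and uses hand-made three-dimensional vectors. In the real setup the vectors would come from an embedding model, the search would run in the vector database, and the prompt would be sent to the LLM. All names and data here are illustrative assumptions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vector: Sequence[float], index: Dict[str, Sequence[float]],
          k: int = 2) -> List[Tuple[str, float]]:
    """Return the k passages whose vectors are most similar to the query."""
    scored = [(text, cosine_similarity(query_vector, vector)) for text, vector in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


def build_prompt(question: str, passages: List[str]) -> str:
    """Combine the retrieved passages and the question into one LLM prompt."""
    context = "\n".join(f"- {passage}" for passage in passages)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"


# Toy index: passage text -> made-up embedding vector.
INDEX = {
    "Refunds are processed within 5 days.": [0.9, 0.1, 0.0],
    "The API is rate limited.": [0.0, 1.0, 0.0],
    "Invoices are emailed monthly.": [0.7, 0.0, 0.7],
}

hits = top_k([1.0, 0.0, 0.0], INDEX, k=2)
prompt = build_prompt("How long do refunds take?", [text for text, _ in hits])
```

The model only sees the passages that the search returned, which is why answer quality depends heavily on the retrieval step.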

Programming Languages

I was initially a Java developer, but then became a Python developer because I developed machine learning and deep learning services. Most libraries in that space are developed in Python or feature a Python wrapper. Besides, Python integrates seamlessly with AWS, whether in AWS Lambda (e.g., using the AWS Lambda Powertools library) or in CDK. So in the end, both the backend and cloud infrastructure (via CDK) are implemented in Python.

My secondary language is TypeScript due to its popularity with frontend frameworks. While I used to work with JavaScript, I found the absence of types to be confusing as the codebase grew larger. TypeScript’s static typing provides much-needed clarity and safety during development, especially in large projects.

AWS Bills

Since I don’t have a high traffic load, my AWS costs are very low, typically $3-5 per month, primarily due to CI/CD expenses.

The setup includes a CDN (provided by Amplify Hosting) and a small caching layer within AWS Lambda. Additionally, some services fall under the AWS Free Tier, which further reduces my costs.
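A small caching layer inside Lambda typically relies on the fact that module-level state survives between invocations while the execution environment stays warm. The following is a minimal sketch of that pattern with a time-to-live. The `get_cached` helper and its parameters are assumptions for illustration, not the template's code.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

# Module-level dictionary: it is created once per execution environment and
# reused by later invocations that land on the same warm environment.
_CACHE: Dict[str, Tuple[float, Any]] = {}


def get_cached(key: str, loader: Callable[[], Any], ttl_seconds: float = 60.0,
               now: Optional[float] = None) -> Any:
    """Return the cached value for key, calling loader() when missing or expired."""
    current = time.monotonic() if now is None else now
    entry = _CACHE.get(key)
    if entry is not None and current < entry[0]:
        return entry[1]                          # still fresh: skip the expensive call
    value = loader()                             # e.g. a database read or API call
    _CACHE[key] = (current + ttl_seconds, value)
    return value
```

Each warm environment has its own copy of the cache, so this only suits data that may be slightly stale, such as configuration or pricing information.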

As the product scales and gains more users, I might need to optimize resources by switching to provisioned DynamoDB and implementing DAX (DynamoDB Accelerator). However, for now, this setup works perfectly.

Conclusion

This solution meets my current needs efficiently.

I’ve included this entire tech stack as a boilerplate (which I actively develop and update) in my AWS template on SaaSConstruct.

I will continue exploring additional features that can be incorporated into this setup to enhance its capabilities...
