My AWS SQS Requests Skyrocketed to 1 Million at Month's Start. This is How I Implemented a Cost-Effective Solution.
Ganesh Kumar
Posted on August 4, 2024
We started using Amazon SQS for push notifications, triggering emails, and similar tasks, but the way it handles polling led to a big increase in usage. To address this and save on cloud costs, we switched to Redis. Here's our story.
At Hexmos, we are building innovative products like Feedback By Hexmos and FeedZap.
To keep our systems efficient and adaptable, we embraced a microservices architecture. This means we built separate, smaller services that work together seamlessly.
One of these services is a unified Identity Service, which manages user accounts and payments across all our products.
But behind the scenes, ensuring smooth communication between different parts of our system can be a challenge.
However, coordinating communication between these services, especially for time-sensitive tasks like sending emails, requires a robust and reliable solution.
This article explores how we overcame the limitations of our initial approach and discovered a powerful solution with Redis Streams.
We'll delve into our challenges, why traditional message queuing systems weren't the perfect fit, and how Redis Streams helped us build a resilient notification system that keeps users informed and engaged.
Our Existing Microservices Architecture
Building Scalable Systems with Microservices
Our current architecture involves two services:

- Leave Request API:
  - Processes leave requests.
  - Creates a message containing relevant data.
  - Pushes the message to the message queue.
- Email API:
  - Processes received messages by sending appropriate emails.
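The message that travels between the two services can be a small JSON document. Here's a minimal sketch of what the Leave Request API might serialize; the field names are illustrative, not our exact production schema:

```python
import json


def build_leave_request_message(user_email: str, leave_type: str, days: int) -> str:
    """Serialize a leave request into a JSON message for the queue.

    Field names here are illustrative placeholders, not a real schema.
    """
    payload = {
        "event": "leave_requested",
        "user_email": user_email,
        "leave_type": leave_type,
        "duration_days": days,
    }
    return json.dumps(payload)
```

The Email API only needs to agree on this contract, not on anything else about the producer, which is what keeps the two services loosely coupled.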
Direct communication between these services can cause problems:
- Tight Coupling: Changes in one service impact the other, hindering independent development.
- Data Loss: If the Email API is down when an event (such as a signup or leave request) occurs, the notification might be lost, and customers might not receive verification emails or leave notifications.
The Identity Service: A Foundation for User Experience
Why Two Services?
We split responsibilities across two dedicated backends:
- Backend dedicated to the product: handles the product's core functionality.
- Backend dedicated to emails and notifications: manages all email and notification-related tasks.
This separation allows each service to be developed, deployed, and scaled independently, increasing the system's resilience and maintainability.
AWS SQS: A Step Towards Decoupling
We integrated AWS SQS as a connection between these services:
- Leave Request API:
  - Processes leave requests.
  - Creates a message (e.g., a JSON object) containing relevant data (user details, leave type, duration, etc.).
  - Pushes the message to the message queue.
- Message Queue (SQS):
  - Stores the message until it's processed.
  - Provides reliability and durability guarantees.
  - Can handle varying message rates, ensuring system scalability.
- Email API:
  - Continuously polls the message queue for new messages.
  - Processes received messages by sending appropriate emails.
If the consumer backend (the Email API) is down, messages are queued rather than lost. Once it is back online, it processes the queued messages and sends email notifications using the data stored in SQS.
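With boto3 (the AWS SDK for Python), the two sides of this setup might look like the following sketch. The queue URL and payload fields are placeholders, and the network calls require AWS credentials:

```python
import json

# Placeholder queue URL; substitute your own account and queue name.
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/email-events"


def handle_message(body: str) -> dict:
    """Parse a queued message body; the Email API would send mail from this."""
    return json.loads(body)


def main() -> None:
    import boto3  # imported lazily; requires AWS credentials to run

    sqs = boto3.client("sqs")

    # Producer side (Leave Request API): push the event.
    sqs.send_message(
        QueueUrl=QUEUE_URL,
        MessageBody=json.dumps({"event": "leave_requested", "duration_days": 2}),
    )

    # Consumer side (Email API): poll, process, then delete.
    resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=10)
    for msg in resp.get("Messages", []):
        handle_message(msg["Body"])
        # Deleting only after successful processing gives at-least-once delivery.
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])


if __name__ == "__main__":
    main()
```

Note that SQS does not delete a message on receive; the consumer must delete it explicitly after processing, which is what makes the queue safe when the Email API crashes mid-task.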
Benefits of Using a Message Queue
- Decoupling: Services become independent, improving maintainability and scalability.
- Reliability: Messages are persisted in the queue, ensuring delivery even if the Email API is temporarily unavailable.
- Performance: The Leave Request API can process requests faster without waiting for email delivery.
- Scalability: Each service can be scaled independently to handle increasing load.
- Error Handling: Implement retry mechanisms and dead-letter queues to handle failed message processing.
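For the error-handling point above, SQS implements dead-letter queues through a redrive policy attached to the main queue: after a message has been received more than `maxReceiveCount` times without being deleted, SQS moves it to the DLQ. A hedged sketch (the queue name, ARN, and count are placeholders):

```python
import json


def build_redrive_policy(dlq_arn: str, max_receives: int = 5) -> str:
    """Build the RedrivePolicy queue attribute: after max_receives failed
    receives, SQS moves the message to the dead-letter queue."""
    return json.dumps(
        {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": max_receives}
    )


def main() -> None:
    import boto3  # imported lazily; requires AWS credentials to run

    sqs = boto3.client("sqs")
    sqs.create_queue(
        QueueName="email-events",  # placeholder name
        Attributes={
            "RedrivePolicy": build_redrive_policy(
                "arn:aws:sqs:us-east-1:123456789012:email-events-dlq"
            )
        },
    )


if __name__ == "__main__":
    main()
```

Messages that land in the DLQ can then be inspected or replayed without blocking the main queue.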
AWS SQS Challenges and Finding an Alternative
As we integrated SQS into 7 different services, we began depleting our usage limits due to the free tier's restrictions.
AWS SQS Free Tier Limitations
While AWS SQS offers a convenient solution for message queuing, it presents some limitations for our specific needs:
Free Tier Limitations: The SQS free tier caps usage at 1 million requests per month, hindering the scalability of our growing application.
Hidden Costs: Exceeding the free tier results in significant cost increases, potentially impacting our budget.
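To see how quickly continuous polling burns through a 1-million-request monthly allowance, a rough back-of-the-envelope model helps. The polling intervals below are illustrative, not measured from our services:

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # 2,592,000 seconds in a 30-day month


def monthly_poll_requests(num_services: int, poll_interval_s: float) -> int:
    """ReceiveMessage calls per month if every service polls continuously
    at a fixed interval (a simplified, illustrative model)."""
    return int(num_services * SECONDS_PER_MONTH / poll_interval_s)


# A single service short-polling every 2.5 seconds exceeds a
# 1M-request monthly allowance on its own:
print(monthly_poll_requests(1, 2.5))  # 1,036,800

# Seven services polling once per second:
print(monthly_poll_requests(7, 1.0))  # 18,144,000
```

The key point is that with short polling, the bill tracks how often you *ask*, not how many messages actually flow, so idle services still generate millions of requests.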
Finding a Temporary Fix
Amazon SQS offers Short and Long polling options for receiving messages from a queue.
Short polling (default) – The ReceiveMessage request queries a subset of servers (based on a weighted random distribution) to find available messages and sends an immediate response, even if no messages are found.
Long polling – ReceiveMessage queries all servers for messages, sending a response once at least one message is available, up to the specified maximum. An empty response is sent only if the polling wait time expires. This option can reduce the number of empty responses and potentially lower costs.
As a temporary measure, we chose long polling. But this didn't solve our problem.
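Enabling long polling is a one-parameter change on `ReceiveMessage` (`WaitTimeSeconds`, up to a maximum of 20); it can also be set queue-wide via the `ReceiveMessageWaitTimeSeconds` attribute. A sketch of a long-polling consumer, plus the idle-traffic arithmetic that shows why it only softened the problem (service count and intervals are illustrative):

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # 2,592,000


def idle_requests_per_month(wait_seconds: int, num_services: int = 7) -> int:
    """Empty ReceiveMessage calls per month for consumers that re-poll
    immediately after each long-poll wait expires (illustrative model)."""
    return int(num_services * SECONDS_PER_MONTH / wait_seconds)


def poll_long(queue_url: str) -> list:
    """Long-poll SQS: the call blocks up to 20 s waiting for a message,
    so an idle consumer makes ~3 requests per minute instead of one per loop."""
    import boto3  # imported lazily; requires AWS credentials to run

    sqs = boto3.client("sqs")
    resp = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,  # 20 is the maximum long-poll wait
    )
    return resp.get("Messages", [])


# Even with maximum long polling, 7 always-on consumers still make
# ~907,200 idle requests a month — most of a 1M-request allowance
# before a single real message is processed:
print(idle_requests_per_month(20))  # 907,200
```

Long polling cuts empty responses dramatically, but always-on consumers across several services still generate a steady baseline of requests, which is why it was only a temporary fix for us.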
Why Pub/Sub Wasn't the Perfect Fit
We explored alternative message queuing solutions, including Pub/Sub. However, Pub/Sub wasn't a suitable choice due to specific requirements:
Pros of Pub/Sub
Scalability: Pub/Sub systems are designed to handle a large number of publishers and subscribers efficiently.
Decoupling: It promotes loose coupling between systems, as publishers and subscribers don't need to know about each other.
Flexibility: Pub/Sub can be used for various messaging patterns, including publish-subscribe, fan-out, and request-reply.
Cons of Pub/Sub
Message Loss: Pub/Sub systems typically don't guarantee message delivery, especially if subscribers are offline or unable to process messages.
Message Ordering: Message order is not guaranteed in most Pub/Sub systems, which can be a limitation for certain applications.
Complexity: Managing subscribers and handling message delivery can be complex, especially for large-scale systems.
Latency: While Pub/Sub is generally fast, it might not be suitable for applications with strict real-time requirements.
The search for a viable alternative led us to Redis Streams, a powerful data structure within the Redis database.
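As a preview of where we landed: Redis Streams gives us a persistent, append-only log with consumer groups, so the Email API can read pending entries and acknowledge each one explicitly. A minimal sketch using redis-py; the stream, group, and consumer names are illustrative, and the network calls need a running Redis server:

```python
def to_stream_fields(event: dict) -> dict:
    """Redis stream entries are flat field-to-string maps, so stringify values."""
    return {k: str(v) for k, v in event.items()}


def main() -> None:
    import redis  # redis-py; requires a running Redis server

    r = redis.Redis()

    # Producer (Leave Request API): append the event to the stream.
    r.xadd("notifications", to_stream_fields({"event": "leave_requested", "days": 2}))

    # Consumer (Email API): read via a consumer group, then acknowledge.
    try:
        r.xgroup_create("notifications", "email-api", id="0", mkstream=True)
    except redis.ResponseError:
        pass  # group already exists

    entries = r.xreadgroup("email-api", "worker-1", {"notifications": ">"}, count=10)
    for _stream, messages in entries:
        for msg_id, fields in messages:
            # ... send the email here ...
            # Ack only after processing so unacked entries can be re-delivered.
            r.xack("notifications", "email-api", msg_id)


if __name__ == "__main__":
    main()
```

Unacknowledged entries stay in the group's pending list, which is what gives us SQS-like at-least-once delivery without per-request billing.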