Architecting a Real-Time Scheduling Integration with DynamoDB
Shane Jarman
Posted on September 30, 2021
The Problem
At BetterHealthcare, we are focused on building the best digital front door in healthcare. To do that, we needed to develop an integrated scheduling product that accurately displayed provider availability in real-time.
Considerations & Thought Process
Our solution needed to be scalable, event-driven, and able to support multiple EMRs (each of which could have calendar data formatted in slightly different ways). The best way to brainstorm a solution to a big problem like this is to break it down into a few smaller problems.
In this case, we discussed the following questions:
- What will the scheduling data look like? How will we initially process the scheduling data when we receive it?
- Once data from a specific EMR has entered the system, how will we transform it into our standard availability model?
- After the data is transformed into a standard format, how will we store it?
After answering those questions and considering the results in the context of our broader platform and architecture, we came up with the following solution.
The BetterScheduling Solution
We decided that the best move was to leverage AWS's managed services to build our availability data flow using AWS Lambda, DynamoDB, and Kinesis streams.
Step 1 - Processing EMR Data Feeds
Each scheduling event that we receive contains just a single piece of the availability puzzle. It could be an appointment cancellation, an updated vacation day, or a change to the 'working hours' of a provider. Scheduling events can arrive in rapid bursts (e.g. when the integration is activated for a new customer) or one at a time as provider schedules update throughout the day.
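To make that concrete, here is a rough sketch of what one raw scheduling event might look like. The field names and the appointment_cancelled event type are purely illustrative, since every EMR formats its calendar data a little differently:

```python
# Hypothetical example of a raw scheduling event from an EMR partner.
# Field names and values are illustrative only; each EMR sends its
# calendar data in a slightly different shape.
example_scheduling_event = {
    "emr": "example-emr",                  # which EMR integration sent the event
    "event_type": "appointment_cancelled", # could also be vacation_updated, working_hours_changed, ...
    "provider_id": "prov-123",
    "item_id": "appt-456",
    "start": "2021-10-04T15:00:00Z",
    "end": "2021-10-04T15:30:00Z",
    "received_at": "2021-09-30T12:00:00Z",
}
```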
Because the quantity of events we are receiving at any given time can be highly variable, we decided to publish each one to an AWS Kinesis Stream when it is received. The scheduling-event-stream then has an AWS Lambda consumer that creates or updates the corresponding scheduling item in the scheduling-items DynamoDB table.
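As a minimal sketch of that consumer (the emr_id / item_id key schema is an assumption for illustration, not the actual table design), the Lambda just decodes each Kinesis record and upserts the matching scheduling item:

```python
import base64
import json

import boto3

# Assumed key schema for the scheduling-items table (illustrative only).
TABLE = boto3.resource("dynamodb").Table("scheduling-items")


def handler(event, context):
    """Lambda consumer for the scheduling-event-stream Kinesis stream."""
    for record in event["Records"]:
        # Kinesis delivers the payload base64-encoded.
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))

        # Upsert the scheduling item so that a retried or duplicated event
        # simply overwrites the previous version of the same item.
        TABLE.put_item(
            Item={
                "emr_id": payload["emr"],        # partition key (assumed)
                "item_id": payload["item_id"],   # sort key (assumed)
                "type": payload["event_type"],
                "payload": json.dumps(payload),
            }
        )
```

Because the write is an idempotent upsert, reprocessing the same stream record after a retry is harmless.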
The benefit of using the scheduling-event-stream is that if the data feed from an EMR partner exceeds our write capacity, we can hold the data in the stream temporarily and retry processing the events! If our API instead simply failed the request, we would force our partners to "try again" and resend the data, which would be less reliable and lead to a lot of errors.
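On the intake side, the handler that receives an EMR event only has to put it on the stream before acknowledging the partner, so the API never has to reject requests under load. A rough sketch (the function name and the choice of provider_id as partition key are assumptions, not the actual BetterHealthcare API):

```python
import json

import boto3

kinesis = boto3.client("kinesis")


def receive_emr_event(raw_event: dict) -> None:
    """Publish an incoming EMR scheduling event to the stream.

    Partitioning by provider keeps a single provider's updates in order;
    that partition key choice is an assumption made for this sketch.
    """
    kinesis.put_record(
        StreamName="scheduling-event-stream",
        Data=json.dumps(raw_event).encode("utf-8"),
        PartitionKey=raw_event.get("provider_id", "unknown"),
    )
```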
Our Lambda, Kinesis, and Dynamo event intake architecture allows us to have the benefits of 'real-time' event processing without the risk of overloading the system when volume increases.
Step 2 - Transforming Scheduling Items
OK - so we now have a table full of scheduling items updated in real-time. Great! But these items are formatted differently for each EMR, and "scheduling items" are not "available slots". We still have more work to do to get to the "real-time availabilities" that we need.
We solved this data 'transformation' issue using DynamoDB Streams. Using pattern-matching, we are able to match specific scheduling item updates to a corresponding "EMR Data Transform" Lambda. When a scheduling item is modified, the appropriate Lambda immediately calculates new, properly standardized availability for whichever providers and dates were impacted.
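In production each transform is its own Lambda, but a single dispatcher conveys the idea. In this sketch the EMR names, the transform functions, and the assumption that the EMR identifier lives in the stream record's new image are all illustrative:

```python
# Sketch of a DynamoDB Streams consumer that routes scheduling item
# changes to an EMR-specific transform. Requires the stream to include
# new images (NEW_IMAGE or NEW_AND_OLD_IMAGES).

def transform_example_emr_a(image: dict) -> list[dict]:
    """Turn one EMR-A scheduling item into standardized available spans."""
    ...  # EMR-specific availability calculation


def transform_example_emr_b(image: dict) -> list[dict]:
    ...


TRANSFORMS = {
    "example-emr-a": transform_example_emr_a,
    "example-emr-b": transform_example_emr_b,
}


def handler(event, context):
    """Consume the scheduling-items DynamoDB stream."""
    for record in event["Records"]:
        if record["eventName"] == "REMOVE":
            continue  # deletions are out of scope for this sketch

        image = record["dynamodb"]["NewImage"]
        emr = image["emr_id"]["S"]  # stream images use DynamoDB's typed attribute format

        transform = TRANSFORMS.get(emr)
        if transform:
            spans = transform(image)
            # ...write the standardized spans (see Step 3)
```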
Our 'availability data' is now ready to be stored!
Step 3 - Store Standardized Available Spans
Now we have a day (or multiple days) of standardized available spans ready to be saved. But how should we store them?
We decided to save the final, standardized data in DynamoDB. There are pros and cons to using a key-value NoSQL datastore like Dynamo, but in our case the benefits clearly outweighed the costs.
Our access patterns were clearly defined, the Dynamo API works splendidly with AWS Lambda, and we would not need to manage any connections (as we would with a SQL solution). As we add additional EMR partners and integrated customers, Dynamo allows us to scale our reads and writes horizontally while "paying as we go" to handle bursts of onboarding throughout the day.
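As an example of what that storage step could look like, writing a provider-day of standardized spans reduces to a batch write. The provider_id / date#start key schema here is an assumption; the post does not describe the actual keys:

```python
import boto3

# Assumed key schema: partition key = provider_id, sort key = date#start.
SPANS_TABLE = boto3.resource("dynamodb").Table("available-spans")


def save_available_spans(provider_id: str, date: str, spans: list[dict]) -> None:
    """Persist one provider-day of standardized available spans."""
    with SPANS_TABLE.batch_writer() as batch:
        for span in spans:
            batch.put_item(
                Item={
                    "provider_id": provider_id,
                    "date_start": f"{date}#{span['start']}",  # sort key (assumed)
                    "start": span["start"],
                    "end": span["end"],
                    "location_id": span.get("location_id"),
                }
            )
```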
So - our real-time availabilities are saved in our available-spans DynamoDB table, which our clients are able to query through our GraphQL API.
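Under the same assumed key schema, the lookup a GraphQL resolver might perform for a provider and date range is a single DynamoDB query, for example:

```python
import boto3
from boto3.dynamodb.conditions import Key

SPANS_TABLE = boto3.resource("dynamodb").Table("available-spans")


def get_available_spans(provider_id: str, start_date: str, end_date: str) -> list[dict]:
    """Sketch of the availability lookup behind the GraphQL API.

    Assumes the provider_id / date#start key schema from the previous sketch.
    """
    response = SPANS_TABLE.query(
        KeyConditionExpression=(
            Key("provider_id").eq(provider_id)
            & Key("date_start").between(start_date, f"{end_date}#~")  # '~' sorts after any time suffix
        )
    )
    return response["Items"]
```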
NOTE: In the final architecture diagram above, you'll notice that there is an additional Lambda connected to our available-spans DynamoDB table. That is the final piece of a slightly shorter 'pre-calculated availability' data flow. It is there to show an example of additional data flows that we support, but it is not required for the provided example to function.
Conclusion
We hope that you enjoyed this high-level overview of our scheduling infrastructure. We plan to publish more in-depth posts on each of the three steps above over the next couple of weeks, so stay tuned!