RFC: A Full-stack Analytics Platform Architecture
Hayden McParlane
Posted on June 2, 2022
TL;DR
Predecos is in its proof-of-concept (PoC) stage. Most of my focus has been on getting the codebase set up for high-reliability, high-velocity, and efficient development. System technologies include Lit and Auth0 on the front-end communicating with an Apollo GraphQL back-end. Apollo GraphQL tethers together the various microservices, which communicate quickly with each other using Apache Kafka through KafkaJS. Automation is leveraged to increase code quality, with tools such as Husky, Coveralls, Snyk, SonarCloud, CodeClimate and CircleCI, all of which are free for open-source projects. Operations are improved by the powerful but somewhat complex orchestration tool Kubernetes; Helm is used to simplify interactions with Kubernetes, and Docker is used for image creation. Business logic is performed using a command-line interface (CLI) which communicates with the invoking microservice using a Kafka event. The microservice in that scenario provides authorization, management and storage logic while producing events of its own for consumption by other microservices. One question that remains is how a library like TensorFlow will perform when executed inside a Kubernetes job or cron job.
Introduction
When starting a new project, there are many approaches to consider to ensure that the project is maintainable both in terms of code quality as well as personnel. How do you keep the codebase affordable? What technologies should be adopted to attain project goals? How do you enforce coding standards? How do you keep faulty code from contaminating the main branch? How best to ensure that the system scales well as user numbers increase? In order to create a quality product, these, and other, questions must be addressed. The architecture of Predecos (Predictive Economic System) serves as one solution to the problem of creating a full-stack analytics platform, and I’m interested in collaborating with others in order to improve its design. Some considerations that I view as high return on investment (ROI) include maintaining simplicity; following sound architectural principles; using automation whenever possible; using heavily vetted, and open-source security tools; favoring long-term support (LTS) versions, and keeping things affordable.
Maintaining Simplicity
A complex codebase needs to be kept as simple as possible. In Predecos, one way this is done is through use of full-stack JavaScript, meaning developers only need experience with one language to maintain the codebase. Some differences exist between NodeJS and browser-based JavaScript, but those are minor compared with having to work in multiple languages. Front-end and back-end dependencies are also kept as uniform as possible. For example, Mocha and Chai are used on both the front-end and back-end, meaning experience with those tools enables development in either. Developer setup and deployment should also be straightforward. Predecos uses Helm v3.8.2 to trigger one-click deployment of all the microservices. After doing this and setting a couple of environment variables, one can easily start the front-end by running
yarn run start
and they are ready for development. This simplicity is also valuable when doing a deployment. When CircleCI went down, the impact was slight because automated deployment and manual deployment were very similar. All that was required was adapting the commands from the CircleCI config for the terminal, building the necessary Docker images manually, pushing those images and running helm upgrade. It was a simple and fast mitigation that saved a lot of time (a sketch of these manual steps appears at the end of this section).

Diverging from simplicity was sometimes necessary, though. Some may argue that use of Kubernetes makes the codebase difficult to understand by virtue of its steep learning curve. However, the stability, reliability, documentation and widespread adoption of Kubernetes made it hard to resist. Google Cloud Platform (GCP), Amazon Web Services (AWS), Microsoft Azure and DigitalOcean all have Kubernetes offerings, which means applications developed on Kubernetes are largely portable from one cloud provider to another. Predecos uses DigitalOcean’s managed Kubernetes product because it is low-cost compared to the other options. Should the desire for features such as encryption at rest arise, one option would be to migrate to Google Kubernetes Engine (GKE). Some modifications would be necessary, but because Predecos is in a proof-of-concept (PoC) state and minimal data has been stored, that migration would likely be simple.
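To make the manual-deployment scenario above concrete, the steps might look roughly like the following. The registry, image name, chart path and value key are hypothetical placeholders, not the actual Predecos configuration:

# Build and push an updated image for one microservice (names are placeholders).
docker build -t registry.example.com/predecos/some-service:1.2.3 .
docker push registry.example.com/predecos/some-service:1.2.3

# Roll the cluster forward to the new image using the existing Helm chart.
helm upgrade --install predecos ./helm/predecos --set someService.image.tag=1.2.3

Because the same images and chart drive both the CI pipeline and this manual path, the two stay in sync with very little extra effort.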
Sound Architectural Principles
Leveraging a particular architecture has benefits and drawbacks. One may say a monolithic architecture is more suited to this project because of its small scale, but the goal is to enable the system to scale as needed, and microservices provide more flexibility to do so: when scaling, only what’s needed is scaled. Sound architectural design also drove adoption of Apollo Federation with GraphQL. Federation is employed to tether the microservices together so that a unified interface is provided despite multiple underlying services. Each microservice is responsible for its own interface. Thus, changes in one microservice have no impact on the functionality of another, and separation of concerns is better kept.

GraphQL also provides single-trip data fetching, which is very powerful in terms of optimization when compared with REST. Application performance is frequently bottlenecked by network round-trip time (RTT); if an application can gather as much data as possible in one round-trip, that can greatly improve performance. GraphQL products also frequently come with a developer interface providing API-specific auto-complete, allowing developers to quickly test changes. This can greatly increase developer velocity.

Performance is also increased by use of Lit on the front-end. Lit continues the Polymer tradition of using the platform. This speeds up front-end component performance, but it has some drawbacks. Front-end frameworks like Angular and React are heavily used, making them easier to staff for. However, the purpose of a framework is to make up for capabilities the browser lacks. From an engineering perspective the browser should provide the capabilities that React and Angular provide, so that developers don’t need to rely on frameworks to get the job done. This is why Predecos uses Lit. Lit is designed to sit closer to the browser layer, meaning it leverages native browser features for creating custom elements that are framework agnostic. Eventually the goal is to move to raw web components, which are now largely supported by browsers. It’s my belief that this will be the future of front-end development, and so it was adopted in Predecos.
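To give a feel for how close Lit stays to the platform, here is a minimal, hypothetical component sketch; the element name and property are illustrative and not taken from the Predecos codebase:

import { LitElement, html, css } from 'lit';

// A small custom element; Lit registers it with the browser's native
// customElements registry, so it works with or without a framework.
class BusinessSummary extends LitElement {
  static properties = {
    // Reactive property; changing it re-renders the component.
    businessName: { type: String },
  };

  static styles = css`
    :host { display: block; font-family: sans-serif; }
  `;

  render() {
    return html`<h2>${this.businessName ?? 'Unknown business'}</h2>`;
  }
}

customElements.define('business-summary', BusinessSummary);

// Usage in markup: <business-summary businessname="Example Co"></business-summary>

Because the element is registered with the browser’s own custom-elements registry, it can be dropped into any page, which is exactly the framework-agnostic property described above.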
There is also something to be said about heavily vetted dependencies. The Predecos back-end relies upon Kubernetes and Apache Kafka, both of which are widely trusted. Kubernetes originates from Borg, a system created by Google to manage containers at scale, and with that background comes valuable experience: Borg was used at scale for years before Kubernetes was introduced. Kubernetes is now trusted by many cloud providers, and it provides a layer of abstraction between the application and the underlying cloud (e.g., GCP, AWS, DigitalOcean, Azure). There are other factors that tie you to a provider, such as storage, but the dependence is reduced because Kubernetes translates operations to those required by a specific cloud using what’s called the cloud-controller-manager. Kafka is similarly battle-tested, making it very reliable. It’s used in Predecos as an event streaming system. It enables very valuable subscriptions, providing real-time data capabilities as well as efficient communication between microservices.
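As a rough sketch of how a microservice might consume such events with KafkaJS, consider the following; the broker address, topic name, group id and handler are hypothetical placeholders rather than the actual Predecos configuration:

const { Kafka } = require('kafkajs');

const kafka = new Kafka({
  clientId: 'collection-service',      // hypothetical client id
  brokers: ['kafka:9092'],             // hypothetical broker address
});

const consumer = kafka.consumer({ groupId: 'collection-service-group' });

async function run() {
  await consumer.connect();
  await consumer.subscribe({ topic: 'tweets-fetched', fromBeginning: false });

  // Each message is handled as it arrives, enabling near real-time processing.
  await consumer.run({
    eachMessage: async ({ topic, message }) => {
      const event = JSON.parse(message.value.toString());
      // Store the raw data, trigger downstream events, etc.
      console.log(`Received ${event.tweets?.length ?? 0} tweets from ${topic}`);
    },
  });
}

run().catch(console.error);

The producing side of this flow is sketched in the Business Logic section below.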
Automation
Ideally, software can move quickly from development to production. Continuous deployment and continuous delivery are practices that make this possible: continuous deployment means establishing an automated pipeline from development to production, while continuous delivery means maintaining the main branch in a deployable state so that a deployment can be requested at any time. Predecos uses both. When a commit goes into master, the code is pushed directly to the public environment. Deployment also occurs when a push is made to a development branch, enabling local/e2e testing before pushing to master. In this manner the master branch can be kept clean and ready for deployment most of the time, and problems resulting from changes are visible before they reach master.

Additional automated tools are used. Docker images are built for each microservice on commit to a development or master branch, static code analysis is performed by SonarCloud revealing quality and security problems, Snyk provides vulnerability analysis, CodeClimate provides feedback on code quality, Coveralls tracks test coverage, and finally a CircleCI build is run. Each of these tools exposes a badge, giving a heads-up display of the health of the system being developed. Incorporating each of these tools into the development process keeps the code on a trajectory of stability. For example, code smells, security vulnerabilities and broken tests are eliminated before a pull request (PR) is merged into master. Husky runs on development machines to ensure that code is well linted and locally tested before it is allowed to be pushed to source-control management (SCM). Additional practices, such as writing tests around bugs, mean that reintroducing a given bug causes a test to fail, which the automated tools then require to be fixed before the change reaches SCM, so fewer bugs are reintroduced. Proper development processes and automation have a strong synergy.
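As one illustration of the Husky step, a pre-push hook along these lines could live at .husky/pre-push, assuming Husky’s file-based hooks (v7+) and hypothetical yarn lint and test scripts:

#!/usr/bin/env sh
. "$(dirname -- "$0")/_/husky.sh"

# Block the push unless linting and the local test suite both pass.
yarn lint && yarn test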
Security Tools
Security is always an important part of development. Lacking deep security expertise myself, my goal was to use a quality third-party product that works well. Balancing cost against available features, I decided to use Auth0, which provides many features at a low, predictable price.
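As a rough sketch of how Auth0-issued access tokens could be verified on the back-end (for example, before GraphQL resolvers run), consider the following; the tenant domain and audience are placeholders and the helper itself is illustrative rather than code from Predecos:

const jwt = require('jsonwebtoken');
const jwksClient = require('jwks-rsa');

// Auth0 publishes its signing keys at a well-known JWKS endpoint.
const client = jwksClient({
  jwksUri: 'https://YOUR_TENANT.auth0.com/.well-known/jwks.json',
});

// Look up the public key matching the key id (kid) in the token header.
function getKey(header, callback) {
  client.getSigningKey(header.kid, (err, key) => {
    if (err) return callback(err);
    callback(null, key.getPublicKey());
  });
}

// Resolve with the decoded claims if the token is valid, reject otherwise.
function verifyAccessToken(token) {
  return new Promise((resolve, reject) => {
    jwt.verify(
      token,
      getKey,
      {
        audience: 'https://api.example.com',      // hypothetical API identifier
        issuer: 'https://YOUR_TENANT.auth0.com/',
        algorithms: ['RS256'],
      },
      (err, decoded) => (err ? reject(err) : resolve(decoded))
    );
  });
}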
Business Logic
The primary purpose of this application is to provide a full-stack analytics solution. Currently, the drive is to run application business logic using Kubernetes jobs and cron jobs: jobs for immediately invoked operations, while cron jobs handle scheduled tasks. Scheduled tasks combined with subscriptions enable real-time data collection for the end user about a business of interest. An example of such an operation is data fetching. Currently, a simple fetch-tweets operation is implemented as a PoC of what the system can do. It’s a simple NodeJS command-line interface (CLI). Once data is fetched, a Kafka event is triggered passing that data to the collection microservice, which stores it in raw form. A similar approach will be used with the analytics and cleaning microservice. Specifically, the analytics microservice will trigger a job or cron job that performs the actual analytics using a library such as TensorFlow. Once analysis has been performed, a Kafka event will be triggered passing the data back to the analytics microservice, which will store the results for front-end consumption. One question that remains is how a library like TensorFlow will perform when used inside a container.
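To make the CLI-to-microservice flow concrete, here is a minimal sketch of the producing side, assuming KafkaJS, a hypothetical topic name and a stubbed-out fetch step; none of these names are taken from the Predecos codebase, and the topic matches the consumer sketch shown earlier only for illustration:

const { Kafka } = require('kafkajs');

const kafka = new Kafka({
  clientId: 'fetch-tweets-cli',    // hypothetical client id
  brokers: ['kafka:9092'],         // hypothetical broker address
});

// Stand-in for the real data-fetching step performed by the CLI.
async function fetchTweets(businessName) {
  return [{ text: `Example tweet about ${businessName}` }];
}

async function main() {
  const businessName = process.argv[2] || 'example-business';
  const tweets = await fetchTweets(businessName);

  // Hand the raw results to the collection microservice via a Kafka event.
  const producer = kafka.producer();
  await producer.connect();
  await producer.send({
    topic: 'tweets-fetched',       // hypothetical topic name
    messages: [{ value: JSON.stringify({ businessName, tweets }) }],
  });
  await producer.disconnect();
}

main().catch(console.error);

Run as a Kubernetes job or cron job, a CLI like this stays stateless: it fetches, emits an event, and exits, leaving authorization, management and storage to the microservice that consumes the event.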
Conclusion
Predecos is in its PoC stage. There’s quite a bit of work to do and some interesting problems to solve. I’m doing my best to design the system in a way that’s future-proof, open source and fun. In keeping with that, this post is intended as a request for comments (RFC) so that I can improve my current design and learn. Please feel free to comment below so that the architecture can benefit from the experience of many rather than the experience of one. Happy coding. :-)