John D
Posted on August 12, 2019
I’m writing this article after only two weeks of working with Gremlin and CosmosDB. What I’m writing about could be dead wrong. I honestly hope so, because my job would be much easier if I’m missing something and what little goodwill I had towards Azure before this experience might be restored.
This article assumes that you have an intermediate knowledge of TypeScript and a basic knowledge of Gremlin and CosmosDB. I won’t be stopping to explain the benefits of TypeScript or what Gremlin is and how it works, but I have included links to resources that do. If you’re feeling rusty, feel free to brush up using the following articles.
On to business.
The Players
- Gremlin - Graph Traversal Machine and Language (how you communicate with some graph databases)
- CosmosDB - Microsoft Azure’s multi-model database—specifically, its graph database offering
- TypeScript - JavaScript’s sane younger brother
- Gremlin-JavaScript - Apache’s managed Gremlin-JavaScript implementation
TypeScript and Gremlin-JavaScript
Before we start using CosmosDB, we need to correct the Gremlin-JavaScript package’s TypeScript type declarations. Don’t skip this section or you’ll learn what happens when type declarations don’t match up with their functionality counterparts.
The @types/gremlin package contains incorrect declarations and is currently incomplete. I’m in the process of contributing to the official package, but that is a slow process. In the meantime, the best option is to use Declaration Merging to augment and correct the current type declarations. Here is part of my corrected declaration file.
This is where my knowledge of TypeScript could use a little help. The least painful way I’ve found to augment declarations is to merge modules, so I don’t have to re-export and then import Gremlin from different locations. This does have a drawback: you can’t modify any of the constructors, and you might have to instantiate objects with an empty ID field. This generally isn’t a problem, since the majority of classes have extremely simple constructors, but that won’t always be the case.
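To illustrate the merging idea and its constructor limitation, here is a minimal, self-contained sketch. It is not the corrected declaration file itself: the `Vertex` class below is a hypothetical stand-in for a class shipped by a package, and `properties` is an assumed missing member. Against the real package, the interface would typically sit inside a `declare module 'gremlin'` block in a `.d.ts` file.

```typescript
// Hypothetical stand-in for a class whose published typings are incomplete.
class Vertex {
  constructor(public id: string, public label: string) {}
}

// An interface with the same name merges into the class type, so the missing
// member becomes visible to the compiler. The constructor signature cannot be
// changed this way.
interface Vertex {
  properties?: Record<string, unknown>;
}

// The constructor still takes only (id, label), so the object is created with
// an empty ID and the merged member is filled in afterwards.
const vertex = new Vertex('', 'person');
vertex.properties = { name: 'marko' };

console.log(vertex.label, vertex.properties);
```

The merge only adds members to the type; it emits no JavaScript, which is why the constructor cannot be touched.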
I’ve corrected the most obvious errors for you, and have made a majority of the changes for CosmosDB to work, but these are probably not all the declaration changes you’ll have to make for your project. Don’t just plug this in and expect that all functionality has been covered. Be careful and pay attention.
Gremlin and CosmosDB
You have two significant hurdles to overcome when dealing with CosmosDB through the Gremlin-JavaScript library:
CosmosDB does not support Gremlin bytecode commands
Gremlin works best when it can take the user commands and translate them into Gremlin bytecode. This helps avoid issues that can come about because of malformed or unescaped strings, and it allows the developer to use steps and traversal methods that would be too difficult or impossible otherwise. If you want more info, you can read all about Gremlin bytecode and why it’s a Very Good Thing™.
Without bytecode support, the CosmosDB website and example packages (even the Gremlin-JavaScript reference documentation!) would have you believe that the only way to run queries against CosmosDB through Gremlin is raw script submission.
This is incorrect.
Included in the Gremlin-JavaScript package is a nifty set of classes that take normal, fluent traversal steps (minus the terminal steps) and convert the resulting bytecode into a Gremlin-Groovy script.
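The idea behind such a translator can be sketched in a few lines. This is a simplified illustration, not the package’s implementation: the `Instruction` type, `renderArg`, and `translate` are hypothetical names, and real bytecode carries more than step names and primitive arguments.

```typescript
// Hypothetical, simplified stand-in for Gremlin bytecode: an ordered list of
// instructions, each a step name followed by its arguments.
type Arg = string | number | boolean;
type Instruction = [string, ...Arg[]];

// Render one argument as a Groovy literal. Backslashes and single quotes are
// escaped so a string value cannot break out of its quotes.
function renderArg(arg: Arg): string {
  if (typeof arg === 'string') {
    const escaped = arg.replace(/\\/g, '\\\\').replace(/'/g, "\\'");
    return `'${escaped}'`;
  }
  return String(arg);
}

// Append each instruction to the traversal source as a method call.
function translate(source: string, instructions: Instruction[]): string {
  return instructions.reduce(
    (script, [step, ...args]) =>
      `${script}.${step}(${args.map(renderArg).join(', ')})`,
    source
  );
}

const script = translate('g', [
  ['V'],
  ['has', 'name', "O'Brien"],
  ['out', 'knows'],
  ['limit', 5],
]);

console.log(script);
```

Because the script is generated from structured steps rather than concatenated by hand, escaping happens in one place, which is the main safety benefit of starting from bytecode.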
Note: Microsoft does not support all traversal steps; check out this page for supported steps.
Microsoft says they’ve begun work on accepting bytecode and that a public preview will be available in December 2019, but I won’t hold my breath for it becoming quickly available afterwards.
In the spirit of abstraction and longevity of the application, I’d suggest coding your app using the bytecode functionality and methods, and then using the script translator. You’ll thank me if/when CosmosDB enables bytecode support or you decide to find a much better alternative graph database provider. If you’re especially talented, you could probably make a fantastic abstraction layer that makes switching back and forth a breeze!
CosmosDB outputs GraphSON 1.0
GraphSON is like JSON but for graph databases. When an SDK (in this case the Gremlin-JavaScript library) communicates with a Gremlin-enabled graph database, the data exchanged is serialized into GraphSON.
Simple.
There are three versions of GraphSON to date. Changes from 1.0 to 2.0 were drastic; changes from 2.0 to 3.0, not so much. Most modern database providers use either GraphSON 2.0 or 3.0, and the majority of SDKs can serialize and deserialize 2.0 and 3.0 messages.
This is where things get frustrating.
CosmosDB accepts the GraphSON 2.0 format, meaning that the Gremlin-JavaScript package’s serialization of Gremlin scripts will be accepted by CosmosDB right out of the box. However, CosmosDB outputs GraphSON 1.0, and none of the GraphSON serializers included with the Gremlin-JavaScript package accept GraphSON 1.0.
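To make the mismatch concrete, here is a rough sketch of how the two formats differ in shape. The sample payloads are hand-written illustrations rather than captured server output, and `isTypedGraphSON` is a hypothetical helper: GraphSON 2.0 and 3.0 wrap values in `@type`/`@value` envelopes, while the untyped 1.0 form is plain JSON.

```typescript
// Illustrative GraphSON 2.0 vertex: every typed value sits in an envelope.
const graphson2Vertex = {
  '@type': 'g:Vertex',
  '@value': {
    id: { '@type': 'g:Int32', '@value': 1 },
    label: 'person',
  },
};

// Illustrative GraphSON 1.0 vertex of the kind described above: plain JSON
// with a "type" field and no envelopes.
const graphson1Vertex = {
  id: '1',
  label: 'person',
  type: 'vertex',
  properties: { name: [{ id: 'p1', value: 'marko' }] },
};

// Hypothetical helper: true when a value looks like a 2.0/3.0 typed envelope.
function isTypedGraphSON(value: unknown): boolean {
  return (
    typeof value === 'object' &&
    value !== null &&
    '@type' in value &&
    '@value' in value
  );
}

console.log(isTypedGraphSON(graphson2Vertex)); // true
console.log(isTypedGraphSON(graphson1Vertex)); // false
```

A reader that only looks for `@type` envelopes has nothing to dispatch on when it is handed the second shape.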
Why CosmosDB accepts one version of GraphSON and outputs another is beyond me. I have both a question on Stack Overflow and a post on Reddit asking that same question. I continue to hope I’m just missing a setting or not reading the documentation closely enough. But at the time of writing I have yet to receive any response, and, seeing that the UI tool on the Azure site mirrors the same GraphSON 1.0 output I get from communicating with the server directly, I am not filled with confidence that I will.
I’m in the process of writing a GraphSON 1.0 reader/serializer for the Gremlin-JavaScript package, which you are free to use. Keep in mind that it’s unfinished, and though it’s tested, I can’t guarantee it’s feature-complete or particularly good. If nothing else, it will be a demonstration of how the serializer/deserializer should function and where to start in writing your own.
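As a starting point for writing your own, a very small reader for untyped GraphSON 1.0 vertices might look like the sketch below. It is not the reader mentioned above, and it does not plug into the package’s serializer interfaces; `GraphSON1Vertex`, `PlainVertex`, and `readVertex` are hypothetical names, and the input shape is an assumption based on the untyped 1.0 vertex format.

```typescript
// Assumed shape of an untyped GraphSON 1.0 vertex.
interface GraphSON1Property {
  id: string;
  value: unknown;
}

interface GraphSON1Vertex {
  id: string;
  label: string;
  type: 'vertex';
  properties?: Record<string, GraphSON1Property[]>;
}

// Hypothetical flattened result: each property key maps to its list of values.
interface PlainVertex {
  id: string;
  label: string;
  properties: Record<string, unknown[]>;
}

// Strip the per-property wrappers and keep only the values.
function readVertex(raw: GraphSON1Vertex): PlainVertex {
  const source = raw.properties || {};
  const properties: Record<string, unknown[]> = {};
  for (const key of Object.keys(source)) {
    properties[key] = source[key].map((entry) => entry.value);
  }
  return { id: raw.id, label: raw.label, properties };
}

const plain = readVertex({
  id: '1',
  label: 'person',
  type: 'vertex',
  properties: { name: [{ id: 'p1', value: 'marko' }] },
});

console.log(plain);
```

Edges, paths, and nested values would each need their own handling; this covers only the simplest vertex case.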
The End
I hope this article does not age well. Microsoft has stated they’re working on accepting Gremlin bytecode commands, and I hope that actually happens. As CosmosDB evolves, I hope the graph database offering will become more modern, outputting GraphSON 2.0. Most of all, I hope that Microsoft understands that these shortcomings are what’s keeping them from being as competitive in their cloud graph databases as they should be.
My recommendation? Avoid CosmosDB’s graph database service, for now, and try any of these alternative solutions.