out() vs. outE() – JanusGraph and Gremlin
Sunny Srinidhi
Posted on March 3, 2021
If you are new to JanusGraph and the Gremlin query language, like I am, you have probably been confused by the out(), outE(), in(), and inE() methods. If you look at examples of these functions, the difference is not easy to spot. Or is it just me?
Anyway, I got confused, and it took me a while to understand that in some cases there is a difference, and in others there isn’t. Let me explain.
The Sample Graph
Before we look at the differences, let’s look at a sample graph.
As you can see from the graph above, we have four vertices and three edges. The vertex in the middle with the property "name": "sunny" is the vertex from where we’ll start our traversal. The other three vertices are the items that I bought from an e-commerce website. They are a smartphone, a laptop, and a monitor. The relationship is represented with edges labelled bought.
The edges have a property called count which, as you can tell, represents the number of times I have bought each item. So I bought three smartphones, two laptops, and one monitor. This is the data we’re going to work with.
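If you’d like to follow along, here is a minimal sketch of how this sample graph could be created in the Gremlin console. The vertex labels person and item, and the name property on the item vertices, are my assumptions for illustration; only the name of the starting vertex, the bought edge label, and the count values come from the description above. It also assumes g is already bound to a traversal source for your graph.

```groovy
// Four vertices: the buyer plus three items.
// The labels 'person' and 'item' and the item names are assumed for this sketch.
g.addV('person').property('name', 'sunny').as('sunny').
  addV('item').property('name', 'smartphone').as('phone').
  addV('item').property('name', 'laptop').as('laptop').
  addV('item').property('name', 'monitor').as('monitor').
  // Three 'bought' edges going out of sunny, each carrying a count.
  addE('bought').from('sunny').to('phone').property('count', 3).
  addE('bought').from('sunny').to('laptop').property('count', 2).
  addE('bought').from('sunny').to('monitor').property('count', 1).
  iterate()

// On a transactional setup you may also need to commit, for example: g.tx().commit()
```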
Now, we’ll first get a reference to our starting vertex with the following query:
sunny = g.V().has('name', 'sunny').next()
We now have all the data we need to understand the difference between these functions.
out() vs. outE()
We already know that we use the outE() function to traverse the edges going out of the current vertex. We pass in the label of one or more edges to the function. From our e-commerce example, if I want to get all the items that I have bought, I’ll run the following query:
g.V(sunny).outE('bought').inV()
This would give us all the vertices that the current vertex has a ‘bought’ relationship with. But you’d have also seen the following query for the same use case:
g.V(sunny).out('bought')
So, they are performing the same traversal and returning the same results. I found out that when you’re using the outE().inV() combination, you can simply replace it with out(). It’s a shorthand for the long form outE().inV(). But then, why would you use outE() at all?
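A quick way to check the equivalence yourself, and to get a hint at the answer, is to run both forms side by side and then look at what outE() returns on its own. This is only a sketch, and it assumes the item vertices carry a name property, which the sample graph description doesn’t spell out.

```groovy
// Long form and shorthand: both end on the same item vertices.
g.V(sunny).outE('bought').inV().values('name')
g.V(sunny).out('bought').values('name')

// outE() on its own stops on the edges, so edge properties such as count are within reach.
g.V(sunny).outE('bought').values('count')
```

With the sample graph, the first two queries should list the same three items (the order is not guaranteed), while the third returns the count values stored on the edges.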
Suppose you want to filter or limit the traversal based on properties of the edge. For example, in our sample graph, I want to get only the items that I have bought more than once. We have the count property on each of our bought edges, and we can use that to filter our vertices. For this, the query is as follows:
g.V(sunny).outE('bought').has('count', gt(1)).inV()
As you can see, we can use the has() function on edges as well, to keep only the edges with a particular property value. This ability to filter on edges is not available when you use the out() function, because the result of out() is vertices. So if you call has() on that result, you’ll be filtering on the vertices and not the edges. I hope I’m not complicating things.
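To make the edge-versus-vertex distinction concrete, here is a small sketch that puts the two kinds of filter next to each other. The name property on the item vertices is an assumption on my part; the count property on the edges is from the sample graph.

```groovy
// Filter on the edge: keep only 'bought' edges with count > 1, then hop to the items.
g.V(sunny).outE('bought').has('count', gt(1)).inV().values('name')

// Filter on the vertex: out() has already moved past the edge,
// so has() here only sees vertex properties.
g.V(sunny).out('bought').has('name', 'laptop')

// Both kinds of filter in one traversal.
g.V(sunny).outE('bought').has('count', gt(1)).inV().has('name', 'laptop').values('name')
```

For the sample graph, the first query should return the smartphone and the laptop, since the monitor’s edge has a count of 1.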
in() vs. inE()
It’s the same story with the in() and inE() functions, only in the opposite direction: in() is shorthand for inE().outV(). If you want to filter edges based on their properties, you use the inE() function instead of in().
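Here is the same idea sketched in the incoming direction, starting from one of the items instead of from me. Looking the item up by a name property is an assumption for illustration, and the phone variable is just a name I picked.

```groovy
// Start from an item this time (assumes item vertices have a 'name' property).
phone = g.V().has('name', 'smartphone').next()

// Shorthand: every vertex with a 'bought' edge pointing at this item.
g.V(phone).in('bought').values('name')

// Long form with an edge filter: only buyers who bought it more than once.
g.V(phone).inE('bought').has('count', gt(1)).outV().values('name')
```

In the sample graph both queries should lead back to the sunny vertex, because that edge has a count of 3.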
There’s more…
Using the outE() or inE() functions gives you access to more functions that can be used on edges, such as aliasing them using the as() function, the count() function, etc. You can have a look at the documentation to see the list of all functions available on edges.
I hope this has not confused you more than you already were. I thought this would clear things up for at least a few people who have the same questions I had while getting started with JanusGraph and Gremlin. Let me know if this helped, or didn’t.
And if you like what you see here, or on my personal or Medium blogs, and would like to see more of such helpful technical posts in the future, consider supporting me on Patreon and Github.