Ken Woon
Posted on March 11, 2023
Introduction
Welcome to the world of graph databases! When it comes to modelling complex and highly connected data, graph databases have proven to be an efficient and intuitive solution. And one of the most popular graph databases out there is Neo4j, which uses a query language called Cypher.
But what if you could use Cypher to query data in PostgreSQL? Well, now you can! Thanks to the Apache AGE extension, you can use Cypher to query a graph that is stored in a PostgreSQL database. This powerful combination allows you to take advantage of the benefits of graph databases while leveraging the maturity and stability of PostgreSQL.
In this blog post, I will walk you through the basics of querying with Cypher in PostgreSQL using the Apache AGE extension. Whether you're new to graph databases or an experienced user, this post will help you get started on querying in Apache AGE. So let's get started! If you need help setting up your environment, check out my previous post about the installation process for both PostgreSQL and Apache AGE.
Cypher Syntax
The syntax for Cypher is designed to be easy to read and write, making it accessible to both developers and non-technical users. The language is built around the concept of pattern matching, which allows you to describe complex relationships between nodes and edges in your graph data.
In Cypher, nodes are represented by parentheses (), while labels or tags are indicated by a colon : followed by the label name, which groups nodes by roles or types. For example, a node representing a person who is male could be labelled as:
(:Person:Male)
Nodes can also have properties, which are enclosed in curly braces {} and follow the label, such as:
(:Person {name: 'Jake'})
Relationships, on the other hand, are written with hyphens - and square brackets [], and connect two nodes together. The direction of the relationship is specified using < or > to form an arrow. For example, if Jake likes Jane, we can represent this with:
(:Person {name: 'Jake'})-[:LIKES]->(:Person {name: 'Jane'})
If the colon is dropped, as in -[LIKES]->, then LIKES is treated as a variable (alias) rather than a relationship type, and relationships of all types will be matched.
Similar to nodes, relationships can also have properties, which are enclosed in curly braces {} and follow the relationship type, such as:
-[:LIKES {type: 'as a friend'}]->
Finally, it's worth noting that aliases can be used to refer to nodes and relationships throughout your queries. To use an alias, simply name the node or relationship before the label, such as the a from (a:Person), the r from [r:LIKES] and the b from (b:Person). These can be referred to later in your query using the aliases you've defined, like so:
(a)-[r:LIKES {type: 'as a friend'}]->(b)
Basic Queries
In Apache AGE, Cypher cannot be used in an expression; the Cypher query must appear in the FROM clause of an SQL query. The typical layout for executing a Cypher query is as follows:
SELECT *
FROM cypher('graph_name', $$
MATCH (a:Person)
RETURN a
$$) AS (person agtype);
Cypher queries are written between the dollar signs $$ $$ of the cypher() function, whose first argument is the name of the graph being worked on. A column definition list must follow AS, with one column for each value in the Cypher RETURN clause. The syntax can be generalized as cypher(graph_name, query_string, parameters). AGE uses a custom data type called agtype, which is the only data type returned by AGE.
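The optional third argument, parameters, is easiest to see with a prepared statement. The following is a minimal sketch modelled on the pattern in the AGE documentation; it assumes a graph named People that already contains Person vertices, and the statement name find_person is just an illustrative choice.

```sql
-- Prepare a statement whose single SQL parameter ($1) is an agtype map.
-- Inside the Cypher text, $name refers to the "name" key of that map.
PREPARE find_person(agtype) AS
SELECT *
FROM cypher('People', $$
    MATCH (p:Person)
    WHERE p.name = $name
    RETURN p
$$, $1) AS (person agtype);

-- Supply the Cypher parameters as a JSON-style map when executing.
EXECUTE find_person('{"name": "Jack"}');
```

The SQL parameter has to be passed as the third argument of the cypher() call itself, which is why this form is tied to PREPARE and EXECUTE.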
To start working with graphs in Apache AGE using Cypher, the first step is to create a graph. This can be done using the create_graph function, which takes the name of the graph as its argument:
SELECT create_graph('graph_name');
To delete a graph, use drop_graph, which takes the name of the graph and a boolean cascade flag; passing true also drops the labels and data that belong to the graph:
SELECT drop_graph('graph_name', true);
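As a rough end-to-end sketch, a fresh session usually needs the extension loaded and the search path adjusted before these functions can be called without a schema prefix. The example below assumes AGE is already installed in the database and uses the graph name People, which the later examples in this post also use.

```sql
-- Load AGE for this session and put its catalog schema on the search path.
LOAD 'age';
SET search_path = ag_catalog, "$user", public;

-- Create the graph used in the rest of this post.
SELECT create_graph('People');

-- List the graphs AGE knows about, to confirm the graph exists.
SELECT name, namespace FROM ag_catalog.ag_graph;
```

Each graph is backed by its own PostgreSQL schema, which is what the namespace column shows.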
Once you have a graph to work with, you can start creating nodes, or vertices, using the CREATE clause. To create a simple node, use CREATE (n). To create a node with a label, use CREATE (:Person). You can also add properties to nodes using the curly brace syntax, such as CREATE (:Person {name: 'Jack'}), as we saw in the previous section. Note that string values must be quoted. Here is an example of a complete query:
SELECT *
FROM cypher('People', $$
CREATE (:Person {name: 'Jack'}),
(:Person {name: 'Jane'})
$$) AS (person agtype);
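Relationships are created with the same CREATE clause, typically after matching the two vertices to connect. Here is a small sketch, assuming the People graph already holds Person vertices named Jack and Jane; the LIKES type and its type property mirror the syntax examples earlier in this post.

```sql
SELECT *
FROM cypher('People', $$
    -- Find the two endpoints first, then create a directed edge between them.
    MATCH (a:Person {name: 'Jack'}), (b:Person {name: 'Jane'})
    CREATE (a)-[r:LIKES {type: 'as a friend'}]->(b)
    RETURN r
$$) AS (likes agtype);
```

If either MATCH pattern finds nothing, no edge is created and the query returns no rows.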
The MATCH clause is used to specify patterns that Cypher will search for in the database. To get all vertices in the graph, use MATCH (n). To get all vertices with a specific label, use MATCH (n:Person). To get related vertices, use MATCH (:Person {name: 'Jack'})-[]-(:Person), where the symbol -[]- means related to, without regard to the type or direction of the relationship. You can also match on a specific edge type and direction with MATCH (:Person {name: 'Jack'})-[:LIKES]->(:Person {name: 'Jane'}). A full query will look something like:
SELECT *
FROM cypher('People', $$
MATCH (a:Person {name: 'Jack'})-[r:LIKES {type: 'as a friend'}]->(b:Person {name: 'Jane'})
RETURN a.name, r.type, b.name
$$) AS (he agtype, likes agtype, her agtype);
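For comparison, the undirected form -[]- can be used when the direction and type of the relationship do not matter. This sketch assumes the same People graph, with at least one edge attached to the Person vertex named Jack; the aliases other and r are arbitrary.

```sql
SELECT *
FROM cypher('People', $$
    -- No arrowhead and no type: any edge touching Jack qualifies.
    MATCH (:Person {name: 'Jack'})-[r]-(other:Person)
    RETURN other.name, r.type
$$) AS (related_person agtype, relationship_property agtype);
```

Because the pattern ignores direction, it finds both the people Jack likes and the people who like Jack.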
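To delete graph elements, the DELETE clause can be used. To delete a vertex, use DELETE n. A vertex that still has edges attached cannot be deleted on its own, so DETACH DELETE is used to remove a vertex together with its edges. To delete all vertices and edges in the graph, use: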
SELECT *
FROM cypher('People', $$
MATCH (n)
DETACH DELETE n
$$) AS (n agtype);
To delete edges only, use:
SELECT *
FROM cypher('People', $$
MATCH (:Person {name: 'Jack'})-[r:LIKES]->()
DELETE r
$$) AS (n agtype);
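One way to check the effect of a delete is to count what is left. The following sketch assumes the People graph and the Person label used above, and counts the LIKES edges that still start at Jack.

```sql
SELECT *
FROM cypher('People', $$
    -- count() aggregates over every edge matched by the pattern.
    MATCH (:Person {name: 'Jack'})-[r:LIKES]->()
    RETURN count(r)
$$) AS (remaining_likes agtype);
```

The vertices themselves are untouched by DELETE r, so a plain MATCH (n:Person) should still find Jack afterwards.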
The SET clause is used to update labels on nodes and properties on vertices and edges. To set a property, use SET n.age = 25:
SELECT *
FROM cypher('People', $$
MATCH (a:Person {name: 'Jack'})
SET a.age = 25
$$) AS (n agtype);
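SET can be followed by RETURN so that the updated values come back in the same query. A minimal sketch, assuming a Person vertex named Jack exists in the People graph:

```sql
SELECT *
FROM cypher('People', $$
    MATCH (a:Person {name: 'Jack'})
    SET a.age = 25
    -- Return the properties to confirm the update took effect.
    RETURN a.name, a.age
$$) AS (name agtype, age agtype);
```

The column list after AS has two entries here because the RETURN clause yields two values.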
The REMOVE clause can be used to remove properties from vertices and edges, such as REMOVE n.age:
SELECT *
FROM cypher('People', $$
MATCH (a:Person {name: 'Jack'})
REMOVE a.age
$$) AS (n agtype);
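The same pattern works for REMOVE. In this sketch, which again assumes a Person vertex named Jack in the People graph, the whole vertex is returned after the removal so that its remaining properties can be inspected.

```sql
SELECT *
FROM cypher('People', $$
    MATCH (a:Person {name: 'Jack'})
    REMOVE a.age
    -- Return the vertex itself; age should no longer be among its properties.
    RETURN a
$$) AS (person agtype);
```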
Finally, the RETURN clause allows you to specify which parts of the pattern you're interested in. This could be nodes, relationships, or properties on either of these. When you want to return all vertices, edges, and paths found in a query, you can use the * symbol:
SELECT *
FROM cypher('People', $$
MATCH (a:Person {name: 'Jack'})-[r]->(b)
RETURN *
$$) AS (person1 agtype, relationship agtype, person2 agtype);
In this section, we learned how to perform basic Cypher queries in Apache AGE. We covered the creation and deletion of graphs, as well as the creation, deletion, and modification of nodes and relationships. With this knowledge, we can start building more complex queries to extract insights and knowledge from our data. To find out more about each clause, other clauses not mentioned here, and the functions available in Apache AGE, check out the official documentation available online.
Conclusion
By learning these fundamental concepts and commands, you can start to leverage the power of graph databases and extract valuable insights and knowledge from your data. With Cypher and Apache AGE, you can model your data as a graph, and query it using a familiar, intuitive syntax. Whether you're working on social networks, recommendation engines, fraud detection, or any other problem that can be modelled as a graph, Apache AGE can help you store, manage, and query your data efficiently and effectively.
I hope this blog post has been helpful in getting you started with Cypher queries in Apache AGE. Remember, this is just the beginning, as there are many more advanced features and techniques to explore as you become more proficient in using this powerful tool. So keep learning, keep practicing, and keep discovering new insights with Cypher and Apache AGE.
This post is written based on the Neo4j Getting Started Guide and the Apache AGE Master Documentation.