How to migrate from Neo4j to Memgraph
Katarina Supe
Posted on March 17, 2022
Introduction
Through this tutorial, you'll learn how to migrate the movies dataset from Neo4j to Memgraph. If you used Neo4j, you are probably familiar with their example graph that helps you learn the basics of the Cypher query language. Neo4j is an ACID-compliant transactional native graph database, while Memgraph is a platform designed for graph computations on streaming data. You can read more about their differences in the Neo4j vs Memgraph article.
Prerequisites
- Docker (Linux) or Docker Desktop (macOS/Windows) - you will run Memgraph and Neo4j in a Docker container
-
Neo4j with APOC - in this tutorial, you'll use the
apoc.export.csv.all()
procedure. -
Memgraph - you are going to import your data using the
LOAD CSV
clause.
Run Neo4j and Memgraph
You need to start Neo4j with APOC, and you can do that by running:
docker run -p 7474:7474 -p 7687:7687 \
-v data:/data -v plugins:/plugins --name neo4j-apoc \
-e NEO4J_apoc_export_file_enabled=true \
-e NEO4J_apoc_import_file_enabled=true \
-e NEO4J_apoc_import_file_use__neo4j__config=true \
-e NEO4JLABS_PLUGINS=\[\"apoc\"\] \
-e NEO4J_AUTH=neo4j/password neo4j:latest
If you're having trouble running this on the new Apple M1 chip, try adding
--platform linux/arm64/v8
after the run command. Also, you can setNEO4J_AUTH
however you like, hereneo4j
is username andpassword
is password.
The Neo4j Browser is available athttp://localhost:7474
, and you can use the credentials from theNEO4J_AUTH
environment variable. The ports 7474 and 7687 are now exposed, and since Memgraph also uses port 7687, you can run it at the next available port, that is, 7688:
docker run -it -p 7688:7687 -p 3000:3000 -v mg_lib:/var/lib/memgraph memgraph/memgraph-platform
Since you started Memgraph Platform, the command-line tool mgconsole should be open in your terminal, and the visual user interface Memgraph Lab is available at http://localhost:3000
.
Great, now you have both databases running and you're ready to play with the movies dataset!
1. Load the movies dataset into Neo4j
The movies dataset contains actors and directors that are related through the movies they've collaborated on.
This is a simple dataset and a good example of how you can migrate the whole graph database from Neo4j to Memgraph. You can import it to Neo4j by running the Movie Graph example from Example Graphs in the Neo4j Browser or Desktop app. Once the movie dataset is loaded into Neo4j, you can jump to exporting the whole database.
2. Export the movies dataset from Neo4j
You are going to export the whole database from Neo4j to a CSV file movies.csv
by using the apoc.export.csv.all()
procedure. Through the Neo4j Browser (or Desktop app), run the following query:
CALL apoc.export.csv.all("movies.csv", {})
Now you have exported the whole database to the movies.csv
file. But where is this file located? Since you ran both Neo4j and Memgraph with Docker, files you are exporting or importing are located inside the Docker container. To enter the container, you first must find out the container ID of the container where Neo4j is running. Run docker ps
to check all the running containers. There you can find the Neo4j container ID and copy it. After that, run:
docker exec -it <container_ID> bash
After you run the above command, you are inside the container where Neo4j is running. All exported data is located at /var/lib/neo4j/import/
. Hence, run cd /var/lib/neo4j/import/
to check whether the movies.csv
file is there. Next, you are going to copy the movies.csv
file to your local file system. To do that, you need to run:
docker cp <container_ID>:/var/lib/neo4j/import/movies.csv /path_to_local_folder/movies.csv
The <container_ID>
is the ID of the Neo4j container, which you already found before. On the right, you can give the path to one of your local folders where you want the movies.csv
file to be copied to.
3. Import the movies dataset into Memgraph
After you have exported the whole database from Neo4j into the movies.csv
file and copied it to your local file system, you are ready to import the movies dataset into Memgraph. First, you have to copy the movies.csv
file to the Docker container where Memgraph is running. Here's what you have to do:
- Run
docker ps
to find out Memgraph's container ID - Run
docker cp /path_to_local_folder/movies.csv <container_ID>:/usr/lib/memgraph/movies.csv
Now the movies.csv
file is ready to be imported to Memgraph with the LOAD CSV
clause.
First, you need to import all nodes, and after that, relationships. In order to create the relationships, start and end nodes have to already be in the database, and that's why you are importing in that order.
You can write the queries in the mgconsole, a command-line tool, or in the query editor in the Memgraph Lab. Memgraph Lab is more user-friendly, so I would recommend this way of querying. To connect to Memgraph from Memgraph Lab, you will have to choose the Connect manually
option, since the port on which Memgraph is running is not the default one (7687). Once you choose Connect manually
, you just have to change the port to 7688 and click Connect
.
To create all Person
nodes, copy the following query in the query editor:
LOAD CSV FROM "/usr/lib/memgraph/movies.csv" WITH HEADER as row
WITH row WHERE row._labels = ':Person'
CREATE (:Person {id: row._id, name: row.name, born: row.born});
In the movies.csv
file, the column _labels
is empty if the data in the row represents a relationship. If the row is a node, then the column _labels
contains the label of that node. Hence, you can create the Person
nodes by going through the _labels
column. Every person has the properties id
, name
and born
.
In a similar way, you can create the Movie
nodes:
LOAD CSV FROM "/usr/lib/memgraph/movies.csv" WITH HEADER as row
WITH row WHERE row._labels = ':Movie'
CREATE (:Movie {id: row._id, relased: row.released, tagline: row.tagline, title: row.title});
You can check whether you created all nodes correctly. First, in the Overview tab, you can see that the number of nodes is 171, and it matches the number of nodes in Neo4j.
Next, you can import the relationships. There are 6 different types of relationships, some have properties and some don't. It's best to import them one by one:
- Import ACTED_IN relationship
LOAD CSV FROM "/usr/lib/memgraph/movies.csv" WITH HEADER as row
WITH row WHERE row._type = 'ACTED_IN'
MATCH (n {id: row._start}), (m {id: row._end})
CREATE (n)-[:ACTED_IN {roles: row.roles}]->(m);
- Import DIRECTED relationship
LOAD CSV FROM "/usr/lib/memgraph/movies.csv" WITH HEADER as row
WITH row WHERE row._type = 'DIRECTED'
MATCH (n {id: row._start}), (m {id: row._end})
CREATE (n)-[:DIRECTED]->(m);
- Import PRODUCED relationship
LOAD CSV FROM "/usr/lib/memgraph/movies.csv" WITH HEADER as row
WITH row WHERE row._type = 'PRODUCED'
MATCH (n {id: row._start}), (m {id: row._end})
CREATE (n)-[:PRODUCED]->(m);
- Import WROTE relationship
LOAD CSV FROM "/usr/lib/memgraph/movies.csv" WITH HEADER as row
WITH row WHERE row._type = 'WROTE'
MATCH (n {id: row._start}), (m {id: row._end})
CREATE (n)-[:WROTE]->(m);
- Import FOLLOWS relationship
LOAD CSV FROM "/usr/lib/memgraph/movies.csv" WITH HEADER as row
WITH row WHERE row._type = 'FOLLOWS'
MATCH (n {id: row._start}), (m {id: row._end})
CREATE (n)-[:FOLLOWS]->(m);
- Import REVIEWED relationship
LOAD CSV FROM "/usr/lib/memgraph/movies.csv" WITH HEADER as row
WITH row WHERE row._type = 'REVIEWED'
MATCH (n {id: row._start}), (m {id: row._end})
CREATE (n)-[:REVIEWED {rating: row.rating, summary: row.summary}]->(m);
Now you can check again that the number of relationships in the Overview tab in Memgraph Lab is the same as in Neo4j.
And that's it! You have migrated the whole movies dataset from Neo4j to Memgraph.
4. Explore the data
You can run a couple of example queries from the Neo4j and compare them with the results in Memgraph, to make sure you did everything right.
For example, run the following query:
MATCH (tom {name: "Tom Hanks"}) RETURN tom;
and check the results:
Next, you can run:
MATCH (cloudAtlas {title: "Cloud Atlas"}) RETURN cloudAtlas;
and check the results:
Feel free to try out a different set of queries and make sure that the results are the same in Neo4j and Memgraph.
Conclusion
That's it for now! You've learned how to migrate the whole database from Neo4j to Memgraph. In this tutorial, you used the apoc.export.csv.all()
procedure to export the data from Neo4j and the LOAD CSV
clause to import the data to Memgraph. This is not the only way to migrate your data from Neo4j to Memgraph. If you feel creative and find some easier way of doing this, join our Discord server and share it with the Memgraph team and community.
Posted on March 17, 2022
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.