WALEED SHAHID
Posted on April 27, 2023
Importing data from CSV files is a common task in graph databases, and Apache AGE is no exception. In this article, we will explain how to import data from CSV files in Apache AGE. We will cover the following topics:
- Functions to load graphs from files
- CSV file structure for loading vertices and edges
- Example code to load countries and cities from files
Functions to load graphs from files:
Before we dive into loading data from CSV files, it's important to understand the functions available in Apache AGE to create graphs from files. Here are the two main functions you will use to load data from CSV files:
load_labels_from_file: This function is used to load vertices from CSV files.
load_edges_from_file: This function is used to load edges from CSV files.
To use these functions, you must first create the graph and its labels, for example with create_graph, create_vlabel (for vertex labels), and create_elabel (for edge labels). Once the graph and labels exist, you can load data into them from CSV files.
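Here's how to call load_labels_from_file. Like any SQL function in PostgreSQL, it is invoked through SELECT:
SELECT load_labels_from_file('<graph name>', '<label name>', '<file path>');
An optional fourth boolean argument, id_field_exists, tells the loader whether the file contains an id column:
SELECT load_labels_from_file('<graph name>', '<label name>', '<file path>', false);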
And here's how to call load_edges_from_file:
SELECT load_edges_from_file('<graph name>', '<label name>', '<file path>');
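As a rough illustration of the call shapes above, the following Python sketch builds the statement text for both functions. It only produces strings and does not connect to a database. The helper names (sql_literal, load_labels_sql, load_edges_sql) are hypothetical, and the optional fourth argument is assumed to be the id_field_exists flag discussed in the CSV structure section.

```python
def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def load_labels_sql(graph, label, path, id_field_exists=None):
    """Build a SELECT that calls load_labels_from_file (hypothetical helper)."""
    args = [sql_literal(graph), sql_literal(label), sql_literal(path)]
    # Only pass the flag when the caller sets it explicitly.
    if id_field_exists is not None:
        args.append("true" if id_field_exists else "false")
    return "SELECT load_labels_from_file(" + ", ".join(args) + ");"


def load_edges_sql(graph, label, path):
    """Build a SELECT that calls load_edges_from_file (hypothetical helper)."""
    args = [sql_literal(graph), sql_literal(label), sql_literal(path)]
    return "SELECT load_edges_from_file(" + ", ".join(args) + ");"


if __name__ == "__main__":
    print(load_labels_sql("mygraph", "Country", "countries.csv"))
    print(load_labels_sql("mygraph", "Country", "countries.csv", id_field_exists=False))
    print(load_edges_sql("mygraph", "has_city", "city_country.csv"))
```

The edge label has_city in the usage lines is a placeholder name, not something the loader requires.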
CSV file structure for loading vertices and edges:
Now that you know how to use the functions to load data from CSV files, it's important to understand the structure of the CSV files themselves. Here's an overview of the CSV file structure for loading vertices and edges:
CSV file for vertices:
id: This is the first column of the file, and every value must be a positive integer. The column may be omitted only when id_field_exists is set to false; in every other case it must be present.
Properties: All other columns contain the properties for the vertices. The header row shall contain the name of each property.
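To make the vertex layout concrete, here is a minimal Python sketch that produces CSV text with the id column first and one column per property. The function name vertex_csv, the property names (name, iso2), and the sample rows are illustrative assumptions, not part of Apache AGE.

```python
import csv
import io


def vertex_csv(rows, property_names, include_id=True):
    """Return CSV text for vertices: optional id column first, then properties."""
    header = (["id"] if include_id else []) + list(property_names)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)  # the header row names each property
    for row in rows:
        if include_id:
            vertex_id = row["id"]
            # The id column must hold positive integers.
            if type(vertex_id) is not int or vertex_id <= 0:
                raise ValueError(f"id must be a positive integer, got {vertex_id!r}")
        writer.writerow([row[name] for name in header])
    return buffer.getvalue()


# Illustrative sample rows; the property names are placeholders.
countries = [
    {"id": 1, "name": "Japan", "iso2": "JP"},
    {"id": 2, "name": "France", "iso2": "FR"},
]

if __name__ == "__main__":
    print(vertex_csv(countries, ["name", "iso2"]), end="")
```

Calling vertex_csv with include_id=False drops the id column, which corresponds to loading with id_field_exists set to false.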
CSV file for edges:
start_id: This is the ID of the node where the edge starts. This ID must be present in the vertex CSV file for that node.
start_vertex_type: This is the label (class) of the node where the edge starts.
end_id: This is the ID of the node at which the edge terminates.
end_vertex_type: This is the label (class) of the node at which the edge terminates.
properties: All remaining columns hold the properties of the edge. The header row must contain the name of each property.
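The edge layout can be sketched the same way. The Python example below writes the four fixed columns followed by any property columns, and refuses to emit an edge whose endpoints are not among the known vertex ids. The function name edge_csv, the known_ids mapping, and the relation property are hypothetical. Applying the "must be present" check to end_id as well as start_id is an assumption made here for symmetry.

```python
import csv
import io

EDGE_COLUMNS = ["start_id", "start_vertex_type", "end_id", "end_vertex_type"]


def edge_csv(edges, property_names, known_ids):
    """Return CSV text for edges, checking that both endpoints are known.

    known_ids maps a vertex label to the set of ids in that label's vertex file.
    """
    columns = EDGE_COLUMNS + list(property_names)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    endpoint_keys = (("start_id", "start_vertex_type"), ("end_id", "end_vertex_type"))
    for edge in edges:
        for id_key, type_key in endpoint_keys:
            label, vertex_id = edge[type_key], edge[id_key]
            if vertex_id not in known_ids.get(label, set()):
                raise ValueError(f"{id_key}={vertex_id} not found among {label} vertices")
        writer.writerow([edge[name] for name in columns])
    return buffer.getvalue()


# Illustrative data: ids and the "relation" property are placeholders.
known_ids = {"City": {10, 11}, "Country": {1, 2}}
edges = [
    {"start_id": 10, "start_vertex_type": "City",
     "end_id": 1, "end_vertex_type": "Country", "relation": "located_in"},
]

if __name__ == "__main__":
    print(edge_csv(edges, ["relation"], known_ids), end="")
```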
Example code to load countries and cities from files
Now that you understand the functions and CSV file structure, let's take a look at an example of how to load countries and cities from files in Apache AGE.
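First, we'll load the AGE extension, create a graph, and create two vertex labels, Country and City. Apache AGE exposes these operations as SQL functions rather than CREATE statements:
-- Load the extension and make its functions visible in this session
LOAD 'age';
SET search_path = ag_catalog, "$user", public;
-- Create the graph and the two vertex labels
SELECT create_graph('mygraph');
SELECT create_vlabel('mygraph', 'Country');
SELECT create_vlabel('mygraph', 'City');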
Next, we'll load the vertices from two CSV files: countries.csv and cities.csv:
SELECT load_labels_from_file('mygraph', 'Country', 'countries.csv');
SELECT load_labels_from_file('mygraph', 'City', 'cities.csv');
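Finally, we'll load the edges from a CSV file called city_country.csv. Edges need their own edge label, because a vertex label such as City cannot be reused for them. This example creates one named has_city (the name is only illustrative):
SELECT create_elabel('mygraph', 'has_city');
SELECT load_edges_from_file('mygraph', 'has_city', 'city_country.csv');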
And that's it! Now you have a graph with countries, cities, and the relationships between them loaded from CSV files.
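The order of the steps matters: vertices are loaded before edges so that every edge endpoint already exists. The Python sketch below assembles the statements in that order as plain strings, without connecting to a database. The function name import_script and the edge label has_city are hypothetical, and the sketch assumes that a dedicated edge label is created with create_elabel and that the AGE extension is loaded at the start of the session.

```python
def import_script(graph, vertex_files, edge_files):
    """Return the SQL statements for a CSV import, in execution order.

    vertex_files and edge_files map a label name to a CSV file path.
    Names and paths are assumed to contain no single quotes.
    """
    statements = [
        "LOAD 'age';",
        'SET search_path = ag_catalog, "$user", public;',
        f"SELECT create_graph('{graph}');",
    ]
    # Vertices go first so that edge endpoints already exist when edges load.
    for label, path in vertex_files.items():
        statements.append(f"SELECT create_vlabel('{graph}', '{label}');")
        statements.append(
            f"SELECT load_labels_from_file('{graph}', '{label}', '{path}');"
        )
    for label, path in edge_files.items():
        statements.append(f"SELECT create_elabel('{graph}', '{label}');")
        statements.append(
            f"SELECT load_edges_from_file('{graph}', '{label}', '{path}');"
        )
    return statements


if __name__ == "__main__":
    script = import_script(
        "mygraph",
        {"Country": "countries.csv", "City": "cities.csv"},
        {"has_city": "city_country.csv"},  # hypothetical edge label
    )
    print("\n".join(script))
```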
Conclusion:
Apache AGE provides two functions, load_labels_from_file and load_edges_from_file, for loading graph data from CSV files. By following the steps in this article, you can create a graph and its labels, prepare vertex and edge files in the expected format, and load them into the graph. This is useful for data analysts and developers who need to bring large amounts of existing data into a graph for analysis.