Katarina Supe
Posted on January 7, 2022
Introduction
In this blog post, you will learn how to create a React app with a WebSocket connection to a Flask server, and how to visualize streaming data in real time using D3.js. I decided to write this blog post as part of my learning path, and I hope it helps anyone struggling with the same problems while implementing something similar. I tackled the frontend implementation by visualizing Twitter users who retweeted something with the hashtag #christmas. Each community of users is drawn in a different color, which allowed me to notice the important communities in the Twitter network.
The frontend service is a part of a web application that you can find in the GitHub repository. Besides that, the repository also holds a visualization that discovers the most Christmassy person on Twitter using the dynamic PageRank algorithm.
Prerequisites and app architecture
If you are using Windows, you need to install Windows Subsystem for Linux (WSL) and then Docker Desktop. On the other hand, if you are a Linux user, just install Docker and Docker Compose.
The app is dockerized and it consists of five services:
- stream: A Python script that collects new retweets with the hashtag #christmas and sends them to a Kafka cluster.
- kafka: A Kafka cluster consisting of one topic named retweets.
- memgraph-mage: The graph analytics platform where we store the incoming Twitter data from Kafka and perform dynamic PageRank and dynamic community detection on all Twitter users.
- backend-app: A Flask server that sends all the data we query from memgraph-mage to the frontend-app. It also consumes the Kafka stream and sends it to the frontend-app.
- frontend-app: A React app that visualizes the Twitter network using the D3.js library.
Project structure
You can see the whole project structure in the GitHub repository. The blog post focuses on the frontend service and explains how the visualization was created.
| docker-compose.yml
|
+---backend
| Dockerfile
| +---server
| +---tests
|
+---frontend
| | .dockerignore
| | Dockerfile
| | package.json
| | package-lock.json
| +---node_modules
| +---public
| +---src
|
+---memgraph
| | Dockerfile
| | requirements.txt
| +---procedures
| +---transformations
|
+---stream
| | Dockerfile
| | kafka_utils.py
| | produce.py
| | requirements.txt
| +---data
The frontend folder was created using the create-react-app npm package. If you are starting from scratch and want to create a React app, follow these steps:
- Place yourself in the root folder of your project.
- Run npm install -g create-react-app (if you don't want to install the latest version, you can specify the version of create-react-app, for example, create-react-app@5.0.0).
- Next, run npm init react-app frontend --use-npm, which will initialize the react-app package in the frontend folder.
- In the end, place yourself in the frontend folder by running cd frontend and start the app with npm start.
An even simpler way of creating a React app is to use npx, a package runner tool that comes with npm 5.2+. Then you just have to run:
npx create-react-app frontend
cd frontend
npm start
Socket.IO library
I used socket.io-client@2.3.1 since I had issues with the latest version. I am going to explain the process on the CommunityDetection component, since it's very similar to the PageRank component. If you are running the frontend application locally, and not using the provided dockerized application, make sure to install the library by running:
npm install socket.io-client@2.3.1
Don't forget that Node.js is a prerequisite for using npm.
First, we are going to import the socket we're using on the client side. The backend is implemented with Flask.
import io from "socket.io-client"
After that, we are initializing the socket.
var socket = io("http://localhost:5000/", {
transports: ["websocket", "polling"]
})
We set the socket to connect to the server running at http://localhost:5000/. The connection is first attempted over websocket. If websocket is not available, the connection to the server will be established with HTTP long-polling, that is, successive HTTP requests (POST for writing, GET for reading). Next, we need to handle the different events on our socket. When the connection is established, the socket emits the consumer signal. This signal is also emitted on the server side whenever a new message is sent. This configuration allows the socket to receive all messages related to the consumer signal.
socket.on("connect", () => {
socket.emit('consumer')
console.log("Connected to socket ", socket.id)
});
socket.on("connect_error", (err) => {
console.log(err)
// try reconnecting
socket.connect()
});
socket.on("disconnect", () => {
console.log("Disconnected from socket.")
});
socket.on("consumer", (msg) => {
console.log('Received a message from the WebSocket service: ', msg.data);
});
React.Component lifecycle
You may ask yourself where to place all this socket.io code within a React component. First, I initialized the socket in the component's constructor. After that, I placed the socket events in the componentDidMount() lifecycle method. This part of the React.Component lifecycle is invoked once, immediately after a component is mounted. If you need to load data from a remote endpoint, this is a good place to instantiate the network request. This method is also a good place to set up any subscriptions. That's why I decided to place all socket events there. On each consumer emit, the state of the component is updated, which triggers a re-render. Before I set up the socket, at the beginning of componentDidMount(), I make a simple HTTP request that triggers the backend to start producing the needed data.
firstRequest() {
fetch("http://localhost:5000/api/graph")
.then((res) => res.json())
.then((result) => console.log(result))
}
After that, I initialized everything that was necessary for drawing with D3.js in the initializeGraph() method. Setting a new state of nodes and links with setState() on each consumer emit causes the componentDidUpdate() lifecycle method to be called. In that method we update the graph by drawing the new incoming nodes and links. This lifecycle method is not called for the initial render, which is why we initialize everything in the initializeGraph() method.
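How an incoming message is folded into the component state is not shown here. As a minimal sketch, assuming each consumer message carries nodes and links arrays in msg.data (a hypothetical shape; the real payload may use different keys), a pure helper can append only the nodes that are not already present and pass the links through as received. The name mergeGraph is made up for this example.

```javascript
// Hypothetical helper, not taken from the repository.
// Assumption: `data` looks like { nodes: [{ id, ... }], links: [{ source, target }] }.
function mergeGraph(prevState, data) {
  // Track ids already in state so repeated users are not added twice.
  const seen = new Set(prevState.nodes.map((n) => n.id));
  const nodes = [...prevState.nodes];

  for (const node of data.nodes || []) {
    if (!seen.has(node.id)) {
      seen.add(node.id);
      nodes.push(node);
    }
  }

  // Links are appended as received; no deduplication is attempted here.
  const links = [...prevState.links, ...(data.links || [])];

  // New arrays are returned, so the previous state is left untouched.
  return { nodes, links };
}
```

Inside the consumer handler this could be used as this.setState((prev) => mergeGraph(prev, msg.data)), which keeps the update logic separate from the socket code and easy to test.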
In the end, when the component unmounts (for example, when we click the button to switch to PageRank), the componentWillUnmount() lifecycle method is called and the client disconnects from the server.
componentWillUnmount() {
this.socket.emit('disconnect');
this.socket.disconnect();
}
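The mount and unmount steps mirror each other, so it can help to keep them together. The following is a minimal sketch, not the code from the repository: subscribe is a hypothetical helper that registers the handlers on any emitter-like object exposing on, off and emit (the Socket.IO client socket provides these through its emitter interface) and returns a cleanup function.

```javascript
// Hypothetical helper: registers the socket handlers and returns a function
// that removes them again. `socket` is any object with on/off/emit;
// `onMessage` is called with the payload of each "consumer" message.
function subscribe(socket, onMessage) {
  const handleConnect = () => socket.emit("consumer");
  const handleConsumer = (msg) => onMessage(msg.data);

  socket.on("connect", handleConnect);
  socket.on("consumer", handleConsumer);

  // Cleanup: remove exactly the listeners registered above.
  return function unsubscribe() {
    socket.off("connect", handleConnect);
    socket.off("consumer", handleConsumer);
  };
}
```

In componentDidMount() the returned function could be stored, for example as this.unsubscribe = subscribe(this.socket, (data) => { ... }), and then called in componentWillUnmount() right before this.socket.disconnect().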
If you want to learn more about React.Component lifecycle methods, check out the official React docs.
Visualizing with D3.js
We want to draw the graph on an svg element using D3.js within the class component. We do that by creating a reference in the component constructor, which is attached to the svg via the ref attribute. In the constructor we have to use the createRef() method.
constructor(props) {
super(props);
this.myReference = React.createRef();
this.state = {
nodes: [],
links: []
}
this.socket = io("http://localhost:5000/", { transports: ["websocket", "polling"] })
}
Then, in the component's render() method, we add the ref attribute with the value this.myReference to the svg.
render() {
return (<div>
<h1>Community Detection</h1>
<p>Number of users that retweeted so far: {this.state.nodes.length}</p>
<svg ref={this.myReference}
style={{
height: 500, //width: "100%"
width: 900,
marginRight: "0px",
marginLeft: "0px",
background: "white"
}}></svg></div>
);
}
Now, by selecting the current attribute of the reference, it's easy to get the svg on which we are going to draw our graph.
var svg = d3.select(this.myReference.current);
If you want to know how to use D3.js within a function component, check out one of my previous blog posts: Twitch Streaming Graph Analysis - Part 2.
In the updateGraph() method we have to draw the nodes and relationships using D3.js, where nodes are colored depending on the community they belong to. We receive the community information through the cluster property of each node.
// Remove old nodes
node.exit().remove();
// Update existing nodes
node = node.data(nodes, (d) => d.id);
node = node
.enter()
.append('circle')
.attr("r", function (d) {
return 7;
})
.attr('fill', function (d) {
if (!clusterColors.hasOwnProperty(d.cluster)) {
clusterColors[d.cluster] = "#" + Math.floor(Math.random() * 16777215).toString(16)
}
return clusterColors[d.cluster]
})
.on("mouseover", function (event, d) {
// The listener receives the event first and the bound datum second
tooltip.text(d.username)
tooltip.style("visibility", "visible")
})
.on("mousemove", function (event, d) {
return tooltip.style("top", (event.y - 10) + "px").style("left", (event.x + 10) + "px"); })
.on("mouseout", function (event, d) { return tooltip.style("visibility", "hidden"); })
.call(this.drag())
.merge(node);
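One detail in the fill callback is worth a closer look: Math.floor(Math.random() * 16777215).toString(16) can yield fewer than six hex digits (for example "ff" for the value 255), and a string such as "#ff" is not a valid hex color. A minimal sketch of a padded variant follows. The name colorForCluster is hypothetical, and the random source is passed in as a parameter only to make the function easy to test.

```javascript
// Hypothetical helper: returns the color assigned to a cluster, creating a
// new random color the first time a cluster is seen.
function colorForCluster(clusterColors, cluster, random = Math.random) {
  if (!Object.prototype.hasOwnProperty.call(clusterColors, cluster)) {
    // Pick an integer in 0 .. 0xffffff and left-pad it to six hex digits.
    const value = Math.floor(random() * 0x1000000);
    clusterColors[cluster] = "#" + value.toString(16).padStart(6, "0");
  }
  return clusterColors[cluster];
}
```

The fill callback could then return colorForCluster(clusterColors, d.cluster), so every node of the same cluster keeps the same color across updates.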
First we remove the old nodes and set the node value to the new nodes data. Next, we want each node to be a circle with radius 7 (an arbitrary value that seemed quite okay to me). After that, we want each node to be colored depending on the cluster it belongs to. We have previously created a map of colors called clusterColors. When a new cluster appears, a new key-value pair is created in the map, where the key is the cluster number and the value is a randomly generated color. If the cluster of the node already exists, the color of the node is the value stored under that cluster key in the clusterColors map. Then, if we want to see usernames on hover, we need the mouseover, mousemove and mouseout events. In the next line, we call the drag() method, which allows us to drag the nodes. At the end, the new nodes are merged with the old ones with the merge() method. We add the links between the nodes in a similar manner. All that's left to do is to create the simulation on the updated nodes and links.
try {
simulation
.nodes(nodes)
.force('link', d3.forceLink(links).id(function (n) { return n.id; }))
.force(
'collide',
d3
.forceCollide()
.radius(function (d) {
return 20;
})
)
.force('charge', d3.forceManyBody())
.force('center', d3.forceCenter(width / 2, height / 2));
} catch (err) {
console.log('err', err);
}
Here we create the link force between the nodes and links. The id() accessor tells the link force to match each link's source and target to nodes by node id. Each link also has a unique id, which we created by adding the attribute id, .attr('id', (d) => d.source.id + '-' + d.target.id), to each link. That id is built from the ids of the two nodes the link connects. The collide force keeps the nodes from overlapping, given their radius. Here we have set the collision radius to 20, which is larger than 7, the radius of the nodes. The charge force makes the nodes in the graph repel each other, which spreads them apart in the visualization. In the end, we have a center force, which pulls the nodes and links toward the middle of the svg.
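A subtlety with the link ids: when the link force is initialized, D3 replaces each link's source and target identifiers with references to the matching node objects, so d.source.id is only available after that point. A minimal sketch of an id helper that works in both cases follows. The names endpointId and linkId are hypothetical and not part of D3.

```javascript
// Hypothetical helpers, not part of D3.
// An endpoint is either a raw node id (before the link force is initialized)
// or a node object with an `id` property (after initialization).
function endpointId(endpoint) {
  return endpoint !== null && typeof endpoint === "object"
    ? endpoint.id
    : endpoint;
}

// Builds the "source-target" id used for the link's id attribute.
function linkId(link) {
  return endpointId(link.source) + "-" + endpointId(link.target);
}
```

With this, the attribute could be set as .attr('id', linkId) regardless of whether the simulation has already resolved the link endpoints.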
And how does this actually look? Check out the GIF below for a preview, and if you want to start the app yourself, follow the instructions in the README in the repository.
The PageRank visualization code is similar; the notable differences are in the radius and the color of each node.
node = node
.enter()
.append('circle')
.attr("r", function (d) {
return d.rank * 1000;
})
.attr('fill', 'url(#gradient)')
.on("mouseover", function (event, d) {
// The listener receives the event first and the bound datum second
tooltip.text(d.username)
tooltip.style("visibility", "visible")
})
.on("mousemove", function (event, d) { return tooltip.style("top", (event.y - 15) + "px").style("left", (event.x + 15) + "px"); })
.on("mouseout", function (event, d) { return tooltip.style("visibility", "hidden"); })
.call(this.drag())
.merge(node);
You can see that the attribute r is proportional to rank (the calculated PageRank of each node). Also, the fill attribute is determined by the gradient created in the defineGradient() method.
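Since d.rank * 1000 can get very small for low-ranked users and large for highly ranked ones, it may help to clamp the radius. This is a minimal sketch: the factor 1000 comes from the code above, while rankToRadius and the bounds of 3 and 40 are illustrative assumptions rather than values from the repository.

```javascript
// Hypothetical helper: scales a PageRank value to a circle radius and keeps
// the result within [minRadius, maxRadius]. The bounds are assumptions.
function rankToRadius(rank, scale = 1000, minRadius = 3, maxRadius = 40) {
  // Fall back to the minimum radius when the rank is missing or not a number.
  if (!Number.isFinite(rank)) {
    return minRadius;
  }
  return Math.min(maxRadius, Math.max(minRadius, rank * scale));
}
```

The r callback could then return rankToRadius(d.rank), which keeps tiny nodes clickable and stops a single dominant user from covering the whole svg.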
Conclusion
There is still a lot left to learn about React, D3.js and WebSocket, but creating this demo application gave me pretty good insight into real-time visualization. It was fun playing with it, and I'm looking forward to learning more in the future. Also, I would like to emphasize that the Reddit network explorer application, developed by my colleagues Ivan, David and Antonio, helped me a lot. There, you can find a real-time visualization with a frontend in Angular. For any feedback or questions, ping me or the Memgraph team on our Discord server.