Realtime Data Visualization with Peak Detection using Socket.IO, D3 and React
Hatem Hassan
Posted on October 30, 2019
Originally posted on my personal blog (better formatting)
TLDR: This post combines UX research with a full-stack programming tutorial for building a web application that streams time series data and visualizes it in realtime using multiple JavaScript frameworks.
The Why?
IoT is growing rapidly these days, and one aspect that's key to the success of this kind of project is data visualization. Design and UX are actually among the most critical parts of any IoT project.
Any SME may have a very good infrastructure and complex integrations, but what good is that if their systems lack the tools to transform the gathered data into actionable insights that deliver real business value?
Collecting data from those new cool devices is one thing; visualizing it and making it accessible to your business or even your customers is another.
Use case: what is happening here?
In this project I'm building a React app that consumes data over Socket.IO from a live data source (we'll call it a Sensor) and visualizes the time series data (we'll call it Readings) in realtime using D3.js.
To simulate the Sensor, I created a very simple server using Express.js that loops over a JSON file and keeps sending the Readings one by one every 1.5 seconds.
TBH I didn't spend much time on the server development and didn't implement proper error handling/reporting because that's outside the scope of the project.
It's worth noting that this project is built to work with a stateful WebSockets API, assuming a realtime streaming use case. We could also make it support a batch REST API by polling the server periodically on a longer interval. This really depends on the nature of the data and what level of granularity you want to have. I'm assuming the customer wants to do deep analysis and monitoring down to the interval of 1.5 seconds.
Peak detection
In the world of Real-Time Signal Analysis, peak detection is a hot topic for many engineering fields including chemistry, biology, biomedical, optics, astrophysics and energy systems.
To add some spice to this sample project, I included in the sample data the Moving Z-score of each data-point beside the actual sensor value.
The Moving Z-score is a mathematical model for measuring how anomalous each point in a sequential time series is. One of its main parameters is the window size (w). Given that the moving Z-score is the number of standard deviations each data point is away from the mean, (w) limits the window over which we calculate these statistics. In this specific use case, the mean and standard deviation are computed only over the previous (w) observations.
In this scenario the Z-score is reduced to a binary (0/1) value that tells whether there's a "peak" at this point or not.
In this project I got hold of a dataset with manually pre-calculated z-scores that detect whether the sensor readings increase suddenly (aka peaks) over a certain period of time. I included the scores in the mock data on the server side.
Later, I'm planning to revisit this project to do this calculation on the client side. Since we need (w) data points before we can compute the first value, there would be a little bit of lag in initializing the visualization.
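To make the idea concrete, here is a minimal sketch of how a binary moving Z-score could be computed on the client. It is not the method used to produce the dataset in this post: the function name movingZScorePeaks, the threshold of 3 standard deviations, and the handling of a flat window are all assumptions made for illustration.

```javascript
// Hypothetical helper: flags a point as a peak (1) when it lies more than
// `threshold` standard deviations above the mean of the previous `w` points.
function movingZScorePeaks(values, w, threshold = 3) {
  return values.map((value, i) => {
    if (i < w) return 0; // not enough history yet, so no score

    const history = values.slice(i - w, i); // the previous w observations
    const mean = history.reduce((sum, v) => sum + v, 0) / w;
    const variance = history.reduce((sum, v) => sum + (v - mean) ** 2, 0) / w;
    const std = Math.sqrt(variance);

    // Assumption: with a perfectly flat window, any rise counts as a peak
    if (std === 0) return value > mean ? 1 : 0;

    return (value - mean) / std > threshold ? 1 : 0;
  });
}

const peaks = movingZScorePeaks([10, 12, 11, 13, 12, 40], 5);
console.log(peaks);
```

The first (w) entries are always 0, which is exactly the initialization lag mentioned above.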
UX Research
So now we know that we have two pieces of information at any given point in time:
- Actual sensor reading
- A binary value for whether there's a peak or not
Let's start by asking ourselves a few questions:
- How to show those 2 values effectively and clearly?
- How to create a readable and meaningful time series chart?
- How to design a unique and usable experience for the user interacting with the visualization?
Data Visualization should be "beautiful"
This is a must. Whichever use case you have or application you are building, if it's an IoT monitoring tool or a fancy marketing dashboard, an ugly looking visualization will discourage your end users from looking at it and may actually prevent them from understanding the meaning behind it and what insights the data entails.
But what is an "ugly" visualization? What makes it "beautiful" and visually appealing instead?
Well, this is subjective. There is no silver bullet. But with some conventional wisdom and lots of experience you realize that you should consider the following:
- Use the right type of diagram:
Each type of diagram was designed to represent data in a different way and to focus on one aspect of the data. Obviously, graphing the population of distant cities on a world map wouldn't be the best option, using a pie chart to display more than 2-3 variables is a big no, and so on.
Although there are some interesting visualizations for time series like Stream and Gantt charts, and there's always room for creativity and artistic improvisation in data visualization, customers tend to like what they are familiar with. And we only have two variables in this project.
Well...this is a time series. It's gonna be a line graph.
- Avoid overcrowded areas:
Too many elements in a tiny space can only be justified in very limited cases. Points in a scatter plot, for example, can be tolerated, but then again it's called a scatter plot for a reason. Only when the crowded (or unscattered) data points have the same meaning can you allow crowding them together to show density. If they have different meanings and you can't tell them apart because of the crowding, you are doing something wrong.
This is why I started my visualization by drawing it in its simplest form then adding on top of it. I removed all of the chart junk for now and will add whatever I need along the way.
- Avoid mismatching or vague colors:
Colors are very subjective too, and they are associated with different feelings. Some associations are obvious, like hot red and cool blue, but what if your data doesn't represent temperature? Also, some feelings or ideas associated with certain colors are cultural and differ from one target group to another.
There's lots of science behind color theory and why we perceive colors the way we do.
So, for this challenge, I stick with some of the famous palettes that have proved to work over time. You can use this cool color wheel by Adobe to find preset palettes or create your own based on color harmonies like Analogous, Triad, or Complementary colors. It also has an amazing feature that lets you copy the palette as CSS or Less.
For this project, I went with this simple palette that has 2 shades of green and 2 shades of red.
Compare & contrast
Comparisons and contrasts lead to conclusions
A visualization has to reflect the meaning of the data and be built as simply as possible to make comparisons easier, so the user can draw conclusions.
The first thing we need to contrast here is the Readings series with the Z-scores series. Instead of showing the two series in different graphs, we can overlay the peaks on the original signal (Readings) and decrease their opacity to 10%.
We face a problem here with scales or the unit of each value. You can't place a binary value in a line chart along a numerical value like the sensor readings.
In this case we need to improvise. To show the Z-score as a pink region over the line series in D3, I converted it to an area series to span the whole height of the graph. I normalized the 0-1 values to be 0-X where X is the highest value of readings displayed currently in view.
We also need to provide the user with a way to compare the sensors' data to each other. Why? So the customer can see if the peak pattern is happening in one sensor or in all of them, and most importantly if the pattern is happening across all sensors at the exact same time or if there's a shift.
Since I'm assuming there are only 3 sensors to visualize, we can't really use a small multiple. What we can do is stack the 3 graphs on top of each other, making sure that all graphs are horizontally aligned.
Usability & Interaction
Usability is the ease of access of an interface. It's a sub-discipline of UX. Although UX design and usability are sometimes used interchangeably, usability has grown to be more than ease of access. Usability is now measurable. Measuring usability is out of the scope of this blog post, so we will take a holistic approach towards increasing usability in general.
Since we are here, we need to introduce two new terms: Dashboards and Widgets. A dashboard shows various semi-related visualizations that deliver a shared business value but are not necessarily from the same data source. Widgets are the building blocks of a dashboard.
The cards you've been seeing throughout the previous sections are all widgets. What do we need to consider now to make each single card/widget user friendly and, most importantly, relate them to each other?
Labels & Controls
We need to show several labels to guide users where to look and help them understand what they are looking at. For this project we need to include the following:
- Titles: dashboard title and sensor title.
- Connectivity indicator: Here I'm assuming that the dashboard may get disconnected from a sensor for any reason. This happens a lot in IoT applications. We need to inform the user if a chart is outdated.
- Time series legend: This will have 2 functions, it will tell the user which is the actual reading and which is the peak area, and it will act as a toggle so the user can show/hide one of the two variables.
- Axes: Besides showing the units and values of each dimension, we need to make it clear in which direction time is moving.
- (Extra element) Last reading timestamp: Since I'm truncating the timestamps on the x-axis to show only the seconds (:20, :30,...), I added the full timestamp of the last reading in the bottom right corner of the widget.
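The truncated x-axis labels can be sketched as a small formatter. The name secondsLabel is hypothetical and this is not the formatter used in the project; it only illustrates the ":20, :30" style of label described above.

```javascript
// Hypothetical tick formatter: keeps only the seconds part of a timestamp.
// UTC seconds are used so the output does not depend on the local time zone.
function secondsLabel(timestamp) {
  const seconds = new Date(timestamp).getUTCSeconds();
  return ':' + String(seconds).padStart(2, '0');
}

// Example with a fixed timestamp (2019-10-30 12:00:20 UTC)
console.log(secondsLabel(Date.UTC(2019, 9, 30, 12, 0, 20)));
```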
States
Any frontend component goes through a cycle of several states. These states are driven by the business logic. In our case we have the following states for each Widget:
Connecting:
This is the initial state when the page is loading and we don't have enough information to show to the user.
Disconnected:
This is when a widget is disconnected due to a server or client error. We also show the HTTP error message for debugging and helping users report their issues.
In this scenario we can't rely only on the Connectivity indicator; we need to explain to the user that the data currently in view is not live. So we set the whole line graph opacity to 50%.
Connected:
Everything is perfect.
(Extra UI state) Mouseout:
This is mainly to make the visualization less cluttered and more visually appealing.
Although this is debatable and some designers don't favor it, I removed the x-axis and last Reading timestamp if the user is not hovering on a Widget.
My rationale behind this is that the customer is not really concerned with the exact time of each point but rather the main focus points of this visualization are the pink shaded Peak areas.
If users really want to know when that happened, they can hover on the graph.
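One way to think about these states is as a pure function of the flags the widget keeps. The sketch below is an assumption about how they could be derived, using the lastTimestamp, connected and error flags that appear in the component state later in this post; the function name widgetStatus is hypothetical.

```javascript
// Hypothetical helper that maps widget flags to one of the three UI states.
function widgetStatus({ lastTimestamp, connected, error }) {
  if (error) return 'disconnected';   // any reported error wins
  if (connected) return 'connected';  // readings are arriving
  // No reading has ever arrived: still waiting for the first data point
  return lastTimestamp ? 'disconnected' : 'connecting';
}

console.log(widgetStatus({ lastTimestamp: null, connected: false, error: '' }));
```

Keeping this logic in one place makes it easier to keep the title, the connectivity indicator and the faded chart in sync.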
The How
Engineering the solution
The frontend app contains two main classes:
- Chart: src/components/Chart.js
- This is the main React component that connects to the relevant sensor to stream readings, stores them, does some data manipulation, and finally initializes and updates the D3 chart.
- The React component has 1 required prop, sensorId, and an optional prop, x-ticks, which has a default value of 20 and a max value of 50.
- D3TsChart: src/d3-helpers/d3-ts-chart.js
- This is the custom class that handles the Time Series Chart graphics, and everything related to the chart SVG.
- Readings are passed to this class to be rendered in the DOM using D3 but are never stored in the class itself. Data lives in the Chart component state.
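The handling of the optional x-ticks prop is not shown in this post. A minimal sketch of the default and the cap, assuming invalid values simply fall back to the default (the names resolveXTicks, DEFAULT_X_TICKS and MAX_X_TICKS are hypothetical):

```javascript
const DEFAULT_X_TICKS = 20; // default stated in the prop description
const MAX_X_TICKS = 50;     // upper bound stated in the prop description

// Hypothetical helper: returns a usable tick count for any prop value.
function resolveXTicks(requested) {
  const n = Number(requested);
  if (!Number.isFinite(n) || n < 1) return DEFAULT_X_TICKS; // missing or invalid
  return Math.min(Math.floor(n), MAX_X_TICKS);              // cap at the maximum
}

console.log(resolveXTicks(undefined), resolveXTicks(30), resolveXTicks(80));
```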
File organization:
:root // React component
  > api // Express JS App
  > src
    > components
    > d3-helpers
    > styles
  > public
Backend
The backend server is very simple: it's just a single Express.js file along with the data.json file.
The data file contains mock data for 3 sensors. You can connect to the socket by pinging http://localhost:4001?sensor={sensorId}. sensorId can only be 1, 2 or 3 for now.
You can begin by creating the api folder and installing the 2 needed packages:
npm install -s socket.io express
First, we need to import the server requirements, initialize the Express.js server (app) and wrap it with the Socket.IO server (io). We will also import the JSON data and set a const INTERVAL of 1.5 seconds. This is how frequently we will emit data to each client connected to the server.
const http = require('http');
const express = require('express');
const socketIO = require('socket.io');
const app = express();
const server = http.createServer(app);
const io = socketIO(server);
const port = process.env.PORT || 4001;
const INTERVAL = 1500;
const sensorData = require('./data.json');
To keep track of each client connected to the server, we will create a custom object that records (1) which sensor data was requested by the client, (2) the index of the next data point to serve, and (3) the setInterval reference that will emit data every 1.5 seconds (INTERVAL). Then we will store one such object per client in a custom dictionary attached to the Socket.IO io object.
// Connection object interface
// {
// sensorId,
// index,
// interval
// }
io.connections = {};
The idea behind storing the setInterval reference is that we need to emit the data periodically, and we will also need to stop (clearInterval) this interval when a client disconnects from the server.
Now we need to listen to and handle the clients' connect and disconnect events in the Socket.IO server, and then emit data accordingly using the emitData function:
io.on('connection', (socket) => {
  const connectionId = socket.id;
  const sensorId = Number(socket.handshake.query['sensor']); // parse the sensorId
  console.log(`New client connected with id:${connectionId}`);

  // Add a client connection to the custom dictionary
  io.connections[connectionId] = {
    sensorId,
    index: 0,
    interval: setInterval(() => emitData(connectionId, socket), INTERVAL)
  };

  // Stop emitting and remove the connection when the client disconnects
  socket.on('disconnect', () => {
    clearInterval(io.connections[connectionId].interval);
    delete io.connections[connectionId];
    console.log(`Client ${connectionId} disconnected`);
  });
});
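The handler above trusts the sensor query parameter. Since the server skips error handling, here is a minimal sketch of how the parameter could be validated before it is used to index the data file. The helper name parseSensorId is hypothetical and not part of the project.

```javascript
// Hypothetical helper: returns a valid 1-based sensor id, or null when the
// query value is missing, not a number, or out of range.
function parseSensorId(raw, sensorCount) {
  const id = Number(raw);
  return Number.isInteger(id) && id >= 1 && id <= sensorCount ? id : null;
}

console.log(parseSensorId('2', 3));   // a valid id
console.log(parseSensorId('abc', 3)); // rejected
```

A server could then refuse the connection when the helper returns null instead of crashing later on an undefined sensor entry.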
After that we need to implement the emitData() function, which basically:
- Selects the relevant sensor data from the data file
- Calls getNextReading(), which gets one Reading from the data file
- Stores the next Reading index in the connection object we created in the io.connections dictionary in the last snippet
- Emits the reading with the event name 'reading'. We will listen to this in the client app in the next section.
const emitData = (connectionId, socket) => {
  const conn = io.connections[connectionId];
  // Sensor ids are 1-based while the data array is 0-based
  const { newIndex, response } = getNextReading(sensorData[conn.sensorId - 1], conn.index);
  console.log(`Emitted to client: ${connectionId}, sensor id:${conn.sensorId}, index: ${conn.index}`);
  socket.emit("reading", JSON.stringify(response));
  conn.index = newIndex;
}

// Get the next reading for the selected socket
const getNextReading = (data, index) => {
  const response = {
    timestamp: Date.now(),
    value: data.readings[index],
    zscore: data.zScores[index]
  };
  // Wrap around to the first reading once the end of the data is reached
  return { newIndex: (index + 1) % data.readings.length, response };
}
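To see how the index wraps around, here is a small self-contained run of the same logic against made-up data. The sampleSensor object is hypothetical; it only mimics the readings/zScores shape that the snippet above reads from data.json.

```javascript
// Hypothetical sample shaped like one sensor entry in data.json
const sampleSensor = { readings: [10, 12, 40], zScores: [0, 0, 1] };

// Same logic as the server helper, repeated here so the example runs on its own
const getNextReading = (data, index) => {
  const response = {
    timestamp: Date.now(),
    value: data.readings[index],
    zscore: data.zScores[index]
  };
  return { newIndex: (index + 1) % data.readings.length, response };
};

// Simulate five emissions for one connection
let index = 0;
const emitted = [];
for (let i = 0; i < 5; i++) {
  const { newIndex, response } = getNextReading(sampleSensor, index);
  emitted.push(response.value);
  index = newIndex;
}

console.log(emitted); // the series restarts after the third reading
```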
Now if you add "start": "node index.js" to the scripts property in the package.json file and then run npm start in the api folder, the server will be up and running and ready to serve clients.
We can test the server using this awesome Electron app ...or go to the next section and start implementing the React app.
Frontend
As mentioned before, the client app will basically contain the main React component Chart, which renders one chart and is responsible for controlling and passing data to the D3.js chart that lives inside a separate custom d3-ts-chart class.
React App and Chart component
To initialize the React app we will use create-react-app. You can install it globally by running npm i -g create-react-app.
Then, to initialize the actual code template, we run create-react-app realtime-client. This will create a folder with the name "realtime-client" and npm install the needed packages inside it.
If you cd into the folder and run npm start (which runs react-scripts start), you should have a simple React app built and served in your browser on http://localhost:3000/.
Note that this will be the root folder of the project, and the backend server will live in a subdirectory inside it with the name api.
Now we need to install the extra packages we will use in the project. cd into the folder and run npm i -s socket.io-client node-sass d3.
I'm using node-sass to write the app styles, which means you need to rename all the .css files to .scss and change the references in the index.js file.
Let's build a component
The final Chart component is a big one. I will focus on the important parts here.
We will need to define some basic stuff:
- The series list: a list of information about the series/lines that will be graphed. This is what we will pass to the D3TsChart later to initialize the chart.
- tsChart is the D3TsChart object that we will code later, and it's the one responsible for all D3-related operations.
- socket is the Socket.IO client object that we will use to connect to the server and listen to the data.
- State: the React component state in which we will store the data and some info and flags about the chart.
So the initial component should start as follows:
import React from 'react';
import ReactDOM from 'react-dom';
import socketIOClient from 'socket.io-client';
import D3TsChart from '../d3-helpers/d3-ts-chart';

export class Chart extends React.Component {
  seriesList = [
    {
      name: 'sensor-data',
      type: 'LINE',
      stroke: '#038C7E',
      strokeWidth: 5,
      label: 'Readings',
      labelClass: 'readings',
    },
    {
      name: 'z-score',
      type: 'AREA',
      fill: 'rgba(216, 13, 49, 0.2)',
      stroke: 'transparent',
      strokeWidth: 0,
      label: 'Peaks',
      labelClass: 'z-score',
    }
  ]
  tsChart = new D3TsChart();
  socket;
  state = {
    data: [],
    lastTimestamp: null,
    connected: false,
    error: ''
  }

  componentDidMount() { }

  render = () => (
    <div className="card">
      <div className='chart-container'></div>
    </div>
  )
}
export default Chart;
Now we need to connect to the Socket.IO server and fetch data for one sensor by its id. We will pass sensorId to the component as a prop. This should be done in the componentDidMount() function. After passing the component HTML element reference to tsChart and initializing the 2 series to be drawn by D3, it will call the connect() function, and it will disconnect() in componentWillUnmount().
Also notice that we listen to the "reading" event coming from the server and attach the storeReading handler to it.
componentDidMount():
componentDidMount() {
  if (this.props['sensorId'] === undefined) throw new Error('You have to pass "sensorId" prop to Chart component');

  // Component enclosing DIV HTML reference.
  const parentRef = ReactDOM.findDOMNode(this);
  this.tsChart.init({
    // Let D3 draw the chart SVG inside .chart-container div
    elRef: parentRef.getElementsByClassName('chart-container')[0],
    classList: {
      svg: 'z-chart'
    }
  });
  this.tsChart.addSeries(this.seriesList[0]); // readings
  this.tsChart.addSeries(this.seriesList[1]); // z-score
  this.connect();
}

connect = () => {
  this.socket = socketIOClient(`/?sensor=${this.props.sensorId}`);
  this.socket.on('reading', this.storeReading);
  // Various Errors handling
  SOCKETIO_ERRORS.forEach(errType => {
    this.socket.on(errType, (error) => this.setError(errType, error));
  });
}

componentWillUnmount() {
  this.socket.disconnect();
}
The Socket.IO error event names and other constants are defined at the top of the file:
const SOCKETIO_ERRORS = ['reconnect_error', 'connect_error', 'connect_timeout', 'connect_failed', 'error'];
const MAX_POINTS_TO_STORE = 20;
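The setError handler referenced in connect() is not shown in this post. Below is a minimal sketch of the state update it might produce, assuming it only needs to flag the widget as disconnected and keep a readable message. The helper name errorState and the message format are hypothetical.

```javascript
// Hypothetical pure helper; inside the component it could be used as
// this.setState(errorState(errType, error)).
function errorState(errType, error) {
  // Socket.IO may pass an Error object, a string, or nothing at all
  const detail = error && error.message ? error.message : String(error || '');
  return {
    connected: false,
    error: detail ? `${errType}: ${detail}` : errType
  };
}

console.log(errorState('connect_error', new Error('xhr poll error')));
```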
Now we need to implement the storeReading function that will store the readings in the component state and pass the new data to the tsChart object.
We first append the new reading to the current data, then we update state.data with the last MAX_POINTS_TO_STORE items. We also store some metadata like the connected indicator and the lastTimestamp to be displayed in the UI. Lastly, we call the updateChart() method.
storeReading():
storeReading = (response) => {
  const reading = JSON.parse(response);
  this.setState((prevState) => {
    // Copy instead of mutating the previous state, then keep only the
    // last MAX_POINTS_TO_STORE readings
    const data = [...prevState.data, reading].slice(-MAX_POINTS_TO_STORE);
    return {
      data,
      connected: true,
      error: false,
      lastTimestamp: new Date(reading.timestamp).toLocaleTimeString()
    };
  }, () => this.updateChart()); // setState is asynchronous: redraw only once the state is updated
}
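The buffering step can also be looked at in isolation. This is a minimal sketch of a fixed-size sliding window as a pure function, assuming a positive window size; the name appendReading is hypothetical and not part of the component.

```javascript
const MAX_POINTS_TO_STORE = 20;

// Hypothetical helper: returns a new array holding at most `maxPoints`
// of the most recent readings, without mutating the input array.
function appendReading(data, reading, maxPoints = MAX_POINTS_TO_STORE) {
  return [...data, reading].slice(-maxPoints);
}

// Feed 25 readings through the window
let data = [];
for (let i = 0; i < 25; i++) {
  data = appendReading(data, { timestamp: i, value: i });
}

console.log(data.length, data[0].value, data[data.length - 1].value);
```

Returning a new array rather than pushing into the old one keeps the previous React state untouched, which is what setState expects.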
updateChart() is implemented as a separate function because this is where we calculate highestValueInView from the Readings series. This is done so we can normalize the 0/1 z-scores by replacing the 1s with the highest value. This will essentially make the Peaks Area series take up the whole height of the current data in view.
updateChart():
updateChart() {
  const data = this.state.data;
  const highestValueInView = Math.max(...data.map(p => p.value));
  // Stretch each flagged point to the top of the visible range
  const zLine = data.map(p => ({
    timestamp: p.timestamp,
    value: p.zscore ? highestValueInView : 0
  }));

  this.tsChart.adjustAxes(data);
  this.tsChart.setSeriesData('sensor-data', data, false);
  this.tsChart.setSeriesData('z-score', zLine, false);
}
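As a quick illustration of the normalization, here is the same mapping as a standalone function applied to three made-up readings. The function name toPeakArea, the empty-input guard and the sample values are assumptions for this sketch.

```javascript
// Hypothetical standalone version of the z-score normalization:
// every flagged point is stretched to the highest reading in view.
function toPeakArea(data) {
  if (data.length === 0) return []; // Math.max() of nothing would be -Infinity
  const highestValueInView = Math.max(...data.map((p) => p.value));
  return data.map((p) => ({
    timestamp: p.timestamp,
    value: p.zscore ? highestValueInView : 0
  }));
}

const sample = [
  { timestamp: 1, value: 10, zscore: 0 },
  { timestamp: 2, value: 40, zscore: 1 },
  { timestamp: 3, value: 25, zscore: 0 }
];

console.log(toPeakArea(sample));
```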
This is all the basic logic needed to pass the data to the D3TsChart class object.
Now we need to update the render() function to display the metadata we store in state:
final render():
render = () => (
  <div className="card">
    <h2>{!this.state.lastTimestamp ? 'Connecting...' : `Sensor ${this.props.sensorId}`}</h2>
    <span className={'status ' + (this.state.connected ? 'success' : 'danger')}>
      {this.state.error}
      <i className="pulse"></i>
      {this.state.connected ? 'Connected' : 'Disconnected'}
    </span>
    <div className={'chart-container ' + (this.state.error ? 'faded' : '')}></div>
    <span className={'timestamp ' + (this.state.connected ? 'success' : 'danger')}>
      {this.state.connected ? '' : 'Last reading was at '}
      {this.state.lastTimestamp}
    </span>
  </div>
)
Finally, we need to update the React index.js to include the charts for the 3 Sensors we can fetch from the API.
index.js:
import React from 'react';
import ReactDOM from 'react-dom';
import './styles/main.scss';
import Chart from './components/Chart';

ReactDOM.render(
  <div>
    <h1>Peak Detection Dashboard</h1>
    <Chart sensorId="1" />
    <Chart sensorId="2" />
    <Chart sensorId="3" />
  </div>
  , document.getElementById('root'));
You can find all the needed scss styles in the styles directory.
D3 Time Series line graph
Here's where all the actual "graphing" happens. This is the class where we import the D3.js library and use it to append the different SVG elements to the HTML element stored in elRef.
We need to set some constants like TRANSITION_DURATION and MAX_Y_TICKS, and for now we only support two SERIES_TYPES in graphing: LINE and AREA.
So this is how we start with the basic class:
import * as d3 from 'd3';

const SERIES_TYPES = ['LINE', 'AREA'];
const TRANSITION_DURATION = 100;
const MAX_Y_TICKS = 6;

export default class D3TsChart {
  margin = { top: 10, right: 30, bottom: 30, left: 30 };
  outerWidth; outerHeight;
  // draw() and addSeries() below rely on these fields. The scale types are
  // an assumption: a time scale for timestamps, a linear scale for values.
  xScale = d3.scaleTime();
  yScale = d3.scaleLinear();
  seriesDict = {};

  init({ elRef, width, height, classList }) {
    this.elRef = elRef;

    // If no width/height specified, SVG will inherit container element dimensions
    if (width === undefined) this.responsiveWidth = true;
    if (height === undefined) this.responsiveHeight = true;

    this.outerWidth = width || this.elRef.offsetWidth;
    this.outerHeight = height || this.elRef.offsetHeight;
    this.classList = classList || {};
    this.draw();
  }
}
You will notice that we pass some initial config to the chart in the init function, including a width and height, which are used to set up the graph layout according to the Margin Convention.
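The arithmetic behind the Margin Convention is small enough to show on its own. This is a minimal sketch using the margin values from the class above; the helper name innerSize and the 600x200 outer size are made up for the example.

```javascript
// Hypothetical helper: the drawable area is the outer SVG size minus margins.
function innerSize(outerWidth, outerHeight, margin) {
  return {
    width: outerWidth - margin.left - margin.right,
    height: outerHeight - margin.top - margin.bottom
  };
}

const margin = { top: 10, right: 30, bottom: 30, left: 30 };
const inner = innerSize(600, 200, margin);

console.log(inner); // the space left for the series once axes get their room
```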
draw():
draw() {
  // Main SVG
  this.svg = d3.select(this.elRef)
    .append('svg')
    .attr('width', this.outerWidth)
    .attr('height', this.outerHeight)
    .classed(this.classList.svg || null, true);

  // Inner box group (deducting margins)
  this.group = this.svg.append('g')
    .attr('width', this.outerWidth - this.margin.left - this.margin.right)
    .attr('height', this.outerHeight - this.margin.top - this.margin.bottom)
    .attr('transform', `translate(${this.margin.left} , ${this.margin.top})`)
    .classed(this.classList.group || null, true);

  // X Axis init
  this.xScale
    .range([0, this.outerWidth - this.margin.left - this.margin.right]);
  this.xAxisRef = this.group.append('g')
    .attr('transform', `translate(0,${this.outerHeight - this.margin.bottom})`)
    .classed('x-axis', true);

  // Y Axis init
  this.yScale
    .range([this.outerHeight - this.margin.bottom, 0]);
  this.yAxisRef = this.group.append('g')
    .attr('transform', 'translate(0, 0)')
    .classed('y-axis', true);
}
Here we set the main SVG (with margins) and the inner group then we set the scales for X-axis and Y-axis.
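Note that the Y range is inverted because SVG coordinates grow downwards. The sketch below uses a plain-JavaScript stand-in for a linear scale to show the effect; it is not D3's implementation, and the domain and range numbers are made up.

```javascript
// Plain-JavaScript stand-in for a continuous linear scale (illustration only)
function linearScale([d0, d1], [r0, r1]) {
  return (value) => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

// Domain: readings from 0 to 40. Range: pixel 170 (bottom) up to pixel 0 (top).
const yScale = linearScale([0, 40], [170, 0]);

console.log(yScale(0), yScale(20), yScale(40));
```

The lowest reading lands on the largest pixel value (the bottom of the chart) and the highest reading lands on 0 (the top).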
Now we need to add the functions that will draw the series (line and area) inside the SVG.
Add Series:
addSeries({ name, type, fill, stroke, strokeWidth, id }) {
  if (this.seriesDict[name]) throw new Error('Series name must be unique!');
  if (!SERIES_TYPES.includes(type)) throw new Error('Series type not supported!');

  this.seriesDict[name] = {
    type,
    ref: this.group.append('path')
      .attr('fill', fill || 'none')
      .attr('stroke', stroke || 'black')
      .attr('stroke-width', strokeWidth || 2)
      .classed('series', true)
      .classed('hidden', false)
  };
}

setSeriesData(name, data, adjustAxes = true) {
  const series = this.seriesDict[name];

  switch (series.type) {
    case 'AREA':
      this.updateAreaSeries(series, data);
      break;
    case 'LINE':
    default:
      this.updateLineSeries(series, data);
      break;
  }
}
Updating data of a single series:
updateLineSeries(series, data) {
  series.ref
    .datum(data)
    .transition().duration(TRANSITION_DURATION).ease(d3.easeQuadIn)
    .attr('d', d3.line()
      .x((d) => { return this.xScale(d.timestamp); })
      .y((d) => { return this.yScale(d.value); })
    );
}

updateAreaSeries(series, data) {
  series.ref
    .datum(data)
    .transition().duration(TRANSITION_DURATION).ease(d3.easeQuadIn)
    .attr('d', d3.area()
      .x((d) => { return this.xScale(d.timestamp); })
      .y0(this.yScale(0))
      .y1((d) => {
        return this.yScale(d.value);
      })
    );
}
Then finally we will have a function to adjust the axes to the current data in view.
adjustAxes():
adjustAxes(data) {
  const maxValue = d3.max(data, (d) => d.value);

  this.xScale.domain(d3.extent(data, (d) => d.timestamp));
  this.xAxisRef
    .transition().duration(TRANSITION_DURATION).ease(d3.easeLinear)
    .call(d3.axisBottom(this.xScale));

  this.yScale.domain([0, maxValue]);
  this.yAxisRef
    .transition().duration(TRANSITION_DURATION).ease(d3.easeLinear)
    .call(
      d3.axisLeft(this.yScale)
        .ticks(maxValue < MAX_Y_TICKS ? maxValue : MAX_Y_TICKS)
        .tickFormat(d3.format('d'))
    );
}
You can take a deeper look at this class in the D3TsChart definition file src/d3-helpers/d3-ts-chart.js.
Deploying to Heroku
To deploy this app to Heroku we need to set up the app so it:
- Builds the React app and moves the static web app to api/public
- Runs the Express.js server
We can do so by adding the proper commands to the root package.json file.
package.json:
...
"scripts": {
  "start": "node api/index.js",
  "prebuild": "rm -rf api/public",
  "build": "react-scripts build",
  "postbuild": "mv build api/public"
}
...
Heroku will automatically detect that this is a Node.js app and will run the npm commands correctly.
The final step here is to set up the ExpressJS app so it serves the static app in the api/public directory.
ExpressJS index.js:
app.use(express.static(__dirname + '/public'));
That's all folks.