Quick tip: Using Uber's H3 to visualise British Transport Police crime data
Akmal Chaudhri
Posted on October 4, 2022
Abstract
Creating visualisations can often be a great way to present data. In this short article, we'll apply Uber's H3 Hexagonal Hierarchical Spatial Index to British Transport Police (BTP) crime data, and then visualise the results. We'll use a local Jupyter installation as our development environment.
The notebook file used in this article is available on GitHub.
Introduction
In a previous article, we discussed how to map crimes and visualise hot routes. We'll extend that work in this article by obtaining the latest BTP crime data for the UK and applying Uber's H3 library to it.
Obtain BTP crime data
The file we need is 2024-08-btp-street.csv. This can be generated from the Data Downloads page. On that page, we'll select the following:
- Date range: August 2024 to August 2024.
- Forces: Check (✔) British Transport Police.
- Data sets: Check (✔) Include crime data.
- Generate file.
The download will be a zip file, and the CSV file we need can be extracted from that.
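The extraction step can also be scripted. The following is a minimal sketch using Python's standard zipfile module. The archive name in the usage comment is a placeholder (the real download will be named differently), and the sketch assumes the CSV may sit inside a sub-folder of the archive.

```python
import zipfile
from pathlib import Path


def extract_csv(zip_path, csv_name, dest_dir="."):
    """Extract the first archive member whose file name is csv_name."""
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            if Path(member).name == csv_name:
                # Write the file directly into dest_dir, ignoring
                # any folder structure inside the archive.
                target = Path(dest_dir) / csv_name
                target.write_bytes(archive.read(member))
                return target
    raise FileNotFoundError(f"{csv_name} not found in {zip_path}")


# Hypothetical usage; replace the archive name with the downloaded file:
# extract_csv("btp-download.zip", "2024-08-btp-street.csv")
```

Copying only the matching member keeps the working directory tidy if the archive contains other files.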
Notebook
Let's now start to fill out our notebook.
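First, we'll need to install the following. The code in this article uses the h3 version 3 function names (geo_to_h3 and h3_to_geo_boundary), which were renamed in version 4, so we'll pin the version:
!pip install geopandas "h3<4" pandas --quiet --no-warn-script-location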
Next, we'll import some libraries:
import geopandas as gpd
import pandas as pd
from h3 import h3
from shapely.geometry import Polygon
Next, we'll read the CSV file into a pandas DataFrame, keep the column we need and create a GeoPandas GeoDataFrame with the correct coordinate reference system (EPSG:4326, i.e. WGS 84 longitude and latitude).
df = pd.read_csv("2024-08-btp-street.csv")
crimes = gpd.GeoDataFrame(
    df["Crime type"],
    geometry = gpd.points_from_xy(df.Longitude, df.Latitude),
    crs = "EPSG:4326"
)
crimes.head(5)
The output should be similar to the following:
                     Crime type                   geometry
0                 Bicycle theft  POINT (-0.32764 50.83438)
1                 Vehicle crime  POINT (-0.32764 50.83438)
2                   Other theft  POINT (-0.23643 50.83255)
3  Violence and sexual offences  POINT (-0.23643 50.83255)
4                   Other theft  POINT (-3.55862 54.64503)
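Should the CSV contain rows with blank coordinates, they are worth spotting before the H3 conversion. The following is a minimal sketch, using only the standard csv module, that counts such rows. The column names come from the file used above. The in-memory sample is made up for illustration, and count_missing_coordinates is a hypothetical helper name.

```python
import csv
import io


def count_missing_coordinates(csv_file):
    """Return (total_rows, rows_missing_longitude_or_latitude)."""
    total = 0
    missing = 0
    for row in csv.DictReader(csv_file):
        total += 1
        longitude = (row.get("Longitude") or "").strip()
        latitude = (row.get("Latitude") or "").strip()
        # Treat absent or blank values as missing.
        if not longitude or not latitude:
            missing += 1
    return total, missing


# Small in-memory sample (made-up rows) standing in for the real file.
sample = io.StringIO(
    "Crime type,Longitude,Latitude\n"
    "Bicycle theft,-0.32764,50.83438\n"
    "Other theft,,\n"
)
print(count_missing_coordinates(sample))
```

For the real file, the same function can be called on open("2024-08-btp-street.csv", newline="").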
We'll now convert our geometry to H3 using code from an excellent article. Initially, we'll set h3_level (the H3 resolution) to 5 and then we'll try it with a smaller value, which produces larger hexagons.
h3_level = 5
# https://spatialthoughts.com/2020/07/01/point-in-polygon-h3-geopandas/
def lat_lng_to_h3(row):
    return h3.geo_to_h3(
        row.geometry.y, row.geometry.x, h3_level
    )
crimes["h3"] = crimes.apply(lat_lng_to_h3, axis = 1)
crimes.head(5)
The output should be similar to the following:
                     Crime type                   geometry               h3
0                 Bicycle theft  POINT (-0.32764 50.83438)  85194a73fffffff
1                 Vehicle crime  POINT (-0.32764 50.83438)  85194a73fffffff
2                   Other theft  POINT (-0.23643 50.83255)  85194a73fffffff
3  Violence and sexual offences  POINT (-0.23643 50.83255)  85194a73fffffff
4                   Other theft  POINT (-3.55862 54.64503)  85195097fffffff
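To get a feel for what h3_level controls: H3 has 2 + 120 × 7^r cells globally at resolution r, so each step down in resolution makes the average cell roughly seven times larger. The sketch below derives approximate average cell areas from that formula. The Earth surface area constant is an approximation and real cells vary in size, so the numbers are rough guides only.

```python
# Approximate surface area of the Earth in square kilometres.
EARTH_SURFACE_KM2 = 510_065_622


def h3_cell_count(resolution):
    """Total number of H3 cells at a given resolution (0 to 15)."""
    return 2 + 120 * 7 ** resolution


def approx_avg_cell_area_km2(resolution):
    """Average cell area, assuming cells evenly share the Earth's surface."""
    return EARTH_SURFACE_KM2 / h3_cell_count(resolution)


for level in (3, 5):
    print(level, h3_cell_count(level), round(approx_avg_cell_area_km2(level)))
```

On these assumptions, a level 5 cell averages roughly 253 square kilometres and a level 3 cell roughly 12,400 square kilometres, which is why the lower level gives a much coarser map.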
Next, we'll aggregate the number of crimes:
# https://spatialthoughts.com/2020/07/01/point-in-polygon-h3-geopandas/
counts = (crimes.groupby(["h3"])
    .h3.agg("count")
    .to_frame("count")
    .reset_index()
)
counts.head(5)
The output should be similar to the following:
                h3  count
0  851870d3fffffff      1
1  851870dbfffffff      1
2  85187433fffffff      1
3  85187463fffffff      1
4  8518746bfffffff      1
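Conceptually, the groupby is a tally of how many crime rows share each H3 index. The sketch below shows the same idea with Python's collections.Counter on a short hand-written list. The indexes are copied from the earlier sample output rather than taken from a full run.

```python
from collections import Counter

# Hand-written sample: one H3 index per crime row.
h3_indexes = [
    "85194a73fffffff",
    "85194a73fffffff",
    "85194a73fffffff",
    "85194a73fffffff",
    "85195097fffffff",
]

# Tally rows per cell, like groupby("h3") followed by a count.
cell_counts = Counter(h3_indexes)

# most_common() lists the busiest cells first.
for cell, count in cell_counts.most_common():
    print(cell, count)
```

Sorting by count in this way is a quick route to the busiest cells, which the default index ordering of the counts table does not show.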
Now, we'll convert the H3 indexes to polygons that can be visualised:
# https://spatialthoughts.com/2020/07/01/point-in-polygon-h3-geopandas/
def add_geometry(row):
    points = h3.h3_to_geo_boundary(
        row["h3"], True
    )
    return Polygon(points)
counts["geometry"] = counts.apply(add_geometry, axis = 1)
counts.head(5)
The output should be similar to the following:
                h3  count                                           geometry
0  851870d3fffffff      1  POLYGON ((-5.3895450275705175 50.2383686006280...
1  851870dbfffffff      1  POLYGON ((-5.618770916484385 50.21578200467119...
2  85187433fffffff      1  POLYGON ((-4.402362230572691 50.46432248133249...
3  85187463fffffff      1  POLYGON ((-5.0895660767735675 50.4018208682358...
4  8518746bfffffff      1  POLYGON ((-5.319050950902007 50.38000939979234...
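The True passed to h3_to_geo_boundary requests GeoJSON-style output: (longitude, latitude) pairs with the first vertex repeated at the end, which matches the (x, y) order that Shapely expects. Without it, the function returns (latitude, longitude) pairs. The helper below is a hypothetical illustration of that reordering, written without h3 so that it can be run on its own. The coordinates are made up.

```python
def to_geojson_ring(lat_lng_boundary):
    """Convert (lat, lng) vertices into a closed ring of (lng, lat) pairs."""
    ring = [(lng, lat) for lat, lng in lat_lng_boundary]
    # Close the ring by repeating the first vertex, if needed.
    if ring and ring[0] != ring[-1]:
        ring.append(ring[0])
    return ring


# Made-up triangle in (lat, lng) order, for illustration only.
boundary = [(50.0, -5.0), (50.1, -5.1), (50.2, -5.0)]
print(to_geojson_ring(boundary))
```

Getting this order wrong is a common cause of polygons being drawn in the wrong part of the world, so it is worth checking if the hexagons do not appear over the UK.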
We'll also ensure that we have the correct coordinate system:
crimes_h3 = gpd.GeoDataFrame(counts, crs = "EPSG:4326")
crimes_h3.head(5)
The output should be similar to the following:
                h3  count                                           geometry
0  851870d3fffffff      1  POLYGON ((-5.38955 50.23837, -5.48931 50.18359...
1  851870dbfffffff      1  POLYGON ((-5.61877 50.21578, -5.71845 50.16074...
2  85187433fffffff      1  POLYGON ((-4.40236 50.46432, -4.5027 50.41076,...
3  85187463fffffff      1  POLYGON ((-5.08957 50.40182, -5.18967 50.34749...
4  8518746bfffffff      1  POLYGON ((-5.31905 50.38001, -5.41907 50.32542...
Finally, we'll plot the data:
btp_crimes = crimes_h3.plot(
    column = "count",
    cmap = "OrRd",
    edgecolor = "black",
    figsize = (7, 7),
    legend = True,
    legend_kwds = {
        "label" : "Number of crimes",
        "orientation" : "vertical"
    }
)
btp_crimes.set_axis_off()
btp_crimes.plot()
h3_level set to 5 will render the chart shown in Figure 1.
Changing the value of h3_level to 3 and re-running the code will render the chart shown in Figure 2.
London and the South East have higher crime numbers than other parts of the United Kingdom.
Summary
Using Uber's H3, we have been able to create some useful charts. H3 could be used in many different application domains. Feel free to experiment with different h3_level settings and also try your own dataset.