MongoDB Geospatial Queries: How to query based on geographic location
Bruno Henrique Koga Fazano
Posted on March 26, 2024
Hi folks! In this article I’ll give a quick introduction to geospatial queries with MongoDB, along with a simple example. This is a topic I’ve learned about recently, and even though I don’t use it on a daily basis I found it very interesting! I hope it helps, or at least arouses someone’s curiosity.
What are Geospatial Queries
These are a type of database query that involves the retrieval and analysis of data based on its geographical or spatial characteristics. Geospatial queries are used to search for information based on its location on Earth. They are especially useful for analyzing and interacting with spatial data, which can come from many sources, such as maps, GPS coordinates or simple data entries for latitude and longitude.
For example, suppose you are using an app called Ubeer, which lets you call a driver to deliver beer to your door. Your house is located at latitude X and longitude Y. The app should be able to query all the drivers nearby, within a 2 kilometer radius for instance, and then notify them about your request! That’s where geospatial queries come in handy. The drivers’ locations are updated in real time and we need a quick way to find out who is reachable.
How does it work?
First, data needs to be stored appropriately (for example, keeping track of latitude and longitude). Then spatial indexing comes in to enable fast querying. Spatial indexing involves the creation of data structures and indexing mechanisms that organize the data in a way that accelerates query processing. The choice of database technology and indexing method depends on the specific requirements of the application.
Discussing the different indexing methods is out of the scope of this article but I’ll give two quick examples to make it a little bit clearer how this might work.
R-Tree
The R stands for “rectangle”. This method is widely used and consists of organizing the objects into a hierarchical tree structure. Each node represents a bounding box that encloses some data points. The root node’s box encloses all objects, each internal node holds the smaller bounding boxes of its children, and the leaf nodes hold the actual data points. A search only has to descend into the boxes that overlap the area you are asking about.
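To make the pruning idea a bit more concrete, here is a minimal sketch of a search over an R-tree-like structure. It is only an illustration: the `RTreeNode` shape, the function names and the hand-built tree are my own assumptions, and a real R-tree also has to handle insertion and node splitting, which is left out here.

```typescript
// Axis-aligned bounding box and a simple 2D point (hypothetical shapes).
interface Box { minX: number; minY: number; maxX: number; maxY: number; }
interface Point { x: number; y: number; }

// A node holds child nodes (internal node) or data points (leaf node).
interface RTreeNode { box: Box; children: RTreeNode[]; points: Point[]; }

// True when the two boxes overlap or touch.
const intersects = (a: Box, b: Box): boolean =>
  a.minX <= b.maxX && a.maxX >= b.minX && a.minY <= b.maxY && a.maxY >= b.minY;

// True when the point lies inside the box (borders included).
const containsPoint = (box: Box, p: Point): boolean =>
  p.x >= box.minX && p.x <= box.maxX && p.y >= box.minY && p.y <= box.maxY;

// Collect every point inside `query`, skipping subtrees whose box misses it.
function searchRTree(node: RTreeNode, query: Box, found: Point[] = []): Point[] {
  if (!intersects(node.box, query)) return found; // prune this whole subtree
  for (const p of node.points) {
    if (containsPoint(query, p)) found.push(p);
  }
  for (const child of node.children) {
    searchRTree(child, query, found);
  }
  return found;
}

// Tiny hand-built tree: a root with two leaves.
const westLeaf: RTreeNode = {
  box: { minX: 0, minY: 0, maxX: 4, maxY: 4 },
  children: [],
  points: [{ x: 1, y: 1 }, { x: 3, y: 3 }],
};
const eastLeaf: RTreeNode = {
  box: { minX: 6, minY: 0, maxX: 10, maxY: 4 },
  children: [],
  points: [{ x: 7, y: 2 }],
};
const root: RTreeNode = {
  box: { minX: 0, minY: 0, maxX: 10, maxY: 4 },
  children: [westLeaf, eastLeaf],
  points: [],
};

// The east leaf is never visited because its box does not overlap the query.
const hits = searchRTree(root, { minX: 0, minY: 0, maxX: 2, maxY: 2 });
console.log(hits);
```

The interesting part is the first line of `searchRTree`: one box comparison can discard an entire branch of the tree.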
Geohashing
This one I find very interesting. It encodes geographical coordinates into a string of characters by dividing the world into a grid and repeatedly halving the latitude and longitude ranges, much like a binary search, to create a hierarchical representation of locations. Each character in the hash refines the location further. For example, we could have:
- Geohash for New York: abcd
- Geohash for Los Angeles: abef
The initial ab, common to both (made-up) codes, points to the fact that they are located within the same larger area (a country, for example). The last two characters differ because they indicate different, more specific locations inside that first region.
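If you want to see the halving in action, here is a small sketch of a geohash encoder. The function name `encodeGeohash` is my own, and I am assuming the commonly used base-32 geohash alphabet, which leaves out the letters a, i, l and o (so real hashes will not look like the simplified ones above).

```typescript
// Commonly used geohash base-32 alphabet (no "a", "i", "l" or "o").
const GEOHASH_ALPHABET = "0123456789bcdefghjkmnpqrstuvwxyz";

// Encode a latitude/longitude pair into a geohash with `precision` characters.
function encodeGeohash(latitude: number, longitude: number, precision: number): string {
  let latMin = -90, latMax = 90;
  let lonMin = -180, lonMax = 180;
  let hash = "";
  let bits = 0;          // value of the 5-bit group being built
  let bitCount = 0;      // how many bits of the group are filled
  let useLongitude = true; // bits alternate, starting with longitude

  while (hash.length < precision) {
    if (useLongitude) {
      const mid = (lonMin + lonMax) / 2;
      if (longitude >= mid) { bits = bits * 2 + 1; lonMin = mid; }
      else { bits = bits * 2; lonMax = mid; }
    } else {
      const mid = (latMin + latMax) / 2;
      if (latitude >= mid) { bits = bits * 2 + 1; latMin = mid; }
      else { bits = bits * 2; latMax = mid; }
    }
    useLongitude = !useLongitude;
    bitCount += 1;

    // Every 5 bits become one character of the hash.
    if (bitCount === 5) {
      hash += GEOHASH_ALPHABET.charAt(bits);
      bits = 0;
      bitCount = 0;
    }
  }
  return hash;
}

const coarse = encodeGeohash(57.64911, 10.40744, 3);
const fine = encodeGeohash(57.64911, 10.40744, 8);
console.log(coarse, fine, fine.startsWith(coarse));
```

Because each extra character only narrows the previous cell, a longer hash of a point always starts with the shorter hash of the same point. That prefix property is what makes geohashes convenient for "same area" lookups.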
Geospatial Queries with MongoDB and NodeJs
Now let’s get into our example. We’ll use NodeJs and MongoDB to illustrate how this kind of query can be useful and how to implement it.
MongoDB Data Representation & Indexing
In MongoDB, geospatial data is usually represented using the GeoJSON format. It’s a standard format for encoding various types of geographic data. When you store geospatial data in a MongoDB document it is usually represented as a field containing GeoJSON objects.
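You can refer to the GeoJSON specification for the format and its types. We are going to use the format for the Point data type. Note that the type name is case-sensitive and that the coordinates are listed as longitude first, then latitude:
{
  "type": "Point",
  "coordinates": [ 10.0, 11.2 ]
}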
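Before sending a point to the database it can be handy to check that it has this shape. Below is a small sketch of such a check. The `GeoJsonPoint` interface and the `isGeoJsonPoint` function are hypothetical helpers of mine, not something provided by MongoDB or mongoose.

```typescript
// Hypothetical type for the GeoJSON Point shape shown above.
interface GeoJsonPoint {
  type: "Point";
  coordinates: [number, number]; // [longitude, latitude]
}

// Type guard: checks the shape and that the coordinates are valid
// longitude (-180 to 180) and latitude (-90 to 90) values.
function isGeoJsonPoint(value: unknown): value is GeoJsonPoint {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as { type?: unknown; coordinates?: unknown };
  if (candidate.type !== "Point") return false; // case-sensitive
  const coords = candidate.coordinates;
  if (!Array.isArray(coords) || coords.length !== 2) return false;
  const [longitude, latitude] = coords;
  return (
    typeof longitude === "number" &&
    typeof latitude === "number" &&
    Number.isFinite(longitude) &&
    Number.isFinite(latitude) &&
    Math.abs(longitude) <= 180 &&
    Math.abs(latitude) <= 90
  );
}

console.log(isGeoJsonPoint({ type: "Point", coordinates: [10.0, 11.2] }));
console.log(isGeoJsonPoint({ type: "point", coordinates: [10.0, 11.2] }));
```

A check like this catches the two mistakes that are easiest to make: a lowercase type name and swapped or out-of-range coordinates.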
To make this work with MongoDB on our NodeJs application we will use mongoose. Your schema definition (for the document you want to implement geospatial queries) should look similar to this:
import mongoose from "mongoose";

interface IVehicleLocation {
  vehicleId: string;
  location: {
    type: string;
    coordinates: [number, number]; // [longitude, latitude]
  };
}

const vehicleLocationSchema = new mongoose.Schema({
  vehicleId: {
    type: String,
    required: true,
    unique: true
  },
  // GeoJSON Point: { type: "Point", coordinates: [longitude, latitude] }
  location: {
    type: {
      type: String,
      enum: ["Point"],
      required: true
    },
    coordinates: {
      type: [Number],
      required: true
    }
  }
});

// The 2dsphere index is what enables geospatial queries on `location`.
vehicleLocationSchema.index({ location: "2dsphere" });

// Note: despite its name, VehicleLocationSchema is the compiled model.
const VehicleLocationSchema = mongoose.model<IVehicleLocation>("VehicleLocation", vehicleLocationSchema);

export { VehicleLocationSchema };
export type { IVehicleLocation };
What this code does is define a location property (note the similarity to the GeoJSON format) that holds a string type, restricted here to “Point”, and an array of coordinates in longitude, latitude order.
Then, we define an indexing method for this schema, the 2dsphere index. This method is designed to work with geospatial data represented on the surface of a sphere, such as the Earth’s, by dividing it into a grid of cells. This grid allows for efficient spatial indexing and query processing.
The 2dsphere index uses this grid to filter out documents that cannot match the geospatial query criteria. This reduces the number of documents that have to be fetched to answer the query.
Creating some entries
I won’t spend much time on this, since this is not a Mongoose or NodeJs tutorial. To create new entries you can add a POST /vehicle-location endpoint to your API that does something like this:
const vehicleLocation = new VehicleLocationSchema({
  vehicleId: vehicleId,
  location: {
    type: "Point",
    coordinates: [longitude, latitude]
  }
});
await vehicleLocation.save();
return vehicleLocation.toObject();
Querying your data
Now that our schema is good to go and we have a document ready to store geographical coordinates, we can start querying (after creating some dummy data)! MongoDB provides a set of geospatial query operators, and we’ll use $near and $maxDistance to demonstrate how to solve the Ubeer example mentioned earlier.
So, to achieve our goal of finding all the drivers within a radius from a user’s house we can do the following:
// radius is in meters, because the query point is given as GeoJSON
export const getVehiclesByRadius = async (latitude: number, longitude: number, radius: number): Promise<IVehicleLocation[]> => {
  const vehicleLocations = await VehicleLocationSchema.find({
    location: {
      $near: {
        $geometry: {
          type: "Point",
          coordinates: [longitude, latitude] // longitude first
        },
        $maxDistance: radius
      }
    }
  });
  return vehicleLocations.map(vehicle => {
    return vehicle.toObject();
  });
};
The $near operator makes MongoDB find the documents close to the specified point and return them sorted by proximity, nearest first. $maxDistance specifies the radius that limits the search; with a GeoJSON point, as used here, it is expressed in meters. In other words, this query returns all the vehicles within that radius of the point.
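Since the Ubeer example talks about kilometers while $maxDistance takes meters for a GeoJSON point, a tiny helper that builds the filter can save you from mixing units. This is just a sketch: `buildNearFilter` and the `NearFilter` type are names I made up, and the `location` field name assumes the schema from this article.

```typescript
// Shape of the filter object passed to find() (hypothetical helper type).
interface NearFilter {
  location: {
    $near: {
      $geometry: { type: "Point"; coordinates: [number, number] };
      $maxDistance: number; // meters
    };
  };
}

const METERS_PER_KILOMETER = 1000;

// Build a $near filter from a center point and a radius in kilometers.
function buildNearFilter(latitude: number, longitude: number, radiusKm: number): NearFilter {
  if (!(radiusKm >= 0)) {
    throw new RangeError("radiusKm must be a non-negative number");
  }
  return {
    location: {
      $near: {
        $geometry: {
          type: "Point",
          coordinates: [longitude, latitude], // GeoJSON order: longitude first
        },
        $maxDistance: radiusKm * METERS_PER_KILOMETER,
      },
    },
  };
}

// A 2 km search around a sample point.
const filter = buildNearFilter(-23.55, -46.63, 2);
console.log(JSON.stringify(filter));
```

The resulting object could then be passed to the model’s find() call in place of the inline filter.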
Wanna check if it’s really working?
When I was working on a project with this I wanted to make sure the results were consistent, so here is a hint. I’ve used the haversine distance to validate if the results were correct. This formula calculates the distance between two points on the surface of a sphere.
// Returns the great-circle distance in kilometers between a point and a center.
const haversineDistance = (latitude: number, longitude: number, center: { latitude: number, longitude: number }): number => {
  const earthRadius = 6371; // mean Earth radius in kilometers

  // differences between the two points, converted to radians
  const dLat = (latitude - center.latitude) * Math.PI / 180.0;
  const dLon = (longitude - center.longitude) * Math.PI / 180.0;

  // latitudes converted to radians
  const lat = latitude * Math.PI / 180.0;
  const clat = center.latitude * Math.PI / 180.0;

  // apply the haversine formula
  const a = Math.pow(Math.sin(dLat / 2), 2) +
    Math.pow(Math.sin(dLon / 2), 2) *
    Math.cos(lat) *
    Math.cos(clat);
  const c = 2 * Math.asin(Math.sqrt(a));
  return earthRadius * c;
};
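So, if you are curious about the results, you can calculate, for each result you get, the distance between the result’s coordinates and the given point. Keep the units in mind: the function above returns kilometers, while $maxDistance with a GeoJSON point is in meters. The distances should all come out smaller than your radius, possibly give or take a tiny margin right at the boundary, since the Earth radius used in the formula is an approximation :)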
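Here is a sketch of how that check could be automated. It is self-contained, so it carries its own haversine function working in meters. The names `haversineMeters` and `findOutOfRange` and the optional tolerance parameter are my own assumptions, not part of any library.

```typescript
interface LatLng { latitude: number; longitude: number; }

const EARTH_RADIUS_METERS = 6371000; // mean Earth radius, an approximation

const toRadians = (degrees: number): number => degrees * Math.PI / 180;

// Great-circle distance in meters between two points (haversine formula).
function haversineMeters(a: LatLng, b: LatLng): number {
  const dLat = toRadians(b.latitude - a.latitude);
  const dLon = toRadians(b.longitude - a.longitude);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.sin(dLon / 2) ** 2 *
      Math.cos(toRadians(a.latitude)) *
      Math.cos(toRadians(b.latitude));
  // Clamp to 1 to guard against floating point overshoot.
  return 2 * EARTH_RADIUS_METERS * Math.asin(Math.sqrt(Math.min(1, h)));
}

// Return the results that are farther from the center than radius + tolerance.
function findOutOfRange(
  center: LatLng,
  results: LatLng[],
  radiusMeters: number,
  toleranceMeters = 0,
): LatLng[] {
  return results.filter(
    (point) => haversineMeters(center, point) > radiusMeters + toleranceMeters,
  );
}

// Sample data: the first point is about 1.1 km away, the second about 11 km.
const center: LatLng = { latitude: 0, longitude: 0 };
const sampleResults: LatLng[] = [
  { latitude: 0, longitude: 0.01 },
  { latitude: 0, longitude: 0.1 },
];
console.log(findOutOfRange(center, sampleResults, 2000));
```

If the array returned for your real query results is empty, every vehicle is inside the radius you asked for.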
So there you go, now you are able to query your MongoDB documents based on their location on Earth’s surface! Hope this helped and feel free to comment or suggest anything.