Danny Reed
Posted on April 20, 2021
Kalman who?
Have you ever used a Kalman filter? My first experience with Kalman filtering was in smoothing motion sensor readings for a homemade Segway I built back in college. Many years ago it was used in the Apollo missions (ok, they win on coolness...). It's job is simple -- take noisy data and smooth it out a bit.
At work, I needed to smooth some latitude/longitude values to help make navigation paths be smoother. Normally, I'd just npm i kalmanjs
and be off to the races, but this particular data pipeline presented a challenge.
Challenge
I apply several data-processing operations to the data I'm working with, and several of those operations happen in the database because SQL is a great place to do set-based operations. These operations must take effect in a specific order in order for things to come out right, and I needed to stick the Kalman filtering operation right between two specific steps in my process. Both of those steps were done in SQL.
Option 1
Do the first few operations in SQL, then return results to the API layer (Lambda) to do the Kalman filtering (using Kalmanjs) and the remaining operations. This requires porting the rest of the steps from SQL to JS and doing them in the Lambda.
Advantage: Puts some CPU load on the Lambda instead of the DB, which is a good thing in my case.
Disadvantage: I have to translate several operations from SQL to JS, and some of them are set-based, which means they're better to do in SQL.
Option 2
Try to implement Kalman filtering in SQL. This option is tricky because there's practically no information out there on Kalman filtering in SQL.
Advantage: Preserve data pipeline and existing SQL implementations for subsequent operations.
Disadvantage: Requires writing a Kalman filter in SQL, which has some specific challenges we'll discuss later. Also, I don't do much fancy math, so understanding the way Kalman filtering actually works would be a challenge.
I chose option 2 because I just can't justify doing several operations in the "wrong" place just because that's where the Kalman implementations are available.
Translation
Before attempting the implementation, I took a good look at the code in KalmanJS and tried to understand it. Honestly the code looked pretty simple, and all I really had to do was change syntax around a bit. Here's a snippet of the translation:
Original JS implementation (KalmanJS)
/**
* Filter a new value
* @param {Number} z Measurement
* @param {Number} u Control
* @return {Number}
*/
filter(z, u = 0) {
if (isNaN(this.x)) {
this.x = (1 / this.C) * z;
this.cov = (1 / this.C) * this.Q * (1 / this.C);
}
else {
// Compute prediction
const predX = this.predict(u);
const predCov = this.uncertainty();
// Kalman gain
const K = predCov * this.C * (1 / ((this.C * predCov * this.C) + this.Q));
// Correction
this.x = predX + K * (z - (this.C * predX));
this.cov = predCov - (K * this.C * predCov);
}
return this.x;
}
Essentially this just does basic ifs, and some math. In fact, I found it so clear that I decided I could probably just translate the code without even having to dive into understanding the math behind Kalman filtering.
SQL Translation:
IF (@x IS NULL)
BEGIN
PRINT 'No existing state; proceeding without it';
SET @x = 1 / @C * @z;
SET @cov = 1 / @C * @Q * (1 / @C);
END
ELSE
BEGIN
SET @predX = @A * @x + @B * @u;
SET @predCov = @A * @cov * @A + @R;
SET @K = @predCov * @C * (1 / (@C * @predCov * @C + @Q));
SET @x = @predX + @K * (@z - @C * @predX);
SET @cov = @predCov - @K * @C * @predCov;
END
I did choose to change uncertainty()
and predict()
functions to inline formulas, just to minimize the need to create function declarations in SQL. Other than that, you can tell they're pretty much the same.
Speed Bump
This is where I hit a snag with the translation. Kalman works by looking at statistical properties the previous values in order to predict and smooth the next value. This means that Kalman needs to keep track of the state of the filter. In OOP you can just keep an instance of a class around in memory and use instance properties to keep track of state. In SQL, once the invocation completes, nothing is stored in memory.
Solution
Since the only obvious way to store things in SQL is in a table, I decided to create a KalmanState
table in my database. I read through the JS code again to determine which values were actually stored and used in the next calculation. From what I gathered, it appeared that all it needs to store are these variables:
x, cov, predX, predCov, K
Now it needs to store one set of these variables for each instance of the filter, so I made a simple table that looks like this:
id x cov predX predCov K identifier
----------- ---------------------- ---------------------- ---------------------- ---------------------- ---------------------- ----------------
43 26.9861111111111 0.618055555555556 26.9636363636364 1.61818181818182 0.618055555555555 testingLat
44 21.0416666666667 0.618055555555556 21.1090909090909 1.61818181818182 0.618055555555555 testingLon
The last column is "identifier" which is just a string that you can use to name the instance. In my case, I'm using one filter for each of several MAC addresses of devices I'm using. So for me, I can just pass in the MAC address of the device as the identifier. This helps the SPROC retrieve the proper state when it's time to filter a new value.
Usage
Here's how I integrated this into my query.
SELECT TOP 1
@LatIdentifier = CONCAT(@DeviceId, '-lat')
,@LonIdentifier = CONCAT(@DeviceId, '-lon')
,@rawLat = Latitude
,@rawLon = Longitude
FROM #temp;
EXEC sp_kalman @LatIdentifier, @rawLat, @predictedLat OUTPUT;
EXEC sp_kalman @LonIdentifier, @rawLon, @predictedLon OUTPUT;
-- Overwrite the unfiltered lat/lon values in the temp table
UPDATE #temp
SET
Latitude = @predictedLat
,Longitude = @predictedLon;
Note that this is part of a much larger query, there were contextual reasons to use the temp table for this purpose, and that there is only ever one row in that table when this is executed.
Problems/Caveats
I'm not writing this to say that I've come up with some perfect solution...in fact this solution has some serious challenges.
I was forced to make it a SPROC. I attempted to turn my code into a UDF. Functions are nice because you can use them right in a
SELECT
or even aJOIN
. Sadly (but sensibly) functions are not allowed to alter data. As we discussed earlier, the Kalman filter must alter data in order to save its state. Now to integrate it with my query, I can't just flow it right in with theSELECT
s but I have toEXEC
the SPROC with parameters and then retrieve the output into a variable. This is just a clunky experience that makes for uglier and less readable code, in my opinion.You can't use it with
ISOLATION LEVEL SNAPSHOT
. This is true for the same reason as #1. You can't alter data in aSNAPSHOT
-isolated transaction. This makes sense too, but it's annoying! I had to change up how my transaction worked in order to exclude theEXEC
calls from it. In my case it was possible to side-step this limitation, but in other cases it may not be!You have to clean up after yourself By this I mean that there's no mechanism whereby this will ever
DELETE
records from theKalmanState
table, which could allow for uncontrolled growth dependent on the application. In my case, I had to modify my API layer to perform cleanup when a filter is no longer useful to the system.
Should you use this?
It seems to me that the vast majority of use cases for Kalman filtering can be accomplished without doing the math in the DB. It's just easier all around :)
If you find yourself in a position like mine, where doing it in the DB really was the best option, then I hope this gives you a leg up on accomplishing that.
Next Steps
The author of the KalmanJS library (who does some cool stuff -- check out his work) has a contrib filter where he invites translations to other languages. You can find this implementation there now.
Also check out his original blog post about the development of KalmanJS here.
If you can think of improvements, please feel free to have a go at it!
(Cover image sourced from here and used with permission)
Posted on April 20, 2021
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.
Related
November 29, 2024