SQL Recipe: Compare with Neighbors

nalgeon

Anton Zhiyanov

Posted on June 5, 2023

SQL Recipe: Compare with Neighbors

Suppose we want to compare each data record with its neighbors based on some column value. For example:

  • Compare sales from one month to the previous month (month-over-month or MoM change) or to the same month a year ago (year-over-year or YoY change).
  • Compare financial results for a given period to the same period in the previous year (like-for-like or LFL analysis).
  • Observe the daily difference in stock prices to understand market trends.
  • Calculate the difference in traffic between days of the week to plan capacity changes.

The solution is to use the lag() function over an SQL window ordered by target columns.

Example

Let's compare the company's expenses for each month to the previous month in absolute terms:

select
  year, month,
  expense,
  expense - lag(expense) over w as diff
from expenses
window w as (order by year, month)
order by year, month;
Enter fullscreen mode Exit fullscreen mode
┌──────┬───────┬─────────┬──────┐
│ year │ month │ expense │ diff │
├──────┼───────┼─────────┼──────┤
│ 2020 │ 1     │  82     │      │
│ 2020 │ 2     │  75     │  -7  │
│ 2020 │ 3     │ 104     │  29  │
│ 2020 │ 4     │  94     │ -10  │
│ 2020 │ 5     │  99     │   5  │
│ 2020 │ 6     │ 105     │   6  │
│ 2020 │ 7     │  95     │ -10  │
│ 2020 │ 8     │ 110     │  15  │
│ 2020 │ 9     │ 104     │  -6  │
│ 2020 │ 10    │ 100     │  -4  │
│ 2020 │ 11    │  98     │  -2  │
│ 2020 │ 12    │ 106     │   8  │
└──────┴───────┴─────────┴──────┘
Enter fullscreen mode Exit fullscreen mode

The lag(value, offset) function returns the value of the record that is offset rows behind the current one. By default, the offset is 1 and can be omitted.

Now let's calculate the relative change from month to month:

select
  year, month, expense,
  round(
    (expense - lag(expense) over w)*100.0 / lag(expense) over w
  ) as "diff %"
from expenses
window w as (order by year, month)
order by year, month;
Enter fullscreen mode Exit fullscreen mode
┌──────┬───────┬─────────┬────────┐
│ year │ month │ expense │ diff % │
├──────┼───────┼─────────┼────────┤
│ 2020 │ 1     │  82     │        │
│ 2020 │ 2     │  75     │   -9   │
│ 2020 │ 3     │ 104     │   39   │
│ 2020 │ 4     │  94     │  -10   │
│ 2020 │ 5     │  99     │    5   │
│ 2020 │ 6     │ 105     │    6   │
│ 2020 │ 7     │  95     │  -10   │
│ 2020 │ 8     │ 110     │   16   │
│ 2020 │ 9     │ 104     │   -5   │
│ 2020 │ 10    │ 100     │   -4   │
│ 2020 │ 11    │  98     │   -2   │
│ 2020 │ 12    │ 106     │    8   │
└──────┴───────┴─────────┴────────┘
Enter fullscreen mode Exit fullscreen mode

Alternatives

Suppose we want to compare quarterly sales with the previous year. This is where the offset parameter comes in handy:

with data as (
  select
    year, quarter,
    lag(amount, 4) over w as prev,
    amount as current,
    round(amount*100.0 / lag(amount, 4) over w) as "increase %"
  from sales
  window w as (order by year, quarter)
)
select
  quarter,
  prev as y2019,
  current as y2020,
  "increase %"
from data
where year = 2020
order by quarter;
Enter fullscreen mode Exit fullscreen mode
┌─────────┬────────┬────────┬────────────┐
│ quarter │ y2019  │ y2020  │ increase % │
├─────────┼────────┼────────┼────────────┤
│ 1       │ 155040 │ 242040 │ 156        │
│ 2       │ 162600 │ 338040 │ 208        │
│ 3       │ 204120 │ 287520 │ 141        │
│ 4       │ 200700 │ 377340 │ 188        │
└─────────┴────────┴────────┴────────────┘
Enter fullscreen mode Exit fullscreen mode

Looking back 4 quarters with lag(amount, 4) gives us the same quarter but from the previous year.

There is also a lead() function. It works just like lag(), except that it looks forward instead of backward.

Compatibility

All major vendors support the lag() and lead() window functions. Some of them, such as MS SQL and Oracle, do not support the window clause. In these cases, we can inline the window definition:

select
  year, month, expense,
  expense - lag(expense) over (
    order by year, month
  ) as diff
from expenses
order by year, month;
Enter fullscreen mode Exit fullscreen mode

We can also rewrite the query without window functions:

select
  cur.year, cur.month, cur.expense,
  cur.expense - prev.expense as diff
from expenses cur
left join expenses prev on
  cur.year = prev.year and
  cur.month - 1 = prev.month
order by cur.year, cur.month;
Enter fullscreen mode Exit fullscreen mode

Want to learn more about window functions? Read my book — SQL Window Functions Explained

Follow @ohmypy on Twitter to keep up with new posts

💖 💪 🙅 🚩
nalgeon
Anton Zhiyanov

Posted on June 5, 2023

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related