Percentage calculations using SQL Window Functions

gvenzl

Gerald Venzl 🚀

Posted on December 6, 2022

Percentage calculations using SQL Window Functions

Sometimes you have the requirement to calculate percentages on some values of your data. There are multiple ways of doing it, of course, but often people are not aware that you do not have to calculate these percentages in the application itself or via a SQL statement that queries the same table multiple times to first calculate the denominator and then calculate the actual percentage. Instead, you can use SQL window functions to run a variety of complex calculations over different groups of data in a single pass! Window functions were introduced in the SQL:2003 standard back in 2003 and although the SQL standard calls these Window Functions, Oracle Database has them documented as Analytic Functions.

Let’s take a look.


Example Data

First, we need to have some data that we can run queries against. Let’s use movies and calculate the percentages of the actors’ earnings – fictional, of course. 🙂

CREATE TABLE movie_earnings (
  movie_name   VARCHAR(30),
  actor_name   VARCHAR(30),
  earnings     NUMBER
);

INSERT INTO movie_earnings VALUES
   ('Thor: Love and Thunder', 'Chris Hemsworth', 5000000);
INSERT INTO movie_earnings VALUES
   ('Thor: Love and Thunder', 'Natalie Portman', 2000000);
INSERT INTO movie_earnings VALUES
   ('Thor: Love and Thunder', 'Christian Bale',  1000000);

INSERT INTO movie_earnings VALUES
   ('Minions: The Rise of Gru', 'Steve Carell',  500000);
INSERT INTO movie_earnings VALUES
   ('Minions: The Rise of Gru', 'Pierre Coffin', 100000);
COMMIT;
Enter fullscreen mode Exit fullscreen mode

Percentage Total Earnings per Actor

Back in the days before window functions were available, and unfortunately still too often today, you will find queries with the two-step approach that will read the same table twice, first to get the denominator and second to run the actual computation of the percentage using the denominator:

SELECT movie_name, actor_name,
       ROUND(100 * earnings/sums.total, 2) AS percent
FROM movie_earnings, (
       SELECT SUM(earnings) AS total
         FROM movie_earnings
     ) sums
ORDER BY percent DESC;

MOVIE_NAME               ACTOR_NAME      PERCENT
------------------------ --------------- -------
Thor: Love and Thunder   Chris Hemsworth   58.14
Thor: Love and Thunder   Natalie Portman   23.26
Thor: Love and Thunder   Christian Bale    11.63
Minions: The Rise of Gru Steve Carell       5.81
Minions: The Rise of Gru Pierre Coffin      1.16
Enter fullscreen mode Exit fullscreen mode

While this produces the correct result, it has an unnecessary second query on the same table. With window functions, however – and once again, they have been around in some databases for more 20 years now and added to the SQL standard in 2003 – you can calculate the denominator of the percentage in a single go, no need to read the table twice:

SELECT
    movie_name, actor_name,
    ROUND(
       100 * earnings / SUM(earnings) OVER (),
    2) AS percent
FROM movie_earnings
ORDER BY percent DESC;

MOVIE_NAME               ACTOR_NAME      PERCENT
------------------------ --------------- -------
Thor: Love and Thunder   Chris Hemsworth   58.14
Thor: Love and Thunder   Natalie Portman   23.26
Thor: Love and Thunder   Christian Bale    11.63
Minions: The Rise of Gru Steve Carell       5.81
Minions: The Rise of Gru Pierre Coffin      1.16
Enter fullscreen mode Exit fullscreen mode

The window function in the above statement is the SUM(earnings) OVER () part. Concise, isn’t it?

Percentage of Earnings per Movie

Of course, calculating the percentage of earnings for all movies is interesting but perhaps more interesting is to know who the highest-paid actor in a given movie is. Prior to window functions, you may have written a statement like this:

SELECT
    e.movie_name, e.actor_name,
    ROUND(100 * e.earnings/s.total, 2) AS percent
FROM movie_earnings e, (
    SELECT SUM(earnings) AS total, movie_name
      FROM movie_earnings
        GROUP BY movie_name) s
WHERE e.movie_name = s.movie_name
ORDER BY movie_name, percent DESC;

MOVIE_NAME               ACTOR_NAME      PERCENT
------------------------ --------------- -------
Minions: The Rise of Gru Steve Carell      83.33
Minions: The Rise of Gru Pierre Coffin     16.67
Thor: Love and Thunder   Chris Hemsworth    62.5
Thor: Love and Thunder   Natalie Portman      25
Thor: Love and Thunder   Christian Bale     12.5
Enter fullscreen mode Exit fullscreen mode

The benefit of using window functions is that we can reuse the same query from before and just change the way how we partition the data for our analysis. The previous statement had a window function with a trailing OVER() clause. That OVER() clause is there to tell the window function, among other things, what part of the retrieved data to apply the calculation on. Window functions use values from one or multiple rows to return a value for each row. This contrasts with aggregate functions, which return a single value for multiple rows. In short, window functions always have an OVER() clause while any function without an OVER() clause is not a window function, but rather an aggregate or single-row (scalar) function.

To calculate the percentages per movie, all you have to do is to tell the window function to look at the data on a per-movie basis. In other words, partition the retrieved data by the movie_name and apply the calculation for each partition or window of data:

SELECT
    movie_name, actor_name,
    ROUND(
       100 * earnings / SUM(earnings)
                          OVER (PARTITION BY movie_name),
    2) AS percent
FROM movie_earnings
ORDER BY movie_name, percent DESC;

MOVIE_NAME               ACTOR_NAME      PERCENT
------------------------ --------------- -------
Minions: The Rise of Gru Steve Carell      83.33
Minions: The Rise of Gru Pierre Coffin     16.67
Thor: Love and Thunder   Chris Hemsworth    62.5
Thor: Love and Thunder   Natalie Portman      25
Thor: Love and Thunder   Christian Bale     12.5
Enter fullscreen mode Exit fullscreen mode

Pay attention to the (now in a separate line for further illustration purposes) OVER (PARTITION BY movie_name) clause. That’s all that needed to change to execute the percentage calculations per movie.

Percentage of Earnings per Movie and Total Earnings

The cool thing about window functions is that you can use multiple window functions in one SQL statement! So, for example, instead of executing the two statements above, you can just run one single query and retrieve both calculations in one go:

SELECT
    movie_name, actor_name,
    ROUND(
       100 * earnings / SUM(earnings)
                          OVER (PARTITION BY movie_name),
    2) AS ptc_movie,
    ROUND(
       100 * earnings / SUM(earnings)
                          OVER (),
    2) AS ptc_total
FROM movie_earnings
ORDER BY movie_name, ptc_movie DESC;

MOVIE_NAME               ACTOR_NAME      PTC_MOVIE PTC_TOTAL
------------------------ --------------- --------- ---------
Minions: The Rise of Gru Steve Carell        83.33      5.81
Minions: The Rise of Gru Pierre Coffin       16.67      1.16
Thor: Love and Thunder   Chris Hemsworth      62.5     58.14
Thor: Love and Thunder   Natalie Portman        25     23.26
Thor: Love and Thunder   Christian Bale       12.5     11.63
Enter fullscreen mode Exit fullscreen mode

This result provides some interesting insights right away. Here you can quickly see that although Steve Carell got the most money from his movie, it is just a tiny fraction of what others have made with their movies. We know this already, of course, because of the queries earlier on, but here you have the numbers next to each other, making it much more apparent. And, of course, we didn’t need to read the table twice to get the per movie and total earnings percentages either.

Many functions to choose from

Just like with aggregate functions, there are many different window functions available, so always check out the documentation! It just happens to be that Oracle Database has a RATIO_TO_REPORT function that computes the ratio of a value to the sum of a set of values. In other words, there already is a window function to calculate the percentages. So the above query can be further simplified to the following, saving you from having to do the division manually:

SELECT
    movie_name, actor_name,
    ROUND(
       100 * RATIO_TO_REPORT(earnings)
                OVER (PARTITION BY movie_name),
    2) AS ptc_movie,
    ROUND(
       100 * RATIO_TO_REPORT(earnings)
                OVER (),
    2) AS ptc_total
FROM movie_earnings
ORDER BY movie_name, ptc_movie DESC;

MOVIE_NAME               ACTOR_NAME      PTC_MOVIE PTC_TOTAL
------------------------ --------------- --------- ---------
Minions: The Rise of Gru Steve Carell        83.33      5.81
Minions: The Rise of Gru Pierre Coffin       16.67      1.16
Thor: Love and Thunder   Chris Hemsworth      62.5     58.14
Thor: Love and Thunder   Natalie Portman        25     23.26
Thor: Love and Thunder   Christian Bale       12.5     11.63
Enter fullscreen mode Exit fullscreen mode

Conclusion

Window functions, or analytical functions as Oracle calls them, are a powerful way to execute complex calculations over multiple “windows” in a query result.

They are much more compact than other methods and only need to read the data once instead of multiple times, giving you a performance boost for running complex calculations on large data sets.

Window functions should be in the toolbox of any developer who regularly writes SQL!

If you want to learn more about window functions, check out the free Analytic SQL for Developers course from Oracle!

đź’– đź’Ş đź™… đźš©
gvenzl
Gerald Venzl 🚀

Posted on December 6, 2022

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related