Game Play Analysis IV | LeetCode | MSSQL

ranggakd

Retiago Drago

Posted on June 16, 2023

Game Play Analysis IV | LeetCode | MSSQL

The Problem

The problem is sourced from LeetCode, specifically Game Play Analysis IV.

We are given a table 'Activity' with the following schema:

Column Name Type
player_id int
device_id int
event_date date
games_played int

(player_id, event_date) is the primary key of this table.

The table records the activity of players for different games. Each row is a record of a player who logged in and played a number of games (possibly 0) before logging out on some day using some device.

Our task is to write an SQL query to report the fraction of players that logged in again on the day after the day they first logged in, rounded to 2 decimal places. In other words, we need to count the number of players that logged in for at least two consecutive days starting from their first login date, then divide that number by the total number of players.

Example:

Input:

player_id device_id event_date games_played
1 2 2016-03-01 5
1 2 2016-03-02 6
2 3 2017-06-25 1
3 1 2016-03-02 0
3 4 2018-07-03 5

Output:

fraction
0.33

Only the player with id 1 logged back in after the first day he had logged in so the answer is 1/3 = 0.33

The Solutions

We have four different solutions. Each uses a different approach and SQL feature. Here, we explain how each solution works and provide performance benchmarks based on LeetCode's system.

Solution 1: Using CTE, JOIN, and Aggregate Functions

This solution uses Common Table Expressions (CTE) to determine the first login date of each player (first_login) and the players who logged in on the day after their first login (second_login). Then, we use LEFT JOIN and COUNT functions to calculate the ratio of these players.

WITH first_login AS (
    SELECT player_id, MIN(event_date) [first_day]
    FROM Activity
    GROUP BY player_id
),
second_login AS (
    SELECT a.player_id
    FROM Activity a
    INNER JOIN first_login f ON a.player_id = f.player_id AND a.event_date = DATEADD(day, 1, f.first_day)
)
SELECT 
    ROUND(COUNT(DISTINCT s.player_id) * 1.0 / COUNT(DISTINCT f.player_id), 2) [fraction]
FROM 
    first_login f
LEFT JOIN 
    second_login s ON f.player_id = s.player_id
Enter fullscreen mode Exit fullscreen mode

Runtime: 2497ms, beats 56.20% of submissions on LeetCode.

Solution 1

Solution 2: Using Window Functions and DATEDIFF

This solution creates a running count of players' login days using the ROW_NUMBER() window function and compares consecutive days with the DATEDIFF() function.

WITH result AS (
    SELECT
        event_date,
        MIN(event_date) OVER(PARTITION BY player_id) [first_event_date],
        ROW_NUMBER() OVER(PARTITION BY player_id ORDER BY event_date) [rn]
    FROM Activity
), counters AS (
    SELECT COUNT(DISTINCT player_id) [counter]
    FROM Activity
)
SELECT ISNULL(
    (
        SELECT DISTINCT ROUND(COUNT(rn) OVER() * 1.0 / counter, 2) [f]
        FROM result, counters
        WHERE DATEDIFF(day, first_event_date, event_date) = 1
    )
, 0) [fraction]
Enter fullscreen mode Exit fullscreen mode

Runtime: 2576ms, beats 47.64% of submissions on LeetCode.

Solution 2

Solution 3: Using Window Functions and DATEADD

This solution is similar to Solution 2, but uses DATEADD() instead of DATEDIFF() to compare dates. This may make it easier to understand, but it is slightly slower in performance.

WITH result AS (
    SELECT
        event_date,
        MIN(event_date) OVER(PARTITION BY player_id) [first_event_date],
        ROW_NUMBER() OVER(PARTITION BY player_id ORDER BY event_date) [rn]
    FROM Activity
), counters AS (
    SELECT COUNT(DISTINCT player_id) [counter]
    FROM Activity
)
SELECT ISNULL(
    (
        SELECT DISTINCT ROUND(COUNT(rn) OVER() * 1.0 / counter, 2) [f]
        FROM result, counters
        WHERE event_date = DATEADD(day, 1, first_event_date)
    )
, 0) [fraction]
Enter fullscreen mode Exit fullscreen mode

Runtime: 2625ms, beats 43.67% of submissions on LeetCode.

Solution 3

Solution 4: Using Window Functions, DATEADD and JOIN

This solution also uses window functions and DATEADD(), but replaces the comma join with an explicit JOIN clause, which could increase readability but potentially decreases performance.

WITH result AS (
    SELECT
        event_date,
        MIN(event_date) OVER(PARTITION BY player_id) [first_event_date],
        ROW_NUMBER() OVER(PARTITION BY player_id ORDER BY event_date) [rn]
    FROM Activity
), counters AS (
    SELECT COUNT(DISTINCT player_id) [counter]
    FROM Activity
)
SELECT ISNULL(
    (
        SELECT DISTINCT ROUND(COUNT(rn) OVER() * 1.0 / counter, 2) [f]
        FROM result
        JOIN counters ON 1=1
        WHERE event_date = DATEADD(day, 1, first_event_date)
    )
, 0) [fraction]
Enter fullscreen mode Exit fullscreen mode

Runtime: 3430ms, beats 9.67% of submissions on LeetCode.

Solution 4

Conclusion

Based on performance on LeetCode, Solution 1 is the most efficient, followed by Solution 2, Solution 3, and Solution 4. However, the best solution may vary depending on the specific SQL server, database size, and table indexing. It's always a good idea to understand the problem thoroughly and benchmark different solutions in your specific scenario to identify the most effective solution.

I hope you find this analysis helpful. For more insightful solutions and tech-related content, feel free to connect with me on my Beacons page.

ranggakd - Link in Bio & Creator Tools | Beacons

@ranggakd | center details summary summary Oh hello there I m a an Programmer AI Tech Writer Data Practitioner Statistics Math Addict Open Source Contributor Quantum Computing Enthusiast details center.

favicon beacons.ai
💖 💪 🙅 🚩
ranggakd
Retiago Drago

Posted on June 16, 2023

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related