NET6 - C# DistinctBy

karenpayneoregon

Karen Payne

Posted on November 11, 2024

Introduction

Learn how to use DistinctBy, which was introduced with .NET 6.

DistinctBy returns unique items from a list, determined by a key (which can be one or more properties) specified through a selector function. This is particularly useful with large sets of data, because the sequence is reduced to one item per key in a single pass, without having to group the data first.

DistinctBy receives a delegate that selects the property or properties to use as the comparison key and returns the first object found for each distinct key.

There is a question on Stack Overflow that may very well have sparked the idea for DistinctBy. Those using older versions of the .NET Framework should check it out.
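For projects that cannot move to .NET 6, a helper along the following lines is one common way to get similar behavior. This is a minimal sketch and not the code from the Stack Overflow question; the names DistinctSketch and DistinctByKey are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for projects that target a framework without DistinctBy.
public static class DistinctSketch
{
    public static IEnumerable<TSource> DistinctByKey<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        // HashSet<T>.Add returns false when the key has already been seen.
        var seenKeys = new HashSet<TKey>();
        foreach (var item in source)
        {
            if (seenKeys.Add(keySelector(item)))
            {
                yield return item; // first item for each key, in source order
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var words = new[] { "apple", "avocado", "banana", "blueberry", "cherry" };

        // Keep the first word for each starting letter.
        foreach (var word in words.DistinctByKey(w => w[0]))
        {
            Console.WriteLine(word); // apple, banana, cherry (one per line)
        }
    }
}
```

Because the method uses yield return, it is deferred in the same way as the built-in method.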

Microsoft Documentation remarks

This method is implemented by using deferred execution. The immediate return value is an object that stores all the information that is required to perform the action. The query represented by this method is not executed until the object is enumerated either by calling its GetEnumerator method directly or by using foreach in C# or For Each in Visual Basic.
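A small sketch of what deferred execution means in practice. The list of years is invented for the example; the point is only that an item added after the query is defined still shows up, because nothing runs until the query is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var years = new List<int> { 1999, 2010, 1999 };

// Building the query does not run it; no keys have been compared yet.
IEnumerable<int> distinctYears = years.DistinctBy(year => year);

// Added after the query was defined, but before it is enumerated.
years.Add(2014);

// Enumeration happens here, so 2014 is part of the result.
Console.WriteLine(string.Join(", ", distinctYears)); // 1999, 2010, 2014
```

Calling ToList(), as the samples in this article do, runs the query immediately and stores the results.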

Side notes

As a secondary learning point, check out GenericsExtensions.cs in the provided source code, which has two useful extension methods.

The Index extension method returns each item together with its index, so both can be deconstructed in a foreach statement.

foreach (var (index, member) in distinctList.Index())
{
    Console.WriteLine($"{index,-7}{member.Id,-10}{member.FirstName,-10}{member.SurName}");
}
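The real implementation lives in GenericsExtensions.cs and is not shown in this article. As a rough idea of how such a method can be written, here is a hypothetical version. It is named WithIndex so it cannot clash with a framework method of the same name, and it may differ from the author's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

string[] names = ["Mary", "Sue", "Jake"];

// The tuple returned for each item is deconstructed into index and name.
foreach (var (index, name) in names.WithIndex())
{
    Console.WriteLine($"{index,-7}{name}");
}

// Hypothetical helper; the article's GenericsExtensions.cs may be written differently.
public static class IndexSketch
{
    public static IEnumerable<(int Index, T Item)> WithIndex<T>(this IEnumerable<T> source)
        => source.Select((item, index) => (index, item));
}
```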

All code was written with .NET 8, which supports collection expressions.

Code samples

Source code

Movies release year

In this example, we are asked to get one movie per release year from a list of movies.

The model for movies.

/// <summary>
/// Represents a movie with properties for identification, name, release year, and rating.
/// </summary>
public class Movie
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int Released { get; set; }
    public int Rating { get; set; }
    public override string ToString() => Name;
}

The list of movies, which in a real application would come from a database, JSON file or other data source.

public static IEnumerable<Movie> MovieList()
{
    return new List<Movie>
    {
        new() { Id = 1, Name = "Inception", Released = 2010, Rating = 5 },
        new() { Id = 2, Name = "The Matrix", Released = 1999, Rating = 5},
        new() { Id = 3, Name = "Interstellar", Released = 2014, Rating = 5 },
        new() { Id = 4, Name = "The Dark Knight", Released = 2008, Rating = 5 },
        new() { Id = 5, Name = "Fight Club", Released = 1999, Rating = 4 },
        new() { Id = 6, Name = "Pulp Fiction", Released = 1994, Rating = 4 },
        new() { Id = 7, Name = "Forrest Gump", Released = 1994, Rating = 4 },
        new() { Id = 8, Name = "The Shawshank Redemption", Released = 1994, Rating = 5 },
        new() { Id = 9, Name = "The Godfather", Released = 1972, Rating = 5 },
        new() { Id = 10, Name = "The Godfather: Part II", Released = 1974, Rating = 5 }
    };
}

The conventional way to get one movie per distinct release year is to use GroupBy followed by Select(g => g.First()).

var distinctMoviesByReleaseYear = 
    MockedData.MovieList()
        .GroupBy(m => m.Released)
        .Select(g => g.First())
        .ToList();

DistinctBy is easier, as we only need to specify the property, in this case Released.

var distinctList = MockedData.MovieList()
    .DistinctBy(movie => movie.Released)
    .ToList();
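To see that both approaches pick the same movies, here is a self-contained sketch that uses a few rows from the movie list. Tuples stand in for the Movie model to keep it short; that substitution is an assumption of the sketch, not part of the source code.

```csharp
using System;
using System.Linq;

// A few rows from the movie list above, as tuples to keep the sketch short.
(string Name, int Released)[] movies =
[
    ("Inception", 2010),
    ("The Matrix", 1999),
    ("Fight Club", 1999),
    ("Pulp Fiction", 1994),
    ("Forrest Gump", 1994)
];

var viaGroupBy = movies.GroupBy(m => m.Released).Select(g => g.First()).ToList();
var viaDistinctBy = movies.DistinctBy(m => m.Released).ToList();

Console.WriteLine(string.Join(", ", viaDistinctBy.Select(m => m.Name)));
// Inception, The Matrix, Pulp Fiction

Console.WriteLine(viaGroupBy.SequenceEqual(viaDistinctBy)); // True
```

In both cases the first movie found for each year is the one kept.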

This does not mean GroupBy can no longer be used. It may be a personal choice, or DistinctBy may not be right for a specific task.

An example where GroupBy may be a better direction, still using the Movie model, is grouping movies by whether Name starts with "The" and by Rating.

public static void GroupMoviesNameStartsWithAndRating()
{

    PrintCyan();

    var moviesGroupedByNameAndRating = MockedData.MovieList()
        .GroupBy(m => new MovieGroupItem(
            m.Name.StartsWith("The", StringComparison.OrdinalIgnoreCase), 
            m.Rating));


    AnsiConsole.MarkupLine($"[{Color.Chartreuse1}][u]Name                      Released    Rating[/][/]");
    foreach (var group in moviesGroupedByNameAndRating)
    {
        if (!group.Key.StartsWithThe) continue;
        foreach (var movie in group)
        {
            Console.WriteLine($"{movie.Name, -25} {movie.Released, -12}{movie.Rating}");
        }

    }
}
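MovieGroupItem is not shown in this article. Judging by how it is used above, it could be a positional record along these lines; the second property name, Rating, is a guess. A record works well as a grouping key because two instances with the same values compare as equal.

```csharp
using System;
using System.Linq;

// A few movies as tuples; the article's code uses the Movie model instead.
(string Name, int Rating)[] movies =
[
    ("Inception", 5),
    ("The Matrix", 5),
    ("Fight Club", 4),
    ("The Dark Knight", 5)
];

var groups = movies
    .GroupBy(m => new MovieGroupItem(
        m.Name.StartsWith("The", StringComparison.OrdinalIgnoreCase),
        m.Rating))
    .ToList();

foreach (var group in groups)
{
    Console.WriteLine($"{group.Key}: {string.Join(", ", group.Select(m => m.Name))}");
}

// Assumed shape of the key type; the real definition is in the article's source code.
public record MovieGroupItem(bool StartsWithThe, int Rating);
```

With this data there are three groups, and "The Matrix" and "The Dark Knight" share the key (true, 5).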

DistinctBy with one property

The following model will be used in several examples.

/// <summary>
/// Represents a member with personal details and address information.
/// </summary>
public class Member
{
    public int Id { get; set; }
    public bool Active { get; set; }
    public string FirstName { get; set; }
    public string SurName { get; set; }
    public Gender Gender { get; set; }
    public Address Address { get; set; }
    public override string ToString() => Id.ToString();
}

In this example, data has somehow been sent with duplicate primary keys.

public static IEnumerable<Member> MembersList3() =>
[
    new() { Id = 1, Active = true, FirstName = "Mary", SurName = "Adams", Gender = Gender.Female},
    new() { Id = 2, Active = false, FirstName = "Sue", SurName = "Williams", Gender = Gender.Female},
    new() { Id = 1, Active = false, FirstName = "Jake", SurName = "Burns", Gender = Gender.Male},
    new() { Id = 4, Active = true, FirstName = "Jake", SurName = "Burns", Gender = Gender.Male},
    new() { Id = 5, Active = true, FirstName = "Clair", SurName = "Smith",Gender = Gender.Other},
    new() { Id = 1, Active = true, FirstName = "Mary", SurName = "Adams", Gender = Gender.Female },
    new() { Id = 7, Active = true, FirstName = "Sue", SurName = "Miller", Gender = Gender.Female }
];

The following works against the primary key.

public static void DistinctByPrimaryKey()
{

    PrintCyan();

    var distinctList = MockedData.MembersList3()
        .DistinctBy(member => new
        {
            member.Id
        })
        .ToList();

    MemberHeader();

    foreach (var (index, item) in distinctList.Index())
    {
        Console.WriteLine($"{index,-7}{item.Id,-10}{item.FirstName,-10}{item.SurName}");
    }
}

query results
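With a single property, the anonymous type is optional, since the selector can return the property itself. The sketch below uses tuples in place of the Member model (an assumption made to keep it self-contained) and the same Id, FirstName and SurName values as MembersList3.

```csharp
using System;
using System.Linq;

// Tuples stand in for the Member model; only the columns printed above are kept.
(int Id, string FirstName, string SurName)[] members =
[
    (1, "Mary", "Adams"),
    (2, "Sue", "Williams"),
    (1, "Jake", "Burns"),
    (4, "Jake", "Burns"),
    (5, "Clair", "Smith"),
    (1, "Mary", "Adams"),
    (7, "Sue", "Miller")
];

// The key is the Id itself; no anonymous type is needed for a single property.
var byId = members.DistinctBy(m => m.Id).ToList();

// The same key wrapped in an anonymous type, as in the code above.
var byAnonymousId = members.DistinctBy(m => new { m.Id }).ToList();

Console.WriteLine(string.Join(", ", byId.Select(m => m.Id))); // 1, 2, 4, 5, 7
Console.WriteLine(byId.SequenceEqual(byAnonymousId));          // True
```

For Id 1, the first row (Mary Adams) is kept and the later rows with the same Id are dropped.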

DistinctBy with multiple properties

Get distinct by first name, surname and active.

Data

public static IEnumerable<Member> MembersList1() =>
[
    new() { Id = 1, Active = true, FirstName = "Mary", SurName = "Adams", Gender = Gender.Female},
    new() { Id = 2, Active = false, FirstName = "Sue", SurName = "Williams", Gender = Gender.Female},
    new() { Id = 3, Active = true, FirstName = "Jake", SurName = "Burns", Gender = Gender.Male},
    new() { Id = 4, Active = true, FirstName = "Jake", SurName = "Burns", Gender = Gender.Male},
    new() { Id = 5, Active = true, FirstName = "Clair", SurName = "Smith",Gender = Gender.Other},
    new() { Id = 6, Active = true, FirstName = "Mary", SurName = "Adams", Gender = Gender.Female },
    new() { Id = 7, Active = true, FirstName = "Sue", SurName = "Miller", Gender = Gender.Female }
];

Code example

public static void DistinctByFirstLastNameAndActive()
{

    PrintCyan();

    var distinctList = MockedData.MembersList1()
        .DistinctBy(member => new
        {
            member.FirstName,
            member.SurName,
            member.Active
        })
        .ToList();

    MemberHeader();

    foreach (var (index, item) in distinctList.Index())
    {
        Console.WriteLine($"{index,-7}{item.Id,-10}{item.FirstName,-10}{item.SurName}");
    }
}

query results
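A value tuple can serve as the composite key as well, because tuples, like anonymous types, compare their members by value. This is a swapped-in alternative, not what the article's source code does, and tuples again stand in for the Member model with the values from MembersList1.

```csharp
using System;
using System.Linq;

(int Id, bool Active, string FirstName, string SurName)[] members =
[
    (1, true, "Mary", "Adams"),
    (2, false, "Sue", "Williams"),
    (3, true, "Jake", "Burns"),
    (4, true, "Jake", "Burns"),
    (5, true, "Clair", "Smith"),
    (6, true, "Mary", "Adams"),
    (7, true, "Sue", "Miller")
];

// The key is a tuple of the three properties that must match.
var distinct = members
    .DistinctBy(m => (m.FirstName, m.SurName, m.Active))
    .ToList();

Console.WriteLine(string.Join(", ", distinct.Select(m => m.Id))); // 1, 2, 3, 5, 7
```

Ids 4 and 6 are dropped because their first name, surname and active flag match Ids 3 and 1.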

DistinctBy on a sub property

The following example uses DistinctBy on the Address property of the Member model.

public class Member
{
    public int Id { get; set; }
    public bool Active { get; set; }
    public string FirstName { get; set; }
    public string SurName { get; set; }
    public Gender Gender { get; set; }
    public Address Address { get; set; }
    public override string ToString() => Id.ToString();
}
public class Address
{
    public int Id { get; set; }
    public string Street { get; set; }
    public string City { get; set; }
    public string State { get; set; }
}

Data

public static IEnumerable<Member> MembersList4() =>
[
    new()
    {
        Id = 1, 
        Active = true, 
        FirstName = "Mary", 
        SurName = "Adams", 
        Gender = Gender.Female,
        Address = new() { Id = 1, Street = "123 Main St", City = "Portland", State = "NY" }
    },
    new()
    {
        Id = 2, 
        Active = false, 
        FirstName = "Sue", 
        SurName = "Williams", 
        Gender = Gender.Female,
        Address = new() { Id = 2, Street = "124 Main St", City = "Anytown", State = "NY" }
    },
    new()
    {
        Id = 3, 
        Active = false, 
        FirstName = "Jake", 
        SurName = "Burns", 
        Gender = Gender.Male,
        Address = new() { Id = 3, Street = "123 Main St", City = "Anytown", State = "CA" }
    },
    new()
    {
        Id = 4, 
        Active = true, 
        FirstName = "Jake", 
        SurName = "Burns", 
        Gender = Gender.Male,
        Address = new() { Id = 4, Street = "123 Main St", City = "Anytown", State = "PA" }
    },
    new()
    {
        Id = 5,
        Active = true, 
        FirstName = "Clair", 
        SurName = "Smith",
        Gender = Gender.Other,
        Address = new() { Id = 5, Street = "123 Main St", City = "Anytown", State = "NJ" }
    },
    new()
    {
        Id = 6, 
        Active = true, 
        FirstName = "Mary", 
        SurName = "Adams", 
        Gender = Gender.Female,
        Address = new() { Id = 1, Street = "123 Main St", City = "Portland", State = "NY" }
    }
];

Code sample

public static void DistinctByAddress()
{

    PrintCyan();

    var distinctList = MockedData.MembersList4()
        .DistinctBy(member => new
        {
            member.Address.Id,
            member.Address.Street,
            member.Address.City,
            member.Address.State

        })
        .ToList();

    MemberHeader();

    foreach (var (index, item) in distinctList.Index())
    {
        Console.WriteLine($"{index,-7}{item.Id,-10}{item.FirstName,-10}{item.SurName}");
    }
}

The results in this case are not apparent from the console output, which does not show the address. For learning purposes, open the MockedData file to review MembersList4, then set a breakpoint on MemberHeader in DistinctByAddress.

Run the project and, when the breakpoint is hit, use the debugger's Locals window to examine the results.
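One reason to project the address properties into an anonymous type is that Address is a class, and a class that does not override Equals is compared by reference. The following sketch, with the hypothetical stand-in types MemberSketch and AddressSketch and three of the rows from MembersList4, shows the difference.

```csharp
using System;
using System.Linq;

MemberSketch[] members =
[
    new(1, new AddressSketch { Id = 1, Street = "123 Main St", City = "Portland", State = "NY" }),
    new(2, new AddressSketch { Id = 2, Street = "124 Main St", City = "Anytown", State = "NY" }),
    new(6, new AddressSketch { Id = 1, Street = "123 Main St", City = "Portland", State = "NY" })
];

// Address is a class, so the key is compared by reference: nothing is removed.
var byReference = members.DistinctBy(m => m.Address).ToList();

// Anonymous types compare property values, so members 1 and 6 share a key.
var byValues = members
    .DistinctBy(m => new { m.Address.Id, m.Address.Street, m.Address.City, m.Address.State })
    .ToList();

Console.WriteLine(byReference.Count);                              // 3
Console.WriteLine(string.Join(", ", byValues.Select(m => m.Id)));  // 1, 2

// Hypothetical stand-ins for the article's Member and Address models.
public record MemberSketch(int Id, AddressSketch Address);

public class AddressSketch
{
    public int Id { get; set; }
    public string Street { get; set; } = "";
    public string City { get; set; } = "";
    public string State { get; set; } = "";
}
```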

Summary

This article showed how DistinctBy, available in .NET 6 and higher, offers a new way to get distinct items from a list, along with one case comparing GroupBy versus DistinctBy.
