LINQ's Distinct method removes duplicate elements from a collection. Whether you're working with a large dataset or with complex data structures, it saves you from writing the de-duplication logic by hand.
The Distinct method is defined as an extension method in the System.Linq namespace and can be used on any collection that implements IEnumerable<T>. Its basic overload takes no arguments and returns an IEnumerable<T> with the same element type as the input sequence.
Example Code of Linq Distinct() in C#
In this example, the Distinct method is called on an array of integers (numbers). The resulting sequence (distinctNumbers) contains only the unique elements from the original array.
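The call described above can be sketched as follows. This is a minimal illustration, and the input values are an assumption: any integer array that contains the values 1 through 5 with some repeats would behave the same way. In LINQ to Objects the unique values come out in order of first occurrence.

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        // Hypothetical input: the values 1 to 5 with a few repeats.
        int[] numbers = { 1, 2, 2, 3, 4, 4, 5 };

        // Distinct() is lazy: duplicates are filtered while the sequence is enumerated.
        var distinctNumbers = numbers.Distinct();

        foreach (var number in distinctNumbers)
        {
            Console.WriteLine(number);
        }
    }
}
```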
1
2
3
4
5
Note that the Distinct method uses the default equality comparer to compare elements in the sequence. If you want to use a custom equality comparer, you can pass it as a parameter to the Distinct method.
Custom Comparer With Linq Distinct()
Consider a scenario where we want to get distinct objects from the collection using one or two properties of the object.
    List<Person> people = new List<Person>
    {
        new Person { Name = "John", Age = 25 },
        new Person { Name = "Jane", Age = 30 },
        new Person { Name = "John", Age = 25 },
        new Person { Name = "Bob", Age = 20 }
    };
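By default, Distinct compares Person objects by reference (unless Person overrides Equals and GetHashCode), so all four objects above are treated as distinct even though two of them hold the same values. To get distinct objects based on a property such as Name, we need to implement a custom equality comparer.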
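    public class PersonNameComparer : IEqualityComparer<Person>
    {
        // Two people are considered equal when their names match.
        public bool Equals(Person x, Person y)
        {
            if (ReferenceEquals(x, y)) return true;
            if (x is null || y is null) return false;
            return x.Name == y.Name;
        }

        // The hash code must be built from the same property that Equals compares.
        public int GetHashCode(Person obj)
        {
            return obj?.Name?.GetHashCode() ?? 0;
        }
    }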
PersonNameComparer is a custom equality comparer that treats two Person objects as equal when their Name properties match. To use it, pass an instance to the Distinct method:
    var distinctPeople = people.Distinct(new PersonNameComparer());
For the list above, distinctPeople contains three people: John, Jane, and Bob.
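The same pattern extends to the "two properties" case mentioned earlier. The sketch below is an illustration only: PersonNameAgeComparer is a hypothetical name, the sample data is made up, and HashCode.Combine assumes .NET Core 2.1 or later (or another target that includes System.HashCode).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

// Hypothetical comparer: two people are equal when both Name and Age match.
class PersonNameAgeComparer : IEqualityComparer<Person>
{
    public bool Equals(Person x, Person y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Name == y.Name && x.Age == y.Age;
    }

    // Combine exactly the properties that Equals compares.
    public int GetHashCode(Person obj) =>
        obj is null ? 0 : HashCode.Combine(obj.Name, obj.Age);
}

class Program
{
    static void Main()
    {
        var people = new List<Person>
        {
            new Person { Name = "John", Age = 25 },
            new Person { Name = "John", Age = 40 }, // same name, different age
            new Person { Name = "John", Age = 25 }  // duplicate of the first entry
        };

        var distinctPeople = people.Distinct(new PersonNameAgeComparer()).ToList();

        Console.WriteLine(distinctPeople.Count); // prints 2
    }
}
```

Equals and GetHashCode have to agree on which properties matter: Distinct only calls Equals on elements whose hash codes collide, so a hash built from different properties would let duplicates through.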
Linq Distinct Select
Let's look at how the Select method can be chained after Distinct to transform the unique elements.
    using System;
    using System.Linq;

    class Program
    {
        static void Main()
        {
            int[] numbers = { 1, 2, 2, 3, 4, 4, 5, 6, 6, 7 };

            var distinctNumbers = numbers.Distinct().Select(x => x * 2);

            Console.WriteLine("Distinct numbers multiplied by 2:");
            foreach (var number in distinctNumbers)
            {
                Console.WriteLine(number);
            }
        }
    }
In this example, we have an array of numbers containing duplicate elements. We use the LINQ Distinct method to remove the duplicates, ensuring that only unique values are retained. After that, we chain the Select method to multiply each distinct number by 2. The result is a new collection with distinct numbers multiplied by 2.
Distinct numbers multiplied by 2:
2
4
6
8
10
12
14
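The order of the two calls matters whenever the projection can map different inputs to the same value. The sketch below reuses the same array but swaps in integer division by 2 as the projection (an assumption chosen purely to make the difference visible).

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        int[] numbers = { 1, 2, 2, 3, 4, 4, 5, 6, 6, 7 };

        // Distinct first: the 7 unique inputs are projected, and the projection
        // itself produces repeated values that are not removed.
        var distinctThenSelect = numbers.Distinct().Select(x => x / 2);

        // Select first: duplicates created by the projection are removed as well.
        var selectThenDistinct = numbers.Select(x => x / 2).Distinct();

        Console.WriteLine(string.Join(", ", distinctThenSelect)); // 0, 1, 1, 2, 2, 3, 3
        Console.WriteLine(string.Join(", ", selectThenDistinct)); // 0, 1, 2, 3
    }
}
```

With a one-to-one projection such as x * 2, both orders give the same result, so the choice only matters for projections that collapse values.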
Linq Distinct by Property/Field
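Distinct itself has no overload that takes a key selector, so to get distinct objects based on a specific property or field we can group the collection by that property and take the first element of each group.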
    using System;
    using System.Collections.Generic;
    using System.Linq;

    class Program
    {
        static void Main()
        {
            List<Person> people = new List<Person>
            {
                new Person { Id = 1, Name = "John" },
                new Person { Id = 2, Name = "Jane" },
                new Person { Id = 3, Name = "John" },
                new Person { Id = 4, Name = "Adam" },
                new Person { Id = 5, Name = "Jane" }
            };

            var distinctPeople = people.GroupBy(p => p.Name)
                                       .Select(g => g.First());

            Console.WriteLine("Distinct People:");
            foreach (var person in distinctPeople)
            {
                Console.WriteLine($"Id: {person.Id}, Name: {person.Name}");
            }
        }
    }

    class Person
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }
In this example, we have a list of Person objects where each person has an Id and a Name property. We want to retrieve a distinct set of people based on their names. To achieve this, we use LINQ's GroupBy method to group the people by their names. Then, we select the first person from each group using the First method. This ensures that only the first occurrence of each distinct name is included in the result.
Distinct People:
Id: 1, Name: John
Id: 2, Name: Jane
Id: 4, Name: Adam
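On .NET 6 and later, System.Linq also provides a DistinctBy method that takes a key selector directly. The sketch below assumes such a target framework and reuses the same sample data; it is an alternative to the GroupBy approach, not a replacement for it on older frameworks.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

class Program
{
    static void Main()
    {
        var people = new List<Person>
        {
            new Person { Id = 1, Name = "John" },
            new Person { Id = 2, Name = "Jane" },
            new Person { Id = 3, Name = "John" },
            new Person { Id = 4, Name = "Adam" },
            new Person { Id = 5, Name = "Jane" }
        };

        // DistinctBy (.NET 6+) keeps the first element encountered for each key.
        var distinctPeople = people.DistinctBy(p => p.Name);

        foreach (var person in distinctPeople)
        {
            Console.WriteLine($"Id: {person.Id}, Name: {person.Name}");
        }
    }
}
```

For this data the loop prints the people with Id 1, 2, and 4, the same three that the GroupBy version selects.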
Conclusion
In summary, the Distinct() method is a useful LINQ method that retrieves the unique elements of a collection, using either the default equality comparer or a custom one that you supply.