Exploring HashSet: A Dive into Unordered Collections
Arshi Saxena
Posted on October 14, 2024
Introduction
The HashSet
class is part of the Java Collections Framework, providing a fast, unordered collection that doesn't allow duplicate elements. It is built on top of the HashMap
, meaning it inherits the same time complexity benefits but focuses purely on element uniqueness. In this article, we’ll explore how HashSet
works, what makes it unique, and why it's different from other collections.
What is a HashSet?
A HashSet
is:
- Unordered: The elements have no predictable sequence.
- Unique: Duplicate elements are ignored.
-
Internally backed by a HashMap: It uses a
HashMap
to store elements, focusing only on the keys while discarding values. - O(1) average time complexity: Operations like insertion, deletion, and search are highly efficient.
1. Initializing a HashSet
// Parameterized constructor with initial capacity
Set<Integer> setWithInitialCapacity = new HashSet<>(5);
// Parameterized constructor using a collection
Set<Integer> setWithCollection = new HashSet<>(Arrays.asList(4, 4, 3));
// Default constructor with default capacity 16
Set<Integer> set = new HashSet<>();
Explanation:
-
Default Constructor: Creates a
HashSet
with an initial capacity of 16. - Parameterized Constructor: You can specify the initial capacity, but note that capacity isn’t the same as size. Size refers to the actual number of elements in the set.
-
Using a Collection: A
HashSet
can be created from a collection like a list, ensuring that only unique elements are retained.
2. Adding Elements to a HashSet
set.add(1);
set.add(2);
set.add(1); // Duplicate value is ignored
System.out.println(set); // Output -> [1, 2]
Explanation:
- The
add()
method inserts elements into theHashSet
. -
Duplicate elements are ignored. When you try to add
1
twice, only the first occurrence is retained.
Key Takeaway
If you need to replace duplicate values instead of ignoring them, HashSet
won’t be the right choice. This is because it prioritizes element uniqueness.
3. Checking Size vs Capacity
// Parameterized constructor with initial capacity
Set<Integer> setWithInitialCapacity = new HashSet<>(5);
System.out.println(setWithInitialCapacity.size()); // Output -> 0
Even though the capacity of setWithInitialCapacity
is 5
, the size is 0
because size reflects the number of elements present in the set, not the initial capacity. You can think of capacity as the internal storage space, which adjusts as elements are added.
4. Using HashSet with Collections
// Parameterized constructor using a collection
Set<Integer> setWithCollection = new HashSet<>(Arrays.asList(4, 4, 3));
System.out.println(setWithCollection); // Output -> [3, 4] or [4, 3]
Explanation:
- Although three elements were provided in the list (
4, 4, 3
), the duplicate value4
was discarded, leaving only two elements (3
and4
). - The order of elements is unpredictable because
HashSet
doesn’t maintain any insertion or natural ordering.
If you need to retain sorted elements, consider using a TreeSet
, which ensures elements are arranged in ascending order.
5. Indexing in HashSet – Is it Possible?
In interviews, a common question is whether you can retrieve an index of an element in a HashSet
. The answer is No, because HashSet
uses a hashing mechanism to store elements, not an index-based structure like a list or an array.
Summary of Key Points
-
Unordered and Unique:
HashSet
only retains unique elements, ignoring duplicates. -
Built on HashMap: It uses the keys of an internal
HashMap
to store elements. - Fast Operations: Average time complexity is O(1) for adding, removing, and checking elements.
- Capacity vs Size: Capacity is the allocated space, while size is the actual number of elements.
- No Indexing: You cannot retrieve elements by index due to the hashing mechanism.
Relation to HashMap
Since HashSet
is backed by a HashMap
, it uses the keys of the map to store elements, while the values are irrelevant. This is why every element in a HashSet
must be unique, just like the keys in a HashMap
.
Conclusion
HashSet
is a powerful tool when you need a fast, unordered collection that avoids duplicates. While it offers O(1) time complexity for most operations, it lacks features like sorting and indexing. For developers, knowing how HashSet
relates to HashMap
helps to understand its inner workings and make better use of the collections framework.
In the next post, we’ll explore a common interview question frequently asked in interviews to test candidates' knowledge of collections concepts.
Related Posts
Happy Coding!
Posted on October 14, 2024
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.