Sets and their usefulness
Aaron Elligsen
Posted on September 10, 2022
Sets
The most basic data structure, the beginning of them all, is the Set. It is one of the bases of Mathematics. A set is a simple collection of unique items. It has no other constraints.
A set can tell you one thing, whether an element is inside of it or not. There are a great many situations in software which come down to this question. However, the key property of a Set is uniqueness! If it contains an element it only contains one of those elements.
Looking at Sets in JavaScript will help us move from Sets as a mathematical concept to practical implementation. Any real world implementation of a theoretical concept will come with some limitations which we will discuss throughout. Before running the code blocks below I encourage you to guess what the output will be
A Container and its methods
In JavaScript Sets are a collection object. They contain values of other data. Sets were added to the language in JavaScript 2015. Sets make it easier and more efficient to test element membership.
Constructor
The JavaScript set constructor take an optional parameter of an iterable. Elements are added to this container object and the data structure steps through them. Frequently, arrays are used for Set construction but other sets are valid inputs as well, or any other custom data structure which implements the JavaScript iterable protocol. In any case, what you get at the end is a collection of unique elements.
let emptySet = new Set();
//create a new set containing 1, 2, and 3
let smallNumbers = new Set([1, 2, 3])
let copyNumbers = new Set(smallNumbers);
let someOtherSet = new Set(iterableCollection)
Add
JavaScript sets are mutable, or changeable, meaning we can add new unique elements to a set.
let smallNumbers = new Set([1, 2, 3])
smallNumbers.add(4) //set now contains [1,2,3,4]
Delete
The complement of add removes an element from a set.
let smallNumbers2 = new Set([1, 2, 3])
smallNumbers2.delete(3) //set now contains [1,2]
Has - the membership test
The most important method of a set, it will let you ask if a given element is in a particular set, returning either true or false. In most mathematical writing this method is called "in" and is written using the symbol ∈ and can be read as either "in" or "belongs to". In the case that it does not belong in the set, it can be written as "not in" or ∉. The behavior and sensitivity of has()
is what sets it apart and is an improvement over JavaScript's standard objects.
let responseSet = new Set(["yes", "no"]);
responseSet.has("yes") // true
responseSet.has("no") // true
responseSet.has("maybe") //false
JavaScript Aside: the keyword in
in
serves to test membership of string keys in a standard JavaScript object. However, it does not work with sets.
JavaScript objects consist of zero or more key value pairs. {key: value1, key2: value2, ...}
. All keys are converted to strings upon object creation, unless they are a symbol. One important feature of Sets in JavaScript is that they can test membership of objects correctly, which is not the case for standard JavaScript objects.
const mySymbol = Symbol("bet you haven't seen these");
const yourSymbol = Symbol("Virgo?");
let node = {value: 10, next: null};
let otherNode = {value: 100, next: {value: 90, next: null}};
let exampleObj = {
one: 1,
two: 2,
1: 0,
true: "example",
"undefined": "sort of undefined",
"null": "not so null",
[mySymbol]: "Symbols are a newish primitive type"
}
exampleObj[node] = "misleadingAssignment";
exampleObj[undefined] = "defined";
exampleObj[null] = "definitely null";
//After these last lines
//what do you expect the values of
//exampleObj["undefined"]
//exampleObj["null"] to equal?
"one" in exampleObj // true
"two" in exampleObj // true
"1" in exampleObj //true
1 in exampleObj //true - 1 is cast to a string
true in exampleObj //true - true is cast to a string
"true" in exampleObj //true
2 in exampleObj //false
undefined in exampleObj //true - undefined is cast to a string
"undefined" in exampleObj //true
null in exampleObj // true - null is cast to a string
"null" in exampleObj //true
mySymbol in exampleObj // true
yourSymbol in exampleObj //false
node in exampleObj //returns true, but it is a lie
otherNode in exampleObj //returns true but it shouldn't.
"[object Object]" in exampleObj //returns true, but why?
We never added otherNode to the exampleObj. Unhelpfully, JavaScript converts node
to the string '[object Object]'. This default behavior is provided by the .toString method inherited by all objects. You can override this behavior to try to return unique strings for an object but this can be incorrect as well.
As we can see JavaScript's old syntax for testing membership has several gotcha's and limitations. The Set has
method improves the situation by no longer implicitly converting the element we want to test into a string. The has
method does value comparison for primitive elements and reference comparison for objects (asking is it the same object in memory?). It does not do any casts or conversions. In the following code block, we can see how the set is not confused between primitive values and the string versions of those values. Importantly how it correctly answers the question of membership of objects like node
and otherNode
from the example above.
let betterSet = new Set([
"one",
1,
true,
undefined,
null,
node,
mySymbol
])
betterSet.has("one") // true
betterSet.has("1") //false
betterSet.has(1) //true
betterSet.has(true) //true
betterSet.has("true") //false
betterSet.has(undefined) //true
betterSet.has("undefined") //false
betterSet.has(null) //true
betterSet.has("null") //false
betterSet.has(mySymbol) // true
betterSet.has(node) //true - correct
betterSet.has(otherNode) //false - correct
Mathematical Operations
Unfortunately, when JavaScript added sets to the language they did not provide the standard mathematical operators for sets as methods. MDN provides the following implementations of these operations.
Subset/Superset
A set can test if every element in that set is contained in another typically larger set. The former is called a subset and the latter is called the super set. In mathematics they use a sideways u symbol to indicate a set is a subset of another set, subset ⊆ set.
function isSuperset(set, subset) {
for (let elem of subset) {
if (!set.has(elem)) {
return false
}
}
return true
}
Equality
The superset operations provides a natural definition of set equality, that is if two sets are both supersets and subsets of each other then they are considered equal.
function setEqual(setA, setB) {
return isSuperset(setA, setB) && isSuperset(setB, setA);
}
Union
This is the notion of combining two different sets.
Formally, this operation takes two sets to creates a new set which contains all elements that are contained in setA and setB. Mathematically, this operation is represented with a '∪'. It is easier to remember the symbol as part of the name 'U'nion.
function union(setA, setB) {
let _union = new Set(setA)
for (let elem of setB) {
_union.add(elem)
}
return _union
}
union(new Set([1,2]), new Set([2,3,4])) //returns Set {1,2,3,4}
Intersection
The intersection of two sets is the set of all elements which the sets share in common. Formally, the intersection operation takes two sets and creates a new set which contains only elements that are both contained in setA and contained in setB. Mathematically, this operation is represented with '∩', like a large upside down U. I like to remember the symbol as i'n'tersection.
function intersection(setA, setB) {
let _intersection = new Set()
for (let elem of setB) {
if (setA.has(elem)) {
_intersection.add(elem)
}
}
return _intersection
}
intersection(new Set([1,2]), new Set([2,3,4])) //returns Set {2}
Symmetric Difference
This operation takes two sets and create a new set which contains only elements which are only contained in setA and only contained in setB, but not both.
function symmetricDifference(setA, setB) {
let _difference = new Set(setA)
for (let elem of setB) {
if (_difference.has(elem)) {
_difference.delete(elem)
} else {
_difference.add(elem)
}
}
return _difference
}
symmetricDifference(new Set([1,2]), new Set([2,3,4])) //returns Set {1,3,4}
Difference
This operation takes two sets to create a new set which contains only elements from the first set, excluding any elements from the second set. It is also sometimes called the relative complement.
function difference(setA, setB) {
let _difference = new Set(setA)
for (let elem of setB) {
_difference.delete(elem)
}
return _difference
}
Complement
This operation returns everything that is not in the current set. In the traditional mathematical set logic this operation is usually represented with a bar over the set. Another piece of convention is that U is used to represent the Universe, meaning the maximal set, as in everything possible for that set.
We can consider it visually like this.
function complement(setA, universe) {
return difference(universe, setA);
}
Mathematical Laws
Sets and their operation have a number of important laws governing their behavior. These are pieces of information that you know will be true that you can use to build programs.
Subset laws
∅ ⊆ A
A ⊆ A
Subsets have two follow two important laws. The empty subset is a subset of every set and every set is a subset of itself.
Identity Laws
Union and intersection both have special sets which act as an identity element for these operations, meaning you end up with what you started with.
Domination Laws
The domination laws say that if you union a set with everything you get everything and if you intersect a set with nothing you get nothing.
Idempotent Laws
Similar to the identity laws, in these situations you end up with what you started with.
Complementation Law
This is the formal notion that a double complement of a set is the set itself.
Commutative Laws
The Commutative laws is one of the most powerful laws of sets. It tells you that regardless of the order of the operations for two sets you end up with the same value. It is powerful because if you have a large number of sets that you need to union or intersect together then you do not need to be worried about the specific order in which you do operations.
Associative Laws
The Associative law is similarly powerful and can be used with the commutative law to process groups of sets in either multiple steps, multiple locations (as in distributed computing), or in different processing orders. It gives you tremendous flexibility in how you can process data.
Formally, it says that we can either take the union of three sets in either order A union (B union C first) or (A union B first) union C.
Distributive Laws
This law tells us how union and intersect interact together and what result to expect from the different combinations. They also provide alternative ways to calculate the answer we might be after.
De Morgan's Laws
De Morgan's laws, named for a famous mathematician, tell use how complement interacts with union and intersection.
Absorption Laws
The absorption laws are somewhat a retelling of the idempotent law. The tell yet another situation where we consistently end up where we started. These situations are helpful in software because they let us skips steps and go straight to the end without having to recalculate answer we already have.
Complement Laws
To finally round out our picture, we can say that if we union A with everything not in A, we get everything. Similarly, if we take A intersect with everything that is not in A, there is nothing in common, so we get the empty set.
Patterns
Commonly, you have multiple sets that you either need to union or intersect to get some result.
let mathClass = new Set(["Alice","Bob","Carl","Eve"]);
let litClass = new Set(["Jane","John","Bob","Eve"]);
let historyClass = new Set(["Jim","Bob","Leslye","Eve"]);
let allClasses = [mathClass, litClass, historyClass];
let allClassMates = allClasses.reduce(
(members, classSet) => union(members, classSet), new Set())
let commonStudents = allClasses.reduce(
(members, classSet) => intersection(members, classSet));
JavaScript Caveats
The biggest caveat in JavaScript is around object equality. Set of objects in JavaScript compare the objects by reference, meaning it checks if they are the same object in memory not if they are the same value. It should be obvious that the same object in memory is equivalent to itself, but if we expect {one: true, two: false}
to be equivalent to another object with the same values the default behavior of javascript is unexpected. This is a weakness of the equality operators in JavaScript that is inherited by the Set implementation.
let example1 = {one: true, two: false}
let example2 = {one: true, two: false};
console.log(example1 == example1) //true
console.log(example1 == example2) //false - by reference
let sampleSet = new Set([example1]);
sampleSet.has(example1) // true as expected
sampleSet.has(example2) // false perhaps unexpected
JavaScript Alternatives
Links
Posted on September 10, 2022
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.