Scala collections in a nutshell

bartoszgajda55

Bartosz Gajda

Posted on September 19, 2020

Scala mixes the object-oriented and functional programming paradigms, and its collections framework is fairly distinctive compared to those of other languages. Although it inherits a lot from Java and the JVM world, Scala's collections API is easier to use and more polished overall.

In this post, I will show you a range of Scala collections, how to use them with code samples, and where to use them. Enjoy!

Scala collection package(s)

First, let's look at the packages that contain the Scala collections. The top-level package is called, as you might expect, collection. More interesting are its sub-packages, which contain more specialized versions of the typical collections, tailored to specific use cases.

Whether you need parallel processing, concurrent access, or just a plain key-based lookup, import the right package and you are good to go!

The table below lists the collection sub-packages, with a brief description of what you can find in each.

Package name | Description
collection | Defines the base traits and objects needed to use and extend Scala’s collections library, including all definitions in sub-packages. Most of the abstractions you’ll work with are defined here.
collection.concurrent | Defines a Map trait and TrieMap class with atomic, lock-free access operations.
collection.convert | Defines types for wrapping Scala collections with Java collection abstractions and wrapping Java collections with Scala collection abstractions.
collection.generic | Defines reusable components used to build the specific mutable, immutable, etc. collections.
collection.immutable | Defines the immutable collections, the ones you’ll use most frequently.
collection.mutable | Defines mutable collections. Most of the specific collection types are available in mutable and immutable forms, but not all.
collection.parallel | Defines reusable components used to build specific mutable and immutable collections that distribute processing to parallel threads.
collection.parallel.immutable | Defines parallel, immutable collections.
collection.parallel.mutable | Defines parallel, mutable collections.
collection.script | A deprecated package of tools for observing collection operations.
Source: Programming Scala, 2nd Edition

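As you can see, the range of options is very wide. On one hand, it is good to have an appropriate collection for every use case. On the other hand, it can be hard to get started if all you want is a simple Map. Keep in mind that the table reflects the Scala version covered by the book; since Scala 2.13, the parallel collections are distributed as a separate module.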
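To give a feel for one of the more specialized packages, here is a minimal sketch using TrieMap from collection.concurrent. The map name `hits` and its contents are made up for this example; it only illustrates the atomic putIfAbsent operation, not a full concurrent program.

```scala
import scala.collection.concurrent.TrieMap

// A thread-safe map; the name `hits` and its contents are hypothetical
val hits: TrieMap[String, Int] = TrieMap.empty[String, Int]

// putIfAbsent inserts atomically and returns the previous value, if any
val first = hits.putIfAbsent("home", 1)   // the key was absent, so nothing is returned
val second = hits.putIfAbsent("home", 99) // the key exists, so the old value is kept

// Regular Map operations such as get work as usual
val current = hits.get("home")
```

For single-threaded code, the plain immutable Map covered later in this post is usually the simpler choice.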

For the purpose of this post, let's focus on the most widely used collections - the immutable ones.

collection.immutable

The immutable collections are the most popular of all Scala collections. In fact, the most common ones (such as List, Set, Map and Vector) are available by default, so we can start using them without any imports.
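A small sketch of what "available by default" means in practice, assuming a recent Scala version. The value names are made up for illustration; the point is that an unqualified Set is the immutable one, while the mutable variant has to be imported.

```scala
import scala.collection.mutable

// No import needed: an unqualified Set is scala.collection.immutable.Set
val numbers = Set(1, 2, 3)

// "Adding" to an immutable Set returns a new Set; the original is unchanged
val moreNumbers = numbers + 4

// The mutable variant has to be imported and referenced explicitly
val mutableNumbers = mutable.Set(1, 2, 3)
mutableNumbers += 4 // modifies the set in place
```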

This package includes many traits and concrete implementations of data structures. The image below shows them (blue is a trait, black is a class):

[Image: hierarchy of the collection.immutable package. Source: Scala Docs]

Now, let's set that aside and actually learn to use some of the basic collections. The following sections focus on some of the widely used implementations, one at a time.

Set

The first one on the list is Set, a collection that contains no duplicate elements. If you need a collection of unique elements only, this is the one to choose. A Set is created like this:

// Set of Char type
val set: Set[Char] = Set('a', 'b', 'c')

// Or just empty Set
val setOfInt: Set[Int] = Set()

What's interesting about Set is its apply method. It behaves like the contains method - both check whether the Set contains a given element:

val set: Set[Int] = Set(1, 2, 5, 6, 9)

set.contains(2) // true
set(2) // same as above, returns true

A Set offers a wide range of operations - the full reference can be found in the Scala API documentation.
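A few of those operations, sketched with made-up values. The last line relies on the apply method described above: since a Set[Int] can be called like a function from Int to Boolean, it can be passed wherever a predicate is expected.

```scala
val evens: Set[Int] = Set(2, 4, 6, 8)
val small: Set[Int] = Set(1, 2, 3, 4)

val either = evens.union(small)      // elements present in at least one set
val both = evens.intersect(small)    // elements present in both sets
val evensOnly = evens.diff(small)    // elements of evens that are not in small

// Duplicates are dropped on construction
val deduplicated = Set(1, 1, 2, 2, 3)

// A Set used as a predicate: keeps only the elements that are in evens
val picked = List(1, 2, 5, 8).filter(evens)
```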

Where to use Set?

  • Don't want duplicates
  • Need to quickly check for the presence of an element
  • No need to traverse

Map

A Map is a collection of key-value pairs. Every key has a corresponding value assigned to it. Keys have to be unique, so each key maps to exactly one value; the default hash-based implementation offers effectively constant-time lookup. Values, however, do not have to be unique. Let's see how we can create a Map:

// A Map of Int to String mapping
val map: Map[Int, String] = Map(1 -> "one", 2 -> "two", 3 -> "three")

// Or an empty Map
val emptyMap: Map[String, String] = Map()

The elements of a Map are created using the key -> value syntax, although you can also use (key, value) if you prefer (the first version is preferred, though).
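A quick sketch showing that the two notations are interchangeable: the arrow is just another way of building a tuple. The value names are made up for this example.

```scala
// Both expressions build the same Tuple2[Int, String]
val arrowPair = 1 -> "one"
val tuplePair = (1, "one")

// So both ways of writing the Map produce equal maps
val arrowMap = Map(1 -> "one", 2 -> "two")
val tupleMap = Map((1, "one"), (2, "two"))
```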

What's interesting about Map is how we get a value by its key. The default way is the get(key) method. This method, however, returns an Option[T] rather than a T. This accommodates the situation where a key doesn't exist - instead of throwing an exception, get returns None. If you prefer getting the value directly, the apply method will give you that (or throw a NoSuchElementException if the key is not found). Let's see how it looks:

val map: Map[Char, Int] = Map('I' -> 1, 'V' -> 5, 'X' -> 10)

map.get('I') // returns Some(1), an Option[Int]
map.get('D') // returns None

map('V') // returns 5
map('C') // throws NoSuchElementException

A Map offers a wide range of operations - the full reference can be found in the Scala API documentation.
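Some other common ways of working with a Map, as a minimal sketch. The map `roman` and the other value names are made up for this example.

```scala
val roman: Map[Char, Int] = Map('I' -> 1, 'V' -> 5, 'X' -> 10)

// getOrElse supplies a fallback instead of throwing or returning an Option
val known = roman.getOrElse('X', 0)
val unknown = roman.getOrElse('D', 0)

// Pattern matching on the Option returned by get
val description = roman.get('V') match {
  case Some(value) => s"V is $value"
  case None        => "V is not defined"
}

// Adding a pair returns a new Map; roman itself is unchanged
val extended = roman + ('L' -> 50)
```

getOrElse is handy when a sensible default exists; pattern matching is the better fit when the missing case needs its own handling.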

You can learn more about Option in my blog post on it.

Where to use Map?

  • Need key-value storage
  • Need fast (effectively constant-time) lookup and retrieval
  • Order is not important
  • Keys are unique

Vector

A Vector is an indexed, ordered, and traversable sequence. Although that sounds complicated, it is a popular, general-purpose data structure, thanks to its good balance between fast random access and fast functional updates. Let's see how to use it in practice:

// A Vector of Integers
val vector: Vector[Int] = Vector(1, 2, 3, 5, 7, 11)

// Or an empty Vector
val emptyVector: Vector[String] = Vector()

The apply method of Vector simply gets an element by its index. The index must be at least zero and less than the length of the vector - otherwise, an IndexOutOfBoundsException is thrown:

val vector: Vector[String] = Vector("one", "two", "three")

vector(0) // returns "one"
vector(3) // throws IndexOutOfBoundsException

A Vector offers a wide range of operations - the full reference can be found in the Scala API documentation.
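A short sketch of the functional updates mentioned earlier, using made-up values. Each operation returns a new Vector and leaves the original untouched; lift is shown as one way to avoid the exception on an invalid index.

```scala
val primes: Vector[Int] = Vector(2, 3, 5, 7)

val appended = primes :+ 11          // new Vector with 11 at the end
val prepended = 1 +: primes          // new Vector with 1 at the front
val replaced = primes.updated(0, 0)  // new Vector with index 0 set to 0

// lift turns index access into an Option instead of throwing
val inRange = primes.lift(1)
val outOfRange = primes.lift(10)
```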

Where to use Vector?

  • Need fast random access and updates
  • Need fast append/prepend/tail operations
  • Must be traversable
  • Ordering matters

List

The last collection is the simplest, and therefore popular for simple use cases - List. It is implemented as an immutable linked list and represents an ordered collection. Let's see how to use a List:

// List of Strings
val list: List[String] = List("blue", "red", "yellow")

// Empty List
val emptyList: List[Int] = List()

As mentioned above, a List is implemented as a linked list. This means that we can build a list using the prepend operator :: and Nil, which represents the empty list. This can be done like this:

// Same as example from above
val list: List[String] = "blue" :: ("red" :: ("yellow" :: Nil))

// And an empty List
val emptyList = Nil

A List offers a wide range of operations - the full reference can be found in the Scala API documentation.
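Because a List is built from :: and Nil, the same two shapes can be used in pattern matching to take it apart again. The sketch below uses a made-up helper, countElements, purely for illustration (in real code you would call the built-in length method).

```scala
val colours: List[String] = List("blue", "red", "yellow")

// Prepending creates a new head cell that points at the existing list
val withGreen = "green" :: colours

// Walks the list recursively using the Nil and head :: tail patterns;
// `countElements` is a hypothetical helper, not part of the standard library
def countElements(list: List[String]): Int = list match {
  case Nil       => 0
  case _ :: tail => 1 + countElements(tail)
}

val count = countElements(withGreen)
```

Prepending does not copy the existing list, which is why it is a constant-time operation.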

Where to use List?

  • Must be traversable
  • Need O(1) prepending and reading of head element
  • Can tolerate O(n) appending and reading of internal elements
  • Don't need random access

Summary

I hope you have found this post useful. If so, don’t hesitate to like or share this post. Additionally, you can follow me on my social media if you fancy so 🙂
