Sets Contents

# Sets

Sets are Iterables that contain no duplicate elements. The operations on sets are summarized in the next table for general sets and the table after that for mutable sets. They fall into the following categories:

• Tests contains, apply, subsetOf. The contains method asks whether a set contains a given element. The apply method for a set is the same as contains, so set(elem) is the same as set contains elem. That means sets can also be used as test functions that return true for the elements they contain. For example

 val fruit = Set("apple", "orange", "peach", "banana") fruit: scala.collection.immutable.Set[java.lang.String] = Set(apple, orange, peach, banana) scala> fruit("peach") res0: Boolean = true scala> fruit("potato") res1: Boolean = false

• Additions + and ++, which add one or more elements to a set, yielding a new set.
• Removals -, --, which remove one or more elements from a set, yielding a new set.
• Set operations for union, intersection, and set difference. Each of these operations exists in two forms: alphabetic and symbolic. The alphabetic versions are intersect, union, and diff, whereas the symbolic versions are &, |, and &~. In fact, the ++ that Set inherits from Traversable can be seen as yet another alias of union or |, except that ++ takes a Traversable argument whereas union and | take sets.
 Operations in class Set What it is What it does Tests: xs contains x Tests whether x is an element of xs. xs(x) Same as xs contains x. xs subsetOf ys Tests whether xs is a subset of ys. Additions: xs + x The set containing all elements of xs as well as x. xs + (x, y, z) The set containing all elements of xs as well as the given additional elements. xs ++ ys The set containing all elements of xs as well as all elements of ys. Removals: xs - x The set containing all elements of xs except x. xs - (x, y, z) The set containing all elements of xs except the given elements. xs -- ys The set containing all elements of xs except the elements of ys. xs.empty An empty set of the same class as xs. Binary operations: xs & ys The set intersection of xs and ys. xs intersect ys Same as xs & ys. xs | ys The set union of xs and ys. xs union ys Same as xs | ys. xs &~ ys The set difference of xs and ys. xs diff ys Same as xs &~ ys.

Mutable sets offer in addition methods to add, remove, or update elements, which are summarized in this table.

 Operations in class mutable.Set What it is What it does Additions: xs += x Adds element x to set xs as a side effect and returns xs itself. xs += (x, y, z) Adds the given elements to set xs as a side effect and returns xs itself. xs ++= ys Adds all elements in ys to set xs as a side effect and returns xs itself. xs add x Adds element x to xs and returns true if x was not previously contained in the set, false if it was. Removals: xs -= x Removes element x from set xs as a side effect and returns xs itself. xs -= (x, y, z) Removes the given elements from set xs as a side effect and returns xs itself. xs --= ys Removes all element in ys from set xs as a side effect and returns xs itself. xs remove x Removes element x from xs and returns true if x was previously contained in the set, false if it was not. xs retain p Keeps only those elements in xs that satisfy predicate p. xs.clear() Removes all elements from xs. Update: xs(x) = b (or, written out, xs.update(x, b)). If boolean argument b is true, adds x to xs, otherwise removes x from xs. Cloning: xs.clone A new mutable set with the same elements as xs.

Just like an immutable set, a mutable set offers the + and ++ operations for element additions and the - and -- operations for element removals. But these are less often used for mutable sets since they involve copying the set. As a more efficient alternative, mutable sets offer the update methods += and -=. The operation s += elem adds elem to the set s as a side effect, and returns the mutated set as a result. Likewise, s -= elem removes elem from the set, and returns the mutated set as a result. Besides += and -= there are also the bulk operations ++= and --= which add or remove all elements of a traversable or an iterator.

The choice of the method names += and -= means that very similar code can work with either mutable or immutable sets. Consider first the following REPL dialogue which uses an immutable set s:

 scala> var s = Set(1, 2, 3) s: scala.collection.immutable.Set[Int] = Set(1, 2, 3) scala> s += 4 scala> s -= 2 scala> s res2: scala.collection.immutable.Set[Int] = Set(1, 3, 4)

We used += and -= on a var of type immutable.Set. A statement such as s += 4 is an abbreviation for s = s + 4. So this invokes the addition method + on the set s and then assigns the result back to the s variable. Consider now an analogous interaction with a mutable set.

 scala> val s = collection.mutable.Set(1, 2, 3) s: scala.collection.mutable.Set[Int] = Set(1, 2, 3) scala> s += 4 res3: s.type = Set(1, 4, 2, 3) scala> s -= 2 res4: s.type = Set(1, 4, 3)

The end effect is very similar to the previous interaction; we start with a Set(1, 2, 3) end end up with a Set(1, 3, 4). However, even though the statements look the same as before, they do something different. s += 4 now invokes the += method on the mutable set value s, changing the set in place. Likewise, s -= 2 now invokes the -= method in the same set.

Comparing the two interactions shows an important principle. You often can replace a mutable collection stored in a val by an immutable collection stored in a var, and vice versa. This works at least as long as there are no alias references to the collection through which one can observe whether it was updated in place or whether a new collection was created.

Mutable sets also provide add and remove as variants of += and -=. The difference is that add and remove return a Boolean result indicating whether the operation had an effect on the set.

The current default implementation of a mutable set uses a hashtable to store the set's elements. The default implementation of an immutable set uses a representation that adapts to the number of elements of the set. An empty set is represented by just a singleton object. Sets of sizes up to four are represented by a single object that stores all elements as fields. Beyond that size, immutable sets are implemented as hash tries.

A consequence of these representation choices is that, for sets of small sizes (say up to 4), immutable sets are usually more compact and also more efficient than mutable sets. So, if you expect the size of a set to be small, try making it immutable.

Two subtraits of sets are SortedSet and BitSet. These are explained on separate pages.

Next:

 Sets Contents