Basic Functional Programming in Scala
A brief introduction to functional programming in Scala.
Main programming paradigms:
Imperative programming
Imperative programming is about:
- Modifying mutable variables
- Using assignment
- Control structures such as if-then-else, loops, return
The most common informal way to understand imperative programs is as instruction sequences for a von Neumann computer
- There is a strong correspondence between
- Mutable variables \(\approx\) memory cells
- Variable dereference \(\approx\) load instructions
- Variable assignment \(\approx\) store instructions
- Control structures \(\approx\) jumps
Problem:
- Scaling up. How can we avoid conceptualising programs word by word?
- We need other techniques for defining high-level abstractions
- Ideally: develop theories of collections, shapes, strings, ...
Theory
A theory consists of:
- One or more data types
- Operations on these types
- Laws that describe the relationships between values and operations
Normally, a theory does not describe mutations
For instance, the theory of polynomials defines the sum of two polynomials by laws such as:
(a * x + b) + (c * x + d) = (a + c) * x + (b + d)
But it does not define an operator to change a coefficient while keeping the polynomial the same
Consequently, if we want to implement high-level concepts following their mathematical theories, there is no place for mutation
- The theories do not admit it
- Mutation can destroy useful laws in the theories
- Therefore, we want to:
- Concentrate on defining theories for operators expressed as functions
- Avoid mutations
- Have powerful ways to abstract and compose functions
Functional programming
- In a restricted sense, functional programming means programming without mutable variables, assignments, loops, and other imperative control structures
- In a wider sense, functional programming means focusing on the functions
- In particular, functions can be values that are produced, consumed,
and composed
- They can be defined anywhere, including inside other functions
- Like any other value, they can be passed as parameters to functions and returned as results
- As for other values, there exists a set of operators to compose functions
- Some functional programming languages
- In the restricted sense:
- Pure Lisp, XSLT, XPath, XQuery, FP
- Haskell (without I/O Monad or UnsafePerformIO)
- In the wider sense:
- Lisp, Scheme, Racket, Clojure
- SML, OCaml, F#
- Haskell
- Scala
- Smalltalk, Ruby
Logic programming
Elements of Programming
Every non-trivial programming language provides:
- Primitive expressions representing the simplest elements
- Ways to combine expressions
- Ways to abstract expressions, which introduce a name for an expression by which it can then be referred to
The substitution model
- The idea underlying this model is that all evaluation does is to reduce an expression to a value
- It can be applied to all expressions, as long as they have no side effects
- The substitution model is formalized in the \(\lambda\)-calculus, which gives a foundation for functional programming
Call-by-name and call-by-value evaluation strategies in Scala
Both evaluation strategies reduce an expression to the same final value as long as
- The reduced expression consists of pure functions
- Both evaluations terminate
Comparison
- Call-by-value has the advantage that it evaluates every function argument only once
- Call-by-name has the advantage that a function argument is not evaluated if the corresponding parameter is unused in the evaluation of the function body
- If CBV evaluation of an expression terminates, then CBN evaluation of the same expression terminates too.
- The other direction is not true
Scala normally uses call-by-value. But if the type of a function
parameter starts with =>
it uses call-by-name.
Example:

```scala
def constOne(x: Int, y: => Int) = 1
```

Then

```scala
constOne(1 + 2, infiniteLoop)
```

terminates, because the call-by-name parameter y is never evaluated.
Value definitions in Scala
The def form of value definition is "by-name": its right-hand side is evaluated on each use.
The val form of value definition is "by-value": its right-hand side is evaluated at the point of the definition itself. Afterwards, the name refers to the value.
So, given

```scala
def loop: Boolean = loop
```

a definition def x = loop is fine, but val x = loop leads to an infinite loop.
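The difference becomes observable with a side effect. A minimal sketch, assuming a hypothetical counter (not part of the notes):

```scala
object Eval:
  var count = 0
  def byName: Int = { count += 1; 42 }  // right-hand side runs on every use
  val byValue: Int = { count += 1; 42 } // right-hand side runs once, here
```

After byValue is evaluated once during initialization, every access to byName re-runs its body, so count keeps growing.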
Blocks in Scala
A block is delimited by braces { ... }:

```scala
{
  val x = f(3)
  x * x
}
```

It contains a sequence of definitions or expressions
The last element of a block is an expression that defines its value
This return expression can be preceded by auxiliary definitions
Blocks are themselves expressions. A block may appear everywhere an expression can
The definitions inside a block are only visible from within the block
The definitions inside a block shadow definitions of the same name outside the block
Semicolons in Scala
Semicolons at the end of lines are in most cases optional, but if there is more than one statement on a line, they need to be separated by semicolons
You could write a multi-line expression in parentheses or write the operator at the end of the first line. This tells the Scala compiler that the expression is not yet finished.
```scala
someLongExpression +
someOtherExpression

(someLongExpression
+ someOtherExpression)
```
Rewriting rule
Consider the following situation:

```scala
def f(x1, ..., xn) = B; ... f(v1, ..., vn)
```

This is rewritten to:

```scala
def f(x1, ..., xn) = B; ... [v1/x1, ..., vn/xn]B
```

Here, [v1/x1, ..., vn/xn]B means the expression B in which all occurrences of xi have been replaced by vi. [v1/x1, ..., vn/xn] is called a substitution.
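A worked reduction sketch in the substitution model, assuming hypothetical square and sumOfSquares definitions:

```scala
def square(x: Double): Double = x * x
def sumOfSquares(x: Double, y: Double): Double = square(x) + square(y)

// sumOfSquares(3, 4)
// → [3/x, 4/y] (square(x) + square(y))   i.e. square(3) + square(4)
// → 3 * 3 + square(4)
// → 9 + square(4)
// → 9 + 4 * 4
// → 9 + 16
// → 25
```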
Tail recursion
Implementation consideration: if a function calls itself as its last action, the function's stack frame can be reused. This is called tail recursion. Tail-recursive functions are iterative processes.
In general, if the last action of a function consists of calling a function (which may be the same), one stack frame would be sufficient for both functions. Such calls are called tail-calls.
In Scala, only directly recursive calls to the current function are optimized
One can require that a function is tail-recursive using a @tailrec annotation:

```scala
@tailrec
def gcd(a: Int, b: Int): Int = ...
```

If the annotation is given and the implementation of gcd were not tail-recursive, an error would be issued
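A complete tail-recursive gcd could look as follows (a sketch of Euclid's algorithm; the elided body in the notes may differ in detail):

```scala
import scala.annotation.tailrec

@tailrec
def gcd(a: Int, b: Int): Int =
  if b == 0 then a else gcd(b, a % b)  // the recursive call is the last action
```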
Types and Pattern Matching
Decomposition
Suppose you want to write a small interpreter for arithmetic expressions. To keep it simple, let's restrict ourselves to numbers and additions. Expressions can be represented as a class hierarchy, with a base trait Expr and two subclasses, Number and Sum. To treat an expression, it's necessary to know the expression's shape and its components. This brings us to the following implementation:

```scala
trait Expr:
```
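The elided implementation could be sketched as follows; the exact method names (isNumber, leftOp, ...) are assumptions:

```scala
trait Expr:
  def isNumber: Boolean
  def isSum: Boolean
  def numValue: Int
  def leftOp: Expr
  def rightOp: Expr

class Number(n: Int) extends Expr:
  def isNumber = true
  def isSum = false
  def numValue = n
  def leftOp = throw Error("Number.leftOp")
  def rightOp = throw Error("Number.rightOp")

class Sum(e1: Expr, e2: Expr) extends Expr:
  def isNumber = false
  def isSum = true
  def numValue = throw Error("Sum.numValue")
  def leftOp = e1
  def rightOp = e2
```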
Problems
- Writing all these classification and accessor functions quickly becomes tedious
- There is no static guarantee you use the right accessor functions. You might hit an Error case if you are not careful.
Non-solution: Type tests and type casts
Scala lets you do these using methods defined in class Any:

```scala
def isInstanceOf[T]: Boolean
def asInstanceOf[T]: T
```

These correspond to Java's type tests and casts:

```java
x instanceof T;
(T) x;
```

But their use in Scala is discouraged, because there are better alternatives.
Solution 1: Object-oriented decomposition
For example, suppose that all you want to do is evaluate expressions:

```scala
trait Expr:
  def eval: Int

class Number(n: Int) extends Expr:
  def eval: Int = n

class Sum(e1: Expr, e2: Expr) extends Expr:
  def eval: Int = e1.eval + e2.eval
```

But what happens if you'd like to display expressions now? You have to define new methods in all the subclasses.
Assessment of OO decomposition
- OO decomposition mixes data with operations on the data.
- This can be the right thing if there's a need for encapsulation and data abstraction.
- On the other hand, it increases complexity and adds new dependencies to classes.
- It makes it easy to add new kinds of data but hard to add new kinds of operations.
- Thus, complexity arises from mixing several things together.
Limitations of OO decomposition
OO decomposition only works well if operations are on a single object.
What if you want to simplify expressions, say using the rule:

```
a * b + a * c  ->  a * (b + c)
```
Problem: This is a non-local simplification. It cannot be encapsulated in the method of a single object.
You are back to square one; you need test and accessor methods for all the different subclasses.
Solution 2: Functional decomposition with Pattern matching
Observation: the sole purpose of test and accessor functions is to reverse the construction process:
- Which subclass was used?
- What were the arguments of the constructor?
This situation is so common that many functional languages, Scala included, automate it.
Case classes
A case class definition is similar to a normal class definition, except that it is preceded by the modifier case. For example:
```scala
trait Expr
```
These classes are now empty. So how can we access the members?
Pattern matching
Pattern matching is a generalization of switch from C/Java to class hierarchies:

```scala
def eval(e: Expr): Int = e match
```
Rules:

- match is preceded by a selector expression and is followed by a sequence of cases, pat => expr.
- Each case associates an expression expr with a pattern pat.
- A MatchError exception is thrown if no pattern matches the value of the selector.
- The compiler creates a .field_i accessor for each field_i of the case class.
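Putting the pieces together, a hedged, self-contained sketch of the case classes and the pattern-matching eval:

```scala
trait Expr
case class Number(n: Int) extends Expr
case class Sum(e1: Expr, e2: Expr) extends Expr

def eval(e: Expr): Int = e match
  case Number(n)   => n                    // which subclass, and its argument
  case Sum(e1, e2) => eval(e1) + eval(e2)  // recurse on both operands
```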
Forms of patterns
Patterns are constructed from:

Constructors, e.g. Number, Sum:

```scala
def isNumberMessage(e: Expr): String = e match
  case Number(n)   => "This is a number"
  case Sum(e1, e2) => "This is a sum"
```

Variables, e.g. e1, e2:

```scala
def isNumberMessage(e: Expr): String = e match
  case Number(n) => "This is a number"
  case v         => "This is not a number"
```

- Variable patterns are often used as fallbacks. In the example above, v is a variable pattern.
- Variables always begin with a lowercase letter.
- The same variable name can only appear once in a pattern. So, Sum(x, x) is not a legal pattern.

Wildcard patterns _ and constants, e.g. 1, true:

```scala
def isNumber2(n: Int): Boolean = n match
  case 2 => true
  case _ => false
```

- Names of constants begin with a capital letter, with the exception of the reserved words null, true, false.

Type tests, e.g. n: Number:

```scala
def isNumber(e: Expr): Boolean = e match
  case n: Number => true
  case _         => false
```
Evaluating match expressions
An expression of the form

```scala
e match
  case p1 => e1
  ...
  case pn => en
```

matches the value of the selector e against the patterns p1, ..., pn in the order in which they are written. The whole match expression is rewritten to the right-hand side of the first case where the pattern matches the selector e. References to pattern variables are replaced by the corresponding parts in the selector.

A constructor pattern C(p1, ..., pn) matches all the values of type C (or a subtype) that have been constructed with arguments matching the patterns p1, ..., pn. A variable pattern x matches any value, and binds the name of the variable to this value. A constant pattern c matches values that are equal to c (in the sense of ==).
Pattern matching and methods
It is possible to define the evaluation function as a method of the base trait:

```scala
trait Expr:
```

Another example:

```scala
case class Var(name: String) extends Expr
```
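A sketch of the method variant (the body is an assumption reconstructed from the surrounding text):

```scala
trait Expr:
  def eval: Int = this match
    case Number(n)   => n
    case Sum(e1, e2) => e1.eval + e2.eval

case class Number(n: Int) extends Expr
case class Sum(e1: Expr, e2: Expr) extends Expr
```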
Lists
The list is a fundamental data structure in functional programming. A list having x1, ..., xn as elements is written List(x1, ..., xn).
There are two important differences between lists and arrays
- Lists are immutable - the elements of a list cannot be changed
- Lists are recursive, while arrays are flat.
Constructors of lists
All lists are constructed from:

- The empty list Nil
- The construction operation :: (pronounced "cons")

```scala
val nums1 = List(1, 2, 3, 4)
val nums2 = 1 :: (2 :: (3 :: (4 :: Nil)))
val empty = Nil
```
Right associativity
Convention: operators ending in : associate to the right. So A :: B :: C is interpreted as A :: (B :: C). We can thus omit the parentheses in the definition above.
Operations on Lists
All operations on lists can be expressed in terms of the following three:

- head: the first element of the list
- tail: the list composed of all the elements except the first
- isEmpty: true if the list is empty, false otherwise

These operations are defined as methods of objects of type List. For example:

```scala
nums.head == 1
```
List patterns
It is possible to decompose lists with pattern matching:

- Nil: the Nil constant
- p :: ps: a pattern that matches a list with a head matching p and a tail matching ps
- List(p1, ..., pn): same as p1 :: ... :: pn :: Nil
Sorting lists
Suppose we want to sort a list of numbers in ascending order. The code below describes insertion sort:

```scala
def isort(xs: List[Int]): List[Int] = xs match
```
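The elided insertion sort could be completed as follows (a sketch; insert is the standard helper assumed by the algorithm):

```scala
def isort(xs: List[Int]): List[Int] = xs match
  case List()  => List()
  case y :: ys => insert(y, isort(ys))

// insert x at the right place in the already sorted list xs
def insert(x: Int, xs: List[Int]): List[Int] = xs match
  case List()  => List(x)
  case y :: ys => if x <= y then x :: xs else y :: insert(x, ys)
```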
Enums
Sometimes we just need to compose and decompose pure data without any associated functions. Case classes and pattern matching work well for this task.
A case class hierarchy
```scala
trait Expr
```
This time we have put all case classes in the Expr companion object, in order not to pollute the global namespace. So it's Expr.Number(1) instead of Number(1), for example. But one can still "pull out" all the cases using an import: import Expr.*.
Pure data definitions like these are called algebraic data types, or ADTs for short. They are very common in functional programming. To make them even more convenient, Scala offers some special syntax.
Enums for ADTs
An enum enumerates all the cases of an ADT and nothing else. For example:
```scala
enum Expr:
```
This enum is equivalent to the case class hierarchy above but is shorter.
Pattern matching on ADTs
```scala
def show(e: Expr): String = e match
```
Simple enums
Cases of an enum can also be simple values, without any parameters. For example:

```scala
enum Color:
  case Red
  case Green
  case Blue
```

or shorter:

```scala
enum Color:
  case Red, Green, Blue
```

Pattern matching on simple enums
For pattern matching, simple cases count as constants:

```scala
enum DayOfWeek:
```
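A hedged sketch of matching on a simple enum (the DayOfWeek cases and the isWeekend helper are assumptions):

```scala
enum DayOfWeek:
  case Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday

import DayOfWeek.*

// simple cases start with a capital letter, so they match as constants
def isWeekend(day: DayOfWeek): Boolean = day match
  case Saturday | Sunday => true
  case _                 => false
```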
More on enums
Enumerations can take parameters and can define methods. For example:

```scala
enum Direction(val dx: Int, val dy: Int):
```
Notes:

- Enumeration cases that pass parameters have to use an explicit extends clause.
- The expression e.ordinal gives the ordinal value of the enum case e. Cases start with zero and are numbered consecutively.
- values is an immutable array in the companion object of an enum that contains all enum values.
- Only simple cases have ordinal numbers and show up in values; parameterized cases do not.
- Enumerations are shorthands for classes and objects.
Domain modeling
ADTs and enums are particularly useful for domain-modelling tasks where one needs to define a large number of data types without attaching operations. For example, modelling payment methods:

```scala
enum PaymentMethod:
```
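A sketch of what such a definition could look like; the case names and fields are assumptions:

```scala
enum PaymentMethod:
  case CreditCard(kind: String, holder: String, number: Long)  // parameterized case
  case PayPal(email: String)
  case Cash                                                    // simple case
```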
Subtyping and generics
Polymorphism
Two principal forms of polymorphism
- Subtyping
- Generics
Type bounds
Consider the method assertAllPos, which

- takes an IntSet
- returns the IntSet itself if all its elements are positive
- throws an exception otherwise

What would be the best type you can give to assertAllPos? Maybe:

```scala
def assertAllPos(s: IntSet): IntSet
```
In most situations this is fine, but can one be more precise? One might want to express that assertAllPos takes empty sets to empty sets and non-empty sets to non-empty sets. A way to express this is:

```scala
def assertAllPos[S <: IntSet](r: S): S = ...
```
Here, <: IntSet is an upper bound of the type parameter S. It means that S can be instantiated only to types that conform to IntSet.

Generally, the notation:

- S <: T means S is a subtype of T
- S >: T means S is a supertype of T
Lower bounds
You can also use a lower bound for a type variable. For example,

```scala
[S >: NonEmpty]
```

introduces a type parameter S that can range only over supertypes of NonEmpty. So S could be one of NonEmpty, IntSet, AnyRef, or Any.
Mixed bounds
Finally, it is possible to mix a lower bound with an upper bound. For instance:

```scala
[S >: NonEmpty <: IntSet]
```

would restrict S to any type on the interval between NonEmpty and IntSet.
Covariance
There's another interaction between subtyping and type parameters we need to consider. Given:

```scala
NonEmpty <: IntSet
```

is the following true?

```scala
List[NonEmpty] <: List[IntSet]
```

Intuitively, this makes sense: a list of non-empty sets is a special case of a list of arbitrary sets. We call types for which this relationship holds covariant, because their subtyping relationship varies with the type parameter.
The Liskov substitution principle
The following principle, stated by Barbara Liskov, tells us when a type can be a subtype of another.
```
If A <: B, then everything one can do with a value of type B,
one should also be able to do with a value of type A.
```
Variance
You have seen that some types should be covariant whereas others should not. Roughly speaking, a type that accepts mutations of its elements should not be covariant. But immutable types can be covariant, if some conditions on methods are met.
Definition of variance
Say C[T] is a parameterized type and A, B are types such that A <: B. In general, there are three possible relationships between C[A] and C[B]:

- C[A] <: C[B] --> C is covariant
- C[A] >: C[B] --> C is contravariant
- Neither C[A] nor C[B] is a subtype of the other --> C is nonvariant

Scala lets you declare the variance of a type by annotating the type parameter:

- class C[+A] {...} --> C is covariant
- class C[-A] {...} --> C is contravariant
- class C[A] {...} --> C is nonvariant
Typing rules for functions
Generally, we have the following rule for subtyping between function types: if A2 <: A1 and B1 <: B2, then

```scala
A1 => B1 <: A2 => B2
```
So functions are contravariant in their argument types and covariant in their result type.
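A small illustration, assuming hypothetical Animal and Dog classes:

```scala
class Animal
class Dog extends Animal

// Since Dog <: Animal, the rule gives: Animal => Dog  <:  Dog => Animal.
val f: Animal => Dog = (a: Animal) => new Dog
val g: Dog => Animal = f  // the compiler accepts this assignment
```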
Variance checks
If we turn Array into a class, and update into a method, it would look like this:

```scala
class Array[+T]:
```
The problematic combination is:

- the covariant type parameter T
- which appears in the parameter position of the method update

The Scala compiler will check that there are no problematic combinations when compiling a class with variance annotations. Roughly:

- Covariant type parameters can only appear in method results
- Contravariant type parameters can only appear in method parameters
- Invariant type parameters can appear anywhere

The precise rules are a bit more involved; fortunately, the Scala compiler checks them for us.
Variance and lists
One shortcoming of the previous implementation of lists was that Nil had to be a class, whereas we would prefer it to be an object (after all, there is only one empty list). We can change this by making List covariant:

```scala
trait List[+T]
```
Idealized lists
```scala
trait List[+T]:
```

The need for a lower bound was essentially to decouple the new parameter of the class and the parameter of the newly created object. Using an extension method such as :: above sidesteps the problem and is often simpler.
Lists
Lists operations
Sublists and element access
- xs.length: the number of elements of xs.
- xs.last: the list's last element; exception if xs is empty.
- xs.init: a list consisting of all elements of xs except the last one; exception if xs is empty.
- xs.drop(n): the rest of the collection after taking n elements.
- xs(n) or xs.apply(n): the element of xs at index n.
Creating new lists
- xs ++ ys: the list consisting of all elements of xs followed by all elements of ys.
- xs.reverse: the list containing the elements of xs in reversed order.
- xs.updated(n, x): the list containing the same elements as xs, except at index n where it contains x.
Finding elements
- xs.indexOf(x): the index of the first element in xs equal to x, or -1 if x does not appear in xs.
- xs.contains(x): same as xs.indexOf(x) >= 0.
Implementation of last
The complexity of head is constant time. What is the complexity of last? To find out, let's write a possible implementation of last as a stand-alone function:

```scala
def last[T](xs: List[T]): T = xs match
```

So, last takes steps proportional to the length of the list xs.
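The elided body could be completed as (a sketch):

```scala
def last[T](xs: List[T]): T = xs match
  case List()  => throw Error("last of empty list")
  case List(x) => x          // one element left: done
  case y :: ys => last(ys)   // otherwise recurse on the tail
```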
Implementation of concatenation
Let's try to write an extension method for ++:

```scala
extension [T](xs: List[T])
```

The complexity of concatenation is O(xs.length)
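A sketch of such an extension method, written here as +++ so it does not clash with the standard ++:

```scala
extension [T](xs: List[T])
  def +++(ys: List[T]): List[T] = xs match
    case Nil      => ys
    case x :: xs1 => x :: (xs1 +++ ys)  // one cons per element of xs
```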
Implementation of reverse

```scala
extension [T](xs: List[T])
```

The complexity of reverse is O(xs.length * xs.length). It can be done better.
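A sketch of the quadratic version, written as reversed to avoid clashing with the standard reverse (each ++ step is linear, giving the quadratic bound):

```scala
extension [T](xs: List[T])
  def reversed: List[T] = xs match
    case Nil     => Nil
    case y :: ys => ys.reversed ++ List(y)  // O(n) concatenation per element
```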
Implementation of flatten
Flattening a list structure means collecting all atoms of arbitrarily nested lists into a single list:

```scala
def flatten(xs: Any): List[Any] = xs match
```
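The elided body could be sketched as:

```scala
def flatten(xs: Any): List[Any] = xs match
  case Nil     => Nil
  case y :: ys => flatten(y) ++ flatten(ys)  // flatten head and tail
  case _       => xs :: Nil                  // an atom becomes a one-element list
```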
Tuples and generic methods
Pairs and tuples
The pair consisting of x and y is written as (x, y) in Scala. Pairs can also be used as patterns:

```scala
val (label, value) = pair
```

This works analogously for tuples with more than two elements. One could also have written:

```scala
val label = pair._1
val value = pair._2
```
Translation of tuples
For small \(n\), up to 22 (there is also a TupleXXL class that handles tuples larger than that limit), the tuple type \((T_1, \dots, T_n)\) is an abbreviation of the parameterized type \(\text{scala.Tuple}n[T_1,\dots, T_n]\). A tuple expression \((e_1,\dots, e_n)\) is equivalent to the function application \(\text{scala.Tuple}n(e_1, \dots, e_n)\). A tuple pattern \((p_1,\dots, p_n)\) is equivalent to the constructor pattern \(\text{scala.Tuple}n(p_1, \dots, p_n)\).
Merge sort
```scala
def msort[T](xs: List[T])(lt: (T, T) => Boolean) =
```
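The elided body could be completed as follows; the merge helper is the standard one, though its exact shape here is an assumption:

```scala
def msort[T](xs: List[T])(lt: (T, T) => Boolean): List[T] =
  val n = xs.length / 2
  if n == 0 then xs  // lists of length 0 or 1 are already sorted
  else
    def merge(xs: List[T], ys: List[T]): List[T] = (xs, ys) match
      case (Nil, ys) => ys
      case (xs, Nil) => xs
      case (x :: xs1, y :: ys1) =>
        if lt(x, y) then x :: merge(xs1, ys)
        else y :: merge(xs, ys1)
    val (fst, snd) = xs.splitAt(n)
    merge(msort(fst)(lt), msort(snd)(lt))
```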
Higher-order list functions
Mapping
A simple way to define map is as follows:

```scala
extension [T](xs: List[T])
```

In fact, the actual definition of map is a bit more complicated, because it is tail-recursive, and also because it works for arbitrary collections, not just lists. Using map, we can scale a list as follows:

```scala
def scaleList(xs: List[Double], factor: Double) =
```
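A sketch of both definitions, with map written as mapped to avoid clashing with the standard method:

```scala
extension [T](xs: List[T])
  def mapped[U](f: T => U): List[U] = xs match
    case Nil     => Nil
    case y :: ys => f(y) :: ys.mapped(f)

def scaleList(xs: List[Double], factor: Double): List[Double] =
  xs.mapped(x => x * factor)
```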
Filtering
Selection of all elements satisfying a given condition is generalized by the method filter of the List class:

```scala
extension [T](xs: List[T])
```

Besides filter, there are also the following methods that extract sublists based on a predicate:

- xs.filterNot(p): same as xs.filter(x => !p(x)).
- xs.partition(p): same as (xs.filter(p), xs.filterNot(p)), but computed in a single traversal of the list xs.
- xs.takeWhile(p): the longest prefix of list xs consisting of elements that all satisfy the predicate p.
- xs.dropWhile(p): the remainder of the list xs after any leading elements satisfying p have been removed.
- xs.span(p): same as (xs.takeWhile(p), xs.dropWhile(p)), but computed in a single traversal of the list xs.
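A few of these in action:

```scala
val xs = List(1, 2, 3, 4, 5, 6)

// partition splits by the predicate, keeping the original order
assert(xs.partition(_ % 2 == 0) == (List(2, 4, 6), List(1, 3, 5)))

// span cuts at the first element that fails the predicate
assert(xs.span(_ < 4) == (List(1, 2, 3), List(4, 5, 6)))
```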
Reduction of lists
Another common operation on lists is to combine the elements of a list using a given operator. For example:

```scala
sum(List(x1, ..., xn))
```

The pattern can be abstracted out using the generic method reduceLeft, which inserts a given binary operator between adjacent elements of a list:

```scala
List(x1, ..., xn).reduceLeft(op) = x1.op(x2). ... .op(xn)
```

Using reduceLeft, we can simplify:

```scala
def sum(xs: List[Int]) = (0 :: xs).reduceLeft((x, y) => x + y)
```

Instead of ((x, y) => x * y), one can also write, shorter: (_ * _). Every _ represents a new parameter, going from left to right. The parameters are defined at the next outer pair of parentheses (or the whole expression if there are no enclosing parentheses). So, sum and product can also be expressed like:

```scala
def sum(xs: List[Int]) = (0 :: xs).reduceLeft(_ + _)
```
FoldLeft
The function reduceLeft is defined in terms of a more general function, foldLeft. foldLeft is like reduceLeft but takes an accumulator, z, as an additional parameter, which is returned when foldLeft is called on an empty list:

```scala
List(x1, ..., xn).foldLeft(z)(op) = z.op(x1).op(x2). ... .op(xn)
```

So, sum and product can also be defined as follows:

```scala
def sum(xs: List[Int]) = xs.foldLeft(0)(_ + _)
```
Implementations of reduceLeft and foldLeft
They can be implemented in class List as follows:

```scala
abstract class List[T]:
```
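Stand-alone sketches of both (written as functions rather than methods of List, so they run against the standard list type):

```scala
def foldLeft[T, U](xs: List[T])(z: U)(op: (U, T) => U): U = xs match
  case Nil      => z
  case x :: xs1 => foldLeft(xs1)(op(z, x))(op)  // fold the accumulator forward

def reduceLeft[T](xs: List[T])(op: (T, T) => T): T = xs match
  case Nil      => throw IllegalArgumentException("Nil.reduceLeft")
  case x :: xs1 => foldLeft(xs1)(x)(op)  // the first element seeds the accumulator
```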
FoldRight and ReduceRight
Applications of foldLeft and reduceLeft unfold on trees that lean to the left. They have two dual functions, foldRight and reduceRight, which produce trees that lean to the right. Their implementations:

```scala
def reduceRight(op: (T, T) => T): T = this match
```
Difference between FoldLeft and FoldRight
For operators that are associative and commutative, foldLeft and foldRight are equivalent (even though there may be a difference in efficiency). But sometimes, only one of the two operators is appropriate. For example:

```scala
def reverse[T](xs: List[T]): List[T] =
  xs.foldLeft(List[T]())((acc, x) => x :: acc)
```

Remark: the type parameter in List[T]() is necessary for type inference.
Reasoning about lists
Laws of concat
Recall the concatenation operation ++ on lists. We would like to verify that concatenation is associative, and that it admits the empty list Nil as a neutral element to the left and to the right. But how can we prove properties like these? We can do it by structural induction on lists.
Referential transparency
A proof can freely apply reduction steps as equalities to some part of a term. That works because pure functional programs don't have side effects; so that a term is equivalent to the term to which it reduces. This principle is called referential transparency.
Structural induction
The principle of structural induction is analogous to natural induction:
To prove a property P(xs) for all lists xs:

- show that P(Nil) holds (base case)
- for a list xs and some element x, show the induction step: if P(xs) holds, then P(x :: xs) also holds
Example
Let's show that, for lists xs, ys, zs:

```scala
(xs ++ ys) ++ zs = xs ++ (ys ++ zs)
```

To do this, use structural induction on xs. From the previous implementation of ++:

```scala
extension [T](xs: List[T]):
```

distill two defining clauses of ++:

```scala
Nil ++ ys = ys
(x :: xs1) ++ ys = x :: (xs1 ++ ys)
```
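The proof itself can then be carried out by equational rewriting; a sketch using only the two defining clauses of ++ and the induction hypothesis:

```
Base case xs = Nil:
  (Nil ++ ys) ++ zs
  = ys ++ zs                     -- first clause
  = Nil ++ (ys ++ zs)            -- first clause, read right to left

Induction step xs = x :: xs1, assuming (xs1 ++ ys) ++ zs = xs1 ++ (ys ++ zs):
  ((x :: xs1) ++ ys) ++ zs
  = (x :: (xs1 ++ ys)) ++ zs     -- second clause
  = x :: ((xs1 ++ ys) ++ zs)     -- second clause
  = x :: (xs1 ++ (ys ++ zs))     -- induction hypothesis
  = (x :: xs1) ++ (ys ++ zs)     -- second clause, read right to left
```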
Other Collections
Vectors
Operations on vectors
Vectors are created analogously to lists:

```scala
val nums = Vector(1, 2, 3, -88)
```

They support the same operations as lists, with the exception of ::. Instead of x :: xs, there is:

- x +: xs: create a new vector with leading element x, followed by all elements of xs
- xs :+ x: create a new vector with trailing element x, preceded by all elements of xs

Note that the : always points to the sequence.
Collection hierarchy
A common base class of List and Vector is Seq, the class of all sequences. Seq itself is a subclass of Iterable.
Arrays and strings
Arrays and strings support the same operations as Seq and can implicitly be converted to sequences where needed. (They cannot be subclasses of Seq because they come from Java.)

```scala
val xs: Array[Int] = Array(1, 2, 3)
```
Ranges
Another simple kind of sequence is the range. It represents a sequence of evenly spaced integers. There are three operators: to (inclusive), until (exclusive), and by (to determine the step value):

```scala
val r: Range = 1 until 5 // 1, 2, 3, 4
```
A range is represented as a single object with three fields: lower bound, upper bound, step value.
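For instance:

```scala
val s = 1 to 5       // 1, 2, 3, 4, 5
val t = 1 to 10 by 3 // 1, 4, 7, 10
val u = 6 to 1 by -2 // 6, 4, 2
```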
Some more sequence operations:

- xs.exists(p): true if there is an element x of xs such that p(x) holds, false otherwise.
- xs.forall(p): true if p(x) holds for all elements x of xs, false otherwise.
- xs.zip(ys): a sequence of pairs drawn from corresponding elements of sequences xs and ys.
- xs.unzip: splits a sequence of pairs xs into two sequences consisting of the first, respectively second, halves of all pairs.
- xs.flatMap(f): applies collection-valued function f to all elements of xs and concatenates the results.
- xs.sum: the sum of all elements of this numeric collection.
- xs.product: the product of all elements of this numeric collection.
- xs.max: the maximum of all elements of this collection (an Ordering must exist).
- xs.min: the minimum of all elements of this collection.
Example 1:
To list all combinations of numbers x and y, where x is drawn from 1..M and y is drawn from 1..N:

```scala
(1 to M).flatMap(x => (1 to N).map(y => (x, y)))
```

This returns the default Seq implementation, a Vector.
Example 2
To compute the scalar product of two vectors:

```scala
def scalarProduct(xs: Vector[Double], ys: Vector[Double]): Double =
```

Note that there is some automatic decomposition going on here: each pair of elements from xs and ys is split into its halves, which are then passed as the x and y parameters to the lambda.
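The elided body could be completed with zip, map, and sum:

```scala
def scalarProduct(xs: Vector[Double], ys: Vector[Double]): Double =
  xs.zip(ys).map((x, y) => x * y).sum  // each pair is untupled into x and y
```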
Combinatorial search and for-expressions
Handling nested sequences
We can extend the usage of higher-order functions on sequences to many calculations that are usually expressed using nested loops. For example: given a positive integer n, find all pairs of positive integers i and j, with 1 <= j < i < n, such that i + j is prime.
A natural way to do this is to:

- generate the sequence of all pairs of integers (i, j) with 1 <= j < i < n
- filter the pairs for which i + j is prime

```scala
(1 until n)
```

This works, but makes most people's heads hurt.
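Spelled out, assuming a simple (inefficient) isPrime helper, since none is defined in this excerpt:

```scala
def isPrime(n: Int): Boolean =
  n > 1 && (2 until n).forall(n % _ != 0)

def primePairs(n: Int) =
  (1 until n)
    .flatMap(i => (1 until i).map(j => (i, j)))  // all pairs with 1 <= j < i < n
    .filter((i, j) => isPrime(i + j))
```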
For-expressions
Let persons be a list of elements of class Person, with fields name and age:

```scala
case class Person(name: String, age: Int)
```

To obtain the names of persons over 20 years old, you can write:

```scala
for p <- persons if p.age > 20 yield p.name
```

which is equivalent to:

```scala
persons
  .filter(p => p.age > 20)
  .map(p => p.name)
```

The for-expression is similar to loops in imperative languages, except that it builds a list of the results of all iterations.
Syntax of For
A for-expression is of the form

```scala
for s yield e
```

where s is a sequence of generators and filters, and e is an expression whose value is returned by an iteration.

- A generator is of the form p <- e, where p is a pattern and e an expression whose value is a collection.
- A filter is of the form if f, where f is a boolean expression.
- The sequence must start with a generator.
- If there are several generators in the sequence, the last generators vary faster than the first.
Then the previous problem can be solved by a for-expression:

```scala
for
```
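A sketch of the for-expression version, again assuming an isPrime helper:

```scala
def isPrime(n: Int): Boolean =
  n > 1 && (2 until n).forall(n % _ != 0)

def primePairs(n: Int) =
  for
    i <- 1 until n
    j <- 1 until i      // the later generator varies faster
    if isPrime(i + j)
  yield (i, j)
```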
Sets
A set is written analogously to a sequence:

```scala
val fruit = Set("apple", "banana", "pear")
```

Most operations on sequences are also available on sets, for example map, filter, and nonEmpty.

The principal differences between sets and sequences are:

- Sets are unordered; the elements of a set do not have a predefined order in which they appear in the set.
- Sets do not have duplicate elements.
- The fundamental operation on sets is contains.
For example, the N-queens problem:

```scala
def queens(n: Int) =
```
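A hedged sketch of the elided solution, combining sets and a for-expression (the isSafe check is the standard one; its exact shape here is an assumption):

```scala
def queens(n: Int): Set[List[Int]] =
  // queens holds the columns of already placed queens, most recent first;
  // a new queen in column col is safe if it shares no column or diagonal
  def isSafe(col: Int, queens: List[Int]): Boolean =
    !queens.zipWithIndex.exists((c, delta) =>
      c == col || math.abs(c - col) == delta + 1)

  def placeQueens(k: Int): Set[List[Int]] =
    if k == 0 then Set(List())
    else
      for
        queens <- placeQueens(k - 1)
        col <- 0 until n
        if isSafe(col, queens)
      yield col :: queens

  placeQueens(n)
```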
Maps
A map of type Map[Key, Value] is a data structure that associates keys of type Key with values of type Value. For example:

```scala
val romanNumerals = Map("I" -> 1, "V" -> 5, "X" -> 10)
```
Maps are iterables.
Class Map[Key, Value] extends the collection type Iterable[(Key, Value)]. Therefore, maps support the same collection operations as other iterables do. Example:

```scala
val countryOfCapital = capitalOfCountry.map((x, y) => (y, x))
```

The syntax key -> value is just an alternative way to write the pair (key, value). (-> is implemented as an extension method in Predef.)
Maps are functions
Class Map[Key, Value] also extends the function type Key => Value, so maps can be used everywhere functions can. In particular, maps can be applied to key arguments:

```scala
capitalOfCountry("US")
```
Querying map
Applying a map to a non-existing key gives an error. To query a map without knowing beforehand whether it contains a given key, you can use the get operation:

```scala
capitalOfCountry.get("US") // Some("Washington")
```

The result of a get operation is an Option value.
Updating maps
Functional updates of a map are done with the + and ++ operations:

- m + (k -> v): the map that takes key k to value v and is otherwise equal to m
- m ++ kvs: the map m updated via + with all key/value pairs in kvs

These operations are purely functional.
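For example (the map contents are assumptions):

```scala
val capitalOfCountry = Map("US" -> "Washington", "Switzerland" -> "Bern")

val m1 = capitalOfCountry + ("Japan" -> "Tokyo")
val m2 = capitalOfCountry ++ List("Japan" -> "Tokyo", "France" -> "Paris")

// the original map is unchanged: the updates are purely functional
assert(capitalOfCountry.size == 2)
```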
Sorted and groupBy
Two useful operations known from SQL queries are groupBy and orderBy. orderBy on a collection can be expressed using sortWith and sorted:

```scala
val fruit = List("apple", "pear", "orange", "pineapple")
```

groupBy is available on Scala collections. It partitions a collection into a map of collections according to a discriminator function f:

```scala
fruit.groupBy(_.head) //> Map(p -> List(pear, pineapple),
//|                          a -> List(apple),
//|                          o -> List(orange))
```
Default values
So far, maps were partial functions: applying a map to a key in map(key) could lead to an exception if the key was not stored in the map. The operation withDefaultValue turns a map into a total function:

```scala
val cap1 = capitalOfCountry.withDefaultValue("<unknown>")
```
The Option type
The Option type is defined as:

```scala
trait Option[+A]
```

The expression map.get(key) returns:

- None if map does not contain the given key
- Some(x) if map associates the given key with the value x
Decomposing Option
Since options are defined as case classes, they can be decomposed using pattern matching:

```scala
def showCapital(country: String) = capitalOfCountry.get(country) match
```

Options also support quite a few operations of the other collections.
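A sketch of the elided match (the map contents and the fallback string are assumptions):

```scala
val capitalOfCountry = Map("US" -> "Washington", "Switzerland" -> "Bern")

def showCapital(country: String): String =
  capitalOfCountry.get(country) match
    case Some(capital) => capital
    case None          => "missing data"
```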
Benefits of Scala's immutable collections
- Easy to use: few steps to do the job
- Concise: one word replaces a whole loop
- Safe: type checker is really good at catching errors
- Fast: collection ops are tuned, and can be parallelized
- Universal: one vocabulary to work on all kinds of collections