lists

To really understand the power of definitions by pattern matching, it is convenient to introduce more complex data structures. In this lecture we shall focus on list; in the next lecture we will deal with the general case of user-defined types.

A *list* is a finite or infinite sequence of elements of the same type.
The empty list is denoted by `[]`

.

# [];; - : 'a list = []

When a list is finite, we can enumerate its elements, separated by a semicolon,
within the square brackets `[…]`

.
Unlike sets, lists may contain duplicate elements.

The type of a list has the form `'a list`

, where `'a`

is the type of its elements.
All the elements of a list are required to have the same type, otherwise
the list will not be assigned any type.

# [1; 2; 3] - : int list = [1; 2; 3] # [1;2;2;1;3];; - : int list = [1; 2; 2; 1; 3] # ["alice"; "bob"; "carl"];; - : string list = ["alice"; "bob"; "carl"] # ["alice"; 1; 2];; Characters 10-11: ["alice"; 1; 2];; ^ This expression has type int but is here used with type string

Note that this is quite different from the way collections are dealt with in Java
(al least, before Java 1.5 introduced polymorphism).
In Java you can have a list of elements of any type (it suffices they subclass Object),
but you need to use downcasts (and possibly incur in runtime errors) to work with them.

The operator :: (usually spelled *cons*, for “constructor”)
allows to construct a list `hd::tl`

from an element `hd`

of type `'a`

and a list `tl`

of type `'a list`

.

When `l = hd::tl`

we say that `hd`

is the **head** of the list `l`

,
and `tl`

is the **tail** of the list `l`

.

Indeed, any list `[x1;x2;…;xn]`

can be seen just as an abbreviation for `x1::x2::…::xn::[]`

.

# 1::[];; - : int list = [1] # 1::2::3::[];; - : int list = [1; 2; 3] # "alice"::["bob";"carl"];; - : string list = ["alice"; "bob"; "carl"]

Since lists may contain elements of arbitrary type (it suffices that all the elements have the same type), we can construct lists the elements of which are other lists:

# [[1;2];[];[3;4;5]];; - : int list list = [[1; 2]; []; [3; 4; 5]] # []::[];; - : 'a list list = [[]] []::[[[]]];; - : 'a list list list = [[]; [[]]]

Recall that all the elements of a list must have the same type.

# "alice"::[1;2];; Characters 12-13: "alice" :: [1;2];; ^ This expression has type int but is here used with type string

Recall that the only legitimate use of x::y is when x has type `'a`

and y has type `'a list`

.

# [1;2]::3;; This expression has type int but is here used with type int list list

To append an element to the tail of a list, you must define a suitable function.

Watch out for commas. You must use semicolons (e.g. `[1;2;3]`

) and **not** commas (e.g. `[1,2,3]`

)
to separate the elements of a list.
Using commas is particularly dangerous, since it is not immediately forbidden by the compiler.
The effect is that you obtain a tuple of elements, instead of a list.
It might be time-consuming to detect this error in later phases of the development.

# [1,2,3];; - : (int * int * int) list = [(1, 2, 3)] # 0::[1,2,3];; Characters 4-9: 0::[1,2,3];; ^^^^^ This expression has type int * int * int but is here used with type int

The operator `::`

is also used to de-construct lists,
i.e. to explore their structure by pattern matching.
The empty list is matched by `[]`

.
A non-empty list is matched by the pattern `hd::tl`

,
where `hd`

stands for the head of the list (its first element),
while `tl`

stand for the tail.

As a first example, consider the function `length`

, that computes the length of a list.

# let rec length l = match l with [] -> 0 | x::xl -> 1 + length xl;; val length : 'a list -> int = <fun> # length [1;2;3;4;5];; - : int = 5

Note that the function `length`

above can be written even more concisely with the
help of some syntactic sugar:

let rec length = function [] -> 0 | x::xl -> 1 + length xl;; val length : 'a list -> int = <fun>

The following function searches a given value `x`

in an unordered list.

# let rec search x = function [] -> false | hd::tl -> hd=x || search x tl;; val search : 'a -> 'a list -> bool = <fun> # search 5 [2; 3; 5; 7; 11];; - : bool = true # search "elizabeth" ["alice"; "bob"; "carl"; "david"];; - : bool = false

The following function concatenates two lists. Notice that the same effect can be obtained through the built-in infix operator @.

# let rec append xl yl = match xl with [] -> yl | x::xl' -> x::(append xl' yl);; val append : 'a list -> 'a list -> 'a list = <fun> # append [1;2;3] [4;5];; - : int list = [1; 2; 3; 4; 5] # [1;2;3] @ [4;5];; - : int list = [1; 2; 3; 4; 5]

The function `drop`

drops the first `n`

elements of a list.
Note below that the patterns are matched in order of appearance
(thus, the last pattern in only applied when the list is not empty, and `n`

=0).

# let rec drop n = function [] -> [] | x::xl when n>0 -> drop (n-1) xl | l -> l;; val drop : int -> 'a list -> 'a list = <fun> # drop 2 [10;20;30;40;50];; - : int list = [30; 40; 50]

`count: 'a → 'a list → int`

. Counts the number of occurrences of an element in a list.`even: 'a list → bool`

. When fed with a list`l`

, evaluates to`true`

iff`l`

has an even number of elements.`rev: 'a list → 'a list`

. Reverts the elements of a list.`nth: int → 'a list → 'a`

. Takes the n-th element of a list (throws an exception if the list is too short).`take: int → 'a list → 'a list`

. Takes the first n elements of a list.`split : ('a * 'b) list → 'a list * 'b list`

. Splits a list of pairs into a pair of lists.`combine : 'a list → 'b list → ('a * 'b) list`

. Combines two lists (of equal length) into a list of pairs.

Let us consider the function `range`

, defined as follows:

# let rec range a b = if b<a then [] else a::(range (a+1) b);; val range : int -> int -> int list = <fun>

When applied to two integers `a`

and `b`

, `range a b`

produces a list containing all the integers in the range [a,b].
For instance:

# range 1 10;; - : int list = [1; 2; 3; 4; 5; 6; 7; 8; 9; 10]

Now, let's try to construct a list with one million of elements:

# range 1 1000000 Stack overflow during evaluation (looping recursion?).

What has gone wrong ? The function `range`

defined above is not tail recursive,
because the recursive call to `range`

is not the very last thing to happen before the return.
We have already shown a technique for transforming a non-tail recursive function
into a tail recursive one.
Let us try it:

# let range a b = let rec range2 a b acc = if a>b then acc else range2 (a+1) b (a::acc) in range2 a b [];; # range 1 10;; - : int list = [10; 9; 8; 7; 6; 5; 4; 3; 2; 1]

Unexpectedly, the resulting list is reverted. What happened ?
When accumulating in `acc`

the result, the element `a`

is put at the head of the list.
Let us observe the sequence of recursive calls, in the given range:

- range2 1 10 []
- range2 2 10 [1]
- range2 3 10 2::[1]
- range2 4 10 3::[2;1]
- range2 4 10 4::[3;2;1]
- …

In general, writing functions on lists in a tail recursive manner
is made a bit tricky because the *cons* operator can only
be used to attach an element at the head of a list.

Luckily, in this case there is a simple solution that allows for constructing the list in the correct order.

# let range a b = let rec range2 a b acc = if a>b then acc else range2 a (b-1) (b::acc) in range2 a b [];; # range 1 10;; - : int list = [1; 2; 3; 4; 5; 6; 7; 8; 9; 10]

Now, let us construct a one million elements list:

# let verylong = range 1 1000000;; - : int list = [1; 2; 3; 4; 5; 6; 7; 8; 9; 10; 11; 12; 13; 14; 15; ...]

Not all functions on lists can be made tail recursive so easily.
An example is the `append`

function.
It is easy to write a tail recursive version of `append`

which reverts
the first list:

# let rec rev_append l1 l2 = match l1 with [] -> l2 | x::l' -> rev_append l' (x::l2) ;; val rev_append : 'a list -> 'a list -> 'a list = <fun> # rev_append [1;2;3;4;5] [6;7];; - : int list = [5; 4; 3; 2; 1; 6; 7]

Note in passing that, when `l2`

is empty, this subsumes the function that reverts a list:

# let rev l = rev_append l [];; val rev : 'a list -> 'a list = <fun>

We can finally write our tail recursive append function.

# let append l1 l2 = rev_append (rev l1) l2;; val append : 'a list -> 'a list -> 'a list = <fun> # append [1;2;3;4;5] [6;7];; - : int list = [1; 2; 3; 4; 5; 6; 7]

Let us test is with a huge list:

# length (append longlist longlist);; - : int = 2000000

`length: 'a list → int`

.

We now introduce some higher-order combinators that are especially useful to manipulate lists. Indeed, many of the examples seen above can be reformulated using these combinators.

The function `map`

applies a given function `f`

to each element of a list.
Note the type: `map`

takes a function of type `'a → 'b`

and a list of elements
of type `'a`

, and produces a list of elements of type `'b`

.

# let rec map f = function [] -> [] | hd :: tl -> f hd :: map f tl;; val map : ('a -> 'b) -> 'a list -> 'b list = <fun> # map (fun x -> x*2) [1; 2; 3; 4; 5];; = : int list = [2; 4; 6; 8; 10]

The function `filter`

selects all the elements of a list which respect a given predicate.
If the elements of the list have type `'a`

, then the predicate is a function of type `'a → bool`

.

# let rec filter p = function [] -> [] | x::xl -> if (p x) then x::(filter p xl) else (filter p xl);; val filter : ('a -> bool) -> 'a list -> 'a list = <fun> # filter (fun x -> x mod 2 = 0) [1; 2; 3; 4; 5];; - : int list = [2; 4]

As a more involved example, let us exploit combinators to define the
function `drop`

, that drops the first `n`

elements of a list `l`

.

Intuitively, the definitions consists of three steps.

- First, we add progressive indices to the elements of
`l`

. To do that, we combine`l`

(say, of length k) with the list of all the integers between 1 and k. - Second, we filter the elements that appear in positions greater than
`n`

. - Finally, we map the right projection
`snd`

to all the elements of the list, so to obtain the desired sublist of`l`

.

Summing up, we have the following code:

# let addidx l = combine (range 1 (length l)) l;; val addidx : 'a list -> (int * 'a) list = <fun> # addidx [10;20;30;40;50];; - : (int * int) list = [(1, 10); (2, 20); (3, 30); (4, 40); (5, 50)] # let drop n l = map snd (filter (fun x -> fst x > n) (addidx l));; val drop : int -> 'a list -> 'a list = <fun> # drop 2 [10;20;30;40;50];; - : int list = [30; 40; 50]

`map`

, `filter`

, and `addidx`

, write the following functions:
`search: 'a → 'a list → bool`

returns true iff the given element occurs in the list.`evenidx: 'a list → 'a list`

given a list, produces the sublist of the elements that appear in even position.

The list combinators `map`

and `filter`

presented above transform a list into a new list.
We now present the *fold* combinators, that instead transform a list into a value of an arbitrary type.
These combinators are quite general: indeed, they can be exploited to encode both `map`

and `filter`

.

The fold combinators
must be fed with three things: a combining function `f`

, an element `z`

, and a list `l`

.
Then, the fold combines the elements of `l`

using the function `f`

; the element `z`

is used as a base case.
The actual order in which the elements are combined depends on the fold being a “right” or a “left” fold.

The combinator `fold_right`

can be informally defined as follows:

- fold_right f z [] = z
- fold_right f z [x1] = f x1 z
- fold_right f z [x1;x2] = f x1 (f x2 z)
- …
- fold_right f z [x1;x2;…;xn] = f x1 (f x2 (… (f xn z) …))

The recursive definition of `fold_right`

can be given by pattern matching as follows:

# let rec fold_right f z = function [] -> z | x::xs -> f x (fold_right f z xs);; val fold_right : ('a -> 'b -> 'b) -> 'b -> 'a list -> 'b = <fun>

As a first example, we can use `fold_right`

to define a function `addlist`

that sums all the elements of a list.
In this case, the base element is 0, while the combining function is the (prefix) sum operator `(+)`

.
According to out informal definition, we shall have that:

- fold_right (+) 0 [] = 0
- fold_right (+) 0 [x1] = (+) x1 0 = x1+0 = x1
- fold_right (+) 0 [x1;x2] = (+) x1 ( (+) x2 0) = x1+(x2+0) = x1+x2
- …

This is the actual definition of `addlist`

:

# let addlist = fold_right (+) 0;; val addlist : int list -> int = <fun> # addlist [1;2;3;4;5];; - : int = 15

The combinator `fold_left`

can be informally defined as follows:

- fold_left f z [] = z
- fold_left f z [x1] = f z x1
- fold_left f z [x1;x2] = f (f z x1) x2
- …
- fold_left f z [x1;x2;…;xn] = f( … f(f z x1) x2 … )

The recursive definition of `fold_left`

can be given by pattern matching as follows:

# let rec fold_left f z = function [] -> z | x::xs -> fold_left f (f z x) xs;; val fold_left : ('a -> 'b -> 'a) -> 'a -> 'b list -> 'a = <fun> # fold_left (+) 0 [1;2;3;4;5];; - : int = 15

Note that the `fold_right`

and the `fold_left`

combinators yield the same result when
the combining function `f`

is associative, and `z`

is a neutral element of `f`

.
For instance, `fold_right`

and `fold_left`

would associate the sum operator `(+)`

on the list `[1;2;3]`

respectively as follows:

- fold_right (+) 0 [1;2;3] = (+) 1 ( (+) 2 ( (+) 3 0) = 1 + (2 + (3 + 0))
- fold_left (+) 0 [1;2;3] = ((0 + 1) + 2) + 3

Since addition is associative (and 0 is its neutral element), the result is the same in both cases.

The OCaml List library contains all the standard combinators
defined above, as well as a few other useful combinators.
To use them, it suffices to write

`List.`

as a prefix of the combinator.
For instance, you can use map and filter as follows:
# List.map (fun x -> x*x) (List.filter (fun x -> x mod 2 = 0) [1;2;3;4;5;6;7] );; - : int list = [4; 16; 36]

We now show how to encode `map`

and `filter`

with the `fold_right`

combinator
(note that `fold_left`

could have been used as well).

# let map f = fold_right (fun x l -> (f x) :: l) [];; val map : ('a -> 'b) -> 'a list -> 'b list = <fun> # map (fun x -> x*x) [1;2;3;4;5];; - : int list = [1; 4; 9; 16; 25] # let filter p = fold_right (fun x l -> if p x then x::l else l) [];; val filter : ('a -> bool) -> 'a list -> 'a list = <fun> # filter (fun x -> x mod 2 = 0) [1;2;3;4;5;6;7];; - : int list = [2; 4]

`append : 'a list → 'a list → 'a list`

. Concatenates two lists.`length: 'a list → int`

. Returns the length of a list.`count: 'a → 'a list → int`

counts the number of occurrences of an element in a list.`rev: 'a list → 'a list`

. Reverts the elements of a list.

lists.txt · Last modified: 2015/10/08 15:20 (external edit)