lists

Lists

To really understand the power of definitions by pattern matching, it is convenient to introduce more complex data structures. In this lecture we shall focus on list; in the next lecture we will deal with the general case of user-defined types.

List constructors

A list is a finite or infinite sequence of elements of the same type. The empty list is denoted by `[]`.

```# [];;
- : 'a list = []```

When a list is finite, we can enumerate its elements, separated by a semicolon, within the square brackets `[…]`. Unlike sets, lists may contain duplicate elements.

The type of a list has the form `'a list`, where `'a` is the type of its elements. All the elements of a list are required to have the same type, otherwise the list will not be assigned any type.

```# [1; 2; 3]
- : int list = [1; 2; 3]

# [1;2;2;1;3];;
- : int list = [1; 2; 2; 1; 3]

# ["alice"; "bob"; "carl"];;
- : string list = ["alice"; "bob"; "carl"]

# ["alice"; 1; 2];;
Characters 10-11:
["alice"; 1; 2];;
^
This expression has type int but is here used with type string```

Note that this is quite different from the way collections are dealt with in Java (al least, before Java 1.5 introduced polymorphism). In Java you can have a list of elements of any type (it suffices they subclass Object), but you need to use downcasts (and possibly incur in runtime errors) to work with them.

The operator :: (usually spelled cons, for “constructor”) allows to construct a list `hd::tl` from an element `hd` of type `'a` and a list `tl` of type `'a list`.

When `l = hd::tl` we say that `hd` is the head of the list `l`, and `tl` is the tail of the list `l`.

Indeed, any list `[x1;x2;…;xn]` can be seen just as an abbreviation for `x1::x2::…::xn::[]`.

```# 1::[];;
- : int list = [1]

# 1::2::3::[];;
- : int list = [1; 2; 3]

# "alice"::["bob";"carl"];;
- : string list = ["alice"; "bob"; "carl"]```

Since lists may contain elements of arbitrary type (it suffices that all the elements have the same type), we can construct lists the elements of which are other lists:

```# [[1;2];[];[3;4;5]];;
- : int list list = [[1; 2]; []; [3; 4; 5]]

# []::[];;
- : 'a list list = [[]]

[]::[[[]]];;
- : 'a list list list = [[]; [[]]]```

Common errors

Common error #1

Recall that all the elements of a list must have the same type.

```# "alice"::[1;2];;
Characters 12-13:
"alice" :: [1;2];;
^
This expression has type int but is here used with type string```
Common error #2

Recall that the only legitimate use of x::y is when x has type `'a` and y has type `'a list`.

```# [1;2]::3;;
This expression has type int but is here used with type int list list```

To append an element to the tail of a list, you must define a suitable function.

Common error #3

Watch out for commas. You must use semicolons (e.g. `[1;2;3]`) and not commas (e.g. `[1,2,3]`) to separate the elements of a list. Using commas is particularly dangerous, since it is not immediately forbidden by the compiler. The effect is that you obtain a tuple of elements, instead of a list. It might be time-consuming to detect this error in later phases of the development.

```# [1,2,3];;
- : (int * int * int) list = [(1, 2, 3)]

# 0::[1,2,3];;
Characters 4-9:
0::[1,2,3];;
^^^^^
This expression has type int * int * int but is here used with type int```

Pattern matching on lists

The operator `::` is also used to de-construct lists, i.e. to explore their structure by pattern matching. The empty list is matched by `[]`. A non-empty list is matched by the pattern `hd::tl`, where `hd` stands for the head of the list (its first element), while `tl` stand for the tail.

As a first example, consider the function `length`, that computes the length of a list.

```# let rec length l = match l with
[] -> 0
| x::xl -> 1 + length xl;;

val length : 'a list -> int = <fun>

# length [1;2;3;4;5];;
- : int = 5```

Note that the function `length` above can be written even more concisely with the help of some syntactic sugar:

```let rec length = function
[] -> 0
| x::xl -> 1 + length xl;;

val length : 'a list -> int = <fun>```

The following function searches a given value `x` in an unordered list.

```# let rec search x = function
[] -> false
| hd::tl -> hd=x || search x tl;;

val search : 'a -> 'a list -> bool = <fun>

# search 5 [2; 3; 5; 7; 11];;
- : bool = true

# search "elizabeth" ["alice"; "bob"; "carl"; "david"];;
- : bool = false```

The following function concatenates two lists. Notice that the same effect can be obtained through the built-in infix operator @.

```# let rec append xl yl = match xl with
[] -> yl
| x::xl' -> x::(append xl' yl);;

val append : 'a list -> 'a list -> 'a list = <fun>

# append [1;2;3] [4;5];;
- : int list = [1; 2; 3; 4; 5]

# [1;2;3] @ [4;5];;
- : int list = [1; 2; 3; 4; 5]```

The function `drop` drops the first `n` elements of a list. Note below that the patterns are matched in order of appearance (thus, the last pattern in only applied when the list is not empty, and `n`=0).

```# let rec drop n = function
[] -> []
| x::xl when n>0 -> drop (n-1) xl
| l -> l;;

val drop : int -> 'a list -> 'a list = <fun>

# drop 2 [10;20;30;40;50];;
- : int list = [30; 40; 50]```

Exercise. Write the following functions:

• `count: 'a → 'a list → int`. Counts the number of occurrences of an element in a list.
• `even: 'a list → bool`. When fed with a list `l`, evaluates to `true` iff `l` has an even number of elements.
• `rev: 'a list → 'a list`. Reverts the elements of a list.
• `nth: int → 'a list → 'a`. Takes the n-th element of a list (throws an exception if the list is too short).
• `take: int → 'a list → 'a list`. Takes the first n elements of a list.
• `split : ('a * 'b) list → 'a list * 'b list`. Splits a list of pairs into a pair of lists.
• `combine : 'a list → 'b list → ('a * 'b) list`. Combines two lists (of equal length) into a list of pairs.

Tail recursive functions on lists

Let us consider the function `range`, defined as follows:

```# let rec range a b =
if b<a then []
else a::(range (a+1) b);;

val range : int -> int -> int list = <fun>```

When applied to two integers `a` and `b`, `range a b` produces a list containing all the integers in the range [a,b]. For instance:

```# range 1 10;;
- : int list = [1; 2; 3; 4; 5; 6; 7; 8; 9; 10]```

Now, let's try to construct a list with one million of elements:

```# range 1 1000000
Stack overflow during evaluation (looping recursion?).```

What has gone wrong ? The function `range` defined above is not tail recursive, because the recursive call to `range` is not the very last thing to happen before the return. We have already shown a technique for transforming a non-tail recursive function into a tail recursive one. Let us try it:

```# let range a b =
let rec range2 a b acc =
if a>b then acc
else range2 (a+1) b (a::acc)
in range2 a b [];;

# range 1 10;;
- : int list = [10; 9; 8; 7; 6; 5; 4; 3; 2; 1]```

Unexpectedly, the resulting list is reverted. What happened ? When accumulating in `acc` the result, the element `a` is put at the head of the list. Let us observe the sequence of recursive calls, in the given range:

• range2 1 10 []
• range2 2 10 [1]
• range2 3 10 2::[1]
• range2 4 10 3::[2;1]
• range2 4 10 4::[3;2;1]

In general, writing functions on lists in a tail recursive manner is made a bit tricky because the cons operator can only be used to attach an element at the head of a list.

Luckily, in this case there is a simple solution that allows for constructing the list in the correct order.

```# let range a b =
let rec range2 a b acc =
if a>b then acc
else range2 a (b-1) (b::acc)
in range2 a b [];;

# range 1 10;;
- : int list = [1; 2; 3; 4; 5; 6; 7; 8; 9; 10]```

Now, let us construct a one million elements list:

```# let verylong = range 1 1000000;;
- : int list =  [1; 2; 3; 4; 5; 6; 7; 8; 9; 10; 11; 12; 13; 14; 15; ...]```

Not all functions on lists can be made tail recursive so easily. An example is the `append` function. It is easy to write a tail recursive version of `append` which reverts the first list:

```# let rec rev_append l1 l2 = match l1 with
[] -> l2
| x::l' -> rev_append l' (x::l2)
;;

val rev_append : 'a list -> 'a list -> 'a list = <fun>

# rev_append [1;2;3;4;5] [6;7];;
- : int list = [5; 4; 3; 2; 1; 6; 7]```

Note in passing that, when `l2` is empty, this subsumes the function that reverts a list:

```# let rev l = rev_append l [];;
val rev : 'a list -> 'a list = <fun>```

We can finally write our tail recursive append function.

```# let append l1 l2 = rev_append (rev l1) l2;;
val append : 'a list -> 'a list -> 'a list = <fun>

# append [1;2;3;4;5] [6;7];;
- : int list = [1; 2; 3; 4; 5; 6; 7]```

Let us test is with a huge list:

```# length (append longlist longlist);;
- : int = 2000000```

Exercise. Write a tail recursive version of the function `length: 'a list → int`.

Map and Filter combinators

We now introduce some higher-order combinators that are especially useful to manipulate lists. Indeed, many of the examples seen above can be reformulated using these combinators.

The function `map` applies a given function `f` to each element of a list. Note the type: `map` takes a function of type `'a → 'b` and a list of elements of type `'a`, and produces a list of elements of type `'b`.

```# let rec map f = function
[] -> []
| hd :: tl -> f hd :: map f tl;;

val map : ('a -> 'b) -> 'a list -> 'b list = <fun>

# map (fun x -> x*2) [1; 2; 3; 4; 5];;
= : int list = [2; 4; 6; 8; 10]```

The function `filter` selects all the elements of a list which respect a given predicate. If the elements of the list have type `'a`, then the predicate is a function of type `'a → bool`.

```# let rec filter p = function
[] -> []
| x::xl -> if (p x) then x::(filter p xl) else (filter p xl);;

val filter : ('a -> bool) -> 'a list -> 'a list = <fun>

# filter (fun x -> x mod 2 = 0) [1; 2; 3; 4; 5];;
- : int list = [2; 4]```

As a more involved example, let us exploit combinators to define the function `drop`, that drops the first `n` elements of a list `l`.

Intuitively, the definitions consists of three steps.

1. First, we add progressive indices to the elements of `l`. To do that, we combine `l` (say, of length k) with the list of all the integers between 1 and k.
2. Second, we filter the elements that appear in positions greater than `n`.
3. Finally, we map the right projection `snd` to all the elements of the list, so to obtain the desired sublist of `l`.

Summing up, we have the following code:

```# let addidx l = combine (range 1 (length l)) l;;
val addidx : 'a list -> (int * 'a) list = <fun>

- : (int * int) list = [(1, 10); (2, 20); (3, 30); (4, 40); (5, 50)]

# let drop n l = map snd (filter (fun x -> fst x > n) (addidx l));;
val drop : int -> 'a list -> 'a list = <fun>

# drop 2 [10;20;30;40;50];;
- : int list = [30; 40; 50]```

Exercise. Using only the list combinators `map`, `filter`, and `addidx`, write the following functions:

• `search: 'a → 'a list → bool` returns true iff the given element occurs in the list.
• `evenidx: 'a list → 'a list` given a list, produces the sublist of the elements that appear in even position.

Fold combinators

The list combinators `map` and `filter` presented above transform a list into a new list. We now present the fold combinators, that instead transform a list into a value of an arbitrary type. These combinators are quite general: indeed, they can be exploited to encode both `map` and `filter`.

The fold combinators must be fed with three things: a combining function `f`, an element `z`, and a list `l`. Then, the fold combines the elements of `l` using the function `f`; the element `z` is used as a base case. The actual order in which the elements are combined depends on the fold being a “right” or a “left” fold.

The combinator `fold_right` can be informally defined as follows:

• fold_right f z [] = z
• fold_right f z [x1] = f x1 z
• fold_right f z [x1;x2] = f x1 (f x2 z)
• fold_right f z [x1;x2;…;xn] = f x1 (f x2 (… (f xn z) …))

The recursive definition of `fold_right` can be given by pattern matching as follows:

```# let rec fold_right f z = function
[] -> z
| x::xs -> f x (fold_right f z xs);;

val fold_right : ('a -> 'b -> 'b) -> 'b -> 'a list -> 'b = <fun>```

As a first example, we can use `fold_right` to define a function `addlist` that sums all the elements of a list. In this case, the base element is 0, while the combining function is the (prefix) sum operator `(+)`. According to out informal definition, we shall have that:

• fold_right (+) 0 [] = 0
• fold_right (+) 0 [x1] = (+) x1 0 = x1+0 = x1
• fold_right (+) 0 [x1;x2] = (+) x1 ( (+) x2 0) = x1+(x2+0) = x1+x2

This is the actual definition of `addlist`:

```# let addlist = fold_right (+) 0;;
val addlist : int list -> int = <fun>

- : int = 15```

The combinator `fold_left` can be informally defined as follows:

• fold_left f z [] = z
• fold_left f z [x1] = f z x1
• fold_left f z [x1;x2] = f (f z x1) x2
• fold_left f z [x1;x2;…;xn] = f( … f(f z x1) x2 … )

The recursive definition of `fold_left` can be given by pattern matching as follows:

```# let rec fold_left f z = function
[] -> z
| x::xs -> fold_left f (f z x) xs;;

val fold_left : ('a -> 'b -> 'a) -> 'a -> 'b list -> 'a = <fun>

# fold_left (+) 0 [1;2;3;4;5];;
- : int = 15```

Note that the `fold_right` and the `fold_left` combinators yield the same result when the combining function `f` is associative, and `z` is a neutral element of `f`. For instance, `fold_right` and `fold_left` would associate the sum operator `(+)` on the list `[1;2;3]` respectively as follows:

• fold_right (+) 0 [1;2;3] = (+) 1 ( (+) 2 ( (+) 3 0) = 1 + (2 + (3 + 0))
• fold_left (+) 0 [1;2;3] = ((0 + 1) + 2) + 3

Since addition is associative (and 0 is its neutral element), the result is the same in both cases.

The OCaml List library contains all the standard combinators defined above, as well as a few other useful combinators. To use them, it suffices to write `List.` as a prefix of the combinator. For instance, you can use map and filter as follows:

```# List.map
(fun x -> x*x)
(List.filter
(fun x -> x mod 2 = 0)
[1;2;3;4;5;6;7]
);;

- : int list = [4; 16; 36]```

We now show how to encode `map` and `filter` with the `fold_right` combinator (note that `fold_left` could have been used as well).

```# let map f = fold_right (fun x l -> (f x) :: l) [];;
val map : ('a -> 'b) -> 'a list -> 'b list = <fun>

# map (fun x -> x*x) [1;2;3;4;5];;
- : int list = [1; 4; 9; 16; 25]

# let filter p = fold_right (fun x l -> if p x then x::l else l) [];;
val filter : ('a -> bool) -> 'a list -> 'a list = <fun>

# filter (fun x -> x mod 2 = 0) [1;2;3;4;5;6;7];;
- : int list = [2; 4]```

Exercise. Use the fold combinators to define the following functions.

• `append : 'a list → 'a list → 'a list`. Concatenates two lists.
• `length: 'a list → int`. Returns the length of a list.
• `count: 'a → 'a list → int` counts the number of occurrences of an element in a list.
• `rev: 'a list → 'a list`. Reverts the elements of a list.