lists

Lists

To really understand the power of definitions by pattern matching, it is convenient to introduce more complex data structures. In this lecture we shall focus on list; in the next lecture we will deal with the general case of user-defined types.

List constructors

A list is a finite or infinite sequence of elements of the same type. The empty list is denoted by [].

# [];;
- : 'a list = []

When a list is finite, we can enumerate its elements, separated by a semicolon, within the square brackets […]. Unlike sets, lists may contain duplicate elements.

The type of a list has the form 'a list, where 'a is the type of its elements. All the elements of a list are required to have the same type, otherwise the list will not be assigned any type.

# [1; 2; 3]
- : int list = [1; 2; 3]

# [1;2;2;1;3];;
- : int list = [1; 2; 2; 1; 3]

# ["alice"; "bob"; "carl"];;
- : string list = ["alice"; "bob"; "carl"]

# ["alice"; 1; 2];;
Characters 10-11:
["alice"; 1; 2];;
^
This expression has type int but is here used with type string

Note that this is quite different from the way collections are dealt with in Java (al least, before Java 1.5 introduced polymorphism). In Java you can have a list of elements of any type (it suffices they subclass Object), but you need to use downcasts (and possibly incur in runtime errors) to work with them.

The operator :: (usually spelled cons, for “constructor”) allows to construct a list hd::tl from an element hd of type 'a and a list tl of type 'a list.

When l = hd::tl we say that hd is the head of the list l, and tl is the tail of the list l.

Indeed, any list [x1;x2;…;xn] can be seen just as an abbreviation for x1::x2::…::xn::[].

# 1::[];;
- : int list = 

# 1::2::3::[];;
- : int list = [1; 2; 3]

# "alice"::["bob";"carl"];;
- : string list = ["alice"; "bob"; "carl"]

Since lists may contain elements of arbitrary type (it suffices that all the elements have the same type), we can construct lists the elements of which are other lists:

# [[1;2];[];[3;4;5]];;
- : int list list = [[1; 2]; []; [3; 4; 5]]

# []::[];;
- : 'a list list = [[]]

[]::[[[]]];;
- : 'a list list list = [[]; [[]]]

Common errors

Common error #1

Recall that all the elements of a list must have the same type.

# "alice"::[1;2];;
Characters 12-13:
"alice" :: [1;2];;
^
This expression has type int but is here used with type string
Common error #2

Recall that the only legitimate use of x::y is when x has type 'a and y has type 'a list.

# [1;2]::3;;
This expression has type int but is here used with type int list list

To append an element to the tail of a list, you must define a suitable function.

Common error #3

Watch out for commas. You must use semicolons (e.g. [1;2;3]) and not commas (e.g. [1,2,3]) to separate the elements of a list. Using commas is particularly dangerous, since it is not immediately forbidden by the compiler. The effect is that you obtain a tuple of elements, instead of a list. It might be time-consuming to detect this error in later phases of the development.

# [1,2,3];;
- : (int * int * int) list = [(1, 2, 3)]

# 0::[1,2,3];;
Characters 4-9:
0::[1,2,3];;
^^^^^
This expression has type int * int * int but is here used with type int

Pattern matching on lists

The operator :: is also used to de-construct lists, i.e. to explore their structure by pattern matching. The empty list is matched by []. A non-empty list is matched by the pattern hd::tl, where hd stands for the head of the list (its first element), while tl stand for the tail.

As a first example, consider the function length, that computes the length of a list.

# let rec length l = match l with
[] -> 0
| x::xl -> 1 + length xl;;

val length : 'a list -> int = <fun>

# length [1;2;3;4;5];;
- : int = 5

Note that the function length above can be written even more concisely with the help of some syntactic sugar:

let rec length = function
[] -> 0
| x::xl -> 1 + length xl;;

val length : 'a list -> int = <fun>

The following function searches a given value x in an unordered list.

# let rec search x = function
[] -> false
| hd::tl -> hd=x || search x tl;;

val search : 'a -> 'a list -> bool = <fun>

# search 5 [2; 3; 5; 7; 11];;
- : bool = true

# search "elizabeth" ["alice"; "bob"; "carl"; "david"];;
- : bool = false

The following function concatenates two lists. Notice that the same effect can be obtained through the built-in infix operator @.

# let rec append xl yl = match xl with
[] -> yl
| x::xl' -> x::(append xl' yl);;

val append : 'a list -> 'a list -> 'a list = <fun>

# append [1;2;3] [4;5];;
- : int list = [1; 2; 3; 4; 5]

# [1;2;3] @ [4;5];;
- : int list = [1; 2; 3; 4; 5]

The function drop drops the first n elements of a list. Note below that the patterns are matched in order of appearance (thus, the last pattern in only applied when the list is not empty, and n=0).

# let rec drop n = function
[] -> []
| x::xl when n>0 -> drop (n-1) xl
| l -> l;;

val drop : int -> 'a list -> 'a list = <fun>

# drop 2 [10;20;30;40;50];;
- : int list = [30; 40; 50]

Exercise. Write the following functions:

• count: 'a → 'a list → int. Counts the number of occurrences of an element in a list.
• even: 'a list → bool. When fed with a list l, evaluates to true iff l has an even number of elements.
• rev: 'a list → 'a list. Reverts the elements of a list.
• nth: int → 'a list → 'a. Takes the n-th element of a list (throws an exception if the list is too short).
• take: int → 'a list → 'a list. Takes the first n elements of a list.
• split : ('a * 'b) list → 'a list * 'b list. Splits a list of pairs into a pair of lists.
• combine : 'a list → 'b list → ('a * 'b) list. Combines two lists (of equal length) into a list of pairs.

Tail recursive functions on lists

Let us consider the function range, defined as follows:

# let rec range a b =
if b<a then []
else a::(range (a+1) b);;

val range : int -> int -> int list = <fun>

When applied to two integers a and b, range a b produces a list containing all the integers in the range [a,b]. For instance:

# range 1 10;;
- : int list = [1; 2; 3; 4; 5; 6; 7; 8; 9; 10]

Now, let's try to construct a list with one million of elements:

# range 1 1000000
Stack overflow during evaluation (looping recursion?).

What has gone wrong ? The function range defined above is not tail recursive, because the recursive call to range is not the very last thing to happen before the return. We have already shown a technique for transforming a non-tail recursive function into a tail recursive one. Let us try it:

# let range a b =
let rec range2 a b acc =
if a>b then acc
else range2 (a+1) b (a::acc)
in range2 a b [];;

# range 1 10;;
- : int list = [10; 9; 8; 7; 6; 5; 4; 3; 2; 1]

Unexpectedly, the resulting list is reverted. What happened ? When accumulating in acc the result, the element a is put at the head of the list. Let us observe the sequence of recursive calls, in the given range:

• range2 1 10 []
• range2 2 10 
• range2 3 10 2::
• range2 4 10 3::[2;1]
• range2 4 10 4::[3;2;1]

In general, writing functions on lists in a tail recursive manner is made a bit tricky because the cons operator can only be used to attach an element at the head of a list.

Luckily, in this case there is a simple solution that allows for constructing the list in the correct order.

# let range a b =
let rec range2 a b acc =
if a>b then acc
else range2 a (b-1) (b::acc)
in range2 a b [];;

# range 1 10;;
- : int list = [1; 2; 3; 4; 5; 6; 7; 8; 9; 10]

Now, let us construct a one million elements list:

# let verylong = range 1 1000000;;
- : int list =  [1; 2; 3; 4; 5; 6; 7; 8; 9; 10; 11; 12; 13; 14; 15; ...]

Not all functions on lists can be made tail recursive so easily. An example is the append function. It is easy to write a tail recursive version of append which reverts the first list:

# let rec rev_append l1 l2 = match l1 with
[] -> l2
| x::l' -> rev_append l' (x::l2)
;;

val rev_append : 'a list -> 'a list -> 'a list = <fun>

# rev_append [1;2;3;4;5] [6;7];;
- : int list = [5; 4; 3; 2; 1; 6; 7]

Note in passing that, when l2 is empty, this subsumes the function that reverts a list:

# let rev l = rev_append l [];;
val rev : 'a list -> 'a list = <fun>

We can finally write our tail recursive append function.

# let append l1 l2 = rev_append (rev l1) l2;;
val append : 'a list -> 'a list -> 'a list = <fun>

# append [1;2;3;4;5] [6;7];;
- : int list = [1; 2; 3; 4; 5; 6; 7]

Let us test is with a huge list:

# length (append longlist longlist);;
- : int = 2000000

Exercise. Write a tail recursive version of the function length: 'a list → int.

Map and Filter combinators

We now introduce some higher-order combinators that are especially useful to manipulate lists. Indeed, many of the examples seen above can be reformulated using these combinators.

The function map applies a given function f to each element of a list. Note the type: map takes a function of type 'a → 'b and a list of elements of type 'a, and produces a list of elements of type 'b.

# let rec map f = function
[] -> []
| hd :: tl -> f hd :: map f tl;;

val map : ('a -> 'b) -> 'a list -> 'b list = <fun>

# map (fun x -> x*2) [1; 2; 3; 4; 5];;
= : int list = [2; 4; 6; 8; 10]

The function filter selects all the elements of a list which respect a given predicate. If the elements of the list have type 'a, then the predicate is a function of type 'a → bool.

# let rec filter p = function
[] -> []
| x::xl -> if (p x) then x::(filter p xl) else (filter p xl);;

val filter : ('a -> bool) -> 'a list -> 'a list = <fun>

# filter (fun x -> x mod 2 = 0) [1; 2; 3; 4; 5];;
- : int list = [2; 4]

As a more involved example, let us exploit combinators to define the function drop, that drops the first n elements of a list l.

Intuitively, the definitions consists of three steps.

1. First, we add progressive indices to the elements of l. To do that, we combine l (say, of length k) with the list of all the integers between 1 and k.
2. Second, we filter the elements that appear in positions greater than n.
3. Finally, we map the right projection snd to all the elements of the list, so to obtain the desired sublist of l.

Summing up, we have the following code:

# let addidx l = combine (range 1 (length l)) l;;
val addidx : 'a list -> (int * 'a) list = <fun>

- : (int * int) list = [(1, 10); (2, 20); (3, 30); (4, 40); (5, 50)]

# let drop n l = map snd (filter (fun x -> fst x > n) (addidx l));;
val drop : int -> 'a list -> 'a list = <fun>

# drop 2 [10;20;30;40;50];;
- : int list = [30; 40; 50]

Exercise. Using only the list combinators map, filter, and addidx, write the following functions:

• search: 'a → 'a list → bool returns true iff the given element occurs in the list.
• evenidx: 'a list → 'a list given a list, produces the sublist of the elements that appear in even position.

Fold combinators

The list combinators map and filter presented above transform a list into a new list. We now present the fold combinators, that instead transform a list into a value of an arbitrary type. These combinators are quite general: indeed, they can be exploited to encode both map and filter.

The fold combinators must be fed with three things: a combining function f, an element z, and a list l. Then, the fold combines the elements of l using the function f; the element z is used as a base case. The actual order in which the elements are combined depends on the fold being a “right” or a “left” fold.

The combinator fold_right can be informally defined as follows:

• fold_right f z [] = z
• fold_right f z [x1] = f x1 z
• fold_right f z [x1;x2] = f x1 (f x2 z)
• fold_right f z [x1;x2;…;xn] = f x1 (f x2 (… (f xn z) …))

The recursive definition of fold_right can be given by pattern matching as follows:

# let rec fold_right f z = function
[] -> z
| x::xs -> f x (fold_right f z xs);;

val fold_right : ('a -> 'b -> 'b) -> 'b -> 'a list -> 'b = <fun>

As a first example, we can use fold_right to define a function addlist that sums all the elements of a list. In this case, the base element is 0, while the combining function is the (prefix) sum operator (+). According to out informal definition, we shall have that:

• fold_right (+) 0 [] = 0
• fold_right (+) 0 [x1] = (+) x1 0 = x1+0 = x1
• fold_right (+) 0 [x1;x2] = (+) x1 ( (+) x2 0) = x1+(x2+0) = x1+x2

This is the actual definition of addlist:

# let addlist = fold_right (+) 0;;
val addlist : int list -> int = <fun>

- : int = 15

The combinator fold_left can be informally defined as follows:

• fold_left f z [] = z
• fold_left f z [x1] = f z x1
• fold_left f z [x1;x2] = f (f z x1) x2
• fold_left f z [x1;x2;…;xn] = f( … f(f z x1) x2 … )

The recursive definition of fold_left can be given by pattern matching as follows:

# let rec fold_left f z = function
[] -> z
| x::xs -> fold_left f (f z x) xs;;

val fold_left : ('a -> 'b -> 'a) -> 'a -> 'b list -> 'a = <fun>

# fold_left (+) 0 [1;2;3;4;5];;
- : int = 15

Note that the fold_right and the fold_left combinators yield the same result when the combining function f is associative, and z is a neutral element of f. For instance, fold_right and fold_left would associate the sum operator (+) on the list [1;2;3] respectively as follows:

• fold_right (+) 0 [1;2;3] = (+) 1 ( (+) 2 ( (+) 3 0) = 1 + (2 + (3 + 0))
• fold_left (+) 0 [1;2;3] = ((0 + 1) + 2) + 3

Since addition is associative (and 0 is its neutral element), the result is the same in both cases.

The OCaml List library contains all the standard combinators defined above, as well as a few other useful combinators. To use them, it suffices to write List. as a prefix of the combinator. For instance, you can use map and filter as follows:

# List.map
(fun x -> x*x)
(List.filter
(fun x -> x mod 2 = 0)
[1;2;3;4;5;6;7]
);;

- : int list = [4; 16; 36]

We now show how to encode map and filter with the fold_right combinator (note that fold_left could have been used as well).

# let map f = fold_right (fun x l -> (f x) :: l) [];;
val map : ('a -> 'b) -> 'a list -> 'b list = <fun>

# map (fun x -> x*x) [1;2;3;4;5];;
- : int list = [1; 4; 9; 16; 25]

# let filter p = fold_right (fun x l -> if p x then x::l else l) [];;
val filter : ('a -> bool) -> 'a list -> 'a list = <fun>

# filter (fun x -> x mod 2 = 0) [1;2;3;4;5;6;7];;
- : int list = [2; 4]

Exercise. Use the fold combinators to define the following functions.

• append : 'a list → 'a list → 'a list. Concatenates two lists.
• length: 'a list → int. Returns the length of a list.
• count: 'a → 'a list → int counts the number of occurrences of an element in a list.
• rev: 'a list → 'a list. Reverts the elements of a list.

Exercises

lists.txt · Last modified: 2015/10/08 15:20 (external edit)

Page Tools 