User Tools

Site Tools


lists

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

lists [2015/10/08 15:20] (current)
Line 1: Line 1:
 +====== Lists ======
  
 +To really understand the power of definitions by pattern matching, ​
 +it is convenient to introduce more complex data structures.
 +In this lecture we shall focus on list; in the next lecture
 +we will deal with the general case of user-defined [[types]].
 +
 +\\
 +
 +===== List constructors =====
 +
 +A //list// is a finite or infinite sequence of elements of the same type.
 +The empty list is denoted by ''​[]''​.
 +<code ocaml>
 +# [];;
 +- : 'a list = []
 +</​code>​
 +
 +When a list is finite, we can enumerate its elements, separated by a semicolon, ​
 +within the square brackets ''​[...]''​.
 +Unlike sets, lists may contain duplicate elements.
 +
 +The type of a list has the form '''​a list'',​ where '''​a''​ is the type of its elements.
 +All the elements of a list are required to have the same type, otherwise
 +the list will not be assigned any type.
 +
 +<code ocaml>
 +# [1; 2; 3]
 +- : int list = [1; 2; 3]
 +
 +# [1;​2;​2;​1;​3];;​
 +- : int list = [1; 2; 2; 1; 3]
 +
 +# ["​alice";​ "​bob";​ "​carl"​];;​
 +- : string list = ["​alice";​ "​bob";​ "​carl"​]
 +
 +# ["​alice";​ 1; 2];;
 +Characters 10-11:
 +  ["​alice";​ 1; 2];;
 +            ^
 +This expression has type int but is here used with type string
 +</​code>​
 +
 +<note warning>
 +Note that this is quite different from the way collections are dealt with in Java
 +(al least, before Java 1.5 introduced polymorphism). ​
 +In Java you can have a list of elements of any type (it suffices they subclass Object), ​
 +but you need to use downcasts (and possibly incur in runtime errors) to work with them.
 +</​note>​
 +
 +The operator :: (usually spelled //cons//, for "​constructor"​)
 +allows to construct a list ''​hd::​tl'' ​
 +from an element ''​hd''​ of type '''​a''​
 +and a list ''​tl''​ of type '''​a list''​.
 +
 +When ''​l = hd::​tl''​ we say that ''​hd''​ is the **head** of the list ''​l'',​
 +and ''​tl''​ is the **tail** of the list ''​l''​.
 +
 +Indeed, any list ''​[x1;​x2;​...;​xn]''​ can be seen just as an abbreviation for ''​x1::​x2::​...::​xn::​[]''​.
 +<code ocaml>
 +# 1::[];;
 +- : int list = [1]
 +
 +# 1::​2::​3::​[];;​
 +- : int list = [1; 2; 3]
 +
 +# "​alice"::​["​bob";"​carl"​];;​
 +- : string list = ["​alice";​ "​bob";​ "​carl"​]
 +</​code>​
 +
 +Since lists may contain elements of arbitrary type (it suffices that all the elements have the same type), ​
 +we can construct lists the elements of which are other lists:
 +<code ocaml>
 +# [[1;​2];​[];​[3;​4;​5]];;​
 +- : int list list = [[1; 2]; []; [3; 4; 5]]
 +
 +# []::[];;
 +- : 'a list list = [[]]
 +
 +[]::​[[[]]];;​
 +- : 'a list list list = [[]; [[]]]
 +</​code>​
 +
 +\\
 +
 +===== Common errors =====
 +
 +==Common error #1==
 +
 +Recall that all the elements of a list must have the same type. 
 +
 +<code ocaml>
 +# "​alice"::​[1;​2];;​
 +Characters 12-13:
 +  "​alice"​ :: [1;2];;
 +              ^
 +This expression has type int but is here used with type string
 +</​code>​
 +
 +==Common error #2==
 +
 +Recall that the only legitimate use of x::y is when x has type '''​a''​ and y has type '''​a list''​.  ​
 +
 +<code ocaml>
 +# [1;2]::3;;
 +This expression has type int but is here used with type int list list
 +</​code>​
 +
 +To append an element to the tail of a list, you must define a suitable function. ​
 +
 +==Common error #3==
 +
 +Watch out for commas. You must use semicolons (e.g. ''​[1;​2;​3]''​) and **not** commas (e.g. ''​[1,​2,​3]''​) ​
 +to separate the elements of a list.
 +Using commas is particularly dangerous, since it is not immediately forbidden by the compiler.
 +The effect is that you obtain a tuple of elements, instead of a list.
 +It might be time-consuming to detect this error in later phases of the development.
 +<code ocaml>
 +# [1,2,3];;
 +- : (int * int * int) list = [(1, 2, 3)]
 +
 +# 0::​[1,​2,​3];;​
 + ​Characters 4-9:
 +  0::​[1,​2,​3];;​
 +      ^^^^^
 +This expression has type int * int * int but is here used with type int
 +</​code>​
 +
 +\\
 +
 +===== Pattern matching on lists =====
 +
 +The operator ''::''​ is also used to de-construct lists,
 +i.e. to explore their structure by pattern matching.
 +The empty list is matched by ''​[]''​.
 +A non-empty list is matched by the pattern ''​hd::​tl'', ​
 +where ''​hd''​ stands for the head of the list (its first element), ​
 +while ''​tl''​ stand for the tail.
 +
 +As a first example, consider the function ''​length'',​ that computes the length of a list.
 +<code ocaml>
 +# let rec length l = match l with
 +    [] -> 0
 +  | x::xl -> 1 + length xl;;
 +
 +val length : 'a list -> int = <fun>
 +
 +# length [1;​2;​3;​4;​5];;​
 +- : int = 5
 +</​code>​
 +
 +Note that the function ''​length''​ above can be written even more concisely with the
 +help of some syntactic sugar: ​
 +<code ocaml>
 +let rec length = function
 +  [] -> 0
 +| x::xl -> 1 + length xl;;
 +
 +val length : 'a list -> int = <fun>
 +</​code>​
 +
 +The following function searches a given value ''​x''​ in an unordered list.
 +<code ocaml>
 +# let rec search x = function
 +    [] -> false
 +  | hd::tl -> hd=x || search x tl;;
 +
 +val search : 'a -> 'a list -> bool = <fun>
 +
 +# search 5 [2; 3; 5; 7; 11];;
 +- : bool = true
 +
 +# search "​elizabeth"​ ["​alice";​ "​bob";​ "​carl";​ "​david"​];;​
 +- : bool = false
 +</​code>​
 +
 +The following function concatenates two lists.
 +Notice that the same effect can be obtained through the built-in infix operator @.
 +<code ocaml>
 +# let rec append xl yl = match xl with
 +  [] -> yl 
 +| x::xl' -> x::(append xl' yl);;
 +
 +val append : 'a list -> 'a list -> 'a list = <fun>
 +
 +# append [1;2;3] [4;5];;
 +- : int list = [1; 2; 3; 4; 5]
 +
 +# [1;2;3] @ [4;5];;
 +- : int list = [1; 2; 3; 4; 5]
 +</​code>​
 +
 +The function ''​drop''​ drops the first ''​n''​ elements of a list.
 +Note below that the patterns are matched in order of appearance
 +(thus, the last pattern in only applied when the list is not empty, and ''​n''​=0).
 +<code ocaml>
 +# let rec drop n = function ​
 +      [] -> []
 +    | x::xl when n>0 -> drop (n-1) xl
 +    | l -> l;;
 +
 +val drop : int -> 'a list -> 'a list = <fun>
 +
 +# drop 2 [10;​20;​30;​40;​50];;​
 +- : int list = [30; 40; 50]
 +</​code>​
 +
 +<note important>​**Exercise.** Write the following functions:
 +
 +  * ''​count:​ 'a -> 'a list -> int''​. Counts the number of occurrences of an element in a list.
 +  * ''​even:​ 'a list -> bool''​. When fed with a list ''​l'',​ evaluates to ''​true''​ iff ''​l''​ has an even number of elements.
 +  * ''​rev:​ 'a list -> 'a list''​. Reverts the elements of a list.
 +  * ''​nth:​ int -> 'a list -> '​a''​. Takes the n-th element of a list (throws an exception if the list is too short).
 +  * ''​take:​ int -> 'a list -> 'a list''​. Takes the first n elements of a list.
 +  * ''​split : ('a * 'b) list -> 'a list * 'b list''​. Splits a list of pairs into a pair of lists.
 +  * ''​combine : 'a list -> 'b list -> ('a * 'b) list''​. Combines two lists (of equal length) into a list of pairs.
 +</​note>​
 +
 +\\
 +
 +
 +===== Tail recursive functions on lists =====
 +
 +Let us consider the function ''​range'',​ defined as follows:
 +<code ocaml>
 +# let rec range a b = 
 +  if b<a then []
 +  else a::(range (a+1) b);;
 +
 +val range : int -> int -> int list = <fun>
 +</​code>​
 +
 +When applied to two integers ''​a''​ and ''​b'',​ ''​range a b''​
 +produces a list containing all the integers in the range [a,b].
 +For instance:
 +<code ocaml>
 +# range 1 10;;
 +- : int list = [1; 2; 3; 4; 5; 6; 7; 8; 9; 10]
 +</​code>​
 +
 +Now, let's try to construct a list with one million of elements:
 +<code ocaml>
 +# range 1 1000000
 +Stack overflow during evaluation (looping recursion?​).
 +</​code>​
 +
 +What has gone wrong ? The function ''​range''​ defined above is not [[rec#​tail_recursion|tail recursive]],​
 +because the recursive call to ''​range''​ is not the very last thing to happen before the return.
 +We have already shown a technique for transforming a non-tail recursive function ​
 +into a tail recursive one.
 +Let us try it:
 +<code ocaml>
 +# let range a b =
 +    let rec range2 a b acc =
 +        if a>b then acc
 +        else range2 (a+1) b (a::acc)
 +    in range2 a b [];;
 +
 +# range 1 10;;
 +- : int list = [10; 9; 8; 7; 6; 5; 4; 3; 2; 1]
 +</​code>​
 +
 +Unexpectedly,​ the resulting list is reverted. What happened ?
 +When accumulating in ''​acc''​ the result, the element ''​a''​
 +is put at the head of the list.
 +Let us observe the sequence of recursive calls, in the given range:
 +  * range2 1 10 []
 +  * range2 2 10 [1]
 +  * range2 3 10 2::[1]
 +  * range2 4 10 3::[2;1]
 +  * range2 4 10 4::[3;2;1]
 +  * ...
 +
 +In general, writing functions on lists in a tail recursive manner
 +is made a bit tricky because the //cons// operator can only 
 +be used to attach an element at the head of a list.
 +
 +Luckily, in this case there is a simple solution that allows for
 +constructing the list in the correct order.
 +<code ocaml>
 +# let range a b =
 +    let rec range2 a b acc =
 +        if a>b then acc
 +        else range2 a (b-1) (b::acc)
 +    in range2 a b [];;
 +
 +# range 1 10;;
 +- : int list = [1; 2; 3; 4; 5; 6; 7; 8; 9; 10]
 +</​code>​
 +
 +Now, let us construct a one million elements list:
 +<code ocaml>
 +# let verylong = range 1 1000000;;
 + - : int list =  [1; 2; 3; 4; 5; 6; 7; 8; 9; 10; 11; 12; 13; 14; 15; ...]
 +</​code>​
 +
 +Not all functions on lists can be made tail recursive so easily.
 +An example is the ''​append''​ function.
 +It is easy to write a tail recursive version of ''​append''​ which reverts
 +the first list:
 +<code ocaml>
 +# let rec rev_append l1 l2 = match l1 with
 +    [] -> l2
 +  | x::l' -> rev_append l' (x::l2)
 +;;
 +
 +val rev_append : 'a list -> 'a list -> 'a list = <fun>
 +
 +# rev_append [1;2;3;4;5] [6;7];;
 +- : int list = [5; 4; 3; 2; 1; 6; 7]
 +</​code>​
 +
 +Note in passing that, when ''​l2''​ is empty, this subsumes the function that reverts a list:
 +<code ocaml>
 +# let rev l = rev_append l [];;
 +val rev : 'a list -> 'a list = <fun>
 +</​code>​
 +
 +We can finally write our tail recursive append function.
 +<code ocaml>
 +# let append l1 l2 = rev_append (rev l1) l2;;
 +val append : 'a list -> 'a list -> 'a list = <fun>
 +
 +# append [1;2;3;4;5] [6;7];;
 +- : int list = [1; 2; 3; 4; 5; 6; 7]
 +</​code>​
 +
 +Let us test is with a huge list:
 +<code ocaml>
 +# length (append longlist longlist);;
 +- : int = 2000000
 +</​code>​
 +
 +<note important>​
 +**Exercise.** ​
 +Write a tail recursive version of the function ''​length:​ 'a list -> int''​.
 +</​note>​
 +
 +\\
 +
 +===== Map and Filter combinators =====
 +
 +We now introduce some higher-order combinators that are especially useful to manipulate lists. ​
 +Indeed, many of the examples seen above can be reformulated using these combinators.
 +
 +The function ''​map''​ applies a given function ''​f''​ to each element of a list.
 +Note the type: ''​map''​ takes a function of type '''​a -> '​b''​ and a list of elements
 +of type '''​a'',​ and produces a list of elements of type '''​b''​.
 +<code ocaml>
 +# let rec map f = function
 +     [] -> []
 +   | hd :: tl -> f hd :: map f tl;;
 +
 +val map : ('a -> 'b) -> 'a list -> 'b list = <fun>
 +
 +# map (fun x -> x*2) [1; 2; 3; 4; 5];;
 += : int list = [2; 4; 6; 8; 10]
 +</​code>​
 +
 +The function ''​filter''​ selects all the elements of a list which respect a given predicate.
 +If the elements of the list have type '''​a'',​ then the predicate is a function of type '''​a -> bool''​.
 +<code ocaml>
 +# let rec filter p = function
 +  [] -> []
 +| x::xl -> if (p x) then x::(filter p xl) else (filter p xl);;
 +
 +val filter : ('a -> bool) -> 'a list -> 'a list = <fun>
 +
 +# filter (fun x -> x mod 2 = 0) [1; 2; 3; 4; 5];;
 +- : int list = [2; 4]
 +</​code>​
 +
 +As a more involved example, let us exploit combinators to define the
 +function ''​drop'',​ that drops the first ''​n''​ elements of a list ''​l''​.
 +
 +Intuitively,​ the definitions consists of three steps.
 +
 +  - First, we add progressive indices to the elements of ''​l''​. To do that, we combine ''​l''​ (say, of length k) with the list of all the integers between 1 and k.
 +  - Second, we filter the elements that appear in positions greater than ''​n''​.
 +  - Finally, we map the right projection ''​snd''​ to all the elements of the list, so to obtain the desired sublist of ''​l''​.
 +
 +Summing up, we have the following code:
 +<code ocaml>
 +# let addidx l = combine (range 1 (length l)) l;;
 +val addidx : 'a list -> (int * 'a) list = <fun>
 +
 +# addidx [10;​20;​30;​40;​50];;​
 +- : (int * int) list = [(1, 10); (2, 20); (3, 30); (4, 40); (5, 50)]
 +
 +# let drop n l = map snd (filter (fun x -> fst x > n) (addidx l));;
 +val drop : int -> 'a list -> 'a list = <fun>
 +
 +# drop 2 [10;​20;​30;​40;​50];;​
 +- : int list = [30; 40; 50]
 +</​code>​
 +
 +<note important>​**Exercise**. Using only the list combinators ''​map'',​ ''​filter'',​ and ''​addidx'', ​ write the following functions:
 +
 +  * ''​search:​ 'a -> 'a list -> bool''​ returns true iff the given element occurs in the list.
 +  * ''​evenidx:​ 'a list -> 'a list''​ given a list, produces the sublist of the elements that appear in even position.
 +</​note>​
 +
 +
 +===== Fold combinators =====
 +
 +The list combinators ''​map''​ and ''​filter''​ presented above transform a list into a new list.
 +We now present the //fold// combinators,​ that instead transform a list into a value of an arbitrary type.
 +These combinators are quite general: indeed, they can be exploited to encode both ''​map''​ and ''​filter''​. ​
 +
 +The [[wp>​Fold_(higher-order_function)|fold combinators]] ​
 +must be fed with three things: a combining function ''​f'',​ an element ''​z'',​ and a list ''​l''​. ​
 +Then, the fold combines the elements of ''​l''​ using the function ''​f'';​ the element ''​z''​ is used as a base case.
 +The actual order in which the elements are combined depends on the fold being a "​right"​ or a "​left"​ fold.
 +
 +The combinator ''​fold_right''​ can be informally defined as follows:
 +  * fold_right f z [] = z
 +  * fold_right f z [x1] = f x1 z
 +  * fold_right f z [x1;x2] = f x1 (f x2 z)
 +  * ...
 +  * fold_right f z [x1;​x2;​...;​xn] = f x1 (f x2 (... (f xn z) ...))
 +
 +The recursive definition of ''​fold_right''​ can be given by pattern matching as follows:
 +<code ocaml>
 +# let rec fold_right f z = function
 +    [] -> z
 +  | x::xs -> f x (fold_right f z xs);;
 +
 +val fold_right : ('a -> 'b -> 'b) -> 'b -> 'a list -> 'b = <fun>
 +</​code>​
 +
 +As a first example, we can use ''​fold_right''​ to define a function ''​addlist''​ that sums all the elements of a list.
 +In this case, the base element is 0, while the combining function is the (prefix) sum operator ''​(+)''​.
 +According to out informal definition, we shall have that:
 +  * fold_right (+) 0 [] = 0
 +  * fold_right (+) 0 [x1] = (+) x1 0 = x1+0 = x1
 +  * fold_right (+) 0 [x1;x2] = (+) x1 ( (+) x2 0) = x1+(x2+0) = x1+x2
 +  * ...
 +
 +This is the actual definition of ''​addlist'':​
 +<code ocaml>
 +# let addlist = fold_right (+) 0;; 
 +val addlist : int list -> int = <fun>
 +
 +# addlist [1;​2;​3;​4;​5];;​
 +- : int = 15
 +</​code>​
 +
 +
 +The combinator ''​fold_left''​ can be informally defined as follows:
 +  * fold_left f z [] = z
 +  * fold_left f z [x1] = f z x1
 +  * fold_left f z [x1;x2] = f (f z x1) x2
 +  * ...
 +  * fold_left f z [x1;​x2;​...;​xn] = f( ... f(f z x1) x2 ... )
 +
 +The recursive definition of ''​fold_left''​ can be given by pattern matching as follows:
 +<code ocaml>
 +# let rec fold_left f z = function
 +    [] -> z
 +  | x::xs -> fold_left f (f z x) xs;;
 +
 +val fold_left : ('a -> 'b -> 'a) -> 'a -> 'b list -> 'a = <fun>
 +
 +# fold_left (+) 0 [1;​2;​3;​4;​5];;​
 +- : int = 15
 +</​code>​
 +
 +Note that the ''​fold_right''​ and the ''​fold_left''​ combinators yield the same result when 
 +the combining function ''​f''​ is associative,​ and ''​z''​ is a neutral element of ''​f''​.
 +For instance, ''​fold_right''​ and ''​fold_left''​ would associate the sum operator ''​(+)''​ on the list ''​[1;​2;​3]''​ respectively as follows:
 +
 +  * fold_right (+) 0 [1;2;3] = (+) 1 ( (+) 2 ( (+) 3 0) = 1 + (2 + (3 + 0))
 +  * fold_left (+) 0 [1;2;3] = ((0 + 1) + 2) + 3
 +
 +Since addition is associative (and 0 is its neutral element), the result is the same in both cases.
 +
 +<note tip>
 +The [[http://​caml.inria.fr/​pub/​docs/​manual-ocaml/​libref/​List.html|OCaml List library]] contains all the standard combinators
 +defined above, as well as a few other useful combinators.
 +To use them, it suffices to write ''​List.''​ as a prefix of the combinator.
 +For instance, you can use map and filter as follows:
 +<code ocaml>
 +# List.map ​
 +    (fun x -> x*x) 
 +    (List.filter ​
 +      (fun x -> x mod 2 = 0) 
 +      [1;​2;​3;​4;​5;​6;​7]
 +    );;
 +    ​
 +- : int list = [4; 16; 36]
 +</​code>​
 +</​note>​
 +
 +We now show how to encode ''​map''​ and ''​filter''​ with the ''​fold_right''​ combinator
 +(note that ''​fold_left''​ could have been used as well).
 +<code ocaml>
 +# let map f = fold_right (fun x l -> (f x) :: l) [];;
 +val map : ('a -> 'b) -> 'a list -> 'b list = <fun>
 +
 +# map (fun x -> x*x) [1;​2;​3;​4;​5];;​
 +- : int list = [1; 4; 9; 16; 25]
 +
 +# let filter p = fold_right (fun x l -> if p x then x::l else l) [];;
 +val filter : ('a -> bool) -> 'a list -> 'a list = <fun>
 +
 +# filter (fun x -> x mod 2 = 0) [1;​2;​3;​4;​5;​6;​7];;​
 +- : int list = [2; 4]
 +</​code>​
 +
 +<note important>​**Exercise.** ​
 +Use the fold combinators to define the following functions.
 +  * ''​append : 'a list -> 'a list -> 'a list''​. Concatenates two lists.
 +  * ''​length:​ 'a list -> int''​. Returns the length of a list.
 +  * ''​count:​ 'a -> 'a list -> int''​ counts the number of occurrences of an element in a list.
 +  * ''​rev:​ 'a list -> 'a list''​. Reverts the elements of a list.
 +</​note>​
 +
 +\\
 +
 +
 +===== Exercises =====
 +
 +[[ex-lists|LIP assignment #4]]
lists.txt ยท Last modified: 2015/10/08 15:20 (external edit)