types

This shows you the differences between two versions of the page.

— |
types [2015/10/08 15:20] (current) |
||
---|---|---|---|

Line 1: | Line 1: | ||

+ | ====== Types ====== | ||

+ | |||

+ | OCaml features facilities to define new types. | ||

+ | |||

+ | In many cases, assigning an "evocative" type to a function is just a good way of documenting your code. | ||

+ | For instance, assume we want a function ''add'' that sums two complex numbers. | ||

+ | We can certainly implement complex numbers as pairs of floats, | ||

+ | and define ''add'' such that is type is: | ||

+ | <code ocaml> | ||

+ | add: (float * float) -> (float * float) -> (float * float) | ||

+ | </code> | ||

+ | |||

+ | However, we would be happier if ''add'' could be written such that its type is | ||

+ | a bit more informative about the expected behaviour. | ||

+ | For instance, we would like to have: | ||

+ | <code ocaml> | ||

+ | add: complex -> complex -> complex | ||

+ | </code> | ||

+ | |||

+ | A further use of types is for specifying //recursive data structures//, like lists, trees, etc. | ||

+ | Operations over these data structures can be implemented through pattern matching. | ||

+ | The OCaml module system can be used to pack together the definitions related to a given | ||

+ | data structure, so to provide them with a further abstraction layer. | ||

+ | |||

+ | \\ | ||

+ | |||

+ | ===== Tuples ===== | ||

+ | |||

+ | Tuples are finite sequences of elements, possibly of different types. | ||

+ | |||

+ | A tuple with elements a<sub>1</sub>,...,a<sub>k</sub> is written just as (a<sub>1</sub>,...,a<sub>k</sub>). | ||

+ | Note that the elements are separated by commas. | ||

+ | The outer parentheses can be omitted, yet they improve readibility. | ||

+ | |||

+ | If the type of the element a<sub>i</sub> is T<sub>i</sub>, | ||

+ | then the type of the whole tuple will be written as | ||

+ | T<sub>1</sub> * ... * T<sub>k</sub>. | ||

+ | Differently from lists, the type of a tuple reveals the number of its elements. | ||

+ | |||

+ | For instance, we can define a pair as follows: | ||

+ | <code ocaml> | ||

+ | # (1,"alice");; | ||

+ | - : int * string = (1, "alice") | ||

+ | </code> | ||

+ | |||

+ | Two polymorphic operations are used to access the elements of a pair. | ||

+ | The operation ''fst'' evaluates to the first element, while ''snd'' deals with the second element. | ||

+ | <code ocaml> | ||

+ | # fst;; | ||

+ | - : 'a * 'b -> 'a = <fun> | ||

+ | |||

+ | # fst (1,"alice");; | ||

+ | - : int = 1 | ||

+ | |||

+ | # snd;; | ||

+ | - : 'a * 'b -> 'b = <fun> | ||

+ | |||

+ | # snd (1,"alice");; | ||

+ | - : string = "alice" | ||

+ | </code> | ||

+ | |||

+ | No facility is given to access the elements of an arbitrary tuple. | ||

+ | To do that, you must define the needed operators; this can be easily done by pattern matching. | ||

+ | For instance, to get the third element of a tuple of four elements, we define the function | ||

+ | ''third_of_four'' as follows: | ||

+ | <code ocaml> | ||

+ | # let third_of_four (_,_,x,_) = x;; | ||

+ | val third_of_four : 'a * 'b * 'c * 'd -> 'c = <fun> | ||

+ | |||

+ | # third_of_four (1,"alice",2,"bob");; | ||

+ | - : int = 2 | ||

+ | </code> | ||

+ | |||

+ | \\ | ||

+ | |||

+ | ===== Records ===== | ||

+ | |||

+ | Records are a slight (yet useful) extension of tuples. | ||

+ | Each element of a record is uniquely identified by a label (rather than by a position, like in tuples). | ||

+ | This allows for avoiding some common errors, | ||

+ | e.g. combining values the representations of which are structurally equivalent, but with different meaning. | ||

+ | |||

+ | For instance, consider two complex numbers ''c1'' and ''c2'' both represented as pairs of floats, | ||

+ | where ''c1'' is in cartesian form, and ''c2'' is in polar form. | ||

+ | A possible error would be summing ''c1'' to ''c2'' as if they were both in cartesian form. | ||

+ | Using records to represent complex numbers helps in preventing from this kind of errors. | ||

+ | |||

+ | The definition of a record type for complex numbers (say, in cartesian form) is as follows. | ||

+ | First, we need to give a name to the new type: in our case, we choose the name ''complex''. | ||

+ | Then, we choose a label and a type for each element of the record. | ||

+ | In our case, the record has two elements, both of type ''float'': | ||

+ | one is labelled ''re'' (for the real part), and the other is labelled ''im'' (imaginary part). | ||

+ | <code ocaml> | ||

+ | type complex = {re: float; im: float};; | ||

+ | </code> | ||

+ | |||

+ | Note that type names and field names are always uncapitalized (the first character is lowercase). | ||

+ | |||

+ | To construct a value of type ''complex'', we use a similar notation, | ||

+ | except that we use the symbol ''='' instead of '':''. | ||

+ | To access the element labelled ''l'' of a record ''x'', | ||

+ | we write ''x.l''. | ||

+ | Note that the order of the elements of a record is immaterial. | ||

+ | |||

+ | <code ocaml> | ||

+ | # let i = {re=0.; im=1.};; | ||

+ | val i : complex = {re = 0.; im = 1.} | ||

+ | |||

+ | # i.im;; | ||

+ | - : float = 1. | ||

+ | |||

+ | # i = {im=1.; re=0.};; | ||

+ | - : bool = true | ||

+ | </code> | ||

+ | |||

+ | Here is a function for sums two complex numbers: | ||

+ | <code ocaml> | ||

+ | # let addcomplex c1 c2 = {re = c1.re +. c2.re; im = c1.im +. c2.im};; | ||

+ | val addcomplex : complex -> complex -> complex = <fun> | ||

+ | </code> | ||

+ | |||

+ | Let us now define the type ''polar'' for representing complex numbers in polar form: | ||

+ | <code ocaml> | ||

+ | # type polar = {radius: float; angle: float};; | ||

+ | |||

+ | # let c = { radius=3.0; angle=1.5};; | ||

+ | val c : polar = {radius = 3.; angle = 1.5} | ||

+ | </code> | ||

+ | |||

+ | If we try to add a complex in cartesian coordinates to a complex in polar coordinates, | ||

+ | we obtain a type error: | ||

+ | <code ocaml> | ||

+ | # addcomplex i c;; | ||

+ | |||

+ | Characters 13-14: | ||

+ | addcomplex i c;; | ||

+ | ^ | ||

+ | This expression has type polar but is here used with type complex | ||

+ | </code> | ||

+ | |||

+ | \\ | ||

+ | |||

+ | ===== Tagged unions ===== | ||

+ | |||

+ | A tagged union (a.k.a., a //variant//) is a data structure that can hold a value, | ||

+ | the type of which can be taken on a given set of types. | ||

+ | The type of a tagged union is called //variant type//. | ||

+ | It lists all possible shapes for values of that type. | ||

+ | Each case is identified by a tag, called a **type constructor**. | ||

+ | Such tag serves both for constructing values of the variant type, | ||

+ | and for destructing them by pattern-matching. | ||

+ | |||

+ | <note warning> | ||

+ | Constructors are always capitalized, to distinguish them from other | ||

+ | [[http://caml.inria.fr/pub/docs/manual-ocaml/manual011.html|names]] | ||

+ | (which must start with a lowercase letter). | ||

+ | </note> | ||

+ | |||

+ | Let us now define a variant type to represent the sign of an integer number: | ||

+ | <code ocaml> | ||

+ | # type sign = Pos | Neg | Zero;; | ||

+ | </code> | ||

+ | |||

+ | The function ''getsign'' below maps each integer to its sign. | ||

+ | <code ocaml> | ||

+ | # let getsign = function | ||

+ | n when n<0 -> Neg | ||

+ | | n when n>0 -> Pos | ||

+ | | _ -> Zero;; | ||

+ | |||

+ | val getsign : int -> sign = <fun> | ||

+ | </code> | ||

+ | |||

+ | Similarly to lists, we can use pattern-matching to define functions on tagged unions. | ||

+ | Indeed, lists are just a special case of variant types, as we shall see in a while. | ||

+ | Here is a function for "multiplying" signs: | ||

+ | <code ocaml> | ||

+ | # let mulsign x y = match (x,y) with | ||

+ | (Zero,_) | (_,Zero) -> Zero | ||

+ | | (Pos,z) | (z,Pos) -> z | ||

+ | | (Neg,Neg) -> Pos;; | ||

+ | |||

+ | val mulsign : sign -> sign -> sign = <fun> | ||

+ | </code> | ||

+ | |||

+ | In the previous example, the values of a variant type were just denoted by the type constructors. | ||

+ | More in general, each type constructor in a variant type can be associated with an arbitrary (possibly polymorphic) type. | ||

+ | |||

+ | For instance, assume we have to manage values that represent lengths, expressed in two different | ||

+ | systems of measurement: the International System of Units and the British Imperial system. | ||

+ | To keep things simple, we only consider lengths expressed in meters and yards. | ||

+ | |||

+ | The values of the variant type ''length'' can be chosen from two sets: | ||

+ | they can either be floats tagged with the constructor ''Meter'', | ||

+ | or they can be floats tagged with the constructor ''Yard''. | ||

+ | <code ocaml> | ||

+ | # type length = Meter of float | Yard of float;; | ||

+ | |||

+ | # Meter 1.0;; | ||

+ | - : length = Meter 1. | ||

+ | |||

+ | # Yard 2.5;; | ||

+ | - : length = Yard 2.5 | ||

+ | </code> | ||

+ | |||

+ | Let us now define a function that adds two lengths. | ||

+ | Note that we convert yards to meters (1 yard = 0.9144 meters) | ||

+ | when we are mixing lengths expressed in different units. | ||

+ | <code ocaml> | ||

+ | # let add_length x y = match (x,y) with | ||

+ | (Meter xm, Meter ym) -> Meter (xm +. ym) | ||

+ | | (Meter xm, Yard ym) -> Meter (xm +. 0.9144 *. ym) | ||

+ | | (Yard xy, Meter ym) -> Meter (0.9144 *. xy +. ym) | ||

+ | | (Yard xy, Yard yy) -> Yard (xy +. yy);; | ||

+ | |||

+ | val add_length : length -> length -> length = <fun> | ||

+ | |||

+ | # add_length (Meter 1.0) (Yard 1.0);; | ||

+ | - : length = Meter 1.9144 | ||

+ | </code> | ||

+ | |||

+ | Note in passing that such a scrupulous treatment of values would have allowed NASA | ||

+ | to save $125M, when in 1999 the | ||

+ | [[http://www.tysknews.com/Depts/Metrication/mystery_of_orbiter_crash_solved.htm|Mars Climate Orbiter]] | ||

+ | crashed because one engineering team used metric units, while another used English units | ||

+ | for a key spacecraft operation. | ||

+ | |||

+ | \\ | ||

+ | |||

+ | |||

+ | ===== Partial functions and environments ===== | ||

+ | |||

+ | Suppose that we want to define a //partial// function ''f'' with domain '''b'' and codomain '''a''. | ||

+ | We can model ''f'' as a //total// function from '''b'' to the "lifted" codomain | ||

+ | '''a partial'', defined by the following variant type: | ||

+ | <code ocaml> | ||

+ | type 'a partial = None | Some of 'a;; | ||

+ | </code> | ||

+ | |||

+ | <note warning> | ||

+ | If you are using Tuareg mode, the above definition could produce a syntax error. | ||

+ | This can be easily solved as follows: | ||

+ | <code ocaml> | ||

+ | type 'a partial = None | Some of ('a);; | ||

+ | </code> | ||

+ | </note> | ||

+ | |||

+ | Intuitively, | ||

+ | * The value ''None'' represents no result, i.e. ''f x'' will evaluate to ''None'' when ''f'' is undefined on ''x''. | ||

+ | * A value with type ''Some of 'a'' represents a result of type 'a, i.e. ''f x'' will evaluate to ''Some y'' when ''f x'' = ''y''. | ||

+ | |||

+ | To produce a value of a tagged union, we must "inject" the value into the union, | ||

+ | by making the tag explicit | ||

+ | (the keyword ''of'' is now omitted). | ||

+ | <code ocaml> | ||

+ | # Some 51;; | ||

+ | - : int partial = Some 51 | ||

+ | </code> | ||

+ | |||

+ | The following function, given a predicate ''p: 'a -> bool'' and a list, | ||

+ | finds the first element of the list (if any) that satisfies the predicate. | ||

+ | The function is undefined (i.e. it evaluates to ''None'') when no such | ||

+ | element exists. | ||

+ | <code ocaml> | ||

+ | # let rec find p = function | ||

+ | [] -> None | ||

+ | | x::l -> if p x then Some x else find p l;; | ||

+ | |||

+ | val find : ('a -> bool) -> 'a list -> 'a partial = <fun> | ||

+ | </code> | ||

+ | |||

+ | The function ''find'' above can be used to specify an **environment**, | ||

+ | i.e. a partial function from identifiers to values. | ||

+ | There are several ways of doing that; below, | ||

+ | we model an environment as a list of pairs (identifier,value). | ||

+ | |||

+ | First, we define a function ''bind'', that binds | ||

+ | an identifier ''x'' to a value ''v'' in the given environment. | ||

+ | <code ocaml> | ||

+ | # let rec bind x v = function | ||

+ | [] -> [(x,v)] | ||

+ | | (y,v')::l when x=y -> (x,v)::l | ||

+ | | (y,v')::l -> (y,v')::(bind x v l);; | ||

+ | |||

+ | val bind : 'a -> 'b -> ('a * 'b) list -> ('a * 'b) list = <fun> | ||

+ | </code> | ||

+ | |||

+ | Then, we define the function ''applyenv'', | ||

+ | thay searches the environment for the value bound to the identifier ''x''. | ||

+ | We define ''applyenv'' as a partial function that produces | ||

+ | ''None'' if ''x'' is not bound in the environment, or | ||

+ | ''Some v'' if ''x'' is bound to the value ''v''. | ||

+ | <code ocaml> | ||

+ | # let applyenv x l = match (find (fun (y,v) -> x=y)) l with | ||

+ | None -> None | ||

+ | | Some (y,v) -> Some v;; | ||

+ | |||

+ | val applyenv : 'a -> ('a * 'b) list -> 'b partial = <fun> | ||

+ | </code> | ||

+ | |||

+ | Note that using a variant type for the result of a function requires a further pattern-matching step, | ||

+ | where one inspects whether the result is ''None'' or ''Some v''. If one is just interested in ''v'', | ||

+ | and in raising an error in the other case, exceptions are perhaps more convenient. | ||

+ | |||

+ | Here are some usages examples: | ||

+ | <code ocaml> | ||

+ | # let rho0 = [("x",3);("y",5)];; | ||

+ | val rho0 : (string * int) list = [("x", 3); ("y", 5)] | ||

+ | |||

+ | # applyenv "x" rho0;; | ||

+ | - : int partial = Some 3 | ||

+ | |||

+ | # applyenv "z" rho0;; | ||

+ | - : int partial = None | ||

+ | |||

+ | # let rho1 = bind "x" 7 rho0;; | ||

+ | val rho1 : (string * int) list = [("x", 7); ("y", 5)] | ||

+ | |||

+ | # applyenv "x" rho1;; | ||

+ | - : int partial = Some 7 | ||

+ | |||

+ | # let rho2 = bind "z" 9 rho1;; | ||

+ | val rho2 : (string * int) list = [("x", 7); ("y", 5); ("z", 9)] | ||

+ | |||

+ | # applyenv "z" rho2;; | ||

+ | - : int partial = Some 9 | ||

+ | </code> | ||

+ | |||

+ | \\ | ||

+ | |||

+ | ===== Recursive types ===== | ||

+ | |||

+ | Variant types can be defined recursively. | ||

+ | That is, when defining a type ''t'', | ||

+ | the name ''t'' can appear on the right-hand side of the type definition. | ||

+ | |||

+ | <note warning> | ||

+ | In contrast to the way recursive functions are defined (''let rec f = ...''), | ||

+ | the definition of types is recursive by default, so we can just write ''type t = ...'', | ||

+ | without the keyword ''rec''. | ||

+ | </note> | ||

+ | |||

+ | As a first example, we define natural numbers through (a subset of) the [[wp>Peano_axioms|Peano axioms]]. | ||

+ | A natural number is either zero, or the successor of a natural number. | ||

+ | <code ocaml> | ||

+ | # type nat = Zero | Succ of nat;; | ||

+ | </code> | ||

+ | |||

+ | We define below the addition between two natural numbers. | ||

+ | This is done by recursion and pattern matching. | ||

+ | <code ocaml> | ||

+ | # let rec add_nat x y = match x with | ||

+ | Zero -> y | ||

+ | | Succ x' -> Succ (add_nat x' y);; | ||

+ | |||

+ | val add_nat : nat -> nat -> nat = <fun> | ||

+ | |||

+ | # add_nat (Succ (Succ Zero)) (Succ (Succ (Succ Zero)));; | ||

+ | - : nat = Succ (Succ (Succ (Succ (Succ Zero)))) | ||

+ | </code> | ||

+ | |||

+ | Let us now define a function ''int_of_nat'' that transforms a ''Nat'' into an integer. | ||

+ | <code ocaml> | ||

+ | # let rec int_of_nat = function | ||

+ | Zero -> 0 | ||

+ | | Succ x -> 1 + int_of_nat x;; | ||

+ | |||

+ | val int_of_nat : nat -> int = <fun> | ||

+ | |||

+ | # int_of_nat (add_nat (Succ (Succ Zero)) (Succ (Succ (Succ Zero))));; | ||

+ | - : int = 5 | ||

+ | </code> | ||

+ | |||

+ | Recursive types allow for a natural representation of recursive data structures like lists, stacks, trees, etc. | ||

+ | |||

+ | For instance, a stack of elements of type '''a'' is either the empty stack (''Nil''), or the stack | ||

+ | where an element of type '''a'' is "pushed" onto another stack (''Stack of 'a * 'a stack''). | ||

+ | <code ocaml> | ||

+ | # type 'a stack = Nil | Push of 'a * 'a stack;; | ||

+ | </code> | ||

+ | |||

+ | We specify below the basic operations on stacks. | ||

+ | <code ocaml> | ||

+ | exception EmptyStack;; | ||

+ | |||

+ | # let isempty = function | ||

+ | Nil -> true | ||

+ | | _ -> false;; | ||

+ | |||

+ | val isempty : 'a stack -> bool = <fun> | ||

+ | |||

+ | # let top = function | ||

+ | Nil -> raise EmptyStack | ||

+ | | Push (x,s') -> x;; | ||

+ | |||

+ | val top : 'a stack -> 'a = <fun> | ||

+ | |||

+ | # let pop = function | ||

+ | Nil -> raise EmptyStack | ||

+ | | Push (x,s') -> s';; | ||

+ | |||

+ | val pop : 'a stack -> 'a stack = <fun> | ||

+ | |||

+ | # let push x s = Push (x,s);; | ||

+ | |||

+ | val push : 'a -> 'a stack -> 'a stack = <fun> | ||

+ | </code> | ||

+ | |||

+ | Let us now test out stack specification: | ||

+ | <code ocaml> | ||

+ | # let s0 = push 1 (push 2 (push 3 Nil));; | ||

+ | val s0 : int stack = Push (1, Push (2, Push (3, Nil))) | ||

+ | |||

+ | # top (pop s0);; | ||

+ | - : int = 2 | ||

+ | </code> | ||

+ | |||

+ | A recursive variant type can be seen as a sort of [[wp>Context-free_grammar|context-free grammar]]. | ||

+ | For instance, the following (otherwise useless) type ''t'' represents the grammar { S -> a S b, S -> c }, | ||

+ | which denotes the (context-tree) language { a<sup>n</sup> c b<sup>n</sup> | n≥0 }. | ||

+ | If we consider the syntax tree of a value of type ''t'', then the leaves of the tree, | ||

+ | read from left to right, are words of the grammar T. | ||

+ | <code ocaml> | ||

+ | # type a = A;; | ||

+ | # type b = B;; | ||

+ | # type c = C;; | ||

+ | # type t = T of a * t * b | C;; | ||

+ | |||

+ | # T(A,T(A,T(A,C,B),B),B);; | ||

+ | - : t = T (A, T (A, T (A, C, B), B), B) | ||

+ | </code> | ||

+ | |||

+ | \\ | ||

+ | |||

+ | |||

+ | ===== Abstract Syntax Trees ===== | ||

+ | |||

+ | Tagged unions are particularly convenient to manipulate abstract syntax tree of terms. | ||

+ | |||

+ | As an example, let us consider a small language for arithmetic expressions. | ||

+ | An expression can be a constant, or a variable, or a sum of two expressions, | ||

+ | of a subtraction of two expressions. | ||

+ | In BNF notation, you would write: | ||

+ | |||

+ | <const> ::= ... | ||

+ | <var> ::= ... | ||

+ | <exp> ::= <const> | <var> | <exp> + <exp> | <exp> - <exp> | ||

+ | |||

+ | In OCaml, you can define arithmetic expressions through a variant type, | ||

+ | that specifies their abstract syntax. | ||

+ | <code ocaml> | ||

+ | # type exp = | ||

+ | Const of int | ||

+ | | Var of string | ||

+ | | Sum of exp * exp | ||

+ | | Sub of exp * exp;; | ||

+ | </code> | ||

+ | |||

+ | Let us now define a function ''eval'' that evaluates an expression ''e'' to a value of type ''int''. | ||

+ | We need an environment ''rho'' to bind variables to values. | ||

+ | <code ocaml> | ||

+ | # let rec eval e rho = match e with | ||

+ | Const v -> v | ||

+ | | Var x -> (match applyenv x rho with | ||

+ | None -> failwith ("Unbound variable " ^ x) | ||

+ | | Some v -> v) | ||

+ | | Sum (e1,e2) -> (eval e1 rho) + (eval e2 rho) | ||

+ | | Sub (e1,e2) -> (eval e1 rho) - (eval e2 rho);; | ||

+ | |||

+ | val eval : exp -> (string * int) list -> int = <fun> | ||

+ | </code> | ||

+ | |||

+ | Let us now test our arithmetic calculator: | ||

+ | <code ocaml> | ||

+ | # let rho0 = [("x",3);("y",5)];; | ||

+ | |||

+ | # eval (Sum(Var "x", Const 5)) rho0;; | ||

+ | - : int = 8 | ||

+ | |||

+ | # eval (Sum(Var "x", Var "z")) rho0;; | ||

+ | Exception: Failure "Unbound variable z". | ||

+ | </code> | ||

+ | |||

+ | \\ | ||

+ | |||

+ | |||

+ | ===== Exercises ===== | ||

+ | |||

+ | [[ex-types|LIP assignment #5]] | ||

+ | |||

+ | \\ | ||

+ | |||

types.txt · Last modified: 2015/10/08 15:20 (external edit)