types

OCaml features facilities to define new types.

In many cases, assigning an “evocative” type to a function is just a good way of documenting your code.
For instance, assume we want a function `add`

that sums two complex numbers.
We can certainly implement complex numbers as pairs of floats,
and define `add`

such that is type is:

add: (float * float) -> (float * float) -> (float * float)

However, we would be happier if `add`

could be written such that its type is
a bit more informative about the expected behaviour.
For instance, we would like to have:

add: complex -> complex -> complex

A further use of types is for specifying *recursive data structures*, like lists, trees, etc.
Operations over these data structures can be implemented through pattern matching.
The OCaml module system can be used to pack together the definitions related to a given
data structure, so to provide them with a further abstraction layer.

Tuples are finite sequences of elements, possibly of different types.

A tuple with elements a_{1},…,a_{k} is written just as (a_{1},…,a_{k}).
Note that the elements are separated by commas.
The outer parentheses can be omitted, yet they improve readibility.

If the type of the element a_{i} is T_{i},
then the type of the whole tuple will be written as
T_{1} * … * T_{k}.
Differently from lists, the type of a tuple reveals the number of its elements.

For instance, we can define a pair as follows:

# (1,"alice");; - : int * string = (1, "alice")

Two polymorphic operations are used to access the elements of a pair.
The operation `fst`

evaluates to the first element, while `snd`

deals with the second element.

# fst;; - : 'a * 'b -> 'a = <fun> # fst (1,"alice");; - : int = 1 # snd;; - : 'a * 'b -> 'b = <fun> # snd (1,"alice");; - : string = "alice"

No facility is given to access the elements of an arbitrary tuple.
To do that, you must define the needed operators; this can be easily done by pattern matching.
For instance, to get the third element of a tuple of four elements, we define the function
`third_of_four`

as follows:

# let third_of_four (_,_,x,_) = x;; val third_of_four : 'a * 'b * 'c * 'd -> 'c = <fun> # third_of_four (1,"alice",2,"bob");; - : int = 2

Records are a slight (yet useful) extension of tuples. Each element of a record is uniquely identified by a label (rather than by a position, like in tuples). This allows for avoiding some common errors, e.g. combining values the representations of which are structurally equivalent, but with different meaning.

For instance, consider two complex numbers `c1`

and `c2`

both represented as pairs of floats,
where `c1`

is in cartesian form, and `c2`

is in polar form.
A possible error would be summing `c1`

to `c2`

as if they were both in cartesian form.
Using records to represent complex numbers helps in preventing from this kind of errors.

The definition of a record type for complex numbers (say, in cartesian form) is as follows.
First, we need to give a name to the new type: in our case, we choose the name `complex`

.
Then, we choose a label and a type for each element of the record.
In our case, the record has two elements, both of type `float`

:
one is labelled `re`

(for the real part), and the other is labelled `im`

(imaginary part).

type complex = {re: float; im: float};;

Note that type names and field names are always uncapitalized (the first character is lowercase).

To construct a value of type `complex`

, we use a similar notation,
except that we use the symbol `=`

instead of `:`

.
To access the element labelled `l`

of a record `x`

,
we write `x.l`

.
Note that the order of the elements of a record is immaterial.

# let i = {re=0.; im=1.};; val i : complex = {re = 0.; im = 1.} # i.im;; - : float = 1. # i = {im=1.; re=0.};; - : bool = true

Here is a function for sums two complex numbers:

# let addcomplex c1 c2 = {re = c1.re +. c2.re; im = c1.im +. c2.im};; val addcomplex : complex -> complex -> complex = <fun>

Let us now define the type `polar`

for representing complex numbers in polar form:

# type polar = {radius: float; angle: float};; # let c = { radius=3.0; angle=1.5};; val c : polar = {radius = 3.; angle = 1.5}

If we try to add a complex in cartesian coordinates to a complex in polar coordinates, we obtain a type error:

# addcomplex i c;; Characters 13-14: addcomplex i c;; ^ This expression has type polar but is here used with type complex

A tagged union (a.k.a., a *variant*) is a data structure that can hold a value,
the type of which can be taken on a given set of types.
The type of a tagged union is called *variant type*.
It lists all possible shapes for values of that type.
Each case is identified by a tag, called a **type constructor**.
Such tag serves both for constructing values of the variant type,
and for destructing them by pattern-matching.

Constructors are always capitalized, to distinguish them from other
names
(which must start with a lowercase letter).

Let us now define a variant type to represent the sign of an integer number:

# type sign = Pos | Neg | Zero;;

The function `getsign`

below maps each integer to its sign.

# let getsign = function n when n<0 -> Neg | n when n>0 -> Pos | _ -> Zero;; val getsign : int -> sign = <fun>

Similarly to lists, we can use pattern-matching to define functions on tagged unions. Indeed, lists are just a special case of variant types, as we shall see in a while. Here is a function for “multiplying” signs:

# let mulsign x y = match (x,y) with (Zero,_) | (_,Zero) -> Zero | (Pos,z) | (z,Pos) -> z | (Neg,Neg) -> Pos;; val mulsign : sign -> sign -> sign = <fun>

In the previous example, the values of a variant type were just denoted by the type constructors. More in general, each type constructor in a variant type can be associated with an arbitrary (possibly polymorphic) type.

For instance, assume we have to manage values that represent lengths, expressed in two different systems of measurement: the International System of Units and the British Imperial system. To keep things simple, we only consider lengths expressed in meters and yards.

The values of the variant type `length`

can be chosen from two sets:
they can either be floats tagged with the constructor `Meter`

,
or they can be floats tagged with the constructor `Yard`

.

# type length = Meter of float | Yard of float;; # Meter 1.0;; - : length = Meter 1. # Yard 2.5;; - : length = Yard 2.5

Let us now define a function that adds two lengths. Note that we convert yards to meters (1 yard = 0.9144 meters) when we are mixing lengths expressed in different units.

# let add_length x y = match (x,y) with (Meter xm, Meter ym) -> Meter (xm +. ym) | (Meter xm, Yard ym) -> Meter (xm +. 0.9144 *. ym) | (Yard xy, Meter ym) -> Meter (0.9144 *. xy +. ym) | (Yard xy, Yard yy) -> Yard (xy +. yy);; val add_length : length -> length -> length = <fun> # add_length (Meter 1.0) (Yard 1.0);; - : length = Meter 1.9144

Note in passing that such a scrupulous treatment of values would have allowed NASA to save $125M, when in 1999 the Mars Climate Orbiter crashed because one engineering team used metric units, while another used English units for a key spacecraft operation.

Suppose that we want to define a *partial* function `f`

with domain `'b`

and codomain `'a`

.
We can model `f`

as a *total* function from `'b`

to the “lifted” codomain
`'a partial`

, defined by the following variant type:

type 'a partial = None | Some of 'a;;

If you are using Tuareg mode, the above definition could produce a syntax error.
This can be easily solved as follows:

type 'a partial = None | Some of ('a);;

Intuitively,

- The value
`None`

represents no result, i.e.`f x`

will evaluate to`None`

when`f`

is undefined on`x`

. - A value with type
`Some of 'a`

represents a result of type 'a, i.e.`f x`

will evaluate to`Some y`

when`f x`

=`y`

.

To produce a value of a tagged union, we must “inject” the value into the union,
by making the tag explicit
(the keyword `of`

is now omitted).

# Some 51;; - : int partial = Some 51

The following function, given a predicate `p: 'a → bool`

and a list,
finds the first element of the list (if any) that satisfies the predicate.
The function is undefined (i.e. it evaluates to `None`

) when no such
element exists.

# let rec find p = function [] -> None | x::l -> if p x then Some x else find p l;; val find : ('a -> bool) -> 'a list -> 'a partial = <fun>

The function `find`

above can be used to specify an **environment**,
i.e. a partial function from identifiers to values.
There are several ways of doing that; below,
we model an environment as a list of pairs (identifier,value).

First, we define a function `bind`

, that binds
an identifier `x`

to a value `v`

in the given environment.

# let rec bind x v = function [] -> [(x,v)] | (y,v')::l when x=y -> (x,v)::l | (y,v')::l -> (y,v')::(bind x v l);; val bind : 'a -> 'b -> ('a * 'b) list -> ('a * 'b) list = <fun>

Then, we define the function `applyenv`

,
thay searches the environment for the value bound to the identifier `x`

.
We define `applyenv`

as a partial function that produces
`None`

if `x`

is not bound in the environment, or
`Some v`

if `x`

is bound to the value `v`

.

# let applyenv x l = match (find (fun (y,v) -> x=y)) l with None -> None | Some (y,v) -> Some v;; val applyenv : 'a -> ('a * 'b) list -> 'b partial = <fun>

Note that using a variant type for the result of a function requires a further pattern-matching step,
where one inspects whether the result is `None`

or `Some v`

. If one is just interested in `v`

,
and in raising an error in the other case, exceptions are perhaps more convenient.

Here are some usages examples:

# let rho0 = [("x",3);("y",5)];; val rho0 : (string * int) list = [("x", 3); ("y", 5)] # applyenv "x" rho0;; - : int partial = Some 3 # applyenv "z" rho0;; - : int partial = None # let rho1 = bind "x" 7 rho0;; val rho1 : (string * int) list = [("x", 7); ("y", 5)] # applyenv "x" rho1;; - : int partial = Some 7 # let rho2 = bind "z" 9 rho1;; val rho2 : (string * int) list = [("x", 7); ("y", 5); ("z", 9)] # applyenv "z" rho2;; - : int partial = Some 9

Variant types can be defined recursively.
That is, when defining a type `t`

,
the name `t`

can appear on the right-hand side of the type definition.

In contrast to the way recursive functions are defined (

`let rec f = …`

),
the definition of types is recursive by default, so we can just write `type t = …`

,
without the keyword `rec`

.
As a first example, we define natural numbers through (a subset of) the Peano axioms. A natural number is either zero, or the successor of a natural number.

# type nat = Zero | Succ of nat;;

We define below the addition between two natural numbers. This is done by recursion and pattern matching.

# let rec add_nat x y = match x with Zero -> y | Succ x' -> Succ (add_nat x' y);; val add_nat : nat -> nat -> nat = <fun> # add_nat (Succ (Succ Zero)) (Succ (Succ (Succ Zero)));; - : nat = Succ (Succ (Succ (Succ (Succ Zero))))

Let us now define a function `int_of_nat`

that transforms a `Nat`

into an integer.

# let rec int_of_nat = function Zero -> 0 | Succ x -> 1 + int_of_nat x;; val int_of_nat : nat -> int = <fun> # int_of_nat (add_nat (Succ (Succ Zero)) (Succ (Succ (Succ Zero))));; - : int = 5

Recursive types allow for a natural representation of recursive data structures like lists, stacks, trees, etc.

For instance, a stack of elements of type `'a`

is either the empty stack (`Nil`

), or the stack
where an element of type `'a`

is “pushed” onto another stack (`Stack of 'a * 'a stack`

).

# type 'a stack = Nil | Push of 'a * 'a stack;;

We specify below the basic operations on stacks.

exception EmptyStack;; # let isempty = function Nil -> true | _ -> false;; val isempty : 'a stack -> bool = <fun> # let top = function Nil -> raise EmptyStack | Push (x,s') -> x;; val top : 'a stack -> 'a = <fun> # let pop = function Nil -> raise EmptyStack | Push (x,s') -> s';; val pop : 'a stack -> 'a stack = <fun> # let push x s = Push (x,s);; val push : 'a -> 'a stack -> 'a stack = <fun>

Let us now test out stack specification:

# let s0 = push 1 (push 2 (push 3 Nil));; val s0 : int stack = Push (1, Push (2, Push (3, Nil))) # top (pop s0);; - : int = 2

A recursive variant type can be seen as a sort of context-free grammar.
For instance, the following (otherwise useless) type `t`

represents the grammar { S → a S b, S → c },
which denotes the (context-tree) language { a^{n} c b^{n} | n≥0 }.
If we consider the syntax tree of a value of type `t`

, then the leaves of the tree,
read from left to right, are words of the grammar T.

# type a = A;; # type b = B;; # type c = C;; # type t = T of a * t * b | C;; # T(A,T(A,T(A,C,B),B),B);; - : t = T (A, T (A, T (A, C, B), B), B)

Tagged unions are particularly convenient to manipulate abstract syntax tree of terms.

As an example, let us consider a small language for arithmetic expressions. An expression can be a constant, or a variable, or a sum of two expressions, of a subtraction of two expressions. In BNF notation, you would write:

<const> ::= ... <var> ::= ... <exp> ::= <const> | <var> | <exp> + <exp> | <exp> - <exp>

In OCaml, you can define arithmetic expressions through a variant type, that specifies their abstract syntax.

# type exp = Const of int | Var of string | Sum of exp * exp | Sub of exp * exp;;

Let us now define a function `eval`

that evaluates an expression `e`

to a value of type `int`

.
We need an environment `rho`

to bind variables to values.

# let rec eval e rho = match e with Const v -> v | Var x -> (match applyenv x rho with None -> failwith ("Unbound variable " ^ x) | Some v -> v) | Sum (e1,e2) -> (eval e1 rho) + (eval e2 rho) | Sub (e1,e2) -> (eval e1 rho) - (eval e2 rho);; val eval : exp -> (string * int) list -> int = <fun>

Let us now test our arithmetic calculator:

# let rho0 = [("x",3);("y",5)];; # eval (Sum(Var "x", Const 5)) rho0;; - : int = 8 # eval (Sum(Var "x", Var "z")) rho0;; Exception: Failure "Unbound variable z".

types.txt · Last modified: 2015/10/08 15:20 (external edit)