User Tools

Site Tools


types

Types

OCaml features facilities to define new types.

In many cases, assigning an “evocative” type to a function is just a good way of documenting your code. For instance, assume we want a function add that sums two complex numbers. We can certainly implement complex numbers as pairs of floats, and define add such that is type is:

add: (float * float) -> (float * float) -> (float * float)

However, we would be happier if add could be written such that its type is a bit more informative about the expected behaviour. For instance, we would like to have:

add: complex -> complex -> complex

A further use of types is for specifying recursive data structures, like lists, trees, etc. Operations over these data structures can be implemented through pattern matching. The OCaml module system can be used to pack together the definitions related to a given data structure, so to provide them with a further abstraction layer.


Tuples

Tuples are finite sequences of elements, possibly of different types.

A tuple with elements a1,…,ak is written just as (a1,…,ak). Note that the elements are separated by commas. The outer parentheses can be omitted, yet they improve readibility.

If the type of the element ai is Ti, then the type of the whole tuple will be written as T1 * … * Tk. Differently from lists, the type of a tuple reveals the number of its elements.

For instance, we can define a pair as follows:

# (1,"alice");;
- : int * string = (1, "alice")

Two polymorphic operations are used to access the elements of a pair. The operation fst evaluates to the first element, while snd deals with the second element.

# fst;;
- : 'a * 'b -> 'a = <fun>
 
# fst (1,"alice");;
- : int = 1
 
# snd;;
- : 'a * 'b -> 'b = <fun>
 
# snd (1,"alice");;
- : string = "alice"

No facility is given to access the elements of an arbitrary tuple. To do that, you must define the needed operators; this can be easily done by pattern matching. For instance, to get the third element of a tuple of four elements, we define the function third_of_four as follows:

# let third_of_four (_,_,x,_) = x;;
val third_of_four : 'a * 'b * 'c * 'd -> 'c = <fun>
 
# third_of_four (1,"alice",2,"bob");;
- : int = 2


Records

Records are a slight (yet useful) extension of tuples. Each element of a record is uniquely identified by a label (rather than by a position, like in tuples). This allows for avoiding some common errors, e.g. combining values the representations of which are structurally equivalent, but with different meaning.

For instance, consider two complex numbers c1 and c2 both represented as pairs of floats, where c1 is in cartesian form, and c2 is in polar form. A possible error would be summing c1 to c2 as if they were both in cartesian form. Using records to represent complex numbers helps in preventing from this kind of errors.

The definition of a record type for complex numbers (say, in cartesian form) is as follows. First, we need to give a name to the new type: in our case, we choose the name complex. Then, we choose a label and a type for each element of the record. In our case, the record has two elements, both of type float: one is labelled re (for the real part), and the other is labelled im (imaginary part).

type complex = {re: float; im: float};;

Note that type names and field names are always uncapitalized (the first character is lowercase).

To construct a value of type complex, we use a similar notation, except that we use the symbol = instead of :. To access the element labelled l of a record x, we write x.l. Note that the order of the elements of a record is immaterial.

# let i = {re=0.; im=1.};;
val i : complex = {re = 0.; im = 1.}
 
# i.im;;
- : float = 1.
 
# i = {im=1.; re=0.};;
- : bool = true

Here is a function for sums two complex numbers:

# let addcomplex c1 c2 = {re = c1.re +. c2.re; im = c1.im +. c2.im};;
val addcomplex : complex -> complex -> complex = <fun>

Let us now define the type polar for representing complex numbers in polar form:

# type polar = {radius: float; angle: float};;
 
# let c = { radius=3.0; angle=1.5};;
val c : polar = {radius = 3.; angle = 1.5}

If we try to add a complex in cartesian coordinates to a complex in polar coordinates, we obtain a type error:

# addcomplex i c;;
 
Characters 13-14:
  addcomplex i c;;
               ^
This expression has type polar but is here used with type complex


Tagged unions

A tagged union (a.k.a., a variant) is a data structure that can hold a value, the type of which can be taken on a given set of types. The type of a tagged union is called variant type. It lists all possible shapes for values of that type. Each case is identified by a tag, called a type constructor. Such tag serves both for constructing values of the variant type, and for destructing them by pattern-matching.

Constructors are always capitalized, to distinguish them from other names (which must start with a lowercase letter).

Let us now define a variant type to represent the sign of an integer number:

# type sign = Pos | Neg | Zero;;

The function getsign below maps each integer to its sign.

# let getsign = function
      n when n<0 -> Neg
    | n when n>0 -> Pos
    | _ -> Zero;; 
 
val getsign : int -> sign = <fun>

Similarly to lists, we can use pattern-matching to define functions on tagged unions. Indeed, lists are just a special case of variant types, as we shall see in a while. Here is a function for “multiplying” signs:

# let mulsign x y = match (x,y) with
    (Zero,_) | (_,Zero) -> Zero
  | (Pos,z) | (z,Pos) -> z
  | (Neg,Neg) -> Pos;;
 
val mulsign : sign -> sign -> sign = <fun>

In the previous example, the values of a variant type were just denoted by the type constructors. More in general, each type constructor in a variant type can be associated with an arbitrary (possibly polymorphic) type.

For instance, assume we have to manage values that represent lengths, expressed in two different systems of measurement: the International System of Units and the British Imperial system. To keep things simple, we only consider lengths expressed in meters and yards.

The values of the variant type length can be chosen from two sets: they can either be floats tagged with the constructor Meter, or they can be floats tagged with the constructor Yard.

# type length = Meter of float | Yard of float;;
 
# Meter 1.0;;
- : length = Meter 1.
 
# Yard 2.5;;
- : length = Yard 2.5

Let us now define a function that adds two lengths. Note that we convert yards to meters (1 yard = 0.9144 meters) when we are mixing lengths expressed in different units.

# let add_length x y = match (x,y) with
    (Meter xm, Meter ym) -> Meter (xm +. ym)
  | (Meter xm, Yard ym)  -> Meter (xm +. 0.9144 *. ym)
  | (Yard xy, Meter ym)  -> Meter (0.9144 *. xy +. ym)
  | (Yard xy, Yard yy)   -> Yard  (xy +. yy);;
 
val add_length : length -> length -> length = <fun>
 
# add_length (Meter 1.0) (Yard 1.0);;
- : length = Meter 1.9144

Note in passing that such a scrupulous treatment of values would have allowed NASA to save $125M, when in 1999 the Mars Climate Orbiter crashed because one engineering team used metric units, while another used English units for a key spacecraft operation.


Partial functions and environments

Suppose that we want to define a partial function f with domain 'b and codomain 'a. We can model f as a total function from 'b to the “lifted” codomain 'a partial, defined by the following variant type:

type 'a partial = None | Some of 'a;;

If you are using Tuareg mode, the above definition could produce a syntax error. This can be easily solved as follows:

type 'a partial = None | Some of ('a);;

Intuitively,

  • The value None represents no result, i.e. f x will evaluate to None when f is undefined on x.
  • A value with type Some of 'a represents a result of type 'a, i.e. f x will evaluate to Some y when f x = y.

To produce a value of a tagged union, we must “inject” the value into the union, by making the tag explicit (the keyword of is now omitted).

# Some 51;;
- : int partial = Some 51

The following function, given a predicate p: 'a → bool and a list, finds the first element of the list (if any) that satisfies the predicate. The function is undefined (i.e. it evaluates to None) when no such element exists.

# let rec find p = function
    [] -> None
  | x::l -> if p x then Some x else find p l;;
 
val find : ('a -> bool) -> 'a list -> 'a partial = <fun>

The function find above can be used to specify an environment, i.e. a partial function from identifiers to values. There are several ways of doing that; below, we model an environment as a list of pairs (identifier,value).

First, we define a function bind, that binds an identifier x to a value v in the given environment.

# let rec bind x v = function
    [] -> [(x,v)]
  | (y,v')::l when x=y -> (x,v)::l
  | (y,v')::l -> (y,v')::(bind x v l);;
 
val bind : 'a -> 'b -> ('a * 'b) list -> ('a * 'b) list = <fun>

Then, we define the function applyenv, thay searches the environment for the value bound to the identifier x. We define applyenv as a partial function that produces None if x is not bound in the environment, or Some v if x is bound to the value v.

# let applyenv x l = match (find (fun (y,v) -> x=y)) l with
      None -> None
    | Some (y,v) -> Some v;;
 
val applyenv : 'a -> ('a * 'b) list -> 'b partial = <fun>

Note that using a variant type for the result of a function requires a further pattern-matching step, where one inspects whether the result is None or Some v. If one is just interested in v, and in raising an error in the other case, exceptions are perhaps more convenient.

Here are some usages examples:

# let rho0 = [("x",3);("y",5)];;
val rho0 : (string * int) list = [("x", 3); ("y", 5)]
 
# applyenv "x" rho0;;
- : int partial = Some 3
 
# applyenv "z" rho0;;
- : int partial = None
 
# let rho1 = bind "x" 7 rho0;;
val rho1 : (string * int) list = [("x", 7); ("y", 5)]
 
# applyenv "x" rho1;;
- : int partial = Some 7
 
# let rho2 = bind "z" 9 rho1;;
val rho2 : (string * int) list = [("x", 7); ("y", 5); ("z", 9)]
 
# applyenv "z" rho2;;
- : int partial = Some 9


Recursive types

Variant types can be defined recursively. That is, when defining a type t, the name t can appear on the right-hand side of the type definition.

In contrast to the way recursive functions are defined (let rec f = …), the definition of types is recursive by default, so we can just write type t = …, without the keyword rec.

As a first example, we define natural numbers through (a subset of) the Peano axioms. A natural number is either zero, or the successor of a natural number.

# type nat = Zero | Succ of nat;;

We define below the addition between two natural numbers. This is done by recursion and pattern matching.

# let rec add_nat x y = match x with
    Zero -> y
  | Succ x' -> Succ (add_nat x' y);;
 
val add_nat : nat -> nat -> nat = <fun>
 
# add_nat (Succ (Succ Zero)) (Succ (Succ (Succ Zero)));;
- : nat = Succ (Succ (Succ (Succ (Succ Zero))))

Let us now define a function int_of_nat that transforms a Nat into an integer.

# let rec int_of_nat = function
    Zero -> 0
  | Succ x -> 1 + int_of_nat x;;
 
val int_of_nat : nat -> int = <fun>
 
# int_of_nat (add_nat (Succ (Succ Zero)) (Succ (Succ (Succ Zero))));;
- : int = 5

Recursive types allow for a natural representation of recursive data structures like lists, stacks, trees, etc.

For instance, a stack of elements of type 'a is either the empty stack (Nil), or the stack where an element of type 'a is “pushed” onto another stack (Stack of 'a * 'a stack).

# type 'a stack = Nil | Push of 'a * 'a stack;;

We specify below the basic operations on stacks.

exception EmptyStack;;
 
# let isempty = function 
      Nil -> true 
    | _ -> false;;
 
val isempty : 'a stack -> bool = <fun>
 
# let top = function 
      Nil -> raise EmptyStack 
    | Push (x,s') -> x;;
 
val top : 'a stack -> 'a = <fun>
 
# let pop = function 
      Nil -> raise EmptyStack 
    | Push (x,s') -> s';;
 
val pop : 'a stack -> 'a stack = <fun>
 
# let push x s = Push (x,s);;
 
val push : 'a -> 'a stack -> 'a stack = <fun>

Let us now test out stack specification:

# let s0 = push 1 (push 2 (push 3 Nil));;
val s0 : int stack = Push (1, Push (2, Push (3, Nil)))
 
# top (pop s0);;
- : int = 2

A recursive variant type can be seen as a sort of context-free grammar. For instance, the following (otherwise useless) type t represents the grammar { S → a S b, S → c }, which denotes the (context-tree) language { an c bn | n≥0 }. If we consider the syntax tree of a value of type t, then the leaves of the tree, read from left to right, are words of the grammar T.

# type a = A;;
# type b = B;; 
# type c = C;;
# type t = T of a * t * b | C;;
 
# T(A,T(A,T(A,C,B),B),B);;
- : t = T (A, T (A, T (A, C, B), B), B)


Abstract Syntax Trees

Tagged unions are particularly convenient to manipulate abstract syntax tree of terms.

As an example, let us consider a small language for arithmetic expressions. An expression can be a constant, or a variable, or a sum of two expressions, of a subtraction of two expressions. In BNF notation, you would write:

<const> ::=   ...
<var>   ::=   ...
<exp>   ::=   <const> | <var> | <exp> + <exp> | <exp> - <exp>

In OCaml, you can define arithmetic expressions through a variant type, that specifies their abstract syntax.

# type exp =
    Const of int
  | Var of string
  | Sum of exp * exp
  | Sub of exp * exp;;

Let us now define a function eval that evaluates an expression e to a value of type int. We need an environment rho to bind variables to values.

# let rec eval e rho = match e with
    Const v -> v
  | Var x -> (match applyenv x rho with 
      None -> failwith ("Unbound variable " ^ x)
    | Some v -> v)
  | Sum (e1,e2) -> (eval e1 rho) + (eval e2 rho)
  | Sub (e1,e2) -> (eval e1 rho) - (eval e2 rho);;
 
val eval : exp -> (string * int) list -> int = <fun>

Let us now test our arithmetic calculator:

# let rho0 = [("x",3);("y",5)];;
 
# eval (Sum(Var "x", Const 5)) rho0;;
 - : int = 8
 
# eval (Sum(Var "x", Var "z")) rho0;;
Exception: Failure "Unbound variable z".


Exercises

types.txt · Last modified: 2015/10/08 15:20 (external edit)