basics

The Ocaml interactive environment should already be installed on all computers in the Lab. To run it, just open a shell and type:

ocaml

the system shoud output something like:

Objective Caml version 3.12.0 #

The first line is the version number (maybe in the Lab is different from that shown above). The second line is the *prompt*, that indicates the system is waiting for some input. You can write any expression terminated by the symbol `;;`

(double semicolon). For instance, if you want to evaluate the expression 42:

# 42;; - : int = 42

The first line (except from the opening `#`

) is what you typed.

The second line is the answer of Ocaml. It means that the expression you entered has *type* `int`

(i.e. integer), and its value is 42. The symbol `-`

means that the result is not stored in any variable, and so it can no longer be used.

In this wiki, lines starting with # will always represent user input in the interactive environment. The system answers will be printed below inputs, without the opening #.

As the Ocaml toplevel system does not allow for line editing, its usage as a programming environment is extremely awkward. The toplevel can be more easily used in conjunction with external editors, see below for instructions.

I suggest using emacs to edit and debug Ocaml programs (*note: Windows ports of Ocaml and its development environment are available, but no support for them will be given here*). The Ocaml emacs-mode should be already installed on all machines in the Lab.

To check if the emacs-mode is installed correctly on your machine, create a file named `test.ml`

, then copy & paste the following code snippet:

let x = 2 in x+1;;

If emacs displays the keywords `let`

and `in`

with a different color from `x=2`

and `x+1`

, then the emacs-mode should be installed correctly.

If all the characters are rendered with the same color, then follow the instructions below.

**Installation instructions**

- check if a directory named
`emacs`

exists under your home; if not, create it - download the emacs-mode package
- extract it to the
`~/emacs`

directory - open the
`.emacs`

file:`emacs ~/emacs/.emacs`

, - copy&paste into the .emacs file this snippet (do not delete existing lines)
- save, quit and relaunch emacs.

Upon successful installation, when a file with extension `.ml`

is opened, a new item named `Caml`

will appear in the menu bar.

**Usage instructions**

- open emacs (from a shell, type:
`emacs`

) - open a file with extension
`.ml`

(menu File → Open File) - write a phrase, e.g.
`2+2;;`

- place the blinking cursor before the ending
`;;`

- evaluate the phrase (menu
`Caml → Eval Phrase`

) - upon the requests “Caml toplevel to run” in the minibuffer, click enter
- split the window (menu
`File → Split Windows`

) - focus on the bottom window, by clicking the mouse over it
- to view the
`inferior caml`

buffer, select from the menu`Buffers → inferior caml`

At this point, you can enter Ocaml phrases in the top window, evaluate them, and see the answers in the bottom window.

To evaluate a phrase, it is convenient to use the keyboard shortcut

`meta-ctrl-x`

. This means that you:
- press the
`Esc`

key, - release it
- press the
`Ctrl`

key - while holding down the Ctrl, press the
`x`

key

To cancel partially typed or accidental commands, use the shortcut `ctrl-g`

Very roughly, you can think of any functional programming language as an “extensible” calculator. That is, you can build upon some basic values and operators to program more complex functions. We shall see in a while how to define functions, but first let us consider simple arithmetic expressions.

You enter as input the *expression* 2+3. The system evaluates it to the integer *value* 5. A value is something that cannot be further simplified.

# 2+3;; - : int = 5

Quite differently from any pocket calculator, Ocaml never makes confusion between integers and floating point numbers. This is a safety measure to prevent programmers from unintentionally using integers in place of floating points, and viceversa. Floating point numbers have a dot before the decimal part, even when empty:

# 3.0;; - : float = 3.

Unproperly mixing integers with floating point numbers will result in a syntax error:

# 2 + 3.0;; This expression has type float but is here used with type int

Floating point operators are distinguished from integer operators by an ending dot:

# 2. +. 3.;; - : float = 5.

The reason why the first expression resulted in an error, while the second succeded, it that Ocaml is a *strongly typed* language.

Let us observe the types of `+`

and `+.`

. This cannot be done directly, yet we can observe the types of the operators `(+)`

and `(+.)`

, that are the prefixed versions of the infix operators `+`

and `+.`

.

# (+) ;; - : int -> int -> int = <fun> # (+.);; - : float -> float -> float = <fun>

The type of `(+)`

is `int → int → int`

. Since the type constructor `→`

associates to the right, the type of `(+)`

is actually `int → (int → int)`

. This means that `(+)`

takes as input an integer, and evaluates to a function from integers to integers. A further application produces the result.

This may sound quite unusual: probably, you would have expected

`(+)`

to take as input a pair of integers, and return an integer. However, you may think of it just as a change of notation!
We shall discuss this in more detail later, see
Curried vs. uncurried functions
This is a (not complete) list of integer and floating point operators (see the OCaml manual for a complete list):

Operator | Type | Meaning |
---|---|---|

`+` | `int → int → int` | integer addition |

`-` | `int → int → int` | integer subtraction |

`*` | `int → int → int` | integer multiplication |

`/` | `int → int → int` | integer division |

`mod` | `int → int → int` | integer modulus |

`+.` | `float → float → float` | floating point addition |

`-.` | `float → float → float` | floating point subtraction |

`*.` | `float → float → float` | floating point multiplication |

`/.` | `float → float → float` | floating point division |

`sqrt` | `float → float` | square root |

`int_of_float` | `float → int` | float to integer conversion |

`float_of_int` | `int → float` | integer to float conversion |

The type `bool`

is inhabited by two values, `true`

and `false`

. Expressions of type `bool`

are used as guards in *conditionals*. For instance, consider the following conditional expression:

# if (2 > 1) then 5 else 8;; - : int = 5

In general, the conditional expression `if expr1 then expr2 else expr3`

evaluates to:

- the value of
`expr2`

, if`expr1`

evaluates to`true`

, - the value of
`expr3`

, if`expr1`

evaluates to`false`

.

Differently from most imperative languages, in OCaml you can use a conditional expression just as an ordinary expression:

# 7 + (if (2 > 1) then (if 2<=3 then 5 else 1) else 8);; - : int = 12

Note that the types of the two branches in a conditional must be equal, otherwise the whole conditional is not typed.

# if (2>1) then 5 else 2.1;; This expression has type float but is here used with type int

Boolean operators include && for logical “and”, || for logical “or”, and `not`

for negation.
Be aware that the order of the operands is important!
More precisely:

`expr1 && expr2`

behaves as`if expr1 then expr2 else false`

, i.e. the operand`expr2`

is not evaluated if`expr1`

evaluates to`false`

.`expr1 || expr2`

behaves as`if expr1 then true else expr2`

, i.e.`expr2`

is not evaluated if`expr1`

evaluates to`true`

.

This is a list of boolean operators:

Operator | Type | Meaning |
---|---|---|

`true` | `bool` | constant true |

`false` | `bool` | constant false |

`&&` | `bool → bool → bool` | logical and (shortcut evaluation) |

`||` | `bool → bool → bool` | logical and (shortcut evaluation) |

`not` | `bool → bool` | logical not |

`=` | `'a → 'a → bool` | structural equality |

`<>` | `'a → 'a → bool` | structural inequality |

`==` | `'a → 'a → bool` | physical equality |

`!=` | `'a → 'a → bool` | physical inequality |

Note that the equality/inequality operators work on elements of type `'a`

. That is, they are *polymorphic* functions. We will discuss polymorphism soon.

The difference between structural and physical equality is explained here .

In Ocaml there is a *global* environment of definitions which can be used by other expressions. For instance, we can define as follows a rough approximation of π:

# let pi = 3.14;; val pi : float = 3.14

The output now says that the variable `pi`

has been bound to the (float) value 3.14. Variable identifiers must be lower-case.

# let Pi = 3.14;; Characters 4-6: let Pi = 3.14;; ^^ Unbound constructor Pi

We can now compute the area of a circle with radius `r`

as follows:

# let r = 5.0;; val r : float = 5. # pi *. (r *. r);; - : float = 78.5

Definitions can also have a *local* scope. For example:

# let x = 3 in x * x;; - : int = 9

After the expression above has been evaluated, the previous definition of `x`

is no longer valid:

# x;; Unbound value x

Let-definitions can be arbitrarily nested:

# let x = 1 in let y = 3 in let z = 5 in x+y+z;; - : int = 9

You can use the keyword `and`

to define many variables simultaneously (this is particularly useful with recursion):

# let s1 = "binka" and s2 = "spit";; val s1 : string = "binka" val s2 : string = "spit"

What happens to the variables bound within an inner let-definition ?

# let s = let s' = "binka " and s'' = "spit" in s' ^ s'';; val s : string = "binka spit" # s;; # - : string = "binka spit" # s';; Characters 0-2: s';; ^^ Unbound value s'

The power of functional programming languages is given by the possibility of constructing complex functions from simpler ones. At the very basic level, there are the arithmetic operators, and some other operators on basic types (boolean, strings, etc.) For instance, let us define a function that takes as input an integer x and returns its square:

# let square = fun x -> x * x;; val square : int -> int = <fun>

The type of this function (automatically inferred by the compiler) is denoted by an arrow `int → int`

.
This is all the system has to say about this function.
The symbol `<fun>`

is a way to inform us that no better representation of a function has to be expected from a system which must run in a finite amount of time. In theory, the set of all pairs (x, x^{2}) would be a better (extensional) denotation for this function, yet it is an infinite one.

Applying a function works as expected.

# square 3;; - : int = 9

To ease readability, the `fun`

binder can be omitted.
This makes the form of function definitions even more familiar:

# let square x = x * x;; val square : int -> int = <fun>

Although functions are first class citizens in OCaml, they cannot be tested for equality.
This should not look strange: indeed, *extensional* function equality is not computable.

# let square' x = (x+1) * (x-1) + 1;; val square' : int -> int = <fun> # square = square';; Exception: Invalid_argument "equal: functional value".

As a further example, consider a function that computes the average between two integer values:

# let avg x y = (x + y) / 2;; val avg : int -> int -> int = <fun> # avg 3 5;; - : int = 4

There are a couple of things to note here.

- the type constructor '→' associates to the right, i.e.
`int → int → int`

actually means`int → (int → int)`

- function application associates to the left, i.e.
`avg 3 5`

means (avg 3) 5 - similarly to the function
`(+)`

discussed above,`avg`

is a function which takes a single parameter of type`int`

, and produces a function with type`int → int`

.

Define a function *Hint.* Use the function

`avg' : int → int → float`

that computes the (floating-point) average between two integers.
For instance, `avg' 2 5`

will evaluate to 3.5.
`float_of_int: int → float`

.
Functions always take one parameter.
To define a constant function `f`

, i.e. a function the parameter of which is immaterial,
we stipulate that `f`

acts on the value `()`

.
This is a special value, of type `unit`

(this is quite similar to the type `void`

in C).
The type `unit`

is also called *singleton type*,
since it is inhabited by exactly one value, `()`

.

# let f = fun () -> 51;; val f : unit -> int = <fun> # f ();; - : int = 51

Like in the λ-calculus, functions need not even have a name, e.g.:

# (fun x -> x+1) 7;; - : int = 8

Indeed, the definition `let f x = x+1`

is just *syntactic sugar*,
since the same result can be obtained by defining a pure function within a let binding, i.e. `let f = fun x → x+1`

.

In Ocaml, definitions have a static scope

# let x=0;; val x : int = 0 # let f = fun () -> x;; val f : unit -> int = <fun> # let g = fun () -> let x=1 in f();; val g : unit -> int = <fun> # g();; - : int = 0

It is also possible to define *infix* functions.
For instance, we define as follows an infix operator `-!`

,
that subtracts an integer m from an integer n,
and results in 0 if n<m.

# let ( -! ) n m = if n<m then 0 else n-m;; val ( -! ) : int -> int -> int = <fun> # 5 -! 3;; - : int = 2 # 3 -! 5;; - : int = 0

The spaces before and after `-!`

in the definition are mandatory.
An infix operator can only contain symbols (e.g. *, +, @, etc.), while it cannot contain letters or digits.

When applying a function to negative numbers, wrap then within parentheses. Otherwise Ocaml thinks you are trying su subtract a number from a function.

# square -3;; ^^^^^^ This expression has type int -> int but is here used with type int

Function application associates to the left. That is, when you write `f g e`

you are actually meaning `(f g) e`

. If you want to apply first `g`

to `e`

, you must use parentheses, i.e. f (g e).

# square square 3;; ^^^^^^ This function is applied to too many arguments, maybe you forgot a `;' # square (square 3);; - : int = 81

The classical notation for writing a function f of n arguments x_{1},…,x_{n}
is f(x_{1},…,x_{n}), i.e a function taking as argument an n-tuple of parameters.

For instance, a function that sums two integers can be written as follows:

# let sum (x,y) = x+y val sum : int * int -> int = <fun> # sum (5,3) - : int = 8

The type `int * int → int`

means that `sum`

maps a pair of integers to an integer.

An alternative way of defining the same function is the following:

# let sumc = fun x -> fun y -> x+y;; val sumc : int -> int -> int = <fun>

The type `int → int → int`

has to be read `int → (int → int)`

, since
the constructor `→`

associates to the left.
This means that `sumc`

takes as input an integer, and evaluates to another function.
This function has no name: from its type we only know that, when fed with an integer, it produces another integer.
Intuitively, the function which results from applying `sumc`

to the integer n,
is the function that maps an integer x to x+n.

For instance, we can obtain the successor function by specialization of `sumc`

as follows:

# let succ = sumc 1;; val succ : int -> int = <fun> # succ 7;; - : int = 8

Let us now apply the function `sumc`

to the arguments 5 and 3.
According to its type, we should write `(sumc 5) 3`

.
We can write it more concisely as `sumc 5 3`

: this is equivalent,
since function application associates to the left.

# sumc 5 3;; - : int = 8

`sumc`

above as follows:
# let sumc x y = x+y;; val sumc : int -> int -> int = <fun>

Traditionally, functions written in the first way (i.e. taking as input a tuple of arguments) are said to be in *uncurried* form,
while those written in the second way are said to be in *curried* form.
The two forms are equivalent:
this is consistent with the isomorphism between the function spaces (A x B) → C and A → (B → C).

The operation of transforming a function taking a tuple of arguments into a (higher-order) function taking a single argument is called Currying. We shall define the currying and uncurrying tranformations in a couple of lectures.

Assume you want to model natural numbers. When you have to subtract b from a, and b is greater than a, you have two choices:

- you decide that a-b=0 if b>a
- you decide that a-b is undefined, i.e.
`-`

is a partial function.

There are two standard ways to define partial function in OCaml. One is to use tagged unions, the other is to use exceptions. The latter can be done as follows.

# let sub x y = if x<y then failwith "Negative number" else x-y;; val sub : int -> int -> int = <fun> # sub 3 5;; Exception: Failure "Negative number".

Alternatively, you can define a custom exception.
This can be useful when you have to deal with multiple kinds of failuers.
You can use the construct `try … with`

to catch and manage an exception.

# exception NegativeNumber;; # let sub x y = if x<y then raise NegativeNumber else x-y;; # sub 3 5;; Exception: NegativeNumber. # let dist x y = try sub x y with NegativeNumber -> y-x;; val dist : int -> int -> int = <fun> # dist 3 5;; - : int = 2

The standard Ocaml distribution comes with two compilers:

`ocamlc`

, which produces bytecode to be run with`ocamlrun`

`ocamlopt`

, which produces native (platform-dependent) code.

The `ocamlopt`

compiler produces code that runs faster than the bytecode produced by `ocamlc`

. However, compilation takes longer, and the size of executable code is bigger. The two compilers should produce executables that behave identically at run-time.

Let us consider an Ocaml program that takes as input a string, and then prints a welcome message.
The operator `^`

concatenates two strings.
The constructur `let`

binds a name to a value.
Comments are delimited by (* and *).

(* My first Ocaml program *) print_string "Please enter your name: ";; let name = read_line ();; print_string ("Hello, " ^ name ^ "\n");;

Let us assume that the above program is contained in the file `hello.ml`

. To produce an executable named `hello`

, type:

ocamlopt -o hello hello.ml

To run the executable, type:

./hello

The following program takes the name of the person to greet from the command line.
The array `Sys.argv`

contains the command line parameters.
The length of an array `a`

is obtained by `Array.length a`

.
The `i`

-th element of an array `a`

is obtained by `a.(i)`

.

let name = if (Array.length Sys.argv = 1) then "anonymous" else (Sys.argv).(1) in print_string ("Hello, " ^ name ^ "\n");;

basics.txt · Last modified: 2015/10/08 15:20 (external edit)