User Tools

Site Tools


basics

Ocaml basics

The Ocaml toplevel system

The Ocaml interactive environment should already be installed on all computers in the Lab. To run it, just open a shell and type:

ocaml

the system shoud output something like:

        Objective Caml version 3.12.0
 
# 

The first line is the version number (maybe in the Lab is different from that shown above). The second line is the prompt, that indicates the system is waiting for some input. You can write any expression terminated by the symbol ;; (double semicolon). For instance, if you want to evaluate the expression 42:

# 42;;
- : int = 42

The first line (except from the opening #) is what you typed.

The second line is the answer of Ocaml. It means that the expression you entered has type int (i.e. integer), and its value is 42. The symbol - means that the result is not stored in any variable, and so it can no longer be used.

In this wiki, lines starting with # will always represent user input in the interactive environment. The system answers will be printed below inputs, without the opening #.

As the Ocaml toplevel system does not allow for line editing, its usage as a programming environment is extremely awkward. The toplevel can be more easily used in conjunction with external editors, see below for instructions.


The Ocaml Emacs-mode

I suggest using emacs to edit and debug Ocaml programs (note: Windows ports of Ocaml and its development environment are available, but no support for them will be given here). The Ocaml emacs-mode should be already installed on all machines in the Lab.

To check if the emacs-mode is installed correctly on your machine, create a file named test.ml, then copy & paste the following code snippet:

let x = 2 in x+1;;

If emacs displays the keywords let and in with a different color from x=2 and x+1, then the emacs-mode should be installed correctly.

If all the characters are rendered with the same color, then follow the instructions below.

Installation instructions

  1. check if a directory named emacs exists under your home; if not, create it
  2. download the emacs-mode package
  3. extract it to the ~/emacs directory
  4. open the .emacs file: emacs ~/emacs/.emacs,
  5. copy&paste into the .emacs file this snippet (do not delete existing lines)
  6. save, quit and relaunch emacs.

Upon successful installation, when a file with extension .ml is opened, a new item named Caml will appear in the menu bar.

Usage instructions

  1. open emacs (from a shell, type: emacs)
  2. open a file with extension .ml (menu File → Open File)
  3. write a phrase, e.g. 2+2;;
  4. place the blinking cursor before the ending ;;
  5. evaluate the phrase (menu Caml → Eval Phrase)
  6. upon the requests “Caml toplevel to run” in the minibuffer, click enter
  7. split the window (menu File → Split Windows)
  8. focus on the bottom window, by clicking the mouse over it
  9. to view the inferior caml buffer, select from the menu Buffers → inferior caml

At this point, you can enter Ocaml phrases in the top window, evaluate them, and see the answers in the bottom window.

To evaluate a phrase, it is convenient to use the keyboard shortcut meta-ctrl-x. This means that you:

  1. press the Esc key,
  2. release it
  3. press the Ctrl key
  4. while holding down the Ctrl, press the x key

To cancel partially typed or accidental commands, use the shortcut ctrl-g


Arithmetic expressions

Very roughly, you can think of any functional programming language as an “extensible” calculator. That is, you can build upon some basic values and operators to program more complex functions. We shall see in a while how to define functions, but first let us consider simple arithmetic expressions.

You enter as input the expression 2+3. The system evaluates it to the integer value 5. A value is something that cannot be further simplified.

# 2+3;;
- : int = 5

Quite differently from any pocket calculator, Ocaml never makes confusion between integers and floating point numbers. This is a safety measure to prevent programmers from unintentionally using integers in place of floating points, and viceversa. Floating point numbers have a dot before the decimal part, even when empty:

# 3.0;;
- : float = 3.

Unproperly mixing integers with floating point numbers will result in a syntax error:

# 2 + 3.0;;
This expression has type float but is here used with type int

Floating point operators are distinguished from integer operators by an ending dot:

# 2. +. 3.;;
- : float = 5.

The reason why the first expression resulted in an error, while the second succeded, it that Ocaml is a strongly typed language.

Let us observe the types of + and +.. This cannot be done directly, yet we can observe the types of the operators (+) and (+.), that are the prefixed versions of the infix operators + and +..

# (+) ;;
- : int -> int -> int = <fun>
 
# (+.);;
- : float -> float -> float = <fun>

The type of (+) is int → int → int. Since the type constructor associates to the right, the type of (+) is actually int → (int → int). This means that (+) takes as input an integer, and evaluates to a function from integers to integers. A further application produces the result.

This may sound quite unusual: probably, you would have expected (+) to take as input a pair of integers, and return an integer. However, you may think of it just as a change of notation! We shall discuss this in more detail later, see Curried vs. uncurried functions

This is a (not complete) list of integer and floating point operators (see the OCaml manual for a complete list):

Operator Type Meaning
+ int → int → int integer addition
- int → int → int integer subtraction
* int → int → int integer multiplication
/ int → int → int integer division
mod int → int → int integer modulus
+. float → float → float floating point addition
-. float → float → float floating point subtraction
*. float → float → float floating point multiplication
/. float → float → float floating point division
sqrt float → float square root
int_of_float float → int float to integer conversion
float_of_int int → float integer to float conversion


Booleans and conditionals

The type bool is inhabited by two values, true and false. Expressions of type bool are used as guards in conditionals. For instance, consider the following conditional expression:

# if (2 > 1) then 5 else 8;;
- : int = 5

In general, the conditional expression if expr1 then expr2 else expr3 evaluates to:

  • the value of expr2, if expr1 evaluates to true,
  • the value of expr3, if expr1 evaluates to false.

Differently from most imperative languages, in OCaml you can use a conditional expression just as an ordinary expression:

# 7 + (if (2 > 1) then (if 2<=3 then 5 else 1) else 8);;
- : int = 12

Note that the types of the two branches in a conditional must be equal, otherwise the whole conditional is not typed.

# if (2>1) then 5 else 2.1;;
This expression has type float but is here used with type int

Boolean operators include && for logical “and”, || for logical “or”, and not for negation. Be aware that the order of the operands is important! More precisely:

  • expr1 && expr2 behaves as if expr1 then expr2 else false, i.e. the operand expr2 is not evaluated if expr1 evaluates to false.
  • expr1 || expr2 behaves as if expr1 then true else expr2, i.e. expr2 is not evaluated if expr1 evaluates to true.

This is a list of boolean operators:

Operator Type Meaning
true bool constant true
false bool constant false
&& bool → bool → bool logical and (shortcut evaluation)
|| bool → bool → bool logical and (shortcut evaluation)
not bool → bool logical not
= 'a → 'a → bool structural equality
<> 'a → 'a → bool structural inequality
== 'a → 'a → bool physical equality
!= 'a → 'a → bool physical inequality

Note that the equality/inequality operators work on elements of type 'a. That is, they are polymorphic functions. We will discuss polymorphism soon.

The difference between structural and physical equality is explained here .


Definitions

In Ocaml there is a global environment of definitions which can be used by other expressions. For instance, we can define as follows a rough approximation of π:

# let pi = 3.14;;
val pi : float = 3.14

The output now says that the variable pi has been bound to the (float) value 3.14. Variable identifiers must be lower-case.

# let Pi = 3.14;;
Characters 4-6:
  let Pi = 3.14;;
      ^^
Unbound constructor Pi

We can now compute the area of a circle with radius r as follows:

# let r = 5.0;;      
val r : float = 5.
 
# pi *. (r *. r);;
- : float = 78.5

Definitions can also have a local scope. For example:

# let x = 3 in x * x;; 
- : int = 9

After the expression above has been evaluated, the previous definition of x is no longer valid:

# x;;
Unbound value x

Let-definitions can be arbitrarily nested:

# let x = 1 in let y = 3 in let z = 5 in x+y+z;;
- : int = 9

You can use the keyword and to define many variables simultaneously (this is particularly useful with recursion):

# let s1 = "binka" and s2 = "spit";;
val s1 : string = "binka"
val s2 : string = "spit"

What happens to the variables bound within an inner let-definition ?

# let s = let s' = "binka " and s'' = "spit" in s' ^ s'';;
val s : string = "binka spit"
 
# s;;
# - : string = "binka spit"
 
# s';;
Characters 0-2:
  s';;
  ^^
Unbound value s'


Functions

The power of functional programming languages is given by the possibility of constructing complex functions from simpler ones. At the very basic level, there are the arithmetic operators, and some other operators on basic types (boolean, strings, etc.) For instance, let us define a function that takes as input an integer x and returns its square:

# let square = fun x -> x * x;;
val square : int -> int = <fun>

The type of this function (automatically inferred by the compiler) is denoted by an arrow int → int. This is all the system has to say about this function. The symbol <fun> is a way to inform us that no better representation of a function has to be expected from a system which must run in a finite amount of time. In theory, the set of all pairs (x, x2) would be a better (extensional) denotation for this function, yet it is an infinite one.

Applying a function works as expected.

# square 3;;
- : int = 9

To ease readability, the fun binder can be omitted. This makes the form of function definitions even more familiar:

# let square x = x * x;;
val square : int -> int = <fun>

Although functions are first class citizens in OCaml, they cannot be tested for equality. This should not look strange: indeed, extensional function equality is not computable.

# let square' x = (x+1) * (x-1) + 1;;
val square' : int -> int = <fun>
 
# square = square';;
Exception: Invalid_argument "equal: functional value".

As a further example, consider a function that computes the average between two integer values:

# let avg x y = (x + y) / 2;;
val avg : int -> int -> int = <fun>
 
# avg 3 5;;
- : int = 4

There are a couple of things to note here.

  • the type constructor '→' associates to the right, i.e. int → int → int actually means int → (int → int)
  • function application associates to the left, i.e. avg 3 5 means (avg 3) 5
  • similarly to the function (+) discussed above, avg is a function which takes a single parameter of type int, and produces a function with type int → int.

Define a function avg' : int → int → float that computes the (floating-point) average between two integers. For instance, avg' 2 5 will evaluate to 3.5. Hint. Use the function float_of_int: int → float.

Functions always take one parameter. To define a constant function f, i.e. a function the parameter of which is immaterial, we stipulate that f acts on the value (). This is a special value, of type unit (this is quite similar to the type void in C). The type unit is also called singleton type, since it is inhabited by exactly one value, ().

# let f = fun () -> 51;;
val f : unit -> int = <fun>
 
# f ();;
- : int = 51

Like in the λ-calculus, functions need not even have a name, e.g.:

# (fun x -> x+1) 7;;
- : int = 8

Indeed, the definition let f x = x+1 is just syntactic sugar, since the same result can be obtained by defining a pure function within a let binding, i.e. let f = fun x → x+1.

In Ocaml, definitions have a static scope

# let x=0;;
val x : int = 0
 
# let f = fun () -> x;;
val f : unit -> int = <fun>
 
# let g = fun () -> let x=1 in f();;
val g : unit -> int = <fun>
 
# g();;
- : int = 0

It is also possible to define infix functions. For instance, we define as follows an infix operator -!, that subtracts an integer m from an integer n, and results in 0 if n<m.

# let ( -! ) n m = if n<m then 0 else n-m;;
val ( -! ) : int -> int -> int = <fun>
 
# 5 -! 3;;
- : int = 2
 
# 3 -! 5;;
- : int = 0

The spaces before and after -! in the definition are mandatory. An infix operator can only contain symbols (e.g. *, +, @, etc.), while it cannot contain letters or digits.


Common errors

Common error #1

When applying a function to negative numbers, wrap then within parentheses. Otherwise Ocaml thinks you are trying su subtract a number from a function.

# square -3;;
  ^^^^^^
This expression has type int -> int but is here used with type int
Common error #2

Function application associates to the left. That is, when you write f g e you are actually meaning (f g) e. If you want to apply first g to e, you must use parentheses, i.e. f (g e).

# square square 3;;
  ^^^^^^
This function is applied to too many arguments, maybe you forgot a `;'
 
# square (square 3);;
- : int = 81


Curried vs. uncurried functions

The classical notation for writing a function f of n arguments x1,…,xn is f(x1,…,xn), i.e a function taking as argument an n-tuple of parameters.

For instance, a function that sums two integers can be written as follows:

# let sum (x,y) = x+y
val sum : int * int -> int = <fun>
 
# sum (5,3)
- : int = 8

The type int * int → int means that sum maps a pair of integers to an integer.

An alternative way of defining the same function is the following:

# let sumc = fun x -> fun y -> x+y;;
val sumc : int -> int -> int = <fun>

The type int → int → int has to be read int → (int → int), since the constructor associates to the left. This means that sumc takes as input an integer, and evaluates to another function. This function has no name: from its type we only know that, when fed with an integer, it produces another integer. Intuitively, the function which results from applying sumc to the integer n, is the function that maps an integer x to x+n.

For instance, we can obtain the successor function by specialization of sumc as follows:

# let succ = sumc 1;;
val succ : int -> int = <fun>
 
# succ 7;;
- : int = 8

Let us now apply the function sumc to the arguments 5 and 3. According to its type, we should write (sumc 5) 3. We can write it more concisely as sumc 5 3: this is equivalent, since function application associates to the left.

# sumc 5 3;;
- : int = 8

Syntactic sugar. We can equivalenty rephrase the function sumc above as follows:

# let sumc x y = x+y;;
val sumc : int -> int -> int = <fun>

Traditionally, functions written in the first way (i.e. taking as input a tuple of arguments) are said to be in uncurried form, while those written in the second way are said to be in curried form. The two forms are equivalent: this is consistent with the isomorphism between the function spaces (A x B) → C and A → (B → C).

The operation of transforming a function taking a tuple of arguments into a (higher-order) function taking a single argument is called Currying. We shall define the currying and uncurrying tranformations in a couple of lectures.


Partial functions

Assume you want to model natural numbers. When you have to subtract b from a, and b is greater than a, you have two choices:

  • you decide that a-b=0 if b>a
  • you decide that a-b is undefined, i.e. - is a partial function.

There are two standard ways to define partial function in OCaml. One is to use tagged unions, the other is to use exceptions. The latter can be done as follows.

# let sub x y = if x<y then failwith "Negative number" else x-y;;
val sub : int -> int -> int = <fun>
 
# sub 3 5;;
Exception: Failure "Negative number".

Alternatively, you can define a custom exception. This can be useful when you have to deal with multiple kinds of failuers. You can use the construct try … with to catch and manage an exception.

# exception NegativeNumber;;
 
# let sub x y = if x<y then raise NegativeNumber else x-y;;
 
# sub 3 5;;
Exception: NegativeNumber.
 
# let dist x y = try sub x y with NegativeNumber -> y-x;;
val dist : int -> int -> int = <fun>
 
# dist 3 5;;
 - : int = 2


Compiling

The standard Ocaml distribution comes with two compilers:

  • ocamlc, which produces bytecode to be run with ocamlrun
  • ocamlopt, which produces native (platform-dependent) code.

The ocamlopt compiler produces code that runs faster than the bytecode produced by ocamlc. However, compilation takes longer, and the size of executable code is bigger. The two compilers should produce executables that behave identically at run-time.

Let us consider an Ocaml program that takes as input a string, and then prints a welcome message. The operator ^ concatenates two strings. The constructur let binds a name to a value. Comments are delimited by (* and *).

(* My first Ocaml program *)
 
print_string "Please enter your name: ";;
let name = read_line ();;
print_string ("Hello, " ^ name ^ "\n");;

Let us assume that the above program is contained in the file hello.ml. To produce an executable named hello, type:

ocamlopt -o hello hello.ml

To run the executable, type:

./hello

The following program takes the name of the person to greet from the command line. The array Sys.argv contains the command line parameters. The length of an array a is obtained by Array.length a. The i-th element of an array a is obtained by a.(i).

let name = if (Array.length Sys.argv = 1) then "anonymous" else (Sys.argv).(1)
in print_string ("Hello, " ^ name ^ "\n");;


Exercises

basics.txt · Last modified: 2015/10/08 15:20 (external edit)