lip-midterm-2010

# LiP midterm project 2010

The goal of this project is to specify a simple lexical analyser. For evaluation purposes, the project is partitioned into five exercises.

##### Submission policy
• You must declare your group by 25.11.2010.
• The policy for group collaboration is described here
• To declare a group, send an email to the teaching assistant.
• The subject of the email must begin with [LIP] (automatic filtering will be used).
• The body of the email must contain the names of the components (min 1, max 3)
• You must declare your group even though you want to work alone (otherwise I cannot assign you an ID).
• The groups and their IDs will be displayed here.
• You must submit your project by 11.12.2010, 23.59 GMT +1.
• To submit a project, send it by email to the teaching assistant.
• The subject of the email must be [LIP] lip10a-gXX.ml, where XX is the ID of your group.
• The solutions to the exercises must be pasted into the body of the email (plain text).
• The email will contain no attachments.
• You can submit a project only once.
• You must adhere the LiP honesty policy.
• You are not required to exploit the solution of an exercise to solve the subsequent ones: you can start each exercise from scratch.

## Exercise 1 (3 pt)

Consider the type:

type 'a partial = None | Some of 'a

(a) Write a function bind with following signature:

bind : ('a -> 'b partial) -> 'a -> 'b -> 'a -> 'b partial

Given a partial function f: 'a → 'b partial, an identifier x: 'a and a value v: 'b, the expression bind f x v will evaluate to a function g: 'a → 'b partial such that, for all identifiers y:

• if y=x, then g y = Some v
• if y≠x, then g y = f y

(b) Write a function join with following signature:

join : ('a -> 'b partial) -> ('a -> 'b partial) -> 'a -> 'b partial

Given two partial functions f,g with type 'a → 'b partial, the expression join f g will evaluate to a function h with type 'a → 'b partial such that:

• h x = f x = g x, if both f and g are defined on x, and f x = g x
• h x = f x, if g is undefined on x
• h x = g x, if f is undefined on x
• h x = None, if neither f nor g are defined on x

Applying the function h on some x for which both f and g are defined on x, and f x ≠ g x, will result in an exception.

## Exercise 2 (6 pt)

Define datatype for specifying sets. A set is an unordered collection of elements, without repetitions. A set of elements of type 'a will be implemented as a list of type 'a list. The following functions will be provided:

• member : 'a → 'a set → bool. The expression member x s will evaluate to true iff the element x belong to s.
• isempty : 'a set → bool. The expression isempty s will evaluate to true iff the set s is empty.
• union : 'a set → 'a set → 'a set. The expression union s0 s1 will evaluate to the union of the sets s0 and s1.
• intersect : 'a set → 'a set → 'a set. The expression intersect s0 s1 will evaluate to the intersection of the sets s0 and s1.
• setminus : 'a set → 'a set → 'a set. The expression setminus s0 s1 will evaluate to the difference between the sets s0 and s1.
• setenum : 'a list → 'a set. The expression setenum l will evaluate to a set that contains all (and only) the elements of the list l.
• setcomp : 'a set → ('a → bool) → 'a set. The expression setcomp s p will evaluate to the largest subset s' of s such that the predicate p is true for all the elements of s'.
• subset: 'a set → 'a set → bool. The expression subset s0 s1 will evaluate to true iff s0 is a proper subset of s1.
• powset: 'a set → 'a set set. The expression powset s will evaluate to the set of all the possible subsets of s.

## Exercise 3 (6 pt)

(a) Define a datatype for specifying Deterministic Finite State Automata (DFAs).

(b) Define a function step that, given a DFA a, a state q and an input symbol s, evaluates the next state of a upon reading s on state q.

(c) Define a function recognize such that the expression recognize a l evaluates to true iff the DFA a accepts the sequence of input symbols l.

(d) Write a DFA that recognizes the language (a|b)*abb (on the alphabet {a,b}). Use the function recognize to test if it accepts the sequences abbabb and bababba.

## Exercise 4 (12 pt)

(a) Define a datatype for specifying Nondeterministic Finite State Automata (NFAs).

(b) Write a function dfa_of_nfa that transforma a NFA into a DFA which accepts the same language. The function will use the subset construction algorithm 1).

(c) Write an NFA that recognizes the language (a|b)*abb (on the alphabet {a,b}). Use the function dfa_of_nfa to transform it into a DFA.

## Exercise 5 (10 pt)

(a) Define a datatype for specifying regular expressions.

(b) Write a function nfa_of_regexp that transforms a regular expression into a NFA which accepts the same language. The function will use the Thompson's algorithm 2).

(c) Using the datatype defined at item (a), write a regular expression for (a|b)*abb. Use the function nfa_of_regexp to construct an NFA that recognizes the same language.

1) , 2) J. Hopcroft, R. Motwani, J. Ullman. Introduction to Automata Theory, Languages, and Computation. Prentice Hall, 2007.