The goal of this project is to specify a simple lexical analyser. For evaluation purposes, the project is partitioned into five exercises.
[LIP]
(automatic filtering will be used).[LIP] lip10a-gXX.ml
, where XX is the ID of your group.
Consider the type:
type 'a partial = None | Some of 'a
(a) Write a function bind
with following signature:
bind : ('a -> 'b partial) -> 'a -> 'b -> 'a -> 'b partial
Given a partial function f: 'a → 'b partial
, an identifier x: 'a
and a value v: 'b
,
the expression bind f x v
will evaluate to a function g: 'a → 'b partial
such that,
for all identifiers y:
(b) Write a function join
with following signature:
join : ('a -> 'b partial) -> ('a -> 'b partial) -> 'a -> 'b partial
Given two partial functions f
,g
with type 'a → 'b partial
,
the expression join f g
will evaluate to a function h
with type 'a → 'b partial
such that:
Applying the function h
on some x
for which both f and g are defined on x
, and f x ≠ g x
,
will result in an exception.
Define datatype for specifying sets. A set is an unordered collection of elements, without repetitions.
A set of elements of type 'a
will be implemented as a list of type 'a list
.
The following functions will be provided:
member : 'a → 'a set → bool
. The expression member x s
will evaluate to true
iff the element x
belong to s
.isempty : 'a set → bool
. The expression isempty s
will evaluate to true
iff the set s
is empty.union : 'a set → 'a set → 'a set
. The expression union s0 s1
will evaluate to the union of the sets s0
and s1
.intersect : 'a set → 'a set → 'a set
. The expression intersect s0 s1
will evaluate to the intersection of the sets s0
and s1
.setminus : 'a set → 'a set → 'a set
. The expression setminus s0 s1
will evaluate to the difference between the sets s0
and s1
.setenum : 'a list → 'a set
. The expression setenum l
will evaluate to a set that contains all (and only) the elements of the list l
.setcomp : 'a set → ('a → bool) → 'a set
. The expression setcomp s p
will evaluate to the largest subset s
' of s
such that the predicate p
is true for all the elements of s
'. subset: 'a set → 'a set → bool
. The expression subset s0 s1
will evaluate to true iff s0
is a proper subset of s1
.powset: 'a set → 'a set set
. The expression powset s
will evaluate to the set of all the possible subsets of s
.
(a) Define a datatype for specifying Deterministic Finite State Automata (DFAs).
(b) Define a function step
that, given a DFA a
, a state q
and an input symbol s
,
evaluates the next state of a
upon reading s
on state q
.
(c) Define a function recognize
such that
the expression recognize a l
evaluates to true iff the DFA a
accepts the sequence of input symbols l
.
(d) Write a DFA that recognizes the language (a|b)*abb (on the alphabet {a,b}).
Use the function recognize
to test if it accepts the sequences abbabb and bababba.
(a) Define a datatype for specifying Nondeterministic Finite State Automata (NFAs).
(b) Write a function dfa_of_nfa
that transforma a NFA into a DFA which accepts the same language.
The function will use the subset construction algorithm 1).
(c) Write an NFA that recognizes the language (a|b)*abb (on the alphabet {a,b}). Use the function dfa_of_nfa
to transform it into a DFA.
(a) Define a datatype for specifying regular expressions.
(b) Write a function nfa_of_regexp
that transforms a regular expression into a NFA
which accepts the same language.
The function will use the Thompson's algorithm 2).
(c) Using the datatype defined at item (a), write a regular expression for (a|b)*abb. Use the function nfa_of_regexp
to construct an NFA that
recognizes the same language.