HaLeX
A Haskell Library to Model, Manipulate and Animate Regular Languages

Joćo Saraiva
Department of Computer Science
University of Minho

Joćo Saraiva
jas@di.uminho.pt 

1: The HaLeX Library
This library defines a number of Haskell data types and functions that manipulate them, which provide the following facilities:
 Definition of
regular expressions
,
nondeterministic
and
deterministic finite automata
directly and straightforwardly in Haskell.
 Definition of the
acceptance functions
for all those models.
 Transformation
of regular expressions into nondeterministic and deterministic finite automata.
 Minimization
of the states of deterministic finite automata.
 Lexlike facilities
,
e.g.
it generates a Haskell program from a regular expression (expressed as it's minimized finite automata).
 Equivalence of regular expressions
and finite automata.
 Definition of
reactive finite automata
.
 Graphical representation
of finite automata.
 Animation
of the acceptance function of finite automata.


2: The HaLeX Library
The HaLeX library is a small extension to
 Simon Thompson, "Regular Expressions and Automata using Haskell", January 2000.
 Johan Jeuring and Doaitse Swierstra, "Advanced Programming Concepts in a Course on Grammars and Parsing", FDPE'99.
Basically, it extends definitions in these papers with
 the graphic visualization of automata
 the definition of reactive finite automata.


3: Motivation
This library was developed in the context of a programming methodology course for undergraduate students (3rd year of our degree on software engineering).
 It was developed mainly with
educational purposes
:
 It was developed in order to provide a
clear
,
efficient
and
concise
way to define, to understand and to manipulate regular languages in Haskell.
 The construction of parts of this library has been proposed as assigment projects the students had to solve in that course.
 HaLeX is now being used to support the programming methodology course.


4: Regular Expressions
Regular expressions are modelled in Haskell by the abstract data type
RegExp
. We define one data constructor for each "standard" regular expression operator.
data
RegExp 
= 
Epsilon 


 
Literal 
Char 

 
Or 
RegExp RegExp 

 
Then 
RegExp RegExp 

 
Star 
RegExp 

 
OneOrMore 
RegExp 

 
Optional 
RegExp 
and we can write:
a 
= 
Literal
'a' 
aOrb 
= 
(
Literal
'a') `
Or
` (
Literal
'b')  

5: Deterministic Finite Automata
Formally, a Deterministic Finite Automaton (DFA) is a fivetuple
A = (V,Q,S,Z,D)
. In order to make our code more readable, we prefer to define a new data type, to express the automata's fivetuple, named
Dfa
with a single constructor, named
Dfa
as well, rather than using the Haskell builtin tuples. To make our DFA's more general, we parameterize the data type with the types of states and symbols
data
Dfa
st sy 
=
Dfa 
[sy] 
 Vocabulary 


[st] 
 Finite set of states 


st 
 The start state 


[st] 
 The set of final states 


(st  > sy  > st) 
 The transition function 
where sets are modelled through the Haskell builtin lists. 

6: Haskellbased deterministic finite automata: example

ex1 
::
Dfa 
Char Char 
ex1 
=
Dfa 
['a','b'] 


['A','B','C','D'] 


'A' 


['C'] 


delta 

where 
delta 'A' 'a' = 'A' 


delta 'A' 'b' = 'B' 


delta 'B' 'a' = 'C' 


delta _ _ = 'D'   

7: NonDeterministic Finite Automata with Epsilon Transitions
We define nondeterministic finite automata with
epsilon
transitions and several initial states.
To epsilon transitions we use the predefined datatype Maybe, where the constructor Nothing models epsilontransitions, while the constructor Just models transitions labeled by its argument symbol.
data Ndfa st sy 
= Ndfa 
[sy] 
 Vocabulary 


[st] 
 Finite set of states 


[st] 
 The set of start states 


[st] 
 The set of final states 


(st  >
Maybe
sy  > [st]) 
 The transition function  

8: NonDeterministic Finite Automata with Epsilon Transitions: Example
ex1 
::
Ndfa 
Char Char 
ex1 
=
Ndfa 
['a','b'] 


['A','B','C'] 


['A'] 


['C'] 


delta 

where 
delta 'A' (
Just
'a') = ['A','B'] 


delta 'A' (
Just
'b') = ['B'] 


delta 'B' (
Just
'a') = ['C'] 


delta 'C'
Nothing
= ['A','B'] 


delta _ _ = []  
 

9: The Acceptance Functions
We have defined acceptance functions for all these models.
For example, the acceptance function of a DFA expresses the recursion pattern captured by the Haskell prelude foldl function. Thus, the function accept is concisely defined as follows:
accept
(
Dfa
v q s z delta) simb 
= 
(
foldl
delta s simb) `elem` z  

10: Graphic Representation of Finite Automata
Finite automata are more comprehensible in a graphical representation. Usually automata are represented as graphs:
 states are depicted as the nodes of a graph, final states get a double circle, and the initial states are explicitly mentioned usually with an incoming arrow.
But, drawing graphs is a complex task and an intensive area of research.
Rather than implementing a graph visualization system from scracth (and reinventing the wheel...), we compute a concrete representation for a high quality graph visualization system: the
GraphViz system
.
 GraphViz provides a collection of tools for manipulating graph structures and generating graph layouts.


11: Graphic Representation of Finite Automata

digraph HaLeX { 
rankdir = LR ; 
"1" [shape = circle , color = green]; 
"2" [shape = doublecircle , color = red]; 
"3" [shape = doublecircle , color = red]; 
node [shape = circle , color = black]; 
"1"  > "2" [label = "'a'"]; 
"1"  > "3" [label = "'b'"]; 
"2"  > "4" [label = "'a','b'"]; 
"3"  > "3" [label = "'a'"]; 
"3"  > "4" [label = "'b'"]; 
"4"  > "4" [label = "'a','b'"]; 
}   

12: Graphic Representation of Finite Automata
The HaLeX library includes a prettyprinting function, named
toGraph
that, given a FA, produces its (textual) graph representation in the Dot language.
In order to beautify the graphical representation, we have included options
 to ignore sink states
 to fuse transitions with the same origin and destination.
We omit here its trivial implementation. 

13: Reactive Finite Automata
While accepting a sentence, we may wish the automaton to react. For example, we may wish
 to compute some trace information
 to perform some IO operations
 to compute semantic functions
A reactive finite automaton (RFA) is a machine that reacts while moving from one state to another. To add reactions to finite automata we use monads and we define monadic finite automata. 

14: Reactive Deterministic Finite Automata
Because we wish to associate different effects to finite automata, we parameterize the type of the
DFA
with the type of the monad. The reactions are defined in the (monadic) transition function. As a result, those functions not only have to indicate the destination state(s), but also the reactions to be performed in the transitions.
data
Dfa
m
st sy 
=
Dfa 
[sy] 
 Vocabulary 


[st] 
 Finite set of states 


st 
 The start state 


[st] 
 The set of final states 


(st  > sy  >
m
st) 
 The Monadic transition function 


15: Acceptance Function
The acceptance function is now defined as the standard \Haskell\ monadic foldM function:
dfaaccept
(
Dfa
_ _ s z delta) sent = do 
st < 
foldM
delta s sent 

return (st `elem` z)  

16: Communication protocol
ptl 
:: Dfa (State ([Char],[Int])) Int Char 
ptl 
= Dfa ['1','0'] [1,2,3,4,5,6,7,8,9,10,11,12,13] 1 [12] delta 

where delta 1 '0' = return 2 

delta 2 '0' = return 3 

delta 3 '0' = return 4 

delta 4 s = do { bits s ; return 5 } 

delta 5 s = do { bits s ; return 6 } 

delta 6 s = do { bits s ; return 7 } 

delta 7 '0' = do { values ; return 8 } 

delta 7 '1' = do { values ; return 10 } 

delta 8 '0' = return 9 

delta 9 '1' = return 4 

delta 10 '1' = return 11 

delta 11 '1' = return 12 

delta _ _ = return 13 

bits x = modify (\ s  > ((fst s) + + [x],snd s)) 

values = modify (\ s  > ("",snd s + + [bits2int (fst s)]))  
 

17: Animating Regular Languages
The acceptance function finds a path from the initial state into a final state of the automata.
HaLeX animations are just the stepbystep display of such a path in the graphic representation of the automaton
Thus, to animate finite automata we need to compute the sequence of visited nodes.
This trace information can be easily computed using our reactive FA: the monadic automata just have to accumulate in the state the nodes being visited. 

18: Animating Regular Languages
For example, HaLeX derives such monadic DFA from regular expressions
mdfa 
:: Dfa (State [Char]) Int Char 
mdfa 
= Dfa v q s z delta 
where 


v = " + d." 

q = [1,2,3,4,5,6] 

s = 1 

z = [3,6] 

delta 1 ' + ' = do { accum '2' ; return 2 } 

delta 1 '' = do { accum '2' ; return 2 } 

delta 1 'd' = do { accum '3' ; return 3 } 

delta 1 '.' = do { accum '4' ; return 4 }  

19: The halex Tool
HaLeX is available at:
http://www.di.uminho.pt/~jas/Research/HaLeX/HaLeX.html
Having defined the HaLeX library, we can now define useful tools to manipulate and visualize regular languages.
We have constructed the
halex
program entirely in HaLeX.
Tool Demo 

20: More Animations?
Today, 16:30, GPCE
 Joćo Saraiva, "Componentbased Programming for HigherOrder Attribute Grammars"

