1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
|
#+TITLE: Plan of the package
#+AUTHOR: Durand
#+DATE: <2022-11-18 Ven 19:57>
* Things to do [2/7]
- [X] Implement builders for graphs
- [X] Find sets of the first terminals for each non-terminal, in the
grammar module.
- [-] NFA [4/5]
+ [X] Add regular expression into NFA.
+ [X] Convert a set of regular expressions into a nondeterministic
finite automaton, where non-terminals denote the indices of
regular expressions in the set to substitute.
+ [X] Convert a grammar into its grammar of left-linear closures of
non-temrinals and their derivatives.
+ [X] Convert nondeterministic finite automata to their null
closures.
+ [ ] Test more grammars to ensure it works correctly.
- [ ] Define the Atom trait.
- [ ] Implement virtual nodes for each derivative of each atom: The
lack of this step might be the root cause of the failure of the
previous version of this package.
- [ ] Implement languages. [0/2]
+ [ ] First implement them as doubly labelled (directed acyclic)
graphs.
+ [ ] Implement finding derivatives.
- [ ] Implement semiring. [0/5]
+ [ ] Define the trait.
+ [ ] Implement the boolean semiring.
+ [ ] Implement the natural number semiring.
+ [ ] Implement the free semiring.
+ [ ] Compute and record a semiring value when computing
derivatives.
* Atomic Languages
This describes the behaviours of atomic languages. The atomic
language consists of the null closure of any non-terminal symbol in
the grammar, and their deriavtives by terminals and non-terminal.
* Script for creating GIF animation
[[https://gist.github.com/maelvls/5379127][a gist]]
* Derivative Languages
This is the main driving device of the algorithm. Basically, the
algorithm works by taking successive derivatives, according to the
input symbol. At each step, we calculate the derivative language. In
this process, we also compute some semiring value and store in a
carrier. The end result of the algorithm is the final semiring
value.
If one wants simply to determine if the input string belongs to the
grammar, one chooses the semiring to be the field with two elements,
the boolean field. If one wants to find how many ways are there to
derive a given input string, then one uses the semiring of natural
numbers instead. If one wants, moreover, to find all the possible
ways to derive a particular input string, then one can use the
free semiring on the set of terminals and non-terminals of the
grammar. Here the free semiring is the left-adjoint functor to the
forgetful functor from the category of semirings to the category of
sets.
To be more specific, the free semiring on a set is given by sets of
sequences of elements in the set. The addition of the semiring is the
set union operation, and the multiplication is taking the respective
concatenations.
** Semirings
So we need a module to define the behaviours of semirings, and provide
some common semirings implementations. Then in the main driving force
we can freely substitute different semirings, according to the
particular needs.
** Languages
Then the main part is to define the behaviour of languages. This
should be easy enough, since we already have the mechanism of graphs,
nondeterministic automata, and semirings. All we need to do is to
combine them together.
* Testing ground
I am in a strong need to test things out. The most important one is
to visualize each step of the derivation, in a human-friendly manner.
I need this to examine whether my atomic languages are wrongly
implemented, or my atomic languages are wrongly derived, or my
understanding of the main algorithm is plain wrong.
This is the main reason I started this rewrite of the package.
|