1 files changed, 135 insertions, 19 deletions
diff --git a/chain/src/plan.org b/chain/src/plan.org
index 8c63492..61738a9 100644
--- a/chain/src/plan.org
+++ b/chain/src/plan.org
@@ -2,29 +2,73 @@
 #+AUTHOR: Durand
 #+DATE: <2022-11-18 Ven 19:57>
 
-* Things to do [2/7]
+* Things to do [5/10]
 
 - [X] Implement builders for graphs
 - [X] Find sets of the first terminals for each non-terminal, in the
   grammar module.
-- [-] NFA [4/5]
+- [-] Establish a testing framework of various grammars. [5/6]
+  + [X] One simple grammar
+  + [X] Notes
+  + [X] Parentheses
+  + [X] Pathelogical left-recursive
+  + [X] Pathelogical right-recursive
+  + [ ] Pathelogically ambiguous
+    # More should be added
+- [X] NFA [4/4]
   + [X] Add regular expression into NFA.
   + [X] Convert a set of regular expressions into a nondeterministic
     finite automaton, where non-terminals denote the indices of
     regular expressions in the set to substitute.
-  + [X] Convert a grammar into its grammar of left-linear closures of
-    non-temrinals and their derivatives.
-  + [X] Convert nondeterministic finite automata to their null
-    closures.
-  + [ ] Test more grammars to ensure it works correctly.
-- [ ] Define the Atom trait.
-- [ ] Implement virtual nodes for each derivative of each atom: The
-  lack of this step might be the root cause of the failure of the
-  previous version of this package.
-- [ ] Implement languages. [0/2]
-  + [ ] First implement them as doubly labelled (directed acyclic)
-    graphs.
-  + [ ] Implement finding derivatives.
+  + [X] Convert a grammar to the language of atomic languages. [2/2]
+    * [X] Convert a grammar into its grammar of left-linear closures
+      of non-temrinals and their derivatives.
+    * [X] Convert nondeterministic finite automata to their null
+      closures.
+  + [X] Test more grammars to ensure it works correctly. [5/5]
+    * [X] Test the conversion to left-closure as a set of regular
+      expressions.
+    * [X] Test the conversion of a set of regular expressions into a
+      nondeterministic finite automaton.
+    * [X] Test the closure of null edges, where the nullity of an edge
+      is defined by a function.  In our specific case, an edge is null
+      if the grammar symbol that edge represents, either a terminal or
+      a non-terminal, can match an empty string.
+    * [X] Test the removal of empty transitions from nondeterministic
+      finite automata.
+    * [X] Test the removal of dead states, where a state is dead if
+      and only if no other states have an edge into that state.
+- [ ] Refactor [0/3]
+  + [ ] When we pull in some regular language because of the
+    left-linear expansion, we need to mark that branch as coming from
+    left-linear expansions.  This is necessary because we cannot
+    follow a left-linearly expanded branch when we take the derivative
+    of a language.  We only expand left-linearly when we try to access
+    the atomic languages [s]^{(t)}.  We can mark by returning a set of
+    nodes which are the beginnings of left-linearly expanded branches.
+  + [ ] An edge of the NFA should carry a label that can be more
+    informative than solely a terminal or a non-terminal.
+  + [ ] Add a mechanism for a grammar to attach labels to edges of NFA
+    which are not necessarily Copy-able, and which should be stored in a
+    separate array, such that the labels on edges of NFA point to the
+    elements of the array.
+- [X] Atom [3/3]
+  + [X] Define the Atom trait.
+  + [X] Implement virtual nodes for each derivative of each atom: The
+    lack of this step might be the root cause of the failure of the
+    previous version of this project.
+  + [X] Test atoms
+- [-] Implement languages. [1/3]
+  + [X] First define a trait with the expected behaviours.
+  + [ ] Then implement them as doubly labelled graphs.
+  + [ ] Thenimplement finding derivatives by use of the chain rule.
+- [-] Implement forests [0/2]
+  + [-] Define a trait with the expected behaviours.
+  + [ ] Implement them using adjacency list: we only need one label
+    per edge, and we do not wish to exclude duplicate edges, and we do
+    not need to index edges by the labels.  All in all, the labels can
+    be implemented by a simple hash map that maps edges to labels, so
+    a special data type is not needed here.
 - [ ] Implement semiring. [0/5]
   + [ ] Define the trait.
   + [ ] Implement the boolean semiring.
@@ -32,6 +76,9 @@
   + [ ] Implement the free semiring.
   + [ ] Compute and record a semiring value when computing
     derivatives.
+- [X] Miscellaneous [1/1]
+  + [X] Implement a function for NFA that checks if a given predicate
+    is satisfied for each edge label.
 
 * Atomic Languages
 
@@ -39,10 +86,6 @@ This describes the behaviours of atomic languages.  The atomic
 language consists of the null closure of any non-terminal symbol in
 the grammar, and their deriavtives by terminals and non-terminal.
 
-* Script for creating GIF animation
-
-[[https://gist.github.com/maelvls/5379127][a gist]]
-
 * Derivative Languages
 
 This is the main driving device of the algorithm.  Basically, the
@@ -82,6 +125,61 @@ should be easy enough, since we already have the mechanism of graphs,
 nondeterministic automata, and semirings.  All we need to do is to
 combine them together.
 
+* Lexers
+
+I got this idea from [[https://lists.gnu.org/archive/html/emacs-devel/2022-12/msg01127.html][a discussion on emacs-devel]].  The idea can be
+formulated as
+
+#+begin_quote
+Potentially it allows invocation of a parser with different variants
+of lexers - one mode with block tokens for the exploration of
+project's structure, and another mode for indentation and error
+checking purposes.
+#+end_quote
+
+I think this is the "right" use of lexers.  That is to say, the parser
+by default should operate on an abstract type of tokens, which are but
+unsigned integers, and rely on the user application to provide a lexer
+for turning the actual input, like a string, into an iterator of
+tokens.  The lexer can choose to do any sort of preprocessing that it
+sees fit, or it can do nothing and just turn characters to unsigned
+integers.  In the latter case, this parser generator works on the
+character level.  But, when one neeeds, one can instruct the parser
+generator to work on the level of preprocessed tokens easily.
+
+This has the advantage of allowing a certain degree of flexibility in
+the type of tokens accepted, while keeping the implementation simple
+at the same time: the internal core only has to deal with unsigned
+integers, after all.
+
+* Library for drawing graphs
+
+In the past, I thought that drawing graphs is only a development tool
+for my package, so I can rely on such external tools as GraphViz, as
+that would not be needed for my end-users.
+
+But recently I realized that this is not a need only for the
+development of the package, but also for the users as well.  To be
+more specific, the target users are those who want to "play" with
+grammars.  So if one cannot view the resulting parse forest diagrams
+easily and interactively, it would be hard to "play" with grammars.
+
+Moreover, the internal workings of the parsing machine might also be
+interesting for users to inspect, whether to debug the program or to
+know more under the hood.  As such it is required to view the diagrams
+of the internal directed acyclic graphs.
+
+For all these reasons, I know regard the ability to draw graphs and to
+interact with those graphs is essential to the user experience of this
+package.  As a consequence, I would like to develop a small default
+library for this purpose.  I follow the convention adapted by this
+package here as well, which is to provide a default implementation for
+the functionality, while still leaving the room for users to
+substitute with other external packages, if so desired, for whatever
+reason.  For example, an external package may provide an unprecedented
+performance for drawing graphs in Emacs.  If such a package appears, I
+can easily avail of that external package to draw graphs.
+
 * Testing ground
 
 I am in a strong need to test things out.  The most important one is
@@ -92,3 +190,21 @@ understanding of the main algorithm is plain wrong.
 
 This is the main reason I started this rewrite of the package.
 
+* To figure out
+
+I need to figure out how to construct the parse forest from a
+left-most null-closed derivation graph.
+
+Reading through the original paper again, I found that the author
+augmented the atomic languages with annotations of partial derivation
+steps on each edge of the nondeterministic finite automaton.  In
+brief, this is what I tried to achieve with some obscure constructs
+such as expanding reduction informations carried by the
+nondeterministic finite automata, in one of the previous versions.  To
+sum up, I really need to carry those informations in the first place,
+otherwise the parse forests cannot be reliably construced later on.
+
+But I really do not want to adopt the approach of the original author.
+I still prefer the idea of a recorder.  That is to say, instead of
+constructing the parse forest after the recognition, we shall
+construct the parse forest simultaneously while we are recognizing.