diff options
author | JSDurand <mmemmew@gmail.com> | 2023-01-11 23:47:26 +0800 |
---|---|---|
committer | JSDurand <mmemmew@gmail.com> | 2023-01-11 23:47:26 +0800 |
commit | 1a3d346f413325ed37848a6b2526e8e729269833 (patch) | |
tree | ab8812f8094d096c68aee53cf516e986cc9a273a /chain/src/plan.org | |
parent | f27d604d93ce583d4404e1874664e08382ea2f00 (diff) |
Record left-linear expansion and forest format
Now the grammar will record the left-linear expansions when generating
the nondeterministic finite automaton frmo its rules, and will record
whether an edge in the nondeterministic finite automaton comes from a
left-linear expansion. The latter is needed because while performing
a chain-rule derivation, we do not need the left-linear expanded
derivations in the "first layer". This might well have been the root
cause of the bad performance of the previous version of this package.
Also I have figured out how to properly generate and handle parse
forests while manipulating the "chain-rule machine".
Diffstat (limited to 'chain/src/plan.org')
-rw-r--r-- | chain/src/plan.org | 52 |
1 files changed, 25 insertions, 27 deletions
diff --git a/chain/src/plan.org b/chain/src/plan.org index bbd6683..b708413 100644 --- a/chain/src/plan.org +++ b/chain/src/plan.org @@ -2,7 +2,7 @@ #+AUTHOR: Durand #+DATE: <2022-11-18 Ven 19:57> -* Things to do [5/10] +* Things to do [6/10] - [X] Implement builders for graphs - [X] Find sets of the first terminals for each non-terminal, in the @@ -38,18 +38,30 @@ finite automata. * [X] Test the removal of dead states, where a state is dead if and only if no other states have an edge into that state. -- [-] Refactor [2/8] +- [X] Refactor [7/7] + [X] Implement a data type for graphs with labels on vertices and edges, but do not need to index edges by labels, which can index vertices by labels instead. * [X] Test it. + [X] Implement a builder that borrows a graph mutably. * [X] Test it. - + [ ] Implement a data type for graphs in which every node knows its + + [X] Implement a data type for graphs in which every node knows its parents, and every node has a unique label but edges have no labels. - * [ ] Test it. - + [ ] We need to record the expansions of those "virtual nodes". + * [X] Test it. + + [X] When we pull in some regular language because of the + left-linear expansion, we need to mark that branch as coming from + left-linear expansions. This is necessary because we cannot + follow a left-linearly expanded branch when we take the derivative + of a language. We only expand left-linearly when we try to access + the atomic languages [s]^{(t)}. + + [X] An edge of the NFA should carry a label that can be more + informative than solely a terminal or a non-terminal. + + [X] Add a mechanism for a grammar to attach labels to edges of NFA + which are not necessarily Copy-able, and which should be stored in a + separate array, such that the labels on edges of NFA point to the + elements of the array. + + [X] We need to record the expansions of those "virtual nodes". That is, the virtual nodes represent atomic languages such as [s]^{(t)} where s is a non-terminal and t is a terminal. To be more specific, it represents the derivative of the left-linear closure @@ -59,32 +71,21 @@ alone, without consuming any inputs, by expanding according to the grammar rules. This means that we need to know which non-terminals were expanded in order to get to a state in [s]^{(t)}. - + [ ] Each edge in the chain-rule machine needs to be labelled also - with a position in the forest. This perfectly solves the problem - of those "plugs"! - + [ ] When we pull in some regular language because of the - left-linear expansion, we need to mark that branch as coming from - left-linear expansions. This is necessary because we cannot - follow a left-linearly expanded branch when we take the derivative - of a language. We only expand left-linearly when we try to access - the atomic languages [s]^{(t)}. We can mark by returning a set of - nodes which are the beginnings of left-linearly expanded branches. - + [ ] An edge of the NFA should carry a label that can be more - informative than solely a terminal or a non-terminal. - + [ ] Add a mechanism for a grammar to attach labels to edges of NFA - which are not necessarily Copy-able, and which should be stored in a - separate array, such that the labels on edges of NFA point to the - elements of the array. + * [X] Test + * [X] Test more - [X] Atom [3/3] + [X] Define the Atom trait. + [X] Implement virtual nodes for each derivative of each atom: The lack of this step might be the root cause of the failure of the previous version of this project. + [X] Test atoms -- [-] Implement languages. [1/3] +- [-] Implement languages. [1/4] + [X] First define a trait with the expected behaviours. + [ ] Then implement them as doubly labelled graphs. - + [ ] Thenimplement finding derivatives by use of the chain rule. + + [ ] Then implement finding derivatives by use of the chain rule. + + [ ] Each edge in the chain-rule machine needs to be labelled also + with a position in the forest. This perfectly solves the problem + of those "plugs"! - [-] Implement forests [2/3] + [X] Design a format of forests. This should be the most crucial thing to do, in order to have a better understanding of the whole @@ -92,10 +93,7 @@ the binarised shared packed parsed forests, that reflects the regular expressions in the grammar equations. + [X] Define a trait with the expected behaviours. - + [-] Implement them using adjacency map: we only need one label per - edge, and we do not wish to exclude duplicate edges, and we do not - need to index edges by the labels. All in all, we implement them - using a vector of hashmaps. + + [-] Implement them as parents-knowing graphs. - [ ] Implement semiring. [0/5] + [ ] Define the trait. + [ ] Implement the boolean semiring. |