summaryrefslogtreecommitdiff
path: root/chain/src/atom/default.rs
AgeCommit message (Collapse)Author
2023-07-19chain/atom/default: Make `print_nullables` public.JSDurand
* chain/src/atom/default.rs: Making this function public means I do not have to worry about it being unused.
2023-07-18chain/src/atom/default.rs: Add function to print nullablesJSDurand
* chain/src/atom/default.rs (print_nullables): This functions prints the nullables nodes of an atomic language. This is useful whe designing unit tests, as it enables us to know which rule positions are considered accepting by the underlying testing atomic language.
2023-06-18fixed the bugs of node duplications and left-open nodesJSDurand
There were two main issues in the previous version. One is that there are lots of duplications of nodes when manipulating the forest. This does not mean that labels repeat: by the use of the data type this cannot happen. What happened is that there were cloned nodes whose children are exactly equal. In this case there is no need to clone that node in the first place. This is now fixed by checking carefully before cloning, so that we do not clone unnecessary nodes. The other issue, which is perhaps more important, is that there are nodes which are not closed. This means that when there should be a reuction of grammar rules, the forest does not mark the corresponding node as already reduced. The incorrect forests thus caused is hard to fix: I tried several different approaches to fix it afterwards, but all to no avail. I also tried to record enough information to fix these nodes during the manipulations. It turned out that recording nodes is a dead end, as I cannot properly syncronize the information in the forest and the information in the chain-rule machine. Any inconsistencies will result in incorrect operations later on. The approach I finally adapt is to perform every possible reduction at each step. This might lead to some more nodes than what we need. But those are technically expected to be there after all, and it is easy to filter them out, so it is fine, from my point of view at the moment. Therefore, what remains is to filter those nodes out and connect it to the holy Emacs. :D
2023-06-02Fix a bug of duplication from planting after sploingJSDurand
I should have staged and committed these changes separately, but I am too lazy to deal with that. The main changes in this commit are that I added the derive macro that automates the delegation of the Graph trait. This saves a lot of boiler-plate codes. The second main change, perhaps the most important one, is that I found and tried to fix a bug that caused duplication of nodes. The bug arises from splitting or cloning a node multiple times, and immediately planting the same fragment under the new "sploned" node. That is, when we try to splone the node again, we found that we need to splone, because the node that was created by the same sploning process now has a different label because of the planting of the fragment. Then after the sploning, we plant the fragment again. This makes the newly sploned node have the same label (except for the clone index) and the same children as the node that was sploned and planted in the previous rounds. The fix is to check for the existence of a node that has the same set of children as the about-to-be-sploned node, except for the last one, which contains the about-to-be-planted fragment as a prefix. If that is the case, treat it as an already existing node, so that we do not have to splone the node again. This is consistent with the principle to not create what we do not need.
2023-02-27before a major refactorJSDurand
I decide to adopt a new approach of recording and updating item derivation forests. Since this affects a lot of things, I decide to commit before the refactor, so that I can create a branch for that refactor.
2023-02-12Added the functionality of split or clone.JSDurand
I need more than the ability to clone nodes: I also need to split the nodes. Now this seems to be correctly added.
2023-02-03Finally produced the first correct forestJSDurand
Finally the prototype parser has produced the first correct forest. It is my first time to generate a correct forest, in fact, ever since the beginning of this project.
2023-01-22forest: clone correctlyJSDurand
Now the forest can detect if a node is packed or cloned, and correctly clones a node in those circumstances. But it still needs to be tested.
2023-01-20minor refactoringJSDurand
It seems the performance is indeed linear for a simple grammar. This is such a historical moment, for me, that I think it deserves a separate commit, haha.
2023-01-20chain: a prototype is added.JSDurand
I have an ostensibly working prototype now. Further tests are needed to make sure that the algorithm meets the time complexity requirement, though.
2023-01-11Record left-linear expansion and forest formatJSDurand
Now the grammar will record the left-linear expansions when generating the nondeterministic finite automaton frmo its rules, and will record whether an edge in the nondeterministic finite automaton comes from a left-linear expansion. The latter is needed because while performing a chain-rule derivation, we do not need the left-linear expanded derivations in the "first layer". This might well have been the root cause of the bad performance of the previous version of this package. Also I have figured out how to properly generate and handle parse forests while manipulating the "chain-rule machine".
2023-01-03structural change: separate crates outJSDurand
I put functionalities that are not strictly core to separate crates, so that the whole package becomes more modular, and makes it easier to try other parsing algorithms in the future. Also I have to figure the forests out before finishing the core chain-rule algorithm, as the part about forests affects the labels of the grammars directly. From my experiences in writing the previous version, it is asking for trouble to change the labels type dramatically at a later point: too many places need to be changed. Thus I decide to figure the rough part of forests out. Actually I only have to figure out how to attach forests fragments to edges of the underlying atomic languages, and the more complex parts of putting forests together can be left to the recorders, which is my vision of assembling semi-ring values during the chain-rule machine. It should be relatively easy to produce forests fragments from grammars since we are just trying to extract some information from the grammar, not to manipulate those information in some complicated way. We have to do some manipulations in the process, though, in order to make sure that the nulling and epsilon-removal processes do not invalidate these fragments.