#ifndef BSR_H #define BSR_H /* This file implements the "Binary Subtree Representation" of a derivation forest. See the following paper for the miraculous details: Derivation representation using binary subtree sets by Elizabeth Scott, Adrian Johnstone and L.Thomas van Binsbergen. */ /* Note that there is an implementation of this GLL algorithm in the following GitHub repository: https://github.com/goccmack/gogll But the implementation there uses array look-up to determine if a BSR has already been added to a set. I guess the reason is that this is not the bottleneck of performance in its applications. But I am not satisfied with that solution. I want to take advantage of hash-table's almost constant time look-up feature to implement this. */ /* A BSR set has two types. The first type is of the form (X := alpha, i, j, k), where X := alpha is a grammar rule. This is equivalent with a 5-tuple (X, a, i, j, k), where X is the integer corresponding to X and a is the index of the rule X := alpha. The second type is of the form (X := beta . gamma, i, j, k), where beta is a prefix of the rule X := beta gamma. So this is equivalent with a 6-tuple (X, a, b, i, j, k), where a is the index of the rule X := beta gamma and b is the length of beta. The simplest way to implement this is to have two "sets", one for each type. Each set is an array of arrays, where the root array's indices correspond to non-terminals, and the leaf arrays correspond to rules of that non-terminal. Then the rest is implemented through hash-tables, where a hash table's values are yet another hash table, and so on. I don't know if there are better implementations. If I think of better ways in the future, I will re-implement the BSR sets. */ /* Note that we also need to implement "process descriptors" for the GLL parser. Since a process descriptor is of the form (X := beta . gamma, i, j, k), where X := beta gamma is a rule and i is the length of beta, we can represent a process descriptor as (X, a, i, j, k), i.e. as a first-type BSR set. But the meaning of the "i" is different. */ int place(int x); #endif