diff options
Diffstat (limited to 'src/bsr.h')
-rw-r--r-- | src/bsr.h | 38 |
1 files changed, 34 insertions, 4 deletions
@@ -20,7 +20,7 @@ BSR has already been added to a set. I guess the reason is that this is not the bottleneck of performance in its applications. But I am not satisfied with that solution. I want to take advantage of - hash-table's almost constant time look-up feature to implement + hash-table's almost constant-time look-up feature to implement this. */ /* A BSR set has two types. @@ -53,8 +53,13 @@ each type. Each set is an array of arrays, where the root array's indices correspond to non-terminals, and the leaf arrays correspond to rules of that non-terminal. Then the rest is implemented - through hash-tables, where a hash table's values are yet another - hash table, and so on. + through hash-tables, whose keys are tuples of integers. For the + first type, the keys are the (i, k) pairs and the values are those + j. For the second, the keys are the (b, i, k) tupels and the + values are those j. The reason to use i and k as keys is that they + represent the left and the right extent of the BSR respectively, so + we index them in order to find them easily. In addition, this will + be useful if we want to extract SPPF out of a BSR set. I don't know if there are better implementations. If I think of better ways in the future, I will re-implement the BSR sets. */ @@ -70,7 +75,32 @@ (X, a, i, j, k), i.e. as a first-type BSR set. But the meaning of the "i" is - different. */ + different. + + However, we shall notice that there is a natural "grading" on the + process descriptors. The k-th grading consists of the descriptors + with the last element in the tuple equal to k. In the clustered + nonterminal parsing algorithm, the process descriptors that are + added as a consequence of dealing with a process descriptor of + grading k must be of grade greater than or equal to k. This means + that if all process descriptors of grade k have been processed and + the descriptors awaiting to be processed all have grades > k, then + we won't see any process descriptors of grade <= k again. So we + can actually keep a list of process descriptors that serve the role + of the set U_k in the afore-mentionned paper. This list + corresponds to the grading of the descriptors. This way we don't + need to store the k in the descriptor. Therefore a process + descriptor is represented as + + (X, a, i, j). + + Again we store an array of arrays corresponding to X and a. Thus + the hash table simply has keys i and values j. + + The main benefit of this is that after we find that all process + descriptors in R_k have been processed, we can free those + descriptors in U_k. I think this is a good way to lower the space + requirements of this algorithm. */ int place(int x); |