author    | JSDurand <mmemmew@gmail.com> | 2023-06-18 15:03:34 +0800
committer | JSDurand <mmemmew@gmail.com> | 2023-06-18 15:03:34 +0800
commit    | a80db17473ff09cc72acba2c1975101e6dbedf39 (patch)
tree      | 4d5dfcdfcd1537b51b92349d9a5274aa90a6d110 /chain/src/archive.txt
parent    | 6ce44bb3bdb79e8e727ee6fc7f5c6cd7fa0bb51e (diff)
fixed the bugs of node duplication and left-open nodes
There were two main issues in the previous version.

The first is that the forest accumulated many duplicated nodes during
manipulation. This does not mean that labels repeated: the data type
rules that out. What happened is that nodes were cloned even though a
clone with exactly the same children already existed, in which case
there was no need to clone in the first place. This is now fixed by
checking carefully before cloning, so that we do not create
unnecessary clones.
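
To make the check concrete, here is a minimal, self-contained sketch of
the idea in Rust. The `Forest` type, its fields, and
`clone_node_checked` are hypothetical stand-ins invented for this
illustration; they are not the actual types or methods of this crate:

```rust
use std::collections::HashMap;

/// A toy forest: just enough structure to illustrate the check.
struct Forest {
    /// `children[n]` lists the children of node `n`.
    children: Vec<Vec<usize>>,
    /// Maps a node to all of its known clones (including itself).
    clones: HashMap<usize, Vec<usize>>,
}

impl Forest {
    /// Return a version of `node` whose children are exactly `wanted`,
    /// cloning only when no equivalent version exists yet.
    fn clone_node_checked(&mut self, node: usize, wanted: &[usize]) -> usize {
        // Check carefully before cloning: reuse the node itself, or any
        // existing clone whose children already match.
        let candidates = self
            .clones
            .get(&node)
            .cloned()
            .unwrap_or_else(|| vec![node]);

        for c in candidates {
            if self.children[c] == wanted {
                return c;
            }
        }

        // Only now is a genuinely new clone warranted.
        let new = self.children.len();
        self.children.push(wanted.to_vec());
        self.clones.entry(node).or_insert_with(|| vec![node]).push(new);
        new
    }
}

fn main() {
    let mut forest = Forest {
        children: vec![vec![1, 2], vec![], vec![]],
        clones: HashMap::new(),
    };

    // Both requests resolve to node 0 itself: no duplicate is created.
    assert_eq!(forest.clone_node_checked(0, &[1, 2]), 0);
    assert_eq!(forest.clone_node_checked(0, &[1, 2]), 0);

    // A genuinely different child list does warrant a clone.
    assert_eq!(forest.clone_node_checked(0, &[1]), 3);
}
```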
The other issue, which is perhaps more important, is that some nodes
were left open. That is, when a reduction of a grammar rule should
have occurred, the forest failed to mark the corresponding node as
reduced. The incorrect forests thus produced were hard to fix: I
tried several different approaches to repair them afterwards, all to
no avail. I also tried to record enough information to fix these
nodes during the manipulations, but recording nodes turned out to be
a dead end, as I could not properly synchronize the information in
the forest with the information in the chain-rule machine, and any
inconsistency results in incorrect operations later on.
The approach I finally adopted is to perform every possible reduction
at each step. This might produce more nodes than we actually need,
but those nodes are technically expected to be there anyway, and they
are easy to filter out, so this seems fine to me for the moment.
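
The following self-contained sketch illustrates this strategy on a toy
representation. `Node`, `reduce_all`, and `filter_surplus` are
simplified inventions for the example, far cruder than the real forest
operations: every node that can be closed is closed eagerly, and the
surplus is filtered afterwards:

```rust
/// A toy node: children plus an optional ending position; a node is
/// considered reduced ("closed") once its end is known.
#[derive(Debug, Clone)]
struct Node {
    children: Vec<usize>,
    end: Option<usize>,
}

/// Perform every possible reduction at position `pos`: keep closing
/// nodes whose last child is closed, until nothing changes.  This may
/// well close more nodes than a single parse needs.
fn reduce_all(nodes: &mut [Node], pos: usize) {
    let mut changed = true;
    while changed {
        changed = false;
        for i in 0..nodes.len() {
            if nodes[i].end.is_none() {
                let closable = match nodes[i].children.last() {
                    Some(&c) => nodes[c].end.is_some(),
                    None => true, // leaves can always be closed
                };
                if closable {
                    nodes[i].end = Some(pos);
                    changed = true;
                }
            }
        }
    }
}

/// The filtering step: keep only closed nodes reachable from the root.
fn filter_surplus(nodes: &[Node], root: usize) -> Vec<usize> {
    let mut kept = Vec::new();
    let mut stack = vec![root];
    while let Some(n) = stack.pop() {
        if nodes[n].end.is_some() && !kept.contains(&n) {
            kept.push(n);
            stack.extend(nodes[n].children.iter().copied());
        }
    }
    kept
}

fn main() {
    // root 0 -> {1, 2}; 1 -> {3}; 2 and 3 are leaves.
    let mut nodes = vec![
        Node { children: vec![1, 2], end: None },
        Node { children: vec![3], end: None },
        Node { children: vec![], end: None },
        Node { children: vec![], end: None },
    ];

    reduce_all(&mut nodes, 7);
    assert!(nodes.iter().all(|n| n.end.is_some()));
    assert_eq!(filter_surplus(&nodes, 0).len(), 4);
}
```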
Therefore, what remains is to filter those nodes out and connect the
result to the holy Emacs. :D
Diffstat (limited to 'chain/src/archive.txt')
-rw-r--r-- | chain/src/archive.txt | 495
1 file changed, 495 insertions, 0 deletions
diff --git a/chain/src/archive.txt b/chain/src/archive.txt
new file mode 100644
index 0000000..030ccba
--- /dev/null
+++ b/chain/src/archive.txt
@@ -0,0 +1,495 @@
+// Temporary archive
+
+// let mut seen_nodes: HashSet<usize> = HashSet::with_capacity(stack.len());
+//
+// // dbg!(&stack);
+//
+// // self.forest.print_viz("dbg forest.gv").unwrap();
+//
+// while let Some(mut top) = stack.pop() {
+//     if seen_nodes.contains(&top) {
+//         continue;
+//     }
+//
+//     seen_nodes.insert(top);
+//
+//     let mut top_label = self
+//         .forest
+//         .vertex_label(top)?
+//         .ok_or(Error::NodeNoLabel(top))?;
+//
+//     if !top_label.is_packed()
+//         && matches!(top_label.label().label().tnt(), Some(TNT::Non(_)))
+//         && top_label.label().end().is_none()
+//     {
+//         let degree = self.forest.degree(top)?;
+//         let last_index = if degree > 0 { degree - 1 } else { 0 };
+//
+//         let pos = if degree > 0 {
+//             let last_child = self.forest.nth_child(top, last_index)?;
+//             let last_child_label = self
+//                 .forest
+//                 .vertex_label(last_child)?
+//                 .ok_or(Error::NodeNoLabel(last_child))?
+//                 .label();
+//             let last_child_end = last_child_label.end();
+//
+//             if !matches!(last_child_label.label().rule(),
+//                 Some(rule) if
+//                     self
+//                         .atom
+//                         .is_accepting(2*rule+1)
+//                         .map_err(index_out_of_bounds_conversion)?)
+//             {
+//                 continue;
+//             }
+//
+//             if let Some(pos) = last_child_end {
+//                 pos
+//             } else {
+//                 stack.extend(self.forest.children_of(last_child)?);
+//                 continue;
+//             }
+//         } else {
+//             top_label.label().start() + 1
+//         };
+//
+//         top = self.forest.splone(top, Some(pos), last_index, true)?;
+//         top_label = self
+//             .forest
+//             .vertex_label(top)?
+//             .ok_or(Error::NodeNoLabel(top))?;
+//     } else if top_label.is_packed()
+//         || top_label.label().label().rule().is_some() && top_label.label().end().is_none()
+//     {
+//         stack.extend(self.forest.children_of(top)?);
+//     }
+//
+//     if top_label.clone_index().is_some() {
+//         let mut parents = self.forest.parents_of(top)?;
+//         if parents.len() != 1 {
+//             dbg!(top, top_label, parents.len());
+//             self.forest.print_viz("dbg forest.gv").unwrap();
+//             panic!("0 parent?");
+//         }
+//
+//         top = parents.next().unwrap().node();
+//     }
+//
+//     stack.extend(self.forest.parents_of(top)?.map(|parent| parent.node()));
+// }
+
+
+// extra reduction archive function
+
+// /// Perform extra reductions.
+// ///
+// /// To be precise, this function first splones the bottom node,
+// /// and then queries the reducer to find the next nodes to splone,
+// /// and repeats this process until either we find the top node, or
+// /// the reducer says to stop.
+// ///
+// /// One nuance to note is that after we splone a node, if we split
+// /// that node, then the splitted node will have a cloned parent,
+// /// see
+// /// [`split_node`][DefaultForest::<ForestLabel<GrammarLabel>>::split_node]
+// /// for details.  So for each parent pointed out by the reducer,
+// /// we need to find its clone that has the splitted node as a
+// /// child.
+// #[allow(dead_code)]
+// fn extra_reductions(
+//     &mut self,
+//     botop: BoTop,
+//     pos: usize,
+//     reducer: &Reducer,
+//     atom: &DefaultAtom,
+// ) -> Result<Vec<usize>, Error> {
+//     if botop.bottom() == botop.top() {
+//         return Ok(vec![self.node_from_pavi(botop.top())?]);
+//     }
+//
+//     let top = botop.top();
+//     let bottom = botop.bottom();
+//
+//     let bottom_closed = self.close_pavi(atom.borrow(), bottom, pos)?;
+//
+//     let top_node = self.node_from_pavi(top)?;
+//     let bottom_node = self.node_from_pavi(bottom)?;
+//
+//     let mut stack = vec![(bottom_closed, bottom_node)];
+//
+//     // Exclude duplicate nodes to ensure that no infinite
+//     // recursion occurs.  In the future I shall experiment if this
+//     // is necessary, and get rid of this if possible.
+//     // let mut seen_nodes: HashSet<usize> = Default::default();
+//
+//     let mut result = Vec::new();
+//
+//     // NOTE: We must query the reducer by the original nodes,
+//     // otherwise the reducer has no chance to know about the new
+//     // nodes that were created by a previous iteration here.  At
+//     // the same time, we must splone a clone of the queried result
+//     // which is the parent of the previously sploned node, so that
+//     // we are on the right track.
+//     //
+//     // A summary is that we record both the original node and the
+//     // sploned node in the stack, so that we can query by the
+//     // original node, and then find the correct clone of the
+//     // queried result by means of the sploned node.
+//
+//     let mut seen_nodes = HashSet::new();
+//
+//     while let Some((bottom_sploned, bottom_original)) = stack.pop() {
+//         if seen_nodes.contains(&bottom_original) {
+//             continue;
+//         }
+//
+//         seen_nodes.insert(bottom_original);
+//
+//         if pos == 4 {
+//             println!("stack popped: {bottom_sploned}, {bottom_original}");
+//         }
+//
+//         let query_result = reducer.query(bottom_original, top_node);
+//
+//         let mut sploned = bottom_sploned;
+//
+//         let mut label_set: HashSet<GrammarLabel> = HashSet::with_capacity(query_result.len());
+//
+//         if query_result.len() != 0 {
+//             if pos == 4 {
+//                 println!("reducer:");
+//             }
+//
+//             for node in query_result {
+//                 if pos == 4 {
+//                     println!("{node}");
+//                 }
+//
+//                 label_set.insert(
+//                     self.vertex_label(node)?
+//                         .ok_or(Error::NodeNoLabel(node))?
+//                         .label(),
+//                 );
+//             }
+//
+//             let sploned_label = self
+//                 .vertex_label(sploned)?
+//                 .ok_or(Error::NodeNoLabel(sploned))?;
+//
+//             if sploned_label.clone_index().is_some() {
+//                 let mut parents = self.parents_of(sploned)?;
+//
+//                 #[cfg(debug_assertions)]
+//                 if parents.len() != 1 {
+//                     panic!("assumption fails");
+//                 }
+//
+//                 sploned = parents.next().unwrap().node();
+//             }
+//
+//             // NOTE: We cannot borrow self mutably and immutably
+//             // at the same time, so we have to collect the nodes
+//             // first.
+//             let mut to_splone = Vec::with_capacity(label_set.len());
+//
+//             for rule_parent in self.parents_of(sploned)? {
+//                 if self.degree(rule_parent.node())? != 1 {
+//                     panic!("assumption fails");
+//                 }
+//
+//                 for parent in self.parents_of(rule_parent.node())? {
+//                     let parent_node = parent.node();
+//
+//                     let parent_label = self
+//                         .vertex_label(parent_node)?
+//                         .ok_or(Error::NodeNoLabel(parent_node))?
+//                         .label();
+//
+//                     if label_set.contains(&parent_label) {
+//                         to_splone.push(parent_node);
+//                     }
+//                 }
+//             }
+//
+//             for parent_node in to_splone {
+//                 let degree = self.degree(parent_node)?;
+//                 let last_index = std::cmp::max(degree, 1) - 1;
+//                 let parent_sploned = self.splone(parent_node, Some(pos), last_index, false)?;
+//
+//                 stack.push((parent_sploned, parent_node));
+//             }
+//         } else {
+//             result.push(bottom_sploned);
+//         }
+//     }
+//
+//     Ok(result)
+// }
+
+// BoTop is now unused
+
+// / TODO: Add a hashmap mapping tuples of forest positions (top,
+// / bottom) to vectors of tuples of unsigned integers.  Each tuple
+// / represents a non-terminal and a rule.  The tuple means that we need
+// / to close the expansion of the non-terminal by the rule position, if
+// / we want to move from bottom to top.
+// /
+// / After we move up one layer, we are now in a new forest position, so
+// / we just query again with the new bottom and top.  This means we do
+// / not need to clear the map.
+//
+// / REVIEW: Why is the above needed?
+//
+// // This represents a tuple of bottom and top forest positions.
+// //
+// // It is supposed to be used as keys to query the reduction
+// // information.  See the type [`Reducer`] for more details.
+// #[derive(Debug, Copy, Clone, Eq, PartialEq, Ord, PartialOrd, Hash, Default)]
+// pub(crate) struct BoTop {
+//     bottom: PaVi,
+//     top: PaVi,
+// }
+//
+// impl BoTop {
+//     #[allow(dead_code)]
+//     pub(crate) fn new(bottom: PaVi, top: PaVi) -> Self {
+//         Self { bottom, top }
+//     }
+//
+//     pub(crate) fn top(&self) -> PaVi {
+//         self.top
+//     }
+//
+//     pub(crate) fn bottom(&self) -> PaVi {
+//         self.bottom
+//     }
+// }
+
+
+// FIXME: Fix this outdated documentation.
+// This hashmap records information for performing extra reductions
+// when we insert items.
+
+// When we want to *jump* from the bottom to the top positions in the
+// forest, we need to perform a set of reductions, in order to jump
+// one layer upwards.  After one step, we query again for the next
+// step of reductions to perform.
+
+// # Reduction format
+
+// The value records a set of tuples of the form `(non-terminal,
+// starting_position)`.  Such a tuple means we want to reduce from
+// the expansion of the given `non-terminal`, which expands from the
+// given starting position.  We need to restrict the
+// pub(crate) type Reducer = Map<usize, HashSet<usize>>;
+
+// /// Check if every child already has an end.
+// fn every_child_is_completed(&self, node_id: usize, atom: &DefaultAtom) -> Result<bool, Error> {
+//     let children = self.children_of(node_id)?;
+
+//     if children.len() == 0 {
+//         return Ok(true);
+//     }
+
+//     let mut pos = self
+//         .vertex_label(node_id)?
+//         .ok_or(Error::NodeNoLabel(node_id))?
+//         .label()
+//         .start();
+
+//     let mut last_child_label = None;
+
+//     for child in children {
+//         let child_label = self
+//             .vertex_label(child)?
+//             .ok_or(Error::NodeNoLabel(child))?
+//             .label();
+
+//         last_child_label = Some(child_label);
+
+//         if child_label.start() != pos || child_label.end().is_none() {
+//             return Ok(false);
+//         }
+
+//         pos = child_label.end().unwrap();
+//     }
+
+//     if let Some(label) = last_child_label {
+//         if let Some(rule) = label.label().rule() {
+//             if !atom
+//                 .is_accepting(2 * rule + 1)
+//                 .map_err(|_| Error::IndexOutOfBounds(2 * rule + 1, atom.nodes_len()))?
+//             {
+//                 return Ok(false);
+//             }
+//         }
+//     }
+
+//     Ok(true)
+// }
+
+// /// Complete the forest by trying to set the ending position of
+// /// every node that does not have an end, by the ending position
+// /// of its last child.
+// pub fn complete(&mut self, atom: &DefaultAtom) -> Result<(), Error> {
+//     let mut stack: Vec<_> = self
+//         .nodes()
+//         .filter(|node| {
+//             let label = self.vertex_label(*node).unwrap().unwrap().label();
+
+//             label.label().rule().is_some() && label.end().is_some()
+//         })
+//         .collect();
+
+//     let mut second_stack: Vec<usize> = Vec::new();
+
+//     let mut pending_candidates: Vec<usize> = Vec::new();
+
+//     let nodes_len = self.nodes_len();
+
+//     let mut seen_nodes: HashSet<usize> = HashSet::with_capacity(nodes_len);
+
+//     while !stack.is_empty() {
+//         while let Some(mut top) = stack.pop() {
+//             if seen_nodes.contains(&top) {
+//                 continue;
+//             }
+
+//             seen_nodes.insert(top);
+
+//             let top_label = self.vertex_label(top)?.unwrap();
+
+//             if top_label.clone_index().is_some() {
+//                 let mut top_parents = self.parents_of(top)?;
+
+//                 if top_parents.len() != 1 {
+//                     return Err(Error::InvalidClone(top));
+//                 }
+
+//                 top = top_parents.next().unwrap().node();
+//             }
+
+//             let rule_parents = self.parents_of(top)?;
+
+//             let candidates = {
+//                 let mut result = Vec::with_capacity(rule_parents.len());
+
+//                 for parent in rule_parents {
+//                     // for parent in self.parents_of(parent.node())? {
+//                     //     if self.degree(parent.node())? <= parent.edge() + 1 {
+//                     result.push(parent);
+//                     //     }
+//                     // }
+//                 }
+
+//                 result
+//             };
+
+//             for candidate in candidates {
+//                 if matches!(self.vertex_label(candidate.node())?, Some(label) if label.label().end().is_none())
+//                 {
+//                     if self.every_child_is_completed(candidate.node(), atom)? {
+//                         let last_child = self
+//                             .nth_child(candidate.node(), self.degree(candidate.node())? - 1)?;
+//                         let end = self
+//                             .vertex_label(last_child)?
+//                             .ok_or(Error::NodeNoLabel(last_child))?
+//                             .label()
+//                             .end();
+
+//                         let sploned_node = self.splone(
+//                             candidate.node(),
+//                             end,
+//                             self.degree(candidate.node())? - 1,
+//                             true,
+//                         )?;
+
+//                         second_stack.push(sploned_node);
+//                     } else {
+//                         pending_candidates.push(candidate.node());
+//                     }
+//                 } else {
+//                     second_stack.push(candidate.node());
+//                 }
+//             }
+
+//             let mut new_pending = Vec::with_capacity(pending_candidates.len());
+
+//             while let Some(node) = pending_candidates.pop() {
+//                 if self.every_child_is_completed(node, atom)? {
+//                     let last_edge = self.degree(node)? - 1;
+//                     let last_child = self.nth_child(node, last_edge)?;
+//                     let end = self
+//                         .vertex_label(last_child)?
+//                         .ok_or(Error::NodeNoLabel(last_child))?
+//                         .label()
+//                         .end();
+
+//                     let sploned_node = self.splone(node, end, last_edge, true)?;
+
+//                     second_stack.push(sploned_node);
+//                 } else {
+//                     new_pending.push(node);
+//                 }
+//             }
+
+//             std::mem::swap(&mut pending_candidates, &mut new_pending);
+//         }
+
+//         std::mem::swap(&mut stack, &mut second_stack);
+//     }
+
+//     Ok(())
+// }
+
+// /// Unconditionally set the label of the node to be the new label.
+// ///
+// /// # Warning
+// ///
+// /// Use with caution: it does not do anything special, so it is
+// /// the responsibility of the caller to ensure this operation is
+// /// legal.
+// #[allow(dead_code)]
+// pub(crate) fn set_label(&mut self, node: usize, label: GrammarLabel) -> Result<(), Error> {
+//     let status = self
+//         .vertex_label(node)?
+//         .ok_or(Error::NodeNoLabel(node))?
+//         .status;
+//
+//     let label = ForestLabel::new(label, status);
+//
+//     let mut builder = PLGBuilderMut::from_graph_mut(&mut self.graph);
+//
+//     builder.set_label(node, label)?;
+//
+//     Ok(())
+// }
+
+// /// Extract the node from `pavi`.
+// ///
+// /// If pavi is a parent, this returns the child pointed to by the
+// /// represented edge.
+// ///
+// /// If pavi is a virtual node, this simply returns the virtual
+// /// node.
+// ///
+// /// If pavi is empty, this returns the root, unless the forest is
+// /// empty.
+// ///
+// /// # Guarantee
+// ///
+// /// The returned node is guaranteed to be a valid node.
+// pub(crate) fn node_from_pavi(&self, pavi: PaVi) -> Result<usize, Error> {
+//     match pavi {
+//         PaVi::Parent(_node, _edge, child) => Ok(child),
+//         PaVi::Virtual(_, _, node) => {
+//             if node >= self.nodes_len() {
+//                 return Err(Error::IndexOutOfBounds(node, self.nodes_len()));
+//             }
+//
+//             Ok(node)
+//         }
+//         PaVi::Empty => Ok(self.root().ok_or(Error::IndexOutOfBounds(0, 0))?),
+//     }
+// }