In Parsing as Deduction, we introduced a general framework for expressing different parsing algorithms as systems of logical deduction and in Semirings we showed how to organize different parsing problems (recognition, maximization, marginalization) by considering the parsing problem over different semirings. However, we have not discussed how these two idea can be used in a practical parsing algorithm. In this unit, we introduce a general way of using a parsing as deduction system known as chart parsing.

The idea of chart parsing is to keep track of two datastrcutures. First, the chart stores all of the items that have been discovered (i.e., proven) by the algorithm. The second datastructure, the agenda keeps track of which items should be used to generate later deductions using the inference rules.

The general procedure looks like this:

  1. Initialize an empty chart and put the axioms of the parsing as deduction system into the agenda.
  2. While there is anything in the agenda, repeat the following steps.
    1. Take a trigger item off of the agenda.
    2. If the trigger item is not in the chart already.
      1. Add it to the chart.
      2. Generate all items that result by applying the deduction rules to the trigger item and other items already in the chart.
      3. Add all these generated items to the agenda.
  3. If the goal item is in the chart, then it contains the answer to the parsing problem as it’s semiring value component.

In order to understand the notion of a trigger, it is useful to consider the CYK algorithm.

Each of the two rules has its own kind of trigger(s). The first rule is triggered by taking an item off the agenda that has a terminal in it. incomplete item is triggered by taking a complete item off of the agenda. It will generate new incomplete items which will be added to the agenda.

The second rule advance incomplete item is triggered either by taking an incomplete item off of the agenda. In this case the incomplete item will look for every complete item that is in the chart that can appear to the right and introduce a new incomplete item into the agenda, with the dot advanced. This rule can also be triggered by a complete item on the right. In this case, the complete item will find every incomplete item that ends in the correct place and introduce the resulting edges to the agenda.

Finally, the final rule is triggered by an incomplete item whose dot is at the far right of the rule. In this case, the rule will introduce a complete item into the agenda.

For this to work efficiently, we must order the search such that smaller items are built before larger items. This can be accomplished by making the agenda a priority queue which always returns the item whose span (i.e., \(j-i\)) is as short as possible.

Finally, we need to be careful not to double count ways of building item produced by the second rule. This can happen, for instance, we we trigger the second rule two times with the same inputs by adjacent incomplete and complete item of the same length. To guard against this, we can implement a check that we only allow incomplete triggers to apply when the complete item to the right is stricter shorter than the incomplete item on the left.


47 Viterbi and Inside Parsing for CFGs 48 Parameter Estimation for CFGs