Synchronising Phenogrammar and Tectogrammar

Synchronisation between phenogrammar and tectogrammar is achieved through the notion of covering. Covering constraints state to what extent, and how, the yield of a tectogrammatical category corresponds to that of a phenogrammatical category. Two special kinds of covering are also used in this formalism: matching and splicing. These are all discussed in the following subsections. Covering constraints can be declared either globally in a grammar file or locally inside tectogrammatical rules. The scope of a local constraint is limited to the rule in which it is introduced.

Covering

Covering constraints state that the phonological yield of the tectogrammatical category $\phi$ consumes that of a phenogrammatical category $f$ and possibly more. This situation is depicted in Figure 10.2.

Figure 10.2: Covering
[Diagram not reproduced: the node $\phi$ is connected to two lower nodes that mark the left and right ends of a span.]

As an example, refer to the following hypothetical rule:

pp *--> p, nbar,
        {2 covers npr}

The above rule says that a PP consists of a P and an NBar, and that the second daughter in the rule, namely the NBar, covers an NPR. In this context, the field or region specified must be one that is topologically accessible to the sponsor of the mother node. Therefore, the NPR in the above example has to be topologically accessible to the phenogrammatical edge that resulted in the prediction of that PP. The NBar in this rule could, in principle, be larger than the NPR. For example, it might also contain an extraposed relative clause.
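The covering relation can be sketched in a few lines of Python. This is only an illustration, not part of TRALE: it assumes that a yield is modelled as a set of word positions, and the function name `covers` and the positions used are hypothetical.

```python
def covers(tecto_yield: frozenset, field_yield: frozenset) -> bool:
    """Return True if the tectogrammatical yield contains every word of the field.

    Assumption: a yield is a set of word positions in the input string.
    """
    return field_yield <= tecto_yield


# Hypothetical positions: the NPR field spans words 1-2, while the NBar
# also contains an extraposed relative clause at words 5-7.
npr = frozenset({1, 2})
nbar = frozenset({1, 2, 5, 6, 7})

print(covers(nbar, npr))             # True: the NBar is larger than the NPR
print(covers(frozenset({1}), npr))   # False: word 2 of the NPR is not consumed
```

Under this model, covering is a superset test, which is why the NBar may contain material outside the NPR.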

Matching

Matching is a special kind of covering relation. Matching constraints state that the phonological yield of a given tectogrammatical category $\phi$ is identical to that of a phenogrammatical category $f$. This situation is depicted in Figure 10.3.

Figure 10.3: Matching
[Diagram not reproduced: the node $\phi$ is connected to both top corners of a single lower node.]

An example of a local matches constraint in a German grammar is:

nbar *--> nbar, rp,
          {   2 matches nf
            ; 2 matches postf
          }

This rule states that an NBar consists of another NBar and an RP, where the RP matches either an NF (extraposed) or a PostF (adjacent to the head noun that it modifies) that is accessible to the sponsor of the mother NBar.
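A disjunctive matching constraint of this kind might be modelled as follows. The sketch assumes that yields are sets of word positions and that the accessible fields are given as a dictionary; the names `matches` and `matching_fields` and the positions are hypothetical and do not correspond to anything in TRALE.

```python
def matches(tecto_yield: frozenset, field_yield: frozenset) -> bool:
    """Return True if the two yields are exactly the same set of word positions."""
    return tecto_yield == field_yield


def matching_fields(tecto_yield: frozenset, accessible_fields: dict) -> list:
    """Return the names of all accessible fields that the yield matches.

    The disjunction (nf ; postf) is satisfied if the result is non-empty.
    """
    return [name for name, field_yield in accessible_fields.items()
            if matches(tecto_yield, field_yield)]


# Hypothetical positions: an extraposed relative clause at words 6-8.
fields = {"nf": frozenset({6, 7, 8}), "postf": frozenset({3, 4})}
rp = frozenset({6, 7, 8})

print(matching_fields(rp, fields))  # ['nf']
```

Unlike covering, matching is an equality test, so an RP that extended beyond the NF would satisfy neither disjunct.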

Splicing

Another special case of the covering relation is splicing. Splicing refers to a situation where the phonological yield of a tectogrammatical category $\phi$ matches the combined yield of more than one phenogrammatical category $f_1\dots f_n$. This situation is depicted in Figure 10.4.

Figure 10.4: Splicing
[Diagram not reproduced: the node $\phi$ stands above the lower nodes $f_1$, $f_2$, ..., $f_n$.]

An example of splicing is given in the following rule:

pp *--> p, np,
          {2 matches npr+nf}

The above rule states that a PP is made up of a P and an NP, where the NP matches the yield of some NPR and NF put together.
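Splicing can be sketched as matching against the union of several field yields. This is an illustration only: it assumes that yields are sets of word positions and, as a further assumption not stated in the text, that the spliced fields do not overlap. The name `splices` and the positions are hypothetical.

```python
def splices(tecto_yield: frozenset, field_yields: list) -> bool:
    """Return True if the fields are pairwise disjoint and jointly equal the yield.

    Assumption: the disjointness requirement is part of this sketch,
    not something the formalism is known to enforce.
    """
    combined = frozenset()
    for field_yield in field_yields:
        if combined & field_yield:      # overlapping fields are rejected
            return False
        combined |= field_yield
    return combined == tecto_yield


# Hypothetical positions: NPR at words 1-2, NF at words 5-6.
npr = frozenset({1, 2})
nf = frozenset({5, 6})

print(splices(frozenset({1, 2, 5, 6}), [npr, nf]))  # True
print(splices(frozenset({1, 2, 5}), [npr, nf]))     # False: word 6 is missing
```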

Compaction

Compaction constraints apply to tectogrammatical categories and state that each category in the constraint's argument list must form a contiguous string. As a global constraint, compaction applies this condition to all categories that are consistent with any of the elements in the argument list of the constraint.

For example, the global rule compacts([s]) says that all Ss form contiguous strings.
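The contiguity condition imposed by compaction can be sketched as follows, assuming that a yield is modelled as a set of word positions. The function name `is_contiguous` is hypothetical, and treating an empty yield as non-contiguous is a choice made for this sketch only.

```python
def is_contiguous(positions: frozenset) -> bool:
    """Return True if the word positions form one unbroken stretch of the input."""
    if not positions:
        return False  # assumption: an empty yield does not count as contiguous
    return max(positions) - min(positions) + 1 == len(positions)


print(is_contiguous(frozenset({2, 3, 4})))  # True
print(is_contiguous(frozenset({2, 4})))     # False: word 3 is missing
```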

Precedence and Immediate Precedence Constraints

Precedence

A precedence constraint requires the daughter whose index appears on the left-hand side to be located entirely before the daughter whose index appears on the right-hand side. Note that this constraint makes no assumption about the contiguity of the daughters. As an example, the following hypothetical rule states that an NP consists of a determiner that precedes, but need not be adjacent to, an NBar with the same agreement features.

np *--> det, nbar,
        {1 < 2}
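A precedence check of this kind can be sketched as a comparison of word positions. The sketch assumes that each daughter's yield is a non-empty set of positions; the name `precedes` and the positions are hypothetical.

```python
def precedes(first: frozenset, second: frozenset) -> bool:
    """Return True if every word of `first` lies before every word of `second`.

    Neither yield needs to be contiguous; both are assumed non-empty.
    """
    return max(first) < min(second)


# Hypothetical positions: a determiner at word 0 and a discontinuous NBar.
det = frozenset({0})
nbar = frozenset({2, 3, 6})

print(precedes(det, nbar))              # True, although the two are not adjacent
print(precedes(frozenset({4}), nbar))   # False: word 4 falls inside the NBar
```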

Immediate Precedence

A special kind of precedence is immediate precedence, which requires the two daughters mentioned in the constraint to be adjacent to one another. Adjacency is checked only between the rightmost word of the first daughter and the leftmost word of the second, so this constraint does not assume contiguity of the daughters either. The following example states that an NP consists of a determiner which is immediately followed by an NBar.

np *--> det, nbar,
        {1 << 2}
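Immediate precedence can be sketched as a test on two boundary words only. The sketch assumes that each daughter's yield is a non-empty set of word positions; the name `immediately_precedes` and the positions are hypothetical.

```python
def immediately_precedes(first: frozenset, second: frozenset) -> bool:
    """Return True if the rightmost word of `first` is directly followed
    by the leftmost word of `second`. Only these two words are compared."""
    return max(first) + 1 == min(second)


# Hypothetical positions: a determiner at word 2 and a discontinuous NBar.
det = frozenset({2})
nbar = frozenset({3, 4, 7})

print(immediately_precedes(det, nbar))                        # True
print(immediately_precedes(frozenset({1}), frozenset({3})))   # False: word 2 intervenes
```

Because only the boundary words are inspected, a discontinuous daughter such as the NBar above still satisfies the constraint.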


Topological Accessibility

The topological parser in ALE is essentially an all-paths chart parser. It performs a phenogrammatical parse of the input string in a bottom-up fashion and once it finds a region that is matched by a tectogrammatical category, it predicts that category and performs a tectogrammatical parse. This continues until all of the input string has been consumed and all parses have been found.

Because this is an all-paths parser, we need to make sure that larger tectogrammatical categories only have access to those smaller categories that have been directly or indirectly predicted by the same entity in phenogrammar. To do this, we have introduced the notion of sponsorship. Whenever a phenogrammatical category predicts some tectogrammatical category $\phi$ by introducing an active edge into the chart, it passes its own ID $i$ to that edge. We then say that this active edge, and all the other active edges introduced in the course of finding $\phi$, have the same sponsor, namely the one with ID $i$. Active edges only have access to passive edges with the same sponsor. This ensures that clauses and all other categories that can be predicted from phenogrammar are largely parsed independently, which also helps make parsing more efficient. The only time that a category has access to another category with a different sponsor is when it agrees to consume all of the yield of that category and not just parts of it.
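The accessibility check described above might be sketched as follows. This is one possible reading, not the actual ALE implementation: it assumes that each sponsor ID is associated with the yield of the phenogrammatical category that made the prediction, and that a passive edge with a different sponsor is usable only when it spans that whole yield. The class `PassiveEdge`, the function `accessible`, the table `sponsor_yields` and all positions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PassiveEdge:
    category: str
    words: frozenset   # word positions in the edge's yield
    sponsor: int       # ID of the phenogrammatical category that predicted it


def accessible(active_sponsor: int, edge: PassiveEdge, sponsor_yields: dict) -> bool:
    """Decide whether an active edge with the given sponsor may use `edge`.

    Same sponsor: always accessible. Different sponsor: accessible only if
    the edge covers the whole yield of its own sponsor (assumed reading).
    """
    if edge.sponsor == active_sponsor:
        return True
    return edge.words == sponsor_yields[edge.sponsor]


# Hypothetical chart: sponsor 1 is an embedded clause over words 2-6,
# sponsor 2 is the matrix clause over words 0-6.
sponsor_yields = {1: frozenset(range(2, 7)), 2: frozenset(range(0, 7))}
embedded_s = PassiveEdge("s", frozenset(range(2, 7)), sponsor=1)
inner_np = PassiveEdge("np", frozenset({4, 5}), sponsor=1)

print(accessible(2, embedded_s, sponsor_yields))  # True: taken as a whole
print(accessible(2, inner_np, sponsor_yields))    # False: internal subtree
print(accessible(1, inner_np, sponsor_yields))    # True: same sponsor
```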

For example, in the sentence (10.3.6),


\begin{exe}
\ex \emph{Ich sagte da\ss{} er den Mann gesehen hat.}\\
I said that he the man seen has\\
\lq\lq I said that he saw the man.''
\end{exe}

once the parser has found the embedded clause, it adds an active edge to the chart for a sentence and passes its own ID to it. This triggers a tectogrammatical parse of the embedded clause. Later on, when parsing the matrix clause, the parser can gain access to the embedded clause only when it has agreed to take the embedded clause as a whole. It never gains access to the internal subtrees of the embedded clause, as the matrix clause has a different sponsor. This is shown in Figure 10.5.

Figure 10.5: Sponsorship and Topological Accessibility
[Diagram not reproduced: curved arrows connect clause nodes to the categories labelled CP and S.]


TRALE Reference Manual