Next: Tectogrammar Up: Topological Parsing Previous: Topological Parsing   Contents

Phenogrammar

The advantage of using topological fields, which express linear precedence (LP) constraints, is that they provide first-class access to units that may not correspond to syntactic subtrees, or that occur in the same position without forming any natural syntactic or semantic class. An example of the latter is the Left Sentence-Bracket (CF) in German. This position, which is the second traditionally identified topological field in the clause, is filled by such categories as finite verbs, complementisers, and subordinating conjunctions. These do not form a natural syntactic class, but they occupy the same linear position. Moreover, without recourse to topological fields it is very difficult to define this second position. Another example is the first (topic) position, the Vorfeld (VF), which holds a variety of categories and sometimes strings of words that are only parts of syntactic categories. The traditional topological fields identified for German are illustrated in (4) below:


(4) [Table not recoverable from the source: a German sentence divided into its topological fields.]
    'I gave the man the book that I've read.'

These five fields are said to describe the topology of the clause region. In addition to the clause region, we assume that there are other regions over which a topology can be defined. In German, for example, one can also talk about noun-phrase regions (NPRs), which contain their own fields. NPRs may or may not correspond to complete NPs in tectogrammar, depending on whether or not they are contiguous. In this way, we extend the rather flat structure of regions and give it a recursive structure. The need for this is apparent from facts such as the following: an NF in a clause region can itself contain another (embedded) clause.

An extended context-free grammar (CFG) is used to formalise this grammar. The infix operator topo/2 defines the topology of regions. For example, a topological rule for the German clause region written as:

clause topo [vf,cf,mf*,{vc},{nf}]

means that a clause region contains exactly one VF followed by exactly one CF, followed by zero or more MFs, followed by an optional VC and an optional NF. A topological rule for the German NPR can be written as follows:

npr topo [{sprf},adjf*,nounf,postmodf*]

The above rule states that a German NPR is made up of an optional specifier field (SPRF), followed by zero or more adjective fields (ADJF), followed by exactly one noun field (NounF), and finally zero or more post-modifier fields (PostModF).
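The notation used in the two topological rules above can be made concrete with a small sketch. The Python below is only an illustration of how the right-hand sides read (a bare name means exactly one field, `name*` zero or more, `{name}` optional); it is not how TRALE processes topo/2 rules, and the helper names `topo_to_regex` and `matches` are hypothetical.

```python
import re

def topo_to_regex(fields):
    """Turn a topo right-hand side, e.g. ['vf', 'cf', 'mf*', '{vc}', '{nf}'],
    into a regex over a string of space-terminated field names."""
    parts = []
    for item in fields:
        if item.startswith("{") and item.endswith("}"):
            name, quant = item[1:-1], "?"   # {x}: optional field
        elif item.endswith("*"):
            name, quant = item[:-1], "*"    # x*: zero or more fields
        else:
            name, quant = item, ""          # x: exactly one field
        parts.append("(?:%s )%s" % (re.escape(name), quant))
    return re.compile("".join(parts))

def matches(fields, sequence):
    """True if the list of field names fits the topological rule."""
    text = "".join(name + " " for name in sequence)
    return topo_to_regex(fields).fullmatch(text) is not None

# The two rules from the text, written as Python lists.
CLAUSE = ["vf", "cf", "mf*", "{vc}", "{nf}"]
NPR = ["{sprf}", "adjf*", "nounf", "postmodf*"]

print(matches(CLAUSE, ["vf", "cf", "mf", "mf", "vc", "nf"]))  # all fields present
print(matches(CLAUSE, ["cf", "mf"]))                          # VF missing
print(matches(NPR, ["nounf"]))                                # only the obligatory field
```

Under this reading, a clause with the fields in the wrong order (for example NF before VC) is rejected, because the rule fixes linear precedence as well as cardinality.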

Linkage

Since regions themselves can occur as fields in larger regions, we also need a way of making that explicit in the grammar. A new class of rules called linking rules is used for this purpose. These rules are universally quantified on the left-hand side and existentially quantified on the right-hand side. Therefore,

r <<-- f

states that all regions named r occur in some f, where f is a field name or a disjunction of field names. The rule

f -->> r

on the other hand, states that all fields named f contain some r, where r is a region name or a disjunction of region names. Two examples of linking rules in German are provided below:

npr <<-- (vf;mf;nf)
clause <<-- nf

These rules state that all NPRs in German occur in some VF, or MF, or NF, and all clauses in German occur in some NF. Looking back to our example sentence above, we can now see what phenogrammatical tree these rules together generate for that sentence. This is shown in Figure 10.1.

Figure 10.1: A sample phenogrammatical tree
[Tree diagram not recoverable from the source.]


TRALE Reference Manual