Lexical entries in ALE are specified as rewriting
rules, as given by the following BNF syntax:
  <lex_entry> ::= <word> ---> <desc>.
For instance, in the categorial grammar lexicon in the appendix, the following lexical entry is provided, along with the relevant macros:
  john --->
    @ pn(j).

  pn(Name) macro
    synsem: @ np(Name),
    @ quantifier_free.

  np(Ind) macro
    syn:np,
    sem:Ind.

  quantifier_free macro
    qstore:[].
Read declaratively, this rule says that the word john has as its
lexical category the most general satisfier of the description
@ pn(j), which is:
  cat
  SYNSEM basic
         SYN np
         SEM j
  QSTORE e_list
Note that this lexical entry is equivalent to the following entry
written without macros, in which the list abbreviation [] is spelled
out as the type e_list:
  john --->
    synsem:(syn:np,
            sem:j),
    qstore:e_list.
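The equivalence between the two forms of the entry can be checked with a small model. The following is a minimal sketch in Python, not ALE code: it assumes each macro can be modelled as a function returning a nested dictionary of features, and that conjunction of descriptions over distinct features can be modelled as dictionary merging. It ignores type inference and unification, so the types cat and basic do not appear.

```python
# Toy model of macro expansion (an illustration only, not how ALE works
# internally): each macro is a function that returns a description,
# modelled as a nested dict mapping features to values.

def np(ind):
    # np(Ind) macro  syn:np, sem:Ind.
    return {"syn": "np", "sem": ind}

def quantifier_free():
    # quantifier_free macro  qstore:[].   ([] written as the type e_list)
    return {"qstore": "e_list"}

def pn(name):
    # pn(Name) macro  synsem: @ np(Name), @ quantifier_free.
    # Conjunction is modelled as merging dicts with disjoint keys.
    return {"synsem": np(name), **quantifier_free()}

# The macro-free entry for john, written out by hand.
john_without_macros = {"synsem": {"syn": "np", "sem": "j"}, "qstore": "e_list"}

# The entry for john obtained by expanding @ pn(j).
john_with_macros = pn("j")
```

In this model the two values compare equal, which mirrors the claim that the macro call is only an abbreviation for the longer description.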
Macros are useful as a method of organizing lexical information to
keep it consistent across lexical entries. The lexical entry for the
word runs is:
  runs ---> @ iv((run,runner:Ind),Ind).

  iv(Sem,Arg) macro
    synsem:(backward,
            arg: @ np(Arg),
            res:(syn:s,
                 sem:Sem)),
    @ quantifier_free.
This entry uses nested macros along with structure sharing, and
expands to the category:
  cat
  SYNSEM backward
         ARG synsem
             SYN np
             SEM [0] sem_obj
         RES SYN s
             SEM run
                 RUNNER [0]
  QSTORE e_list
It also illustrates how macro parameters are in fact treated as
variables.
Multiple lexical entries may be provided for each word. Disjunctions may also be used in lexical entries, but are expanded out at compile-time. Thus the first three lexical entries, taken together, compile to the same result as the fourth:
  bank --->
    syn:noun,
    sem:river_bank.
  bank --->
    syn:noun,
    sem:money_bank.
  bank --->
    syn:verb,
    sem:roll_plane.

  bank --->
    ( syn:noun,
      sem:( river_bank
          ; money_bank
          )
    ; syn:verb,
      sem:roll_plane
    ).
Note that this last entry uses the standard Prolog layout conventions
of placing each conjunct and disjunct on its own line, with commas at
the end of lines, and disjunctions set off with vertically aligned
parentheses at the beginning of lines.
The compiler finds all the most general satisfiers for lexical entries at compile time, reporting on those lexical entries that have unsatisfiable descriptions. In the case of bank above, the second, combined entry is marginally faster at compile time, but the run-time performance of the two formulations is identical, because both yield the same set of most general satisfiers.
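The effect of expanding disjunctions can be illustrated with a short sketch. The Python code below is a simplified model, not ALE's compiler: the tuple encoding of descriptions and the function name expand are assumptions made for this example. It performs only the syntactic expansion of a description into disjunction-free alternatives and does not compute most general satisfiers or check satisfiability against a signature.

```python
from itertools import product

# Hypothetical encoding of descriptions for this sketch:
#   "noun"                   a type
#   ("feat", "syn", d)       the description syn:d
#   ("and", d1, d2, ...)     conjunction (d1, d2, ...)
#   ("or", d1, d2, ...)      disjunction (d1 ; d2 ; ...)

def expand(desc, path=()):
    """Return the disjunction-free alternatives of desc.

    Each alternative is a frozenset of (feature path, type) pairs.
    """
    if isinstance(desc, str):
        return [frozenset({(path, desc)})]
    tag, *args = desc
    if tag == "feat":
        name, sub = args
        return expand(sub, path + (name,))
    if tag == "or":
        # A disjunction contributes the alternatives of each disjunct.
        return [alt for d in args for alt in expand(d, path)]
    if tag == "and":
        # A conjunction combines one alternative from each conjunct.
        combos = product(*(expand(d, path) for d in args))
        return [frozenset().union(*combo) for combo in combos]
    raise ValueError(f"unknown description tag: {tag!r}")

# The three separate entries for bank.
separate = [
    ("and", ("feat", "syn", "noun"), ("feat", "sem", "river_bank")),
    ("and", ("feat", "syn", "noun"), ("feat", "sem", "money_bank")),
    ("and", ("feat", "syn", "verb"), ("feat", "sem", "roll_plane")),
]

# The single combined entry for bank.
combined = (
    "or",
    ("and", ("feat", "syn", "noun"),
            ("feat", "sem", ("or", "river_bank", "money_bank"))),
    ("and", ("feat", "syn", "verb"), ("feat", "sem", "roll_plane")),
)

separate_alts = {alt for entry in separate for alt in expand(entry)}
combined_alts = set(expand(combined))
```

Under this model, separate_alts and combined_alts contain the same three alternatives, which is the sense in which the two ways of writing the lexicon compile to the same result.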
ALE supports the construction of large lexica, as it relies on Prolog's hashing mechanism to actually look up a lexical entry for a word during bottom-up parsing. For generation, ALE indexes lexical entries for faster unification, as described in Penn and Popescu (1997). Constraints on types can also be used to enforce conditions on lexical representations, allowing for further factorization of information.
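The benefit of hashed lookup for large lexica can be sketched as follows. This is an analogy in Python, not a description of ALE's or Prolog's implementation: a Python dictionary stands in for Prolog's hashing mechanism, and the names lexicon, add_entry and lookup are hypothetical. The point is only that finding the entries for a word does not require scanning the whole lexicon.

```python
from collections import defaultdict

# Hypothetical lexicon table: each word maps to the list of categories
# compiled from its lexical entries (a word may have several).
lexicon = defaultdict(list)

def add_entry(word, category):
    """Record one compiled category for word."""
    lexicon[word].append(category)

def lookup(word):
    """Return all categories for word, or an empty list if it is unknown.

    dict.get is used so that looking up an unknown word does not add it.
    """
    return lexicon.get(word, [])

add_entry("bank", {"syn": "noun", "sem": "river_bank"})
add_entry("bank", {"syn": "noun", "sem": "money_bank"})
add_entry("bank", {"syn": "verb", "sem": "roll_plane"})
add_entry("john", {"synsem": {"syn": "np", "sem": "j"}, "qstore": "e_list"})

bank_categories = lookup("bank")
```

A bottom-up parser built on such a table would call lookup once per input word and seed its chart with every category returned.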