# ---- Recognizer Pattern File ----
# from pattern.h:
# Possible Pattern Predicates:
#
#
#
#
#
#
#
#
#
#
#
#
#
#
#
#
#
# (only "to ")
#
#
#
#
#
#
# Combinations of predicates are allowed and considered to be
# AND'ed together.
#
# All predicates should be followed by "->" and then the type
# of constituent you want to be created followed
# by a dash and the pattern type:
#    NP-LOCATION,
#    NP-COMPANY,
#    NP-NUMBER,
#    NP-MONEY,
#    NP-PERCENT,
#    NP-TIME,
#    NP-ING.
#
#
# NOTE: Because of the way that ApplyConstPattern() works, every
# predicate must target a constituent of the type that would exist
# after sundance segmenting at the point that all this pattern
# recognition gets called. For example, if we call this after NP
# segmenting, then the VP and PP predicates really don't do anything
# for us. Also, since each predicate must target one constituent
# in a sequence, there is no facility for looking at, for example,
# two particular words within the same NP. If we ever want to change
# this, we've got to make the necessary changes in the ApplyConstPattern()
# function. Note also that since the ApplyConstPattern() function
# returns pairs of numbers which equate to the starting and ending
# position of a sentence's children which should be rolled together,
# this recognizer code cannot be called after clause handling because
# clause handling creates a 3-level parse and we've just assumed
# a 2-level parse.
#
# So, because the current implementation applies these patterns
# after NP segmenting, but before any other segmenting, the examples
# with VP's and PP's are irrelevant for now. (The logic is there in
# sundance, we'd just need to have multiple pattern files to be
# called in at different points of the parsing.)
#
# Pattern Precedence Rules: The pattern recognizer now enforces a
# type of precedence when more than one pattern may apply to
# a set of shared constituents.
# The rule is simple...the longest
# one wins, and in the case of ties, the rule that is listed first
# in the rule file wins.
#
#Part added by Pol Schumacher
#
#This pattern should recognize labels which are in "
# -> NP-TIME
# -> NP-TIME
#Pattern for keys
-> NP-LIST
-> NP-LIST
-> NP-LIST
#Menu tree structures
->NP-LIST
-> NP-LIST
-> NP-LIST
-> NP-LIST
-> NP-LIST
-> NP-LIST
-> NP-LIST
#Pattern for time
# -> NP-TIME
# -> NP-TIME
# -> NP-TIME
# -> NP-TIME
# -> NP-TIME
# -> NP-TIME
# -> NP-TIME
# -> NP-TIME
# -> NP-TIME
##Pattern buttons
-> NP-LIST
##Pattern for ingredient list
-> NP-LIST
-> NP-LIST
-> NP-LIST
-> NP-LIST
-> NP-LIST
-> NP-LIST
-> NP-LIST
-> NP-LIST
-> NP-LIST
##Pattern comma and
-> NP-LIST
-> NP-LIST
-> NP-LIST
-> NP-LIST
-> NP-LIST
-> NP-LIST
-> NP-LIST
##Pattern comma or
-> NP-LIST
-> NP-LIST
-> NP-LIST
-> NP-LIST
-> NP-LIST
-> NP-LIST
-> NP-LIST
#End Pol's part
# To capture "IBM Corp." "Nestle Inc." and "L. L. Bean Co."
### -> NP-COMPANY
### -> NP-COMPANY
### -> NP-COMPANY
# To capture : "Canada"
### -> NP-LOCATION
# To capture : Alberta, Canada

-> NP-LOCATION
# NOTE: can't do this!!! Really need for *contextual* features in
# these rules! The preposition "in" is a crucial part of recognizing
# that Ames is a location and therefore a necessary part of the
# pattern, but it shouldn't be pulled into the NP itself!
#
# Ex: "in Ames, Iowa"
# where Ames is an unknown word but Iowa is a known LOCATION.
# I think this should be a good and pretty safe rule -emr 8/3/07
#
#

-> NP-LOCATION
# To capture : Calgary, Alberta, Canada
###

-> NP-LOCATION
#===================
# More aggressive location tagging...
-> NP-LOCATION
# Date/time tagging--anyword or head
### -> NP-TIME
-> NP-TIME
# -> NP-TIME
# -> NP-TIME
# -> NP-TIME
#