Deterministic dependency parsing of unrestricted English text

This is a Master's thesis from the Department of Mathematics and Systems Engineering (Matematiska och systemtekniska institutionen)

Author: Mario Scholz; [2005]

Abstract: This master’s thesis describes a deterministic dependency parser that uses a memory-based learning approach to parse unrestricted English text. A converter transforms the Wall Street Journal section of the Penn Treebank into an intermediate dependency representation, which is used to train the parser with the TiMBL library (Daelemans, Zavrel, Sloot, & Bosch, 2003). The parser outputs labeled dependency graphs whose arc labels combine bracket labels and grammatical role labels constructed from the Penn Treebank II annotation scheme (Marcus, Kim, et al., 1994). The parser reaches a maximum unlabeled attachment score of 87.1% and produces labeled dependency graphs with an accuracy of 86.0% when both the correct head and the correct arc label must be recognised. The results are close to the state of the art in dependency parsing, and the parser also outputs arc labels that other parsers do not produce.
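
The two figures quoted in the abstract can be illustrated with a minimal sketch of how unlabeled and labeled attachment scores are commonly computed. This is not the evaluation code of the thesis: the function name, the (head, label) token representation, and the example labels are assumptions made for illustration, and details such as whether punctuation tokens are scored are not stated in the abstract.

```python
from typing import List, Tuple

# Hypothetical representation: one (head_index, arc_label) pair per token,
# with head index 0 standing for an artificial root node.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], predicted: List[Arc]) -> Tuple[float, float]:
    """Return (unlabeled, labeled) attachment scores as fractions in [0, 1]."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty sentence")

    # Unlabeled: only the head has to match.
    head_hits = sum(1 for g, p in zip(gold, predicted) if g[0] == p[0])
    # Labeled: head and arc label both have to match.
    full_hits = sum(1 for g, p in zip(gold, predicted) if g == p)

    n = len(gold)
    return head_hits / n, full_hits / n


# Made-up four-token example; the labels only imitate Penn Treebank style.
gold = [(2, "NP-SBJ"), (0, "ROOT"), (4, "DT"), (2, "NP")]
pred = [(2, "NP-SBJ"), (0, "ROOT"), (2, "DT"), (2, "NP-PRD")]
uas, las = attachment_scores(gold, pred)
```

In this made-up example three of four heads are correct and two of four tokens have both head and label correct, so the labeled score can never exceed the unlabeled one.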
