Generating Wikipedia Articles with Grammatical Framework: A Case Study

This is a Master's thesis from KTH/School of Electrical Engineering and Computer Science (EECS)

Abstract: Natural language generation is a method used to produce understandable texts in human languages from data [1]. Grammatical Framework is a grammar formalism and a functional programming language that uses a nonstatistical approach to build natural language applications. It separates semantics from syntax, achieving multilingualism by mapping the same semantic model to several syntaxes [2]. Grammatical Framework also has a large library, the Resource Grammar Library, which offers the programmer ready-made functions in over 30 languages for building words and sentences [3]. This report investigates whether Grammatical Framework can be used successfully to perform natural language generation in order to create Wikipedia articles from data taken from Wikidata. A grammar and a program have been built to generate articles in Swedish for urban areas in Sweden. The grammar is built around the structure of the first three sentences of the Swedish article about the urban area Linköping. Furthermore, the grammar and the program are extended to support generation of the same articles in English and French. The results show that Grammatical Framework can be used with moderate success to generate small Wikipedia articles in different languages using data from Wikidata as input. While all texts were coherent, the Swedish texts contained the fewest grammatical mistakes. The biggest drawback is the rule that forbids pattern matching on run-time arguments, which severely limits the programmer because many functions in the Resource Grammar Library use pattern matching internally. Even though Grammatical Framework does not solve the whole problem, it is a powerful enough tool to be suitable for natural language generation. Its main advantage is that it relieves the programmer of grammar-related tasks such as inflection and gender agreement.
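
The idea of mapping one semantic model to several syntaxes can be illustrated with a small sketch. The code below is Python, not Grammatical Framework, and it is not the grammar built in the thesis: the names `UrbanAreaFact`, `COUNTRY_NAMES`, `LINEARIZERS` and `linearize` are hypothetical, and the sentence templates are assumptions chosen for illustration. It hard-codes the word forms that a real grammar would obtain from the Resource Grammar Library, so it shows only the separation between a language-independent meaning and its per-language renderings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UrbanAreaFact:
    """Language-independent meaning: '<name> is an urban area in <country>'."""
    name: str
    country: str  # language-neutral key, e.g. "Sweden"


# Hypothetical per-language lexicon; a real grammar would also carry
# inflection and gender information for each entry.
COUNTRY_NAMES = {
    "eng": {"Sweden": "Sweden"},
    "swe": {"Sweden": "Sverige"},
    "fre": {"Sweden": "Suède"},
}

# One rendering function per language (the "concrete syntaxes").
# The templates are illustrative assumptions, not taken from the thesis.
LINEARIZERS = {
    "eng": lambda fact, country: f"{fact.name} is an urban area in {country}.",
    "swe": lambda fact, country: f"{fact.name} är en tätort i {country}.",
    "fre": lambda fact, country: f"{fact.name} est une zone urbaine en {country}.",
}


def linearize(fact: UrbanAreaFact, lang: str) -> str:
    """Render the same semantic fact in the requested language."""
    country = COUNTRY_NAMES[lang][fact.country]
    return LINEARIZERS[lang](fact, country)


if __name__ == "__main__":
    fact = UrbanAreaFact(name="Linköping", country="Sweden")
    for lang in ("swe", "eng", "fre"):
        print(lang, linearize(fact, lang))
```

Adding a language in this sketch means adding one lexicon entry and one rendering function while the `UrbanAreaFact` value stays unchanged, which mirrors the separation the abstract describes. The fixed templates also hint at why a grammar library is useful: the French preposition, for example, depends on the gender of the country name, which this sketch does not model.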
