Attention Mechanisms for Transition-based Dependency Parsing

Detta är en Master-uppsats från Uppsala universitet/Institutionen för lingvistik och filologi

Sammanfattning: Transition-based dependency parsing is known to compute the syntactic structure of a sentence efficiently, but is less accurate to predict long-distance relations between tokens as it lacks global information about the sentence. Our main contribution is the integration of attention mechanisms to replace the static token selection with a dynamic approach that takes the complete sequence into account. Though our experiments confirm that our approach fundamentally works, our models do not outperform the baseline parser. We further present a line of follow-up experiments to investigate these results. Our main conclusion is that the BiLSTM of the traditional parser is already powerful enough to encode the required global information into each token, eliminating the need for an attention-driven approach. Our secondary results indicate that the attention models require a neural network with a higher capacity to potentially extract more latent information from the word embeddings and the LSTM than the traditional parser. We further show that positional encodings are not useful for our attention models, though BERT-style positional embeddings slightly improve the results. Finally, we experiment with replacing the LSTM with a Transformer-encoder to test the impact of self-attention. The results are disappointing, though we think that more future research should be dedicated to this. For our work, we implement a UUParser-inspired dependency parser from scratch in PyTorch and extend it with, among other things, full GPU support and mini-batch processing. We publish the code under a permissive open source license at https://github.com/jgontrum/parseridge.

  HÄR KAN DU HÄMTA UPPSATSEN I FULLTEXT. (följ länken till nästa sida)