Memory and Reasoning in Deep Learning: Data efficiency of the SAM-based Two-memory (STM) Model

This is a Master's thesis from KTH/School of Electrical Engineering and Computer Science (EECS)

Abstract: Developing Deep Learning models capable of learning to reason and store memories is one of the most important current challenges in AI research. Finding out which network architectures are best suited to this problem can guide research toward the most promising approaches. The bAbI challenge is a popular benchmark dataset composed of Question Answering tasks, each designed to test specific memory and reasoning abilities fundamental to text comprehension. A model well adapted to learning to reason should be able to extract relevant knowledge from a small amount of training data and generalise from it to achieve good performance; such a model is referred to as data efficient. Memory-augmented networks are among the most successful neural network architectures on the bAbI challenge, and the SAM-based Two-memory (STM) model follows this architectural approach. This thesis compares the performance of the STM model on the version of the bAbI challenge with little training data (bAbI 1k) to that of the best-performing memory-augmented model on this challenge, the MemN2N model. The aim is to find out which memory-augmented architecture is more data efficient on bAbI. STM performance is compared to two variants of MemN2N: MemN2N basic and its enhanced version, MemN2N LS-RN. STM and MemN2N basic are found to have similar overall performance, while MemN2N LS-RN outperforms them both, meaning it is more data efficient on bAbI. Differences in performance between the models are found on several individual bAbI tasks, a few of them significant. STM performs significantly worse than both MemN2N models on tasks involving temporal relations and time-dependency reasoning. MemN2N LS-RN also vastly outperforms both STM and MemN2N basic at basic induction. Lastly, all models perform poorly on complex spatial reasoning tasks.
