On the Efficiency of Transfer Learning in a Fighter Pilot Behavior Modelling Context

This is a Master's thesis from KTH/Mathematics (Dept.)

Author: Viktor Sandström; [2021]

Keywords: Imitation Learning; Transfer Learning; Applied Mathematics; Behavior Cloning; DAgger; FOI; Fighter Pilot; Mathematics; Deep Learning; Machine Learning;

Abstract: Recent deep learning techniques make it possible to create realistic models of human fighter pilot behavior. However, these techniques often depend heavily on large datasets, which in many settings are unavailable or expensive to produce. Transfer learning is an active research field in which knowledge gained from a problem with readily available training data is leveraged when considering a different, related problem. The related problem is called the target task and the initial problem is called the source task. Given a successful transfer scenario, less data or less training can be required to reach high-quality results on the target task. The first part of this thesis focuses on the development of a fighter pilot model using behavior cloning, a method for reducing an imitation learning problem to standard supervised learning. The resulting model, called a policy, is capable of imitating a human pilot controlling a fighter jet in the military combat simulator Virtual BattleSpace 3. In this simulator, the forces acting on the aircraft can be modelled using one of several flight dynamic models (FDMs). In the second part, the efficiency of transfer learning is measured. This is done by replacing the built-in FDM with one that has a significantly different input response, and subsequently training two policies on successive amounts of data. One policy was trained using only the latter FDM, whereas the other policy exploits the knowledge gained in the first part of the thesis, using a technique called fine-tuning. The results indicate that a model already capable of handling one FDM adapts to a different FDM with less data than a previously untrained policy.
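The comparison described in the abstract can be illustrated with a deliberately small sketch. It is not the thesis's implementation: the deep network and the Virtual BattleSpace 3 simulator are replaced by a toy linear policy, and the two "expert" control laws `W_SOURCE` and `W_TARGET` (standing in for pilot behavior under two FDMs), along with all function names and hyperparameters, are hypothetical. Behavior cloning is shown as plain supervised regression on (state, action) pairs, and fine-tuning as the same training procedure started from the source-task weights instead of from zeros.

```python
import random

STATE_DIM, ACTION_DIM = 4, 2

# Hypothetical expert control laws (assumptions for illustration only).
# The target law is a small perturbation of the source law.
W_SOURCE = [[1.0, -0.5], [0.5, 1.0], [-1.0, 0.3], [0.2, 0.8]]
W_TARGET = [[1.1, -0.6], [0.4, 1.1], [-0.9, 0.4], [0.1, 0.7]]


def act(weights, states):
    """Linear policy: one action vector per state."""
    return [[sum(s[i] * weights[i][j] for i in range(STATE_DIM))
             for j in range(ACTION_DIM)] for s in states]


def sample_demos(rng, expert_weights, n):
    """Draw n random states and label them with the expert's actions."""
    states = [[rng.gauss(0.0, 1.0) for _ in range(STATE_DIM)] for _ in range(n)]
    return states, act(expert_weights, states)


def mse(weights, states, actions):
    """Mean over samples of the squared action error."""
    preds = act(weights, states)
    total = sum((p - a) ** 2
                for pred, target in zip(preds, actions)
                for p, a in zip(pred, target))
    return total / len(states)


def behavior_clone(states, actions, init, steps, lr):
    """Supervised regression on (state, action) pairs via gradient descent."""
    weights = [row[:] for row in init]  # copy so the initial weights are untouched
    n = len(states)
    for _ in range(steps):
        preds = act(weights, states)
        grad = [[0.0] * ACTION_DIM for _ in range(STATE_DIM)]
        for s, pred, target in zip(states, preds, actions):
            for i in range(STATE_DIM):
                for j in range(ACTION_DIM):
                    grad[i][j] += 2.0 * s[i] * (pred[j] - target[j]) / n
        for i in range(STATE_DIM):
            for j in range(ACTION_DIM):
                weights[i][j] -= lr * grad[i][j]
    return weights


def compare_transfer(n_target=20, steps=5, lr=0.02, seed=0):
    """Return (scratch_loss, finetuned_loss) on held-out target-task data."""
    rng = random.Random(seed)
    zeros = [[0.0] * ACTION_DIM for _ in range(STATE_DIM)]

    # Source task: plenty of demonstrations and a long training run.
    src_states, src_actions = sample_demos(rng, W_SOURCE, 500)
    source_policy = behavior_clone(src_states, src_actions, zeros, steps=200, lr=0.05)

    # Target task: few demonstrations and a short training budget.
    tgt_states, tgt_actions = sample_demos(rng, W_TARGET, n_target)
    test_states, test_actions = sample_demos(rng, W_TARGET, 1000)

    scratch = behavior_clone(tgt_states, tgt_actions, zeros, steps, lr)
    finetuned = behavior_clone(tgt_states, tgt_actions, source_policy, steps, lr)
    return (mse(scratch, test_states, test_actions),
            mse(finetuned, test_states, test_actions))


scratch_loss, finetuned_loss = compare_transfer()
print(f"scratch: {scratch_loss:.4f}  fine-tuned: {finetuned_loss:.4f}")
```

In this toy setting the fine-tuned policy starts close to the target solution, so with the same small data and training budget it reaches a lower held-out error than the policy trained from scratch. Whether the same holds for a deep policy and real FDMs is an empirical question, which is what the thesis measures.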
