Optimal Order Execution using Stochastic Control and Reinforcement Learning

This is a Master's thesis from KTH/Mathematical Statistics

Author: Robert Hu; [2016]


Abstract: In this thesis an attempt is made to find the optimal order execution policy that maximizes the reward from trading financial instruments. The optimal policies are found using a Markov Decision Process that is built from a state space model and the Bellman equation. Since there is no explicit formula for the state space dynamics, simulations on historical data are used instead to estimate the state transition probabilities and the rewards associated with each state and control. The optimal policy is then generated from the Bellman equation and tested against naive policies on out-of-sample data. This thesis also attempts to model the notion of market impact and tests whether the Markov Decision Process remains viable under the imposed assumptions. Lastly, there is also an attempt to estimate the value function using various techniques from Reinforcement Learning. It turns out that naive strategies are superior when market impact is not present and when market impact is modeled as a direct penalty on reward. The Markov Decision Process is superior with market impact when it is modeled as having an impact on the simulations, although some results suggest that the market impact model is not consistent across all types of instruments. Further, approximating the value function yields results that are inferior to the Markov Decision Process, but interestingly the method exhibits an improvement in performance if the estimated value function is trained before it is tested.
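
As a rough illustration of the pipeline the abstract describes, the sketch below estimates transition probabilities and rewards empirically (standing in for simulations on historical data) and then solves the Bellman equation by value iteration to obtain a policy. This is a minimal sketch under assumed parameters: the state and action space sizes, the discount factor, and the random placeholder data are all illustrative and not taken from the thesis.

import numpy as np

n_states, n_actions = 10, 3   # assumed discretization of the state space
gamma = 0.99                  # assumed discount factor

rng = np.random.default_rng(0)

# Placeholders for quantities gathered from simulated episodes on
# historical data: visit counts and accumulated rewards per (s, a, s').
counts = rng.integers(1, 20, size=(n_states, n_actions, n_states)).astype(float)
reward_sums = rng.normal(size=(n_states, n_actions, n_states)) * counts

P = counts / counts.sum(axis=2, keepdims=True)   # empirical P(s' | s, a)
R = reward_sums / counts                         # empirical mean reward r(s, a, s')

# Value iteration: apply the Bellman optimality operator to a fixed point.
V = np.zeros(n_states)
for _ in range(1000):
    Q = (P * (R + gamma * V)).sum(axis=2)        # Q(s, a) = E[r + gamma * V(s')]
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

policy = Q.argmax(axis=1)   # greedy policy generated from the Bellman equation
print("optimal action per state:", policy)

In the thesis the resulting policy is evaluated on out-of-sample data against naive benchmarks; the value-function-approximation variant mentioned in the abstract would replace the tabular V above with a trained estimator.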
