Search: "markov decision process"
Showing results 1 - 5 of 50 theses containing the words markov decision process.
1. Risk-Averse Multi-Armed Bandit Problem with Multiple Plays
Master's thesis, Göteborgs universitet/Institutionen för data- och informationsteknik. Abstract: This study aims to construct an efficient heuristic, referred to as RA, for a risk-averse Markovian multi-armed bandit problem (MAB) with multiple plays. The RA incorporates risk-aversion and multiple plays by modifying the Gittins index strategy.
2. Optimal Order Placement Using Markov Models of Limit Order Books
Master's thesis, KTH/Matematik (Avd.). Abstract: We study optimal order placement in a limit order book. By modelling the limit order book dynamics as a Markov chain, we can frame the purchase of a single share as a Markov Decision Process. Within the framework of the model, we can estimate optimal decision policies numerically. The trade rate is varied using a running cost control variable.
3. Decreasing Training Time of Reinforcement Learning Agents for Remote Tilt Optimization using a Surrogate Neural Network Approximator
Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS). Abstract: One possible application of reinforcement learning in the telecommunication field is antenna tilt optimization. However, one of the key challenges we face is that using handcrafted simulators as environments to provide information for agents often makes training reinforcement learning agents time-consuming.
4. S-MARL: An Algorithm for Single-To-Multi-Agent Reinforcement Learning: Case Study: Formula 1 Race Strategies
Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS). Abstract: A Multi-Agent System is a group of autonomous, intelligent, interacting agents sharing an environment that they observe through sensors, and upon which they act with actuators. The behaviors of these agents can be either defined upfront by programmers or learned by trial and error through Reinforcement Learning.
5. Random Edge is not faster than Random Facet on Linear Programs
Master's thesis, KTH/Matematik (Avd.). Abstract: A Linear Program is a problem where the goal is to maximize a linear function subject to a set of linear inequalities. Geometrically, this can be rephrased as finding the highest point on a polyhedron. The Simplex method is a commonly used algorithm to solve Linear Programs.