Search: "Markov Decision Process"

Showing results 1 - 5 of 50 theses containing the words Markov Decision Process.

  1. Risk-Averse Multi-Armed Bandit Problem with Multiple Plays

    Master's thesis, Göteborgs universitet/Institutionen för data- och informationsteknik

    Authors: Siri Dahlgren; Nicholas Marriott; [2023-10-23]
    Keywords: MAB; Gittins; Markovian bandit; risk-aversion; policy iteration; multiple plays

    Abstract: This study aims to construct an efficient heuristic, referred to as RA, for a risk-averse Markovian multi-armed bandit problem (MAB) with multiple plays. The RA incorporates risk-aversion and multiple plays by modifying the Gittins index strategy.

  2. Optimal Order Placement Using Markov Models of Limit Order Books

    Master's thesis, KTH/Matematik (Avd.)

    Author: Max Oliveberg; [2023]
    Keywords: Optimal order placement; Limit order book; Markov

    Abstract: We study optimal order placement in a limit order book. By modelling the limit order book dynamics as a Markov chain, we can frame the purchase of a single share as a Markov Decision Process. Within the framework of the model, we can estimate optimal decision policies numerically. The trade rate is varied using a running cost control variable.

  3. Decreasing Training Time of Reinforcement Learning Agents for Remote Tilt Optimization using a Surrogate Neural Network Approximator

    Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author: Jiaming Huang; [2023]
    Keywords: none listed

    Abstract: One possible application of reinforcement learning in the telecommunication field is antenna tilt optimization. However, one of the key challenges is that handcrafted simulators, used as environments to provide information to agents, often make training reinforcement learning agents time-consuming.

  4. S-MARL: An Algorithm for Single-To-Multi-Agent Reinforcement Learning : Case Study: Formula 1 Race Strategies

    Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author: Marinaro Davide; [2023]
    Keywords: Reinforcement Learning; Single-to-Multi-Agent; Learning Stability; Exploration-Exploitation trade-off; Race Strategy Optimization

    Abstract: A Multi-Agent System is a group of autonomous, intelligent, interacting agents sharing an environment that they observe through sensors and act upon with actuators. The behaviors of these agents can either be defined up front by programmers or learned by trial and error using Reinforcement Learning.

  5. Random Edge is not faster than Random Facet on Linear Programs

    Master's thesis, KTH/Matematik (Avd.)

    Author: Nicole Hedblom; [2023]
    Keywords: Simplex method; simplex; Random Edge; Linear Programming; Random Facet; randomized pivoting rule; Markov decision process

    Abstract: A Linear Program is a problem where the goal is to maximize a linear function subject to a set of linear inequalities. Geometrically, this can be rephrased as finding the highest point on a polyhedron. The Simplex method is a commonly used algorithm to solve Linear Programs.