Search: "policy gradient"

Showing results 1 - 5 of 14 theses containing the words policy gradient.

  1. Reinforcement Learning for Real Time Bidding

    Master's thesis, Lund University/Department of Computer Science

    Author: Erik Smith; [2019]
    Keywords: Reinforcement learning; Markov decision process; value iteration; policy gradient; real time bidding; Technology and Engineering

    Abstract: When an internet user opens a web page containing an advertising slot, how is it determined which ad is shown? Today, the most common software-based approach to trading advertising slots is real time bidding: as soon as the user begins to load the web page, an auction for the slot is held in real time, and the highest bidder gets to display their advertisement of choice. Auction bidding is performed by different demand side platforms (DSPs).

  2. Impact of observation noise and reward sparseness on Deep Deterministic Policy Gradient when applied to inverted pendulum stabilization

    Bachelor's thesis, KTH/School of Electrical Engineering and Computer Science (EECS)

    Authors: Adam Björnberg; Haris Poljo; [2019]

    Abstract: Deep Reinforcement Learning (RL) algorithms have been shown to solve complex problems. Deep Deterministic Policy Gradient (DDPG) is a state-of-the-art deep RL algorithm able to handle environments with continuous action spaces.

  3. Generalizing Deep Deterministic Policy Gradient

    Bachelor's thesis, KTH/School of Electrical Engineering and Computer Science (EECS)

    Authors: Gustaf Jacobzon; Martin Larsson; [2018]

    Abstract: We extend Deep Deterministic Policy Gradient, a state-of-the-art algorithm for continuous control, in order to achieve a high generalization capability. To achieve better generalization capabilities for the agent, we introduce dropout, one of the most successful regularization techniques in machine learning, to the algorithm.

  4. Real-time System Control with Deep Reinforcement Learning

    Bachelor's thesis, KTH/School of Engineering Sciences (SCI)

    Authors: Gustav Gybäck; Fredrik Röstlund; [2018]

    Abstract: We reproduce the Deep Deterministic Policy Gradient algorithm presented in the paper "Continuous Control With Deep Reinforcement Learning" to verify its results. We also strive to explain the machine learning framework needed to understand the algorithm.

  5. Deep reinforcement learning i distribuerad optimering [Deep reinforcement learning in distributed optimization]

    Bachelor's thesis, KTH/School of Engineering Sciences (SCI)

    Authors: Marcus Lindström; Jahangir Jazayeri; [2018]

    Abstract: Reinforcement learning has recently become a promising area of machine learning, with significant achievements. Recent successes include surpassing human experts on Atari games and AlphaGo becoming the first computer program ranked at the highest professional level in the game of Go, to mention a few.