Search: "Policy iteration"

Showing results 1 - 5 of 14 theses containing the words Policy iteration.

  1. Risk-Averse Multi-Armed Bandit Problem with Multiple Plays

    Master's thesis, Göteborgs universitet/Institutionen för data- och informationsteknik

    Authors: Siri Dahlgren; Nicholas Marriott; [2023-10-23]
    Keywords: MAB; Gittins; Markovian bandit; risk-aversion; policy iteration; multiple plays;

    Abstract: This study aims to construct an efficient heuristic, referred to as RA, for a risk-averse Markovian multi-armed bandit problem (MAB) with multiple plays. The RA incorporates risk-aversion and multiple plays by modifying the Gittins index strategy.

  2. Tackling Non-Stationarity in Reinforcement Learning via Latent Representation: An Application to Intraday Foreign Exchange Trading

    Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author: Adriano Mundo; [2023]
    Keywords: Reinforcement Learning; Latent Representation; VAE; Non-Stationary; FQI; FX Trading;

    Abstract: Reinforcement Learning has applications in various domains, but it typically assumes a stationary process. When this assumption does not hold, performance may be sub-optimal.

  3. RDF vocabulary: Translation of policies with RDF

    Bachelor's thesis, Linköpings universitet/Institutionen för datavetenskap

    Authors: Sergio Garcia Bernabeu; Lukas Bergdahl; [2023]
    Keywords: RDF; RDF vocabulary; Security; ODRL; Dublin core; Policy; RDF translation;

    Abstract: Throughout this thesis, we have worked on translating policies into RDF formats and testing RDF vocabularies. Our goal is to create policies that can be applied to future industries within a circular economy. While Onto-Deside is the primary source of motivation for this work, we do not focus on it in this thesis.

  4. Changing the Stories We Live By: Revolutionizing the North American Model of Wildlife Conservation Through Transformative Conservation

    Master's thesis, Uppsala universitet/Institutionen för geovetenskaper

    Author: Tess Marie Burroughs; [2022]
    Keywords: Sustainable Development; Wildlife Conservation; Biodiversity Loss; Critical Discourse Analysis; Transformative Conservation;

    Abstract: As biodiversity continues to diminish worldwide, an interrogation of long-standing conservation discourse is needed to reformulate a new conservation rhetoric that confronts the socio-ecological complexities of the world and reorients the relationship between humans and nature. Using ecologically sensitive critical discourse analysis, this research investigates the dominant ideologies perpetuated within an iteration of mainstream American wildlife discourse and explores opportunities for transformative conservation alternatives.

  5. Offline Reinforcement Learning for Remote Electrical Tilt Optimization: An Application of Conservative Q-Learning

    Master's thesis, KTH/Matematik (Avd.)

    Author: Marcus Kastengren; [2021]
    Keywords: Remote Electrical Tilt; Antenna Tilt Optimization; Reinforcement Learning; Offline Reinforcement Learning; Conservative Q-Learning;

    Abstract: In telecom networks, adjusting the tilt of antennas in an optimal manner, known as remote electrical tilt (RET) optimization, is a method to ensure quality of service (QoS) for network users. Tilt adjustments made during operations in real-world networks are usually executed through a suboptimal policy, and a significant amount of data is collected during the execution of such a policy.