Search: "Policy iteration"
Showing results 1 - 5 of 14 theses containing the words Policy iteration.
1. Risk-Averse Multi-Armed Bandit Problem with Multiple Plays
Master's thesis, Göteborgs universitet/Institutionen för data- och informationsteknik. Abstract: This study aims to construct an efficient heuristic, referred to as RA, for a risk-averse Markovian multi-armed bandit problem (MAB) with multiple plays. The RA incorporates risk aversion and multiple plays by modifying the Gittins index strategy.
2. Tackling Non-Stationarity in Reinforcement Learning via Latent Representation : An application to Intraday Foreign Exchange Trading
Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS). Abstract: Reinforcement learning has applications in various domains, but it typically assumes a stationary process. When this assumption does not hold, performance may be suboptimal.
3. RDF vocabulary : Translation of policies with RDF
Bachelor's thesis, Linköpings universitet/Institutionen för datavetenskap. Abstract: Throughout this thesis, we have worked on translating policies into RDF formats and testing RDF vocabularies. Our goal is to create policies that can be applied to future industries within a circular economy. While Onto-Deside is the primary source of motivation for this work, we do not focus on it in this thesis.
4. Changing the Stories We Live By: Revolutionizing the North American Model of Wildlife Conservation Through Transformative Conservation
Master's thesis, Uppsala universitet/Institutionen för geovetenskaper. Abstract: As biodiversity continues to diminish worldwide, an interrogation of long-standing conservation discourse is needed to reformulate a new conservation rhetoric that confronts the socio-ecological complexities of the world and reorients the relationship between humans and nature. Using ecologically sensitive critical discourse analysis, this research investigates the dominant ideologies perpetuated within an iteration of mainstream American wildlife discourse and explores opportunities for transformative conservation alternatives.
5. Offline Reinforcement Learning for Remote Electrical Tilt Optimization : An application of Conservative Q-Learning
Master's thesis, KTH/Matematik (Avd.). Abstract: In telecom networks, adjusting the tilt of antennas in an optimal manner, so-called remote electrical tilt (RET) optimization, is a method to ensure quality of service (QoS) for network users. Tilt adjustments made during operations in real-world networks are usually executed through a suboptimal policy, and a significant amount of data is collected during the execution of such a policy.