Search: "reward function"

Showing results 1 - 5 of 85 theses containing the words reward function.

  1. Decreasing Training Time of Reinforcement Learning Agents for Remote Tilt Optimization using a Surrogate Neural Network Approximator

    Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author: Jiaming Huang; [2023]
    Keywords: (none listed)

    Abstract: One possible application of reinforcement learning in the telecommunication field is antenna tilt optimization. However, one of the key challenges is that using handcrafted simulators as environments to provide information for agents often makes training reinforcement learning agents time-consuming.

  2. Smart Tracking for Edge-assisted Object Detection : Deep Reinforcement Learning for Multi-objective Optimization of Tracking-based Detection Process

    Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author: Shihang Zhou; [2023]
    Keywords: Tracking-By-Detection; Deep Reinforcement Learning; Multi-Objective Optimization

    Abstract: Detecting generic objects is an important sensing task for applications that need to understand the environment, for example eXtended Reality (XR) and drone navigation. However, Object Detection algorithms are particularly computationally heavy for real-time video analysis on resource-constrained mobile devices.

  3. Cooperative Modular Neural Networks for Artificial Intelligence in Games : A Comparison with A Monolithic Neural Network Regarding Technical Aspects and The Player Experience

    Thesis for a professional degree at advanced level, Blekinge Tekniska Högskola/Fakulteten för datavetenskaper

    Authors: Emil Högstedt; Ove Ødegård; [2023]
    Keywords: Neural Network; Modularization; Sensor; Reinforcement Learning; Supervised Learning

    Abstract: Recent years have seen multiple machine-learning research projects concerning agents in video games. Yet there is a disconnect between this academic research and the video game industry, evidenced by the fact that game developers still hesitate to use neural networks (NN) due to a lack of clarity and control.

  4. An operator theoretic approach to the Riemann Hypothesis

    Master's thesis, Lunds universitet/Matematik (naturvetenskapliga fakulteten); Lunds universitet/Matematikcentrum

    Author: Ramon Arjan van de Scheur; [2023]
    Keywords: Riemann hypothesis; Prime number theorem; Prime number theorem for arithmetic progressions; operator theoretic equivalent; compact operator; prime-counting function; Chebyshev function; Laplace transform; Fourier transform; convolution operator; weak topology; strong topology; compact perturbation; Tauberian theorem; Poisson kernel; holomorphic; self-adjoint; spectrum; trace; Mathematics and Statistics

    Abstract: In 2023 an operator theoretic approach to the Prime Number Theorem was introduced by Olsen. In this thesis this approach is examined and applied to give a new, operator theoretic, proof of a different version of the Prime Number Theorem and of the Prime Number Theorem for arithmetic progressions.

  5. Machine Learning-Based Instruction Scheduling for a DSP Architecture Compiler : Instruction Scheduling using Deep Reinforcement Learning and Graph Convolutional Networks

    Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author: Lucas Alava Peña; [2023]
    Keywords: Instruction Scheduling; Deep Reinforcement Learning; Compilers; Graph Convolutional Networks

    Abstract: Instruction Scheduling is a back-end compiler optimisation technique that can provide significant performance gains. It refers to arranging instructions in an order that reduces latency for processors with instruction-level parallelism.