Search: "multi-armed bandits"

Showing results 1 - 5 of 8 theses containing the words multi-armed bandits.

  1. An Empirical Survey of Bandits in an Industrial Recommender System Setting

    Master's thesis, Göteborgs universitet/Institutionen för data- och informationsteknik

    Authors: Tobias Schwarz; Johan Brandby; [2023-09-21]
    Keywords: computer science; industrial application; machine learning; reinforcement learning; multi-armed bandits; MAB; contextual multi-armed bandits; survey; batch learning;

    Abstract: This thesis investigates the effects of incorporating unstructured data (images in the wild) in contextual multi-armed bandits used within a recommender system setting that focuses on picture-based content suggestion. The idea is to employ image features, extracted by a pre-trained convolutional neural network, and study the resulting bandit behavior when including or excluding this information in the typical context creation, which normally relies on structured data sources such as metadata.

  2. Causal Reinforcement Learning for Bandits with Unobserved Confounders

    Master's thesis, Uppsala universitet/Institutionen för informationsteknologi

    Author: Mingwei Deng; [2023]
    Keywords: (none listed)

    Abstract: Reinforcement Learning (RL) has been recognized as a valuable tool in various fields. However, its application is limited by its reliance on extensive data gathered through trial and error, and by challenges in generalizing learned policies.

  3. Graph Bandits: Multi-Armed Bandits with Locality Constraints

    Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author: Kasper Johansson; [2022]
    Keywords: Multi-armed bandits; locality constraints; reinforcement learning; Flerarmade banditer; lokala restriktioner; förstärkningsinlärning;

    Abstract: Multi-armed bandits (MABs) have been studied extensively in the literature and have applications in a wealth of domains, including recommendation systems, dynamic pricing, and investment management. On the one hand, the current MAB literature largely seems to focus on the setting where each arm is available to play at each time step, and ignores how agents move between the arms.

  4. A Recommender System for Suggested Sites using Multi-Armed Bandits: Initialising Bandit Contexts by Neural Collaborative Filtering

    Master's thesis, Linköpings universitet/Institutionen för datavetenskap

    Author: William Stenberg; [2021]
    Keywords: Recommender Systems; Neural Collaborative Filtering; Multi-Armed Bandits;

    Abstract: The abundance of information available on the internet necessitates means of quickly finding what is relevant for the individual user. To this end, there has been much research on recommender systems and, lately, specifically on methods that use deep learning for such systems.

  5. Reference Tracking with Adversarial Adaptive Output-Feedback Model Predictive Control

    Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author: Linda Bui; [2021]
    Keywords: Model Predictive Control; Adversarial Multi-Armed Bandits; Kalman Filter; Output-Feedback; Adaptive Control; Modell Prediktiv Reglering; Kontradiktoriska Flerarmade Banditer; Kalman Filter; Output-Feedback; Adaptiv Reglering;

    Abstract: Model Predictive Control (MPC) is a control strategy based on optimization that handles system constraints explicitly, making it a popular feedback control method in real industrial processes. However, designing this control policy is an expensive operation since an explicit model of the process is required when re-tuning the controller.