Search: "Multi-armed bandits"
Showing results 1 - 5 of 8 theses containing the words Multi-armed bandits.
1. An Empirical Survey of Bandits in an Industrial Recommender System Setting
Master's thesis, Göteborgs universitet/Institutionen för data- och informationsteknik
Abstract: This thesis investigates the effects of incorporating unstructured data (images in the wild) in contextual multi-armed bandits used within a recommender system setting that focuses on picture-based content suggestion. The idea is to employ image features, extracted by a pre-trained convolutional neural network, and to study the resulting bandit behaviors when this information is included in or excluded from the typical context creation, which normally relies on structured data sources such as metadata.
2. Causal Reinforcement Learning for Bandits with Unobserved Confounders
Master's thesis, Uppsala universitet/Institutionen för informationsteknologi
Abstract: Reinforcement Learning (RL) has been recognized as a valuable tool in various fields. However, its application is limited by its reliance on extensive data gathered through trial and error, and by challenges in generalizing learned policies.
3. Graph Bandits : Multi-Armed Bandits with Locality Constraints
Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)
Abstract: Multi-armed bandits (MABs) have been studied extensively in the literature and have applications in a wealth of domains, including recommendation systems, dynamic pricing, and investment management. On the one hand, the current MAB literature largely seems to focus on the setting where each arm is available to play at each time step, and ignores how agents move between the arms.
4. A Recommender System for Suggested Sites using Multi-Armed Bandits : Initialising Bandit Contexts by Neural Collaborative Filtering
Master's thesis, Linköpings universitet/Institutionen för datavetenskap
Abstract: The abundance of information available on the internet necessitates means of quickly finding what is relevant for the individual user. To this end, there has been much research on recommender systems and, lately, specifically on methods that use deep learning for such systems.
5. Reference Tracking with Adversarial Adaptive Output-Feedback Model Predictive Control
Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)
Abstract: Model Predictive Control (MPC) is an optimization-based control strategy that handles system constraints explicitly, making it a popular feedback control method in real industrial processes. However, designing this control policy is expensive, since an explicit model of the process is required when re-tuning the controller.