  1. A Partially Observable Markov Decision Process for Breast Cancer Screening

    Master's thesis, Linköpings universitet/Statistik och maskininlärning

    Author: Joshua Hudson; [2019]
    Keywords: POMDP; Markov Decision Process; Breast Cancer; Screening; Operations Research;

    Abstract: In the US, breast cancer is one of the most common forms of cancer and the most lethal. There are many decisions that must be made by the doctor and/or the patient when dealing with a potential breast cancer.


    Master's thesis, Mälardalens högskola/Inbyggda system

    Authors: Gallardo Marielle; Chakraborty Sweta; [2019]
    Keywords: shared-space users; MPDM; timing analysis; planning and decision-making; autonomous vehicle; MDP; reinforcement learning; social force model;

    Abstract: Autonomous driving requires tactical decision-making while navigating in a dynamic shared space environment. The complexity and uncertainty in this process arise due to unknown and tightly-coupled interaction among traffic users.

  3. A study of the exploration/exploitation trade-off in reinforcement learning : Applied to autonomous driving

    Bachelor's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Authors: Ruwaid Louis; David Yu; [2019]

    Abstract: A world initiative was set in motion to decrease the number of traffic accidents. Autonomous driving is a field that contributes to the initiative. This report examines the exploration/exploitation trade-off in reinforcement learning applied to decision making in autonomous driving.

  4. Reinforcement Learning for Real Time Bidding

    Master's thesis, Lunds universitet/Institutionen för datavetenskap

    Author: Erik Smith; [2019]
    Keywords: Reinforcement learning; Markov decision process; value iteration; policy gradient; real time bidding; Technology and Engineering;

    Abstract: When an internet user opens a web page containing an advertising slot, how is it determined which ad is shown? Today, the most common software-based approach to trading advertising slots is real time bidding: as soon as the user begins to load the web page, an auction for the slot is held in real time, and the highest bidder gets to display their advertisement of choice. Auction bidding is performed by different demand side platforms (DSPs).

  5. Att spela 'Breakout' med hjälp av 'Deep Q-Learning' [Playing 'Breakout' using 'Deep Q-Learning']

    Bachelor's thesis, KTH/Skolan för teknikvetenskap (SCI)

    Authors: Gabriel Andersson; Martti Yap; [2019]

    Abstract: In this report we implement a reinforcement learning (RL) algorithm that learns to play Breakout on the 'Atari Learning Environment'. The computer-controlled player (the agent) has access to the same information as a human player and knows nothing about the game or its rules beforehand.