Deep Reinforcement Learning and Simulation for the Optimization of Production Systems

This is a Master's thesis from Uppsala University / Department of Information Technology

Author: Siyuan Chen; [2022]

Abstract: The main objective of this master's thesis project is to use deep reinforcement learning (DRL) combined with simulation to optimize production systems. First, the Deep Q-Network (DQN) algorithm is used to optimize seven decision variables in Averill Law’s production system in search of the best profit, reaching 99.970% accuracy in profit output. The DQN algorithm is then used to optimize scheduling in a simulation model of a flow shop, first with a model of three operations and then with a more complex model of six operations. The flow shop model also serves as the model for the upcoming Automated Guided Vehicle (AGV) scheduling in EXPLAIN, since the current DQN training and optimization focuses on selecting which orders are picked up by machines or individual AGVs, i.e., scheduling. To reflect real-world characteristics, multi-objective training is added later in the project, considering a large number of examples and using simulation optimization as a basis. The project uses DQN for deep reinforcement learning training and formulates the problem as a Markov decision process to define the states, actions, and reward functions, together with the DQN loss function. Optuna is used to find the optimal combination of hyperparameters for the DQN process in the production system. The project has a few limitations: only one DRL algorithm is used, so no comparison across DRL algorithms is made; Optuna is used to find hyperparameters only in the Averill Law example and not in the flow shop scheduling experiments; and GPU acceleration is not used to speed up the reinforcement learning training. Future work will continue to focus on AGV scheduling, with more attention to the target destination and route selection for each AGV.
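
As a rough illustration of how states, actions, rewards, and a loss function fit together in DQN, the following is a minimal sketch in plain Python. It is not the thesis implementation: the function names (`epsilon_greedy`, `td_target`, `dqn_loss`), the table-based stand-ins for the neural networks, and the toy numbers are all assumptions made for this example. It shows the standard epsilon-greedy action choice and the mean squared temporal-difference error that DQN minimizes.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(reward, next_q_values, gamma, done):
    """Bellman target: r if the episode ended, else r + gamma * max_a' Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def dqn_loss(batch, q_online, q_target, gamma):
    """Mean squared TD error over (state, action, reward, next_state, done) tuples."""
    total = 0.0
    for state, action, reward, next_state, done in batch:
        y = td_target(reward, q_target(next_state), gamma, done)
        total += (y - q_online(state)[action]) ** 2
    return total / len(batch)


# Hypothetical toy setup: two states, two actions (e.g. "pick order 0" or
# "pick order 1"), with a lookup table standing in for the Q-network.
q_table = {0: [1.0, 2.0], 1: [0.5, 3.0]}


def q_fn(state):
    return q_table[state]


# Two made-up transitions; the second one ends the episode.
toy_batch = [(0, 1, 1.0, 1, False), (1, 0, 2.0, 0, True)]
toy_loss = dqn_loss(toy_batch, q_fn, q_fn, gamma=0.9)
greedy_action = epsilon_greedy(q_fn(0), epsilon=0.0)
```

In a real DQN setup the online and target Q-functions would be separate neural networks, with the target network updated only periodically; the sketch passes the same table for both purely to keep the example short.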
