Improving Co-existence of URLLC and Distributed AI using RL

This is a Master's thesis from KTH/School of Electrical Engineering and Computer Science (EECS)

Author: Wei Shi; [2023]

Keywords: 5G; URLLC; RL; HRL; Optimization;

Abstract: In 5G, the Ultra-Reliable and Low-Latency Communications (URLLC) service is envisioned to enable use cases with strict reliability and latency requirements on wireless communication. In the upcoming 6G network, machine learning (ML) is also expected to play an important role, introducing intelligence and further enhancing system performance. This thesis explores the deployment of reinforcement learning (RL), a popular sub-field of ML, to optimize the application-layer availability and reliability of the URLLC service in factory automation scenarios. In conventional RL methods, all decision variables are typically optimized in the same control loop. However, wireless systems' parameters can be optimized either at the cell level or globally, depending on how strongly inter-cell dynamics affect their optimal values. Although global optimization can provide better performance, it imposes major practical limitations on the control loop's delay. Moreover, global optimization of all decision variables requires excessive signaling and is therefore costly in terms of communication overhead. In this thesis, we propose a more flexible hierarchical reinforcement learning (HRL) framework that enables multiple agents and multi-level policies operating on different time scales for each optimization. We selected a use case from the prior art, optimizing the maximum number of retransmissions and the transmission power to industrial devices, and solved it with our HRL framework. Our simulation results in a factory automation scenario show that the HRL framework achieves performance similar to the ideal RL method, which greatly improves availability and reliability compared to the baseline solutions. In addition, the new HRL framework allows a more flexible allocation of agents: by placing the low-level agents close to the base stations, our framework also significantly reduces the signaling overhead compared to the one-agent RL method.
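The core idea above, one global agent acting on a slow timescale while per-cell agents act on a fast timescale, can be sketched as follows. This is a minimal illustrative sketch only: the class names, the epsilon-greedy tabular learners, the action sets (power levels, retransmission counts), and the reward handling are all assumptions for exposition, not the agents or parameters used in the thesis.

```python
import random

class LowLevelAgent:
    """Per-cell agent: picks the max number of retransmissions each step
    using a simple epsilon-greedy, tabular value estimate."""
    def __init__(self, actions, eps=0.1, lr=0.5, seed=0):
        self.actions = list(actions)
        self.eps = eps
        self.lr = lr
        self.q = {a: 0.0 for a in self.actions}
        self.rng = random.Random(seed)

    def act(self):
        # Explore with probability eps, otherwise exploit the best estimate.
        if self.rng.random() < self.eps:
            return self.rng.choice(self.actions)
        return max(self.q, key=self.q.get)

    def update(self, action, reward):
        # Incremental running-average update toward the observed reward.
        self.q[action] += self.lr * (reward - self.q[action])


class HighLevelAgent:
    """Global agent: re-selects a transmission-power level only every
    `period` steps, based on the reward accumulated since its last choice.
    This models the slower control loop of a global optimization."""
    def __init__(self, power_levels, period, seed=1):
        self.learner = LowLevelAgent(power_levels, seed=seed)
        self.period = period
        self.current = power_levels[0]
        self.acc_reward = 0.0
        self.steps = 0

    def step(self, reward):
        self.acc_reward += reward
        self.steps += 1
        if self.steps % self.period == 0:
            # Slow timescale: credit the averaged reward, then re-decide.
            self.learner.update(self.current, self.acc_reward / self.period)
            self.acc_reward = 0.0
            self.current = self.learner.act()
        return self.current
```

A usage loop would call `hl.step(...)` once per fast-timescale tick for the global power decision, and `act()`/`update()` on each cell's `LowLevelAgent` every tick, so the low-level policies adapt many times between each high-level decision.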
