
Research Article
A Reinforcement Learning Based Approach to Identify Resource Bottlenecks for Multiple Services Interactions in Cloud Computing Environments
@INPROCEEDINGS{10.1007/978-3-030-67540-0_4,
  author={Lingxiao Xu and Minxian Xu and Richard Semmes and Hui Li and Hong Mu and Shuangquan Gui and Wenhong Tian and Kui Wu and Rajkumar Buyya},
  title={A Reinforcement Learning Based Approach to Identify Resource Bottlenecks for Multiple Services Interactions in Cloud Computing Environments},
  proceedings={Collaborative Computing: Networking, Applications and Worksharing. 16th EAI International Conference, CollaborateCom 2020, Shanghai, China, October 16--18, 2020, Proceedings, Part II},
  proceedings_a={COLLABORATECOM PART 2},
  year={2021},
  month={1},
  keywords={Cloud computing, Reinforcement learning, Service interactions, Non-functional requirement, Resource bottleneck},
  doi={10.1007/978-3-030-67540-0_4}
}
Lingxiao Xu
Minxian Xu
Richard Semmes
Hui Li
Hong Mu
Shuangquan Gui
Wenhong Tian
Kui Wu
Rajkumar Buyya
Year: 2021
COLLABORATECOM PART 2
Springer
DOI: 10.1007/978-3-030-67540-0_4
Abstract
Cloud service providers provision resources, including a variety of virtual machine instances, to support customers that migrate their services to the cloud. From the customers’ perspective, selecting the appropriate amount of resources is tightly coupled with performance and cost. Identifying potential resource bottlenecks early in the service deployment process can significantly improve resource planning. However, because workloads are unpredictable and resources are heterogeneous, it is difficult to identify the resource bottlenecks that degrade system performance. To better support system non-functional requirements (NFRs), we propose a reinforcement learning based approach to NFR management for scenarios with multiple service interactions, which identifies potential resource bottlenecks and optimizes the demanded resources. The proposed approach can predict the resource bottleneck for multiple service interactions, e.g., a CPU bottleneck or an overload in a specific service, and provide guidance for resource planning. We modeled and simulated the proposed approach using an extended version of the CloudSim toolkit. Comprehensive evaluations with a realistic use case from Siemens Digital Industries Software’s MindSphere Solution on AliCloud show that the proposed approach achieves high accuracy on performance metrics such as response time, queries per second (QPS), and resource usage.