
Research Article
Reactive Workflow Scheduling in Fluctuant Infrastructure-as-a-Service Clouds Using Deep Reinforcement Learning
@INPROCEEDINGS{10.1007/978-3-030-67540-0_17,
  author={Qinglan Peng and Wanbo Zheng and Yunni Xia and Chunrong Wu and Yin Li and Mei Long and Xiaobo Li},
  title={Reactive Workflow Scheduling in Fluctuant Infrastructure-as-a-Service Clouds Using Deep Reinforcement Learning},
  proceedings={Collaborative Computing: Networking, Applications and Worksharing. 16th EAI International Conference, CollaborateCom 2020, Shanghai, China, October 16--18, 2020, Proceedings, Part II},
  proceedings_a={COLLABORATECOM PART 2},
  year={2021},
  month={1},
  keywords={Workflow scheduling, IaaS cloud, Quality-of-Service, Pay-as-you-go, Reinforcement learning},
  doi={10.1007/978-3-030-67540-0_17}
}
Qinglan Peng
Wanbo Zheng
Yunni Xia
Chunrong Wu
Yin Li
Mei Long
Xiaobo Li
Year: 2021
COLLABORATECOM PART 2
Springer
DOI: 10.1007/978-3-030-67540-0_17
Abstract
As a promising and evolving computing paradigm, cloud computing benefits computation-intensive scientific applications, which are usually orchestrated as workflows, by providing unlimited, elastic, and heterogeneous resources in a pay-as-you-go way. Given a workflow template, identifying a set of appropriate cloud services that fulfill users’ functional requirements under pre-given constraints is widely recognized to be a challenge. Moreover, because the supporting cloud infrastructures can be highly prone to performance variations and fluctuations, various challenges, such as guaranteeing user-perceived performance and reducing the cost of cloud-supported scientific workflows, need to be properly tackled. Traditional approaches tend to ignore such fluctuations when scheduling workflow tasks and thus can lead to frequent violations of Service-Level Agreements (SLAs). In contrast, we take such fluctuations into consideration, formulate the workflow scheduling problem as a continuous decision-making process, and propose a reactive, deep-reinforcement-learning-based method, named DeepWS, to solve it. Extensive case studies based on real-world workflow templates show that our approach significantly outperforms traditional ones in terms of SLA-violation rate and total cost.