
Research Article
Adaptive Online Estimation of Thrashing-Avoiding Memory Reservations for Long-Lived Containers
@INPROCEEDINGS{10.1007/978-3-030-67537-0_37,
  author={Jiayun Lin and Fang Liu and Zhenhua Cai and Zhijie Huang and Weijun Li and Nong Xiao},
  title={Adaptive Online Estimation of Thrashing-Avoiding Memory Reservations for Long-Lived Containers},
  proceedings={Collaborative Computing: Networking, Applications and Worksharing. 16th EAI International Conference, CollaborateCom 2020, Shanghai, China, October 16--18, 2020, Proceedings, Part I},
  proceedings_a={COLLABORATECOM},
  year={2021},
  month={1},
  keywords={Memory reservation estimation, Intelligent cluster scheduling, Cloud datacenter, Reinforcement learning, Long-live container},
  doi={10.1007/978-3-030-67537-0_37}
}
Jiayun Lin
Fang Liu
Zhenhua Cai
Zhijie Huang
Weijun Li
Nong Xiao
Year: 2021
COLLABORATECOM
Springer
DOI: 10.1007/978-3-030-67537-0_37
Abstract
Data-intensive computing systems in cloud datacenters create long-lived containers and allocate memory to them to execute long-running applications. Estimating exactly how much memory to reserve for containers, so that applications run smoothly while resource utilization stays high, is a challenge. The current state-of-the-art work has two limitations. First, its prediction accuracy is restricted by the monotonicity of the iterative search. Second, application performance fluctuates because of the termination conditions. In this paper, we propose two improved strategies based on MEER, called MEER+ and Deep-MEER, which are designed to assist memory allocation on top of resource managers such as YARN. MEER+ adds one more approximation step to MEER, making the iterative search bi-directional so that it better approaches the optimal value. Based on reinforcement learning and rich data, Deep-MEER achieves thrashing-avoiding estimation without involving termination conditions. Because the two strategies have different input requirements and advantages, we also propose a scheme that adaptively adopts MEER+ and Deep-MEER over the cluster life cycle. We have evaluated MEER+ and Deep-MEER, and our experimental results show that they yield up to 88% and 20% higher accuracy, respectively. Moreover, Deep-MEER guarantees stable performance for applications during recurring executions.