In this work-in-progress paper, we present our ongoing effort to devise a priority-based resource scheduling technique and framework for Apache Storm. Apache Storm is a popular distributed real-time stream processing engine that has been widely adopted by key industry players, including Yahoo and Twitter. An application running in Storm is called a topology and is characterized by a Directed Acyclic Graph (DAG). To run multiple such topologies in a Storm cluster, Storm provides a default, out-of-the-box scheduler called the Isolation Scheduler. The Isolation Scheduler assigns resources to topologies based on a static resource configuration and provides no means of prioritizing topologies according to their varying business priorities. As a result, performance degradation, and even complete starvation, of topologies with high business priority is possible when the available cluster resources are insufficient. This paper proposes a priority-based resource scheduling strategy to overcome this problem. A preliminary performance evaluation demonstrates the effectiveness of the proposed scheduler over the default Storm Isolation Scheduler.
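
To make the starvation problem concrete, here is a minimal, self-contained sketch that contrasts a static, submission-order allocation with a priority-ordered one. It is an illustration only: it is not the scheduler proposed in the paper and it does not use the Apache Storm API. The `Topology` class, the `allocate` function, the topology names, and all slot counts and priority values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topology:
    """Hypothetical stand-in for a Storm topology (not a Storm class)."""
    name: str
    requested_slots: int
    priority: int  # assumption: a larger value means higher business priority


def allocate(topologies, total_slots, by_priority):
    """Greedily hand out worker slots until the cluster runs out.

    With by_priority=False, topologies are served in submission order,
    mimicking a static configuration that ignores business priority.
    With by_priority=True, higher-priority topologies are served first.
    """
    if by_priority:
        order = sorted(topologies, key=lambda t: -t.priority)
    else:
        order = list(topologies)

    free = total_slots
    allocation = {}
    for topology in order:
        granted = min(topology.requested_slots, free)
        allocation[topology.name] = granted
        free -= granted
    return allocation


# Illustrative workload: the cluster has fewer slots than the total requested.
topologies = [
    Topology("reporting", requested_slots=6, priority=1),
    Topology("billing", requested_slots=6, priority=10),
]

static = allocate(topologies, total_slots=8, by_priority=False)
prioritized = allocate(topologies, total_slots=8, by_priority=True)

print("static:     ", static)
print("prioritized:", prioritized)
```

In this toy workload the submission-order allocation leaves the high-priority `billing` topology short of slots, while the priority-ordered allocation serves it in full and pushes the shortfall onto `reporting`. A real scheduler would also have to consider factors such as placement across nodes and the rebalancing of running topologies, which this sketch ignores.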

Additional Metadata
Keywords: Apache Storm, Big Data, Distributed Computing, Distributed Stream Processing (DSP), Event Processing, Priority Scheduling, Resource Management
Persistent URL: dx.doi.org/10.1109/SPECTS.2016.7570513
Conference: 2016 International Symposium on Performance Evaluation of Computer and Telecommunication Systems, SPECTS 2016
Citation
Chakraborty, R. (Rudraneel), & Majumdar, S. (2016). A priority based resource scheduling technique for multitenant storm clusters. In Proceedings of the 2016 International Symposium on Performance Evaluation of Computer and Telecommunication Systems, SPECTS 2016 - Part of SummerSim 2016 Multiconference. doi:10.1109/SPECTS.2016.7570513