One of the most common ways to monitor the execution of software applications is to analyze their logs. Logs are the execution footprint of a software application, produced and stored for real-time or post-execution analysis. As software applications become large, complex, distributed and web-scale (so-called big data applications), the logs they produce also become large-scale: large in volume, velocity and variety. It is therefore crucial to analyze such logs in an automated, scalable and effective manner, to ensure high veracity and to obtain analytics of high value. In this paper, we present a formal model for organizing and structuring logs. We then present an analysis approach based on a Bayesian deep learning network that uses this formal model to detect and predict possible faults and their consequences. We also present a MapReduce-based distributed, parallel, single-pass and incremental approach to build, train and execute the proposed Bayesian deep learning framework. This enables effective processing of logs on cloud platforms and, in turn, efficient handling of logs produced at big data scale by big data applications.
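
The single-pass, incremental, MapReduce-style property mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation. It only shows the general idea that conditional probability estimates for a Bayesian model can be built from counts that are computed per chunk in one pass (map) and merged associatively (reduce), so that new logs can be folded in without reprocessing old ones. The log line format `<timestamp> <component> <level>` and the names `map_chunk`, `reduce_counts` and `fault_probability` are assumptions made for illustration.

```python
from collections import Counter
from functools import reduce

def map_chunk(lines):
    """Map step: one pass over a chunk of log lines, emitting local counts.

    Assumed (hypothetical) line format: "<timestamp> <component> <level>".
    """
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip malformed lines
        _, component, level = parts[:3]
        counts[(component, level)] += 1
    return counts

def reduce_counts(a, b):
    """Reduce step: merging counts is associative and commutative,
    so chunks can be processed in parallel and in any order."""
    return a + b

def fault_probability(counts, component, fault_level="ERROR"):
    """Estimate P(level = fault_level | component) from merged counts."""
    total = sum(n for (c, _), n in counts.items() if c == component)
    if total == 0:
        return 0.0
    return counts[(component, fault_level)] / total

# Two chunks, as if handled by two parallel mappers.
chunks = [
    ["t1 db INFO", "t2 db ERROR", "t3 web INFO"],
    ["t4 db INFO", "t5 web INFO", "t6 db ERROR"],
]
model = reduce(reduce_counts, map(map_chunk, chunks), Counter())

# Incremental update: fold in newly arrived logs without re-reading old ones.
model = reduce_counts(model, map_chunk(["t7 db INFO", "t8 db INFO"]))

print(fault_probability(model, "db"))
```

Because the merged counts are the only state that is kept, the same `reduce_counts` call serves both the parallel merge and the incremental update.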

Additional Metadata
Keywords Applications, Bayesian networks, Big Data, Deep learning, Execution, Fault detection, Formal model, Logs, MapReduce, Monitoring
Conference 5th IEEE International Conference on Big Data, Big Data 2017
Shafiq, M.O., & Torunski, E. (Eric). (2018). Towards MapReduce based Bayesian deep learning network for monitoring big data applications. In Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017 (pp. 2112–2121). doi:10.1109/BigData.2017.8258159