Challenges of Monitoring Distributed Systems

Last October, one of our co-founders and senior consultants, Nenad Bozic, gave a presentation at Data Science Conference 2.0 on the challenges of monitoring distributed systems.

This is the abstract of the presentation:

Back in the day, you had a single machine and could scroll through a single log file to figure out what was going on. In today's Big Data world, you need to combine many logs to get the same picture. Data arrives in huge volumes and at high speed, so picking out the important information and getting rid of the noise becomes a real challenge. There is a need for a centralized monitoring platform that aids the engineers operating the systems and serves the right information at the right time.

This talk will help you understand these challenges and give you an idea of which tools and technology stacks are a good fit for successfully monitoring Big Data systems. The focus will be on free, open source solutions. The problem can be separated into two domains, both of which are covered in this talk: a metrics stack that gathers simple metrics in a central place, and a log stack that aggregates logs from different machines in a central place. We will finish with a combined stack and ideas for how it can be improved even further with alerting and automated failover scenarios.

Here is a link to the video:

and here are the SLIDES.

Nenad Bozic

Co-founder & CEO

Software engineer with more than 10 years of experience, currently focused on data-intensive systems. Certified Cassandra developer and DataStax MVP for Apache Cassandra for 2016/2017. Strong believer in a balance between good technical skills and soft skills. Striving for knowledge is his main drive, which is why he enjoys learning new tools and languages, blogging, working on open source, and presenting.