System Monitoring Application

Go
Jython
Sensu (0.18+)
Zendesk REST API
Java 7
Nagios Plugins
Redis

Introduction

This project was initially designed to help diagnose customers’ BPM environment issues, as BP3’s customers frequently had performance issues on their systems running IBM BPM. Although it was eventually used only internally to monitor BP3’s own BPM environments, it was designed to automatically open or update support tickets so that support staff were notified of issues with environments.

The UI for this project looks similar to Uchiwa's, which is the dashboard that comes with Sensu.

What is BPM Environment Monitoring?

Generally, servers are monitored for simple metrics such as:

However, BPM environments require monitoring of further variables, such as JVM settings, database sizes, and average task completion time, as these can become bottlenecks for system and BPM performance. Getting some of this information requires “jumping through hoops”, especially for IBM BPM environments [1].

My Responsibilities

During 2014 and 2015, I was responsible for designing, implementing, and delivering this project, with some help from other developers during the implementation phase. As this was my first project at BP3 (and a solo one), I faced several challenges. Despite my shortcomings, this was a great learning experience, and I believe it helped me improve my engineering skills immensely.

Once the product was ready for production use, our sysadmin would take over the task of running it, so I had to work closely with him to ensure he had a good experience using and maintaining it.

Design and Technologies Used

This project was designed to consist of 3 parts:

Since this project was initially intended to be sold to customers (with the synchronizer and notifier being offered as a SaaS product), it was designed to be scalable and easy to install for customers.

Sensu Monitoring Tool

Sensu is an open source tool for decentralized monitoring. Multiple clients and servers can be installed, and clients can connect to any of the available servers to relay their findings. Sensu also comes with a web UI, where users can monitor systems in real time. Sensu servers can then be configured to forward their findings through a provided API (in our case, via email).

The simple architecture of Sensu.

Nagios

Nagios is an industry standard for monitoring systems and networks. It offers thousands of open source plugins for various monitoring tasks. A few drawbacks of Nagios are a high barrier to entry (as it doesn’t have a clear client-server architecture like Sensu does) and a dated web UI. Since Sensu is decentralized, configuring it is easier: Sensu clients only need one server to connect to, and they are then redirected to other servers if needed. In addition, Sensu supports the use of Nagios plugins.

The left side shows the old Nagios interface, while the right side shows the new Nagios interface (not available at the time this project was developed).
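Nagios plugins report their result through a simple convention: one line of human-readable output plus an exit code (0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN). Sensu checks follow the same exit-code convention, which is why the two fit together. The following is only a minimal sketch of a check written in that style, not code from this project; the `evaluate` function, the heap metric, and the thresholds are all hypothetical.

```go
package main

import (
	"fmt"
	"os"
)

// Exit codes used by the Nagios plugin convention.
const (
	StatusOK       = 0
	StatusWarning  = 1
	StatusCritical = 2
	StatusUnknown  = 3
)

// evaluate maps a measured value to a status code and label using warning
// and critical thresholds. Negative values are treated as "could not
// measure" (an assumption made for this sketch).
func evaluate(value, warn, crit float64) (int, string) {
	switch {
	case value < 0:
		return StatusUnknown, "UNKNOWN"
	case value >= crit:
		return StatusCritical, "CRITICAL"
	case value >= warn:
		return StatusWarning, "WARNING"
	default:
		return StatusOK, "OK"
	}
}

func main() {
	// heapUsedPct is a placeholder; a real check would measure it.
	heapUsedPct := 72.5
	code, label := evaluate(heapUsedPct, 80, 90)

	// One line of output for humans, then the status as the exit code.
	fmt.Printf("%s - heap used %.1f%%\n", label, heapUsedPct)
	os.Exit(code)
}
```

Because the contract is just output plus exit code, a check like this can be written in any language and scheduled by either tool.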

Synchronizer and Notifier

When Sensu detected problems, it was configured to forward this information to a central email account. A listener (implemented in Java) watched this email account, and the incoming problem reports were then synchronized (using our synchronizer, implemented in Go) with the ones already stored in Redis. If no ticket existed on Zendesk for an incoming problem, a new ticket was created. Otherwise, the existing ticket was updated with the new information.
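The create-or-update decision described above can be sketched roughly as follows. This is an illustration under assumptions, not the original synchronizer: the `Event` fields, the `TicketStore` and `Ticketing` interfaces, and the `client/check` key format are hypothetical, and in-memory stand-ins take the place of Redis and the Zendesk REST API so the example runs on its own.

```go
package main

import "fmt"

// Event is a simplified problem report; the fields are illustrative.
type Event struct {
	Client string
	Check  string
	Output string
}

// TicketStore maps an event key to a ticket ID (Redis in the project).
type TicketStore interface {
	Get(key string) (int64, bool)
	Set(key string, ticketID int64)
}

// Ticketing creates and updates tickets (Zendesk in the project).
type Ticketing interface {
	Create(e Event) (int64, error)
	Update(ticketID int64, e Event) error
}

// eventKey identifies a recurring problem; the format is an assumption.
func eventKey(e Event) string { return e.Client + "/" + e.Check }

// Sync updates the existing ticket for an event, or creates one and
// remembers its ID. It reports the ticket ID and whether it was created.
func Sync(store TicketStore, t Ticketing, e Event) (int64, bool, error) {
	key := eventKey(e)
	if id, ok := store.Get(key); ok {
		return id, false, t.Update(id, e)
	}
	id, err := t.Create(e)
	if err != nil {
		return 0, false, err
	}
	store.Set(key, id)
	return id, true, nil
}

// memStore and fakeTickets are in-memory stand-ins for this sketch.
type memStore map[string]int64

func (m memStore) Get(k string) (int64, bool) { id, ok := m[k]; return id, ok }
func (m memStore) Set(k string, id int64)     { m[k] = id }

type fakeTickets struct {
	next    int64
	updates int
}

func (f *fakeTickets) Create(Event) (int64, error) { f.next++; return f.next, nil }
func (f *fakeTickets) Update(int64, Event) error   { f.updates++; return nil }

func main() {
	store, tickets := memStore{}, &fakeTickets{}
	e := Event{Client: "bpm-dev-01", Check: "jvm-heap", Output: "CRITICAL"}

	id, created, _ := Sync(store, tickets, e)
	fmt.Println(id, created) // first report: a ticket is created

	id, created, _ = Sync(store, tickets, e)
	fmt.Println(id, created) // repeat report: the same ticket is updated
}
```

Keeping the store and the ticketing system behind small interfaces is a design choice of this sketch; it lets the decision logic be exercised without a running Redis instance or a Zendesk account.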

Redis

Redis is an open source, in-memory data store that can be used as a database or cache. I chose Redis because I was only storing Zendesk ticket IDs against Sensu monitoring information, and Redis excels at storing key-value pairs and hashes. A disk-based database system was not needed in this case (and would not have been as easy to scale). Since Redis can also be backed up, this addressed the problem of losing data in the case of crashes.

Why Go?

At the time, the Go language had recently been released and was praised for its excellence at concurrent programming. Seeing as our goal was to create scalable software, Go was the perfect candidate.

A high-level architecture diagram, showing all the interconnected modules. The Synchronizer and Notifier were programmed in Go and use Redis to keep track of related tickets.

Difficulties Faced

The biggest problems faced were due to security constraints of customers:

What Did I Learn From This Project?

What Would I Do Differently Next Time?

Footnotes

