If you’re in the mood to run your own monitoring infrastructure, I can strongly recommend Riemann as a central piece. It’s like a monitoring switchboard which is configured in Clojure, so you can make arbitrarily powerful aggregation and alarm logic before forwarding events to other systems for persistence and side-effects. We use it with Grafana and InfluxDB for dashboards, and plug it into Slack and OpsGenie for alerting.
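To give a rough feel for the "switchboard" idea, here is a minimal Python sketch of the pattern: every event is forwarded for persistence, while a rolling aggregate decides whether to raise an alert. This is not Riemann's API (a real Riemann config is written in Clojure), and the names `Event`, `moving_average_alarm`, `stored` and `alerts` are all hypothetical, chosen only for illustration.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass
class Event:
    host: str
    service: str
    metric: float
    time: float


# A sink is anything that accepts an event: a database writer, a chat hook, etc.
Sink = Callable[[Event], None]


def moving_average_alarm(window: int, threshold: float,
                         forward: Sink, alert: Sink) -> Sink:
    """Build a handler that forwards every event and alerts when the mean
    of the last `window` metrics exceeds `threshold`."""
    recent: Deque[float] = deque(maxlen=window)

    def handle(event: Event) -> None:
        forward(event)  # always pass the raw event on for persistence
        recent.append(event.metric)
        mean = sum(recent) / len(recent)
        # Only alert once the window is full, to avoid noisy startup alarms.
        if len(recent) == window and mean > threshold:
            alert(Event(event.host, event.service, mean, event.time))

    return handle


# Stand-ins for a time-series store and an alerting channel (assumptions).
stored: List[Event] = []
alerts: List[Event] = []

handler = moving_average_alarm(window=3, threshold=0.9,
                               forward=stored.append, alert=alerts.append)

for t, value in enumerate([0.5, 0.95, 0.97, 0.99]):
    handler(Event("web-1", "cpu", value, float(t)))
```

In a setup like the one described above, the `forward` sink would play the role of the InfluxDB writer and the `alert` sink the Slack or OpsGenie hook. Riemann expresses the same shape as nested stream functions in its config file.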