Continuing the discussion from Moving metrics out of postgres:
The other discussion was centered more on what might break if we move metrics out of postgres.
Here, I wanted to focus on the choice of the new metrics engine/database.
There are a few reasons for standardizing our metrics collection:
- Allow users to add custom metrics of their own to our database.
- Allow users to use their existing dashboards that already have adapters for our new database.
- Leverage GUI components that have already been written and integrated with our new database.
The first options that come to mind are Sensu / StatsD -> Graphite / Logstash, but the horsepower requirements are a bit excessive.
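For concreteness, this is roughly what feeding the front of that pipeline looks like on the wire. A minimal sketch in Python, assuming a StatsD daemon listening on its default UDP port 8125; the metric name is made up for illustration, not a proposed naming scheme:

```python
import socket

# Minimal sketch: push one gauge to a local StatsD daemon over UDP.
# Assumes the default StatsD port 8125; the metric name is hypothetical.
def send_gauge(name, value, host="127.0.0.1", port=8125):
    payload = f"{name}:{value}|g".encode("ascii")  # StatsD gauge line format
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))

send_gauge("appliance.collector.queue_depth", 42)
```

Part of the appeal of that front end is that emitting a metric is fire-and-forget UDP, so instrumenting our code stays cheap even when the aggregator is down.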
I want it to be able to collect data for disk arrays (@rpo has been giving me valuable edge cases to keep in mind here).
I also want to collect more system-centric metrics, so people like @akrzos do not need to install a separate monitoring solution because the one we use does not meet their needs or cannot be easily extended. And by extensible, I'd like that to mean grabbing an adapter already written from GitHub with minimal glue.
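To make "minimal glue" concrete, here is a sketch of the kind of adapter I have in mind: sample one system metric and write it straight to Graphite's plaintext listener (default port 2003). The metric path and host are assumptions for illustration, not an agreed-upon scheme:

```python
import shutil
import socket
import time

# Sketch of a "minimal glue" adapter: read one disk metric and send it
# using Graphite's plaintext protocol. Metric path and host are hypothetical.
def report_disk_used(path="/", host="127.0.0.1", port=2003):
    used = shutil.disk_usage(path).used
    # Graphite plaintext format: "<metric.path> <value> <unix_timestamp>\n"
    line = f"system.disk.root.used_bytes {used} {int(time.time())}\n"
    with socket.create_connection((host, port)) as sock:
        sock.sendall(line.encode("ascii"))

report_disk_used()
```

If adding a new collector is a ten-line script like this rather than a plugin SDK, that is what I would call easily extendable.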
Elasticsearch is quite neat but has failed for me in the past once it grew in size. Graylog has shown similar characteristics. I do not have experience with Logstash, but the demos are impressive. Graphite is a pain to install on the Mac, but it was successful on my ~20-node system. All of these systems have matured since I last used them, as well.
@phil_griffith
What type of solution have you used to collect metrics, and are there any things to keep an eye on? Where do you want to feed this data, and what adapters already exist to get it there?