As @Fryguy pointed out, I really do want to write up some ideas about this.
Here’s a super high-level view of the ideas that I’ve been kicking around. First, we break down the provider interactions into three areas:
- Inventory Collection (a.k.a. “ems refresh”)
- Event catching
- Metrics collection (a.k.a. “cap and u”)
Currently, inventory collection is nicely separated into two steps:
a. Collect inventory from the provider (build a huge hash of the entire inventory)
b. Save the inventory hash into the ManageIQ database
What’s nice about the current process is that gathering data is separated from the deep knowledge of the ManageIQ database fields.
However, there’s still the problem of knowing exactly how to construct the massive inventory hash in step (a).
The suggestion to make this even easier to consume from the outside is to introduce the idea of an inventory event handler.
This way, someone wishing to write the inventory collector for a brand new provider would only need to know how to collect the inventory from the provider, and how to emit events that our inventory event handler cares about. Then, under the covers, our inventory event handler would be able to build the appropriate inventory hash, and update the database accordingly. In fact, in the future, this would give us much more flexibility to update the “save inventory” process however we want without ever impacting the inventory collection process. Today, it’s absolutely necessary to collect the entire inventory before saving it to the DB. Tomorrow, maybe we could save it in stages, or who knows…
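To make the idea a bit more concrete, here is a minimal sketch in Python (used only for brevity; it is not ManageIQ code). Everything in it is an assumption for illustration: the `InventoryEventHandler` class, its `emit` and `build_hash` methods, and the `ems_ref`/`name` fields are hypothetical names, and the real inventory hash is far larger.

```python
from collections import defaultdict


class InventoryEventHandler:
    """Hypothetical handler: accumulates emitted inventory events into one hash."""

    def __init__(self):
        # One list per inventory type, e.g. {"hosts": [...], "vms": [...]}
        self.inventory = defaultdict(list)

    def emit(self, kind, attributes):
        # The collector only reports what it found; it never touches the database.
        self.inventory[kind].append(dict(attributes))

    def build_hash(self):
        # Core side: turn the accumulated events into a plain hash for the save step.
        return {kind: list(items) for kind, items in self.inventory.items()}


def collect_inventory(handler):
    # Stand-in for a provider driver walking the provider's API.
    handler.emit("hosts", {"ems_ref": "host-1", "name": "esx01"})
    handler.emit("vms", {"ems_ref": "vm-7", "name": "web01"})


handler = InventoryEventHandler()
collect_inventory(handler)
inventory_hash = handler.build_hash()
```

The point of the sketch is that `collect_inventory` knows nothing about how the hash is shaped or saved, so the save side could later change (for example, saving in stages) without touching the collector.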
Inventory Collection Sticky Points
The following points make this a little more difficult than “a simple matter of typing on a keyboard”:
- creating the inventory event handler
- providing a way to link different inventory items together
  - e.g., hosts need to own VMs … how would one emit a “new host” event, then emit a “new VM” event, and still somehow indicate that the VM and the host are related?
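One possible answer to the linking question, sketched in Python under stated assumptions: each event carries the provider's own reference for the item, a “new VM” event names its host by that reference, and the handler resolves the links only after all events have arrived. The class and method names (`LinkingInventoryHandler`, `new_host`, `new_vm`, `resolve_links`) are hypothetical, not an existing ManageIQ API.

```python
class LinkingInventoryHandler:
    """Hypothetical handler that links VMs to hosts by provider-native reference."""

    def __init__(self):
        self.hosts = {}  # host ref -> host record
        self.vms = {}    # vm ref -> vm record

    def new_host(self, ref, name):
        self.hosts[ref] = {"ref": ref, "name": name}

    def new_vm(self, ref, name, host_ref):
        # The VM names its host by reference only; the host event may arrive later.
        self.vms[ref] = {"ref": ref, "name": name, "host_ref": host_ref}

    def resolve_links(self):
        """Return (host ref -> list of VM refs, VM refs whose host never arrived)."""
        vms_by_host = {ref: [] for ref in self.hosts}
        orphans = []
        for vm in self.vms.values():
            if vm["host_ref"] in vms_by_host:
                vms_by_host[vm["host_ref"]].append(vm["ref"])
            else:
                orphans.append(vm["ref"])
        return vms_by_host, orphans


handler = LinkingInventoryHandler()
handler.new_vm("vm-7", "web01", host_ref="host-1")   # VM arrives before its host
handler.new_host("host-1", "esx01")
handler.new_vm("vm-9", "db01", host_ref="host-404")  # host is never emitted
vms_by_host, orphans = handler.resolve_links()
```

Deferring resolution means the collector does not have to emit events in any particular order, at the cost of the handler needing some policy for references that never resolve.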
Event catching is another area where we’re almost there.
Fig 1: Openstack Event Model
The event model (Fig 1) shows the various parts involved in collecting and processing events. This particular illustration uses Openstack as an example.
This is broken down into a few different areas that we really care about here:
- event monitor
  - see OpenstackEventMonitor
  - responsible for listening for events from Openstack
  - hands events to the event catcher
- event catcher
  - see EventCatcherOpenstack
  - responsible for catching events from Openstack
  - registers the events with ManageIQ database
- event handler
  - see EventHandler
  - responsible for retrieving events from ManageIQ database and deciding what to do with them
  - reads the event_handler configuration file and performs configured action based on event contents
To make event collection provider-driver friendly, I honestly believe that it’s a matter of drawing a line in the sand in the Event Monitor -> Event Catcher -> Event Handler chain and deciding that one part lives on the Provider Driver side, and the other part lives in the core ManageIQ side.
My first hunch here is that the Event Monitor and possibly some of the Event Catcher live on the Provider Driver side. That leaves the majority of the Event Catcher and likely the entire Event Handler on the core ManageIQ side.
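Here is a rough Python sketch of where that line might fall. It is an illustration under assumptions, not the existing code: `monitor_events`, `CoreEventCatcher`, and `CoreEventHandler` are hypothetical stand-ins for the monitor, catcher, and handler roles, a plain list stands in for the ManageIQ database, a dict stands in for the event_handler configuration, and the sample events and the "refresh" action are made up.

```python
# Sample raw provider events (illustrative values only).
RAW_EVENTS = [
    {"event_type": "compute.instance.create.end", "payload": {"instance_id": "abc"}},
    {"event_type": "compute.instance.delete.end", "payload": {"instance_id": "def"}},
]


def monitor_events(raw_events):
    """Provider Driver side: listen to the provider and hand over each event."""
    for raw in raw_events:
        yield raw


class CoreEventCatcher:
    """Core side: normalize each event and register it in the event store."""

    def __init__(self, event_store):
        self.event_store = event_store

    def catch(self, source, raw):
        self.event_store.append(
            {"source": source, "event_type": raw["event_type"], "data": raw["payload"]}
        )


class CoreEventHandler:
    """Core side: read stored events and look up the configured action."""

    def __init__(self, config):
        self.config = config  # event type -> action name

    def process(self, event_store):
        return [
            (event["event_type"], self.config.get(event["event_type"], "ignore"))
            for event in event_store
        ]


event_store = []  # stand-in for the ManageIQ database
catcher = CoreEventCatcher(event_store)
for raw in monitor_events(RAW_EVENTS):
    catcher.catch("openstack", raw)

handler = CoreEventHandler({"compute.instance.create.end": "refresh"})
actions = handler.process(event_store)
```

In this split, the only thing a new provider driver would have to supply is its own version of `monitor_events`; the catcher and handler stay the same for every provider.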
Event Collection Sticky Points
Somewhat similar to Inventory Collection, Metrics Collection has a very particular expectation for the format of the data before it can actually process and save it to the ManageIQ database.
I actually have not looked at Metrics Collection in quite some time. It would take a decent amount of time to come back up to speed on how to even start breaking this up.
However, one quick idea is simply to look at how to separate the actual collection of data from the formatting and processing of the data. This remains, by far, the trickiest place to tease out a “provider-driver” friendly approach. The metrics collection logic is deeply ingrained in two different areas:
- Metrics and metric collection logic that are very specific to vCenter metrics.
- ManageIQ reports that depend on specific
Metrics Collection Sticky Points
All of it?
One of the main problems here is that the data is expected to look and feel very much like vCenter metrics data. This includes the time interval for the metrics. For instance, vCenter produces metrics data every 20 seconds. However, Rhevm produces metrics data every minute. This means that for every single data point from Rhevm, we have to manufacture two additional “fake” data points in order to match the 20-second cadence set by vCenter. Openstack allows the Openstack administrator to define a completely customized metric collection period.
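The arithmetic for the Rhevm case is 60 / 20 = 3 points per sample, so one real point plus two manufactured ones. The Python sketch below shows that padding under assumptions: the `pad_to_cadence` function is hypothetical, it simply repeats the real value, and it places the extra points after the real timestamp; the actual code may fill values or timestamps differently.

```python
TARGET_INTERVAL = 20  # seconds; the vCenter cadence described above


def pad_to_cadence(samples, source_interval, target_interval=TARGET_INTERVAL):
    """Expand (timestamp, value) samples so there is one point per target interval.

    Each real sample is followed by (source_interval // target_interval - 1)
    manufactured points that repeat its value.
    """
    if source_interval % target_interval != 0:
        # Custom periods (e.g. an administrator-defined one) may not divide evenly.
        raise ValueError("source interval must be a multiple of the target interval")
    copies = source_interval // target_interval
    padded = []
    for timestamp, value in samples:
        for i in range(copies):
            padded.append((timestamp + i * target_interval, value))
    return padded


# Two one-minute samples, timestamps in seconds.
rhevm_samples = [(0, 5.0), (60, 7.0)]
padded = pad_to_cadence(rhevm_samples, source_interval=60)
```

The `ValueError` branch hints at why an administrator-defined collection period is awkward: a period that is not a multiple of 20 seconds has no clean mapping onto the expected cadence.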
It’s possible that some of this legacy constraint could be kept intact and well hidden from the provider driver concept. However, we still have to figure out how to draw a line in the sand between provider driver and core ManageIQ.
These are just some ideas that have been floating around my head. Please add your comments. I’d love to hear other ideas. Even if it’s just to poke holes.
Regardless of what paths we take, clearly we have a decent amount of research ahead to really develop a workable plan.
Also, I do plan on expanding on this quite a bit. So, let’s get different ideas out in the open so we can evolve these ideas in the right direction.