Can anyone explain what smart state analysis is actually doing?

I’m trying to understand what exactly happens when a smart state analysis is run on a VM (especially with a VMware provider, but I’m interested in understanding the process generally).

Understanding the analysis process is important to us, as we need to be able to predict the possible impact of analysis runs (in past experience, MS Exchange, for instance, disliked having snaps taken on large vmdks!). We would like to be able to set sensible parameters around scheduling of analysis runs, and be able to identify the sort of workloads that might be negatively impacted so we can tag them for special handling. We also want to allow customers to run analysis jobs, and need to give them some guidelines around what to expect and how to safely manage the task.

I had thought that a VM in a vSphere environment would have a snapshot taken, the snap would be mounted to the smart proxy by means of the VDDK installed on it, and then analyzed, after which the snap would be unmounted and then deleted in vSphere.

However, the documentation indicates this may not be entirely correct. This is from the CloudForms 5.2 QuickStart guide: “If the virtual machine is associated with a host or provider, ensure the virtual machine is registered with that system to be properly analyzed; the server requires this information since a snapshot might be created.” … only might … so what is really going on then?

In our lab I have noted that EVM SmartState analysis events get logged in vCenter, but I haven’t yet seen a snapshot created in relation to an analysis event. Now, I know that snaps were created when analyzing certain VMs in a previous project, as it caused some issues we had to work around.

I’m wondering now, was that because we were using an earlier version of ManageIQ/CloudForms (that project used CloudForms 3.0)? Has the way ManageIQ performs analysis on a vSphere VM changed since that time? Or are there just certain conditions which will trigger a snapshot to be created, and what might they be?

Has anyone had any experience in this regard that can shed some light on what is going on and why? Or are there any ManageIQ developers lurking on the forum that would be willing to share what is happening?

For VMware environments, snapshots are always taken of VMs before SmartState scans; of course, this is not the case for templates. A snapshot is taken, disk data is read via the VixDiskLib API, then the snapshot is deleted. By default, these operations are performed through the VM’s host, not vCenter. If the datastore doesn’t have enough free space to safely create and delete the snapshot, the SmartState request fails with the appropriate error message.
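
To make the order of operations concrete, here is a rough Python sketch of the flow described above. It is only a toy model, not ManageIQ code: the class names, the `smartstate_scan` function, and especially the `min_free_ratio` free-space threshold are made up for illustration, and the real free-space check may work quite differently.

```python
from dataclasses import dataclass, field


class InsufficientSpaceError(Exception):
    """Raised when the datastore cannot safely hold a scan snapshot."""


@dataclass
class Datastore:
    free_bytes: int


@dataclass
class VirtualMachine:
    # Hypothetical stand-in for a VM; it only records what was done to it.
    name: str
    disk_bytes: int
    datastore: Datastore
    is_template: bool = False
    snapshots: list = field(default_factory=list)
    events: list = field(default_factory=list)

    def create_snapshot(self, label):
        self.snapshots.append(label)
        self.events.append("snapshot created")

    def delete_snapshot(self, label):
        self.snapshots.remove(label)
        self.events.append("snapshot deleted")

    def read_disk(self):
        # In the real product this is where disk data is read via VixDiskLib.
        self.events.append("disk read")
        return {"vm": self.name, "bytes_scanned": self.disk_bytes}


def smartstate_scan(vm, min_free_ratio=0.1):
    """Model the scan flow; min_free_ratio is an assumed, illustrative threshold."""
    if vm.is_template:
        # Templates are scanned without taking a snapshot.
        return vm.read_disk()

    required = int(vm.disk_bytes * min_free_ratio)
    if vm.datastore.free_bytes < required:
        # Fail before touching the VM if the snapshot cannot be handled safely.
        raise InsufficientSpaceError(
            f"{vm.name}: need {required} free bytes, "
            f"have {vm.datastore.free_bytes}"
        )

    label = "smartstate-scan"
    vm.create_snapshot(label)
    try:
        return vm.read_disk()
    finally:
        # The snapshot is removed even if reading the disk fails.
        vm.delete_snapshot(label)
```

For a regular VM the recorded events come out in the order snapshot created, disk read, snapshot deleted; a template only records the disk read; and a VM on a nearly full datastore fails before any snapshot is attempted.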

Hi rpo,

Thanks for that reply. You are, of course, correct. I found the snapshot entries in the ESXi host event logs, not in vCenter, as it seems EVM connects directly to the host for disk operations, like snaps.

However, I’m now having issues getting smart state analysis to work as a non-root user under ESXi 5.5. But that’s another topic.