Hosts are no longer "eligible_hosts" during placement, why?


We added a new host to one of our VMware infrastructure providers and updated the ESXi version and build of the existing hosts. Since this change, only one host is found as an eligible host during the vmware_best_fit_least_utilized method. That host wasn’t updated or put into maintenance mode, and it also differs from the others by not being in a cluster.

Note: The host found to be eligible is running ESXi Version: 5.5.0 Build Number: 2068190, whereas the upgraded and additional hosts are ESXi Version: 6.0.0 Build Number: 7967664.

All the hosts are listed via the call to ems.hosts.each....., but not via prov.eligible_hosts.each...... Using the ems.hosts.each approach results in a different error:
Resource <Host> <22:HOSTNAME> is not an eligible resource for this provisioning instance.

Steps taken to resolve, to no avail:

Appliance restarted
The provider and hosts were removed and re-added
Permissions were checked, in VMware and MIQ
Steps listed here: Host Filter in provisioning dialog

After numerous attempts to resolve this, I ended up standing up a new instance (version gaprindashvili-3 rather than fine-3), configuring it and running a provision, which identified all the expected hosts as eligible.

  1. What is determined as an “eligible host”?
  2. Why are the hosts no longer eligible? ESXi version and build, possibly?
  3. Is something cached, or something lying around in the DB, that’s causing a conflict?

Thanks in Advance

An ‘eligible host’ is one that matches the selection criteria for other placement options, such as cluster, network, storage, etc.

For example, if another part of the provisioning placement process has selected a target cluster, then any host not in that cluster will be removed from the ‘eligible hosts’ list. A user’s RBAC visibility of the infrastructure also governs what appears in their eligible hosts list, so one user may be able to provision to a particular host/storage/folder/network combination while another user may not.
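As a rough illustration of that idea, here is a minimal plain-Ruby sketch. It is not ManageIQ's actual implementation: the Host struct, its fields, and the eligible_hosts helper below are all hypothetical stand-ins, used only to show that a host is eligible when it survives every filter implied by the other placement choices.

```ruby
# Hypothetical model of a host; the real ManageIQ Host object is much richer.
Host = Struct.new(:name, :cluster, :networks, keyword_init: true)

# Return the hosts that satisfy every placement criterion that has been set.
# A criterion left as nil is treated as "not chosen yet" and filters nothing.
def eligible_hosts(hosts, cluster: nil, network: nil, visible_to_user: nil)
  hosts.select do |host|
    (cluster.nil? || host.cluster == cluster) &&
      (network.nil? || host.networks.include?(network)) &&
      (visible_to_user.nil? || visible_to_user.include?(host.name))
  end
end

# Illustrative inventory (made-up names).
hosts = [
  Host.new(name: 'esx01', cluster: 'prod', networks: ['VM Network']),
  Host.new(name: 'esx02', cluster: 'prod', networks: ['VLAN 100']),
  Host.new(name: 'esx03', cluster: nil,    networks: ['VM Network'])
]

# No criteria chosen: every host is still eligible.
puts eligible_hosts(hosts).map(&:name).inspect

# A target cluster has been chosen: hosts outside it drop out.
puts eligible_hosts(hosts, cluster: 'prod').map(&:name).inspect

# The requesting user can only see esx02 (RBAC), narrowing the list further.
puts eligible_hosts(hosts, cluster: 'prod', visible_to_user: ['esx02']).map(&:name).inspect
```

Each additional criterion can only shrink the list, which is why a value set elsewhere in the request can silently remove hosts.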

After finding the time to investigate this issue, we found that during placement the dynamically built eligible_hosts method was applying a filter when determining the hosts.

We had introduced a new provisioning dialogue, and with it we used some of Kevin Morey’s scripts (build_vm_provision_request.rb in particular), which contains a method called get_network. Among other things, this method contains a case statement that sets the vlan. In our dialogue we don’t allow the user to select the network; we determine it via code later on, so at this point :vlan is blank:

# Default the vlan when the dialogue has not supplied one
if merged_options_hash[:vlan].blank?
  merged_options_hash[:vlan] = 'VM Network'
  log(:info, "Build: #{build} vlan: #{merged_options_hash[:vlan]}")
end

Setting this default somehow applied a filter during the eligible_hosts method, which was seen in evm.log in the line:

MIQ(ManageIQ::Providers::Vmware::InfraManager::ProvisionWorkflow#filter_hosts_by_vlan_name) Filtering hosts with the following network: <VM Network>

When this filter was applied, it was found that the two eligible hosts being selected had a network vlan equal to VM Network, and therefore only they were eligible.
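The effect can be reproduced in a small plain-Ruby sketch. This is only an approximation of the behaviour described above, not the real filter_hosts_by_vlan_name code: the HostLan struct, the filter_by_vlan helper, and the host and network names are hypothetical.

```ruby
# Hypothetical stand-in for a host and the network names it exposes.
HostLan = Struct.new(:name, :lans)

# Simplified sketch: when a vlan name is present, keep only hosts that
# expose a network with that name; when it is blank, keep every host.
def filter_by_vlan(hosts, vlan)
  return hosts if vlan.nil? || vlan.strip.empty?

  hosts.select { |host| host.lans.include?(vlan) }
end

# Illustrative inventory (made-up names).
hosts = [
  HostLan.new('esx-a', ['VM Network']),
  HostLan.new('esx-b', ['VM Network', 'Prod VLAN 10']),
  HostLan.new('esx-c', ['Prod VLAN 10'])
]

options = { vlan: nil }

# With :vlan blank, no filtering happens and all hosts remain.
puts filter_by_vlan(hosts, options[:vlan]).map(&:name).inspect

# The default applied while building the request switches the filter on,
# so only hosts exposing 'VM Network' remain.
options[:vlan] = 'VM Network' if options[:vlan].nil?
puts filter_by_vlan(hosts, options[:vlan]).map(&:name).inspect
```

In this sketch the host list shrinks purely because a default was written into :vlan, which matches the symptom seen in evm.log.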

Setting the hosts to something other than VM Network stopped the filter being applied, but we found the more appropriate approach was to remove the case statement setting the :vlan attribute in build_vm_provision_request.rb.

Big thanks to @pemcg for the help on this.

Hope this helps someone else in the future.