Orchestration Stacks (OpenStack Heat)

I am having trouble creating a service based on an OpenStack Heat orchestration template (hammer-4.20190329011451).

When I place an order, the stack (with 2 VMs) gets successfully created on the OpenStack side and also in MIQ (but with no VMs there). I get the message "Service K8S 1.11 (2 nodes)-… has been provisioned.", but there are no VMs under the created service.

I can see the new stack in MIQ, but several fields (like Instances, Parameters…) are missing or empty.

These fields get filled in some time later (not sure exactly how long; it seems to be 1-2 hours). But that is too late, because I need to tag the VMs during provisioning. :slightly_frowning_face:

We use an admin account for the OpenStack integration and AMQP. I suspect there may be a configuration issue on the OpenStack side. I see some errors and warnings in the log files, but I am not sure what is relevant - for instance, warnings in fog.log.

I have attached all my logs here; they are from my latest attempt, after an MIQ appliance restart:
miq_logs.zip (326.7 KB)
Could anyone provide any hints?

Thank you very much

Could this be related to an OpenStack integration problem - [Bunny::ResourceLocked]: RESOURCE_LOCKED?

Hi,

I am having the same problem.
When I create a service from a catalog item based on a Heat template (with 2 simple VMs attached to a network), a service gets created in ManageIQ and a stack on the OpenStack side (the instances are created and running), but the service shows no details at all.

After some time (for me, anywhere from 20 minutes to 2 hours) the service starts showing details and the VMs attached to it (at the bottom).

When I opened the log files, I got the same error as @johny5v:

[----] E, [2019-07-31T13:41:49.991302 #11668:4b40d30] ERROR -- : MIQ(ManageIQ::Providers::Openstack::CloudManager::EventCatcher::Runner#start_event_monitor) EMS [172.29.50.224] as [admin] Event Monitor Thread aborted because [RESOURCE_LOCKED - cannot obtain exclusive access to locked queue 'miq-172.29.50.148-nova' in vhost '/']
[----] E, [2019-07-31T13:41:49.991393 #11668:4b40d30] ERROR -- : [Bunny::ResourceLocked]: RESOURCE_LOCKED - cannot obtain exclusive access to locked queue 'miq-172.29.50.148-nova' in vhost '/'  Method:[block (2 levels) in <class:LogProxy>]
[----] E, [2019-07-31T13:41:49.991481 #11668:4b40d30] ERROR -- : /usr/local/lib/ruby/gems/2.4.0/gems/bunny-2.1.0/lib/bunny/channel.rb:1946:in `raise_if_continuation_resulted_in_a_channel_error!'
/usr/local/lib/ruby/gems/2.4.0/gems/bunny-2.1.0/lib/bunny/channel.rb:1020:in `queue_declare'
/usr/local/lib/ruby/gems/2.4.0/gems/bunny-2.1.0/lib/bunny/queue.rb:371:in `declare!'
/usr/local/lib/ruby/gems/2.4.0/gems/bunny-2.1.0/lib/bunny/queue.rb:52:in `initialize'
/usr/local/lib/ruby/gems/2.4.0/gems/bunny-2.1.0/lib/bunny/channel.rb:410:in `new'
/usr/local/lib/ruby/gems/2.4.0/gems/bunny-2.1.0/lib/bunny/channel.rb:410:in `queue'
/usr/local/lib/ruby/gems/2.4.0/gems/bunny-2.1.0/lib/bunny/session.rb:439:in `queue_exists?'
/usr/local/lib/ruby/gems/2.4.0/bundler/gems/manageiq-providers-openstack-69315b29c656/lib/manageiq/providers/openstack/legacy/events/openstack_rabbit_event_monitor.rb:148:in `block in remove_legacy_queues'
/usr/local/lib/ruby/gems/2.4.0/bundler/gems/manageiq-providers-openstack-69315b29c656/lib/manageiq/providers/openstack/legacy/events/openstack_rabbit_event_monitor.rb:146:in `each'
/usr/local/lib/ruby/gems/2.4.0/bundler/gems/manageiq-providers-openstack-69315b29c656/lib/manageiq/providers/openstack/legacy/events/openstack_rabbit_event_monitor.rb:146:in `remove_legacy_queues'
/usr/local/lib/ruby/gems/2.4.0/bundler/gems/manageiq-providers-openstack-69315b29c656/lib/manageiq/providers/openstack/legacy/events/openstack_rabbit_event_monitor.rb:108:in `initialize_queues'
/usr/local/lib/ruby/gems/2.4.0/bundler/gems/manageiq-providers-openstack-69315b29c656/lib/manageiq/providers/openstack/legacy/events/openstack_rabbit_event_monitor.rb:78:in `start'
/usr/local/lib/ruby/gems/2.4.0/bundler/gems/manageiq-providers-openstack-69315b29c656/app/models/manageiq/providers/openstack/event_catcher_mixin.rb:38:in `monitor_events'
/var/www/miq/vmdb/app/models/manageiq/providers/base_manager/event_catcher/runner.rb:161:in `block in start_event_monitor'

I am using:
OpenStack Stein
ManageIQ hammer-5.20190424144425_1d4816f
an admin account to connect to OpenStack
a RabbitMQ user with all permissions
a super admin on the ManageIQ side

This problem is reproduced every time I order a service.

Can anyone help?

We managed to reduce the service details loading time by adding a refresh_interval parameter to the server config.

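Something along these lines in the advanced settings YAML (a sketch assuming the standard key layout; 15.minutes is just the value we chose, verify the exact keys on your Configuration → Advanced page):

    :ems_refresh:
      :openstack:
        :refresh_interval: 15.minutes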

We have also increased the memory limit for the Refresh worker a bit, to 300 MB.
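Roughly this section of the advanced settings (again only a sketch; the key names may differ between versions, so check them on your own appliance):

    :workers:
      :worker_base:
        :queue_worker_base:
          :ems_refresh_worker:
            :defaults:
              :memory_threshold: 300.megabytes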

I have also added code to Cloud/Orchestration/Provisioning/StateMachines/Methods/check_provisioned to check that the VMs created by the stack are already visible in MIQ (for tagging, etc.):


    def vm_found_in_miq?
      all_vms_found = true
      create_options = @handle.root['service'].options[:create_options]
      parameters     = create_options[:parameters]

      parameters.each do |key, value|
        # Only our hostname parameters follow the k8s_host_*_name convention
        next unless key.start_with?("k8s_host_") && key.include?("_name")

        vm = @handle.vmdb('vm').find_by_name(value)
        @handle.log("info", "Checking existence of VM '#{value}' in vmdb. Found: #{vm}")
        next if vm

        # Deployment not visible in the MIQ inventory yet - retry later
        @handle.log("info", "VM '#{value}' not found in vmdb. Retrying...")
        @handle.root['ae_result']         = 'retry'
        @handle.root['ae_retry_interval'] = '1.minute'
        all_vms_found = false
        break
      end

      all_vms_found
    end
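Once vm_found_in_miq? returns true, the VMs can be tagged from the same state machine. A minimal sketch, assuming the tag category and value already exist (environment/k8s is a made-up placeholder):

    vm = @handle.vmdb('vm').find_by_name(vm_name)  # vm_name resolved as above
    vm.tag_assign("environment/k8s") if vm         # placeholder category/tag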

(I am using a special naming convention for hostname parameters in Heat templates - k8s_host_ipam_nam_ - so that I can dynamically call our IPAM for hostname generation during the provisioning flow.)
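For illustration, such a hostname parameter in the Heat template looks roughly like this (names simplified, not a copy of our real template):

    parameters:
      k8s_host_01_name:
        type: string
        description: Hostname of the first cluster node, generated via our IPAM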

After all that, everything looked better for a little while. But during further testing new problems occurred (possibly linked to the prevailing RESOURCE_LOCKED errors in the logs), and most provisioning attempts were failing with this error:

[----] E, [2019-07-26T17:24:15.596382 #27370:a14f50] ERROR -- : Q-task_id([r497_service_template_provision_task_1375]) Terminating non responsive method with pid 30131
[----] E, [2019-07-26T17:24:15.597111 #27370:cccbdc8] ERROR -- : Q-task_id([r497_service_template_provision_task_1375]) <AEMethod provision> The following error occurred during method evaluation:
[----] E, [2019-07-26T17:24:15.597421 #27370:cccbdc8] ERROR -- : Q-task_id([r497_service_template_provision_task_1375]) <AEMethod provision>   SignalException: SIGTERM
[----] E, [2019-07-26T17:24:15.598405 #27370:cccbdc8] ERROR -- : Q-task_id([r497_service_template_provision_task_1375]) <AEMethod provision>   /usr/local/lib/ruby/2.4.0/drb/drb.rb:575:in `read'
/usr/local/lib/ruby/2.4.0/drb/drb.rb:575:in `load'
/usr/local/lib/ruby/2.4.0/drb/drb.rb:641:in `recv_reply'
/usr/local/lib/ruby/2.4.0/drb/drb.rb:941:in `recv_reply'
/usr/local/lib/ruby/2.4.0/drb/drb.rb:1254:in `send_message'
/usr/local/lib/ruby/2.4.0/drb/drb.rb:1142:in `block (2 levels) in method_missing'
/usr/local/lib/ruby/2.4.0/drb/drb.rb:1229:in `open'
/usr/local/lib/ruby/2.4.0/drb/drb.rb:1141:in `block in method_missing'
/usr/local/lib/ruby/2.4.0/drb/drb.rb:1160:in `with_friend'
/usr/local/lib/ruby/2.4.0/drb/drb.rb:1140:in `method_missing'
/ManageIQ/Cloud/Orchestration/Provisioning/StateMachines/Methods/provision:22:in `main'

This error occurs about 10 minutes after the service.deploy_orchestration_stack call, with no activity on the OpenStack side…
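For context, the stock provision method is essentially a thin wrapper around that call; a simplified sketch of what /ManageIQ/Cloud/Orchestration/Provisioning/StateMachines/Methods/provision does (not the exact upstream code):

    task    = @handle.root['service_template_provision_task']
    service = task.destination          # the orchestration service being provisioned
    service.deploy_orchestration_stack  # hands the template to Heat; check_provisioned polls afterwards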


Thanks @johny5v!
Setting refresh_interval did the job for me.