Long delay before Ansible Tower job is triggered, and Error in Request details

I have a service catalog with a catalog item of type “Ansible Tower”.
Auto-approval takes 4 to 5 minutes.
After auto-approval, triggering the Ansible Tower job takes 9 to 10 minutes.
Note: I do not get a notification for provisioning, but I do get one for auto-approval in the notification events.

In Services -> Request -> Request details,
Status - Error
Request State - Finished
Requester - Administrator
Request Type - Service Provision
Last Message - Server - Service [xyz-20201101-232006] Step [post4] Status [Processing Post4 Customizations] Message [Processed Post3 Customizations]

but in Services -> My Services -> Services
Details:
Lifecycle
State - Error in provisioning

Job:
Status - successful

Please help me troubleshoot why it takes so long to trigger the Ansible Tower job, and why Request details shows Error even though the Ansible Tower job is successful.

It may be worth looking at how “busy” your region is. What providers do you have configured, and how many VMs are being managed by the region? Which version of ManageIQ are you running, and how many appliances do you have in the region?

pemcg

Thank you @pemcg

I will check the points above.

Can you please tell me how to troubleshoot and find the reason for the Error in Request details even though the Ansible Tower job is successful?

Also, is there any reason why I am not getting notification event messages for provision started / finished / error?

Unfortunately I think you’d need to look through evm.log and automation.log for any hints as to why the request is failing.
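
As a starting point, it can help to pull out only the ERROR lines and group them by the task they belong to. The following is a minimal sketch rather than anything that ships with ManageIQ: the function name is hypothetical, and it assumes each line of interest contains the word ERROR and a Q-task_id([...]) tag.

```python
import re

# Matches the Q-task_id([...]) tag that prefixes queued-task log lines.
TASK_ID = re.compile(r"Q-task_id\(\[([^\]]+)\]\)")


def errors_by_task(lines):
    """Group log lines containing 'ERROR' by the task id found on each line.

    This is a plain substring check, so an INFO line whose message happens
    to contain the word ERROR would also be included.
    """
    grouped = {}
    for line in lines:
        if "ERROR" not in line:
            continue
        match = TASK_ID.search(line)
        task = match.group(1) if match else "(no task id)"
        grouped.setdefault(task, []).append(line.rstrip("\n"))
    return grouped


def scan(path):
    """Read a log file and return its ERROR lines grouped by task id."""
    with open(path, errors="replace") as log:
        return errors_by_task(log)
```

On an appliance you could call, for example, scan("/var/www/miq/vmdb/log/automation.log") and print the lines for the task id of the failing request.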

I believe there’s a problem with email notifications in Jansa-1 (No email notifications since jansa upgrade). Are you not getting the pop-up notifications in the WebUI either?

Hi pemcg

Below are the automation log details:

INFO -- : Q-task_id([r4000000000655_service_template_provision_task_4000000000655]) Updated namespace [AutomationManagement/AnsibleTower/Service/Provisioning/StateMachines/Provision/update_serviceprovision_status ManageIQ/AutomationManagement/AnsibleTower/Service/Provisioning/StateMachines]
INFO -- : Q-task_id([r4000000000655_service_template_provision_task_4000000000655]) Invoking [inline] method [/ManageIQ/AutomationManagement/AnsibleTower/Service/Provisioning/StateMachines/Provision/update_serviceprovision_status] with inputs [{"status"=>"Processed Post3 Customizations"}]
INFO -- : Q-task_id([r4000000000655_service_template_provision_task_4000000000655]) <AEMethod [/ManageIQ/AutomationManagement/AnsibleTower/Service/Provisioning/StateMachines/Provision/update_serviceprovision_status]> Starting

ERROR -- : Q-task_id([r4000000000655_service_template_provision_task_4000000000655]) Terminating non responsive method with pid 21823
ERROR -- : Q-task_id([r4000000000655_service_template_provision_task_4000000000655]) Error terminating 21823 exception No such process

Can you please tell me what I need to do to resolve this error?

Yes @pemcg, I am not getting pop-up notifications either, but I am using Ivanchuk, not Jansa.
I think it may be because of the log line below:

INFO -- : Q-task_id([log_status]) Followed Relationship [miqaedb:/System/Event/MiqEvent/POLICY/evm_server_var_log_disk_high_usage#create]

Any suggestions would be appreciated.

Are you running the podified or the appliance version of ManageIQ?

It might be worth checking whether your workers are being killed for exceeding their memory threshold.

Try the following from a command line on an appliance (from the /var/www/miq/vmdb/log directory):

grep 'MiqServer#validate_worker' evm.log

If you see any results that include text such as “process memory usage [nnn] exceeded limit” then you need to increase the memory threshold for that worker type in Configuration -> Settings -> Server/Workers tab.
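
If the grep returns a lot of lines, a small script can summarise which worker types are affected. This is only a sketch: it relies on the two text fragments mentioned above, and the "Worker [SomeWorkerClass]" pattern used to pick out the worker type is my assumption about the line layout, so adjust it to match what your evm.log actually contains. The function name is hypothetical.

```python
import re
from collections import defaultdict

# Fragments taken from the advice above; nothing else about the line
# layout is relied upon except the assumed "Worker [...]" pattern below.
MARKER = "MiqServer#validate_worker"
USAGE = re.compile(r"process memory usage \[(\d+)\] exceeded limit")
# Assumption: the worker type appears as "Worker [SomeWorkerClass]".
WORKER = re.compile(r"Worker \[([^\]]+)\]")


def memory_kills(lines):
    """Return {worker type: [reported memory usage, ...]} for exceeded-limit lines."""
    hits = defaultdict(list)
    for line in lines:
        if MARKER not in line:
            continue
        usage = USAGE.search(line)
        if usage is None:
            continue  # a validate_worker line, but not a memory-limit one
        worker = WORKER.search(line)
        name = worker.group(1) if worker else "unknown"
        hits[name].append(int(usage.group(1)))
    return dict(hits)
```

Feeding it the lines of evm.log gives one entry per worker type, which tells you which threshold to raise on the Workers tab.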

The evm_server_var_log_disk_high_usage event is worrying, and suggests that either you’re logging a lot of messages, or your disk storage is particularly slow.
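
To see how full the log filesystem actually is, something like the sketch below would do. The function names and the 80% threshold are illustrative choices of mine, not values taken from ManageIQ.

```python
import shutil


def percent_used(path):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def is_nearly_full(path, threshold=80.0):
    """True if the filesystem is at or above `threshold` percent used.

    The default threshold is an arbitrary example value.
    """
    return percent_used(path) >= threshold
```

On an appliance you would point it at the log directory, for example percent_used("/var/www/miq/vmdb/log").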

pemcg

Thank you @pemcg

ThirumuruganC