Management Event

Hi,

We have created an alert profile and assigned some alerts for CPU % monitoring.

Alerts started appearing under Monitor > Alerts; however, the other two actions are not working:

1. Email (the test email works fine)
2. Management Event (the automation log does not show the custom event being called)

We have copied the Management Event into our domain and created an instance.

I would appreciate a response.
Thanks
Regards
Asif Bilgrami

I’m not sure what provider you’re using, but you might want to try the “CPU - Time Used (ms)” alert as the real time alert trigger. I’ve had more luck with this one.

You might also want to check that you’ve scaled out your ManageIQ infrastructure so that it can handle the collection of the Capacity and Utilization metrics in time, and not lose data. Enabling realtime alerting shortens the collection interval, which correspondingly puts more load on the C&U collection and processing workers.

There’s a rough guideline of one MIQ appliance per 300-500 managed VMs in a virtual infrastructure, but looking at the logs can help tune this. There are some scripts here that can help show the timings of the C&U collection workers, and this will show if they are “keeping up”.

Hope this helps,
pemcg

Thanks. We are actually getting alerts on the Monitor > Alerts > All Alerts screen, but the other actions (email and Management Event) are not being called.

Earlier it worked: miqaedb:/System/Event/CustomEvent/Alert/VMAlert#create was called, and we were able to call our automation method.

Automation log from earlier, when the custom Management Event worked:

[----] I, [2020-07-10T05:20:39.082434 #3084:e625f8] INFO -- : Instantiating [/System/Process/Event?EventStream%3A%3Aevent_stream=26997&MiqServer%3A%3Amiq_server=1&User%3A%3Auser=1&VmOrTemplate%3A%3Avm=28&alert_guid=ce25c0e9-2772-436f-be36-ccac794f3338&event_stream_id=26997&event_type=VMAlert&miq_alert_description=VM%20Alert&miq_alert_id=40&object_name=Event&vmdb_object_type=vm]
[----] I, [2020-07-10T05:20:39.097749 #3084:e625f8] INFO -- : Updated namespace [/System/Process/Event?EventStream%3A%3Aevent_stream=26997&MiqServer%3A%3Amiq_server=1&User%3A%3Auser=1&VmOrTemplate%3A%3Avm=28&alert_guid=ce25c0e9-2772-436f-be36-ccac794f3338&event_stream_id=26997&event_type=VMAlert&miq_alert_description=VM%20Alert&miq_alert_id=40&object_name=Event&vmdb_object_type=vm ManageIQ/System]
[----] I, [2020-07-10T05:20:39.139731 #3084:e625f8] INFO -- : Following Relationship [miqaedb:/System/Event/CustomEvent/Alert/VMAlert#create]
[----] I, [2020-07-10T05:20:39.158747 #3084:e625f8] INFO -- : Updated namespace [miqaedb:/System/Event/CustomEvent/Alert/VMAlert#create Systems/System/Event/CustomEvent]
[----] I, [2020-07-10T05:20:39.179682 #3084:e625f8] INFO -- : Following Relationship [miqaedb:/General/Methods/TestInstance#create]
[----] I, [2020-07-10T05:20:39.188042 #3084:e625f8] INFO -- : Updated namespace [miqaedb:/General/Methods/TestInstance#create Systems/General]
[----] I, [2020-07-10T05:20:39.202473 #3084:e625f8] INFO -- : Updated namespace [General/Methods/test Systems/General]
[----] I, [2020-07-10T05:20:39.205965 #3084:e625f8] INFO -- : Invoking [inline] method [/Systems/General/Methods/Test] with inputs [{}]
[----] I, [2020-07-10T05:20:39.207449 #3084:e625f8] INFO -- : <AEMethod [/Systems/General/Methods/Test]> Starting
[----] I, [2020-07-10T05:20:39.538320 #3084:8f927a4] INFO -- : Testing
[----] I, [2020-07-10T05:20:39.551868 #3084:e625f8] INFO -- : <AEMethod [/Systems/General/Methods/Test]> Ending
[----] I, [2020-07-10T05:20:39.551976 #3084:e625f8] INFO -- : Method exited with rc=MIQ_OK

Thanks again.

OK, so a few things that I would check:

  • Check automation.log for errors
  • Check automation.log to see if the EventStream object that corresponds to your alert is still being processed (like the first few lines of that log listing that you posted)
  • Check that your method/instance is still runnable outside of the event mechanism, i.e. test it from Automation -> Simulation to check that it still works as expected
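
The first two checks can also be scripted. What follows is only a rough sketch, not an official ManageIQ tool: it scans automation.log lines for ERROR/FATAL entries and for the "Instantiating [/System/Process/Event?...]" lines that show an alert's EventStream reaching Automate. The function names and the default event type ("VMAlert", taken from the log posted above) are my own choices for illustration, and the log path in the usage note is the usual appliance location, which may differ on your system.

```python
import re
from urllib.parse import parse_qs

# Matches the "Instantiating [/System/Process/Event?...]" lines in automation.log.
EVENT_RE = re.compile(r"Instantiating \[/System/Process/Event\?([^\]]+)\]")
# Severity is the word before the separator. Accept both "--" and a
# typographic dash, since pasted logs are sometimes auto-formatted.
SEVERITY_RE = re.compile(r"\b(DEBUG|INFO|WARN|ERROR|FATAL)\s+(?:--|–)\s*:")


def scan_automation_log(lines, event_type="VMAlert"):
    """Return (error_lines, alert_events) found in an iterable of log lines.

    event_type is the Management Event name to look for; "VMAlert" is
    simply the name used in this thread (hypothetical default).
    """
    errors, events = [], []
    for line in lines:
        severity = SEVERITY_RE.search(line)
        if severity and severity.group(1) in ("ERROR", "FATAL"):
            errors.append(line.rstrip("\n"))
        match = EVENT_RE.search(line)
        if match:
            # parse_qs URL-decodes keys and values ("VM%20Alert" -> "VM Alert").
            params = {key: values[0] for key, values in parse_qs(match.group(1)).items()}
            if params.get("event_type") == event_type:
                events.append(params)
    return errors, events


def scan_file(path, event_type="VMAlert"):
    """Convenience wrapper that reads a log file from disk."""
    with open(path, "r", errors="replace") as handle:
        return scan_automation_log(handle, event_type)
```

To use it, call something like scan_file("/var/www/miq/vmdb/log/automation.log") on the appliance. An empty events list after an alert has fired suggests the Management Event never reached Automate, which is a different problem from a method that runs and fails.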

How many VMs is your MIQ infrastructure managing, and how many MIQ appliances have you deployed?

pemcg

Hi, thanks.

My answers follow each of your points below.

  • Check automation.log for errors

There are no errors for the custom event, and EMS events work fine (please see the linked document for details and screenshots):
https://drive.google.com/file/d/11p-FJNS7eGKujesCiGVHM9PrTLAS356k/view?usp=sharing

  • Check automation.log to see if the EventStream object that corresponds to your alert is still being processed (like the first few lines of that log listing that you posted)

No, it is never called. Somehow the custom event is not triggered, although I can see the alert on the Monitor screen. The email is not received either (details are in the linked document).

  • Check that your method/instance is still runnable outside of the event mechanism, i.e test it from Automation -> Simulation to check that it still works as expected

Yes, it works. It only prints "Testing" in the EVM log, and it works well for EMS events.

  • How many VMs is your MIQ infrastructure managing, and how many MIQ appliances have you deployed?

I am exploring ManageIQ with a single appliance connected to Google, AWS and Azure providers, with not many VMs; I think 15-20 instances.

Link to Document https://drive.google.com/file/d/11p-FJNS7eGKujesCiGVHM9PrTLAS356k/view?usp=sharing

Thanks again. I appreciate your response.

There are several “evm worker memory exceeded” errors in your log for the Generic worker (this is the worker that handles automate tasks). The worker will have been killed and re-spawned, which may account for the lack of expected Automate processing.

Try increasing the memory threshold for your Generic worker from the Configuration -> Settings -> Server -> Workers tab.

It’s worth periodically checking in the log to see whether any of the workers are exceeding their defined thresholds; you can use something like the following:

zgrep 'MiqServer#validate_worker' evm.log* | grep "WARN\|ERROR"
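
If you would rather run the same check from a script (for example from cron), here is a rough Python equivalent of that pipeline. It is only a sketch: the function names are mine, and it assumes the rotated logs sit next to evm.log with names starting "evm.log" and a ".gz" suffix when compressed.

```python
import glob
import gzip
import os
import re

# Same filters as the zgrep | grep pipeline above.
PATTERN = "MiqServer#validate_worker"
SEVERITY_RE = re.compile(r"WARN|ERROR")


def open_log(path):
    """Open a plain or gzip-compressed log file as text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", errors="replace")
    return open(path, "r", errors="replace")


def worker_threshold_hits(log_dir):
    """Return (file_name, line) pairs for validate_worker WARN/ERROR lines.

    log_dir is the directory holding evm.log and its rotated copies
    (assumed naming: evm.log, evm.log-YYYYMMDD.gz, ...).
    """
    hits = []
    for path in sorted(glob.glob(os.path.join(log_dir, "evm.log*"))):
        with open_log(path) as handle:
            for line in handle:
                if PATTERN in line and SEVERITY_RE.search(line):
                    hits.append((os.path.basename(path), line.rstrip("\n")))
    return hits
```

Calling worker_threshold_hits("/var/www/miq/vmdb/log") on an appliance should list the same lines as the shell command. An empty list means no worker was flagged in the logs that are still on disk.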

It also might be worth noting that according to the CloudForms 5.0 support matrix (https://access.redhat.com/documentation/en-us/red_hat_cloudforms/5.0/pdf/support_matrix/Red_Hat_CloudForms-5.0-Support_Matrix-en-US.pdf), real time alerts aren’t supported for the AWS or Azure cloud providers. I’ve only ever had them working successfully from the VMware and RHV providers, although I haven’t tested them thoroughly with the cloud providers so I can’t say definitively that they don’t work. You may need to try alternative counters though; as I mentioned, I’ve mostly used “CPU - Time Used (ms)”.

pemcg

Thanks. I have increased the memory for the Generic worker, and no more errors are shown by zgrep 'MiqServer#validate_worker' evm.log* | grep "WARN\|ERROR"

I have added some alerts, and they appear under Monitor > Alerts > All Alerts.
However, the Management Event is still not being fired, and no email is received for the configured alerts.

Automation works fine for EMS events like AWS_API_CALL_StartInstances, which calls /System/Event/EmsEvent/AMAZON/AWS_API_CALL_StartInstances

Anyway, thanks for all your help.

Best Regards
Asif Bilgrami