Utilization graphs for VMs on RHV are empty

gaprindashvili

#1

I have configured the C&U login with RHVM and configured the roles, as you can see in the screenshots, but I do not see any utilization graphs for VMs at all. I have heard that these graphs only become available after 30 days; is that correct?
I am running Gaprindashvili-3.








#2

Have you added the C&U Endpoint correctly in the RHV provider details (and checked with the ‘Validate’ button)? There’s a description of the procedure that you need to go through at the RHV end in this thread: RHEV cpu and memory reporting issue

You can use the perf_process_timings.rb script from here: https://github.com/RHsyseng/cfme-log-parsing to pull the collection timings out of the log file; this will also tell you whether or not the metrics are being collected.

Hope this helps,
pemcg


#3

I downloaded and ran the script and got the output below, so it seems to be working. I also followed the thread you sent, and everything is configured as described there. So what might the issue be?
Here is a sample of the output of the script:

ruby ems_refresh_timings.rb -i /var/log/manageiq/evm.log
—nd 19 refreshes
Worker PID: 32124
Message ID: 811058
Message fetch time: 2018-08-07T04:24:58.650457
Message time in queue: 0.035977665 seconds
Provider: Redhat::InfraManager
EMS Name: RHV
Refresh type: full
Refresh start time: 2018-08-07T04:24:58.687501
Refresh timings:
fetch_all: 4.858584 seconds
collect_inventory_for_targets: 5.040737 seconds
parse_inventory: 0.036896 seconds
parse_targeted_inventory: 0.036962 seconds
save_inventory: 5.300321 seconds
ems_refresh: 10.378416 seconds
Refresh end time: 2018-08-07T04:25:09.066146
Message delivered time: 2018-08-07T04:25:09.095838
Message state: ok
Message delivered in: 10.445172506 seconds


Worker PID: 32134
Message ID: 811059
Message fetch time: 2018-08-07T04:24:58.749381
Message time in queue: 0.035941131 seconds
Provider: Redhat::NetworkManager
EMS Name: RHV Network Manager
Refresh type: full
Refresh start time: 2018-08-07T04:24:58.792505
Refresh timings:
collect_inventory_for_targets: 0.377505 seconds
parse_inventory: 2.085786 seconds
parse_targeted_inventory: 2.085878 seconds
save_inventory: 0.207739 seconds
manager_refresh_post_processing: 0.000004 seconds
ems_refresh: 2.671630 seconds
Refresh end time: 2018-08-07T04:25:01.464428
Message delivered time: 2018-08-07T04:25:01.494492
Message state: ok
Message delivered in: 2.744916725 seconds


Worker PID: 7566
Message ID: 814122
Message fetch time: 2018-08-07T06:25:30.174561
Message time in queue: 0.041902725 seconds
Provider: Redhat::InfraManager
EMS Name: RHV
Refresh type: full
Refresh start time: 2018-08-07T06:25:30.210383
Refresh timings:
fetch_all: 4.048275 seconds
collect_inventory_for_targets: 4.217112 seconds
parse_inventory: 0.040644 seconds
parse_targeted_inventory: 0.040718 seconds
save_inventory: 5.188923 seconds
ems_refresh: 9.447119 seconds
Refresh end time: 2018-08-07T06:25:39.657758
Message delivered time: 2018-08-07T06:25:39.688184
Message state: ok
Message delivered in: 9.513421775 seconds


Worker PID: 7576
Message ID: 814123
Message fetch time: 2018-08-07T06:25:30.275756
Message time in queue: 0.037605443 seconds
Provider: Redhat::NetworkManager
EMS Name: RHV Network Manager
Refresh type: full
Refresh start time: 2018-08-07T06:25:30.317907
Refresh timings:
collect_inventory_for_targets: 0.379402 seconds
parse_inventory: 1.902161 seconds
parse_targeted_inventory: 1.902257 seconds
save_inventory: 0.219408 seconds
manager_refresh_post_processing: 0.000004 seconds
ems_refresh: 2.501572 seconds
Refresh end time: 2018-08-07T06:25:32.819735
Message delivered time: 2018-08-07T06:25:32.847136
Message state: ok
Message delivered in: 2.571182261 seconds


#4

It all looks fine and you have metrics coming in. Try selecting one of your hosts in the WebUI, then click the Monitoring -> Utilization button. For “Interval” select “Most Recent Hour” and see if you get anything plotted.


#5

I tried this for the Ansible-DB virtual machine, but no data is available.


#6

OK, sorry, I hadn’t noticed that you ran ems_refresh_timings.rb, which only tells you that the EMS refresh is working. Try running perf_process_timings.rb instead, which reports the C&U performance capture timings.
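
If the output is long, a quick tally of capture states per object can make problems easier to spot. This is only a sketch in Python: it assumes the script prints an `Object name:` line and a `Capture state:` line for each message, and `capture_states` is a made-up helper name, not part of cfme-log-parsing.

```python
from collections import Counter


def capture_states(lines):
    """Return {object name: Counter of capture states} from the script output.

    Assumes each message block has an 'Object name:' line followed
    later by a 'Capture state:' line.
    """
    states = {}
    name = None
    for line in lines:
        # Split on the first colon only, so timestamps in values stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Object name":
            name = value
        elif key == "Capture state" and name is not None:
            states.setdefault(name, Counter())[value] += 1
    return states
```

Save the script output to a file and pass the open file to `capture_states`. An object whose only states indicate an error is not getting any metrics stored, so its graphs will stay empty.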


#7

Oops, it is showing collect errors for all VMs. What does that mean?

Worker PID: 17473
Message ID: 857370
Message fetch time: 2018-08-08T10:34:52.512153
Message time in queue: 22.211065964 seconds
Provider: Redhat::InfraManager
Object type: Vm
Object name: vm1
Metrics processing start time: 2018-08-08T10:34:52.516245
Time range:
Rows added:
Rows updated:
Capture state: collect_error
Capture timings at time of error:
capture_state: -0.000004 seconds
rhevm_connect: 0.040366 seconds
collect_data: -0.002934 seconds
Message delivered time: 2018-08-08T10:34:52.584587
Message state: error
Message delivered in: 0.072315381 seconds


Worker PID: 17473
Message ID: 857372
Message fetch time: 2018-08-08T10:34:52.593645
Message time in queue: 22.280578116 seconds
Provider: Redhat::InfraManager
Object type: Vm
Object name: vm2
Metrics processing start time: 2018-08-08T10:34:52.596560
Time range:
Rows added:
Rows updated:
Capture state: collect_error
Capture timings at time of error:
capture_state: -0.000519 seconds
rhevm_connect: -0.040439 seconds
collect_data: -0.000200 seconds
Message delivered time: 2018-08-08T10:34:52.622828
Message state: error
Message delivered in: 0.029075091 seconds


Worker PID: 17473
Message ID: 857427
Message fetch time: 2018-08-08T10:38:07.735058
Message time in queue: 34.260928487 seconds
Provider: Redhat::InfraManager
Object type: Vm
Object name: vm3
Metrics processing start time: 2018-08-08T10:38:07.738974
Time range:
Rows added:
Rows updated:
Capture state: collect_error
Capture timings at time of error:
capture_state: 0.000783 seconds
rhevm_connect: 0.001124 seconds
collect_data: 0.001940 seconds
Message delivered time: 2018-08-08T10:38:07.770316
Message state: error
Message delivered in: 0.035094059 seconds


#8

You need to look at evm.log to troubleshoot what might be happening. Search for ERROR and see if anything obvious is there.
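
If there are a lot of ERROR lines, grouping them by message shows which one dominates. A rough sketch in Python, assuming each error line contains the literal text ERROR followed by the message; `error_messages` and `top_errors` are hypothetical names used only for illustration.

```python
from collections import Counter


def error_messages(lines):
    """Yield the text that follows the first ERROR marker on each matching line."""
    for line in lines:
        _, marker, rest = line.partition("ERROR")
        if marker:
            # Drop separator characters such as ' -- : ' before the message.
            yield rest.lstrip(" -:").strip()


def top_errors(lines, limit=10):
    """Return the most frequent error messages as (message, count) pairs."""
    return Counter(error_messages(lines)).most_common(limit)


# Example use (log path as used earlier in this thread):
# with open("/var/log/manageiq/evm.log", errors="replace") as log:
#     for message, count in top_errors(log):
#         print(count, message)
```

Messages that embed timestamps or IDs will not group together with this approach, so treat the counts as a starting point only.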


#9

OK, I found the issue. I was getting:
"ERROR: permission denied for relation vm_device_history" which is related to:

https://bugzilla.redhat.com/show_bug.cgi?id=1459342

The Bugzilla initially advises running:

ALTER ROLE cfme SUPERUSER;

This solved the problem, but the formal advice was then to:
1- Roll back the superuser:
ALTER ROLE cfme NOSUPERUSER;
2- Add the following lines to /var/lib/pgsql/data/pg_hba.conf:
host ovirt_engine_history cfme 0.0.0.0/0 md5
host ovirt_engine_history cfme ::0/0 md5
3- Reload the PostgreSQL service:
# systemctl reload postgresql
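
To double-check step 2, the file can be scanned for the new entries. This is a minimal sketch in Python, not an official tool: it assumes entries name the database and user explicitly and use the single-field CIDR address form shown above, and `hba_entries` is a hypothetical helper.

```python
def hba_entries(lines, database, user):
    """Return (address, method) for host entries matching database and user.

    Simplified: ignores 'all', comma-separated lists, and the
    separate IP-plus-netmask address form.
    """
    matches = []
    for line in lines:
        # Strip trailing comments, then split on whitespace.
        fields = line.split("#", 1)[0].split()
        if (
            len(fields) >= 5
            and fields[0].startswith("host")
            and fields[1] == database
            and fields[2] == user
        ):
            matches.append((fields[3], fields[4]))
    return matches


# Example use (path and names as given in the steps above):
# with open("/var/lib/pgsql/data/pg_hba.conf") as conf:
#     print(hba_entries(conf, "ovirt_engine_history", "cfme"))
```

With both lines in place, the result should contain one IPv4 and one IPv6 entry using md5.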

I am now watching the log for “permission denied”, and it no longer appears:

[root@localhost manageiq]# tail -f evm.log| grep -i "permission denied"

Thank you so much. I believe the problem is solved; I will keep an eye on the system now.