Hello everyone!

I’ve run into a weird issue on my ManageIQ infrastructure. For the past few days I either can’t provision VMs or they are not provisioned correctly. When I looked at the provider in the UI, its authentication status was in an error state. I refreshed it and it was good again. It seems to affect only one provider.

So I went through the logs and saw these errors:

[----] I, [2022-01-05T10:55:25.139458 #6309:2b134385f97c] INFO -- automation: MiqAeEvent.build_evm_event >> event=<"login_failed">
inputs=<{:event=>"authenticate_database", :userid=>"", :message=>"Authentication failed for userid ",
"MiqEvent::miq_event"=>99000001731477, :miq_event_id=>99000001731477,
"EventStream::event_stream"=>99000001731477, :event_stream_id=>99000001731477}>

[----] I, [2022-01-05T10:20:50.707479 #2713:2ae5ea88796c] INFO -- automation: MiqAeEvent.build_evm_event >>
event=<"ems_auth_error"> inputs=<{"MiqEvent::miq_event"=>99000001731438, :miq_event_id=>99000001731438,
"EventStream::event_stream"=>99000001731438, :event_stream_id=>99000001731438}>

[----] I, [2022-01-05T10:20:50.876318 #2713:2ae5ea88796c] INFO -- automation: MiqAeEvent.build_evm_event >>
event=<"ems_auth_unreachable"> inputs=<{"MiqEvent::miq_event"=>99000001731439, :miq_event_id=>99000001731439,
"EventStream::event_stream"=>99000001731439, :event_stream_id=>99000001731439}>
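To see how often each of these events shows up, the event names can be counted straight from automation.log. The following is only a minimal sketch: the function name `count_events` and the shortened sample lines are made up for illustration, and the pattern relies solely on the `event=<"...">` shape visible in the excerpts above.

```python
import re
from collections import Counter

# Matches event=<"name">. Curly quotes are tolerated as well, because
# forum software often replaces the straight quotes found in the real log.
EVENT_RE = re.compile(r'event=<["\u201c]([^"\u201d]+)["\u201d]>')


def count_events(lines):
    """Count the event names found in automation.log lines."""
    counts = Counter()
    for line in lines:
        for name in EVENT_RE.findall(line):
            counts[name] += 1
    return counts


if __name__ == "__main__":
    # Shortened, illustrative lines; a real run would iterate over the file.
    sample = [
        'automation: MiqAeEvent.build_evm_event >> event=<"login_failed">',
        'event=<"ems_auth_error"> inputs=<{}>',
        'event=<"ems_auth_unreachable"> inputs=<{}>',
    ]
    for name, n in count_events(sample).most_common():
        print(name, n)
```

Matching per line works even when the log entry is wrapped, because the pattern does not depend on the timestamp prefix.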

I suppose these are related to my error, so I checked whether a limit on simultaneous connections for the same user is set, but it is unlimited. I’m not sure, but maybe my ManageIQ workers sometimes can’t talk to the database and therefore can’t talk to the provider. I’m lost, though, because if that were the case, why can I provision VMs on my other VMware provider?

I tried deleting the provider and adding it again, but without success; it is still in error.

I searched the PostgreSQL log and couldn’t find any authentication errors. I did find some very short connections, but I don’t think they are an issue:

2022-01-05 10:01:02 EST:[unknown]@[unknown]:[2695711]:LOG: connection received: host= port=52560
2022-01-05 10:01:02 EST:[2695711]:LOG: connection authorized: user=root database=vmdb_production SSL enabled (protocol=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384, compression=off)
2022-01-05 10:01:02 EST:[2695711]:LOG: disconnection: session time: 0:00:00.035 user=root database=vmdb_production host= port=52560
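To judge how common these very short sessions are, the `session time` value on the disconnection lines can be converted to seconds and filtered. This is a rough sketch, assuming the disconnection lines always look like the one above; the function names and the 1-second threshold are arbitrary choices for illustration.

```python
import re

# Matches "disconnection: session time: H:MM:SS.mmm" as seen in the excerpt.
SESSION_RE = re.compile(
    r"disconnection: session time: (\d+):(\d{2}):(\d{2}\.\d+)"
)


def session_seconds(line):
    """Return the session duration in seconds, or None if the line has none."""
    match = SESSION_RE.search(line)
    if match is None:
        return None
    hours, minutes, seconds = match.groups()
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def short_sessions(lines, threshold=1.0):
    """Return (seconds, line) pairs for sessions shorter than threshold."""
    result = []
    for line in lines:
        secs = session_seconds(line)
        if secs is not None and secs < threshold:
            result.append((secs, line.rstrip()))
    return result


if __name__ == "__main__":
    line = ("2022-01-05 10:01:02 EST:[2695711]:LOG: disconnection: "
            "session time: 0:00:00.035 user=root "
            "database=vmdb_production host= port=52560")
    for secs, text in short_sessions([line]):
        print(f"{secs:.3f}s  {text}")
```

Grouping the results by timestamp could then show whether the short sessions line up with the times of the `ems_auth_error` events.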

The only things I do on my appliance are restarting it and cleaning logs. I have already done that many times without any impact. I can only work on this next Monday. I will try to redo the link with the v2_key and fix_auth, but I don’t believe that’s the solution…

@agrare Have you seen anything like this before?

Hm, no, I haven’t. I think you need to find the log line from the appliance where it runs the provider authentication check and find that error; that should help us see what failed.

If it couldn’t connect to the database I’d expect the worker to fail and exit, not mark the auth invalid, so these might be two different issues? Unless it is intermittent.
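One way to narrow down the search for that log line is to keep only warning, error and fatal entries that mention authentication. The sketch below assumes the other appliance logs use the same `[----] I, [...]` prefix as the automation.log excerpts above, with the first letter of the severity after `[----]`. Which log file actually contains the authentication check is not stated in this thread, and the function name and sample lines are made up.

```python
import re

# The severity letter (I, W, E, F, ...) follows "[----] " in each entry.
SEVERITY_RE = re.compile(r"^\[----\] ([A-Z]), ")


def auth_problems(lines, severities=("W", "E", "F"), keyword="auth"):
    """Return entries at the given severities that mention the keyword."""
    hits = []
    for line in lines:
        match = SEVERITY_RE.match(line)
        if match and match.group(1) in severities and keyword in line.lower():
            hits.append(line.rstrip("\n"))
    return hits


if __name__ == "__main__":
    # Made-up lines that only imitate the prefix format; not real log output.
    sample = [
        "[----] I, [t #1:a] INFO -- : authentication ok",
        "[----] E, [t #1:a] ERROR -- : authentication failed (made-up text)",
        "[----] W, [t #1:a] WARN -- : unrelated warning",
    ]
    for hit in auth_problems(sample):
        print(hit)
```

Running it over each log file from the affected servers around the timestamps of the `ems_auth_error` events should keep the output small enough to read by hand.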

Hi @agrare, @Fryguy

Thanks for your time!

I already deleted and re-added the provider, but nothing changed. If I remember correctly, I couldn’t find any more information in the logs; I searched a lot of files without success and found only these types of errors, without further detail. Do you know which log file I need to inspect?

Here is the weird part: I saw connections from all my servers/UI in my PostgreSQL logs, and I saw this error only in automation.log.

When I validate the connection in the UI, it succeeds, but once that is done I get these errors and the authentication status is in error or unknown.

Hello guys!

I’ve taken another look at this error, and it happens only in a specific zone.

I have 3 zones:

  • Default (No server)

  • Zone 1 (3 servers)

  • Zone 2 (3 servers)

I split my VMware and OpenStack providers between those two zones. For the past month I have had an error on the servers in Zone 2 (the one at the beginning of this topic). It seems to lose the account, because “userid” is empty, so it can’t reach the database. I am still investigating because I would like to keep this infrastructure and not rebuild everything.

I want to point out that my infrastructure was working very well with this architecture. I didn’t perform any particular action that could have broken it; maybe when I restart an appliance, it doesn’t come up right.

I found a weird behavior: I need to move my VMware provider into Zone 2 to make it work again. I tried moving all objects to the default zone, but that doesn’t work; my vCenter keeps the auth error. They need to be in Zone 2 to work.

When they are in this zone, the vCenter that didn’t work before works now. But when I try to create a VM, the VM is created but not powered on, and I don’t know why. The request returns an error saying the job failed because it couldn’t check whether the VM was provisioned. I checked, and the option to power on the VM after creation is enabled.

Maybe I’m missing something important about working with zones.