Setup Region / Join Region in Darga


#1

Hey there,

I ran into a small problem while setting up a Darga-2 environment. Our setup is:

1 MasterNode (webserver + coordinator)
2 WorkerNodes (Automate Engine + Collectors)

The database runs internally on the MasterNode. We initialized it with:

vmdb
systemctl stop evmserverd
systemctl restart rh-postgresql94-postgresql.service
DISABLE_DATABASE_ENVIRONMENT_CHECK=1 bin/rake evm:db:region -- --region #REGIONID
DISABLE_DATABASE_ENVIRONMENT_CHECK=1 bin/rake evm:db:reset

That’s working fine…

The problem occurs when configuring the WorkerNodes. If we use the appliance_console to join the region in the external database, the script reports that the operation was successful. BUT: the server is not shown in the WebUI and the region ID in the appliance_console overview is still 0…

So I tried to configure this with appliance_console_cli instead:

appliance_console_cli -r 13 -h dbhost.example.com -p somepasshere

Got an error here:
Warning: There are 18 existing connections to the database preventing the setup of a database region

Since Darga there is a check in the appliance_console that looks for existing database connections. I understand the check, but it is very annoying…

Why can't I join a region when the database has some connections? I don't understand why this check applies here…

Thanks for your support


#2

@carbonin can you review this question from @schmandforke and forward it to an SME if necessary?


#3

I just did some research…

In /var/www/miq/vmdb/gems/pending/appliance_console/cli.rb you defined the join_region call:

def join_region
  output = AwesomeSpawn.run(
    'bin/rails runner',
    :params => [File.expand_path("join_region.rb", __dir__)],
    :chdir  => RAILS_ROOT
  ).output

  if output.to_s.empty?
    true
  else
    say("\n#{output}") if interactive?
    false
  end
end

Correct me if I'm wrong, but the join_region.rb script that this function runs only compares the region IDs:

# This script is run by ApplianceConsole::DatabaseConfiguration when joining a region.

$log.info("MIQ(#{$0}) Joining region...")

config = Rails.configuration.database_configuration[Rails.env]
raise "Failed to retrieve database configuration for Rails.env [#{Rails.env}]" if config.nil?

$log.info("MIQ(#{$0}) Establishing connection with #{config.except("password").inspect}")
class RemoteDatabase < ApplicationRecord
end
RemoteDatabase.establish_connection(config)
new_region = RemoteDatabase.region_number_from_sequence.to_i

region_file = Rails.root.join("REGION")
old_region = region_file.exists? ? region_file.read.to_i : 0

if new_region != old_region
  $log.info("MIQ(#{$0}) Changing REGION file from [#{old_region}] to [#{new_region}]. Restart to use the new region.")
  region_file.write(new_region)
end

$log.info("MIQ(#{$0}) Joining region...Complete")

So I traced my way back through the appliance_console code, but didn't find the point where the region is actually joined… Currently we're not able to join any worker to the MasterNode region.
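
For anyone following along, the REGION file handling in that script boils down to roughly the following. This is only a minimal standalone sketch: `sync_region_file` is a name made up for illustration, and `new_region` is passed in directly instead of being read from the database sequence as the real script does.

```ruby
require "pathname"
require "tmpdir"

# Hypothetical helper (not part of ManageIQ): mirrors how join_region.rb
# treats the REGION file. `new_region` stands in for the value the real
# script reads from the remote database.
def sync_region_file(region_file, new_region)
  # A missing REGION file counts as region 0, as in the script.
  old_region = region_file.exist? ? region_file.read.to_i : 0
  return false if new_region == old_region

  region_file.write(new_region.to_s)
  true
end

Dir.mktmpdir do |dir|
  region_file = Pathname.new(dir).join("REGION")
  puts sync_region_file(region_file, 13) # file is created on the first run
  puts sync_region_file(region_file, 13) # nothing left to change on the second
  puts region_file.read
end
```

So the script only rewrites the file when the number it gets back differs from the one on disk, and a restart is needed afterwards for the new region to take effect.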


#4

Taking a look now @schmandforke

I’ll update here when I have some news.
For now, can you check the evm.log and appliance_console.log files on the appliance you are trying to join to the region for any errors or ruby backtraces?

In the meantime I'm going to try to reproduce the issue on some darga-2 appliances.


#5

Hey,

thanks for taking a look at it…

If I run

appliance_console_cli -r 13 -h dbhost.example.com -p somepasshere

I get the following entries in appliance_console.log:

I, [2016-08-11T15:10:16.316786 #6344]  INFO -- : MIQ(ApplianceConsole::ExternalDatabaseConfiguration#create_region) : starting

E, [2016-08-11T15:10:32.753429 #6344] ERROR -- : DEPRECATION WARNING: `config.serve_static_files` is deprecated and will be removed in Rails 5.1.
Please use `config.public_file_server.enabled = false` instead.
 (called from serve_static_files= at /opt/rubies/ruby-2.2.5/lib/ruby/gems/2.2.0/gems/railties-5.0.0.rc2/lib/rails/application/configuration.rb:81)
PG::ObjectInUse: ERROR:  database "vmdb_production" is being accessed by other users
DETAIL:  There are 17 other sessions using the database.
: DROP DATABASE IF EXISTS "vmdb_production"
Couldn't drop database 'vmdb_production'
Encountered issue setting up Database using region 13: PG::ObjectInUse: ERROR:  database "vmdb_production" is being accessed by other users
DETAIL:  There are 17 other sessions using the database.
: DROP DATABASE IF EXISTS "vmdb_production"
rake aborted!
ActiveRecord::StatementInvalid: PG::ObjectInUse: ERROR:  database "vmdb_production" is being accessed by other users
DETAIL:  There are 17 other sessions using the database.
: DROP DATABASE IF EXISTS "vmdb_production"
/var/www/miq/vmdb/lib/tasks/evm_dba.rake:141:in `block (3 levels) in <top (required)>'
/var/www/miq/vmdb/lib/tasks/evm_dba.rake:177:in `block (3 levels) in <top (required)>'
PG::ObjectInUse: ERROR:  database "vmdb_production" is being accessed by other users
DETAIL:  There are 17 other sessions using the database.
/var/www/miq/vmdb/lib/tasks/evm_dba.rake:141:in `block (3 levels) in <top (required)>'
/var/www/miq/vmdb/lib/tasks/evm_dba.rake:177:in `block (3 levels) in <top (required)>'
Tasks: TOP => db:drop:_unsafe
(See full trace by running task with --trace)

I, [2016-08-11T15:10:32.753657 #6344]  INFO -- : MIQ(ApplianceConsole::ExternalDatabaseConfiguration#create_region) : complete
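
In case it helps with triage, the number of blocking sessions can be pulled out of a log excerpt like the one above with a few lines of Ruby. This is just a sketch: `blocking_sessions` is a made-up helper name and the pattern only covers the DETAIL line as it appears in this log.

```ruby
# Hypothetical helper: extracts the session count from a PostgreSQL
# "database is being accessed by other users" error.
SESSIONS_PATTERN = /There (?:is|are) (\d+) other sessions? using the database/

def blocking_sessions(log_text)
  match = log_text.match(SESSIONS_PATTERN)
  match ? match[1].to_i : nil
end

log = <<~LOG
  PG::ObjectInUse: ERROR:  database "vmdb_production" is being accessed by other users
  DETAIL:  There are 17 other sessions using the database.
LOG

count = blocking_sessions(log)
puts "#{count} sessions are still connected" if count
```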

And this is what happens when I go through the appliance_console instead:

Choose the advanced setting: 8
Configure Database

Database Operation

1) Create Internal Database
2) Create Region in External Database
3) Join Region in External Database
4) Reset Configured Database

Choose the database operation: 3
Database Configuration
Enter the database hostname or IP address: databasehost.example.com
Enter the name of the database on databasehost.example.com:
|vmdb_production|
Enter the username: |root|
Enter the database password on databasehost.example.com: *******
Enter the database password again: *******
Activating the configuration using the following settings...
Host:     databasehost.example.com
Username: root
Database: vmdb_production


Configuration activated successfully.

Press any key to continue.

But it changed nothing and produced no entries in appliance_console.log.


#6

Okay, I found the bug, I’ll open an issue and get it fixed. I can give you steps to work around it though.

This is the way I would do it:
For the “Master” appliance, there is no need to go through those scripting steps.
If you use the appliance_console and go through the following selections you will be able to pick a new region for the currently configured database for that server.

  1. 12 (Stop EVM Server Processes)
  2. 8 (Configure Database) -> 4 (Reset Configured Database)
  3. 13 (Start EVM Server Processes)

As a side note, the appliance_console_cli command you gave is not doing what you think it is doing.

Specifying the -r option indicates that you want to create a new region in the specified database, which you definitely don’t want to do if there are connections to that database (which is why we added the check :smile: ). That command should do what you want if you omit the -r flag.
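
To make the difference concrete, here is a small sketch that builds both command lines using only the flags already shown in this thread (-r, -h, -p). `console_cli_command` is a hypothetical helper for illustration, not something shipped with the appliance.

```ruby
require "shellwords"

# Hypothetical helper: assembles an appliance_console_cli command line
# from the flags that appear in this thread.
def console_cli_command(host:, password:, region: nil)
  args = ["appliance_console_cli"]
  # -r asks for a NEW region to be created; leave it out to join an existing one.
  args += ["-r", region.to_s] if region
  args += ["-h", host, "-p", password]
  Shellwords.join(args)
end

# Creates region 13 in the target database (blocked while connections exist):
puts console_cli_command(host: "dbhost.example.com", password: "somepasshere", region: 13)
# Joins whatever region the target database already has:
puts console_cli_command(host: "dbhost.example.com", password: "somepasshere")
```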

With that in mind you should also be able to do this through the appliance_console using the following options:

  1. 12 (Stop EVM Server Processes)
  2. 11 (Generate Custom Encryption Key) -> 2 (Fetch key…) and follow the prompts to fetch the key from the “Master” server.
  3. echo "13" > /var/www/miq/vmdb/REGION (This is the workaround)
  4. 8 (Configure Database) -> 3 (Join Region in External Database) and follow the prompts

This should get you up and running. I’ll post the issue link here after I open it up so you can track the status if you want.


#7

Here’s the link to the issue https://github.com/ManageIQ/manageiq/issues/10409


#8

Hey,

Is there an appliance_console_cli command to join a region?

So… I tried your workaround:
My REGION file already contains 13 (I didn't have to change anything) and I re-fetched the encryption key. If I now run the "Join Region" option in appliance_console, I get:

Activating the configuration using the following settings...
Host:     databasehost.example.com
Username: root
Database: vmdb_production

/var/www/miq/vmdb/gems/pending/util/miq-password.rb:39:in `rescue in decrypt': can not decrypt v2_key encrypted string (MiqPassword::MiqPasswordError)
	from /var/www/miq/vmdb/gems/pending/util/miq-password.rb:36:in `decrypt'
	from /var/www/miq/vmdb/gems/pending/util/miq-password.rb:68:in `decrypt'
	from /var/www/miq/vmdb/gems/pending/util/miq-password.rb:93:in `try_decrypt'
	from /var/www/miq/vmdb/gems/pending/appliance_console/database_configuration.rb:199:in `block in decrypt_password'
	from /var/www/miq/vmdb/gems/pending/appliance_console/database_configuration.rb:233:in `encrypt_decrypt_password'
	from /var/www/miq/vmdb/gems/pending/appliance_console/database_configuration.rb:199:in `decrypt_password'
	from /var/www/miq/vmdb/gems/pending/appliance_console/database_configuration.rb:203:in `current'
	from /var/www/miq/vmdb/gems/pending/appliance_console/database_configuration.rb:68:in `activate'
	from /var/www/miq/vmdb/gems/pending/appliance_console/external_database_configuration.rb:17:in `activate'
	from /var/www/miq/vmdb/gems/pending/appliance_console/database_configuration.rb:47:in `run_interactive'
	from /var/www/miq/vmdb/gems/pending/appliance_console.rb:452:in `block in <module:ApplianceConsole>'
	from /var/www/miq/vmdb/gems/pending/appliance_console.rb:106:in `loop'
	from /var/www/miq/vmdb/gems/pending/appliance_console.rb:106:in `<module:ApplianceConsole>'
	from /var/www/miq/vmdb/gems/pending/appliance_console.rb:90:in `<main>'

#9

So I tried the workaround on the second WorkerNode and it worked…

Here is my braindump for installing a small environment:

  1. Start appliance_console
  2. Menu 2) Set static IP => enter the IP information
  3. Menu 4) Set Hostname => set the FQDN
  4. Menu 5) Set Timezone
  5. Menu 6) Set Time and Date
  6. Menu 11) Generate Custom Encryption Key
  7. MasterNode: Menu 1) Create key
  8. WorkerNode: Menu 2) Fetch key from remote machine
  9. Menu 8) Configure Database
  10. MasterNode:
    1. Preinstalled RegionID: Menu 1) Create Internal Database
    2. Custom RegionID:
      exit appliance_console and type <-SNIPPET->
  11. WorkerNode: here comes the bug workaround:
    exit appliance_console to the CLI and write the target RegionID to the REGION file
vmdb
echo $REGIONID > REGION

Now you can go through the appliance_console: Menu 8) => Menu 3) Join Region in External Database

Here is the SNIPPET mentioned above (kept separate because it breaks the formatting of the numbered list):

vmdb
systemctl stop evmserverd
sleep 60
systemctl restart rh-postgresql94-postgresql.service
DISABLE_DATABASE_ENVIRONMENT_CHECK=1 bin/rake evm:db:region -- --region $TARGET_REGIONID

Thanks for the great support, I hope this issue will help to make it more stable…


#10

Opened https://github.com/ManageIQ/manageiq/pull/10442 to fix this issue.

This PR should also make the process a lot more straightforward in the source.