Can someone explain the new CFME 40 RC1 a new message processing sequence as "unget" added to the ".put" and "#get" sequence?


#1

I am currently working with Cloudforms 4.0 RC1 and have recognized what for me is a new construct in the logs. I am familiar with the message “get…”, message "put: and message “merge” constructs, but now I am seeing messages being processed as “unget” with the message status transitioning from ready-dequeued-ready and being requeued for a much later delivery time. Can some on explain:

  • why this happening?
  • what problem is this addressing or what future functionality is it intended to facilitate?
  • Is the retry delay configurable and if if so how?

#2

@tch Where are you seeing this? Can you give an example log snippet?


#4

with first sample '#unget" message sequence isolated:

[----] I, [2015-11-23T20:30:53.384082 #20776:11dd994] INFO – : ** MIQ(MiqQueue#unget)** Message id: [442000000013034], id: [], Zone: [default], Role: [ems_operations], Server: [], Ident: [generic], Target id: [], Instance id: [442000000000029], Task id: [], Command: [Storage.smartstate_analysis], Timeout: [1800], Priority: [100], State: [ready], Deliver On: [2015-11-24 01:31:53 UTC], Data: [], Args: [442000000000475], Requeued
[----] E, [2015-11-23T20:30:53.384249 #20776:11dd994] ERROR – : MIQ(MiqQueue#deliver) Message id: [442000000013034], Message not processed. Retrying at 2015-11-24 01:31:53 UTC


#5

@tch It seems unget has been there a very long time. It’s possible it was just not used previously.


#6

So the questions are:

  • why is a message that has been dequeued added back into the miq_queue table with the state changed from ‘dequeue’ back to ‘ready’ but now with a later delivery date?
  • what condition would cause this action?
  • Is there a limit to how many times a message can be dequeued-and-requeued?
  • Since this behavior has not been evident in either EVM or CFME, what has changed that is engaging this behavior now?
  • Is it reasonable to expect this churn in the miq_queue table to adversly impact overall behavior of CFME systems?

#7

Based on the one line of code that uses it, the message is put back on the queue if the code processing the queue item fails with an MiqVimBrokerUnavailable. I’m not sure of all the processes that could raise that, but offhand I’d guess VMware’s refresh, C&U collection and operations.

This code was introduced in 2011:

commit 097add944cefa6df9f409df56d485892e14d044e
Author: Oleg Barenboim <obarenbo@redhat.com>
Date:   Tue Sep 13 19:14:53 2011 +0000

    WorkerBase now supports require_vim_broker setting for classes that depend on VimBrokerWorker
    Created QueueWorkerBase#message_delivery_active? which checks for VimBrokerWorker availability for workers that depend on the VimBrokerWorker
    In QueueWorkerBase#do_work, break out of the loop unless message_delivery_active?
    Created QueueWorkerBase#deliver_queue_message (refactoring) that handles MiqException::MiqVimBrokerUnavailable by logging an error  and requeueing the message
    EmsRefreshWorker and PerfCollectorWorker now explicitly have a dependence on the VimBrokerWorker
    VimBrokerWorker - moved starting and priming of broker_server from before_work_loop (status="started") to after_initialize (status="starting")
    BugzID:13129
    
    git-svn-id: http://miq-ubuntusub.manageiq.com/svn/svnrepos/Manageiq/trunk@30564 3c68ef56-dcc3-11dc-9475-a42b84ecc76f

#8

Thank you