Unexpected behavior of $evm.execute


#1

Hi,
it appears that we might have hit an unexpected behavior of the $evm.execute method.

Parent instance (level 0) consists of 10 steps (relationships) – relation-1 to relation-10.
Each step is itself a relationship to an instance (level 1). These instances act as logical encapsulations of the activities (states) that we need to execute.

Each instance on level 1 can represent a different set of activities, e.g.:

  • relationship 3 performs steps A to H
  • relationship 6 performs steps M to O
  • relationship 7 performs steps R to T

Some of the steps (for example C and D) use $evm.execute (level 2) for executing the code in a particular zone.

Problem statement

  • a retry initiated on any level restarts the whole parent instance (level 0)

Expected behavior
a retry on any level restarts only the instance on its own level:

  • state machine on level 1 reinitializes only itself in case of a non-successful state
  • similarly, on the 2nd level the code waits for successful finalization of an operation executed with $evm.execute

Does anybody have an idea why the retry restarts the whole workflow?


#2

There are a number of $evm.execute methods. Can you say which one you are running (I assume create_automation_request)?

But the behaviour that you describe is expected: a state machine retry will re-launch everything under a common $evm.root (actually everything that was initiated from a single message to automate).

When you’re defining your class schema, make sure that all of your state machine steps are of type ‘State’ rather than ‘Relationship’. The difference is that if (say) step 5 issues a retry in a ‘true’ state machine, steps 1-4 won’t be re-executed. If these steps were relationships rather than states, they would be re-run on each retry of the state machine.

In your example, if you wanted steps C and D to be independent from retries, you could launch them asynchronously using $evm.execute('create_automation_request'...), in which case they’d execute under their own $evm.root. If you wanted to be able to coordinate completion of these asynchronous steps, you could save their request IDs as state_vars (which persist across state machine retries), and then have a check-and-retry stage somewhere after their launch that checks the request object’s state for ‘finished’.
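As a rough illustration of the launch-and-remember part, here is a minimal sketch. It is written as a plain Ruby function that takes the service object as a parameter so it can be read in isolation; inside a real automate method you would pass $evm. The namespace, class and state_var names are hypothetical, and it assumes that a user object is available as $evm.root['user'] and that the 'admin' userid exists.

```ruby
# Sketch only: launch one step asynchronously and remember its request ID.
# `evm` stands in for $evm; 'Stuff', 'Methods' and the state_var key are
# hypothetical names, not taken from the original workflow.
def launch_async_step(evm, instance_name, state_var_key)
  options = {
    :namespace     => 'Stuff',               # hypothetical namespace
    :class_name    => 'Methods',             # hypothetical class
    :instance_name => instance_name,
    :user_id       => evm.root['user'].id    # assumes a user object on $evm.root
  }
  auto_approve = true

  # The new request runs under its own $evm.root, so retries of the calling
  # state machine do not re-run it.
  request = evm.execute('create_automation_request', options, 'admin', auto_approve)

  # state_vars survive state machine retries, so a later state can find the request.
  evm.set_state_var(state_var_key, request.id)
  request.id
end
```

From a state method this might be called as launch_async_step($evm, 'StepC', :step_c_request_id), with a later state reading the same state_var back.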

Hope this helps,
pemcg


#3

Hi Peter,

it is really good to know that everything under $evm.root is restarted. Based on all the existing documentation, it looked like only the state machine itself (the instance with several steps) is restarted.

I am also wondering about the state machine settings: number of retries, max time, and retry interval. These options may be defined at several levels: the parent instance, nested instances, and methods that wait for completion of a request made by $evm.execute.

  • what if I define different values, e.g. for the number of retries, on these different levels? Does the parent instance take priority, or is the value defined for the exact level (e.g. the nested state machine) the valid one?
  • what are the default values if the schema definition has empty values?

Thanks,
Vaclav


#4

Hi Vaclav

I’ve just run a quick test on two nested state machines, and it appears that the max retries value in the outer state machine takes priority, so I’m guessing that the max time would be the same. The retry interval ($evm.root['ae_retry_interval']) is always defined in the method that exits with $evm.root['ae_result'] = 'retry' so there is no ambiguity there.
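To make the last point concrete, this is a minimal sketch of a method exiting with a retry. It is a plain function taking the service object as a parameter (pass $evm in a real method); the function name and the default interval are my own choices for illustration, and the two counters it logs are the $evm.root attributes that the state machine maintains.

```ruby
# Sketch only: ask the state machine to retry the current state.
# `evm` stands in for $evm; the '1.minute' default is an arbitrary example.
def request_retry(evm, interval = '1.minute')
  retries     = evm.root['ae_state_retries']
  max_retries = evm.root['ae_state_max_retries']
  evm.log(:info, "Retries so far: #{retries} (max #{max_retries}); next attempt in #{interval}")

  # The retry interval is set by the same method that asks for the retry.
  evm.root['ae_result']         = 'retry'
  evm.root['ae_retry_interval'] = interval
end
```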

Any automate instances launched using $evm.execute('create_automation_request',...) run asynchronously to their caller. If you want to wait for completion you’d need a follow-up ‘CheckCompleted’ state method that runs a check-and-retry loop, testing for the new automation request to progress to a state of ‘finished’. The ‘normal’ retry rules/logic apply here, even in nested state machines.
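A ‘CheckCompleted’ method along those lines might look roughly like the sketch below. Again it is a plain function taking the service object as a parameter (pass $evm in a real method). It assumes the request ID was stored earlier in a state_var under a key of your choosing, and that the request can be looked up with $evm.vmdb(:miq_request).find_by_id; the function name and the 30-second interval are illustrative only.

```ruby
# Sketch only: check-and-retry until the launched automation request finishes.
# `evm` stands in for $evm; state_var_key is whatever key the launching
# state used to store the request ID (an assumption of this example).
def check_completed(evm, state_var_key, retry_interval = '30.seconds')
  request_id = evm.get_state_var(state_var_key)
  request    = evm.vmdb(:miq_request).find_by_id(request_id)

  if request.nil?
    # Nothing to wait for; fail the state rather than retry forever.
    evm.root['ae_result'] = 'error'
  elsif request.state == 'finished'
    evm.root['ae_result'] = 'ok'
  else
    # Not finished yet: ask the state machine to call this state again later.
    evm.root['ae_result']         = 'retry'
    evm.root['ae_retry_interval'] = retry_interval
  end
  evm.root['ae_result']
end
```

Each retry counts against the state's max retries in the usual way, so the retry interval and max retries together bound how long the caller waits.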

Any automate instances launched using $evm.instantiate('/path/to/instance') run synchronously in-line with the caller, so the caller will wait for completion anyway before proceeding.

If you want to investigate more about what’s happening, it might be enlightening to call object_walker from one of your inner state machine methods (You can install it as a git-importable domain from here). This will show the automation instance hierarchy at the point from which it’s called, for example:

 --- automation instance hierarchy ---
 /General_pemcg/StateMachines/Demo/StateMachine1  ($evm.root)
 |    /General_pemcg/StateMachines/Demo/StateMachine2
 |    |    /General_pemcg/Stuff/Methods/Test2  ($evm.parent)
 |    |    |    /Investigative_Debugging/Discovery/ObjectWalker/object_walker  ($evm.object)

You’ll see that the hierarchy is shorter in retry calls of the method as the state machine isn’t re-executing earlier instances. You’ll also see some of the current state-related $evm.root attributes such as:

 |    $evm.root['ae_next_state'] =    (type: String)
 |    $evm.root['ae_result'] = ok   (type: String)
 |    $evm.root['ae_retry_server_affinity'] = false   (type: FalseClass)
 |    $evm.root['ae_state'] = State3   (type: String)
 |    $evm.root['ae_state_max_retries'] = 10   (type: Fixnum)
 |    $evm.root['ae_state_retries'] = 3   (type: Fixnum)
 |    $evm.root['ae_state_started'] = 2017-11-08 09:00:04 UTC   (type: String)
 |    $evm.root['ae_state_step'] = main   (type: String)
 |    $evm.root['ae_status_state'] = on_entry   (type: String)

Hope this helps,
pemcg