[Swan-dev] too many ways for a state transition to fail

Andrew Cagney andrew.cagney at gmail.com
Thu May 7 15:41:27 UTC 2020


I don't think STF_* is sufficient to describe the ways a state
transition processor can fail; however, I'm also wary of adding still
more STF_* codes.

Here's a possible model ...

In the RESPONDER:

- all ok: a good response is sent and the transition completes
  let's call this OK
- the operation fails: an error response is sent and the (child)
state is considered dead
  let's call this FAIL
- the operation is a disaster: a fatal response is sent and the entire
family is considered dead
  let's call this FATAL
- in IKE_AUTH the IKE SA can succeed but the CHILD SA can fail; so
both success and fail are sent and the (child) state is considered
dead
  let's call this OK (if nothing else the code can do a CHILD->IKE
switch and continue)
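
As a rough sketch of the responder model above (not existing libreswan
code; every name below is hypothetical and is not one of the STF_*
codes), each outcome can be tied to what goes on the wire and to what
is considered dead afterwards. The IKE_AUTH case is modelled as OK plus
a flag saying the child failed.

```c
#include <stdbool.h>

/* Hypothetical names for illustration only; not libreswan identifiers. */
enum responder_outcome { RESPONDER_OK, RESPONDER_FAIL, RESPONDER_FATAL };

/* What is sent back to the initiator. */
enum reply_kind {
    REPLY_GOOD,   /* a good response */
    REPLY_MIXED,  /* IKE_AUTH: IKE SA success plus CHILD SA failure */
    REPLY_ERROR,  /* an error response */
    REPLY_FATAL,  /* a fatal response */
};

/* What is considered dead once the reply is sent. */
enum dead_scope { DEAD_NONE, DEAD_CHILD, DEAD_FAMILY };

struct responder_effect {
    enum reply_kind reply;
    enum dead_scope dead;
};

/*
 * Map an outcome to its effect.  child_failed only matters for
 * RESPONDER_OK, where it covers the IKE_AUTH case in which the IKE SA
 * succeeds but the CHILD SA does not.
 */
struct responder_effect describe_responder_outcome(enum responder_outcome outcome,
                                                   bool child_failed)
{
    switch (outcome) {
    case RESPONDER_OK:
        if (child_failed) {
            return (struct responder_effect){ REPLY_MIXED, DEAD_CHILD };
        }
        return (struct responder_effect){ REPLY_GOOD, DEAD_NONE };
    case RESPONDER_FAIL:
        return (struct responder_effect){ REPLY_ERROR, DEAD_CHILD };
    case RESPONDER_FATAL:
        return (struct responder_effect){ REPLY_FATAL, DEAD_FAMILY };
    }
    /* Assumption: an out-of-range outcome is treated as fatal. */
    return (struct responder_effect){ REPLY_FATAL, DEAD_FAMILY };
}
```

With this shape, three codes plus one flag cover all four responder
cases, which is why the responder side looks manageable.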

How does this compare to the current code?  Things are looking hopeful:
- I've changed STF_FAIL and STF_FATAL to send back the recorded response
- there's never been a clear differentiation between STF_FAIL and
STF_FATAL so which of the two is currently returned seems arbitrary
- delete_state() is called and who knows what that will do - can be hobbled
- IKE_AUTH

In the INITIATOR processing the response:

- a fatal response is received, the entire family is considered dead,
there is no delete request
  handle this with dedicated state transitions
  or call this FATAL
- a fail response is received, the initiator is considered dead, there
is no delete request
  handle this with a dedicated state transition
  or call this FAIL
- a good response is received, BUT this end detects something fatal
(critical bit), the entire family is considered dead, there is no
delete request
  let's call this FATAL
- a good response is received, this end considers it ok and the
transition completes
  can this trigger a further exchange (consider IKE_SA_INIT then
IKE_AUTH), later
  let's call this OK
- a good response is received, BUT this end detects something wrong
and needs to initiate a child/ike delete
  this violates the rule that an IKEv2 response never
triggers a request
  let's call this FAIL/FATAL; oh, both are taken
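
The five initiator cases above can be sketched as a small classifier.
This is only an illustration under assumed names (none of these enums
or functions exist in libreswan); it shows that the cases collapse into
four distinct actions, one more than OK/FAIL/FATAL can express.

```c
/* Hypothetical names for illustration only; not libreswan identifiers. */

/* What the peer sent back. */
enum response_kind { RESPONSE_GOOD, RESPONSE_FAIL, RESPONSE_FATAL };

/* What this end concludes after examining a good response. */
enum local_verdict { VERDICT_OK, VERDICT_WRONG, VERDICT_FATAL };

/* What the initiator does next. */
enum initiator_action {
    ACTION_COMPLETE,     /* OK: the transition completes */
    ACTION_KILL_STATE,   /* FAIL: initiator state dead, no delete request */
    ACTION_KILL_FAMILY,  /* FATAL: entire family dead, no delete request */
    ACTION_SEND_DELETE,  /* the odd one out: initiate a child/ike delete */
};

enum initiator_action classify_response(enum response_kind response,
                                        enum local_verdict verdict)
{
    /* A fatal or fail response decides the action by itself. */
    if (response == RESPONSE_FATAL) {
        return ACTION_KILL_FAMILY;
    }
    if (response == RESPONSE_FAIL) {
        return ACTION_KILL_STATE;
    }

    /* A good response: the action depends on what this end detects. */
    switch (verdict) {
    case VERDICT_FATAL:
        return ACTION_KILL_FAMILY;  /* e.g. critical bit */
    case VERDICT_WRONG:
        return ACTION_SEND_DELETE;  /* the response triggers a request */
    case VERDICT_OK:
        break;
    }
    return ACTION_COMPLETE;
}
```

ACTION_SEND_DELETE is the case that has no natural home among the
existing codes.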

How does this compare to the current code for an initiator processing a response?
- note how the response can trigger at least 3 actions: IKE kill,
CHILD kill, CHILD delete but there's only FAIL and FATAL
- currently STF_FAIL and STF_FATAL call delete_state() and,
invariably, that function's heuristics fail and it sends out a delete
request and forgets to delete children

So what next?  Some ideas I'm playing with are:

- add STF status codes to differentiate the above
- 'succeed' but inject a 'delete' event
- record an outgoing 'delete' and return above; not as weird as it sounds ...

