[Swan-dev] ikev2-32-nat-rw-rekey is weird
Andrew Cagney
andrew.cagney at gmail.com
Wed Dec 5 21:01:29 UTC 2018
I think I've restored ikev2-32-nat-rw-rekey's behaviour.
commit 48ab456071939535f9f915622162bbcc056fe2ea (origin/master,
origin/HEAD, master)
Author: Andrew Cagney <cagney at gnu.org>
Date: Mon Dec 3 11:01:34 2018 -0500
ikev2: schedule "replace" as explicit "rekey" (new event),
"replace", or "expire" events
The schedule-replace code, depending on context, will schedule an
explicit "rekey", "replace", or "expire" event.
The "rekey" handler starts a rekey of the SA (the IKE SA calls
ikev2_rekey_ike_start(), the CHILD uses magic and a call to
ipsecdoi_replace()). A replace is then scheduled.
The "replace" handler seeing a rekey in progress "cleans up" the mess:
for the IKE SA it forces a full replace and forced "expire"; for the
old CHILD SA, it is forced to "expire" (what happens to the new CHILD
SA remains a mystery; can CHILD SA even skip directly from "rekey" to
"expire"?).
This should restore a quirk in ikev2-32-nat-rw-rekey where the rekey
runs out of time.
(Note that there is a deliberate bug where EVENT_SA_REKEY is logged as
the old EVENT_SA_REPLACE. It avoids churning the output. Something
to fix later).
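
For anyone trying to follow the event flow in that commit message, here is a
rough sketch of it as a toy model. It is not the pluto code: the names Event,
SA and handle_event are made up for illustration, and the plain "replace"
branch (no rekey in progress) is an assumption based on the old
REPLACE-then-EXPIRE behaviour quoted further down.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List, Optional


class Event(Enum):
    """Hypothetical stand-ins for the rekey/replace/expire events."""
    REKEY = auto()
    REPLACE = auto()
    EXPIRE = auto()


@dataclass
class SA:
    """Toy SA: just enough state to follow the event transitions."""
    serial: int
    is_ike: bool
    pending: Optional[Event] = None
    rekey_in_progress: bool = False
    log: List[str] = field(default_factory=list)


def handle_event(sa: SA) -> None:
    """Run the pending event and schedule the follow-up event, if any."""
    event, sa.pending = sa.pending, None
    if event is Event.REKEY:
        # Start the rekey, then schedule a replace as the fallback.
        sa.rekey_in_progress = True
        sa.log.append("start rekey")
        sa.pending = Event.REPLACE
    elif event is Event.REPLACE:
        if sa.rekey_in_progress and sa.is_ike:
            # The rekey ran out of time: force a full replace, then expire.
            sa.log.append("rekey timed out: force full replace")
        elif sa.rekey_in_progress:
            # Old CHILD SA: it is simply forced to expire.
            sa.log.append("rekey timed out")
        else:
            # Assumption: a plain replace is also followed by an expire.
            sa.log.append("start replace")
        sa.pending = Event.EXPIRE
    elif event is Event.EXPIRE:
        sa.log.append("delete")


# Walk an IKE SA through rekey -> replace -> expire.
ike = SA(serial=1, is_ike=True, pending=Event.REKEY)
while ike.pending is not None:
    handle_event(ike)
```

After the loop, ike.log reads "start rekey", "rekey timed out: force full
replace", "delete", which is the sequence the test is expected to exercise when
the rekey cannot complete.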
On Mon, 26 Nov 2018 at 12:08, Andrew Cagney <andrew.cagney at gmail.com> wrote:
>
> On Mon, 26 Nov 2018 at 11:16, Antony Antony <antony at phenome.org> wrote:
> >
> > an unestablished child state would become a new "connection" initiation (STATE_PARENT_I1) when the parent deletes. That is how #4 is created
>
> Unfortunately what was happening depended on luck:
>
> - the #1 REPLACE event would create a re-key state #3 and hash that to
> a random slot
>
> - the #1 EXPIRE event would then call delete_my_family(IKE SA, FALSE) which:
> -- deleted all children of the IKE SA, but only if they are hashed to
> the same slot as the expired IKE SA
> -- since #3 re-key state was hashed to a random slot (which may or may
> not match the IKE SA's slot) it surviving this depended on luck
>
> Assuming #3 survived, the code would then call delete_state() which:
>
> > delete_state
> > flush_pending_children
> > flush_pending_child
> > #queue up new IKE_INIT exchange.
>
> because it was searching the entire state table, and not just the IKE
> SA's hash slot, would stumble across the rekey state #3 and cause it
> to trigger a replace
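
The slot dependence described above can be shown with a toy model. This is
only a sketch under assumptions: State, delete_family_same_slot,
find_children_anywhere and expire_ike are hypothetical names, the table has
four slots, and the real hashing in pluto is not modelled at all. The only
point is that one walk is limited to the IKE SA's slot while the other
searches the whole table.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

NSLOTS = 4  # assumption: a tiny table, just for illustration


@dataclass
class State:
    serial: int
    parent: Optional[int]  # serial of the owning IKE SA, None for the IKE SA
    slot: int              # hash slot the state landed in


def delete_family_same_slot(table: List[List[State]], ike: State) -> List[int]:
    """Delete children of `ike`, but only those in the IKE SA's own slot."""
    bucket = table[ike.slot]
    deleted = [s.serial for s in bucket if s.parent == ike.serial]
    table[ike.slot] = [s for s in bucket if s.parent != ike.serial]
    return deleted


def find_children_anywhere(table: List[List[State]], ike: State) -> List[int]:
    """Search every slot, as the pending-children flush is described to do."""
    return [s.serial for bucket in table for s in bucket
            if s.parent == ike.serial]


def expire_ike(rekey_slot: int) -> Tuple[List[int], List[int]]:
    """Expire IKE SA #1 with rekey state #3 hashed to `rekey_slot`."""
    table: List[List[State]] = [[] for _ in range(NSLOTS)]
    ike = State(serial=1, parent=None, slot=0)
    child = State(serial=2, parent=1, slot=0)
    rekey = State(serial=3, parent=1, slot=rekey_slot)
    for s in (ike, child, rekey):
        table[s.slot].append(s)
    deleted = delete_family_same_slot(table, ike)
    rescheduled = find_children_anywhere(table, ike)
    return deleted, rescheduled


same_slot = expire_ike(rekey_slot=0)   # #3 shares the IKE SA's slot
other_slot = expire_ike(rekey_slot=2)  # #3 landed somewhere else
```

With #3 in the IKE SA's slot, both #2 and #3 are deleted and nothing is left
to reschedule; with #3 in another slot, only #2 is deleted and #3 is found by
the table-wide search, which is the case that went on to trigger the replace.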
>
> While the quick fix seems to be to not delete the re-key state #3 it
> seems weird.
>
> - other than the re-key state, could there ever be any other state
> lurking in the state table?
>
> - since the old IKE SA needs replacing, then why not just replace it?
>
> > And #4 deletes when retransmit expires, say 60sec default.
> > I think keyingtries is supposed to keep it going, create #5 and so on.
> >
> > -antony
> >
> >
> > On Mon, Nov 26, 2018 at 10:26:25AM -0500, Andrew Cagney wrote:
> > > The old code was doing roughly:
> > >
> > > #1 established as IKE SA
> > > #2 established as CHILD SA
> > >
> > > and then
> > >
> > > | handling event EVENT_SA_REPLACE for parent state #1
> > > | #3 schedule initiate IKE Rekey SA none to replace IKE# 1
> > > - can't as network is down but keeps retrying
> > > | inserting event EVENT_SA_EXPIRE, timeout in 13.000 seconds for #1
> > > - i.e., switch #1 from REPLACE to EXPIRE
> > >
> > > and then
> > >
> > > | #1: ISAKMP SA expired (LATEST!)
> > > - deletes all known children (i.e. #2, but not #3 - that's become a zombie)
> > > | #1: reschedule pending child #3 STATE_V2_REKEY_IKE_I of connection
> > > "road-east-x509-ipv4"[1] 192.1.2.23 - the parent is going away
> > > | inserting event EVENT_SA_REPLACE, timeout in 0.000 seconds for #3
> > > - i.e., flips #3's event from retransmit to replace
> > > - deletes itself (#3)
> > >
> > > and this wakes up zombie #3 causing it to:
> > >
> > > #3: handling event EVENT_SA_REPLACE for child state
> > > - creates #4 to do full re-negotiation
> >
> >
> >
> > > - deletes itself
> > >
> > > Since the new code deletes #3 (re-key state) while deleting #1
> > > (original IKE SA) there is no #3 zombie state to bring back from the
> > > dead. Hence the connection dies.
> > >
> > > My guess is that the #1 EXPIRE event should do the replace itself
> > > (clearly it wasn't, as it wakes up the zombie state #3, causing that
> > > to do the REPLACE instead). Any thoughts?
> > > _______________________________________________
> > > Swan-dev mailing list
> > > Swan-dev at lists.libreswan.org
> > > https://lists.libreswan.org/mailman/listinfo/swan-dev