[Swan-dev] IKEv2 revival

Sat May 2 15:38:05 UTC 2020

Tuomo and I spent a bit of Friday debugging a regression where the
liveness probe was stomping on a DISCARD event (forcing it to REPLACE)
that had been set according to the connection.  The code itself was
old; it had unexpected consequences when combined with the queuing of
IKE requests.

Anyway, I think this points to the next change.  When retransmits
fail, force whatever event is in .st_event (and I'm tempted to rename
.st_event to .st_kill_event or .st_death_event).

Presumably earlier code has set .st_event according to policy and
retransmits should just follow that policy.  If that isn't always the
case then that is where the next bug is (I'll need to check
retransmits).
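
Roughly, as a sketch in C.  This is illustration only: the enum,
struct, and function names below are made up and are not the actual
pluto code; only the .st_event field name comes from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real event types. */
enum kill_event { EVENT_NONE, EVENT_DISCARD, EVENT_REPLACE };

/* Hypothetical, cut-down state; the real struct has far more in it. */
struct state {
	enum kill_event st_event;	/* set earlier, according to policy */
	int retransmits_left;		/* assumed retransmit budget */
};

/*
 * Called when a retransmit timer fires.  Returns EVENT_NONE while
 * there are retransmits left (caller resends); once they run out it
 * returns whatever is already in .st_event, without overriding it.
 */
enum kill_event on_retransmit_timeout(struct state *st)
{
	assert(st != NULL);
	if (st->retransmits_left > 0) {
		st->retransmits_left--;
		return EVENT_NONE;
	}
	/* Retransmits exhausted: follow policy, do not force REPLACE. */
	return st->st_event;
}
```

The point is that nothing in the timeout path chooses between DISCARD
and REPLACE; that choice was made when .st_event was set.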

On Fri, 1 May 2020 at 13:47, Paul Wouters <paul at nohats.ca> wrote:
>
> On Fri, 1 May 2020, Tuomo Soini wrote:
>
> >> the dpd/liveness action should be phased out. The hold action was to
> >> keep a hold into place to prevent leaks, while another mechanism
> >> restarts the connection. hold was never valid for connections with
> >> rekey=no that are supposed to clean up all state when going down.
> >
> > Not so simple. auto=ondemand, rekey=no - that is on-demand tunneling.
> > With this combination we absolutely want hold/trap.
>
> Yes, but starting a connection also puts the policy in a hold. It is
> the equivalent of add + route + initiate. The "route" causes the hold.
>
> The real question here is the order of things, so we do not
> briefly leak.
>
> >> The fact that we can specify different dpdaction= for connections with
> >> the same IKE SA is a limitation in our connection loading. I consider
> >> it a misconfiguration that we do not need to support.
> >
> > I agree, and that is one reason why I think we should phase out
> > dpdaction completely and add real logic that corresponds to the
> > other config options. We really do know when we want the tunnel to
> > initiate again and when not.
>
> Yup.
>
> >> A liveness event that times out is a failure of the IKE SA, which
> >> means it should affect ALL connections that share the IKE SA. They
> >> should all go into failure mode. If revival is required, the
> >> connections should go down into auto=ondemand to get a hold without a
> >> leak. Once the connection is up the hold will be replaced with the
> >> IPsec SA policy.
> >
> > Actually I think this works for most cases but not all. If only our end
> > can initiate the connection, we should go to revival and try to get
> > the tunnel up. That is, if the config has auto=start we should
> > initiate immediately.
>
> revival == initiate immediately. The devil is in the details. We cannot
> do delete + initiate without ensuring we put in a hold. I think the
> revival code handles this?
>
> > I agree. There must never be liveness before Child SA is established.
> > Only established child sa can cause liveness checks to start.
>
> The question is, what do you do when you have a shared IKE SA with
> two children, and one of them is idle? It will trigger a liveness
> probe, I think, but since the other SA is still live, I think we could
> just not send one and assume the second one is just idle.
>
> >> We should never try to send out a liveness probe if we are already
> >> waiting on an IKE reply. If a liveness probe is needed, then whatever
> >> request is in transit will act as that liveness proof. If the current
> >> request times out, that is also the equivalent of a failed liveness
> >> probe, and the IKE SA gets torn down, taking down its children.
> >
> > Exactly - unlike with IKEv1, when the IKE SA doesn't work we must
> > take all our Child SAs down anyway.
>
> Yup. And for IKEv1 the situation is just too bizarre to fix. You can
> have a child with no IKE SA, then if it becomes idle, you have to
> set up a new IKE SA, and then you don't even know if the other end's
> IKE SA is aware of that IPsec SA for a liveness probe.
>
> Paul