[Swan-dev] no proposal chosen response when a rekey

Andrew Cagney andrew.cagney at gmail.com
Sat Sep 14 01:01:57 UTC 2019


On Fri, 13 Sep 2019 at 18:35, Paul Wouters <paul at nohats.ca> wrote:

> On Fri, 13 Sep 2019, Andrew Cagney wrote:
>
> > See https://dpaste.de/EyUR from IRC
> >
> > - libreswan sends a rekey request and gets back no proposal chosen
> >
> > I suspect this is because libreswan's proposal strictly requires DH
> > and the other end strictly refuse it (further down in the log is the
> > remote proposing to CREATE_CHILD_SA with no DH)
>
> a rekey MUST be for identical parameters, so was libreswan too nice to
> continue with a DH mismatch ?
>
>
Except the rekey adds DH.

The CREATE_CHILD_SA from the other end seems to be for a new CHILD on an
established IKE SA.  Since it didn't include DH that makes me suspect that
is why our proposal is failing.  But we need to see the other end.


> > But what's more interesting is the other things that go on:
> >
> > dropping unexpected CREATE_CHILD_SA message containing
> > NO_PROPOSAL_CHOSEN notification; message payloads: SK; encrypted
> > payloads: N; missing payloads: SA,Ni,TSi,TSr
> > -> we're missing a state transition to detect this and initiate a delete
>
> Should we delete? If we just respond and keep the state, the existing
> tunnel will still work until expiry time. So the current way of sending
> an error and not deleting seems correct?
>


(More context)

Pluto initiated the rekey.  It got back the NO_PROPOSAL_CHOSEN response but
ignored it.
Consequently it keeps re-transmitting the original rekey request, and that
request clogs up the outgoing message queue (we can't send anything else
until it clears).

I see two problems:

1. It should have paired the response with the request and stopped
re-transmitting, even though the response contents weren't as expected.
As you point out, it could then wait for the replace timeout, or
immediately initiate a replace.
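
As a rough illustration of point 1 (a sketch in Python, not pluto's actual C
code; the class, method names and return values are all made up for this
example), pairing by message ID means the outstanding request is retired as
soon as any response with that ID arrives, whatever it contains, and the
exchange window of 1 is freed for the next queued request:

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class IkeSa:
    """Hypothetical model of an IKE SA with an exchange window of 1."""
    next_msgid: int = 0
    outstanding: dict = field(default_factory=dict)  # msgid -> request awaiting a response
    pending: deque = field(default_factory=deque)    # requests waiting for the window

    def send(self, request):
        # With a window of 1, only one request may be unacknowledged at a time.
        if self.outstanding:
            self.pending.append(request)
            return None
        msgid = self.next_msgid
        self.next_msgid += 1
        self.outstanding[msgid] = request
        return msgid

    def receive_response(self, msgid, notifications):
        # Pair by message ID first; this is what stops the re-transmits.
        request = self.outstanding.pop(msgid, None)
        if request is None:
            return "drop"  # no request with this ID is outstanding
        # Only now look at the contents; an error still counts as a response.
        if "NO_PROPOSAL_CHOSEN" in notifications:
            outcome = "rekey-failed"  # caller waits for replace, or replaces now
        else:
            outcome = "ok"
        # The window is free again, so the next queued request can go out.
        if self.pending:
            self.send(self.pending.popleft())
        return outcome
```

With this shape, a delete queued behind the failed rekey is sent as soon as
the NO_PROPOSAL_CHOSEN response is paired, instead of waiting behind a
request that keeps being re-transmitted.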

> > message id deadlock? wait sending, add to send next list using parent
> > #1628 unacknowledged 1 next message id=1 ike exchange window 1
> > -> there's an outstanding re-transmit in front of the delete request;
> > the code should just kill the SA family - given the re-transmit went
> > no where what makes us think a delete will do better
>
>
Now let's pretend that there never was a response (rather than the above
bug, someone yanked the cable), and let's ignore MOBIKE.

2. Pluto's used up all its re-transmits, so the entire SA family is SNAFUed
- the IKE SA should simply be declared down.
Since the (rekey) request is clogging up the request queue there's no space
to send the delete; and besides, if that request isn't getting through, why
would the next (delete)?
So the only option left is to throw away the entire SA family and,
depending on policy, start again.
This isn't specific to rekeying - any request timing out should cause the
IKE SA to be declared dead.
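
A minimal sketch of point 2, again in Python with invented names
(MAX_RETRANSMITS, SaFamily and on_retransmit_timeout are assumptions for
illustration, not pluto identifiers or defaults): once the re-transmit budget
is spent, the whole family is discarded locally without queuing a delete, and
policy decides whether to start again.

```python
from dataclasses import dataclass, field

MAX_RETRANSMITS = 3  # hypothetical limit, not pluto's actual default


@dataclass
class SaFamily:
    """Hypothetical IKE SA plus the CHILD SAs that hang off it."""
    ike_up: bool = True
    children: list = field(default_factory=list)
    retransmits: int = 0


def on_retransmit_timeout(family, auto_restart):
    """Called each time the re-transmit timer fires for the outstanding request."""
    if family.retransmits < MAX_RETRANSMITS:
        family.retransmits += 1
        return "retransmit"
    # Out of retries: the peer is unreachable, so a delete request would fare
    # no better than the request that just timed out. Drop everything locally.
    family.children.clear()
    family.ike_up = False
    return "reinitiate" if auto_restart else "down"
```

Nothing in the handler checks which request timed out, which matches the
point that this is not specific to rekeying.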



> I don't think we should send a new IKE request, so this situation is
> avoided :)
>
> Are we sure the rekey did not fail due to it matching the wrong conn and
> this wrong subnet? Eg what happens if:
>
> conn foo establishes
> conn bar uses CREATE_CHILD_SA to be setup as well.
>
> The IKE SA of foo is now shared with foo and bar.
>
> If the remote sends a REKEY request for bar, do we know that we need to
> switch connection?
>
> Guess we need ipsec whack --child-rekey name :)


so we can trigger any event :-)


> Paul
>