[Swan-dev] problem from IRC: confusing message and action of lost final packet

Fri Sep 28 15:53:16 UTC 2018

On Wed, 26 Sep 2018 at 09:52, Paul Wouters <paul at nohats.ca> wrote:
>
> On Sat, 22 Sep 2018, D. Hugh Redelmeier wrote:
>
> > <mcp> since libreswan 3.26 + 83e33a69b27f6c5d5f4aff2fc94a1357d5126ed1 I
> > get these syslog messages very often:
> > http://paste.debian.net/hidden/a99f6aa9/ - that's annoying ;)
>
> this is reproduced in test case ikev1-responder-retransmit-01-Q2

I tweaked the test so that it injects a duplicate of every response
using --impair replay-duplicate[s].  We might also want to test this
scenario with replay-{forward,backwards}.

> > No. STATE_MAIN* and STATE_QUICK* are IKEv1
> >
> > Did you not delete the retained packets in these states?  This is my
> > vague recollection.  Also that I questioned whether this would cause
> > problems.
>
> I thought that was only related to XAUTH states, which live sort of
> between Main/Aggr and Quickmode, for which retransmitting a "last"
> packet was tricky because of the initiator role change mid-exchange?
>
> It seems to me we are simply mismatching state machine entry. We should
> have one for the established IKE SA and recognise it is established and
> therefore a retransmit.

Paul did some triage and, indeed, traced it back to:

commit 49cfd21870994d1afc038ecd0830c9ad0a14e6d1
Author: Andrew Cagney <cagney at gnu.org>
Date:   Tue May 29 09:24:49 2018 -0400

    ikev1 retransmits: only save the received packet when responding

    Should eliminate problems such as the responder, when receiving a
    response to its XAUTH request from the initiator (remember, an IKEv1
    exchange can flip initiator and responder part way through), would see
    the received packet matched .st_rpacket and assume it needed to
    re-transmit something.

    Really fix 8f440ae125a1d29eb4507bd94b123d22bbd3cb2a

commit 8f440ae125a1d29eb4507bd94b123d22bbd3cb2a
Author: Andrew Cagney <cagney at gnu.org>
Date:   Thu May 24 21:08:20 2018 -0400

    ikev1: apply another bandaid to code trying to send empty packets

    Duplicate the bandaid in send_chunks() that rejects empty packets.

    send_or_resend_v1_ike_msg_from_state() when passed an empty
    st_tpacket, was able to stumble past a passert(st_tpacket.len!=0) (see
    1f61a49a6f2d83997fcad50da20ed7cd5924b9f0 which left .len non-zero).
    Only later, in send_chunks(), was a "bandaid" detecting the problem
    (st_tpacket.ptr==NULL) and reject the attempt to send (grep for
    "Cannot send packet - a.ptr is NULL" in code and old test results).

With the above two applied, here's what's going wrong (other than that
it's IKEv1 and we're stuffed):

- since the IKEv1 initiator is in STATE_MAIN_I4 the IKE SA has been
established - any message from an earlier part of the exchange should
be detected and dropped

In IKEv2, that's easy as the Message ID is a counter.
What about IKEv1?  During these exchanges the Message ID seems to always be zero.
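
To make the contrast concrete, here is a minimal Python sketch of a
counter-based check of the kind described for IKEv2.  The class and
method names are hypothetical, it is not libreswan code, and it assumes
a window of one outstanding request.

```python
class MessageIdWindow:
    """Hypothetical receiver-side tracker; a window of one is assumed."""

    def __init__(self):
        # Next new Message ID we are willing to process.
        self.next_expected = 0

    def classify(self, msgid):
        """Classify an incoming request by its Message ID."""
        if msgid == self.next_expected:
            self.next_expected += 1
            return "new"
        if msgid == self.next_expected - 1:
            # Duplicate of the most recent request: resend the saved reply.
            return "retransmit"
        if msgid < self.next_expected:
            # From an earlier part of the exchange: drop.
            return "stale"
        # Ahead of the window: drop.
        return "future"


def demo():
    window = MessageIdWindow()
    return [window.classify(m) for m in (0, 1, 1, 0, 2)]
```

When every message carries Message ID zero, this comparison can no
longer tell an old message from a duplicate of the latest one, which
is the difficulty on the IKEv1 side.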

- since the IKEv1 IKE SA is established (almost) all packets should be
encrypted and have integrity, yet this packet fails that, so why on
earth is libreswan sending out a notification?

In IKEv2 this is easy, find the SK payload and decrypt/verify as a single step.
What about IKEv1?  As best I can tell the process is to decrypt the
packet and then parse the resulting white noise looking for a HASH
et al. payload to use as verification - until all that is done nothing
can be trusted and everything should have been dropped.
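
The ordering point can be illustrated with a small Python sketch.  It
is neither IKE's wire format nor libreswan's code: the function names,
the layout (an HMAC-SHA256 tag appended to the message), and the key
handling are all assumptions made for illustration.  It only shows
integrity being checked before any parsing, with a failure leading to
a silent drop rather than a notification.

```python
import hashlib
import hmac

TAG_LEN = 32  # size of an HMAC-SHA256 tag in bytes


def protect(key, body):
    """Append an integrity tag to body (illustrative layout only)."""
    return body + hmac.new(key, body, hashlib.sha256).digest()


def accept(key, packet):
    """Return the body if the tag verifies, otherwise None (silent drop)."""
    if len(packet) < TAG_LEN:
        return None
    body, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        # Unverified input: do not parse it and do not reply to it.
        return None
    # Only now is it reasonable to parse payloads out of body.
    return body
```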

So by pushing 'ikev1 retransmits: only save the received packet when
responding' I exposed the above two failings.  Reverting it wouldn't
be sufficient.  It would likely need some special state magic to
detect if/when that last outgoing packet should be re-transmitted; and
would still leave libreswan exposed to the above.
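
As a rough picture of the bookkeeping involved, here is a simplified
Python sketch.  The names rpacket and tpacket echo .st_rpacket and
st_tpacket from the commit messages, but the class, its methods, and
the logic are hypothetical and much simpler than the real state
machine.

```python
class ExchangeState:
    """Hypothetical stand-in for per-state retransmit bookkeeping."""

    def __init__(self):
        self.rpacket = None  # last received packet that we answered
        self.tpacket = None  # last packet that we transmitted

    def record_response(self, received, reply):
        # We were responding: remember both the request and our reply.
        self.rpacket = received
        self.tpacket = reply

    def record_request(self, sent):
        # We initiated this step, so there is no received packet to save.
        self.rpacket = None
        self.tpacket = sent

    def on_receive(self, packet):
        """Return ('retransmit', reply) for a duplicate, else ('process', None)."""
        if self.rpacket is not None and packet == self.rpacket and self.tpacket:
            return ("retransmit", self.tpacket)
        return ("process", None)
```

In this sketch, once record_request() has cleared rpacket, a duplicate
of an earlier message falls through to normal processing, which is
roughly the situation described in this thread.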

Andrew