[Swan] DPD settings in config does not trigger updown script on disconnect

Paul Wouters paul at nohats.ca
Wed Jul 14 02:07:13 UTC 2021


On Tue, 13 Jul 2021, Mike Brown wrote:

> Subject: [Swan] DPD settings in config does not trigger updown script on
>     disconnect
> 
> Hello, first time writing the list.  Let me know if this is going to the wrong place.

This is exactly the right place :)

> My overall goal is for a peer running a pair HA tunnels to terminate at my libreswan node (so my node has 2
> tunnels using the same right/left-subnets in their .conf, but different marks).  My local "switching" to implement
> the HA behavior is an updown script - on up/route it writes iptables connmark rules to send packets to the mark of
> the tunnel referenced in the updown invocation.  (I have no preferred "primary".)  On down/unroute it removes
> rules for the tunnel referenced in the invocation, leaving rules to the remaining tunnel mark only.

Note that a better, more modern design would be to use XFRM
interfaces. You can then create an ipsec1 and an ipsec2 interface for
your tunnels and handle everything with regular routing. Whichever
interface you route into, the traffic gets encrypted for that
endpoint.

But, your solution _should_ work too!
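
For illustration only, here is a minimal sketch of the kind of decision
such an updown script makes. It assumes the verb and connection name
arrive in the PLUTO_VERB and PLUTO_CONNECTION environment variables.
The connection name "p2_to_n", the mark values, and the choice of the
mangle/PREROUTING chain are hypothetical placeholders, not taken from
the original setup. The sketch only prints the iptables command it
would run.

```python
import os

# Hypothetical mapping from connection name to firewall mark; the real
# names and marks come from your own .conf files.
TUNNEL_MARKS = {"p1_to_n": "0x1", "p2_to_n": "0x2"}


def rule_args(verb, conn):
    """Return the iptables argument list for one updown call, or None."""
    mark = TUNNEL_MARKS.get(conn)
    if mark is None:
        return None  # not one of the HA tunnels
    if verb in ("up-client", "route-client"):
        action = "-A"  # add the rule for the tunnel that came up
    elif verb in ("down-client", "unroute-client"):
        action = "-D"  # delete the rule for the tunnel that went down
    else:
        return None  # other verbs need no rule change in this sketch
    # Table and chain are assumptions made for this example.
    return ["iptables", "-t", "mangle", action, "PREROUTING",
            "-j", "CONNMARK", "--set-mark", mark]


if __name__ == "__main__":
    args = rule_args(os.environ.get("PLUTO_VERB", ""),
                     os.environ.get("PLUTO_CONNECTION", ""))
    print(" ".join(args) if args else "nothing to do")
```

If DPD tears the tunnel down as expected, the script would be invoked
with a down verb and take the "-D" branch above.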

> Libreswans.)  All router boxes are Libreswan 3.23, Ubuntu 18.04.5 LTS, running in the AWS cloud free-tier.

3.23 is very very old - January 2018. That is from before Tesla launched
a car into space. We have done 14 releases since that version. I
recommend grabbing the 4.4 tar.gz and running "make deb". You might need
to tweak some settings in mk/config.mk to disable some things due to
older versions of nss/unbound libraries.

> HA switching behavior works as intended if I issue "ipsec auto --delete p1_to_n" while on P1.

> But, where I am having trouble is when I try to make this more realistic by suddenly blocking traffic instead of
> issuing a --delete.  My expectation for this scenario was that DPD would detect the disconnect, down the tunnel
> (as suggested in libreswan DPD code tests) and call my updown script; but that has not been the case.

Yes, it should do that.

>  I see NAT-T
> packets go out, but not DPD and lastdpd=-1 never changes.  If I disable NAT-T (which may cause me other problems
> with AWS public addressing) I do see an R_U_THERE and _ACK, but only once.

That is actually a bug that was fixed :)

NAT-T keepalives are only needed when libreswan is behind a NAT. It
sends a 1 byte packet every 20s, which the other end's kernel eats up,
just to keep the NAT mapping open on the NAT routers in the path. When
there is no NAT, these packets are not sent, but older versions send one
of them by mistake.
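
As a rough sketch of what such a keepalive looks like on the wire: RFC
3948 defines the NAT-keepalive as a UDP datagram carrying a single 0xFF
octet. The helper name below is made up for this example, and the demo
sends over loopback to an arbitrary port. A real keepalive would go
from the IKE NAT-T socket (UDP 4500) to the peer, and pluto does this
for you.

```python
import socket

NAT_KEEPALIVE = b"\xff"  # single 0xFF octet, as defined in RFC 3948


def send_keepalive(sock, peer):
    """Send one NAT-keepalive datagram to peer, a (host, port) tuple."""
    return sock.sendto(NAT_KEEPALIVE, peer)


if __name__ == "__main__":
    # Loopback demo: the receiver stands in for the remote peer.
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind(("127.0.0.1", 0))
    rx.settimeout(2)
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    send_keepalive(tx, rx.getsockname())
    payload, _ = rx.recvfrom(16)
    print(len(payload), "byte keepalive received")
    tx.close()
    rx.close()
```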

But DPD should really kick in.

>  After the first NAT-T disabled DPD
> exchange, I see "DPD: no need to send or schedule DPD for replaced IPsec SA" repeatedly (every 30 seconds,
> matching my dpdtimeout) but I never see another DPD exchange.

Seems like a bug. What this message means is that you have two IPsec SAs
for the same connection. This happens when your connection rekeys. The
older, replaced SA lingers for a bit to ensure overlap between the old
and new tunnel. If the old one triggers a DPD event, the event is
ignored, since that SA is expected to be "idle" because it is not the
newest active tunnel. In your case, it seems there is a false positive
for this check.
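
To make the check described above concrete, here is a toy model. It is
not pluto's code: the class, the field names, and the idea of a
per-connection "newest serial" table are assumptions made purely for
illustration. It also shows one conceivable way a false positive could
arise, which is a guess and not a diagnosis.

```python
from dataclasses import dataclass


@dataclass
class IpsecSA:
    serial: int  # hypothetical increasing SA number
    conn: str    # connection name this SA belongs to


def should_probe(sa, newest_serial):
    """Only the newest IPsec SA of a connection gets DPD probes."""
    return sa.serial == newest_serial[sa.conn]


if __name__ == "__main__":
    old, new = IpsecSA(1, "p1_to_n"), IpsecSA(2, "p1_to_n")

    # Normal rekey: the replaced SA is skipped, the new one is probed.
    newest = {"p1_to_n": 2}
    print(should_probe(old, newest), should_probe(new, newest))

    # Conceivable false positive: the table points at a serial that no
    # live SA has, so the only remaining SA is treated as "replaced".
    stale = {"p1_to_n": 3}
    print(should_probe(new, stale))
```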

> I've done quite a bit of diving to see what's happening and am happy to drop both my digest and/or raw-logs here,
> but as a new user I first wanted to check if I'm just missing something entirely before I completely word vomit on
> the mail list.

Please try a newer version first. If 4.4 fails for you too, we are happy
to help you investigate and fix your issue and make the world a better
place. But we don't really want to waste our time on looking at very
old versions of our code - there is not much gain to the world at large
with that.

Cheers,

Paul

