[Swan] Frequent dropped connections and martian source

Paul Wouters paul at nohats.ca
Wed May 22 00:43:59 UTC 2019


On Tue, 21 May 2019, Alex wrote:

> I have libreswan-3.27 on fedora29 on both ends with 5.0.10 that's been
> running fine for a while. Over the last few days, the connection on
> the local side has inexplicably disconnected from one of its two
> net-to-net peers.
>
> Just running "ipsec auto --up <tunnel-name>" on the local side usually
> brings it up again. The remote side typically doesn't acknowledge that
> the connection was lost, as it reports all tunnels are up. This has
> happened about three times per day for the past week or so. I can't
> think of anything that's changed with the system, and nothing has
> changed with the configuration.
>
> This time it didn't bring the connection up. This is reported in pluto.log:
>
> May 21 20:14:21.606083: "orion-cyclops/1x1" #2019: initiate rekey of
> IKEv2 CREATE_CHILD_SA IKE Rekey
> May 21 20:14:21.607453: "orion-cyclops/1x1" #2028: message id
> deadlock? wait sending, add to send next list using parent #2019
> unacknowledged 2 next message id=2 ike exchange window 1
> May 21 20:17:41.608603: "orion-cyclops/1x1" #2028: deleting state
> (STATE_V2_REKEY_IKE_I0) and NOT sending notification

Sorry, I hate to do this, but please try 3.28, which we released
yesterday. Sometimes a rekeyed Child SA would not end up in the proper
parent hash bucket, and the link between parent and child was lost.

You might also be seeing something else: a DPD/liveness probe has not
been answered, but the tear-down period has not been reached yet. In the
meantime we need to rekey, but we cannot, because we are still waiting
on the msgid reply of the DPD/liveness probe. That issue is still there,
but once this end brings down the tunnel, it will now bring it up again
more aggressively, since it knows the tunnel should be up. That fix was
released as part of v3.28.
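
To make the msgid stall above concrete, here is a toy model in Python.
It is not libreswan code: the class and method names are made up for
illustration, and the only thing taken from your log is the "ike
exchange window 1". With a window of 1, a request that is never answered
(here the liveness probe) keeps every later exchange, including the
rekey, parked on the "send next" list.

```python
from collections import deque


class IkeSa:
    """Toy IKEv2 initiator with a fixed request window (hypothetical names)."""

    def __init__(self, window=1):
        self.window = window
        self.next_msgid = 0
        self.outstanding = set()  # message IDs sent but not yet answered
        self.send_next = deque()  # exchanges waiting for a free window slot

    def request(self, name):
        """Send an exchange, or queue it if the window is already full."""
        if len(self.outstanding) >= self.window:
            self.send_next.append(name)
            return None
        msgid = self.next_msgid
        self.next_msgid += 1
        self.outstanding.add(msgid)
        return msgid

    def response(self, msgid):
        """A reply frees a window slot and releases the next queued exchange."""
        self.outstanding.discard(msgid)
        if self.send_next:
            return self.request(self.send_next.popleft())
        return None


sa = IkeSa(window=1)
probe = sa.request("liveness probe")   # sent, occupies the only slot
rekey = sa.request("IKE rekey")        # cannot be sent, gets queued
stalled = list(sa.send_next)           # what is stuck behind the probe
sent_after_reply = sa.response(probe)  # the reply finally releases the rekey
```

As long as response() is never called for the probe, the rekey stays
queued, which is the situation described above.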

> I've also noticed martian source messages in the logs, but I don't
> know if that's what's causing it, or that's the consequence of the
> disconnected endpoint. The 192.168.1.0/24 is our local internal
> network that's sometimes used to connect to networks behind the remote
> network. I don't know where the 192.168.49.1 is coming from, as that's
> not an IP or network we use.
>
> [1376538.238061] IPv4: martian source 192.168.1.35 from 192.168.49.1,
> on dev eth1
> [1376538.238075] ll header: 00000000: ff ff ff ff ff ff 0c 47 c9 7b 4e b2 08 06
> [1380207.332144] IPv4: martian source 192.168.1.105 from 192.168.49.1,
> on dev eth1
> [1380207.332159] ll header: 00000000: ff ff ff ff ff ff 0c 47 c9 7b 4e b2 08 06
> [1393701.446458] IPv4: martian source 192.168.1.35 from 192.168.49.1,
> on dev eth1

That might just be broadcast cruft from other cable modem users who
happen to use an IP range that you are also using on a non-cable
modem interface. I would expect this to be unrelated to you or IKE/IPsec.
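
The "ll header" lines in your log can be checked against that guess. The
sketch below is plain Python with a made-up helper name
(decode_ll_header) and assumes the 14 bytes shown are a standard
Ethernet header: destination MAC, source MAC, EtherType.

```python
# Small lookup of common EtherType values.
ETHERTYPES = {0x0800: "IPv4", 0x0806: "ARP", 0x86DD: "IPv6"}


def decode_ll_header(hex_bytes):
    """Split a 14-byte Ethernet header into (dst MAC, src MAC, EtherType name)."""
    raw = bytes.fromhex(hex_bytes)
    if len(raw) < 14:
        raise ValueError("need at least 14 bytes of Ethernet header")
    dst = ":".join(f"{b:02x}" for b in raw[0:6])
    src = ":".join(f"{b:02x}" for b in raw[6:12])
    ethertype = int.from_bytes(raw[12:14], "big")
    return dst, src, ETHERTYPES.get(ethertype, hex(ethertype))


# Header bytes copied from the martian source log lines quoted above.
dst, src, proto = decode_ll_header("ff ff ff ff ff ff 0c 47 c9 7b 4e b2 08 06")
print(dst, src, proto)
```

The destination is the Ethernet broadcast address and EtherType 0x0806
is ARP, so these are broadcast ARP frames from the host with MAC
0c:47:c9:7b:4e:b2 arriving on eth1.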

Paul

