[Swan] Trouble with connection dropping
zip
zip at fodvo.org
Mon Jun 9 05:01:12 EEST 2014
On 06/08/2014 07:44 PM, Paul Wouters wrote:
> On Sun, 8 Jun 2014, zip wrote:
>
>> Using libreswan 3.8.1 between two household networks each running
>> Fedora 20
>
> You mean libreswan-3.8-1 ?
>> Left's DSL connection must use PPPOE, so its MTU is 8 bytes less than
>> Right's MTU. In the config below I set the MTU to 1422. (in the old
>> days this MTU problem caused ssh untold grief, and why I stopped
>> using it).
>
> You might need to use TCP clamping:
>
> iptables -I FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu
>
> If that does not help, try hardcoding it yourself:
>
> iptables -I FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1460
>
>> Back to the problem:
>> When I service restart both sides, the VPN starts up fine, both
>> networks can ping / ssh both directions. Then at some random point
>> in time, Right stops routing traffic through the VPN, but rather goes
>> directly out the public interface; so all ping/ssh traffic
>> originating from Right and its network stops. However Left can still
>> ping any host in Right including the firewall. ssh however doesn't
>> work in either direction after the failure.
>
> Could it be that this coincides with your DHCP lease getting renewed
> (even if it is renewed to the same IP address) ?
>
>> Finding log output is difficult. From Left's side, I have
>> /var/log/secure logs but there isn't an immediate entry corresponding
>> to when the VPN drops. The log on Right's side... well for what I
>> think is an unrelated problem, /var/log/secure is empty and I've
>> opened a Fedora bug describing:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1105828
>> so I don't know what's happening on Right's side. (Seems like
>> problems always happen in two's and three's).
>
> You can use plutostderrlog=/var/log/pluto.log if you get tired of all the
> ways rsyslog and systemd interfere with logging....
>
>> ipsec.conf's are below (note for unknown reasons I've had to use
>> slightly different "rightnexthop" statements).
>
> There are some bugs in the nexthop handling we addressed that will be in
> libreswan-3.9. (already committed to git master on github)
>
>> config setup
>> protostack=netkey
>
>> mtu=1422
>
> When the tunnel is up, do you see a route entry with the mtu specified?
>
> I think you might be seeing the DHCP lease issue bug, which has also
> been filed already for rhel by Patrick:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1078593
>
> Paul
Paul,
Thanks for the reply.
Yes I'm using libreswan-3.8-1.fc20.i686 on the box with the problem.
With the tunnel up and working, my route output looks like this (on the right host):
[root@windward ~]# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         66.43.233.126   0.0.0.0         UG    1024   0        0 enp0s4
10.20.0.0       66.43.233.126   255.255.255.0   UG    0      0        0 enp0s4
10.20.1.0       0.0.0.0         255.255.255.0   U     0      0        0 p9p1
10.20.128.0     0.0.0.0         255.255.255.0   U     0      0        0 p9p1
66.43.233.0     0.0.0.0         255.255.255.128 U     0      0        0 enp0s4
dhcp.netins.net 66.43.233.126   255.255.255.255 UGH   1      0        0 enp0s4
But tracepath shows the MTU is correct:
tracepath -n 10.20.0.1
1?: [LOCALHOST] pmtu 1500
1?: [LOCALHOST] pmtu 1422
1: 10.20.0.1 130.652ms reached
1: 10.20.0.1 139.447ms reached
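The route(8) listing has no MTU column in its default output, so it cannot show whether a per-route MTU was installed. As a rough sketch, `ip route show` prints an `mtu N` attribute on routes that carry one, and a small filter can pick those out. The function name `routes_with_mtu` is made up for illustration.

```shell
# Hypothetical helper: keep only route lines that carry an explicit mtu attribute.
# Reads `ip route show` style output on stdin.
routes_with_mtu() {
  grep -E ' mtu (lock )?[0-9]+'
}

# Usage (on the host being checked):
#   ip route show | routes_with_mtu
```

No output from the filter would mean no route has an explicit MTU set.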
As for MSS clamping, I'm using the Shorewall config line:
CLAMPMSS=Yes
which adds this rule:
iptables-save|grep -i clamp
-A FORWARD -p tcp -m tcp --tcp-flags SYN,RST SYN -m policy --dir out --pol none -j TCPMSS --clamp-mss-to-pmtu
At this point, I believe the MTU issue is just a relic of the past.
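For reference, a minimal sketch of the arithmetic behind a hardcoded --set-mss value, assuming IPv4 with no IP or TCP options (20 bytes for each header). The function name `mss_for_mtu` is made up for illustration. Under that assumption the 1460 suggested above corresponds to an MTU of 1500, a PPPoE link 8 bytes smaller gives 1452, and the mtu=1422 from the config gives 1382.

```shell
# Hypothetical helper: MSS = MTU - 20 (IPv4 header) - 20 (TCP header).
# Assumes no IP or TCP options are in use.
mss_for_mtu() {
  echo $(( $1 - 40 ))
}

mss_for_mtu 1500   # plain Ethernet
mss_for_mtu 1492   # PPPoE, 8 bytes less
mss_for_mtu 1422   # the mtu= value from ipsec.conf
```

With --clamp-mss-to-pmtu the kernel derives the value from the path MTU itself, so this only matters when hardcoding.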
I found the logging directive, but the resulting log has nothing
interesting in it at the time the connection drops.
Your point about DHCP could definitely be the problem. The Right
(problem) host gets its address via DHCP, even though the address never
changes. I found the lease file; it has dhcp-lease-time = 21600 (6 hrs).
I signed up to get cc's of that bug report.
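A rough sketch of when renewals would be expected for that lease, assuming the DHCP client uses the RFC 2131 defaults (renew at T1 = half the lease, rebind at T2 = 7/8 of it; a server can override both). The function name `lease_timers` is made up for illustration. The resulting offsets can be compared against the times the tunnel stops routing.

```shell
# Hypothetical helper: print the default RFC 2131 timers for a lease given in seconds.
# Assumes T1 = 0.5 * lease and T2 = 0.875 * lease (server-supplied values may differ).
lease_timers() {
  lease=$1
  echo "lease=$(( lease / 3600 ))h renew_T1=$(( lease / 2 ))s rebind_T2=$(( lease * 7 / 8 ))s"
}

lease_timers 21600
```

For the 21600-second lease this gives a renewal attempt about 3 hours after the lease was obtained.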
Thanks,
Brian