[Swan] Frequent Crashing with libreswan 3.10/pluto

Reuben Farrelly reuben-libreswan at reub.net
Sun Oct 12 23:44:02 EEST 2014


This is with PSK's.

I'm using Cisco router IOS, not Apple iPhone IOS...

Reuben


On 13/10/2014 1:58 AM, Paul wrote:
> Is this using psk or certs? With certificates you should not run into
> this as the uniqueids should cause the old connection to get
> replaced.
>
> Hopefully these kind of switching bugs will vanish with ios8 that
> sets up two vpn tunnels, one for wifi and one for 4g
>
> Sent from my iPhone
>
>> On Oct 12, 2014, at 01:42, Reuben Farrelly
>> <reuben-libreswan at reub.net> wrote:
>>
>> Just following up on this thread that I created from earlier in the
>> week...
>>
>>> On 7/10/2014 12:07 PM, Paul Wouters wrote:
>>>> On Tue, 7 Oct 2014, Reuben Farrelly wrote: Frequently the
>>>> libreswan end seems to just die.  Pluto crashes out entirely
>>>> and the VPN goes down.
>>>>
>>>> At the time of this the following is logged in the kernel log:
>>>>
>>>> Oct  6 14:52:06 lightning kernel: pluto[23223]: segfault at 58
>>>> ip 00007f8f85f0c8d0 sp 00007fffb30275b8 error 4 in
>>>> libnss3.so[7f8f85ebc000+11f000]
>>>>
>>>> And in the auth.log the lines preceding this are:
>>>>
>>>> Oct  6 14:50:12 lightning pluto[23223]: | V2 microcode entry
>>>> (R2: process INFORMATIONAL) has unspecified timeout_event
>>>
>>> It would be useful to have the full debug log for that with more
>>> history.
>>
>> I ended up enabling lots more logging and upgrading to -git soon
>> after, and didn't have this problem reoccur after that, so it's
>> probable that this problem will be fixed in the next release.
>>
>> I have however had one other problem show up.  Yesterday I also
>> changed over the connection from a static public IP address to
>> another telco provider with a dynamic (private) IP which is NATted
>> to the Internet. Gotta love Carrier Grade NAT :-)
>>
>> Periodically if the 4G link goes down and back up again and/or the
>> connection is seen from another public IP and/or the spoke router
>> is restarted, the IPsec refuses to complete re-negotiation.  This
>> is logged in the pluto log:
>>
>> | hmac_update data value: | | hmac_update data value: |   04 |
>> hmac_update: inside if | hmac_update: after digest | hmac_update:
>> after assert | prf+[4]:  dc ad f9 1e  6f 6c cf 47  b6 82 a9 07  ea
>> 52 c5 fd | prf+[4]:  2a 64 19 dc | our  keymat  c3 e3 1d 09  97 fa
>> 5c b7  cc d1 06 ea  77 f9 68 fa | our  keymat  51 62 91 34  2e 89
>> 6e c1  0b 2c dd 55  26 07 67 7f | our  keymat  d2 36 6a 07 | peer
>> keymat  7c 69 f8 6b  22 18 43 a7  93 e8 af 2e  26 6b df d2 | peer
>> keymat  d7 c7 04 f2  50 96 79 f3  dc ad f9 1e  6f 6c cf 47 | peer
>> keymat  b6 82 a9 07 | install_ipsec_sa() for #541: inbound and
>> outbound | route owner of "reub.net"[32] 120.158.138.197 unrouted:
>> "reub.net"[30] 110.151.236.171 prospective erouted; eroute owner:
>> "reub.net"[30] 110.151.236.171 prospective erouted | could_route
>> called for reub.net (kind=CK_INSTANCE) "reub.net"[32]
>> 120.158.138.197 #540: cannot route -- route already in use for
>> "reub.net" | ikev2_child_sa_respond returned STF_FATAL |
>> ikev2_parent_inI2outR2_tail returned STF_FATAL | complete v2 state
>> transition with STF_FATAL | deleting event for #541 | deleting
>> state #541 | deleting event for #541 | ICOOKIE:  b4 66 69 60  d5 a6
>> 39 d4 | RCOOKIE:  53 4e 42 cc  3a c2 17 cc | state hash entry 15 |
>> * processed 1 messages from cryptographic helpers | next event
>> EVENT_v2_RETRANSMIT in 21 seconds for #510 | next event
>> EVENT_v2_RETRANSMIT in 21 seconds for #510
>>
>> The system will stay in this sort of stuck state for some time,
>> continually trying to re-establish but failing to complete.  If I
>> restart the ipsec tools it then successfully connects.
>>
>> The routing table on the remote libreswan box shows a route for
>> 192.168.6.2 still present even after the external IP has changed
>> (both before and after the connection re-establishes).  192.168.6.1
>> is the libreswan/hub end and 192.168.6.2 is the Cisco IOS client
>> end.
>>
>> The ipsec status output on the libreswan box also shows lots of log
>> entries:
>>
>> 000 #667: "reub.net"[32] 120.158.138.197:4500 STATE_PARENT_R2
>> (received v2I2, PARENT SA established); EVENT_SA_REPLACE in 58s;
>> isakmp#0; idle; import:respond to stranger 000 #667: "reub.net"[32]
>> 120.158.138.197 ref=0 refhim=0 Traffic: 000 #659: "reub.net"[32]
>> 120.158.138.197:4500 STATE_PARENT_R2 (received v2I2, PARENT SA
>> established); EVENT_SA_EXPIRE in 207s; isakmp#0; idle;
>> import:respond to stranger 000 #659: "reub.net"[32] 120.158.138.197
>> ref=0 refhim=0 Traffic: 000 #647: "reub.net"[32]
>> 120.158.138.197:4500 STATE_PARENT_R2 (received v2I2, PARENT SA
>> established); EVENT_SA_EXPIRE in 56s; isakmp#0; idle;
>> import:respond to stranger 000 #647: "reub.net"[32] 120.158.138.197
>> ref=0 refhim=0 Traffic:
>>
>> I would have thought DPD would have taken care of this by aging out
>> the old (now defunct) connection which has presumably not seen or
>> heard any more matching packets after the event, but it doesn't
>> seem to do that.
>>
>> I can reproduce this on demand.  Currently my pluto.log is 93M so I
>> can post more output on either side of this event if required...
>>
>> Reuben
>>
>>



More information about the Swan mailing list