[Swan] Problem with SAupdate when SA does not exist in the kernel

Philippe Vouters philippe.vouters at laposte.net
Thu Sep 12 15:23:50 EEST 2013


Dear Mattias,

For your information, the closest TCP/IP counterpart to the IPsec 
dpdtimeout parameter is the KEEPALIVE parameter, which can be set both 
by the system administrator and from within software.

If the IPsec implementations were based on TCP instead of UDP, there 
would be no need for the dpd and dpdtimeout parameters.

I checked my existing C code, and there I only ever set SO_KEEPALIVE as 
a plain on/off socket option, which leaves the KEEPALIVE timers at their 
system-wide defaults. Adjusting the timers from within the code requires 
the Linux-specific TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT socket 
options.

A quick check on my Linux shows the system-wide defaults, which only a 
system administrator can change:
[philippe@victor ~]$ sysctl -a | grep keepalive
net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_time = 7200

I hope this answer to your concern helps you better understand my 
earlier responses.

Philippe Vouters (Fontainebleau/France)
URL: http://vouters.dyndns.org/
SIP: sip:Vouters at sip.linphone.org

On 09/12/2013 01:53 PM, Philippe Vouters wrote:
> Mattias,
>
> If you follow my dpdtimeout suggestion, bear in mind that changing 
> this parameter from its default is a tradeoff. The shorter you set 
> it, the more CPU time and network traffic are spent on Dead Peer 
> Detection, at the expense of productive data handled by the CPU and 
> the network. In summary, you will request more work from Libreswan. 
> On the other hand, dead peers will be detected more quickly.
>
> Philippe Vouters (Fontainebleau/France)
> URL: http://vouters.dyndns.org/
> SIP: sip:Vouters at sip.linphone.org
>
> On 09/12/2013 01:23 PM, Mattias Walström wrote:
>> The reason why I used -9 is that it is very similar to a power loss 
>> on the responder; in that case the result will be the same. This is 
>> my main problem: any unclean exit on the responder causes the 
>> initiator to behave strangely.
>>
>> Mattias
>>
>> On 09/12/2013 01:08 PM, Philippe Vouters wrote:
>>> Dear Mattias,
>>>
>>> I give my first answer without knowing whether it is fully 
>>> applicable to your case. What I can say from my Unix/Linux 
>>> experience is that kill -9 (alias SIGKILL) is a brutal, untrappable 
>>> signal to use only when everything else has failed. SIGKILL never 
>>> causes a clean exit. I have not yet checked the Libreswan source to 
>>> best advise you, but you may rather try SIGTERM (kill -TERM) or 
>>> SIGQUIT (kill -QUIT), which most Unix/Linux software traps in order 
>>> to exit cleanly.
>>>
>>> Philippe Vouters (Fontainebleau/France)
>>> URL: http://vouters.dyndns.org/
>>> SIP: sip:Vouters at sip.linphone.org
>>>
>>> On 09/12/2013 12:18 PM, Mattias Walström wrote:
>>>> Hi!
>>>> I have discovered a problem with a non-clean restart of the 
>>>> responder. I have 14 tunnels configured between one initiator and 
>>>> one responder. When I do a "killall -9 pluto" on the responder, it 
>>>> forces pluto to exit without closing the connections. When pluto 
>>>> starts again, I get an error on the initiator for some of the 
>>>> tunnels (one to three tunnels will not come back up at all):
>>>>
>>>> Jan  5 16:46:50 i pluto[2593]: | setup_half_ipsec_sa() hit fail:
>>>> Jan  5 16:46:50 i pluto[2593]: "ipsec7" #23: ERROR: netlink response for Add SA esp.17d54247@198.18.106.2 included errno 3: No such process
>>>>
>>>> To work around this, I have made sure that the update does not 
>>>> fail even if there was a problem adding the SA: when the update 
>>>> returns ESRCH, the SA is added instead. I am unsure whether this 
>>>> is a proper solution.
>>>>
>>>> I have seen the same problem with both libreswan 3.5 and openswan 
>>>> 2.6.38, but I have only tested the patch against openswan.
>>>>
>>>> Regards
>>>> Mattias
>>>>
>>>> Index: openswan-2.6.38/programs/pluto/kernel_netlink.c
>>>> ===================================================================
>>>> --- openswan-2.6.38.orig/programs/pluto/kernel_netlink.c 2013-09-12 11:35:45.853061103 +0200
>>>> +++ openswan-2.6.38/programs/pluto/kernel_netlink.c 2013-09-12 12:09:50.948600196 +0200
>>>> @@ -393,6 +393,7 @@
>>>>          , description, text_said
>>>>          , -rsp.e.error
>>>>          , strerror(-rsp.e.error));
>>>> +    errno = -rsp.e.error;
>>>>      return FALSE;
>>>>      }
>>>>
>>>> @@ -794,6 +795,7 @@
>>>>      } req;
>>>>      struct rtattr *attr;
>>>>      struct aead_alg *aead;
>>>> +    int ret;
>>>>
>>>>      memset(&req, 0, sizeof(req));
>>>>      req.n.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
>>>> @@ -990,8 +992,11 @@
>>>>          attr = (struct rtattr *)((char *)attr + attr->rta_len);
>>>>     }
>>>>  #endif
>>>> +    ret = send_netlink_msg(&req.n, NULL, 0, "Add SA", sa->text_said);
>>>> +    if (ret == FALSE && errno == ESRCH && req.n.nlmsg_type == XFRM_MSG_UPDSA)
>>>> +        return netlink_add_sa(sa, 0);
>>>>
>>>> -    return send_netlink_msg(&req.n, NULL, 0, "Add SA", sa->text_said);
>>>> +    return ret;
>>>>  }
>>>>
>>>>  /** netlink_del_sa - Delete an SA from the Kernel
>>>>
>>>> _______________________________________________
>>>> Swan mailing list
>>>> Swan at lists.libreswan.org
>>>> https://lists.libreswan.org/mailman/listinfo/swan
>>>>
>>>
>>
>>
>


