[Swan] help needed with Libreswan (libreswan-3.15-5.3.el6.x86_64) and with libreswan-3.17-1.el6.x86_64 which went into a "stuck" or failed? state on 2.6.32-573.18.1.el6.x86_64 RHEL6

Li, Mike Mike.Li at finra.org
Wed Jul 13 19:35:32 UTC 2016


Hi Paul,
Libreswan 3.17 is running on RHEL6.
Today I got the following: Jul 13 05:47:06 server1 ipsec__plutorun: !pluto failure!:  exited with error status 139 (signal 11)
I now have a situation where sudo /etc/init.d/ipsec status shows:
Pluto (pid 18393) is running... but it does not display the count of active channels.
I ran sudo strace -v -p 18393 and got the following (I had to stop it with Ctrl+C):
sudo strace -v -p 18393
Process 18393 attached
write(153, "P\0\0\0\24\0\5\0\217Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\217Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\217Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
write(153, "P\0\0\0\24\0\5\0\220Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\220Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\220Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
write(153, "P\0\0\0\24\0\5\0\221Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\221Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\221Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
write(153, "P\0\0\0\24\0\5\0\222Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\222Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\222Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
write(153, "P\0\0\0\24\0\5\0\223Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\223Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\223Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
write(153, "P\0\0\0\24\0\5\0\224Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\224Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\224Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
write(153, "P\0\0\0\24\0\5\0\225Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\225Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\225Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
write(153, "P\0\0\0\24\0\5\0\226Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\226Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\226Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
write(153, "P\0\0\0\24\0\5\0\227Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\227Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\227Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
write(153, "P\0\0\0\24\0\5\0\230Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\230Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\230Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
write(153, "P\0\0\0\24\0\5\0\231Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\231Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\231Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
write(153, "P\0\0\0\24\0\5\0\232Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\232Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\232Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
write(153, "P\0\0\0\24\0\5\0\233Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\233Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\233Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
write(153, "P\0\0\0\24\0\5\0\234Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\234Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\234Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
write(153, "P\0\0\0\24\0\5\0\235Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\235Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\235Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
write(153, "P\0\0\0\24\0\5\0\236Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\236Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\236Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
write(153, "P\0\0\0\24\0\5\0\237Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\237Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\237Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
write(153, "P\0\0\0\24\0\5\0\240Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\240Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\240Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
write(153, "P\0\0\0\24\0\5\0\241Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\241Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\241Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
write(153, "P\0\0\0\24\0\5\0\242Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\242Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\242Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
write(153, "P\0\0\0\24\0\5\0\243Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\243Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\243Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
write(153, "P\0\0\0\24\0\5\0\244Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\244Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\244Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
write(153, "P\0\0\0\24\0\5\0\245Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\245Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\245Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
write(153, "P\0\0\0\24\0\5\0\246Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\246Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\246Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
write(153, "P\0\0\0\24\0\5\0\247Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\247Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\247Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
write(153, "P\0\0\0\24\0\5\0\250Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\250Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\250Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
write(153, "P\0\0\0\24\0\5\0\251Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
recvfrom(153, "d\0\0\0\2\0\0\0\251Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\251Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
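
For reference, the leading bytes of each write()/recvfrom() pair above can be read as netlink message headers. The following is a minimal Python sketch, not part of the original report. It assumes little-endian byte order (x86_64) and the standard 16-byte struct nlmsghdr layout (len, type, flags, seq, pid); the helper names parse_header and parse_error are made up for illustration.

```python
import errno
import struct

# First 20 bytes of one write()/recvfrom() pair, copied from the strace
# output. strace prints non-printable bytes as octal escapes, which
# Python bytes literals also accept.
REQUEST = b"P\0\0\0\24\0\5\0\217Dv?\0\0\0\0\n\7AD"
REPLY = b"d\0\0\0\2\0\0\0\217Dv?}\215\377\377\376\377\377\377"

NLMSG_ERROR = 2  # message type value from linux/netlink.h
# struct nlmsghdr: u32 len, u16 type, u16 flags, u32 seq, u32 pid.
# Little-endian is an assumption based on the x86_64 host.
NLMSGHDR = struct.Struct("<IHHII")


def parse_header(data):
    """Decode the netlink header at the start of data (hypothetical helper)."""
    length, msg_type, flags, seq, pid = NLMSGHDR.unpack_from(data, 0)
    return {"len": length, "type": msg_type, "flags": flags, "seq": seq, "pid": pid}


def parse_error(data):
    """Return the errno name carried by an NLMSG_ERROR reply, or None."""
    if parse_header(data)["type"] != NLMSG_ERROR:
        return None
    # The signed 32-bit error code follows directly after the header.
    (code,) = struct.unpack_from("<i", data, NLMSGHDR.size)
    if code == 0:
        return "ACK"
    return errno.errorcode.get(-code, str(code))


req = parse_header(REQUEST)
rep = parse_header(REPLY)
print("request:", req)
print("reply:  ", rep, "error:", parse_error(REPLY))
```

Under these assumptions the request decodes to len=80, type=0x14, flags=5, and the reply is an NLMSG_ERROR carrying -2 (ENOENT) with the same sequence number. If fd 153 is a NETLINK_XFRM socket, type 0x14 corresponds to XFRM_MSG_DELPOLICY in linux/xfrm.h, and the bytes \n\7AD at the start of the payload would be the IPv4 address 10.7.65.68. One possible reading is that pluto keeps asking the kernel to delete a policy that is not there, but that is an interpretation of the trace, not a confirmed diagnosis.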

sudo ls -lt /var/run/pluto/ (did not see any dump file)
total 4
srwx------ 1 root root 0 Jul 13 05:47 pluto.ctl
-r--r--r-- 1 root root 6 Jul 13 05:47 pluto.pid
[lim2 at server1 ~]$ ps -ef|grep pluto
root      18386      1  0 05:47 ?        00:00:00 /bin/sh /usr/libexec/ipsec/_plutorun --config /etc/ipsec.conf --nofork
root      18393  18386 36 05:47 ?        03:33:52 /usr/libexec/ipsec/pluto --config /etc/ipsec.conf --nofork

Please advise.
Thanks
Mike
-----Original Message-----
From: Paul Wouters [mailto:paul at nohats.ca] 
Sent: Tuesday, July 12, 2016 2:29 PM
To: Li, Mike
Cc: swan at lists.libreswan.org
Subject: RE: [Swan] help needed with Libreswan (libreswan-3.15-5.3.el6.x86_64) and with libreswan-3.17-1.el6.x86_64 which went into a "stuck" or failed? state on 2.6.32-573.18.1.el6.x86_64 RHEL6

On Tue, 12 Jul 2016, Li, Mike wrote:

> Had to force kill the processes yesterday and restart again to restore service.
> I've been using Openswan (openswan-2.6.32-9.el5) on RHEL5 for a few 
> years. Initially worked with Matt R. from RH to use following config 
> to connect Windows 2012 ipsec

Perhaps upgrade that machine to rhel6 or rhel7 with libreswan? Openswan has been obsoleted for RHEL6 (and was never in RHEL7)

> Issue is with the randomness of the pluto crashing issue happening. It happened on 2 servers. Same unresponsive pluto process.
> Server 1: around "Jul 10 03:25:41" while doing following "max number 
> of retransmissions (8) reached STATE_QUICK_I1.  No acceptable response to our first Quick Mode message: perhaps peer likes no proposal".
> Server2: I see 24 "ipsec__plutorun: !pluto failure!:  exited with error 
> status 139 (signal 11)" from Jul 3 - Jul 8. Will those 2 situations cause the pluto process to stop responding?

So it looks like server2's pluto crashed. There can be some log lines, but not necessarily. You can enable dumpdir=/var/tmp/ and see if you get a core dump in that directory which you can debug with gdb. But you might just want to try upgrading first.
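
As a side note on why "error status 139" points to a crash: by shell convention, an exit status above 128 means the process was terminated by signal number (status - 128). A minimal Python sketch of that arithmetic follows; the function name describe_exit_status is made up for illustration, and the signal name lookup assumes a Linux host.

```python
import signal


def describe_exit_status(status):
    """Interpret a shell-style exit status (hypothetical helper).

    Shells report 128 + N when a child was killed by signal N.
    """
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = "unknown"
        return f"killed by signal {signum} ({name})"
    return f"exited normally with status {status}"


print(describe_exit_status(139))
```

For status 139 this gives signal 11, which is SIGSEGV on Linux, matching the "(signal 11)" already shown in the log line.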

> Could I use plutodebug=all to turn on debug? That will generate large 
> amount of logging

That might help a bit to determine what exactly happened just before the crash, if this is not a known bug that's been fixed.

Paul


