[Swan] help needed with Libreswan (libreswan-3.15-5.3.el6.x86_64) and with libreswan-3.17-1.el6.x86_64 which went into a "stuck" or failed? state on 2.6.32-573.18.1.el6.x86_64 RHEL6

Li, Mike Mike.Li at finra.org
Wed Jul 13 20:22:40 UTC 2016


Paul,
How do I get the 3.18dr3 prerelease for RHEL6? I don't see it at https://download.libreswan.org/?C=M;O=A
Thanks.

-----Original Message-----
From: Paul Wouters [mailto:paul at nohats.ca] 
Sent: Wednesday, July 13, 2016 4:14 PM
To: Li, Mike
Cc: swan at lists.libreswan.org
Subject: Re: [Swan] help needed with Libreswan (libreswan-3.15-5.3.el6.x86_64) and with libreswan-3.17-1.el6.x86_64 which went into a "stuck" or failed? state on 2.6.32-573.18.1.el6.x86_64 RHEL6

I think that was fixed for 3.18, which will be released Monday. Can you try the 3.18dr3 prerelease?

Sent from my iPhone

> On Jul 13, 2016, at 9:35 PM, Li, Mike <Mike.Li at finra.org> wrote:
> 
> Hi Paul,
> Libreswan 3.17 is running on RHEL6.
> Today I got the following:
> Jul 13 05:47:06 server1 ipsec__plutorun: !pluto failure!:  exited with error status 139 (signal 11)
> I have a situation where sudo /etc/init.d/ipsec status shows:
> Pluto (pid 18393) is running...
> but does not display the count of active channels. I ran sudo strace -v -p 18393 and got the following (had to do a control+c):
> sudo strace -v -p 18393
> Process 18393 attached
> write(153, "P\0\0\0\24\0\5\0\217Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\217Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\217Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
> write(153, "P\0\0\0\24\0\5\0\220Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\220Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\220Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
> write(153, "P\0\0\0\24\0\5\0\221Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\221Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\221Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
> write(153, "P\0\0\0\24\0\5\0\222Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\222Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\222Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
> write(153, "P\0\0\0\24\0\5\0\223Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\223Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\223Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
> write(153, "P\0\0\0\24\0\5\0\224Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\224Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\224Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
> write(153, "P\0\0\0\24\0\5\0\225Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\225Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\225Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
> write(153, "P\0\0\0\24\0\5\0\226Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\226Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\226Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
> write(153, "P\0\0\0\24\0\5\0\227Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\227Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\227Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
> write(153, "P\0\0\0\24\0\5\0\230Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\230Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\230Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
> write(153, "P\0\0\0\24\0\5\0\231Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\231Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\231Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
> write(153, "P\0\0\0\24\0\5\0\232Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\232Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\232Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
> write(153, "P\0\0\0\24\0\5\0\233Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\233Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\233Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
> write(153, "P\0\0\0\24\0\5\0\234Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\234Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\234Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
> write(153, "P\0\0\0\24\0\5\0\235Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\235Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\235Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
> write(153, "P\0\0\0\24\0\5\0\236Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\236Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\236Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
> write(153, "P\0\0\0\24\0\5\0\237Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\237Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\237Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
> write(153, "P\0\0\0\24\0\5\0\240Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\240Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\240Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
> write(153, "P\0\0\0\24\0\5\0\241Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\241Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\241Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
> write(153, "P\0\0\0\24\0\5\0\242Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\242Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\242Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
> write(153, "P\0\0\0\24\0\5\0\243Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\243Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\243Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
> write(153, "P\0\0\0\24\0\5\0\244Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\244Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\244Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
> write(153, "P\0\0\0\24\0\5\0\245Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\245Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\245Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
> write(153, "P\0\0\0\24\0\5\0\246Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\246Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\246Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
> write(153, "P\0\0\0\24\0\5\0\247Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\247Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\247Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
> write(153, "P\0\0\0\24\0\5\0\250Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\250Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\250Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
> write(153, "P\0\0\0\24\0\5\0\251Dv?\0\0\0\0\n\7AD\0\0\0\0\0\0\0\0\0\0\0\0"..., 80) = 80
> recvfrom(153, "d\0\0\0\2\0\0\0\251Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\251Dv?"..., 8228, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 100
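
The write/recvfrom pairs in the strace output above can be decoded by hand, since each buffer starts with a netlink message header (struct nlmsghdr: 32-bit length, 16-bit type, 16-bit flags, 32-bit sequence number, 32-bit pid). The following is a minimal Python sketch, not part of the original report, that unpacks the first request and its reply. The helper names strace_bytes and nlmsghdr are made up for illustration, and little-endian byte order is assumed (the host is x86_64).

```python
import errno
import re
import struct

NLMSG_ERROR = 2  # message type value from <linux/netlink.h>

# Single-character C escapes that strace may print inside a string.
_SIMPLE = {"n": 10, "t": 9, "r": 13, "v": 11, "f": 12, "\\": 92, '"': 34}


def strace_bytes(text):
    """Turn a strace-escaped string (C escapes, octal bytes) into bytes."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] != "\\":
            out.append(ord(text[i]))
            i += 1
            continue
        octal = re.match(r"[0-7]{1,3}", text[i + 1:])
        if octal:
            out.append(int(octal.group(), 8))
            i += 1 + len(octal.group())
        else:
            out.append(_SIMPLE[text[i + 1]])
            i += 2
    return bytes(out)


def nlmsghdr(buf, offset=0):
    """Unpack struct nlmsghdr: u32 len, u16 type, u16 flags, u32 seq, u32 pid."""
    length, mtype, flags, seq, pid = struct.unpack_from("<IHHII", buf, offset)
    return {"len": length, "type": mtype, "flags": flags, "seq": seq, "pid": pid}


# First 16 bytes of the first write(), and the (truncated) first recvfrom().
request = strace_bytes(r"P\0\0\0\24\0\5\0\217Dv?\0\0\0\0")
reply = strace_bytes(
    r"d\0\0\0\2\0\0\0\217Dv?}\215\377\377\376\377\377\377P\0\0\0\24\0\5\0\217Dv?"
)

req = nlmsghdr(request)
rep = nlmsghdr(reply)

# An NLMSG_ERROR payload is a signed 32-bit error code followed by a copy
# of the header of the request it answers. strace truncated the string, so
# only the first 12 bytes of that echoed header are available here.
(error,) = struct.unpack_from("<i", reply, 16)
echo_len, echo_type, echo_flags, echo_seq = struct.unpack_from("<IHHI", reply, 20)

print("request:", req)
print("reply:  ", rep)
print("error:  ", error, errno.errorcode.get(-error))
```

Under these assumptions the request has length 80, type 20 and flags 5 (NLM_F_REQUEST | NLM_F_ACK), and the reply is an NLMSG_ERROR carrying error -2 (ENOENT) for the same sequence number. If fd 153 is pluto's NETLINK_XFRM socket (an assumption; the trace only shows AF_NETLINK), type 20 would correspond to XFRM_MSG_DELPOLICY. Since every pair in the trace differs only in the sequence number, this would be consistent with pluto repeating the same request that the kernel keeps rejecting.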
> 
> sudo ls -lt /var/run/pluto/ (did not see any dump file)
> total 4
> srwx------ 1 root root 0 Jul 13 05:47 pluto.ctl
> -r--r--r-- 1 root root 6 Jul 13 05:47 pluto.pid
> [lim2 at server1 ~]$ ps -ef|grep pluto
> root      18386      1  0 05:47 ?        00:00:00 /bin/sh /usr/libexec/ipsec/_plutorun --config /etc/ipsec.conf --nofork
> root      18393  18386 36 05:47 ?        03:33:52 /usr/libexec/ipsec/pluto --config /etc/ipsec.conf --nofork
> 
> Please advise.
> Thanks
> Mike
> -----Original Message-----
> From: Paul Wouters [mailto:paul at nohats.ca]
> Sent: Tuesday, July 12, 2016 2:29 PM
> To: Li, Mike
> Cc: swan at lists.libreswan.org
> Subject: RE: [Swan] help needed with Libreswan 
> (libreswan-3.15-5.3.el6.x86_64) and with libreswan-3.17-1.el6.x86_64 
> which went into a "stuck" or failed? state on 
> 2.6.32-573.18.1.el6.x86_64 RHEL6
> 
>> On Tue, 12 Jul 2016, Li, Mike wrote:
>> 
>> Had to force kill the processes yesterday and restart again to restore service.
>> I've been using Openswan (openswan-2.6.32-9.el5) on RHEL5 for a few
>> years. Initially worked with Matt R. from RH to use the following config
>> to connect to Windows 2012 IPsec
> 
> Perhaps upgrade that machine to rhel6 or rhel7 with libreswan? 
> Openswan has been obsoleted for RHEL6 (and was never in RHEL7)
> 
>> The issue is the randomness of the pluto crashes. It happened on 2 servers, with the same unresponsive pluto process.
>> Server 1: around "Jul 10 03:25:41" while logging "max number of retransmissions (8) reached STATE_QUICK_I1.  No acceptable response to our first Quick Mode message: perhaps peer likes no proposal".
>> Server 2: I see 24 "ipsec__plutorun: !pluto failure!:  exited with error status 139 (signal 11)" from Jul 3 - Jul 8.
>> Will those 2 situations cause the pluto process to stop responding?
> 
> So it looks like server2's pluto crashed. There can be some log lines, but not necessarily. You can enable dumpdir=/var/tmp/ and see if you get a core dump in that directory, which you can debug with gdb. But you might just want to try upgrading first.
> 
>> Could I use plutodebug=all to turn on debug? That will generate large 
>> amount of logging
> 
> That might help a bit to determine what exactly happened just before the crash, if this is not a known bug that's been fixed.
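
For reference, the two options discussed above (dumpdir and plutodebug) would both go in the "config setup" section of /etc/ipsec.conf. A minimal sketch, assuming no other settings in that section need to change and that pluto is restarted afterwards so the options take effect:

```
config setup
    # write core dumps here so a crash can be inspected with gdb
    dumpdir=/var/tmp/
    # very verbose logging; consider removing it again once the crash is captured
    plutodebug=all
```

Any existing lines in the "config setup" section should be kept; only these two options are taken from the thread.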
> 
> Paul
> 


