[Swan] Failed migration from OpenSwan

Sat Aug 13 16:12:22 UTC 2016

On Sat, 13 Aug 2016, dsnail at email.com wrote:

> We have been using Openswan to Openswan ipsec successfully on CentOS to RHEL.  The biggest problem we had with OpenSwan was that it crashed when re-configuring tunnels during a rekey.  Now we are forced to move to Libreswan and it has so far been a failure with tons of time spent on it.  We have found issues that we have tried to fix, so far unsuccessfully.

I'm sorry to hear you are running into migration problems. The first
thing to try is the libreswan-3.18 rpms from:

https://download.libreswan.org/binaries/rhel/

> 1.  Random INVALID_ID_INFORMATION responses.  Libreswan goes into a state where it simply will not accept the connection that it has accepted numerous times before. Libreswan says "cert verify failed with internal error" and "Peer public key is not available for this exchange". A restart of Libreswan sometimes fixes this but not always. The worst part is that libreswan allows unencrypted traffic between the two points in this situation.  There is nothing wrong with the cert.  It works sometimes, and it always worked for OpenSwan.

The aging custom X.509 verification code was replaced with direct NSS
calls. To see why it is rejecting your certificates/ID, we would need
to see the plutodebug=all output from the start of the ipsec service
until the failure happens. This might be a large file, so a pointer to
the file would be better than a very large attachment.
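
As a rough sketch, debug logging is normally turned on in the
"config setup" section of /etc/ipsec.conf. The logfile= line and its
path are assumptions for illustration (check ipsec.conf(5) for your
version; older releases used plutostderrlog= for this):

```
# /etc/ipsec.conf (fragment)
config setup
    # Very verbose; turn it off again once the failure is captured.
    plutodebug=all
    # Assumed path: log to a dedicated file instead of syslog.
    logfile=/var/log/pluto.log
```

Restart the ipsec service after the change so the log covers
everything from service start up to the failure.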

> 2.  Tunnels work, then stop working and never work again until manual intervention (--up for example), which is a totally unacceptable requirement.  This failure usually happens after one side has restarted.  We had used auto=start with Openswan but we have found during testing that auto=ondemand may have made the problem in Libreswan less reproducible (but still very, very reproducible).

This could happen on older versions when one side issued a --down. It
would cause the other end to not re-initiate even though it had
auto=ondemand or auto=start. This should work in 3.18 for IKEv1. We are
working on a similar fix for IKEv2.

> Example source and destination ipsec.secrets
>
> : RSA "src.ourdomain.com"
>
> : RSA "dst.ourdomain.com"

These ipsec.secrets entries are no longer needed as of 3.16.

> 002 "src-to-dst-on-80" #347: Main mode peer ID is ID_DER_ASN1_DN: 'C=XX, O=YYY, OU=ZZZZZ-IPSEC, CN=dst.ourdomain.com'
> 002 "src-to-dst-on-80" #347: cert verify failed with internal error
> 002 "src-to-dst-on-80" #347: Peer public key is not available for this exchange
> 218 "src-to-dst-on-80" #347: STATE_MAIN_I3: INVALID_ID_INFORMATION

I'd be interested to see the plutodebug=all output for this.

> Aug 13 11:06:46 src pluto[4431]: "src-to-dst-on-80" #168: transition from state STATE_MAIN_R2 to state STATE_MAIN_R3
> Aug 13 11:06:46 src pluto[4431]: "src-to-dst-on-80" #168: STATE_MAIN_R3: sent MR3, ISAKMP SA established {auth=RSA_SIG cipher=aes_128 integ=OAKLEY_SHA2_256 group=MODP1536}
> Aug 13 11:06:46 src pluto[4431]: "src-to-dst-on-80" #168: Dead Peer Detection (RFC 3706): enabled
> Aug 13 11:20:23 src pluto[4431]: "src-to-dst-on-80" #156: deleting state #156 (STATE_MAIN_R3)

Here it seems that we never received the last IKE packet, so the tunnel
rekey timed out and was killed.

> Aug 13 11:49:32 src pluto[4431]: packet from 10.88.180.151:500: received Vendor ID payload [Dead Peer Detection]

> Aug 13 11:49:32 src pluto[4431]: "src-to-dst-on-80" #185: STATE_MAIN_R3: sent MR3, ISAKMP SA established {auth=RSA_SIG cipher=aes_128 integ=OAKLEY_SHA2_256 group=MODP1536}
> Aug 13 11:49:32 src pluto[4431]: "src-to-dst-on-80" #185: Dead Peer Detection (RFC 3706): enabled
> Aug 13 11:51:59 src pluto[4431]: "src-to-dst-on-80" #185: received Delete SA payload: self-deleting ISAKMP State #185

And here that happens again. It would be interesting to see whether the
other end thinks it sent that reply or not. Possibly it decided for
some reason not to send the reply?

Paul