[Swan] Peer declared dead and tunnel down for 4 hours despite traffic
paul at nohats.ca
Wed Aug 24 16:48:20 UTC 2016
On Wed, 24 Aug 2016, dsnail at email.com wrote:
> We have intermittent tunnel failures that can usually be fixed by a manual 'ipsec auto --up <connection>'. This is not an acceptable requirement, though. The source was declared dead by the destination, which makes no sense, as the source was up and running and communicating with 15+ other peers at the time. I decided to leave the failed tunnel alone, without manual intervention, to see whether it would eventually fix itself, and in this case it did. The tunnel appears to have been down for about 4 hours and to have been 'fixed' very close to 8 hours after the last rekey (15:40:17 - 23:35:47), which seems to be the default salifetime. Even if the source was unavailable to the destination, why did both sides stop trying to communicate, and why did the source suddenly decide to start communicating again (at 23:35:47)? Can anything be done to diagnose or prevent this?
This probably relates to this discussion:
I think we have reached agreement on the behaviour, and just need to
update the code to reflect that in all cases. I expect this to be
fixed in the next 1-2 weeks.
The upcoming RHEL-7.3 build has a fix for IKEv1 for this already.
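
As a quick sanity check on the timing in the report, the gap between the two quoted timestamps can be computed directly. This is only a sketch: it assumes both timestamps fall on the same day (the report does not say so explicitly) and takes the 8 hour salifetime default from the report itself.

```python
from datetime import datetime, timedelta

# Timestamps quoted in the report; assumed to be on the same day.
last_rekey = datetime.strptime("15:40:17", "%H:%M:%S")
recovered = datetime.strptime("23:35:47", "%H:%M:%S")

# Time between the last rekey and the moment the tunnel came back.
elapsed = recovered - last_rekey

# 8h is the salifetime default as stated in the report (assumption here).
default_salifetime = timedelta(hours=8)

print("elapsed since last rekey:", elapsed)
print("short of 8h by:", default_salifetime - elapsed)
```

The interval comes out a few minutes short of 8 hours, which fits the report's observation that recovery happened close to the SA lifetime boundary.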