[Swan] Advice on troubleshooting AWS VPC site-to-site VPN connection.
paul at nohats.ca
Tue Oct 12 19:50:51 UTC 2021
On Tue, 12 Oct 2021, Scott Classen wrote:
> Subject: [Swan] Advice on troubleshooting AWS VPC site-to-site VPN connection.
> Apologies for the lengthy message, but thought it better to give you too much information rather than too little. I am attempting to configure a VPN tunnel from my on premise CentOS 7 machine to an Amazon AWS Virtual Private Cloud (VPC). I have installed and configured Libreswan from the Base repo (libreswan version 3.25). I have followed the AWS instructions for setting up and configuring a site-to-site VPN connection.
> AWS automatically sets up 2 tunnels, and provides a set of instructions for configuring Strongswan 5.5.1+ (unfortunately there are no instructions explicitly for Libreswan). I proceeded under the assumption that most of the ipsec configuration options would be similar or the same.
libreswan currently does not properly support bringing up two tunnels to
different endpoints for the same subnet. For now, you need your own
scripting to bring down/up one or the other.
> Here is my current configuration:
> conn conn-to-aws-1
> conn conn-to-aws-2
The Linux VTI code does not support sharing a VTI device between
different endpoint IPs, so you cannot use VTI. You might be able to use
XFRMi interfaces instead (e.g. ipsec-interface=ipsec1).
> After a fair bit of fiddling I believe I have established the tunnel connection. The following tcpdump command shows UDP traffic between my on prem machine and AWS:
> [root@mymachine]# tcpdump -n -i enp5s0f1 esp or udp port 500 or udp port 4500
> 10:14:33.984518 IP xxx.xxx.xxx.230.ipsec-nat-t > xxx.xxx.xxx.105.ipsec-nat-t: NONESP-encap: isakmp: child_sa inf2
> 10:14:33.985420 IP xxx.xxx.xxx.105.ipsec-nat-t > xxx.xxx.xxx.230.ipsec-nat-t: NONESP-encap: isakmp: child_sa inf2[IR]
> 10:14:34.987292 IP xxx.xxx.xxx.74.ipsec-nat-t > xxx.xxx.xxx.105.ipsec-nat-t: NONESP-encap: isakmp: child_sa inf2
> 10:14:34.988971 IP xxx.xxx.xxx.105.ipsec-nat-t > xxx.xxx.xxx.74.ipsec-nat-t: NONESP-encap: isakmp: child_sa inf2[IR]
No, that is showing isakmp (aka IKE) traffic, not encapsulated ESP traffic.
> and according to ipsec it looks like there are some connections being made.
> (base) [root@bl1231 ipsec.d]# ipsec trafficstatus
> 006 #3: "sibyls-to-aws-1/1x1", type=ESP, add_time=1634058414, inBytes=0, outBytes=0, id='xxx.xxx.xxx.230'
> 006 #4: "sibyls-to-aws-2/1x1", type=ESP, add_time=1634058413, inBytes=0, outBytes=0, id='xxx.xxx.xxx.74'
> 006 #5: "sibyls-to-aws-2/1x2", type=ESP, add_time=1634058414, inBytes=0, outBytes=252, id='xxx.xxx.xxx.74'
So conn-to-aws-2 came up fully, but it likely replaced part of the conn-to-aws-1 connection.
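To make that kind of check repeatable, here is a minimal Python sketch that parses `ipsec trafficstatus` output and lists the tunnels that have sent bytes but received none. The function names and the regular expression are my own illustration, not part of libreswan, and the line format is assumed to match the 3.25 output quoted above.

```python
import re

# Matches lines such as (format assumed from the output quoted above):
# 006 #5: "sibyls-to-aws-2/1x2", type=ESP, ..., inBytes=0, outBytes=252, id='...'
LINE_RE = re.compile(
    r'#(?P<serial>\d+):\s+"(?P<name>[^"]+)".*?'
    r'inBytes=(?P<inb>\d+),\s*outBytes=(?P<outb>\d+)'
)

SAMPLE = """\
006 #3: "sibyls-to-aws-1/1x1", type=ESP, add_time=1634058414, inBytes=0, outBytes=0, id='xxx.xxx.xxx.230'
006 #4: "sibyls-to-aws-2/1x1", type=ESP, add_time=1634058413, inBytes=0, outBytes=0, id='xxx.xxx.xxx.74'
006 #5: "sibyls-to-aws-2/1x2", type=ESP, add_time=1634058414, inBytes=0, outBytes=252, id='xxx.xxx.xxx.74'
"""


def parse_trafficstatus(text):
    """Return one dict per IPsec SA line found in `ipsec trafficstatus` output."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m:
            rows.append({
                "serial": int(m["serial"]),
                "name": m["name"],
                "in_bytes": int(m["inb"]),
                "out_bytes": int(m["outb"]),
            })
    return rows


def one_way_tunnels(rows):
    """Names of SAs that have sent traffic but never received any."""
    return [r["name"] for r in rows if r["out_bytes"] > 0 and r["in_bytes"] == 0]


if __name__ == "__main__":
    for name in one_way_tunnels(parse_trafficstatus(SAMPLE)):
        print("outbound only:", name)
```

An SA with outBytes above zero and inBytes stuck at zero means packets are being encrypted and sent but nothing is coming back on that SA.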
> I have an AWS EC2 AMI instance up and running which has been assigned the private IP of 10.0.1.52 and attached to my VPC, but I am unable to ping it from my on-prem machine.
> I think tcpdump shows the packets going out:
> 10:14:48.484678 IP xxx.xxx.xxx.105.ipsec-nat-t > xxx.xxx.xxx.74.ipsec-nat-t: UDP-encap: ESP(spi=0xcfb5cc6f,seq=0x1), length 132
> 10:14:49.486181 IP xxx.xxx.xxx.105.ipsec-nat-t > xxx.xxx.xxx.74.ipsec-nat-t: UDP-encap: ESP(spi=0xcfb5cc6f,seq=0x2), length 132
> 10:14:50.485446 IP xxx.xxx.xxx.105.ipsec-nat-t > xxx.xxx.xxx.74.ipsec-nat-t: UDP-encap: ESP(spi=0xcfb5cc6f,seq=0x3), length 132
That does look like ESP traffic and the .74 matches the conn-to-aws-2 connection.
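As a rough aid for reading captures like these, the sketch below separates IKE lines from UDP-encapsulated ESP lines and counts ESP packets per direction. It assumes the tcpdump text format quoted in this thread (`NONESP-encap: isakmp` for IKE on port 4500, `UDP-encap: ESP(` for ESP); the helper names are hypothetical.

```python
import re
from collections import Counter

# "IP <src> > <dst>:" as printed by tcpdump -n in the captures quoted above
ADDR_RE = re.compile(r"IP (\S+) > (\S+):")

SAMPLE = [
    "10:14:33.984518 IP xxx.xxx.xxx.230.ipsec-nat-t > xxx.xxx.xxx.105.ipsec-nat-t: NONESP-encap: isakmp: child_sa inf2",
    "10:14:48.484678 IP xxx.xxx.xxx.105.ipsec-nat-t > xxx.xxx.xxx.74.ipsec-nat-t: UDP-encap: ESP(spi=0xcfb5cc6f,seq=0x1), length 132",
    "10:14:49.486181 IP xxx.xxx.xxx.105.ipsec-nat-t > xxx.xxx.xxx.74.ipsec-nat-t: UDP-encap: ESP(spi=0xcfb5cc6f,seq=0x2), length 132",
    "10:14:50.485446 IP xxx.xxx.xxx.105.ipsec-nat-t > xxx.xxx.xxx.74.ipsec-nat-t: UDP-encap: ESP(spi=0xcfb5cc6f,seq=0x3), length 132",
]


def classify(line):
    """Label a tcpdump line as 'ike', 'esp' or 'other'."""
    # Check isakmp first: "NONESP-encap" contains the substring "ESP".
    if "isakmp" in line:
        return "ike"
    if "ESP(" in line:
        return "esp"
    return "other"


def esp_directions(lines):
    """Count ESP packets per (source, destination) pair."""
    counts = Counter()
    for line in lines:
        if classify(line) != "esp":
            continue
        m = ADDR_RE.search(line)
        if m:
            counts[(m.group(1), m.group(2))] += 1
    return counts


if __name__ == "__main__":
    for (src, dst), n in esp_directions(SAMPLE).items():
        print(f"{n} ESP packets {src} -> {dst}")
```

If only one direction shows up in the counts, encrypted packets are leaving but no ESP replies are arriving, which matches the inBytes=0 seen in trafficstatus.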
> but nothing comes back. At this point I think I am essentially stuck on an AWS routing problem (or maybe my firewall is misconfigured), and I will not bother this list with AWS questions, but I was curious why some instructions say to create 2 vti tunnels and some say to create 1 vti tunnel and share it between the 2 ipsec connections? I was also curious if it would be to my benefit to build and install the latest Libreswan as it appears 3.25 is a bit outdated? Can people think of what other things I should be troubleshooting?
We are fixing the problem of overlapping tunnels being prevented from
starting in the next libreswan release. The VTI configuration cannot be
fixed. The limitation in the kernel code is exactly why XFRMi interfaces
were created, and the VTI code is considered legacy.
For now, your best option is to do your own monitoring to swap the
tunnels. In a few weeks we should have a new release out that fixes
the "identical tunnel to two different remote IPs" issue. Then it
should work with XFRMi and AWS.
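As an illustration of what "your own monitoring" could look like, here is a minimal Python sketch, not a tested production script. It assumes the two connections are named conn-to-aws-1 and conn-to-aws-2 as in the configuration above, that 10.0.1.52 is a host inside the VPC that answers ping, and that Linux iputils `ping` is available. The function names are hypothetical; the only libreswan commands used are `ipsec auto --down` and `ipsec auto --up`.

```python
import subprocess


def ping_ok(host, count=3, timeout=2):
    """True if the host answers ping (Linux iputils: -c count, -W timeout in seconds)."""
    result = subprocess.run(
        ["ping", "-c", str(count), "-W", str(timeout), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def swap_commands(active, standby):
    """Commands that take the active connection down and bring the standby up."""
    return [
        ["ipsec", "auto", "--down", active],
        ["ipsec", "auto", "--up", standby],
    ]


def monitor_once(active, standby, host="10.0.1.52", probe=ping_ok, run=subprocess.run):
    """Probe through the tunnel; on failure swap tunnels.

    Returns the new (active, standby) pair.
    """
    if probe(host):
        return active, standby
    for argv in swap_commands(active, standby):
        run(argv, check=False)
    return standby, active


if __name__ == "__main__":
    import time

    active, standby = "conn-to-aws-1", "conn-to-aws-2"
    while True:
        active, standby = monitor_once(active, standby)
        time.sleep(30)  # polling interval is an arbitrary choice
```

Only one of the two connections is up at any time, which sidesteps the overlapping-subnet limitation until the fixed release is out. It needs to run as root, and a real script would want logging and some damping so it does not flap between tunnels.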