[Swan] XFRM pCPU Load distribution in KVM Multi-queue virtio-net

Paul Wouters paul at nohats.ca
Mon Sep 21 21:55:10 UTC 2020


On Mon, 21 Sep 2020, Rav Ya wrote:

> I have been referring to this page (https://libreswan.org/wiki/XFRM_pCPU) and it doesn't say that XFRM is only supported for ikev2. I
> am setting up a shared VTI for 500 Remote Clients IPSec (xAUTH using PAM, IKEv1) tunnels. I have attached my ipsec.conf at the
> bottom of this email.

The goal of pCPU is to use more than 1 CPU for a single IPsec SA. If you
have 500 clients you have 500 IPsec SAs, which already get roughly load
balanced over your CPUs. pCPU should not help in your case.

> What I understand from your response: Please correct me
> 1. Libreswan experimental versions only support pCPU with IKEv2. (Load balancing one big IPSec flow over multiple vCPUs.)
> 
> Question: For my use case (500 Clients, xAUTH using PAM, IKEv1 ) the SAs per client will be created per vCPU.
>  *  The vCPU will be picked randomly (How will the 500 SAs be distributed?) 500/6 = roughly 83 SAs per CPU. 
>  *  There shall be no duplicate SAs for a single connection over multiple vCPU because there is no pCPU XFRM. Correct?
>  *  Is there a way for me to check how many SAs got allocated to a vCPU on my system?

I don't know the answers for these.

> My Observation: When I start pushing traffic across all the 500 SAs 
>  *  Sometimes the load isn't distributed evenly and I see some vCPUs getting overutilized, which slows down the Libreswan packet
>     processing rate. 

Most CPU time should be going into processing IPsec packets inside the
kernel, not IKE packets inside libreswan.

>  *  The Libreswan server isn't able to process packets fast enough and the TAP interface (tx queue) on the KVM virtualization host
>     starts dropping packets.

Clarify "dropping packets". If it is not IKE packets, then libreswan is
not involved. It is the kernel.
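
One way to see whether the kernel's IPsec path itself is discarding packets
is to look at the XFRM error counters. The sketch below assumes the kernel
was built with CONFIG_XFRM_STATISTICS, so that /proc/net/xfrm_stat exists.
The function name xfrm_nonzero is made up for illustration.

```shell
# Print only the XFRM counters that are non-zero.
# Assumption: /proc/net/xfrm_stat exists (needs CONFIG_XFRM_STATISTICS).
# An optional file argument lets you inspect a saved copy instead.
xfrm_nonzero() {
    awk '$2 > 0 { print $1, $2 }' "${1:-/proc/net/xfrm_stat}"
}
```

Running xfrm_nonzero before and after a traffic burst and comparing the
output should show whether any of these counters are climbing. On the KVM
host, "ip -s link show dev <tap>" shows the TAP interface's own drop
counters.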

> Currently, my ipsec clients are using: ( Any advice?) vCPU is Intel(R) Xeon(R) Gold 6126 CPU @ 2.60GHz passthrough Host VM
> ike=3des-sha1-modp1024

3des is 8x more CPU intensive than aes. Use ike=aes-sha1-modp1536.
modp1024 is too weak, and recent libreswan versions have removed support
for it.

> esp=aes256-md5-modp1024

If your clients support it, use esp=aes_gcm256. It is much faster than
aes+md5/sha1/sha2.
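
As a rough sketch, the two suggestions above might look like this in
ipsec.conf. The connection name is hypothetical and the rest of the conn
section is left out. The clients must propose matching algorithms.

```
# Hypothetical connection name; only the algorithm lines are shown.
conn remote-clients
    ike=aes-sha1-modp1536
    esp=aes_gcm256
```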

It seems your problem might be more related to the libreswan speed
optimizations with NSS in the last few versions. Are you at
least running 3.32?

You can benchmark the libreswan cpu usage using:

 	sudo ipsec whack --debug cpu-usage

Note that switching your clients to IKEv2 will also greatly improve your
speed:

- Fewer (encrypted) IKE packets to set up connections
- Fewer retransmits, because only the initiator is responsible for
   retransmitting in IKEv2 (in IKEv1, both ends retransmit)

Certificate handling has also greatly improved in 3.32, leading to a 5 to
10 times better performance due to certificate caching and expiration,
making the NSS lookups faster. It also caches the (encrypted) private
key, which is useful for busy servers.

I don't think pCPU is a fix for what your problem really is. Upgrade
to the latest libreswan on the server, and if possible switch to IKEv2.

Paul

