[Swan-dev] [PATCH libreswan] Add support for IPSec HW-offload on the NIC

Ilan Tayari ilant at mellanox.com
Thu Jun 29 11:59:42 UTC 2017


> > > 1. how to detect which esp algorithms are supported by this card?
> > There is no kernel API for that :/
> > Currently the user is supposed to be aware which algos and modes his
> > offload-capable NIC supports.
> >
> It would be nice to have such listing function.
> 
> I advise better logging when pluto sees EINVAL with offload.  The link
> below suggests very limited ESP algorithm support: just AES GCM, which is
> OK to start with.  However, if there is a mismatch, it would be hard to
> debug.  A clear log message would help.  Pluto detects EINVAL very late in
> the connection negotiation, only at the final step.
> Also, if one end has such an offload and the other end does not, it will be
> hard to debug, especially if the initiator has an offload card.
> In this case the responder would happily install the SA and send ESP
> traffic, which will now get discarded. Could even the IKE DPD/liveness go
> through?
> 
> Maybe the phase 2 pending timer will kick in after two minutes on the
> initiator and restart.

My current conn sets phase2alg to a value that is compatible with the
hardware. This avoids all those complications. This is what I meant by the
user being "supposed to know"...

conn myconn
    left=192.168.7.1
    right=192.168.7.11
    authby=secret
    auto=start
    hw_offload=yes
    phase2alg=aes_gcm256-null

I use authby=secret only because I'm too lazy to set up anything more
elaborate; it doesn't matter for the SA offload.

See comment at bottom regarding failure symptoms.
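
Since there is no kernel API for listing the supported algorithms, one way
to catch a bad conn before bringing it up is a small pre-flight check
against a hand-maintained allowlist. The following is only a sketch: the
allowlist contents and the helper names (parse_conn, offload_problems) are
my own assumptions, and the only value this thread confirms as working is
aes_gcm256-null.

```python
# Hypothetical allowlist: the thread only confirms aes_gcm256-null for this
# NIC. Extend it by hand for your own hardware.
OFFLOAD_CAPABLE_PHASE2ALG = {"aes_gcm256-null"}


def parse_conn(text):
    """Parse one ipsec.conf-style conn section into (name, options)."""
    name = None
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("conn "):
            name = line.split(None, 1)[1]
        elif "=" in line:
            key, value = line.split("=", 1)
            options[key.strip()] = value.strip()
    return name, options


def offload_problems(options, capable=OFFLOAD_CAPABLE_PHASE2ALG):
    """Return a list of problem descriptions; an empty list means no issue found."""
    if options.get("hw_offload") != "yes":
        return []  # offload not requested, nothing to check
    if options.get("phase2") == "ah":
        return ["phase2=ah: the card discussed here does not offload AH"]
    alg = options.get("phase2alg")
    if alg is None:
        return ["phase2alg is unset, so the default proposal will be used"]
    if alg not in capable:
        return ["phase2alg=%s is not in the offload allowlist" % alg]
    return []


CONN = """
conn myconn
    left=192.168.7.1
    right=192.168.7.11
    authby=secret
    auto=start
    hw_offload=yes
    phase2alg=aes_gcm256-null
"""

if __name__ == "__main__":
    conn_name, conn_options = parse_conn(CONN)
    for problem in offload_problems(conn_options):
        print("%s: %s" % (conn_name, problem))
```

For the conn above the check reports nothing; dropping the phase2alg line
makes it report the unset-phase2alg case.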

> 
> > > 2. how does it deal with add_sa for an unsupported algorithm?
> > If you attempt to install an SA with unsupported offload properties, it
> > fails with -EINVAL.
> > User may get more info in the logs, but the daemon will get just this
> > generic indication.
> >
> > > 3. does the card support AH SA?
> > Our card does not currently. It is in the plans for future.
> > See the driver cover letter for more info:
> >
> 
> I guess AH config would fail hard. Could you try and see what pluto would
> log if any?

I'll check. AH didn't work for me at all like this:
-    hw_offload=yes
-    phase2alg=aes_gcm256-null
+    phase2=ah

Need to see why...

> 
> > > 4. does it support xfrm acquire, block and pass policies too?
> > The card currently offloads only the SADB, and not the SPD.
> > So all policy-related checks are still in the xfrm stack.
> > Acquires are not offloaded, only SAs that are supposed to have traffic
> > on them are offloaded.
> >
> > Offloading the SPD is planned for the future.
> 
> It sounds like this could work. However, it would be nice to test it. You
> need a "conn" with auto=ondemand on both ends, then initiate a ping. The
> connection should come up and
> 
> "ipsec whack --trafficstatus" should show the SA.
> 
> conn myconn
> 	auto=ondemand
> 
> If you run into issues, maybe look at "ip xfrm monitor".

It works :)
 
> 
> > > 5. Any limits on the number of SAs supported? And would it return
> > > something like a "can't add any more" message, or silently fail?
> > The card currently supports 1 million SAs maximum.
> > You may not reach that limit, though, due to hash collisions.
> > If an offloaded SA cannot be added to the hardware due to that, the add_sa
> > will fail.
> 
> sounds good enough for now.
> 
> > > 6. does an "ipsec restart" clear the SAs properly if pluto crashes?
> > > _stackmanager tries to do that when pluto crashes.
> >
> > Deleting the SA in xfrm deletes it in the NIC as well.
> > Flushing SAs in xfrm flushes them in the NIC as well.
> 
> Then I guess if pluto gets killed and restarts, things should work!

Yes, I did that a few times. Didn't see any issues.

> 
> > The conclusion from all the above is that on failure to add_sa with
> > offload, we may retry add_sa without offload.
> > But then again some users may want to engineer their systems to only add
> > supported SAs. They will not want to tolerate fallback to non-offload.
> > Maybe this could be another configuration option?
> 
> Not sure what would be a good solution. In some sense, the fewer knobs the
> better!
> My intention is to make sure there is clear logging when add_sa fails,
> so the user knows what failed.
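
To make the two behaviours being discussed concrete (retry without offload
versus fail hard), here is a rough model of the control flow. It is not
pluto code: install_sa, the add_sa callable and the allow_fallback switch
are all hypothetical names used for illustration, and the only grounded
detail is that a rejected offload shows up as a generic EINVAL.

```python
import errno


def install_sa(add_sa, sa, offload_requested, allow_fallback, log=print):
    """Install an SA, optionally retrying without offload on EINVAL.

    add_sa(sa, offload=...) is a hypothetical callable that raises OSError
    on failure. Returns "offload" or "software" depending on how the SA
    ended up installed.
    """
    if not offload_requested:
        add_sa(sa, offload=False)
        return "software"
    try:
        add_sa(sa, offload=True)
        return "offload"
    except OSError as exc:
        if exc.errno != errno.EINVAL:
            raise  # unrelated failure, do not mask it
        # EINVAL is all the daemon gets, so say clearly what was attempted.
        log("add_sa with hardware offload failed (EINVAL): "
            "the NIC may not support this algorithm or mode")
        if not allow_fallback:
            raise  # strict setups prefer a hard failure over silent fallback
        add_sa(sa, offload=False)
        log("SA installed without hardware offload")
        return "software"
```

With allow_fallback=False the caller still sees the original error, but
only after the log line has pointed at offload as the likely cause.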

I did just:
-    phase2alg=aes_gcm256-null

And it doesn't work, of course.

/var/log/secure shows:
Jun 29 14:53:07 gen-l-vrt-103-005 pluto[28569]: "myconn" #1: initiating Main Mode
Jun 29 14:53:07 gen-l-vrt-103-005 pluto[28569]: "myconn" #1: transition from state STATE_MAIN_I1 to state STATE_MAIN_I2
Jun 29 14:53:07 gen-l-vrt-103-005 pluto[28569]: "myconn" #1: STATE_MAIN_I2: sent MI2, expecting MR2
Jun 29 14:53:07 gen-l-vrt-103-005 pluto[28569]: "myconn" #1: transition from state STATE_MAIN_I2 to state STATE_MAIN_I3
Jun 29 14:53:07 gen-l-vrt-103-005 pluto[28569]: "myconn" #1: STATE_MAIN_I3: sent MI3, expecting MR3
Jun 29 14:53:07 gen-l-vrt-103-005 pluto[28569]: "myconn" #1: Main mode peer ID is ID_IPV4_ADDR: '192.168.7.11'
Jun 29 14:53:07 gen-l-vrt-103-005 pluto[28569]: "myconn" #1: transition from state STATE_MAIN_I3 to state STATE_MAIN_I4
Jun 29 14:53:07 gen-l-vrt-103-005 pluto[28569]: "myconn" #1: STATE_MAIN_I4: ISAKMP SA established {auth=PRESHARED_KEY cipher=aes_256 integ=sha group=MODP2048}
Jun 29 14:53:07 gen-l-vrt-103-005 pluto[28569]: "myconn" #2: initiating Quick Mode PSK+ENCRYPT+TUNNEL+PFS+IKEV1_ALLOW+IKEV2_ALLOW+SAREF_TRACK+IKE_FRAG_ALLOW+ESN_NO {using isakmp#1 msgid:6f21bad0 proposal=defaults pfsgroup=MODP2048}
Jun 29 14:53:07 gen-l-vrt-103-005 pluto[28569]: "myconn" #2: ERROR: netlink response for Get SA esp.9195dc7a at 192.168.7.11 included errno 3: No such process
Jun 29 14:53:07 gen-l-vrt-103-005 pluto[28569]: "myconn" #2: transition from state STATE_QUICK_I1 to state STATE_QUICK_I2
Jun 29 14:53:07 gen-l-vrt-103-005 pluto[28569]: "myconn" #2: STATE_QUICK_I2: sent QI2, IPsec SA established tunnel mode {ESP=>0x9195dc7a <0xfee4040d xfrm=AES_128-HMAC_SHA1 NATOA=none NATD=none DPD=passive}
Jun 29 14:53:08 gen-l-vrt-103-005 pluto[28569]: initiate on demand from 192.168.7.1:8 to 192.168.7.11:0 proto=1 because: acquire
Jun 29 14:53:08 gen-l-vrt-103-005 pluto[28569]: "myconn" #3: initiating Quick Mode PSK+ENCRYPT+TUNNEL+PFS+IKEV1_ALLOW+IKEV2_ALLOW+SAREF_TRACK+IKE_FRAG_ALLOW+ESN_NO {using isakmp#1 msgid:b52ce458 proposal=defaults pfsgroup=MODP2048}
Jun 29 14:53:08 gen-l-vrt-103-005 pluto[28569]: "myconn" #3: transition from state STATE_QUICK_I1 to state STATE_QUICK_I2
Jun 29 14:53:08 gen-l-vrt-103-005 pluto[28569]: "myconn" #3: STATE_QUICK_I2: sent QI2, IPsec SA established tunnel mode {ESP=>0x9cb505f5 <0x27cc5e2d xfrm=AES_128-HMAC_SHA1 NATOA=none NATD=none DPD=passive}

Apparently it negotiated AES with HMAC-SHA1, which is not supported for offload.
But the log doesn't indicate where the problem is, or that XFRM_MSG_NEWSA/XFRM_MSG_UPDSA failed.

A dmesg warning reveals it:
[ 8852.735900] mlx5_core 0000:00:08.0 ens8: Cannot offload authenticated xfrm states
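
Until pluto logs this itself, the two sources can be correlated by hand or
with a small script. The sketch below only relies on the line shapes shown
in this mail (the xfrm= field in the "IPsec SA established" line, and the
"Cannot offload" driver warning); the function names and regular
expressions are my own, and other pluto or driver versions may format
these lines differently.

```python
import re

# Matches pluto lines such as:
#   "myconn" #2: STATE_QUICK_I2: sent QI2, IPsec SA established ... xfrm=... ...
SA_RE = re.compile(
    r'"(?P<conn>[^"]+)" #(?P<serial>\d+): .*IPsec SA established.*'
    r'\bxfrm=(?P<xfrm>\S+)'
)

# Matches kernel lines such as:
#   [ 8852.735900] mlx5_core 0000:00:08.0 ens8: Cannot offload ...
DMESG_RE = re.compile(
    r'^\[\s*[\d.]+\]\s+\S+ \S+ (?P<ifname>\S+): (?P<msg>Cannot offload .*)$'
)


def negotiated_transforms(pluto_log):
    """Return (conn, state serial, xfrm) for every established IPsec SA."""
    found = []
    for line in pluto_log.splitlines():
        match = SA_RE.search(line)
        if match:
            found.append(
                (match.group("conn"), int(match.group("serial")), match.group("xfrm"))
            )
    return found


def offload_rejections(dmesg_text):
    """Return (interface, message) for every 'Cannot offload' driver warning."""
    found = []
    for line in dmesg_text.splitlines():
        match = DMESG_RE.match(line.strip())
        if match:
            found.append((match.group("ifname"), match.group("msg")))
    return found
```

Feeding it the /var/log/secure excerpt and the dmesg line from this mail
shows the negotiated transform next to the driver's reason for rejecting
the offload.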

> 
> > In any case maybe these things can be developed as incremental
> > improvements to libreswan?
> 
> agree!
> 
> regards,
> -antony


