[Swan-dev] found in my postponed folder: Re: xfrmi branch (fwd)
Paul Wouters
paul at nohats.ca
Fri Jan 17 13:37:27 UTC 2020
I found this one in my postponed folder. I am not sure what parts are
still relevant or not.
Paul
On Mon, 30 Sep 2019, Paul Wouters wrote:
> Subject: xfrmi branch
I looked further into my vpn.nohats.ca client issue, and found that
the following diff breaks it for me (even when leftiface-id is
not set):
- oops="$(eval ${it} 2>&1)"
- st=$?
- if [ -z "${oops}" -a ${st} -ne 0 ]; then
- oops="silent error, exit status ${st}"
+
+ if [ "${ROUTE}" = "yes" -o "${XFRMI_ROUTE}" = "yes" ]; then
+ oops="$(eval ${it} 2>&1)"
+ st=$?
+
+ st_r=$(eval err_check ${st} ${oops} "$it")
+ if [ ${st_r} -ne 0 ]; then
+ return ${st}
+ fi
This changes the eval of $it to only happen if ROUTE or XFRMI_ROUTE is
set. But if you look slightly above it in the script, it= is only ever
set if PLUTO_PEER_CLIENT is 0.0.0.0/0, so that we get our halfroutes.
With the eval gated like this, the halfroutes are never installed, which
is why it breaks my vpn client connection with leftiface-id=no.
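
A minimal sketch of the effect described above, not the actual updown
script. It assumes, as stated, that it= is only set for a peer client of
0.0.0.0/0. The function names (build_it, run_ungated, run_gated) are
hypothetical, and "ip route" is replaced by echo so the sketch runs
without privileges:

```shell
#!/bin/sh
# Hypothetical stand-in for the route step of the updown script.
# "ip route" is replaced by echo so nothing touches the routing table.

build_it() {
    # Per the description above: it= is only set for 0.0.0.0/0.
    it=""
    if [ "${PLUTO_PEER_CLIENT}" = "0.0.0.0/0" ]; then
        it="echo route add 0.0.0.0/1 && echo route add 128.0.0.0/1"
    fi
}

# Old behaviour: the command in $it is always evaluated when set.
run_ungated() {
    build_it
    if [ -n "${it}" ]; then
        eval "${it}"
    fi
}

# Behaviour after the diff: evaluated only when ROUTE or XFRMI_ROUTE is "yes".
run_gated() {
    build_it
    if [ "${ROUTE}" = "yes" ] || [ "${XFRMI_ROUTE}" = "yes" ]; then
        eval "${it}"
    fi
}

# A remote-access client where neither ROUTE nor XFRMI_ROUTE is set.
PLUTO_PEER_CLIENT="0.0.0.0/0"
ROUTE=""
XFRMI_ROUTE=""

ungated_out="$(run_ungated)"
gated_out="$(run_gated)"

echo "ungated: ${ungated_out}"
echo "gated:   ${gated_out}"
```

In this sketch the ungated path emits both halfroute commands while the
gated path emits nothing, which matches the broken client connection.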
So the patch for this is basically:
it="ip route ${cmd} 0.0.0.0/1 ${parms2} && ip route ${cmd} 128.0.0.0/1 ${parms2}"
+ HALFROUTES=yes
[...]
if [ "${ROUTE}" = "yes" -o "${XFRMI_ROUTE}" = "yes" ]; then
+ if [ "${HALFROUTES}" = "no" ]; then
I also confirmed that when using XFRMi and scope 50, we do not need to
set any halfroutes and can omit the "via" parameter, and things work.
That needs a second fix, though. This is the second patch I needed:
# use nexthop if nexthop is not %direct and POINTPOINT is not set
if [ "${PLUTO_NEXT_HOP}" != "${PLUTO_PEER}" -a -z "${POINTPOINT}" ]; then
- parms2="via ${PLUTO_NEXT_HOP}"
+ # XFRM interface needs no nexthop
+ if [ -z "${PLUTO_XFRMI_ROUTE}" ]; then
+ parms2="via ${PLUTO_NEXT_HOP}"
+ fi
Otherwise we try to set a route the kernel refuses and packet flow is
broken because the scope 50 table isn't set properly.
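
A minimal sketch of the nexthop selection after the second patch. The
helper name nexthop_parms is hypothetical, and the addresses are
documentation placeholders (192.0.2.0/24), not values from my setup:

```shell
#!/bin/sh
# Hypothetical helper mirroring the patched logic: only emit "via <nexthop>"
# when the nexthop differs from the peer, POINTPOINT is unset, and no
# XFRM interface route is in use.
nexthop_parms() {
    parms2=""
    if [ "${PLUTO_NEXT_HOP}" != "${PLUTO_PEER}" ] && [ -z "${POINTPOINT}" ]; then
        # XFRM interface needs no nexthop
        if [ -z "${PLUTO_XFRMI_ROUTE}" ]; then
            parms2="via ${PLUTO_NEXT_HOP}"
        fi
    fi
    printf '%s' "${parms2}"
}

# Placeholder addresses for illustration only.
PLUTO_PEER="192.0.2.254"
PLUTO_NEXT_HOP="192.0.2.1"
POINTPOINT=""

# Without an XFRM interface the nexthop is kept.
PLUTO_XFRMI_ROUTE=""
plain_parms="$(nexthop_parms)"

# With an XFRM interface route the "via" parameter is omitted.
PLUTO_XFRMI_ROUTE="yes"
xfrmi_parms="$(nexthop_parms)"

echo "plain: '${plain_parms}'"
echo "xfrmi: '${xfrmi_parms}'"
```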
With these two fixes, my vpn.nohats.ca client works both with and
without leftiface-id=yes.
I added a test case ikev2-xfrmi-05-remote-access-client, copied from a
test case without xfrm interfaces enabled. So a regression in either
case will show up with a test failure.
Note there are still minor issues with the updown script. We still get a
few non-fatal errors, and they show up in the "good" reference output.
Those still need fixing.
I reran the ikev2-xfrmi test cases and they still pass with these
changes.
Note, when testing my updown changes, I ran the xfrmi test cases, and
ikev2-xfrmi-01 once showed a crasher on east:
| expiring aged bare shunts from shunt table
| spent 0.00563 milliseconds in global timer EVENT_SHUNT_SCAN
| processing global timer EVENT_SHUNT_SCAN
| expiring aged bare shunts from shunt table
| spent 0.00418 milliseconds in global timer EVENT_SHUNT_SCAN
in event_schedule (type=<optimized out>, delay=..., st=0x55d765980000)
    at /usr/src/debug/libreswan-3.28-0.rc877_ga79330704c_xfrmi.x86_64/programs/pluto/timer.c:637
#5 0x000055d7641634d3 in timer_event_cb (unused_fd=<optimized out>,
    unused_event=<optimized out>, arg=<optimized out>)
    at /usr/src/debug/libreswan-3.28-0.rc877_ga79330704c_xfrmi.x86_64/programs/pluto/timer.c:336
#6 0x00007fd106615a5a in event_process_active_single_queue
    (base=base@entry=0x55d765967e40, activeq=0x55d765968100,
    max_to_process=max_to_process@entry=2147483647,
    endtime=endtime@entry=0x0) at event.c:1646
#7 0x00007fd10661630f in event_process_active (base=0x55d765967e40)
    at event.c:1738
#8 event_base_loop (base=0x55d765967e40, flags=flags@entry=0)
    at event.c:1961
#9 0x000055d764160865 in call_server ()
    at /usr/src/debug/libreswan-3.28-0.rc877_ga79330704c_xfrmi.x86_64/programs/pluto/server.c:1496
which maps to the timer event hitting the default case and expecting no
st_event but having one:
case EVENT_v1_SEND_XAUTH:
passert(st->st_send_xauth_event == NULL);
st->st_send_xauth_event = ev;
break;
default:
passert(st->st_event == NULL);
st->st_event = ev;
break;
}
When this happened, I was running namespace based testing, but I also had
ipsec1 to vpn.nohats.ca up on my bare metal. I stopped the host ipsec,
killed all pluto processes, and reran the test a few times, and the
crasher did not happen again.
Paul