[Swan-dev] hostpairs and instances, was Re: ikev2-10-2behind-nat and 80d68d57a57

Paul Wouters paul at nohats.ca
Fri Sep 20 02:06:36 UTC 2019


On Thu, 19 Sep 2019, Andrew Cagney wrote:

> seems to be a little too aggressive.  For instance in
> ikev2-10-2behind-nat on east I see:

I disagree.

> - north establishes #1,#2 creating instance "pool-eastnet-ikev2"[1]
> with cert E=user-north at testing.libreswan.org
> - road starts establishing #3,#4 and grabs connection instance
> "pool-eastnet-ikev2"[1]
> - during authentication "pool-eastnet-ikev2"[1] is found unsuitable -
> E=user-road at testing.libreswan.org !=
> E=user-north at testing.libreswan.org I'm guessing
>
> the old code would then stumble on, eventually going back to the
> original connection and re-refining it (the patch stops this):

The core of the problem is that we use refine_host_connection() in
IKEv2, even though that function is written for IKEv1. In IKEv1,
you only have authentication material, so you cannot refine based
on traffic selectors until after authentication is done and a new
packet for quick mode appears.

So the IKEv2 code, instead of talking IKE ID's plus Traffic Selectors
into account for refining the connection, needs to continue whether or
not there is an AUTH issue. This could be many kind of failures (wrong
CA, wrong IKE ID, IKE ID not on certificate). Currently, I think if the
certificate does not validate, it will fail without looking for other
connections (with possibly other CA's that would make it succeed). If
the certificate validates, it checks the IKE ID with the certificate IDs
(SAN) but has to continue when failing. Then it hopefully switches once
it hits the traffic selector code, and finds a better conn from a traffic
selector point of view. But then when it switches to that conn, it has to
retroactively check if the switched conn allows the same authentication
properties. Eg the CERT could now have the right IKE ID on the SAN,
but it could also be that this connection now has the wrong CA and the
previous certificate validation is not valid for this conn.

> "pool-eastnet-ikev2"[1] 192.1.2.254 #3: switched from
> "pool-eastnet-ikev2"[1] 192.1.2.254 to "pool-eastnet-ikev2"
> | rw_instantiate() instantiated "pool-eastnet-ikev2"[2] 192.1.2.254
> for 192.1.2.254
> "pool-eastnet-ikev2"[2] 192.1.2.254 #3: Authenticated using RSA
>
> (#include rant about how connection prefixes changing randomly and
> meaninglessly)

It does not. If the name changes, it means you switched connections, and
that there be dragons.

> Presumably tweaking the connection would fix this?
>
> But here's my question.  Why was road assigned the instance
> "pool-eastnet-ikev2"[1] to start with?  Would it be better to:
> - in the responder, on first contact, assign a non-template connection
> to the state
> - during auth, when everything is known, switch to or refine the
> connection instance as needed

Yes, that is one of the proposals I've suggested in the past, but it has
a problem too. If you don't have the real connection, you don't know
what DH / KE is allowed, so you have to accept everything, and you
cannot ever return INVALID_KE, because you accepted whatever they gave
you in IKE_SA_INIT, and now in IKE_AUTH you realise that the current
connection does not support the specific DH group.

So there is some merit in trying to find the best one in IKE_SA_INIT,
at least one with a matching DH / KE. But it's a poor guarantee for
knowing you are on the right conn.

Furthermore, I have a pending change that I had hoped to talk with Hugh
about before commiting related to this:

diff --git a/programs/pluto/ikev2_ts.c b/programs/pluto/ikev2_ts.c
index 3ae176e8a3..d8470f1072 100644
--- a/programs/pluto/ikev2_ts.c
+++ b/programs/pluto/ikev2_ts.c
@@ -780,8 +780,8 @@ bool v2_process_ts_request(struct child_sa *child,
                 if (hp == NULL)
                         continue;

-               for (struct connection *d = hp->connections;
-                    d != NULL; d = d->hp_next) {
+               struct connection *d;
+               for (d = connections; d != NULL; d = d->ac_next) {
                         /* groups are templates instantiated as GROUPINSTANCE */
                         if (d->policy & POLICY_GROUP) {
                                 continue;

Currently, the traffic selector code loops over the hostpair. But this
fails when you have multiple connections with the remote as %any because
you don't know the remote's dynamic IP (but your two connections are
meant for the same dynamic entity). As you can see in ikev2-06-dual-net,
without the above change, it will never find the second conn for the
CREATE_CHILD_SA. It only finds the first template and the instance of it,
but not the second template conn. Despite the fact that both conns have
the same leftid/rightid but it has a right=%any. I think it is a
limitation of the hostpair code, which assumes that you are talking to
one entity on one IP address, so it is unclear how the second connection
with %any would handle a connection from a different IP.

The real question here is, should we change the hostpair concept to be
keyed on ID instead of IP, or do we trust our dynamic trees and buckets
enough now that the performance gain/loss from hostpair is so little, we
should just remove the entire concept?

Unfortunately, the reason this commit is pending and I wanted to
investigate more is that doing this seems to result in a bogus "eroute
already in use", despite that the two conns have different subnets. Of
course, this is also an artifact of right=%any. It's not really required
to instantiate, but it does because of the %any.

We should have used right=%many and right=%dynamic which would have been
clearer and that %dynamic would not instantiate, but that ship sailed
in 1995.

> it won't help when first contact has a choice of connections; but it
> might help when the problem is finding the correct connection instance
> - the initial decision hasn't dug a hole

Related to the above. For "real" instances of remote access clients, you
would never need to refine_host_connection() to a template instance. You
would either find the template and create a new instance, or find
something that didn't come from this already instantiated template. But
because of the above, you have to check instances too :/

> thoughts,

Many thoughts are needed. Maybe some prayers too.

Paul


More information about the Swan-dev mailing list