[Swan-dev] Set keyingtries to 1 for Opportunistic Encryption connections

Mon Mar 9 02:43:35 UTC 2020

| From: Paul Wouters <paul at nohats.ca>

| On Thu, 5 Mar 2020, D. Hugh Redelmeier wrote:
| 
| > | I meant "resolved by setting it to 1".
| >
| > I don't really understand the issues.
| 
| A failure shunt is installed after the first keyingtries= fails.
| If a second keyingtries is started, it will want to install the
| negotiationshunt. What does it mean when these two shunts are
| different? It is unclear what we _should_ do.
| 
| The actual bug seems to be that sometimes, the second shunt comes
| in without widening to the full IP address. I do not yet know how
| or when this exactly happens. When it does, pluto installs a second
| shunt because it is different from the first one (eg proto 0 vs proto 6)
| then later on when a tunnel is installed, only one shunt is removed.

OK.  That is (I think) a separate discussion that we should have.

The code I wrote implemented the same logic as your fix.  So my change
is neutral with regard to this issue.

| > If the shunts fail, except in special cases, and those cases are
| > undocumented, we should
| >
| > - fix the shunts issue (hard, I assume), or
| 
| We should either fix the shunt issue, or think of removing shunts
| completely. While shunts give us a little more fine-grained control
| on what to do during negotiation, some of that is countered by the
| XFRM larval state anyway, and additionally, there is a _lot_ of
| complexity with shunts. That makes it tempting to remove.

I think that this is an important discussion that should be separated
from this particular bug.

Among other things, shunts were to prevent unexpected leaks in the clear.  
That seems pretty important.  They were also used to mirror kernel state 
in Pluto (it was unreasonable to try to query the kernel whenever Pluto 
needed the information).  Unfortunately, the kernel could change the 
kernel's representation without notifying userland.

It is possible that the design constraints have changed over the last
decades and a different design might be better.  But just slashing
without careful analysis seems like a bad idea.  That analysis must
not be based on a subset of the cases.

Such a change might require sysadmins to change firewalls.  If so, it
should not just be slipped into a release.

| > Here's an add-on to Paul's code [UNTESTED].
| >
| > Since it changes starterwhack, something I'm not an expert in, the code is
| > particularly suspect.
| 
| I actually changed it in add_connection() so it is independent on
| whether the parameters came in via whack or via ipsec.conf.

Right.  That's why I didn't suggest removing your change.

My code intended to implement the same result, but earlier, with a
more useful diagnostic (I hope -- untested!).

| > It implements that last policy, I hope.
| > If one were to delete one line, it would only change a defaulted
| > keyingtries.
| 
| Which makes this harder to do because we currently merge in our defaults
| via libipsecconf/whack interactions. By the time add_connection() is
| called, what was an implicit default is no longer known.

Where I made the change, it is known whether this 0 came from a
default or an explicit assignment.  The diagnostic is only generated
if an explicit assignment is being over-ridden.

I fear that your code emits too many diagnostics (i.e. where the 0
comes from default) and that they are not visible.

That's why I was suggesting my additional change.

Putting a check in the whack code would be good too.  But I don't
imagine that it is as important.