[Swan-dev] Leaks when killing states during crypto; time to drop WIRE_*?

Tue Dec 5 19:29:05 UTC 2017

On 5 December 2017 at 12:29, D. Hugh Redelmeier <hugh at mimosa.com> wrote:
> | From: Andrew Cagney <andrew.cagney at gmail.com>
>
> | First, here's my rewritten version of history:
>
> Seems pretty good.
>
> To get pointers (and many other things) right, one needs an iron
> discipline.  This is best done by a set of simple-to-follow rules.
>
> The wire stuff was such a simple set of rules.  Fairly reasonable when
> there was a virtual wire.
>
> Now, shared variables make a lot of sense.  Threads with shared
> variables should be cheaper than processes with pipes.  But many
> thread disciplines are really awkward.
>
> Currently we have a hybrid solution.  We use pipes as an inter-thread
> communications method.  This is probably silly.

otoh, it works

Changing the worker->main path to use the event-loop may be low hanging
fruit.

> - what are the actors?  The main thread and worker threads, but with
>   roles that might be changeable.
>
> - for each variable, who (what actor) owns it?
>   - who allocates it; who frees it
>   - who is allowed to write it?
>   - who is allowed to read it?
>   - what happens when an actor fails?
>
> - how is work distributed?  How are results gathered?
>
> | The problem was, when pluto is overloaded it will kill states
> | mid-crypto and there is no code to clean up these pointers (if the
> | code is there it isn't obvious)
>
> In the framework I just put forward, the problem here is that an actor
> appears to have lost competence and someone has to:
>
> - make sure that the actor has stopped doing anything (observable).
>   That may not be easy in the face of asynchrony.
>
> - inherit all the actor's obligations
>
> This means that those obligations must be represented in a non-opaque
> way, one that must be shared or transferred between threads.  Yuck.

I think the short-lived XAUTH PAM thread code (replaced by fork()) is
a reasonable model.
Only release stuff on the main thread after the worker has finished and replied.
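
A minimal sketch of that model, assuming POSIX threads and a
hypothetical struct request (none of these names are pluto's): when a
state dies mid-crypto the main thread only marks the request as
cancelled, and frees it once the worker has replied.  The blocking
pthread_join() stands in for the reply; a real implementation would
more likely receive it through the event loop.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical request; the worker owns input/output until it replies. */
struct request {
    int input;
    int output;
    bool cancelled;     /* touched by the main thread only */
    pthread_t thread;   /* touched by the main thread only */
};

/* Worker thread: do the work, then hand the request back as the reply. */
static void *worker(void *arg)
{
    struct request *rq = arg;
    rq->output = rq->input * 2; /* stand-in for the expensive crypto */
    return rq;
}

/* Main thread: allocate the request and start a short-lived worker. */
static struct request *submit(int input)
{
    struct request *rq = calloc(1, sizeof(*rq));
    if (rq == NULL)
        abort();
    rq->input = input;
    if (pthread_create(&rq->thread, NULL, worker, rq) != 0)
        abort();
    return rq;
}

/* Main thread: the state was deleted; do NOT free, just remember it. */
static void cancel(struct request *rq)
{
    rq->cancelled = true;
}

/*
 * Main thread: wait for the reply, then release the request.
 * Returns true (and stores the result) only if it was not cancelled.
 */
static bool collect(struct request *rq, int *out)
{
    void *reply = NULL;
    if (pthread_join(rq->thread, &reply) != 0)
        abort();
    assert(reply == rq);
    bool used = !rq->cancelled;
    if (used)
        *out = rq->output;
    free(rq);   /* the only place a request is ever freed */
    return used;
}
```

Because collect() is the only function that frees, a cancelled request
cannot be released while the worker is still writing to it.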

> | So here's my solution:
> |
> | Accept that pointers are being passed and make it work:
> |
> | - try to apply the dogma that state and workers share no pointers
> | (currently MD violates this) so there is no question as to who is
> | responsible for releasing stuff
>
> I suspect that the MD need only be owned by one actor at a time.
>
> Clearly the main thread needs the MD most of the time.  But probably
> not during "suspension" of a state.  That's when the worker could have
> ownership.  I'm guessing that the only worker-access needed is for
> encryption/authentication of the packet itself.

As an aside, I find "suspended" confusing.  The state transition is
still in progress.

> | - handle cleaning up after an abort with a separate callback, and run
> | this from the main thread
>
> The original design of the continuation mechanism is that failure and
> success took the same path.  This seems surprising but it really cut
> down combinatorial explosion.

What combinatorial explosion?  Each offloaded task needs to deal with
two outcomes:

- finished, where everything created by the worker gets copied into
  (used by) state; and there is always a state
- aborted, where everything created by the worker gets deleted; and
  there is never a state

A look at the existing callbacks shows they have all been trying to
handle these two cases using some convoluted variation on the
boilerplate code:

  if (st == NULL) {
      ... oops ...
      return;
  }
  push_cur_state(st);
  ....
  pop_cur_state(st);

(yes, every callback seems to be different) better to do it right, once.
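
One way "do it right, once" could look, as a sketch only: a single
main-thread dispatcher tests for the missing state and calls either a
finished or an aborted handler.  Every name here (struct task_handler,
complete_task(), the simplified struct state, the cur_state variable
standing in for push_cur_state()/pop_cur_state()) is hypothetical and
not taken from pluto.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not pluto's real types. */
struct state { int serialno; int result; };
struct task_result { int value; };

static int live_results;        /* unreleased results; for leak checks */
static struct state *cur_state; /* stand-in for push/pop_cur_state() */

static struct task_result *new_result(int value)
{
    struct task_result *r = malloc(sizeof(*r));
    if (r == NULL)
        abort();
    r->value = value;
    live_results++;
    return r;
}

static void free_result(struct task_result *r)
{
    free(r);
    live_results--;
}

/* Each offloaded task supplies exactly two outcomes. */
struct task_handler {
    /* the state still exists: move the worker's output into it */
    void (*finished)(struct state *st, struct task_result *r);
    /* the state was deleted mid-crypto: only release the output */
    void (*aborted)(struct task_result *r);
};

/* Runs on the main thread; the one place that tests st == NULL. */
static void complete_task(const struct task_handler *h,
                          struct state *st, struct task_result *r)
{
    assert(h != NULL && r != NULL);
    if (st == NULL) {
        h->aborted(r);
        return;
    }
    struct state *old = cur_state;
    cur_state = st;             /* "push" */
    h->finished(st, r);
    cur_state = old;            /* "pop" */
}

/* Example handlers for one hypothetical task. */
static void example_finished(struct state *st, struct task_result *r)
{
    st->result = r->value;
    free_result(r);
}

static void example_aborted(struct task_result *r)
{
    free_result(r);
}

static const struct task_handler example_handler = {
    .finished = example_finished,
    .aborted = example_aborted,
};
```

With this shape a finished handler can assume a state and an aborted
handler can assume none, so neither needs its own NULL check.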

> It would be good if abort (death or assassination) used the same path.
> Then it is always the continuation that manages resources.
>
> When I added the continuation mechanism to pluto almost 20 years ago,
> the idea of continuation was not mainstream.  It has been hard for
> programmers to get their head around.  But in 2017 it ought to be in
> every programmer's toolkit.  Of course C doesn't help at all.

Yea.  While implementing protocols using states and events has been
mainstream in the Telco S/W Engineering space for a very, very long
time, it is less so elsewhere.

> We really need to cultivate simplicity.  One major part of that is
> cutting down code paths.  Especially untested ones.  Error-handling is
> generally the least exercised.
>
> The continuation mechanism could be replaced.  The obvious way would
> be to add substates (really: finer-grained states) to the state
> mechanism.  Currently state transitions are triggered by incoming
> packets.  Except for the anomaly of the initiator's first packet,
> every State Transition Function is invoked for an input packet aimed
> at that state.  Timeouts and whack commands can have effects on states
> too but don't invoke STFs.  We could add triggering for worker
> completion.  Everything currently held in any continuation would have
> to be stuck in struct state or reachable from it.  I'm guessing that
> the result might be easier to understand but it would be a lot of
> work.