[Swan-dev] Leaks when killing states during crypto; time to drop WIRE_*?

Thu Dec 7 04:01:36 UTC 2017

On 6 December 2017 at 18:12, D. Hugh Redelmeier <hugh at mimosa.com> wrote:
> | From: Andrew Cagney <andrew.cagney at gmail.com>
>
> | > Clearly the main thread needs the MD most of the time.  But probably
> | > not during "suspension" of a state.  That's when the worker could have
> | > ownership.  I'm guessing that the only worker-access needed is for
> | > encryption/authentication of the packet itself.
> |
> | Yes, the state transition code uses MD (as a parameter).  It doesn't
> | need to be stuffing it in st_suspended_md or conversely having state
> | stuffed into md.st.
>
> Historically, md has a pointer to the current struct state.  I know
> you've changed this, but I'm not sure why that's a win.
>
> The idea was that md would carry all that was needed by a State
> Transition Function.

That's what I suspected - borrow object for A to also do B.  Better to
create object for B.

> It is true that the rest of the msg_digest was the sliced and diced
> packet and that the state object is different.

Right.  So why mix it.

> The state pointer in md was kind of cheap: it did not "own" the state,
> so that freeing the md did not require freeing the state.

Someone new to this code should be allowed the reasonable expectation
that a function operating on state, takes state as the [first]
parameter.  Hiding it in MD is confusing; not cheap.

> On the other hand, if the state were freed (killed), the md needed to
> be deleted because nothing else "owned" the md.

Or ST isn't there at all and the need for this goes away.

As an exercise, pick a random crypto continuation function, and then
change its function signature to:

 typedef void crypto_req_cont_func(struct state *st, struct msg_digest *md,
                                  struct pluto_crypto_req *);

and then, assuming ST!=NULL and correct, and assuming the caller
sets/resets the context correctly, see how much accumulated guff
and/or sillyness such as:

    passert(ke->pcrc_serialno == st->st_serialno);    /* transitional */
    passert(st != NULL); /* too late! */
    passert(st->st_suspended_md == ke->pcrc_md);

can be removed

> If a state transition were suspended, the md (if any) needs to be
> preserved and remembered: for the restarted the state transition and
> for not leaking (kind of related). The md is preserved by nulling out a
> pointer to it in the framework so that the framework won't free it at
> the end of the (incomplete) state transition.  It is remembered by
> being placed in st_suspended_md.

(I wish our memory manager referenced counted)

It is always in .pcrc_md - this is because it gets passed into new_pcrc()

I thought it was also always redundantly in .st_suspended_md; but then
I came across the comment:
     * If there was a packet received while we were calculating, then
     * process it now.
     * Otherwise, the result awaits the packet.
in main_inI2_outR2_calcdone()

> So it a state object is killed, it is easy to find any md attached to
> it by suspension.  The maintenance of the md's pointer to the state
> falls easily out of that.

Or delete it and delete the need to maintain this.

> You sped up serial number to state mapping.  No
> longer a linear search.  That mechanism wasn't intended to be used
> much.  But: are the uses that caused performance issues really
> reasonable?  I don't know.

In the profile I saw it was "hot", yet, to your point, it shouldn't
have even appeared in the profile.  There was clearly a lot of states.

It also means that saving a state as so_serial_t and then later trying
to turn it into a state is cheap; and much simpler, I think, than
trying to chase down stray state pointers.

> Scanning all existing states is linear.  There seem to be a fair
> number of casess where that is used.  Can that be sped up by
> datastructuctures that let us only look at the relevant states?

If it isn't in the path used when creating a connection it doesn't matter.

For instance, deleting a connection and all states is expensive; but not common.

> Performance measurement showed that sanitizing the log messages took a
> non-trivial amount of time.  You ditched the sanitization.  It
> probably wasn't necessary, but it was part of defense in depth.
> Alternatively, we could have found out why it was so slow.  My guess:
> something to do with locales, something that should have been easy to
> cut out of the processing.  If sanitizing the log messages took
> noticeable time, we were doing too much logging -- something expensive
> in itself.

>From memory:
- needless locking
- redundant memory moves
- sanitizer

> | However, one thing that I do think should go is that extra code path
> | handling STF_INLINE.  When there are no threads, use the event loop to
> | offload the work.
>
> I don't remember that clearly, but I think that's exactly what
> STF_INLINE is actually doing.

No.  It calls the code inline i.e., directly; so another code path
(with different bugs - for instance cur-state isn't correctly
saved/restored).

Anyway I think I've hacked things to the point where I can easily add
more work to the crypto pool.

Andrew

PS: And I suspect I found another leak, the continuation object when STF_FAIL