[Swan-dev] Leaks when killing states during crypto; time to drop WIRE_*?

Andrew Cagney andrew.cagney at gmail.com
Tue Dec 5 15:55:24 UTC 2017


First, here's my rewritten version of history:

In the beginning, all of pluto's crypto material was stored in chunks.
Hence, when it came to off loading cryptography to a thread or even
process(!), all that was needed was to serialize the chunks into a big
blob and stuff that into a pipe(2) pointing at the workers  Later,
when the work was done, a serialized reply containing everything was
returned in another pipe(2).   Pointers were easy, there weren't any.
Memory Management was easy - delete the blob.  Killing a state is
easy, wait for crypto to finish and then, see 'Memory Management'.

The wire stuff implemented this. Life was good (if a little complicated).

But then along came NSS.

The problem with NSS is that, unlike chunks, it really isn't amenable
to being serialized (yes it can technically be done).  Instead, crypto
material is locked up in PK11SymKeys and tracked using pointers and
reference counts.  Instead of serialized chunks, the references are
sent to/from the workers.  Quickly, the idea spread (it was, after
all, much easier than trying to understand all the WIRE stuff and
seemed to work), instead of wire chunks, normal chunks (aka plain
pointers) started being passed to/from workers as well.

Life still seemed pretty good....

The problem was, when pluto is overloaded it will kill states
mid-crypto and there is no code to clean up these pointers (if the
code is there it isn't obvious),  For instance, a grep of IKEv2's
skeyid_ai finds:

./crypt_start_dh.c:289:        free_any_const_nss_symkey(dhv2->skeyid_ai);
./crypt_start_dh.c:304:        st->st_skey_ai_nss = dhv2->skeyid_ai;
    these two are in finish_dh_v2() but that is only called during a
normal state transition

./pluto_crypt.h:213:    PK11SymKey *skeyid_ai;
./ikev2_prf.c:266:        &skr->skeyid_ai,       /* output */

and, when the crypto finishes and ikev2_crypto_continue() sees that
the state is gone, it only deletes the MD(1).

More recently, changes such as
9f32986f3c979e304ee58a4bce0b034059f58da2 and
016e9e9a27e34164ed09eafc644df9d6bace94c5 mean that not even the MD is
deleted(?).

So here's my solution:

Accept that pointers are being passed and make it work:

- try to apply the dogma that state and workers share no pointers
(currently MD violates this) so there is no question as to who is
responsible for releasing stuff
- handle cleaning up after an abort with a separate callback, and run
this from the main thread
- in the case of IKEv2 DH replies, at least, delete the wire stuff as
it is just adding to the general confusion

Andrew

(and all I wanted for Christmas was to more off-loaded crypto; sigh)


More information about the Swan-dev mailing list