[Swan-dev] why do we bother checking out_raw() et.al.'s result?
andrew.cagney at gmail.com
Sun Sep 23 16:38:20 UTC 2018
On Sat, 22 Sep 2018 at 14:19, D. Hugh Redelmeier <hugh at mimosa.com> wrote:
> | From: Andrew Cagney <andrew.cagney at gmail.com>
> | > There are no good exception mechnisms.
> | There is, its called abort(). Part of libreswan's start up is dealing
> | with any mess from a previous abort.
> Abort kills all operations. I have no trouble advocating abort/assert
> when we go into unknown territory. When an invariant turns out not to
> be true.
Such as the buffer overflowed?
> Packet output problems are not uncontemplated. They are a kind of
> resource exhaustion. (Although the last time I hit one was actually
> due to a bug in the calling code (that I had just writtten); an
> assertion failure would have been slightly more convenient to debug.)
> The packet output problems are currently handled in a way that stops a
> particular negotiation. Nothing else is affected.
I'm really not sure what you mean by 'stops a particular negotiation'
means. For instance if an IKEv2 AUTH response comes back with a fail,
we migh stop the negotiation, re-start the negotiation, or do some
funky negotation (a notify exchange to delete things).
> | > In security code, it is good to stop when things are going wrong. Not
> | > get further into the weeds.
> | The real enemy of security is unnecessary complexity.
> | So how is this:
> | return ikev1_out_generic(np, &isakmp_keyex_desc, outs, &z) &&
> | out_zero(g->len, &z, "fake g^x") &&
> | (close_output_pbs(&z), TRUE);
> | which goes hand-in-hand with code trying to propagate 'fatal' up the
> | stack is somehow less weedy than:
> | ikev1_out_generic(np, &isakmp_keyex_desc, outs, &z);
> | out_zero(g->len, &z, "fake g^x");
> | close_output_pbs(&z);
> | After all, if it really is fatal, abort().
> It's not fatal.
> For example, an adversary might be able to drive us into such an
> overflow. The way things are now, only the negotiation he's cast his
> evil spell upon will fail. No general DOS.
right, #2 below
> | I made these points:
> | - calling an out function after a fail will core dump
> | hopefully that isn't true - that code path needs to be robust
> | regardless (I'm sure if I look I can find more code like the above
> | where a status is ignored)
> Probably. Or an error can be lost (not buffer full errors, but
Such as, can you be specific? If we can't enumerate potential
failures when emitting a payload then I suspect we've a problem.
Perhaps out_struct() can screw up? Regardless of this thread, we
should ensure that the first error is sticky. Something that can be
tested (even if all a re-call does is trigger an abort()?).
A case where care does need to be taken is in the packet-encrypt code.
It should probably check that the un-encrypted packet built ok before
trying to encrypt it. Since IKEv each only have one encryption
code paths, checking this should be easy, Oh! wait ...
> | - pluto must never knowingly send out a corrupt packet (reflection
> | attack and all that)
> | (but I want to :-) so drop the packet as it goes out the door
> During state transition functions, we modify the internal state of our
> system (XFRMs, for example). Quiting an STF as soon as an error is
> discovered prevents such changes, that either matters or it doesn't.
> I don't wish to audit the code to determine which.
I don't want to be the bearer of bad news, but for IKEv2 I _know_ we
screw this up. The STF_* model fails.
For instance, during an IKEv2 AUTH the initiator may find itself
wanting to: accept the packet, transition the IKE state to
"established", delete the child (it shouldn't exist), and starting the
process of cleanly deleting the parent (which means sending a notify).
STF_* can't describe all that. Nor do I think it should.
> | - the state machine will get confused because a packet was sent when it wasn't
> | First, this is no different to a MITM attack (where packets get
> | corrupted/dropped without our knowledge) - we get to deal.
> Normally: "deal" means retransmit the packet. But, oops, we don't
> have one.
But we do, we just don't send it.
> | Second, if the SA should be killed, then inject a KILL-SA event rather
> | than trying to handle it in-line.
> I don't know what this means. It doesn't sound conservative.
See above for an AUTH exchange. If the initiator rejects auth it
needs to transition to initiate a delete, most straight forward way of
doing that is injecting a delete parent event to trigger the process.
More information about the Swan-dev