[Swan] ike_frag= options - what should it mean and do?

Thu Feb 14 10:15:50 EET 2013

| From: Paul Wouters <pwouters at redhat.com>

| ike_frag=no|yes|force with the default being yes

I wonder if that is the best default.

| - When yes or force, send the FRAGMENTATION vendorid

And not on frag=no

| - When yes, on retransmit for packets > FRAG_LEN, send in fragments

| - when force, don't wait for retransmit, do it right away (and don't care
| about seeing vendorid)

I think that it should require having seen the vendorid.

| - When no, don't send vendorids, don't send fragments (but assemble
|   received fragments)
| 
| These do not seem the right choices now that we are doing some tests
| 
| Imagine your roadwarrior "road" knows it is on a bad network. It can set
| ike_frag=force so it does not have to wait on that one retransmit. But
| the other end does not use force, so on its answer, there is still a
| packet being lost and re-transmitted. This could be resolved if we not
| only remember receiving a vendorid, but also remember if the peer has
| sent us fragments.
| 
| If so, we should probably assume the link problem is
| symmetrical and fragment immediately instead of waiting on a retransmit.

Interesting.  Should we infer the maximum fragment size from the size
of fragments sent?

| My thoughts right now are leaning towards:
| 
| - Never ignore vendorid. If we don't see it from the peer, don't send
|   fragments.

That seems right.

| - When using force, fragment without waiting for a retransmit

OK.

| - When using yes, fragment when retransmitting if vendorid was seen,

OK.

| or
|   fragment immediately when we received fragments from the peer already.

The theory being: if he knows that he has to fragment, surely we have
to fragment too?

How should this be remembered?  Only for this negotiation?  Perhaps
the connection should be adjusted for future negotiations?  Probably
the former.

Is the Initiator or the Responder likely to be the first to have to
fragment?

Is more than one packet from each side likely to need fragmentation?
If so, if the fragmented resend worked, should fragmentation be turned
on for subsequent messages?

I'll ignore this optimization in the rest of what I say.  Not because
it is bad, just for simplicity.

| - When using no, don't send fragmentation vendorid and don't send
|   fragments. (be Postel on receiving fragments)

OK.
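
To check that I'm reading the proposal correctly, here it is as a rough
sketch.  Python is used only for brevity; should_fragment, send_vendorid
and the argument names are invented for illustration and are not Pluto
identifiers.  FRAG_LEN is set to the 552 octets the code currently uses.
The "peer has already fragmented" rule is assumed to be remembered only
for the current negotiation.

```python
FRAG_LEN = 552  # octets; the value the code currently uses


def send_vendorid(policy):
    """Advertise the fragmentation vendorid for yes and force only."""
    return policy in ("yes", "force")


def should_fragment(policy, packet_len, vid_seen, peer_fragmented, retransmit):
    """Return True if an outgoing IKE packet should be sent as fragments.

    policy is one of the ike_frag= values: "no", "yes" or "force".
    """
    if policy not in ("no", "yes", "force"):
        raise ValueError("unknown ike_frag value: %r" % (policy,))
    if policy == "no":
        return False  # never send fragments
    if packet_len <= FRAG_LEN:
        return False  # small enough to send whole
    if not vid_seen:
        return False  # never ignore a missing vendorid
    if policy == "force":
        return True   # do not wait for a retransmit
    # policy == "yes": fragment on retransmit, or right away if the
    # peer has already sent us fragments in this negotiation
    return retransmit or peer_fragmented
```

Under these rules, with ike_frag=yes and the vendorid seen, a 1000-octet
packet goes out whole on the first try and as fragments on the retransmit,
unless the peer has already sent us fragments.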

| There is the corner case of not seeing a vendorid but receiving
| fragments. I don't know of any such implementations. However, currently
| we never refuse to assemble fragments, even if we did not see a
| vendorid. This _could_ become a security issue, although the code is
| pretty restrictive. We don't allow more than 16 fragments before giving
| up. Worst case they give us a 552*16 garbled IKE packet, but they might
| as well send us 1500+ byte garbled IKE packets...

I don't see a threat, but there might well be one, since fragmentation
isn't cryptographically protected (but the assembled packet is
protected to the level specified by IKE).
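
The kind of limit Paul describes could look roughly like this.  It is a
sketch, not the Pluto code: Reassembler and its add method are invented
names, and I'm assuming fragments are numbered from 1 with a flag on the
last one.  With 552-octet fragments, the cap of 16 works out to 8832
octets.

```python
MAX_FRAGMENTS = 16  # limit mentioned in the quoted text
FRAG_LEN = 552      # octets per fragment in the current code


class Reassembler:
    """Collect the fragments of one message; refuse numbers past the cap."""

    def __init__(self):
        self.frags = {}   # fragment number -> bytes
        self.last = None  # number of the fragment flagged as last

    def add(self, number, is_last, data):
        """Return the assembled packet once complete, else None.

        Raises ValueError if the fragment number is outside 1..MAX_FRAGMENTS.
        """
        if not 1 <= number <= MAX_FRAGMENTS:
            raise ValueError("fragment number out of range")
        self.frags[number] = data
        if is_last:
            self.last = number
        if self.last is None:
            return None  # have not seen the final fragment yet
        wanted = range(1, self.last + 1)
        if any(n not in self.frags for n in wanted):
            return None  # still missing a fragment
        return b"".join(self.frags[n] for n in wanted)


print(MAX_FRAGMENTS * FRAG_LEN)  # worst case with 552-octet fragments
```

Nothing here authenticates a fragment; the assembled packet still has to
pass the normal IKE checks afterwards.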

| What racoon has for yes/no/force is kinda strange to me:
| 
|              ike_frag (on | off | force);
|                      Enable receiver-side IKE fragmentation if racoon(8) has
|                      been built with this feature.  If set to on, racoon will
|                      advertise itself as being capable of receiving packets
|                      split by IKE fragmentation.  This extension is there to
|                      work around broken firewalls that do not work with
|                      fragmented UDP packets.  IKE fragmentation is always
|                      enabled on the sender-side, and it is used if the peer
|                      advertises itself as IKE fragmentation capable.  By selecting
|                      force, IKE Fragmentation will be used when racoon is
|                      acting as the initiator even before the remote peer has
|                      advertised itself as IKE fragmentation capable.
| 
| Having the on/off be different for receiver/sender seems weird to me.

That description is hard to understand.

I assume that by "receiver side" they don't mean Responder side, but
refer to the code to receive a message, in either Initiator or
Responder mode.

It seems to say that any explicit setting of ike_frag enables
receiver-side IKE fragmentation.

- a receiver doesn't fragment, it assembles.

- the obvious implication is that not specifying ike_frag disables
  acceptance of fragments

Clearly the vendorid can only advertise acceptance of fragments, not
threaten generation of fragments.

Does sending a fragmentation vendorid also imply somehow that the
sender knows that the path has a dumb firewall and thus it is
recommending fragmentation?  I don't think so.  Maybe another
vendorid could be invented for that purpose.

As far as the sender side is concerned, why is "IKE fragmentation
always enabled?"  I think that it might actually mean the code is
always compiled into Racoon, not that it will actually always
fragment.

In more sensible terminology, I think that Racoon will only send
fragments if it has seen the vendorid from the far side OR ike_frag is
set to force.

The description suggests that the vendorid is only sent if ike_frag is
set to on.  Surely it should also be sent if ike_frag is force.  But
it doesn't say that.

How does this description compare with the actual Racoon behaviour?

| Note that on the first packet of an exchange, we cannot reassemble
| because we have no state object to match it with, so we currently
| don't support that and the packet is lost. However, the first packet
| is never that big, and if you cannot do UDP packets of say 1200 MTU,
| then even establishing the tunnel will be pretty useless, as all traffic
| in that tunnel will hit an even worse MTU limit (due to ESPinUDP and
| tunnel mode overhead)
| 
| Thoughts?

The code has a wired-in size for fragments (ISAKMP_FRAG_MAXLEN).  What
should that size be?  Should it be wired-in?

Guess: the size might as well be wired-in since the user usually has
no reason or expertise to control it.  Interesting point: the vendorid
payload doesn't specify the fragment size.  Too bad that PMTU
discovery fails so often, generally in those cases where we need to do
fragmentation (i.e. non-compliant firewalls).

There is a guaranteed-to-work fragment size that is small.  And there
is a pragmatic, minimum size.  Currently the code uses 552 octets
(very small).  Apple seems to use something like 1200.  What are
observed limits in the field?
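
To put rough numbers on the two sizes, here is a small calculation.  The
3000-octet packet is a made-up size (think of a message carrying
certificates), fragments_needed is an invented helper, and per-fragment
header overhead is ignored, so real counts could be slightly higher.

```python
def fragments_needed(packet_len, frag_len):
    """Fragments required for packet_len octets, ignoring fragment headers."""
    return -(-packet_len // frag_len)  # ceiling division


for frag_len in (552, 1200):
    print(frag_len, fragments_needed(3000, frag_len))
# prints:
# 552 6
# 1200 3
```

Fewer fragments means fewer chances for one to be lost, which argues for
the larger size wherever the path can carry it.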

Since libreswan doesn't handle fragmented first packets, perhaps the
code should ensure one isn't generated.

Note: fragmentation is only supported for IKEv1.  Is this reasonable?

================

Who is responsible for figuring out that there is a stupid firewall in
the path?

- the current experimental version of the Pluto code depends on the
setting of frag=

  "off" means don't fragment.

  "on" means "if the other side sent a vendorid and you have to retry
  sending a packet and the packet was large, resend as fragments"

  "force" means "if the other side sent a vendorid and you have to
  send a large packet, send as fragments"

I think that there are four cases, in increasing severity:
1) we don't care about broken firewalls: we assume them away
2) there might be a broken firewall and we want to survive this.
3a) we know the other side is behind a broken firewall
3b) we know we're behind a broken firewall

3a and 3b are almost the same.  The difference is that the policy in 3a
is per-connection whereas in 3b it is for all connections.  From now
on, I'll treat them as the same.

Here's what makes sense to me in each case:

1: act as if fragmentation doesn't exist (frag=off)

2: act as frag=on now specifies

3: act as frag=force now specifies

Maybe the three settings ought to be written "no", "on-resend", and "yes". 
This would make the meanings clearer.  And not look so much like Racoon's 
settings with different meanings.

Should we always advertise (via vendorid) that we accept fragmentation?

  Argument against: all our current customers (by definition) don't
  need fragmentation.  Their negotiations will be slowed down if
  the vendorid is taken as a request, not an offer.

  Argument against: probably we should not do so in case 1, frag=off

Should there be another vendorid that means "please fragment" as opposed 
to the current one that means "you may fragment"?  We should send it in 
case 3 (frag=force). That would allow the other side to know that we think 
fragmentation is required.  In effect, we're negotiating up from frag=on 
to frag=force for the other side.
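
A sketch of how that negotiation might work, purely to make the idea
concrete.  The second vendorid does not exist; "may-fragment" and
"please-fragment" are placeholder labels, and both function names are
invented.  I'm assuming a peer that sends "please fragment" also sends
"you may fragment", and that frag=off sends neither.

```python
def vendorids_to_send(local_policy):
    """Vendorids we would advertise for a given frag= setting."""
    if local_policy == "off":
        return []
    vids = ["may-fragment"]            # the existing vendorid
    if local_policy == "force":
        vids.append("please-fragment")  # hypothetical new vendorid
    return vids


def effective_policy(local_policy, peer_may_fragment, peer_please_fragment):
    """Policy to apply when sending to this peer."""
    if local_policy == "off" or not peer_may_fragment:
        return "off"    # never fragment without the peer's offer
    if local_policy == "on" and peer_please_fragment:
        return "force"  # peer believes fragmentation is required
    return local_policy
```

A side configured with frag=on that sees both vendorids would then
fragment large packets immediately, without waiting for a retransmit.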