[Swan] Proposal to remove force_keepalive= global option

Paul Wouters pwouters at redhat.com
Thu Apr 4 19:03:07 EEST 2013


While adding a per-conn option for nat_keepalive=yes|no (default yes,
matching current behaviour) I noticed that the global force_keepalive=
is actually a pretty strange option.

It is severely limited in scope (and handled with some ugly globals). It
is only used in the nat_traversal_ka_event_state() function, in three
places:


         if ( IS_ISAKMP_SA_ESTABLISHED(st->st_state)
              && (st->hidden_variables.st_nat_traversal & NAT_T_DETECTED)
              && ((st->hidden_variables.st_nat_traversal & LELEM(NAT_TRAVERSAL_NAT_BHND_ME))
                  || (_force_ka)))

[...]

         if ( ((st->st_state == STATE_QUICK_R2)
               || (st->st_state == STATE_QUICK_I2))
              && (st->hidden_variables.st_nat_traversal & NAT_T_DETECTED)
              && ((st->hidden_variables.st_nat_traversal & LELEM(NAT_TRAVERSAL_NAT_BHND_ME))
                  || (_force_ka)))

[...]


                if ((st_newest)
                     && ((st_newest->st_state==STATE_QUICK_R2)
                         || (st_newest->st_state == STATE_QUICK_I2))
                     && (st_newest->hidden_variables.st_nat_traversal & NAT_T_DETECTED)
                     && ((st_newest->hidden_variables.st_nat_traversal & LELEM(NAT_TRAVERSAL_NAT_BHND_ME))
                         || (_force_ka)))

In other words, its only purpose seems to be to override the restriction
that we send keepalives only when we detect NAT_TRAVERSAL_NAT_BHND_ME. With
it set, we also send them when only NAT_TRAVERSAL_NAT_BHND_PEER is detected.
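
As a rough sketch of what the three conditions above have in common, the
shared tail could be expressed as a single predicate. This is not the
actual pluto code: the flag names and should_send_keepalive() are made up
for illustration, and the state checks that differ between the three call
sites are left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the pluto NAT-T flags; not the real definitions. */
#define NAT_BEHIND_ME    0x1u   /* we are behind a NAT */
#define NAT_BEHIND_PEER  0x2u   /* the peer is behind a NAT */
#define NAT_DETECTED     (NAT_BEHIND_ME | NAT_BEHIND_PEER)

/*
 * Hypothetical helper: the part of the condition that all three call
 * sites share. Without force_ka we only send when we are the side
 * behind NAT; with force_ka we send whenever any NAT was detected.
 */
static bool should_send_keepalive(unsigned nat_flags, bool force_ka)
{
    if ((nat_flags & NAT_DETECTED) == 0)
        return false;           /* no NAT on the path: nothing to keep alive */

    return (nat_flags & NAT_BEHIND_ME) != 0 || force_ka;
}
```

With these assumptions, should_send_keepalive(NAT_BEHIND_PEER, false) is
false and should_send_keepalive(NAT_BEHIND_PEER, true) is true, which is
the only case where the global makes a difference.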

from RFC 3947:

3.2.

[...]
    The location of the NAT device is important, as the keepalives have to
    initiate from the peer "behind" the NAT.

7.  Recovering from the Expiring NAT Mappings

    There are cases where NAT box decides to remove mappings that are
    still alive (for example, when the keepalive interval is too long, or
    when the NAT box is rebooted).  To recover from this, ends that are
    NOT behind NAT SHOULD use the last valid UDP encapsulated IKE or
    IPsec packet from the other end to determine which IP and port
    addresses should be used.  The host behind dynamic NAT MUST NOT do
    this, as otherwise it opens a DoS attack possibility because the IP
    address or port of the other host will not change (it is not behind
    NAT).

    Keepalives cannot be used for these purposes, as they are not
    authenticated, but any IKE authenticated IKE packet or ESP packet can
    be used to detect whether the IP address or the port has changed.

from RFC 3948:

4.  NAT Keepalive Procedure

    The sole purpose of sending NAT-keepalive packets is to keep NAT
    mappings alive for the duration of a connection between the peers
    (see [RFC3715], Section 2.2, case j).  Reception of NAT-keepalive
    packets MUST NOT be used to detect whether a connection is live.

    A peer MAY send a NAT-keepalive packet if one or more phase I or
    phase II SAs exist between the peers, or if such an SA has existed at
    most N minutes earlier.  N is a locally configurable parameter with a
    default value of 5 minutes.

    A peer SHOULD send a NAT-keepalive packet if a need for it is
    detected according to [RFC3947] and if no other packet to the peer
    has been sent in M seconds.  M is a locally configurable parameter
    with a default value of 20 seconds.
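
The timing rules quoted from RFC 3948 above can be sketched roughly as
follows. Only the defaults for M (20 seconds) and N (5 minutes) come from
the RFC; struct peer_activity and both functions are hypothetical names
for illustration, not anything that exists in pluto.

```c
#include <stdbool.h>
#include <time.h>

#define KA_IDLE_SECS_M   20         /* RFC 3948 default for M */
#define KA_GRACE_SECS_N  (5 * 60)   /* RFC 3948 default for N */

/* Hypothetical per-peer bookkeeping. */
struct peer_activity {
    bool   sa_exists;         /* a phase I or phase II SA currently exists */
    time_t last_sa_seen;      /* last time such an SA existed */
    time_t last_packet_sent;  /* last time we sent anything to the peer */
    bool   keepalive_needed;  /* need detected according to RFC 3947 */
};

/* MAY send: an SA exists, or one existed at most N minutes ago. */
static bool keepalive_permitted(const struct peer_activity *p, time_t now)
{
    return p->sa_exists || (now - p->last_sa_seen) <= KA_GRACE_SECS_N;
}

/* SHOULD send: a need was detected and nothing was sent for M seconds. */
static bool keepalive_due(const struct peer_activity *p, time_t now)
{
    return keepalive_permitted(p, now)
        && p->keepalive_needed
        && (now - p->last_packet_sent) >= KA_IDLE_SECS_M;
}
```

Note that nothing here uses received keepalives to judge liveness, in
line with the MUST NOT in the first paragraph of the quoted section.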



This raises two questions for me:

1) What are we supposed to do when BOTH sides are behind NAT? Currently,
    against RFC 3947 section 7, we send keepalives. But the RFC does not
    tell us what to do in this case. I think this behaviour is okay, and
    any DoS attack can be avoided by not placing _both_ endpoints behind
    a NAT. There is not really a choice in behaviour, because in this case
    each end is responsible for keeping _one_ NAT mapping open, on its own
    NAT gateway.

2) I cannot think of a legitimate reason for force_keepalive=yes, as a
    NAT gateway should only extend the lifetime of a NAT mapping based on
    packets sent by the client behind the NAT. So if that client vanishes,
    a remote peer cannot keep the NAT mapping open with no client at the
    other end.


Therefore, I suggest we remove the entire global force_keepalive= option.
This will leave us with a per-connection nat_keepalive=yes|no option,
which defaults to "yes" and works as per RFC 3947/3948.

The alternative could be to change the per-conn option to emulate the
behaviour of force_keepalive by allowing nat_keepalive=yes|no|force, but
I would rather avoid that if we have no use-case for it.
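
For comparison, a minimal sketch of what the three-valued alternative
might look like. The enum, the parser and conn_sends_keepalive() are all
hypothetical; they only illustrate the yes|no|force semantics described
above and do not reflect how pluto parses its configuration.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical per-connection setting for nat_keepalive=yes|no|force. */
enum nat_keepalive { NAT_KA_NO, NAT_KA_YES, NAT_KA_FORCE };

/* Parse the option value; returns false for anything unrecognised. */
static bool parse_nat_keepalive(const char *value, enum nat_keepalive *out)
{
    if (strcmp(value, "no") == 0)
        *out = NAT_KA_NO;
    else if (strcmp(value, "yes") == 0)
        *out = NAT_KA_YES;
    else if (strcmp(value, "force") == 0)
        *out = NAT_KA_FORCE;
    else
        return false;
    return true;
}

/*
 * "yes" sends only when we are the side behind NAT (RFC behaviour);
 * "force" also sends when only the peer is behind NAT, emulating the
 * old global force_keepalive=yes for this one connection.
 */
static bool conn_sends_keepalive(enum nat_keepalive opt,
                                 bool nat_detected, bool behind_me)
{
    if (opt == NAT_KA_NO || !nat_detected)
        return false;
    return behind_me || opt == NAT_KA_FORCE;
}
```

If the proposal to drop "force" goes ahead, the enum collapses to a plain
boolean and the last line reduces to returning behind_me.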

Paul

