[Swan-dev] retry/re-transmit controls
Andrew Cagney
andrew.cagney at gmail.com
Wed Nov 15 19:29:11 UTC 2017
I've been trying to get my head around pluto's retry / re-transmit logic.
I was expecting something like:
    start:
        delay = r_interval
        cap = monoadd(now(), r_timeout)
        schedule(delay, retransmit)

    retransmit:
        if now() >= cap
            return done
        // MAXIMUM_RETRANSMITS_PER_EXCHANGE?
        delay = delay + delay // double
        // can be more than cap but that is ok
        schedule(delay, retransmit)
but, as you can guess, that isn't what I found. I'm going to push a change
to somewhat abstract/simplify the re-transmit logic and greatly increase
logging; here are my notes:
- because of a post-increment, the delay (r_interval * 2^^nr_retransmits)
  grows:
      r_interval, r_interval, r_interval*2, r_interval*4, ...
  I think this is a bug; it should be:
      r_interval, r_interval*2, r_interval*4, ...
- because the start time isn't saved, the code uses something like:
      if delay >= r_timeout
  to decide if r_timeout was exceeded; I'm guessing it was a good enough
  approximation
- pluto can also auto-reply when it receives a "duplicate"; and that is
  sometimes capped:
    - IKEv2 normally unlimited; and plays no part in the retransmit
      code; but ...
    - IKEv2 invalid KE; limited to MAXIMUM_INVALID_KE_RETRANS 3, because it
      is also re-transmitting
    - IKEv1 limited to MAXIMUM_v1_ACCEPTED_DUPLICATES 2 which seems very low
  I suspect the IKEv1 case should be unlimited (like IKEv2) when
  re-transmits are not happening; for instance, when in MAIN_R1.
- re-transmits can be impaired; but instead of dealing with this in 'start',
  viz.:
      start:
          cap = monoadd(now(), r_timeout)
          if (impaired)
              libreswan_log("IMPAIR: ...");
              schedule(r_timeout, retransmit)
          else ...
  it deals with it in the first re-transmit event at time r_interval, and
  only sends the log to whack(?!?) if the timer expires; I suspect this is
  because it was easier.
  Changing it to the above makes it more deterministic and usable as a way
  to really suppress re-transmits.
- since duplicate replies are counted as re-transmits, they feed into the
  re-transmit delay calculation - r_interval * 2^^nr_retransmits - and the
  effect is twofold:
    - future re-transmits are more spaced out
    - the total time is shortened (because of how the timeout test is
      performed) and can (I suspect) result in waiting for less than r_timeout?
  Puzzling; I suspect the second effect is unintended, and can be fixed by
  computing the timeout properly.
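
To make the delay and timeout notes above concrete, here is a small Python
sketch that compares the schedule I was expecting with a simplified model of
the behaviour described in the notes. It is illustrative only and is not
pluto's code: the function names (expected_delays, observed_delays), the
"duplicates" parameter and the unit-less example numbers are all assumptions,
and the model counts any duplicate replies up front, before the first timer
re-transmit, which is a simplification.

```python
def expected_delays(r_interval, r_timeout):
    """Delays scheduled by the algorithm sketched at the top of the post."""
    now = 0
    cap = now + r_timeout        # deadline fixed once, at start
    delay = r_interval
    delays = [delay]             # start: schedule(delay, retransmit)
    now += delay                 # the first retransmit event fires
    while now < cap:             # retransmit: give up once the deadline passed
        delay += delay           # double
        delays.append(delay)     # may overshoot cap; that is ok
        now += delay
    return delays


def observed_delays(r_interval, r_timeout, duplicates=0):
    """Delays under a simplified model of the behaviour the notes describe.

    Assumptions (hypothetical model, not pluto's code):
      - the delay is r_interval * 2**nr_retransmits, computed from the
        counter's value *before* it is incremented (the post-increment);
      - the timeout test is 'delay >= r_timeout', not a saved deadline;
      - each duplicate reply bumps nr_retransmits, applied up front here.
    """
    nr_retransmits = duplicates  # duplicate replies share the counter
    delays = [r_interval]        # initial schedule
    while True:
        delay = r_interval * 2 ** nr_retransmits  # uses the old count ...
        nr_retransmits += 1                       # ... then increments it
        if delay >= r_timeout:   # stand-in for a real deadline check
            break
        delays.append(delay)
    return delays


if __name__ == "__main__":
    cases = [
        ("expected", expected_delays(1, 10)),
        ("observed", observed_delays(1, 10)),
        ("observed, 3 duplicates", observed_delays(1, 10, duplicates=3)),
    ]
    for label, delays in cases:
        print(label, delays, "total wait:", sum(delays))
```

With r_interval=1 and r_timeout=10 the expected schedule is 1, 2, 4, 8, while
the model of the current code gives 1, 1, 2, 4, 8. With three duplicates
counted the model gives 1, 8, a total wait of 9, which is less than
r_timeout; that is the second effect noted above.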
Andrew