[Swan-dev] retransmit-interval and retransmit-timeout

Paul Wouters paul at nohats.ca
Sat Jan 24 01:10:21 EET 2015


On Fri, 23 Jan 2015, D. Hugh Redelmeier wrote:

> My reverse engineering of two new knobs:
>
> retransmit-interval is meant to allow the user to specify that libreswan
> should be quicker off the mark in issuing the first retransmission packet.
> (In units of milliseconds.)

This value has always been in the code, hardcoded. What Antony and I
wanted to do is make it at least a config setup option before we go
and dramatically change it to be much more aggressive than in the
past. Think of it as a fail-safe switch.

> Why is this useful?

It allows us to set the initial period from which the exponential
backoff starts.
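
As a rough illustration only (this is not pluto's code, and the function
name is made up for the example), doubling from a configurable initial
interval in milliseconds could be sketched like this:

```python
def backoff_delays(initial_ms, count):
    """Return the first `count` retransmit delays in milliseconds.

    Hypothetical helper: each delay is double the previous one,
    starting from the configured initial interval.
    """
    return [initial_ms * (2 ** i) for i in range(count)]


if __name__ == "__main__":
    # 500 is the current default mentioned in this thread; 20000 is the
    # old hardcoded 20 second value.
    for initial in (500, 20000):
        print(initial, backoff_delays(initial, 4))
```

The point is just that the initial value scales every later delay, so
it is the one number worth exposing.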

> 1) I think that Paul has said that iPhones lose the first packet when they
>   are asleep.  Apple users are impatient: our current retry isn't fast
>   enough.
>   [How soon can the retry be and still be received in this case?]

We did not think we had all the answers, and therefore wanted a little
flexibility in the new system. Right now we have it at 500ms, but we
really do plan to bring that down a lot before a release.

> 2) the old initial retry delay (10 seconds?) was too sluggish in the
>   modern world.  Even 1 second is considered too slow [by whom?
>   Why?].

By every end user in the world :/

And it was even 20 seconds. In fact, some iPhones would abort within 20
seconds, so any single packet loss would end in failure before the
retransmit.

>   [In the real world, how commonly are packets lost by systems where
>   1 second is too slow?]

I've already found that some "hangs" I saw with pluto were in fact
packet loss on my DSL link. I now see retransmits on my client
while I see no duplicate packets on my server. This code has already
proven that I was suffering from packet loss without knowing.

> Are these two reasons the same?
>
> Are there more reasons?
>
> If both are true, why not change the initial retry delay to 0.5 seconds
> for everyone?  Why make it configurable?

Because we currently do not believe we have the answer to all the
timings. It also gives us an emergency switch to tone things down if
it turns out to cause really big issues.

> ================
>
> retransmit-timeout is meant to say how long (in seconds) libreswan should
> keep waiting for an answer to a particular IKE message.
>
> The old code had wired-in the number of retransmissions it was willing to
> do.  After that, it would (under user control) retry the whole
> negotiation.
>
> Why is this new parameter useful?

Because depending on your initial interval, three retransmits can take
anywhere from 30ms to 80 seconds. So a retransmit count is not a useful
measure for users of how long they might want to wait.
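
To make that concrete, here is a small sketch (again not pluto's code;
the function name and the 60 second timeout are assumptions picked for
the example) that lists when retransmits would go out, with doubling,
before an overall timeout given in seconds expires:

```python
def retransmit_times_ms(initial_ms, timeout_s):
    """Return elapsed times (ms) at which retransmits would be sent.

    Hypothetical helper: the delay starts at `initial_ms` and doubles
    after every retransmit. A retransmit is only scheduled if it would
    go out strictly before the overall timeout.
    """
    timeout_ms = timeout_s * 1000
    times = []
    elapsed = 0
    delay = initial_ms
    while elapsed + delay < timeout_ms:
        elapsed += delay
        times.append(elapsed)
        delay *= 2
    return times


if __name__ == "__main__":
    # Same overall timeout, very different number of retransmits.
    for initial in (500, 20000):
        times = retransmit_times_ms(initial, 60)
        print(initial, len(times), times)
```

With the same timeout, a 500ms initial interval gets six retransmits
out and a 20 second one gets a single retransmit, yet the user waits
the same total time either way. That is what the timeout expresses and
a count does not.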

> Summary: I'd like to see a stronger case for this extra interface
> complexity

I hope the above clarifies it.

> I also don't much care for the names.
>
> retransmit-interval might be better named response-initial-deadline.  This
> name indicates that this is properly considered a deadline.

We didn't like the name too much either, but tried to avoid really long
names. We're open to suggestions.

> Intuitively, doubling seems a bit severe.  I admit that I introduced it
> to Pluto.  To be honest, I don't know that it matters very much.

I think it is very good, especially within the sub-second range. I agree
that once you pass a second or two, it becomes way too slow in practice.
But we're hoping to go down much lower than 500ms.

> Why exactly are we mucking about with retransmission counts?  Are we
> fixing an observed problem?  One that matters?  If so, what exactly is
> the problem (not the solution!).

See above. I've suffered from regular packet loss and I restarted pluto
because I thought it was hung and didn't want to wait 20s to find out.

Browser people who deal with user attention span talk about every single
roundtrip. Their users care about every 10ms. Also, if we attempt to do
OE and set up a handful of connections, we don't want the user to wait
"a few seconds". We have to succeed or fail fast.

Paul

