[Swan-dev] test failures due to IKE retransmissions

D. Hugh Redelmeier hugh at mimosa.com
Sun Oct 21 15:48:30 UTC 2018


My previous message to the list described the times taken to run our
test suite on three different machines.

This one focuses on tests failing due to unexpected IKE retransmissions.

This is so common that I have a procedure for rerunning tests that
failed due to exactly one IKE retransmission.  (Some fail with
multiple IKE retransmissions but I don't detect those.)

redtiny: 10
  ikev2-11-simple-psk
  ikev2-61-any-psk
  ikev2-algo-03-aes-ccm
  ikev2-algo-06-aes-aes_xcbc
  ikev2-ecdsa-01
  ikev2-hostpair-01
  ikev2-liveness-11-silent
  ikev2-x509-17-multicert-rightid-san-wildcard
  ikev2-x509-20-multicert-rightid-san-wildcard
  netkey-passthrough-02

redox: 3
  ikev2-ecdsa-01
  ikev2-x509-17-multicert-rightid-san-wildcard
  ikev2-x509-20-multicert-rightid-san-wildcard

redbird: 5
  ikev2-ecdsa-01
  ikev2-hostpair-01
  ikev2-x509-17-multicert-rightid-san-wildcard
  ikev2-x509-20-multicert-rightid-san-wildcard
  nss-cert-crl-03-strict

Notice that redox's set is a subset of redbird's, and that all of
redbird's except nss-cert-crl-03-strict also appear in redtiny's.
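
The overlap between the three lists can be checked mechanically.  Here
is a minimal sketch using Python sets; the test names are copied from
the lists above and nothing else is assumed.

```python
# Tests that failed due to exactly one IKE retransmission, per machine.
redtiny = {
    "ikev2-11-simple-psk",
    "ikev2-61-any-psk",
    "ikev2-algo-03-aes-ccm",
    "ikev2-algo-06-aes-aes_xcbc",
    "ikev2-ecdsa-01",
    "ikev2-hostpair-01",
    "ikev2-liveness-11-silent",
    "ikev2-x509-17-multicert-rightid-san-wildcard",
    "ikev2-x509-20-multicert-rightid-san-wildcard",
    "netkey-passthrough-02",
}
redox = {
    "ikev2-ecdsa-01",
    "ikev2-x509-17-multicert-rightid-san-wildcard",
    "ikev2-x509-20-multicert-rightid-san-wildcard",
}
redbird = {
    "ikev2-ecdsa-01",
    "ikev2-hostpair-01",
    "ikev2-x509-17-multicert-rightid-san-wildcard",
    "ikev2-x509-20-multicert-rightid-san-wildcard",
    "nss-cert-crl-03-strict",
}

# "<=" on sets is the subset test; "-" is set difference.
print("redox within redbird:  ", redox <= redbird)
print("redbird within redtiny:", redbird <= redtiny)
print("only on redbird:       ", sorted(redbird - redtiny))
print("fails on all three:    ", sorted(redox & redbird & redtiny))
```

With the lists as posted, redox's three tests fail on every machine, and
the only redbird failure that does not also show up on redtiny is
nss-cert-crl-03-strict.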

I blame the HDD -- what else is inferior about redtiny?

This makes the HDD system quite annoying.  Some changes that Andrew
made have reduced this effect considerably (Andrew: thanks!).  I have
a script that reruns tests that fail this way.

In general, I don't like the idea of "rerun a test until it passes":
this would hide some real errors that are non-deterministic.  On the
other hand, spurious error reports are a large waste of my time.
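
One middle ground between those two concerns is to rerun only tests
whose sole symptom was a retransmission, cap the number of runs, and
keep every outcome so a flaky pass is still visible afterwards.  The
following is a rough sketch of that policy, not the script mentioned
above: the outcome labels and the run_once callable are hypothetical
stand-ins for whatever the test harness really provides.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical outcome labels; the real harness reports results differently.
PASS = "pass"
RETRANSMIT_ONLY = "retransmit-only"   # failed solely on unexpected retransmissions
OTHER_FAILURE = "other-failure"       # failed for any other reason


def rerun_bounded(
    tests: Iterable[str],
    run_once: Callable[[str], str],
    max_runs: int = 2,
) -> Dict[str, List[str]]:
    """Run each test, rerunning only retransmit-only failures.

    Each test is run at most max_runs times.  The full list of outcomes
    is returned per test, so a pass that needed a rerun is not hidden.
    """
    tests = list(tests)
    history: Dict[str, List[str]] = {name: [] for name in tests}
    pending = tests
    for _ in range(max_runs):
        still_pending = []
        for name in pending:
            outcome = run_once(name)
            history[name].append(outcome)
            # Only retransmit-only failures earn another run; passes and
            # other failures are final.
            if outcome == RETRANSMIT_ONLY:
                still_pending.append(name)
        pending = still_pending
        if not pending:
            break
    return history


def needed_rerun(history: Dict[str, List[str]]) -> List[str]:
    """Tests that eventually passed but not on the first run."""
    return sorted(
        name for name, runs in history.items()
        if len(runs) > 1 and runs[-1] == PASS
    )
```

Reporting needed_rerun() alongside the final results keeps the
non-deterministic cases on record instead of silently turning them into
passes.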

As a check on how many tests failed due to exactly two retries, I
found that on redtiny there were:
48 tests with unexpected first retries
45 tests with unexpected second retries
35 tests with unexpected third retries
33 tests with unexpected fourth retries
33 tests with unexpected fifth retries.

Note that these numbers should be monotonically non-increasing since any
test with an unexpected n+1 retry would have an unexpected n retry.

These numbers are AFTER I reran any test that only failed due to one
unexpected first retry.

Many of these tests may have failed for other reasons.  The 33 look
like real failures.  10 (45 minus 35) might be failures due to only
two sequential retries.
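
The arithmetic behind those figures can be spelled out.  This is a small
sketch using only the five counts listed above; it assumes, as noted,
that a test with an unexpected retry n+1 also had an unexpected retry n.

```python
# Tests on redtiny with an unexpected 1st, 2nd, 3rd, 4th, 5th retry.
counts = [48, 45, 35, 33, 33]

# Sanity check: the sequence must be monotonically non-increasing.
assert all(a >= b for a, b in zip(counts, counts[1:]))

# Tests whose deepest unexpected retry was exactly the nth: the drop
# from one count to the next.  The final entry is every test that
# reached the fifth retry.
stopped_at = [a - b for a, b in zip(counts, counts[1:])] + [counts[-1]]

for n, k in enumerate(stopped_at, start=1):
    print(f"deepest unexpected retry = {n}: {k} tests")
```

The second entry is the 10 mentioned above, and the last is the 33.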

I just reran all the tests that failed on redtiny with only unexpected
first and second IKE retransmissions.  This picks up tests that failed
in more than one place due to a first retransmission, or due to any
second retransmission.  Here are the 12.

  fips-default-ikev2-01-nofips-east
  ikev2-49-hub-spoke
  ikev2-algo-07-aes_ctr
  ikev2-algo-11-gcm-prop2
  ikev2-algo-ike-sha2-04
  ikev2-ecdsa-01
  ikev2-invalid-ke-02-wrong-modp
  ikev2-liveness-11-silent
  ikev2-mobike-01
  ikev2-nat-pluto-03
  ikev2-x509-17-multicert-rightid-san-wildcard
  ikev2-x509-20-multicert-rightid-san-wildcard

After a single rerun, 5 of the 12 passed, 6 of them still had only one or 
two IKE retransmissions, and 1 failed for some other reason.  Here are the 
six with only one or two IKE retransmissions:

  fips-default-ikev2-01-nofips-east
  ikev2-algo-07-aes_ctr
  ikev2-ecdsa-01
  ikev2-liveness-11-silent
  ikev2-x509-17-multicert-rightid-san-wildcard
  ikev2-x509-20-multicert-rightid-san-wildcard

Doing this again leaves four:
  fips-default-ikev2-01-nofips-east
  ikev2-ecdsa-01
  ikev2-x509-17-multicert-rightid-san-wildcard
  ikev2-x509-20-multicert-rightid-san-wildcard

