[Swan-dev] status of failing tests

Tue Feb 13 15:12:44 UTC 2018

On 4 February 2018 at 16:19, Paul Wouters <paul at nohats.ca> wrote:
> On Fri, 2 Feb 2018, Andrew Cagney wrote:
>
>>>> - early stop?
>>>>  testing/pluto/klips-netkey-pluto-06 failed east:output-different
>>>> west:output-different
>>>
>>>
>>>
>>> if final.sh runs a status or trafficstatus and also shuts down pluto for
>>> a leak report, there is a race between nodes. If one shuts down fast,
>>> the other won't see the proper status because it will have processed the
>>> deletes from the other peer's shutdown. The rule is to not have status
>>> and shutdown in final.sh.
>>
>>
>> This one is a no win.  We've too often missed core dumps because pluto
>> wasn't being shutdown.
>>
>> What about wrapping the inconsistent output in --cut-- --tuc--?
>
>
> I'd say the best fix would be to have a flag that would shutdown pluto
> after all ends have send their "done" for the final.sh. Then only grab
> the shutdown leaks/cores.

Right, but don't tie it to "done" in final.sh.  Currently a core dump when:

- eastinit.sh runs ok
- westinit.sh runs ok
- east dumps core
- westrun.sh times out

doesn't get logged because final.sh (which contains scripts to look
for core files) gets skipped and the core file is missed.

This should be detected but isn't.

I know of two equivalent ways of handling this:

- even when an earlier script hangs, try to run final.sh
- add a new generic script testing/pluto/bin/teardown.sh and always run that

(we could hack final.sh to invoke teardown.sh)
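
To make the "always run" idea concrete, here is a rough sketch of the
control flow.  It is not the existing kvmrunner code: run_test() and
the run callback are hypothetical names, and run(name) stands in for
however the harness executes a script on a guest, returning False
when the script hangs or times out.

```python
# Hypothetical sketch only; none of these names exist in kvmrunner.

def run_test(scripts, teardown, run):
    """Run scripts in order, stopping at the first failure or hang,
    but always run the teardown script afterwards.

    run(name) is a stand-in for the real harness call; it must
    return True on success and False on failure or timeout.
    Returns the list of script names that were attempted.
    """
    attempted = []
    try:
        for name in scripts:
            attempted.append(name)
            if not run(name):
                # An earlier script hung: skip the remaining scripts.
                break
    finally:
        # Runs even after a break or an exception, so core files
        # and leak reports are still collected.
        attempted.append(teardown)
        run(teardown)
    return attempted


if __name__ == "__main__":
    # Simulate the case above: westrun.sh times out.
    def fake_run(name):
        return name != "westrun.sh"

    print(run_test(["eastinit.sh", "westinit.sh", "westrun.sh", "final.sh"],
                   "teardown.sh", fake_run))
```

With the simulated timeout, final.sh is skipped but teardown.sh still
runs, which is the point: the core file check no longer depends on
every earlier script finishing.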

> But it would have to be a flag, because often we run a test case, so we
> can login to the hosts and look at the state manually, so we wouldn't
> want pluto to be shutdown in those cases.

There is "kvmrunner --stop-at final.sh ...".  But its behaviour is
more orthogonal than usable, in that:
  kvmrunner testing/pluto/basic-pluto-01
  kvmrunner [--stop-at final.sh] testing/pluto/basic-pluto-01
  kvmrunner testing/pluto/basic*
  kvmrunner [--stop-at final.sh] testing/pluto/basic*
all do what you would expect but probably not what you want.

I'm left wondering if it would be easier to have a separate script
(kvmrun.py?) that always stops at final.sh and requires/allows only
one test.
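
Something like the following is what I have in mind for the command
line.  It is only a sketch: kvmrun.py does not exist, and parse_args()
and the stop_at attribute are made-up names for illustration.

```python
# Hypothetical sketch of a kvmrun.py command line; nothing here exists yet.
import argparse


def parse_args(argv):
    """Accept exactly one test directory and always stop at final.sh."""
    parser = argparse.ArgumentParser(
        prog="kvmrun.py",
        description="Run a single test and stop before final.sh")
    # A single positional argument: a second test (for instance from
    # a shell wildcard such as basic*) is rejected by argparse.
    parser.add_argument("test", help="one test directory")
    args = parser.parse_args(argv)
    # Not an option: this script would always stop at final.sh.
    args.stop_at = "final.sh"
    return args


if __name__ == "__main__":
    args = parse_args(["testing/pluto/basic-pluto-01"])
    print(args.test, args.stop_at)
```

Because the stop point is fixed and only one test is accepted, there
is no combination of arguments that shuts pluto down while someone
wants to log in and poke at the hosts.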

Andrew