[Swan-dev] updating test system leaves it broken

D. Hugh Redelmeier hugh at mimosa.com
Sun Jan 19 20:39:41 UTC 2020


I hadn't run the tests in months, and in the meantime I've updated the
host to Fedora 31.  So I did:

	git pull
	make kvm-demolish
	make kvm-test

The result was that almost all tests failed (unresolved) because test
virtual machines didn't seem to start up properly.

This happened on two different host machines.

What's gone wrong?  I could use some help.

Here's something suspicious early on:

================
[root@swanbase ~]#
make kvm-shutdown-base-domain
make[1]: Entering directory '/home/build/libreswan'
Package nss was not found in the pkg-config search path.
Perhaps you should add the directory containing `nss.pc'
to the PKG_CONFIG_PATH environment variable
Package 'nss', required by 'virtual:world', not found
: shutdown-kvm-domain domain=a.swanf30base
echo ; if sudo virsh --connect qemu:///system dominfo a.swanf30base > /dev/null 2>&1 ; then  /home/build/libreswan/testing/utils/kvmsh.py --shutdown a.swanf30base || exit 1 ; else echo Domain a.swanf30base does not exist ; fi ; echo

virsh 0.00: waiting 20 seconds for domain to shutdown
virsh 0.06: domain shutdown after 0.6 seconds
================

Is this message something that needs to be addressed?

The scripts continue happily, so perhaps not.

The "not found" message appears 10 more times in the log I captured.
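
To count these without reading the whole log by hand, something like the
following could be used.  It is only a minimal sketch: the pattern is
modelled on the single pkg-config line quoted above, and the function name
count_missing_packages is made up for illustration.

```python
import re
from collections import Counter

# Matches pkg-config's complaint as it appears in the log quoted above.
NOT_FOUND = re.compile(r"Package '([^']+)', required by '([^']+)', not found")

def count_missing_packages(log_text):
    """Count (package, required_by) pairs that pkg-config reported as not found."""
    return Counter(NOT_FOUND.findall(log_text))
```

Separately, running "pkg-config --exists nss; echo $?" in the same
environment would show whether nss.pc can be found there at all.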


Here are a couple of other messages, each reporting a test that is listed
in TESTLIST but has no test directory:

kvmrunner 0.01: ****** testing/pluto/TESTLIST:533: invalid test certoe-18-pass-then-go-slash24-keying1: test directory not found: testing/pluto/certoe-18-pass-then-go-slash24-keying1
kvmrunner 0.01: ****** testing/pluto/TESTLIST:534: invalid test certoe-18-pass-then-go-slash32-keying1: test directory not found: testing/pluto/certoe-18-pass-then-go-slash32-keying1
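
The same check can be done outside kvmrunner.  This is a rough sketch, not
kvmrunner's own logic.  It ASSUMES each non-comment TESTLIST line is
whitespace-separated with the test name in the second field, and that test
directories sit next to TESTLIST; the function name missing_test_dirs is
hypothetical.

```python
import os

def missing_test_dirs(testlist_path):
    """Return (line number, test name) for TESTLIST entries lacking a directory.

    Assumption: the test name is the second whitespace-separated field.
    """
    base = os.path.dirname(testlist_path)
    missing = []
    with open(testlist_path) as f:
        for lineno, line in enumerate(f, start=1):
            fields = line.split()
            # Skip blank lines, comments, and lines too short to name a test.
            if len(fields) < 2 or fields[0].startswith("#"):
                continue
            name = fields[1]
            if not os.path.isdir(os.path.join(base, name)):
                missing.append((lineno, name))
    return missing
```

Called as missing_test_dirs("testing/pluto/TESTLIST") from the top of the
tree, it should list the same entries that kvmrunner complains about, if the
assumptions hold.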


The very first test to complete fails to bring up its first VM.

Virtual Machine Manager (GUI) lists these machines:
	a.build
	a.clone
	a.swanf30base
	a.swanfedora28base
	a.swaanfedora28base
	swanfedora22base
	swanfedora28base
	swanfedorabase
No workers seem to be listed.  No b. machines seem to be listed.
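
To compare what libvirt has defined against what the test runner expects,
a small script along these lines might help.  It is a sketch under
assumptions: the prefixes and host names are simply the ones that appear in
the kvmrunner log in this message, and the helper names are invented here.
"virsh list --all --name" prints one defined domain name per line.

```python
import subprocess

# Prefixes and host names taken from the kvmrunner log in this message.
PREFIXES = ["a.", "b."]
HOSTS = ["east", "west", "north", "road", "nic"]

def missing_domains(existing, prefixes=PREFIXES, hosts=HOSTS):
    """Return the expected test domains (prefix + host) absent from existing."""
    have = set(existing)
    return [p + h for p in prefixes for h in hosts if p + h not in have]

def list_domains():
    """Names of all defined libvirt domains, running or not."""
    out = subprocess.run(
        ["sudo", "virsh", "--connect", "qemu:///system",
         "list", "--all", "--name"],
        check=True, capture_output=True, text=True,
    ).stdout
    return [line.strip() for line in out.splitlines() if line.strip()]

# On the host: print(missing_domains(list_domains()))
```

For the machine list above, missing_domains reports all ten a. and b. test
domains as absent.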

Here's the log; focus on the b. runner lines.

================
kvmrunner 0.01: run started at 2020-01-19 03:18:09.938535
kvmrunner 0.01: using a pool of 2 worker threads to reboot domains
kvmrunner 0.01: using the parallel test processor and domain prefixes ['a.', 'b.']
kvmrunner 0.02: waiting for first thread to finish
a.runner check-01 0.02: start processing test check-01 (test 1 of 827) at 2020-01-19 03:18:10.140375
b.runner basic-pluto-01 0.02: start processing test basic-pluto-01 (test 2 of 827) at 2020-01-19 03:18:10.140464
a.runner check-01 0.02: ****** check-01 (test 1 of 827) started ....
b.runner basic-pluto-01 0.02: ****** basic-pluto-01 (test 2 of 827) started ....
b.runner basic-pluto-01 0.02: start testing basic-pluto-01 (test 2 of 827) at 2020-01-19 03:18:10.141345
a.runner check-01 0.02: start testing check-01 (test 1 of 827) at 2020-01-19 03:18:10.141512
b.runner basic-pluto-01 0.02/0.00: start booting domains at 2020-01-19 03:18:10.141587
a.runner check-01 0.02/0.00: start booting domains at 2020-01-19 03:18:10.141714
b.runner basic-pluto-01 0.02/0.00: 0 shutdown/reboot jobs ahead of us in the queue
a.runner check-01 0.02/0.00: 0 shutdown/reboot jobs ahead of us in the queue
b.runner basic-pluto-01 0.08/0.06: submitting shutdown jobs for unused domains: b.nic b.road b.north
b.runner basic-pluto-01 0.08/0.06: submitting boot-and-login jobs for test domains: b.west b.east
b.runner basic-pluto-01 0.08/0.06: submitted 5 jobs; currently 3 jobs pending
a.runner check-01 1.00/0.08: submitting shutdown jobs for unused domains: a.road a.north a.east a.nic
a.runner check-01 1.00/0.08: submitting boot-and-login jobs for test domains: a.west
a.runner check-01 1.00/0.08: submitted 5 jobs; currently 8 jobs pending
b.virsh nic 1.00/0.08: domain already shutdown
b.virsh road 1.01/0.09: domain already shutdown
b.virsh west 1.01/0.09: starting domain
b.virsh north 1.01/0.09: domain already shutdown
b.virsh east 1.01/0.09: starting domain
b.runner basic-pluto-01 1.03/1.01: trying to cancel job <Future at 0x7fd1308579d0 state=running> on b.east
b.runner basic-pluto-01 1.03/1.01: job <Future at 0x7fd1308579d0 state=running> on b.east did not cancel
b.runner basic-pluto-01 1.03/1.01: trying to crash job <Future at 0x7fd1308579d0 state=running> on b.east
b.runner basic-pluto-01 east 1.03/1.01: closing any existing console by forcing a console re-open
b.runner basic-pluto-01 1.04/1.02: eof (disconnect) while booting domains
Traceback (most recent call last):
  File "/home/build/libreswan/testing/utils/fab/runner.py", line 388, in _process_test
    test_domains = _boot_test_domains(logger, test, domain_prefix, boot_executor)
  File "/home/build/libreswan/testing/utils/fab/runner.py", line 215, in _boot_test_domains
    job.result()
  File "/usr/lib64/python3.7/concurrent/futures/_base.py", line 428, in result
    return self.__get_result()
  File "/usr/lib64/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/usr/lib64/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/build/libreswan/testing/utils/fab/runner.py", line 120, in boot_and_login
    self.console = remote.boot_and_login(self.domain, self.console)
  File "/home/build/libreswan/testing/utils/fab/remote.py", line 311, in boot_and_login
    console = boot_to_login_prompt(domain, console)
  File "/home/build/libreswan/testing/utils/fab/remote.py", line 304, in boot_to_login_prompt
    console = _start(domain, timeout=START_TIMEOUT)
  File "/home/build/libreswan/testing/utils/fab/remote.py", line 198, in _start
    raise pexpect.EOF("failed to start domain %s" % output)
pexpect.exceptions.EOF: failed to start domain error: failed to get domain 'b.west'
b.runner basic-pluto-01 1.04/1.02: stop booting domains after 1.2 seconds
b.runner basic-pluto-01 1.04: stop testing basic-pluto-01 (test 2 of 827) after 1.2 seconds
b.runner basic-pluto-01 1.04: start post-mortem basic-pluto-01 (test 2 of 827) at 2020-01-19 03:18:11.387596
b.runner basic-pluto-01 1.04: ****** basic-pluto-01 (test 2 of 827) unresolved east:output-missing west:output-missing ******
================
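
With 827 tests, a summary is easier to read than the raw log.  The sketch
below pulls out the domains virsh could not find and the tests reported as
unresolved.  The two patterns are modelled only on the lines quoted above,
so other kvmrunner message formats are not covered; the name summarize is
hypothetical.

```python
import re

# Patterns modelled on the two log lines quoted in this message.
FAILED_DOMAIN = re.compile(r"failed to get domain '([^']+)'")
UNRESOLVED = re.compile(r"\*{6} (\S+) \(test \d+ of \d+\) unresolved")

def summarize(log_text):
    """Return (sorted missing domain names, unresolved test names in log order)."""
    domains = sorted(set(FAILED_DOMAIN.findall(log_text)))
    tests = UNRESOLVED.findall(log_text)
    return domains, tests
```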

