[Swan-dev] [Testing] Libreswan Testing Future - Call Notes (from May 16)

Wed May 18 13:40:41 UTC 2016

Hi,

On May 16 we had a call about the future of libreswan testing. The call
notes follow; feel free to correct, comment or add anything. I added
some points of my own to the notes. I will post them on the testing
wiki next week, ideally together with a list of action items.

Current state of testing
------------------------
There are several testing hosts and the testsuite runs at least
twice a day, producing comprehensive HTML output. There is no connection
between GitHub and the testsuite, so no continuous integration. Major
drawbacks are: very long runtime (up to 10 hours for ~429 tests; the
target runtime is 1-2 hours for 500 tests), dependency on 9P (slow & not
supported everywhere), non-scalability (hard to get more than 3 network
nodes) and limited parallel execution. An essential part of the testing
is a directory shared between the test driver (running on the host) and
the network nodes (running on guests or in containers). We do not want
to get rid of KVM but rather to provide a faster alternative and to keep
both approaches alive and supported. I think that test creation should
be independent of test drivers.
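
To put the runtime figures above into perspective, here is a small
back-of-the-envelope calculation. It is only a sketch: it assumes the
tests run one after another and that the time is spread evenly over
them, which the notes do not claim. The variable names are made up.

```python
# Per-test time budget implied by the figures quoted in the notes.
# Assumption: serial execution, time spread evenly over all tests.
current_tests = 429       # "~429 tests", so approximate
current_hours = 10        # "up to 10 hours", an upper bound
target_tests = 500
target_hours = (1, 2)     # target runtime range


def seconds_per_test(hours, tests):
    """Average wall-clock seconds available per test."""
    return hours * 3600 / tests


current = seconds_per_test(current_hours, current_tests)
budget_low = seconds_per_test(target_hours[0], target_tests)
budget_high = seconds_per_test(target_hours[1], target_tests)

print(f"current: up to {current:.0f} s per test")
print(f"target:  {budget_low:.1f} to {budget_high:.1f} s per test")
print(f"speed-up needed: {current / budget_high:.1f}x "
      f"to {current / budget_low:.1f}x")
```

Under these assumptions the current worst case is about 84 s per test,
against a budget of 7.2 to 14.4 s per test, ie. a speed-up of roughly
6x to 12x.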

Testing OS
----------
Ideally, any (Linux) OS should be supported by the testsuite, which
would allow us to do OS testing. To achieve this goal, system-specific
features that are not widely accepted (such as 9P) are better avoided.
The host testing environment should be stable: it should not become
obsolete too quickly or change significantly very often. I think there
should be at least one distribution on which the testsuite works flawlessly.

Testing with KVM
----------------
KVM is currently used to emulate the network under test. This approach
is slow for various reasons (not only related to KVM): slow boot of
guests, sanity checks (eg. checking that something is not reachable
means waiting for a timeout), transmogrification slow-down, and reboots
of guests between subsequent tests, needed to clean up network stacks.
The KVM approach does not scale well: it works fine for 1-3 network
nodes but fails badly for 100. Parallel test execution does not fit KVM
for the same reason. The only non-network shared directory for KVM is
provided by the 9P filesystem, which is rather slow, seemingly not
maintained anymore and missing completely in RHEL. Avoiding 9P means
using the network between host and guests (eg. NFS), which might be
fragile and a potential source of problems. Last but not least, KVM
seems to be losing drive.
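
To illustrate why "not reachable" checks are expensive, here is a
minimal sketch of a reachability probe with an explicit timeout. The
helper is_reachable() is hypothetical and is not what the testsuite
uses; it probes TCP only to keep the example self-contained.

```python
import socket
import time


def is_reachable(host, port, timeout=1.0):
    """Return True if a TCP connection succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks and timeouts.
        return False


def demo():
    # Open a local listener so the positive case is deterministic.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the kernel pick one
    listener.listen(1)
    port = listener.getsockname()[1]

    up = is_reachable("127.0.0.1", port, timeout=0.5)
    listener.close()

    start = time.monotonic()
    down = is_reachable("127.0.0.1", port, timeout=0.5)
    elapsed = time.monotonic() - start
    return up, down, elapsed


if __name__ == "__main__":
    up, down, elapsed = demo()
    print(f"listener up: {up}, listener closed: {down} ({elapsed:.3f} s)")
```

On localhost the negative case returns quickly because the connection
is refused. When packets are silently dropped instead, the call blocks
for the whole timeout, and that cost is paid by every such check in
every test.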

Testing with Docker
-------------------
Support for test execution in Docker was recently added to the
testsuite, ie. containers are used instead of virtual guests. Currently
there is one pluto test implemented for Docker, and it is possible to
execute any other non-KLIPS test, though with unknown results. We have
no Docker analogue of the KVM TESTLIST at the moment. In an ideal world
a test would not depend on whether it is executed in KVM or Docker, ie.
one prepares scripts for network nodes together with expected outputs
and executes the test via swantest, specifying the desired testing
driver (KVM or Docker). However, in the real world there are differences
in the generated output (eg. the LOGDROP iptables target does not work
in Docker). Hopefully, these differences can be mitigated by extending
the output sanitizers. Docker scales very well and should run smoothly
in parallel. However, Antony experienced an unexpected slow-down when
more than 3 tests were executed in parallel. This needs further
investigation. A fast shared directory is supported by Docker natively.
The kernel on the host and guests is the same, which is both an
advantage and a disadvantage. Docker containers cannot run in 32-bit
mode or on non-Intel architectures at the moment. Each container can
have only one network interface assigned, which is worked around by
using a third-party solution, pipework, which works at the moment. The
KLIPS IPsec stack is not "namespaced" and hence cannot be tested in
Docker.
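
The "one test, any driver" idea could look roughly like the sketch
below. Everything in it is an assumption made for illustration: the
Scenario class, the two fake drivers, the script names and the
sanitizer rules are invented and do not reflect how swantest or the
existing sanitizers are actually written. The fake drivers only return
canned text; they do not start any guest or container.

```python
import re
from dataclasses import dataclass


@dataclass
class Scenario:
    """A test described independently of the driver that runs it."""
    name: str
    node_scripts: dict   # node name -> script to run on that node
    expected: dict       # node name -> expected *sanitized* output


# Hypothetical sanitizer rules: (pattern, replacement).
SANITIZERS = [
    (re.compile(r"\bpid \d+\b"), "pid PID"),
    (re.compile(r"\d{2}:\d{2}:\d{2}"), "HH:MM:SS"),
]


def sanitize(text):
    """Replace run-specific details so outputs become comparable."""
    for pattern, replacement in SANITIZERS:
        text = pattern.sub(replacement, text)
    return text


class FakeKvmDriver:
    name = "kvm"

    def run(self, node, script):
        return f"12:00:01 {node}: ran {script} as pid 4242\n"


class FakeDockerDriver:
    name = "docker"

    def run(self, node, script):
        return f"12:07:45 {node}: ran {script} as pid 17\n"


DRIVERS = {d.name: d for d in (FakeKvmDriver(), FakeDockerDriver())}


def run_test(scenario, driver_name):
    """Run every node script with the chosen driver and compare outputs."""
    driver = DRIVERS[driver_name]
    results = {}
    for node, script in scenario.node_scripts.items():
        actual = sanitize(driver.run(node, script))
        results[node] = actual == scenario.expected[node]
    return results


scenario = Scenario(
    name="example-01",                               # placeholder name
    node_scripts={"east": "east.sh", "west": "west.sh"},
    expected={
        "east": "HH:MM:SS east: ran east.sh as pid PID\n",
        "west": "HH:MM:SS west: ran west.sh as pid PID\n",
    },
)

for driver_name in ("kvm", "docker"):
    print(driver_name, run_test(scenario, driver_name))
```

The point of the design is that the scenario holds no driver-specific
detail: whatever differs between KVM and Docker runs either disappears
in sanitize() or shows up as a real failure.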

Testing with namespaces
-----------------------
Paul mentioned an interesting idea of using just specific namespaces
instead of a complete Docker orchestration. This would certainly
introduce some serious new challenges; on the other hand it would remove
the Docker management overhead. Not only a network namespace is needed
but most likely also pid and mount namespaces, and possibly more whack
sockets. It would probably be the fastest, most lightweight and
most scalable approach. As with Docker, KLIPS cannot be tested.
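
As a rough sketch of what a namespace-only setup for two nodes might
involve, the snippet below just builds and prints the commands; it does
not run them (they would need root). It relies on the standard
iproute2 "ip netns" / veth commands and on util-linux "unshare". The
node names, interface names and the 192.0.2.0/24 documentation
addresses are placeholders, and nothing here was agreed on the call.

```python
import shlex


def netns_pair_commands(node_a, node_b, addr_a, addr_b):
    """Commands creating two network namespaces joined by a veth pair."""
    cmds = []
    for node in (node_a, node_b):
        cmds.append(["ip", "netns", "add", node])

    if_a, if_b = f"veth-{node_a}", f"veth-{node_b}"
    cmds.append(["ip", "link", "add", if_a,
                 "type", "veth", "peer", "name", if_b])

    for node, ifname, addr in ((node_a, if_a, addr_a),
                               (node_b, if_b, addr_b)):
        # Move one end of the pair into the namespace, then configure it.
        cmds.append(["ip", "link", "set", ifname, "netns", node])
        in_ns = ["ip", "netns", "exec", node]
        cmds.append(in_ns + ["ip", "addr", "add", addr, "dev", ifname])
        cmds.append(in_ns + ["ip", "link", "set", ifname, "up"])
    return cmds


def wrap_in_namespaces(node, argv):
    """Run argv in the node's netns plus fresh pid and mount namespaces."""
    # --mount-proc also implies a new mount namespace.
    return (["ip", "netns", "exec", node,
             "unshare", "--pid", "--fork", "--mount-proc"] + list(argv))


commands = netns_pair_commands("east", "west",
                               "192.0.2.1/24", "192.0.2.2/24")
commands.append(wrap_in_namespaces("east", ["sh", "-c", "echo hello"]))

for argv in commands:
    print(shlex.join(argv))   # dry run: print only, execute nothing
```

Each node would still need its own pluto instance and whack socket
inside its namespaces, which is where the real work is likely to be.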

Misc & Long-term plans
----------------------
We want to be able to test in more exotic testing environments such as
FIPS 140 or the MLS SELinux policy. Interoperability testing with ARM /
MIPS architectures is desired as well. We are also interested in more
complex multi-host tests (ie. executing setup / run / clean-up on a
given set of network nodes in a specific order). Obviously it should be
possible to enable valgrind for leak detection during test execution and
to develop libreswan "within" the testsuite, ie. change code, compile
changes and re-run the tests. We also briefly mentioned VirtualBox as a
possible alternative to KVM. Finally, it was decided to try to get rid
of KLIPS tests which are not KLIPS-specific by copying them to
XFRM/NETKEY and disabling their automatic execution.
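
The ordered multi-host execution mentioned above could be expressed as
an explicit list of steps. The following is only a sketch under my own
assumptions: Step, execute() and the fake runner are invented names,
and the policy shown (skip remaining setup / run steps after a failure
but always run clean-up) is one possible choice, not a decision from
the call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    phase: str    # "setup", "run" or "cleanup"
    node: str
    script: str


def execute(plan, runner):
    """Run steps in the listed order.

    After the first failed setup/run step the remaining setup/run steps
    are skipped, but clean-up steps are always executed.
    """
    log = []
    failed = False
    for step in plan:
        if failed and step.phase != "cleanup":
            log.append((step, "skipped"))
            continue
        ok = runner(step)
        log.append((step, "ok" if ok else "failed"))
        if not ok and step.phase != "cleanup":
            failed = True
    return log


# Placeholder plan for three nodes; the script names are made up.
plan = [
    Step("setup", "north", "init.sh"),
    Step("setup", "east", "init.sh"),
    Step("setup", "west", "init.sh"),
    Step("run", "west", "run.sh"),
    Step("run", "north", "run.sh"),
    Step("cleanup", "west", "final.sh"),
    Step("cleanup", "east", "final.sh"),
    Step("cleanup", "north", "final.sh"),
]


def fake_runner(step):
    """Stand-in for a real driver: pretend the run on west fails."""
    return not (step.phase == "run" and step.node == "west")


for step, status in execute(plan, fake_runner):
    print(f"{step.phase:8} {step.node:6} {step.script:9} {status}")
```

Since the order is plain data, the same plan could be handed to any
driver.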

--
Ondrej