[Swan-dev] testing and unstable dns

Mon May 3 15:30:45 UTC 2021

On Sun, 2 May 2021 at 16:17, Paul Wouters <paul at nohats.ca> wrote:

> On Sun, 2 May 2021, Antony Antony wrote:
>
> >>     I think swan-prep should copy fresh config files every time.
> >>
> >> Moving the nsd/unbound stuff out of transmogrify makes sense.
>
> >>     the namespace directories and files, which are bind mounts, should
> >>     be set up in swan-prep, especially because we want to restart inside
> >>     a vm (east or west..) manually, inside a namespace, without resetting
> >>     all the namespaces of a test. So I think we should leave those tasks
> >>     in swan-prep. It should not be in the namespace test runner.
>
> I think a per-test feature (eg running a DNS server) should be handled
> in the per-test setup, so swan-prep. Doing that in
> kvmrunner/namespacerunner does not make any sense to me. That's just too
> much secret sauce on the outside. I'd rather have the tests do the things
> themselves.
>
> I'd like the DNS daemons to be started within the test, and if possible
> without requiring any mounts. What is wrong with:
>
> unbound -c /testing/pluto/XXXX/unbound.conf
>
> Let the conf file or the command line options deal with a tmp dir and
> pid file ? We don't need to talk to another they do other than port
> 53 / 5353. Keep it as simple as possible. Also avoid systemctl
> start|stop|status because it doesn't work inside namespaces. Avoid code
> that does different things in kvm or namespaces (like trying to check
> for KVM and then issuing systemctl status seems wrong to me)
>

There's a tension here between using services the way god^^^D the guest
intended (with fedora it's systemctl, NetBSD it's /etc/rc.d/*) VS inventing
our own secret sauce for achieving a similar effect.

<<ipsec start>> illustrates this.  For ipsec fans, it is the one command
that works across all platforms, but for fedora fanatics systemctl also
does the right thing.

... with that in mind I changed start-dns.sh to just run the custom
guestbin/unbound-start.sh and guestbin/nsd-start.sh.  I figure those
scripts know what to do.
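
For illustration only, a start script along those lines could be as small
as the sketch below.  It is not the actual guestbin/unbound-start.sh; the
function name, the DRY_RUN switch and the test-directory argument are made
up for the example, and it assumes the test directory ships its own
unbound.conf that points the pid file and working directory somewhere
per-test.

```shell
#!/bin/sh
# Hypothetical sketch, not the real guestbin/unbound-start.sh.
# Starts unbound directly from a per-test config, with no systemctl.

# start_unbound TESTDIR
# Assumes TESTDIR/unbound.conf exists and sets pidfile/directory to a
# per-test location, so no extra mounts are needed.
start_unbound() {
    conf="$1/unbound.conf"
    if [ ! -r "$conf" ]; then
        echo "missing $conf" >&2
        return 1
    fi
    if [ -n "${DRY_RUN:-}" ]; then
        # Only show what would be run; handy outside a guest.
        echo "unbound -c $conf"
        return 0
    fi
    unbound -c "$conf"
}
```

The same code path runs in a KVM guest and in a namespace, which is the
point of not branching on the environment.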

otoh, how much does this matter?

> >> (I'm sure there is other stuff)
> >> while this is currently implemented by walking the VM through a
> >> boot-and-login sequence, there's nothing to rule out using snapshots,
> >> say.  Just as long as the environment is established before the test
> >> starts.
>
> If it is one snapshot per domain, eg you compile libreswan, install it,
> then snapshot the thing to boot more quickly, it shouldn't be running
> any DNS at the start, because most tests don't need it. So it would have
> to be swan-prep that starts it anyway.
>
> >> If I were to type "reboot" in such a vm, then I'll need to first
> >> manually re-establish the above before entering the first shell
> >> command.  Why should namespaces be different?
>
> Well, namespaces are ephemeral. You don't "reboot" them. You re-create
> them. But I'm still not sure this matters for the issue we are talking
> about?
>
> >>  If namespaces and KVM established some minimum
> >> environment before running tests then I think the odds of tests running
> >> under both frameworks would be greatly improved.
>
> Don't we have that minimum? What are you trying to make the new minimum?
>

No.  We don't have a minimum.

For namespaces, test scripts have to go through all sorts of custom sized
hoops before services can start.

Consider /run:

Each service, and pluto, makes its own set of custom mounts so that
things like /run/<service>.pid work (technical nit: /run needs something,
as otherwise parallel plutos will get confused).
I don't understand why.
Is there a way for the namespace script to prep that directory before the
test is started so that:
  - /run is writable and per test instance
Then swan-prep would no longer have to deal with this.
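
One shape this could take, sketched under assumptions: the namespace
runner creates a private directory per test instance and makes it the
guest's /run before any test script executes.  The helper name, the
base-directory layout and the commented-out mount steps are all
hypothetical; the mount itself needs root inside the test's mount
namespace, so only the directory preparation is shown as live code.

```shell
#!/bin/sh
# Hypothetical sketch of per-test /run preparation done by the runner.

# prep_run_dir BASE TEST HOST
# Creates an empty, writable BASE/TEST/HOST/run and prints its path.
prep_run_dir() {
    dir="$1/$2/$3/run"
    rm -rf "$dir"            # never inherit state from an earlier run
    mkdir -p "$dir" || return 1
    chmod 755 "$dir"
    echo "$dir"
}

# Inside the test's mount namespace (as root) the runner could then do:
#   mount --bind "$(prep_run_dir /tmp/ns some-test east)" /run
# or skip the backing directory and use a fresh tmpfs, same as in a KVM guest:
#   mount -t tmpfs tmpfs /run
```

Either way the test scripts would just see an empty, writable /run.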

However, if this isn't possible can we adopt techniques that work more
consistently between namespaces and KVMs?

For instance, your suggestion to use:
  unbound -c /testing/pluto/XXXX/unbound.conf
could be one.  Another is to just blindly use tmpfs regardless of the
guest; start-dns.sh is doing that.

> >> BTW, I'd take the above list as a starting point for discussion.
> >> Currently swan-prep has to deal with cleaning up from previous tests,
> >> I think that's a bug.
>
> > I have been trying to reply to this e-mail for two weeks. This is the
> > best I could come up with on your discovery of the clean-up feature as
> > a bug. In my opinion, it is a wrong conclusion!
>
> > I have been scratching my head to understand how you would reach the
> > conclusion it is a bug. As far as I can remember it has been there from
> > the beginning of KVM testing scripts, such as swan-prep.
> >
> > To be clear do not remove the hooks put in there to support rerunning
> > a test manually, script by script, without using the runner: such as
> > swan-prep cleaning up previous instances,
>
> I am not sure I see what you two are really disagreeing about.
>
> Andrew is saying, "it would be nice if the filesystem was clean and
> swan-prep didn't need to go check for that".
>
> Antony is saying, "I want to be able to run swan-prep, do test stuff,
> then re-run swan-prep for a new test without reboot".
>
> Running swan-prep twice should not make any changes to the filesystem on
> the 2nd iteration. It would just be repeating what it is doing. So I
> don't think Andrew would break that. Unless Antony wants to run 1 test,
> and then later another test. And the first test "left" a config file
> somewhere. In this case, running swan-prep a second time would wipe
> anything the first test did. I think not breaking this has value. I
> also sometimes run multiple tests manually one after the other from
> consoles without reboot. But it is rare. These days I try to use
> namespace-runner.py to run multiple tests.
>
> I am myself more concerned with things happening in post-mortem, and
> how running a single test that does not run post-mortem would end up
> with a different output than that test running in a bundle of tests
> that get a post-mortem. Like moving the selinux audit checks into
> postmortem would mean running a single test means I cannot see if
> I trigger audit issues. I really do want to be able to keep running
> a test as a single test or as part of a bundle, without it failing with
> a false positive or silently not doing some checks for me anymore.
> But I don't know how to do that.
>

While I have this lurking in Makefile.inc.local:

  KVMRUNNER_FLAGS += --run-post-mortem

it also isn't ideal.  Some days I want the VMs to shut down and post mortem
to run, some days I don't.

If a test is deliberately targeting audit, why not run the audit command as
part of the test's scripts?   The only risk I see is the test being copied,
resulting in completely unrelated tests explicitly checking for audit
records when they don't need to.

For audit though, the open question is still what to do about tests that
deliberately trigger audit records. They would need to be flushed since
post-mortem.sh doesn't expect this.

> > also in some init scripts, westinit/roadinit.sh, may remove the
> > addresses/routes added during a run, that is, to clean up the previous
> > run, e.g. MOBIKE dns. I don't agree with removing those even if it may
> > speed up a test run. It is an essential feature! And rebooting KVM is
> > not an alternative!
>
> In this specific case, I actually am on the fence. I really don't like
> that I run some tests involving road, and when the test is done I cannot
> "ssh road" because the test removed the real IP of road and didn't put
> it back in the end. but I understand why this is, because otherwise you
> cannot run it manually and then look at the state using a shell. But in
> general, test cases should "reset" things like DNS configs after their
> run, provided that they specifically test for the expected changes
> during the run.
>

This is why I use:
  .gmake kvmsh-road
which always works (or ./kvm sh road)

> > If you feel avoiding clean-up would considerably speed up our testrun
> > feel free to add some options which are disabled by default and could
> > be optionally enabled on a testrun, say on testing.libreswan.org.
>
> Yes, it seems most of these issues of expecting different things depend
> on whether the test is run manually or grouped. I would not be against
> an ENV variable being set by the kvmrunner/namespace runner that
> swan-prep uses to skip cleanup, that would not be set if running a
> single test. I would hope that such an approach would address everyone's
> concern ?
>
> Paul
>
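
The ENV-variable idea could look roughly like the sketch below.  The
variable name SWAN_PREP_SKIP_CLEANUP and both function names are invented
for illustration, this is not code from swan-prep, and the cleanup body is
a placeholder.

```shell
#!/bin/sh
# Hypothetical sketch: a runner-set variable lets the prep step skip cleanup.

cleanup_previous_test() {
    # Placeholder for removing leftovers of an earlier test.
    echo "cleanup done"
}

prep() {
    if [ -n "${SWAN_PREP_SKIP_CLEANUP:-}" ]; then
        # Set by the runner, which guarantees a clean environment.
        echo "cleanup skipped"
    else
        # Manual single-test run: variable unset, so clean up as today.
        cleanup_previous_test
    fi
}
```

Because the default (variable unset) keeps the cleanup, re-running a test
by hand from a console would behave as it does now.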