Re: [PATCH linux-kselftest/test v2] ext4: add kunit test for decoding extended timestamps

From: Brendan Higgins <brendanhiggins@google.com>
To: "Bird, Timothy" <Tim.Bird@sony.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>,
	Shuah Khan <skhan@linuxfoundation.org>,
	Iurii Zaikin <yzaikin@google.com>,
	"open list:KERNEL SELFTEST FRAMEWORK" 
	<linux-kselftest@vger.kernel.org>,
	linux-ext4@vger.kernel.org, adilger.kernel@dilger.ca,
	KUnit Development <kunit-dev@googlegroups.com>
Subject: Re: [PATCH linux-kselftest/test v2] ext4: add kunit test for decoding extended timestamps
Date: Thu, 17 Oct 2019 18:12:29 -0700	[thread overview]
Message-ID: <CAFd5g44txp2j9May1YD9rq6bcNnCx_JKNNmnsrr+JG+cTX0chg@mail.gmail.com> (raw)
In-Reply-To: <ECADFF3FD767C149AD96A924E7EA6EAF977D0023@USCULXMSG01.am.sony.com>

On Thu, Oct 17, 2019 at 3:25 PM <Tim.Bird@sony.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Theodore Y. Ts'o on October 17, 2019 2:09 AM
> >
> > On Wed, Oct 16, 2019 at 05:26:29PM -0600, Shuah Khan wrote:
> > >
> > > I don't really buy the argument that unit tests should be deterministic
> > > Possibly, but I would opt for having the ability to feed test data.
> >
> > I strongly believe that unit tests should be deterministic.
> > Non-deterministic tests are essentially fuzz tests.  And fuzz tests
> > should be different from unit tests.
>
> I'm not sure I have the entire context here, but I think deterministic
> might not be the right word, or it might not capture the exact meaning
> intended.
>
> I think there are multiple issues here:
>  1. Does the test enclose all its data, including working data and expected results?
> Or, does the test allow someone to provide working data?  This alternative
> implies that either the some of testcases or the results might be different depending on
> the data that is provided.  IMHO the test would be deterministic if it always produced
> the same results based on the same data inputs.  And if the input data was deterministic.
> I would call this a data-driven test.
>
> Since the results would be dependent on the data provided, the results
> from tests using different data would not be comparable.  Essentially,
> changing the input data changes the test so maybe it's best to consider
> this a different test.  Like 'test-with-data-A' and 'test-with-data-B'.

That kind of sound like parameterized tests[1]; it was a feature I was
thinking about adding to KUnit, but I think the general idea of
parameterized tests has fallen out of favor; I am not sure why. In any
case, I have used parameterized tests before and have found them
useful in certain circumstances.

> 2. Does the test automatically detect some attribute of the system, and adjust
> its operation based on that (does the test probe?)  This is actually quite common
> if you include things like when a test requires root access to run.  Sometimes such tests,
> when run without root privilege, run as many testcases as possible not as root, and skip
> the testcases that require root.
>
> In general, altering the test based on probed data is a form of data-driven test,
> except the data is not provided by the user.  Whether
> this is deterministic in the sense of (1) depends on whether the data that
> is probed is deterministic.  In the case or requiring root, then it should
> not change from run to run (and it should probably be reflected in the characterization
> of the results).
>
> Maybe neither of the above cases fall in the category of unit tests, but
> they are not necessarily fuzzing tests.  IMHO a fuzzing test is one which randomizes

Kind of sounds remotely similar to Haskell's QuickCheck[2]; it's sort
of a mix of unit testing and fuzz testing. I have used this style of
testing for other projects and it can be pretty useful. I actually
have a little experiment somewhere trying to port the idea to KUnit.

> the data for a data-driven test (hence using non-deterministic data).  Once the fuzzer
> has found a bug, and the data and code for a test is fixed into a reproducer program,
> then at that point it should be deterministic (modulo what I say about race condition
> tests below).
>
> >
> > We want unit tests to run quickly.  Fuzz tests need to be run for a
> > large number of passes (perhaps hours) in order to be sure that we've
> > hit any possible bad cases.  We want to be able to easily bisect fuzz
> > tests --- preferably, automatically.  And any kind of flakey test is
> > hell to bisect.
> Agreed.
>
> > It's bad enough when a test is flakey because of the underlying code.
> > But when a test is flakey because the test inputs are
> > non-deterministic, it's even worse.
> I very much agree on this as well.
>
> I'm not sure how one classes a program that seeks to invoke a race condition.
> This can take variable time, so in that sense it is not deterministic.   But it should
> produce the same result if the probabilities required for the race condition
> to be hit are fulfilled.  Probably (see what I did there :-), one needs to take
> a probabilistic approach to reproducing and bisecting such bugs.  The duration
> or iterations required to reproduce the bug (to some confidence level) may
> need to be included with the reproducer program.  I'm not sure if the syskaller
> reproducers do this or not, or if they just run forever.  One I looked at ran forever.
> But you would want to limit this in order to produce results with some confidence
> level (and not waste testing resources).
>
> ---
> The reason I want get clarity on the issue of data-driven tests is that I think
> data-driven tests and tests that probe are very much desirable.  This allows a
> test to be able to be more generalized and allows for specialization of the
> test for more scenarios without re-coding it.
> I'm not sure if this still qualifies as unit testing, but it's very useful as a means to
> extend the value of a test.  We haven't trod into the mocking parts of kunit,
> but I'm hoping that it may be possible to have that be data-driven (depending on
> what's being mocked), to make it easier to test more things using the same code.

I imagine it wouldn't be that hard to add that on as a feature of a
parameterized testing implementation.

> Finally, I think the issue of testing speed is orthogonal to whether a test is self-enclosed
> or data-driven.  Definitely fuzzers, which are experimenting with system interaction
> in a non-deterministic way, have speed problems.

[1] https://dzone.com/articles/junit-parameterized-test
[2] http://hackage.haskell.org/package/QuickCheck

Cheers!