Linux-ext4 Archive on lore.kernel.org
From: Brendan Higgins <brendanhiggins@google.com>
To: "Bird, Timothy" <Tim.Bird@sony.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>,
	Shuah Khan <skhan@linuxfoundation.org>,
	Iurii Zaikin <yzaikin@google.com>,
	"open list:KERNEL SELFTEST FRAMEWORK" 
	<linux-kselftest@vger.kernel.org>,
	linux-ext4@vger.kernel.org, adilger.kernel@dilger.ca,
	KUnit Development <kunit-dev@googlegroups.com>
Subject: Re: [PATCH linux-kselftest/test v2] ext4: add kunit test for decoding extended timestamps
Date: Thu, 17 Oct 2019 18:12:29 -0700
Message-ID: <CAFd5g44txp2j9May1YD9rq6bcNnCx_JKNNmnsrr+JG+cTX0chg@mail.gmail.com> (raw)
In-Reply-To: <ECADFF3FD767C149AD96A924E7EA6EAF977D0023@USCULXMSG01.am.sony.com>

On Thu, Oct 17, 2019 at 3:25 PM <Tim.Bird@sony.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Theodore Y. Ts'o on October 17, 2019 2:09 AM
> >
> > On Wed, Oct 16, 2019 at 05:26:29PM -0600, Shuah Khan wrote:
> > >
> > > I don't really buy the argument that unit tests should be deterministic.
> > > Possibly, but I would opt for having the ability to feed test data.
> >
> > I strongly believe that unit tests should be deterministic.
> > Non-deterministic tests are essentially fuzz tests.  And fuzz tests
> > should be different from unit tests.
>
> I'm not sure I have the entire context here, but I think deterministic
> might not be the right word, or it might not capture the exact meaning
> intended.
>
> I think there are multiple issues here:
>  1. Does the test enclose all its data, including working data and expected results?
> Or, does the test allow someone to provide working data?  This alternative
> implies that either some of the testcases or the results might be different depending on
> the data that is provided.  IMHO the test would be deterministic if it always produced
> the same results based on the same data inputs.  And if the input data was deterministic.
> I would call this a data-driven test.
>
> Since the results would be dependent on the data provided, the results
> from tests using different data would not be comparable.  Essentially,
> changing the input data changes the test so maybe it's best to consider
> this a different test.  Like 'test-with-data-A' and 'test-with-data-B'.

That kind of sounds like parameterized tests[1]; it was a feature I was
thinking about adding to KUnit, but I think the general idea of
parameterized tests has fallen out of favor; I am not sure why. In any
case, I have used parameterized tests before and have found them
useful in certain circumstances.
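
To make the idea concrete, here is a minimal sketch of a parameterized
(table-driven) test in plain C. It does not use any KUnit macros, and
every name in it (toy_decode_seconds, struct decode_case,
run_decode_cases) is hypothetical. The decoder is a toy stand-in rather
than the ext4 code: it assumes the low two bits of an "extra" field
extend a signed 32-bit seconds value by multiples of 2^32.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical toy decoder, NOT the ext4 implementation: the low two
 * bits of `extra` extend the signed 32-bit `base` by multiples of 2^32.
 */
static int64_t toy_decode_seconds(int32_t base, uint32_t extra)
{
	return (int64_t)base + ((int64_t)(extra & 0x3) << 32);
}

/* One row per test case: the inputs and the expected result together. */
struct decode_case {
	const char *name;
	int32_t base;
	uint32_t extra;
	int64_t expected;
};

static const struct decode_case decode_cases[] = {
	{ "zero base, no extra bits",     0, 0x0, 0 },
	{ "negative base, no extra bits", -1, 0x0, -1 },
	{ "one extra epoch",              0, 0x1, 4294967296LL },
	{ "upper bits of extra ignored",  5, 0x4, 5 },
};

/* Runs the same check over every row; returns the number of failures. */
static int run_decode_cases(void)
{
	int failures = 0;
	size_t i;

	for (i = 0; i < sizeof(decode_cases) / sizeof(decode_cases[0]); i++) {
		const struct decode_case *c = &decode_cases[i];
		int64_t got = toy_decode_seconds(c->base, c->extra);

		if (got != c->expected) {
			printf("FAIL %s: got %lld, expected %lld\n", c->name,
			       (long long)got, (long long)c->expected);
			failures++;
		}
	}
	return failures;
}
```

Since the table is fixed in the source, every run sees the same inputs.
Swapping in a different table effectively gives you a different test,
which matches the 'test-with-data-A' / 'test-with-data-B' framing above.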

> 2. Does the test automatically detect some attribute of the system, and adjust
> its operation based on that (does the test probe?)  This is actually quite common
> if you include things like when a test requires root access to run.  Sometimes such tests,
> when run without root privilege, run as many testcases as possible not as root, and skip
> the testcases that require root.
>
> In general, altering the test based on probed data is a form of data-driven test,
> except the data is not provided by the user.  Whether
> this is deterministic in the sense of (1) depends on whether the data that
> is probed is deterministic.  In the case of requiring root, then it should
> not change from run to run (and it should probably be reflected in the characterization
> of the results).
>
> Maybe neither of the above cases fall in the category of unit tests, but
> they are not necessarily fuzzing tests.  IMHO a fuzzing test is one which randomizes

Kind of sounds remotely similar to Haskell's QuickCheck[2]; it's sort
of a mix of unit testing and fuzz testing. I have used this style of
testing for other projects and it can be pretty useful. I actually
have a little experiment somewhere trying to port the idea to KUnit.
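
For illustration only, a QuickCheck-style check could look roughly like
the sketch below, in plain C with no KUnit or QuickCheck API involved.
All names are hypothetical, and the encode/decode pair is a toy (two
epoch bits packed below a nanoseconds value), not the ext4 code. The
inputs come from a small xorshift PRNG with a caller-supplied seed, so a
failing run can be replayed exactly by reusing the seed.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical toy pair under test: pack and unpack an "extra" field. */
static uint32_t toy_encode_extra(uint32_t epoch, uint32_t nsec)
{
	return (nsec << 2) | (epoch & 0x3);
}

static uint32_t toy_extra_epoch(uint32_t extra)
{
	return extra & 0x3;
}

static uint32_t toy_extra_nsec(uint32_t extra)
{
	return extra >> 2;
}

/* Marsaglia's 32-bit xorshift generator; the state must be nonzero. */
static uint32_t xorshift32(uint32_t *state)
{
	uint32_t x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return x;
}

/*
 * Property: decoding an encoded (epoch, nsec) pair gives back the pair.
 * Returns the number of generated inputs that violate the property.
 */
static int check_roundtrip_property(uint32_t seed, int iterations)
{
	uint32_t state = seed ? seed : 1; /* avoid the all-zero state */
	int failures = 0;
	int i;

	for (i = 0; i < iterations; i++) {
		uint32_t epoch = xorshift32(&state) & 0x3;
		uint32_t nsec = xorshift32(&state) % 1000000000u;
		uint32_t extra = toy_encode_extra(epoch, nsec);

		if (toy_extra_epoch(extra) != epoch ||
		    toy_extra_nsec(extra) != nsec) {
			/* Report the seed so the failure can be replayed. */
			printf("FAIL seed=%u iteration=%d epoch=%u nsec=%u\n",
			       (unsigned)seed, i, (unsigned)epoch,
			       (unsigned)nsec);
			failures++;
		}
	}
	return failures;
}
```

With a fixed seed and iteration count this behaves like an ordinary
deterministic test; only a randomized seed would push it toward the
fuzzing end of the spectrum.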

> the data for a data-driven test (hence using non-deterministic data).  Once the fuzzer
> has found a bug, and the data and code for a test is fixed into a reproducer program,
> then at that point it should be deterministic (modulo what I say about race condition
> tests below).
>
> >
> > We want unit tests to run quickly.  Fuzz tests need to be run for a
> > large number of passes (perhaps hours) in order to be sure that we've
> > hit any possible bad cases.  We want to be able to easily bisect fuzz
> > tests --- preferably, automatically.  And any kind of flakey test is
> > hell to bisect.
> Agreed.
>
> > It's bad enough when a test is flakey because of the underlying code.
> > But when a test is flakey because the test inputs are
> > non-deterministic, it's even worse.
> I very much agree on this as well.
>
> I'm not sure how one classes a program that seeks to invoke a race condition.
> This can take variable time, so in that sense it is not deterministic.   But it should
> produce the same result if the probabilities required for the race condition
> to be hit are fulfilled.  Probably (see what I did there :-), one needs to take
> a probabilistic approach to reproducing and bisecting such bugs.  The duration
> or iterations required to reproduce the bug (to some confidence level) may
> need to be included with the reproducer program.  I'm not sure if the syzkaller
> reproducers do this or not, or if they just run forever.  One I looked at ran forever.
> But you would want to limit this in order to produce results with some confidence
> level (and not waste testing resources).
>
> ---
> The reason I want to get clarity on the issue of data-driven tests is that I think
> data-driven tests and tests that probe are very much desirable.  This allows a
> test to be able to be more generalized and allows for specialization of the
> test for more scenarios without re-coding it.
> I'm not sure if this still qualifies as unit testing, but it's very useful as a means to
> extend the value of a test.  We haven't trod into the mocking parts of kunit,
> but I'm hoping that it may be possible to have that be data-driven (depending on
> what's being mocked), to make it easier to test more things using the same code.

I imagine it wouldn't be that hard to add that on as a feature of a
parameterized testing implementation.

> Finally, I think the issue of testing speed is orthogonal to whether a test is self-enclosed
> or data-driven.  Definitely fuzzers, which are experimenting with system interaction
> in a non-deterministic way, have speed problems.

[1] https://dzone.com/articles/junit-parameterized-test
[2] http://hackage.haskell.org/package/QuickCheck

Cheers!


