From: "Theodore Ts'o" <tytso@mit.edu>
To: Luis Chamberlain <mcgrof@kernel.org>
Cc: Amir Goldstein <amir73il@gmail.com>,
	"Darrick J. Wong" <djwong@kernel.org>,
	Leah Rumancik <leah.rumancik@gmail.com>,
	Josef Bacik <josef@toxicpanda.com>,
	Chuck Lever <chuck.lever@oracle.com>,
	chandanrmail@gmail.com,
	Sweet Tea Dorminy <sweettea-kernel@dorminy.me>,
	Pankaj Raghav <pankydev8@gmail.com>,
	Zorro Lang <zlang@redhat.com>,
	linux-xfs <linux-xfs@vger.kernel.org>,
	fstests <fstests@vger.kernel.org>
Subject: Re: sharing fstests results (Was: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1))
Date: Sat, 25 Jun 2022 17:50:26 -0400
Message-ID: <YreDIk2FMMPQDpLL@mit.edu>
In-Reply-To: <YrdjluHoj9xAz3Op@bombadil.infradead.org>

On Sat, Jun 25, 2022 at 12:35:50PM -0700, Luis Chamberlain wrote:
> 
> The way the expunge list is processed could simply be modified in kdevops
> so that non-deterministic tests are not expunged but also not treated as
> fatal at the end. But think about it, the exception is if the non-deterministic
> failure does not lead to a crash, no?

That's what I'm doing today, but once we have a better test analysis
system, I think the only things which should be excluded are:

   a)   bugs which cause the kernel to crash
   b)   test bugs
   c)   tests which take ***forever*** for a particular configuration
        (and for which we probably get enough coverage through
        other configs)

If we have a non-deterministic failure which is due to a kernel bug,
I don't see any reason why we should skip the test.  We just need to
have a fully-featured enough test results analyzer so that we can
distinguish between known failures, known flaky failures, and new test
regressions.
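
As a rough illustration of that three-way distinction, here is a
minimal Python sketch.  The names (KnownFailure, classify, and the
"known" table keyed by test and config) are hypothetical, and the rule
that a 100% historical failure rate means "known failure" while
anything lower means "known flaky failure" is an assumption made for
this example, not the behavior of any existing analyzer.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class KnownFailure:
    """One tracked failure (hypothetical record layout)."""
    failure_pct: float  # historical failure rate, 0-100
    note: str = ""


def classify(test: str, config: str, failed_runs: int,
             known: Dict[Tuple[str, str], KnownFailure]) -> str:
    """Sort one test result into pass / known / flaky / new regression."""
    if failed_runs == 0:
        return "pass"
    entry = known.get((test, config))
    if entry is None:
        # Nothing on record for this test and config: needs a human.
        return "new regression"
    if entry.failure_pct >= 100.0:
        # Assumption: always-failing entries are deterministic failures.
        return "known failure"
    return "known flaky failure"
```

With a table containing only ("generic/681", "encrypt") at 100%, a
failure of generic/681 on encrypt would be reported as a known failure,
and a failure of any test not in the table as a new regression.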

So for example, the new tests generic/681, generic/682, and
generic/692 are causing deterministic failures for the ext4/encrypt
config.  Right now, this is being tracked manually in a flat text
file:

generic/68[12]  encrypt   Failure percentage: 100%
    The directory does grow, but blocks aren't charged to either root or
    the non-privileged users' quota.  So this appears to be a real bug.
    Testing shows this goes all the way back to at least 4.14.
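
An entry in that shape is regular enough to be read by a script.  The
following is a minimal Python sketch, assuming every entry starts with
an unindented "pattern  config  Failure percentage: N%" line followed
by indented free-form notes, and that the test pattern is a shell-style
glob.  The function names and the dictionary layout are made up for
illustration; this is not an existing tool.

```python
import re
from fnmatch import fnmatchcase

# Header line: "<test glob>  <config>  Failure percentage: <N>%"
HEADER = re.compile(
    r"^(?P<pattern>\S+)\s+(?P<config>\S+)\s+"
    r"Failure percentage:\s*(?P<pct>\d+(?:\.\d+)?)%\s*$"
)


def parse_notes(text):
    """Split the flat text file into a list of entry dictionaries."""
    entries = []
    current = None
    for line in text.splitlines():
        m = HEADER.match(line)
        if m:
            current = {
                "pattern": m["pattern"],
                "config": m["config"],
                "pct": float(m["pct"]),
                "notes": [],
            }
            entries.append(current)
        elif current is not None and line.strip():
            # Indented lines belong to the most recent header.
            current["notes"].append(line.strip())
    return entries


def matches(entry, test, config):
    """True if a test run on this config is covered by the entry."""
    return entry["config"] == config and fnmatchcase(test, entry["pattern"])
```

Treating "generic/68[12]" as a glob means the entry covers generic/681
and generic/682 on the encrypt config, and nothing else.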

It's currently not tagged by kernel version, because I mostly only
care about upstream.  So once it's fixed upstream, I stop caring about
it.  In the ideal world, we'd track the kernel commit which fixed the
test failure, and when the fix propagated to the various stable
kernels, etc.

I've also resisted putting it in an expunge file, since if I did, I
would ignore it forever.  If it stays in my face, I'm more likely to
fix it, even if it's on my personal time.

> Here's the thing though. Not all developers have incentives to share.

Part of this is the amount of *time* that it takes to share this
information.  Right now, a lot of sharing takes place on the weekly
ext4 conference call.  It doesn't take Eric Whitney a lot of time to
mention that he's seeing a particular test failure, and I can quickly
search my test summary Unix mbox file and say, "yep, I've seen this
fail a couple of times before, starting in February 2020 --- but it's
super rare."

And since Darrick attends the weekly ext4 video chats, once or twice
we've asked him about some test failures on some esoteric xfs config,
such as realtime with an external logdev, and he might say, "oh yeah,
that's a known test bug.  pull this branch from my public xfstests
tree, I just haven't had time to push those fixes upstream yet."

(And I don't blame him for that; I just recently pushed some ext4 test
bug fixes, some of which I had initially sent to the list in late
April --- but on code review, changes were requested, and I just
didn't have *time* to clean up fixes in response to the code reviews.
So the fix which was good enough to suppress the failures sat in my
tree, but didn't go upstream since it was deemed not ready for
upstream.  I'm all for decreasing tech debt in xfstests; but do
understand that sometimes this means fixes to known test bugs will
stay in developers' git trees, since we're all overloaded.)

It's a similar problem with test failures.  Simply reporting a test
failure isn't *that* hard.  But the analysis, even if it's something
like:

generic/68[12]  encrypt   Failure percentage: 100%
    The directory does grow, but blocks aren't charged to either root or
    the non-privileged users' quota.....
    
... is the critical bit that people *really* want, and it takes real
developer time to come up with that kind of information.  In the ideal
world, I'd have an army of trained minions to run down this kind of
stuff.  In the real world, sometimes this stuff happens after
midnight, local time, on a Friday night.

(Note that Android and Chrome OS, both of which are big users of
fscrypt, don't use quota.  So if I were to open a bug tracker entry on
it, the bug would get prioritized to P2 or P3, and never be heard from
again, since there's no business reason to prioritize fixing it.
Which is why some of this happens on personal time.)

      	      	    	      	    	       - Ted
