From: Bart Van Assche <bvanassche@acm.org>
To: Luis Chamberlain <mcgrof@kernel.org>,
	linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org
Cc: amir73il@gmail.com, pankydev8@gmail.com, tytso@mit.edu,
	josef@toxicpanda.com, jmeneghi@redhat.com,
	Jan Kara <jack@suse.cz>, Davidlohr Bueso <dave@stgolabs.net>,
	Dan Williams <dan.j.williams@intel.com>, Jake Edge <jake@lwn.net>,
	Klaus Jensen <its@irrelevant.dk>
Subject: Re: [RFC: kdevops] Standardizing on failure rate nomenclature for expunges
Date: Sat, 2 Jul 2022 14:48:12 -0700
Message-ID: <a120fb86-5a08-230f-33ee-1cb47381fff1@acm.org>
In-Reply-To: <YoW0ZC+zM27Pi0Us@bombadil.infradead.org>

On 5/18/22 20:07, Luis Chamberlain wrote:
> I've been promoting the idea that running fstests once is nice,
> but things get interesting if you try to run fstests multiple
> times until a failure is found. It turns out at least kdevops has
> found tests which typically fail at an average rate of 1/2 to
> 1/30. That is, 1/2 means a failure can happen 50% of the time,
> whereas 1/30 means it takes about 30 runs on average to find the
> failure.
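
To make the arithmetic behind "1/30" concrete: if each run is assumed to
fail independently with probability p (an assumption; a real flaky test
need not behave this way), the number of runs until the first failure is
geometrically distributed. Its mean is 1/p, but that does not mean 30
runs are enough to see a 1/30 failure. A minimal Python sketch, with
illustrative function names:

```python
import math

def p_fail_within(p: float, n: int) -> float:
    """Probability of at least one failure in n independent runs,
    each failing with probability p (0 < p < 1)."""
    return 1.0 - (1.0 - p) ** n

def runs_for_confidence(p: float, confidence: float) -> int:
    """Smallest n for which a failure shows up within n runs with at
    least the given probability (0 < confidence < 1)."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))

print(f"{p_fail_within(1 / 30, 30):.3f}")  # 0.638
print(runs_for_confidence(1 / 30, 0.95))   # 89
```

Under this model, 30 runs catch a 1/30 failure only about 64% of the
time, and roughly 89 runs are needed to catch it with 95% probability.
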
> 
> I have tried my best to annotate failure rates when I know what
> they might be on the test expunge list, as an example:
> 
> workflows/fstests/expunges/5.17.0-rc7/xfs/unassigned/xfs_reflink.txt:generic/530 # failure rate about 1/15 https://gist.github.com/mcgrof/4129074db592c170e6bf748aa11d783d
> 
> The term "failure rate 1/15" is 16 characters long, so I'd like
> to propose to standardize a way to represent this. How about
> 
> generic/530 # F:1/15
> 
> Then we could extend the definition. F being the current estimate, and
> this can be just how long it took to find the first failure. A more
> valuable figure would be the average failure rate: run the test
> multiple times, say 10, see what the failure rate is each time, and
> average the results. So this could be a more accurate representation.
> For this how about:
> 
> generic/530 # FA:1/15
> 
> This would mean that on average the failure rate has been found to be
> about 1/15, and this was determined based on 10 runs.
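
For illustration, the proposed "F:" and "FA:" annotations are simple to
read back by machine. The sketch below is not kdevops code; the
`Expunge` type and `parse_expunge` function are hypothetical names, and
it assumes one test per line with the annotation in the trailing "#"
comment, as in the examples quoted above:

```python
import re
from fractions import Fraction
from typing import NamedTuple, Optional

class Expunge(NamedTuple):
    test: str                 # e.g. "generic/530"
    kind: Optional[str]       # "F" or "FA"; None if the line has no annotation
    rate: Optional[Fraction]  # e.g. Fraction(1, 15)

# Matches "F:1/15" or "FA:1/15" inside the trailing comment.
_RATE = re.compile(r"\b(FA|F):(\d+)/([1-9]\d*)")

def parse_expunge(line: str) -> Expunge:
    """Split an expunge line into test name and optional failure rate."""
    test, _, comment = line.partition("#")
    match = _RATE.search(comment)
    if match is None:
        return Expunge(test.strip(), None, None)
    kind, num, den = match.groups()
    return Expunge(test.strip(), kind, Fraction(int(num), int(den)))

print(parse_expunge("generic/530 # FA:1/15"))
```

Lines that carry only a free-form comment, such as "failure rate about
1/15", come back with `kind` and `rate` set to None.
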
> 
> We should also go extend check for fstests/blktests to run a test
> until a failure is found and report back the number of successes.
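
The "run until a failure is found and report the number of successes"
idea can be sketched outside of check itself. This is only an
illustration, not a patch for fstests or blktests: the function names
are hypothetical, and treating a non-zero exit status as a failure is
an assumption about the test runner.

```python
import subprocess
from typing import Callable, Optional, Sequence

def successes_before_failure(run_once: Callable[[], bool],
                             max_runs: int) -> Optional[int]:
    """Call run_once() until it returns False.

    Returns the number of successful runs seen before the first
    failure, or None if no failure occurred within max_runs.
    """
    for successes in range(max_runs):
        if not run_once():
            return successes
    return None

def command_passes(argv: Sequence[str]) -> bool:
    """Run a command; a zero exit status counts as a pass (assumption)."""
    return subprocess.run(list(argv)).returncode == 0
```

A caller could then wrap its own test invocation, for example
`successes_before_failure(lambda: command_passes(cmd), 100)`, where
`cmd` is whatever command line runs a single test on that setup.
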
> 
> Thoughts?
> 
> Note: yes, failure rates lower than 1/100 do exist but they are rare
> creatures. I love them though, as my experience so far shows that they
> uncover hidden bones in the closet, and they may take months and
> a lot of eyeballs to resolve.

I strongly disagree with annotating tests with failure rates. My opinion 
is that on a given test setup a test either should pass 100% of the time 
or fail 100% of the time. If a test passes in one run and fails in 
another run that either indicates a bug in the test or a bug in the 
software that is being tested. Examples of behaviors that can cause 
tests to behave unpredictably are use-after-free bugs and race 
conditions. How likely it is to trigger such behavior depends on a 
number of factors. This could even depend on external factors like which 
network packets are received from other systems. I do not expect
flaky tests to have an exact failure rate. Hence my opinion is that
flaky tests are not useful, and that annotating them with a failure
rate is not useful either. If a test is flaky, I think the root cause
of the flakiness must be determined and fixed.

Bart.
