All of lore.kernel.org
 help / color / mirror / Atom feed
From: Josef Bacik <josef@toxicpanda.com>
To: Filipe Manana <fdmanana@kernel.org>
Cc: fstests@vger.kernel.org, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] fstests: UDEV_SETTLE_PROG before dmsetup create
Date: Thu, 2 Dec 2021 10:45:35 -0500	[thread overview]
Message-ID: <YajqH1Njtqz0ZXF+@localhost.localdomain> (raw)
In-Reply-To: <YajcBNMm3dR4YI/N@debian9.Home>

On Thu, Dec 02, 2021 at 02:45:24PM +0000, Filipe Manana wrote:
> On Wed, Dec 01, 2021 at 01:18:52PM -0500, Josef Bacik wrote:
> > We've been seeing transient errors with any test that uses a dm device
> > for the entirety of the time that we've been running nightly xfstests
> 
> I have been having it on my tests vms since ever as well.
> It's really annoying, but fortunatelly it doesn't happen too often.
> 

Yeah before this change we'd fail ~6 tests on every configruation on every
overnight run.  With this change we fail 0.  It's rare, but with our continual
testing it happens sooooooo much.

> > runs.  This turns out to be because sometimes we get EBUSY while trying
> > to create our new dm device.  Generally this is because the test comes
> > right after another test that messes with the dm device, and thus we
> > still have udev messing around with the device when DM tries to O_EXCL
> > the block device.
> > 
> > Add a UDEV_SETTLE_PROG before creating the device to make sure we can
> > create our new dm device without getting this transient error.
> 
> I suspect this might only make it seem the problem goes away but does not
> really fix it.
> 
> I say that for 2 reasons:
> 
> 1) All tests that use dm end up calling _dmsetup_remove(), like through
>    _log_writes_remove() or _cleanup_flakey() for example. Normally those
>    are called in the _cleanup() function, which ensures it's done even if
>    the test fails for some reason.
> 
>    So I don't understand why we need that UDEV_SETTLE_PROG at _dmsetup_create().
> 
>    And I've seen the ebusy failure happen even when the previous tests did
>    not use any dm device;
> 
> 2) Some tests fail after creating the dm device and using it. For example
>    btrfs/206 often fails when it tries to fsck the filesystem:
> 
>    btrfs/206 3s ... [failed, exit status 1]- output mismatch (see /home/fdmanana/git/hub/xfstests/results//btrfs/206.out.bad)
>         --- tests/btrfs/206.out     2020-10-16 23:13:46.554152652 +0100
>         +++ /home/fdmanana/git/hub/xfstests/results//btrfs/206.out.bad      2021-12-01 21:09:46.317632589 +0000
>         @@ -3,3 +3,5 @@
>         XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>         wrote 8192/8192 bytes at offset 0
>         XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>         +_check_btrfs_filesystem: filesystem on /dev/sdc is inconsistent
>         +(see /home/fdmanana/git/hub/xfstests/results//btrfs/206.full for details)
>         ...
> 
>        (Run 'diff -u /home/fdmanana/git/hub/xfstests/tests/btrfs/206.out /home/fdmanana/git/hub/xfstests/results//btrfs/206.out.bad'  to see the entire diff)
> 
>     In the .full file I got:
> 
>     (...)
>     replaying 1239@11201: sector 2173408, size 16384, flags 0x10(METADATA)
>     replaying 1240@11234: sector 0, size 0, flags 0x1(FLUSH)
>     replaying 1241@11235: sector 128, size 4096, flags 0x12(FUA|METADATA)
>     _check_btrfs_filesystem: filesystem on /dev/sdc is inconsistent
>     *** fsck.btrfs output ***
>     ERROR: cannot open device '/dev/sdc': Device or resource busy
>     ERROR: cannot open file system
>     Opening filesystem to check...
>     *** end fsck.btrfs output
>     *** mount output ***
> 
>    The ebusy failure is not when the test starts, but when somewhere in the middle
>    of the replay loop when it calls fsck, or when it ends and the fstests framework
>    calls fsck.
> 
>    I've seen that with btrfs/172 too, which also uses dm logwrites in a similar way.
> 
> So to me this suggests 2 things:
> 
> 1) Calling UDEV_SETTLE_PROG at _dmsetup_create() doesn't solve that problem with
>    btrfs/206 (and other tests) - the problem is fsck failing to open the scratch
>    device after it called _log_writes_remove() -> _dmsetup_remove(), and not a
>    failure to create the dm device;
> 
> 2) The problem is likely something missing at _dmsetup_remove(). Perhaps add
>    another UDEV_SETTLE_PROG there:
> 
>    diff --git a/common/rc b/common/rc
>    index 8e351f17..22b34677 100644
>    --- a/common/rc
>    +++ b/common/rc
>    @@ -4563,6 +4563,7 @@ _dmsetup_remove()
>             $UDEV_SETTLE_PROG >/dev/null 2>&1
>             $DMSETUP_PROG remove "$@" >>$seqres.full 2>&1
>             $DMSETUP_PROG mknodes >/dev/null 2>&1
>     +       $UDEV_SETTLE_PROG >/dev/null 2>&1
>      }
>  
>     _dmsetup_create()
> 
>   I can't say if that change to _dmsetup_remove() is correct, or what it's
>   needed, as I really haven't spent time trying to figure out why the issue
>   happens.
> 

I actually tried a few iterations before I settled on this one, but I was only
trying to reproduce the EBUSY when creating the flakey device, I hadn't seen it
with fsck.  So I originally started with your change, but it didn't fix the
problem.  Then I did both, UDEV_SETTLE at the end of remove and at the beginning
of create and the problem went away, and then I removed the one from remove and
the problem still was gone.

But since I've made this change I also have been investigating another problem
where we'll get EBUSY at mkfs time when we use SCRATCH_DEV_POOL.  I have a test
patch in our staging branch to make sure it actuall fixes it, but I have to add
this to the start of _scratch_pool_mkfs as well.

It turns out that udev is doing this thing where it writes to
/sys/block/whatever/uevent to make sure that a KOBJ_CHANGE event gets sent out
for the device.

This is racing with the test doing a mount.  So the mount gets O_EXCL, which
means the uevent doesn't get emitted until umount.  This would explain what
you're seeing, we umount, we get the KOBJ_CHANGE event once the O_EXCL is
dropped, udev runs, and then fsck gets an EBUSY.

This is a very long email to say that udev is causing spurious failures because
of behavior I don't entirely understand.  We're going to need to sprinkle in
UDEV_SETTLE_PROG in different places to kill all of these different scenarios.

What do we think is best here?  Put UDEV_SETTLE_PROG at the start of any
function that needs to do O_EXCL?  So this would mean

_dmsetup_create
_dmsetup_remove
*_mkfs
*_mount
*_check

That would be safest, and I can certainly do that.  My initial approach was just
to do it where it was problematic, but the debugging I did yesterday around
btrfs/223 failures and the fact that udev is queue'ing up events that get
delivered at some point in the future makes it kind of hard to handle on a
case-by-case basis.  Thanks,

Josef

  reply	other threads:[~2021-12-02 15:45 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-01 18:18 [PATCH] fstests: UDEV_SETTLE_PROG before dmsetup create Josef Bacik
2021-12-02 14:45 ` Filipe Manana
2021-12-02 15:45   ` Josef Bacik [this message]
2021-12-06  5:47     ` Chandan Babu R
2021-12-06 14:07       ` Chandan Babu R
2021-12-06 19:28         ` Josef Bacik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YajqH1Njtqz0ZXF+@localhost.localdomain \
    --to=josef@toxicpanda.com \
    --cc=fdmanana@kernel.org \
    --cc=fstests@vger.kernel.org \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.