linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Bart Van Assche <bvanassche@acm.org>
Cc: Jan Kara <jack@suse.cz>, John Lenton <john.lenton@canonical.com>,
	Kai-Heng Feng <kaihengfeng@me.com>, Jens Axboe <axboe@kernel.dk>,
	Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	linux-block@vger.kernel.org,
	jean-baptiste.lallement@canonical.com
Subject: Re: [PATCH] loop: Don't change loop device under exclusive opener
Date: Wed, 7 Aug 2019 11:45:20 +0200	[thread overview]
Message-ID: <20190807094520.GB14658@quack2.suse.cz> (raw)
In-Reply-To: <43344436-99b5-f0a7-b71e-2bbb025dfd09@acm.org>

[-- Attachment #1: Type: text/plain, Size: 2839 bytes --]

On Mon 05-08-19 09:41:39, Bart Van Assche wrote:
> On 7/30/19 6:36 AM, Jan Kara wrote:
> > On Tue 30-07-19 12:16:46, Jan Kara wrote:
> > > On Tue 30-07-19 10:36:59, John Lenton wrote:
> > > > On Tue, 30 Jul 2019 at 10:29, Jan Kara <jack@suse.cz> wrote:
> > > > > 
> > > > > Thanks for the notice and the references. What's your version of
> > > > > util-linux? What your test script does is indeed racy. You have there:
> > > > > 
> > > > > echo Running:
> > > > > for i in {a..z}{a..z}; do
> > > > >      mount $i.squash /mnt/$i &
> > > > > done
> > > > > 
> > > > > So all mount(8) commands will run in parallel and race to setup loop
> > > > > devices with LOOP_SET_FD and mount them. However util-linux (at least in
> > > > > the current version) seems to handle EBUSY from LOOP_SET_FD just fine and
> > > > > retries with the new loop device. So at this point I don't see why the patch
> > > > > makes difference... I guess I'll need to reproduce and see what's going on
> > > > > in detail.
> > > > 
> > > > We've observed this in arch with util-linux 2.34, and ubuntu 19.10
> > > > (eoan ermine) with util-linux 2.33.
> > > > 
> > > > just to be clear, the initial reports didn't involve a zany loop of
> > > > mounts, but were triggered by effectively the same thing as systemd
> > > > booted a system with a lot of snaps. The reroducer tries to makes
> > > > things simpler to reproduce :-). FWIW,  systemd versions were 244 and
> > > > 242 for those systems, respectively.
> > > 
> > > Thanks for info! So I think I see what's going on. The two mounts race
> > > like:
> > > 
> > > MOUNT1					MOUNT2
> > > num = ioctl(LOOP_CTL_GET_FREE)
> > > 					num = ioctl(LOOP_CTL_GET_FREE)
> > > ioctl("/dev/loop$num", LOOP_SET_FD, ..)
> > >   - returns OK
> > > 					ioctl("/dev/loop$num", LOOP_SET_FD, ..)
> > > 					  - acquires exclusine loop$num
> > > 					    reference
> > > mount("/dev/loop$num", ...)
> > >   - sees exclusive reference from MOUNT2 and fails
> > > 					  - sees loop device is already
> > > 					    bound and fails
> > > 
> > > It is a bug in the scheme I've chosen that racing LOOP_SET_FD can block
> > > perfectly valid mount. I'll think how to fix this...
> > 
> > So how about attached patch? It fixes the regression for me.
 
Hi Bart,

> A new kernel warning is triggered by blktests block/001 that did not happen
> without this patch. Reverting commit 89e524c04fa9 ("loop: Fix mount(2)
> failure due to race with LOOP_SET_FD") makes that kernel warning disappear.
> Is this reproducible on your setup?

Thanks for report! Hum, no, it seems the warning doesn't trigger in my test
VM. But reviewing the mentioned commit with fresh head, I can see where I
did a mistake during my conversion of blkdev_get(). Does attached patch fix
the warning for you?

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

[-- Attachment #2: 0001-bdev-Fixup-error-handling-in-blkdev_get.patch --]
[-- Type: text/x-patch, Size: 1501 bytes --]

From c4cd39244088d8de548264bc33dc9fb8f0f1db2d Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Wed, 7 Aug 2019 11:36:47 +0200
Subject: [PATCH] bdev: Fixup error handling in blkdev_get()

Commit 89e524c04fa9 ("loop: Fix mount(2) failure due to race with
LOOP_SET_FD") converted blkdev_get() to use the new helpers for
finishing claiming of a block device. However the conversion botched the
error handling in blkdev_get() and thus the bdev has been marked as held
even in case __blkdev_get() returned error. This led to occasional
warnings with block/001 test from blktests like:

kernel: WARNING: CPU: 5 PID: 907 at fs/block_dev.c:1899 __blkdev_put+0x396/0x3a0

Correct the error handling.

CC: stable@vger.kernel.org
Fixes: 89e524c04fa9 ("loop: Fix mount(2) failure due to race with LOOP_SET_FD")
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/block_dev.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index a6f7c892cb4a..7db181581606 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1754,7 +1754,10 @@ int blkdev_get(struct block_device *bdev, fmode_t mode, void *holder)
 
 		/* finish claiming */
 		mutex_lock(&bdev->bd_mutex);
-		bd_finish_claiming(bdev, whole, holder);
+		if (!res)
+			bd_finish_claiming(bdev, whole, holder);
+		else
+			bd_abort_claiming(bdev, whole, holder);
 		/*
 		 * Block event polling for write claims if requested.  Any
 		 * write holder makes the write_holder state stick until
-- 
2.16.4


  parent reply	other threads:[~2019-08-07  9:52 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-16 14:01 [PATCH] loop: Don't change loop device under exclusive opener Jan Kara
2019-05-16 20:44 ` Jens Axboe
2019-05-27 12:29   ` Jan Kara
2019-05-27 13:34     ` Jens Axboe
2019-07-18  8:15       ` Kai-Heng Feng
2019-07-30  9:29         ` Jan Kara
2019-07-30  9:36           ` John Lenton
2019-07-30 10:16             ` Jan Kara
2019-07-30 13:36               ` Jan Kara
2019-07-30 17:59                 ` Kai-Heng Feng
2019-07-30 19:17                 ` Jens Axboe
2019-07-30 21:11                   ` John Lenton
2019-07-31  8:56                   ` Jan Kara
2019-08-05 16:41                 ` Bart Van Assche
2019-08-05 21:01                   ` Tetsuo Handa
2019-08-07  9:45                   ` Jan Kara [this message]
2019-08-07 18:45                     ` Bart Van Assche
2019-08-08 13:37                     ` Jens Axboe
2019-07-30 10:16           ` Tetsuo Handa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190807094520.GB14658@quack2.suse.cz \
    --to=jack@suse.cz \
    --cc=axboe@kernel.dk \
    --cc=bvanassche@acm.org \
    --cc=jean-baptiste.lallement@canonical.com \
    --cc=john.lenton@canonical.com \
    --cc=kaihengfeng@me.com \
    --cc=linux-block@vger.kernel.org \
    --cc=penguin-kernel@i-love.sakura.ne.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).