From: Luis Chamberlain <mcgrof@kernel.org>
To: Christoph Hellwig <hch@infradead.org>
Cc: axboe@kernel.dk, viro@zeniv.linux.org.uk, bvanassche@acm.org,
	gregkh@linuxfoundation.org, rostedt@goodmis.org,
	mingo@redhat.com, jack@suse.cz, ming.lei@redhat.com,
	nstange@suse.de, akpm@linux-foundation.org, mhocko@suse.com,
	yukuai3@huawei.com, linux-block@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, Omar Sandoval <osandov@fb.com>,
	Hannes Reinecke <hare@suse.com>, Michal Hocko <mhocko@kernel.org>
Subject: Re: [PATCH 5/5] block: revert back to synchronous request_queue removal
Date: Tue, 14 Apr 2020 20:58:52 +0000
Message-ID: <20200414205852.GP11244@42.do-not-panic.com>
In-Reply-To: <20200414154725.GD25765@infradead.org>

On Tue, Apr 14, 2020 at 08:47:25AM -0700, Christoph Hellwig wrote:
> On Tue, Apr 14, 2020 at 04:19:02AM +0000, Luis Chamberlain wrote:
> > Commit dc9edc44de6c ("block: Fix a blk_exit_rl() regression") merged on
> > v4.12 moved the work behind blk_release_queue() into a workqueue after a
> > splat floated around which indicated some work on blk_release_queue()
> > could sleep in blk_exit_rl(). This splat would be possible when a driver
> > called blk_put_queue() or blk_cleanup_queue() (which calls blk_put_queue()
> > as its final call) from an atomic context.
> > 
> > blk_put_queue() decrements the refcount for the request_queue
> > kobject, and upon reaching 0 blk_release_queue() is called. Although
> > blk_exit_rl() is now removed through commit db6d9952356 ("block: remove
> > request_list code"), we reserve the right to be able to sleep within
> > blk_release_queue() context. If you see no other way and *have* to be
> > in atomic context when your driver calls the last blk_put_queue()
> > you can always just increase your block device's reference count with
> > bdgrab() as this can be done in atomic context and the request_queue
> > removal would be left to upper layers later. We document this bit of
> > tribal knowledge as well now, and adjust kdoc format a bit.
> > 
> > We revert back to synchronous request_queue removal because asynchronous
> > removal creates a regression with expected userspace interaction with
> > several drivers. For example, when userspace issues an ioctl to remove
> > a loopback device, one expects the device to be gone once the ioctl
> > returns successfully. Moving to asynchronous request_queue
> > removal could have broken many scripts which relied on the removal to
> > have been completed if there was no error.
> > 
> > Using asynchronous request_queue removal has, however, helped us find
> > other bugs. In the future we can test what could break with this
> > arrangement by enabling CONFIG_DEBUG_KOBJECT_RELEASE.
> > 
> > Cc: Bart Van Assche <bvanassche@acm.org>
> > Cc: Omar Sandoval <osandov@fb.com>
> > Cc: Hannes Reinecke <hare@suse.com>
> > Cc: Nicolai Stange <nstange@suse.de>
> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > Cc: Michal Hocko <mhocko@kernel.org>
> > Cc: yu kuai <yukuai3@huawei.com>
> > Suggested-by: Nicolai Stange <nstange@suse.de>
> > Fixes: dc9edc44de6c ("block: Fix a blk_exit_rl() regression")
> > Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> > ---
> >  block/blk-core.c       | 19 ++++++++++++++++++-
> >  block/blk-sysfs.c      | 38 +++++++++++++++++---------------------
> >  include/linux/blkdev.h |  2 --
> >  3 files changed, 35 insertions(+), 24 deletions(-)
> > 
> > diff --git a/block/blk-core.c b/block/blk-core.c
> > index 5aaae7a1b338..8346c7c59ee6 100644
> > --- a/block/blk-core.c
> > +++ b/block/blk-core.c
> > @@ -301,6 +301,17 @@ void blk_clear_pm_only(struct request_queue *q)
> >  }
> >  EXPORT_SYMBOL_GPL(blk_clear_pm_only);
> >  
> > +/**
> > + * blk_put_queue - decrement the request_queue refcount
> > + *
> > + * Decrements the refcount to the request_queue kobject, when this reaches
> > + * 0 we'll have blk_release_queue() called. You should avoid calling
> > + * this function in atomic context but if you really have to ensure you
> > + * first refcount the block device with bdgrab() / bdput() so that the
> > + * last decrement happens in blk_cleanup_queue().
> > + *
> > + * @q: the request_queue structure to decrement the refcount for
> > + */
> >  void blk_put_queue(struct request_queue *q)
> >  {
> >  	kobject_put(&q->kobj);
> > @@ -328,10 +339,16 @@ EXPORT_SYMBOL_GPL(blk_set_queue_dying);
> >  
> >  /**
> >   * blk_cleanup_queue - shutdown a request queue
> > - * @q: request queue to shutdown
> >   *
> >   * Mark @q DYING, drain all pending requests, mark @q DEAD, destroy and
> >   * put it.  All future requests will be failed immediately with -ENODEV.
> > + *
> > + * You should not call this function in atomic context. If you need to
> > + * refcount a request_queue in atomic context, instead refcount the
> > + * block device with bdgrab() / bdput().
> 
> I think this needs a WARN_ON thrown in to enforce the calling context.

I considered adding a might_sleep(), but when I reviewed this with Bart
he noted that this function already calls mutex_lock(), and if you look
under the hood of mutex_lock(), it has a might_sleep() at the very top.
The warning is therefore implicit.
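
To illustrate the point, here is a rough userspace model of that implicit
check. It is only a sketch: every demo_* name is made up for this example
and none of it is the kernel's actual implementation. It assumes a global
flag standing in for "we are in atomic context" and a counter standing in
for the warning splat.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins, not kernel APIs. */
static bool demo_in_atomic;	/* models "caller is in atomic context" */
static int demo_warnings;	/* counts how often the check fired */

/* Models might_sleep(): complain if the caller must not sleep. */
static void demo_might_sleep(const char *caller)
{
	if (demo_in_atomic) {
		demo_warnings++;
		fprintf(stderr, "sleeping function called from atomic context: %s\n",
			caller);
	}
}

struct demo_mutex {
	bool locked;
};

/* The check sits at the very top, before the lock is taken. */
static void demo_mutex_lock(struct demo_mutex *m)
{
	demo_might_sleep(__func__);
	m->locked = true;
}

static void demo_mutex_unlock(struct demo_mutex *m)
{
	m->locked = false;
}

struct demo_queue {
	struct demo_mutex lock;
	bool dead;
};

/* No explicit check here: taking the mutex already performs it. */
static void demo_cleanup_queue(struct demo_queue *q)
{
	assert(q);
	demo_mutex_lock(&q->lock);
	q->dead = true;
	demo_mutex_unlock(&q->lock);
}
```

Calling demo_cleanup_queue() with demo_in_atomic set bumps demo_warnings
even though demo_cleanup_queue() itself never calls the check directly,
which is the same reason an extra annotation would be redundant here.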

> > + *
> > + * @q: request queue to shutdown
> 
> Moving the argument documentation seems against the usual kerneldoc
> style.

Would you look at that: Documentation/doc-guide/kernel-doc.rst does say
to keep the arguments at the top, as they were before. OK, I will revert
that. Sorry, I used include/net/mac80211.h as my base for style.
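
For reference, the layout I understand the guide to ask for looks roughly
like the sketch below: the short description first, then the @argument
lines, then the longer description, with Context: and Return: sections at
the end. The struct and function are hypothetical and exist only to give
the comment something to document; they are not the block layer code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical type used only for this example. */
struct demo_queue {
	int refcount;
	bool released;
};

/**
 * demo_put_queue() - decrement the refcount of a demo queue
 * @q: the demo queue to drop a reference on
 *
 * Drops one reference. When the count reaches zero the queue is marked
 * released, which stands in for a release callback being invoked.
 *
 * Context: The release path of this model is assumed to be allowed to
 *          sleep, so do not call this from atomic context.
 * Return: true if this call dropped the last reference, false otherwise.
 */
static bool demo_put_queue(struct demo_queue *q)
{
	assert(q && q->refcount > 0);

	if (--q->refcount > 0)
		return false;

	q->released = true;
	return true;
}
```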

> Otherwise this looks good, I hope it sticks :)

I hope the kdoc updates and the might_sleep() checks make it stick now.
But hey, this uncovered wonderful obscure bugs, it was fun. I'll also add
a selftest later to ensure we don't regress on any of this again.

  Luis
