linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Dan Williams <dan.j.williams@intel.com>
Cc: axboe@fb.com, Andi Kleen <ak@linux.intel.com>,
	Jan Kara <jack@suse.cz>, Rabin Vincent <rabinv@axis.com>,
	linux-kernel@vger.kernel.org, stable@vger.kernel.org,
	linux-block@vger.kernel.org, Jeff Moyer <jmoyer@redhat.com>,
	Wei Fang <fangwei1@huawei.com>, Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH] block: fix blk_get_backing_dev_info() crash, use bdev->bd_queue
Date: Fri, 6 Jan 2017 11:23:30 +0100	[thread overview]
Message-ID: <20170106102330.GB3533@quack2.suse.cz> (raw)
In-Reply-To: <148366547505.38941.646379357860772670.stgit@dwillia2-desk3.amr.corp.intel.com>

On Thu 05-01-17 17:17:55, Dan Williams wrote:
> The ->bd_queue member of struct block_device was added in commit
> 87192a2a49c4 ("vfs: cache request_queue in struct block_device") in
> v3.3. However, blk_get_backing_dev_info() has been using
> bdev_get_queue() and grabbing the request_queue through the gendisk
> since before the git era.
> 
> At final __blkdev_put() time ->bd_disk is cleared while ->bd_queue is
> not. The queue remains valid until the final put of the parent disk.
> 
> The following crash signature results from blk_get_backing_dev_info()
> trying to lookup the queue through ->bd_disk after the final put of the
> block device. Simply switch bdev_get_queue() to use ->bd_queue directly
> which is guaranteed to still be valid at invalidate_partition() time.
> 
>  BUG: unable to handle kernel NULL pointer dereference at 0000000000000568
>  IP: blk_get_backing_dev_info+0x10/0x20
>  [..]
>  Call Trace:
>   __inode_attach_wb+0x3a7/0x5d0
>   __filemap_fdatawrite_range+0xf8/0x100
>   filemap_write_and_wait+0x40/0x90
>   fsync_bdev+0x54/0x60
>   ? bdget_disk+0x30/0x40
>   invalidate_partition+0x24/0x50
>   del_gendisk+0xfa/0x230

So we have a similar reports of the same problem. E.g.:

http://www.spinics.net/lists/linux-fsdevel/msg105153.html

However I kind of miss how your patch would fix all those cases. The
principial problem is that inode_to_bdi() called on block device inode
wants to get the backing_dev_info however on last close of a block device
we do put_disk() and thus the request queue containing backing_dev_info
does not have to be around at that time. In your case you are lucky enough
to have the containing disk still around but that's not the case for all
inode_to_bdi() users (see e.g. the report I referenced) and your patch
would change relatively straightforward NULL pointer dereference to rather
subtle use-after-free issue so I disagree with going down this path.

So what I think needs to be done is that we make backing_dev_info
independently allocated structure with different lifetime rules to gendisk
or request_queue - definitely I want it to live as long as block device
inode exists. However it needs more thought what the exact lifetime rules
will be.

								Honza
> 
> Cc: <stable@vger.kernel.org>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Jens Axboe <axboe@fb.com>
> Cc: Jeff Moyer <jmoyer@redhat.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Wei Fang <fangwei1@huawei.com>
> Cc: Rabin Vincent <rabinv@axis.com>
> Cc: Andi Kleen <ak@linux.intel.com>
> Fixes: 87192a2a49c4 ("vfs: cache request_queue in struct block_device")
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  block/blk-core.c       |    4 ++--
>  include/linux/blkdev.h |    2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 61ba08c58b64..04737548e1e1 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -109,8 +109,8 @@ void blk_queue_congestion_threshold(struct request_queue *q)
>   * @bdev:	device
>   *
>   * Locates the passed device's request queue and returns the address of its
> - * backing_dev_info.  This function can only be called if @bdev is opened
> - * and the return value is never NULL.
> + * backing_dev_info.  This function can only be called while there is an
> + * active reference against the parent gendisk.
>   */
>  struct backing_dev_info *blk_get_backing_dev_info(struct block_device *bdev)
>  {
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 83695641bd5e..27779709f821 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -962,7 +962,7 @@ bool blk_mq_poll(struct request_queue *q, blk_qc_t cookie);
>  
>  static inline struct request_queue *bdev_get_queue(struct block_device *bdev)
>  {
> -	return bdev->bd_disk->queue;	/* this is never NULL */
> +	return bdev->bd_queue;	/* this is never NULL */
>  }
>  
>  /*
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

  reply	other threads:[~2017-01-06 10:25 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-06  1:17 [PATCH] block: fix blk_get_backing_dev_info() crash, use bdev->bd_queue Dan Williams
2017-01-06 10:23 ` Jan Kara [this message]
2017-01-06 17:45   ` Dan Williams
2017-01-08 19:46     ` Jan Kara
2017-01-08 20:50       ` Dan Williams
2017-01-10  1:59         ` Dan Williams
2017-01-10 15:57   ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170106102330.GB3533@quack2.suse.cz \
    --to=jack@suse.cz \
    --cc=ak@linux.intel.com \
    --cc=axboe@fb.com \
    --cc=dan.j.williams@intel.com \
    --cc=fangwei1@huawei.com \
    --cc=hch@lst.de \
    --cc=jmoyer@redhat.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rabinv@axis.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).