All of lore.kernel.org
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: Jan Kara <jack@suse.cz>
Cc: Yu Kuai <yukuai1@huaweicloud.com>, Christoph Hellwig <hch@lst.de>,
	brauner@kernel.org, axboe@kernel.dk,
	linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
	yi.zhang@huawei.com, yangerkun@huawei.com,
	"yukuai (C)" <yukuai3@huawei.com>
Subject: Re: [RFC v4 linux-next 19/19] fs & block: remove bdev->bd_inode
Date: Fri, 22 Mar 2024 14:57:28 +0000	[thread overview]
Message-ID: <20240322145728.GN538574@ZenIV> (raw)
In-Reply-To: <20240322131030.pxbvtubien2t27zw@quack3>

On Fri, Mar 22, 2024 at 02:10:30PM +0100, Jan Kara wrote:
> > End result:
> > 
> > * old ->i_private leaked (already grabbed by your get_bdev_file())
> > * ->bd_openers at 1 (after your bdev_open() gets through)
> > * ->i_private left NULL.
> > 
> > Christoph, could we please get rid of that atomic_t nonsense?
> > It only confuses people into brainos like that.  It really
> > needs ->open_mutex for any kind of atomicity.
> 
> Well, there are a couple of places where we end up reading bd_openers
> without ->open_mutex. Sure these places are racy wrt other opens / closes
> so they need to be careful but we want to make sure we read back at least
> some sane value which is not guaranteed with normal int and compiler
> possily playing weird tricks when updating it. But yes, we could convert
> the atomic_t to using READ_ONCE + WRITE_ONCE in appropriate places to avoid
> these issues and make it more obvious bd_openers are not really handled in
> an atomic way.

What WRITE_ONE()?  We really shouldn't modify it without ->open_mutex; do
we ever do that?  In current mainline:

in blkdev_get_whole(), both callers under ->open_mutex:
block/bdev.c:671:       if (!atomic_read(&bdev->bd_openers))
block/bdev.c:675:       atomic_inc(&bdev->bd_openers);

in blkdev_put_whole(), the sole caller under ->open_mutex:
block_mutex/bdev.c:681:       if (atomic_dec_and_test(&bdev->bd_openers))

in blkdev_get_part(), both callers under ->open_mutex:
block/bdev.c:700:       if (!atomic_read(&part->bd_openers)) {
block/bdev.c:704:       atomic_inc(&part->bd_openers);

in blkdev_put_whole(), the sole caller under ->open_mutex:
block/bdev.c:741:       if (atomic_dec_and_test(&part->bd_openers)) {

in bdev_release(), a deliberately racy reader, commented as such:
block/bdev.c:1032:      if (atomic_read(&bdev->bd_openers) == 1)

in sync_bdevs(), under ->open_mutex:
block/bdev.c:1163:              if (!atomic_read(&bdev->bd_openers)) {

in bdev_del_partition(), under ->open_mutex:
block/partitions/core.c:460:    if (atomic_read(&part->bd_openers))

and finally, in disk_openers(), a racy reader:
include/linux/blkdev.h:231:     return atomic_read(&disk->part0->bd_openers);

So that's two READ_ONCE() and a bunch of reads and writes under ->open_mutex.
Callers of disk_openers() need to be careful and looking through those...
Some of them are under ->open_mutex (either explicitly, or as e.g. lo_release()
called only via bdev ->release(), which comes only under ->open_mutex), but
four of them are not:

arch/um/drivers/ubd_kern.c:1023:                if (disk_openers(ubd_dev->disk))
in ubd_remove().  Racy, possibly a bug.  AFAICS, it's accessible through UML
console and there's nothing to stop it from racing with open().

drivers/block/loop.c:1245:      if (disk_openers(lo->lo_disk) > 1) {
in loop_clr_fd().  Under loop's private lock, but that's likely to
be a race - ->bd_openers updates are not under that.  Note that
there's no ->open() for /dev/loop, BTW...

drivers/block/loop.c:2161:      if (lo->lo_state != Lo_unbound || disk_openers(lo->lo_disk) > 0) {
in loop_control_remove().  Similar to the previous one, except that
it's done out of band *and* it doesn't have the "autoclean" logics
to work around udev, lovingly described in the comment before the
call in loop_clr_fd().

drivers/block/nbd.c:1279:       if (disk_openers(nbd->disk) > 1)
in nbd_bdev_reset().  Under nbd private mutex (->config_lock),
so there's some exclusion with nbd_open(), but ->bd_openers change
comes outside of that.  Might or might not be a bug - I need to wake
up properly to look through that.

  reply	other threads:[~2024-03-22 14:57 UTC|newest]

Thread overview: 98+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-22 12:45 [RFC v4 linux-next 00/19] fs & block: remove bdev->bd_inode Yu Kuai
2024-02-22 12:45 ` [RFC v4 linux-next 01/19] block: move two helpers into bdev.c Yu Kuai
2024-03-15 14:31   ` Jan Kara
2024-03-17 21:19   ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 02/19] block: remove sync_blockdev_nowait() Yu Kuai
2024-03-15 14:34   ` Jan Kara
2024-03-17 21:19   ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 03/19] block: remove sync_blockdev_range() Yu Kuai
2024-03-15 14:37   ` Jan Kara
2024-03-17 21:21   ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 04/19] block: prevent direct access of bd_inode Yu Kuai
2024-03-15 14:44   ` Jan Kara
2024-03-17 21:23   ` Christoph Hellwig
2024-03-22  5:44   ` Al Viro
2024-02-22 12:45 ` [RFC v4 linux-next 05/19] bcachefs: remove dead function bdev_sectors() Yu Kuai
2024-03-15 14:42   ` Jan Kara
2024-03-17 21:23   ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 06/19] cramfs: prevent direct access of bd_inode Yu Kuai
2024-03-15 14:44   ` Jan Kara
2024-03-17 21:23   ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 07/19] erofs: " Yu Kuai
2024-03-15 14:45   ` Jan Kara
2024-03-17 21:24   ` Christoph Hellwig
2024-03-18  2:39   ` Gao Xiang
2024-02-22 12:45 ` [RFC v4 linux-next 08/19] nilfs2: " Yu Kuai
2024-03-15 14:49   ` Jan Kara
2024-03-17 21:24   ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 09/19] gfs2: " Yu Kuai
2024-03-15 14:54   ` Jan Kara
2024-03-17 21:24   ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 10/19] s390/dasd: use bdev api in dasd_format() Yu Kuai
2024-03-15 14:55   ` Jan Kara
2024-03-17 21:25   ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 11/19] btrfs: prevent direct access of bd_inode Yu Kuai
2024-03-15 15:09   ` Jan Kara
2024-03-17 21:25   ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 12/19] ext4: remove block_device_ejected() Yu Kuai
2024-02-22 12:45 ` [RFC v4 linux-next 13/19] ext4: prevent direct access of bd_inode Yu Kuai
2024-03-15 14:58   ` Jan Kara
2024-03-17 21:25   ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 14/19] jbd2: " Yu Kuai
2024-03-15 15:06   ` Jan Kara
2024-03-17 21:26   ` Christoph Hellwig
2024-03-18  1:10     ` Yu Kuai
2024-02-22 12:45 ` [RFC v4 linux-next 15/19] bcache: " Yu Kuai
2024-03-15 15:11   ` Jan Kara
2024-03-17 21:34   ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 16/19] block2mtd: " Yu Kuai
2024-03-15 15:12   ` Jan Kara
2024-03-17 21:36   ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 17/19] dm-vdo: " Yu Kuai
2024-02-28 13:41   ` Christoph Hellwig
2024-03-18  9:11     ` Jan Kara
2024-03-18  9:19   ` Jan Kara
2024-03-18 13:38     ` Yu Kuai
2024-03-19  2:00       ` Matthew Sakai
2024-02-22 12:45 ` [RFC v4 linux-next 18/19] scsi: factor out a helper bdev_read_folio() from scsi_bios_ptable() Yu Kuai
2024-03-17 21:36   ` Christoph Hellwig
2024-03-18  1:12     ` Yu Kuai
2024-03-18  9:22   ` Jan Kara
2024-02-22 12:45 ` [RFC v4 linux-next 19/19] fs & block: remove bdev->bd_inode Yu Kuai
2024-02-25  0:06   ` kernel test robot
2024-03-17 21:38   ` Christoph Hellwig
2024-03-18  1:26     ` Yu Kuai
2024-03-18  1:32       ` Christoph Hellwig
2024-03-18  1:51         ` Yu Kuai
2024-03-18  7:19           ` Yu Kuai
2024-03-18 10:07             ` Christian Brauner
2024-03-18 10:29               ` Christian Brauner
2024-03-18 10:46                 ` Christian Brauner
2024-03-18 11:57                   ` Yu Kuai
2024-03-18 23:35                 ` Christoph Hellwig
2024-03-18 23:22             ` Christoph Hellwig
2024-03-19  8:26               ` Yu Kuai
2024-03-21 11:27                 ` Jan Kara
2024-03-21 12:15                   ` Yu Kuai
2024-03-22  6:37                     ` Al Viro
2024-03-22  6:39                       ` Al Viro
2024-03-22  6:52                         ` Yu Kuai
2024-03-22 12:57                           ` Jan Kara
2024-03-22 13:57                             ` Christian Brauner
2024-03-22 15:43                           ` Al Viro
2024-03-22 16:16                             ` Al Viro
2024-03-22  6:33                 ` Al Viro
2024-03-22  7:09                   ` Yu Kuai
2024-03-22 16:01                     ` Al Viro
2024-03-22 13:10                   ` Jan Kara
2024-03-22 14:57                     ` Al Viro [this message]
2024-03-25  1:06                       ` Christoph Hellwig
2024-02-28 13:42 ` [RFC v4 linux-next 00/19] " Christoph Hellwig
2024-03-15 12:08 ` Yu Kuai
2024-03-15 13:54   ` Christian Brauner
2024-03-16  2:49     ` Yu Kuai
2024-03-18  9:39       ` Christian Brauner
2024-03-19  1:18         ` Yu Kuai
2024-03-19  1:43           ` Yu Kuai
2024-03-19  2:13             ` Matthew Sakai
2024-03-19  2:27               ` Yu Kuai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240322145728.GN538574@ZenIV \
    --to=viro@zeniv.linux.org.uk \
    --cc=axboe@kernel.dk \
    --cc=brauner@kernel.org \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=yangerkun@huawei.com \
    --cc=yi.zhang@huawei.com \
    --cc=yukuai1@huaweicloud.com \
    --cc=yukuai3@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.