From: Marek Szyprowski <m.szyprowski@samsung.com>
To: Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>,
Mike Snitzer <snitzer@redhat.com>
Cc: linux-block@vger.kernel.org, dm-devel@redhat.com,
Zolnierkiewicz <b.zolnierkie@samsung.com>,
Bartlomiej
Subject: Re: [dm-devel] [PATCH 4/8] block: support delayed holder registration
Date: Tue, 10 Aug 2021 23:30:58 +0200 [thread overview]
Message-ID: <df80819c-1246-a5e7-15a0-b9efa9f563b5@samsung.com> (raw)
In-Reply-To: <20210804094147.459763-5-hch@lst.de>
Hi,
On 04.08.2021 11:41, Christoph Hellwig wrote:
> device mapper needs to register holders before it is ready to do I/O.
> Currently it does so by registering the disk early, which can leave
> the disk and queue in a weird half state where the queue is registered
> with the disk, except for sysfs and the elevator. And this state has
> been a bit promlematic before, and will get more so when sorting out
> the responsibilities between the queue and the disk.
>
> Support registering holders on an initialized but not registered disk
> instead by delaying the sysfs registration until the disk is registered.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Mike Snitzer <snitzer@redhat.com>
This patch landed in today's linux-next (20210810) as commit
d62633873590 ("block: support delayed holder registration"). It triggers
a following lockdep warning on ARM64's virt 'machine' on QEmu:
======================================================
WARNING: possible circular locking dependency detected
5.14.0-rc4+ #10642 Not tainted
------------------------------------------------------
systemd-udevd/227 is trying to acquire lock:
ffffb6b41952d628 (mtd_table_mutex){+.+.}-{3:3}, at: blktrans_open+0x40/0x250
but task is already holding lock:
ffff0eacc403bb18 (&disk->open_mutex){+.+.}-{3:3}, at:
blkdev_get_by_dev+0x110/0x2f8
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&disk->open_mutex){+.+.}-{3:3}:
__mutex_lock+0xa4/0x978
mutex_lock_nested+0x54/0x60
bd_register_pending_holders+0x2c/0x118
__device_add_disk+0x1d8/0x368
device_add_disk+0x10/0x18
add_mtd_blktrans_dev+0x2dc/0x428
mtdblock_add_mtd+0x68/0x98
blktrans_notify_add+0x44/0x70
add_mtd_device+0x430/0x4a0
mtd_device_parse_register+0x1a4/0x2b0
physmap_flash_probe+0x44c/0x780
platform_probe+0x90/0xd8
really_probe+0x138/0x2d0
__driver_probe_device+0x78/0xd8
driver_probe_device+0x40/0x110
__driver_attach+0xcc/0x118
bus_for_each_dev+0x68/0xc8
driver_attach+0x20/0x28
bus_add_driver+0x168/0x1f8
driver_register+0x60/0x110
__platform_driver_register+0x24/0x30
physmap_init+0x18/0x20
do_one_initcall+0x84/0x450
kernel_init_freeable+0x31c/0x38c
kernel_init+0x20/0x120
ret_from_fork+0x10/0x18
-> #0 (mtd_table_mutex){+.+.}-{3:3}:
__lock_acquire+0xff4/0x1840
lock_acquire+0x130/0x3e8
__mutex_lock+0xa4/0x978
mutex_lock_nested+0x54/0x60
blktrans_open+0x40/0x250
blkdev_get_whole+0x28/0x120
blkdev_get_by_dev+0x15c/0x2f8
blkdev_open+0x50/0xb0
do_dentry_open+0x238/0x3c0
vfs_open+0x28/0x30
path_openat+0x720/0x938
do_filp_open+0x80/0x108
do_sys_openat2+0x1b4/0x2c8
do_sys_open+0x68/0x88
__arm64_compat_sys_openat+0x1c/0x28
invoke_syscall+0x40/0xf8
el0_svc_common+0x60/0x100
do_el0_svc_compat+0x1c/0x48
el0_svc_compat+0x20/0x30
el0t_32_sync_handler+0xec/0x140
el0t_32_sync+0x168/0x16c
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&disk->open_mutex);
lock(mtd_table_mutex);
lock(&disk->open_mutex);
lock(mtd_table_mutex);
*** DEADLOCK ***
1 lock held by systemd-udevd/227:
#0: ffff0eacc403bb18 (&disk->open_mutex){+.+.}-{3:3}, at:
blkdev_get_by_dev+0x110/0x2f8
stack backtrace:
CPU: 1 PID: 227 Comm: systemd-udevd Not tainted 5.14.0-rc4+ #10642
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0x0/0x1d0
show_stack+0x14/0x20
dump_stack_lvl+0x88/0xb0
dump_stack+0x14/0x2c
print_circular_bug.isra.50+0x1ac/0x200
check_noncircular+0x134/0x148
__lock_acquire+0xff4/0x1840
lock_acquire+0x130/0x3e8
__mutex_lock+0xa4/0x978
mutex_lock_nested+0x54/0x60
blktrans_open+0x40/0x250
blkdev_get_whole+0x28/0x120
blkdev_get_by_dev+0x15c/0x2f8
blkdev_open+0x50/0xb0
do_dentry_open+0x238/0x3c0
vfs_open+0x28/0x30
path_openat+0x720/0x938
do_filp_open+0x80/0x108
do_sys_openat2+0x1b4/0x2c8
do_sys_open+0x68/0x88
__arm64_compat_sys_openat+0x1c/0x28
invoke_syscall+0x40/0xf8
el0_svc_common+0x60/0x100
do_el0_svc_compat+0x1c/0x48
el0_svc_compat+0x20/0x30
el0t_32_sync_handler+0xec/0x140
el0t_32_sync+0x168/0x16c
If this is a false positive, then it should be annotated as such.
> ---
> block/genhd.c | 10 +++++++
> block/holder.c | 68 ++++++++++++++++++++++++++++++++-----------
> include/linux/genhd.h | 5 ++++
> 3 files changed, 66 insertions(+), 17 deletions(-)
>
> diff --git a/block/genhd.c b/block/genhd.c
> index cd4eab744667..db916f779077 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -447,6 +447,16 @@ static void register_disk(struct device *parent, struct gendisk *disk,
> kobject_create_and_add("holders", &ddev->kobj);
> disk->slave_dir = kobject_create_and_add("slaves", &ddev->kobj);
>
> + /*
> + * XXX: this is a mess, can't wait for real error handling in add_disk.
> + * Make sure ->slave_dir is NULL if we failed some of the registration
> + * so that the cleanup in bd_unlink_disk_holder works properly.
> + */
> + if (bd_register_pending_holders(disk) < 0) {
> + kobject_put(disk->slave_dir);
> + disk->slave_dir = NULL;
> + }
> +
> if (disk->flags & GENHD_FL_HIDDEN)
> return;
>
> diff --git a/block/holder.c b/block/holder.c
> index 11e65d99a9fb..4568cc4f6827 100644
> --- a/block/holder.c
> +++ b/block/holder.c
> @@ -28,6 +28,19 @@ static void del_symlink(struct kobject *from, struct kobject *to)
> sysfs_remove_link(from, kobject_name(to));
> }
>
> +static int __link_disk_holder(struct block_device *bdev, struct gendisk *disk)
> +{
> + int ret;
> +
> + ret = add_symlink(disk->slave_dir, bdev_kobj(bdev));
> + if (ret)
> + return ret;
> + ret = add_symlink(bdev->bd_holder_dir, &disk_to_dev(disk)->kobj);
> + if (ret)
> + del_symlink(disk->slave_dir, bdev_kobj(bdev));
> + return ret;
> +}
> +
> /**
> * bd_link_disk_holder - create symlinks between holding disk and slave bdev
> * @bdev: the claimed slave bdev
> @@ -66,7 +79,7 @@ int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk)
> WARN_ON_ONCE(!bdev->bd_holder);
>
> /* FIXME: remove the following once add_disk() handles errors */
> - if (WARN_ON(!disk->slave_dir || !bdev->bd_holder_dir))
> + if (WARN_ON(!bdev->bd_holder_dir))
> goto out_unlock;
>
> holder = bd_find_holder_disk(bdev, disk);
> @@ -84,28 +97,28 @@ int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk)
> INIT_LIST_HEAD(&holder->list);
> holder->bdev = bdev;
> holder->refcnt = 1;
> -
> - ret = add_symlink(disk->slave_dir, bdev_kobj(bdev));
> - if (ret)
> - goto out_free;
> -
> - ret = add_symlink(bdev->bd_holder_dir, &disk_to_dev(disk)->kobj);
> - if (ret)
> - goto out_del;
> + if (disk->slave_dir) {
> + ret = __link_disk_holder(bdev, disk);
> + if (ret) {
> + kfree(holder);
> + goto out_unlock;
> + }
> + }
>
> list_add(&holder->list, &disk->slave_bdevs);
> - goto out_unlock;
> -
> -out_del:
> - del_symlink(disk->slave_dir, bdev_kobj(bdev));
> -out_free:
> - kfree(holder);
> out_unlock:
> mutex_unlock(&disk->open_mutex);
> return ret;
> }
> EXPORT_SYMBOL_GPL(bd_link_disk_holder);
>
> +static void __unlink_disk_holder(struct block_device *bdev,
> + struct gendisk *disk)
> +{
> + del_symlink(disk->slave_dir, bdev_kobj(bdev));
> + del_symlink(bdev->bd_holder_dir, &disk_to_dev(disk)->kobj);
> +}
> +
> /**
> * bd_unlink_disk_holder - destroy symlinks created by bd_link_disk_holder()
> * @bdev: the calimed slave bdev
> @@ -123,11 +136,32 @@ void bd_unlink_disk_holder(struct block_device *bdev, struct gendisk *disk)
> mutex_lock(&disk->open_mutex);
> holder = bd_find_holder_disk(bdev, disk);
> if (!WARN_ON_ONCE(holder == NULL) && !--holder->refcnt) {
> - del_symlink(disk->slave_dir, bdev_kobj(bdev));
> - del_symlink(bdev->bd_holder_dir, &disk_to_dev(disk)->kobj);
> + if (disk->slave_dir)
> + __unlink_disk_holder(bdev, disk);
> list_del_init(&holder->list);
> kfree(holder);
> }
> mutex_unlock(&disk->open_mutex);
> }
> EXPORT_SYMBOL_GPL(bd_unlink_disk_holder);
> +
> +int bd_register_pending_holders(struct gendisk *disk)
> +{
> + struct bd_holder_disk *holder;
> + int ret;
> +
> + mutex_lock(&disk->open_mutex);
> + list_for_each_entry(holder, &disk->slave_bdevs, list) {
> + ret = __link_disk_holder(holder->bdev, disk);
> + if (ret)
> + goto out_undo;
> + }
> + mutex_unlock(&disk->open_mutex);
> + return 0;
> +
> +out_undo:
> + list_for_each_entry_continue_reverse(holder, &disk->slave_bdevs, list)
> + __unlink_disk_holder(holder->bdev, disk);
> + mutex_unlock(&disk->open_mutex);
> + return ret;
> +}
> diff --git a/include/linux/genhd.h b/include/linux/genhd.h
> index 0721807d76ee..80952f038d79 100644
> --- a/include/linux/genhd.h
> +++ b/include/linux/genhd.h
> @@ -323,6 +323,7 @@ long compat_blkdev_ioctl(struct file *, unsigned, unsigned long);
> #ifdef CONFIG_BLOCK_HOLDER_DEPRECATED
> int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk);
> void bd_unlink_disk_holder(struct block_device *bdev, struct gendisk *disk);
> +int bd_register_pending_holders(struct gendisk *disk);
> #else
> static inline int bd_link_disk_holder(struct block_device *bdev,
> struct gendisk *disk)
> @@ -333,6 +334,10 @@ static inline void bd_unlink_disk_holder(struct block_device *bdev,
> struct gendisk *disk)
> {
> }
> +static inline int bd_register_pending_holders(struct gendisk *disk)
> +{
> + return 0;
> +}
> #endif /* CONFIG_BLOCK_HOLDER_DEPRECATED */
>
> dev_t part_devt(struct gendisk *disk, u8 partno);
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel
next prev parent reply other threads:[~2021-08-11 7:19 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-08-04 9:41 [dm-devel] use regular gendisk registration in device mapper v2 Christoph Hellwig
2021-08-04 9:41 ` [dm-devel] [PATCH 1/8] block: make the block holder code optional Christoph Hellwig
2021-08-04 9:41 ` [dm-devel] [PATCH 2/8] block: remove the extra kobject reference in bd_link_disk_holder Christoph Hellwig
2021-08-04 9:41 ` [dm-devel] [PATCH 3/8] block: look up holders by bdev Christoph Hellwig
2021-08-04 9:41 ` [dm-devel] [PATCH 4/8] block: support delayed holder registration Christoph Hellwig
[not found] ` <CGME20210810213058eucas1p109323e3c3ecaa76d37d8cf63b6d8ecfd@eucas1p1.samsung.com>
2021-08-10 21:30 ` Marek Szyprowski [this message]
2021-08-14 21:13 ` Guenter Roeck
2021-08-15 7:07 ` Christoph Hellwig
2021-08-15 14:27 ` Guenter Roeck
2021-08-16 7:21 ` Christoph Hellwig
2021-08-16 14:17 ` Guenter Roeck
2021-08-20 15:08 ` Christoph Hellwig
2021-08-21 3:17 ` Guenter Roeck
2021-08-18 2:51 ` Guenter Roeck
2021-08-04 9:41 ` [dm-devel] [PATCH 5/8] dm: cleanup cleanup_mapped_device Christoph Hellwig
2021-08-04 9:41 ` [dm-devel] [PATCH 6/8] dm: move setting md->type into dm_setup_md_queue Christoph Hellwig
2021-08-04 9:41 ` [dm-devel] [PATCH 7/8] dm: delay registering the gendisk Christoph Hellwig
2021-08-09 23:31 ` Alasdair G Kergon
2021-08-10 0:17 ` Alasdair G Kergon
2021-08-10 13:12 ` Peter Rajnoha
2021-08-10 15:05 ` Alasdair G Kergon
2022-07-07 3:29 ` Yu Kuai
2022-07-07 5:24 ` Christoph Hellwig
2022-07-07 7:20 ` Yu Kuai
2022-07-15 3:24 ` Yu Kuai
2022-08-01 1:04 ` Yu Kuai
2021-08-04 9:41 ` [dm-devel] [PATCH 8/8] block: remove support for delayed queue registrations Christoph Hellwig
2021-08-09 17:51 ` [dm-devel] use regular gendisk registration in device mapper v2 Jens Axboe
2021-08-10 0:36 ` Alasdair G Kergon
2021-08-10 14:41 ` Alasdair G Kergon
2021-08-19 15:58 ` [dm-devel] holders not working properly, regression [was: Re: use regular gendisk registration in device mapper v2] Mike Snitzer
2021-08-19 18:05 ` Christoph Hellwig
2021-08-19 22:08 ` Mike Snitzer
-- strict thread matches above, loose matches on Subject: below --
2021-07-25 5:54 [dm-devel] use regular gendisk registration in device mapper Christoph Hellwig
2021-07-25 5:54 ` [dm-devel] [PATCH 4/8] block: support delayed holder registration Christoph Hellwig
2021-07-27 16:06 ` Mike Snitzer
2021-07-28 6:54 ` Christoph Hellwig
2021-07-29 16:32 ` Mike Snitzer
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=df80819c-1246-a5e7-15a0-b9efa9f563b5@samsung.com \
--to=m.szyprowski@samsung.com \
--cc=axboe@kernel.dk \
--cc=b.zolnierkie@samsung.com \
--cc=dm-devel@redhat.com \
--cc=hch@lst.de \
--cc=linux-block@vger.kernel.org \
--cc=snitzer@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).