dm-devel.redhat.com archive mirror
* [dm-devel] use regular gendisk registration in device mapper v2
@ 2021-08-04  9:41 Christoph Hellwig
  2021-08-04  9:41 ` [dm-devel] [PATCH 1/8] block: make the block holder code optional Christoph Hellwig
                   ` (10 more replies)
  0 siblings, 11 replies; 35+ messages in thread
From: Christoph Hellwig @ 2021-08-04  9:41 UTC (permalink / raw)
  To: Jens Axboe, Mike Snitzer; +Cc: linux-block, dm-devel

Hi all,

The device mapper code currently has a somewhat odd gendisk registration
scheme where it calls add_disk early, but uses a special flag to skip the
"queue registration", which is a major part of add_disk.  This series
improves the block layer holder tracking to work on an entirely
unregistered disk and thus allows device mapper to use the normal scheme
of calling add_disk when it is ready to accept I/O.

Note that this leads to a user visible change - the sysfs attributes on
the disk and the dm directory hanging off it are now only visible once
the initial table is loaded.  This did not make a difference in my testing
using dmsetup and the lvm2 tools.

Changes since v1:
 - rebased on the latest for-5.15/block tree
 - improve various commit messages, including commit references

Diffstat:
 block/Kconfig             |    4 +
 block/Makefile            |    1 
 block/elevator.c          |    1 
 block/genhd.c             |   42 +++++------
 block/holder.c            |  167 ++++++++++++++++++++++++++++++++++++++++++++++
 drivers/md/Kconfig        |    2 
 drivers/md/bcache/Kconfig |    1 
 drivers/md/dm-ioctl.c     |    4 -
 drivers/md/dm-rq.c        |    1 
 drivers/md/dm.c           |   32 +++-----
 fs/block_dev.c            |  145 ---------------------------------------
 include/linux/blk_types.h |    3 
 include/linux/genhd.h     |   19 ++---
 13 files changed, 219 insertions(+), 203 deletions(-)

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel



* [dm-devel] [PATCH 1/8] block: make the block holder code optional
  2021-08-04  9:41 [dm-devel] use regular gendisk registration in device mapper v2 Christoph Hellwig
@ 2021-08-04  9:41 ` Christoph Hellwig
  2021-08-04  9:41 ` [dm-devel] [PATCH 2/8] block: remove the extra kobject reference in bd_link_disk_holder Christoph Hellwig
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 35+ messages in thread
From: Christoph Hellwig @ 2021-08-04  9:41 UTC (permalink / raw)
  To: Jens Axboe, Mike Snitzer; +Cc: linux-block, dm-devel

Move the block holder code into a separate file as it is not in any way
related to the other block_dev.c code, and add a new selectable config
option for it so that we don't have to build it when no remapping
drivers are selected.

The Kconfig symbol contains a _DEPRECATED suffix to match the comments
added in commit 49731baa41df
("block: restore multiple bd_link_disk_holder() support").

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
---
 block/Kconfig             |   4 ++
 block/Makefile            |   1 +
 block/holder.c            | 139 ++++++++++++++++++++++++++++++++++++
 drivers/md/Kconfig        |   2 +
 drivers/md/bcache/Kconfig |   1 +
 fs/block_dev.c            | 144 +-------------------------------------
 include/linux/blk_types.h |   2 +-
 include/linux/genhd.h     |   4 +-
 8 files changed, 151 insertions(+), 146 deletions(-)
 create mode 100644 block/holder.c

diff --git a/block/Kconfig b/block/Kconfig
index 15dfb7660645..bac87d773c54 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -241,4 +241,8 @@ config BLK_MQ_RDMA
 config BLK_PM
 	def_bool BLOCK && PM
 
+# do not use in new code
+config BLOCK_HOLDER_DEPRECATED
+	bool
+
 source "block/Kconfig.iosched"
diff --git a/block/Makefile b/block/Makefile
index c72592b4cf31..0d951adce796 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -41,3 +41,4 @@ obj-$(CONFIG_BLK_SED_OPAL)	+= sed-opal.o
 obj-$(CONFIG_BLK_PM)		+= blk-pm.o
 obj-$(CONFIG_BLK_INLINE_ENCRYPTION)	+= keyslot-manager.o blk-crypto.o
 obj-$(CONFIG_BLK_INLINE_ENCRYPTION_FALLBACK)	+= blk-crypto-fallback.o
+obj-$(CONFIG_BLOCK_HOLDER_DEPRECATED)	+= holder.o
diff --git a/block/holder.c b/block/holder.c
new file mode 100644
index 000000000000..904a1dcd5c12
--- /dev/null
+++ b/block/holder.c
@@ -0,0 +1,139 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include <linux/genhd.h>
+
+struct bd_holder_disk {
+	struct list_head	list;
+	struct gendisk		*disk;
+	int			refcnt;
+};
+
+static struct bd_holder_disk *bd_find_holder_disk(struct block_device *bdev,
+						  struct gendisk *disk)
+{
+	struct bd_holder_disk *holder;
+
+	list_for_each_entry(holder, &bdev->bd_holder_disks, list)
+		if (holder->disk == disk)
+			return holder;
+	return NULL;
+}
+
+static int add_symlink(struct kobject *from, struct kobject *to)
+{
+	return sysfs_create_link(from, to, kobject_name(to));
+}
+
+static void del_symlink(struct kobject *from, struct kobject *to)
+{
+	sysfs_remove_link(from, kobject_name(to));
+}
+
+/**
+ * bd_link_disk_holder - create symlinks between holding disk and slave bdev
+ * @bdev: the claimed slave bdev
+ * @disk: the holding disk
+ *
+ * DON'T USE THIS UNLESS YOU'RE ALREADY USING IT.
+ *
+ * This functions creates the following sysfs symlinks.
+ *
+ * - from "slaves" directory of the holder @disk to the claimed @bdev
+ * - from "holders" directory of the @bdev to the holder @disk
+ *
+ * For example, if /dev/dm-0 maps to /dev/sda and disk for dm-0 is
+ * passed to bd_link_disk_holder(), then:
+ *
+ *   /sys/block/dm-0/slaves/sda --> /sys/block/sda
+ *   /sys/block/sda/holders/dm-0 --> /sys/block/dm-0
+ *
+ * The caller must have claimed @bdev before calling this function and
+ * ensure that both @bdev and @disk are valid during the creation and
+ * lifetime of these symlinks.
+ *
+ * CONTEXT:
+ * Might sleep.
+ *
+ * RETURNS:
+ * 0 on success, -errno on failure.
+ */
+int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk)
+{
+	struct bd_holder_disk *holder;
+	int ret = 0;
+
+	mutex_lock(&bdev->bd_disk->open_mutex);
+
+	WARN_ON_ONCE(!bdev->bd_holder);
+
+	/* FIXME: remove the following once add_disk() handles errors */
+	if (WARN_ON(!disk->slave_dir || !bdev->bd_holder_dir))
+		goto out_unlock;
+
+	holder = bd_find_holder_disk(bdev, disk);
+	if (holder) {
+		holder->refcnt++;
+		goto out_unlock;
+	}
+
+	holder = kzalloc(sizeof(*holder), GFP_KERNEL);
+	if (!holder) {
+		ret = -ENOMEM;
+		goto out_unlock;
+	}
+
+	INIT_LIST_HEAD(&holder->list);
+	holder->disk = disk;
+	holder->refcnt = 1;
+
+	ret = add_symlink(disk->slave_dir, bdev_kobj(bdev));
+	if (ret)
+		goto out_free;
+
+	ret = add_symlink(bdev->bd_holder_dir, &disk_to_dev(disk)->kobj);
+	if (ret)
+		goto out_del;
+	/*
+	 * bdev could be deleted beneath us which would implicitly destroy
+	 * the holder directory.  Hold on to it.
+	 */
+	kobject_get(bdev->bd_holder_dir);
+
+	list_add(&holder->list, &bdev->bd_holder_disks);
+	goto out_unlock;
+
+out_del:
+	del_symlink(disk->slave_dir, bdev_kobj(bdev));
+out_free:
+	kfree(holder);
+out_unlock:
+	mutex_unlock(&bdev->bd_disk->open_mutex);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(bd_link_disk_holder);
+
+/**
+ * bd_unlink_disk_holder - destroy symlinks created by bd_link_disk_holder()
+ * @bdev: the calimed slave bdev
+ * @disk: the holding disk
+ *
+ * DON'T USE THIS UNLESS YOU'RE ALREADY USING IT.
+ *
+ * CONTEXT:
+ * Might sleep.
+ */
+void bd_unlink_disk_holder(struct block_device *bdev, struct gendisk *disk)
+{
+	struct bd_holder_disk *holder;
+
+	mutex_lock(&bdev->bd_disk->open_mutex);
+	holder = bd_find_holder_disk(bdev, disk);
+	if (!WARN_ON_ONCE(holder == NULL) && !--holder->refcnt) {
+		del_symlink(disk->slave_dir, bdev_kobj(bdev));
+		del_symlink(bdev->bd_holder_dir, &disk_to_dev(disk)->kobj);
+		kobject_put(bdev->bd_holder_dir);
+		list_del_init(&holder->list);
+		kfree(holder);
+	}
+	mutex_unlock(&bdev->bd_disk->open_mutex);
+}
+EXPORT_SYMBOL_GPL(bd_unlink_disk_holder);
diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig
index 0602e82a9516..f821dae101a9 100644
--- a/drivers/md/Kconfig
+++ b/drivers/md/Kconfig
@@ -15,6 +15,7 @@ if MD
 
 config BLK_DEV_MD
 	tristate "RAID support"
+	select BLOCK_HOLDER_DEPRECATED if SYSFS
 	help
 	  This driver lets you combine several hard disk partitions into one
 	  logical block device. This can be used to simply append one
@@ -201,6 +202,7 @@ config BLK_DEV_DM_BUILTIN
 
 config BLK_DEV_DM
 	tristate "Device mapper support"
+	select BLOCK_HOLDER_DEPRECATED if SYSFS
 	select BLK_DEV_DM_BUILTIN
 	depends on DAX || DAX=n
 	help
diff --git a/drivers/md/bcache/Kconfig b/drivers/md/bcache/Kconfig
index d1ca4d059c20..cf3e8096942a 100644
--- a/drivers/md/bcache/Kconfig
+++ b/drivers/md/bcache/Kconfig
@@ -2,6 +2,7 @@
 
 config BCACHE
 	tristate "Block device as cache"
+	select BLOCK_HOLDER_DEPRECATED if SYSFS
 	select CRC64
 	help
 	Allows a block device to be used as cache for other devices; uses
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 6658f40ae492..ae9651cad923 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -902,7 +902,7 @@ struct block_device *bdev_alloc(struct gendisk *disk, u8 partno)
 	bdev->bd_disk = disk;
 	bdev->bd_partno = partno;
 	bdev->bd_inode = inode;
-#ifdef CONFIG_SYSFS
+#ifdef CONFIG_BLOCK_HOLDER_DEPRECATED
 	INIT_LIST_HEAD(&bdev->bd_holder_disks);
 #endif
 	bdev->bd_stats = alloc_percpu(struct disk_stats);
@@ -1063,148 +1063,6 @@ void bd_abort_claiming(struct block_device *bdev, void *holder)
 }
 EXPORT_SYMBOL(bd_abort_claiming);
 
-#ifdef CONFIG_SYSFS
-struct bd_holder_disk {
-	struct list_head	list;
-	struct gendisk		*disk;
-	int			refcnt;
-};
-
-static struct bd_holder_disk *bd_find_holder_disk(struct block_device *bdev,
-						  struct gendisk *disk)
-{
-	struct bd_holder_disk *holder;
-
-	list_for_each_entry(holder, &bdev->bd_holder_disks, list)
-		if (holder->disk == disk)
-			return holder;
-	return NULL;
-}
-
-static int add_symlink(struct kobject *from, struct kobject *to)
-{
-	return sysfs_create_link(from, to, kobject_name(to));
-}
-
-static void del_symlink(struct kobject *from, struct kobject *to)
-{
-	sysfs_remove_link(from, kobject_name(to));
-}
-
-/**
- * bd_link_disk_holder - create symlinks between holding disk and slave bdev
- * @bdev: the claimed slave bdev
- * @disk: the holding disk
- *
- * DON'T USE THIS UNLESS YOU'RE ALREADY USING IT.
- *
- * This functions creates the following sysfs symlinks.
- *
- * - from "slaves" directory of the holder @disk to the claimed @bdev
- * - from "holders" directory of the @bdev to the holder @disk
- *
- * For example, if /dev/dm-0 maps to /dev/sda and disk for dm-0 is
- * passed to bd_link_disk_holder(), then:
- *
- *   /sys/block/dm-0/slaves/sda --> /sys/block/sda
- *   /sys/block/sda/holders/dm-0 --> /sys/block/dm-0
- *
- * The caller must have claimed @bdev before calling this function and
- * ensure that both @bdev and @disk are valid during the creation and
- * lifetime of these symlinks.
- *
- * CONTEXT:
- * Might sleep.
- *
- * RETURNS:
- * 0 on success, -errno on failure.
- */
-int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk)
-{
-	struct bd_holder_disk *holder;
-	int ret = 0;
-
-	mutex_lock(&bdev->bd_disk->open_mutex);
-
-	WARN_ON_ONCE(!bdev->bd_holder);
-
-	/* FIXME: remove the following once add_disk() handles errors */
-	if (WARN_ON(!disk->slave_dir || !bdev->bd_holder_dir))
-		goto out_unlock;
-
-	holder = bd_find_holder_disk(bdev, disk);
-	if (holder) {
-		holder->refcnt++;
-		goto out_unlock;
-	}
-
-	holder = kzalloc(sizeof(*holder), GFP_KERNEL);
-	if (!holder) {
-		ret = -ENOMEM;
-		goto out_unlock;
-	}
-
-	INIT_LIST_HEAD(&holder->list);
-	holder->disk = disk;
-	holder->refcnt = 1;
-
-	ret = add_symlink(disk->slave_dir, bdev_kobj(bdev));
-	if (ret)
-		goto out_free;
-
-	ret = add_symlink(bdev->bd_holder_dir, &disk_to_dev(disk)->kobj);
-	if (ret)
-		goto out_del;
-	/*
-	 * bdev could be deleted beneath us which would implicitly destroy
-	 * the holder directory.  Hold on to it.
-	 */
-	kobject_get(bdev->bd_holder_dir);
-
-	list_add(&holder->list, &bdev->bd_holder_disks);
-	goto out_unlock;
-
-out_del:
-	del_symlink(disk->slave_dir, bdev_kobj(bdev));
-out_free:
-	kfree(holder);
-out_unlock:
-	mutex_unlock(&bdev->bd_disk->open_mutex);
-	return ret;
-}
-EXPORT_SYMBOL_GPL(bd_link_disk_holder);
-
-/**
- * bd_unlink_disk_holder - destroy symlinks created by bd_link_disk_holder()
- * @bdev: the calimed slave bdev
- * @disk: the holding disk
- *
- * DON'T USE THIS UNLESS YOU'RE ALREADY USING IT.
- *
- * CONTEXT:
- * Might sleep.
- */
-void bd_unlink_disk_holder(struct block_device *bdev, struct gendisk *disk)
-{
-	struct bd_holder_disk *holder;
-
-	mutex_lock(&bdev->bd_disk->open_mutex);
-
-	holder = bd_find_holder_disk(bdev, disk);
-
-	if (!WARN_ON_ONCE(holder == NULL) && !--holder->refcnt) {
-		del_symlink(disk->slave_dir, bdev_kobj(bdev));
-		del_symlink(bdev->bd_holder_dir, &disk_to_dev(disk)->kobj);
-		kobject_put(bdev->bd_holder_dir);
-		list_del_init(&holder->list);
-		kfree(holder);
-	}
-
-	mutex_unlock(&bdev->bd_disk->open_mutex);
-}
-EXPORT_SYMBOL_GPL(bd_unlink_disk_holder);
-#endif
-
 static void blkdev_flush_mapping(struct block_device *bdev)
 {
 	WARN_ON_ONCE(bdev->bd_holders);
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 290f9061b29a..7a4e139d24ef 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -34,7 +34,7 @@ struct block_device {
 	void *			bd_holder;
 	int			bd_holders;
 	bool			bd_write_holder;
-#ifdef CONFIG_SYSFS
+#ifdef CONFIG_BLOCK_HOLDER_DEPRECATED
 	struct list_head	bd_holder_disks;
 #endif
 	struct kobject		*bd_holder_dir;
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 849486de81c6..e21a91c16a79 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -318,7 +318,7 @@ void set_capacity(struct gendisk *disk, sector_t size);
 int blkdev_ioctl(struct block_device *, fmode_t, unsigned, unsigned long);
 long compat_blkdev_ioctl(struct file *, unsigned, unsigned long);
 
-#ifdef CONFIG_SYSFS
+#ifdef CONFIG_BLOCK_HOLDER_DEPRECATED
 int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk);
 void bd_unlink_disk_holder(struct block_device *bdev, struct gendisk *disk);
 #else
@@ -331,7 +331,7 @@ static inline void bd_unlink_disk_holder(struct block_device *bdev,
 					 struct gendisk *disk)
 {
 }
-#endif /* CONFIG_SYSFS */
+#endif /* CONFIG_BLOCK_HOLDER_DEPRECATED */
 
 dev_t part_devt(struct gendisk *disk, u8 partno);
 void inc_diskseq(struct gendisk *disk);
-- 
2.30.2


* [dm-devel] [PATCH 2/8] block: remove the extra kobject reference in bd_link_disk_holder
  2021-08-04  9:41 [dm-devel] use regular gendisk registration in device mapper v2 Christoph Hellwig
  2021-08-04  9:41 ` [dm-devel] [PATCH 1/8] block: make the block holder code optional Christoph Hellwig
@ 2021-08-04  9:41 ` Christoph Hellwig
  2021-08-04  9:41 ` [dm-devel] [PATCH 3/8] block: look up holders by bdev Christoph Hellwig
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 35+ messages in thread
From: Christoph Hellwig @ 2021-08-04  9:41 UTC (permalink / raw)
  To: Jens Axboe, Mike Snitzer; +Cc: linux-block, dm-devel

Since commit 0d02129e76ed ("block: merge struct block_device and struct
hd_struct") there is no way for the bdev to go away as long as there is
a holder, so remove the extra references.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
---
 block/holder.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/block/holder.c b/block/holder.c
index 904a1dcd5c12..960654a71342 100644
--- a/block/holder.c
+++ b/block/holder.c
@@ -92,11 +92,6 @@ int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk)
 	ret = add_symlink(bdev->bd_holder_dir, &disk_to_dev(disk)->kobj);
 	if (ret)
 		goto out_del;
-	/*
-	 * bdev could be deleted beneath us which would implicitly destroy
-	 * the holder directory.  Hold on to it.
-	 */
-	kobject_get(bdev->bd_holder_dir);
 
 	list_add(&holder->list, &bdev->bd_holder_disks);
 	goto out_unlock;
@@ -130,7 +125,6 @@ void bd_unlink_disk_holder(struct block_device *bdev, struct gendisk *disk)
 	if (!WARN_ON_ONCE(holder == NULL) && !--holder->refcnt) {
 		del_symlink(disk->slave_dir, bdev_kobj(bdev));
 		del_symlink(bdev->bd_holder_dir, &disk_to_dev(disk)->kobj);
-		kobject_put(bdev->bd_holder_dir);
 		list_del_init(&holder->list);
 		kfree(holder);
 	}
-- 
2.30.2


* [dm-devel] [PATCH 3/8] block: look up holders by bdev
  2021-08-04  9:41 [dm-devel] use regular gendisk registration in device mapper v2 Christoph Hellwig
  2021-08-04  9:41 ` [dm-devel] [PATCH 1/8] block: make the block holder code optional Christoph Hellwig
  2021-08-04  9:41 ` [dm-devel] [PATCH 2/8] block: remove the extra kobject reference in bd_link_disk_holder Christoph Hellwig
@ 2021-08-04  9:41 ` Christoph Hellwig
  2021-08-04  9:41 ` [dm-devel] [PATCH 4/8] block: support delayed holder registration Christoph Hellwig
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 35+ messages in thread
From: Christoph Hellwig @ 2021-08-04  9:41 UTC (permalink / raw)
  To: Jens Axboe, Mike Snitzer; +Cc: linux-block, dm-devel

Invert the way the holder relations are tracked.  This very
slightly reduces the memory overhead for partitioned devices.
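
As a rough illustration of where the saving comes from (a userspace
sketch, not kernel code; the model_* names are made up for this example):
before this change every block_device, including each partition, embedded
a list head, while afterwards there is a single list head per gendisk.

```c
#include <stddef.h>

/* Stand-in for the kernel's struct list_head (two pointers). */
struct model_list_head {
	struct model_list_head *next, *prev;
};

/*
 * Old layout: one list head embedded in every block_device, i.e. one for
 * the whole-disk device plus one per partition.
 */
size_t model_list_bytes_per_bdev(unsigned int nr_partitions)
{
	return ((size_t)nr_partitions + 1) * sizeof(struct model_list_head);
}

/* New layout: a single list head embedded in the gendisk. */
size_t model_list_bytes_per_disk(void)
{
	return sizeof(struct model_list_head);
}
```

For an unpartitioned disk both layouts cost the same; the difference only
grows with the number of partitions, which matches the "very slightly"
wording above.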

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/genhd.c             |  4 +++-
 block/holder.c            | 18 +++++++++---------
 fs/block_dev.c            |  3 ---
 include/linux/blk_types.h |  3 ---
 include/linux/genhd.h     |  4 +++-
 5 files changed, 15 insertions(+), 17 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index a4817e42f3a3..cd4eab744667 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1289,7 +1289,9 @@ struct gendisk *__alloc_disk_node(int minors, int node_id)
 	disk_to_dev(disk)->type = &disk_type;
 	device_initialize(disk_to_dev(disk));
 	inc_diskseq(disk);
-
+#ifdef CONFIG_BLOCK_HOLDER_DEPRECATED
+	INIT_LIST_HEAD(&disk->slave_bdevs);
+#endif
 	return disk;
 
 out_destroy_part_tbl:
diff --git a/block/holder.c b/block/holder.c
index 960654a71342..11e65d99a9fb 100644
--- a/block/holder.c
+++ b/block/holder.c
@@ -3,7 +3,7 @@
 
 struct bd_holder_disk {
 	struct list_head	list;
-	struct gendisk		*disk;
+	struct block_device	*bdev;
 	int			refcnt;
 };
 
@@ -12,8 +12,8 @@ static struct bd_holder_disk *bd_find_holder_disk(struct block_device *bdev,
 {
 	struct bd_holder_disk *holder;
 
-	list_for_each_entry(holder, &bdev->bd_holder_disks, list)
-		if (holder->disk == disk)
+	list_for_each_entry(holder, &disk->slave_bdevs, list)
+		if (holder->bdev == bdev)
 			return holder;
 	return NULL;
 }
@@ -61,7 +61,7 @@ int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk)
 	struct bd_holder_disk *holder;
 	int ret = 0;
 
-	mutex_lock(&bdev->bd_disk->open_mutex);
+	mutex_lock(&disk->open_mutex);
 
 	WARN_ON_ONCE(!bdev->bd_holder);
 
@@ -82,7 +82,7 @@ int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk)
 	}
 
 	INIT_LIST_HEAD(&holder->list);
-	holder->disk = disk;
+	holder->bdev = bdev;
 	holder->refcnt = 1;
 
 	ret = add_symlink(disk->slave_dir, bdev_kobj(bdev));
@@ -93,7 +93,7 @@ int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk)
 	if (ret)
 		goto out_del;
 
-	list_add(&holder->list, &bdev->bd_holder_disks);
+	list_add(&holder->list, &disk->slave_bdevs);
 	goto out_unlock;
 
 out_del:
@@ -101,7 +101,7 @@ int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk)
 out_free:
 	kfree(holder);
 out_unlock:
-	mutex_unlock(&bdev->bd_disk->open_mutex);
+	mutex_unlock(&disk->open_mutex);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(bd_link_disk_holder);
@@ -120,7 +120,7 @@ void bd_unlink_disk_holder(struct block_device *bdev, struct gendisk *disk)
 {
 	struct bd_holder_disk *holder;
 
-	mutex_lock(&bdev->bd_disk->open_mutex);
+	mutex_lock(&disk->open_mutex);
 	holder = bd_find_holder_disk(bdev, disk);
 	if (!WARN_ON_ONCE(holder == NULL) && !--holder->refcnt) {
 		del_symlink(disk->slave_dir, bdev_kobj(bdev));
@@ -128,6 +128,6 @@ void bd_unlink_disk_holder(struct block_device *bdev, struct gendisk *disk)
 		list_del_init(&holder->list);
 		kfree(holder);
 	}
-	mutex_unlock(&bdev->bd_disk->open_mutex);
+	mutex_unlock(&disk->open_mutex);
 }
 EXPORT_SYMBOL_GPL(bd_unlink_disk_holder);
diff --git a/fs/block_dev.c b/fs/block_dev.c
index ae9651cad923..cc801767a377 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -902,9 +902,6 @@ struct block_device *bdev_alloc(struct gendisk *disk, u8 partno)
 	bdev->bd_disk = disk;
 	bdev->bd_partno = partno;
 	bdev->bd_inode = inode;
-#ifdef CONFIG_BLOCK_HOLDER_DEPRECATED
-	INIT_LIST_HEAD(&bdev->bd_holder_disks);
-#endif
 	bdev->bd_stats = alloc_percpu(struct disk_stats);
 	if (!bdev->bd_stats) {
 		iput(inode);
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 7a4e139d24ef..e92735655684 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -34,9 +34,6 @@ struct block_device {
 	void *			bd_holder;
 	int			bd_holders;
 	bool			bd_write_holder;
-#ifdef CONFIG_BLOCK_HOLDER_DEPRECATED
-	struct list_head	bd_holder_disks;
-#endif
 	struct kobject		*bd_holder_dir;
 	u8			bd_partno;
 	spinlock_t		bd_size_lock; /* for bd_inode->i_size updates */
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index e21a91c16a79..0721807d76ee 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -159,7 +159,9 @@ struct gendisk {
 	unsigned open_partitions;	/* number of open partitions */
 
 	struct kobject *slave_dir;
-
+#ifdef CONFIG_BLOCK_HOLDER_DEPRECATED
+	struct list_head slave_bdevs;
+#endif
 	struct timer_rand_state *random;
 	atomic_t sync_io;		/* RAID */
 	struct disk_events *ev;
-- 
2.30.2


* [dm-devel] [PATCH 4/8] block: support delayed holder registration
  2021-08-04  9:41 [dm-devel] use regular gendisk registration in device mapper v2 Christoph Hellwig
                   ` (2 preceding siblings ...)
  2021-08-04  9:41 ` [dm-devel] [PATCH 3/8] block: look up holders by bdev Christoph Hellwig
@ 2021-08-04  9:41 ` Christoph Hellwig
       [not found]   ` <CGME20210810213058eucas1p109323e3c3ecaa76d37d8cf63b6d8ecfd@eucas1p1.samsung.com>
  2021-08-14 21:13   ` Guenter Roeck
  2021-08-04  9:41 ` [dm-devel] [PATCH 5/8] dm: cleanup cleanup_mapped_device Christoph Hellwig
                   ` (6 subsequent siblings)
  10 siblings, 2 replies; 35+ messages in thread
From: Christoph Hellwig @ 2021-08-04  9:41 UTC (permalink / raw)
  To: Jens Axboe, Mike Snitzer; +Cc: linux-block, dm-devel

device mapper needs to register holders before it is ready to do I/O.
Currently it does so by registering the disk early, which can leave
the disk and queue in a weird half state where the queue is registered
with the disk, except for sysfs and the elevator.  And this state has
been a bit problematic before, and will get more so when sorting out
the responsibilities between the queue and the disk.

Support registering holders on an initialized but not registered disk
instead by delaying the sysfs registration until the disk is registered.
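
The idea can be sketched with a small userspace model (an illustration
only, not the kernel implementation; all model_* names are hypothetical,
a bool stands in for the pair of sysfs symlinks, and the error unwinding
done by the real code is left out):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only; every model_* name is hypothetical. */
struct model_holder {
	struct model_holder *next;
	int bdev_id;	/* stands in for struct block_device * */
	int refcnt;
	bool linked;	/* stands in for the pair of sysfs symlinks */
};

struct model_disk {
	bool registered;		/* stands in for disk->slave_dir != NULL */
	struct model_holder *holders;	/* stands in for disk->slave_bdevs */
};

static struct model_holder *model_find_holder(struct model_disk *disk,
					      int bdev_id)
{
	struct model_holder *h;

	for (h = disk->holders; h; h = h->next)
		if (h->bdev_id == bdev_id)
			return h;
	return NULL;
}

/* Record the holder; only "create symlinks" if the disk is registered. */
int model_link_holder(struct model_disk *disk, int bdev_id)
{
	struct model_holder *h = model_find_holder(disk, bdev_id);

	if (h) {
		h->refcnt++;
		return 0;
	}
	h = calloc(1, sizeof(*h));
	if (!h)
		return -1;
	h->bdev_id = bdev_id;
	h->refcnt = 1;
	h->linked = disk->registered;
	h->next = disk->holders;
	disk->holders = h;
	return 0;
}

/* Registration time: "create symlinks" for everything recorded so far. */
void model_register_disk(struct model_disk *disk)
{
	struct model_holder *h;

	disk->registered = true;
	for (h = disk->holders; h; h = h->next)
		h->linked = true;
}

/* Count the holders whose links currently exist. */
int model_count_linked(const struct model_disk *disk)
{
	const struct model_holder *h;
	int n = 0;

	for (h = disk->holders; h; h = h->next)
		if (h->linked)
			n++;
	return n;
}
```

In this model, holders linked before model_register_disk() are only
recorded, and their links appear at registration time; holders linked
afterwards get their links immediately.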

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
---
 block/genhd.c         | 10 +++++++
 block/holder.c        | 68 ++++++++++++++++++++++++++++++++-----------
 include/linux/genhd.h |  5 ++++
 3 files changed, 66 insertions(+), 17 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index cd4eab744667..db916f779077 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -447,6 +447,16 @@ static void register_disk(struct device *parent, struct gendisk *disk,
 		kobject_create_and_add("holders", &ddev->kobj);
 	disk->slave_dir = kobject_create_and_add("slaves", &ddev->kobj);
 
+	/*
+	 * XXX: this is a mess, can't wait for real error handling in add_disk.
+	 * Make sure ->slave_dir is NULL if we failed some of the registration
+	 * so that the cleanup in bd_unlink_disk_holder works properly.
+	 */
+	if (bd_register_pending_holders(disk) < 0) {
+		kobject_put(disk->slave_dir);
+		disk->slave_dir = NULL;
+	}
+
 	if (disk->flags & GENHD_FL_HIDDEN)
 		return;
 
diff --git a/block/holder.c b/block/holder.c
index 11e65d99a9fb..4568cc4f6827 100644
--- a/block/holder.c
+++ b/block/holder.c
@@ -28,6 +28,19 @@ static void del_symlink(struct kobject *from, struct kobject *to)
 	sysfs_remove_link(from, kobject_name(to));
 }
 
+static int __link_disk_holder(struct block_device *bdev, struct gendisk *disk)
+{
+	int ret;
+
+	ret = add_symlink(disk->slave_dir, bdev_kobj(bdev));
+	if (ret)
+		return ret;
+	ret = add_symlink(bdev->bd_holder_dir, &disk_to_dev(disk)->kobj);
+	if (ret)
+		del_symlink(disk->slave_dir, bdev_kobj(bdev));
+	return ret;
+}
+
 /**
  * bd_link_disk_holder - create symlinks between holding disk and slave bdev
  * @bdev: the claimed slave bdev
@@ -66,7 +79,7 @@ int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk)
 	WARN_ON_ONCE(!bdev->bd_holder);
 
 	/* FIXME: remove the following once add_disk() handles errors */
-	if (WARN_ON(!disk->slave_dir || !bdev->bd_holder_dir))
+	if (WARN_ON(!bdev->bd_holder_dir))
 		goto out_unlock;
 
 	holder = bd_find_holder_disk(bdev, disk);
@@ -84,28 +97,28 @@ int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk)
 	INIT_LIST_HEAD(&holder->list);
 	holder->bdev = bdev;
 	holder->refcnt = 1;
-
-	ret = add_symlink(disk->slave_dir, bdev_kobj(bdev));
-	if (ret)
-		goto out_free;
-
-	ret = add_symlink(bdev->bd_holder_dir, &disk_to_dev(disk)->kobj);
-	if (ret)
-		goto out_del;
+	if (disk->slave_dir) {
+		ret = __link_disk_holder(bdev, disk);
+		if (ret) {
+			kfree(holder);
+			goto out_unlock;
+		}
+	}
 
 	list_add(&holder->list, &disk->slave_bdevs);
-	goto out_unlock;
-
-out_del:
-	del_symlink(disk->slave_dir, bdev_kobj(bdev));
-out_free:
-	kfree(holder);
 out_unlock:
 	mutex_unlock(&disk->open_mutex);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(bd_link_disk_holder);
 
+static void __unlink_disk_holder(struct block_device *bdev,
+		struct gendisk *disk)
+{
+	del_symlink(disk->slave_dir, bdev_kobj(bdev));
+	del_symlink(bdev->bd_holder_dir, &disk_to_dev(disk)->kobj);
+}
+
 /**
  * bd_unlink_disk_holder - destroy symlinks created by bd_link_disk_holder()
  * @bdev: the calimed slave bdev
@@ -123,11 +136,32 @@ void bd_unlink_disk_holder(struct block_device *bdev, struct gendisk *disk)
 	mutex_lock(&disk->open_mutex);
 	holder = bd_find_holder_disk(bdev, disk);
 	if (!WARN_ON_ONCE(holder == NULL) && !--holder->refcnt) {
-		del_symlink(disk->slave_dir, bdev_kobj(bdev));
-		del_symlink(bdev->bd_holder_dir, &disk_to_dev(disk)->kobj);
+		if (disk->slave_dir)
+			__unlink_disk_holder(bdev, disk);
 		list_del_init(&holder->list);
 		kfree(holder);
 	}
 	mutex_unlock(&disk->open_mutex);
 }
 EXPORT_SYMBOL_GPL(bd_unlink_disk_holder);
+
+int bd_register_pending_holders(struct gendisk *disk)
+{
+	struct bd_holder_disk *holder;
+	int ret;
+
+	mutex_lock(&disk->open_mutex);
+	list_for_each_entry(holder, &disk->slave_bdevs, list) {
+		ret = __link_disk_holder(holder->bdev, disk);
+		if (ret)
+			goto out_undo;
+	}
+	mutex_unlock(&disk->open_mutex);
+	return 0;
+
+out_undo:
+	list_for_each_entry_continue_reverse(holder, &disk->slave_bdevs, list)
+		__unlink_disk_holder(holder->bdev, disk);
+	mutex_unlock(&disk->open_mutex);
+	return ret;
+}
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 0721807d76ee..80952f038d79 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -323,6 +323,7 @@ long compat_blkdev_ioctl(struct file *, unsigned, unsigned long);
 #ifdef CONFIG_BLOCK_HOLDER_DEPRECATED
 int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk);
 void bd_unlink_disk_holder(struct block_device *bdev, struct gendisk *disk);
+int bd_register_pending_holders(struct gendisk *disk);
 #else
 static inline int bd_link_disk_holder(struct block_device *bdev,
 				      struct gendisk *disk)
@@ -333,6 +334,10 @@ static inline void bd_unlink_disk_holder(struct block_device *bdev,
 					 struct gendisk *disk)
 {
 }
+static inline int bd_register_pending_holders(struct gendisk *disk)
+{
+	return 0;
+}
 #endif /* CONFIG_BLOCK_HOLDER_DEPRECATED */
 
 dev_t part_devt(struct gendisk *disk, u8 partno);
-- 
2.30.2


* [dm-devel] [PATCH 5/8] dm: cleanup cleanup_mapped_device
  2021-08-04  9:41 [dm-devel] use regular gendisk registration in device mapper v2 Christoph Hellwig
                   ` (3 preceding siblings ...)
  2021-08-04  9:41 ` [dm-devel] [PATCH 4/8] block: support delayed holder registration Christoph Hellwig
@ 2021-08-04  9:41 ` Christoph Hellwig
  2021-08-04  9:41 ` [dm-devel] [PATCH 6/8] dm: move setting md->type into dm_setup_md_queue Christoph Hellwig
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 35+ messages in thread
From: Christoph Hellwig @ 2021-08-04  9:41 UTC (permalink / raw)
  To: Jens Axboe, Mike Snitzer; +Cc: linux-block, dm-devel

md->queue is now always set when md->disk is set, so simplify the
conditionals a bit.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 2c5f9e585211..7971ec8ce677 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1694,13 +1694,9 @@ static void cleanup_mapped_device(struct mapped_device *md)
 		md->disk->private_data = NULL;
 		spin_unlock(&_minor_lock);
 		del_gendisk(md->disk);
-	}
-
-	if (md->queue)
 		dm_queue_destroy_keyslot_manager(md->queue);
-
-	if (md->disk)
 		blk_cleanup_disk(md->disk);
+	}
 
 	cleanup_srcu_struct(&md->io_barrier);
 
-- 
2.30.2


* [dm-devel] [PATCH 6/8] dm: move setting md->type into dm_setup_md_queue
  2021-08-04  9:41 [dm-devel] use regular gendisk registration in device mapper v2 Christoph Hellwig
                   ` (4 preceding siblings ...)
  2021-08-04  9:41 ` [dm-devel] [PATCH 5/8] dm: cleanup cleanup_mapped_device Christoph Hellwig
@ 2021-08-04  9:41 ` Christoph Hellwig
  2021-08-04  9:41 ` [dm-devel] [PATCH 7/8] dm: delay registering the gendisk Christoph Hellwig
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 35+ messages in thread
From: Christoph Hellwig @ 2021-08-04  9:41 UTC (permalink / raw)
  To: Jens Axboe, Mike Snitzer; +Cc: linux-block, dm-devel

Move setting md->type from both callers into dm_setup_md_queue.
This ensures that md->type is only set to a valid value after the queue
has been fully set up, something we'll rely on in future changes.
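
A minimal userspace sketch of that ordering (not the dm code; the model_*
names and the setup_ok parameter are assumptions made for this example):
the type field stays at its "none" value on every failure path and is
only written once setup has succeeded.

```c
#include <stdbool.h>

enum model_queue_mode {
	MODEL_TYPE_NONE,
	MODEL_TYPE_BIO_BASED,
	MODEL_TYPE_REQUEST_BASED,
};

struct model_md {
	enum model_queue_mode type;	/* MODEL_TYPE_NONE until setup succeeds */
	bool queue_ready;
};

/*
 * Set up the queue for table_type.  setup_ok simulates whether the
 * (omitted) queue setup steps succeed.
 */
int model_setup_md_queue(struct model_md *md,
			 enum model_queue_mode table_type, bool setup_ok)
{
	if (!setup_ok)
		return -1;	/* md->type is left untouched */

	md->queue_ready = true;
	md->type = table_type;	/* published only after setup is complete */
	return 0;
}
```

A caller that checks for MODEL_TYPE_NONE can therefore treat any other
value as "the queue is fully set up".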

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm-ioctl.c | 4 ----
 drivers/md/dm.c       | 5 +++--
 2 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
index 2209cbcd84db..2575074a2204 100644
--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -1436,9 +1436,6 @@ static int table_load(struct file *filp, struct dm_ioctl *param, size_t param_si
 	}
 
 	if (dm_get_md_type(md) == DM_TYPE_NONE) {
-		/* Initial table load: acquire type of table. */
-		dm_set_md_type(md, dm_table_get_type(t));
-
 		/* setup md->queue to reflect md's type (may block) */
 		r = dm_setup_md_queue(md, t);
 		if (r) {
@@ -2187,7 +2184,6 @@ int __init dm_early_create(struct dm_ioctl *dmi,
 	if (r)
 		goto err_destroy_table;
 
-	md->type = dm_table_get_type(t);
 	/* setup md->queue to reflect md's type (may block) */
 	r = dm_setup_md_queue(md, t);
 	if (r) {
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 7971ec8ce677..f003bd5b93ce 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2052,9 +2052,9 @@ EXPORT_SYMBOL_GPL(dm_get_queue_limits);
  */
 int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t)
 {
-	int r;
+	enum dm_queue_mode type = dm_table_get_type(t);
 	struct queue_limits limits;
-	enum dm_queue_mode type = dm_get_md_type(md);
+	int r;
 
 	switch (type) {
 	case DM_TYPE_REQUEST_BASED:
@@ -2081,6 +2081,7 @@ int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t)
 	r = dm_table_set_restrictions(t, md->queue, &limits);
 	if (r)
 		return r;
+	md->type = type;
 
 	blk_register_queue(md->disk);
 
-- 
2.30.2


* [dm-devel] [PATCH 7/8] dm: delay registering the gendisk
  2021-08-04  9:41 [dm-devel] use regular gendisk registration in device mapper v2 Christoph Hellwig
                   ` (5 preceding siblings ...)
  2021-08-04  9:41 ` [dm-devel] [PATCH 6/8] dm: move setting md->type into dm_setup_md_queue Christoph Hellwig
@ 2021-08-04  9:41 ` Christoph Hellwig
  2021-08-09 23:31   ` Alasdair G Kergon
  2022-07-07  3:29   ` Yu Kuai
  2021-08-04  9:41 ` [dm-devel] [PATCH 8/8] block: remove support for delayed queue registrations Christoph Hellwig
                   ` (3 subsequent siblings)
  10 siblings, 2 replies; 35+ messages in thread
From: Christoph Hellwig @ 2021-08-04  9:41 UTC (permalink / raw)
  To: Jens Axboe, Mike Snitzer; +Cc: linux-block, dm-devel

device mapper is currently the only outlier that tries to call
register_disk after add_disk, leading to fairly inconsistent state
of these block layer data structures.  Instead change device-mapper
to just register the gendisk later now that the holder mechanism
can cope with that.

Note that this introduces a user visible change: the dm kobject is
now only visible after the initial table has been loaded.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm-rq.c |  1 -
 drivers/md/dm.c    | 23 +++++++++++------------
 2 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 0dbd48cbdff9..5b95eea517d1 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -559,7 +559,6 @@ int dm_mq_init_request_queue(struct mapped_device *md, struct dm_table *t)
 	err = blk_mq_init_allocated_queue(md->tag_set, md->queue);
 	if (err)
 		goto out_tag_set;
-	elevator_init_mq(md->queue);
 	return 0;
 
 out_tag_set:
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index f003bd5b93ce..7981b7287628 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1693,7 +1693,10 @@ static void cleanup_mapped_device(struct mapped_device *md)
 		spin_lock(&_minor_lock);
 		md->disk->private_data = NULL;
 		spin_unlock(&_minor_lock);
-		del_gendisk(md->disk);
+		if (dm_get_md_type(md) != DM_TYPE_NONE) {
+			dm_sysfs_exit(md);
+			del_gendisk(md->disk);
+		}
 		dm_queue_destroy_keyslot_manager(md->queue);
 		blk_cleanup_disk(md->disk);
 	}
@@ -1788,7 +1791,6 @@ static struct mapped_device *alloc_dev(int minor)
 			goto bad;
 	}
 
-	add_disk_no_queue_reg(md->disk);
 	format_dev_t(md->name, MKDEV(_major, minor));
 
 	md->wq = alloc_workqueue("kdmflush", WQ_MEM_RECLAIM, 0);
@@ -1989,19 +1991,12 @@ static struct dm_table *__unbind(struct mapped_device *md)
  */
 int dm_create(int minor, struct mapped_device **result)
 {
-	int r;
 	struct mapped_device *md;
 
 	md = alloc_dev(minor);
 	if (!md)
 		return -ENXIO;
 
-	r = dm_sysfs_init(md);
-	if (r) {
-		free_dev(md);
-		return r;
-	}
-
 	*result = md;
 	return 0;
 }
@@ -2081,10 +2076,15 @@ int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t)
 	r = dm_table_set_restrictions(t, md->queue, &limits);
 	if (r)
 		return r;
-	md->type = type;
 
-	blk_register_queue(md->disk);
+	add_disk(md->disk);
 
+	r = dm_sysfs_init(md);
+	if (r) {
+		del_gendisk(md->disk);
+		return r;
+	}
+	md->type = type;
 	return 0;
 }
 
@@ -2190,7 +2190,6 @@ static void __dm_destroy(struct mapped_device *md, bool wait)
 		DMWARN("%s: Forcibly removing mapped_device still in use! (%d users)",
 		       dm_device_name(md), atomic_read(&md->holders));
 
-	dm_sysfs_exit(md);
 	dm_table_destroy(__unbind(md));
 	free_dev(md);
 }
-- 
2.30.2


* [dm-devel] [PATCH 8/8] block: remove support for delayed queue registrations
  2021-08-04  9:41 [dm-devel] use regular gendisk registration in device mapper v2 Christoph Hellwig
                   ` (6 preceding siblings ...)
  2021-08-04  9:41 ` [dm-devel] [PATCH 7/8] dm: delay registering the gendisk Christoph Hellwig
@ 2021-08-04  9:41 ` Christoph Hellwig
  2021-08-09 17:51 ` [dm-devel] use regular gendisk registration in device mapper v2 Jens Axboe
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 35+ messages in thread
From: Christoph Hellwig @ 2021-08-04  9:41 UTC (permalink / raw)
  To: Jens Axboe, Mike Snitzer; +Cc: linux-block, dm-devel

Now that device mapper has been changed to register the disk once
it is fully ready, all this code is unused.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
---
 block/elevator.c      |  1 -
 block/genhd.c         | 29 +++++++----------------------
 include/linux/genhd.h |  6 ------
 3 files changed, 7 insertions(+), 29 deletions(-)

diff --git a/block/elevator.c b/block/elevator.c
index 52ada14cfe45..706d5a64508d 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -702,7 +702,6 @@ void elevator_init_mq(struct request_queue *q)
 		elevator_put(e);
 	}
 }
-EXPORT_SYMBOL_GPL(elevator_init_mq); /* only for dm-rq */
 
 /*
  * switch to new_e io scheduler. be careful not to introduce deadlocks -
diff --git a/block/genhd.c b/block/genhd.c
index db916f779077..b0b6e0caa389 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -475,20 +475,20 @@ static void register_disk(struct device *parent, struct gendisk *disk,
 }
 
 /**
- * __device_add_disk - add disk information to kernel list
+ * device_add_disk - add disk information to kernel list
  * @parent: parent device for the disk
  * @disk: per-device partitioning information
  * @groups: Additional per-device sysfs groups
- * @register_queue: register the queue if set to true
  *
  * This function registers the partitioning information in @disk
  * with the kernel.
  *
  * FIXME: error handling
  */
-static void __device_add_disk(struct device *parent, struct gendisk *disk,
-			      const struct attribute_group **groups,
-			      bool register_queue)
+
+void device_add_disk(struct device *parent, struct gendisk *disk,
+		     const struct attribute_group **groups)
+
 {
 	int ret;
 
@@ -498,8 +498,7 @@ static void __device_add_disk(struct device *parent, struct gendisk *disk,
 	 * elevator if one is needed, that is, for devices requesting queue
 	 * registration.
 	 */
-	if (register_queue)
-		elevator_init_mq(disk->queue);
+	elevator_init_mq(disk->queue);
 
 	/*
 	 * If the driver provides an explicit major number it also must provide
@@ -553,8 +552,7 @@ static void __device_add_disk(struct device *parent, struct gendisk *disk,
 		bdev_add(disk->part0, dev->devt);
 	}
 	register_disk(parent, disk, groups);
-	if (register_queue)
-		blk_register_queue(disk);
+	blk_register_queue(disk);
 
 	/*
 	 * Take an extra ref on queue which will be put on disk_release()
@@ -568,21 +566,8 @@ static void __device_add_disk(struct device *parent, struct gendisk *disk,
 	disk_add_events(disk);
 	blk_integrity_add(disk);
 }
-
-void device_add_disk(struct device *parent, struct gendisk *disk,
-		     const struct attribute_group **groups)
-
-{
-	__device_add_disk(parent, disk, groups, true);
-}
 EXPORT_SYMBOL(device_add_disk);
 
-void device_add_disk_no_queue_reg(struct device *parent, struct gendisk *disk)
-{
-	__device_add_disk(parent, disk, NULL, false);
-}
-EXPORT_SYMBOL(device_add_disk_no_queue_reg);
-
 /**
  * del_gendisk - remove the gendisk
  * @disk: the struct gendisk to remove
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 80952f038d79..473d93c6ebda 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -219,12 +219,6 @@ static inline void add_disk(struct gendisk *disk)
 {
 	device_add_disk(NULL, disk, NULL);
 }
-extern void device_add_disk_no_queue_reg(struct device *parent, struct gendisk *disk);
-static inline void add_disk_no_queue_reg(struct gendisk *disk)
-{
-	device_add_disk_no_queue_reg(NULL, disk);
-}
-
 extern void del_gendisk(struct gendisk *gp);
 
 void set_disk_ro(struct gendisk *disk, bool read_only);
-- 
2.30.2


* Re: [dm-devel] use regular gendisk registration in device mapper v2
  2021-08-04  9:41 [dm-devel] use regular gendisk registration in device mapper v2 Christoph Hellwig
                   ` (7 preceding siblings ...)
  2021-08-04  9:41 ` [dm-devel] [PATCH 8/8] block: remove support for delayed queue registrations Christoph Hellwig
@ 2021-08-09 17:51 ` Jens Axboe
  2021-08-10  0:36 ` Alasdair G Kergon
  2021-08-19 15:58 ` [dm-devel] holders not working properly, regression [was: Re: use regular gendisk registration in device mapper v2] Mike Snitzer
  10 siblings, 0 replies; 35+ messages in thread
From: Jens Axboe @ 2021-08-09 17:51 UTC (permalink / raw)
  To: Christoph Hellwig, Mike Snitzer; +Cc: linux-block, dm-devel

On 8/4/21 3:41 AM, Christoph Hellwig wrote:
> Hi all,
> 
> The device mapper code currently has a somewhat odd gendisk registration
> scheme where it calls add_disk early, but uses a special flag to skip the
> "queue registration", which is a major part of add_disk.  This series
> improves the block layer holder tracking to work on an entirely
> unregistered disk and thus allows device mapper to use the normal scheme
> of calling add_disk when it is ready to accept I/O.
> 
> Note that this leads to a user visible change - the sysfs attributes on
> the disk and the dm directory hanging off it are now only visible once
> the initial table is loaded.  This did not make a difference to my testing
> using dmsetup and the lvm2 tools.

Applied, thanks.

-- 
Jens Axboe


* Re: [dm-devel] [PATCH 7/8] dm: delay registering the gendisk
  2021-08-04  9:41 ` [dm-devel] [PATCH 7/8] dm: delay registering the gendisk Christoph Hellwig
@ 2021-08-09 23:31   ` Alasdair G Kergon
  2021-08-10  0:17     ` Alasdair G Kergon
  2021-08-10 13:12     ` Peter Rajnoha
  2022-07-07  3:29   ` Yu Kuai
  1 sibling, 2 replies; 35+ messages in thread
From: Alasdair G Kergon @ 2021-08-09 23:31 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, dm-devel, prajnoha, Mike Snitzer

On Wed, Aug 04, 2021 at 11:41:46AM +0200, Christoph Hellwig wrote:
> device mapper is currently the only outlier that tries to call
> register_disk after add_disk, leading to fairly inconsistent state
> of these block layer data structures.  

> Note that this introduces a user visible change: the dm kobject is
> now only visible after the initial table has been loaded.

Indeed.  We should try to document the userspace implications of this
change in a bit more detail.  While lvm2 and any other tools that
followed our recommendations about how to use dm should be OK, there's
always the chance that some other less robustly-written code will need
to make adjustments.

Currently to make a dm device, 3 ioctls are called in sequence:

1. DM_DEV_CREATE  - triggers 'add' uevents
2. DM_TABLE_LOAD
3. DM_SUSPEND     - triggers 'change' uevent

After this patch we have:

1. DM_DEV_CREATE  
2. DM_TABLE_LOAD  - triggers 'add' uevents
3. DM_SUSPEND     - triggers 'change' uevent

The equivalent dmsetup commands for a simple test device are
0. udevadm monitor --kernel --env &   # View the uevents as they happen
1. dmsetup create dev1 --notable
2. dmsetup load --table "0 1 error" dev1
3. dmsetup resume dev1

  => Anyone with a udev rule that relies on 'add' needs to check if they
     need to change their code.

The udev rules that lvm2 uses to synchronise what it is doing rely
only on the 'change' event - which is not moving.  The 'add' event
gets ignored.  
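
To make the shift concrete, the ordering above can be modelled in a few
lines of user-space C.  This is only an illustrative sketch: the enum,
both function names and the 'patched' flag are invented here and
correspond to nothing in the kernel or libdevmapper.  It encodes the two
sequences listed above and answers whether a handler keyed on a given
uevent action would fire at a given step.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
enum dm_step { STEP_DEV_CREATE, STEP_TABLE_LOAD, STEP_RESUME };

/*
 * Return the uevent action emitted at a given step, or NULL for none.
 * 'patched' selects the behaviour described for after this patch.
 * STEP_TABLE_LOAD means the initial table load of a new device.
 */
static const char *uevent_for_step(enum dm_step step, int patched)
{
	switch (step) {
	case STEP_DEV_CREATE:
		return patched ? NULL : "add";
	case STEP_TABLE_LOAD:
		return patched ? "add" : NULL;
	case STEP_RESUME:
		return "change";
	}
	return NULL;
}

/* Does a handler keyed on 'trigger' fire at 'step'? */
static int handler_fires(const char *trigger, enum dm_step step, int patched)
{
	const char *ev = uevent_for_step(step, patched);

	assert(trigger != NULL);
	return ev != NULL && strcmp(ev, trigger) == 0;
}
```

In this model a handler keyed on 'change' fires at step 3 both before
and after the patch, while one keyed on 'add' moves from step 1 to
step 2.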

When loading tables, our tools also always refer to devices using
the 'major:minor' format, which isn't affected, rather than using
pathnames in /dev which might not exist now after this change if a table
hasn't been loaded into a referenced device yet.  Previously this was
permissible but we always recommended against it to avoid a pointless
pathname lookup that's subject to races and delays.

So again, any tools that followed our recommendations ought to be
unaffected.

Here's an example of poor code that previously worked but will fail now:
  dmsetup create dev1 --notable
  dmsetup create dev2 --notable
  dmsetup ls  <-- get the minor number of dev1 (say it's 1 corresponding to dm-1)
  dmsetup load dev2 --table '0 1 linear /dev/dm-1 0'
  ...

Peter - have I missed anything?

Alasdair


* Re: [dm-devel] [PATCH 7/8] dm: delay registering the gendisk
  2021-08-09 23:31   ` Alasdair G Kergon
@ 2021-08-10  0:17     ` Alasdair G Kergon
  2021-08-10 13:12     ` Peter Rajnoha
  1 sibling, 0 replies; 35+ messages in thread
From: Alasdair G Kergon @ 2021-08-10  0:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, dm-devel, prajnoha, Mike Snitzer

On Tue, Aug 10, 2021 at 12:31:43AM +0100, Alasdair G Kergon wrote:
> When loading tables, our tools also always refer to devices using
> the 'major:minor' format, which isn't affected, rather than using
                            ^^^^^^^^^^^^^^^^^^^^
Wrong - that is also affected.

So there is a new general constraint that a table must be loaded into a
device before another device's table can reference that device.  (The
stacked device handling in lvm2 as supported by libdevmapper should
always be doing this.)

(The original implementation had to be a bit loose to accommodate
multipath device paths that were essentially placeholders at the point
they got set up.)
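
The new constraint can be sketched as a small user-space model.
Everything here is hypothetical: struct model_dev, model_table_load()
and the choice of -ENODEV as the failure code are assumptions made for
illustration, not behaviour taken from the patch.  The point is only
that a load is refused while any referenced device has no table yet.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a dm device; not the kernel's struct mapped_device. */
struct model_dev {
	int has_table;	/* set once an initial table has been loaded */
};

/*
 * Model of a table load under the new constraint: every device the new
 * table references must already have had a table loaded, otherwise the
 * load is refused.  The error code is an assumption of this sketch.
 */
static int model_table_load(struct model_dev *dev,
			    struct model_dev *const *refs, size_t nr_refs)
{
	size_t i;

	assert(dev != NULL);
	for (i = 0; i < nr_refs; i++) {
		if (!refs[i]->has_table)
			return -ENODEV;
	}
	dev->has_table = 1;
	return 0;
}
```

Loading the upper device first fails in this model; loading the
referenced device first and then the upper one succeeds, which is the
order the stacked device handling is expected to follow.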

Alasdair


* Re: [dm-devel] use regular gendisk registration in device mapper v2
  2021-08-04  9:41 [dm-devel] use regular gendisk registration in device mapper v2 Christoph Hellwig
                   ` (8 preceding siblings ...)
  2021-08-09 17:51 ` [dm-devel] use regular gendisk registration in device mapper v2 Jens Axboe
@ 2021-08-10  0:36 ` Alasdair G Kergon
  2021-08-10 14:41   ` Alasdair G Kergon
  2021-08-19 15:58 ` [dm-devel] holders not working properly, regression [was: Re: use regular gendisk registration in device mapper v2] Mike Snitzer
  10 siblings, 1 reply; 35+ messages in thread
From: Alasdair G Kergon @ 2021-08-10  0:36 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jens Axboe, linux-block, dm-devel, Mike Snitzer

On Wed, Aug 04, 2021 at 11:41:39AM +0200, Christoph Hellwig wrote:
> allows device mapper to use the normal scheme
> of calling add_disk when it is ready to accept I/O.

For clarity, even after this patchset, the device is not ready to accept
I/O when add_disk is called.  It only becomes ready to accept I/O later,
if a 'resume' happens; that triggers the 'change' uevent, which userspace
reacts to by setting up the /dev entries for it.
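
A rough way to picture the distinction is a three-state lifecycle.  The
sketch below is a simplified user-space model with invented state and
function names; it ignores suspend, table clearing and error paths, and
is not how the kernel tracks this.

```c
#include <assert.h>

/* Hypothetical lifecycle model; the state names are invented for this sketch. */
enum model_state {
	MODEL_CREATED,		/* device created, no table, disk not registered */
	MODEL_REGISTERED,	/* first table loaded, add_disk called */
	MODEL_LIVE,		/* resumed, table live, 'change' uevent sent */
};

static enum model_state model_load_table(enum model_state s)
{
	/* Only the first load registers the disk; later loads keep the state. */
	return s == MODEL_CREATED ? MODEL_REGISTERED : s;
}

static enum model_state model_resume(enum model_state s)
{
	/* Simplification: only a registered device becomes live on resume. */
	return s == MODEL_REGISTERED ? MODEL_LIVE : s;
}

static int model_accepts_io(enum model_state s)
{
	return s == MODEL_LIVE;
}

/* Usage: after the table load alone the disk is registered but not live. */
static int model_ready_after_load_only(void)
{
	enum model_state s = model_load_table(MODEL_CREATED);

	assert(s == MODEL_REGISTERED);
	return model_accepts_io(s);
}
```

In this model add_disk corresponds to entering MODEL_REGISTERED, and
I/O readiness only arrives one step later, at MODEL_LIVE.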
 
Alasdair


* Re: [dm-devel] [PATCH 7/8] dm: delay registering the gendisk
  2021-08-09 23:31   ` Alasdair G Kergon
  2021-08-10  0:17     ` Alasdair G Kergon
@ 2021-08-10 13:12     ` Peter Rajnoha
  2021-08-10 15:05       ` Alasdair G Kergon
  1 sibling, 1 reply; 35+ messages in thread
From: Peter Rajnoha @ 2021-08-10 13:12 UTC (permalink / raw)
  To: Alasdair G Kergon
  Cc: Jens Axboe, linux-block, dm-devel, Christoph Hellwig, Mike Snitzer

On Tue 10 Aug 2021 00:31, Alasdair G Kergon wrote:
> On Wed, Aug 04, 2021 at 11:41:46AM +0200, Christoph Hellwig wrote:
> > device mapper is currently the only outlier that tries to call
> > register_disk after add_disk, leading to fairly inconsistent state
> > of these block layer data structures.  
> 
> > Note that this introduces a user visible change: the dm kobject is
> > now only visible after the initial table has been loaded.
> 
> Indeed.  We should try to document the userspace implications of this
> change in a bit more detail.  While lvm2 and any other tools that
> followed our recommendations about how to use dm should be OK, there's
> always the chance that some other less robustly-written code will need
> to make adjustments.
> 
> Currently to make a dm device, 3 ioctls are called in sequence:
> 
> 1. DM_DEV_CREATE  - triggers 'add' uevents
> 2. DM_TABLE_LOAD
> 3. DM_SUSPEND     - triggers 'change' uevent
> 
> After this patch we have:
> 
> 1. DM_DEV_CREATE  
> 2. DM_TABLE_LOAD  - triggers 'add' uevents
> 3. DM_SUSPEND     - triggers 'change' uevent
> 
> The equivalent dmsetup commands for a simple test device are
> 0. udevadm monitor --kernel --env &   # View the uevents as they happen
> 1. dmsetup create dev1 --notable
> 2. dmsetup load --table "0 1 error" dev1
> 3. dmsetup resume dev1
> 
>   => Anyone with a udev rule that relies on 'add' needs to check if they
>      need to change their code.
> 
> The udev rules that lvm2 uses to synchronise what it is doing rely
> only on the 'change' event - which is not moving.  The 'add' event
> gets ignored.  
> 
> When loading tables, our tools also always refer to devices using
> the 'major:minor' format, which isn't affected, rather than using
> pathnames in /dev which might not exist now after this change if a table
> hasn't been loaded into a referenced device yet.  Previously this was
> permissible but we always recommended against it to avoid a pointless
> pathname lookup that's subject to races and delays.
> 
> So again, any tools that followed our recommendations ought to be
> unaffected.
> 
> Here's an example of poor code that previously worked but will fail now:
>   dmsetup create dev1 --notable
>   dmsetup create dev2 --notable
>   dmsetup ls  <-- get the minor number of dev1 (say it's 1 corresponding to dm-1)
>   dmsetup load dev2 --table '0 1 linear /dev/dm-1 0'
>   ...
> 
> Peter - have I missed anything?

It looks like this is the only area affected, but as you say, this should be
well documented (including comments in our own udev rules) so there are
no false assumptions made by other non-lvm/non-libdm users.

(I'm not counting the very corner use case of
'dmsetup --addnodeoncreate --verifyudev', which now ends up with a dev node
in /dev that logically returns -ENODEV when accessed instead of a zero-sized
device as it was before.)

-- 
Peter


* Re: [dm-devel] use regular gendisk registration in device mapper v2
  2021-08-10  0:36 ` Alasdair G Kergon
@ 2021-08-10 14:41   ` Alasdair G Kergon
  0 siblings, 0 replies; 35+ messages in thread
From: Alasdair G Kergon @ 2021-08-10 14:41 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Mike Snitzer, linux-block, dm-devel

On Tue, Aug 10, 2021 at 01:36:08AM +0100, Alasdair G Kergon wrote:
> On Wed, Aug 04, 2021 at 11:41:39AM +0200, Christoph Hellwig wrote:
> > allows device mapper to use the normal scheme
> > of calling add_disk when it is ready to accept I/O.
> For clarity, even after this patchset, the device is not ready to accept
> I/O when add_disk is called.  

The question then arises: could we go beyond this patchset and move the
add_disk further to the first resume to make the statement true?  (From
step 2 to 3 in my earlier response. DM_TABLE_CLEAR then also enters the
mix for testing.)

In the early days, in practice userspace did have to resume a device
before it could be referenced in a table and lvm2 and other tools were
designed with that in mind - they should always resume a device before
loading a table that references it.  This was because the device
reference performed a size check - to make sure the access was within
the device, and the device size isn't defined until a table becomes live
when the device is resumed.  But some multipath tables had to be set up
referencing devices with not-yet-defined sizes, so the code got relaxed
to accept references to zero-sized devices.  (At the back of my mind I
think there was some non-multipath code that found this a convenient
short-cut too.)

So since this "must resume before referencing in a table" hasn't been
enforced for so long, I can't really say how much userspace code, if
any, might now not be doing it.  We and others would need to do some
testing to see if we could get away with making such a change.

Alasdair


* Re: [dm-devel] [PATCH 7/8] dm: delay registering the gendisk
  2021-08-10 13:12     ` Peter Rajnoha
@ 2021-08-10 15:05       ` Alasdair G Kergon
  0 siblings, 0 replies; 35+ messages in thread
From: Alasdair G Kergon @ 2021-08-10 15:05 UTC (permalink / raw)
  To: Peter Rajnoha
  Cc: Jens Axboe, Mike Snitzer, linux-block, dm-devel,
	Christoph Hellwig, Alasdair G Kergon

On Tue, Aug 10, 2021 at 03:12:27PM +0200, Peter Rajnoha wrote:
> (I'm not counting the very corner use case of
> 'dmsetup --addnodeoncreate --verifyudev' which now ends up with a dev node
> in /dev that logically returns -ENODEV when accessed instead of zero-sized
> device as it was before.)
 
Yes.  That facility was provided to assist people having to work with
old or incorrect code or misconfigured systems and breaking it in this
way shouldn't be a concern.  (We could possibly still patch it up to
continue to do the best thing after the patchset goes in.)

Alasdair


* Re: [dm-devel] [PATCH 4/8] block: support delayed holder registration
       [not found]   ` <CGME20210810213058eucas1p109323e3c3ecaa76d37d8cf63b6d8ecfd@eucas1p1.samsung.com>
@ 2021-08-10 21:30     ` Marek Szyprowski
  0 siblings, 0 replies; 35+ messages in thread
From: Marek Szyprowski @ 2021-08-10 21:30 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Mike Snitzer
  Cc: linux-block, dm-devel, Zolnierkiewicz, Bartlomiej

Hi,

On 04.08.2021 11:41, Christoph Hellwig wrote:
> device mapper needs to register holders before it is ready to do I/O.
> Currently it does so by registering the disk early, which can leave
> the disk and queue in a weird half state where the queue is registered
> with the disk, except for sysfs and the elevator.  And this state has
> been a bit problematic before, and will get more so when sorting out
> the responsibilities between the queue and the disk.
>
> Support registering holders on an initialized but not registered disk
> instead by delaying the sysfs registration until the disk is registered.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Mike Snitzer <snitzer@redhat.com>

This patch landed in today's linux-next (20210810) as commit
d62633873590 ("block: support delayed holder registration"). It triggers
the following lockdep warning on the ARM64 'virt' machine under QEMU:

======================================================
WARNING: possible circular locking dependency detected
5.14.0-rc4+ #10642 Not tainted
------------------------------------------------------
systemd-udevd/227 is trying to acquire lock:
ffffb6b41952d628 (mtd_table_mutex){+.+.}-{3:3}, at: blktrans_open+0x40/0x250

but task is already holding lock:
ffff0eacc403bb18 (&disk->open_mutex){+.+.}-{3:3}, at: blkdev_get_by_dev+0x110/0x2f8

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&disk->open_mutex){+.+.}-{3:3}:
        __mutex_lock+0xa4/0x978
        mutex_lock_nested+0x54/0x60
        bd_register_pending_holders+0x2c/0x118
        __device_add_disk+0x1d8/0x368
        device_add_disk+0x10/0x18
        add_mtd_blktrans_dev+0x2dc/0x428
        mtdblock_add_mtd+0x68/0x98
        blktrans_notify_add+0x44/0x70
        add_mtd_device+0x430/0x4a0
        mtd_device_parse_register+0x1a4/0x2b0
        physmap_flash_probe+0x44c/0x780
        platform_probe+0x90/0xd8
        really_probe+0x138/0x2d0
        __driver_probe_device+0x78/0xd8
        driver_probe_device+0x40/0x110
        __driver_attach+0xcc/0x118
        bus_for_each_dev+0x68/0xc8
        driver_attach+0x20/0x28
        bus_add_driver+0x168/0x1f8
        driver_register+0x60/0x110
        __platform_driver_register+0x24/0x30
        physmap_init+0x18/0x20
        do_one_initcall+0x84/0x450
        kernel_init_freeable+0x31c/0x38c
        kernel_init+0x20/0x120
        ret_from_fork+0x10/0x18

-> #0 (mtd_table_mutex){+.+.}-{3:3}:
        __lock_acquire+0xff4/0x1840
        lock_acquire+0x130/0x3e8
        __mutex_lock+0xa4/0x978
        mutex_lock_nested+0x54/0x60
        blktrans_open+0x40/0x250
        blkdev_get_whole+0x28/0x120
        blkdev_get_by_dev+0x15c/0x2f8
        blkdev_open+0x50/0xb0
        do_dentry_open+0x238/0x3c0
        vfs_open+0x28/0x30
        path_openat+0x720/0x938
        do_filp_open+0x80/0x108
        do_sys_openat2+0x1b4/0x2c8
        do_sys_open+0x68/0x88
        __arm64_compat_sys_openat+0x1c/0x28
        invoke_syscall+0x40/0xf8
        el0_svc_common+0x60/0x100
        do_el0_svc_compat+0x1c/0x48
        el0_svc_compat+0x20/0x30
        el0t_32_sync_handler+0xec/0x140
        el0t_32_sync+0x168/0x16c

other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&disk->open_mutex);
                                lock(mtd_table_mutex);
                                lock(&disk->open_mutex);
   lock(mtd_table_mutex);

  *** DEADLOCK ***

1 lock held by systemd-udevd/227:
  #0: ffff0eacc403bb18 (&disk->open_mutex){+.+.}-{3:3}, at: blkdev_get_by_dev+0x110/0x2f8

stack backtrace:
CPU: 1 PID: 227 Comm: systemd-udevd Not tainted 5.14.0-rc4+ #10642
Hardware name: linux,dummy-virt (DT)
Call trace:
  dump_backtrace+0x0/0x1d0
  show_stack+0x14/0x20
  dump_stack_lvl+0x88/0xb0
  dump_stack+0x14/0x2c
  print_circular_bug.isra.50+0x1ac/0x200
  check_noncircular+0x134/0x148
  __lock_acquire+0xff4/0x1840
  lock_acquire+0x130/0x3e8
  __mutex_lock+0xa4/0x978
  mutex_lock_nested+0x54/0x60
  blktrans_open+0x40/0x250
  blkdev_get_whole+0x28/0x120
  blkdev_get_by_dev+0x15c/0x2f8
  blkdev_open+0x50/0xb0
  do_dentry_open+0x238/0x3c0
  vfs_open+0x28/0x30
  path_openat+0x720/0x938
  do_filp_open+0x80/0x108
  do_sys_openat2+0x1b4/0x2c8
  do_sys_open+0x68/0x88
  __arm64_compat_sys_openat+0x1c/0x28
  invoke_syscall+0x40/0xf8
  el0_svc_common+0x60/0x100
  do_el0_svc_compat+0x1c/0x48
  el0_svc_compat+0x20/0x30
  el0t_32_sync_handler+0xec/0x140
  el0t_32_sync+0x168/0x16c

If this is a false positive, then it should be annotated as such.
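
For readers unfamiliar with this kind of report, the sketch below shows
the general idea of lock-order checking in miniature.  It is a
hypothetical user-space toy, not lockdep's actual implementation, and
every name in it is invented: it records "B was taken while A was held"
edges and flags an acquisition that closes a cycle, which is the shape
of the mtd_table_mutex / open_mutex report above.

```c
#include <assert.h>
#include <string.h>

#define MAX_LOCKS 8

/* Hypothetical miniature of a lock-order validator. */
struct lock_graph {
	/* after[a][b] != 0 means lock b was taken while lock a was held. */
	unsigned char after[MAX_LOCKS][MAX_LOCKS];
};

static void graph_init(struct lock_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static int reachable(const struct lock_graph *g, int from, int to,
		     unsigned char *seen)
{
	int next;

	if (from == to)
		return 1;
	seen[from] = 1;
	for (next = 0; next < MAX_LOCKS; next++) {
		if (g->after[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return 1;
	}
	return 0;
}

/*
 * Record that 'new_lock' is acquired while 'held' is held.
 * Returns 1 if this closes a cycle (a possible ABBA deadlock), else 0.
 */
static int record_acquire(struct lock_graph *g, int held, int new_lock)
{
	unsigned char seen[MAX_LOCKS] = { 0 };
	int cycle;

	assert(held >= 0 && held < MAX_LOCKS);
	assert(new_lock >= 0 && new_lock < MAX_LOCKS);

	/* A cycle exists if 'held' is already reachable from 'new_lock'. */
	cycle = reachable(g, new_lock, held, seen);
	g->after[held][new_lock] = 1;
	return cycle;
}
```

Feeding it the two orders from the report (open_mutex taken under
mtd_table_mutex during probe, then mtd_table_mutex taken under
open_mutex during open) makes the second call report a cycle.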

> ---
>   block/genhd.c         | 10 +++++++
>   block/holder.c        | 68 ++++++++++++++++++++++++++++++++-----------
>   include/linux/genhd.h |  5 ++++
>   3 files changed, 66 insertions(+), 17 deletions(-)
>
> diff --git a/block/genhd.c b/block/genhd.c
> index cd4eab744667..db916f779077 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -447,6 +447,16 @@ static void register_disk(struct device *parent, struct gendisk *disk,
>   		kobject_create_and_add("holders", &ddev->kobj);
>   	disk->slave_dir = kobject_create_and_add("slaves", &ddev->kobj);
>   
> +	/*
> +	 * XXX: this is a mess, can't wait for real error handling in add_disk.
> +	 * Make sure ->slave_dir is NULL if we failed some of the registration
> +	 * so that the cleanup in bd_unlink_disk_holder works properly.
> +	 */
> +	if (bd_register_pending_holders(disk) < 0) {
> +		kobject_put(disk->slave_dir);
> +		disk->slave_dir = NULL;
> +	}
> +
>   	if (disk->flags & GENHD_FL_HIDDEN)
>   		return;
>   
> diff --git a/block/holder.c b/block/holder.c
> index 11e65d99a9fb..4568cc4f6827 100644
> --- a/block/holder.c
> +++ b/block/holder.c
> @@ -28,6 +28,19 @@ static void del_symlink(struct kobject *from, struct kobject *to)
>   	sysfs_remove_link(from, kobject_name(to));
>   }
>   
> +static int __link_disk_holder(struct block_device *bdev, struct gendisk *disk)
> +{
> +	int ret;
> +
> +	ret = add_symlink(disk->slave_dir, bdev_kobj(bdev));
> +	if (ret)
> +		return ret;
> +	ret = add_symlink(bdev->bd_holder_dir, &disk_to_dev(disk)->kobj);
> +	if (ret)
> +		del_symlink(disk->slave_dir, bdev_kobj(bdev));
> +	return ret;
> +}
> +
>   /**
>    * bd_link_disk_holder - create symlinks between holding disk and slave bdev
>    * @bdev: the claimed slave bdev
> @@ -66,7 +79,7 @@ int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk)
>   	WARN_ON_ONCE(!bdev->bd_holder);
>   
>   	/* FIXME: remove the following once add_disk() handles errors */
> -	if (WARN_ON(!disk->slave_dir || !bdev->bd_holder_dir))
> +	if (WARN_ON(!bdev->bd_holder_dir))
>   		goto out_unlock;
>   
>   	holder = bd_find_holder_disk(bdev, disk);
> @@ -84,28 +97,28 @@ int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk)
>   	INIT_LIST_HEAD(&holder->list);
>   	holder->bdev = bdev;
>   	holder->refcnt = 1;
> -
> -	ret = add_symlink(disk->slave_dir, bdev_kobj(bdev));
> -	if (ret)
> -		goto out_free;
> -
> -	ret = add_symlink(bdev->bd_holder_dir, &disk_to_dev(disk)->kobj);
> -	if (ret)
> -		goto out_del;
> +	if (disk->slave_dir) {
> +		ret = __link_disk_holder(bdev, disk);
> +		if (ret) {
> +			kfree(holder);
> +			goto out_unlock;
> +		}
> +	}
>   
>   	list_add(&holder->list, &disk->slave_bdevs);
> -	goto out_unlock;
> -
> -out_del:
> -	del_symlink(disk->slave_dir, bdev_kobj(bdev));
> -out_free:
> -	kfree(holder);
>   out_unlock:
>   	mutex_unlock(&disk->open_mutex);
>   	return ret;
>   }
>   EXPORT_SYMBOL_GPL(bd_link_disk_holder);
>   
> +static void __unlink_disk_holder(struct block_device *bdev,
> +		struct gendisk *disk)
> +{
> +	del_symlink(disk->slave_dir, bdev_kobj(bdev));
> +	del_symlink(bdev->bd_holder_dir, &disk_to_dev(disk)->kobj);
> +}
> +
>   /**
>    * bd_unlink_disk_holder - destroy symlinks created by bd_link_disk_holder()
>    * @bdev: the calimed slave bdev
> @@ -123,11 +136,32 @@ void bd_unlink_disk_holder(struct block_device *bdev, struct gendisk *disk)
>   	mutex_lock(&disk->open_mutex);
>   	holder = bd_find_holder_disk(bdev, disk);
>   	if (!WARN_ON_ONCE(holder == NULL) && !--holder->refcnt) {
> -		del_symlink(disk->slave_dir, bdev_kobj(bdev));
> -		del_symlink(bdev->bd_holder_dir, &disk_to_dev(disk)->kobj);
> +		if (disk->slave_dir)
> +			__unlink_disk_holder(bdev, disk);
>   		list_del_init(&holder->list);
>   		kfree(holder);
>   	}
>   	mutex_unlock(&disk->open_mutex);
>   }
>   EXPORT_SYMBOL_GPL(bd_unlink_disk_holder);
> +
> +int bd_register_pending_holders(struct gendisk *disk)
> +{
> +	struct bd_holder_disk *holder;
> +	int ret;
> +
> +	mutex_lock(&disk->open_mutex);
> +	list_for_each_entry(holder, &disk->slave_bdevs, list) {
> +		ret = __link_disk_holder(holder->bdev, disk);
> +		if (ret)
> +			goto out_undo;
> +	}
> +	mutex_unlock(&disk->open_mutex);
> +	return 0;
> +
> +out_undo:
> +	list_for_each_entry_continue_reverse(holder, &disk->slave_bdevs, list)
> +		__unlink_disk_holder(holder->bdev, disk);
> +	mutex_unlock(&disk->open_mutex);
> +	return ret;
> +}
> diff --git a/include/linux/genhd.h b/include/linux/genhd.h
> index 0721807d76ee..80952f038d79 100644
> --- a/include/linux/genhd.h
> +++ b/include/linux/genhd.h
> @@ -323,6 +323,7 @@ long compat_blkdev_ioctl(struct file *, unsigned, unsigned long);
>   #ifdef CONFIG_BLOCK_HOLDER_DEPRECATED
>   int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk);
>   void bd_unlink_disk_holder(struct block_device *bdev, struct gendisk *disk);
> +int bd_register_pending_holders(struct gendisk *disk);
>   #else
>   static inline int bd_link_disk_holder(struct block_device *bdev,
>   				      struct gendisk *disk)
> @@ -333,6 +334,10 @@ static inline void bd_unlink_disk_holder(struct block_device *bdev,
>   					 struct gendisk *disk)
>   {
>   }
> +static inline int bd_register_pending_holders(struct gendisk *disk)
> +{
> +	return 0;
> +}
>   #endif /* CONFIG_BLOCK_HOLDER_DEPRECATED */
>   
>   dev_t part_devt(struct gendisk *disk, u8 partno);
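
The error path of bd_register_pending_holders() in the hunk quoted above
links every pending holder and, if one link fails, walks back over the
entries already linked and unlinks them. A minimal userspace sketch of that
all-or-undo pattern follows. The names pending_holder, link_holder,
unlink_holder and register_pending are hypothetical stand-ins and not kernel
APIs, and a plain array replaces the kernel list:

```c
#include <stddef.h>

/* Hypothetical stand-in for a holder entry; not struct bd_holder_disk. */
struct pending_holder {
	int id;
	int linked;	/* 1 once the simulated links exist */
};

/* Hypothetical link step: fails for id == fail_id to simulate an error. */
static int link_holder(struct pending_holder *h, int fail_id)
{
	if (h->id == fail_id)
		return -1;
	h->linked = 1;
	return 0;
}

static void unlink_holder(struct pending_holder *h)
{
	h->linked = 0;
}

/*
 * Link every pending holder. On failure, unlink the entries linked so far
 * in reverse order and return the error, so nothing stays half-registered.
 */
int register_pending(struct pending_holder *h, size_t n, int fail_id)
{
	size_t i;
	int ret;

	for (i = 0; i < n; i++) {
		ret = link_holder(&h[i], fail_id);
		if (ret)
			goto out_undo;
	}
	return 0;

out_undo:
	/* h[i] itself failed, so start undoing at h[i - 1]. */
	while (i-- > 0)
		unlink_holder(&h[i]);
	return ret;
}
```

Calling register_pending() with a fail_id that matches no entry leaves all
entries linked; a matching fail_id leaves none linked and returns the error.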

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland



* Re: [dm-devel] [PATCH 4/8] block: support delayed holder registration
  2021-08-04  9:41 ` [dm-devel] [PATCH 4/8] block: support delayed holder registration Christoph Hellwig
       [not found]   ` <CGME20210810213058eucas1p109323e3c3ecaa76d37d8cf63b6d8ecfd@eucas1p1.samsung.com>
@ 2021-08-14 21:13   ` Guenter Roeck
  2021-08-15  7:07     ` Christoph Hellwig
  1 sibling, 1 reply; 35+ messages in thread
From: Guenter Roeck @ 2021-08-14 21:13 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jens Axboe, linux-block, dm-devel, Mike Snitzer

On Wed, Aug 04, 2021 at 11:41:43AM +0200, Christoph Hellwig wrote:
> device mapper needs to register holders before it is ready to do I/O.
> Currently it does so by registering the disk early, which can leave
> the disk and queue in a weird half state where the queue is registered
> with the disk, except for sysfs and the elevator.  And this state has
> been a bit problematic before, and will get more so when sorting out
> the responsibilities between the queue and the disk.
> 
> Support registering holders on an initialized but not registered disk
> instead by delaying the sysfs registration until the disk is registered.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Mike Snitzer <snitzer@redhat.com>

This patch results in lockdep splats when booting from flash.
Reverting it fixes the problem.

Guenter

---
bisect log:

# bad: [4b358aabb93a2c654cd1dcab1a25a589f6e2b153] Add linux-next specific files for 20210813
# good: [36a21d51725af2ce0700c6ebcb6b9594aac658a6] Linux 5.14-rc5
git bisect start 'HEAD' 'v5.14-rc5'
# good: [204808b2ca750e27cbad3455f7cb4368c4f5b260] Merge remote-tracking branch 'crypto/master'
git bisect good 204808b2ca750e27cbad3455f7cb4368c4f5b260
# bad: [2201162fca73b487152bcff2ebb0f85c1dde8479] Merge remote-tracking branch 'tip/auto-latest'
git bisect bad 2201162fca73b487152bcff2ebb0f85c1dde8479
# good: [a22c074fd1dd52a8b41dd6789220409b64093e9c] Merge tag 'drm-intel-next-2021-08-10-1' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
git bisect good a22c074fd1dd52a8b41dd6789220409b64093e9c
# bad: [33a201f05bbd7475ebe10af22b986ab70550dc8f] Merge remote-tracking branch 'block/for-next'
git bisect bad 33a201f05bbd7475ebe10af22b986ab70550dc8f
# good: [6849b6a4f2d8a7ce7c9434ed6f1e286443cf5fd3] Merge remote-tracking branch 'sound/for-next'
git bisect good 6849b6a4f2d8a7ce7c9434ed6f1e286443cf5fd3
# good: [77fb0208ae88b36e46d99441c8369412dbaacc0d] Merge remote-tracking branch 'sound-asoc/for-next'
git bisect good 77fb0208ae88b36e46d99441c8369412dbaacc0d
# bad: [3d2e79894bd7adc7d14638a0c72ceb8b722d1fa3] block: pass a gendisk to bdev_resize_partition
git bisect bad 3d2e79894bd7adc7d14638a0c72ceb8b722d1fa3
# good: [7957d93bf32bc211415827e44fdd9cdf1388df59] block: add ioctl to read the disk sequence number
git bisect good 7957d93bf32bc211415827e44fdd9cdf1388df59
# bad: [1008162b2782a3624d12b0aee8da58bc75d12e19] block: add a queue_has_disk helper
git bisect bad 1008162b2782a3624d12b0aee8da58bc75d12e19
# good: [fbd9a39542ecdd2ade55869c13856b2590db3df8] block: remove the extra kobject reference in bd_link_disk_holder
git bisect good fbd9a39542ecdd2ade55869c13856b2590db3df8
# bad: [ba30585936b0b88f0fb2b19be279b346a6cc87eb] dm: move setting md->type into dm_setup_md_queue
git bisect bad ba30585936b0b88f0fb2b19be279b346a6cc87eb
# bad: [d626338735909bc2b2e7cafc332f44ed41cfdeee] block: support delayed holder registration
git bisect bad d626338735909bc2b2e7cafc332f44ed41cfdeee
# good: [0dbcfe247f22a6d73302dfa691c48b3c14d31c4c] block: look up holders by bdev
git bisect good 0dbcfe247f22a6d73302dfa691c48b3c14d31c4c
# first bad commit: [d626338735909bc2b2e7cafc332f44ed41cfdeee] block: support delayed holder registration

---

lockdep splat on mips:

======================================================
WARNING: possible circular locking dependency detected
5.14.0-rc5-next-20210813 #1 Not tainted
------------------------------------------------------
swapper/0/1 is trying to acquire lock:
81066ab8 (mtd_table_mutex){+.+.}-{3:3}, at: blktrans_open+0x4c/0x214

but task is already holding lock:
82675ea8 (&disk->open_mutex){+.+.}-{3:3}, at: blkdev_get_by_dev+0x17c/0x454

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&disk->open_mutex){+.+.}-{3:3}:
       lock_acquire+0x2b0/0x49c
       __mutex_lock+0xb8/0x760
       mutex_lock_nested+0x1c/0x28
       bd_register_pending_holders+0x34/0x140
       device_add_disk+0x214/0x55c
       add_mtd_blktrans_dev+0x318/0x58c
       mtdblock_add_mtd+0x94/0xf4
       blktrans_notify_add+0x44/0x6c
       add_mtd_device+0x35c/0x5ec
       add_mtd_partitions+0xc4/0x22c
       parse_mtd_partitions+0x1d8/0x3d8
       mtd_device_parse_register+0x94/0x330
       physmap_flash_probe+0x438/0x7a8
       platform_probe+0x50/0xc4
       really_probe+0x140/0x30c
       driver_probe_device+0x48/0x110
       __driver_attach+0xe8/0x13c
       bus_for_each_dev+0x70/0xd0
       bus_add_driver+0x174/0x234
       driver_register+0x80/0x144
       do_one_initcall+0x94/0x3c4
       kernel_init_freeable+0x20c/0x2a0
       kernel_init+0x24/0x128
       ret_from_kernel_thread+0x14/0x1c

-> #0 (mtd_table_mutex){+.+.}-{3:3}:
       check_noncircular+0x1b4/0x21c
       __lock_acquire+0x1ebc/0x3b70
       lock_acquire+0x2b0/0x49c
       __mutex_lock+0xb8/0x760
       mutex_lock_nested+0x1c/0x28
       blktrans_open+0x4c/0x214
       blkdev_get_whole+0x2c/0xd4
       blkdev_get_by_dev+0x140/0x454
       blkdev_get_by_path+0x6c/0xbc
       mount_bdev+0x50/0x1fc
       ext4_mount+0x18/0x24
       legacy_get_tree+0x30/0x78
       vfs_get_tree+0x2c/0x104
       path_mount+0x44c/0xa1c
       init_mount+0x70/0xb4
       do_mount_root+0xac/0x164
       mount_block_root+0x174/0x2a8
       mount_root+0x120/0x15c
       prepare_namespace+0x15c/0x19c
       kernel_init+0x24/0x128
       ret_from_kernel_thread+0x14/0x1c

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&disk->open_mutex);
                               lock(mtd_table_mutex);
                               lock(&disk->open_mutex);
  lock(mtd_table_mutex);

 *** DEADLOCK ***

1 lock held by swapper/0/1:
 #0: 82675ea8 (&disk->open_mutex){+.+.}-{3:3}, at: blkdev_get_by_dev+0x17c/0x454

stack backtrace:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc5-next-20210813 #1
Stack : ffffffff 801b5414 80cfba14 00000004 801b5374 00000000 820cb90c a16af2b0
        80fe0000 81119930 80ef8fe0 80fe0000 80fddc23 00000001 820cb8b0 82100ec0
        00000000 00000000 80ef8fe0 820cb730 00000001 820cb744 00000000 0000ffff
        00000008 00000007 00000280 82318800 80fe0000 00000000 80ef8fe0 80fe0000
        00000000 80fdad68 820c6fd8 820c6fb8 00000000 807b0f60 00000000 81120000
        ...
Call Trace:
[<8010ad9c>] show_stack+0x84/0x11c
[<80c79334>] dump_stack_lvl+0xa8/0x100
[<801a3978>] check_noncircular+0x1b4/0x21c
[<801a8184>] __lock_acquire+0x1ebc/0x3b70
[<801a58c0>] lock_acquire+0x2b0/0x49c
[<80c85068>] __mutex_lock+0xb8/0x760
[<80c8572c>] mutex_lock_nested+0x1c/0x28
[<8087bf68>] blktrans_open+0x4c/0x214
[<8033ddfc>] blkdev_get_whole+0x2c/0xd4
[<8033f2c0>] blkdev_get_by_dev+0x140/0x454
[<8033f954>] blkdev_get_by_path+0x6c/0xbc
[<802e3f10>] mount_bdev+0x50/0x1fc
[<80423c04>] ext4_mount+0x18/0x24
[<8032ef70>] legacy_get_tree+0x30/0x78
[<802e3034>] vfs_get_tree+0x2c/0x104
[<803149ec>] path_mount+0x44c/0xa1c
[<810aec68>] init_mount+0x70/0xb4
[<8109948c>] do_mount_root+0xac/0x164
[<81099714>] mount_block_root+0x174/0x2a8
[<81099968>] mount_root+0x120/0x15c
[<81099b00>] prepare_namespace+0x15c/0x19c
[<80c810c0>] kernel_init+0x24/0x128
[<801038f8>] ret_from_kernel_thread+0x14/0x1c
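
The splat above reports an ordering inversion between two locks. Chain #1
records &disk->open_mutex being taken in bd_register_pending_holders() while
mtd_table_mutex is held (the CPU1 column), and the mount path then takes
mtd_table_mutex in blktrans_open() while holding &disk->open_mutex (the CPU0
column). The sketch below shows one way such a cycle can be detected from
recorded "held, then wanted" pairs. It is a simplified illustration, not
lockdep's actual implementation, and lock_graph, reachable and
lock_graph_add are hypothetical names:

```c
#include <stdbool.h>
#include <string.h>

#define MAX_LOCKS 8

/* edge[a][b] is true when lock b was taken while lock a was held. */
struct lock_graph {
	bool edge[MAX_LOCKS][MAX_LOCKS];
};

/* Depth-first search: is there a recorded chain from `from` to `to`? */
static bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_LOCKS; next++)
		if (g->edge[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	return false;
}

/*
 * Record "acquire `want` while holding `held`". Returns false, and records
 * nothing, if an existing chain already orders `want` before `held`, which
 * means the new edge would close a cycle.
 */
bool lock_graph_add(struct lock_graph *g, int held, int want)
{
	bool seen[MAX_LOCKS];

	memset(seen, 0, sizeof(seen));
	if (reachable(g, want, held, seen))
		return false;
	g->edge[held][want] = true;
	return true;
}
```

With a zeroed graph, recording mtd_table_mutex then open_mutex succeeds, and
the later attempt to record open_mutex then mtd_table_mutex is rejected,
which corresponds to the point where the warning is printed.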

---
lockdep splat on powerpc:

[   14.502119][    T1] ======================================================
[   14.502379][    T1] WARNING: possible circular locking dependency detected
[   14.502668][    T1] 5.14.0-rc5-next-20210813 #1 Not tainted
[   14.502933][    T1] ------------------------------------------------------
[   14.503185][    T1] swapper/0/1 is trying to acquire lock:
[   14.503419][    T1] c0000000018bbb90 (mtd_table_mutex){+.+.}-{3:3}, at: blktrans_open+0x60/0x300
[   14.503992][    T1]
[   14.503992][    T1] but task is already holding lock:
[   14.504273][    T1] c0000000058a8718 (&disk->open_mutex){+.+.}-{3:3}, at: blkdev_get_by_dev+0x20c/0x3c0
[   14.504643][    T1]
[   14.504643][    T1] which lock already depends on the new lock.
[   14.504643][    T1]
[   14.505012][    T1]
[   14.505012][    T1] the existing dependency chain (in reverse order) is:
[   14.505377][    T1]
[   14.505377][    T1] -> #1 (&disk->open_mutex){+.+.}-{3:3}:
[   14.505724][    T1]        __mutex_lock+0xd8/0xad0
[   14.505945][    T1]        bd_register_pending_holders+0x48/0x190
[   14.506191][    T1]        device_add_disk+0x29c/0x400
[   14.506390][    T1]        add_mtd_blktrans_dev+0x358/0x630
[   14.506600][    T1]        mtdblock_add_mtd+0x94/0x120
[   14.506802][    T1]        blktrans_notify_add+0x7c/0xb0
[   14.507004][    T1]        add_mtd_device+0x408/0x5f0
[   14.507197][    T1]        add_mtd_partitions+0xfc/0x2b0
[   14.507400][    T1]        parse_mtd_partitions+0x2b4/0x980
[   14.507611][    T1]        mtd_device_parse_register+0xc0/0x3a0
[   14.507843][    T1]        powernv_flash_probe+0x180/0x240
[   14.508051][    T1]        platform_probe+0x78/0x120
[   14.508246][    T1]        really_probe+0x1cc/0x440
[   14.508435][    T1]        __driver_probe_device+0xb0/0x160
[   14.508645][    T1]        driver_probe_device+0x60/0x130
[   14.508848][    T1]        __driver_attach+0xe8/0x160
[   14.509041][    T1]        bus_for_each_dev+0xb4/0x130
[   14.509234][    T1]        driver_attach+0x34/0x50
[   14.509426][    T1]        bus_add_driver+0x1d8/0x2b0
[   14.509621][    T1]        driver_register+0x98/0x1a0
[   14.509820][    T1]        __platform_driver_register+0x38/0x50
[   14.510048][    T1]        powernv_flash_driver_init+0x2c/0x40
[   14.510276][    T1]        do_one_initcall+0x88/0x490
[   14.510471][    T1]        kernel_init_freeable+0x3dc/0x484
[   14.510688][    T1]        kernel_init+0x3c/0x180
[   14.510871][    T1]        ret_from_kernel_thread+0x5c/0x64
[   14.511120][    T1]
[   14.511120][    T1] -> #0 (mtd_table_mutex){+.+.}-{3:3}:
[   14.511419][    T1]        __lock_acquire+0x1eb0/0x2a40
[   14.511622][    T1]        lock_acquire+0x2d8/0x490
[   14.511815][    T1]        __mutex_lock+0xd8/0xad0
[   14.512001][    T1]        blktrans_open+0x60/0x300
[   14.512189][    T1]        blkdev_get_whole+0x50/0x110
[   14.512387][    T1]        blkdev_get_by_dev+0x1dc/0x3c0
[   14.512588][    T1]        blkdev_get_by_path+0x90/0xe0
[   14.512787][    T1]        mount_bdev+0x6c/0x2b0
[   14.512972][    T1]        ext4_mount+0x28/0x40
[   14.513154][    T1]        legacy_get_tree+0x4c/0xb0
[   14.513358][    T1]        vfs_get_tree+0x4c/0x110
[   14.513545][    T1]        path_mount+0x2d8/0xd30
[   14.513730][    T1]        init_mount+0x7c/0xcc
[   14.513911][    T1]        mount_block_root+0x230/0x454
[   14.514111][    T1]        prepare_namespace+0x1b0/0x204
[   14.514313][    T1]        kernel_init_freeable+0x428/0x484
[   14.514524][    T1]        kernel_init+0x3c/0x180
[   14.514713][    T1]        ret_from_kernel_thread+0x5c/0x64
[   14.514944][    T1]
[   14.514944][    T1] other info that might help us debug this:
[   14.514944][    T1]
[   14.515330][    T1]  Possible unsafe locking scenario:
[   14.515330][    T1]
[   14.515607][    T1]        CPU0                    CPU1
[   14.515808][    T1]        ----                    ----
[   14.516007][    T1]   lock(&disk->open_mutex);
[   14.516196][    T1]                                lock(mtd_table_mutex);
[   14.516463][    T1]                                lock(&disk->open_mutex);
[   14.516739][    T1]   lock(mtd_table_mutex);
[   14.516926][    T1]
[   14.516926][    T1]  *** DEADLOCK ***
[   14.516926][    T1]
[   14.517242][    T1] 1 lock held by swapper/0/1:
[   14.517455][    T1]  #0: c0000000058a8718 (&disk->open_mutex){+.+.}-{3:3}, at: blkdev_get_by_dev+0x20c/0x3c0
[   14.517875][    T1]
[   14.517875][    T1] stack backtrace:
[   14.518197][    T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc5-next-20210813 #1
[   14.518684][    T1] Call Trace:
[   14.518852][    T1] [c000000002c872c0] [c00000000090eb28] dump_stack_lvl+0xac/0x108 (unreliable)
[   14.519242][    T1] [c000000002c87300] [c0000000001a9dac] print_circular_bug.isra.44+0x37c/0x3e0
[   14.519588][    T1] [c000000002c873a0] [c0000000001a9fe0] check_noncircular+0x1d0/0x220
[   14.519887][    T1] [c000000002c87470] [c0000000001affd0] __lock_acquire+0x1eb0/0x2a40
[   14.520182][    T1] [c000000002c875b0] [c0000000001ad028] lock_acquire+0x2d8/0x490
[   14.520469][    T1] [c000000002c876a0] [c0000000010ab098] __mutex_lock+0xd8/0xad0
[   14.520756][    T1] [c000000002c877b0] [c000000000bac480] blktrans_open+0x60/0x300
[   14.521045][    T1] [c000000002c87800] [c0000000005406d0] blkdev_get_whole+0x50/0x110
[   14.521346][    T1] [c000000002c87840] [c000000000542efc] blkdev_get_by_dev+0x1dc/0x3c0
[   14.521650][    T1] [c000000002c878a0] [c0000000005434f0] blkdev_get_by_path+0x90/0xe0
[   14.521948][    T1] [c000000002c878f0] [c0000000004d1dec] mount_bdev+0x6c/0x2b0
[   14.522221][    T1] [c000000002c87990] [c000000000649af8] ext4_mount+0x28/0x40
[   14.522503][    T1] [c000000002c879b0] [c000000000531d5c] legacy_get_tree+0x4c/0xb0
[   14.522800][    T1] [c000000002c879e0] [c0000000004cff2c] vfs_get_tree+0x4c/0x110
[   14.523083][    T1] [c000000002c87a50] [c0000000005103a8] path_mount+0x2d8/0xd30
[   14.523365][    T1] [c000000002c87ae0] [c0000000015776c4] init_mount+0x7c/0xcc
[   14.523645][    T1] [c000000002c87b50] [c000000001541bd0] mount_block_root+0x230/0x454
[   14.523943][    T1] [c000000002c87c50] [c000000001542050] prepare_namespace+0x1b0/0x204
[   14.524238][    T1] [c000000002c87cc0] [c00000000154170c] kernel_init_freeable+0x428/0x484
[   14.524545][    T1] [c000000002c87da0] [c000000000012d2c] kernel_init+0x3c/0x180
[   14.524831][    T1] [c000000002c87e10] [c00000000000cfd4] ret_from_kernel_thread+0x5c/0x64



* Re: [dm-devel] [PATCH 4/8] block: support delayed holder registration
  2021-08-14 21:13   ` Guenter Roeck
@ 2021-08-15  7:07     ` Christoph Hellwig
  2021-08-15 14:27       ` Guenter Roeck
  0 siblings, 1 reply; 35+ messages in thread
From: Christoph Hellwig @ 2021-08-15  7:07 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Jens Axboe, linux-block, dm-devel, Christoph Hellwig, Mike Snitzer

On Sat, Aug 14, 2021 at 02:13:09PM -0700, Guenter Roeck wrote:
> On Wed, Aug 04, 2021 at 11:41:43AM +0200, Christoph Hellwig wrote:
> > device mapper needs to register holders before it is ready to do I/O.
> > Currently it does so by registering the disk early, which can leave
> > the disk and queue in a weird half state where the queue is registered
> > with the disk, except for sysfs and the elevator.  And this state has
> > been a bit problematic before, and will get more so when sorting out
> > the responsibilities between the queue and the disk.
> > 
> > Support registering holders on an initialized but not registered disk
> > instead by delaying the sysfs registration until the disk is registered.
> > 
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > Reviewed-by: Mike Snitzer <snitzer@redhat.com>
> 
> This patch results in lockdep splats when booting from flash.
> Reverting it fixes the problem.

Should be fixed by:
https://git.kernel.dk/cgit/linux-block/commit/?h=for-5.15/drivers&id=6e4df4c6488165637b95b9701cc862a42a3836ba


* Re: [dm-devel] [PATCH 4/8] block: support delayed holder registration
  2021-08-15  7:07     ` Christoph Hellwig
@ 2021-08-15 14:27       ` Guenter Roeck
  2021-08-16  7:21         ` Christoph Hellwig
  0 siblings, 1 reply; 35+ messages in thread
From: Guenter Roeck @ 2021-08-15 14:27 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jens Axboe, linux-block, dm-devel, Mike Snitzer

On 8/15/21 12:07 AM, Christoph Hellwig wrote:
> On Sat, Aug 14, 2021 at 02:13:09PM -0700, Guenter Roeck wrote:
>> On Wed, Aug 04, 2021 at 11:41:43AM +0200, Christoph Hellwig wrote:
>>> device mapper needs to register holders before it is ready to do I/O.
>>> Currently it does so by registering the disk early, which can leave
>>> the disk and queue in a weird half state where the queue is registered
>>> with the disk, except for sysfs and the elevator.  And this state has
>>> been a bit problematic before, and will get more so when sorting out
>>> the responsibilities between the queue and the disk.
>>>
>>> Support registering holders on an initialized but not registered disk
>>> instead by delaying the sysfs registration until the disk is registered.
>>>
>>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>>> Reviewed-by: Mike Snitzer <snitzer@redhat.com>
>>
>> This patch results in lockdep splats when booting from flash.
>> Reverting it fixes the problem.
> 
> Should be fixed by:
> https://git.kernel.dk/cgit/linux-block/commit/?h=for-5.15/drivers&id=6e4df4c6488165637b95b9701cc862a42a3836ba
> 

No, it doesn't. I could not apply this patch alone, so I applied the entire series
on top of next-20210813 and gave it another try.

f53c2d11ac98 (HEAD -> master) nbd: reduce the nbd_index_mutex scope
f2f5254b356f nbd: refactor device search and allocation in nbd_genl_connect
d5b03177e069 nbd: return the allocated nbd_device from nbd_dev_add
350b3f6a6e6b nbd: remove nbd_del_disk
49efbeb9de86 nbd: refactor device removal
cdd920eb7cf2 nbd: do del_gendisk() asynchronously for NBD_DESTROY_ON_DISCONNECT
4b358aabb93a (tag: next-20210813, origin/master, origin/HEAD) Add linux-next specific files for 20210813

Still:
...

[   14.467748][    T1]  Possible unsafe locking scenario:
[   14.467748][    T1]
[   14.467928][    T1]        CPU0                    CPU1
[   14.468058][    T1]        ----                    ----
[   14.468187][    T1]   lock(&disk->open_mutex);
[   14.468317][    T1]                                lock(mtd_table_mutex);
[   14.468493][    T1]                                lock(&disk->open_mutex);
[   14.468671][    T1]   lock(mtd_table_mutex);

Guenter


* Re: [dm-devel] [PATCH 4/8] block: support delayed holder registration
  2021-08-15 14:27       ` Guenter Roeck
@ 2021-08-16  7:21         ` Christoph Hellwig
  2021-08-16 14:17           ` Guenter Roeck
  2021-08-18  2:51           ` Guenter Roeck
  0 siblings, 2 replies; 35+ messages in thread
From: Christoph Hellwig @ 2021-08-16  7:21 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Jens Axboe, linux-block, dm-devel, Christoph Hellwig, Mike Snitzer

On Sun, Aug 15, 2021 at 07:27:37AM -0700, Guenter Roeck wrote:
> [   14.467748][    T1]  Possible unsafe locking scenario:
> [   14.467748][    T1]
> [   14.467928][    T1]        CPU0                    CPU1
> [   14.468058][    T1]        ----                    ----
> [   14.468187][    T1]   lock(&disk->open_mutex);
> [   14.468317][    T1]                                lock(mtd_table_mutex);
> [   14.468493][    T1]                                lock(&disk->open_mutex);
> [   14.468671][    T1]   lock(mtd_table_mutex);

Oh, that looks like a really old one, fixed by
b7abb0516822 ("mtd: fix lock hierarchy in deregister_mtd_blktrans")
in linux-next.


* Re: [dm-devel] [PATCH 4/8] block: support delayed holder registration
  2021-08-16  7:21         ` Christoph Hellwig
@ 2021-08-16 14:17           ` Guenter Roeck
  2021-08-20 15:08             ` Christoph Hellwig
  2021-08-18  2:51           ` Guenter Roeck
  1 sibling, 1 reply; 35+ messages in thread
From: Guenter Roeck @ 2021-08-16 14:17 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jens Axboe, linux-block, dm-devel, Mike Snitzer

On Mon, Aug 16, 2021 at 09:21:58AM +0200, Christoph Hellwig wrote:
> On Sun, Aug 15, 2021 at 07:27:37AM -0700, Guenter Roeck wrote:
> > [   14.467748][    T1]  Possible unsafe locking scenario:
> > [   14.467748][    T1]
> > [   14.467928][    T1]        CPU0                    CPU1
> > [   14.468058][    T1]        ----                    ----
> > [   14.468187][    T1]   lock(&disk->open_mutex);
> > [   14.468317][    T1]                                lock(mtd_table_mutex);
> > [   14.468493][    T1]                                lock(&disk->open_mutex);
> > [   14.468671][    T1]   lock(mtd_table_mutex);
> 
> Oh, that looks like a really old one, fixed by
> b7abb0516822 ("mtd: fix lock hierarchy in deregister_mtd_blktrans")
> in linux-next.

I have seen the problem in next-20210813 and that patch is there,
so that is somewhat unlikely.

Guenter


* Re: [dm-devel] [PATCH 4/8] block: support delayed holder registration
  2021-08-16  7:21         ` Christoph Hellwig
  2021-08-16 14:17           ` Guenter Roeck
@ 2021-08-18  2:51           ` Guenter Roeck
  1 sibling, 0 replies; 35+ messages in thread
From: Guenter Roeck @ 2021-08-18  2:51 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jens Axboe, linux-block, dm-devel, Mike Snitzer

On Mon, Aug 16, 2021 at 09:21:58AM +0200, Christoph Hellwig wrote:
> On Sun, Aug 15, 2021 at 07:27:37AM -0700, Guenter Roeck wrote:
> > [   14.467748][    T1]  Possible unsafe locking scenario:
> > [   14.467748][    T1]
> > [   14.467928][    T1]        CPU0                    CPU1
> > [   14.468058][    T1]        ----                    ----
> > [   14.468187][    T1]   lock(&disk->open_mutex);
> > [   14.468317][    T1]                                lock(mtd_table_mutex);
> > [   14.468493][    T1]                                lock(&disk->open_mutex);
> > [   14.468671][    T1]   lock(mtd_table_mutex);
> 
> Oh, that looks like a really old one, fixed by
> b7abb0516822 ("mtd: fix lock hierarchy in deregister_mtd_blktrans")
> in linux-next.

I tested again with next-20210817. The problem is still there, and
reverting commit d62633873590 ("block: support delayed holder
registration") still fixes it. A complete boot log is attached
for reference.

Guenter

---
Build reference: next-20210817

qemu log:
[    0.010915815,5] OPAL v6.4 starting...
[    0.011392739,7] initial console log level: memory 7, driver 5
[    0.011422553,6] CPU: P9 generation processor (max 4 threads/core)
[    0.011437437,7] CPU: Boot CPU PIR is 0x0000 PVR is 0x004e1200
[    0.011524734,7] OPAL table: 0x30101830 .. 0x30101da0, branch table: 0x30002000
[    0.011667083,7] Assigning physical memory map table for nimbus
[    0.011914730,7] FDT: Parsing fdt @0x1000000
[    0.013589817,5] CHIP: Detected Qemu simulator
[    0.013713155,6] CHIP: Initialised chip 0 from xscom@603fc00000000
[    0.013992710,6] P9 DD2.00 detected
[    0.014006187,5] CHIP: Chip ID 0000 type: P9N DD2.00
[    0.014012448,7] XSCOM: Base address: 0x603fc00000000
[    0.014031889,7] XSTOP: ibm,sw-checkstop-fir prop not found
[    0.014094328,6] MFSI 0:0: Initialized
[    0.014102372,6] MFSI 0:2: Initialized
[    0.014109468,6] MFSI 0:1: Initialized
[    0.014477509,6] LPC: LPC[000]: Initialized
[    0.014483986,7] LPC: access via MMIO @0x6030000000000
[    0.014513616,7] LPC: Default bus on chip 0x0
[    0.014579301,7] CPU: New max PIR set to 0x3
[    0.014944450,7] MEM: parsing reserved memory from reserved-names/-ranges properties
[    0.015013478,7] CPU: decrementer bits 56
[    0.015059057,6] CPU: CPU from DT PIR=0x0000 Server#=0x0 State=3
[    0.015116735,6] CPU:  1 secondary threads
[    0.016079682,5] PLAT: Using SuperIO UART
[    0.016285800,7] UART: Using LPC IRQ 4
[    0.017744314,5] PLAT: Detected QEMU POWER9 platform
[    0.017803645,5] PLAT: Detected BMC platform ast2500:openbmc
[    0.033090750,5] CPU: All 1 processors called in...
[    0.033389379,3] SBE: Master chip ID not found.
[    0.033807105,7] LPC: Routing irq 10, policy: 0 (r=1)
[    0.033866006,7] LPC: SerIRQ 10 using route 0 targetted at OPAL
[    0.055456502,5] HIOMAP: Negotiated hiomap protocol v2
[    0.055559088,5] HIOMAP: Block size is 4KiB
[    0.055635761,5] HIOMAP: BMC suggested flash timeout of 0s
[    0.055732832,5] HIOMAP: Flash size is 32MiB
[    0.055809489,5] HIOMAP: Erase granule size is 4KiB
[    0.072601777,4] FLASH: No ffs info; using raw device only
[    0.078031760,3] FLASH: Can't open ffs handle
[    0.083329630,3] FLASH: Can't open ffs handle
[    0.088614080,3] FLASH: Can't open ffs handle
[    0.093896205,3] FLASH: Can't open ffs handle
[    0.099174034,3] FLASH: Can't open ffs handle
[    0.104454746,3] FLASH: Can't open ffs handle
[    0.115977110,2] NVRAM: Failed to load
[    0.116068253,2] NVRAM: Failed to load
[    0.116173219,5] STB: secure boot not supported
[    0.116283085,5] STB: trusted boot not supported
[    0.116683004,4] FLASH: Can't load resource id:4. No system flash found
[    0.117069713,4] FLASH: Can't load resource id:3. No system flash found
[    0.117269585,2] NVRAM: Failed to load
[    0.117490209,7] LPC: Routing irq 4, policy: 0 (r=1)
[    0.117514888,7] LPC: SerIRQ 4 using route 1 targetted at OPAL
[    0.117933346,3] SLW: HOMER base not set 0
[    0.118081064,5] Unable to log error
[    0.118227237,2] NVRAM: Failed to load
[    0.118442741,3] OCC: No HOMER detected, assuming no pstates
[    0.118501222,5] Unable to log error
[    0.118555649,2] NVRAM: Failed to load
[    0.118604663,2] NVRAM: Failed to load
[    0.118720628,4] FLASH: Can't load resource id:2. No system flash found
[    0.118912385,4] FLASH: Can't load resource id:0. No system flash found
[    0.118983574,4] FLASH: Can't load resource id:1. No system flash found
[    0.119116640,3] IMC: IMC Catalog load failed
[    0.119364768,2] NVRAM: Failed to load
[    0.119406236,2] NVRAM: Failed to load
[    0.119444636,2] NVRAM: Failed to load
[    0.119482315,2] NVRAM: Failed to load
[    0.131102078,3] CAPP: Error loading ucode lid. index=200d1
[    0.150206406,2] NVRAM: Failed to load
[    0.160574726,5] PCI: Resetting PHBs and training links...
[    6.170966934,5] PCI: Probing slots...
[    6.175681253,5] PCI Summary:
[    6.176039126,5] PHB#0000:00:00.0 [ROOT] 1014 04c1 R:00 C:060400 B:01..01 
[    6.176342301,5] PHB#0000:01:00.0 [PCID] 10ec 8139 R:20 C:020000 (      ethernet) 
[    6.176532516,5] PHB#0001:00:00.0 [ROOT] 1014 04c1 R:00 C:060400 B:00..00 
[    6.176658081,5] PHB#0002:00:00.0 [ROOT] 1014 04c1 R:00 C:060400 B:00..00 
[    6.176779913,5] PHB#0003:00:00.0 [ROOT] 1014 04c1 R:00 C:060400 B:00..00 
[    6.176904577,5] PHB#0004:00:00.0 [ROOT] 1014 04c1 R:00 C:060400 B:00..00 
[    6.177035788,5] PHB#0005:00:00.0 [ROOT] 1014 04c1 R:00 C:060400 B:00..00 
[    6.177260815,4] FLASH: Failed to load VERSION data
[    6.312317126,5] INIT: Waiting for kernel...
[    6.312368972,5] INIT: platform wait for kernel load failed
[    6.312433234,5] INIT: Assuming kernel at 0x20000000
[    6.312515708,5] INIT: 64-bit LE kernel discovered
[    6.312673483,3] OCC: Unassigned OCC Common Area. No sensors found
[    6.312827720,2] NVRAM: Failed to load
[    6.370807692,5] INIT: Starting kernel at 0x20010000, fdt at 0x306c9818 24120 bytes

zImage starting: loaded at 0x0000000020010000 (sp: 0x00000000209e7ed8)
Allocating 0x2647d20 bytes for kernel...
Decompressing (0x0000000000000000 <- 0x0000000020021000:0x00000000209e6ea9)...
Done! Decompressed 0x1a7ebf4 bytes

Linux/PowerPC load: panic=-1 slub_debug=FZPUA root=/dev/mtdblock0 console=tty console=hvc0
Finalizing device tree... flat tree at 0x209e8ca0
[    0.000000][    T0] dt-cpu-ftrs: setup for ISA 3000
[    0.000000][    T0] dt-cpu-ftrs: final cpu/mmu features = 0x0003c06b8f5fb187 0x3c007041
[    0.000000][    T0] Activating Kernel Userspace Execution Prevention
[    0.000000][    T0] Activating Kernel Userspace Access Prevention
[    0.000000][    T0] radix-mmu: Mapped 0x0000000000000000-0x0000000001740000 with 64.0 KiB pages (exec)
[    0.000000][    T0] radix-mmu: Mapped 0x0000000001740000-0x0000000080000000 with 64.0 KiB pages
[    0.000000][    T0] radix-mmu: Initializing Radix MMU
[    0.000000][    T0] Linux version 5.14.0-rc6-next-20210817 (groeck@server.roeck-us.net) (powerpc64-linux-gcc.br_real (Buildroot 2019.02-git-00353-g4f20f23) 7.4.0, GNU ld (GNU Binutils) 2.31.1) #1 SMP Tue Aug 17 19:13:07 PDT 2021
[    0.000000][    T0] Using PowerNV machine description
[    0.000000][    T0] printk: bootconsole [udbg0] enabled
[    0.000000][    T0] CPU maps initialized for 1 thread per core
[    0.000000][    T0] -----------------------------------------------------
[    0.000000][    T0] phys_mem_size     = 0x80000000
[    0.000000][    T0] dcache_bsize      = 0x80
[    0.000000][    T0] icache_bsize      = 0x80
[    0.000000][    T0] cpu_features      = 0x0003c06b8f4fb187
[    0.000000][    T0]   possible        = 0x000ffbfbcf5fb187
[    0.000000][    T0]   always          = 0x0000000380008181
[    0.000000][    T0] cpu_user_features = 0xdc0065c2 0xaef00000
[    0.000000][    T0] mmu_features      = 0x3c007641
[    0.000000][    T0] firmware_features = 0x0000000010000000
[    0.000000][    T0] vmalloc start     = 0xc008000000000000
[    0.000000][    T0] IO start          = 0xc00a000000000000
[    0.000000][    T0] vmemmap start     = 0xc00c000000000000
[    0.000000][    T0] -----------------------------------------------------
[    0.000000][    T0] kvm_cma_reserve: reserving 102 MiB for global area
[    0.000000][    T0] cma: Reserved 112 MiB at 0x0000000073000000
[    0.000000][    T0] numa:   NODE_DATA [mem 0x7bf7ae00-0x7bf7ffff]
[    0.000000][    T0] rfi-flush: fallback displacement flush available
[    0.000000][    T0] count-cache-flush: flush disabled.
[    0.000000][    T0] link-stack-flush: flush disabled.
[    0.000000][    T0] stf-barrier: eieio barrier available
[    0.000000][    T0] barrier-nospec: using ORI speculation barrier
[    0.000000][    T0] Zone ranges:
[    0.000000][    T0]   Normal   [mem 0x0000000000000000-0x000000007fffffff]
[    0.000000][    T0] Movable zone start for each node
[    0.000000][    T0] Early memory node ranges
[    0.000000][    T0]   node   0: [mem 0x0000000000000000-0x000000007fffffff]
[    0.000000][    T0] Initmem setup node 0 [mem 0x0000000000000000-0x000000007fffffff]
[    0.000000][    T0] percpu: Embedded 11 pages/cpu s645968 r0 d74928 u720896
[    0.000000][    T0] Built 1 zonelists, mobility grouping on.  Total pages: 32736
[    0.000000][    T0] Policy zone: Normal
[    0.000000][    T0] Kernel command line: panic=-1 slub_debug=FZPUA root=/dev/mtdblock0 console=tty console=hvc0
[    0.000000][    T0] Dentry cache hash table entries: 262144 (order: 5, 2097152 bytes, linear)
[    0.000000][    T0] Inode-cache hash table entries: 131072 (order: 4, 1048576 bytes, linear)
[    0.000000][    T0] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.000000][    T0] Memory: 0K/2097152K available (17152K kernel code, 3328K rwdata, 4608K rodata, 1984K init, 12063K bss, 173440K reserved, 114688K cma-reserved)
[    0.000000][    T0] **********************************************************
[    0.000000][    T0] **   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **
[    0.000000][    T0] **                                                      **
[    0.000000][    T0] ** This system shows unhashed kernel memory addresses   **
[    0.000000][    T0] ** via the console, logs, and other interfaces. This    **
[    0.000000][    T0] ** might reduce the security of your system.            **
[    0.000000][    T0] **                                                      **
[    0.000000][    T0] ** If you see this message and you are not debugging    **
[    0.000000][    T0] ** the kernel, report this immediately to your system   **
[    0.000000][    T0] ** administrator!                                       **
[    0.000000][    T0] **                                                      **
[    0.000000][    T0] **   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **
[    0.000000][    T0] **********************************************************
[    0.000000][    T0] SLUB: HWalign=128, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[    0.000000][    T0] ftrace: allocating 36425 entries in 14 pages
[    0.000000][    T0] ftrace: allocated 14 pages with 3 groups
[    0.000000][    T0] trace event string verifier disabled
[    0.000000][    T0] Running RCU self tests
[    0.000000][    T0] rcu: Hierarchical RCU implementation.
[    0.000000][    T0] rcu: 	RCU event tracing is enabled.
[    0.000000][    T0] rcu: 	RCU lockdep checking is enabled.
[    0.000000][    T0] rcu: 	RCU restricting CPUs from NR_CPUS=2048 to nr_cpu_ids=1.
[    0.000000][    T0] rcu: 	RCU debug extended QS entry/exit.
[    0.000000][    T0] 	Rude variant of Tasks RCU enabled.
[    0.000000][    T0] 	Tracing variant of Tasks RCU enabled.
[    0.000000][    T0] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
[    0.000000][    T0] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
[    0.000000][    T0] NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
[    0.000000][    T0] xive: Interrupt handling initialized with native backend
[    0.000000][    T0] xive: Using priority 7 for all interrupts
[    0.000000][    T0] xive: Using 64kB queues
[    0.000000][    T0] random: get_random_u64 called from start_kernel+0x680/0x8e8 with crng_init=0
[    0.000090][    T0] time_init: 56 bit decrementer (max: 7fffffffffffff)
[    0.001865][    T0] clocksource: timebase: mask: 0xffffffffffffffff max_cycles: 0x761537d007, max_idle_ns: 440795202126 ns
[    0.004659][    T0] clocksource: timebase mult[1f40000] shift[24] registered
[    0.012714][    T0] Console: colour dummy device 80x25
[    0.026090][    T0] printk: console [tty0] enabled
[    0.026571][    T0] printk: console [hvc0] enabled
[    0.026571][    T0] printk: console [hvc0] enabled
[    0.027221][    T0] printk: bootconsole [udbg0] disabled
[    0.027221][    T0] printk: bootconsole [udbg0] disabled
[    0.028092][    T0] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
[    0.028424][    T0] ... MAX_LOCKDEP_SUBCLASSES:  8
[    0.028613][    T0] ... MAX_LOCK_DEPTH:          48
[    0.028800][    T0] ... MAX_LOCKDEP_KEYS:        8192
[    0.028989][    T0] ... CLASSHASH_SIZE:          4096
[    0.029185][    T0] ... MAX_LOCKDEP_ENTRIES:     32768
[    0.029381][    T0] ... MAX_LOCKDEP_CHAINS:      65536
[    0.029574][    T0] ... CHAINHASH_SIZE:          32768
[    0.029768][    T0]  memory used by lock dependency info: 6365 kB
[    0.029997][    T0]  memory used for stack traces: 4224 kB
[    0.030204][    T0]  per task-struct memory footprint: 1920 bytes
[    0.030467][    T0] ------------------------
[    0.030645][    T0] | Locking API testsuite:
[    0.030819][    T0] ----------------------------------------------------------------------------
[    0.035275][    T0]                                  | spin |wlock |rlock |mutex | wsem | rsem |
[    0.035616][    T0]   --------------------------------------------------------------------------
[    0.036183][    T0]                      A-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[    0.049825][    T0]                  A-B-B-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[    0.064240][    T0]              A-B-B-C-C-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[    0.086497][    T0]              A-B-C-A-B-C deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[    0.103109][    T0]          A-B-B-C-C-D-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[    0.122436][    T0]          A-B-C-D-B-D-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[    0.141864][    T0]          A-B-C-D-B-C-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[    0.161418][    T0]                     double unlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[    0.173155][    T0]                   initialize held:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[    0.183712][    T0]   --------------------------------------------------------------------------
[    0.184041][    T0]               recursive read-lock:             |  ok  |             |  ok  |
[    0.187988][    T0]            recursive read-lock #2:             |  ok  |             |  ok  |
[    0.191588][    T0]             mixed read-write-lock:             |  ok  |             |  ok  |
[    0.195301][    T0]             mixed write-read-lock:             |  ok  |             |  ok  |
[    0.198850][    T0]   mixed read-lock/lock-write ABBA:             |  ok  |             |  ok  |
[    0.202797][    T0]    mixed read-lock/lock-read ABBA:             |  ok  |             |  ok  |
[    0.207250][    T0]  mixed write-lock/lock-write ABBA:             |  ok  |             |  ok  |
[    0.211513][    T0]   chain cached mixed R-L/L-W ABBA:             |  ok  |
[    0.213847][    T0]          rlock W1R2/W2R3/W3R1/123:             |  ok  |
[    0.216917][    T0]          rlock W1R2/W2R3/W3R1/132:             |  ok  |
[    0.219589][    T0]          rlock W1R2/W2R3/W3R1/213:             |  ok  |
[    0.222209][    T0]          rlock W1R2/W2R3/W3R1/231:             |  ok  |
[    0.224828][    T0]          rlock W1R2/W2R3/W3R1/312:             |  ok  |
[    0.227578][    T0]          rlock W1R2/W2R3/W3R1/321:             |  ok  |
[    0.230260][    T0]          rlock W1W2/R2R3/W3R1/123:             |  ok  |
[    0.232933][    T0]          rlock W1W2/R2R3/W3R1/132:             |  ok  |
[    0.235588][    T0]          rlock W1W2/R2R3/W3R1/213:             |  ok  |
[    0.238346][    T0]          rlock W1W2/R2R3/W3R1/231:             |  ok  |
[    0.241010][    T0]          rlock W1W2/R2R3/W3R1/312:             |  ok  |
[    0.243660][    T0]          rlock W1W2/R2R3/W3R1/321:             |  ok  |
[    0.246311][    T0]          rlock W1W2/R2R3/R3W1/123:             |  ok  |
[    0.249064][    T0]          rlock W1W2/R2R3/R3W1/132:             |  ok  |
[    0.251699][    T0]          rlock W1W2/R2R3/R3W1/213:             |  ok  |
[    0.254318][    T0]          rlock W1W2/R2R3/R3W1/231:             |  ok  |
[    0.256931][    T0]          rlock W1W2/R2R3/R3W1/312:             |  ok  |
[    0.259677][    T0]          rlock W1W2/R2R3/R3W1/321:             |  ok  |
[    0.262346][    T0]          rlock W1R2/R2R3/W3W1/123:             |  ok  |
[    0.264978][    T0]          rlock W1R2/R2R3/W3W1/132:             |  ok  |
[    0.267613][    T0]          rlock W1R2/R2R3/W3W1/213:             |  ok  |
[    0.270324][    T0]          rlock W1R2/R2R3/W3W1/231:             |  ok  |
[    0.272953][    T0]          rlock W1R2/R2R3/W3W1/312:             |  ok  |
[    0.275577][    T0]          rlock W1R2/R2R3/W3W1/321:             |  ok  |
[    0.278228][    T0]   --------------------------------------------------------------------------
[    0.278553][    T0]      hard-irqs-on + irq-safe-A/12:  ok  |  ok  |  ok  |
[    0.283971][    T0]      soft-irqs-on + irq-safe-A/12:  ok  |  ok  |  ok  |
[    0.289507][    T0]      hard-irqs-on + irq-safe-A/21:  ok  |  ok  |  ok  |
[    0.294903][    T0]      soft-irqs-on + irq-safe-A/21:  ok  |  ok  |  ok  |
[    0.300069][    T0]        sirq-safe-A => hirqs-on/12:  ok  |  ok  |  ok  |
[    0.305415][    T0]        sirq-safe-A => hirqs-on/21:  ok  |  ok  |  ok  |
[    0.310652][    T0]          hard-safe-A + irqs-on/12:  ok  |  ok  |  ok  |
[    0.316082][    T0]          soft-safe-A + irqs-on/12:  ok  |  ok  |  ok  |
[    0.321303][    T0]          hard-safe-A + irqs-on/21:  ok  |  ok  |  ok  |
[    0.326734][    T0]          soft-safe-A + irqs-on/21:  ok  |  ok  |  ok  |
[    0.331984][    T0]     hard-safe-A + unsafe-B #1/123:  ok  |  ok  |  ok  |
[    0.338422][    T0]     soft-safe-A + unsafe-B #1/123:  ok  |  ok  |  ok  |
[    0.345541][    T0]     hard-safe-A + unsafe-B #1/132:  ok  |  ok  |  ok  |
[    0.356619][    T0]     soft-safe-A + unsafe-B #1/132:  ok  |  ok  |  ok  |
[    0.362796][    T0]     hard-safe-A + unsafe-B #1/213:  ok  |  ok  |  ok  |
[    0.369123][    T0]     soft-safe-A + unsafe-B #1/213:  ok  |  ok  |  ok  |
[    0.375299][    T0]     hard-safe-A + unsafe-B #1/231:  ok  |  ok  |  ok  |
[    0.381629][    T0]     soft-safe-A + unsafe-B #1/231:  ok  |  ok  |  ok  |
[    0.387777][    T0]     hard-safe-A + unsafe-B #1/312:  ok  |  ok  |  ok  |
[    0.393484][    T0]     soft-safe-A + unsafe-B #1/312:  ok  |  ok  |  ok  |
[    0.399117][    T0]     hard-safe-A + unsafe-B #1/321:  ok  |  ok  |  ok  |
[    0.405361][    T0]     soft-safe-A + unsafe-B #1/321:  ok  |  ok  |  ok  |
[    0.411995][    T0]     hard-safe-A + unsafe-B #2/123:  ok  |  ok  |  ok  |
[    0.418450][    T0]     soft-safe-A + unsafe-B #2/123:  ok  |  ok  |  ok  |
[    0.424913][    T0]     hard-safe-A + unsafe-B #2/132:  ok  |  ok  |  ok  |
[    0.431446][    T0]     soft-safe-A + unsafe-B #2/132:  ok  |  ok  |  ok  |
[    0.437784][    T0]     hard-safe-A + unsafe-B #2/213:  ok  |  ok  |  ok  |
[    0.447485][    T0]     soft-safe-A + unsafe-B #2/213:  ok  |  ok  |  ok  |
[    0.455678][    T0]     hard-safe-A + unsafe-B #2/231:  ok  |  ok  |  ok  |
[    0.462876][    T0]     soft-safe-A + unsafe-B #2/231:  ok  |  ok  |  ok  |
[    0.469234][    T0]     hard-safe-A + unsafe-B #2/312:  ok  |  ok  |  ok  |
[    0.475729][    T0]     soft-safe-A + unsafe-B #2/312:  ok  |  ok  |  ok  |
[    0.481936][    T0]     hard-safe-A + unsafe-B #2/321:  ok  |  ok  |  ok  |
[    0.488372][    T0]     soft-safe-A + unsafe-B #2/321:  ok  |  ok  |  ok  |
[    0.494631][    T0]       hard-irq lock-inversion/123:  ok  |  ok  |  ok  |
[    0.501236][    T0]       soft-irq lock-inversion/123:  ok  |  ok  |  ok  |
[    0.507923][    T0]       hard-irq lock-inversion/132:  ok  |  ok  |  ok  |
[    0.514774][    T0]       soft-irq lock-inversion/132:  ok  |  ok  |  ok  |
[    0.521709][    T0]       hard-irq lock-inversion/213:  ok  |  ok  |  ok  |
[    0.528010][    T0]       soft-irq lock-inversion/213:  ok  |  ok  |  ok  |
[    0.535301][    T0]       hard-irq lock-inversion/231:  ok  |  ok  |  ok  |
[    0.541693][    T0]       soft-irq lock-inversion/231:  ok  |  ok  |  ok  |
[    0.548029][    T0]       hard-irq lock-inversion/312:  ok  |  ok  |  ok  |
[    0.554493][    T0]       soft-irq lock-inversion/312:  ok  |  ok  |  ok  |
[    0.561569][    T0]       hard-irq lock-inversion/321:  ok  |  ok  |  ok  |
[    0.568094][    T0]       soft-irq lock-inversion/321:  ok  |  ok  |  ok  |
[    0.574559][    T0]       hard-irq read-recursion/123:      |  ok  |  ok  |
[    0.578977][    T0]       soft-irq read-recursion/123:      |  ok  |  ok  |
[    0.583251][    T0]       hard-irq read-recursion/132:      |  ok  |  ok  |
[    0.587936][    T0]       soft-irq read-recursion/132:      |  ok  |  ok  |
[    0.592573][    T0]       hard-irq read-recursion/213:      |  ok  |  ok  |
[    0.596938][    T0]       soft-irq read-recursion/213:      |  ok  |  ok  |
[    0.600913][    T0]       hard-irq read-recursion/231:      |  ok  |  ok  |
[    0.604965][    T0]       soft-irq read-recursion/231:      |  ok  |  ok  |
[    0.609102][    T0]       hard-irq read-recursion/312:      |  ok  |  ok  |
[    0.613560][    T0]       soft-irq read-recursion/312:      |  ok  |  ok  |
[    0.618017][    T0]       hard-irq read-recursion/321:      |  ok  |  ok  |
[    0.622508][    T0]       soft-irq read-recursion/321:      |  ok  |  ok  |
[    0.626868][    T0]    hard-irq read-recursion #2/123:      |  ok  |  ok  |
[    0.631391][    T0]    soft-irq read-recursion #2/123:      |  ok  |  ok  |
[    0.635753][    T0]    hard-irq read-recursion #2/132:      |  ok  |  ok  |
[    0.640608][    T0]    soft-irq read-recursion #2/132:      |  ok  |  ok  |
[    0.644958][    T0]    hard-irq read-recursion #2/213:      |  ok  |  ok  |
[    0.649470][    T0]    soft-irq read-recursion #2/213:      |  ok  |  ok  |
[    0.653954][    T0]    hard-irq read-recursion #2/231:      |  ok  |  ok  |
[    0.658352][    T0]    soft-irq read-recursion #2/231:      |  ok  |  ok  |
[    0.663008][    T0]    hard-irq read-recursion #2/312:      |  ok  |  ok  |
[    0.667407][    T0]    soft-irq read-recursion #2/312:      |  ok  |  ok  |
[    0.671847][    T0]    hard-irq read-recursion #2/321:      |  ok  |  ok  |
[    0.676744][    T0]    soft-irq read-recursion #2/321:      |  ok  |  ok  |
[    0.681763][    T0]    hard-irq read-recursion #3/123:      |  ok  |  ok  |
[    0.686669][    T0]    soft-irq read-recursion #3/123:      |  ok  |  ok  |
[    0.691021][    T0]    hard-irq read-recursion #3/132:      |  ok  |  ok  |
[    0.695718][    T0]    soft-irq read-recursion #3/132:      |  ok  |  ok  |
[    0.700074][    T0]    hard-irq read-recursion #3/213:      |  ok  |  ok  |
[    0.704752][    T0]    soft-irq read-recursion #3/213:      |  ok  |  ok  |
[    0.713237][    T0]    hard-irq read-recursion #3/231:      |  ok  |  ok  |
[    0.718254][    T0]    soft-irq read-recursion #3/231:      |  ok  |  ok  |
[    0.722683][    T0]    hard-irq read-recursion #3/312:      |  ok  |  ok  |
[    0.727007][    T0]    soft-irq read-recursion #3/312:      |  ok  |  ok  |
[    0.731373][    T0]    hard-irq read-recursion #3/321:      |  ok  |  ok  |
[    0.735706][    T0]    soft-irq read-recursion #3/321:      |  ok  |  ok  |
[    0.739976][    T0]   --------------------------------------------------------------------------
[    0.740294][    T0]   | Wound/wait tests |
[    0.740456][    T0]   ---------------------
[    0.740616][    T0]                   ww api failures:  ok  |  ok  |  ok  |
[    0.747943][    T0]                ww contexts mixing:  ok  |  ok  |
[    0.751839][    T0]              finishing ww context:  ok  |  ok  |  ok  |  ok  |
[    0.759807][    T0]                locking mismatches:  ok  |  ok  |  ok  |
[    0.765794][    T0]                  EDEADLK handling:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[    0.786926][    T0]            spinlock nest unlocked:  ok  |
[    0.788813][    T0]                spinlock nest test:  ok  |
[    0.791472][    T0]   -----------------------------------------------------
[    0.791735][    T0]                                  |block | try  |context|
[    0.792000][    T0]   -----------------------------------------------------
[    0.792252][    T0]                           context:  ok  |  ok  |  ok  |
[    0.798615][    T0]                               try:  ok  |  ok  |  ok  |
[    0.804046][    T0]                             block:  ok  |  ok  |  ok  |
[    0.810770][    T0]                          spinlock:  ok  |  ok  |  ok  |
[    0.821347][    T0]   --------------------------------------------------------------------------
[    0.821673][    T0]   | queued read lock tests |
[    0.821856][    T0]   ---------------------------
[    0.822041][    T0]       hardirq read-lock/lock-read:  ok  |
[    0.824356][    T0]       hardirq lock-read/read-lock:  ok  |
[    0.826567][    T0]                 hardirq inversion:  ok  |
[    0.828774][    T0]   --------------------
[    0.828939][    T0]   | fs_reclaim tests |
[    0.829097][    T0]   --------------------
[    0.829252][    T0]                   correct nesting:  ok  |
[    0.830836][    T0]                     wrong nesting:  ok  |
[    0.832385][    T0]                 protected nesting:  ok  |
[    0.833899][    T0]   --------------------------------------------------------------------------
[    0.834216][    T0]   | local_lock tests |
[    0.834373][    T0]   ---------------------
[    0.834531][    T0]           local_lock inversion  2:  ok  |
[    0.836820][    T0]           local_lock inversion 3A:  ok  |
[    0.839520][    T0]           local_lock inversion 3B:  ok  |
[    0.842690][    T0]       hardirq_unsafe_softirq_safe:  ok  |
[    0.846193][    T0] -------------------------------------------------------
[    0.846452][    T0] Good, all 358 testcases passed! |
[    0.846645][    T0] ---------------------------------
[    0.848247][    T0] pid_max: default: 32768 minimum: 301
[    0.850462][    T0] Mount-cache hash table entries: 8192 (order: 0, 65536 bytes, linear)
[    0.850823][    T0] Mountpoint-cache hash table entries: 8192 (order: 0, 65536 bytes, linear)
[    0.902548][    T1] Running RCU-tasks wait API self tests
[    0.909691][    T1] POWER9 performance monitor hardware support registered
[    0.911844][    T1] rcu: Hierarchical SRCU implementation.
[    0.929941][   T10] Callback from call_rcu_tasks_trace() invoked.
[    0.932539][    T1] smp: Bringing up secondary CPUs ...
[    0.932888][    T1] smp: Brought up 1 node, 1 CPU
[    0.933249][    T1] numa: Node 0 CPUs: 0
[    0.943941][   T16] node 0 deferred pages initialised in 0ms
[    0.948649][   T16] pgdatinit0 (16) used greatest stack depth: 12528 bytes left
[    0.960416][    T1] devtmpfs: initialized
[    1.045777][    T9] Callback from call_rcu_tasks_rude() invoked.
[    1.051318][    T1] Initializing IODA2 PHB (/pciex@600c3c0000000)
[    1.052702][    T1] PCI host bridge /pciex@600c3c0000000 (primary) ranges:
[    1.053465][    T1]  MEM 0x000600c000000000..0x000600c07ffeffff -> 0x0000000080000000 
[    1.054400][    T1]  MEM 0x0006000000000000..0x0006003fffffffff -> 0x0006000000000000 (M64 #1..31)
[    1.054813][    T1]  Using M64 #31 as default window
[    1.056081][    T1]   512 (511) PE's M32: 0x80000000 [segment=0x400000]
[    1.056408][    T1]                  M64: 0x4000000000 [segment=0x20000000]
[    1.056858][    T1]   Allocated bitmap for 4088 MSIs (base IRQ 0xfe000)
[    1.061975][    T1] Initializing IODA2 PHB (/pciex@600c3c0100000)
[    1.062583][    T1] PCI host bridge /pciex@600c3c0100000  ranges:
[    1.062929][    T1]  MEM 0x000600c080000000..0x000600c0fffeffff -> 0x0000000080000000 
[    1.063511][    T1]  MEM 0x0006004000000000..0x0006007fffffffff -> 0x0006004000000000 (M64 #1..15)
[    1.063907][    T1]  Using M64 #15 as default window
[    1.064324][    T1]   256 (255) PE's M32: 0x80000000 [segment=0x800000]
[    1.064617][    T1]                  M64: 0x4000000000 [segment=0x40000000]
[    1.064944][    T1]   Allocated bitmap for 2040 MSIs (base IRQ 0xfd800)
[    1.066901][    T1] Initializing IODA2 PHB (/pciex@600c3c0200000)
[    1.067465][    T1] PCI host bridge /pciex@600c3c0200000  ranges:
[    1.067794][    T1]  MEM 0x000600c100000000..0x000600c17ffeffff -> 0x0000000080000000 
[    1.068359][    T1]  MEM 0x0006008000000000..0x000600bfffffffff -> 0x0006008000000000 (M64 #1..15)
[    1.068745][    T1]  Using M64 #15 as default window
[    1.069417][    T1]   256 (255) PE's M32: 0x80000000 [segment=0x800000]
[    1.069708][    T1]                  M64: 0x4000000000 [segment=0x40000000]
[    1.070032][    T1]   Allocated bitmap for 2040 MSIs (base IRQ 0xfd000)
[    1.072066][    T1] Initializing IODA2 PHB (/pciex@600c3c0300000)
[    1.072637][    T1] PCI host bridge /pciex@600c3c0300000  ranges:
[    1.072969][    T1]  MEM 0x000600c180000000..0x000600c1fffeffff -> 0x0000000080000000 
[    1.073537][    T1]  MEM 0x0006020000000000..0x0006023fffffffff -> 0x0006020000000000 (M64 #1..31)
[    1.073922][    T1]  Using M64 #31 as default window
[    1.074531][    T1]   512 (511) PE's M32: 0x80000000 [segment=0x400000]
[    1.074818][    T1]                  M64: 0x4000000000 [segment=0x20000000]
[    1.075132][    T1]   Allocated bitmap for 4088 MSIs (base IRQ 0xfc000)
[    1.077220][    T1] Initializing IODA2 PHB (/pciex@600c3c0400000)
[    1.077782][    T1] PCI host bridge /pciex@600c3c0400000  ranges:
[    1.078120][    T1]  MEM 0x000600c200000000..0x000600c27ffeffff -> 0x0000000080000000 
[    1.078682][    T1]  MEM 0x0006024000000000..0x0006027fffffffff -> 0x0006024000000000 (M64 #1..15)
[    1.079077][    T1]  Using M64 #15 as default window
[    1.079742][    T1]   256 (255) PE's M32: 0x80000000 [segment=0x800000]
[    1.080035][    T1]                  M64: 0x4000000000 [segment=0x40000000]
[    1.080352][    T1]   Allocated bitmap for 2040 MSIs (base IRQ 0xfb800)
[    1.082665][    T1] Initializing IODA2 PHB (/pciex@600c3c0500000)
[    1.083238][    T1] PCI host bridge /pciex@600c3c0500000  ranges:
[    1.083563][    T1]  MEM 0x000600c280000000..0x000600c2fffeffff -> 0x0000000080000000 
[    1.084123][    T1]  MEM 0x0006028000000000..0x000602bfffffffff -> 0x0006028000000000 (M64 #1..15)
[    1.084504][    T1]  Using M64 #15 as default window
[    1.084916][    T1]   256 (255) PE's M32: 0x80000000 [segment=0x800000]
[    1.085203][    T1]                  M64: 0x4000000000 [segment=0x40000000]
[    1.085519][    T1]   Allocated bitmap for 2040 MSIs (base IRQ 0xfb000)
[    1.088892][    T1] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
[    1.089557][    T1] futex hash table entries: 256 (order: -1, 32768 bytes, linear)
[    1.101390][    T1] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[    1.109073][    T1] audit: initializing netlink subsys (disabled)
[    1.118056][    T1] cpuidle: using governor menu
[    1.119320][    T1] Failed to initialize. Disabling HugeTLB
[    1.119642][    T1] nvram: No room to create lnx,oops-log partition, deleting any obsolete OS partitions...
[    1.120368][    T1] nvram: Failed to find or create lnx,oops-log partition, err -28
[    1.120723][    T1] nvram: Failed to initialize oops partition!
[    1.134049][   T20] audit: type=2000 audit(1629254531.000:1): state=initialized audit_enabled=0 res=1
[    1.137609][    T1] EEH: PowerNV platform initialized
[    1.145237][    T1] PCI: Probing PCI hardware
[    1.149106][    T1] PCI host bridge to bus 0000:00
[    1.149625][    T1] pci_bus 0000:00: root bus resource [mem 0x600c000000000-0x600c07ffeffff] (bus address [0x80000000-0xfffeffff])
[    1.150397][    T1] pci_bus 0000:00: root bus resource [mem 0x6000000000000-0x6003fbfffffff 64bit pref]
[    1.151013][    T1] pci_bus 0000:00: root bus resource [bus 00-ff]
[    1.151524][    T1] pci_bus 0000:00: busn_res: [bus 00-ff] end is updated to ff
[    1.154823][    T1] pci 0000:00:00.0: [1014:04c1] type 01 class 0x060400
[    1.167476][    T1] pci 0000:01:00.0: [10ec:8139] type 00 class 0x020000
[    1.168101][    T1] pci 0000:01:00.0: reg 0x10: [io  0x0000-0x00ff]
[    1.168442][    T1] pci 0000:01:00.0: reg 0x14: [mem 0x00000000-0x000000ff]
[    1.168858][    T1] pci 0000:01:00.0: reg 0x30: [mem 0x00000000-0x0003ffff pref]
[    1.169311][    T1] pci 0000:01:00.0: BAR1 [mem size 0x00000100]: requesting alignment to 0x10000
[    1.172922][    T1] pci 0000:00:00.0: PCI bridge to [bus 01]
[    1.174447][    T1] pci_bus 0000:00: busn_res: [bus 00-ff] end is updated to 01
[    1.176415][    T1] PCI host bridge to bus 0001:00
[    1.176674][    T1] pci_bus 0001:00: root bus resource [mem 0x600c080000000-0x600c0fffeffff] (bus address [0x80000000-0xfffeffff])
[    1.177170][    T1] pci_bus 0001:00: root bus resource [mem 0x6004000000000-0x6007f7fffffff 64bit pref]
[    1.177597][    T1] pci_bus 0001:00: root bus resource [bus 00-ff]
[    1.177871][    T1] pci_bus 0001:00: busn_res: [bus 00-ff] end is updated to ff
[    1.178330][    T1] pci 0001:00:00.0: [1014:04c1] type 01 class 0x060400
[    1.184174][    T1] pci 0001:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[    1.187991][    T1] pci 0001:00:00.0: PCI bridge to [bus 01-ff]
[    1.188371][    T1] pci_bus 0001:01: busn_res: [bus 01-ff] end is updated to 01
[    1.189153][    T1] pci_bus 0001:00: busn_res: [bus 00-ff] end is updated to 01
[    1.190752][    T1] PCI host bridge to bus 0002:00
[    1.191002][    T1] pci_bus 0002:00: root bus resource [mem 0x600c100000000-0x600c17ffeffff] (bus address [0x80000000-0xfffeffff])
[    1.191493][    T1] pci_bus 0002:00: root bus resource [mem 0x6008000000000-0x600bf7fffffff 64bit pref]
[    1.191911][    T1] pci_bus 0002:00: root bus resource [bus 00-ff]
[    1.192184][    T1] pci_bus 0002:00: busn_res: [bus 00-ff] end is updated to ff
[    1.192636][    T1] pci 0002:00:00.0: [1014:04c1] type 01 class 0x060400
[    1.199010][    T1] pci 0002:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[    1.202319][    T1] pci 0002:00:00.0: PCI bridge to [bus 01-ff]
[    1.202673][    T1] pci_bus 0002:01: busn_res: [bus 01-ff] end is updated to 01
[    1.203269][    T1] pci_bus 0002:00: busn_res: [bus 00-ff] end is updated to 01
[    1.204899][    T1] PCI host bridge to bus 0003:00
[    1.205148][    T1] pci_bus 0003:00: root bus resource [mem 0x600c180000000-0x600c1fffeffff] (bus address [0x80000000-0xfffeffff])
[    1.205642][    T1] pci_bus 0003:00: root bus resource [mem 0x6020000000000-0x6023fbfffffff 64bit pref]
[    1.206748][    T1] pci_bus 0003:00: root bus resource [bus 00-ff]
[    1.207045][    T1] pci_bus 0003:00: busn_res: [bus 00-ff] end is updated to ff
[    1.207505][    T1] pci 0003:00:00.0: [1014:04c1] type 01 class 0x060400
[    1.213117][    T1] pci 0003:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[    1.216451][    T1] pci 0003:00:00.0: PCI bridge to [bus 01-ff]
[    1.217503][    T1] pci_bus 0003:01: busn_res: [bus 01-ff] end is updated to 01
[    1.218181][    T1] pci_bus 0003:00: busn_res: [bus 00-ff] end is updated to 01
[    1.219717][    T1] PCI host bridge to bus 0004:00
[    1.219968][    T1] pci_bus 0004:00: root bus resource [mem 0x600c200000000-0x600c27ffeffff] (bus address [0x80000000-0xfffeffff])
[    1.220454][    T1] pci_bus 0004:00: root bus resource [mem 0x6024000000000-0x6027f7fffffff 64bit pref]
[    1.220863][    T1] pci_bus 0004:00: root bus resource [bus 00-ff]
[    1.221132][    T1] pci_bus 0004:00: busn_res: [bus 00-ff] end is updated to ff
[    1.221585][    T1] pci 0004:00:00.0: [1014:04c1] type 01 class 0x060400
[    1.227067][    T1] pci 0004:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[    1.231059][    T1] pci 0004:00:00.0: PCI bridge to [bus 01-ff]
[    1.231411][    T1] pci_bus 0004:01: busn_res: [bus 01-ff] end is updated to 01
[    1.231984][    T1] pci_bus 0004:00: busn_res: [bus 00-ff] end is updated to 01
[    1.233643][    T1] PCI host bridge to bus 0005:00
[    1.233894][    T1] pci_bus 0005:00: root bus resource [mem 0x600c280000000-0x600c2fffeffff] (bus address [0x80000000-0xfffeffff])
[    1.234393][    T1] pci_bus 0005:00: root bus resource [mem 0x6028000000000-0x602bf7fffffff 64bit pref]
[    1.234799][    T1] pci_bus 0005:00: root bus resource [bus 00-ff]
[    1.235079][    T1] pci_bus 0005:00: busn_res: [bus 00-ff] end is updated to ff
[    1.235538][    T1] pci 0005:00:00.0: [1014:04c1] type 01 class 0x060400
[    1.241752][    T1] pci 0005:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[    1.245089][    T1] pci 0005:00:00.0: PCI bridge to [bus 01-ff]
[    1.245444][    T1] pci_bus 0005:01: busn_res: [bus 01-ff] end is updated to 01
[    1.246021][    T1] pci_bus 0005:00: busn_res: [bus 00-ff] end is updated to 01
[    1.247644][    T1] pci 0000:00:00.0: bridge window [mem 0x20000000-0x1fffffff 64bit pref] to [bus 01] add_size 20000000 add_align 20000000
[    1.249777][    T1] pci 0000:00:00.0: BAR 9: assigned [mem 0x6000000000000-0x600001fffffff 64bit pref]
[    1.250256][    T1] pci 0000:00:00.0: BAR 8: assigned [mem 0x600c000000000-0x600c0003fffff]
[    1.250664][    T1] pci 0000:00:00.0: BAR 7: no space for [io  size 0x1000]
[    1.250986][    T1] pci 0000:00:00.0: BAR 7: failed to assign [io  size 0x1000]
[    1.251724][    T1] pci 0000:00:00.0: BAR 7: no space for [io  size 0x1000]
[    1.252029][    T1] pci 0000:00:00.0: BAR 7: failed to assign [io  size 0x1000]
[    1.252488][    T1] pci 0000:01:00.0: BAR 6: assigned [mem 0x600c000000000-0x600c00003ffff pref]
[    1.252966][    T1] pci 0000:01:00.0: BAR 1: assigned [mem 0x600c000040000-0x600c0000400ff]
[    1.253406][    T1] pci 0000:01:00.0: BAR 0: no space for [io  size 0x0100]
[    1.253698][    T1] pci 0000:01:00.0: BAR 0: failed to assign [io  size 0x0100]
[    1.254222][    T1] pci 0000:00:00.0: PCI bridge to [bus 01]
[    1.254868][    T1] pci 0000:00:00.0:   bridge window [mem 0x600c000000000-0x600c07fefffff]
[    1.255351][    T1] pci 0000:00:00.0:   bridge window [mem 0x6000000000000-0x6003fbff0ffff 64bit pref]
[    1.256063][    T1] pci_bus 0000:00: Some PCI device resources are unassigned, try booting with pci=realloc
[    1.256580][    T1] pci_bus 0000:00: resource 4 [mem 0x600c000000000-0x600c07ffeffff]
[    1.256922][    T1] pci_bus 0000:00: resource 5 [mem 0x6000000000000-0x6003fbfffffff 64bit pref]
[    1.257298][    T1] pci_bus 0000:01: resource 1 [mem 0x600c000000000-0x600c07fefffff]
[    1.257623][    T1] pci_bus 0000:01: resource 2 [mem 0x6000000000000-0x6003fbff0ffff 64bit pref]
[    1.258108][    T1] pci 0001:00:00.0: bridge window [io  0x1000-0x0fff] to [bus 01] add_size 1000
[    1.258537][    T1] pci 0001:00:00.0: bridge window [mem 0x40000000-0x3fffffff 64bit pref] to [bus 01] add_size 40000000 add_align 40000000
[    1.259065][    T1] pci 0001:00:00.0: bridge window [mem 0x00800000-0x007fffff] to [bus 01] add_size 800000 add_align 800000
[    1.259617][    T1] pci 0001:00:00.0: BAR 9: assigned [mem 0x6004000000000-0x600403fffffff 64bit pref]
[    1.260009][    T1] pci 0001:00:00.0: BAR 8: assigned [mem 0x600c080000000-0x600c0807fffff]
[    1.260501][    T1] pci 0001:00:00.0: BAR 7: no space for [io  size 0x1000]
[    1.260808][    T1] pci 0001:00:00.0: BAR 7: failed to assign [io  size 0x1000]
[    1.261239][    T1] pci 0001:00:00.0: BAR 7: no space for [io  size 0x1000]
[    1.261539][    T1] pci 0001:00:00.0: BAR 7: failed to assign [io  size 0x1000]
[    1.261933][    T1] pci 0001:00:00.0: PCI bridge to [bus 01]
[    1.262441][    T1] pci 0001:00:00.0:   bridge window [mem 0x600c080000000-0x600c0ffefffff]
[    1.262836][    T1] pci 0001:00:00.0:   bridge window [mem 0x6004000000000-0x6007f7ff0ffff 64bit pref]
[    1.263456][    T1] pci_bus 0001:00: resource 4 [mem 0x600c080000000-0x600c0fffeffff]
[    1.263827][    T1] pci_bus 0001:00: resource 5 [mem 0x6004000000000-0x6007f7fffffff 64bit pref]
[    1.264186][    T1] pci_bus 0001:01: resource 1 [mem 0x600c080000000-0x600c0ffefffff]
[    1.264507][    T1] pci_bus 0001:01: resource 2 [mem 0x6004000000000-0x6007f7ff0ffff 64bit pref]
[    1.264910][    T1] pci 0002:00:00.0: bridge window [io  0x1000-0x0fff] to [bus 01] add_size 1000
[    1.265291][    T1] pci 0002:00:00.0: bridge window [mem 0x40000000-0x3fffffff 64bit pref] to [bus 01] add_size 40000000 add_align 40000000
[    1.265802][    T1] pci 0002:00:00.0: bridge window [mem 0x00800000-0x007fffff] to [bus 01] add_size 800000 add_align 800000
[    1.266339][    T1] pci 0002:00:00.0: BAR 9: assigned [mem 0x6008000000000-0x600803fffffff 64bit pref]
[    1.266722][    T1] pci 0002:00:00.0: BAR 8: assigned [mem 0x600c100000000-0x600c1007fffff]
[    1.267067][    T1] pci 0002:00:00.0: BAR 7: no space for [io  size 0x1000]
[    1.267352][    T1] pci 0002:00:00.0: BAR 7: failed to assign [io  size 0x1000]
[    1.267712][    T1] pci 0002:00:00.0: BAR 7: no space for [io  size 0x1000]
[    1.268002][    T1] pci 0002:00:00.0: BAR 7: failed to assign [io  size 0x1000]
[    1.268316][    T1] pci 0002:00:00.0: PCI bridge to [bus 01]
[    1.268818][    T1] pci 0002:00:00.0:   bridge window [mem 0x600c100000000-0x600c17fefffff]
[    1.269207][    T1] pci 0002:00:00.0:   bridge window [mem 0x6008000000000-0x600bf7ff0ffff 64bit pref]
[    1.269834][    T1] pci_bus 0002:00: resource 4 [mem 0x600c100000000-0x600c17ffeffff]
[    1.270205][    T1] pci_bus 0002:00: resource 5 [mem 0x6008000000000-0x600bf7fffffff 64bit pref]
[    1.270564][    T1] pci_bus 0002:01: resource 1 [mem 0x600c100000000-0x600c17fefffff]
[    1.271232][    T1] pci_bus 0002:01: resource 2 [mem 0x6008000000000-0x600bf7ff0ffff 64bit pref]
[    1.271649][    T1] pci 0003:00:00.0: bridge window [io  0x1000-0x0fff] to [bus 01] add_size 1000
[    1.272043][    T1] pci 0003:00:00.0: bridge window [mem 0x20000000-0x1fffffff 64bit pref] to [bus 01] add_size 20000000 add_align 20000000
[    1.272563][    T1] pci 0003:00:00.0: bridge window [mem 0x00400000-0x003fffff] to [bus 01] add_size 400000 add_align 400000
[    1.273100][    T1] pci 0003:00:00.0: BAR 9: assigned [mem 0x6020000000000-0x602001fffffff 64bit pref]
[    1.273487][    T1] pci 0003:00:00.0: BAR 8: assigned [mem 0x600c180000000-0x600c1803fffff]
[    1.273832][    T1] pci 0003:00:00.0: BAR 7: no space for [io  size 0x1000]
[    1.274119][    T1] pci 0003:00:00.0: BAR 7: failed to assign [io  size 0x1000]
[    1.274478][    T1] pci 0003:00:00.0: BAR 7: no space for [io  size 0x1000]
[    1.274763][    T1] pci 0003:00:00.0: BAR 7: failed to assign [io  size 0x1000]
[    1.275085][    T1] pci 0003:00:00.0: PCI bridge to [bus 01]
[    1.275632][    T1] pci 0003:00:00.0:   bridge window [mem 0x600c180000000-0x600c1ffefffff]
[    1.276022][    T1] pci 0003:00:00.0:   bridge window [mem 0x6020000000000-0x6023fbff0ffff 64bit pref]
[    1.276634][    T1] pci_bus 0003:00: resource 4 [mem 0x600c180000000-0x600c1fffeffff]
[    1.276999][    T1] pci_bus 0003:00: resource 5 [mem 0x6020000000000-0x6023fbfffffff 64bit pref]
[    1.277359][    T1] pci_bus 0003:01: resource 1 [mem 0x600c180000000-0x600c1ffefffff]
[    1.277679][    T1] pci_bus 0003:01: resource 2 [mem 0x6020000000000-0x6023fbff0ffff 64bit pref]
[    1.278082][    T1] pci 0004:00:00.0: bridge window [io  0x1000-0x0fff] to [bus 01] add_size 1000
[    1.278464][    T1] pci 0004:00:00.0: bridge window [mem 0x40000000-0x3fffffff 64bit pref] to [bus 01] add_size 40000000 add_align 40000000
[    1.278987][    T1] pci 0004:00:00.0: bridge window [mem 0x00800000-0x007fffff] to [bus 01] add_size 800000 add_align 800000
[    1.279526][    T1] pci 0004:00:00.0: BAR 9: assigned [mem 0x6024000000000-0x602403fffffff 64bit pref]
[    1.279925][    T1] pci 0004:00:00.0: BAR 8: assigned [mem 0x600c200000000-0x600c2007fffff]
[    1.280273][    T1] pci 0004:00:00.0: BAR 7: no space for [io  size 0x1000]
[    1.280564][    T1] pci 0004:00:00.0: BAR 7: failed to assign [io  size 0x1000]
[    1.280932][    T1] pci 0004:00:00.0: BAR 7: no space for [io  size 0x1000]
[    1.281223][    T1] pci 0004:00:00.0: BAR 7: failed to assign [io  size 0x1000]
[    1.281688][    T1] pci 0004:00:00.0: PCI bridge to [bus 01]
[    1.282222][    T1] pci 0004:00:00.0:   bridge window [mem 0x600c200000000-0x600c27fefffff]
[    1.282623][    T1] pci 0004:00:00.0:   bridge window [mem 0x6024000000000-0x6027f7ff0ffff 64bit pref]
[    1.283258][    T1] pci_bus 0004:00: resource 4 [mem 0x600c200000000-0x600c27ffeffff]
[    1.283633][    T1] pci_bus 0004:00: resource 5 [mem 0x6024000000000-0x6027f7fffffff 64bit pref]
[    1.284000][    T1] pci_bus 0004:01: resource 1 [mem 0x600c200000000-0x600c27fefffff]
[    1.284324][    T1] pci_bus 0004:01: resource 2 [mem 0x6024000000000-0x6027f7ff0ffff 64bit pref]
[    1.284724][    T1] pci 0005:00:00.0: bridge window [io  0x1000-0x0fff] to [bus 01] add_size 1000
[    1.285109][    T1] pci 0005:00:00.0: bridge window [mem 0x40000000-0x3fffffff 64bit pref] to [bus 01] add_size 40000000 add_align 40000000
[    1.285624][    T1] pci 0005:00:00.0: bridge window [mem 0x00800000-0x007fffff] to [bus 01] add_size 800000 add_align 800000
[    1.286157][    T1] pci 0005:00:00.0: BAR 9: assigned [mem 0x6028000000000-0x602803fffffff 64bit pref]
[    1.286544][    T1] pci 0005:00:00.0: BAR 8: assigned [mem 0x600c280000000-0x600c2807fffff]
[    1.286895][    T1] pci 0005:00:00.0: BAR 7: no space for [io  size 0x1000]
[    1.287182][    T1] pci 0005:00:00.0: BAR 7: failed to assign [io  size 0x1000]
[    1.287543][    T1] pci 0005:00:00.0: BAR 7: no space for [io  size 0x1000]
[    1.287833][    T1] pci 0005:00:00.0: BAR 7: failed to assign [io  size 0x1000]
[    1.288149][    T1] pci 0005:00:00.0: PCI bridge to [bus 01]
[    1.288640][    T1] pci 0005:00:00.0:   bridge window [mem 0x600c280000000-0x600c2ffefffff]
[    1.289031][    T1] pci 0005:00:00.0:   bridge window [mem 0x6028000000000-0x602bf7ff0ffff 64bit pref]
[    1.289663][    T1] pci_bus 0005:00: resource 4 [mem 0x600c280000000-0x600c2fffeffff]
[    1.290038][    T1] pci_bus 0005:00: resource 5 [mem 0x6028000000000-0x602bf7fffffff 64bit pref]
[    1.290415][    T1] pci_bus 0005:01: resource 1 [mem 0x600c280000000-0x600c2ffefffff]
[    1.290744][    T1] pci_bus 0005:01: resource 2 [mem 0x6028000000000-0x602bf7ff0ffff 64bit pref]
[    1.291302][    T1] pci_bus 0000:00: Configuring PE for bus
[    1.292035][    T1] pci 0000:00     : [PE# 1fe] Secondary bus 0x0000000000000000 associated with PE#1fe
[    1.293202][    T1] pci 0000:00:00.0: Configured PE#1fe
[    1.295876][    T1] pci_bus 0000:01: Configuring PE for bus
[    1.296332][    T1] pci 0000:01     : [PE# 1fd] Secondary bus 0x0000000000000001 associated with PE#1fd
[    1.297103][    T1] pci 0000:01:00.0: Configured PE#1fd
[    1.297363][    T1] pci 0000:01     : [PE# 1fd] Setting up 32-bit TCE table at 0..80000000
[    1.298764][    T1] IOMMU table initialized, virtual merging enabled
[    1.299106][    T1] pci 0000:01     : [PE# 1fd] Setting up window#0 0..7fffffff pg=10000
[    1.299875][    T1] pci 0000:01     : [PE# 1fd] Enabling 64-bit DMA bypass
[    1.301127][    T1] pci 0000:01:00.0: Adding to iommu group 0
[    1.303551][    T1] pci_bus 0001:00: Configuring PE for bus
[    1.303821][    T1] pci 0001:00     : [PE# fe] Secondary bus 0x0000000000000000 associated with PE#fe
[    1.304336][    T1] pci 0001:00:00.0: Configured PE#fe
[    1.305333][    T1] pci_bus 0002:00: Configuring PE for bus
[    1.305584][    T1] pci 0002:00     : [PE# fe] Secondary bus 0x0000000000000000 associated with PE#fe
[    1.306093][    T1] pci 0002:00:00.0: Configured PE#fe
[    1.307058][    T1] pci_bus 0003:00: Configuring PE for bus
[    1.307309][    T1] pci 0003:00     : [PE# 1fe] Secondary bus 0x0000000000000000 associated with PE#1fe
[    1.307823][    T1] pci 0003:00:00.0: Configured PE#1fe
[    1.308719][    T1] pci_bus 0004:00: Configuring PE for bus
[    1.308969][    T1] pci 0004:00     : [PE# fe] Secondary bus 0x0000000000000000 associated with PE#fe
[    1.309469][    T1] pci 0004:00:00.0: Configured PE#fe
[    1.310396][    T1] pci_bus 0005:00: Configuring PE for bus
[    1.310646][    T1] pci 0005:00     : [PE# fe] Secondary bus 0x0000000000000000 associated with PE#fe
[    1.311146][    T1] pci 0005:00:00.0: Configured PE#fe
[    1.314106][    T1] pci 0000:00:00.0: enabling device (0105 -> 0107)
[    1.314972][    T1] EEH: Capable adapter found: recovery enabled.
[    1.327661][    T1] cpuidle-powernv: Default stop: psscr = 0x0000000000000330,mask=0x00000000003003ff
[    1.328088][    T1] cpuidle-powernv: Deepest stop: psscr = 0x0000000000300331,mask=0x00000000003003ff
[    1.328467][    T1] cpuidle-powernv: First stop level that may lose SPRs = 0x10
[    1.328771][    T1] cpuidle-powernv: First stop level that may lose timebase = 0x10
[    1.418096][    T1] Kprobes globally optimized
[    1.445284][   T34] cryptomgr_test (34) used greatest stack depth: 10880 bytes left
[    1.477721][   T43] cryptomgr_test (43) used greatest stack depth: 10528 bytes left
[    1.649775][    T1] raid6: vpermxor8 gen()   911 MB/s
[    1.821391][    T1] raid6: vpermxor4 gen()   892 MB/s
[    1.992859][    T1] raid6: vpermxor2 gen()   801 MB/s
[    2.164481][    T1] raid6: vpermxor1 gen()   670 MB/s
[    2.336154][    T1] raid6: altivecx8 gen()  1082 MB/s
[    2.507671][    T1] raid6: altivecx4 gen()  1221 MB/s
[    2.679460][    T1] raid6: altivecx2 gen()  1120 MB/s
[    2.850914][    T1] raid6: altivecx1 gen()   843 MB/s
[    3.022625][    T1] raid6: int64x8  gen()  1612 MB/s
[    3.195010][    T1] raid6: int64x8  xor()  1085 MB/s
[    3.366613][    T1] raid6: int64x4  gen()  2585 MB/s
[    3.539108][    T1] raid6: int64x4  xor()  1289 MB/s
[    3.711797][    T1] raid6: int64x2  gen()  2250 MB/s
[    3.884349][    T1] raid6: int64x2  xor()  1449 MB/s
[    4.055962][    T1] raid6: int64x1  gen()  1654 MB/s
[    4.227577][    T1] raid6: int64x1  xor()  1052 MB/s
[    4.227844][    T1] raid6: using algorithm int64x4 gen() 2585 MB/s
[    4.228126][    T1] raid6: .... xor() 1289 MB/s, rmw enabled
[    4.228428][    T1] raid6: using intx1 recovery algorithm
[    4.229638][    T1] iommu: Default domain type: Translated 
[    4.229925][    T1] iommu: DMA domain TLB invalidation policy: lazy mode 
[    4.237510][    T1] vgaarb: loaded
[    4.246973][    T1] SCSI subsystem initialized
[    4.250526][    T1] usbcore: registered new interface driver usbfs
[    4.251465][    T1] usbcore: registered new interface driver hub
[    4.252078][    T1] usbcore: registered new device driver usb
[    4.270795][    T1] clocksource: Switched to clocksource timebase
[    4.952052][    T1] hugetlbfs: disabling because there are no supported hugepage sizes
[    5.048614][    T1] NET: Registered PF_INET protocol family
[    5.050972][    T1] IP idents hash table entries: 32768 (order: 2, 262144 bytes, linear)
[    5.058507][    T1] tcp_listen_portaddr_hash hash table entries: 1024 (order: 0, 81920 bytes, linear)
[    5.059211][    T1] TCP established hash table entries: 16384 (order: 1, 131072 bytes, linear)
[    5.060165][    T1] TCP bind hash table entries: 16384 (order: 4, 1179648 bytes, linear)
[    5.063161][    T1] TCP: Hash tables configured (established 16384 bind 16384)
[    5.064854][    T1] UDP hash table entries: 1024 (order: 1, 163840 bytes, linear)
[    5.065745][    T1] UDP-Lite hash table entries: 1024 (order: 1, 163840 bytes, linear)
[    5.068316][    T1] NET: Registered PF_UNIX/PF_LOCAL protocol family
[    5.069444][    T1] PCI: CLS 0 bytes, default 128
[   11.698074][    T1] workingset: timestamp_bits=38 max_order=15 bucket_order=0
[   11.863823][    T1] squashfs: version 4.0 (2009/01/31) Phillip Lougher
[   11.868058][    T1] xor: measuring software checksum speed
[   11.871448][    T1]    8regs           :  3419 MB/sec
[   11.875731][    T1]    8regs_prefetch  :  2781 MB/sec
[   11.879196][    T1]    32regs          :  3140 MB/sec
[   11.883403][    T1]    32regs_prefetch :  2606 MB/sec
[   11.887182][    T1]    altivec         :  2876 MB/sec
[   11.887443][    T1] xor: using function: 8regs (3419 MB/sec)
[   11.888066][    T1] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 250)
[   11.888569][    T1] io scheduler mq-deadline registered
[   11.888942][    T1] io scheduler kyber registered
[   12.533563][    T1] String selftests succeeded
[   12.536168][    T1] test_firmware: interface ready
[   12.536606][    T1] test_bitmap: loaded.
[   12.538982][    T1] test_bitmap: parselist: 14: input is '0-2047:128/256' OK, Time: 2500
[   12.540499][    T1] test_bitmap: bitmap_print_to_pagebuf: input is '0-524287
[   12.540499][    T1] ', Time: 196283
[   12.547687][    T1] test_bitmap: all 1943 tests passed
[   12.548124][    T1] test_uuid: all 18 tests passed
[   12.550638][    T1] crc32: CRC_LE_BITS = 64, CRC_BE BITS = 64
[   12.550919][    T1] crc32: self tests passed, processed 225944 bytes in 441546 nsec
[   12.551692][    T1] crc32c: CRC_LE_BITS = 64
[   12.551895][    T1] crc32c: self tests passed, processed 225944 bytes in 210993 nsec
[   12.589245][    T1] crc32_combine: 8373 self tests passed
[   12.625891][    T1] crc32c_combine: 8373 self tests passed
[   12.626772][    T1] glob: 64 self-tests passed, 0 failed
[   12.632624][    T1] rbtree testing
[   12.653861][    T1]  -> test 1 (latency of nnodes insert+delete): 10807 cycles
[   12.666573][    T1]  -> test 2 (latency of nnodes cached insert+delete): 6252 cycles
[   12.669009][    T1]  -> test 3 (latency of inorder traversal): 1046 cycles
[   12.669361][    T1]  -> test 4 (latency to fetch first node)
[   12.669608][    T1]         non-cached: 27 cycles
[   12.669823][    T1]         cached: 0 cycles
[   12.769434][    T1] augmented rbtree testing
[   12.812243][    T1]  -> test 1 (latency of nnodes insert+delete): 21897 cycles
[   12.860505][    T1]  -> test 2 (latency of nnodes cached insert+delete): 24428 cycles
[   12.994713][    T1] interval tree insert/remove
[   13.018177][    T1]  -> 11958 cycles
[   13.018560][    T1] interval tree search
[   13.112939][    T1]  -> 48288 cycles (2692 results)
[   13.114012][    T1] IPMI message handler: version 39.2
[   13.114781][    T1] ipmi device interface
[   13.345730][    T1] ipmi-powernv ibm,opal:ipmi: IPMI message handler: Found new BMC (man_id: 0x000000, prod_id: 0x0000, dev_id: 0x20)
[   13.569872][    T1] hvc0: raw protocol on /ibm,opal/consoles/serial@0 (boot console)
[   13.571124][    T1] hvc0: No interrupts property, using OPAL event
[   13.579830][    T1] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[   13.647596][    T1] loop: module loaded
[   13.650217][    T1] Adaptec aacraid driver 1.2.1[50983]-custom
[   13.652349][    T1] megasas: 07.717.02.00-rc1
[   13.653397][    T1] ipr: IBM Power RAID SCSI Device Driver version: 2.6.4 (March 14, 2017)
[   13.679028][    T1] 1 fixed-partitions partitions found on MTD device flash@0
[   13.679453][    T1] Creating 1 MTD partitions on "flash@0":
[   13.679876][    T1] 0x000000000000-0x000002000000 : "PNOR"
[   13.711952][    T1] libphy: Fixed MDIO Bus: probed
[   13.715189][    T1] e100: Intel(R) PRO/100 Network Driver
[   13.715437][    T1] e100: Copyright(c) 1999-2006 Intel Corporation
[   13.715979][    T1] e1000: Intel(R) PRO/1000 Network Driver
[   13.716232][    T1] e1000: Copyright (c) 1999-2006 Intel Corporation.
[   13.716928][    T1] e1000e: Intel(R) PRO/1000 Network Driver
[   13.717181][    T1] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
[   13.718728][   T11] 8139cp: 8139cp: 10/100 PCI Ethernet driver v1.3 (Mar 22, 2004)
[   13.719630][   T11] 8139cp 0000:01:00.0: enabling device (0100 -> 0102)
[   13.726664][   T11] 8139cp 0000:01:00.0 eth0: RTL-8139C+ at 0xc00a000080630000, 52:54:00:12:34:56, IRQ 32
[   13.728905][    T1] usbcore: registered new interface driver asix
[   13.729476][    T1] usbcore: registered new interface driver ax88179_178a
[   13.730006][    T1] usbcore: registered new interface driver cdc_ether
[   13.730746][    T1] usbcore: registered new interface driver net1080
[   13.731272][    T1] usbcore: registered new interface driver cdc_subset
[   13.731791][    T1] usbcore: registered new interface driver zaurus
[   13.732528][    T1] usbcore: registered new interface driver cdc_ncm
[   13.732828][    T1] Fusion MPT base driver 3.04.20
[   13.733047][    T1] Copyright (c) 1999-2008 LSI Corporation
[   13.733504][    T1] Fusion MPT SAS Host driver 3.04.20
[   13.735177][    T1] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[   13.735547][    T1] ehci-pci: EHCI PCI platform driver
[   13.736156][    T1] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[   13.738233][    T1] usbcore: registered new interface driver uas
[   13.738956][    T1] usbcore: registered new interface driver usb-storage
[   13.739614][    T1] usbcore: registered new interface driver usbtest
[   13.740123][    T1] usbcore: registered new interface driver usb_ehset_test
[   13.740829][    T1] usbcore: registered new interface driver lvs
[   13.984603][    T1] rtc-opal opal-rtc: registered as rtc0
[   14.191792][    T1] rtc-opal opal-rtc: setting system clock to 2021-08-18T02:42:24 UTC (1629254544)
[   14.193896][    T1] i2c /dev entries driver
[   14.202822][    T1] device-mapper: uevent: version 1.0.3
[   14.209077][    T1] device-mapper: ioctl: 4.45.0-ioctl (2021-03-22) initialised: dm-devel@redhat.com
[   14.209551][    T1] powernv-cpufreq: ibm,pstate-min node not found
[   14.209825][    T1] powernv-cpufreq: Platform driver disabled. System does not support PState control
[   14.213922][    T1] sdhci: Secure Digital Host Controller Interface driver
[   14.214246][    T1] sdhci: Copyright(c) Pierre Ossman
[   14.216479][    T1] ipip: IPv4 and MPLS over IPv4 tunneling driver
[   14.228886][    T1] NET: Registered PF_INET6 protocol family
[   14.243458][    T1] Segment Routing with IPv6
[   14.243966][    T1] In-situ OAM (IOAM) with IPv6
[   14.244762][    T1] NET: Registered PF_PACKET protocol family
[   14.246324][    T1] Key type dns_resolver registered
[   14.246774][    T1] drmem: No dynamic reconfiguration memory found
[   14.258788][    T1] registered taskstats version 1
[   14.282301][    T1] Btrfs loaded, crc32c=crc32c-generic, zoned=no, fsverity=no
[   14.292334][    T1] ### dt-test ### start of unittest - you will see error messages
[   14.296627][    T1] ### dt-test ### EXPECT \ : Duplicate name in testcase-data, renamed to "duplicate-name#1"
[   14.297957][    T1] Duplicate name in testcase-data, renamed to "duplicate-name#1"
[   14.322891][    T1] ### dt-test ### EXPECT / : Duplicate name in testcase-data, renamed to "duplicate-name#1"
[   14.328302][    T1] ### dt-test ### EXPECT \ : OF: /testcase-data/phandle-tests/consumer-a: could not get #phandle-cells-missing for /testcase-data/phandle-tests/provider1
[   14.328760][    T1] OF: /testcase-data/phandle-tests/consumer-a: could not get #phandle-cells-missing for /testcase-data/phandle-tests/provider1
[   14.329967][    T1] ### dt-test ### EXPECT / : OF: /testcase-data/phandle-tests/consumer-a: could not get #phandle-cells-missing for /testcase-data/phandle-tests/provider1
[   14.329990][    T1] ### dt-test ### EXPECT \ : OF: /testcase-data/phandle-tests/consumer-a: could not get #phandle-cells-missing for /testcase-data/phandle-tests/provider1
[   14.330773][    T1] OF: /testcase-data/phandle-tests/consumer-a: could not get #phandle-cells-missing for /testcase-data/phandle-tests/provider1
[   14.331988][    T1] ### dt-test ### EXPECT / : OF: /testcase-data/phandle-tests/consumer-a: could not get #phandle-cells-missing for /testcase-data/phandle-tests/provider1
[   14.332025][    T1] ### dt-test ### EXPECT \ : OF: /testcase-data/phandle-tests/consumer-a: could not find phandle
[   14.332660][    T1] OF: /testcase-data/phandle-tests/consumer-a: could not find phandle 12345678
[   14.333489][    T1] ### dt-test ### EXPECT / : OF: /testcase-data/phandle-tests/consumer-a: could not find phandle
[   14.333533][    T1] ### dt-test ### EXPECT \ : OF: /testcase-data/phandle-tests/consumer-a: could not find phandle
[   14.333966][    T1] OF: /testcase-data/phandle-tests/consumer-a: could not find phandle 12345678
[   14.334811][    T1] ### dt-test ### EXPECT / : OF: /testcase-data/phandle-tests/consumer-a: could not find phandle
[   14.334842][    T1] ### dt-test ### EXPECT \ : OF: /testcase-data/phandle-tests/consumer-a: #phandle-cells = 3 found -1
[   14.335284][    T1] OF: /testcase-data/phandle-tests/consumer-a: #phandle-cells = 3 found -1
[   14.336102][    T1] ### dt-test ### EXPECT / : OF: /testcase-data/phandle-tests/consumer-a: #phandle-cells = 3 found -1
[   14.336124][    T1] ### dt-test ### EXPECT \ : OF: /testcase-data/phandle-tests/consumer-a: #phandle-cells = 3 found -1
[   14.336574][    T1] OF: /testcase-data/phandle-tests/consumer-a: #phandle-cells = 3 found -1
[   14.337380][    T1] ### dt-test ### EXPECT / : OF: /testcase-data/phandle-tests/consumer-a: #phandle-cells = 3 found -1
[   14.338978][    T1] ### dt-test ### EXPECT \ : OF: /testcase-data/phandle-tests/consumer-b: could not get #phandle-missing-cells for /testcase-data/phandle-tests/provider1
[   14.339501][    T1] OF: /testcase-data/phandle-tests/consumer-b: could not get #phandle-missing-cells for /testcase-data/phandle-tests/provider1
[   14.341394][    T1] ### dt-test ### EXPECT / : OF: /testcase-data/phandle-tests/consumer-b: could not get #phandle-missing-cells for /testcase-data/phandle-tests/provider1
[   14.341466][    T1] ### dt-test ### EXPECT \ : OF: /testcase-data/phandle-tests/consumer-b: could not find phandle
[   14.342848][    T1] OF: /testcase-data/phandle-tests/consumer-b: could not find phandle 12345678
[   14.344546][    T1] ### dt-test ### EXPECT / : OF: /testcase-data/phandle-tests/consumer-b: could not find phandle
[   14.344606][    T1] ### dt-test ### EXPECT \ : OF: /testcase-data/phandle-tests/consumer-b: #phandle-cells = 2 found -1
[   14.345657][    T1] OF: /testcase-data/phandle-tests/consumer-b: #phandle-cells = 2 found -1
[   14.347372][    T1] ### dt-test ### EXPECT / : OF: /testcase-data/phandle-tests/consumer-b: #phandle-cells = 2 found -1
[   14.353289][    T1] ### dt-test ### FAIL of_unittest_dma_ranges_one():922 of_dma_get_range: wrong phys addr 0x0000000000000000 (expecting 20000000) on node /testcase-data/address-tests/device@70000000
[   14.354511][    T1] ### dt-test ### FAIL of_unittest_dma_ranges_one():925 of_dma_get_range: wrong DMA addr 0x0000000020000000 (expecting 0) on node /testcase-data/address-tests/device@70000000
[   14.355461][    T1] ### dt-test ### FAIL of_unittest_dma_ranges_one():922 of_dma_get_range: wrong phys addr 0x0000000100000000 (expecting 20000000) on node /testcase-data/address-tests/bus@80000000/device@1000
[   14.356255][    T1] ### dt-test ### FAIL of_unittest_dma_ranges_one():925 of_dma_get_range: wrong DMA addr 0x0000000020000000 (expecting 100000000) on node /testcase-data/address-tests/bus@80000000/device@1000
[   14.357459][    T1] ### dt-test ### FAIL of_unittest_dma_ranges_one():922 of_dma_get_range: wrong phys addr 0x0000000080000000 (expecting 20000000) on node /testcase-data/address-tests/pci@90000000
[   14.358210][    T1] ### dt-test ### FAIL of_unittest_dma_ranges_one():925 of_dma_get_range: wrong DMA addr 0x0000000020000000 (expecting 80000000) on node /testcase-data/address-tests/pci@90000000
[   14.361799][    T1] ### dt-test ### EXPECT \ : platform testcase-data:testcase-device2: IRQ index 0 not found
[   14.361874][    T1] platform testcase-data:testcase-device2: IRQ index 0 not found
[   14.362644][    T1] ### dt-test ### EXPECT / : platform testcase-data:testcase-device2: IRQ index 0 not found
[   14.371513][    T1] ### dt-test ### end of unittest - 184 passed, 6 failed
[   14.375004][    T1] TAP version 14
[   14.375192][    T1] 1..11
[   14.375391][    T1]     # Subtest: sysctl_test
[   14.375460][    T1]     1..10
[   14.378464][    T1]     ok 1 - sysctl_test_api_dointvec_null_tbl_data
[   14.379651][    T1]     ok 2 - sysctl_test_api_dointvec_table_maxlen_unset
[   14.381506][    T1]     ok 3 - sysctl_test_api_dointvec_table_len_is_zero
[   14.382656][    T1]     ok 4 - sysctl_test_api_dointvec_table_read_but_position_set
[   14.384011][    T1]     ok 5 - sysctl_test_dointvec_read_happy_single_positive
[   14.385228][    T1]     ok 6 - sysctl_test_dointvec_read_happy_single_negative
[   14.386628][    T1]     ok 7 - sysctl_test_dointvec_write_happy_single_positive
[   14.387832][    T1]     ok 8 - sysctl_test_dointvec_write_happy_single_negative
[   14.389086][    T1]     ok 9 - sysctl_test_api_dointvec_write_single_less_int_min
[   14.390490][    T1]     ok 10 - sysctl_test_api_dointvec_write_single_greater_int_max
[   14.390881][    T1] # sysctl_test: pass:10 fail:0 skip:0 total:10
[   14.391258][    T1] # Totals: pass:10 fail:0 skip:0 total:10
[   14.391605][    T1] ok 1 - sysctl_test
[   14.392077][    T1]     # Subtest: ext4_inode_test
[   14.392090][    T1]     1..1
[   14.393284][    T1]     # inode_test_xtimestamp_decoding: ok 1 - 1901-12-13 Lower bound of 32bit < 0 timestamp, no extra bits
[   14.394195][    T1]     # inode_test_xtimestamp_decoding: ok 2 - 1969-12-31 Upper bound of 32bit < 0 timestamp, no extra bits
[   14.395457][    T1]     # inode_test_xtimestamp_decoding: ok 3 - 1970-01-01 Lower bound of 32bit >=0 timestamp, no extra bits
[   14.396662][    T1]     # inode_test_xtimestamp_decoding: ok 4 - 2038-01-19 Upper bound of 32bit >=0 timestamp, no extra bits
[   14.397815][    T1]     # inode_test_xtimestamp_decoding: ok 5 - 2038-01-19 Lower bound of 32bit <0 timestamp, lo extra sec bit on
[   14.398969][    T1]     # inode_test_xtimestamp_decoding: ok 6 - 2106-02-07 Upper bound of 32bit <0 timestamp, lo extra sec bit on
[   14.400124][    T1]     # inode_test_xtimestamp_decoding: ok 7 - 2106-02-07 Lower bound of 32bit >=0 timestamp, lo extra sec bit on
[   14.402278][    T1]     # inode_test_xtimestamp_decoding: ok 8 - 2174-02-25 Upper bound of 32bit >=0 timestamp, lo extra sec bit on
[   14.403526][    T1]     # inode_test_xtimestamp_decoding: ok 9 - 2174-02-25 Lower bound of 32bit <0 timestamp, hi extra sec bit on
[   14.404696][    T1]     # inode_test_xtimestamp_decoding: ok 10 - 2242-03-16 Upper bound of 32bit <0 timestamp, hi extra sec bit on
[   14.405854][    T1]     # inode_test_xtimestamp_decoding: ok 11 - 2242-03-16 Lower bound of 32bit >=0 timestamp, hi extra sec bit on
[   14.407020][    T1]     # inode_test_xtimestamp_decoding: ok 12 - 2310-04-04 Upper bound of 32bit >=0 timestamp, hi extra sec bit on
[   14.408178][    T1]     # inode_test_xtimestamp_decoding: ok 13 - 2310-04-04 Upper bound of 32bit>=0 timestamp, hi extra sec bit 1. 1 ns
[   14.409335][    T1]     # inode_test_xtimestamp_decoding: ok 14 - 2378-04-22 Lower bound of 32bit>= timestamp. Extra sec bits 1. Max ns
[   14.411220][    T1]     # inode_test_xtimestamp_decoding: ok 15 - 2378-04-22 Lower bound of 32bit >=0 timestamp. All extra sec bits on
[   14.412401][    T1]     # inode_test_xtimestamp_decoding: ok 16 - 2446-05-10 Upper bound of 32bit >=0 timestamp. All extra sec bits on
[   14.412926][    T1]     # inode_test_xtimestamp_decoding: pass:16 fail:0 skip:0 total:16
[   14.413434][    T1]     ok 1 - inode_test_xtimestamp_decoding
[   14.413780][    T1] # Totals: pass:16 fail:0 skip:0 total:16
[   14.414032][    T1] ok 2 - ext4_inode_test
[   14.414468][    T1]     # Subtest: lib_sort
[   14.414480][    T1]     1..1
[   14.415945][    T1]     ok 1 - test_sort
[   14.416111][    T1] ok 3 - lib_sort
[   14.416444][    T1]     # Subtest: kunit_executor_test
[   14.416455][    T1]     1..3
[   14.417902][    T1]     ok 1 - filter_subsuite_test
[   14.418905][    T1]     ok 2 - filter_subsuite_to_empty_test
[   14.420261][    T1]     ok 3 - filter_suites_test
[   14.421124][    T1] # kunit_executor_test: pass:3 fail:0 skip:0 total:3
[   14.421343][    T1] # Totals: pass:3 fail:0 skip:0 total:3
[   14.421626][    T1] ok 4 - kunit_executor_test
[   14.422053][    T1]     # Subtest: kunit-try-catch-test
[   14.422065][    T1]     1..2
[   14.424075][    T1]     ok 1 - kunit_test_try_catch_successful_try_no_catch
[   14.425713][    T1]     ok 2 - kunit_test_try_catch_unsuccessful_try_does_catch
[   14.426024][    T1] # kunit-try-catch-test: pass:2 fail:0 skip:0 total:2
[   14.426347][    T1] # Totals: pass:2 fail:0 skip:0 total:2
[   14.426633][    T1] ok 5 - kunit-try-catch-test
[   14.427069][    T1]     # Subtest: kunit-resource-test
[   14.427080][    T1]     1..7
[   14.428173][    T1]     ok 1 - kunit_resource_test_init_resources
[   14.429237][    T1]     ok 2 - kunit_resource_test_alloc_resource
[   14.431621][    T1]     ok 3 - kunit_resource_test_destroy_resource
[   14.432898][    T1]     ok 4 - kunit_resource_test_cleanup_resources
[   14.434190][    T1]     ok 5 - kunit_resource_test_proper_free_ordering
[   14.435279][    T1]     ok 6 - kunit_resource_test_static
[   14.436861][    T1]     ok 7 - kunit_resource_test_named
[   14.437112][    T1] # kunit-resource-test: pass:7 fail:0 skip:0 total:7
[   14.437346][    T1] # Totals: pass:7 fail:0 skip:0 total:7
[   14.437627][    T1] ok 6 - kunit-resource-test
[   14.438049][    T1]     # Subtest: kunit-log-test
[   14.438059][    T1]     1..1
[   14.438771][  T104] put this in log.
[   14.438939][  T104] this too.
[   14.439116][  T104] add to suite log.
[   14.439272][  T104] along with this.
[   14.439707][    T1]     ok 1 - kunit_log_test
[   14.439888][    T1] ok 7 - kunit-log-test
[   14.440257][    T1]     # Subtest: kunit_status
[   14.440268][    T1]     1..2
[   14.441840][    T1]     ok 1 - kunit_status_set_failure_test
[   14.442860][    T1]     ok 2 - kunit_status_mark_skipped_test
[   14.443125][    T1] # kunit_status: pass:2 fail:0 skip:0 total:2
[   14.443377][    T1] # Totals: pass:2 fail:0 skip:0 total:2
[   14.443637][    T1] ok 8 - kunit_status
[   14.444039][    T1]     # Subtest: string-stream-test
[   14.444050][    T1]     1..3
[   14.445245][    T1]     ok 1 - string_stream_test_empty_on_creation
[   14.446679][    T1]     ok 2 - string_stream_test_not_empty_after_add
[   14.448202][    T1]     ok 3 - string_stream_test_get_string
[   14.448495][    T1] # string-stream-test: pass:3 fail:0 skip:0 total:3
[   14.448746][    T1] # Totals: pass:3 fail:0 skip:0 total:3
[   14.449023][    T1] ok 9 - string-stream-test
[   14.449446][    T1]     # Subtest: list-kunit-test
[   14.449457][    T1]     1..36
[   14.451451][    T1]     ok 1 - list_test_list_init
[   14.452493][    T1]     ok 2 - list_test_list_add
[   14.453552][    T1]     ok 3 - list_test_list_add_tail
[   14.454616][    T1]     ok 4 - list_test_list_del
[   14.455644][    T1]     ok 5 - list_test_list_replace
[   14.456671][    T1]     ok 6 - list_test_list_replace_init
[   14.457789][    T1]     ok 7 - list_test_list_swap
[   14.458867][    T1]     ok 8 - list_test_list_del_init
[   14.459925][    T1]     ok 9 - list_test_list_move
[   14.461559][    T1]     ok 10 - list_test_list_move_tail
[   14.462757][    T1]     ok 11 - list_test_list_bulk_move_tail
[   14.463787][    T1]     ok 12 - list_test_list_is_first
[   14.464823][    T1]     ok 13 - list_test_list_is_last
[   14.465859][    T1]     ok 14 - list_test_list_empty
[   14.466872][    T1]     ok 15 - list_test_list_empty_careful
[   14.467928][    T1]     ok 16 - list_test_list_rotate_left
[   14.469065][    T1]     ok 17 - list_test_list_rotate_to_front
[   14.470131][    T1]     ok 18 - list_test_list_is_singular
[   14.471588][    T1]     ok 19 - list_test_list_cut_position
[   14.472793][    T1]     ok 20 - list_test_list_cut_before
[   14.473933][    T1]     ok 21 - list_test_list_splice
[   14.475096][    T1]     ok 22 - list_test_list_splice_tail
[   14.476234][    T1]     ok 23 - list_test_list_splice_init
[   14.477388][    T1]     ok 24 - list_test_list_splice_tail_init
[   14.478401][    T1]     ok 25 - list_test_list_entry
[   14.479436][    T1]     ok 26 - list_test_list_first_entry
[   14.481777][    T1]     ok 27 - list_test_list_last_entry
[   14.482916][    T1]     ok 28 - list_test_list_first_entry_or_null
[   14.483933][    T1]     ok 29 - list_test_list_next_entry
[   14.484971][    T1]     ok 30 - list_test_list_prev_entry
[   14.486050][    T1]     ok 31 - list_test_list_for_each
[   14.487135][    T1]     ok 32 - list_test_list_for_each_prev
[   14.488253][    T1]     ok 33 - list_test_list_for_each_safe
[   14.489385][    T1]     ok 34 - list_test_list_for_each_prev_safe
[   14.491430][    T1]     ok 35 - list_test_list_for_each_entry
[   14.492601][    T1]     ok 36 - list_test_list_for_each_entry_reverse
[   14.492868][    T1] # list-kunit-test: pass:36 fail:0 skip:0 total:36
[   14.493149][    T1] # Totals: pass:36 fail:0 skip:0 total:36
[   14.493425][    T1] ok 10 - list-kunit-test
[   14.493848][    T1]     # Subtest: qos-kunit-test
[   14.493859][    T1]     1..3
[   14.495411][    T1]     ok 1 - freq_qos_test_min
[   14.496624][    T1]     ok 2 - freq_qos_test_maxdef
[   14.497716][    T1]     ok 3 - freq_qos_test_readd
[   14.497949][    T1] # qos-kunit-test: pass:3 fail:0 skip:0 total:3
[   14.498176][    T1] # Totals: pass:3 fail:0 skip:0 total:3
[   14.498441][    T1] ok 11 - qos-kunit-test
[   14.505883][    T1] md: Waiting for all devices to be available before autodetect
[   14.506243][    T1] md: If you don't use raid, use raid=noautodetect
[   14.506595][    T1] md: Autodetecting RAID arrays.
[   14.506868][    T1] md: autorun ...
[   14.507051][    T1] md: ... autorun DONE.
[   14.509976][    T1] 
[   14.510063][    T1] ======================================================
[   14.510243][    T1] WARNING: possible circular locking dependency detected
[   14.510463][    T1] 5.14.0-rc6-next-20210817 #1 Not tainted
[   14.510664][    T1] ------------------------------------------------------
[   14.510835][    T1] swapper/0/1 is trying to acquire lock:
[   14.511006][    T1] c00000000192c620 (mtd_table_mutex){+.+.}-{3:3}, at: blktrans_open+0x60/0x300
[   14.511490][    T1] 
[   14.511490][    T1] but task is already holding lock:
[   14.511672][    T1] c000000005a7b118 (&disk->open_mutex){+.+.}-{3:3}, at: blkdev_get_by_dev+0x20c/0x3c0
[   14.511946][    T1] 
[   14.511946][    T1] which lock already depends on the new lock.
[   14.511946][    T1] 
[   14.512198][    T1] 
[   14.512198][    T1] the existing dependency chain (in reverse order) is:
[   14.512434][    T1] 
[   14.512434][    T1] -> #1 (&disk->open_mutex){+.+.}-{3:3}:
[   14.512676][    T1]        __mutex_lock+0xd8/0xaf0
[   14.512849][    T1]        bd_register_pending_holders+0x48/0x190
[   14.513013][    T1]        device_add_disk+0x27c/0x3d0
[   14.513171][    T1]        add_mtd_blktrans_dev+0x358/0x620
[   14.513332][    T1]        mtdblock_add_mtd+0x94/0x120
[   14.513487][    T1]        blktrans_notify_add+0x7c/0xb0
[   14.513644][    T1]        add_mtd_device+0x3f8/0x600
[   14.513789][    T1]        add_mtd_partitions+0xfc/0x2b0
[   14.513944][    T1]        parse_mtd_partitions+0x2b4/0x980
[   14.514115][    T1]        mtd_device_parse_register+0xc0/0x3a0
[   14.514281][    T1]        powernv_flash_probe+0x180/0x240
[   14.514444][    T1]        platform_probe+0x78/0x120
[   14.514598][    T1]        really_probe+0x1cc/0x440
[   14.514738][    T1]        __driver_probe_device+0xb0/0x160
[   14.514892][    T1]        driver_probe_device+0x60/0x130
[   14.515043][    T1]        __driver_attach+0xe8/0x160
[   14.515188][    T1]        bus_for_each_dev+0xb4/0x130
[   14.515328][    T1]        driver_attach+0x34/0x50
[   14.515466][    T1]        bus_add_driver+0x1d8/0x2b0
[   14.515608][    T1]        driver_register+0x98/0x1a0
[   14.515753][    T1]        __platform_driver_register+0x38/0x50
[   14.515923][    T1]        powernv_flash_driver_init+0x2c/0x40
[   14.516084][    T1]        do_one_initcall+0x88/0x480
[   14.516223][    T1]        kernel_init_freeable+0x3dc/0x484
[   14.516392][    T1]        kernel_init+0x3c/0x180
[   14.516522][    T1]        ret_from_kernel_thread+0x5c/0x64
[   14.516718][    T1] 
[   14.516718][    T1] -> #0 (mtd_table_mutex){+.+.}-{3:3}:
[   14.516917][    T1]        __lock_acquire+0x1b5c/0x21d0
[   14.517068][    T1]        lock_acquire+0x2d8/0x4b0
[   14.517208][    T1]        __mutex_lock+0xd8/0xaf0
[   14.517342][    T1]        blktrans_open+0x60/0x300
[   14.517487][    T1]        blkdev_get_whole+0x50/0x110
[   14.517632][    T1]        blkdev_get_by_dev+0x1dc/0x3c0
[   14.517782][    T1]        blkdev_get_by_path+0x90/0xe0
[   14.517931][    T1]        mount_bdev+0x6c/0x2b0
[   14.518063][    T1]        ext4_mount+0x28/0x40
[   14.518207][    T1]        legacy_get_tree+0x4c/0xb0
[   14.518358][    T1]        vfs_get_tree+0x48/0x100
[   14.518489][    T1]        path_mount+0x2d8/0xd30
[   14.518616][    T1]        init_mount+0x7c/0xcc
[   14.518759][    T1]        mount_block_root+0x230/0x454
[   14.518894][    T1]        prepare_namespace+0x1b0/0x204
[   14.519031][    T1]        kernel_init_freeable+0x428/0x484
[   14.519193][    T1]        kernel_init+0x3c/0x180
[   14.519321][    T1]        ret_from_kernel_thread+0x5c/0x64
[   14.519501][    T1] 
[   14.519501][    T1] other info that might help us debug this:
[   14.519501][    T1] 
[   14.519756][    T1]  Possible unsafe locking scenario:
[   14.519756][    T1] 
[   14.519934][    T1]        CPU0                    CPU1
[   14.520065][    T1]        ----                    ----
[   14.520194][    T1]   lock(&disk->open_mutex);
[   14.520326][    T1]                                lock(mtd_table_mutex);
[   14.520506][    T1]                                lock(&disk->open_mutex);
[   14.520693][    T1]   lock(mtd_table_mutex);
[   14.520822][    T1] 
[   14.520822][    T1]  *** DEADLOCK ***
[   14.520822][    T1] 
[   14.521042][    T1] 1 lock held by swapper/0/1:
[   14.521181][    T1]  #0: c000000005a7b118 (&disk->open_mutex){+.+.}-{3:3}, at: blkdev_get_by_dev+0x20c/0x3c0
[   14.521481][    T1] 
[   14.521481][    T1] stack backtrace:
[   14.521733][    T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc6-next-20210817 #1
[   14.522127][    T1] Call Trace:
[   14.522256][    T1] [c000000002d072b0] [c0000000009106c8] dump_stack_lvl+0xac/0x108 (unreliable)
[   14.522564][    T1] [c000000002d072f0] [c0000000001aa04c] print_circular_bug.isra.44+0x37c/0x3e0
[   14.522812][    T1] [c000000002d07390] [c0000000001aa270] check_noncircular+0x1c0/0x1f0
[   14.523024][    T1] [c000000002d07460] [c0000000001aff3c] __lock_acquire+0x1b5c/0x21d0
[   14.523236][    T1] [c000000002d075a0] [c0000000001ad2b8] lock_acquire+0x2d8/0x4b0
[   14.523439][    T1] [c000000002d076a0] [c0000000010ae868] __mutex_lock+0xd8/0xaf0
[   14.523644][    T1] [c000000002d077b0] [c000000000bad9f0] blktrans_open+0x60/0x300
[   14.523858][    T1] [c000000002d07800] [c000000000540050] blkdev_get_whole+0x50/0x110
[   14.524069][    T1] [c000000002d07840] [c00000000054289c] blkdev_get_by_dev+0x1dc/0x3c0
[   14.524286][    T1] [c000000002d078a0] [c000000000542e90] blkdev_get_by_path+0x90/0xe0
[   14.524499][    T1] [c000000002d078f0] [c0000000004d148c] mount_bdev+0x6c/0x2b0
[   14.524694][    T1] [c000000002d07990] [c00000000064a0f8] ext4_mount+0x28/0x40
[   14.524887][    T1] [c000000002d079b0] [c0000000005315dc] legacy_get_tree+0x4c/0xb0
[   14.525100][    T1] [c000000002d079e0] [c0000000004cf5e8] vfs_get_tree+0x48/0x100
[   14.525298][    T1] [c000000002d07a50] [c00000000050fbe8] path_mount+0x2d8/0xd30
[   14.525489][    T1] [c000000002d07ae0] [c000000001587874] init_mount+0x7c/0xcc
[   14.525694][    T1] [c000000002d07b50] [c000000001551bd8] mount_block_root+0x230/0x454
[   14.525894][    T1] [c000000002d07c50] [c000000001552058] prepare_namespace+0x1b0/0x204
[   14.526106][    T1] [c000000002d07cc0] [c000000001551714] kernel_init_freeable+0x428/0x484
[   14.526335][    T1] [c000000002d07da0] [c000000000012d1c] kernel_init+0x3c/0x180
[   14.526529][    T1] [c000000002d07e10] [c00000000000cfd4] ret_from_kernel_thread+0x5c/0x64
[   14.567502][    T1] EXT4-fs (mtdblock0): mounted filesystem without journal. Opts: (null). Quota mode: disabled.
[   14.568104][    T1] VFS: Mounted root (ext4 filesystem) readonly on device 31:0.
[   14.572575][    T1] devtmpfs: mounted
[   14.632605][    T1] Freeing unused kernel image (initmem) memory: 1984K
[   14.632873][    T1] Kernel memory protection not selected by kernel config.
[   14.633341][    T1] Run /sbin/init as init process
[   14.633475][    T1]   with arguments:
[   14.633586][    T1]     /sbin/init
[   14.633692][    T1]   with environment:
[   14.633814][    T1]     HOME=/
[   14.633912][    T1]     TERM=linux
[   14.651001][    C0] random: fast init done
[   15.671461][  T152] mount (152) used greatest stack depth: 10048 bytes left
mount: mounting devtmpfs on /dev failed: Device or resource busy
[   15.768631][   T48]  (null): opal_flash_async_op(op=0) failed (rc -1)
[   15.773159][   T48] blk_update_request: I/O error, dev mtdblock0, sector 2 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
[   15.773787][   T48] blk_update_request: I/O error, dev mtdblock0, sector 2 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
[   15.774151][   T48] Buffer I/O error on dev mtdblock0, logical block 1, lost sync page write
[   15.775037][  T154] EXT4-fs (mtdblock0): I/O error while writing superblock
mount: mounting /dev/root on / failed: Input/output error
Starting syslogd: OK
Starting klogd: OK
Running sysctl: [   16.730207][  T179] logger (179) used greatest stack depth: 9920 bytes left
OK
Saving random seed: SKIP (read-only file system detected)
Starting network: [   17.450126][  T195] 8139cp 0000:01:00.0 eth0: link up, 100Mbps, full-duplex, lpa 0x05E1
udhcpc: started, v1.33.0
udhcpc: sending discover
udhcpc: sending select for 10.0.2.15
udhcpc: lease of 10.0.2.15 obtained, lease time 86400
deleting routers
adding dns 10.0.2.3
OK
Found console hvc0

Linux version 5.14.0-rc6-next-20210817 (groeck@server.roeck-us.net) (powerpc64-linux-gcc.br_real (Buildroot 2019.02-git-00353-g4f20f23) 7.4.0, GNU ld (GNU Binutils) 2.31.1) #1 SMP Tue Aug 17 19:13:07 PDT 2021
[   18.515763][  T227] telnet (227) used greatest stack depth: 8528 bytes left
Network interface test passed
Boot successful.
Rebooting
[   34.724800][  T232] reboot: Restarting system
[   48.142228983,5] OPAL: Reboot request...
[   48.142294919,2] NVRAM: Failed to load
------------



^ permalink raw reply	[flat|nested] 35+ messages in thread

* [dm-devel] holders not working properly, regression [was: Re: use regular gendisk registration in device mapper v2]
  2021-08-04  9:41 [dm-devel] use regular gendisk registration in device mapper v2 Christoph Hellwig
                   ` (9 preceding siblings ...)
  2021-08-10  0:36 ` Alasdair G Kergon
@ 2021-08-19 15:58 ` Mike Snitzer
  2021-08-19 18:05   ` Christoph Hellwig
  10 siblings, 1 reply; 35+ messages in thread
From: Mike Snitzer @ 2021-08-19 15:58 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jens Axboe, linux-block, dm-devel, Tushar Sugandhi

On Wed, Aug 04 2021 at  5:41P -0400,
Christoph Hellwig <hch@lst.de> wrote:

> Hi all,
> 
> The device mapper code currently has a somewhat odd gendisk registration
> scheme where it calls add_disk early, but uses a special flag to skip the
> "queue registration", which is a major part of add_disk.  This series
> improves the block layer holder tracking to work on an entirely
> unregistered disk and thus allows device mapper to use the normal scheme
> of calling add_disk when it is ready to accept I/O.
> 
> Note that this leads to a user visible change - the sysfs attributes on
> the disk and the dm directory hanging off it are now only visible once
> the initial table is loaded.  This did not make a difference to my testing
> using dmsetup and the lvm2 tools.
> 
> Changes since v1:
>  - rebased on the latest for-5.15/block tree
>  - improve various commit messages, including commit references

Hi,

This was originally reported to me by Tushar (cc'd).

Unfortunately I too am seeing a block-5.15/linux-next regression
related to holders when testing dm-multipath with an mptest test
case. To reproduce the following traces and crash, simply do:

git clone git://github.com/snitm/mptest.git
cd mptest
./runtest tests/test_02_sdev_delete

I got bogged down trying different kernels, because I _thought_ I
verified that mptest's tests all passed when I reviewed v1 of this
patchset.  But I'll pivot to looking closer at these traces and the
code to try to find the issue. I've sat on this regression since
Tuesday, so I need to at least share it with others now:

** Running: ./tests/test_02_sdev_delete

[ 1411.113642] ------------[ cut here ]------------
[ 1411.118260] kernfs: can not remove 'dm-0', no directory
[ 1411.123488] WARNING: CPU: 16 PID: 23326 at fs/kernfs/dir.c:1509 kernfs_remove_by_name_ns+0x81/0x90
[ 1411.132446] Modules linked in: dm_queue_length dm_multipath tcm_loop target_core_user uio target_core_pscsi target_core_file target_core_iblock iscsi_target_mod target_core_mod dm_mod nfsv3 nfs_acl nfs lockd grace sunrpc intel_rapl_msr intel_rapl_common skx_edac nfit intel_powerclamp coretemp kvm_intel kvm iTCO_wdt iTCO_vendor_support irqbypass crct10dif_pclmul crc32_pclmul nd_pmem mei_me ghash_clmulni_intel i2c_i801 nd_btt ipmi_si joydev pcspkr mei sg i2c_smbus lpc_ich wmi ipmi_devintf ipmi_msghandler acpi_pad acpi_power_meter ip_tables xfs libcrc32c sd_mod ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm_ttm_helper ttm drm ahci nvme libahci i40e nvme_core libata crc32c_intel i2c_core t10_pi
[ 1411.197935] CPU: 16 PID: 23326 Comm: dmsetup Tainted: G        W         5.14.0-rc4.snitm+ #196
[ 1411.206624] Hardware name: Supermicro SYS-1029P-WTR/X11DDW-L, BIOS 2.0a 12/06/2017
[ 1411.214190] RIP: 0010:kernfs_remove_by_name_ns+0x81/0x90
[ 1411.219504] Code: 45 8f 58 00 31 c0 5b 5d 41 5c c3 48 c7 c7 e0 e1 6a 85 e8 32 8f 58 00 b8 fe ff ff ff eb e8 48 c7 c7 c8 78 f3 84 e8 9e ef 53 00 <0f> 0b b8 fe ff ff ff eb d3 66 0f 1f 44 00 00 0f 1f 44 00 00 41 57
[ 1411.238251] RSP: 0018:ffffb6c3a198fc00 EFLAGS: 00010286
[ 1411.243474] RAX: 0000000000000000 RBX: ffff963d80d08980 RCX: 0000000000000000
[ 1411.250608] RDX: 0000000000000001 RSI: ffff963d600979d0 RDI: ffff963d600979d0
[ 1411.257741] RBP: ffff96360771a7d8 R08: 0000000000000000 R09: c0000000ffff7fff
[ 1411.264875] R10: 0000000000000001 R11: ffffb6c3a198fa10 R12: ffff963d835f5800
[ 1411.272007] R13: ffff963d835f5870 R14: dead000000000122 R15: dead000000000100
[ 1411.279140] FS:  00007f9e557e1840(0000) GS:ffff963d60080000(0000) knlGS:0000000000000000
[ 1411.287227] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1411.292973] CR2: 0000000002dd9020 CR3: 000000014209e002 CR4: 00000000007706e0
[ 1411.300102] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1411.307238] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1411.314370] PKRU: 55555554
[ 1411.317082] Call Trace:
[ 1411.319536]  bd_unlink_disk_holder+0x78/0xc0
[ 1411.323815]  dm_put_table_device+0x5a/0xf0 [dm_mod]
[ 1411.328697]  dm_put_device+0x83/0xe0 [dm_mod]
[ 1411.333063]  ? dm_put_path_selector+0x30/0x40 [dm_multipath]
[ 1411.338721]  free_priority_group+0x8b/0xc0 [dm_multipath]
[ 1411.344121]  free_multipath+0x6a/0xa0 [dm_multipath]
[ 1411.349088]  ? table_load+0x2d0/0x2d0 [dm_mod]
[ 1411.353545]  dm_table_destroy+0x62/0x140 [dm_mod]
[ 1411.358257]  ? table_load+0x2d0/0x2d0 [dm_mod]
[ 1411.362703]  dev_suspend+0xe6/0x290 [dm_mod]
[ 1411.366976]  ctl_ioctl+0x1af/0x420 [dm_mod]
[ 1411.371162]  dm_ctl_ioctl+0xa/0x10 [dm_mod]
[ 1411.375350]  __x64_sys_ioctl+0x84/0xc0
[ 1411.379102]  do_syscall_64+0x3a/0x80
[ 1411.382683]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 1411.387731] RIP: 0033:0x7f9e550aa567

The trace that finally crashed was:
[ 1413.924355] general protection fault, probably for non-canonical address 0xdead000000000122: 0000 [#1] SMP PTI
[ 1413.924356] CPU: 0 PID: 23394 Comm: dmsetup Tainted: G        W         5.14.0-rc4.snitm+ #196
[ 1413.924357] Hardware name: Supermicro SYS-1029P-WTR/X11DDW-L, BIOS 2.0a 12/06/2017
[ 1413.924358] RIP: 0010:string_nocheck+0x12/0x70
[ 1413.924358] Code: 00 00 4c 89 e2 be 20 00 00 00 48 89 ef e8 e6 a4 00 00 4c 01 e3 eb 81 90 49 89 f2 48 89 ce 48 89 f8 48 c1 fe 30 66 85 f6 74 4f <44> 0f b6 0a 45 84 c9 74 46 83 ee 01 41 b8 01 00 00 00 48 8d 7c 37
[ 1413.924359] RSP: 0018:ffffb6c3a1a6f9d0 EFLAGS: 00010086
[ 1413.924360] RAX: ffffb6c3a1a6faf2 RBX: ffffb6c3a1a6fae0 RCX: ffff0a00ffffff04
[ 1413.924361] RDX: dead000000000122 RSI: ffffffffffffffff RDI: ffffb6c3a1a6faf2
[ 1413.924361] RBP: dead000000000122 R08: 0000000000000009 R09: 0000000000000000
[ 1413.924362] R10: ffffb6c3a1a6fae0 R11: ffffb6c3a1a6f988 R12: ffff0a00ffffff04
[ 1413.924362] R13: ffffffff84f37876 R14: 0000000000000008 R15: ffffffff84f37876
[ 1413.924362] FS:  00007f967061f840(0000) GS:ffff963d5fe00000(0000) knlGS:0000000000000000
[ 1413.924363] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1413.924363] CR2: 0000000002bfc000 CR3: 00000001e22c0002 CR4: 00000000007706f0
[ 1413.924363] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1413.924364] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1413.924364] PKRU: 55555554
[ 1413.924364] Call Trace:
[ 1413.924364]  string+0x40/0x50
[ 1413.924364]  vsnprintf+0x339/0x520
[ 1413.924365]  vprintk_store+0xad/0x440
[ 1413.924365]  ? __irq_work_queue_local+0x48/0x50
[ 1413.924365]  ? irq_work_queue+0x16/0x20
[ 1413.924366]  ? wake_up_klogd.part.31+0x30/0x40
[ 1413.924366]  ? vprintk_emit+0x11a/0x240
[ 1413.924366]  vprintk_emit+0xf7/0x240
[ 1413.924367]  __warn_printk+0x6b/0x87
[ 1413.924367]  ? kernfs_put+0xd0/0x190
[ 1413.924367]  kernfs_find_ns+0x9f/0xc0
[ 1413.924368]  kernfs_remove_by_name_ns+0x31/0x90
[ 1413.924368]  bd_unlink_disk_holder+0x78/0xc0
[ 1413.924369]  dm_put_table_device+0x5a/0xf0 [dm_mod]
[ 1413.924369]  dm_put_device+0x83/0xe0 [dm_mod]
[ 1413.924369]  ? dm_put_path_selector+0x30/0x40 [dm_multipath]
[ 1413.924369]  free_priority_group+0x8b/0xc0 [dm_multipath]
[ 1413.924370]  free_multipath+0x6a/0xa0 [dm_multipath]
[ 1413.924370]  ? table_load+0x2d0/0x2d0 [dm_mod]
[ 1413.924370]  dm_table_destroy+0x62/0x140 [dm_mod]
[ 1413.924370]  ? table_load+0x2d0/0x2d0 [dm_mod]
[ 1413.924371]  dev_suspend+0xe6/0x290 [dm_mod]
[ 1413.924371]  ctl_ioctl+0x1af/0x420 [dm_mod]
[ 1413.924371]  dm_ctl_ioctl+0xa/0x10 [dm_mod]
[ 1413.924372]  __x64_sys_ioctl+0x84/0xc0
[ 1413.924372]  do_syscall_64+0x3a/0x80
[ 1413.924373]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 1413.924373] RIP: 0033:0x7f966fee8567



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [dm-devel] holders not working properly, regression [was: Re: use regular gendisk registration in device mapper v2]
  2021-08-19 15:58 ` [dm-devel] holders not working properly, regression [was: Re: use regular gendisk registration in device mapper v2] Mike Snitzer
@ 2021-08-19 18:05   ` Christoph Hellwig
  2021-08-19 22:08     ` Mike Snitzer
  0 siblings, 1 reply; 35+ messages in thread
From: Christoph Hellwig @ 2021-08-19 18:05 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Jens Axboe, linux-block, dm-devel, Tushar Sugandhi, Christoph Hellwig

Manually reverting "block: remove the extra kobject reference in
bd_link_disk_holder" as shown below fixed the issue for me.  I'll spend
some more time tomorrow trying to fully understand the lifetime rules
before sending a patch, though.

---
From 6b94f5435900d23769db8d07ff47415aab4ac63e Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Thu, 19 Aug 2021 20:01:43 +0200
Subject: Revert "block: remove the extra kobject reference in
 bd_link_disk_holder"

This reverts commit fbd9a39542ecdd2ade55869c13856b2590db3df8.
---
 block/holder.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/block/holder.c b/block/holder.c
index 4568cc4f6827..ecbc6941e7d8 100644
--- a/block/holder.c
+++ b/block/holder.c
@@ -106,6 +106,12 @@ int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk)
 	}
 
 	list_add(&holder->list, &disk->slave_bdevs);
+	/*
+	 * bdev could be deleted beneath us which would implicitly destroy
+	 * the holder directory.  Hold on to it.
+	 */
+	kobject_get(bdev->bd_holder_dir);
+
 out_unlock:
 	mutex_unlock(&disk->open_mutex);
 	return ret;
@@ -138,6 +144,7 @@ void bd_unlink_disk_holder(struct block_device *bdev, struct gendisk *disk)
 	if (!WARN_ON_ONCE(holder == NULL) && !--holder->refcnt) {
 		if (disk->slave_dir)
 			__unlink_disk_holder(bdev, disk);
+		kobject_put(bdev->bd_holder_dir);
 		list_del_init(&holder->list);
 		kfree(holder);
 	}
-- 
2.30.2
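
For readers following along, here is a rough userspace sketch of what the
restored kobject_get()/kobject_put() pair appears to buy.  It is only a toy
model under my own assumptions: struct fake_kobj and the helpers below are
hypothetical stand-ins, not the kernel kobject API, and the real lifetime
rules are exactly what still needs to be understood.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a kobject: a refcount plus a "released" marker. */
struct fake_kobj {
	int refcount;
	bool released;
};

static void fake_kobj_init(struct fake_kobj *k)
{
	k->refcount = 1;
	k->released = false;
}

static void fake_kobj_get(struct fake_kobj *k)
{
	assert(!k->released);
	k->refcount++;
}

static void fake_kobj_put(struct fake_kobj *k)
{
	assert(k->refcount > 0);
	if (--k->refcount == 0)
		k->released = true;
}

/*
 * Simulate: link a holder, delete the bdev beneath it, then unlink.
 * Returns whether the holder directory still exists when unlink runs.
 */
static bool holder_dir_alive_at_unlink(bool take_extra_ref)
{
	struct fake_kobj holder_dir;
	bool alive;

	fake_kobj_init(&holder_dir);		/* reference owned by the bdev */

	if (take_extra_ref)
		fake_kobj_get(&holder_dir);	/* link: the restored kobject_get() */

	fake_kobj_put(&holder_dir);		/* bdev deleted beneath the holder */

	alive = !holder_dir.released;		/* what unlink would find */

	if (take_extra_ref)
		fake_kobj_put(&holder_dir);	/* unlink: the restored kobject_put() */

	return alive;
}
```

In this model the directory only survives until unlink when the extra
reference is taken, which would be consistent with the "no directory"
warning in the reported trace.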



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [dm-devel] holders not working properly, regression [was: Re: use regular gendisk registration in device mapper v2]
  2021-08-19 18:05   ` Christoph Hellwig
@ 2021-08-19 22:08     ` Mike Snitzer
  0 siblings, 0 replies; 35+ messages in thread
From: Mike Snitzer @ 2021-08-19 22:08 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jens Axboe, linux-block, dm-devel, Tushar Sugandhi

On Thu, Aug 19 2021 at  2:05P -0400,
Christoph Hellwig <hch@lst.de> wrote:

> Manually reverting "block: remove the extra kobject reference in
> bd_link_disk_holder" as shown below fixed the issue for me.  I'll spend
> some more time tomorrow trying to fully understand the lifetime rules
> before sending a patch, though.
> 
> ---
> From 6b94f5435900d23769db8d07ff47415aab4ac63e Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig <hch@lst.de>
> Date: Thu, 19 Aug 2021 20:01:43 +0200
> Subject: Revert "block: remove the extra kobject reference in
>  bd_link_disk_holder"
> 
> This reverts commit fbd9a39542ecdd2ade55869c13856b2590db3df8.
> ---
>  block/holder.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/block/holder.c b/block/holder.c
> index 4568cc4f6827..ecbc6941e7d8 100644
> --- a/block/holder.c
> +++ b/block/holder.c
> @@ -106,6 +106,12 @@ int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk)
>  	}
>  
>  	list_add(&holder->list, &disk->slave_bdevs);
> +	/*
> +	 * bdev could be deleted beneath us which would implicitly destroy
> +	 * the holder directory.  Hold on to it.
> +	 */
> +	kobject_get(bdev->bd_holder_dir);
> +
>  out_unlock:
>  	mutex_unlock(&disk->open_mutex);
>  	return ret;
> @@ -138,6 +144,7 @@ void bd_unlink_disk_holder(struct block_device *bdev, struct gendisk *disk)
>  	if (!WARN_ON_ONCE(holder == NULL) && !--holder->refcnt) {
>  		if (disk->slave_dir)
>  			__unlink_disk_holder(bdev, disk);
> +		kobject_put(bdev->bd_holder_dir);
>  		list_del_init(&holder->list);
>  		kfree(holder);
>  	}
> -- 
> 2.30.2
> 

OK, this fixed it for me too, thanks.

Mike



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [dm-devel] [PATCH 4/8] block: support delayed holder registration
  2021-08-16 14:17           ` Guenter Roeck
@ 2021-08-20 15:08             ` Christoph Hellwig
  2021-08-21  3:17               ` Guenter Roeck
  0 siblings, 1 reply; 35+ messages in thread
From: Christoph Hellwig @ 2021-08-20 15:08 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Jens Axboe, linux-block, dm-devel, Christoph Hellwig, Mike Snitzer

Please try the patch below:

---
From 7609266da56160d211662cd2fbe26570aad11b15 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Fri, 20 Aug 2021 17:00:11 +0200
Subject: mtd_blkdevs: don't hold del_mtd_blktrans_dev in
 blktrans_{open,release}

There is nothing that this protects against except for slightly reducing
the window when new opens can appear just before calling del_gendisk.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/mtd/mtd_blkdevs.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/drivers/mtd/mtd_blkdevs.c b/drivers/mtd/mtd_blkdevs.c
index 44bea3f65060..6b81a1c9ccbe 100644
--- a/drivers/mtd/mtd_blkdevs.c
+++ b/drivers/mtd/mtd_blkdevs.c
@@ -207,7 +207,6 @@ static int blktrans_open(struct block_device *bdev, fmode_t mode)
 	if (!dev)
 		return -ERESTARTSYS; /* FIXME: busy loop! -arnd*/
 
-	mutex_lock(&mtd_table_mutex);
 	mutex_lock(&dev->lock);
 
 	if (dev->open)
@@ -233,7 +232,6 @@ static int blktrans_open(struct block_device *bdev, fmode_t mode)
 unlock:
 	dev->open++;
 	mutex_unlock(&dev->lock);
-	mutex_unlock(&mtd_table_mutex);
 	blktrans_dev_put(dev);
 	return ret;
 
@@ -244,7 +242,6 @@ static int blktrans_open(struct block_device *bdev, fmode_t mode)
 	module_put(dev->tr->owner);
 	kref_put(&dev->ref, blktrans_dev_release);
 	mutex_unlock(&dev->lock);
-	mutex_unlock(&mtd_table_mutex);
 	blktrans_dev_put(dev);
 	return ret;
 }
@@ -256,7 +253,6 @@ static void blktrans_release(struct gendisk *disk, fmode_t mode)
 	if (!dev)
 		return;
 
-	mutex_lock(&mtd_table_mutex);
 	mutex_lock(&dev->lock);
 
 	if (--dev->open)
@@ -272,7 +268,6 @@ static void blktrans_release(struct gendisk *disk, fmode_t mode)
 	}
 unlock:
 	mutex_unlock(&dev->lock);
-	mutex_unlock(&mtd_table_mutex);
 	blktrans_dev_put(dev);
 }
 
-- 
2.30.2
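
As an aside, the lock ordering problem that lockdep reported can be sketched
in userspace as a small dependency graph.  This is a simplified illustration
under my own assumptions, not how lockdep is implemented: the lock_graph
type, record_acquire() and the enum names are hypothetical, and only the
three locks visible in the report and the patch are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_LOCKS 8

/* Hypothetical lock ids, named after the locks in the report. */
enum { MTD_TABLE_MUTEX, OPEN_MUTEX, DEV_LOCK };

/* edge[a][b] is true when lock b was acquired while lock a was held. */
struct lock_graph {
	bool edge[MAX_LOCKS][MAX_LOCKS];
};

static void lock_graph_init(struct lock_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: is "to" reachable from "from" along recorded edges? */
static bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_LOCKS; next++)
		if (g->edge[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	return false;
}

/*
 * Record "acquired while held".  Returns false (and records nothing) if
 * the new edge would close a cycle, i.e. "acquired" already leads back
 * to "held" through earlier acquisitions.
 */
static bool record_acquire(struct lock_graph *g, int held, int acquired)
{
	bool seen[MAX_LOCKS] = { false };

	assert(held >= 0 && held < MAX_LOCKS);
	assert(acquired >= 0 && acquired < MAX_LOCKS);

	if (reachable(g, acquired, held, seen))
		return false;
	g->edge[held][acquired] = true;
	return true;
}
```

Recording mtd_table_mutex -> open_mutex (the add path) and then
open_mutex -> mtd_table_mutex (the old open path) is rejected as a cycle,
while open_mutex -> dev->lock, which is all the open path takes after this
patch, is accepted.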



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [dm-devel] [PATCH 4/8] block: support delayed holder registration
  2021-08-20 15:08             ` Christoph Hellwig
@ 2021-08-21  3:17               ` Guenter Roeck
  0 siblings, 0 replies; 35+ messages in thread
From: Guenter Roeck @ 2021-08-21  3:17 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jens Axboe, linux-block, dm-devel, Mike Snitzer

On 8/20/21 8:08 AM, Christoph Hellwig wrote:
> Please try the patch below:
> 
> ---
> From 7609266da56160d211662cd2fbe26570aad11b15 Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig <hch@lst.de>
> Date: Fri, 20 Aug 2021 17:00:11 +0200
> Subject: mtd_blkdevs: don't hold del_mtd_blktrans_dev in
>   blktrans_{open,release}
> 
> There is nothing that this protects against except for slightly reducing
> the window when new opens can appear just before calling del_gendisk.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

A cautious

Tested-by: Guenter Roeck <linux@roeck-us.net>

Cautious because -next is a bit broken right now and I can not run a complete
test for all images.

Guenter

> ---
>   drivers/mtd/mtd_blkdevs.c | 5 -----
>   1 file changed, 5 deletions(-)
> 
> diff --git a/drivers/mtd/mtd_blkdevs.c b/drivers/mtd/mtd_blkdevs.c
> index 44bea3f65060..6b81a1c9ccbe 100644
> --- a/drivers/mtd/mtd_blkdevs.c
> +++ b/drivers/mtd/mtd_blkdevs.c
> @@ -207,7 +207,6 @@ static int blktrans_open(struct block_device *bdev, fmode_t mode)
>   	if (!dev)
>   		return -ERESTARTSYS; /* FIXME: busy loop! -arnd*/
>   
> -	mutex_lock(&mtd_table_mutex);
>   	mutex_lock(&dev->lock);
>   
>   	if (dev->open)
> @@ -233,7 +232,6 @@ static int blktrans_open(struct block_device *bdev, fmode_t mode)
>   unlock:
>   	dev->open++;
>   	mutex_unlock(&dev->lock);
> -	mutex_unlock(&mtd_table_mutex);
>   	blktrans_dev_put(dev);
>   	return ret;
>   
> @@ -244,7 +242,6 @@ static int blktrans_open(struct block_device *bdev, fmode_t mode)
>   	module_put(dev->tr->owner);
>   	kref_put(&dev->ref, blktrans_dev_release);
>   	mutex_unlock(&dev->lock);
> -	mutex_unlock(&mtd_table_mutex);
>   	blktrans_dev_put(dev);
>   	return ret;
>   }
> @@ -256,7 +253,6 @@ static void blktrans_release(struct gendisk *disk, fmode_t mode)
>   	if (!dev)
>   		return;
>   
> -	mutex_lock(&mtd_table_mutex);
>   	mutex_lock(&dev->lock);
>   
>   	if (--dev->open)
> @@ -272,7 +268,6 @@ static void blktrans_release(struct gendisk *disk, fmode_t mode)
>   	}
>   unlock:
>   	mutex_unlock(&dev->lock);
> -	mutex_unlock(&mtd_table_mutex);
>   	blktrans_dev_put(dev);
>   }
>   
> 



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [dm-devel] [PATCH 7/8] dm: delay registering the gendisk
  2021-08-04  9:41 ` [dm-devel] [PATCH 7/8] dm: delay registering the gendisk Christoph Hellwig
  2021-08-09 23:31   ` Alasdair G Kergon
@ 2022-07-07  3:29   ` Yu Kuai
  2022-07-07  5:24     ` Christoph Hellwig
  1 sibling, 1 reply; 35+ messages in thread
From: Yu Kuai @ 2022-07-07  3:29 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Mike Snitzer; +Cc: linux-block, dm-devel

Hi, Christoph

On 2021/08/04 17:41, Christoph Hellwig wrote:
> device mapper is currently the only outlier that tries to call
> register_disk after add_disk, leading to fairly inconsistent state
> of these block layer data structures.  Instead change device-mapper
> to just register the gendisk later now that the holder mechanism
> can cope with that.
> 
> Note that this introduces a user visible change: the dm kobject is
> now only visible after the initial table has been loaded.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Mike Snitzer <snitzer@redhat.com>

We found that this patch fixes a NULL pointer dereference crash in our test:
[   88.727918] BUG: kernel NULL pointer dereference, address: 00000000000001a0
[   88.730698] #PF: supervisor read access in kernel mode
[   88.731381] #PF: error_code(0x0000) - not-present page
[   88.732086] PGD 0 P4D 0
[   88.732441] Oops: 0000 [#1] PREEMPT SMP
[   88.732964] CPU: 1 PID: 1317 Comm: mount Not tainted 5.10.0-16691-gf6076432827d-dirty #169
[   88.734055] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-4
[   88.735819] RIP: 0010:__blk_mq_sched_bio_merge+0x9d/0x1a0
[   88.736544] Code: 87 1e 9d 89 d0 25 00 00 00 01 0f 85 ad 00 00 00 48 83 05 25 a1 37 0c 01 3
[   88.739040] RSP: 0018:ffffc90000473b50 EFLAGS: 00010202
[   88.739744] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffc90000473b98
[   88.740697] RDX: 0000000000001000 RSI: ffff8881080c7500 RDI: ffff888103a9cc18
[   88.741659] RBP: ffff88813bc80000 R08: 0000000000000001 R09: 0000000000000000
[   88.742611] R10: ffff88810710be30 R11: 0000000000000000 R12: ffff888103a9cc18
[   88.743551] R13: ffff8881080c7500 R14: 0000000000000001 R15: 0000000000000000
[   88.744501] FS:  00007f51bcdbb040(0000) GS:ffff88813bc80000(0000) knlGS:0000000000000000
[   88.745581] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   88.746345] CR2: 00000000000001a0 CR3: 000000010d715000 CR4: 00000000000006e0
[   88.747298] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   88.748253] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

[   88.749204] Call Trace:
[   88.749549]  blk_mq_submit_bio+0x115/0xd80
[   88.750124]  submit_bio_noacct+0x4ff/0x610
[   88.750692]  submit_bio+0xaa/0x1a0
[   88.751149]  submit_bh_wbc+0x1cb/0x2f0
[   88.751662]  submit_bh+0x17/0x20
[   88.752102]  ext4_read_bh+0x63/0x170
[   88.752588]  ext4_read_bh_lock+0x2c/0xd0
[   88.753125]  __ext4_sb_bread_gfp.isra.0+0xa0/0xf0
[   88.753766]  ext4_fill_super+0x21f/0x5610
[   88.754317]  ? pointer+0x31b/0x5a0
[   88.754796]  ? vsnprintf+0x131/0x7d0
[   88.755304]  mount_bdev+0x233/0x280
[   88.755791]  ? ext4_calculate_overhead+0x660/0x660
[   88.756461]  ext4_mount+0x19/0x30
[   88.756926]  legacy_get_tree+0x35/0x90
[   88.757450]  vfs_get_tree+0x29/0x100
[   88.757955]  ? capable+0x1d/0x30
[   88.758406]  path_mount+0x8a7/0x1150
[   88.758918]  do_mount+0x8d/0xc0
[   88.759360]  __se_sys_mount+0x14a/0x220
[   88.759906]  __x64_sys_mount+0x29/0x40
[   88.760431]  do_syscall_64+0x45/0x70
[   88.760931]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   88.761634] RIP: 0033:0x7f51bbe1623a
[   88.762135] Code: 48 8b 0d 51 dc 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 
66 2e 0f 1f 84 00 00 8
[   88.764657] RSP: 002b:00007fff173ae898 EFLAGS: 00000246 ORIG_RAX: 
00000000000000a5
[   88.765700] RAX: ffffffffffffffda RBX: 000056169a120030 RCX: 
00007f51bbe1623a
[   88.766675] RDX: 000056169a120210 RSI: 000056169a120250 RDI: 
000056169a120230
[   88.767642] RBP: 0000000000000000 R08: 0000000000000000 R09: 
00007fff173ad798
[   88.768619] R10: 00000000c0ed0000 R11: 0000000000000246 R12: 
000056169a120230
[   88.769605] R13: 000056169a120210 R14: 0000000000000000 R15: 
00007f51bcbac184
[   88.770611] Modules linked in: dm_service_time dm_multipath
[   88.771388] CR2: 00000000000001a0
[   88.776323] ---[ end trace ac5d86e09fdc7c98 ]---
[   88.777009] RIP: 0010:__blk_mq_sched_bio_merge+0x9d/0x1a0
[   88.778038] Code: 87 1e 9d 89 d0 25 00 00 00 01 0f 85 ad 00 00 00 48 
83 05 25 a1 37 0c 01 3
[   88.780708] RSP: 0018:ffffc90000473b50 EFLAGS: 00010202
[   88.781443] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 
ffffc90000473b98
[   88.782692] RDX: 0000000000001000 RSI: ffff8881080c7500 RDI: 
ffff888103a9cc18
[   88.783839] RBP: ffff88813bc80000 R08: 0000000000000001 R09: 
0000000000000000
[   88.784942] R10: ffff88810710be30 R11: 0000000000000000 R12: 
ffff888103a9cc18
[   88.786051] R13: ffff8881080c7500 R14: 0000000000000001 R15: 
0000000000000000
[   88.787142] FS:  00007f51bcdbb040(0000) GS:ffff88813bc80000(0000) 
knlGS:0000000000000000
[   88.788399] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   88.789444] CR2: 00007f10e97a5000 CR3: 000000010d715000 CR4: 
00000000000006e0
[   88.790586] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
0000000000000000
[   88.791686] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 
0000000000000400
[   88.792773] Kernel panic - not syncing: Fatal exception
[   88.793573] Kernel Offset: disabled
[   88.794052] ---[ end Kernel panic - not syncing: Fatal exception ]---

root cause:
t1 dm-mpath       t2 mount

alloc_dev
  md->queue = blk_alloc_queue
  add_disk_no_queue_reg

dm_setup_md_queue
  case DM_TYPE_REQUEST_BASED -> multipath
   md->disk->fops = &dm_rq_blk_dops;
                         ext4_fill_super
                         ┊__ext4_sb_bread_gfp
                         ┊ ext4_read_bh
                         ┊  submit_bio -> queue is not initialized yet
                         ┊   __blk_mq_sched_bio_merge
                         ┊    ctx = blk_mq_get_ctx(q); -> ctx is NULL
   dm_mq_init_request_queue
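
The ordering above can be modelled in user space. This is only a sketch
under stated assumptions: the struct, the enum and every model_* name are
hypothetical stand-ins for the kernel state, not kernel APIs. It shows that
with early registration a bio can reach a queue that is not initialized yet,
while with delayed registration the disk simply is not there to be opened.

```c
#include <stdbool.h>

/* Hypothetical outcome of an I/O attempt in this model. */
enum model_result { MODEL_NO_DEVICE, MODEL_OK, MODEL_NULL_DEREF };

/* Hypothetical stand-in for the relevant mapped_device state. */
struct model_md {
	bool disk_visible;	/* the disk can be opened by other tasks */
	bool queue_initialized;	/* the blk-mq queue setup has completed */
};

/* Models t2 (mount) submitting a bio against the device. */
enum model_result model_submit_bio(const struct model_md *md)
{
	if (!md->disk_visible)
		return MODEL_NO_DEVICE;		/* nothing to open yet */
	if (!md->queue_initialized)
		return MODEL_NULL_DEREF;	/* the crash in the log */
	return MODEL_OK;
}

/*
 * Models alloc_dev(): without delayed registration the disk becomes
 * visible here, before the queue is set up.
 */
void model_alloc_dev(struct model_md *md, bool delayed_registration)
{
	md->queue_initialized = false;
	md->disk_visible = !delayed_registration;
}

/* Models dm_setup_md_queue(): queue first, then the disk is visible. */
void model_setup_md_queue(struct model_md *md)
{
	md->queue_initialized = true;
	md->disk_visible = true;
}

/* Result of I/O issued between alloc and queue setup. */
enum model_result model_race_window(bool delayed_registration)
{
	struct model_md md;

	model_alloc_dev(&md, delayed_registration);
	return model_submit_bio(&md);
}
```

In this model, model_race_window(false) reports the NULL dereference and
model_race_window(true) reports that no device exists yet.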

Do you think it's OK to backport this patch (and all related patches) to
LTS, or would it be better to fix, in the block layer, that a bio can be
submitted while the queue is uninitialized?

Thanks,
Kuai
> ---
>   drivers/md/dm-rq.c |  1 -
>   drivers/md/dm.c    | 23 +++++++++++------------
>   2 files changed, 11 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
> index 0dbd48cbdff9..5b95eea517d1 100644
> --- a/drivers/md/dm-rq.c
> +++ b/drivers/md/dm-rq.c
> @@ -559,7 +559,6 @@ int dm_mq_init_request_queue(struct mapped_device *md, struct dm_table *t)
>   	err = blk_mq_init_allocated_queue(md->tag_set, md->queue);
>   	if (err)
>   		goto out_tag_set;
> -	elevator_init_mq(md->queue);
>   	return 0;
>   
>   out_tag_set:
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index f003bd5b93ce..7981b7287628 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1693,7 +1693,10 @@ static void cleanup_mapped_device(struct mapped_device *md)
>   		spin_lock(&_minor_lock);
>   		md->disk->private_data = NULL;
>   		spin_unlock(&_minor_lock);
> -		del_gendisk(md->disk);
> +		if (dm_get_md_type(md) != DM_TYPE_NONE) {
> +			dm_sysfs_exit(md);
> +			del_gendisk(md->disk);
> +		}
>   		dm_queue_destroy_keyslot_manager(md->queue);
>   		blk_cleanup_disk(md->disk);
>   	}
> @@ -1788,7 +1791,6 @@ static struct mapped_device *alloc_dev(int minor)
>   			goto bad;
>   	}
>   
> -	add_disk_no_queue_reg(md->disk);
>   	format_dev_t(md->name, MKDEV(_major, minor));
>   
>   	md->wq = alloc_workqueue("kdmflush", WQ_MEM_RECLAIM, 0);
> @@ -1989,19 +1991,12 @@ static struct dm_table *__unbind(struct mapped_device *md)
>    */
>   int dm_create(int minor, struct mapped_device **result)
>   {
> -	int r;
>   	struct mapped_device *md;
>   
>   	md = alloc_dev(minor);
>   	if (!md)
>   		return -ENXIO;
>   
> -	r = dm_sysfs_init(md);
> -	if (r) {
> -		free_dev(md);
> -		return r;
> -	}
> -
>   	*result = md;
>   	return 0;
>   }
> @@ -2081,10 +2076,15 @@ int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t)
>   	r = dm_table_set_restrictions(t, md->queue, &limits);
>   	if (r)
>   		return r;
> -	md->type = type;
>   
> -	blk_register_queue(md->disk);
> +	add_disk(md->disk);
>   
> +	r = dm_sysfs_init(md);
> +	if (r) {
> +		del_gendisk(md->disk);
> +		return r;
> +	}
> +	md->type = type;
>   	return 0;
>   }
>   
> @@ -2190,7 +2190,6 @@ static void __dm_destroy(struct mapped_device *md, bool wait)
>   		DMWARN("%s: Forcibly removing mapped_device still in use! (%d users)",
>   		       dm_device_name(md), atomic_read(&md->holders));
>   
> -	dm_sysfs_exit(md);
>   	dm_table_destroy(__unbind(md));
>   	free_dev(md);
>   }
> 


* Re: [dm-devel] [PATCH 7/8] dm: delay registering the gendisk
  2022-07-07  3:29   ` Yu Kuai
@ 2022-07-07  5:24     ` Christoph Hellwig
  2022-07-07  7:20       ` Yu Kuai
  0 siblings, 1 reply; 35+ messages in thread
From: Christoph Hellwig @ 2022-07-07  5:24 UTC (permalink / raw)
  To: Yu Kuai
  Cc: Jens Axboe, linux-block, dm-devel, Christoph Hellwig, Mike Snitzer

On Thu, Jul 07, 2022 at 11:29:26AM +0800, Yu Kuai wrote:
> We found that this patch fixes a NULL pointer crash in our test:

> Do you think it's OK to backport this patch (and all related patches) to
> LTS, or would it be better to fix, in the block layer, that a bio can be
> submitted while the queue is uninitialized?

Given how long ago this was I do not remember offhand how much prep
work this would require.  The patch itself is of course tiny and
backportable, but someone will need to do the work and figure out how
much else would have to be backported.


* Re: [dm-devel] [PATCH 7/8] dm: delay registering the gendisk
  2022-07-07  5:24     ` Christoph Hellwig
@ 2022-07-07  7:20       ` Yu Kuai
  2022-07-15  3:24         ` Yu Kuai
  0 siblings, 1 reply; 35+ messages in thread
From: Yu Kuai @ 2022-07-07  7:20 UTC (permalink / raw)
  To: Christoph Hellwig, Yu Kuai
  Cc: Jens Axboe, linux-block, dm-devel, Mike Snitzer

On 2022/07/07 13:24, Christoph Hellwig wrote:
> On Thu, Jul 07, 2022 at 11:29:26AM +0800, Yu Kuai wrote:
>> We found that this patch fixes a NULL pointer crash in our test:
> 
>> Do you think it's OK to backport this patch (and all related patches) to
>> LTS, or would it be better to fix, in the block layer, that a bio can be
>> submitted while the queue is uninitialized?
> 
> Given how long ago this was I do not remember offhand how much prep
> work this would require.  The patch itself is of course tiny and
> backportable, but someone will need to do the work and figure out how
> much else would have to be backported.

OK, I'll try to figure that out and backport them (at least to 5.10.y).

Thanks,
Kuai
> .
> 


* Re: [dm-devel] [PATCH 7/8] dm: delay registering the gendisk
  2022-07-07  7:20       ` Yu Kuai
@ 2022-07-15  3:24         ` Yu Kuai
  2022-08-01  1:04           ` Yu Kuai
  0 siblings, 1 reply; 35+ messages in thread
From: Yu Kuai @ 2022-07-15  3:24 UTC (permalink / raw)
  To: Yu Kuai, Christoph Hellwig
  Cc: Jens Axboe, linux-block, dm-devel, Mike Snitzer

Hi, Christoph!

On 2022/07/07 15:20, Yu Kuai wrote:
> On 2022/07/07 13:24, Christoph Hellwig wrote:
>> On Thu, Jul 07, 2022 at 11:29:26AM +0800, Yu Kuai wrote:
>>> We found that this patch fixes a NULL pointer crash in our test:
>>
>>> Do you think it's OK to backport this patch (and all related patches) to
>>> LTS, or would it be better to fix, in the block layer, that a bio can be
>>> submitted while the queue is uninitialized?
>>
>> Given how long ago this was I do not remember offhand how much prep
>> work this would require.  The patch itself is of course tiny and
>> backportable, but someone will need to do the work and figure out how
>> much else would have to be backported.
> 
> OK, I'll try to figure that out and backport them (at least to 5.10.y).

While reviewing the code, I didn't find any protection that prevents
bd_link_disk_holder() from running concurrently with
bd_register_pending_holders(). If they can run concurrently, the
following scenario is problematic:

t1				t2
device_add_disk
  disk->slave_dir = kobject_create_and_add
				bd_link_disk_holder
				 __link_disk_holder
				 list_add
  bd_register_pending_holders
   list_for_each_entry
    __link_disk_holder -> -EEXIST

In this case, I think simply ignoring '-EEXIST' may be fine.
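
As a rough illustration of that idea, here is a user-space sketch. It is
not the kernel code: struct model_disk and both model_* functions are
hypothetical stand-ins for the holder list and for what
__link_disk_holder() and bd_register_pending_holders() do. The only point
is that the pending pass treats an already existing link as success and
still propagates every other error.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_MAX_LINKS 8

/* Hypothetical stand-in for the links already created under a disk. */
struct model_disk {
	int linked[MODEL_MAX_LINKS];	/* holder ids that have a link */
	size_t nr_linked;
};

/* Create one link; a duplicate fails with -EEXIST in this model. */
int model_link_holder(struct model_disk *disk, int holder_id)
{
	for (size_t i = 0; i < disk->nr_linked; i++)
		if (disk->linked[i] == holder_id)
			return -EEXIST;
	if (disk->nr_linked == MODEL_MAX_LINKS)
		return -ENOMEM;
	disk->linked[disk->nr_linked++] = holder_id;
	return 0;
}

/*
 * Register every pending holder. A holder that raced in and was linked
 * already is not an error here; anything else is returned to the caller.
 */
int model_register_pending(struct model_disk *disk,
			   const int *pending, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		int ret = model_link_holder(disk, pending[i]);

		if (ret && ret != -EEXIST)
			return ret;
	}
	return 0;
}
```

Whether swallowing the error is acceptable in the real code, or whether the
two paths should be serialized instead, is exactly the open question here.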

I'm not familiar with dm and may have missed something, so please
correct me if I'm wrong.

Thanks,
Kuai


* Re: [dm-devel] [PATCH 7/8] dm: delay registering the gendisk
  2022-07-15  3:24         ` Yu Kuai
@ 2022-08-01  1:04           ` Yu Kuai
  0 siblings, 0 replies; 35+ messages in thread
From: Yu Kuai @ 2022-08-01  1:04 UTC (permalink / raw)
  To: Yu Kuai, Christoph Hellwig
  Cc: Jens Axboe, Mike Snitzer, zhangyi (F), linux-block, dm-devel, yukuai (C)

Hi, Christoph!

On 2022/07/15 11:24, Yu Kuai wrote:
> Hi, Christoph!
> 
> On 2022/07/07 15:20, Yu Kuai wrote:
>> On 2022/07/07 13:24, Christoph Hellwig wrote:
>>> On Thu, Jul 07, 2022 at 11:29:26AM +0800, Yu Kuai wrote:
>>>> We found that this patch fixes a NULL pointer crash in our test:
>>>
>>>> Do you think it's OK to backport this patch (and all related patches) to
>>>> LTS, or would it be better to fix, in the block layer, that a bio can be
>>>> submitted while the queue is uninitialized?
>>>
>>> Given how long ago this was I do not remember offhand how much prep
>>> work this would require.  The patch itself is of course tiny and
>>> backportable, but someone will need to do the work and figure out how
>>> much else would have to be backported.
>>
>> OK, I'll try to figure that out and backport them (at least to 5.10.y).

I posted a backport patchset for stable 5.10; can you please take a look?

dm: fix nullptr crash
https://lore.kernel.org/all/20220729062356.1663513-1-yukuai1@huaweicloud.com/

Thanks,
Kuai
> 
> While reviewing the code, I didn't find any protection that prevents
> bd_link_disk_holder() from running concurrently with
> bd_register_pending_holders(). If they can run concurrently, the
> following scenario is problematic:
> 
> t1                t2
> device_add_disk
>   disk->slave_dir = kobject_create_and_add
>                  bd_link_disk_holder
>                   __link_disk_holder
>                   list_add
>   bd_register_pending_holders
>    list_for_each_entry
>     __link_disk_holder -> -EEXIST
> 
> In this case, I think simply ignoring '-EEXIST' may be fine.
> 
> I'm not familiar with dm, and I'm not sure if I missed something,
> please kindly correct me if I'm wrong.
> 
> Thanks,
> Kuai
> 
> .
> 


* Re: [dm-devel] [PATCH 8/8] block: remove support for delayed queue registrations
  2021-07-25  5:54 ` [dm-devel] [PATCH 8/8] block: remove support for delayed queue registrations Christoph Hellwig
@ 2021-07-29 16:37   ` Mike Snitzer
  0 siblings, 0 replies; 35+ messages in thread
From: Mike Snitzer @ 2021-07-29 16:37 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jens Axboe, linux-block, dm-devel

On Sun, Jul 25 2021 at  1:54P -0400,
Christoph Hellwig <hch@lst.de> wrote:

> Now that device mapper has been changed to register the disk once
> it is fully ready all this code is unused.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Mike Snitzer <snitzer@redhat.com>


> ---
>  block/elevator.c      |  1 -
>  block/genhd.c         | 29 +++++++----------------------
>  include/linux/genhd.h |  6 ------
>  3 files changed, 7 insertions(+), 29 deletions(-)
> 
> diff --git a/block/elevator.c b/block/elevator.c
> index 52ada14cfe45..706d5a64508d 100644
> --- a/block/elevator.c
> +++ b/block/elevator.c
> @@ -702,7 +702,6 @@ void elevator_init_mq(struct request_queue *q)
>  		elevator_put(e);
>  	}
>  }
> -EXPORT_SYMBOL_GPL(elevator_init_mq); /* only for dm-rq */
>  
>  /*
>   * switch to new_e io scheduler. be careful not to introduce deadlocks -
> diff --git a/block/genhd.c b/block/genhd.c
> index e3d93b868ec5..3cd9f165a5a7 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -457,20 +457,20 @@ static void register_disk(struct device *parent, struct gendisk *disk,
>  }
>  
>  /**
> - * __device_add_disk - add disk information to kernel list
> + * device_add_disk - add disk information to kernel list
>   * @parent: parent device for the disk
>   * @disk: per-device partitioning information
>   * @groups: Additional per-device sysfs groups
> - * @register_queue: register the queue if set to true
>   *
>   * This function registers the partitioning information in @disk
>   * with the kernel.
>   *
>   * FIXME: error handling
>   */
> -static void __device_add_disk(struct device *parent, struct gendisk *disk,
> -			      const struct attribute_group **groups,
> -			      bool register_queue)
> +
> +void device_add_disk(struct device *parent, struct gendisk *disk,
> +		     const struct attribute_group **groups)
> +
>  {
>  	int ret;
>  
> @@ -480,8 +480,7 @@ static void __device_add_disk(struct device *parent, struct gendisk *disk,
>  	 * elevator if one is needed, that is, for devices requesting queue
>  	 * registration.
>  	 */
> -	if (register_queue)
> -		elevator_init_mq(disk->queue);
> +	elevator_init_mq(disk->queue);
>  
>  	/*
>  	 * If the driver provides an explicit major number it also must provide
> @@ -535,8 +534,7 @@ static void __device_add_disk(struct device *parent, struct gendisk *disk,
>  		bdev_add(disk->part0, dev->devt);
>  	}
>  	register_disk(parent, disk, groups);
> -	if (register_queue)
> -		blk_register_queue(disk);
> +	blk_register_queue(disk);
>  
>  	/*
>  	 * Take an extra ref on queue which will be put on disk_release()
> @@ -550,21 +548,8 @@ static void __device_add_disk(struct device *parent, struct gendisk *disk,
>  	disk_add_events(disk);
>  	blk_integrity_add(disk);
>  }
> -
> -void device_add_disk(struct device *parent, struct gendisk *disk,
> -		     const struct attribute_group **groups)
> -
> -{
> -	__device_add_disk(parent, disk, groups, true);
> -}
>  EXPORT_SYMBOL(device_add_disk);
>  
> -void device_add_disk_no_queue_reg(struct device *parent, struct gendisk *disk)
> -{
> -	__device_add_disk(parent, disk, NULL, false);
> -}
> -EXPORT_SYMBOL(device_add_disk_no_queue_reg);
> -
>  /**
>   * del_gendisk - remove the gendisk
>   * @disk: the struct gendisk to remove
> diff --git a/include/linux/genhd.h b/include/linux/genhd.h
> index dd95d53c75fa..fbc4bf269f63 100644
> --- a/include/linux/genhd.h
> +++ b/include/linux/genhd.h
> @@ -218,12 +218,6 @@ static inline void add_disk(struct gendisk *disk)
>  {
>  	device_add_disk(NULL, disk, NULL);
>  }
> -extern void device_add_disk_no_queue_reg(struct device *parent, struct gendisk *disk);
> -static inline void add_disk_no_queue_reg(struct gendisk *disk)
> -{
> -	device_add_disk_no_queue_reg(NULL, disk);
> -}
> -
>  extern void del_gendisk(struct gendisk *gp);
>  
>  void set_disk_ro(struct gendisk *disk, bool read_only);
> -- 
> 2.30.2
> 


* [dm-devel] [PATCH 8/8] block: remove support for delayed queue registrations
  2021-07-25  5:54 [dm-devel] use regular gendisk registration in device mapper Christoph Hellwig
@ 2021-07-25  5:54 ` Christoph Hellwig
  2021-07-29 16:37   ` Mike Snitzer
  0 siblings, 1 reply; 35+ messages in thread
From: Christoph Hellwig @ 2021-07-25  5:54 UTC (permalink / raw)
  To: Jens Axboe, Mike Snitzer; +Cc: linux-block, dm-devel

Now that device mapper has been changed to register the disk once
it is fully ready all this code is unused.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/elevator.c      |  1 -
 block/genhd.c         | 29 +++++++----------------------
 include/linux/genhd.h |  6 ------
 3 files changed, 7 insertions(+), 29 deletions(-)

diff --git a/block/elevator.c b/block/elevator.c
index 52ada14cfe45..706d5a64508d 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -702,7 +702,6 @@ void elevator_init_mq(struct request_queue *q)
 		elevator_put(e);
 	}
 }
-EXPORT_SYMBOL_GPL(elevator_init_mq); /* only for dm-rq */
 
 /*
  * switch to new_e io scheduler. be careful not to introduce deadlocks -
diff --git a/block/genhd.c b/block/genhd.c
index e3d93b868ec5..3cd9f165a5a7 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -457,20 +457,20 @@ static void register_disk(struct device *parent, struct gendisk *disk,
 }
 
 /**
- * __device_add_disk - add disk information to kernel list
+ * device_add_disk - add disk information to kernel list
  * @parent: parent device for the disk
  * @disk: per-device partitioning information
  * @groups: Additional per-device sysfs groups
- * @register_queue: register the queue if set to true
  *
  * This function registers the partitioning information in @disk
  * with the kernel.
  *
  * FIXME: error handling
  */
-static void __device_add_disk(struct device *parent, struct gendisk *disk,
-			      const struct attribute_group **groups,
-			      bool register_queue)
+
+void device_add_disk(struct device *parent, struct gendisk *disk,
+		     const struct attribute_group **groups)
+
 {
 	int ret;
 
@@ -480,8 +480,7 @@ static void __device_add_disk(struct device *parent, struct gendisk *disk,
 	 * elevator if one is needed, that is, for devices requesting queue
 	 * registration.
 	 */
-	if (register_queue)
-		elevator_init_mq(disk->queue);
+	elevator_init_mq(disk->queue);
 
 	/*
 	 * If the driver provides an explicit major number it also must provide
@@ -535,8 +534,7 @@ static void __device_add_disk(struct device *parent, struct gendisk *disk,
 		bdev_add(disk->part0, dev->devt);
 	}
 	register_disk(parent, disk, groups);
-	if (register_queue)
-		blk_register_queue(disk);
+	blk_register_queue(disk);
 
 	/*
 	 * Take an extra ref on queue which will be put on disk_release()
@@ -550,21 +548,8 @@ static void __device_add_disk(struct device *parent, struct gendisk *disk,
 	disk_add_events(disk);
 	blk_integrity_add(disk);
 }
-
-void device_add_disk(struct device *parent, struct gendisk *disk,
-		     const struct attribute_group **groups)
-
-{
-	__device_add_disk(parent, disk, groups, true);
-}
 EXPORT_SYMBOL(device_add_disk);
 
-void device_add_disk_no_queue_reg(struct device *parent, struct gendisk *disk)
-{
-	__device_add_disk(parent, disk, NULL, false);
-}
-EXPORT_SYMBOL(device_add_disk_no_queue_reg);
-
 /**
  * del_gendisk - remove the gendisk
  * @disk: the struct gendisk to remove
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index dd95d53c75fa..fbc4bf269f63 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -218,12 +218,6 @@ static inline void add_disk(struct gendisk *disk)
 {
 	device_add_disk(NULL, disk, NULL);
 }
-extern void device_add_disk_no_queue_reg(struct device *parent, struct gendisk *disk);
-static inline void add_disk_no_queue_reg(struct gendisk *disk)
-{
-	device_add_disk_no_queue_reg(NULL, disk);
-}
-
 extern void del_gendisk(struct gendisk *gp);
 
 void set_disk_ro(struct gendisk *disk, bool read_only);
-- 
2.30.2


end of thread, other threads:[~2022-08-01  1:05 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-04  9:41 [dm-devel] use regular gendisk registration in device mapper v2 Christoph Hellwig
2021-08-04  9:41 ` [dm-devel] [PATCH 1/8] block: make the block holder code optional Christoph Hellwig
2021-08-04  9:41 ` [dm-devel] [PATCH 2/8] block: remove the extra kobject reference in bd_link_disk_holder Christoph Hellwig
2021-08-04  9:41 ` [dm-devel] [PATCH 3/8] block: look up holders by bdev Christoph Hellwig
2021-08-04  9:41 ` [dm-devel] [PATCH 4/8] block: support delayed holder registration Christoph Hellwig
     [not found]   ` <CGME20210810213058eucas1p109323e3c3ecaa76d37d8cf63b6d8ecfd@eucas1p1.samsung.com>
2021-08-10 21:30     ` Marek Szyprowski
2021-08-14 21:13   ` Guenter Roeck
2021-08-15  7:07     ` Christoph Hellwig
2021-08-15 14:27       ` Guenter Roeck
2021-08-16  7:21         ` Christoph Hellwig
2021-08-16 14:17           ` Guenter Roeck
2021-08-20 15:08             ` Christoph Hellwig
2021-08-21  3:17               ` Guenter Roeck
2021-08-18  2:51           ` Guenter Roeck
2021-08-04  9:41 ` [dm-devel] [PATCH 5/8] dm: cleanup cleanup_mapped_device Christoph Hellwig
2021-08-04  9:41 ` [dm-devel] [PATCH 6/8] dm: move setting md->type into dm_setup_md_queue Christoph Hellwig
2021-08-04  9:41 ` [dm-devel] [PATCH 7/8] dm: delay registering the gendisk Christoph Hellwig
2021-08-09 23:31   ` Alasdair G Kergon
2021-08-10  0:17     ` Alasdair G Kergon
2021-08-10 13:12     ` Peter Rajnoha
2021-08-10 15:05       ` Alasdair G Kergon
2022-07-07  3:29   ` Yu Kuai
2022-07-07  5:24     ` Christoph Hellwig
2022-07-07  7:20       ` Yu Kuai
2022-07-15  3:24         ` Yu Kuai
2022-08-01  1:04           ` Yu Kuai
2021-08-04  9:41 ` [dm-devel] [PATCH 8/8] block: remove support for delayed queue registrations Christoph Hellwig
2021-08-09 17:51 ` [dm-devel] use regular gendisk registration in device mapper v2 Jens Axboe
2021-08-10  0:36 ` Alasdair G Kergon
2021-08-10 14:41   ` Alasdair G Kergon
2021-08-19 15:58 ` [dm-devel] holders not working properly, regression [was: Re: use regular gendisk registration in device mapper v2] Mike Snitzer
2021-08-19 18:05   ` Christoph Hellwig
2021-08-19 22:08     ` Mike Snitzer
  -- strict thread matches above, loose matches on Subject: below --
2021-07-25  5:54 [dm-devel] use regular gendisk registration in device mapper Christoph Hellwig
2021-07-25  5:54 ` [dm-devel] [PATCH 8/8] block: remove support for delayed queue registrations Christoph Hellwig
2021-07-29 16:37   ` Mike Snitzer
