All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 00/12] Simplify block integrity registration v3
@ 2015-10-21 17:19 Dan Williams
  2015-10-21 17:19 ` [PATCH v3 01/12] block: Move integrity kobject to struct gendisk Dan Williams
                   ` (12 more replies)
  0 siblings, 13 replies; 30+ messages in thread
From: Dan Williams @ 2015-10-21 17:19 UTC (permalink / raw)


Changes since v2 [1]:

1/ Fold in static declaration fix to "block: Consolidate static
   integrity profile properties"

2/ Rebase on latest for-4.4/drivers from block.git

3/ Merge the pending lifetime fixes

[1]: http://lists.infradead.org/pipermail/linux-nvme/2015-October/002783.html

---

Dan Williams (7):
      md, dm, scsi, nvme, libnvdimm: drop blk_integrity_unregister() at shutdown
      md: suspend i/o during runtime blk_integrity_unregister
      nvme: suspend i/o during runtime blk_integrity_unregister
      block: generic request_queue reference counting
      block: move blk_integrity to request_queue
      block: blk_flush_integrity() for bio-based drivers
      block, libnvdimm, nvme: provide a built-in blk_integrity nop profile

Martin K. Petersen (5):
      block: Move integrity kobject to struct gendisk
      block: Consolidate static integrity profile properties
      block: Reduce the size of struct blk_integrity
      block: Export integrity data interval size in sysfs
      block: Inline blk_integrity in struct gendisk


 Documentation/ABI/testing/sysfs-block |    7 +
 block/bio-integrity.c                 |   17 ++-
 block/blk-core.c                      |   74 +++++++++++--
 block/blk-integrity.c                 |  192 +++++++++++++++------------------
 block/blk-mq-sysfs.c                  |    6 -
 block/blk-mq.c                        |   80 ++++----------
 block/blk-sysfs.c                     |    3 -
 block/blk.h                           |   22 ++++
 block/genhd.c                         |    2 
 block/partition-generic.c             |    1 
 block/t10-pi.c                        |   16 +--
 drivers/md/dm-table.c                 |   88 ++++++++-------
 drivers/md/dm.c                       |    2 
 drivers/md/md.c                       |   11 +-
 drivers/md/multipath.c                |    2 
 drivers/md/raid1.c                    |    2 
 drivers/md/raid10.c                   |    2 
 drivers/nvdimm/btt.c                  |    1 
 drivers/nvdimm/core.c                 |   21 +---
 drivers/nvme/host/pci.c               |   34 +-----
 drivers/scsi/sd.c                     |    1 
 drivers/scsi/sd_dif.c                 |   31 +++--
 drivers/target/target_core_iblock.c   |   10 +-
 fs/block_dev.c                        |    2 
 include/linux/blk-mq.h                |    1 
 include/linux/blkdev.h                |   48 ++++----
 include/linux/genhd.h                 |   26 ++++
 include/linux/t10-pi.h                |    8 +
 28 files changed, 367 insertions(+), 343 deletions(-)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v3 01/12] block: Move integrity kobject to struct gendisk
  2015-10-21 17:19 [PATCH v3 00/12] Simplify block integrity registration v3 Dan Williams
@ 2015-10-21 17:19 ` Dan Williams
  2015-10-21 17:19 ` [PATCH v3 02/12] block: Consolidate static integrity profile properties Dan Williams
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 30+ messages in thread
From: Dan Williams @ 2015-10-21 17:19 UTC (permalink / raw)


From: Martin K. Petersen <martin.petersen@oracle.com>

The integrity kobject purely exists to support the integrity
subdirectory in sysfs and doesn't really have anything to do with the
blk_integrity data structure. Move the kobject to struct gendisk where
it belongs.

Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
Reported-by: Christoph Hellwig <hch at lst.de>
Reviewed-by: Sagi Grimberg <sagig at mellanox.com>
Signed-off-by: Dan Williams <dan.j.williams at intel.com>
---
 block/blk-integrity.c  |   22 +++++++++++-----------
 include/linux/blkdev.h |    2 --
 include/linux/genhd.h  |    1 +
 3 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index 75f29cf70188..182bfd2383ea 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -249,8 +249,8 @@ struct integrity_sysfs_entry {
 static ssize_t integrity_attr_show(struct kobject *kobj, struct attribute *attr,
 				   char *page)
 {
-	struct blk_integrity *bi =
-		container_of(kobj, struct blk_integrity, kobj);
+	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
+	struct blk_integrity *bi = blk_get_integrity(disk);
 	struct integrity_sysfs_entry *entry =
 		container_of(attr, struct integrity_sysfs_entry, attr);
 
@@ -261,8 +261,8 @@ static ssize_t integrity_attr_store(struct kobject *kobj,
 				    struct attribute *attr, const char *page,
 				    size_t count)
 {
-	struct blk_integrity *bi =
-		container_of(kobj, struct blk_integrity, kobj);
+	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
+	struct blk_integrity *bi = blk_get_integrity(disk);
 	struct integrity_sysfs_entry *entry =
 		container_of(attr, struct integrity_sysfs_entry, attr);
 	ssize_t ret = 0;
@@ -385,8 +385,8 @@ subsys_initcall(blk_dev_integrity_init);
 
 static void blk_integrity_release(struct kobject *kobj)
 {
-	struct blk_integrity *bi =
-		container_of(kobj, struct blk_integrity, kobj);
+	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
+	struct blk_integrity *bi = blk_get_integrity(disk);
 
 	kmem_cache_free(integrity_cachep, bi);
 }
@@ -429,14 +429,14 @@ int blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
 		if (!bi)
 			return -1;
 
-		if (kobject_init_and_add(&bi->kobj, &integrity_ktype,
+		if (kobject_init_and_add(&disk->integrity_kobj, &integrity_ktype,
 					 &disk_to_dev(disk)->kobj,
 					 "%s", "integrity")) {
 			kmem_cache_free(integrity_cachep, bi);
 			return -1;
 		}
 
-		kobject_uevent(&bi->kobj, KOBJ_ADD);
+		kobject_uevent(&disk->integrity_kobj, KOBJ_ADD);
 
 		bi->flags |= BLK_INTEGRITY_VERIFY | BLK_INTEGRITY_GENERATE;
 		bi->interval = queue_logical_block_size(disk->queue);
@@ -479,9 +479,9 @@ void blk_integrity_unregister(struct gendisk *disk)
 
 	bi = disk->integrity;
 
-	kobject_uevent(&bi->kobj, KOBJ_REMOVE);
-	kobject_del(&bi->kobj);
-	kobject_put(&bi->kobj);
+	kobject_uevent(&disk->integrity_kobj, KOBJ_REMOVE);
+	kobject_del(&disk->integrity_kobj);
+	kobject_put(&disk->integrity_kobj);
 	disk->integrity = NULL;
 }
 EXPORT_SYMBOL(blk_integrity_unregister);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 19c2e947d4d1..830f9c07d4bb 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1472,8 +1472,6 @@ struct blk_integrity {
 	unsigned short		tag_size;
 
 	const char		*name;
-
-	struct kobject		kobj;
 };
 
 extern bool blk_integrity_is_initialized(struct gendisk *);
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 2adbfa6d02bc..9e6e0dfa97ad 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -199,6 +199,7 @@ struct gendisk {
 	struct disk_events *ev;
 #ifdef  CONFIG_BLK_DEV_INTEGRITY
 	struct blk_integrity *integrity;
+	struct kobject integrity_kobj;
 #endif
 	int node_id;
 };

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v3 02/12] block: Consolidate static integrity profile properties
  2015-10-21 17:19 [PATCH v3 00/12] Simplify block integrity registration v3 Dan Williams
  2015-10-21 17:19 ` [PATCH v3 01/12] block: Move integrity kobject to struct gendisk Dan Williams
@ 2015-10-21 17:19 ` Dan Williams
  2015-10-21 17:19 ` [PATCH v3 03/12] block: Reduce the size of struct blk_integrity Dan Williams
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 30+ messages in thread
From: Dan Williams @ 2015-10-21 17:19 UTC (permalink / raw)


From: Martin K. Petersen <martin.petersen@oracle.com>

We previously made a complete copy of a device's data integrity profile
even though several of the fields inside the blk_integrity struct are
pointers to fixed template entries in t10-pi.c.

Split the static and per-device portions so that we can reference the
template directly.

Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
Reported-by: Christoph Hellwig <hch at lst.de>
Reviewed-by: Sagi Grimberg <sagig at mellanox.com>
Cc: Dan Williams <dan.j.williams at intel.com>
Signed-off-by: Dan Williams <dan.j.williams at intel.com>
---
 block/bio-integrity.c               |    8 ++++----
 block/blk-integrity.c               |   17 ++++++++---------
 block/t10-pi.c                      |   16 ++++------------
 drivers/nvdimm/core.c               |   11 +++++++----
 drivers/nvme/host/pci.c             |    8 ++++----
 drivers/scsi/sd_dif.c               |   31 ++++++++++++++++++-------------
 drivers/target/target_core_iblock.c |   10 +++++-----
 include/linux/blkdev.h              |   20 +++++++++++---------
 include/linux/t10-pi.h              |    8 ++++----
 9 files changed, 65 insertions(+), 64 deletions(-)

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 14b8faf8b09d..a10ffe19a8dd 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -177,11 +177,11 @@ bool bio_integrity_enabled(struct bio *bio)
 	if (bi == NULL)
 		return false;
 
-	if (bio_data_dir(bio) == READ && bi->verify_fn != NULL &&
+	if (bio_data_dir(bio) == READ && bi->profile->verify_fn != NULL &&
 	    (bi->flags & BLK_INTEGRITY_VERIFY))
 		return true;
 
-	if (bio_data_dir(bio) == WRITE && bi->generate_fn != NULL &&
+	if (bio_data_dir(bio) == WRITE && bi->profile->generate_fn != NULL &&
 	    (bi->flags & BLK_INTEGRITY_GENERATE))
 		return true;
 
@@ -340,7 +340,7 @@ int bio_integrity_prep(struct bio *bio)
 
 	/* Auto-generate integrity metadata if this is a write */
 	if (bio_data_dir(bio) == WRITE)
-		bio_integrity_process(bio, bi->generate_fn);
+		bio_integrity_process(bio, bi->profile->generate_fn);
 
 	return 0;
 }
@@ -361,7 +361,7 @@ static void bio_integrity_verify_fn(struct work_struct *work)
 	struct bio *bio = bip->bip_bio;
 	struct blk_integrity *bi = bdev_get_integrity(bio->bi_bdev);
 
-	bio->bi_error = bio_integrity_process(bio, bi->verify_fn);
+	bio->bi_error = bio_integrity_process(bio, bi->profile->verify_fn);
 
 	/* Restore original bio completion handler */
 	bio->bi_end_io = bip->bip_end_io;
diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index 182bfd2383ea..daf590ab3b46 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -176,10 +176,10 @@ int blk_integrity_compare(struct gendisk *gd1, struct gendisk *gd2)
 		return -1;
 	}
 
-	if (strcmp(b1->name, b2->name)) {
+	if (b1->profile != b2->profile) {
 		printk(KERN_ERR "%s: %s/%s type %s != %s\n", __func__,
 		       gd1->disk_name, gd2->disk_name,
-		       b1->name, b2->name);
+		       b1->profile->name, b2->profile->name);
 		return -1;
 	}
 
@@ -275,8 +275,8 @@ static ssize_t integrity_attr_store(struct kobject *kobj,
 
 static ssize_t integrity_format_show(struct blk_integrity *bi, char *page)
 {
-	if (bi != NULL && bi->name != NULL)
-		return sprintf(page, "%s\n", bi->name);
+	if (bi != NULL && bi->profile->name != NULL)
+		return sprintf(page, "%s\n", bi->profile->name);
 	else
 		return sprintf(page, "none\n");
 }
@@ -401,7 +401,8 @@ bool blk_integrity_is_initialized(struct gendisk *disk)
 {
 	struct blk_integrity *bi = blk_get_integrity(disk);
 
-	return (bi && bi->name && strcmp(bi->name, bi_unsupported_name) != 0);
+	return (bi && bi->profile->name && strcmp(bi->profile->name,
+						  bi_unsupported_name) != 0);
 }
 EXPORT_SYMBOL(blk_integrity_is_initialized);
 
@@ -446,14 +447,12 @@ int blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
 
 	/* Use the provided profile as template */
 	if (template != NULL) {
-		bi->name = template->name;
-		bi->generate_fn = template->generate_fn;
-		bi->verify_fn = template->verify_fn;
+		bi->profile = template->profile;
 		bi->tuple_size = template->tuple_size;
 		bi->tag_size = template->tag_size;
 		bi->flags |= template->flags;
 	} else
-		bi->name = bi_unsupported_name;
+		bi->profile->name = bi_unsupported_name;
 
 	disk->queue->backing_dev_info.capabilities |= BDI_CAP_STABLE_WRITES;
 
diff --git a/block/t10-pi.c b/block/t10-pi.c
index 24d6e9715318..2c97912335a9 100644
--- a/block/t10-pi.c
+++ b/block/t10-pi.c
@@ -160,38 +160,30 @@ static int t10_pi_type3_verify_ip(struct blk_integrity_iter *iter)
 	return t10_pi_verify(iter, t10_pi_ip_fn, 3);
 }
 
-struct blk_integrity t10_pi_type1_crc = {
+struct blk_integrity_profile t10_pi_type1_crc = {
 	.name			= "T10-DIF-TYPE1-CRC",
 	.generate_fn		= t10_pi_type1_generate_crc,
 	.verify_fn		= t10_pi_type1_verify_crc,
-	.tuple_size		= sizeof(struct t10_pi_tuple),
-	.tag_size		= 0,
 };
 EXPORT_SYMBOL(t10_pi_type1_crc);
 
-struct blk_integrity t10_pi_type1_ip = {
+struct blk_integrity_profile t10_pi_type1_ip = {
 	.name			= "T10-DIF-TYPE1-IP",
 	.generate_fn		= t10_pi_type1_generate_ip,
 	.verify_fn		= t10_pi_type1_verify_ip,
-	.tuple_size		= sizeof(struct t10_pi_tuple),
-	.tag_size		= 0,
 };
 EXPORT_SYMBOL(t10_pi_type1_ip);
 
-struct blk_integrity t10_pi_type3_crc = {
+struct blk_integrity_profile t10_pi_type3_crc = {
 	.name			= "T10-DIF-TYPE3-CRC",
 	.generate_fn		= t10_pi_type3_generate_crc,
 	.verify_fn		= t10_pi_type3_verify_crc,
-	.tuple_size		= sizeof(struct t10_pi_tuple),
-	.tag_size		= 0,
 };
 EXPORT_SYMBOL(t10_pi_type3_crc);
 
-struct blk_integrity t10_pi_type3_ip = {
+struct blk_integrity_profile t10_pi_type3_ip = {
 	.name			= "T10-DIF-TYPE3-IP",
 	.generate_fn		= t10_pi_type3_generate_ip,
 	.verify_fn		= t10_pi_type3_verify_ip,
-	.tuple_size		= sizeof(struct t10_pi_tuple),
-	.tag_size		= 0,
 };
 EXPORT_SYMBOL(t10_pi_type3_ip);
diff --git a/drivers/nvdimm/core.c b/drivers/nvdimm/core.c
index cb62ec6a12d0..7df89b547ae1 100644
--- a/drivers/nvdimm/core.c
+++ b/drivers/nvdimm/core.c
@@ -399,19 +399,22 @@ static int nd_pi_nop_generate_verify(struct blk_integrity_iter *iter)
 
 int nd_integrity_init(struct gendisk *disk, unsigned long meta_size)
 {
-	struct blk_integrity integrity = {
+	struct blk_integrity bi;
+	static struct blk_integrity_profile profile = {
 		.name = "ND-PI-NOP",
 		.generate_fn = nd_pi_nop_generate_verify,
 		.verify_fn = nd_pi_nop_generate_verify,
-		.tuple_size = meta_size,
-		.tag_size = meta_size,
 	};
 	int ret;
 
 	if (meta_size == 0)
 		return 0;
 
-	ret = blk_integrity_register(disk, &integrity);
+	bi.profile = &profile;
+	bi.tuple_size = meta_size;
+	bi.tag_size = meta_size;
+
+	ret = blk_integrity_register(disk, &bi);
 	if (ret)
 		return ret;
 
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 22d83752ae87..04e3d60a1e45 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -558,7 +558,7 @@ static int nvme_noop_generate(struct blk_integrity_iter *iter)
 	return 0;
 }
 
-struct blk_integrity nvme_meta_noop = {
+struct blk_integrity_profile nvme_meta_noop = {
 	.name			= "NVME_META_NOOP",
 	.generate_fn		= nvme_noop_generate,
 	.verify_fn		= nvme_noop_verify,
@@ -570,14 +570,14 @@ static void nvme_init_integrity(struct nvme_ns *ns)
 
 	switch (ns->pi_type) {
 	case NVME_NS_DPS_PI_TYPE3:
-		integrity = t10_pi_type3_crc;
+		integrity.profile = &t10_pi_type3_crc;
 		break;
 	case NVME_NS_DPS_PI_TYPE1:
 	case NVME_NS_DPS_PI_TYPE2:
-		integrity = t10_pi_type1_crc;
+		integrity.profile = &t10_pi_type1_crc;
 		break;
 	default:
-		integrity = nvme_meta_noop;
+		integrity.profile = &nvme_meta_noop;
 		break;
 	}
 	integrity.tuple_size = ns->ms;
diff --git a/drivers/scsi/sd_dif.c b/drivers/scsi/sd_dif.c
index 5c06d292b94c..987bf392c336 100644
--- a/drivers/scsi/sd_dif.c
+++ b/drivers/scsi/sd_dif.c
@@ -43,6 +43,7 @@ void sd_dif_config_host(struct scsi_disk *sdkp)
 	struct scsi_device *sdp = sdkp->device;
 	struct gendisk *disk = sdkp->disk;
 	u8 type = sdkp->protection_type;
+	struct blk_integrity bi;
 	int dif, dix;
 
 	dif = scsi_host_dif_capable(sdp->host, type);
@@ -55,39 +56,43 @@ void sd_dif_config_host(struct scsi_disk *sdkp)
 	if (!dix)
 		return;
 
+	memset(&bi, 0, sizeof(bi));
+
 	/* Enable DMA of protection information */
 	if (scsi_host_get_guard(sdkp->device->host) & SHOST_DIX_GUARD_IP) {
 		if (type == SD_DIF_TYPE3_PROTECTION)
-			blk_integrity_register(disk, &t10_pi_type3_ip);
+			bi.profile = &t10_pi_type3_ip;
 		else
-			blk_integrity_register(disk, &t10_pi_type1_ip);
+			bi.profile = &t10_pi_type1_ip;
 
-		disk->integrity->flags |= BLK_INTEGRITY_IP_CHECKSUM;
+		bi.flags |= BLK_INTEGRITY_IP_CHECKSUM;
 	} else
 		if (type == SD_DIF_TYPE3_PROTECTION)
-			blk_integrity_register(disk, &t10_pi_type3_crc);
+			bi.profile = &t10_pi_type3_crc;
 		else
-			blk_integrity_register(disk, &t10_pi_type1_crc);
+			bi.profile = &t10_pi_type1_crc;
 
+	bi.tuple_size = sizeof(struct t10_pi_tuple);
 	sd_printk(KERN_NOTICE, sdkp,
-		  "Enabling DIX %s protection\n", disk->integrity->name);
+		  "Enabling DIX %s protection\n", bi.profile->name);
 
-	/* Signal to block layer that we support sector tagging */
 	if (dif && type) {
-
-		disk->integrity->flags |= BLK_INTEGRITY_DEVICE_CAPABLE;
+		bi.flags |= BLK_INTEGRITY_DEVICE_CAPABLE;
 
 		if (!sdkp->ATO)
-			return;
+			goto out;
 
 		if (type == SD_DIF_TYPE3_PROTECTION)
-			disk->integrity->tag_size = sizeof(u16) + sizeof(u32);
+			bi.tag_size = sizeof(u16) + sizeof(u32);
 		else
-			disk->integrity->tag_size = sizeof(u16);
+			bi.tag_size = sizeof(u16);
 
 		sd_printk(KERN_NOTICE, sdkp, "DIF application tag size %u\n",
-			  disk->integrity->tag_size);
+			  bi.tag_size);
 	}
+
+out:
+	blk_integrity_register(disk, &bi);
 }
 
 /*
diff --git a/drivers/target/target_core_iblock.c b/drivers/target/target_core_iblock.c
index 0f19e11acac2..f29c69120054 100644
--- a/drivers/target/target_core_iblock.c
+++ b/drivers/target/target_core_iblock.c
@@ -155,17 +155,17 @@ static int iblock_configure_device(struct se_device *dev)
 	if (bi) {
 		struct bio_set *bs = ib_dev->ibd_bio_set;
 
-		if (!strcmp(bi->name, "T10-DIF-TYPE3-IP") ||
-		    !strcmp(bi->name, "T10-DIF-TYPE1-IP")) {
+		if (!strcmp(bi->profile->name, "T10-DIF-TYPE3-IP") ||
+		    !strcmp(bi->profile->name, "T10-DIF-TYPE1-IP")) {
 			pr_err("IBLOCK export of blk_integrity: %s not"
-			       " supported\n", bi->name);
+			       " supported\n", bi->profile->name);
 			ret = -ENOSYS;
 			goto out_blkdev_put;
 		}
 
-		if (!strcmp(bi->name, "T10-DIF-TYPE3-CRC")) {
+		if (!strcmp(bi->profile->name, "T10-DIF-TYPE3-CRC")) {
 			dev->dev_attrib.pi_prot_type = TARGET_DIF_TYPE3_PROT;
-		} else if (!strcmp(bi->name, "T10-DIF-TYPE1-CRC")) {
+		} else if (!strcmp(bi->profile->name, "T10-DIF-TYPE1-CRC")) {
 			dev->dev_attrib.pi_prot_type = TARGET_DIF_TYPE1_PROT;
 		}
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 830f9c07d4bb..f36c6476f1c7 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1462,16 +1462,18 @@ struct blk_integrity_iter {
 
 typedef int (integrity_processing_fn) (struct blk_integrity_iter *);
 
-struct blk_integrity {
-	integrity_processing_fn	*generate_fn;
-	integrity_processing_fn	*verify_fn;
-
-	unsigned short		flags;
-	unsigned short		tuple_size;
-	unsigned short		interval;
-	unsigned short		tag_size;
+struct blk_integrity_profile {
+	integrity_processing_fn		*generate_fn;
+	integrity_processing_fn		*verify_fn;
+	const char			*name;
+};
 
-	const char		*name;
+struct blk_integrity {
+	struct blk_integrity_profile	*profile;
+	unsigned short			flags;
+	unsigned short			tuple_size;
+	unsigned short			interval;
+	unsigned short			tag_size;
 };
 
 extern bool blk_integrity_is_initialized(struct gendisk *);
diff --git a/include/linux/t10-pi.h b/include/linux/t10-pi.h
index 6a8b9942632d..dd8de82cf5b5 100644
--- a/include/linux/t10-pi.h
+++ b/include/linux/t10-pi.h
@@ -14,9 +14,9 @@ struct t10_pi_tuple {
 };
 
 
-extern struct blk_integrity t10_pi_type1_crc;
-extern struct blk_integrity t10_pi_type1_ip;
-extern struct blk_integrity t10_pi_type3_crc;
-extern struct blk_integrity t10_pi_type3_ip;
+extern struct blk_integrity_profile t10_pi_type1_crc;
+extern struct blk_integrity_profile t10_pi_type1_ip;
+extern struct blk_integrity_profile t10_pi_type3_crc;
+extern struct blk_integrity_profile t10_pi_type3_ip;
 
 #endif

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v3 03/12] block: Reduce the size of struct blk_integrity
  2015-10-21 17:19 [PATCH v3 00/12] Simplify block integrity registration v3 Dan Williams
  2015-10-21 17:19 ` [PATCH v3 01/12] block: Move integrity kobject to struct gendisk Dan Williams
  2015-10-21 17:19 ` [PATCH v3 02/12] block: Consolidate static integrity profile properties Dan Williams
@ 2015-10-21 17:19 ` Dan Williams
  2015-10-21 17:19 ` [PATCH v3 04/12] block: Export integrity data interval size in sysfs Dan Williams
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 30+ messages in thread
From: Dan Williams @ 2015-10-21 17:19 UTC (permalink / raw)


From: Martin K. Petersen <martin.petersen@oracle.com>

The per-device properties in the blk_integrity structure were previously
unsigned short. However, most of the values fit inside a char. The only
exception is the data interval size and we can work around that by
storing it as a power of two.

This cuts the size of the dynamic portion of blk_integrity in half.

Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
Reported-by: Christoph Hellwig <hch at lst.de>
Reviewed-by: Sagi Grimberg <sagig at mellanox.com>
Signed-off-by: Dan Williams <dan.j.williams at intel.com>
---
 block/bio-integrity.c  |    4 ++--
 block/blk-integrity.c  |    6 +++---
 include/linux/blkdev.h |    8 ++++----
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index a10ffe19a8dd..6a90eca9cea1 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -202,7 +202,7 @@ EXPORT_SYMBOL(bio_integrity_enabled);
 static inline unsigned int bio_integrity_intervals(struct blk_integrity *bi,
 						   unsigned int sectors)
 {
-	return sectors >> (ilog2(bi->interval) - 9);
+	return sectors >> (bi->interval_exp - 9);
 }
 
 static inline unsigned int bio_integrity_bytes(struct blk_integrity *bi,
@@ -229,7 +229,7 @@ static int bio_integrity_process(struct bio *bio,
 		bip->bip_vec->bv_offset;
 
 	iter.disk_name = bio->bi_bdev->bd_disk->disk_name;
-	iter.interval = bi->interval;
+	iter.interval = 1 << bi->interval_exp;
 	iter.seed = bip_get_seed(bip);
 	iter.prot_buf = prot_buf;
 
diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index daf590ab3b46..c7508654faff 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -155,10 +155,10 @@ int blk_integrity_compare(struct gendisk *gd1, struct gendisk *gd2)
 	if (!b1 || !b2)
 		return -1;
 
-	if (b1->interval != b2->interval) {
+	if (b1->interval_exp != b2->interval_exp) {
 		pr_err("%s: %s/%s protection interval %u != %u\n",
 		       __func__, gd1->disk_name, gd2->disk_name,
-		       b1->interval, b2->interval);
+		       1 << b1->interval_exp, 1 << b2->interval_exp);
 		return -1;
 	}
 
@@ -440,7 +440,7 @@ int blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
 		kobject_uevent(&disk->integrity_kobj, KOBJ_ADD);
 
 		bi->flags |= BLK_INTEGRITY_VERIFY | BLK_INTEGRITY_GENERATE;
-		bi->interval = queue_logical_block_size(disk->queue);
+		bi->interval_exp = ilog2(queue_logical_block_size(disk->queue));
 		disk->integrity = bi;
 	} else
 		bi = disk->integrity;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index f36c6476f1c7..4f1968f15e30 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1470,10 +1470,10 @@ struct blk_integrity_profile {
 
 struct blk_integrity {
 	struct blk_integrity_profile	*profile;
-	unsigned short			flags;
-	unsigned short			tuple_size;
-	unsigned short			interval;
-	unsigned short			tag_size;
+	unsigned char			flags;
+	unsigned char			tuple_size;
+	unsigned char			interval_exp;
+	unsigned char			tag_size;
 };
 
 extern bool blk_integrity_is_initialized(struct gendisk *);

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v3 04/12] block: Export integrity data interval size in sysfs
  2015-10-21 17:19 [PATCH v3 00/12] Simplify block integrity registration v3 Dan Williams
                   ` (2 preceding siblings ...)
  2015-10-21 17:19 ` [PATCH v3 03/12] block: Reduce the size of struct blk_integrity Dan Williams
@ 2015-10-21 17:19 ` Dan Williams
  2015-10-21 17:19 ` [PATCH v3 05/12] block: Inline blk_integrity in struct gendisk Dan Williams
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 30+ messages in thread
From: Dan Williams @ 2015-10-21 17:19 UTC (permalink / raw)


From: Martin K. Petersen <martin.petersen@oracle.com>

The size of the data interval was not exported in the sysfs integrity
directory. Export it so that userland apps can tell whether the interval
is different from the device's logical block size.

Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
Reviewed-by: Sagi Grimberg <sagig at mellanox.com>
Signed-off-by: Dan Williams <dan.j.williams at intel.com>
---
 Documentation/ABI/testing/sysfs-block |    7 +++++++
 block/blk-integrity.c                 |   14 ++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-block b/Documentation/ABI/testing/sysfs-block
index 8df003963d99..71d184dbb70d 100644
--- a/Documentation/ABI/testing/sysfs-block
+++ b/Documentation/ABI/testing/sysfs-block
@@ -60,6 +60,13 @@ Description:
 		Indicates whether a storage device is capable of storing
 		integrity metadata. Set if the device is T10 PI-capable.
 
+What:		/sys/block/<disk>/integrity/protection_interval_bytes
+Date:		July 2015
+Contact:	Martin K. Petersen <martin.petersen at oracle.com>
+Description:
+		Describes the number of data bytes which are protected
+		by one integrity tuple. Typically the device's logical
+		block size.
 
 What:		/sys/block/<disk>/integrity/write_generate
 Date:		June 2008
diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index c7508654faff..7a96f57ed195 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -289,6 +289,14 @@ static ssize_t integrity_tag_size_show(struct blk_integrity *bi, char *page)
 		return sprintf(page, "0\n");
 }
 
+static ssize_t integrity_interval_show(struct blk_integrity *bi, char *page)
+{
+	if (bi != NULL)
+		return sprintf(page, "%u\n", 1 << bi->interval_exp);
+	else
+		return sprintf(page, "0\n");
+}
+
 static ssize_t integrity_verify_store(struct blk_integrity *bi,
 				      const char *page, size_t count)
 {
@@ -343,6 +351,11 @@ static struct integrity_sysfs_entry integrity_tag_size_entry = {
 	.show = integrity_tag_size_show,
 };
 
+static struct integrity_sysfs_entry integrity_interval_entry = {
+	.attr = { .name = "protection_interval_bytes", .mode = S_IRUGO },
+	.show = integrity_interval_show,
+};
+
 static struct integrity_sysfs_entry integrity_verify_entry = {
 	.attr = { .name = "read_verify", .mode = S_IRUGO | S_IWUSR },
 	.show = integrity_verify_show,
@@ -363,6 +376,7 @@ static struct integrity_sysfs_entry integrity_device_entry = {
 static struct attribute *integrity_attrs[] = {
 	&integrity_format_entry.attr,
 	&integrity_tag_size_entry.attr,
+	&integrity_interval_entry.attr,
 	&integrity_verify_entry.attr,
 	&integrity_generate_entry.attr,
 	&integrity_device_entry.attr,

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v3 05/12] block: Inline blk_integrity in struct gendisk
  2015-10-21 17:19 [PATCH v3 00/12] Simplify block integrity registration v3 Dan Williams
                   ` (3 preceding siblings ...)
  2015-10-21 17:19 ` [PATCH v3 04/12] block: Export integrity data interval size in sysfs Dan Williams
@ 2015-10-21 17:19 ` Dan Williams
  2015-10-21 17:19 ` [PATCH v3 06/12] md, dm, scsi, nvme, libnvdimm: drop blk_integrity_unregister() at shutdown Dan Williams
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 30+ messages in thread
From: Dan Williams @ 2015-10-21 17:19 UTC (permalink / raw)


From: Martin K. Petersen <martin.petersen@oracle.com>

Up until now the_integrity profile has been dynamically allocated and
attached to struct gendisk after the disk has been made active.

This causes problems because NVMe devices need to register the profile
prior to the partition table being read due to a mandatory metadata
buffer requirement. In addition, DM goes through hoops to deal with
preallocating, but not initializing integrity profiles.

Since the integrity profile is small (4 bytes + a pointer), Christoph
suggested moving it to struct gendisk proper. This requires several
changes:

 - Moving the blk_integrity definition to genhd.h.

 - Inlining blk_integrity in struct gendisk.

 - Removing the dynamic allocation code.

 - Adding helper functions which allow gendisk to set up and tear down
   the integrity sysfs dir when a disk is added/deleted.

 - Adding a blk_integrity_revalidate() callback for updating the stable
   pages bdi setting.

 - The calls that depend on whether a device has an integrity profile or
   not now key off of the bi->profile pointer.

 - Simplifying the integrity support routines in DM (Mike Snitzer).

Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
Reported-by: Christoph Hellwig <hch at lst.de>
Reviewed-by: Sagi Grimberg <sagig at mellanox.com>
Signed-off-by: Mike Snitzer <snitzer at redhat.com>
Cc: Dan Williams <dan.j.williams at intel.com>
Signed-off-by: Dan Williams <dan.j.williams at intel.com>
---
 block/blk-integrity.c     |  160 ++++++++++++++++-----------------------------
 block/genhd.c             |    2 +
 block/partition-generic.c |    1 
 drivers/md/dm-table.c     |   88 +++++++++++++------------
 drivers/md/md.c           |    9 +--
 drivers/nvdimm/core.c     |    6 --
 drivers/nvme/host/pci.c   |    5 +
 fs/block_dev.c            |    2 -
 include/linux/blkdev.h    |   34 ++++------
 include/linux/genhd.h     |   26 +++++++
 10 files changed, 152 insertions(+), 181 deletions(-)

diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index 7a96f57ed195..4615a3386798 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -30,10 +30,6 @@
 
 #include "blk.h"
 
-static struct kmem_cache *integrity_cachep;
-
-static const char *bi_unsupported_name = "unsupported";
-
 /**
  * blk_rq_count_integrity_sg - Count number of integrity scatterlist elements
  * @q:		request queue
@@ -146,13 +142,13 @@ EXPORT_SYMBOL(blk_rq_map_integrity_sg);
  */
 int blk_integrity_compare(struct gendisk *gd1, struct gendisk *gd2)
 {
-	struct blk_integrity *b1 = gd1->integrity;
-	struct blk_integrity *b2 = gd2->integrity;
+	struct blk_integrity *b1 = &gd1->integrity;
+	struct blk_integrity *b2 = &gd2->integrity;
 
-	if (!b1 && !b2)
+	if (!b1->profile && !b2->profile)
 		return 0;
 
-	if (!b1 || !b2)
+	if (!b1->profile || !b2->profile)
 		return -1;
 
 	if (b1->interval_exp != b2->interval_exp) {
@@ -163,21 +159,21 @@ int blk_integrity_compare(struct gendisk *gd1, struct gendisk *gd2)
 	}
 
 	if (b1->tuple_size != b2->tuple_size) {
-		printk(KERN_ERR "%s: %s/%s tuple sz %u != %u\n", __func__,
+		pr_err("%s: %s/%s tuple sz %u != %u\n", __func__,
 		       gd1->disk_name, gd2->disk_name,
 		       b1->tuple_size, b2->tuple_size);
 		return -1;
 	}
 
 	if (b1->tag_size && b2->tag_size && (b1->tag_size != b2->tag_size)) {
-		printk(KERN_ERR "%s: %s/%s tag sz %u != %u\n", __func__,
+		pr_err("%s: %s/%s tag sz %u != %u\n", __func__,
 		       gd1->disk_name, gd2->disk_name,
 		       b1->tag_size, b2->tag_size);
 		return -1;
 	}
 
 	if (b1->profile != b2->profile) {
-		printk(KERN_ERR "%s: %s/%s type %s != %s\n", __func__,
+		pr_err("%s: %s/%s type %s != %s\n", __func__,
 		       gd1->disk_name, gd2->disk_name,
 		       b1->profile->name, b2->profile->name);
 		return -1;
@@ -250,7 +246,7 @@ static ssize_t integrity_attr_show(struct kobject *kobj, struct attribute *attr,
 				   char *page)
 {
 	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
-	struct blk_integrity *bi = blk_get_integrity(disk);
+	struct blk_integrity *bi = &disk->integrity;
 	struct integrity_sysfs_entry *entry =
 		container_of(attr, struct integrity_sysfs_entry, attr);
 
@@ -262,7 +258,7 @@ static ssize_t integrity_attr_store(struct kobject *kobj,
 				    size_t count)
 {
 	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
-	struct blk_integrity *bi = blk_get_integrity(disk);
+	struct blk_integrity *bi = &disk->integrity;
 	struct integrity_sysfs_entry *entry =
 		container_of(attr, struct integrity_sysfs_entry, attr);
 	ssize_t ret = 0;
@@ -275,7 +271,7 @@ static ssize_t integrity_attr_store(struct kobject *kobj,
 
 static ssize_t integrity_format_show(struct blk_integrity *bi, char *page)
 {
-	if (bi != NULL && bi->profile->name != NULL)
+	if (bi->profile && bi->profile->name)
 		return sprintf(page, "%s\n", bi->profile->name);
 	else
 		return sprintf(page, "none\n");
@@ -283,18 +279,13 @@ static ssize_t integrity_format_show(struct blk_integrity *bi, char *page)
 
 static ssize_t integrity_tag_size_show(struct blk_integrity *bi, char *page)
 {
-	if (bi != NULL)
-		return sprintf(page, "%u\n", bi->tag_size);
-	else
-		return sprintf(page, "0\n");
+	return sprintf(page, "%u\n", bi->tag_size);
 }
 
 static ssize_t integrity_interval_show(struct blk_integrity *bi, char *page)
 {
-	if (bi != NULL)
-		return sprintf(page, "%u\n", 1 << bi->interval_exp);
-	else
-		return sprintf(page, "0\n");
+	return sprintf(page, "%u\n",
+		       bi->interval_exp ? 1 << bi->interval_exp : 0);
 }
 
 static ssize_t integrity_verify_store(struct blk_integrity *bi,
@@ -388,113 +379,78 @@ static const struct sysfs_ops integrity_ops = {
 	.store	= &integrity_attr_store,
 };
 
-static int __init blk_dev_integrity_init(void)
-{
-	integrity_cachep = kmem_cache_create("blkdev_integrity",
-					     sizeof(struct blk_integrity),
-					     0, SLAB_PANIC, NULL);
-	return 0;
-}
-subsys_initcall(blk_dev_integrity_init);
-
-static void blk_integrity_release(struct kobject *kobj)
-{
-	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
-	struct blk_integrity *bi = blk_get_integrity(disk);
-
-	kmem_cache_free(integrity_cachep, bi);
-}
-
 static struct kobj_type integrity_ktype = {
 	.default_attrs	= integrity_attrs,
 	.sysfs_ops	= &integrity_ops,
-	.release	= blk_integrity_release,
 };
 
-bool blk_integrity_is_initialized(struct gendisk *disk)
-{
-	struct blk_integrity *bi = blk_get_integrity(disk);
-
-	return (bi && bi->profile->name && strcmp(bi->profile->name,
-						  bi_unsupported_name) != 0);
-}
-EXPORT_SYMBOL(blk_integrity_is_initialized);
-
 /**
  * blk_integrity_register - Register a gendisk as being integrity-capable
  * @disk:	struct gendisk pointer to make integrity-aware
- * @template:	optional integrity profile to register
+ * @template:	block integrity profile to register
  *
- * Description: When a device needs to advertise itself as being able
- * to send/receive integrity metadata it must use this function to
- * register the capability with the block layer.  The template is a
- * blk_integrity struct with values appropriate for the underlying
- * hardware.  If template is NULL the new profile is allocated but
- * not filled out. See Documentation/block/data-integrity.txt.
+ * Description: When a device needs to advertise itself as being able to
+ * send/receive integrity metadata it must use this function to register
+ * the capability with the block layer. The template is a blk_integrity
+ * struct with values appropriate for the underlying hardware. See
+ * Documentation/block/data-integrity.txt.
  */
-int blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
+void blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
 {
-	struct blk_integrity *bi;
-
-	BUG_ON(disk == NULL);
+	struct blk_integrity *bi = &disk->integrity;
 
-	if (disk->integrity == NULL) {
-		bi = kmem_cache_alloc(integrity_cachep,
-				      GFP_KERNEL | __GFP_ZERO);
-		if (!bi)
-			return -1;
+	bi->flags = BLK_INTEGRITY_VERIFY | BLK_INTEGRITY_GENERATE |
+		template->flags;
+	bi->interval_exp = ilog2(queue_logical_block_size(disk->queue));
+	bi->profile = template->profile;
+	bi->tuple_size = template->tuple_size;
+	bi->tag_size = template->tag_size;
 
-		if (kobject_init_and_add(&disk->integrity_kobj, &integrity_ktype,
-					 &disk_to_dev(disk)->kobj,
-					 "%s", "integrity")) {
-			kmem_cache_free(integrity_cachep, bi);
-			return -1;
-		}
-
-		kobject_uevent(&disk->integrity_kobj, KOBJ_ADD);
-
-		bi->flags |= BLK_INTEGRITY_VERIFY | BLK_INTEGRITY_GENERATE;
-		bi->interval_exp = ilog2(queue_logical_block_size(disk->queue));
-		disk->integrity = bi;
-	} else
-		bi = disk->integrity;
-
-	/* Use the provided profile as template */
-	if (template != NULL) {
-		bi->profile = template->profile;
-		bi->tuple_size = template->tuple_size;
-		bi->tag_size = template->tag_size;
-		bi->flags |= template->flags;
-	} else
-		bi->profile->name = bi_unsupported_name;
-
-	disk->queue->backing_dev_info.capabilities |= BDI_CAP_STABLE_WRITES;
-
-	return 0;
+	blk_integrity_revalidate(disk);
 }
 EXPORT_SYMBOL(blk_integrity_register);
 
 /**
- * blk_integrity_unregister - Remove block integrity profile
- * @disk:	disk whose integrity profile to deallocate
+ * blk_integrity_unregister - Unregister block integrity profile
+ * @disk:	disk whose integrity profile to unregister
  *
- * Description: This function frees all memory used by the block
- * integrity profile.  To be called at device teardown.
+ * Description: This function unregisters the integrity capability from
+ * a block device.
  */
 void blk_integrity_unregister(struct gendisk *disk)
 {
-	struct blk_integrity *bi;
+	blk_integrity_revalidate(disk);
+	memset(&disk->integrity, 0, sizeof(struct blk_integrity));
+}
+EXPORT_SYMBOL(blk_integrity_unregister);
 
-	if (!disk || !disk->integrity)
+void blk_integrity_revalidate(struct gendisk *disk)
+{
+	struct blk_integrity *bi = &disk->integrity;
+
+	if (!(disk->flags & GENHD_FL_UP))
 		return;
 
-	disk->queue->backing_dev_info.capabilities &= ~BDI_CAP_STABLE_WRITES;
+	if (bi->profile)
+		disk->queue->backing_dev_info.capabilities |=
+			BDI_CAP_STABLE_WRITES;
+	else
+		disk->queue->backing_dev_info.capabilities &=
+			~BDI_CAP_STABLE_WRITES;
+}
 
-	bi = disk->integrity;
+void blk_integrity_add(struct gendisk *disk)
+{
+	if (kobject_init_and_add(&disk->integrity_kobj, &integrity_ktype,
+				 &disk_to_dev(disk)->kobj, "%s", "integrity"))
+		return;
 
+	kobject_uevent(&disk->integrity_kobj, KOBJ_ADD);
+}
+
+void blk_integrity_del(struct gendisk *disk)
+{
 	kobject_uevent(&disk->integrity_kobj, KOBJ_REMOVE);
 	kobject_del(&disk->integrity_kobj);
 	kobject_put(&disk->integrity_kobj);
-	disk->integrity = NULL;
 }
-EXPORT_SYMBOL(blk_integrity_unregister);
diff --git a/block/genhd.c b/block/genhd.c
index 0c706f33a599..e5cafa51567c 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -630,6 +630,7 @@ void add_disk(struct gendisk *disk)
 	WARN_ON(retval);
 
 	disk_add_events(disk);
+	blk_integrity_add(disk);
 }
 EXPORT_SYMBOL(add_disk);
 
@@ -638,6 +639,7 @@ void del_gendisk(struct gendisk *disk)
 	struct disk_part_iter piter;
 	struct hd_struct *part;
 
+	blk_integrity_del(disk);
 	disk_del_events(disk);
 
 	/* invalidate stuff */
diff --git a/block/partition-generic.c b/block/partition-generic.c
index e7711133284e..3b030157ec85 100644
--- a/block/partition-generic.c
+++ b/block/partition-generic.c
@@ -428,6 +428,7 @@ rescan:
 
 	if (disk->fops->revalidate_disk)
 		disk->fops->revalidate_disk(disk);
+	blk_integrity_revalidate(disk);
 	check_disk_size_change(disk, bdev);
 	bdev->bd_invalidated = 0;
 	if (!get_capacity(disk) || !(state = check_partition(disk, bdev)))
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index e76ed003769e..061152a43730 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1014,15 +1014,16 @@ static int dm_table_build_index(struct dm_table *t)
 	return r;
 }
 
+static bool integrity_profile_exists(struct gendisk *disk)
+{
+	return !!blk_get_integrity(disk);
+}
+
 /*
  * Get a disk whose integrity profile reflects the table's profile.
- * If %match_all is true, all devices' profiles must match.
- * If %match_all is false, all devices must at least have an
- * allocated integrity profile; but uninitialized is ok.
  * Returns NULL if integrity support was inconsistent or unavailable.
  */
-static struct gendisk * dm_table_get_integrity_disk(struct dm_table *t,
-						    bool match_all)
+static struct gendisk * dm_table_get_integrity_disk(struct dm_table *t)
 {
 	struct list_head *devices = dm_table_get_devices(t);
 	struct dm_dev_internal *dd = NULL;
@@ -1030,10 +1031,8 @@ static struct gendisk * dm_table_get_integrity_disk(struct dm_table *t,
 
 	list_for_each_entry(dd, devices, list) {
 		template_disk = dd->dm_dev->bdev->bd_disk;
-		if (!blk_get_integrity(template_disk))
+		if (!integrity_profile_exists(template_disk))
 			goto no_integrity;
-		if (!match_all && !blk_integrity_is_initialized(template_disk))
-			continue; /* skip uninitialized profiles */
 		else if (prev_disk &&
 			 blk_integrity_compare(prev_disk, template_disk) < 0)
 			goto no_integrity;
@@ -1052,34 +1051,40 @@ no_integrity:
 }
 
 /*
- * Register the mapped device for blk_integrity support if
- * the underlying devices have an integrity profile.  But all devices
- * may not have matching profiles (checking all devices isn't reliable
+ * Register the mapped device for blk_integrity support if the
+ * underlying devices have an integrity profile.  But all devices may
+ * not have matching profiles (checking all devices isn't reliable
  * during table load because this table may use other DM device(s) which
- * must be resumed before they will have an initialized integity profile).
- * Stacked DM devices force a 2 stage integrity profile validation:
- * 1 - during load, validate all initialized integrity profiles match
- * 2 - during resume, validate all integrity profiles match
+ * must be resumed before they will have an initialized integity
+ * profile).  Consequently, stacked DM devices force a 2 stage integrity
+ * profile validation: First pass during table load, final pass during
+ * resume.
  */
-static int dm_table_prealloc_integrity(struct dm_table *t, struct mapped_device *md)
+static int dm_table_register_integrity(struct dm_table *t)
 {
+	struct mapped_device *md = t->md;
 	struct gendisk *template_disk = NULL;
 
-	template_disk = dm_table_get_integrity_disk(t, false);
+	template_disk = dm_table_get_integrity_disk(t);
 	if (!template_disk)
 		return 0;
 
-	if (!blk_integrity_is_initialized(dm_disk(md))) {
+	if (!integrity_profile_exists(dm_disk(md))) {
 		t->integrity_supported = 1;
-		return blk_integrity_register(dm_disk(md), NULL);
+		/*
+		 * Register integrity profile during table load; we can do
+		 * this because the final profile must match during resume.
+		 */
+		blk_integrity_register(dm_disk(md),
+				       blk_get_integrity(template_disk));
+		return 0;
 	}
 
 	/*
-	 * If DM device already has an initalized integrity
+	 * If DM device already has an initialized integrity
 	 * profile the new profile should not conflict.
 	 */
-	if (blk_integrity_is_initialized(template_disk) &&
-	    blk_integrity_compare(dm_disk(md), template_disk) < 0) {
+	if (blk_integrity_compare(dm_disk(md), template_disk) < 0) {
 		DMWARN("%s: conflict with existing integrity profile: "
 		       "%s profile mismatch",
 		       dm_device_name(t->md),
@@ -1087,7 +1092,7 @@ static int dm_table_prealloc_integrity(struct dm_table *t, struct mapped_device
 		return 1;
 	}
 
-	/* Preserve existing initialized integrity profile */
+	/* Preserve existing integrity profile */
 	t->integrity_supported = 1;
 	return 0;
 }
@@ -1112,7 +1117,7 @@ int dm_table_complete(struct dm_table *t)
 		return r;
 	}
 
-	r = dm_table_prealloc_integrity(t, t->md);
+	r = dm_table_register_integrity(t);
 	if (r) {
 		DMERR("could not register integrity profile.");
 		return r;
@@ -1278,29 +1283,30 @@ combine_limits:
 }
 
 /*
- * Set the integrity profile for this device if all devices used have
- * matching profiles.  We're quite deep in the resume path but still
- * don't know if all devices (particularly DM devices this device
- * may be stacked on) have matching profiles.  Even if the profiles
- * don't match we have no way to fail (to resume) at this point.
+ * Verify that all devices have an integrity profile that matches the
+ * DM device's registered integrity profile.  If the profiles don't
+ * match then unregister the DM device's integrity profile.
  */
-static void dm_table_set_integrity(struct dm_table *t)
+static void dm_table_verify_integrity(struct dm_table *t)
 {
 	struct gendisk *template_disk = NULL;
 
-	if (!blk_get_integrity(dm_disk(t->md)))
-		return;
+	if (t->integrity_supported) {
+		/*
+		 * Verify that the original integrity profile
+		 * matches all the devices in this table.
+		 */
+		template_disk = dm_table_get_integrity_disk(t);
+		if (template_disk &&
+		    blk_integrity_compare(dm_disk(t->md), template_disk) >= 0)
+			return;
+	}
 
-	template_disk = dm_table_get_integrity_disk(t, true);
-	if (template_disk)
-		blk_integrity_register(dm_disk(t->md),
-				       blk_get_integrity(template_disk));
-	else if (blk_integrity_is_initialized(dm_disk(t->md)))
-		DMWARN("%s: device no longer has a valid integrity profile",
-		       dm_device_name(t->md));
-	else
+	if (integrity_profile_exists(dm_disk(t->md))) {
 		DMWARN("%s: unable to establish an integrity profile",
 		       dm_device_name(t->md));
+		blk_integrity_unregister(dm_disk(t->md));
+	}
 }
 
 static int device_flush_capable(struct dm_target *ti, struct dm_dev *dev,
@@ -1500,7 +1506,7 @@ void dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
 	else
 		queue_flag_set_unlocked(QUEUE_FLAG_NO_SG_MERGE, q);
 
-	dm_table_set_integrity(t);
+	dm_table_verify_integrity(t);
 
 	/*
 	 * Determine whether or not this queue's I/O timings contribute
diff --git a/drivers/md/md.c b/drivers/md/md.c
index c702de18207a..2af9d590e1a0 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -1962,12 +1962,9 @@ int md_integrity_register(struct mddev *mddev)
 	 * All component devices are integrity capable and have matching
 	 * profiles, register the common profile for the md device.
 	 */
-	if (blk_integrity_register(mddev->gendisk,
-			bdev_get_integrity(reference->bdev)) != 0) {
-		printk(KERN_ERR "md: failed to register integrity for %s\n",
-			mdname(mddev));
-		return -EINVAL;
-	}
+	blk_integrity_register(mddev->gendisk,
+			       bdev_get_integrity(reference->bdev));
+
 	printk(KERN_NOTICE "md: data integrity enabled on %s\n", mdname(mddev));
 	if (bioset_integrity_create(mddev->bio_set, BIO_POOL_SIZE)) {
 		printk(KERN_ERR "md: failed to create integrity pool for %s\n",
diff --git a/drivers/nvdimm/core.c b/drivers/nvdimm/core.c
index 7df89b547ae1..e85848caf8d2 100644
--- a/drivers/nvdimm/core.c
+++ b/drivers/nvdimm/core.c
@@ -405,7 +405,6 @@ int nd_integrity_init(struct gendisk *disk, unsigned long meta_size)
 		.generate_fn = nd_pi_nop_generate_verify,
 		.verify_fn = nd_pi_nop_generate_verify,
 	};
-	int ret;
 
 	if (meta_size == 0)
 		return 0;
@@ -414,10 +413,7 @@ int nd_integrity_init(struct gendisk *disk, unsigned long meta_size)
 	bi.tuple_size = meta_size;
 	bi.tag_size = meta_size;
 
-	ret = blk_integrity_register(disk, &bi);
-	if (ret)
-		return ret;
-
+	blk_integrity_register(disk, &bi);
 	blk_queue_max_integrity_segments(disk->queue, 1);
 
 	return 0;
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 04e3d60a1e45..8d2aeaaa3895 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -538,7 +538,7 @@ static void nvme_dif_remap(struct request *req,
 	virt = bip_get_seed(bip);
 	phys = nvme_block_nr(ns, blk_rq_pos(req));
 	nlb = (blk_rq_bytes(req) >> ns->lba_shift);
-	ts = ns->disk->integrity->tuple_size;
+	ts = ns->disk->integrity.tuple_size;
 
 	for (i = 0; i < nlb; i++, virt++, phys++) {
 		pi = (struct t10_pi_tuple *)p;
@@ -2044,8 +2044,7 @@ static int nvme_revalidate_disk(struct gendisk *disk)
 	ns->pi_type = pi_type;
 	blk_queue_logical_block_size(ns->queue, bs);
 
-	if (ns->ms && !blk_get_integrity(disk) && (disk->flags & GENHD_FL_UP) &&
-								!ns->ext)
+	if (ns->ms && !ns->ext)
 		nvme_init_integrity(ns);
 
 	if (ns->ms && !(ns->ms == 8 && ns->pi_type) && !blk_get_integrity(disk))
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 073bb57adab1..0a793c7930eb 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1075,7 +1075,7 @@ int revalidate_disk(struct gendisk *disk)
 
 	if (disk->fops->revalidate_disk)
 		ret = disk->fops->revalidate_disk(disk);
-
+	blk_integrity_revalidate(disk);
 	bdev = bdget_disk(disk, 0);
 	if (!bdev)
 		return ret;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 4f1968f15e30..60669c20190f 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1468,16 +1468,7 @@ struct blk_integrity_profile {
 	const char			*name;
 };
 
-struct blk_integrity {
-	struct blk_integrity_profile	*profile;
-	unsigned char			flags;
-	unsigned char			tuple_size;
-	unsigned char			interval_exp;
-	unsigned char			tag_size;
-};
-
-extern bool blk_integrity_is_initialized(struct gendisk *);
-extern int blk_integrity_register(struct gendisk *, struct blk_integrity *);
+extern void blk_integrity_register(struct gendisk *, struct blk_integrity *);
 extern void blk_integrity_unregister(struct gendisk *);
 extern int blk_integrity_compare(struct gendisk *, struct gendisk *);
 extern int blk_rq_map_integrity_sg(struct request_queue *, struct bio *,
@@ -1488,15 +1479,20 @@ extern bool blk_integrity_merge_rq(struct request_queue *, struct request *,
 extern bool blk_integrity_merge_bio(struct request_queue *, struct request *,
 				    struct bio *);
 
-static inline
-struct blk_integrity *bdev_get_integrity(struct block_device *bdev)
+static inline struct blk_integrity *blk_get_integrity(struct gendisk *disk)
 {
-	return bdev->bd_disk->integrity;
+	struct blk_integrity *bi = &disk->integrity;
+
+	if (!bi->profile)
+		return NULL;
+
+	return bi;
 }
 
-static inline struct blk_integrity *blk_get_integrity(struct gendisk *disk)
+static inline
+struct blk_integrity *bdev_get_integrity(struct block_device *bdev)
 {
-	return disk->integrity;
+	return blk_get_integrity(bdev->bd_disk);
 }
 
 static inline bool blk_integrity_rq(struct request *rq)
@@ -1570,10 +1566,9 @@ static inline int blk_integrity_compare(struct gendisk *a, struct gendisk *b)
 {
 	return 0;
 }
-static inline int blk_integrity_register(struct gendisk *d,
+static inline void blk_integrity_register(struct gendisk *d,
 					 struct blk_integrity *b)
 {
-	return 0;
 }
 static inline void blk_integrity_unregister(struct gendisk *d)
 {
@@ -1598,10 +1593,7 @@ static inline bool blk_integrity_merge_bio(struct request_queue *rq,
 {
 	return true;
 }
-static inline bool blk_integrity_is_initialized(struct gendisk *g)
-{
-	return 0;
-}
+
 static inline bool integrity_req_gap_back_merge(struct request *req,
 						struct bio *next)
 {
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 9e6e0dfa97ad..82f4911e0ad8 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -163,6 +163,18 @@ struct disk_part_tbl {
 
 struct disk_events;
 
+#if defined(CONFIG_BLK_DEV_INTEGRITY)
+
+struct blk_integrity {
+	struct blk_integrity_profile	*profile;
+	unsigned char			flags;
+	unsigned char			tuple_size;
+	unsigned char			interval_exp;
+	unsigned char			tag_size;
+};
+
+#endif	/* CONFIG_BLK_DEV_INTEGRITY */
+
 struct gendisk {
 	/* major, first_minor and minors are input parameters only,
 	 * don't use directly.  Use disk_devt() and disk_max_parts().
@@ -198,9 +210,9 @@ struct gendisk {
 	atomic_t sync_io;		/* RAID */
 	struct disk_events *ev;
 #ifdef  CONFIG_BLK_DEV_INTEGRITY
-	struct blk_integrity *integrity;
+	struct blk_integrity integrity;
 	struct kobject integrity_kobj;
-#endif
+#endif	/* CONFIG_BLK_DEV_INTEGRITY */
 	int node_id;
 };
 
@@ -728,6 +740,16 @@ static inline void part_nr_sects_write(struct hd_struct *part, sector_t size)
 #endif
 }
 
+#if defined(CONFIG_BLK_DEV_INTEGRITY)
+extern void blk_integrity_add(struct gendisk *);
+extern void blk_integrity_del(struct gendisk *);
+extern void blk_integrity_revalidate(struct gendisk *);
+#else	/* CONFIG_BLK_DEV_INTEGRITY */
+static inline void blk_integrity_add(struct gendisk *disk) { }
+static inline void blk_integrity_del(struct gendisk *disk) { }
+static inline void blk_integrity_revalidate(struct gendisk *disk) { }
+#endif	/* CONFIG_BLK_DEV_INTEGRITY */
+
 #else /* CONFIG_BLOCK */
 
 static inline void printk_all_partitions(void) { }

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v3 06/12] md, dm, scsi, nvme, libnvdimm: drop blk_integrity_unregister() at shutdown
  2015-10-21 17:19 [PATCH v3 00/12] Simplify block integrity registration v3 Dan Williams
                   ` (4 preceding siblings ...)
  2015-10-21 17:19 ` [PATCH v3 05/12] block: Inline blk_integrity in struct gendisk Dan Williams
@ 2015-10-21 17:19 ` Dan Williams
  2016-01-14 10:32   ` Sagi Grimberg
  2015-10-21 17:20 ` [PATCH v3 07/12] md: suspend i/o during runtime blk_integrity_unregister Dan Williams
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 30+ messages in thread
From: Dan Williams @ 2015-10-21 17:19 UTC (permalink / raw)


Now that the integrity profile is statically allocated there is no work
to do when shutting down an integrity enabled block device.

Cc: Matthew Wilcox <willy at linux.intel.com>
Cc: Mike Snitzer <snitzer at redhat.com>
Cc: James Bottomley <JBottomley at Odin.com>
Acked-by: NeilBrown <neilb at suse.com>
Acked-by: Keith Busch <keith.busch at intel.com>
Acked-by: Vishal Verma <vishal.l.verma at intel.com>
Tested-by: Ross Zwisler <ross.zwisler at linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams at intel.com>
---
 drivers/md/dm.c         |    2 --
 drivers/md/md.c         |    1 -
 drivers/nvdimm/btt.c    |    1 -
 drivers/nvme/host/pci.c |    5 +----
 drivers/scsi/sd.c       |    1 -
 5 files changed, 1 insertion(+), 9 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 6264781dc69a..f4d953e10e2f 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2233,8 +2233,6 @@ static void cleanup_mapped_device(struct mapped_device *md)
 		spin_lock(&_minor_lock);
 		md->disk->private_data = NULL;
 		spin_unlock(&_minor_lock);
-		if (blk_get_integrity(md->disk))
-			blk_integrity_unregister(md->disk);
 		del_gendisk(md->disk);
 		put_disk(md->disk);
 	}
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 2af9d590e1a0..7d07caceb349 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5539,7 +5539,6 @@ static int do_md_stop(struct mddev *mddev, int mode,
 		if (mddev->hold_active == UNTIL_STOP)
 			mddev->hold_active = 0;
 	}
-	blk_integrity_unregister(disk);
 	md_new_event(mddev);
 	sysfs_notify_dirent_safe(mddev->sysfs_state);
 	return 0;
diff --git a/drivers/nvdimm/btt.c b/drivers/nvdimm/btt.c
index 254239746020..eae93ab8ffcd 100644
--- a/drivers/nvdimm/btt.c
+++ b/drivers/nvdimm/btt.c
@@ -1279,7 +1279,6 @@ static int btt_blk_init(struct btt *btt)
 
 static void btt_blk_cleanup(struct btt *btt)
 {
-	blk_integrity_unregister(btt->btt_disk);
 	del_gendisk(btt->btt_disk);
 	put_disk(btt->btt_disk);
 	blk_cleanup_queue(btt->btt_queue);
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 8d2aeaaa3895..ad06f8dcf582 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2407,11 +2407,8 @@ static void nvme_ns_remove(struct nvme_ns *ns)
 
 	if (kill)
 		blk_set_queue_dying(ns->queue);
-	if (ns->disk->flags & GENHD_FL_UP) {
-		if (blk_get_integrity(ns->disk))
-			blk_integrity_unregister(ns->disk);
+	if (ns->disk->flags & GENHD_FL_UP)
 		del_gendisk(ns->disk);
-	}
 	if (kill || !blk_queue_dying(ns->queue)) {
 		blk_mq_abort_requeue_list(ns->queue);
 		blk_cleanup_queue(ns->queue);
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 3f370228bf31..9e85211ea1d1 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3068,7 +3068,6 @@ static void scsi_disk_release(struct device *dev)
 	ida_remove(&sd_index_ida, sdkp->index);
 	spin_unlock(&sd_index_lock);
 
-	blk_integrity_unregister(disk);
 	disk->private_data = NULL;
 	put_disk(disk);
 	put_device(&sdkp->device->sdev_gendev);

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v3 07/12] md: suspend i/o during runtime blk_integrity_unregister
  2015-10-21 17:19 [PATCH v3 00/12] Simplify block integrity registration v3 Dan Williams
                   ` (5 preceding siblings ...)
  2015-10-21 17:19 ` [PATCH v3 06/12] md, dm, scsi, nvme, libnvdimm: drop blk_integrity_unregister() at shutdown Dan Williams
@ 2015-10-21 17:20 ` Dan Williams
  2016-01-13  2:14   ` NeilBrown
  2015-10-21 17:20 ` [PATCH v3 08/12] nvme: suspend i/o during runtime blk_integrity_unregister Dan Williams
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 30+ messages in thread
From: Dan Williams @ 2015-10-21 17:20 UTC (permalink / raw)


Synchronize pending i/o against a change in the integrity profile to
avoid the possibility of spurious integrity errors.  Given linear_add()
is suspending the mddev before manipulating the mddev, do the same for
the other personalities.

Acked-by: NeilBrown <neilb at suse.com>
Signed-off-by: Dan Williams <dan.j.williams at intel.com>
---
 drivers/md/md.c        |    1 +
 drivers/md/multipath.c |    2 ++
 drivers/md/raid1.c     |    2 ++
 drivers/md/raid10.c    |    2 ++
 4 files changed, 7 insertions(+)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 7d07caceb349..714aa92db174 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -1994,6 +1994,7 @@ void md_integrity_add_rdev(struct md_rdev *rdev, struct mddev *mddev)
 	if (bi_rdev && blk_integrity_compare(mddev->gendisk,
 					     rdev->bdev->bd_disk) >= 0)
 		return;
+	WARN_ON_ONCE(!mddev->suspended);
 	printk(KERN_NOTICE "disabling data integrity on %s\n", mdname(mddev));
 	blk_integrity_unregister(mddev->gendisk);
 }
diff --git a/drivers/md/multipath.c b/drivers/md/multipath.c
index d132f06afdd1..7331a80d89f1 100644
--- a/drivers/md/multipath.c
+++ b/drivers/md/multipath.c
@@ -264,7 +264,9 @@ static int multipath_add_disk(struct mddev *mddev, struct md_rdev *rdev)
 			spin_unlock_irq(&conf->device_lock);
 			rcu_assign_pointer(p->rdev, rdev);
 			err = 0;
+			mddev_suspend(mddev);
 			md_integrity_add_rdev(rdev, mddev);
+			mddev_resume(mddev);
 			break;
 		}
 
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 049df6c4a8cc..a881b111fa35 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1621,7 +1621,9 @@ static int raid1_add_disk(struct mddev *mddev, struct md_rdev *rdev)
 			break;
 		}
 	}
+	mddev_suspend(mddev);
 	md_integrity_add_rdev(rdev, mddev);
+	mddev_resume(mddev);
 	if (mddev->queue && blk_queue_discard(bdev_get_queue(rdev->bdev)))
 		queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, mddev->queue);
 	print_conf(conf);
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 7c99a4037715..6f0ec107996a 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1736,7 +1736,9 @@ static int raid10_add_disk(struct mddev *mddev, struct md_rdev *rdev)
 		rcu_assign_pointer(p->rdev, rdev);
 		break;
 	}
+	mddev_suspend(mddev);
 	md_integrity_add_rdev(rdev, mddev);
+	mddev_resume(mddev);
 	if (mddev->queue && blk_queue_discard(bdev_get_queue(rdev->bdev)))
 		queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, mddev->queue);
 

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v3 08/12] nvme: suspend i/o during runtime blk_integrity_unregister
  2015-10-21 17:19 [PATCH v3 00/12] Simplify block integrity registration v3 Dan Williams
                   ` (6 preceding siblings ...)
  2015-10-21 17:20 ` [PATCH v3 07/12] md: suspend i/o during runtime blk_integrity_unregister Dan Williams
@ 2015-10-21 17:20 ` Dan Williams
  2016-01-14 10:32   ` Sagi Grimberg
  2015-10-21 17:20 ` [PATCH v3 09/12] block: generic request_queue reference counting Dan Williams
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 30+ messages in thread
From: Dan Williams @ 2015-10-21 17:20 UTC (permalink / raw)


Synchronize pending i/o against a change in the integrity profile to
avoid the possibility of spurious integrity errors.

Cc: Matthew Wilcox <willy at linux.intel.com>
Acked-by: Keith Busch <keith.busch at intel.com>
[keith: also protect dynamic integrity registration]
Signed-off-by: Dan Williams <dan.j.williams at intel.com>
---
 drivers/nvme/host/pci.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index ad06f8dcf582..4757a2c9366e 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2035,6 +2035,7 @@ static int nvme_revalidate_disk(struct gendisk *disk)
 	pi_type = ns->ms == sizeof(struct t10_pi_tuple) ?
 					id->dps & NVME_NS_DPS_PI_MASK : 0;
 
+	blk_mq_freeze_queue(disk->queue);
 	if (blk_get_integrity(disk) && (ns->pi_type != pi_type ||
 				ns->ms != old_ms ||
 				bs != queue_logical_block_size(disk->queue) ||
@@ -2054,6 +2055,7 @@ static int nvme_revalidate_disk(struct gendisk *disk)
 
 	if (dev->oncs & NVME_CTRL_ONCS_DSM)
 		nvme_config_discard(ns);
+	blk_mq_unfreeze_queue(disk->queue);
 
 	kfree(id);
 	return 0;

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v3 09/12] block: generic request_queue reference counting
  2015-10-21 17:19 [PATCH v3 00/12] Simplify block integrity registration v3 Dan Williams
                   ` (7 preceding siblings ...)
  2015-10-21 17:20 ` [PATCH v3 08/12] nvme: suspend i/o during runtime blk_integrity_unregister Dan Williams
@ 2015-10-21 17:20 ` Dan Williams
  2016-01-14 10:35   ` Sagi Grimberg
  2015-10-21 17:20 ` [PATCH v3 10/12] block: move blk_integrity to request_queue Dan Williams
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 30+ messages in thread
From: Dan Williams @ 2015-10-21 17:20 UTC (permalink / raw)


Allow pmem, and other synchronous/bio-based block drivers, to fallback
on a per-cpu reference count managed by the core for tracking queue
live/dead state.

The existing per-cpu reference count for the blk_mq case is promoted to
be used in all block i/o scenarios.  This involves initializing it by
default, waiting for it to drop to zero at exit, and holding a live
reference over the invocation of q->make_request_fn() in
generic_make_request().  The blk_mq code continues to take its own
reference per blk_mq request and retains the ability to freeze the
queue, but the check that the queue is frozen is moved to
generic_make_request().

This fixes crash signatures like the following:

 BUG: unable to handle kernel paging request at ffff880140000000
 [..]
 Call Trace:
  [<ffffffff8145e8bf>] ? copy_user_handle_tail+0x5f/0x70
  [<ffffffffa004e1e0>] pmem_do_bvec.isra.11+0x70/0xf0 [nd_pmem]
  [<ffffffffa004e331>] pmem_make_request+0xd1/0x200 [nd_pmem]
  [<ffffffff811c3162>] ? mempool_alloc+0x72/0x1a0
  [<ffffffff8141f8b6>] generic_make_request+0xd6/0x110
  [<ffffffff8141f966>] submit_bio+0x76/0x170
  [<ffffffff81286dff>] submit_bh_wbc+0x12f/0x160
  [<ffffffff81286e62>] submit_bh+0x12/0x20
  [<ffffffff813395bd>] jbd2_write_superblock+0x8d/0x170
  [<ffffffff8133974d>] jbd2_mark_journal_empty+0x5d/0x90
  [<ffffffff813399cb>] jbd2_journal_destroy+0x24b/0x270
  [<ffffffff810bc4ca>] ? put_pwq_unlocked+0x2a/0x30
  [<ffffffff810bc6f5>] ? destroy_workqueue+0x225/0x250
  [<ffffffff81303494>] ext4_put_super+0x64/0x360
  [<ffffffff8124ab1a>] generic_shutdown_super+0x6a/0xf0

Cc: Jens Axboe <axboe at kernel.dk>
Cc: Keith Busch <keith.busch at intel.com>
Cc: Ross Zwisler <ross.zwisler at linux.intel.com>
Suggested-by: Christoph Hellwig <hch at lst.de>
Reviewed-by: Christoph Hellwig <hch at lst.de>
Tested-by: Ross Zwisler <ross.zwisler at linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams at intel.com>
---
 block/blk-core.c       |   71 +++++++++++++++++++++++++++++++++++++------
 block/blk-mq-sysfs.c   |    6 ----
 block/blk-mq.c         |   80 ++++++++++++++----------------------------------
 block/blk-sysfs.c      |    3 +-
 block/blk.h            |   14 ++++++++
 include/linux/blk-mq.h |    1 -
 include/linux/blkdev.h |    2 +
 7 files changed, 102 insertions(+), 75 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 2eb722d48773..9b4d735cb5b8 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -554,13 +554,10 @@ void blk_cleanup_queue(struct request_queue *q)
 	 * Drain all requests queued before DYING marking. Set DEAD flag to
 	 * prevent that q->request_fn() gets invoked after draining finished.
 	 */
-	if (q->mq_ops) {
-		blk_mq_freeze_queue(q);
-		spin_lock_irq(lock);
-	} else {
-		spin_lock_irq(lock);
+	blk_freeze_queue(q);
+	spin_lock_irq(lock);
+	if (!q->mq_ops)
 		__blk_drain_queue(q, true);
-	}
 	queue_flag_set(QUEUE_FLAG_DEAD, q);
 	spin_unlock_irq(lock);
 
@@ -570,6 +567,7 @@ void blk_cleanup_queue(struct request_queue *q)
 
 	if (q->mq_ops)
 		blk_mq_free_queue(q);
+	percpu_ref_exit(&q->q_usage_counter);
 
 	spin_lock_irq(lock);
 	if (q->queue_lock != &q->__queue_lock)
@@ -629,6 +627,40 @@ struct request_queue *blk_alloc_queue(gfp_t gfp_mask)
 }
 EXPORT_SYMBOL(blk_alloc_queue);
 
+int blk_queue_enter(struct request_queue *q, gfp_t gfp)
+{
+	while (true) {
+		int ret;
+
+		if (percpu_ref_tryget_live(&q->q_usage_counter))
+			return 0;
+
+		if (!(gfp & __GFP_WAIT))
+			return -EBUSY;
+
+		ret = wait_event_interruptible(q->mq_freeze_wq,
+				!atomic_read(&q->mq_freeze_depth) ||
+				blk_queue_dying(q));
+		if (blk_queue_dying(q))
+			return -ENODEV;
+		if (ret)
+			return ret;
+	}
+}
+
+void blk_queue_exit(struct request_queue *q)
+{
+	percpu_ref_put(&q->q_usage_counter);
+}
+
+static void blk_queue_usage_counter_release(struct percpu_ref *ref)
+{
+	struct request_queue *q =
+		container_of(ref, struct request_queue, q_usage_counter);
+
+	wake_up_all(&q->mq_freeze_wq);
+}
+
 struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
 {
 	struct request_queue *q;
@@ -690,11 +722,22 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
 
 	init_waitqueue_head(&q->mq_freeze_wq);
 
-	if (blkcg_init_queue(q))
+	/*
+	 * Init percpu_ref in atomic mode so that it's faster to shutdown.
+	 * See blk_register_queue() for details.
+	 */
+	if (percpu_ref_init(&q->q_usage_counter,
+				blk_queue_usage_counter_release,
+				PERCPU_REF_INIT_ATOMIC, GFP_KERNEL))
 		goto fail_bdi;
 
+	if (blkcg_init_queue(q))
+		goto fail_ref;
+
 	return q;
 
+fail_ref:
+	percpu_ref_exit(&q->q_usage_counter);
 fail_bdi:
 	bdi_destroy(&q->backing_dev_info);
 fail_split:
@@ -1966,9 +2009,19 @@ void generic_make_request(struct bio *bio)
 	do {
 		struct request_queue *q = bdev_get_queue(bio->bi_bdev);
 
-		q->make_request_fn(q, bio);
+		if (likely(blk_queue_enter(q, __GFP_WAIT) == 0)) {
+
+			q->make_request_fn(q, bio);
+
+			blk_queue_exit(q);
 
-		bio = bio_list_pop(current->bio_list);
+			bio = bio_list_pop(current->bio_list);
+		} else {
+			struct bio *bio_next = bio_list_pop(current->bio_list);
+
+			bio_io_error(bio);
+			bio = bio_next;
+		}
 	} while (bio);
 	current->bio_list = NULL; /* deactivate */
 }
diff --git a/block/blk-mq-sysfs.c b/block/blk-mq-sysfs.c
index 788fffd9b409..6f57a110289c 100644
--- a/block/blk-mq-sysfs.c
+++ b/block/blk-mq-sysfs.c
@@ -413,12 +413,6 @@ static void blk_mq_sysfs_init(struct request_queue *q)
 		kobject_init(&ctx->kobj, &blk_mq_ctx_ktype);
 }
 
-/* see blk_register_queue() */
-void blk_mq_finish_init(struct request_queue *q)
-{
-	percpu_ref_switch_to_percpu(&q->mq_usage_counter);
-}
-
 int blk_mq_register_disk(struct gendisk *disk)
 {
 	struct device *dev = disk_to_dev(disk);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index d921cd5177f5..6c240712553a 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -78,47 +78,13 @@ static void blk_mq_hctx_clear_pending(struct blk_mq_hw_ctx *hctx,
 	clear_bit(CTX_TO_BIT(hctx, ctx), &bm->word);
 }
 
-static int blk_mq_queue_enter(struct request_queue *q, gfp_t gfp)
-{
-	while (true) {
-		int ret;
-
-		if (percpu_ref_tryget_live(&q->mq_usage_counter))
-			return 0;
-
-		if (!(gfp & __GFP_WAIT))
-			return -EBUSY;
-
-		ret = wait_event_interruptible(q->mq_freeze_wq,
-				!atomic_read(&q->mq_freeze_depth) ||
-				blk_queue_dying(q));
-		if (blk_queue_dying(q))
-			return -ENODEV;
-		if (ret)
-			return ret;
-	}
-}
-
-static void blk_mq_queue_exit(struct request_queue *q)
-{
-	percpu_ref_put(&q->mq_usage_counter);
-}
-
-static void blk_mq_usage_counter_release(struct percpu_ref *ref)
-{
-	struct request_queue *q =
-		container_of(ref, struct request_queue, mq_usage_counter);
-
-	wake_up_all(&q->mq_freeze_wq);
-}
-
 void blk_mq_freeze_queue_start(struct request_queue *q)
 {
 	int freeze_depth;
 
 	freeze_depth = atomic_inc_return(&q->mq_freeze_depth);
 	if (freeze_depth == 1) {
-		percpu_ref_kill(&q->mq_usage_counter);
+		percpu_ref_kill(&q->q_usage_counter);
 		blk_mq_run_hw_queues(q, false);
 	}
 }
@@ -126,18 +92,34 @@ EXPORT_SYMBOL_GPL(blk_mq_freeze_queue_start);
 
 static void blk_mq_freeze_queue_wait(struct request_queue *q)
 {
-	wait_event(q->mq_freeze_wq, percpu_ref_is_zero(&q->mq_usage_counter));
+	wait_event(q->mq_freeze_wq, percpu_ref_is_zero(&q->q_usage_counter));
 }
 
 /*
  * Guarantee no request is in use, so we can change any data structure of
  * the queue afterward.
  */
-void blk_mq_freeze_queue(struct request_queue *q)
+void blk_freeze_queue(struct request_queue *q)
 {
+	/*
+	 * In the !blk_mq case we are only calling this to kill the
+	 * q_usage_counter, otherwise this increases the freeze depth
+	 * and waits for it to return to zero.  For this reason there is
+	 * no blk_unfreeze_queue(), and blk_freeze_queue() is not
+	 * exported to drivers as the only user for unfreeze is blk_mq.
+	 */
 	blk_mq_freeze_queue_start(q);
 	blk_mq_freeze_queue_wait(q);
 }
+
+void blk_mq_freeze_queue(struct request_queue *q)
+{
+	/*
+	 * ...just an alias to keep freeze and unfreeze actions balanced
+	 * in the blk_mq_* namespace
+	 */
+	blk_freeze_queue(q);
+}
 EXPORT_SYMBOL_GPL(blk_mq_freeze_queue);
 
 void blk_mq_unfreeze_queue(struct request_queue *q)
@@ -147,7 +129,7 @@ void blk_mq_unfreeze_queue(struct request_queue *q)
 	freeze_depth = atomic_dec_return(&q->mq_freeze_depth);
 	WARN_ON_ONCE(freeze_depth < 0);
 	if (!freeze_depth) {
-		percpu_ref_reinit(&q->mq_usage_counter);
+		percpu_ref_reinit(&q->q_usage_counter);
 		wake_up_all(&q->mq_freeze_wq);
 	}
 }
@@ -256,7 +238,7 @@ struct request *blk_mq_alloc_request(struct request_queue *q, int rw, gfp_t gfp,
 	struct blk_mq_alloc_data alloc_data;
 	int ret;
 
-	ret = blk_mq_queue_enter(q, gfp);
+	ret = blk_queue_enter(q, gfp);
 	if (ret)
 		return ERR_PTR(ret);
 
@@ -279,7 +261,7 @@ struct request *blk_mq_alloc_request(struct request_queue *q, int rw, gfp_t gfp,
 	}
 	blk_mq_put_ctx(ctx);
 	if (!rq) {
-		blk_mq_queue_exit(q);
+		blk_queue_exit(q);
 		return ERR_PTR(-EWOULDBLOCK);
 	}
 	return rq;
@@ -298,7 +280,7 @@ static void __blk_mq_free_request(struct blk_mq_hw_ctx *hctx,
 
 	clear_bit(REQ_ATOM_STARTED, &rq->atomic_flags);
 	blk_mq_put_tag(hctx, tag, &ctx->last_tag);
-	blk_mq_queue_exit(q);
+	blk_queue_exit(q);
 }
 
 void blk_mq_free_hctx_request(struct blk_mq_hw_ctx *hctx, struct request *rq)
@@ -1177,11 +1159,7 @@ static struct request *blk_mq_map_request(struct request_queue *q,
 	int rw = bio_data_dir(bio);
 	struct blk_mq_alloc_data alloc_data;
 
-	if (unlikely(blk_mq_queue_enter(q, GFP_KERNEL))) {
-		bio_io_error(bio);
-		return NULL;
-	}
-
+	blk_queue_enter_live(q);
 	ctx = blk_mq_get_ctx(q);
 	hctx = q->mq_ops->map_queue(q, ctx->cpu);
 
@@ -2000,14 +1978,6 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 		hctxs[i]->queue_num = i;
 	}
 
-	/*
-	 * Init percpu_ref in atomic mode so that it's faster to shutdown.
-	 * See blk_register_queue() for details.
-	 */
-	if (percpu_ref_init(&q->mq_usage_counter, blk_mq_usage_counter_release,
-			    PERCPU_REF_INIT_ATOMIC, GFP_KERNEL))
-		goto err_hctxs;
-
 	setup_timer(&q->timeout, blk_mq_rq_timer, (unsigned long) q);
 	blk_queue_rq_timeout(q, set->timeout ? set->timeout : 30 * HZ);
 
@@ -2088,8 +2058,6 @@ void blk_mq_free_queue(struct request_queue *q)
 
 	blk_mq_exit_hw_queues(q, set, set->nr_hw_queues);
 	blk_mq_free_hw_queues(q, set);
-
-	percpu_ref_exit(&q->mq_usage_counter);
 }
 
 /* Basically redo blk_mq_init_queue with queue frozen */
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 3e44a9da2a13..61fc2633bbea 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -599,9 +599,8 @@ int blk_register_queue(struct gendisk *disk)
 	 */
 	if (!blk_queue_init_done(q)) {
 		queue_flag_set_unlocked(QUEUE_FLAG_INIT_DONE, q);
+		percpu_ref_switch_to_percpu(&q->q_usage_counter);
 		blk_queue_bypass_end(q);
-		if (q->mq_ops)
-			blk_mq_finish_init(q);
 	}
 
 	ret = blk_trace_init_sysfs(dev);
diff --git a/block/blk.h b/block/blk.h
index 98614ad37c81..5b2cd393afbe 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -72,6 +72,20 @@ void blk_dequeue_request(struct request *rq);
 void __blk_queue_free_tags(struct request_queue *q);
 bool __blk_end_bidi_request(struct request *rq, int error,
 			    unsigned int nr_bytes, unsigned int bidi_bytes);
+int blk_queue_enter(struct request_queue *q, gfp_t gfp);
+void blk_queue_exit(struct request_queue *q);
+void blk_freeze_queue(struct request_queue *q);
+
+static inline void blk_queue_enter_live(struct request_queue *q)
+{
+	/*
+	 * Given that running in generic_make_request() context
+	 * guarantees that a live reference against q_usage_counter has
+	 * been established, further references under that same context
+	 * need not check that the queue has been frozen (marked dead).
+	 */
+	percpu_ref_get(&q->q_usage_counter);
+}
 
 void blk_rq_timed_out_timer(unsigned long data);
 unsigned long blk_rq_timeout(unsigned long timeout);
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 5e7d43ab61c0..83cc9d4e5455 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -166,7 +166,6 @@ enum {
 struct request_queue *blk_mq_init_queue(struct blk_mq_tag_set *);
 struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 						  struct request_queue *q);
-void blk_mq_finish_init(struct request_queue *q);
 int blk_mq_register_disk(struct gendisk *);
 void blk_mq_unregister_disk(struct gendisk *);
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 60669c20190f..3e0465257d68 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -450,7 +450,7 @@ struct request_queue {
 #endif
 	struct rcu_head		rcu_head;
 	wait_queue_head_t	mq_freeze_wq;
-	struct percpu_ref	mq_usage_counter;
+	struct percpu_ref	q_usage_counter;
 	struct list_head	all_q_node;
 
 	struct blk_mq_tag_set	*tag_set;

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v3 10/12] block: move blk_integrity to request_queue
  2015-10-21 17:19 [PATCH v3 00/12] Simplify block integrity registration v3 Dan Williams
                   ` (8 preceding siblings ...)
  2015-10-21 17:20 ` [PATCH v3 09/12] block: generic request_queue reference counting Dan Williams
@ 2015-10-21 17:20 ` Dan Williams
  2016-01-14 10:36   ` Sagi Grimberg
  2015-10-21 17:20 ` [PATCH v3 11/12] block: blk_flush_integrity() for bio-based drivers Dan Williams
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 30+ messages in thread
From: Dan Williams @ 2015-10-21 17:20 UTC (permalink / raw)


A trace like the following proceeds a crash in bio_integrity_process()
when it goes to use an already freed blk_integrity profile.

 BUG: unable to handle kernel paging request at ffff8800d31b10d8
 IP: [<ffff8800d31b10d8>] 0xffff8800d31b10d8
 PGD 2f65067 PUD 21fffd067 PMD 80000000d30001e3
 Oops: 0011 [#1] SMP
 Dumping ftrace buffer:
 ---------------------------------
    ndctl-2222    2.... 44526245us : disk_release: pmem1s
 systemd--2223    4.... 44573945us : bio_integrity_endio: pmem1s
    <...>-409     4.... 44574005us : bio_integrity_process: pmem1s
 ---------------------------------
[..]
  Call Trace:
  [<ffffffff8144e0f9>] ? bio_integrity_process+0x159/0x2d0
  [<ffffffff8144e4f6>] bio_integrity_verify_fn+0x36/0x60
  [<ffffffff810bd2dc>] process_one_work+0x1cc/0x4e0

Given that a request_queue is pinned while i/o is in flight and that a
gendisk is allowed to have a shorter lifetime, move blk_integrity to
request_queue to satisfy requests arriving after the gendisk has been
torn down.

Cc: Christoph Hellwig <hch at lst.de>
Cc: Martin K. Petersen <martin.petersen at oracle.com>
[martin: fix the CONFIG_BLK_DEV_INTEGRITY=n case]
Tested-by: Ross Zwisler <ross.zwisler at linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams at intel.com>
---
 block/blk-integrity.c   |   14 +++++++-------
 drivers/nvme/host/pci.c |    2 +-
 include/linux/blkdev.h  |    6 +++++-
 include/linux/genhd.h   |    1 -
 4 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index 4615a3386798..5d339ae64d56 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -142,8 +142,8 @@ EXPORT_SYMBOL(blk_rq_map_integrity_sg);
  */
 int blk_integrity_compare(struct gendisk *gd1, struct gendisk *gd2)
 {
-	struct blk_integrity *b1 = &gd1->integrity;
-	struct blk_integrity *b2 = &gd2->integrity;
+	struct blk_integrity *b1 = &gd1->queue->integrity;
+	struct blk_integrity *b2 = &gd2->queue->integrity;
 
 	if (!b1->profile && !b2->profile)
 		return 0;
@@ -246,7 +246,7 @@ static ssize_t integrity_attr_show(struct kobject *kobj, struct attribute *attr,
 				   char *page)
 {
 	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
-	struct blk_integrity *bi = &disk->integrity;
+	struct blk_integrity *bi = &disk->queue->integrity;
 	struct integrity_sysfs_entry *entry =
 		container_of(attr, struct integrity_sysfs_entry, attr);
 
@@ -258,7 +258,7 @@ static ssize_t integrity_attr_store(struct kobject *kobj,
 				    size_t count)
 {
 	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
-	struct blk_integrity *bi = &disk->integrity;
+	struct blk_integrity *bi = &disk->queue->integrity;
 	struct integrity_sysfs_entry *entry =
 		container_of(attr, struct integrity_sysfs_entry, attr);
 	ssize_t ret = 0;
@@ -397,7 +397,7 @@ static struct kobj_type integrity_ktype = {
  */
 void blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
 {
-	struct blk_integrity *bi = &disk->integrity;
+	struct blk_integrity *bi = &disk->queue->integrity;
 
 	bi->flags = BLK_INTEGRITY_VERIFY | BLK_INTEGRITY_GENERATE |
 		template->flags;
@@ -420,13 +420,13 @@ EXPORT_SYMBOL(blk_integrity_register);
 void blk_integrity_unregister(struct gendisk *disk)
 {
 	blk_integrity_revalidate(disk);
-	memset(&disk->integrity, 0, sizeof(struct blk_integrity));
+	memset(&disk->queue->integrity, 0, sizeof(struct blk_integrity));
 }
 EXPORT_SYMBOL(blk_integrity_unregister);
 
 void blk_integrity_revalidate(struct gendisk *disk)
 {
-	struct blk_integrity *bi = &disk->integrity;
+	struct blk_integrity *bi = &disk->queue->integrity;
 
 	if (!(disk->flags & GENHD_FL_UP))
 		return;
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 4757a2c9366e..2fa28680ad0f 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -538,7 +538,7 @@ static void nvme_dif_remap(struct request *req,
 	virt = bip_get_seed(bip);
 	phys = nvme_block_nr(ns, blk_rq_pos(req));
 	nlb = (blk_rq_bytes(req) >> ns->lba_shift);
-	ts = ns->disk->integrity.tuple_size;
+	ts = ns->disk->queue->integrity.tuple_size;
 
 	for (i = 0; i < nlb; i++, virt++, phys++) {
 		pi = (struct t10_pi_tuple *)p;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 3e0465257d68..cf57884db4b7 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -369,6 +369,10 @@ struct request_queue {
 	 */
 	struct kobject mq_kobj;
 
+#ifdef  CONFIG_BLK_DEV_INTEGRITY
+	struct blk_integrity integrity;
+#endif	/* CONFIG_BLK_DEV_INTEGRITY */
+
 #ifdef CONFIG_PM
 	struct device		*dev;
 	int			rpm_status;
@@ -1481,7 +1485,7 @@ extern bool blk_integrity_merge_bio(struct request_queue *, struct request *,
 
 static inline struct blk_integrity *blk_get_integrity(struct gendisk *disk)
 {
-	struct blk_integrity *bi = &disk->integrity;
+	struct blk_integrity *bi = &disk->queue->integrity;
 
 	if (!bi->profile)
 		return NULL;
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 82f4911e0ad8..847cc1d91634 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -210,7 +210,6 @@ struct gendisk {
 	atomic_t sync_io;		/* RAID */
 	struct disk_events *ev;
 #ifdef  CONFIG_BLK_DEV_INTEGRITY
-	struct blk_integrity integrity;
 	struct kobject integrity_kobj;
 #endif	/* CONFIG_BLK_DEV_INTEGRITY */
 	int node_id;

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v3 11/12] block: blk_flush_integrity() for bio-based drivers
  2015-10-21 17:19 [PATCH v3 00/12] Simplify block integrity registration v3 Dan Williams
                   ` (9 preceding siblings ...)
  2015-10-21 17:20 ` [PATCH v3 10/12] block: move blk_integrity to request_queue Dan Williams
@ 2015-10-21 17:20 ` Dan Williams
  2016-01-14 10:36   ` Sagi Grimberg
  2015-10-21 17:20 ` [PATCH v3 12/12] block, libnvdimm, nvme: provide a built-in blk_integrity nop profile Dan Williams
  2015-10-22 15:26 ` [PATCH v3 00/12] Simplify block integrity registration v3 Christoph Hellwig
  12 siblings, 1 reply; 30+ messages in thread
From: Dan Williams @ 2015-10-21 17:20 UTC (permalink / raw)


Since they lack requests to pin the request_queue active, synchronous
bio-based drivers may have in-flight integrity work from
bio_integrity_endio() that is not flushed by blk_freeze_queue().  Flush
that work to prevent races to free the queue and the final usage of the
blk_integrity profile.

This is temporary unless/until bio-based drivers start to generically
take a q_usage_counter reference while a bio is in-flight.

Cc: Martin K. Petersen <martin.petersen at oracle.com>
[martin: fix the CONFIG_BLK_DEV_INTEGRITY=n case]
Tested-by: Ross Zwisler <ross.zwisler at linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams at intel.com>
---
 block/bio-integrity.c |    5 +++++
 block/blk-core.c      |    3 +++
 block/blk.h           |    8 ++++++++
 3 files changed, 16 insertions(+)

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 6a90eca9cea1..f6325d573c10 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -32,6 +32,11 @@
 static struct kmem_cache *bip_slab;
 static struct workqueue_struct *kintegrityd_wq;
 
+void blk_flush_integrity(void)
+{
+	flush_workqueue(kintegrityd_wq);
+}
+
 /**
  * bio_integrity_alloc - Allocate integrity payload and attach it to bio
  * @bio:	bio to attach integrity metadata to
diff --git a/block/blk-core.c b/block/blk-core.c
index 9b4d735cb5b8..6ebe33ed5154 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -561,6 +561,9 @@ void blk_cleanup_queue(struct request_queue *q)
 	queue_flag_set(QUEUE_FLAG_DEAD, q);
 	spin_unlock_irq(lock);
 
+	/* for synchronous bio-based driver finish in-flight integrity i/o */
+	blk_flush_integrity();
+
 	/* @q won't process any more request, flush async actions */
 	del_timer_sync(&q->backing_dev_info.laptop_mode_wb_timer);
 	blk_sync_queue(q);
diff --git a/block/blk.h b/block/blk.h
index 5b2cd393afbe..157c93d54dc9 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -87,6 +87,14 @@ static inline void blk_queue_enter_live(struct request_queue *q)
 	percpu_ref_get(&q->q_usage_counter);
 }
 
+#ifdef CONFIG_BLK_DEV_INTEGRITY
+void blk_flush_integrity(void);
+#else
+static inline void blk_flush_integrity(void)
+{
+}
+#endif
+
 void blk_rq_timed_out_timer(unsigned long data);
 unsigned long blk_rq_timeout(unsigned long timeout);
 void blk_add_timer(struct request *req);

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v3 12/12] block, libnvdimm, nvme: provide a built-in blk_integrity nop profile
  2015-10-21 17:19 [PATCH v3 00/12] Simplify block integrity registration v3 Dan Williams
                   ` (10 preceding siblings ...)
  2015-10-21 17:20 ` [PATCH v3 11/12] block: blk_flush_integrity() for bio-based drivers Dan Williams
@ 2015-10-21 17:20 ` Dan Williams
  2016-01-14 10:37   ` Sagi Grimberg
  2015-10-22 15:26 ` [PATCH v3 00/12] Simplify block integrity registration v3 Christoph Hellwig
  12 siblings, 1 reply; 30+ messages in thread
From: Dan Williams @ 2015-10-21 17:20 UTC (permalink / raw)


The libnvidmm-btt and nvme drivers use blk_integrity to reserve space
for per-sector metadata, but sometimes without protection checksums.
This property is generically useful, so teach the block core to
internally specify a nop profile if one is not provided at registration
time.

Cc: Keith Busch <keith.busch at intel.com>
Cc: Matthew Wilcox <willy at linux.intel.com>
Suggested-by: Christoph Hellwig <hch at lst.de>
[hch: kill the local nvme nop profile as well]
Acked-by: Martin K. Petersen <martin.petersen at oracle.com>
Signed-off-by: Dan Williams <dan.j.williams at intel.com>
---
 block/blk-integrity.c   |   13 ++++++++++++-
 drivers/nvdimm/core.c   |   12 +-----------
 drivers/nvme/host/pci.c |   18 +-----------------
 3 files changed, 14 insertions(+), 29 deletions(-)

diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index 5d339ae64d56..d69c5c79f98e 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -384,6 +384,17 @@ static struct kobj_type integrity_ktype = {
 	.sysfs_ops	= &integrity_ops,
 };
 
+static int blk_integrity_nop_fn(struct blk_integrity_iter *iter)
+{
+	return 0;
+}
+
+static struct blk_integrity_profile nop_profile = {
+	.name = "nop",
+	.generate_fn = blk_integrity_nop_fn,
+	.verify_fn = blk_integrity_nop_fn,
+};
+
 /**
  * blk_integrity_register - Register a gendisk as being integrity-capable
  * @disk:	struct gendisk pointer to make integrity-aware
@@ -402,7 +413,7 @@ void blk_integrity_register(struct gendisk *disk, struct blk_integrity *template
 	bi->flags = BLK_INTEGRITY_VERIFY | BLK_INTEGRITY_GENERATE |
 		template->flags;
 	bi->interval_exp = ilog2(queue_logical_block_size(disk->queue));
-	bi->profile = template->profile;
+	bi->profile = template->profile ? template->profile : &nop_profile;
 	bi->tuple_size = template->tuple_size;
 	bi->tag_size = template->tag_size;
 
diff --git a/drivers/nvdimm/core.c b/drivers/nvdimm/core.c
index e85848caf8d2..82c49bb87055 100644
--- a/drivers/nvdimm/core.c
+++ b/drivers/nvdimm/core.c
@@ -392,24 +392,14 @@ void nvdimm_bus_unregister(struct nvdimm_bus *nvdimm_bus)
 EXPORT_SYMBOL_GPL(nvdimm_bus_unregister);
 
 #ifdef CONFIG_BLK_DEV_INTEGRITY
-static int nd_pi_nop_generate_verify(struct blk_integrity_iter *iter)
-{
-	return 0;
-}
-
 int nd_integrity_init(struct gendisk *disk, unsigned long meta_size)
 {
 	struct blk_integrity bi;
-	static struct blk_integrity_profile profile = {
-		.name = "ND-PI-NOP",
-		.generate_fn = nd_pi_nop_generate_verify,
-		.verify_fn = nd_pi_nop_generate_verify,
-	};
 
 	if (meta_size == 0)
 		return 0;
 
-	bi.profile = &profile;
+	bi.profile = NULL;
 	bi.tuple_size = meta_size;
 	bi.tag_size = meta_size;
 
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 2fa28680ad0f..9bea542afc4f 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -548,22 +548,6 @@ static void nvme_dif_remap(struct request *req,
 	kunmap_atomic(pmap);
 }
 
-static int nvme_noop_verify(struct blk_integrity_iter *iter)
-{
-	return 0;
-}
-
-static int nvme_noop_generate(struct blk_integrity_iter *iter)
-{
-	return 0;
-}
-
-struct blk_integrity_profile nvme_meta_noop = {
-	.name			= "NVME_META_NOOP",
-	.generate_fn		= nvme_noop_generate,
-	.verify_fn		= nvme_noop_verify,
-};
-
 static void nvme_init_integrity(struct nvme_ns *ns)
 {
 	struct blk_integrity integrity;
@@ -577,7 +561,7 @@ static void nvme_init_integrity(struct nvme_ns *ns)
 		integrity.profile = &t10_pi_type1_crc;
 		break;
 	default:
-		integrity.profile = &nvme_meta_noop;
+		integrity.profile = NULL;
 		break;
 	}
 	integrity.tuple_size = ns->ms;

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v3 00/12] Simplify block integrity registration v3
  2015-10-21 17:19 [PATCH v3 00/12] Simplify block integrity registration v3 Dan Williams
                   ` (11 preceding siblings ...)
  2015-10-21 17:20 ` [PATCH v3 12/12] block, libnvdimm, nvme: provide a built-in blk_integrity nop profile Dan Williams
@ 2015-10-22 15:26 ` Christoph Hellwig
  12 siblings, 0 replies; 30+ messages in thread
From: Christoph Hellwig @ 2015-10-22 15:26 UTC (permalink / raw)


Hi Dan,

the series looks fine to me:

Acked-by: Christoph Hellwig <hch at lst.de>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v3 07/12] md: suspend i/o during runtime blk_integrity_unregister
  2015-10-21 17:20 ` [PATCH v3 07/12] md: suspend i/o during runtime blk_integrity_unregister Dan Williams
@ 2016-01-13  2:14   ` NeilBrown
  2016-01-13  2:24     ` Dan Williams
  2016-01-14  0:00       ` Dan Williams
  0 siblings, 2 replies; 30+ messages in thread
From: NeilBrown @ 2016-01-13  2:14 UTC (permalink / raw)


On Thu, Oct 22 2015, Dan Williams wrote:

> Synchronize pending i/o against a change in the integrity profile to
> avoid the possibility of spurious integrity errors.  Given linear_add()
> is suspending the mddev before manipulating the mddev, do the same for
> the other personalities.
>
> Acked-by: NeilBrown <neilb at suse.com>
> Signed-off-by: Dan Williams <dan.j.williams at intel.com>
> ---
>  drivers/md/md.c        |    1 +
>  drivers/md/multipath.c |    2 ++
>  drivers/md/raid1.c     |    2 ++
>  drivers/md/raid10.c    |    2 ++
>  4 files changed, 7 insertions(+)
>

> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index 7c99a4037715..6f0ec107996a 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -1736,7 +1736,9 @@ static int raid10_add_disk(struct mddev *mddev, struct md_rdev *rdev)
>  		rcu_assign_pointer(p->rdev, rdev);
>  		break;
>  	}
> +	mddev_suspend(mddev);
>  	md_integrity_add_rdev(rdev, mddev);
> +	mddev_resume(mddev);
>  	if (mddev->queue && blk_queue_discard(bdev_get_queue(rdev->bdev)))
>  		queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, mddev->queue);
>  

Hi Dan,

Somewhat belatedly I am testing this code, and it deadlocks.  Which is
obvious once I think about it.

raid10_add_disk() is called from
  raid10d -> md_check_recovery -> remove_and_add_spares

so this call to mddev_suspend() will block raid10d until all pending IO
completes.  But raid10d might be needed to complete some IO -
particularly in the case of errors.  I don't know exactly what is
deadlocking in my test, but it doesn't surprise me that something does.

Can you explain in more detail what needs to be synchronised here?  If
the synchronisation doesn't happen, what can do wrong?
Because I cannot imagine what this might actually be protecting against.

Thanks,
NeilBrown
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 818 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-nvme/attachments/20160113/219e785c/attachment-0001.sig>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v3 07/12] md: suspend i/o during runtime blk_integrity_unregister
  2016-01-13  2:14   ` NeilBrown
@ 2016-01-13  2:24     ` Dan Williams
  2016-01-13  2:59       ` Mike Snitzer
  2016-01-13  3:10       ` NeilBrown
  2016-01-14  0:00       ` Dan Williams
  1 sibling, 2 replies; 30+ messages in thread
From: Dan Williams @ 2016-01-13  2:24 UTC (permalink / raw)


On Tue, Jan 12, 2016@6:14 PM, NeilBrown <neilb@suse.com> wrote:
> On Thu, Oct 22 2015, Dan Williams wrote:
>
>> Synchronize pending i/o against a change in the integrity profile to
>> avoid the possibility of spurious integrity errors.  Given linear_add()
>> is suspending the mddev before manipulating the mddev, do the same for
>> the other personalities.
>>
>> Acked-by: NeilBrown <neilb at suse.com>
>> Signed-off-by: Dan Williams <dan.j.williams at intel.com>
>> ---
>>  drivers/md/md.c        |    1 +
>>  drivers/md/multipath.c |    2 ++
>>  drivers/md/raid1.c     |    2 ++
>>  drivers/md/raid10.c    |    2 ++
>>  4 files changed, 7 insertions(+)
>>
>
>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>> index 7c99a4037715..6f0ec107996a 100644
>> --- a/drivers/md/raid10.c
>> +++ b/drivers/md/raid10.c
>> @@ -1736,7 +1736,9 @@ static int raid10_add_disk(struct mddev *mddev, struct md_rdev *rdev)
>>               rcu_assign_pointer(p->rdev, rdev);
>>               break;
>>       }
>> +     mddev_suspend(mddev);
>>       md_integrity_add_rdev(rdev, mddev);
>> +     mddev_resume(mddev);
>>       if (mddev->queue && blk_queue_discard(bdev_get_queue(rdev->bdev)))
>>               queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, mddev->queue);
>>
>
> Hi Dan,
>
> Somewhat belatedly I am testing this code, and it deadlocks.  Which is
> obvious once I think about it.
>
> raid10_add_disk() is called from
>   raid10d -> md_check_recovery -> remove_and_add_spares
>
> so this call to mddev_suspend() will block raid10d until all pending IO
> completes.  But raid10d might be needed to complete some IO -
> particularly in the case of errors.  I don't know exactly what is
> deadlocking in my test, but it doesn't surprise me that something does.

Ugh.

> Can you explain in more detail what needs to be synchronised here?  If
> the synchronisation doesn't happen, what can do wrong?
> Because I cannot imagine what this might actually be protecting against.

My thinking is that if the integrity profile changes while i/o is in
flight we can get spurious errors.  For example when enabling
integrity a checksum for in-flight commands will be missing, so wait
for those to complete.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v3 07/12] md: suspend i/o during runtime blk_integrity_unregister
  2016-01-13  2:24     ` Dan Williams
@ 2016-01-13  2:59       ` Mike Snitzer
  2016-01-13  3:10       ` NeilBrown
  1 sibling, 0 replies; 30+ messages in thread
From: Mike Snitzer @ 2016-01-13  2:59 UTC (permalink / raw)


On Tue, Jan 12 2016 at  9:24pm -0500,
Dan Williams <dan.j.williams@intel.com> wrote:

> On Tue, Jan 12, 2016@6:14 PM, NeilBrown <neilb@suse.com> wrote:
> >
> > Hi Dan,
> >
> > Somewhat belatedly I am testing this code, and it deadlocks.  Which is
> > obvious once I think about it.
> >
> > raid10_add_disk() is called from
> >   raid10d -> md_check_recovery -> remove_and_add_spares
> >
> > so this call to mddev_suspend() will block raid10d until all pending IO
> > completes.  But raid10d might be needed to complete some IO -
> > particularly in the case of errors.  I don't know exactly what is
> > deadlocking in my test, but it doesn't surprise me that something does.
> 
> Ugh.
> 
> > Can you explain in more detail what needs to be synchronised here?  If
> > the synchronisation doesn't happen, what can do wrong?
> > Because I cannot imagine what this might actually be protecting against.
> 
> My thinking is that if the integrity profile changes while i/o is in
> flight we can get spurious errors.  For example when enabling
> integrity a checksum for in-flight commands will be missing, so wait
> for those to complete.

DM avoids this because it doesn't allow changing the integrity profile
once one is established.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v3 07/12] md: suspend i/o during runtime blk_integrity_unregister
  2016-01-13  2:24     ` Dan Williams
  2016-01-13  2:59       ` Mike Snitzer
@ 2016-01-13  3:10       ` NeilBrown
  2016-01-13  5:11         ` Dan Williams
  1 sibling, 1 reply; 30+ messages in thread
From: NeilBrown @ 2016-01-13  3:10 UTC (permalink / raw)


On Wed, Jan 13 2016, Dan Williams wrote:

> On Tue, Jan 12, 2016@6:14 PM, NeilBrown <neilb@suse.com> wrote:
>> On Thu, Oct 22 2015, Dan Williams wrote:
>>
>>> Synchronize pending i/o against a change in the integrity profile to
>>> avoid the possibility of spurious integrity errors.  Given linear_add()
>>> is suspending the mddev before manipulating the mddev, do the same for
>>> the other personalities.
>>>
>>> Acked-by: NeilBrown <neilb at suse.com>
>>> Signed-off-by: Dan Williams <dan.j.williams at intel.com>
>>> ---
>>>  drivers/md/md.c        |    1 +
>>>  drivers/md/multipath.c |    2 ++
>>>  drivers/md/raid1.c     |    2 ++
>>>  drivers/md/raid10.c    |    2 ++
>>>  4 files changed, 7 insertions(+)
>>>
>>
>>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>>> index 7c99a4037715..6f0ec107996a 100644
>>> --- a/drivers/md/raid10.c
>>> +++ b/drivers/md/raid10.c
>>> @@ -1736,7 +1736,9 @@ static int raid10_add_disk(struct mddev *mddev, struct md_rdev *rdev)
>>>               rcu_assign_pointer(p->rdev, rdev);
>>>               break;
>>>       }
>>> +     mddev_suspend(mddev);
>>>       md_integrity_add_rdev(rdev, mddev);
>>> +     mddev_resume(mddev);
>>>       if (mddev->queue && blk_queue_discard(bdev_get_queue(rdev->bdev)))
>>>               queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, mddev->queue);
>>>
>>
>> Hi Dan,
>>
>> Somewhat belatedly I am testing this code, and it deadlocks.  Which is
>> obvious once I think about it.
>>
>> raid10_add_disk() is called from
>>   raid10d -> md_check_recovery -> remove_and_add_spares
>>
>> so this call to mddev_suspend() will block raid10d until all pending IO
>> completes.  But raid10d might be needed to complete some IO -
>> particularly in the case of errors.  I don't know exactly what is
>> deadlocking in my test, but it doesn't surprise me that something does.
>
> Ugh.
>
>> Can you explain in more detail what needs to be synchronised here?  If
>> the synchronisation doesn't happen, what can do wrong?
>> Because I cannot imagine what this might actually be protecting against.
>
> My thinking is that if the integrity profile changes while i/o is in
> flight we can get spurious errors.  For example when enabling
> integrity a checksum for in-flight commands will be missing, so wait
> for those to complete.

Who/what adds that checksum?  The filesystem?
mddev_suspend() doesn't stop the filesystem from submitting requests.
It just stops them from getting into md.  So when mddev_resume()
completes, old requests can arrive.

If stability of the integrity profile is important, then we presumably
need to make sure it never changes (while the array is active - or at
least while it is mounted).  So don't accept drives which don't support
the current profile.  Also don't upgrade the profile just because all
current devices support some new feature.

Maybe the array should have a persistent knowledge of what profile it
has and only accept devices which match?
Or maybe integrity should be completely disabled on arrays which can
hot-add devices (so support it on RAID0 and LINEAR, but not on
RAID1/4/5/6/10).

Seems like a bit of a mess.

NeilBrown
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 818 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-nvme/attachments/20160113/f6e4f388/attachment.sig>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v3 07/12] md: suspend i/o during runtime blk_integrity_unregister
  2016-01-13  3:10       ` NeilBrown
@ 2016-01-13  5:11         ` Dan Williams
  2016-01-13 19:51           ` Mike Snitzer
  0 siblings, 1 reply; 30+ messages in thread
From: Dan Williams @ 2016-01-13  5:11 UTC (permalink / raw)


On Tue, Jan 12, 2016@7:10 PM, NeilBrown <neilb@suse.com> wrote:
> On Wed, Jan 13 2016, Dan Williams wrote:
>
>> On Tue, Jan 12, 2016@6:14 PM, NeilBrown <neilb@suse.com> wrote:
>>> On Thu, Oct 22 2015, Dan Williams wrote:
>>>
>>>> Synchronize pending i/o against a change in the integrity profile to
>>>> avoid the possibility of spurious integrity errors.  Given linear_add()
>>>> is suspending the mddev before manipulating the mddev, do the same for
>>>> the other personalities.
>>>>
>>>> Acked-by: NeilBrown <neilb at suse.com>
>>>> Signed-off-by: Dan Williams <dan.j.williams at intel.com>
>>>> ---
>>>>  drivers/md/md.c        |    1 +
>>>>  drivers/md/multipath.c |    2 ++
>>>>  drivers/md/raid1.c     |    2 ++
>>>>  drivers/md/raid10.c    |    2 ++
>>>>  4 files changed, 7 insertions(+)
>>>>
>>>
>>>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>>>> index 7c99a4037715..6f0ec107996a 100644
>>>> --- a/drivers/md/raid10.c
>>>> +++ b/drivers/md/raid10.c
>>>> @@ -1736,7 +1736,9 @@ static int raid10_add_disk(struct mddev *mddev, struct md_rdev *rdev)
>>>>               rcu_assign_pointer(p->rdev, rdev);
>>>>               break;
>>>>       }
>>>> +     mddev_suspend(mddev);
>>>>       md_integrity_add_rdev(rdev, mddev);
>>>> +     mddev_resume(mddev);
>>>>       if (mddev->queue && blk_queue_discard(bdev_get_queue(rdev->bdev)))
>>>>               queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, mddev->queue);
>>>>
>>>
>>> Hi Dan,
>>>
>>> Somewhat belatedly I am testing this code, and it deadlocks.  Which is
>>> obvious once I think about it.
>>>
>>> raid10_add_disk() is called from
>>>   raid10d -> md_check_recovery -> remove_and_add_spares
>>>
>>> so this call to mddev_suspend() will block raid10d until all pending IO
>>> completes.  But raid10d might be needed to complete some IO -
>>> particularly in the case of errors.  I don't know exactly what is
>>> deadlocking in my test, but it doesn't surprise me that something does.
>>
>> Ugh.
>>
>>> Can you explain in more detail what needs to be synchronised here?  If
>>> the synchronisation doesn't happen, what can do wrong?
>>> Because I cannot imagine what this might actually be protecting against.
>>
>> My thinking is that if the integrity profile changes while i/o is in
>> flight we can get spurious errors.  For example when enabling
>> integrity a checksum for in-flight commands will be missing, so wait
>> for those to complete.
>
> Who/what adds that checksum?  The filesystem?
> mddev_suspend() doesn't stop the filesystem from submitting requests.
> It just stops them from getting into md.  So when mddev_resume()
> completes, old requests can arrive.

The block layer generates the checksum when the bio is queued.

You're right the suspend does nothing to protect against the integrity
generated for the array.

>
> If stability of the integrity profile is important, then we presumably
> need to make sure it never changes (while the array is active - or at
> least while it is mounted).  So don't accept drives which don't support
> the current profile.  Also don't upgrade the profile just because all
> current devices support some new feature.
>
> Maybe the array should have a persistent knowledge of what profile it
> has and only accept devices which match?
> Or maybe integrity should be completely disabled on arrays which can
> hot-add devices (so support it on RAID0 and LINEAR, but not on
> RAID1/4/5/6/10).

For -stable I'll develop an immediate fix that disallows live changes
of the integrity similar to DM.

However, given no filesystem is sending down app tags, I'm wondering
why md has integrity enabled at the array level in the first place?
Is this just future preparation for eventual filesystem enabling?
Otherwise, the integrity handling for each component device is
sufficient.  I'm probably missing something.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v3 07/12] md: suspend i/o during runtime blk_integrity_unregister
  2016-01-13  5:11         ` Dan Williams
@ 2016-01-13 19:51           ` Mike Snitzer
  0 siblings, 0 replies; 30+ messages in thread
From: Mike Snitzer @ 2016-01-13 19:51 UTC (permalink / raw)


On Wed, Jan 13 2016 at 12:11am -0500,
Dan Williams <dan.j.williams@intel.com> wrote:

> On Tue, Jan 12, 2016@7:10 PM, NeilBrown <neilb@suse.com> wrote:
> > On Wed, Jan 13 2016, Dan Williams wrote:
> >
> >> On Tue, Jan 12, 2016@6:14 PM, NeilBrown <neilb@suse.com> wrote:
> >>> On Thu, Oct 22 2015, Dan Williams wrote:
> >>>
> >>>> Synchronize pending i/o against a change in the integrity profile to
> >>>> avoid the possibility of spurious integrity errors.  Given linear_add()
> >>>> is suspending the mddev before manipulating the mddev, do the same for
> >>>> the other personalities.
> >>>>
> >>>> Acked-by: NeilBrown <neilb at suse.com>
> >>>> Signed-off-by: Dan Williams <dan.j.williams at intel.com>
> >>>> ---
> >>>>  drivers/md/md.c        |    1 +
> >>>>  drivers/md/multipath.c |    2 ++
> >>>>  drivers/md/raid1.c     |    2 ++
> >>>>  drivers/md/raid10.c    |    2 ++
> >>>>  4 files changed, 7 insertions(+)
> >>>>
> >>>
> >>>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> >>>> index 7c99a4037715..6f0ec107996a 100644
> >>>> --- a/drivers/md/raid10.c
> >>>> +++ b/drivers/md/raid10.c
> >>>> @@ -1736,7 +1736,9 @@ static int raid10_add_disk(struct mddev *mddev, struct md_rdev *rdev)
> >>>>               rcu_assign_pointer(p->rdev, rdev);
> >>>>               break;
> >>>>       }
> >>>> +     mddev_suspend(mddev);
> >>>>       md_integrity_add_rdev(rdev, mddev);
> >>>> +     mddev_resume(mddev);
> >>>>       if (mddev->queue && blk_queue_discard(bdev_get_queue(rdev->bdev)))
> >>>>               queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, mddev->queue);
> >>>>
> >>>
> >>> Hi Dan,
> >>>
> >>> Somewhat belatedly I am testing this code, and it deadlocks.  Which is
> >>> obvious once I think about it.
> >>>
> >>> raid10_add_disk() is called from
> >>>   raid10d -> md_check_recovery -> remove_and_add_spares
> >>>
> >>> so this call to mddev_suspend() will block raid10d until all pending IO
> >>> completes.  But raid10d might be needed to complete some IO -
> >>> particularly in the case of errors.  I don't know exactly what is
> >>> deadlocking in my test, but it doesn't surprise me that something does.
> >>
> >> Ugh.
> >>
> >>> Can you explain in more detail what needs to be synchronised here?  If
> >>> the synchronisation doesn't happen, what can do wrong?
> >>> Because I cannot imagine what this might actually be protecting against.
> >>
> >> My thinking is that if the integrity profile changes while i/o is in
> >> flight we can get spurious errors.  For example when enabling
> >> integrity a checksum for in-flight commands will be missing, so wait
> >> for those to complete.
> >
> > Who/what adds that checksum?  The filesystem?
> > mddev_suspend() doesn't stop the filesystem from submitting requests.
> > It just stops them from getting into md.  So when mddev_resume()
> > completes, old requests can arrive.
> 
> The block layer generates the checksum when the bio is queued.
> 
> You're right the suspend does nothing to protect against the integrity
> generated for the array.

MD and DM are _not_ using the Linux block layer's ability to generate
protection information metadata (via bio_integrity_generate).  As such
they are only providing pass-through of upper layer application
generated protection information (DIX).

A bit more explanation:
bio-based MD and DM's cloning of protection information causes their
cloned bios to have bio->bi_integrity -- as such the underlying
request-based driver skips bio_integrity_generate() via
blk_integrity_prep() (because blk_queue_bio()'s call to
bio_integrity_enabled() returns false for MD and DM's cloned bios).

> > If stability of the integrity profile is important, then we presumably
> > need to make sure it never changes (while the array is active - or at
> > least while it is mounted).  So don't accept drives which don't support
> > the current profile.  Also don't upgrade the profile just because all
> > current devices support some new feature.
> >
> > Maybe the array should have a persistent knowledge of what profile it
> > has and only accept devices which match?
> > Or maybe integrity should be completely disabled on arrays which can
> > hot-add devices (so support it on RAID0 and LINEAR, but not on
> > RAID1/4/5/6/10).
> 
> For -stable I'll develop an immediate fix that disallows live changes
> of the integrity similar to DM.
> 
> However, given no filesystem is sending down app tags, I'm wondering
> why md has integrity enabled at the array level in the first place?
> Is this just future preparation for eventual filesystem enabling?
> Otherwise, the integrity handling for each component device is
> sufficient.  I'm probably missing something.

AFAIK, Oracle DB is the only application known to be using DIX.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH] md/raid: only permit hot-add of compatible integrity profiles
  2016-01-13  2:14   ` NeilBrown
@ 2016-01-14  0:00       ` Dan Williams
  2016-01-14  0:00       ` Dan Williams
  1 sibling, 0 replies; 30+ messages in thread
From: Dan Williams @ 2016-01-14  0:00 UTC (permalink / raw)
  To: neilb
  Cc: axboe, Mike Snitzer, martin.petersen, linux-kernel, stable,
	NeilBrown, linux-raid, linux-nvme, keith.busch, hch

It is not safe for an integrity profile to be changed while i/o is
in-flight in the queue.  Prevent adding new disks or otherwise online
spares to an array if the device has an incompatible integrity profile.

The original change to the blk_integrity_unregister implementation in
md, commmit c7bfced9a671 "md: suspend i/o during runtime
blk_integrity_unregister" introduced an immediate hang regression.

This policy of disallowing changes the integrity profile once one has
been established is shared with DM.

Here is an abbreviated log from a test run that:
1/ Creates a degraded raid1 with an integrity-enabled device (pmem0s) [   59.076127]
2/ Tries to add an integrity-disabled device (pmem1m) [   90.489209]
3/ Retries with an integrity-enabled device (pmem1s) [  205.671277]

[   59.076127] md/raid1:md0: active with 1 out of 2 mirrors
[   59.078302] md: data integrity enabled on md0
[..]
[   90.489209] md0: incompatible integrity profile for pmem1m
[..]
[  205.671277] md: super_written gets error=-5
[  205.677386] md/raid1:md0: Disk failure on pmem1m, disabling device.
[  205.677386] md/raid1:md0: Operation continuing on 1 devices.
[  205.683037] RAID1 conf printout:
[  205.684699]  --- wd:1 rd:2
[  205.685972]  disk 0, wo:0, o:1, dev:pmem0s
[  205.687562]  disk 1, wo:1, o:1, dev:pmem1s
[  205.691717] md: recovery of RAID array md0

Fixes: c7bfced9a671 ("md: suspend i/o during runtime blk_integrity_unregister")
Cc: <stable@vger.kernel.org>
Cc: Mike Snitzer <snitzer@redhat.com>
Reported-by: NeilBrown <neilb@suse.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/md/md.c        |   28 ++++++++++++++++------------
 drivers/md/md.h        |    2 +-
 drivers/md/multipath.c |    6 +++---
 drivers/md/raid1.c     |    6 +++---
 drivers/md/raid10.c    |    6 +++---
 5 files changed, 26 insertions(+), 22 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 61aacab424cf..b1e1f6b95782 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -2017,28 +2017,32 @@ int md_integrity_register(struct mddev *mddev)
 }
 EXPORT_SYMBOL(md_integrity_register);
 
-/* Disable data integrity if non-capable/non-matching disk is being added */
-void md_integrity_add_rdev(struct md_rdev *rdev, struct mddev *mddev)
+/*
+ * Attempt to add an rdev, but only if it is consistent with the current
+ * integrity profile
+ */
+int md_integrity_add_rdev(struct md_rdev *rdev, struct mddev *mddev)
 {
 	struct blk_integrity *bi_rdev;
 	struct blk_integrity *bi_mddev;
+	char name[BDEVNAME_SIZE];
 
 	if (!mddev->gendisk)
-		return;
+		return 0;
 
 	bi_rdev = bdev_get_integrity(rdev->bdev);
 	bi_mddev = blk_get_integrity(mddev->gendisk);
 
 	if (!bi_mddev) /* nothing to do */
-		return;
-	if (rdev->raid_disk < 0) /* skip spares */
-		return;
-	if (bi_rdev && blk_integrity_compare(mddev->gendisk,
-					     rdev->bdev->bd_disk) >= 0)
-		return;
-	WARN_ON_ONCE(!mddev->suspended);
-	printk(KERN_NOTICE "disabling data integrity on %s\n", mdname(mddev));
-	blk_integrity_unregister(mddev->gendisk);
+		return 0;
+
+	if (blk_integrity_compare(mddev->gendisk, rdev->bdev->bd_disk) != 0) {
+		printk(KERN_NOTICE "%s: incompatible integrity profile for %s\n",
+				mdname(mddev), bdevname(rdev->bdev, name));
+		return -ENXIO;
+	}
+
+	return 0;
 }
 EXPORT_SYMBOL(md_integrity_add_rdev);
 
diff --git a/drivers/md/md.h b/drivers/md/md.h
index ca0b643fe3c1..dfa57b41541b 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -657,7 +657,7 @@ extern void md_wait_for_blocked_rdev(struct md_rdev *rdev, struct mddev *mddev);
 extern void md_set_array_sectors(struct mddev *mddev, sector_t array_sectors);
 extern int md_check_no_bitmap(struct mddev *mddev);
 extern int md_integrity_register(struct mddev *mddev);
-extern void md_integrity_add_rdev(struct md_rdev *rdev, struct mddev *mddev);
+extern int md_integrity_add_rdev(struct md_rdev *rdev, struct mddev *mddev);
 extern int strict_strtoul_scaled(const char *cp, unsigned long *res, int scale);
 
 extern void mddev_init(struct mddev *mddev);
diff --git a/drivers/md/multipath.c b/drivers/md/multipath.c
index 7331a80d89f1..0a72ab6e6c20 100644
--- a/drivers/md/multipath.c
+++ b/drivers/md/multipath.c
@@ -257,6 +257,9 @@ static int multipath_add_disk(struct mddev *mddev, struct md_rdev *rdev)
 			disk_stack_limits(mddev->gendisk, rdev->bdev,
 					  rdev->data_offset << 9);
 
+			err = md_integrity_add_rdev(rdev, mddev);
+			if (err)
+				break;
 			spin_lock_irq(&conf->device_lock);
 			mddev->degraded--;
 			rdev->raid_disk = path;
@@ -264,9 +267,6 @@ static int multipath_add_disk(struct mddev *mddev, struct md_rdev *rdev)
 			spin_unlock_irq(&conf->device_lock);
 			rcu_assign_pointer(p->rdev, rdev);
 			err = 0;
-			mddev_suspend(mddev);
-			md_integrity_add_rdev(rdev, mddev);
-			mddev_resume(mddev);
 			break;
 		}
 
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index e2169ff6e0f0..c4b913409226 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1589,6 +1589,9 @@ static int raid1_add_disk(struct mddev *mddev, struct md_rdev *rdev)
 	if (mddev->recovery_disabled == conf->recovery_disabled)
 		return -EBUSY;
 
+	if (md_integrity_add_rdev(rdev, mddev))
+		return -ENXIO;
+
 	if (rdev->raid_disk >= 0)
 		first = last = rdev->raid_disk;
 
@@ -1632,9 +1635,6 @@ static int raid1_add_disk(struct mddev *mddev, struct md_rdev *rdev)
 			break;
 		}
 	}
-	mddev_suspend(mddev);
-	md_integrity_add_rdev(rdev, mddev);
-	mddev_resume(mddev);
 	if (mddev->queue && blk_queue_discard(bdev_get_queue(rdev->bdev)))
 		queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, mddev->queue);
 	print_conf(conf);
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 84e597e1c489..ce959b4ae4df 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1698,6 +1698,9 @@ static int raid10_add_disk(struct mddev *mddev, struct md_rdev *rdev)
 	if (rdev->saved_raid_disk < 0 && !_enough(conf, 1, -1))
 		return -EINVAL;
 
+	if (md_integrity_add_rdev(rdev, mddev))
+		return -ENXIO;
+
 	if (rdev->raid_disk >= 0)
 		first = last = rdev->raid_disk;
 
@@ -1739,9 +1742,6 @@ static int raid10_add_disk(struct mddev *mddev, struct md_rdev *rdev)
 		rcu_assign_pointer(p->rdev, rdev);
 		break;
 	}
-	mddev_suspend(mddev);
-	md_integrity_add_rdev(rdev, mddev);
-	mddev_resume(mddev);
 	if (mddev->queue && blk_queue_discard(bdev_get_queue(rdev->bdev)))
 		queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, mddev->queue);
 

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH] md/raid: only permit hot-add of compatible integrity profiles
@ 2016-01-14  0:00       ` Dan Williams
  0 siblings, 0 replies; 30+ messages in thread
From: Dan Williams @ 2016-01-14  0:00 UTC (permalink / raw)


It is not safe for an integrity profile to be changed while i/o is
in-flight in the queue.  Prevent adding new disks or otherwise online
spares to an array if the device has an incompatible integrity profile.

The original change to the blk_integrity_unregister implementation in
md, commmit c7bfced9a671 "md: suspend i/o during runtime
blk_integrity_unregister" introduced an immediate hang regression.

This policy of disallowing changes the integrity profile once one has
been established is shared with DM.

Here is an abbreviated log from a test run that:
1/ Creates a degraded raid1 with an integrity-enabled device (pmem0s) [   59.076127]
2/ Tries to add an integrity-disabled device (pmem1m) [   90.489209]
3/ Retries with an integrity-enabled device (pmem1s) [  205.671277]

[   59.076127] md/raid1:md0: active with 1 out of 2 mirrors
[   59.078302] md: data integrity enabled on md0
[..]
[   90.489209] md0: incompatible integrity profile for pmem1m
[..]
[  205.671277] md: super_written gets error=-5
[  205.677386] md/raid1:md0: Disk failure on pmem1m, disabling device.
[  205.677386] md/raid1:md0: Operation continuing on 1 devices.
[  205.683037] RAID1 conf printout:
[  205.684699]  --- wd:1 rd:2
[  205.685972]  disk 0, wo:0, o:1, dev:pmem0s
[  205.687562]  disk 1, wo:1, o:1, dev:pmem1s
[  205.691717] md: recovery of RAID array md0

Fixes: c7bfced9a671 ("md: suspend i/o during runtime blk_integrity_unregister")
Cc: <stable at vger.kernel.org>
Cc: Mike Snitzer <snitzer at redhat.com>
Reported-by: NeilBrown <neilb at suse.com>
Signed-off-by: Dan Williams <dan.j.williams at intel.com>
---
 drivers/md/md.c        |   28 ++++++++++++++++------------
 drivers/md/md.h        |    2 +-
 drivers/md/multipath.c |    6 +++---
 drivers/md/raid1.c     |    6 +++---
 drivers/md/raid10.c    |    6 +++---
 5 files changed, 26 insertions(+), 22 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 61aacab424cf..b1e1f6b95782 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -2017,28 +2017,32 @@ int md_integrity_register(struct mddev *mddev)
 }
 EXPORT_SYMBOL(md_integrity_register);
 
-/* Disable data integrity if non-capable/non-matching disk is being added */
-void md_integrity_add_rdev(struct md_rdev *rdev, struct mddev *mddev)
+/*
+ * Attempt to add an rdev, but only if it is consistent with the current
+ * integrity profile
+ */
+int md_integrity_add_rdev(struct md_rdev *rdev, struct mddev *mddev)
 {
 	struct blk_integrity *bi_rdev;
 	struct blk_integrity *bi_mddev;
+	char name[BDEVNAME_SIZE];
 
 	if (!mddev->gendisk)
-		return;
+		return 0;
 
 	bi_rdev = bdev_get_integrity(rdev->bdev);
 	bi_mddev = blk_get_integrity(mddev->gendisk);
 
 	if (!bi_mddev) /* nothing to do */
-		return;
-	if (rdev->raid_disk < 0) /* skip spares */
-		return;
-	if (bi_rdev && blk_integrity_compare(mddev->gendisk,
-					     rdev->bdev->bd_disk) >= 0)
-		return;
-	WARN_ON_ONCE(!mddev->suspended);
-	printk(KERN_NOTICE "disabling data integrity on %s\n", mdname(mddev));
-	blk_integrity_unregister(mddev->gendisk);
+		return 0;
+
+	if (blk_integrity_compare(mddev->gendisk, rdev->bdev->bd_disk) != 0) {
+		printk(KERN_NOTICE "%s: incompatible integrity profile for %s\n",
+				mdname(mddev), bdevname(rdev->bdev, name));
+		return -ENXIO;
+	}
+
+	return 0;
 }
 EXPORT_SYMBOL(md_integrity_add_rdev);
 
diff --git a/drivers/md/md.h b/drivers/md/md.h
index ca0b643fe3c1..dfa57b41541b 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -657,7 +657,7 @@ extern void md_wait_for_blocked_rdev(struct md_rdev *rdev, struct mddev *mddev);
 extern void md_set_array_sectors(struct mddev *mddev, sector_t array_sectors);
 extern int md_check_no_bitmap(struct mddev *mddev);
 extern int md_integrity_register(struct mddev *mddev);
-extern void md_integrity_add_rdev(struct md_rdev *rdev, struct mddev *mddev);
+extern int md_integrity_add_rdev(struct md_rdev *rdev, struct mddev *mddev);
 extern int strict_strtoul_scaled(const char *cp, unsigned long *res, int scale);
 
 extern void mddev_init(struct mddev *mddev);
diff --git a/drivers/md/multipath.c b/drivers/md/multipath.c
index 7331a80d89f1..0a72ab6e6c20 100644
--- a/drivers/md/multipath.c
+++ b/drivers/md/multipath.c
@@ -257,6 +257,9 @@ static int multipath_add_disk(struct mddev *mddev, struct md_rdev *rdev)
 			disk_stack_limits(mddev->gendisk, rdev->bdev,
 					  rdev->data_offset << 9);
 
+			err = md_integrity_add_rdev(rdev, mddev);
+			if (err)
+				break;
 			spin_lock_irq(&conf->device_lock);
 			mddev->degraded--;
 			rdev->raid_disk = path;
@@ -264,9 +267,6 @@ static int multipath_add_disk(struct mddev *mddev, struct md_rdev *rdev)
 			spin_unlock_irq(&conf->device_lock);
 			rcu_assign_pointer(p->rdev, rdev);
 			err = 0;
-			mddev_suspend(mddev);
-			md_integrity_add_rdev(rdev, mddev);
-			mddev_resume(mddev);
 			break;
 		}
 
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index e2169ff6e0f0..c4b913409226 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1589,6 +1589,9 @@ static int raid1_add_disk(struct mddev *mddev, struct md_rdev *rdev)
 	if (mddev->recovery_disabled == conf->recovery_disabled)
 		return -EBUSY;
 
+	if (md_integrity_add_rdev(rdev, mddev))
+		return -ENXIO;
+
 	if (rdev->raid_disk >= 0)
 		first = last = rdev->raid_disk;
 
@@ -1632,9 +1635,6 @@ static int raid1_add_disk(struct mddev *mddev, struct md_rdev *rdev)
 			break;
 		}
 	}
-	mddev_suspend(mddev);
-	md_integrity_add_rdev(rdev, mddev);
-	mddev_resume(mddev);
 	if (mddev->queue && blk_queue_discard(bdev_get_queue(rdev->bdev)))
 		queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, mddev->queue);
 	print_conf(conf);
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 84e597e1c489..ce959b4ae4df 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1698,6 +1698,9 @@ static int raid10_add_disk(struct mddev *mddev, struct md_rdev *rdev)
 	if (rdev->saved_raid_disk < 0 && !_enough(conf, 1, -1))
 		return -EINVAL;
 
+	if (md_integrity_add_rdev(rdev, mddev))
+		return -ENXIO;
+
 	if (rdev->raid_disk >= 0)
 		first = last = rdev->raid_disk;
 
@@ -1739,9 +1742,6 @@ static int raid10_add_disk(struct mddev *mddev, struct md_rdev *rdev)
 		rcu_assign_pointer(p->rdev, rdev);
 		break;
 	}
-	mddev_suspend(mddev);
-	md_integrity_add_rdev(rdev, mddev);
-	mddev_resume(mddev);
 	if (mddev->queue && blk_queue_discard(bdev_get_queue(rdev->bdev)))
 		queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, mddev->queue);
 

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] md/raid: only permit hot-add of compatible integrity profiles
  2016-01-14  0:00       ` Dan Williams
@ 2016-01-14  0:56         ` NeilBrown
  -1 siblings, 0 replies; 30+ messages in thread
From: NeilBrown @ 2016-01-14  0:56 UTC (permalink / raw)
  To: Dan Williams
  Cc: axboe, Mike Snitzer, martin.petersen, linux-kernel, stable,
	linux-raid, linux-nvme, keith.busch, hch

[-- Attachment #1: Type: text/plain, Size: 7176 bytes --]

On Thu, Jan 14 2016, Dan Williams wrote:

> It is not safe for an integrity profile to be changed while i/o is
> in-flight in the queue.  Prevent adding new disks or otherwise online
> spares to an array if the device has an incompatible integrity profile.
>
> The original change to the blk_integrity_unregister implementation in
> md, commmit c7bfced9a671 "md: suspend i/o during runtime
> blk_integrity_unregister" introduced an immediate hang regression.
>
> This policy of disallowing changes the integrity profile once one has
> been established is shared with DM.

Thanks Dan.  That looks like it should address the issues and seems to
make sense.
If it passes my smoke-testing I'll include it in my merge-window pull
request tomorrow.

NeilBrown


>
> Here is an abbreviated log from a test run that:
> 1/ Creates a degraded raid1 with an integrity-enabled device (pmem0s) [   59.076127]
> 2/ Tries to add an integrity-disabled device (pmem1m) [   90.489209]
> 3/ Retries with an integrity-enabled device (pmem1s) [  205.671277]
>
> [   59.076127] md/raid1:md0: active with 1 out of 2 mirrors
> [   59.078302] md: data integrity enabled on md0
> [..]
> [   90.489209] md0: incompatible integrity profile for pmem1m
> [..]
> [  205.671277] md: super_written gets error=-5
> [  205.677386] md/raid1:md0: Disk failure on pmem1m, disabling device.
> [  205.677386] md/raid1:md0: Operation continuing on 1 devices.
> [  205.683037] RAID1 conf printout:
> [  205.684699]  --- wd:1 rd:2
> [  205.685972]  disk 0, wo:0, o:1, dev:pmem0s
> [  205.687562]  disk 1, wo:1, o:1, dev:pmem1s
> [  205.691717] md: recovery of RAID array md0
>
> Fixes: c7bfced9a671 ("md: suspend i/o during runtime blk_integrity_unregister")
> Cc: <stable@vger.kernel.org>
> Cc: Mike Snitzer <snitzer@redhat.com>
> Reported-by: NeilBrown <neilb@suse.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/md/md.c        |   28 ++++++++++++++++------------
>  drivers/md/md.h        |    2 +-
>  drivers/md/multipath.c |    6 +++---
>  drivers/md/raid1.c     |    6 +++---
>  drivers/md/raid10.c    |    6 +++---
>  5 files changed, 26 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 61aacab424cf..b1e1f6b95782 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -2017,28 +2017,32 @@ int md_integrity_register(struct mddev *mddev)
>  }
>  EXPORT_SYMBOL(md_integrity_register);
>  
> -/* Disable data integrity if non-capable/non-matching disk is being added */
> -void md_integrity_add_rdev(struct md_rdev *rdev, struct mddev *mddev)
> +/*
> + * Attempt to add an rdev, but only if it is consistent with the current
> + * integrity profile
> + */
> +int md_integrity_add_rdev(struct md_rdev *rdev, struct mddev *mddev)
>  {
>  	struct blk_integrity *bi_rdev;
>  	struct blk_integrity *bi_mddev;
> +	char name[BDEVNAME_SIZE];
>  
>  	if (!mddev->gendisk)
> -		return;
> +		return 0;
>  
>  	bi_rdev = bdev_get_integrity(rdev->bdev);
>  	bi_mddev = blk_get_integrity(mddev->gendisk);
>  
>  	if (!bi_mddev) /* nothing to do */
> -		return;
> -	if (rdev->raid_disk < 0) /* skip spares */
> -		return;
> -	if (bi_rdev && blk_integrity_compare(mddev->gendisk,
> -					     rdev->bdev->bd_disk) >= 0)
> -		return;
> -	WARN_ON_ONCE(!mddev->suspended);
> -	printk(KERN_NOTICE "disabling data integrity on %s\n", mdname(mddev));
> -	blk_integrity_unregister(mddev->gendisk);
> +		return 0;
> +
> +	if (blk_integrity_compare(mddev->gendisk, rdev->bdev->bd_disk) != 0) {
> +		printk(KERN_NOTICE "%s: incompatible integrity profile for %s\n",
> +				mdname(mddev), bdevname(rdev->bdev, name));
> +		return -ENXIO;
> +	}
> +
> +	return 0;
>  }
>  EXPORT_SYMBOL(md_integrity_add_rdev);
>  
> diff --git a/drivers/md/md.h b/drivers/md/md.h
> index ca0b643fe3c1..dfa57b41541b 100644
> --- a/drivers/md/md.h
> +++ b/drivers/md/md.h
> @@ -657,7 +657,7 @@ extern void md_wait_for_blocked_rdev(struct md_rdev *rdev, struct mddev *mddev);
>  extern void md_set_array_sectors(struct mddev *mddev, sector_t array_sectors);
>  extern int md_check_no_bitmap(struct mddev *mddev);
>  extern int md_integrity_register(struct mddev *mddev);
> -extern void md_integrity_add_rdev(struct md_rdev *rdev, struct mddev *mddev);
> +extern int md_integrity_add_rdev(struct md_rdev *rdev, struct mddev *mddev);
>  extern int strict_strtoul_scaled(const char *cp, unsigned long *res, int scale);
>  
>  extern void mddev_init(struct mddev *mddev);
> diff --git a/drivers/md/multipath.c b/drivers/md/multipath.c
> index 7331a80d89f1..0a72ab6e6c20 100644
> --- a/drivers/md/multipath.c
> +++ b/drivers/md/multipath.c
> @@ -257,6 +257,9 @@ static int multipath_add_disk(struct mddev *mddev, struct md_rdev *rdev)
>  			disk_stack_limits(mddev->gendisk, rdev->bdev,
>  					  rdev->data_offset << 9);
>  
> +			err = md_integrity_add_rdev(rdev, mddev);
> +			if (err)
> +				break;
>  			spin_lock_irq(&conf->device_lock);
>  			mddev->degraded--;
>  			rdev->raid_disk = path;
> @@ -264,9 +267,6 @@ static int multipath_add_disk(struct mddev *mddev, struct md_rdev *rdev)
>  			spin_unlock_irq(&conf->device_lock);
>  			rcu_assign_pointer(p->rdev, rdev);
>  			err = 0;
> -			mddev_suspend(mddev);
> -			md_integrity_add_rdev(rdev, mddev);
> -			mddev_resume(mddev);
>  			break;
>  		}
>  
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index e2169ff6e0f0..c4b913409226 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -1589,6 +1589,9 @@ static int raid1_add_disk(struct mddev *mddev, struct md_rdev *rdev)
>  	if (mddev->recovery_disabled == conf->recovery_disabled)
>  		return -EBUSY;
>  
> +	if (md_integrity_add_rdev(rdev, mddev))
> +		return -ENXIO;
> +
>  	if (rdev->raid_disk >= 0)
>  		first = last = rdev->raid_disk;
>  
> @@ -1632,9 +1635,6 @@ static int raid1_add_disk(struct mddev *mddev, struct md_rdev *rdev)
>  			break;
>  		}
>  	}
> -	mddev_suspend(mddev);
> -	md_integrity_add_rdev(rdev, mddev);
> -	mddev_resume(mddev);
>  	if (mddev->queue && blk_queue_discard(bdev_get_queue(rdev->bdev)))
>  		queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, mddev->queue);
>  	print_conf(conf);
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index 84e597e1c489..ce959b4ae4df 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -1698,6 +1698,9 @@ static int raid10_add_disk(struct mddev *mddev, struct md_rdev *rdev)
>  	if (rdev->saved_raid_disk < 0 && !_enough(conf, 1, -1))
>  		return -EINVAL;
>  
> +	if (md_integrity_add_rdev(rdev, mddev))
> +		return -ENXIO;
> +
>  	if (rdev->raid_disk >= 0)
>  		first = last = rdev->raid_disk;
>  
> @@ -1739,9 +1742,6 @@ static int raid10_add_disk(struct mddev *mddev, struct md_rdev *rdev)
>  		rcu_assign_pointer(p->rdev, rdev);
>  		break;
>  	}
> -	mddev_suspend(mddev);
> -	md_integrity_add_rdev(rdev, mddev);
> -	mddev_resume(mddev);
>  	if (mddev->queue && blk_queue_discard(bdev_get_queue(rdev->bdev)))
>  		queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, mddev->queue);
>  

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 818 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH] md/raid: only permit hot-add of compatible integrity profiles
@ 2016-01-14  0:56         ` NeilBrown
  0 siblings, 0 replies; 30+ messages in thread
From: NeilBrown @ 2016-01-14  0:56 UTC (permalink / raw)


On Thu, Jan 14 2016, Dan Williams wrote:

> It is not safe for an integrity profile to be changed while i/o is
> in-flight in the queue.  Prevent adding new disks or otherwise online
> spares to an array if the device has an incompatible integrity profile.
>
> The original change to the blk_integrity_unregister implementation in
> md, commmit c7bfced9a671 "md: suspend i/o during runtime
> blk_integrity_unregister" introduced an immediate hang regression.
>
> This policy of disallowing changes the integrity profile once one has
> been established is shared with DM.

Thanks Dan.  That looks like it should address the issues and seems to
make sense.
If it passes my smoke-testing I'll include it in my merge-window pull
request tomorrow.

NeilBrown


>
> Here is an abbreviated log from a test run that:
> 1/ Creates a degraded raid1 with an integrity-enabled device (pmem0s) [   59.076127]
> 2/ Tries to add an integrity-disabled device (pmem1m) [   90.489209]
> 3/ Retries with an integrity-enabled device (pmem1s) [  205.671277]
>
> [   59.076127] md/raid1:md0: active with 1 out of 2 mirrors
> [   59.078302] md: data integrity enabled on md0
> [..]
> [   90.489209] md0: incompatible integrity profile for pmem1m
> [..]
> [  205.671277] md: super_written gets error=-5
> [  205.677386] md/raid1:md0: Disk failure on pmem1m, disabling device.
> [  205.677386] md/raid1:md0: Operation continuing on 1 devices.
> [  205.683037] RAID1 conf printout:
> [  205.684699]  --- wd:1 rd:2
> [  205.685972]  disk 0, wo:0, o:1, dev:pmem0s
> [  205.687562]  disk 1, wo:1, o:1, dev:pmem1s
> [  205.691717] md: recovery of RAID array md0
>
> Fixes: c7bfced9a671 ("md: suspend i/o during runtime blk_integrity_unregister")
> Cc: <stable at vger.kernel.org>
> Cc: Mike Snitzer <snitzer at redhat.com>
> Reported-by: NeilBrown <neilb at suse.com>
> Signed-off-by: Dan Williams <dan.j.williams at intel.com>
> ---
>  drivers/md/md.c        |   28 ++++++++++++++++------------
>  drivers/md/md.h        |    2 +-
>  drivers/md/multipath.c |    6 +++---
>  drivers/md/raid1.c     |    6 +++---
>  drivers/md/raid10.c    |    6 +++---
>  5 files changed, 26 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 61aacab424cf..b1e1f6b95782 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -2017,28 +2017,32 @@ int md_integrity_register(struct mddev *mddev)
>  }
>  EXPORT_SYMBOL(md_integrity_register);
>  
> -/* Disable data integrity if non-capable/non-matching disk is being added */
> -void md_integrity_add_rdev(struct md_rdev *rdev, struct mddev *mddev)
> +/*
> + * Attempt to add an rdev, but only if it is consistent with the current
> + * integrity profile
> + */
> +int md_integrity_add_rdev(struct md_rdev *rdev, struct mddev *mddev)
>  {
>  	struct blk_integrity *bi_rdev;
>  	struct blk_integrity *bi_mddev;
> +	char name[BDEVNAME_SIZE];
>  
>  	if (!mddev->gendisk)
> -		return;
> +		return 0;
>  
>  	bi_rdev = bdev_get_integrity(rdev->bdev);
>  	bi_mddev = blk_get_integrity(mddev->gendisk);
>  
>  	if (!bi_mddev) /* nothing to do */
> -		return;
> -	if (rdev->raid_disk < 0) /* skip spares */
> -		return;
> -	if (bi_rdev && blk_integrity_compare(mddev->gendisk,
> -					     rdev->bdev->bd_disk) >= 0)
> -		return;
> -	WARN_ON_ONCE(!mddev->suspended);
> -	printk(KERN_NOTICE "disabling data integrity on %s\n", mdname(mddev));
> -	blk_integrity_unregister(mddev->gendisk);
> +		return 0;
> +
> +	if (blk_integrity_compare(mddev->gendisk, rdev->bdev->bd_disk) != 0) {
> +		printk(KERN_NOTICE "%s: incompatible integrity profile for %s\n",
> +				mdname(mddev), bdevname(rdev->bdev, name));
> +		return -ENXIO;
> +	}
> +
> +	return 0;
>  }
>  EXPORT_SYMBOL(md_integrity_add_rdev);
>  
> diff --git a/drivers/md/md.h b/drivers/md/md.h
> index ca0b643fe3c1..dfa57b41541b 100644
> --- a/drivers/md/md.h
> +++ b/drivers/md/md.h
> @@ -657,7 +657,7 @@ extern void md_wait_for_blocked_rdev(struct md_rdev *rdev, struct mddev *mddev);
>  extern void md_set_array_sectors(struct mddev *mddev, sector_t array_sectors);
>  extern int md_check_no_bitmap(struct mddev *mddev);
>  extern int md_integrity_register(struct mddev *mddev);
> -extern void md_integrity_add_rdev(struct md_rdev *rdev, struct mddev *mddev);
> +extern int md_integrity_add_rdev(struct md_rdev *rdev, struct mddev *mddev);
>  extern int strict_strtoul_scaled(const char *cp, unsigned long *res, int scale);
>  
>  extern void mddev_init(struct mddev *mddev);
> diff --git a/drivers/md/multipath.c b/drivers/md/multipath.c
> index 7331a80d89f1..0a72ab6e6c20 100644
> --- a/drivers/md/multipath.c
> +++ b/drivers/md/multipath.c
> @@ -257,6 +257,9 @@ static int multipath_add_disk(struct mddev *mddev, struct md_rdev *rdev)
>  			disk_stack_limits(mddev->gendisk, rdev->bdev,
>  					  rdev->data_offset << 9);
>  
> +			err = md_integrity_add_rdev(rdev, mddev);
> +			if (err)
> +				break;
>  			spin_lock_irq(&conf->device_lock);
>  			mddev->degraded--;
>  			rdev->raid_disk = path;
> @@ -264,9 +267,6 @@ static int multipath_add_disk(struct mddev *mddev, struct md_rdev *rdev)
>  			spin_unlock_irq(&conf->device_lock);
>  			rcu_assign_pointer(p->rdev, rdev);
>  			err = 0;
> -			mddev_suspend(mddev);
> -			md_integrity_add_rdev(rdev, mddev);
> -			mddev_resume(mddev);
>  			break;
>  		}
>  
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index e2169ff6e0f0..c4b913409226 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -1589,6 +1589,9 @@ static int raid1_add_disk(struct mddev *mddev, struct md_rdev *rdev)
>  	if (mddev->recovery_disabled == conf->recovery_disabled)
>  		return -EBUSY;
>  
> +	if (md_integrity_add_rdev(rdev, mddev))
> +		return -ENXIO;
> +
>  	if (rdev->raid_disk >= 0)
>  		first = last = rdev->raid_disk;
>  
> @@ -1632,9 +1635,6 @@ static int raid1_add_disk(struct mddev *mddev, struct md_rdev *rdev)
>  			break;
>  		}
>  	}
> -	mddev_suspend(mddev);
> -	md_integrity_add_rdev(rdev, mddev);
> -	mddev_resume(mddev);
>  	if (mddev->queue && blk_queue_discard(bdev_get_queue(rdev->bdev)))
>  		queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, mddev->queue);
>  	print_conf(conf);
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index 84e597e1c489..ce959b4ae4df 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -1698,6 +1698,9 @@ static int raid10_add_disk(struct mddev *mddev, struct md_rdev *rdev)
>  	if (rdev->saved_raid_disk < 0 && !_enough(conf, 1, -1))
>  		return -EINVAL;
>  
> +	if (md_integrity_add_rdev(rdev, mddev))
> +		return -ENXIO;
> +
>  	if (rdev->raid_disk >= 0)
>  		first = last = rdev->raid_disk;
>  
> @@ -1739,9 +1742,6 @@ static int raid10_add_disk(struct mddev *mddev, struct md_rdev *rdev)
>  		rcu_assign_pointer(p->rdev, rdev);
>  		break;
>  	}
> -	mddev_suspend(mddev);
> -	md_integrity_add_rdev(rdev, mddev);
> -	mddev_resume(mddev);
>  	if (mddev->queue && blk_queue_discard(bdev_get_queue(rdev->bdev)))
>  		queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, mddev->queue);
>  
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 818 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-nvme/attachments/20160114/2ffd2f07/attachment.sig>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v3 06/12] md, dm, scsi, nvme, libnvdimm: drop blk_integrity_unregister() at shutdown
  2015-10-21 17:19 ` [PATCH v3 06/12] md, dm, scsi, nvme, libnvdimm: drop blk_integrity_unregister() at shutdown Dan Williams
@ 2016-01-14 10:32   ` Sagi Grimberg
  0 siblings, 0 replies; 30+ messages in thread
From: Sagi Grimberg @ 2016-01-14 10:32 UTC (permalink / raw)


For the nvme, scsi bits

Reviewed-by: Sagi Grimberg <sagig at mellanox.com>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v3 08/12] nvme: suspend i/o during runtime blk_integrity_unregister
  2015-10-21 17:20 ` [PATCH v3 08/12] nvme: suspend i/o during runtime blk_integrity_unregister Dan Williams
@ 2016-01-14 10:32   ` Sagi Grimberg
  0 siblings, 0 replies; 30+ messages in thread
From: Sagi Grimberg @ 2016-01-14 10:32 UTC (permalink / raw)


Looks good,

Reviewed-by: Sagi Grimberg <sagig at mellanox.com>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v3 09/12] block: generic request_queue reference counting
  2015-10-21 17:20 ` [PATCH v3 09/12] block: generic request_queue reference counting Dan Williams
@ 2016-01-14 10:35   ` Sagi Grimberg
  0 siblings, 0 replies; 30+ messages in thread
From: Sagi Grimberg @ 2016-01-14 10:35 UTC (permalink / raw)


This looks good,

This is running in my setups for some time now so,

Tested-by: Sagi Grimberg <sagig at mellanox.com>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v3 10/12] block: move blk_integrity to request_queue
  2015-10-21 17:20 ` [PATCH v3 10/12] block: move blk_integrity to request_queue Dan Williams
@ 2016-01-14 10:36   ` Sagi Grimberg
  0 siblings, 0 replies; 30+ messages in thread
From: Sagi Grimberg @ 2016-01-14 10:36 UTC (permalink / raw)


Looks good,

Tested-by: Sagi Grimberg <sagig at mellanox.com>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v3 11/12] block: blk_flush_integrity() for bio-based drivers
  2015-10-21 17:20 ` [PATCH v3 11/12] block: blk_flush_integrity() for bio-based drivers Dan Williams
@ 2016-01-14 10:36   ` Sagi Grimberg
  0 siblings, 0 replies; 30+ messages in thread
From: Sagi Grimberg @ 2016-01-14 10:36 UTC (permalink / raw)


Looks good,

Reviewed-by: Sagi Grimberg <sagig at mellanox.com>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v3 12/12] block, libnvdimm, nvme: provide a built-in blk_integrity nop profile
  2015-10-21 17:20 ` [PATCH v3 12/12] block, libnvdimm, nvme: provide a built-in blk_integrity nop profile Dan Williams
@ 2016-01-14 10:37   ` Sagi Grimberg
  0 siblings, 0 replies; 30+ messages in thread
From: Sagi Grimberg @ 2016-01-14 10:37 UTC (permalink / raw)


For the block, nvme bits

Reviewed-by: Sagi Grimberg <sagig at mellanox.com>

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2016-01-14 10:37 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-21 17:19 [PATCH v3 00/12] Simplify block integrity registration v3 Dan Williams
2015-10-21 17:19 ` [PATCH v3 01/12] block: Move integrity kobject to struct gendisk Dan Williams
2015-10-21 17:19 ` [PATCH v3 02/12] block: Consolidate static integrity profile properties Dan Williams
2015-10-21 17:19 ` [PATCH v3 03/12] block: Reduce the size of struct blk_integrity Dan Williams
2015-10-21 17:19 ` [PATCH v3 04/12] block: Export integrity data interval size in sysfs Dan Williams
2015-10-21 17:19 ` [PATCH v3 05/12] block: Inline blk_integrity in struct gendisk Dan Williams
2015-10-21 17:19 ` [PATCH v3 06/12] md, dm, scsi, nvme, libnvdimm: drop blk_integrity_unregister() at shutdown Dan Williams
2016-01-14 10:32   ` Sagi Grimberg
2015-10-21 17:20 ` [PATCH v3 07/12] md: suspend i/o during runtime blk_integrity_unregister Dan Williams
2016-01-13  2:14   ` NeilBrown
2016-01-13  2:24     ` Dan Williams
2016-01-13  2:59       ` Mike Snitzer
2016-01-13  3:10       ` NeilBrown
2016-01-13  5:11         ` Dan Williams
2016-01-13 19:51           ` Mike Snitzer
2016-01-14  0:00     ` [PATCH] md/raid: only permit hot-add of compatible integrity profiles Dan Williams
2016-01-14  0:00       ` Dan Williams
2016-01-14  0:56       ` NeilBrown
2016-01-14  0:56         ` NeilBrown
2015-10-21 17:20 ` [PATCH v3 08/12] nvme: suspend i/o during runtime blk_integrity_unregister Dan Williams
2016-01-14 10:32   ` Sagi Grimberg
2015-10-21 17:20 ` [PATCH v3 09/12] block: generic request_queue reference counting Dan Williams
2016-01-14 10:35   ` Sagi Grimberg
2015-10-21 17:20 ` [PATCH v3 10/12] block: move blk_integrity to request_queue Dan Williams
2016-01-14 10:36   ` Sagi Grimberg
2015-10-21 17:20 ` [PATCH v3 11/12] block: blk_flush_integrity() for bio-based drivers Dan Williams
2016-01-14 10:36   ` Sagi Grimberg
2015-10-21 17:20 ` [PATCH v3 12/12] block, libnvdimm, nvme: provide a built-in blk_integrity nop profile Dan Williams
2016-01-14 10:37   ` Sagi Grimberg
2015-10-22 15:26 ` [PATCH v3 00/12] Simplify block integrity registration v3 Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.