* [PATCH] NVMe: Reread partitions on metadata formats
@ 2015-07-14 17:57 Keith Busch
  2015-07-14 18:49 ` Jens Axboe
  0 siblings, 1 reply; 67+ messages in thread
From: Keith Busch @ 2015-07-14 17:57 UTC (permalink / raw)


This patch has the driver automatically reread partitions if a namespace
has a separate metadata format. Previously revalidating a disk was
sufficient to get the correct capacity set on such formatted drives,
but partitions that may exist would not have been surfaced.

Reported-by: Paul Grabinar <paul.grabinar at ranbarg.com>
Signed-off-by: Keith Busch <keith.busch at intel.com>
Cc: Matthew Wilcox <willy at linux.intel.com>
Cc: Jens Axboe <axboe at fb.com>
---
 drivers/block/nvme-core.c |   13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)
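
The in-kernel sequence below (look up the whole-disk block device, open
it, reread the partition table, release it) has a userspace analogue in
the BLKRRPART ioctl, which in kernels of this period is, to my
understanding, served by the same blkdev_reread_part() helper. A minimal
sketch, assuming a Linux system; the helper name reread_partitions() and
the example device path are hypothetical and not part of this patch:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>     /* BLKRRPART */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel to re-read the partition table of
 * the whole-disk node at `path` (for example "/dev/nvme0n1").
 * Returns 0 on success or a negative errno value on failure.
 */
static int reread_partitions(const char *path)
{
	int fd, ret = 0;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	/* BLKRRPART takes no argument; it normally needs CAP_SYS_ADMIN. */
	if (ioctl(fd, BLKRRPART) < 0)
		ret = -errno;

	close(fd);
	return ret;
}
```

Like the driver code, the helper opens the device read-only, triggers
the rescan, and drops its reference again whether or not the rescan
succeeded.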

diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index d1d6141..7920c27 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -2108,8 +2108,17 @@ static void nvme_alloc_ns(struct nvme_dev *dev, unsigned nsid)
 		goto out_free_disk;
 
 	add_disk(ns->disk);
-	if (ns->ms)
-		revalidate_disk(ns->disk);
+	if (ns->ms) {
+		struct block_device *bd = bdget_disk(ns->disk, 0);
+		if (!bd)
+			return;
+		if (blkdev_get(bd, FMODE_READ, NULL)) {
+			bdput(bd);
+			return;
+		}
+		blkdev_reread_part(bd);
+		blkdev_put(bd, FMODE_READ);
+	}
 	return;
  out_free_disk:
 	kfree(disk);
-- 
1.7.10.4


* [PATCH] NVMe: Reread partitions on metadata formats
  2015-07-14 17:57 [PATCH] NVMe: Reread partitions on metadata formats Keith Busch
@ 2015-07-14 18:49 ` Jens Axboe
  2015-07-14 19:01   ` Paul Grabinar
                     ` (2 more replies)
  0 siblings, 3 replies; 67+ messages in thread
From: Jens Axboe @ 2015-07-14 18:49 UTC (permalink / raw)


On 07/14/2015 11:57 AM, Keith Busch wrote:
> This patch has the driver automatically reread partitions if a namespace
> has a separate metadata format. Previously revalidating a disk was
> sufficient to get the correct capacity set on such formatted drives,
> but partitions that may exist would not have been surfaced.

Looks sane to me. Paul, does this fix your issue?


-- 
Jens Axboe


* [PATCH] NVMe: Reread partitions on metadata formats
  2015-07-14 18:49 ` Jens Axboe
@ 2015-07-14 19:01   ` Paul Grabinar
  2015-07-15  3:48   ` Dan Williams
  2015-07-15 17:37   ` Paul Grabinar
  2 siblings, 0 replies; 67+ messages in thread
From: Paul Grabinar @ 2015-07-14 19:01 UTC (permalink / raw)


On 14/07/15 19:49, Jens Axboe wrote:
> On 07/14/2015 11:57 AM, Keith Busch wrote:
>> This patch has the driver automatically reread partitions if a namespace
>> has a separate metadata format. Previously revalidating a disk was
>> sufficient to get the correct capacity set on such formatted drives,
>> but partitions that may exist would not have been surfaced.
>
> Looks sane to me. Paul, does this fix your issue?
>
>

I won't be able to confirm until tomorrow. I will certainly try it and
let you know.

Thanks.


* [PATCH] NVMe: Reread partitions on metadata formats
  2015-07-14 18:49 ` Jens Axboe
  2015-07-14 19:01   ` Paul Grabinar
@ 2015-07-15  3:48   ` Dan Williams
  2015-07-15 21:36     ` Jens Axboe
  2015-07-15 17:37   ` Paul Grabinar
  2 siblings, 1 reply; 67+ messages in thread
From: Dan Williams @ 2015-07-15  3:48 UTC (permalink / raw)


On Tue, Jul 14, 2015 at 11:49 AM, Jens Axboe <axboe@fb.com> wrote:
> On 07/14/2015 11:57 AM, Keith Busch wrote:
>>
>> This patch has the driver automatically reread partitions if a namespace
>> has a separate metadata format. Previously revalidating a disk was
>> sufficient to get the correct capacity set on such formatted drives,
>> but partitions that may exist would not have been surfaced.
>
>
> Looks sane to me. Paul, does this fix your issue?

NVDIMM namespaces with metadata space will run into this same problem.
Why not make revalidate_disk re-read partitions?


* [PATCH] NVMe: Reread partitions on metadata formats
  2015-07-14 18:49 ` Jens Axboe
  2015-07-14 19:01   ` Paul Grabinar
  2015-07-15  3:48   ` Dan Williams
@ 2015-07-15 17:37   ` Paul Grabinar
  2 siblings, 0 replies; 67+ messages in thread
From: Paul Grabinar @ 2015-07-15 17:37 UTC (permalink / raw)


On 14/07/15 19:49, Jens Axboe wrote:
> On 07/14/2015 11:57 AM, Keith Busch wrote:
>> This patch has the driver automatically reread partitions if a namespace
>> has a separate metadata format. Previously revalidating a disk was
>> sufficient to get the correct capacity set on such formatted drives,
>> but partitions that may exist would not have been surfaced.
>
> Looks sane to me. Paul, does this fix your issue?
>
>

Yes, I can confirm this fixes the issue. After loading the driver, the
devices for the partitions are now created and are usable.
Thank you.


* [PATCH] NVMe: Reread partitions on metadata formats
  2015-07-15  3:48   ` Dan Williams
@ 2015-07-15 21:36     ` Jens Axboe
  2015-07-15 22:28       ` Keith Busch
  0 siblings, 1 reply; 67+ messages in thread
From: Jens Axboe @ 2015-07-15 21:36 UTC (permalink / raw)


On 07/14/2015 09:48 PM, Dan Williams wrote:
> On Tue, Jul 14, 2015 at 11:49 AM, Jens Axboe <axboe@fb.com> wrote:
>> On 07/14/2015 11:57 AM, Keith Busch wrote:
>>>
>>> This patch has the driver automatically reread partitions if a namespace
>>> has a separate metadata format. Previously revalidating a disk was
>>> sufficient to get the correct capacity set on such formatted drives,
>>> but partitions that may exist would not have been surfaced.
>>
>>
>> Looks sane to me. Paul, does this fix your issue?
>
> NVDIMM namespaces with metadata space will run into this same problem.
> Why not make revalidate_disk re-read partitions?

Yeah that's a good point, we should try that. I'll queue up Keith's fix 
for now since it's fixing a real bug, then we can revisit and kill it later.

-- 
Jens Axboe


* [PATCH] NVMe: Reread partitions on metadata formats
  2015-07-15 21:36     ` Jens Axboe
@ 2015-07-15 22:28       ` Keith Busch
  2015-07-16  9:19         ` Christoph Hellwig
  2015-07-17  1:44         ` [PATCH] NVMe: Reread partitions on metadata formats Martin K. Petersen
  0 siblings, 2 replies; 67+ messages in thread
From: Keith Busch @ 2015-07-15 22:28 UTC (permalink / raw)


On Wed, 15 Jul 2015, Jens Axboe wrote:
> On 07/14/2015 09:48 PM, Dan Williams wrote:
>> NVDIMM namespaces with metadata space will run into this same problem.
>> Why not make revalidate_disk re-read partitions?
>
> Yeah that's a good point, we should try that. I'll queue up Keith's fix for 
> now since it's fixing a real bug, then we can revisit and kill it later.

Should we make it so a driver can register with blk-integrity prior to
calling add_disk()? The other thread on this issue sounded like that'd
be better and removes the extra driver complexity to call blk_reread_part
or revalidate after add_disk.

Here's a patch for that and the resulting nvme driver. It works the same
as today if registered after add_disk(), but safe to call prior. If
called prior, add_disk() handles the queue and backing info settings
for integrity, and adds the kobj.

---
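The two-phase idea described above can be modelled outside the kernel.
This is only a toy sketch under stated assumptions: every toy_* name is
invented for illustration, the booleans stand in for GENHD_FL_UP,
kobject_add() and BDI_CAP_STABLE_WRITES, and nothing here is kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel objects; none of this is kernel API. */
struct toy_integrity {
	unsigned int interval;   /* protection interval in bytes */
	bool in_sysfs;           /* models kobject_add() having run */
};

struct toy_disk {
	bool up;                         /* models GENHD_FL_UP */
	unsigned int logical_block_size;
	bool stable_writes;              /* models BDI_CAP_STABLE_WRITES */
	struct toy_integrity *integrity;
};

/* Finish the parts of registration that need a live disk. */
static void toy_integrity_publish(struct toy_disk *disk)
{
	disk->integrity->interval = disk->logical_block_size;
	disk->integrity->in_sysfs = true;
	disk->stable_writes = true;
}

/* May be called before or after toy_add_disk(). */
static void toy_integrity_register(struct toy_disk *disk,
				   struct toy_integrity *bi)
{
	assert(disk != NULL && bi != NULL);

	bi->interval = 0;
	bi->in_sysfs = false;
	disk->integrity = bi;

	if (disk->up)
		toy_integrity_publish(disk);   /* same as today */
	/* otherwise toy_add_disk() publishes it later */
}

static void toy_add_disk(struct toy_disk *disk)
{
	disk->up = true;
	if (disk->integrity && !disk->integrity->in_sysfs)
		toy_integrity_publish(disk);
}
```

Registering first and then calling toy_add_disk() ends in the same state
as the reverse order, which is the property the patch is after: the
driver no longer needs a revalidate or partition reread after add_disk().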
diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index f548b64..d5d1bb9 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -426,17 +426,19 @@ int blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
  		if (!bi)
  			return -1;

-		if (kobject_init_and_add(&bi->kobj, &integrity_ktype,
+		if ((disk->flags & GENHD_FL_UP)) {
+			if (kobject_init_and_add(&bi->kobj, &integrity_ktype,
  					 &disk_to_dev(disk)->kobj,
  					 "%s", "integrity")) {
-			kmem_cache_free(integrity_cachep, bi);
-			return -1;
-		}
-
-		kobject_uevent(&bi->kobj, KOBJ_ADD);
+				kmem_cache_free(integrity_cachep, bi);
+				return -1;
+			}
+			kobject_uevent(&bi->kobj, KOBJ_ADD);
+			bi->interval = queue_logical_block_size(disk->queue);
+		} else
+			kobject_init(&bi->kobj, &integrity_ktype);

  		bi->flags |= BLK_INTEGRITY_VERIFY | BLK_INTEGRITY_GENERATE;
-		bi->interval = queue_logical_block_size(disk->queue);
  		disk->integrity = bi;
  	} else
  		bi = disk->integrity;
@@ -452,8 +454,8 @@ int blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
  	} else
  		bi->name = bi_unsupported_name;

-	disk->queue->backing_dev_info.capabilities |= BDI_CAP_STABLE_WRITES;
-
+	if (disk->flags & GENHD_FL_UP)
+		disk->queue->backing_dev_info.capabilities |= BDI_CAP_STABLE_WRITES;
  	return 0;
  }
  EXPORT_SYMBOL(blk_integrity_register);
diff --git a/block/genhd.c b/block/genhd.c
index 59a1395..a6f731d 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -582,6 +582,7 @@ exit:
   */
  void add_disk(struct gendisk *disk)
  {
+	struct blk_integrity *bi = disk->integrity;
  	struct backing_dev_info *bdi;
  	dev_t devt;
  	int retval;
@@ -616,6 +617,10 @@ void add_disk(struct gendisk *disk)

  	blk_register_region(disk_devt(disk), disk->minors, NULL,
  			    exact_match, exact_lock, disk);
+	if (bi) {
+		bi->interval = queue_logical_block_size(disk->queue);
+		bdi->capabilities |= BDI_CAP_STABLE_WRITES;
+	}
  	register_disk(disk);
  	blk_register_queue(disk);

@@ -630,6 +635,10 @@ void add_disk(struct gendisk *disk)
  	WARN_ON(retval);

  	disk_add_events(disk);
+
+	if (bi && kobject_add(&bi->kobj, &disk_to_dev(disk)->kobj,
+						"%s", "integrity"))
+		blk_integrity_unregister(disk);
  }
  EXPORT_SYMBOL(add_disk);

diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index e9127ad..d9792b7 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -1989,8 +1989,7 @@ static int nvme_revalidate_disk(struct gendisk *disk)
  	ns->pi_type = pi_type;
  	blk_queue_logical_block_size(ns->queue, bs);

-	if (ns->ms && !blk_get_integrity(disk) && (disk->flags & GENHD_FL_UP) &&
-								!ns->ext)
+	if (ns->ms && !blk_get_integrity(disk) && !ns->ext)
  		nvme_init_integrity(ns);

  	if (ns->ms && !blk_get_integrity(disk))
@@ -2134,28 +2133,9 @@ static void nvme_alloc_ns(struct nvme_dev *dev, unsigned nsid)
  	disk->flags = GENHD_FL_EXT_DEVT;
  	sprintf(disk->disk_name, "nvme%dn%d", dev->instance, nsid);

-	/*
-	 * Initialize capacity to 0 until we establish the namespace format and
-	 * setup integrity extentions if necessary. The revalidate_disk after
-	 * add_disk allows the driver to register with integrity if the format
-	 * requires it.
-	 */
-	set_capacity(disk, 0);
  	if (nvme_revalidate_disk(ns->disk))
  		goto out_free_disk;
-
  	add_disk(ns->disk);
-	if (ns->ms) {
-		struct block_device *bd = bdget_disk(ns->disk, 0);
-		if (!bd)
-			return;
-		if (blkdev_get(bd, FMODE_READ, NULL)) {
-			bdput(bd);
-			return;
-		}
-		blkdev_reread_part(bd);
-		blkdev_put(bd, FMODE_READ);
-	}
  	return;
   out_free_disk:
  	kfree(disk);
--


* [PATCH] NVMe: Reread partitions on metadata formats
  2015-07-15 22:28       ` Keith Busch
@ 2015-07-16  9:19         ` Christoph Hellwig
  2015-07-17  1:47           ` Martin K. Petersen
  2015-07-21  6:02           ` Data integrity tweaks Martin K. Petersen
  2015-07-17  1:44         ` [PATCH] NVMe: Reread partitions on metadata formats Martin K. Petersen
  1 sibling, 2 replies; 67+ messages in thread
From: Christoph Hellwig @ 2015-07-16  9:19 UTC (permalink / raw)


On Wed, Jul 15, 2015 at 10:28:32PM +0000, Keith Busch wrote:
> Should we make it so a driver can register with blk-integrity prior to
> calling add_disk()? The other thread on this issue sounded like that'd
> be better and removes the extra driver complexity to call blk_reread_part
> or revalidate after add_disk.
> 
> Here's a patch for that and the resulting nvme driver. It works the same
> as today if registered after add_disk(), but safe to call prior. If
> called prior, add_disk() handles the queue and backing info settings
> for integrity, and adds the kobj.

This looks sensible to me as a quick fix for the issue.

In the long run I'd prefer to just pass the integrity template
to add_disk (or rather an add_disk_integrity helper, for which
add_disk becomes a wrapper).

Also the way struct blk_integrity is used is a bit of a mess,
instead of passing it and then copying into a newly allocated
copy of it I'd rather move the kobj to struct gendisk and then
just have a pointer to the original template in struct gendisk.


* [PATCH] NVMe: Reread partitions on metadata formats
  2015-07-15 22:28       ` Keith Busch
  2015-07-16  9:19         ` Christoph Hellwig
@ 2015-07-17  1:44         ` Martin K. Petersen
  1 sibling, 0 replies; 67+ messages in thread
From: Martin K. Petersen @ 2015-07-17  1:44 UTC (permalink / raw)


>>>>> "Keith" == Keith Busch <keith.busch at intel.com> writes:

Keith> Here's a patch for that and the resulting nvme driver. It works
Keith> the same as today if registered after add_disk(), but safe to
Keith> call prior. If called prior, add_disk() handles the queue and
Keith> backing info settings for integrity, and adds the kobj.

This is going to blow up without blk integrity compiled in.

I'll take a stab at it tomorrow.
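
One plausible reading of the concern: generic code such as add_disk()
touches disk->integrity directly, but that field only exists in struct
gendisk when CONFIG_BLK_DEV_INTEGRITY is set. A toy sketch of the usual
remedy, an accessor with a stub for the disabled case, is below. All
toy_* names and the TOY_CONFIG_INTEGRITY switch are invented for
illustration; this is not the fix that was later posted.

```c
#include <assert.h>
#include <stddef.h>

/* Define this to model a build with integrity support compiled in. */
/* #define TOY_CONFIG_INTEGRITY 1 */

struct toy_integrity {
	unsigned int interval;
};

struct toy_disk {
	unsigned int logical_block_size;
#ifdef TOY_CONFIG_INTEGRITY
	struct toy_integrity *integrity;   /* only exists when configured */
#endif
};

#ifdef TOY_CONFIG_INTEGRITY
static struct toy_integrity *toy_get_integrity(struct toy_disk *disk)
{
	return disk->integrity;
}
#else
/* Stub: callers compile either way and simply see "no profile". */
static struct toy_integrity *toy_get_integrity(struct toy_disk *disk)
{
	(void)disk;
	return NULL;
}
#endif

/* Generic code goes through the accessor, never through the field. */
static unsigned int toy_effective_interval(struct toy_disk *disk)
{
	struct toy_integrity *bi;

	assert(disk != NULL);
	bi = toy_get_integrity(disk);
	return bi ? bi->interval : 0;
}
```

With the switch left undefined, toy_effective_interval() still builds
and reports 0, whereas a direct disk->integrity access in the same
function would fail to compile.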

-- 
Martin K. Petersen	Oracle Linux Engineering


* [PATCH] NVMe: Reread partitions on metadata formats
  2015-07-16  9:19         ` Christoph Hellwig
@ 2015-07-17  1:47           ` Martin K. Petersen
  2015-07-17  9:30             ` Christoph Hellwig
  2015-07-21  6:02           ` Data integrity tweaks Martin K. Petersen
  1 sibling, 1 reply; 67+ messages in thread
From: Martin K. Petersen @ 2015-07-17  1:47 UTC (permalink / raw)


>>>>> "Christoph" == Christoph Hellwig <hch at infradead.org> writes:

Christoph> Also the way struct blk_integrity is used is a bit of a mess,
Christoph> instead of passing it and then copying into a newly allocated
Christoph> copy of it I'd rather move the kobj to struct gendisk and
Christoph> then just have a pointer to the original template in struct
Christoph> gendisk.

The flags, protection interval, tag size, etc. are per-device
properties.

-- 
Martin K. Petersen	Oracle Linux Engineering


* [PATCH] NVMe: Reread partitions on metadata formats
  2015-07-17  1:47           ` Martin K. Petersen
@ 2015-07-17  9:30             ` Christoph Hellwig
  0 siblings, 0 replies; 67+ messages in thread
From: Christoph Hellwig @ 2015-07-17  9:30 UTC (permalink / raw)


On Thu, Jul 16, 2015 at 09:47:15PM -0400, Martin K. Petersen wrote:
> The flags, protection interval, tag size, etc. are per-device
> properties.

In which case we should just add them directly to the gendisk..


* Data integrity tweaks
  2015-07-16  9:19         ` Christoph Hellwig
  2015-07-17  1:47           ` Martin K. Petersen
@ 2015-07-21  6:02           ` Martin K. Petersen
  2015-07-21  6:02             ` [PATCH 1/5] block: Move integrity kobject to struct gendisk Martin K. Petersen
                               ` (4 more replies)
  1 sibling, 5 replies; 67+ messages in thread
From: Martin K. Petersen @ 2015-07-21  6:02 UTC (permalink / raw)


Christoph suggested moving blk_integrity to struct gendisk to overcome
some of the headaches we have wrt. NVMe and metadata. This patch series
aims to simplify the code and to accommodate requirements that didn't
exist when the code was originally conceived.

I have not done a full integrity qual cycle on top of this series so it
is mostly meant as an RFC. Would like to hear if people agree with the
overall approach first.

-- 
Martin K. Petersen	Oracle Linux Engineering


* [PATCH 1/5] block: Move integrity kobject to struct gendisk
  2015-07-21  6:02           ` Data integrity tweaks Martin K. Petersen
@ 2015-07-21  6:02             ` Martin K. Petersen
  2015-07-22 11:32               ` Sagi Grimberg
  2015-07-21  6:02             ` [PATCH 2/5] block: Consolidate static integrity profile properties Martin K. Petersen
                               ` (3 subsequent siblings)
  4 siblings, 1 reply; 67+ messages in thread
From: Martin K. Petersen @ 2015-07-21  6:02 UTC (permalink / raw)


The integrity kobject purely exists to support the integrity
subdirectory in sysfs and doesn't really have anything to do with the
blk_integrity data structure. Move the kobject to struct gendisk where
it belongs.

Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
Reported-by: Christoph Hellwig <hch at lst.de>
---
 block/blk-integrity.c  | 22 +++++++++++-----------
 include/linux/blkdev.h |  2 --
 include/linux/genhd.h  |  1 +
 3 files changed, 12 insertions(+), 13 deletions(-)
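
The sysfs handlers in this patch recover the disk from the embedded
kobject with container_of() and then look up the profile through the
disk. A userspace sketch of that pointer arithmetic follows; the toy_*
types and the toy_container_of() macro are simplified stand-ins written
for illustration, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of() macro. */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct toy_kobject {
	int refcount;
};

struct toy_integrity {
	unsigned char tuple_size;
};

struct toy_disk {
	int node_id;
	struct toy_integrity *integrity;
	struct toy_kobject integrity_kobj;   /* embedded, as in the patch */
};

/* What a sysfs show/store handler does: kobject -> disk -> profile. */
static struct toy_integrity *toy_kobj_to_integrity(struct toy_kobject *kobj)
{
	struct toy_disk *disk;

	assert(kobj != NULL);
	disk = toy_container_of(kobj, struct toy_disk, integrity_kobj);
	return disk->integrity;
}
```

Because the kobject now lives in the disk rather than in the integrity
structure, the handler needs one extra hop, but the integrity structure
itself no longer has to carry sysfs bookkeeping.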

diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index f548b64be092..6a173a7d1ec6 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -246,8 +246,8 @@ struct integrity_sysfs_entry {
 static ssize_t integrity_attr_show(struct kobject *kobj, struct attribute *attr,
 				   char *page)
 {
-	struct blk_integrity *bi =
-		container_of(kobj, struct blk_integrity, kobj);
+	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
+	struct blk_integrity *bi = blk_get_integrity(disk);
 	struct integrity_sysfs_entry *entry =
 		container_of(attr, struct integrity_sysfs_entry, attr);
 
@@ -258,8 +258,8 @@ static ssize_t integrity_attr_store(struct kobject *kobj,
 				    struct attribute *attr, const char *page,
 				    size_t count)
 {
-	struct blk_integrity *bi =
-		container_of(kobj, struct blk_integrity, kobj);
+	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
+	struct blk_integrity *bi = blk_get_integrity(disk);
 	struct integrity_sysfs_entry *entry =
 		container_of(attr, struct integrity_sysfs_entry, attr);
 	ssize_t ret = 0;
@@ -382,8 +382,8 @@ subsys_initcall(blk_dev_integrity_init);
 
 static void blk_integrity_release(struct kobject *kobj)
 {
-	struct blk_integrity *bi =
-		container_of(kobj, struct blk_integrity, kobj);
+	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
+	struct blk_integrity *bi = blk_get_integrity(disk);
 
 	kmem_cache_free(integrity_cachep, bi);
 }
@@ -426,14 +426,14 @@ int blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
 		if (!bi)
 			return -1;
 
-		if (kobject_init_and_add(&bi->kobj, &integrity_ktype,
+		if (kobject_init_and_add(&disk->integrity_kobj, &integrity_ktype,
 					 &disk_to_dev(disk)->kobj,
 					 "%s", "integrity")) {
 			kmem_cache_free(integrity_cachep, bi);
 			return -1;
 		}
 
-		kobject_uevent(&bi->kobj, KOBJ_ADD);
+		kobject_uevent(&disk->integrity_kobj, KOBJ_ADD);
 
 		bi->flags |= BLK_INTEGRITY_VERIFY | BLK_INTEGRITY_GENERATE;
 		bi->interval = queue_logical_block_size(disk->queue);
@@ -476,9 +476,9 @@ void blk_integrity_unregister(struct gendisk *disk)
 
 	bi = disk->integrity;
 
-	kobject_uevent(&bi->kobj, KOBJ_REMOVE);
-	kobject_del(&bi->kobj);
-	kobject_put(&bi->kobj);
+	kobject_uevent(&disk->integrity_kobj, KOBJ_REMOVE);
+	kobject_del(&disk->integrity_kobj);
+	kobject_put(&disk->integrity_kobj);
 	disk->integrity = NULL;
 }
 EXPORT_SYMBOL(blk_integrity_unregister);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 243f29e779ec..7d5fb4d88519 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1437,8 +1437,6 @@ struct blk_integrity {
 	unsigned short		tag_size;
 
 	const char		*name;
-
-	struct kobject		kobj;
 };
 
 extern bool blk_integrity_is_initialized(struct gendisk *);
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 2adbfa6d02bc..9e6e0dfa97ad 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -199,6 +199,7 @@ struct gendisk {
 	struct disk_events *ev;
 #ifdef  CONFIG_BLK_DEV_INTEGRITY
 	struct blk_integrity *integrity;
+	struct kobject integrity_kobj;
 #endif
 	int node_id;
 };
-- 
2.4.3


* [PATCH 2/5] block: Consolidate static integrity profile properties
  2015-07-21  6:02           ` Data integrity tweaks Martin K. Petersen
  2015-07-21  6:02             ` [PATCH 1/5] block: Move integrity kobject to struct gendisk Martin K. Petersen
@ 2015-07-21  6:02             ` Martin K. Petersen
  2015-07-22 11:33               ` Sagi Grimberg
  2015-07-21  6:02             ` [PATCH 3/5] block: Reduce the size of struct blk_integrity Martin K. Petersen
                               ` (2 subsequent siblings)
  4 siblings, 1 reply; 67+ messages in thread
From: Martin K. Petersen @ 2015-07-21  6:02 UTC (permalink / raw)


We previously made a complete copy of a device's data integrity profile
even though several of the fields inside the blk_integrity struct are
pointers to fixed template entries in t10-pi.c.

Split the static and per-device portions so that we can reference the
template directly.

Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
Reported-by: Christoph Hellwig <hch at lst.de>
---
 block/bio-integrity.c               |  8 ++++----
 block/blk-integrity.c               | 17 ++++++++---------
 block/t10-pi.c                      | 16 ++++------------
 drivers/block/nvme-core.c           |  8 ++++----
 drivers/scsi/sd_dif.c               | 29 ++++++++++++++++-------------
 drivers/target/target_core_iblock.c | 10 +++++-----
 include/linux/blkdev.h              | 20 +++++++++++---------
 include/linux/t10-pi.h              |  8 ++++----
 8 files changed, 56 insertions(+), 60 deletions(-)
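
The split described above can be sketched in plain C: a shared, static
profile holds the function pointers and the name, and each device keeps
only a pointer to it plus its own properties. The toy_* names below are
invented for illustration and the callbacks are placeholders; the real
definitions are in the diff that follows.

```c
#include <assert.h>
#include <stddef.h>

typedef int (toy_processing_fn)(const void *data);

/* Static, shared part: one instance per integrity format. */
struct toy_profile {
	toy_processing_fn *generate_fn;
	toy_processing_fn *verify_fn;
	const char *name;
};

/* Per-device part: points at a profile instead of copying it. */
struct toy_integrity {
	const struct toy_profile *profile;
	unsigned short flags;
	unsigned short tuple_size;
	unsigned short tag_size;
};

/* Placeholder callback; a real profile would generate or verify PI. */
static int toy_noop(const void *data)
{
	(void)data;
	return 0;
}

static const struct toy_profile toy_type1_crc = {
	.generate_fn = toy_noop,
	.verify_fn = toy_noop,
	.name = "TOY-TYPE1-CRC",
};

static const struct toy_profile toy_type3_crc = {
	.generate_fn = toy_noop,
	.verify_fn = toy_noop,
	.name = "TOY-TYPE3-CRC",
};

/* Same format if and only if both devices reference the same profile. */
static int toy_same_format(const struct toy_integrity *a,
			   const struct toy_integrity *b)
{
	assert(a != NULL && b != NULL);
	return a->profile == b->profile;
}
```

A pointer comparison replaces the earlier strcmp() on the name, which
mirrors the change to blk_integrity_compare(). When the per-device
structure is built on the stack, a designated initializer such as
`{ .profile = &toy_type1_crc }` zeroes the remaining fields, which
matters if flags are OR-ed in afterwards.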

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 0436c21db7f2..f09531130cff 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -172,11 +172,11 @@ bool bio_integrity_enabled(struct bio *bio)
 	if (bi == NULL)
 		return false;
 
-	if (bio_data_dir(bio) == READ && bi->verify_fn != NULL &&
+	if (bio_data_dir(bio) == READ && bi->profile->verify_fn != NULL &&
 	    (bi->flags & BLK_INTEGRITY_VERIFY))
 		return true;
 
-	if (bio_data_dir(bio) == WRITE && bi->generate_fn != NULL &&
+	if (bio_data_dir(bio) == WRITE && bi->profile->generate_fn != NULL &&
 	    (bi->flags & BLK_INTEGRITY_GENERATE))
 		return true;
 
@@ -335,7 +335,7 @@ int bio_integrity_prep(struct bio *bio)
 
 	/* Auto-generate integrity metadata if this is a write */
 	if (bio_data_dir(bio) == WRITE)
-		bio_integrity_process(bio, bi->generate_fn);
+		bio_integrity_process(bio, bi->profile->generate_fn);
 
 	return 0;
 }
@@ -357,7 +357,7 @@ static void bio_integrity_verify_fn(struct work_struct *work)
 	struct blk_integrity *bi = bdev_get_integrity(bio->bi_bdev);
 	int error;
 
-	error = bio_integrity_process(bio, bi->verify_fn);
+	error = bio_integrity_process(bio, bi->profile->verify_fn);
 
 	/* Restore original bio completion handler */
 	bio->bi_end_io = bip->bip_end_io;
diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index 6a173a7d1ec6..5e5280d58c61 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -176,10 +176,10 @@ int blk_integrity_compare(struct gendisk *gd1, struct gendisk *gd2)
 		return -1;
 	}
 
-	if (strcmp(b1->name, b2->name)) {
+	if (b1->profile != b2->profile) {
 		printk(KERN_ERR "%s: %s/%s type %s != %s\n", __func__,
 		       gd1->disk_name, gd2->disk_name,
-		       b1->name, b2->name);
+		       b1->profile->name, b2->profile->name);
 		return -1;
 	}
 
@@ -272,8 +272,8 @@ static ssize_t integrity_attr_store(struct kobject *kobj,
 
 static ssize_t integrity_format_show(struct blk_integrity *bi, char *page)
 {
-	if (bi != NULL && bi->name != NULL)
-		return sprintf(page, "%s\n", bi->name);
+	if (bi != NULL && bi->profile->name != NULL)
+		return sprintf(page, "%s\n", bi->profile->name);
 	else
 		return sprintf(page, "none\n");
 }
@@ -398,7 +398,8 @@ bool blk_integrity_is_initialized(struct gendisk *disk)
 {
 	struct blk_integrity *bi = blk_get_integrity(disk);
 
-	return (bi && bi->name && strcmp(bi->name, bi_unsupported_name) != 0);
+	return (bi && bi->profile->name && strcmp(bi->profile->name,
+						  bi_unsupported_name) != 0);
 }
 EXPORT_SYMBOL(blk_integrity_is_initialized);
 
@@ -443,14 +444,12 @@ int blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
 
 	/* Use the provided profile as template */
 	if (template != NULL) {
-		bi->name = template->name;
-		bi->generate_fn = template->generate_fn;
-		bi->verify_fn = template->verify_fn;
+		bi->profile = template->profile;
 		bi->tuple_size = template->tuple_size;
 		bi->tag_size = template->tag_size;
 		bi->flags |= template->flags;
 	} else
-		bi->name = bi_unsupported_name;
+		bi->profile->name = bi_unsupported_name;
 
 	disk->queue->backing_dev_info.capabilities |= BDI_CAP_STABLE_WRITES;
 
diff --git a/block/t10-pi.c b/block/t10-pi.c
index 24d6e9715318..2c97912335a9 100644
--- a/block/t10-pi.c
+++ b/block/t10-pi.c
@@ -160,38 +160,30 @@ static int t10_pi_type3_verify_ip(struct blk_integrity_iter *iter)
 	return t10_pi_verify(iter, t10_pi_ip_fn, 3);
 }
 
-struct blk_integrity t10_pi_type1_crc = {
+struct blk_integrity_profile t10_pi_type1_crc = {
 	.name			= "T10-DIF-TYPE1-CRC",
 	.generate_fn		= t10_pi_type1_generate_crc,
 	.verify_fn		= t10_pi_type1_verify_crc,
-	.tuple_size		= sizeof(struct t10_pi_tuple),
-	.tag_size		= 0,
 };
 EXPORT_SYMBOL(t10_pi_type1_crc);
 
-struct blk_integrity t10_pi_type1_ip = {
+struct blk_integrity_profile t10_pi_type1_ip = {
 	.name			= "T10-DIF-TYPE1-IP",
 	.generate_fn		= t10_pi_type1_generate_ip,
 	.verify_fn		= t10_pi_type1_verify_ip,
-	.tuple_size		= sizeof(struct t10_pi_tuple),
-	.tag_size		= 0,
 };
 EXPORT_SYMBOL(t10_pi_type1_ip);
 
-struct blk_integrity t10_pi_type3_crc = {
+struct blk_integrity_profile t10_pi_type3_crc = {
 	.name			= "T10-DIF-TYPE3-CRC",
 	.generate_fn		= t10_pi_type3_generate_crc,
 	.verify_fn		= t10_pi_type3_verify_crc,
-	.tuple_size		= sizeof(struct t10_pi_tuple),
-	.tag_size		= 0,
 };
 EXPORT_SYMBOL(t10_pi_type3_crc);
 
-struct blk_integrity t10_pi_type3_ip = {
+struct blk_integrity_profile t10_pi_type3_ip = {
 	.name			= "T10-DIF-TYPE3-IP",
 	.generate_fn		= t10_pi_type3_generate_ip,
 	.verify_fn		= t10_pi_type3_verify_ip,
-	.tuple_size		= sizeof(struct t10_pi_tuple),
-	.tag_size		= 0,
 };
 EXPORT_SYMBOL(t10_pi_type3_ip);
diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index b1eb9d321071..db9d39e84933 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -549,7 +549,7 @@ static int nvme_noop_generate(struct blk_integrity_iter *iter)
 	return 0;
 }
 
-struct blk_integrity nvme_meta_noop = {
+struct blk_integrity_profile nvme_meta_noop = {
 	.name			= "NVME_META_NOOP",
 	.generate_fn		= nvme_noop_generate,
 	.verify_fn		= nvme_noop_verify,
@@ -561,14 +561,14 @@ static void nvme_init_integrity(struct nvme_ns *ns)
 
 	switch (ns->pi_type) {
 	case NVME_NS_DPS_PI_TYPE3:
-		integrity = t10_pi_type3_crc;
+		integrity.profile = &t10_pi_type3_crc;
 		break;
 	case NVME_NS_DPS_PI_TYPE1:
 	case NVME_NS_DPS_PI_TYPE2:
-		integrity = t10_pi_type1_crc;
+		integrity.profile = &t10_pi_type1_crc;
 		break;
 	default:
-		integrity = nvme_meta_noop;
+		integrity.profile = &nvme_meta_noop;
 		break;
 	}
 	integrity.tuple_size = ns->ms;
diff --git a/drivers/scsi/sd_dif.c b/drivers/scsi/sd_dif.c
index 5c06d292b94c..5a5ec9aa26b3 100644
--- a/drivers/scsi/sd_dif.c
+++ b/drivers/scsi/sd_dif.c
@@ -43,6 +43,7 @@ void sd_dif_config_host(struct scsi_disk *sdkp)
 	struct scsi_device *sdp = sdkp->device;
 	struct gendisk *disk = sdkp->disk;
 	u8 type = sdkp->protection_type;
+	struct blk_integrity bi;
 	int dif, dix;
 
 	dif = scsi_host_dif_capable(sdp->host, type);
@@ -58,36 +59,38 @@ void sd_dif_config_host(struct scsi_disk *sdkp)
 	/* Enable DMA of protection information */
 	if (scsi_host_get_guard(sdkp->device->host) & SHOST_DIX_GUARD_IP) {
 		if (type == SD_DIF_TYPE3_PROTECTION)
-			blk_integrity_register(disk, &t10_pi_type3_ip);
+			bi.profile = &t10_pi_type3_ip;
 		else
-			blk_integrity_register(disk, &t10_pi_type1_ip);
+			bi.profile = &t10_pi_type1_ip;
 
-		disk->integrity->flags |= BLK_INTEGRITY_IP_CHECKSUM;
+		bi.flags |= BLK_INTEGRITY_IP_CHECKSUM;
 	} else
 		if (type == SD_DIF_TYPE3_PROTECTION)
-			blk_integrity_register(disk, &t10_pi_type3_crc);
+			bi.profile = &t10_pi_type3_crc;
 		else
-			blk_integrity_register(disk, &t10_pi_type1_crc);
+			bi.profile = &t10_pi_type1_crc;
 
+	bi.tuple_size = sizeof(struct t10_pi_tuple);
 	sd_printk(KERN_NOTICE, sdkp,
-		  "Enabling DIX %s protection\n", disk->integrity->name);
+		  "Enabling DIX %s protection\n", bi.profile->name);
 
-	/* Signal to block layer that we support sector tagging */
 	if (dif && type) {
-
-		disk->integrity->flags |= BLK_INTEGRITY_DEVICE_CAPABLE;
+		bi.flags |= BLK_INTEGRITY_DEVICE_CAPABLE;
 
 		if (!sdkp->ATO)
-			return;
+			goto out;
 
 		if (type == SD_DIF_TYPE3_PROTECTION)
-			disk->integrity->tag_size = sizeof(u16) + sizeof(u32);
+			bi.tag_size = sizeof(u16) + sizeof(u32);
 		else
-			disk->integrity->tag_size = sizeof(u16);
+			bi.tag_size = sizeof(u16);
 
 		sd_printk(KERN_NOTICE, sdkp, "DIF application tag size %u\n",
-			  disk->integrity->tag_size);
+			  bi.tag_size);
 	}
+
+out:
+	blk_integrity_register(disk, &bi);
 }
 
 /*
diff --git a/drivers/target/target_core_iblock.c b/drivers/target/target_core_iblock.c
index 6d88d24e6cce..fa15cfabbe01 100644
--- a/drivers/target/target_core_iblock.c
+++ b/drivers/target/target_core_iblock.c
@@ -153,17 +153,17 @@ static int iblock_configure_device(struct se_device *dev)
 	if (bi) {
 		struct bio_set *bs = ib_dev->ibd_bio_set;
 
-		if (!strcmp(bi->name, "T10-DIF-TYPE3-IP") ||
-		    !strcmp(bi->name, "T10-DIF-TYPE1-IP")) {
+		if (!strcmp(bi->profile->name, "T10-DIF-TYPE3-IP") ||
+		    !strcmp(bi->profile->name, "T10-DIF-TYPE1-IP")) {
 			pr_err("IBLOCK export of blk_integrity: %s not"
-			       " supported\n", bi->name);
+			       " supported\n", bi->profile->name);
 			ret = -ENOSYS;
 			goto out_blkdev_put;
 		}
 
-		if (!strcmp(bi->name, "T10-DIF-TYPE3-CRC")) {
+		if (!strcmp(bi->profile->name, "T10-DIF-TYPE3-CRC")) {
 			dev->dev_attrib.pi_prot_type = TARGET_DIF_TYPE3_PROT;
-		} else if (!strcmp(bi->name, "T10-DIF-TYPE1-CRC")) {
+		} else if (!strcmp(bi->profile->name, "T10-DIF-TYPE1-CRC")) {
 			dev->dev_attrib.pi_prot_type = TARGET_DIF_TYPE1_PROT;
 		}
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 7d5fb4d88519..4fbb3b9f7f2e 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1427,16 +1427,18 @@ struct blk_integrity_iter {
 
 typedef int (integrity_processing_fn) (struct blk_integrity_iter *);
 
-struct blk_integrity {
-	integrity_processing_fn	*generate_fn;
-	integrity_processing_fn	*verify_fn;
-
-	unsigned short		flags;
-	unsigned short		tuple_size;
-	unsigned short		interval;
-	unsigned short		tag_size;
+struct blk_integrity_profile {
+	integrity_processing_fn		*generate_fn;
+	integrity_processing_fn		*verify_fn;
+	const char			*name;
+};
 
-	const char		*name;
+struct blk_integrity {
+	struct blk_integrity_profile	*profile;
+	unsigned short			flags;
+	unsigned short			tuple_size;
+	unsigned short			interval;
+	unsigned short			tag_size;
 };
 
 extern bool blk_integrity_is_initialized(struct gendisk *);
diff --git a/include/linux/t10-pi.h b/include/linux/t10-pi.h
index 6a8b9942632d..dd8de82cf5b5 100644
--- a/include/linux/t10-pi.h
+++ b/include/linux/t10-pi.h
@@ -14,9 +14,9 @@ struct t10_pi_tuple {
 };
 
 
-extern struct blk_integrity t10_pi_type1_crc;
-extern struct blk_integrity t10_pi_type1_ip;
-extern struct blk_integrity t10_pi_type3_crc;
-extern struct blk_integrity t10_pi_type3_ip;
+extern struct blk_integrity_profile t10_pi_type1_crc;
+extern struct blk_integrity_profile t10_pi_type1_ip;
+extern struct blk_integrity_profile t10_pi_type3_crc;
+extern struct blk_integrity_profile t10_pi_type3_ip;
 
 #endif
-- 
2.4.3

* [PATCH 3/5] block: Reduce the size of struct blk_integrity
  2015-07-21  6:02           ` Data integrity tweaks Martin K. Petersen
  2015-07-21  6:02             ` [PATCH 1/5] block: Move integrity kobject to struct gendisk Martin K. Petersen
  2015-07-21  6:02             ` [PATCH 2/5] block: Consolidate static integrity profile properties Martin K. Petersen
@ 2015-07-21  6:02             ` Martin K. Petersen
  2015-07-21 11:53               ` Christoph Hellwig
  2015-07-22 11:35               ` Sagi Grimberg
  2015-07-21  6:02             ` [PATCH 4/5] block: Export integrity data interval size in sysfs Martin K. Petersen
  2015-07-21  6:02             ` [PATCH 5/5] block: Inline blk_integrity in struct gendisk Martin K. Petersen
  4 siblings, 2 replies; 67+ messages in thread
From: Martin K. Petersen @ 2015-07-21  6:02 UTC (permalink / raw)


The per-device properties in the blk_integrity structure were previously
unsigned short. However, most of the values fit inside a char. The only
exception is the data interval size and we can work around that by
storing it as a power of two.

This cuts the size of the dynamic portion of blk_integrity in half.
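
As a rough user-space illustration of the size claim (not kernel code):
the struct names props_before and props_after and the three helpers
below are hypothetical, and a simple loop stands in for the kernel's
ilog2(). The fields mirror the ones touched by this patch.

```c
/* Hypothetical stand-ins for the per-device fields before this patch. */
struct props_before {
	unsigned short flags;
	unsigned short tuple_size;
	unsigned short interval;	/* e.g. 512 or 4096 */
	unsigned short tag_size;
};

/* ... and after it: every field now fits in one byte. */
struct props_after {
	unsigned char flags;
	unsigned char tuple_size;
	unsigned char interval_exp;	/* e.g. 9 or 12 */
	unsigned char tag_size;
};

/* Portable substitute for ilog2(); assumes a power-of-two input. */
static unsigned char interval_to_exp(unsigned int interval)
{
	unsigned char exp = 0;

	while (interval > 1) {
		interval >>= 1;
		exp++;
	}
	return exp;
}

/* Recover the interval in bytes from the stored exponent. */
static unsigned int exp_to_interval(unsigned char exp)
{
	return 1U << exp;
}

/* Same arithmetic as bio_integrity_intervals(): 512-byte sectors in, intervals out. */
static unsigned int sectors_to_intervals(unsigned char exp, unsigned int sectors)
{
	return sectors >> (exp - 9);
}
```

A 4096-byte interval does not fit in an unsigned char, but its exponent
(12) does. On platforms where unsigned short is two bytes, the four
fields shrink from 8 bytes to 4.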

Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
Reported-by: Christoph Hellwig <hch at lst.de>
---
 block/bio-integrity.c  | 4 ++--
 block/blk-integrity.c  | 6 +++---
 include/linux/blkdev.h | 8 ++++----
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index f09531130cff..4da7fbc28845 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -197,7 +197,7 @@ EXPORT_SYMBOL(bio_integrity_enabled);
 static inline unsigned int bio_integrity_intervals(struct blk_integrity *bi,
 						   unsigned int sectors)
 {
-	return sectors >> (ilog2(bi->interval) - 9);
+	return sectors >> (bi->interval_exp - 9);
 }
 
 static inline unsigned int bio_integrity_bytes(struct blk_integrity *bi,
@@ -224,7 +224,7 @@ static int bio_integrity_process(struct bio *bio,
 		bip->bip_vec->bv_offset;
 
 	iter.disk_name = bio->bi_bdev->bd_disk->disk_name;
-	iter.interval = bi->interval;
+	iter.interval = 1 << bi->interval_exp;
 	iter.seed = bip_get_seed(bip);
 	iter.prot_buf = prot_buf;
 
diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index 5e5280d58c61..70ba9389d0dd 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -155,10 +155,10 @@ int blk_integrity_compare(struct gendisk *gd1, struct gendisk *gd2)
 	if (!b1 || !b2)
 		return -1;
 
-	if (b1->interval != b2->interval) {
+	if (b1->interval_exp != b2->interval_exp) {
 		pr_err("%s: %s/%s protection interval %u != %u\n",
 		       __func__, gd1->disk_name, gd2->disk_name,
-		       b1->interval, b2->interval);
+		       1 << b1->interval_exp, 1 << b2->interval_exp);
 		return -1;
 	}
 
@@ -437,7 +437,7 @@ int blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
 		kobject_uevent(&disk->integrity_kobj, KOBJ_ADD);
 
 		bi->flags |= BLK_INTEGRITY_VERIFY | BLK_INTEGRITY_GENERATE;
-		bi->interval = queue_logical_block_size(disk->queue);
+		bi->interval_exp = ilog2(queue_logical_block_size(disk->queue));
 		disk->integrity = bi;
 	} else
 		bi = disk->integrity;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 4fbb3b9f7f2e..c01ce2f7168b 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1435,10 +1435,10 @@ struct blk_integrity_profile {
 
 struct blk_integrity {
 	struct blk_integrity_profile	*profile;
-	unsigned short			flags;
-	unsigned short			tuple_size;
-	unsigned short			interval;
-	unsigned short			tag_size;
+	unsigned char			flags;
+	unsigned char			tuple_size;
+	unsigned char			interval_exp;
+	unsigned char			tag_size;
 };
 
 extern bool blk_integrity_is_initialized(struct gendisk *);
-- 
2.4.3

* [PATCH 4/5] block: Export integrity data interval size in sysfs
  2015-07-21  6:02           ` Data integrity tweaks Martin K. Petersen
                               ` (2 preceding siblings ...)
  2015-07-21  6:02             ` [PATCH 3/5] block: Reduce the size of struct blk_integrity Martin K. Petersen
@ 2015-07-21  6:02             ` Martin K. Petersen
  2015-07-22 11:37               ` Sagi Grimberg
  2015-07-21  6:02             ` [PATCH 5/5] block: Inline blk_integrity in struct gendisk Martin K. Petersen
  4 siblings, 1 reply; 67+ messages in thread
From: Martin K. Petersen @ 2015-07-21  6:02 UTC (permalink / raw)


The size of the data interval was not exported in the sysfs integrity
directory. Export it so that userland apps can tell whether the interval
is different from the device's logical block size.
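
A minimal sketch of how a userland app might consume the new attribute,
assuming the standard /sys/block/<disk>/queue/logical_block_size file
is used for the comparison. The helper names read_sysfs_uint() and
interval_differs_from_block_size() are hypothetical and not part of
this patch.

```c
#include <stdio.h>

/*
 * Read a single unsigned decimal value from a sysfs-style text file.
 * Returns 0 on success, -1 if the file cannot be opened or parsed.
 */
static int read_sysfs_uint(const char *path, unsigned int *val)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%u", val) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/*
 * Compare the protection interval with the logical block size.
 * Returns 1 if they differ, 0 if they match, and -1 if either
 * attribute is unreadable. "disk" is a device name such as "sda".
 */
static int interval_differs_from_block_size(const char *disk)
{
	char path[256];
	unsigned int interval, lbs;

	snprintf(path, sizeof(path),
		 "/sys/block/%s/integrity/data_interval_bytes", disk);
	if (read_sysfs_uint(path, &interval))
		return -1;

	snprintf(path, sizeof(path),
		 "/sys/block/%s/queue/logical_block_size", disk);
	if (read_sysfs_uint(path, &lbs))
		return -1;

	return interval != lbs;
}
```

Note that the show routine added below prints 0 when no integrity
profile is present, so a caller would likely want to treat an interval
of 0 as "no integrity" rather than as a mismatch.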

Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
---
 Documentation/ABI/testing/sysfs-block |  7 +++++++
 block/blk-integrity.c                 | 14 ++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-block b/Documentation/ABI/testing/sysfs-block
index 8df003963d99..a61f1e2165ee 100644
--- a/Documentation/ABI/testing/sysfs-block
+++ b/Documentation/ABI/testing/sysfs-block
@@ -60,6 +60,13 @@ Description:
 		Indicates whether a storage device is capable of storing
 		integrity metadata. Set if the device is T10 PI-capable.
 
+What:		/sys/block/<disk>/integrity/data_interval_bytes
+Date:		July 2015
+Contact:	Martin K. Petersen <martin.petersen at oracle.com>
+Description:
+		Describes the number of data bytes which are protected
+		by one integrity tuple. Typically the device's logical
+		block size.
 
 What:		/sys/block/<disk>/integrity/write_generate
 Date:		June 2008
diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index 70ba9389d0dd..16d5a15d632a 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -286,6 +286,14 @@ static ssize_t integrity_tag_size_show(struct blk_integrity *bi, char *page)
 		return sprintf(page, "0\n");
 }
 
+static ssize_t integrity_interval_show(struct blk_integrity *bi, char *page)
+{
+	if (bi != NULL)
+		return sprintf(page, "%u\n", 1 << bi->interval_exp);
+	else
+		return sprintf(page, "0\n");
+}
+
 static ssize_t integrity_verify_store(struct blk_integrity *bi,
 				      const char *page, size_t count)
 {
@@ -340,6 +348,11 @@ static struct integrity_sysfs_entry integrity_tag_size_entry = {
 	.show = integrity_tag_size_show,
 };
 
+static struct integrity_sysfs_entry integrity_interval_entry = {
+	.attr = { .name = "data_interval_bytes", .mode = S_IRUGO },
+	.show = integrity_interval_show,
+};
+
 static struct integrity_sysfs_entry integrity_verify_entry = {
 	.attr = { .name = "read_verify", .mode = S_IRUGO | S_IWUSR },
 	.show = integrity_verify_show,
@@ -360,6 +373,7 @@ static struct integrity_sysfs_entry integrity_device_entry = {
 static struct attribute *integrity_attrs[] = {
 	&integrity_format_entry.attr,
 	&integrity_tag_size_entry.attr,
+	&integrity_interval_entry.attr,
 	&integrity_verify_entry.attr,
 	&integrity_generate_entry.attr,
 	&integrity_device_entry.attr,
-- 
2.4.3

* [PATCH 5/5] block: Inline blk_integrity in struct gendisk
  2015-07-21  6:02           ` Data integrity tweaks Martin K. Petersen
                               ` (3 preceding siblings ...)
  2015-07-21  6:02             ` [PATCH 4/5] block: Export integrity data interval size in sysfs Martin K. Petersen
@ 2015-07-21  6:02             ` Martin K. Petersen
  2015-07-21 12:01               ` Christoph Hellwig
  4 siblings, 1 reply; 67+ messages in thread
From: Martin K. Petersen @ 2015-07-21  6:02 UTC (permalink / raw)


Up until now the integrity profile has been dynamically allocated and
attached to struct gendisk after the disk has been made active.

This causes problems because NVMe devices need to register the profile
prior to the partition table being read due to a mandatory metadata
buffer requirement. In addition, DM jumps through hoops to deal with
preallocating, but not initializing, integrity profiles.

Since the integrity profile is small (4 bytes + a pointer), Christoph
suggested moving it to struct gendisk proper. This requires several
changes:

 - Moving the blk_integrity definitions to genhd.h.

 - Inlining blk_integrity in struct gendisk.

 - Removing the dynamic allocation code.

 - Adding helper functions which allow gendisk to set up and tear down
   the integrity sysfs dir when a disk is added/deleted.

 - Adding a blk_integrity_revalidate() callback for updating the stable
   pages bdi setting.

 - The calls that depend on whether a device has an integrity profile or
   not now key off of the bi->profile pointer.

 - Simplifying the integrity support routines in DM.
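
The core of the change can be sketched in user space as follows. These
are simplified mock-ups, not the kernel definitions: the structs are
trimmed down, and mock_register() and mock_unregister() are
hypothetical helpers that omit the flag handling, the interval
calculation and the revalidate call done by the real functions in the
diff below.

```c
#include <stddef.h>
#include <string.h>

/* Trimmed-down mock of the static profile. */
struct blk_integrity_profile {
	const char *name;
};

/* Per-device part: a pointer plus four bytes. */
struct blk_integrity {
	struct blk_integrity_profile *profile;
	unsigned char flags;
	unsigned char tuple_size;
	unsigned char interval_exp;
	unsigned char tag_size;
};

/* Mock gendisk: the integrity data is embedded, no longer a pointer. */
struct gendisk {
	struct blk_integrity integrity;
};

/* Same logic as the helper in the diff: no profile means no integrity. */
static struct blk_integrity *blk_get_integrity(struct gendisk *disk)
{
	struct blk_integrity *bi = &disk->integrity;

	return bi->profile ? bi : NULL;
}

/* Simplified registration: copy the template into the embedded struct. */
static void mock_register(struct gendisk *disk, const struct blk_integrity *tmpl)
{
	disk->integrity = *tmpl;
}

/* Simplified unregistration: zeroing also clears the profile pointer. */
static void mock_unregister(struct gendisk *disk)
{
	memset(&disk->integrity, 0, sizeof(disk->integrity));
}
```

Because the struct is always present, callers test bi->profile instead
of testing the blk_integrity pointer itself, and no allocation can fail
at registration time.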

Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
Reported-by: Christoph Hellwig <hch at lst.de>
Cc: Mike Snitzer <snitzer at redhat.com>
---
 block/blk-integrity.c     | 152 +++++++++++++++++-----------------------------
 block/genhd.c             |   2 +
 block/partition-generic.c |   1 +
 drivers/block/nvme-core.c |   2 +-
 drivers/md/dm-table.c     |  89 +++++++--------------------
 drivers/md/md.c           |   9 +--
 fs/block_dev.c            |   2 +-
 include/linux/blkdev.h    |  53 +++-------------
 include/linux/genhd.h     |  50 ++++++++++++++-
 9 files changed, 146 insertions(+), 214 deletions(-)

diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index 16d5a15d632a..3e03f47b2da7 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -30,10 +30,6 @@
 
 #include "blk.h"
 
-static struct kmem_cache *integrity_cachep;
-
-static const char *bi_unsupported_name = "unsupported";
-
 /**
  * blk_rq_count_integrity_sg - Count number of integrity scatterlist elements
  * @q:		request queue
@@ -146,13 +142,13 @@ EXPORT_SYMBOL(blk_rq_map_integrity_sg);
  */
 int blk_integrity_compare(struct gendisk *gd1, struct gendisk *gd2)
 {
-	struct blk_integrity *b1 = gd1->integrity;
-	struct blk_integrity *b2 = gd2->integrity;
+	struct blk_integrity *b1 = &gd1->integrity;
+	struct blk_integrity *b2 = &gd2->integrity;
 
-	if (!b1 && !b2)
+	if (!b1->profile && !b2->profile)
 		return 0;
 
-	if (!b1 || !b2)
+	if (!b1->profile || !b2->profile)
 		return -1;
 
 	if (b1->interval_exp != b2->interval_exp) {
@@ -163,21 +159,21 @@ int blk_integrity_compare(struct gendisk *gd1, struct gendisk *gd2)
 	}
 
 	if (b1->tuple_size != b2->tuple_size) {
-		printk(KERN_ERR "%s: %s/%s tuple sz %u != %u\n", __func__,
+		pr_err("%s: %s/%s tuple sz %u != %u\n", __func__,
 		       gd1->disk_name, gd2->disk_name,
 		       b1->tuple_size, b2->tuple_size);
 		return -1;
 	}
 
 	if (b1->tag_size && b2->tag_size && (b1->tag_size != b2->tag_size)) {
-		printk(KERN_ERR "%s: %s/%s tag sz %u != %u\n", __func__,
+		pr_err("%s: %s/%s tag sz %u != %u\n", __func__,
 		       gd1->disk_name, gd2->disk_name,
 		       b1->tag_size, b2->tag_size);
 		return -1;
 	}
 
 	if (b1->profile != b2->profile) {
-		printk(KERN_ERR "%s: %s/%s type %s != %s\n", __func__,
+		pr_err("%s: %s/%s type %s != %s\n", __func__,
 		       gd1->disk_name, gd2->disk_name,
 		       b1->profile->name, b2->profile->name);
 		return -1;
@@ -247,7 +243,7 @@ static ssize_t integrity_attr_show(struct kobject *kobj, struct attribute *attr,
 				   char *page)
 {
 	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
-	struct blk_integrity *bi = blk_get_integrity(disk);
+	struct blk_integrity *bi = &disk->integrity;
 	struct integrity_sysfs_entry *entry =
 		container_of(attr, struct integrity_sysfs_entry, attr);
 
@@ -259,7 +255,7 @@ static ssize_t integrity_attr_store(struct kobject *kobj,
 				    size_t count)
 {
 	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
-	struct blk_integrity *bi = blk_get_integrity(disk);
+	struct blk_integrity *bi = &disk->integrity;
 	struct integrity_sysfs_entry *entry =
 		container_of(attr, struct integrity_sysfs_entry, attr);
 	ssize_t ret = 0;
@@ -272,7 +268,7 @@ static ssize_t integrity_attr_store(struct kobject *kobj,
 
 static ssize_t integrity_format_show(struct blk_integrity *bi, char *page)
 {
-	if (bi != NULL && bi->profile->name != NULL)
+	if (bi->profile && bi->profile->name)
 		return sprintf(page, "%s\n", bi->profile->name);
 	else
 		return sprintf(page, "none\n");
@@ -280,18 +276,13 @@ static ssize_t integrity_format_show(struct blk_integrity *bi, char *page)
 
 static ssize_t integrity_tag_size_show(struct blk_integrity *bi, char *page)
 {
-	if (bi != NULL)
-		return sprintf(page, "%u\n", bi->tag_size);
-	else
-		return sprintf(page, "0\n");
+	return sprintf(page, "%u\n", bi->tag_size);
 }
 
 static ssize_t integrity_interval_show(struct blk_integrity *bi, char *page)
 {
-	if (bi != NULL)
-		return sprintf(page, "%u\n", 1 << bi->interval_exp);
-	else
-		return sprintf(page, "0\n");
+	return sprintf(page, "%u\n",
+		       bi->interval_exp ? 1 << bi->interval_exp : 0);
 }
 
 static ssize_t integrity_verify_store(struct blk_integrity *bi,
@@ -385,113 +376,84 @@ static const struct sysfs_ops integrity_ops = {
 	.store	= &integrity_attr_store,
 };
 
-static int __init blk_dev_integrity_init(void)
-{
-	integrity_cachep = kmem_cache_create("blkdev_integrity",
-					     sizeof(struct blk_integrity),
-					     0, SLAB_PANIC, NULL);
-	return 0;
-}
-subsys_initcall(blk_dev_integrity_init);
-
-static void blk_integrity_release(struct kobject *kobj)
-{
-	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
-	struct blk_integrity *bi = blk_get_integrity(disk);
-
-	kmem_cache_free(integrity_cachep, bi);
-}
-
 static struct kobj_type integrity_ktype = {
 	.default_attrs	= integrity_attrs,
 	.sysfs_ops	= &integrity_ops,
-	.release	= blk_integrity_release,
 };
 
-bool blk_integrity_is_initialized(struct gendisk *disk)
-{
-	struct blk_integrity *bi = blk_get_integrity(disk);
-
-	return (bi && bi->profile->name && strcmp(bi->profile->name,
-						  bi_unsupported_name) != 0);
-}
-EXPORT_SYMBOL(blk_integrity_is_initialized);
-
 /**
  * blk_integrity_register - Register a gendisk as being integrity-capable
  * @disk:	struct gendisk pointer to make integrity-aware
- * @template:	optional integrity profile to register
+ * @template:	block integrity profile to register
  *
- * Description: When a device needs to advertise itself as being able
- * to send/receive integrity metadata it must use this function to
- * register the capability with the block layer.  The template is a
- * blk_integrity struct with values appropriate for the underlying
- * hardware.  If template is NULL the new profile is allocated but
- * not filled out. See Documentation/block/data-integrity.txt.
+ * Description: When a device needs to advertise itself as being able to
+ * send/receive integrity metadata it must use this function to register
+ * the capability with the block layer. The template is a blk_integrity
+ * struct with values appropriate for the underlying hardware. See
+ * Documentation/block/data-integrity.txt.
  */
 int blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
 {
-	struct blk_integrity *bi;
-
-	BUG_ON(disk == NULL);
+	struct blk_integrity *bi = &disk->integrity;
 
-	if (disk->integrity == NULL) {
-		bi = kmem_cache_alloc(integrity_cachep,
-				      GFP_KERNEL | __GFP_ZERO);
-		if (!bi)
-			return -1;
-
-		if (kobject_init_and_add(&disk->integrity_kobj, &integrity_ktype,
-					 &disk_to_dev(disk)->kobj,
-					 "%s", "integrity")) {
-			kmem_cache_free(integrity_cachep, bi);
-			return -1;
-		}
-
-		kobject_uevent(&disk->integrity_kobj, KOBJ_ADD);
-
-		bi->flags |= BLK_INTEGRITY_VERIFY | BLK_INTEGRITY_GENERATE;
+	if (template) {
+		bi->flags = BLK_INTEGRITY_VERIFY | BLK_INTEGRITY_GENERATE |
+			template->flags;
 		bi->interval_exp = ilog2(queue_logical_block_size(disk->queue));
-		disk->integrity = bi;
-	} else
-		bi = disk->integrity;
-
-	/* Use the provided profile as template */
-	if (template != NULL) {
 		bi->profile = template->profile;
 		bi->tuple_size = template->tuple_size;
 		bi->tag_size = template->tag_size;
-		bi->flags |= template->flags;
-	} else
-		bi->profile->name = bi_unsupported_name;
 
-	disk->queue->backing_dev_info.capabilities |= BDI_CAP_STABLE_WRITES;
+		blk_integrity_revalidate(disk);
+	} else
+		blk_integrity_unregister(disk);
 
 	return 0;
 }
 EXPORT_SYMBOL(blk_integrity_register);
 
 /**
- * blk_integrity_unregister - Remove block integrity profile
- * @disk:	disk whose integrity profile to deallocate
+ * blk_integrity_unregister - Unregister block integrity profile
+ * @disk:	disk whose integrity profile to unregister
  *
- * Description: This function frees all memory used by the block
- * integrity profile.  To be called at device teardown.
+ * Description: This function unregisters the integrity capability from
+ * a block device.
  */
 void blk_integrity_unregister(struct gendisk *disk)
 {
-	struct blk_integrity *bi;
+	blk_integrity_revalidate(disk);
+	memset(&disk->integrity, 0, sizeof(struct blk_integrity));
+}
+EXPORT_SYMBOL(blk_integrity_unregister);
+
+void blk_integrity_revalidate(struct gendisk *disk)
+{
+	struct blk_integrity *bi = &disk->integrity;
 
-	if (!disk || !disk->integrity)
+	if (!(disk->flags & GENHD_FL_UP))
 		return;
 
-	disk->queue->backing_dev_info.capabilities &= ~BDI_CAP_STABLE_WRITES;
+	if (bi->profile)
+		disk->queue->backing_dev_info.capabilities |=
+			BDI_CAP_STABLE_WRITES;
+	else
+		disk->queue->backing_dev_info.capabilities &=
+			~BDI_CAP_STABLE_WRITES;
+}
 
-	bi = disk->integrity;
+void blk_integrity_add(struct gendisk *disk)
+{
+	if (kobject_init_and_add(&disk->integrity_kobj, &integrity_ktype,
+				 &disk_to_dev(disk)->kobj, "%s", "integrity"))
+		return;
 
+	kobject_uevent(&disk->integrity_kobj, KOBJ_ADD);
+}
+
+void blk_integrity_del(struct gendisk *disk)
+{
 	kobject_uevent(&disk->integrity_kobj, KOBJ_REMOVE);
 	kobject_del(&disk->integrity_kobj);
 	kobject_put(&disk->integrity_kobj);
-	disk->integrity = NULL;
 }
-EXPORT_SYMBOL(blk_integrity_unregister);
+
diff --git a/block/genhd.c b/block/genhd.c
index 0c706f33a599..e5cafa51567c 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -630,6 +630,7 @@ void add_disk(struct gendisk *disk)
 	WARN_ON(retval);
 
 	disk_add_events(disk);
+	blk_integrity_add(disk);
 }
 EXPORT_SYMBOL(add_disk);
 
@@ -638,6 +639,7 @@ void del_gendisk(struct gendisk *disk)
 	struct disk_part_iter piter;
 	struct hd_struct *part;
 
+	blk_integrity_del(disk);
 	disk_del_events(disk);
 
 	/* invalidate stuff */
diff --git a/block/partition-generic.c b/block/partition-generic.c
index e7711133284e..3b030157ec85 100644
--- a/block/partition-generic.c
+++ b/block/partition-generic.c
@@ -428,6 +428,7 @@ rescan:
 
 	if (disk->fops->revalidate_disk)
 		disk->fops->revalidate_disk(disk);
+	blk_integrity_revalidate(disk);
 	check_disk_size_change(disk, bdev);
 	bdev->bd_invalidated = 0;
 	if (!get_capacity(disk) || !(state = check_partition(disk, bdev)))
diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index db9d39e84933..29a57306e309 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -529,7 +529,7 @@ static void nvme_dif_remap(struct request *req,
 	virt = bip_get_seed(bip);
 	phys = nvme_block_nr(ns, blk_rq_pos(req));
 	nlb = (blk_rq_bytes(req) >> ns->lba_shift);
-	ts = ns->disk->integrity->tuple_size;
+	ts = ns->disk->integrity.tuple_size;
 
 	for (i = 0; i < nlb; i++, virt++, phys++) {
 		pi = (struct t10_pi_tuple *)p;
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 16ba55ad7089..b4c9734e76b7 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1024,13 +1024,9 @@ static int dm_table_build_index(struct dm_table *t)
 
 /*
  * Get a disk whose integrity profile reflects the table's profile.
- * If %match_all is true, all devices' profiles must match.
- * If %match_all is false, all devices must at least have an
- * allocated integrity profile; but uninitialized is ok.
  * Returns NULL if integrity support was inconsistent or unavailable.
  */
-static struct gendisk * dm_table_get_integrity_disk(struct dm_table *t,
-						    bool match_all)
+static struct gendisk * dm_table_get_integrity_disk(struct dm_table *t)
 {
 	struct list_head *devices = dm_table_get_devices(t);
 	struct dm_dev_internal *dd = NULL;
@@ -1038,11 +1034,7 @@ static struct gendisk * dm_table_get_integrity_disk(struct dm_table *t,
 
 	list_for_each_entry(dd, devices, list) {
 		template_disk = dd->dm_dev->bdev->bd_disk;
-		if (!blk_get_integrity(template_disk))
-			goto no_integrity;
-		if (!match_all && !blk_integrity_is_initialized(template_disk))
-			continue; /* skip uninitialized profiles */
-		else if (prev_disk &&
+		if (prev_disk &&
 			 blk_integrity_compare(prev_disk, template_disk) < 0)
 			goto no_integrity;
 		prev_disk = template_disk;
@@ -1060,43 +1052,34 @@ no_integrity:
 }
 
 /*
- * Register the mapped device for blk_integrity support if
- * the underlying devices have an integrity profile.  But all devices
- * may not have matching profiles (checking all devices isn't reliable
+ * Register the mapped device for blk_integrity support if the
+ * underlying devices have an integrity profile.  But all devices may
+ * not have matching profiles (checking all devices isn't reliable
  * during table load because this table may use other DM device(s) which
- * must be resumed before they will have an initialized integity profile).
- * Stacked DM devices force a 2 stage integrity profile validation:
- * 1 - during load, validate all initialized integrity profiles match
- * 2 - during resume, validate all integrity profiles match
+	 * must be resumed before they will have an initialized integrity
+ * profile).  Consequently, stacked DM devices force a 2 stage integrity
+ * profile validation: First pass during table load, final pass during
+ * resume.
  */
-static int dm_table_prealloc_integrity(struct dm_table *t, struct mapped_device *md)
+static int dm_table_set_integrity(struct dm_table *t)
 {
 	struct gendisk *template_disk = NULL;
+	bool existing_profile = dm_disk(t->md)->integrity.profile;
 
-	template_disk = dm_table_get_integrity_disk(t, false);
-	if (!template_disk)
-		return 0;
-
-	if (!blk_integrity_is_initialized(dm_disk(md))) {
+	template_disk = dm_table_get_integrity_disk(t);
+	if (template_disk) {
+		blk_integrity_register(dm_disk(t->md),
+				       blk_get_integrity(template_disk));
 		t->integrity_supported = 1;
-		return blk_integrity_register(dm_disk(md), NULL);
-	}
-
-	/*
-	 * If DM device already has an initalized integrity
-	 * profile the new profile should not conflict.
-	 */
-	if (blk_integrity_is_initialized(template_disk) &&
-	    blk_integrity_compare(dm_disk(md), template_disk) < 0) {
-		DMWARN("%s: conflict with existing integrity profile: "
-		       "%s profile mismatch",
-		       dm_device_name(t->md),
-		       template_disk->disk_name);
+	} else if (existing_profile) {
+		blk_integrity_unregister(dm_disk(t->md));
+		DMWARN("%s: device no longer has a valid integrity profile",
+		       dm_device_name(t->md));
 		return 1;
-	}
+	} else
+		DMWARN("%s: unable to establish an integrity profile",
+		       dm_device_name(t->md));
 
-	/* Preserve existing initialized integrity profile */
-	t->integrity_supported = 1;
 	return 0;
 }
 
@@ -1120,7 +1103,7 @@ int dm_table_complete(struct dm_table *t)
 		return r;
 	}
 
-	r = dm_table_prealloc_integrity(t, t->md);
+	r = dm_table_set_integrity(t);
 	if (r) {
 		DMERR("could not register integrity profile.");
 		return r;
@@ -1285,32 +1268,6 @@ combine_limits:
 	return validate_hardware_logical_block_alignment(table, limits);
 }
 
-/*
- * Set the integrity profile for this device if all devices used have
- * matching profiles.  We're quite deep in the resume path but still
- * don't know if all devices (particularly DM devices this device
- * may be stacked on) have matching profiles.  Even if the profiles
- * don't match we have no way to fail (to resume) at this point.
- */
-static void dm_table_set_integrity(struct dm_table *t)
-{
-	struct gendisk *template_disk = NULL;
-
-	if (!blk_get_integrity(dm_disk(t->md)))
-		return;
-
-	template_disk = dm_table_get_integrity_disk(t, true);
-	if (template_disk)
-		blk_integrity_register(dm_disk(t->md),
-				       blk_get_integrity(template_disk));
-	else if (blk_integrity_is_initialized(dm_disk(t->md)))
-		DMWARN("%s: device no longer has a valid integrity profile",
-		       dm_device_name(t->md));
-	else
-		DMWARN("%s: unable to establish an integrity profile",
-		       dm_device_name(t->md));
-}
-
 static int device_flush_capable(struct dm_target *ti, struct dm_dev *dev,
 				sector_t start, sector_t len, void *data)
 {
diff --git a/drivers/md/md.c b/drivers/md/md.c
index d429c30cd514..406bfe99d893 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -1980,12 +1980,9 @@ int md_integrity_register(struct mddev *mddev)
 	 * All component devices are integrity capable and have matching
 	 * profiles, register the common profile for the md device.
 	 */
-	if (blk_integrity_register(mddev->gendisk,
-			bdev_get_integrity(reference->bdev)) != 0) {
-		printk(KERN_ERR "md: failed to register integrity for %s\n",
-			mdname(mddev));
-		return -EINVAL;
-	}
+	blk_integrity_register(mddev->gendisk,
+			       bdev_get_integrity(reference->bdev));
+
 	printk(KERN_NOTICE "md: data integrity enabled on %s\n", mdname(mddev));
 	if (bioset_integrity_create(mddev->bio_set, BIO_POOL_SIZE)) {
 		printk(KERN_ERR "md: failed to create integrity pool for %s\n",
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 198243717da5..918170c0099a 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1074,7 +1074,7 @@ int revalidate_disk(struct gendisk *disk)
 
 	if (disk->fops->revalidate_disk)
 		ret = disk->fops->revalidate_disk(disk);
-
+	blk_integrity_revalidate(disk);
 	bdev = bdget_disk(disk, 0);
 	if (!bdev)
 		return ret;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index c01ce2f7168b..e58e0709c462 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1408,40 +1408,6 @@ static inline uint64_t rq_io_start_time_ns(struct request *req)
 	MODULE_ALIAS("block-major-" __stringify(major) "-*")
 
 #if defined(CONFIG_BLK_DEV_INTEGRITY)
-
-enum blk_integrity_flags {
-	BLK_INTEGRITY_VERIFY		= 1 << 0,
-	BLK_INTEGRITY_GENERATE		= 1 << 1,
-	BLK_INTEGRITY_DEVICE_CAPABLE	= 1 << 2,
-	BLK_INTEGRITY_IP_CHECKSUM	= 1 << 3,
-};
-
-struct blk_integrity_iter {
-	void			*prot_buf;
-	void			*data_buf;
-	sector_t		seed;
-	unsigned int		data_size;
-	unsigned short		interval;
-	const char		*disk_name;
-};
-
-typedef int (integrity_processing_fn) (struct blk_integrity_iter *);
-
-struct blk_integrity_profile {
-	integrity_processing_fn		*generate_fn;
-	integrity_processing_fn		*verify_fn;
-	const char			*name;
-};
-
-struct blk_integrity {
-	struct blk_integrity_profile	*profile;
-	unsigned char			flags;
-	unsigned char			tuple_size;
-	unsigned char			interval_exp;
-	unsigned char			tag_size;
-};
-
-extern bool blk_integrity_is_initialized(struct gendisk *);
 extern int blk_integrity_register(struct gendisk *, struct blk_integrity *);
 extern void blk_integrity_unregister(struct gendisk *);
 extern int blk_integrity_compare(struct gendisk *, struct gendisk *);
@@ -1453,15 +1419,20 @@ extern bool blk_integrity_merge_rq(struct request_queue *, struct request *,
 extern bool blk_integrity_merge_bio(struct request_queue *, struct request *,
 				    struct bio *);
 
-static inline
-struct blk_integrity *bdev_get_integrity(struct block_device *bdev)
+static inline struct blk_integrity *blk_get_integrity(struct gendisk *disk)
 {
-	return bdev->bd_disk->integrity;
+	struct blk_integrity *bi = &disk->integrity;
+
+	if (!bi->profile)
+		return NULL;
+
+	return bi;
 }
 
-static inline struct blk_integrity *blk_get_integrity(struct gendisk *disk)
+static inline
+struct blk_integrity *bdev_get_integrity(struct block_device *bdev)
 {
-	return disk->integrity;
+	return blk_get_integrity(bdev->bd_disk);
 }
 
 static inline bool blk_integrity_rq(struct request *rq)
@@ -1543,10 +1514,6 @@ static inline bool blk_integrity_merge_bio(struct request_queue *rq,
 {
 	return true;
 }
-static inline bool blk_integrity_is_initialized(struct gendisk *g)
-{
-	return 0;
-}
 
 #endif /* CONFIG_BLK_DEV_INTEGRITY */
 
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 9e6e0dfa97ad..b4b465c6e6de 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -163,6 +163,42 @@ struct disk_part_tbl {
 
 struct disk_events;
 
+#if defined(CONFIG_BLK_DEV_INTEGRITY)
+
+enum blk_integrity_flags {
+	BLK_INTEGRITY_VERIFY		= 1 << 0,
+	BLK_INTEGRITY_GENERATE		= 1 << 1,
+	BLK_INTEGRITY_DEVICE_CAPABLE	= 1 << 2,
+	BLK_INTEGRITY_IP_CHECKSUM	= 1 << 3,
+};
+
+struct blk_integrity_iter {
+	void			*prot_buf;
+	void			*data_buf;
+	sector_t		seed;
+	unsigned int		data_size;
+	unsigned short		interval;
+	const char		*disk_name;
+};
+
+typedef int (integrity_processing_fn) (struct blk_integrity_iter *);
+
+struct blk_integrity_profile {
+	integrity_processing_fn		*generate_fn;
+	integrity_processing_fn		*verify_fn;
+	const char			*name;
+};
+
+struct blk_integrity {
+	struct blk_integrity_profile	*profile;
+	unsigned char			flags;
+	unsigned char			tuple_size;
+	unsigned char			interval_exp;
+	unsigned char			tag_size;
+};
+
+#endif	/* CONFIG_BLK_DEV_INTEGRITY */
+
 struct gendisk {
 	/* major, first_minor and minors are input parameters only,
 	 * don't use directly.  Use disk_devt() and disk_max_parts().
@@ -198,9 +234,9 @@ struct gendisk {
 	atomic_t sync_io;		/* RAID */
 	struct disk_events *ev;
 #ifdef  CONFIG_BLK_DEV_INTEGRITY
-	struct blk_integrity *integrity;
+	struct blk_integrity integrity;
 	struct kobject integrity_kobj;
-#endif
+#endif	/* CONFIG_BLK_DEV_INTEGRITY */
 	int node_id;
 };
 
@@ -728,6 +764,16 @@ static inline void part_nr_sects_write(struct hd_struct *part, sector_t size)
 #endif
 }
 
+#if defined(CONFIG_BLK_DEV_INTEGRITY)
+extern void blk_integrity_add(struct gendisk *);
+extern void blk_integrity_del(struct gendisk *);
+extern void blk_integrity_revalidate(struct gendisk *);
+#else	/* CONFIG_BLK_DEV_INTEGRITY */
+static inline void blk_integrity_add(struct gendisk *disk) { }
+static inline void blk_integrity_del(struct gendisk *disk) { }
+static inline void blk_integrity_revalidate(struct gendisk *disk) { }
+#endif	/* CONFIG_BLK_DEV_INTEGRITY */
+
 #else /* CONFIG_BLOCK */
 
 static inline void printk_all_partitions(void) { }
-- 
2.4.3

* [PATCH 3/5] block: Reduce the size of struct blk_integrity
  2015-07-21  6:02             ` [PATCH 3/5] block: Reduce the size of struct blk_integrity Martin K. Petersen
@ 2015-07-21 11:53               ` Christoph Hellwig
  2015-07-24 15:14                 ` Martin K. Petersen
  2015-07-22 11:35               ` Sagi Grimberg
  1 sibling, 1 reply; 67+ messages in thread
From: Christoph Hellwig @ 2015-07-21 11:53 UTC (permalink / raw)


On Tue, Jul 21, 2015 at 02:02:57AM -0400, Martin K. Petersen wrote:
> The per-device properties in the blk_integrity structure were previously
> unsigned short. However, most of the values fit inside a char. The only
> exception is the data interval size and we can work around that by
> storing it as a power of two.

Just curious: why do we even bother with storing the interval?  It seems
like our current implementation forces it to the block size anyway.

* [PATCH 5/5] block: Inline blk_integrity in struct gendisk
  2015-07-21  6:02             ` [PATCH 5/5] block: Inline blk_integrity in struct gendisk Martin K. Petersen
@ 2015-07-21 12:01               ` Christoph Hellwig
  2015-08-20 20:41                 ` Simplify block integrity registration Martin K. Petersen
  0 siblings, 1 reply; 67+ messages in thread
From: Christoph Hellwig @ 2015-07-21 12:01 UTC (permalink / raw)


On Tue, Jul 21, 2015 at 02:02:59AM -0400, Martin K. Petersen wrote:
>  - Moving the blk_integrity definitions to genhd.h.

You only need to move struct blk_integrity.  As that one only has a
pointer to struct blk_integrity_profile, there is no need to move the
rest.

Otherwise this looks good, although please replace calls to
blk_integrity_register with a NULL integrity profile with direct calls
to blk_integrity_unregister.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH 1/5] block: Move integrity kobject to struct gendisk
  2015-07-21  6:02             ` [PATCH 1/5] block: Move integrity kobject to struct gendisk Martin K. Petersen
@ 2015-07-22 11:32               ` Sagi Grimberg
  0 siblings, 0 replies; 67+ messages in thread
From: Sagi Grimberg @ 2015-07-22 11:32 UTC (permalink / raw)


On 7/21/2015 9:02 AM, Martin K. Petersen wrote:
> The integrity kobject purely exists to support the integrity
> subdirectory in sysfs and doesn't really have anything to do with the
> blk_integrity data structure. Move the kobject to struct gendisk where
> it belongs.
>
> Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
> Reported-by: Christoph Hellwig <hch at lst.de>

Reviewed-by: Sagi Grimberg <sagig at mellanox.com>

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH 2/5] block: Consolidate static integrity profile properties
  2015-07-21  6:02             ` [PATCH 2/5] block: Consolidate static integrity profile properties Martin K. Petersen
@ 2015-07-22 11:33               ` Sagi Grimberg
  0 siblings, 0 replies; 67+ messages in thread
From: Sagi Grimberg @ 2015-07-22 11:33 UTC (permalink / raw)


On 7/21/2015 9:02 AM, Martin K. Petersen wrote:
> We previously made a complete copy of a device's data integrity profile
> even though several of the fields inside the blk_integrity struct are
> pointers to fixed template entries in t10-pi.c.
>
> Split the static and per-device portions so that we can reference the
> template directly.
>
> Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
> Reported-by: Christoph Hellwig <hch at lst.de>

Reviewed-by: Sagi Grimberg <sagig at mellanox.com>

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH 3/5] block: Reduce the size of struct blk_integrity
  2015-07-21  6:02             ` [PATCH 3/5] block: Reduce the size of struct blk_integrity Martin K. Petersen
  2015-07-21 11:53               ` Christoph Hellwig
@ 2015-07-22 11:35               ` Sagi Grimberg
  1 sibling, 0 replies; 67+ messages in thread
From: Sagi Grimberg @ 2015-07-22 11:35 UTC (permalink / raw)


On 7/21/2015 9:02 AM, Martin K. Petersen wrote:
> The per-device properties in the blk_integrity structure were previously
> unsigned short. However, most of the values fit inside a char. The only
> exception is the data interval size and we can work around that by
> storing it as a power of two.
>
> This cuts the size of the dynamic portion of blk_integrity in half.
>
> Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
> Reported-by: Christoph Hellwig <hch at lst.de>

Reviewed-by: Sagi Grimberg <sagig at mellanox.com>

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH 4/5] block: Export integrity data interval size in sysfs
  2015-07-21  6:02             ` [PATCH 4/5] block: Export integrity data interval size in sysfs Martin K. Petersen
@ 2015-07-22 11:37               ` Sagi Grimberg
  2015-07-24 15:26                 ` Martin K. Petersen
  0 siblings, 1 reply; 67+ messages in thread
From: Sagi Grimberg @ 2015-07-22 11:37 UTC (permalink / raw)


On 7/21/2015 9:02 AM, Martin K. Petersen wrote:
> The size of the data interval was not exported in the sysfs integrity
> directory. Export it so that userland apps can tell whether the interval
> is different from the device's logical block size.
>
> Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
> ---
>   Documentation/ABI/testing/sysfs-block |  7 +++++++
>   block/blk-integrity.c                 | 14 ++++++++++++++
>   2 files changed, 21 insertions(+)
>
> diff --git a/Documentation/ABI/testing/sysfs-block b/Documentation/ABI/testing/sysfs-block
> index 8df003963d99..a61f1e2165ee 100644
> --- a/Documentation/ABI/testing/sysfs-block
> +++ b/Documentation/ABI/testing/sysfs-block
> @@ -60,6 +60,13 @@ Description:
>   		Indicates whether a storage device is capable of storing
>   		integrity metadata. Set if the device is T10 PI-capable.
>
> +What:		/sys/block/<disk>/integrity/data_interval_bytes

I wonder if pi_interval is not a more suitable name...

Otherwise,

Reviewed-by: Sagi Grimberg <sagig at mellanox.com>

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH 3/5] block: Reduce the size of struct blk_integrity
  2015-07-21 11:53               ` Christoph Hellwig
@ 2015-07-24 15:14                 ` Martin K. Petersen
  0 siblings, 0 replies; 67+ messages in thread
From: Martin K. Petersen @ 2015-07-24 15:14 UTC (permalink / raw)


>>>>> "Christoph" == Christoph Hellwig <hch at infradead.org> writes:

Christoph> Just curious: why do we even bother with storing the
Christoph> interval?  Seems like our current implementation forces it to
Christoph> the block size anyway.

There are SCSI devices that permit the PI at different intervals than
the logical block size. I have a patch set in the pipeline.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH 4/5] block: Export integrity data interval size in sysfs
  2015-07-22 11:37               ` Sagi Grimberg
@ 2015-07-24 15:26                 ` Martin K. Petersen
  0 siblings, 0 replies; 67+ messages in thread
From: Martin K. Petersen @ 2015-07-24 15:26 UTC (permalink / raw)


>>>>> "Sagi" == Sagi Grimberg <sagig at dev.mellanox.co.il> writes:

>> +What: /sys/block/<disk>/integrity/data_interval_bytes

Sagi> I wonder if pi_interval is not a more suitable name...

I have always found the SBC notion of "protection information interval"
a bit confusing. The interval describes the protected data, not the
protection information.

I am OK with protection_interval_bytes too. But I do think that having
data in there makes things clearer.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Simplify block integrity registration
  2015-07-21 12:01               ` Christoph Hellwig
@ 2015-08-20 20:41                 ` Martin K. Petersen
  2015-08-20 20:41                   ` [PATCH 1/5] block: Move integrity kobject to struct gendisk Martin K. Petersen
                                     ` (5 more replies)
  0 siblings, 6 replies; 67+ messages in thread
From: Martin K. Petersen @ 2015-08-20 20:41 UTC (permalink / raw)



Had a couple of requests this week so here is the latest version of the
patch set that removes the dynamic integrity allocation to ease the
registration headaches for nvme and nvdimm.

Mike, I'd appreciate it if you could check the DM bits.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH 1/5] block: Move integrity kobject to struct gendisk
  2015-08-20 20:41                 ` Simplify block integrity registration Martin K. Petersen
@ 2015-08-20 20:41                   ` Martin K. Petersen
  2015-09-16 17:26                       ` Mike Snitzer
  2015-08-20 20:41                   ` [PATCH 2/5] block: Consolidate static integrity profile properties Martin K. Petersen
                                     ` (4 subsequent siblings)
  5 siblings, 1 reply; 67+ messages in thread
From: Martin K. Petersen @ 2015-08-20 20:41 UTC (permalink / raw)


The integrity kobject purely exists to support the integrity
subdirectory in sysfs and doesn't really have anything to do with the
blk_integrity data structure. Move the kobject to struct gendisk where
it belongs.

Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
Reported-by: Christoph Hellwig <hch at lst.de>
Reviewed-by: Sagi Grimberg <sagig at mellanox.com>
---
 block/blk-integrity.c  | 22 +++++++++++-----------
 include/linux/blkdev.h |  2 --
 include/linux/genhd.h  |  1 +
 3 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index f548b64be092..6a173a7d1ec6 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -246,8 +246,8 @@ struct integrity_sysfs_entry {
 static ssize_t integrity_attr_show(struct kobject *kobj, struct attribute *attr,
 				   char *page)
 {
-	struct blk_integrity *bi =
-		container_of(kobj, struct blk_integrity, kobj);
+	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
+	struct blk_integrity *bi = blk_get_integrity(disk);
 	struct integrity_sysfs_entry *entry =
 		container_of(attr, struct integrity_sysfs_entry, attr);
 
@@ -258,8 +258,8 @@ static ssize_t integrity_attr_store(struct kobject *kobj,
 				    struct attribute *attr, const char *page,
 				    size_t count)
 {
-	struct blk_integrity *bi =
-		container_of(kobj, struct blk_integrity, kobj);
+	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
+	struct blk_integrity *bi = blk_get_integrity(disk);
 	struct integrity_sysfs_entry *entry =
 		container_of(attr, struct integrity_sysfs_entry, attr);
 	ssize_t ret = 0;
@@ -382,8 +382,8 @@ subsys_initcall(blk_dev_integrity_init);
 
 static void blk_integrity_release(struct kobject *kobj)
 {
-	struct blk_integrity *bi =
-		container_of(kobj, struct blk_integrity, kobj);
+	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
+	struct blk_integrity *bi = blk_get_integrity(disk);
 
 	kmem_cache_free(integrity_cachep, bi);
 }
@@ -426,14 +426,14 @@ int blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
 		if (!bi)
 			return -1;
 
-		if (kobject_init_and_add(&bi->kobj, &integrity_ktype,
+		if (kobject_init_and_add(&disk->integrity_kobj, &integrity_ktype,
 					 &disk_to_dev(disk)->kobj,
 					 "%s", "integrity")) {
 			kmem_cache_free(integrity_cachep, bi);
 			return -1;
 		}
 
-		kobject_uevent(&bi->kobj, KOBJ_ADD);
+		kobject_uevent(&disk->integrity_kobj, KOBJ_ADD);
 
 		bi->flags |= BLK_INTEGRITY_VERIFY | BLK_INTEGRITY_GENERATE;
 		bi->interval = queue_logical_block_size(disk->queue);
@@ -476,9 +476,9 @@ void blk_integrity_unregister(struct gendisk *disk)
 
 	bi = disk->integrity;
 
-	kobject_uevent(&bi->kobj, KOBJ_REMOVE);
-	kobject_del(&bi->kobj);
-	kobject_put(&bi->kobj);
+	kobject_uevent(&disk->integrity_kobj, KOBJ_REMOVE);
+	kobject_del(&disk->integrity_kobj);
+	kobject_put(&disk->integrity_kobj);
 	disk->integrity = NULL;
 }
 EXPORT_SYMBOL(blk_integrity_unregister);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index e427debc7008..e879473873d5 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1431,8 +1431,6 @@ struct blk_integrity {
 	unsigned short		tag_size;
 
 	const char		*name;
-
-	struct kobject		kobj;
 };
 
 extern bool blk_integrity_is_initialized(struct gendisk *);
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 2adbfa6d02bc..9e6e0dfa97ad 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -199,6 +199,7 @@ struct gendisk {
 	struct disk_events *ev;
 #ifdef  CONFIG_BLK_DEV_INTEGRITY
 	struct blk_integrity *integrity;
+	struct kobject integrity_kobj;
 #endif
 	int node_id;
 };
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 2/5] block: Consolidate static integrity profile properties
  2015-08-20 20:41                 ` Simplify block integrity registration Martin K. Petersen
  2015-08-20 20:41                   ` [PATCH 1/5] block: Move integrity kobject to struct gendisk Martin K. Petersen
@ 2015-08-20 20:41                   ` Martin K. Petersen
  2015-08-20 20:41                   ` [PATCH 3/5] block: Reduce the size of struct blk_integrity Martin K. Petersen
                                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 67+ messages in thread
From: Martin K. Petersen @ 2015-08-20 20:41 UTC (permalink / raw)


We previously made a complete copy of a device's data integrity profile
even though several of the fields inside the blk_integrity struct are
pointers to fixed template entries in t10-pi.c.

Split the static and per-device portions so that we can reference the
template directly.

Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
Reported-by: Christoph Hellwig <hch at lst.de>
Reviewed-by: Sagi Grimberg <sagig at mellanox.com>
Cc: Dan Williams <dan.j.williams at intel.com>
---
 block/bio-integrity.c               |  8 ++++----
 block/blk-integrity.c               | 17 ++++++++---------
 block/t10-pi.c                      | 16 ++++------------
 drivers/block/nvme-core.c           |  8 ++++----
 drivers/nvdimm/core.c               | 11 +++++++----
 drivers/scsi/sd_dif.c               | 29 ++++++++++++++++-------------
 drivers/target/target_core_iblock.c | 10 +++++-----
 include/linux/blkdev.h              | 20 +++++++++++---------
 include/linux/t10-pi.h              |  8 ++++----
 9 files changed, 63 insertions(+), 64 deletions(-)

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 4aecca79374a..400e69e575eb 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -172,11 +172,11 @@ bool bio_integrity_enabled(struct bio *bio)
 	if (bi == NULL)
 		return false;
 
-	if (bio_data_dir(bio) == READ && bi->verify_fn != NULL &&
+	if (bio_data_dir(bio) == READ && bi->profile->verify_fn != NULL &&
 	    (bi->flags & BLK_INTEGRITY_VERIFY))
 		return true;
 
-	if (bio_data_dir(bio) == WRITE && bi->generate_fn != NULL &&
+	if (bio_data_dir(bio) == WRITE && bi->profile->generate_fn != NULL &&
 	    (bi->flags & BLK_INTEGRITY_GENERATE))
 		return true;
 
@@ -335,7 +335,7 @@ int bio_integrity_prep(struct bio *bio)
 
 	/* Auto-generate integrity metadata if this is a write */
 	if (bio_data_dir(bio) == WRITE)
-		bio_integrity_process(bio, bi->generate_fn);
+		bio_integrity_process(bio, bi->profile->generate_fn);
 
 	return 0;
 }
@@ -356,7 +356,7 @@ static void bio_integrity_verify_fn(struct work_struct *work)
 	struct bio *bio = bip->bip_bio;
 	struct blk_integrity *bi = bdev_get_integrity(bio->bi_bdev);
 
-	bio->bi_error = bio_integrity_process(bio, bi->verify_fn);
+	bio->bi_error = bio_integrity_process(bio, bi->profile->verify_fn);
 
 	/* Restore original bio completion handler */
 	bio->bi_end_io = bip->bip_end_io;
diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index 6a173a7d1ec6..5e5280d58c61 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -176,10 +176,10 @@ int blk_integrity_compare(struct gendisk *gd1, struct gendisk *gd2)
 		return -1;
 	}
 
-	if (strcmp(b1->name, b2->name)) {
+	if (b1->profile != b2->profile) {
 		printk(KERN_ERR "%s: %s/%s type %s != %s\n", __func__,
 		       gd1->disk_name, gd2->disk_name,
-		       b1->name, b2->name);
+		       b1->profile->name, b2->profile->name);
 		return -1;
 	}
 
@@ -272,8 +272,8 @@ static ssize_t integrity_attr_store(struct kobject *kobj,
 
 static ssize_t integrity_format_show(struct blk_integrity *bi, char *page)
 {
-	if (bi != NULL && bi->name != NULL)
-		return sprintf(page, "%s\n", bi->name);
+	if (bi != NULL && bi->profile->name != NULL)
+		return sprintf(page, "%s\n", bi->profile->name);
 	else
 		return sprintf(page, "none\n");
 }
@@ -398,7 +398,8 @@ bool blk_integrity_is_initialized(struct gendisk *disk)
 {
 	struct blk_integrity *bi = blk_get_integrity(disk);
 
-	return (bi && bi->name && strcmp(bi->name, bi_unsupported_name) != 0);
+	return (bi && bi->profile->name && strcmp(bi->profile->name,
+						  bi_unsupported_name) != 0);
 }
 EXPORT_SYMBOL(blk_integrity_is_initialized);
 
@@ -443,14 +444,12 @@ int blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
 
 	/* Use the provided profile as template */
 	if (template != NULL) {
-		bi->name = template->name;
-		bi->generate_fn = template->generate_fn;
-		bi->verify_fn = template->verify_fn;
+		bi->profile = template->profile;
 		bi->tuple_size = template->tuple_size;
 		bi->tag_size = template->tag_size;
 		bi->flags |= template->flags;
 	} else
-		bi->name = bi_unsupported_name;
+		bi->profile->name = bi_unsupported_name;
 
 	disk->queue->backing_dev_info.capabilities |= BDI_CAP_STABLE_WRITES;
 
diff --git a/block/t10-pi.c b/block/t10-pi.c
index 24d6e9715318..2c97912335a9 100644
--- a/block/t10-pi.c
+++ b/block/t10-pi.c
@@ -160,38 +160,30 @@ static int t10_pi_type3_verify_ip(struct blk_integrity_iter *iter)
 	return t10_pi_verify(iter, t10_pi_ip_fn, 3);
 }
 
-struct blk_integrity t10_pi_type1_crc = {
+struct blk_integrity_profile t10_pi_type1_crc = {
 	.name			= "T10-DIF-TYPE1-CRC",
 	.generate_fn		= t10_pi_type1_generate_crc,
 	.verify_fn		= t10_pi_type1_verify_crc,
-	.tuple_size		= sizeof(struct t10_pi_tuple),
-	.tag_size		= 0,
 };
 EXPORT_SYMBOL(t10_pi_type1_crc);
 
-struct blk_integrity t10_pi_type1_ip = {
+struct blk_integrity_profile t10_pi_type1_ip = {
 	.name			= "T10-DIF-TYPE1-IP",
 	.generate_fn		= t10_pi_type1_generate_ip,
 	.verify_fn		= t10_pi_type1_verify_ip,
-	.tuple_size		= sizeof(struct t10_pi_tuple),
-	.tag_size		= 0,
 };
 EXPORT_SYMBOL(t10_pi_type1_ip);
 
-struct blk_integrity t10_pi_type3_crc = {
+struct blk_integrity_profile t10_pi_type3_crc = {
 	.name			= "T10-DIF-TYPE3-CRC",
 	.generate_fn		= t10_pi_type3_generate_crc,
 	.verify_fn		= t10_pi_type3_verify_crc,
-	.tuple_size		= sizeof(struct t10_pi_tuple),
-	.tag_size		= 0,
 };
 EXPORT_SYMBOL(t10_pi_type3_crc);
 
-struct blk_integrity t10_pi_type3_ip = {
+struct blk_integrity_profile t10_pi_type3_ip = {
 	.name			= "T10-DIF-TYPE3-IP",
 	.generate_fn		= t10_pi_type3_generate_ip,
 	.verify_fn		= t10_pi_type3_verify_ip,
-	.tuple_size		= sizeof(struct t10_pi_tuple),
-	.tag_size		= 0,
 };
 EXPORT_SYMBOL(t10_pi_type3_ip);
diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index d844ec4a2b85..e26eb1524bfd 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -549,7 +549,7 @@ static int nvme_noop_generate(struct blk_integrity_iter *iter)
 	return 0;
 }
 
-struct blk_integrity nvme_meta_noop = {
+struct blk_integrity_profile nvme_meta_noop = {
 	.name			= "NVME_META_NOOP",
 	.generate_fn		= nvme_noop_generate,
 	.verify_fn		= nvme_noop_verify,
@@ -561,14 +561,14 @@ static void nvme_init_integrity(struct nvme_ns *ns)
 
 	switch (ns->pi_type) {
 	case NVME_NS_DPS_PI_TYPE3:
-		integrity = t10_pi_type3_crc;
+		integrity.profile = &t10_pi_type3_crc;
 		break;
 	case NVME_NS_DPS_PI_TYPE1:
 	case NVME_NS_DPS_PI_TYPE2:
-		integrity = t10_pi_type1_crc;
+		integrity.profile = &t10_pi_type1_crc;
 		break;
 	default:
-		integrity = nvme_meta_noop;
+		integrity.profile = &nvme_meta_noop;
 		break;
 	}
 	integrity.tuple_size = ns->ms;
diff --git a/drivers/nvdimm/core.c b/drivers/nvdimm/core.c
index cb62ec6a12d0..9e1b0f656a9b 100644
--- a/drivers/nvdimm/core.c
+++ b/drivers/nvdimm/core.c
@@ -399,19 +399,22 @@ static int nd_pi_nop_generate_verify(struct blk_integrity_iter *iter)
 
 int nd_integrity_init(struct gendisk *disk, unsigned long meta_size)
 {
-	struct blk_integrity integrity = {
+	struct blk_integrity bi;
+	struct blk_integrity_profile profile = {
 		.name = "ND-PI-NOP",
 		.generate_fn = nd_pi_nop_generate_verify,
 		.verify_fn = nd_pi_nop_generate_verify,
-		.tuple_size = meta_size,
-		.tag_size = meta_size,
 	};
 	int ret;
 
 	if (meta_size == 0)
 		return 0;
 
-	ret = blk_integrity_register(disk, &integrity);
+	bi.profile = &profile;
+	bi.tuple_size = meta_size;
+	bi.tag_size = meta_size;
+
+	ret = blk_integrity_register(disk, &bi);
 	if (ret)
 		return ret;
 
diff --git a/drivers/scsi/sd_dif.c b/drivers/scsi/sd_dif.c
index 5c06d292b94c..5a5ec9aa26b3 100644
--- a/drivers/scsi/sd_dif.c
+++ b/drivers/scsi/sd_dif.c
@@ -43,6 +43,7 @@ void sd_dif_config_host(struct scsi_disk *sdkp)
 	struct scsi_device *sdp = sdkp->device;
 	struct gendisk *disk = sdkp->disk;
 	u8 type = sdkp->protection_type;
+	struct blk_integrity bi;
 	int dif, dix;
 
 	dif = scsi_host_dif_capable(sdp->host, type);
@@ -58,36 +59,38 @@ void sd_dif_config_host(struct scsi_disk *sdkp)
 	/* Enable DMA of protection information */
 	if (scsi_host_get_guard(sdkp->device->host) & SHOST_DIX_GUARD_IP) {
 		if (type == SD_DIF_TYPE3_PROTECTION)
-			blk_integrity_register(disk, &t10_pi_type3_ip);
+			bi.profile = &t10_pi_type3_ip;
 		else
-			blk_integrity_register(disk, &t10_pi_type1_ip);
+			bi.profile = &t10_pi_type1_ip;
 
-		disk->integrity->flags |= BLK_INTEGRITY_IP_CHECKSUM;
+		bi.flags |= BLK_INTEGRITY_IP_CHECKSUM;
 	} else
 		if (type == SD_DIF_TYPE3_PROTECTION)
-			blk_integrity_register(disk, &t10_pi_type3_crc);
+			bi.profile = &t10_pi_type3_crc;
 		else
-			blk_integrity_register(disk, &t10_pi_type1_crc);
+			bi.profile = &t10_pi_type1_crc;
 
+	bi.tuple_size = sizeof(struct t10_pi_tuple);
 	sd_printk(KERN_NOTICE, sdkp,
-		  "Enabling DIX %s protection\n", disk->integrity->name);
+		  "Enabling DIX %s protection\n", bi.profile->name);
 
-	/* Signal to block layer that we support sector tagging */
 	if (dif && type) {
-
-		disk->integrity->flags |= BLK_INTEGRITY_DEVICE_CAPABLE;
+		bi.flags |= BLK_INTEGRITY_DEVICE_CAPABLE;
 
 		if (!sdkp->ATO)
-			return;
+			goto out;
 
 		if (type == SD_DIF_TYPE3_PROTECTION)
-			disk->integrity->tag_size = sizeof(u16) + sizeof(u32);
+			bi.tag_size = sizeof(u16) + sizeof(u32);
 		else
-			disk->integrity->tag_size = sizeof(u16);
+			bi.tag_size = sizeof(u16);
 
 		sd_printk(KERN_NOTICE, sdkp, "DIF application tag size %u\n",
-			  disk->integrity->tag_size);
+			  bi.tag_size);
 	}
+
+out:
+	blk_integrity_register(disk, &bi);
 }
 
 /*
diff --git a/drivers/target/target_core_iblock.c b/drivers/target/target_core_iblock.c
index 5a9982f5d5d6..3e51338762df 100644
--- a/drivers/target/target_core_iblock.c
+++ b/drivers/target/target_core_iblock.c
@@ -153,17 +153,17 @@ static int iblock_configure_device(struct se_device *dev)
 	if (bi) {
 		struct bio_set *bs = ib_dev->ibd_bio_set;
 
-		if (!strcmp(bi->name, "T10-DIF-TYPE3-IP") ||
-		    !strcmp(bi->name, "T10-DIF-TYPE1-IP")) {
+		if (!strcmp(bi->profile->name, "T10-DIF-TYPE3-IP") ||
+		    !strcmp(bi->profile->name, "T10-DIF-TYPE1-IP")) {
 			pr_err("IBLOCK export of blk_integrity: %s not"
-			       " supported\n", bi->name);
+			       " supported\n", bi->profile->name);
 			ret = -ENOSYS;
 			goto out_blkdev_put;
 		}
 
-		if (!strcmp(bi->name, "T10-DIF-TYPE3-CRC")) {
+		if (!strcmp(bi->profile->name, "T10-DIF-TYPE3-CRC")) {
 			dev->dev_attrib.pi_prot_type = TARGET_DIF_TYPE3_PROT;
-		} else if (!strcmp(bi->name, "T10-DIF-TYPE1-CRC")) {
+		} else if (!strcmp(bi->profile->name, "T10-DIF-TYPE1-CRC")) {
 			dev->dev_attrib.pi_prot_type = TARGET_DIF_TYPE1_PROT;
 		}
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index e879473873d5..6a25cf6716c0 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1421,16 +1421,18 @@ struct blk_integrity_iter {
 
 typedef int (integrity_processing_fn) (struct blk_integrity_iter *);
 
-struct blk_integrity {
-	integrity_processing_fn	*generate_fn;
-	integrity_processing_fn	*verify_fn;
-
-	unsigned short		flags;
-	unsigned short		tuple_size;
-	unsigned short		interval;
-	unsigned short		tag_size;
+struct blk_integrity_profile {
+	integrity_processing_fn		*generate_fn;
+	integrity_processing_fn		*verify_fn;
+	const char			*name;
+};
 
-	const char		*name;
+struct blk_integrity {
+	struct blk_integrity_profile	*profile;
+	unsigned short			flags;
+	unsigned short			tuple_size;
+	unsigned short			interval;
+	unsigned short			tag_size;
 };
 
 extern bool blk_integrity_is_initialized(struct gendisk *);
diff --git a/include/linux/t10-pi.h b/include/linux/t10-pi.h
index 6a8b9942632d..dd8de82cf5b5 100644
--- a/include/linux/t10-pi.h
+++ b/include/linux/t10-pi.h
@@ -14,9 +14,9 @@ struct t10_pi_tuple {
 };
 
 
-extern struct blk_integrity t10_pi_type1_crc;
-extern struct blk_integrity t10_pi_type1_ip;
-extern struct blk_integrity t10_pi_type3_crc;
-extern struct blk_integrity t10_pi_type3_ip;
+extern struct blk_integrity_profile t10_pi_type1_crc;
+extern struct blk_integrity_profile t10_pi_type1_ip;
+extern struct blk_integrity_profile t10_pi_type3_crc;
+extern struct blk_integrity_profile t10_pi_type3_ip;
 
 #endif
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 3/5] block: Reduce the size of struct blk_integrity
  2015-08-20 20:41                 ` Simplify block integrity registration Martin K. Petersen
  2015-08-20 20:41                   ` [PATCH 1/5] block: Move integrity kobject to struct gendisk Martin K. Petersen
  2015-08-20 20:41                   ` [PATCH 2/5] block: Consolidate static integrity profile properties Martin K. Petersen
@ 2015-08-20 20:41                   ` Martin K. Petersen
  2015-08-20 20:41                   ` [PATCH 4/5] block: Export integrity data interval size in sysfs Martin K. Petersen
                                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 67+ messages in thread
From: Martin K. Petersen @ 2015-08-20 20:41 UTC (permalink / raw)


The per-device properties in the blk_integrity structure were previously
unsigned short. However, most of the values fit inside a char. The only
exception is the data interval size and we can work around that by
storing it as a power of two.

This cuts the size of the dynamic portion of blk_integrity in half.

Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
Reported-by: Christoph Hellwig <hch at lst.de>
Reviewed-by: Sagi Grimberg <sagig at mellanox.com>
---
 block/bio-integrity.c  | 4 ++--
 block/blk-integrity.c  | 6 +++---
 include/linux/blkdev.h | 8 ++++----
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 400e69e575eb..b33975155a3f 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -197,7 +197,7 @@ EXPORT_SYMBOL(bio_integrity_enabled);
 static inline unsigned int bio_integrity_intervals(struct blk_integrity *bi,
 						   unsigned int sectors)
 {
-	return sectors >> (ilog2(bi->interval) - 9);
+	return sectors >> (bi->interval_exp - 9);
 }
 
 static inline unsigned int bio_integrity_bytes(struct blk_integrity *bi,
@@ -224,7 +224,7 @@ static int bio_integrity_process(struct bio *bio,
 		bip->bip_vec->bv_offset;
 
 	iter.disk_name = bio->bi_bdev->bd_disk->disk_name;
-	iter.interval = bi->interval;
+	iter.interval = 1 << bi->interval_exp;
 	iter.seed = bip_get_seed(bip);
 	iter.prot_buf = prot_buf;
 
diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index 5e5280d58c61..70ba9389d0dd 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -155,10 +155,10 @@ int blk_integrity_compare(struct gendisk *gd1, struct gendisk *gd2)
 	if (!b1 || !b2)
 		return -1;
 
-	if (b1->interval != b2->interval) {
+	if (b1->interval_exp != b2->interval_exp) {
 		pr_err("%s: %s/%s protection interval %u != %u\n",
 		       __func__, gd1->disk_name, gd2->disk_name,
-		       b1->interval, b2->interval);
+		       1 << b1->interval_exp, 1 << b2->interval_exp);
 		return -1;
 	}
 
@@ -437,7 +437,7 @@ int blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
 		kobject_uevent(&disk->integrity_kobj, KOBJ_ADD);
 
 		bi->flags |= BLK_INTEGRITY_VERIFY | BLK_INTEGRITY_GENERATE;
-		bi->interval = queue_logical_block_size(disk->queue);
+		bi->interval_exp = ilog2(queue_logical_block_size(disk->queue));
 		disk->integrity = bi;
 	} else
 		bi = disk->integrity;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 6a25cf6716c0..2d93d3875586 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1429,10 +1429,10 @@ struct blk_integrity_profile {
 
 struct blk_integrity {
 	struct blk_integrity_profile	*profile;
-	unsigned short			flags;
-	unsigned short			tuple_size;
-	unsigned short			interval;
-	unsigned short			tag_size;
+	unsigned char			flags;
+	unsigned char			tuple_size;
+	unsigned char			interval_exp;
+	unsigned char			tag_size;
 };
 
 extern bool blk_integrity_is_initialized(struct gendisk *);
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 4/5] block: Export integrity data interval size in sysfs
  2015-08-20 20:41                 ` Simplify block integrity registration Martin K. Petersen
                                     ` (2 preceding siblings ...)
  2015-08-20 20:41                   ` [PATCH 3/5] block: Reduce the size of struct blk_integrity Martin K. Petersen
@ 2015-08-20 20:41                   ` Martin K. Petersen
  2015-08-20 20:41                   ` [PATCH 5/5] block: Inline blk_integrity in struct gendisk Martin K. Petersen
  2015-08-20 20:45                   ` Simplify block integrity registration Mike Snitzer
  5 siblings, 0 replies; 67+ messages in thread
From: Martin K. Petersen @ 2015-08-20 20:41 UTC (permalink / raw)


The size of the data interval was not exported in the sysfs integrity
directory. Export it so that userland apps can tell whether the interval
is different from the device's logical block size.

Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
Reviewed-by: Sagi Grimberg <sagig at mellanox.com>
---
 Documentation/ABI/testing/sysfs-block |  7 +++++++
 block/blk-integrity.c                 | 14 ++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-block b/Documentation/ABI/testing/sysfs-block
index 8df003963d99..71d184dbb70d 100644
--- a/Documentation/ABI/testing/sysfs-block
+++ b/Documentation/ABI/testing/sysfs-block
@@ -60,6 +60,13 @@ Description:
 		Indicates whether a storage device is capable of storing
 		integrity metadata. Set if the device is T10 PI-capable.
 
+What:		/sys/block/<disk>/integrity/protection_interval_bytes
+Date:		July 2015
+Contact:	Martin K. Petersen <martin.petersen at oracle.com>
+Description:
+		Describes the number of data bytes which are protected
+		by one integrity tuple. Typically the device's logical
+		block size.
 
 What:		/sys/block/<disk>/integrity/write_generate
 Date:		June 2008
diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index 70ba9389d0dd..72b427c3ed1e 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -286,6 +286,14 @@ static ssize_t integrity_tag_size_show(struct blk_integrity *bi, char *page)
 		return sprintf(page, "0\n");
 }
 
+static ssize_t integrity_interval_show(struct blk_integrity *bi, char *page)
+{
+	if (bi != NULL)
+		return sprintf(page, "%u\n", 1 << bi->interval_exp);
+	else
+		return sprintf(page, "0\n");
+}
+
 static ssize_t integrity_verify_store(struct blk_integrity *bi,
 				      const char *page, size_t count)
 {
@@ -340,6 +348,11 @@ static struct integrity_sysfs_entry integrity_tag_size_entry = {
 	.show = integrity_tag_size_show,
 };
 
+static struct integrity_sysfs_entry integrity_interval_entry = {
+	.attr = { .name = "protection_interval_bytes", .mode = S_IRUGO },
+	.show = integrity_interval_show,
+};
+
 static struct integrity_sysfs_entry integrity_verify_entry = {
 	.attr = { .name = "read_verify", .mode = S_IRUGO | S_IWUSR },
 	.show = integrity_verify_show,
@@ -360,6 +373,7 @@ static struct integrity_sysfs_entry integrity_device_entry = {
 static struct attribute *integrity_attrs[] = {
 	&integrity_format_entry.attr,
 	&integrity_tag_size_entry.attr,
+	&integrity_interval_entry.attr,
 	&integrity_verify_entry.attr,
 	&integrity_generate_entry.attr,
 	&integrity_device_entry.attr,
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 5/5] block: Inline blk_integrity in struct gendisk
  2015-08-20 20:41                 ` Simplify block integrity registration Martin K. Petersen
                                     ` (3 preceding siblings ...)
  2015-08-20 20:41                   ` [PATCH 4/5] block: Export integrity data interval size in sysfs Martin K. Petersen
@ 2015-08-20 20:41                   ` Martin K. Petersen
  2015-08-21 23:47                     ` Busch, Keith
  2015-09-16  1:07                       ` Mike Snitzer
  2015-08-20 20:45                   ` Simplify block integrity registration Mike Snitzer
  5 siblings, 2 replies; 67+ messages in thread
From: Martin K. Petersen @ 2015-08-20 20:41 UTC (permalink / raw)


Up until now the integrity profile has been dynamically allocated and
attached to struct gendisk after the disk has been made active.

This causes problems because NVMe devices need to register the profile
prior to the partition table being read due to a mandatory metadata
buffer requirement. In addition, DM jumps through hoops to deal with
preallocating, but not initializing, integrity profiles.

Since the integrity profile is small (4 bytes + a pointer), Christoph
suggested moving it to struct gendisk proper. This requires several
changes:

 - Moving the blk_integrity definition to genhd.h.

 - Inlining blk_integrity in struct gendisk.

 - Removing the dynamic allocation code.

 - Adding helper functions which allow gendisk to set up and tear down
   the integrity sysfs dir when a disk is added/deleted.

 - Adding a blk_integrity_revalidate() callback for updating the stable
   pages bdi setting.

 - The calls that depend on whether a device has an integrity profile or
   not now key off of the bi->profile pointer.

 - Simplifying the integrity support routines in DM.
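
The core of the list above can be sketched in userspace C. This is only an
illustration under stated assumptions: the structs are cut-down stand-ins for
the kernel types, the flag defaults, interval_exp setup and the
blk_integrity_revalidate() call are omitted, and the profile name used with it
is made up. It shows the profile embedded in struct gendisk, registration
that cannot fail because nothing is allocated, and "has integrity" being
decided by the bi->profile pointer.

```c
#include <stddef.h>
#include <string.h>

/* Userspace stand-ins for the kernel types; field names follow the patch. */
struct blk_integrity_profile {
	const char *name;
};

struct blk_integrity {
	struct blk_integrity_profile *profile;
	unsigned char flags;
	unsigned char tuple_size;
	unsigned char interval_exp;
	unsigned char tag_size;
};

struct gendisk {
	struct blk_integrity integrity;	/* embedded, no longer a pointer */
};

/* A disk counts as integrity-capable only once a profile is set. */
static struct blk_integrity *blk_get_integrity(struct gendisk *disk)
{
	struct blk_integrity *bi = &disk->integrity;

	return bi->profile ? bi : NULL;
}

/* Copies the template into the embedded struct; there is no failure path. */
static void blk_integrity_register(struct gendisk *disk,
				   struct blk_integrity *template)
{
	struct blk_integrity *bi = &disk->integrity;

	bi->flags = template->flags;
	bi->profile = template->profile;
	bi->tuple_size = template->tuple_size;
	bi->tag_size = template->tag_size;
}

/* Clearing the embedded struct also clears the profile pointer. */
static void blk_integrity_unregister(struct gendisk *disk)
{
	memset(&disk->integrity, 0, sizeof(disk->integrity));
}
```

Because the storage always exists, callers that used to test the
disk->integrity pointer for NULL have to test bi->profile instead, which is
what the blk_get_integrity() wrapper does here.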

Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
Reported-by: Christoph Hellwig <hch at lst.de>
Reviewed-by: Sagi Grimberg <sagig at mellanox.com>
Cc: Mike Snitzer <snitzer at redhat.com>
Cc: Dan Williams <dan.j.williams at intel.com>
---
 block/blk-integrity.c     | 160 +++++++++++++++++-----------------------------
 block/genhd.c             |   2 +
 block/partition-generic.c |   1 +
 drivers/block/nvme-core.c |   2 +-
 drivers/md/dm-table.c     |  89 +++++++-------------------
 drivers/md/md.c           |   9 +--
 drivers/nvdimm/core.c     |   6 +-
 fs/block_dev.c            |   2 +-
 include/linux/blkdev.h    |  33 ++++------
 include/linux/genhd.h     |  26 +++++++-
 10 files changed, 126 insertions(+), 204 deletions(-)

diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index 72b427c3ed1e..80398bb4eee4 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -30,10 +30,6 @@
 
 #include "blk.h"
 
-static struct kmem_cache *integrity_cachep;
-
-static const char *bi_unsupported_name = "unsupported";
-
 /**
  * blk_rq_count_integrity_sg - Count number of integrity scatterlist elements
  * @q:		request queue
@@ -146,13 +142,13 @@ EXPORT_SYMBOL(blk_rq_map_integrity_sg);
  */
 int blk_integrity_compare(struct gendisk *gd1, struct gendisk *gd2)
 {
-	struct blk_integrity *b1 = gd1->integrity;
-	struct blk_integrity *b2 = gd2->integrity;
+	struct blk_integrity *b1 = &gd1->integrity;
+	struct blk_integrity *b2 = &gd2->integrity;
 
-	if (!b1 && !b2)
+	if (!b1->profile && !b2->profile)
 		return 0;
 
-	if (!b1 || !b2)
+	if (!b1->profile || !b2->profile)
 		return -1;
 
 	if (b1->interval_exp != b2->interval_exp) {
@@ -163,21 +159,21 @@ int blk_integrity_compare(struct gendisk *gd1, struct gendisk *gd2)
 	}
 
 	if (b1->tuple_size != b2->tuple_size) {
-		printk(KERN_ERR "%s: %s/%s tuple sz %u != %u\n", __func__,
+		pr_err("%s: %s/%s tuple sz %u != %u\n", __func__,
 		       gd1->disk_name, gd2->disk_name,
 		       b1->tuple_size, b2->tuple_size);
 		return -1;
 	}
 
 	if (b1->tag_size && b2->tag_size && (b1->tag_size != b2->tag_size)) {
-		printk(KERN_ERR "%s: %s/%s tag sz %u != %u\n", __func__,
+		pr_err("%s: %s/%s tag sz %u != %u\n", __func__,
 		       gd1->disk_name, gd2->disk_name,
 		       b1->tag_size, b2->tag_size);
 		return -1;
 	}
 
 	if (b1->profile != b2->profile) {
-		printk(KERN_ERR "%s: %s/%s type %s != %s\n", __func__,
+		pr_err("%s: %s/%s type %s != %s\n", __func__,
 		       gd1->disk_name, gd2->disk_name,
 		       b1->profile->name, b2->profile->name);
 		return -1;
@@ -247,7 +243,7 @@ static ssize_t integrity_attr_show(struct kobject *kobj, struct attribute *attr,
 				   char *page)
 {
 	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
-	struct blk_integrity *bi = blk_get_integrity(disk);
+	struct blk_integrity *bi = &disk->integrity;
 	struct integrity_sysfs_entry *entry =
 		container_of(attr, struct integrity_sysfs_entry, attr);
 
@@ -259,7 +255,7 @@ static ssize_t integrity_attr_store(struct kobject *kobj,
 				    size_t count)
 {
 	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
-	struct blk_integrity *bi = blk_get_integrity(disk);
+	struct blk_integrity *bi = &disk->integrity;
 	struct integrity_sysfs_entry *entry =
 		container_of(attr, struct integrity_sysfs_entry, attr);
 	ssize_t ret = 0;
@@ -272,7 +268,7 @@ static ssize_t integrity_attr_store(struct kobject *kobj,
 
 static ssize_t integrity_format_show(struct blk_integrity *bi, char *page)
 {
-	if (bi != NULL && bi->profile->name != NULL)
+	if (bi->profile && bi->profile->name)
 		return sprintf(page, "%s\n", bi->profile->name);
 	else
 		return sprintf(page, "none\n");
@@ -280,18 +276,13 @@ static ssize_t integrity_format_show(struct blk_integrity *bi, char *page)
 
 static ssize_t integrity_tag_size_show(struct blk_integrity *bi, char *page)
 {
-	if (bi != NULL)
-		return sprintf(page, "%u\n", bi->tag_size);
-	else
-		return sprintf(page, "0\n");
+	return sprintf(page, "%u\n", bi->tag_size);
 }
 
 static ssize_t integrity_interval_show(struct blk_integrity *bi, char *page)
 {
-	if (bi != NULL)
-		return sprintf(page, "%u\n", 1 << bi->interval_exp);
-	else
-		return sprintf(page, "0\n");
+	return sprintf(page, "%u\n",
+		       bi->interval_exp ? 1 << bi->interval_exp : 0);
 }
 
 static ssize_t integrity_verify_store(struct blk_integrity *bi,
@@ -385,113 +376,78 @@ static const struct sysfs_ops integrity_ops = {
 	.store	= &integrity_attr_store,
 };
 
-static int __init blk_dev_integrity_init(void)
-{
-	integrity_cachep = kmem_cache_create("blkdev_integrity",
-					     sizeof(struct blk_integrity),
-					     0, SLAB_PANIC, NULL);
-	return 0;
-}
-subsys_initcall(blk_dev_integrity_init);
-
-static void blk_integrity_release(struct kobject *kobj)
-{
-	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
-	struct blk_integrity *bi = blk_get_integrity(disk);
-
-	kmem_cache_free(integrity_cachep, bi);
-}
-
 static struct kobj_type integrity_ktype = {
 	.default_attrs	= integrity_attrs,
 	.sysfs_ops	= &integrity_ops,
-	.release	= blk_integrity_release,
 };
 
-bool blk_integrity_is_initialized(struct gendisk *disk)
-{
-	struct blk_integrity *bi = blk_get_integrity(disk);
-
-	return (bi && bi->profile->name && strcmp(bi->profile->name,
-						  bi_unsupported_name) != 0);
-}
-EXPORT_SYMBOL(blk_integrity_is_initialized);
-
 /**
  * blk_integrity_register - Register a gendisk as being integrity-capable
  * @disk:	struct gendisk pointer to make integrity-aware
- * @template:	optional integrity profile to register
+ * @template:	block integrity profile to register
  *
- * Description: When a device needs to advertise itself as being able
- * to send/receive integrity metadata it must use this function to
- * register the capability with the block layer.  The template is a
- * blk_integrity struct with values appropriate for the underlying
- * hardware.  If template is NULL the new profile is allocated but
- * not filled out. See Documentation/block/data-integrity.txt.
+ * Description: When a device needs to advertise itself as being able to
+ * send/receive integrity metadata it must use this function to register
+ * the capability with the block layer. The template is a blk_integrity
+ * struct with values appropriate for the underlying hardware. See
+ * Documentation/block/data-integrity.txt.
  */
-int blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
+void blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
 {
-	struct blk_integrity *bi;
-
-	BUG_ON(disk == NULL);
+	struct blk_integrity *bi = &disk->integrity;
 
-	if (disk->integrity == NULL) {
-		bi = kmem_cache_alloc(integrity_cachep,
-				      GFP_KERNEL | __GFP_ZERO);
-		if (!bi)
-			return -1;
+	bi->flags = BLK_INTEGRITY_VERIFY | BLK_INTEGRITY_GENERATE |
+		template->flags;
+	bi->interval_exp = ilog2(queue_logical_block_size(disk->queue));
+	bi->profile = template->profile;
+	bi->tuple_size = template->tuple_size;
+	bi->tag_size = template->tag_size;
 
-		if (kobject_init_and_add(&disk->integrity_kobj, &integrity_ktype,
-					 &disk_to_dev(disk)->kobj,
-					 "%s", "integrity")) {
-			kmem_cache_free(integrity_cachep, bi);
-			return -1;
-		}
-
-		kobject_uevent(&disk->integrity_kobj, KOBJ_ADD);
-
-		bi->flags |= BLK_INTEGRITY_VERIFY | BLK_INTEGRITY_GENERATE;
-		bi->interval_exp = ilog2(queue_logical_block_size(disk->queue));
-		disk->integrity = bi;
-	} else
-		bi = disk->integrity;
-
-	/* Use the provided profile as template */
-	if (template != NULL) {
-		bi->profile = template->profile;
-		bi->tuple_size = template->tuple_size;
-		bi->tag_size = template->tag_size;
-		bi->flags |= template->flags;
-	} else
-		bi->profile->name = bi_unsupported_name;
-
-	disk->queue->backing_dev_info.capabilities |= BDI_CAP_STABLE_WRITES;
-
-	return 0;
+	blk_integrity_revalidate(disk);
 }
 EXPORT_SYMBOL(blk_integrity_register);
 
 /**
- * blk_integrity_unregister - Remove block integrity profile
- * @disk:	disk whose integrity profile to deallocate
+ * blk_integrity_unregister - Unregister block integrity profile
+ * @disk:	disk whose integrity profile to unregister
  *
- * Description: This function frees all memory used by the block
- * integrity profile.  To be called at device teardown.
+ * Description: This function unregisters the integrity capability from
+ * a block device.
  */
 void blk_integrity_unregister(struct gendisk *disk)
 {
-	struct blk_integrity *bi;
+	blk_integrity_revalidate(disk);
+	memset(&disk->integrity, 0, sizeof(struct blk_integrity));
+}
+EXPORT_SYMBOL(blk_integrity_unregister);
 
-	if (!disk || !disk->integrity)
+void blk_integrity_revalidate(struct gendisk *disk)
+{
+	struct blk_integrity *bi = &disk->integrity;
+
+	if (!(disk->flags & GENHD_FL_UP))
 		return;
 
-	disk->queue->backing_dev_info.capabilities &= ~BDI_CAP_STABLE_WRITES;
+	if (bi->profile)
+		disk->queue->backing_dev_info.capabilities |=
+			BDI_CAP_STABLE_WRITES;
+	else
+		disk->queue->backing_dev_info.capabilities &=
+			~BDI_CAP_STABLE_WRITES;
+}
 
-	bi = disk->integrity;
+void blk_integrity_add(struct gendisk *disk)
+{
+	if (kobject_init_and_add(&disk->integrity_kobj, &integrity_ktype,
+				 &disk_to_dev(disk)->kobj, "%s", "integrity"))
+		return;
 
+	kobject_uevent(&disk->integrity_kobj, KOBJ_ADD);
+}
+
+void blk_integrity_del(struct gendisk *disk)
+{
 	kobject_uevent(&disk->integrity_kobj, KOBJ_REMOVE);
 	kobject_del(&disk->integrity_kobj);
 	kobject_put(&disk->integrity_kobj);
-	disk->integrity = NULL;
 }
-EXPORT_SYMBOL(blk_integrity_unregister);
diff --git a/block/genhd.c b/block/genhd.c
index 0c706f33a599..e5cafa51567c 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -630,6 +630,7 @@ void add_disk(struct gendisk *disk)
 	WARN_ON(retval);
 
 	disk_add_events(disk);
+	blk_integrity_add(disk);
 }
 EXPORT_SYMBOL(add_disk);
 
@@ -638,6 +639,7 @@ void del_gendisk(struct gendisk *disk)
 	struct disk_part_iter piter;
 	struct hd_struct *part;
 
+	blk_integrity_del(disk);
 	disk_del_events(disk);
 
 	/* invalidate stuff */
diff --git a/block/partition-generic.c b/block/partition-generic.c
index e7711133284e..3b030157ec85 100644
--- a/block/partition-generic.c
+++ b/block/partition-generic.c
@@ -428,6 +428,7 @@ rescan:
 
 	if (disk->fops->revalidate_disk)
 		disk->fops->revalidate_disk(disk);
+	blk_integrity_revalidate(disk);
 	check_disk_size_change(disk, bdev);
 	bdev->bd_invalidated = 0;
 	if (!get_capacity(disk) || !(state = check_partition(disk, bdev)))
diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index e26eb1524bfd..75ec6895c378 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -529,7 +529,7 @@ static void nvme_dif_remap(struct request *req,
 	virt = bip_get_seed(bip);
 	phys = nvme_block_nr(ns, blk_rq_pos(req));
 	nlb = (blk_rq_bytes(req) >> ns->lba_shift);
-	ts = ns->disk->integrity->tuple_size;
+	ts = ns->disk->integrity.tuple_size;
 
 	for (i = 0; i < nlb; i++, virt++, phys++) {
 		pi = (struct t10_pi_tuple *)p;
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index afb4ad3dfeb3..25c33b7f4749 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1016,13 +1016,9 @@ static int dm_table_build_index(struct dm_table *t)
 
 /*
  * Get a disk whose integrity profile reflects the table's profile.
- * If %match_all is true, all devices' profiles must match.
- * If %match_all is false, all devices must at least have an
- * allocated integrity profile; but uninitialized is ok.
  * Returns NULL if integrity support was inconsistent or unavailable.
  */
-static struct gendisk * dm_table_get_integrity_disk(struct dm_table *t,
-						    bool match_all)
+static struct gendisk * dm_table_get_integrity_disk(struct dm_table *t)
 {
 	struct list_head *devices = dm_table_get_devices(t);
 	struct dm_dev_internal *dd = NULL;
@@ -1030,11 +1026,7 @@ static struct gendisk * dm_table_get_integrity_disk(struct dm_table *t,
 
 	list_for_each_entry(dd, devices, list) {
 		template_disk = dd->dm_dev->bdev->bd_disk;
-		if (!blk_get_integrity(template_disk))
-			goto no_integrity;
-		if (!match_all && !blk_integrity_is_initialized(template_disk))
-			continue; /* skip uninitialized profiles */
-		else if (prev_disk &&
+		if (prev_disk &&
 			 blk_integrity_compare(prev_disk, template_disk) < 0)
 			goto no_integrity;
 		prev_disk = template_disk;
@@ -1052,43 +1044,34 @@ no_integrity:
 }
 
 /*
- * Register the mapped device for blk_integrity support if
- * the underlying devices have an integrity profile.  But all devices
- * may not have matching profiles (checking all devices isn't reliable
+ * Register the mapped device for blk_integrity support if the
+ * underlying devices have an integrity profile.  But all devices may
+ * not have matching profiles (checking all devices isn't reliable
  * during table load because this table may use other DM device(s) which
- * must be resumed before they will have an initialized integity profile).
- * Stacked DM devices force a 2 stage integrity profile validation:
- * 1 - during load, validate all initialized integrity profiles match
- * 2 - during resume, validate all integrity profiles match
+ * must be resumed before they will have an initialized integity
+ * profile).  Consequently, stacked DM devices force a 2 stage integrity
+ * profile validation: First pass during table load, final pass during
+ * resume.
  */
-static int dm_table_prealloc_integrity(struct dm_table *t, struct mapped_device *md)
+static int dm_table_set_integrity(struct dm_table *t)
 {
 	struct gendisk *template_disk = NULL;
+	bool existing_profile = blk_get_integrity(dm_disk(t->md));
 
-	template_disk = dm_table_get_integrity_disk(t, false);
-	if (!template_disk)
-		return 0;
-
-	if (!blk_integrity_is_initialized(dm_disk(md))) {
+	template_disk = dm_table_get_integrity_disk(t);
+	if (template_disk) {
+		blk_integrity_register(dm_disk(t->md),
+				       blk_get_integrity(template_disk));
 		t->integrity_supported = 1;
-		return blk_integrity_register(dm_disk(md), NULL);
-	}
-
-	/*
-	 * If DM device already has an initalized integrity
-	 * profile the new profile should not conflict.
-	 */
-	if (blk_integrity_is_initialized(template_disk) &&
-	    blk_integrity_compare(dm_disk(md), template_disk) < 0) {
-		DMWARN("%s: conflict with existing integrity profile: "
-		       "%s profile mismatch",
-		       dm_device_name(t->md),
-		       template_disk->disk_name);
+	} else if (existing_profile) {
+		blk_integrity_unregister(dm_disk(t->md));
+		DMWARN("%s: device no longer has a valid integrity profile",
+		       dm_device_name(t->md));
 		return 1;
-	}
+	} else
+		DMWARN("%s: unable to establish an integrity profile",
+		       dm_device_name(t->md));
 
-	/* Preserve existing initialized integrity profile */
-	t->integrity_supported = 1;
 	return 0;
 }
 
@@ -1112,7 +1095,7 @@ int dm_table_complete(struct dm_table *t)
 		return r;
 	}
 
-	r = dm_table_prealloc_integrity(t, t->md);
+	r = dm_table_set_integrity(t);
 	if (r) {
 		DMERR("could not register integrity profile.");
 		return r;
@@ -1277,32 +1260,6 @@ combine_limits:
 	return validate_hardware_logical_block_alignment(table, limits);
 }
 
-/*
- * Set the integrity profile for this device if all devices used have
- * matching profiles.  We're quite deep in the resume path but still
- * don't know if all devices (particularly DM devices this device
- * may be stacked on) have matching profiles.  Even if the profiles
- * don't match we have no way to fail (to resume) at this point.
- */
-static void dm_table_set_integrity(struct dm_table *t)
-{
-	struct gendisk *template_disk = NULL;
-
-	if (!blk_get_integrity(dm_disk(t->md)))
-		return;
-
-	template_disk = dm_table_get_integrity_disk(t, true);
-	if (template_disk)
-		blk_integrity_register(dm_disk(t->md),
-				       blk_get_integrity(template_disk));
-	else if (blk_integrity_is_initialized(dm_disk(t->md)))
-		DMWARN("%s: device no longer has a valid integrity profile",
-		       dm_device_name(t->md));
-	else
-		DMWARN("%s: unable to establish an integrity profile",
-		       dm_device_name(t->md));
-}
-
 static int device_flush_capable(struct dm_target *ti, struct dm_dev *dev,
 				sector_t start, sector_t len, void *data)
 {
diff --git a/drivers/md/md.c b/drivers/md/md.c
index d28bf5cea224..dd7b7c31f3f2 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -1959,12 +1959,9 @@ int md_integrity_register(struct mddev *mddev)
 	 * All component devices are integrity capable and have matching
 	 * profiles, register the common profile for the md device.
 	 */
-	if (blk_integrity_register(mddev->gendisk,
-			bdev_get_integrity(reference->bdev)) != 0) {
-		printk(KERN_ERR "md: failed to register integrity for %s\n",
-			mdname(mddev));
-		return -EINVAL;
-	}
+	blk_integrity_register(mddev->gendisk,
+			       bdev_get_integrity(reference->bdev));
+
 	printk(KERN_NOTICE "md: data integrity enabled on %s\n", mdname(mddev));
 	if (bioset_integrity_create(mddev->bio_set, BIO_POOL_SIZE)) {
 		printk(KERN_ERR "md: failed to create integrity pool for %s\n",
diff --git a/drivers/nvdimm/core.c b/drivers/nvdimm/core.c
index 9e1b0f656a9b..9b6ac57c6e73 100644
--- a/drivers/nvdimm/core.c
+++ b/drivers/nvdimm/core.c
@@ -405,7 +405,6 @@ int nd_integrity_init(struct gendisk *disk, unsigned long meta_size)
 		.generate_fn = nd_pi_nop_generate_verify,
 		.verify_fn = nd_pi_nop_generate_verify,
 	};
-	int ret;
 
 	if (meta_size == 0)
 		return 0;
@@ -414,10 +413,7 @@ int nd_integrity_init(struct gendisk *disk, unsigned long meta_size)
 	bi.tuple_size = meta_size;
 	bi.tag_size = meta_size;
 
-	ret = blk_integrity_register(disk, &bi);
-	if (ret)
-		return ret;
-
+	blk_integrity_register(disk, &bi);
 	blk_queue_max_integrity_segments(disk->queue, 1);
 
 	return 0;
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 198243717da5..918170c0099a 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1074,7 +1074,7 @@ int revalidate_disk(struct gendisk *disk)
 
 	if (disk->fops->revalidate_disk)
 		ret = disk->fops->revalidate_disk(disk);
-
+	blk_integrity_revalidate(disk);
 	bdev = bdget_disk(disk, 0);
 	if (!bdev)
 		return ret;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 2d93d3875586..15a29ac1eb66 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1427,16 +1427,7 @@ struct blk_integrity_profile {
 	const char			*name;
 };
 
-struct blk_integrity {
-	struct blk_integrity_profile	*profile;
-	unsigned char			flags;
-	unsigned char			tuple_size;
-	unsigned char			interval_exp;
-	unsigned char			tag_size;
-};
-
-extern bool blk_integrity_is_initialized(struct gendisk *);
-extern int blk_integrity_register(struct gendisk *, struct blk_integrity *);
+extern void blk_integrity_register(struct gendisk *, struct blk_integrity *);
 extern void blk_integrity_unregister(struct gendisk *);
 extern int blk_integrity_compare(struct gendisk *, struct gendisk *);
 extern int blk_rq_map_integrity_sg(struct request_queue *, struct bio *,
@@ -1447,15 +1438,20 @@ extern bool blk_integrity_merge_rq(struct request_queue *, struct request *,
 extern bool blk_integrity_merge_bio(struct request_queue *, struct request *,
 				    struct bio *);
 
-static inline
-struct blk_integrity *bdev_get_integrity(struct block_device *bdev)
+static inline struct blk_integrity *blk_get_integrity(struct gendisk *disk)
 {
-	return bdev->bd_disk->integrity;
+	struct blk_integrity *bi = &disk->integrity;
+
+	if (!bi->profile)
+		return NULL;
+
+	return bi;
 }
 
-static inline struct blk_integrity *blk_get_integrity(struct gendisk *disk)
+static inline
+struct blk_integrity *bdev_get_integrity(struct block_device *bdev)
 {
-	return disk->integrity;
+	return blk_get_integrity(bdev->bd_disk);
 }
 
 static inline bool blk_integrity_rq(struct request *rq)
@@ -1509,10 +1505,9 @@ static inline int blk_integrity_compare(struct gendisk *a, struct gendisk *b)
 {
 	return 0;
 }
-static inline int blk_integrity_register(struct gendisk *d,
+static inline void blk_integrity_register(struct gendisk *d,
 					 struct blk_integrity *b)
 {
-	return 0;
 }
 static inline void blk_integrity_unregister(struct gendisk *d)
 {
@@ -1537,10 +1532,6 @@ static inline bool blk_integrity_merge_bio(struct request_queue *rq,
 {
 	return true;
 }
-static inline bool blk_integrity_is_initialized(struct gendisk *g)
-{
-	return 0;
-}
 
 #endif /* CONFIG_BLK_DEV_INTEGRITY */
 
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 9e6e0dfa97ad..82f4911e0ad8 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -163,6 +163,18 @@ struct disk_part_tbl {
 
 struct disk_events;
 
+#if defined(CONFIG_BLK_DEV_INTEGRITY)
+
+struct blk_integrity {
+	struct blk_integrity_profile	*profile;
+	unsigned char			flags;
+	unsigned char			tuple_size;
+	unsigned char			interval_exp;
+	unsigned char			tag_size;
+};
+
+#endif	/* CONFIG_BLK_DEV_INTEGRITY */
+
 struct gendisk {
 	/* major, first_minor and minors are input parameters only,
 	 * don't use directly.  Use disk_devt() and disk_max_parts().
@@ -198,9 +210,9 @@ struct gendisk {
 	atomic_t sync_io;		/* RAID */
 	struct disk_events *ev;
 #ifdef  CONFIG_BLK_DEV_INTEGRITY
-	struct blk_integrity *integrity;
+	struct blk_integrity integrity;
 	struct kobject integrity_kobj;
-#endif
+#endif	/* CONFIG_BLK_DEV_INTEGRITY */
 	int node_id;
 };
 
@@ -728,6 +740,16 @@ static inline void part_nr_sects_write(struct hd_struct *part, sector_t size)
 #endif
 }
 
+#if defined(CONFIG_BLK_DEV_INTEGRITY)
+extern void blk_integrity_add(struct gendisk *);
+extern void blk_integrity_del(struct gendisk *);
+extern void blk_integrity_revalidate(struct gendisk *);
+#else	/* CONFIG_BLK_DEV_INTEGRITY */
+static inline void blk_integrity_add(struct gendisk *disk) { }
+static inline void blk_integrity_del(struct gendisk *disk) { }
+static inline void blk_integrity_revalidate(struct gendisk *disk) { }
+#endif	/* CONFIG_BLK_DEV_INTEGRITY */
+
 #else /* CONFIG_BLOCK */
 
 static inline void printk_all_partitions(void) { }
-- 
2.4.3

* Simplify block integrity registration
  2015-08-20 20:41                 ` Simplify block integrity registration Martin K. Petersen
                                     ` (4 preceding siblings ...)
  2015-08-20 20:41                   ` [PATCH 5/5] block: Inline blk_integrity in struct gendisk Martin K. Petersen
@ 2015-08-20 20:45                   ` Mike Snitzer
  5 siblings, 0 replies; 67+ messages in thread
From: Mike Snitzer @ 2015-08-20 20:45 UTC (permalink / raw)


On Thu, Aug 20 2015 at  4:41pm -0400,
Martin K. Petersen <martin.petersen@oracle.com> wrote:

> 
> Had a couple of requests this week so here is the latest version of the
> patch set that removes the dynamic integrity allocation to ease the
> registration headaches for nvme and nvdimm.
> 
> Mike, I'd appreciate if you could check the DM bits.

I'm just starting vacation until 8/26.  But I'll certainly review once I
get back.

* [PATCH 5/5] block: Inline blk_integrity in struct gendisk
  2015-08-20 20:41                   ` [PATCH 5/5] block: Inline blk_integrity in struct gendisk Martin K. Petersen
@ 2015-08-21 23:47                     ` Busch, Keith
  2015-08-27  0:25                       ` Martin K. Petersen
                                         ` (2 more replies)
  2015-09-16  1:07                       ` Mike Snitzer
  1 sibling, 3 replies; 67+ messages in thread
From: Busch, Keith @ 2015-08-21 23:47 UTC (permalink / raw)


On Thu, Aug 20 2015, Martin K. Petersen wrote:
> -static struct gendisk * dm_table_get_integrity_disk(struct dm_table *t,
> -						    bool match_all)
> +static struct gendisk * dm_table_get_integrity_disk(struct dm_table *t)
>  {
>  	struct list_head *devices = dm_table_get_devices(t);
>  	struct dm_dev_internal *dd = NULL;
> @@ -1030,11 +1026,7 @@ static struct gendisk * dm_table_get_integrity_disk(struct dm_table *t,
> 
>  	list_for_each_entry(dd, devices, list) {
>  		template_disk = dd->dm_dev->bdev->bd_disk;
> -		if (!blk_get_integrity(template_disk))
> -			goto no_integrity;


The blk_get_integrity check is necessary to prevent a kernel crash. Without it,
this function will return a template disk without integrity and later attempt to
register a NULL disk->integrity.
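
A minimal userspace sketch of the failure mode described above, assuming
cut-down stand-in types and a hypothetical array-based helper
(pick_template_disk() and integrity_compare() are not kernel functions). With
the guard in place, a table containing a device without a profile yields no
template disk. Without it, a single profile-less device would be returned as
the template, and a later blk_get_integrity() on it would hand a NULL template
to the register call.

```c
#include <stddef.h>

/* Cut-down stand-ins for the kernel types. */
struct blk_integrity_profile {
	const char *name;
};

struct blk_integrity {
	struct blk_integrity_profile *profile;
	unsigned char tuple_size;
};

struct gendisk {
	struct blk_integrity integrity;
};

static struct blk_integrity *blk_get_integrity(struct gendisk *disk)
{
	return disk->integrity.profile ? &disk->integrity : NULL;
}

/* Simplified comparison: 0 if the two disks match, -1 otherwise. */
static int integrity_compare(struct gendisk *a, struct gendisk *b)
{
	if (a->integrity.profile != b->integrity.profile)
		return -1;
	if (a->integrity.tuple_size != b->integrity.tuple_size)
		return -1;
	return 0;
}

/*
 * Hypothetical array-based version of the DM device loop. Returns a disk
 * whose profile can be used as a template, or NULL if integrity support
 * is missing or inconsistent.
 */
static struct gendisk *pick_template_disk(struct gendisk **disks, size_t n)
{
	struct gendisk *prev = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		/* The guard in question: a disk without a profile ends the search. */
		if (!blk_get_integrity(disks[i]))
			return NULL;
		if (prev && integrity_compare(prev, disks[i]) < 0)
			return NULL;
		prev = disks[i];
	}
	return prev;
}
```

Note that the comparison alone does not catch the single-device case, since
it only runs once a previous disk has been seen.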

Otherwise, looks good! I will post the nvme driver patch removing the two pass
disk revalidation since that's no longer necessary with this good stuff.


> -		if (!match_all && !blk_integrity_is_initialized(template_disk))
> -			continue; /* skip uninitialized profiles */
> -		else if (prev_disk &&
> +		if (prev_disk &&
>  			 blk_integrity_compare(prev_disk, template_disk) < 0)
>  			goto no_integrity;
>  		prev_disk = template_disk;

* [PATCH 5/5] block: Inline blk_integrity in struct gendisk
  2015-08-21 23:47                     ` Busch, Keith
@ 2015-08-27  0:25                       ` Martin K. Petersen
  2015-08-27  0:25                       ` [PATCH] " Martin K. Petersen
  2015-10-12 21:05                       ` Block integrity registration update Martin K. Petersen
  2 siblings, 0 replies; 67+ messages in thread
From: Martin K. Petersen @ 2015-08-27  0:25 UTC (permalink / raw)


>>>>> "Keith" == Busch, Keith <keith.busch at intel.com> writes:

Keith> The blk_get_integrity check is necessary to prevent a kernel
Keith> crash. Without it, this function will return a template disk
Keith> without integrity and later attempt to register a NULL
Keith> disk->integrity.

Fixed that up.

Keith> Otherwise, looks good! I will post the nvme driver patch removing
Keith> the two pass disk revalidation since that's no longer necessary
Keith> with this good stuff.

I tweaked that while I was at it. Updated patch 5 coming...

-- 
Martin K. Petersen	Oracle Linux Engineering

* [PATCH] block: Inline blk_integrity in struct gendisk
  2015-08-21 23:47                     ` Busch, Keith
  2015-08-27  0:25                       ` Martin K. Petersen
@ 2015-08-27  0:25                       ` Martin K. Petersen
  2015-08-27  8:28                         ` Christoph Hellwig
  2015-09-03 20:38                         ` Keith Busch
  2015-10-12 21:05                       ` Block integrity registration update Martin K. Petersen
  2 siblings, 2 replies; 67+ messages in thread
From: Martin K. Petersen @ 2015-08-27  0:25 UTC (permalink / raw)


Up until now the integrity profile has been dynamically allocated and
attached to struct gendisk after the disk has been made active.

This causes problems because NVMe devices need to register the profile
prior to the partition table being read due to a mandatory metadata
buffer requirement. In addition, DM jumps through hoops to deal with
preallocating, but not initializing, integrity profiles.

Since the integrity profile is small (4 bytes + a pointer), Christoph
suggested moving it to struct gendisk proper. This requires several
changes:

 - Moving the blk_integrity definition to genhd.h.

 - Inlining blk_integrity in struct gendisk.

 - Removing the dynamic allocation code.

 - Adding helper functions which allow gendisk to set up and tear down
   the integrity sysfs dir when a disk is added/deleted.

 - Adding a blk_integrity_revalidate() callback for updating the stable
   pages bdi setting.

 - The calls that depend on whether a device has an integrity profile or
   not now key off of the bi->profile pointer.

 - Simplifying the integrity support routines in DM.
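
The revalidate callback mentioned in the list can be illustrated with a small
userspace sketch. The assumptions are loud ones: the structs are stand-ins,
the bdi_capabilities field replaces
disk->queue->backing_dev_info.capabilities, and the numeric flag values are
placeholders. It shows why a profile registered before the disk is live has
no effect on the stable pages setting until a later revalidation.

```c
#include <stddef.h>

#define GENHD_FL_UP		0x10	/* placeholder value */
#define BDI_CAP_STABLE_WRITES	0x08	/* placeholder value */

/* Cut-down stand-ins for the kernel types. */
struct blk_integrity_profile {
	const char *name;
};

struct blk_integrity {
	struct blk_integrity_profile *profile;
};

struct gendisk {
	int flags;
	unsigned int bdi_capabilities;	/* stands in for the queue's bdi */
	struct blk_integrity integrity;
};

/* Sync the stable pages capability with the presence of a profile. */
static void blk_integrity_revalidate(struct gendisk *disk)
{
	/* Nothing to update until the disk has been added. */
	if (!(disk->flags & GENHD_FL_UP))
		return;

	if (disk->integrity.profile)
		disk->bdi_capabilities |= BDI_CAP_STABLE_WRITES;
	else
		disk->bdi_capabilities &= ~BDI_CAP_STABLE_WRITES;
}
```

In this model a driver can fill in the profile first and mark the disk up
afterwards; the capability is then picked up by whichever revalidation runs
next.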

Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
Reported-by: Christoph Hellwig <hch at lst.de>
Reviewed-by: Sagi Grimberg <sagig at mellanox.com>
Cc: Mike Snitzer <snitzer at redhat.com>
Cc: Dan Williams <dan.j.williams at intel.com>
---
 block/blk-integrity.c     | 160 +++++++++++++++++-----------------------------
 block/genhd.c             |   2 +
 block/partition-generic.c |   1 +
 drivers/block/nvme-core.c |  10 +--
 drivers/md/dm-table.c     |  85 ++++++------------------
 drivers/md/md.c           |   9 +--
 drivers/nvdimm/core.c     |   6 +-
 fs/block_dev.c            |   2 +-
 include/linux/blkdev.h    |  33 ++++------
 include/linux/genhd.h     |  26 +++++++-
 10 files changed, 126 insertions(+), 208 deletions(-)

diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index 72b427c3ed1e..80398bb4eee4 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -30,10 +30,6 @@
 
 #include "blk.h"
 
-static struct kmem_cache *integrity_cachep;
-
-static const char *bi_unsupported_name = "unsupported";
-
 /**
  * blk_rq_count_integrity_sg - Count number of integrity scatterlist elements
  * @q:		request queue
@@ -146,13 +142,13 @@ EXPORT_SYMBOL(blk_rq_map_integrity_sg);
  */
 int blk_integrity_compare(struct gendisk *gd1, struct gendisk *gd2)
 {
-	struct blk_integrity *b1 = gd1->integrity;
-	struct blk_integrity *b2 = gd2->integrity;
+	struct blk_integrity *b1 = &gd1->integrity;
+	struct blk_integrity *b2 = &gd2->integrity;
 
-	if (!b1 && !b2)
+	if (!b1->profile && !b2->profile)
 		return 0;
 
-	if (!b1 || !b2)
+	if (!b1->profile || !b2->profile)
 		return -1;
 
 	if (b1->interval_exp != b2->interval_exp) {
@@ -163,21 +159,21 @@ int blk_integrity_compare(struct gendisk *gd1, struct gendisk *gd2)
 	}
 
 	if (b1->tuple_size != b2->tuple_size) {
-		printk(KERN_ERR "%s: %s/%s tuple sz %u != %u\n", __func__,
+		pr_err("%s: %s/%s tuple sz %u != %u\n", __func__,
 		       gd1->disk_name, gd2->disk_name,
 		       b1->tuple_size, b2->tuple_size);
 		return -1;
 	}
 
 	if (b1->tag_size && b2->tag_size && (b1->tag_size != b2->tag_size)) {
-		printk(KERN_ERR "%s: %s/%s tag sz %u != %u\n", __func__,
+		pr_err("%s: %s/%s tag sz %u != %u\n", __func__,
 		       gd1->disk_name, gd2->disk_name,
 		       b1->tag_size, b2->tag_size);
 		return -1;
 	}
 
 	if (b1->profile != b2->profile) {
-		printk(KERN_ERR "%s: %s/%s type %s != %s\n", __func__,
+		pr_err("%s: %s/%s type %s != %s\n", __func__,
 		       gd1->disk_name, gd2->disk_name,
 		       b1->profile->name, b2->profile->name);
 		return -1;
@@ -247,7 +243,7 @@ static ssize_t integrity_attr_show(struct kobject *kobj, struct attribute *attr,
 				   char *page)
 {
 	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
-	struct blk_integrity *bi = blk_get_integrity(disk);
+	struct blk_integrity *bi = &disk->integrity;
 	struct integrity_sysfs_entry *entry =
 		container_of(attr, struct integrity_sysfs_entry, attr);
 
@@ -259,7 +255,7 @@ static ssize_t integrity_attr_store(struct kobject *kobj,
 				    size_t count)
 {
 	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
-	struct blk_integrity *bi = blk_get_integrity(disk);
+	struct blk_integrity *bi = &disk->integrity;
 	struct integrity_sysfs_entry *entry =
 		container_of(attr, struct integrity_sysfs_entry, attr);
 	ssize_t ret = 0;
@@ -272,7 +268,7 @@ static ssize_t integrity_attr_store(struct kobject *kobj,
 
 static ssize_t integrity_format_show(struct blk_integrity *bi, char *page)
 {
-	if (bi != NULL && bi->profile->name != NULL)
+	if (bi->profile && bi->profile->name)
 		return sprintf(page, "%s\n", bi->profile->name);
 	else
 		return sprintf(page, "none\n");
@@ -280,18 +276,13 @@ static ssize_t integrity_format_show(struct blk_integrity *bi, char *page)
 
 static ssize_t integrity_tag_size_show(struct blk_integrity *bi, char *page)
 {
-	if (bi != NULL)
-		return sprintf(page, "%u\n", bi->tag_size);
-	else
-		return sprintf(page, "0\n");
+	return sprintf(page, "%u\n", bi->tag_size);
 }
 
 static ssize_t integrity_interval_show(struct blk_integrity *bi, char *page)
 {
-	if (bi != NULL)
-		return sprintf(page, "%u\n", 1 << bi->interval_exp);
-	else
-		return sprintf(page, "0\n");
+	return sprintf(page, "%u\n",
+		       bi->interval_exp ? 1 << bi->interval_exp : 0);
 }
 
 static ssize_t integrity_verify_store(struct blk_integrity *bi,
@@ -385,113 +376,78 @@ static const struct sysfs_ops integrity_ops = {
 	.store	= &integrity_attr_store,
 };
 
-static int __init blk_dev_integrity_init(void)
-{
-	integrity_cachep = kmem_cache_create("blkdev_integrity",
-					     sizeof(struct blk_integrity),
-					     0, SLAB_PANIC, NULL);
-	return 0;
-}
-subsys_initcall(blk_dev_integrity_init);
-
-static void blk_integrity_release(struct kobject *kobj)
-{
-	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
-	struct blk_integrity *bi = blk_get_integrity(disk);
-
-	kmem_cache_free(integrity_cachep, bi);
-}
-
 static struct kobj_type integrity_ktype = {
 	.default_attrs	= integrity_attrs,
 	.sysfs_ops	= &integrity_ops,
-	.release	= blk_integrity_release,
 };
 
-bool blk_integrity_is_initialized(struct gendisk *disk)
-{
-	struct blk_integrity *bi = blk_get_integrity(disk);
-
-	return (bi && bi->profile->name && strcmp(bi->profile->name,
-						  bi_unsupported_name) != 0);
-}
-EXPORT_SYMBOL(blk_integrity_is_initialized);
-
 /**
  * blk_integrity_register - Register a gendisk as being integrity-capable
  * @disk:	struct gendisk pointer to make integrity-aware
- * @template:	optional integrity profile to register
+ * @template:	block integrity profile to register
  *
- * Description: When a device needs to advertise itself as being able
- * to send/receive integrity metadata it must use this function to
- * register the capability with the block layer.  The template is a
- * blk_integrity struct with values appropriate for the underlying
- * hardware.  If template is NULL the new profile is allocated but
- * not filled out. See Documentation/block/data-integrity.txt.
+ * Description: When a device needs to advertise itself as being able to
+ * send/receive integrity metadata it must use this function to register
+ * the capability with the block layer. The template is a blk_integrity
+ * struct with values appropriate for the underlying hardware. See
+ * Documentation/block/data-integrity.txt.
  */
-int blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
+void blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
 {
-	struct blk_integrity *bi;
-
-	BUG_ON(disk == NULL);
+	struct blk_integrity *bi = &disk->integrity;
 
-	if (disk->integrity == NULL) {
-		bi = kmem_cache_alloc(integrity_cachep,
-				      GFP_KERNEL | __GFP_ZERO);
-		if (!bi)
-			return -1;
+	bi->flags = BLK_INTEGRITY_VERIFY | BLK_INTEGRITY_GENERATE |
+		template->flags;
+	bi->interval_exp = ilog2(queue_logical_block_size(disk->queue));
+	bi->profile = template->profile;
+	bi->tuple_size = template->tuple_size;
+	bi->tag_size = template->tag_size;
 
-		if (kobject_init_and_add(&disk->integrity_kobj, &integrity_ktype,
-					 &disk_to_dev(disk)->kobj,
-					 "%s", "integrity")) {
-			kmem_cache_free(integrity_cachep, bi);
-			return -1;
-		}
-
-		kobject_uevent(&disk->integrity_kobj, KOBJ_ADD);
-
-		bi->flags |= BLK_INTEGRITY_VERIFY | BLK_INTEGRITY_GENERATE;
-		bi->interval_exp = ilog2(queue_logical_block_size(disk->queue));
-		disk->integrity = bi;
-	} else
-		bi = disk->integrity;
-
-	/* Use the provided profile as template */
-	if (template != NULL) {
-		bi->profile = template->profile;
-		bi->tuple_size = template->tuple_size;
-		bi->tag_size = template->tag_size;
-		bi->flags |= template->flags;
-	} else
-		bi->profile->name = bi_unsupported_name;
-
-	disk->queue->backing_dev_info.capabilities |= BDI_CAP_STABLE_WRITES;
-
-	return 0;
+	blk_integrity_revalidate(disk);
 }
 EXPORT_SYMBOL(blk_integrity_register);
 
 /**
- * blk_integrity_unregister - Remove block integrity profile
- * @disk:	disk whose integrity profile to deallocate
+ * blk_integrity_unregister - Unregister block integrity profile
+ * @disk:	disk whose integrity profile to unregister
  *
- * Description: This function frees all memory used by the block
- * integrity profile.  To be called at device teardown.
+ * Description: This function unregisters the integrity capability from
+ * a block device.
  */
 void blk_integrity_unregister(struct gendisk *disk)
 {
-	struct blk_integrity *bi;
+	blk_integrity_revalidate(disk);
+	memset(&disk->integrity, 0, sizeof(struct blk_integrity));
+}
+EXPORT_SYMBOL(blk_integrity_unregister);
 
-	if (!disk || !disk->integrity)
+void blk_integrity_revalidate(struct gendisk *disk)
+{
+	struct blk_integrity *bi = &disk->integrity;
+
+	if (!(disk->flags & GENHD_FL_UP))
 		return;
 
-	disk->queue->backing_dev_info.capabilities &= ~BDI_CAP_STABLE_WRITES;
+	if (bi->profile)
+		disk->queue->backing_dev_info.capabilities |=
+			BDI_CAP_STABLE_WRITES;
+	else
+		disk->queue->backing_dev_info.capabilities &=
+			~BDI_CAP_STABLE_WRITES;
+}
 
-	bi = disk->integrity;
+void blk_integrity_add(struct gendisk *disk)
+{
+	if (kobject_init_and_add(&disk->integrity_kobj, &integrity_ktype,
+				 &disk_to_dev(disk)->kobj, "%s", "integrity"))
+		return;
 
+	kobject_uevent(&disk->integrity_kobj, KOBJ_ADD);
+}
+
+void blk_integrity_del(struct gendisk *disk)
+{
 	kobject_uevent(&disk->integrity_kobj, KOBJ_REMOVE);
 	kobject_del(&disk->integrity_kobj);
 	kobject_put(&disk->integrity_kobj);
-	disk->integrity = NULL;
 }
-EXPORT_SYMBOL(blk_integrity_unregister);
diff --git a/block/genhd.c b/block/genhd.c
index 0c706f33a599..e5cafa51567c 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -630,6 +630,7 @@ void add_disk(struct gendisk *disk)
 	WARN_ON(retval);
 
 	disk_add_events(disk);
+	blk_integrity_add(disk);
 }
 EXPORT_SYMBOL(add_disk);
 
@@ -638,6 +639,7 @@ void del_gendisk(struct gendisk *disk)
 	struct disk_part_iter piter;
 	struct hd_struct *part;
 
+	blk_integrity_del(disk);
 	disk_del_events(disk);
 
 	/* invalidate stuff */
diff --git a/block/partition-generic.c b/block/partition-generic.c
index e7711133284e..3b030157ec85 100644
--- a/block/partition-generic.c
+++ b/block/partition-generic.c
@@ -428,6 +428,7 @@ rescan:
 
 	if (disk->fops->revalidate_disk)
 		disk->fops->revalidate_disk(disk);
+	blk_integrity_revalidate(disk);
 	check_disk_size_change(disk, bdev);
 	bdev->bd_invalidated = 0;
 	if (!get_capacity(disk) || !(state = check_partition(disk, bdev)))
diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index e26eb1524bfd..e4f55c8318a5 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -529,7 +529,7 @@ static void nvme_dif_remap(struct request *req,
 	virt = bip_get_seed(bip);
 	phys = nvme_block_nr(ns, blk_rq_pos(req));
 	nlb = (blk_rq_bytes(req) >> ns->lba_shift);
-	ts = ns->disk->integrity->tuple_size;
+	ts = ns->disk->integrity.tuple_size;
 
 	for (i = 0; i < nlb; i++, virt++, phys++) {
 		pi = (struct t10_pi_tuple *)p;
@@ -1985,14 +1985,10 @@ static int nvme_revalidate_disk(struct gendisk *disk)
 	ns->pi_type = pi_type;
 	blk_queue_logical_block_size(ns->queue, bs);
 
-	if (ns->ms && !blk_get_integrity(disk) && (disk->flags & GENHD_FL_UP) &&
-								!ns->ext)
+	if (ns->ms && !ns->ext)
 		nvme_init_integrity(ns);
 
-	if (ns->ms && !blk_get_integrity(disk))
-		set_capacity(disk, 0);
-	else
-		set_capacity(disk, le64_to_cpup(&id->nsze) << (ns->lba_shift - 9));
+	set_capacity(disk, le64_to_cpup(&id->nsze) << (ns->lba_shift - 9));
 
 	if (dev->oncs & NVME_CTRL_ONCS_DSM)
 		nvme_config_discard(ns);
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index afb4ad3dfeb3..3bb1ab48b400 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1016,13 +1016,9 @@ static int dm_table_build_index(struct dm_table *t)
 
 /*
  * Get a disk whose integrity profile reflects the table's profile.
- * If %match_all is true, all devices' profiles must match.
- * If %match_all is false, all devices must at least have an
- * allocated integrity profile; but uninitialized is ok.
  * Returns NULL if integrity support was inconsistent or unavailable.
  */
-static struct gendisk * dm_table_get_integrity_disk(struct dm_table *t,
-						    bool match_all)
+static struct gendisk * dm_table_get_integrity_disk(struct dm_table *t)
 {
 	struct list_head *devices = dm_table_get_devices(t);
 	struct dm_dev_internal *dd = NULL;
@@ -1032,10 +1028,8 @@ static struct gendisk * dm_table_get_integrity_disk(struct dm_table *t,
 		template_disk = dd->dm_dev->bdev->bd_disk;
 		if (!blk_get_integrity(template_disk))
 			goto no_integrity;
-		if (!match_all && !blk_integrity_is_initialized(template_disk))
-			continue; /* skip uninitialized profiles */
-		else if (prev_disk &&
-			 blk_integrity_compare(prev_disk, template_disk) < 0)
+		if (prev_disk &&
+		    blk_integrity_compare(prev_disk, template_disk) < 0)
 			goto no_integrity;
 		prev_disk = template_disk;
 	}
@@ -1052,43 +1046,32 @@ no_integrity:
 }
 
 /*
- * Register the mapped device for blk_integrity support if
- * the underlying devices have an integrity profile.  But all devices
- * may not have matching profiles (checking all devices isn't reliable
+ * Register the mapped device for blk_integrity support if the
+ * underlying devices have an integrity profile.  But all devices may
+ * not have matching profiles (checking all devices isn't reliable
  * during table load because this table may use other DM device(s) which
- * must be resumed before they will have an initialized integity profile).
- * Stacked DM devices force a 2 stage integrity profile validation:
- * 1 - during load, validate all initialized integrity profiles match
- * 2 - during resume, validate all integrity profiles match
+ * must be resumed before they will have an initialized integrity
+ * profile).  Consequently, stacked DM devices force a 2 stage integrity
+ * profile validation: First pass during table load, final pass during
+ * resume.
  */
-static int dm_table_prealloc_integrity(struct dm_table *t, struct mapped_device *md)
+static int dm_table_set_integrity(struct dm_table *t)
 {
 	struct gendisk *template_disk = NULL;
+	bool existing_profile = blk_get_integrity(dm_disk(t->md));
 
-	template_disk = dm_table_get_integrity_disk(t, false);
-	if (!template_disk)
-		return 0;
-
-	if (!blk_integrity_is_initialized(dm_disk(md))) {
+	template_disk = dm_table_get_integrity_disk(t);
+	if (template_disk) {
+		blk_integrity_register(dm_disk(t->md),
+				       blk_get_integrity(template_disk));
 		t->integrity_supported = 1;
-		return blk_integrity_register(dm_disk(md), NULL);
-	}
-
-	/*
-	 * If DM device already has an initalized integrity
-	 * profile the new profile should not conflict.
-	 */
-	if (blk_integrity_is_initialized(template_disk) &&
-	    blk_integrity_compare(dm_disk(md), template_disk) < 0) {
-		DMWARN("%s: conflict with existing integrity profile: "
-		       "%s profile mismatch",
-		       dm_device_name(t->md),
-		       template_disk->disk_name);
+	} else if (existing_profile) {
+		blk_integrity_unregister(dm_disk(t->md));
+		DMWARN("%s: device no longer has a valid integrity profile",
+		       dm_device_name(t->md));
 		return 1;
 	}
 
-	/* Preserve existing initialized integrity profile */
-	t->integrity_supported = 1;
 	return 0;
 }
 
@@ -1112,7 +1095,7 @@ int dm_table_complete(struct dm_table *t)
 		return r;
 	}
 
-	r = dm_table_prealloc_integrity(t, t->md);
+	r = dm_table_set_integrity(t);
 	if (r) {
 		DMERR("could not register integrity profile.");
 		return r;
@@ -1277,32 +1260,6 @@ combine_limits:
 	return validate_hardware_logical_block_alignment(table, limits);
 }
 
-/*
- * Set the integrity profile for this device if all devices used have
- * matching profiles.  We're quite deep in the resume path but still
- * don't know if all devices (particularly DM devices this device
- * may be stacked on) have matching profiles.  Even if the profiles
- * don't match we have no way to fail (to resume) at this point.
- */
-static void dm_table_set_integrity(struct dm_table *t)
-{
-	struct gendisk *template_disk = NULL;
-
-	if (!blk_get_integrity(dm_disk(t->md)))
-		return;
-
-	template_disk = dm_table_get_integrity_disk(t, true);
-	if (template_disk)
-		blk_integrity_register(dm_disk(t->md),
-				       blk_get_integrity(template_disk));
-	else if (blk_integrity_is_initialized(dm_disk(t->md)))
-		DMWARN("%s: device no longer has a valid integrity profile",
-		       dm_device_name(t->md));
-	else
-		DMWARN("%s: unable to establish an integrity profile",
-		       dm_device_name(t->md));
-}
-
 static int device_flush_capable(struct dm_target *ti, struct dm_dev *dev,
 				sector_t start, sector_t len, void *data)
 {
diff --git a/drivers/md/md.c b/drivers/md/md.c
index d28bf5cea224..dd7b7c31f3f2 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -1959,12 +1959,9 @@ int md_integrity_register(struct mddev *mddev)
 	 * All component devices are integrity capable and have matching
 	 * profiles, register the common profile for the md device.
 	 */
-	if (blk_integrity_register(mddev->gendisk,
-			bdev_get_integrity(reference->bdev)) != 0) {
-		printk(KERN_ERR "md: failed to register integrity for %s\n",
-			mdname(mddev));
-		return -EINVAL;
-	}
+	blk_integrity_register(mddev->gendisk,
+			       bdev_get_integrity(reference->bdev));
+
 	printk(KERN_NOTICE "md: data integrity enabled on %s\n", mdname(mddev));
 	if (bioset_integrity_create(mddev->bio_set, BIO_POOL_SIZE)) {
 		printk(KERN_ERR "md: failed to create integrity pool for %s\n",
diff --git a/drivers/nvdimm/core.c b/drivers/nvdimm/core.c
index 9e1b0f656a9b..9b6ac57c6e73 100644
--- a/drivers/nvdimm/core.c
+++ b/drivers/nvdimm/core.c
@@ -405,7 +405,6 @@ int nd_integrity_init(struct gendisk *disk, unsigned long meta_size)
 		.generate_fn = nd_pi_nop_generate_verify,
 		.verify_fn = nd_pi_nop_generate_verify,
 	};
-	int ret;
 
 	if (meta_size == 0)
 		return 0;
@@ -414,10 +413,7 @@ int nd_integrity_init(struct gendisk *disk, unsigned long meta_size)
 	bi.tuple_size = meta_size;
 	bi.tag_size = meta_size;
 
-	ret = blk_integrity_register(disk, &bi);
-	if (ret)
-		return ret;
-
+	blk_integrity_register(disk, &bi);
 	blk_queue_max_integrity_segments(disk->queue, 1);
 
 	return 0;
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 198243717da5..918170c0099a 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1074,7 +1074,7 @@ int revalidate_disk(struct gendisk *disk)
 
 	if (disk->fops->revalidate_disk)
 		ret = disk->fops->revalidate_disk(disk);
-
+	blk_integrity_revalidate(disk);
 	bdev = bdget_disk(disk, 0);
 	if (!bdev)
 		return ret;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 2d93d3875586..15a29ac1eb66 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1427,16 +1427,7 @@ struct blk_integrity_profile {
 	const char			*name;
 };
 
-struct blk_integrity {
-	struct blk_integrity_profile	*profile;
-	unsigned char			flags;
-	unsigned char			tuple_size;
-	unsigned char			interval_exp;
-	unsigned char			tag_size;
-};
-
-extern bool blk_integrity_is_initialized(struct gendisk *);
-extern int blk_integrity_register(struct gendisk *, struct blk_integrity *);
+extern void blk_integrity_register(struct gendisk *, struct blk_integrity *);
 extern void blk_integrity_unregister(struct gendisk *);
 extern int blk_integrity_compare(struct gendisk *, struct gendisk *);
 extern int blk_rq_map_integrity_sg(struct request_queue *, struct bio *,
@@ -1447,15 +1438,20 @@ extern bool blk_integrity_merge_rq(struct request_queue *, struct request *,
 extern bool blk_integrity_merge_bio(struct request_queue *, struct request *,
 				    struct bio *);
 
-static inline
-struct blk_integrity *bdev_get_integrity(struct block_device *bdev)
+static inline struct blk_integrity *blk_get_integrity(struct gendisk *disk)
 {
-	return bdev->bd_disk->integrity;
+	struct blk_integrity *bi = &disk->integrity;
+
+	if (!bi->profile)
+		return NULL;
+
+	return bi;
 }
 
-static inline struct blk_integrity *blk_get_integrity(struct gendisk *disk)
+static inline
+struct blk_integrity *bdev_get_integrity(struct block_device *bdev)
 {
-	return disk->integrity;
+	return blk_get_integrity(bdev->bd_disk);
 }
 
 static inline bool blk_integrity_rq(struct request *rq)
@@ -1509,10 +1505,9 @@ static inline int blk_integrity_compare(struct gendisk *a, struct gendisk *b)
 {
 	return 0;
 }
-static inline int blk_integrity_register(struct gendisk *d,
+static inline void blk_integrity_register(struct gendisk *d,
 					 struct blk_integrity *b)
 {
-	return 0;
 }
 static inline void blk_integrity_unregister(struct gendisk *d)
 {
@@ -1537,10 +1532,6 @@ static inline bool blk_integrity_merge_bio(struct request_queue *rq,
 {
 	return true;
 }
-static inline bool blk_integrity_is_initialized(struct gendisk *g)
-{
-	return 0;
-}
 
 #endif /* CONFIG_BLK_DEV_INTEGRITY */
 
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 9e6e0dfa97ad..82f4911e0ad8 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -163,6 +163,18 @@ struct disk_part_tbl {
 
 struct disk_events;
 
+#if defined(CONFIG_BLK_DEV_INTEGRITY)
+
+struct blk_integrity {
+	struct blk_integrity_profile	*profile;
+	unsigned char			flags;
+	unsigned char			tuple_size;
+	unsigned char			interval_exp;
+	unsigned char			tag_size;
+};
+
+#endif	/* CONFIG_BLK_DEV_INTEGRITY */
+
 struct gendisk {
 	/* major, first_minor and minors are input parameters only,
 	 * don't use directly.  Use disk_devt() and disk_max_parts().
@@ -198,9 +210,9 @@ struct gendisk {
 	atomic_t sync_io;		/* RAID */
 	struct disk_events *ev;
 #ifdef  CONFIG_BLK_DEV_INTEGRITY
-	struct blk_integrity *integrity;
+	struct blk_integrity integrity;
 	struct kobject integrity_kobj;
-#endif
+#endif	/* CONFIG_BLK_DEV_INTEGRITY */
 	int node_id;
 };
 
@@ -728,6 +740,16 @@ static inline void part_nr_sects_write(struct hd_struct *part, sector_t size)
 #endif
 }
 
+#if defined(CONFIG_BLK_DEV_INTEGRITY)
+extern void blk_integrity_add(struct gendisk *);
+extern void blk_integrity_del(struct gendisk *);
+extern void blk_integrity_revalidate(struct gendisk *);
+#else	/* CONFIG_BLK_DEV_INTEGRITY */
+static inline void blk_integrity_add(struct gendisk *disk) { }
+static inline void blk_integrity_del(struct gendisk *disk) { }
+static inline void blk_integrity_revalidate(struct gendisk *disk) { }
+#endif	/* CONFIG_BLK_DEV_INTEGRITY */
+
 #else /* CONFIG_BLOCK */
 
 static inline void printk_all_partitions(void) { }
-- 
2.4.3


* [PATCH] block: Inline blk_integrity in struct gendisk
  2015-08-27  0:25                       ` [PATCH] " Martin K. Petersen
@ 2015-08-27  8:28                         ` Christoph Hellwig
  2015-09-03 20:38                         ` Keith Busch
  1 sibling, 0 replies; 67+ messages in thread
From: Christoph Hellwig @ 2015-08-27  8:28 UTC (permalink / raw)


The new version still looks fine to me.


* [PATCH] block: Inline blk_integrity in struct gendisk
  2015-08-27  0:25                       ` [PATCH] " Martin K. Petersen
  2015-08-27  8:28                         ` Christoph Hellwig
@ 2015-09-03 20:38                         ` Keith Busch
  2015-09-04  3:25                           ` Martin K. Petersen
  1 sibling, 1 reply; 67+ messages in thread
From: Keith Busch @ 2015-09-03 20:38 UTC (permalink / raw)


On Wed, 26 Aug 2015, Martin K. Petersen wrote:
> @@ -1985,14 +1985,10 @@ static int nvme_revalidate_disk(struct gendisk *disk)
> 	ns->pi_type = pi_type;
> 	blk_queue_logical_block_size(ns->queue, bs);
>
> -	if (ns->ms && !blk_get_integrity(disk) && (disk->flags & GENHD_FL_UP) &&
> -								!ns->ext)
> +	if (ns->ms && !ns->ext)
> 		nvme_init_integrity(ns);
>
> -	if (ns->ms && !blk_get_integrity(disk))
> -		set_capacity(disk, 0);
> -	else
> -		set_capacity(disk, le64_to_cpup(&id->nsze) << (ns->lba_shift - 9));
> +	set_capacity(disk, le64_to_cpup(&id->nsze) << (ns->lba_shift - 9));

We still ought to set the capacity to 0 if there is no block integrity
support for the disk when it has meta data formats. This would happen
with either extended metadata, or kernel config does not include integrity
extensions.
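
A minimal sketch of that rule, assuming a simplified model rather than
the driver's real structures: struct ns_model and ns_capacity_sectors()
are hypothetical names, and has_integrity stands in for "a usable
integrity profile is registered for the disk", which would be false
with extended metadata or with a kernel built without
CONFIG_BLK_DEV_INTEGRITY.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the few namespace fields this rule needs. */
struct ns_model {
	uint16_t ms;		/* metadata bytes per block; 0 means none */
	bool has_integrity;	/* usable integrity profile registered */
	uint64_t nsze;		/* namespace size in logical blocks */
	unsigned lba_shift;	/* log2 of the logical block size */
};

/*
 * Capacity in 512-byte sectors. A namespace formatted with metadata
 * that the block layer cannot supply is reported as empty.
 */
static uint64_t ns_capacity_sectors(const struct ns_model *ns)
{
	if (ns->ms && !ns->has_integrity)
		return 0;
	return ns->nsze << (ns->lba_shift - 9);
}
```

In this model a namespace with ms = 8 and no profile reports 0 sectors,
while the same namespace with a registered profile reports its full
size.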


* [PATCH] block: Inline blk_integrity in struct gendisk
  2015-09-03 20:38                         ` Keith Busch
@ 2015-09-04  3:25                           ` Martin K. Petersen
  0 siblings, 0 replies; 67+ messages in thread
From: Martin K. Petersen @ 2015-09-04  3:25 UTC (permalink / raw)


>>>>> "Keith" == Keith Busch <keith.busch at intel.com> writes:

Keith> We still ought to set the capacity to 0 if there is no block
Keith> integrity support for the disk when it has meta data
Keith> formats. This would happen with either extended metadata, or
Keith> kernel config does not include integrity extensions.

Gotcha. Will fix!

-- 
Martin K. Petersen	Oracle Linux Engineering


* [PATCH 5/5] block: Inline blk_integrity in struct gendisk
  2015-08-20 20:41                   ` [PATCH 5/5] block: Inline blk_integrity in struct gendisk Martin K. Petersen
@ 2015-09-16  1:07                       ` Mike Snitzer
  2015-09-16  1:07                       ` Mike Snitzer
  1 sibling, 0 replies; 67+ messages in thread
From: Mike Snitzer @ 2015-09-16  1:07 UTC (permalink / raw)


On Thu, Aug 20 2015 at  4:41pm -0400,
Martin K. Petersen <martin.petersen@oracle.com> wrote:

> Up until now the integrity profile has been dynamically allocated and
> attached to struct gendisk after the disk has been made active.
> 
> This causes problems because NVMe devices need to register the profile
> prior to the partition table being read due to a mandatory metadata
> buffer requirement. In addition, DM goes through hoops to deal with
> preallocating, but not initializing integrity profiles.

Yes, but only if the underlying device(s) actually support bip.  This
change to inlining blk_integrity (purely to make NVMe happy) comes at
the cost of _always_ allocating the memory 'struct blk_integrity' (via
gendisk inline) even if there is absolutely no need for it.

That said, with your changes the blk_integrity structure is no longer
large (previously was 96 bytes).. so I can let that part go.

> Since the integrity profile is small (4 bytes + a pointer), Christoph
> suggested moving it to struct gendisk proper. This requires several
> changes:
> 
>  - Moving the blk_integrity definition to genhd.h.
> 
>  - Inlining blk_integrity in struct gendisk.
> 
>  - Removing the dynamic allocation code.
> 
>  - Adding helper functions which allow gendisk to set up and tear down
>    the integrity sysfs dir when a disk is added/deleted.
> 
>  - Adding a blk_integrity_revalidate() callback for updating the stable
>    pages bdi setting.
> 
>  - The calls that depend on whether a device has an integrity profile or
>    not now key off of the bi->profile pointer.
> 
>  - Simplifying the integrity support routines in DM.

But I cannot let this DM "simplifying" go (vast majority of it breaks
bip for DM).  You missed the fact that DM has inactive and active
tables.  And that the top-level DM device can only be changed once an
inactive table is made active via DM's resume.  The bip for the DM
device cannot be changed during table load.  Such a change can only be
done during table resume.  Reason is an inactive DM table can get thrown
away at any time; so changing the top-level DM device during the load of
an inactive table isn't right.
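
The load/resume split described above can be modeled roughly as
follows. This is a toy sketch, not DM code: struct mapped_dev, struct
table and the dev_*() helpers are hypothetical names chosen for
illustration, and the only point is that loading a table stages it
while resume is the single place the live device changes.

```c
#include <stdbool.h>
#include <stddef.h>

struct profile { const char *name; };

/* A table only records which profile its underlying devices share. */
struct table { const struct profile *integrity; };

struct mapped_dev {
	const struct profile *integrity;	/* what the live device advertises */
	struct table *active;
	struct table *inactive;
};

/* Load: stage the table. The live device is left untouched. */
static void dev_load(struct mapped_dev *md, struct table *t)
{
	md->inactive = t;
}

/* An inactive table can be discarded at any time without side effects. */
static void dev_clear(struct mapped_dev *md)
{
	md->inactive = NULL;
}

/* Resume: the staged table becomes active and only now the device changes. */
static bool dev_resume(struct mapped_dev *md)
{
	if (!md->inactive)
		return false;
	md->active = md->inactive;
	md->inactive = NULL;
	md->integrity = md->active->integrity;
	return true;
}
```

If dev_load() updated md->integrity directly, a later dev_clear() would
leave the live device advertising a profile that belongs to a table
that no longer exists.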

Please review the commit header from commit a63a5cf84 ("dm: improve
block integrity support").

Long story short, DM changes that eliminate the checks/code for
allocating the blk_integrity structure should be all that is needed for
this patch.

I'll work through this further.  Hope to send an incremental patch that
fixes things up in the next day or so.



* [PATCH 1/5] block: Move integrity kobject to struct gendisk
  2015-08-20 20:41                   ` [PATCH 1/5] block: Move integrity kobject to struct gendisk Martin K. Petersen
@ 2015-09-16 17:26                       ` Mike Snitzer
  0 siblings, 0 replies; 67+ messages in thread
From: Mike Snitzer @ 2015-09-16 17:26 UTC (permalink / raw)


On Thu, Aug 20 2015 at  4:41pm -0400,
Martin K. Petersen <martin.petersen@oracle.com> wrote:

> The integrity kobject purely exists to support the integrity
> subdirectory in sysfs and doesn't really have anything to do with the
> blk_integrity data structure. Move the kobject to struct gendisk where
> it belongs.
> 
> Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
> Reported-by: Christoph Hellwig <hch at lst.de>
> Reviewed-by: Sagi Grimberg <sagig at mellanox.com>

[I understand both Martin and Keith are on vacation but I'm putting the
time to this now with the understanding that a proper exchange may be
delayed.]

Thinking about this series further: I don't really agree with it.
Maybe in the end the benefit of embedding in gendisk outweighs the
complexity of dynamic allocation... but I need to see it for myself.

Christoph, if you _know_ this to be the right way forward I can accept
it but please elaborate further.

What we currently have (before this patchset) has the very real benefit
of not wasting any memory if the block device doesn't have integrity
(DIF/DIX) support.  Especially given how rare bip supporting devices
still are.  Is there reason to expect they won't be so rare in the
future?  Every NVMe and nvdimm device will have integrity support?

This patch moves the 64 byte 'struct kobject' out into 'struct
gendisk'.  I'm not seeing why (on x86_64 anyway) we'd want to always
allocate an extra 76 bytes (blk_integrity + kobject) for _every_ gendisk
regardless of whether a bip is needed..

76 bytes may not sound like a lot but in the context of DM-mpath that
adds up when you have systems with 1000s of devices with multiple paths
and then a DM mpath device (with its own gendisk) on top -- 1000 devices
with 4 paths can waste 5000 * 76 bytes = ~371K (even 371K isn't much...)
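
The arithmetic can be checked with a few lines of C. This is a sketch
under the assumptions stated in this mail: a 64 byte kobject and an
unpadded 12 byte blk_integrity (one 8 byte pointer plus four one-byte
fields) on x86_64. The macro and function names are hypothetical, and
real structure sizes may differ once padding is taken into account.

```c
#include <stddef.h>

#define KOBJECT_BYTES		64	/* size quoted above for x86_64 */
#define BLK_INTEGRITY_BYTES	12	/* 8 byte pointer + 4 one-byte fields, unpadded */

/*
 * Each multipath device contributes one gendisk per path plus one
 * gendisk for the DM mpath device stacked on top of them.
 */
static size_t integrity_overhead_bytes(size_t devices, size_t paths_per_device)
{
	size_t gendisks = devices * paths_per_device + devices;

	return gendisks * (KOBJECT_BYTES + BLK_INTEGRITY_BYTES);
}
```

For 1000 devices with 4 paths each this counts 5000 gendisks and
380000 bytes in total.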

If DM-core could support existing dynamic bip (albeit with more involved
code born out of DM-specific quirks of device stacking and inactive and
active tables) then I have to believe NVMe can too.

As such I'm now going to shift my focus to revisiting what is so hard
for NVMe that it cannot cope with the existing dynamic allocation of
struct blk_integrity.

[If that turns out to be a waste of time (and NVMe/nvdimm/whatever would
be needlessly inconvenienced) then I'll switch back to fixing DM to cope
with this change.  But to be clear the current proposed DM changes
_cannot_ go upstream.]


* Re: [PATCH 1/5] block: Move integrity kobject to struct gendisk
@ 2015-09-16 17:26                       ` Mike Snitzer
  0 siblings, 0 replies; 67+ messages in thread
From: Mike Snitzer @ 2015-09-16 17:26 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Jens Axboe, linux-nvme, Christoph Hellwig, dm-devel,
	Matthew Wilcox, Keith Busch, Dan Williams

On Thu, Aug 20 2015 at  4:41pm -0400,
Martin K. Petersen <martin.petersen@oracle.com> wrote:

> The integrity kobject purely exists to support the integrity
> subdirectory in sysfs and doesn't really have anything to do with the
> blk_integrity data structure. Move the kobject to struct gendisk where
> it belongs.
> 
> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
> Reported-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Sagi Grimberg <sagig@mellanox.com>

[I understand both Martin and Keith are on vacation but I'm putting the
time into this now with the understanding that a proper exchange may be
delayed.]

Thinking about this series further: I don't really agree with it.
Maybe in the end the benefit of embedding in gendisk outweighs the
complexity of dynamic allocation... but I need to see it for myself.

Christoph, if you _know_ this to be the right way forward I can accept
it but please elaborate further.

What we currently have (before this patchset) has the very real benefit
of not wasting any memory if the block device doesn't have integrity
(DIF/DIX) support.  Especially given how rare bip supporting devices
still are.  Is there reason to expect they won't be so rare in the
future?  Every NVMe and nvdimm device will have integrity support?

This patch moves the 64-byte 'struct kobject' out into 'struct
gendisk'.  I'm not seeing why (on x86_64 anyway) we'd want to always
allocate an extra 76 bytes (blk_integrity + kobject) for _every_ gendisk
regardless of whether a bip is needed..

76 bytes may not sound like a lot but in the context of DM-mpath that
adds up when you have systems with 1000s of devices with multiple paths
and then a DM mpath device (with its own gendisk) on top -- 1000 devices
with 4 paths means 5000 gendisks, wasting 5000 * 76 bytes = ~371K (even
371K isn't much...)

If DM-core could support existing dynamic bip (albeit with more involved
code born out of DM-specific quirks of device stacking and inactive and
active tables) then I have to believe NVMe can too.

As such I'm now going to shift my focus to revisiting what is so hard
for NVMe that it cannot cope with the existing dynamic allocation of
struct blk_integrity.

[If that turns out to be a waste of time (and NVMe/nvdimm/whatever would
be needlessly inconvenienced) then I'll switch back to fixing DM to cope
with this change.  But to be clear the current proposed DM changes
_cannot_ go upstream.]

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH 5/5] block: Inline blk_integrity in struct gendisk
  2015-09-16  1:07                       ` Mike Snitzer
@ 2015-09-21 20:45                         ` Mike Snitzer
  -1 siblings, 0 replies; 67+ messages in thread
From: Mike Snitzer @ 2015-09-21 20:45 UTC (permalink / raw)


On Tue, Sep 15 2015 at  9:07pm -0400,
Mike Snitzer <snitzer@redhat.com> wrote:
 
> Long story short, DM changes that eliminate the checks/code for
> allocating the blk_integrity structure should be all that is needed for
> this patch.
> 
> I'll work through this further.  Hope to send an incremental patch that
> fixes things up in the next day or so.

Here is what I came up with.  Leaves something to be desired given the
integrity profile is being established on first table load (as opposed
to during resume like the old DM and block integrity code did).. but we
can iterate on this as needed.

Feel free to fold it into your last patch and add my Signed-off-by.

From: Mike Snitzer <snitzer@redhat.com>
Date: Mon, 21 Sep 2015 14:58:44 -0400
Subject: [PATCH] dm table: fixup block integrity profile processing

Not very different from what Martin originally proposed but subtle
changes include:

. only register the integrity profile on first table load; subsequent
  loads must have a matching integrity profile
  - if profile is already registered, verify new table's profile matches

. resume only verifies that the DM device's integrity profile matches
  all the underlying devices' profiles.
  - if they don't match the DM device's integrity profile is unregistered

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm-table.c | 77 +++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 63 insertions(+), 14 deletions(-)

diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index e2f98fc..061152a 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1014,6 +1014,11 @@ static int dm_table_build_index(struct dm_table *t)
 	return r;
 }
 
+static bool integrity_profile_exists(struct gendisk *disk)
+{
+	return !!blk_get_integrity(disk);
+}
+
 /*
  * Get a disk whose integrity profile reflects the table's profile.
  * Returns NULL if integrity support was inconsistent or unavailable.
@@ -1026,10 +1031,10 @@ static struct gendisk * dm_table_get_integrity_disk(struct dm_table *t)
 
 	list_for_each_entry(dd, devices, list) {
 		template_disk = dd->dm_dev->bdev->bd_disk;
-		if (!blk_get_integrity(template_disk))
+		if (!integrity_profile_exists(template_disk))
 			goto no_integrity;
-		if (prev_disk &&
-		    blk_integrity_compare(prev_disk, template_disk) < 0)
+		else if (prev_disk &&
+			 blk_integrity_compare(prev_disk, template_disk) < 0)
 			goto no_integrity;
 		prev_disk = template_disk;
 	}
@@ -1055,23 +1060,40 @@ no_integrity:
  * profile validation: First pass during table load, final pass during
  * resume.
  */
-static int dm_table_set_integrity(struct dm_table *t)
+static int dm_table_register_integrity(struct dm_table *t)
 {
+	struct mapped_device *md = t->md;
 	struct gendisk *template_disk = NULL;
-	bool existing_profile = blk_get_integrity(dm_disk(t->md));
 
 	template_disk = dm_table_get_integrity_disk(t);
-	if (template_disk) {
-		blk_integrity_register(dm_disk(t->md),
-				       blk_get_integrity(template_disk));
+	if (!template_disk)
+		return 0;
+
+	if (!integrity_profile_exists(dm_disk(md))) {
 		t->integrity_supported = 1;
-	} else if (existing_profile) {
-		blk_integrity_unregister(dm_disk(t->md));
-		DMWARN("%s: device no longer has a valid integrity profile",
-		       dm_device_name(t->md));
+		/*
+		 * Register integrity profile during table load; we can do
+		 * this because the final profile must match during resume.
+		 */
+		blk_integrity_register(dm_disk(md),
+				       blk_get_integrity(template_disk));
+		return 0;
+	}
+
+	/*
+	 * If DM device already has an initialized integrity
+	 * profile the new profile should not conflict.
+	 */
+	if (blk_integrity_compare(dm_disk(md), template_disk) < 0) {
+		DMWARN("%s: conflict with existing integrity profile: "
+		       "%s profile mismatch",
+		       dm_device_name(t->md),
+		       template_disk->disk_name);
 		return 1;
 	}
 
+	/* Preserve existing integrity profile */
+	t->integrity_supported = 1;
 	return 0;
 }
 
@@ -1095,7 +1117,7 @@ int dm_table_complete(struct dm_table *t)
 		return r;
 	}
 
-	r = dm_table_set_integrity(t);
+	r = dm_table_register_integrity(t);
 	if (r) {
 		DMERR("could not register integrity profile.");
 		return r;
@@ -1260,6 +1282,33 @@ combine_limits:
 	return validate_hardware_logical_block_alignment(table, limits);
 }
 
+/*
+ * Verify that all devices have an integrity profile that matches the
+ * DM device's registered integrity profile.  If the profiles don't
+ * match then unregister the DM device's integrity profile.
+ */
+static void dm_table_verify_integrity(struct dm_table *t)
+{
+	struct gendisk *template_disk = NULL;
+
+	if (t->integrity_supported) {
+		/*
+		 * Verify that the original integrity profile
+		 * matches all the devices in this table.
+		 */
+		template_disk = dm_table_get_integrity_disk(t);
+		if (template_disk &&
+		    blk_integrity_compare(dm_disk(t->md), template_disk) >= 0)
+			return;
+	}
+
+	if (integrity_profile_exists(dm_disk(t->md))) {
+		DMWARN("%s: unable to establish an integrity profile",
+		       dm_device_name(t->md));
+		blk_integrity_unregister(dm_disk(t->md));
+	}
+}
+
 static int device_flush_capable(struct dm_target *ti, struct dm_dev *dev,
 				sector_t start, sector_t len, void *data)
 {
@@ -1457,7 +1506,7 @@ void dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
 	else
 		queue_flag_set_unlocked(QUEUE_FLAG_NO_SG_MERGE, q);
 
-	dm_table_set_integrity(t);
+	dm_table_verify_integrity(t);
 
 	/*
 	 * Determine whether or not this queue's I/O timings contribute
-- 
2.3.8 (Apple Git-58)

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 5/5] block: Inline blk_integrity in struct gendisk
  2015-09-21 20:45                         ` Mike Snitzer
@ 2015-10-09  7:36                           ` Christoph Hellwig
  -1 siblings, 0 replies; 67+ messages in thread
From: Christoph Hellwig @ 2015-10-09  7:36 UTC (permalink / raw)


On Mon, Sep 21, 2015 at 04:45:07PM -0400, Mike Snitzer wrote:
> Here is what I came up with.  Leaves something to be desired given the
> integrity profile is being established on first table load (as opposed
> to during resume like the old DM and block integrity code did).. but we
> can iterate on this as needed.
> 
> Feel free to fold it into your last patch and add my Signed-off-by.

Martin, any chance to repost with Mike's changes folded in?  I'd really
hate to miss 4.4 for the series.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH 5/5] block: Inline blk_integrity in struct gendisk
  2015-10-09  7:36                           ` Christoph Hellwig
@ 2015-10-12  1:17                             ` Martin K. Petersen
  -1 siblings, 0 replies; 67+ messages in thread
From: Martin K. Petersen @ 2015-10-12  1:17 UTC (permalink / raw)


>>>>> "Christoph" == Christoph Hellwig <hch@infradead.org> writes:

Christoph> any chance to repost with Mike's changes folded in?  I'd
Christoph> really hate to miss 4.4 for the series.

Will do first thing in the morning.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Block integrity registration update
  2015-08-21 23:47                     ` Busch, Keith
  2015-08-27  0:25                       ` Martin K. Petersen
  2015-08-27  0:25                       ` [PATCH] " Martin K. Petersen
@ 2015-10-12 21:05                       ` Martin K. Petersen
  2015-10-12 21:05                         ` [PATCH 1/5] block: Move integrity kobject to struct gendisk Martin K. Petersen
                                           ` (5 more replies)
  2 siblings, 6 replies; 67+ messages in thread
From: Martin K. Petersen @ 2015-10-12 21:05 UTC (permalink / raw)


As requested, here's the integrity registration update for 4.4. Only
delta is patch 5 which retains the non-PI metadata check as requested by
Keith as well as the DM rework by Mike. I rebased on top of
block/for-4.4/drivers to accommodate the NVMe shuffle.

[PATCH 1/5] block: Move integrity kobject to struct gendisk
[PATCH 2/5] block: Consolidate static integrity profile properties
[PATCH 3/5] block: Reduce the size of struct blk_integrity
[PATCH 4/5] block: Export integrity data interval size in sysfs
[PATCH 5/5] block: Inline blk_integrity in struct gendisk

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH 1/5] block: Move integrity kobject to struct gendisk
  2015-10-12 21:05                       ` Block integrity registration update Martin K. Petersen
@ 2015-10-12 21:05                         ` Martin K. Petersen
  2015-10-12 21:05                         ` [PATCH 2/5] block: Consolidate static integrity profile properties Martin K. Petersen
                                           ` (4 subsequent siblings)
  5 siblings, 0 replies; 67+ messages in thread
From: Martin K. Petersen @ 2015-10-12 21:05 UTC (permalink / raw)


The integrity kobject purely exists to support the integrity
subdirectory in sysfs and doesn't really have anything to do with the
blk_integrity data structure. Move the kobject to struct gendisk where
it belongs.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reported-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
---
 block/blk-integrity.c  | 22 +++++++++++-----------
 include/linux/blkdev.h |  2 --
 include/linux/genhd.h  |  1 +
 3 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index 75f29cf70188..182bfd2383ea 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -249,8 +249,8 @@ struct integrity_sysfs_entry {
 static ssize_t integrity_attr_show(struct kobject *kobj, struct attribute *attr,
 				   char *page)
 {
-	struct blk_integrity *bi =
-		container_of(kobj, struct blk_integrity, kobj);
+	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
+	struct blk_integrity *bi = blk_get_integrity(disk);
 	struct integrity_sysfs_entry *entry =
 		container_of(attr, struct integrity_sysfs_entry, attr);
 
@@ -261,8 +261,8 @@ static ssize_t integrity_attr_store(struct kobject *kobj,
 				    struct attribute *attr, const char *page,
 				    size_t count)
 {
-	struct blk_integrity *bi =
-		container_of(kobj, struct blk_integrity, kobj);
+	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
+	struct blk_integrity *bi = blk_get_integrity(disk);
 	struct integrity_sysfs_entry *entry =
 		container_of(attr, struct integrity_sysfs_entry, attr);
 	ssize_t ret = 0;
@@ -385,8 +385,8 @@ subsys_initcall(blk_dev_integrity_init);
 
 static void blk_integrity_release(struct kobject *kobj)
 {
-	struct blk_integrity *bi =
-		container_of(kobj, struct blk_integrity, kobj);
+	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
+	struct blk_integrity *bi = blk_get_integrity(disk);
 
 	kmem_cache_free(integrity_cachep, bi);
 }
@@ -429,14 +429,14 @@ int blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
 		if (!bi)
 			return -1;
 
-		if (kobject_init_and_add(&bi->kobj, &integrity_ktype,
+		if (kobject_init_and_add(&disk->integrity_kobj, &integrity_ktype,
 					 &disk_to_dev(disk)->kobj,
 					 "%s", "integrity")) {
 			kmem_cache_free(integrity_cachep, bi);
 			return -1;
 		}
 
-		kobject_uevent(&bi->kobj, KOBJ_ADD);
+		kobject_uevent(&disk->integrity_kobj, KOBJ_ADD);
 
 		bi->flags |= BLK_INTEGRITY_VERIFY | BLK_INTEGRITY_GENERATE;
 		bi->interval = queue_logical_block_size(disk->queue);
@@ -479,9 +479,9 @@ void blk_integrity_unregister(struct gendisk *disk)
 
 	bi = disk->integrity;
 
-	kobject_uevent(&bi->kobj, KOBJ_REMOVE);
-	kobject_del(&bi->kobj);
-	kobject_put(&bi->kobj);
+	kobject_uevent(&disk->integrity_kobj, KOBJ_REMOVE);
+	kobject_del(&disk->integrity_kobj);
+	kobject_put(&disk->integrity_kobj);
 	disk->integrity = NULL;
 }
 EXPORT_SYMBOL(blk_integrity_unregister);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 19c2e947d4d1..830f9c07d4bb 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1472,8 +1472,6 @@ struct blk_integrity {
 	unsigned short		tag_size;
 
 	const char		*name;
-
-	struct kobject		kobj;
 };
 
 extern bool blk_integrity_is_initialized(struct gendisk *);
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 2adbfa6d02bc..9e6e0dfa97ad 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -199,6 +199,7 @@ struct gendisk {
 	struct disk_events *ev;
 #ifdef  CONFIG_BLK_DEV_INTEGRITY
 	struct blk_integrity *integrity;
+	struct kobject integrity_kobj;
 #endif
 	int node_id;
 };
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 2/5] block: Consolidate static integrity profile properties
  2015-10-12 21:05                       ` Block integrity registration update Martin K. Petersen
  2015-10-12 21:05                         ` [PATCH 1/5] block: Move integrity kobject to struct gendisk Martin K. Petersen
@ 2015-10-12 21:05                         ` Martin K. Petersen
  2015-10-14  1:11                           ` Dan Williams
  2015-10-12 21:05                         ` [PATCH 3/5] block: Reduce the size of struct blk_integrity Martin K. Petersen
                                           ` (3 subsequent siblings)
  5 siblings, 1 reply; 67+ messages in thread
From: Martin K. Petersen @ 2015-10-12 21:05 UTC (permalink / raw)


We previously made a complete copy of a device's data integrity profile
even though several of the fields inside the blk_integrity struct are
pointers to fixed template entries in t10-pi.c.

Split the static and per-device portions so that we can reference the
template directly.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reported-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Dan Williams <dan.j.williams@intel.com>
---
 block/bio-integrity.c               |  8 ++++----
 block/blk-integrity.c               | 17 ++++++++---------
 block/t10-pi.c                      | 16 ++++------------
 drivers/nvdimm/core.c               | 11 +++++++----
 drivers/nvme/host/pci.c             |  8 ++++----
 drivers/scsi/sd_dif.c               | 29 ++++++++++++++++-------------
 drivers/target/target_core_iblock.c | 10 +++++-----
 include/linux/blkdev.h              | 20 +++++++++++---------
 include/linux/t10-pi.h              |  8 ++++----
 9 files changed, 63 insertions(+), 64 deletions(-)

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 14b8faf8b09d..a10ffe19a8dd 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -177,11 +177,11 @@ bool bio_integrity_enabled(struct bio *bio)
 	if (bi == NULL)
 		return false;
 
-	if (bio_data_dir(bio) == READ && bi->verify_fn != NULL &&
+	if (bio_data_dir(bio) == READ && bi->profile->verify_fn != NULL &&
 	    (bi->flags & BLK_INTEGRITY_VERIFY))
 		return true;
 
-	if (bio_data_dir(bio) == WRITE && bi->generate_fn != NULL &&
+	if (bio_data_dir(bio) == WRITE && bi->profile->generate_fn != NULL &&
 	    (bi->flags & BLK_INTEGRITY_GENERATE))
 		return true;
 
@@ -340,7 +340,7 @@ int bio_integrity_prep(struct bio *bio)
 
 	/* Auto-generate integrity metadata if this is a write */
 	if (bio_data_dir(bio) == WRITE)
-		bio_integrity_process(bio, bi->generate_fn);
+		bio_integrity_process(bio, bi->profile->generate_fn);
 
 	return 0;
 }
@@ -361,7 +361,7 @@ static void bio_integrity_verify_fn(struct work_struct *work)
 	struct bio *bio = bip->bip_bio;
 	struct blk_integrity *bi = bdev_get_integrity(bio->bi_bdev);
 
-	bio->bi_error = bio_integrity_process(bio, bi->verify_fn);
+	bio->bi_error = bio_integrity_process(bio, bi->profile->verify_fn);
 
 	/* Restore original bio completion handler */
 	bio->bi_end_io = bip->bip_end_io;
diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index 182bfd2383ea..daf590ab3b46 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -176,10 +176,10 @@ int blk_integrity_compare(struct gendisk *gd1, struct gendisk *gd2)
 		return -1;
 	}
 
-	if (strcmp(b1->name, b2->name)) {
+	if (b1->profile != b2->profile) {
 		printk(KERN_ERR "%s: %s/%s type %s != %s\n", __func__,
 		       gd1->disk_name, gd2->disk_name,
-		       b1->name, b2->name);
+		       b1->profile->name, b2->profile->name);
 		return -1;
 	}
 
@@ -275,8 +275,8 @@ static ssize_t integrity_attr_store(struct kobject *kobj,
 
 static ssize_t integrity_format_show(struct blk_integrity *bi, char *page)
 {
-	if (bi != NULL && bi->name != NULL)
-		return sprintf(page, "%s\n", bi->name);
+	if (bi != NULL && bi->profile->name != NULL)
+		return sprintf(page, "%s\n", bi->profile->name);
 	else
 		return sprintf(page, "none\n");
 }
@@ -401,7 +401,8 @@ bool blk_integrity_is_initialized(struct gendisk *disk)
 {
 	struct blk_integrity *bi = blk_get_integrity(disk);
 
-	return (bi && bi->name && strcmp(bi->name, bi_unsupported_name) != 0);
+	return (bi && bi->profile->name && strcmp(bi->profile->name,
+						  bi_unsupported_name) != 0);
 }
 EXPORT_SYMBOL(blk_integrity_is_initialized);
 
@@ -446,14 +447,12 @@ int blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
 
 	/* Use the provided profile as template */
 	if (template != NULL) {
-		bi->name = template->name;
-		bi->generate_fn = template->generate_fn;
-		bi->verify_fn = template->verify_fn;
+		bi->profile = template->profile;
 		bi->tuple_size = template->tuple_size;
 		bi->tag_size = template->tag_size;
 		bi->flags |= template->flags;
 	} else
-		bi->name = bi_unsupported_name;
+		bi->profile->name = bi_unsupported_name;
 
 	disk->queue->backing_dev_info.capabilities |= BDI_CAP_STABLE_WRITES;
 
diff --git a/block/t10-pi.c b/block/t10-pi.c
index 24d6e9715318..2c97912335a9 100644
--- a/block/t10-pi.c
+++ b/block/t10-pi.c
@@ -160,38 +160,30 @@ static int t10_pi_type3_verify_ip(struct blk_integrity_iter *iter)
 	return t10_pi_verify(iter, t10_pi_ip_fn, 3);
 }
 
-struct blk_integrity t10_pi_type1_crc = {
+struct blk_integrity_profile t10_pi_type1_crc = {
 	.name			= "T10-DIF-TYPE1-CRC",
 	.generate_fn		= t10_pi_type1_generate_crc,
 	.verify_fn		= t10_pi_type1_verify_crc,
-	.tuple_size		= sizeof(struct t10_pi_tuple),
-	.tag_size		= 0,
 };
 EXPORT_SYMBOL(t10_pi_type1_crc);
 
-struct blk_integrity t10_pi_type1_ip = {
+struct blk_integrity_profile t10_pi_type1_ip = {
 	.name			= "T10-DIF-TYPE1-IP",
 	.generate_fn		= t10_pi_type1_generate_ip,
 	.verify_fn		= t10_pi_type1_verify_ip,
-	.tuple_size		= sizeof(struct t10_pi_tuple),
-	.tag_size		= 0,
 };
 EXPORT_SYMBOL(t10_pi_type1_ip);
 
-struct blk_integrity t10_pi_type3_crc = {
+struct blk_integrity_profile t10_pi_type3_crc = {
 	.name			= "T10-DIF-TYPE3-CRC",
 	.generate_fn		= t10_pi_type3_generate_crc,
 	.verify_fn		= t10_pi_type3_verify_crc,
-	.tuple_size		= sizeof(struct t10_pi_tuple),
-	.tag_size		= 0,
 };
 EXPORT_SYMBOL(t10_pi_type3_crc);
 
-struct blk_integrity t10_pi_type3_ip = {
+struct blk_integrity_profile t10_pi_type3_ip = {
 	.name			= "T10-DIF-TYPE3-IP",
 	.generate_fn		= t10_pi_type3_generate_ip,
 	.verify_fn		= t10_pi_type3_verify_ip,
-	.tuple_size		= sizeof(struct t10_pi_tuple),
-	.tag_size		= 0,
 };
 EXPORT_SYMBOL(t10_pi_type3_ip);
diff --git a/drivers/nvdimm/core.c b/drivers/nvdimm/core.c
index cb62ec6a12d0..9e1b0f656a9b 100644
--- a/drivers/nvdimm/core.c
+++ b/drivers/nvdimm/core.c
@@ -399,19 +399,22 @@ static int nd_pi_nop_generate_verify(struct blk_integrity_iter *iter)
 
 int nd_integrity_init(struct gendisk *disk, unsigned long meta_size)
 {
-	struct blk_integrity integrity = {
+	struct blk_integrity bi;
+	struct blk_integrity_profile profile = {
 		.name = "ND-PI-NOP",
 		.generate_fn = nd_pi_nop_generate_verify,
 		.verify_fn = nd_pi_nop_generate_verify,
-		.tuple_size = meta_size,
-		.tag_size = meta_size,
 	};
 	int ret;
 
 	if (meta_size == 0)
 		return 0;
 
-	ret = blk_integrity_register(disk, &integrity);
+	bi.profile = &profile;
+	bi.tuple_size = meta_size;
+	bi.tag_size = meta_size;
+
+	ret = blk_integrity_register(disk, &bi);
 	if (ret)
 		return ret;
 
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index ad58ee3c3b57..5dba51d4bae6 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -558,7 +558,7 @@ static int nvme_noop_generate(struct blk_integrity_iter *iter)
 	return 0;
 }
 
-struct blk_integrity nvme_meta_noop = {
+struct blk_integrity_profile nvme_meta_noop = {
 	.name			= "NVME_META_NOOP",
 	.generate_fn		= nvme_noop_generate,
 	.verify_fn		= nvme_noop_verify,
@@ -570,14 +570,14 @@ static void nvme_init_integrity(struct nvme_ns *ns)
 
 	switch (ns->pi_type) {
 	case NVME_NS_DPS_PI_TYPE3:
-		integrity = t10_pi_type3_crc;
+		integrity.profile = &t10_pi_type3_crc;
 		break;
 	case NVME_NS_DPS_PI_TYPE1:
 	case NVME_NS_DPS_PI_TYPE2:
-		integrity = t10_pi_type1_crc;
+		integrity.profile = &t10_pi_type1_crc;
 		break;
 	default:
-		integrity = nvme_meta_noop;
+		integrity.profile = &nvme_meta_noop;
 		break;
 	}
 	integrity.tuple_size = ns->ms;
diff --git a/drivers/scsi/sd_dif.c b/drivers/scsi/sd_dif.c
index 5c06d292b94c..5a5ec9aa26b3 100644
--- a/drivers/scsi/sd_dif.c
+++ b/drivers/scsi/sd_dif.c
@@ -43,6 +43,7 @@ void sd_dif_config_host(struct scsi_disk *sdkp)
 	struct scsi_device *sdp = sdkp->device;
 	struct gendisk *disk = sdkp->disk;
 	u8 type = sdkp->protection_type;
+	struct blk_integrity bi;
 	int dif, dix;
 
 	dif = scsi_host_dif_capable(sdp->host, type);
@@ -58,36 +59,38 @@ void sd_dif_config_host(struct scsi_disk *sdkp)
 	/* Enable DMA of protection information */
 	if (scsi_host_get_guard(sdkp->device->host) & SHOST_DIX_GUARD_IP) {
 		if (type == SD_DIF_TYPE3_PROTECTION)
-			blk_integrity_register(disk, &t10_pi_type3_ip);
+			bi.profile = &t10_pi_type3_ip;
 		else
-			blk_integrity_register(disk, &t10_pi_type1_ip);
+			bi.profile = &t10_pi_type1_ip;
 
-		disk->integrity->flags |= BLK_INTEGRITY_IP_CHECKSUM;
+		bi.flags |= BLK_INTEGRITY_IP_CHECKSUM;
 	} else
 		if (type == SD_DIF_TYPE3_PROTECTION)
-			blk_integrity_register(disk, &t10_pi_type3_crc);
+			bi.profile = &t10_pi_type3_crc;
 		else
-			blk_integrity_register(disk, &t10_pi_type1_crc);
+			bi.profile = &t10_pi_type1_crc;
 
+	bi.tuple_size = sizeof(struct t10_pi_tuple);
 	sd_printk(KERN_NOTICE, sdkp,
-		  "Enabling DIX %s protection\n", disk->integrity->name);
+		  "Enabling DIX %s protection\n", bi.profile->name);
 
-	/* Signal to block layer that we support sector tagging */
 	if (dif && type) {
-
-		disk->integrity->flags |= BLK_INTEGRITY_DEVICE_CAPABLE;
+		bi.flags |= BLK_INTEGRITY_DEVICE_CAPABLE;
 
 		if (!sdkp->ATO)
-			return;
+			goto out;
 
 		if (type == SD_DIF_TYPE3_PROTECTION)
-			disk->integrity->tag_size = sizeof(u16) + sizeof(u32);
+			bi.tag_size = sizeof(u16) + sizeof(u32);
 		else
-			disk->integrity->tag_size = sizeof(u16);
+			bi.tag_size = sizeof(u16);
 
 		sd_printk(KERN_NOTICE, sdkp, "DIF application tag size %u\n",
-			  disk->integrity->tag_size);
+			  bi.tag_size);
 	}
+
+out:
+	blk_integrity_register(disk, &bi);
 }
 
 /*
diff --git a/drivers/target/target_core_iblock.c b/drivers/target/target_core_iblock.c
index 0f19e11acac2..f29c69120054 100644
--- a/drivers/target/target_core_iblock.c
+++ b/drivers/target/target_core_iblock.c
@@ -155,17 +155,17 @@ static int iblock_configure_device(struct se_device *dev)
 	if (bi) {
 		struct bio_set *bs = ib_dev->ibd_bio_set;
 
-		if (!strcmp(bi->name, "T10-DIF-TYPE3-IP") ||
-		    !strcmp(bi->name, "T10-DIF-TYPE1-IP")) {
+		if (!strcmp(bi->profile->name, "T10-DIF-TYPE3-IP") ||
+		    !strcmp(bi->profile->name, "T10-DIF-TYPE1-IP")) {
 			pr_err("IBLOCK export of blk_integrity: %s not"
-			       " supported\n", bi->name);
+			       " supported\n", bi->profile->name);
 			ret = -ENOSYS;
 			goto out_blkdev_put;
 		}
 
-		if (!strcmp(bi->name, "T10-DIF-TYPE3-CRC")) {
+		if (!strcmp(bi->profile->name, "T10-DIF-TYPE3-CRC")) {
 			dev->dev_attrib.pi_prot_type = TARGET_DIF_TYPE3_PROT;
-		} else if (!strcmp(bi->name, "T10-DIF-TYPE1-CRC")) {
+		} else if (!strcmp(bi->profile->name, "T10-DIF-TYPE1-CRC")) {
 			dev->dev_attrib.pi_prot_type = TARGET_DIF_TYPE1_PROT;
 		}
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 830f9c07d4bb..f36c6476f1c7 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1462,16 +1462,18 @@ struct blk_integrity_iter {
 
 typedef int (integrity_processing_fn) (struct blk_integrity_iter *);
 
-struct blk_integrity {
-	integrity_processing_fn	*generate_fn;
-	integrity_processing_fn	*verify_fn;
-
-	unsigned short		flags;
-	unsigned short		tuple_size;
-	unsigned short		interval;
-	unsigned short		tag_size;
+struct blk_integrity_profile {
+	integrity_processing_fn		*generate_fn;
+	integrity_processing_fn		*verify_fn;
+	const char			*name;
+};
 
-	const char		*name;
+struct blk_integrity {
+	struct blk_integrity_profile	*profile;
+	unsigned short			flags;
+	unsigned short			tuple_size;
+	unsigned short			interval;
+	unsigned short			tag_size;
 };
 
 extern bool blk_integrity_is_initialized(struct gendisk *);
diff --git a/include/linux/t10-pi.h b/include/linux/t10-pi.h
index 6a8b9942632d..dd8de82cf5b5 100644
--- a/include/linux/t10-pi.h
+++ b/include/linux/t10-pi.h
@@ -14,9 +14,9 @@ struct t10_pi_tuple {
 };
 
 
-extern struct blk_integrity t10_pi_type1_crc;
-extern struct blk_integrity t10_pi_type1_ip;
-extern struct blk_integrity t10_pi_type3_crc;
-extern struct blk_integrity t10_pi_type3_ip;
+extern struct blk_integrity_profile t10_pi_type1_crc;
+extern struct blk_integrity_profile t10_pi_type1_ip;
+extern struct blk_integrity_profile t10_pi_type3_crc;
+extern struct blk_integrity_profile t10_pi_type3_ip;
 
 #endif
-- 
2.4.3


* [PATCH 3/5] block: Reduce the size of struct blk_integrity
  2015-10-12 21:05                       ` Block integrity registration update Martin K. Petersen
  2015-10-12 21:05                         ` [PATCH 1/5] block: Move integrity kobject to struct gendisk Martin K. Petersen
  2015-10-12 21:05                         ` [PATCH 2/5] block: Consolidate static integrity profile properties Martin K. Petersen
@ 2015-10-12 21:05                         ` Martin K. Petersen
  2015-10-12 21:05                         ` [PATCH 4/5] block: Export integrity data interval size in sysfs Martin K. Petersen
                                           ` (2 subsequent siblings)
  5 siblings, 0 replies; 67+ messages in thread
From: Martin K. Petersen @ 2015-10-12 21:05 UTC (permalink / raw)


The per-device properties in the blk_integrity structure were previously
unsigned short. However, most of the values fit inside a char. The only
exception is the data interval size and we can work around that by
storing it as a power of two.

This cuts the size of the dynamic portion of blk_integrity in half.
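
As a rough illustration of the power-of-two encoding described above, here
is a minimal userspace sketch. The helper names (interval_to_exp,
exp_to_interval, integrity_intervals) are hypothetical and are not part of
the patch; the kernel uses ilog2() and open-coded shifts instead. A 4096-byte
interval does not fit in an unsigned char, but its exponent (12) does.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the kernel's ilog2(); assumes the interval
 * is a nonzero power of two, as logical block sizes are.
 */
static unsigned char interval_to_exp(unsigned int interval)
{
	unsigned char exp = 0;

	assert(interval != 0 && (interval & (interval - 1)) == 0);
	while (interval > 1) {
		interval >>= 1;
		exp++;
	}
	return exp;
}

/* Recover the interval size in bytes from the stored exponent. */
static unsigned int exp_to_interval(unsigned char exp)
{
	return 1u << exp;
}

/*
 * Number of protection intervals covered by a count of 512-byte
 * sectors; 512 == 1 << 9, hence the "- 9". Assumes interval_exp >= 9.
 */
static unsigned int integrity_intervals(unsigned char interval_exp,
					unsigned int sectors)
{
	return sectors >> (interval_exp - 9);
}
```

With a 4096-byte interval, eight 512-byte sectors make up one protection
interval; with a 512-byte interval they make up eight.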

Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
Reported-by: Christoph Hellwig <hch at lst.de>
Reviewed-by: Sagi Grimberg <sagig at mellanox.com>
---
 block/bio-integrity.c  | 4 ++--
 block/blk-integrity.c  | 6 +++---
 include/linux/blkdev.h | 8 ++++----
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index a10ffe19a8dd..6a90eca9cea1 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -202,7 +202,7 @@ EXPORT_SYMBOL(bio_integrity_enabled);
 static inline unsigned int bio_integrity_intervals(struct blk_integrity *bi,
 						   unsigned int sectors)
 {
-	return sectors >> (ilog2(bi->interval) - 9);
+	return sectors >> (bi->interval_exp - 9);
 }
 
 static inline unsigned int bio_integrity_bytes(struct blk_integrity *bi,
@@ -229,7 +229,7 @@ static int bio_integrity_process(struct bio *bio,
 		bip->bip_vec->bv_offset;
 
 	iter.disk_name = bio->bi_bdev->bd_disk->disk_name;
-	iter.interval = bi->interval;
+	iter.interval = 1 << bi->interval_exp;
 	iter.seed = bip_get_seed(bip);
 	iter.prot_buf = prot_buf;
 
diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index daf590ab3b46..c7508654faff 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -155,10 +155,10 @@ int blk_integrity_compare(struct gendisk *gd1, struct gendisk *gd2)
 	if (!b1 || !b2)
 		return -1;
 
-	if (b1->interval != b2->interval) {
+	if (b1->interval_exp != b2->interval_exp) {
 		pr_err("%s: %s/%s protection interval %u != %u\n",
 		       __func__, gd1->disk_name, gd2->disk_name,
-		       b1->interval, b2->interval);
+		       1 << b1->interval_exp, 1 << b2->interval_exp);
 		return -1;
 	}
 
@@ -440,7 +440,7 @@ int blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
 		kobject_uevent(&disk->integrity_kobj, KOBJ_ADD);
 
 		bi->flags |= BLK_INTEGRITY_VERIFY | BLK_INTEGRITY_GENERATE;
-		bi->interval = queue_logical_block_size(disk->queue);
+		bi->interval_exp = ilog2(queue_logical_block_size(disk->queue));
 		disk->integrity = bi;
 	} else
 		bi = disk->integrity;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index f36c6476f1c7..4f1968f15e30 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1470,10 +1470,10 @@ struct blk_integrity_profile {
 
 struct blk_integrity {
 	struct blk_integrity_profile	*profile;
-	unsigned short			flags;
-	unsigned short			tuple_size;
-	unsigned short			interval;
-	unsigned short			tag_size;
+	unsigned char			flags;
+	unsigned char			tuple_size;
+	unsigned char			interval_exp;
+	unsigned char			tag_size;
 };
 
 extern bool blk_integrity_is_initialized(struct gendisk *);
-- 
2.4.3


* [PATCH 4/5] block: Export integrity data interval size in sysfs
  2015-10-12 21:05                       ` Block integrity registration update Martin K. Petersen
                                           ` (2 preceding siblings ...)
  2015-10-12 21:05                         ` [PATCH 3/5] block: Reduce the size of struct blk_integrity Martin K. Petersen
@ 2015-10-12 21:05                         ` Martin K. Petersen
  2015-10-12 21:05                         ` [PATCH 5/5] block: Inline blk_integrity in struct gendisk Martin K. Petersen
  2015-10-13  0:31                         ` Block integrity registration update Williams, Dan J
  5 siblings, 0 replies; 67+ messages in thread
From: Martin K. Petersen @ 2015-10-12 21:05 UTC (permalink / raw)


The size of the data interval was not exported in the sysfs integrity
directory. Export it so that userland apps can tell whether the interval
is different from the device's logical block size.
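
A userland check along those lines might look like the sketch below. It is
only an illustration under stated assumptions: the helper names
(parse_sysfs_uint, read_sysfs_uint, interval_differs) are hypothetical, and
it assumes the logical block size is read from the existing
/sys/block/<disk>/queue/logical_block_size attribute.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse decimal text such as "4096\n". Returns 0 on success, -1 on failure. */
static int parse_sysfs_uint(const char *text, unsigned long *out)
{
	char *end;
	unsigned long val;

	errno = 0;
	val = strtoul(text, &end, 10);
	if (errno || end == text)
		return -1;
	*out = val;
	return 0;
}

/* Read one unsigned value from a sysfs attribute file. */
static int read_sysfs_uint(const char *path, unsigned long *out)
{
	char buf[32];
	FILE *f = fopen(path, "r");
	int ret = -1;

	if (!f)
		return -1;
	if (fgets(buf, sizeof(buf), f))
		ret = parse_sysfs_uint(buf, out);
	fclose(f);
	return ret;
}

/*
 * Returns 1 if the protection interval differs from the logical block
 * size, 0 if they are equal, and -1 if either attribute cannot be read.
 */
static int interval_differs(const char *disk)
{
	char path[256];
	unsigned long interval, lbs;

	snprintf(path, sizeof(path),
		 "/sys/block/%s/integrity/protection_interval_bytes", disk);
	if (read_sysfs_uint(path, &interval))
		return -1;

	snprintf(path, sizeof(path),
		 "/sys/block/%s/queue/logical_block_size", disk);
	if (read_sysfs_uint(path, &lbs))
		return -1;

	return interval != lbs;
}
```

A caller would pass the disk name as it appears under /sys/block and treat
-1 as "attribute not available on this kernel or device".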

Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
Reviewed-by: Sagi Grimberg <sagig at mellanox.com>
---
 Documentation/ABI/testing/sysfs-block |  7 +++++++
 block/blk-integrity.c                 | 14 ++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-block b/Documentation/ABI/testing/sysfs-block
index 8df003963d99..71d184dbb70d 100644
--- a/Documentation/ABI/testing/sysfs-block
+++ b/Documentation/ABI/testing/sysfs-block
@@ -60,6 +60,13 @@ Description:
 		Indicates whether a storage device is capable of storing
 		integrity metadata. Set if the device is T10 PI-capable.
 
+What:		/sys/block/<disk>/integrity/protection_interval_bytes
+Date:		July 2015
+Contact:	Martin K. Petersen <martin.petersen at oracle.com>
+Description:
+		Describes the number of data bytes which are protected
+		by one integrity tuple. Typically the device's logical
+		block size.
 
 What:		/sys/block/<disk>/integrity/write_generate
 Date:		June 2008
diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index c7508654faff..7a96f57ed195 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -289,6 +289,14 @@ static ssize_t integrity_tag_size_show(struct blk_integrity *bi, char *page)
 		return sprintf(page, "0\n");
 }
 
+static ssize_t integrity_interval_show(struct blk_integrity *bi, char *page)
+{
+	if (bi != NULL)
+		return sprintf(page, "%u\n", 1 << bi->interval_exp);
+	else
+		return sprintf(page, "0\n");
+}
+
 static ssize_t integrity_verify_store(struct blk_integrity *bi,
 				      const char *page, size_t count)
 {
@@ -343,6 +351,11 @@ static struct integrity_sysfs_entry integrity_tag_size_entry = {
 	.show = integrity_tag_size_show,
 };
 
+static struct integrity_sysfs_entry integrity_interval_entry = {
+	.attr = { .name = "protection_interval_bytes", .mode = S_IRUGO },
+	.show = integrity_interval_show,
+};
+
 static struct integrity_sysfs_entry integrity_verify_entry = {
 	.attr = { .name = "read_verify", .mode = S_IRUGO | S_IWUSR },
 	.show = integrity_verify_show,
@@ -363,6 +376,7 @@ static struct integrity_sysfs_entry integrity_device_entry = {
 static struct attribute *integrity_attrs[] = {
 	&integrity_format_entry.attr,
 	&integrity_tag_size_entry.attr,
+	&integrity_interval_entry.attr,
 	&integrity_verify_entry.attr,
 	&integrity_generate_entry.attr,
 	&integrity_device_entry.attr,
-- 
2.4.3


* [PATCH 5/5] block: Inline blk_integrity in struct gendisk
  2015-10-12 21:05                       ` Block integrity registration update Martin K. Petersen
                                           ` (3 preceding siblings ...)
  2015-10-12 21:05                         ` [PATCH 4/5] block: Export integrity data interval size in sysfs Martin K. Petersen
@ 2015-10-12 21:05                         ` Martin K. Petersen
  2015-10-12 23:06                           ` Mike Snitzer
  2015-10-13  0:31                         ` Block integrity registration update Williams, Dan J
  5 siblings, 1 reply; 67+ messages in thread
From: Martin K. Petersen @ 2015-10-12 21:05 UTC (permalink / raw)


Up until now the integrity profile has been dynamically allocated and
attached to struct gendisk after the disk has been made active.

This causes problems because NVMe devices need to register the profile
prior to the partition table being read due to a mandatory metadata
buffer requirement. In addition, DM jumps through hoops to deal with
preallocating, but not initializing, integrity profiles.

Since the integrity profile is small (4 bytes + a pointer), Christoph
suggested moving it to struct gendisk proper. This requires several
changes:

 - Moving the blk_integrity definition to genhd.h.

 - Inlining blk_integrity in struct gendisk.

 - Removing the dynamic allocation code.

 - Adding helper functions which allow gendisk to set up and tear down
   the integrity sysfs dir when a disk is added/deleted.

 - Adding a blk_integrity_revalidate() callback for updating the stable
   pages bdi setting.

 - The calls that depend on whether a device has an integrity profile or
   not now key off of the bi->profile pointer.

 - Simplifying the integrity support routines in DM (Mike Snitzer).
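
The effect of the list above can be sketched in userspace roughly as
follows. The structures are simplified stand-ins that reuse the kernel
names for readability only, with most fields and all block-layer side
effects (sysfs, stable pages) left out. The function names get_integrity,
integrity_register and integrity_unregister are hypothetical shorthand for
the blk_* routines in the diff. The point is that "has a profile" becomes
a test of the embedded profile pointer rather than of an allocated struct.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the kernel structures; not the real layout. */
struct blk_integrity_profile {
	const char *name;
};

struct blk_integrity {
	struct blk_integrity_profile *profile;
	unsigned char flags;
	unsigned char tuple_size;
	unsigned char interval_exp;
	unsigned char tag_size;
};

struct gendisk {
	struct blk_integrity integrity;	/* embedded, no longer a pointer */
};

/* A disk has an integrity profile only if the profile pointer is set. */
static struct blk_integrity *get_integrity(struct gendisk *disk)
{
	struct blk_integrity *bi = &disk->integrity;

	return bi->profile ? bi : NULL;
}

/* Copy the template into the embedded struct; nothing is allocated. */
static void integrity_register(struct gendisk *disk,
			       const struct blk_integrity *template)
{
	struct blk_integrity *bi = &disk->integrity;

	bi->profile = template->profile;
	bi->flags = template->flags;
	bi->tuple_size = template->tuple_size;
	bi->tag_size = template->tag_size;
}

/* Unregistering clears the embedded struct, including the profile. */
static void integrity_unregister(struct gendisk *disk)
{
	memset(&disk->integrity, 0, sizeof(disk->integrity));
}
```

Because registration can no longer fail on allocation, it needs no return
value, which matches the int to void change in the diff below.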

Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
Reported-by: Christoph Hellwig <hch at lst.de>
Reviewed-by: Sagi Grimberg <sagig at mellanox.com>
Cc: Mike Snitzer <snitzer at redhat.com>
Cc: Dan Williams <dan.j.williams at intel.com>
---
 block/blk-integrity.c     | 160 +++++++++++++++++-----------------------------
 block/genhd.c             |   2 +
 block/partition-generic.c |   1 +
 drivers/md/dm-table.c     |  88 +++++++++++++------------
 drivers/md/md.c           |   9 +--
 drivers/nvdimm/core.c     |   6 +-
 drivers/nvme/host/pci.c   |   5 +-
 fs/block_dev.c            |   2 +-
 include/linux/blkdev.h    |  34 ++++------
 include/linux/genhd.h     |  26 +++++++-
 10 files changed, 152 insertions(+), 181 deletions(-)

diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index 7a96f57ed195..4615a3386798 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -30,10 +30,6 @@
 
 #include "blk.h"
 
-static struct kmem_cache *integrity_cachep;
-
-static const char *bi_unsupported_name = "unsupported";
-
 /**
  * blk_rq_count_integrity_sg - Count number of integrity scatterlist elements
  * @q:		request queue
@@ -146,13 +142,13 @@ EXPORT_SYMBOL(blk_rq_map_integrity_sg);
  */
 int blk_integrity_compare(struct gendisk *gd1, struct gendisk *gd2)
 {
-	struct blk_integrity *b1 = gd1->integrity;
-	struct blk_integrity *b2 = gd2->integrity;
+	struct blk_integrity *b1 = &gd1->integrity;
+	struct blk_integrity *b2 = &gd2->integrity;
 
-	if (!b1 && !b2)
+	if (!b1->profile && !b2->profile)
 		return 0;
 
-	if (!b1 || !b2)
+	if (!b1->profile || !b2->profile)
 		return -1;
 
 	if (b1->interval_exp != b2->interval_exp) {
@@ -163,21 +159,21 @@ int blk_integrity_compare(struct gendisk *gd1, struct gendisk *gd2)
 	}
 
 	if (b1->tuple_size != b2->tuple_size) {
-		printk(KERN_ERR "%s: %s/%s tuple sz %u != %u\n", __func__,
+		pr_err("%s: %s/%s tuple sz %u != %u\n", __func__,
 		       gd1->disk_name, gd2->disk_name,
 		       b1->tuple_size, b2->tuple_size);
 		return -1;
 	}
 
 	if (b1->tag_size && b2->tag_size && (b1->tag_size != b2->tag_size)) {
-		printk(KERN_ERR "%s: %s/%s tag sz %u != %u\n", __func__,
+		pr_err("%s: %s/%s tag sz %u != %u\n", __func__,
 		       gd1->disk_name, gd2->disk_name,
 		       b1->tag_size, b2->tag_size);
 		return -1;
 	}
 
 	if (b1->profile != b2->profile) {
-		printk(KERN_ERR "%s: %s/%s type %s != %s\n", __func__,
+		pr_err("%s: %s/%s type %s != %s\n", __func__,
 		       gd1->disk_name, gd2->disk_name,
 		       b1->profile->name, b2->profile->name);
 		return -1;
@@ -250,7 +246,7 @@ static ssize_t integrity_attr_show(struct kobject *kobj, struct attribute *attr,
 				   char *page)
 {
 	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
-	struct blk_integrity *bi = blk_get_integrity(disk);
+	struct blk_integrity *bi = &disk->integrity;
 	struct integrity_sysfs_entry *entry =
 		container_of(attr, struct integrity_sysfs_entry, attr);
 
@@ -262,7 +258,7 @@ static ssize_t integrity_attr_store(struct kobject *kobj,
 				    size_t count)
 {
 	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
-	struct blk_integrity *bi = blk_get_integrity(disk);
+	struct blk_integrity *bi = &disk->integrity;
 	struct integrity_sysfs_entry *entry =
 		container_of(attr, struct integrity_sysfs_entry, attr);
 	ssize_t ret = 0;
@@ -275,7 +271,7 @@ static ssize_t integrity_attr_store(struct kobject *kobj,
 
 static ssize_t integrity_format_show(struct blk_integrity *bi, char *page)
 {
-	if (bi != NULL && bi->profile->name != NULL)
+	if (bi->profile && bi->profile->name)
 		return sprintf(page, "%s\n", bi->profile->name);
 	else
 		return sprintf(page, "none\n");
@@ -283,18 +279,13 @@ static ssize_t integrity_format_show(struct blk_integrity *bi, char *page)
 
 static ssize_t integrity_tag_size_show(struct blk_integrity *bi, char *page)
 {
-	if (bi != NULL)
-		return sprintf(page, "%u\n", bi->tag_size);
-	else
-		return sprintf(page, "0\n");
+	return sprintf(page, "%u\n", bi->tag_size);
 }
 
 static ssize_t integrity_interval_show(struct blk_integrity *bi, char *page)
 {
-	if (bi != NULL)
-		return sprintf(page, "%u\n", 1 << bi->interval_exp);
-	else
-		return sprintf(page, "0\n");
+	return sprintf(page, "%u\n",
+		       bi->interval_exp ? 1 << bi->interval_exp : 0);
 }
 
 static ssize_t integrity_verify_store(struct blk_integrity *bi,
@@ -388,113 +379,78 @@ static const struct sysfs_ops integrity_ops = {
 	.store	= &integrity_attr_store,
 };
 
-static int __init blk_dev_integrity_init(void)
-{
-	integrity_cachep = kmem_cache_create("blkdev_integrity",
-					     sizeof(struct blk_integrity),
-					     0, SLAB_PANIC, NULL);
-	return 0;
-}
-subsys_initcall(blk_dev_integrity_init);
-
-static void blk_integrity_release(struct kobject *kobj)
-{
-	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
-	struct blk_integrity *bi = blk_get_integrity(disk);
-
-	kmem_cache_free(integrity_cachep, bi);
-}
-
 static struct kobj_type integrity_ktype = {
 	.default_attrs	= integrity_attrs,
 	.sysfs_ops	= &integrity_ops,
-	.release	= blk_integrity_release,
 };
 
-bool blk_integrity_is_initialized(struct gendisk *disk)
-{
-	struct blk_integrity *bi = blk_get_integrity(disk);
-
-	return (bi && bi->profile->name && strcmp(bi->profile->name,
-						  bi_unsupported_name) != 0);
-}
-EXPORT_SYMBOL(blk_integrity_is_initialized);
-
 /**
  * blk_integrity_register - Register a gendisk as being integrity-capable
  * @disk:	struct gendisk pointer to make integrity-aware
- * @template:	optional integrity profile to register
+ * @template:	block integrity profile to register
  *
- * Description: When a device needs to advertise itself as being able
- * to send/receive integrity metadata it must use this function to
- * register the capability with the block layer.  The template is a
- * blk_integrity struct with values appropriate for the underlying
- * hardware.  If template is NULL the new profile is allocated but
- * not filled out. See Documentation/block/data-integrity.txt.
+ * Description: When a device needs to advertise itself as being able to
+ * send/receive integrity metadata it must use this function to register
+ * the capability with the block layer. The template is a blk_integrity
+ * struct with values appropriate for the underlying hardware. See
+ * Documentation/block/data-integrity.txt.
  */
-int blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
+void blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
 {
-	struct blk_integrity *bi;
-
-	BUG_ON(disk == NULL);
+	struct blk_integrity *bi = &disk->integrity;
 
-	if (disk->integrity == NULL) {
-		bi = kmem_cache_alloc(integrity_cachep,
-				      GFP_KERNEL | __GFP_ZERO);
-		if (!bi)
-			return -1;
+	bi->flags = BLK_INTEGRITY_VERIFY | BLK_INTEGRITY_GENERATE |
+		template->flags;
+	bi->interval_exp = ilog2(queue_logical_block_size(disk->queue));
+	bi->profile = template->profile;
+	bi->tuple_size = template->tuple_size;
+	bi->tag_size = template->tag_size;
 
-		if (kobject_init_and_add(&disk->integrity_kobj, &integrity_ktype,
-					 &disk_to_dev(disk)->kobj,
-					 "%s", "integrity")) {
-			kmem_cache_free(integrity_cachep, bi);
-			return -1;
-		}
-
-		kobject_uevent(&disk->integrity_kobj, KOBJ_ADD);
-
-		bi->flags |= BLK_INTEGRITY_VERIFY | BLK_INTEGRITY_GENERATE;
-		bi->interval_exp = ilog2(queue_logical_block_size(disk->queue));
-		disk->integrity = bi;
-	} else
-		bi = disk->integrity;
-
-	/* Use the provided profile as template */
-	if (template != NULL) {
-		bi->profile = template->profile;
-		bi->tuple_size = template->tuple_size;
-		bi->tag_size = template->tag_size;
-		bi->flags |= template->flags;
-	} else
-		bi->profile->name = bi_unsupported_name;
-
-	disk->queue->backing_dev_info.capabilities |= BDI_CAP_STABLE_WRITES;
-
-	return 0;
+	blk_integrity_revalidate(disk);
 }
 EXPORT_SYMBOL(blk_integrity_register);
 
 /**
- * blk_integrity_unregister - Remove block integrity profile
- * @disk:	disk whose integrity profile to deallocate
+ * blk_integrity_unregister - Unregister block integrity profile
+ * @disk:	disk whose integrity profile to unregister
  *
- * Description: This function frees all memory used by the block
- * integrity profile.  To be called at device teardown.
+ * Description: This function unregisters the integrity capability from
+ * a block device.
  */
 void blk_integrity_unregister(struct gendisk *disk)
 {
-	struct blk_integrity *bi;
+	blk_integrity_revalidate(disk);
+	memset(&disk->integrity, 0, sizeof(struct blk_integrity));
+}
+EXPORT_SYMBOL(blk_integrity_unregister);
 
-	if (!disk || !disk->integrity)
+void blk_integrity_revalidate(struct gendisk *disk)
+{
+	struct blk_integrity *bi = &disk->integrity;
+
+	if (!(disk->flags & GENHD_FL_UP))
 		return;
 
-	disk->queue->backing_dev_info.capabilities &= ~BDI_CAP_STABLE_WRITES;
+	if (bi->profile)
+		disk->queue->backing_dev_info.capabilities |=
+			BDI_CAP_STABLE_WRITES;
+	else
+		disk->queue->backing_dev_info.capabilities &=
+			~BDI_CAP_STABLE_WRITES;
+}
 
-	bi = disk->integrity;
+void blk_integrity_add(struct gendisk *disk)
+{
+	if (kobject_init_and_add(&disk->integrity_kobj, &integrity_ktype,
+				 &disk_to_dev(disk)->kobj, "%s", "integrity"))
+		return;
 
+	kobject_uevent(&disk->integrity_kobj, KOBJ_ADD);
+}
+
+void blk_integrity_del(struct gendisk *disk)
+{
 	kobject_uevent(&disk->integrity_kobj, KOBJ_REMOVE);
 	kobject_del(&disk->integrity_kobj);
 	kobject_put(&disk->integrity_kobj);
-	disk->integrity = NULL;
 }
-EXPORT_SYMBOL(blk_integrity_unregister);
diff --git a/block/genhd.c b/block/genhd.c
index 0c706f33a599..e5cafa51567c 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -630,6 +630,7 @@ void add_disk(struct gendisk *disk)
 	WARN_ON(retval);
 
 	disk_add_events(disk);
+	blk_integrity_add(disk);
 }
 EXPORT_SYMBOL(add_disk);
 
@@ -638,6 +639,7 @@ void del_gendisk(struct gendisk *disk)
 	struct disk_part_iter piter;
 	struct hd_struct *part;
 
+	blk_integrity_del(disk);
 	disk_del_events(disk);
 
 	/* invalidate stuff */
diff --git a/block/partition-generic.c b/block/partition-generic.c
index e7711133284e..3b030157ec85 100644
--- a/block/partition-generic.c
+++ b/block/partition-generic.c
@@ -428,6 +428,7 @@ rescan:
 
 	if (disk->fops->revalidate_disk)
 		disk->fops->revalidate_disk(disk);
+	blk_integrity_revalidate(disk);
 	check_disk_size_change(disk, bdev);
 	bdev->bd_invalidated = 0;
 	if (!get_capacity(disk) || !(state = check_partition(disk, bdev)))
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index e76ed003769e..061152a43730 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1014,15 +1014,16 @@ static int dm_table_build_index(struct dm_table *t)
 	return r;
 }
 
+static bool integrity_profile_exists(struct gendisk *disk)
+{
+	return !!blk_get_integrity(disk);
+}
+
 /*
  * Get a disk whose integrity profile reflects the table's profile.
- * If %match_all is true, all devices' profiles must match.
- * If %match_all is false, all devices must at least have an
- * allocated integrity profile; but uninitialized is ok.
  * Returns NULL if integrity support was inconsistent or unavailable.
  */
-static struct gendisk * dm_table_get_integrity_disk(struct dm_table *t,
-						    bool match_all)
+static struct gendisk * dm_table_get_integrity_disk(struct dm_table *t)
 {
 	struct list_head *devices = dm_table_get_devices(t);
 	struct dm_dev_internal *dd = NULL;
@@ -1030,10 +1031,8 @@ static struct gendisk * dm_table_get_integrity_disk(struct dm_table *t,
 
 	list_for_each_entry(dd, devices, list) {
 		template_disk = dd->dm_dev->bdev->bd_disk;
-		if (!blk_get_integrity(template_disk))
+		if (!integrity_profile_exists(template_disk))
 			goto no_integrity;
-		if (!match_all && !blk_integrity_is_initialized(template_disk))
-			continue; /* skip uninitialized profiles */
 		else if (prev_disk &&
 			 blk_integrity_compare(prev_disk, template_disk) < 0)
 			goto no_integrity;
@@ -1052,34 +1051,40 @@ no_integrity:
 }
 
 /*
- * Register the mapped device for blk_integrity support if
- * the underlying devices have an integrity profile.  But all devices
- * may not have matching profiles (checking all devices isn't reliable
+ * Register the mapped device for blk_integrity support if the
+ * underlying devices have an integrity profile.  But all devices may
+ * not have matching profiles (checking all devices isn't reliable
  * during table load because this table may use other DM device(s) which
- * must be resumed before they will have an initialized integity profile).
- * Stacked DM devices force a 2 stage integrity profile validation:
- * 1 - during load, validate all initialized integrity profiles match
- * 2 - during resume, validate all integrity profiles match
+ * must be resumed before they will have an initialized integrity
+ * profile).  Consequently, stacked DM devices force a 2 stage integrity
+ * profile validation: First pass during table load, final pass during
+ * resume.
  */
-static int dm_table_prealloc_integrity(struct dm_table *t, struct mapped_device *md)
+static int dm_table_register_integrity(struct dm_table *t)
 {
+	struct mapped_device *md = t->md;
 	struct gendisk *template_disk = NULL;
 
-	template_disk = dm_table_get_integrity_disk(t, false);
+	template_disk = dm_table_get_integrity_disk(t);
 	if (!template_disk)
 		return 0;
 
-	if (!blk_integrity_is_initialized(dm_disk(md))) {
+	if (!integrity_profile_exists(dm_disk(md))) {
 		t->integrity_supported = 1;
-		return blk_integrity_register(dm_disk(md), NULL);
+		/*
+		 * Register integrity profile during table load; we can do
+		 * this because the final profile must match during resume.
+		 */
+		blk_integrity_register(dm_disk(md),
+				       blk_get_integrity(template_disk));
+		return 0;
 	}
 
 	/*
-	 * If DM device already has an initalized integrity
+	 * If DM device already has an initialized integrity
 	 * profile the new profile should not conflict.
 	 */
-	if (blk_integrity_is_initialized(template_disk) &&
-	    blk_integrity_compare(dm_disk(md), template_disk) < 0) {
+	if (blk_integrity_compare(dm_disk(md), template_disk) < 0) {
 		DMWARN("%s: conflict with existing integrity profile: "
 		       "%s profile mismatch",
 		       dm_device_name(t->md),
@@ -1087,7 +1092,7 @@ static int dm_table_prealloc_integrity(struct dm_table *t, struct mapped_device
 		return 1;
 	}
 
-	/* Preserve existing initialized integrity profile */
+	/* Preserve existing integrity profile */
 	t->integrity_supported = 1;
 	return 0;
 }
@@ -1112,7 +1117,7 @@ int dm_table_complete(struct dm_table *t)
 		return r;
 	}
 
-	r = dm_table_prealloc_integrity(t, t->md);
+	r = dm_table_register_integrity(t);
 	if (r) {
 		DMERR("could not register integrity profile.");
 		return r;
@@ -1278,29 +1283,30 @@ combine_limits:
 }
 
 /*
- * Set the integrity profile for this device if all devices used have
- * matching profiles.  We're quite deep in the resume path but still
- * don't know if all devices (particularly DM devices this device
- * may be stacked on) have matching profiles.  Even if the profiles
- * don't match we have no way to fail (to resume) at this point.
+ * Verify that all devices have an integrity profile that matches the
+ * DM device's registered integrity profile.  If the profiles don't
+ * match then unregister the DM device's integrity profile.
  */
-static void dm_table_set_integrity(struct dm_table *t)
+static void dm_table_verify_integrity(struct dm_table *t)
 {
 	struct gendisk *template_disk = NULL;
 
-	if (!blk_get_integrity(dm_disk(t->md)))
-		return;
+	if (t->integrity_supported) {
+		/*
+		 * Verify that the original integrity profile
+		 * matches all the devices in this table.
+		 */
+		template_disk = dm_table_get_integrity_disk(t);
+		if (template_disk &&
+		    blk_integrity_compare(dm_disk(t->md), template_disk) >= 0)
+			return;
+	}
 
-	template_disk = dm_table_get_integrity_disk(t, true);
-	if (template_disk)
-		blk_integrity_register(dm_disk(t->md),
-				       blk_get_integrity(template_disk));
-	else if (blk_integrity_is_initialized(dm_disk(t->md)))
-		DMWARN("%s: device no longer has a valid integrity profile",
-		       dm_device_name(t->md));
-	else
+	if (integrity_profile_exists(dm_disk(t->md))) {
 		DMWARN("%s: unable to establish an integrity profile",
 		       dm_device_name(t->md));
+		blk_integrity_unregister(dm_disk(t->md));
+	}
 }
 
 static int device_flush_capable(struct dm_target *ti, struct dm_dev *dev,
@@ -1500,7 +1506,7 @@ void dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
 	else
 		queue_flag_set_unlocked(QUEUE_FLAG_NO_SG_MERGE, q);
 
-	dm_table_set_integrity(t);
+	dm_table_verify_integrity(t);
 
 	/*
 	 * Determine whether or not this queue's I/O timings contribute
diff --git a/drivers/md/md.c b/drivers/md/md.c
index c702de18207a..2af9d590e1a0 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -1962,12 +1962,9 @@ int md_integrity_register(struct mddev *mddev)
 	 * All component devices are integrity capable and have matching
 	 * profiles, register the common profile for the md device.
 	 */
-	if (blk_integrity_register(mddev->gendisk,
-			bdev_get_integrity(reference->bdev)) != 0) {
-		printk(KERN_ERR "md: failed to register integrity for %s\n",
-			mdname(mddev));
-		return -EINVAL;
-	}
+	blk_integrity_register(mddev->gendisk,
+			       bdev_get_integrity(reference->bdev));
+
 	printk(KERN_NOTICE "md: data integrity enabled on %s\n", mdname(mddev));
 	if (bioset_integrity_create(mddev->bio_set, BIO_POOL_SIZE)) {
 		printk(KERN_ERR "md: failed to create integrity pool for %s\n",
diff --git a/drivers/nvdimm/core.c b/drivers/nvdimm/core.c
index 9e1b0f656a9b..9b6ac57c6e73 100644
--- a/drivers/nvdimm/core.c
+++ b/drivers/nvdimm/core.c
@@ -405,7 +405,6 @@ int nd_integrity_init(struct gendisk *disk, unsigned long meta_size)
 		.generate_fn = nd_pi_nop_generate_verify,
 		.verify_fn = nd_pi_nop_generate_verify,
 	};
-	int ret;
 
 	if (meta_size == 0)
 		return 0;
@@ -414,10 +413,7 @@ int nd_integrity_init(struct gendisk *disk, unsigned long meta_size)
 	bi.tuple_size = meta_size;
 	bi.tag_size = meta_size;
 
-	ret = blk_integrity_register(disk, &bi);
-	if (ret)
-		return ret;
-
+	blk_integrity_register(disk, &bi);
 	blk_queue_max_integrity_segments(disk->queue, 1);
 
 	return 0;
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 5dba51d4bae6..d5d877716fd6 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -538,7 +538,7 @@ static void nvme_dif_remap(struct request *req,
 	virt = bip_get_seed(bip);
 	phys = nvme_block_nr(ns, blk_rq_pos(req));
 	nlb = (blk_rq_bytes(req) >> ns->lba_shift);
-	ts = ns->disk->integrity->tuple_size;
+	ts = ns->disk->integrity.tuple_size;
 
 	for (i = 0; i < nlb; i++, virt++, phys++) {
 		pi = (struct t10_pi_tuple *)p;
@@ -2042,8 +2042,7 @@ static int nvme_revalidate_disk(struct gendisk *disk)
 	ns->pi_type = pi_type;
 	blk_queue_logical_block_size(ns->queue, bs);
 
-	if (ns->ms && !blk_get_integrity(disk) && (disk->flags & GENHD_FL_UP) &&
-								!ns->ext)
+	if (ns->ms && !ns->ext)
 		nvme_init_integrity(ns);
 
 	if (ns->ms && !(ns->ms == 8 && ns->pi_type) && !blk_get_integrity(disk))
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 073bb57adab1..0a793c7930eb 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1075,7 +1075,7 @@ int revalidate_disk(struct gendisk *disk)
 
 	if (disk->fops->revalidate_disk)
 		ret = disk->fops->revalidate_disk(disk);
-
+	blk_integrity_revalidate(disk);
 	bdev = bdget_disk(disk, 0);
 	if (!bdev)
 		return ret;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 4f1968f15e30..60669c20190f 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1468,16 +1468,7 @@ struct blk_integrity_profile {
 	const char			*name;
 };
 
-struct blk_integrity {
-	struct blk_integrity_profile	*profile;
-	unsigned char			flags;
-	unsigned char			tuple_size;
-	unsigned char			interval_exp;
-	unsigned char			tag_size;
-};
-
-extern bool blk_integrity_is_initialized(struct gendisk *);
-extern int blk_integrity_register(struct gendisk *, struct blk_integrity *);
+extern void blk_integrity_register(struct gendisk *, struct blk_integrity *);
 extern void blk_integrity_unregister(struct gendisk *);
 extern int blk_integrity_compare(struct gendisk *, struct gendisk *);
 extern int blk_rq_map_integrity_sg(struct request_queue *, struct bio *,
@@ -1488,15 +1479,20 @@ extern bool blk_integrity_merge_rq(struct request_queue *, struct request *,
 extern bool blk_integrity_merge_bio(struct request_queue *, struct request *,
 				    struct bio *);
 
-static inline
-struct blk_integrity *bdev_get_integrity(struct block_device *bdev)
+static inline struct blk_integrity *blk_get_integrity(struct gendisk *disk)
 {
-	return bdev->bd_disk->integrity;
+	struct blk_integrity *bi = &disk->integrity;
+
+	if (!bi->profile)
+		return NULL;
+
+	return bi;
 }
 
-static inline struct blk_integrity *blk_get_integrity(struct gendisk *disk)
+static inline
+struct blk_integrity *bdev_get_integrity(struct block_device *bdev)
 {
-	return disk->integrity;
+	return blk_get_integrity(bdev->bd_disk);
 }
 
 static inline bool blk_integrity_rq(struct request *rq)
@@ -1570,10 +1566,9 @@ static inline int blk_integrity_compare(struct gendisk *a, struct gendisk *b)
 {
 	return 0;
 }
-static inline int blk_integrity_register(struct gendisk *d,
+static inline void blk_integrity_register(struct gendisk *d,
 					 struct blk_integrity *b)
 {
-	return 0;
 }
 static inline void blk_integrity_unregister(struct gendisk *d)
 {
@@ -1598,10 +1593,7 @@ static inline bool blk_integrity_merge_bio(struct request_queue *rq,
 {
 	return true;
 }
-static inline bool blk_integrity_is_initialized(struct gendisk *g)
-{
-	return 0;
-}
+
 static inline bool integrity_req_gap_back_merge(struct request *req,
 						struct bio *next)
 {
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 9e6e0dfa97ad..82f4911e0ad8 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -163,6 +163,18 @@ struct disk_part_tbl {
 
 struct disk_events;
 
+#if defined(CONFIG_BLK_DEV_INTEGRITY)
+
+struct blk_integrity {
+	struct blk_integrity_profile	*profile;
+	unsigned char			flags;
+	unsigned char			tuple_size;
+	unsigned char			interval_exp;
+	unsigned char			tag_size;
+};
+
+#endif	/* CONFIG_BLK_DEV_INTEGRITY */
+
 struct gendisk {
 	/* major, first_minor and minors are input parameters only,
 	 * don't use directly.  Use disk_devt() and disk_max_parts().
@@ -198,9 +210,9 @@ struct gendisk {
 	atomic_t sync_io;		/* RAID */
 	struct disk_events *ev;
 #ifdef  CONFIG_BLK_DEV_INTEGRITY
-	struct blk_integrity *integrity;
+	struct blk_integrity integrity;
 	struct kobject integrity_kobj;
-#endif
+#endif	/* CONFIG_BLK_DEV_INTEGRITY */
 	int node_id;
 };
 
@@ -728,6 +740,16 @@ static inline void part_nr_sects_write(struct hd_struct *part, sector_t size)
 #endif
 }
 
+#if defined(CONFIG_BLK_DEV_INTEGRITY)
+extern void blk_integrity_add(struct gendisk *);
+extern void blk_integrity_del(struct gendisk *);
+extern void blk_integrity_revalidate(struct gendisk *);
+#else	/* CONFIG_BLK_DEV_INTEGRITY */
+static inline void blk_integrity_add(struct gendisk *disk) { }
+static inline void blk_integrity_del(struct gendisk *disk) { }
+static inline void blk_integrity_revalidate(struct gendisk *disk) { }
+#endif	/* CONFIG_BLK_DEV_INTEGRITY */
+
 #else /* CONFIG_BLOCK */
 
 static inline void printk_all_partitions(void) { }
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 5/5] block: Inline blk_integrity in struct gendisk
  2015-10-12 21:05                         ` [PATCH 5/5] block: Inline blk_integrity in struct gendisk Martin K. Petersen
@ 2015-10-12 23:06                           ` Mike Snitzer
  0 siblings, 0 replies; 67+ messages in thread
From: Mike Snitzer @ 2015-10-12 23:06 UTC (permalink / raw)


On Mon, Oct 12 2015 at  5:05pm -0400,
Martin K. Petersen <martin.petersen@oracle.com> wrote:

> Up until now the integrity profile has been dynamically allocated and
> attached to struct gendisk after the disk has been made active.
> 
> This causes problems because NVMe devices need to register the profile
> prior to the partition table being read due to a mandatory metadata
> buffer requirement. In addition, DM goes through hoops to deal with
> preallocating, but not initializing integrity profiles.
> 
> Since the integrity profile is small (4 bytes + a pointer), Christoph
> suggested moving it to struct gendisk proper. This requires several
> changes:
> 
>  - Moving the blk_integrity definition to genhd.h.
> 
>  - Inlining blk_integrity in struct gendisk.
> 
>  - Removing the dynamic allocation code.
> 
>  - Adding helper functions which allow gendisk to set up and tear down
>    the integrity sysfs dir when a disk is added/deleted.
> 
>  - Adding a blk_integrity_revalidate() callback for updating the stable
>    pages bdi setting.
> 
>  - The calls that depend on whether a device has an integrity profile or
>    not now key off of the bi->profile pointer.
> 
>  - Simplifying the integrity support routines in DM (Mike Snitzer).
> 
> Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
> Reported-by: Christoph Hellwig <hch at lst.de>
> Reviewed-by: Sagi Grimberg <sagig at mellanox.com>
> Cc: Mike Snitzer <snitzer at redhat.com>
> Cc: Dan Williams <dan.j.williams at intel.com>

Thanks Martin.

Please replace the Cc: Mike Snitzer <snitzer at redhat.com> above with
Signed-off-by: Mike Snitzer <snitzer at redhat.com>

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Block integrity registration update
  2015-10-12 21:05                       ` Block integrity registration update Martin K. Petersen
                                           ` (4 preceding siblings ...)
  2015-10-12 21:05                         ` [PATCH 5/5] block: Inline blk_integrity in struct gendisk Martin K. Petersen
@ 2015-10-13  0:31                         ` Williams, Dan J
  2015-10-13  1:53                           ` Williams, Dan J
  5 siblings, 1 reply; 67+ messages in thread
From: Williams, Dan J @ 2015-10-13  0:31 UTC (permalink / raw)


On Mon, 2015-10-12 at 17:05 -0400, Martin K. Petersen wrote:
> As requested, here's the integrity registration update for 4.4. Only
> delta is patch 5 which retains the non-PI metadata check as requested by
> Keith as well as the DM rework by Mike. I rebased on top of
> block/for-4.4/drivers to accommodate the NVMe shuffle.
> 
> [PATCH 1/5] block: Move integrity kobject to struct gendisk
> [PATCH 2/5] block: Consolidate static integrity profile properties
> [PATCH 3/5] block: Reduce the size of struct blk_integrity
> [PATCH 4/5] block: Export integrity data interval size in sysfs
> [PATCH 5/5] block: Inline blk_integrity in struct gendisk
> 

I'm triggering the oops below with the libnvdimm unit tests when running
these on top of block.git#for-4.4/drivers, and the problem does not
happen (readily) with them not applied.  I say "readily" because I don't
think this failure mode is new as much as it is making an existing
problem more reproducible.  I'll take a look, but wanted to give a heads
up in the meantime.

See "make check" and README.md in the ndctl repository if you want to
try reproducing...

https://github.com/pmem/ndctl

---

BUG: unable to handle kernel paging request at ffff8800d731bbd0
IP: [<ffff8800d731bbd0>] 0xffff8800d731bbd0
PGD 2f65067 PUD 21fffd067 PMD 80000000d72001e3 
Oops: 0011 [#1] SMP  
Dumping ftrace buffer: 
   (ftrace buffer empty)
Modules linked in: nd_blk(O) nfit_test(O) nfit(O) ip6t_rpfilter
ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute
bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6
nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security
ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4
nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle
iptable_security iptable_raw nd_pmem(O) nd_btt(O) serio_raw nd_e820(O)
libnvdimm(O) nfit_test_iomap(O)
CPU: 1 PID: 420 Comm: kworker/1:1H Tainted: G           O    4.3.0-rc4+
#1546
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Workqueue: kintegrityd bio_integrity_verify_fn
task: ffff8800d9cd3fc0 ti: ffff8800d9dc0000 task.ti: ffff8800d9dc0000
RIP: 0010:[<ffff8800d731bbd0>]  [<ffff8800d731bbd0>] 0xffff8800d731bbd0
RSP: 0018:ffff8800d9dc3cd8  EFLAGS: 00010286
RAX: ffff8800d731bbd0 RBX: 0000000000001000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff8800d7393600 RDI: ffff8800d9dc3d10
RBP: ffff8800d9dc3d68 R08: 0000000000000000 R09: 0000000000000000
R10: 0000160000000000 R11: ffff8800d9cd3fe8 R12: 0000000000001000
R13: ffff8800d9cd3fc0 R14: ffff88020d95ef00 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff88021fc40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffff8800d731bbd0 CR3: 00000000da226000 CR4: 00000000000006e0
Stack:
 ffffffff8144d9ae 0000000000000000 ffff880000000000 0000160000000000
 ffff8800d731bbd0 0000000000000000 0000000000000000 ffff8800da780940
 ffff8800d9174000 0000000000000000 0000020000001000 ffff8800cda8380c
Call Trace:
 [<ffffffff8144d9ae>] ? bio_integrity_process+0x12e/0x290
 [<ffffffff8144dd96>] bio_integrity_verify_fn+0x36/0x60
 [<ffffffff810bd2dc>] process_one_work+0x1cc/0x4e0
 [<ffffffff810bd26e>] ? process_one_work+0x15e/0x4e0
 [<ffffffff810bd92b>] worker_thread+0x4b/0x440
 [<ffffffff810bd8e0>] ? rescuer_thread+0x2f0/0x2f0
 [<ffffffff810bd8e0>] ? rescuer_thread+0x2f0/0x2f0
 [<ffffffff810c3993>] kthread+0xf3/0x110
 [<ffffffff810c38a0>] ? kthread_create_on_node+0x230/0x230
 [<ffffffff818f1aaf>] ret_from_fork+0x3f/0x70
 [<ffffffff810c38a0>] ? kthread_create_on_node+0x230/0x230
Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c0 bb 31 d7 00 88 ff ff c0 bb 31 d7 00 88 ff ff <d0> bb 31 d7 00 88 ff ff d0 bb 31 d7 00 88 ff ff e0 bb 31 d7 00 
RIP  [<ffff8800d731bbd0>] 0xffff8800d731bbd0
 RSP <ffff8800d9dc3cd8>
CR2: ffff8800d731bbd0
---[ end trace 056f87b7ce676294 ]---

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Block integrity registration update
  2015-10-13  0:31                         ` Block integrity registration update Williams, Dan J
@ 2015-10-13  1:53                           ` Williams, Dan J
  2015-10-13 12:26                             ` hch
  0 siblings, 1 reply; 67+ messages in thread
From: Williams, Dan J @ 2015-10-13  1:53 UTC (permalink / raw)


On Mon, 2015-10-12 at 17:31 -0700, Dan J Williams wrote:
> On Mon, 2015-10-12 at 17:05 -0400, Martin K. Petersen wrote:
> > As requested, here's the integrity registration update for 4.4. Only
> > delta is patch 5 which retains the non-PI metadata check as requested by
> > Keith as well as the DM rework by Mike. I rebased on top of
> > block/for-4.4/drivers to accommodate the NVMe shuffle.
> > 
> > [PATCH 1/5] block: Move integrity kobject to struct gendisk
> > [PATCH 2/5] block: Consolidate static integrity profile properties
> > [PATCH 3/5] block: Reduce the size of struct blk_integrity
> > [PATCH 4/5] block: Export integrity data interval size in sysfs
> > [PATCH 5/5] block: Inline blk_integrity in struct gendisk
> > 
> 
> I'm triggering the oops below with the libnvdimm unit tests when running
> these on top of block.git#for-4.4/drivers...

A little tracing reveals:

[   35.116011]    ndctl-2226    3.... 37126789us : blk_integrity_unregister: pmem1s: from: nvdimm_namespace_detach_btt [nd_btt]
[   35.116011]    ndctl-2226    3.... 37185506us : bio_integrity_endio: pmem1s
[   35.116011]    <...>-302     3.... 37185560us : bio_integrity_process: pmem1s

...i.e. that we're destroying the integrity profile while i/o is still
in flight.  As far as I can see any driver that calls
blk_integrity_unregister() before blk_cleanup_queue() can hit this.  

However, with the change to static allocation I'm not sure why a driver
would ever need to call blk_integrity_unregister() in its shutdown path.
It seems this would only be necessary for disabling integrity at run
time, but it can only do it safely when the queue is known to be idle.

Is there a way to solve this without the generic blk_freeze_queue()
implementation? [1].  The immediate fix for libnvdimm is to just stop
calling blk_integrity_unregister().

[1]: https://lists.01.org/pipermail/linux-nvdimm/2015-October/002388.html

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Block integrity registration update
  2015-10-13  1:53                           ` Williams, Dan J
@ 2015-10-13 12:26                             ` hch
  2015-10-13 17:38                                 ` Dan Williams
  0 siblings, 1 reply; 67+ messages in thread
From: hch @ 2015-10-13 12:26 UTC (permalink / raw)


On Tue, Oct 13, 2015 at 01:53:34AM +0000, Williams, Dan J wrote:
> ...i.e. that we're destroying the integrity profile while i/o is still
> in flight.  As far as I can see any driver that calls
> blk_integrity_unregister() before blk_cleanup_queue() can hit this.  
> 
> However, with the change to static allocation I'm not sure why a driver
> would ever need to call blk_integrity_unregister() in its shutdown path.

It shouldn't.

> It seems this would only be necessary for disabling integrity at run
> time, but it can only do it safely when the queue is known to be idle.

Yes.  And even for that case we should a) only clear ->flags not the
whole integrity profile (and fix blk_integrity_revalidate to check the
right thing) and b) clear flags before calling blk_integrity_revalidate.
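
A minimal userspace sketch of point a), assuming simplified stand-in
structures. The flag names, demo_profile and always_fail_verify() are
hypothetical and are not the kernel definitions. Disabling clears only the
flag bits, so a completion that still dereferences the profile pointer
finds it intact:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits; the kernel's BLK_INTEGRITY_* values are not reproduced here. */
#define INTEGRITY_VERIFY	(1u << 0)
#define INTEGRITY_GENERATE	(1u << 1)

/* Simplified stand-ins for blk_integrity_profile and blk_integrity. */
struct profile {
	int (*verify_fn)(const void *buf);
};

struct integrity {
	const struct profile *profile;
	unsigned char flags;
};

/* Hypothetical verifier that always reports a mismatch, so a skip is observable. */
static int always_fail_verify(const void *buf)
{
	(void)buf;
	return -1;
}

const struct profile demo_profile = { .verify_fn = always_fail_verify };

/* Run-time disable: clear the flags only and leave the profile pointer alone. */
void integrity_disable(struct integrity *bi)
{
	bi->flags &= (unsigned char)~(INTEGRITY_VERIFY | INTEGRITY_GENERATE);
}

/* Completion-side check: returns 0 when verification passes or is skipped. */
int integrity_verify(const struct integrity *bi, const void *buf)
{
	assert(bi->profile != NULL);	/* still valid after integrity_disable() */
	if (!(bi->flags & INTEGRITY_VERIFY))
		return 0;
	return bi->profile->verify_fn(buf);
}
```

With both flags set, integrity_verify() calls through the profile; after
integrity_disable() it returns 0 without touching the verifier, and the
profile pointer is unchanged.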

> Is there a way to solve this without the generic blk_freeze_queue()
> implementation? [1].  The immediate fix for libnvdimm is to just stop
> calling blk_integrity_unregister().

Seems like only nvme ever updates the profile, and nvme is blk-mq
only. 

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: Block integrity registration update
  2015-10-13 12:26                             ` hch
@ 2015-10-13 17:38                                 ` Dan Williams
  0 siblings, 0 replies; 67+ messages in thread
From: Dan Williams @ 2015-10-13 17:38 UTC (permalink / raw)
  To: hch
  Cc: martin.petersen, Busch, Keith, willy, linux-nvme, axboe, snitzer,
	Neil Brown, linux-raid

[ adding Neil, linux-raid ]

On Tue, Oct 13, 2015 at 5:26 AM, hch@infradead.org <hch@infradead.org> wrote:
> On Tue, Oct 13, 2015 at 01:53:34AM +0000, Williams, Dan J wrote:
>> ...i.e. that we're destroying the integrity profile while i/o is still
>> in flight.  As far as I can see any driver that calls
>> blk_integrity_unregister() before blk_cleanup_queue() can hit this.
>>
>> However, with the change to static allocation I'm not sure why a driver
>> would ever need to call blk_integrity_unregister() in its shutdown path.
>
> It shouldn't.
>
>> It seems this would only be necessary for disabling integrity at run
>> time, but it can only do it safely when the queue is known to be idle.
>
> Yes.  And even for that case we should a) only clear ->flags not the
> whole integrity profile (and fix blk_integrity_revalidate to check the
> right thing) and b) clear flags before calling blk_integrity_revalidate.
>
>> Is there a way to solve this without the generic blk_freeze_queue()
>> implementation? [1].  The immediate fix for libnvdimm is to just stop
>> calling blk_integrity_unregister().
>
> Seems like only nvme ever updates the profile, and nvme is blk-mq
> only.

Looks like both dm and md are calling blk_integrity_unregister() at
run-time as well when adding new disks to the configuration.  Even if
we fix blk_integrity_unregister() to not trigger a crash it seems we
still need to stop all I/O during the unregistration so we don't throw
spurious integrity miscompares for in-flight i/o.

For dm I'm wondering what's the difference between a mapped_device
being suspended vs having completed a stop_queue()?  Mike?

For md looks like a simple mddev_suspend()/resume() in
md_integrity_add_rdev() is sufficient.

...and then blk_mq_freeze_queue() for nvme in nvme_revalidate_disk().

I'll throw together a patch...
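
A rough userspace model of the ordering proposed above, assuming a
single-threaded stand-in for the queue. All names here (struct queue,
queue_freeze(), integrity_unregister() and so on) are hypothetical and only
mimic the idea of suspending or freezing before the profile is dropped. A
real freeze would block submitters and wait for the drain rather than
return false:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct profile {
	const char *name;
};

struct queue {
	const struct profile *profile;	/* NULL means no integrity profile */
	unsigned int inflight;		/* submitted but not yet completed */
	bool frozen;			/* no new submissions while set */
};

/* Returns false when the queue is frozen (real code would block instead). */
bool queue_submit(struct queue *q)
{
	if (q->frozen)
		return false;
	q->inflight++;
	return true;
}

void queue_complete(struct queue *q)
{
	assert(q->inflight > 0);
	q->inflight--;
}

void queue_freeze(struct queue *q)
{
	q->frozen = true;
}

void queue_unfreeze(struct queue *q)
{
	q->frozen = false;
}

/*
 * Drop the profile only when the queue is frozen and fully drained.
 * Returns false and changes nothing otherwise.
 */
bool integrity_unregister(struct queue *q)
{
	if (!q->frozen || q->inflight)
		return false;
	q->profile = NULL;
	return true;
}
```

The point of the model is the precondition in integrity_unregister(): no
completion can observe a missing profile, because none is outstanding when
the pointer is cleared.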

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH 2/5] block: Consolidate static integrity profile properties
  2015-10-12 21:05                         ` [PATCH 2/5] block: Consolidate static integrity profile properties Martin K. Petersen
@ 2015-10-14  1:11                           ` Dan Williams
  2015-10-14  7:23                             ` Christoph Hellwig
  0 siblings, 1 reply; 67+ messages in thread
From: Dan Williams @ 2015-10-14  1:11 UTC (permalink / raw)


On Mon, Oct 12, 2015 at 2:05 PM, Martin K. Petersen
<martin.petersen@oracle.com> wrote:
> We previously made a complete copy of a device's data integrity profile
> even though several of the fields inside the blk_integrity struct are
> pointers to fixed template entries in t10-pi.c.
>
> Split the static and per-device portions so that we can reference the
> template directly.
>
> Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
> Reported-by: Christoph Hellwig <hch at lst.de>
> Reviewed-by: Sagi Grimberg <sagig at mellanox.com>
> Cc: Dan Williams <dan.j.williams at intel.com>
[..]
> diff --git a/drivers/nvdimm/core.c b/drivers/nvdimm/core.c
> index cb62ec6a12d0..9e1b0f656a9b 100644
> --- a/drivers/nvdimm/core.c
> +++ b/drivers/nvdimm/core.c
> @@ -399,19 +399,22 @@ static int nd_pi_nop_generate_verify(struct blk_integrity_iter *iter)
>
>  int nd_integrity_init(struct gendisk *disk, unsigned long meta_size)
>  {
> -       struct blk_integrity integrity = {
> +       struct blk_integrity bi;
> +       struct blk_integrity_profile profile = {
>                 .name = "ND-PI-NOP",
>                 .generate_fn = nd_pi_nop_generate_verify,
>                 .verify_fn = nd_pi_nop_generate_verify,
> -               .tuple_size = meta_size,
> -               .tag_size = meta_size,

'profile' here needs to be made static since we reference rather than
copy the profile data at blk_integrity_register() time.  This is part
of, but I don't think all of, my blk_integrity shutdown woes.
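
A small userspace sketch of the lifetime issue, assuming simplified
stand-ins for the kernel structures (struct disk, integrity_register() and
integrity_init() here are illustrative, not the real definitions).
Registration keeps a pointer to the profile, so the profile has to outlive
the function that sets it up:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; not the real kernel definitions. */
struct profile {
	const char *name;
	int (*verify_fn)(void);
};

struct integrity {
	const struct profile *profile;	/* referenced, not copied */
	unsigned char tuple_size;
};

struct disk {
	struct integrity integrity;
};

static int nop_verify(void)
{
	return 0;
}

/* Models registration: copies the small fields, keeps the profile pointer. */
void integrity_register(struct disk *d, const struct integrity *tmpl)
{
	assert(d != NULL && tmpl != NULL);
	d->integrity.profile = tmpl->profile;
	d->integrity.tuple_size = tmpl->tuple_size;
}

void integrity_init(struct disk *d, unsigned char meta_size)
{
	/*
	 * 'static' gives the profile static storage duration, so the pointer
	 * stored in the disk stays valid after this function returns.
	 * Without it the object would live on this stack frame and the
	 * stored pointer would dangle.
	 */
	static const struct profile profile = {
		.name = "ND-PI-NOP",
		.verify_fn = nop_verify,
	};
	struct integrity bi = {
		.profile = &profile,
		.tuple_size = meta_size,
	};

	integrity_register(d, &bi);
}
```

The template 'bi' can stay on the stack because its scalar fields are
copied; only the object reached through the retained pointer needs the
longer lifetime.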

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH 2/5] block: Consolidate static integrity profile properties
  2015-10-14  1:11                           ` Dan Williams
@ 2015-10-14  7:23                             ` Christoph Hellwig
  2015-10-14 19:42                               ` Williams, Dan J
  0 siblings, 1 reply; 67+ messages in thread
From: Christoph Hellwig @ 2015-10-14  7:23 UTC (permalink / raw)


On Tue, Oct 13, 2015 at 06:11:50PM -0700, Dan Williams wrote:
> >  int nd_integrity_init(struct gendisk *disk, unsigned long meta_size)
> >  {
> > -       struct blk_integrity integrity = {
> > +       struct blk_integrity bi;
> > +       struct blk_integrity_profile profile = {
> >                 .name = "ND-PI-NOP",
> >                 .generate_fn = nd_pi_nop_generate_verify,
> >                 .verify_fn = nd_pi_nop_generate_verify,
> > -               .tuple_size = meta_size,
> > -               .tag_size = meta_size,
> 
> 'profile' here needs to be made static since we reference rather than
> copy the profile data at blk_integrity_register() time.  This is part
> of, but I don't think all of, my blk_integrity shutdown woes.

Oh, yes.  Can we also add a single noop profile to block/blk-integrity.c
while we're at it?

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH 2/5] block: Consolidate static integrity profile properties
  2015-10-14  7:23                             ` Christoph Hellwig
@ 2015-10-14 19:42                               ` Williams, Dan J
  2015-10-14 19:47                                 ` hch
  0 siblings, 1 reply; 67+ messages in thread
From: Williams, Dan J @ 2015-10-14 19:42 UTC (permalink / raw)


On Wed, 2015-10-14 at 00:23 -0700, Christoph Hellwig wrote:
> On Tue, Oct 13, 2015 at 06:11:50PM -0700, Dan Williams wrote:
> > >  int nd_integrity_init(struct gendisk *disk, unsigned long meta_size)
> > >  {
> > > -       struct blk_integrity integrity = {
> > > +       struct blk_integrity bi;
> > > +       struct blk_integrity_profile profile = {
> > >                 .name = "ND-PI-NOP",
> > >                 .generate_fn = nd_pi_nop_generate_verify,
> > >                 .verify_fn = nd_pi_nop_generate_verify,
> > > -               .tuple_size = meta_size,
> > > -               .tag_size = meta_size,
> > 
> > 'profile' here needs to be made static since we reference rather than
> > copy the profile data at blk_integrity_register() time.  This is part
> > of, but I don't think all of, my blk_integrity shutdown woes.
> 
> Oh, yes.  Can we also add a single noop profile to block/blk-integrity.c
> while we're at it?
> 

Sounds good, how about?

8<------
Subject: block, libnvdimm: provide a built-in blk_integrity nop profile

From: Dan Williams <dan.j.williams@intel.com>

The libnvdimm-btt driver uses blk_integrity to reserve space for
per-sector metadata, but does not implement protection checksums.  This
property is generically useful, so teach the block core to internally
specify a nop profile if one is not provided at registration time.

Suggested-by: Christoph Hellwig <hch at lst.de>
Signed-off-by: Dan Williams <dan.j.williams at intel.com>
---
 block/blk-integrity.c |   13 ++++++++++++-
 drivers/nvdimm/core.c |   12 +-----------
 2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index dc4dea7b8a93..506cc16c1a92 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -384,6 +384,17 @@ static struct kobj_type integrity_ktype = {
 	.sysfs_ops	= &integrity_ops,
 };
 
+static int blk_integrity_nop_fn(struct blk_integrity_iter *iter)
+{
+	return 0;
+}
+
+static struct blk_integrity_profile nop_profile = {
+	.name = "nop",
+	.generate_fn = blk_integrity_nop_fn,
+	.verify_fn = blk_integrity_nop_fn,
+};
+
 /**
  * blk_integrity_register - Register a request_queue as being integrity-capable
  * @disk:	struct request_queue pointer to make integrity-aware
@@ -402,7 +413,7 @@ void blk_integrity_register(struct request_queue *q, struct blk_integrity *templ
 	bi->flags = BLK_INTEGRITY_VERIFY | BLK_INTEGRITY_GENERATE |
 		template->flags;
 	bi->interval_exp = ilog2(queue_logical_block_size(q));
-	bi->profile = template->profile;
+	bi->profile = template->profile ? template->profile : &nop_profile;
 	bi->tuple_size = template->tuple_size;
 	bi->tag_size = template->tag_size;
 
diff --git a/drivers/nvdimm/core.c b/drivers/nvdimm/core.c
index eeedd58bbcad..2ed3c934256f 100644
--- a/drivers/nvdimm/core.c
+++ b/drivers/nvdimm/core.c
@@ -392,24 +392,14 @@ void nvdimm_bus_unregister(struct nvdimm_bus *nvdimm_bus)
 EXPORT_SYMBOL_GPL(nvdimm_bus_unregister);
 
 #ifdef CONFIG_BLK_DEV_INTEGRITY
-static int nd_pi_nop_generate_verify(struct blk_integrity_iter *iter)
-{
-	return 0;
-}
-
 int nd_integrity_init(struct gendisk *disk, unsigned long meta_size)
 {
 	struct blk_integrity bi;
-	static struct blk_integrity_profile profile = {
-		.name = "ND-PI-NOP",
-		.generate_fn = nd_pi_nop_generate_verify,
-		.verify_fn = nd_pi_nop_generate_verify,
-	};
 
 	if (meta_size == 0)
 		return 0;
 
-	bi.profile = &profile;
+	bi.profile = NULL;
 	bi.tuple_size = meta_size;
 	bi.tag_size = meta_size;
 

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 2/5] block: Consolidate static integrity profile properties
  2015-10-14 19:42                               ` Williams, Dan J
@ 2015-10-14 19:47                                 ` hch
  2015-10-14 20:00                                   ` Williams, Dan J
  0 siblings, 1 reply; 67+ messages in thread
From: hch @ 2015-10-14 19:47 UTC (permalink / raw)


On Wed, Oct 14, 2015 at 07:42:40PM +0000, Williams, Dan J wrote:
> Sounds good, how about?

Looks good.  Note that NVMe has a copy of it that could now be
consolidated.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH 2/5] block: Consolidate static integrity profile properties
  2015-10-14 19:47                                 ` hch
@ 2015-10-14 20:00                                   ` Williams, Dan J
  2015-10-14 22:42                                     ` Martin K. Petersen
  0 siblings, 1 reply; 67+ messages in thread
From: Williams, Dan J @ 2015-10-14 20:00 UTC (permalink / raw)


On Wed, 2015-10-14 at 12:47 -0700, hch@infradead.org wrote:
> On Wed, Oct 14, 2015 at 07:42:40PM +0000, Williams, Dan J wrote:
> > Sounds good, how about?
> 
> Looks good.  Note that NVMe has a copy of it that could now be
> consolidated.

Ah, cool, thanks for the heads up.

8<----
Subject: block, libnvdimm, nvme: provide a built-in blk_integrity nop profile

From: Dan Williams <dan.j.williams@intel.com>

The libnvdimm-btt and nvme drivers use blk_integrity to reserve space
for per-sector metadata, but sometimes without protection checksums.
This property is generically useful, so teach the block core to
internally specify a nop profile if one is not provided at registration
time.

Cc: Keith Busch <keith.busch at intel.com>
Cc: Matthew Wilcox <willy at linux.intel.com>
Suggested-by: Christoph Hellwig <hch at lst.de>
Signed-off-by: Dan Williams <dan.j.williams at intel.com>
---
 block/blk-integrity.c   |   13 ++++++++++++-
 drivers/nvdimm/core.c   |   12 +-----------
 drivers/nvme/host/pci.c |   18 +-----------------
 3 files changed, 14 insertions(+), 29 deletions(-)

diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index dc4dea7b8a93..506cc16c1a92 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -384,6 +384,17 @@ static struct kobj_type integrity_ktype = {
 	.sysfs_ops	= &integrity_ops,
 };
 
+static int blk_integrity_nop_fn(struct blk_integrity_iter *iter)
+{
+	return 0;
+}
+
+static struct blk_integrity_profile nop_profile = {
+	.name = "nop",
+	.generate_fn = blk_integrity_nop_fn,
+	.verify_fn = blk_integrity_nop_fn,
+};
+
 /**
  * blk_integrity_register - Register a request_queue as being integrity-capable
  * @disk:	struct request_queue pointer to make integrity-aware
@@ -402,7 +413,7 @@ void blk_integrity_register(struct request_queue *q, struct blk_integrity *templ
 	bi->flags = BLK_INTEGRITY_VERIFY | BLK_INTEGRITY_GENERATE |
 		template->flags;
 	bi->interval_exp = ilog2(queue_logical_block_size(q));
-	bi->profile = template->profile;
+	bi->profile = template->profile ? template->profile : &nop_profile;
 	bi->tuple_size = template->tuple_size;
 	bi->tag_size = template->tag_size;
 
diff --git a/drivers/nvdimm/core.c b/drivers/nvdimm/core.c
index eeedd58bbcad..2ed3c934256f 100644
--- a/drivers/nvdimm/core.c
+++ b/drivers/nvdimm/core.c
@@ -392,24 +392,14 @@ void nvdimm_bus_unregister(struct nvdimm_bus *nvdimm_bus)
 EXPORT_SYMBOL_GPL(nvdimm_bus_unregister);
 
 #ifdef CONFIG_BLK_DEV_INTEGRITY
-static int nd_pi_nop_generate_verify(struct blk_integrity_iter *iter)
-{
-	return 0;
-}
-
 int nd_integrity_init(struct gendisk *disk, unsigned long meta_size)
 {
 	struct blk_integrity bi;
-	static struct blk_integrity_profile profile = {
-		.name = "ND-PI-NOP",
-		.generate_fn = nd_pi_nop_generate_verify,
-		.verify_fn = nd_pi_nop_generate_verify,
-	};
 
 	if (meta_size == 0)
 		return 0;
 
-	bi.profile = &profile;
+	bi.profile = NULL;
 	bi.tuple_size = meta_size;
 	bi.tag_size = meta_size;
 
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index e4a0cc7fb421..532b6a491fca 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -550,22 +550,6 @@ static void nvme_dif_remap(struct request *req,
 	kunmap_atomic(pmap);
 }
 
-static int nvme_noop_verify(struct blk_integrity_iter *iter)
-{
-	return 0;
-}
-
-static int nvme_noop_generate(struct blk_integrity_iter *iter)
-{
-	return 0;
-}
-
-struct blk_integrity_profile nvme_meta_noop = {
-	.name			= "NVME_META_NOOP",
-	.generate_fn		= nvme_noop_generate,
-	.verify_fn		= nvme_noop_verify,
-};
-
 static void nvme_init_integrity(struct nvme_ns *ns)
 {
 	struct blk_integrity integrity;
@@ -579,7 +563,7 @@ static void nvme_init_integrity(struct nvme_ns *ns)
 		integrity.profile = &t10_pi_type1_crc;
 		break;
 	default:
-		integrity.profile = &nvme_meta_noop;
+		integrity.profile = NULL;
 		break;
 	}
 	integrity.tuple_size = ns->ms;

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 2/5] block: Consolidate static integrity profile properties
  2015-10-14 20:00                                   ` Williams, Dan J
@ 2015-10-14 22:42                                     ` Martin K. Petersen
  0 siblings, 0 replies; 67+ messages in thread
From: Martin K. Petersen @ 2015-10-14 22:42 UTC (permalink / raw)


>>>>> "Dan" == Williams, Dan J <dan.j.williams at intel.com> writes:

Dan> The libnvdimm-btt and nvme drivers use blk_integrity to reserve
Dan> space for per-sector metadata, but sometimes without protection
Dan> checksums.  This property is generically useful, so teach the block
Dan> core to internally specify a nop profile if one is not provided at
Dan> registration time.

Looks good to me.

Acked-by: Martin K. Petersen <martin.petersen at oracle.com>

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH 5/5] block: Inline blk_integrity in struct gendisk
  2015-10-20  2:45 ` Simplify block integrity registration v2 Martin K. Petersen
@ 2015-10-20  2:45   ` Martin K. Petersen
  0 siblings, 0 replies; 67+ messages in thread
From: Martin K. Petersen @ 2015-10-20  2:45 UTC (permalink / raw)


Up until now the integrity profile has been dynamically allocated and
attached to struct gendisk after the disk has been made active.

This causes problems because NVMe devices need to register the profile
prior to the partition table being read due to a mandatory metadata
buffer requirement. In addition, DM jumps through hoops to deal with
preallocating, but not initializing, integrity profiles.

Since the integrity profile is small (4 bytes + a pointer), Christoph
suggested moving it to struct gendisk proper. This requires several
changes:

 - Moving the blk_integrity definition to genhd.h.

 - Inlining blk_integrity in struct gendisk.

 - Removing the dynamic allocation code.

 - Adding helper functions which allow gendisk to set up and tear down
   the integrity sysfs dir when a disk is added/deleted.

 - Adding a blk_integrity_revalidate() callback for updating the stable
   pages bdi setting.

 - Making the calls that depend on whether a device has an integrity
   profile key off of the bi->profile pointer.

 - Simplifying the integrity support routines in DM (Mike Snitzer).
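
As a rough illustration of the "key off of bi->profile" convention listed
above, here is a minimal user-space sketch. It is not kernel code: the
struct definitions are simplified stand-ins, the model_* function names are
hypothetical, and the sketch leaves out the flag defaults, the interval_exp
calculation and the stable-pages update that the real registration path
performs.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the kernel structures (assumption: only the
 * fields needed for the illustration are modelled). */
struct blk_integrity_profile {
	const char *name;
};

struct blk_integrity {
	const struct blk_integrity_profile *profile;
	unsigned char flags;
	unsigned char tuple_size;
	unsigned char interval_exp;
	unsigned char tag_size;
};

struct gendisk {
	/* Embedded in the disk rather than allocated separately. */
	struct blk_integrity integrity;
};

/* Hypothetical helper: the embedded struct always exists, so "has a
 * profile" is decided by the profile pointer alone. */
static struct blk_integrity *model_get_integrity(struct gendisk *disk)
{
	struct blk_integrity *bi = &disk->integrity;

	return bi->profile ? bi : NULL;
}

/* Hypothetical helper: registration copies the template into the
 * embedded struct, so it has no allocation that could fail. */
static void model_register(struct gendisk *disk,
			   const struct blk_integrity *tmpl)
{
	struct blk_integrity *bi = &disk->integrity;

	bi->flags = tmpl->flags;
	bi->profile = tmpl->profile;
	bi->tuple_size = tmpl->tuple_size;
	bi->tag_size = tmpl->tag_size;
}

/* Hypothetical helper: unregistering clears the embedded struct, which
 * also resets the profile pointer to NULL. */
static void model_unregister(struct gendisk *disk)
{
	memset(&disk->integrity, 0, sizeof(disk->integrity));
}
```

In this model a zeroed disk reports no profile, reports one after
model_register() is called with a template whose profile pointer is set,
and reports none again after model_unregister().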

Signed-off-by: Martin K. Petersen <martin.petersen at oracle.com>
Reported-by: Christoph Hellwig <hch at lst.de>
Reviewed-by: Sagi Grimberg <sagig at mellanox.com>
Signed-off-by: Mike Snitzer <snitzer at redhat.com>
Cc: Dan Williams <dan.j.williams at intel.com>
---
 block/blk-integrity.c     | 160 +++++++++++++++++-----------------------------
 block/genhd.c             |   2 +
 block/partition-generic.c |   1 +
 drivers/block/nvme-core.c |   5 +-
 drivers/md/dm-table.c     |  88 +++++++++++++------------
 drivers/md/md.c           |   9 +--
 drivers/nvdimm/core.c     |   6 +-
 fs/block_dev.c            |   2 +-
 include/linux/blkdev.h    |  34 ++++------
 include/linux/genhd.h     |  26 +++++++-
 10 files changed, 152 insertions(+), 181 deletions(-)

diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index 7a96f57ed195..4615a3386798 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -30,10 +30,6 @@
 
 #include "blk.h"
 
-static struct kmem_cache *integrity_cachep;
-
-static const char *bi_unsupported_name = "unsupported";
-
 /**
  * blk_rq_count_integrity_sg - Count number of integrity scatterlist elements
  * @q:		request queue
@@ -146,13 +142,13 @@ EXPORT_SYMBOL(blk_rq_map_integrity_sg);
  */
 int blk_integrity_compare(struct gendisk *gd1, struct gendisk *gd2)
 {
-	struct blk_integrity *b1 = gd1->integrity;
-	struct blk_integrity *b2 = gd2->integrity;
+	struct blk_integrity *b1 = &gd1->integrity;
+	struct blk_integrity *b2 = &gd2->integrity;
 
-	if (!b1 && !b2)
+	if (!b1->profile && !b2->profile)
 		return 0;
 
-	if (!b1 || !b2)
+	if (!b1->profile || !b2->profile)
 		return -1;
 
 	if (b1->interval_exp != b2->interval_exp) {
@@ -163,21 +159,21 @@ int blk_integrity_compare(struct gendisk *gd1, struct gendisk *gd2)
 	}
 
 	if (b1->tuple_size != b2->tuple_size) {
-		printk(KERN_ERR "%s: %s/%s tuple sz %u != %u\n", __func__,
+		pr_err("%s: %s/%s tuple sz %u != %u\n", __func__,
 		       gd1->disk_name, gd2->disk_name,
 		       b1->tuple_size, b2->tuple_size);
 		return -1;
 	}
 
 	if (b1->tag_size && b2->tag_size && (b1->tag_size != b2->tag_size)) {
-		printk(KERN_ERR "%s: %s/%s tag sz %u != %u\n", __func__,
+		pr_err("%s: %s/%s tag sz %u != %u\n", __func__,
 		       gd1->disk_name, gd2->disk_name,
 		       b1->tag_size, b2->tag_size);
 		return -1;
 	}
 
 	if (b1->profile != b2->profile) {
-		printk(KERN_ERR "%s: %s/%s type %s != %s\n", __func__,
+		pr_err("%s: %s/%s type %s != %s\n", __func__,
 		       gd1->disk_name, gd2->disk_name,
 		       b1->profile->name, b2->profile->name);
 		return -1;
@@ -250,7 +246,7 @@ static ssize_t integrity_attr_show(struct kobject *kobj, struct attribute *attr,
 				   char *page)
 {
 	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
-	struct blk_integrity *bi = blk_get_integrity(disk);
+	struct blk_integrity *bi = &disk->integrity;
 	struct integrity_sysfs_entry *entry =
 		container_of(attr, struct integrity_sysfs_entry, attr);
 
@@ -262,7 +258,7 @@ static ssize_t integrity_attr_store(struct kobject *kobj,
 				    size_t count)
 {
 	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
-	struct blk_integrity *bi = blk_get_integrity(disk);
+	struct blk_integrity *bi = &disk->integrity;
 	struct integrity_sysfs_entry *entry =
 		container_of(attr, struct integrity_sysfs_entry, attr);
 	ssize_t ret = 0;
@@ -275,7 +271,7 @@ static ssize_t integrity_attr_store(struct kobject *kobj,
 
 static ssize_t integrity_format_show(struct blk_integrity *bi, char *page)
 {
-	if (bi != NULL && bi->profile->name != NULL)
+	if (bi->profile && bi->profile->name)
 		return sprintf(page, "%s\n", bi->profile->name);
 	else
 		return sprintf(page, "none\n");
@@ -283,18 +279,13 @@ static ssize_t integrity_format_show(struct blk_integrity *bi, char *page)
 
 static ssize_t integrity_tag_size_show(struct blk_integrity *bi, char *page)
 {
-	if (bi != NULL)
-		return sprintf(page, "%u\n", bi->tag_size);
-	else
-		return sprintf(page, "0\n");
+	return sprintf(page, "%u\n", bi->tag_size);
 }
 
 static ssize_t integrity_interval_show(struct blk_integrity *bi, char *page)
 {
-	if (bi != NULL)
-		return sprintf(page, "%u\n", 1 << bi->interval_exp);
-	else
-		return sprintf(page, "0\n");
+	return sprintf(page, "%u\n",
+		       bi->interval_exp ? 1 << bi->interval_exp : 0);
 }
 
 static ssize_t integrity_verify_store(struct blk_integrity *bi,
@@ -388,113 +379,78 @@ static const struct sysfs_ops integrity_ops = {
 	.store	= &integrity_attr_store,
 };
 
-static int __init blk_dev_integrity_init(void)
-{
-	integrity_cachep = kmem_cache_create("blkdev_integrity",
-					     sizeof(struct blk_integrity),
-					     0, SLAB_PANIC, NULL);
-	return 0;
-}
-subsys_initcall(blk_dev_integrity_init);
-
-static void blk_integrity_release(struct kobject *kobj)
-{
-	struct gendisk *disk = container_of(kobj, struct gendisk, integrity_kobj);
-	struct blk_integrity *bi = blk_get_integrity(disk);
-
-	kmem_cache_free(integrity_cachep, bi);
-}
-
 static struct kobj_type integrity_ktype = {
 	.default_attrs	= integrity_attrs,
 	.sysfs_ops	= &integrity_ops,
-	.release	= blk_integrity_release,
 };
 
-bool blk_integrity_is_initialized(struct gendisk *disk)
-{
-	struct blk_integrity *bi = blk_get_integrity(disk);
-
-	return (bi && bi->profile->name && strcmp(bi->profile->name,
-						  bi_unsupported_name) != 0);
-}
-EXPORT_SYMBOL(blk_integrity_is_initialized);
-
 /**
  * blk_integrity_register - Register a gendisk as being integrity-capable
  * @disk:	struct gendisk pointer to make integrity-aware
- * @template:	optional integrity profile to register
+ * @template:	block integrity profile to register
  *
- * Description: When a device needs to advertise itself as being able
- * to send/receive integrity metadata it must use this function to
- * register the capability with the block layer.  The template is a
- * blk_integrity struct with values appropriate for the underlying
- * hardware.  If template is NULL the new profile is allocated but
- * not filled out. See Documentation/block/data-integrity.txt.
+ * Description: When a device needs to advertise itself as being able to
+ * send/receive integrity metadata it must use this function to register
+ * the capability with the block layer. The template is a blk_integrity
+ * struct with values appropriate for the underlying hardware. See
+ * Documentation/block/data-integrity.txt.
  */
-int blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
+void blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
 {
-	struct blk_integrity *bi;
-
-	BUG_ON(disk == NULL);
+	struct blk_integrity *bi = &disk->integrity;
 
-	if (disk->integrity == NULL) {
-		bi = kmem_cache_alloc(integrity_cachep,
-				      GFP_KERNEL | __GFP_ZERO);
-		if (!bi)
-			return -1;
+	bi->flags = BLK_INTEGRITY_VERIFY | BLK_INTEGRITY_GENERATE |
+		template->flags;
+	bi->interval_exp = ilog2(queue_logical_block_size(disk->queue));
+	bi->profile = template->profile;
+	bi->tuple_size = template->tuple_size;
+	bi->tag_size = template->tag_size;
 
-		if (kobject_init_and_add(&disk->integrity_kobj, &integrity_ktype,
-					 &disk_to_dev(disk)->kobj,
-					 "%s", "integrity")) {
-			kmem_cache_free(integrity_cachep, bi);
-			return -1;
-		}
-
-		kobject_uevent(&disk->integrity_kobj, KOBJ_ADD);
-
-		bi->flags |= BLK_INTEGRITY_VERIFY | BLK_INTEGRITY_GENERATE;
-		bi->interval_exp = ilog2(queue_logical_block_size(disk->queue));
-		disk->integrity = bi;
-	} else
-		bi = disk->integrity;
-
-	/* Use the provided profile as template */
-	if (template != NULL) {
-		bi->profile = template->profile;
-		bi->tuple_size = template->tuple_size;
-		bi->tag_size = template->tag_size;
-		bi->flags |= template->flags;
-	} else
-		bi->profile->name = bi_unsupported_name;
-
-	disk->queue->backing_dev_info.capabilities |= BDI_CAP_STABLE_WRITES;
-
-	return 0;
+	blk_integrity_revalidate(disk);
 }
 EXPORT_SYMBOL(blk_integrity_register);
 
 /**
- * blk_integrity_unregister - Remove block integrity profile
- * @disk:	disk whose integrity profile to deallocate
+ * blk_integrity_unregister - Unregister block integrity profile
+ * @disk:	disk whose integrity profile to unregister
  *
- * Description: This function frees all memory used by the block
- * integrity profile.  To be called at device teardown.
+ * Description: This function unregisters the integrity capability from
+ * a block device.
  */
 void blk_integrity_unregister(struct gendisk *disk)
 {
-	struct blk_integrity *bi;
+	blk_integrity_revalidate(disk);
+	memset(&disk->integrity, 0, sizeof(struct blk_integrity));
+}
+EXPORT_SYMBOL(blk_integrity_unregister);
 
-	if (!disk || !disk->integrity)
+void blk_integrity_revalidate(struct gendisk *disk)
+{
+	struct blk_integrity *bi = &disk->integrity;
+
+	if (!(disk->flags & GENHD_FL_UP))
 		return;
 
-	disk->queue->backing_dev_info.capabilities &= ~BDI_CAP_STABLE_WRITES;
+	if (bi->profile)
+		disk->queue->backing_dev_info.capabilities |=
+			BDI_CAP_STABLE_WRITES;
+	else
+		disk->queue->backing_dev_info.capabilities &=
+			~BDI_CAP_STABLE_WRITES;
+}
 
-	bi = disk->integrity;
+void blk_integrity_add(struct gendisk *disk)
+{
+	if (kobject_init_and_add(&disk->integrity_kobj, &integrity_ktype,
+				 &disk_to_dev(disk)->kobj, "%s", "integrity"))
+		return;
 
+	kobject_uevent(&disk->integrity_kobj, KOBJ_ADD);
+}
+
+void blk_integrity_del(struct gendisk *disk)
+{
 	kobject_uevent(&disk->integrity_kobj, KOBJ_REMOVE);
 	kobject_del(&disk->integrity_kobj);
 	kobject_put(&disk->integrity_kobj);
-	disk->integrity = NULL;
 }
-EXPORT_SYMBOL(blk_integrity_unregister);
diff --git a/block/genhd.c b/block/genhd.c
index 0c706f33a599..e5cafa51567c 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -630,6 +630,7 @@ void add_disk(struct gendisk *disk)
 	WARN_ON(retval);
 
 	disk_add_events(disk);
+	blk_integrity_add(disk);
 }
 EXPORT_SYMBOL(add_disk);
 
@@ -638,6 +639,7 @@ void del_gendisk(struct gendisk *disk)
 	struct disk_part_iter piter;
 	struct hd_struct *part;
 
+	blk_integrity_del(disk);
 	disk_del_events(disk);
 
 	/* invalidate stuff */
diff --git a/block/partition-generic.c b/block/partition-generic.c
index e7711133284e..3b030157ec85 100644
--- a/block/partition-generic.c
+++ b/block/partition-generic.c
@@ -428,6 +428,7 @@ rescan:
 
 	if (disk->fops->revalidate_disk)
 		disk->fops->revalidate_disk(disk);
+	blk_integrity_revalidate(disk);
 	check_disk_size_change(disk, bdev);
 	bdev->bd_invalidated = 0;
 	if (!get_capacity(disk) || !(state = check_partition(disk, bdev)))
diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index 3ea1616814d8..01197f0d3c01 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -535,7 +535,7 @@ static void nvme_dif_remap(struct request *req,
 	virt = bip_get_seed(bip);
 	phys = nvme_block_nr(ns, blk_rq_pos(req));
 	nlb = (blk_rq_bytes(req) >> ns->lba_shift);
-	ts = ns->disk->integrity->tuple_size;
+	ts = ns->disk->integrity.tuple_size;
 
 	for (i = 0; i < nlb; i++, virt++, phys++) {
 		pi = (struct t10_pi_tuple *)p;
@@ -2034,8 +2034,7 @@ static int nvme_revalidate_disk(struct gendisk *disk)
 	ns->pi_type = pi_type;
 	blk_queue_logical_block_size(ns->queue, bs);
 
-	if (ns->ms && !blk_get_integrity(disk) && (disk->flags & GENHD_FL_UP) &&
-								!ns->ext)
+	if (ns->ms && !ns->ext)
 		nvme_init_integrity(ns);
 
 	if (ns->ms && !(ns->ms == 8 && ns->pi_type) && !blk_get_integrity(disk))
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index e76ed003769e..061152a43730 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1014,15 +1014,16 @@ static int dm_table_build_index(struct dm_table *t)
 	return r;
 }
 
+static bool integrity_profile_exists(struct gendisk *disk)
+{
+	return !!blk_get_integrity(disk);
+}
+
 /*
  * Get a disk whose integrity profile reflects the table's profile.
- * If %match_all is true, all devices' profiles must match.
- * If %match_all is false, all devices must at least have an
- * allocated integrity profile; but uninitialized is ok.
  * Returns NULL if integrity support was inconsistent or unavailable.
  */
-static struct gendisk * dm_table_get_integrity_disk(struct dm_table *t,
-						    bool match_all)
+static struct gendisk * dm_table_get_integrity_disk(struct dm_table *t)
 {
 	struct list_head *devices = dm_table_get_devices(t);
 	struct dm_dev_internal *dd = NULL;
@@ -1030,10 +1031,8 @@ static struct gendisk * dm_table_get_integrity_disk(struct dm_table *t,
 
 	list_for_each_entry(dd, devices, list) {
 		template_disk = dd->dm_dev->bdev->bd_disk;
-		if (!blk_get_integrity(template_disk))
+		if (!integrity_profile_exists(template_disk))
 			goto no_integrity;
-		if (!match_all && !blk_integrity_is_initialized(template_disk))
-			continue; /* skip uninitialized profiles */
 		else if (prev_disk &&
 			 blk_integrity_compare(prev_disk, template_disk) < 0)
 			goto no_integrity;
@@ -1052,34 +1051,40 @@ no_integrity:
 }
 
 /*
- * Register the mapped device for blk_integrity support if
- * the underlying devices have an integrity profile.  But all devices
- * may not have matching profiles (checking all devices isn't reliable
+ * Register the mapped device for blk_integrity support if the
+ * underlying devices have an integrity profile.  But all devices may
+ * not have matching profiles (checking all devices isn't reliable
  * during table load because this table may use other DM device(s) which
- * must be resumed before they will have an initialized integity profile).
- * Stacked DM devices force a 2 stage integrity profile validation:
- * 1 - during load, validate all initialized integrity profiles match
- * 2 - during resume, validate all integrity profiles match
+ * must be resumed before they will have an initialized integity
+ * profile).  Consequently, stacked DM devices force a 2 stage integrity
+ * profile validation: First pass during table load, final pass during
+ * resume.
  */
-static int dm_table_prealloc_integrity(struct dm_table *t, struct mapped_device *md)
+static int dm_table_register_integrity(struct dm_table *t)
 {
+	struct mapped_device *md = t->md;
 	struct gendisk *template_disk = NULL;
 
-	template_disk = dm_table_get_integrity_disk(t, false);
+	template_disk = dm_table_get_integrity_disk(t);
 	if (!template_disk)
 		return 0;
 
-	if (!blk_integrity_is_initialized(dm_disk(md))) {
+	if (!integrity_profile_exists(dm_disk(md))) {
 		t->integrity_supported = 1;
-		return blk_integrity_register(dm_disk(md), NULL);
+		/*
+		 * Register integrity profile during table load; we can do
+		 * this because the final profile must match during resume.
+		 */
+		blk_integrity_register(dm_disk(md),
+				       blk_get_integrity(template_disk));
+		return 0;
 	}
 
 	/*
-	 * If DM device already has an initalized integrity
+	 * If DM device already has an initialized integrity
 	 * profile the new profile should not conflict.
 	 */
-	if (blk_integrity_is_initialized(template_disk) &&
-	    blk_integrity_compare(dm_disk(md), template_disk) < 0) {
+	if (blk_integrity_compare(dm_disk(md), template_disk) < 0) {
 		DMWARN("%s: conflict with existing integrity profile: "
 		       "%s profile mismatch",
 		       dm_device_name(t->md),
@@ -1087,7 +1092,7 @@ static int dm_table_prealloc_integrity(struct dm_table *t, struct mapped_device
 		return 1;
 	}
 
-	/* Preserve existing initialized integrity profile */
+	/* Preserve existing integrity profile */
 	t->integrity_supported = 1;
 	return 0;
 }
@@ -1112,7 +1117,7 @@ int dm_table_complete(struct dm_table *t)
 		return r;
 	}
 
-	r = dm_table_prealloc_integrity(t, t->md);
+	r = dm_table_register_integrity(t);
 	if (r) {
 		DMERR("could not register integrity profile.");
 		return r;
@@ -1278,29 +1283,30 @@ combine_limits:
 }
 
 /*
- * Set the integrity profile for this device if all devices used have
- * matching profiles.  We're quite deep in the resume path but still
- * don't know if all devices (particularly DM devices this device
- * may be stacked on) have matching profiles.  Even if the profiles
- * don't match we have no way to fail (to resume) at this point.
+ * Verify that all devices have an integrity profile that matches the
+ * DM device's registered integrity profile.  If the profiles don't
+ * match then unregister the DM device's integrity profile.
  */
-static void dm_table_set_integrity(struct dm_table *t)
+static void dm_table_verify_integrity(struct dm_table *t)
 {
 	struct gendisk *template_disk = NULL;
 
-	if (!blk_get_integrity(dm_disk(t->md)))
-		return;
+	if (t->integrity_supported) {
+		/*
+		 * Verify that the original integrity profile
+		 * matches all the devices in this table.
+		 */
+		template_disk = dm_table_get_integrity_disk(t);
+		if (template_disk &&
+		    blk_integrity_compare(dm_disk(t->md), template_disk) >= 0)
+			return;
+	}
 
-	template_disk = dm_table_get_integrity_disk(t, true);
-	if (template_disk)
-		blk_integrity_register(dm_disk(t->md),
-				       blk_get_integrity(template_disk));
-	else if (blk_integrity_is_initialized(dm_disk(t->md)))
-		DMWARN("%s: device no longer has a valid integrity profile",
-		       dm_device_name(t->md));
-	else
+	if (integrity_profile_exists(dm_disk(t->md))) {
 		DMWARN("%s: unable to establish an integrity profile",
 		       dm_device_name(t->md));
+		blk_integrity_unregister(dm_disk(t->md));
+	}
 }
 
 static int device_flush_capable(struct dm_target *ti, struct dm_dev *dev,
@@ -1500,7 +1506,7 @@ void dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
 	else
 		queue_flag_set_unlocked(QUEUE_FLAG_NO_SG_MERGE, q);
 
-	dm_table_set_integrity(t);
+	dm_table_verify_integrity(t);
 
 	/*
 	 * Determine whether or not this queue's I/O timings contribute
diff --git a/drivers/md/md.c b/drivers/md/md.c
index c702de18207a..2af9d590e1a0 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -1962,12 +1962,9 @@ int md_integrity_register(struct mddev *mddev)
 	 * All component devices are integrity capable and have matching
 	 * profiles, register the common profile for the md device.
 	 */
-	if (blk_integrity_register(mddev->gendisk,
-			bdev_get_integrity(reference->bdev)) != 0) {
-		printk(KERN_ERR "md: failed to register integrity for %s\n",
-			mdname(mddev));
-		return -EINVAL;
-	}
+	blk_integrity_register(mddev->gendisk,
+			       bdev_get_integrity(reference->bdev));
+
 	printk(KERN_NOTICE "md: data integrity enabled on %s\n", mdname(mddev));
 	if (bioset_integrity_create(mddev->bio_set, BIO_POOL_SIZE)) {
 		printk(KERN_ERR "md: failed to create integrity pool for %s\n",
diff --git a/drivers/nvdimm/core.c b/drivers/nvdimm/core.c
index 9e1b0f656a9b..9b6ac57c6e73 100644
--- a/drivers/nvdimm/core.c
+++ b/drivers/nvdimm/core.c
@@ -405,7 +405,6 @@ int nd_integrity_init(struct gendisk *disk, unsigned long meta_size)
 		.generate_fn = nd_pi_nop_generate_verify,
 		.verify_fn = nd_pi_nop_generate_verify,
 	};
-	int ret;
 
 	if (meta_size == 0)
 		return 0;
@@ -414,10 +413,7 @@ int nd_integrity_init(struct gendisk *disk, unsigned long meta_size)
 	bi.tuple_size = meta_size;
 	bi.tag_size = meta_size;
 
-	ret = blk_integrity_register(disk, &bi);
-	if (ret)
-		return ret;
-
+	blk_integrity_register(disk, &bi);
 	blk_queue_max_integrity_segments(disk->queue, 1);
 
 	return 0;
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 073bb57adab1..0a793c7930eb 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1075,7 +1075,7 @@ int revalidate_disk(struct gendisk *disk)
 
 	if (disk->fops->revalidate_disk)
 		ret = disk->fops->revalidate_disk(disk);
-
+	blk_integrity_revalidate(disk);
 	bdev = bdget_disk(disk, 0);
 	if (!bdev)
 		return ret;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 4f1968f15e30..60669c20190f 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1468,16 +1468,7 @@ struct blk_integrity_profile {
 	const char			*name;
 };
 
-struct blk_integrity {
-	struct blk_integrity_profile	*profile;
-	unsigned char			flags;
-	unsigned char			tuple_size;
-	unsigned char			interval_exp;
-	unsigned char			tag_size;
-};
-
-extern bool blk_integrity_is_initialized(struct gendisk *);
-extern int blk_integrity_register(struct gendisk *, struct blk_integrity *);
+extern void blk_integrity_register(struct gendisk *, struct blk_integrity *);
 extern void blk_integrity_unregister(struct gendisk *);
 extern int blk_integrity_compare(struct gendisk *, struct gendisk *);
 extern int blk_rq_map_integrity_sg(struct request_queue *, struct bio *,
@@ -1488,15 +1479,20 @@ extern bool blk_integrity_merge_rq(struct request_queue *, struct request *,
 extern bool blk_integrity_merge_bio(struct request_queue *, struct request *,
 				    struct bio *);
 
-static inline
-struct blk_integrity *bdev_get_integrity(struct block_device *bdev)
+static inline struct blk_integrity *blk_get_integrity(struct gendisk *disk)
 {
-	return bdev->bd_disk->integrity;
+	struct blk_integrity *bi = &disk->integrity;
+
+	if (!bi->profile)
+		return NULL;
+
+	return bi;
 }
 
-static inline struct blk_integrity *blk_get_integrity(struct gendisk *disk)
+static inline
+struct blk_integrity *bdev_get_integrity(struct block_device *bdev)
 {
-	return disk->integrity;
+	return blk_get_integrity(bdev->bd_disk);
 }
 
 static inline bool blk_integrity_rq(struct request *rq)
@@ -1570,10 +1566,9 @@ static inline int blk_integrity_compare(struct gendisk *a, struct gendisk *b)
 {
 	return 0;
 }
-static inline int blk_integrity_register(struct gendisk *d,
+static inline void blk_integrity_register(struct gendisk *d,
 					 struct blk_integrity *b)
 {
-	return 0;
 }
 static inline void blk_integrity_unregister(struct gendisk *d)
 {
@@ -1598,10 +1593,7 @@ static inline bool blk_integrity_merge_bio(struct request_queue *rq,
 {
 	return true;
 }
-static inline bool blk_integrity_is_initialized(struct gendisk *g)
-{
-	return 0;
-}
+
 static inline bool integrity_req_gap_back_merge(struct request *req,
 						struct bio *next)
 {
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 9e6e0dfa97ad..82f4911e0ad8 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -163,6 +163,18 @@ struct disk_part_tbl {
 
 struct disk_events;
 
+#if defined(CONFIG_BLK_DEV_INTEGRITY)
+
+struct blk_integrity {
+	struct blk_integrity_profile	*profile;
+	unsigned char			flags;
+	unsigned char			tuple_size;
+	unsigned char			interval_exp;
+	unsigned char			tag_size;
+};
+
+#endif	/* CONFIG_BLK_DEV_INTEGRITY */
+
 struct gendisk {
 	/* major, first_minor and minors are input parameters only,
 	 * don't use directly.  Use disk_devt() and disk_max_parts().
@@ -198,9 +210,9 @@ struct gendisk {
 	atomic_t sync_io;		/* RAID */
 	struct disk_events *ev;
 #ifdef  CONFIG_BLK_DEV_INTEGRITY
-	struct blk_integrity *integrity;
+	struct blk_integrity integrity;
 	struct kobject integrity_kobj;
-#endif
+#endif	/* CONFIG_BLK_DEV_INTEGRITY */
 	int node_id;
 };
 
@@ -728,6 +740,16 @@ static inline void part_nr_sects_write(struct hd_struct *part, sector_t size)
 #endif
 }
 
+#if defined(CONFIG_BLK_DEV_INTEGRITY)
+extern void blk_integrity_add(struct gendisk *);
+extern void blk_integrity_del(struct gendisk *);
+extern void blk_integrity_revalidate(struct gendisk *);
+#else	/* CONFIG_BLK_DEV_INTEGRITY */
+static inline void blk_integrity_add(struct gendisk *disk) { }
+static inline void blk_integrity_del(struct gendisk *disk) { }
+static inline void blk_integrity_revalidate(struct gendisk *disk) { }
+#endif	/* CONFIG_BLK_DEV_INTEGRITY */
+
 #else /* CONFIG_BLOCK */
 
 static inline void printk_all_partitions(void) { }
-- 
2.4.3


Thread overview: 67+ messages
2015-07-14 17:57 [PATCH] NVMe: Reread partitions on metadata formats Keith Busch
2015-07-14 18:49 ` Jens Axboe
2015-07-14 19:01   ` Paul Grabinar
2015-07-15  3:48   ` Dan Williams
2015-07-15 21:36     ` Jens Axboe
2015-07-15 22:28       ` Keith Busch
2015-07-16  9:19         ` Christoph Hellwig
2015-07-17  1:47           ` Martin K. Petersen
2015-07-17  9:30             ` Christoph Hellwig
2015-07-21  6:02           ` Data integrity tweaks Martin K. Petersen
2015-07-21  6:02             ` [PATCH 1/5] block: Move integrity kobject to struct gendisk Martin K. Petersen
2015-07-22 11:32               ` Sagi Grimberg
2015-07-21  6:02             ` [PATCH 2/5] block: Consolidate static integrity profile properties Martin K. Petersen
2015-07-22 11:33               ` Sagi Grimberg
2015-07-21  6:02             ` [PATCH 3/5] block: Reduce the size of struct blk_integrity Martin K. Petersen
2015-07-21 11:53               ` Christoph Hellwig
2015-07-24 15:14                 ` Martin K. Petersen
2015-07-22 11:35               ` Sagi Grimberg
2015-07-21  6:02             ` [PATCH 4/5] block: Export integrity data interval size in sysfs Martin K. Petersen
2015-07-22 11:37               ` Sagi Grimberg
2015-07-24 15:26                 ` Martin K. Petersen
2015-07-21  6:02             ` [PATCH 5/5] block: Inline blk_integrity in struct gendisk Martin K. Petersen
2015-07-21 12:01               ` Christoph Hellwig
2015-08-20 20:41                 ` Simplify block integrity registration Martin K. Petersen
2015-08-20 20:41                   ` [PATCH 1/5] block: Move integrity kobject to struct gendisk Martin K. Petersen
2015-09-16 17:26                     ` Mike Snitzer
2015-09-16 17:26                       ` Mike Snitzer
2015-08-20 20:41                   ` [PATCH 2/5] block: Consolidate static integrity profile properties Martin K. Petersen
2015-08-20 20:41                   ` [PATCH 3/5] block: Reduce the size of struct blk_integrity Martin K. Petersen
2015-08-20 20:41                   ` [PATCH 4/5] block: Export integrity data interval size in sysfs Martin K. Petersen
2015-08-20 20:41                   ` [PATCH 5/5] block: Inline blk_integrity in struct gendisk Martin K. Petersen
2015-08-21 23:47                     ` Busch, Keith
2015-08-27  0:25                       ` Martin K. Petersen
2015-08-27  0:25                       ` [PATCH] " Martin K. Petersen
2015-08-27  8:28                         ` Christoph Hellwig
2015-09-03 20:38                         ` Keith Busch
2015-09-04  3:25                           ` Martin K. Petersen
2015-10-12 21:05                       ` Block integrity registration update Martin K. Petersen
2015-10-12 21:05                         ` [PATCH 1/5] block: Move integrity kobject to struct gendisk Martin K. Petersen
2015-10-12 21:05                         ` [PATCH 2/5] block: Consolidate static integrity profile properties Martin K. Petersen
2015-10-14  1:11                           ` Dan Williams
2015-10-14  7:23                             ` Christoph Hellwig
2015-10-14 19:42                               ` Williams, Dan J
2015-10-14 19:47                                 ` hch
2015-10-14 20:00                                   ` Williams, Dan J
2015-10-14 22:42                                     ` Martin K. Petersen
2015-10-12 21:05                         ` [PATCH 3/5] block: Reduce the size of struct blk_integrity Martin K. Petersen
2015-10-12 21:05                         ` [PATCH 4/5] block: Export integrity data interval size in sysfs Martin K. Petersen
2015-10-12 21:05                         ` [PATCH 5/5] block: Inline blk_integrity in struct gendisk Martin K. Petersen
2015-10-12 23:06                           ` Mike Snitzer
2015-10-13  0:31                         ` Block integrity registration update Williams, Dan J
2015-10-13  1:53                           ` Williams, Dan J
2015-10-13 12:26                             ` hch
2015-10-13 17:38                               ` Dan Williams
2015-10-13 17:38                                 ` Dan Williams
2015-09-16  1:07                     ` [PATCH 5/5] block: Inline blk_integrity in struct gendisk Mike Snitzer
2015-09-16  1:07                       ` Mike Snitzer
2015-09-21 20:45                       ` Mike Snitzer
2015-09-21 20:45                         ` Mike Snitzer
2015-10-09  7:36                         ` Christoph Hellwig
2015-10-09  7:36                           ` Christoph Hellwig
2015-10-12  1:17                           ` Martin K. Petersen
2015-10-12  1:17                             ` Martin K. Petersen
2015-08-20 20:45                   ` Simplify block integrity registration Mike Snitzer
2015-07-17  1:44         ` [PATCH] NVMe: Reread partitions on metadata formats Martin K. Petersen
2015-07-15 17:37   ` Paul Grabinar
2015-10-20  2:24 [PATCH v2 10/12] block: move blk_integrity to request_queue Martin K. Petersen
2015-10-20  2:45 ` Simplify block integrity registration v2 Martin K. Petersen
2015-10-20  2:45   ` [PATCH 5/5] block: Inline blk_integrity in struct gendisk Martin K. Petersen
