linux-nvme.lists.infradead.org archive mirror
* [PATCH 00/17 V5] nvme-rdma/nvmet-rdma: Add metadata/T10-PI support
@ 2020-03-27 17:15 Max Gurtovoy
  2020-03-27 17:15 ` [PATCH 1/1] nvme-cli/fabrics: Add pi_enable param to connect cmd Max Gurtovoy
                   ` (17 more replies)
  0 siblings, 18 replies; 59+ messages in thread
From: Max Gurtovoy @ 2020-03-27 17:15 UTC (permalink / raw)
  To: linux-nvme, kbusch, hch, sagi, martin.petersen, jsmart2021, linux-rdma
  Cc: axboe, vladimirk, shlomin, israelr, idanb, jgg, oren, maxg

Hello Sagi, Christoph, Keith, Martin, James and Co

This patchset adds metadata (T10-PI) support for the NVMeoF/RDMA host side
and target side, using the signature verbs API. The set starts with a few
preparation commits to the NVMe host core layer. It continues with the
NVMeoF/RDMA host implementation, plus a few preparation commits to the
RDMA/rw API and to the NVMe target core layer. The patchset ends with the
NVMeoF/RDMA target implementation. A patch for nvme-cli is also included
in this series.

In V5 I mainly did some renaming and removed two get_mdts patches that
were already merged to the main branch. I tried to get some inspiration
from James' suggestion of the settings in the NVMe core, but unfortunately
couldn't take much from there, so I stayed with Christoph's suggestion of
a features flag per namespace. I found the code more readable in this
form, and hopefully we can continue with the review and merge soon.

Configuration:
Host:
 - nvme connect --pi_enable --transport=rdma --traddr=10.0.1.1 --nqn=test-nvme

Target:
 - echo 1 > /config/nvmet/subsystems/${NAME}/attr_pi_enable
 - echo 1 > /config/nvmet/ports/${PORT_NUM}/param_pi_enable

The code was tested using Mellanox's ConnectX-4/ConnectX-5 HCAs.
This series applies on top of nvme_5.7 branch cleanly.

Changes from v4:
 - removed get_mdts patches (merged)
 - added enum nvme_ns_features instead of defines (patch 1/17)
 - rename pi/prot to md (patches 2/17 + 6/17 + 8/17 + 9/17 + 10/17)
 - another rebase

Changes from v3:
 - Added Reviewed-by signatures
 - New RDMA/rw patch (Patch 17/19)
 - Add mdts setting op for controllers (Patches 14/19, 18/19)
 - Rename NVME_NS_DIX_SUPPORTED to NVME_NS_MD_HOST_SUPPORTED and
   NVME_NS_DIF_SUPPORTED to NVME_NS_MD_CTRL_SUPPORTED (Patch 01/19)
 - Split "nvme: Introduce namespace features flag" patch (patch 02/19)
 - Rename nvmet_rdma_set_diff_domain to nvmet_rdma_set_sig_domain
   and nvme_rdma_set_diff_domain to nvme_rdma_set_sig_domain
   (Patches 08/19, 19/19)
 - Remove ns parameter from nvme_rdma_set_sig_domain/nvmet_rdma_set_sig_domain
   functions (patch 08/19, 19/19)
 - Rebase over nvme-5.7 branch

Changes from v2:
 - Convert the virtual start sector (which passed to bip_set_seed function)
   to be in integrity interval units (Patch 14/15)
 - Clarify some commit messages

Changes from v1:
 - Added Reviewed-by signatures
 - Added namespace features flag (Patch 01/15)
 - Remove nvme_ns_has_pi function (Patch 01/15)
 - Added has_pi field to struct nvme_request (Patch 01/15)
 - Subject change for patch 02/15
 - Fix comment for PCI metadata (Patch 03/15)
 - Rebase over "nvme: Avoid preallocating big SGL for data" patchset
 - Introduce NVME_INLINE_PROT_SG_CNT flag (Patch 05/15)
 - Introduce nvme_rdma_sgl structure (Patch 06/15)
 - Remove first_sgl pointer from struct nvme_rdma_request (Patch 06/15)
 - Split nvme-rdma patches (Patches 06/15, 07/15)
 - Rename is_protected to use_pi (Patch 07/15)
 - Refactor nvme_rdma_get_max_fr_pages function (Patch 07/15)
 - Added ifdef CONFIG_BLK_DEV_INTEGRITY (Patches 07/15, 09/15, 13/15,
   14/15, 15/15)
 - Added port configfs pi_enable (Patch 14/15)


Israel Rukshin (12):
  nvme: introduce namespace features flag
  nvme: Add has_md field to the nvme_req structure
  nvme-fabrics: Allow user enabling metadata/T10-PI support
  nvme: introduce NVME_INLINE_MD_SG_CNT
  nvme-rdma: Introduce nvme_rdma_sgl structure
  nvmet: prepare metadata request
  nvmet: add metadata characteristics for a namespace
  nvmet: Rename nvmet_rw_len to nvmet_rw_data_len
  nvmet: Rename nvmet_check_data_len to nvmet_check_transfer_len
  nvme: Add Metadata Capabilities enumerations
  nvmet: Add metadata/T10-PI support
  nvmet: Add metadata support for block devices

Max Gurtovoy (5):
  nvme: Enforce extended LBA format for fabrics metadata
  nvme: introduce max_integrity_segments ctrl attribute
  nvme-rdma: add metadata/T10-PI support
  RDMA/rw: Expose maximal page list for a device per 1 MR
  nvmet-rdma: Add metadata/T10-PI support

 drivers/infiniband/core/rw.c      |  14 +-
 drivers/nvme/host/core.c          |  79 +++++---
 drivers/nvme/host/fabrics.c       |  11 ++
 drivers/nvme/host/fabrics.h       |   3 +
 drivers/nvme/host/nvme.h          |  12 +-
 drivers/nvme/host/pci.c           |   7 +
 drivers/nvme/host/rdma.c          | 367 +++++++++++++++++++++++++++++++++-----
 drivers/nvme/target/admin-cmd.c   |  33 +++-
 drivers/nvme/target/configfs.c    |  61 +++++++
 drivers/nvme/target/core.c        |  54 ++++--
 drivers/nvme/target/discovery.c   |   8 +-
 drivers/nvme/target/fabrics-cmd.c |  19 +-
 drivers/nvme/target/io-cmd-bdev.c | 113 +++++++++++-
 drivers/nvme/target/io-cmd-file.c |   6 +-
 drivers/nvme/target/nvmet.h       |  38 +++-
 drivers/nvme/target/rdma.c        | 245 +++++++++++++++++++++++--
 include/linux/nvme.h              |   2 +
 include/rdma/rw.h                 |   1 +
 18 files changed, 946 insertions(+), 127 deletions(-)

-- 
1.8.3.1


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH 1/1] nvme-cli/fabrics: Add pi_enable param to connect cmd
  2020-03-27 17:15 [PATCH 00/17 V5] nvme-rdma/nvmet-rdma: Add metadata/T10-PI support Max Gurtovoy
@ 2020-03-27 17:15 ` Max Gurtovoy
  2020-03-27 17:15 ` [PATCH 01/17] nvme: introduce namespace features flag Max Gurtovoy
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 59+ messages in thread
From: Max Gurtovoy @ 2020-03-27 17:15 UTC (permalink / raw)
  To: linux-nvme, kbusch, hch, sagi, martin.petersen, jsmart2021, linux-rdma
  Cc: axboe, vladimirk, shlomin, israelr, idanb, jgg, oren, maxg

From: Israel Rukshin <israelr@mellanox.com>

Add a 'pi_enable' option to the 'connect' command so users can enable
metadata (T10-PI) support.

usage examples:
nvme connect --pi_enable --transport=rdma --traddr=10.0.1.1 --nqn=test-nvme
nvme connect -p -t rdma -a 10.0.1.1 -n test-nvme
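
The way boolean options such as pi_enable end up in the connect argument
string can be sketched as follows. This is an illustrative simplification
of the build_options()/sprintf logic in fabrics.c, not the actual nvme-cli
code; the dict-based cfg is an assumption for brevity:

```python
def build_argstr(cfg):
    """Build a connect argument string; boolean options such as
    pi_enable are appended by name only, with no '=value' part."""
    argstr = "nqn={},transport={},traddr={}".format(
        cfg["nqn"], cfg["transport"], cfg["traddr"])
    for flag in ("hdr_digest", "data_digest", "pi_enable"):
        if cfg.get(flag):
            argstr += "," + flag
    return argstr

print(build_argstr({"nqn": "test-nvme", "transport": "rdma",
                    "traddr": "10.0.1.1", "pi_enable": True}))
```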

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
---
 fabrics.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/fabrics.c b/fabrics.c
index a112f76..1b5b3f7 100644
--- a/fabrics.c
+++ b/fabrics.c
@@ -78,6 +78,7 @@ static struct config {
 	int  disable_sqflow;
 	int  hdr_digest;
 	int  data_digest;
+	int  pi_enable;
 	bool persistent;
 	bool quiet;
 } cfg = { NULL };
@@ -851,7 +852,9 @@ static int build_options(char *argstr, int max_len, bool discover)
 	    add_bool_argument(&argstr, &max_len, "disable_sqflow",
 				cfg.disable_sqflow) ||
 	    add_bool_argument(&argstr, &max_len, "hdr_digest", cfg.hdr_digest) ||
-	    add_bool_argument(&argstr, &max_len, "data_digest", cfg.data_digest))
+	    add_bool_argument(&argstr, &max_len, "data_digest",
+				cfg.data_digest) ||
+	    add_bool_argument(&argstr, &max_len, "pi_enable", cfg.pi_enable))
 		return -EINVAL;
 
 	return 0;
@@ -980,6 +983,13 @@ retry:
 		p += len;
 	}
 
+	if (cfg.pi_enable) {
+		len = sprintf(p, ",pi_enable");
+		if (len < 0)
+			return -EINVAL;
+		p += len;
+	}
+
 	switch (e->trtype) {
 	case NVMF_TRTYPE_RDMA:
 	case NVMF_TRTYPE_TCP:
@@ -1259,6 +1269,7 @@ int discover(const char *desc, int argc, char **argv, bool connect)
 		OPT_INT("tos",             'T', &cfg.tos,             "type of service"),
 		OPT_FLAG("hdr_digest",     'g', &cfg.hdr_digest,      "enable transport protocol header digest (TCP transport)"),
 		OPT_FLAG("data_digest",    'G', &cfg.data_digest,     "enable transport protocol data digest (TCP transport)"),
+		OPT_FLAG("pi_enable",      'p', &cfg.pi_enable,       "enable metadata (T10-PI) support (default false)"),
 		OPT_INT("nr-io-queues",    'i', &cfg.nr_io_queues,    "number of io queues to use (default is core count)"),
 		OPT_INT("nr-write-queues", 'W', &cfg.nr_write_queues, "number of write queues to use (default 0)"),
 		OPT_INT("nr-poll-queues",  'P', &cfg.nr_poll_queues,  "number of poll queues to use (default 0)"),
@@ -1319,6 +1330,7 @@ int connect(const char *desc, int argc, char **argv)
 		OPT_FLAG("disable-sqflow",    'd', &cfg.disable_sqflow,    "disable controller sq flow control (default false)"),
 		OPT_FLAG("hdr-digest",        'g', &cfg.hdr_digest,        "enable transport protocol header digest (TCP transport)"),
 		OPT_FLAG("data-digest",       'G', &cfg.data_digest,       "enable transport protocol data digest (TCP transport)"),
+		OPT_FLAG("pi_enable",         'p', &cfg.pi_enable,         "enable metadata (T10-PI) support (default false)"),
 		OPT_END()
 	};
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 01/17] nvme: introduce namespace features flag
  2020-03-27 17:15 [PATCH 00/17 V5] nvme-rdma/nvmet-rdma: Add metadata/T10-PI support Max Gurtovoy
  2020-03-27 17:15 ` [PATCH 1/1] nvme-cli/fabrics: Add pi_enable param to connect cmd Max Gurtovoy
@ 2020-03-27 17:15 ` Max Gurtovoy
  2020-04-21 11:59   ` Christoph Hellwig
  2020-03-27 17:15 ` [PATCH 02/17] nvme: Add has_md field to the nvme_req structure Max Gurtovoy
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 59+ messages in thread
From: Max Gurtovoy @ 2020-03-27 17:15 UTC (permalink / raw)
  To: linux-nvme, kbusch, hch, sagi, martin.petersen, jsmart2021, linux-rdma
  Cc: axboe, vladimirk, shlomin, israelr, idanb, jgg, oren, maxg

From: Israel Rukshin <israelr@mellanox.com>

Centralize all the metadata checks in one place and make the code more
readable. Introduce a new enum, nvme_ns_features, for this purpose.
The feature flag descriptions:
 - NVME_NS_EXT_LBAS - the NVMe namespace supports the extended LBA format.
 - NVME_NS_MD_HOST_SUPPORTED - the NVMe namespace supports getting metadata
   from the host's block layer.
 - NVME_NS_MD_CTRL_SUPPORTED - the NVMe namespace supports metadata actions
   by the controller (generate/strip).
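
How the flags above are derived from the identify data can be sketched in
a few lines. This mirrors the flow of __nvme_revalidate_disk() in this
patch but is a simplified, hypothetical model (plain booleans instead of
id->flbas/ops->flags), not the kernel code:

```python
NVME_NS_EXT_LBAS = 1 << 0
NVME_NS_MD_HOST_SUPPORTED = 1 << 1
NVME_NS_MD_CTRL_SUPPORTED = 1 << 2

T10_PI_TUPLE_SIZE = 8  # sizeof(struct t10_pi_tuple)

def ns_features(ms, pi_type, meta_ext, ctrl_supports_metadata):
    """Derive the per-namespace feature mask (simplified sketch)."""
    features = 0
    # PI requires metadata equal to the t10 pi tuple size
    if ms == T10_PI_TUPLE_SIZE and pi_type:
        features |= NVME_NS_MD_CTRL_SUPPORTED
    if ms:
        if meta_ext:
            features |= NVME_NS_EXT_LBAS
        # a separate metadata buffer is handled by the host block layer
        if ctrl_supports_metadata and not (features & NVME_NS_EXT_LBAS):
            features |= NVME_NS_MD_HOST_SUPPORTED
    return features
```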

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
---
 drivers/nvme/host/core.c | 42 ++++++++++++++++++++++++++++--------------
 drivers/nvme/host/nvme.h |  8 +++++++-
 2 files changed, 35 insertions(+), 15 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 1cab3c6..f3a184f 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -203,11 +203,6 @@ static void nvme_delete_ctrl_sync(struct nvme_ctrl *ctrl)
 	nvme_put_ctrl(ctrl);
 }
 
-static inline bool nvme_ns_has_pi(struct nvme_ns *ns)
-{
-	return ns->pi_type && ns->ms == sizeof(struct t10_pi_tuple);
-}
-
 static blk_status_t nvme_error_status(u16 status)
 {
 	switch (status & 0x7ff) {
@@ -706,7 +701,8 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
 		 * namespace capacity to zero to prevent any I/O.
 		 */
 		if (!blk_integrity_rq(req)) {
-			if (WARN_ON_ONCE(!nvme_ns_has_pi(ns)))
+			if (WARN_ON_ONCE(!(ns->features &
+					   NVME_NS_MD_CTRL_SUPPORTED)))
 				return BLK_STS_NOTSUPP;
 			control |= NVME_RW_PRINFO_PRACT;
 		}
@@ -1277,7 +1273,7 @@ static int nvme_submit_io(struct nvme_ns *ns, struct nvme_user_io __user *uio)
 	meta_len = (io.nblocks + 1) * ns->ms;
 	metadata = (void __user *)(uintptr_t)io.metadata;
 
-	if (ns->ext) {
+	if (ns->features & NVME_NS_EXT_LBAS) {
 		length += meta_len;
 		meta_len = 0;
 	} else if (meta_len) {
@@ -1837,11 +1833,12 @@ static void nvme_update_disk_info(struct gendisk *disk,
 	blk_queue_io_min(disk->queue, phys_bs);
 	blk_queue_io_opt(disk->queue, io_opt);
 
-	if (ns->ms && !ns->ext &&
-	    (ns->ctrl->ops->flags & NVME_F_METADATA_SUPPORTED))
+	if (ns->features & NVME_NS_MD_HOST_SUPPORTED)
 		nvme_init_integrity(disk, ns->ms, ns->pi_type);
-	if ((ns->ms && !nvme_ns_has_pi(ns) && !blk_get_integrity(disk)) ||
-	    ns->lba_shift > PAGE_SHIFT)
+
+	if ((ns->ms && !(ns->features & NVME_NS_MD_CTRL_SUPPORTED) &&
+	     !(ns->features & NVME_NS_MD_HOST_SUPPORTED) &&
+	     !blk_get_integrity(disk)) || ns->lba_shift > PAGE_SHIFT)
 		capacity = 0;
 
 	set_capacity(disk, capacity);
@@ -1870,12 +1867,29 @@ static void __nvme_revalidate_disk(struct gendisk *disk, struct nvme_id_ns *id)
 		ns->lba_shift = 9;
 	ns->noiob = le16_to_cpu(id->noiob);
 	ns->ms = le16_to_cpu(id->lbaf[id->flbas & NVME_NS_FLBAS_LBA_MASK].ms);
-	ns->ext = ns->ms && (id->flbas & NVME_NS_FLBAS_META_EXT);
+	ns->features = 0;
 	/* the PI implementation requires metadata equal t10 pi tuple size */
-	if (ns->ms == sizeof(struct t10_pi_tuple))
+	if (ns->ms == sizeof(struct t10_pi_tuple)) {
 		ns->pi_type = id->dps & NVME_NS_DPS_PI_MASK;
-	else
+		if (ns->pi_type)
+			ns->features |= NVME_NS_MD_CTRL_SUPPORTED;
+	} else {
 		ns->pi_type = 0;
+	}
+
+	if (ns->ms) {
+		if (id->flbas & NVME_NS_FLBAS_META_EXT)
+			ns->features |= NVME_NS_EXT_LBAS;
+
+		/*
+		 * For PCI, Extended logical block will be generated by the
+		 * controller.
+		 */
+		if (ns->ctrl->ops->flags & NVME_F_METADATA_SUPPORTED) {
+			if (!(ns->features & NVME_NS_EXT_LBAS))
+				ns->features |= NVME_NS_MD_HOST_SUPPORTED;
+		}
+	}
 
 	if (ns->noiob)
 		nvme_set_chunk_size(ns);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 2e04a36..83296d0 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -363,6 +363,12 @@ struct nvme_ns_head {
 #endif
 };
 
+enum nvme_ns_features {
+	NVME_NS_EXT_LBAS = 1 << 0,
+	NVME_NS_MD_HOST_SUPPORTED = 1 << 1,
+	NVME_NS_MD_CTRL_SUPPORTED = 1 << 2,
+};
+
 struct nvme_ns {
 	struct list_head list;
 
@@ -382,8 +388,8 @@ struct nvme_ns {
 	u16 ms;
 	u16 sgs;
 	u32 sws;
-	bool ext;
 	u8 pi_type;
+	unsigned long features;
 	unsigned long flags;
 #define NVME_NS_REMOVING	0
 #define NVME_NS_DEAD     	1
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 02/17] nvme: Add has_md field to the nvme_req structure
  2020-03-27 17:15 [PATCH 00/17 V5] nvme-rdma/nvmet-rdma: Add metadata/T10-PI support Max Gurtovoy
  2020-03-27 17:15 ` [PATCH 1/1] nvme-cli/fabrics: Add pi_enable param to connect cmd Max Gurtovoy
  2020-03-27 17:15 ` [PATCH 01/17] nvme: introduce namespace features flag Max Gurtovoy
@ 2020-03-27 17:15 ` Max Gurtovoy
  2020-04-21 11:59   ` Christoph Hellwig
  2020-03-27 17:15 ` [PATCH 03/17] nvme: Enforce extended LBA format for fabrics metadata Max Gurtovoy
                   ` (14 subsequent siblings)
  17 siblings, 1 reply; 59+ messages in thread
From: Max Gurtovoy @ 2020-03-27 17:15 UTC (permalink / raw)
  To: linux-nvme, kbusch, hch, sagi, martin.petersen, jsmart2021, linux-rdma
  Cc: axboe, vladimirk, shlomin, israelr, idanb, jgg, oren, maxg

From: Israel Rukshin <israelr@mellanox.com>

Transport drivers will use this field to determine if the request has
metadata.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
---
 drivers/nvme/host/core.c | 6 ++++--
 drivers/nvme/host/nvme.h | 1 +
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index f3a184f..f34ff34 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -461,6 +461,7 @@ static inline void nvme_clear_nvme_request(struct request *req)
 	if (!(req->rq_flags & RQF_DONTPREP)) {
 		nvme_req(req)->retries = 0;
 		nvme_req(req)->flags = 0;
+		nvme_req(req)->has_md = false;
 		req->rq_flags |= RQF_DONTPREP;
 	}
 }
@@ -694,6 +695,8 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
 		nvme_assign_write_stream(ctrl, req, &control, &dsmgmt);
 
 	if (ns->ms) {
+		nvme_req(req)->has_md =
+			ns->features & NVME_NS_MD_CTRL_SUPPORTED;
 		/*
 		 * If formated with metadata, the block layer always provides a
 		 * metadata buffer if CONFIG_BLK_DEV_INTEGRITY is enabled.  Else
@@ -701,8 +704,7 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
 		 * namespace capacity to zero to prevent any I/O.
 		 */
 		if (!blk_integrity_rq(req)) {
-			if (WARN_ON_ONCE(!(ns->features &
-					   NVME_NS_MD_CTRL_SUPPORTED)))
+			if (WARN_ON_ONCE(!nvme_req(req)->has_md))
 				return BLK_STS_NOTSUPP;
 			control |= NVME_RW_PRINFO_PRACT;
 		}
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 83296d0..a353255 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -139,6 +139,7 @@ struct nvme_request {
 	u8			flags;
 	u16			status;
 	struct nvme_ctrl	*ctrl;
+	bool			has_md;
 };
 
 /*
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 03/17] nvme: Enforce extended LBA format for fabrics metadata
  2020-03-27 17:15 [PATCH 00/17 V5] nvme-rdma/nvmet-rdma: Add metadata/T10-PI support Max Gurtovoy
                   ` (2 preceding siblings ...)
  2020-03-27 17:15 ` [PATCH 02/17] nvme: Add has_md field to the nvme_req structure Max Gurtovoy
@ 2020-03-27 17:15 ` Max Gurtovoy
  2020-04-21 12:08   ` Christoph Hellwig
  2020-03-27 17:15 ` [PATCH 04/17] nvme: introduce max_integrity_segments ctrl attribute Max Gurtovoy
                   ` (13 subsequent siblings)
  17 siblings, 1 reply; 59+ messages in thread
From: Max Gurtovoy @ 2020-03-27 17:15 UTC (permalink / raw)
  To: linux-nvme, kbusch, hch, sagi, martin.petersen, jsmart2021, linux-rdma
  Cc: axboe, vladimirk, shlomin, israelr, idanb, jgg, oren, maxg

An extended LBA is a larger LBA that is created when the metadata
associated with the LBA is transferred contiguously with the LBA data
(i.e. interleaved). The metadata may be transferred either as part of the
LBA (creating an extended LBA) or as a separate contiguous buffer.
According to the NVMeoF spec, a fabrics controller supports only the
extended LBA format. Fail revalidation in case of a spec violation. Also
initialize the integrity profile of the block device for fabrics
controllers.
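
The enforcement added here boils down to a single check during
revalidation. The sketch below is a hypothetical simplification of that
check (flat arguments instead of the ns/ctrl structures), not the kernel
code:

```python
NVME_NS_EXT_LBAS = 1 << 0
EINVAL = 22

def check_fabrics_md(features, ms, is_fabrics):
    """Fabrics controllers must use the extended LBA format for
    metadata; fail revalidation otherwise (simplified sketch)."""
    if ms and is_fabrics and not (features & NVME_NS_EXT_LBAS):
        return -EINVAL
    return 0
```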

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
---
 drivers/nvme/host/core.c | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index f34ff34..b93a8c6 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1856,7 +1856,7 @@ static void nvme_update_disk_info(struct gendisk *disk,
 	blk_mq_unfreeze_queue(disk->queue);
 }
 
-static void __nvme_revalidate_disk(struct gendisk *disk, struct nvme_id_ns *id)
+static int __nvme_revalidate_disk(struct gendisk *disk, struct nvme_id_ns *id)
 {
 	struct nvme_ns *ns = disk->private_data;
 
@@ -1884,11 +1884,21 @@ static void __nvme_revalidate_disk(struct gendisk *disk, struct nvme_id_ns *id)
 			ns->features |= NVME_NS_EXT_LBAS;
 
 		/*
+		 * For Fabrics, only metadata as part of extended data LBA is
+		 * supported. Fail in case of a spec violation.
+		 */
+		if (ns->ctrl->ops->flags & NVME_F_FABRICS) {
+			if (WARN_ON_ONCE(!(ns->features & NVME_NS_EXT_LBAS)))
+				return -EINVAL;
+		}
+
+		/*
 		 * For PCI, Extended logical block will be generated by the
 		 * controller.
 		 */
 		if (ns->ctrl->ops->flags & NVME_F_METADATA_SUPPORTED) {
-			if (!(ns->features & NVME_NS_EXT_LBAS))
+			if (ns->ctrl->ops->flags & NVME_F_FABRICS ||
+			    !(ns->features & NVME_NS_EXT_LBAS))
 				ns->features |= NVME_NS_MD_HOST_SUPPORTED;
 		}
 	}
@@ -1903,6 +1913,7 @@ static void __nvme_revalidate_disk(struct gendisk *disk, struct nvme_id_ns *id)
 		revalidate_disk(ns->head->disk);
 	}
 #endif
+	return 0;
 }
 
 static int nvme_revalidate_disk(struct gendisk *disk)
@@ -1927,7 +1938,10 @@ static int nvme_revalidate_disk(struct gendisk *disk)
 		goto free_id;
 	}
 
-	__nvme_revalidate_disk(disk, id);
+	ret = __nvme_revalidate_disk(disk, id);
+	if (ret)
+		goto free_id;
+
 	ret = nvme_report_ns_ids(ctrl, ns->head->ns_id, id, &ids);
 	if (ret)
 		goto free_id;
@@ -3614,7 +3628,8 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid)
 	memcpy(disk->disk_name, disk_name, DISK_NAME_LEN);
 	ns->disk = disk;
 
-	__nvme_revalidate_disk(disk, id);
+	if (__nvme_revalidate_disk(disk, id))
+		goto out_free_disk;
 
 	if ((ctrl->quirks & NVME_QUIRK_LIGHTNVM) && id->vs[0] == 0x1) {
 		ret = nvme_nvm_register(ns, disk_name, node);
@@ -3639,6 +3654,8 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid)
 	return;
  out_put_disk:
 	put_disk(ns->disk);
+ out_free_disk:
+	del_gendisk(ns->disk);
  out_unlink_ns:
 	mutex_lock(&ctrl->subsys->lock);
 	list_del_rcu(&ns->siblings);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 04/17] nvme: introduce max_integrity_segments ctrl attribute
  2020-03-27 17:15 [PATCH 00/17 V5] nvme-rdma/nvmet-rdma: Add metadata/T10-PI support Max Gurtovoy
                   ` (3 preceding siblings ...)
  2020-03-27 17:15 ` [PATCH 03/17] nvme: Enforce extended LBA format for fabrics metadata Max Gurtovoy
@ 2020-03-27 17:15 ` Max Gurtovoy
  2020-04-21 12:09   ` Christoph Hellwig
  2020-03-27 17:15 ` [PATCH 05/17] nvme-fabrics: Allow user enabling metadata/T10-PI support Max Gurtovoy
                   ` (12 subsequent siblings)
  17 siblings, 1 reply; 59+ messages in thread
From: Max Gurtovoy @ 2020-03-27 17:15 UTC (permalink / raw)
  To: linux-nvme, kbusch, hch, sagi, martin.petersen, jsmart2021, linux-rdma
  Cc: axboe, vladimirk, shlomin, israelr, idanb, jgg, oren, maxg

This patch doesn't change any logic, and is needed as a preparation for
adding PI support for fabrics drivers that will use an extended LBA format
for metadata and will support more than one integrity segment.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
---
 drivers/nvme/host/core.c | 11 +++++++----
 drivers/nvme/host/nvme.h |  1 +
 drivers/nvme/host/pci.c  |  7 +++++++
 3 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index b93a8c6..2feb7fb 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1667,7 +1667,8 @@ static int nvme_getgeo(struct block_device *bdev, struct hd_geometry *geo)
 }
 
 #ifdef CONFIG_BLK_DEV_INTEGRITY
-static void nvme_init_integrity(struct gendisk *disk, u16 ms, u8 pi_type)
+static void nvme_init_integrity(struct gendisk *disk, u16 ms, u8 pi_type,
+				u32 max_integrity_segments)
 {
 	struct blk_integrity integrity;
 
@@ -1690,10 +1691,11 @@ static void nvme_init_integrity(struct gendisk *disk, u16 ms, u8 pi_type)
 	}
 	integrity.tuple_size = ms;
 	blk_integrity_register(disk, &integrity);
-	blk_queue_max_integrity_segments(disk->queue, 1);
+	blk_queue_max_integrity_segments(disk->queue, max_integrity_segments);
 }
 #else
-static void nvme_init_integrity(struct gendisk *disk, u16 ms, u8 pi_type)
+static void nvme_init_integrity(struct gendisk *disk, u16 ms, u8 pi_type,
+				u32 max_integrity_segments)
 {
 }
 #endif /* CONFIG_BLK_DEV_INTEGRITY */
@@ -1836,7 +1838,8 @@ static void nvme_update_disk_info(struct gendisk *disk,
 	blk_queue_io_opt(disk->queue, io_opt);
 
 	if (ns->features & NVME_NS_MD_HOST_SUPPORTED)
-		nvme_init_integrity(disk, ns->ms, ns->pi_type);
+		nvme_init_integrity(disk, ns->ms, ns->pi_type,
+				    ns->ctrl->max_integrity_segments);
 
 	if ((ns->ms && !(ns->features & NVME_NS_MD_CTRL_SUPPORTED) &&
 	     !(ns->features & NVME_NS_MD_HOST_SUPPORTED) &&
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index a353255..d132af9 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -229,6 +229,7 @@ struct nvme_ctrl {
 	u32 page_size;
 	u32 max_hw_sectors;
 	u32 max_segments;
+	u32 max_integrity_segments;
 	u16 crdt[3];
 	u16 oncs;
 	u16 oacs;
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 4e79e41..95fa663 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2553,6 +2553,13 @@ static void nvme_reset_work(struct work_struct *work)
 		goto out;
 	}
 
+	/*
+	 * NVMe PCI driver doesn't support Extended LBA format and supports
+	 * only a single integrity segment for a separate contiguous buffer
+	 * of metadata.
+	 */
+	dev->ctrl.max_integrity_segments = 1;
+
 	result = nvme_init_identify(&dev->ctrl);
 	if (result)
 		goto out;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 05/17] nvme-fabrics: Allow user enabling metadata/T10-PI support
  2020-03-27 17:15 [PATCH 00/17 V5] nvme-rdma/nvmet-rdma: Add metadata/T10-PI support Max Gurtovoy
                   ` (4 preceding siblings ...)
  2020-03-27 17:15 ` [PATCH 04/17] nvme: introduce max_integrity_segments ctrl attribute Max Gurtovoy
@ 2020-03-27 17:15 ` Max Gurtovoy
  2020-04-21 12:12   ` Christoph Hellwig
  2020-04-21 15:17   ` Christoph Hellwig
  2020-03-27 17:15 ` [PATCH 06/17] nvme: introduce NVME_INLINE_MD_SG_CNT Max Gurtovoy
                   ` (11 subsequent siblings)
  17 siblings, 2 replies; 59+ messages in thread
From: Max Gurtovoy @ 2020-03-27 17:15 UTC (permalink / raw)
  To: linux-nvme, kbusch, hch, sagi, martin.petersen, jsmart2021, linux-rdma
  Cc: axboe, vladimirk, shlomin, israelr, idanb, jgg, oren, maxg

From: Israel Rukshin <israelr@mellanox.com>

Preparation for adding metadata (T10-PI) over fabrics support. This will
allow end-to-end protection information passthrough and validation for
NVMe over Fabrics.

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
---
 drivers/nvme/host/core.c    |  3 ++-
 drivers/nvme/host/fabrics.c | 11 +++++++++++
 drivers/nvme/host/fabrics.h |  3 +++
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 2feb7fb..0b4d847 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1900,7 +1900,8 @@ static int __nvme_revalidate_disk(struct gendisk *disk, struct nvme_id_ns *id)
 		 * controller.
 		 */
 		if (ns->ctrl->ops->flags & NVME_F_METADATA_SUPPORTED) {
-			if (ns->ctrl->ops->flags & NVME_F_FABRICS ||
+			if ((ns->ctrl->ops->flags & NVME_F_FABRICS &&
+			     ns->ctrl->opts->pi_enable) ||
 			    !(ns->features & NVME_NS_EXT_LBAS))
 				ns->features |= NVME_NS_MD_HOST_SUPPORTED;
 		}
diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index 2a6c819..38f4213 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -612,6 +612,7 @@ bool __nvmf_check_ready(struct nvme_ctrl *ctrl, struct request *rq,
 	{ NVMF_OPT_NR_WRITE_QUEUES,	"nr_write_queues=%d"	},
 	{ NVMF_OPT_NR_POLL_QUEUES,	"nr_poll_queues=%d"	},
 	{ NVMF_OPT_TOS,			"tos=%d"		},
+	{ NVMF_OPT_PI_ENABLE,		"pi_enable"		},
 	{ NVMF_OPT_ERR,			NULL			}
 };
 
@@ -634,6 +635,7 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
 	opts->hdr_digest = false;
 	opts->data_digest = false;
 	opts->tos = -1; /* < 0 == use transport default */
+	opts->pi_enable = false;
 
 	options = o = kstrdup(buf, GFP_KERNEL);
 	if (!options)
@@ -867,6 +869,15 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
 			}
 			opts->tos = token;
 			break;
+#ifdef CONFIG_BLK_DEV_INTEGRITY
+		case NVMF_OPT_PI_ENABLE:
+			if (opts->discovery_nqn) {
+				pr_debug("Ignoring pi_enable value for discovery controller\n");
+				break;
+			}
+			opts->pi_enable = true;
+			break;
+#endif
 		default:
 			pr_warn("unknown parameter or missing value '%s' in ctrl creation request\n",
 				p);
diff --git a/drivers/nvme/host/fabrics.h b/drivers/nvme/host/fabrics.h
index a0ec40a..773f748 100644
--- a/drivers/nvme/host/fabrics.h
+++ b/drivers/nvme/host/fabrics.h
@@ -56,6 +56,7 @@ enum {
 	NVMF_OPT_NR_WRITE_QUEUES = 1 << 17,
 	NVMF_OPT_NR_POLL_QUEUES = 1 << 18,
 	NVMF_OPT_TOS		= 1 << 19,
+	NVMF_OPT_PI_ENABLE	= 1 << 20,
 };
 
 /**
@@ -89,6 +90,7 @@ enum {
  * @nr_write_queues: number of queues for write I/O
  * @nr_poll_queues: number of queues for polling I/O
  * @tos: type of service
+ * @pi_enable: Enable metadata (T10-PI) support
  */
 struct nvmf_ctrl_options {
 	unsigned		mask;
@@ -111,6 +113,7 @@ struct nvmf_ctrl_options {
 	unsigned int		nr_write_queues;
 	unsigned int		nr_poll_queues;
 	int			tos;
+	bool			pi_enable;
 };
 
 /*
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 06/17] nvme: introduce NVME_INLINE_MD_SG_CNT
  2020-03-27 17:15 [PATCH 00/17 V5] nvme-rdma/nvmet-rdma: Add metadata/T10-PI support Max Gurtovoy
                   ` (5 preceding siblings ...)
  2020-03-27 17:15 ` [PATCH 05/17] nvme-fabrics: Allow user enabling metadata/T10-PI support Max Gurtovoy
@ 2020-03-27 17:15 ` Max Gurtovoy
  2020-04-21 12:12   ` Christoph Hellwig
  2020-03-27 17:15 ` [PATCH 07/17] nvme-rdma: Introduce nvme_rdma_sgl structure Max Gurtovoy
                   ` (10 subsequent siblings)
  17 siblings, 1 reply; 59+ messages in thread
From: Max Gurtovoy @ 2020-03-27 17:15 UTC (permalink / raw)
  To: linux-nvme, kbusch, hch, sagi, martin.petersen, jsmart2021, linux-rdma
  Cc: axboe, vladimirk, shlomin, israelr, idanb, jgg, oren, maxg

From: Israel Rukshin <israelr@mellanox.com>

The metadata SGL is usually small, so a single inline SG entry should
cover most cases. The macro will be used to preallocate a single SGL
entry for metadata. Preallocation of small inline SGLs depends on the
SG_CHAIN capability, so if the architecture doesn't support SG_CHAIN,
fall back to runtime allocation of the SGL. This patch is a preparation
for adding metadata (T10-PI) over fabrics support.
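
The macro selection in the hunk below amounts to a compile-time choice
that can be modeled as follows; this is only an illustrative sketch of
the CONFIG_ARCH_NO_SG_CHAIN ifdef, not real kernel code:

```python
def inline_sg_counts(arch_no_sg_chain):
    """Mirror the NVME_INLINE_SG_CNT / NVME_INLINE_MD_SG_CNT
    selection: without SG_CHAIN, no SG entries are preallocated and
    the SGL is allocated at runtime instead."""
    if arch_no_sg_chain:
        return {"data": 0, "md": 0}
    return {"data": 2, "md": 1}
```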

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
---
 drivers/nvme/host/nvme.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index d132af9..08f470d 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -30,8 +30,10 @@
 
 #ifdef CONFIG_ARCH_NO_SG_CHAIN
 #define  NVME_INLINE_SG_CNT  0
+#define  NVME_INLINE_MD_SG_CNT  0
 #else
 #define  NVME_INLINE_SG_CNT  2
+#define  NVME_INLINE_MD_SG_CNT  1
 #endif
 
 extern struct workqueue_struct *nvme_wq;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 07/17] nvme-rdma: Introduce nvme_rdma_sgl structure
  2020-03-27 17:15 [PATCH 00/17 V5] nvme-rdma/nvmet-rdma: Add metadata/T10-PI support Max Gurtovoy
                   ` (6 preceding siblings ...)
  2020-03-27 17:15 ` [PATCH 06/17] nvme: introduce NVME_INLINE_MD_SG_CNT Max Gurtovoy
@ 2020-03-27 17:15 ` Max Gurtovoy
  2020-04-21 12:13   ` Christoph Hellwig
  2020-03-27 17:15 ` [PATCH 08/17] nvme-rdma: add metadata/T10-PI support Max Gurtovoy
                   ` (9 subsequent siblings)
  17 siblings, 1 reply; 59+ messages in thread
From: Max Gurtovoy @ 2020-03-27 17:15 UTC (permalink / raw)
  To: linux-nvme, kbusch, hch, sagi, martin.petersen, jsmart2021, linux-rdma
  Cc: axboe, vladimirk, shlomin, israelr, idanb, jgg, oren, maxg

From: Israel Rukshin <israelr@mellanox.com>

Remove the first_sgl member from struct nvme_rdma_request and use
pointer arithmetic instead. The inline scatterlist, if it exists, is
located right after the nvme_rdma_request structure. This patch is a
preparation for adding PI support.

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
---
 drivers/nvme/host/rdma.c | 41 ++++++++++++++++++++++++-----------------
 1 file changed, 24 insertions(+), 17 deletions(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 3ae3011..e38f8f7 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -48,6 +48,11 @@ struct nvme_rdma_qe {
 	u64			dma;
 };
 
+struct nvme_rdma_sgl {
+	int			nents;
+	struct sg_table		sg_table;
+};
+
 struct nvme_rdma_queue;
 struct nvme_rdma_request {
 	struct nvme_request	req;
@@ -58,12 +63,10 @@ struct nvme_rdma_request {
 	refcount_t		ref;
 	struct ib_sge		sge[1 + NVME_RDMA_MAX_INLINE_SEGMENTS];
 	u32			num_sge;
-	int			nents;
 	struct ib_reg_wr	reg_wr;
 	struct ib_cqe		reg_cqe;
 	struct nvme_rdma_queue  *queue;
-	struct sg_table		sg_table;
-	struct scatterlist	first_sgl[];
+	struct nvme_rdma_sgl	data_sgl;
 };
 
 enum nvme_rdma_queue_flags {
@@ -1164,8 +1167,9 @@ static void nvme_rdma_unmap_data(struct nvme_rdma_queue *queue,
 		req->mr = NULL;
 	}
 
-	ib_dma_unmap_sg(ibdev, req->sg_table.sgl, req->nents, rq_dma_dir(rq));
-	sg_free_table_chained(&req->sg_table, NVME_INLINE_SG_CNT);
+	ib_dma_unmap_sg(ibdev, req->data_sgl.sg_table.sgl, req->data_sgl.nents,
+			rq_dma_dir(rq));
+	sg_free_table_chained(&req->data_sgl.sg_table, NVME_INLINE_SG_CNT);
 }
 
 static int nvme_rdma_set_sg_null(struct nvme_command *c)
@@ -1184,7 +1188,7 @@ static int nvme_rdma_map_sg_inline(struct nvme_rdma_queue *queue,
 		int count)
 {
 	struct nvme_sgl_desc *sg = &c->common.dptr.sgl;
-	struct scatterlist *sgl = req->sg_table.sgl;
+	struct scatterlist *sgl = req->data_sgl.sg_table.sgl;
 	struct ib_sge *sge = &req->sge[1];
 	u32 len = 0;
 	int i;
@@ -1209,8 +1213,8 @@ static int nvme_rdma_map_sg_single(struct nvme_rdma_queue *queue,
 {
 	struct nvme_keyed_sgl_desc *sg = &c->common.dptr.ksgl;
 
-	sg->addr = cpu_to_le64(sg_dma_address(req->sg_table.sgl));
-	put_unaligned_le24(sg_dma_len(req->sg_table.sgl), sg->length);
+	sg->addr = cpu_to_le64(sg_dma_address(req->data_sgl.sg_table.sgl));
+	put_unaligned_le24(sg_dma_len(req->data_sgl.sg_table.sgl), sg->length);
 	put_unaligned_le32(queue->device->pd->unsafe_global_rkey, sg->key);
 	sg->type = NVME_KEY_SGL_FMT_DATA_DESC << 4;
 	return 0;
@@ -1231,7 +1235,8 @@ static int nvme_rdma_map_sg_fr(struct nvme_rdma_queue *queue,
 	 * Align the MR to a 4K page size to match the ctrl page size and
 	 * the block virtual boundary.
 	 */
-	nr = ib_map_mr_sg(req->mr, req->sg_table.sgl, count, NULL, SZ_4K);
+	nr = ib_map_mr_sg(req->mr, req->data_sgl.sg_table.sgl, count, NULL,
+			  SZ_4K);
 	if (unlikely(nr < count)) {
 		ib_mr_pool_put(queue->qp, &queue->qp->rdma_mrs, req->mr);
 		req->mr = NULL;
@@ -1278,17 +1283,18 @@ static int nvme_rdma_map_data(struct nvme_rdma_queue *queue,
 	if (!blk_rq_nr_phys_segments(rq))
 		return nvme_rdma_set_sg_null(c);
 
-	req->sg_table.sgl = req->first_sgl;
-	ret = sg_alloc_table_chained(&req->sg_table,
-			blk_rq_nr_phys_segments(rq), req->sg_table.sgl,
+	req->data_sgl.sg_table.sgl = (struct scatterlist *)(req + 1);
+	ret = sg_alloc_table_chained(&req->data_sgl.sg_table,
+			blk_rq_nr_phys_segments(rq), req->data_sgl.sg_table.sgl,
 			NVME_INLINE_SG_CNT);
 	if (ret)
 		return -ENOMEM;
 
-	req->nents = blk_rq_map_sg(rq->q, rq, req->sg_table.sgl);
+	req->data_sgl.nents = blk_rq_map_sg(rq->q, rq,
+					    req->data_sgl.sg_table.sgl);
 
-	count = ib_dma_map_sg(ibdev, req->sg_table.sgl, req->nents,
-			      rq_dma_dir(rq));
+	count = ib_dma_map_sg(ibdev, req->data_sgl.sg_table.sgl,
+			      req->data_sgl.nents, rq_dma_dir(rq));
 	if (unlikely(count <= 0)) {
 		ret = -EIO;
 		goto out_free_table;
@@ -1317,9 +1323,10 @@ static int nvme_rdma_map_data(struct nvme_rdma_queue *queue,
 	return 0;
 
 out_unmap_sg:
-	ib_dma_unmap_sg(ibdev, req->sg_table.sgl, req->nents, rq_dma_dir(rq));
+	ib_dma_unmap_sg(ibdev, req->data_sgl.sg_table.sgl, req->data_sgl.nents,
+			rq_dma_dir(rq));
 out_free_table:
-	sg_free_table_chained(&req->sg_table, NVME_INLINE_SG_CNT);
+	sg_free_table_chained(&req->data_sgl.sg_table, NVME_INLINE_SG_CNT);
 	return ret;
 }
 
-- 
1.8.3.1



* [PATCH 08/17] nvme-rdma: add metadata/T10-PI support
  2020-03-27 17:15 [PATCH 00/17 V5] nvme-rdma/nvmet-rdma: Add metadata/T10-PI support Max Gurtovoy
                   ` (7 preceding siblings ...)
  2020-03-27 17:15 ` [PATCH 07/17] nvme-rdma: Introduce nvme_rdma_sgl structure Max Gurtovoy
@ 2020-03-27 17:15 ` Max Gurtovoy
  2020-04-21 12:20   ` Christoph Hellwig
  2020-03-27 17:15 ` [PATCH 09/17] nvmet: prepare metadata request Max Gurtovoy
                   ` (8 subsequent siblings)
  17 siblings, 1 reply; 59+ messages in thread
From: Max Gurtovoy @ 2020-03-27 17:15 UTC (permalink / raw)
  To: linux-nvme, kbusch, hch, sagi, martin.petersen, jsmart2021, linux-rdma
  Cc: axboe, vladimirk, shlomin, israelr, idanb, jgg, oren, maxg

For capable HCAs (e.g. ConnectX-5/ConnectX-6) this will allow end-to-end
protection information passthrough and validation for the NVMe over RDMA
transport. Metadata offload support is implemented over the new RDMA
signature verbs API and is enabled per controller via nvme-cli.

usage example:
nvme connect --pi_enable --transport=rdma --traddr=10.0.1.1 --nqn=test-nvme

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
---
 drivers/nvme/host/rdma.c | 330 ++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 296 insertions(+), 34 deletions(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index e38f8f7..23cc77e 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -67,6 +67,9 @@ struct nvme_rdma_request {
 	struct ib_cqe		reg_cqe;
 	struct nvme_rdma_queue  *queue;
 	struct nvme_rdma_sgl	data_sgl;
+	/* Metadata (T10-PI) support */
+	struct nvme_rdma_sgl	*md_sgl;
+	bool			use_md;
 };
 
 enum nvme_rdma_queue_flags {
@@ -88,6 +91,7 @@ struct nvme_rdma_queue {
 	struct rdma_cm_id	*cm_id;
 	int			cm_error;
 	struct completion	cm_done;
+	bool			pi_support;
 };
 
 struct nvme_rdma_ctrl {
@@ -115,6 +119,7 @@ struct nvme_rdma_ctrl {
 	struct nvme_ctrl	ctrl;
 	bool			use_inline_data;
 	u32			io_queues[HCTX_MAX_TYPES];
+	bool			pi_support;
 };
 
 static inline struct nvme_rdma_ctrl *to_rdma_ctrl(struct nvme_ctrl *ctrl)
@@ -272,6 +277,8 @@ static int nvme_rdma_create_qp(struct nvme_rdma_queue *queue, const int factor)
 	init_attr.qp_type = IB_QPT_RC;
 	init_attr.send_cq = queue->ib_cq;
 	init_attr.recv_cq = queue->ib_cq;
+	if (queue->pi_support)
+		init_attr.create_flags |= IB_QP_CREATE_INTEGRITY_EN;
 
 	ret = rdma_create_qp(queue->cm_id, dev->pd, &init_attr);
 
@@ -287,6 +294,18 @@ static void nvme_rdma_exit_request(struct blk_mq_tag_set *set,
 	kfree(req->sqe.data);
 }
 
+static unsigned int nvme_rdma_cmd_size(bool has_md)
+{
+	/*
+	 * nvme rdma command consists of struct nvme_rdma_request, data SGL,
+	 * PI nvme_rdma_sgl struct and then PI SGL.
+	 */
+	return sizeof(struct nvme_rdma_request) +
+	       sizeof(struct scatterlist) * NVME_INLINE_SG_CNT +
+	       has_md * (sizeof(struct nvme_rdma_sgl) +
+			 sizeof(struct scatterlist) * NVME_INLINE_MD_SG_CNT);
+}
+
 static int nvme_rdma_init_request(struct blk_mq_tag_set *set,
 		struct request *rq, unsigned int hctx_idx,
 		unsigned int numa_node)
@@ -301,6 +320,10 @@ static int nvme_rdma_init_request(struct blk_mq_tag_set *set,
 	if (!req->sqe.data)
 		return -ENOMEM;
 
+	/* metadata nvme_rdma_sgl struct is located after command's data SGL */
+	if (queue->pi_support)
+		req->md_sgl = (void *)nvme_req(rq) + nvme_rdma_cmd_size(false);
+
 	req->queue = queue;
 
 	return 0;
@@ -411,6 +434,8 @@ static void nvme_rdma_destroy_queue_ib(struct nvme_rdma_queue *queue)
 	dev = queue->device;
 	ibdev = dev->dev;
 
+	if (queue->pi_support)
+		ib_mr_pool_destroy(queue->qp, &queue->qp->sig_mrs);
 	ib_mr_pool_destroy(queue->qp, &queue->qp->rdma_mrs);
 
 	/*
@@ -427,10 +452,13 @@ static void nvme_rdma_destroy_queue_ib(struct nvme_rdma_queue *queue)
 	nvme_rdma_dev_put(dev);
 }
 
-static int nvme_rdma_get_max_fr_pages(struct ib_device *ibdev)
+static int nvme_rdma_get_max_fr_pages(struct ib_device *ibdev, bool pi_support)
 {
-	return min_t(u32, NVME_RDMA_MAX_SEGMENTS,
-		     ibdev->attrs.max_fast_reg_page_list_len - 1);
+	u32 max_page_list_len =
+		pi_support ? ibdev->attrs.max_pi_fast_reg_page_list_len :
+			     ibdev->attrs.max_fast_reg_page_list_len;
+
+	return min_t(u32, NVME_RDMA_MAX_SEGMENTS, max_page_list_len - 1);
 }
 
 static int nvme_rdma_create_queue_ib(struct nvme_rdma_queue *queue)
@@ -487,7 +515,7 @@ static int nvme_rdma_create_queue_ib(struct nvme_rdma_queue *queue)
 	 * misaligned we'll end up using two entries for a single data page,
 	 * so one additional entry is required.
 	 */
-	pages_per_mr = nvme_rdma_get_max_fr_pages(ibdev) + 1;
+	pages_per_mr = nvme_rdma_get_max_fr_pages(ibdev, queue->pi_support) + 1;
 	ret = ib_mr_pool_init(queue->qp, &queue->qp->rdma_mrs,
 			      queue->queue_size,
 			      IB_MR_TYPE_MEM_REG,
@@ -499,10 +527,24 @@ static int nvme_rdma_create_queue_ib(struct nvme_rdma_queue *queue)
 		goto out_destroy_ring;
 	}
 
+	if (queue->pi_support) {
+		ret = ib_mr_pool_init(queue->qp, &queue->qp->sig_mrs,
+				      queue->queue_size, IB_MR_TYPE_INTEGRITY,
+				      pages_per_mr, pages_per_mr);
+		if (ret) {
+			dev_err(queue->ctrl->ctrl.device,
+				"failed to initialize PI MR pool sized %d for QID %d\n",
+				queue->queue_size, idx);
+			goto out_destroy_mr_pool;
+		}
+	}
+
 	set_bit(NVME_RDMA_Q_TR_READY, &queue->flags);
 
 	return 0;
 
+out_destroy_mr_pool:
+	ib_mr_pool_destroy(queue->qp, &queue->qp->rdma_mrs);
 out_destroy_ring:
 	nvme_rdma_free_ring(ibdev, queue->rsp_ring, queue->queue_size,
 			    sizeof(struct nvme_completion), DMA_FROM_DEVICE);
@@ -524,6 +566,7 @@ static int nvme_rdma_alloc_queue(struct nvme_rdma_ctrl *ctrl,
 
 	queue = &ctrl->queues[idx];
 	queue->ctrl = ctrl;
+	queue->pi_support = idx && ctrl->pi_support;
 	init_completion(&queue->cm_done);
 
 	if (idx > 0)
@@ -733,8 +776,7 @@ static struct blk_mq_tag_set *nvme_rdma_alloc_tagset(struct nvme_ctrl *nctrl,
 		set->queue_depth = NVME_AQ_MQ_TAG_DEPTH;
 		set->reserved_tags = 2; /* connect + keep-alive */
 		set->numa_node = nctrl->numa_node;
-		set->cmd_size = sizeof(struct nvme_rdma_request) +
-			NVME_INLINE_SG_CNT * sizeof(struct scatterlist);
+		set->cmd_size = nvme_rdma_cmd_size(false);
 		set->driver_data = ctrl;
 		set->nr_hw_queues = 1;
 		set->timeout = ADMIN_TIMEOUT;
@@ -747,8 +789,7 @@ static struct blk_mq_tag_set *nvme_rdma_alloc_tagset(struct nvme_ctrl *nctrl,
 		set->reserved_tags = 1; /* fabric connect */
 		set->numa_node = nctrl->numa_node;
 		set->flags = BLK_MQ_F_SHOULD_MERGE;
-		set->cmd_size = sizeof(struct nvme_rdma_request) +
-			NVME_INLINE_SG_CNT * sizeof(struct scatterlist);
+		set->cmd_size = nvme_rdma_cmd_size(ctrl->pi_support);
 		set->driver_data = ctrl;
 		set->nr_hw_queues = nctrl->queue_count - 1;
 		set->timeout = NVME_IO_TIMEOUT;
@@ -790,7 +831,21 @@ static int nvme_rdma_configure_admin_queue(struct nvme_rdma_ctrl *ctrl,
 	ctrl->device = ctrl->queues[0].device;
 	ctrl->ctrl.numa_node = dev_to_node(ctrl->device->dev->dma_device);
 
-	ctrl->max_fr_pages = nvme_rdma_get_max_fr_pages(ctrl->device->dev);
+	/* T10-PI support */
+	if (ctrl->ctrl.opts->pi_enable) {
+		if (!(ctrl->device->dev->attrs.device_cap_flags &
+		      IB_DEVICE_INTEGRITY_HANDOVER)) {
+			dev_warn(ctrl->ctrl.device,
+				 "T10-PI requested but not supported on %s, continue without T10-PI\n",
+				 ctrl->device->dev->name);
+			ctrl->ctrl.opts->pi_enable = false;
+		} else {
+			ctrl->pi_support = true;
+		}
+	}
+
+	ctrl->max_fr_pages = nvme_rdma_get_max_fr_pages(ctrl->device->dev,
+							ctrl->pi_support);
 
 	/*
 	 * Bind the async event SQE DMA mapping to the admin queue lifetime.
@@ -832,6 +887,8 @@ static int nvme_rdma_configure_admin_queue(struct nvme_rdma_ctrl *ctrl,
 
 	ctrl->ctrl.max_segments = ctrl->max_fr_pages;
 	ctrl->ctrl.max_hw_sectors = ctrl->max_fr_pages << (ilog2(SZ_4K) - 9);
+	if (ctrl->pi_support)
+		ctrl->ctrl.max_integrity_segments = ctrl->max_fr_pages;
 
 	blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);
 
@@ -1162,8 +1219,19 @@ static void nvme_rdma_unmap_data(struct nvme_rdma_queue *queue,
 	if (!blk_rq_nr_phys_segments(rq))
 		return;
 
+	if (blk_integrity_rq(rq)) {
+		ib_dma_unmap_sg(ibdev, req->md_sgl->sg_table.sgl,
+				req->md_sgl->nents, rq_dma_dir(rq));
+		sg_free_table_chained(&req->md_sgl->sg_table,
+				      NVME_INLINE_MD_SG_CNT);
+	}
+
 	if (req->mr) {
-		ib_mr_pool_put(queue->qp, &queue->qp->rdma_mrs, req->mr);
+		if (req->use_md)
+			ib_mr_pool_put(queue->qp, &queue->qp->sig_mrs, req->mr);
+		else
+			ib_mr_pool_put(queue->qp, &queue->qp->rdma_mrs,
+				       req->mr);
 		req->mr = NULL;
 	}
 
@@ -1220,34 +1288,112 @@ static int nvme_rdma_map_sg_single(struct nvme_rdma_queue *queue,
 	return 0;
 }
 
-static int nvme_rdma_map_sg_fr(struct nvme_rdma_queue *queue,
-		struct nvme_rdma_request *req, struct nvme_command *c,
-		int count)
+#ifdef CONFIG_BLK_DEV_INTEGRITY
+static void nvme_rdma_set_sig_domain(struct blk_integrity *bi,
+		struct nvme_command *cmd, struct ib_sig_domain *domain,
+		u16 control)
 {
-	struct nvme_keyed_sgl_desc *sg = &c->common.dptr.ksgl;
-	int nr;
-
-	req->mr = ib_mr_pool_get(queue->qp, &queue->qp->rdma_mrs);
-	if (WARN_ON_ONCE(!req->mr))
-		return -EAGAIN;
+	domain->sig_type = IB_SIG_TYPE_T10_DIF;
+	domain->sig.dif.bg_type = IB_T10DIF_CRC;
+	domain->sig.dif.pi_interval = 1 << bi->interval_exp;
+	domain->sig.dif.ref_tag = le32_to_cpu(cmd->rw.reftag);
 
 	/*
-	 * Align the MR to a 4K page size to match the ctrl page size and
-	 * the block virtual boundary.
+	 * At the moment we hard code those, but in the future
+	 * we will take them from cmd.
 	 */
-	nr = ib_map_mr_sg(req->mr, req->data_sgl.sg_table.sgl, count, NULL,
-			  SZ_4K);
-	if (unlikely(nr < count)) {
-		ib_mr_pool_put(queue->qp, &queue->qp->rdma_mrs, req->mr);
-		req->mr = NULL;
-		if (nr < 0)
-			return nr;
-		return -EINVAL;
+	domain->sig.dif.apptag_check_mask = 0xffff;
+	domain->sig.dif.app_escape = true;
+	domain->sig.dif.ref_escape = true;
+	if (control & NVME_RW_PRINFO_PRCHK_REF)
+		domain->sig.dif.ref_remap = true;
+}
+
+static void nvme_rdma_set_sig_attrs(struct blk_integrity *bi,
+		struct nvme_command *cmd, struct ib_sig_attrs *sig_attrs)
+{
+	u16 control = le16_to_cpu(cmd->rw.control);
+
+	WARN_ON(bi == NULL);
+
+	memset(sig_attrs, 0, sizeof(*sig_attrs));
+
+	if (control & NVME_RW_PRINFO_PRACT) {
+		/* for WRITE_INSERT/READ_STRIP no memory domain */
+		sig_attrs->mem.sig_type = IB_SIG_TYPE_NONE;
+		nvme_rdma_set_sig_domain(bi, cmd, &sig_attrs->wire, control);
+		/* Clear the PRACT bit since HCA will generate/verify the PI */
+		control &= ~NVME_RW_PRINFO_PRACT;
+		cmd->rw.control = cpu_to_le16(control);
+	} else {
+		/* for WRITE_PASS/READ_PASS both wire/memory domains exist */
+		nvme_rdma_set_sig_domain(bi, cmd, &sig_attrs->wire, control);
+		nvme_rdma_set_sig_domain(bi, cmd, &sig_attrs->mem, control);
 	}
+}
+
+static void nvme_rdma_set_prot_checks(struct nvme_command *cmd, u8 *mask)
+{
+	*mask = 0;
+	if (le16_to_cpu(cmd->rw.control) & NVME_RW_PRINFO_PRCHK_REF)
+		*mask |= IB_SIG_CHECK_REFTAG;
+	if (le16_to_cpu(cmd->rw.control) & NVME_RW_PRINFO_PRCHK_GUARD)
+		*mask |= IB_SIG_CHECK_GUARD;
+}
+
+static void nvme_rdma_sig_done(struct ib_cq *cq, struct ib_wc *wc)
+{
+	if (unlikely(wc->status != IB_WC_SUCCESS))
+		nvme_rdma_wr_error(cq, wc, "SIG");
+}
+
+static void nvme_rdma_set_pi_wr(struct nvme_rdma_request *req,
+				struct nvme_command *c)
+{
+	struct ib_sig_attrs *sig_attrs = req->mr->sig_attrs;
+	struct ib_reg_wr *wr = &req->reg_wr;
+	struct request *rq = blk_mq_rq_from_pdu(req);
+	struct bio *bio = rq->bio;
+	struct nvme_keyed_sgl_desc *sg = &c->common.dptr.ksgl;
+
+	nvme_rdma_set_sig_attrs(blk_get_integrity(bio->bi_disk), c, sig_attrs);
+
+	nvme_rdma_set_prot_checks(c, &sig_attrs->check_mask);
 
 	ib_update_fast_reg_key(req->mr, ib_inc_rkey(req->mr->rkey));
 
+	req->reg_cqe.done = nvme_rdma_sig_done;
+
+	memset(wr, 0, sizeof(*wr));
+	wr->wr.opcode = IB_WR_REG_MR_INTEGRITY;
+	wr->wr.wr_cqe = &req->reg_cqe;
+	wr->wr.num_sge = 0;
+	wr->wr.send_flags = 0;
+	wr->mr = req->mr;
+	wr->key = req->mr->rkey;
+	wr->access = IB_ACCESS_LOCAL_WRITE |
+		     IB_ACCESS_REMOTE_READ |
+		     IB_ACCESS_REMOTE_WRITE;
+
+	sg->addr = cpu_to_le64(req->mr->iova);
+	put_unaligned_le24(req->mr->length, sg->length);
+	put_unaligned_le32(req->mr->rkey, sg->key);
+	sg->type = NVME_KEY_SGL_FMT_DATA_DESC << 4;
+}
+#else
+static void nvme_rdma_set_pi_wr(struct nvme_rdma_request *req,
+				struct nvme_command *c)
+{
+}
+#endif /* CONFIG_BLK_DEV_INTEGRITY */
+
+static void nvme_rdma_set_reg_wr(struct nvme_rdma_request *req,
+				 struct nvme_command *c)
+{
+	struct nvme_keyed_sgl_desc *sg = &c->common.dptr.ksgl;
+
 	req->reg_cqe.done = nvme_rdma_memreg_done;
+
 	memset(&req->reg_wr, 0, sizeof(req->reg_wr));
 	req->reg_wr.wr.opcode = IB_WR_REG_MR;
 	req->reg_wr.wr.wr_cqe = &req->reg_cqe;
@@ -1263,8 +1409,52 @@ static int nvme_rdma_map_sg_fr(struct nvme_rdma_queue *queue,
 	put_unaligned_le32(req->mr->rkey, sg->key);
 	sg->type = (NVME_KEY_SGL_FMT_DATA_DESC << 4) |
 			NVME_SGL_FMT_INVALIDATE;
+}
+
+static int nvme_rdma_map_sg_fr(struct nvme_rdma_queue *queue,
+		struct nvme_rdma_request *req, struct nvme_command *c,
+		int count, int pi_count)
+{
+	struct nvme_rdma_sgl *sgl = &req->data_sgl;
+	int nr;
+
+	if (req->use_md) {
+		req->mr = ib_mr_pool_get(queue->qp, &queue->qp->sig_mrs);
+		if (WARN_ON_ONCE(!req->mr))
+			return -EAGAIN;
+
+		nr = ib_map_mr_sg_pi(req->mr, sgl->sg_table.sgl, count, 0,
+				     req->md_sgl->sg_table.sgl, pi_count, 0,
+				     SZ_4K);
+		if (unlikely(nr))
+			goto mr_put;
+
+		nvme_rdma_set_pi_wr(req, c);
+	} else {
+		req->mr = ib_mr_pool_get(queue->qp, &queue->qp->rdma_mrs);
+		if (WARN_ON_ONCE(!req->mr))
+			return -EAGAIN;
+		/*
+		 * Align the MR to a 4K page size to match the ctrl page size
+		 * and the block virtual boundary.
+		 */
+		nr = ib_map_mr_sg(req->mr, sgl->sg_table.sgl, count, 0, SZ_4K);
+		if (unlikely(nr < count))
+			goto mr_put;
+
+		nvme_rdma_set_reg_wr(req, c);
+	}
 
 	return 0;
+mr_put:
+	if (req->use_md)
+		ib_mr_pool_put(queue->qp, &queue->qp->sig_mrs, req->mr);
+	else
+		ib_mr_pool_put(queue->qp, &queue->qp->rdma_mrs, req->mr);
+	req->mr = NULL;
+	if (nr < 0)
+		return nr;
+	return -EINVAL;
 }
 
 static int nvme_rdma_map_data(struct nvme_rdma_queue *queue,
@@ -1273,6 +1463,7 @@ static int nvme_rdma_map_data(struct nvme_rdma_queue *queue,
 	struct nvme_rdma_request *req = blk_mq_rq_to_pdu(rq);
 	struct nvme_rdma_device *dev = queue->device;
 	struct ib_device *ibdev = dev->dev;
+	int pi_count = 0;
 	int count, ret;
 
 	req->num_sge = 1;
@@ -1300,7 +1491,30 @@ static int nvme_rdma_map_data(struct nvme_rdma_queue *queue,
 		goto out_free_table;
 	}
 
-	if (count <= dev->num_inline_segments) {
+	if (blk_integrity_rq(rq)) {
+		memset(req->md_sgl, 0, sizeof(struct nvme_rdma_sgl));
+		req->md_sgl->sg_table.sgl =
+			(struct scatterlist *)(req->md_sgl + 1);
+		ret = sg_alloc_table_chained(&req->md_sgl->sg_table,
+				blk_rq_count_integrity_sg(rq->q, rq->bio),
+				req->md_sgl->sg_table.sgl,
+				NVME_INLINE_MD_SG_CNT);
+		if (unlikely(ret)) {
+			ret = -ENOMEM;
+			goto out_unmap_sg;
+		}
+
+		req->md_sgl->nents = blk_rq_map_integrity_sg(rq->q, rq->bio,
+					req->md_sgl->sg_table.sgl);
+		pi_count = ib_dma_map_sg(ibdev, req->md_sgl->sg_table.sgl,
+					 req->md_sgl->nents, rq_dma_dir(rq));
+		if (unlikely(pi_count <= 0)) {
+			ret = -EIO;
+			goto out_free_pi_table;
+		}
+	}
+
+	if (count <= dev->num_inline_segments && !req->use_md) {
 		if (rq_data_dir(rq) == WRITE && nvme_rdma_queue_idx(queue) &&
 		    queue->ctrl->use_inline_data &&
 		    blk_rq_payload_bytes(rq) <=
@@ -1315,13 +1529,21 @@ static int nvme_rdma_map_data(struct nvme_rdma_queue *queue,
 		}
 	}
 
-	ret = nvme_rdma_map_sg_fr(queue, req, c, count);
+	ret = nvme_rdma_map_sg_fr(queue, req, c, count, pi_count);
 out:
 	if (unlikely(ret))
-		goto out_unmap_sg;
+		goto out_unmap_pi_sg;
 
 	return 0;
 
+out_unmap_pi_sg:
+	if (blk_integrity_rq(rq))
+		ib_dma_unmap_sg(ibdev, req->md_sgl->sg_table.sgl,
+				req->md_sgl->nents, rq_dma_dir(rq));
+out_free_pi_table:
+	if (blk_integrity_rq(rq))
+		sg_free_table_chained(&req->md_sgl->sg_table,
+				      NVME_INLINE_MD_SG_CNT);
 out_unmap_sg:
 	ib_dma_unmap_sg(ibdev, req->data_sgl.sg_table.sgl, req->data_sgl.nents,
 			rq_dma_dir(rq));
@@ -1774,6 +1996,8 @@ static blk_status_t nvme_rdma_queue_rq(struct blk_mq_hw_ctx *hctx,
 
 	blk_mq_start_request(rq);
 
+	req->use_md = queue->pi_support && nvme_req(rq)->has_md;
+
 	err = nvme_rdma_map_data(queue, rq, c);
 	if (unlikely(err < 0)) {
 		dev_err(queue->ctrl->ctrl.device,
@@ -1814,12 +2038,50 @@ static int nvme_rdma_poll(struct blk_mq_hw_ctx *hctx)
 	return ib_process_cq_direct(queue->ib_cq, -1);
 }
 
+#ifdef CONFIG_BLK_DEV_INTEGRITY
+static void nvme_rdma_check_pi_status(struct nvme_rdma_request *req)
+{
+	struct request *rq = blk_mq_rq_from_pdu(req);
+	struct ib_mr_status mr_status;
+	int ret;
+
+	ret = ib_check_mr_status(req->mr, IB_MR_CHECK_SIG_STATUS, &mr_status);
+	if (ret) {
+		pr_err("ib_check_mr_status failed, ret %d\n", ret);
+		nvme_req(rq)->status = NVME_SC_INVALID_PI;
+		return;
+	}
+
+	if (mr_status.fail_status & IB_MR_CHECK_SIG_STATUS) {
+		switch (mr_status.sig_err.err_type) {
+		case IB_SIG_BAD_GUARD:
+			nvme_req(rq)->status = NVME_SC_GUARD_CHECK;
+			break;
+		case IB_SIG_BAD_REFTAG:
+			nvme_req(rq)->status = NVME_SC_REFTAG_CHECK;
+			break;
+		case IB_SIG_BAD_APPTAG:
+			nvme_req(rq)->status = NVME_SC_APPTAG_CHECK;
+			break;
+		}
+		pr_err("PI error found type %d expected 0x%x vs actual 0x%x\n",
+		       mr_status.sig_err.err_type, mr_status.sig_err.expected,
+		       mr_status.sig_err.actual);
+	}
+}
+#endif
+
 static void nvme_rdma_complete_rq(struct request *rq)
 {
 	struct nvme_rdma_request *req = blk_mq_rq_to_pdu(rq);
 	struct nvme_rdma_queue *queue = req->queue;
 	struct ib_device *ibdev = queue->device->dev;
 
+#ifdef CONFIG_BLK_DEV_INTEGRITY
+	if (req->use_md)
+		nvme_rdma_check_pi_status(req);
+#endif
+
 	nvme_rdma_unmap_data(queue, rq);
 	ib_dma_unmap_single(ibdev, req->sqe.dma, sizeof(struct nvme_command),
 			    DMA_TO_DEVICE);
@@ -1939,7 +2201,7 @@ static void nvme_rdma_reset_ctrl_work(struct work_struct *work)
 static const struct nvme_ctrl_ops nvme_rdma_ctrl_ops = {
 	.name			= "rdma",
 	.module			= THIS_MODULE,
-	.flags			= NVME_F_FABRICS,
+	.flags			= NVME_F_FABRICS | NVME_F_METADATA_SUPPORTED,
 	.reg_read32		= nvmf_reg_read32,
 	.reg_read64		= nvmf_reg_read64,
 	.reg_write32		= nvmf_reg_write32,
@@ -2081,7 +2343,7 @@ static struct nvme_ctrl *nvme_rdma_create_ctrl(struct device *dev,
 	.allowed_opts	= NVMF_OPT_TRSVCID | NVMF_OPT_RECONNECT_DELAY |
 			  NVMF_OPT_HOST_TRADDR | NVMF_OPT_CTRL_LOSS_TMO |
 			  NVMF_OPT_NR_WRITE_QUEUES | NVMF_OPT_NR_POLL_QUEUES |
-			  NVMF_OPT_TOS,
+			  NVMF_OPT_TOS | NVMF_OPT_PI_ENABLE,
 	.create_ctrl	= nvme_rdma_create_ctrl,
 };
 
-- 
1.8.3.1



* [PATCH 09/17] nvmet: prepare metadata request
  2020-03-27 17:15 [PATCH 00/17 V5] nvme-rdma/nvmet-rdma: Add metadata/T10-PI support Max Gurtovoy
                   ` (8 preceding siblings ...)
  2020-03-27 17:15 ` [PATCH 08/17] nvme-rdma: add metadata/T10-PI support Max Gurtovoy
@ 2020-03-27 17:15 ` Max Gurtovoy
  2020-04-21 15:21   ` Christoph Hellwig
  2020-03-27 17:15 ` [PATCH 10/17] nvmet: add metadata characteristics for a namespace Max Gurtovoy
                   ` (7 subsequent siblings)
  17 siblings, 1 reply; 59+ messages in thread
From: Max Gurtovoy @ 2020-03-27 17:15 UTC (permalink / raw)
  To: linux-nvme, kbusch, hch, sagi, martin.petersen, jsmart2021, linux-rdma
  Cc: axboe, vladimirk, shlomin, israelr, idanb, jgg, oren, maxg

From: Israel Rukshin <israelr@mellanox.com>

Allocate the metadata SGL buffers and set the metadata fields for the
request. This is a preparation for adding metadata support over
fabrics.

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
---
 drivers/nvme/target/core.c  | 48 ++++++++++++++++++++++++++++++++++++++-------
 drivers/nvme/target/nvmet.h |  5 +++++
 2 files changed, 46 insertions(+), 7 deletions(-)

diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
index b685f99d..3899b2b4 100644
--- a/drivers/nvme/target/core.c
+++ b/drivers/nvme/target/core.c
@@ -873,13 +873,17 @@ bool nvmet_req_init(struct nvmet_req *req, struct nvmet_cq *cq,
 	req->sq = sq;
 	req->ops = ops;
 	req->sg = NULL;
+	req->md_sg = NULL;
 	req->sg_cnt = 0;
+	req->md_sg_cnt = 0;
 	req->transfer_len = 0;
+	req->md_len = 0;
 	req->cqe->status = 0;
 	req->cqe->sq_head = 0;
 	req->ns = NULL;
 	req->error_loc = NVMET_NO_ERROR_LOC;
 	req->error_slba = 0;
+	req->use_md = false;
 
 	trace_nvmet_req_init(req, req->cmd);
 
@@ -962,6 +966,7 @@ bool nvmet_check_data_len_lte(struct nvmet_req *req, size_t data_len)
 int nvmet_req_alloc_sgl(struct nvmet_req *req)
 {
 	struct pci_dev *p2p_dev = NULL;
+	int data_len = req->transfer_len - req->md_len;
 
 	if (IS_ENABLED(CONFIG_PCI_P2PDMA)) {
 		if (req->sq->ctrl && req->ns)
@@ -971,11 +976,23 @@ int nvmet_req_alloc_sgl(struct nvmet_req *req)
 		req->p2p_dev = NULL;
 		if (req->sq->qid && p2p_dev) {
 			req->sg = pci_p2pmem_alloc_sgl(p2p_dev, &req->sg_cnt,
-						       req->transfer_len);
-			if (req->sg) {
-				req->p2p_dev = p2p_dev;
-				return 0;
+						       data_len);
+			if (!req->sg)
+				goto fallback;
+
+			if (req->md_len) {
+				req->md_sg =
+					pci_p2pmem_alloc_sgl(p2p_dev,
+							     &req->md_sg_cnt,
+							     req->md_len);
+				if (!req->md_sg) {
+					pci_p2pmem_free_sgl(req->p2p_dev,
+							    req->sg);
+					goto fallback;
+				}
 			}
+			req->p2p_dev = p2p_dev;
+			return 0;
 		}
 
 		/*
@@ -984,23 +1001,40 @@ int nvmet_req_alloc_sgl(struct nvmet_req *req)
 		 */
 	}
 
-	req->sg = sgl_alloc(req->transfer_len, GFP_KERNEL, &req->sg_cnt);
+fallback:
+	req->sg = sgl_alloc(data_len, GFP_KERNEL, &req->sg_cnt);
 	if (unlikely(!req->sg))
 		return -ENOMEM;
 
+	if (req->md_len) {
+		req->md_sg = sgl_alloc(req->md_len, GFP_KERNEL,
+					 &req->md_sg_cnt);
+		if (unlikely(!req->md_sg)) {
+			sgl_free(req->sg);
+			return -ENOMEM;
+		}
+	}
+
 	return 0;
 }
 EXPORT_SYMBOL_GPL(nvmet_req_alloc_sgl);
 
 void nvmet_req_free_sgl(struct nvmet_req *req)
 {
-	if (req->p2p_dev)
+	if (req->p2p_dev) {
 		pci_p2pmem_free_sgl(req->p2p_dev, req->sg);
-	else
+		if (req->md_sg)
+			pci_p2pmem_free_sgl(req->p2p_dev, req->md_sg);
+	} else {
 		sgl_free(req->sg);
+		if (req->md_sg)
+			sgl_free(req->md_sg);
+	}
 
 	req->sg = NULL;
+	req->md_sg = NULL;
 	req->sg_cnt = 0;
+	req->md_sg_cnt = 0;
 }
 EXPORT_SYMBOL_GPL(nvmet_req_free_sgl);
 
diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
index 421dff3..8c75667 100644
--- a/drivers/nvme/target/nvmet.h
+++ b/drivers/nvme/target/nvmet.h
@@ -302,6 +302,7 @@ struct nvmet_req {
 	struct nvmet_cq		*cq;
 	struct nvmet_ns		*ns;
 	struct scatterlist	*sg;
+	struct scatterlist	*md_sg;
 	struct bio_vec		inline_bvec[NVMET_MAX_INLINE_BIOVEC];
 	union {
 		struct {
@@ -315,8 +316,10 @@ struct nvmet_req {
 		} f;
 	};
 	int			sg_cnt;
+	int			md_sg_cnt;
 	/* data length as parsed from the SGL descriptor: */
 	size_t			transfer_len;
+	size_t			md_len;
 
 	struct nvmet_port	*port;
 
@@ -327,6 +330,8 @@ struct nvmet_req {
 	struct device		*p2p_client;
 	u16			error_loc;
 	u64			error_slba;
+	/* Metadata support */
+	bool			use_md;
 };
 
 extern struct workqueue_struct *buffered_io_wq;
-- 
1.8.3.1



* [PATCH 10/17] nvmet: add metadata characteristics for a namespace
  2020-03-27 17:15 [PATCH 00/17 V5] nvme-rdma/nvmet-rdma: Add metadata/T10-PI support Max Gurtovoy
                   ` (9 preceding siblings ...)
  2020-03-27 17:15 ` [PATCH 09/17] nvmet: prepare metadata request Max Gurtovoy
@ 2020-03-27 17:15 ` Max Gurtovoy
  2020-04-21 15:23   ` Christoph Hellwig
  2020-03-27 17:15 ` [PATCH 11/17] nvmet: Rename nvmet_rw_len to nvmet_rw_data_len Max Gurtovoy
                   ` (6 subsequent siblings)
  17 siblings, 1 reply; 59+ messages in thread
From: Max Gurtovoy @ 2020-03-27 17:15 UTC (permalink / raw)
  To: linux-nvme, kbusch, hch, sagi, martin.petersen, jsmart2021, linux-rdma
  Cc: axboe, vladimirk, shlomin, israelr, idanb, jgg, oren, maxg

From: Israel Rukshin <israelr@mellanox.com>

Fill the metadata-related namespace fields from the block device
integrity profile. This is a preparation for adding metadata (T10-PI)
support over fabrics with block device backends.

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
---
 drivers/nvme/target/io-cmd-bdev.c | 22 ++++++++++++++++++++++
 drivers/nvme/target/nvmet.h       |  3 +++
 2 files changed, 25 insertions(+)

diff --git a/drivers/nvme/target/io-cmd-bdev.c b/drivers/nvme/target/io-cmd-bdev.c
index ea0e596..bdf611f 100644
--- a/drivers/nvme/target/io-cmd-bdev.c
+++ b/drivers/nvme/target/io-cmd-bdev.c
@@ -50,6 +50,9 @@ void nvmet_bdev_set_limits(struct block_device *bdev, struct nvme_id_ns *id)
 int nvmet_bdev_ns_enable(struct nvmet_ns *ns)
 {
 	int ret;
+#ifdef CONFIG_BLK_DEV_INTEGRITY
+	struct blk_integrity *bi;
+#endif
 
 	ns->bdev = blkdev_get_by_path(ns->device_path,
 			FMODE_READ | FMODE_WRITE, NULL);
@@ -64,6 +67,25 @@ int nvmet_bdev_ns_enable(struct nvmet_ns *ns)
 	}
 	ns->size = i_size_read(ns->bdev->bd_inode);
 	ns->blksize_shift = blksize_bits(bdev_logical_block_size(ns->bdev));
+
+	ns->md_type = 0;
+	ns->ms = 0;
+#ifdef CONFIG_BLK_DEV_INTEGRITY
+	bi = bdev_get_integrity(ns->bdev);
+	if (bi) {
+		ns->ms = bi->tuple_size;
+		if (bi->profile == &t10_pi_type1_crc)
+			ns->md_type = NVME_NS_DPS_PI_TYPE1;
+		else if (bi->profile == &t10_pi_type3_crc)
+			ns->md_type = NVME_NS_DPS_PI_TYPE3;
+		else
+			/* Unsupported metadata type */
+			ns->ms = 0;
+	}
+
+	pr_debug("ms %d md_type %d\n", ns->ms, ns->md_type);
+#endif
+
 	return 0;
 }
 
diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
index 8c75667..f5f93d4 100644
--- a/drivers/nvme/target/nvmet.h
+++ b/drivers/nvme/target/nvmet.h
@@ -19,6 +19,7 @@
 #include <linux/rcupdate.h>
 #include <linux/blkdev.h>
 #include <linux/radix-tree.h>
+#include <linux/t10-pi.h>
 
 #define NVMET_ASYNC_EVENTS		4
 #define NVMET_ERROR_LOG_SLOTS		128
@@ -77,6 +78,8 @@ struct nvmet_ns {
 
 	int			use_p2pmem;
 	struct pci_dev		*p2p_dev;
+	int			md_type;
+	int			ms;
 };
 
 static inline struct nvmet_ns *to_nvmet_ns(struct config_item *item)
-- 
1.8.3.1


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 11/17] nvmet: Rename nvmet_rw_len to nvmet_rw_data_len
  2020-03-27 17:15 [PATCH 00/17 V5] nvme-rdma/nvmet-rdma: Add metadata/T10-PI support Max Gurtovoy
                   ` (10 preceding siblings ...)
  2020-03-27 17:15 ` [PATCH 10/17] nvmet: add metadata characteristics for a namespace Max Gurtovoy
@ 2020-03-27 17:15 ` Max Gurtovoy
  2020-03-27 17:15 ` [PATCH 12/17] nvmet: Rename nvmet_check_data_len to nvmet_check_transfer_len Max Gurtovoy
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 59+ messages in thread
From: Max Gurtovoy @ 2020-03-27 17:15 UTC (permalink / raw)
  To: linux-nvme, kbusch, hch, sagi, martin.petersen, jsmart2021, linux-rdma
  Cc: axboe, vladimirk, shlomin, israelr, idanb, jgg, oren, maxg

From: Israel Rukshin <israelr@mellanox.com>

The function calculates only the data length and does not account for the
metadata length. Rename it to make that explicit, in preparation for adding
metadata (T10-PI) support.

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
---
 drivers/nvme/target/io-cmd-bdev.c | 2 +-
 drivers/nvme/target/io-cmd-file.c | 2 +-
 drivers/nvme/target/nvmet.h       | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/target/io-cmd-bdev.c b/drivers/nvme/target/io-cmd-bdev.c
index bdf611f..1669483 100644
--- a/drivers/nvme/target/io-cmd-bdev.c
+++ b/drivers/nvme/target/io-cmd-bdev.c
@@ -173,7 +173,7 @@ static void nvmet_bdev_execute_rw(struct nvmet_req *req)
 	sector_t sector;
 	int op, i;
 
-	if (!nvmet_check_data_len(req, nvmet_rw_len(req)))
+	if (!nvmet_check_data_len(req, nvmet_rw_data_len(req)))
 		return;
 
 	if (!req->sg_cnt) {
diff --git a/drivers/nvme/target/io-cmd-file.c b/drivers/nvme/target/io-cmd-file.c
index cd5670b..ffbba88 100644
--- a/drivers/nvme/target/io-cmd-file.c
+++ b/drivers/nvme/target/io-cmd-file.c
@@ -232,7 +232,7 @@ static void nvmet_file_execute_rw(struct nvmet_req *req)
 {
 	ssize_t nr_bvec = req->sg_cnt;
 
-	if (!nvmet_check_data_len(req, nvmet_rw_len(req)))
+	if (!nvmet_check_data_len(req, nvmet_rw_data_len(req)))
 		return;
 
 	if (!req->sg_cnt || !nr_bvec) {
diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
index f5f93d4..923c452 100644
--- a/drivers/nvme/target/nvmet.h
+++ b/drivers/nvme/target/nvmet.h
@@ -507,7 +507,7 @@ void nvmet_add_async_event(struct nvmet_ctrl *ctrl, u8 event_type,
 u16 nvmet_file_flush(struct nvmet_req *req);
 void nvmet_ns_changed(struct nvmet_subsys *subsys, u32 nsid);
 
-static inline u32 nvmet_rw_len(struct nvmet_req *req)
+static inline u32 nvmet_rw_data_len(struct nvmet_req *req)
 {
 	return ((u32)le16_to_cpu(req->cmd->rw.length) + 1) <<
 			req->ns->blksize_shift;
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 12/17] nvmet: Rename nvmet_check_data_len to nvmet_check_transfer_len
  2020-03-27 17:15 [PATCH 00/17 V5] nvme-rdma/nvmet-rdma: Add metadata/T10-PI support Max Gurtovoy
                   ` (11 preceding siblings ...)
  2020-03-27 17:15 ` [PATCH 11/17] nvmet: Rename nvmet_rw_len to nvmet_rw_data_len Max Gurtovoy
@ 2020-03-27 17:15 ` Max Gurtovoy
  2020-03-27 17:15 ` [PATCH 13/17] nvme: Add Metadata Capabilities enumerations Max Gurtovoy
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 59+ messages in thread
From: Max Gurtovoy @ 2020-03-27 17:15 UTC (permalink / raw)
  To: linux-nvme, kbusch, hch, sagi, martin.petersen, jsmart2021, linux-rdma
  Cc: axboe, vladimirk, shlomin, israelr, idanb, jgg, oren, maxg

From: Israel Rukshin <israelr@mellanox.com>

The function checks the total transfer length, which in some cases includes
the metadata length in addition to the data length. Rename it accordingly,
in preparation for adding metadata (T10-PI) support.

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
---
 drivers/nvme/target/admin-cmd.c   | 14 +++++++-------
 drivers/nvme/target/core.c        |  6 +++---
 drivers/nvme/target/discovery.c   |  8 ++++----
 drivers/nvme/target/fabrics-cmd.c |  8 ++++----
 drivers/nvme/target/io-cmd-bdev.c |  6 +++---
 drivers/nvme/target/io-cmd-file.c |  6 +++---
 drivers/nvme/target/nvmet.h       |  2 +-
 7 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
index b9ec489..24ed5b2 100644
--- a/drivers/nvme/target/admin-cmd.c
+++ b/drivers/nvme/target/admin-cmd.c
@@ -294,7 +294,7 @@ static void nvmet_execute_get_log_page_ana(struct nvmet_req *req)
 
 static void nvmet_execute_get_log_page(struct nvmet_req *req)
 {
-	if (!nvmet_check_data_len(req, nvmet_get_log_page_len(req->cmd)))
+	if (!nvmet_check_transfer_len(req, nvmet_get_log_page_len(req->cmd)))
 		return;
 
 	switch (req->cmd->get_log_page.lid) {
@@ -624,7 +624,7 @@ static void nvmet_execute_identify_desclist(struct nvmet_req *req)
 
 static void nvmet_execute_identify(struct nvmet_req *req)
 {
-	if (!nvmet_check_data_len(req, NVME_IDENTIFY_DATA_SIZE))
+	if (!nvmet_check_transfer_len(req, NVME_IDENTIFY_DATA_SIZE))
 		return;
 
 	switch (req->cmd->identify.cns) {
@@ -653,7 +653,7 @@ static void nvmet_execute_identify(struct nvmet_req *req)
  */
 static void nvmet_execute_abort(struct nvmet_req *req)
 {
-	if (!nvmet_check_data_len(req, 0))
+	if (!nvmet_check_transfer_len(req, 0))
 		return;
 	nvmet_set_result(req, 1);
 	nvmet_req_complete(req, 0);
@@ -742,7 +742,7 @@ static void nvmet_execute_set_features(struct nvmet_req *req)
 	u16 nsqr;
 	u16 ncqr;
 
-	if (!nvmet_check_data_len(req, 0))
+	if (!nvmet_check_transfer_len(req, 0))
 		return;
 
 	switch (cdw10 & 0xff) {
@@ -814,7 +814,7 @@ static void nvmet_execute_get_features(struct nvmet_req *req)
 	u32 cdw10 = le32_to_cpu(req->cmd->common.cdw10);
 	u16 status = 0;
 
-	if (!nvmet_check_data_len(req, nvmet_feat_data_len(req, cdw10)))
+	if (!nvmet_check_transfer_len(req, nvmet_feat_data_len(req, cdw10)))
 		return;
 
 	switch (cdw10 & 0xff) {
@@ -881,7 +881,7 @@ void nvmet_execute_async_event(struct nvmet_req *req)
 {
 	struct nvmet_ctrl *ctrl = req->sq->ctrl;
 
-	if (!nvmet_check_data_len(req, 0))
+	if (!nvmet_check_transfer_len(req, 0))
 		return;
 
 	mutex_lock(&ctrl->lock);
@@ -900,7 +900,7 @@ void nvmet_execute_keep_alive(struct nvmet_req *req)
 {
 	struct nvmet_ctrl *ctrl = req->sq->ctrl;
 
-	if (!nvmet_check_data_len(req, 0))
+	if (!nvmet_check_transfer_len(req, 0))
 		return;
 
 	pr_debug("ctrl %d update keep-alive timer for %d secs\n",
diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
index 3899b2b4..175445d 100644
--- a/drivers/nvme/target/core.c
+++ b/drivers/nvme/target/core.c
@@ -940,9 +940,9 @@ void nvmet_req_uninit(struct nvmet_req *req)
 }
 EXPORT_SYMBOL_GPL(nvmet_req_uninit);
 
-bool nvmet_check_data_len(struct nvmet_req *req, size_t data_len)
+bool nvmet_check_transfer_len(struct nvmet_req *req, size_t len)
 {
-	if (unlikely(data_len != req->transfer_len)) {
+	if (unlikely(len != req->transfer_len)) {
 		req->error_loc = offsetof(struct nvme_common_command, dptr);
 		nvmet_req_complete(req, NVME_SC_SGL_INVALID_DATA | NVME_SC_DNR);
 		return false;
@@ -950,7 +950,7 @@ bool nvmet_check_data_len(struct nvmet_req *req, size_t data_len)
 
 	return true;
 }
-EXPORT_SYMBOL_GPL(nvmet_check_data_len);
+EXPORT_SYMBOL_GPL(nvmet_check_transfer_len);
 
 bool nvmet_check_data_len_lte(struct nvmet_req *req, size_t data_len)
 {
diff --git a/drivers/nvme/target/discovery.c b/drivers/nvme/target/discovery.c
index 0c2274b..40cf0b6 100644
--- a/drivers/nvme/target/discovery.c
+++ b/drivers/nvme/target/discovery.c
@@ -171,7 +171,7 @@ static void nvmet_execute_disc_get_log_page(struct nvmet_req *req)
 	u16 status = 0;
 	void *buffer;
 
-	if (!nvmet_check_data_len(req, data_len))
+	if (!nvmet_check_transfer_len(req, data_len))
 		return;
 
 	if (req->cmd->get_log_page.lid != NVME_LOG_DISC) {
@@ -244,7 +244,7 @@ static void nvmet_execute_disc_identify(struct nvmet_req *req)
 	const char model[] = "Linux";
 	u16 status = 0;
 
-	if (!nvmet_check_data_len(req, NVME_IDENTIFY_DATA_SIZE))
+	if (!nvmet_check_transfer_len(req, NVME_IDENTIFY_DATA_SIZE))
 		return;
 
 	if (req->cmd->identify.cns != NVME_ID_CNS_CTRL) {
@@ -298,7 +298,7 @@ static void nvmet_execute_disc_set_features(struct nvmet_req *req)
 	u32 cdw10 = le32_to_cpu(req->cmd->common.cdw10);
 	u16 stat;
 
-	if (!nvmet_check_data_len(req, 0))
+	if (!nvmet_check_transfer_len(req, 0))
 		return;
 
 	switch (cdw10 & 0xff) {
@@ -324,7 +324,7 @@ static void nvmet_execute_disc_get_features(struct nvmet_req *req)
 	u32 cdw10 = le32_to_cpu(req->cmd->common.cdw10);
 	u16 stat = 0;
 
-	if (!nvmet_check_data_len(req, 0))
+	if (!nvmet_check_transfer_len(req, 0))
 		return;
 
 	switch (cdw10 & 0xff) {
diff --git a/drivers/nvme/target/fabrics-cmd.c b/drivers/nvme/target/fabrics-cmd.c
index feef15c..52a6f70 100644
--- a/drivers/nvme/target/fabrics-cmd.c
+++ b/drivers/nvme/target/fabrics-cmd.c
@@ -12,7 +12,7 @@ static void nvmet_execute_prop_set(struct nvmet_req *req)
 	u64 val = le64_to_cpu(req->cmd->prop_set.value);
 	u16 status = 0;
 
-	if (!nvmet_check_data_len(req, 0))
+	if (!nvmet_check_transfer_len(req, 0))
 		return;
 
 	if (req->cmd->prop_set.attrib & 1) {
@@ -41,7 +41,7 @@ static void nvmet_execute_prop_get(struct nvmet_req *req)
 	u16 status = 0;
 	u64 val = 0;
 
-	if (!nvmet_check_data_len(req, 0))
+	if (!nvmet_check_transfer_len(req, 0))
 		return;
 
 	if (req->cmd->prop_get.attrib & 1) {
@@ -156,7 +156,7 @@ static void nvmet_execute_admin_connect(struct nvmet_req *req)
 	struct nvmet_ctrl *ctrl = NULL;
 	u16 status = 0;
 
-	if (!nvmet_check_data_len(req, sizeof(struct nvmf_connect_data)))
+	if (!nvmet_check_transfer_len(req, sizeof(struct nvmf_connect_data)))
 		return;
 
 	d = kmalloc(sizeof(*d), GFP_KERNEL);
@@ -223,7 +223,7 @@ static void nvmet_execute_io_connect(struct nvmet_req *req)
 	u16 qid = le16_to_cpu(c->qid);
 	u16 status = 0;
 
-	if (!nvmet_check_data_len(req, sizeof(struct nvmf_connect_data)))
+	if (!nvmet_check_transfer_len(req, sizeof(struct nvmf_connect_data)))
 		return;
 
 	d = kmalloc(sizeof(*d), GFP_KERNEL);
diff --git a/drivers/nvme/target/io-cmd-bdev.c b/drivers/nvme/target/io-cmd-bdev.c
index 1669483..2e9e309 100644
--- a/drivers/nvme/target/io-cmd-bdev.c
+++ b/drivers/nvme/target/io-cmd-bdev.c
@@ -173,7 +173,7 @@ static void nvmet_bdev_execute_rw(struct nvmet_req *req)
 	sector_t sector;
 	int op, i;
 
-	if (!nvmet_check_data_len(req, nvmet_rw_data_len(req)))
+	if (!nvmet_check_transfer_len(req, nvmet_rw_data_len(req)))
 		return;
 
 	if (!req->sg_cnt) {
@@ -234,7 +234,7 @@ static void nvmet_bdev_execute_flush(struct nvmet_req *req)
 {
 	struct bio *bio = &req->b.inline_bio;
 
-	if (!nvmet_check_data_len(req, 0))
+	if (!nvmet_check_transfer_len(req, 0))
 		return;
 
 	bio_init(bio, req->inline_bvec, ARRAY_SIZE(req->inline_bvec));
@@ -326,7 +326,7 @@ static void nvmet_bdev_execute_write_zeroes(struct nvmet_req *req)
 	sector_t nr_sector;
 	int ret;
 
-	if (!nvmet_check_data_len(req, 0))
+	if (!nvmet_check_transfer_len(req, 0))
 		return;
 
 	sector = le64_to_cpu(write_zeroes->slba) <<
diff --git a/drivers/nvme/target/io-cmd-file.c b/drivers/nvme/target/io-cmd-file.c
index ffbba88..6b098df 100644
--- a/drivers/nvme/target/io-cmd-file.c
+++ b/drivers/nvme/target/io-cmd-file.c
@@ -232,7 +232,7 @@ static void nvmet_file_execute_rw(struct nvmet_req *req)
 {
 	ssize_t nr_bvec = req->sg_cnt;
 
-	if (!nvmet_check_data_len(req, nvmet_rw_data_len(req)))
+	if (!nvmet_check_transfer_len(req, nvmet_rw_data_len(req)))
 		return;
 
 	if (!req->sg_cnt || !nr_bvec) {
@@ -276,7 +276,7 @@ static void nvmet_file_flush_work(struct work_struct *w)
 
 static void nvmet_file_execute_flush(struct nvmet_req *req)
 {
-	if (!nvmet_check_data_len(req, 0))
+	if (!nvmet_check_transfer_len(req, 0))
 		return;
 	INIT_WORK(&req->f.work, nvmet_file_flush_work);
 	schedule_work(&req->f.work);
@@ -366,7 +366,7 @@ static void nvmet_file_write_zeroes_work(struct work_struct *w)
 
 static void nvmet_file_execute_write_zeroes(struct nvmet_req *req)
 {
-	if (!nvmet_check_data_len(req, 0))
+	if (!nvmet_check_transfer_len(req, 0))
 		return;
 	INIT_WORK(&req->f.work, nvmet_file_write_zeroes_work);
 	schedule_work(&req->f.work);
diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
index 923c452..e7954d9 100644
--- a/drivers/nvme/target/nvmet.h
+++ b/drivers/nvme/target/nvmet.h
@@ -392,7 +392,7 @@ static inline bool nvmet_aen_bit_disabled(struct nvmet_ctrl *ctrl, u32 bn)
 bool nvmet_req_init(struct nvmet_req *req, struct nvmet_cq *cq,
 		struct nvmet_sq *sq, const struct nvmet_fabrics_ops *ops);
 void nvmet_req_uninit(struct nvmet_req *req);
-bool nvmet_check_data_len(struct nvmet_req *req, size_t data_len);
+bool nvmet_check_transfer_len(struct nvmet_req *req, size_t len);
 bool nvmet_check_data_len_lte(struct nvmet_req *req, size_t data_len);
 void nvmet_req_complete(struct nvmet_req *req, u16 status);
 int nvmet_req_alloc_sgl(struct nvmet_req *req);
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 13/17] nvme: Add Metadata Capabilities enumerations
  2020-03-27 17:15 [PATCH 00/17 V5] nvme-rdma/nvmet-rdma: Add metadata/T10-PI support Max Gurtovoy
                   ` (12 preceding siblings ...)
  2020-03-27 17:15 ` [PATCH 12/17] nvmet: Rename nvmet_check_data_len to nvmet_check_transfer_len Max Gurtovoy
@ 2020-03-27 17:15 ` Max Gurtovoy
  2020-04-21 15:24   ` Christoph Hellwig
  2020-03-27 17:15 ` [PATCH 14/17] nvmet: Add metadata/T10-PI support Max Gurtovoy
                   ` (3 subsequent siblings)
  17 siblings, 1 reply; 59+ messages in thread
From: Max Gurtovoy @ 2020-03-27 17:15 UTC (permalink / raw)
  To: linux-nvme, kbusch, hch, sagi, martin.petersen, jsmart2021, linux-rdma
  Cc: axboe, vladimirk, shlomin, israelr, idanb, jgg, oren, maxg

From: Israel Rukshin <israelr@mellanox.com>

The enumerations will be used by the target to expose the namespace metadata
format.

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
---
 include/linux/nvme.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index 3d5189f..4e968d3 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -396,6 +396,8 @@ enum {
 	NVME_NS_FEAT_THIN	= 1 << 0,
 	NVME_NS_FLBAS_LBA_MASK	= 0xf,
 	NVME_NS_FLBAS_META_EXT	= 0x10,
+	NVME_NS_MC_META_EXT	= 1 << 0,
+	NVME_NS_MC_META_BUF	= 1 << 1,
 	NVME_LBAF_RP_BEST	= 0,
 	NVME_LBAF_RP_BETTER	= 1,
 	NVME_LBAF_RP_GOOD	= 2,
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 14/17] nvmet: Add metadata/T10-PI support
  2020-03-27 17:15 [PATCH 00/17 V5] nvme-rdma/nvmet-rdma: Add metadata/T10-PI support Max Gurtovoy
                   ` (13 preceding siblings ...)
  2020-03-27 17:15 ` [PATCH 13/17] nvme: Add Metadata Capabilities enumerations Max Gurtovoy
@ 2020-03-27 17:15 ` Max Gurtovoy
  2020-04-21 15:30   ` Christoph Hellwig
  2020-03-27 17:15 ` [PATCH 15/17] nvmet: Add metadata support for block devices Max Gurtovoy
                   ` (2 subsequent siblings)
  17 siblings, 1 reply; 59+ messages in thread
From: Max Gurtovoy @ 2020-03-27 17:15 UTC (permalink / raw)
  To: linux-nvme, kbusch, hch, sagi, martin.petersen, jsmart2021, linux-rdma
  Cc: axboe, vladimirk, shlomin, israelr, idanb, jgg, oren, maxg

From: Israel Rukshin <israelr@mellanox.com>

Expose the namespace metadata format when PI is enabled. The user needs
to enable the capability per subsystem and per port. The other metadata
properties are taken from the namespace/bdev.

Usage example:
echo 1 > /config/nvmet/subsystems/${NAME}/attr_pi_enable
echo 1 > /config/nvmet/ports/${PORT_NUM}/param_pi_enable

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
---
 drivers/nvme/target/admin-cmd.c   | 19 ++++++++++--
 drivers/nvme/target/configfs.c    | 61 +++++++++++++++++++++++++++++++++++++++
 drivers/nvme/target/fabrics-cmd.c | 11 +++++++
 drivers/nvme/target/nvmet.h       | 26 +++++++++++++++++
 4 files changed, 115 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
index 24ed5b2..8a2c059 100644
--- a/drivers/nvme/target/admin-cmd.c
+++ b/drivers/nvme/target/admin-cmd.c
@@ -432,9 +432,13 @@ static void nvmet_execute_identify_ctrl(struct nvmet_req *req)
 
 	strlcpy(id->subnqn, ctrl->subsys->subsysnqn, sizeof(id->subnqn));
 
-	/* Max command capsule size is sqe + single page of in-capsule data */
+	/*
+	 * Max command capsule size is sqe + single page of in-capsule data.
+	 * Disable inline data for Metadata capable controllers.
+	 */
 	id->ioccsz = cpu_to_le32((sizeof(struct nvme_command) +
-				  req->port->inline_data_size) / 16);
+				  req->port->inline_data_size *
+				  !ctrl->pi_support) / 16);
 	/* Max response capsule size is cqe */
 	id->iorcsz = cpu_to_le32(sizeof(struct nvme_completion) / 16);
 
@@ -464,6 +468,7 @@ static void nvmet_execute_identify_ctrl(struct nvmet_req *req)
 
 static void nvmet_execute_identify_ns(struct nvmet_req *req)
 {
+	struct nvmet_ctrl *ctrl = req->sq->ctrl;
 	struct nvmet_ns *ns;
 	struct nvme_id_ns *id;
 	u16 status = 0;
@@ -520,6 +525,16 @@ static void nvmet_execute_identify_ns(struct nvmet_req *req)
 
 	id->lbaf[0].ds = ns->blksize_shift;
 
+	if (ctrl->pi_support && nvmet_ns_has_pi(ns)) {
+		id->dpc = NVME_NS_DPC_PI_FIRST | NVME_NS_DPC_PI_LAST |
+			  NVME_NS_DPC_PI_TYPE1 | NVME_NS_DPC_PI_TYPE2 |
+			  NVME_NS_DPC_PI_TYPE3;
+		id->mc = NVME_NS_MC_META_EXT;
+		id->dps = ns->md_type;
+		id->flbas = NVME_NS_FLBAS_META_EXT;
+		id->lbaf[0].ms = ns->ms;
+	}
+
 	if (ns->readonly)
 		id->nsattr |= (1 << 0);
 	nvmet_put_namespace(ns);
diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c
index 7aa1078..19bf240 100644
--- a/drivers/nvme/target/configfs.c
+++ b/drivers/nvme/target/configfs.c
@@ -248,6 +248,36 @@ static ssize_t nvmet_param_inline_data_size_store(struct config_item *item,
 
 CONFIGFS_ATTR(nvmet_, param_inline_data_size);
 
+#ifdef CONFIG_BLK_DEV_INTEGRITY
+static ssize_t nvmet_param_pi_enable_show(struct config_item *item,
+		char *page)
+{
+	struct nvmet_port *port = to_nvmet_port(item);
+
+	return snprintf(page, PAGE_SIZE, "%d\n", port->pi_enable);
+}
+
+static ssize_t nvmet_param_pi_enable_store(struct config_item *item,
+		const char *page, size_t count)
+{
+	struct nvmet_port *port = to_nvmet_port(item);
+	bool val;
+
+	if (strtobool(page, &val))
+		return -EINVAL;
+
+	if (port->enabled) {
+		pr_err("Disable port before setting pi_enable value.\n");
+		return -EACCES;
+	}
+
+	port->pi_enable = val;
+	return count;
+}
+
+CONFIGFS_ATTR(nvmet_, param_pi_enable);
+#endif
+
 static ssize_t nvmet_addr_trtype_show(struct config_item *item,
 		char *page)
 {
@@ -987,6 +1017,31 @@ static ssize_t nvmet_subsys_attr_model_store(struct config_item *item,
 }
 CONFIGFS_ATTR(nvmet_subsys_, attr_model);
 
+#ifdef CONFIG_BLK_DEV_INTEGRITY
+static ssize_t nvmet_subsys_attr_pi_enable_show(struct config_item *item,
+						char *page)
+{
+	return snprintf(page, PAGE_SIZE, "%d\n", to_subsys(item)->pi_support);
+}
+
+static ssize_t nvmet_subsys_attr_pi_enable_store(struct config_item *item,
+						 const char *page, size_t count)
+{
+	struct nvmet_subsys *subsys = to_subsys(item);
+	bool pi_enable;
+
+	if (strtobool(page, &pi_enable))
+		return -EINVAL;
+
+	down_write(&nvmet_config_sem);
+	subsys->pi_support = pi_enable;
+	up_write(&nvmet_config_sem);
+
+	return count;
+}
+CONFIGFS_ATTR(nvmet_subsys_, attr_pi_enable);
+#endif
+
 static struct configfs_attribute *nvmet_subsys_attrs[] = {
 	&nvmet_subsys_attr_attr_allow_any_host,
 	&nvmet_subsys_attr_attr_version,
@@ -994,6 +1049,9 @@ static ssize_t nvmet_subsys_attr_model_store(struct config_item *item,
 	&nvmet_subsys_attr_attr_cntlid_min,
 	&nvmet_subsys_attr_attr_cntlid_max,
 	&nvmet_subsys_attr_attr_model,
+#ifdef CONFIG_BLK_DEV_INTEGRITY
+	&nvmet_subsys_attr_attr_pi_enable,
+#endif
 	NULL,
 };
 
@@ -1289,6 +1347,9 @@ static void nvmet_port_release(struct config_item *item)
 	&nvmet_attr_addr_trsvcid,
 	&nvmet_attr_addr_trtype,
 	&nvmet_attr_param_inline_data_size,
+#ifdef CONFIG_BLK_DEV_INTEGRITY
+	&nvmet_attr_param_pi_enable,
+#endif
 	NULL,
 };
 
diff --git a/drivers/nvme/target/fabrics-cmd.c b/drivers/nvme/target/fabrics-cmd.c
index 52a6f70..799de18 100644
--- a/drivers/nvme/target/fabrics-cmd.c
+++ b/drivers/nvme/target/fabrics-cmd.c
@@ -197,6 +197,17 @@ static void nvmet_execute_admin_connect(struct nvmet_req *req)
 		goto out;
 	}
 
+	if (ctrl->subsys->pi_support && ctrl->port->pi_enable) {
+		if (ctrl->port->pi_capable) {
+			ctrl->pi_support = true;
+			pr_info("controller %d T10-PI enabled\n", ctrl->cntlid);
+		} else {
+			ctrl->pi_support = false;
+			pr_warn("T10-PI is not supported on controller %d\n",
+				ctrl->cntlid);
+		}
+	}
+
 	uuid_copy(&ctrl->hostid, &d->hostid);
 
 	status = nvmet_install_queue(ctrl, req);
diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
index e7954d9..b976aad 100644
--- a/drivers/nvme/target/nvmet.h
+++ b/drivers/nvme/target/nvmet.h
@@ -145,6 +145,8 @@ struct nvmet_port {
 	bool				enabled;
 	int				inline_data_size;
 	const struct nvmet_fabrics_ops	*tr_ops;
+	bool				pi_capable;
+	bool				pi_enable;
 };
 
 static inline struct nvmet_port *to_nvmet_port(struct config_item *item)
@@ -204,6 +206,7 @@ struct nvmet_ctrl {
 	spinlock_t		error_lock;
 	u64			err_counter;
 	struct nvme_error_slot	slots[NVMET_ERROR_LOG_SLOTS];
+	bool			pi_support;
 };
 
 struct nvmet_subsys_model {
@@ -233,6 +236,7 @@ struct nvmet_subsys {
 	u64			ver;
 	u64			serial;
 	char			*subsysnqn;
+	bool			pi_support;
 
 	struct config_group	group;
 
@@ -513,6 +517,28 @@ static inline u32 nvmet_rw_data_len(struct nvmet_req *req)
 			req->ns->blksize_shift;
 }
 
+#ifdef CONFIG_BLK_DEV_INTEGRITY
+static inline u32 nvmet_rw_md_len(struct nvmet_req *req)
+{
+	return ((u32)le16_to_cpu(req->cmd->rw.length) + 1) * req->ns->ms;
+}
+
+static inline bool nvmet_ns_has_pi(struct nvmet_ns *ns)
+{
+	return ns->md_type && ns->ms == sizeof(struct t10_pi_tuple);
+}
+#else
+static inline u32 nvmet_rw_md_len(struct nvmet_req *req)
+{
+	return 0;
+}
+
+static inline bool nvmet_ns_has_pi(struct nvmet_ns *ns)
+{
+	return false;
+}
+#endif /* CONFIG_BLK_DEV_INTEGRITY */
+
 static inline u32 nvmet_dsm_len(struct nvmet_req *req)
 {
 	return (le32_to_cpu(req->cmd->dsm.nr) + 1) *
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 15/17] nvmet: Add metadata support for block devices
  2020-03-27 17:15 [PATCH 00/17 V5] nvme-rdma/nvmet-rdma: Add metadata/T10-PI support Max Gurtovoy
                   ` (14 preceding siblings ...)
  2020-03-27 17:15 ` [PATCH 14/17] nvmet: Add metadata/T10-PI support Max Gurtovoy
@ 2020-03-27 17:15 ` Max Gurtovoy
  2020-04-21 15:33   ` Christoph Hellwig
  2020-03-27 17:15 ` [PATCH 16/17] RDMA/rw: Expose maximal page list for a device per 1 MR Max Gurtovoy
  2020-03-27 17:15 ` [PATCH 17/17] nvmet-rdma: Add metadata/T10-PI support Max Gurtovoy
  17 siblings, 1 reply; 59+ messages in thread
From: Max Gurtovoy @ 2020-03-27 17:15 UTC (permalink / raw)
  To: linux-nvme, kbusch, hch, sagi, martin.petersen, jsmart2021, linux-rdma
  Cc: axboe, vladimirk, shlomin, israelr, idanb, jgg, oren, maxg

From: Israel Rukshin <israelr@mellanox.com>

Create a block I/O request for the metadata from the protection SG list.

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
---
 drivers/nvme/target/io-cmd-bdev.c | 87 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 85 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/target/io-cmd-bdev.c b/drivers/nvme/target/io-cmd-bdev.c
index 2e9e309..018acd7 100644
--- a/drivers/nvme/target/io-cmd-bdev.c
+++ b/drivers/nvme/target/io-cmd-bdev.c
@@ -164,6 +164,61 @@ static void nvmet_bio_done(struct bio *bio)
 		bio_put(bio);
 }
 
+#ifdef CONFIG_BLK_DEV_INTEGRITY
+static int nvmet_bdev_alloc_bip(struct nvmet_req *req, struct bio *bio,
+				struct sg_mapping_iter *miter)
+{
+	struct blk_integrity *bi;
+	struct bio_integrity_payload *bip;
+	struct block_device *bdev = req->ns->bdev;
+	int rc;
+	size_t resid, len;
+
+	bi = bdev_get_integrity(bdev);
+	if (unlikely(!bi)) {
+		pr_err("Unable to locate bio_integrity\n");
+		return -ENODEV;
+	}
+
+	bip = bio_integrity_alloc(bio, GFP_NOIO,
+			min_t(unsigned int, req->md_sg_cnt, BIO_MAX_PAGES));
+	if (IS_ERR(bip)) {
+		pr_err("Unable to allocate bio_integrity_payload\n");
+		return PTR_ERR(bip);
+	}
+
+	bip->bip_iter.bi_size = bio_integrity_bytes(bi, bio_sectors(bio));
+	/* virtual start sector must be in integrity interval units */
+	bip_set_seed(bip, bio->bi_iter.bi_sector >>
+		     (bi->interval_exp - SECTOR_SHIFT));
+
+	resid = bip->bip_iter.bi_size;
+	while (resid > 0 && sg_miter_next(miter)) {
+		len = min_t(size_t, miter->length, resid);
+		rc = bio_integrity_add_page(bio, miter->page, len,
+					    offset_in_page(miter->addr));
+		if (unlikely(rc != len)) {
+			pr_err("bio_integrity_add_page() failed; %d\n", rc);
+			sg_miter_stop(miter);
+			return -ENOMEM;
+		}
+
+		resid -= len;
+		if (len < miter->length)
+			miter->consumed -= miter->length - len;
+	}
+	sg_miter_stop(miter);
+
+	return 0;
+}
+#else
+static int nvmet_bdev_alloc_bip(struct nvmet_req *req, struct bio *bio,
+				struct sg_mapping_iter *miter)
+{
+	return 0;
+}
+#endif /* CONFIG_BLK_DEV_INTEGRITY */
+
 static void nvmet_bdev_execute_rw(struct nvmet_req *req)
 {
 	int sg_cnt = req->sg_cnt;
@@ -171,9 +226,11 @@ static void nvmet_bdev_execute_rw(struct nvmet_req *req)
 	struct scatterlist *sg;
 	struct blk_plug plug;
 	sector_t sector;
-	int op, i;
+	int op, i, rc;
+	struct sg_mapping_iter prot_miter;
 
-	if (!nvmet_check_transfer_len(req, nvmet_rw_data_len(req)))
+	if (!nvmet_check_transfer_len(req,
+				      nvmet_rw_data_len(req) + req->md_len))
 		return;
 
 	if (!req->sg_cnt) {
@@ -208,11 +265,25 @@ static void nvmet_bdev_execute_rw(struct nvmet_req *req)
 	bio->bi_opf = op;
 
 	blk_start_plug(&plug);
+	if (req->use_md)
+		sg_miter_start(&prot_miter, req->md_sg, req->md_sg_cnt,
+			       op == REQ_OP_READ ? SG_MITER_FROM_SG :
+						   SG_MITER_TO_SG);
+
 	for_each_sg(req->sg, sg, req->sg_cnt, i) {
 		while (bio_add_page(bio, sg_page(sg), sg->length, sg->offset)
 				!= sg->length) {
 			struct bio *prev = bio;
 
+			if (req->use_md) {
+				rc = nvmet_bdev_alloc_bip(req, bio,
+							  &prot_miter);
+				if (unlikely(rc)) {
+					bio_io_error(bio);
+					return;
+				}
+			}
+
 			bio = bio_alloc(GFP_KERNEL, min(sg_cnt, BIO_MAX_PAGES));
 			bio_set_dev(bio, req->ns->bdev);
 			bio->bi_iter.bi_sector = sector;
@@ -226,6 +297,14 @@ static void nvmet_bdev_execute_rw(struct nvmet_req *req)
 		sg_cnt--;
 	}
 
+	if (req->use_md) {
+		rc = nvmet_bdev_alloc_bip(req, bio, &prot_miter);
+		if (unlikely(rc)) {
+			bio_io_error(bio);
+			return;
+		}
+	}
+
 	submit_bio(bio);
 	blk_finish_plug(&plug);
 }
@@ -353,6 +432,10 @@ u16 nvmet_bdev_parse_io_cmd(struct nvmet_req *req)
 	case nvme_cmd_read:
 	case nvme_cmd_write:
 		req->execute = nvmet_bdev_execute_rw;
+		if (req->sq->ctrl->pi_support && nvmet_ns_has_pi(req->ns)) {
+			req->use_md = true;
+			req->md_len = nvmet_rw_md_len(req);
+		}
 		return 0;
 	case nvme_cmd_flush:
 		req->execute = nvmet_bdev_execute_flush;
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 16/17] RDMA/rw: Expose maximal page list for a device per 1 MR
  2020-03-27 17:15 [PATCH 00/17 V5] nvme-rdma/nvmet-rdma: Add metadata/T10-PI support Max Gurtovoy
                   ` (15 preceding siblings ...)
  2020-03-27 17:15 ` [PATCH 15/17] nvmet: Add metadata support for block devices Max Gurtovoy
@ 2020-03-27 17:15 ` Max Gurtovoy
  2020-03-27 17:15 ` [PATCH 17/17] nvmet-rdma: Add metadata/T10-PI support Max Gurtovoy
  17 siblings, 0 replies; 59+ messages in thread
From: Max Gurtovoy @ 2020-03-27 17:15 UTC (permalink / raw)
  To: linux-nvme, kbusch, hch, sagi, martin.petersen, jsmart2021, linux-rdma
  Cc: axboe, vladimirk, shlomin, israelr, idanb, jgg, oren, maxg

ULPs will use this information to determine the maximal data transfer
size per I/O operation.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
---
 drivers/infiniband/core/rw.c | 14 ++++++++++++--
 include/rdma/rw.h            |  1 +
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/rw.c b/drivers/infiniband/core/rw.c
index 4fad732..edc9bee 100644
--- a/drivers/infiniband/core/rw.c
+++ b/drivers/infiniband/core/rw.c
@@ -56,8 +56,17 @@ static inline bool rdma_rw_io_needs_mr(struct ib_device *dev, u8 port_num,
 	return false;
 }
 
-static inline u32 rdma_rw_fr_page_list_len(struct ib_device *dev,
-					   bool pi_support)
+/**
+ * rdma_rw_fr_page_list_len - return the max number of pages mapped by 1 MR
+ * @dev:	RDMA device that will eventually create a PD for needed MRs
+ * @pi_support:	Whether MRs will be created for protection information offload
+ *
+ * Returns the number of pages that one MR can map for RDMA operation by the
+ * given device. One can determine the maximal data size according to the
+ * result of this function, or chose using multiple MRs for the RDMA operation
+ * as well.
+ */
+u32 rdma_rw_fr_page_list_len(struct ib_device *dev, bool pi_support)
 {
 	u32 max_pages;
 
@@ -69,6 +78,7 @@ static inline u32 rdma_rw_fr_page_list_len(struct ib_device *dev,
 	/* arbitrary limit to avoid allocating gigantic resources */
 	return min_t(u32, max_pages, 256);
 }
+EXPORT_SYMBOL(rdma_rw_fr_page_list_len);
 
 static inline int rdma_rw_inv_key(struct rdma_rw_reg_ctx *reg)
 {
diff --git a/include/rdma/rw.h b/include/rdma/rw.h
index 6ad9dc8..a9bbda7 100644
--- a/include/rdma/rw.h
+++ b/include/rdma/rw.h
@@ -69,5 +69,6 @@ unsigned int rdma_rw_mr_factor(struct ib_device *device, u8 port_num,
 void rdma_rw_init_qp(struct ib_device *dev, struct ib_qp_init_attr *attr);
 int rdma_rw_init_mrs(struct ib_qp *qp, struct ib_qp_init_attr *attr);
 void rdma_rw_cleanup_mrs(struct ib_qp *qp);
+u32 rdma_rw_fr_page_list_len(struct ib_device *dev, bool pi_support);
 
 #endif /* _RDMA_RW_H */
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 17/17] nvmet-rdma: Add metadata/T10-PI support
  2020-03-27 17:15 [PATCH 00/17 V5] nvme-rdma/nvmet-rdma: Add metadata/T10-PI support Max Gurtovoy
                   ` (16 preceding siblings ...)
  2020-03-27 17:15 ` [PATCH 16/17] RDMA/rw: Expose maximal page list for a device per 1 MR Max Gurtovoy
@ 2020-03-27 17:15 ` Max Gurtovoy
  2020-04-21 15:37   ` Christoph Hellwig
  17 siblings, 1 reply; 59+ messages in thread
From: Max Gurtovoy @ 2020-03-27 17:15 UTC (permalink / raw)
  To: linux-nvme, kbusch, hch, sagi, martin.petersen, jsmart2021, linux-rdma
  Cc: axboe, vladimirk, shlomin, israelr, idanb, jgg, oren, maxg

For capable HCAs (e.g. ConnectX-5/ConnectX-6) this will allow end-to-end
protection information passthrough and validation for NVMe over RDMA
transport. Metadata support was implemented over the new RDMA signature
verbs API.

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
---
 drivers/nvme/target/rdma.c | 245 +++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 227 insertions(+), 18 deletions(-)

diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
index 9e1b8c6..ec250bd 100644
--- a/drivers/nvme/target/rdma.c
+++ b/drivers/nvme/target/rdma.c
@@ -33,6 +33,7 @@
 
 /* Assume mpsmin == device_page_size == 4KB */
 #define NVMET_RDMA_MAX_MDTS			8
+#define NVMET_RDMA_MAX_MD_MDTS			5
 
 struct nvmet_rdma_cmd {
 	struct ib_sge		sge[NVMET_RDMA_MAX_INLINE_SGE + 1];
@@ -57,6 +58,7 @@ struct nvmet_rdma_rsp {
 	struct nvmet_rdma_queue	*queue;
 
 	struct ib_cqe		read_cqe;
+	struct ib_cqe		write_cqe;
 	struct rdma_rw_ctx	rw;
 
 	struct nvmet_req	req;
@@ -132,6 +134,9 @@ struct nvmet_rdma_device {
 static void nvmet_rdma_send_done(struct ib_cq *cq, struct ib_wc *wc);
 static void nvmet_rdma_recv_done(struct ib_cq *cq, struct ib_wc *wc);
 static void nvmet_rdma_read_data_done(struct ib_cq *cq, struct ib_wc *wc);
+#ifdef CONFIG_BLK_DEV_INTEGRITY
+static void nvmet_rdma_write_data_done(struct ib_cq *cq, struct ib_wc *wc);
+#endif
 static void nvmet_rdma_qp_event(struct ib_event *event, void *priv);
 static void nvmet_rdma_queue_disconnect(struct nvmet_rdma_queue *queue);
 static void nvmet_rdma_free_rsp(struct nvmet_rdma_device *ndev,
@@ -389,6 +394,10 @@ static int nvmet_rdma_alloc_rsp(struct nvmet_rdma_device *ndev,
 
 	/* Data In / RDMA READ */
 	r->read_cqe.done = nvmet_rdma_read_data_done;
+#ifdef CONFIG_BLK_DEV_INTEGRITY
+	/* Data Out / RDMA WRITE */
+	r->write_cqe.done = nvmet_rdma_write_data_done;
+#endif
 	return 0;
 
 out_free_rsp:
@@ -498,6 +507,138 @@ static void nvmet_rdma_process_wr_wait_list(struct nvmet_rdma_queue *queue)
 	spin_unlock(&queue->rsp_wr_wait_lock);
 }
 
+#ifdef CONFIG_BLK_DEV_INTEGRITY
+static u16 nvmet_rdma_check_pi_status(struct ib_mr *sig_mr)
+{
+	struct ib_mr_status mr_status;
+	int ret;
+	u16 status = 0;
+
+	ret = ib_check_mr_status(sig_mr, IB_MR_CHECK_SIG_STATUS, &mr_status);
+	if (ret) {
+		pr_err("ib_check_mr_status failed, ret %d\n", ret);
+		return NVME_SC_INVALID_PI;
+	}
+
+	if (mr_status.fail_status & IB_MR_CHECK_SIG_STATUS) {
+		switch (mr_status.sig_err.err_type) {
+		case IB_SIG_BAD_GUARD:
+			status = NVME_SC_GUARD_CHECK;
+			break;
+		case IB_SIG_BAD_REFTAG:
+			status = NVME_SC_REFTAG_CHECK;
+			break;
+		case IB_SIG_BAD_APPTAG:
+			status = NVME_SC_APPTAG_CHECK;
+			break;
+		}
+		pr_err("PI error found type %d expected 0x%x vs actual 0x%x\n",
+		       mr_status.sig_err.err_type,
+		       mr_status.sig_err.expected,
+		       mr_status.sig_err.actual);
+	}
+
+	return status;
+}
+
+static void nvmet_rdma_set_sig_domain(struct blk_integrity *bi,
+		struct nvme_command *cmd, struct ib_sig_domain *domain,
+		u16 control)
+{
+	domain->sig_type = IB_SIG_TYPE_T10_DIF;
+	domain->sig.dif.bg_type = IB_T10DIF_CRC;
+	domain->sig.dif.pi_interval = 1 << bi->interval_exp;
+	domain->sig.dif.ref_tag = le32_to_cpu(cmd->rw.reftag);
+
+	/*
+	 * At the moment we hard code those, but in the future
+	 * we will take them from cmd.
+	 */
+	domain->sig.dif.apptag_check_mask = 0xffff;
+	domain->sig.dif.app_escape = true;
+	domain->sig.dif.ref_escape = true;
+	if (control & NVME_RW_PRINFO_PRCHK_REF)
+		domain->sig.dif.ref_remap = true;
+}
+
+static void nvmet_rdma_set_sig_attrs(struct nvmet_req *req,
+				     struct ib_sig_attrs *sig_attrs)
+{
+	struct nvme_command *cmd = req->cmd;
+	struct blk_integrity *bi = bdev_get_integrity(req->ns->bdev);
+	u16 control = le16_to_cpu(cmd->rw.control);
+
+	WARN_ON(bi == NULL);
+
+	memset(sig_attrs, 0, sizeof(*sig_attrs));
+
+	if (control & NVME_RW_PRINFO_PRACT) {
+		/* for WRITE_INSERT/READ_STRIP no wire domain */
+		sig_attrs->wire.sig_type = IB_SIG_TYPE_NONE;
+		nvmet_rdma_set_sig_domain(bi, cmd, &sig_attrs->mem, control);
+		/* Clear the PRACT bit since HCA will generate/verify the PI */
+		control &= ~NVME_RW_PRINFO_PRACT;
+		cmd->rw.control = cpu_to_le16(control);
+		/* PI is added by the HW */
+		req->transfer_len += req->md_len;
+	} else {
+		/* for WRITE_PASS/READ_PASS both wire/memory domains exist */
+		nvmet_rdma_set_sig_domain(bi, cmd, &sig_attrs->wire, control);
+		nvmet_rdma_set_sig_domain(bi, cmd, &sig_attrs->mem, control);
+	}
+
+	if (control & NVME_RW_PRINFO_PRCHK_REF)
+		sig_attrs->check_mask |= IB_SIG_CHECK_REFTAG;
+	if (control & NVME_RW_PRINFO_PRCHK_GUARD)
+		sig_attrs->check_mask |= IB_SIG_CHECK_GUARD;
+	if (control & NVME_RW_PRINFO_PRCHK_APP)
+		sig_attrs->check_mask |= IB_SIG_CHECK_APPTAG;
+}
+#else
+static u16 nvmet_rdma_check_pi_status(struct ib_mr *sig_mr)
+{
+	return 0;
+}
+
+static void nvmet_rdma_set_sig_attrs(struct nvmet_req *req,
+				     struct ib_sig_attrs *sig_attrs)
+{
+}
+#endif /* CONFIG_BLK_DEV_INTEGRITY */
+
+static int nvmet_rdma_rw_ctx_init(struct nvmet_rdma_rsp *rsp, u64 addr, u32 key,
+				  struct ib_sig_attrs *sig_attrs)
+{
+	struct rdma_cm_id *cm_id = rsp->queue->cm_id;
+	struct nvmet_req *req = &rsp->req;
+	int ret;
+
+	if (req->use_md)
+		ret = rdma_rw_ctx_signature_init(&rsp->rw, cm_id->qp,
+			cm_id->port_num, req->sg, req->sg_cnt, req->md_sg,
+			req->md_sg_cnt, sig_attrs, addr, key,
+			nvmet_data_dir(req));
+	else
+		ret = rdma_rw_ctx_init(&rsp->rw, cm_id->qp, cm_id->port_num,
+				       req->sg, req->sg_cnt, 0, addr, key,
+				       nvmet_data_dir(req));
+
+	return ret;
+}
+
+static void nvmet_rdma_rw_ctx_destroy(struct nvmet_rdma_rsp *rsp)
+{
+	struct rdma_cm_id *cm_id = rsp->queue->cm_id;
+	struct nvmet_req *req = &rsp->req;
+
+	if (req->use_md)
+		rdma_rw_ctx_destroy_signature(&rsp->rw, cm_id->qp,
+			cm_id->port_num, req->sg, req->sg_cnt, req->md_sg,
+			req->md_sg_cnt, nvmet_data_dir(req));
+	else
+		rdma_rw_ctx_destroy(&rsp->rw, cm_id->qp, cm_id->port_num,
+				    req->sg, req->sg_cnt, nvmet_data_dir(req));
+}
 
 static void nvmet_rdma_release_rsp(struct nvmet_rdma_rsp *rsp)
 {
@@ -505,11 +646,8 @@ static void nvmet_rdma_release_rsp(struct nvmet_rdma_rsp *rsp)
 
 	atomic_add(1 + rsp->n_rdma, &queue->sq_wr_avail);
 
-	if (rsp->n_rdma) {
-		rdma_rw_ctx_destroy(&rsp->rw, queue->cm_id->qp,
-				queue->cm_id->port_num, rsp->req.sg,
-				rsp->req.sg_cnt, nvmet_data_dir(&rsp->req));
-	}
+	if (rsp->n_rdma)
+		nvmet_rdma_rw_ctx_destroy(rsp);
 
 	if (rsp->req.sg != rsp->cmd->inline_sg)
 		nvmet_req_free_sgl(&rsp->req);
@@ -564,11 +702,16 @@ static void nvmet_rdma_queue_response(struct nvmet_req *req)
 		rsp->send_wr.opcode = IB_WR_SEND;
 	}
 
-	if (nvmet_rdma_need_data_out(rsp))
-		first_wr = rdma_rw_ctx_wrs(&rsp->rw, cm_id->qp,
-				cm_id->port_num, NULL, &rsp->send_wr);
-	else
+	if (nvmet_rdma_need_data_out(rsp)) {
+		if (rsp->req.use_md)
+			first_wr = rdma_rw_ctx_wrs(&rsp->rw, cm_id->qp,
+					cm_id->port_num, &rsp->write_cqe, NULL);
+		else
+			first_wr = rdma_rw_ctx_wrs(&rsp->rw, cm_id->qp,
+					cm_id->port_num, NULL, &rsp->send_wr);
+	} else {
 		first_wr = &rsp->send_wr;
+	}
 
 	nvmet_rdma_post_recv(rsp->queue->dev, rsp->cmd);
 
@@ -587,15 +730,14 @@ static void nvmet_rdma_read_data_done(struct ib_cq *cq, struct ib_wc *wc)
 	struct nvmet_rdma_rsp *rsp =
 		container_of(wc->wr_cqe, struct nvmet_rdma_rsp, read_cqe);
 	struct nvmet_rdma_queue *queue = cq->cq_context;
+	u16 status = 0;
 
 	WARN_ON(rsp->n_rdma <= 0);
 	atomic_add(rsp->n_rdma, &queue->sq_wr_avail);
-	rdma_rw_ctx_destroy(&rsp->rw, queue->cm_id->qp,
-			queue->cm_id->port_num, rsp->req.sg,
-			rsp->req.sg_cnt, nvmet_data_dir(&rsp->req));
 	rsp->n_rdma = 0;
 
 	if (unlikely(wc->status != IB_WC_SUCCESS)) {
+		nvmet_rdma_rw_ctx_destroy(rsp);
 		nvmet_req_uninit(&rsp->req);
 		nvmet_rdma_release_rsp(rsp);
 		if (wc->status != IB_WC_WR_FLUSH_ERR) {
@@ -606,9 +748,59 @@ static void nvmet_rdma_read_data_done(struct ib_cq *cq, struct ib_wc *wc)
 		return;
 	}
 
-	rsp->req.execute(&rsp->req);
+	if (rsp->req.use_md)
+		status = nvmet_rdma_check_pi_status(rsp->rw.reg->mr);
+	nvmet_rdma_rw_ctx_destroy(rsp);
+
+	if (unlikely(status))
+		nvmet_req_complete(&rsp->req, status);
+	else
+		rsp->req.execute(&rsp->req);
 }
 
+#ifdef CONFIG_BLK_DEV_INTEGRITY
+static void nvmet_rdma_write_data_done(struct ib_cq *cq, struct ib_wc *wc)
+{
+	struct nvmet_rdma_rsp *rsp =
+		container_of(wc->wr_cqe, struct nvmet_rdma_rsp, write_cqe);
+	struct nvmet_rdma_queue *queue = cq->cq_context;
+	struct rdma_cm_id *cm_id = rsp->queue->cm_id;
+	u16 status;
+
+	WARN_ON(rsp->n_rdma <= 0);
+	atomic_add(rsp->n_rdma, &queue->sq_wr_avail);
+	rsp->n_rdma = 0;
+
+	if (unlikely(wc->status != IB_WC_SUCCESS)) {
+		nvmet_rdma_rw_ctx_destroy(rsp);
+		nvmet_req_uninit(&rsp->req);
+		nvmet_rdma_release_rsp(rsp);
+		if (wc->status != IB_WC_WR_FLUSH_ERR) {
+			pr_info("RDMA WRITE for CQE 0x%p failed with status %s (%d).\n",
+				wc->wr_cqe, ib_wc_status_msg(wc->status),
+				wc->status);
+			nvmet_rdma_error_comp(queue);
+		}
+		return;
+	}
+
+	/*
+	 * Upon RDMA completion check the signature status
+	 * - if succeeded send good NVMe response
+	 * - if failed send bad NVMe response with appropriate error
+	 */
+	status = nvmet_rdma_check_pi_status(rsp->rw.reg->mr);
+	if (unlikely(status))
+		rsp->req.cqe->status = cpu_to_le16(status << 1);
+	nvmet_rdma_rw_ctx_destroy(rsp);
+
+	if (unlikely(ib_post_send(cm_id->qp, &rsp->send_wr, NULL))) {
+		pr_err("sending cmd response failed\n");
+		nvmet_rdma_release_rsp(rsp);
+	}
+}
+#endif /* CONFIG_BLK_DEV_INTEGRITY */
+
 static void nvmet_rdma_use_inline_sg(struct nvmet_rdma_rsp *rsp, u32 len,
 		u64 off)
 {
@@ -663,9 +855,9 @@ static u16 nvmet_rdma_map_sgl_inline(struct nvmet_rdma_rsp *rsp)
 static u16 nvmet_rdma_map_sgl_keyed(struct nvmet_rdma_rsp *rsp,
 		struct nvme_keyed_sgl_desc *sgl, bool invalidate)
 {
-	struct rdma_cm_id *cm_id = rsp->queue->cm_id;
 	u64 addr = le64_to_cpu(sgl->addr);
 	u32 key = get_unaligned_le32(sgl->key);
+	struct ib_sig_attrs sig_attrs;
 	int ret;
 
 	rsp->req.transfer_len = get_unaligned_le24(sgl->length);
@@ -674,13 +866,14 @@ static u16 nvmet_rdma_map_sgl_keyed(struct nvmet_rdma_rsp *rsp,
 	if (!rsp->req.transfer_len)
 		return 0;
 
+	if (rsp->req.use_md)
+		nvmet_rdma_set_sig_attrs(&rsp->req, &sig_attrs);
+
 	ret = nvmet_req_alloc_sgl(&rsp->req);
 	if (unlikely(ret < 0))
 		goto error_out;
 
-	ret = rdma_rw_ctx_init(&rsp->rw, cm_id->qp, cm_id->port_num,
-			rsp->req.sg, rsp->req.sg_cnt, 0, addr, key,
-			nvmet_data_dir(&rsp->req));
+	ret = nvmet_rdma_rw_ctx_init(rsp, addr, key, &sig_attrs);
 	if (unlikely(ret < 0))
 		goto error_out;
 	rsp->n_rdma += ret;
@@ -959,6 +1152,9 @@ static void nvmet_rdma_free_dev(struct kref *ref)
 			goto out_free_pd;
 	}
 
+	port->pi_capable = ndev->device->attrs.device_cap_flags &
+			IB_DEVICE_INTEGRITY_HANDOVER ? true : false;
+
 	list_add(&ndev->entry, &device_list);
 out_unlock:
 	mutex_unlock(&device_list_mutex);
@@ -1025,6 +1221,10 @@ static int nvmet_rdma_create_queue_ib(struct nvmet_rdma_queue *queue)
 		qp_attr.cap.max_recv_sge = 1 + ndev->inline_page_count;
 	}
 
+	if (queue->port->pi_capable && queue->port->pi_enable &&
+	    queue->host_qid)
+		qp_attr.create_flags |= IB_QP_CREATE_INTEGRITY_EN;
+
 	ret = rdma_create_qp(queue->cm_id, ndev->pd, &qp_attr);
 	if (ret) {
 		pr_err("failed to create_qp ret= %d\n", ret);
@@ -1169,6 +1369,7 @@ static int nvmet_rdma_cm_reject(struct rdma_cm_id *cm_id,
 	INIT_WORK(&queue->release_work, nvmet_rdma_release_queue_work);
 	queue->dev = ndev;
 	queue->cm_id = cm_id;
+	queue->port = cm_id->context;
 
 	spin_lock_init(&queue->state_lock);
 	queue->state = NVMET_RDMA_Q_CONNECTING;
@@ -1287,7 +1488,6 @@ static int nvmet_rdma_queue_connect(struct rdma_cm_id *cm_id,
 		ret = -ENOMEM;
 		goto put_device;
 	}
-	queue->port = cm_id->context;
 
 	if (queue->host_qid == 0) {
 		/* Let inflight controller teardown complete */
@@ -1609,6 +1809,15 @@ static void nvmet_rdma_disc_port_addr(struct nvmet_req *req,
 
 static u8 nvmet_rdma_get_mdts(const struct nvmet_ctrl *ctrl)
 {
+	struct nvmet_port *port = ctrl->port;
+	struct rdma_cm_id *cm_id = port->priv;
+
+	if (ctrl->pi_support) {
+		u32 max_pages = rdma_rw_fr_page_list_len(cm_id->device, true);
+
+		return min(ilog2(max_pages), NVMET_RDMA_MAX_MD_MDTS);
+	}
+
 	return NVMET_RDMA_MAX_MDTS;
 }
 
-- 
1.8.3.1
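The PRINFO handling in nvmet_rdma_set_sig_attrs() above can be exercised standalone. The sketch below only models the two decisions that function makes — which tags the HCA should verify, and whether the locally accounted transfer length grows by md_len when PRACT is set; the constants are copied from the upstream headers as I understand them, and none of the verbs plumbing is modeled:

```c
#include <assert.h>
#include <stdint.h>

/* NVMe PRINFO bits in the rw control field (as in linux/nvme.h). */
#define NVME_RW_PRINFO_PRCHK_REF	(1 << 10)
#define NVME_RW_PRINFO_PRCHK_APP	(1 << 11)
#define NVME_RW_PRINFO_PRCHK_GUARD	(1 << 12)
#define NVME_RW_PRINFO_PRACT		(1 << 13)

/* IB signature check flags (as in rdma/ib_verbs.h). */
#define IB_SIG_CHECK_REFTAG	0x0f
#define IB_SIG_CHECK_APPTAG	0x30
#define IB_SIG_CHECK_GUARD	0xc0

/* Map the PRINFO check bits onto the IB signature check mask. */
static uint8_t prinfo_to_check_mask(uint16_t control)
{
	uint8_t mask = 0;

	if (control & NVME_RW_PRINFO_PRCHK_REF)
		mask |= IB_SIG_CHECK_REFTAG;
	if (control & NVME_RW_PRINFO_PRCHK_GUARD)
		mask |= IB_SIG_CHECK_GUARD;
	if (control & NVME_RW_PRINFO_PRCHK_APP)
		mask |= IB_SIG_CHECK_APPTAG;
	return mask;
}

/*
 * With PRACT set the wire carries no PI; the target HCA generates or
 * strips it, so the local transfer length is grown by md_len (this is
 * the req->transfer_len += req->md_len in the patch).
 */
static uint32_t local_transfer_len(uint16_t control, uint32_t data_len,
				   uint32_t md_len)
{
	if (control & NVME_RW_PRINFO_PRACT)
		return data_len + md_len;
	return data_len;
}
```

A WRITE_INSERT/READ_STRIP request (PRACT set) thus ends up with a memory domain only and a larger local transfer length, while WRITE_PASS/READ_PASS keeps both wire and memory domains.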


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [PATCH 01/17] nvme: introduce namespace features flag
  2020-03-27 17:15 ` [PATCH 01/17] nvme: introduce namespace features flag Max Gurtovoy
@ 2020-04-21 11:59   ` Christoph Hellwig
  2020-04-21 15:53     ` James Smart
  0 siblings, 1 reply; 59+ messages in thread
From: Christoph Hellwig @ 2020-04-21 11:59 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch, hch

On Fri, Mar 27, 2020 at 08:15:29PM +0300, Max Gurtovoy wrote:
> From: Israel Rukshin <israelr@mellanox.com>
> 
> Centralize all the metadata checks to one place and make the code more
> readable. Introduce a new enum nvme_ns_features for that matter.
> The features flag description:
>  - NVME_NS_EXT_LBAS - NVMe namespace supports extended LBA format.
>  - NVME_NS_MD_HOST_SUPPORTED - NVMe namespace supports getting metadata
>    from host's block layer.
>  - NVME_NS_MD_CTRL_SUPPORTED - NVMe namespace supports metadata actions
>    by the controller (generate/strip).

So while I like the ->features flag, the definition of these two
metadata-related features really confuses me.

Here are my vague ideas to improve the situation:

> -static inline bool nvme_ns_has_pi(struct nvme_ns *ns)
> -{
> -	return ns->pi_type && ns->ms == sizeof(struct t10_pi_tuple);
> -}

This function I think is generally useful, I'd rather keep it, document
it with a comment and remove the new NVME_NS_MD_CTRL_SUPPORTED
flag.

> -	if (ns->ms && !ns->ext &&
> -	    (ns->ctrl->ops->flags & NVME_F_METADATA_SUPPORTED))
> +	if (ns->features & NVME_NS_MD_HOST_SUPPORTED)
>  		nvme_init_integrity(disk, ns->ms, ns->pi_type);
> -	if ((ns->ms && !nvme_ns_has_pi(ns) && !blk_get_integrity(disk)) ||
> -	    ns->lba_shift > PAGE_SHIFT)
> +
> +	if ((ns->ms && !(ns->features & NVME_NS_MD_CTRL_SUPPORTED) &&
> +	     !(ns->features & NVME_NS_MD_HOST_SUPPORTED) &&
> +	     !blk_get_integrity(disk)) || ns->lba_shift > PAGE_SHIFT)
>  		capacity = 0;

I find this very confusing.  Can we do something like:

	/*
	 * The block layer can't support LBA sizes larger than the page size
	 * yet, so catch this early and don't allow block I/O.
	 */
	if (ns->lba_shift > PAGE_SHIFT)
  		capacity = 0;

	/*
	 * Register a metadata profile for PI, or the plain non-integrity NVMe
	 * metadata masquerading as Type 0 if supported, otherwise reject block
	 * I/O to namespaces with metadata except when the namespace supports
	 * PI, as it can strip/insert in that case.
	 */
	if (ns->ms) {
		if (IS_ENABLED(CONFIG_BLK_DEV_INTEGRITY) &&
		    (ns->features & NVME_NS_MD_HOST_SUPPORTED))
			nvme_init_integrity(disk, ns->ms, ns->pi_type);
		else if (!nvme_ns_has_pi(ns))
			capacity = 0;
	}

> +	if (ns->ms) {
> +		if (id->flbas & NVME_NS_FLBAS_META_EXT)
> +			ns->features |= NVME_NS_EXT_LBAS;
> +
> +		/*
> +		 * For PCI, Extended logical block will be generated by the
> +		 * controller.
> +		 */
> +		if (ns->ctrl->ops->flags & NVME_F_METADATA_SUPPORTED) {
> +			if (!(ns->features & NVME_NS_EXT_LBAS))
> +				ns->features |= NVME_NS_MD_HOST_SUPPORTED;
> +		}

This looks a little strange now, but I guess it will make more sense
with the fabrics addition.  I'll take another look later in the series.

> +enum nvme_ns_features {
> +	NVME_NS_EXT_LBAS = 1 << 0,
> +	NVME_NS_MD_HOST_SUPPORTED = 1 << 1,
> +	NVME_NS_MD_CTRL_SUPPORTED = 1 << 2,
> +};

Please document the meaning of each flag.  I also suspect that just
moving ext to a flag first and then adding the NVME_NS_MD_HOST_SUPPORTED
bit might make more sense.  I'd also rename NVME_NS_MD_HOST_SUPPORTED
to NVME_NS_METADATA_SUPPORTED.
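A standalone model of the capacity gating being discussed — flag values are copied from the quoted enum (with the proposed NVME_NS_METADATA_SUPPORTED rename), and the helper name is hypothetical:

```c
#include <assert.h>

/* Per-namespace feature flags, as in the quoted enum (with the rename). */
enum nvme_ns_features {
	NVME_NS_EXT_LBAS		= 1 << 0,
	NVME_NS_METADATA_SUPPORTED	= 1 << 1,
};

/*
 * Model of the proposed gating: a namespace with metadata (ms != 0) is
 * only usable for block I/O if the host can supply the metadata buffer,
 * or if the namespace has PI so the controller can strip/insert it.
 * Otherwise the capacity is forced to 0.
 */
static int allow_block_io(unsigned int ms, int ns_has_pi,
			  unsigned int features)
{
	if (!ms)
		return 1;
	if (features & NVME_NS_METADATA_SUPPORTED)
		return 1;
	return ns_has_pi;
}
```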

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 02/17] nvme: Add has_md field to the nvme_req structure
  2020-03-27 17:15 ` [PATCH 02/17] nvme: Add has_md field to the nvme_req structure Max Gurtovoy
@ 2020-04-21 11:59   ` Christoph Hellwig
  0 siblings, 0 replies; 59+ messages in thread
From: Christoph Hellwig @ 2020-04-21 11:59 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch, hch

On Fri, Mar 27, 2020 at 08:15:30PM +0300, Max Gurtovoy wrote:
> From: Israel Rukshin <israelr@mellanox.com>
> 
> Transport drivers will use this field to determine if the request has
> metadata.

Can we make this a flags field with a bit?  Also why do we even need
it, can't we use blk_integrity_rq?

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 03/17] nvme: Enforce extended LBA format for fabrics metadata
  2020-03-27 17:15 ` [PATCH 03/17] nvme: Enforce extended LBA format for fabrics metadata Max Gurtovoy
@ 2020-04-21 12:08   ` Christoph Hellwig
  0 siblings, 0 replies; 59+ messages in thread
From: Christoph Hellwig @ 2020-04-21 12:08 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch, hch

On Fri, Mar 27, 2020 at 08:15:31PM +0300, Max Gurtovoy wrote:
>  		/*
> +		 * For Fabrics, only metadata as part of extended data LBA is
> +		 * supported. Fail in case of a spec violation.
> +		 */
> +		if (ns->ctrl->ops->flags & NVME_F_FABRICS) {
> +			if (WARN_ON_ONCE(!(ns->features & NVME_NS_EXT_LBAS)))
> +				return -EINVAL;
> +		}
> +
> +		/*
>  		 * For PCI, Extended logical block will be generated by the
>  		 * controller.
>  		 */
>  		if (ns->ctrl->ops->flags & NVME_F_METADATA_SUPPORTED) {
> +			if (ns->ctrl->ops->flags & NVME_F_FABRICS ||
> +			    !(ns->features & NVME_NS_EXT_LBAS))
>  				ns->features |= NVME_NS_MD_HOST_SUPPORTED;

This looks a little confusing.  I think it might make more sense written like this:


	struct nvme_ctrl *ctrl = ns->ctrl;

	...

	if (ns->ms) {
  		/*
		 * For PCIe only the separate metadata pointer is supported,
		 * as the block layer supplies metadata in a separate
		 * bio_vec chain.  For Fabrics, only metadata as part of
		 * extended data LBA is supported on the wire per the Fabrics
		 * specification, but the HBA/HCA will do the remapping from
		 * the separate metadata buffers for us.
		 */
		if (id->flbas & NVME_NS_FLBAS_META_EXT) {
			ns->features |= NVME_NS_EXT_LBAS;
			if ((ctrl->ops->flags & NVME_F_FABRICS) &&
			    (ctrl->ops->flags & NVME_F_METADATA_SUPPORTED))
 				ns->features |= NVME_NS_METADATA_SUPPORTED;
		} else {
			if (WARN_ON_ONCE(ctrl->ops->flags & NVME_F_FABRICS))
				return -EINVAL;
			if (ctrl->ops->flags & NVME_F_METADATA_SUPPORTED)
 				ns->features |= NVME_NS_METADATA_SUPPORTED;
		}
	}

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 04/17] nvme: introduce max_integrity_segments ctrl attribute
  2020-03-27 17:15 ` [PATCH 04/17] nvme: introduce max_integrity_segments ctrl attribute Max Gurtovoy
@ 2020-04-21 12:09   ` Christoph Hellwig
  0 siblings, 0 replies; 59+ messages in thread
From: Christoph Hellwig @ 2020-04-21 12:09 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch, hch

On Fri, Mar 27, 2020 at 08:15:32PM +0300, Max Gurtovoy wrote:
> +	/*
> +	 * NVMe PCI driver doesn't support Extended LBA format and supports
> +	 * only a single integrity segment for a separate contiguous buffer
> +	 * of metadata.
> +	 */

That isn't strictly true, PCIe can also support SGLs for metadata.

I'd rather say something like:

	/*
	 * We do not support an SGL for metadata (yet), so we are limited
	 * to a single integrity segment for the separate metadata pointer.
	 */

Except for that this looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 05/17] nvme-fabrics: Allow user enabling metadata/T10-PI support
  2020-03-27 17:15 ` [PATCH 05/17] nvme-fabrics: Allow user enabling metadata/T10-PI support Max Gurtovoy
@ 2020-04-21 12:12   ` Christoph Hellwig
  2020-04-21 15:17   ` Christoph Hellwig
  1 sibling, 0 replies; 59+ messages in thread
From: Christoph Hellwig @ 2020-04-21 12:12 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch, hch

Looks good modulo the interaction with the reordering in the previous
patch.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 06/17] nvme: introduce NVME_INLINE_MD_SG_CNT
  2020-03-27 17:15 ` [PATCH 06/17] nvme: introduce NVME_INLINE_MD_SG_CNT Max Gurtovoy
@ 2020-04-21 12:12   ` Christoph Hellwig
  0 siblings, 0 replies; 59+ messages in thread
From: Christoph Hellwig @ 2020-04-21 12:12 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch, hch

On Fri, Mar 27, 2020 at 08:15:34PM +0300, Max Gurtovoy wrote:
> From: Israel Rukshin <israelr@mellanox.com>
> 
> SGL size of metadata is usually small. Thus, 1 inline sg should cover
> most cases. The macro will be used to pre-allocate a single SGL entry
> for metadata. The preallocation of small inline SGLs depends on SG_CHAIN
> capability so if the ARCH doesn't support SG_CHAIN, use the runtime
> allocation for the SGL. This patch is a preparation for adding metadata
> (T10-PI) over fabric support.

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 07/17] nvme-rdma: Introduce nvme_rdma_sgl structure
  2020-03-27 17:15 ` [PATCH 07/17] nvme-rdma: Introduce nvme_rdma_sgl structure Max Gurtovoy
@ 2020-04-21 12:13   ` Christoph Hellwig
  0 siblings, 0 replies; 59+ messages in thread
From: Christoph Hellwig @ 2020-04-21 12:13 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch, hch

On Fri, Mar 27, 2020 at 08:15:35PM +0300, Max Gurtovoy wrote:
> From: Israel Rukshin <israelr@mellanox.com>
> 
> Remove first_sgl pointer from struct nvme_rdma_request and use pointer
> arithmetic instead. The inline scatterlist, if exists, will be located
> right after the nvme_rdma_request. This patch is needed as a preparation
> for adding PI support.
> 
> Signed-off-by: Israel Rukshin <israelr@mellanox.com>
> Reviewed-by: Max Gurtovoy <maxg@mellanox.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 08/17] nvme-rdma: add metadata/T10-PI support
  2020-03-27 17:15 ` [PATCH 08/17] nvme-rdma: add metadata/T10-PI support Max Gurtovoy
@ 2020-04-21 12:20   ` Christoph Hellwig
  2020-04-23  9:22     ` Max Gurtovoy
  0 siblings, 1 reply; 59+ messages in thread
From: Christoph Hellwig @ 2020-04-21 12:20 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch, hch

On Fri, Mar 27, 2020 at 08:15:36PM +0300, Max Gurtovoy wrote:
> For capable HCAs (e.g. ConnectX-5/ConnectX-6) this will allow end-to-end
> protection information passthrough and validation for NVMe over RDMA
> transport. Metadata offload support was implemented over the new RDMA
> signature verbs API and it is enabled per controller by using nvme-cli.
> 
> usage example:
> nvme connect --pi_enable --transport=rdma --traddr=10.0.1.1 --nqn=test-nvme
> 
> Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
> Signed-off-by: Israel Rukshin <israelr@mellanox.com>
> ---
>  drivers/nvme/host/rdma.c | 330 ++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 296 insertions(+), 34 deletions(-)
> 
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index e38f8f7..23cc77e 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -67,6 +67,9 @@ struct nvme_rdma_request {
>  	struct ib_cqe		reg_cqe;
>  	struct nvme_rdma_queue  *queue;
>  	struct nvme_rdma_sgl	data_sgl;
> +	/* Metadata (T10-PI) support */
> +	struct nvme_rdma_sgl	*md_sgl;
> +	bool			use_md;

Do we need a use_md flag vs just using md_sgl as a boolean and/or
using blk_integrity_rq?

>  enum nvme_rdma_queue_flags {
> @@ -88,6 +91,7 @@ struct nvme_rdma_queue {
>  	struct rdma_cm_id	*cm_id;
>  	int			cm_error;
>  	struct completion	cm_done;
> +	bool			pi_support;

Why do we need this on a per-queue basis vs always checking the
controller?

> +	u32 max_page_list_len =
> +		pi_support ? ibdev->attrs.max_pi_fast_reg_page_list_len :
> +			     ibdev->attrs.max_fast_reg_page_list_len;
> +
> +	return min_t(u32, NVME_RDMA_MAX_SEGMENTS, max_page_list_len - 1);

Can you use a good old if / else here?

> +#ifdef CONFIG_BLK_DEV_INTEGRITY
> +static void nvme_rdma_set_sig_domain(struct blk_integrity *bi,
> +		struct nvme_command *cmd, struct ib_sig_domain *domain,
> +		u16 control)
>  {
> +	domain->sig_type = IB_SIG_TYPE_T10_DIF;
> +	domain->sig.dif.bg_type = IB_T10DIF_CRC;
> +	domain->sig.dif.pi_interval = 1 << bi->interval_exp;
> +	domain->sig.dif.ref_tag = le32_to_cpu(cmd->rw.reftag);
>  
>  	/*
> +	 * At the moment we hard code those, but in the future
> +	 * we will take them from cmd.

I don't understand this comment.  Also it doesn't use up all 80 chars.


> +static void nvme_rdma_set_sig_attrs(struct blk_integrity *bi,
> +		struct nvme_command *cmd, struct ib_sig_attrs *sig_attrs)
> +{
> +	u16 control = le16_to_cpu(cmd->rw.control);
> +
> +	WARN_ON(bi == NULL);

I think this WARN_ON is pointless, as we'll get a NULL pointer derference
a little later anyway.

> +mr_put:
> +	if (req->use_md)
> +		ib_mr_pool_put(queue->qp, &queue->qp->sig_mrs, req->mr);
> +	else
> +		ib_mr_pool_put(queue->qp, &queue->qp->rdma_mrs, req->mr);

I've seen this patterns a few times, maybe a little helper to return
the right mr pool for a request?

> +	if (blk_integrity_rq(rq)) {
> +		memset(req->md_sgl, 0, sizeof(struct nvme_rdma_sgl));

Why do we need this memset?
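The MR-pool helper asked for above could look something like the sketch below. The name and the stand-in types are hypothetical; in the kernel it would return `&qp->sig_mrs` or `&qp->rdma_mrs` (both `struct list_head` pools) rather than int pointers:

```c
#include <assert.h>

/* Stand-in for the two MR pools hanging off the QP. */
struct model_qp {
	int rdma_mrs;	/* models &qp->rdma_mrs */
	int sig_mrs;	/* models &qp->sig_mrs  */
};

/*
 * Centralize the "which MR pool does this request use?" decision so the
 * ib_mr_pool_get()/ib_mr_pool_put() call sites don't each repeat the
 * use_md ternary.
 */
static int *req_mr_pool(struct model_qp *qp, int use_md)
{
	return use_md ? &qp->sig_mrs : &qp->rdma_mrs;
}
```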

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 05/17] nvme-fabrics: Allow user enabling metadata/T10-PI support
  2020-03-27 17:15 ` [PATCH 05/17] nvme-fabrics: Allow user enabling metadata/T10-PI support Max Gurtovoy
  2020-04-21 12:12   ` Christoph Hellwig
@ 2020-04-21 15:17   ` Christoph Hellwig
  2020-04-22 22:07     ` Max Gurtovoy
  1 sibling, 1 reply; 59+ messages in thread
From: Christoph Hellwig @ 2020-04-21 15:17 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch, hch

On Fri, Mar 27, 2020 at 08:15:33PM +0300, Max Gurtovoy wrote:
> From: Israel Rukshin <israelr@mellanox.com>
> 
> Preparation for adding metadata (T10-PI) over fabric support. This will
> allow end-to-end protection information passthrough and validation for
> NVMe over Fabric.

So actually - for PCIe we enable PI by default.  Not sure why RDMA would
be any different?  If we have a switch to turn it off we probably want
it to work similarly (it can't be the same due to the lack of connect)
for PCIe as well.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 09/17] nvmet: prepare metadata request
  2020-03-27 17:15 ` [PATCH 09/17] nvmet: prepare metadata request Max Gurtovoy
@ 2020-04-21 15:21   ` Christoph Hellwig
  0 siblings, 0 replies; 59+ messages in thread
From: Christoph Hellwig @ 2020-04-21 15:21 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch, hch

On Fri, Mar 27, 2020 at 08:15:37PM +0300, Max Gurtovoy wrote:
> From: Israel Rukshin <israelr@mellanox.com>
> 
> Allocate the metadata SGL buffers and set metadata fields for the
> request. This is a preparation for adding metadata support over the
> fabrics.

I don't think having this as a separate prep patch is a good idea; it
should go with the code making use of it.

> -						       req->transfer_len);
> -			if (req->sg) {
> -				req->p2p_dev = p2p_dev;
> -				return 0;
> +						       data_len);
> +			if (!req->sg)
> +				goto fallback;
> +
> +			if (req->md_len) {
> +				req->md_sg =
> +					pci_p2pmem_alloc_sgl(p2p_dev,
> +							     &req->md_sg_cnt,
> +							     req->md_len);
> +				if (!req->md_sg) {
> +					pci_p2pmem_free_sgl(req->p2p_dev,
> +							    req->sg);
> +					goto fallback;
> +				}
>  			}
> +			req->p2p_dev = p2p_dev;
> +			return 0;
>  		}
>  
>  		/*
> @@ -984,23 +1001,40 @@ int nvmet_req_alloc_sgl(struct nvmet_req *req)
>  		 */
>  	}
>  
> -	req->sg = sgl_alloc(req->transfer_len, GFP_KERNEL, &req->sg_cnt);
> +fallback:
> +	req->sg = sgl_alloc(data_len, GFP_KERNEL, &req->sg_cnt);
>  	if (unlikely(!req->sg))
>  		return -ENOMEM;
>  
> +	if (req->md_len) {
> +		req->md_sg = sgl_alloc(req->md_len, GFP_KERNEL,
> +					 &req->md_sg_cnt);
> +		if (unlikely(!req->md_sg)) {
> +			sgl_free(req->sg);
> +			return -ENOMEM;
> +		}
> +	}
> +
>  	return 0;
>  }
>  EXPORT_SYMBOL_GPL(nvmet_req_alloc_sgl);

I suspect this might be a little cleaner with an nvmet_alloc_sgl
helper that returns a scatterlist instead of duplicating all the
code for data vs metadata.

_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 10/17] nvmet: add metadata characteristics for a namespace
  2020-03-27 17:15 ` [PATCH 10/17] nvmet: add metadata characteristics for a namespace Max Gurtovoy
@ 2020-04-21 15:23   ` Christoph Hellwig
  0 siblings, 0 replies; 59+ messages in thread
From: Christoph Hellwig @ 2020-04-21 15:23 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch, hch

On Fri, Mar 27, 2020 at 08:15:38PM +0300, Max Gurtovoy wrote:
> From: Israel Rukshin <israelr@mellanox.com>
> 
> Fill those namespace fields from the block device format for adding
> metadata (T10-PI) over fabric support with block devices.
> 
> Signed-off-by: Israel Rukshin <israelr@mellanox.com>
> Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
> ---
>  drivers/nvme/target/io-cmd-bdev.c | 22 ++++++++++++++++++++++
>  drivers/nvme/target/nvmet.h       |  3 +++
>  2 files changed, 25 insertions(+)
> 
> diff --git a/drivers/nvme/target/io-cmd-bdev.c b/drivers/nvme/target/io-cmd-bdev.c
> index ea0e596..bdf611f 100644
> --- a/drivers/nvme/target/io-cmd-bdev.c
> +++ b/drivers/nvme/target/io-cmd-bdev.c
> @@ -50,6 +50,9 @@ void nvmet_bdev_set_limits(struct block_device *bdev, struct nvme_id_ns *id)
>  int nvmet_bdev_ns_enable(struct nvmet_ns *ns)
>  {
>  	int ret;
> +#ifdef CONFIG_BLK_DEV_INTEGRITY
> +	struct blk_integrity *bi;
> +#endif
>  
>  	ns->bdev = blkdev_get_by_path(ns->device_path,
>  			FMODE_READ | FMODE_WRITE, NULL);
> @@ -64,6 +67,25 @@ int nvmet_bdev_ns_enable(struct nvmet_ns *ns)
>  	}
>  	ns->size = i_size_read(ns->bdev->bd_inode);
>  	ns->blksize_shift = blksize_bits(bdev_logical_block_size(ns->bdev));
> +
> +	ns->md_type = 0;
> +	ns->ms = 0;
> +#ifdef CONFIG_BLK_DEV_INTEGRITY
> +	bi = bdev_get_integrity(ns->bdev);
> +	if (bi) {
> +		ns->ms = bi->tuple_size;
> +		if (bi->profile == &t10_pi_type1_crc)
> +			ns->md_type = NVME_NS_DPS_PI_TYPE1;
> +		else if (bi->profile == &t10_pi_type3_crc)
> +			ns->md_type = NVME_NS_DPS_PI_TYPE3;
> +		else
> +			/* Unsupported metadata type */
> +			ns->ms = 0;
> +	}
> +
> +	pr_debug("ms %d md_type %d\n", ns->ms, ns->md_type);
> +#endif

Given that bdev_get_integrity is stubbed out and returns NULL
for the !CONFIG_BLK_DEV_INTEGRITY case, can we just skip the ifdef and
let the compiler optimize the dead code away?  I also don't think we
need the pr_debug.

> +	int			md_type;
> +	int			ms;

Can we use more descriptive names?  md_type seems to be what we call
pi_type elsewhere, and ms could be spelled out as metadata_size.


* Re: [PATCH 13/17] nvme: Add Metadata Capabilities enumerations
  2020-03-27 17:15 ` [PATCH 13/17] nvme: Add Metadata Capabilities enumerations Max Gurtovoy
@ 2020-04-21 15:24   ` Christoph Hellwig
  2020-04-23 12:09     ` Max Gurtovoy
  0 siblings, 1 reply; 59+ messages in thread
From: Christoph Hellwig @ 2020-04-21 15:24 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch, hch

On Fri, Mar 27, 2020 at 08:15:41PM +0300, Max Gurtovoy wrote:
> From: Israel Rukshin <israelr@mellanox.com>
> 
> The enumerations will be used to expose the namespace metadata format by
> the target.
> 
> Signed-off-by: Israel Rukshin <israelr@mellanox.com>
> Reviewed-by: Max Gurtovoy <maxg@mellanox.com>

I'd be tempted to use a separate enum and add a comment to which field
this relates.


* Re: [PATCH 14/17] nvmet: Add metadata/T10-PI support
  2020-03-27 17:15 ` [PATCH 14/17] nvmet: Add metadata/T10-PI support Max Gurtovoy
@ 2020-04-21 15:30   ` Christoph Hellwig
  2020-04-23 12:39     ` Max Gurtovoy
  0 siblings, 1 reply; 59+ messages in thread
From: Christoph Hellwig @ 2020-04-21 15:30 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch, hch

> +	/*
> +	 * Max command capsule size is sqe + single page of in-capsule data.
> +	 * Disable inline data for Metadata capable controllers.
> +	 */
>  	id->ioccsz = cpu_to_le32((sizeof(struct nvme_command) +
> -				  req->port->inline_data_size) / 16);
> +				  req->port->inline_data_size *
> +				  !ctrl->pi_support) / 16);

Can we de-obfuscate this a little?

	cmd_capsule_size = sizeof(struct nvme_command);
	if (!ctrl->pi_support)
		cmd_capsule_size += req->port->inline_data_size;
	id->ioccsz = cpu_to_le32(cmd_capsule_size / 16);

> +	if (ctrl->subsys->pi_support && ctrl->port->pi_enable) {
> +		if (ctrl->port->pi_capable) {
> +			ctrl->pi_support = true;
> +			pr_info("controller %d T10-PI enabled\n", ctrl->cntlid);
> +		} else {
> +			ctrl->pi_support = false;
> +			pr_warn("T10-PI is not supported on controller %d\n",
> +				ctrl->cntlid);
> +		}

I think the printks are a little verbose.  Also why can we set
ctrl->port->pi_enable if ctrl->port->pi_capable is false?  Shouldn't
we reject that earlier?  In that case this could simply become:

	ctrl->pi_support = ctrl->subsys->pi_support && ctrl->port->pi_enable;

> +#ifdef CONFIG_BLK_DEV_INTEGRITY
> +static inline u32 nvmet_rw_md_len(struct nvmet_req *req)
> +{
> +	return ((u32)le16_to_cpu(req->cmd->rw.length) + 1) * req->ns->ms;
> +}
> +
> +static inline bool nvmet_ns_has_pi(struct nvmet_ns *ns)
> +{
> +	return ns->md_type && ns->ms == sizeof(struct t10_pi_tuple);
> +}
> +#else
> +static inline u32 nvmet_rw_md_len(struct nvmet_req *req)
> +{
> +	return 0;

Do we really need a stub for nvmet_rw_md_len?  Also for nvmet_ns_has_pi
we could probably reword it as:

static inline bool nvmet_ns_has_pi(struct nvmet_ns *ns)
{
	if (!IS_ENABLED(CONFIG_BLK_DEV_INTEGRITY))
		return false;
	return ns->pi_type && ns->metadata_size == sizeof(struct t10_pi_tuple);
}

and avoid the need for a stub as well.


* Re: [PATCH 15/17] nvmet: Add metadata support for block devices
  2020-03-27 17:15 ` [PATCH 15/17] nvmet: Add metadata support for block devices Max Gurtovoy
@ 2020-04-21 15:33   ` Christoph Hellwig
  2020-04-23 17:25     ` Max Gurtovoy
  0 siblings, 1 reply; 59+ messages in thread
From: Christoph Hellwig @ 2020-04-21 15:33 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch, hch

On Fri, Mar 27, 2020 at 08:15:43PM +0300, Max Gurtovoy wrote:
> -	if (!nvmet_check_transfer_len(req, nvmet_rw_data_len(req)))
> +	if (!nvmet_check_transfer_len(req,
> +				      nvmet_rw_data_len(req) + req->md_len))

Shouldn't we also calculate the actual metadata length on the fly here?

>  	blk_start_plug(&plug);
> +	if (req->use_md)

Can't we use a non-NULL req->md_sg or non-null req->md_sg_cnt as a
metadata supported indicator and remove the use_md flag?  Maybe wrap
them in a helper function that also checks for blk integrity support
using IS_ENABLED and we can skip the stubs as well.



* Re: [PATCH 17/17] nvmet-rdma: Add metadata/T10-PI support
  2020-03-27 17:15 ` [PATCH 17/17] nvmet-rdma: Add metadata/T10-PI support Max Gurtovoy
@ 2020-04-21 15:37   ` Christoph Hellwig
  0 siblings, 0 replies; 59+ messages in thread
From: Christoph Hellwig @ 2020-04-21 15:37 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch, hch

>  /* Assume mpsmin == device_page_size == 4KB */
>  #define NVMET_RDMA_MAX_MDTS			8
> +#define NVMET_RDMA_MAX_MD_MDTS			5

Can you spell out METADATA everywhere?

> +#ifdef CONFIG_BLK_DEV_INTEGRITY
> +	/* Data Out / RDMA WRITE */
> +	r->write_cqe.done = nvmet_rdma_write_data_done;
> +#endif
>  	return 0;
>  
>  out_free_rsp:
> @@ -498,6 +507,138 @@ static void nvmet_rdma_process_wr_wait_list(struct nvmet_rdma_queue *queue)
>  	spin_unlock(&queue->rsp_wr_wait_lock);
>  }
>  
> +#ifdef CONFIG_BLK_DEV_INTEGRITY

Any chance we could use IS_ENABLED() instead of all these ifdefs?

> +	/*
> +	 * At the moment we hard code those, but in the future
> +	 * we will take them from cmd.
> +	 */

Same weird comment as on the host side.

> +	struct nvme_command *cmd = req->cmd;
> +	struct blk_integrity *bi = bdev_get_integrity(req->ns->bdev);
> +	u16 control = le16_to_cpu(cmd->rw.control);
> +
> +	WARN_ON(bi == NULL);

No need for this WARN_ON either I think.

> +	port->pi_capable = ndev->device->attrs.device_cap_flags &
> +			IB_DEVICE_INTEGRITY_HANDOVER ? true : false;

No need for the ? :, but then again personally I'd prefer a good old:

	if (ndev->device->attrs.device_cap_flags & IB_DEVICE_INTEGRITY_HANDOVER)
		port->pi_capable = true;

anyway.


* Re: [PATCH 01/17] nvme: introduce namespace features flag
  2020-04-21 11:59   ` Christoph Hellwig
@ 2020-04-21 15:53     ` James Smart
  2020-04-21 18:11       ` Christoph Hellwig
  0 siblings, 1 reply; 59+ messages in thread
From: James Smart @ 2020-04-21 15:53 UTC (permalink / raw)
  To: Christoph Hellwig, Max Gurtovoy
  Cc: axboe, jsmart2021, sagi, martin.petersen, idanb, linux-rdma,
	israelr, vladimirk, linux-nvme, shlomin, jgg, oren, kbusch

On 4/21/2020 4:59 AM, Christoph Hellwig wrote:
> On Fri, Mar 27, 2020 at 08:15:29PM +0300, Max Gurtovoy wrote:
>> From: Israel Rukshin <israelr@mellanox.com>
>>
>> Centralize all the metadata checks to one place and make the code more
>> readable. Introduce a new enum nvme_ns_features for that matter.
>> The features flag description:
>>   - NVME_NS_EXT_LBAS - NVMe namespace supports extended LBA format.
>>   - NVME_NS_MD_HOST_SUPPORTED - NVMe namespace supports getting metadata
>>     from host's block layer.
>>   - NVME_NS_MD_CTRL_SUPPORTED - NVMe namespace supports metadata actions
>>     by the controller (generate/strip).
> So whole I like the ->features flag, the defintion of these two
> metadata related features really confuses me.
>
> Here are my vague ideas to improve the situation:
>

Care to look at any of the RFC items I posted on 2/24, which do things
a little differently?  Perhaps we can find common ground with Max's
patches.
http://lists.infradead.org/pipermail/linux-nvme/2020-February/029066.html

Granted I've tweaked what I sent a little as there was no need to make 
nvme_ns_has_pi accessible to the transport.

-- james



* Re: [PATCH 01/17] nvme: introduce namespace features flag
  2020-04-21 15:53     ` James Smart
@ 2020-04-21 18:11       ` Christoph Hellwig
  0 siblings, 0 replies; 59+ messages in thread
From: Christoph Hellwig @ 2020-04-21 18:11 UTC (permalink / raw)
  To: James Smart
  Cc: axboe, jsmart2021, sagi, martin.petersen, idanb, linux-rdma,
	israelr, vladimirk, linux-nvme, shlomin, jgg, oren, kbusch,
	Max Gurtovoy, Christoph Hellwig

On Tue, Apr 21, 2020 at 08:53:18AM -0700, James Smart wrote:
> Care to look at any of the RFC items I posted on 2/24 - which does things a 
> little differently ?   Perhaps find a common ground with Max's patches.
> http://lists.infradead.org/pipermail/linux-nvme/2020-February/029066.html
>
> Granted I've tweaked what I sent a little as there was no need to make 
> nvme_ns_has_pi accessible to the transport.

I've replied to the series.


* Re: [PATCH 05/17] nvme-fabrics: Allow user enabling metadata/T10-PI support
  2020-04-21 15:17   ` Christoph Hellwig
@ 2020-04-22 22:07     ` Max Gurtovoy
  2020-04-22 22:24       ` James Smart
  2020-04-23  5:53       ` Christoph Hellwig
  0 siblings, 2 replies; 59+ messages in thread
From: Max Gurtovoy @ 2020-04-22 22:07 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch


On 4/21/2020 6:17 PM, Christoph Hellwig wrote:
> On Fri, Mar 27, 2020 at 08:15:33PM +0300, Max Gurtovoy wrote:
>> From: Israel Rukshin <israelr@mellanox.com>
>>
>> Preparation for adding metadata (T10-PI) over fabric support. This will
>> allow end-to-end protection information passthrough and validation for
>> NVMe over Fabric.
> So actually - for PCIe we enable PI by default.  Not sure why RDMA would
> be any different?  If we have a switch to turn it off we probably want
> it work similar (can't be the same due to the lack of connect) for PCIe
> as well.

For PCI we use a format command to configure metadata. In fabrics we can
choose to do it in the connect command, and we can also choose to have
"protected" controllers and "non-protected" controllers.

I don't think it's an all-or-nothing case, and configuration using
nvme-cli (or another tool) seems reasonable and flexible.




* Re: [PATCH 05/17] nvme-fabrics: Allow user enabling metadata/T10-PI support
  2020-04-22 22:07     ` Max Gurtovoy
@ 2020-04-22 22:24       ` James Smart
  2020-04-22 22:39         ` Max Gurtovoy
  2020-04-23  5:53       ` Christoph Hellwig
  1 sibling, 1 reply; 59+ messages in thread
From: James Smart @ 2020-04-22 22:24 UTC (permalink / raw)
  To: Max Gurtovoy, Christoph Hellwig
  Cc: axboe, sagi, martin.petersen, shlomin, linux-rdma, israelr,
	vladimirk, linux-nvme, idanb, jgg, oren, kbusch

On 4/22/2020 3:07 PM, Max Gurtovoy wrote:
> 
> On 4/21/2020 6:17 PM, Christoph Hellwig wrote:
>> On Fri, Mar 27, 2020 at 08:15:33PM +0300, Max Gurtovoy wrote:
>>> From: Israel Rukshin <israelr@mellanox.com>
>>>
>>> Preparation for adding metadata (T10-PI) over fabric support. This will
>>> allow end-to-end protection information passthrough and validation for
>>> NVMe over Fabric.
>> So actually - for PCIe we enable PI by default.  Not sure why RDMA would
>> be any different?  If we have a switch to turn it off we probably want
>> it work similar (can't be the same due to the lack of connect) for PCIe
>> as well.
> 
> For PCI we use a format command to configure metadata. In fabrics we can 
> choose doing it in the connect command and we can also choose to have 
> "protected" controllers and "non-protected" controllers.
> 
> I don't think it's all or nothing case, and configuration using nvme-cli 
> (or other tool) seems reasonable and flexible.

I think you need to change this to "some fabrics".

With FC, we don't do anything in connect. The transport passes 
attributes on what it can do for PI support, including: passing through 
metadata (no checks); checking of metadata (normal); generation/strip of 
metadata on the wire (meaning OS does not have to have a metadata 
buffer), and so on.  Enablement is just like pci - format the ns, then 
match up the attribute with the behavior. There is no such thing as 
protected and non-protected controllers.  There's either a ns that has 
metadata or not. If metadata and the attributes of the transport can't 
support it, the ns is inaccessible.

I understand why you are describing it as you are, but I'm a bit 
concerned about the creation of things that aren't comprehended in the 
standards at all right now (protected vs non-protected controllers). 
This affects how multipath configures as well.

-- james




* Re: [PATCH 05/17] nvme-fabrics: Allow user enabling metadata/T10-PI support
  2020-04-22 22:24       ` James Smart
@ 2020-04-22 22:39         ` Max Gurtovoy
  2020-04-23  5:54           ` Christoph Hellwig
  0 siblings, 1 reply; 59+ messages in thread
From: Max Gurtovoy @ 2020-04-22 22:39 UTC (permalink / raw)
  To: James Smart, Christoph Hellwig
  Cc: axboe, sagi, martin.petersen, shlomin, linux-rdma, israelr,
	vladimirk, linux-nvme, idanb, jgg, oren, kbusch


On 4/23/2020 1:24 AM, James Smart wrote:
> On 4/22/2020 3:07 PM, Max Gurtovoy wrote:
>>
>> On 4/21/2020 6:17 PM, Christoph Hellwig wrote:
>>> On Fri, Mar 27, 2020 at 08:15:33PM +0300, Max Gurtovoy wrote:
>>>> From: Israel Rukshin <israelr@mellanox.com>
>>>>
>>>> Preparation for adding metadata (T10-PI) over fabric support. This 
>>>> will
>>>> allow end-to-end protection information passthrough and validation for
>>>> NVMe over Fabric.
>>> So actually - for PCIe we enable PI by default.  Not sure why RDMA 
>>> would
>>> be any different?  If we have a switch to turn it off we probably want
>>> it work similar (can't be the same due to the lack of connect) for PCIe
>>> as well.
>>
>> For PCI we use a format command to configure metadata. In fabrics we 
>> can choose doing it in the connect command and we can also choose to 
>> have "protected" controllers and "non-protected" controllers.
>>
>> I don't think it's all or nothing case, and configuration using 
>> nvme-cli (or other tool) seems reasonable and flexible.
>
> I think you need to change this to "some fabrics".
>
> With FC, we don't do anything in connect. The transport passes 
> attributes on what it can do for PI support, including: passing 
> through metadata (no checks); checking of metadata (normal); 
> generation/strip of metadata on the wire (meaning OS does not have to 
> have a metadata buffer), and so on.  Enablement is just like pci - 
> format the ns, then match up the attribute with the behavior. There is 
> no such thing as protected and non-protected controllers.  There's 
> either a ns that has metadata or not. If metadata and the attributes 
> of the transport can't support it, the ns is inaccessible.
>
> I understand why you are describing it as you are, but I'm a bit 
> concerned about the creation of things that aren't comprehended in the 
> standards at all right now (protected vs non-protected controllers). 
> This affects how multipath configures as well.

It's a bit late for me now, so I probably wrote a non-standard sentence above.

BUT what I meant to say is I would like to give the user an option to 
decide whether to use E2E protection or not (of course a controller can 
control protected and non-protected namespaces :) )

AFAIK, there is no option to format a ns in NVMf (at least for RDMA
there is only 1 lbaf exposed by the target), so I'm not sure how exactly
this will work.

We took an all-or-nothing approach in iSER for T10, and I prefer not to
make the same mistake for NVMf.

When NVMe/FC supports metadata in Linux, we'll be able to adjust the
code and pi_enable according to HBA capabilities or any other logic that
fits.

>
> -- james
>
>


* Re: [PATCH 05/17] nvme-fabrics: Allow user enabling metadata/T10-PI support
  2020-04-22 22:07     ` Max Gurtovoy
  2020-04-22 22:24       ` James Smart
@ 2020-04-23  5:53       ` Christoph Hellwig
  1 sibling, 0 replies; 59+ messages in thread
From: Christoph Hellwig @ 2020-04-23  5:53 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch,
	Christoph Hellwig

On Thu, Apr 23, 2020 at 01:07:47AM +0300, Max Gurtovoy wrote:
>
> On 4/21/2020 6:17 PM, Christoph Hellwig wrote:
>> On Fri, Mar 27, 2020 at 08:15:33PM +0300, Max Gurtovoy wrote:
>>> From: Israel Rukshin <israelr@mellanox.com>
>>>
>>> Preparation for adding metadata (T10-PI) over fabric support. This will
>>> allow end-to-end protection information passthrough and validation for
>>> NVMe over Fabric.
>> So actually - for PCIe we enable PI by default.  Not sure why RDMA would
>> be any different?  If we have a switch to turn it off we probably want
>> it work similar (can't be the same due to the lack of connect) for PCIe
>> as well.
>
> For PCI we use a format command to configure metadata. In fabrics we can 
> choose doing it in the connect command and we can also choose to have 
> "protected" controllers and "non-protected" controllers.
>
> I don't think it's all or nothing case, and configuration using nvme-cli 
> (or other tool) seems reasonable and flexible.

Format applies to a namespace and is not limited to PCIe.


* Re: [PATCH 05/17] nvme-fabrics: Allow user enabling metadata/T10-PI support
  2020-04-22 22:39         ` Max Gurtovoy
@ 2020-04-23  5:54           ` Christoph Hellwig
  2020-04-23  7:30             ` Max Gurtovoy
  0 siblings, 1 reply; 59+ messages in thread
From: Christoph Hellwig @ 2020-04-23  5:54 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: axboe, James Smart, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch,
	Christoph Hellwig

On Thu, Apr 23, 2020 at 01:39:26AM +0300, Max Gurtovoy wrote:
> it's a bit late for me now so I probably wrote non standard sentence above.
>
> BUT what I meant to say is I would like to give the user an option to 
> decide whether use E2E protection or not (of course a controller can 
> control protected and non-protected namespaces :) )

I don't really have a problem with an opt-out, but I'd like to apply it
consistently over all transports.

>
> AFAIK, there is no option to format a ns in NVMf (at least for RDMA there 
> is only 1 lbaf exposed by the target) so i'm not sure how exactly this will 
> work.

The NVMe protocol Format NVM support is independent of the transport.


* Re: [PATCH 05/17] nvme-fabrics: Allow user enabling metadata/T10-PI support
  2020-04-23  5:54           ` Christoph Hellwig
@ 2020-04-23  7:30             ` Max Gurtovoy
  2020-04-24  7:06               ` Christoph Hellwig
  0 siblings, 1 reply; 59+ messages in thread
From: Max Gurtovoy @ 2020-04-23  7:30 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, James Smart, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch


On 4/23/2020 8:54 AM, Christoph Hellwig wrote:
> On Thu, Apr 23, 2020 at 01:39:26AM +0300, Max Gurtovoy wrote:
>> it's a bit late for me now so I probably wrote non standard sentence above.
>>
>> BUT what I meant to say is I would like to give the user an option to
>> decide whether use E2E protection or not (of course a controller can
>> control protected and non-protected namespaces :) )
> I don't really have a problem with an opt-out, but I'd like to apply it
> consistently over all transports.
>
>> AFAIK, there is no option to format a ns in NVMf (at least for RDMA there
>> is only 1 lbaf exposed by the target) so i'm not sure how exactly this will
>> work.
> The NVMe protocol Format NVM support is independent of the transport.

Ok, but it's not supported in Linux.

Are you saying we should implement Format NVM for fabrics? Or stay
consistent for NVMf (and not NVMf + PCI)?



* Re: [PATCH 08/17] nvme-rdma: add metadata/T10-PI support
  2020-04-21 12:20   ` Christoph Hellwig
@ 2020-04-23  9:22     ` Max Gurtovoy
  2020-04-24  7:09       ` Christoph Hellwig
  0 siblings, 1 reply; 59+ messages in thread
From: Max Gurtovoy @ 2020-04-23  9:22 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch


On 4/21/2020 3:20 PM, Christoph Hellwig wrote:
> On Fri, Mar 27, 2020 at 08:15:36PM +0300, Max Gurtovoy wrote:
>> For capable HCAs (e.g. ConnectX-5/ConnectX-6) this will allow end-to-end
>> protection information passthrough and validation for NVMe over RDMA
>> transport. Metadata offload support was implemented over the new RDMA
>> signature verbs API and it is enabled per controller by using nvme-cli.
>>
>> usage example:
>> nvme connect --pi_enable --transport=rdma --traddr=10.0.1.1 --nqn=test-nvme
>>
>> Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
>> Signed-off-by: Israel Rukshin <israelr@mellanox.com>
>> ---
>>   drivers/nvme/host/rdma.c | 330 ++++++++++++++++++++++++++++++++++++++++++-----
>>   1 file changed, 296 insertions(+), 34 deletions(-)
>>
>> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
>> index e38f8f7..23cc77e 100644
>> --- a/drivers/nvme/host/rdma.c
>> +++ b/drivers/nvme/host/rdma.c
>> @@ -67,6 +67,9 @@ struct nvme_rdma_request {
>>   	struct ib_cqe		reg_cqe;
>>   	struct nvme_rdma_queue  *queue;
>>   	struct nvme_rdma_sgl	data_sgl;
>> +	/* Metadata (T10-PI) support */
>> +	struct nvme_rdma_sgl	*md_sgl;
>> +	bool			use_md;
> Do we need a use_md flag vs just using md_sgl as a boolean and/or
> using blk_integrity_rq?

md_sgl will be used if we get a blk request with blk_integrity
(the memory domain).

use_md will be responsible for the wire domain.

So instead of this bool, we could do the following check everywhere
(after the previous commit's changes):

"

if (queue->pi_support && nvme_ns_has_pi(ns))
                 req->use_md = c.common.opcode == nvme_cmd_write ||
                               c.common.opcode == nvme_cmd_read;

"

And this is less readable IMO.

>
>>   enum nvme_rdma_queue_flags {
>> @@ -88,6 +91,7 @@ struct nvme_rdma_queue {
>>   	struct rdma_cm_id	*cm_id;
>>   	int			cm_error;
>>   	struct completion	cm_done;
>> +	bool			pi_support;
> Why do we need this on a per-queue basis vs always checking the
> controller?

To distinguish between admin and I/O queues. I don't want to allocate PI
resources on admin queues, and I prefer not to check (idx &&
ctrl->pi_support) every time.


>
>> +	u32 max_page_list_len =
>> +		pi_support ? ibdev->attrs.max_pi_fast_reg_page_list_len :
>> +			     ibdev->attrs.max_fast_reg_page_list_len;
>> +
>> +	return min_t(u32, NVME_RDMA_MAX_SEGMENTS, max_page_list_len - 1);
> Can you use a good old if / else here?

sure.


>> +#ifdef CONFIG_BLK_DEV_INTEGRITY
>> +static void nvme_rdma_set_sig_domain(struct blk_integrity *bi,
>> +		struct nvme_command *cmd, struct ib_sig_domain *domain,
>> +		u16 control)
>>   {
>> +	domain->sig_type = IB_SIG_TYPE_T10_DIF;
>> +	domain->sig.dif.bg_type = IB_T10DIF_CRC;
>> +	domain->sig.dif.pi_interval = 1 << bi->interval_exp;
>> +	domain->sig.dif.ref_tag = le32_to_cpu(cmd->rw.reftag);
>>   
>>   	/*
>> +	 * At the moment we hard code those, but in the future
>> +	 * we will take them from cmd.
> I don't understand this comment.  Also it doesn't use up all 80 chars.

It's a copy&paste from iSER.

I'll remove it.


>
>
>> +static void nvme_rdma_set_sig_attrs(struct blk_integrity *bi,
>> +		struct nvme_command *cmd, struct ib_sig_attrs *sig_attrs)
>> +{
>> +	u16 control = le16_to_cpu(cmd->rw.control);
>> +
>> +	WARN_ON(bi == NULL);
> I think this WARN_ON is pointless, as we'll get a NULL pointer derference
> a little later anyway.

I'll remove it.


>> +mr_put:
>> +	if (req->use_md)
>> +		ib_mr_pool_put(queue->qp, &queue->qp->sig_mrs, req->mr);
>> +	else
>> +		ib_mr_pool_put(queue->qp, &queue->qp->rdma_mrs, req->mr);
> I've seen this patterns a few times, maybe a little helper to return
> the right mr pool for a request?

yes I'll add:

static void nvme_rdma_mr_pool_put(struct ib_qp *qp,
                 struct nvme_rdma_request *req)
{
         if (req->use_md)
                 ib_mr_pool_put(qp, &qp->sig_mrs, req->mr);
         else
                 ib_mr_pool_put(qp, &qp->rdma_mrs, req->mr);

         req->mr = NULL;
}



>> +	if (blk_integrity_rq(rq)) {
>> +		memset(req->md_sgl, 0, sizeof(struct nvme_rdma_sgl));
> Why do we need this memset?

Just good practice we took from drivers/scsi/scsi_lib.c.

It's not a must, and I can remove it and retest if needed.




* Re: [PATCH 13/17] nvme: Add Metadata Capabilities enumerations
  2020-04-21 15:24   ` Christoph Hellwig
@ 2020-04-23 12:09     ` Max Gurtovoy
  2020-04-24  7:12       ` Christoph Hellwig
  0 siblings, 1 reply; 59+ messages in thread
From: Max Gurtovoy @ 2020-04-23 12:09 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch


On 4/21/2020 6:24 PM, Christoph Hellwig wrote:
> On Fri, Mar 27, 2020 at 08:15:41PM +0300, Max Gurtovoy wrote:
>> From: Israel Rukshin <israelr@mellanox.com>
>>
>> The enumerations will be used to expose the namespace metadata format by
>> the target.
>>
>> Signed-off-by: Israel Rukshin <israelr@mellanox.com>
>> Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
> I'd be tempted to use a separate enum and add a comment to which field
> this relates.

something like:

+/* Metadata Capabilities */
+enum {
+       /* supports metadata being transferred as part of an extended LBA */
+       NVME_NS_MC_META_EXT     = 1 << 0,
+       /* supports metadata being transferred as part of a separate buffer */
+       NVME_NS_MC_META_BUF     = 1 << 1,
+};

?



* Re: [PATCH 14/17] nvmet: Add metadata/T10-PI support
  2020-04-21 15:30   ` Christoph Hellwig
@ 2020-04-23 12:39     ` Max Gurtovoy
  2020-04-24  7:14       ` Christoph Hellwig
  0 siblings, 1 reply; 59+ messages in thread
From: Max Gurtovoy @ 2020-04-23 12:39 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch


On 4/21/2020 6:30 PM, Christoph Hellwig wrote:
>> +	/*
>> +	 * Max command capsule size is sqe + single page of in-capsule data.
>> +	 * Disable inline data for Metadata capable controllers.
>> +	 */
>>   	id->ioccsz = cpu_to_le32((sizeof(struct nvme_command) +
>> -				  req->port->inline_data_size) / 16);
>> +				  req->port->inline_data_size *
>> +				  !ctrl->pi_support) / 16);
> Can we de-obfuscate this a little?
>
> 	cmd_capsule_size = sizeof(struct nvme_command);
> 	if (!ctrl->pi_support)
> 		cmd_capsule_size += req->port->inline_data_size;
> 	id->ioccsz = cpu_to_le32(cmd_capsule_size / 16);

Yes, good idea.


>
>> +	if (ctrl->subsys->pi_support && ctrl->port->pi_enable) {
>> +		if (ctrl->port->pi_capable) {
>> +			ctrl->pi_support = true;
>> +			pr_info("controller %d T10-PI enabled\n", ctrl->cntlid);
>> +		} else {
>> +			ctrl->pi_support = false;
>> +			pr_warn("T10-PI is not supported on controller %d\n",
>> +				ctrl->cntlid);
>> +		}
> I think the printks are a little verbose.  Also why can we set
> ctrl->port->pi_enable if ctrl->port->pi_capable is false?  Shouldn't
> we reject that earlier?  In that case this could simply become:
>
> 	ctrl->pi_support = ctrl->subsys->pi_support && ctrl->port->pi_enable;

For that we'll need to check pi_capable during the add_port process and
clear the pi_enable bit if the user set it.

The user should set it before enabling the port (so the write itself will
always succeed).

I'll make this change as well.

Re the verbosity: I often get requests from users for some indication
that a feature is enabled.

We can remove these prints as well if needed.

>> +#ifdef CONFIG_BLK_DEV_INTEGRITY
>> +static inline u32 nvmet_rw_md_len(struct nvmet_req *req)
>> +{
>> +	return ((u32)le16_to_cpu(req->cmd->rw.length) + 1) * req->ns->ms;
>> +}
>> +
>> +static inline bool nvmet_ns_has_pi(struct nvmet_ns *ns)
>> +{
>> +	return ns->md_type && ns->ms == sizeof(struct t10_pi_tuple);
>> +}
>> +#else
>> +static inline u32 nvmet_rw_md_len(struct nvmet_req *req)
>> +{
>> +	return 0;
> Do we really need a stub for nvmet_rw_md_len?  Also for nvmet_ns_has_pi
> we could probably reword it as:
>
> static inline bool nvmet_ns_has_pi(struct nvmet_ns *ns)
> {
> 	if (!IS_ENABLED(CONFIG_BLK_DEV_INTEGRITY))
> 		return false;
> 	return ns->pi_type && ns->metadata_size == sizeof(struct t10_pi_tuple);
> }
>
> and avoid the need for a stub as well.

yup.




* Re: [PATCH 15/17] nvmet: Add metadata support for block devices
  2020-04-21 15:33   ` Christoph Hellwig
@ 2020-04-23 17:25     ` Max Gurtovoy
  2020-04-24  7:54       ` Christoph Hellwig
  0 siblings, 1 reply; 59+ messages in thread
From: Max Gurtovoy @ 2020-04-23 17:25 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch


On 4/21/2020 6:33 PM, Christoph Hellwig wrote:
> On Fri, Mar 27, 2020 at 08:15:43PM +0300, Max Gurtovoy wrote:
>> -	if (!nvmet_check_transfer_len(req, nvmet_rw_data_len(req)))
>> +	if (!nvmet_check_transfer_len(req,
>> +				      nvmet_rw_data_len(req) + req->md_len))
> Shouldn't we also calculate the actual metadata length on the fly here?

We calculate it during nvmet_init_req.


>
>>   	blk_start_plug(&plug);
>> +	if (req->use_md)
> Can't we use a non-NULL req->md_sg or non-null req->md_sg_cnt as a
> metadata supported indicator and remove the use_md flag?  Maybe wrap
> them in a helper function that also checks for blk integrity support
> using IS_ENABLED and we can skip the stubs as well.

I'll replace it with:

static inline bool nvmet_req_has_pi(struct nvmet_req *req)
{
         return req->sq->ctrl->pi_support && nvmet_ns_has_pi(req->ns);
}


>


* Re: [PATCH 05/17] nvme-fabrics: Allow user enabling metadata/T10-PI support
  2020-04-23  7:30             ` Max Gurtovoy
@ 2020-04-24  7:06               ` Christoph Hellwig
  2020-04-26  9:48                 ` Max Gurtovoy
  0 siblings, 1 reply; 59+ messages in thread
From: Christoph Hellwig @ 2020-04-24  7:06 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: axboe, James Smart, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch,
	Christoph Hellwig

On Thu, Apr 23, 2020 at 10:30:44AM +0300, Max Gurtovoy wrote:
>
> On 4/23/2020 8:54 AM, Christoph Hellwig wrote:
>> On Thu, Apr 23, 2020 at 01:39:26AM +0300, Max Gurtovoy wrote:
>>> it's a bit late for me now so I probably wrote a non-standard sentence above.
>>>
>>> BUT what I meant to say is I would like to give the user an option to
>>> decide whether use E2E protection or not (of course a controller can
>>> control protected and non-protected namespaces :) )
>> I don't really have a problem with an opt-out, but I'd like to apply it
>> consistently over all transports.
>>
>>> AFAIK, there is no option to format a ns in NVMf (at least for RDMA there
>>> is only 1 lbaf exposed by the target) so i'm not sure how exactly this will
>>> work.
>> The NVMe protocol Format NVM support is independent of the transport.
>
> Ok, but it's not supported in Linux.
>
> Are you saying we should implement Format NVM for fabrics, or stay
> consistent for NVMf (and not NVMf + PCI)?

I see no reason not to support a simple Format NVM for our fabrics target
implementation.  But that isn't the point - you don't really need Format
as you can also control it from configfs in your series.  So for the
initial version I don't think we need Format NVM, but I don't mind
adding it later.


* Re: [PATCH 08/17] nvme-rdma: add metadata/T10-PI support
  2020-04-23  9:22     ` Max Gurtovoy
@ 2020-04-24  7:09       ` Christoph Hellwig
  2020-04-26 10:04         ` Max Gurtovoy
  0 siblings, 1 reply; 59+ messages in thread
From: Christoph Hellwig @ 2020-04-24  7:09 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch,
	Christoph Hellwig

On Thu, Apr 23, 2020 at 12:22:27PM +0300, Max Gurtovoy wrote:
>>> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
>>> index e38f8f7..23cc77e 100644
>>> --- a/drivers/nvme/host/rdma.c
>>> +++ b/drivers/nvme/host/rdma.c
>>> @@ -67,6 +67,9 @@ struct nvme_rdma_request {
>>>   	struct ib_cqe		reg_cqe;
>>>   	struct nvme_rdma_queue  *queue;
>>>   	struct nvme_rdma_sgl	data_sgl;
>>> +	/* Metadata (T10-PI) support */
>>> +	struct nvme_rdma_sgl	*md_sgl;
>>> +	bool			use_md;
>> Do we need a use_md flag vs just using md_sgl as a boolean and/or
>> using blk_integrity_rq?
>
> md_sgl will be used if we'll get a blk request with blk_integrity (memory 
> domain).
>
> use_md will be responsible for wire domain.
>
> so instead of this bool we can check in any place (after prev commit 
> changes):
>
> "
>
> if (queue->pi_support && nvme_ns_has_pi(ns))
>                 req->use_md = c.common.opcode == nvme_cmd_write ||
>                               c.common.opcode == nvme_cmd_read;
>
> "
>
> And this is less readable IMO.

It would obviously have to go into a little helper, but I really hate
adding lots of little fields caching easily derived information.  There
are a few exceptions, for example if we really need to not touch too
many cache lines.  Do you have a git tree with your current code?  That
might be a little easier to follow than the various patches, maybe
I can think of something better.

>>> +	if (blk_integrity_rq(rq)) {
>>> +		memset(req->md_sgl, 0, sizeof(struct nvme_rdma_sgl));
>> Why do we need this memset?
>
Just good practice we took from drivers/scsi/scsi_lib.c.
>
> It's not a must and I can remove it if needed and test it.

I think (and please double check) that we initialize all three fields
anyway, so the memset should not be needed.


* Re: [PATCH 13/17] nvme: Add Metadata Capabilities enumerations
  2020-04-23 12:09     ` Max Gurtovoy
@ 2020-04-24  7:12       ` Christoph Hellwig
  0 siblings, 0 replies; 59+ messages in thread
From: Christoph Hellwig @ 2020-04-24  7:12 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch,
	Christoph Hellwig

On Thu, Apr 23, 2020 at 03:09:47PM +0300, Max Gurtovoy wrote:
>
> On 4/21/2020 6:24 PM, Christoph Hellwig wrote:
>> On Fri, Mar 27, 2020 at 08:15:41PM +0300, Max Gurtovoy wrote:
>>> From: Israel Rukshin <israelr@mellanox.com>
>>>
>>> The enumerations will be used to expose the namespace metadata format by
>>> the target.
>>>
>>> Signed-off-by: Israel Rukshin <israelr@mellanox.com>
>>> Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
>> I'd be tempted to use a separate enum and add a comment to which field
>> this relates.
>
> something like:
>
> +/* Metadata Capabilities */
> +enum {
> +       /* supports metadata being transferred as part of an extended LBA */
> +       NVME_NS_MC_META_EXT     = 1 << 0,
> +       /* supports metadata being transferred as part of a separate buffer */
> +       NVME_NS_MC_META_BUF     = 1 << 1,
> +};

What about:

/* Identify Namespace Metadata Capabilities (MC): */
enum {
	NVME_MC_METADATA_PTR	= (1 << 0),
	NVME_MC_EXTENDED_LBA	= (1 << 1),
};



* Re: [PATCH 14/17] nvmet: Add metadata/T10-PI support
  2020-04-23 12:39     ` Max Gurtovoy
@ 2020-04-24  7:14       ` Christoph Hellwig
  2020-04-26 10:50         ` Max Gurtovoy
  0 siblings, 1 reply; 59+ messages in thread
From: Christoph Hellwig @ 2020-04-24  7:14 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch,
	Christoph Hellwig

On Thu, Apr 23, 2020 at 03:39:38PM +0300, Max Gurtovoy wrote:
>>> +	if (ctrl->subsys->pi_support && ctrl->port->pi_enable) {
>>> +		if (ctrl->port->pi_capable) {
>>> +			ctrl->pi_support = true;
>>> +			pr_info("controller %d T10-PI enabled\n", ctrl->cntlid);
>>> +		} else {
>>> +			ctrl->pi_support = false;
>>> +			pr_warn("T10-PI is not supported on controller %d\n",
>>> +				ctrl->cntlid);
>>> +		}
>> I think the printks are a little verbose.  Also why can we set
>> ctrl->port->pi_enable if ctrl->port->pi_capable is false?  Shouldn't
>> we reject that earlier?  In that case this could simply become:
>>
>> 	ctrl->pi_support = ctrl->subsys->pi_support && ctrl->port->pi_enable;
>
> for that we'll need to check pi_capable during add_port process and disable 
> pi_enable bit if user set it.

Which seems pretty sensible.  In fact I think the configfs write for
pi enable should fail if we don't have the capability.

> The user should set it before enabling the port (this will always succeed).
>
> I'll make this change as well.
>
> re the verbosity, sometimes I get many requests from users to get 
> indication for some features.
>
> We can remove this as well if needed.

I'd rather keep debug prints silent.  You could add a verbose mode
in nvmetcli that prints all the settings applied if that helps these
users.


* Re: [PATCH 15/17] nvmet: Add metadata support for block devices
  2020-04-23 17:25     ` Max Gurtovoy
@ 2020-04-24  7:54       ` Christoph Hellwig
  0 siblings, 0 replies; 59+ messages in thread
From: Christoph Hellwig @ 2020-04-24  7:54 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch,
	Christoph Hellwig

On Thu, Apr 23, 2020 at 08:25:24PM +0300, Max Gurtovoy wrote:
>
> On 4/21/2020 6:33 PM, Christoph Hellwig wrote:
>> On Fri, Mar 27, 2020 at 08:15:43PM +0300, Max Gurtovoy wrote:
>>> -	if (!nvmet_check_transfer_len(req, nvmet_rw_data_len(req)))
>>> +	if (!nvmet_check_transfer_len(req,
>>> +				      nvmet_rw_data_len(req) + req->md_len))
>> Shouldn't we also calculate the actual metadata length on the fly here?
>
> we calculate it during nvmet_init_req.

Indeed.  

>
>>
>>>   	blk_start_plug(&plug);
>>> +	if (req->use_md)
>> Can't we use a non-NULL req->md_sg or non-null req->md_sg_cnt as a
>> metadata supported indicator and remove the use_md flag?  Maybe wrap
>> them in a helper function that also checks for blk integrity support
>> using IS_ENABLED and we can skip the stubs as well.
>
> I'll replace it with:
>
> static inline bool nvmet_req_has_pi(struct nvmet_req *req)
> {
>         return req->sq->ctrl->pi_support && nvmet_ns_has_pi(req->ns);
> }

Actually I think you can simply check for a non-0 md_len, as we always
set them at the same time.


* Re: [PATCH 05/17] nvme-fabrics: Allow user enabling metadata/T10-PI support
  2020-04-24  7:06               ` Christoph Hellwig
@ 2020-04-26  9:48                 ` Max Gurtovoy
  2020-04-27  6:04                   ` Christoph Hellwig
  0 siblings, 1 reply; 59+ messages in thread
From: Max Gurtovoy @ 2020-04-26  9:48 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, James Smart, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch


On 4/24/2020 10:06 AM, Christoph Hellwig wrote:
> On Thu, Apr 23, 2020 at 10:30:44AM +0300, Max Gurtovoy wrote:
>> On 4/23/2020 8:54 AM, Christoph Hellwig wrote:
>>> On Thu, Apr 23, 2020 at 01:39:26AM +0300, Max Gurtovoy wrote:
>>>> it's a bit late for me now so I probably wrote a non-standard sentence above.
>>>>
>>>> BUT what I meant to say is I would like to give the user an option to
>>>> decide whether use E2E protection or not (of course a controller can
>>>> control protected and non-protected namespaces :) )
>>> I don't really have a problem with an opt-out, but I'd like to apply it
>>> consistently over all transports.
>>>
>>>> AFAIK, there is no option to format a ns in NVMf (at least for RDMA there
>>>> is only 1 lbaf exposed by the target) so i'm not sure how exactly this will
>>>> work.
>>> The NVMe protocol Format NVM support is independent of the transport.
>> Ok, but it's not supported in Linux.
>>
>> Are you saying we should implement Format NVM for fabrics, or stay
>> consistent for NVMf (and not NVMf + PCI)?
> I see no reason not to support a simple Format NVM for our fabrics target
> implementation.  But that isn't the point - you don't really need Format
> as you can also control it from configfs in your series.  So for the
> initial version I don't think we need Format NVM, but I don't mind
> adding it later.

So we're OK with passing -p to nvme-cli in the connect command?



* Re: [PATCH 08/17] nvme-rdma: add metadata/T10-PI support
  2020-04-24  7:09       ` Christoph Hellwig
@ 2020-04-26 10:04         ` Max Gurtovoy
  0 siblings, 0 replies; 59+ messages in thread
From: Max Gurtovoy @ 2020-04-26 10:04 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch


On 4/24/2020 10:09 AM, Christoph Hellwig wrote:
> On Thu, Apr 23, 2020 at 12:22:27PM +0300, Max Gurtovoy wrote:
>>>> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
>>>> index e38f8f7..23cc77e 100644
>>>> --- a/drivers/nvme/host/rdma.c
>>>> +++ b/drivers/nvme/host/rdma.c
>>>> @@ -67,6 +67,9 @@ struct nvme_rdma_request {
>>>>    	struct ib_cqe		reg_cqe;
>>>>    	struct nvme_rdma_queue  *queue;
>>>>    	struct nvme_rdma_sgl	data_sgl;
>>>> +	/* Metadata (T10-PI) support */
>>>> +	struct nvme_rdma_sgl	*md_sgl;
>>>> +	bool			use_md;
>>> Do we need a use_md flag vs just using md_sgl as a boolean and/or
>>> using blk_integrity_rq?
>> md_sgl will be used if we'll get a blk request with blk_integrity (memory
>> domain).
>>
>> use_md will be responsible for wire domain.
>>
>> so instead of this bool we can check in any place (after prev commit
>> changes):
>>
>> "
>>
>> if (queue->pi_support && nvme_ns_has_pi(ns))
>>                  req->use_md = c.common.opcode == nvme_cmd_write ||
>>                                c.common.opcode == nvme_cmd_read;
>>
>> "
>>
>> And this is less readable IMO.
> It would obviously have to go into a little helper, but I really hate
> adding lots of little fields caching easily derived information.  There
>> are a few exceptions, for example if we really need to not touch too
> many cache lines.  Do you have a git tree with your current code?  That
> might be a little easier to follow than the various patches, maybe
> I can think of something better.
>
>>>> +	if (blk_integrity_rq(rq)) {
>>>> +		memset(req->md_sgl, 0, sizeof(struct nvme_rdma_sgl));
>>> Why do we need this memset?
>> just good practice we took from drivers/scsi/scsi_lib.c.
>>
>> It's not a must and I can remove it if needed and test it.
> I think (and please double check) that we initialize all three fields
> anyway, so the memset should not be needed.

right:

if (first_chunk && nents_first_chunk) {
        if (nents <= nents_first_chunk) {
                table->nents = table->orig_nents = nents;
                sg_init_table(table->sgl, nents);
                return 0;
        }
}

and also in __sg_alloc_table:

memset(table, 0, sizeof(*table));




* Re: [PATCH 14/17] nvmet: Add metadata/T10-PI support
  2020-04-24  7:14       ` Christoph Hellwig
@ 2020-04-26 10:50         ` Max Gurtovoy
  2020-04-27  6:06           ` Christoph Hellwig
  0 siblings, 1 reply; 59+ messages in thread
From: Max Gurtovoy @ 2020-04-26 10:50 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch


On 4/24/2020 10:14 AM, Christoph Hellwig wrote:
> On Thu, Apr 23, 2020 at 03:39:38PM +0300, Max Gurtovoy wrote:
>>>> +	if (ctrl->subsys->pi_support && ctrl->port->pi_enable) {
>>>> +		if (ctrl->port->pi_capable) {
>>>> +			ctrl->pi_support = true;
>>>> +			pr_info("controller %d T10-PI enabled\n", ctrl->cntlid);
>>>> +		} else {
>>>> +			ctrl->pi_support = false;
>>>> +			pr_warn("T10-PI is not supported on controller %d\n",
>>>> +				ctrl->cntlid);
>>>> +		}
>>> I think the printks are a little verbose.  Also why can we set
>>> ctrl->port->pi_enable if ctrl->port->pi_capable is false?  Shouldn't
>>> we reject that earlier?  In that case this could simply become:
>>>
>>> 	ctrl->pi_support = ctrl->subsys->pi_support && ctrl->port->pi_enable;
>> for that we'll need to check pi_capable during add_port process and disable
>> pi_enable bit if user set it.
> Which seems pretty sensible.  In fact I think the configfs write for
> pi enable should fail if we don't have the capability.

The port is not enabled yet when the attribute is written, so that check
isn't currently possible.

But we can disable it afterwards (in nvmet_enable_port):

+       /* If the transport didn't set pi_capable, then disable it. */
+       if (!port->pi_capable)
+               port->pi_enable = false;


>
>> The user should set it before enabling the port (this will always succeed).
>>
>> I'll make this change as well.
>>
>> re the verbosity, sometimes I get many requests from users to get
>> indication for some features.
>>
>> We can remove this as well if needed.
> I'd rather keep debug prints silent.  You could add a verbose mode
> in nvmetcli that prints all the settings applied if that helps these
> users.

how about:

-       pr_info("creating controller %d for subsystem %s for NQN %s.\n",
-               ctrl->cntlid, ctrl->subsys->subsysnqn, ctrl->hostnqn);
+       pr_info("creating controller %d for subsystem %s for NQN %s%s.\n",
+               ctrl->cntlid, ctrl->subsys->subsysnqn, ctrl->hostnqn,
+               ctrl->pi_support ? " T10-PI is enabled" : "");





* Re: [PATCH 05/17] nvme-fabrics: Allow user enabling metadata/T10-PI support
  2020-04-26  9:48                 ` Max Gurtovoy
@ 2020-04-27  6:04                   ` Christoph Hellwig
  2020-04-27 13:52                     ` Max Gurtovoy
  0 siblings, 1 reply; 59+ messages in thread
From: Christoph Hellwig @ 2020-04-27  6:04 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: axboe, James Smart, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch,
	Christoph Hellwig

On Sun, Apr 26, 2020 at 12:48:18PM +0300, Max Gurtovoy wrote:
>> I see no reason not to support a simple Format NVM for our fabrics target
>> implementation.  But that isn't the point - you don't really need Format
>> as you can also control it from configfs in your series.  So for the
>> initial version I don't think we need Format NVM, but I don't mind
>> adding it later.
>
> So we're OK with passing -p to nvme-cli in the connect command?

PI should be enabled by default.  We can think of a hook for disabling it,
but please keep it at the end of the series.


* Re: [PATCH 14/17] nvmet: Add metadata/T10-PI support
  2020-04-26 10:50         ` Max Gurtovoy
@ 2020-04-27  6:06           ` Christoph Hellwig
  0 siblings, 0 replies; 59+ messages in thread
From: Christoph Hellwig @ 2020-04-27  6:06 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: axboe, jsmart2021, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch,
	Christoph Hellwig

On Sun, Apr 26, 2020 at 01:50:05PM +0300, Max Gurtovoy wrote:
>>>> I think the printks are a little verbose.  Also why can we set
>>>> ctrl->port->pi_enable if ctrl->port->pi_capable is false?  Shouldn't
>>>> we reject that earlier?  In that case this could simply become:
>>>>
>>>> 	ctrl->pi_support = ctrl->subsys->pi_support && ctrl->port->pi_enable;
>>> for that we'll need to check pi_capable during add_port process and disable
>>> pi_enable bit if user set it.
>> Which seems pretty sensible.  In fact I think the configfs write for
>> pi enable should fail if we don't have the capability.
>
> The port is not enabled so it's not possible currently.
>
> But we can disable it afterwards (in nvmet_enable_port) :
>
> +       /* If the transport didn't set pi_capable, then disable it. */
> +       if (!port->pi_capable)
> +               port->pi_enable = false;

I don't think we should allow users to enable invalid configurations
and silently clear flags, but reject flags as soon as we can - write
to the attribute where possible, else during enable.

> how about:
>
> -       pr_info("creating controller %d for subsystem %s for NQN %s.\n",
> -               ctrl->cntlid, ctrl->subsys->subsysnqn, ctrl->hostnqn);
> +       pr_info("creating controller %d for subsystem %s for NQN %s%s.\n",
> +               ctrl->cntlid, ctrl->subsys->subsysnqn, ctrl->hostnqn,
> +               ctrl->pi_support ? " T10-PI is enabled" : "");

Ok.


* Re: [PATCH 05/17] nvme-fabrics: Allow user enabling metadata/T10-PI support
  2020-04-27  6:04                   ` Christoph Hellwig
@ 2020-04-27 13:52                     ` Max Gurtovoy
  2020-04-27 13:54                       ` Christoph Hellwig
  0 siblings, 1 reply; 59+ messages in thread
From: Max Gurtovoy @ 2020-04-27 13:52 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, James Smart, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch


On 4/27/2020 9:04 AM, Christoph Hellwig wrote:
> On Sun, Apr 26, 2020 at 12:48:18PM +0300, Max Gurtovoy wrote:
>>> I see no reason not to support a simple Format NVM for our fabrics target
>>> implementation.  But that isn't the point - you don't really need Format
>>> as you can also control it from configfs in your series.  So for the
>>> initial version I don't think we need Format NVM, but I don't mind
>>> adding it later.
>> So we're OK with passing -p to nvme-cli in the connect command?
> PI should be enabled by default.  We can think of a hook for disabling it,
> but please keep it at the end of the series.

But the default format on NVMe PCI drives is without metadata, and to
enable it we issue an NVM Format command.

So in that model some explicit action is needed to enable T10-PI/metadata.

Regular users will usually stick with the default configuration and are
not aware of the pros and cons of adding metadata
(resources/performance/etc.).

Also, a module param as we did in iSER is less flexible than a
per-controller flag in the CLI.

I'll think about another option to replace -p in the connect command
(the only one I can think of now is always "on", as suggested...)




* Re: [PATCH 05/17] nvme-fabrics: Allow user enabling metadata/T10-PI support
  2020-04-27 13:52                     ` Max Gurtovoy
@ 2020-04-27 13:54                       ` Christoph Hellwig
  2020-04-28  9:18                         ` Max Gurtovoy
  0 siblings, 1 reply; 59+ messages in thread
From: Christoph Hellwig @ 2020-04-27 13:54 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: axboe, James Smart, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch,
	Christoph Hellwig

On Mon, Apr 27, 2020 at 04:52:52PM +0300, Max Gurtovoy wrote:
> but the default format on NVMe pci drives is without metadata and to enable 
> it we do an NVM format command.

The default namespace format is entirely vendor-specific for both
PCIe and Fabrics.  For OEMs that rely on PI, the drives will usually be
shipped with a PI-enabled format.


* Re: [PATCH 05/17] nvme-fabrics: Allow user enabling metadata/T10-PI support
  2020-04-27 13:54                       ` Christoph Hellwig
@ 2020-04-28  9:18                         ` Max Gurtovoy
  0 siblings, 0 replies; 59+ messages in thread
From: Max Gurtovoy @ 2020-04-28  9:18 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, James Smart, sagi, martin.petersen, shlomin, linux-rdma,
	israelr, vladimirk, linux-nvme, idanb, jgg, oren, kbusch


On 4/27/2020 4:54 PM, Christoph Hellwig wrote:
> On Mon, Apr 27, 2020 at 04:52:52PM +0300, Max Gurtovoy wrote:
>> but the default format on NVMe pci drives is without metadata and to enable
>> it we do an NVM format command.
> The default namespace format is entirely vendor-specific for both
> PCIe and Fabrics.  For OEMs that rely on PI, the drives will usually be
> shipped with a PI-enabled format.

OK, I'll mark the controller as capable when the HCA is capable, and
then act according to the namespace format exposed by the target.

It will cost us integrity MRs (which are expensive) on ConnectX-4 and
above, but I guess we can live with that for now.



end of thread, other threads:[~2020-04-28  9:18 UTC | newest]

Thread overview: 59+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-27 17:15 [PATCH 00/17 V5] nvme-rdma/nvmet-rdma: Add metadata/T10-PI support Max Gurtovoy
2020-03-27 17:15 ` [PATCH 1/1] nvme-cli/fabrics: Add pi_enable param to connect cmd Max Gurtovoy
2020-03-27 17:15 ` [PATCH 01/17] nvme: introduce namespace features flag Max Gurtovoy
2020-04-21 11:59   ` Christoph Hellwig
2020-04-21 15:53     ` James Smart
2020-04-21 18:11       ` Christoph Hellwig
2020-03-27 17:15 ` [PATCH 02/17] nvme: Add has_md field to the nvme_req structure Max Gurtovoy
2020-04-21 11:59   ` Christoph Hellwig
2020-03-27 17:15 ` [PATCH 03/17] nvme: Enforce extended LBA format for fabrics metadata Max Gurtovoy
2020-04-21 12:08   ` Christoph Hellwig
2020-03-27 17:15 ` [PATCH 04/17] nvme: introduce max_integrity_segments ctrl attribute Max Gurtovoy
2020-04-21 12:09   ` Christoph Hellwig
2020-03-27 17:15 ` [PATCH 05/17] nvme-fabrics: Allow user enabling metadata/T10-PI support Max Gurtovoy
2020-04-21 12:12   ` Christoph Hellwig
2020-04-21 15:17   ` Christoph Hellwig
2020-04-22 22:07     ` Max Gurtovoy
2020-04-22 22:24       ` James Smart
2020-04-22 22:39         ` Max Gurtovoy
2020-04-23  5:54           ` Christoph Hellwig
2020-04-23  7:30             ` Max Gurtovoy
2020-04-24  7:06               ` Christoph Hellwig
2020-04-26  9:48                 ` Max Gurtovoy
2020-04-27  6:04                   ` Christoph Hellwig
2020-04-27 13:52                     ` Max Gurtovoy
2020-04-27 13:54                       ` Christoph Hellwig
2020-04-28  9:18                         ` Max Gurtovoy
2020-04-23  5:53       ` Christoph Hellwig
2020-03-27 17:15 ` [PATCH 06/17] nvme: introduce NVME_INLINE_MD_SG_CNT Max Gurtovoy
2020-04-21 12:12   ` Christoph Hellwig
2020-03-27 17:15 ` [PATCH 07/17] nvme-rdma: Introduce nvme_rdma_sgl structure Max Gurtovoy
2020-04-21 12:13   ` Christoph Hellwig
2020-03-27 17:15 ` [PATCH 08/17] nvme-rdma: add metadata/T10-PI support Max Gurtovoy
2020-04-21 12:20   ` Christoph Hellwig
2020-04-23  9:22     ` Max Gurtovoy
2020-04-24  7:09       ` Christoph Hellwig
2020-04-26 10:04         ` Max Gurtovoy
2020-03-27 17:15 ` [PATCH 09/17] nvmet: prepare metadata request Max Gurtovoy
2020-04-21 15:21   ` Christoph Hellwig
2020-03-27 17:15 ` [PATCH 10/17] nvmet: add metadata characteristics for a namespace Max Gurtovoy
2020-04-21 15:23   ` Christoph Hellwig
2020-03-27 17:15 ` [PATCH 11/17] nvmet: Rename nvmet_rw_len to nvmet_rw_data_len Max Gurtovoy
2020-03-27 17:15 ` [PATCH 12/17] nvmet: Rename nvmet_check_data_len to nvmet_check_transfer_len Max Gurtovoy
2020-03-27 17:15 ` [PATCH 13/17] nvme: Add Metadata Capabilities enumerations Max Gurtovoy
2020-04-21 15:24   ` Christoph Hellwig
2020-04-23 12:09     ` Max Gurtovoy
2020-04-24  7:12       ` Christoph Hellwig
2020-03-27 17:15 ` [PATCH 14/17] nvmet: Add metadata/T10-PI support Max Gurtovoy
2020-04-21 15:30   ` Christoph Hellwig
2020-04-23 12:39     ` Max Gurtovoy
2020-04-24  7:14       ` Christoph Hellwig
2020-04-26 10:50         ` Max Gurtovoy
2020-04-27  6:06           ` Christoph Hellwig
2020-03-27 17:15 ` [PATCH 15/17] nvmet: Add metadata support for block devices Max Gurtovoy
2020-04-21 15:33   ` Christoph Hellwig
2020-04-23 17:25     ` Max Gurtovoy
2020-04-24  7:54       ` Christoph Hellwig
2020-03-27 17:15 ` [PATCH 16/17] RDMA/rw: Expose maximal page list for a device per 1 MR Max Gurtovoy
2020-03-27 17:15 ` [PATCH 17/17] nvmet-rdma: Add metadata/T10-PI support Max Gurtovoy
2020-04-21 15:37   ` Christoph Hellwig
