Linux-NVME Archive on lore.kernel.org
 help / color / Atom feed
* [RFC 0/6] nvme: core layer metadata/pi support
@ 2020-02-24 18:48 James Smart
  2020-02-24 18:48 ` [RFC 1/6] nvme: Add ns to nvme_request struct James Smart
                   ` (5 more replies)
  0 siblings, 6 replies; 8+ messages in thread
From: James Smart @ 2020-02-24 18:48 UTC (permalink / raw)
  To: linux-nvme; +Cc: James Smart

This patch set is alternative to the core layer mods proposed by Max
Gurtovoy.

This set adds much the same functionality, but does so via transport ops
flags that indicate capabilities that are then stamped into the controller.
Transports can fine-tune the capabilities on a per-controller basis by
tuning the controller flag values.  The patches also use different
language. Rather than introducing extended lba data terms and naming all
metadata as "integrity" data, it keeps things generic as "metadata", that
then may or may not contain pi data. It also revises some of the checks
in the core layer to better show fabric deltas and options for metadata.

This patch set does not cover the patch adding a fabrics "pi_enable"
option. That patch may still be used, but would be implemented differently
in that the rdma transport would look for the option, and when set, revise
the ctrl->t_flags after calling nvme_init_integrity. The rdma transport
patches would also largely not change in function, but would change in the
flags and how they would look for items.

This patch set does not address (yet) the nvmet side.


-- james

James Smart (6):
  nvme: Add ns to nvme_request struct
  nvme: make nvme_ns_has_pi accessible to transports
  nvme: Introduce max_meta_segments ctrl and ops attribute
  nvme: Add t_flags and pi_blkszs ctrl and ops attributes
  nvme: Add pi_flags to nvme_request for transport pi support
  nvme: Introduce NVME_INLINE_PROT_SG_CNT

 drivers/nvme/host/core.c | 115 +++++++++++++++++++++++++++++++++++++----------
 drivers/nvme/host/nvme.h |  46 +++++++++++++++++--
 drivers/nvme/host/pci.c  |   7 +++
 3 files changed, 141 insertions(+), 27 deletions(-)

-- 
2.16.4


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [RFC 1/6] nvme: Add ns to nvme_request struct
  2020-02-24 18:48 [RFC 0/6] nvme: core layer metadata/pi support James Smart
@ 2020-02-24 18:48 ` James Smart
  2020-02-24 18:48 ` [RFC 2/6] nvme: make nvme_ns_has_pi accessible to transports James Smart
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: James Smart @ 2020-02-24 18:48 UTC (permalink / raw)
  To: linux-nvme; +Cc: James Smart

To facilitate obtaining status about the namespace issuing the request,
add a ns pointer to the nvme_request structure.

Signed-off-by: James Smart <jsmart2021@gmail.com>
---
 drivers/nvme/host/core.c | 2 ++
 drivers/nvme/host/nvme.h | 1 +
 2 files changed, 3 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 84914223c537..a42c0ab37ef4 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -475,6 +475,7 @@ static inline void nvme_clear_nvme_request(struct request *req)
 	if (!(req->rq_flags & RQF_DONTPREP)) {
 		nvme_req(req)->retries = 0;
 		nvme_req(req)->flags = 0;
+		nvme_req(req)->ns = NULL;
 		req->rq_flags |= RQF_DONTPREP;
 	}
 }
@@ -758,6 +759,7 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req,
 	blk_status_t ret = BLK_STS_OK;
 
 	nvme_clear_nvme_request(req);
+	nvme_req(req)->ns = ns;
 
 	memset(cmd, 0, sizeof(*cmd));
 	switch (req_op(req)) {
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 1024fec7914c..11336ec6d27b 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -139,6 +139,7 @@ struct nvme_request {
 	u8			flags;
 	u16			status;
 	struct nvme_ctrl	*ctrl;
+	struct nvme_ns		*ns;
 };
 
 /*
-- 
2.16.4


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [RFC 2/6] nvme: make nvme_ns_has_pi accessible to transports
  2020-02-24 18:48 [RFC 0/6] nvme: core layer metadata/pi support James Smart
  2020-02-24 18:48 ` [RFC 1/6] nvme: Add ns to nvme_request struct James Smart
@ 2020-02-24 18:48 ` James Smart
  2020-02-24 18:48 ` [RFC 3/6] nvme: Introduce max_meta_segments ctrl and ops attribute James Smart
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: James Smart @ 2020-02-24 18:48 UTC (permalink / raw)
  To: linux-nvme; +Cc: James Smart

Move the nvme_ns_has_pi() inline from core.c to the nvme.h header.
This allows use by the transports

Signed-off-by: James Smart <jsmart2021@gmail.com>
---
 drivers/nvme/host/core.c | 6 ------
 drivers/nvme/host/nvme.h | 6 ++++++
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index a42c0ab37ef4..4f08c637ec2e 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -18,7 +18,6 @@
 #include <linux/pr.h>
 #include <linux/ptrace.h>
 #include <linux/nvme_ioctl.h>
-#include <linux/t10-pi.h>
 #include <linux/pm_qos.h>
 #include <asm/unaligned.h>
 
@@ -209,11 +208,6 @@ static int nvme_delete_ctrl_sync(struct nvme_ctrl *ctrl)
 	return ret;
 }
 
-static inline bool nvme_ns_has_pi(struct nvme_ns *ns)
-{
-	return ns->pi_type && ns->ms == sizeof(struct t10_pi_tuple);
-}
-
 static blk_status_t nvme_error_status(u16 status)
 {
 	switch (status & 0x7ff) {
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 11336ec6d27b..0725cc7c7a7a 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -16,6 +16,7 @@
 #include <linux/fault-inject.h>
 #include <linux/rcupdate.h>
 #include <linux/wait.h>
+#include <linux/t10-pi.h>
 
 #include <trace/events/block.h>
 
@@ -394,6 +395,11 @@ struct nvme_ns {
 
 };
 
+static inline bool nvme_ns_has_pi(struct nvme_ns *ns)
+{
+	return ns->pi_type && ns->ms == sizeof(struct t10_pi_tuple);
+}
+
 struct nvme_ctrl_ops {
 	const char *name;
 	struct module *module;
-- 
2.16.4


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [RFC 3/6] nvme: Introduce max_meta_segments ctrl and ops attribute
  2020-02-24 18:48 [RFC 0/6] nvme: core layer metadata/pi support James Smart
  2020-02-24 18:48 ` [RFC 1/6] nvme: Add ns to nvme_request struct James Smart
  2020-02-24 18:48 ` [RFC 2/6] nvme: make nvme_ns_has_pi accessible to transports James Smart
@ 2020-02-24 18:48 ` James Smart
  2020-02-25  9:49   ` Max Gurtovoy
  2020-02-24 18:48 ` [RFC 4/6] nvme: Add t_flags and pi_blkszs ctrl and ops attributes James Smart
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 8+ messages in thread
From: James Smart @ 2020-02-24 18:48 UTC (permalink / raw)
  To: linux-nvme; +Cc: Max Gurtovoy, James Smart, Sagi Grimberg, Israel Rukshin

This patch was originally posted by Max Gurtovoy <maxg@mellanox.com>

This patch doesn't change any logic, and is needed as a preparation
for adding PI support for fabrics drivers.

This patch parameterizes the number of sgl segments supported for
a separate metadata buffer.

The parameter is added to the nvme_ctrl_ops struct and the nvme_ctrl_ops
struct. nvme_init_ctrl() was modified to initialize the controller
value from the ops value. It was done in this manner such that if the
transport supports a singular/unchanged value, it can be set in the ops
struct and no other code is necessary. However, if the transport must
set the value on a per-controller basis, likely due to differents in the
host transport hardware, it can directly modify the field in the ctrl
struct.

CC: Max Gurtovoy <maxg@mellanox.com>
CC: Israel Rukshin <israelr@mellanox.com>
CC: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: James Smart <jsmart2021@gmail.com>

---
Modifications to Max's patch:
 Rename max_integrity_segments to max_meta_segments.
 Add parameter to ops struct and initialize ctrl field in nvme_init_ctrl.
---
 drivers/nvme/host/core.c | 12 ++++++++----
 drivers/nvme/host/nvme.h |  3 +++
 drivers/nvme/host/pci.c  |  3 +++
 3 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 4f08c637ec2e..8421eafa81c6 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1624,7 +1624,8 @@ static int nvme_getgeo(struct block_device *bdev, struct hd_geometry *geo)
 }
 
 #ifdef CONFIG_BLK_DEV_INTEGRITY
-static void nvme_init_integrity(struct gendisk *disk, u16 ms, u8 pi_type)
+static void nvme_init_integrity(struct gendisk *disk, u16 ms, u8 pi_type,
+				u32 max_meta_segments)
 {
 	struct blk_integrity integrity;
 
@@ -1647,10 +1648,11 @@ static void nvme_init_integrity(struct gendisk *disk, u16 ms, u8 pi_type)
 	}
 	integrity.tuple_size = ms;
 	blk_integrity_register(disk, &integrity);
-	blk_queue_max_integrity_segments(disk->queue, 1);
+	blk_queue_max_integrity_segments(disk->queue, max_meta_segments);
 }
 #else
-static void nvme_init_integrity(struct gendisk *disk, u16 ms, u8 pi_type)
+static void nvme_init_integrity(struct gendisk *disk, u16 ms, u8 pi_type,
+				u32 max_meta_segments)
 {
 }
 #endif /* CONFIG_BLK_DEV_INTEGRITY */
@@ -1805,7 +1807,8 @@ static void nvme_update_disk_info(struct gendisk *disk,
 
 	if (ns->ms && !ns->ext &&
 	    (ns->ctrl->ops->flags & NVME_F_METADATA_SUPPORTED))
-		nvme_init_integrity(disk, ns->ms, ns->pi_type);
+		nvme_init_integrity(disk, ns->ms, ns->pi_type,
+				    ns->ctrl->max_meta_segments);
 	if ((ns->ms && !nvme_ns_has_pi(ns) && !blk_get_integrity(disk)) ||
 	    ns->lba_shift > PAGE_SHIFT)
 		capacity = 0;
@@ -4058,6 +4061,7 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
 	init_rwsem(&ctrl->namespaces_rwsem);
 	ctrl->dev = dev;
 	ctrl->ops = ops;
+	ctrl->max_meta_segments = ops->max_meta_segments;
 	ctrl->quirks = quirks;
 	INIT_WORK(&ctrl->scan_work, nvme_scan_work);
 	INIT_WORK(&ctrl->async_event_work, nvme_async_event_work);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 0725cc7c7a7a..4c6b6fc18560 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -262,6 +262,8 @@ struct nvme_ctrl {
 	struct work_struct fw_act_work;
 	unsigned long events;
 
+	u32 max_meta_segments;
+
 #ifdef CONFIG_NVME_MULTIPATH
 	/* asymmetric namespace access: */
 	u8 anacap;
@@ -414,6 +416,7 @@ struct nvme_ctrl_ops {
 	void (*submit_async_event)(struct nvme_ctrl *ctrl);
 	void (*delete_ctrl)(struct nvme_ctrl *ctrl);
 	int (*get_address)(struct nvme_ctrl *ctrl, char *buf, int size);
+	u32 max_meta_segments;
 };
 
 #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index da392b50f73e..7cbd2fbda743 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2690,6 +2690,9 @@ static const struct nvme_ctrl_ops nvme_pci_ctrl_ops = {
 	.free_ctrl		= nvme_pci_free_ctrl,
 	.submit_async_event	= nvme_pci_submit_async_event,
 	.get_address		= nvme_pci_get_address,
+	/* PCI supports metadata via single segment separate buffer only */
+	.max_meta_segments	= 1,
+
 };
 
 static int nvme_dev_map(struct nvme_dev *dev)
-- 
2.16.4


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [RFC 4/6] nvme: Add t_flags and pi_blkszs ctrl and ops attributes
  2020-02-24 18:48 [RFC 0/6] nvme: core layer metadata/pi support James Smart
                   ` (2 preceding siblings ...)
  2020-02-24 18:48 ` [RFC 3/6] nvme: Introduce max_meta_segments ctrl and ops attribute James Smart
@ 2020-02-24 18:48 ` James Smart
  2020-02-24 18:48 ` [RFC 5/6] nvme: Add pi_flags to nvme_request for transport pi support James Smart
  2020-02-24 18:48 ` [RFC 6/6] nvme: Introduce NVME_INLINE_PROT_SG_CNT James Smart
  5 siblings, 0 replies; 8+ messages in thread
From: James Smart @ 2020-02-24 18:48 UTC (permalink / raw)
  To: linux-nvme; +Cc: James Smart

To prepare for fabric PI support, the ops->flags handling needed to be
reworked and the addition of a new field, pi_blkszs is needed.

The nvme_ctrl_ops flags defines were moved to an enum, and values were
added to cover PI types that the host transport supports, as well as a
flag on whether the transport can insert/strip PI data for payload
received without pi data. A new field, pi_blkszs, was added. The field
parallels ns->lba_shift - each bit is a power of 2 block size, and if set,
is a block size the transport knows how to do pi operations on.

The new fields were also added to nvme_ctrl. This allows the default
values to be loaded from the ops via nvme_init_ctrl, but the transport
is allowed to tune the value on a per-controller basis if needed (directly
writes the field after nvme_init_ctrl). For example, the same transport
may have two host ports that have hw with different capabilities.

The PCI driver's ops struct was updated for the new pi flags and the
new pi_blkszs field. As all pci devices will behave the same, the ops
structure is all that needed to change.

The core code was updated to take the transport flags from the controller
rather than the ops structure.  nvme_update_disk_info() was revised to
better check the support of the transport when there is metadata.

Signed-off-by: James Smart <jsmart2021@gmail.com>

---
 I don't believe the following check, added by this patch, is necessary
 with the code in this new form. Meaning: in all cases now, if we reach this
 point, nvme_init_integrity() is called. So I don't think we need to check
 blk_get_integrity() any more.
 Please review:
  @@ -1805,12 +1806,32 @@ static void nvme_update_disk_info(struct gendisk *disk,
  ...
  +               if (!nvme_ns_has_pi(ns) && !blk_get_integrity(disk))
  +                       capacity = 0;
---
 drivers/nvme/host/core.c | 41 ++++++++++++++++++++++++++++++++---------
 drivers/nvme/host/nvme.h | 33 +++++++++++++++++++++++++++++----
 drivers/nvme/host/pci.c  |  6 +++++-
 3 files changed, 66 insertions(+), 14 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 8421eafa81c6..ab03311105c3 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1762,6 +1762,7 @@ static bool nvme_ns_ids_equal(struct nvme_ns_ids *a, struct nvme_ns_ids *b)
 static void nvme_update_disk_info(struct gendisk *disk,
 		struct nvme_ns *ns, struct nvme_id_ns *id)
 {
+	struct nvme_ctrl *ctrl = ns->ctrl;
 	sector_t capacity = nvme_lba_to_sect(ns, le64_to_cpu(id->nsze));
 	unsigned short bs = 1 << ns->lba_shift;
 	u32 atomic_bs, phys_bs, io_opt;
@@ -1782,7 +1783,7 @@ static void nvme_update_disk_info(struct gendisk *disk,
 		if (id->nsfeat & (1 << 1) && id->nawupf)
 			atomic_bs = (1 + le16_to_cpu(id->nawupf)) * bs;
 		else
-			atomic_bs = (1 + ns->ctrl->subsys->awupf) * bs;
+			atomic_bs = (1 + ctrl->subsys->awupf) * bs;
 	} else {
 		atomic_bs = bs;
 	}
@@ -1805,12 +1806,32 @@ static void nvme_update_disk_info(struct gendisk *disk,
 	blk_queue_io_min(disk->queue, phys_bs);
 	blk_queue_io_opt(disk->queue, io_opt);
 
-	if (ns->ms && !ns->ext &&
-	    (ns->ctrl->ops->flags & NVME_F_METADATA_SUPPORTED))
+	if (ns->ms) {
+		/*
+		 * Ensure transport supports separate metadata. If not,
+		 * only allowable configuration is with pi enabled and
+		 * using PRACT=1 or the host transport to insert/strip.
+		 */
+		if (!(ctrl->t_flags & NVME_F_METADATA_SUPPORTED) &&
+		    !nvme_ns_has_pi(ns))
+			goto set_0_capacity;
+
+		/* Check controller requirements:
+		 *  PCI requires separate metadata support.
+		 *  Fabrics do not support separate metadata.
+		 */
+		if ((!ns->ext && ctrl->t_flags & NVME_F_FABRICS) ||
+		    (ns->ext && !(ctrl->t_flags & NVME_F_FABRICS)))
+			goto set_0_capacity;
+
 		nvme_init_integrity(disk, ns->ms, ns->pi_type,
-				    ns->ctrl->max_meta_segments);
-	if ((ns->ms && !nvme_ns_has_pi(ns) && !blk_get_integrity(disk)) ||
-	    ns->lba_shift > PAGE_SHIFT)
+				    ctrl->max_meta_segments);
+
+		if (!nvme_ns_has_pi(ns) && !blk_get_integrity(disk))
+			capacity = 0;
+	}
+	if (ns->lba_shift > PAGE_SHIFT)
+set_0_capacity:
 		capacity = 0;
 
 	set_capacity(disk, capacity);
@@ -2764,7 +2785,7 @@ int nvme_init_identify(struct nvme_ctrl *ctrl)
 			goto out_free;
 	}
 
-	if (!(ctrl->ops->flags & NVME_F_FABRICS))
+	if (!(ctrl->t_flags & NVME_F_FABRICS))
 		ctrl->cntlid = le16_to_cpu(id->cntlid);
 
 	if (!ctrl->identified) {
@@ -2848,7 +2869,7 @@ int nvme_init_identify(struct nvme_ctrl *ctrl)
 	}
 	memcpy(ctrl->psd, id->psd, sizeof(ctrl->psd));
 
-	if (ctrl->ops->flags & NVME_F_FABRICS) {
+	if (ctrl->t_flags & NVME_F_FABRICS) {
 		ctrl->icdoff = le16_to_cpu(id->icdoff);
 		ctrl->ioccsz = le32_to_cpu(id->ioccsz);
 		ctrl->iorcsz = le32_to_cpu(id->iorcsz);
@@ -3530,7 +3551,7 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid)
 			|= BDI_CAP_STABLE_WRITES;
 
 	blk_queue_flag_set(QUEUE_FLAG_NONROT, ns->queue);
-	if (ctrl->ops->flags & NVME_F_PCI_P2PDMA)
+	if (ctrl->t_flags & NVME_F_PCI_P2PDMA)
 		blk_queue_flag_set(QUEUE_FLAG_PCI_P2PDMA, ns->queue);
 
 	ns->queue->queuedata = ns;
@@ -4061,7 +4082,9 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
 	init_rwsem(&ctrl->namespaces_rwsem);
 	ctrl->dev = dev;
 	ctrl->ops = ops;
+	ctrl->t_flags = ops->flags;
 	ctrl->max_meta_segments = ops->max_meta_segments;
+	ctrl->pi_blkszs = ops->pi_blkszs;
 	ctrl->quirks = quirks;
 	INIT_WORK(&ctrl->scan_work, nvme_scan_work);
 	INIT_WORK(&ctrl->async_event_work, nvme_async_event_work);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 4c6b6fc18560..88e82250bb3d 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -262,7 +262,10 @@ struct nvme_ctrl {
 	struct work_struct fw_act_work;
 	unsigned long events;
 
+	/* host transport attributes */
+	u32 t_flags;	/* transport flags: see enum nvme_transport_flags */
 	u32 max_meta_segments;
+	u32 pi_blkszs;	/* each bit is a power of 2. See ns->lba_shift */
 
 #ifdef CONFIG_NVME_MULTIPATH
 	/* asymmetric namespace access: */
@@ -402,13 +405,34 @@ static inline bool nvme_ns_has_pi(struct nvme_ns *ns)
 	return ns->pi_type && ns->ms == sizeof(struct t10_pi_tuple);
 }
 
+enum nvme_transport_flags {
+	NVME_F_FABRICS			= 1 << 0,
+	NVME_F_PCI_P2PDMA		= 1 << 1,
+
+	/* Metadata requires use of a separate blk_integrity_rq buffer */
+	NVME_F_METADATA_SUPPORTED	= 1 << 8,
+
+	/*
+	 * The PI types supported by the host transport, if it participates
+	 * in PI checking or can insert/strip PI on behalf of the OS.
+	 */
+	NVME_F_PI_TYPE_MASK		= 0x7 << 9,
+	NVME_F_PI_TYPE1_PROTECTION	= 1 << 9,
+	NVME_F_PI_TYPE2_PROTECTION	= 1 << 10,
+	NVME_F_PI_TYPE3_PROTECTION	= 1 << 11,
+
+	/* Indicates that the host transport can insert/strip PI for
+	 * presentation to the controller. I.E. OS does not have metadata/PI.
+	 * Transport will insert PI/meta on TX and strip PI/meta on RX.
+	 * Similar to PRACT=1.
+	 */
+	NVME_F_PI_TXINSERT_RXSTRIP	= 1 << 12,
+};
+
 struct nvme_ctrl_ops {
 	const char *name;
 	struct module *module;
-	unsigned int flags;
-#define NVME_F_FABRICS			(1 << 0)
-#define NVME_F_METADATA_SUPPORTED	(1 << 1)
-#define NVME_F_PCI_P2PDMA		(1 << 2)
+	unsigned int flags;		/* see enum nvme_transport_flags */
 	int (*reg_read32)(struct nvme_ctrl *ctrl, u32 off, u32 *val);
 	int (*reg_write32)(struct nvme_ctrl *ctrl, u32 off, u32 val);
 	int (*reg_read64)(struct nvme_ctrl *ctrl, u32 off, u64 *val);
@@ -417,6 +441,7 @@ struct nvme_ctrl_ops {
 	void (*delete_ctrl)(struct nvme_ctrl *ctrl);
 	int (*get_address)(struct nvme_ctrl *ctrl, char *buf, int size);
 	u32 max_meta_segments;
+	u32 pi_blkszs;	/* each bit is a power of 2. See ns->lba_shift */
 };
 
 #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 7cbd2fbda743..f56038294c23 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2683,6 +2683,9 @@ static const struct nvme_ctrl_ops nvme_pci_ctrl_ops = {
 	.name			= "pcie",
 	.module			= THIS_MODULE,
 	.flags			= NVME_F_METADATA_SUPPORTED |
+				  NVME_F_PI_TYPE_MASK | /* all types presented
+							 * by the controller
+							 */
 				  NVME_F_PCI_P2PDMA,
 	.reg_read32		= nvme_pci_reg_read32,
 	.reg_write32		= nvme_pci_reg_write32,
@@ -2692,7 +2695,8 @@ static const struct nvme_ctrl_ops nvme_pci_ctrl_ops = {
 	.get_address		= nvme_pci_get_address,
 	/* PCI supports metadata via single segment separate buffer only */
 	.max_meta_segments	= 1,
-
+	/* Support is whatever the controller presents */
+	.pi_blkszs		= 0xFFFFFFFF,
 };
 
 static int nvme_dev_map(struct nvme_dev *dev)
-- 
2.16.4


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [RFC 5/6] nvme: Add pi_flags to nvme_request for transport pi support
  2020-02-24 18:48 [RFC 0/6] nvme: core layer metadata/pi support James Smart
                   ` (3 preceding siblings ...)
  2020-02-24 18:48 ` [RFC 4/6] nvme: Add t_flags and pi_blkszs ctrl and ops attributes James Smart
@ 2020-02-24 18:48 ` James Smart
  2020-02-24 18:48 ` [RFC 6/6] nvme: Introduce NVME_INLINE_PROT_SG_CNT James Smart
  5 siblings, 0 replies; 8+ messages in thread
From: James Smart @ 2020-02-24 18:48 UTC (permalink / raw)
  To: linux-nvme; +Cc: James Smart

Create a new field called pi_flags in the nvme request.   This field
uses the enum nvme_transport_flags definition. However, they are set in
a different manner:
 NVME_F_FABRICS and NVME_F_PCI_P2PDMA will not be set.
 NVME_F_METADATA_SUPPORTED will not be set. The transport must look
   for blk_request_rq() to determine that there is metadata.
   Note: There may or may not be PI when set.
 If there is PI and metadata, only the NVME_F_PI_TYPE_xxx flag for the
   ns's pi type will be set.
 If there is PI and no metadata, the NVME_F_PI_TXINSERT_RXSTRIP flag
   may be set.

Signed-off-by: James Smart <jsmart2021@gmail.com>
---
 drivers/nvme/host/core.c | 56 +++++++++++++++++++++++++++++++++++++++++++-----
 drivers/nvme/host/nvme.h |  1 +
 2 files changed, 52 insertions(+), 5 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index ab03311105c3..ec808e7d395d 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -470,6 +470,7 @@ static inline void nvme_clear_nvme_request(struct request *req)
 		nvme_req(req)->retries = 0;
 		nvme_req(req)->flags = 0;
 		nvme_req(req)->ns = NULL;
+		nvme_req(req)->pi_flags = 0;
 		req->rq_flags |= RQF_DONTPREP;
 	}
 }
@@ -704,28 +705,73 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
 
 	if (ns->ms) {
 		/*
-		 * If formated with metadata, the block layer always provides a
-		 * metadata buffer if CONFIG_BLK_DEV_INTEGRITY is enabled.  Else
-		 * we enable the PRACT bit for protection information or set the
-		 * namespace capacity to zero to prevent any I/O.
+		 * When formatted for metadata, a separate metadata buffer
+		 * is required (via blk_integrity_rq()) with one exception:
+		 * if the ns is formatted for PI, then either the host
+		 * transport can insert/strip the pi metadata or the io
+		 * can be issued with PRACT=1 to have the controller
+		 * insert/strip. It's preferred the host transport does
+		 * the insert/strip if possible as it protects fabric
+		 * traversal.
 		 */
 		if (!blk_integrity_rq(req)) {
 			if (WARN_ON_ONCE(!nvme_ns_has_pi(ns)))
 				return BLK_STS_NOTSUPP;
+			/* set both flags for now. Code below will resolve
+			 * to one setting after host support of pi type is
+			 * checked.
+			 */
+			nvme_req(req)->pi_flags |=
+				ctrl->t_flags & NVME_F_PI_TXINSERT_RXSTRIP;
 			control |= NVME_RW_PRINFO_PRACT;
-		}
+
+		} else if (!(ctrl->t_flags & NVME_F_METADATA_SUPPORTED) ||
+			   (!ns->ext && ctrl->t_flags & NVME_F_FABRICS) ||
+			   (ns->ext && !(ctrl->t_flags & NVME_F_FABRICS)))
+			return BLK_STS_NOTSUPP;
 
 		switch (ns->pi_type) {
 		case NVME_NS_DPS_PI_TYPE3:
+			nvme_req(req)->pi_flags |=
+				ctrl->t_flags & NVME_F_PI_TYPE3_PROTECTION;
 			control |= NVME_RW_PRINFO_PRCHK_GUARD;
 			break;
 		case NVME_NS_DPS_PI_TYPE1:
+			nvme_req(req)->pi_flags |=
+				ctrl->t_flags & NVME_F_PI_TYPE1_PROTECTION;
+			goto set_guard_check;
 		case NVME_NS_DPS_PI_TYPE2:
+			nvme_req(req)->pi_flags |=
+				ctrl->t_flags & NVME_F_PI_TYPE2_PROTECTION;
+set_guard_check:
 			control |= NVME_RW_PRINFO_PRCHK_GUARD |
 					NVME_RW_PRINFO_PRCHK_REF;
 			cmnd->rw.reftag = cpu_to_le32(t10_pi_ref_tag(req));
 			break;
 		}
+
+		/* resolve insert/strip:
+		 * if no host pi support, leave PRACT. Otherwise, clear PRACT.
+		 */
+		if (nvme_req(req)->pi_flags & NVME_F_PI_TXINSERT_RXSTRIP) {
+			if (!(nvme_req(req)->pi_flags & NVME_F_PI_TYPE_MASK))
+				nvme_req(req)->pi_flags &=
+						~NVME_F_PI_TXINSERT_RXSTRIP;
+			else
+				control &= ~NVME_RW_PRINFO_PRACT;
+		}
+
+		/* If no metadata:
+		 *  Either the transport or controller will insert/strip
+		 *  metadata.
+		 * If separate metadata:
+		 *  Pci transport will pass metadata as separate buffer to
+		 *    controller and controller will perform PI actions.
+		 *  Fabric transports will minimally multiplex the separate
+		 *    buffer into interleaved data for the controller.
+		 *    If PI flags are set, the transport may perform
+		 *    PI checking as payload egresses/ingresses.
+		 */
 	}
 
 	cmnd->rw.control = cpu_to_le16(control);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 88e82250bb3d..bedee3fa9d41 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -141,6 +141,7 @@ struct nvme_request {
 	u16			status;
 	struct nvme_ctrl	*ctrl;
 	struct nvme_ns		*ns;
+	u32			pi_flags; /* see enum nvme_transport_flags */
 };
 
 /*
-- 
2.16.4


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [RFC 6/6] nvme: Introduce NVME_INLINE_PROT_SG_CNT
  2020-02-24 18:48 [RFC 0/6] nvme: core layer metadata/pi support James Smart
                   ` (4 preceding siblings ...)
  2020-02-24 18:48 ` [RFC 5/6] nvme: Add pi_flags to nvme_request for transport pi support James Smart
@ 2020-02-24 18:48 ` James Smart
  5 siblings, 0 replies; 8+ messages in thread
From: James Smart @ 2020-02-24 18:48 UTC (permalink / raw)
  To: linux-nvme; +Cc: Israel Rukshin, James Smart

From: Israel Rukshin <israelr@mellanox.com>

SGL size of metadata is usually small. Thus, 1 inline sg should
cover most cases. The macro will be used for pre-allocate a single SGL
entry for metadata information. The preallocation of small inline SGLs
depends on SG_CHAIN capability so if the ARCH doesn't support SG_CHAIN,
use the runtime allocation for the SGL. This patch is in preparation for
adding metadata (T10-PI) over fabric support.

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
---
Differences from Israel's original patch:
  Renamed NVME_INLINE_PROT_SG_CNT to NVME_INLINE_META_SG_CNT
---
 drivers/nvme/host/nvme.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index bedee3fa9d41..d5fc3b9d9b77 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -31,8 +31,10 @@ extern unsigned int admin_timeout;
 
 #ifdef CONFIG_ARCH_NO_SG_CHAIN
 #define  NVME_INLINE_SG_CNT  0
+#define  NVME_INLINE_META_SG_CNT  0
 #else
 #define  NVME_INLINE_SG_CNT  2
+#define  NVME_INLINE_META_SG_CNT  1
 #endif
 
 extern struct workqueue_struct *nvme_wq;
-- 
2.16.4


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC 3/6] nvme: Introduce max_meta_segments ctrl and ops attribute
  2020-02-24 18:48 ` [RFC 3/6] nvme: Introduce max_meta_segments ctrl and ops attribute James Smart
@ 2020-02-25  9:49   ` Max Gurtovoy
  0 siblings, 0 replies; 8+ messages in thread
From: Max Gurtovoy @ 2020-02-25  9:49 UTC (permalink / raw)
  To: James Smart, linux-nvme; +Cc: Israel Rukshin, Sagi Grimberg


On 2/24/2020 8:48 PM, James Smart wrote:
> This patch was originally posted by Max Gurtovoy <maxg@mellanox.com>
>
> This patch doesn't change any logic, and is needed as a preparation
> for adding PI support for fabrics drivers.
>
> This patch parameterizes the number of sgl segments supported for
> a separate metadata buffer.
>
> The parameter is added to the nvme_ctrl_ops struct and the nvme_ctrl_ops
> struct. nvme_init_ctrl() was modified to initialize the controller
> value from the ops value. It was done in this manner such that if the
> transport supports a singular/unchanged value, it can be set in the ops
> struct and no other code is necessary. However, if the transport must
> set the value on a per-controller basis, likely due to differents in the
> host transport hardware, it can directly modify the field in the ctrl
> struct.

I don't think we need to add a ops for that and override it per ctrl if 
needed later on.

And if a new op is preferred, please add a callback (see bellow).

Btw,

This was already reviewed by Sagi and Marting so it would be nice to 
leave it as-is (or maybe only rename it to max_metadata_segments).


> CC: Max Gurtovoy <maxg@mellanox.com>
> CC: Israel Rukshin <israelr@mellanox.com>
> CC: Sagi Grimberg <sagi@grimberg.me>
> Signed-off-by: James Smart <jsmart2021@gmail.com>
>
> ---
> Modifications to Max's patch:
>   Rename max_integrity_segments to max_meta_segments.
>   Add parameter to ops struct and initialize ctrl field in nvme_init_ctrl.
> ---
>   drivers/nvme/host/core.c | 12 ++++++++----
>   drivers/nvme/host/nvme.h |  3 +++
>   drivers/nvme/host/pci.c  |  3 +++
>   3 files changed, 14 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 4f08c637ec2e..8421eafa81c6 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -1624,7 +1624,8 @@ static int nvme_getgeo(struct block_device *bdev, struct hd_geometry *geo)
>   }
>   
>   #ifdef CONFIG_BLK_DEV_INTEGRITY
> -static void nvme_init_integrity(struct gendisk *disk, u16 ms, u8 pi_type)
> +static void nvme_init_integrity(struct gendisk *disk, u16 ms, u8 pi_type,
> +				u32 max_meta_segments)
>   {
>   	struct blk_integrity integrity;
>   
> @@ -1647,10 +1648,11 @@ static void nvme_init_integrity(struct gendisk *disk, u16 ms, u8 pi_type)
>   	}
>   	integrity.tuple_size = ms;
>   	blk_integrity_register(disk, &integrity);
> -	blk_queue_max_integrity_segments(disk->queue, 1);
> +	blk_queue_max_integrity_segments(disk->queue, max_meta_segments);
>   }
>   #else
> -static void nvme_init_integrity(struct gendisk *disk, u16 ms, u8 pi_type)
> +static void nvme_init_integrity(struct gendisk *disk, u16 ms, u8 pi_type,
> +				u32 max_meta_segments)
>   {
>   }
>   #endif /* CONFIG_BLK_DEV_INTEGRITY */
> @@ -1805,7 +1807,8 @@ static void nvme_update_disk_info(struct gendisk *disk,
>   
>   	if (ns->ms && !ns->ext &&
>   	    (ns->ctrl->ops->flags & NVME_F_METADATA_SUPPORTED))
> -		nvme_init_integrity(disk, ns->ms, ns->pi_type);
> +		nvme_init_integrity(disk, ns->ms, ns->pi_type,
> +				    ns->ctrl->max_meta_segments);
>   	if ((ns->ms && !nvme_ns_has_pi(ns) && !blk_get_integrity(disk)) ||
>   	    ns->lba_shift > PAGE_SHIFT)
>   		capacity = 0;
> @@ -4058,6 +4061,7 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
>   	init_rwsem(&ctrl->namespaces_rwsem);
>   	ctrl->dev = dev;
>   	ctrl->ops = ops;
> +	ctrl->max_meta_segments = ops->max_meta_segments;
>   	ctrl->quirks = quirks;
>   	INIT_WORK(&ctrl->scan_work, nvme_scan_work);
>   	INIT_WORK(&ctrl->async_event_work, nvme_async_event_work);
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index 0725cc7c7a7a..4c6b6fc18560 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -262,6 +262,8 @@ struct nvme_ctrl {
>   	struct work_struct fw_act_work;
>   	unsigned long events;
>   
> +	u32 max_meta_segments;
> +
>   #ifdef CONFIG_NVME_MULTIPATH
>   	/* asymmetric namespace access: */
>   	u8 anacap;
> @@ -414,6 +416,7 @@ struct nvme_ctrl_ops {
>   	void (*submit_async_event)(struct nvme_ctrl *ctrl);
>   	void (*delete_ctrl)(struct nvme_ctrl *ctrl);
>   	int (*get_address)(struct nvme_ctrl *ctrl, char *buf, int size);
> +	u32 max_meta_segments;

This shouldn't be hard-coded value.

please add a callback:

u32 (*get_max_meta_segments)(struct nvme_ctrl *ctrl);


>   };
>   
>   #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index da392b50f73e..7cbd2fbda743 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -2690,6 +2690,9 @@ static const struct nvme_ctrl_ops nvme_pci_ctrl_ops = {
>   	.free_ctrl		= nvme_pci_free_ctrl,
>   	.submit_async_event	= nvme_pci_submit_async_event,
>   	.get_address		= nvme_pci_get_address,
> +	/* PCI supports metadata via single segment separate buffer only */
> +	.max_meta_segments	= 1,
> +
>   };
>   
>   static int nvme_dev_map(struct nvme_dev *dev)

_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, back to index

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-24 18:48 [RFC 0/6] nvme: core layer metadata/pi support James Smart
2020-02-24 18:48 ` [RFC 1/6] nvme: Add ns to nvme_request struct James Smart
2020-02-24 18:48 ` [RFC 2/6] nvme: make nvme_ns_has_pi accessible to transports James Smart
2020-02-24 18:48 ` [RFC 3/6] nvme: Introduce max_meta_segments ctrl and ops attribute James Smart
2020-02-25  9:49   ` Max Gurtovoy
2020-02-24 18:48 ` [RFC 4/6] nvme: Add t_flags and pi_blkszs ctrl and ops attributes James Smart
2020-02-24 18:48 ` [RFC 5/6] nvme: Add pi_flags to nvme_request for transport pi support James Smart
2020-02-24 18:48 ` [RFC 6/6] nvme: Introduce NVME_INLINE_PROT_SG_CNT James Smart

Linux-NVME Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-nvme/0 linux-nvme/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-nvme linux-nvme/ https://lore.kernel.org/linux-nvme \
		linux-nvme@lists.infradead.org
	public-inbox-index linux-nvme

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.infradead.lists.linux-nvme


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git