All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1 0/4] Add command id quirk for fabrics
@ 2021-11-08 14:46 Max Gurtovoy
  2021-11-08 14:46 ` [PATCH 1/1 nvmecli] fabrics: add new --skip-cid-gen flag to connect cmd Max Gurtovoy
                   ` (6 more replies)
  0 siblings, 7 replies; 26+ messages in thread
From: Max Gurtovoy @ 2021-11-08 14:46 UTC (permalink / raw)
  To: linux-nvme, hch, kbusch, sagi
  Cc: chaitanyak, oren, benishay, borisp, aviadye, idanb, jsmart2021,
	Max Gurtovoy

Hi all,
Commit a2941f6aa71a ("nvme: add command id quirk for apple controllers")
was merged to fix a regression in apple controllers that was introduced
after merging commit e7006de6c238 ("nvme: code command_id with a genctr
for use-after-free validation").

This series is comming to enable the same quirk for fabrics controllers
that used the command id index in the same way that was probably used in
apple controllers.

This series is a complementary series to NVMe-CLi and libnvme patches
that will introduce a new flag for "nvme connect" command:
"--skip-cid-gen". Using this flag will cause enabling
NVME_QUIRK_SKIP_CID_GEN quirk and will actually add the ability to
ignore the command id generation for other transport alongside PCI
transport.

default value of the flag is false.

Usage:
nvme connect -t <transport> -n <nqn> -a <addr> -s <sid> --skip-cid-gen
or
nvme connect -t <transport> -n <nqn> -a <addr> -s <sid> -p

Max Gurtovoy (4):
  nvme-fabrics: add command id quirk for fabrics controllers
  nvme-rdma: add command id quirk for RDMA controllers
  nvme-tcp: add command id quirk for TCP controllers
  nvme-fc: add command id quirk for FC controllers

 drivers/nvme/host/fabrics.c | 7 ++++++-
 drivers/nvme/host/fabrics.h | 2 ++
 drivers/nvme/host/fc.c      | 6 +++++-
 drivers/nvme/host/rdma.c    | 7 +++++--
 drivers/nvme/host/tcp.c     | 6 +++++-
 5 files changed, 23 insertions(+), 5 deletions(-)

-- 
2.18.1



^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 1/1 nvmecli] fabrics: add new --skip-cid-gen flag to connect cmd
  2021-11-08 14:46 [PATCH v1 0/4] Add command id quirk for fabrics Max Gurtovoy
@ 2021-11-08 14:46 ` Max Gurtovoy
  2021-11-08 14:46 ` [PATCH 1/1 libnvme] fabrics: add support for new cli --skip-cid-gen flag Max Gurtovoy
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 26+ messages in thread
From: Max Gurtovoy @ 2021-11-08 14:46 UTC (permalink / raw)
  To: linux-nvme, hch, kbusch, sagi
  Cc: chaitanyak, oren, benishay, borisp, aviadye, idanb, jsmart2021,
	Max Gurtovoy

Setting this flag will enable NVME_QUIRK_SKIP_CID_GEN quirk and will
actually add the ability to ignore the command id generation for other
transport alongside PCI transport that is possible today.

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 Documentation/nvme-connect.html | 15 ++++++++++++++-
 fabrics.c                       | 10 ++++++----
 2 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/Documentation/nvme-connect.html b/Documentation/nvme-connect.html
index 8e25db9..d0a53d6 100644
--- a/Documentation/nvme-connect.html
+++ b/Documentation/nvme-connect.html
@@ -767,7 +767,8 @@ nvme-connect(1) Manual Page
                 [--duplicate-connect      | -D]
                 [--disable-sqflow         | -d]
                 [--hdr-digest             | -g]
-                [--data-digest            | -G]</pre>
+                [--data-digest            | -G]
+                [--skip-cid-gen           | -p]</pre>
 <div class="attribution">
 </div></div>
 </div>
@@ -1028,6 +1029,18 @@ cellspacing="0" cellpadding="4">
         Generates/verifies data digest (TCP).
 </p>
 </dd>
+<dt class="hdlist1">
+-p
+</dt>
+<dt class="hdlist1">
+--skip-cid-gen
+</dt>
+<dd>
+<p>
+        Skip command id generation control that was added to protect against buggy
+	fabric controllers.
+</p>
+</dd>
 </dl></div>
 </div>
 </div>
diff --git a/fabrics.c b/fabrics.c
index 012bcb8..02a4108 100644
--- a/fabrics.c
+++ b/fabrics.c
@@ -72,6 +72,7 @@ static const char *nvmf_dup_connect	= "allow duplicate connections between same
 static const char *nvmf_disable_sqflow	= "disable controller sq flow control (default false)";
 static const char *nvmf_hdr_digest	= "enable transport protocol header digest (TCP transport)";
 static const char *nvmf_data_digest	= "enable transport protocol data digest (TCP transport)";
+static const char *nvmf_skip_cid_gen	= "skip command id generation control (default false)";
 static const char *nvmf_config_file	= "Use specified JSON configuration file or 'none' to disable";
 
 #define NVMF_OPTS(c)									\
@@ -93,7 +94,8 @@ static const char *nvmf_config_file	= "Use specified JSON configuration file or
 	OPT_FLAG("duplicate-connect", 'D', &c.duplicate_connect,  nvmf_dup_connect),	\
 	OPT_FLAG("disable-sqflow",    'd', &c.disable_sqflow,     nvmf_disable_sqflow),	\
 	OPT_FLAG("hdr-digest",        'g', &c.hdr_digest,         nvmf_hdr_digest),	\
-	OPT_FLAG("data-digest",       'G', &c.data_digest,        nvmf_data_digest)     \
+	OPT_FLAG("data-digest",       'G', &c.data_digest,        nvmf_data_digest),    \
+	OPT_FLAG("skip-cid-gen",      'p', &c.skip_cid_gen,       nvmf_skip_cid_gen)	\
 
 
 static void space_strip_len(int max, char *str)
@@ -363,7 +365,7 @@ static int discover_from_conf_file(nvme_host_t h, const char *desc,
 		if (!c)
 			goto next;
 		errno = 0;
-		ret = nvmf_add_ctrl(h, c, &cfg, false);
+		ret = nvmf_add_ctrl(h, c, &cfg, false, false);
 		if (!ret) {
 			__discover(c, &cfg, raw, connect,
 				   persistent, flags);
@@ -513,7 +515,7 @@ int nvmf_discover(const char *desc, int argc, char **argv, bool connect)
 			ret = errno;
 			goto out_free;
 		}
-		ret = nvmf_add_ctrl(h, c, &cfg, false);
+		ret = nvmf_add_ctrl(h, c, &cfg, false, false);
 		if (ret) {
 			nvme_msg(LOG_ERR,
 				 "failed to add controller, error %d\n", errno);
@@ -626,7 +628,7 @@ int nvmf_connect(const char *desc, int argc, char **argv)
 	}
 
 	errno = 0;
-	ret = nvmf_add_ctrl(h, c, &cfg, cfg.disable_sqflow);
+	ret = nvmf_add_ctrl(h, c, &cfg, cfg.disable_sqflow, cfg.skip_cid_gen);
 	if (ret)
 		nvme_msg(LOG_ERR, "no controller found\n");
 out_free:
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 1/1 libnvme] fabrics: add support for new cli --skip-cid-gen flag
  2021-11-08 14:46 [PATCH v1 0/4] Add command id quirk for fabrics Max Gurtovoy
  2021-11-08 14:46 ` [PATCH 1/1 nvmecli] fabrics: add new --skip-cid-gen flag to connect cmd Max Gurtovoy
@ 2021-11-08 14:46 ` Max Gurtovoy
  2021-11-08 14:47 ` [PATCH 1/4] nvme-fabrics: add command id quirk for fabrics controllers Max Gurtovoy
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 26+ messages in thread
From: Max Gurtovoy @ 2021-11-08 14:46 UTC (permalink / raw)
  To: linux-nvme, hch, kbusch, sagi
  Cc: chaitanyak, oren, benishay, borisp, aviadye, idanb, jsmart2021,
	Max Gurtovoy

Setting this flag will enable NVME_QUIRK_SKIP_CID_GEN quirk and will
actually add the ability to ignore the command id generation for
other transport alongside PCI transport that is possible today.

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 doc/libnvme.rst          |  3 +++
 examples/discover-loop.c |  2 +-
 pynvme/nvme.i            |  2 ++
 src/nvme/fabrics.c       | 10 +++++++---
 src/nvme/fabrics.h       |  4 +++-
 src/nvme/tree.c          |  7 +++++++
 src/nvme/tree.h          |  9 +++++++++
 7 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/doc/libnvme.rst b/doc/libnvme.rst
index d52462f..e2555f9 100644
--- a/doc/libnvme.rst
+++ b/doc/libnvme.rst
@@ -11435,6 +11435,7 @@ errno set otherwise.
     bool disable_sqflow;
     bool hdr_digest;
     bool data_digest;
+    bool skip_cid_gen;
   };
 
 **Members**
@@ -11496,6 +11497,8 @@ errno set otherwise.
 ``data_digest``
   Generate/verify data digest (TCP)
 
+``skip_cid_gen``
+  Skip command id generation control
 
 
 .. c:function:: int nvmf_add_ctrl_opts (struct nvme_fabrics_config * cfg)
diff --git a/examples/discover-loop.c b/examples/discover-loop.c
index c2a2385..66e9d8b 100644
--- a/examples/discover-loop.c
+++ b/examples/discover-loop.c
@@ -70,7 +70,7 @@ int main()
 		fprintf(stderr, "Failed to allocate memory\n");
 		return ENOMEM;
 	}
-	ret = nvmf_add_ctrl(h, c, &cfg, false);
+	ret = nvmf_add_ctrl(h, c, &cfg, false, false);
 	if (ret < 0) {
 		fprintf(stderr, "no controller found\n");
 		return errno;
diff --git a/pynvme/nvme.i b/pynvme/nvme.i
index 4c45a04..cfa0a44 100644
--- a/pynvme/nvme.i
+++ b/pynvme/nvme.i
@@ -145,6 +145,8 @@ static int discover_err = 0;
       temp.hdr_digest = PyObject_IsTrue(value) ? true : false;
     if (!PyUnicode_CompareWithASCIIString(key, "data_digest"))
       temp.data_digest = PyObject_IsTrue(value) ? true : false;
+    if (!PyUnicode_CompareWithASCIIString(key, "skip_cid_gen"))
+      temp.skip_cid_gen = PyObject_IsTrue(value) ? true : false;
   }
   $1 = &temp;
  };
diff --git a/src/nvme/fabrics.c b/src/nvme/fabrics.c
index 94cecd0..2005dc4 100644
--- a/src/nvme/fabrics.c
+++ b/src/nvme/fabrics.c
@@ -162,6 +162,7 @@ static struct nvme_fabrics_config *merge_config(nvme_ctrl_t c,
 	UPDATE_CFG_OPTION(ctrl_cfg, cfg, disable_sqflow, false);
 	UPDATE_CFG_OPTION(ctrl_cfg, cfg, hdr_digest, false);
 	UPDATE_CFG_OPTION(ctrl_cfg, cfg, data_digest, false);
+	UPDATE_CFG_OPTION(ctrl_cfg, cfg, skip_cid_gen, false);
 
 	return ctrl_cfg;
 }
@@ -445,6 +446,8 @@ static int build_options(nvme_host_t h, nvme_ctrl_t c, char **argstr)
 			      cfg->duplicate_connect) ||
 	    add_bool_argument(argstr, "disable_sqflow",
 			      cfg->disable_sqflow) ||
+	    add_bool_argument(argstr, "skip_cid_gen",
+			      cfg->skip_cid_gen) ||
 	    (!strcmp(transport, "tcp") &&
 	     add_bool_argument(argstr, "hdr_digest", cfg->hdr_digest)) ||
 	    (!strcmp(transport, "tcp") &&
@@ -531,13 +534,14 @@ int nvmf_add_ctrl_opts(nvme_ctrl_t c, struct nvme_fabrics_config *cfg)
 
 int nvmf_add_ctrl(nvme_host_t h, nvme_ctrl_t c,
 		  const struct nvme_fabrics_config *cfg,
-		  bool disable_sqflow)
+		  bool disable_sqflow, bool skip_cid_gen)
 {
 	char *argstr;
 	int ret;
 
 	cfg = merge_config(c, cfg);
 	nvme_ctrl_disable_sqflow(c, disable_sqflow);
+	nvme_ctrl_skip_cid_gen(c, skip_cid_gen);
 	nvme_ctrl_set_discovered(c, true);
 	if (traddr_is_hostname(c)) {
 		ret = hostname2traddr(c);
@@ -644,7 +648,7 @@ nvme_ctrl_t nvmf_connect_disc_entry(nvme_host_t h,
 	if (e->treq & NVMF_TREQ_DISABLE_SQFLOW)
 		disable_sqflow = true;
 
-	ret = nvmf_add_ctrl(h, c, cfg, disable_sqflow);
+	ret = nvmf_add_ctrl(h, c, cfg, disable_sqflow, false);
 	if (!ret)
 		return c;
 
@@ -654,7 +658,7 @@ nvme_ctrl_t nvmf_connect_disc_entry(nvme_host_t h,
 		nvme_msg(LOG_INFO, "failed to connect controller, "
 			 "retry with disabling SQ flow control\n");
 		disable_sqflow = false;
-		ret = nvmf_add_ctrl(h, c, cfg, disable_sqflow);
+		ret = nvmf_add_ctrl(h, c, cfg, disable_sqflow, false);
 		if (!ret)
 			return c;
 	}
diff --git a/src/nvme/fabrics.h b/src/nvme/fabrics.h
index 9b796be..e9e4b21 100644
--- a/src/nvme/fabrics.h
+++ b/src/nvme/fabrics.h
@@ -31,6 +31,7 @@
  * @disable_sqflow:	Disable controller sq flow control
  * @hdr_digest:		Generate/verify header digest (TCP)
  * @data_digest:	Generate/verify data digest (TCP)
+ * @skip_cid_gen:	Skip command id generation control
  */
 struct nvme_fabrics_config {
 	int queue_size;
@@ -47,6 +48,7 @@ struct nvme_fabrics_config {
 	bool disable_sqflow;
 	bool hdr_digest;
 	bool data_digest;
+	bool skip_cid_gen;
 };
 
 /**
@@ -133,7 +135,7 @@ int nvmf_add_ctrl_opts(nvme_ctrl_t c, struct nvme_fabrics_config *cfg);
  */
 int nvmf_add_ctrl(nvme_host_t h, nvme_ctrl_t c,
 		  const struct nvme_fabrics_config *cfg,
-		  bool disable_sqflow);
+		  bool disable_sqflow, bool skip_cid_gen);
 
 /**
  * nvmf_get_discovery_log() -
diff --git a/src/nvme/tree.c b/src/nvme/tree.c
index 0f72320..df1ac31 100644
--- a/src/nvme/tree.c
+++ b/src/nvme/tree.c
@@ -700,6 +700,13 @@ void nvme_ctrl_disable_sqflow(nvme_ctrl_t c, bool disable_sqflow)
 		c->s->h->r->modified = true;
 }
 
+void nvme_ctrl_skip_cid_gen(nvme_ctrl_t c, bool skip_cid_gen)
+{
+	c->cfg.skip_cid_gen = skip_cid_gen;
+	if (c->s && c->s->h && c->s->h->r)
+		c->s->h->r->modified = true;
+}
+
 void nvme_ctrl_set_discovered(nvme_ctrl_t c, bool discovered)
 {
 	c->discovered = discovered;
diff --git a/src/nvme/tree.h b/src/nvme/tree.h
index f9e7d37..c819556 100644
--- a/src/nvme/tree.h
+++ b/src/nvme/tree.h
@@ -822,6 +822,15 @@ bool nvme_ctrl_is_persistent(nvme_ctrl_t c);
  */
 void nvme_ctrl_disable_sqflow(nvme_ctrl_t c, bool disable_sqflow);
 
+/**
+ * nvme_ctrl_skip_cid_gen() -
+ * @c:
+ * @skip_cid_gen:
+ *
+ * Return:
+ */
+void nvme_ctrl_skip_cid_gen(nvme_ctrl_t c, bool skip_cid_gen);
+
 /**
  * nvme_ctrl_identify() -
  * @c:
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 1/4] nvme-fabrics: add command id quirk for fabrics controllers
  2021-11-08 14:46 [PATCH v1 0/4] Add command id quirk for fabrics Max Gurtovoy
  2021-11-08 14:46 ` [PATCH 1/1 nvmecli] fabrics: add new --skip-cid-gen flag to connect cmd Max Gurtovoy
  2021-11-08 14:46 ` [PATCH 1/1 libnvme] fabrics: add support for new cli --skip-cid-gen flag Max Gurtovoy
@ 2021-11-08 14:47 ` Max Gurtovoy
  2021-11-08 14:47 ` [PATCH 2/4] nvme-rdma: add command id quirk for RDMA controllers Max Gurtovoy
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 26+ messages in thread
From: Max Gurtovoy @ 2021-11-08 14:47 UTC (permalink / raw)
  To: linux-nvme, hch, kbusch, sagi
  Cc: chaitanyak, oren, benishay, borisp, aviadye, idanb, jsmart2021,
	Max Gurtovoy

Commit a2941f6aa71a ("nvme: add command id quirk for apple controllers")
introduced a quirk for apple controllers that were affected by adding
the new generation bit inside the command id. Add this quirk for fabrics
controllers as well, since some fabrics implementations also use command
id as an index in their logic. This quirk will be disabled by default
and can be enabled per connection.

For example (with a suitable NVMe-cli version):
nvme connect -t <transport> -n <nqn> -a <addr> -s <sid> --skip-cid-gen
or
nvme connect -t <transport> -n <nqn> -a <addr> -s <sid> -p

As was mentioned for the apple controllers commit, the driver will not
have the ability to detect bad completions when this quirk is used, but
we weren't previously checking this anyway.

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 drivers/nvme/host/fabrics.c | 7 ++++++-
 drivers/nvme/host/fabrics.h | 2 ++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index c5a2b71c5268..c8ec8ae2266a 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -549,6 +549,7 @@ static const match_table_t opt_tokens = {
 	{ NVMF_OPT_TOS,			"tos=%d"		},
 	{ NVMF_OPT_FAIL_FAST_TMO,	"fast_io_fail_tmo=%d"	},
 	{ NVMF_OPT_DISCOVERY,		"discovery"		},
+	{ NVMF_OPT_SKIP_CID_GEN,	"skip_cid_gen"		},
 	{ NVMF_OPT_ERR,			NULL			}
 };
 
@@ -827,6 +828,9 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
 		case NVMF_OPT_DISCOVERY:
 			opts->discovery_nqn = true;
 			break;
+		case NVMF_OPT_SKIP_CID_GEN:
+			opts->skip_cid_gen = true;
+			break;
 		default:
 			pr_warn("unknown parameter or missing value '%s' in ctrl creation request\n",
 				p);
@@ -954,7 +958,8 @@ EXPORT_SYMBOL_GPL(nvmf_free_options);
 				 NVMF_OPT_KATO | NVMF_OPT_HOSTNQN | \
 				 NVMF_OPT_HOST_ID | NVMF_OPT_DUP_CONNECT |\
 				 NVMF_OPT_DISABLE_SQFLOW | NVMF_OPT_DISCOVERY |\
-				 NVMF_OPT_FAIL_FAST_TMO)
+				 NVMF_OPT_FAIL_FAST_TMO |\
+				 NVMF_OPT_SKIP_CID_GEN)
 
 static struct nvme_ctrl *
 nvmf_create_ctrl(struct device *dev, const char *buf)
diff --git a/drivers/nvme/host/fabrics.h b/drivers/nvme/host/fabrics.h
index c3203ff1c654..eefb2d1d477e 100644
--- a/drivers/nvme/host/fabrics.h
+++ b/drivers/nvme/host/fabrics.h
@@ -68,6 +68,7 @@ enum {
 	NVMF_OPT_FAIL_FAST_TMO	= 1 << 20,
 	NVMF_OPT_HOST_IFACE	= 1 << 21,
 	NVMF_OPT_DISCOVERY	= 1 << 22,
+	NVMF_OPT_SKIP_CID_GEN	= 1 << 23,
 };
 
 /**
@@ -128,6 +129,7 @@ struct nvmf_ctrl_options {
 	unsigned int		nr_poll_queues;
 	int			tos;
 	int			fast_io_fail_tmo;
+	bool			skip_cid_gen;
 };
 
 /*
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 2/4] nvme-rdma: add command id quirk for RDMA controllers
  2021-11-08 14:46 [PATCH v1 0/4] Add command id quirk for fabrics Max Gurtovoy
                   ` (2 preceding siblings ...)
  2021-11-08 14:47 ` [PATCH 1/4] nvme-fabrics: add command id quirk for fabrics controllers Max Gurtovoy
@ 2021-11-08 14:47 ` Max Gurtovoy
  2021-11-08 14:47 ` [PATCH 3/4] nvme-tcp: add command id quirk for TCP controllers Max Gurtovoy
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 26+ messages in thread
From: Max Gurtovoy @ 2021-11-08 14:47 UTC (permalink / raw)
  To: linux-nvme, hch, kbusch, sagi
  Cc: chaitanyak, oren, benishay, borisp, aviadye, idanb, jsmart2021,
	Max Gurtovoy

Enable NVME_QUIRK_SKIP_CID_GEN quirk if "--skip-cid-gen" flag was set by
the user during the connect command for RDMA transport.

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 drivers/nvme/host/rdma.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 9317f26e51e0..0c4c496bd621 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -2326,6 +2326,7 @@ static struct nvme_ctrl *nvme_rdma_create_ctrl(struct device *dev,
 	struct nvme_rdma_ctrl *ctrl;
 	int ret;
 	bool changed;
+	unsigned long quirks = 0;
 
 	ctrl = kzalloc(sizeof(*ctrl), GFP_KERNEL);
 	if (!ctrl)
@@ -2382,8 +2383,10 @@ static struct nvme_ctrl *nvme_rdma_create_ctrl(struct device *dev,
 	if (!ctrl->queues)
 		goto out_free_ctrl;
 
-	ret = nvme_init_ctrl(&ctrl->ctrl, dev, &nvme_rdma_ctrl_ops,
-				0 /* no quirks, we're perfect! */);
+	if (opts->skip_cid_gen)
+		quirks |= NVME_QUIRK_SKIP_CID_GEN;
+
+	ret = nvme_init_ctrl(&ctrl->ctrl, dev, &nvme_rdma_ctrl_ops, quirks);
 	if (ret)
 		goto out_kfree_queues;
 
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 3/4] nvme-tcp: add command id quirk for TCP controllers
  2021-11-08 14:46 [PATCH v1 0/4] Add command id quirk for fabrics Max Gurtovoy
                   ` (3 preceding siblings ...)
  2021-11-08 14:47 ` [PATCH 2/4] nvme-rdma: add command id quirk for RDMA controllers Max Gurtovoy
@ 2021-11-08 14:47 ` Max Gurtovoy
  2021-11-08 14:47 ` [PATCH 4/4] nvme-fc: add command id quirk for FC controllers Max Gurtovoy
  2021-11-08 16:45 ` [PATCH v1 0/4] Add command id quirk for fabrics Keith Busch
  6 siblings, 0 replies; 26+ messages in thread
From: Max Gurtovoy @ 2021-11-08 14:47 UTC (permalink / raw)
  To: linux-nvme, hch, kbusch, sagi
  Cc: chaitanyak, oren, benishay, borisp, aviadye, idanb, jsmart2021,
	Max Gurtovoy

Enable NVME_QUIRK_SKIP_CID_GEN quirk if "--skip-cid-gen" flag was
set by the user during the connect command for TCP transport.

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 drivers/nvme/host/tcp.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 07156ea9d1a8..13f641f0033b 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2501,6 +2501,7 @@ static struct nvme_ctrl *nvme_tcp_create_ctrl(struct device *dev,
 {
 	struct nvme_tcp_ctrl *ctrl;
 	int ret;
+	unsigned long quirks = 0;
 
 	ctrl = kzalloc(sizeof(*ctrl), GFP_KERNEL);
 	if (!ctrl)
@@ -2567,7 +2568,10 @@ static struct nvme_ctrl *nvme_tcp_create_ctrl(struct device *dev,
 		goto out_free_ctrl;
 	}
 
-	ret = nvme_init_ctrl(&ctrl->ctrl, dev, &nvme_tcp_ctrl_ops, 0);
+	if (opts->skip_cid_gen)
+		quirks |= NVME_QUIRK_SKIP_CID_GEN;
+
+	ret = nvme_init_ctrl(&ctrl->ctrl, dev, &nvme_tcp_ctrl_ops, quirks);
 	if (ret)
 		goto out_kfree_queues;
 
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 4/4] nvme-fc: add command id quirk for FC controllers
  2021-11-08 14:46 [PATCH v1 0/4] Add command id quirk for fabrics Max Gurtovoy
                   ` (4 preceding siblings ...)
  2021-11-08 14:47 ` [PATCH 3/4] nvme-tcp: add command id quirk for TCP controllers Max Gurtovoy
@ 2021-11-08 14:47 ` Max Gurtovoy
  2021-11-08 16:45 ` [PATCH v1 0/4] Add command id quirk for fabrics Keith Busch
  6 siblings, 0 replies; 26+ messages in thread
From: Max Gurtovoy @ 2021-11-08 14:47 UTC (permalink / raw)
  To: linux-nvme, hch, kbusch, sagi
  Cc: chaitanyak, oren, benishay, borisp, aviadye, idanb, jsmart2021,
	Max Gurtovoy

Enable NVME_QUIRK_SKIP_CID_GEN quirk if "--skip-cid-gen" flag was
set by the user during the connect command for FC transport.

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 drivers/nvme/host/fc.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index be9892894849..919313aaa902 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3453,6 +3453,7 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
 	struct nvme_fc_ctrl *ctrl;
 	unsigned long flags;
 	int ret, idx, ctrl_loss_tmo;
+	unsigned long quirks = 0;
 
 	if (!(rport->remoteport.port_role &
 	    (FC_PORT_ROLE_NVME_DISCOVERY | FC_PORT_ROLE_NVME_TARGET))) {
@@ -3568,7 +3569,10 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
 	 * Defer this to the connect path.
 	 */
 
-	ret = nvme_init_ctrl(&ctrl->ctrl, dev, &nvme_fc_ctrl_ops, 0);
+	if (opts->skip_cid_gen)
+		quirks |= NVME_QUIRK_SKIP_CID_GEN;
+
+	ret = nvme_init_ctrl(&ctrl->ctrl, dev, &nvme_fc_ctrl_ops, quirks);
 	if (ret)
 		goto out_cleanup_admin_q;
 
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Add command id quirk for fabrics
  2021-11-08 14:46 [PATCH v1 0/4] Add command id quirk for fabrics Max Gurtovoy
                   ` (5 preceding siblings ...)
  2021-11-08 14:47 ` [PATCH 4/4] nvme-fc: add command id quirk for FC controllers Max Gurtovoy
@ 2021-11-08 16:45 ` Keith Busch
  2021-11-09  8:09   ` Christoph Hellwig
  6 siblings, 1 reply; 26+ messages in thread
From: Keith Busch @ 2021-11-08 16:45 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: linux-nvme, hch, sagi, chaitanyak, oren, benishay, borisp,
	aviadye, idanb, jsmart2021

On Mon, Nov 08, 2021 at 04:46:57PM +0200, Max Gurtovoy wrote:
> Hi all,
> Commit a2941f6aa71a ("nvme: add command id quirk for apple controllers")
> was merged to fix a regression in apple controllers that was introduced
> after merging commit e7006de6c238 ("nvme: code command_id with a genctr
> for use-after-free validation").
> 
> This series is comming to enable the same quirk for fabrics controllers
> that used the command id index in the same way that was probably used in
> apple controllers.

If there really are targets behaving this way, then this looks good and
necessary, however unfortunate. A TCP target triggered the need for
valid tag validation in the first place.

Are there really fabrics targets behaving this way, or is this series
anticipating they might exist? Apple disregarding specs is nothing new,
but I would have hoped no other targets would do this since most vendors
care about interop.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Add command id quirk for fabrics
  2021-11-08 16:45 ` [PATCH v1 0/4] Add command id quirk for fabrics Keith Busch
@ 2021-11-09  8:09   ` Christoph Hellwig
  2021-11-09 12:08     ` Max Gurtovoy
  0 siblings, 1 reply; 26+ messages in thread
From: Christoph Hellwig @ 2021-11-09  8:09 UTC (permalink / raw)
  To: Keith Busch
  Cc: Max Gurtovoy, linux-nvme, hch, sagi, chaitanyak, oren, benishay,
	borisp, aviadye, idanb, jsmart2021

On Mon, Nov 08, 2021 at 08:45:11AM -0800, Keith Busch wrote:
> On Mon, Nov 08, 2021 at 04:46:57PM +0200, Max Gurtovoy wrote:
> > Hi all,
> > Commit a2941f6aa71a ("nvme: add command id quirk for apple controllers")
> > was merged to fix a regression in apple controllers that was introduced
> > after merging commit e7006de6c238 ("nvme: code command_id with a genctr
> > for use-after-free validation").
> > 
> > This series is comming to enable the same quirk for fabrics controllers
> > that used the command id index in the same way that was probably used in
> > apple controllers.
> 
> If there really are targets behaving this way, then this looks good and
> necessary, however unfortunate. A TCP target triggered the need for
> valid tag validation in the first place.
> 
> Are there really fabrics targets behaving this way, or is this series
> anticipating they might exist? Apple disregarding specs is nothing new,
> but I would have hoped no other targets would do this since most vendors
> care about interop.

Seconded.  We probably also need to document the broken targers in the
nvme-cli documentation.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Add command id quirk for fabrics
  2021-11-09  8:09   ` Christoph Hellwig
@ 2021-11-09 12:08     ` Max Gurtovoy
  2021-11-09 13:15       ` Christoph Hellwig
  2021-11-10 10:32       ` Daniel Wagner
  0 siblings, 2 replies; 26+ messages in thread
From: Max Gurtovoy @ 2021-11-09 12:08 UTC (permalink / raw)
  To: Christoph Hellwig, Keith Busch
  Cc: linux-nvme, sagi, chaitanyak, oren, benishay, borisp, aviadye,
	idanb, jsmart2021


On 11/9/2021 10:09 AM, Christoph Hellwig wrote:
> On Mon, Nov 08, 2021 at 08:45:11AM -0800, Keith Busch wrote:
>> On Mon, Nov 08, 2021 at 04:46:57PM +0200, Max Gurtovoy wrote:
>>> Hi all,
>>> Commit a2941f6aa71a ("nvme: add command id quirk for apple controllers")
>>> was merged to fix a regression in apple controllers that was introduced
>>> after merging commit e7006de6c238 ("nvme: code command_id with a genctr
>>> for use-after-free validation").
>>>
>>> This series is comming to enable the same quirk for fabrics controllers
>>> that used the command id index in the same way that was probably used in
>>> apple controllers.
>> If there really are targets behaving this way, then this looks good and
>> necessary, however unfortunate. A TCP target triggered the need for
>> valid tag validation in the first place.

What is really unfortunate is that we added this code by default and not 
as a quirk for buggy controllers.

This series is just completing the initial commit for genctr and gives 
it the right flexibility for smart sys-admins.

The genctr patch broke working controllers (such as Apple) and added 
conditions to the IO path in SW.

Also, controllers/devices that did some internal optimizations (such as 
accessing some task context using the cmd id as index in O(1) instead of 
looking the element in linked list in O(N)) to improve performance will 
now suffer. They may work, but performance will be worse.

Similarly, there is a specification for max_nsid and max_supported_ns, 
right ? exactly for these reasons. We might consider same specification 
for cmd_ids.

>>
>> Are there really fabrics targets behaving this way, or is this series
>> anticipating they might exist? Apple disregarding specs is nothing new,
>> but I would have hoped no other targets would do this since most vendors
>> care about interop.
> Seconded.  We probably also need to document the broken targers in the
> nvme-cli documentation.

I don't think there is a list of broken targets, and broken is not the 
only use case as mentioned above.

We also don't know the target that cause the bug in the initial genctr 
commit..



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Add command id quirk for fabrics
  2021-11-09 12:08     ` Max Gurtovoy
@ 2021-11-09 13:15       ` Christoph Hellwig
  2021-11-09 14:23         ` Max Gurtovoy
  2021-11-10 10:32       ` Daniel Wagner
  1 sibling, 1 reply; 26+ messages in thread
From: Christoph Hellwig @ 2021-11-09 13:15 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: Christoph Hellwig, Keith Busch, linux-nvme, sagi, chaitanyak,
	oren, benishay, borisp, aviadye, idanb, jsmart2021

Max, if you can't point us to a broken target (and yes, it is broken)
this will not go anywhere.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Add command id quirk for fabrics
  2021-11-09 13:15       ` Christoph Hellwig
@ 2021-11-09 14:23         ` Max Gurtovoy
  2021-11-09 14:31           ` Christoph Hellwig
  0 siblings, 1 reply; 26+ messages in thread
From: Max Gurtovoy @ 2021-11-09 14:23 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Keith Busch, linux-nvme, sagi, chaitanyak, oren, benishay,
	borisp, aviadye, idanb, jsmart2021


On 11/9/2021 3:15 PM, Christoph Hellwig wrote:
> Max, if you can't point us to a broken target (and yes, it is broken)
> this will not go anywhere.

Any target that uses Apple device as backend can be harmed.

Most simple example is Linux PT target that copy the sqe as-is and 
passes it to the NVMe Apple drive.



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Add command id quirk for fabrics
  2021-11-09 14:23         ` Max Gurtovoy
@ 2021-11-09 14:31           ` Christoph Hellwig
  2021-11-09 16:15             ` Keith Busch
  0 siblings, 1 reply; 26+ messages in thread
From: Christoph Hellwig @ 2021-11-09 14:31 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: Christoph Hellwig, Keith Busch, linux-nvme, sagi, chaitanyak,
	oren, benishay, borisp, aviadye, idanb, jsmart2021

On Tue, Nov 09, 2021 at 04:23:33PM +0200, Max Gurtovoy wrote:
> On 11/9/2021 3:15 PM, Christoph Hellwig wrote:
>> Max, if you can't point us to a broken target (and yes, it is broken)
>> this will not go anywhere.
>
> Any target that uses Apple device as backend can be harmed.
>
> Most simple example is Linux PT target that copy the sqe as-is and passes 
> it to the NVMe Apple drive.

Take another close look at how command_id are assigned my Linux driver.
We obviously do not pass it through as that would be completely broken.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Add command id quirk for fabrics
  2021-11-09 14:31           ` Christoph Hellwig
@ 2021-11-09 16:15             ` Keith Busch
  2021-11-09 16:59               ` Max Gurtovoy
  0 siblings, 1 reply; 26+ messages in thread
From: Keith Busch @ 2021-11-09 16:15 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Max Gurtovoy, linux-nvme, sagi, chaitanyak, oren, benishay,
	borisp, aviadye, idanb, jsmart2021

On Tue, Nov 09, 2021 at 03:31:02PM +0100, Christoph Hellwig wrote:
> On Tue, Nov 09, 2021 at 04:23:33PM +0200, Max Gurtovoy wrote:
> > On 11/9/2021 3:15 PM, Christoph Hellwig wrote:
> >> Max, if you can't point us to a broken target (and yes, it is broken)
> >> this will not go anywhere.
> >
> > Any target that uses Apple device as backend can be harmed.
> >
> > Most simple example is Linux PT target that copy the sqe as-is and passes 
> > it to the NVMe Apple drive.
> 
> Take another close look at how command_id are assigned my Linux driver.
> We obviously do not pass it through as that would be completely broken.

Also worth noting this driver has always defined the command id as a
__u16, not __le16, yet we don't have any bug reports from big-endian
hosts.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Add command id quirk for fabrics
  2021-11-09 16:15             ` Keith Busch
@ 2021-11-09 16:59               ` Max Gurtovoy
  2021-11-09 19:04                 ` Keith Busch
  0 siblings, 1 reply; 26+ messages in thread
From: Max Gurtovoy @ 2021-11-09 16:59 UTC (permalink / raw)
  To: Keith Busch, Christoph Hellwig
  Cc: linux-nvme, sagi, chaitanyak, oren, benishay, borisp, aviadye,
	idanb, jsmart2021


On 11/9/2021 6:15 PM, Keith Busch wrote:
> On Tue, Nov 09, 2021 at 03:31:02PM +0100, Christoph Hellwig wrote:
>> On Tue, Nov 09, 2021 at 04:23:33PM +0200, Max Gurtovoy wrote:
>>> On 11/9/2021 3:15 PM, Christoph Hellwig wrote:
>>>> Max, if you can't point us to a broken target (and yes, it is broken)
>>>> this will not go anywhere.
>>> Any target that uses Apple device as backend can be harmed.
>>>
>>> Most simple example is Linux PT target that copy the sqe as-is and passes
>>> it to the NVMe Apple drive.
>> Take another close look at how command_id are assigned my Linux driver.
>> We obviously do not pass it through as that would be completely broken.
> Also worth noting this driver has always defined the command id as a
> __u16, not __le16, yet we don't have any bug reports from big-endian
> hosts.

Right, my bad. I thought that the pass-through target uses the same id.

Linux PT target works fine.

Bad example.

Linux kernel world is covered but I still think we need to add this 
ability for fabrics controllers as we did for pci controllers.

There are a lot of vendors out there with their optimizations and 
solutions and by adding some code to cover a broken TCP target (that no 
one said what is this target and why nobody fixed it) by default that 
hurts others (even if it's spec compliant) is not a good practice.



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Add command id quirk for fabrics
  2021-11-09 16:59               ` Max Gurtovoy
@ 2021-11-09 19:04                 ` Keith Busch
  2021-11-10 19:45                   ` Sagi Grimberg
  0 siblings, 1 reply; 26+ messages in thread
From: Keith Busch @ 2021-11-09 19:04 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: Christoph Hellwig, linux-nvme, sagi, chaitanyak, oren, benishay,
	borisp, aviadye, idanb, jsmart2021

On Tue, Nov 09, 2021 at 06:59:05PM +0200, Max Gurtovoy wrote:
> 
> On 11/9/2021 6:15 PM, Keith Busch wrote:
> > On Tue, Nov 09, 2021 at 03:31:02PM +0100, Christoph Hellwig wrote:
> > > On Tue, Nov 09, 2021 at 04:23:33PM +0200, Max Gurtovoy wrote:
> > > > On 11/9/2021 3:15 PM, Christoph Hellwig wrote:
> > > > > Max, if you can't point us to a broken target (and yes, it is broken)
> > > > > this will not go anywhere.
> > > > Any target that uses Apple device as backend can be harmed.
> > > > 
> > > > Most simple example is Linux PT target that copy the sqe as-is and passes
> > > > it to the NVMe Apple drive.
> > > Take another close look at how command_id are assigned my Linux driver.
> > > We obviously do not pass it through as that would be completely broken.
> > Also worth noting this driver has always defined the command id as a
> > __u16, not __le16, yet we don't have any bug reports from big-endian
> > hosts.
> 
> Right, my bad. I thought that the pass-through target uses the same id.
> 
> Linux PT target works fine.
> 
> Bad example.
> 
> Linux kernel world is covered but I still think we need to add this ability
> for fabrics controllers as we did for pci controllers.
> 
> There are a lot of vendors out there with their optimizations and solutions
> and by adding some code to cover a broken TCP target (that no one said what
> is this target and why nobody fixed it) by default that hurts others (even
> if it's spec compliant) is not a good practice.

Could you qualify the harm this caused? The command id is just an opaque
cookie; the target should not do any interpretation on it, so this
encoding should be inconsequential from the target's perspective.

There are more hosts than just Linux that may encode id's with flags for
driver use, so non-compliance here is just asking for trouble.

If a vendor wants to constrain the command id for some vendor specific
optimization, they should bring forth a TPar and fight it out in the
workgroup.

We did get bug reports that not validating command id's will crash the
kernel or corrupt data if an unexpected response is observed. Even
though the incorrect id is not the kernel's fault, we generally strive
for resilience against those types of observations in spite of
potentially flaky hardware.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Add command id quirk for fabrics
  2021-11-09 12:08     ` Max Gurtovoy
  2021-11-09 13:15       ` Christoph Hellwig
@ 2021-11-10 10:32       ` Daniel Wagner
  2021-11-10 10:56         ` Max Gurtovoy
  1 sibling, 1 reply; 26+ messages in thread
From: Daniel Wagner @ 2021-11-10 10:32 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: Christoph Hellwig, Keith Busch, linux-nvme, sagi, chaitanyak,
	oren, benishay, borisp, aviadye, idanb, jsmart2021

On Tue, Nov 09, 2021 at 02:08:27PM +0200, Max Gurtovoy wrote:
> We also don't know the target that cause the bug in the initial genctr
> commit..

As I was the one asking for this feature: During development phase we
saw invalid cmd_id from a (buggy) target. The target has been fixed. It
never got released to the public.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Add command id quirk for fabrics
  2021-11-10 10:32       ` Daniel Wagner
@ 2021-11-10 10:56         ` Max Gurtovoy
  2021-11-10 11:18           ` Daniel Wagner
  0 siblings, 1 reply; 26+ messages in thread
From: Max Gurtovoy @ 2021-11-10 10:56 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: Christoph Hellwig, Keith Busch, linux-nvme, sagi, chaitanyak,
	oren, benishay, borisp, aviadye, idanb, jsmart2021


On 11/10/2021 12:32 PM, Daniel Wagner wrote:
> On Tue, Nov 09, 2021 at 02:08:27PM +0200, Max Gurtovoy wrote:
>> We also don't know the target that cause the bug in the initial genctr
>> commit..
> As I was the one asking for this feature: During development phase we
> saw invalid cmd_id from a (buggy) target. The target has been fixed. It
> never got released to the public.

I see.

It's strange that we need to add all these if conditions and logic in 
the fast path for RDMA/PCI/FC and even for TCP for some non existing 
target that was buggy in the development process.

And all of this is the default behavior for all transports.



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Add command id quirk for fabrics
  2021-11-10 10:56         ` Max Gurtovoy
@ 2021-11-10 11:18           ` Daniel Wagner
  0 siblings, 0 replies; 26+ messages in thread
From: Daniel Wagner @ 2021-11-10 11:18 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: Christoph Hellwig, Keith Busch, linux-nvme, sagi, chaitanyak,
	oren, benishay, borisp, aviadye, idanb, jsmart2021

On Wed, Nov 10, 2021 at 12:56:38PM +0200, Max Gurtovoy wrote:
> It's strange that we need to add all these if conditions and logic in the
> fast path for RDMA/PCI/FC and even for TCP for some non existing target that
> was buggy in the development process.

The concern is that a buggy target could easily cause unnoticed data
corruption, which is even worse than just crash. IIRC, I wasn't the
first one seeing a wrong cmd_id, although it seems to be pretty rare
(well, these days even rare bugs happen way too often).


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Add command id quirk for fabrics
  2021-11-09 19:04                 ` Keith Busch
@ 2021-11-10 19:45                   ` Sagi Grimberg
  2021-11-11  9:29                     ` Max Gurtovoy
  0 siblings, 1 reply; 26+ messages in thread
From: Sagi Grimberg @ 2021-11-10 19:45 UTC (permalink / raw)
  To: Keith Busch, Max Gurtovoy
  Cc: Christoph Hellwig, linux-nvme, chaitanyak, oren, benishay,
	borisp, aviadye, idanb, jsmart2021

Hey, sorry for the late chime here, ramping up on some emails.

>>>>>> Max, if you can't point us to a broken target (and yes, it is broken)
>>>>>> this will not go anywhere.
>>>>> Any target that uses Apple device as backend can be harmed.
>>>>>
>>>>> Most simple example is Linux PT target that copy the sqe as-is and passes
>>>>> it to the NVMe Apple drive.
>>>> Take another close look at how command_id are assigned my Linux driver.
>>>> We obviously do not pass it through as that would be completely broken.
>>> Also worth noting this driver has always defined the command id as a
>>> __u16, not __le16, yet we don't have any bug reports from big-endian
>>> hosts.
>>
>> Right, my bad. I thought that the pass-through target uses the same id.
>>
>> Linux PT target works fine.
>>
>> Bad example.
>>
>> Linux kernel world is covered but I still think we need to add this ability
>> for fabrics controllers as we did for pci controllers.
>>
>> There are a lot of vendors out there with their optimizations and solutions
>> and by adding some code to cover a broken TCP target (that no one said what
>> is this target and why nobody fixed it) by default that hurts others (even
>> if it's spec compliant) is not a good practice.

Completely disagree here. The TCP original report was just an example of
lack of protection we have against spurious completions. Nothing
specific about nvme-tcp here, this was discussed and agreed on in
the original report.

> Could you qualify the harm this caused? The command id is just an opaque
> cookie; the target should not do any interpretation on it, so this
> encoding should be inconsequential from the target's perspective.

Exactly, the command id is an opaque that is solely up to the host
discretion in terms of how to use it. It's pure coincidence that Linux
uses it for command indexes.

Any implementation that interprets command ids to _anything_ needs
a quirk, not the other way around.

> There are more hosts than just Linux that may encode id's with flags for
> driver use, so non-compliance here is just asking for trouble.

I know of at least one significant host implementation where command
ids are not indexes.

> If a vendor wants to constrain the command id for some vendor specific
> optimization, they should bring forth a TPar and fight it out in the
> workgroup.
> 
> We did get bug reports that not validating command id's will crash the
> kernel or corrupt data if an unexpected response is observed. Even
> though the incorrect id is not the kernel's fault, we generally strive
> for resilience against those types of observations in spite of
> potentially flaky hardware.

Agreed.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Add command id quirk for fabrics
  2021-11-10 19:45                   ` Sagi Grimberg
@ 2021-11-11  9:29                     ` Max Gurtovoy
  2021-11-11 17:36                       ` Keith Busch
  2021-11-12 16:07                       ` Sagi Grimberg
  0 siblings, 2 replies; 26+ messages in thread
From: Max Gurtovoy @ 2021-11-11  9:29 UTC (permalink / raw)
  To: Sagi Grimberg, Keith Busch
  Cc: Christoph Hellwig, linux-nvme, chaitanyak, oren, benishay,
	borisp, aviadye, idanb, jsmart2021


On 11/10/2021 9:45 PM, Sagi Grimberg wrote:
> Hey, sorry for the late chime here, ramping up on some emails.
>
>>>>>>> Max, if you can't point us to a broken target (and yes, it is 
>>>>>>> broken)
>>>>>>> this will not go anywhere.
>>>>>> Any target that uses Apple device as backend can be harmed.
>>>>>>
>>>>>> Most simple example is Linux PT target that copy the sqe as-is 
>>>>>> and passes
>>>>>> it to the NVMe Apple drive.
>>>>> Take another close look at how command_id are assigned my Linux 
>>>>> driver.
>>>>> We obviously do not pass it through as that would be completely 
>>>>> broken.
>>>> Also worth noting this driver has always defined the command id as a
>>>> __u16, not __le16, yet we don't have any bug reports from big-endian
>>>> hosts.
>>>
>>> Right, my bad. I thought that the pass-through target uses the same id.
>>>
>>> Linux PT target works fine.
>>>
>>> Bad example.
>>>
>>> Linux kernel world is covered but I still think we need to add this 
>>> ability
>>> for fabrics controllers as we did for pci controllers.
>>>
>>> There are a lot of vendors out there with their optimizations and 
>>> solutions
>>> and by adding some code to cover a broken TCP target (that no one 
>>> said what
>>> is this target and why nobody fixed it) by default that hurts others 
>>> (even
>>> if it's spec compliant) is not a good practice.
>
> Completely disagree here. The TCP original report was just an example of
> lack of protection we have against spurious completions. Nothing
> specific about nvme-tcp here, this was discussed and agreed on in
> the original report.
>
You are ignoring the facts:

1. The device that broke the spec in the first place was that device for 
which caused you to add the gen bits to CID.

2. These gen bits are causing the limit of 4K Q_depth.

3. It's not mention anywhere in the spec, and if it was intended to be 
implemented like it's now - it would have mentioned in the spec.

4. Since gen bits were introduced, other devices got broken (such as 
Apple), hence the quirk for PCI.

5. The gen bits adds "if" conditions and logic to the fast path for 
"innosent" transports.

6. This series just extends this quirk for fabrics.

7. Even if not broken, some devices may suffer from reduced performance 
having CID space spanning all 16 possible bit - fact that we ignore

8. This series provides a flag to disable default behavior per connection.

9. This series doesn't add any logic to fast path.

10. My patch from last year for resiliency for nvme_pci was rejected 
because it added one if condition to the fast path - no consistency.


>> Could you qualify the harm this caused? The command id is just an opaque
>> cookie; the target should not do any interpretation on it, so this
>> encoding should be inconsequential from the target's perspective.
>
> Exactly, the command id is an opaque that is solely up to the host
> discretion in terms of how to use it. It's pure coincidence that Linux
> uses it for command indexes.
>
> Any implementation that interprets command ids to _anything_ needs
> a quirk, not the other way around.
>
>> There are more hosts than just Linux that may encode id's with flags for
>> driver use, so non-compliance here is just asking for trouble.
>
> I know of at least one significant host implementation where command
> ids are not indexes.
>
>> If a vendor wants to constrain the command id for some vendor specific
>> optimization, they should bring forth a TPar and fight it out in the
>> workgroup.
>>
>> We did get bug reports that not validating command id's will crash the
>> kernel or corrupt data if an unexpected response is observed. Even
>> though the incorrect id is not the kernel's fault, we generally strive
>> for resilience against those types of observations in spite of
>> potentially flaky hardware.
>
> Agreed.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Add command id quirk for fabrics
  2021-11-11  9:29                     ` Max Gurtovoy
@ 2021-11-11 17:36                       ` Keith Busch
  2021-11-12 16:07                       ` Sagi Grimberg
  1 sibling, 0 replies; 26+ messages in thread
From: Keith Busch @ 2021-11-11 17:36 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: Sagi Grimberg, Christoph Hellwig, linux-nvme, chaitanyak, oren,
	benishay, borisp, aviadye, idanb, jsmart2021

On Thu, Nov 11, 2021 at 11:29:11AM +0200, Max Gurtovoy wrote:
> 1. The device that broke the spec in the first place was that device for
> which caused you to add the gen bits to CID.
> 
> 2. These gen bits are causing the limit of 4K Q_depth.

That is true. At least one person (Bart) didn't like that.

On the other hand, we do observe timeouts in production environments
because the queues are already too large, and there is simply not enough
bandwidth available to satisfy default host latency requirements at
current depths. Going deeper doesn't improve iops or throughput, but it
does increase tail latencies.
 
> 3. It's not mention anywhere in the spec, and if it was intended to be
> implemented like it's now - it would have mentioned in the spec.

The only thing the spec says is that host software must ensure the CID
is unique on the SQ for the lifetime of the command. There's no need for
the spec to cover every possible way a driver can satisfy that
requirement.

> 4. Since gen bits were introduced, other devices got broken (such as Apple),
> hence the quirk for PCI.

Apple doesn't care about spec compatibility though. They support their
hardware only with their own host software, hence all the apple specific
Linux quirks.

> 5. The gen bits adds "if" conditions and logic to the fast path for
> "innosent" transports.

The quirk doesn't change that, though.

> 6. This series just extends this quirk for fabrics.

My only request has been that you identify at least one real fabrics
target that behaves this way. So far it sounds like this series is
speculating they might exist.

And if they do exist, you will have to change all the __u16 command_id
to __le16 and use appropriate cpu_to_le16() wrappers, otherwise
big-endian machines will produce completely different command id's than
little-endian. Since we haven't heard a single bug report from
big-endian hosts, I am pretty sure that such target behavior doesn't
actually exist in the field.

> 7. Even if not broken, some devices may suffer from reduced
> performance having CID space spanning all 16 possible bit - fact that
> we ignore

I don't see how. Why would this matter to any target? It's just a
cookie to pass back.
 
> 8. This series provides a flag to disable default behavior per
> connection.
> 
> 9. This series doesn't add any logic to fast path.
> 
> 10. My patch from last year for resiliency for nvme_pci was rejected
> because
> it added one if condition to the fast path - no consistency.

It's based only on what anyone could measure on available hardware. I
have completely different machines than what I tested at that time. If
someone can measure a loss with the current driver, that may change the
discussion. 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Add command id quirk for fabrics
  2021-11-11  9:29                     ` Max Gurtovoy
  2021-11-11 17:36                       ` Keith Busch
@ 2021-11-12 16:07                       ` Sagi Grimberg
  2021-11-12 21:37                         ` Keith Busch
  2021-11-18 11:19                         ` Max Gurtovoy
  1 sibling, 2 replies; 26+ messages in thread
From: Sagi Grimberg @ 2021-11-12 16:07 UTC (permalink / raw)
  To: Max Gurtovoy, Keith Busch
  Cc: Christoph Hellwig, linux-nvme, chaitanyak, oren, benishay,
	borisp, aviadye, idanb, jsmart2021


>> Completely disagree here. The TCP original report was just an example of
>> lack of protection we have against spurious completions. Nothing
>> specific about nvme-tcp here, this was discussed and agreed on in
>> the original report.
>>
> You are ignoring the facts:
> 
> 1. The device that broke the spec in the first place was that device for 
> which caused you to add the gen bits to CID.

Correct.

> 2. These gen bits are causing the limit of 4K Q_depth.

Correct. Which should be more than enough.

> 3. It's not mention anywhere in the spec, and if it was intended to be 
> implemented like it's now - it would have mentioned in the spec.

As Keith mentioned, the spec doesn't define anything about how the host
should set the command id, hence it is free to do what it wants. The
spec doesn't need to say "the host can set the command id field however
it sees fit". See Keith comment on the field endianess.

> 4. Since gen bits were introduced, other devices got broken (such as 
> Apple), hence the quirk for PCI.
You mean it _exposed_ an already broken device. Yes, that is correct.

> 5. The gen bits adds "if" conditions and logic to the fast path for 
> "innosent" transports.

There is nothing transport specific about any of this, so I'm not sure
I understand what you are talking about.

> 6. This series just extends this quirk for fabrics.

I don't think the patchset got rejected, the ask afaict was to document
known broken controllers - exactly like pci quirks.

Here is the original question from Keith:
"Are there really fabrics targets behaving this way, or is this series
anticipating they might exist?"

I don't think there is any desire to keep any controller that got the
spec wrong in this particular case unusable.

> 7. Even if not broken, some devices may suffer from reduced performance 
> having CID space spanning all 16 possible bit - fact that we ignore

Not sure exactly what you mean here. Again, if the controller assumes
anything on how the host would populate the command id it is doing
something wrong. It's not a theoretical argument, this controller
already does not work with existing at least one host implementation.

So you are correct, we don't take it into consideration.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Add command id quirk for fabrics
  2021-11-12 16:07                       ` Sagi Grimberg
@ 2021-11-12 21:37                         ` Keith Busch
  2021-11-18 11:19                         ` Max Gurtovoy
  1 sibling, 0 replies; 26+ messages in thread
From: Keith Busch @ 2021-11-12 21:37 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Max Gurtovoy, Keith Busch, Christoph Hellwig, linux-nvme,
	Chaitanya Kulkarni, oren, benishay, borisp, aviadye, idanb,
	jsmart2021

On Fri, Nov 12, 2021 at 9:08 AM Sagi Grimberg <sagi@grimberg.me> wrote:
>
>
> >> Completely disagree here. The TCP original report was just an example of
> >> lack of protection we have against spurious completions. Nothing
> >> specific about nvme-tcp here, this was discussed and agreed on in
> >> the original report.
> >>
> > You are ignoring the facts:
> >
> > 1. The device that broke the spec in the first place was that device for
> > which caused you to add the gen bits to CID.
>
> Correct.

Can we also acknowledge that the broken device isn't really nvme? Apple uses
PCI class code 018020, which is an unspecified mass storage controller, clearly
*not* nvme. Apple does not care if their devices work with a generic
NVMe driver.
The only reason it works in Linux is because (1) a savvy user base exists to
reverse engineer their crap, and (2) their crap so far hasn't been invasive
enough to justify an entirely different driver. No one should consider Apple an
example to follow.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Add command id quirk for fabrics
  2021-11-12 16:07                       ` Sagi Grimberg
  2021-11-12 21:37                         ` Keith Busch
@ 2021-11-18 11:19                         ` Max Gurtovoy
  2021-11-21 10:05                           ` Sagi Grimberg
  1 sibling, 1 reply; 26+ messages in thread
From: Max Gurtovoy @ 2021-11-18 11:19 UTC (permalink / raw)
  To: Sagi Grimberg, Keith Busch
  Cc: Christoph Hellwig, linux-nvme, chaitanyak, oren, benishay,
	borisp, aviadye, idanb, jsmart2021, Or Gerlitz, Liran Liss,
	Jason Gunthorpe


On 11/12/2021 6:07 PM, Sagi Grimberg wrote:
>
>> 6. This series just extends this quirk for fabrics.
>
> I don't think the patchset got rejected, the ask afaict was to document
> known broken controllers - exactly like pci quirks.
>
> Here is the original question from Keith:
> "Are there really fabrics targets behaving this way, or is this series
> anticipating they might exist?"
>
> I don't think there is any desire to keep any controller that got the
> spec wrong in this particular case unusable.

The example I have is the new mlx5 offload for nvme-tcp has HW design 
that, unfortunately, made an optimization that relies on a indexed CID.
All the testing was done with Linux kernel's prior to the gen counter 
change using internal tools, so chips now exist with some performance and
memory foot print implications when the gen counter is operated.

NVIDIA is committed to NVMe standards compliance and will fix this 
design in next silicon so no quirks will be needed to get the best 
performance.

I guess we can put this discussion and patch-set aside until we come 
with some performance impact measurements with the mlx5 offload HW.
All we know right now is that the new CID algorithm triggers bad 
performance using our internal simulation.

Thanks.



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Add command id quirk for fabrics
  2021-11-18 11:19                         ` Max Gurtovoy
@ 2021-11-21 10:05                           ` Sagi Grimberg
  0 siblings, 0 replies; 26+ messages in thread
From: Sagi Grimberg @ 2021-11-21 10:05 UTC (permalink / raw)
  To: Max Gurtovoy, Keith Busch
  Cc: Christoph Hellwig, linux-nvme, chaitanyak, oren, benishay,
	borisp, aviadye, idanb, jsmart2021, Or Gerlitz, Liran Liss,
	Jason Gunthorpe


>>> 6. This series just extends this quirk for fabrics.
>>
>> I don't think the patchset got rejected, the ask afaict was to document
>> known broken controllers - exactly like pci quirks.
>>
>> Here is the original question from Keith:
>> "Are there really fabrics targets behaving this way, or is this series
>> anticipating they might exist?"
>>
>> I don't think there is any desire to keep any controller that got the
>> spec wrong in this particular case unusable.
> 
> The example I have is the new mlx5 offload for nvme-tcp has HW design 
> that, unfortunately, made an optimization that relies on a indexed CID.
> All the testing was done with Linux kernel's prior to the gen counter 
> change using internal tools, so chips now exist with some performance and
> memory foot print implications when the gen counter is operated.

I see. Well as said, I don't think anyone rejected the proposed
patchset, we were simply looking to understand if this is a real
problem or a theoretical one.

> NVIDIA is committed to NVMe standards compliance and will fix this 
> design in next silicon so no quirks will be needed to get the best 
> performance.

That is great to know.

> I guess we can put this discussion and patch-set aside until we come 
> with some performance impact measurements with the mlx5 offload HW.
> All we know right now is that the new CID algorithm triggers bad 
> performance using our internal simulation.

OK, I think that it will make sense to revisit this with the next
submission of the offload.


^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2021-11-21 10:05 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-08 14:46 [PATCH v1 0/4] Add command id quirk for fabrics Max Gurtovoy
2021-11-08 14:46 ` [PATCH 1/1 nvmecli] fabrics: add new --skip-cid-gen flag to connect cmd Max Gurtovoy
2021-11-08 14:46 ` [PATCH 1/1 libnvme] fabrics: add support for new cli --skip-cid-gen flag Max Gurtovoy
2021-11-08 14:47 ` [PATCH 1/4] nvme-fabrics: add command id quirk for fabrics controllers Max Gurtovoy
2021-11-08 14:47 ` [PATCH 2/4] nvme-rdma: add command id quirk for RDMA controllers Max Gurtovoy
2021-11-08 14:47 ` [PATCH 3/4] nvme-tcp: add command id quirk for TCP controllers Max Gurtovoy
2021-11-08 14:47 ` [PATCH 4/4] nvme-fc: add command id quirk for FC controllers Max Gurtovoy
2021-11-08 16:45 ` [PATCH v1 0/4] Add command id quirk for fabrics Keith Busch
2021-11-09  8:09   ` Christoph Hellwig
2021-11-09 12:08     ` Max Gurtovoy
2021-11-09 13:15       ` Christoph Hellwig
2021-11-09 14:23         ` Max Gurtovoy
2021-11-09 14:31           ` Christoph Hellwig
2021-11-09 16:15             ` Keith Busch
2021-11-09 16:59               ` Max Gurtovoy
2021-11-09 19:04                 ` Keith Busch
2021-11-10 19:45                   ` Sagi Grimberg
2021-11-11  9:29                     ` Max Gurtovoy
2021-11-11 17:36                       ` Keith Busch
2021-11-12 16:07                       ` Sagi Grimberg
2021-11-12 21:37                         ` Keith Busch
2021-11-18 11:19                         ` Max Gurtovoy
2021-11-21 10:05                           ` Sagi Grimberg
2021-11-10 10:32       ` Daniel Wagner
2021-11-10 10:56         ` Max Gurtovoy
2021-11-10 11:18           ` Daniel Wagner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.