* [PATCH v3 0/4] nvme: protect against possible request reference after completion
@ 2021-06-16 21:19 Sagi Grimberg
  2021-06-16 21:19 ` [PATCH v3 1/4] params: lift param_set_uint_minmax to common code Sagi Grimberg
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Sagi Grimberg @ 2021-06-16 21:19 UTC (permalink / raw)
  To: linux-nvme, Christoph Hellwig, Keith Busch
  Cc: Hannes Reinecke, Chaitanya Kulkarni

Nothing in nvme protects against referencing a request after it has completed.
For example, if a buggy controller sends a completion twice for the same
request, the host can access and modify a request that was already completed.

At best this will cause a panic, but in the worst case it can cause silent
data corruption if the request has already been reused and executed by the time
we reference it.

The nvme command_id is an opaque field in which, until now, we have simply
placed the request tag. To protect against access after completion, we introduce
a generation counter in the upper 4 bits of the command_id that is incremented
on every invocation and validated when a completion is received. This limits the
effective maximum queue depth to 4095, but we hardly ever use such deep queues
(in fabrics the maximum is already 1024).

Changes from v2:
- cc linux-nfs,linux-kernel for patch 1/4
- fix expected genctr print in patch 4/4
- match param_set_uint_minmax indentation
- collected review tags

Changes from v1:
- lift param_set_uint_minmax and reuse it
- simplify initialization in patch 3/4

Sagi Grimberg (4):
  params: lift param_set_uint_minmax to common code
  nvme-pci: limit maximum queue depth to 4095
  nvme-tcp: don't check blk_mq_tag_to_rq when receiving pdu data
  nvme: code command_id with a genctr for use-after-free validation

 drivers/nvme/host/core.c    |  3 ++-
 drivers/nvme/host/nvme.h    | 47 ++++++++++++++++++++++++++++++++++++-
 drivers/nvme/host/pci.c     | 17 ++++++--------
 drivers/nvme/host/rdma.c    |  4 ++--
 drivers/nvme/host/tcp.c     | 38 ++++++++++++------------------
 drivers/nvme/target/loop.c  |  4 ++--
 include/linux/moduleparam.h |  3 +++
 kernel/params.c             | 19 +++++++++++++++
 net/sunrpc/xprtsock.c       | 18 --------------
 9 files changed, 96 insertions(+), 57 deletions(-)

-- 
2.27.0



* [PATCH v3 1/4] params: lift param_set_uint_minmax to common code
  2021-06-16 21:19 [PATCH v3 0/4] nvme: protect against possible request reference after completion Sagi Grimberg
@ 2021-06-16 21:19 ` Sagi Grimberg
  2021-06-17  5:45   ` Hannes Reinecke
                     ` (2 more replies)
  2021-06-16 21:19 ` [PATCH v3 2/4] nvme-pci: limit maximum queue depth to 4095 Sagi Grimberg
                   ` (3 subsequent siblings)
  4 siblings, 3 replies; 13+ messages in thread
From: Sagi Grimberg @ 2021-06-16 21:19 UTC (permalink / raw)
  To: linux-nvme, Christoph Hellwig, Keith Busch
  Cc: Hannes Reinecke, Chaitanya Kulkarni

It is a useful helper hence move it to common code so others can enjoy
it.

Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Cc: linux-nfs@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 include/linux/moduleparam.h |  3 +++
 kernel/params.c             | 19 +++++++++++++++++++
 net/sunrpc/xprtsock.c       | 18 ------------------
 3 files changed, 22 insertions(+), 18 deletions(-)

diff --git a/include/linux/moduleparam.h b/include/linux/moduleparam.h
index eed280fae433..b36ece513616 100644
--- a/include/linux/moduleparam.h
+++ b/include/linux/moduleparam.h
@@ -432,6 +432,9 @@ extern const struct kernel_param_ops param_ops_uint;
 extern int param_set_uint(const char *val, const struct kernel_param *kp);
 extern int param_get_uint(char *buffer, const struct kernel_param *kp);
 #define param_check_uint(name, p) __param_check(name, p, unsigned int)
+int param_set_uint_minmax(const char *val,
+			  const struct kernel_param *kp,
+			  unsigned int min, unsigned int max);
 
 extern const struct kernel_param_ops param_ops_long;
 extern int param_set_long(const char *val, const struct kernel_param *kp);
diff --git a/kernel/params.c b/kernel/params.c
index 2daa2780a92c..7d3c61e64140 100644
--- a/kernel/params.c
+++ b/kernel/params.c
@@ -243,6 +243,25 @@ STANDARD_PARAM_DEF(ulong,	unsigned long,		"%lu",		kstrtoul);
 STANDARD_PARAM_DEF(ullong,	unsigned long long,	"%llu",		kstrtoull);
 STANDARD_PARAM_DEF(hexint,	unsigned int,		"%#08x", 	kstrtouint);
 
+int param_set_uint_minmax(const char *val,
+			  const struct kernel_param *kp,
+			  unsigned int min, unsigned int max)
+{
+	unsigned int num;
+	int ret;
+
+	if (!val)
+		return -EINVAL;
+	ret = kstrtouint(val, 0, &num);
+	if (ret)
+		return ret;
+	if (num < min || num > max)
+		return -EINVAL;
+	*((unsigned int *)kp->arg) = num;
+	return 0;
+}
+EXPORT_SYMBOL(param_set_uint_minmax);
+
 int param_set_charp(const char *val, const struct kernel_param *kp)
 {
 	if (strlen(val) > 1024) {
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 47aa47a2b07c..0cfbf618e8c2 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -3130,24 +3130,6 @@ void cleanup_socket_xprt(void)
 	xprt_unregister_transport(&xs_bc_tcp_transport);
 }
 
-static int param_set_uint_minmax(const char *val,
-		const struct kernel_param *kp,
-		unsigned int min, unsigned int max)
-{
-	unsigned int num;
-	int ret;
-
-	if (!val)
-		return -EINVAL;
-	ret = kstrtouint(val, 0, &num);
-	if (ret)
-		return ret;
-	if (num < min || num > max)
-		return -EINVAL;
-	*((unsigned int *)kp->arg) = num;
-	return 0;
-}
-
 static int param_set_portnr(const char *val, const struct kernel_param *kp)
 {
 	return param_set_uint_minmax(val, kp,
-- 
2.27.0
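
As a usage note for the newly exported helper: a minimal, hypothetical example
of wiring param_set_uint_minmax() into a module parameter via module_param_cb().
The foo_* names are made up for illustration; the real in-tree user added by
this series is nvme-pci in patch 2/4.

#include <linux/module.h>
#include <linux/moduleparam.h>

static unsigned int foo_queue_depth = 64;	/* hypothetical parameter */

static int foo_queue_depth_set(const char *val, const struct kernel_param *kp)
{
	/* reject anything outside [2, 4095] */
	return param_set_uint_minmax(val, kp, 2, 4095);
}

static const struct kernel_param_ops foo_queue_depth_ops = {
	.set = foo_queue_depth_set,
	.get = param_get_uint,
};

module_param_cb(foo_queue_depth, &foo_queue_depth_ops, &foo_queue_depth, 0644);
MODULE_PARM_DESC(foo_queue_depth, "queue depth, must be >= 2 and <= 4095");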



* [PATCH v3 2/4] nvme-pci: limit maximum queue depth to 4095
  2021-06-16 21:19 [PATCH v3 0/4] nvme: protect against possible request reference after completion Sagi Grimberg
  2021-06-16 21:19 ` [PATCH v3 1/4] params: lift param_set_uint_minmax to common code Sagi Grimberg
@ 2021-06-16 21:19 ` Sagi Grimberg
  2021-06-17  5:46   ` Hannes Reinecke
  2021-06-17  8:04   ` Daniel Wagner
  2021-06-16 21:19 ` [PATCH v3 3/4] nvme-tcp: don't check blk_mq_tag_to_rq when receiving pdu data Sagi Grimberg
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 13+ messages in thread
From: Sagi Grimberg @ 2021-06-16 21:19 UTC (permalink / raw)
  To: linux-nvme, Christoph Hellwig, Keith Busch
  Cc: Hannes Reinecke, Chaitanya Kulkarni

We are going to use the upper 4 bits of the command_id for a generation
counter, so enforce the new queue depth upper limit. As we enforce
both the minimum and maximum queue depth, use param_set_uint_minmax instead of
open coding it.

Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/pci.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 9cdf2099027a..d2ee9e46f5d3 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -60,6 +60,8 @@ MODULE_PARM_DESC(sgl_threshold,
 		"Use SGLs when average request segment size is larger or equal to "
 		"this size. Use 0 to disable SGLs.");
 
+#define NVME_PCI_MIN_QUEUE_SIZE 2
+#define NVME_PCI_MAX_QUEUE_SIZE 4095
 static int io_queue_depth_set(const char *val, const struct kernel_param *kp);
 static const struct kernel_param_ops io_queue_depth_ops = {
 	.set = io_queue_depth_set,
@@ -68,7 +70,7 @@ static const struct kernel_param_ops io_queue_depth_ops = {
 
 static unsigned int io_queue_depth = 1024;
 module_param_cb(io_queue_depth, &io_queue_depth_ops, &io_queue_depth, 0644);
-MODULE_PARM_DESC(io_queue_depth, "set io queue depth, should >= 2");
+MODULE_PARM_DESC(io_queue_depth, "set io queue depth, should >= 2 and < 4096");
 
 static int io_queue_count_set(const char *val, const struct kernel_param *kp)
 {
@@ -157,14 +159,9 @@ struct nvme_dev {
 
 static int io_queue_depth_set(const char *val, const struct kernel_param *kp)
 {
-	int ret;
-	u32 n;
-
-	ret = kstrtou32(val, 10, &n);
-	if (ret != 0 || n < 2)
-		return -EINVAL;
-
-	return param_set_uint(val, kp);
+	return param_set_uint_minmax(val, kp,
+			NVME_PCI_MIN_QUEUE_SIZE,
+			NVME_PCI_MAX_QUEUE_SIZE);
 }
 
 static inline unsigned int sq_idx(unsigned int qid, u32 stride)
-- 
2.27.0



* [PATCH v3 3/4] nvme-tcp: don't check blk_mq_tag_to_rq when receiving pdu data
  2021-06-16 21:19 [PATCH v3 0/4] nvme: protect against possible request reference after completion Sagi Grimberg
  2021-06-16 21:19 ` [PATCH v3 1/4] params: lift param_set_uint_minmax to common code Sagi Grimberg
  2021-06-16 21:19 ` [PATCH v3 2/4] nvme-pci: limit maximum queue depth to 4095 Sagi Grimberg
@ 2021-06-16 21:19 ` Sagi Grimberg
  2021-06-17  8:11   ` Daniel Wagner
  2021-06-16 21:19 ` [PATCH v3 4/4] nvme: code command_id with a genctr for use-after-free validation Sagi Grimberg
  2021-07-16  7:15 ` [PATCH v3 0/4] nvme: protect against possible request reference after completion Christoph Hellwig
  4 siblings, 1 reply; 13+ messages in thread
From: Sagi Grimberg @ 2021-06-16 21:19 UTC (permalink / raw)
  To: linux-nvme, Christoph Hellwig, Keith Busch
  Cc: Hannes Reinecke, Chaitanya Kulkarni

We already validate the command_id when receiving the c2hdata pdu header,
and it does not change afterwards, so this is a redundant check.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/tcp.c | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index c7bd37103cf4..3ad65f42fc1e 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -703,17 +703,9 @@ static int nvme_tcp_recv_data(struct nvme_tcp_queue *queue, struct sk_buff *skb,
 			      unsigned int *offset, size_t *len)
 {
 	struct nvme_tcp_data_pdu *pdu = (void *)queue->pdu;
-	struct nvme_tcp_request *req;
-	struct request *rq;
-
-	rq = blk_mq_tag_to_rq(nvme_tcp_tagset(queue), pdu->command_id);
-	if (!rq) {
-		dev_err(queue->ctrl->ctrl.device,
-			"queue %d tag %#x not found\n",
-			nvme_tcp_queue_id(queue), pdu->command_id);
-		return -ENOENT;
-	}
-	req = blk_mq_rq_to_pdu(rq);
+	struct request *rq =
+		blk_mq_tag_to_rq(nvme_tcp_tagset(queue), pdu->command_id);
+	struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
 
 	while (true) {
 		int recv_len, ret;
-- 
2.27.0



* [PATCH v3 4/4] nvme: code command_id with a genctr for use-after-free validation
  2021-06-16 21:19 [PATCH v3 0/4] nvme: protect against possible request reference after completion Sagi Grimberg
                   ` (2 preceding siblings ...)
  2021-06-16 21:19 ` [PATCH v3 3/4] nvme-tcp: don't check blk_mq_tag_to_rq when receiving pdu data Sagi Grimberg
@ 2021-06-16 21:19 ` Sagi Grimberg
  2021-06-17  8:56   ` Daniel Wagner
  2021-07-16  7:15 ` [PATCH v3 0/4] nvme: protect against possible request reference after completion Christoph Hellwig
  4 siblings, 1 reply; 13+ messages in thread
From: Sagi Grimberg @ 2021-06-16 21:19 UTC (permalink / raw)
  To: linux-nvme, Christoph Hellwig, Keith Busch
  Cc: Hannes Reinecke, Chaitanya Kulkarni

We cannot detect a (perhaps buggy) controller that is sending us
a completion for a request that was already completed (for example,
sending a completion twice); this phenomenon has been seen in the wild
a few times.

To protect against this, we use the upper 4 bits of the nvme sqe
command_id as a 4-bit generation counter and verify that it matches
the request's current generation, which is incremented on every execution.

The 16-bit command_id is now constructed as:
| xxxx | xxxxxxxxxxxx |
  gen    request tag

This means we give up some possible queue depth, as 12 bits allow
a maximum queue depth of 4095 instead of 65536; however, we never
create such deep queues anyway, so no real harm is done.

Suggested-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/core.c   |  3 ++-
 drivers/nvme/host/nvme.h   | 47 +++++++++++++++++++++++++++++++++++++-
 drivers/nvme/host/pci.c    |  2 +-
 drivers/nvme/host/rdma.c   |  4 ++--
 drivers/nvme/host/tcp.c    | 26 ++++++++++-----------
 drivers/nvme/target/loop.c |  4 ++--
 6 files changed, 66 insertions(+), 20 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 25ad9027f928..a75876dfa38c 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1026,7 +1026,8 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
 		return BLK_STS_IOERR;
 	}
 
-	cmd->common.command_id = req->tag;
+	nvme_req(req)->genctr++;
+	cmd->common.command_id = nvme_cid(req);
 	trace_nvme_setup_cmd(req, cmd);
 	return ret;
 }
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 1aab74128d40..ad518b1c0fac 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -158,6 +158,7 @@ enum nvme_quirks {
 struct nvme_request {
 	struct nvme_command	*cmd;
 	union nvme_result	result;
+	u8			genctr;
 	u8			retries;
 	u8			flags;
 	u16			status;
@@ -497,6 +498,49 @@ struct nvme_ctrl_ops {
 	int (*get_address)(struct nvme_ctrl *ctrl, char *buf, int size);
 };
 
+/*
+ * nvme command_id is constructed as such:
+ * | xxxx | xxxxxxxxxxxx |
+ *   gen    request tag
+ */
+#define nvme_genctr_mask(gen)			(gen & 0xf)
+#define nvme_cid_install_genctr(gen)		(nvme_genctr_mask(gen) << 12)
+#define nvme_genctr_from_cid(cid)		((cid & 0xf000) >> 12)
+#define nvme_tag_from_cid(cid)			(cid & 0xfff)
+
+static inline u16 nvme_cid(struct request *rq)
+{
+	return nvme_cid_install_genctr(nvme_req(rq)->genctr) | rq->tag;
+}
+
+static inline struct request *nvme_find_rq(struct blk_mq_tags *tags,
+		u16 command_id)
+{
+	u8 genctr = nvme_genctr_from_cid(command_id);
+	u16 tag = nvme_tag_from_cid(command_id);
+	struct request *rq;
+
+	rq = blk_mq_tag_to_rq(tags, tag);
+	if (unlikely(!rq)) {
+		pr_err("could not locate request for tag %#x\n",
+			tag);
+		return NULL;
+	}
+	if (unlikely(nvme_genctr_mask(nvme_req(rq)->genctr) != genctr)) {
+		dev_err(nvme_req(rq)->ctrl->device,
+			"request %#x genctr mismatch (got %#x expected %#x)\n",
+			tag, genctr, nvme_genctr_mask(nvme_req(rq)->genctr));
+		return NULL;
+	}
+	return rq;
+}
+
+static inline struct request *nvme_cid_to_rq(struct blk_mq_tags *tags,
+                u16 command_id)
+{
+	return blk_mq_tag_to_rq(tags, nvme_tag_from_cid(command_id));
+}
+
 #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
 void nvme_fault_inject_init(struct nvme_fault_inject *fault_inj,
 			    const char *dev_name);
@@ -594,7 +638,8 @@ static inline void nvme_put_ctrl(struct nvme_ctrl *ctrl)
 
 static inline bool nvme_is_aen_req(u16 qid, __u16 command_id)
 {
-	return !qid && command_id >= NVME_AQ_BLK_MQ_DEPTH;
+	return !qid &&
+		nvme_tag_from_cid(command_id) >= NVME_AQ_BLK_MQ_DEPTH;
 }
 
 void nvme_complete_rq(struct request *req);
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index d2ee9e46f5d3..e111532c2082 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1012,7 +1012,7 @@ static inline void nvme_handle_cqe(struct nvme_queue *nvmeq, u16 idx)
 		return;
 	}
 
-	req = blk_mq_tag_to_rq(nvme_queue_tagset(nvmeq), command_id);
+	req = nvme_find_rq(nvme_queue_tagset(nvmeq), command_id);
 	if (unlikely(!req)) {
 		dev_warn(nvmeq->dev->ctrl.device,
 			"invalid id %d completed on queue %d\n",
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 74bf2c7f2b80..14d5023603d7 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1730,10 +1730,10 @@ static void nvme_rdma_process_nvme_rsp(struct nvme_rdma_queue *queue,
 	struct request *rq;
 	struct nvme_rdma_request *req;
 
-	rq = blk_mq_tag_to_rq(nvme_rdma_tagset(queue), cqe->command_id);
+	rq = nvme_find_rq(nvme_rdma_tagset(queue), cqe->command_id);
 	if (!rq) {
 		dev_err(queue->ctrl->ctrl.device,
-			"tag 0x%x on QP %#x not found\n",
+			"got bad command_id %#x on QP %#x\n",
 			cqe->command_id, queue->qp->qp_num);
 		nvme_rdma_error_recovery(queue->ctrl);
 		return;
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 3ad65f42fc1e..b906ee41449e 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -488,11 +488,11 @@ static int nvme_tcp_process_nvme_cqe(struct nvme_tcp_queue *queue,
 {
 	struct request *rq;
 
-	rq = blk_mq_tag_to_rq(nvme_tcp_tagset(queue), cqe->command_id);
+	rq = nvme_find_rq(nvme_tcp_tagset(queue), cqe->command_id);
 	if (!rq) {
 		dev_err(queue->ctrl->ctrl.device,
-			"queue %d tag 0x%x not found\n",
-			nvme_tcp_queue_id(queue), cqe->command_id);
+			"got bad cqe.command_id %#x on queue %d\n",
+			cqe->command_id, nvme_tcp_queue_id(queue));
 		nvme_tcp_error_recovery(&queue->ctrl->ctrl);
 		return -EINVAL;
 	}
@@ -509,11 +509,11 @@ static int nvme_tcp_handle_c2h_data(struct nvme_tcp_queue *queue,
 {
 	struct request *rq;
 
-	rq = blk_mq_tag_to_rq(nvme_tcp_tagset(queue), pdu->command_id);
+	rq = nvme_find_rq(nvme_tcp_tagset(queue), pdu->command_id);
 	if (!rq) {
 		dev_err(queue->ctrl->ctrl.device,
-			"queue %d tag %#x not found\n",
-			nvme_tcp_queue_id(queue), pdu->command_id);
+			"got bad c2hdata.command_id %#x on queue %d\n",
+			pdu->command_id, nvme_tcp_queue_id(queue));
 		return -ENOENT;
 	}
 
@@ -607,7 +607,7 @@ static int nvme_tcp_setup_h2c_data_pdu(struct nvme_tcp_request *req,
 	data->hdr.plen =
 		cpu_to_le32(data->hdr.hlen + hdgst + req->pdu_len + ddgst);
 	data->ttag = pdu->ttag;
-	data->command_id = rq->tag;
+	data->command_id = nvme_cid(rq);
 	data->data_offset = cpu_to_le32(req->data_sent);
 	data->data_length = cpu_to_le32(req->pdu_len);
 	return 0;
@@ -620,11 +620,11 @@ static int nvme_tcp_handle_r2t(struct nvme_tcp_queue *queue,
 	struct request *rq;
 	int ret;
 
-	rq = blk_mq_tag_to_rq(nvme_tcp_tagset(queue), pdu->command_id);
+	rq = nvme_find_rq(nvme_tcp_tagset(queue), pdu->command_id);
 	if (!rq) {
 		dev_err(queue->ctrl->ctrl.device,
-			"queue %d tag %#x not found\n",
-			nvme_tcp_queue_id(queue), pdu->command_id);
+			"got bad r2t.command_id %#x on queue %d\n",
+			pdu->command_id, nvme_tcp_queue_id(queue));
 		return -ENOENT;
 	}
 	req = blk_mq_rq_to_pdu(rq);
@@ -704,7 +704,7 @@ static int nvme_tcp_recv_data(struct nvme_tcp_queue *queue, struct sk_buff *skb,
 {
 	struct nvme_tcp_data_pdu *pdu = (void *)queue->pdu;
 	struct request *rq =
-		blk_mq_tag_to_rq(nvme_tcp_tagset(queue), pdu->command_id);
+		nvme_cid_to_rq(nvme_tcp_tagset(queue), pdu->command_id);
 	struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
 
 	while (true) {
@@ -797,8 +797,8 @@ static int nvme_tcp_recv_ddgst(struct nvme_tcp_queue *queue,
 	}
 
 	if (pdu->hdr.flags & NVME_TCP_F_DATA_SUCCESS) {
-		struct request *rq = blk_mq_tag_to_rq(nvme_tcp_tagset(queue),
-						pdu->command_id);
+		struct request *rq = nvme_cid_to_rq(nvme_tcp_tagset(queue),
+					pdu->command_id);
 
 		nvme_tcp_end_request(rq, NVME_SC_SUCCESS);
 		queue->nr_cqe++;
diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c
index cb30cb942e1d..88d34212b6dc 100644
--- a/drivers/nvme/target/loop.c
+++ b/drivers/nvme/target/loop.c
@@ -107,10 +107,10 @@ static void nvme_loop_queue_response(struct nvmet_req *req)
 	} else {
 		struct request *rq;
 
-		rq = blk_mq_tag_to_rq(nvme_loop_tagset(queue), cqe->command_id);
+		rq = nvme_find_rq(nvme_loop_tagset(queue), cqe->command_id);
 		if (!rq) {
 			dev_err(queue->ctrl->ctrl.device,
-				"tag 0x%x on queue %d not found\n",
+				"got bad command_id %#x on queue %d\n",
 				cqe->command_id, nvme_loop_queue_idx(queue));
 			return;
 		}
-- 
2.27.0
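
To make the bit layout concrete, here is a small userspace sketch (plain C, not
kernel code) that replays the encode/decode round trip with the same masks the
patch uses; the tag and generation values are arbitrary examples.

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define nvme_genctr_mask(gen)		((gen) & 0xf)
#define nvme_genctr_from_cid(cid)	(((cid) & 0xf000) >> 12)
#define nvme_tag_from_cid(cid)		((cid) & 0xfff)

int main(void)
{
	uint8_t genctr = 0x5;	/* arbitrary example generation */
	uint16_t tag = 0x02d;	/* arbitrary example blk-mq tag */
	uint16_t cid = (nvme_genctr_mask(genctr) << 12) | tag;

	printf("command_id = %#x\n", (unsigned int)cid);	/* 0x502d */
	assert(nvme_tag_from_cid(cid) == tag);
	assert(nvme_genctr_from_cid(cid) == genctr);

	/*
	 * A duplicate completion that arrives after the tag has been reused
	 * still carries genctr 0x5, while the request's genctr has moved on,
	 * so nvme_find_rq() in the patch returns NULL instead of handing back
	 * the recycled request.
	 */
	uint8_t reused_genctr = 0x6;
	assert(nvme_genctr_from_cid(cid) != nvme_genctr_mask(reused_genctr));
	return 0;
}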



* Re: [PATCH v3 1/4] params: lift param_set_uint_minmax to common code
  2021-06-16 21:19 ` [PATCH v3 1/4] params: lift param_set_uint_minmax to common code Sagi Grimberg
@ 2021-06-17  5:45   ` Hannes Reinecke
  2021-06-17  8:00   ` Daniel Wagner
  2021-06-17 13:48   ` Christoph Hellwig
  2 siblings, 0 replies; 13+ messages in thread
From: Hannes Reinecke @ 2021-06-17  5:45 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme, Christoph Hellwig, Keith Busch
  Cc: Chaitanya Kulkarni

On 6/16/21 11:19 PM, Sagi Grimberg wrote:
> It is a useful helper hence move it to common code so others can enjoy
> it.
> 
> Suggested-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> Cc: linux-nfs@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>   include/linux/moduleparam.h |  3 +++
>   kernel/params.c             | 19 +++++++++++++++++++
>   net/sunrpc/xprtsock.c       | 18 ------------------
>   3 files changed, 22 insertions(+), 18 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


* Re: [PATCH v3 2/4] nvme-pci: limit maximum queue depth to 4095
  2021-06-16 21:19 ` [PATCH v3 2/4] nvme-pci: limit maximum queue depth to 4095 Sagi Grimberg
@ 2021-06-17  5:46   ` Hannes Reinecke
  2021-06-17  8:04   ` Daniel Wagner
  1 sibling, 0 replies; 13+ messages in thread
From: Hannes Reinecke @ 2021-06-17  5:46 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme, Christoph Hellwig, Keith Busch
  Cc: Chaitanya Kulkarni

On 6/16/21 11:19 PM, Sagi Grimberg wrote:
> We are going to use the upper 4 bits of the command_id for a generation
> counter, so enforce the new queue depth upper limit. As we enforce
> both the minimum and maximum queue depth, use param_set_uint_minmax instead of
> open coding it.
> 
> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>   drivers/nvme/host/pci.c | 15 ++++++---------
>   1 file changed, 6 insertions(+), 9 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


* Re: [PATCH v3 1/4] params: lift param_set_uint_minmax to common code
  2021-06-16 21:19 ` [PATCH v3 1/4] params: lift param_set_uint_minmax to common code Sagi Grimberg
  2021-06-17  5:45   ` Hannes Reinecke
@ 2021-06-17  8:00   ` Daniel Wagner
  2021-06-17 13:48   ` Christoph Hellwig
  2 siblings, 0 replies; 13+ messages in thread
From: Daniel Wagner @ 2021-06-17  8:00 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-nvme, Christoph Hellwig, Keith Busch, Hannes Reinecke,
	Chaitanya Kulkarni

On Wed, Jun 16, 2021 at 02:19:33PM -0700, Sagi Grimberg wrote:
> It is a useful helper hence move it to common code so others can enjoy
> it.
> 
> Suggested-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> Cc: linux-nfs@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>

[...]

> +int param_set_uint_minmax(const char *val,
> +			  const struct kernel_param *kp,
> +			  unsigned int min, unsigned int max)
> +{
> +	unsigned int num;
> +	int ret;
> +
> +	if (!val)
> +		return -EINVAL;
> +	ret = kstrtouint(val, 0, &num);
> +	if (ret)
> +		return ret;
> +	if (num < min || num > max)
> +		return -EINVAL;
> +	*((unsigned int *)kp->arg) = num;
> +	return 0;
> +}
> +EXPORT_SYMBOL(param_set_uint_minmax);

Couldn't this be EXPORT_SYMBOL_GPL?


* Re: [PATCH v3 2/4] nvme-pci: limit maximum queue depth to 4095
  2021-06-16 21:19 ` [PATCH v3 2/4] nvme-pci: limit maximum queue depth to 4095 Sagi Grimberg
  2021-06-17  5:46   ` Hannes Reinecke
@ 2021-06-17  8:04   ` Daniel Wagner
  1 sibling, 0 replies; 13+ messages in thread
From: Daniel Wagner @ 2021-06-17  8:04 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-nvme, Christoph Hellwig, Keith Busch, Hannes Reinecke,
	Chaitanya Kulkarni

On Wed, Jun 16, 2021 at 02:19:34PM -0700, Sagi Grimberg wrote:
> We are going to use the upper 4 bits of the command_id for a generation
> counter, so enforce the new queue depth upper limit. As we enforce
> both the minimum and maximum queue depth, use param_set_uint_minmax instead of
> open coding it.
> 
> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---

Reviewed-by: Daniel Wagner <dwagner@suse.de>


* Re: [PATCH v3 3/4] nvme-tcp: don't check blk_mq_tag_to_rq when receiving pdu data
  2021-06-16 21:19 ` [PATCH v3 3/4] nvme-tcp: don't check blk_mq_tag_to_rq when receiving pdu data Sagi Grimberg
@ 2021-06-17  8:11   ` Daniel Wagner
  0 siblings, 0 replies; 13+ messages in thread
From: Daniel Wagner @ 2021-06-17  8:11 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-nvme, Christoph Hellwig, Keith Busch, Hannes Reinecke,
	Chaitanya Kulkarni

On Wed, Jun 16, 2021 at 02:19:35PM -0700, Sagi Grimberg wrote:
> We already validate the command_id when receiving the c2hdata pdu header,
> and it does not change afterwards, so this is a redundant check.
> 
> Reviewed-by: Hannes Reinecke <hare@suse.de>
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>

Reviewed-by: Daniel Wagner <dwagner@suse.de>


* Re: [PATCH v3 4/4] nvme: code command_id with a genctr for use-after-free validation
  2021-06-16 21:19 ` [PATCH v3 4/4] nvme: code command_id with a genctr for use-after-free validation Sagi Grimberg
@ 2021-06-17  8:56   ` Daniel Wagner
  0 siblings, 0 replies; 13+ messages in thread
From: Daniel Wagner @ 2021-06-17  8:56 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-nvme, Christoph Hellwig, Keith Busch, Hannes Reinecke,
	Chaitanya Kulkarni

On Wed, Jun 16, 2021 at 02:19:36PM -0700, Sagi Grimberg wrote:
> We cannot detect a (perhaps buggy) controller that is sending us
> a completion for a request that was already completed (for example,
> sending a completion twice); this phenomenon has been seen in the wild
> a few times.
> 
> To protect against this, we use the upper 4 bits of the nvme sqe
> command_id as a 4-bit generation counter and verify that it matches
> the request's current generation, which is incremented on every execution.
> 
> The 16-bit command_id is now constructed as:
> | xxxx | xxxxxxxxxxxx |
>   gen    request tag
> 
> This means we give up some possible queue depth, as 12 bits allow
> a maximum queue depth of 4095 instead of 65536; however, we never
> create such deep queues anyway, so no real harm is done.
> 
> Suggested-by: Keith Busch <kbusch@kernel.org>
> Reviewed-by: Hannes Reinecke <hare@suse.de>
> Acked-by: Keith Busch <kbusch@kernel.org>
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>

I've tested this (functional testing only) on FC (NetApp target). All looks
good.

Tested-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>


* Re: [PATCH v3 1/4] params: lift param_set_uint_minmax to common code
  2021-06-16 21:19 ` [PATCH v3 1/4] params: lift param_set_uint_minmax to common code Sagi Grimberg
  2021-06-17  5:45   ` Hannes Reinecke
  2021-06-17  8:00   ` Daniel Wagner
@ 2021-06-17 13:48   ` Christoph Hellwig
  2 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2021-06-17 13:48 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-nvme, Christoph Hellwig, Keith Busch, Hannes Reinecke,
	Chaitanya Kulkarni

This needs a Cc to linux-kernel.

> @@ -432,6 +432,9 @@ extern const struct kernel_param_ops param_ops_uint;
>  extern int param_set_uint(const char *val, const struct kernel_param *kp);
>  extern int param_get_uint(char *buffer, const struct kernel_param *kp);
>  #define param_check_uint(name, p) __param_check(name, p, unsigned int)
> +int param_set_uint_minmax(const char *val,
> +			  const struct kernel_param *kp,
> +			  unsigned int min, unsigned int max);

Super minor nitpick, but for consistency I'd move it above the
param_check_uint definition.


* Re: [PATCH v3 0/4] nvme: protect against possible request reference after completion
  2021-06-16 21:19 [PATCH v3 0/4] nvme: protect against possible request reference after completion Sagi Grimberg
                   ` (3 preceding siblings ...)
  2021-06-16 21:19 ` [PATCH v3 4/4] nvme: code command_id with a genctr for use-after-free validation Sagi Grimberg
@ 2021-07-16  7:15 ` Christoph Hellwig
  4 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2021-07-16  7:15 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-nvme, Christoph Hellwig, Keith Busch, Hannes Reinecke,
	Chaitanya Kulkarni

Thanks,

applied to nvme-5.15.

