Linux-NVME Archive on lore.kernel.org
* [PATCH V2 0/4] blk-mq/nvme-tcp: fix timed out related races
@ 2020-10-20  8:52 Ming Lei
  2020-10-20  8:52 ` [PATCH V2 1/4] blk-mq: check rq->state explicitly in blk_mq_tagset_count_completed_rqs Ming Lei
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Ming Lei @ 2020-10-20  8:52 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-nvme, Christoph Hellwig, Keith Busch
  Cc: Yi Zhang, Sagi Grimberg, Chao Leng, Ming Lei

Hi,

The first two patches fix request completion related races.

The last two patches fix and improve nvme-tcp error recovery.

With the 4 patches, nvme/012 passes on nvme-tcp on Yi Zhang's test
machine.

V2:
	- re-order patch3 and patch4
	- fix comment
	- improve patch "nvme: tcp: fix race between timeout and normal completion"


Ming Lei (4):
  blk-mq: check rq->state explicitly in
    blk_mq_tagset_count_completed_rqs
  blk-mq: fix blk_mq_request_completed
  nvme: tcp: complete non-IO requests atomically
  nvme: tcp: fix race between timeout and normal completion

 block/blk-flush.c       |  2 +
 block/blk-mq-tag.c      |  2 +-
 drivers/nvme/host/tcp.c | 98 ++++++++++++++++++++++++++++++++---------
 include/linux/blk-mq.h  |  8 +++-
 4 files changed, 86 insertions(+), 24 deletions(-)

CC: Chao Leng <lengchao@huawei.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Yi Zhang <yi.zhang@redhat.com>
-- 
2.25.2


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH V2 1/4] blk-mq: check rq->state explicitly in blk_mq_tagset_count_completed_rqs
  2020-10-20  8:52 [PATCH V2 0/4] blk-mq/nvme-tcp: fix timed out related races Ming Lei
@ 2020-10-20  8:52 ` Ming Lei
  2020-10-20  8:52 ` [PATCH V2 2/4] blk-mq: fix blk_mq_request_completed Ming Lei
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2020-10-20  8:52 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-nvme, Christoph Hellwig, Keith Busch
  Cc: Yi Zhang, Sagi Grimberg, Chao Leng, Ming Lei

blk_mq_tagset_count_completed_rqs() is called from
blk_mq_tagset_wait_completed_request() to drain requests that are being
completed remotely. What we need to ensure is that request->state has
moved out of MQ_RQ_COMPLETE.

So check MQ_RQ_COMPLETE explicitly in blk_mq_tagset_count_completed_rqs().

Meanwhile, mark the flush request as IDLE in its .end_io(), aligning it
with how normal requests are ended: in the !elevator case the flush
request may stay in the in-flight tags, so its state needs to be moved
to IDLE explicitly.
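As a rough userspace model (not the kernel code; the struct and the
state names below are simplified stand-ins), the counting rule after
this patch can be sketched as:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Simplified stand-ins for the blk-mq request states. */
enum mq_rq_state { MQ_RQ_IDLE, MQ_RQ_IN_FLIGHT, MQ_RQ_COMPLETE };

struct request { _Atomic int state; };

/*
 * Model of blk_mq_tagset_count_completed_rqs() after the patch:
 * count a request only while it is actually in MQ_RQ_COMPLETE.
 * A flush request whose .end_io already moved it back to IDLE is
 * no longer "being completed" and must not keep the drain waiting.
 */
static bool count_completed(struct request *rq, void *data)
{
	unsigned *count = data;

	if (atomic_load(&rq->state) == MQ_RQ_COMPLETE)
		(*count)++;
	return true;
}
```

A drain loop built on this predicate terminates as soon as no request
remains in the transient COMPLETE state.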

Cc: Chao Leng <lengchao@huawei.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-flush.c  | 2 ++
 block/blk-mq-tag.c | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/block/blk-flush.c b/block/blk-flush.c
index 53abb5c73d99..f6a07ae533c9 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -231,6 +231,8 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error)
 		return;
 	}
 
+	WRITE_ONCE(flush_rq->state, MQ_RQ_IDLE);
+
 	if (fq->rq_status != BLK_STS_OK)
 		error = fq->rq_status;
 
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 9c92053e704d..10ff8968b93b 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -367,7 +367,7 @@ static bool blk_mq_tagset_count_completed_rqs(struct request *rq,
 {
 	unsigned *count = data;
 
-	if (blk_mq_request_completed(rq))
+	if (blk_mq_rq_state(rq) == MQ_RQ_COMPLETE)
 		(*count)++;
 	return true;
 }
-- 
2.25.2



* [PATCH V2 2/4] blk-mq: fix blk_mq_request_completed
  2020-10-20  8:52 [PATCH V2 0/4] blk-mq/nvme-tcp: fix timed out related races Ming Lei
  2020-10-20  8:52 ` [PATCH V2 1/4] blk-mq: check rq->state explicitly in blk_mq_tagset_count_completed_rqs Ming Lei
@ 2020-10-20  8:52 ` Ming Lei
  2020-10-20  8:53 ` [PATCH V2 3/4] nvme: tcp: complete non-IO requests atomically Ming Lei
  2020-10-20  8:53 ` [PATCH V2 4/4] nvme: tcp: fix race between timeout and normal completion Ming Lei
  3 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2020-10-20  8:52 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-nvme, Christoph Hellwig, Keith Busch
  Cc: Yi Zhang, Sagi Grimberg, Chao Leng, Ming Lei

MQ_RQ_COMPLETE is a transient state: the .complete callback either ends
or requeues the request, and in both cases the request state is updated
to IDLE from inside .complete.

blk_mq_request_completed() is often used by drivers to avoid double
completion, with the help of a driver-specific synchronization approach.
For example, NVMe TCP calls blk_mq_request_completed() in both its
timeout handler and its abort handler to avoid double completion. If the
request's state is updated to IDLE in either one, the other code path
may regard the request as not completed and will complete it once more,
triggering a double completion.

Yi reported[1] that a 'refcount_t: underflow; use-after-free' on rq->ref
is triggered by blktests (nvme/012) on a very slow machine.

Fix this issue by treating a request as completed once its state is no
longer IN_FLIGHT.
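A minimal userspace sketch of the semantic change, with simplified
stand-ins for the blk-mq types and C11 atomics in place of the kernel's
accessors:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Simplified stand-ins for the blk-mq request states. */
enum mq_rq_state { MQ_RQ_IDLE, MQ_RQ_IN_FLIGHT, MQ_RQ_COMPLETE };

struct request { _Atomic int state; };

/* Old check: misses a request whose .complete already moved it to IDLE. */
static bool completed_old(struct request *rq)
{
	return atomic_load(&rq->state) == MQ_RQ_COMPLETE;
}

/*
 * New check: COMPLETE is transient, so treat anything that has left
 * IN_FLIGHT as completed.  Two racing paths (timeout and abort) then
 * cannot both decide the request still needs completing.
 */
static bool completed_new(struct request *rq)
{
	return atomic_load(&rq->state) != MQ_RQ_IN_FLIGHT;
}
```

With the old predicate, a request that was completed and requeued to
IDLE looks "not completed" to the second racing path, which then
completes it again and underflows rq->ref.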

Reported-by: Yi Zhang <yi.zhang@redhat.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Cc: Chao Leng <lengchao@huawei.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 include/linux/blk-mq.h | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 90da3582b91d..9a67408f79d9 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -478,9 +478,15 @@ static inline int blk_mq_request_started(struct request *rq)
 	return blk_mq_rq_state(rq) != MQ_RQ_IDLE;
 }
 
+/*
+ * This is often called from abort handlers to avoid double completion.
+ * MQ_RQ_COMPLETE is a transient state because the .complete callback
+ * may end or requeue the request; either way it ends up marked IDLE.
+ * So return true if the request's state is no longer IN_FLIGHT.
+ */
 static inline int blk_mq_request_completed(struct request *rq)
 {
-	return blk_mq_rq_state(rq) == MQ_RQ_COMPLETE;
+	return blk_mq_rq_state(rq) != MQ_RQ_IN_FLIGHT;
 }
 
 void blk_mq_start_request(struct request *rq);
-- 
2.25.2



* [PATCH V2 3/4] nvme: tcp: complete non-IO requests atomically
  2020-10-20  8:52 [PATCH V2 0/4] blk-mq/nvme-tcp: fix timed out related races Ming Lei
  2020-10-20  8:52 ` [PATCH V2 1/4] blk-mq: check rq->state explicitly in blk_mq_tagset_count_completed_rqs Ming Lei
  2020-10-20  8:52 ` [PATCH V2 2/4] blk-mq: fix blk_mq_request_completed Ming Lei
@ 2020-10-20  8:53 ` Ming Lei
  2020-10-20  9:04   ` Chao Leng
  2020-10-20  8:53 ` [PATCH V2 4/4] nvme: tcp: fix race between timeout and normal completion Ming Lei
  3 siblings, 1 reply; 10+ messages in thread
From: Ming Lei @ 2020-10-20  8:53 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-nvme, Christoph Hellwig, Keith Busch
  Cc: Yi Zhang, Sagi Grimberg, Chao Leng, Ming Lei

During the controller's CONNECTING state, admin/fabrics/connect requests
are submitted to recover the controller, and we allow these requests to
be aborted directly in the timeout handler so they do not block the
setup procedure.

So a timeout vs. normal completion race exists for these requests, since
the admin/fabrics/connect queues won't be shut down before the timeout
is handled during the CONNECTING state.

Add atomic completion for requests from the connect/fabrics/admin queues
to avoid the race.
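The claim-once idea can be modeled in userspace with C11 atomics;
`nvme_tcp_request_model` and `mark_complete` are hypothetical stand-ins
for the patch's test_and_set_bit() on REQ_STATE_COMPLETE:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-request completion flag. */
struct nvme_tcp_request_model { atomic_flag claimed; };

/*
 * Model of nvme_tcp_mark_rq_complete(): like test_and_set_bit(),
 * atomic_flag_test_and_set() returns the previous value, so exactly
 * one of two racing callers (normal completion vs. timeout) sees
 * "false" and wins the right to complete the request.
 */
static bool mark_complete(struct nvme_tcp_request_model *req)
{
	return !mark_complete ? false : !atomic_flag_test_and_set(&req->claimed);
}
```

Whichever path calls this first proceeds to complete the request; the
loser backs off, so the completion happens exactly once.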

CC: Chao Leng <lengchao@huawei.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/nvme/host/tcp.c | 40 +++++++++++++++++++++++++++++++++++++---
 1 file changed, 37 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index d6a3e1487354..7e85bd4a8d1b 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -30,6 +30,8 @@ static int so_priority;
 module_param(so_priority, int, 0644);
 MODULE_PARM_DESC(so_priority, "nvme tcp socket optimize priority");
 
+#define REQ_STATE_COMPLETE     0
+
 enum nvme_tcp_send_state {
 	NVME_TCP_SEND_CMD_PDU = 0,
 	NVME_TCP_SEND_H2C_PDU,
@@ -56,6 +58,8 @@ struct nvme_tcp_request {
 	size_t			offset;
 	size_t			data_sent;
 	enum nvme_tcp_send_state state;
+
+	unsigned long		comp_state;
 };
 
 enum nvme_tcp_queue_flags {
@@ -469,6 +473,33 @@ static void nvme_tcp_error_recovery(struct nvme_ctrl *ctrl)
 	queue_work(nvme_reset_wq, &to_tcp_ctrl(ctrl)->err_work);
 }
 
+/*
+ * Requests originating from the admin, fabrics and connect_q queues
+ * have to be completed atomically because we don't otherwise cover the
+ * race between timeout and normal completion for these queues.
+ */
+static inline bool nvme_tcp_need_atomic_complete(struct request *rq)
+{
+	return !rq->rq_disk;
+}
+
+static inline void nvme_tcp_clear_rq_complete(struct request *rq)
+{
+	struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
+
+	if (unlikely(nvme_tcp_need_atomic_complete(rq)))
+		clear_bit(REQ_STATE_COMPLETE, &req->comp_state);
+}
+
+static inline bool nvme_tcp_mark_rq_complete(struct request *rq)
+{
+	struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
+
+	if (unlikely(nvme_tcp_need_atomic_complete(rq)))
+		return !test_and_set_bit(REQ_STATE_COMPLETE, &req->comp_state);
+	return true;
+}
+
 static int nvme_tcp_process_nvme_cqe(struct nvme_tcp_queue *queue,
 		struct nvme_completion *cqe)
 {
@@ -483,7 +514,8 @@ static int nvme_tcp_process_nvme_cqe(struct nvme_tcp_queue *queue,
 		return -EINVAL;
 	}
 
-	if (!nvme_try_complete_req(rq, cqe->status, cqe->result))
+	if (nvme_tcp_mark_rq_complete(rq) &&
+			!nvme_try_complete_req(rq, cqe->status, cqe->result))
 		nvme_complete_rq(rq);
 	queue->nr_cqe++;
 
@@ -674,7 +706,8 @@ static inline void nvme_tcp_end_request(struct request *rq, u16 status)
 {
 	union nvme_result res = {};
 
-	if (!nvme_try_complete_req(rq, cpu_to_le16(status << 1), res))
+	if (nvme_tcp_mark_rq_complete(rq) &&
+			!nvme_try_complete_req(rq, cpu_to_le16(status << 1), res))
 		nvme_complete_rq(rq);
 }
 
@@ -2174,7 +2207,7 @@ static void nvme_tcp_complete_timed_out(struct request *rq)
 	/* fence other contexts that may complete the command */
 	mutex_lock(&to_tcp_ctrl(ctrl)->teardown_lock);
 	nvme_tcp_stop_queue(ctrl, nvme_tcp_queue_id(req->queue));
-	if (!blk_mq_request_completed(rq)) {
+	if (nvme_tcp_mark_rq_complete(rq) && !blk_mq_request_completed(rq)) {
 		nvme_req(rq)->status = NVME_SC_HOST_ABORTED_CMD;
 		blk_mq_complete_request(rq);
 	}
@@ -2315,6 +2348,7 @@ static blk_status_t nvme_tcp_queue_rq(struct blk_mq_hw_ctx *hctx,
 	if (unlikely(ret))
 		return ret;
 
+	nvme_tcp_clear_rq_complete(rq);
 	blk_mq_start_request(rq);
 
 	nvme_tcp_queue_request(req, true, bd->last);
-- 
2.25.2



* [PATCH V2 4/4] nvme: tcp: fix race between timeout and normal completion
  2020-10-20  8:52 [PATCH V2 0/4] blk-mq/nvme-tcp: fix timed out related races Ming Lei
                   ` (2 preceding siblings ...)
  2020-10-20  8:53 ` [PATCH V2 3/4] nvme: tcp: complete non-IO requests atomically Ming Lei
@ 2020-10-20  8:53 ` Ming Lei
  3 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2020-10-20  8:53 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-nvme, Christoph Hellwig, Keith Busch
  Cc: Yi Zhang, Sagi Grimberg, Chao Leng, Ming Lei

The NVMe TCP timeout handler allows a request to be aborted directly
when the controller isn't in the LIVE state. nvme_tcp_error_recovery()
updates the controller state to RESETTING and schedules the reset work
function. If a new timeout fires before the work function runs, the
newly timed-out request will be aborted directly; however, at that point
the controller hasn't been shut down yet, so a timeout abort vs. normal
completion race will be triggered.

Fix the race by the following approach:

1) Delay unquiescing the IO queues and the admin queue until the
controller is LIVE, because it isn't necessary to start queues during
RESETTING. Unquiescing earlier would risk the timeout vs. normal
completion race, because we need to abort timed-out requests directly
during the CONNECTING state in order to set up the controller.

2) Abort timed-out requests directly only when the controller is in the
CONNECTING or DELETING state. In the CONNECTING state, requests are only
submitted to recover the controller and normal IO requests aren't
allowed, so it is safe to do so. In the DELETING state, tear down the
controller if an IO request times out.
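The decision taken by the timeout handler after this patch can be
sketched as a pure function; the enum names here are simplified
stand-ins, not the kernel's identifiers:

```c
#include <stdbool.h>

/* Simplified stand-ins for the controller states and handler outcomes. */
enum ctrl_state { ST_LIVE, ST_RESETTING, ST_CONNECTING, ST_DELETING,
		  ST_DELETING_NOIO };
enum eh_action { EH_ABORT_DIRECTLY, EH_TEARDOWN_CTRL, EH_ERROR_RECOVERY };

/*
 * Model of the decision in nvme_tcp_timeout() after the patch:
 * non-IO (admin/fabrics/connect) requests are aborted directly during
 * CONNECTING/DELETING; an IO timeout during DELETING tears the
 * controller down; everything else goes through error recovery.
 */
static enum eh_action timeout_action(enum ctrl_state state, bool is_io)
{
	if (state == ST_CONNECTING || state == ST_DELETING ||
	    state == ST_DELETING_NOIO) {
		if (!is_io)
			return EH_ABORT_DIRECTLY;
		if (state != ST_CONNECTING)
			return EH_TEARDOWN_CTRL;
	}
	return EH_ERROR_RECOVERY;
}
```

Note the deliberate fall-through for an IO timeout during CONNECTING:
per the patch's comment, that window is benign and handled by the normal
error-recovery path.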

CC: Chao Leng <lengchao@huawei.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/nvme/host/tcp.c | 58 +++++++++++++++++++++++++++--------------
 1 file changed, 39 insertions(+), 19 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 7e85bd4a8d1b..3a137631b2b3 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1919,7 +1919,6 @@ static int nvme_tcp_configure_admin_queue(struct nvme_ctrl *ctrl, bool new)
 static void nvme_tcp_teardown_admin_queue(struct nvme_ctrl *ctrl,
 		bool remove)
 {
-	mutex_lock(&to_tcp_ctrl(ctrl)->teardown_lock);
 	blk_mq_quiesce_queue(ctrl->admin_q);
 	nvme_tcp_stop_queue(ctrl, 0);
 	if (ctrl->admin_tagset) {
@@ -1930,15 +1929,13 @@ static void nvme_tcp_teardown_admin_queue(struct nvme_ctrl *ctrl,
 	if (remove)
 		blk_mq_unquiesce_queue(ctrl->admin_q);
 	nvme_tcp_destroy_admin_queue(ctrl, remove);
-	mutex_unlock(&to_tcp_ctrl(ctrl)->teardown_lock);
 }
 
 static void nvme_tcp_teardown_io_queues(struct nvme_ctrl *ctrl,
 		bool remove)
 {
-	mutex_lock(&to_tcp_ctrl(ctrl)->teardown_lock);
 	if (ctrl->queue_count <= 1)
-		goto out;
+		return;
 	blk_mq_quiesce_queue(ctrl->admin_q);
 	nvme_start_freeze(ctrl);
 	nvme_stop_queues(ctrl);
@@ -1951,8 +1948,6 @@ static void nvme_tcp_teardown_io_queues(struct nvme_ctrl *ctrl,
 	if (remove)
 		nvme_start_queues(ctrl);
 	nvme_tcp_destroy_io_queues(ctrl, remove);
-out:
-	mutex_unlock(&to_tcp_ctrl(ctrl)->teardown_lock);
 }
 
 static void nvme_tcp_reconnect_or_remove(struct nvme_ctrl *ctrl)
@@ -1971,6 +1966,10 @@ static void nvme_tcp_reconnect_or_remove(struct nvme_ctrl *ctrl)
 				ctrl->opts->reconnect_delay * HZ);
 	} else {
 		dev_info(ctrl->device, "Removing controller...\n");
+
+		/* start queues so the removal path is not blocked */
+		nvme_start_queues(ctrl);
+		blk_mq_unquiesce_queue(ctrl->admin_q);
 		nvme_delete_ctrl(ctrl);
 	}
 }
@@ -2063,11 +2062,11 @@ static void nvme_tcp_error_recovery_work(struct work_struct *work)
 	struct nvme_ctrl *ctrl = &tcp_ctrl->ctrl;
 
 	nvme_stop_keep_alive(ctrl);
+
+	mutex_lock(&tcp_ctrl->teardown_lock);
 	nvme_tcp_teardown_io_queues(ctrl, false);
-	/* unquiesce to fail fast pending requests */
-	nvme_start_queues(ctrl);
 	nvme_tcp_teardown_admin_queue(ctrl, false);
-	blk_mq_unquiesce_queue(ctrl->admin_q);
+	mutex_unlock(&tcp_ctrl->teardown_lock);
 
 	if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING)) {
 		/* state change failure is ok if we started ctrl delete */
@@ -2084,6 +2083,7 @@ static void nvme_tcp_teardown_ctrl(struct nvme_ctrl *ctrl, bool shutdown)
 	cancel_work_sync(&to_tcp_ctrl(ctrl)->err_work);
 	cancel_delayed_work_sync(&to_tcp_ctrl(ctrl)->connect_work);
 
+	mutex_lock(&to_tcp_ctrl(ctrl)->teardown_lock);
 	nvme_tcp_teardown_io_queues(ctrl, shutdown);
 	blk_mq_quiesce_queue(ctrl->admin_q);
 	if (shutdown)
@@ -2091,6 +2091,7 @@ static void nvme_tcp_teardown_ctrl(struct nvme_ctrl *ctrl, bool shutdown)
 	else
 		nvme_disable_ctrl(ctrl);
 	nvme_tcp_teardown_admin_queue(ctrl, shutdown);
+	mutex_unlock(&to_tcp_ctrl(ctrl)->teardown_lock);
 }
 
 static void nvme_tcp_delete_ctrl(struct nvme_ctrl *ctrl)
@@ -2225,22 +2226,41 @@ nvme_tcp_timeout(struct request *rq, bool reserved)
 		"queue %d: timeout request %#x type %d\n",
 		nvme_tcp_queue_id(req->queue), rq->tag, pdu->hdr.type);
 
-	if (ctrl->state != NVME_CTRL_LIVE) {
+	/*
+	 * During CONNECTING or DELETING the controller has already been
+	 * shut down, so it is safe to abort the request directly; otherwise
+	 * the timeout vs. normal completion race will be triggered.
+	 */
+	if (ctrl->state == NVME_CTRL_CONNECTING ||
+			ctrl->state == NVME_CTRL_DELETING ||
+			ctrl->state == NVME_CTRL_DELETING_NOIO) {
 		/*
-		 * If we are resetting, connecting or deleting we should
-		 * complete immediately because we may block controller
-		 * teardown or setup sequence
+		 * If we are connecting, we should complete immediately because
+		 * we may block the controller setup sequence
 		 * - ctrl disable/shutdown fabrics requests
 		 * - connect requests
 		 * - initialization admin requests
-		 * - I/O requests that entered after unquiescing and
-		 *   the controller stopped responding
+		 */
+		if (!rq->rq_disk) {
+			nvme_tcp_complete_timed_out(rq);
+			return BLK_EH_DONE;
+		}
+
+		/*
+		 * During CONNECTING, all in-flight requests are aborted and the
+		 * queue is stopped, so in theory timed-out requests cannot be
+		 * seen. One can still happen when an IO timeout is triggered
+		 * before the change to CONNECTING but its handling is scheduled
+		 * after the update to CONNECTING, so it is safe to ignore this
+		 * case.
 		 *
-		 * All other requests should be cancelled by the error
-		 * recovery work, so it's fine that we fail it here.
+		 * During DELETING, tear down the controller so that forward
+		 * progress can be made.
 		 */
-		nvme_tcp_complete_timed_out(rq);
-		return BLK_EH_DONE;
+		if (ctrl->state != NVME_CTRL_CONNECTING) {
+			nvme_tcp_teardown_ctrl(ctrl, false);
+			return BLK_EH_DONE;
+		}
 	}
 
 	/*
-- 
2.25.2



* Re: [PATCH V2 3/4] nvme: tcp: complete non-IO requests atomically
  2020-10-20  8:53 ` [PATCH V2 3/4] nvme: tcp: complete non-IO requests atomically Ming Lei
@ 2020-10-20  9:04   ` Chao Leng
  2020-10-21  1:22     ` Ming Lei
  0 siblings, 1 reply; 10+ messages in thread
From: Chao Leng @ 2020-10-20  9:04 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-block, linux-nvme, Christoph Hellwig,
	Keith Busch
  Cc: Yi Zhang, Sagi Grimberg



On 2020/10/20 16:53, Ming Lei wrote:
> During the controller's CONNECTING state, admin/fabrics/connect requests
> are submitted to recover the controller, and we allow these requests to
> be aborted directly in the timeout handler so they do not block the
> setup procedure.
> 
> So a timeout vs. normal completion race exists for these requests, since
> the admin/fabrics/connect queues won't be shut down before the timeout
> is handled during the CONNECTING state.
> 
> Add atomic completion for requests from the connect/fabrics/admin queues
> to avoid the race.
> 
> CC: Chao Leng <lengchao@huawei.com>
> Cc: Sagi Grimberg <sagi@grimberg.me>
> Reported-by: Yi Zhang <yi.zhang@redhat.com>
> Tested-by: Yi Zhang <yi.zhang@redhat.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>   drivers/nvme/host/tcp.c | 40 +++++++++++++++++++++++++++++++++++++---
>   1 file changed, 37 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index d6a3e1487354..7e85bd4a8d1b 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -30,6 +30,8 @@ static int so_priority;
>   module_param(so_priority, int, 0644);
>   MODULE_PARM_DESC(so_priority, "nvme tcp socket optimize priority");
>   
> +#define REQ_STATE_COMPLETE     0
> +
>   enum nvme_tcp_send_state {
>   	NVME_TCP_SEND_CMD_PDU = 0,
>   	NVME_TCP_SEND_H2C_PDU,
> @@ -56,6 +58,8 @@ struct nvme_tcp_request {
>   	size_t			offset;
>   	size_t			data_sent;
>   	enum nvme_tcp_send_state state;
> +
> +	unsigned long		comp_state;
I do not think adding a state is a good idea; it is similar to
rq->state.
In the teardown process, deleting the timer and cancelling the timeout
work after the queues are quiesced may be a better option.
I will send the patch later.
The patch has already been tested with RoCE for more than one week.
>   };
>   
>   enum nvme_tcp_queue_flags {
> @@ -469,6 +473,33 @@ static void nvme_tcp_error_recovery(struct nvme_ctrl *ctrl)
>   	queue_work(nvme_reset_wq, &to_tcp_ctrl(ctrl)->err_work);
>   }
>   
> +/*
> + * Requests originating from the admin, fabrics and connect_q queues
> + * have to be completed atomically because we don't otherwise cover the
> + * race between timeout and normal completion for these queues.
> + */
> +static inline bool nvme_tcp_need_atomic_complete(struct request *rq)
> +{
> +	return !rq->rq_disk;
> +}
> +
> +static inline void nvme_tcp_clear_rq_complete(struct request *rq)
> +{
> +	struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
> +
> +	if (unlikely(nvme_tcp_need_atomic_complete(rq)))
> +		clear_bit(REQ_STATE_COMPLETE, &req->comp_state);
> +}
> +
> +static inline bool nvme_tcp_mark_rq_complete(struct request *rq)
> +{
> +	struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
> +
> +	if (unlikely(nvme_tcp_need_atomic_complete(rq)))
> +		return !test_and_set_bit(REQ_STATE_COMPLETE, &req->comp_state);
> +	return true;
> +}
> +
>   static int nvme_tcp_process_nvme_cqe(struct nvme_tcp_queue *queue,
>   		struct nvme_completion *cqe)
>   {
> @@ -483,7 +514,8 @@ static int nvme_tcp_process_nvme_cqe(struct nvme_tcp_queue *queue,
>   		return -EINVAL;
>   	}
>   
> -	if (!nvme_try_complete_req(rq, cqe->status, cqe->result))
> +	if (nvme_tcp_mark_rq_complete(rq) &&
> +			!nvme_try_complete_req(rq, cqe->status, cqe->result))
>   		nvme_complete_rq(rq);
>   	queue->nr_cqe++;
>   
> @@ -674,7 +706,8 @@ static inline void nvme_tcp_end_request(struct request *rq, u16 status)
>   {
>   	union nvme_result res = {};
>   
> -	if (!nvme_try_complete_req(rq, cpu_to_le16(status << 1), res))
> +	if (nvme_tcp_mark_rq_complete(rq) &&
> +			!nvme_try_complete_req(rq, cpu_to_le16(status << 1), res))
>   		nvme_complete_rq(rq);
>   }
>   
> @@ -2174,7 +2207,7 @@ static void nvme_tcp_complete_timed_out(struct request *rq)
>   	/* fence other contexts that may complete the command */
>   	mutex_lock(&to_tcp_ctrl(ctrl)->teardown_lock);
>   	nvme_tcp_stop_queue(ctrl, nvme_tcp_queue_id(req->queue));
> -	if (!blk_mq_request_completed(rq)) {
> +	if (nvme_tcp_mark_rq_complete(rq) && !blk_mq_request_completed(rq)) {
>   		nvme_req(rq)->status = NVME_SC_HOST_ABORTED_CMD;
>   		blk_mq_complete_request(rq);
>   	}
> @@ -2315,6 +2348,7 @@ static blk_status_t nvme_tcp_queue_rq(struct blk_mq_hw_ctx *hctx,
>   	if (unlikely(ret))
>   		return ret;
>   
> +	nvme_tcp_clear_rq_complete(rq);
>   	blk_mq_start_request(rq);
>   
>   	nvme_tcp_queue_request(req, true, bd->last);
> 


* Re: [PATCH V2 3/4] nvme: tcp: complete non-IO requests atomically
  2020-10-20  9:04   ` Chao Leng
@ 2020-10-21  1:22     ` Ming Lei
  2020-10-21  2:20       ` Chao Leng
  0 siblings, 1 reply; 10+ messages in thread
From: Ming Lei @ 2020-10-21  1:22 UTC (permalink / raw)
  To: Chao Leng
  Cc: Jens Axboe, Yi Zhang, Sagi Grimberg, linux-nvme, linux-block,
	Keith Busch, Christoph Hellwig

On Tue, Oct 20, 2020 at 05:04:29PM +0800, Chao Leng wrote:
> 
> 
> On 2020/10/20 16:53, Ming Lei wrote:
> > During the controller's CONNECTING state, admin/fabrics/connect requests
> > are submitted to recover the controller, and we allow these requests to
> > be aborted directly in the timeout handler so they do not block the
> > setup procedure.
> > 
> > So a timeout vs. normal completion race exists for these requests, since
> > the admin/fabrics/connect queues won't be shut down before the timeout
> > is handled during the CONNECTING state.
> > 
> > Add atomic completion for requests from the connect/fabrics/admin queues
> > to avoid the race.
> > 
> > CC: Chao Leng <lengchao@huawei.com>
> > Cc: Sagi Grimberg <sagi@grimberg.me>
> > Reported-by: Yi Zhang <yi.zhang@redhat.com>
> > Tested-by: Yi Zhang <yi.zhang@redhat.com>
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >   drivers/nvme/host/tcp.c | 40 +++++++++++++++++++++++++++++++++++++---
> >   1 file changed, 37 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> > index d6a3e1487354..7e85bd4a8d1b 100644
> > --- a/drivers/nvme/host/tcp.c
> > +++ b/drivers/nvme/host/tcp.c
> > @@ -30,6 +30,8 @@ static int so_priority;
> >   module_param(so_priority, int, 0644);
> >   MODULE_PARM_DESC(so_priority, "nvme tcp socket optimize priority");
> > +#define REQ_STATE_COMPLETE     0
> > +
> >   enum nvme_tcp_send_state {
> >   	NVME_TCP_SEND_CMD_PDU = 0,
> >   	NVME_TCP_SEND_H2C_PDU,
> > @@ -56,6 +58,8 @@ struct nvme_tcp_request {
> >   	size_t			offset;
> >   	size_t			data_sent;
> >   	enum nvme_tcp_send_state state;
> > +
> > +	unsigned long		comp_state;
> I do not think adding a state is a good idea; it is similar to
> rq->state.
> In the teardown process, deleting the timer and cancelling the timeout
> work after the queues are quiesced may be a better option.
> I will send the patch later.
> The patch has already been tested with RoCE for more than one week.

Actually there isn't a race between timeout and teardown, and patches 1
and 2 are enough to fix the issue reported by Yi.

It is just that rq->state is updated to IDLE in its .complete(), so
either one of the code paths may think that this rq isn't completed;
patch 2 has addressed this issue.

In short, the teardown lock is enough to cover the race.


Thanks,
Ming



* Re: [PATCH V2 3/4] nvme: tcp: complete non-IO requests atomically
  2020-10-21  1:22     ` Ming Lei
@ 2020-10-21  2:20       ` Chao Leng
  2020-10-21  2:55         ` Ming Lei
  0 siblings, 1 reply; 10+ messages in thread
From: Chao Leng @ 2020-10-21  2:20 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, Yi Zhang, Sagi Grimberg, linux-nvme, linux-block,
	Keith Busch, Christoph Hellwig



On 2020/10/21 9:22, Ming Lei wrote:
> On Tue, Oct 20, 2020 at 05:04:29PM +0800, Chao Leng wrote:
>>
>>
>> On 2020/10/20 16:53, Ming Lei wrote:
>>> During the controller's CONNECTING state, admin/fabrics/connect requests
>>> are submitted to recover the controller, and we allow these requests to
>>> be aborted directly in the timeout handler so they do not block the
>>> setup procedure.
>>>
>>> So a timeout vs. normal completion race exists for these requests, since
>>> the admin/fabrics/connect queues won't be shut down before the timeout
>>> is handled during the CONNECTING state.
>>>
>>> Add atomic completion for requests from the connect/fabrics/admin queues
>>> to avoid the race.
>>>
>>> CC: Chao Leng <lengchao@huawei.com>
>>> Cc: Sagi Grimberg <sagi@grimberg.me>
>>> Reported-by: Yi Zhang <yi.zhang@redhat.com>
>>> Tested-by: Yi Zhang <yi.zhang@redhat.com>
>>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
>>> ---
>>>    drivers/nvme/host/tcp.c | 40 +++++++++++++++++++++++++++++++++++++---
>>>    1 file changed, 37 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
>>> index d6a3e1487354..7e85bd4a8d1b 100644
>>> --- a/drivers/nvme/host/tcp.c
>>> +++ b/drivers/nvme/host/tcp.c
>>> @@ -30,6 +30,8 @@ static int so_priority;
>>>    module_param(so_priority, int, 0644);
>>>    MODULE_PARM_DESC(so_priority, "nvme tcp socket optimize priority");
>>> +#define REQ_STATE_COMPLETE     0
>>> +
>>>    enum nvme_tcp_send_state {
>>>    	NVME_TCP_SEND_CMD_PDU = 0,
>>>    	NVME_TCP_SEND_H2C_PDU,
>>> @@ -56,6 +58,8 @@ struct nvme_tcp_request {
>>>    	size_t			offset;
>>>    	size_t			data_sent;
>>>    	enum nvme_tcp_send_state state;
>>> +
>>> +	unsigned long		comp_state;
>> I do not think adding a state is a good idea; it is similar to
>> rq->state.
>> In the teardown process, deleting the timer and cancelling the timeout
>> work after the queues are quiesced may be a better option.
>> I will send the patch later.
>> The patch has already been tested with RoCE for more than one week.
> 
> Actually there isn't a race between timeout and teardown, and patches 1
> and 2 are enough to fix the issue reported by Yi.
> 
> It is just that rq->state is updated to IDLE in its .complete(), so
> either one of the code paths may think that this rq isn't completed;
> patch 2 has addressed this issue.
> 
> In short, the teardown lock is enough to cover the race.
The race may cause several abnormal behaviors:
1. Reported by Yi Zhang <yi.zhang@redhat.com>
detail: https://lore.kernel.org/linux-nvme/1934331639.3314730.1602152202454.JavaMail.zimbra@redhat.com/
2. BUG_ON in blk_mq_requeue_request
Error recovery and timeout may complete the request repeatedly. First,
error recovery cancels the request in the teardown process; the request
is retried on completion and rq->state is changed to IDLE. Timeout then
completes the request again and likewise retries it, so the BUG_ON in
blk_mq_requeue_request fires.
3. Abnormal link disconnection
First, error recovery cancels all requests, the reconnect succeeds, and
the requests are restarted. Timeout then completes a request again and
the queue is stopped in nvme_rdma(tcp)_complete_timed_out, causing an
abnormal link disconnection. This requires the timeout handling to be
delayed for a long time for some reason, such as a hardware interrupt,
so the probability is low.

teardown_lock just serializes the race, and adding the rq->state check
can avoid scenarios 1 and 2, but scenario 3 cannot be fixed.
> 
> 
> Thanks,
> Ming
> 
> .
> 


* Re: [PATCH V2 3/4] nvme: tcp: complete non-IO requests atomically
  2020-10-21  2:20       ` Chao Leng
@ 2020-10-21  2:55         ` Ming Lei
  2020-10-21  3:14           ` Chao Leng
  0 siblings, 1 reply; 10+ messages in thread
From: Ming Lei @ 2020-10-21  2:55 UTC (permalink / raw)
  To: Chao Leng
  Cc: Jens Axboe, Yi Zhang, Sagi Grimberg, linux-nvme, linux-block,
	Keith Busch, Christoph Hellwig

On Wed, Oct 21, 2020 at 10:20:11AM +0800, Chao Leng wrote:
> 
> 
> On 2020/10/21 9:22, Ming Lei wrote:
> > On Tue, Oct 20, 2020 at 05:04:29PM +0800, Chao Leng wrote:
> > > 
> > > 
> > > On 2020/10/20 16:53, Ming Lei wrote:
> > > > During the controller's CONNECTING state, admin/fabrics/connect requests
> > > > are submitted to recover the controller, and we allow these requests to
> > > > be aborted directly in the timeout handler so they do not block the
> > > > setup procedure.
> > > > 
> > > > So a timeout vs. normal completion race exists for these requests, since
> > > > the admin/fabrics/connect queues won't be shut down before the timeout
> > > > is handled during the CONNECTING state.
> > > > 
> > > > Add atomic completion for requests from the connect/fabrics/admin queues
> > > > to avoid the race.
> > > > 
> > > > CC: Chao Leng <lengchao@huawei.com>
> > > > Cc: Sagi Grimberg <sagi@grimberg.me>
> > > > Reported-by: Yi Zhang <yi.zhang@redhat.com>
> > > > Tested-by: Yi Zhang <yi.zhang@redhat.com>
> > > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > > > ---
> > > >    drivers/nvme/host/tcp.c | 40 +++++++++++++++++++++++++++++++++++++---
> > > >    1 file changed, 37 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> > > > index d6a3e1487354..7e85bd4a8d1b 100644
> > > > --- a/drivers/nvme/host/tcp.c
> > > > +++ b/drivers/nvme/host/tcp.c
> > > > @@ -30,6 +30,8 @@ static int so_priority;
> > > >    module_param(so_priority, int, 0644);
> > > >    MODULE_PARM_DESC(so_priority, "nvme tcp socket optimize priority");
> > > > +#define REQ_STATE_COMPLETE     0
> > > > +
> > > >    enum nvme_tcp_send_state {
> > > >    	NVME_TCP_SEND_CMD_PDU = 0,
> > > >    	NVME_TCP_SEND_H2C_PDU,
> > > > @@ -56,6 +58,8 @@ struct nvme_tcp_request {
> > > >    	size_t			offset;
> > > >    	size_t			data_sent;
> > > >    	enum nvme_tcp_send_state state;
> > > > +
> > > > +	unsigned long		comp_state;
> > > I do not think adding state is a good idea;
> > > it is similar to rq->state.
> > > In the teardown process, deleting the timer and canceling the timeout
> > > work after the queues are quiesced may be a better option.
> > > I will send the patch later.
> > > The patch has already been tested with RoCE for more than one week.
> > 
> > Actually there isn't a race between timeout and teardown, and patches 1 and
> > 2 are enough to fix the issue reported by Yi.
> > 
> > It is just that rq->state is updated to IDLE in its .complete(), so
> > either one of the code paths may think that this rq isn't completed, and
> > patch 2 has addressed this issue.
> > 
> > In short, the teardown lock is enough to cover the race.
> The race may cause abnormal behaviors:
> 1. Reported by Yi Zhang <yi.zhang@redhat.com>
> detail: https://lore.kernel.org/linux-nvme/1934331639.3314730.1602152202454.JavaMail.zimbra@redhat.com/
> 2. BUG_ON in blk_mq_requeue_request
> Because error recovery and timeout handling may repeatedly complete a request.
> First, error recovery cancels the request in the teardown process; the request
> will be retried in completion, and rq->state will be changed to IDLE.

Right.

> And then timeout handling will complete the request again and similarly
> retry the request, so a BUG_ON will happen in blk_mq_requeue_request.

With patch 2 in this patchset, the timeout handler won't complete the request
any more.

> 3. abnormal link disconnection
> First, error recovery cancels all requests; after a successful reconnect,
> the requests will be restarted. Then timeout handling will complete a
> request again, and the queue will be stopped in
> nvme_rdma(tcp)_complete_timed_out, so an abnormal link disconnection will
> happen. This requires the timeout process to be delayed for a long time for
> some reason, such as a hardware interrupt, so the probability is low.

OK, the timeout handler may just get a chance to run after recovery is
done, and it can be fixed by calling nvme_sync_queues() after updating to
CONNECTING, or before updating to LIVE, together with patches 1 & 2.

> teardown_lock just serializes the race, and adding a check of rq->state can
> avoid scenarios 1 and 2, but scenario 3 cannot be fixed.

I didn't think of scenario 3, which does not seem to be triggered in Yi's test.


thanks,
Ming



* Re: [PATCH V2 3/4] nvme: tcp: complete non-IO requests atomically
  2020-10-21  2:55         ` Ming Lei
@ 2020-10-21  3:14           ` Chao Leng
  0 siblings, 0 replies; 10+ messages in thread
From: Chao Leng @ 2020-10-21  3:14 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, Yi Zhang, Sagi Grimberg, linux-nvme, linux-block,
	Keith Busch, Christoph Hellwig



On 2020/10/21 10:55, Ming Lei wrote:
> On Wed, Oct 21, 2020 at 10:20:11AM +0800, Chao Leng wrote:
>>
>>
>> On 2020/10/21 9:22, Ming Lei wrote:
>>> On Tue, Oct 20, 2020 at 05:04:29PM +0800, Chao Leng wrote:
>>>>
>>>>
>>>> On 2020/10/20 16:53, Ming Lei wrote:
>>>>> During the controller's CONNECTING state, admin/fabric/connect requests
>>>>> are submitted to recover the controller, and we allow aborting these
>>>>> requests directly in the timeout handler so as not to block the setup
>>>>> procedure.
>>>>>
>>>>> So a timeout vs. normal completion race exists on these requests, since
>>>>> the admin/fabric/connect queues won't be shut down before handling
>>>>> timeouts during the CONNECTING state.
>>>>>
>>>>> Add atomic completion for requests from connect/fabric/admin queue for
>>>>> avoiding the race.
>>>>>
>>>>> CC: Chao Leng <lengchao@huawei.com>
>>>>> Cc: Sagi Grimberg <sagi@grimberg.me>
>>>>> Reported-by: Yi Zhang <yi.zhang@redhat.com>
>>>>> Tested-by: Yi Zhang <yi.zhang@redhat.com>
>>>>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
>>>>> ---
>>>>>     drivers/nvme/host/tcp.c | 40 +++++++++++++++++++++++++++++++++++++---
>>>>>     1 file changed, 37 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
>>>>> index d6a3e1487354..7e85bd4a8d1b 100644
>>>>> --- a/drivers/nvme/host/tcp.c
>>>>> +++ b/drivers/nvme/host/tcp.c
>>>>> @@ -30,6 +30,8 @@ static int so_priority;
>>>>>     module_param(so_priority, int, 0644);
>>>>>     MODULE_PARM_DESC(so_priority, "nvme tcp socket optimize priority");
>>>>> +#define REQ_STATE_COMPLETE     0
>>>>> +
>>>>>     enum nvme_tcp_send_state {
>>>>>     	NVME_TCP_SEND_CMD_PDU = 0,
>>>>>     	NVME_TCP_SEND_H2C_PDU,
>>>>> @@ -56,6 +58,8 @@ struct nvme_tcp_request {
>>>>>     	size_t			offset;
>>>>>     	size_t			data_sent;
>>>>>     	enum nvme_tcp_send_state state;
>>>>> +
>>>>> +	unsigned long		comp_state;
>>>> I do not think adding state is a good idea;
>>>> it is similar to rq->state.
>>>> In the teardown process, deleting the timer and canceling the timeout
>>>> work after the queues are quiesced may be a better option.
>>>> I will send the patch later.
>>>> The patch has already been tested with RoCE for more than one week.
>>>
>>> Actually there isn't a race between timeout and teardown, and patches 1 and
>>> 2 are enough to fix the issue reported by Yi.
>>>
>>> It is just that rq->state is updated to IDLE in its .complete(), so
>>> either one of the code paths may think that this rq isn't completed, and
>>> patch 2 has addressed this issue.
>>>
>>> In short, the teardown lock is enough to cover the race.
>> The race may cause abnormal behaviors:
>> 1. Reported by Yi Zhang <yi.zhang@redhat.com>
>> detail: https://lore.kernel.org/linux-nvme/1934331639.3314730.1602152202454.JavaMail.zimbra@redhat.com/
>> 2. BUG_ON in blk_mq_requeue_request
>> Because error recovery and timeout handling may repeatedly complete a request.
>> First, error recovery cancels the request in the teardown process; the request
>> will be retried in completion, and rq->state will be changed to IDLE.
> 
> Right.
> 
>> And then timeout handling will complete the request again and similarly
>> retry the request, so a BUG_ON will happen in blk_mq_requeue_request.
> 
> With patch2 in this patchset, timeout handler won't complete the request any
> more.
> 
>> 3. abnormal link disconnection
>> First, error recovery cancels all requests; after a successful reconnect,
>> the requests will be restarted. Then timeout handling will complete a
>> request again, and the queue will be stopped in
>> nvme_rdma(tcp)_complete_timed_out, so an abnormal link disconnection will
>> happen. This requires the timeout process to be delayed for a long time for
>> some reason, such as a hardware interrupt, so the probability is low.
> 
> OK, the timeout handler may just get a chance to run after recovery is
> done, and it can be fixed by calling nvme_sync_queues() after updating to
> CONNECTING, or before updating to LIVE, together with patches 1 & 2.
> 
>> teardown_lock just serializes the race, and adding a check of rq->state can
>> avoid scenarios 1 and 2, but scenario 3 cannot be fixed.
> 
> I didn't think of scenario 3, which does not seem to be triggered in Yi's test.
Scenario 3 is unlikely to be triggered in normal tests.
The trigger conditions are harsh; it will only happen in some extreme situations.
Without scenario 3, Sagi's patch can work well.
> 
> 
> thanks,
> Ming
> 
> .
> 


end of thread, back to index

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-20  8:52 [PATCH V2 0/4] blk-mq/nvme-tcp: fix timed out related races Ming Lei
2020-10-20  8:52 ` [PATCH V2 1/4] blk-mq: check rq->state explicitly in blk_mq_tagset_count_completed_rqs Ming Lei
2020-10-20  8:52 ` [PATCH V2 2/4] blk-mq: fix blk_mq_request_completed Ming Lei
2020-10-20  8:53 ` [PATCH V2 3/4] nvme: tcp: complete non-IO requests atomically Ming Lei
2020-10-20  9:04   ` Chao Leng
2020-10-21  1:22     ` Ming Lei
2020-10-21  2:20       ` Chao Leng
2020-10-21  2:55         ` Ming Lei
2020-10-21  3:14           ` Chao Leng
2020-10-20  8:53 ` [PATCH V2 4/4] nvme: tcp: fix race between timeout and normal completion Ming Lei
