* [PATCHv2 0/3] nvme: sanitize KATO handling
@ 2021-03-02  9:26 Hannes Reinecke
  2021-03-02  9:26 ` [PATCH 1/3] nvme: fixup kato deadlock Hannes Reinecke
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Hannes Reinecke @ 2021-03-02  9:26 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Keith Busch, Sagi Grimberg, Daniel Wagner, linux-nvme,
	Hannes Reinecke, Chao Leng

Hi all,

one of our customers has been running into a deadlock while trying to
terminate outstanding KATO commands during reset.
Looking closer at it, I found that we never actually _track_ whether a
KATO command has been submitted, so we might happily be sending several
KATO commands to the same controller simultaneously.
Also, I found it slightly odd that we signal a different KATO value to the
controller than what we're using internally; I would have thought that both
sides should agree on the same KATO value. And even that wouldn't be so
bad, but we really should be using the KATO value we announced to the
controller when setting the request timeout.

With these patches I attempt to resolve the situation; the first patch
ensures that only one KATO command to a given controller is outstanding.
With that the delay between sending KATO commands and the KATO timeout
itself are decoupled, and we can follow the recommendation from the base
spec to send the KATO commands at half the KATO timeout interval.
And to aid userspace in detecting persistent controllers the KATO value
is now displayed in sysfs, too.

As usual, comments and reviews are welcome.

Changes to v1:
- Include reviews from Christoph
- Include reviews from Chao Leng
- Add new patch to display KATO value in sysfs

Hannes Reinecke (3):
  nvme: fixup kato deadlock
  nvme: sanitize KATO setting
  nvme: add 'kato' sysfs attribute

 drivers/nvme/host/core.c    | 21 ++++++++++++++++-----
 drivers/nvme/host/fabrics.c |  2 +-
 drivers/nvme/host/nvme.h    | 10 +++++++++-
 3 files changed, 26 insertions(+), 7 deletions(-)

-- 
2.29.2



* [PATCH 1/3] nvme: fixup kato deadlock
  2021-03-02  9:26 [PATCHv2 0/3] nvme: sanitize KATO handling Hannes Reinecke
@ 2021-03-02  9:26 ` Hannes Reinecke
  2021-03-03  8:42   ` Christoph Hellwig
  2021-03-02  9:26 ` [PATCH 2/3] nvme: sanitize KATO setting Hannes Reinecke
  2021-03-02  9:26 ` [PATCH 3/3] nvme: add 'kato' sysfs attribute Hannes Reinecke
  2 siblings, 1 reply; 18+ messages in thread
From: Hannes Reinecke @ 2021-03-02  9:26 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Keith Busch, Sagi Grimberg, Daniel Wagner, linux-nvme,
	Hannes Reinecke, Chao Leng

A customer of ours has run into this deadlock with RDMA:
- The ka_work workqueue item is executed
- A new ka_work workqueue item is scheduled just after that.
- Now both the KATO request timeout _and_ the workqueue delay
  will expire at roughly the same time
- If the timing is right the workqueue executes _before_
  the KATO request timeout triggers
- The KATO request timeout triggers, and starts error recovery
- Error recovery deadlocks, as it needs to flush the KATO
  workqueue item; this is stuck in nvme_alloc_request() as all
  reserved tags are in use.

The reserved tags would have been freed up later when cancelling all
outstanding requests in the queue:

	nvme_stop_keep_alive(&ctrl->ctrl);
	nvme_rdma_teardown_io_queues(ctrl, false);
	nvme_start_queues(&ctrl->ctrl);
	nvme_rdma_teardown_admin_queue(ctrl, false);
	blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);

but as we're stuck in nvme_stop_keep_alive() we'll never get this far.

To fix this, a new controller flag 'NVME_CTRL_KATO_RUNNING' is added
which short-circuits the nvme_keep_alive() function if a
keep-alive command is already running.
Additionally we should be allocating the KATO request with
BLK_MQ_REQ_NOWAIT as we must not block on request allocation; if we
cannot get a request we cannot determine whether the connection is
healthy, and need to reset it anyway.

Cc: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/nvme/host/core.c | 10 ++++++++--
 drivers/nvme/host/nvme.h |  1 +
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 587f8395435b..f890b310499e 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1207,6 +1207,7 @@ static void nvme_keep_alive_end_io(struct request *rq, blk_status_t status)
 	bool startka = false;
 
 	blk_mq_free_request(rq);
+	clear_bit(NVME_CTRL_KATO_RUNNING, &ctrl->flags);
 
 	if (status) {
 		dev_err(ctrl->device,
@@ -1229,10 +1230,15 @@ static int nvme_keep_alive(struct nvme_ctrl *ctrl)
 {
 	struct request *rq;
 
+	if (test_and_set_bit(NVME_CTRL_KATO_RUNNING, &ctrl->flags))
+		return 0;
+
 	rq = nvme_alloc_request(ctrl->admin_q, &ctrl->ka_cmd,
-			BLK_MQ_REQ_RESERVED);
-	if (IS_ERR(rq))
+			BLK_MQ_REQ_RESERVED | BLK_MQ_REQ_NOWAIT);
+	if (IS_ERR(rq)) {
+		clear_bit(NVME_CTRL_KATO_RUNNING, &ctrl->flags);
 		return PTR_ERR(rq);
+	}
 
 	rq->timeout = ctrl->kato * HZ;
 	rq->end_io_data = ctrl;
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 07b34175c6ce..23711f6b7d13 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -344,6 +344,7 @@ struct nvme_ctrl {
 	int nr_reconnects;
 	unsigned long flags;
 #define NVME_CTRL_FAILFAST_EXPIRED	0
+#define NVME_CTRL_KATO_RUNNING		1
 	struct nvmf_ctrl_options *opts;
 
 	struct page *discard_page;
-- 
2.29.2



* [PATCH 2/3] nvme: sanitize KATO setting
  2021-03-02  9:26 [PATCHv2 0/3] nvme: sanitize KATO handling Hannes Reinecke
  2021-03-02  9:26 ` [PATCH 1/3] nvme: fixup kato deadlock Hannes Reinecke
@ 2021-03-02  9:26 ` Hannes Reinecke
  2021-03-03  8:53   ` Chao Leng
                     ` (2 more replies)
  2021-03-02  9:26 ` [PATCH 3/3] nvme: add 'kato' sysfs attribute Hannes Reinecke
  2 siblings, 3 replies; 18+ messages in thread
From: Hannes Reinecke @ 2021-03-02  9:26 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Keith Busch, Sagi Grimberg, Daniel Wagner, linux-nvme,
	Hannes Reinecke, Chao Leng

According to the NVMe base spec the KATO commands should be sent
at half of the KATO interval, to properly account for round-trip
times.
As we now will only ever send one KATO command per connection we
can easily use the recommended values.
This also fixes a potential issue where the request timeout for
the KATO command does not match the value in the connect command,
which might be causing spurious connection drops from the target.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/nvme/host/core.c    | 9 ++++++---
 drivers/nvme/host/fabrics.c | 2 +-
 drivers/nvme/host/nvme.h    | 9 ++++++++-
 3 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index f890b310499e..6d096d41a82f 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1223,7 +1223,8 @@ static void nvme_keep_alive_end_io(struct request *rq, blk_status_t status)
 		startka = true;
 	spin_unlock_irqrestore(&ctrl->lock, flags);
 	if (startka)
-		queue_delayed_work(nvme_wq, &ctrl->ka_work, ctrl->kato * HZ);
+		queue_delayed_work(nvme_wq, &ctrl->ka_work,
+				   NVME_KATO_DELAY(ctrl->kato) * HZ);
 }
 
 static int nvme_keep_alive(struct nvme_ctrl *ctrl)
@@ -1258,7 +1259,8 @@ static void nvme_keep_alive_work(struct work_struct *work)
 		dev_dbg(ctrl->device,
 			"reschedule traffic based keep-alive timer\n");
 		ctrl->comp_seen = false;
-		queue_delayed_work(nvme_wq, &ctrl->ka_work, ctrl->kato * HZ);
+		queue_delayed_work(nvme_wq, &ctrl->ka_work,
+				   NVME_KATO_DELAY(ctrl->kato) * HZ);
 		return;
 	}
 
@@ -1275,7 +1277,8 @@ static void nvme_start_keep_alive(struct nvme_ctrl *ctrl)
 	if (unlikely(ctrl->kato == 0))
 		return;
 
-	queue_delayed_work(nvme_wq, &ctrl->ka_work, ctrl->kato * HZ);
+	queue_delayed_work(nvme_wq, &ctrl->ka_work,
+			   NVME_KATO_DELAY(ctrl->kato) * HZ);
 }
 
 void nvme_stop_keep_alive(struct nvme_ctrl *ctrl)
diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index 5dfd806fc2d2..dba32e39afbf 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -382,7 +382,7 @@ int nvmf_connect_admin_queue(struct nvme_ctrl *ctrl)
 	 * and add a grace period for controller kato enforcement
 	 */
 	cmd.connect.kato = ctrl->kato ?
-		cpu_to_le32((ctrl->kato + NVME_KATO_GRACE) * 1000) : 0;
+		cpu_to_le32(ctrl->kato * 1000) : 0;
 
 	if (ctrl->opts->disable_sqflow)
 		cmd.connect.cattr |= NVME_CONNECT_DISABLE_SQFLOW;
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 23711f6b7d13..912830389997 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -27,7 +27,14 @@ extern unsigned int admin_timeout;
 #define NVME_ADMIN_TIMEOUT	(admin_timeout * HZ)
 
 #define NVME_DEFAULT_KATO	5
-#define NVME_KATO_GRACE		10
+
+/*
+ * The recommended frequency for KATO commands
+ * according to NVMe 1.4 section 7.12.1:
+ * The host should send Keep Alive commands at half of the
+ * Keep Alive Timeout accounting for transport roundtrip times [..].
+ */
+#define NVME_KATO_DELAY(k)	((k) >> 1)
 
 #ifdef CONFIG_ARCH_NO_SG_CHAIN
 #define  NVME_INLINE_SG_CNT  0
-- 
2.29.2



* [PATCH 3/3] nvme: add 'kato' sysfs attribute
  2021-03-02  9:26 [PATCHv2 0/3] nvme: sanitize KATO handling Hannes Reinecke
  2021-03-02  9:26 ` [PATCH 1/3] nvme: fixup kato deadlock Hannes Reinecke
  2021-03-02  9:26 ` [PATCH 2/3] nvme: sanitize KATO setting Hannes Reinecke
@ 2021-03-02  9:26 ` Hannes Reinecke
  2021-03-05 20:38   ` Sagi Grimberg
  2021-03-08 13:06   ` Max Gurtovoy
  2 siblings, 2 replies; 18+ messages in thread
From: Hannes Reinecke @ 2021-03-02  9:26 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Keith Busch, Sagi Grimberg, Daniel Wagner, linux-nvme,
	Hannes Reinecke, Chao Leng

Add a 'kato' controller sysfs attribute to display the current
keep-alive timeout value (if any). This allows userspace to identify
persistent discovery controllers, as these will have a non-zero
KATO value.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/nvme/host/core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 6d096d41a82f..4f92c5035045 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3554,6 +3554,7 @@ nvme_show_int_function(cntlid);
 nvme_show_int_function(numa_node);
 nvme_show_int_function(queue_count);
 nvme_show_int_function(sqsize);
+nvme_show_int_function(kato);
 
 static ssize_t nvme_sysfs_delete(struct device *dev,
 				struct device_attribute *attr, const char *buf,
@@ -3720,6 +3721,7 @@ static struct attribute *nvme_dev_attrs[] = {
 	&dev_attr_hostid.attr,
 	&dev_attr_ctrl_loss_tmo.attr,
 	&dev_attr_reconnect_delay.attr,
+	&dev_attr_kato.attr,
 	NULL
 };
 
-- 
2.29.2



* Re: [PATCH 1/3] nvme: fixup kato deadlock
  2021-03-02  9:26 ` [PATCH 1/3] nvme: fixup kato deadlock Hannes Reinecke
@ 2021-03-03  8:42   ` Christoph Hellwig
  2021-03-03 12:01     ` Hannes Reinecke
  0 siblings, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2021-03-03  8:42 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Christoph Hellwig, Sagi Grimberg, Keith Busch, Daniel Wagner,
	Chao Leng, linux-nvme

On Tue, Mar 02, 2021 at 10:26:42AM +0100, Hannes Reinecke wrote:
> +	if (test_and_set_bit(NVME_CTRL_KATO_RUNNING, &ctrl->flags))
> +		return 0;
> +
>  	rq = nvme_alloc_request(ctrl->admin_q, &ctrl->ka_cmd,
> -			BLK_MQ_REQ_RESERVED);
> -	if (IS_ERR(rq))
> +			BLK_MQ_REQ_RESERVED | BLK_MQ_REQ_NOWAIT);
> +	if (IS_ERR(rq)) {
> +		clear_bit(NVME_CTRL_KATO_RUNNING, &ctrl->flags);
>  		return PTR_ERR(rq);
> +	}

Adding BLK_MQ_REQ_NOWAIT should be a separate prep patch, together with
reducing the number of reserved tags.

Also why do we still need the extra test_and_set_bit with the
NOWAIT allocation?
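
For illustration, a minimal sketch of the NOWAIT-only variant this
question hints at, i.e. dropping the extra flag and relying on the failed
allocation alone (an assumption about the direction, not the code that
was eventually merged; the submission call is copied from the pre-patch
function):

	static int nvme_keep_alive(struct nvme_ctrl *ctrl)
	{
		struct request *rq;

		/*
		 * Hypothetical sketch: with BLK_MQ_REQ_NOWAIT and a reserved
		 * tag for keep-alive, a second keep-alive attempt fails the
		 * allocation instead of blocking, so no extra
		 * NVME_CTRL_KATO_RUNNING bit would be needed.
		 */
		rq = nvme_alloc_request(ctrl->admin_q, &ctrl->ka_cmd,
				BLK_MQ_REQ_RESERVED | BLK_MQ_REQ_NOWAIT);
		if (IS_ERR(rq))
			return PTR_ERR(rq);

		/* timeout and submission unchanged from the existing function */
		rq->timeout = ctrl->kato * HZ;
		rq->end_io_data = ctrl;
		blk_execute_rq_nowait(rq->q, NULL, rq, 0, nvme_keep_alive_end_io);
		return 0;
	}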



* Re: [PATCH 2/3] nvme: sanitize KATO setting
  2021-03-02  9:26 ` [PATCH 2/3] nvme: sanitize KATO setting Hannes Reinecke
@ 2021-03-03  8:53   ` Chao Leng
  2021-03-03 12:40     ` Christoph Hellwig
  2021-03-05 20:38   ` Sagi Grimberg
  2021-03-08 13:11   ` Max Gurtovoy
  2 siblings, 1 reply; 18+ messages in thread
From: Chao Leng @ 2021-03-03  8:53 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig
  Cc: Sagi Grimberg, Keith Busch, Daniel Wagner, linux-nvme



On 2021/3/2 17:26, Hannes Reinecke wrote:
> According to the NVMe base spec the KATO commands should be sent
> at half of the KATO interval, to properly account for round-trip
> times.
> As we now will only ever send one KATO command per connection we
> can easily use the recommended values.
> This also fixes a potential issue where the request timeout for
> the KATO command does not match the value in the connect command,
> which might be causing spurious connection drops from the target.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>   drivers/nvme/host/core.c    | 9 ++++++---
>   drivers/nvme/host/fabrics.c | 2 +-
>   drivers/nvme/host/nvme.h    | 9 ++++++++-
>   3 files changed, 15 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index f890b310499e..6d096d41a82f 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -1223,7 +1223,8 @@ static void nvme_keep_alive_end_io(struct request *rq, blk_status_t status)
>   		startka = true;
>   	spin_unlock_irqrestore(&ctrl->lock, flags);
>   	if (startka)
> -		queue_delayed_work(nvme_wq, &ctrl->ka_work, ctrl->kato * HZ);
> +		queue_delayed_work(nvme_wq, &ctrl->ka_work,
> +				   NVME_KATO_DELAY(ctrl->kato) * HZ);
Multiplying by HZ first and then shifting right by one is a better choice; do it like this: NVME_KATO_DELAY(ctrl->kato * HZ).
>   }
>   
>   static int nvme_keep_alive(struct nvme_ctrl *ctrl)
> @@ -1258,7 +1259,8 @@ static void nvme_keep_alive_work(struct work_struct *work)
>   		dev_dbg(ctrl->device,
>   			"reschedule traffic based keep-alive timer\n");
>   		ctrl->comp_seen = false;
> -		queue_delayed_work(nvme_wq, &ctrl->ka_work, ctrl->kato * HZ);
> +		queue_delayed_work(nvme_wq, &ctrl->ka_work,
> +				   NVME_KATO_DELAY(ctrl->kato) * HZ);
>   		return;
>   	}
>   
> @@ -1275,7 +1277,8 @@ static void nvme_start_keep_alive(struct nvme_ctrl *ctrl)
>   	if (unlikely(ctrl->kato == 0))
>   		return;
>   
> -	queue_delayed_work(nvme_wq, &ctrl->ka_work, ctrl->kato * HZ);
> +	queue_delayed_work(nvme_wq, &ctrl->ka_work,
> +			   NVME_KATO_DELAY(ctrl->kato) * HZ);
>   }
>   
>   void nvme_stop_keep_alive(struct nvme_ctrl *ctrl)
> diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
> index 5dfd806fc2d2..dba32e39afbf 100644
> --- a/drivers/nvme/host/fabrics.c
> +++ b/drivers/nvme/host/fabrics.c
> @@ -382,7 +382,7 @@ int nvmf_connect_admin_queue(struct nvme_ctrl *ctrl)
>   	 * and add a grace period for controller kato enforcement
>   	 */
>   	cmd.connect.kato = ctrl->kato ?
> -		cpu_to_le32((ctrl->kato + NVME_KATO_GRACE) * 1000) : 0;
> +		cpu_to_le32(ctrl->kato * 1000) : 0;
>   
>   	if (ctrl->opts->disable_sqflow)
>   		cmd.connect.cattr |= NVME_CONNECT_DISABLE_SQFLOW;
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index 23711f6b7d13..912830389997 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -27,7 +27,14 @@ extern unsigned int admin_timeout;
>   #define NVME_ADMIN_TIMEOUT	(admin_timeout * HZ)
>   
>   #define NVME_DEFAULT_KATO	5
> -#define NVME_KATO_GRACE		10
> +
> +/*
> + * The recommended frequency for KATO commands
> + * according to NVMe 1.4 section 7.12.1:
> + * The host should send Keep Alive commands at half of the
> + * Keep Alive Timeout accounting for transport roundtrip times [..].
> + */
> +#define NVME_KATO_DELAY(k)	((k) >> 1)
>   
>   #ifdef CONFIG_ARCH_NO_SG_CHAIN
>   #define  NVME_INLINE_SG_CNT  0
> 


* Re: [PATCH 1/3] nvme: fixup kato deadlock
  2021-03-03  8:42   ` Christoph Hellwig
@ 2021-03-03 12:01     ` Hannes Reinecke
  2021-03-03 12:35       ` Christoph Hellwig
  0 siblings, 1 reply; 18+ messages in thread
From: Hannes Reinecke @ 2021-03-03 12:01 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sagi Grimberg, Keith Busch, Daniel Wagner, Chao Leng, linux-nvme

On 3/3/21 9:42 AM, Christoph Hellwig wrote:
> On Tue, Mar 02, 2021 at 10:26:42AM +0100, Hannes Reinecke wrote:
>> +	if (test_and_set_bit(NVME_CTRL_KATO_RUNNING, &ctrl->flags))
>> +		return 0;
>> +
>>  	rq = nvme_alloc_request(ctrl->admin_q, &ctrl->ka_cmd,
>> -			BLK_MQ_REQ_RESERVED);
>> -	if (IS_ERR(rq))
>> +			BLK_MQ_REQ_RESERVED | BLK_MQ_REQ_NOWAIT);
>> +	if (IS_ERR(rq)) {
>> +		clear_bit(NVME_CTRL_KATO_RUNNING, &ctrl->flags);
>>  		return PTR_ERR(rq);
>> +	}
> 
> Adding BLK_MQ_REQ_NOWAIT should be a separate prep patch, together with
> reducing the number of reserved tags.
> 
Okay.
But why would we need to reduce the number of tags? It's not that we're
changing anything with the allocation, we're just guaranteed to fail if
for some reason the stack messes up.

> Also why do we still need the extra test_and_set_bit with the
> NOWAIT allocation?
> 
This is essentially a safeguard; there is nothing in the current code
telling us if a KATO command is in flight or not.
And I really do hate doing pointless work (like allocating commands) if
we don't need to.

Plus it'll differentiate between legit callers of
nvme_start_keep_alive() (which can be called at any time, and hence
might be adding the ka_work element just after the previous one had
submitted the KATO command), and real failures like failing to allocate
the KATO command itself, which points to a real issue in the stack as
normally there should be enough reserved commands.

But yeah, I'll split it off into two patches.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer


* Re: [PATCH 1/3] nvme: fixup kato deadlock
  2021-03-03 12:01     ` Hannes Reinecke
@ 2021-03-03 12:35       ` Christoph Hellwig
  2021-03-03 13:11         ` Hannes Reinecke
  0 siblings, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2021-03-03 12:35 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Christoph Hellwig, Sagi Grimberg, Keith Busch, Daniel Wagner,
	Chao Leng, linux-nvme

On Wed, Mar 03, 2021 at 01:01:50PM +0100, Hannes Reinecke wrote:
> > Adding BLK_MQ_REQ_NOWAIT should be a separate prep patch, together with
> > reducing the number of reserved tags.
> > 
> Okay.
> But why would we need to reduce the number of tags? It's not that we're
> changing anything with the allocation, we're just guaranteed to fail if
> for some reason the stack messes up.

Because we could otherwise still have two keep-alive requests, or one
keep-alive and one connect request, allocated at the same time.

> 
> > Also why do we still need the extra test_and_set_bit with the
> > NOWAIT allocation?
> > 
> This is essentially a safeguard; there is nothing in the current code
> telling us if a KATO command is in flight or not.
> And I really do hate doing pointless work (like allocating commands) if
> we don't need to.
> 
> Plus it'll differentiate between legit callers of
> nvme_start_keep_alive() (which can be called at any time, and hence
> might be adding the ka_work element just after the previous one had
> submitted the KATO command), and real failures like failing to allocate
> the KATO command itself, which points to a real issue in the stack as
> normally there should be enough reserved commands.

You can distinguish them by the error code from blk_mq_alloc_request.
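
Sketched out, distinguishing by the error code might look like this
(illustrative only; a failed BLK_MQ_REQ_NOWAIT allocation typically
surfaces as -EWOULDBLOCK, but the exact handling here is an assumption):

	rq = nvme_alloc_request(ctrl->admin_q, &ctrl->ka_cmd,
			BLK_MQ_REQ_RESERVED | BLK_MQ_REQ_NOWAIT);
	if (IS_ERR(rq)) {
		/* reserved tag busy: a keep-alive is presumably already in flight */
		if (PTR_ERR(rq) == -EWOULDBLOCK)
			return 0;
		/* anything else is a real allocation failure */
		return PTR_ERR(rq);
	}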


* Re: [PATCH 2/3] nvme: sanitize KATO setting
  2021-03-03  8:53   ` Chao Leng
@ 2021-03-03 12:40     ` Christoph Hellwig
  0 siblings, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2021-03-03 12:40 UTC (permalink / raw)
  To: Chao Leng
  Cc: Hannes Reinecke, Christoph Hellwig, Sagi Grimberg, Keith Busch,
	Daniel Wagner, linux-nvme

On Wed, Mar 03, 2021 at 04:53:58PM +0800, Chao Leng wrote:
>
>
> On 2021/3/2 17:26, Hannes Reinecke wrote:
>> According to the NVMe base spec the KATO commands should be sent
>> at half of the KATO interval, to properly account for round-trip
>> times.
>> As we now will only ever send one KATO command per connection we
>> can easily use the recommended values.
>> This also fixes a potential issue where the request timeout for
>> the KATO command does not match the value in the connect command,
>> which might be causing spurious connection drops from the target.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>>   drivers/nvme/host/core.c    | 9 ++++++---
>>   drivers/nvme/host/fabrics.c | 2 +-
>>   drivers/nvme/host/nvme.h    | 9 ++++++++-
>>   3 files changed, 15 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index f890b310499e..6d096d41a82f 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -1223,7 +1223,8 @@ static void nvme_keep_alive_end_io(struct request *rq, blk_status_t status)
>>   		startka = true;
>>   	spin_unlock_irqrestore(&ctrl->lock, flags);
>>   	if (startka)
>> -		queue_delayed_work(nvme_wq, &ctrl->ka_work, ctrl->kato * HZ);
>> +		queue_delayed_work(nvme_wq, &ctrl->ka_work,
>> +				   NVME_KATO_DELAY(ctrl->kato) * HZ);
> * HZ and then >> 1 is a better choice. do like this: NVME_KATO_DELAY(ctrl->kato * HZ).

Also, there is no reason not to include the "* HZ" in NVME_KATO_DELAY,
as all callers do it anyway.  And please turn NVME_KATO_DELAY into a
normal inline function using normal lower-case naming.  Alternatively
we could just add a helper for the whole queue_delayed_work call, which
might actually end up even cleaner.
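
A rough sketch of the suggested helper, folding the jiffies conversion
and the queue_delayed_work() call into one place (name and placement are
assumptions, not the final code):

	/* Illustrative only: schedule the next keep-alive at half the KATO,
	 * converting to jiffies before halving so small values do not
	 * truncate to zero */
	static void nvme_queue_keep_alive_work(struct nvme_ctrl *ctrl)
	{
		queue_delayed_work(nvme_wq, &ctrl->ka_work, ctrl->kato * HZ / 2);
	}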


* Re: [PATCH 1/3] nvme: fixup kato deadlock
  2021-03-03 12:35       ` Christoph Hellwig
@ 2021-03-03 13:11         ` Hannes Reinecke
  2021-03-03 14:23           ` Hannes Reinecke
  0 siblings, 1 reply; 18+ messages in thread
From: Hannes Reinecke @ 2021-03-03 13:11 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sagi Grimberg, Keith Busch, Daniel Wagner, Chao Leng, linux-nvme

On 3/3/21 1:35 PM, Christoph Hellwig wrote:
> On Wed, Mar 03, 2021 at 01:01:50PM +0100, Hannes Reinecke wrote:
>>> Adding BLK_MQ_REQ_NOWAIT should be a separate prep patch, together with
>>> reducing the number of reserved tags.
>>>
>> Okay.
>> But why would we need to reduce the number of tags? It's not that we're
>> changing anything with the allocation, we're just guaranteed to fail if
>> for some reason the stack messes up.
> 
> Because we could otherwise still have two keep-alive requests, or one
> keep-alive and one connect request, allocated at the same time.
> 
But that's what we have today.
So you're saying that we should have only _1_ reserved request, ie
either connect _or_ keep-alive?
Sure we can do that; I had been a bit hesitant to do that seeing that it
had been that way since the beginning.

But if you say ...

>>
>>> Also why do we still need the extra test_and_set_bit with the
>>> NOWAIT allocation?
>>>
>> This is essentially a safeguard; there is nothing in the current code
>> telling us if a KATO command is in flight or not.
>> And I really do hate doing pointless work (like allocating commands) if
>> we don't need to.
>>
>> Plus it'll differentiate between legit callers of
>> nvme_start_keep_alive() (which can be called at any time, and hence
>> might be adding the ka_work element just after the previous one had
>> submitted the KATO command), and real failures like failing to allocate
>> the KATO command itself, which points to a real issue in the stack as
>> normally there should be enough reserved commands.
> 
> You can distinguish them by the error code from blk_mq_alloc_request.
> 
True. Will be dropping that patch, then.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer


* Re: [PATCH 1/3] nvme: fixup kato deadlock
  2021-03-03 13:11         ` Hannes Reinecke
@ 2021-03-03 14:23           ` Hannes Reinecke
  2021-03-04  8:02             ` Christoph Hellwig
  0 siblings, 1 reply; 18+ messages in thread
From: Hannes Reinecke @ 2021-03-03 14:23 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sagi Grimberg, Keith Busch, Daniel Wagner, Chao Leng, linux-nvme

On 3/3/21 2:11 PM, Hannes Reinecke wrote:
> On 3/3/21 1:35 PM, Christoph Hellwig wrote:
>> On Wed, Mar 03, 2021 at 01:01:50PM +0100, Hannes Reinecke wrote:
>>>> Adding BLK_MQ_REQ_NOWAIT should be a separate prep patch, together with
>>>> reducing the number of reserved tags.
>>>>
>>> Okay.
>>> But why would we need to reduce the number of tags? It's not that we're
>>> changing anything with the allocation, we're just guaranteed to fail if
>>> for some reason the stack messes up.
>>
>> Because we could otherwise still have two keep-alive requests, or one
>> keep-alive and one connect request, allocated at the same time.
>>
> But that's what we have today.
> So you're saying that we should have only _1_ reserved request, ie
> either connect _or_ keep-alive?
> Sure we can do that; I had been a bit hesitant to do that seeing that it
> had been that way since the beginning.
> 
> But if you say ...
> 
Actually, having thought a bit more, I'm not sure if we should change
that. Changing the number of reserved commands _will_ have an impact on
error recovery, up to the point that we cannot issue a 'connect' command
if a keep-alive command is still running.
Which means we might be seeing spurious connection failures during
recovery, something which I'd be very cautious about.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer


* Re: [PATCH 1/3] nvme: fixup kato deadlock
  2021-03-03 14:23           ` Hannes Reinecke
@ 2021-03-04  8:02             ` Christoph Hellwig
  2021-03-04  8:56               ` Hannes Reinecke
  0 siblings, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2021-03-04  8:02 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Christoph Hellwig, Sagi Grimberg, Keith Busch, Daniel Wagner,
	Chao Leng, linux-nvme

On Wed, Mar 03, 2021 at 03:23:12PM +0100, Hannes Reinecke wrote:
> Actually, having thought a bit more, I'm not sure if we should change
> that. Changing the number of reserved commands _will_ have an impact on
> error recovery, up to the point that we cannot issue a 'connect' command
> if a keep-alive command is still running.
> Which means we might be seeing spurious connection failures during
> recovery, something which I'd be very cautious about.

We always do a blk_mq_tagset_busy_iter that cancels all requests before
reconnecting.
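
For reference, the pre-reconnect cleanup referred to here looks roughly
like this in the fabrics drivers (a sketch from memory; the tag set and
context details vary per transport):

	/* cancel every busy request so reserved tags (connect, keep-alive)
	 * are returned before a new connect command is issued */
	blk_mq_tagset_busy_iter(&ctrl->admin_tag_set, nvme_cancel_request, &ctrl->ctrl);
	blk_mq_tagset_wait_completed_request(&ctrl->admin_tag_set);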


* Re: [PATCH 1/3] nvme: fixup kato deadlock
  2021-03-04  8:02             ` Christoph Hellwig
@ 2021-03-04  8:56               ` Hannes Reinecke
  0 siblings, 0 replies; 18+ messages in thread
From: Hannes Reinecke @ 2021-03-04  8:56 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sagi Grimberg, Keith Busch, Daniel Wagner, Chao Leng, linux-nvme

On 3/4/21 9:02 AM, Christoph Hellwig wrote:
> On Wed, Mar 03, 2021 at 03:23:12PM +0100, Hannes Reinecke wrote:
>> Actually, having thought a bit more, I'm not sure if we should change
>> that. Changing the number of reserved commands _will_ have an impact on
>> error recovery, up to the point that we cannot issue a 'connect' command
>> if a keep-alive command is still running.
>> Which means we might be seeing spurious connection failures during
>> recovery, something which I'd be very cautious about.
> 
> We always do a blk_mq_tagset_busy_iter that cancels all requests before
> reconnecting.
> 
Right; and as we're now failing allocations we should be good.
Let's see whether it holds up during testing.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


* Re: [PATCH 2/3] nvme: sanitize KATO setting
  2021-03-02  9:26 ` [PATCH 2/3] nvme: sanitize KATO setting Hannes Reinecke
  2021-03-03  8:53   ` Chao Leng
@ 2021-03-05 20:38   ` Sagi Grimberg
  2021-03-08 13:11   ` Max Gurtovoy
  2 siblings, 0 replies; 18+ messages in thread
From: Sagi Grimberg @ 2021-03-05 20:38 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig
  Cc: Keith Busch, Daniel Wagner, Chao Leng, linux-nvme


> diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
> index 5dfd806fc2d2..dba32e39afbf 100644
> --- a/drivers/nvme/host/fabrics.c
> +++ b/drivers/nvme/host/fabrics.c
> @@ -382,7 +382,7 @@ int nvmf_connect_admin_queue(struct nvme_ctrl *ctrl)
>   	 * and add a grace period for controller kato enforcement
>   	 */
>   	cmd.connect.kato = ctrl->kato ?
> -		cpu_to_le32((ctrl->kato + NVME_KATO_GRACE) * 1000) : 0;
> +		cpu_to_le32(ctrl->kato * 1000) : 0;

There is no need for the conditional anymore.
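
In other words, the assignment could presumably collapse to something
like this (a sketch, not the final patch):

	/* ctrl->kato == 0 already encodes "keep-alive disabled" */
	cmd.connect.kato = cpu_to_le32(ctrl->kato * 1000);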


* Re: [PATCH 3/3] nvme: add 'kato' sysfs attribute
  2021-03-02  9:26 ` [PATCH 3/3] nvme: add 'kato' sysfs attribute Hannes Reinecke
@ 2021-03-05 20:38   ` Sagi Grimberg
  2021-03-08 13:06   ` Max Gurtovoy
  1 sibling, 0 replies; 18+ messages in thread
From: Sagi Grimberg @ 2021-03-05 20:38 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig
  Cc: Keith Busch, Daniel Wagner, Chao Leng, linux-nvme

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>


* Re: [PATCH 3/3] nvme: add 'kato' sysfs attribute
  2021-03-02  9:26 ` [PATCH 3/3] nvme: add 'kato' sysfs attribute Hannes Reinecke
  2021-03-05 20:38   ` Sagi Grimberg
@ 2021-03-08 13:06   ` Max Gurtovoy
  1 sibling, 0 replies; 18+ messages in thread
From: Max Gurtovoy @ 2021-03-08 13:06 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig
  Cc: Keith Busch, Sagi Grimberg, Daniel Wagner, linux-nvme, Chao Leng


On 3/2/2021 11:26 AM, Hannes Reinecke wrote:
> Add a 'kato' controller sysfs attribute to display the current
> keep-alive timeout value (if any). This allows userspace to identify
> persistent discovery controllers, as these will have a non-zero
> KATO value.

I'm not sure it will identify persistent discovery controllers, since
you might have a non-zero KATO for regular controllers as well.

But for identification you could set a special MN or SN for these controllers.

The patch is useful for displaying the keep-alive timeout value.



* Re: [PATCH 2/3] nvme: sanitize KATO setting
  2021-03-02  9:26 ` [PATCH 2/3] nvme: sanitize KATO setting Hannes Reinecke
  2021-03-03  8:53   ` Chao Leng
  2021-03-05 20:38   ` Sagi Grimberg
@ 2021-03-08 13:11   ` Max Gurtovoy
  2021-03-08 13:54     ` Hannes Reinecke
  2 siblings, 1 reply; 18+ messages in thread
From: Max Gurtovoy @ 2021-03-08 13:11 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig
  Cc: Keith Busch, Sagi Grimberg, Daniel Wagner, linux-nvme, Chao Leng


On 3/2/2021 11:26 AM, Hannes Reinecke wrote:
> According to the NVMe base spec the KATO commands should be sent
> at half of the KATO interval, to properly account for round-trip
> times.
> As we now will only ever send one KATO command per connection we
> can easily use the recommended values.
> This also fixes a potential issue where the request timeout for
> the KATO command does not match the value in the connect command,
> which might be causing spurious connection drops from the target.
>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>   drivers/nvme/host/core.c    | 9 ++++++---
>   drivers/nvme/host/fabrics.c | 2 +-
>   drivers/nvme/host/nvme.h    | 9 ++++++++-
>   3 files changed, 15 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index f890b310499e..6d096d41a82f 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -1223,7 +1223,8 @@ static void nvme_keep_alive_end_io(struct request *rq, blk_status_t status)
>   		startka = true;
>   	spin_unlock_irqrestore(&ctrl->lock, flags);
>   	if (startka)
> -		queue_delayed_work(nvme_wq, &ctrl->ka_work, ctrl->kato * HZ);
> +		queue_delayed_work(nvme_wq, &ctrl->ka_work,
> +				   NVME_KATO_DELAY(ctrl->kato) * HZ);
>   }
>   
>   static int nvme_keep_alive(struct nvme_ctrl *ctrl)
> @@ -1258,7 +1259,8 @@ static void nvme_keep_alive_work(struct work_struct *work)
>   		dev_dbg(ctrl->device,
>   			"reschedule traffic based keep-alive timer\n");
>   		ctrl->comp_seen = false;
> -		queue_delayed_work(nvme_wq, &ctrl->ka_work, ctrl->kato * HZ);
> +		queue_delayed_work(nvme_wq, &ctrl->ka_work,
> +				   NVME_KATO_DELAY(ctrl->kato) * HZ);
>   		return;
>   	}
>   
> @@ -1275,7 +1277,8 @@ static void nvme_start_keep_alive(struct nvme_ctrl *ctrl)
>   	if (unlikely(ctrl->kato == 0))
>   		return;
>   
> -	queue_delayed_work(nvme_wq, &ctrl->ka_work, ctrl->kato * HZ);
> +	queue_delayed_work(nvme_wq, &ctrl->ka_work,
> +			   NVME_KATO_DELAY(ctrl->kato) * HZ);
>   }
>   
>   void nvme_stop_keep_alive(struct nvme_ctrl *ctrl)
> diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
> index 5dfd806fc2d2..dba32e39afbf 100644
> --- a/drivers/nvme/host/fabrics.c
> +++ b/drivers/nvme/host/fabrics.c
> @@ -382,7 +382,7 @@ int nvmf_connect_admin_queue(struct nvme_ctrl *ctrl)
>   	 * and add a grace period for controller kato enforcement
>   	 */
>   	cmd.connect.kato = ctrl->kato ?
> -		cpu_to_le32((ctrl->kato + NVME_KATO_GRACE) * 1000) : 0;
> +		cpu_to_le32(ctrl->kato * 1000) : 0;
>   
>   	if (ctrl->opts->disable_sqflow)
>   		cmd.connect.cattr |= NVME_CONNECT_DISABLE_SQFLOW;
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index 23711f6b7d13..912830389997 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -27,7 +27,14 @@ extern unsigned int admin_timeout;
>   #define NVME_ADMIN_TIMEOUT	(admin_timeout * HZ)
>   
>   #define NVME_DEFAULT_KATO	5
> -#define NVME_KATO_GRACE		10
> +
> +/*
> + * The recommended frequency for KATO commands
> + * according to NVMe 1.4 section 7.12.1:
> + * The host should send Keep Alive commands at half of the
> + * Keep Alive Timeout accounting for transport roundtrip times [..].
> + */
> +#define NVME_KATO_DELAY(k)	((k) >> 1)

what will happen in case k == 1 ?


>   
>   #ifdef CONFIG_ARCH_NO_SG_CHAIN
>   #define  NVME_INLINE_SG_CNT  0


* Re: [PATCH 2/3] nvme: sanitize KATO setting
  2021-03-08 13:11   ` Max Gurtovoy
@ 2021-03-08 13:54     ` Hannes Reinecke
  0 siblings, 0 replies; 18+ messages in thread
From: Hannes Reinecke @ 2021-03-08 13:54 UTC (permalink / raw)
  To: Max Gurtovoy, Christoph Hellwig
  Cc: Keith Busch, Sagi Grimberg, Daniel Wagner, linux-nvme, Chao Leng

On 3/8/21 2:11 PM, Max Gurtovoy wrote:
> 
> On 3/2/2021 11:26 AM, Hannes Reinecke wrote:
>> According to the NVMe base spec the KATO commands should be sent
>> at half of the KATO interval, to properly account for round-trip
>> times.
>> As we now will only ever send one KATO command per connection we
>> can easily use the recommended values.
>> This also fixes a potential issue where the request timeout for
>> the KATO command does not match the value in the connect command,
>> which might be causing spurious connection drops from the target.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>>   drivers/nvme/host/core.c    | 9 ++++++---
>>   drivers/nvme/host/fabrics.c | 2 +-
>>   drivers/nvme/host/nvme.h    | 9 ++++++++-
>>   3 files changed, 15 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index f890b310499e..6d096d41a82f 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -1223,7 +1223,8 @@ static void nvme_keep_alive_end_io(struct
>> request *rq, blk_status_t status)
>>           startka = true;
>>       spin_unlock_irqrestore(&ctrl->lock, flags);
>>       if (startka)
>> -        queue_delayed_work(nvme_wq, &ctrl->ka_work, ctrl->kato * HZ);
>> +        queue_delayed_work(nvme_wq, &ctrl->ka_work,
>> +                   NVME_KATO_DELAY(ctrl->kato) * HZ);
>>   }
>>     static int nvme_keep_alive(struct nvme_ctrl *ctrl)
>> @@ -1258,7 +1259,8 @@ static void nvme_keep_alive_work(struct
>> work_struct *work)
>>           dev_dbg(ctrl->device,
>>               "reschedule traffic based keep-alive timer\n");
>>           ctrl->comp_seen = false;
>> -        queue_delayed_work(nvme_wq, &ctrl->ka_work, ctrl->kato * HZ);
>> +        queue_delayed_work(nvme_wq, &ctrl->ka_work,
>> +                   NVME_KATO_DELAY(ctrl->kato) * HZ);
>>           return;
>>       }
>>   @@ -1275,7 +1277,8 @@ static void nvme_start_keep_alive(struct
>> nvme_ctrl *ctrl)
>>       if (unlikely(ctrl->kato == 0))
>>           return;
>>   -    queue_delayed_work(nvme_wq, &ctrl->ka_work, ctrl->kato * HZ);
>> +    queue_delayed_work(nvme_wq, &ctrl->ka_work,
>> +               NVME_KATO_DELAY(ctrl->kato) * HZ);
>>   }
>>     void nvme_stop_keep_alive(struct nvme_ctrl *ctrl)
>> diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
>> index 5dfd806fc2d2..dba32e39afbf 100644
>> --- a/drivers/nvme/host/fabrics.c
>> +++ b/drivers/nvme/host/fabrics.c
>> @@ -382,7 +382,7 @@ int nvmf_connect_admin_queue(struct nvme_ctrl *ctrl)
>>        * and add a grace period for controller kato enforcement
>>        */
>>       cmd.connect.kato = ctrl->kato ?
>> -        cpu_to_le32((ctrl->kato + NVME_KATO_GRACE) * 1000) : 0;
>> +        cpu_to_le32(ctrl->kato * 1000) : 0;
>>         if (ctrl->opts->disable_sqflow)
>>           cmd.connect.cattr |= NVME_CONNECT_DISABLE_SQFLOW;
>> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
>> index 23711f6b7d13..912830389997 100644
>> --- a/drivers/nvme/host/nvme.h
>> +++ b/drivers/nvme/host/nvme.h
>> @@ -27,7 +27,14 @@ extern unsigned int admin_timeout;
>>   #define NVME_ADMIN_TIMEOUT    (admin_timeout * HZ)
>>     #define NVME_DEFAULT_KATO    5
>> -#define NVME_KATO_GRACE        10
>> +
>> +/*
>> + * The recommended frequency for KATO commands
>> + * according to NVMe 1.4 section 7.12.1:
>> + * The host should send Keep Alive commands at half of the
>> + * Keep Alive Timeout accounting for transport roundtrip times [..].
>> + */
>> +#define NVME_KATO_DELAY(k)    ((k) >> 1)
> 
> what will happen in case k == 1 ?
> 
> 
Ho-hum. Good point.
Will fix it up.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer
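
Concretely, the k == 1 corner case raised above, and one possible fix,
look like this (a sketch only; the actual follow-up patch may differ):

	/*
	 * With kato == 1, NVME_KATO_DELAY(1) * HZ == (1 >> 1) * HZ == 0,
	 * so the keep-alive work would be queued with no delay at all.
	 * Converting to jiffies before halving avoids the truncation:
	 */
	queue_delayed_work(nvme_wq, &ctrl->ka_work, ctrl->kato * HZ / 2);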

