[PATCH 0/2] nvme: sanitize KATO handling

* [PATCH 0/2] nvme: sanitize KATO handling
@ 2021-02-23 12:07 Hannes Reinecke
  2021-02-23 12:07 ` [PATCH 1/2] nvme: fixup kato deadlock Hannes Reinecke
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Hannes Reinecke @ 2021-02-23 12:07 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-nvme, Daniel Wagner, Sagi Grimberg, Keith Busch, Hannes Reinecke

Hi all,

one of our customer had been running into a deadlock trying to terminate
outstanding KATO commands during reset.
Looking closer at it, I found that we never actually _track_ if a KATO
command is submitted, so we might happily be sending several KATO commands
to the same controller simultaneously.
Also, I found it slightly odd that we signal a different KATO value to the
controller than what we're using internally; I would have thought that both
sides should agree on the same KATO value. And even that wouldn't be so
bad, but we really should be using the KATO value we annouonced to the
controller when setting the request timeout.

With these patches I attempt to resolve the situation; the first patch
ensures that only one KATO command to a given controller is outstanding.
With that the delay between sending KATO commands and the KATO timeout
are decoupled, and we can follow the recommendation from the base spec
to send the KATO commands at half the KATO timeout intervals.

As usual, comments and reviews are welcome.

Hannes Reinecke (2):
  nvme: fixup kato deadlock
  nvme: sanitize KATO setting

 drivers/nvme/host/core.c    | 22 +++++++++++++++++-----
 drivers/nvme/host/fabrics.c |  2 +-
 drivers/nvme/host/nvme.h    |  2 +-
 3 files changed, 19 insertions(+), 7 deletions(-)

-- 
2.29.2

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread