linux-block.vger.kernel.org archive mirror
* [PATCH 0/3] blk-mq/nvme: improve nvme-pci reset handler
@ 2020-05-20 11:56 Ming Lei
  2020-05-20 11:56 ` [PATCH 1/3] blk-mq: add API of blk_mq_queue_frozen Ming Lei
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Ming Lei @ 2020-05-20 11:56 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-nvme, Christoph Hellwig
  Cc: Alan Adamson, Ming Lei

Hi,

For nvme-pci, after the controller is recovered, in-flight IOs are waited
on before updating the number of hw queues. If a new controller error happens
during this period, the nvme-pci driver deletes the controller and fails the
in-flight IOs. This is too drastic, and not friendly from the user's viewpoint.

Add APIs for checking whether a queue is frozen, and replace nvme_wait_freeze
in the nvme-pci reset handler with a check of whether all ns queues are frozen
and whether the controller has been disabled. A fresh reset can then be
scheduled to handle a new controller error that happens while waiting for
in-flight IO to complete.

So deleting the controller and failing IOs can be avoided in this situation.
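
In rough terms, the idea is to turn the blocking wait in the reset handler
into a poll that can bail out once the controller gets disabled again. A
sketch of the loop (the real change is in patch 3, factored out as
nvme_wait_freeze_and_check()):

	/* old: block until all ns queues are frozen */
	nvme_wait_freeze(&dev->ctrl);

	/* new: poll, and bail out if the controller was disabled again */
	while (!nvme_frozen(&dev->ctrl)) {
		if (!dev->online_queues)	/* disabled by the timeout handler */
			break;			/* caller schedules a fresh reset */
		msleep(5);
	}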

Without these patches, when IO timeout failure injection is run, the
controller can be removed very quickly. With this patchset, no controller
removal is observed, and the controller recovers to its normal state
after IO timeout injection is stopped.

Ming Lei (3):
  blk-mq: add API of blk_mq_queue_frozen
  nvme: add nvme_frozen
  nvme-pci: make nvme reset more reliable

 block/blk-mq.c           |  6 ++++++
 drivers/nvme/host/core.c | 14 ++++++++++++++
 drivers/nvme/host/nvme.h |  1 +
 drivers/nvme/host/pci.c  | 37 ++++++++++++++++++++++++++++++-------
 include/linux/blk-mq.h   |  1 +
 5 files changed, 52 insertions(+), 7 deletions(-)

-- 
2.25.2


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/3] blk-mq: add API of blk_mq_queue_frozen
  2020-05-20 11:56 [PATCH 0/3] blk-mq/nvme: improve nvme-pci reset handler Ming Lei
@ 2020-05-20 11:56 ` Ming Lei
  2020-05-20 11:56 ` [PATCH 2/3] nvme: add nvme_frozen Ming Lei
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2020-05-20 11:56 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-nvme, Christoph Hellwig
  Cc: Alan Adamson, Ming Lei, Sagi Grimberg, Keith Busch, Max Gurtovoy

blk_mq_freeze_queue_wait() isn't flexible enough for some cases, such as
error recovery: when blk_mq_freeze_queue_wait() is called in an error
recovery handler, a new problem may be triggered on the controller, so
in-flight IO may never complete while blk_mq_freeze_queue_wait() waits.

And error recovery often runs in a single context, so a deadlock is
triggered because the error recovery handler can't make progress.

Add a new API, blk_mq_queue_frozen(), which an error recovery handler
can use to query whether the queue has been frozen completely.
Meanwhile, the error recovery handler can check whether a hardware
failure has happened; if so, it can break out of the current handling
and run a fresh recovery, so the deadlock is avoided.

This API will be used to improve error handling of nvme-pci's timeout
handler.
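
For illustration, a driver-side recovery handler could poll the new helper
roughly like below (just a sketch; the failure check and the way a fresh
recovery gets scheduled are placeholders for whatever the driver already has):

	/* hypothetical recovery loop built on top of blk_mq_queue_frozen() */
	while (!blk_mq_queue_frozen(q)) {
		if (new_hw_failure_detected(dev)) {	/* placeholder check */
			schedule_fresh_recovery(dev);	/* placeholder */
			return;
		}
		msleep(5);
	}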

Cc: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq.c         | 6 ++++++
 include/linux/blk-mq.h | 1 +
 2 files changed, 7 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index cac11945f602..e595951bcdae 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -148,6 +148,12 @@ void blk_mq_freeze_queue_wait(struct request_queue *q)
 }
 EXPORT_SYMBOL_GPL(blk_mq_freeze_queue_wait);
 
+bool blk_mq_queue_frozen(struct request_queue *q)
+{
+	return percpu_ref_is_zero(&q->q_usage_counter);
+}
+EXPORT_SYMBOL_GPL(blk_mq_queue_frozen);
+
 int blk_mq_freeze_queue_wait_timeout(struct request_queue *q,
 				     unsigned long timeout)
 {
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index d7307795439a..e1d57202d526 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -518,6 +518,7 @@ void blk_freeze_queue_start(struct request_queue *q);
 void blk_mq_freeze_queue_wait(struct request_queue *q);
 int blk_mq_freeze_queue_wait_timeout(struct request_queue *q,
 				     unsigned long timeout);
+bool blk_mq_queue_frozen(struct request_queue *q);
 
 int blk_mq_map_queues(struct blk_mq_queue_map *qmap);
 void blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, int nr_hw_queues);
-- 
2.25.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/3] nvme: add nvme_frozen
  2020-05-20 11:56 [PATCH 0/3] blk-mq/nvme: improve nvme-pci reset handler Ming Lei
  2020-05-20 11:56 ` [PATCH 1/3] blk-mq: add API of blk_mq_queue_frozen Ming Lei
@ 2020-05-20 11:56 ` Ming Lei
  2020-05-20 11:56 ` [PATCH 3/3] nvme-pci: make nvme reset more reliable Ming Lei
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2020-05-20 11:56 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-nvme, Christoph Hellwig
  Cc: Alan Adamson, Ming Lei, Sagi Grimberg, Keith Busch, Max Gurtovoy

Add a new API, nvme_frozen(), which the reset handler can use to query
whether all ns queues have been frozen completely. Meanwhile, the reset
handler can check whether a new hardware failure has happened; if so, it
can break out of the current handling and schedule a fresh recovery, so a
deadlock, or deleting the controller and failing all IOs, can be avoided.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/nvme/host/core.c | 14 ++++++++++++++
 drivers/nvme/host/nvme.h |  1 +
 2 files changed, 15 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index f3c037f5a9ba..469010607383 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4243,6 +4243,20 @@ void nvme_wait_freeze(struct nvme_ctrl *ctrl)
 }
 EXPORT_SYMBOL_GPL(nvme_wait_freeze);
 
+bool nvme_frozen(struct nvme_ctrl *ctrl)
+{
+	struct nvme_ns *ns;
+	int ret = 0;
+
+	down_read(&ctrl->namespaces_rwsem);
+	list_for_each_entry(ns, &ctrl->namespaces, list)
+		ret += !blk_mq_queue_frozen(ns->queue);
+	up_read(&ctrl->namespaces_rwsem);
+
+	return ret == 0;
+}
+EXPORT_SYMBOL_GPL(nvme_frozen);
+
 void nvme_start_freeze(struct nvme_ctrl *ctrl)
 {
 	struct nvme_ns *ns;
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 2e04a36296d9..459e5952ff5f 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -508,6 +508,7 @@ void nvme_unfreeze(struct nvme_ctrl *ctrl);
 void nvme_wait_freeze(struct nvme_ctrl *ctrl);
 void nvme_wait_freeze_timeout(struct nvme_ctrl *ctrl, long timeout);
 void nvme_start_freeze(struct nvme_ctrl *ctrl);
+bool nvme_frozen(struct nvme_ctrl *ctrl);
 
 #define NVME_QID_ANY -1
 struct request *nvme_alloc_request(struct request_queue *q,
-- 
2.25.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/3] nvme-pci: make nvme reset more reliable
  2020-05-20 11:56 [PATCH 0/3] blk-mq/nvme: improve nvme-pci reset handler Ming Lei
  2020-05-20 11:56 ` [PATCH 1/3] blk-mq: add API of blk_mq_queue_frozen Ming Lei
  2020-05-20 11:56 ` [PATCH 2/3] nvme: add nvme_frozen Ming Lei
@ 2020-05-20 11:56 ` Ming Lei
  2020-05-20 17:10   ` Dongli Zhang
  2020-05-26  5:01   ` Dongli Zhang
  2020-05-26  2:55 ` [PATCH 0/3] blk-mq/nvme: improve nvme-pci reset handler Ming Lei
  2020-05-27 18:09 ` Alan Adamson
  4 siblings, 2 replies; 14+ messages in thread
From: Ming Lei @ 2020-05-20 11:56 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-nvme, Christoph Hellwig
  Cc: Alan Adamson, Ming Lei, Sagi Grimberg, Keith Busch, Max Gurtovoy

While waiting for in-flight IO to complete in the reset handler, a timeout
or controller failure may still happen; the controller is then deleted
and all in-flight IOs are failed. This is too drastic.

Improve the reset handling by replacing nvme_wait_freeze with a
query-and-check loop. If all ns queues are frozen, the controller has been
reset successfully; otherwise, check whether the controller has been disabled.
If so, break out of the current recovery and schedule a fresh reset.

This avoids failing IO and removing the controller unnecessarily.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/nvme/host/pci.c | 37 ++++++++++++++++++++++++++++++-------
 1 file changed, 30 insertions(+), 7 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index ce0d1e79467a..b5aeed33a634 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -24,6 +24,7 @@
 #include <linux/io-64-nonatomic-lo-hi.h>
 #include <linux/sed-opal.h>
 #include <linux/pci-p2pdma.h>
+#include <linux/delay.h>
 
 #include "trace.h"
 #include "nvme.h"
@@ -1235,9 +1236,6 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 	 * shutdown, so we return BLK_EH_DONE.
 	 */
 	switch (dev->ctrl.state) {
-	case NVME_CTRL_CONNECTING:
-		nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DELETING);
-		/* fall through */
 	case NVME_CTRL_DELETING:
 		dev_warn_ratelimited(dev->ctrl.device,
 			 "I/O %d QID %d timeout, disable controller\n",
@@ -2393,7 +2391,8 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
 		u32 csts = readl(dev->bar + NVME_REG_CSTS);
 
 		if (dev->ctrl.state == NVME_CTRL_LIVE ||
-		    dev->ctrl.state == NVME_CTRL_RESETTING) {
+		    dev->ctrl.state == NVME_CTRL_RESETTING ||
+		    dev->ctrl.state == NVME_CTRL_CONNECTING) {
 			freeze = true;
 			nvme_start_freeze(&dev->ctrl);
 		}
@@ -2504,12 +2503,29 @@ static void nvme_remove_dead_ctrl(struct nvme_dev *dev)
 		nvme_put_ctrl(&dev->ctrl);
 }
 
+static bool nvme_wait_freeze_and_check(struct nvme_dev *dev)
+{
+	bool frozen;
+
+	while (true) {
+		frozen = nvme_frozen(&dev->ctrl);
+		if (frozen)
+			break;
+		if (!dev->online_queues)
+			break;
+		msleep(5);
+	}
+
+	return frozen;
+}
+
 static void nvme_reset_work(struct work_struct *work)
 {
 	struct nvme_dev *dev =
 		container_of(work, struct nvme_dev, ctrl.reset_work);
 	bool was_suspend = !!(dev->ctrl.ctrl_config & NVME_CC_SHN_NORMAL);
 	int result;
+	bool reset_done = true;
 
 	if (WARN_ON(dev->ctrl.state != NVME_CTRL_RESETTING)) {
 		result = -ENODEV;
@@ -2606,8 +2622,9 @@ static void nvme_reset_work(struct work_struct *work)
 		nvme_free_tagset(dev);
 	} else {
 		nvme_start_queues(&dev->ctrl);
-		nvme_wait_freeze(&dev->ctrl);
-		nvme_dev_add(dev);
+		reset_done = nvme_wait_freeze_and_check(dev);
+		if (reset_done)
+			nvme_dev_add(dev);
 		nvme_unfreeze(&dev->ctrl);
 	}
 
@@ -2622,7 +2639,13 @@ static void nvme_reset_work(struct work_struct *work)
 		goto out;
 	}
 
-	nvme_start_ctrl(&dev->ctrl);
+	/* New error happens during reset, so schedule a new reset */
+	if (!reset_done) {
+		dev_warn(dev->ctrl.device, "new error during reset\n");
+		nvme_reset_ctrl(&dev->ctrl);
+	} else {
+		nvme_start_ctrl(&dev->ctrl);
+	}
 	return;
 
  out_unlock:
-- 
2.25.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] nvme-pci: make nvme reset more reliable
  2020-05-20 11:56 ` [PATCH 3/3] nvme-pci: make nvme reset more reliable Ming Lei
@ 2020-05-20 17:10   ` Dongli Zhang
  2020-05-20 17:27     ` Dongli Zhang
                       ` (2 more replies)
  2020-05-26  5:01   ` Dongli Zhang
  1 sibling, 3 replies; 14+ messages in thread
From: Dongli Zhang @ 2020-05-20 17:10 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-block, linux-nvme, Christoph Hellwig
  Cc: Alan Adamson, Sagi Grimberg, Keith Busch, Max Gurtovoy



On 5/20/20 4:56 AM, Ming Lei wrote:
> While waiting for in-flight IO to complete in the reset handler, a timeout
> or controller failure may still happen; the controller is then deleted
> and all in-flight IOs are failed. This is too drastic.
> 
> Improve the reset handling by replacing nvme_wait_freeze with a
> query-and-check loop. If all ns queues are frozen, the controller has been
> reset successfully; otherwise, check whether the controller has been disabled.
> If so, break out of the current recovery and schedule a fresh reset.
> 
> This avoids failing IO and removing the controller unnecessarily.
> 
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Sagi Grimberg <sagi@grimberg.me>
> Cc: Keith Busch <kbusch@kernel.org>
> Cc: Max Gurtovoy <maxg@mellanox.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  drivers/nvme/host/pci.c | 37 ++++++++++++++++++++++++++++++-------
>  1 file changed, 30 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index ce0d1e79467a..b5aeed33a634 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -24,6 +24,7 @@
>  #include <linux/io-64-nonatomic-lo-hi.h>
>  #include <linux/sed-opal.h>
>  #include <linux/pci-p2pdma.h>
> +#include <linux/delay.h>
>  
>  #include "trace.h"
>  #include "nvme.h"
> @@ -1235,9 +1236,6 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>  	 * shutdown, so we return BLK_EH_DONE.
>  	 */
>  	switch (dev->ctrl.state) {
> -	case NVME_CTRL_CONNECTING:
> -		nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DELETING);
> -		/* fall through */
>  	case NVME_CTRL_DELETING:
>  		dev_warn_ratelimited(dev->ctrl.device,
>  			 "I/O %d QID %d timeout, disable controller\n",
> @@ -2393,7 +2391,8 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
>  		u32 csts = readl(dev->bar + NVME_REG_CSTS);
>  
>  		if (dev->ctrl.state == NVME_CTRL_LIVE ||
> -		    dev->ctrl.state == NVME_CTRL_RESETTING) {
> +		    dev->ctrl.state == NVME_CTRL_RESETTING ||
> +		    dev->ctrl.state == NVME_CTRL_CONNECTING) {
>  			freeze = true;
>  			nvme_start_freeze(&dev->ctrl);
>  		}
> @@ -2504,12 +2503,29 @@ static void nvme_remove_dead_ctrl(struct nvme_dev *dev)
>  		nvme_put_ctrl(&dev->ctrl);
>  }
>  
> +static bool nvme_wait_freeze_and_check(struct nvme_dev *dev)
> +{
> +	bool frozen;
> +
> +	while (true) {
> +		frozen = nvme_frozen(&dev->ctrl);
> +		if (frozen)
> +			break;
> +		if (!dev->online_queues)
> +			break;
> +		msleep(5);
> +	}
> +
> +	return frozen;
> +}
> +
>  static void nvme_reset_work(struct work_struct *work)
>  {
>  	struct nvme_dev *dev =
>  		container_of(work, struct nvme_dev, ctrl.reset_work);
>  	bool was_suspend = !!(dev->ctrl.ctrl_config & NVME_CC_SHN_NORMAL);
>  	int result;
> +	bool reset_done = true;
>  
>  	if (WARN_ON(dev->ctrl.state != NVME_CTRL_RESETTING)) {
>  		result = -ENODEV;
> @@ -2606,8 +2622,9 @@ static void nvme_reset_work(struct work_struct *work)
>  		nvme_free_tagset(dev);
>  	} else {
>  		nvme_start_queues(&dev->ctrl);
> -		nvme_wait_freeze(&dev->ctrl);
> -		nvme_dev_add(dev);
> +		reset_done = nvme_wait_freeze_and_check(dev);

Once we arrive here, it indicates "dev->online_queues >= 2".

2601         if (dev->online_queues < 2) {
2602                 dev_warn(dev->ctrl.device, "IO queues not created\n");
2603                 nvme_kill_queues(&dev->ctrl);
2604                 nvme_remove_namespaces(&dev->ctrl);
2605                 nvme_free_tagset(dev);
2606         } else {
2607                 nvme_start_queues(&dev->ctrl);
2608                 nvme_wait_freeze(&dev->ctrl);
2609                 nvme_dev_add(dev);
2610                 nvme_unfreeze(&dev->ctrl);
2611         }

Is there any reason to check "if (!dev->online_queues)" in
nvme_wait_freeze_and_check()?

Thank you very much!

Dongli Zhang




> +		if (reset_done)
> +			nvme_dev_add(dev);
>  		nvme_unfreeze(&dev->ctrl);
>  	}
>  
> @@ -2622,7 +2639,13 @@ static void nvme_reset_work(struct work_struct *work)
>  		goto out;
>  	}
>  
> -	nvme_start_ctrl(&dev->ctrl);
> +	/* New error happens during reset, so schedule a new reset */
> +	if (!reset_done) {
> +		dev_warn(dev->ctrl.device, "new error during reset\n");
> +		nvme_reset_ctrl(&dev->ctrl);
> +	} else {
> +		nvme_start_ctrl(&dev->ctrl);
> +	}
>  	return;
>  
>   out_unlock:
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] nvme-pci: make nvme reset more reliable
  2020-05-20 17:10   ` Dongli Zhang
@ 2020-05-20 17:27     ` Dongli Zhang
  2020-05-20 17:52     ` Keith Busch
  2020-05-21  2:33     ` Ming Lei
  2 siblings, 0 replies; 14+ messages in thread
From: Dongli Zhang @ 2020-05-20 17:27 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-block, linux-nvme, Christoph Hellwig
  Cc: Alan Adamson, Sagi Grimberg, Keith Busch, Max Gurtovoy



On 5/20/20 10:10 AM, Dongli Zhang wrote:
> 
> 
> On 5/20/20 4:56 AM, Ming Lei wrote:
>> While waiting for in-flight IO to complete in the reset handler, a timeout
>> or controller failure may still happen; the controller is then deleted
>> and all in-flight IOs are failed. This is too drastic.
>>
>> Improve the reset handling by replacing nvme_wait_freeze with a
>> query-and-check loop. If all ns queues are frozen, the controller has been
>> reset successfully; otherwise, check whether the controller has been disabled.
>> If so, break out of the current recovery and schedule a fresh reset.
>>
>> This avoids failing IO and removing the controller unnecessarily.
>>
>> Cc: Christoph Hellwig <hch@lst.de>
>> Cc: Sagi Grimberg <sagi@grimberg.me>
>> Cc: Keith Busch <kbusch@kernel.org>
>> Cc: Max Gurtovoy <maxg@mellanox.com>
>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
>> ---
>>  drivers/nvme/host/pci.c | 37 ++++++++++++++++++++++++++++++-------
>>  1 file changed, 30 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>> index ce0d1e79467a..b5aeed33a634 100644
>> --- a/drivers/nvme/host/pci.c
>> +++ b/drivers/nvme/host/pci.c
>> @@ -24,6 +24,7 @@
>>  #include <linux/io-64-nonatomic-lo-hi.h>
>>  #include <linux/sed-opal.h>
>>  #include <linux/pci-p2pdma.h>
>> +#include <linux/delay.h>
>>  
>>  #include "trace.h"
>>  #include "nvme.h"
>> @@ -1235,9 +1236,6 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>>  	 * shutdown, so we return BLK_EH_DONE.
>>  	 */
>>  	switch (dev->ctrl.state) {
>> -	case NVME_CTRL_CONNECTING:
>> -		nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DELETING);
>> -		/* fall through */
>>  	case NVME_CTRL_DELETING:
>>  		dev_warn_ratelimited(dev->ctrl.device,
>>  			 "I/O %d QID %d timeout, disable controller\n",
>> @@ -2393,7 +2391,8 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
>>  		u32 csts = readl(dev->bar + NVME_REG_CSTS);
>>  
>>  		if (dev->ctrl.state == NVME_CTRL_LIVE ||
>> -		    dev->ctrl.state == NVME_CTRL_RESETTING) {
>> +		    dev->ctrl.state == NVME_CTRL_RESETTING ||
>> +		    dev->ctrl.state == NVME_CTRL_CONNECTING) {
>>  			freeze = true;
>>  			nvme_start_freeze(&dev->ctrl);
>>  		}
>> @@ -2504,12 +2503,29 @@ static void nvme_remove_dead_ctrl(struct nvme_dev *dev)
>>  		nvme_put_ctrl(&dev->ctrl);
>>  }
>>  
>> +static bool nvme_wait_freeze_and_check(struct nvme_dev *dev)
>> +{
>> +	bool frozen;
>> +
>> +	while (true) {
>> +		frozen = nvme_frozen(&dev->ctrl);
>> +		if (frozen)
>> +			break;
>> +		if (!dev->online_queues)
>> +			break;
>> +		msleep(5);
>> +	}
>> +
>> +	return frozen;
>> +}
>> +
>>  static void nvme_reset_work(struct work_struct *work)
>>  {
>>  	struct nvme_dev *dev =
>>  		container_of(work, struct nvme_dev, ctrl.reset_work);
>>  	bool was_suspend = !!(dev->ctrl.ctrl_config & NVME_CC_SHN_NORMAL);
>>  	int result;
>> +	bool reset_done = true;
>>  
>>  	if (WARN_ON(dev->ctrl.state != NVME_CTRL_RESETTING)) {
>>  		result = -ENODEV;
>> @@ -2606,8 +2622,9 @@ static void nvme_reset_work(struct work_struct *work)
>>  		nvme_free_tagset(dev);
>>  	} else {
>>  		nvme_start_queues(&dev->ctrl);
>> -		nvme_wait_freeze(&dev->ctrl);
>> -		nvme_dev_add(dev);
>> +		reset_done = nvme_wait_freeze_and_check(dev);
> 
> Once we arrive here, it indicates "dev->online_queues >= 2".
> 
> 2601         if (dev->online_queues < 2) {
> 2602                 dev_warn(dev->ctrl.device, "IO queues not created\n");
> 2603                 nvme_kill_queues(&dev->ctrl);
> 2604                 nvme_remove_namespaces(&dev->ctrl);
> 2605                 nvme_free_tagset(dev);
> 2606         } else {
> 2607                 nvme_start_queues(&dev->ctrl);
> 2608                 nvme_wait_freeze(&dev->ctrl);
> 2609                 nvme_dev_add(dev);
> 2610                 nvme_unfreeze(&dev->ctrl);
> 2611         }
> 
> Is there any reason to check "if (!dev->online_queues)" in
> nvme_wait_freeze_and_check()?
> 

I think you meant another nvme_dev_disable() during the reset?

Sorry for the misunderstanding in my previous email.

Dongli Zhang

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] nvme-pci: make nvme reset more reliable
  2020-05-20 17:10   ` Dongli Zhang
  2020-05-20 17:27     ` Dongli Zhang
@ 2020-05-20 17:52     ` Keith Busch
  2020-05-21  2:33     ` Ming Lei
  2 siblings, 0 replies; 14+ messages in thread
From: Keith Busch @ 2020-05-20 17:52 UTC (permalink / raw)
  To: Dongli Zhang
  Cc: Ming Lei, Jens Axboe, linux-block, linux-nvme, Christoph Hellwig,
	Alan Adamson, Sagi Grimberg, Max Gurtovoy

On Wed, May 20, 2020 at 10:10:47AM -0700, Dongli Zhang wrote:
> On 5/20/20 4:56 AM, Ming Lei wrote:
> > +static bool nvme_wait_freeze_and_check(struct nvme_dev *dev)
> > +{
> > +	bool frozen;
> > +
> > +	while (true) {
> > +		frozen = nvme_frozen(&dev->ctrl);
> > +		if (frozen)
> > +			break;
> > +		if (!dev->online_queues)
> > +			break;
> > +		msleep(5);
> > +	}
> > +
> > +	return frozen;
> > +}
> > +
> >  static void nvme_reset_work(struct work_struct *work)
> >  {
> >  	struct nvme_dev *dev =
> >  		container_of(work, struct nvme_dev, ctrl.reset_work);
> >  	bool was_suspend = !!(dev->ctrl.ctrl_config & NVME_CC_SHN_NORMAL);
> >  	int result;
> > +	bool reset_done = true;
> >  
> >  	if (WARN_ON(dev->ctrl.state != NVME_CTRL_RESETTING)) {
> >  		result = -ENODEV;
> > @@ -2606,8 +2622,9 @@ static void nvme_reset_work(struct work_struct *work)
> >  		nvme_free_tagset(dev);
> >  	} else {
> >  		nvme_start_queues(&dev->ctrl);
> > -		nvme_wait_freeze(&dev->ctrl);
> > -		nvme_dev_add(dev);
> > +		reset_done = nvme_wait_freeze_and_check(dev);
> 
> Once we arrive here, it indicates "dev->online_queues >= 2".
> 
> 2601         if (dev->online_queues < 2) {
> 2602                 dev_warn(dev->ctrl.device, "IO queues not created\n");
> 2603                 nvme_kill_queues(&dev->ctrl);
> 2604                 nvme_remove_namespaces(&dev->ctrl);
> 2605                 nvme_free_tagset(dev);
> 2606         } else {
> 2607                 nvme_start_queues(&dev->ctrl);
> 2608                 nvme_wait_freeze(&dev->ctrl);
> 2609                 nvme_dev_add(dev);
> 2610                 nvme_unfreeze(&dev->ctrl);
> 2611         }
> 
> Is there any reason to check "if (!dev->online_queues)" in
> nvme_wait_freeze_and_check()?

Looks correct to me. If the queues fail to freeze, that means a timeout
occurred, and the nvme timeout handler tears down all online queues, so
this patch uses that as the criterion for breaking out of the loop.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] nvme-pci: make nvme reset more reliable
  2020-05-20 17:10   ` Dongli Zhang
  2020-05-20 17:27     ` Dongli Zhang
  2020-05-20 17:52     ` Keith Busch
@ 2020-05-21  2:33     ` Ming Lei
  2 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2020-05-21  2:33 UTC (permalink / raw)
  To: Dongli Zhang
  Cc: Jens Axboe, linux-block, linux-nvme, Christoph Hellwig,
	Alan Adamson, Sagi Grimberg, Keith Busch, Max Gurtovoy

On Wed, May 20, 2020 at 10:10:47AM -0700, Dongli Zhang wrote:
> 
> 
> On 5/20/20 4:56 AM, Ming Lei wrote:
> > While waiting for in-flight IO to complete in the reset handler, a timeout
> > or controller failure may still happen; the controller is then deleted
> > and all in-flight IOs are failed. This is too drastic.
> > 
> > Improve the reset handling by replacing nvme_wait_freeze with a
> > query-and-check loop. If all ns queues are frozen, the controller has been
> > reset successfully; otherwise, check whether the controller has been disabled.
> > If so, break out of the current recovery and schedule a fresh reset.
> > 
> > This avoids failing IO and removing the controller unnecessarily.
> > 
> > Cc: Christoph Hellwig <hch@lst.de>
> > Cc: Sagi Grimberg <sagi@grimberg.me>
> > Cc: Keith Busch <kbusch@kernel.org>
> > Cc: Max Gurtovoy <maxg@mellanox.com>
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >  drivers/nvme/host/pci.c | 37 ++++++++++++++++++++++++++++++-------
> >  1 file changed, 30 insertions(+), 7 deletions(-)
> > 
> > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> > index ce0d1e79467a..b5aeed33a634 100644
> > --- a/drivers/nvme/host/pci.c
> > +++ b/drivers/nvme/host/pci.c
> > @@ -24,6 +24,7 @@
> >  #include <linux/io-64-nonatomic-lo-hi.h>
> >  #include <linux/sed-opal.h>
> >  #include <linux/pci-p2pdma.h>
> > +#include <linux/delay.h>
> >  
> >  #include "trace.h"
> >  #include "nvme.h"
> > @@ -1235,9 +1236,6 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
> >  	 * shutdown, so we return BLK_EH_DONE.
> >  	 */
> >  	switch (dev->ctrl.state) {
> > -	case NVME_CTRL_CONNECTING:
> > -		nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DELETING);
> > -		/* fall through */
> >  	case NVME_CTRL_DELETING:
> >  		dev_warn_ratelimited(dev->ctrl.device,
> >  			 "I/O %d QID %d timeout, disable controller\n",
> > @@ -2393,7 +2391,8 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
> >  		u32 csts = readl(dev->bar + NVME_REG_CSTS);
> >  
> >  		if (dev->ctrl.state == NVME_CTRL_LIVE ||
> > -		    dev->ctrl.state == NVME_CTRL_RESETTING) {
> > +		    dev->ctrl.state == NVME_CTRL_RESETTING ||
> > +		    dev->ctrl.state == NVME_CTRL_CONNECTING) {
> >  			freeze = true;
> >  			nvme_start_freeze(&dev->ctrl);
> >  		}
> > @@ -2504,12 +2503,29 @@ static void nvme_remove_dead_ctrl(struct nvme_dev *dev)
> >  		nvme_put_ctrl(&dev->ctrl);
> >  }
> >  
> > +static bool nvme_wait_freeze_and_check(struct nvme_dev *dev)
> > +{
> > +	bool frozen;
> > +
> > +	while (true) {
> > +		frozen = nvme_frozen(&dev->ctrl);
> > +		if (frozen)
> > +			break;
> > +		if (!dev->online_queues)
> > +			break;
> > +		msleep(5);
> > +	}
> > +
> > +	return frozen;
> > +}
> > +
> >  static void nvme_reset_work(struct work_struct *work)
> >  {
> >  	struct nvme_dev *dev =
> >  		container_of(work, struct nvme_dev, ctrl.reset_work);
> >  	bool was_suspend = !!(dev->ctrl.ctrl_config & NVME_CC_SHN_NORMAL);
> >  	int result;
> > +	bool reset_done = true;
> >  
> >  	if (WARN_ON(dev->ctrl.state != NVME_CTRL_RESETTING)) {
> >  		result = -ENODEV;
> > @@ -2606,8 +2622,9 @@ static void nvme_reset_work(struct work_struct *work)
> >  		nvme_free_tagset(dev);
> >  	} else {
> >  		nvme_start_queues(&dev->ctrl);
> > -		nvme_wait_freeze(&dev->ctrl);
> > -		nvme_dev_add(dev);
> > +		reset_done = nvme_wait_freeze_and_check(dev);
> 
> Once we arrive here, it indicates "dev->online_queues >= 2".
> 
> 2601         if (dev->online_queues < 2) {
> 2602                 dev_warn(dev->ctrl.device, "IO queues not created\n");
> 2603                 nvme_kill_queues(&dev->ctrl);
> 2604                 nvme_remove_namespaces(&dev->ctrl);
> 2605                 nvme_free_tagset(dev);
> 2606         } else {
> 2607                 nvme_start_queues(&dev->ctrl);
> 2608                 nvme_wait_freeze(&dev->ctrl);
> 2609                 nvme_dev_add(dev);
> 2610                 nvme_unfreeze(&dev->ctrl);
> 2611         }
> 
> Is there any reason to check "if (!dev->online_queues)" in
> nvme_wait_freeze_and_check()?
> 

nvme_dev_disable() suspends all IO queues and the admin queue, so
dev->online_queues becomes 0 after nvme_dev_disable() is run from the
timeout handler.


thanks,
Ming


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/3] blk-mq/nvme: improve nvme-pci reset handler
  2020-05-20 11:56 [PATCH 0/3] blk-mq/nvme: improve nvme-pci reset handler Ming Lei
                   ` (2 preceding siblings ...)
  2020-05-20 11:56 ` [PATCH 3/3] nvme-pci: make nvme reset more reliable Ming Lei
@ 2020-05-26  2:55 ` Ming Lei
  2020-05-27 18:09 ` Alan Adamson
  4 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2020-05-26  2:55 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-nvme, Christoph Hellwig; +Cc: Alan Adamson

On Wed, May 20, 2020 at 07:56:52PM +0800, Ming Lei wrote:
> Hi,
> 
> For nvme-pci, after the controller is recovered, in-flight IOs are waited
> on before updating the number of hw queues. If a new controller error happens
> during this period, the nvme-pci driver deletes the controller and fails the
> in-flight IOs. This is too drastic, and not friendly from the user's viewpoint.
> 
> Add APIs for checking whether a queue is frozen, and replace nvme_wait_freeze
> in the nvme-pci reset handler with a check of whether all ns queues are frozen
> and whether the controller has been disabled. A fresh reset can then be
> scheduled to handle a new controller error that happens while waiting for
> in-flight IO to complete.
> 
> So deleting the controller and failing IOs can be avoided in this situation.
> 
> Without these patches, when IO timeout failure injection is run, the
> controller can be removed very quickly. With this patchset, no controller
> removal is observed, and the controller recovers to its normal state
> after IO timeout injection is stopped.
> 
> Ming Lei (3):
>   blk-mq: add API of blk_mq_queue_frozen
>   nvme: add nvme_frozen
>   nvme-pci: make nvme reset more reliable
> 
>  block/blk-mq.c           |  6 ++++++
>  drivers/nvme/host/core.c | 14 ++++++++++++++
>  drivers/nvme/host/nvme.h |  1 +
>  drivers/nvme/host/pci.c  | 37 ++++++++++++++++++++++++++++++-------
>  include/linux/blk-mq.h   |  1 +
>  5 files changed, 52 insertions(+), 7 deletions(-)
> 
> -- 
> 2.25.2
> 

Hello Guys,

Ping...

Thanks,
Ming


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] nvme-pci: make nvme reset more reliable
  2020-05-20 11:56 ` [PATCH 3/3] nvme-pci: make nvme reset more reliable Ming Lei
  2020-05-20 17:10   ` Dongli Zhang
@ 2020-05-26  5:01   ` Dongli Zhang
  2020-05-26  7:12     ` Ming Lei
  1 sibling, 1 reply; 14+ messages in thread
From: Dongli Zhang @ 2020-05-26  5:01 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-block, linux-nvme, Christoph Hellwig
  Cc: Alan Adamson, Sagi Grimberg, Keith Busch, Max Gurtovoy



On 5/20/20 4:56 AM, Ming Lei wrote:
> While waiting for in-flight IO to complete in the reset handler, a timeout

Does this refer to the window after nvme_start_queues() in nvme_reset_work(),
that is, just after the queues are unquiesced again?

If a v2 is needed in the future, how about mentioning the specific function so
that it is much easier to track the issue?

> or controller failure may still happen; the controller is then deleted
> and all in-flight IOs are failed. This is too drastic.
> 
> Improve the reset handling by replacing nvme_wait_freeze with a
> query-and-check loop. If all ns queues are frozen, the controller has been
> reset successfully; otherwise, check whether the controller has been disabled.
> If so, break out of the current recovery and schedule a fresh reset.
> 
> This avoids failing IO and removing the controller unnecessarily.
> 
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Sagi Grimberg <sagi@grimberg.me>
> Cc: Keith Busch <kbusch@kernel.org>
> Cc: Max Gurtovoy <maxg@mellanox.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  drivers/nvme/host/pci.c | 37 ++++++++++++++++++++++++++++++-------
>  1 file changed, 30 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index ce0d1e79467a..b5aeed33a634 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -24,6 +24,7 @@
>  #include <linux/io-64-nonatomic-lo-hi.h>
>  #include <linux/sed-opal.h>
>  #include <linux/pci-p2pdma.h>
> +#include <linux/delay.h>
>  
>  #include "trace.h"
>  #include "nvme.h"
> @@ -1235,9 +1236,6 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>  	 * shutdown, so we return BLK_EH_DONE.
>  	 */
>  	switch (dev->ctrl.state) {
> -	case NVME_CTRL_CONNECTING:
> -		nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DELETING);
> -		/* fall through */
>  	case NVME_CTRL_DELETING:
>  		dev_warn_ratelimited(dev->ctrl.device,
>  			 "I/O %d QID %d timeout, disable controller\n",
> @@ -2393,7 +2391,8 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
>  		u32 csts = readl(dev->bar + NVME_REG_CSTS);
>  
>  		if (dev->ctrl.state == NVME_CTRL_LIVE ||
> -		    dev->ctrl.state == NVME_CTRL_RESETTING) {
> +		    dev->ctrl.state == NVME_CTRL_RESETTING ||
> +		    dev->ctrl.state == NVME_CTRL_CONNECTING) {
>  			freeze = true;
>  			nvme_start_freeze(&dev->ctrl);
>  		}
> @@ -2504,12 +2503,29 @@ static void nvme_remove_dead_ctrl(struct nvme_dev *dev)
>  		nvme_put_ctrl(&dev->ctrl);
>  }
>  
> +static bool nvme_wait_freeze_and_check(struct nvme_dev *dev)
> +{
> +	bool frozen;
> +
> +	while (true) {
> +		frozen = nvme_frozen(&dev->ctrl);
> +		if (frozen)
> +			break;

... and how about adding a comment that the check below is because of the nvme
timeout handler, as explained in another email (if a v2 is sent), so that it is
not necessary to chase "online_queues" with cscope :)

> +		if (!dev->online_queues)
> +			break;
> +		msleep(5);
> +	}
> +
> +	return frozen;
> +}
> +

Thank you very much!

Dongli Zhang

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] nvme-pci: make nvme reset more reliable
  2020-05-26  5:01   ` Dongli Zhang
@ 2020-05-26  7:12     ` Ming Lei
  0 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2020-05-26  7:12 UTC (permalink / raw)
  To: Dongli Zhang
  Cc: Jens Axboe, linux-block, linux-nvme, Christoph Hellwig,
	Alan Adamson, Sagi Grimberg, Keith Busch, Max Gurtovoy

On Mon, May 25, 2020 at 10:01:18PM -0700, Dongli Zhang wrote:
> 
> 
> On 5/20/20 4:56 AM, Ming Lei wrote:
> > While waiting for in-flight IO to complete in the reset handler, a timeout
> 
> Does this refer to the window after nvme_start_queues() in nvme_reset_work(),
> that is, just after the queues are unquiesced again?

Right, nvme_start_queues() starts dispatching requests again, and
nvme_wait_freeze() waits for all of these in-flight IOs to complete.

> 
> If a v2 is needed in the future, how about mentioning the specific function so
> that it is much easier to track the issue?

Not sure it is needed, since it is quite straightforward.

> 
> > or controller failure may still happen; the controller is then deleted
> > and all in-flight IOs are failed. This is too drastic.
> > 
> > Improve the reset handling by replacing nvme_wait_freeze with a
> > query-and-check loop. If all ns queues are frozen, the controller has been
> > reset successfully; otherwise, check whether the controller has been disabled.
> > If so, break out of the current recovery and schedule a fresh reset.
> > 
> > This avoids failing IO and removing the controller unnecessarily.
> > 
> > Cc: Christoph Hellwig <hch@lst.de>
> > Cc: Sagi Grimberg <sagi@grimberg.me>
> > Cc: Keith Busch <kbusch@kernel.org>
> > Cc: Max Gurtovoy <maxg@mellanox.com>
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >  drivers/nvme/host/pci.c | 37 ++++++++++++++++++++++++++++++-------
> >  1 file changed, 30 insertions(+), 7 deletions(-)
> > 
> > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> > index ce0d1e79467a..b5aeed33a634 100644
> > --- a/drivers/nvme/host/pci.c
> > +++ b/drivers/nvme/host/pci.c
> > @@ -24,6 +24,7 @@
> >  #include <linux/io-64-nonatomic-lo-hi.h>
> >  #include <linux/sed-opal.h>
> >  #include <linux/pci-p2pdma.h>
> > +#include <linux/delay.h>
> >  
> >  #include "trace.h"
> >  #include "nvme.h"
> > @@ -1235,9 +1236,6 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
> >  	 * shutdown, so we return BLK_EH_DONE.
> >  	 */
> >  	switch (dev->ctrl.state) {
> > -	case NVME_CTRL_CONNECTING:
> > -		nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DELETING);
> > -		/* fall through */
> >  	case NVME_CTRL_DELETING:
> >  		dev_warn_ratelimited(dev->ctrl.device,
> >  			 "I/O %d QID %d timeout, disable controller\n",
> > @@ -2393,7 +2391,8 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
> >  		u32 csts = readl(dev->bar + NVME_REG_CSTS);
> >  
> >  		if (dev->ctrl.state == NVME_CTRL_LIVE ||
> > -		    dev->ctrl.state == NVME_CTRL_RESETTING) {
> > +		    dev->ctrl.state == NVME_CTRL_RESETTING ||
> > +		    dev->ctrl.state == NVME_CTRL_CONNECTING) {
> >  			freeze = true;
> >  			nvme_start_freeze(&dev->ctrl);
> >  		}
> > @@ -2504,12 +2503,29 @@ static void nvme_remove_dead_ctrl(struct nvme_dev *dev)
> >  		nvme_put_ctrl(&dev->ctrl);
> >  }
> >  
> > +static bool nvme_wait_freeze_and_check(struct nvme_dev *dev)
> > +{
> > +	bool frozen;
> > +
> > +	while (true) {
> > +		frozen = nvme_frozen(&dev->ctrl);
> > +		if (frozen)
> > +			break;
> 
> ... and how about adding a comment that the check below is because of the nvme
> timeout handler, as explained in another email (if a v2 is sent), so that it is
> not necessary to chase "online_queues" with cscope :)
> 
> > +		if (!dev->online_queues)
> > +			break;
> > +		msleep(5);

Fine.


Thanks,
Ming


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/3] blk-mq/nvme: improve nvme-pci reset handler
  2020-05-20 11:56 [PATCH 0/3] blk-mq/nvme: improve nvme-pci reset handler Ming Lei
                   ` (3 preceding siblings ...)
  2020-05-26  2:55 ` [PATCH 0/3] blk-mq/nvme: improve nvme-pci reset handler Ming Lei
@ 2020-05-27 18:09 ` Alan Adamson
  2020-05-27 18:52   ` Keith Busch
  2020-05-28  1:36   ` Ming Lei
  4 siblings, 2 replies; 14+ messages in thread
From: Alan Adamson @ 2020-05-27 18:09 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-block, linux-nvme, Christoph Hellwig

I tested this patch against a timeout test I've been working with and 
I'm getting a hang.

# cat block-err.sh
set -x
echo 100 > /sys/kernel/debug/fail_io_timeout/probability
echo 1000 > /sys/kernel/debug/fail_io_timeout/times
echo 1 > /sys/block/nvme0n1/io-timeout-fail
dd if=/dev/nvme0n1 of=/dev/null bs=512 count=1


# sh  block-err.sh
+ echo 100
+ echo 1000
+ echo 1
+ dd if=/dev/nvme0n1 of=/dev/null bs=512 count=1

**** Hang ****

# dmesg
.
.
.
[   79.403253] FAULT_INJECTION: forcing a failure.
                name fail_io_timeout, interval 1, probability 100, space 
0, times 1000
[   79.403255] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 5.7.0-rc7+ #1
[   79.403256] Hardware name: Oracle Corporation ORACLE SERVER 
X6-2/ASM,MOTHERBOARD,1U, BIOS 38050100 08/30/2016
[   79.403257] Call Trace:
[   79.403259]  <IRQ>
[   79.403267]  dump_stack+0x6d/0x9a
[   79.403270]  should_fail.cold.5+0x32/0x42
[   79.403273]  blk_should_fake_timeout+0x26/0x30
[   79.403275]  blk_mq_complete_request+0x1b/0x120
[   79.403280]  nvme_irq+0xd9/0x1f0 [nvme]
[   79.403287]  __handle_irq_event_percpu+0x44/0x190
[   79.403288]  handle_irq_event_percpu+0x32/0x80
[   79.403290]  handle_irq_event+0x3b/0x5a
[   79.403291]  handle_edge_irq+0x87/0x190
[   79.403296]  do_IRQ+0x54/0xe0
[   79.403299]  common_interrupt+0xf/0xf
[   79.403300]  </IRQ>
[   79.403305] RIP: 0010:cpuidle_enter_state+0xc1/0x400
[   79.403307] Code: ff e8 e3 41 93 ff 80 7d c7 00 74 17 9c 58 0f 1f 44 
00 00 f6 c4 02 0f 85 d2 02 00 00 31 ff e8 16 c3 99 ff fb 66 0f 1f 44 00 
00 <45> 85 e4 0f 88 3d 02 00 00 49 63 c4 48 8d 14 40 48 8d 0c c5 00 00
[   79.403308] RSP: 0018:ffffb97e8c54be40 EFLAGS: 00000246 ORIG_RAX: 
ffffffffffffffdd
[   79.403309] RAX: ffff9781bf76cc40 RBX: ffffd95e7f743200 RCX: 
000000000000001f
[   79.403310] RDX: 000000127ccd6e6c RSI: 0000000031573862 RDI: 
0000000000000000
[   79.403310] RBP: ffffb97e8c54be80 R08: 0000000000000002 R09: 
000000000002c4c0
[   79.403311] R10: 011b921e580bc454 R11: ffff9781bf76bb44 R12: 
0000000000000002
[   79.403311] R13: ffffffffbd14c120 R14: ffffffffbd14c208 R15: 
ffffffffbd14c1f0
[   79.403314]  cpuidle_enter+0x2e/0x40
[   79.403318]  call_cpuidle+0x23/0x40
[   79.403319]  do_idle+0x230/0x270
[   79.403320]  cpu_startup_entry+0x1d/0x20
[   79.403325]  start_secondary+0x170/0x1c0
[   79.403329]  secondary_startup_64+0xb6/0xc0
[  109.674334] nvme nvme0: I/O 754 QID 34 timeout, aborting
[  109.674395] nvme nvme0: Abort status: 0x0
[  139.879453] nvme nvme0: I/O 754 QID 34 timeout, reset controller
[  139.895263] FAULT_INJECTION: forcing a failure.
                name fail_io_timeout, interval 1, probability 100, space 
0, times 999
[  139.895265] CPU: 5 PID: 2470 Comm: kworker/5:1H Not tainted 5.7.0-rc7+ #1
[  139.895266] Hardware name: Oracle Corporation ORACLE SERVER 
X6-2/ASM,MOTHERBOARD,1U, BIOS 38050100 08/30/2016
[  139.895271] Workqueue: kblockd blk_mq_timeout_work
[  139.895272] Call Trace:
[  139.895279]  dump_stack+0x6d/0x9a
[  139.895281]  should_fail.cold.5+0x32/0x42
[  139.895282]  blk_should_fake_timeout+0x26/0x30
[  139.895283]  blk_mq_complete_request+0x1b/0x120
[  139.895292]  nvme_cancel_request+0x33/0x80 [nvme_core]
[  139.895296]  bt_tags_iter+0x48/0x50
[  139.895297]  blk_mq_tagset_busy_iter+0x1eb/0x270
[  139.895299]  ? nvme_try_sched_reset+0x40/0x40 [nvme_core]
[  139.895301]  ? nvme_try_sched_reset+0x40/0x40 [nvme_core]
[  139.895305]  nvme_dev_disable+0x2be/0x460 [nvme]
[  139.895307]  nvme_timeout.cold.80+0x9c/0x182 [nvme]
[  139.895311]  ? sched_clock+0x9/0x10
[  139.895315]  ? sched_clock_cpu+0x11/0xc0
[  139.895320]  ? __switch_to_asm+0x40/0x70
[  139.895321]  blk_mq_check_expired+0x192/0x1b0
[  139.895322]  bt_iter+0x52/0x60
[  139.895323]  blk_mq_queue_tag_busy_iter+0x1a0/0x2e0
[  139.895325]  ? __switch_to_asm+0x40/0x70
[  139.895326]  ? __blk_mq_requeue_request+0xf0/0xf0
[  139.895326]  ? __blk_mq_requeue_request+0xf0/0xf0
[  139.895329]  ? compat_start_thread+0x20/0x40
[  139.895330]  blk_mq_timeout_work+0x5a/0x130
[  139.895333]  process_one_work+0x1ab/0x380
[  139.895334]  worker_thread+0x37/0x3b0
[  139.895335]  kthread+0x120/0x140
[  139.895337]  ? create_worker+0x1b0/0x1b0
[  139.895337]  ? kthread_park+0x90/0x90
[  139.895339]  ret_from_fork+0x35/0x40
[  139.897859] nvme nvme0: Shutdown timeout set to 10 seconds
[  139.901186] nvme nvme0: 56/0/0 default/read/poll queues

On 5/20/20 4:56 AM, Ming Lei wrote:
> Hi,
>
> For nvme-pci, after the controller is recovered, in-flight IOs are waited
> on before updating the number of hw queues. If a new controller error happens
> during this period, the nvme-pci driver deletes the controller and fails the
> in-flight IOs. This is too drastic, and not friendly from the user's viewpoint.
>
> Add APIs for checking whether a queue is frozen, and replace nvme_wait_freeze
> in the nvme-pci reset handler with a check of whether all ns queues are frozen
> and whether the controller has been disabled. A fresh reset can then be
> scheduled to handle a new controller error that happens while waiting for
> in-flight IO to complete.
>
> So deleting the controller and failing IOs can be avoided in this situation.
>
> Without these patches, when IO timeout failure injection is run, the
> controller can be removed very quickly. With this patchset, no controller
> removal is observed, and the controller recovers to its normal state
> after IO timeout injection is stopped.
>
> Ming Lei (3):
>    blk-mq: add API of blk_mq_queue_frozen
>    nvme: add nvme_frozen
>    nvme-pci: make nvme reset more reliable
>
>   block/blk-mq.c           |  6 ++++++
>   drivers/nvme/host/core.c | 14 ++++++++++++++
>   drivers/nvme/host/nvme.h |  1 +
>   drivers/nvme/host/pci.c  | 37 ++++++++++++++++++++++++++++++-------
>   include/linux/blk-mq.h   |  1 +
>   5 files changed, 52 insertions(+), 7 deletions(-)
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/3] blk-mq/nvme: improve nvme-pci reset handler
  2020-05-27 18:09 ` Alan Adamson
@ 2020-05-27 18:52   ` Keith Busch
  2020-05-28  1:36   ` Ming Lei
  1 sibling, 0 replies; 14+ messages in thread
From: Keith Busch @ 2020-05-27 18:52 UTC (permalink / raw)
  To: Alan Adamson
  Cc: Ming Lei, Jens Axboe, linux-block, linux-nvme, Christoph Hellwig

On Wed, May 27, 2020 at 11:09:53AM -0700, Alan Adamson wrote:
> [  139.895265] CPU: 5 PID: 2470 Comm: kworker/5:1H Not tainted 5.7.0-rc7+ #1
> [  139.895266] Hardware name: Oracle Corporation ORACLE SERVER
> X6-2/ASM,MOTHERBOARD,1U, BIOS 38050100 08/30/2016
> [  139.895271] Workqueue: kblockd blk_mq_timeout_work
> [  139.895272] Call Trace:
> [  139.895279]  dump_stack+0x6d/0x9a
> [  139.895281]  should_fail.cold.5+0x32/0x42
> [  139.895282]  blk_should_fake_timeout+0x26/0x30
> [  139.895283]  blk_mq_complete_request+0x1b/0x120
> [  139.895292]  nvme_cancel_request+0x33/0x80 [nvme_core]
> [  139.895296]  bt_tags_iter+0x48/0x50
> [  139.895297]  blk_mq_tagset_busy_iter+0x1eb/0x270
> [  139.895299]  ? nvme_try_sched_reset+0x40/0x40 [nvme_core]
> [  139.895301]  ? nvme_try_sched_reset+0x40/0x40 [nvme_core]
> [  139.895305]  nvme_dev_disable+0x2be/0x460 [nvme]
> [  139.895307]  nvme_timeout.cold.80+0x9c/0x182 [nvme]
> [  139.895311]  ? sched_clock+0x9/0x10
> [  139.895315]  ? sched_clock_cpu+0x11/0xc0
> [  139.895320]  ? __switch_to_asm+0x40/0x70
> [  139.895321]  blk_mq_check_expired+0x192/0x1b0
> [  139.895322]  bt_iter+0x52/0x60
> [  139.895323]  blk_mq_queue_tag_busy_iter+0x1a0/0x2e0
> [  139.895325]  ? __switch_to_asm+0x40/0x70
> [  139.895326]  ? __blk_mq_requeue_request+0xf0/0xf0
> [  139.895326]  ? __blk_mq_requeue_request+0xf0/0xf0
> [  139.895329]  ? compat_start_thread+0x20/0x40
> [  139.895330]  blk_mq_timeout_work+0x5a/0x130
> [  139.895333]  process_one_work+0x1ab/0x380
> [  139.895334]  worker_thread+0x37/0x3b0
> [  139.895335]  kthread+0x120/0x140
> [  139.895337]  ? create_worker+0x1b0/0x1b0
> [  139.895337]  ? kthread_park+0x90/0x90
> [  139.895339]  ret_from_fork+0x35/0x40

The driver reclaimed all outstanding tags and returned them to the block
layer. This isn't faking a timeout anymore. The driver has done its
part to reclaim lost commands. This is faking a broken block layer
instead.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/3] blk-mq/nvme: improve nvme-pci reset handler
  2020-05-27 18:09 ` Alan Adamson
  2020-05-27 18:52   ` Keith Busch
@ 2020-05-28  1:36   ` Ming Lei
  1 sibling, 0 replies; 14+ messages in thread
From: Ming Lei @ 2020-05-28  1:36 UTC (permalink / raw)
  To: Alan Adamson; +Cc: Jens Axboe, linux-block, linux-nvme, Christoph Hellwig

Hi Alan,

On Wed, May 27, 2020 at 11:09:53AM -0700, Alan Adamson wrote:
> I tested this patch against a timeout test I've been working with and I'm
> getting a hang.
> 
> # cat block-err.sh
> set -x
> echo 100 > /sys/kernel/debug/fail_io_timeout/probability
> echo 1000 > /sys/kernel/debug/fail_io_timeout/times
> echo 1 > /sys/block/nvme0n1/io-timeout-fail
> dd if=/dev/nvme0n1 of=/dev/null bs=512 count=1
> 
> 
> # sh  block-err.sh
> + echo 100
> + echo 1000
> + echo 1
> + dd if=/dev/nvme0n1 of=/dev/null bs=512 count=1
> 
> **** Hang ****
> 
> # dmesg
> .
> .
> .
> [   79.403253] FAULT_INJECTION: forcing a failure.
>                name fail_io_timeout, interval 1, probability 100, space 0,
> times 1000
> [   79.403255] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 5.7.0-rc7+ #1
> [   79.403256] Hardware name: Oracle Corporation ORACLE SERVER
> X6-2/ASM,MOTHERBOARD,1U, BIOS 38050100 08/30/2016
> [   79.403257] Call Trace:
> [   79.403259]  <IRQ>
> [   79.403267]  dump_stack+0x6d/0x9a
> [   79.403270]  should_fail.cold.5+0x32/0x42
> [   79.403273]  blk_should_fake_timeout+0x26/0x30
> [   79.403275]  blk_mq_complete_request+0x1b/0x120
> [   79.403280]  nvme_irq+0xd9/0x1f0 [nvme]
> [   79.403287]  __handle_irq_event_percpu+0x44/0x190
> [   79.403288]  handle_irq_event_percpu+0x32/0x80
> [   79.403290]  handle_irq_event+0x3b/0x5a
> [   79.403291]  handle_edge_irq+0x87/0x190
> [   79.403296]  do_IRQ+0x54/0xe0
> [   79.403299]  common_interrupt+0xf/0xf
> [   79.403300]  </IRQ>
> [   79.403305] RIP: 0010:cpuidle_enter_state+0xc1/0x400
> [   79.403307] Code: ff e8 e3 41 93 ff 80 7d c7 00 74 17 9c 58 0f 1f 44 00
> 00 f6 c4 02 0f 85 d2 02 00 00 31 ff e8 16 c3 99 ff fb 66 0f 1f 44 00 00 <45>
> 85 e4 0f 88 3d 02 00 00 49 63 c4 48 8d 14 40 48 8d 0c c5 00 00
> [   79.403308] RSP: 0018:ffffb97e8c54be40 EFLAGS: 00000246 ORIG_RAX:
> ffffffffffffffdd
> [   79.403309] RAX: ffff9781bf76cc40 RBX: ffffd95e7f743200 RCX:
> 000000000000001f
> [   79.403310] RDX: 000000127ccd6e6c RSI: 0000000031573862 RDI:
> 0000000000000000
> [   79.403310] RBP: ffffb97e8c54be80 R08: 0000000000000002 R09:
> 000000000002c4c0
> [   79.403311] R10: 011b921e580bc454 R11: ffff9781bf76bb44 R12:
> 0000000000000002
> [   79.403311] R13: ffffffffbd14c120 R14: ffffffffbd14c208 R15:
> ffffffffbd14c1f0
> [   79.403314]  cpuidle_enter+0x2e/0x40
> [   79.403318]  call_cpuidle+0x23/0x40
> [   79.403319]  do_idle+0x230/0x270
> [   79.403320]  cpu_startup_entry+0x1d/0x20
> [   79.403325]  start_secondary+0x170/0x1c0
> [   79.403329]  secondary_startup_64+0xb6/0xc0
> [  109.674334] nvme nvme0: I/O 754 QID 34 timeout, aborting
> [  109.674395] nvme nvme0: Abort status: 0x0
> [  139.879453] nvme nvme0: I/O 754 QID 34 timeout, reset controller
> [  139.895263] FAULT_INJECTION: forcing a failure.
>                name fail_io_timeout, interval 1, probability 100, space 0,
> times 999
> [  139.895265] CPU: 5 PID: 2470 Comm: kworker/5:1H Not tainted 5.7.0-rc7+ #1
> [  139.895266] Hardware name: Oracle Corporation ORACLE SERVER
> X6-2/ASM,MOTHERBOARD,1U, BIOS 38050100 08/30/2016
> [  139.895271] Workqueue: kblockd blk_mq_timeout_work
> [  139.895272] Call Trace:
> [  139.895279]  dump_stack+0x6d/0x9a
> [  139.895281]  should_fail.cold.5+0x32/0x42
> [  139.895282]  blk_should_fake_timeout+0x26/0x30
> [  139.895283]  blk_mq_complete_request+0x1b/0x120
> [  139.895292]  nvme_cancel_request+0x33/0x80 [nvme_core]
> [  139.895296]  bt_tags_iter+0x48/0x50
> [  139.895297]  blk_mq_tagset_busy_iter+0x1eb/0x270
> [  139.895299]  ? nvme_try_sched_reset+0x40/0x40 [nvme_core]
> [  139.895301]  ? nvme_try_sched_reset+0x40/0x40 [nvme_core]
> [  139.895305]  nvme_dev_disable+0x2be/0x460 [nvme]
> [  139.895307]  nvme_timeout.cold.80+0x9c/0x182 [nvme]
> [  139.895311]  ? sched_clock+0x9/0x10
> [  139.895315]  ? sched_clock_cpu+0x11/0xc0
> [  139.895320]  ? __switch_to_asm+0x40/0x70
> [  139.895321]  blk_mq_check_expired+0x192/0x1b0
> [  139.895322]  bt_iter+0x52/0x60
> [  139.895323]  blk_mq_queue_tag_busy_iter+0x1a0/0x2e0
> [  139.895325]  ? __switch_to_asm+0x40/0x70
> [  139.895326]  ? __blk_mq_requeue_request+0xf0/0xf0
> [  139.895326]  ? __blk_mq_requeue_request+0xf0/0xf0
> [  139.895329]  ? compat_start_thread+0x20/0x40
> [  139.895330]  blk_mq_timeout_work+0x5a/0x130
> [  139.895333]  process_one_work+0x1ab/0x380
> [  139.895334]  worker_thread+0x37/0x3b0
> [  139.895335]  kthread+0x120/0x140
> [  139.895337]  ? create_worker+0x1b0/0x1b0
> [  139.895337]  ? kthread_park+0x90/0x90
> [  139.895339]  ret_from_fork+0x35/0x40
> [  139.897859] nvme nvme0: Shutdown timeout set to 10 seconds
> [  139.901186] nvme nvme0: 56/0/0 default/read/poll queues

The above just shows the stack trace of the injected (fake) timeout failure;
I don't see any hang stack trace.

The reason is that you set a 100% failure rate, so no request can make
progress. And each request is tried 100 times, which may take too long
to complete given that the default timeout is 30sec.

 echo 100 > /sys/kernel/debug/fail_io_timeout/probability
 echo 1000 > /sys/kernel/debug/fail_io_timeout/times

If you stop the injection via the command below after some time, such as 5min
after starting the script, you will see the test complete with this patchset.

	echo 0 > /sys/block/nvme0n1/io-timeout-fail

Without these patches, the controller can be removed very quickly.

Thanks,
Ming

> 
> On 5/20/20 4:56 AM, Ming Lei wrote:
> > Hi,
> > 
> > For nvme-pci, after the controller is recovered, in-flight IOs are waited
> > on before updating the number of hw queues. If a new controller error happens
> > during this period, the nvme-pci driver deletes the controller and fails the
> > in-flight IOs. This is too drastic, and not friendly from the user's viewpoint.
> > 
> > Add APIs for checking whether a queue is frozen, and replace nvme_wait_freeze
> > in the nvme-pci reset handler with a check of whether all ns queues are frozen
> > and whether the controller has been disabled. A fresh reset can then be
> > scheduled to handle a new controller error that happens while waiting for
> > in-flight IO to complete.
> > 
> > So deleting the controller and failing IOs can be avoided in this situation.
> > 
> > Without these patches, when IO timeout failure injection is run, the
> > controller can be removed very quickly. With this patchset, no controller
> > removal is observed, and the controller recovers to its normal state
> > after IO timeout injection is stopped.
> > 
> > Ming Lei (3):
> >    blk-mq: add API of blk_mq_queue_frozen
> >    nvme: add nvme_frozen
> >    nvme-pci: make nvme reset more reliable
> > 
> >   block/blk-mq.c           |  6 ++++++
> >   drivers/nvme/host/core.c | 14 ++++++++++++++
> >   drivers/nvme/host/nvme.h |  1 +
> >   drivers/nvme/host/pci.c  | 37 ++++++++++++++++++++++++++++++-------
> >   include/linux/blk-mq.h   |  1 +
> >   5 files changed, 52 insertions(+), 7 deletions(-)
> > 
> 
> _______________________________________________
> linux-nvme mailing list
> linux-nvme@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-nvme

-- 
Ming


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2020-05-28  1:36 UTC | newest]

Thread overview: 14+ messages
2020-05-20 11:56 [PATCH 0/3] blk-mq/nvme: improve nvme-pci reset handler Ming Lei
2020-05-20 11:56 ` [PATCH 1/3] blk-mq: add API of blk_mq_queue_frozen Ming Lei
2020-05-20 11:56 ` [PATCH 2/3] nvme: add nvme_frozen Ming Lei
2020-05-20 11:56 ` [PATCH 3/3] nvme-pci: make nvme reset more reliable Ming Lei
2020-05-20 17:10   ` Dongli Zhang
2020-05-20 17:27     ` Dongli Zhang
2020-05-20 17:52     ` Keith Busch
2020-05-21  2:33     ` Ming Lei
2020-05-26  5:01   ` Dongli Zhang
2020-05-26  7:12     ` Ming Lei
2020-05-26  2:55 ` [PATCH 0/3] blk-mq/nvme: improve nvme-pci reset handler Ming Lei
2020-05-27 18:09 ` Alan Adamson
2020-05-27 18:52   ` Keith Busch
2020-05-28  1:36   ` Ming Lei
