From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org, linux-nvme@lists.infradead.org,
	Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>, Sagi Grimberg <sagi@grimberg.me>,
	Daniel Wagner <dwagner@suse.de>, Wen Xiong <wenxiong@us.ibm.com>,
	John Garry <john.garry@huawei.com>
Subject: [PATCH 0/2] blk-mq: fix blk_mq_alloc_request_hctx
Date: Tue, 29 Jun 2021 15:49:49 +0800
Message-ID: <20210629074951.1981284-1-ming.lei@redhat.com>

Hi,

blk_mq_alloc_request_hctx() is used by NVMe fc/rdma/tcp/loop to connect
io queues. The sw ctx is chosen as the 1st online CPU in hctx->cpumask;
however, all CPUs in hctx->cpumask may be offline.
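
For reference, the ctx selection in blk_mq_alloc_request_hctx() currently
looks roughly like this (a simplified sketch, not the verbatim blk-mq code):

    cpu = cpumask_first_and(data.hctx->cpumask, cpu_online_mask);
    /*
     * If every CPU in hctx->cpumask is offline, cpumask_first_and()
     * returns nr_cpu_ids and the ctx lookup below runs off the end of
     * the per-cpu ctx array.
     */
    data.ctx = __blk_mq_get_ctx(q, cpu);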

This usage model isn't well supported by blk-mq, which assumes that the
allocation is always done on an online CPU in hctx->cpumask. This
assumption is tied to managed irqs, which also require blk-mq to drain
the in-flight requests in the hctx when the last CPU in hctx->cpumask
is going offline.
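
The draining is driven from a CPU hotplug teardown callback; in rough terms
(a simplified sketch of the existing behaviour, not the verbatim code):

    static int blk_mq_hctx_notify_offline(unsigned int cpu,
                                          struct hlist_node *node)
    {
        struct blk_mq_hw_ctx *hctx =
            hlist_entry_safe(node, struct blk_mq_hw_ctx, cpuhp_online);
        unsigned int other;

        if (!cpumask_test_cpu(cpu, hctx->cpumask))
            return 0;

        /* only act when no other CPU of this hctx is still online */
        for_each_cpu_and(other, hctx->cpumask, cpu_online_mask)
            if (other != cpu)
                return 0;

        /* stop new allocations on this hctx ... */
        set_bit(BLK_MQ_S_INACTIVE, &hctx->state);
        /* ... then wait for in-flight requests to complete (omitted) */
        return 0;
    }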

However, NVMe fc/rdma/tcp/loop don't use managed irqs, so we should allow
them to ask for request allocation even when the specified hctx is
inactive (all CPUs in hctx->cpumask are offline).

Fix blk_mq_alloc_request_hctx() by adding and passing the new flag
BLK_MQ_F_NOT_USE_MANAGED_IRQ.
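
In sketch form the change looks like this (the flag value is illustrative;
the real changes are in the two patches below):

    /* include/linux/blk-mq.h */
    BLK_MQ_F_NOT_USE_MANAGED_IRQ    = 1 << 7,   /* value illustrative */

    /* block/blk-mq.c, in the hctx offline/deactivate path */
    if (hctx->flags & BLK_MQ_F_NOT_USE_MANAGED_IRQ)
        return 0;   /* no managed irq, so no need to deactivate and drain */

    /* drivers/nvme/host/{fc,rdma,tcp}.c and drivers/nvme/target/loop.c */
    set->flags |= BLK_MQ_F_NOT_USE_MANAGED_IRQ;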


Ming Lei (2):
  blk-mq: not deactivate hctx if the device doesn't use managed irq
  nvme: pass BLK_MQ_F_NOT_USE_MANAGED_IRQ for fc/rdma/tcp/loop

 block/blk-mq.c             | 6 +++++-
 drivers/nvme/host/fc.c     | 3 ++-
 drivers/nvme/host/rdma.c   | 3 ++-
 drivers/nvme/host/tcp.c    | 3 ++-
 drivers/nvme/target/loop.c | 3 ++-
 include/linux/blk-mq.h     | 1 +
 6 files changed, 14 insertions(+), 5 deletions(-)

Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Daniel Wagner <dwagner@suse.de>
Cc: Wen Xiong <wenxiong@us.ibm.com>
Cc: John Garry <john.garry@huawei.com>


-- 
2.31.1



Thread overview:
2021-06-29  7:49 [PATCH 0/2] blk-mq: fix blk_mq_alloc_request_hctx Ming Lei [this message]
2021-06-29  7:49 ` [PATCH 1/2] blk-mq: not deactivate hctx if the device doesn't use managed irq Ming Lei
2021-06-29 12:39   ` Hannes Reinecke
2021-06-29 14:17     ` Ming Lei
2021-06-29 15:49   ` John Garry
2021-06-30  0:32     ` Ming Lei
2021-06-30  9:25       ` John Garry
2021-07-01  9:52       ` Christoph Hellwig
2021-06-29 23:30   ` Damien Le Moal
2021-06-30 18:58     ` Sagi Grimberg
2021-06-30 21:57       ` Damien Le Moal
2021-07-01 14:20         ` Keith Busch
2021-06-29  7:49 ` [PATCH 2/2] nvme: pass BLK_MQ_F_NOT_USE_MANAGED_IRQ for fc/rdma/tcp/loop Ming Lei
2021-06-30  8:15   ` Hannes Reinecke
2021-06-30  8:47     ` Ming Lei
2021-06-30  8:18 ` [PATCH 0/2] blk-mq: fix blk_mq_alloc_request_hctx Hannes Reinecke
2021-06-30  8:42   ` Ming Lei
2021-06-30  9:43     ` Hannes Reinecke
2021-06-30  9:53       ` Ming Lei
2021-06-30 18:59         ` Sagi Grimberg
2021-06-30 19:46           ` Hannes Reinecke
2021-06-30 23:59             ` Ming Lei
2021-07-01  8:00               ` Hannes Reinecke
2021-07-01  9:13                 ` Ming Lei
2021-07-02  9:47             ` Daniel Wagner
