* [PATCH 0/3] fix interrupt swamp in NVMe
@ 2019-08-20  6:14 longli
  2019-08-20  6:14 ` [PATCH 1/3] sched: define a function to report the number of context switches on a CPU longli
                   ` (3 more replies)
  0 siblings, 4 replies; 30+ messages in thread
From: longli @ 2019-08-20  6:14 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Keith Busch, Jens Axboe,
	Christoph Hellwig, Sagi Grimberg, linux-nvme, linux-kernel
  Cc: Long Li

From: Long Li <longli@microsoft.com>

This patch set tries to fix interrupt swamp in NVMe devices.

On large systems with many CPUs, a number of CPUs may share one NVMe hardware
queue. A situation can arise where several CPUs are issuing I/Os and all the
I/Os are returned on the CPU to which the hardware queue is bound. That CPU
can be swamped by interrupts and stay in interrupt mode for an extended time
while the other CPUs continue to issue I/O. This can trigger the watchdog and
RCU stall timeouts, and make the system unresponsive.

This patch set addresses the problem by enforcing scheduling and throttling
I/O when a CPU is starved in this situation.

Long Li (3):
  sched: define a function to report the number of context switches on a
    CPU
  sched: export idle_cpu()
  nvme: complete request in work queue on CPU with flooded interrupts

 drivers/nvme/host/core.c | 57 +++++++++++++++++++++++++++++++++++++++-
 drivers/nvme/host/nvme.h |  1 +
 include/linux/sched.h    |  2 ++
 kernel/sched/core.c      |  7 +++++
 4 files changed, 66 insertions(+), 1 deletion(-)

-- 
2.17.1



* [PATCH 1/3] sched: define a function to report the number of context switches on a CPU
  2019-08-20  6:14 [PATCH 0/3] fix interrupt swamp in NVMe longli
@ 2019-08-20  6:14 ` longli
  2019-08-20  9:38   ` Peter Zijlstra
  2019-08-20  9:39   ` Peter Zijlstra
  2019-08-20  6:14 ` [PATCH 2/3] sched: export idle_cpu() longli
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 30+ messages in thread
From: longli @ 2019-08-20  6:14 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Keith Busch, Jens Axboe,
	Christoph Hellwig, Sagi Grimberg, linux-nvme, linux-kernel
  Cc: Long Li

From: Long Li <longli@microsoft.com>

The number of context switches on a CPU is useful for determining how busy
that CPU is processing IRQs. Export this information so it can be used by
device drivers.

Signed-off-by: Long Li <longli@microsoft.com>
---
 include/linux/sched.h | 1 +
 kernel/sched/core.c   | 6 ++++++
 2 files changed, 7 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 9b35aff09f70..575f1ef7b159 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1500,6 +1500,7 @@ current_restore_flags(unsigned long orig_flags, unsigned long flags)
 
 extern int cpuset_cpumask_can_shrink(const struct cpumask *cur, const struct cpumask *trial);
 extern int task_can_attach(struct task_struct *p, const struct cpumask *cs_cpus_allowed);
+extern u64 get_cpu_rq_switches(int cpu);
 #ifdef CONFIG_SMP
 extern void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask);
 extern int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4a8e7207cafa..1a76f0e97c2d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1143,6 +1143,12 @@ int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask)
 }
 EXPORT_SYMBOL_GPL(set_cpus_allowed_ptr);
 
+u64 get_cpu_rq_switches(int cpu)
+{
+	return cpu_rq(cpu)->nr_switches;
+}
+EXPORT_SYMBOL_GPL(get_cpu_rq_switches);
+
 void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
 {
 #ifdef CONFIG_SCHED_DEBUG
-- 
2.17.1



* [PATCH 2/3] sched: export idle_cpu()
  2019-08-20  6:14 [PATCH 0/3] fix interrupt swamp in NVMe longli
  2019-08-20  6:14 ` [PATCH 1/3] sched: define a function to report the number of context switches on a CPU longli
@ 2019-08-20  6:14 ` longli
  2019-08-20  6:14 ` [PATCH 3/3] nvme: complete request in work queue on CPU with flooded interrupts longli
  2019-08-20  8:25 ` [PATCH 0/3] fix interrupt swamp in NVMe Ming Lei
  3 siblings, 0 replies; 30+ messages in thread
From: longli @ 2019-08-20  6:14 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Keith Busch, Jens Axboe,
	Christoph Hellwig, Sagi Grimberg, linux-nvme, linux-kernel
  Cc: Long Li

From: Long Li <longli@microsoft.com>

This function is useful for device drivers to check if this CPU has work to
do in process context.

Signed-off-by: Long Li <longli@microsoft.com>
---
 include/linux/sched.h | 1 +
 kernel/sched/core.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 575f1ef7b159..a209941c1770 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1501,6 +1501,7 @@ current_restore_flags(unsigned long orig_flags, unsigned long flags)
 extern int cpuset_cpumask_can_shrink(const struct cpumask *cur, const struct cpumask *trial);
 extern int task_can_attach(struct task_struct *p, const struct cpumask *cs_cpus_allowed);
 extern u64 get_cpu_rq_switches(int cpu);
+extern int idle_cpu(int cpu);
 #ifdef CONFIG_SMP
 extern void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask);
 extern int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1a76f0e97c2d..d1cedfb38174 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4023,6 +4023,7 @@ int idle_cpu(int cpu)
 
 	return 1;
 }
+EXPORT_SYMBOL_GPL(idle_cpu);
 
 /**
  * available_idle_cpu - is a given CPU idle for enqueuing work.
-- 
2.17.1



* [PATCH 3/3] nvme: complete request in work queue on CPU with flooded interrupts
  2019-08-20  6:14 [PATCH 0/3] fix interrupt swamp in NVMe longli
  2019-08-20  6:14 ` [PATCH 1/3] sched: define a function to report the number of context switches on a CPU longli
  2019-08-20  6:14 ` [PATCH 2/3] sched: export idle_cpu() longli
@ 2019-08-20  6:14 ` longli
  2019-08-20  9:52   ` Peter Zijlstra
  2019-08-20 17:33   ` Sagi Grimberg
  2019-08-20  8:25 ` [PATCH 0/3] fix interrupt swamp in NVMe Ming Lei
  3 siblings, 2 replies; 30+ messages in thread
From: longli @ 2019-08-20  6:14 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Keith Busch, Jens Axboe,
	Christoph Hellwig, Sagi Grimberg, linux-nvme, linux-kernel
  Cc: Long Li

From: Long Li <longli@microsoft.com>

When an NVMe hardware queue is mapped to several CPU queues, it is possible
that the CPU this hardware queue is bound to is flooded by I/O returned for
other CPUs.

For example, consider the following scenario:
1. CPU 0, 1, 2 and 3 share the same hardware queue
2. the hardware queue interrupts CPU 0 for I/O response
3. processes from CPU 1, 2 and 3 keep sending I/Os

CPU 0 may be flooded with interrupts from the NVMe device that are I/O
responses for CPUs 1, 2 and 3. Under heavy I/O load, it is possible that CPU 0
spends all its time serving NVMe and other system interrupts and never gets a
chance to run in process context.

To fix this, CPU 0 can schedule a work item to complete the I/O request when
it detects that the scheduler is not making progress. This serves multiple
purposes:

1. This CPU has to be scheduled to complete the request. The other CPUs can't
issue more I/Os until some previous I/Os are completed. This helps this CPU
get out of NVMe interrupts.

2. This acts as a throttling mechanism for NVMe devices, in that they cannot
starve a CPU while it services I/Os from other CPUs.

3. This CPU can make progress on RCU and other work items on its queue.

Signed-off-by: Long Li <longli@microsoft.com>
---
 drivers/nvme/host/core.c | 57 +++++++++++++++++++++++++++++++++++++++-
 drivers/nvme/host/nvme.h |  1 +
 2 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 6a9dd68c0f4f..576bb6fce293 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -28,6 +28,7 @@
 #include <linux/t10-pi.h>
 #include <linux/pm_qos.h>
 #include <asm/unaligned.h>
+#include <linux/sched.h>
 
 #define CREATE_TRACE_POINTS
 #include "trace.h"
@@ -97,6 +98,15 @@ static dev_t nvme_chr_devt;
 static struct class *nvme_class;
 static struct class *nvme_subsys_class;
 
+/*
+ * The following are for detecting if this CPU is flooded with interrupts.
+ * The time of the last context switch is recorded. If that was at least
+ * MAX_SCHED_TIMEOUT ago, try to recover from the interrupt flood.
+ */
+static DEFINE_PER_CPU(u64, last_switch);
+static DEFINE_PER_CPU(u64, last_clock);
+#define MAX_SCHED_TIMEOUT 2000000000	// 2 seconds in ns
+
 static int nvme_revalidate_disk(struct gendisk *disk);
 static void nvme_put_subsystem(struct nvme_subsystem *subsys);
 static void nvme_remove_invalid_namespaces(struct nvme_ctrl *ctrl,
@@ -260,9 +270,54 @@ static void nvme_retry_req(struct request *req)
 	blk_mq_delay_kick_requeue_list(req->q, delay);
 }
 
+static void nvme_complete_rq_work(struct work_struct *work)
+{
+	struct nvme_request *nvme_rq =
+		container_of(work, struct nvme_request, work);
+	struct request *req = blk_mq_rq_from_pdu(nvme_rq);
+
+	nvme_complete_rq(req);
+}
+
+
 void nvme_complete_rq(struct request *req)
 {
-	blk_status_t status = nvme_error_status(req);
+	blk_status_t status;
+	int cpu;
+	u64 switches;
+	struct nvme_request *nvme_rq;
+
+	if (!in_interrupt())
+		goto skip_check;
+
+	nvme_rq = nvme_req(req);
+	cpu = smp_processor_id();
+	if (idle_cpu(cpu))
+		goto skip_check;
+
+	/* Check if this CPU is flooded with interrupts */
+	switches = get_cpu_rq_switches(cpu);
+	if (this_cpu_read(last_switch) == switches) {
+		/*
+		 * If this CPU hasn't made a context switch in
+		 * MAX_SCHED_TIMEOUT ns (and it's not idle), schedule a work to
+		 * complete this I/O. This forces this CPU to run non-interrupt
+		 * code and throttles the other CPUs issuing the I/O.
+		 */
+		if (sched_clock() - this_cpu_read(last_clock)
+				> MAX_SCHED_TIMEOUT) {
+			INIT_WORK(&nvme_rq->work, nvme_complete_rq_work);
+			schedule_work_on(cpu, &nvme_rq->work);
+			return;
+		}
+
+	} else {
+		this_cpu_write(last_switch, switches);
+		this_cpu_write(last_clock, sched_clock());
+	}
+
+skip_check:
+	status = nvme_error_status(req);
 
 	trace_nvme_complete_rq(req);
 
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 0a4a7f5e0de7..a8876e69e476 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -113,6 +113,7 @@ struct nvme_request {
 	u8			flags;
 	u16			status;
 	struct nvme_ctrl	*ctrl;
+	struct work_struct	work;
 };
 
 /*
-- 
2.17.1



* Re: [PATCH 0/3] fix interrupt swamp in NVMe
  2019-08-20  6:14 [PATCH 0/3] fix interrupt swamp in NVMe longli
                   ` (2 preceding siblings ...)
  2019-08-20  6:14 ` [PATCH 3/3] nvme: complete request in work queue on CPU with flooded interrupts longli
@ 2019-08-20  8:25 ` Ming Lei
  2019-08-20  8:59   ` John Garry
  3 siblings, 1 reply; 30+ messages in thread
From: Ming Lei @ 2019-08-20  8:25 UTC (permalink / raw)
  To: longli
  Cc: Ingo Molnar, Peter Zijlstra, Keith Busch, Jens Axboe,
	Christoph Hellwig, Sagi Grimberg, linux-nvme,
	Linux Kernel Mailing List, Long Li, John Garry, Thomas Gleixner

On Tue, Aug 20, 2019 at 2:14 PM <longli@linuxonhyperv.com> wrote:
>
> From: Long Li <longli@microsoft.com>
>
> This patch set tries to fix interrupt swamp in NVMe devices.
>
> On large systems with many CPUs, a number of CPUs may share one NVMe hardware
> queue. It may have this situation where several CPUs are issuing I/Os, and
> all the I/Os are returned on the CPU where the hardware queue is bound to.
> This may result in that CPU swamped by interrupts and stay in interrupt mode
> for extended time while other CPUs continue to issue I/O. This can trigger
> Watchdog and RCU timeout, and make the system unresponsive.
>
> This patch set addresses this by enforcing scheduling and throttling I/O when
> CPU is starved in this situation.
>
> Long Li (3):
>   sched: define a function to report the number of context switches on a
>     CPU
>   sched: export idle_cpu()
>   nvme: complete request in work queue on CPU with flooded interrupts
>
>  drivers/nvme/host/core.c | 57 +++++++++++++++++++++++++++++++++++++++-
>  drivers/nvme/host/nvme.h |  1 +
>  include/linux/sched.h    |  2 ++
>  kernel/sched/core.c      |  7 +++++
>  4 files changed, 66 insertions(+), 1 deletion(-)

Another, simpler solution may be to complete the request in a threaded
interrupt handler for this case. Meanwhile, allow the scheduler to run the
interrupt thread handler on the CPUs specified by the irq affinity mask, which
was discussed in the following link:

https://lore.kernel.org/lkml/e0e9478e-62a5-ca24-3b12-58f7d056383e@huawei.com/

Could you try the above solution and see if the lockup can be avoided?
John Garry should have a workable patch.

Thanks,
Ming Lei
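
For reference, a minimal sketch of the threaded-handler approach described
above, assuming the existing nvme_irq() path were split into a hard handler
that only checks for pending CQEs and a thread function that reaps them. The
function names and the registration helper below are illustrative only, not
the actual pci.c code:

/* Hard handler: runs in hard-irq context and only defers the work. */
static irqreturn_t nvme_irq_check(int irq, void *data)
{
	struct nvme_queue *nvmeq = data;

	if (nvme_cqe_pending(nvmeq))
		return IRQ_WAKE_THREAD;		/* wake the irq thread */
	return IRQ_NONE;
}

/*
 * Thread function: runs in process context, so it can be preempted and
 * (with the genirq change discussed below) migrated within the irq's
 * affinity mask instead of monopolizing one CPU.
 */
static irqreturn_t nvme_irq_thread(int irq, void *data)
{
	struct nvme_queue *nvmeq = data;
	u16 start, end;

	nvme_process_cq(nvmeq, &start, &end, -1);
	nvme_complete_cqes(nvmeq, start, end);
	return IRQ_HANDLED;
}

/* Registration from queue setup: supplying thread_fn moves completion
 * processing out of hard-irq context. */
static int nvme_setup_threaded_irq(struct pci_dev *pdev, struct nvme_queue *nvmeq)
{
	return request_threaded_irq(pci_irq_vector(pdev, nvmeq->cq_vector),
				    nvme_irq_check, nvme_irq_thread,
				    IRQF_SHARED, "nvme", nvmeq);
}

The in-tree driver already exposes a use_threaded_interrupts module parameter
that takes a similar path via pci_request_irq(), which is presumably how the
threaded-interrupt numbers later in this thread were gathered.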


* Re: [PATCH 0/3] fix interrupt swamp in NVMe
  2019-08-20  8:25 ` [PATCH 0/3] fix interrupt swamp in NVMe Ming Lei
@ 2019-08-20  8:59   ` John Garry
  2019-08-20 15:05     ` Keith Busch
  2019-08-21  7:47     ` Long Li
  0 siblings, 2 replies; 30+ messages in thread
From: John Garry @ 2019-08-20  8:59 UTC (permalink / raw)
  To: Ming Lei, longli
  Cc: Ingo Molnar, Peter Zijlstra, Keith Busch, Jens Axboe,
	Christoph Hellwig, Sagi Grimberg, linux-nvme,
	Linux Kernel Mailing List, Long Li, Thomas Gleixner, chenxiang

On 20/08/2019 09:25, Ming Lei wrote:
> On Tue, Aug 20, 2019 at 2:14 PM <longli@linuxonhyperv.com> wrote:
>>
>> From: Long Li <longli@microsoft.com>
>>
>> This patch set tries to fix interrupt swamp in NVMe devices.
>>
>> On large systems with many CPUs, a number of CPUs may share one NVMe hardware
>> queue. It may have this situation where several CPUs are issuing I/Os, and
>> all the I/Os are returned on the CPU where the hardware queue is bound to.
>> This may result in that CPU swamped by interrupts and stay in interrupt mode
>> for extended time while other CPUs continue to issue I/O. This can trigger
>> Watchdog and RCU timeout, and make the system unresponsive.
>>
>> This patch set addresses this by enforcing scheduling and throttling I/O when
>> CPU is starved in this situation.
>>
>> Long Li (3):
>>   sched: define a function to report the number of context switches on a
>>     CPU
>>   sched: export idle_cpu()
>>   nvme: complete request in work queue on CPU with flooded interrupts
>>
>>  drivers/nvme/host/core.c | 57 +++++++++++++++++++++++++++++++++++++++-
>>  drivers/nvme/host/nvme.h |  1 +
>>  include/linux/sched.h    |  2 ++
>>  kernel/sched/core.c      |  7 +++++
>>  4 files changed, 66 insertions(+), 1 deletion(-)
>
> Another simpler solution may be to complete request in threaded interrupt
> handler for this case. Meantime allow scheduler to run the interrupt thread
> handler on CPUs specified by the irq affinity mask, which was discussed by
> the following link:
>
> https://lore.kernel.org/lkml/e0e9478e-62a5-ca24-3b12-58f7d056383e@huawei.com/
>
> Could you try the above solution and see if the lockup can be avoided?
> John Garry
> should have workable patch.

Yeah, so we experimented with changing the interrupt handling in the
SCSI driver I maintain to use a threaded IRQ handler plus the patch
below, and saw a significant throughput boost:

--->8

Subject: [PATCH] genirq: Add support to allow thread to use hard irq 
affinity

Currently the cpu allowed mask for the threaded part of a threaded irq
handler will be set to the effective affinity of the hard irq.

Typically the effective affinity of the hard irq will be for a single
cpu. As such, the threaded handler would always run on the same cpu as
the hard irq.

We have seen scenarios in high data-rate throughput testing where the
cpu handling the interrupt can be totally saturated handling both the
hard interrupt and threaded handler parts, limiting throughput.

Add IRQF_IRQ_AFFINITY flag to allow the driver requesting the threaded
interrupt to decide on the policy of which cpu the threaded handler
may run.

Signed-off-by: John Garry <john.garry@huawei.com>

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 5b8328a99b2a..48e8b955989a 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -61,6 +61,9 @@
   *                interrupt handler after suspending interrupts. For system
   *                wakeup devices users need to implement wakeup detection in
   *                their interrupt handlers.
+ * IRQF_IRQ_AFFINITY - Use the hard interrupt affinity for setting the cpu
+ *                allowed mask for the threaded handler of a threaded interrupt
+ *                handler, rather than the effective hard irq affinity.
   */
  #define IRQF_SHARED		0x00000080
  #define IRQF_PROBE_SHARED	0x00000100
@@ -74,6 +77,7 @@
  #define IRQF_NO_THREAD		0x00010000
  #define IRQF_EARLY_RESUME	0x00020000
  #define IRQF_COND_SUSPEND	0x00040000
+#define IRQF_IRQ_AFFINITY	0x00080000

  #define IRQF_TIMER		(__IRQF_TIMER | IRQF_NO_SUSPEND | IRQF_NO_THREAD)

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index e8f7f179bf77..cb483a055512 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -966,9 +966,13 @@ irq_thread_check_affinity(struct irq_desc *desc, struct irqaction *action)
  	 * mask pointer. For CPU_MASK_OFFSTACK=n this is optimized out.
  	 */
  	if (cpumask_available(desc->irq_common_data.affinity)) {
+		struct irq_data *irq_data = &desc->irq_data;
  		const struct cpumask *m;

-		m = irq_data_get_effective_affinity_mask(&desc->irq_data);
+		if (action->flags & IRQF_IRQ_AFFINITY)
+			m = desc->irq_common_data.affinity;
+		else
+			m = irq_data_get_effective_affinity_mask(irq_data);
  		cpumask_copy(mask, m);
  	} else {
  		valid = false;
-- 
2.17.1

As Ming mentioned in that same thread, we could even make this policy 
for managed interrupts.

Cheers,
John

>
> Thanks,
> Ming Lei
>
> .
>
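
As a usage illustration only (the device, handler and function names here are
placeholders, and the flag exists only with the patch above applied), a driver
opting in would pass IRQF_IRQ_AFFINITY when requesting its threaded interrupt:

/* Hypothetical per-driver opt-in: with IRQF_IRQ_AFFINITY set, the threaded
 * handler may be scheduled on any CPU in the hard irq's affinity mask,
 * rather than being pinned to its effective (usually single-CPU) affinity. */
static int my_setup_irq(struct my_dev *mdev, int irq)
{
	return request_threaded_irq(irq, my_hard_handler, my_thread_fn,
				    IRQF_SHARED | IRQF_IRQ_AFFINITY,
				    "my-dev", mdev);
}

Long Li's reply further down takes the broader route of setting the flag for
all PCI devices in pci_request_irq() rather than a per-driver opt-in.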




* Re: [PATCH 1/3] sched: define a function to report the number of context switches on a CPU
  2019-08-20  6:14 ` [PATCH 1/3] sched: define a function to report the number of context switches on a CPU longli
@ 2019-08-20  9:38   ` Peter Zijlstra
  2019-08-21  8:20     ` Long Li
  2019-08-20  9:39   ` Peter Zijlstra
  1 sibling, 1 reply; 30+ messages in thread
From: Peter Zijlstra @ 2019-08-20  9:38 UTC (permalink / raw)
  To: longli
  Cc: Ingo Molnar, Keith Busch, Jens Axboe, Christoph Hellwig,
	Sagi Grimberg, linux-nvme, linux-kernel, Long Li

On Mon, Aug 19, 2019 at 11:14:27PM -0700, longli@linuxonhyperv.com wrote:
> From: Long Li <longli@microsoft.com>
> 
> The number of context switches on a CPU is useful to determine how busy this
> CPU is on processing IRQs. Export this information so it can be used by device
> drivers.

Please do explain that; because I'm not seeing how number of switches
relates to processing IRQs _at_all_!


* Re: [PATCH 1/3] sched: define a function to report the number of context switches on a CPU
  2019-08-20  6:14 ` [PATCH 1/3] sched: define a function to report the number of context switches on a CPU longli
  2019-08-20  9:38   ` Peter Zijlstra
@ 2019-08-20  9:39   ` Peter Zijlstra
  1 sibling, 0 replies; 30+ messages in thread
From: Peter Zijlstra @ 2019-08-20  9:39 UTC (permalink / raw)
  To: longli
  Cc: Ingo Molnar, Keith Busch, Jens Axboe, Christoph Hellwig,
	Sagi Grimberg, linux-nvme, linux-kernel, Long Li

On Mon, Aug 19, 2019 at 11:14:27PM -0700, longli@linuxonhyperv.com wrote:

> +u64 get_cpu_rq_switches(int cpu)
> +{
> +	return cpu_rq(cpu)->nr_switches;
> +}
> +EXPORT_SYMBOL_GPL(get_cpu_rq_switches);

Also, that is broken on 32bit.


* Re: [PATCH 3/3] nvme: complete request in work queue on CPU with flooded interrupts
  2019-08-20  6:14 ` [PATCH 3/3] nvme: complete request in work queue on CPU with flooded interrupts longli
@ 2019-08-20  9:52   ` Peter Zijlstra
  2019-08-21  8:37     ` Long Li
  2019-08-20 17:33   ` Sagi Grimberg
  1 sibling, 1 reply; 30+ messages in thread
From: Peter Zijlstra @ 2019-08-20  9:52 UTC (permalink / raw)
  To: longli
  Cc: Ingo Molnar, Keith Busch, Jens Axboe, Christoph Hellwig,
	Sagi Grimberg, linux-nvme, linux-kernel, Long Li

On Mon, Aug 19, 2019 at 11:14:29PM -0700, longli@linuxonhyperv.com wrote:
> From: Long Li <longli@microsoft.com>
> 
> When a NVMe hardware queue is mapped to several CPU queues, it is possible
> that the CPU this hardware queue is bound to is flooded by returning I/O for
> other CPUs.
> 
> For example, consider the following scenario:
> 1. CPU 0, 1, 2 and 3 share the same hardware queue
> 2. the hardware queue interrupts CPU 0 for I/O response
> 3. processes from CPU 1, 2 and 3 keep sending I/Os
> 
> CPU 0 may be flooded with interrupts from NVMe device that are I/O responses
> for CPU 1, 2 and 3. Under heavy I/O load, it is possible that CPU 0 spends
> all the time serving NVMe and other system interrupts, but doesn't have a
> chance to run in process context.

Ideally -- and there is some code to affect this, the load-balancer will
move tasks away from this CPU.

> To fix this, CPU 0 can schedule a work to complete the I/O request when it
> detects the scheduler is not making progress. This serves multiple purposes:

Suppose the task waiting for the IO completion is a RT task, and you've
just queued it to a regular work. This is an instant priority inversion.

> 1. This CPU has to be scheduled to complete the request. The other CPUs can't
> issue more I/Os until some previous I/Os are completed. This helps this CPU
> get out of NVMe interrupts.
> 
> 2. This acts a throttling mechanisum for NVMe devices, in that it can not
> starve a CPU while servicing I/Os from other CPUs.
> 
> 3. This CPU can make progress on RCU and other work items on its queue.
> 
> Signed-off-by: Long Li <longli@microsoft.com>
> ---
>  drivers/nvme/host/core.c | 57 +++++++++++++++++++++++++++++++++++++++-
>  drivers/nvme/host/nvme.h |  1 +
>  2 files changed, 57 insertions(+), 1 deletion(-)

WTH does this live in the NVME driver? Surely something like this should
be in the block layer. I'm thinking there's fiber channel connected
storage that should be able to trigger much the same issues.

> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 6a9dd68c0f4f..576bb6fce293 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c

> @@ -260,9 +270,54 @@ static void nvme_retry_req(struct request *req)
>  	blk_mq_delay_kick_requeue_list(req->q, delay);
>  }
>  
> +static void nvme_complete_rq_work(struct work_struct *work)
> +{
> +	struct nvme_request *nvme_rq =
> +		container_of(work, struct nvme_request, work);
> +	struct request *req = blk_mq_rq_from_pdu(nvme_rq);
> +
> +	nvme_complete_rq(req);
> +}
> +
> +
>  void nvme_complete_rq(struct request *req)
>  {
> -	blk_status_t status = nvme_error_status(req);
> +	blk_status_t status;
> +	int cpu;
> +	u64 switches;
> +	struct nvme_request *nvme_rq;
> +
> +	if (!in_interrupt())
> +		goto skip_check;
> +
> +	nvme_rq = nvme_req(req);
> +	cpu = smp_processor_id();
> +	if (idle_cpu(cpu))
> +		goto skip_check;
> +
> +	/* Check if this CPU is flooded with interrupts */
> +	switches = get_cpu_rq_switches(cpu);
> +	if (this_cpu_read(last_switch) == switches) {
> +		/*
> +		 * If this CPU hasn't made a context switch in
> +		 * MAX_SCHED_TIMEOUT ns (and it's not idle), schedule a work to
> +		 * complete this I/O. This forces this CPU run non-interrupt
> +		 * code and throttle the other CPU issuing the I/O
> +		 */

What if there was only a single task on that CPU? Then we'd never
need/want to context switch in the first place.

AFAICT all this is just a whole bunch of gruesome hacks piled on top one
another.

> +		if (sched_clock() - this_cpu_read(last_clock)
> +				> MAX_SCHED_TIMEOUT) {
> +			INIT_WORK(&nvme_rq->work, nvme_complete_rq_work);
> +			schedule_work_on(cpu, &nvme_rq->work);
> +			return;
> +		}
> +
> +	} else {
> +		this_cpu_write(last_switch, switches);
> +		this_cpu_write(last_clock, sched_clock());
> +	}
> +
> +skip_check:

Aside from everything else; this is just sodding poor coding style. What
is wrong with something like:

	if (nvme_complete_throttle(...))
		return;

> +	status = nvme_error_status(req);
>  
>  	trace_nvme_complete_rq(req);
>  
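
A minimal sketch of the refactoring suggested here, with the flood check from
the patch above pulled out of nvme_complete_rq() into a helper (the helper
name is taken from the suggestion; the body just repackages the logic already
posted, so it carries the same caveats raised in this review):

/*
 * Sketch only: return true if completion was deferred to a work item because
 * this CPU appears to be stuck in interrupt context.
 */
static bool nvme_complete_throttle(struct request *req)
{
	struct nvme_request *nvme_rq = nvme_req(req);
	u64 switches;
	int cpu;

	if (!in_interrupt())
		return false;

	cpu = smp_processor_id();
	if (idle_cpu(cpu))
		return false;

	switches = get_cpu_rq_switches(cpu);
	if (this_cpu_read(last_switch) != switches) {
		/* Progress was made; remember when we last saw a switch. */
		this_cpu_write(last_switch, switches);
		this_cpu_write(last_clock, sched_clock());
		return false;
	}

	if (sched_clock() - this_cpu_read(last_clock) <= MAX_SCHED_TIMEOUT)
		return false;

	/* No context switch for MAX_SCHED_TIMEOUT ns: defer to process context. */
	INIT_WORK(&nvme_rq->work, nvme_complete_rq_work);
	schedule_work_on(cpu, &nvme_rq->work);
	return true;
}

void nvme_complete_rq(struct request *req)
{
	blk_status_t status;

	if (nvme_complete_throttle(req))
		return;

	status = nvme_error_status(req);
	/* ... unchanged from here on ... */
}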


* Re: [PATCH 0/3] fix interrupt swamp in NVMe
  2019-08-20  8:59   ` John Garry
@ 2019-08-20 15:05     ` Keith Busch
  2019-08-21  7:47     ` Long Li
  1 sibling, 0 replies; 30+ messages in thread
From: Keith Busch @ 2019-08-20 15:05 UTC (permalink / raw)
  To: John Garry
  Cc: Ming Lei, longli, Ingo Molnar, Peter Zijlstra, Busch, Keith,
	Jens Axboe, Christoph Hellwig, Sagi Grimberg, linux-nvme,
	Linux Kernel Mailing List, Long Li, Thomas Gleixner, chenxiang

On Tue, Aug 20, 2019 at 01:59:32AM -0700, John Garry wrote:
> diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> index e8f7f179bf77..cb483a055512 100644
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -966,9 +966,13 @@ irq_thread_check_affinity(struct irq_desc *desc, 
> struct irqaction *action)
>   	 * mask pointer. For CPU_MASK_OFFSTACK=n this is optimized out.
>   	 */
>   	if (cpumask_available(desc->irq_common_data.affinity)) {
> +		struct irq_data *irq_data = &desc->irq_data;
>   		const struct cpumask *m;
> 
> -		m = irq_data_get_effective_affinity_mask(&desc->irq_data);
> +		if (action->flags & IRQF_IRQ_AFFINITY)
> +			m = desc->irq_common_data.affinity;
> +		else
> +			m = irq_data_get_effective_affinity_mask(irq_data);
>   		cpumask_copy(mask, m);
>   	} else {
>   		valid = false;
> -- 
> 2.17.1
> 
> As Ming mentioned in that same thread, we could even make this policy 
> for managed interrupts.

Ack, I really like this option!


* Re: [PATCH 3/3] nvme: complete request in work queue on CPU with flooded interrupts
  2019-08-20  6:14 ` [PATCH 3/3] nvme: complete request in work queue on CPU with flooded interrupts longli
  2019-08-20  9:52   ` Peter Zijlstra
@ 2019-08-20 17:33   ` Sagi Grimberg
  2019-08-21  8:39     ` Long Li
  2019-08-23  3:21     ` Ming Lei
  1 sibling, 2 replies; 30+ messages in thread
From: Sagi Grimberg @ 2019-08-20 17:33 UTC (permalink / raw)
  To: longli, Ingo Molnar, Peter Zijlstra, Keith Busch, Jens Axboe,
	Christoph Hellwig, linux-nvme, linux-kernel
  Cc: Long Li


> From: Long Li <longli@microsoft.com>
> 
> When a NVMe hardware queue is mapped to several CPU queues, it is possible
> that the CPU this hardware queue is bound to is flooded by returning I/O for
> other CPUs.
> 
> For example, consider the following scenario:
> 1. CPU 0, 1, 2 and 3 share the same hardware queue
> 2. the hardware queue interrupts CPU 0 for I/O response
> 3. processes from CPU 1, 2 and 3 keep sending I/Os
> 
> CPU 0 may be flooded with interrupts from NVMe device that are I/O responses
> for CPU 1, 2 and 3. Under heavy I/O load, it is possible that CPU 0 spends
> all the time serving NVMe and other system interrupts, but doesn't have a
> chance to run in process context.
> 
> To fix this, CPU 0 can schedule a work to complete the I/O request when it
> detects the scheduler is not making progress. This serves multiple purposes:
> 
> 1. This CPU has to be scheduled to complete the request. The other CPUs can't
> issue more I/Os until some previous I/Os are completed. This helps this CPU
> get out of NVMe interrupts.
> 
> 2. This acts a throttling mechanisum for NVMe devices, in that it can not
> starve a CPU while servicing I/Os from other CPUs.
> 
> 3. This CPU can make progress on RCU and other work items on its queue.

The problem is indeed real, but this is the wrong approach in my mind.

We already have irqpoll, which takes care of properly budgeting polling
cycles and not hogging the cpu.

I've sent an rfc for this particular problem before [1]. At the time, IIRC,
Christoph suggested that we poll the first batch directly from the irq
context and reap the rest in the irqpoll handler.

[1]: 
http://lists.infradead.org/pipermail/linux-nvme/2016-October/006497.html

How about something like this instead:
--
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 71127a366d3c..84bf16d75109 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -24,6 +24,7 @@
  #include <linux/io-64-nonatomic-lo-hi.h>
  #include <linux/sed-opal.h>
  #include <linux/pci-p2pdma.h>
+#include <linux/irq_poll.h>

  #include "trace.h"
  #include "nvme.h"
@@ -32,6 +33,7 @@
  #define CQ_SIZE(q)     ((q)->q_depth * sizeof(struct nvme_completion))

  #define SGES_PER_PAGE  (PAGE_SIZE / sizeof(struct nvme_sgl_desc))
+#define NVME_POLL_BUDGET_IRQ   256

  /*
   * These can be higher, but we need to ensure that any command doesn't
@@ -189,6 +191,7 @@ struct nvme_queue {
         u32 *dbbuf_cq_db;
         u32 *dbbuf_sq_ei;
         u32 *dbbuf_cq_ei;
+       struct irq_poll iop;
         struct completion delete_done;
  };

@@ -1015,6 +1018,23 @@ static inline int nvme_process_cq(struct nvme_queue *nvmeq, u16 *start,
         return found;
  }

+static int nvme_irqpoll_handler(struct irq_poll *iop, int budget)
+{
+       struct nvme_queue *nvmeq = container_of(iop, struct nvme_queue, iop);
+       struct pci_dev *pdev = to_pci_dev(nvmeq->dev->dev);
+       u16 start, end;
+       int completed;
+
+       completed = nvme_process_cq(nvmeq, &start, &end, budget);
+       nvme_complete_cqes(nvmeq, start, end);
+       if (completed < budget) {
+               irq_poll_complete(&nvmeq->iop);
+               enable_irq(pci_irq_vector(pdev, nvmeq->cq_vector));
+       }
+
+       return completed;
+}
+
  static irqreturn_t nvme_irq(int irq, void *data)
  {
         struct nvme_queue *nvmeq = data;
@@ -1028,12 +1048,16 @@ static irqreturn_t nvme_irq(int irq, void *data)
         rmb();
         if (nvmeq->cq_head != nvmeq->last_cq_head)
                 ret = IRQ_HANDLED;
-       nvme_process_cq(nvmeq, &start, &end, -1);
+       nvme_process_cq(nvmeq, &start, &end, NVME_POLL_BUDGET_IRQ);
         nvmeq->last_cq_head = nvmeq->cq_head;
         wmb();

         if (start != end) {
                 nvme_complete_cqes(nvmeq, start, end);
+               if (nvme_cqe_pending(nvmeq)) {
+                       disable_irq_nosync(irq);
+                       irq_poll_sched(&nvmeq->iop);
+               }
                 return IRQ_HANDLED;
         }

@@ -1347,6 +1371,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)

  static void nvme_free_queue(struct nvme_queue *nvmeq)
  {
+       irq_poll_disable(&nvmeq->iop);
         dma_free_coherent(nvmeq->dev->dev, CQ_SIZE(nvmeq),
                                 (void *)nvmeq->cqes, nvmeq->cq_dma_addr);
         if (!nvmeq->sq_cmds)
@@ -1481,6 +1506,7 @@ static int nvme_alloc_queue(struct nvme_dev *dev, int qid, int depth)
         nvmeq->dev = dev;
         spin_lock_init(&nvmeq->sq_lock);
         spin_lock_init(&nvmeq->cq_poll_lock);
+       irq_poll_init(&nvmeq->iop, NVME_POLL_BUDGET_IRQ, nvme_irqpoll_handler);
         nvmeq->cq_head = 0;
         nvmeq->cq_phase = 1;
         nvmeq->q_db = &dev->dbs[qid * 2 * dev->db_stride];
--


* RE: [PATCH 0/3] fix interrupt swamp in NVMe
  2019-08-20  8:59   ` John Garry
  2019-08-20 15:05     ` Keith Busch
@ 2019-08-21  7:47     ` Long Li
  2019-08-21  9:44       ` Ming Lei
  1 sibling, 1 reply; 30+ messages in thread
From: Long Li @ 2019-08-21  7:47 UTC (permalink / raw)
  To: John Garry, Ming Lei, longli
  Cc: Ingo Molnar, Peter Zijlstra, Keith Busch, Jens Axboe,
	Christoph Hellwig, Sagi Grimberg, linux-nvme,
	Linux Kernel Mailing List, Thomas Gleixner, chenxiang

>>>Subject: Re: [PATCH 0/3] fix interrupt swamp in NVMe
>>>
>>>On 20/08/2019 09:25, Ming Lei wrote:
>>>> On Tue, Aug 20, 2019 at 2:14 PM <longli@linuxonhyperv.com> wrote:
>>>>>
>>>>> From: Long Li <longli@microsoft.com>
>>>>>
>>>>> This patch set tries to fix interrupt swamp in NVMe devices.
>>>>>
>>>>> On large systems with many CPUs, a number of CPUs may share one
>>>NVMe
>>>>> hardware queue. It may have this situation where several CPUs are
>>>>> issuing I/Os, and all the I/Os are returned on the CPU where the
>>>hardware queue is bound to.
>>>>> This may result in that CPU swamped by interrupts and stay in
>>>>> interrupt mode for extended time while other CPUs continue to issue
>>>>> I/O. This can trigger Watchdog and RCU timeout, and make the system
>>>unresponsive.
>>>>>
>>>>> This patch set addresses this by enforcing scheduling and throttling
>>>>> I/O when CPU is starved in this situation.
>>>>>
>>>>> Long Li (3):
>>>>>   sched: define a function to report the number of context switches on a
>>>>>     CPU
>>>>>   sched: export idle_cpu()
>>>>>   nvme: complete request in work queue on CPU with flooded interrupts
>>>>>
>>>>>  drivers/nvme/host/core.c | 57
>>>>> +++++++++++++++++++++++++++++++++++++++-
>>>>>  drivers/nvme/host/nvme.h |  1 +
>>>>>  include/linux/sched.h    |  2 ++
>>>>>  kernel/sched/core.c      |  7 +++++
>>>>>  4 files changed, 66 insertions(+), 1 deletion(-)
>>>>
>>>> Another simpler solution may be to complete request in threaded
>>>> interrupt handler for this case. Meantime allow scheduler to run the
>>>> interrupt thread handler on CPUs specified by the irq affinity mask,
>>>> which was discussed by the following link:
>>>>
>>>>
>>>> https://lore.kernel.org/lkml/e0e9478e-62a5-ca24-3b12-58f7d056383e@huawei.com/
>>>>
>>>> Could you try the above solution and see if the lockup can be avoided?
>>>> John Garry
>>>> should have workable patch.
>>>
>>>Yeah, so we experimented with changing the interrupt handling in the SCSI
>>>driver I maintain to use a threaded handler IRQ handler plus patch below,
>>>and saw a significant throughput boost:
>>>
>>>--->8
>>>
>>>Subject: [PATCH] genirq: Add support to allow thread to use hard irq affinity
>>>
>>>Currently the cpu allowed mask for the threaded part of a threaded irq
>>>handler will be set to the effective affinity of the hard irq.
>>>
>>>Typically the effective affinity of the hard irq will be for a single cpu. As such,
>>>the threaded handler would always run on the same cpu as the hard irq.
>>>
>>>We have seen scenarios in high data-rate throughput testing that the cpu
>>>handling the interrupt can be totally saturated handling both the hard
>>>interrupt and threaded handler parts, limiting throughput.
>>>
>>>Add IRQF_IRQ_AFFINITY flag to allow the driver requesting the threaded
>>>interrupt to decide on the policy of which cpu the threaded handler may run.
>>>
>>>Signed-off-by: John Garry <john.garry@huawei.com>

Thanks for pointing me to this patch. This fixed the interrupt swamp and made the system stable.

However I'm seeing reduced performance when using threaded interrupts.

Here are the test results on a system with 80 CPUs and 10 NVMe disks (32 hardware queues for each disk).
The benchmark tool is FIO; I/O pattern: 4k random reads on all NVMe disks, with queue depth = 64, number of jobs = 80, direct=1.

With threaded interrupts: 1320k IOPS
With just interrupts: 3720k IOPS
With just interrupts and my patch: 3700k IOPS

At peak IOPS, the overall CPU usage is around 98-99%. I think the cost of doing a wakeup and context switch for the NVMe threaded IRQ handler takes some CPU away.

In this test, I made the following change to make use of IRQF_IRQ_AFFINITY for NVMe:

diff --git a/drivers/pci/irq.c b/drivers/pci/irq.c
index a1de501a2729..3fb30d16464e 100644
--- a/drivers/pci/irq.c
+++ b/drivers/pci/irq.c
@@ -86,7 +86,7 @@ int pci_request_irq(struct pci_dev *dev, unsigned int nr, irq_handler_t handler,
        va_list ap;
        int ret;
        char *devname;
-       unsigned long irqflags = IRQF_SHARED;
+       unsigned long irqflags = IRQF_SHARED | IRQF_IRQ_AFFINITY;

        if (!handler)
                irqflags |= IRQF_ONESHOT;

Thanks

Long

>>>
>>>diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h index
>>>5b8328a99b2a..48e8b955989a 100644
>>>--- a/include/linux/interrupt.h
>>>+++ b/include/linux/interrupt.h
>>>@@ -61,6 +61,9 @@
>>>   *                interrupt handler after suspending interrupts. For
>>>system
>>>   *                wakeup devices users need to implement wakeup
>>>detection in
>>>   *                their interrupt handlers.
>>>+ * IRQF_IRQ_AFFINITY - Use the hard interrupt affinity for setting the cpu
>>>+ *                allowed mask for the threaded handler of a threaded
>>>interrupt
>>>+ *                handler, rather than the effective hard irq affinity.
>>>   */
>>>  #define IRQF_SHARED		0x00000080
>>>  #define IRQF_PROBE_SHARED	0x00000100
>>>@@ -74,6 +77,7 @@
>>>  #define IRQF_NO_THREAD		0x00010000
>>>  #define IRQF_EARLY_RESUME	0x00020000
>>>  #define IRQF_COND_SUSPEND	0x00040000
>>>+#define IRQF_IRQ_AFFINITY	0x00080000
>>>
>>>  #define IRQF_TIMER		(__IRQF_TIMER | IRQF_NO_SUSPEND |
>>>IRQF_NO_THREAD)
>>>
>>>diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c index
>>>e8f7f179bf77..cb483a055512 100644
>>>--- a/kernel/irq/manage.c
>>>+++ b/kernel/irq/manage.c
>>>@@ -966,9 +966,13 @@ irq_thread_check_affinity(struct irq_desc *desc,
>>>struct irqaction *action)
>>>  	 * mask pointer. For CPU_MASK_OFFSTACK=n this is optimized out.
>>>  	 */
>>>  	if (cpumask_available(desc->irq_common_data.affinity)) {
>>>+		struct irq_data *irq_data = &desc->irq_data;
>>>  		const struct cpumask *m;
>>>
>>>-		m = irq_data_get_effective_affinity_mask(&desc-
>>>>irq_data);
>>>+		if (action->flags & IRQF_IRQ_AFFINITY)
>>>+			m = desc->irq_common_data.affinity;
>>>+		else
>>>+			m = irq_data_get_effective_affinity_mask(irq_data);
>>>  		cpumask_copy(mask, m);
>>>  	} else {
>>>  		valid = false;
>>>--
>>>2.17.1
>>>
>>>As Ming mentioned in that same thread, we could even make this policy for
>>>managed interrupts.
>>>
>>>Cheers,
>>>John
>>>
>>>>
>>>> Thanks,
>>>> Ming Lei
>>>>
>>>> .
>>>>
>>>



* RE: [PATCH 1/3] sched: define a function to report the number of context switches on a CPU
  2019-08-20  9:38   ` Peter Zijlstra
@ 2019-08-21  8:20     ` Long Li
  2019-08-21 10:34       ` Peter Zijlstra
  0 siblings, 1 reply; 30+ messages in thread
From: Long Li @ 2019-08-21  8:20 UTC (permalink / raw)
  To: Peter Zijlstra, longli
  Cc: Ingo Molnar, Keith Busch, Jens Axboe, Christoph Hellwig,
	Sagi Grimberg, linux-nvme, linux-kernel

>>>Subject: Re: [PATCH 1/3] sched: define a function to report the number of
>>>context switches on a CPU
>>>
>>>On Mon, Aug 19, 2019 at 11:14:27PM -0700, longli@linuxonhyperv.com
>>>wrote:
>>>> From: Long Li <longli@microsoft.com>
>>>>
>>>> The number of context switches on a CPU is useful to determine how
>>>> busy this CPU is on processing IRQs. Export this information so it can
>>>> be used by device drivers.
>>>
>>>Please do explain that; because I'm not seeing how number of switches
>>>relates to processing IRQs _at_all_!

Some kernel components rely on context switches to make progress, for example the watchdog and RCU. On a CPU with a reasonable interrupt load, context switches keep happening, normally a number of switches per second.

While observing a CPU under heavy interrupt load, I see that it spends all its time in IRQ and softIRQ, and does not get a chance to do a switch (calling __schedule()) for a long time. This makes the system unresponsive at times. The purpose is to find out whether this CPU is in this state, and to implement some throttling mechanism to help reduce the number of interrupts. I think the number of switches is not the most precise way of detecting this condition, but maybe it's good enough.

I agree this may not be the best way. If you have another idea for detecting that a CPU is swamped by interrupts, please point me to where to look.

Thanks

Long




* RE: [PATCH 3/3] nvme: complete request in work queue on CPU with flooded interrupts
  2019-08-20  9:52   ` Peter Zijlstra
@ 2019-08-21  8:37     ` Long Li
  2019-08-21 10:35       ` Peter Zijlstra
  0 siblings, 1 reply; 30+ messages in thread
From: Long Li @ 2019-08-21  8:37 UTC (permalink / raw)
  To: Peter Zijlstra, longli
  Cc: Ingo Molnar, Keith Busch, Jens Axboe, Christoph Hellwig,
	Sagi Grimberg, linux-nvme, linux-kernel

>>>Subject: Re: [PATCH 3/3] nvme: complete request in work queue on CPU
>>>with flooded interrupts
>>>
>>>On Mon, Aug 19, 2019 at 11:14:29PM -0700, longli@linuxonhyperv.com
>>>wrote:
>>>> From: Long Li <longli@microsoft.com>
>>>>
>>>> When a NVMe hardware queue is mapped to several CPU queues, it is
>>>> possible that the CPU this hardware queue is bound to is flooded by
>>>> returning I/O for other CPUs.
>>>>
>>>> For example, consider the following scenario:
>>>> 1. CPU 0, 1, 2 and 3 share the same hardware queue 2. the hardware
>>>> queue interrupts CPU 0 for I/O response 3. processes from CPU 1, 2 and
>>>> 3 keep sending I/Os
>>>>
>>>> CPU 0 may be flooded with interrupts from NVMe device that are I/O
>>>> responses for CPU 1, 2 and 3. Under heavy I/O load, it is possible
>>>> that CPU 0 spends all the time serving NVMe and other system
>>>> interrupts, but doesn't have a chance to run in process context.
>>>
>>>Ideally -- and there is some code to affect this, the load-balancer will move
>>>tasks away from this CPU.
>>>
>>>> To fix this, CPU 0 can schedule a work to complete the I/O request
>>>> when it detects the scheduler is not making progress. This serves multiple
>>>purposes:
>>>
>>>Suppose the task waiting for the IO completion is a RT task, and you've just
>>>queued it to a regular work. This is an instant priority inversion.

This is a choice. We can either avoid "locking up" the CPU, or finish the I/O on time from the IRQ handler. I think throttling only happens in extreme conditions, which are rare. The purpose is to make the whole system responsive and happy.

>>>
>>>> 1. This CPU has to be scheduled to complete the request. The other
>>>> CPUs can't issue more I/Os until some previous I/Os are completed.
>>>> This helps this CPU get out of NVMe interrupts.
>>>>
>>>> 2. This acts a throttling mechanisum for NVMe devices, in that it can
>>>> not starve a CPU while servicing I/Os from other CPUs.
>>>>
>>>> 3. This CPU can make progress on RCU and other work items on its queue.
>>>>
>>>> Signed-off-by: Long Li <longli@microsoft.com>
>>>> ---
>>>>  drivers/nvme/host/core.c | 57
>>>> +++++++++++++++++++++++++++++++++++++++-
>>>>  drivers/nvme/host/nvme.h |  1 +
>>>>  2 files changed, 57 insertions(+), 1 deletion(-)
>>>
>>>WTH does this live in the NVME driver? Surely something like this should be
>>>in the block layer. I'm thinking there's fiber channel connected storage that
>>>should be able to trigger much the same issues.

Yes, this can be done in the block layer. I'm not sure of the best way to accomplish this, so I implemented an NVMe patch to help test. The test results are promising in that we are getting 99.5% of the performance while avoiding CPU lockup. The challenge is to find a way to throttle a fast storage device.

>>>
>>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index
>>>> 6a9dd68c0f4f..576bb6fce293 100644
>>>> --- a/drivers/nvme/host/core.c
>>>> +++ b/drivers/nvme/host/core.c
>>>
>>>> @@ -260,9 +270,54 @@ static void nvme_retry_req(struct request *req)
>>>>  	blk_mq_delay_kick_requeue_list(req->q, delay);  }
>>>>
>>>> +static void nvme_complete_rq_work(struct work_struct *work) {
>>>> +	struct nvme_request *nvme_rq =
>>>> +		container_of(work, struct nvme_request, work);
>>>> +	struct request *req = blk_mq_rq_from_pdu(nvme_rq);
>>>> +
>>>> +	nvme_complete_rq(req);
>>>> +}
>>>> +
>>>> +
>>>>  void nvme_complete_rq(struct request *req)  {
>>>> -	blk_status_t status = nvme_error_status(req);
>>>> +	blk_status_t status;
>>>> +	int cpu;
>>>> +	u64 switches;
>>>> +	struct nvme_request *nvme_rq;
>>>> +
>>>> +	if (!in_interrupt())
>>>> +		goto skip_check;
>>>> +
>>>> +	nvme_rq = nvme_req(req);
>>>> +	cpu = smp_processor_id();
>>>> +	if (idle_cpu(cpu))
>>>> +		goto skip_check;
>>>> +
>>>> +	/* Check if this CPU is flooded with interrupts */
>>>> +	switches = get_cpu_rq_switches(cpu);
>>>> +	if (this_cpu_read(last_switch) == switches) {
>>>> +		/*
>>>> +		 * If this CPU hasn't made a context switch in
>>>> +		 * MAX_SCHED_TIMEOUT ns (and it's not idle), schedule a
>>>work to
>>>> +		 * complete this I/O. This forces this CPU run non-interrupt
>>>> +		 * code and throttle the other CPU issuing the I/O
>>>> +		 */
>>>
>>>What if there was only a single task on that CPU? Then we'd never
>>>need/want to context switch in the first place.
>>>
>>>AFAICT all this is just a whole bunch of gruesome hacks piled on top one
>>>another.
>>>
>>>> +		if (sched_clock() - this_cpu_read(last_clock)
>>>> +				> MAX_SCHED_TIMEOUT) {
>>>> +			INIT_WORK(&nvme_rq->work,
>>>nvme_complete_rq_work);
>>>> +			schedule_work_on(cpu, &nvme_rq->work);
>>>> +			return;
>>>> +		}
>>>> +
>>>> +	} else {
>>>> +		this_cpu_write(last_switch, switches);
>>>> +		this_cpu_write(last_clock, sched_clock());
>>>> +	}
>>>> +
>>>> +skip_check:
>>>
>>>Aside from everything else; this is just sodding poor coding style. What is
>>>wrong with something like:
>>>
>>>	if (nvme_complete_throttle(...))
>>>		return;
>>>
>>>> +	status = nvme_error_status(req);
>>>>
>>>>  	trace_nvme_complete_rq(req);
>>>>


* RE: [PATCH 3/3] nvme: complete request in work queue on CPU with flooded interrupts
  2019-08-20 17:33   ` Sagi Grimberg
@ 2019-08-21  8:39     ` Long Li
  2019-08-21 17:36       ` Long Li
  2019-08-23  3:21     ` Ming Lei
  1 sibling, 1 reply; 30+ messages in thread
From: Long Li @ 2019-08-21  8:39 UTC (permalink / raw)
  To: Sagi Grimberg, longli, Ingo Molnar, Peter Zijlstra, Keith Busch,
	Jens Axboe, Christoph Hellwig, linux-nvme, linux-kernel

>>>Subject: Re: [PATCH 3/3] nvme: complete request in work queue on CPU
>>>with flooded interrupts
>>>
>>>
>>>> From: Long Li <longli@microsoft.com>
>>>>
>>>> When a NVMe hardware queue is mapped to several CPU queues, it is
>>>> possible that the CPU this hardware queue is bound to is flooded by
>>>> returning I/O for other CPUs.
>>>>
>>>> For example, consider the following scenario:
>>>> 1. CPU 0, 1, 2 and 3 share the same hardware queue 2. the hardware
>>>> queue interrupts CPU 0 for I/O response 3. processes from CPU 1, 2 and
>>>> 3 keep sending I/Os
>>>>
>>>> CPU 0 may be flooded with interrupts from NVMe device that are I/O
>>>> responses for CPU 1, 2 and 3. Under heavy I/O load, it is possible
>>>> that CPU 0 spends all the time serving NVMe and other system
>>>> interrupts, but doesn't have a chance to run in process context.
>>>>
>>>> To fix this, CPU 0 can schedule a work to complete the I/O request
>>>> when it detects the scheduler is not making progress. This serves multiple
>>>purposes:
>>>>
>>>> 1. This CPU has to be scheduled to complete the request. The other
>>>> CPUs can't issue more I/Os until some previous I/Os are completed.
>>>> This helps this CPU get out of NVMe interrupts.
>>>>
>>>> 2. This acts a throttling mechanisum for NVMe devices, in that it can
>>>> not starve a CPU while servicing I/Os from other CPUs.
>>>>
>>>> 3. This CPU can make progress on RCU and other work items on its queue.
>>>
>>>The problem is indeed real, but this is the wrong approach in my mind.
>>>
>>>We already have irqpoll which takes care proper budgeting polling cycles
>>>and not hogging the cpu.
>>>
>>>I've sent rfc for this particular problem before [1]. At the time IIRC,
>>>Christoph suggested that we will poll the first batch directly from the irq
>>>context and reap the rest in irqpoll handler.

Thanks for the pointer. I will test and report back.

>>>
>>>[1]:
>>>http://lists.infradead.org/pipermail/linux-nvme/2016-October/006497.html
>>>
>>>How about something like this instead:
>>>--
>>>diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index
>>>71127a366d3c..84bf16d75109 100644
>>>--- a/drivers/nvme/host/pci.c
>>>+++ b/drivers/nvme/host/pci.c
>>>@@ -24,6 +24,7 @@
>>>  #include <linux/io-64-nonatomic-lo-hi.h>
>>>  #include <linux/sed-opal.h>
>>>  #include <linux/pci-p2pdma.h>
>>>+#include <linux/irq_poll.h>
>>>
>>>  #include "trace.h"
>>>  #include "nvme.h"
>>>@@ -32,6 +33,7 @@
>>>  #define CQ_SIZE(q)     ((q)->q_depth * sizeof(struct nvme_completion))
>>>
>>>  #define SGES_PER_PAGE  (PAGE_SIZE / sizeof(struct nvme_sgl_desc))
>>>+#define NVME_POLL_BUDGET_IRQ   256
>>>
>>>  /*
>>>   * These can be higher, but we need to ensure that any command doesn't
>>>@@ -189,6 +191,7 @@ struct nvme_queue {
>>>         u32 *dbbuf_cq_db;
>>>         u32 *dbbuf_sq_ei;
>>>         u32 *dbbuf_cq_ei;
>>>+       struct irq_poll iop;
>>>         struct completion delete_done;
>>>  };
>>>
>>>@@ -1015,6 +1018,23 @@ static inline int nvme_process_cq(struct
>>>nvme_queue *nvmeq, u16 *start,
>>>         return found;
>>>  }
>>>
>>>+static int nvme_irqpoll_handler(struct irq_poll *iop, int budget) {
>>>+       struct nvme_queue *nvmeq = container_of(iop, struct nvme_queue,
>>>iop);
>>>+       struct pci_dev *pdev = to_pci_dev(nvmeq->dev->dev);
>>>+       u16 start, end;
>>>+       int completed;
>>>+
>>>+       completed = nvme_process_cq(nvmeq, &start, &end, budget);
>>>+       nvme_complete_cqes(nvmeq, start, end);
>>>+       if (completed < budget) {
>>>+               irq_poll_complete(&nvmeq->iop);
>>>+               enable_irq(pci_irq_vector(pdev, nvmeq->cq_vector));
>>>+       }
>>>+
>>>+       return completed;
>>>+}
>>>+
>>>  static irqreturn_t nvme_irq(int irq, void *data)
>>>  {
>>>         struct nvme_queue *nvmeq = data; @@ -1028,12 +1048,16 @@ static
>>>irqreturn_t nvme_irq(int irq, void *data)
>>>         rmb();
>>>         if (nvmeq->cq_head != nvmeq->last_cq_head)
>>>                 ret = IRQ_HANDLED;
>>>-       nvme_process_cq(nvmeq, &start, &end, -1);
>>>+       nvme_process_cq(nvmeq, &start, &end, NVME_POLL_BUDGET_IRQ);
>>>         nvmeq->last_cq_head = nvmeq->cq_head;
>>>         wmb();
>>>
>>>         if (start != end) {
>>>                 nvme_complete_cqes(nvmeq, start, end);
>>>+               if (nvme_cqe_pending(nvmeq)) {
>>>+                       disable_irq_nosync(irq);
>>>+                       irq_poll_sched(&nvmeq->iop);
>>>+               }
>>>                 return IRQ_HANDLED;
>>>         }
>>>
>>>@@ -1347,6 +1371,7 @@ static enum blk_eh_timer_return
>>>nvme_timeout(struct request *req, bool reserved)
>>>
>>>  static void nvme_free_queue(struct nvme_queue *nvmeq)
>>>  {
>>>+       irq_poll_disable(&nvmeq->iop);
>>>         dma_free_coherent(nvmeq->dev->dev, CQ_SIZE(nvmeq),
>>>                                 (void *)nvmeq->cqes, nvmeq->cq_dma_addr);
>>>         if (!nvmeq->sq_cmds)
>>>@@ -1481,6 +1506,7 @@ static int nvme_alloc_queue(struct nvme_dev
>>>*dev, int qid, int depth)
>>>         nvmeq->dev = dev;
>>>         spin_lock_init(&nvmeq->sq_lock);
>>>         spin_lock_init(&nvmeq->cq_poll_lock);
>>>+       irq_poll_init(&nvmeq->iop, NVME_POLL_BUDGET_IRQ,
>>>nvme_irqpoll_handler);
>>>         nvmeq->cq_head = 0;
>>>         nvmeq->cq_phase = 1;
>>>         nvmeq->q_db = &dev->dbs[qid * 2 * dev->db_stride];
>>>--


* Re: [PATCH 0/3] fix interrupt swamp in NVMe
  2019-08-21  7:47     ` Long Li
@ 2019-08-21  9:44       ` Ming Lei
  2019-08-21 10:03         ` John Garry
  2019-08-21 16:27         ` Long Li
  0 siblings, 2 replies; 30+ messages in thread
From: Ming Lei @ 2019-08-21  9:44 UTC (permalink / raw)
  To: Long Li
  Cc: John Garry, Ming Lei, longli, Jens Axboe, Sagi Grimberg,
	chenxiang, Peter Zijlstra, Linux Kernel Mailing List, linux-nvme,
	Keith Busch, Ingo Molnar, Thomas Gleixner, Christoph Hellwig

On Wed, Aug 21, 2019 at 07:47:44AM +0000, Long Li wrote:
> >>>Subject: Re: [PATCH 0/3] fix interrupt swamp in NVMe
> >>>
> >>>On 20/08/2019 09:25, Ming Lei wrote:
> >>>> On Tue, Aug 20, 2019 at 2:14 PM <longli@linuxonhyperv.com> wrote:
> >>>>>
> >>>>> From: Long Li <longli@microsoft.com>
> >>>>>
> >>>>> This patch set tries to fix interrupt swamp in NVMe devices.
> >>>>>
> >>>>> On large systems with many CPUs, a number of CPUs may share one
> >>>NVMe
> >>>>> hardware queue. It may have this situation where several CPUs are
> >>>>> issuing I/Os, and all the I/Os are returned on the CPU where the
> >>>hardware queue is bound to.
> >>>>> This may result in that CPU swamped by interrupts and stay in
> >>>>> interrupt mode for extended time while other CPUs continue to issue
> >>>>> I/O. This can trigger Watchdog and RCU timeout, and make the system
> >>>unresponsive.
> >>>>>
> >>>>> This patch set addresses this by enforcing scheduling and throttling
> >>>>> I/O when CPU is starved in this situation.
> >>>>>
> >>>>> Long Li (3):
> >>>>>   sched: define a function to report the number of context switches on a
> >>>>>     CPU
> >>>>>   sched: export idle_cpu()
> >>>>>   nvme: complete request in work queue on CPU with flooded interrupts
> >>>>>
> >>>>>  drivers/nvme/host/core.c | 57
> >>>>> +++++++++++++++++++++++++++++++++++++++-
> >>>>>  drivers/nvme/host/nvme.h |  1 +
> >>>>>  include/linux/sched.h    |  2 ++
> >>>>>  kernel/sched/core.c      |  7 +++++
> >>>>>  4 files changed, 66 insertions(+), 1 deletion(-)
> >>>>
> >>>> Another simpler solution may be to complete request in threaded
> >>>> interrupt handler for this case. Meantime allow scheduler to run the
> >>>> interrupt thread handler on CPUs specified by the irq affinity mask,
> >>>> which was discussed by the following link:
> >>>>
> >>>>
> >>>> https://lore.kernel.org/lkml/e0e9478e-62a5-ca24-3b12-58f7d056383e@huawei.com/
> >>>>
> >>>> Could you try the above solution and see if the lockup can be avoided?
> >>>> John Garry
> >>>> should have workable patch.
> >>>
> >>>Yeah, so we experimented with changing the interrupt handling in the SCSI
> >>>driver I maintain to use a threaded handler IRQ handler plus patch below,
> >>>and saw a significant throughput boost:
> >>>
> >>>--->8
> >>>
> >>>Subject: [PATCH] genirq: Add support to allow thread to use hard irq affinity
> >>>
> >>>Currently the cpu allowed mask for the threaded part of a threaded irq
> >>>handler will be set to the effective affinity of the hard irq.
> >>>
> >>>Typically the effective affinity of the hard irq will be for a single cpu. As such,
> >>>the threaded handler would always run on the same cpu as the hard irq.
> >>>
> >>>We have seen scenarios in high data-rate throughput testing that the cpu
> >>>handling the interrupt can be totally saturated handling both the hard
> >>>interrupt and threaded handler parts, limiting throughput.
> >>>
> >>>Add IRQF_IRQ_AFFINITY flag to allow the driver requesting the threaded
> >>>interrupt to decide on the policy of which cpu the threaded handler may run.
> >>>
> >>>Signed-off-by: John Garry <john.garry@huawei.com>
> 
> Thanks for pointing me to this patch. This fixed the interrupt swamp and make the system stable.
> 
> However I'm seeing reduced performance when using threaded interrupts.
> 
> Here are the test results on a system with 80 CPUs and 10 NVMe disks (32 hardware queues for each disk)
> Benchmark tool is FIO, I/O pattern: 4k random reads on all NVMe disks, with queue depth = 64, num of jobs = 80, direct=1
> 
> With threaded interrupts: 1320k IOPS
> With just interrupts: 3720k IOPS
> With just interrupts and my patch: 3700k IOPS

This gap looks too big wrt. threaded interrupts vs. interrupts.

> 
> At the peak IOPS, the overall CPU usage is at around 98-99%. I think the cost of doing wake up and context switch for NVMe threaded IRQ handler takes some CPU away.
> 

In theory, it shouldn't be so, because most of the time the thread should be
running on the CPUs of this hctx, and the wakeup cost shouldn't be that big.
Maybe there is a performance problem somewhere wrt. threaded interrupts.

Could you share us your test script and environment? I will see if I can
reproduce it in my environment.

> In this test, I made the following change to make use of IRQF_IRQ_AFFINITY for NVMe:
> 
> diff --git a/drivers/pci/irq.c b/drivers/pci/irq.c
> index a1de501a2729..3fb30d16464e 100644
> --- a/drivers/pci/irq.c
> +++ b/drivers/pci/irq.c
> @@ -86,7 +86,7 @@ int pci_request_irq(struct pci_dev *dev, unsigned int nr, irq_handler_t handler,
>         va_list ap;
>         int ret;
>         char *devname;
> -       unsigned long irqflags = IRQF_SHARED;
> +       unsigned long irqflags = IRQF_SHARED | IRQF_IRQ_AFFINITY;
> 
>         if (!handler)
>                 irqflags |= IRQF_ONESHOT;
> 

I don't see why IRQF_IRQ_AFFINITY is needed.

John, could you explain it a bit why you need changes on IRQF_IRQ_AFFINITY? 

The following patch should be enough:

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index e8f7f179bf77..1e7cffc1c20c 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -968,7 +968,11 @@ irq_thread_check_affinity(struct irq_desc *desc, struct irqaction *action)
 	if (cpumask_available(desc->irq_common_data.affinity)) {
 		const struct cpumask *m;
 
-		m = irq_data_get_effective_affinity_mask(&desc->irq_data);
+		if (irqd_affinity_is_managed(&desc->irq_data))
+			m = desc->irq_common_data.affinity;
+		else
+			m = irq_data_get_effective_affinity_mask(
+					&desc->irq_data);
 		cpumask_copy(mask, m);
 	} else {
 		valid = false;


Thanks,
Ming

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] fix interrupt swamp in NVMe
  2019-08-21  9:44       ` Ming Lei
@ 2019-08-21 10:03         ` John Garry
  2019-08-21 16:27         ` Long Li
  1 sibling, 0 replies; 30+ messages in thread
From: John Garry @ 2019-08-21 10:03 UTC (permalink / raw)
  To: Ming Lei, Long Li
  Cc: Ming Lei, longli, Jens Axboe, Sagi Grimberg, chenxiang,
	Peter Zijlstra, Linux Kernel Mailing List, linux-nvme,
	Keith Busch, Ingo Molnar, Thomas Gleixner, Christoph Hellwig

On 21/08/2019 10:44, Ming Lei wrote:
> On Wed, Aug 21, 2019 at 07:47:44AM +0000, Long Li wrote:
>>>>> Subject: Re: [PATCH 0/3] fix interrupt swamp in NVMe
>>>>>
>>>>> On 20/08/2019 09:25, Ming Lei wrote:
>>>>>> On Tue, Aug 20, 2019 at 2:14 PM <longli@linuxonhyperv.com> wrote:
>>>>>>>
>>>>>>> From: Long Li <longli@microsoft.com>
>>>>>>>
>>>>>>> This patch set tries to fix interrupt swamp in NVMe devices.
>>>>>>>
>>>>>>> On large systems with many CPUs, a number of CPUs may share one
>>>>> NVMe
>>>>>>> hardware queue. It may have this situation where several CPUs are
>>>>>>> issuing I/Os, and all the I/Os are returned on the CPU where the
>>>>> hardware queue is bound to.
>>>>>>> This may result in that CPU swamped by interrupts and stay in
>>>>>>> interrupt mode for extended time while other CPUs continue to issue
>>>>>>> I/O. This can trigger Watchdog and RCU timeout, and make the system
>>>>> unresponsive.
>>>>>>>
>>>>>>> This patch set addresses this by enforcing scheduling and throttling
>>>>>>> I/O when CPU is starved in this situation.
>>>>>>>
>>>>>>> Long Li (3):
>>>>>>>   sched: define a function to report the number of context switches on a
>>>>>>>     CPU
>>>>>>>   sched: export idle_cpu()
>>>>>>>   nvme: complete request in work queue on CPU with flooded interrupts
>>>>>>>
>>>>>>>  drivers/nvme/host/core.c | 57
>>>>>>> +++++++++++++++++++++++++++++++++++++++-
>>>>>>>  drivers/nvme/host/nvme.h |  1 +
>>>>>>>  include/linux/sched.h    |  2 ++
>>>>>>>  kernel/sched/core.c      |  7 +++++
>>>>>>>  4 files changed, 66 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> Another simpler solution may be to complete request in threaded
>>>>>> interrupt handler for this case. Meantime allow scheduler to run the
>>>>>> interrupt thread handler on CPUs specified by the irq affinity mask,
>>>>>> which was discussed by the following link:
>>>>>>
>>>>>>
>>>>>> https://lore.kernel.org/lkml/e0e9478e-62a5-ca24-3b12-58f7d056383e@huawei.com/
>>>>>>
>>>>>> Could you try the above solution and see if the lockup can be avoided?
>>>>>> John Garry
>>>>>> should have workable patch.
>>>>>
>>>>> Yeah, so we experimented with changing the interrupt handling in the SCSI
>>>>> driver I maintain to use a threaded handler IRQ handler plus patch below,
>>>>> and saw a significant throughput boost:
>>>>>
>>>>> --->8
>>>>>
>>>>> Subject: [PATCH] genirq: Add support to allow thread to use hard irq affinity
>>>>>
>>>>> Currently the cpu allowed mask for the threaded part of a threaded irq
>>>>> handler will be set to the effective affinity of the hard irq.
>>>>>
>>>>> Typically the effective affinity of the hard irq will be for a single cpu. As such,
>>>>> the threaded handler would always run on the same cpu as the hard irq.
>>>>>
>>>>> We have seen scenarios in high data-rate throughput testing that the cpu
>>>>> handling the interrupt can be totally saturated handling both the hard
>>>>> interrupt and threaded handler parts, limiting throughput.
>>>>>
>>>>> Add IRQF_IRQ_AFFINITY flag to allow the driver requesting the threaded
>>>>> interrupt to decide on the policy of which cpu the threaded handler may run.
>>>>>
>>>>> Signed-off-by: John Garry <john.garry@huawei.com>
>>
>> Thanks for pointing me to this patch. This fixed the interrupt swamp and make the system stable.
>>
>> However I'm seeing reduced performance when using threaded interrupts.
>>
>> Here are the test results on a system with 80 CPUs and 10 NVMe disks (32 hardware queues for each disk)
>> Benchmark tool is FIO, I/O pattern: 4k random reads on all NVMe disks, with queue depth = 64, num of jobs = 80, direct=1
>>
>> With threaded interrupts: 1320k IOPS
>> With just interrupts: 3720k IOPS
>> With just interrupts and my patch: 3700k IOPS
>
> This gap looks too big wrt. threaded interrupts vs. interrupts.
>
>>
>> At the peak IOPS, the overall CPU usage is at around 98-99%. I think the cost of doing wake up and context switch for NVMe threaded IRQ handler takes some CPU away.
>>
>
> In theory, it shouldn't be so because most of times the thread should be running
> on CPUs of this hctx, and the wakeup cost shouldn't be so big. Maybe there is
> performance problem somewhere wrt. threaded interrupt.
>
> Could you share us your test script and environment? I will see if I can
> reproduce it in my environment.
>
>> In this test, I made the following change to make use of IRQF_IRQ_AFFINITY for NVMe:
>>
>> diff --git a/drivers/pci/irq.c b/drivers/pci/irq.c
>> index a1de501a2729..3fb30d16464e 100644
>> --- a/drivers/pci/irq.c
>> +++ b/drivers/pci/irq.c
>> @@ -86,7 +86,7 @@ int pci_request_irq(struct pci_dev *dev, unsigned int nr, irq_handler_t handler,
>>         va_list ap;
>>         int ret;
>>         char *devname;
>> -       unsigned long irqflags = IRQF_SHARED;
>> +       unsigned long irqflags = IRQF_SHARED | IRQF_IRQ_AFFINITY;
>>
>>         if (!handler)
>>                 irqflags |= IRQF_ONESHOT;
>>
>
> I don't see why IRQF_IRQ_AFFINITY is needed.
>
> John, could you explain it a bit why you need changes on IRQF_IRQ_AFFINITY?

Hi Ming,

The patch I shared was my original solution, based on the driver setting the
IRQF_IRQ_AFFINITY flag to request that the threaded handler use the irq
affinity mask as the handler's cpu allowed mask.
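
Just to illustrate, here is a minimal sketch of how a driver could opt in with
that original flag-based approach (IRQF_IRQ_AFFINITY is the flag from the patch
quoted above; struct my_dev, my_hard_handler() and my_thread_fn() are
placeholders, not real driver code):

#include <linux/interrupt.h>

/* Hypothetical driver setup; handler implementations omitted. */
static int my_setup_irq(struct my_dev *md, int irq)
{
	/*
	 * IRQF_IRQ_AFFINITY (from the patch above) asks the irq core to
	 * let the threaded handler run on any CPU in the irq affinity
	 * mask, instead of following the hard irq's effective (single
	 * CPU) affinity.
	 */
	return request_threaded_irq(irq, my_hard_handler, my_thread_fn,
				    IRQF_SHARED | IRQF_IRQ_AFFINITY,
				    "my-dev", md);
}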

If we want to make this decision based only on whether the irq is 
managed or not, then we can drop the flag and just make the change as 
you suggest, below.

Thanks,
John

>
> The following patch should be enough:
>
> diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> index e8f7f179bf77..1e7cffc1c20c 100644
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -968,7 +968,11 @@ irq_thread_check_affinity(struct irq_desc *desc, struct irqaction *action)
>  	if (cpumask_available(desc->irq_common_data.affinity)) {
>  		const struct cpumask *m;
>
> -		m = irq_data_get_effective_affinity_mask(&desc->irq_data);
> +		if (irqd_affinity_is_managed(&desc->irq_data))
> +			m = desc->irq_common_data.affinity;
> +		else
> +			m = irq_data_get_effective_affinity_mask(
> +					&desc->irq_data);
>  		cpumask_copy(mask, m);
>  	} else {
>  		valid = false;
>
>
> Thanks,
> Ming
>
> .
>



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/3] sched: define a function to report the number of context switches on a CPU
  2019-08-21  8:20     ` Long Li
@ 2019-08-21 10:34       ` Peter Zijlstra
  0 siblings, 0 replies; 30+ messages in thread
From: Peter Zijlstra @ 2019-08-21 10:34 UTC (permalink / raw)
  To: Long Li
  Cc: longli, Ingo Molnar, Keith Busch, Jens Axboe, Christoph Hellwig,
	Sagi Grimberg, linux-nvme, linux-kernel

On Wed, Aug 21, 2019 at 08:20:48AM +0000, Long Li wrote:
> >>>Subject: Re: [PATCH 1/3] sched: define a function to report the number of
> >>>context switches on a CPU
> >>>
> >>>On Mon, Aug 19, 2019 at 11:14:27PM -0700, longli@linuxonhyperv.com
> >>>wrote:
> >>>> From: Long Li <longli@microsoft.com>
> >>>>
> >>>> The number of context switches on a CPU is useful to determine how
> >>>> busy this CPU is on processing IRQs. Export this information so it can
> >>>> be used by device drivers.
> >>>
> >>>Please do explain that; because I'm not seeing how number of switches
> >>>relates to processing IRQs _at_all_!
> 
> Some kernel components rely on context switch to progress, for example
> watchdog and RCU. On a CPU with reasonable interrupt load, it
> continues to make context switches, normally a number of switches per
> seconds. 

That isn't true; RCU is perfectly fine with a single task always running
and not making context switches, and so is the watchdog.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/3] nvme: complete request in work queue on CPU with flooded interrupts
  2019-08-21  8:37     ` Long Li
@ 2019-08-21 10:35       ` Peter Zijlstra
  0 siblings, 0 replies; 30+ messages in thread
From: Peter Zijlstra @ 2019-08-21 10:35 UTC (permalink / raw)
  To: Long Li
  Cc: longli, Ingo Molnar, Keith Busch, Jens Axboe, Christoph Hellwig,
	Sagi Grimberg, linux-nvme, linux-kernel

On Wed, Aug 21, 2019 at 08:37:55AM +0000, Long Li wrote:
> >>>Subject: Re: [PATCH 3/3] nvme: complete request in work queue on CPU
> >>>with flooded interrupts
> >>>
> >>>On Mon, Aug 19, 2019 at 11:14:29PM -0700, longli@linuxonhyperv.com
> >>>wrote:
> >>>> From: Long Li <longli@microsoft.com>
> >>>>
> >>>> When a NVMe hardware queue is mapped to several CPU queues, it is
> >>>> possible that the CPU this hardware queue is bound to is flooded by
> >>>> returning I/O for other CPUs.
> >>>>
> >>>> For example, consider the following scenario:
> >>>> 1. CPU 0, 1, 2 and 3 share the same hardware queue 2. the hardware
> >>>> queue interrupts CPU 0 for I/O response 3. processes from CPU 1, 2 and
> >>>> 3 keep sending I/Os
> >>>>
> >>>> CPU 0 may be flooded with interrupts from NVMe device that are I/O
> >>>> responses for CPU 1, 2 and 3. Under heavy I/O load, it is possible
> >>>> that CPU 0 spends all the time serving NVMe and other system
> >>>> interrupts, but doesn't have a chance to run in process context.
> >>>
> >>>Ideally -- and there is some code to affect this, the load-balancer will move
> >>>tasks away from this CPU.
> >>>
> >>>> To fix this, CPU 0 can schedule a work to complete the I/O request
> >>>> when it detects the scheduler is not making progress. This serves multiple
> >>>purposes:
> >>>
> >>>Suppose the task waiting for the IO completion is a RT task, and you've just
> >>>queued it to a regular work. This is an instant priority inversion.
> 
> This is a choice. We can either not "lock up" the CPU, or finish the I/O on time from IRQ handler. I think throttling only happens in extreme conditions, which is rare. The purpose is to make the whole system responsive and happy.

Can you please use a sane MUA.. this is unreadable garbage.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCH 0/3] fix interrupt swamp in NVMe
  2019-08-21  9:44       ` Ming Lei
  2019-08-21 10:03         ` John Garry
@ 2019-08-21 16:27         ` Long Li
  2019-08-22  1:33           ` Ming Lei
  1 sibling, 1 reply; 30+ messages in thread
From: Long Li @ 2019-08-21 16:27 UTC (permalink / raw)
  To: Ming Lei
  Cc: John Garry, Ming Lei, longli, Jens Axboe, Sagi Grimberg,
	chenxiang, Peter Zijlstra, Linux Kernel Mailing List, linux-nvme,
	Keith Busch, Ingo Molnar, Thomas Gleixner, Christoph Hellwig

>>>Subject: Re: [PATCH 0/3] fix interrupt swamp in NVMe
>>>
>>>On Wed, Aug 21, 2019 at 07:47:44AM +0000, Long Li wrote:
>>>> >>>Subject: Re: [PATCH 0/3] fix interrupt swamp in NVMe
>>>> >>>
>>>> >>>On 20/08/2019 09:25, Ming Lei wrote:
>>>> >>>> On Tue, Aug 20, 2019 at 2:14 PM <longli@linuxonhyperv.com> wrote:
>>>> >>>>>
>>>> >>>>> From: Long Li <longli@microsoft.com>
>>>> >>>>>
>>>> >>>>> This patch set tries to fix interrupt swamp in NVMe devices.
>>>> >>>>>
>>>> >>>>> On large systems with many CPUs, a number of CPUs may share
>>>one
>>>> >>>NVMe
>>>> >>>>> hardware queue. It may have this situation where several CPUs
>>>> >>>>> are issuing I/Os, and all the I/Os are returned on the CPU where
>>>> >>>>> the
>>>> >>>hardware queue is bound to.
>>>> >>>>> This may result in that CPU swamped by interrupts and stay in
>>>> >>>>> interrupt mode for extended time while other CPUs continue to
>>>> >>>>> issue I/O. This can trigger Watchdog and RCU timeout, and make
>>>> >>>>> the system
>>>> >>>unresponsive.
>>>> >>>>>
>>>> >>>>> This patch set addresses this by enforcing scheduling and
>>>> >>>>> throttling I/O when CPU is starved in this situation.
>>>> >>>>>
>>>> >>>>> Long Li (3):
>>>> >>>>>   sched: define a function to report the number of context switches
>>>on a
>>>> >>>>>     CPU
>>>> >>>>>   sched: export idle_cpu()
>>>> >>>>>   nvme: complete request in work queue on CPU with flooded
>>>> >>>>> interrupts
>>>> >>>>>
>>>> >>>>>  drivers/nvme/host/core.c | 57
>>>> >>>>> +++++++++++++++++++++++++++++++++++++++-
>>>> >>>>>  drivers/nvme/host/nvme.h |  1 +
>>>> >>>>>  include/linux/sched.h    |  2 ++
>>>> >>>>>  kernel/sched/core.c      |  7 +++++
>>>> >>>>>  4 files changed, 66 insertions(+), 1 deletion(-)
>>>> >>>>
>>>> >>>> Another simpler solution may be to complete request in threaded
>>>> >>>> interrupt handler for this case. Meantime allow scheduler to run
>>>> >>>> the interrupt thread handler on CPUs specified by the irq
>>>> >>>> affinity mask, which was discussed by the following link:
>>>> >>>>
>>>> >>>>
>>>> >>>> https://lore.kernel.org/lkml/e0e9478e-62a5-ca24-3b12-58f7d056383e@huawei.com/
>>>> >>>>
>>>> >>>> Could you try the above solution and see if the lockup can be
>>>avoided?
>>>> >>>> John Garry
>>>> >>>> should have workable patch.
>>>> >>>
>>>> >>>Yeah, so we experimented with changing the interrupt handling in
>>>> >>>the SCSI driver I maintain to use a threaded handler IRQ handler
>>>> >>>plus patch below, and saw a significant throughput boost:
>>>> >>>
>>>> >>>--->8
>>>> >>>
>>>> >>>Subject: [PATCH] genirq: Add support to allow thread to use hard
>>>> >>>irq affinity
>>>> >>>
>>>> >>>Currently the cpu allowed mask for the threaded part of a threaded
>>>> >>>irq handler will be set to the effective affinity of the hard irq.
>>>> >>>
>>>> >>>Typically the effective affinity of the hard irq will be for a
>>>> >>>single cpu. As such, the threaded handler would always run on the
>>>same cpu as the hard irq.
>>>> >>>
>>>> >>>We have seen scenarios in high data-rate throughput testing that
>>>> >>>the cpu handling the interrupt can be totally saturated handling
>>>> >>>both the hard interrupt and threaded handler parts, limiting
>>>throughput.
>>>> >>>
>>>> >>>Add IRQF_IRQ_AFFINITY flag to allow the driver requesting the
>>>> >>>threaded interrupt to decide on the policy of which cpu the threaded
>>>handler may run.
>>>> >>>
>>>> >>>Signed-off-by: John Garry <john.garry@huawei.com>
>>>>
>>>> Thanks for pointing me to this patch. This fixed the interrupt swamp and
>>>make the system stable.
>>>>
>>>> However I'm seeing reduced performance when using threaded
>>>interrupts.
>>>>
>>>> Here are the test results on a system with 80 CPUs and 10 NVMe disks
>>>> (32 hardware queues for each disk) Benchmark tool is FIO, I/O pattern:
>>>> 4k random reads on all NVMe disks, with queue depth = 64, num of jobs
>>>> = 80, direct=1
>>>>
>>>> With threaded interrupts: 1320k IOPS
>>>> With just interrupts: 3720k IOPS
>>>> With just interrupts and my patch: 3700k IOPS
>>>
>>>This gap looks too big wrt. threaded interrupts vs. interrupts.
>>>
>>>>
>>>> At the peak IOPS, the overall CPU usage is at around 98-99%. I think the
>>>cost of doing wake up and context switch for NVMe threaded IRQ handler
>>>takes some CPU away.
>>>>
>>>
>>>In theory, it shouldn't be so because most of times the thread should be
>>>running on CPUs of this hctx, and the wakeup cost shouldn't be so big.
>>>Maybe there is performance problem somewhere wrt. threaded interrupt.
>>>
>>>Could you share us your test script and environment? I will see if I can
>>>reproduce it in my environment.

Ming, do you have access to L80s_v2 in Azure? This test needs to run on that VM size.

Here is the command to benchmark it:

fio --bs=4k --ioengine=libaio --iodepth=128 --filename=/dev/nvme0n1:/dev/nvme1n1:/dev/nvme2n1:/dev/nvme3n1:/dev/nvme4n1:/dev/nvme5n1:/dev/nvme6n1:/dev/nvme7n1:/dev/nvme8n1:/dev/nvme9n1 --direct=1 --runtime=120 --numjobs=80 --rw=randread --name=test --group_reporting --gtod_reduce=1

Thanks

Long

>>>
>>>> In this test, I made the following change to make use of
>>>IRQF_IRQ_AFFINITY for NVMe:
>>>>
>>>> diff --git a/drivers/pci/irq.c b/drivers/pci/irq.c index
>>>> a1de501a2729..3fb30d16464e 100644
>>>> --- a/drivers/pci/irq.c
>>>> +++ b/drivers/pci/irq.c
>>>> @@ -86,7 +86,7 @@ int pci_request_irq(struct pci_dev *dev, unsigned int
>>>nr, irq_handler_t handler,
>>>>         va_list ap;
>>>>         int ret;
>>>>         char *devname;
>>>> -       unsigned long irqflags = IRQF_SHARED;
>>>> +       unsigned long irqflags = IRQF_SHARED | IRQF_IRQ_AFFINITY;
>>>>
>>>>         if (!handler)
>>>>                 irqflags |= IRQF_ONESHOT;
>>>>
>>>
>>>I don't see why IRQF_IRQ_AFFINITY is needed.
>>>
>>>John, could you explain it a bit why you need changes on
>>>IRQF_IRQ_AFFINITY?
>>>
>>>The following patch should be enough:
>>>
>>>diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c index
>>>e8f7f179bf77..1e7cffc1c20c 100644
>>>--- a/kernel/irq/manage.c
>>>+++ b/kernel/irq/manage.c
>>>@@ -968,7 +968,11 @@ irq_thread_check_affinity(struct irq_desc *desc,
>>>struct irqaction *action)
>>> 	if (cpumask_available(desc->irq_common_data.affinity)) {
>>> 		const struct cpumask *m;
>>>
>>>-		m = irq_data_get_effective_affinity_mask(&desc-
>>>>irq_data);
>>>+		if (irqd_affinity_is_managed(&desc->irq_data))
>>>+			m = desc->irq_common_data.affinity;
>>>+		else
>>>+			m = irq_data_get_effective_affinity_mask(
>>>+					&desc->irq_data);
>>> 		cpumask_copy(mask, m);
>>> 	} else {
>>> 		valid = false;
>>>
>>>
>>>Thanks,
>>>Ming

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCH 3/3] nvme: complete request in work queue on CPU with flooded interrupts
  2019-08-21  8:39     ` Long Li
@ 2019-08-21 17:36       ` Long Li
  2019-08-21 21:54         ` Sagi Grimberg
  0 siblings, 1 reply; 30+ messages in thread
From: Long Li @ 2019-08-21 17:36 UTC (permalink / raw)
  To: Sagi Grimberg, longli, Ingo Molnar, Peter Zijlstra, Keith Busch,
	Jens Axboe, Christoph Hellwig, linux-nvme, linux-kernel

>>>Subject: RE: [PATCH 3/3] nvme: complete request in work queue on CPU
>>>with flooded interrupts
>>>
>>>>>>Subject: Re: [PATCH 3/3] nvme: complete request in work queue on
>>>CPU
>>>>>>with flooded interrupts
>>>>>>
>>>>>>
>>>>>>> From: Long Li <longli@microsoft.com>
>>>>>>>
>>>>>>> When a NVMe hardware queue is mapped to several CPU queues, it is
>>>>>>> possible that the CPU this hardware queue is bound to is flooded by
>>>>>>> returning I/O for other CPUs.
>>>>>>>
>>>>>>> For example, consider the following scenario:
>>>>>>> 1. CPU 0, 1, 2 and 3 share the same hardware queue 2. the hardware
>>>>>>> queue interrupts CPU 0 for I/O response 3. processes from CPU 1, 2
>>>>>>> and
>>>>>>> 3 keep sending I/Os
>>>>>>>
>>>>>>> CPU 0 may be flooded with interrupts from NVMe device that are I/O
>>>>>>> responses for CPU 1, 2 and 3. Under heavy I/O load, it is possible
>>>>>>> that CPU 0 spends all the time serving NVMe and other system
>>>>>>> interrupts, but doesn't have a chance to run in process context.
>>>>>>>
>>>>>>> To fix this, CPU 0 can schedule a work to complete the I/O request
>>>>>>> when it detects the scheduler is not making progress. This serves
>>>>>>> multiple
>>>>>>purposes:
>>>>>>>
>>>>>>> 1. This CPU has to be scheduled to complete the request. The other
>>>>>>> CPUs can't issue more I/Os until some previous I/Os are completed.
>>>>>>> This helps this CPU get out of NVMe interrupts.
>>>>>>>
>>>>>>> 2. This acts a throttling mechanisum for NVMe devices, in that it
>>>>>>> can not starve a CPU while servicing I/Os from other CPUs.
>>>>>>>
>>>>>>> 3. This CPU can make progress on RCU and other work items on its
>>>queue.
>>>>>>
>>>>>>The problem is indeed real, but this is the wrong approach in my mind.
>>>>>>
>>>>>>We already have irqpoll which takes care proper budgeting polling
>>>>>>cycles and not hogging the cpu.
>>>>>>
>>>>>>I've sent rfc for this particular problem before [1]. At the time
>>>>>>IIRC, Christoph suggested that we will poll the first batch directly
>>>>>>from the irq context and reap the rest in irqpoll handler.
>>>
>>>Thanks for the pointer. I will test and report back.

Sagi,

Here are the test results.

Benchmark command:
fio --bs=4k --ioengine=libaio --iodepth=64 --filename=/dev/nvme0n1:/dev/nvme1n1:/dev/nvme2n1:/dev/nvme3n1:/dev/nvme4n1:/dev/nvme5n1:/dev/nvme6n1:/dev/nvme7n1:/dev/nvme8n1:/dev/nvme9n1 --direct=1 --runtime=90 --numjobs=80 --rw=randread --name=test --group_reporting --gtod_reduce=1

With your patch: 1720k IOPS
With threaded interrupts: 1320k IOPS
With just interrupts: 3720k IOPS

Interrupts are the fastest but we need to find a way to throttle it.

Thanks

Long


>>>
>>>>>>
>>>>>>[1]:
>>>>>>http://lists.infradead.org/pipermail/linux-nvme/2016-October/006497.html
>>>>>>
>>>>>>How about something like this instead:
>>>>>>--
>>>>>>diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index
>>>>>>71127a366d3c..84bf16d75109 100644
>>>>>>--- a/drivers/nvme/host/pci.c
>>>>>>+++ b/drivers/nvme/host/pci.c
>>>>>>@@ -24,6 +24,7 @@
>>>>>>  #include <linux/io-64-nonatomic-lo-hi.h>
>>>>>>  #include <linux/sed-opal.h>
>>>>>>  #include <linux/pci-p2pdma.h>
>>>>>>+#include <linux/irq_poll.h>
>>>>>>
>>>>>>  #include "trace.h"
>>>>>>  #include "nvme.h"
>>>>>>@@ -32,6 +33,7 @@
>>>>>>  #define CQ_SIZE(q)     ((q)->q_depth * sizeof(struct nvme_completion))
>>>>>>
>>>>>>  #define SGES_PER_PAGE  (PAGE_SIZE / sizeof(struct nvme_sgl_desc))
>>>>>>+#define NVME_POLL_BUDGET_IRQ   256
>>>>>>
>>>>>>  /*
>>>>>>   * These can be higher, but we need to ensure that any command
>>>>>>doesn't @@ -189,6 +191,7 @@ struct nvme_queue {
>>>>>>         u32 *dbbuf_cq_db;
>>>>>>         u32 *dbbuf_sq_ei;
>>>>>>         u32 *dbbuf_cq_ei;
>>>>>>+       struct irq_poll iop;
>>>>>>         struct completion delete_done;  };
>>>>>>
>>>>>>@@ -1015,6 +1018,23 @@ static inline int nvme_process_cq(struct
>>>>>>nvme_queue *nvmeq, u16 *start,
>>>>>>         return found;
>>>>>>  }
>>>>>>
>>>>>>+static int nvme_irqpoll_handler(struct irq_poll *iop, int budget) {
>>>>>>+       struct nvme_queue *nvmeq = container_of(iop, struct
>>>>>>+nvme_queue,
>>>>>>iop);
>>>>>>+       struct pci_dev *pdev = to_pci_dev(nvmeq->dev->dev);
>>>>>>+       u16 start, end;
>>>>>>+       int completed;
>>>>>>+
>>>>>>+       completed = nvme_process_cq(nvmeq, &start, &end, budget);
>>>>>>+       nvme_complete_cqes(nvmeq, start, end);
>>>>>>+       if (completed < budget) {
>>>>>>+               irq_poll_complete(&nvmeq->iop);
>>>>>>+               enable_irq(pci_irq_vector(pdev, nvmeq->cq_vector));
>>>>>>+       }
>>>>>>+
>>>>>>+       return completed;
>>>>>>+}
>>>>>>+
>>>>>>  static irqreturn_t nvme_irq(int irq, void *data)
>>>>>>  {
>>>>>>         struct nvme_queue *nvmeq = data; @@ -1028,12 +1048,16 @@
>>>>>>static irqreturn_t nvme_irq(int irq, void *data)
>>>>>>         rmb();
>>>>>>         if (nvmeq->cq_head != nvmeq->last_cq_head)
>>>>>>                 ret = IRQ_HANDLED;
>>>>>>-       nvme_process_cq(nvmeq, &start, &end, -1);
>>>>>>+       nvme_process_cq(nvmeq, &start, &end,
>>>NVME_POLL_BUDGET_IRQ);
>>>>>>         nvmeq->last_cq_head = nvmeq->cq_head;
>>>>>>         wmb();
>>>>>>
>>>>>>         if (start != end) {
>>>>>>                 nvme_complete_cqes(nvmeq, start, end);
>>>>>>+               if (nvme_cqe_pending(nvmeq)) {
>>>>>>+                       disable_irq_nosync(irq);
>>>>>>+                       irq_poll_sched(&nvmeq->iop);
>>>>>>+               }
>>>>>>                 return IRQ_HANDLED;
>>>>>>         }
>>>>>>
>>>>>>@@ -1347,6 +1371,7 @@ static enum blk_eh_timer_return
>>>>>>nvme_timeout(struct request *req, bool reserved)
>>>>>>
>>>>>>  static void nvme_free_queue(struct nvme_queue *nvmeq)  {
>>>>>>+       irq_poll_disable(&nvmeq->iop);
>>>>>>         dma_free_coherent(nvmeq->dev->dev, CQ_SIZE(nvmeq),
>>>>>>                                 (void *)nvmeq->cqes, nvmeq->cq_dma_addr);
>>>>>>         if (!nvmeq->sq_cmds)
>>>>>>@@ -1481,6 +1506,7 @@ static int nvme_alloc_queue(struct nvme_dev
>>>>>>*dev, int qid, int depth)
>>>>>>         nvmeq->dev = dev;
>>>>>>         spin_lock_init(&nvmeq->sq_lock);
>>>>>>         spin_lock_init(&nvmeq->cq_poll_lock);
>>>>>>+       irq_poll_init(&nvmeq->iop, NVME_POLL_BUDGET_IRQ,
>>>>>>nvme_irqpoll_handler);
>>>>>>         nvmeq->cq_head = 0;
>>>>>>         nvmeq->cq_phase = 1;
>>>>>>         nvmeq->q_db = &dev->dbs[qid * 2 * dev->db_stride];
>>>>>>--

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/3] nvme: complete request in work queue on CPU with flooded interrupts
  2019-08-21 17:36       ` Long Li
@ 2019-08-21 21:54         ` Sagi Grimberg
  2019-08-24  0:13           ` Long Li
  0 siblings, 1 reply; 30+ messages in thread
From: Sagi Grimberg @ 2019-08-21 21:54 UTC (permalink / raw)
  To: Long Li, longli, Ingo Molnar, Peter Zijlstra, Keith Busch,
	Jens Axboe, Christoph Hellwig, linux-nvme, linux-kernel


> Sagi,
> 
> Here are the test results.
> 
> Benchmark command:
> fio --bs=4k --ioengine=libaio --iodepth=64 --filename=/dev/nvme0n1:/dev/nvme1n1:/dev/nvme2n1:/dev/nvme3n1:/dev/nvme4n1:/dev/nvme5n1:/dev/nvme6n1:/dev/nvme7n1:/dev/nvme8n1:/dev/nvme9n1 --direct=1 --runtime=90 --numjobs=80 --rw=randread --name=test --group_reporting --gtod_reduce=1
> 
> With your patch: 1720k IOPS
> With threaded interrupts: 1320k IOPS
> With just interrupts: 3720k IOPS
> 
> Interrupts are the fastest but we need to find a way to throttle it.

This is the workload that generates the flood?
If so, I did not expect the perf difference to be this big..

If completions keep pounding on the cpu, I would expect irqpoll
to simply keep running forever and poll the cqs. There is no
fundamental reason why polling would be faster in an interrupt;
what matters could be:
1. we reschedule more than we need to
2. we don't reap enough completions in every poll round, which
will trigger rearming the interrupt and then when it fires reschedule
another softirq...

Maybe we need to take care of some irq_poll optimizations?

Does this (untested) patch make any difference?
--
diff --git a/lib/irq_poll.c b/lib/irq_poll.c
index 2f17b488d58e..0e94183eba15 100644
--- a/lib/irq_poll.c
+++ b/lib/irq_poll.c
@@ -12,7 +12,8 @@
  #include <linux/irq_poll.h>
  #include <linux/delay.h>

-static unsigned int irq_poll_budget __read_mostly = 256;
+static unsigned int irq_poll_budget __read_mostly = 3000;
+unsigned int __read_mostly irqpoll_budget_usecs = 2000;

  static DEFINE_PER_CPU(struct list_head, blk_cpu_iopoll);

@@ -77,32 +78,26 @@ EXPORT_SYMBOL(irq_poll_complete);

  static void __latent_entropy irq_poll_softirq(struct softirq_action *h)
  {
-       struct list_head *list = this_cpu_ptr(&blk_cpu_iopoll);
-       int rearm = 0, budget = irq_poll_budget;
-       unsigned long start_time = jiffies;
+       struct list_head *irqpoll_list = this_cpu_ptr(&blk_cpu_iopoll);
+       unsigned int budget = irq_poll_budget;
+       unsigned long time_limit =
+               jiffies + usecs_to_jiffies(irqpoll_budget_usecs);
+       LIST_HEAD(list);

         local_irq_disable();
+       list_splice_init(irqpoll_list, &list);
+       local_irq_enable();

-       while (!list_empty(list)) {
+       while (!list_empty(&list)) {
                 struct irq_poll *iop;
                 int work, weight;

-               /*
-                * If softirq window is exhausted then punt.
-                */
-               if (budget <= 0 || time_after(jiffies, start_time)) {
-                       rearm = 1;
-                       break;
-               }
-
-               local_irq_enable();
-
                 /* Even though interrupts have been re-enabled, this
                  * access is safe because interrupts can only add new
                  * entries to the tail of this list, and only ->poll()
                  * calls can remove this head entry from the list.
                  */
-               iop = list_entry(list->next, struct irq_poll, list);
+               iop = list_first_entry(&list, struct irq_poll, list);

                 weight = iop->weight;
                 work = 0;
@@ -111,8 +106,6 @@ static void __latent_entropy irq_poll_softirq(struct 
softirq_action *h)

                 budget -= work;

-               local_irq_disable();
-
                 /*
                  * Drivers must not modify the iopoll state, if they
                  * consume their assigned weight (or more, some drivers 
can't
@@ -125,11 +118,21 @@ static void __latent_entropy 
irq_poll_softirq(struct softirq_action *h)
                         if (test_bit(IRQ_POLL_F_DISABLE, &iop->state))
                                 __irq_poll_complete(iop);
                         else
-                               list_move_tail(&iop->list, list);
+                               list_move_tail(&iop->list, &list);
                 }
+
+               /*
+                * If softirq window is exhausted then punt.
+                */
+               if (budget <= 0 || time_after_eq(jiffies, time_limit))
+                       break;
         }

-       if (rearm)
+       local_irq_disable();
+
+       list_splice_tail_init(irqpoll_list, &list);
+       list_splice(&list, irqpoll_list);
+       if (!list_empty(irqpoll_list))
                 __raise_softirq_irqoff(IRQ_POLL_SOFTIRQ);

         local_irq_enable();
--

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] fix interrupt swamp in NVMe
  2019-08-21 16:27         ` Long Li
@ 2019-08-22  1:33           ` Ming Lei
  2019-08-22  2:00             ` Keith Busch
  0 siblings, 1 reply; 30+ messages in thread
From: Ming Lei @ 2019-08-22  1:33 UTC (permalink / raw)
  To: Long Li
  Cc: Keith Busch, Sagi Grimberg, chenxiang, Peter Zijlstra, Ming Lei,
	John Garry, Linux Kernel Mailing List, linux-nvme, Jens Axboe,
	Ingo Molnar, Thomas Gleixner, Christoph Hellwig, longli

On Wed, Aug 21, 2019 at 04:27:00PM +0000, Long Li wrote:
> >>>Subject: Re: [PATCH 0/3] fix interrupt swamp in NVMe
> >>>
> >>>On Wed, Aug 21, 2019 at 07:47:44AM +0000, Long Li wrote:
> >>>> >>>Subject: Re: [PATCH 0/3] fix interrupt swamp in NVMe
> >>>> >>>
> >>>> >>>On 20/08/2019 09:25, Ming Lei wrote:
> >>>> >>>> On Tue, Aug 20, 2019 at 2:14 PM <longli@linuxonhyperv.com> wrote:
> >>>> >>>>>
> >>>> >>>>> From: Long Li <longli@microsoft.com>
> >>>> >>>>>
> >>>> >>>>> This patch set tries to fix interrupt swamp in NVMe devices.
> >>>> >>>>>
> >>>> >>>>> On large systems with many CPUs, a number of CPUs may share
> >>>one
> >>>> >>>NVMe
> >>>> >>>>> hardware queue. It may have this situation where several CPUs
> >>>> >>>>> are issuing I/Os, and all the I/Os are returned on the CPU where
> >>>> >>>>> the
> >>>> >>>hardware queue is bound to.
> >>>> >>>>> This may result in that CPU swamped by interrupts and stay in
> >>>> >>>>> interrupt mode for extended time while other CPUs continue to
> >>>> >>>>> issue I/O. This can trigger Watchdog and RCU timeout, and make
> >>>> >>>>> the system
> >>>> >>>unresponsive.
> >>>> >>>>>
> >>>> >>>>> This patch set addresses this by enforcing scheduling and
> >>>> >>>>> throttling I/O when CPU is starved in this situation.
> >>>> >>>>>
> >>>> >>>>> Long Li (3):
> >>>> >>>>>   sched: define a function to report the number of context switches
> >>>on a
> >>>> >>>>>     CPU
> >>>> >>>>>   sched: export idle_cpu()
> >>>> >>>>>   nvme: complete request in work queue on CPU with flooded
> >>>> >>>>> interrupts
> >>>> >>>>>
> >>>> >>>>>  drivers/nvme/host/core.c | 57
> >>>> >>>>> +++++++++++++++++++++++++++++++++++++++-
> >>>> >>>>>  drivers/nvme/host/nvme.h |  1 +
> >>>> >>>>>  include/linux/sched.h    |  2 ++
> >>>> >>>>>  kernel/sched/core.c      |  7 +++++
> >>>> >>>>>  4 files changed, 66 insertions(+), 1 deletion(-)
> >>>> >>>>
> >>>> >>>> Another simpler solution may be to complete request in threaded
> >>>> >>>> interrupt handler for this case. Meantime allow scheduler to run
> >>>> >>>> the interrupt thread handler on CPUs specified by the irq
> >>>> >>>> affinity mask, which was discussed by the following link:
> >>>> >>>>
> >>>> >>>>
> >>>> >>>> https://lore.kernel.org/lkml/e0e9478e-62a5-ca24-3b12-58f7d056383e@huawei.com/
> >>>> >>>>
> >>>> >>>> Could you try the above solution and see if the lockup can be
> >>>avoided?
> >>>> >>>> John Garry
> >>>> >>>> should have workable patch.
> >>>> >>>
> >>>> >>>Yeah, so we experimented with changing the interrupt handling in
> >>>> >>>the SCSI driver I maintain to use a threaded handler IRQ handler
> >>>> >>>plus patch below, and saw a significant throughput boost:
> >>>> >>>
> >>>> >>>--->8
> >>>> >>>
> >>>> >>>Subject: [PATCH] genirq: Add support to allow thread to use hard
> >>>> >>>irq affinity
> >>>> >>>
> >>>> >>>Currently the cpu allowed mask for the threaded part of a threaded
> >>>> >>>irq handler will be set to the effective affinity of the hard irq.
> >>>> >>>
> >>>> >>>Typically the effective affinity of the hard irq will be for a
> >>>> >>>single cpu. As such, the threaded handler would always run on the
> >>>same cpu as the hard irq.
> >>>> >>>
> >>>> >>>We have seen scenarios in high data-rate throughput testing that
> >>>> >>>the cpu handling the interrupt can be totally saturated handling
> >>>> >>>both the hard interrupt and threaded handler parts, limiting
> >>>throughput.
> >>>> >>>
> >>>> >>>Add IRQF_IRQ_AFFINITY flag to allow the driver requesting the
> >>>> >>>threaded interrupt to decide on the policy of which cpu the threaded
> >>>handler may run.
> >>>> >>>
> >>>> >>>Signed-off-by: John Garry <john.garry@huawei.com>
> >>>>
> >>>> Thanks for pointing me to this patch. This fixed the interrupt swamp and
> >>>make the system stable.
> >>>>
> >>>> However I'm seeing reduced performance when using threaded
> >>>interrupts.
> >>>>
> >>>> Here are the test results on a system with 80 CPUs and 10 NVMe disks
> >>>> (32 hardware queues for each disk) Benchmark tool is FIO, I/O pattern:
> >>>> 4k random reads on all NVMe disks, with queue depth = 64, num of jobs
> >>>> = 80, direct=1
> >>>>
> >>>> With threaded interrupts: 1320k IOPS
> >>>> With just interrupts: 3720k IOPS
> >>>> With just interrupts and my patch: 3700k IOPS
> >>>
> >>>This gap looks too big wrt. threaded interrupts vs. interrupts.
> >>>
> >>>>
> >>>> At the peak IOPS, the overall CPU usage is at around 98-99%. I think the
> >>>cost of doing wake up and context switch for NVMe threaded IRQ handler
> >>>takes some CPU away.
> >>>>
> >>>
> >>>In theory, it shouldn't be so because most of times the thread should be
> >>>running on CPUs of this hctx, and the wakeup cost shouldn't be so big.
> >>>Maybe there is performance problem somewhere wrt. threaded interrupt.
> >>>
> >>>Could you share us your test script and environment? I will see if I can
> >>>reproduce it in my environment.
> 
> Ming, do you have access to L80s_v2 in Azure? This test needs to run on that VM size.
> 
> Here is the command to benchmark it:
> 
> fio --bs=4k --ioengine=libaio --iodepth=128 --filename=/dev/nvme0n1:/dev/nvme1n1:/dev/nvme2n1:/dev/nvme3n1:/dev/nvme4n1:/dev/nvme5n1:/dev/nvme6n1:/dev/nvme7n1:/dev/nvme8n1:/dev/nvme9n1 --direct=1 --runtime=120 --numjobs=80 --rw=randread --name=test --group_reporting --gtod_reduce=1
> 

I can reproduce the issue on one machine (96 cores) with 4 NVMes (32 queues), so
each queue is served on 3 CPUs.

IOPS drops > 20% when 'use_threaded_interrupts' is enabled. From the fio log,
CPU context switches increase a lot.


Thanks,
Ming

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] fix interrupt swamp in NVMe
  2019-08-22  1:33           ` Ming Lei
@ 2019-08-22  2:00             ` Keith Busch
  2019-08-22  2:23               ` Ming Lei
  2019-08-22  9:48               ` Thomas Gleixner
  0 siblings, 2 replies; 30+ messages in thread
From: Keith Busch @ 2019-08-22  2:00 UTC (permalink / raw)
  To: Ming Lei
  Cc: Long Li, Jens Axboe, Sagi Grimberg, chenxiang, Peter Zijlstra,
	Ming Lei, John Garry, Linux Kernel Mailing List, linux-nvme,
	Keith Busch, Ingo Molnar, Thomas Gleixner, Christoph Hellwig,
	longli

On Wed, Aug 21, 2019 at 7:34 PM Ming Lei <ming.lei@redhat.com> wrote:
> On Wed, Aug 21, 2019 at 04:27:00PM +0000, Long Li wrote:
> > Here is the command to benchmark it:
> >
> > fio --bs=4k --ioengine=libaio --iodepth=128 --filename=/dev/nvme0n1:/dev/nvme1n1:/dev/nvme2n1:/dev/nvme3n1:/dev/nvme4n1:/dev/nvme5n1:/dev/nvme6n1:/dev/nvme7n1:/dev/nvme8n1:/dev/nvme9n1 --direct=1 --runtime=120 --numjobs=80 --rw=randread --name=test --group_reporting --gtod_reduce=1
> >
>
> I can reproduce the issue on one machine(96 cores) with 4 NVMes(32 queues), so
> each queue is served on 3 CPUs.
>
> IOPS drops > 20% when 'use_threaded_interrupts' is enabled. From fio log, CPU
> context switch is increased a lot.

Interestingly, use_threaded_interrupts shows a marginal improvement on
my machine with the same fio profile. It was only 5 NVMes, but they have
one queue per cpu on 112 cores.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] fix interrupt swamp in NVMe
  2019-08-22  2:00             ` Keith Busch
@ 2019-08-22  2:23               ` Ming Lei
  2019-08-22  9:48               ` Thomas Gleixner
  1 sibling, 0 replies; 30+ messages in thread
From: Ming Lei @ 2019-08-22  2:23 UTC (permalink / raw)
  To: Keith Busch
  Cc: Ming Lei, Long Li, Jens Axboe, Sagi Grimberg, chenxiang,
	Peter Zijlstra, John Garry, Linux Kernel Mailing List,
	linux-nvme, Keith Busch, Ingo Molnar, Thomas Gleixner,
	Christoph Hellwig, longli

[-- Attachment #1: Type: text/plain, Size: 1452 bytes --]

On Thu, Aug 22, 2019 at 10:00 AM Keith Busch <keith.busch@gmail.com> wrote:
>
> On Wed, Aug 21, 2019 at 7:34 PM Ming Lei <ming.lei@redhat.com> wrote:
> > On Wed, Aug 21, 2019 at 04:27:00PM +0000, Long Li wrote:
> > > Here is the command to benchmark it:
> > >
> > > fio --bs=4k --ioengine=libaio --iodepth=128 --filename=/dev/nvme0n1:/dev/nvme1n1:/dev/nvme2n1:/dev/nvme3n1:/dev/nvme4n1:/dev/nvme5n1:/dev/nvme6n1:/dev/nvme7n1:/dev/nvme8n1:/dev/nvme9n1 --direct=1 --runtime=120 --numjobs=80 --rw=randread --name=test --group_reporting --gtod_reduce=1
> > >
> >
> > I can reproduce the issue on one machine(96 cores) with 4 NVMes(32 queues), so
> > each queue is served on 3 CPUs.
> >
> > IOPS drops > 20% when 'use_threaded_interrupts' is enabled. From fio log, CPU
> > context switch is increased a lot.
>
> Interestingly use_threaded_interrupts shows a marginal improvement on
> my machine with the same fio profile. It was only 5 NVMes, but they've
> one queue per-cpu on 112 cores.

Haven't investigated it yet.

BTW, my fio test is only done on a single hw queue via 'taskset -c
$cpu_list_of_the_queue', without applying the threaded interrupt affinity
patch. The NVMe is Optane.

The same issue can be reproduced after I force 1:1 mapping by passing
'possible_cpus=32' on the kernel cmd line.

It may be related to kernel options, so I attached the one I used; basically
it is a subset of the RHEL8 kernel config.

Thanks,
Ming Lei

[-- Attachment #2: config.tar.gz --]
[-- Type: application/gzip, Size: 31530 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] fix interrupt swamp in NVMe
  2019-08-22  2:00             ` Keith Busch
  2019-08-22  2:23               ` Ming Lei
@ 2019-08-22  9:48               ` Thomas Gleixner
  1 sibling, 0 replies; 30+ messages in thread
From: Thomas Gleixner @ 2019-08-22  9:48 UTC (permalink / raw)
  To: Keith Busch
  Cc: Ming Lei, Long Li, Jens Axboe, Sagi Grimberg, chenxiang,
	Peter Zijlstra, Ming Lei, John Garry, Linux Kernel Mailing List,
	linux-nvme, Keith Busch, Ingo Molnar, Christoph Hellwig, longli

On Wed, 21 Aug 2019, Keith Busch wrote:
> On Wed, Aug 21, 2019 at 7:34 PM Ming Lei <ming.lei@redhat.com> wrote:
> > On Wed, Aug 21, 2019 at 04:27:00PM +0000, Long Li wrote:
> > > Here is the command to benchmark it:
> > >
> > > fio --bs=4k --ioengine=libaio --iodepth=128 --filename=/dev/nvme0n1:/dev/nvme1n1:/dev/nvme2n1:/dev/nvme3n1:/dev/nvme4n1:/dev/nvme5n1:/dev/nvme6n1:/dev/nvme7n1:/dev/nvme8n1:/dev/nvme9n1 --direct=1 --runtime=120 --numjobs=80 --rw=randread --name=test --group_reporting --gtod_reduce=1
> > >
> >
> > I can reproduce the issue on one machine(96 cores) with 4 NVMes(32 queues), so
> > each queue is served on 3 CPUs.
> >
> > IOPS drops > 20% when 'use_threaded_interrupts' is enabled. From fio log, CPU
> > context switch is increased a lot.
> 
> Interestingly use_threaded_interrupts shows a marginal improvement on
> my machine with the same fio profile. It was only 5 NVMes, but they've
> one queue per-cpu on 112 cores.

Which is not surprising because the thread and the hard interrupt are on
the same CPU and there is just that little overhead of the context switch.

The thing is that this really depends on how the scheduler decides to place
the interrupt thread.

If you have a queue for several CPUs, then depending on the load situation
allowing a multi-cpu affinity for the thread can cause lots of task
migration.

But restricting the irq thread to the CPU on which the interrupt is affine
can also starve that CPU. There is no universal rule for that.

Tracing should tell.

Thanks,

	tglx




^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/3] nvme: complete request in work queue on CPU with flooded interrupts
  2019-08-20 17:33   ` Sagi Grimberg
  2019-08-21  8:39     ` Long Li
@ 2019-08-23  3:21     ` Ming Lei
  2019-08-24  0:27       ` Long Li
  1 sibling, 1 reply; 30+ messages in thread
From: Ming Lei @ 2019-08-23  3:21 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: longli, Ingo Molnar, Peter Zijlstra, Keith Busch, Jens Axboe,
	Christoph Hellwig, linux-nvme, linux-kernel, Long Li,
	Hannes Reinecke, linux-scsi, linux-block

On Tue, Aug 20, 2019 at 10:33:38AM -0700, Sagi Grimberg wrote:
> 
> > From: Long Li <longli@microsoft.com>
> > 
> > When a NVMe hardware queue is mapped to several CPU queues, it is possible
> > that the CPU this hardware queue is bound to is flooded by returning I/O for
> > other CPUs.
> > 
> > For example, consider the following scenario:
> > 1. CPU 0, 1, 2 and 3 share the same hardware queue
> > 2. the hardware queue interrupts CPU 0 for I/O response
> > 3. processes from CPU 1, 2 and 3 keep sending I/Os
> > 
> > CPU 0 may be flooded with interrupts from NVMe device that are I/O responses
> > for CPU 1, 2 and 3. Under heavy I/O load, it is possible that CPU 0 spends
> > all the time serving NVMe and other system interrupts, but doesn't have a
> > chance to run in process context.
> > 
> > To fix this, CPU 0 can schedule a work to complete the I/O request when it
> > detects the scheduler is not making progress. This serves multiple purposes:
> > 
> > 1. This CPU has to be scheduled to complete the request. The other CPUs can't
> > issue more I/Os until some previous I/Os are completed. This helps this CPU
> > get out of NVMe interrupts.
> > 
> > 2. This acts a throttling mechanisum for NVMe devices, in that it can not
> > starve a CPU while servicing I/Os from other CPUs.
> > 
> > 3. This CPU can make progress on RCU and other work items on its queue.
> 
> The problem is indeed real, but this is the wrong approach in my mind.
> 
> We already have irqpoll which takes care proper budgeting polling
> cycles and not hogging the cpu.

The issue isn't unique to NVMe; it can happen with any fast device that
interrupts the CPU too frequently. Meanwhile the interrupt/softirq handler may
take quite a bit of time, so the CPU can easily be locked up by the
interrupt/softirq handler, especially in the case of multiple submission CPUs
vs. a single completion CPU.

Some SCSI devices have the same problem too.

Could we consider to add one generic mechanism to cover this kind of
problem?

One approach I thought of is to allocate one backup thread for handling
such interrupts, which can be marked as IRQF_BACKUP_THREAD by drivers.

Inside do_IRQ(), irq time is already accounted. Before calling
action->handler(), check whether this CPU has spent too long handling IRQ
work (interrupt or softirq) and could be locked up. If so, wake up the backup
thread to handle the interrupt, to avoid locking up this CPU.
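
Roughly something like the below, as an illustration only (IRQF_BACKUP_THREAD
and cpu_irq_overloaded() are made-up names, and the hook point is simplified):

	/* Hypothetical predicate, e.g. based on accounted irq time. */
	static bool cpu_irq_overloaded(void);

	/* In handle_irq_event_percpu(), before running the primary handler: */
	if ((action->flags & IRQF_BACKUP_THREAD) && cpu_irq_overloaded()) {
		/* Punt to the backup thread instead of running in hard irq. */
		__irq_wake_thread(desc, action);
		res = IRQ_HANDLED;
	} else {
		res = action->handler(irq, action->dev_id);
	}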

The threaded interrupt framework is already there, so this way could be easier
to implement. Meantime, most of the time the handler still runs in interrupt
context, so we may avoid the performance loss when the CPU isn't busy enough.

Any comment on this approach?

Thanks,
Ming

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCH 3/3] nvme: complete request in work queue on CPU with flooded interrupts
  2019-08-21 21:54         ` Sagi Grimberg
@ 2019-08-24  0:13           ` Long Li
  0 siblings, 0 replies; 30+ messages in thread
From: Long Li @ 2019-08-24  0:13 UTC (permalink / raw)
  To: Sagi Grimberg, longli, Ingo Molnar, Peter Zijlstra, Keith Busch,
	Jens Axboe, Christoph Hellwig, linux-nvme, linux-kernel

>>>Subject: Re: [PATCH 3/3] nvme: complete request in work queue on CPU
>>>with flooded interrupts
>>>
>>>
>>>> Sagi,
>>>>
>>>> Here are the test results.
>>>>
>>>> Benchmark command:
>>>> fio --bs=4k --ioengine=libaio --iodepth=64
>>>> --
>>>filename=/dev/nvme0n1:/dev/nvme1n1:/dev/nvme2n1:/dev/nvme3n1:/d
>>>ev/nv
>>>>
>>>me4n1:/dev/nvme5n1:/dev/nvme6n1:/dev/nvme7n1:/dev/nvme8n1:/dev
>>>/nvme9n1
>>>> --direct=1 --runtime=90 --numjobs=80 --rw=randread --name=test
>>>> --group_reporting --gtod_reduce=1
>>>>
>>>> With your patch: 1720k IOPS
>>>> With threaded interrupts: 1320k IOPS
>>>> With just interrupts: 3720k IOPS
>>>>
>>>> Interrupts are the fastest but we need to find a way to throttle it.
>>>
>>>This is the workload that generates the flood?
>>>If so I did not expect that this would be the perf difference..
>>>
>>>If completions keep pounding on the cpu, I would expect irqpoll to simply
>>>keep running forever and poll the cqs. There is no fundamental reason why
>>>polling would be faster in an interrupt, what matters could be:
>>>1. we reschedule more than we need to
>>>2. we don't reap enough completions in every poll round, which will trigger
>>>rearming the interrupt and then when it fires reschedule another softirq...
>>>

Yes, I think it's the rescheduling that takes some of the CPU. With the patch there is a lot of ksoftirqd activity (compared to nearly none without the patch).
A 90-second FIO run shows a big difference in context switches across all CPUs:
With patch: 5755849
Without patch: 1462931

>>>Maybe we need to take care of some irq_poll optimizations?
>>>
>>>Does this (untested) patch make any difference?
>>>--
>>>diff --git a/lib/irq_poll.c b/lib/irq_poll.c index 2f17b488d58e..0e94183eba15
>>>100644
>>>--- a/lib/irq_poll.c
>>>+++ b/lib/irq_poll.c
>>>@@ -12,7 +12,8 @@
>>>  #include <linux/irq_poll.h>
>>>  #include <linux/delay.h>
>>>
>>>-static unsigned int irq_poll_budget __read_mostly = 256;
>>>+static unsigned int irq_poll_budget __read_mostly = 3000; unsigned int
>>>+__read_mostly irqpoll_budget_usecs = 2000;
>>>
>>>  static DEFINE_PER_CPU(struct list_head, blk_cpu_iopoll);
>>>
>>>@@ -77,32 +78,26 @@ EXPORT_SYMBOL(irq_poll_complete);
>>>
>>>  static void __latent_entropy irq_poll_softirq(struct softirq_action *h)
>>>  {
>>>-       struct list_head *list = this_cpu_ptr(&blk_cpu_iopoll);
>>>-       int rearm = 0, budget = irq_poll_budget;
>>>-       unsigned long start_time = jiffies;
>>>+       struct list_head *irqpoll_list = this_cpu_ptr(&blk_cpu_iopoll);
>>>+       unsigned int budget = irq_poll_budget;
>>>+       unsigned long time_limit =
>>>+               jiffies + usecs_to_jiffies(irqpoll_budget_usecs);
>>>+       LIST_HEAD(list);
>>>
>>>         local_irq_disable();
>>>+       list_splice_init(irqpoll_list, &list);
>>>+       local_irq_enable();
>>>
>>>-       while (!list_empty(list)) {
>>>+       while (!list_empty(&list)) {
>>>                 struct irq_poll *iop;
>>>                 int work, weight;
>>>
>>>-               /*
>>>-                * If softirq window is exhausted then punt.
>>>-                */
>>>-               if (budget <= 0 || time_after(jiffies, start_time)) {
>>>-                       rearm = 1;
>>>-                       break;
>>>-               }
>>>-
>>>-               local_irq_enable();
>>>-
>>>                 /* Even though interrupts have been re-enabled, this
>>>                  * access is safe because interrupts can only add new
>>>                  * entries to the tail of this list, and only ->poll()
>>>                  * calls can remove this head entry from the list.
>>>                  */
>>>-               iop = list_entry(list->next, struct irq_poll, list);
>>>+               iop = list_first_entry(&list, struct irq_poll, list);
>>>
>>>                 weight = iop->weight;
>>>                 work = 0;
>>>@@ -111,8 +106,6 @@ static void __latent_entropy irq_poll_softirq(struct
>>>softirq_action *h)
>>>
>>>                 budget -= work;
>>>
>>>-               local_irq_disable();
>>>-
>>>                 /*
>>>                  * Drivers must not modify the iopoll state, if they
>>>                  * consume their assigned weight (or more, some drivers can't @@
>>>-125,11 +118,21 @@ static void __latent_entropy irq_poll_softirq(struct
>>>softirq_action *h)
>>>                         if (test_bit(IRQ_POLL_F_DISABLE, &iop->state))
>>>                                 __irq_poll_complete(iop);
>>>                         else
>>>-                               list_move_tail(&iop->list, list);
>>>+                               list_move_tail(&iop->list, &list);
>>>                 }
>>>+
>>>+               /*
>>>+                * If softirq window is exhausted then punt.
>>>+                */
>>>+               if (budget <= 0 || time_after_eq(jiffies, time_limit))
>>>+                       break;
>>>         }
>>>
>>>-       if (rearm)
>>>+       local_irq_disable();
>>>+
>>>+       list_splice_tail_init(irqpoll_list, &list);
>>>+       list_splice(&list, irqpoll_list);
>>>+       if (!list_empty(irqpoll_list))
>>>                 __raise_softirq_irqoff(IRQ_POLL_SOFTIRQ);
>>>
>>>         local_irq_enable();
>>>--

It got slightly better IOPS. With this 2nd patch, the number of context switches is 5445863 (a ~5% improvement over the 1st patch).

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCH 3/3] nvme: complete request in work queue on CPU with flooded interrupts
  2019-08-23  3:21     ` Ming Lei
@ 2019-08-24  0:27       ` Long Li
  2019-08-24 12:55         ` Ming Lei
  0 siblings, 1 reply; 30+ messages in thread
From: Long Li @ 2019-08-24  0:27 UTC (permalink / raw)
  To: Ming Lei, Sagi Grimberg
  Cc: longli, Ingo Molnar, Peter Zijlstra, Keith Busch, Jens Axboe,
	Christoph Hellwig, linux-nvme, linux-kernel, Hannes Reinecke,
	linux-scsi, linux-block

>>>Subject: Re: [PATCH 3/3] nvme: complete request in work queue on CPU
>>>with flooded interrupts
>>>
>>>On Tue, Aug 20, 2019 at 10:33:38AM -0700, Sagi Grimberg wrote:
>>>>
>>>> > From: Long Li <longli@microsoft.com>
>>>> >
>>>> > When a NVMe hardware queue is mapped to several CPU queues, it is
>>>> > possible that the CPU this hardware queue is bound to is flooded by
>>>> > returning I/O for other CPUs.
>>>> >
>>>> > For example, consider the following scenario:
>>>> > 1. CPU 0, 1, 2 and 3 share the same hardware queue 2. the hardware
>>>> > queue interrupts CPU 0 for I/O response 3. processes from CPU 1, 2
>>>> > and 3 keep sending I/Os
>>>> >
>>>> > CPU 0 may be flooded with interrupts from NVMe device that are I/O
>>>> > responses for CPU 1, 2 and 3. Under heavy I/O load, it is possible
>>>> > that CPU 0 spends all the time serving NVMe and other system
>>>> > interrupts, but doesn't have a chance to run in process context.
>>>> >
>>>> > To fix this, CPU 0 can schedule a work to complete the I/O request
>>>> > when it detects the scheduler is not making progress. This serves
>>>multiple purposes:
>>>> >
>>>> > 1. This CPU has to be scheduled to complete the request. The other
>>>> > CPUs can't issue more I/Os until some previous I/Os are completed.
>>>> > This helps this CPU get out of NVMe interrupts.
>>>> >
>>>> > 2. This acts a throttling mechanisum for NVMe devices, in that it
>>>> > can not starve a CPU while servicing I/Os from other CPUs.
>>>> >
>>>> > 3. This CPU can make progress on RCU and other work items on its
>>>queue.
>>>>
>>>> The problem is indeed real, but this is the wrong approach in my mind.
>>>>
>>>> We already have irqpoll which takes care proper budgeting polling
>>>> cycles and not hogging the cpu.
>>>
>>>The issue isn't unique to NVMe, and can be any fast devices which
>>>interrupts CPU too frequently, meantime the interrupt/softirq handler may
>>>take a bit much time, then CPU is easy to be lockup by the interrupt/sofirq
>>>handler, especially in case that multiple submission CPUs vs. single
>>>completion CPU.
>>>
>>>Some SCSI devices has the same problem too.
>>>
>>>Could we consider to add one generic mechanism to cover this kind of
>>>problem?
>>>
>>>One approach I thought of is to allocate one backup thread for handling such
>>>interrupt, which can be marked as IRQF_BACKUP_THREAD by drivers.
>>>
>>>Inside do_IRQ(), irqtime is accounted, before calling action->handler(),
>>>check if this CPU has taken too long time for handling IRQ(interrupt or
>>>softirq) and see if this CPU could be lock up. If yes, wakeup the backup

How do you know whether this CPU is spending all its time in do_IRQ()?

Is it something like the following (a rough sketch follows the quoted text below)?

	if (IRQ_time / elapsed_time > threshold)
		wake up the backup thread

>>>thread to handle the interrupt, so as to avoid locking up this CPU.
>>>
>>>The threaded interrupt framework is already there, so this way could be easier
>>>to implement. Meanwhile, most of the time the handler still runs in interrupt
>>>context, and we may avoid the performance loss when the CPU isn't busy enough.
>>>
>>>Any comment on this approach?
>>>
>>>Thanks,
>>>Ming
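
To make the question above concrete, here is a rough sketch of such a ratio
check. This is only an illustration of the idea, not an existing interface:
irq_time_on(), IRQ_FLOOD_PERCENT and cpu_irq_flooded() are hypothetical names.

/*
 * Rough sketch of the IRQ-time ratio check discussed above.
 * irq_time_on() is a hypothetical per-CPU irq-time accessor and
 * IRQ_FLOOD_PERCENT an assumed threshold; neither exists in the kernel.
 */
#include <linux/types.h>
#include <linux/percpu.h>
#include <linux/sched/clock.h>

#define IRQ_FLOOD_PERCENT	90	/* assumed threshold */

struct irq_ratio_sample {
	u64 last_clock;		/* sched_clock() at the last sample */
	u64 last_irq_time;	/* accumulated irq time at the last sample */
};

static DEFINE_PER_CPU(struct irq_ratio_sample, irq_ratio_samples);

/*
 * Return true if this CPU spent more than IRQ_FLOOD_PERCENT of the time
 * elapsed since the previous sample in hard/soft irq context.
 */
static bool cpu_irq_flooded(int cpu)
{
	struct irq_ratio_sample *s = per_cpu_ptr(&irq_ratio_samples, cpu);
	u64 now = sched_clock();
	u64 irq = irq_time_on(cpu);		/* hypothetical accessor */
	u64 elapsed = now - s->last_clock;
	u64 irq_delta = irq - s->last_irq_time;
	bool flooded = elapsed &&
		       irq_delta * 100 > elapsed * IRQ_FLOOD_PERCENT;

	s->last_clock = now;
	s->last_irq_time = irq;

	return flooded;
}

In such a scheme, do_IRQ() could sample cpu_irq_flooded(smp_processor_id())
before invoking action->handler() and wake the backup thread when it returns
true.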

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/3] nvme: complete request in work queue on CPU with flooded interrupts
  2019-08-24  0:27       ` Long Li
@ 2019-08-24 12:55         ` Ming Lei
  0 siblings, 0 replies; 30+ messages in thread
From: Ming Lei @ 2019-08-24 12:55 UTC (permalink / raw)
  To: Long Li
  Cc: Sagi Grimberg, longli, Ingo Molnar, Peter Zijlstra, Keith Busch,
	Jens Axboe, Christoph Hellwig, linux-nvme, linux-kernel,
	Hannes Reinecke, linux-scsi, linux-block

On Sat, Aug 24, 2019 at 12:27:18AM +0000, Long Li wrote:
> >>>Subject: Re: [PATCH 3/3] nvme: complete request in work queue on CPU
> >>>with flooded interrupts
> >>>
> >>>On Tue, Aug 20, 2019 at 10:33:38AM -0700, Sagi Grimberg wrote:
> >>>>
> >>>> > From: Long Li <longli@microsoft.com>
> >>>> >
> >>>> > When a NVMe hardware queue is mapped to several CPU queues, it is
> >>>> > possible that the CPU this hardware queue is bound to is flooded by
> >>>> > returning I/O for other CPUs.
> >>>> >
> >>>> > For example, consider the following scenario:
> >>>> > 1. CPU 0, 1, 2 and 3 share the same hardware queue
> >>>> > 2. the hardware queue interrupts CPU 0 for I/O response
> >>>> > 3. processes from CPU 1, 2 and 3 keep sending I/Os
> >>>> >
> >>>> > CPU 0 may be flooded with interrupts from NVMe device that are I/O
> >>>> > responses for CPU 1, 2 and 3. Under heavy I/O load, it is possible
> >>>> > that CPU 0 spends all the time serving NVMe and other system
> >>>> > interrupts, but doesn't have a chance to run in process context.
> >>>> >
> >>>> > To fix this, CPU 0 can schedule a work to complete the I/O request
> >>>> > when it detects the scheduler is not making progress. This serves
> >>>multiple purposes:
> >>>> >
> >>>> > 1. This CPU has to be scheduled to complete the request. The other
> >>>> > CPUs can't issue more I/Os until some previous I/Os are completed.
> >>>> > This helps this CPU get out of NVMe interrupts.
> >>>> >
> >>>> > 2. This acts as a throttling mechanism for NVMe devices, in that it
> >>>> > cannot starve a CPU while servicing I/Os from other CPUs.
> >>>> >
> >>>> > 3. This CPU can make progress on RCU and other work items on its
> >>>queue.
> >>>>
> >>>> The problem is indeed real, but this is the wrong approach in my mind.
> >>>>
> >>>> We already have irqpoll which takes care proper budgeting polling
> >>>> cycles and not hogging the cpu.
> >>>
> >>>The issue isn't unique to NVMe; it can happen with any fast device that
> >>>interrupts a CPU too frequently. Meanwhile, the interrupt/softirq handler may
> >>>take quite a bit of time, so the CPU can easily be locked up by the
> >>>interrupt/softirq handler, especially in the case of multiple submission CPUs
> >>>vs. a single completion CPU.
> >>>
> >>>Some SCSI devices have the same problem too.
> >>>
> >>>Could we consider adding one generic mechanism to cover this kind of
> >>>problem?
> >>>
> >>>One approach I thought of is to allocate one backup thread for handling such
> >>>interrupts, which can be marked as IRQF_BACKUP_THREAD by drivers.
> >>>
> >>>Inside do_IRQ(), irqtime is accounted; before calling action->handler(),
> >>>check whether this CPU has spent too long handling IRQs (interrupt or
> >>>softirq) and whether this CPU could lock up. If yes, wake up the backup
> 
> How do you know whether this CPU is spending all its time in do_IRQ()?
> 
> Is it something like:
> 	if (IRQ_time / elapsed_time > threshold)
> 		wake up the backup thread

Yeah, the above could work in theory.

Another approach I thought of is to monitor average irq gap time on each
CPU.

We could use an EWMA (Exponentially Weighted Moving Average) to do it simply,
such as:

	curr_irq_gap(cpu) = current start time of do_IRQ() on 'cpu' -
			end time of last do_IRQ() on 'cpu'
	avg_irq_gap(cpu) = weight_prev * avg_irq_gap(cpu) + weight_curr * curr_irq_gap(cpu) 

	note:
		weight_prev + weight_curr = 1

When avg_irq_gap(cpu) drops below a small enough threshold, we consider an irq
flood to be detected.

'weight_prev' could be chosen as a big enough value so that short-lived bursts
don't trigger detection.
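
To make this concrete, below is a minimal sketch of the EWMA bookkeeping, under
the assumption of weight_prev = 3/4, weight_curr = 1/4 and a fixed nanosecond
threshold. All names here (irq_flood_check(), irq_flood_note_end(),
IRQ_FLOOD_GAP_NS) are illustrative only and not existing kernel APIs.

/*
 * Minimal sketch of EWMA-based irq-gap tracking as described above.
 * Assumes weight_prev = 3/4 and weight_curr = 1/4.
 */
#include <linux/types.h>
#include <linux/percpu.h>
#include <linux/sched/clock.h>

/* Average gap below which the CPU is considered flooded (assumption). */
#define IRQ_FLOOD_GAP_NS	2000

struct irq_gap_stat {
	u64 last_irq_end;	/* end time of the last do_IRQ() on this CPU */
	u64 avg_irq_gap;	/* EWMA of curr_irq_gap */
};

static DEFINE_PER_CPU(struct irq_gap_stat, irq_gap_stats);

/* Called at the start of do_IRQ(); returns true when a flood is detected. */
static bool irq_flood_check(void)
{
	struct irq_gap_stat *s = this_cpu_ptr(&irq_gap_stats);
	u64 gap = sched_clock() - s->last_irq_end;

	/* avg = 3/4 * avg + 1/4 * curr */
	s->avg_irq_gap = (3 * s->avg_irq_gap + gap) >> 2;

	return s->last_irq_end && s->avg_irq_gap < IRQ_FLOOD_GAP_NS;
}

/* Called when do_IRQ() returns, to record the end time of this interrupt. */
static void irq_flood_note_end(void)
{
	this_cpu_ptr(&irq_gap_stats)->last_irq_end = sched_clock();
}

do_IRQ() could call irq_flood_check() on entry and irq_flood_note_end() on
exit; a true return from irq_flood_check() would be the point to fall back to
the backup thread.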


Thanks,
Ming

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2019-08-24 12:56 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-20  6:14 [PATCH 0/3] fix interrupt swamp in NVMe longli
2019-08-20  6:14 ` [PATCH 1/3] sched: define a function to report the number of context switches on a CPU longli
2019-08-20  9:38   ` Peter Zijlstra
2019-08-21  8:20     ` Long Li
2019-08-21 10:34       ` Peter Zijlstra
2019-08-20  9:39   ` Peter Zijlstra
2019-08-20  6:14 ` [PATCH 2/3] sched: export idle_cpu() longli
2019-08-20  6:14 ` [PATCH 3/3] nvme: complete request in work queue on CPU with flooded interrupts longli
2019-08-20  9:52   ` Peter Zijlstra
2019-08-21  8:37     ` Long Li
2019-08-21 10:35       ` Peter Zijlstra
2019-08-20 17:33   ` Sagi Grimberg
2019-08-21  8:39     ` Long Li
2019-08-21 17:36       ` Long Li
2019-08-21 21:54         ` Sagi Grimberg
2019-08-24  0:13           ` Long Li
2019-08-23  3:21     ` Ming Lei
2019-08-24  0:27       ` Long Li
2019-08-24 12:55         ` Ming Lei
2019-08-20  8:25 ` [PATCH 0/3] fix interrupt swamp in NVMe Ming Lei
2019-08-20  8:59   ` John Garry
2019-08-20 15:05     ` Keith Busch
2019-08-21  7:47     ` Long Li
2019-08-21  9:44       ` Ming Lei
2019-08-21 10:03         ` John Garry
2019-08-21 16:27         ` Long Li
2019-08-22  1:33           ` Ming Lei
2019-08-22  2:00             ` Keith Busch
2019-08-22  2:23               ` Ming Lei
2019-08-22  9:48               ` Thomas Gleixner
