* [PATCH 0/3] nvme-tcp: queue stalls under high load
@ 2022-05-19  6:26 Hannes Reinecke
From: Hannes Reinecke @ 2022-05-19  6:26 UTC
  To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke

Hi all,

one of our partners reported queue stalls and I/O timeouts under
high load. Analysis revealed extremely 'choppy' I/O behaviour when
running large transfers over low-performance links (e.g. 1GigE
networks).
We had a system with 30 queues trying to transfer 128M requests; a
simple calculation shows that transferring a _single_ request on all
queues will take up to 38 seconds, thereby timing out the last request
before it got sent.
As a solution I first fixed up the timeout handler to reset the timeout
if the request is still queued or in the process of being sent. The
second patch modifies the send path to only allow new requests if we
have enough space on the TX queue, and the third breaks up the send
loop to avoid system stalls when sending large requests.
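
For reference, the back-of-the-envelope calculation above can be
reproduced with the small userspace sketch below. It is not part of the
patches. It assumes that "128M" means 128 MiB per request and that a
1GigE link delivers roughly 100 MiB/s of payload; that throughput is an
assumption chosen because it matches the 38-second figure. The
30-second timeout is the nvme_core.io_timeout default and is likewise
not stated above. The helper names are hypothetical.

```c
#include <assert.h>

/* Illustrative assumptions, not measurements from the affected system. */
#define NR_QUEUES        30     /* number of queues, from the report */
#define REQ_SIZE_MIB     128.0  /* assumes "128M" means 128 MiB per request */
#define LINK_MIB_PER_SEC 100.0  /* assumed effective 1GigE payload rate */
#define IO_TIMEOUT_SEC   30.0   /* assumed default nvme_core.io_timeout */

/*
 * Seconds needed to send one request on every queue when all queues
 * share a single link of the given throughput.
 */
static double drain_time_sec(unsigned int nr_queues, double req_mib,
			     double link_mib_per_sec)
{
	assert(link_mib_per_sec > 0.0);
	return nr_queues * req_mib / link_mib_per_sec;
}

/*
 * Non-zero if the last request in line would still be waiting (or in
 * flight) when the I/O timeout fires.
 */
static int last_request_times_out(double drain_sec, double timeout_sec)
{
	return drain_sec > timeout_sec;
}
```

With the assumed values, drain_time_sec(NR_QUEUES, REQ_SIZE_MIB,
LINK_MIB_PER_SEC) comes to about 38.4 seconds, which is above the
assumed 30-second timeout, so last_request_times_out() reports a
timeout for the last request.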

As usual, comments and reviews are welcome.

Hannes Reinecke (3):
  nvme-tcp: spurious I/O timeout under high load
  nvme-tcp: Check for write space before queueing requests
  nvme-tcp: send quota for nvme_tcp_send_all()

 drivers/nvme/host/tcp.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

-- 
2.29.2




end of thread, other threads:[~2022-05-24  9:59 UTC | newest]

Thread overview: 24+ messages
2022-05-19  6:26 [PATCH 0/3] nvme-tcp: queue stalls under high load Hannes Reinecke
2022-05-19  6:26 ` [PATCH 1/3] nvme-tcp: spurious I/O timeout " Hannes Reinecke
2022-05-20  9:05   ` Sagi Grimberg
2022-05-23  8:42     ` Hannes Reinecke
2022-05-23 13:36       ` Sagi Grimberg
2022-05-23 14:01         ` Hannes Reinecke
2022-05-23 15:05           ` Sagi Grimberg
2022-05-23 16:07             ` Hannes Reinecke
2022-05-24  7:57               ` Sagi Grimberg
2022-05-24  8:08                 ` Hannes Reinecke
2022-05-24  8:53                   ` Sagi Grimberg
2022-05-24  9:34                     ` Hannes Reinecke
2022-05-24  9:58                       ` Sagi Grimberg
2022-05-19  6:26 ` [PATCH 2/3] nvme-tcp: Check for write space before queueing requests Hannes Reinecke
2022-05-20  9:17   ` Sagi Grimberg
2022-05-20 10:05     ` Hannes Reinecke
2022-05-21 20:01       ` Sagi Grimberg
2022-05-19  6:26 ` [PATCH 3/3] nvme-tcp: send quota for nvme_tcp_send_all() Hannes Reinecke
2022-05-20  9:19   ` Sagi Grimberg
2022-05-20  9:59     ` Hannes Reinecke
2022-05-21 20:02       ` Sagi Grimberg
2022-05-20  9:20 ` [PATCH 0/3] nvme-tcp: queue stalls under high load Sagi Grimberg
2022-05-20 10:01   ` Hannes Reinecke
2022-05-21 20:03     ` Sagi Grimberg
