[PATCH 0/6] return an error when cqe is dropped

* [PATCH 0/6] return an error when cqe is dropped
@ 2022-04-21  9:13 Dylan Yudaken
  2022-04-21  9:13 ` [PATCH 1/6] io_uring: add trace support for CQE overflow Dylan Yudaken
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Dylan Yudaken @ 2022-04-21  9:13 UTC (permalink / raw)
  To: io-uring; +Cc: axboe, asml.silence, linux-kernel, kernel-team, Dylan Yudaken

This series addresses a rare but real error condition when a CQE is
dropped. Many applications rely on 1 SQE resulting in 1 CQE and may even
block waiting for the CQE. In overflow conditions if the GFP_ATOMIC
allocation fails, the CQE is dropped and a counter is incremented. However
the application is not actively signalled that something bad has
happened. We would like to indicate this error condition to the
application but in a way that does not rely on the application doing
invasive changes such as checking a flag before each wait.

This series returns an error code to the application when the error hits,
and then resets the error condition. If the application is ok with this
error it can continue as is, or more likely it can clean up sanely.

Patches 1&2 add tracing for overflows
Patches 3&4 prep for adding this error
Patch 5 is the main one returning an error
Patch 6 allows liburing to test these conditions more easily with IOPOLL

Dylan Yudaken (6):
  io_uring: add trace support for CQE overflow
  io_uring: trace cqe overflows
  io_uring: rework io_uring_enter to simplify return value
  io_uring: use constants for cq_overflow bitfield
  io_uring: return an error when cqe is dropped
  io_uring: allow NOP opcode in IOPOLL mode

 fs/io_uring.c                   | 89 ++++++++++++++++++++++-----------
 include/trace/events/io_uring.h | 42 +++++++++++++++-
 2 files changed, 102 insertions(+), 29 deletions(-)

base-commit: 7c648b7d6186c59ed3a0e0ae4b774aaf4b415ef2
-- 
2.30.2

^ permalink raw reply	[flat|nested] 10+ messages in thread