On 08/11/2019 18:51, Jens Axboe wrote:
> It's useful for the application to know if the kernel had to dip into
> using the backlog to prevent overflows. Let's keep on accounting any
> overflow in cq_ring->overflow, even if we handled it correctly. As it's
> impossible to get dropped events with IORING_SETUP_CQ_NODROP, overflow
> with CQ_NODROP enabled simply provides a hint to the application that it
> may reconsider using a bigger ring.
>
> Signed-off-by: Jens Axboe
>
> ---
>
> Since this hasn't been released yet, we can tweak the behavior a bit. I
> think it makes sense to still account the overflows, even if we handled
> them correctly. If the application doesn't care, it simply doesn't need
> to look at cq_ring->overflow if it is using CQ_NODROP. But it may care,
> as it is less efficient than a suitably sized ring.
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 94ec44caac00..aa3b6149dfe9 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -666,10 +666,10 @@ static void io_cqring_overflow(struct io_ring_ctx *ctx, struct io_kiocb *req,
>  			       long res)
>  	__must_hold(&ctx->completion_lock)
>  {
> -	if (!(ctx->flags & IORING_SETUP_CQ_NODROP)) {
> -		WRITE_ONCE(ctx->rings->cq_overflow,
> -			   atomic_inc_return(&ctx->cached_cq_overflow));
> -	} else {
> +	WRITE_ONCE(ctx->rings->cq_overflow,
> +		   atomic_inc_return(&ctx->cached_cq_overflow));
> +
> +	if (ctx->flags & IORING_SETUP_CQ_NODROP) {

We used cq_overflow to fix __io_sequence_defer(). This breaks the
assumption:

    cached_cq_tail + cached_cq_overflow == total number of handled completions

First we account the overflow, and then later add it to the CQ ring
(i.e. cached_cq_tail++) in io_cqring_overflow_flush(), so the same
completion ends up counted twice.

>  		refcount_inc(&req->refs);
>  		req->result = res;
>  		list_add_tail(&req->list, &ctx->cq_overflow_list);

-- 
Pavel Begunkov
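
To make the broken invariant concrete, here is a minimal userspace model
of the accounting described above. This is an illustrative sketch, not
the kernel code: the cq_model struct and the overflow_cqe()/
flush_overflow() helpers are hypothetical stand-ins for
ctx->cached_cq_tail, ctx->cached_cq_overflow, io_cqring_overflow(), and
io_cqring_overflow_flush().

/*
 * Userspace model of the CQ_NODROP overflow accounting (sketch only;
 * all names here are stand-ins, not the in-tree io_uring symbols).
 */
#include <stdio.h>

struct cq_model {
	unsigned int tail;	/* models ctx->cached_cq_tail */
	unsigned int overflow;	/* models ctx->cached_cq_overflow */
	unsigned int handled;	/* ground truth: CQEs delivered to the app */
};

/* CQ ring full at completion time: with the patch above, overflow is
 * accounted even under CQ_NODROP, and the request is parked on the
 * backlog (cq_overflow_list) rather than dropped. */
static void overflow_cqe(struct cq_model *cq)
{
	cq->overflow++;		/* io_cqring_overflow() accounting */
}

/* Later the backlog is flushed into the CQ ring, as in
 * io_cqring_overflow_flush(), and the app finally sees the CQE. */
static void flush_overflow(struct cq_model *cq)
{
	cq->tail++;		/* cached_cq_tail++ */
	cq->handled++;
}

int main(void)
{
	struct cq_model cq = { 0, 0, 0 };

	overflow_cqe(&cq);	/* CQ was full when the request completed */
	flush_overflow(&cq);	/* backlog flushed on the next wait */

	/* __io_sequence_defer() relies on:
	 *	cached_cq_tail + cached_cq_overflow == handled completions
	 * but the flushed CQE was counted twice: */
	printf("tail + overflow = %u, handled = %u\n",
	       cq.tail + cq.overflow, cq.handled);	/* prints 2, 1 */
	return 0;
}

The mismatch is exactly the double count described above: a single
completion bumps cached_cq_overflow when it is parked on the backlog and
bumps cached_cq_tail again when it is flushed into the ring.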