From: Mark Papadakis <markuspapadakis@icloud.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: io-uring@vger.kernel.org
Subject: Re: io_uring and spurious wake-ups from eventfd
Date: Wed, 8 Jan 2020 09:36:17 +0200
Message-ID: <4DED8D2F-8F0B-46FB-800D-FEC3F2A5B553@icloud.com>
In-Reply-To: <60360091-ffce-fc8b-50d5-1a20fecaf047@kernel.dk>



> On 7 Jan 2020, at 10:34 PM, Jens Axboe <axboe@kernel.dk> wrote:
> 
> On 1/7/20 1:26 PM, Jens Axboe wrote:
>> On 1/7/20 8:55 AM, Mark Papadakis wrote:
>>> This is perhaps an odd request, but if it’s trivial to implement
>>> support for the feature described below, it could help others as it’d
>>> help me (I’ve been experimenting with io_uring for some time now).
>>> 
>>> Being able to register an eventfd with an io_uring context is very
>>> handy if you, e.g., have some sort of reactor thread multiplexing I/O
>>> using epoll etc., where you want to be notified when there are pending
>>> CQEs to drain. The problem, such as it is, is that this can result in
>>> unnecessary/spurious wake-ups.
>>> 
>>> If, for example, you are monitoring some non-blocking sockets for
>>> EPOLLIN, and poll reports pending bytes to read from them, and for
>>> each reported event you reserve an SQE for preadv() to read that
>>> data and then call io_uring_enter() to submit the SQEs, then, because
>>> the data is readily available, your completions will be available
>>> as soon as io_uring_enter() returns, and you can process them
>>> right away. The “problem” is that poll will wake up immediately
>>> thereafter, in the next reactor loop iteration, because the eventfd
>>> was tripped (which is reasonable but unnecessary).
>>> 
>>> What if there was a flag for io_uring_setup() so that the eventfd
>>> would only be tripped for CQEs that were processed asynchronously, or,
>>> if that’s non-trivial, only for CQEs that reference file FDs?
>>> 
>>> That’d help with that spurious wake-up.
>> 
>> One easy way to do that would be for the application to signal that it
>> doesn't want eventfd notifications for certain requests. Like using an
>> IOSQE_ flag for that. Then you could set that on the requests you submit
>> in response to triggering an eventfd event.
> 


Thanks Jens,

This is great, but perhaps there is a slightly better way to do this.
Ideally, io_uring should trip the eventfd only for new completions that were not produced in the context of an io_uring_enter() call. That is to say, if any SQEs can be served immediately (because the data is readily available in buffers/caches in the kernel), then their respective CQEs will be produced in the context of the io_uring_enter() call that submitted those SQEs, and thus the CQEs can be processed immediately after io_uring_enter() returns.
So, if any CQEs are placed in the ring at any other time, i.e. not during an io_uring_enter() call, those completions were produced asynchronously and the eventfd should be tripped; otherwise, there is no need to trip the eventfd at all.

e.g. (pseudocode):
void produce_completion(cfq_ctx *ctx, const bool in_io_uring_enter_ctx) {
        // always publish the CQE to the completion ring
        cqe_ring_push(cqe_from_ctx(ctx));
        // trip the eventfd only for completions produced outside io_uring_enter()
        if (!in_io_uring_enter_ctx && eventfd_registered()) {
                trip_iouring_eventfd();
        }
}
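
To make the intended behaviour concrete, here is a rough userspace sketch of the same rule. It is only a simulation under my own assumptions, not kernel code: struct sim_cq, sim_cq_init(), sim_post_cqe() and sim_drain_eventfd() are hypothetical names, and the only real API used is eventfd(2) with read(2)/write(2). A completion posted "inline" leaves the eventfd untouched, while one posted "asynchronously" bumps its counter.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

#define SIM_CQ_ENTRIES 8 /* hypothetical ring size */

/* Hypothetical stand-in for a completion ring; not the kernel's layout. */
struct sim_cq {
        uint64_t user_data[SIM_CQ_ENTRIES];
        unsigned head, tail;
        int evfd; /* eventfd "registered" with the ring, or -1 */
};

/* Set up an empty ring with a non-blocking eventfd attached. */
static int sim_cq_init(struct sim_cq *cq)
{
        cq->head = cq->tail = 0;
        cq->evfd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
        return cq->evfd < 0 ? -1 : 0;
}

/*
 * Post one completion. The eventfd is signalled only when the completion
 * is produced outside of the (simulated) io_uring_enter() context.
 */
static int sim_post_cqe(struct sim_cq *cq, uint64_t user_data, bool in_enter)
{
        if (cq->tail - cq->head == SIM_CQ_ENTRIES)
                return -1; /* ring full */
        cq->user_data[cq->tail++ % SIM_CQ_ENTRIES] = user_data;

        if (!in_enter && cq->evfd >= 0) {
                uint64_t one = 1;
                if (write(cq->evfd, &one, sizeof(one)) != sizeof(one))
                        return -1;
        }
        return 0;
}

/* Read and reset the eventfd counter; returns 0 if it was never tripped. */
static uint64_t sim_drain_eventfd(struct sim_cq *cq)
{
        uint64_t n = 0;
        if (read(cq->evfd, &n, sizeof(n)) != sizeof(n))
                return 0; /* EAGAIN: no pending notification */
        return n;
}
```

With this, a reactor that posts a completion with in_enter set to true and then polls the eventfd would not be woken, whereas completions posted with in_enter set to false accumulate in the eventfd counter and wake it once.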

@markpapadakis

Thread overview (11+ messages):
2020-01-07 15:55 io_uring and spurious wake-ups from eventfd Mark Papadakis
2020-01-07 20:26 ` Jens Axboe
2020-01-07 20:34   ` Jens Axboe
2020-01-08  7:36     ` Mark Papadakis [this message]
2020-01-08 16:24       ` Jens Axboe
2020-01-08 16:46         ` Mark Papadakis
2020-01-08 16:50           ` Jens Axboe
2020-01-08 17:20             ` Jens Axboe
2020-01-08 18:08               ` Jens Axboe
2020-01-09  6:09         ` Daurnimator
2020-01-09 15:14           ` Jens Axboe