io-uring.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Pavel Begunkov <asml.silence@gmail.com>
To: Jens Axboe <axboe@kernel.dk>, Stefan Metzmacher <metze@samba.org>
Cc: io-uring <io-uring@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Samba Technical <samba-technical@lists.samba.org>
Subject: Re: Problems replacing epoll with io_uring in tevent
Date: Wed, 26 Oct 2022 18:41:00 +0100	[thread overview]
Message-ID: <063b631b-19a7-fb24-23e7-65cbcc141554@gmail.com> (raw)
In-Reply-To: <270f3b9a-8fa6-68bf-7c57-277f107167c9@kernel.dk>

On 10/26/22 18:08, Jens Axboe wrote:
> On 10/26/22 10:00 AM, Stefan Metzmacher wrote:
>> Hi Jens,
[...]
>>>      b) The major show stopper is that IORING_OP_POLL_ADD calls fget(), while
>>>         it's pending. Which means that a close() on the related file descriptor
>>>         is not able to remove the last reference! This is a problem for points 3.d,
>>>         4.a and 4.b from above.
>>>
>>>         I doubt IORING_ASYNC_CANCEL_FD would be able to be used as there's not always
>>>         code being triggered around a raw close() syscall, which could do a sync cancel.
>>>
>>>         For now I plan to epoll_ctl (or IORING_OP_EPOLL_CTL) and only
>>>         register the fd from epoll_create() with IORING_OP_POLL_ADD
>>>         or I keep epoll_wait() as blocking call and register the io_uring fd
>>>         with epoll.
>>>
>>>         I looked at the related epoll code and found that it uses
>>>         a list in struct file->f_ep to keep the reference, which gets
>>>         detached also via eventpoll_release_file() called from __fput()
>>>
>>>         Would it be possible move IORING_OP_POLL_ADD to use a similar model
>>>         so that close() will causes a cqe with -ECANCELED?
>>
>> I'm currently trying to prototype for an IORING_POLL_CANCEL_ON_CLOSE
>> flag that can be passed to POLL_ADD. With that we'll register
>> the request in &req->file->f_uring_poll (similar to the file->f_ep list for epoll)
>> Then we only get a real reference to the file during the call to
>> vfs_poll() otherwise we drop the fget/fput reference and rely on
>> an io_uring_poll_release_file() (similar to eventpoll_release_file())
>> to cancel our registered poll request.
> 
> Yes, this is a bit tricky as we hold the file ref across the operation. I'd
> be interested in seeing your approach to this, and also how it would
> interact with registered files...

Not sure I mentioned before but shutdown(2) / IORING_OP_SHUTDOWN
usually helps. Is there anything keeping you from doing that?
Do you only poll sockets or pipes as well?


>>>      c) A simple pipe based performance test shows the following numbers:
>>>         - 'poll':               Got 232387.31 pipe events/sec
>>>         - 'epoll':              Got 251125.25 pipe events/sec
>>>         - 'samba_io_uring_ev':  Got 210998.77 pipe events/sec
>>>         So the io_uring backend is even slower than the 'poll' backend.
>>>         I guess the reason is the constant re-submission of IORING_OP_POLL_ADD.
>>
>> Added some feature autodetection today and I'm now using
>> IORING_SETUP_COOP_TASKRUN, IORING_SETUP_TASKRUN_FLAG,
>> IORING_SETUP_SINGLE_ISSUER and IORING_SETUP_DEFER_TASKRUN if supported
>> by the kernel.
>>
>> On a 6.1 kernel this improved the performance a lot, it's now faster
>> than the epoll backend.
>>
>> The key flag is IORING_SETUP_DEFER_TASKRUN. On a different system than above
>> I'm getting the following numbers:
>> - epoll:                                    Got 114450.16 pipe events/sec
>> - poll:                                     Got 105872.52 pipe events/sec
>> - samba_io_uring_ev-without-defer_taskrun': Got  95564.22 pipe events/sec
>> - samba_io_uring_ev-with-defer_taskrun':    Got 122853.85 pipe events/sec
> 
> Any chance you can do a run with just IORING_SETUP_COOP_TASKRUN set? I'm
> curious how big of an impact the IPI elimination is, where it slots in
> compared to the defer taskrun and the default settings.

And if it doesn't take too much time to test, it would also be interesting
to see if there is any impact from IORING_SETUP_SINGLE_ISSUER alone,
without TASKRUN flags.

-- 
Pavel Begunkov

  reply	other threads:[~2022-10-26 17:44 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-18 14:42 Problems replacing epoll with io_uring in tevent Stefan Metzmacher
2022-10-26 16:00 ` Stefan Metzmacher
2022-10-26 17:08   ` Jens Axboe
2022-10-26 17:41     ` Pavel Begunkov [this message]
2022-10-27  8:18       ` Stefan Metzmacher
2022-10-27  8:05     ` Stefan Metzmacher
2022-10-27 19:25       ` Stefan Metzmacher
2022-12-28 16:19         ` Stefan Metzmacher
2023-01-18 15:56           ` Jens Axboe
2023-02-01 20:29             ` Stefan Metzmacher
2022-10-27  8:51     ` Stefan Metzmacher
2022-10-27 12:12       ` Jens Axboe
2022-10-27 18:35         ` Stefan Metzmacher
2022-10-27 19:54     ` Stefan Metzmacher

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=063b631b-19a7-fb24-23e7-65cbcc141554@gmail.com \
    --to=asml.silence@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=io-uring@vger.kernel.org \
    --cc=metze@samba.org \
    --cc=samba-technical@lists.samba.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).