All of lore.kernel.org
 help / color / mirror / Atom feed
From: Hao Xu <haoxu.linux@gmail.com>
To: Jens Axboe <axboe@kernel.dk>,
	Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>,
	io-uring@vger.kernel.org
Cc: asml.silence@gmail.com
Subject: Re: [PATCH] io_uring: add io_uring_enter(2) fixed file support
Date: Thu, 21 Apr 2022 22:16:35 +0800	[thread overview]
Message-ID: <1ec0bd06-09bf-6f36-503e-46eb579d5736@gmail.com> (raw)
In-Reply-To: <528ce414-c0fe-3318-483a-f51aa8a407b9@kernel.dk>

Hi all,

在 3.3.22 下午10:36, Jens Axboe 写道:
> On 3/3/22 6:38 AM, Jens Axboe wrote:
>> On 3/2/22 10:28 PM, Xiaoguang Wang wrote:
>>> IORING_REGISTER_FILES is a good feature to reduce fget/fput overhead for
>>> each IO we do on file, but still left one, which is io_uring_enter(2).
>>> In io_uring_enter(2), it still fget/fput io_ring fd. I have observed
>>> this overhead in some our internal oroutine implementations based on
>>> io_uring with low submit batch. To totally remove fget/fput overhead in
>>> io_uring, we may add a small struct file cache in io_uring_task and add
>>> a new IORING_ENTER_FIXED_FILE flag. Currently the capacity of this file
>>> cache is 16, wihcih I think it maybe enough, also not that this cache is
>>> per-thread.
>> Would indeed be nice to get rid of, can be a substantial amount of time
>> wasted in fdget/fdput. Does this resolve dependencies correctly if
>> someone passes the ring fd? Adding ring registration to test/ring-leak.c
>> from the liburing repo would be a useful exercise.
> Seems to pass that fine, but I did miss on first read through that you
> add that hook to files_cancel() which should break that dependency.
>
> Since I think this is a potentially big win for certain workloads, maybe
> we should consider making this easier to use? I don't think we
> necessarily need to tie this to the regular file registration. What if
> we instead added a SETUP flag for this, and just return the internal
> offset for that case? Then we don't need an enter flag, we don't need to
> add register/unregister opcodes for it.
[1]
>
> This does pose a problem when we fill the array. We can easily go beyond
> 16 here, that's just an arbitrary limit, but at some point we do have to
> handle the case where SETUP_REGISTERED (or whatever we call it) can't
> get a slot. I think we just clear the flag and setup the fd normally in
> that case. The user doesn't need to know, all the application needs to
> are about is that it can use the passed back 'fd' to call the other
> io_uring functions.
>
> The only potential oddity here is that the fd passed back is not a
> legitimate fd. io_uring does support poll(2) on its file descriptor,
[2]
> so
> that could cause some confusion even if I don't think anyone actually
> does poll(2) on io_uring.
>
> What do you think?

Can we use something like a heap based on the registered_rings arrray so 
that we can return the

real fd to the userspace meanwhilely. The disadvantage is the time cost 
is O(lgn) for each fd

registration/searching. Then we can achieve [1] and avoid [2].


Regards,

Hao

>

      parent reply	other threads:[~2022-04-21 14:16 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-03  5:28 [PATCH] io_uring: add io_uring_enter(2) fixed file support Xiaoguang Wang
2022-03-03  8:56 ` Hao Xu
2022-03-03 13:38 ` Jens Axboe
2022-03-03 14:36   ` Jens Axboe
2022-03-03 14:40     ` Jens Axboe
2022-03-03 16:31       ` Jens Axboe
2022-03-03 17:18         ` Jens Axboe
2022-03-03 20:41           ` Jens Axboe
2022-03-03 21:19             ` Jens Axboe
2022-03-04  0:07               ` Jens Axboe
2022-03-04 13:39                 ` Xiaoguang Wang
2022-03-04 13:44                   ` Jens Axboe
2022-03-04 15:16                     ` Xiaoguang Wang
2022-03-04 15:22                       ` Jens Axboe
2022-03-08  8:38                         ` Xiaoguang Wang
2022-03-08 13:10                           ` Jens Axboe
2022-03-03 22:24             ` Vito Caputo
2022-03-03 22:26               ` Jens Axboe
2022-03-04  1:49         ` Pavel Begunkov
2022-03-04  2:18           ` Jens Axboe
2022-03-04  2:28             ` Pavel Begunkov
2022-03-04  2:35               ` Pavel Begunkov
2022-03-04  2:43               ` Jens Axboe
2022-03-04  1:52         ` Pavel Begunkov
2022-03-04  2:19           ` Jens Axboe
2022-03-04  2:39             ` Pavel Begunkov
2022-03-04  3:03               ` Jens Axboe
2022-04-21 14:16     ` Hao Xu [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1ec0bd06-09bf-6f36-503e-46eb579d5736@gmail.com \
    --to=haoxu.linux@gmail.com \
    --cc=asml.silence@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=io-uring@vger.kernel.org \
    --cc=xiaoguang.wang@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.