From: Pavel Begunkov <asml.silence@gmail.com>
To: Jens Axboe <axboe@kernel.dk>, Josh Triplett <josh@joshtriplett.org>
Cc: io-uring@vger.kernel.org,
	"David S . Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	Stefan Metzmacher <metze@samba.org>
Subject: Re: [PATCH v3 0/4] open/accept directly into io_uring fixed file table
Date: Tue, 24 Aug 2021 15:43:44 +0100
Message-ID: <ce1aba5d-3fdd-092d-9870-ff989642ffd2@gmail.com>
In-Reply-To: <c4653859-4003-70db-8b81-291dd17a6718@kernel.dk>

On 8/24/21 3:02 PM, Jens Axboe wrote:
> On 8/24/21 3:48 AM, Pavel Begunkov wrote:
>> On 8/23/21 8:40 PM, Jens Axboe wrote:
>>> On 8/23/21 1:13 PM, Josh Triplett wrote:
>>>> On Sat, Aug 21, 2021 at 08:18:12PM -0600, Jens Axboe wrote:
>>>>> On 8/21/21 9:52 AM, Pavel Begunkov wrote:
>>>>>> Add an optional feature to open/accept directly into io_uring's fixed
>>>>>> file table, bypassing the normal file table. Same behaviour as the
>>>>>> snippet below, but in one operation:
>>>>>>
>>>>>> sqe = prep_[open,accept](...);
>>>>>> cqe = submit_and_wait(sqe);
>>>>>> io_uring_register_files_update(uring_idx, (fd = cqe->res));
>>>>>> close((fd = cqe->res));
>>>>>>
>>>>>> The idea is pretty old, and was brought up and implemented a year ago
>>>>>> by Josh Triplett, though it hasn't seen the light of day for some reason.
>>>>>>
>>>>>> The behaviour is controlled by setting sqe->file_index, where 0 implies
>>>>>> the old behaviour. If a non-zero value is specified, then it will behave
>>>>>> as described and place the file into a fixed file slot
>>>>>> sqe->file_index - 1. A file table should be already created, the slot
>>>>>> should be valid and empty, otherwise the operation will fail.
>>>>>>
>>>>>> We can't use IOSQE_FIXED_FILE to switch between modes, because accept
>>>>>> takes a file, and it already uses the flag with a different meaning.
>>>>>>
>>>>>> since RFC:
>>>>>>  - added attribution
>>>>>>  - updated descriptions
>>>>>>  - rebased
>>>>>>
>>>>>> since v1:
>>>>>>  - EBADF if slot is already used (Josh Triplett)
>>>>>>  - alias index with splice_fd_in (Josh Triplett)
>>>>>>  - fix a bound check bug
>>>>>
>>>>> With the prep series, this looks good to me now. Josh, what do you
>>>>> think?
>>>>
>>>> I would still like to see this using a union with the `nofile` field in
>>>> io_open and io_accept, rather than overloading the 16-bit buf_index
>>>> field. That would avoid truncating to 16 bits, and make less work for
>>>> expansion to more than 16 bits of fixed file indexes.
>>>>
>>>> (I'd also like that to actually use a union, rather than overloading the
>>>> meaning of buf_index/nofile.)
>>>
>>> Agree, and in fact there's room in the open and accept command parts, so
>>> we can just make it a separate entry there instead of using ->buf_index.
>>> Then just pass in the index to io_install_fixed_file() instead of having
>>> it pull it from req->buf_index.
>>
>> Those are internal details and can be expanded at will in the future, if we'd
>> ever need larger tables. ->buf_index already holds indexes to different
>> resources just fine.
> 
> Sure it's internal and can always be changed, doesn't change the fact
> that it's a bit iffy that it's used differently in different spots. As
> it costs us nothing to simply add a 'fixed_file' u32 for io_accept and
> io_open, I really think that should be done instead.
> 
>> Aliasing with nofile would rather be ugly, so the only option, as you
>> mentioned, is to grab some space from open/accept structs, but I don't see
>> why we'd want it when there is a more convenient alternative.
> 
> Because it's a lot more readable and less error prone imho. Agree on the
> union, we don't have to resort to that.

Ok, I don't have a strong opinion on that. Will resend.



>>>> I personally still feel that using non-zero to signify index-plus-one is
>>>> both error-prone and not as future-compatible. I think we could do
>>>> better with no additional overhead. But I think the final call on that
>>>> interface is up to you, Jens. Do you think it'd be worth spending a flag
>>>> bit or using a different opcode, to get a cleaner interface? If you
>>>> don't, then I'd be fine with seeing this go in with just the io_open and
>>>> io_accept change.
>>>
>>> I'd be inclined to go the extra opcode route instead, as the flag only
>>> really would make sense to requests that instantiate file descriptors.
>>> For this particular case, we'd need 3 new opcodes for
>>> openat/openat2/accept, which is probably a worthwhile expenditure.
>>>
>>> Pavel, what do you think? Switch to using a different opcode for the new
>>> requests, and just grab some space in io_open and io_accept for the fd
>>> and pass it in to install.
>>
>> I don't get it, why is it even called hackish? How is that any better?
>> To me the feature looks like a natural extension to the operations, just
>> like a read can be tuned with flags, and creating new opcodes seems
>> a bit ugly, unnecessarily taking space from opcodes and adding duplication
>> (even if both versions call the same handler).
> 
> I agree that it's a natural extension, the problem is that we have to do
> unnatural things (somewhat) to make it work. I'm fine with using the
> union for the splice_fd_in to pass it in, I don't think it's a big deal.
> 
> I do wish that IORING_OP_CLOSE would work with them, though. I think we
> should do that as a followup patch. It's a bit odd to be able to open a
> file with IORING_OP_OPENAT and not be able to close it with
> IORING_OP_CLOSE. For the latter, we should just give it fixed file
> support, which would be pretty trivial.
> 
>> First, why is it not future-compatible? It's a serious argument, but I
>> don't see where it came from. Am I missing something?
>>
>> It's u32 now, and so will easily cover all indexes. SQE fields should
>> always be zeroed, that's a rule, liburing follows it, and there would
>> have been already lots of problems for users not honoring it. And there
>> will be a helper hiding all the index conversions for convenience.
>>
>> void io_uring_prep_open_direct(sqe, index, ...)
>> {
>> 	io_uring_prep_open(sqe, ...);
>> 	sqe->file_index = index + 1;
>> }
> 
> Let's keep it the way that it is, but I do want to see the buf_index
> thing go away and just req->open.fixed_file or whatever being used for
> open and accept. We should fold that in.

-- 
Pavel Begunkov

Thread overview: 12+ messages
2021-08-21 15:52 [PATCH v3 0/4] open/accept directly into io_uring fixed file table Pavel Begunkov
2021-08-21 15:52 ` [PATCH v3 1/4] net: add accept helper not installing fd Pavel Begunkov
2021-08-23 16:35   ` Jakub Kicinski
2021-08-21 15:52 ` [PATCH v3 2/4] io_uring: openat directly into fixed fd table Pavel Begunkov
2021-08-21 15:52 ` [PATCH v3 3/4] io_uring: hand code io_accept() fd installing Pavel Begunkov
2021-08-21 15:52 ` [PATCH v3 4/4] io_uring: accept directly into fixed file table Pavel Begunkov
2021-08-22  2:18 ` [PATCH v3 0/4] open/accept directly into io_uring " Jens Axboe
2021-08-23 19:13   ` Josh Triplett
2021-08-23 19:40     ` Jens Axboe
2021-08-24  9:48       ` Pavel Begunkov
2021-08-24 14:02         ` Jens Axboe
2021-08-24 14:43           ` Pavel Begunkov [this message]