From: Pavel Begunkov <asml.silence@gmail.com>
To: Stefan Metzmacher <metze@samba.org>,
	io-uring <io-uring@vger.kernel.org>, Jens Axboe <axboe@kernel.dk>
Cc: Jakub Kicinski <kuba@kernel.org>, netdev <netdev@vger.kernel.org>,
	Dylan Yudaken <dylany@fb.com>
Subject: Re: IORING_CQE_F_COPIED
Date: Wed, 19 Oct 2022 16:06:50 +0100
Message-ID: <cd87b6d0-a6d6-8f24-1af4-4b8845aa669c@gmail.com>
In-Reply-To: <4bbf6bc1-ee4b-8758-7860-a06f57f35d14@samba.org>

On 10/18/22 09:43, Stefan Metzmacher wrote:
> Hi Pavel,
> 
>> On 10/14/22 12:06, Stefan Metzmacher wrote:
>>> Hi Pavel,
>>>
>>> In the tests I made I used this version of IORING_CQE_F_COPIED:
>>> https://git.samba.org/?p=metze/linux/wip.git;a=commitdiff;h=645d3b584c417a247d92d71baa6266a5f3d0d17d
>>> (also inlined at the end)
>>>
>>> Would that be something we want for 6.1? (Should I post that with a useful commit message, after doing some more tests?)
>>
>> I was thinking, can it be delivered separately but not in the same cqe?
>> The intention is to keep it off the IO path. For example, it can emit a
>> zc status CQE or maybe keep a "zc failed" counter inside the ring. Other
>> options? And we can add a separate callback for that, will make a couple
>> of things better.
>>
>> What do you think? Especially from the userspace usability perspective.
> 
> So far I can't think of any other way that would be useful yet,
> but that doesn't mean something else might exist...
> 
> IORING_CQE_F_COPIED is available per request and makes it possible
> to judge why the related SENDMSG_ZC was fast or not.
> It's also available in trace-cmd report.
> 
> Everything else would likely re-introduce similar complexity like we
> had with the notif slots.
> 
> Instead of a new IORING_CQE_F_COPIED flag we could also set
> cqe.res = SO_EE_CODE_ZEROCOPY_COPIED, but that isn't really different.
> 
> As I basically use the same logic that's used to generate SO_EE_CODE_ZEROCOPY_COPIED
> for the native MSG_ZEROCOPY, I don't see the problem with IORING_CQE_F_COPIED.
> Can you be more verbose why you're thinking about something different?

Because it feels like something that should be done roughly once and in
advance. Performance-wise, I agree that a bunch of extra instructions in
the (io_uring) IO path won't make a difference as the net overhead is
already high, but I still prefer to keep it thin. The complexity is a
good point though; if only we could piggyback it onto MSG_PROBE.
Ok, let's do IORING_CQE_F_COPIED and aim for 6.2 + possibly a backport.

First, there is no more ubuf_info::zerocopy, see for-next, but you can
grab space in io_kiocb; io_kiocb::iopoll_completed is a good candidate.
You would want to take one io_uring patch I'm going to send (will CC
you); with that you won't need to change anything in net/. And the last
bit: let's make the zc probing conditional on an IORING_RECVSEND_* flag.
I'll make it zero overhead when not set later by replacing the callback.
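
For the userspace side, a rough sketch of how an application might
consume such a per-CQE flag and keep its own "zc fell back to copy"
counter. Everything here is an assumption for illustration: the struct
only mirrors the user_data/res/flags fields of a CQE so the snippet
builds without kernel headers, and SKETCH_CQE_F_COPIED with its bit
position is hypothetical, not a uapi definition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in carrying the user_data/res/flags fields of a CQE.
 * Redefined locally so the sketch compiles without <linux/io_uring.h>. */
struct cqe_sketch {
	uint64_t user_data;
	int32_t  res;
	uint32_t flags;
};

/* HYPOTHETICAL: name and bit position are made up for this sketch. */
#define SKETCH_CQE_F_COPIED	(1U << 4)

/* Application-side counters, kept entirely in userspace. */
struct zc_stats {
	unsigned long total;	/* successful zc sends seen */
	unsigned long copied;	/* of those, how many were copied */
};

/* True if the kernel reported that the payload was copied. */
static bool cqe_was_copied(const struct cqe_sketch *cqe)
{
	return (cqe->flags & SKETCH_CQE_F_COPIED) != 0;
}

/* Account one completion; failed sends are skipped because they say
 * nothing about zerocopy vs. copy. */
static void zc_stats_account(struct zc_stats *st, const struct cqe_sketch *cqe)
{
	if (cqe->res < 0)
		return;
	st->total++;
	if (cqe_was_copied(cqe))
		st->copied++;
}
```

An application could call zc_stats_account() for each relevant
completion and fall back to plain sends if copied stays close to total,
since zerocopy buys nothing on such a path.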

-- 
Pavel Begunkov
