From: Pavel Begunkov <asml.silence@gmail.com>
To: Stefan Metzmacher <metze@samba.org>,
	io-uring <io-uring@vger.kernel.org>, Jens Axboe <axboe@kernel.dk>
Cc: Jakub Kicinski <kuba@kernel.org>, netdev <netdev@vger.kernel.org>,
	Dylan Yudaken <dylany@fb.com>
Subject: Re: IORING_CQE_F_COPIED
Date: Thu, 20 Oct 2022 03:24:21 +0100
Message-ID: <fb6a7599-8a9b-15e5-9b64-6cd9d01c6ff4@gmail.com>
In-Reply-To: <df47dbd0-75e4-5f39-58ad-ec28e50d0b9c@samba.org>

On 10/19/22 17:12, Stefan Metzmacher wrote:
> Hi Pavel,
> 
>>> As I basically use the same logic that's used to generate SO_EE_CODE_ZEROCOPY_COPIED
>>> for the native MSG_ZEROCOPY, I don't see the problem with IORING_CQE_F_COPIED.
>>> Can you be more verbose why you're thinking about something different?
>>
>> Because it feels like something that should be done roughly once and in
>> advance. Performance wise, I agree that a bunch of extra instructions in
>> the (io_uring) IO path won't make difference as the net overhead is
>> already high, but I still prefer to keep it thin. The complexity is a
>> good point though, if only we could piggy back it onto MSG_PROBE.
>> Ok, let's do IORING_CQE_F_COPIED and aim 6.2 + possibly backport.
> 
> Thanks!
> 
> Experimenting with this stuff makes me wish for a way to
> have a different 'user_data' field for the notif cqe,
> maybe based on an IORING_RECVSEND_ flag. It may make my life
> easier and would avoid some complexity in userspace,
> as I need to handle retries on short writes even with MSG_WAITALL,
> since EINTR and other errors could cause them.
> 
> What do you think?
> 
>> First, there is no more ubuf_info::zerocopy, see for-next, but you can
>> grab space in io_kiocb, io_kiocb::iopoll_completed is a good candidate.
> 
> Ok I found your "net: introduce struct ubuf_info_msgzc" and
> "net: shrink struct ubuf_info" commits.
> 
> I think the change would be trivial, the zerocopy field would just move
> to struct io_notif_data..., maybe as 'bool copied'.
> 
>> You would want to take one io_uring patch I'm going to send (will CC
>> you), with that you won't need to change anything in net/.
> 
> The problem is that e.g. tcp_sendmsg_locked() won't ever call
> the callback at all if 'zc' is false.
> 
> That's why there's the:
> 
>                          if (!zc)
>                                  uarg->zerocopy = 0;
> 
> Maybe I can invert the logic and use two variables, 'zero_copied'
> and 'copied'.
> 
> We'd start with both being false and this logic in the callback:
> 
> if (success) {
>      if (unlikely(!nd->zero_copied && !nd->copied))
>         nd->zero_copied = true;
> } else {
>      if (unlikely(!nd->copied)) {
>         nd->copied = true;
>         nd->zero_copied = false;
>      }
> }

Yep, something like that should do, but let's guard against
spurious net_zcopy_put() calls just in case.

used = false;
copied = false;

callback(skb, success, ubuf) {
	if (skb)
		used = true;
	if (!success)
		copied = true;
}
complete() {
	if (!used || copied)
		set_flag(IORING_CQE_F_COPIED);
}
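
For illustration only, the pseudocode above could be written as plain
C roughly like this. It is a minimal userspace sketch, not kernel code:
struct notif_sketch, both function names and SKETCH_CQE_F_COPIED are
hypothetical stand-ins (the real flag value would come from the uapi
header), and the skb is reduced to an opaque pointer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for IORING_CQE_F_COPIED; the value is made up. */
#define SKETCH_CQE_F_COPIED (1U << 0)

/* Hypothetical per-notification state; both flags start out false. */
struct notif_sketch {
	bool used;          /* callback ran at least once with an skb */
	bool copied;        /* at least one callback reported !success */
	uint32_t cqe_flags; /* flags that would be posted with the notif CQE */
};

/* Mirrors callback(skb, success, ubuf): only ever sets flags, never clears. */
static void notif_sketch_callback(struct notif_sketch *nd, const void *skb,
				  bool success)
{
	if (skb)
		nd->used = true;
	if (!success)
		nd->copied = true;
}

/* Mirrors complete(): report "copied" if zerocopy was never used or failed. */
static void notif_sketch_complete(struct notif_sketch *nd)
{
	if (!nd->used || nd->copied)
		nd->cqe_flags |= SKETCH_CQE_F_COPIED;
}
```

With this shape, a notification whose callback never saw an skb (for
example only a spurious put) is reported as copied, which is the guard
mentioned above.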

> And __io_notif_complete_tw still needs:
> 
>          if (!nd->zero_copied)
>                  notif->cqe.flags |= IORING_CQE_F_COPIED;

Which can be shoved into a custom callback.


>> And the last bit, let's make the zc probing conditional under IORING_RECVSEND_* flag,
>> I'll make it zero overhead when not set later by replacing the callback.
> 
> And the if statement to select a highspeed callback based on
> a IORING_RECVSEND_ flag is less overhead than
> the if statements in the slow callback version?

I'm more concerned about future changes around it, but there won't
be extra ifs.

#define COMMON_FLAGS (RECVSEND_FIRST_POLL|...)
#define ALL_FLAGS (COMMON_FLAGS|RECVSEND_PROBE)

if (flags & ~COMMON_FLAGS) {
	if (flags & ~ALL_FLAGS)
		return err;
	if (flags & RECVSEND_PROBE)
		set_callback(notif);
}
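
A self-contained sketch of that check, to show that requests using only
the common flags take a single branch. Everything here is assumed for
illustration: the SK_* flag names and values, struct prep_sketch, and
the choice of -EINVAL for "return err" are hypothetical, and
set_callback() is modelled as setting a bool.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag bits; the real IORING_RECVSEND_* values differ. */
#define SK_FIRST_POLL   (1U << 0)
#define SK_FIXED_BUF    (1U << 1)
#define SK_PROBE        (1U << 2)

#define SK_COMMON_FLAGS (SK_FIRST_POLL | SK_FIXED_BUF)
#define SK_ALL_FLAGS    (SK_COMMON_FLAGS | SK_PROBE)

/* Hypothetical request state; the bool stands in for set_callback(notif). */
struct prep_sketch {
	bool probe_callback;
};

static int prep_flags_sketch(unsigned int flags, struct prep_sketch *p)
{
	p->probe_callback = false;

	/* Fast path: nothing beyond the common flags, one test only. */
	if (flags & ~SK_COMMON_FLAGS) {
		/* Unknown bits are rejected (error code is an assumption). */
		if (flags & ~SK_ALL_FLAGS)
			return -EINVAL;
		if (flags & SK_PROBE)
			p->probe_callback = true;
	}
	return 0;
}
```

The probe check sits inside the branch that already handles uncommon
flags, so the common case pays nothing extra for it.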

-- 
Pavel Begunkov
