All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pavel Begunkov <asml.silence@gmail.com>
To: Stefan Metzmacher <metze@samba.org>,
	io-uring <io-uring@vger.kernel.org>, Jens Axboe <axboe@kernel.dk>
Cc: Jakub Kicinski <kuba@kernel.org>, netdev <netdev@vger.kernel.org>,
	Dylan Yudaken <dylany@fb.com>
Subject: Re: IORING_SEND_NOTIF_REPORT_USAGE (was Re: IORING_CQE_F_COPIED)
Date: Fri, 21 Oct 2022 12:09:09 +0100	[thread overview]
Message-ID: <273f154a-4cbd-2412-d056-a31fab5368d3@gmail.com> (raw)
In-Reply-To: <3e56c92b-567c-7bb4-2644-dc1ad1d8c3ae@samba.org>

On 10/21/22 10:36, Stefan Metzmacher wrote:
> Hi Pavel,
[...]
>> Right, I'm just tired of back porting patches by hand :)
> 
> ok, I just assumed it would be 6.1 only.

I'm fine with 6.1 only, it'd make things easier. I thought from
your first postings you wanted it 6.0. Then we don't need to care
about the placing of the copied/used flags.

>>> Otherwise we could have IORING_CQE_F_COPIED by default without opt-in
>>> flag...
> 
> Do you still want an opt-in flag to get IORING_CQE_F_COPIED?
> If so what name do you want it to be?

Ala a IORING_SEND_* flag? Yes please.

*_REPORT_USAGE was fine but I'd make it IORING_SEND_ZC_REPORT_USAGE.
And can be extended if there is more info needed in the future.

And I don't mind using a bit in cqe->res, makes cflags less polluted.


>>>>> +static void io_uring_tx_zerocopy_callback_report_usage(struct sk_buff *skb,
>>>>> +                            struct ubuf_info *uarg,
>>>>> +                            bool success)
>>>>> +{
>>>>> +    struct io_notif_data *nd = container_of(uarg, struct io_notif_data, uarg);
>>>>> +
>>>>> +    if (success && !nd->zc_used && skb)
>>>>> +        nd->zc_used = true;
>>>>> +    else if (unlikely(!success && !nd->zc_copied))
>>>>> +        nd->zc_copied = true;
>>>>
>>>> It's fine but racy, so let's WRITE_ONCE() to indicate it.
>>>
>>> I don't see how this could be a problem, but I can add it.
>>
>> It's not a problem, but better to be a little be more explicit
>> about parallel writes.
> 
> ok.
> 
>>>>> diff --git a/io_uring/notif.h b/io_uring/notif.h
>>>>> index 5b4d710c8ca5..5ac7a2745e52 100644
>>>>> --- a/io_uring/notif.h
>>>>> +++ b/io_uring/notif.h
>>>>> @@ -13,10 +13,12 @@ struct io_notif_data {
>>>>>       struct file        *file;
>>>>>       struct ubuf_info    uarg;
>>>>>       unsigned long        account_pages;
>>>>> +    bool            zc_used;
>>>>> +    bool            zc_copied;
>>>>
>>>> IIRC io_notif_data is fully packed in 6.1, so placing zc_{used,copied}
>>>> there might complicate backporting (if any). We can place them in io_kiocb
>>>> directly and move in 6.2. Alternatively account_pages doesn't have to be
>>>> long.
>>>
>>> As far as I can see kernel-dk-block/io_uring-6.1 alread has your
>>> shrink patches included...
>>
>> Sorry, I mean 6.0
> 
> So you want to backport to 6.0?
> 
> Find the current version below, sizeof(struct io_kiocb) will grow from
> 3*64 + 24 to 3*64 + 32 (on x64_64) to it stays within 4 cache lines.
> 
> I tried this first:
> 
> union {
>    u8 iopoll_completed;
>    struct {
>      u8 zc_used:1;
>      u8 zc_copied:1;
>    };
> };
> 
> But then WRITE_ONCE() complains about a bitfield write.

rightfully so, it can't be a bitfield as it would be racy
and not only in theory this time.


> So let me now about the opt-in flag and I'll prepare real commits
> including a patch that moves from struct io_kiocb to struct io_notif_data
> on top.

Yeah, better to be opt-in, but apart from it and comments above
looks good.


> diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
> index f5b687a787a3..189152ad78d6 100644
> --- a/include/linux/io_uring_types.h
> +++ b/include/linux/io_uring_types.h
> @@ -515,6 +515,9 @@ struct io_kiocb {
>       u8                opcode;
>       /* polled IO has completed */
>       u8                iopoll_completed;
> +    /* these will be moved to struct io_notif_data in 6.1 */
> +    bool                zc_used;
> +    bool                zc_copied;
>       /*
>        * Can be either a fixed buffer index, or used with provided buffers.
>        * For the latter, before issue it points to the buffer group ID,
> diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
> index ab7458033ee3..738d6234d1d9 100644
> --- a/include/uapi/linux/io_uring.h
> +++ b/include/uapi/linux/io_uring.h
> @@ -350,6 +350,7 @@ struct io_uring_cqe {
>   #define IORING_CQE_F_MORE        (1U << 1)
>   #define IORING_CQE_F_SOCK_NONEMPTY    (1U << 2)
>   #define IORING_CQE_F_NOTIF        (1U << 3)
> +#define IORING_CQE_F_COPIED        (1U << 4)
> 
>   enum {
>       IORING_CQE_BUFFER_SHIFT        = 16,
> diff --git a/io_uring/notif.c b/io_uring/notif.c
> index e37c6569d82e..033aca064b10 100644
> --- a/io_uring/notif.c
> +++ b/io_uring/notif.c
> @@ -18,6 +18,10 @@ static void __io_notif_complete_tw(struct io_kiocb *notif, bool *locked)
>           __io_unaccount_mem(ctx->user, nd->account_pages);
>           nd->account_pages = 0;
>       }
> +
> +    if (notif->zc_copied || !notif->zc_used)
> +        notif->cqe.flags |= IORING_CQE_F_COPIED;
> +

As discussed above, should depend on IORING_SEND_ZC_REPORT_USAGE

>       io_req_task_complete(notif, locked);
>   }
> 
> @@ -28,6 +32,11 @@ static void io_uring_tx_zerocopy_callback(struct sk_buff *skb,
>       struct io_notif_data *nd = container_of(uarg, struct io_notif_data, uarg);
>       struct io_kiocb *notif = cmd_to_io_kiocb(nd);
> 
> +    if (success && !notif->zc_used && skb)
> +        WRITE_ONCE(notif->zc_used, true);
> +    else if (!success && !notif->zc_copied)
> +        WRITE_ONCE(notif->zc_copied, true);
> +
>       if (refcount_dec_and_test(&uarg->refcnt)) {
>           notif->io_task_work.func = __io_notif_complete_tw;
>           io_req_task_work_add(notif);
> @@ -55,6 +64,7 @@ struct io_kiocb *io_alloc_notif(struct io_ring_ctx *ctx)
>       nd->account_pages = 0;
>       nd->uarg.flags = SKBFL_ZEROCOPY_FRAG | SKBFL_DONT_ORPHAN;
>       nd->uarg.callback = io_uring_tx_zerocopy_callback;
> +    notif->zc_used = notif->zc_copied = false;
>       refcount_set(&nd->uarg.refcnt, 1);
>       return notif;
>   }
> 

-- 
Pavel Begunkov

  reply	other threads:[~2022-10-21 11:10 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-14 11:06 IORING_CQE_F_COPIED Stefan Metzmacher
2022-10-17 16:46 ` IORING_CQE_F_COPIED Pavel Begunkov
2022-10-18  8:43   ` IORING_CQE_F_COPIED Stefan Metzmacher
2022-10-19 15:06     ` IORING_CQE_F_COPIED Pavel Begunkov
2022-10-19 16:12       ` IORING_CQE_F_COPIED Stefan Metzmacher
2022-10-20  2:24         ` IORING_CQE_F_COPIED Pavel Begunkov
2022-10-20 10:04           ` IORING_SEND_NOTIF_REPORT_USAGE (was Re: IORING_CQE_F_COPIED) Stefan Metzmacher
2022-10-20 13:46             ` Pavel Begunkov
2022-10-20 14:51               ` Stefan Metzmacher
2022-10-20 15:31                 ` Pavel Begunkov
2022-10-21  9:36                   ` Stefan Metzmacher
2022-10-21 11:09                     ` Pavel Begunkov [this message]
2022-10-21 14:03                       ` Stefan Metzmacher
2022-10-27  8:47                         ` Stefan Metzmacher
2022-10-27 10:51                         ` Pavel Begunkov
2022-10-20 10:10           ` IORING_SEND_NOTIF_USER_DATA " Stefan Metzmacher
2022-10-20 15:37             ` Pavel Begunkov
2022-10-21  8:32               ` Stefan Metzmacher
2022-10-21  9:27                 ` Pavel Begunkov
2022-10-21  9:45                   ` Stefan Metzmacher
2022-10-21 11:20                     ` Pavel Begunkov
2022-10-21 12:10                       ` Stefan Metzmacher
2022-10-21 10:15                   ` Stefan Metzmacher
2022-10-21 11:26                     ` Pavel Begunkov
2022-10-21 12:38                       ` Stefan Metzmacher

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=273f154a-4cbd-2412-d056-a31fab5368d3@gmail.com \
    --to=asml.silence@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=dylany@fb.com \
    --cc=io-uring@vger.kernel.org \
    --cc=kuba@kernel.org \
    --cc=metze@samba.org \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.