linux-kernel.vger.kernel.org archive mirror
From: David Ahern <dsahern@gmail.com>
To: Pavel Begunkov <asml.silence@gmail.com>,
	io-uring@vger.kernel.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
Cc: Jakub Kicinski <kuba@kernel.org>,
	Jonathan Lemon <jonathan.lemon@gmail.com>,
	"David S . Miller" <davem@davemloft.net>,
	Willem de Bruijn <willemb@google.com>,
	Eric Dumazet <edumazet@google.com>,
	Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>,
	David Ahern <dsahern@kernel.org>, Jens Axboe <axboe@kernel.dk>
Subject: Re: [RFC 00/12] io_uring zerocopy send
Date: Wed, 1 Dec 2021 10:57:09 -0700
Message-ID: <c4424a7a-2ef1-6524-9b10-1e7d1f1e1fe4@gmail.com>
In-Reply-To: <994e315b-fdb7-1467-553e-290d4434d853@gmail.com>

On 12/1/21 8:32 AM, Pavel Begunkov wrote:
> 
> Sure. First, for dummy I set mtu by hand, not sure can do it from
> the userspace, can I? Without it __ip_append_data() falls into
> non-zerocopy path.
> 
> diff --git a/drivers/net/dummy.c b/drivers/net/dummy.c
> index f82ad7419508..5c5aeacdabd5 100644
> --- a/drivers/net/dummy.c
> +++ b/drivers/net/dummy.c
> @@ -132,7 +132,8 @@ static void dummy_setup(struct net_device *dev)
>      eth_hw_addr_random(dev);
>  
>      dev->min_mtu = 0;
> -    dev->max_mtu = 0;
> +    dev->mtu = 0xffff;
> +    dev->max_mtu = 0xffff;
>  }
> 
> # dummy configuration
> 
> modprobe dummy numdummies=1
> ip link set dummy0 up


No change is needed to the dummy driver:
  ip li add dummy0 type dummy
  ip li set dummy0 up mtu 65536


> # force requests to <dummy_ip_addr> go through the dummy device
> ip route add <dummy_ip_addr> dev dummy0

That command is not necessary.

> 
> 
> With dummy I was just sinking the traffic to the dummy device,
> was good enough for me. Omitting "taskset" and "nice":
> 
> send-zc -4 -D <dummy_ip_addr> -t 10 udp
> 
> Similarly with msg_zerocopy:
> 
> <kernel>/tools/testing/selftests/net/msg_zerocopy -4 -p 6666 -D <dummy_ip_addr> -t 10 -z udp

I get -ENOBUFS with '-z' and any local address.
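
For anyone reproducing this outside the selftest, here is a minimal sketch of a single UDP send with MSG_ZEROCOPY that reports the error as a negative errno, so a failure such as -ENOBUFS is visible directly. The helper name zc_send_once() is hypothetical, the fallback #define values are assumed to match the generic Linux headers, and the sketch deliberately skips reading completion notifications from the error queue.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fallbacks for older libc headers; values assumed from the generic
 * Linux uapi headers and may differ on some architectures. */
#ifndef SO_ZEROCOPY
#define SO_ZEROCOPY 60
#endif
#ifndef MSG_ZEROCOPY
#define MSG_ZEROCOPY 0x4000000
#endif

/* Hypothetical helper: send one UDP datagram with MSG_ZEROCOPY.
 * Returns the number of bytes sent, or a negative errno value
 * (for example -ENOBUFS). Completion notifications are not reaped. */
static long zc_send_once(const char *ip, unsigned short port,
                         const void *buf, size_t len)
{
    struct sockaddr_in dst;
    int one = 1;
    long ret;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -errno;

    /* Zerocopy must be enabled on the socket before MSG_ZEROCOPY is used. */
    if (setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &one, sizeof(one))) {
        ret = -errno;
        goto out;
    }

    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &dst.sin_addr) != 1) {
        ret = -EINVAL;
        goto out;
    }

    ret = sendto(fd, buf, len, MSG_ZEROCOPY,
                 (struct sockaddr *)&dst, sizeof(dst));
    if (ret < 0)
        ret = -errno;
out:
    close(fd);
    return ret;
}
```

Calling zc_send_once("<dummy_ip_addr>", 6666, buf, len) and comparing the result against -ENOBUFS would show whether a given address hits the same failure.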

> 
> 
> For loopback testing, as zerocopy is not allowed for it as Willem
> explained in
> the original MSG_ZEROCOPY cover-letter, I used a hack to bypass it:
> 
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index ebb12a7d386d..42df33b175ce 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -2854,9 +2854,7 @@ static inline int skb_orphan_frags(struct sk_buff *skb, gfp_t gfp_mask)
>  /* Frags must be orphaned, even if refcounted, if skb might loop to rx path */
>  static inline int skb_orphan_frags_rx(struct sk_buff *skb, gfp_t gfp_mask)
>  {
> -    if (likely(!skb_zcopy(skb)))
> -        return 0;
> -    return skb_copy_ubufs(skb, gfp_mask);
> +    return skb_orphan_frags(skb, gfp_mask);
>  }
>  

That is the key change that is missing from your repo. All local traffic
(traffic to an address on a dummy device falls into this category) goes
through loopback. That's just the way Linux works. If you look at the
dummy driver, its xmit function just drops any packets that actually make
it there.
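
To make the effect of that hunk concrete, here is a toy userspace model of the copy decision, not kernel code. The enum and the toy_* names are hypothetical, and the behaviour of plain skb_orphan_frags() (skipping the copy for refcounted zerocopy pages) is my reading of the kernel around this version, so treat it as an assumption.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the skb properties that matter here. */
enum toy_frag_kind {
    TOY_KERNEL_PAGES,   /* ordinary skb, not zerocopy */
    TOY_ZC_REFCOUNTED,  /* zerocopy user pages with refcounted completion */
    TOY_ZC_OTHER        /* any other zerocopy skb */
};

/* Each function returns true when the frags would be copied into
 * kernel pages, i.e. when the zerocopy benefit is lost. */

/* Assumed model of skb_orphan_frags(): refcounted zerocopy is left alone. */
static bool toy_orphan_frags(enum toy_frag_kind kind)
{
    return kind == TOY_ZC_OTHER;
}

/* Unpatched skb_orphan_frags_rx(): every zerocopy skb that might loop
 * to the rx path is copied, "even if refcounted". */
static bool toy_orphan_frags_rx_stock(enum toy_frag_kind kind)
{
    return kind != TOY_KERNEL_PAGES;
}

/* With the quoted hack: the rx variant simply defers to the plain one. */
static bool toy_orphan_frags_rx_hacked(enum toy_frag_kind kind)
{
    return toy_orphan_frags(kind);
}
```

Under this model, a refcounted zerocopy skb sent to a local address is copied by the unpatched kernel and not copied with the hack, which is the difference between the two sets of numbers.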


>> mileage varies quite a bit.
> 
> Interesting, any brief notes on the setup and the results? Dummy

VM on Chromebook. I just cloned your repos, built, installed, and tested. As
mentioned above, the skb_orphan_frags_rx change is missing from your
repo, and that is the key to your reported performance gains.


Thread overview: 41+ messages
2021-11-30 15:18 [RFC 00/12] io_uring zerocopy send Pavel Begunkov
2021-11-30 15:18 ` [RFC 01/12] skbuff: add SKBFL_DONT_ORPHAN flag Pavel Begunkov
2021-11-30 15:18 ` [RFC 02/12] skbuff: pass a struct ubuf_info in msghdr Pavel Begunkov
2021-11-30 15:18 ` [RFC 03/12] net/udp: add support msgdr::msg_ubuf Pavel Begunkov
2021-11-30 15:18 ` [RFC 04/12] net: add zerocopy_sg_from_iter for bvec Pavel Begunkov
2021-11-30 15:18 ` [RFC 05/12] net: optimise page get/free for bvec zc Pavel Begunkov
2021-12-01 19:20   ` Jonathan Lemon
2021-12-01 20:17     ` Pavel Begunkov
2021-11-30 15:18 ` [RFC 06/12] io_uring: add send notifiers registration Pavel Begunkov
2021-11-30 15:18 ` [RFC 07/12] io_uring: infrastructure for send zc notifications Pavel Begunkov
2021-11-30 15:18 ` [RFC 08/12] io_uring: wire send zc request type Pavel Begunkov
2021-11-30 15:18 ` [RFC 09/12] io_uring: add an option to flush zc notifications Pavel Begunkov
2021-11-30 15:18 ` [RFC 10/12] io_uring: opcode independent fixed buf import Pavel Begunkov
2021-11-30 15:18 ` [RFC 11/12] io_uring: sendzc with fixed buffers Pavel Begunkov
2021-11-30 15:19 ` [RFC 12/12] io_uring: cache struct ubuf_info Pavel Begunkov
2021-12-01  3:10 ` [RFC 00/12] io_uring zerocopy send David Ahern
2021-12-01 15:32   ` Pavel Begunkov
2021-12-01 17:57     ` David Ahern [this message]
2021-12-01 19:11       ` Pavel Begunkov
2021-12-01 19:20         ` David Ahern
2021-12-01 20:15           ` Pavel Begunkov
2021-12-01 21:51             ` Martin KaFai Lau
2021-12-01 22:35               ` David Ahern
2021-12-01 23:07                 ` Martin KaFai Lau
2021-12-01 23:18                   ` Pavel Begunkov
2021-12-02 15:48               ` Pavel Begunkov
2021-12-02 17:40                 ` Martin KaFai Lau
2021-12-01 20:42       ` Pavel Begunkov
2021-12-01 14:31 ` Pavel Begunkov
2021-12-01 17:49   ` David Ahern
2021-12-01 19:59     ` Pavel Begunkov
2021-12-01 18:10 ` Willem de Bruijn
2021-12-01 19:59   ` Pavel Begunkov
2021-12-01 20:29     ` Pavel Begunkov
2021-12-02  0:36       ` Willem de Bruijn
2021-12-02 16:25         ` Pavel Begunkov
2021-12-02  0:32     ` Willem de Bruijn
2021-12-02 16:45       ` Pavel Begunkov
2021-12-02 21:25         ` Willem de Bruijn
2021-12-03 16:19           ` Pavel Begunkov
2021-12-03 16:30             ` Willem de Bruijn
