netdev.vger.kernel.org archive mirror
* [PATCH net-next] net: use indirect call wrappers for skb_copy_datagram_iter()
@ 2020-03-25  2:23 Eric Dumazet
  2020-03-25  4:22 ` Eric Dumazet
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Eric Dumazet @ 2020-03-25  2:23 UTC (permalink / raw)
  To: David S . Miller; +Cc: netdev, Eric Dumazet, Eric Dumazet

TCP recvmsg() calls skb_copy_datagram_iter(), which
calls an indirect function (cb pointing to simple_copy_to_iter())
for every MSS (fragment) present in the skb.

CONFIG_RETPOLINE=y forces a very expensive operation
that we can avoid thanks to indirect call wrappers.

This patch gives a 13% increase of performance on
a single flow, if the bottleneck is the thread reading
the TCP socket.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/core/datagram.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/net/core/datagram.c b/net/core/datagram.c
index 4213081c6ed3d4fda69501641a8c76e041f26b42..639745d4f3b94a248da9a685f45158410a85bec7 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -51,6 +51,7 @@
 #include <linux/slab.h>
 #include <linux/pagemap.h>
 #include <linux/uio.h>
+#include <linux/indirect_call_wrapper.h>
 
 #include <net/protocol.h>
 #include <linux/skbuff.h>
@@ -403,6 +404,11 @@ int skb_kill_datagram(struct sock *sk, struct sk_buff *skb, unsigned int flags)
 }
 EXPORT_SYMBOL(skb_kill_datagram);
 
+INDIRECT_CALLABLE_DECLARE(static size_t simple_copy_to_iter(const void *addr,
+						size_t bytes,
+						void *data __always_unused,
+						struct iov_iter *i));
+
 static int __skb_datagram_iter(const struct sk_buff *skb, int offset,
 			       struct iov_iter *to, int len, bool fault_short,
 			       size_t (*cb)(const void *, size_t, void *,
@@ -416,7 +422,8 @@ static int __skb_datagram_iter(const struct sk_buff *skb, int offset,
 	if (copy > 0) {
 		if (copy > len)
 			copy = len;
-		n = cb(skb->data + offset, copy, data, to);
+		n = INDIRECT_CALL_1(cb, simple_copy_to_iter,
+				    skb->data + offset, copy, data, to);
 		offset += n;
 		if (n != copy)
 			goto short_copy;
@@ -438,8 +445,9 @@ static int __skb_datagram_iter(const struct sk_buff *skb, int offset,
 
 			if (copy > len)
 				copy = len;
-			n = cb(vaddr + skb_frag_off(frag) + offset - start,
-			       copy, data, to);
+			n = INDIRECT_CALL_1(cb, simple_copy_to_iter,
+					vaddr + skb_frag_off(frag) + offset - start,
+					copy, data, to);
 			kunmap(page);
 			offset += n;
 			if (n != copy)
-- 
2.25.1.696.g5e7596f4ac-goog



* Re: [PATCH net-next] net: use indirect call wrappers for skb_copy_datagram_iter()
  2020-03-25  2:23 [PATCH net-next] net: use indirect call wrappers for skb_copy_datagram_iter() Eric Dumazet
@ 2020-03-25  4:22 ` Eric Dumazet
  2020-03-25 11:52 ` Paolo Abeni
  2020-03-25 18:31 ` David Miller
  2 siblings, 0 replies; 11+ messages in thread
From: Eric Dumazet @ 2020-03-25  4:22 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller; +Cc: netdev, Eric Dumazet, Sagi Grimberg



On 3/24/20 7:23 PM, Eric Dumazet wrote:
> TCP recvmsg() calls skb_copy_datagram_iter(), which
> calls an indirect function (cb pointing to simple_copy_to_iter())
> for every MSS (fragment) present in the skb.
> 
> CONFIG_RETPOLINE=y forces a very expensive operation
> that we can avoid thanks to indirect call wrappers.
> 
> This patch gives a 13% increase of performance on
> a single flow, if the bottleneck is the thread reading
> the TCP socket.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>


BTW, the expensive indirect call came with the commit below,
so we could also add a Fixes: tag eventually:

Fixes: 950fcaecd5cc ("datagram: consolidate datagram copy to iter helpers")

commit 950fcaecd5cc6c014bb96506fd0652a501c85276
Author: Sagi Grimberg <sagi@lightbitslabs.com>
Date:   Mon Dec 3 17:52:08 2018 -0800

    datagram: consolidate datagram copy to iter helpers
    
    skb_copy_datagram_iter and skb_copy_and_csum_datagram are essentially
    the same but with a couple of differences: The first is the copy
    operation used, which is either a simple copy or a csum_and_copy, and the
    second is the behavior on the "short copy" path, where simple copy
    needs to return the number of bytes successfully copied while csum_and_copy
    needs to fault immediately as the checksum is partial.
    
    Introduce __skb_datagram_iter that additionally accepts:
    1. copy operation function pointer
    2. private data that goes with the copy operation
    3. fault_short flag to indicate the action on short copy
    
    Suggested-by: David S. Miller <davem@davemloft.net>
    Acked-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sagi Grimberg <sagi@lightbitslabs.com>
    Signed-off-by: Christoph Hellwig <hch@lst.de>



* Re: [PATCH net-next] net: use indirect call wrappers for skb_copy_datagram_iter()
  2020-03-25  2:23 [PATCH net-next] net: use indirect call wrappers for skb_copy_datagram_iter() Eric Dumazet
  2020-03-25  4:22 ` Eric Dumazet
@ 2020-03-25 11:52 ` Paolo Abeni
  2020-03-25 14:55   ` Willem de Bruijn
  2020-03-25 15:14   ` Eric Dumazet
  2020-03-25 18:31 ` David Miller
  2 siblings, 2 replies; 11+ messages in thread
From: Paolo Abeni @ 2020-03-25 11:52 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller; +Cc: netdev, Eric Dumazet

On Tue, 2020-03-24 at 19:23 -0700, Eric Dumazet wrote:
> TCP recvmsg() calls skb_copy_datagram_iter(), which
> calls an indirect function (cb pointing to simple_copy_to_iter())
> for every MSS (fragment) present in the skb.
> 
> CONFIG_RETPOLINE=y forces a very expensive operation
> that we can avoid thanks to indirect call wrappers.
> 
> This patch gives a 13% increase of performance on
> a single flow, if the bottleneck is the thread reading
> the TCP socket.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  net/core/datagram.c | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/net/core/datagram.c b/net/core/datagram.c
> index 4213081c6ed3d4fda69501641a8c76e041f26b42..639745d4f3b94a248da9a685f45158410a85bec7 100644
> --- a/net/core/datagram.c
> +++ b/net/core/datagram.c
> @@ -51,6 +51,7 @@
>  #include <linux/slab.h>
>  #include <linux/pagemap.h>
>  #include <linux/uio.h>
> +#include <linux/indirect_call_wrapper.h>
>  
>  #include <net/protocol.h>
>  #include <linux/skbuff.h>
> @@ -403,6 +404,11 @@ int skb_kill_datagram(struct sock *sk, struct sk_buff *skb, unsigned int flags)
>  }
>  EXPORT_SYMBOL(skb_kill_datagram);
>  
> +INDIRECT_CALLABLE_DECLARE(static size_t simple_copy_to_iter(const void *addr,
> +						size_t bytes,
> +						void *data __always_unused,
> +						struct iov_iter *i));
> +
>  static int __skb_datagram_iter(const struct sk_buff *skb, int offset,
>  			       struct iov_iter *to, int len, bool fault_short,
>  			       size_t (*cb)(const void *, size_t, void *,
> @@ -416,7 +422,8 @@ static int __skb_datagram_iter(const struct sk_buff *skb, int offset,
>  	if (copy > 0) {
>  		if (copy > len)
>  			copy = len;
> -		n = cb(skb->data + offset, copy, data, to);
> +		n = INDIRECT_CALL_1(cb, simple_copy_to_iter,
> +				    skb->data + offset, copy, data, to);
>  		offset += n;
>  		if (n != copy)
>  			goto short_copy;
> @@ -438,8 +445,9 @@ static int __skb_datagram_iter(const struct sk_buff *skb, int offset,
>  
>  			if (copy > len)
>  				copy = len;
> -			n = cb(vaddr + skb_frag_off(frag) + offset - start,
> -			       copy, data, to);
> +			n = INDIRECT_CALL_1(cb, simple_copy_to_iter,
> +					vaddr + skb_frag_off(frag) + offset - start,
> +					copy, data, to);
>  			kunmap(page);
>  			offset += n;
>  			if (n != copy)

I wondered if we could add a second argument for
'csum_and_copy_to_iter', but I guess that is a slower path anyway and
more data points would be needed. The patch LGTM, thanks!

Acked-by: Paolo Abeni <pabeni@redhat.com>



* Re: [PATCH net-next] net: use indirect call wrappers for skb_copy_datagram_iter()
  2020-03-25 11:52 ` Paolo Abeni
@ 2020-03-25 14:55   ` Willem de Bruijn
  2020-03-25 16:00     ` Paolo Abeni
  2020-03-25 15:14   ` Eric Dumazet
  1 sibling, 1 reply; 11+ messages in thread
From: Willem de Bruijn @ 2020-03-25 14:55 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: Eric Dumazet, David S . Miller, netdev, Eric Dumazet

On Wed, Mar 25, 2020 at 7:52 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On Tue, 2020-03-24 at 19:23 -0700, Eric Dumazet wrote:
> > TCP recvmsg() calls skb_copy_datagram_iter(), which
> > calls an indirect function (cb pointing to simple_copy_to_iter())
> > for every MSS (fragment) present in the skb.
> >
> > CONFIG_RETPOLINE=y forces a very expensive operation
> > that we can avoid thanks to indirect call wrappers.
> >
> > This patch gives a 13% increase of performance on
> > a single flow, if the bottleneck is the thread reading
> > the TCP socket.
> >
> > Signed-off-by: Eric Dumazet <edumazet@google.com>

Acked-by: Willem de Bruijn <willemb@google.com>

> > @@ -438,8 +445,9 @@ static int __skb_datagram_iter(const struct sk_buff *skb, int offset,
> >
> >                       if (copy > len)
> >                               copy = len;
> > -                     n = cb(vaddr + skb_frag_off(frag) + offset - start,
> > -                            copy, data, to);
> > +                     n = INDIRECT_CALL_1(cb, simple_copy_to_iter,
> > +                                     vaddr + skb_frag_off(frag) + offset - start,
> > +                                     copy, data, to);
> >                       kunmap(page);
> >                       offset += n;
> >                       if (n != copy)
>
> I wondered if we could add a second argument for
> 'csum_and_copy_to_iter', but I guess that is a slower path anyway and
> more data points would be needed. The patch LGTM, thanks!
>
> Acked-by: Paolo Abeni <pabeni@redhat.com>

On the UDP front this reminded me of another indirect function call
without indirect call wrapper: getfrag in __ip_append_data.

That is called for each datagram once per linear + once per page. That
said, the noise in my quick RR test was too great to measure any
benefit from the following. Paolo, did you happen to also look at that
when introducing the indirect callers? Seems like it won't hurt to
add.

 static int
 ip_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
@@ -1128,7 +1129,8 @@ static int __ip_append_data(struct sock *sk,
                        }

                        copy = datalen - transhdrlen - fraggap - pagedlen;
-                       if (copy > 0 && getfrag(from, data + transhdrlen, offset, copy, fraggap, skb) < 0) {
+                       if (copy > 0 &&
+                           INDIRECT_CALL_1(getfrag, ip_generic_getfrag, from, data + transhdrlen, offset, copy, fraggap, skb) < 0) {
                                err = -EFAULT;
                                kfree_skb(skb);
                                goto error;
@@ -1170,7 +1172,7 @@ static int __ip_append_data(struct sock *sk,
                        unsigned int off;

                        off = skb->len;
-                       if (getfrag(from, skb_put(skb, copy),
+                       if (INDIRECT_CALL_1(getfrag, ip_generic_getfrag, from, skb_put(skb, copy),
                                        offset, copy, off, skb) < 0) {
                                __skb_trim(skb, off);
                                err = -EFAULT;
@@ -1195,7 +1197,7 @@ static int __ip_append_data(struct sock *sk,
                                get_page(pfrag->page);
                        }
                        copy = min_t(int, copy, pfrag->size - pfrag->offset);
-                       if (getfrag(from,
+                       if (INDIRECT_CALL_1(getfrag, ip_generic_getfrag, from,


* Re: [PATCH net-next] net: use indirect call wrappers for skb_copy_datagram_iter()
  2020-03-25 11:52 ` Paolo Abeni
  2020-03-25 14:55   ` Willem de Bruijn
@ 2020-03-25 15:14   ` Eric Dumazet
  1 sibling, 0 replies; 11+ messages in thread
From: Eric Dumazet @ 2020-03-25 15:14 UTC (permalink / raw)
  To: Paolo Abeni, Eric Dumazet, David S . Miller; +Cc: netdev



On 3/25/20 4:52 AM, Paolo Abeni wrote:

> 
> I wondered if we could add a second argument for
> 'csum_and_copy_to_iter', but I guess that is a slower path anyway and
> more data points would be needed. The patch LGTM, thanks!

Yes, TCP does not need the csum stuff. I suspect the only users
of csum would avoid at most one indirect call per system call,
which is pure noise.

TCP, on the other hand, can right now trigger 45 indirect calls per skb
copied to user space, assuming a standard 1500-byte MTU.

> 
> Acked-by: Paolo Abeni <pabeni@redhat.com>

Thanks !



* Re: [PATCH net-next] net: use indirect call wrappers for skb_copy_datagram_iter()
  2020-03-25 14:55   ` Willem de Bruijn
@ 2020-03-25 16:00     ` Paolo Abeni
  2020-03-25 16:07       ` Eric Dumazet
  2020-03-25 20:58       ` Willem de Bruijn
  0 siblings, 2 replies; 11+ messages in thread
From: Paolo Abeni @ 2020-03-25 16:00 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: Eric Dumazet, David S . Miller, netdev, Eric Dumazet

On Wed, 2020-03-25 at 10:55 -0400, Willem de Bruijn wrote:
> On the UDP front this reminded me of another indirect function call
> without indirect call wrapper: getfrag in __ip_append_data.
> 
> That is called for each datagram once per linear + once per page. That
> said, the noise in my quick RR test was too great to measure any
> benefit from the following. 

Why an RR test?

I think you should be able to measure some raw tput improvement with
a large UDP GSO write towards a blackhole dst, or by dropping ingress pkts
with XDP (just to be sure the bottleneck is on the sender side).

> Paolo, did you happen to also look at that
> when introducing the indirect callers? Seems like it won't hurt to
> add.

Nope, sorry, I haven't experimented with that.

For the record, I have 2 other items on my list that I hope to have time to
process some day: the ingress dst->input and the default ->enqueue and
->dequeue.

Cheers,

Paolo

p.s. feel free to move this to a different thread, if that suits you better



* Re: [PATCH net-next] net: use indirect call wrappers for skb_copy_datagram_iter()
  2020-03-25 16:00     ` Paolo Abeni
@ 2020-03-25 16:07       ` Eric Dumazet
  2020-03-25 16:24         ` Paolo Abeni
  2020-03-25 20:58       ` Willem de Bruijn
  1 sibling, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2020-03-25 16:07 UTC (permalink / raw)
  To: Paolo Abeni, Willem de Bruijn
  Cc: Eric Dumazet, David S . Miller, netdev, Eric Dumazet



On 3/25/20 9:00 AM, Paolo Abeni wrote:

> 
> For the record, I have 2 other items on my list that I hope to have time to
> process some day: the ingress dst->input and the default ->enqueue and
> ->dequeue.

What is the default ->enqueue() and ->dequeue() ?

For us, this is FQ.

(Even if we do not select NET_SCH_DEFAULT and leave pfifo_fast as the 'default' qdisc)



* Re: [PATCH net-next] net: use indirect call wrappers for skb_copy_datagram_iter()
  2020-03-25 16:07       ` Eric Dumazet
@ 2020-03-25 16:24         ` Paolo Abeni
       [not found]           ` <CANn89iKotU9Tkd6KBgyicHFV72K9gZ+eeKwkPU097=gZZYCjrA@mail.gmail.com>
  0 siblings, 1 reply; 11+ messages in thread
From: Paolo Abeni @ 2020-03-25 16:24 UTC (permalink / raw)
  To: Eric Dumazet, Willem de Bruijn; +Cc: Eric Dumazet, David S . Miller, netdev

On Wed, 2020-03-25 at 09:07 -0700, Eric Dumazet wrote:
> 
> On 3/25/20 9:00 AM, Paolo Abeni wrote:
> 
> > For the record, I have 2 other items on my list that I hope to have time to
> > process some day: the ingress dst->input and the default ->enqueue and
> > ->dequeue.
> 
> What is the default ->enqueue() and ->dequeue() ?

The idea is (or should I say 'was' ?!?) to tie it to NET_SCH_DEFAULT,
so it depends on your config...

> For us, this is FQ.
> 
> (Even if we do not select NET_SCH_DEFAULT and leave pfifo_fast as the 'default' qdisc)

... this one will see no benefit.

Just out of sheer curiosity, why don't you set NET_SCH_DEFAULT?

Thanks,

Paolo



* Re: [PATCH net-next] net: use indirect call wrappers for skb_copy_datagram_iter()
       [not found]           ` <CANn89iKotU9Tkd6KBgyicHFV72K9gZ+eeKwkPU097=gZZYCjrA@mail.gmail.com>
@ 2020-03-25 16:46             ` Eric Dumazet
  0 siblings, 0 replies; 11+ messages in thread
From: Eric Dumazet @ 2020-03-25 16:46 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: Eric Dumazet, Willem de Bruijn, David S . Miller, netdev

Resend without HTML encoding


On Wed, Mar 25, 2020 at 9:41 AM Eric Dumazet <edumazet@google.com> wrote:
>
>
>
> On Wed, Mar 25, 2020 at 9:24 AM Paolo Abeni <pabeni@redhat.com> wrote:
>>
>>
>> Just out of sheer curiosity, why don't you set NET_SCH_DEFAULT?
>>
>
> Because we have boot-time scripts setting optimal configs, and since we need
> to set XPS properly to get correct NUMA allocations, we have to perform the qdisc
> allocations after some other stuff.
>
> (Look for netdev_queue_numa_node_read() calls)
>
> Also, some users still expect pfifo_fast to be used when a tun device is created :)
>
>


* Re: [PATCH net-next] net: use indirect call wrappers for skb_copy_datagram_iter()
  2020-03-25  2:23 [PATCH net-next] net: use indirect call wrappers for skb_copy_datagram_iter() Eric Dumazet
  2020-03-25  4:22 ` Eric Dumazet
  2020-03-25 11:52 ` Paolo Abeni
@ 2020-03-25 18:31 ` David Miller
  2 siblings, 0 replies; 11+ messages in thread
From: David Miller @ 2020-03-25 18:31 UTC (permalink / raw)
  To: edumazet; +Cc: netdev, eric.dumazet

From: Eric Dumazet <edumazet@google.com>
Date: Tue, 24 Mar 2020 19:23:21 -0700

> TCP recvmsg() calls skb_copy_datagram_iter(), which
> calls an indirect function (cb pointing to simple_copy_to_iter())
> for every MSS (fragment) present in the skb.
> 
> CONFIG_RETPOLINE=y forces a very expensive operation
> that we can avoid thanks to indirect call wrappers.
> 
> This patch gives a 13% increase of performance on
> a single flow, if the bottleneck is the thread reading
> the TCP socket.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks Eric.


* Re: [PATCH net-next] net: use indirect call wrappers for skb_copy_datagram_iter()
  2020-03-25 16:00     ` Paolo Abeni
  2020-03-25 16:07       ` Eric Dumazet
@ 2020-03-25 20:58       ` Willem de Bruijn
  1 sibling, 0 replies; 11+ messages in thread
From: Willem de Bruijn @ 2020-03-25 20:58 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Willem de Bruijn, Eric Dumazet, David S . Miller, netdev, Eric Dumazet

On Wed, Mar 25, 2020 at 12:00 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On Wed, 2020-03-25 at 10:55 -0400, Willem de Bruijn wrote:
> > On the UDP front this reminded me of another indirect function call
> > without indirect call wrapper: getfrag in __ip_append_data.
> >
> > That is called for each datagram once per linear + once per page. That
> > said, the noise in my quick RR test was too great to measure any
> > benefit from the following.
>
> Why an RR test?
>
> I think you should be able to measure some raw tput improvement with
> a large UDP GSO write towards a blackhole dst, or by dropping ingress pkts
> with XDP (just to be sure the bottleneck is on the sender side).

Thanks for the suggestion. I ran a send-only udpgso_bench_tx test
to a dummy device with NETIF_F_GSO_UDP_L4.

    ip link add dummy0 type dummy
    ip link set dev dummy0 mtu 1500
    ip link set dev dummy0 up
    ip addr add 10.0.0.1/24 dev dummy0
    perf stat -- ./udpgso_bench_tx -C 1 -4 -D 10.0.0.2 -l 5 -S 0

By default, this generates only 3 getfrag calls per sendmsg, due to
sk_page_frag_refill generating 32KB compound pages.

When disabling compound pages by setting the sysctl
net.core.high_order_alloc_disable, this increased to 17 getfrag calls
per sendmsg.

Even then any benefit appears to be in the noise. With and without the
change, throughput was 10900-11700 MB/s.

The effect of that sysctl, and thus of compound pages, was much larger
than I expected. With the sysctl left unset (compound pages enabled),
I observed 16500-18100 MB/s.

In summary, this particular indirect call does not appear worthwhile.
