All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net] veth: ensure skb entering GRO are not cloned.
@ 2021-12-21 21:33 Paolo Abeni
  2021-12-22  4:31 ` Eric Dumazet
  0 siblings, 1 reply; 6+ messages in thread
From: Paolo Abeni @ 2021-12-21 21:33 UTC (permalink / raw)
  To: netdev; +Cc: Ignat Korchagin, Eric Dumazet

After commit d3256efd8e8b ("veth: allow enabling NAPI even without XDP"),
if GRO is enabled on a veth device and TSO is disabled on the peer
device, TCP skbs will go through the NAPI callback. If there is no XDP
program attached, the veth code does not perform any share check, and
shared/cloned skbs could enter the GRO engine.

Ignat reported a BUG triggered later-on due to the above condition:

[   53.970529][    C1] kernel BUG at net/core/skbuff.c:3574!
[   53.981755][    C1] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
[   53.982634][    C1] CPU: 1 PID: 19 Comm: ksoftirqd/1 Not tainted 5.16.0-rc5+ #25
[   53.982634][    C1] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[   53.982634][    C1] RIP: 0010:skb_shift+0x13ef/0x23b0
[   53.982634][    C1] Code: ea 03 0f b6 04 02 48 89 fa 83 e2 07 38 d0
7f 08 84 c0 0f 85 41 0c 00 00 41 80 7f 02 00 4d 8d b5 d0 00 00 00 0f
85 74 f5 ff ff <0f> 0b 4d 8d 77 20 be 04 00 00 00 4c 89 44 24 78 4c 89
f7 4c 89 8c
[   53.982634][    C1] RSP: 0018:ffff8881008f7008 EFLAGS: 00010246
[   53.982634][    C1] RAX: 0000000000000000 RBX: ffff8881180b4c80 RCX: 0000000000000000
[   53.982634][    C1] RDX: 0000000000000002 RSI: ffff8881180b4d3c RDI: ffff88810bc9cac2
[   53.982634][    C1] RBP: ffff8881008f70b8 R08: ffff8881180b4cf4 R09: ffff8881180b4cf0
[   53.982634][    C1] R10: ffffed1022999e5c R11: 0000000000000002 R12: 0000000000000590
[   53.982634][    C1] R13: ffff88810f940c80 R14: ffff88810f940d50 R15: ffff88810bc9cac0
[   53.982634][    C1] FS:  0000000000000000(0000) GS:ffff888235880000(0000) knlGS:0000000000000000
[   53.982634][    C1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   53.982634][    C1] CR2: 00007ff5f9b86680 CR3: 0000000108ce8004 CR4: 0000000000170ee0
[   53.982634][    C1] Call Trace:
[   53.982634][    C1]  <TASK>
[   53.982634][    C1]  tcp_sacktag_walk+0xaba/0x18e0
[   53.982634][    C1]  tcp_sacktag_write_queue+0xe7b/0x3460
[   53.982634][    C1]  tcp_ack+0x2666/0x54b0
[   53.982634][    C1]  tcp_rcv_established+0x4d9/0x20f0
[   53.982634][    C1]  tcp_v4_do_rcv+0x551/0x810
[   53.982634][    C1]  tcp_v4_rcv+0x22ed/0x2ed0
[   53.982634][    C1]  ip_protocol_deliver_rcu+0x96/0xaf0
[   53.982634][    C1]  ip_local_deliver_finish+0x1e0/0x2f0
[   53.982634][    C1]  ip_sublist_rcv_finish+0x211/0x440
[   53.982634][    C1]  ip_list_rcv_finish.constprop.0+0x424/0x660
[   53.982634][    C1]  ip_list_rcv+0x2c8/0x410
[   53.982634][    C1]  __netif_receive_skb_list_core+0x65c/0x910
[   53.982634][    C1]  netif_receive_skb_list_internal+0x5f9/0xcb0
[   53.982634][    C1]  napi_complete_done+0x188/0x6e0
[   53.982634][    C1]  gro_cell_poll+0x10c/0x1d0
[   53.982634][    C1]  __napi_poll+0xa1/0x530
[   53.982634][    C1]  net_rx_action+0x567/0x1270
[   53.982634][    C1]  __do_softirq+0x28a/0x9ba
[   53.982634][    C1]  run_ksoftirqd+0x32/0x60
[   53.982634][    C1]  smpboot_thread_fn+0x559/0x8c0
[   53.982634][    C1]  kthread+0x3b9/0x490
[   53.982634][    C1]  ret_from_fork+0x22/0x30
[   53.982634][    C1]  </TASK>

Address the issue checking for cloned skbs even in the GRO-without-XDP
input path.

Reported-and-tested-by: Ignat Korchagin <ignat@cloudflare.com>
Fixes: d3256efd8e8b ("veth: allow enabling NAPI even without XDP")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 drivers/net/veth.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index b78894c38933..abd1f949b2f5 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -718,6 +718,14 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq,
 	rcu_read_lock();
 	xdp_prog = rcu_dereference(rq->xdp_prog);
 	if (unlikely(!xdp_prog)) {
+		if (unlikely(skb_shared(skb) || skb_head_is_locked(skb))) {
+			struct sk_buff *nskb = skb_copy(skb, GFP_ATOMIC | __GFP_NOWARN);
+
+			if (!nskb)
+				goto drop;
+			consume_skb(skb);
+			skb = nskb;
+		}
 		rcu_read_unlock();
 		goto out;
 	}
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH net] veth: ensure skb entering GRO are not cloned.
  2021-12-21 21:33 [PATCH net] veth: ensure skb entering GRO are not cloned Paolo Abeni
@ 2021-12-22  4:31 ` Eric Dumazet
  2021-12-22 11:06   ` Paolo Abeni
  2021-12-23 18:08   ` Dave Taht
  0 siblings, 2 replies; 6+ messages in thread
From: Eric Dumazet @ 2021-12-22  4:31 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: netdev, Ignat Korchagin

On Tue, Dec 21, 2021 at 1:34 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> After commit d3256efd8e8b ("veth: allow enabling NAPI even without XDP"),
> if GRO is enabled on a veth device and TSO is disabled on the peer
> device, TCP skbs will go through the NAPI callback. If there is no XDP
> program attached, the veth code does not perform any share check, and
> shared/cloned skbs could enter the GRO engine.
>
>

...

> Address the issue checking for cloned skbs even in the GRO-without-XDP
> input path.
>
> Reported-and-tested-by: Ignat Korchagin <ignat@cloudflare.com>
> Fixes: d3256efd8e8b ("veth: allow enabling NAPI even without XDP")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
>  drivers/net/veth.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> index b78894c38933..abd1f949b2f5 100644
> --- a/drivers/net/veth.c
> +++ b/drivers/net/veth.c
> @@ -718,6 +718,14 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq,
>         rcu_read_lock();
>         xdp_prog = rcu_dereference(rq->xdp_prog);
>         if (unlikely(!xdp_prog)) {
> +               if (unlikely(skb_shared(skb) || skb_head_is_locked(skb))) {

Why skb_head_is_locked() needed here ?
I would think skb_cloned() is enough for the problem we want to address.

> +                       struct sk_buff *nskb = skb_copy(skb, GFP_ATOMIC | __GFP_NOWARN);
> +
> +                       if (!nskb)
> +                               goto drop;
> +                       consume_skb(skb);
> +                       skb = nskb;
> +               }
>                 rcu_read_unlock();
>                 goto out;
>         }
> --
> 2.33.1
>

- It seems adding yet memory alloc/free and copies is defeating GRO purpose.
- After skb_copy(), GRO is forced to use the expensive frag_list way
for aggregation anyway.
- veth mtu could be set to 64KB, so we could have order-4 allocation
attempts here.

Would the following fix [1] be better maybe, in terms of efficiency,
and keeping around skb EDT/tstamp
information (see recent thread with Martin and Daniel )

I think it also focuses more on the problem (GRO is not capable of
dealing with cloned skb yet).
Who knows, maybe in the future we will _have_ to add more checks in
GRO fast path for some other reason,
since it is becoming the Swiss army knife of networking :)

Although I guess this whole case (disabling TSO) is moot, I have no
idea why anyone would do that :)

[1]
diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 50eb43e5bf459bb998e264d399bc85d4e9d73594..fe7a4d2f7bfc834ea56d1da185c0f53bfbd22ad0
100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -879,8 +879,12 @@ static int veth_xdp_rcv(struct veth_rq *rq, int budget,

                        stats->xdp_bytes += skb->len;
                        skb = veth_xdp_rcv_skb(rq, skb, bq, stats);
-                       if (skb)
-                               napi_gro_receive(&rq->xdp_napi, skb);
+                       if (skb) {
+                               if (skb_shared(skb) || skb_cloned(skb))
+                                       netif_receive_skb(skb);
+                               else
+                                       napi_gro_receive(&rq->xdp_napi, skb);
+                       }
                }
                done++;
        }

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] veth: ensure skb entering GRO are not cloned.
  2021-12-22  4:31 ` Eric Dumazet
@ 2021-12-22 11:06   ` Paolo Abeni
  2021-12-22 11:58     ` Eric Dumazet
  2021-12-23 18:08   ` Dave Taht
  1 sibling, 1 reply; 6+ messages in thread
From: Paolo Abeni @ 2021-12-22 11:06 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Ignat Korchagin

Hello,

On Tue, 2021-12-21 at 20:31 -0800, Eric Dumazet wrote:
> On Tue, Dec 21, 2021 at 1:34 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > 
> > After commit d3256efd8e8b ("veth: allow enabling NAPI even without XDP"),
> > if GRO is enabled on a veth device and TSO is disabled on the peer
> > device, TCP skbs will go through the NAPI callback. If there is no XDP
> > program attached, the veth code does not perform any share check, and
> > shared/cloned skbs could enter the GRO engine.
> > 
> > 
> 
> ...
> 
> > Address the issue checking for cloned skbs even in the GRO-without-XDP
> > input path.
> > 
> > Reported-and-tested-by: Ignat Korchagin <ignat@cloudflare.com>
> > Fixes: d3256efd8e8b ("veth: allow enabling NAPI even without XDP")
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > ---
> >  drivers/net/veth.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> > index b78894c38933..abd1f949b2f5 100644
> > --- a/drivers/net/veth.c
> > +++ b/drivers/net/veth.c
> > @@ -718,6 +718,14 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq,
> >         rcu_read_lock();
> >         xdp_prog = rcu_dereference(rq->xdp_prog);
> >         if (unlikely(!xdp_prog)) {
> > +               if (unlikely(skb_shared(skb) || skb_head_is_locked(skb))) {
> 
> Why skb_head_is_locked() needed here ?
> I would think skb_cloned() is enough for the problem we want to address.

Thank you for the feedback.

I double checked the above: in my test even skb_cloned() suffice.

> > +                       struct sk_buff *nskb = skb_copy(skb, GFP_ATOMIC | __GFP_NOWARN);
> > +
> > +                       if (!nskb)
> > +                               goto drop;
> > +                       consume_skb(skb);
> > +                       skb = nskb;
> > +               }
> >                 rcu_read_unlock();
> >                 goto out;
> >         }
> > --
> > 2.33.1
> > 
> 
> - It seems adding yet memory alloc/free and copies is defeating GRO purpose.
> - After skb_copy(), GRO is forced to use the expensive frag_list way
> for aggregation anyway.
> - veth mtu could be set to 64KB, so we could have order-4 allocation
> attempts here.
> 
> Would the following fix [1] be better maybe, in terms of efficiency,
> and keeping around skb EDT/tstamp
> information (see recent thread with Martin and Daniel )
> 
> I think it also focuses more on the problem (GRO is not capable of
> dealing with cloned skb yet).
> Who knows, maybe in the future we will _have_ to add more checks in
> GRO fast path for some other reason,
> since it is becoming the Swiss army knife of networking :)

Only vaguely related: I have a bunch of micro optimizations for the GRO
engine. I did not submit the patches because I can observe the gain
only in micro-benchmarks, but I'm wondering if that could be visible
with very high speed TCP stream? I can share the code if that could be
of general interest (after some rebasing, the patches predates gro.c)

> Although I guess this whole case (disabling TSO) is moot, I have no
> idea why anyone would do that :)
> 
> [1]
> diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> index 50eb43e5bf459bb998e264d399bc85d4e9d73594..fe7a4d2f7bfc834ea56d1da185c0f53bfbd22ad0
> 100644
> --- a/drivers/net/veth.c
> +++ b/drivers/net/veth.c
> @@ -879,8 +879,12 @@ static int veth_xdp_rcv(struct veth_rq *rq, int budget,
> 
>                         stats->xdp_bytes += skb->len;
>                         skb = veth_xdp_rcv_skb(rq, skb, bq, stats);
> -                       if (skb)
> -                               napi_gro_receive(&rq->xdp_napi, skb);
> +                       if (skb) {
> +                               if (skb_shared(skb) || skb_cloned(skb))
> +                                       netif_receive_skb(skb);
> +                               else
> +                                       napi_gro_receive(&rq->xdp_napi, skb);
> +                       }
>                 }
>                 done++;
>         }

I tested the above, and it works, too.

I thought about something similar, but I overlooked possible OoO or
behaviour changes when a packet socket is attached to the paired device
(as it would disable GRO).

It looks like tcpdump should have not ill-effects (the mmap rx-path
releases the skb clone before the orig packet reaches the other end),
so I guess the above is fine (and sure is better to avoid more
timestamp related problem).

Do you prefer to submit it formally, or do you prefer I'll send a v2
with the latter code?

Thanks!

Paolo



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] veth: ensure skb entering GRO are not cloned.
  2021-12-22 11:06   ` Paolo Abeni
@ 2021-12-22 11:58     ` Eric Dumazet
  2021-12-22 16:10       ` Paolo Abeni
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2021-12-22 11:58 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: netdev, Ignat Korchagin

On Wed, Dec 22, 2021 at 3:06 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> Hello,
>
> On Tue, 2021-12-21 at 20:31 -0800, Eric Dumazet wrote:
> > On Tue, Dec 21, 2021 at 1:34 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > >
> > > After commit d3256efd8e8b ("veth: allow enabling NAPI even without XDP"),
> > > if GRO is enabled on a veth device and TSO is disabled on the peer
> > > device, TCP skbs will go through the NAPI callback. If there is no XDP
> > > program attached, the veth code does not perform any share check, and
> > > shared/cloned skbs could enter the GRO engine.
> > >
> > >
> >
> > ...
> >
> > > Address the issue checking for cloned skbs even in the GRO-without-XDP
> > > input path.
> > >
> > > Reported-and-tested-by: Ignat Korchagin <ignat@cloudflare.com>
> > > Fixes: d3256efd8e8b ("veth: allow enabling NAPI even without XDP")
> > > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > > ---
> > >  drivers/net/veth.c | 8 ++++++++
> > >  1 file changed, 8 insertions(+)
> > >
> > > diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> > > index b78894c38933..abd1f949b2f5 100644
> > > --- a/drivers/net/veth.c
> > > +++ b/drivers/net/veth.c
> > > @@ -718,6 +718,14 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq,
> > >         rcu_read_lock();
> > >         xdp_prog = rcu_dereference(rq->xdp_prog);
> > >         if (unlikely(!xdp_prog)) {
> > > +               if (unlikely(skb_shared(skb) || skb_head_is_locked(skb))) {
> >
> > Why skb_head_is_locked() needed here ?
> > I would think skb_cloned() is enough for the problem we want to address.
>
> Thank you for the feedback.
>
> I double checked the above: in my test even skb_cloned() suffice.
>
> > > +                       struct sk_buff *nskb = skb_copy(skb, GFP_ATOMIC | __GFP_NOWARN);
> > > +
> > > +                       if (!nskb)
> > > +                               goto drop;
> > > +                       consume_skb(skb);
> > > +                       skb = nskb;
> > > +               }
> > >                 rcu_read_unlock();
> > >                 goto out;
> > >         }
> > > --
> > > 2.33.1
> > >
> >
> > - It seems adding yet memory alloc/free and copies is defeating GRO purpose.
> > - After skb_copy(), GRO is forced to use the expensive frag_list way
> > for aggregation anyway.
> > - veth mtu could be set to 64KB, so we could have order-4 allocation
> > attempts here.
> >
> > Would the following fix [1] be better maybe, in terms of efficiency,
> > and keeping around skb EDT/tstamp
> > information (see recent thread with Martin and Daniel )
> >
> > I think it also focuses more on the problem (GRO is not capable of
> > dealing with cloned skb yet).
> > Who knows, maybe in the future we will _have_ to add more checks in
> > GRO fast path for some other reason,
> > since it is becoming the Swiss army knife of networking :)
>
> Only vaguely related: I have a bunch of micro optimizations for the GRO
> engine. I did not submit the patches because I can observe the gain
> only in micro-benchmarks, but I'm wondering if that could be visible
> with very high speed TCP stream? I can share the code if that could be
> of general interest (after some rebasing, the patches predates gro.c)
>
> > Although I guess this whole case (disabling TSO) is moot, I have no
> > idea why anyone would do that :)
> >
> > [1]
> > diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> > index 50eb43e5bf459bb998e264d399bc85d4e9d73594..fe7a4d2f7bfc834ea56d1da185c0f53bfbd22ad0
> > 100644
> > --- a/drivers/net/veth.c
> > +++ b/drivers/net/veth.c
> > @@ -879,8 +879,12 @@ static int veth_xdp_rcv(struct veth_rq *rq, int budget,
> >
> >                         stats->xdp_bytes += skb->len;
> >                         skb = veth_xdp_rcv_skb(rq, skb, bq, stats);
> > -                       if (skb)
> > -                               napi_gro_receive(&rq->xdp_napi, skb);
> > +                       if (skb) {
> > +                               if (skb_shared(skb) || skb_cloned(skb))
> > +                                       netif_receive_skb(skb);
> > +                               else
> > +                                       napi_gro_receive(&rq->xdp_napi, skb);
> > +                       }
> >                 }
> >                 done++;
> >         }
>
> I tested the above, and it works, too.
>
> I thought about something similar, but I overlooked possible OoO or
> behaviour changes when a packet socket is attached to the paired device
> (as it would disable GRO).

Have you tried a pskb_expand_head() instead of a full copy ?
Perhaps that would be enough, and keep all packets going through GRO to
make sure OOO is covered.

>
> It looks like tcpdump should have not ill-effects (the mmap rx-path
> releases the skb clone before the orig packet reaches the other end),
> so I guess the above is fine (and sure is better to avoid more
> timestamp related problem).
>
> Do you prefer to submit it formally, or do you prefer I'll send a v2
> with the latter code?

Sure, please submit a V2.

>
> Thanks!
>
> Paolo
>
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] veth: ensure skb entering GRO are not cloned.
  2021-12-22 11:58     ` Eric Dumazet
@ 2021-12-22 16:10       ` Paolo Abeni
  0 siblings, 0 replies; 6+ messages in thread
From: Paolo Abeni @ 2021-12-22 16:10 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Ignat Korchagin

On Wed, 2021-12-22 at 03:58 -0800, Eric Dumazet wrote:
> On Wed, Dec 22, 2021 at 3:06 AM Paolo Abeni <pabeni@redhat.com> wrote:
> > I thought about something similar, but I overlooked possible OoO or
> > behaviour changes when a packet socket is attached to the paired device
> > (as it would disable GRO).
> 
> Have you tried a pskb_expand_head() instead of a full copy ?
> Perhaps that would be enough, and keep all packets going through GRO to
> make sure OOO is covered.

Indeed it looks like it's enough. I'll do some more testing and I'll
send a v2 using pskb_expand_head().

Many thanks!

Paolo


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] veth: ensure skb entering GRO are not cloned.
  2021-12-22  4:31 ` Eric Dumazet
  2021-12-22 11:06   ` Paolo Abeni
@ 2021-12-23 18:08   ` Dave Taht
  1 sibling, 0 replies; 6+ messages in thread
From: Dave Taht @ 2021-12-23 18:08 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Paolo Abeni, netdev, Ignat Korchagin

On Wed, Dec 22, 2021 at 5:17 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, Dec 21, 2021 at 1:34 PM Paolo Abeni <pabeni@redhat.com> wrote:
> >
> > After commit d3256efd8e8b ("veth: allow enabling NAPI even without XDP"),
> > if GRO is enabled on a veth device and TSO is disabled on the peer
> > device, TCP skbs will go through the NAPI callback. If there is no XDP
> > program attached, the veth code does not perform any share check, and
> > shared/cloned skbs could enter the GRO engine.
> >
> >
>
> ...
>
> > Address the issue checking for cloned skbs even in the GRO-without-XDP
> > input path.
> >
> > Reported-and-tested-by: Ignat Korchagin <ignat@cloudflare.com>
> > Fixes: d3256efd8e8b ("veth: allow enabling NAPI even without XDP")
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > ---
> >  drivers/net/veth.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> > index b78894c38933..abd1f949b2f5 100644
> > --- a/drivers/net/veth.c
> > +++ b/drivers/net/veth.c
> > @@ -718,6 +718,14 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq,
> >         rcu_read_lock();
> >         xdp_prog = rcu_dereference(rq->xdp_prog);
> >         if (unlikely(!xdp_prog)) {
> > +               if (unlikely(skb_shared(skb) || skb_head_is_locked(skb))) {
>
> Why skb_head_is_locked() needed here ?
> I would think skb_cloned() is enough for the problem we want to address.
>
> > +                       struct sk_buff *nskb = skb_copy(skb, GFP_ATOMIC | __GFP_NOWARN);
> > +
> > +                       if (!nskb)
> > +                               goto drop;
> > +                       consume_skb(skb);
> > +                       skb = nskb;
> > +               }
> >                 rcu_read_unlock();
> >                 goto out;
> >         }
> > --
> > 2.33.1
> >
>
> - It seems adding yet memory alloc/free and copies is defeating GRO purpose.
> - After skb_copy(), GRO is forced to use the expensive frag_list way
> for aggregation anyway.
> - veth mtu could be set to 64KB, so we could have order-4 allocation
> attempts here.
>
> Would the following fix [1] be better maybe, in terms of efficiency,
> and keeping around skb EDT/tstamp
> information (see recent thread with Martin and Daniel )

I've always liked the idea of being able to coherently timestamp from
packet ingress to egress.

> I think it also focuses more on the problem (GRO is not capable of
> dealing with cloned skb yet).
> Who knows, maybe in the future we will _have_ to add more checks in
> GRO fast path for some other reason,
> since it is becoming the Swiss army knife of networking :)

GRO is the bane of my sub-gbit existence. I've been wishing we had a
compile time option to always split it up or a righter path forward
using veth interfaces here: https://github.com/rchac/LibreQoS

> Although I guess this whole case (disabling TSO) is moot, I have no
> idea why anyone would do that :)
>
> [1]
> diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> index 50eb43e5bf459bb998e264d399bc85d4e9d73594..fe7a4d2f7bfc834ea56d1da185c0f53bfbd22ad0
> 100644
> --- a/drivers/net/veth.c
> +++ b/drivers/net/veth.c
> @@ -879,8 +879,12 @@ static int veth_xdp_rcv(struct veth_rq *rq, int budget,
>
>                         stats->xdp_bytes += skb->len;
>                         skb = veth_xdp_rcv_skb(rq, skb, bq, stats);
> -                       if (skb)
> -                               napi_gro_receive(&rq->xdp_napi, skb);
> +                       if (skb) {
> +                               if (skb_shared(skb) || skb_cloned(skb))
> +                                       netif_receive_skb(skb);
> +                               else
> +                                       napi_gro_receive(&rq->xdp_napi, skb);
> +                       }
>                 }
>                 done++;
>         }



-- 
I tried to build a better future, a few times:
https://wayforward.archive.org/?site=https%3A%2F%2Fwww.icei.org

Dave Täht CEO, TekLibre, LLC

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-12-23 18:08 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-21 21:33 [PATCH net] veth: ensure skb entering GRO are not cloned Paolo Abeni
2021-12-22  4:31 ` Eric Dumazet
2021-12-22 11:06   ` Paolo Abeni
2021-12-22 11:58     ` Eric Dumazet
2021-12-22 16:10       ` Paolo Abeni
2021-12-23 18:08   ` Dave Taht

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.