All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexander Lobakin <alobakin@pm.me>
To: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: Alexander Lobakin <alobakin@pm.me>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	"David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>,
	Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>,
	Paolo Abeni <pabeni@redhat.com>,
	Network Development <netdev@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Steffen Klassert <steffen.klassert@secunet.com>
Subject: Re: [PATCH v2 net] net: udp: fix Fast/frag0 UDP GRO
Date: Mon, 09 Nov 2020 19:48:04 +0000	[thread overview]
Message-ID: <nokss4CZW7lx6fLaxY3rnl4Qx5buif9VLcfCbI@cp3-web-020.plabs.ch> (raw)
In-Reply-To: <CA+FuTSd2q=zaBJffxi+ORs7rxZN50c8X3n4NoJcrAq+pAZAMCA@mail.gmail.com>

From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Date: Mon, 9 Nov 2020 14:06:07 -0500

> On Mon, Nov 9, 2020 at 1:54 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>
>>
>>
>> On 11/9/20 7:28 PM, Willem de Bruijn wrote:
>>> On Mon, Nov 9, 2020 at 12:37 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>>>
>>>>
>>>>
>>>> On 11/9/20 5:56 PM, Alexander Lobakin wrote:
>>>>> While testing UDP GSO fraglists forwarding through driver that uses
>>>>> Fast GRO (via napi_gro_frags()), I was observing lots of out-of-order
>>>>> iperf packets:
>>>>>
>>>>> [ ID] Interval           Transfer     Bitrate         Jitter
>>>>> [SUM]  0.0-40.0 sec  12106 datagrams received out-of-order
>>>>>
>>>>> Simple switch to napi_gro_receive() any other method without frag0
>>>>> shortcut completely resolved them.
>>>>>
>>>>> I've found that UDP GRO uses udp_hdr(skb) in its .gro_receive()
>>>>> callback. While it's probably OK for non-frag0 paths (when all
>>>>> headers or even the entire frame are already in skb->data), this
>>>>> inline points to junk when using Fast GRO (napi_gro_frags() or
>>>>> napi_gro_receive() with only Ethernet header in skb->data and all
>>>>> the rest in shinfo->frags) and breaks GRO packet compilation and
>>>>> the packet flow itself.
>>>>> To support both modes, skb_gro_header_fast() + skb_gro_header_slow()
>>>>> are typically used. UDP even has an inline helper that makes use of
>>>>> them, udp_gro_udphdr(). Use that instead of troublemaking udp_hdr()
>>>>> to get rid of the out-of-order delivers.
>>>>>
>>>>> Present since the introduction of plain UDP GRO in 5.0-rc1.
>>>>>
>>>>> Since v1 [1]:
>>>>>  - added a NULL pointer check for "uh" as suggested by Willem.
>>>>>
>>>>> [1] https://lore.kernel.org/netdev/YazU6GEzBdpyZMDMwJirxDX7B4sualpDG68ADZYvJI@cp4-web-034.plabs.ch
>>>>>
>>>>> Fixes: e20cf8d3f1f7 ("udp: implement GRO for plain UDP sockets.")
>>>>> Signed-off-by: Alexander Lobakin <alobakin@pm.me>
>>>>> ---
>>>>>  net/ipv4/udp_offload.c | 7 ++++++-
>>>>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
>>>>> index e67a66fbf27b..7f6bd221880a 100644
>>>>> --- a/net/ipv4/udp_offload.c
>>>>> +++ b/net/ipv4/udp_offload.c
>>>>> @@ -366,13 +366,18 @@ static struct sk_buff *udp4_ufo_fragment(struct sk_buff *skb,
>>>>>  static struct sk_buff *udp_gro_receive_segment(struct list_head *head,
>>>>>                                              struct sk_buff *skb)
>>>>>  {
>>>>> -     struct udphdr *uh = udp_hdr(skb);
>>>>> +     struct udphdr *uh = udp_gro_udphdr(skb);
>>>>>       struct sk_buff *pp = NULL;
>>>>>       struct udphdr *uh2;
>>>>>       struct sk_buff *p;
>>>>>       unsigned int ulen;
>>>>>       int ret = 0;
>>>>>
>>>>> +     if (unlikely(!uh)) {
>>>>
>>>> How uh could be NULL here ?
>>>>
>>>> My understanding is that udp_gro_receive() is called
>>>> only after udp4_gro_receive() or udp6_gro_receive()
>>>> validated that udp_gro_udphdr(skb) was not NULL.
>>>
>>> Oh indeed. This has already been checked before.
>>>
>>>>> +             NAPI_GRO_CB(skb)->flush = 1;
>>>>> +             return NULL;
>>>>> +     }
>>>>> +
>>>>>       /* requires non zero csum, for symmetry with GSO */
>>>>>       if (!uh->check) {
>>>>>               NAPI_GRO_CB(skb)->flush = 1;
>>>>>
>>>>
>>>> Why uh2 is left unchanged ?
>>>>
>>>>     uh2 = udp_hdr(p);
>>>
>>> Isn't that the same as th2 = tcp_hdr(p) in tcp_gro_receive? no frag0
>>> optimization to worry about for packets on the list.
>>
>> My feeling was that tcp_gro_receive() is terminal in the GRO stack.
>>
>> While UDP could be encapsulated in UDP :)
>>
>> I guess we do not support this yet.
>>
>> Years ago we made sure to propagate the current header offset into GRO stack
>> (when we added SIT/IPIP/GRE support to GRO)
>> 299603e8370a93dd5d8e8d800f0dff1ce2c53d36 ("net-gro: Prepare GRO stack for the upcoming tunneling support")
>
> On which note, and Alexander's mention of udp4_lib_lookup_skb(), that
> performs a standard ip_hdr on the possibly frag0 optimized incoming
> packet:
>
> struct sock *udp4_lib_lookup_skb(struct sk_buff *skb,
>                                  __be16 sport, __be16 dport)
> {
>         const struct iphdr *iph = ip_hdr(skb);
>
>         return __udp4_lib_lookup(dev_net(skb->dev), iph->saddr, sport,
>                                  iph->daddr, dport, inet_iif(skb),
>                                  inet_sdif(skb), &udp_table, NULL);
> }
>
> This should use skb_gro_header_.. too, then?

Well, both of udp{4,6}_lib_lookup_skb() and a couple of neighbour
functions use ip_hdr(skb). I didn't check this path before as my
test case was NAT/forwarding with no sk.
I suspect they should use skb_gro_network_header() to obtain an IP
header. I'll fix them in v3, thanks for pointing this out!

Al


      reply	other threads:[~2020-11-09 19:48 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-09 16:56 [PATCH v2 net] net: udp: fix Fast/frag0 UDP GRO Alexander Lobakin
2020-11-09 17:08 ` Willem de Bruijn
2020-11-09 17:37 ` Eric Dumazet
2020-11-09 18:26   ` Alexander Lobakin
2020-11-09 19:29     ` Eric Dumazet
2020-11-09 19:56       ` Alexander Lobakin
2020-11-09 18:28   ` Willem de Bruijn
2020-11-09 18:54     ` Eric Dumazet
2020-11-09 19:06       ` Willem de Bruijn
2020-11-09 19:48         ` Alexander Lobakin [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=nokss4CZW7lx6fLaxY3rnl4Qx5buif9VLcfCbI@cp3-web-020.plabs.ch \
    --to=alobakin@pm.me \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=kuba@kernel.org \
    --cc=kuznet@ms2.inr.ac.ru \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=steffen.klassert@secunet.com \
    --cc=willemdebruijn.kernel@gmail.com \
    --cc=yoshfuji@linux-ipv6.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.