All of lore.kernel.org
 help / color / mirror / Atom feed
From: 黄学森 <hxseverything@gmail.com>
To: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: David Miller <davem@davemloft.net>, bpf <bpf@vger.kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Network Development <netdev@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	chengzhiyong <chengzhiyong@kuaishou.com>,
	wangli <wangli09@kuaishou.com>
Subject: Re: [PATCH] bpf: in bpf_skb_adjust_room correct inner protocol for vxlan
Date: Tue, 9 Feb 2021 18:41:41 +0800	[thread overview]
Message-ID: <8552C5F8-8410-4E81-8AF4-7018878AFCDC@gmail.com> (raw)
In-Reply-To: <CA+FuTScScC2o6uDjua0T3Eucbjt8-YPf65h3xgxMpTtWvgjWmg@mail.gmail.com>

Appreciate for your reply Willem!

The original intention of this commit is that when we use bpf_skb_adjust_room  to encapsulate 
Vxlan packets, we find some powerful device features disabled. 

Setting the inner_protocol directly as skb->protocol is the root cause.

I understand that it’s not easy to handle all tunnel protocol in one bpf helper function. But for my
immature idea, when pushing Ethernet header, setting the inner_protocol as ETH_P_TEB may
be better.

Now the flag BPF_F_ADJ_ROOM_ENCAP_L4_UDP includes many udp tunnel types( e.g. 
udp+mpls, geneve, vxlan). Adding an independent flag to represents Vxlan looks a little 
reduplicative. What’s your suggestion?

Thanks again for your reply!



> 2021年2月8日 下午9:06,Willem de Bruijn <willemdebruijn.kernel@gmail.com> 写道:
> 
> On Mon, Feb 8, 2021 at 7:16 AM huangxuesen <hxseverything@gmail.com> wrote:
>> 
>> From: huangxuesen <huangxuesen@kuaishou.com>
>> 
>> When pushing vxlan tunnel header, set inner protocol as ETH_P_TEB in skb
>> to avoid HW device disabling udp tunnel segmentation offload, just like
>> vxlan_build_skb does.
>> 
>> Drivers for NIC may invoke vxlan_features_check to check the
>> inner_protocol in skb for vxlan packets to decide whether to disable
>> NETIF_F_GSO_MASK. Currently it sets inner_protocol as the original
>> skb->protocol, that will make mlx5_core disable TSO and lead to huge
>> performance degradation.
>> 
>> Signed-off-by: huangxuesen <huangxuesen@kuaishou.com>
>> Signed-off-by: chengzhiyong <chengzhiyong@kuaishou.com>
>> Signed-off-by: wangli <wangli09@kuaishou.com>
>> ---
>> net/core/filter.c | 7 ++++++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>> 
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index 255aeee72402..f8d3ba3fe10f 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -3466,7 +3466,12 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
>>                skb->inner_mac_header = inner_net - inner_mac_len;
>>                skb->inner_network_header = inner_net;
>>                skb->inner_transport_header = inner_trans;
>> -               skb_set_inner_protocol(skb, skb->protocol);
>> +
>> +               if (flags & BPF_F_ADJ_ROOM_ENCAP_L4_UDP &&
>> +                   inner_mac_len == ETH_HLEN)
>> +                       skb_set_inner_protocol(skb, htons(ETH_P_TEB));
> 
> This may be used by vxlan, but it does not imply it.
> 
> Adding ETH_HLEN bytes likely means pushing an Ethernet header, but same point.
> 
> Conversely, pushing an Ethernet header is not limited to UDP encap.
> 
> This probably needs a new explicit BPF_F_ADJ_ROOM_.. flag, rather than
> trying to infer from imprecise heuristics.


  reply	other threads:[~2021-02-09 10:51 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-08 11:38 [PATCH] bpf: in bpf_skb_adjust_room correct inner protocol for vxlan huangxuesen
2021-02-08 13:06 ` Willem de Bruijn
2021-02-09 10:41   ` 黄学森 [this message]
2021-02-09 13:48     ` Willem de Bruijn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8552C5F8-8410-4E81-8AF4-7018878AFCDC@gmail.com \
    --to=hxseverything@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=chengzhiyong@kuaishou.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=wangli09@kuaishou.com \
    --cc=willemdebruijn.kernel@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.