bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Xuesen Huang <hxseverything@gmail.com>
To: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>,
	David Miller <davem@davemloft.net>, bpf <bpf@vger.kernel.org>,
	Network Development <netdev@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Cong Wang <xiyou.wangcong@gmail.com>,
	Zhiyong Cheng <chengzhiyong@kuaishou.com>,
	Li Wang <wangli09@kuaishou.com>
Subject: Re: [PATCH/v4] bpf: add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_ENCAP_L2_ETH
Date: Thu, 4 Mar 2021 11:22:56 +0800	[thread overview]
Message-ID: <5D5B444A-FE98-46CF-80D2-DEEBE9C1D74A@gmail.com> (raw)
In-Reply-To: <CA+FuTSfY0y7Y2XSKO-rqPY5mX83NWgAWbQeVukFA94eJVu2B2g@mail.gmail.com>



> 2021年3月4日 上午2:53,Willem de Bruijn <willemdebruijn.kernel@gmail.com> 写道:
> 
> On Wed, Mar 3, 2021 at 7:33 AM Xuesen Huang <hxseverything@gmail.com> wrote:
>> 
>> From: Xuesen Huang <huangxuesen@kuaishou.com>
>> 
>> bpf_skb_adjust_room sets the inner_protocol as skb->protocol for packets
>> encapsulation. But that is not appropriate when pushing Ethernet header.
>> 
>> Add an option to further specify encap L2 type and set the inner_protocol
>> as ETH_P_TEB.
>> 
>> Update test_tc_tunnel to verify adding vxlan encapsulation works with
>> this flag.
>> 
>> Suggested-by: Willem de Bruijn <willemb@google.com>
>> Signed-off-by: Xuesen Huang <huangxuesen@kuaishou.com>
>> Signed-off-by: Zhiyong Cheng <chengzhiyong@kuaishou.com>
>> Signed-off-by: Li Wang <wangli09@kuaishou.com>
> 
> Thanks for adding the test. Perhaps that is better in a separate patch?
> 
> Overall looks great to me.
> 
> The patch has not (yet?) arrived on patchwork.
> 
Thanks Willem, I will separate it into two patch.

I will send patch/v5 with only that new flag addition, lol.

>> enum {
>> diff --git a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
>> index 37bce7a..6e144db 100644
>> --- a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
>> +++ b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
>> @@ -20,6 +20,14 @@
>> #include <bpf/bpf_endian.h>
>> #include <bpf/bpf_helpers.h>
>> 
>> +#define encap_ipv4(...) __encap_ipv4(__VA_ARGS__, 0)
>> +
>> +#define encap_ipv4_with_ext_proto(...) __encap_ipv4(__VA_ARGS__)
>> +
>> +#define encap_ipv6(...) __encap_ipv6(__VA_ARGS__, 0)
>> +
>> +#define encap_ipv6_with_ext_proto(...) __encap_ipv6(__VA_ARGS__)
>> +
> 
> Instead of untyped macros, I'd define encap_ipv4 as a function that
> calls __encap_ipv4.
> 
> And no need for encap_ipv4_with_ext_proto equivalent to __encap_ipv4.
> 
I defined these macros to try to keep the existing  invocation for encap_ipv4/6
as the same, if we define this as a function all invocation should be modified?

>> static const int cfg_port = 8000;
>> 
>> static const int cfg_udp_src = 20000;
>> @@ -27,11 +35,24 @@
>> #define        UDP_PORT                5555
>> #define        MPLS_OVER_UDP_PORT      6635
>> #define        ETH_OVER_UDP_PORT       7777
>> +#define        VXLAN_UDP_PORT          8472
>> +
>> +#define        EXTPROTO_VXLAN  0x1
>> +
>> +#define        VXLAN_N_VID     (1u << 24)
>> +#define        VXLAN_VNI_MASK  bpf_htonl((VXLAN_N_VID - 1) << 8)
>> +#define        VXLAN_FLAGS     0x8
>> +#define        VXLAN_VNI       1
>> 
>> /* MPLS label 1000 with S bit (last label) set and ttl of 255. */
>> static const __u32 mpls_label = __bpf_constant_htonl(1000 << 12 |
>>                                                     MPLS_LS_S_MASK | 0xff);
>> 
>> +struct vxlanhdr {
>> +       __be32 vx_flags;
>> +       __be32 vx_vni;
>> +} __attribute__((packed));
>> +
>> struct gre_hdr {
>>        __be16 flags;
>>        __be16 protocol;
>> @@ -45,13 +66,13 @@ struct gre_hdr {
>> struct v4hdr {
>>        struct iphdr ip;
>>        union l4hdr l4hdr;
>> -       __u8 pad[16];                   /* enough space for L2 header */
>> +       __u8 pad[24];                   /* space for L2 header / vxlan header ... */
> 
> could we use something like sizeof(..) instead of a constant?
> 
Thanks, I will try to fix this.

>> @@ -171,14 +197,26 @@ static __always_inline int encap_ipv4(struct __sk_buff *skb, __u8 encap_proto,
>>        }
>> 
>>        /* add L2 encap (if specified) */
>> +       l2_hdr = (__u8 *)&h_outer + olen;
>>        switch (l2_proto) {
>>        case ETH_P_MPLS_UC:
>> -               *((__u32 *)((__u8 *)&h_outer + olen)) = mpls_label;
>> +               *(__u32 *)l2_hdr = mpls_label;
>>                break;
>>        case ETH_P_TEB:
>> -               if (bpf_skb_load_bytes(skb, 0, (__u8 *)&h_outer + olen,
>> -                                      ETH_HLEN))
> 
> This is non-standard indentation? Here and elsewhere.
I thinks it’s a previous issue.

> 
>> @@ -249,7 +288,11 @@ static __always_inline int encap_ipv6(struct __sk_buff *skb, __u8 encap_proto,
>>                break;
>>        case ETH_P_TEB:
>>                l2_len = ETH_HLEN;
>> -               udp_dst = ETH_OVER_UDP_PORT;
>> +               if (ext_proto & EXTPROTO_VXLAN) {
>> +                       udp_dst = VXLAN_UDP_PORT;
>> +                       l2_len += sizeof(struct vxlanhdr);
>> +               } else
>> +                       udp_dst = ETH_OVER_UDP_PORT;
>>                break;
>>        }
>>        flags |= BPF_F_ADJ_ROOM_ENCAP_L2(l2_len);
>> @@ -267,7 +310,7 @@ static __always_inline int encap_ipv6(struct __sk_buff *skb, __u8 encap_proto,
>>                h_outer.l4hdr.udp.source = __bpf_constant_htons(cfg_udp_src);
>>                h_outer.l4hdr.udp.dest = bpf_htons(udp_dst);
>>                tot_len = bpf_ntohs(iph_inner.payload_len) + sizeof(iph_inner) +
>> -                         sizeof(h_outer.l4hdr.udp);
>> +                         sizeof(h_outer.l4hdr.udp) + l2_len;
> 
> Was this a bug previously?
> 
Yes, a tiny bug.

>>                h_outer.l4hdr.udp.check = 0;
>>                h_outer.l4hdr.udp.len = bpf_htons(tot_len);
>>                break;
>> @@ -278,13 +321,24 @@ static __always_inline int encap_ipv6(struct __sk_buff *skb, __u8 encap_proto,
>>        }
>> 
>>        /* add L2 encap (if specified) */
>> +       l2_hdr = (__u8 *)&h_outer + olen;
>>        switch (l2_proto) {
>>        case ETH_P_MPLS_UC:
>> -               *((__u32 *)((__u8 *)&h_outer + olen)) = mpls_label;
>> +               *(__u32 *)l2_hdr = mpls_label;
>>                break;
>>        case ETH_P_TEB:
>> -               if (bpf_skb_load_bytes(skb, 0, (__u8 *)&h_outer + olen,
>> -                                      ETH_HLEN))
>> +               flags |= BPF_F_ADJ_ROOM_ENCAP_L2_ETH;
> 
> This is a change also for the existing case. Correctly so, I imagine.
> But the test used to pass with the wrong protocol?
Yes all tests pass. I’m not sure should we add this flag for the existing tests
which encap eth as the l2 header or only for the Vxlan test?  

Waiting for your suggestion.
Thanks.



  reply	other threads:[~2021-03-04  3:25 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-03 12:33 [PATCH/v4] bpf: add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_ENCAP_L2_ETH Xuesen Huang
2021-03-03 12:48 ` Xuesen Huang
2021-03-03 18:53 ` Willem de Bruijn
2021-03-04  3:22   ` Xuesen Huang [this message]
2021-03-04  3:40     ` Willem de Bruijn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5D5B444A-FE98-46CF-80D2-DEEBE9C1D74A@gmail.com \
    --to=hxseverything@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=chengzhiyong@kuaishou.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=wangli09@kuaishou.com \
    --cc=willemdebruijn.kernel@gmail.com \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).