* [PATCH bpf-next v2 0/2] bpf: Add ipip6 and ip6ip decap support for bpf_skb_adjust_room()
@ 2023-01-11  8:01 Ziyang Xuan
  2023-01-11  8:01 ` [PATCH bpf-next v2 1/2] " Ziyang Xuan
  2023-01-11  8:01 ` [PATCH bpf-next v2 2/2] selftests/bpf: add ipip6 and ip6ip decap to test_tc_tunnel Ziyang Xuan
  0 siblings, 2 replies; 8+ messages in thread
From: Ziyang Xuan @ 2023-01-11  8:01 UTC
  To: ast, daniel, andrii, davem, edumazet, kuba, pabeni, willemb, bpf,
	netdev, martin.lau, song, yhs, john.fastabend, kpsingh, sdf,
	haoluo, jolsa

Add ipip6 and ip6ip decap support for bpf_skb_adjust_room().
The main use case is using cls_bpf on the ingress hook to decapsulate
IPv4-over-IPv6 and IPv6-over-IPv4 tunnel packets.

Also add ipip6 and ip6ip decap test cases to verify that
bpf_skb_adjust_room() correctly decapsulates ipip6 and ip6ip
tunnel packets.

$ ./test_tc_tunnel.sh
ipip
encap 192.168.1.1 to 192.168.1.2, type ipip, mac none len 100
test basic connectivity
0
test bpf encap without decap (expect failure)
Ncat: TIMEOUT.
1
test bpf encap with tunnel device decap
0
test bpf encap with bpf decap
0
OK
ipip6
encap 192.168.1.1 to 192.168.1.2, type ipip6, mac none len 100
test basic connectivity
0
test bpf encap without decap (expect failure)
Ncat: TIMEOUT.
1
test bpf encap with tunnel device decap
0
test bpf encap with bpf decap
0
OK
ip6ip6
encap fd::1 to fd::2, type ip6tnl, mac none len 100
test basic connectivity
0
test bpf encap without decap (expect failure)
Ncat: TIMEOUT.
1
test bpf encap with tunnel device decap
0
test bpf encap with bpf decap
0
OK
sit
encap fd::1 to fd::2, type sit, mac none len 100
test basic connectivity
0
test bpf encap without decap (expect failure)
Ncat: TIMEOUT.
1
test bpf encap with tunnel device decap
0
test bpf encap with bpf decap
0
OK
...
OK. All tests passed

v2:
  - Use decap flags to indicate the new IP header.
    Do not rely on skb->encapsulation.

Ziyang Xuan (2):
  bpf: Add ipip6 and ip6ip decap support for bpf_skb_adjust_room()
  selftests/bpf: add ipip6 and ip6ip decap to test_tc_tunnel

 include/uapi/linux/bpf.h                      |  8 ++
 net/core/filter.c                             | 26 +++++-
 tools/include/uapi/linux/bpf.h                |  8 ++
 .../selftests/bpf/progs/test_tc_tunnel.c      | 91 ++++++++++++++++++-
 tools/testing/selftests/bpf/test_tc_tunnel.sh | 15 +--
 5 files changed, 139 insertions(+), 9 deletions(-)

-- 
2.25.1



* [PATCH bpf-next v2 1/2] bpf: Add ipip6 and ip6ip decap support for bpf_skb_adjust_room()
  2023-01-11  8:01 [PATCH bpf-next v2 0/2] bpf: Add ipip6 and ip6ip decap support for bpf_skb_adjust_room() Ziyang Xuan
@ 2023-01-11  8:01 ` Ziyang Xuan
  2023-01-11 15:43   ` Willem de Bruijn
  2023-01-12  1:26   ` Martin KaFai Lau
  2023-01-11  8:01 ` [PATCH bpf-next v2 2/2] selftests/bpf: add ipip6 and ip6ip decap to test_tc_tunnel Ziyang Xuan
  1 sibling, 2 replies; 8+ messages in thread
From: Ziyang Xuan @ 2023-01-11  8:01 UTC
  To: ast, daniel, andrii, davem, edumazet, kuba, pabeni, willemb, bpf,
	netdev, martin.lau, song, yhs, john.fastabend, kpsingh, sdf,
	haoluo, jolsa

Add ipip6 and ip6ip decap support for bpf_skb_adjust_room().
The main use case is using cls_bpf on the ingress hook to decapsulate
IPv4-over-IPv6 and IPv6-over-IPv4 tunnel packets.

Add two new flags, BPF_F_ADJ_ROOM_DECAP_L3_IPV{4,6}, to indicate the
new IP header version after decapsulating the outer IP header.
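
To illustrate the flag semantics described above, here is a minimal
userspace sketch (not kernel code) of the protocol fix-up this patch adds
to bpf_skb_net_shrink(). The helper name proto_after_decap() is
hypothetical, and EtherTypes are compared in host byte order for
simplicity, whereas the kernel compares against htons() values.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values as added by this patch to include/uapi/linux/bpf.h. */
#define BPF_F_ADJ_ROOM_DECAP_L3_IPV4	(1ULL << 7)
#define BPF_F_ADJ_ROOM_DECAP_L3_IPV6	(1ULL << 8)

/* EtherType values as in <linux/if_ether.h>, host byte order here. */
#define ETH_P_IP	0x0800
#define ETH_P_IPV6	0x86DD

static_assert((BPF_F_ADJ_ROOM_DECAP_L3_IPV4 &
	       BPF_F_ADJ_ROOM_DECAP_L3_IPV6) == 0,
	      "the two decap flags must be distinct bits");

/* Hypothetical helper: protocol the skb should carry once the outer
 * IP header has been removed. Without a decap flag, or when the flag
 * names the version the skb already has, the protocol is unchanged.
 */
uint16_t proto_after_decap(uint16_t proto, uint64_t flags)
{
	if (proto == ETH_P_IP && (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6))
		return ETH_P_IPV6;
	if (proto == ETH_P_IPV6 && (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4))
		return ETH_P_IP;
	return proto;
}
```

Calling proto_after_decap(ETH_P_IPV6, BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
models the ipip6 case: an IPv6 outer header is stripped and an IPv4
packet remains.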

Suggested-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
---
 include/uapi/linux/bpf.h       |  8 ++++++++
 net/core/filter.c              | 26 +++++++++++++++++++++++++-
 tools/include/uapi/linux/bpf.h |  8 ++++++++
 3 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 464ca3f01fe7..dde1c2ea1c84 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2644,6 +2644,12 @@ union bpf_attr {
  *		  Use with BPF_F_ADJ_ROOM_ENCAP_L2 flag to further specify the
  *		  L2 type as Ethernet.
  *
+ *              * **BPF_F_ADJ_ROOM_DECAP_L3_IPV4**,
+ *                **BPF_F_ADJ_ROOM_DECAP_L3_IPV6**:
+ *                Indicate the new IP header version after decapsulating the
+ *                outer IP header. Mainly used in scenarios that the inner and
+ *                outer IP versions are different.
+ *
  * 		A call to this helper is susceptible to change the underlying
  * 		packet buffer. Therefore, at load time, all checks on pointers
  * 		previously done by the verifier are invalidated and must be
@@ -5803,6 +5809,8 @@ enum {
 	BPF_F_ADJ_ROOM_ENCAP_L4_UDP	= (1ULL << 4),
 	BPF_F_ADJ_ROOM_NO_CSUM_RESET	= (1ULL << 5),
 	BPF_F_ADJ_ROOM_ENCAP_L2_ETH	= (1ULL << 6),
+	BPF_F_ADJ_ROOM_DECAP_L3_IPV4	= (1ULL << 7),
+	BPF_F_ADJ_ROOM_DECAP_L3_IPV6	= (1ULL << 8),
 };
 
 enum {
diff --git a/net/core/filter.c b/net/core/filter.c
index 43cc1fe58a2c..5fb113953f80 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3381,13 +3381,17 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
 #define BPF_F_ADJ_ROOM_ENCAP_L3_MASK	(BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 | \
 					 BPF_F_ADJ_ROOM_ENCAP_L3_IPV6)
 
+#define BPF_F_ADJ_ROOM_DECAP_L3_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_IPV4 | \
+					 BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
+
 #define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO | \
 					 BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
 					 BPF_F_ADJ_ROOM_ENCAP_L4_GRE | \
 					 BPF_F_ADJ_ROOM_ENCAP_L4_UDP | \
 					 BPF_F_ADJ_ROOM_ENCAP_L2_ETH | \
 					 BPF_F_ADJ_ROOM_ENCAP_L2( \
-					  BPF_ADJ_ROOM_ENCAP_L2_MASK))
+					  BPF_ADJ_ROOM_ENCAP_L2_MASK) | \
+					 BPF_F_ADJ_ROOM_DECAP_L3_MASK)
 
 static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
 			    u64 flags)
@@ -3501,6 +3505,7 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff,
 	int ret;
 
 	if (unlikely(flags & ~(BPF_F_ADJ_ROOM_FIXED_GSO |
+			       BPF_F_ADJ_ROOM_DECAP_L3_MASK |
 			       BPF_F_ADJ_ROOM_NO_CSUM_RESET)))
 		return -EINVAL;
 
@@ -3519,6 +3524,14 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff,
 	if (unlikely(ret < 0))
 		return ret;
 
+	/* Match skb->protocol to new outer l3 protocol */
+	if (skb->protocol == htons(ETH_P_IP) &&
+	    flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
+		skb->protocol = htons(ETH_P_IPV6);
+	else if (skb->protocol == htons(ETH_P_IPV6) &&
+		 flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
+		skb->protocol = htons(ETH_P_IP);
+
 	if (skb_is_gso(skb)) {
 		struct skb_shared_info *shinfo = skb_shinfo(skb);
 
@@ -3596,6 +3609,10 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
 	if (unlikely(proto != htons(ETH_P_IP) &&
 		     proto != htons(ETH_P_IPV6)))
 		return -ENOTSUPP;
+	if (unlikely((shrink && ((flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) ==
+		      BPF_F_ADJ_ROOM_DECAP_L3_MASK)) || (!shrink &&
+		      flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK)))
+		return -EINVAL;
 
 	off = skb_mac_header_len(skb);
 	switch (mode) {
@@ -3608,6 +3625,13 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
 		return -ENOTSUPP;
 	}
 
+	if (shrink) {
+		if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
+			len_min = sizeof(struct ipv6hdr);
+		else if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
+			len_min = sizeof(struct iphdr);
+	}
+
 	len_cur = skb->len - skb_network_offset(skb);
 	if ((shrink && (len_diff_abs >= len_cur ||
 			len_cur - len_diff_abs < len_min)) ||
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 464ca3f01fe7..22672e5c8466 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2644,6 +2644,12 @@ union bpf_attr {
  *		  Use with BPF_F_ADJ_ROOM_ENCAP_L2 flag to further specify the
  *		  L2 type as Ethernet.
  *
+ *              * **BPF_F_ADJ_ROOM_DECAP_L3_IPV4**,
+ *                **BPF_F_ADJ_ROOM_DECAP_L3_IPV6**:
+ *                Indicate the new IP header version after decapsulating the
+ *                outer IP header. Mainly used in scenarios that the inner and
+ *                outer IP versions are different.
+ *
  * 		A call to this helper is susceptible to change the underlying
  * 		packet buffer. Therefore, at load time, all checks on pointers
  * 		previously done by the verifier are invalidated and must be
@@ -5803,6 +5809,8 @@ enum {
 	BPF_F_ADJ_ROOM_ENCAP_L4_UDP	= (1ULL << 4),
 	BPF_F_ADJ_ROOM_NO_CSUM_RESET	= (1ULL << 5),
 	BPF_F_ADJ_ROOM_ENCAP_L2_ETH	= (1ULL << 6),
+	BPF_F_ADJ_ROOM_DECAP_L3_IPV4    = (1ULL << 7),
+	BPF_F_ADJ_ROOM_DECAP_L3_IPV6    = (1ULL << 8),
 };
 
 enum {
-- 
2.25.1



* [PATCH bpf-next v2 2/2] selftests/bpf: add ipip6 and ip6ip decap to test_tc_tunnel
  2023-01-11  8:01 [PATCH bpf-next v2 0/2] bpf: Add ipip6 and ip6ip decap support for bpf_skb_adjust_room() Ziyang Xuan
  2023-01-11  8:01 ` [PATCH bpf-next v2 1/2] " Ziyang Xuan
@ 2023-01-11  8:01 ` Ziyang Xuan
  2023-01-11 15:47   ` Willem de Bruijn
  1 sibling, 1 reply; 8+ messages in thread
From: Ziyang Xuan @ 2023-01-11  8:01 UTC
  To: ast, daniel, andrii, davem, edumazet, kuba, pabeni, willemb, bpf,
	netdev, martin.lau, song, yhs, john.fastabend, kpsingh, sdf,
	haoluo, jolsa

Add ipip6 and ip6ip decap test cases. Verify that bpf_skb_adjust_room()
correctly decapsulates ipip6 and ip6ip tunnel packets.
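
As a rough illustration of what the test's decap path has to decide, the
sketch below models in plain userspace C how a decap flag can be chosen
from the outer header's next-header value, looking through one IPv6
destination options header. The helper name decap_l3_flag() and its
return convention are hypothetical; the protocol numbers come from
<netinet/in.h>.

```c
#include <assert.h>
#include <netinet/in.h>		/* IPPROTO_IPIP, IPPROTO_IPV6 */
#include <stdint.h>

/* Flag values as added by patch 1/2 to include/uapi/linux/bpf.h. */
#define BPF_F_ADJ_ROOM_DECAP_L3_IPV4	(1ULL << 7)
#define BPF_F_ADJ_ROOM_DECAP_L3_IPV6	(1ULL << 8)

#ifndef NEXTHDR_DEST
#define NEXTHDR_DEST	60	/* IPv6 destination options header */
#endif

static_assert(IPPROTO_IPIP == 4 && IPPROTO_IPV6 == 41,
	      "IANA protocol numbers for IPv4-in-IP and IPv6-in-IP");

/* Hypothetical helper: OR the matching decap flag into *flags.
 * proto is the outer header's next-header value; opt_nexthdr is the
 * next-header field of a destination options header and is consulted
 * only when proto is NEXTHDR_DEST.
 * Returns 0 on success, -1 if the payload is not plain IP-in-IP.
 */
int decap_l3_flag(uint8_t proto, uint8_t opt_nexthdr, uint64_t *flags)
{
	if (proto == NEXTHDR_DEST)
		proto = opt_nexthdr;	/* look through one options header */

	switch (proto) {
	case IPPROTO_IPIP:
		*flags |= BPF_F_ADJ_ROOM_DECAP_L3_IPV4;
		return 0;
	case IPPROTO_IPV6:
		*flags |= BPF_F_ADJ_ROOM_DECAP_L3_IPV6;
		return 0;
	default:
		return -1;
	}
}
```

The sketch covers only the IP-in-IP cases; GRE and UDP encapsulations
take separate branches in the test program and are left out here.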

Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
---
 .../selftests/bpf/progs/test_tc_tunnel.c      | 91 ++++++++++++++++++-
 tools/testing/selftests/bpf/test_tc_tunnel.sh | 15 +--
 2 files changed, 98 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
index a0e7762b1e5a..e6e678aa9874 100644
--- a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
+++ b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
@@ -38,6 +38,10 @@ static const int cfg_udp_src = 20000;
 #define	VXLAN_FLAGS     0x8
 #define	VXLAN_VNI       1
 
+#ifndef NEXTHDR_DEST
+#define NEXTHDR_DEST	60
+#endif
+
 /* MPLS label 1000 with S bit (last label) set and ttl of 255. */
 static const __u32 mpls_label = __bpf_constant_htonl(1000 << 12 |
 						     MPLS_LS_S_MASK | 0xff);
@@ -363,6 +367,61 @@ static __always_inline int __encap_ipv6(struct __sk_buff *skb, __u8 encap_proto,
 	return TC_ACT_OK;
 }
 
+static int encap_ipv6_ipip6(struct __sk_buff *skb)
+{
+	struct iphdr iph_inner;
+	struct v6hdr h_outer;
+	struct tcphdr tcph;
+	struct ethhdr eth;
+	__u64 flags;
+	int olen;
+
+	if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_inner,
+			       sizeof(iph_inner)) < 0)
+		return TC_ACT_OK;
+
+	/* filter only packets we want */
+	if (bpf_skb_load_bytes(skb, ETH_HLEN + (iph_inner.ihl << 2),
+			       &tcph, sizeof(tcph)) < 0)
+		return TC_ACT_OK;
+
+	if (tcph.dest != __bpf_constant_htons(cfg_port))
+		return TC_ACT_OK;
+
+	olen = sizeof(h_outer.ip);
+
+	flags = BPF_F_ADJ_ROOM_FIXED_GSO | BPF_F_ADJ_ROOM_ENCAP_L3_IPV6;
+
+	/* add room between mac and network header */
+	if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_MAC, flags))
+		return TC_ACT_SHOT;
+
+	/* prepare new outer network header */
+	memset(&h_outer.ip, 0, sizeof(h_outer.ip));
+	h_outer.ip.version = 6;
+	h_outer.ip.hop_limit = iph_inner.ttl;
+	h_outer.ip.saddr.s6_addr[1] = 0xfd;
+	h_outer.ip.saddr.s6_addr[15] = 1;
+	h_outer.ip.daddr.s6_addr[1] = 0xfd;
+	h_outer.ip.daddr.s6_addr[15] = 2;
+	h_outer.ip.payload_len = iph_inner.tot_len;
+	h_outer.ip.nexthdr = IPPROTO_IPIP;
+
+	/* store new outer network header */
+	if (bpf_skb_store_bytes(skb, ETH_HLEN, &h_outer, olen,
+				BPF_F_INVALIDATE_HASH) < 0)
+		return TC_ACT_SHOT;
+
+	/* update eth->h_proto */
+	if (bpf_skb_load_bytes(skb, 0, &eth, sizeof(eth)) < 0)
+		return TC_ACT_SHOT;
+	eth.h_proto = bpf_htons(ETH_P_IPV6);
+	if (bpf_skb_store_bytes(skb, 0, &eth, sizeof(eth), 0) < 0)
+		return TC_ACT_SHOT;
+
+	return TC_ACT_OK;
+}
+
 static __always_inline int encap_ipv6(struct __sk_buff *skb, __u8 encap_proto,
 				      __u16 l2_proto)
 {
@@ -461,6 +520,15 @@ int __encap_ip6tnl_none(struct __sk_buff *skb)
 		return TC_ACT_OK;
 }
 
+SEC("encap_ipip6_none")
+int __encap_ipip6_none(struct __sk_buff *skb)
+{
+	if (skb->protocol == __bpf_constant_htons(ETH_P_IP))
+		return encap_ipv6_ipip6(skb);
+	else
+		return TC_ACT_OK;
+}
+
 SEC("encap_ip6gre_none")
 int __encap_ip6gre_none(struct __sk_buff *skb)
 {
@@ -528,13 +596,33 @@ int __encap_ip6vxlan_eth(struct __sk_buff *skb)
 
 static int decap_internal(struct __sk_buff *skb, int off, int len, char proto)
 {
+	__u64 flags = BPF_F_ADJ_ROOM_FIXED_GSO;
+	struct ipv6_opt_hdr ip6_opt_hdr;
 	struct gre_hdr greh;
 	struct udphdr udph;
 	int olen = len;
 
 	switch (proto) {
 	case IPPROTO_IPIP:
+		flags |= BPF_F_ADJ_ROOM_DECAP_L3_IPV4;
+		break;
 	case IPPROTO_IPV6:
+		flags |= BPF_F_ADJ_ROOM_DECAP_L3_IPV6;
+		break;
+	case NEXTHDR_DEST:
+		if (bpf_skb_load_bytes(skb, off + len, &ip6_opt_hdr,
+				       sizeof(ip6_opt_hdr)) < 0)
+			return TC_ACT_OK;
+		switch (ip6_opt_hdr.nexthdr) {
+		case IPPROTO_IPIP:
+			flags |= BPF_F_ADJ_ROOM_DECAP_L3_IPV4;
+			break;
+		case IPPROTO_IPV6:
+			flags |= BPF_F_ADJ_ROOM_DECAP_L3_IPV6;
+			break;
+		default:
+			return TC_ACT_OK;
+		}
 		break;
 	case IPPROTO_GRE:
 		olen += sizeof(struct gre_hdr);
@@ -569,8 +657,7 @@ static int decap_internal(struct __sk_buff *skb, int off, int len, char proto)
 		return TC_ACT_OK;
 	}
 
-	if (bpf_skb_adjust_room(skb, -olen, BPF_ADJ_ROOM_MAC,
-				BPF_F_ADJ_ROOM_FIXED_GSO))
+	if (bpf_skb_adjust_room(skb, -olen, BPF_ADJ_ROOM_MAC, flags))
 		return TC_ACT_SHOT;
 
 	return TC_ACT_OK;
diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
index 334bdfeab940..910044f08908 100755
--- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -100,6 +100,9 @@ if [[ "$#" -eq "0" ]]; then
 	echo "ipip"
 	$0 ipv4 ipip none 100
 
+	echo "ipip6"
+	$0 ipv4 ipip6 none 100
+
 	echo "ip6ip6"
 	$0 ipv6 ip6tnl none 100
 
@@ -224,6 +227,9 @@ elif [[ "$tuntype" =~ "gre" && "$mac" == "eth" ]]; then
 elif [[ "$tuntype" =~ "vxlan" && "$mac" == "eth" ]]; then
 	ttype="vxlan"
 	targs="id 1 dstport 8472 udp6zerocsumrx"
+elif [[ "$tuntype" == "ipip6" ]]; then
+	ttype="ip6tnl"
+	targs=""
 else
 	ttype=$tuntype
 	targs=""
@@ -233,6 +239,9 @@ fi
 if [[ "${tuntype}" == "sit" ]]; then
 	link_addr1="${ns1_v4}"
 	link_addr2="${ns2_v4}"
+elif [[ "${tuntype}" == "ipip6" ]]; then
+	link_addr1="${ns1_v6}"
+	link_addr2="${ns2_v6}"
 else
 	link_addr1="${addr1}"
 	link_addr2="${addr2}"
@@ -287,12 +296,6 @@ else
 	server_listen
 fi
 
-# bpf_skb_net_shrink does not take tunnel flags yet, cannot update L3.
-if [[ "${tuntype}" == "sit" ]]; then
-	echo OK
-	exit 0
-fi
-
 # serverside, use BPF for decap
 ip netns exec "${ns2}" ip link del dev testtun0
 ip netns exec "${ns2}" tc qdisc add dev veth2 clsact
-- 
2.25.1



* Re: [PATCH bpf-next v2 1/2] bpf: Add ipip6 and ip6ip decap support for bpf_skb_adjust_room()
  2023-01-11  8:01 ` [PATCH bpf-next v2 1/2] " Ziyang Xuan
@ 2023-01-11 15:43   ` Willem de Bruijn
  2023-01-12  7:15     ` Ziyang Xuan (William)
  2023-01-12  1:26   ` Martin KaFai Lau
  1 sibling, 1 reply; 8+ messages in thread
From: Willem de Bruijn @ 2023-01-11 15:43 UTC
  To: Ziyang Xuan
  Cc: ast, daniel, andrii, davem, edumazet, kuba, pabeni, bpf, netdev,
	martin.lau, song, yhs, john.fastabend, kpsingh, sdf, haoluo,
	jolsa

On Wed, Jan 11, 2023 at 3:01 AM Ziyang Xuan
<william.xuanziyang@huawei.com> wrote:
>
> Add ipip6 and ip6ip decap support for bpf_skb_adjust_room().
> Main use case is for using cls_bpf on ingress hook to decapsulate
> IPv4 over IPv6 and IPv6 over IPv4 tunnel packets.
>
> Add two new flags BPF_F_ADJ_ROOM_DECAP_L3_IPV{4,6} to indicate the
> new IP header version after decapsulating the outer IP header.
>
> Suggested-by: Willem de Bruijn <willemb@google.com>
> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
> ---
>  include/uapi/linux/bpf.h       |  8 ++++++++
>  net/core/filter.c              | 26 +++++++++++++++++++++++++-
>  tools/include/uapi/linux/bpf.h |  8 ++++++++
>  3 files changed, 41 insertions(+), 1 deletion(-)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 464ca3f01fe7..dde1c2ea1c84 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -2644,6 +2644,12 @@ union bpf_attr {
>   *               Use with BPF_F_ADJ_ROOM_ENCAP_L2 flag to further specify the
>   *               L2 type as Ethernet.
>   *
> + *              * **BPF_F_ADJ_ROOM_DECAP_L3_IPV4**,
> + *                **BPF_F_ADJ_ROOM_DECAP_L3_IPV6**:
> + *                Indicate the new IP header version after decapsulating the
> + *                outer IP header. Mainly used in scenarios that the inner and
> + *                outer IP versions are different.
> + *

Nit (only since I have another comment below)

Indicate -> Set
[Mainly used .. that] -> [Used when]

>         if (skb_is_gso(skb)) {
>                 struct skb_shared_info *shinfo = skb_shinfo(skb);
>
> @@ -3596,6 +3609,10 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>         if (unlikely(proto != htons(ETH_P_IP) &&
>                      proto != htons(ETH_P_IPV6)))
>                 return -ENOTSUPP;
> +       if (unlikely((shrink && ((flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) ==
> +                     BPF_F_ADJ_ROOM_DECAP_L3_MASK)) || (!shrink &&
> +                     flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK)))
> +               return -EINVAL;
>
>         off = skb_mac_header_len(skb);
>         switch (mode) {
> @@ -3608,6 +3625,13 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>                 return -ENOTSUPP;
>         }
>
> +       if (shrink) {
> +               if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
> +                       len_min = sizeof(struct ipv6hdr);
> +               else if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
> +                       len_min = sizeof(struct iphdr);
> +       }
> +

How about combining this branch with the above:

  if (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
    if (!shrink)
      return -EINVAL;

    switch (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
      case BPF_F_ADJ_ROOM_DECAP_L3_IPV4:
        len_min = sizeof(struct iphdr);
        break;
      case BPF_F_ADJ_ROOM_DECAP_L3_IPV6:
        len_min = sizeof(struct ipv6hdr);
        break;
      default:
        return -EINVAL;
    }
  }
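
For reference, the combined check suggested above can be written as a
standalone function and exercised in userspace. This is only a sketch:
the function name decap_check() is hypothetical, and the header sizes
are hard-coded stand-ins for sizeof(struct iphdr) and
sizeof(struct ipv6hdr).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define BPF_F_ADJ_ROOM_DECAP_L3_IPV4	(1ULL << 7)
#define BPF_F_ADJ_ROOM_DECAP_L3_IPV6	(1ULL << 8)
#define BPF_F_ADJ_ROOM_DECAP_L3_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_IPV4 | \
					 BPF_F_ADJ_ROOM_DECAP_L3_IPV6)

/* Stand-ins for sizeof(struct iphdr) and sizeof(struct ipv6hdr). */
enum { IPV4_HDR_LEN = 20, IPV6_HDR_LEN = 40 };

static_assert(BPF_F_ADJ_ROOM_DECAP_L3_MASK == 0x180,
	      "mask covers exactly bits 7 and 8");

/* Hypothetical helper: validate the decap flags and, if one is set,
 * store the minimum length of the remaining packet in *len_min.
 * Returns 0 on success or -EINVAL for an invalid combination.
 */
int decap_check(uint64_t flags, bool shrink, uint32_t *len_min)
{
	if (!(flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK))
		return 0;		/* no decap flag: nothing to do */

	if (!shrink)
		return -EINVAL;		/* decap flags only apply to shrink */

	switch (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
	case BPF_F_ADJ_ROOM_DECAP_L3_IPV4:
		*len_min = IPV4_HDR_LEN;
		break;
	case BPF_F_ADJ_ROOM_DECAP_L3_IPV6:
		*len_min = IPV6_HDR_LEN;
		break;
	default:			/* both flags set at once */
		return -EINVAL;
	}
	return 0;
}
```

The default case rejects the combination where both flags are set, which
the v2 patch checks with a separate comparison against the full mask.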


* Re: [PATCH bpf-next v2 2/2] selftests/bpf: add ipip6 and ip6ip decap to test_tc_tunnel
  2023-01-11  8:01 ` [PATCH bpf-next v2 2/2] selftests/bpf: add ipip6 and ip6ip decap to test_tc_tunnel Ziyang Xuan
@ 2023-01-11 15:47   ` Willem de Bruijn
  2023-01-12  8:20     ` Ziyang Xuan (William)
  0 siblings, 1 reply; 8+ messages in thread
From: Willem de Bruijn @ 2023-01-11 15:47 UTC
  To: Ziyang Xuan
  Cc: ast, daniel, andrii, davem, edumazet, kuba, pabeni, bpf, netdev,
	martin.lau, song, yhs, john.fastabend, kpsingh, sdf, haoluo,
	jolsa

On Wed, Jan 11, 2023 at 3:02 AM Ziyang Xuan
<william.xuanziyang@huawei.com> wrote:
>
> Add ipip6 and ip6ip decap testcases. Verify that bpf_skb_adjust_room()
> correctly decapsulate ipip6 and ip6ip tunnel packets.
>
> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
> ---
>  .../selftests/bpf/progs/test_tc_tunnel.c      | 91 ++++++++++++++++++-
>  tools/testing/selftests/bpf/test_tc_tunnel.sh | 15 +--
>  2 files changed, 98 insertions(+), 8 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
> index a0e7762b1e5a..e6e678aa9874 100644
> --- a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
> +++ b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
> @@ -38,6 +38,10 @@ static const int cfg_udp_src = 20000;
>  #define        VXLAN_FLAGS     0x8
>  #define        VXLAN_VNI       1
>
> +#ifndef NEXTHDR_DEST
> +#define NEXTHDR_DEST   60
> +#endif

Should not be needed if including the right header? include/net/ipv6.h

Otherwise very nice extension. Thanks for expanding the test.


* Re: [PATCH bpf-next v2 1/2] bpf: Add ipip6 and ip6ip decap support for bpf_skb_adjust_room()
  2023-01-11  8:01 ` [PATCH bpf-next v2 1/2] " Ziyang Xuan
  2023-01-11 15:43   ` Willem de Bruijn
@ 2023-01-12  1:26   ` Martin KaFai Lau
  1 sibling, 0 replies; 8+ messages in thread
From: Martin KaFai Lau @ 2023-01-12  1:26 UTC
  To: Ziyang Xuan, ast, daniel, andrii, davem, edumazet, kuba, pabeni,
	willemb, bpf, netdev, song, yhs, john.fastabend, kpsingh, sdf,
	haoluo, jolsa

On 1/11/23 12:01 AM, Ziyang Xuan wrote:
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 464ca3f01fe7..dde1c2ea1c84 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -2644,6 +2644,12 @@ union bpf_attr {
>    *		  Use with BPF_F_ADJ_ROOM_ENCAP_L2 flag to further specify the
>    *		  L2 type as Ethernet.
>    *
> + *              * **BPF_F_ADJ_ROOM_DECAP_L3_IPV4**,
> + *                **BPF_F_ADJ_ROOM_DECAP_L3_IPV6**:
> + *                Indicate the new IP header version after decapsulating the
> + *                outer IP header. Mainly used in scenarios that the inner and
> + *                outer IP versions are different.
> + *

selftests/bpf failed to compile, probably because the new lines use leading
spaces instead of tabs:
https://github.com/kernel-patches/bpf/actions/runs/3890850490/jobs/6640395038#step:11:112

   /tmp/work/bpf/bpf/tools/testing/selftests/bpf/bpf-helpers.rst:1112: (WARNING/2) Bullet list ends without a blank line; unexpected unindent.
   make[1]: *** [Makefile.docs:76: /tmp/work/bpf/bpf/tools/testing/selftests/bpf/bpf-helpers.7] Error 12
   make: *** [Makefile:259: docs] Error 2



* Re: [PATCH bpf-next v2 1/2] bpf: Add ipip6 and ip6ip decap support for bpf_skb_adjust_room()
  2023-01-11 15:43   ` Willem de Bruijn
@ 2023-01-12  7:15     ` Ziyang Xuan (William)
  0 siblings, 0 replies; 8+ messages in thread
From: Ziyang Xuan (William) @ 2023-01-12  7:15 UTC
  To: Willem de Bruijn
  Cc: ast, daniel, andrii, davem, edumazet, kuba, pabeni, bpf, netdev,
	martin.lau, song, yhs, john.fastabend, kpsingh, sdf, haoluo,
	jolsa

> On Wed, Jan 11, 2023 at 3:01 AM Ziyang Xuan
> <william.xuanziyang@huawei.com> wrote:
>>
>> Add ipip6 and ip6ip decap support for bpf_skb_adjust_room().
>> Main use case is for using cls_bpf on ingress hook to decapsulate
>> IPv4 over IPv6 and IPv6 over IPv4 tunnel packets.
>>
>> Add two new flags BPF_F_ADJ_ROOM_DECAP_L3_IPV{4,6} to indicate the
>> new IP header version after decapsulating the outer IP header.
>>
>> Suggested-by: Willem de Bruijn <willemb@google.com>
>> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
>> ---
>>  include/uapi/linux/bpf.h       |  8 ++++++++
>>  net/core/filter.c              | 26 +++++++++++++++++++++++++-
>>  tools/include/uapi/linux/bpf.h |  8 ++++++++
>>  3 files changed, 41 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index 464ca3f01fe7..dde1c2ea1c84 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -2644,6 +2644,12 @@ union bpf_attr {
>>   *               Use with BPF_F_ADJ_ROOM_ENCAP_L2 flag to further specify the
>>   *               L2 type as Ethernet.
>>   *
>> + *              * **BPF_F_ADJ_ROOM_DECAP_L3_IPV4**,
>> + *                **BPF_F_ADJ_ROOM_DECAP_L3_IPV6**:
>> + *                Indicate the new IP header version after decapsulating the
>> + *                outer IP header. Mainly used in scenarios that the inner and
>> + *                outer IP versions are different.
>> + *
> 
> Nit (only since I have another comment below)
> 
> Indicate -> Set
Sorry, I think "Indicate" may be more suitable, because the new IP header is the
original inner IP header and is not changed. The flags only help the kernel
complete specific tasks. I think "Set" implies a change.

> [Mainly used .. that] -> [Used when]
This looks good to me. Thanks!

> 
>>         if (skb_is_gso(skb)) {
>>                 struct skb_shared_info *shinfo = skb_shinfo(skb);
>>
>> @@ -3596,6 +3609,10 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>>         if (unlikely(proto != htons(ETH_P_IP) &&
>>                      proto != htons(ETH_P_IPV6)))
>>                 return -ENOTSUPP;
>> +       if (unlikely((shrink && ((flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) ==
>> +                     BPF_F_ADJ_ROOM_DECAP_L3_MASK)) || (!shrink &&
>> +                     flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK)))
>> +               return -EINVAL;
>>
>>         off = skb_mac_header_len(skb);
>>         switch (mode) {
>> @@ -3608,6 +3625,13 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>>                 return -ENOTSUPP;
>>         }
>>
>> +       if (shrink) {
>> +               if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
>> +                       len_min = sizeof(struct ipv6hdr);
>> +               else if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
>> +                       len_min = sizeof(struct iphdr);
>> +       }
>> +
> 
> How about combining this branch with the above:
> 
>   if (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
>     if (!shrink)
>       return -EINVAL;
> 
>     switch (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
>       case BPF_F_ADJ_ROOM_DECAP_L3_IPV4:
>         len_min = sizeof(struct iphdr);
>         break;
>       case BPF_F_ADJ_ROOM_DECAP_L3_IPV6:
>         len_min = sizeof(struct ipv6hdr);
>         break;
>       default:
>         return -EINVAL;
>     }
>   }

This looks good to me. Thanks!

> 


* Re: [PATCH bpf-next v2 2/2] selftests/bpf: add ipip6 and ip6ip decap to test_tc_tunnel
  2023-01-11 15:47   ` Willem de Bruijn
@ 2023-01-12  8:20     ` Ziyang Xuan (William)
  0 siblings, 0 replies; 8+ messages in thread
From: Ziyang Xuan (William) @ 2023-01-12  8:20 UTC
  To: Willem de Bruijn
  Cc: ast, daniel, andrii, davem, edumazet, kuba, pabeni, bpf, netdev,
	martin.lau, song, yhs, john.fastabend, kpsingh, sdf, haoluo,
	jolsa

> On Wed, Jan 11, 2023 at 3:02 AM Ziyang Xuan
> <william.xuanziyang@huawei.com> wrote:
>>
>> Add ipip6 and ip6ip decap testcases. Verify that bpf_skb_adjust_room()
>> correctly decapsulate ipip6 and ip6ip tunnel packets.
>>
>> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
>> ---
>>  .../selftests/bpf/progs/test_tc_tunnel.c      | 91 ++++++++++++++++++-
>>  tools/testing/selftests/bpf/test_tc_tunnel.sh | 15 +--
>>  2 files changed, 98 insertions(+), 8 deletions(-)
>>
>> diff --git a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
>> index a0e7762b1e5a..e6e678aa9874 100644
>> --- a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
>> +++ b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
>> @@ -38,6 +38,10 @@ static const int cfg_udp_src = 20000;
>>  #define        VXLAN_FLAGS     0x8
>>  #define        VXLAN_VNI       1
>>
>> +#ifndef NEXTHDR_DEST
>> +#define NEXTHDR_DEST   60
>> +#endif
> 
> Should not be needed if including the right header? include/net/ipv6.h
> 
> Otherwise very nice extension. Thanks for expanding the test.

"net/ipv6.h" is not under /usr/include/ and cannot be included in BPF programs.
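
As a side note, the guarded fallback can be checked against a constant
that is available outside the kernel tree. The userspace sketch below
assumes <netinet/in.h> is present and compares the fallback with
IPPROTO_DSTOPTS, which has the same value (60); the helper
is_dest_opts() is hypothetical. Whether the selftest should use that
constant instead is not settled in this thread.

```c
#include <assert.h>
#include <netinet/in.h>		/* IPPROTO_DSTOPTS */
#include <stdbool.h>
#include <stdint.h>

/* Fallback for builds that cannot see the kernel-internal
 * include/net/ipv6.h, where NEXTHDR_DEST is defined.
 */
#ifndef NEXTHDR_DEST
#define NEXTHDR_DEST	60
#endif

/* Compile-time check that the fallback matches the exported constant. */
static_assert(NEXTHDR_DEST == IPPROTO_DSTOPTS,
	      "fallback must match the destination options protocol number");

/* Hypothetical helper: true if nexthdr names a destination options header. */
bool is_dest_opts(uint8_t nexthdr)
{
	return nexthdr == NEXTHDR_DEST;
}
```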



Thread overview: 8 messages
2023-01-11  8:01 [PATCH bpf-next v2 0/2] bpf: Add ipip6 and ip6ip decap support for bpf_skb_adjust_room() Ziyang Xuan
2023-01-11  8:01 ` [PATCH bpf-next v2 1/2] " Ziyang Xuan
2023-01-11 15:43   ` Willem de Bruijn
2023-01-12  7:15     ` Ziyang Xuan (William)
2023-01-12  1:26   ` Martin KaFai Lau
2023-01-11  8:01 ` [PATCH bpf-next v2 2/2] selftests/bpf: add ipip6 and ip6ip decap to test_tc_tunnel Ziyang Xuan
2023-01-11 15:47   ` Willem de Bruijn
2023-01-12  8:20     ` Ziyang Xuan (William)
