All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf-next v3 00/13] bpf tc tunneling
@ 2019-03-22 18:32 Willem de Bruijn
  2019-03-22 18:32 ` [PATCH bpf-next v3 01/13] bpf: in bpf_skb_adjust_room avoid copy in tx fast path Willem de Bruijn
                   ` (14 more replies)
  0 siblings, 15 replies; 17+ messages in thread
From: Willem de Bruijn @ 2019-03-22 18:32 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

BPF allows for dynamic tunneling, choosing the tunnel destination and
features on-demand. Extend bpf_skb_adjust_room to allow for efficient
tunneling at the TC hooks.

Most features are required for large packets with GSO, as these will
be modified after this patch.

Patch 1
  is a performance optimization, avoiding an unnecessary unclone
  for the TCP hot path.

Patches 2..6
  introduce a regression test. These can be squashed, but the code is
  arguably more readable when gradually expanding the feature set.

Patch 7
  is a performance optimization, avoid copying network headers
  that are going to be overwritten. This also simplifies the bpf
  program.

Patch 8
  reenables bpf_skb_adjust_room for UDP packets.

Patch 9
  configures skb tunneling metadata analogous to tunnel devices.

Patches 10..13
  expand the regression test to make use of the new features and
  enable the GSO testcases.

Changes
  v1->v2
  - move BPF_F_ADJ_ROOM_MASK out of uapi as it can be expanded
  - document new flags
  - in tests replace netcat -q flag with coreutils timeout:
      the -q flag is not supported in all netcat versions
  v2->v3
  - move BPF_F_ADJ_ROOM_ENCAP_L3_MASK out of uapi as it has no
    use in userspace

Willem de Bruijn (13):
  bpf: in bpf_skb_adjust_room avoid copy in tx fast path
  selftests/bpf: bpf tunnel encap test
  selftests/bpf: expand bpf tunnel test with decap
  selftests/bpf: expand bpf tunnel test to ipv6
  selftests/bpf: extend bpf tunnel test with gre
  selftests/bpf: extend bpf tunnel test with tso
  bpf: add bpf_skb_adjust_room mode BPF_ADJ_ROOM_MAC
  bpf: add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_FIXED_GSO
  bpf: add bpf_skb_adjust_room encap flags
  bpf: Sync bpf.h to tools
  selftests/bpf: convert bpf tunnel test to BPF_ADJ_ROOM_MAC
  selftests/bpf: convert bpf tunnel test to BPF_F_ADJ_ROOM_FIXED_GSO
  selftests/bpf: convert bpf tunnel test to encap modes

 include/uapi/linux/bpf.h                      |  29 +-
 net/core/filter.c                             | 132 +++++++--
 tools/include/uapi/linux/bpf.h                |  29 +-
 tools/testing/selftests/bpf/Makefile          |   3 +-
 tools/testing/selftests/bpf/config            |   2 +
 .../selftests/bpf/progs/test_tc_tunnel.c      | 261 ++++++++++++++++++
 tools/testing/selftests/bpf/test_tc_tunnel.sh | 178 ++++++++++++
 7 files changed, 598 insertions(+), 36 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/test_tc_tunnel.c
 create mode 100755 tools/testing/selftests/bpf/test_tc_tunnel.sh

-- 
2.21.0.392.gf8f6787159e-goog


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH bpf-next v3 01/13] bpf: in bpf_skb_adjust_room avoid copy in tx fast path
  2019-03-22 18:32 [PATCH bpf-next v3 00/13] bpf tc tunneling Willem de Bruijn
@ 2019-03-22 18:32 ` Willem de Bruijn
  2019-03-22 18:32 ` [PATCH bpf-next v3 02/13] selftests/bpf: bpf tunnel encap test Willem de Bruijn
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Willem de Bruijn @ 2019-03-22 18:32 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

bpf_skb_adjust_room calls skb_cow on grow.

This expensive operation can be avoided in the fast path when the only
other clone has released the header. This is the common case for TCP,
where one headerless clone is kept on the retransmit queue.

It is safe to do so even when touching the gso fields in skb_shinfo.
Regular tunnel encap with iptunnel_handle_offloads takes the same
optimization.

The tcp stack unclones in the unlikely case that it accesses these
fields through headerless clones packets on the retransmit queue (see
__tcp_retransmit_skb).

If any other clones are present, e.g., from packet sockets,
skb_cow_head returns the same value as skb_cow().

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 net/core/filter.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index d2511fe46db3..d21e1acdde29 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2971,7 +2971,7 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 len_diff)
 	if (skb_is_gso(skb) && !skb_is_gso_tcp(skb))
 		return -ENOTSUPP;
 
-	ret = skb_cow(skb, len_diff);
+	ret = skb_cow_head(skb, len_diff);
 	if (unlikely(ret < 0))
 		return ret;
 
-- 
2.21.0.392.gf8f6787159e-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH bpf-next v3 02/13] selftests/bpf: bpf tunnel encap test
  2019-03-22 18:32 [PATCH bpf-next v3 00/13] bpf tc tunneling Willem de Bruijn
  2019-03-22 18:32 ` [PATCH bpf-next v3 01/13] bpf: in bpf_skb_adjust_room avoid copy in tx fast path Willem de Bruijn
@ 2019-03-22 18:32 ` Willem de Bruijn
  2019-03-22 18:32 ` [PATCH bpf-next v3 03/13] selftests/bpf: expand bpf tunnel test with decap Willem de Bruijn
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Willem de Bruijn @ 2019-03-22 18:32 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

Validate basic tunnel encapsulation using ipip.

Set up two namespaces connected by veth. Connect a client and server.
Do this with and without bpf encap.

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 tools/testing/selftests/bpf/Makefile          |  3 +-
 .../selftests/bpf/progs/test_tc_tunnel.c      | 83 +++++++++++++++++++
 tools/testing/selftests/bpf/test_tc_tunnel.sh | 75 +++++++++++++++++
 3 files changed, 160 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/progs/test_tc_tunnel.c
 create mode 100755 tools/testing/selftests/bpf/test_tc_tunnel.sh

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index edd59707cb1f..cdcc54ddf4b9 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -52,7 +52,8 @@ TEST_PROGS := test_kmod.sh \
 	test_flow_dissector.sh \
 	test_xdp_vlan.sh \
 	test_lwt_ip_encap.sh \
-	test_tcp_check_syncookie.sh
+	test_tcp_check_syncookie.sh \
+	test_tc_tunnel.sh
 
 TEST_PROGS_EXTENDED := with_addr.sh \
 	with_tunnels.sh \
diff --git a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
new file mode 100644
index 000000000000..8223e4347be8
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
@@ -0,0 +1,83 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* In-place tunneling */
+
+#include <linux/stddef.h>
+#include <linux/bpf.h>
+#include <linux/if_ether.h>
+#include <linux/in.h>
+#include <linux/ip.h>
+#include <linux/tcp.h>
+#include <linux/pkt_cls.h>
+#include <linux/types.h>
+
+#include "bpf_endian.h"
+#include "bpf_helpers.h"
+
+static const int cfg_port = 8000;
+
+static __always_inline void set_ipv4_csum(struct iphdr *iph)
+{
+	__u16 *iph16 = (__u16 *)iph;
+	__u32 csum;
+	int i;
+
+	iph->check = 0;
+
+#pragma clang loop unroll(full)
+	for (i = 0, csum = 0; i < sizeof(*iph) >> 1; i++)
+		csum += *iph16++;
+
+	iph->check = ~((csum & 0xffff) + (csum >> 16));
+}
+
+SEC("encap")
+int encap_f(struct __sk_buff *skb)
+{
+	struct iphdr iph_outer, iph_inner;
+	struct tcphdr tcph;
+
+	if (skb->protocol != __bpf_constant_htons(ETH_P_IP))
+		return TC_ACT_OK;
+
+	if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_inner,
+			       sizeof(iph_inner)) < 0)
+		return TC_ACT_OK;
+
+	/* filter only packets we want */
+	if (iph_inner.ihl != 5 || iph_inner.protocol != IPPROTO_TCP)
+		return TC_ACT_OK;
+
+	if (bpf_skb_load_bytes(skb, ETH_HLEN + sizeof(iph_inner),
+			       &tcph, sizeof(tcph)) < 0)
+		return TC_ACT_OK;
+
+	if (tcph.dest != __bpf_constant_htons(cfg_port))
+		return TC_ACT_OK;
+
+	/* add room between mac and network header */
+	if (bpf_skb_adjust_room(skb, sizeof(iph_outer), BPF_ADJ_ROOM_NET, 0))
+		return TC_ACT_SHOT;
+
+	/* prepare new outer network header */
+	iph_outer = iph_inner;
+	iph_outer.protocol = IPPROTO_IPIP;
+	iph_outer.tot_len = bpf_htons(sizeof(iph_outer) +
+				      bpf_htons(iph_outer.tot_len));
+	set_ipv4_csum(&iph_outer);
+
+	/* store new outer network header */
+	if (bpf_skb_store_bytes(skb, ETH_HLEN, &iph_outer, sizeof(iph_outer),
+				BPF_F_INVALIDATE_HASH) < 0)
+		return TC_ACT_SHOT;
+
+	/* bpf_skb_adjust_room has moved header to start of room: restore */
+	if (bpf_skb_store_bytes(skb, ETH_HLEN + sizeof(iph_outer),
+				&iph_inner, sizeof(iph_inner),
+				BPF_F_INVALIDATE_HASH) < 0)
+		return TC_ACT_SHOT;
+
+	return TC_ACT_OK;
+}
+
+char __license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
new file mode 100755
index 000000000000..6ebb288a3afc
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -0,0 +1,75 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# In-place tunneling
+
+# must match the port that the bpf program filters on
+readonly port=8000
+
+readonly ns_prefix="ns-$$-"
+readonly ns1="${ns_prefix}1"
+readonly ns2="${ns_prefix}2"
+
+readonly ns1_v4=192.168.1.1
+readonly ns2_v4=192.168.1.2
+
+setup() {
+	ip netns add "${ns1}"
+	ip netns add "${ns2}"
+
+	ip link add dev veth1 mtu 1500 netns "${ns1}" type veth \
+	      peer name veth2 mtu 1500 netns "${ns2}"
+
+	ip -netns "${ns1}" link set veth1 up
+	ip -netns "${ns2}" link set veth2 up
+
+	ip -netns "${ns1}" -4 addr add "${ns1_v4}/24" dev veth1
+	ip -netns "${ns2}" -4 addr add "${ns2_v4}/24" dev veth2
+
+	sleep 1
+}
+
+cleanup() {
+	ip netns del "${ns2}"
+	ip netns del "${ns1}"
+}
+
+server_listen() {
+	ip netns exec "${ns2}" nc -l -p "${port}" &
+	sleep 0.2
+}
+
+client_connect() {
+	ip netns exec "${ns1}" nc -z -w 1 "${ns2_v4}" "${port}"
+	echo $?
+}
+
+set -e
+trap cleanup EXIT
+
+setup
+
+# basic communication works
+echo "test basic connectivity"
+server_listen
+client_connect
+
+# clientside, insert bpf program to encap all TCP to port ${port}
+# client can no longer connect
+ip netns exec "${ns1}" tc qdisc add dev veth1 clsact
+ip netns exec "${ns1}" tc filter add dev veth1 egress \
+	bpf direct-action object-file ./test_tc_tunnel.o section encap
+echo "test bpf encap without decap (expect failure)"
+server_listen
+! client_connect
+
+# serverside, insert decap module
+# server is still running
+# client can connect again
+ip netns exec "${ns2}" ip link add dev testtun0 type ipip \
+	remote "${ns1_v4}" local "${ns2_v4}"
+ip netns exec "${ns2}" ip link set dev testtun0 up
+echo "test bpf encap with tunnel device decap"
+client_connect
+
+echo OK
-- 
2.21.0.392.gf8f6787159e-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH bpf-next v3 03/13] selftests/bpf: expand bpf tunnel test with decap
  2019-03-22 18:32 [PATCH bpf-next v3 00/13] bpf tc tunneling Willem de Bruijn
  2019-03-22 18:32 ` [PATCH bpf-next v3 01/13] bpf: in bpf_skb_adjust_room avoid copy in tx fast path Willem de Bruijn
  2019-03-22 18:32 ` [PATCH bpf-next v3 02/13] selftests/bpf: bpf tunnel encap test Willem de Bruijn
@ 2019-03-22 18:32 ` Willem de Bruijn
  2019-03-22 18:32 ` [PATCH bpf-next v3 04/13] selftests/bpf: expand bpf tunnel test to ipv6 Willem de Bruijn
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Willem de Bruijn @ 2019-03-22 18:32 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

The bpf tunnel test encapsulates using bpf, then decapsulates using
a standard tunnel device to verify correctness.

Once encap is verified, also test decap, by replacing the tunnel
device on decap with another bpf program.

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 .../selftests/bpf/progs/test_tc_tunnel.c      | 31 +++++++++++++++++++
 tools/testing/selftests/bpf/test_tc_tunnel.sh |  9 ++++++
 2 files changed, 40 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
index 8223e4347be8..25db148635ab 100644
--- a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
+++ b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
@@ -80,4 +80,35 @@ int encap_f(struct __sk_buff *skb)
 	return TC_ACT_OK;
 }
 
+SEC("decap")
+int decap_f(struct __sk_buff *skb)
+{
+	struct iphdr iph_outer, iph_inner;
+
+	if (skb->protocol != __bpf_constant_htons(ETH_P_IP))
+		return TC_ACT_OK;
+
+	if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_outer,
+			       sizeof(iph_outer)) < 0)
+		return TC_ACT_OK;
+
+	if (iph_outer.ihl != 5 || iph_outer.protocol != IPPROTO_IPIP)
+		return TC_ACT_OK;
+
+	if (bpf_skb_load_bytes(skb, ETH_HLEN + sizeof(iph_outer),
+			       &iph_inner, sizeof(iph_inner)) < 0)
+		return TC_ACT_OK;
+
+	if (bpf_skb_adjust_room(skb, -(int)sizeof(iph_outer),
+				BPF_ADJ_ROOM_NET, 0))
+		return TC_ACT_SHOT;
+
+	/* bpf_skb_adjust_room has moved outer over inner header: restore */
+	if (bpf_skb_store_bytes(skb, ETH_HLEN, &iph_inner, sizeof(iph_inner),
+				BPF_F_INVALIDATE_HASH) < 0)
+		return TC_ACT_SHOT;
+
+	return TC_ACT_OK;
+}
+
 char __license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
index 6ebb288a3afc..91151d91e5a1 100755
--- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -72,4 +72,13 @@ ip netns exec "${ns2}" ip link set dev testtun0 up
 echo "test bpf encap with tunnel device decap"
 client_connect
 
+# serverside, use BPF for decap
+ip netns exec "${ns2}" ip link del dev testtun0
+ip netns exec "${ns2}" tc qdisc add dev veth2 clsact
+ip netns exec "${ns2}" tc filter add dev veth2 ingress \
+	bpf direct-action object-file ./test_tc_tunnel.o section decap
+server_listen
+echo "test bpf encap with bpf decap"
+client_connect
+
 echo OK
-- 
2.21.0.392.gf8f6787159e-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH bpf-next v3 04/13] selftests/bpf: expand bpf tunnel test to ipv6
  2019-03-22 18:32 [PATCH bpf-next v3 00/13] bpf tc tunneling Willem de Bruijn
                   ` (2 preceding siblings ...)
  2019-03-22 18:32 ` [PATCH bpf-next v3 03/13] selftests/bpf: expand bpf tunnel test with decap Willem de Bruijn
@ 2019-03-22 18:32 ` Willem de Bruijn
  2019-03-22 18:32 ` [PATCH bpf-next v3 05/13] selftests/bpf: extend bpf tunnel test with gre Willem de Bruijn
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Willem de Bruijn @ 2019-03-22 18:32 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

The test only uses ipv4 so far, expand to ipv6.
This is mostly a boilerplate near copy of the ipv4 path.

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 tools/testing/selftests/bpf/config            |   2 +
 .../selftests/bpf/progs/test_tc_tunnel.c      | 116 +++++++++++++++---
 tools/testing/selftests/bpf/test_tc_tunnel.sh |  53 +++++++-
 3 files changed, 149 insertions(+), 22 deletions(-)

diff --git a/tools/testing/selftests/bpf/config b/tools/testing/selftests/bpf/config
index 37f947ec44ed..a42f4fc4dc11 100644
--- a/tools/testing/selftests/bpf/config
+++ b/tools/testing/selftests/bpf/config
@@ -23,3 +23,5 @@ CONFIG_LWTUNNEL=y
 CONFIG_BPF_STREAM_PARSER=y
 CONFIG_XDP_SOCKETS=y
 CONFIG_FTRACE_SYSCALLS=y
+CONFIG_IPV6_TUNNEL=y
+CONFIG_IPV6_GRE=y
diff --git a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
index 25db148635ab..591f540ce513 100644
--- a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
+++ b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
@@ -7,6 +7,7 @@
 #include <linux/if_ether.h>
 #include <linux/in.h>
 #include <linux/ip.h>
+#include <linux/ipv6.h>
 #include <linux/tcp.h>
 #include <linux/pkt_cls.h>
 #include <linux/types.h>
@@ -31,15 +32,11 @@ static __always_inline void set_ipv4_csum(struct iphdr *iph)
 	iph->check = ~((csum & 0xffff) + (csum >> 16));
 }
 
-SEC("encap")
-int encap_f(struct __sk_buff *skb)
+static int encap_ipv4(struct __sk_buff *skb)
 {
 	struct iphdr iph_outer, iph_inner;
 	struct tcphdr tcph;
 
-	if (skb->protocol != __bpf_constant_htons(ETH_P_IP))
-		return TC_ACT_OK;
-
 	if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_inner,
 			       sizeof(iph_inner)) < 0)
 		return TC_ACT_OK;
@@ -80,35 +77,118 @@ int encap_f(struct __sk_buff *skb)
 	return TC_ACT_OK;
 }
 
-SEC("decap")
-int decap_f(struct __sk_buff *skb)
+static int encap_ipv6(struct __sk_buff *skb)
 {
-	struct iphdr iph_outer, iph_inner;
+	struct ipv6hdr iph_outer, iph_inner;
+	struct tcphdr tcph;
 
-	if (skb->protocol != __bpf_constant_htons(ETH_P_IP))
+	if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_inner,
+			       sizeof(iph_inner)) < 0)
 		return TC_ACT_OK;
 
-	if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_outer,
-			       sizeof(iph_outer)) < 0)
+	/* filter only packets we want */
+	if (bpf_skb_load_bytes(skb, ETH_HLEN + sizeof(iph_inner),
+			       &tcph, sizeof(tcph)) < 0)
 		return TC_ACT_OK;
 
-	if (iph_outer.ihl != 5 || iph_outer.protocol != IPPROTO_IPIP)
+	if (tcph.dest != __bpf_constant_htons(cfg_port))
+		return TC_ACT_OK;
+
+	/* add room between mac and network header */
+	if (bpf_skb_adjust_room(skb, sizeof(iph_outer), BPF_ADJ_ROOM_NET, 0))
+		return TC_ACT_SHOT;
+
+	/* prepare new outer network header */
+	iph_outer = iph_inner;
+	iph_outer.nexthdr = IPPROTO_IPV6;
+	iph_outer.payload_len = bpf_htons(sizeof(iph_outer) +
+					  bpf_ntohs(iph_outer.payload_len));
+
+	/* store new outer network header */
+	if (bpf_skb_store_bytes(skb, ETH_HLEN, &iph_outer, sizeof(iph_outer),
+				BPF_F_INVALIDATE_HASH) < 0)
+		return TC_ACT_SHOT;
+
+	/* bpf_skb_adjust_room has moved header to start of room: restore */
+	if (bpf_skb_store_bytes(skb, ETH_HLEN + sizeof(iph_outer),
+				&iph_inner, sizeof(iph_inner),
+				BPF_F_INVALIDATE_HASH) < 0)
+		return TC_ACT_SHOT;
+
+	return TC_ACT_OK;
+}
+
+SEC("encap")
+int encap_f(struct __sk_buff *skb)
+{
+	switch (skb->protocol) {
+	case __bpf_constant_htons(ETH_P_IP):
+		return encap_ipv4(skb);
+	case __bpf_constant_htons(ETH_P_IPV6):
+		return encap_ipv6(skb);
+	default:
+		/* does not match, ignore */
 		return TC_ACT_OK;
+	}
+}
 
-	if (bpf_skb_load_bytes(skb, ETH_HLEN + sizeof(iph_outer),
-			       &iph_inner, sizeof(iph_inner)) < 0)
+static int decap_internal(struct __sk_buff *skb, int off, int len)
+{
+	char buf[sizeof(struct ipv6hdr)];
+
+	if (bpf_skb_load_bytes(skb, off + len, &buf, len) < 0)
 		return TC_ACT_OK;
 
-	if (bpf_skb_adjust_room(skb, -(int)sizeof(iph_outer),
-				BPF_ADJ_ROOM_NET, 0))
+	if (bpf_skb_adjust_room(skb, -len, BPF_ADJ_ROOM_NET, 0))
 		return TC_ACT_SHOT;
 
 	/* bpf_skb_adjust_room has moved outer over inner header: restore */
-	if (bpf_skb_store_bytes(skb, ETH_HLEN, &iph_inner, sizeof(iph_inner),
-				BPF_F_INVALIDATE_HASH) < 0)
+	if (bpf_skb_store_bytes(skb, off, buf, len, BPF_F_INVALIDATE_HASH) < 0)
 		return TC_ACT_SHOT;
 
 	return TC_ACT_OK;
 }
 
+static int decap_ipv4(struct __sk_buff *skb)
+{
+	struct iphdr iph_outer;
+
+	if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_outer,
+			       sizeof(iph_outer)) < 0)
+		return TC_ACT_OK;
+
+	if (iph_outer.ihl != 5 || iph_outer.protocol != IPPROTO_IPIP)
+		return TC_ACT_OK;
+
+	return decap_internal(skb, ETH_HLEN, sizeof(iph_outer));
+}
+
+static int decap_ipv6(struct __sk_buff *skb)
+{
+	struct ipv6hdr iph_outer;
+
+	if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_outer,
+			       sizeof(iph_outer)) < 0)
+		return TC_ACT_OK;
+
+	if (iph_outer.nexthdr != IPPROTO_IPV6)
+		return TC_ACT_OK;
+
+	return decap_internal(skb, ETH_HLEN, sizeof(iph_outer));
+}
+
+SEC("decap")
+int decap_f(struct __sk_buff *skb)
+{
+	switch (skb->protocol) {
+	case __bpf_constant_htons(ETH_P_IP):
+		return decap_ipv4(skb);
+	case __bpf_constant_htons(ETH_P_IPV6):
+		return decap_ipv6(skb);
+	default:
+		/* does not match, ignore */
+		return TC_ACT_OK;
+	}
+}
+
 char __license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
index 91151d91e5a1..7b1758f3006b 100755
--- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -12,6 +12,9 @@ readonly ns2="${ns_prefix}2"
 
 readonly ns1_v4=192.168.1.1
 readonly ns2_v4=192.168.1.2
+readonly ns1_v6=fd::1
+readonly ns2_v6=fd::2
+
 
 setup() {
 	ip netns add "${ns1}"
@@ -25,6 +28,8 @@ setup() {
 
 	ip -netns "${ns1}" -4 addr add "${ns1_v4}/24" dev veth1
 	ip -netns "${ns2}" -4 addr add "${ns2_v4}/24" dev veth2
+	ip -netns "${ns1}" -6 addr add "${ns1_v6}/64" dev veth1 nodad
+	ip -netns "${ns2}" -6 addr add "${ns2_v6}/64" dev veth2 nodad
 
 	sleep 1
 }
@@ -35,16 +40,56 @@ cleanup() {
 }
 
 server_listen() {
-	ip netns exec "${ns2}" nc -l -p "${port}" &
+	ip netns exec "${ns2}" nc "${netcat_opt}" -l -p "${port}" &
 	sleep 0.2
 }
 
 client_connect() {
-	ip netns exec "${ns1}" nc -z -w 1 "${ns2_v4}" "${port}"
+	ip netns exec "${ns1}" nc "${netcat_opt}" -z -w 1 "${addr2}" "${port}"
 	echo $?
 }
 
 set -e
+
+# no arguments: automated test, run all
+if [[ "$#" -eq "0" ]]; then
+	echo "ipip"
+	$0 ipv4
+
+	echo "ip6ip6"
+	$0 ipv6
+
+	echo "OK. All tests passed"
+	exit 0
+fi
+
+if [[ "$#" -ne "1" ]]; then
+	echo "Usage: $0"
+	echo "   or: $0 <ipv4|ipv6>"
+	exit 1
+fi
+
+case "$1" in
+"ipv4")
+	readonly tuntype=ipip
+	readonly addr1="${ns1_v4}"
+	readonly addr2="${ns2_v4}"
+	readonly netcat_opt=-4
+	;;
+"ipv6")
+	readonly tuntype=ip6tnl
+	readonly addr1="${ns1_v6}"
+	readonly addr2="${ns2_v6}"
+	readonly netcat_opt=-6
+	;;
+*)
+	echo "unknown arg: $1"
+	exit 1
+	;;
+esac
+
+echo "encap ${addr1} to ${addr2}, type ${tuntype}"
+
 trap cleanup EXIT
 
 setup
@@ -66,8 +111,8 @@ server_listen
 # serverside, insert decap module
 # server is still running
 # client can connect again
-ip netns exec "${ns2}" ip link add dev testtun0 type ipip \
-	remote "${ns1_v4}" local "${ns2_v4}"
+ip netns exec "${ns2}" ip link add dev testtun0 type "${tuntype}" \
+	remote "${addr1}" local "${addr2}"
 ip netns exec "${ns2}" ip link set dev testtun0 up
 echo "test bpf encap with tunnel device decap"
 client_connect
-- 
2.21.0.392.gf8f6787159e-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH bpf-next v3 05/13] selftests/bpf: extend bpf tunnel test with gre
  2019-03-22 18:32 [PATCH bpf-next v3 00/13] bpf tc tunneling Willem de Bruijn
                   ` (3 preceding siblings ...)
  2019-03-22 18:32 ` [PATCH bpf-next v3 04/13] selftests/bpf: expand bpf tunnel test to ipv6 Willem de Bruijn
@ 2019-03-22 18:32 ` Willem de Bruijn
  2019-03-22 18:32 ` [PATCH bpf-next v3 06/13] selftests/bpf: extend bpf tunnel test with tso Willem de Bruijn
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Willem de Bruijn @ 2019-03-22 18:32 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

GRE is a commonly used protocol. Add GRE cases for both IPv4 and IPv6.

It also inserts different sized headers, which can expose some
unexpected edge cases.

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 .../selftests/bpf/progs/test_tc_tunnel.c      | 148 +++++++++++++-----
 tools/testing/selftests/bpf/test_tc_tunnel.sh |  21 ++-
 2 files changed, 123 insertions(+), 46 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
index 591f540ce513..900c5653105f 100644
--- a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
+++ b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
@@ -2,6 +2,9 @@
 
 /* In-place tunneling */
 
+#include <stdbool.h>
+#include <string.h>
+
 #include <linux/stddef.h>
 #include <linux/bpf.h>
 #include <linux/if_ether.h>
@@ -17,6 +20,18 @@
 
 static const int cfg_port = 8000;
 
+struct grev4hdr {
+	struct iphdr ip;
+	__be16 flags;
+	__be16 protocol;
+} __attribute__((packed));
+
+struct grev6hdr {
+	struct ipv6hdr ip;
+	__be16 flags;
+	__be16 protocol;
+} __attribute__((packed));
+
 static __always_inline void set_ipv4_csum(struct iphdr *iph)
 {
 	__u16 *iph16 = (__u16 *)iph;
@@ -32,10 +47,12 @@ static __always_inline void set_ipv4_csum(struct iphdr *iph)
 	iph->check = ~((csum & 0xffff) + (csum >> 16));
 }
 
-static int encap_ipv4(struct __sk_buff *skb)
+static __always_inline int encap_ipv4(struct __sk_buff *skb, bool with_gre)
 {
-	struct iphdr iph_outer, iph_inner;
+	struct grev4hdr h_outer;
+	struct iphdr iph_inner;
 	struct tcphdr tcph;
+	int olen;
 
 	if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_inner,
 			       sizeof(iph_inner)) < 0)
@@ -52,24 +69,33 @@ static int encap_ipv4(struct __sk_buff *skb)
 	if (tcph.dest != __bpf_constant_htons(cfg_port))
 		return TC_ACT_OK;
 
+	olen = with_gre ? sizeof(h_outer) : sizeof(h_outer.ip);
+
 	/* add room between mac and network header */
-	if (bpf_skb_adjust_room(skb, sizeof(iph_outer), BPF_ADJ_ROOM_NET, 0))
+	if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_NET, 0))
 		return TC_ACT_SHOT;
 
 	/* prepare new outer network header */
-	iph_outer = iph_inner;
-	iph_outer.protocol = IPPROTO_IPIP;
-	iph_outer.tot_len = bpf_htons(sizeof(iph_outer) +
-				      bpf_htons(iph_outer.tot_len));
-	set_ipv4_csum(&iph_outer);
+	h_outer.ip = iph_inner;
+	h_outer.ip.tot_len = bpf_htons(olen +
+				      bpf_htons(h_outer.ip.tot_len));
+	if (with_gre) {
+		h_outer.ip.protocol = IPPROTO_GRE;
+		h_outer.protocol = bpf_htons(ETH_P_IP);
+		h_outer.flags = 0;
+	} else {
+		h_outer.ip.protocol = IPPROTO_IPIP;
+	}
+
+	set_ipv4_csum((void *)&h_outer.ip);
 
 	/* store new outer network header */
-	if (bpf_skb_store_bytes(skb, ETH_HLEN, &iph_outer, sizeof(iph_outer),
+	if (bpf_skb_store_bytes(skb, ETH_HLEN, &h_outer, olen,
 				BPF_F_INVALIDATE_HASH) < 0)
 		return TC_ACT_SHOT;
 
 	/* bpf_skb_adjust_room has moved header to start of room: restore */
-	if (bpf_skb_store_bytes(skb, ETH_HLEN + sizeof(iph_outer),
+	if (bpf_skb_store_bytes(skb, ETH_HLEN + olen,
 				&iph_inner, sizeof(iph_inner),
 				BPF_F_INVALIDATE_HASH) < 0)
 		return TC_ACT_SHOT;
@@ -77,10 +103,12 @@ static int encap_ipv4(struct __sk_buff *skb)
 	return TC_ACT_OK;
 }
 
-static int encap_ipv6(struct __sk_buff *skb)
+static __always_inline int encap_ipv6(struct __sk_buff *skb, bool with_gre)
 {
-	struct ipv6hdr iph_outer, iph_inner;
+	struct ipv6hdr iph_inner;
+	struct grev6hdr h_outer;
 	struct tcphdr tcph;
+	int olen;
 
 	if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_inner,
 			       sizeof(iph_inner)) < 0)
@@ -94,23 +122,31 @@ static int encap_ipv6(struct __sk_buff *skb)
 	if (tcph.dest != __bpf_constant_htons(cfg_port))
 		return TC_ACT_OK;
 
+	olen = with_gre ? sizeof(h_outer) : sizeof(h_outer.ip);
+
 	/* add room between mac and network header */
-	if (bpf_skb_adjust_room(skb, sizeof(iph_outer), BPF_ADJ_ROOM_NET, 0))
+	if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_NET, 0))
 		return TC_ACT_SHOT;
 
 	/* prepare new outer network header */
-	iph_outer = iph_inner;
-	iph_outer.nexthdr = IPPROTO_IPV6;
-	iph_outer.payload_len = bpf_htons(sizeof(iph_outer) +
-					  bpf_ntohs(iph_outer.payload_len));
+	h_outer.ip = iph_inner;
+	h_outer.ip.payload_len = bpf_htons(olen +
+					   bpf_ntohs(h_outer.ip.payload_len));
+	if (with_gre) {
+		h_outer.ip.nexthdr = IPPROTO_GRE;
+		h_outer.protocol = bpf_htons(ETH_P_IPV6);
+		h_outer.flags = 0;
+	} else {
+		h_outer.ip.nexthdr = IPPROTO_IPV6;
+	}
 
 	/* store new outer network header */
-	if (bpf_skb_store_bytes(skb, ETH_HLEN, &iph_outer, sizeof(iph_outer),
+	if (bpf_skb_store_bytes(skb, ETH_HLEN, &h_outer, olen,
 				BPF_F_INVALIDATE_HASH) < 0)
 		return TC_ACT_SHOT;
 
 	/* bpf_skb_adjust_room has moved header to start of room: restore */
-	if (bpf_skb_store_bytes(skb, ETH_HLEN + sizeof(iph_outer),
+	if (bpf_skb_store_bytes(skb, ETH_HLEN + olen,
 				&iph_inner, sizeof(iph_inner),
 				BPF_F_INVALIDATE_HASH) < 0)
 		return TC_ACT_SHOT;
@@ -118,28 +154,63 @@ static int encap_ipv6(struct __sk_buff *skb)
 	return TC_ACT_OK;
 }
 
-SEC("encap")
-int encap_f(struct __sk_buff *skb)
+SEC("encap_ipip")
+int __encap_ipip(struct __sk_buff *skb)
 {
-	switch (skb->protocol) {
-	case __bpf_constant_htons(ETH_P_IP):
-		return encap_ipv4(skb);
-	case __bpf_constant_htons(ETH_P_IPV6):
-		return encap_ipv6(skb);
-	default:
-		/* does not match, ignore */
+	if (skb->protocol == __bpf_constant_htons(ETH_P_IP))
+		return encap_ipv4(skb, false);
+	else
 		return TC_ACT_OK;
-	}
 }
 
-static int decap_internal(struct __sk_buff *skb, int off, int len)
+SEC("encap_gre")
+int __encap_gre(struct __sk_buff *skb)
 {
-	char buf[sizeof(struct ipv6hdr)];
+	if (skb->protocol == __bpf_constant_htons(ETH_P_IP))
+		return encap_ipv4(skb, true);
+	else
+		return TC_ACT_OK;
+}
 
-	if (bpf_skb_load_bytes(skb, off + len, &buf, len) < 0)
+SEC("encap_ip6tnl")
+int __encap_ip6tnl(struct __sk_buff *skb)
+{
+	if (skb->protocol == __bpf_constant_htons(ETH_P_IPV6))
+		return encap_ipv6(skb, false);
+	else
+		return TC_ACT_OK;
+}
+
+SEC("encap_ip6gre")
+int __encap_ip6gre(struct __sk_buff *skb)
+{
+	if (skb->protocol == __bpf_constant_htons(ETH_P_IPV6))
+		return encap_ipv6(skb, true);
+	else
 		return TC_ACT_OK;
+}
 
-	if (bpf_skb_adjust_room(skb, -len, BPF_ADJ_ROOM_NET, 0))
+static int decap_internal(struct __sk_buff *skb, int off, int len, char proto)
+{
+	char buf[sizeof(struct grev6hdr)];
+	int olen;
+
+	switch (proto) {
+	case IPPROTO_IPIP:
+	case IPPROTO_IPV6:
+		olen = len;
+		break;
+	case IPPROTO_GRE:
+		olen = len + 4 /* gre hdr */;
+		break;
+	default:
+		return TC_ACT_OK;
+	}
+
+	if (bpf_skb_load_bytes(skb, off + olen, &buf, olen) < 0)
+		return TC_ACT_OK;
+
+	if (bpf_skb_adjust_room(skb, -olen, BPF_ADJ_ROOM_NET, 0))
 		return TC_ACT_SHOT;
 
 	/* bpf_skb_adjust_room has moved outer over inner header: restore */
@@ -157,10 +228,11 @@ static int decap_ipv4(struct __sk_buff *skb)
 			       sizeof(iph_outer)) < 0)
 		return TC_ACT_OK;
 
-	if (iph_outer.ihl != 5 || iph_outer.protocol != IPPROTO_IPIP)
+	if (iph_outer.ihl != 5)
 		return TC_ACT_OK;
 
-	return decap_internal(skb, ETH_HLEN, sizeof(iph_outer));
+	return decap_internal(skb, ETH_HLEN, sizeof(iph_outer),
+			      iph_outer.protocol);
 }
 
 static int decap_ipv6(struct __sk_buff *skb)
@@ -171,10 +243,8 @@ static int decap_ipv6(struct __sk_buff *skb)
 			       sizeof(iph_outer)) < 0)
 		return TC_ACT_OK;
 
-	if (iph_outer.nexthdr != IPPROTO_IPV6)
-		return TC_ACT_OK;
-
-	return decap_internal(skb, ETH_HLEN, sizeof(iph_outer));
+	return decap_internal(skb, ETH_HLEN, sizeof(iph_outer),
+			      iph_outer.nexthdr);
 }
 
 SEC("decap")
diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
index 7b1758f3006b..c78922048610 100755
--- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -54,30 +54,36 @@ set -e
 # no arguments: automated test, run all
 if [[ "$#" -eq "0" ]]; then
 	echo "ipip"
-	$0 ipv4
+	$0 ipv4 ipip
 
 	echo "ip6ip6"
-	$0 ipv6
+	$0 ipv6 ip6tnl
+
+	echo "ip gre"
+	$0 ipv4 gre
+
+	echo "ip6 gre"
+	$0 ipv6 ip6gre
 
 	echo "OK. All tests passed"
 	exit 0
 fi
 
-if [[ "$#" -ne "1" ]]; then
+if [[ "$#" -ne "2" ]]; then
 	echo "Usage: $0"
-	echo "   or: $0 <ipv4|ipv6>"
+	echo "   or: $0 <ipv4|ipv6> <tuntype>"
 	exit 1
 fi
 
 case "$1" in
 "ipv4")
-	readonly tuntype=ipip
+	readonly tuntype=$2
 	readonly addr1="${ns1_v4}"
 	readonly addr2="${ns2_v4}"
 	readonly netcat_opt=-4
 	;;
 "ipv6")
-	readonly tuntype=ip6tnl
+	readonly tuntype=$2
 	readonly addr1="${ns1_v6}"
 	readonly addr2="${ns2_v6}"
 	readonly netcat_opt=-6
@@ -103,7 +109,8 @@ client_connect
 # client can no longer connect
 ip netns exec "${ns1}" tc qdisc add dev veth1 clsact
 ip netns exec "${ns1}" tc filter add dev veth1 egress \
-	bpf direct-action object-file ./test_tc_tunnel.o section encap
+	bpf direct-action object-file ./test_tc_tunnel.o \
+	section "encap_${tuntype}"
 echo "test bpf encap without decap (expect failure)"
 server_listen
 ! client_connect
-- 
2.21.0.392.gf8f6787159e-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH bpf-next v3 06/13] selftests/bpf: extend bpf tunnel test with tso
  2019-03-22 18:32 [PATCH bpf-next v3 00/13] bpf tc tunneling Willem de Bruijn
                   ` (4 preceding siblings ...)
  2019-03-22 18:32 ` [PATCH bpf-next v3 05/13] selftests/bpf: extend bpf tunnel test with gre Willem de Bruijn
@ 2019-03-22 18:32 ` Willem de Bruijn
  2019-03-22 18:32 ` [PATCH bpf-next v3 07/13] bpf: add bpf_skb_adjust_room mode BPF_ADJ_ROOM_MAC Willem de Bruijn
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Willem de Bruijn @ 2019-03-22 18:32 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

Segmentation offload takes a longer path. Verify that the feature
works with large packets.

The test succeeds if not setting dodgy in bpf_skb_adjust_room, as veth
TSO is permissive.

If not setting SKB_GSO_DODGY, this enables tunneled TSO offload on
supporting NICs.

The feature sets SKB_GSO_DODGY because the caller is untrusted. As a
result the packets traverse through the gso stack at least up to TCP.
And fail the gso_type validation, such as the skb->encapsulation check
in gre_gso_segment and the gso_type checks introduced in commit
418e897e0716 ("gso: validate gso_type on ipip style tunnel").

This will be addressed in a follow-on feature patch. In the meantime,
disable the new gso tests.

Changes v1->v2:
  - not all netcat versions support flag '-q', use timeout instead

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 tools/testing/selftests/bpf/test_tc_tunnel.sh | 60 +++++++++++++++----
 1 file changed, 49 insertions(+), 11 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
index c78922048610..9e18754f2354 100755
--- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -15,6 +15,8 @@ readonly ns2_v4=192.168.1.2
 readonly ns1_v6=fd::1
 readonly ns2_v6=fd::2
 
+readonly infile="$(mktemp)"
+readonly outfile="$(mktemp)"
 
 setup() {
 	ip netns add "${ns1}"
@@ -23,6 +25,8 @@ setup() {
 	ip link add dev veth1 mtu 1500 netns "${ns1}" type veth \
 	      peer name veth2 mtu 1500 netns "${ns2}"
 
+	ip netns exec "${ns1}" ethtool -K veth1 tso off
+
 	ip -netns "${ns1}" link set veth1 up
 	ip -netns "${ns2}" link set veth2 up
 
@@ -32,58 +36,86 @@ setup() {
 	ip -netns "${ns2}" -6 addr add "${ns2_v6}/64" dev veth2 nodad
 
 	sleep 1
+
+	dd if=/dev/urandom of="${infile}" bs="${datalen}" count=1 status=none
 }
 
 cleanup() {
 	ip netns del "${ns2}"
 	ip netns del "${ns1}"
+
+	if [[ -f "${outfile}" ]]; then
+		rm "${outfile}"
+	fi
+	if [[ -f "${infile}" ]]; then
+		rm "${infile}"
+	fi
 }
 
 server_listen() {
-	ip netns exec "${ns2}" nc "${netcat_opt}" -l -p "${port}" &
+	ip netns exec "${ns2}" nc "${netcat_opt}" -l -p "${port}" > "${outfile}" &
+	server_pid=$!
 	sleep 0.2
 }
 
 client_connect() {
-	ip netns exec "${ns1}" nc "${netcat_opt}" -z -w 1 "${addr2}" "${port}"
+	ip netns exec "${ns1}" timeout 2 nc "${netcat_opt}" -w 1 "${addr2}" "${port}" < "${infile}"
 	echo $?
 }
 
+verify_data() {
+	wait "${server_pid}"
+	# sha1sum returns two fields [sha1] [filepath]
+	# convert to bash array and access first elem
+	insum=($(sha1sum ${infile}))
+	outsum=($(sha1sum ${outfile}))
+	if [[ "${insum[0]}" != "${outsum[0]}" ]]; then
+		echo "data mismatch"
+		exit 1
+	fi
+}
+
 set -e
 
 # no arguments: automated test, run all
 if [[ "$#" -eq "0" ]]; then
 	echo "ipip"
-	$0 ipv4 ipip
+	$0 ipv4 ipip 100
 
 	echo "ip6ip6"
-	$0 ipv6 ip6tnl
+	$0 ipv6 ip6tnl 100
 
 	echo "ip gre"
-	$0 ipv4 gre
+	$0 ipv4 gre 100
 
 	echo "ip6 gre"
-	$0 ipv6 ip6gre
+	$0 ipv6 ip6gre 100
+
+	# disabled until passes SKB_GSO_DODGY checks
+	# echo "ip gre gso"
+	# $0 ipv4 gre 2000
+
+	# disabled until passes SKB_GSO_DODGY checks
+	# echo "ip6 gre gso"
+	# $0 ipv6 ip6gre 2000
 
 	echo "OK. All tests passed"
 	exit 0
 fi
 
-if [[ "$#" -ne "2" ]]; then
+if [[ "$#" -ne "3" ]]; then
 	echo "Usage: $0"
-	echo "   or: $0 <ipv4|ipv6> <tuntype>"
+	echo "   or: $0 <ipv4|ipv6> <tuntype> <data_len>"
 	exit 1
 fi
 
 case "$1" in
 "ipv4")
-	readonly tuntype=$2
 	readonly addr1="${ns1_v4}"
 	readonly addr2="${ns2_v4}"
 	readonly netcat_opt=-4
 	;;
 "ipv6")
-	readonly tuntype=$2
 	readonly addr1="${ns1_v6}"
 	readonly addr2="${ns2_v6}"
 	readonly netcat_opt=-6
@@ -94,7 +126,10 @@ case "$1" in
 	;;
 esac
 
-echo "encap ${addr1} to ${addr2}, type ${tuntype}"
+readonly tuntype=$2
+readonly datalen=$3
+
+echo "encap ${addr1} to ${addr2}, type ${tuntype}, len ${datalen}"
 
 trap cleanup EXIT
 
@@ -104,6 +139,7 @@ setup
 echo "test basic connectivity"
 server_listen
 client_connect
+verify_data
 
 # clientside, insert bpf program to encap all TCP to port ${port}
 # client can no longer connect
@@ -123,6 +159,7 @@ ip netns exec "${ns2}" ip link add dev testtun0 type "${tuntype}" \
 ip netns exec "${ns2}" ip link set dev testtun0 up
 echo "test bpf encap with tunnel device decap"
 client_connect
+verify_data
 
 # serverside, use BPF for decap
 ip netns exec "${ns2}" ip link del dev testtun0
@@ -132,5 +169,6 @@ ip netns exec "${ns2}" tc filter add dev veth2 ingress \
 server_listen
 echo "test bpf encap with bpf decap"
 client_connect
+verify_data
 
 echo OK
-- 
2.21.0.392.gf8f6787159e-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH bpf-next v3 07/13] bpf: add bpf_skb_adjust_room mode BPF_ADJ_ROOM_MAC
  2019-03-22 18:32 [PATCH bpf-next v3 00/13] bpf tc tunneling Willem de Bruijn
                   ` (5 preceding siblings ...)
  2019-03-22 18:32 ` [PATCH bpf-next v3 06/13] selftests/bpf: extend bpf tunnel test with tso Willem de Bruijn
@ 2019-03-22 18:32 ` Willem de Bruijn
  2019-03-22 18:32 ` [PATCH bpf-next v3 08/13] bpf: add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_FIXED_GSO Willem de Bruijn
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Willem de Bruijn @ 2019-03-22 18:32 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

bpf_skb_adjust_room net allows inserting room in an skb.

Existing mode BPF_ADJ_ROOM_NET inserts room after the network header
by pulling the skb, moving the network header forward and zeroing the
new space.

Add new mode BPF_ADJUST_ROOM_MAC that inserts room after the mac
header. This allows inserting tunnel headers in front of the network
header without having to recreate the network header in the original
space, avoiding two copies.

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 include/uapi/linux/bpf.h |  6 +++++-
 net/core/filter.c        | 38 ++++++++++++++++++++------------------
 2 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 3c04410137d9..7c8fd0647070 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1478,7 +1478,10 @@ union bpf_attr {
  * 		Grow or shrink the room for data in the packet associated to
  * 		*skb* by *len_diff*, and according to the selected *mode*.
  *
- * 		There is a single supported mode at this time:
+ *		There are two supported modes at this time:
+ *
+ *		* **BPF_ADJ_ROOM_MAC**: Adjust room at the mac layer
+ *		  (room space is added or removed below the layer 2 header).
  *
  * 		* **BPF_ADJ_ROOM_NET**: Adjust room at the network layer
  * 		  (room space is added or removed below the layer 3 header).
@@ -2627,6 +2630,7 @@ enum bpf_func_id {
 /* Mode for BPF_FUNC_skb_adjust_room helper. */
 enum bpf_adj_room_mode {
 	BPF_ADJ_ROOM_NET,
+	BPF_ADJ_ROOM_MAC,
 };
 
 /* Mode for BPF_FUNC_skb_load_bytes_relative helper. */
diff --git a/net/core/filter.c b/net/core/filter.c
index d21e1acdde29..e7b7720b18e9 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2963,9 +2963,8 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
 	}
 }
 
-static int bpf_skb_net_grow(struct sk_buff *skb, u32 len_diff)
+static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff)
 {
-	u32 off = skb_mac_header_len(skb) + bpf_skb_net_base_len(skb);
 	int ret;
 
 	if (skb_is_gso(skb) && !skb_is_gso_tcp(skb))
@@ -2992,9 +2991,8 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 len_diff)
 	return 0;
 }
 
-static int bpf_skb_net_shrink(struct sk_buff *skb, u32 len_diff)
+static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff)
 {
-	u32 off = skb_mac_header_len(skb) + bpf_skb_net_base_len(skb);
 	int ret;
 
 	if (skb_is_gso(skb) && !skb_is_gso_tcp(skb))
@@ -3027,7 +3025,8 @@ static u32 __bpf_skb_max_len(const struct sk_buff *skb)
 			  SKB_MAX_ALLOC;
 }
 
-static int bpf_skb_adjust_net(struct sk_buff *skb, s32 len_diff)
+BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
+	   u32, mode, u64, flags)
 {
 	bool trans_same = skb->transport_header == skb->network_header;
 	u32 len_cur, len_diff_abs = abs(len_diff);
@@ -3035,14 +3034,28 @@ static int bpf_skb_adjust_net(struct sk_buff *skb, s32 len_diff)
 	u32 len_max = __bpf_skb_max_len(skb);
 	__be16 proto = skb->protocol;
 	bool shrink = len_diff < 0;
+	u32 off;
 	int ret;
 
+	if (unlikely(flags))
+		return -EINVAL;
 	if (unlikely(len_diff_abs > 0xfffU))
 		return -EFAULT;
 	if (unlikely(proto != htons(ETH_P_IP) &&
 		     proto != htons(ETH_P_IPV6)))
 		return -ENOTSUPP;
 
+	off = skb_mac_header_len(skb);
+	switch (mode) {
+	case BPF_ADJ_ROOM_NET:
+		off += bpf_skb_net_base_len(skb);
+		break;
+	case BPF_ADJ_ROOM_MAC:
+		break;
+	default:
+		return -ENOTSUPP;
+	}
+
 	len_cur = skb->len - skb_network_offset(skb);
 	if (skb_transport_header_was_set(skb) && !trans_same)
 		len_cur = skb_network_header_len(skb);
@@ -3052,24 +3065,13 @@ static int bpf_skb_adjust_net(struct sk_buff *skb, s32 len_diff)
 			 !skb_is_gso(skb))))
 		return -ENOTSUPP;
 
-	ret = shrink ? bpf_skb_net_shrink(skb, len_diff_abs) :
-		       bpf_skb_net_grow(skb, len_diff_abs);
+	ret = shrink ? bpf_skb_net_shrink(skb, off, len_diff_abs) :
+		       bpf_skb_net_grow(skb, off, len_diff_abs);
 
 	bpf_compute_data_pointers(skb);
 	return ret;
 }
 
-BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
-	   u32, mode, u64, flags)
-{
-	if (unlikely(flags))
-		return -EINVAL;
-	if (likely(mode == BPF_ADJ_ROOM_NET))
-		return bpf_skb_adjust_net(skb, len_diff);
-
-	return -ENOTSUPP;
-}
-
 static const struct bpf_func_proto bpf_skb_adjust_room_proto = {
 	.func		= bpf_skb_adjust_room,
 	.gpl_only	= false,
-- 
2.21.0.392.gf8f6787159e-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH bpf-next v3 08/13] bpf: add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_FIXED_GSO
  2019-03-22 18:32 [PATCH bpf-next v3 00/13] bpf tc tunneling Willem de Bruijn
                   ` (6 preceding siblings ...)
  2019-03-22 18:32 ` [PATCH bpf-next v3 07/13] bpf: add bpf_skb_adjust_room mode BPF_ADJ_ROOM_MAC Willem de Bruijn
@ 2019-03-22 18:32 ` Willem de Bruijn
  2019-03-22 18:32 ` [PATCH bpf-next v3 09/13] bpf: add bpf_skb_adjust_room encap flags Willem de Bruijn
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Willem de Bruijn @ 2019-03-22 18:32 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

bpf_skb_adjust_room adjusts gso_size of gso packets to account for the
pushed or popped header room.

This is not allowed with UDP, where gso_size delineates datagrams. Add
an option to avoid these updates and allow this call for datagrams.

It can also be used with TCP, when MSS is known to allow headroom,
e.g., through MSS clamping or route MTU.

Changes v1->v2:
  - document flag BPF_F_ADJ_ROOM_FIXED_GSO
  - do not expose BPF_F_ADJ_ROOM_MASK through uapi, as it may change.

Link: https://patchwork.ozlabs.org/patch/1052497/
Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 include/uapi/linux/bpf.h |  9 +++++++--
 net/core/filter.c        | 38 +++++++++++++++++++++++++++-----------
 2 files changed, 34 insertions(+), 13 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 7c8fd0647070..4f157d0ec571 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1486,8 +1486,10 @@ union bpf_attr {
  * 		* **BPF_ADJ_ROOM_NET**: Adjust room at the network layer
  * 		  (room space is added or removed below the layer 3 header).
  *
- * 		All values for *flags* are reserved for future usage, and must
- * 		be left at zero.
+ *		There is one supported flag at this time:
+ *
+ *		* **BPF_F_ADJ_ROOM_FIXED_GSO**: Do not adjust gso_size.
+ *		  Adjusting mss in this way is not allowed for datagrams.
  *
  * 		A call to this helper is susceptible to change the underlaying
  * 		packet buffer. Therefore, at load time, all checks on pointers
@@ -2627,6 +2629,9 @@ enum bpf_func_id {
 /* Current network namespace */
 #define BPF_F_CURRENT_NETNS		(-1L)
 
+/* BPF_FUNC_skb_adjust_room flags. */
+#define BPF_F_ADJ_ROOM_FIXED_GSO	(1ULL << 0)
+
 /* Mode for BPF_FUNC_skb_adjust_room helper. */
 enum bpf_adj_room_mode {
 	BPF_ADJ_ROOM_NET,
diff --git a/net/core/filter.c b/net/core/filter.c
index e7b7720b18e9..d3240a0a0eeb 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2963,12 +2963,19 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
 	}
 }
 
-static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff)
+#define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO)
+
+static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
+			    u64 flags)
 {
 	int ret;
 
-	if (skb_is_gso(skb) && !skb_is_gso_tcp(skb))
-		return -ENOTSUPP;
+	if (skb_is_gso(skb) && !skb_is_gso_tcp(skb)) {
+		/* udp gso_size delineates datagrams, only allow if fixed */
+		if (!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4) ||
+		    !(flags & BPF_F_ADJ_ROOM_FIXED_GSO))
+			return -ENOTSUPP;
+	}
 
 	ret = skb_cow_head(skb, len_diff);
 	if (unlikely(ret < 0))
@@ -2982,7 +2989,9 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff)
 		struct skb_shared_info *shinfo = skb_shinfo(skb);
 
 		/* Due to header grow, MSS needs to be downgraded. */
-		skb_decrease_gso_size(shinfo, len_diff);
+		if (!(flags & BPF_F_ADJ_ROOM_FIXED_GSO))
+			skb_decrease_gso_size(shinfo, len_diff);
+
 		/* Header must be checked, and gso_segs recomputed. */
 		shinfo->gso_type |= SKB_GSO_DODGY;
 		shinfo->gso_segs = 0;
@@ -2991,12 +3000,17 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff)
 	return 0;
 }
 
-static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff)
+static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff,
+			      u64 flags)
 {
 	int ret;
 
-	if (skb_is_gso(skb) && !skb_is_gso_tcp(skb))
-		return -ENOTSUPP;
+	if (skb_is_gso(skb) && !skb_is_gso_tcp(skb)) {
+		/* udp gso_size delineates datagrams, only allow if fixed */
+		if (!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4) ||
+		    !(flags & BPF_F_ADJ_ROOM_FIXED_GSO))
+			return -ENOTSUPP;
+	}
 
 	ret = skb_unclone(skb, GFP_ATOMIC);
 	if (unlikely(ret < 0))
@@ -3010,7 +3024,9 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff)
 		struct skb_shared_info *shinfo = skb_shinfo(skb);
 
 		/* Due to header shrink, MSS can be upgraded. */
-		skb_increase_gso_size(shinfo, len_diff);
+		if (!(flags & BPF_F_ADJ_ROOM_FIXED_GSO))
+			skb_increase_gso_size(shinfo, len_diff);
+
 		/* Header must be checked, and gso_segs recomputed. */
 		shinfo->gso_type |= SKB_GSO_DODGY;
 		shinfo->gso_segs = 0;
@@ -3037,7 +3053,7 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
 	u32 off;
 	int ret;
 
-	if (unlikely(flags))
+	if (unlikely(flags & ~BPF_F_ADJ_ROOM_MASK))
 		return -EINVAL;
 	if (unlikely(len_diff_abs > 0xfffU))
 		return -EFAULT;
@@ -3065,8 +3081,8 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
 			 !skb_is_gso(skb))))
 		return -ENOTSUPP;
 
-	ret = shrink ? bpf_skb_net_shrink(skb, off, len_diff_abs) :
-		       bpf_skb_net_grow(skb, off, len_diff_abs);
+	ret = shrink ? bpf_skb_net_shrink(skb, off, len_diff_abs, flags) :
+		       bpf_skb_net_grow(skb, off, len_diff_abs, flags);
 
 	bpf_compute_data_pointers(skb);
 	return ret;
-- 
2.21.0.392.gf8f6787159e-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH bpf-next v3 09/13] bpf: add bpf_skb_adjust_room encap flags
  2019-03-22 18:32 [PATCH bpf-next v3 00/13] bpf tc tunneling Willem de Bruijn
                   ` (7 preceding siblings ...)
  2019-03-22 18:32 ` [PATCH bpf-next v3 08/13] bpf: add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_FIXED_GSO Willem de Bruijn
@ 2019-03-22 18:32 ` Willem de Bruijn
  2019-03-22 18:32 ` [PATCH bpf-next v3 10/13] bpf: Sync bpf.h to tools Willem de Bruijn
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Willem de Bruijn @ 2019-03-22 18:32 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

When pushing tunnel headers, annotate skbs in the same way as tunnel
devices.

For GSO packets, the network stack requires certain fields set to
segment packets with tunnel headers. gro_gse_segment depends on
transport and inner mac header, for instance.

Add an option to pass this information.

Remove the restriction on len_diff to network header length, which
is too short, e.g., for GRE protocols.

Changes
  v1->v2:
  - document new flags
  - BPF_F_ADJ_ROOM_MASK moved
  v2->v3:
  - BPF_F_ADJ_ROOM_ENCAP_L3_MASK moved

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 include/uapi/linux/bpf.h | 16 +++++++++-
 net/core/filter.c        | 66 +++++++++++++++++++++++++++++++++++++---
 2 files changed, 76 insertions(+), 6 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 4f157d0ec571..837024512baf 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1486,11 +1486,20 @@ union bpf_attr {
  * 		* **BPF_ADJ_ROOM_NET**: Adjust room at the network layer
  * 		  (room space is added or removed below the layer 3 header).
  *
- *		There is one supported flag at this time:
+ *		The following flags are supported at this time:
  *
  *		* **BPF_F_ADJ_ROOM_FIXED_GSO**: Do not adjust gso_size.
  *		  Adjusting mss in this way is not allowed for datagrams.
  *
+ *		* **BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 **:
+ *		* **BPF_F_ADJ_ROOM_ENCAP_L3_IPV6 **:
+ *		  Any new space is reserved to hold a tunnel header.
+ *		  Configure skb offsets and other fields accordingly.
+ *
+ *		* **BPF_F_ADJ_ROOM_ENCAP_L4_GRE **:
+ *		* **BPF_F_ADJ_ROOM_ENCAP_L4_UDP **:
+ *		  Use with ENCAP_L3 flags to further specify the tunnel type.
+ *
  * 		A call to this helper is susceptible to change the underlaying
  * 		packet buffer. Therefore, at load time, all checks on pointers
  * 		previously done by the verifier are invalidated and must be
@@ -2632,6 +2641,11 @@ enum bpf_func_id {
 /* BPF_FUNC_skb_adjust_room flags. */
 #define BPF_F_ADJ_ROOM_FIXED_GSO	(1ULL << 0)
 
+#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV4	(1ULL << 1)
+#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV6	(1ULL << 2)
+#define BPF_F_ADJ_ROOM_ENCAP_L4_GRE	(1ULL << 3)
+#define BPF_F_ADJ_ROOM_ENCAP_L4_UDP	(1ULL << 4)
+
 /* Mode for BPF_FUNC_skb_adjust_room helper. */
 enum bpf_adj_room_mode {
 	BPF_ADJ_ROOM_NET,
diff --git a/net/core/filter.c b/net/core/filter.c
index d3240a0a0eeb..c1d19b074d6c 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2963,11 +2963,20 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
 	}
 }
 
-#define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO)
+#define BPF_F_ADJ_ROOM_ENCAP_L3_MASK	(BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 | \
+					 BPF_F_ADJ_ROOM_ENCAP_L3_IPV6)
+
+#define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO | \
+					 BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
+					 BPF_F_ADJ_ROOM_ENCAP_L4_GRE | \
+					 BPF_F_ADJ_ROOM_ENCAP_L4_UDP)
 
 static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
 			    u64 flags)
 {
+	bool encap = flags & BPF_F_ADJ_ROOM_ENCAP_L3_MASK;
+	unsigned int gso_type = SKB_GSO_DODGY;
+	u16 mac_len, inner_net, inner_trans;
 	int ret;
 
 	if (skb_is_gso(skb) && !skb_is_gso_tcp(skb)) {
@@ -2981,10 +2990,60 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
 	if (unlikely(ret < 0))
 		return ret;
 
+	if (encap) {
+		if (skb->protocol != htons(ETH_P_IP) &&
+		    skb->protocol != htons(ETH_P_IPV6))
+			return -ENOTSUPP;
+
+		if (flags & BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 &&
+		    flags & BPF_F_ADJ_ROOM_ENCAP_L3_IPV6)
+			return -EINVAL;
+
+		if (flags & BPF_F_ADJ_ROOM_ENCAP_L4_GRE &&
+		    flags & BPF_F_ADJ_ROOM_ENCAP_L4_UDP)
+			return -EINVAL;
+
+		if (skb->encapsulation)
+			return -EALREADY;
+
+		mac_len = skb->network_header - skb->mac_header;
+		inner_net = skb->network_header;
+		inner_trans = skb->transport_header;
+	}
+
 	ret = bpf_skb_net_hdr_push(skb, off, len_diff);
 	if (unlikely(ret < 0))
 		return ret;
 
+	if (encap) {
+		/* inner mac == inner_net on l3 encap */
+		skb->inner_mac_header = inner_net;
+		skb->inner_network_header = inner_net;
+		skb->inner_transport_header = inner_trans;
+		skb_set_inner_protocol(skb, skb->protocol);
+
+		skb->encapsulation = 1;
+		skb_set_network_header(skb, mac_len);
+
+		if (flags & BPF_F_ADJ_ROOM_ENCAP_L4_UDP)
+			gso_type |= SKB_GSO_UDP_TUNNEL;
+		else if (flags & BPF_F_ADJ_ROOM_ENCAP_L4_GRE)
+			gso_type |= SKB_GSO_GRE;
+		else if (flags & BPF_F_ADJ_ROOM_ENCAP_L3_IPV6)
+			gso_type |= SKB_GSO_IPXIP6;
+		else
+			gso_type |= SKB_GSO_IPXIP4;
+
+		if (flags & BPF_F_ADJ_ROOM_ENCAP_L4_GRE ||
+		    flags & BPF_F_ADJ_ROOM_ENCAP_L4_UDP) {
+			int nh_len = flags & BPF_F_ADJ_ROOM_ENCAP_L3_IPV6 ?
+					sizeof(struct ipv6hdr) :
+					sizeof(struct iphdr);
+
+			skb_set_transport_header(skb, mac_len + nh_len);
+		}
+	}
+
 	if (skb_is_gso(skb)) {
 		struct skb_shared_info *shinfo = skb_shinfo(skb);
 
@@ -2993,7 +3052,7 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
 			skb_decrease_gso_size(shinfo, len_diff);
 
 		/* Header must be checked, and gso_segs recomputed. */
-		shinfo->gso_type |= SKB_GSO_DODGY;
+		shinfo->gso_type |= gso_type;
 		shinfo->gso_segs = 0;
 	}
 
@@ -3044,7 +3103,6 @@ static u32 __bpf_skb_max_len(const struct sk_buff *skb)
 BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
 	   u32, mode, u64, flags)
 {
-	bool trans_same = skb->transport_header == skb->network_header;
 	u32 len_cur, len_diff_abs = abs(len_diff);
 	u32 len_min = bpf_skb_net_base_len(skb);
 	u32 len_max = __bpf_skb_max_len(skb);
@@ -3073,8 +3131,6 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
 	}
 
 	len_cur = skb->len - skb_network_offset(skb);
-	if (skb_transport_header_was_set(skb) && !trans_same)
-		len_cur = skb_network_header_len(skb);
 	if ((shrink && (len_diff_abs >= len_cur ||
 			len_cur - len_diff_abs < len_min)) ||
 	    (!shrink && (skb->len + len_diff_abs > len_max &&
-- 
2.21.0.392.gf8f6787159e-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH bpf-next v3 10/13] bpf: Sync bpf.h to tools
  2019-03-22 18:32 [PATCH bpf-next v3 00/13] bpf tc tunneling Willem de Bruijn
                   ` (8 preceding siblings ...)
  2019-03-22 18:32 ` [PATCH bpf-next v3 09/13] bpf: add bpf_skb_adjust_room encap flags Willem de Bruijn
@ 2019-03-22 18:32 ` Willem de Bruijn
  2019-03-22 18:32 ` [PATCH bpf-next v3 11/13] selftests/bpf: convert bpf tunnel test to BPF_ADJ_ROOM_MAC Willem de Bruijn
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Willem de Bruijn @ 2019-03-22 18:32 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

Sync include/uapi/linux/bpf.h with tools/

Changes
  v1->v2:
  - BPF_F_ADJ_ROOM_MASK moved, no longer in this commit
  v2->v3:
  - BPF_F_ADJ_ROOM_ENCAP_L3_MASK moved, no longer in this commit

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 tools/include/uapi/linux/bpf.h | 29 ++++++++++++++++++++++++++---
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 3c04410137d9..837024512baf 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1478,13 +1478,27 @@ union bpf_attr {
  * 		Grow or shrink the room for data in the packet associated to
  * 		*skb* by *len_diff*, and according to the selected *mode*.
  *
- * 		There is a single supported mode at this time:
+ *		There are two supported modes at this time:
+ *
+ *		* **BPF_ADJ_ROOM_MAC**: Adjust room at the mac layer
+ *		  (room space is added or removed below the layer 2 header).
  *
  * 		* **BPF_ADJ_ROOM_NET**: Adjust room at the network layer
  * 		  (room space is added or removed below the layer 3 header).
  *
- * 		All values for *flags* are reserved for future usage, and must
- * 		be left at zero.
+ *		The following flags are supported at this time:
+ *
+ *		* **BPF_F_ADJ_ROOM_FIXED_GSO**: Do not adjust gso_size.
+ *		  Adjusting mss in this way is not allowed for datagrams.
+ *
+ *		* **BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 **:
+ *		* **BPF_F_ADJ_ROOM_ENCAP_L3_IPV6 **:
+ *		  Any new space is reserved to hold a tunnel header.
+ *		  Configure skb offsets and other fields accordingly.
+ *
+ *		* **BPF_F_ADJ_ROOM_ENCAP_L4_GRE **:
+ *		* **BPF_F_ADJ_ROOM_ENCAP_L4_UDP **:
+ *		  Use with ENCAP_L3 flags to further specify the tunnel type.
  *
  * 		A call to this helper is susceptible to change the underlaying
  * 		packet buffer. Therefore, at load time, all checks on pointers
@@ -2624,9 +2638,18 @@ enum bpf_func_id {
 /* Current network namespace */
 #define BPF_F_CURRENT_NETNS		(-1L)
 
+/* BPF_FUNC_skb_adjust_room flags. */
+#define BPF_F_ADJ_ROOM_FIXED_GSO	(1ULL << 0)
+
+#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV4	(1ULL << 1)
+#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV6	(1ULL << 2)
+#define BPF_F_ADJ_ROOM_ENCAP_L4_GRE	(1ULL << 3)
+#define BPF_F_ADJ_ROOM_ENCAP_L4_UDP	(1ULL << 4)
+
 /* Mode for BPF_FUNC_skb_adjust_room helper. */
 enum bpf_adj_room_mode {
 	BPF_ADJ_ROOM_NET,
+	BPF_ADJ_ROOM_MAC,
 };
 
 /* Mode for BPF_FUNC_skb_load_bytes_relative helper. */
-- 
2.21.0.392.gf8f6787159e-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH bpf-next v3 11/13] selftests/bpf: convert bpf tunnel test to BPF_ADJ_ROOM_MAC
  2019-03-22 18:32 [PATCH bpf-next v3 00/13] bpf tc tunneling Willem de Bruijn
                   ` (9 preceding siblings ...)
  2019-03-22 18:32 ` [PATCH bpf-next v3 10/13] bpf: Sync bpf.h to tools Willem de Bruijn
@ 2019-03-22 18:32 ` Willem de Bruijn
  2019-03-22 18:32 ` [PATCH bpf-next v3 12/13] selftests/bpf: convert bpf tunnel test to BPF_F_ADJ_ROOM_FIXED_GSO Willem de Bruijn
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Willem de Bruijn @ 2019-03-22 18:32 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

Avoid moving the network layer header when prefixing tunnel headers.

This avoids an explicit call to bpf_skb_store_bytes and an implicit
move of the network header bytes in bpf_skb_adjust_room.

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 .../selftests/bpf/progs/test_tc_tunnel.c      | 25 +++----------------
 1 file changed, 3 insertions(+), 22 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
index 900c5653105f..f6a16fd23dbd 100644
--- a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
+++ b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
@@ -72,7 +72,7 @@ static __always_inline int encap_ipv4(struct __sk_buff *skb, bool with_gre)
 	olen = with_gre ? sizeof(h_outer) : sizeof(h_outer.ip);
 
 	/* add room between mac and network header */
-	if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_NET, 0))
+	if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_MAC, 0))
 		return TC_ACT_SHOT;
 
 	/* prepare new outer network header */
@@ -94,12 +94,6 @@ static __always_inline int encap_ipv4(struct __sk_buff *skb, bool with_gre)
 				BPF_F_INVALIDATE_HASH) < 0)
 		return TC_ACT_SHOT;
 
-	/* bpf_skb_adjust_room has moved header to start of room: restore */
-	if (bpf_skb_store_bytes(skb, ETH_HLEN + olen,
-				&iph_inner, sizeof(iph_inner),
-				BPF_F_INVALIDATE_HASH) < 0)
-		return TC_ACT_SHOT;
-
 	return TC_ACT_OK;
 }
 
@@ -125,7 +119,7 @@ static __always_inline int encap_ipv6(struct __sk_buff *skb, bool with_gre)
 	olen = with_gre ? sizeof(h_outer) : sizeof(h_outer.ip);
 
 	/* add room between mac and network header */
-	if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_NET, 0))
+	if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_MAC, 0))
 		return TC_ACT_SHOT;
 
 	/* prepare new outer network header */
@@ -145,12 +139,6 @@ static __always_inline int encap_ipv6(struct __sk_buff *skb, bool with_gre)
 				BPF_F_INVALIDATE_HASH) < 0)
 		return TC_ACT_SHOT;
 
-	/* bpf_skb_adjust_room has moved header to start of room: restore */
-	if (bpf_skb_store_bytes(skb, ETH_HLEN + olen,
-				&iph_inner, sizeof(iph_inner),
-				BPF_F_INVALIDATE_HASH) < 0)
-		return TC_ACT_SHOT;
-
 	return TC_ACT_OK;
 }
 
@@ -207,14 +195,7 @@ static int decap_internal(struct __sk_buff *skb, int off, int len, char proto)
 		return TC_ACT_OK;
 	}
 
-	if (bpf_skb_load_bytes(skb, off + olen, &buf, olen) < 0)
-		return TC_ACT_OK;
-
-	if (bpf_skb_adjust_room(skb, -olen, BPF_ADJ_ROOM_NET, 0))
-		return TC_ACT_SHOT;
-
-	/* bpf_skb_adjust_room has moved outer over inner header: restore */
-	if (bpf_skb_store_bytes(skb, off, buf, len, BPF_F_INVALIDATE_HASH) < 0)
+	if (bpf_skb_adjust_room(skb, -olen, BPF_ADJ_ROOM_MAC, 0))
 		return TC_ACT_SHOT;
 
 	return TC_ACT_OK;
-- 
2.21.0.392.gf8f6787159e-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH bpf-next v3 12/13] selftests/bpf: convert bpf tunnel test to BPF_F_ADJ_ROOM_FIXED_GSO
  2019-03-22 18:32 [PATCH bpf-next v3 00/13] bpf tc tunneling Willem de Bruijn
                   ` (10 preceding siblings ...)
  2019-03-22 18:32 ` [PATCH bpf-next v3 11/13] selftests/bpf: convert bpf tunnel test to BPF_ADJ_ROOM_MAC Willem de Bruijn
@ 2019-03-22 18:32 ` Willem de Bruijn
  2019-03-22 18:33 ` [PATCH bpf-next v3 13/13] selftests/bpf: convert bpf tunnel test to encap modes Willem de Bruijn
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Willem de Bruijn @ 2019-03-22 18:32 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

Lower route MTU to ensure packets fit in device MTU after encap, then
skip the gso_size changes.

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 tools/testing/selftests/bpf/progs/test_tc_tunnel.c | 11 ++++++++---
 tools/testing/selftests/bpf/test_tc_tunnel.sh      |  6 ++++++
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
index f6a16fd23dbd..3b79dffb8103 100644
--- a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
+++ b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
@@ -52,6 +52,7 @@ static __always_inline int encap_ipv4(struct __sk_buff *skb, bool with_gre)
 	struct grev4hdr h_outer;
 	struct iphdr iph_inner;
 	struct tcphdr tcph;
+	__u64 flags;
 	int olen;
 
 	if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_inner,
@@ -69,10 +70,11 @@ static __always_inline int encap_ipv4(struct __sk_buff *skb, bool with_gre)
 	if (tcph.dest != __bpf_constant_htons(cfg_port))
 		return TC_ACT_OK;
 
+	flags = BPF_F_ADJ_ROOM_FIXED_GSO;
 	olen = with_gre ? sizeof(h_outer) : sizeof(h_outer.ip);
 
 	/* add room between mac and network header */
-	if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_MAC, 0))
+	if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_MAC, flags))
 		return TC_ACT_SHOT;
 
 	/* prepare new outer network header */
@@ -102,6 +104,7 @@ static __always_inline int encap_ipv6(struct __sk_buff *skb, bool with_gre)
 	struct ipv6hdr iph_inner;
 	struct grev6hdr h_outer;
 	struct tcphdr tcph;
+	__u64 flags;
 	int olen;
 
 	if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_inner,
@@ -116,10 +119,11 @@ static __always_inline int encap_ipv6(struct __sk_buff *skb, bool with_gre)
 	if (tcph.dest != __bpf_constant_htons(cfg_port))
 		return TC_ACT_OK;
 
+	flags = BPF_F_ADJ_ROOM_FIXED_GSO;
 	olen = with_gre ? sizeof(h_outer) : sizeof(h_outer.ip);
 
 	/* add room between mac and network header */
-	if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_MAC, 0))
+	if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_MAC, flags))
 		return TC_ACT_SHOT;
 
 	/* prepare new outer network header */
@@ -195,7 +199,8 @@ static int decap_internal(struct __sk_buff *skb, int off, int len, char proto)
 		return TC_ACT_OK;
 	}
 
-	if (bpf_skb_adjust_room(skb, -olen, BPF_ADJ_ROOM_MAC, 0))
+	if (bpf_skb_adjust_room(skb, -olen, BPF_ADJ_ROOM_MAC,
+				BPF_F_ADJ_ROOM_FIXED_GSO))
 		return TC_ACT_SHOT;
 
 	return TC_ACT_OK;
diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
index 9e18754f2354..cda5317790d2 100755
--- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -35,6 +35,12 @@ setup() {
 	ip -netns "${ns1}" -6 addr add "${ns1_v6}/64" dev veth1 nodad
 	ip -netns "${ns2}" -6 addr add "${ns2_v6}/64" dev veth2 nodad
 
+	# clamp route to reserve room for tunnel headers
+	ip -netns "${ns1}" -4 route flush table main
+	ip -netns "${ns1}" -6 route flush table main
+	ip -netns "${ns1}" -4 route add "${ns2_v4}" mtu 1476 dev veth1
+	ip -netns "${ns1}" -6 route add "${ns2_v6}" mtu 1456 dev veth1
+
 	sleep 1
 
 	dd if=/dev/urandom of="${infile}" bs="${datalen}" count=1 status=none
-- 
2.21.0.392.gf8f6787159e-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH bpf-next v3 13/13] selftests/bpf: convert bpf tunnel test to encap modes
  2019-03-22 18:32 [PATCH bpf-next v3 00/13] bpf tc tunneling Willem de Bruijn
                   ` (11 preceding siblings ...)
  2019-03-22 18:32 ` [PATCH bpf-next v3 12/13] selftests/bpf: convert bpf tunnel test to BPF_F_ADJ_ROOM_FIXED_GSO Willem de Bruijn
@ 2019-03-22 18:33 ` Willem de Bruijn
  2019-03-22 21:01 ` [PATCH bpf-next v3 00/13] bpf tc tunneling Alexei Starovoitov
  2019-03-23 10:53 ` Daniel Borkmann
  14 siblings, 0 replies; 17+ messages in thread
From: Willem de Bruijn @ 2019-03-22 18:33 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

Make the tests correctly annotate skbs with tunnel metadata.

This makes the gso tests succeed. Enable them.

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 .../selftests/bpf/progs/test_tc_tunnel.c      | 19 +++++++++++++++----
 tools/testing/selftests/bpf/test_tc_tunnel.sh | 10 ++++------
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
index 3b79dffb8103..f541c2de947d 100644
--- a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
+++ b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
@@ -70,8 +70,13 @@ static __always_inline int encap_ipv4(struct __sk_buff *skb, bool with_gre)
 	if (tcph.dest != __bpf_constant_htons(cfg_port))
 		return TC_ACT_OK;
 
-	flags = BPF_F_ADJ_ROOM_FIXED_GSO;
-	olen = with_gre ? sizeof(h_outer) : sizeof(h_outer.ip);
+	flags = BPF_F_ADJ_ROOM_FIXED_GSO | BPF_F_ADJ_ROOM_ENCAP_L3_IPV4;
+	if (with_gre) {
+		flags |= BPF_F_ADJ_ROOM_ENCAP_L4_GRE;
+		olen = sizeof(h_outer);
+	} else {
+		olen = sizeof(h_outer.ip);
+	}
 
 	/* add room between mac and network header */
 	if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_MAC, flags))
@@ -119,8 +124,14 @@ static __always_inline int encap_ipv6(struct __sk_buff *skb, bool with_gre)
 	if (tcph.dest != __bpf_constant_htons(cfg_port))
 		return TC_ACT_OK;
 
-	flags = BPF_F_ADJ_ROOM_FIXED_GSO;
-	olen = with_gre ? sizeof(h_outer) : sizeof(h_outer.ip);
+	flags = BPF_F_ADJ_ROOM_FIXED_GSO | BPF_F_ADJ_ROOM_ENCAP_L3_IPV6;
+	if (with_gre) {
+		flags |= BPF_F_ADJ_ROOM_ENCAP_L4_GRE;
+		olen = sizeof(h_outer);
+	} else {
+		olen = sizeof(h_outer.ip);
+	}
+
 
 	/* add room between mac and network header */
 	if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_MAC, flags))
diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
index cda5317790d2..dcf320626931 100755
--- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -97,13 +97,11 @@ if [[ "$#" -eq "0" ]]; then
 	echo "ip6 gre"
 	$0 ipv6 ip6gre 100
 
-	# disabled until passes SKB_GSO_DODGY checks
-	# echo "ip gre gso"
-	# $0 ipv4 gre 2000
+	echo "ip gre gso"
+	$0 ipv4 gre 2000
 
-	# disabled until passes SKB_GSO_DODGY checks
-	# echo "ip6 gre gso"
-	# $0 ipv6 ip6gre 2000
+	echo "ip6 gre gso"
+	$0 ipv6 ip6gre 2000
 
 	echo "OK. All tests passed"
 	exit 0
-- 
2.21.0.392.gf8f6787159e-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v3 00/13] bpf tc tunneling
  2019-03-22 18:32 [PATCH bpf-next v3 00/13] bpf tc tunneling Willem de Bruijn
                   ` (12 preceding siblings ...)
  2019-03-22 18:33 ` [PATCH bpf-next v3 13/13] selftests/bpf: convert bpf tunnel test to encap modes Willem de Bruijn
@ 2019-03-22 21:01 ` Alexei Starovoitov
  2019-03-23 10:53 ` Daniel Borkmann
  14 siblings, 0 replies; 17+ messages in thread
From: Alexei Starovoitov @ 2019-03-22 21:01 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: netdev, ast, daniel, sdf, posk, Willem de Bruijn

On Fri, Mar 22, 2019 at 02:32:47PM -0400, Willem de Bruijn wrote:
> From: Willem de Bruijn <willemb@google.com>
> 
> BPF allows for dynamic tunneling, choosing the tunnel destination and
> features on-demand. Extend bpf_skb_adjust_room to allow for efficient
> tunneling at the TC hooks.
> 
> Most features are required for large packets with GSO, as these will
> be modified after this patch.
> 
> Patch 1
>   is a performance optimization, avoiding an unnecessary unclone
>   for the TCP hot path.
> 
> Patches 2..6
>   introduce a regression test. These can be squashed, but the code is
>   arguably more readable when gradually expanding the feature set.
> 
> Patch 7
>   is a performance optimization, avoid copying network headers
>   that are going to be overwritten. This also simplifies the bpf
>   program.
> 
> Patch 8
>   reenables bpf_skb_adjust_room for UDP packets.
> 
> Patch 9
>   configures skb tunneling metadata analogous to tunnel devices.
> 
> Patches 10..13
>   expand the regression test to make use of the new features and
>   enable the GSO testcases.
> 
> Changes
>   v1->v2
>   - move BPF_F_ADJ_ROOM_MASK out of uapi as it can be expanded
>   - document new flags
>   - in tests replace netcat -q flag with coreutils timeout:
>       the -q flag is not supported in all netcat versions
>   v2->v3
>   - move BPF_F_ADJ_ROOM_ENCAP_L3_MASK out of uapi as it has no
>     use in userspace

Applied, Thanks


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v3 00/13] bpf tc tunneling
  2019-03-22 18:32 [PATCH bpf-next v3 00/13] bpf tc tunneling Willem de Bruijn
                   ` (13 preceding siblings ...)
  2019-03-22 21:01 ` [PATCH bpf-next v3 00/13] bpf tc tunneling Alexei Starovoitov
@ 2019-03-23 10:53 ` Daniel Borkmann
  2019-03-23 16:03   ` Willem de Bruijn
  14 siblings, 1 reply; 17+ messages in thread
From: Daniel Borkmann @ 2019-03-23 10:53 UTC (permalink / raw)
  To: Willem de Bruijn, netdev; +Cc: ast, sdf, posk, Willem de Bruijn

Hey Willem,

On 03/22/2019 07:32 PM, Willem de Bruijn wrote:
> From: Willem de Bruijn <willemb@google.com>
> 
> BPF allows for dynamic tunneling, choosing the tunnel destination and
> features on-demand. Extend bpf_skb_adjust_room to allow for efficient
> tunneling at the TC hooks.

Patch 9 in this series caused the following warning, please take a look
for sending a follow-up fix, thanks.

Report:

tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git  master
branch HEAD: 7df5e3db8f6354125540dfa50affd2c02b7d2832  selftests: bpf: tc-bpf flow shaping with EDT

Regressions in current branch:

net/core/filter.c:3026:3: warning: 'mac_len' may be used uninitialized in this function [-Wmaybe-uninitialized]
net/core/filter.c:3027:3: warning: 'mac_len' may be used uninitialized in this function [-Wmaybe-uninitialized]

Error ids grouped by kconfigs:

recent_errors
├── arm64-allmodconfig
│   └── net-core-filter.c:warning:mac_len-may-be-used-uninitialized-in-this-function
└── x86_64-allmodconfig
    └── net-core-filter.c:warning:mac_len-may-be-used-uninitialized-in-this-function

elapsed time: 171m

configs tested: 219

alpha                               defconfig
parisc                            allnoconfig
parisc                         b180_defconfig
parisc                        c3000_defconfig
parisc                              defconfig
um                                  defconfig
arm                           u8500_defconfig
nds32                            allyesconfig
mips                 pnx8335_stb225_defconfig
um                               allmodconfig
mips                malta_kvm_guest_defconfig
powerpc               mpc834x_itxgp_defconfig
x86_64                 randconfig-a0-03231212
riscv                              tinyconfig
i386                               tinyconfig
arm                         iop13xx_defconfig
arm                            pleb_defconfig
mips                       lemote2f_defconfig
powerpc                 mpc8560_ads_defconfig
mips                        bcm63xx_defconfig
powerpc                 mpc8540_ads_defconfig
alpha                            alldefconfig
nds32                             allnoconfig
arm                           viper_defconfig
mips                        generic_defconfig
m68k                            mac_defconfig
mips                          rb532_defconfig
x86_64                              fedora-25
x86_64                                  kexec
x86_64                                    lkp
x86_64                                   rhel
x86_64                               rhel-7.6
i386                     randconfig-n3-201911
i386                     randconfig-n2-201911
i386                     randconfig-n0-201911
i386                     randconfig-n1-201911
mips                           32r2_defconfig
mips                         64r6el_defconfig
mips                             allmodconfig
mips                              allnoconfig
mips                      fuloong2e_defconfig
mips                                   jz4740
mips                      malta_kvm_defconfig
mips                                     txx9
nds32                               defconfig
riscv                             allnoconfig
riscv                               defconfig
x86_64                           allyesconfig
i386                             allmodconfig
x86_64                 randconfig-x015-201911
x86_64                 randconfig-x016-201911
x86_64                 randconfig-x013-201911
x86_64                 randconfig-x012-201911
x86_64                 randconfig-x017-201911
x86_64                 randconfig-x018-201911
x86_64                 randconfig-x010-201911
x86_64                 randconfig-x014-201911
x86_64                 randconfig-x019-201911
x86_64                 randconfig-x011-201911
ia64                             alldefconfig
ia64                             allmodconfig
ia64                              allnoconfig
ia64                                defconfig
arm                              allmodconfig
arm                                      arm5
arm                                     arm67
arm                       imx_v6_v7_defconfig
arm                          ixp4xx_defconfig
arm                        mvebu_v7_defconfig
arm                       omap2plus_defconfig
arm                                    sa1100
arm                                   samsung
arm                                        sh
arm                           tegra_defconfig
arm64                            alldefconfig
arm64                            allmodconfig
i386                     randconfig-a1-201911
i386                     randconfig-a2-201911
i386                     randconfig-a0-201911
i386                     randconfig-a3-201911
c6x                        evmc6678_defconfig
h8300                    h8300h-sim_defconfig
nios2                         10m50_defconfig
xtensa                       common_defconfig
xtensa                          iss_defconfig
arm                               allnoconfig
arm                         at91_dt_defconfig
arm                           efm32_defconfig
arm                          exynos_defconfig
arm                        multi_v5_defconfig
arm                        multi_v7_defconfig
arm                        shmobile_defconfig
arm                           sunxi_defconfig
arm64                             allnoconfig
arm64                            allyesconfig
arm64                               defconfig
microblaze                      mmu_defconfig
microblaze                    nommu_defconfig
sparc                               defconfig
sparc64                          allmodconfig
sparc64                           allnoconfig
sparc64                             defconfig
sh                               allyesconfig
sh                         ecovec24_defconfig
mips                malta_qemu_32r6_defconfig
um                           x86_64_defconfig
arm                             ezx_defconfig
sh                                allnoconfig
mips                        bcm47xx_defconfig
powerpc                        fsp2_defconfig
arm                       aspeed_g4_defconfig
powerpc                      arches_defconfig
arm                            mps2_defconfig
arm                      tct_hammer_defconfig
arm                       cns3420vb_defconfig
s390                             alldefconfig
c6x                              allmodconfig
powerpc                   currituck_defconfig
h8300                            allyesconfig
powerpc                       maple_defconfig
arm                           spitz_defconfig
c6x                               allnoconfig
mips                       bmips_be_defconfig
powerpc                     ep8248e_defconfig
nios2                            allyesconfig
x86_64                 randconfig-u0-03231148
x86_64                 randconfig-u0-03231200
x86_64                 randconfig-u0-03231210
x86_64                 randconfig-u0-03231220
x86_64                 randconfig-u0-03231230
x86_64                 randconfig-u0-03231240
x86_64                 randconfig-k0-03180607
x86_64                 randconfig-k1-03180607
x86_64                 randconfig-k2-03180607
x86_64                 randconfig-k3-03180607
i386                   randconfig-k0-03180607
i386                   randconfig-k1-03180607
i386                   randconfig-k2-03180607
i386                   randconfig-k3-03180607
x86_64                             acpi-redef
x86_64                           allyesdebian
x86_64                                nfsroot
powerpc                           allnoconfig
powerpc                             defconfig
powerpc                       ppc64_defconfig
s390                        default_defconfig
sh                               allmodconfig
sh                          rsk7269_defconfig
sh                  sh7785lcr_32bit_defconfig
sh                            titan_defconfig
openrisc                    or1ksim_defconfig
um                             i386_defconfig
x86_64                 randconfig-m1-03190153
i386                   randconfig-m0-03190153
x86_64                 randconfig-m0-03190153
i386                   randconfig-m1-03190153
x86_64                 randconfig-m2-03190153
x86_64                 randconfig-m3-03190153
i386                   randconfig-m3-03190153
i386                   randconfig-m2-03190153
x86_64                 randconfig-l0-03230910
x86_64                 randconfig-l1-03230910
x86_64                 randconfig-l2-03230910
x86_64                 randconfig-l3-03230910
i386                   randconfig-l0-03230910
i386                   randconfig-l1-03230910
i386                   randconfig-l2-03230910
i386                   randconfig-l3-03230910
x86_64                 randconfig-s3-03231146
x86_64                 randconfig-s4-03231146
x86_64                 randconfig-s5-03231146
i386                   randconfig-x010-201911
i386                   randconfig-x018-201911
i386                   randconfig-x017-201911
i386                   randconfig-x013-201911
i386                   randconfig-x014-201911
i386                   randconfig-x011-201911
i386                   randconfig-x015-201911
i386                   randconfig-x019-201911
i386                   randconfig-x012-201911
i386                   randconfig-x016-201911
i386                   randconfig-x077-201911
i386                   randconfig-x070-201911
i386                   randconfig-x074-201911
i386                   randconfig-x072-201911
i386                   randconfig-x075-201911
i386                   randconfig-x071-201911
i386                   randconfig-x078-201911
i386                   randconfig-x079-201911
i386                   randconfig-x076-201911
i386                   randconfig-x073-201911
m68k                             allmodconfig
m68k                       m5475evb_defconfig
m68k                          multi_defconfig
m68k                           sun3_defconfig
i386                   randconfig-x002-201911
i386                   randconfig-x005-201911
i386                   randconfig-x001-201911
i386                   randconfig-x007-201911
i386                   randconfig-x000-201911
i386                   randconfig-x008-201911
i386                   randconfig-x004-201911
i386                   randconfig-x009-201911
i386                   randconfig-x003-201911
i386                   randconfig-x006-201911
x86_64                 randconfig-x000-201911
x86_64                 randconfig-x001-201911
x86_64                 randconfig-x004-201911
x86_64                 randconfig-x003-201911
x86_64                 randconfig-x007-201911
x86_64                 randconfig-x008-201911
x86_64                 randconfig-x002-201911
x86_64                 randconfig-x006-201911
x86_64                 randconfig-x005-201911
x86_64                 randconfig-x009-201911
i386                             alldefconfig
i386                              allnoconfig
i386                                defconfig
x86_64                           allmodconfig

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v3 00/13] bpf tc tunneling
  2019-03-23 10:53 ` Daniel Borkmann
@ 2019-03-23 16:03   ` Willem de Bruijn
  0 siblings, 0 replies; 17+ messages in thread
From: Willem de Bruijn @ 2019-03-23 16:03 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Network Development, Alexei Starovoitov, Stanislav Fomichev,
	Peter Oskolkov, Willem de Bruijn

On Sat, Mar 23, 2019 at 6:53 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> Hey Willem,
>
> On 03/22/2019 07:32 PM, Willem de Bruijn wrote:
> > From: Willem de Bruijn <willemb@google.com>
> >
> > BPF allows for dynamic tunneling, choosing the tunnel destination and
> > features on-demand. Extend bpf_skb_adjust_room to allow for efficient
> > tunneling at the TC hooks.
>
> Patch 9 in this series caused the following warning, please take a look
> for sending a follow-up fix, thanks.
>
> Report:
>
> tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git  master
> branch HEAD: 7df5e3db8f6354125540dfa50affd2c02b7d2832  selftests: bpf: tc-bpf flow shaping with EDT
>
> Regressions in current branch:
>
> net/core/filter.c:3026:3: warning: 'mac_len' may be used uninitialized in this function [-Wmaybe-uninitialized]
> net/core/filter.c:3027:3: warning: 'mac_len' may be used uninitialized in this function [-Wmaybe-uninitialized]

That's surprising, sorry about that. Sent
http://patchwork.ozlabs.org/patch/1062325/ .

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2019-03-23 16:04 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-22 18:32 [PATCH bpf-next v3 00/13] bpf tc tunneling Willem de Bruijn
2019-03-22 18:32 ` [PATCH bpf-next v3 01/13] bpf: in bpf_skb_adjust_room avoid copy in tx fast path Willem de Bruijn
2019-03-22 18:32 ` [PATCH bpf-next v3 02/13] selftests/bpf: bpf tunnel encap test Willem de Bruijn
2019-03-22 18:32 ` [PATCH bpf-next v3 03/13] selftests/bpf: expand bpf tunnel test with decap Willem de Bruijn
2019-03-22 18:32 ` [PATCH bpf-next v3 04/13] selftests/bpf: expand bpf tunnel test to ipv6 Willem de Bruijn
2019-03-22 18:32 ` [PATCH bpf-next v3 05/13] selftests/bpf: extend bpf tunnel test with gre Willem de Bruijn
2019-03-22 18:32 ` [PATCH bpf-next v3 06/13] selftests/bpf: extend bpf tunnel test with tso Willem de Bruijn
2019-03-22 18:32 ` [PATCH bpf-next v3 07/13] bpf: add bpf_skb_adjust_room mode BPF_ADJ_ROOM_MAC Willem de Bruijn
2019-03-22 18:32 ` [PATCH bpf-next v3 08/13] bpf: add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_FIXED_GSO Willem de Bruijn
2019-03-22 18:32 ` [PATCH bpf-next v3 09/13] bpf: add bpf_skb_adjust_room encap flags Willem de Bruijn
2019-03-22 18:32 ` [PATCH bpf-next v3 10/13] bpf: Sync bpf.h to tools Willem de Bruijn
2019-03-22 18:32 ` [PATCH bpf-next v3 11/13] selftests/bpf: convert bpf tunnel test to BPF_ADJ_ROOM_MAC Willem de Bruijn
2019-03-22 18:32 ` [PATCH bpf-next v3 12/13] selftests/bpf: convert bpf tunnel test to BPF_F_ADJ_ROOM_FIXED_GSO Willem de Bruijn
2019-03-22 18:33 ` [PATCH bpf-next v3 13/13] selftests/bpf: convert bpf tunnel test to encap modes Willem de Bruijn
2019-03-22 21:01 ` [PATCH bpf-next v3 00/13] bpf tc tunneling Alexei Starovoitov
2019-03-23 10:53 ` Daniel Borkmann
2019-03-23 16:03   ` Willem de Bruijn

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.