* [PATCH bpf-next 00/13] bpf tc tunneling
@ 2019-03-20 14:49 Willem de Bruijn
  2019-03-20 14:49 ` [PATCH bpf-next 01/13] bpf: in bpf_skb_adjust_room avoid copy in tx fast path Willem de Bruijn
                   ` (12 more replies)
  0 siblings, 13 replies; 21+ messages in thread
From: Willem de Bruijn @ 2019-03-20 14:49 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

BPF allows for dynamic tunneling, choosing the tunnel destination and
features on demand. Extend bpf_skb_adjust_room to allow for efficient
tunneling at the TC hooks.

Patch 1
  is a performance optimization, avoiding an unnecessary unclone
  for the TCP hot path.

Patches 2..6
  introduce a regression test. These can be squashed, but the code is
  arguably more readable when gradually expanding the feature set.

Patch 7
  is a performance optimization, avoiding the copy of network headers
  that are going to be overwritten. This also simplifies the bpf
  program.

Patch 8
  reenables bpf_skb_adjust_room for UDP packets.

Patch 9
  adds support for gso packets, which require additional metadata to be
  set in the skb. It does this through new flags to bpf_skb_adjust_room
  (a brief example is sketched below the patch list). Other alternatives
  considered:
  - individual bpf_{ipip, gre, udp, ..}_encap functions that combine
    adjust room and bpf_skb_store_bytes.
  - new bpf_encap_fixup function called after bpf_skb_adjust_room and
    bpf_skb_store_bytes that parses the tunnel and sets the metadata.

Patches 10..13
  expand the regression test to make use of the new features and
  enable the GSO testcases.

  These could be interleaved with each of the new features, were it
  not for the separate sync bpf.h patch.
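
As an illustration of the combined interface (a sketch only, reusing
names from the selftest program; the exact flag combination the
selftest settles on is in the final patches), a gre-over-ipv4 encap
roughly becomes:

	__u64 flags = BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 |
		      BPF_F_ADJ_ROOM_ENCAP_L4_GRE;

	/* open room for the outer headers right behind the mac header */
	if (bpf_skb_adjust_room(skb, sizeof(h_outer), BPF_ADJ_ROOM_MAC, flags))
		return TC_ACT_SHOT;

	/* write only the new outer headers; the inner header stays in place */
	if (bpf_skb_store_bytes(skb, ETH_HLEN, &h_outer, sizeof(h_outer),
				BPF_F_INVALIDATE_HASH) < 0)
		return TC_ACT_SHOT;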

Willem de Bruijn (13):
  bpf: in bpf_skb_adjust_room avoid copy in tx fast path
  selftests/bpf: bpf tunnel encap test
  selftests/bpf: expand bpf tunnel test with decap
  selftests/bpf: expand bpf tunnel test to ipv6
  selftests/bpf: extend bpf tunnel test with gre
  selftests/bpf: extend bpf tunnel test with tso
  bpf: add bpf_skb_adjust_room mode BPF_ADJ_ROOM_MAC
  bpf: add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_FIXED_GSO
  bpf: add bpf_skb_adjust_room encap flags
  bpf: Sync bpf.h to tools
  selftests/bpf: convert bpf tunnel test to BPF_ADJ_ROOM_MAC
  selftests/bpf: convert bpf tunnel test to BPF_F_ADJ_ROOM_FIXED_GSO
  selftests/bpf: convert bpf tunnel test to encap modes

 include/uapi/linux/bpf.h                      |  22 +-
 net/core/filter.c                             | 124 +++++++--
 tools/include/uapi/linux/bpf.h                |  22 +-
 tools/testing/selftests/bpf/Makefile          |   3 +-
 tools/testing/selftests/bpf/config            |   2 +
 .../selftests/bpf/progs/test_tc_tunnel.c      | 261 ++++++++++++++++++
 tools/testing/selftests/bpf/test_tc_tunnel.sh | 178 ++++++++++++
 7 files changed, 580 insertions(+), 32 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/test_tc_tunnel.c
 create mode 100755 tools/testing/selftests/bpf/test_tc_tunnel.sh

-- 
2.21.0.225.g810b269d1ac-goog



* [PATCH bpf-next 01/13] bpf: in bpf_skb_adjust_room avoid copy in tx fast path
  2019-03-20 14:49 [PATCH bpf-next 00/13] bpf tc tunneling Willem de Bruijn
@ 2019-03-20 14:49 ` Willem de Bruijn
  2019-03-20 14:49 ` [PATCH bpf-next 02/13] selftests/bpf: bpf tunnel encap test Willem de Bruijn
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Willem de Bruijn @ 2019-03-20 14:49 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

bpf_skb_adjust_room calls skb_cow on grow.

This expensive operation can be avoided in the fast path when the only
other clone has released the header. This is the common case for TCP,
where one headerless clone is kept on the retransmit queue.

It is safe to do so even when touching the gso fields in skb_shinfo.
Regular tunnel encap with iptunnel_handle_offloads takes the same
optimization.

The TCP stack unclones in the unlikely case that it accesses these
fields through headerless clone packets on the retransmit queue (see
__tcp_retransmit_skb).

If any other clones are present, e.g., from packet sockets,
skb_cow_head returns the same value as skb_cow().
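
For reference, the two helpers differ only in the clone check passed to
__skb_cow (include/linux/skbuff.h; quoted here for context, not part of
this patch):

	static inline int skb_cow(struct sk_buff *skb, unsigned int headroom)
	{
		return __skb_cow(skb, headroom, skb_cloned(skb));
	}

	static inline int skb_cow_head(struct sk_buff *skb, unsigned int headroom)
	{
		return __skb_cow(skb, headroom, skb_header_cloned(skb));
	}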

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 net/core/filter.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index 647c63a7b25b6..8e15fb919b574 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2971,7 +2971,7 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 len_diff)
 	if (skb_is_gso(skb) && !skb_is_gso_tcp(skb))
 		return -ENOTSUPP;
 
-	ret = skb_cow(skb, len_diff);
+	ret = skb_cow_head(skb, len_diff);
 	if (unlikely(ret < 0))
 		return ret;
 
-- 
2.21.0.225.g810b269d1ac-goog



* [PATCH bpf-next 02/13] selftests/bpf: bpf tunnel encap test
  2019-03-20 14:49 [PATCH bpf-next 00/13] bpf tc tunneling Willem de Bruijn
  2019-03-20 14:49 ` [PATCH bpf-next 01/13] bpf: in bpf_skb_adjust_room avoid copy in tx fast path Willem de Bruijn
@ 2019-03-20 14:49 ` Willem de Bruijn
  2019-03-20 14:49 ` [PATCH bpf-next 03/13] selftests/bpf: expand bpf tunnel test with decap Willem de Bruijn
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Willem de Bruijn @ 2019-03-20 14:49 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

Validate basic tunnel encapsulation using ipip.

Set up two namespaces connected by veth. Connect a client and server.
Do this with and without bpf encap.
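
The test is intended to be run from the selftests directory, assuming
clang is available to build the bpf object:

	cd tools/testing/selftests/bpf
	make
	./test_tc_tunnel.sh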

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 tools/testing/selftests/bpf/Makefile          |  3 +-
 .../selftests/bpf/progs/test_tc_tunnel.c      | 83 +++++++++++++++++++
 tools/testing/selftests/bpf/test_tc_tunnel.sh | 75 +++++++++++++++++
 3 files changed, 160 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/progs/test_tc_tunnel.c
 create mode 100755 tools/testing/selftests/bpf/test_tc_tunnel.sh

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 2aed37ea61a4c..40992d12b5c8a 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -51,7 +51,8 @@ TEST_PROGS := test_kmod.sh \
 	test_skb_cgroup_id.sh \
 	test_flow_dissector.sh \
 	test_xdp_vlan.sh \
-	test_lwt_ip_encap.sh
+	test_lwt_ip_encap.sh \
+	test_tc_tunnel.sh
 
 TEST_PROGS_EXTENDED := with_addr.sh \
 	with_tunnels.sh \
diff --git a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
new file mode 100644
index 0000000000000..8223e4347be8f
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
@@ -0,0 +1,83 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* In-place tunneling */
+
+#include <linux/stddef.h>
+#include <linux/bpf.h>
+#include <linux/if_ether.h>
+#include <linux/in.h>
+#include <linux/ip.h>
+#include <linux/tcp.h>
+#include <linux/pkt_cls.h>
+#include <linux/types.h>
+
+#include "bpf_endian.h"
+#include "bpf_helpers.h"
+
+static const int cfg_port = 8000;
+
+static __always_inline void set_ipv4_csum(struct iphdr *iph)
+{
+	__u16 *iph16 = (__u16 *)iph;
+	__u32 csum;
+	int i;
+
+	iph->check = 0;
+
+#pragma clang loop unroll(full)
+	for (i = 0, csum = 0; i < sizeof(*iph) >> 1; i++)
+		csum += *iph16++;
+
+	iph->check = ~((csum & 0xffff) + (csum >> 16));
+}
+
+SEC("encap")
+int encap_f(struct __sk_buff *skb)
+{
+	struct iphdr iph_outer, iph_inner;
+	struct tcphdr tcph;
+
+	if (skb->protocol != __bpf_constant_htons(ETH_P_IP))
+		return TC_ACT_OK;
+
+	if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_inner,
+			       sizeof(iph_inner)) < 0)
+		return TC_ACT_OK;
+
+	/* filter only packets we want */
+	if (iph_inner.ihl != 5 || iph_inner.protocol != IPPROTO_TCP)
+		return TC_ACT_OK;
+
+	if (bpf_skb_load_bytes(skb, ETH_HLEN + sizeof(iph_inner),
+			       &tcph, sizeof(tcph)) < 0)
+		return TC_ACT_OK;
+
+	if (tcph.dest != __bpf_constant_htons(cfg_port))
+		return TC_ACT_OK;
+
+	/* add room between mac and network header */
+	if (bpf_skb_adjust_room(skb, sizeof(iph_outer), BPF_ADJ_ROOM_NET, 0))
+		return TC_ACT_SHOT;
+
+	/* prepare new outer network header */
+	iph_outer = iph_inner;
+	iph_outer.protocol = IPPROTO_IPIP;
+	iph_outer.tot_len = bpf_htons(sizeof(iph_outer) +
+				      bpf_htons(iph_outer.tot_len));
+	set_ipv4_csum(&iph_outer);
+
+	/* store new outer network header */
+	if (bpf_skb_store_bytes(skb, ETH_HLEN, &iph_outer, sizeof(iph_outer),
+				BPF_F_INVALIDATE_HASH) < 0)
+		return TC_ACT_SHOT;
+
+	/* bpf_skb_adjust_room has moved header to start of room: restore */
+	if (bpf_skb_store_bytes(skb, ETH_HLEN + sizeof(iph_outer),
+				&iph_inner, sizeof(iph_inner),
+				BPF_F_INVALIDATE_HASH) < 0)
+		return TC_ACT_SHOT;
+
+	return TC_ACT_OK;
+}
+
+char __license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
new file mode 100755
index 0000000000000..6ebb288a3afc7
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -0,0 +1,75 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# In-place tunneling
+
+# must match the port that the bpf program filters on
+readonly port=8000
+
+readonly ns_prefix="ns-$$-"
+readonly ns1="${ns_prefix}1"
+readonly ns2="${ns_prefix}2"
+
+readonly ns1_v4=192.168.1.1
+readonly ns2_v4=192.168.1.2
+
+setup() {
+	ip netns add "${ns1}"
+	ip netns add "${ns2}"
+
+	ip link add dev veth1 mtu 1500 netns "${ns1}" type veth \
+	      peer name veth2 mtu 1500 netns "${ns2}"
+
+	ip -netns "${ns1}" link set veth1 up
+	ip -netns "${ns2}" link set veth2 up
+
+	ip -netns "${ns1}" -4 addr add "${ns1_v4}/24" dev veth1
+	ip -netns "${ns2}" -4 addr add "${ns2_v4}/24" dev veth2
+
+	sleep 1
+}
+
+cleanup() {
+	ip netns del "${ns2}"
+	ip netns del "${ns1}"
+}
+
+server_listen() {
+	ip netns exec "${ns2}" nc -l -p "${port}" &
+	sleep 0.2
+}
+
+client_connect() {
+	ip netns exec "${ns1}" nc -z -w 1 "${ns2_v4}" "${port}"
+	echo $?
+}
+
+set -e
+trap cleanup EXIT
+
+setup
+
+# basic communication works
+echo "test basic connectivity"
+server_listen
+client_connect
+
+# clientside, insert bpf program to encap all TCP to port ${port}
+# client can no longer connect
+ip netns exec "${ns1}" tc qdisc add dev veth1 clsact
+ip netns exec "${ns1}" tc filter add dev veth1 egress \
+	bpf direct-action object-file ./test_tc_tunnel.o section encap
+echo "test bpf encap without decap (expect failure)"
+server_listen
+! client_connect
+
+# serverside, insert decap module
+# server is still running
+# client can connect again
+ip netns exec "${ns2}" ip link add dev testtun0 type ipip \
+	remote "${ns1_v4}" local "${ns2_v4}"
+ip netns exec "${ns2}" ip link set dev testtun0 up
+echo "test bpf encap with tunnel device decap"
+client_connect
+
+echo OK
-- 
2.21.0.225.g810b269d1ac-goog



* [PATCH bpf-next 03/13] selftests/bpf: expand bpf tunnel test with decap
  2019-03-20 14:49 [PATCH bpf-next 00/13] bpf tc tunneling Willem de Bruijn
  2019-03-20 14:49 ` [PATCH bpf-next 01/13] bpf: in bpf_skb_adjust_room avoid copy in tx fast path Willem de Bruijn
  2019-03-20 14:49 ` [PATCH bpf-next 02/13] selftests/bpf: bpf tunnel encap test Willem de Bruijn
@ 2019-03-20 14:49 ` Willem de Bruijn
  2019-03-20 14:49 ` [PATCH bpf-next 04/13] selftests/bpf: expand bpf tunnel test to ipv6 Willem de Bruijn
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Willem de Bruijn @ 2019-03-20 14:49 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

The bpf tunnel test encapsulates using bpf, then decapsulates using
a standard tunnel device to verify correctness.

Once encap is verified, also test decap, by replacing the tunnel
device on decap with another bpf program.

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 .../selftests/bpf/progs/test_tc_tunnel.c      | 31 +++++++++++++++++++
 tools/testing/selftests/bpf/test_tc_tunnel.sh |  9 ++++++
 2 files changed, 40 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
index 8223e4347be8f..25db148635ab7 100644
--- a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
+++ b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
@@ -80,4 +80,35 @@ int encap_f(struct __sk_buff *skb)
 	return TC_ACT_OK;
 }
 
+SEC("decap")
+int decap_f(struct __sk_buff *skb)
+{
+	struct iphdr iph_outer, iph_inner;
+
+	if (skb->protocol != __bpf_constant_htons(ETH_P_IP))
+		return TC_ACT_OK;
+
+	if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_outer,
+			       sizeof(iph_outer)) < 0)
+		return TC_ACT_OK;
+
+	if (iph_outer.ihl != 5 || iph_outer.protocol != IPPROTO_IPIP)
+		return TC_ACT_OK;
+
+	if (bpf_skb_load_bytes(skb, ETH_HLEN + sizeof(iph_outer),
+			       &iph_inner, sizeof(iph_inner)) < 0)
+		return TC_ACT_OK;
+
+	if (bpf_skb_adjust_room(skb, -(int)sizeof(iph_outer),
+				BPF_ADJ_ROOM_NET, 0))
+		return TC_ACT_SHOT;
+
+	/* bpf_skb_adjust_room has moved outer over inner header: restore */
+	if (bpf_skb_store_bytes(skb, ETH_HLEN, &iph_inner, sizeof(iph_inner),
+				BPF_F_INVALIDATE_HASH) < 0)
+		return TC_ACT_SHOT;
+
+	return TC_ACT_OK;
+}
+
 char __license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
index 6ebb288a3afc7..91151d91e5a1b 100755
--- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -72,4 +72,13 @@ ip netns exec "${ns2}" ip link set dev testtun0 up
 echo "test bpf encap with tunnel device decap"
 client_connect
 
+# serverside, use BPF for decap
+ip netns exec "${ns2}" ip link del dev testtun0
+ip netns exec "${ns2}" tc qdisc add dev veth2 clsact
+ip netns exec "${ns2}" tc filter add dev veth2 ingress \
+	bpf direct-action object-file ./test_tc_tunnel.o section decap
+server_listen
+echo "test bpf encap with bpf decap"
+client_connect
+
 echo OK
-- 
2.21.0.225.g810b269d1ac-goog



* [PATCH bpf-next 04/13] selftests/bpf: expand bpf tunnel test to ipv6
  2019-03-20 14:49 [PATCH bpf-next 00/13] bpf tc tunneling Willem de Bruijn
                   ` (2 preceding siblings ...)
  2019-03-20 14:49 ` [PATCH bpf-next 03/13] selftests/bpf: expand bpf tunnel test with decap Willem de Bruijn
@ 2019-03-20 14:49 ` Willem de Bruijn
  2019-03-20 14:49 ` [PATCH bpf-next 05/13] selftests/bpf: extend bpf tunnel test with gre Willem de Bruijn
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Willem de Bruijn @ 2019-03-20 14:49 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

The test only covers ipv4 so far; expand it to ipv6.
This is mostly boilerplate: a near copy of the ipv4 path.

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 tools/testing/selftests/bpf/config            |   2 +
 .../selftests/bpf/progs/test_tc_tunnel.c      | 116 +++++++++++++++---
 tools/testing/selftests/bpf/test_tc_tunnel.sh |  53 +++++++-
 3 files changed, 149 insertions(+), 22 deletions(-)

diff --git a/tools/testing/selftests/bpf/config b/tools/testing/selftests/bpf/config
index 37f947ec44ed9..a42f4fc4dc11f 100644
--- a/tools/testing/selftests/bpf/config
+++ b/tools/testing/selftests/bpf/config
@@ -23,3 +23,5 @@ CONFIG_LWTUNNEL=y
 CONFIG_BPF_STREAM_PARSER=y
 CONFIG_XDP_SOCKETS=y
 CONFIG_FTRACE_SYSCALLS=y
+CONFIG_IPV6_TUNNEL=y
+CONFIG_IPV6_GRE=y
diff --git a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
index 25db148635ab7..591f540ce513d 100644
--- a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
+++ b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
@@ -7,6 +7,7 @@
 #include <linux/if_ether.h>
 #include <linux/in.h>
 #include <linux/ip.h>
+#include <linux/ipv6.h>
 #include <linux/tcp.h>
 #include <linux/pkt_cls.h>
 #include <linux/types.h>
@@ -31,15 +32,11 @@ static __always_inline void set_ipv4_csum(struct iphdr *iph)
 	iph->check = ~((csum & 0xffff) + (csum >> 16));
 }
 
-SEC("encap")
-int encap_f(struct __sk_buff *skb)
+static int encap_ipv4(struct __sk_buff *skb)
 {
 	struct iphdr iph_outer, iph_inner;
 	struct tcphdr tcph;
 
-	if (skb->protocol != __bpf_constant_htons(ETH_P_IP))
-		return TC_ACT_OK;
-
 	if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_inner,
 			       sizeof(iph_inner)) < 0)
 		return TC_ACT_OK;
@@ -80,35 +77,118 @@ int encap_f(struct __sk_buff *skb)
 	return TC_ACT_OK;
 }
 
-SEC("decap")
-int decap_f(struct __sk_buff *skb)
+static int encap_ipv6(struct __sk_buff *skb)
 {
-	struct iphdr iph_outer, iph_inner;
+	struct ipv6hdr iph_outer, iph_inner;
+	struct tcphdr tcph;
 
-	if (skb->protocol != __bpf_constant_htons(ETH_P_IP))
+	if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_inner,
+			       sizeof(iph_inner)) < 0)
 		return TC_ACT_OK;
 
-	if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_outer,
-			       sizeof(iph_outer)) < 0)
+	/* filter only packets we want */
+	if (bpf_skb_load_bytes(skb, ETH_HLEN + sizeof(iph_inner),
+			       &tcph, sizeof(tcph)) < 0)
 		return TC_ACT_OK;
 
-	if (iph_outer.ihl != 5 || iph_outer.protocol != IPPROTO_IPIP)
+	if (tcph.dest != __bpf_constant_htons(cfg_port))
+		return TC_ACT_OK;
+
+	/* add room between mac and network header */
+	if (bpf_skb_adjust_room(skb, sizeof(iph_outer), BPF_ADJ_ROOM_NET, 0))
+		return TC_ACT_SHOT;
+
+	/* prepare new outer network header */
+	iph_outer = iph_inner;
+	iph_outer.nexthdr = IPPROTO_IPV6;
+	iph_outer.payload_len = bpf_htons(sizeof(iph_outer) +
+					  bpf_ntohs(iph_outer.payload_len));
+
+	/* store new outer network header */
+	if (bpf_skb_store_bytes(skb, ETH_HLEN, &iph_outer, sizeof(iph_outer),
+				BPF_F_INVALIDATE_HASH) < 0)
+		return TC_ACT_SHOT;
+
+	/* bpf_skb_adjust_room has moved header to start of room: restore */
+	if (bpf_skb_store_bytes(skb, ETH_HLEN + sizeof(iph_outer),
+				&iph_inner, sizeof(iph_inner),
+				BPF_F_INVALIDATE_HASH) < 0)
+		return TC_ACT_SHOT;
+
+	return TC_ACT_OK;
+}
+
+SEC("encap")
+int encap_f(struct __sk_buff *skb)
+{
+	switch (skb->protocol) {
+	case __bpf_constant_htons(ETH_P_IP):
+		return encap_ipv4(skb);
+	case __bpf_constant_htons(ETH_P_IPV6):
+		return encap_ipv6(skb);
+	default:
+		/* does not match, ignore */
 		return TC_ACT_OK;
+	}
+}
 
-	if (bpf_skb_load_bytes(skb, ETH_HLEN + sizeof(iph_outer),
-			       &iph_inner, sizeof(iph_inner)) < 0)
+static int decap_internal(struct __sk_buff *skb, int off, int len)
+{
+	char buf[sizeof(struct ipv6hdr)];
+
+	if (bpf_skb_load_bytes(skb, off + len, &buf, len) < 0)
 		return TC_ACT_OK;
 
-	if (bpf_skb_adjust_room(skb, -(int)sizeof(iph_outer),
-				BPF_ADJ_ROOM_NET, 0))
+	if (bpf_skb_adjust_room(skb, -len, BPF_ADJ_ROOM_NET, 0))
 		return TC_ACT_SHOT;
 
 	/* bpf_skb_adjust_room has moved outer over inner header: restore */
-	if (bpf_skb_store_bytes(skb, ETH_HLEN, &iph_inner, sizeof(iph_inner),
-				BPF_F_INVALIDATE_HASH) < 0)
+	if (bpf_skb_store_bytes(skb, off, buf, len, BPF_F_INVALIDATE_HASH) < 0)
 		return TC_ACT_SHOT;
 
 	return TC_ACT_OK;
 }
 
+static int decap_ipv4(struct __sk_buff *skb)
+{
+	struct iphdr iph_outer;
+
+	if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_outer,
+			       sizeof(iph_outer)) < 0)
+		return TC_ACT_OK;
+
+	if (iph_outer.ihl != 5 || iph_outer.protocol != IPPROTO_IPIP)
+		return TC_ACT_OK;
+
+	return decap_internal(skb, ETH_HLEN, sizeof(iph_outer));
+}
+
+static int decap_ipv6(struct __sk_buff *skb)
+{
+	struct ipv6hdr iph_outer;
+
+	if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_outer,
+			       sizeof(iph_outer)) < 0)
+		return TC_ACT_OK;
+
+	if (iph_outer.nexthdr != IPPROTO_IPV6)
+		return TC_ACT_OK;
+
+	return decap_internal(skb, ETH_HLEN, sizeof(iph_outer));
+}
+
+SEC("decap")
+int decap_f(struct __sk_buff *skb)
+{
+	switch (skb->protocol) {
+	case __bpf_constant_htons(ETH_P_IP):
+		return decap_ipv4(skb);
+	case __bpf_constant_htons(ETH_P_IPV6):
+		return decap_ipv6(skb);
+	default:
+		/* does not match, ignore */
+		return TC_ACT_OK;
+	}
+}
+
 char __license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
index 91151d91e5a1b..7b1758f3006b0 100755
--- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -12,6 +12,9 @@ readonly ns2="${ns_prefix}2"
 
 readonly ns1_v4=192.168.1.1
 readonly ns2_v4=192.168.1.2
+readonly ns1_v6=fd::1
+readonly ns2_v6=fd::2
+
 
 setup() {
 	ip netns add "${ns1}"
@@ -25,6 +28,8 @@ setup() {
 
 	ip -netns "${ns1}" -4 addr add "${ns1_v4}/24" dev veth1
 	ip -netns "${ns2}" -4 addr add "${ns2_v4}/24" dev veth2
+	ip -netns "${ns1}" -6 addr add "${ns1_v6}/64" dev veth1 nodad
+	ip -netns "${ns2}" -6 addr add "${ns2_v6}/64" dev veth2 nodad
 
 	sleep 1
 }
@@ -35,16 +40,56 @@ cleanup() {
 }
 
 server_listen() {
-	ip netns exec "${ns2}" nc -l -p "${port}" &
+	ip netns exec "${ns2}" nc "${netcat_opt}" -l -p "${port}" &
 	sleep 0.2
 }
 
 client_connect() {
-	ip netns exec "${ns1}" nc -z -w 1 "${ns2_v4}" "${port}"
+	ip netns exec "${ns1}" nc "${netcat_opt}" -z -w 1 "${addr2}" "${port}"
 	echo $?
 }
 
 set -e
+
+# no arguments: automated test, run all
+if [[ "$#" -eq "0" ]]; then
+	echo "ipip"
+	$0 ipv4
+
+	echo "ip6ip6"
+	$0 ipv6
+
+	echo "OK. All tests passed"
+	exit 0
+fi
+
+if [[ "$#" -ne "1" ]]; then
+	echo "Usage: $0"
+	echo "   or: $0 <ipv4|ipv6>"
+	exit 1
+fi
+
+case "$1" in
+"ipv4")
+	readonly tuntype=ipip
+	readonly addr1="${ns1_v4}"
+	readonly addr2="${ns2_v4}"
+	readonly netcat_opt=-4
+	;;
+"ipv6")
+	readonly tuntype=ip6tnl
+	readonly addr1="${ns1_v6}"
+	readonly addr2="${ns2_v6}"
+	readonly netcat_opt=-6
+	;;
+*)
+	echo "unknown arg: $1"
+	exit 1
+	;;
+esac
+
+echo "encap ${addr1} to ${addr2}, type ${tuntype}"
+
 trap cleanup EXIT
 
 setup
@@ -66,8 +111,8 @@ server_listen
 # serverside, insert decap module
 # server is still running
 # client can connect again
-ip netns exec "${ns2}" ip link add dev testtun0 type ipip \
-	remote "${ns1_v4}" local "${ns2_v4}"
+ip netns exec "${ns2}" ip link add dev testtun0 type "${tuntype}" \
+	remote "${addr1}" local "${addr2}"
 ip netns exec "${ns2}" ip link set dev testtun0 up
 echo "test bpf encap with tunnel device decap"
 client_connect
-- 
2.21.0.225.g810b269d1ac-goog



* [PATCH bpf-next 05/13] selftests/bpf: extend bpf tunnel test with gre
  2019-03-20 14:49 [PATCH bpf-next 00/13] bpf tc tunneling Willem de Bruijn
                   ` (3 preceding siblings ...)
  2019-03-20 14:49 ` [PATCH bpf-next 04/13] selftests/bpf: expand bpf tunnel test to ipv6 Willem de Bruijn
@ 2019-03-20 14:49 ` Willem de Bruijn
  2019-03-20 14:49 ` [PATCH bpf-next 06/13] selftests/bpf: extend bpf tunnel test with tso Willem de Bruijn
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Willem de Bruijn @ 2019-03-20 14:49 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

GRE is a commonly used protocol. Add GRE cases for both IPv4 and IPv6.

This also inserts headers of a different size than the earlier ipip
case, which can expose some unexpected edge cases.

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 .../selftests/bpf/progs/test_tc_tunnel.c      | 148 +++++++++++++-----
 tools/testing/selftests/bpf/test_tc_tunnel.sh |  21 ++-
 2 files changed, 123 insertions(+), 46 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
index 591f540ce513d..900c5653105fe 100644
--- a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
+++ b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
@@ -2,6 +2,9 @@
 
 /* In-place tunneling */
 
+#include <stdbool.h>
+#include <string.h>
+
 #include <linux/stddef.h>
 #include <linux/bpf.h>
 #include <linux/if_ether.h>
@@ -17,6 +20,18 @@
 
 static const int cfg_port = 8000;
 
+struct grev4hdr {
+	struct iphdr ip;
+	__be16 flags;
+	__be16 protocol;
+} __attribute__((packed));
+
+struct grev6hdr {
+	struct ipv6hdr ip;
+	__be16 flags;
+	__be16 protocol;
+} __attribute__((packed));
+
 static __always_inline void set_ipv4_csum(struct iphdr *iph)
 {
 	__u16 *iph16 = (__u16 *)iph;
@@ -32,10 +47,12 @@ static __always_inline void set_ipv4_csum(struct iphdr *iph)
 	iph->check = ~((csum & 0xffff) + (csum >> 16));
 }
 
-static int encap_ipv4(struct __sk_buff *skb)
+static __always_inline int encap_ipv4(struct __sk_buff *skb, bool with_gre)
 {
-	struct iphdr iph_outer, iph_inner;
+	struct grev4hdr h_outer;
+	struct iphdr iph_inner;
 	struct tcphdr tcph;
+	int olen;
 
 	if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_inner,
 			       sizeof(iph_inner)) < 0)
@@ -52,24 +69,33 @@ static int encap_ipv4(struct __sk_buff *skb)
 	if (tcph.dest != __bpf_constant_htons(cfg_port))
 		return TC_ACT_OK;
 
+	olen = with_gre ? sizeof(h_outer) : sizeof(h_outer.ip);
+
 	/* add room between mac and network header */
-	if (bpf_skb_adjust_room(skb, sizeof(iph_outer), BPF_ADJ_ROOM_NET, 0))
+	if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_NET, 0))
 		return TC_ACT_SHOT;
 
 	/* prepare new outer network header */
-	iph_outer = iph_inner;
-	iph_outer.protocol = IPPROTO_IPIP;
-	iph_outer.tot_len = bpf_htons(sizeof(iph_outer) +
-				      bpf_htons(iph_outer.tot_len));
-	set_ipv4_csum(&iph_outer);
+	h_outer.ip = iph_inner;
+	h_outer.ip.tot_len = bpf_htons(olen +
+				      bpf_htons(h_outer.ip.tot_len));
+	if (with_gre) {
+		h_outer.ip.protocol = IPPROTO_GRE;
+		h_outer.protocol = bpf_htons(ETH_P_IP);
+		h_outer.flags = 0;
+	} else {
+		h_outer.ip.protocol = IPPROTO_IPIP;
+	}
+
+	set_ipv4_csum((void *)&h_outer.ip);
 
 	/* store new outer network header */
-	if (bpf_skb_store_bytes(skb, ETH_HLEN, &iph_outer, sizeof(iph_outer),
+	if (bpf_skb_store_bytes(skb, ETH_HLEN, &h_outer, olen,
 				BPF_F_INVALIDATE_HASH) < 0)
 		return TC_ACT_SHOT;
 
 	/* bpf_skb_adjust_room has moved header to start of room: restore */
-	if (bpf_skb_store_bytes(skb, ETH_HLEN + sizeof(iph_outer),
+	if (bpf_skb_store_bytes(skb, ETH_HLEN + olen,
 				&iph_inner, sizeof(iph_inner),
 				BPF_F_INVALIDATE_HASH) < 0)
 		return TC_ACT_SHOT;
@@ -77,10 +103,12 @@ static int encap_ipv4(struct __sk_buff *skb)
 	return TC_ACT_OK;
 }
 
-static int encap_ipv6(struct __sk_buff *skb)
+static __always_inline int encap_ipv6(struct __sk_buff *skb, bool with_gre)
 {
-	struct ipv6hdr iph_outer, iph_inner;
+	struct ipv6hdr iph_inner;
+	struct grev6hdr h_outer;
 	struct tcphdr tcph;
+	int olen;
 
 	if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_inner,
 			       sizeof(iph_inner)) < 0)
@@ -94,23 +122,31 @@ static int encap_ipv6(struct __sk_buff *skb)
 	if (tcph.dest != __bpf_constant_htons(cfg_port))
 		return TC_ACT_OK;
 
+	olen = with_gre ? sizeof(h_outer) : sizeof(h_outer.ip);
+
 	/* add room between mac and network header */
-	if (bpf_skb_adjust_room(skb, sizeof(iph_outer), BPF_ADJ_ROOM_NET, 0))
+	if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_NET, 0))
 		return TC_ACT_SHOT;
 
 	/* prepare new outer network header */
-	iph_outer = iph_inner;
-	iph_outer.nexthdr = IPPROTO_IPV6;
-	iph_outer.payload_len = bpf_htons(sizeof(iph_outer) +
-					  bpf_ntohs(iph_outer.payload_len));
+	h_outer.ip = iph_inner;
+	h_outer.ip.payload_len = bpf_htons(olen +
+					   bpf_ntohs(h_outer.ip.payload_len));
+	if (with_gre) {
+		h_outer.ip.nexthdr = IPPROTO_GRE;
+		h_outer.protocol = bpf_htons(ETH_P_IPV6);
+		h_outer.flags = 0;
+	} else {
+		h_outer.ip.nexthdr = IPPROTO_IPV6;
+	}
 
 	/* store new outer network header */
-	if (bpf_skb_store_bytes(skb, ETH_HLEN, &iph_outer, sizeof(iph_outer),
+	if (bpf_skb_store_bytes(skb, ETH_HLEN, &h_outer, olen,
 				BPF_F_INVALIDATE_HASH) < 0)
 		return TC_ACT_SHOT;
 
 	/* bpf_skb_adjust_room has moved header to start of room: restore */
-	if (bpf_skb_store_bytes(skb, ETH_HLEN + sizeof(iph_outer),
+	if (bpf_skb_store_bytes(skb, ETH_HLEN + olen,
 				&iph_inner, sizeof(iph_inner),
 				BPF_F_INVALIDATE_HASH) < 0)
 		return TC_ACT_SHOT;
@@ -118,28 +154,63 @@ static int encap_ipv6(struct __sk_buff *skb)
 	return TC_ACT_OK;
 }
 
-SEC("encap")
-int encap_f(struct __sk_buff *skb)
+SEC("encap_ipip")
+int __encap_ipip(struct __sk_buff *skb)
 {
-	switch (skb->protocol) {
-	case __bpf_constant_htons(ETH_P_IP):
-		return encap_ipv4(skb);
-	case __bpf_constant_htons(ETH_P_IPV6):
-		return encap_ipv6(skb);
-	default:
-		/* does not match, ignore */
+	if (skb->protocol == __bpf_constant_htons(ETH_P_IP))
+		return encap_ipv4(skb, false);
+	else
 		return TC_ACT_OK;
-	}
 }
 
-static int decap_internal(struct __sk_buff *skb, int off, int len)
+SEC("encap_gre")
+int __encap_gre(struct __sk_buff *skb)
 {
-	char buf[sizeof(struct ipv6hdr)];
+	if (skb->protocol == __bpf_constant_htons(ETH_P_IP))
+		return encap_ipv4(skb, true);
+	else
+		return TC_ACT_OK;
+}
 
-	if (bpf_skb_load_bytes(skb, off + len, &buf, len) < 0)
+SEC("encap_ip6tnl")
+int __encap_ip6tnl(struct __sk_buff *skb)
+{
+	if (skb->protocol == __bpf_constant_htons(ETH_P_IPV6))
+		return encap_ipv6(skb, false);
+	else
+		return TC_ACT_OK;
+}
+
+SEC("encap_ip6gre")
+int __encap_ip6gre(struct __sk_buff *skb)
+{
+	if (skb->protocol == __bpf_constant_htons(ETH_P_IPV6))
+		return encap_ipv6(skb, true);
+	else
 		return TC_ACT_OK;
+}
 
-	if (bpf_skb_adjust_room(skb, -len, BPF_ADJ_ROOM_NET, 0))
+static int decap_internal(struct __sk_buff *skb, int off, int len, char proto)
+{
+	char buf[sizeof(struct grev6hdr)];
+	int olen;
+
+	switch (proto) {
+	case IPPROTO_IPIP:
+	case IPPROTO_IPV6:
+		olen = len;
+		break;
+	case IPPROTO_GRE:
+		olen = len + 4 /* gre hdr */;
+		break;
+	default:
+		return TC_ACT_OK;
+	}
+
+	if (bpf_skb_load_bytes(skb, off + olen, &buf, olen) < 0)
+		return TC_ACT_OK;
+
+	if (bpf_skb_adjust_room(skb, -olen, BPF_ADJ_ROOM_NET, 0))
 		return TC_ACT_SHOT;
 
 	/* bpf_skb_adjust_room has moved outer over inner header: restore */
@@ -157,10 +228,11 @@ static int decap_ipv4(struct __sk_buff *skb)
 			       sizeof(iph_outer)) < 0)
 		return TC_ACT_OK;
 
-	if (iph_outer.ihl != 5 || iph_outer.protocol != IPPROTO_IPIP)
+	if (iph_outer.ihl != 5)
 		return TC_ACT_OK;
 
-	return decap_internal(skb, ETH_HLEN, sizeof(iph_outer));
+	return decap_internal(skb, ETH_HLEN, sizeof(iph_outer),
+			      iph_outer.protocol);
 }
 
 static int decap_ipv6(struct __sk_buff *skb)
@@ -171,10 +243,8 @@ static int decap_ipv6(struct __sk_buff *skb)
 			       sizeof(iph_outer)) < 0)
 		return TC_ACT_OK;
 
-	if (iph_outer.nexthdr != IPPROTO_IPV6)
-		return TC_ACT_OK;
-
-	return decap_internal(skb, ETH_HLEN, sizeof(iph_outer));
+	return decap_internal(skb, ETH_HLEN, sizeof(iph_outer),
+			      iph_outer.nexthdr);
 }
 
 SEC("decap")
diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
index 7b1758f3006b0..c78922048610b 100755
--- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -54,30 +54,36 @@ set -e
 # no arguments: automated test, run all
 if [[ "$#" -eq "0" ]]; then
 	echo "ipip"
-	$0 ipv4
+	$0 ipv4 ipip
 
 	echo "ip6ip6"
-	$0 ipv6
+	$0 ipv6 ip6tnl
+
+	echo "ip gre"
+	$0 ipv4 gre
+
+	echo "ip6 gre"
+	$0 ipv6 ip6gre
 
 	echo "OK. All tests passed"
 	exit 0
 fi
 
-if [[ "$#" -ne "1" ]]; then
+if [[ "$#" -ne "2" ]]; then
 	echo "Usage: $0"
-	echo "   or: $0 <ipv4|ipv6>"
+	echo "   or: $0 <ipv4|ipv6> <tuntype>"
 	exit 1
 fi
 
 case "$1" in
 "ipv4")
-	readonly tuntype=ipip
+	readonly tuntype=$2
 	readonly addr1="${ns1_v4}"
 	readonly addr2="${ns2_v4}"
 	readonly netcat_opt=-4
 	;;
 "ipv6")
-	readonly tuntype=ip6tnl
+	readonly tuntype=$2
 	readonly addr1="${ns1_v6}"
 	readonly addr2="${ns2_v6}"
 	readonly netcat_opt=-6
@@ -103,7 +109,8 @@ client_connect
 # client can no longer connect
 ip netns exec "${ns1}" tc qdisc add dev veth1 clsact
 ip netns exec "${ns1}" tc filter add dev veth1 egress \
-	bpf direct-action object-file ./test_tc_tunnel.o section encap
+	bpf direct-action object-file ./test_tc_tunnel.o \
+	section "encap_${tuntype}"
 echo "test bpf encap without decap (expect failure)"
 server_listen
 ! client_connect
-- 
2.21.0.225.g810b269d1ac-goog



* [PATCH bpf-next 06/13] selftests/bpf: extend bpf tunnel test with tso
  2019-03-20 14:49 [PATCH bpf-next 00/13] bpf tc tunneling Willem de Bruijn
                   ` (4 preceding siblings ...)
  2019-03-20 14:49 ` [PATCH bpf-next 05/13] selftests/bpf: extend bpf tunnel test with gre Willem de Bruijn
@ 2019-03-20 14:49 ` Willem de Bruijn
  2019-03-20 14:49 ` [PATCH bpf-next 07/13] bpf: add bpf_skb_adjust_room mode BPF_ADJ_ROOM_MAC Willem de Bruijn
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Willem de Bruijn @ 2019-03-20 14:49 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

Segmentation offload takes a longer path. Verify that the feature
works with large packets.

The test succeeds if not setting dodgy in bpf_skb_adjust_room, as veth
TSO is permissive.

If not setting SKB_GSO_DODGY, this enables tunneled TSO offload on
supporting NICs.

The feature sets SKB_GSO_DODGY because the caller is untrusted. As a
result the packets traverse the gso stack at least up to TCP, and fail
the gso_type validation, such as the skb->encapsulation check
in gre_gso_segment and the gso_type checks introduced in commit
418e897e0716 ("gso: validate gso_type on ipip style tunnel").

This will be addressed in a follow-on feature patch. In the meantime,
disable the new gso tests.
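
Until then, the gso path can still be exercised by hand by passing a
larger payload length, for example:

	./test_tc_tunnel.sh ipv4 gre 2000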

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 tools/testing/selftests/bpf/test_tc_tunnel.sh | 60 +++++++++++++++----
 1 file changed, 49 insertions(+), 11 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
index c78922048610b..5d9d56520c694 100755
--- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -15,6 +15,8 @@ readonly ns2_v4=192.168.1.2
 readonly ns1_v6=fd::1
 readonly ns2_v6=fd::2
 
+readonly infile="$(mktemp)"
+readonly outfile="$(mktemp)"
 
 setup() {
 	ip netns add "${ns1}"
@@ -23,6 +25,8 @@ setup() {
 	ip link add dev veth1 mtu 1500 netns "${ns1}" type veth \
 	      peer name veth2 mtu 1500 netns "${ns2}"
 
+	ip netns exec "${ns1}" ethtool -K veth1 tso off
+
 	ip -netns "${ns1}" link set veth1 up
 	ip -netns "${ns2}" link set veth2 up
 
@@ -32,58 +36,86 @@ setup() {
 	ip -netns "${ns2}" -6 addr add "${ns2_v6}/64" dev veth2 nodad
 
 	sleep 1
+
+	dd if=/dev/urandom of="${infile}" bs="${datalen}" count=1 status=none
 }
 
 cleanup() {
 	ip netns del "${ns2}"
 	ip netns del "${ns1}"
+
+	if [[ -f "${outfile}" ]]; then
+		rm "${outfile}"
+	fi
+	if [[ -f "${infile}" ]]; then
+		rm "${infile}"
+	fi
 }
 
 server_listen() {
-	ip netns exec "${ns2}" nc "${netcat_opt}" -l -p "${port}" &
+	ip netns exec "${ns2}" nc "${netcat_opt}" -l -p "${port}" > "${outfile}" &
+	server_pid=$!
 	sleep 0.2
 }
 
 client_connect() {
-	ip netns exec "${ns1}" nc "${netcat_opt}" -z -w 1 "${addr2}" "${port}"
+	ip netns exec "${ns1}" nc "${netcat_opt}" -q 0 -w 1 "${addr2}" "${port}" < "${infile}"
 	echo $?
 }
 
+verify_data() {
+	wait "${server_pid}"
+	# sha1sum returns two fields [sha1] [filepath]
+	# convert to bash array and access first elem
+	insum=($(sha1sum ${infile}))
+	outsum=($(sha1sum ${outfile}))
+	if [[ "${insum[0]}" != "${outsum[0]}" ]]; then
+		echo "data mismatch"
+		exit 1
+	fi
+}
+
 set -e
 
 # no arguments: automated test, run all
 if [[ "$#" -eq "0" ]]; then
 	echo "ipip"
-	$0 ipv4 ipip
+	$0 ipv4 ipip 100
 
 	echo "ip6ip6"
-	$0 ipv6 ip6tnl
+	$0 ipv6 ip6tnl 100
 
 	echo "ip gre"
-	$0 ipv4 gre
+	$0 ipv4 gre 100
 
 	echo "ip6 gre"
-	$0 ipv6 ip6gre
+	$0 ipv6 ip6gre 100
+
+	# disabled until passes SKB_GSO_DODGY checks
+	# echo "ip gre gso"
+	# $0 ipv4 gre 2000
+
+	# disabled until passes SKB_GSO_DODGY checks
+	# echo "ip6 gre gso"
+	# $0 ipv6 ip6gre 2000
 
 	echo "OK. All tests passed"
 	exit 0
 fi
 
-if [[ "$#" -ne "2" ]]; then
+if [[ "$#" -ne "3" ]]; then
 	echo "Usage: $0"
-	echo "   or: $0 <ipv4|ipv6> <tuntype>"
+	echo "   or: $0 <ipv4|ipv6> <tuntype> <data_len>"
 	exit 1
 fi
 
 case "$1" in
 "ipv4")
-	readonly tuntype=$2
 	readonly addr1="${ns1_v4}"
 	readonly addr2="${ns2_v4}"
 	readonly netcat_opt=-4
 	;;
 "ipv6")
-	readonly tuntype=$2
 	readonly addr1="${ns1_v6}"
 	readonly addr2="${ns2_v6}"
 	readonly netcat_opt=-6
@@ -94,7 +126,10 @@ case "$1" in
 	;;
 esac
 
-echo "encap ${addr1} to ${addr2}, type ${tuntype}"
+readonly tuntype=$2
+readonly datalen=$3
+
+echo "encap ${addr1} to ${addr2}, type ${tuntype}, len ${datalen}"
 
 trap cleanup EXIT
 
@@ -104,6 +139,7 @@ setup
 echo "test basic connectivity"
 server_listen
 client_connect
+verify_data
 
 # clientside, insert bpf program to encap all TCP to port ${port}
 # client can no longer connect
@@ -123,6 +159,7 @@ ip netns exec "${ns2}" ip link add dev testtun0 type "${tuntype}" \
 ip netns exec "${ns2}" ip link set dev testtun0 up
 echo "test bpf encap with tunnel device decap"
 client_connect
+verify_data
 
 # serverside, use BPF for decap
 ip netns exec "${ns2}" ip link del dev testtun0
@@ -132,5 +169,6 @@ ip netns exec "${ns2}" tc filter add dev veth2 ingress \
 server_listen
 echo "test bpf encap with bpf decap"
 client_connect
+verify_data
 
 echo OK
-- 
2.21.0.225.g810b269d1ac-goog



* [PATCH bpf-next 07/13] bpf: add bpf_skb_adjust_room mode BPF_ADJ_ROOM_MAC
  2019-03-20 14:49 [PATCH bpf-next 00/13] bpf tc tunneling Willem de Bruijn
                   ` (5 preceding siblings ...)
  2019-03-20 14:49 ` [PATCH bpf-next 06/13] selftests/bpf: extend bpf tunnel test with tso Willem de Bruijn
@ 2019-03-20 14:49 ` Willem de Bruijn
  2019-03-20 14:49 ` [PATCH bpf-next 08/13] bpf: add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_FIXED_GSO Willem de Bruijn
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Willem de Bruijn @ 2019-03-20 14:49 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

bpf_skb_adjust_room allows inserting room in an skb.

Existing mode BPF_ADJ_ROOM_NET inserts room after the network header
by pulling the skb, moving the network header forward and zeroing the
new space.

Add new mode BPF_ADJ_ROOM_MAC that inserts room after the mac
header. This allows inserting tunnel headers in front of the network
header without having to recreate the network header in the original
space, avoiding two copies.
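
With the new mode a program only writes the new outer header after
making room; it no longer needs to restore the inner header (a sketch,
mirroring the selftest conversion later in this series):

	if (bpf_skb_adjust_room(skb, sizeof(iph_outer), BPF_ADJ_ROOM_MAC, 0))
		return TC_ACT_SHOT;

	/* write only the outer header; the inner header is left in place */
	if (bpf_skb_store_bytes(skb, ETH_HLEN, &iph_outer, sizeof(iph_outer),
				BPF_F_INVALIDATE_HASH) < 0)
		return TC_ACT_SHOT;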

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 include/uapi/linux/bpf.h |  6 +++++-
 net/core/filter.c        | 38 ++++++++++++++++++++------------------
 2 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 929c8e537a14a..4f5c918e6fcf4 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1478,7 +1478,10 @@ union bpf_attr {
  * 		Grow or shrink the room for data in the packet associated to
  * 		*skb* by *len_diff*, and according to the selected *mode*.
  *
- * 		There is a single supported mode at this time:
+ *		There are two supported modes at this time:
+ *
+ *		* **BPF_ADJ_ROOM_MAC**: Adjust room at the mac layer
+ *		  (room space is added or removed below the layer 2 header).
  *
  * 		* **BPF_ADJ_ROOM_NET**: Adjust room at the network layer
  * 		  (room space is added or removed below the layer 3 header).
@@ -2593,6 +2596,7 @@ enum bpf_func_id {
 /* Mode for BPF_FUNC_skb_adjust_room helper. */
 enum bpf_adj_room_mode {
 	BPF_ADJ_ROOM_NET,
+	BPF_ADJ_ROOM_MAC,
 };
 
 /* Mode for BPF_FUNC_skb_load_bytes_relative helper. */
diff --git a/net/core/filter.c b/net/core/filter.c
index 8e15fb919b574..1a0cf578b4502 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2963,9 +2963,8 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
 	}
 }
 
-static int bpf_skb_net_grow(struct sk_buff *skb, u32 len_diff)
+static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff)
 {
-	u32 off = skb_mac_header_len(skb) + bpf_skb_net_base_len(skb);
 	int ret;
 
 	if (skb_is_gso(skb) && !skb_is_gso_tcp(skb))
@@ -2992,9 +2991,8 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 len_diff)
 	return 0;
 }
 
-static int bpf_skb_net_shrink(struct sk_buff *skb, u32 len_diff)
+static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff)
 {
-	u32 off = skb_mac_header_len(skb) + bpf_skb_net_base_len(skb);
 	int ret;
 
 	if (skb_is_gso(skb) && !skb_is_gso_tcp(skb))
@@ -3027,7 +3025,8 @@ static u32 __bpf_skb_max_len(const struct sk_buff *skb)
 			  SKB_MAX_ALLOC;
 }
 
-static int bpf_skb_adjust_net(struct sk_buff *skb, s32 len_diff)
+BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
+	   u32, mode, u64, flags)
 {
 	bool trans_same = skb->transport_header == skb->network_header;
 	u32 len_cur, len_diff_abs = abs(len_diff);
@@ -3035,14 +3034,28 @@ static int bpf_skb_adjust_net(struct sk_buff *skb, s32 len_diff)
 	u32 len_max = __bpf_skb_max_len(skb);
 	__be16 proto = skb->protocol;
 	bool shrink = len_diff < 0;
+	u32 off;
 	int ret;
 
+	if (unlikely(flags))
+		return -EINVAL;
 	if (unlikely(len_diff_abs > 0xfffU))
 		return -EFAULT;
 	if (unlikely(proto != htons(ETH_P_IP) &&
 		     proto != htons(ETH_P_IPV6)))
 		return -ENOTSUPP;
 
+	off = skb_mac_header_len(skb);
+	switch (mode) {
+	case BPF_ADJ_ROOM_NET:
+		off += bpf_skb_net_base_len(skb);
+		break;
+	case BPF_ADJ_ROOM_MAC:
+		break;
+	default:
+		return -ENOTSUPP;
+	}
+
 	len_cur = skb->len - skb_network_offset(skb);
 	if (skb_transport_header_was_set(skb) && !trans_same)
 		len_cur = skb_network_header_len(skb);
@@ -3052,24 +3065,13 @@ static int bpf_skb_adjust_net(struct sk_buff *skb, s32 len_diff)
 			 !skb_is_gso(skb))))
 		return -ENOTSUPP;
 
-	ret = shrink ? bpf_skb_net_shrink(skb, len_diff_abs) :
-		       bpf_skb_net_grow(skb, len_diff_abs);
+	ret = shrink ? bpf_skb_net_shrink(skb, off, len_diff_abs) :
+		       bpf_skb_net_grow(skb, off, len_diff_abs);
 
 	bpf_compute_data_pointers(skb);
 	return ret;
 }
 
-BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
-	   u32, mode, u64, flags)
-{
-	if (unlikely(flags))
-		return -EINVAL;
-	if (likely(mode == BPF_ADJ_ROOM_NET))
-		return bpf_skb_adjust_net(skb, len_diff);
-
-	return -ENOTSUPP;
-}
-
 static const struct bpf_func_proto bpf_skb_adjust_room_proto = {
 	.func		= bpf_skb_adjust_room,
 	.gpl_only	= false,
-- 
2.21.0.225.g810b269d1ac-goog



* [PATCH bpf-next 08/13] bpf: add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_FIXED_GSO
  2019-03-20 14:49 [PATCH bpf-next 00/13] bpf tc tunneling Willem de Bruijn
                   ` (6 preceding siblings ...)
  2019-03-20 14:49 ` [PATCH bpf-next 07/13] bpf: add bpf_skb_adjust_room mode BPF_ADJ_ROOM_MAC Willem de Bruijn
@ 2019-03-20 14:49 ` Willem de Bruijn
  2019-03-21 13:42   ` Alan Maguire
  2019-03-20 14:49 ` [PATCH bpf-next 09/13] bpf: add bpf_skb_adjust_room encap flags Willem de Bruijn
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 21+ messages in thread
From: Willem de Bruijn @ 2019-03-20 14:49 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

bpf_skb_adjust_room adjusts gso_size of gso packets to account for the
pushed or popped header room.

This is not allowed with UDP, where gso_size delineates datagrams. Add
an option to avoid these updates and allow this call for datagrams.

It can also be used with TCP, when MSS is known to allow headroom,
e.g., through MSS clamping or route MTU.
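
A UDP gso aware program would then pass the flag when making room for
its headers, e.g. (a sketch, with encap_len standing in for the length
of the pushed headers):

	if (bpf_skb_adjust_room(skb, encap_len, BPF_ADJ_ROOM_MAC,
				BPF_F_ADJ_ROOM_FIXED_GSO))
		return TC_ACT_SHOT;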

Link: https://patchwork.ozlabs.org/patch/1052497/
Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 include/uapi/linux/bpf.h |  4 ++++
 net/core/filter.c        | 36 +++++++++++++++++++++++++-----------
 2 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 4f5c918e6fcf4..0eda8f564a381 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2593,6 +2593,10 @@ enum bpf_func_id {
 /* Current network namespace */
 #define BPF_F_CURRENT_NETNS		(-1L)
 
+/* BPF_FUNC_skb_adjust_room flags. */
+#define BPF_F_ADJ_ROOM_FIXED_GSO	(1ULL << 0)
+#define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO)
+
 /* Mode for BPF_FUNC_skb_adjust_room helper. */
 enum bpf_adj_room_mode {
 	BPF_ADJ_ROOM_NET,
diff --git a/net/core/filter.c b/net/core/filter.c
index 1a0cf578b4502..e346e48098000 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2963,12 +2963,17 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
 	}
 }
 
-static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff)
+static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
+			    u64 flags)
 {
 	int ret;
 
-	if (skb_is_gso(skb) && !skb_is_gso_tcp(skb))
-		return -ENOTSUPP;
+	if (skb_is_gso(skb) && !skb_is_gso_tcp(skb)) {
+		/* udp gso_size delineates datagrams, only allow if fixed */
+		if (!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4) ||
+		    !(flags & BPF_F_ADJ_ROOM_FIXED_GSO))
+			return -ENOTSUPP;
+	}
 
 	ret = skb_cow_head(skb, len_diff);
 	if (unlikely(ret < 0))
@@ -2982,7 +2987,9 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff)
 		struct skb_shared_info *shinfo = skb_shinfo(skb);
 
 		/* Due to header grow, MSS needs to be downgraded. */
-		skb_decrease_gso_size(shinfo, len_diff);
+		if (!(flags & BPF_F_ADJ_ROOM_FIXED_GSO))
+			skb_decrease_gso_size(shinfo, len_diff);
+
 		/* Header must be checked, and gso_segs recomputed. */
 		shinfo->gso_type |= SKB_GSO_DODGY;
 		shinfo->gso_segs = 0;
@@ -2991,12 +2998,17 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff)
 	return 0;
 }
 
-static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff)
+static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff,
+			      u64 flags)
 {
 	int ret;
 
-	if (skb_is_gso(skb) && !skb_is_gso_tcp(skb))
-		return -ENOTSUPP;
+	if (skb_is_gso(skb) && !skb_is_gso_tcp(skb)) {
+		/* udp gso_size delineates datagrams, only allow if fixed */
+		if (!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4) ||
+		    !(flags & BPF_F_ADJ_ROOM_FIXED_GSO))
+			return -ENOTSUPP;
+	}
 
 	ret = skb_unclone(skb, GFP_ATOMIC);
 	if (unlikely(ret < 0))
@@ -3010,7 +3022,9 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff)
 		struct skb_shared_info *shinfo = skb_shinfo(skb);
 
 		/* Due to header shrink, MSS can be upgraded. */
-		skb_increase_gso_size(shinfo, len_diff);
+		if (!(flags & BPF_F_ADJ_ROOM_FIXED_GSO))
+			skb_increase_gso_size(shinfo, len_diff);
+
 		/* Header must be checked, and gso_segs recomputed. */
 		shinfo->gso_type |= SKB_GSO_DODGY;
 		shinfo->gso_segs = 0;
@@ -3037,7 +3051,7 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
 	u32 off;
 	int ret;
 
-	if (unlikely(flags))
+	if (unlikely(flags & ~BPF_F_ADJ_ROOM_MASK))
 		return -EINVAL;
 	if (unlikely(len_diff_abs > 0xfffU))
 		return -EFAULT;
@@ -3065,8 +3079,8 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
 			 !skb_is_gso(skb))))
 		return -ENOTSUPP;
 
-	ret = shrink ? bpf_skb_net_shrink(skb, off, len_diff_abs) :
-		       bpf_skb_net_grow(skb, off, len_diff_abs);
+	ret = shrink ? bpf_skb_net_shrink(skb, off, len_diff_abs, flags) :
+		       bpf_skb_net_grow(skb, off, len_diff_abs, flags);
 
 	bpf_compute_data_pointers(skb);
 	return ret;
-- 
2.21.0.225.g810b269d1ac-goog



* [PATCH bpf-next 09/13] bpf: add bpf_skb_adjust_room encap flags
  2019-03-20 14:49 [PATCH bpf-next 00/13] bpf tc tunneling Willem de Bruijn
                   ` (7 preceding siblings ...)
  2019-03-20 14:49 ` [PATCH bpf-next 08/13] bpf: add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_FIXED_GSO Willem de Bruijn
@ 2019-03-20 14:49 ` Willem de Bruijn
  2019-03-20 15:51   ` Alan Maguire
  2019-03-21  3:13   ` Alexei Starovoitov
  2019-03-20 14:49 ` [PATCH bpf-next 10/13] bpf: Sync bpf.h to tools Willem de Bruijn
                   ` (3 subsequent siblings)
  12 siblings, 2 replies; 21+ messages in thread
From: Willem de Bruijn @ 2019-03-20 14:49 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

When pushing tunnel headers, annotate skbs in the same way as tunnel
devices.

For GSO packets, the network stack requires certain fields set to
segment packets with tunnel headers. gre_gso_segment depends on the
transport and inner mac headers, for instance.

Add an option to pass this information.

Remove the restriction on len_diff to network header length, which
is too short, e.g., for GRE protocols.
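
For instance, for a gre-over-ipv6 tunnel the flags passed as the last
argument of bpf_skb_adjust_room would roughly be (a sketch; the exact
combination used by the selftest is in the final patch of the series):

	__u64 flags = BPF_F_ADJ_ROOM_FIXED_GSO |
		      BPF_F_ADJ_ROOM_ENCAP_L3_IPV6 |
		      BPF_F_ADJ_ROOM_ENCAP_L4_GRE;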

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 include/uapi/linux/bpf.h | 14 +++++++++-
 net/core/filter.c        | 58 +++++++++++++++++++++++++++++++++++++---
 2 files changed, 67 insertions(+), 5 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 0eda8f564a381..a444534cc88d7 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2595,7 +2595,19 @@ enum bpf_func_id {
 
 /* BPF_FUNC_skb_adjust_room flags. */
 #define BPF_F_ADJ_ROOM_FIXED_GSO	(1ULL << 0)
-#define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO)
+
+#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV4	(1ULL << 1)
+#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV6	(1ULL << 2)
+#define BPF_F_ADJ_ROOM_ENCAP_L3_MASK	(BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 | \
+					 BPF_F_ADJ_ROOM_ENCAP_L3_IPV6)
+
+#define BPF_F_ADJ_ROOM_ENCAP_L4_GRE	(1ULL << 3)
+#define BPF_F_ADJ_ROOM_ENCAP_L4_UDP	(1ULL << 4)
+
+#define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO | \
+					 BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
+					 BPF_F_ADJ_ROOM_ENCAP_L4_GRE | \
+					 BPF_F_ADJ_ROOM_ENCAP_L4_UDP)
 
 /* Mode for BPF_FUNC_skb_adjust_room helper. */
 enum bpf_adj_room_mode {
diff --git a/net/core/filter.c b/net/core/filter.c
index e346e48098000..6007b0b4bc0d7 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2966,6 +2966,9 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
 static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
 			    u64 flags)
 {
+	bool encap = flags & BPF_F_ADJ_ROOM_ENCAP_L3_MASK;
+	unsigned int gso_type = SKB_GSO_DODGY;
+	u16 mac_len, inner_net, inner_trans;
 	int ret;
 
 	if (skb_is_gso(skb) && !skb_is_gso_tcp(skb)) {
@@ -2979,10 +2982,60 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
 	if (unlikely(ret < 0))
 		return ret;
 
+	if (encap) {
+		if (skb->protocol != htons(ETH_P_IP) &&
+		    skb->protocol != htons(ETH_P_IPV6))
+			return -ENOTSUPP;
+
+		if (flags & BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 &&
+		    flags & BPF_F_ADJ_ROOM_ENCAP_L3_IPV6)
+			return -EINVAL;
+
+		if (flags & BPF_F_ADJ_ROOM_ENCAP_L4_GRE &&
+		    flags & BPF_F_ADJ_ROOM_ENCAP_L4_UDP)
+			return -EINVAL;
+
+		if (skb->encapsulation)
+			return -EALREADY;
+
+		mac_len = skb->network_header - skb->mac_header;
+		inner_net = skb->network_header;
+		inner_trans = skb->transport_header;
+	}
+
 	ret = bpf_skb_net_hdr_push(skb, off, len_diff);
 	if (unlikely(ret < 0))
 		return ret;
 
+	if (encap) {
+		/* inner mac == inner_net on l3 encap */
+		skb->inner_mac_header = inner_net;
+		skb->inner_network_header = inner_net;
+		skb->inner_transport_header = inner_trans;
+		skb_set_inner_protocol(skb, skb->protocol);
+
+		skb->encapsulation = 1;
+		skb_set_network_header(skb, mac_len);
+
+		if (flags & BPF_F_ADJ_ROOM_ENCAP_L4_UDP)
+			gso_type |= SKB_GSO_UDP_TUNNEL;
+		else if (flags & BPF_F_ADJ_ROOM_ENCAP_L4_GRE)
+			gso_type |= SKB_GSO_GRE;
+		else if (flags & BPF_F_ADJ_ROOM_ENCAP_L3_IPV6)
+			gso_type |= SKB_GSO_IPXIP6;
+		else
+			gso_type |= SKB_GSO_IPXIP4;
+
+		if (flags & BPF_F_ADJ_ROOM_ENCAP_L4_GRE ||
+		    flags & BPF_F_ADJ_ROOM_ENCAP_L4_UDP) {
+			int nh_len = flags & BPF_F_ADJ_ROOM_ENCAP_L3_IPV6 ?
+					sizeof(struct ipv6hdr) :
+					sizeof(struct iphdr);
+
+			skb_set_transport_header(skb, mac_len + nh_len);
+		}
+	}
+
 	if (skb_is_gso(skb)) {
 		struct skb_shared_info *shinfo = skb_shinfo(skb);
 
@@ -2991,7 +3044,7 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
 			skb_decrease_gso_size(shinfo, len_diff);
 
 		/* Header must be checked, and gso_segs recomputed. */
-		shinfo->gso_type |= SKB_GSO_DODGY;
+		shinfo->gso_type |= gso_type;
 		shinfo->gso_segs = 0;
 	}
 
@@ -3042,7 +3095,6 @@ static u32 __bpf_skb_max_len(const struct sk_buff *skb)
 BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
 	   u32, mode, u64, flags)
 {
-	bool trans_same = skb->transport_header == skb->network_header;
 	u32 len_cur, len_diff_abs = abs(len_diff);
 	u32 len_min = bpf_skb_net_base_len(skb);
 	u32 len_max = __bpf_skb_max_len(skb);
@@ -3071,8 +3123,6 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
 	}
 
 	len_cur = skb->len - skb_network_offset(skb);
-	if (skb_transport_header_was_set(skb) && !trans_same)
-		len_cur = skb_network_header_len(skb);
 	if ((shrink && (len_diff_abs >= len_cur ||
 			len_cur - len_diff_abs < len_min)) ||
 	    (!shrink && (skb->len + len_diff_abs > len_max &&
-- 
2.21.0.225.g810b269d1ac-goog



* [PATCH bpf-next 10/13] bpf: Sync bpf.h to tools
  2019-03-20 14:49 [PATCH bpf-next 00/13] bpf tc tunneling Willem de Bruijn
                   ` (8 preceding siblings ...)
  2019-03-20 14:49 ` [PATCH bpf-next 09/13] bpf: add bpf_skb_adjust_room encap flags Willem de Bruijn
@ 2019-03-20 14:49 ` Willem de Bruijn
  2019-03-20 14:56   ` Soheil Hassas Yeganeh
  2019-03-20 14:49 ` [PATCH bpf-next 11/13] selftests/bpf: convert bpf tunnel test to BPF_ADJ_ROOM_MAC Willem de Bruijn
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 21+ messages in thread
From: Willem de Bruijn @ 2019-03-20 14:49 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

Sync include/uapi/linux/bpf.h with tools/

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 tools/include/uapi/linux/bpf.h | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 929c8e537a14a..a444534cc88d7 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1478,7 +1478,10 @@ union bpf_attr {
  * 		Grow or shrink the room for data in the packet associated to
  * 		*skb* by *len_diff*, and according to the selected *mode*.
  *
- * 		There is a single supported mode at this time:
+ *		There are two supported modes at this time:
+ *
+ *		* **BPF_ADJ_ROOM_MAC**: Adjust room at the mac layer
+ *		  (room space is added or removed below the layer 2 header).
  *
  * 		* **BPF_ADJ_ROOM_NET**: Adjust room at the network layer
  * 		  (room space is added or removed below the layer 3 header).
@@ -2590,9 +2593,26 @@ enum bpf_func_id {
 /* Current network namespace */
 #define BPF_F_CURRENT_NETNS		(-1L)
 
+/* BPF_FUNC_skb_adjust_room flags. */
+#define BPF_F_ADJ_ROOM_FIXED_GSO	(1ULL << 0)
+
+#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV4	(1ULL << 1)
+#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV6	(1ULL << 2)
+#define BPF_F_ADJ_ROOM_ENCAP_L3_MASK	(BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 | \
+					 BPF_F_ADJ_ROOM_ENCAP_L3_IPV6)
+
+#define BPF_F_ADJ_ROOM_ENCAP_L4_GRE	(1ULL << 3)
+#define BPF_F_ADJ_ROOM_ENCAP_L4_UDP	(1ULL << 4)
+
+#define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO | \
+					 BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
+					 BPF_F_ADJ_ROOM_ENCAP_L4_GRE | \
+					 BPF_F_ADJ_ROOM_ENCAP_L4_UDP)
+
 /* Mode for BPF_FUNC_skb_adjust_room helper. */
 enum bpf_adj_room_mode {
 	BPF_ADJ_ROOM_NET,
+	BPF_ADJ_ROOM_MAC,
 };
 
 /* Mode for BPF_FUNC_skb_load_bytes_relative helper. */
-- 
2.21.0.225.g810b269d1ac-goog



* [PATCH bpf-next 11/13] selftests/bpf: convert bpf tunnel test to BPF_ADJ_ROOM_MAC
  2019-03-20 14:49 [PATCH bpf-next 00/13] bpf tc tunneling Willem de Bruijn
                   ` (9 preceding siblings ...)
  2019-03-20 14:49 ` [PATCH bpf-next 10/13] bpf: Sync bpf.h to tools Willem de Bruijn
@ 2019-03-20 14:49 ` Willem de Bruijn
  2019-03-20 14:49 ` [PATCH bpf-next 12/13] selftests/bpf: convert bpf tunnel test to BPF_F_ADJ_ROOM_FIXED_GSO Willem de Bruijn
  2019-03-20 14:49 ` [PATCH bpf-next 13/13] selftests/bpf: convert bpf tunnel test to encap modes Willem de Bruijn
  12 siblings, 0 replies; 21+ messages in thread
From: Willem de Bruijn @ 2019-03-20 14:49 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

Avoid moving the network layer header when prefixing tunnel headers.

This avoids an explicit call to bpf_skb_store_bytes and an implicit
move of the network header bytes in bpf_skb_adjust_room.

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 .../selftests/bpf/progs/test_tc_tunnel.c      | 25 +++----------------
 1 file changed, 3 insertions(+), 22 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
index 900c5653105fe..f6a16fd23dbd5 100644
--- a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
+++ b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
@@ -72,7 +72,7 @@ static __always_inline int encap_ipv4(struct __sk_buff *skb, bool with_gre)
 	olen = with_gre ? sizeof(h_outer) : sizeof(h_outer.ip);
 
 	/* add room between mac and network header */
-	if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_NET, 0))
+	if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_MAC, 0))
 		return TC_ACT_SHOT;
 
 	/* prepare new outer network header */
@@ -94,12 +94,6 @@ static __always_inline int encap_ipv4(struct __sk_buff *skb, bool with_gre)
 				BPF_F_INVALIDATE_HASH) < 0)
 		return TC_ACT_SHOT;
 
-	/* bpf_skb_adjust_room has moved header to start of room: restore */
-	if (bpf_skb_store_bytes(skb, ETH_HLEN + olen,
-				&iph_inner, sizeof(iph_inner),
-				BPF_F_INVALIDATE_HASH) < 0)
-		return TC_ACT_SHOT;
-
 	return TC_ACT_OK;
 }
 
@@ -125,7 +119,7 @@ static __always_inline int encap_ipv6(struct __sk_buff *skb, bool with_gre)
 	olen = with_gre ? sizeof(h_outer) : sizeof(h_outer.ip);
 
 	/* add room between mac and network header */
-	if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_NET, 0))
+	if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_MAC, 0))
 		return TC_ACT_SHOT;
 
 	/* prepare new outer network header */
@@ -145,12 +139,6 @@ static __always_inline int encap_ipv6(struct __sk_buff *skb, bool with_gre)
 				BPF_F_INVALIDATE_HASH) < 0)
 		return TC_ACT_SHOT;
 
-	/* bpf_skb_adjust_room has moved header to start of room: restore */
-	if (bpf_skb_store_bytes(skb, ETH_HLEN + olen,
-				&iph_inner, sizeof(iph_inner),
-				BPF_F_INVALIDATE_HASH) < 0)
-		return TC_ACT_SHOT;
-
 	return TC_ACT_OK;
 }
 
@@ -207,14 +195,7 @@ static int decap_internal(struct __sk_buff *skb, int off, int len, char proto)
 		return TC_ACT_OK;
 	}
 
-	if (bpf_skb_load_bytes(skb, off + olen, &buf, olen) < 0)
-		return TC_ACT_OK;
-
-	if (bpf_skb_adjust_room(skb, -olen, BPF_ADJ_ROOM_NET, 0))
-		return TC_ACT_SHOT;
-
-	/* bpf_skb_adjust_room has moved outer over inner header: restore */
-	if (bpf_skb_store_bytes(skb, off, buf, len, BPF_F_INVALIDATE_HASH) < 0)
+	if (bpf_skb_adjust_room(skb, -olen, BPF_ADJ_ROOM_MAC, 0))
 		return TC_ACT_SHOT;
 
 	return TC_ACT_OK;
-- 
2.21.0.225.g810b269d1ac-goog


* [PATCH bpf-next 12/13] selftests/bpf: convert bpf tunnel test to BPF_F_ADJ_ROOM_FIXED_GSO
  2019-03-20 14:49 [PATCH bpf-next 00/13] bpf tc tunneling Willem de Bruijn
                   ` (10 preceding siblings ...)
  2019-03-20 14:49 ` [PATCH bpf-next 11/13] selftests/bpf: convert bpf tunnel test to BPF_ADJ_ROOM_MAC Willem de Bruijn
@ 2019-03-20 14:49 ` Willem de Bruijn
  2019-03-20 14:49 ` [PATCH bpf-next 13/13] selftests/bpf: convert bpf tunnel test to encap modes Willem de Bruijn
  12 siblings, 0 replies; 21+ messages in thread
From: Willem de Bruijn @ 2019-03-20 14:49 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

Lower route MTU to ensure packets fit in device MTU after encap, then
skip the gso_size changes.

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 tools/testing/selftests/bpf/progs/test_tc_tunnel.c | 11 ++++++++---
 tools/testing/selftests/bpf/test_tc_tunnel.sh      |  6 ++++++
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
index f6a16fd23dbd5..3b79dffb81037 100644
--- a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
+++ b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
@@ -52,6 +52,7 @@ static __always_inline int encap_ipv4(struct __sk_buff *skb, bool with_gre)
 	struct grev4hdr h_outer;
 	struct iphdr iph_inner;
 	struct tcphdr tcph;
+	__u64 flags;
 	int olen;
 
 	if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_inner,
@@ -69,10 +70,11 @@ static __always_inline int encap_ipv4(struct __sk_buff *skb, bool with_gre)
 	if (tcph.dest != __bpf_constant_htons(cfg_port))
 		return TC_ACT_OK;
 
+	flags = BPF_F_ADJ_ROOM_FIXED_GSO;
 	olen = with_gre ? sizeof(h_outer) : sizeof(h_outer.ip);
 
 	/* add room between mac and network header */
-	if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_MAC, 0))
+	if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_MAC, flags))
 		return TC_ACT_SHOT;
 
 	/* prepare new outer network header */
@@ -102,6 +104,7 @@ static __always_inline int encap_ipv6(struct __sk_buff *skb, bool with_gre)
 	struct ipv6hdr iph_inner;
 	struct grev6hdr h_outer;
 	struct tcphdr tcph;
+	__u64 flags;
 	int olen;
 
 	if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_inner,
@@ -116,10 +119,11 @@ static __always_inline int encap_ipv6(struct __sk_buff *skb, bool with_gre)
 	if (tcph.dest != __bpf_constant_htons(cfg_port))
 		return TC_ACT_OK;
 
+	flags = BPF_F_ADJ_ROOM_FIXED_GSO;
 	olen = with_gre ? sizeof(h_outer) : sizeof(h_outer.ip);
 
 	/* add room between mac and network header */
-	if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_MAC, 0))
+	if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_MAC, flags))
 		return TC_ACT_SHOT;
 
 	/* prepare new outer network header */
@@ -195,7 +199,8 @@ static int decap_internal(struct __sk_buff *skb, int off, int len, char proto)
 		return TC_ACT_OK;
 	}
 
-	if (bpf_skb_adjust_room(skb, -olen, BPF_ADJ_ROOM_MAC, 0))
+	if (bpf_skb_adjust_room(skb, -olen, BPF_ADJ_ROOM_MAC,
+				BPF_F_ADJ_ROOM_FIXED_GSO))
 		return TC_ACT_SHOT;
 
 	return TC_ACT_OK;
diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
index 5d9d56520c694..3d2111d89315c 100755
--- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -35,6 +35,12 @@ setup() {
 	ip -netns "${ns1}" -6 addr add "${ns1_v6}/64" dev veth1 nodad
 	ip -netns "${ns2}" -6 addr add "${ns2_v6}/64" dev veth2 nodad
 
+	# clamp route to reserve room for tunnel headers
+	ip -netns "${ns1}" -4 route flush table main
+	ip -netns "${ns1}" -6 route flush table main
+	ip -netns "${ns1}" -4 route add "${ns2_v4}" mtu 1476 dev veth1
+	ip -netns "${ns1}" -6 route add "${ns2_v6}" mtu 1456 dev veth1
+
 	sleep 1
 
 	dd if=/dev/urandom of="${infile}" bs="${datalen}" count=1 status=none
-- 
2.21.0.225.g810b269d1ac-goog


* [PATCH bpf-next 13/13] selftests/bpf: convert bpf tunnel test to encap modes
  2019-03-20 14:49 [PATCH bpf-next 00/13] bpf tc tunneling Willem de Bruijn
                   ` (11 preceding siblings ...)
  2019-03-20 14:49 ` [PATCH bpf-next 12/13] selftests/bpf: convert bpf tunnel test to BPF_F_ADJ_ROOM_FIXED_GSO Willem de Bruijn
@ 2019-03-20 14:49 ` Willem de Bruijn
  12 siblings, 0 replies; 21+ messages in thread
From: Willem de Bruijn @ 2019-03-20 14:49 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, sdf, posk, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

Make the tests correctly annotate skbs with tunnel metadata.

This makes the gso tests succeed. Enable them.

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 .../selftests/bpf/progs/test_tc_tunnel.c      | 19 +++++++++++++++----
 tools/testing/selftests/bpf/test_tc_tunnel.sh | 10 ++++------
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
index 3b79dffb81037..f541c2de947d2 100644
--- a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
+++ b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
@@ -70,8 +70,13 @@ static __always_inline int encap_ipv4(struct __sk_buff *skb, bool with_gre)
 	if (tcph.dest != __bpf_constant_htons(cfg_port))
 		return TC_ACT_OK;
 
-	flags = BPF_F_ADJ_ROOM_FIXED_GSO;
-	olen = with_gre ? sizeof(h_outer) : sizeof(h_outer.ip);
+	flags = BPF_F_ADJ_ROOM_FIXED_GSO | BPF_F_ADJ_ROOM_ENCAP_L3_IPV4;
+	if (with_gre) {
+		flags |= BPF_F_ADJ_ROOM_ENCAP_L4_GRE;
+		olen = sizeof(h_outer);
+	} else {
+		olen = sizeof(h_outer.ip);
+	}
 
 	/* add room between mac and network header */
 	if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_MAC, flags))
@@ -119,8 +124,14 @@ static __always_inline int encap_ipv6(struct __sk_buff *skb, bool with_gre)
 	if (tcph.dest != __bpf_constant_htons(cfg_port))
 		return TC_ACT_OK;
 
-	flags = BPF_F_ADJ_ROOM_FIXED_GSO;
-	olen = with_gre ? sizeof(h_outer) : sizeof(h_outer.ip);
+	flags = BPF_F_ADJ_ROOM_FIXED_GSO | BPF_F_ADJ_ROOM_ENCAP_L3_IPV6;
+	if (with_gre) {
+		flags |= BPF_F_ADJ_ROOM_ENCAP_L4_GRE;
+		olen = sizeof(h_outer);
+	} else {
+		olen = sizeof(h_outer.ip);
+	}
+
 
 	/* add room between mac and network header */
 	if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_MAC, flags))
diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
index 3d2111d89315c..4e66c45ec465f 100755
--- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -97,13 +97,11 @@ if [[ "$#" -eq "0" ]]; then
 	echo "ip6 gre"
 	$0 ipv6 ip6gre 100
 
-	# disabled until passes SKB_GSO_DODGY checks
-	# echo "ip gre gso"
-	# $0 ipv4 gre 2000
+	echo "ip gre gso"
+	$0 ipv4 gre 2000
 
-	# disabled until passes SKB_GSO_DODGY checks
-	# echo "ip6 gre gso"
-	# $0 ipv6 ip6gre 2000
+	echo "ip6 gre gso"
+	$0 ipv6 ip6gre 2000
 
 	echo "OK. All tests passed"
 	exit 0
-- 
2.21.0.225.g810b269d1ac-goog


* Re: [PATCH bpf-next 10/13] bpf: Sync bpf.h to tools
  2019-03-20 14:49 ` [PATCH bpf-next 10/13] bpf: Sync bpf.h to tools Willem de Bruijn
@ 2019-03-20 14:56   ` Soheil Hassas Yeganeh
  0 siblings, 0 replies; 21+ messages in thread
From: Soheil Hassas Yeganeh @ 2019-03-20 14:56 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, Stanislav Fomichev,
	Peter Oskolkov, Willem de Bruijn

On Wed, Mar 20, 2019 at 10:50 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> From: Willem de Bruijn <willemb@google.com>
>
> Sync include/uapi/linux/bpf.h with tools/
>
> Signed-off-by: Willem de Bruijn <willemb@google.com>

Acked-by: Soheil Hassas Yeganeh <soheil@google.com>

> ---
>  tools/include/uapi/linux/bpf.h | 22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 929c8e537a14a..a444534cc88d7 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -1478,7 +1478,10 @@ union bpf_attr {
>   *             Grow or shrink the room for data in the packet associated to
>   *             *skb* by *len_diff*, and according to the selected *mode*.
>   *
> - *             There is a single supported mode at this time:
> + *             There are two supported modes at this time:
> + *
> + *             * **BPF_ADJ_ROOM_MAC**: Adjust room at the mac layer
> + *               (room space is added or removed below the layer 2 header).
>   *
>   *             * **BPF_ADJ_ROOM_NET**: Adjust room at the network layer
>   *               (room space is added or removed below the layer 3 header).
> @@ -2590,9 +2593,26 @@ enum bpf_func_id {
>  /* Current network namespace */
>  #define BPF_F_CURRENT_NETNS            (-1L)
>
> +/* BPF_FUNC_skb_adjust_room flags. */
> +#define BPF_F_ADJ_ROOM_FIXED_GSO       (1ULL << 0)
> +
> +#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV4   (1ULL << 1)
> +#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV6   (1ULL << 2)
> +#define BPF_F_ADJ_ROOM_ENCAP_L3_MASK   (BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 | \
> +                                        BPF_F_ADJ_ROOM_ENCAP_L3_IPV6)
> +
> +#define BPF_F_ADJ_ROOM_ENCAP_L4_GRE    (1ULL << 3)
> +#define BPF_F_ADJ_ROOM_ENCAP_L4_UDP    (1ULL << 4)
> +
> +#define BPF_F_ADJ_ROOM_MASK            (BPF_F_ADJ_ROOM_FIXED_GSO | \
> +                                        BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
> +                                        BPF_F_ADJ_ROOM_ENCAP_L4_GRE | \
> +                                        BPF_F_ADJ_ROOM_ENCAP_L4_UDP)
> +
>  /* Mode for BPF_FUNC_skb_adjust_room helper. */
>  enum bpf_adj_room_mode {
>         BPF_ADJ_ROOM_NET,
> +       BPF_ADJ_ROOM_MAC,
>  };
>
>  /* Mode for BPF_FUNC_skb_load_bytes_relative helper. */
> --
> 2.21.0.225.g810b269d1ac-goog
>

* Re: [PATCH bpf-next 09/13] bpf: add bpf_skb_adjust_room encap flags
  2019-03-20 14:49 ` [PATCH bpf-next 09/13] bpf: add bpf_skb_adjust_room encap flags Willem de Bruijn
@ 2019-03-20 15:51   ` Alan Maguire
  2019-03-20 18:10     ` Willem de Bruijn
  2019-03-21  3:13   ` Alexei Starovoitov
  1 sibling, 1 reply; 21+ messages in thread
From: Alan Maguire @ 2019-03-20 15:51 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: netdev, ast, daniel, sdf, posk, Willem de Bruijn

On Wed, 20 Mar 2019, Willem de Bruijn wrote:

> From: Willem de Bruijn <willemb@google.com>
> 
> When pushing tunnel headers, annotate skbs in the same way as tunnel
> devices.
>
This is great stuff Willem!
 
> For GSO packets, the network stack requires certain fields set to
> segment packets with tunnel headers. gre_gso_segment depends on
> transport and inner mac header, for instance.
>

By coincidence I've been working on a patch to solve part of
this problem (attached).

I took a slightly different approach (which I think you mentioned
you considered) - adding an additional helper to mark the inner
headers.  The reason I needed this is that the mac header length
in my case was not always the same as the outer mac size (it
could be a set of MPLS labels, or indeed there might be no mac
header at all).  If I'm reading your code correctly, you derive
the mac header length from the outer mac header size - would it
be possible to overload the flags field of bpf_skb_adjust_room,
using 8 bits to store a non-standard mac length?  I'd be happy
to work on that as a separate patch if that seems reasonable.
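
For concreteness, the sort of encoding I had in mind is roughly the
following (names and bit positions purely illustrative, not a concrete
ABI proposal):

#define BPF_F_ADJ_ROOM_ENCAP_L2_SHIFT   56
#define BPF_F_ADJ_ROOM_ENCAP_L2_MASK    0xff
#define BPF_F_ADJ_ROOM_ENCAP_L2(len)    (((__u64)(len) & \
                                          BPF_F_ADJ_ROOM_ENCAP_L2_MASK) << \
                                         BPF_F_ADJ_ROOM_ENCAP_L2_SHIFT)

with the kernel recovering the length via
(flags >> BPF_F_ADJ_ROOM_ENCAP_L2_SHIFT) & BPF_F_ADJ_ROOM_ENCAP_L2_MASK
before setting the inner mac header.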
 
> Add an option to pass this information.
> 
> Remove the restriction on len_diff to network header length, which
> is too short, e.g., for GRE protocols.
>

I think this solves another problem I'd observed; when de-encapsulating
packets which had been GRO re-assembled, bpf_skb_adjust_room would
fail because GRO reassembly set the transport header, and as
shrinkage was limited to ensure we still had an IPv4/IPv6 header's
worth of space between the network and transport headers, the operation
would fail.  I think that problem is fixed here, is that right?

Reviewed-by: Alan Maguire <alan.maguire@oracle.com>

Thanks!

Alan

From 388c1bf0cfc76901782520c5af58f73b2649a4c0 Mon Sep 17 00:00:00 2001
From: Alan Maguire <alan.maguire@oracle.com>
Date: Wed, 20 Mar 2019 14:12:36 +0100
Subject: [PATCH] bpf: add bpf_skb_set_inner_header helper

This work adds a helper which can mark inner mac and network
headers and sets inner protocol type for the relevant skb such
that generic segmentation offload (GSO) can segment the packet
appropriately while taking newly-added encapsulation headers into
account.

It is intended to be used in conjunction with other helpers such
as bpf_skb_adjust_room for cases where custom encapsulation is
implemented in a tc BPF program and GSO functionality is needed.
Currently UDP and GRE encapsulation are supported.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 include/uapi/linux/bpf.h | 23 ++++++++++++++++-
 net/core/filter.c        | 64 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 86 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 929c8e5..3ce3c16 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2431,6 +2431,22 @@ struct bpf_stack_build_id {
  *	Return
  *		A **struct bpf_sock** pointer on success, or **NULL** in
  *		case of failure.
+ *
+ * int bpf_skb_set_inner_header(skb, outer_proto, inner_proto, inner_offset,
+ *				flags)
+ *	Description
+ *		Set inner header at *inner_offset* for specified *inner_proto*
+ *		for tunnel of type *outer_proto*. *outer_proto* must be one
+ *		of **IPPROTO_UDP** or **IPPROTO_GRE**.  *inner_proto* must be
+ *		**ETH_P_IP** or **ETH_P_IPV6**.  *inner_offset* should specify
+ *		offset of the relevant inner header, or should be 0 to reset
+ *		inner headers. *flags* should be a combination of
+ *
+ *			* **BPF_F_L2_INNER_OFFSET** offset is inner mac header
+ *			* **BPF_F_L3_INNER_OFFSET** offset is inner network
+ *			  header
+ *	Return
+ *		0 on success, or negative error in case of failure.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -2531,7 +2547,8 @@ struct bpf_stack_build_id {
 	FN(sk_fullsock),		\
 	FN(tcp_sock),			\
 	FN(skb_ecn_set_ce),		\
-	FN(get_listener_sock),
+	FN(get_listener_sock),		\
+	FN(skb_set_inner_header),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
@@ -2595,6 +2612,10 @@ enum bpf_adj_room_mode {
 	BPF_ADJ_ROOM_NET,
 };
 
+/* BPF_FUNC_skb_set_inner_header flags. */
+#define	BPF_F_L2_INNER_OFFSET		(1ULL << 0)
+#define	BPF_F_L3_INNER_OFFSET		(1ULL << 1)
+
 /* Mode for BPF_FUNC_skb_load_bytes_relative helper. */
 enum bpf_hdr_start_off {
 	BPF_HDR_START_MAC,
diff --git a/net/core/filter.c b/net/core/filter.c
index 647c63a..f8265f3 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5463,6 +5463,68 @@ u32 bpf_tcp_sock_convert_ctx_access(enum bpf_access_type type,
 };
 #endif /* CONFIG_INET */
 
+BPF_CALL_5(bpf_skb_set_inner_header, struct sk_buff *, skb,
+	   __be16, outer_proto, __be16, inner_proto, u32, inner_offset,
+	   u64, flags)
+{
+	unsigned int gso_type;
+
+	if (unlikely(inner_offset > skb->len))
+		return -EINVAL;
+
+	if (unlikely(flags & ~(BPF_F_L2_INNER_OFFSET | BPF_F_L3_INNER_OFFSET)))
+		return -EINVAL;
+
+	switch (outer_proto) {
+	case IPPROTO_UDP:
+		gso_type = SKB_GSO_UDP_TUNNEL;
+		break;
+	case IPPROTO_GRE:
+		gso_type = SKB_GSO_GRE;
+		break;
+	default:
+		return -ENOTSUPP;
+	}
+
+	switch (inner_proto) {
+	case ETH_P_IP:
+	case ETH_P_IPV6:
+		break;
+	default:
+		return -ENOTSUPP;
+	}
+
+	if (inner_offset == 0) {
+		skb->encapsulation = 0;
+		skb_reset_inner_headers(skb);
+		return 0;
+	}
+
+	if (flags & BPF_F_L2_INNER_OFFSET)
+		skb_set_inner_mac_header(skb, inner_offset);
+	if (flags & BPF_F_L3_INNER_OFFSET)
+		skb_set_inner_network_header(skb, inner_offset);
+
+	skb_set_inner_protocol(skb, htons(inner_proto));
+	skb->encapsulation = 1;
+
+	if (skb_is_gso(skb))
+		skb_shinfo(skb)->gso_type |= gso_type;
+
+	return 0;
+}
+
+static const struct bpf_func_proto bpf_skb_set_inner_header_proto = {
+	.func		= bpf_skb_set_inner_header,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_CTX,
+	.arg2_type	= ARG_ANYTHING,
+	.arg3_type	= ARG_ANYTHING,
+	.arg4_type	= ARG_ANYTHING,
+	.arg5_type	= ARG_ANYTHING,
+};
+
 bool bpf_helper_changes_pkt_data(void *func)
 {
 	if (func == bpf_skb_vlan_push ||
@@ -5720,6 +5782,8 @@ bool bpf_helper_changes_pkt_data(void *func)
 	case BPF_FUNC_get_listener_sock:
 		return &bpf_get_listener_sock_proto;
 #endif
+	case BPF_FUNC_skb_set_inner_header:
+		return &bpf_skb_set_inner_header_proto;
 	default:
 		return bpf_base_func_proto(func_id);
 	}
-- 
1.8.3.1


> Signed-off-by: Willem de Bruijn <willemb@google.com>
> ---
>  include/uapi/linux/bpf.h | 14 +++++++++-
>  net/core/filter.c        | 58 +++++++++++++++++++++++++++++++++++++---
>  2 files changed, 67 insertions(+), 5 deletions(-)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 0eda8f564a381..a444534cc88d7 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -2595,7 +2595,19 @@ enum bpf_func_id {
>  
>  /* BPF_FUNC_skb_adjust_room flags. */
>  #define BPF_F_ADJ_ROOM_FIXED_GSO	(1ULL << 0)
> -#define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO)
> +
> +#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV4	(1ULL << 1)
> +#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV6	(1ULL << 2)
> +#define BPF_F_ADJ_ROOM_ENCAP_L3_MASK	(BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 | \
> +					 BPF_F_ADJ_ROOM_ENCAP_L3_IPV6)
> +
> +#define BPF_F_ADJ_ROOM_ENCAP_L4_GRE	(1ULL << 3)
> +#define BPF_F_ADJ_ROOM_ENCAP_L4_UDP	(1ULL << 4)
> +
> +#define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO | \
> +					 BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
> +					 BPF_F_ADJ_ROOM_ENCAP_L4_GRE | \
> +					 BPF_F_ADJ_ROOM_ENCAP_L4_UDP)
>  
>  /* Mode for BPF_FUNC_skb_adjust_room helper. */
>  enum bpf_adj_room_mode {
> diff --git a/net/core/filter.c b/net/core/filter.c
> index e346e48098000..6007b0b4bc0d7 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -2966,6 +2966,9 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
>  static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
>  			    u64 flags)
>  {
> +	bool encap = flags & BPF_F_ADJ_ROOM_ENCAP_L3_MASK;
> +	unsigned int gso_type = SKB_GSO_DODGY;
> +	u16 mac_len, inner_net, inner_trans;
>  	int ret;
>  
>  	if (skb_is_gso(skb) && !skb_is_gso_tcp(skb)) {
> @@ -2979,10 +2982,60 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
>  	if (unlikely(ret < 0))
>  		return ret;
>  
> +	if (encap) {
> +		if (skb->protocol != htons(ETH_P_IP) &&
> +		    skb->protocol != htons(ETH_P_IPV6))
> +			return -ENOTSUPP;
> +
> +		if (flags & BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 &&
> +		    flags & BPF_F_ADJ_ROOM_ENCAP_L3_IPV6)
> +			return -EINVAL;
> +
> +		if (flags & BPF_F_ADJ_ROOM_ENCAP_L4_GRE &&
> +		    flags & BPF_F_ADJ_ROOM_ENCAP_L4_UDP)
> +			return -EINVAL;
> +
> +		if (skb->encapsulation)
> +			return -EALREADY;
> +
> +		mac_len = skb->network_header - skb->mac_header;
> +		inner_net = skb->network_header;
> +		inner_trans = skb->transport_header;
> +	}
> +
>  	ret = bpf_skb_net_hdr_push(skb, off, len_diff);
>  	if (unlikely(ret < 0))
>  		return ret;
>  
> +	if (encap) {
> +		/* inner mac == inner_net on l3 encap */
> +		skb->inner_mac_header = inner_net;
> +		skb->inner_network_header = inner_net;
> +		skb->inner_transport_header = inner_trans;
> +		skb_set_inner_protocol(skb, skb->protocol);
> +
> +		skb->encapsulation = 1;
> +		skb_set_network_header(skb, mac_len);
> +
> +		if (flags & BPF_F_ADJ_ROOM_ENCAP_L4_UDP)
> +			gso_type |= SKB_GSO_UDP_TUNNEL;
> +		else if (flags & BPF_F_ADJ_ROOM_ENCAP_L4_GRE)
> +			gso_type |= SKB_GSO_GRE;
> +		else if (flags & BPF_F_ADJ_ROOM_ENCAP_L3_IPV6)
> +			gso_type |= SKB_GSO_IPXIP6;
> +		else
> +			gso_type |= SKB_GSO_IPXIP4;
> +
> +		if (flags & BPF_F_ADJ_ROOM_ENCAP_L4_GRE ||
> +		    flags & BPF_F_ADJ_ROOM_ENCAP_L4_UDP) {
> +			int nh_len = flags & BPF_F_ADJ_ROOM_ENCAP_L3_IPV6 ?
> +					sizeof(struct ipv6hdr) :
> +					sizeof(struct iphdr);
> +
> +			skb_set_transport_header(skb, mac_len + nh_len);
> +		}
> +	}
> +
>  	if (skb_is_gso(skb)) {
>  		struct skb_shared_info *shinfo = skb_shinfo(skb);
>  
> @@ -2991,7 +3044,7 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
>  			skb_decrease_gso_size(shinfo, len_diff);
>  
>  		/* Header must be checked, and gso_segs recomputed. */
> -		shinfo->gso_type |= SKB_GSO_DODGY;
> +		shinfo->gso_type |= gso_type;
>  		shinfo->gso_segs = 0;
>  	}
>  
> @@ -3042,7 +3095,6 @@ static u32 __bpf_skb_max_len(const struct sk_buff *skb)
>  BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>  	   u32, mode, u64, flags)
>  {
> -	bool trans_same = skb->transport_header == skb->network_header;
>  	u32 len_cur, len_diff_abs = abs(len_diff);
>  	u32 len_min = bpf_skb_net_base_len(skb);
>  	u32 len_max = __bpf_skb_max_len(skb);
> @@ -3071,8 +3123,6 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>  	}
>  
>  	len_cur = skb->len - skb_network_offset(skb);
> -	if (skb_transport_header_was_set(skb) && !trans_same)
> -		len_cur = skb_network_header_len(skb);
>  	if ((shrink && (len_diff_abs >= len_cur ||
>  			len_cur - len_diff_abs < len_min)) ||
>  	    (!shrink && (skb->len + len_diff_abs > len_max &&
> -- 
> 2.21.0.225.g810b269d1ac-goog
> 
> 

* Re: [PATCH bpf-next 09/13] bpf: add bpf_skb_adjust_room encap flags
  2019-03-20 15:51   ` Alan Maguire
@ 2019-03-20 18:10     ` Willem de Bruijn
  0 siblings, 0 replies; 21+ messages in thread
From: Willem de Bruijn @ 2019-03-20 18:10 UTC (permalink / raw)
  To: Alan Maguire
  Cc: Network Development, Alexei Starovoitov, Daniel Borkmann,
	Stanislav Fomichev, Peter Oskolkov, Willem de Bruijn

On Wed, Mar 20, 2019 at 11:51 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> On Wed, 20 Mar 2019, Willem de Bruijn wrote:
>
> > From: Willem de Bruijn <willemb@google.com>
> >
> > When pushing tunnel headers, annotate skbs in the same way as tunnel
> > devices.
> >
> This is great stuff Willem!
>
> > For GSO packets, the network stack requires certain fields set to
> > segment packets with tunnel headers. gre_gso_segment depends on
> > transport and inner mac header, for instance.
> >
>
> By coincidence I've been working on a patch to solve part of
> this problem (attached).
>
> I took a slightly different approach (which I think you mentioned
> you considered) - adding an additional helper to mark the inner
> headers.  The reason I needed this is that the mac header length
> in my case was sometimes not the same as the outer mac size (it
> could be a set of MPLS labels sometimes, or indeed there might
> be no mac header at all).  If I'm reading your code correctly,
> you derive  the mac header length from the outer mac header size -
> would there be a possibility of overloading the flags field for
> bpf_skb_adjust_room to use 8 bits to store a non-standard mac length
> perhaps? I'd be happy to work on that as a separate patch if that seems
> reasonable.

My patch indeed still has some limitations in that regard. It assumes
network layer encap, so no inner mac. This excludes GRE with
ETH_P_TEB or MPLS. And there can be MPLS both on the outer and inner
packet.

My plan to deal with those eventually was to add modes
BPF_F_ADJ_ROOM_ENCAP_L2_... and require a separate call to
bpf_skb_adjust_room for each layer of encap. Would that work for your
use-case?
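
Roughly, from the program side, something like this (the L2 flag name
is made up here, just to sketch the two-step idea; olen as computed in
the selftest):

        /* first call: room + metadata for the outer ipv4 + gre headers */
        if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_MAC,
                                BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 |
                                BPF_F_ADJ_ROOM_ENCAP_L4_GRE))
                return TC_ACT_SHOT;

        /* second call: room for an inner ethernet header
         * (BPF_F_ADJ_ROOM_ENCAP_L2_ETH is hypothetical)
         */
        if (bpf_skb_adjust_room(skb, ETH_HLEN, BPF_ADJ_ROOM_MAC,
                                BPF_F_ADJ_ROOM_ENCAP_L2_ETH))
                return TC_ACT_SHOT;

        /* the header bytes themselves are still written afterwards
         * with bpf_skb_store_bytes(), as in the selftest
         */

The exact semantics (call order, how the inner mac header offset is
derived) would of course still need to be worked out.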

> > Add an option to pass this information.
> >
> > Remove the restriction on len_diff to network header length, which
> > is too short, e.g., for GRE protocols.
> >
>
> I think this solves another problem I'd observed; when de-encapsulating
> packets which had been GRO re-assembled, bpf_skb_adjust_room would
> fail becuase GRO reassembly set the transport header, and as
> shrinkage was limited to ensure we still had an IPv4/IPv6 header's
> worth of space between the network and transport headers, the operation
> would fail.  I think that problem is fixed here, is that right?

Yes, sounds familiar. The included selftest also fails without this because
the transport header is set in the transmit path and not scrubbed when
looped over veth.

Speaking of GRO, the patchset as is still lacks the inverse of this
encap patch: modify gso_type to remove tunnel types like
SKB_GSO_GRE.

> Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
>
> Thanks!

Thanks for reviewing and sharing your patch, Alan :)

There is definitely something to be said for moving this out of
bpf_skb_adjust_room to a separate call bpf_skb_adjust_tunnel_headers
or so that is called after bpf_skb_store_bytes.

Instead of taking all the arguments explicitly, this could also infer
the values by parsing the headers. I was a bit wary that this
would turn into (yet another) custom packet parser.

Otherwise, merging this with bpf_skb_adjust_room is essentially just a
performance optimization to avoid one extra indirect function call.


>
> Alan
>
> From 388c1bf0cfc76901782520c5af58f73b2649a4c0 Mon Sep 17 00:00:00 2001
> From: Alan Maguire <alan.maguire@oracle.com>
> Date: Wed, 20 Mar 2019 14:12:36 +0100
> Subject: [PATCH] bpf: add bpf_skb_set_inner_header helper
>
> This work adds a helper which can mark inner mac and network
> headers and sets inner protocol type for the relevant skb such
> that generic segmentation offload (GSO) can segment the packet
> appropriately while taking newly-added encapsulation headers into
> account.
>
> It is intended to be used in conjunction with other helpers such
> as bpf_skb_adjust_room for cases where custom encapsulation is
> implemented in a tc BPF program and GSO functionality is needed.
> Currently UDP and GRE encapsulation are supported.
>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>  include/uapi/linux/bpf.h | 23 ++++++++++++++++-
>  net/core/filter.c        | 64 ++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 86 insertions(+), 1 deletion(-)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 929c8e5..3ce3c16 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -2431,6 +2431,22 @@ struct bpf_stack_build_id {
>   *     Return
>   *             A **struct bpf_sock** pointer on success, or **NULL** in
>   *             case of failure.
> + *
> + * int bpf_skb_set_inner_header(skb, outer_proto, inner_proto, inner_offset,
> + *                             flags)
> + *     Description
> + *             Set inner header at *inner_offset* for specified *inner_proto*
> + *             for tunnel of type *outer_proto*. *outer_proto* must be one
> + *             of **IPPROTO_UDP** or **IPPROTO_GRE**.  *inner_proto* must be
> + *             **ETH_P_IP** or **ETH_P_IPV6**.  *inner_offset* should specify
> + *             offset of the relevant inner header, or should be 0 to reset
> + *             inner headers. *flags* should be a combination of
> + *
> + *                     * **BPF_F_L2_INNER_OFFSET** offset is inner mac header
> + *                     * **BPF_F_L3_INNER_OFFSET** offset is inner network
> + *                       header
> + *     Return
> + *             0 on success, or negative error in case of failure.
>   */
>  #define __BPF_FUNC_MAPPER(FN)          \
>         FN(unspec),                     \
> @@ -2531,7 +2547,8 @@ struct bpf_stack_build_id {
>         FN(sk_fullsock),                \
>         FN(tcp_sock),                   \
>         FN(skb_ecn_set_ce),             \
> -       FN(get_listener_sock),
> +       FN(get_listener_sock),          \
> +       FN(skb_set_inner_header),
>
>  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
>   * function eBPF program intends to call
> @@ -2595,6 +2612,10 @@ enum bpf_adj_room_mode {
>         BPF_ADJ_ROOM_NET,
>  };
>
> +/* BPF_FUNC_skb_set_inner_header flags. */
> +#define        BPF_F_L2_INNER_OFFSET           (1ULL << 0)
> +#define        BPF_F_L3_INNER_OFFSET           (1ULL << 1)
> +
>  /* Mode for BPF_FUNC_skb_load_bytes_relative helper. */
>  enum bpf_hdr_start_off {
>         BPF_HDR_START_MAC,
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 647c63a..f8265f3 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -5463,6 +5463,68 @@ u32 bpf_tcp_sock_convert_ctx_access(enum bpf_access_type type,
>  };
>  #endif /* CONFIG_INET */
>
> +BPF_CALL_5(bpf_skb_set_inner_header, struct sk_buff *, skb,
> +          __be16, outer_proto, __be16, inner_proto, u32, inner_offset,
> +          u64, flags)
> +{
> +       unsigned int gso_type;
> +
> +       if (unlikely(inner_offset > skb->len))
> +               return -EINVAL;
> +
> +       if (unlikely(flags & ~(BPF_F_L2_INNER_OFFSET | BPF_F_L3_INNER_OFFSET)))
> +               return -EINVAL;
> +
> +       switch (outer_proto) {
> +       case IPPROTO_UDP:
> +               gso_type = SKB_GSO_UDP_TUNNEL;
> +               break;
> +       case IPPROTO_GRE:
> +               gso_type = SKB_GSO_GRE;
> +               break;
> +       default:
> +               return -ENOTSUPP;
> +       }
> +
> +       switch (inner_proto) {
> +       case ETH_P_IP:
> +       case ETH_P_IPV6:
> +               break;
> +       default:
> +               return -ENOTSUPP;
> +       }
> +
> +       if (inner_offset == 0) {
> +               skb->encapsulation = 0;
> +               skb_reset_inner_headers(skb);
> +               return 0;
> +       }
> +
> +       if (flags & BPF_F_L2_INNER_OFFSET)
> +               skb_set_inner_mac_header(skb, inner_offset);
> +       if (flags & BPF_F_L3_INNER_OFFSET)
> +               skb_set_inner_network_header(skb, inner_offset);
> +
> +       skb_set_inner_protocol(skb, htons(inner_proto));
> +       skb->encapsulation = 1;
> +
> +       if (skb_is_gso(skb))
> +               skb_shinfo(skb)->gso_type |= gso_type;
> +
> +       return 0;
> +}
> +
> +static const struct bpf_func_proto bpf_skb_set_inner_header_proto = {
> +       .func           = bpf_skb_set_inner_header,
> +       .gpl_only       = false,
> +       .ret_type       = RET_INTEGER,
> +       .arg1_type      = ARG_PTR_TO_CTX,
> +       .arg2_type      = ARG_ANYTHING,
> +       .arg3_type      = ARG_ANYTHING,
> +       .arg4_type      = ARG_ANYTHING,
> +       .arg5_type      = ARG_ANYTHING,
> +};
> +
>  bool bpf_helper_changes_pkt_data(void *func)
>  {
>         if (func == bpf_skb_vlan_push ||
> @@ -5720,6 +5782,8 @@ bool bpf_helper_changes_pkt_data(void *func)
>         case BPF_FUNC_get_listener_sock:
>                 return &bpf_get_listener_sock_proto;
>  #endif
> +       case BPF_FUNC_skb_set_inner_header:
> +               return &bpf_skb_set_inner_header_proto;
>         default:
>                 return bpf_base_func_proto(func_id);
>         }
> --
> 1.8.3.1
>
>
> > Signed-off-by: Willem de Bruijn <willemb@google.com>
> > ---
> >  include/uapi/linux/bpf.h | 14 +++++++++-
> >  net/core/filter.c        | 58 +++++++++++++++++++++++++++++++++++++---
> >  2 files changed, 67 insertions(+), 5 deletions(-)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 0eda8f564a381..a444534cc88d7 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -2595,7 +2595,19 @@ enum bpf_func_id {
> >
> >  /* BPF_FUNC_skb_adjust_room flags. */
> >  #define BPF_F_ADJ_ROOM_FIXED_GSO     (1ULL << 0)
> > -#define BPF_F_ADJ_ROOM_MASK          (BPF_F_ADJ_ROOM_FIXED_GSO)
> > +
> > +#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 (1ULL << 1)
> > +#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV6 (1ULL << 2)
> > +#define BPF_F_ADJ_ROOM_ENCAP_L3_MASK (BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 | \
> > +                                      BPF_F_ADJ_ROOM_ENCAP_L3_IPV6)
> > +
> > +#define BPF_F_ADJ_ROOM_ENCAP_L4_GRE  (1ULL << 3)
> > +#define BPF_F_ADJ_ROOM_ENCAP_L4_UDP  (1ULL << 4)
> > +
> > +#define BPF_F_ADJ_ROOM_MASK          (BPF_F_ADJ_ROOM_FIXED_GSO | \
> > +                                      BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
> > +                                      BPF_F_ADJ_ROOM_ENCAP_L4_GRE | \
> > +                                      BPF_F_ADJ_ROOM_ENCAP_L4_UDP)
> >
> >  /* Mode for BPF_FUNC_skb_adjust_room helper. */
> >  enum bpf_adj_room_mode {
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index e346e48098000..6007b0b4bc0d7 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -2966,6 +2966,9 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
> >  static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
> >                           u64 flags)
> >  {
> > +     bool encap = flags & BPF_F_ADJ_ROOM_ENCAP_L3_MASK;
> > +     unsigned int gso_type = SKB_GSO_DODGY;
> > +     u16 mac_len, inner_net, inner_trans;
> >       int ret;
> >
> >       if (skb_is_gso(skb) && !skb_is_gso_tcp(skb)) {
> > @@ -2979,10 +2982,60 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
> >       if (unlikely(ret < 0))
> >               return ret;
> >
> > +     if (encap) {
> > +             if (skb->protocol != htons(ETH_P_IP) &&
> > +                 skb->protocol != htons(ETH_P_IPV6))
> > +                     return -ENOTSUPP;
> > +
> > +             if (flags & BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 &&
> > +                 flags & BPF_F_ADJ_ROOM_ENCAP_L3_IPV6)
> > +                     return -EINVAL;
> > +
> > +             if (flags & BPF_F_ADJ_ROOM_ENCAP_L4_GRE &&
> > +                 flags & BPF_F_ADJ_ROOM_ENCAP_L4_UDP)
> > +                     return -EINVAL;
> > +
> > +             if (skb->encapsulation)
> > +                     return -EALREADY;
> > +
> > +             mac_len = skb->network_header - skb->mac_header;
> > +             inner_net = skb->network_header;
> > +             inner_trans = skb->transport_header;
> > +     }
> > +
> >       ret = bpf_skb_net_hdr_push(skb, off, len_diff);
> >       if (unlikely(ret < 0))
> >               return ret;
> >
> > +     if (encap) {
> > +             /* inner mac == inner_net on l3 encap */
> > +             skb->inner_mac_header = inner_net;
> > +             skb->inner_network_header = inner_net;
> > +             skb->inner_transport_header = inner_trans;
> > +             skb_set_inner_protocol(skb, skb->protocol);
> > +
> > +             skb->encapsulation = 1;
> > +             skb_set_network_header(skb, mac_len);
> > +
> > +             if (flags & BPF_F_ADJ_ROOM_ENCAP_L4_UDP)
> > +                     gso_type |= SKB_GSO_UDP_TUNNEL;
> > +             else if (flags & BPF_F_ADJ_ROOM_ENCAP_L4_GRE)
> > +                     gso_type |= SKB_GSO_GRE;
> > +             else if (flags & BPF_F_ADJ_ROOM_ENCAP_L3_IPV6)
> > +                     gso_type |= SKB_GSO_IPXIP6;
> > +             else
> > +                     gso_type |= SKB_GSO_IPXIP4;
> > +
> > +             if (flags & BPF_F_ADJ_ROOM_ENCAP_L4_GRE ||
> > +                 flags & BPF_F_ADJ_ROOM_ENCAP_L4_UDP) {
> > +                     int nh_len = flags & BPF_F_ADJ_ROOM_ENCAP_L3_IPV6 ?
> > +                                     sizeof(struct ipv6hdr) :
> > +                                     sizeof(struct iphdr);
> > +
> > +                     skb_set_transport_header(skb, mac_len + nh_len);
> > +             }
> > +     }
> > +
> >       if (skb_is_gso(skb)) {
> >               struct skb_shared_info *shinfo = skb_shinfo(skb);
> >
> > @@ -2991,7 +3044,7 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
> >                       skb_decrease_gso_size(shinfo, len_diff);
> >
> >               /* Header must be checked, and gso_segs recomputed. */
> > -             shinfo->gso_type |= SKB_GSO_DODGY;
> > +             shinfo->gso_type |= gso_type;
> >               shinfo->gso_segs = 0;
> >       }
> >
> > @@ -3042,7 +3095,6 @@ static u32 __bpf_skb_max_len(const struct sk_buff *skb)
> >  BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
> >          u32, mode, u64, flags)
> >  {
> > -     bool trans_same = skb->transport_header == skb->network_header;
> >       u32 len_cur, len_diff_abs = abs(len_diff);
> >       u32 len_min = bpf_skb_net_base_len(skb);
> >       u32 len_max = __bpf_skb_max_len(skb);
> > @@ -3071,8 +3123,6 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
> >       }
> >
> >       len_cur = skb->len - skb_network_offset(skb);
> > -     if (skb_transport_header_was_set(skb) && !trans_same)
> > -             len_cur = skb_network_header_len(skb);
> >       if ((shrink && (len_diff_abs >= len_cur ||
> >                       len_cur - len_diff_abs < len_min)) ||
> >           (!shrink && (skb->len + len_diff_abs > len_max &&
> > --
> > 2.21.0.225.g810b269d1ac-goog
> >
> >

* Re: [PATCH bpf-next 09/13] bpf: add bpf_skb_adjust_room encap flags
  2019-03-20 14:49 ` [PATCH bpf-next 09/13] bpf: add bpf_skb_adjust_room encap flags Willem de Bruijn
  2019-03-20 15:51   ` Alan Maguire
@ 2019-03-21  3:13   ` Alexei Starovoitov
  2019-03-21 13:25     ` Willem de Bruijn
  1 sibling, 1 reply; 21+ messages in thread
From: Alexei Starovoitov @ 2019-03-21  3:13 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: netdev, ast, daniel, sdf, posk, Willem de Bruijn

On Wed, Mar 20, 2019 at 10:49:40AM -0400, Willem de Bruijn wrote:
> From: Willem de Bruijn <willemb@google.com>
> 
> When pushing tunnel headers, annotate skbs in the same way as tunnel
> devices.
> 
> For GSO packets, the network stack requires certain fields set to
> segment packets with tunnel headers. gre_gso_segment depends on
> transport and inner mac header, for instance.
> 
> Add an option to pass this information.
> 
> Remove the restriction on len_diff to network header length, which
> is too short, e.g., for GRE protocols.
> 
> Signed-off-by: Willem de Bruijn <willemb@google.com>
> ---
>  include/uapi/linux/bpf.h | 14 +++++++++-
>  net/core/filter.c        | 58 +++++++++++++++++++++++++++++++++++++---
>  2 files changed, 67 insertions(+), 5 deletions(-)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 0eda8f564a381..a444534cc88d7 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -2595,7 +2595,19 @@ enum bpf_func_id {
>  
>  /* BPF_FUNC_skb_adjust_room flags. */
>  #define BPF_F_ADJ_ROOM_FIXED_GSO	(1ULL << 0)
> -#define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO)
> +
> +#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV4	(1ULL << 1)
> +#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV6	(1ULL << 2)
> +#define BPF_F_ADJ_ROOM_ENCAP_L3_MASK	(BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 | \
> +					 BPF_F_ADJ_ROOM_ENCAP_L3_IPV6)
> +
> +#define BPF_F_ADJ_ROOM_ENCAP_L4_GRE	(1ULL << 3)
> +#define BPF_F_ADJ_ROOM_ENCAP_L4_UDP	(1ULL << 4)
> +
> +#define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO | \
> +					 BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
> +					 BPF_F_ADJ_ROOM_ENCAP_L4_GRE | \
> +					 BPF_F_ADJ_ROOM_ENCAP_L4_UDP)

The patches look great.
One small nit. Please move *_MASK out of uapi.
When flags are added, this uapi macro keeps changing,
which may break user space compiled against an older macro.
Essentially user space cannot rely on this macro and cannot use it.
I think it's best to leave this macro for kernel internals.


* Re: [PATCH bpf-next 09/13] bpf: add bpf_skb_adjust_room encap flags
  2019-03-21  3:13   ` Alexei Starovoitov
@ 2019-03-21 13:25     ` Willem de Bruijn
  0 siblings, 0 replies; 21+ messages in thread
From: Willem de Bruijn @ 2019-03-21 13:25 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Network Development, Alexei Starovoitov, Daniel Borkmann,
	Stanislav Fomichev, Peter Oskolkov, Willem de Bruijn

On Wed, Mar 20, 2019 at 11:13 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Mar 20, 2019 at 10:49:40AM -0400, Willem de Bruijn wrote:
> > From: Willem de Bruijn <willemb@google.com>
> >
> > When pushing tunnel headers, annotate skbs in the same way as tunnel
> > devices.
> >
> > For GSO packets, the network stack requires certain fields set to
> > segment packets with tunnel headers. gre_gso_segment depends on
> > transport and inner mac header, for instance.
> >
> > Add an option to pass this information.
> >
> > Remove the restriction on len_diff to network header length, which
> > is too short, e.g., for GRE protocols.
> >
> > Signed-off-by: Willem de Bruijn <willemb@google.com>
> > ---
> >  include/uapi/linux/bpf.h | 14 +++++++++-
> >  net/core/filter.c        | 58 +++++++++++++++++++++++++++++++++++++---
> >  2 files changed, 67 insertions(+), 5 deletions(-)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 0eda8f564a381..a444534cc88d7 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -2595,7 +2595,19 @@ enum bpf_func_id {
> >
> >  /* BPF_FUNC_skb_adjust_room flags. */
> >  #define BPF_F_ADJ_ROOM_FIXED_GSO     (1ULL << 0)
> > -#define BPF_F_ADJ_ROOM_MASK          (BPF_F_ADJ_ROOM_FIXED_GSO)
> > +
> > +#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 (1ULL << 1)
> > +#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV6 (1ULL << 2)
> > +#define BPF_F_ADJ_ROOM_ENCAP_L3_MASK (BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 | \
> > +                                      BPF_F_ADJ_ROOM_ENCAP_L3_IPV6)
> > +
> > +#define BPF_F_ADJ_ROOM_ENCAP_L4_GRE  (1ULL << 3)
> > +#define BPF_F_ADJ_ROOM_ENCAP_L4_UDP  (1ULL << 4)
> > +
> > +#define BPF_F_ADJ_ROOM_MASK          (BPF_F_ADJ_ROOM_FIXED_GSO | \
> > +                                      BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
> > +                                      BPF_F_ADJ_ROOM_ENCAP_L4_GRE | \
> > +                                      BPF_F_ADJ_ROOM_ENCAP_L4_UDP)
>
> The patches looks great.
> One small nit. Please move *_MASK out of uapi.
> When flags are added this uapi macro keeps changing
> which may break user space compiled with older macro.

Ah, good point. Thanks, Alexei. Will fix in v2.
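
Something along these lines, keeping the mask next to the helper in
net/core/filter.c and leaving only the individual flag bits in the
uapi header (sketch):

/* net/core/filter.c */
#define BPF_F_ADJ_ROOM_MASK             (BPF_F_ADJ_ROOM_FIXED_GSO | \
                                         BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 | \
                                         BPF_F_ADJ_ROOM_ENCAP_L3_IPV6 | \
                                         BPF_F_ADJ_ROOM_ENCAP_L4_GRE | \
                                         BPF_F_ADJ_ROOM_ENCAP_L4_UDP)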


> Essentially user space cannot rely on this macro and cannot use it.
> I think it's best to leave this macro for kernel internals.

* Re: [PATCH bpf-next 08/13] bpf: add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_FIXED_GSO
  2019-03-20 14:49 ` [PATCH bpf-next 08/13] bpf: add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_FIXED_GSO Willem de Bruijn
@ 2019-03-21 13:42   ` Alan Maguire
  2019-03-21 14:00     ` Willem de Bruijn
  0 siblings, 1 reply; 21+ messages in thread
From: Alan Maguire @ 2019-03-21 13:42 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: netdev, ast, daniel, sdf, posk, Willem de Bruijn



On Wed, 20 Mar 2019, Willem de Bruijn wrote:

> From: Willem de Bruijn <willemb@google.com>
> 
> bpf_skb_adjust_room adjusts gso_size of gso packets to account for the
> pushed or popped header room.
> 
> This is not allowed with UDP, where gso_size delineates datagrams. Add
> an option to avoid these updates and allow this call for datagrams.
> 
> It can also be used with TCP, when MSS is known to allow headroom,
> e.g., through MSS clamping or route MTU.
> 
> Link: https://patchwork.ozlabs.org/patch/1052497/
> Signed-off-by: Willem de Bruijn <willemb@google.com>
> ---
>  include/uapi/linux/bpf.h |  4 ++++
>  net/core/filter.c        | 36 +++++++++++++++++++++++++-----------
>  2 files changed, 29 insertions(+), 11 deletions(-)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 4f5c918e6fcf4..0eda8f564a381 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -2593,6 +2593,10 @@ enum bpf_func_id {
>  /* Current network namespace */
>  #define BPF_F_CURRENT_NETNS		(-1L)
>  
> +/* BPF_FUNC_skb_adjust_room flags. */
> +#define BPF_F_ADJ_ROOM_FIXED_GSO	(1ULL << 0)

minor nit - could we add this flag to the documentation for 
bpf_skb_adjust_room? Same suggestion for the encap flags in
patch 9 too. Thanks!
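
For example, a couple of lines along these lines in the
bpf_skb_adjust_room description (wording just a suggestion):

 *              * **BPF_F_ADJ_ROOM_FIXED_GSO**: Do not adjust gso_size.
 *                Adjusting mss in this way is not allowed for datagrams,
 *                where gso_size delineates packet boundaries.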

Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
 
> +#define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO)
> +
>  /* Mode for BPF_FUNC_skb_adjust_room helper. */
>  enum bpf_adj_room_mode {
>  	BPF_ADJ_ROOM_NET,
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 1a0cf578b4502..e346e48098000 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -2963,12 +2963,17 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
>  	}
>  }
>  
> -static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff)
> +static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
> +			    u64 flags)
>  {
>  	int ret;
>  
> -	if (skb_is_gso(skb) && !skb_is_gso_tcp(skb))
> -		return -ENOTSUPP;
> +	if (skb_is_gso(skb) && !skb_is_gso_tcp(skb)) {
> +		/* udp gso_size delineates datagrams, only allow if fixed */
> +		if (!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4) ||
> +		    !(flags & BPF_F_ADJ_ROOM_FIXED_GSO))
> +			return -ENOTSUPP;
> +	}
>  
>  	ret = skb_cow_head(skb, len_diff);
>  	if (unlikely(ret < 0))
> @@ -2982,7 +2987,9 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff)
>  		struct skb_shared_info *shinfo = skb_shinfo(skb);
>  
>  		/* Due to header grow, MSS needs to be downgraded. */
> -		skb_decrease_gso_size(shinfo, len_diff);
> +		if (!(flags & BPF_F_ADJ_ROOM_FIXED_GSO))
> +			skb_decrease_gso_size(shinfo, len_diff);
> +
>  		/* Header must be checked, and gso_segs recomputed. */
>  		shinfo->gso_type |= SKB_GSO_DODGY;
>  		shinfo->gso_segs = 0;
> @@ -2991,12 +2998,17 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff)
>  	return 0;
>  }
>  
> -static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff)
> +static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff,
> +			      u64 flags)
>  {
>  	int ret;
>  
> -	if (skb_is_gso(skb) && !skb_is_gso_tcp(skb))
> -		return -ENOTSUPP;
> +	if (skb_is_gso(skb) && !skb_is_gso_tcp(skb)) {
> +		/* udp gso_size delineates datagrams, only allow if fixed */
> +		if (!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4) ||
> +		    !(flags & BPF_F_ADJ_ROOM_FIXED_GSO))
> +			return -ENOTSUPP;
> +	}
>  
>  	ret = skb_unclone(skb, GFP_ATOMIC);
>  	if (unlikely(ret < 0))
> @@ -3010,7 +3022,9 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff)
>  		struct skb_shared_info *shinfo = skb_shinfo(skb);
>  
>  		/* Due to header shrink, MSS can be upgraded. */
> -		skb_increase_gso_size(shinfo, len_diff);
> +		if (!(flags & BPF_F_ADJ_ROOM_FIXED_GSO))
> +			skb_increase_gso_size(shinfo, len_diff);
> +
>  		/* Header must be checked, and gso_segs recomputed. */
>  		shinfo->gso_type |= SKB_GSO_DODGY;
>  		shinfo->gso_segs = 0;
> @@ -3037,7 +3051,7 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>  	u32 off;
>  	int ret;
>  
> -	if (unlikely(flags))
> +	if (unlikely(flags & ~BPF_F_ADJ_ROOM_MASK))
>  		return -EINVAL;
>  	if (unlikely(len_diff_abs > 0xfffU))
>  		return -EFAULT;
> @@ -3065,8 +3079,8 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>  			 !skb_is_gso(skb))))
>  		return -ENOTSUPP;
>  
> -	ret = shrink ? bpf_skb_net_shrink(skb, off, len_diff_abs) :
> -		       bpf_skb_net_grow(skb, off, len_diff_abs);
> +	ret = shrink ? bpf_skb_net_shrink(skb, off, len_diff_abs, flags) :
> +		       bpf_skb_net_grow(skb, off, len_diff_abs, flags);
>  
>  	bpf_compute_data_pointers(skb);
>  	return ret;
> -- 
> 2.21.0.225.g810b269d1ac-goog
> 
> 

* Re: [PATCH bpf-next 08/13] bpf: add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_FIXED_GSO
  2019-03-21 13:42   ` Alan Maguire
@ 2019-03-21 14:00     ` Willem de Bruijn
  0 siblings, 0 replies; 21+ messages in thread
From: Willem de Bruijn @ 2019-03-21 14:00 UTC (permalink / raw)
  To: Alan Maguire
  Cc: Network Development, Alexei Starovoitov, Daniel Borkmann,
	Stanislav Fomichev, Peter Oskolkov, Willem de Bruijn

On Thu, Mar 21, 2019 at 9:43 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
>
>
> On Wed, 20 Mar 2019, Willem de Bruijn wrote:
>
> > From: Willem de Bruijn <willemb@google.com>
> >
> > bpf_skb_adjust_room adjusts gso_size of gso packets to account for the
> > pushed or popped header room.
> >
> > This is not allowed with UDP, where gso_size delineates datagrams. Add
> > an option to avoid these updates and allow this call for datagrams.
> >
> > It can also be used with TCP, when MSS is known to allow headroom,
> > e.g., through MSS clamping or route MTU.
> >
> > Link: https://patchwork.ozlabs.org/patch/1052497/
> > Signed-off-by: Willem de Bruijn <willemb@google.com>
> > ---
> >  include/uapi/linux/bpf.h |  4 ++++
> >  net/core/filter.c        | 36 +++++++++++++++++++++++++-----------
> >  2 files changed, 29 insertions(+), 11 deletions(-)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 4f5c918e6fcf4..0eda8f564a381 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -2593,6 +2593,10 @@ enum bpf_func_id {
> >  /* Current network namespace */
> >  #define BPF_F_CURRENT_NETNS          (-1L)
> >
> > +/* BPF_FUNC_skb_adjust_room flags. */
> > +#define BPF_F_ADJ_ROOM_FIXED_GSO     (1ULL << 0)
>
> minor nit - could we add this flag to the documentation for
> bpf_skb_adjust_room? Same suggestion for the encap flags in
> patch 8 too. Thanks!
>
> Reviewed-by: Alan Maguire <alan.maguire@oracle.com>

Definitely, will do in v2. Thanks.

