All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf-next v2 0/7] bpf/flow_dissector: support input flags
@ 2019-07-25 15:33 Stanislav Fomichev
  2019-07-25 15:33 ` [PATCH bpf-next v2 1/7] bpf/flow_dissector: pass input flags to BPF flow dissector program Stanislav Fomichev
                   ` (7 more replies)
  0 siblings, 8 replies; 12+ messages in thread
From: Stanislav Fomichev @ 2019-07-25 15:33 UTC (permalink / raw)
  To: netdev, bpf
  Cc: davem, ast, daniel, Stanislav Fomichev, Song Liu,
	Willem de Bruijn, Petar Penkov

C flow dissector supports input flags that tell it to customize parsing
by either stopping early or trying to parse as deep as possible.
BPF flow dissector always parses as deep as possible which is sub-optimal.
Pass input flags to the BPF flow dissector as well so it can make the same
decisions.

Series outline:
* remove unused FLOW_DISSECTOR_F_STOP_AT_L3 flag
* export FLOW_DISSECTOR_F_XXX flags as uapi and pass them to BPF
  flow dissector
* add documentation for the export flags
* support input flags in BPF_PROG_TEST_RUN via ctx_{in,out}
* sync uapi to tools
* support FLOW_DISSECTOR_F_PARSE_1ST_FRAG in selftest
* support FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL in kernel and selftest
* support FLOW_DISSECTOR_F_STOP_AT_ENCAP in selftest

Pros:
* makes BPF flow dissector faster by avoiding burning extra cycles
* existing BPF progs continue to work by ignoring the flags and always
  parsing as deep as possible

Cons:
* new UAPI which we need to support (OTOH, if we need to deprecate some
  flags, we can just stop setting them upon calling BPF programs)

Some numbers (with .repeat = 4000000 in test_flow_dissector):
        test_flow_dissector:PASS:ipv4-frag 35 nsec
        test_flow_dissector:PASS:ipv4-frag 35 nsec
        test_flow_dissector:PASS:ipv4-no-frag 32 nsec
        test_flow_dissector:PASS:ipv4-no-frag 32 nsec

        test_flow_dissector:PASS:ipv6-frag 39 nsec
        test_flow_dissector:PASS:ipv6-frag 39 nsec
        test_flow_dissector:PASS:ipv6-no-frag 36 nsec
        test_flow_dissector:PASS:ipv6-no-frag 36 nsec

        test_flow_dissector:PASS:ipv6-flow-label 36 nsec
        test_flow_dissector:PASS:ipv6-flow-label 36 nsec
        test_flow_dissector:PASS:ipv6-no-flow-label 33 nsec
        test_flow_dissector:PASS:ipv6-no-flow-label 33 nsec

        test_flow_dissector:PASS:ipip-encap 38 nsec
        test_flow_dissector:PASS:ipip-encap 38 nsec
        test_flow_dissector:PASS:ipip-no-encap 32 nsec
        test_flow_dissector:PASS:ipip-no-encap 32 nsec

The improvement is around 10%, but it's in a tight cache-hot
BPF_PROG_TEST_RUN loop.

Cc: Song Liu <songliubraving@fb.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Petar Penkov <ppenkov@google.com>

Stanislav Fomichev (7):
  bpf/flow_dissector: pass input flags to BPF flow dissector program
  bpf/flow_dissector: document flags
  bpf/flow_dissector: support flags in BPF_PROG_TEST_RUN
  tools/bpf: sync bpf_flow_keys flags
  selftests/bpf: support FLOW_DISSECTOR_F_PARSE_1ST_FRAG
  bpf/flow_dissector: support ipv6 flow_label and
    FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL
  selftests/bpf: support FLOW_DISSECTOR_F_STOP_AT_ENCAP

 Documentation/bpf/prog_flow_dissector.rst     |  18 ++
 include/linux/skbuff.h                        |   2 +-
 include/net/flow_dissector.h                  |   4 -
 include/uapi/linux/bpf.h                      |   6 +
 net/bpf/test_run.c                            |  39 ++-
 net/core/flow_dissector.c                     |  14 +-
 tools/include/uapi/linux/bpf.h                |   6 +
 .../selftests/bpf/prog_tests/flow_dissector.c | 242 +++++++++++++++++-
 tools/testing/selftests/bpf/progs/bpf_flow.c  |  46 +++-
 9 files changed, 359 insertions(+), 18 deletions(-)

-- 
2.22.0.657.g960e92d24f-goog

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH bpf-next v2 1/7] bpf/flow_dissector: pass input flags to BPF flow dissector program
  2019-07-25 15:33 [PATCH bpf-next v2 0/7] bpf/flow_dissector: support input flags Stanislav Fomichev
@ 2019-07-25 15:33 ` Stanislav Fomichev
  2019-07-25 19:58   ` Alexei Starovoitov
  2019-07-25 15:33 ` [PATCH bpf-next v2 2/7] bpf/flow_dissector: document flags Stanislav Fomichev
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 12+ messages in thread
From: Stanislav Fomichev @ 2019-07-25 15:33 UTC (permalink / raw)
  To: netdev, bpf
  Cc: davem, ast, daniel, Stanislav Fomichev, Willem de Bruijn,
	Song Liu, Petar Penkov

C flow dissector supports input flags that tell it to customize parsing
by either stopping early or trying to parse as deep as possible. Pass
those flags to the BPF flow dissector so it can make the same
decisions. In the next commits I'll add support for those flags to
our reference bpf_flow.c

Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Petar Penkov <ppenkov@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 include/linux/skbuff.h       | 2 +-
 include/net/flow_dissector.h | 4 ----
 include/uapi/linux/bpf.h     | 5 +++++
 net/bpf/test_run.c           | 2 +-
 net/core/flow_dissector.c    | 5 +++--
 5 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 718742b1c505..9b7a8038beec 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1271,7 +1271,7 @@ static inline int skb_flow_dissector_bpf_prog_detach(const union bpf_attr *attr)
 
 struct bpf_flow_dissector;
 bool bpf_flow_dissect(struct bpf_prog *prog, struct bpf_flow_dissector *ctx,
-		      __be16 proto, int nhoff, int hlen);
+		      __be16 proto, int nhoff, int hlen, unsigned int flags);
 
 bool __skb_flow_dissect(const struct net *net,
 			const struct sk_buff *skb,
diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index 90bd210be060..3e2642587b76 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -253,10 +253,6 @@ enum flow_dissector_key_id {
 	FLOW_DISSECTOR_KEY_MAX,
 };
 
-#define FLOW_DISSECTOR_F_PARSE_1ST_FRAG		BIT(0)
-#define FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL	BIT(1)
-#define FLOW_DISSECTOR_F_STOP_AT_ENCAP		BIT(2)
-
 struct flow_dissector_key {
 	enum flow_dissector_key_id key_id;
 	size_t offset; /* offset of struct flow_dissector_key_*
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index fa1c753dcdbc..b4ad19bd6aa8 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -3507,6 +3507,10 @@ enum bpf_task_fd_type {
 	BPF_FD_TYPE_URETPROBE,		/* filename + offset */
 };
 
+#define FLOW_DISSECTOR_F_PARSE_1ST_FRAG		(1U << 0)
+#define FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL	(1U << 1)
+#define FLOW_DISSECTOR_F_STOP_AT_ENCAP		(1U << 2)
+
 struct bpf_flow_keys {
 	__u16	nhoff;
 	__u16	thoff;
@@ -3528,6 +3532,7 @@ struct bpf_flow_keys {
 			__u32	ipv6_dst[4];	/* in6_addr; network order */
 		};
 	};
+	__u32	flags;
 };
 
 struct bpf_func_info {
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 80e6f3a6864d..4e41d15a1098 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -419,7 +419,7 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
 	time_start = ktime_get_ns();
 	for (i = 0; i < repeat; i++) {
 		retval = bpf_flow_dissect(prog, &ctx, eth->h_proto, ETH_HLEN,
-					  size);
+					  size, 0);
 
 		if (signal_pending(current)) {
 			preempt_enable();
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 3e6fedb57bc1..a74c4ed1b30d 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -784,7 +784,7 @@ static void __skb_flow_bpf_to_target(const struct bpf_flow_keys *flow_keys,
 }
 
 bool bpf_flow_dissect(struct bpf_prog *prog, struct bpf_flow_dissector *ctx,
-		      __be16 proto, int nhoff, int hlen)
+		      __be16 proto, int nhoff, int hlen, unsigned int flags)
 {
 	struct bpf_flow_keys *flow_keys = ctx->flow_keys;
 	u32 result;
@@ -794,6 +794,7 @@ bool bpf_flow_dissect(struct bpf_prog *prog, struct bpf_flow_dissector *ctx,
 	flow_keys->n_proto = proto;
 	flow_keys->nhoff = nhoff;
 	flow_keys->thoff = flow_keys->nhoff;
+	flow_keys->flags = flags;
 
 	preempt_disable();
 	result = BPF_PROG_RUN(prog, ctx);
@@ -914,7 +915,7 @@ bool __skb_flow_dissect(const struct net *net,
 			}
 
 			ret = bpf_flow_dissect(attached, &ctx, n_proto, nhoff,
-					       hlen);
+					       hlen, flags);
 			__skb_flow_bpf_to_target(&flow_keys, flow_dissector,
 						 target_container);
 			rcu_read_unlock();
-- 
2.22.0.657.g960e92d24f-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH bpf-next v2 2/7] bpf/flow_dissector: document flags
  2019-07-25 15:33 [PATCH bpf-next v2 0/7] bpf/flow_dissector: support input flags Stanislav Fomichev
  2019-07-25 15:33 ` [PATCH bpf-next v2 1/7] bpf/flow_dissector: pass input flags to BPF flow dissector program Stanislav Fomichev
@ 2019-07-25 15:33 ` Stanislav Fomichev
  2019-07-25 15:33 ` [PATCH bpf-next v2 3/7] bpf/flow_dissector: support flags in BPF_PROG_TEST_RUN Stanislav Fomichev
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Stanislav Fomichev @ 2019-07-25 15:33 UTC (permalink / raw)
  To: netdev, bpf
  Cc: davem, ast, daniel, Stanislav Fomichev, Willem de Bruijn,
	Song Liu, Petar Penkov

Describe what each input flag does and who uses it.

Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Petar Penkov <ppenkov@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 Documentation/bpf/prog_flow_dissector.rst | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/Documentation/bpf/prog_flow_dissector.rst b/Documentation/bpf/prog_flow_dissector.rst
index ed343abe541e..0f3f380b2ce4 100644
--- a/Documentation/bpf/prog_flow_dissector.rst
+++ b/Documentation/bpf/prog_flow_dissector.rst
@@ -26,6 +26,7 @@ and output arguments.
   * ``nhoff`` - initial offset of the networking header
   * ``thoff`` - initial offset of the transport header, initialized to nhoff
   * ``n_proto`` - L3 protocol type, parsed out of L2 header
+  * ``flags`` - optional flags
 
 Flow dissector BPF program should fill out the rest of the ``struct
 bpf_flow_keys`` fields. Input arguments ``nhoff/thoff/n_proto`` should be
@@ -101,6 +102,23 @@ can be called for both cases and would have to be written carefully to
 handle both cases.
 
 
+Flags
+=====
+
+``flow_keys->flags`` might contain optional input flags that work as follows:
+
+* ``FLOW_DISSECTOR_F_PARSE_1ST_FRAG`` - tells BPF flow dissector to continue
+  parsing first fragment; the default expected behavior is that flow dissector
+  returns as soon as it finds out that the packet is fragmented;
+  used by ``eth_get_headlen`` to estimate length of all headers for GRO.
+* ``FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL`` - tells BPF flow dissector to stop
+  parsing as soon as it reaches IPv6 flow label; used by ``___skb_get_hash``
+  and ``__skb_get_hash_symmetric`` to get flow hash.
+* ``FLOW_DISSECTOR_F_STOP_AT_ENCAP`` - tells BPF flow dissector to stop
+  parsing as soon as it reaches encapsulated headers; used by routing
+  infrastructure.
+
+
 Reference Implementation
 ========================
 
-- 
2.22.0.657.g960e92d24f-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH bpf-next v2 3/7] bpf/flow_dissector: support flags in BPF_PROG_TEST_RUN
  2019-07-25 15:33 [PATCH bpf-next v2 0/7] bpf/flow_dissector: support input flags Stanislav Fomichev
  2019-07-25 15:33 ` [PATCH bpf-next v2 1/7] bpf/flow_dissector: pass input flags to BPF flow dissector program Stanislav Fomichev
  2019-07-25 15:33 ` [PATCH bpf-next v2 2/7] bpf/flow_dissector: document flags Stanislav Fomichev
@ 2019-07-25 15:33 ` Stanislav Fomichev
  2019-07-25 15:33 ` [PATCH bpf-next v2 4/7] tools/bpf: sync bpf_flow_keys flags Stanislav Fomichev
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Stanislav Fomichev @ 2019-07-25 15:33 UTC (permalink / raw)
  To: netdev, bpf
  Cc: davem, ast, daniel, Stanislav Fomichev, Willem de Bruijn,
	Song Liu, Petar Penkov

This will allow us to write tests for those flags.

v2:
* Swap kfree(data) and kfree(user_ctx) (Song Liu)

Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Petar Penkov <ppenkov@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 net/bpf/test_run.c | 39 +++++++++++++++++++++++++++++++++++----
 1 file changed, 35 insertions(+), 4 deletions(-)

diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 4e41d15a1098..1153bbcdff72 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -377,6 +377,22 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
 	return ret;
 }
 
+static int verify_user_bpf_flow_keys(struct bpf_flow_keys *ctx)
+{
+	/* make sure the fields we don't use are zeroed */
+	if (!range_is_zero(ctx, 0, offsetof(struct bpf_flow_keys, flags)))
+		return -EINVAL;
+
+	/* flags is allowed */
+
+	if (!range_is_zero(ctx, offsetof(struct bpf_flow_keys, flags) +
+			   FIELD_SIZEOF(struct bpf_flow_keys, flags),
+			   sizeof(struct bpf_flow_keys)))
+		return -EINVAL;
+
+	return 0;
+}
+
 int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
 				     const union bpf_attr *kattr,
 				     union bpf_attr __user *uattr)
@@ -384,9 +400,11 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
 	u32 size = kattr->test.data_size_in;
 	struct bpf_flow_dissector ctx = {};
 	u32 repeat = kattr->test.repeat;
+	struct bpf_flow_keys *user_ctx;
 	struct bpf_flow_keys flow_keys;
 	u64 time_start, time_spent = 0;
 	const struct ethhdr *eth;
+	unsigned int flags = 0;
 	u32 retval, duration;
 	void *data;
 	int ret;
@@ -395,9 +413,6 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
 	if (prog->type != BPF_PROG_TYPE_FLOW_DISSECTOR)
 		return -EINVAL;
 
-	if (kattr->test.ctx_in || kattr->test.ctx_out)
-		return -EINVAL;
-
 	if (size < ETH_HLEN)
 		return -EINVAL;
 
@@ -410,6 +425,18 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
 	if (!repeat)
 		repeat = 1;
 
+	user_ctx = bpf_ctx_init(kattr, sizeof(struct bpf_flow_keys));
+	if (IS_ERR(user_ctx)) {
+		kfree(data);
+		return PTR_ERR(user_ctx);
+	}
+	if (user_ctx) {
+		ret = verify_user_bpf_flow_keys(user_ctx);
+		if (ret)
+			goto out;
+		flags = user_ctx->flags;
+	}
+
 	ctx.flow_keys = &flow_keys;
 	ctx.data = data;
 	ctx.data_end = (__u8 *)data + size;
@@ -419,7 +446,7 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
 	time_start = ktime_get_ns();
 	for (i = 0; i < repeat; i++) {
 		retval = bpf_flow_dissect(prog, &ctx, eth->h_proto, ETH_HLEN,
-					  size, 0);
+					  size, flags);
 
 		if (signal_pending(current)) {
 			preempt_enable();
@@ -450,8 +477,12 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
 
 	ret = bpf_test_finish(kattr, uattr, &flow_keys, sizeof(flow_keys),
 			      retval, duration);
+	if (!ret)
+		ret = bpf_ctx_finish(kattr, uattr, user_ctx,
+				     sizeof(struct bpf_flow_keys));
 
 out:
+	kfree(user_ctx);
 	kfree(data);
 	return ret;
 }
-- 
2.22.0.657.g960e92d24f-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH bpf-next v2 4/7] tools/bpf: sync bpf_flow_keys flags
  2019-07-25 15:33 [PATCH bpf-next v2 0/7] bpf/flow_dissector: support input flags Stanislav Fomichev
                   ` (2 preceding siblings ...)
  2019-07-25 15:33 ` [PATCH bpf-next v2 3/7] bpf/flow_dissector: support flags in BPF_PROG_TEST_RUN Stanislav Fomichev
@ 2019-07-25 15:33 ` Stanislav Fomichev
  2019-07-25 15:33 ` [PATCH bpf-next v2 5/7] selftests/bpf: support FLOW_DISSECTOR_F_PARSE_1ST_FRAG Stanislav Fomichev
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Stanislav Fomichev @ 2019-07-25 15:33 UTC (permalink / raw)
  To: netdev, bpf
  Cc: davem, ast, daniel, Stanislav Fomichev, Willem de Bruijn,
	Song Liu, Petar Penkov

Export bpf_flow_keys flags to tools/libbpf/selftests.

Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Petar Penkov <ppenkov@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/include/uapi/linux/bpf.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 4e455018da65..a0e1c891b56f 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -3504,6 +3504,10 @@ enum bpf_task_fd_type {
 	BPF_FD_TYPE_URETPROBE,		/* filename + offset */
 };
 
+#define FLOW_DISSECTOR_F_PARSE_1ST_FRAG		(1U << 0)
+#define FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL	(1U << 1)
+#define FLOW_DISSECTOR_F_STOP_AT_ENCAP		(1U << 2)
+
 struct bpf_flow_keys {
 	__u16	nhoff;
 	__u16	thoff;
@@ -3525,6 +3529,7 @@ struct bpf_flow_keys {
 			__u32	ipv6_dst[4];	/* in6_addr; network order */
 		};
 	};
+	__u32	flags;
 };
 
 struct bpf_func_info {
-- 
2.22.0.657.g960e92d24f-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH bpf-next v2 5/7] selftests/bpf: support FLOW_DISSECTOR_F_PARSE_1ST_FRAG
  2019-07-25 15:33 [PATCH bpf-next v2 0/7] bpf/flow_dissector: support input flags Stanislav Fomichev
                   ` (3 preceding siblings ...)
  2019-07-25 15:33 ` [PATCH bpf-next v2 4/7] tools/bpf: sync bpf_flow_keys flags Stanislav Fomichev
@ 2019-07-25 15:33 ` Stanislav Fomichev
  2019-07-25 17:15   ` Song Liu
  2019-07-25 15:33 ` [PATCH bpf-next v2 6/7] bpf/flow_dissector: support ipv6 flow_label and FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL Stanislav Fomichev
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 12+ messages in thread
From: Stanislav Fomichev @ 2019-07-25 15:33 UTC (permalink / raw)
  To: netdev, bpf
  Cc: davem, ast, daniel, Stanislav Fomichev, Willem de Bruijn,
	Song Liu, Petar Penkov

bpf_flow.c: exit early unless FLOW_DISSECTOR_F_PARSE_1ST_FRAG is passed
in flags. Also, set ip_proto earlier, this makes sure we have correct
value with fragmented packets.

Add selftest cases to test ipv4/ipv6 fragments and skip eth_get_headlen
tests that don't have FLOW_DISSECTOR_F_PARSE_1ST_FRAG flag.

eth_get_headlen calls flow dissector with
FLOW_DISSECTOR_F_PARSE_1ST_FRAG flag so we can't run tests that
have different set of input flags against it.

v2:
 * sefltests -> selftests (Willem de Bruijn)
 * Reword a comment about eth_get_headlen flags (Song Liu)

Acked-by: Willem de Bruijn <willemb@google.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Petar Penkov <ppenkov@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 .../selftests/bpf/prog_tests/flow_dissector.c | 132 +++++++++++++++++-
 tools/testing/selftests/bpf/progs/bpf_flow.c  |  28 +++-
 2 files changed, 153 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/flow_dissector.c b/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
index c938283ac232..f93a115db650 100644
--- a/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
+++ b/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
@@ -5,6 +5,10 @@
 #include <linux/if_tun.h>
 #include <sys/uio.h>
 
+#ifndef IP_MF
+#define IP_MF 0x2000
+#endif
+
 #define CHECK_FLOW_KEYS(desc, got, expected)				\
 	CHECK_ATTR(memcmp(&got, &expected, sizeof(got)) != 0,		\
 	      desc,							\
@@ -49,6 +53,18 @@ struct ipv6_pkt {
 	struct tcphdr tcp;
 } __packed;
 
+struct ipv6_frag_pkt {
+	struct ethhdr eth;
+	struct ipv6hdr iph;
+	struct frag_hdr {
+		__u8 nexthdr;
+		__u8 reserved;
+		__be16 frag_off;
+		__be32 identification;
+	} ipf;
+	struct tcphdr tcp;
+} __packed;
+
 struct dvlan_ipv6_pkt {
 	struct ethhdr eth;
 	__u16 vlan_tci;
@@ -65,9 +81,11 @@ struct test {
 		struct ipv4_pkt ipv4;
 		struct svlan_ipv4_pkt svlan_ipv4;
 		struct ipv6_pkt ipv6;
+		struct ipv6_frag_pkt ipv6_frag;
 		struct dvlan_ipv6_pkt dvlan_ipv6;
 	} pkt;
 	struct bpf_flow_keys keys;
+	__u32 flags;
 };
 
 #define VLAN_HLEN	4
@@ -143,6 +161,102 @@ struct test tests[] = {
 			.n_proto = __bpf_constant_htons(ETH_P_IPV6),
 		},
 	},
+	{
+		.name = "ipv4-frag",
+		.pkt.ipv4 = {
+			.eth.h_proto = __bpf_constant_htons(ETH_P_IP),
+			.iph.ihl = 5,
+			.iph.protocol = IPPROTO_TCP,
+			.iph.tot_len = __bpf_constant_htons(MAGIC_BYTES),
+			.iph.frag_off = __bpf_constant_htons(IP_MF),
+			.tcp.doff = 5,
+			.tcp.source = 80,
+			.tcp.dest = 8080,
+		},
+		.keys = {
+			.flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG,
+			.nhoff = ETH_HLEN,
+			.thoff = ETH_HLEN + sizeof(struct iphdr),
+			.addr_proto = ETH_P_IP,
+			.ip_proto = IPPROTO_TCP,
+			.n_proto = __bpf_constant_htons(ETH_P_IP),
+			.is_frag = true,
+			.is_first_frag = true,
+			.sport = 80,
+			.dport = 8080,
+		},
+		.flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG,
+	},
+	{
+		.name = "ipv4-no-frag",
+		.pkt.ipv4 = {
+			.eth.h_proto = __bpf_constant_htons(ETH_P_IP),
+			.iph.ihl = 5,
+			.iph.protocol = IPPROTO_TCP,
+			.iph.tot_len = __bpf_constant_htons(MAGIC_BYTES),
+			.iph.frag_off = __bpf_constant_htons(IP_MF),
+			.tcp.doff = 5,
+			.tcp.source = 80,
+			.tcp.dest = 8080,
+		},
+		.keys = {
+			.nhoff = ETH_HLEN,
+			.thoff = ETH_HLEN + sizeof(struct iphdr),
+			.addr_proto = ETH_P_IP,
+			.ip_proto = IPPROTO_TCP,
+			.n_proto = __bpf_constant_htons(ETH_P_IP),
+			.is_frag = true,
+			.is_first_frag = true,
+		},
+	},
+	{
+		.name = "ipv6-frag",
+		.pkt.ipv6_frag = {
+			.eth.h_proto = __bpf_constant_htons(ETH_P_IPV6),
+			.iph.nexthdr = IPPROTO_FRAGMENT,
+			.iph.payload_len = __bpf_constant_htons(MAGIC_BYTES),
+			.ipf.nexthdr = IPPROTO_TCP,
+			.tcp.doff = 5,
+			.tcp.source = 80,
+			.tcp.dest = 8080,
+		},
+		.keys = {
+			.flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG,
+			.nhoff = ETH_HLEN,
+			.thoff = ETH_HLEN + sizeof(struct ipv6hdr) +
+				sizeof(struct frag_hdr),
+			.addr_proto = ETH_P_IPV6,
+			.ip_proto = IPPROTO_TCP,
+			.n_proto = __bpf_constant_htons(ETH_P_IPV6),
+			.is_frag = true,
+			.is_first_frag = true,
+			.sport = 80,
+			.dport = 8080,
+		},
+		.flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG,
+	},
+	{
+		.name = "ipv6-no-frag",
+		.pkt.ipv6_frag = {
+			.eth.h_proto = __bpf_constant_htons(ETH_P_IPV6),
+			.iph.nexthdr = IPPROTO_FRAGMENT,
+			.iph.payload_len = __bpf_constant_htons(MAGIC_BYTES),
+			.ipf.nexthdr = IPPROTO_TCP,
+			.tcp.doff = 5,
+			.tcp.source = 80,
+			.tcp.dest = 8080,
+		},
+		.keys = {
+			.nhoff = ETH_HLEN,
+			.thoff = ETH_HLEN + sizeof(struct ipv6hdr) +
+				sizeof(struct frag_hdr),
+			.addr_proto = ETH_P_IPV6,
+			.ip_proto = IPPROTO_TCP,
+			.n_proto = __bpf_constant_htons(ETH_P_IPV6),
+			.is_frag = true,
+			.is_first_frag = true,
+		},
+	},
 };
 
 static int create_tap(const char *ifname)
@@ -225,6 +339,13 @@ void test_flow_dissector(void)
 			.data_size_in = sizeof(tests[i].pkt),
 			.data_out = &flow_keys,
 		};
+		static struct bpf_flow_keys ctx = {};
+
+		if (tests[i].flags) {
+			tattr.ctx_in = &ctx;
+			tattr.ctx_size_in = sizeof(ctx);
+			ctx.flags = tests[i].flags;
+		}
 
 		err = bpf_prog_test_run_xattr(&tattr);
 		CHECK_ATTR(tattr.data_size_out != sizeof(flow_keys) ||
@@ -251,10 +372,19 @@ void test_flow_dissector(void)
 	CHECK(err, "ifup", "err %d errno %d\n", err, errno);
 
 	for (i = 0; i < ARRAY_SIZE(tests); i++) {
-		struct bpf_flow_keys flow_keys = {};
+		/* Keep in sync with 'flags' from eth_get_headlen. */
+		__u32 eth_get_headlen_flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG;
 		struct bpf_prog_test_run_attr tattr = {};
+		struct bpf_flow_keys flow_keys = {};
 		__u32 key = 0;
 
+		/* For skb-less case we can't pass input flags; run
+		 * only the tests that have a matching set of flags.
+		 */
+
+		if (tests[i].flags != eth_get_headlen_flags)
+			continue;
+
 		err = tx_tap(tap_fd, &tests[i].pkt, sizeof(tests[i].pkt));
 		CHECK(err < 0, "tx_tap", "err %d errno %d\n", err, errno);
 
diff --git a/tools/testing/selftests/bpf/progs/bpf_flow.c b/tools/testing/selftests/bpf/progs/bpf_flow.c
index 5ae485a6af3f..0eabe5e57944 100644
--- a/tools/testing/selftests/bpf/progs/bpf_flow.c
+++ b/tools/testing/selftests/bpf/progs/bpf_flow.c
@@ -153,7 +153,6 @@ static __always_inline int parse_ip_proto(struct __sk_buff *skb, __u8 proto)
 	struct tcphdr *tcp, _tcp;
 	struct udphdr *udp, _udp;
 
-	keys->ip_proto = proto;
 	switch (proto) {
 	case IPPROTO_ICMP:
 		icmp = bpf_flow_dissect_get_header(skb, sizeof(*icmp), &_icmp);
@@ -231,7 +230,6 @@ static __always_inline int parse_ipv6_proto(struct __sk_buff *skb, __u8 nexthdr)
 {
 	struct bpf_flow_keys *keys = skb->flow_keys;
 
-	keys->ip_proto = nexthdr;
 	switch (nexthdr) {
 	case IPPROTO_HOPOPTS:
 	case IPPROTO_DSTOPTS:
@@ -266,6 +264,7 @@ PROG(IP)(struct __sk_buff *skb)
 	keys->addr_proto = ETH_P_IP;
 	keys->ipv4_src = iph->saddr;
 	keys->ipv4_dst = iph->daddr;
+	keys->ip_proto = iph->protocol;
 
 	keys->thoff += iph->ihl << 2;
 	if (data + keys->thoff > data_end)
@@ -273,13 +272,19 @@ PROG(IP)(struct __sk_buff *skb)
 
 	if (iph->frag_off & bpf_htons(IP_MF | IP_OFFSET)) {
 		keys->is_frag = true;
-		if (iph->frag_off & bpf_htons(IP_OFFSET))
+		if (iph->frag_off & bpf_htons(IP_OFFSET)) {
 			/* From second fragment on, packets do not have headers
 			 * we can parse.
 			 */
 			done = true;
-		else
+		} else {
 			keys->is_first_frag = true;
+			/* No need to parse fragmented packet unless
+			 * explicitly asked for.
+			 */
+			if (!(keys->flags & FLOW_DISSECTOR_F_PARSE_1ST_FRAG))
+				done = true;
+		}
 	}
 
 	if (done)
@@ -301,6 +306,7 @@ PROG(IPV6)(struct __sk_buff *skb)
 	memcpy(&keys->ipv6_src, &ip6h->saddr, 2*sizeof(ip6h->saddr));
 
 	keys->thoff += sizeof(struct ipv6hdr);
+	keys->ip_proto = ip6h->nexthdr;
 
 	return parse_ipv6_proto(skb, ip6h->nexthdr);
 }
@@ -317,7 +323,8 @@ PROG(IPV6OP)(struct __sk_buff *skb)
 	/* hlen is in 8-octets and does not include the first 8 bytes
 	 * of the header
 	 */
-	skb->flow_keys->thoff += (1 + ip6h->hdrlen) << 3;
+	keys->thoff += (1 + ip6h->hdrlen) << 3;
+	keys->ip_proto = ip6h->nexthdr;
 
 	return parse_ipv6_proto(skb, ip6h->nexthdr);
 }
@@ -333,9 +340,18 @@ PROG(IPV6FR)(struct __sk_buff *skb)
 
 	keys->thoff += sizeof(*fragh);
 	keys->is_frag = true;
-	if (!(fragh->frag_off & bpf_htons(IP6_OFFSET)))
+	keys->ip_proto = fragh->nexthdr;
+
+	if (!(fragh->frag_off & bpf_htons(IP6_OFFSET))) {
 		keys->is_first_frag = true;
 
+		/* No need to parse fragmented packet unless
+		 * explicitly asked for.
+		 */
+		if (!(keys->flags & FLOW_DISSECTOR_F_PARSE_1ST_FRAG))
+			return export_flow_keys(keys, BPF_OK);
+	}
+
 	return parse_ipv6_proto(skb, fragh->nexthdr);
 }
 
-- 
2.22.0.657.g960e92d24f-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH bpf-next v2 6/7] bpf/flow_dissector: support ipv6 flow_label and FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL
  2019-07-25 15:33 [PATCH bpf-next v2 0/7] bpf/flow_dissector: support input flags Stanislav Fomichev
                   ` (4 preceding siblings ...)
  2019-07-25 15:33 ` [PATCH bpf-next v2 5/7] selftests/bpf: support FLOW_DISSECTOR_F_PARSE_1ST_FRAG Stanislav Fomichev
@ 2019-07-25 15:33 ` Stanislav Fomichev
  2019-07-25 15:33 ` [PATCH bpf-next v2 7/7] selftests/bpf: support FLOW_DISSECTOR_F_STOP_AT_ENCAP Stanislav Fomichev
  2019-07-25 17:05 ` [PATCH bpf-next v2 0/7] bpf/flow_dissector: support input flags Petar Penkov
  7 siblings, 0 replies; 12+ messages in thread
From: Stanislav Fomichev @ 2019-07-25 15:33 UTC (permalink / raw)
  To: netdev, bpf
  Cc: davem, ast, daniel, Stanislav Fomichev, Willem de Bruijn,
	Song Liu, Petar Penkov

Add support for exporting ipv6 flow label via bpf_flow_keys.
Export flow label from bpf_flow.c and also return early when
FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL is passed.

Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Petar Penkov <ppenkov@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 include/uapi/linux/bpf.h                      |  1 +
 net/core/flow_dissector.c                     |  9 ++++
 tools/include/uapi/linux/bpf.h                |  1 +
 .../selftests/bpf/prog_tests/flow_dissector.c | 46 +++++++++++++++++++
 tools/testing/selftests/bpf/progs/bpf_flow.c  | 10 ++++
 5 files changed, 67 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index b4ad19bd6aa8..83b4150466af 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -3533,6 +3533,7 @@ struct bpf_flow_keys {
 		};
 	};
 	__u32	flags;
+	__be32	flow_label;
 };
 
 struct bpf_func_info {
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index a74c4ed1b30d..bcdb863cad28 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -737,6 +737,7 @@ static void __skb_flow_bpf_to_target(const struct bpf_flow_keys *flow_keys,
 	struct flow_dissector_key_basic *key_basic;
 	struct flow_dissector_key_addrs *key_addrs;
 	struct flow_dissector_key_ports *key_ports;
+	struct flow_dissector_key_tags *key_tags;
 
 	key_control = skb_flow_dissector_target(flow_dissector,
 						FLOW_DISSECTOR_KEY_CONTROL,
@@ -781,6 +782,14 @@ static void __skb_flow_bpf_to_target(const struct bpf_flow_keys *flow_keys,
 		key_ports->src = flow_keys->sport;
 		key_ports->dst = flow_keys->dport;
 	}
+
+	if (dissector_uses_key(flow_dissector,
+			       FLOW_DISSECTOR_KEY_FLOW_LABEL)) {
+		key_tags = skb_flow_dissector_target(flow_dissector,
+						     FLOW_DISSECTOR_KEY_FLOW_LABEL,
+						     target_container);
+		key_tags->flow_label = ntohl(flow_keys->flow_label);
+	}
 }
 
 bool bpf_flow_dissect(struct bpf_prog *prog, struct bpf_flow_dissector *ctx,
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index a0e1c891b56f..c26ca432b1b3 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -3530,6 +3530,7 @@ struct bpf_flow_keys {
 		};
 	};
 	__u32	flags;
+	__be32	flow_label;
 };
 
 struct bpf_func_info {
diff --git a/tools/testing/selftests/bpf/prog_tests/flow_dissector.c b/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
index f93a115db650..ada032be6199 100644
--- a/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
+++ b/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
@@ -20,6 +20,7 @@
 	      "is_encap=%u/%u "						\
 	      "ip_proto=0x%x/0x%x "					\
 	      "n_proto=0x%x/0x%x "					\
+	      "flow_label=0x%x/0x%x "					\
 	      "sport=%u/%u "						\
 	      "dport=%u/%u\n",						\
 	      got.nhoff, expected.nhoff,				\
@@ -30,6 +31,7 @@
 	      got.is_encap, expected.is_encap,				\
 	      got.ip_proto, expected.ip_proto,				\
 	      got.n_proto, expected.n_proto,				\
+	      got.flow_label, expected.flow_label,			\
 	      got.sport, expected.sport,				\
 	      got.dport, expected.dport)
 
@@ -257,6 +259,50 @@ struct test tests[] = {
 			.is_first_frag = true,
 		},
 	},
+	{
+		.name = "ipv6-flow-label",
+		.pkt.ipv6 = {
+			.eth.h_proto = __bpf_constant_htons(ETH_P_IPV6),
+			.iph.nexthdr = IPPROTO_TCP,
+			.iph.payload_len = __bpf_constant_htons(MAGIC_BYTES),
+			.iph.flow_lbl = { 0xb, 0xee, 0xef },
+			.tcp.doff = 5,
+			.tcp.source = 80,
+			.tcp.dest = 8080,
+		},
+		.keys = {
+			.nhoff = ETH_HLEN,
+			.thoff = ETH_HLEN + sizeof(struct ipv6hdr),
+			.addr_proto = ETH_P_IPV6,
+			.ip_proto = IPPROTO_TCP,
+			.n_proto = __bpf_constant_htons(ETH_P_IPV6),
+			.sport = 80,
+			.dport = 8080,
+			.flow_label = __bpf_constant_htonl(0xbeeef),
+		},
+	},
+	{
+		.name = "ipv6-no-flow-label",
+		.pkt.ipv6 = {
+			.eth.h_proto = __bpf_constant_htons(ETH_P_IPV6),
+			.iph.nexthdr = IPPROTO_TCP,
+			.iph.payload_len = __bpf_constant_htons(MAGIC_BYTES),
+			.iph.flow_lbl = { 0xb, 0xee, 0xef },
+			.tcp.doff = 5,
+			.tcp.source = 80,
+			.tcp.dest = 8080,
+		},
+		.keys = {
+			.flags = FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL,
+			.nhoff = ETH_HLEN,
+			.thoff = ETH_HLEN + sizeof(struct ipv6hdr),
+			.addr_proto = ETH_P_IPV6,
+			.ip_proto = IPPROTO_TCP,
+			.n_proto = __bpf_constant_htons(ETH_P_IPV6),
+			.flow_label = __bpf_constant_htonl(0xbeeef),
+		},
+		.flags = FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL,
+	},
 };
 
 static int create_tap(const char *ifname)
diff --git a/tools/testing/selftests/bpf/progs/bpf_flow.c b/tools/testing/selftests/bpf/progs/bpf_flow.c
index 0eabe5e57944..7d73b7bfe609 100644
--- a/tools/testing/selftests/bpf/progs/bpf_flow.c
+++ b/tools/testing/selftests/bpf/progs/bpf_flow.c
@@ -83,6 +83,12 @@ static __always_inline int export_flow_keys(struct bpf_flow_keys *keys,
 	return ret;
 }
 
+#define IPV6_FLOWLABEL_MASK		__bpf_constant_htonl(0x000FFFFF)
+static inline __be32 ip6_flowlabel(const struct ipv6hdr *hdr)
+{
+	return *(__be32 *)hdr & IPV6_FLOWLABEL_MASK;
+}
+
 static __always_inline void *bpf_flow_dissect_get_header(struct __sk_buff *skb,
 							 __u16 hdr_size,
 							 void *buffer)
@@ -307,6 +313,10 @@ PROG(IPV6)(struct __sk_buff *skb)
 
 	keys->thoff += sizeof(struct ipv6hdr);
 	keys->ip_proto = ip6h->nexthdr;
+	keys->flow_label = ip6_flowlabel(ip6h);
+
+	if (keys->flags & FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL)
+		return export_flow_keys(keys, BPF_OK);
 
 	return parse_ipv6_proto(skb, ip6h->nexthdr);
 }
-- 
2.22.0.657.g960e92d24f-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH bpf-next v2 7/7] selftests/bpf: support FLOW_DISSECTOR_F_STOP_AT_ENCAP
  2019-07-25 15:33 [PATCH bpf-next v2 0/7] bpf/flow_dissector: support input flags Stanislav Fomichev
                   ` (5 preceding siblings ...)
  2019-07-25 15:33 ` [PATCH bpf-next v2 6/7] bpf/flow_dissector: support ipv6 flow_label and FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL Stanislav Fomichev
@ 2019-07-25 15:33 ` Stanislav Fomichev
  2019-07-25 17:05 ` [PATCH bpf-next v2 0/7] bpf/flow_dissector: support input flags Petar Penkov
  7 siblings, 0 replies; 12+ messages in thread
From: Stanislav Fomichev @ 2019-07-25 15:33 UTC (permalink / raw)
  To: netdev, bpf
  Cc: davem, ast, daniel, Stanislav Fomichev, Willem de Bruijn,
	Song Liu, Petar Penkov

Exit as soon as we found that packet is encapped when
FLOW_DISSECTOR_F_STOP_AT_ENCAP is passed.
Add appropriate selftest cases.

v2:
* Subtract sizeof(struct iphdr) from .iph_inner.tot_len (Willem de Bruijn)

Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Petar Penkov <ppenkov@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 .../selftests/bpf/prog_tests/flow_dissector.c | 64 +++++++++++++++++++
 tools/testing/selftests/bpf/progs/bpf_flow.c  |  8 +++
 2 files changed, 72 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/flow_dissector.c b/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
index ada032be6199..15265c7a90a3 100644
--- a/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
+++ b/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
@@ -41,6 +41,13 @@ struct ipv4_pkt {
 	struct tcphdr tcp;
 } __packed;
 
+struct ipip_pkt {
+	struct ethhdr eth;
+	struct iphdr iph;
+	struct iphdr iph_inner;
+	struct tcphdr tcp;
+} __packed;
+
 struct svlan_ipv4_pkt {
 	struct ethhdr eth;
 	__u16 vlan_tci;
@@ -82,6 +89,7 @@ struct test {
 	union {
 		struct ipv4_pkt ipv4;
 		struct svlan_ipv4_pkt svlan_ipv4;
+		struct ipip_pkt ipip;
 		struct ipv6_pkt ipv6;
 		struct ipv6_frag_pkt ipv6_frag;
 		struct dvlan_ipv6_pkt dvlan_ipv6;
@@ -303,6 +311,62 @@ struct test tests[] = {
 		},
 		.flags = FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL,
 	},
+	{
+		.name = "ipip-encap",
+		.pkt.ipip = {
+			.eth.h_proto = __bpf_constant_htons(ETH_P_IP),
+			.iph.ihl = 5,
+			.iph.protocol = IPPROTO_IPIP,
+			.iph.tot_len = __bpf_constant_htons(MAGIC_BYTES),
+			.iph_inner.ihl = 5,
+			.iph_inner.protocol = IPPROTO_TCP,
+			.iph_inner.tot_len =
+				__bpf_constant_htons(MAGIC_BYTES) -
+				sizeof(struct iphdr),
+			.tcp.doff = 5,
+			.tcp.source = 80,
+			.tcp.dest = 8080,
+		},
+		.keys = {
+			.nhoff = 0,
+			.nhoff = ETH_HLEN,
+			.thoff = ETH_HLEN + sizeof(struct iphdr) +
+				sizeof(struct iphdr),
+			.addr_proto = ETH_P_IP,
+			.ip_proto = IPPROTO_TCP,
+			.n_proto = __bpf_constant_htons(ETH_P_IP),
+			.is_encap = true,
+			.sport = 80,
+			.dport = 8080,
+		},
+	},
+	{
+		.name = "ipip-no-encap",
+		.pkt.ipip = {
+			.eth.h_proto = __bpf_constant_htons(ETH_P_IP),
+			.iph.ihl = 5,
+			.iph.protocol = IPPROTO_IPIP,
+			.iph.tot_len = __bpf_constant_htons(MAGIC_BYTES),
+			.iph_inner.ihl = 5,
+			.iph_inner.protocol = IPPROTO_TCP,
+			.iph_inner.tot_len =
+				__bpf_constant_htons(MAGIC_BYTES) -
+				sizeof(struct iphdr),
+			.tcp.doff = 5,
+			.tcp.source = 80,
+			.tcp.dest = 8080,
+		},
+		.keys = {
+			.flags = FLOW_DISSECTOR_F_STOP_AT_ENCAP,
+			.nhoff = ETH_HLEN,
+			.thoff = ETH_HLEN + sizeof(struct iphdr),
+			.addr_proto = ETH_P_IP,
+			.ip_proto = IPPROTO_IPIP,
+			.n_proto = __bpf_constant_htons(ETH_P_IP),
+			.is_encap = true,
+		},
+		.flags = FLOW_DISSECTOR_F_STOP_AT_ENCAP,
+	},
 };
 
 static int create_tap(const char *ifname)
diff --git a/tools/testing/selftests/bpf/progs/bpf_flow.c b/tools/testing/selftests/bpf/progs/bpf_flow.c
index 7d73b7bfe609..b6236cdf8564 100644
--- a/tools/testing/selftests/bpf/progs/bpf_flow.c
+++ b/tools/testing/selftests/bpf/progs/bpf_flow.c
@@ -167,9 +167,15 @@ static __always_inline int parse_ip_proto(struct __sk_buff *skb, __u8 proto)
 		return export_flow_keys(keys, BPF_OK);
 	case IPPROTO_IPIP:
 		keys->is_encap = true;
+		if (keys->flags & FLOW_DISSECTOR_F_STOP_AT_ENCAP)
+			return export_flow_keys(keys, BPF_OK);
+
 		return parse_eth_proto(skb, bpf_htons(ETH_P_IP));
 	case IPPROTO_IPV6:
 		keys->is_encap = true;
+		if (keys->flags & FLOW_DISSECTOR_F_STOP_AT_ENCAP)
+			return export_flow_keys(keys, BPF_OK);
+
 		return parse_eth_proto(skb, bpf_htons(ETH_P_IPV6));
 	case IPPROTO_GRE:
 		gre = bpf_flow_dissect_get_header(skb, sizeof(*gre), &_gre);
@@ -189,6 +195,8 @@ static __always_inline int parse_ip_proto(struct __sk_buff *skb, __u8 proto)
 			keys->thoff += 4; /* Step over sequence number */
 
 		keys->is_encap = true;
+		if (keys->flags & FLOW_DISSECTOR_F_STOP_AT_ENCAP)
+			return export_flow_keys(keys, BPF_OK);
 
 		if (gre->proto == bpf_htons(ETH_P_TEB)) {
 			eth = bpf_flow_dissect_get_header(skb, sizeof(*eth),
-- 
2.22.0.657.g960e92d24f-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf-next v2 0/7] bpf/flow_dissector: support input flags
  2019-07-25 15:33 [PATCH bpf-next v2 0/7] bpf/flow_dissector: support input flags Stanislav Fomichev
                   ` (6 preceding siblings ...)
  2019-07-25 15:33 ` [PATCH bpf-next v2 7/7] selftests/bpf: support FLOW_DISSECTOR_F_STOP_AT_ENCAP Stanislav Fomichev
@ 2019-07-25 17:05 ` Petar Penkov
  7 siblings, 0 replies; 12+ messages in thread
From: Petar Penkov @ 2019-07-25 17:05 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, David S . Miller, Alexei Starovoitov,
	Daniel Borkmann, Song Liu, Willem de Bruijn

Thanks! For the series:

Acked-by: Petar Penkov <ppenkov@google.com>

On Thu, Jul 25, 2019 at 8:33 AM Stanislav Fomichev <sdf@google.com> wrote:
>
> C flow dissector supports input flags that tell it to customize parsing
> by either stopping early or trying to parse as deep as possible.
> BPF flow dissector always parses as deep as possible which is sub-optimal.
> Pass input flags to the BPF flow dissector as well so it can make the same
> decisions.
>
> Series outline:
> * remove unused FLOW_DISSECTOR_F_STOP_AT_L3 flag
> * export FLOW_DISSECTOR_F_XXX flags as uapi and pass them to BPF
>   flow dissector
> * add documentation for the export flags
> * support input flags in BPF_PROG_TEST_RUN via ctx_{in,out}
> * sync uapi to tools
> * support FLOW_DISSECTOR_F_PARSE_1ST_FRAG in selftest
> * support FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL in kernel and selftest
> * support FLOW_DISSECTOR_F_STOP_AT_ENCAP in selftest
>
> Pros:
> * makes BPF flow dissector faster by avoiding burning extra cycles
> * existing BPF progs continue to work by ignoring the flags and always
>   parsing as deep as possible
>
> Cons:
> * new UAPI which we need to support (OTOH, if we need to deprecate some
>   flags, we can just stop setting them upon calling BPF programs)
>
> Some numbers (with .repeat = 4000000 in test_flow_dissector):
>         test_flow_dissector:PASS:ipv4-frag 35 nsec
>         test_flow_dissector:PASS:ipv4-frag 35 nsec
>         test_flow_dissector:PASS:ipv4-no-frag 32 nsec
>         test_flow_dissector:PASS:ipv4-no-frag 32 nsec
>
>         test_flow_dissector:PASS:ipv6-frag 39 nsec
>         test_flow_dissector:PASS:ipv6-frag 39 nsec
>         test_flow_dissector:PASS:ipv6-no-frag 36 nsec
>         test_flow_dissector:PASS:ipv6-no-frag 36 nsec
>
>         test_flow_dissector:PASS:ipv6-flow-label 36 nsec
>         test_flow_dissector:PASS:ipv6-flow-label 36 nsec
>         test_flow_dissector:PASS:ipv6-no-flow-label 33 nsec
>         test_flow_dissector:PASS:ipv6-no-flow-label 33 nsec
>
>         test_flow_dissector:PASS:ipip-encap 38 nsec
>         test_flow_dissector:PASS:ipip-encap 38 nsec
>         test_flow_dissector:PASS:ipip-no-encap 32 nsec
>         test_flow_dissector:PASS:ipip-no-encap 32 nsec
>
> The improvement is around 10%, but it's in a tight cache-hot
> BPF_PROG_TEST_RUN loop.
>
> Cc: Song Liu <songliubraving@fb.com>
> Cc: Willem de Bruijn <willemb@google.com>
> Cc: Petar Penkov <ppenkov@google.com>
>
> Stanislav Fomichev (7):
>   bpf/flow_dissector: pass input flags to BPF flow dissector program
>   bpf/flow_dissector: document flags
>   bpf/flow_dissector: support flags in BPF_PROG_TEST_RUN
>   tools/bpf: sync bpf_flow_keys flags
>   selftests/bpf: support FLOW_DISSECTOR_F_PARSE_1ST_FRAG
>   bpf/flow_dissector: support ipv6 flow_label and
>     FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL
>   selftests/bpf: support FLOW_DISSECTOR_F_STOP_AT_ENCAP
>
>  Documentation/bpf/prog_flow_dissector.rst     |  18 ++
>  include/linux/skbuff.h                        |   2 +-
>  include/net/flow_dissector.h                  |   4 -
>  include/uapi/linux/bpf.h                      |   6 +
>  net/bpf/test_run.c                            |  39 ++-
>  net/core/flow_dissector.c                     |  14 +-
>  tools/include/uapi/linux/bpf.h                |   6 +
>  .../selftests/bpf/prog_tests/flow_dissector.c | 242 +++++++++++++++++-
>  tools/testing/selftests/bpf/progs/bpf_flow.c  |  46 +++-
>  9 files changed, 359 insertions(+), 18 deletions(-)
>
> --
> 2.22.0.657.g960e92d24f-goog

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf-next v2 5/7] selftests/bpf: support FLOW_DISSECTOR_F_PARSE_1ST_FRAG
  2019-07-25 15:33 ` [PATCH bpf-next v2 5/7] selftests/bpf: support FLOW_DISSECTOR_F_PARSE_1ST_FRAG Stanislav Fomichev
@ 2019-07-25 17:15   ` Song Liu
  0 siblings, 0 replies; 12+ messages in thread
From: Song Liu @ 2019-07-25 17:15 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, David S. Miller, ast, daniel, Willem de Bruijn,
	Petar Penkov



> On Jul 25, 2019, at 8:33 AM, Stanislav Fomichev <sdf@google.com> wrote:
> 
> bpf_flow.c: exit early unless FLOW_DISSECTOR_F_PARSE_1ST_FRAG is passed
> in flags. Also, set ip_proto earlier, this makes sure we have correct
> value with fragmented packets.
> 
> Add selftest cases to test ipv4/ipv6 fragments and skip eth_get_headlen
> tests that don't have FLOW_DISSECTOR_F_PARSE_1ST_FRAG flag.
> 
> eth_get_headlen calls flow dissector with
> FLOW_DISSECTOR_F_PARSE_1ST_FRAG flag so we can't run tests that
> have different set of input flags against it.
> 
> v2:
> * sefltests -> selftests (Willem de Bruijn)
> * Reword a comment about eth_get_headlen flags (Song Liu)
> 
> Acked-by: Willem de Bruijn <willemb@google.com>
> Cc: Song Liu <songliubraving@fb.com>
> Cc: Willem de Bruijn <willemb@google.com>
> Cc: Petar Penkov <ppenkov@google.com>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>

Acked-by: Song Liu <songliubraving@fb.com>


> ---
> .../selftests/bpf/prog_tests/flow_dissector.c | 132 +++++++++++++++++-
> tools/testing/selftests/bpf/progs/bpf_flow.c  |  28 +++-
> 2 files changed, 153 insertions(+), 7 deletions(-)
> 
> diff --git a/tools/testing/selftests/bpf/prog_tests/flow_dissector.c b/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
> index c938283ac232..f93a115db650 100644
> --- a/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
> +++ b/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
> @@ -5,6 +5,10 @@
> #include <linux/if_tun.h>
> #include <sys/uio.h>
> 
> +#ifndef IP_MF
> +#define IP_MF 0x2000
> +#endif
> +
> #define CHECK_FLOW_KEYS(desc, got, expected)				\
> 	CHECK_ATTR(memcmp(&got, &expected, sizeof(got)) != 0,		\
> 	      desc,							\
> @@ -49,6 +53,18 @@ struct ipv6_pkt {
> 	struct tcphdr tcp;
> } __packed;
> 
> +struct ipv6_frag_pkt {
> +	struct ethhdr eth;
> +	struct ipv6hdr iph;
> +	struct frag_hdr {
> +		__u8 nexthdr;
> +		__u8 reserved;
> +		__be16 frag_off;
> +		__be32 identification;
> +	} ipf;
> +	struct tcphdr tcp;
> +} __packed;
> +
> struct dvlan_ipv6_pkt {
> 	struct ethhdr eth;
> 	__u16 vlan_tci;
> @@ -65,9 +81,11 @@ struct test {
> 		struct ipv4_pkt ipv4;
> 		struct svlan_ipv4_pkt svlan_ipv4;
> 		struct ipv6_pkt ipv6;
> +		struct ipv6_frag_pkt ipv6_frag;
> 		struct dvlan_ipv6_pkt dvlan_ipv6;
> 	} pkt;
> 	struct bpf_flow_keys keys;
> +	__u32 flags;
> };
> 
> #define VLAN_HLEN	4
> @@ -143,6 +161,102 @@ struct test tests[] = {
> 			.n_proto = __bpf_constant_htons(ETH_P_IPV6),
> 		},
> 	},
> +	{
> +		.name = "ipv4-frag",
> +		.pkt.ipv4 = {
> +			.eth.h_proto = __bpf_constant_htons(ETH_P_IP),
> +			.iph.ihl = 5,
> +			.iph.protocol = IPPROTO_TCP,
> +			.iph.tot_len = __bpf_constant_htons(MAGIC_BYTES),
> +			.iph.frag_off = __bpf_constant_htons(IP_MF),
> +			.tcp.doff = 5,
> +			.tcp.source = 80,
> +			.tcp.dest = 8080,
> +		},
> +		.keys = {
> +			.flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG,
> +			.nhoff = ETH_HLEN,
> +			.thoff = ETH_HLEN + sizeof(struct iphdr),
> +			.addr_proto = ETH_P_IP,
> +			.ip_proto = IPPROTO_TCP,
> +			.n_proto = __bpf_constant_htons(ETH_P_IP),
> +			.is_frag = true,
> +			.is_first_frag = true,
> +			.sport = 80,
> +			.dport = 8080,
> +		},
> +		.flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG,
> +	},
> +	{
> +		.name = "ipv4-no-frag",
> +		.pkt.ipv4 = {
> +			.eth.h_proto = __bpf_constant_htons(ETH_P_IP),
> +			.iph.ihl = 5,
> +			.iph.protocol = IPPROTO_TCP,
> +			.iph.tot_len = __bpf_constant_htons(MAGIC_BYTES),
> +			.iph.frag_off = __bpf_constant_htons(IP_MF),
> +			.tcp.doff = 5,
> +			.tcp.source = 80,
> +			.tcp.dest = 8080,
> +		},
> +		.keys = {
> +			.nhoff = ETH_HLEN,
> +			.thoff = ETH_HLEN + sizeof(struct iphdr),
> +			.addr_proto = ETH_P_IP,
> +			.ip_proto = IPPROTO_TCP,
> +			.n_proto = __bpf_constant_htons(ETH_P_IP),
> +			.is_frag = true,
> +			.is_first_frag = true,
> +		},
> +	},
> +	{
> +		.name = "ipv6-frag",
> +		.pkt.ipv6_frag = {
> +			.eth.h_proto = __bpf_constant_htons(ETH_P_IPV6),
> +			.iph.nexthdr = IPPROTO_FRAGMENT,
> +			.iph.payload_len = __bpf_constant_htons(MAGIC_BYTES),
> +			.ipf.nexthdr = IPPROTO_TCP,
> +			.tcp.doff = 5,
> +			.tcp.source = 80,
> +			.tcp.dest = 8080,
> +		},
> +		.keys = {
> +			.flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG,
> +			.nhoff = ETH_HLEN,
> +			.thoff = ETH_HLEN + sizeof(struct ipv6hdr) +
> +				sizeof(struct frag_hdr),
> +			.addr_proto = ETH_P_IPV6,
> +			.ip_proto = IPPROTO_TCP,
> +			.n_proto = __bpf_constant_htons(ETH_P_IPV6),
> +			.is_frag = true,
> +			.is_first_frag = true,
> +			.sport = 80,
> +			.dport = 8080,
> +		},
> +		.flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG,
> +	},
> +	{
> +		.name = "ipv6-no-frag",
> +		.pkt.ipv6_frag = {
> +			.eth.h_proto = __bpf_constant_htons(ETH_P_IPV6),
> +			.iph.nexthdr = IPPROTO_FRAGMENT,
> +			.iph.payload_len = __bpf_constant_htons(MAGIC_BYTES),
> +			.ipf.nexthdr = IPPROTO_TCP,
> +			.tcp.doff = 5,
> +			.tcp.source = 80,
> +			.tcp.dest = 8080,
> +		},
> +		.keys = {
> +			.nhoff = ETH_HLEN,
> +			.thoff = ETH_HLEN + sizeof(struct ipv6hdr) +
> +				sizeof(struct frag_hdr),
> +			.addr_proto = ETH_P_IPV6,
> +			.ip_proto = IPPROTO_TCP,
> +			.n_proto = __bpf_constant_htons(ETH_P_IPV6),
> +			.is_frag = true,
> +			.is_first_frag = true,
> +		},
> +	},
> };
> 
> static int create_tap(const char *ifname)
> @@ -225,6 +339,13 @@ void test_flow_dissector(void)
> 			.data_size_in = sizeof(tests[i].pkt),
> 			.data_out = &flow_keys,
> 		};
> +		static struct bpf_flow_keys ctx = {};
> +
> +		if (tests[i].flags) {
> +			tattr.ctx_in = &ctx;
> +			tattr.ctx_size_in = sizeof(ctx);
> +			ctx.flags = tests[i].flags;
> +		}
> 
> 		err = bpf_prog_test_run_xattr(&tattr);
> 		CHECK_ATTR(tattr.data_size_out != sizeof(flow_keys) ||
> @@ -251,10 +372,19 @@ void test_flow_dissector(void)
> 	CHECK(err, "ifup", "err %d errno %d\n", err, errno);
> 
> 	for (i = 0; i < ARRAY_SIZE(tests); i++) {
> -		struct bpf_flow_keys flow_keys = {};
> +		/* Keep in sync with 'flags' from eth_get_headlen. */
> +		__u32 eth_get_headlen_flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG;
> 		struct bpf_prog_test_run_attr tattr = {};
> +		struct bpf_flow_keys flow_keys = {};
> 		__u32 key = 0;
> 
> +		/* For skb-less case we can't pass input flags; run
> +		 * only the tests that have a matching set of flags.
> +		 */
> +
> +		if (tests[i].flags != eth_get_headlen_flags)
> +			continue;
> +
> 		err = tx_tap(tap_fd, &tests[i].pkt, sizeof(tests[i].pkt));
> 		CHECK(err < 0, "tx_tap", "err %d errno %d\n", err, errno);
> 
> diff --git a/tools/testing/selftests/bpf/progs/bpf_flow.c b/tools/testing/selftests/bpf/progs/bpf_flow.c
> index 5ae485a6af3f..0eabe5e57944 100644
> --- a/tools/testing/selftests/bpf/progs/bpf_flow.c
> +++ b/tools/testing/selftests/bpf/progs/bpf_flow.c
> @@ -153,7 +153,6 @@ static __always_inline int parse_ip_proto(struct __sk_buff *skb, __u8 proto)
> 	struct tcphdr *tcp, _tcp;
> 	struct udphdr *udp, _udp;
> 
> -	keys->ip_proto = proto;
> 	switch (proto) {
> 	case IPPROTO_ICMP:
> 		icmp = bpf_flow_dissect_get_header(skb, sizeof(*icmp), &_icmp);
> @@ -231,7 +230,6 @@ static __always_inline int parse_ipv6_proto(struct __sk_buff *skb, __u8 nexthdr)
> {
> 	struct bpf_flow_keys *keys = skb->flow_keys;
> 
> -	keys->ip_proto = nexthdr;
> 	switch (nexthdr) {
> 	case IPPROTO_HOPOPTS:
> 	case IPPROTO_DSTOPTS:
> @@ -266,6 +264,7 @@ PROG(IP)(struct __sk_buff *skb)
> 	keys->addr_proto = ETH_P_IP;
> 	keys->ipv4_src = iph->saddr;
> 	keys->ipv4_dst = iph->daddr;
> +	keys->ip_proto = iph->protocol;
> 
> 	keys->thoff += iph->ihl << 2;
> 	if (data + keys->thoff > data_end)
> @@ -273,13 +272,19 @@ PROG(IP)(struct __sk_buff *skb)
> 
> 	if (iph->frag_off & bpf_htons(IP_MF | IP_OFFSET)) {
> 		keys->is_frag = true;
> -		if (iph->frag_off & bpf_htons(IP_OFFSET))
> +		if (iph->frag_off & bpf_htons(IP_OFFSET)) {
> 			/* From second fragment on, packets do not have headers
> 			 * we can parse.
> 			 */
> 			done = true;
> -		else
> +		} else {
> 			keys->is_first_frag = true;
> +			/* No need to parse fragmented packet unless
> +			 * explicitly asked for.
> +			 */
> +			if (!(keys->flags & FLOW_DISSECTOR_F_PARSE_1ST_FRAG))
> +				done = true;
> +		}
> 	}
> 
> 	if (done)
> @@ -301,6 +306,7 @@ PROG(IPV6)(struct __sk_buff *skb)
> 	memcpy(&keys->ipv6_src, &ip6h->saddr, 2*sizeof(ip6h->saddr));
> 
> 	keys->thoff += sizeof(struct ipv6hdr);
> +	keys->ip_proto = ip6h->nexthdr;
> 
> 	return parse_ipv6_proto(skb, ip6h->nexthdr);
> }
> @@ -317,7 +323,8 @@ PROG(IPV6OP)(struct __sk_buff *skb)
> 	/* hlen is in 8-octets and does not include the first 8 bytes
> 	 * of the header
> 	 */
> -	skb->flow_keys->thoff += (1 + ip6h->hdrlen) << 3;
> +	keys->thoff += (1 + ip6h->hdrlen) << 3;
> +	keys->ip_proto = ip6h->nexthdr;
> 
> 	return parse_ipv6_proto(skb, ip6h->nexthdr);
> }
> @@ -333,9 +340,18 @@ PROG(IPV6FR)(struct __sk_buff *skb)
> 
> 	keys->thoff += sizeof(*fragh);
> 	keys->is_frag = true;
> -	if (!(fragh->frag_off & bpf_htons(IP6_OFFSET)))
> +	keys->ip_proto = fragh->nexthdr;
> +
> +	if (!(fragh->frag_off & bpf_htons(IP6_OFFSET))) {
> 		keys->is_first_frag = true;
> 
> +		/* No need to parse fragmented packet unless
> +		 * explicitly asked for.
> +		 */
> +		if (!(keys->flags & FLOW_DISSECTOR_F_PARSE_1ST_FRAG))
> +			return export_flow_keys(keys, BPF_OK);
> +	}
> +
> 	return parse_ipv6_proto(skb, fragh->nexthdr);
> }
> 
> -- 
> 2.22.0.657.g960e92d24f-goog
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf-next v2 1/7] bpf/flow_dissector: pass input flags to BPF flow dissector program
  2019-07-25 15:33 ` [PATCH bpf-next v2 1/7] bpf/flow_dissector: pass input flags to BPF flow dissector program Stanislav Fomichev
@ 2019-07-25 19:58   ` Alexei Starovoitov
  2019-07-25 20:03     ` Stanislav Fomichev
  0 siblings, 1 reply; 12+ messages in thread
From: Alexei Starovoitov @ 2019-07-25 19:58 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: netdev, bpf, davem, ast, daniel, Willem de Bruijn, Song Liu,
	Petar Penkov

On Thu, Jul 25, 2019 at 08:33:36AM -0700, Stanislav Fomichev wrote:
> C flow dissector supports input flags that tell it to customize parsing
> by either stopping early or trying to parse as deep as possible. Pass
> those flags to the BPF flow dissector so it can make the same
> decisions. In the next commits I'll add support for those flags to
> our reference bpf_flow.c
> 
> Acked-by: Willem de Bruijn <willemb@google.com>
> Acked-by: Song Liu <songliubraving@fb.com>
> Cc: Song Liu <songliubraving@fb.com>
> Cc: Willem de Bruijn <willemb@google.com>
> Cc: Petar Penkov <ppenkov@google.com>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>  include/linux/skbuff.h       | 2 +-
>  include/net/flow_dissector.h | 4 ----
>  include/uapi/linux/bpf.h     | 5 +++++
>  net/bpf/test_run.c           | 2 +-
>  net/core/flow_dissector.c    | 5 +++--
>  5 files changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 718742b1c505..9b7a8038beec 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -1271,7 +1271,7 @@ static inline int skb_flow_dissector_bpf_prog_detach(const union bpf_attr *attr)
>  
>  struct bpf_flow_dissector;
>  bool bpf_flow_dissect(struct bpf_prog *prog, struct bpf_flow_dissector *ctx,
> -		      __be16 proto, int nhoff, int hlen);
> +		      __be16 proto, int nhoff, int hlen, unsigned int flags);
>  
>  bool __skb_flow_dissect(const struct net *net,
>  			const struct sk_buff *skb,
> diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
> index 90bd210be060..3e2642587b76 100644
> --- a/include/net/flow_dissector.h
> +++ b/include/net/flow_dissector.h
> @@ -253,10 +253,6 @@ enum flow_dissector_key_id {
>  	FLOW_DISSECTOR_KEY_MAX,
>  };
>  
> -#define FLOW_DISSECTOR_F_PARSE_1ST_FRAG		BIT(0)
> -#define FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL	BIT(1)
> -#define FLOW_DISSECTOR_F_STOP_AT_ENCAP		BIT(2)
> -
>  struct flow_dissector_key {
>  	enum flow_dissector_key_id key_id;
>  	size_t offset; /* offset of struct flow_dissector_key_*
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index fa1c753dcdbc..b4ad19bd6aa8 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -3507,6 +3507,10 @@ enum bpf_task_fd_type {
>  	BPF_FD_TYPE_URETPROBE,		/* filename + offset */
>  };
>  
> +#define FLOW_DISSECTOR_F_PARSE_1ST_FRAG		(1U << 0)
> +#define FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL	(1U << 1)
> +#define FLOW_DISSECTOR_F_STOP_AT_ENCAP		(1U << 2)
> +

I'm a bit concerned with direct move.
Last time we were in similar situation we've created:
enum {
        BPF_TCP_ESTABLISHED = 1,
        BPF_TCP_SYN_SENT,

and added:
        BUILD_BUG_ON((int)BPF_TCP_ESTABLISHED != (int)TCP_ESTABLISHED);
        BUILD_BUG_ON((int)BPF_TCP_SYN_SENT != (int)TCP_SYN_SENT);

It may be overkill here, but feels safer than direct move.
Adding BPF_ prefix also feels necessary to avoid very unlikely
(but still theoretically possible) conflicts.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf-next v2 1/7] bpf/flow_dissector: pass input flags to BPF flow dissector program
  2019-07-25 19:58   ` Alexei Starovoitov
@ 2019-07-25 20:03     ` Stanislav Fomichev
  0 siblings, 0 replies; 12+ messages in thread
From: Stanislav Fomichev @ 2019-07-25 20:03 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Stanislav Fomichev, netdev, bpf, davem, ast, daniel,
	Willem de Bruijn, Song Liu, Petar Penkov

On 07/25, Alexei Starovoitov wrote:
> On Thu, Jul 25, 2019 at 08:33:36AM -0700, Stanislav Fomichev wrote:
> > C flow dissector supports input flags that tell it to customize parsing
> > by either stopping early or trying to parse as deep as possible. Pass
> > those flags to the BPF flow dissector so it can make the same
> > decisions. In the next commits I'll add support for those flags to
> > our reference bpf_flow.c
> > 
> > Acked-by: Willem de Bruijn <willemb@google.com>
> > Acked-by: Song Liu <songliubraving@fb.com>
> > Cc: Song Liu <songliubraving@fb.com>
> > Cc: Willem de Bruijn <willemb@google.com>
> > Cc: Petar Penkov <ppenkov@google.com>
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >  include/linux/skbuff.h       | 2 +-
> >  include/net/flow_dissector.h | 4 ----
> >  include/uapi/linux/bpf.h     | 5 +++++
> >  net/bpf/test_run.c           | 2 +-
> >  net/core/flow_dissector.c    | 5 +++--
> >  5 files changed, 10 insertions(+), 8 deletions(-)
> > 
> > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> > index 718742b1c505..9b7a8038beec 100644
> > --- a/include/linux/skbuff.h
> > +++ b/include/linux/skbuff.h
> > @@ -1271,7 +1271,7 @@ static inline int skb_flow_dissector_bpf_prog_detach(const union bpf_attr *attr)
> >  
> >  struct bpf_flow_dissector;
> >  bool bpf_flow_dissect(struct bpf_prog *prog, struct bpf_flow_dissector *ctx,
> > -		      __be16 proto, int nhoff, int hlen);
> > +		      __be16 proto, int nhoff, int hlen, unsigned int flags);
> >  
> >  bool __skb_flow_dissect(const struct net *net,
> >  			const struct sk_buff *skb,
> > diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
> > index 90bd210be060..3e2642587b76 100644
> > --- a/include/net/flow_dissector.h
> > +++ b/include/net/flow_dissector.h
> > @@ -253,10 +253,6 @@ enum flow_dissector_key_id {
> >  	FLOW_DISSECTOR_KEY_MAX,
> >  };
> >  
> > -#define FLOW_DISSECTOR_F_PARSE_1ST_FRAG		BIT(0)
> > -#define FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL	BIT(1)
> > -#define FLOW_DISSECTOR_F_STOP_AT_ENCAP		BIT(2)
> > -
> >  struct flow_dissector_key {
> >  	enum flow_dissector_key_id key_id;
> >  	size_t offset; /* offset of struct flow_dissector_key_*
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index fa1c753dcdbc..b4ad19bd6aa8 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -3507,6 +3507,10 @@ enum bpf_task_fd_type {
> >  	BPF_FD_TYPE_URETPROBE,		/* filename + offset */
> >  };
> >  
> > +#define FLOW_DISSECTOR_F_PARSE_1ST_FRAG		(1U << 0)
> > +#define FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL	(1U << 1)
> > +#define FLOW_DISSECTOR_F_STOP_AT_ENCAP		(1U << 2)
> > +
> 
> I'm a bit concerned with direct move.
> Last time we were in similar situation we've created:
> enum {
>         BPF_TCP_ESTABLISHED = 1,
>         BPF_TCP_SYN_SENT,
> 
> and added:
>         BUILD_BUG_ON((int)BPF_TCP_ESTABLISHED != (int)TCP_ESTABLISHED);
>         BUILD_BUG_ON((int)BPF_TCP_SYN_SENT != (int)TCP_SYN_SENT);
> 
> It may be overkill here, but feels safer than direct move.
> Adding BPF_ prefix also feels necessary to avoid very unlikely
> (but still theoretically possible) conflicts.
Sounds good, thanks for the pointers, will do the same here!

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2019-07-25 20:03 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-25 15:33 [PATCH bpf-next v2 0/7] bpf/flow_dissector: support input flags Stanislav Fomichev
2019-07-25 15:33 ` [PATCH bpf-next v2 1/7] bpf/flow_dissector: pass input flags to BPF flow dissector program Stanislav Fomichev
2019-07-25 19:58   ` Alexei Starovoitov
2019-07-25 20:03     ` Stanislav Fomichev
2019-07-25 15:33 ` [PATCH bpf-next v2 2/7] bpf/flow_dissector: document flags Stanislav Fomichev
2019-07-25 15:33 ` [PATCH bpf-next v2 3/7] bpf/flow_dissector: support flags in BPF_PROG_TEST_RUN Stanislav Fomichev
2019-07-25 15:33 ` [PATCH bpf-next v2 4/7] tools/bpf: sync bpf_flow_keys flags Stanislav Fomichev
2019-07-25 15:33 ` [PATCH bpf-next v2 5/7] selftests/bpf: support FLOW_DISSECTOR_F_PARSE_1ST_FRAG Stanislav Fomichev
2019-07-25 17:15   ` Song Liu
2019-07-25 15:33 ` [PATCH bpf-next v2 6/7] bpf/flow_dissector: support ipv6 flow_label and FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL Stanislav Fomichev
2019-07-25 15:33 ` [PATCH bpf-next v2 7/7] selftests/bpf: support FLOW_DISSECTOR_F_STOP_AT_ENCAP Stanislav Fomichev
2019-07-25 17:05 ` [PATCH bpf-next v2 0/7] bpf/flow_dissector: support input flags Petar Penkov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.