bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 bpf-next 00/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type
@ 2020-09-25  0:03 Martin KaFai Lau
  2020-09-25  0:03 ` [PATCH v4 bpf-next 01/13] bpf: Move the PTR_TO_BTF_ID check to check_reg_type() Martin KaFai Lau
                   ` (14 more replies)
  0 siblings, 15 replies; 26+ messages in thread
From: Martin KaFai Lau @ 2020-09-25  0:03 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team, Lorenz Bauer, netdev

This set allows networking prog type to directly read fields from
the in-kernel socket type, e.g. "struct tcp_sock".

Patch 2 has the details on the use case.

v3:
- Pass arg_btf_id instead of fn into check_reg_type() in Patch 1 (Lorenz)
- Move arg_btf_id from func_proto to struct bpf_reg_types in Patch 2 (Lorenz)
- Remove test_sock_fields from .gitignore in Patch 8 (Andrii)
- Add tests to have better coverage on the modified helpers (Alexei)
  Patch 13 is added.
- Use "void *sk" as the helper argument in UAPI bpf.h
  
v3:
- ARG_PTR_TO_SOCK_COMMON_OR_NULL was attempted in v2.  The _OR_NULL was
  needed because the PTR_TO_BTF_ID could be NULL but note that a could be NULL
  PTR_TO_BTF_ID is not a scalar NULL to the verifier.  "_OR_NULL" implicitly
  gives an expectation that the helper can take a scalar NULL which does
  not make sense in most (except one) helpers.  Passing scalar NULL
  should be rejected at the verification time.

  Thus, this patch uses ARG_PTR_TO_BTF_ID_SOCK_COMMON to specify that the
  helper can take both the btf-id ptr or the legacy PTR_TO_SOCK_COMMON but
  not scalar NULL.  It requires the func_proto to explicitly specify the
  arg_btf_id such that there is a very clear expectation that the helper
  can handle a NULL PTR_TO_BTF_ID.

v2:
- Add ARG_PTR_TO_SOCK_COMMON_OR_NULL (Lorenz)

Martin KaFai Lau (13):
  bpf: Move the PTR_TO_BTF_ID check to check_reg_type()
  bpf: Enable bpf_skc_to_* sock casting helper to networking prog type
  bpf: Change bpf_sk_release and bpf_sk_*cgroup_id to accept
    ARG_PTR_TO_BTF_ID_SOCK_COMMON
  bpf: Change bpf_sk_storage_*() to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON
  bpf: Change bpf_tcp_*_syncookie to accept
    ARG_PTR_TO_BTF_ID_SOCK_COMMON
  bpf: Change bpf_sk_assign to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON
  bpf: selftest: Add ref_tracking verifier test for bpf_skc casting
  bpf: selftest: Move sock_fields test into test_progs
  bpf: selftest: Adapt sock_fields test to use skel and global variables
  bpf: selftest: Use network_helpers in the sock_fields test
  bpf: selftest: Use bpf_skc_to_tcp_sock() in the sock_fields test
  bpf: selftest: Remove enum tcp_ca_state from bpf_tcp_helpers.h
  bpf: selftest: Add test_btf_skc_cls_ingress

 include/linux/bpf.h                           |   1 +
 include/net/bpf_sk_storage.h                  |   2 -
 include/uapi/linux/bpf.h                      |  15 +-
 kernel/bpf/bpf_lsm.c                          |   4 +-
 kernel/bpf/verifier.c                         |  94 ++--
 net/core/bpf_sk_storage.c                     |  29 +-
 net/core/filter.c                             | 111 ++--
 net/ipv4/bpf_tcp_ca.c                         |  23 +-
 tools/include/uapi/linux/bpf.h                |  15 +-
 tools/testing/selftests/bpf/.gitignore        |   1 -
 tools/testing/selftests/bpf/Makefile          |   2 +-
 tools/testing/selftests/bpf/bpf_tcp_helpers.h |  13 +-
 .../bpf/prog_tests/btf_skc_cls_ingress.c      | 234 +++++++++
 .../selftests/bpf/prog_tests/sock_fields.c    | 382 ++++++++++++++
 tools/testing/selftests/bpf/progs/bpf_cubic.c |   2 +
 tools/testing/selftests/bpf/progs/bpf_dctcp.c |   2 +
 .../bpf/progs/test_btf_skc_cls_ingress.c      | 174 +++++++
 ..._sock_fields_kern.c => test_sock_fields.c} | 176 ++++---
 .../testing/selftests/bpf/test_sock_fields.c  | 482 ------------------
 .../selftests/bpf/verifier/ref_tracking.c     |  47 ++
 20 files changed, 1093 insertions(+), 716 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_skc_cls_ingress.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/sock_fields.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_btf_skc_cls_ingress.c
 rename tools/testing/selftests/bpf/progs/{test_sock_fields_kern.c => test_sock_fields.c} (61%)
 delete mode 100644 tools/testing/selftests/bpf/test_sock_fields.c

-- 
2.24.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v4 bpf-next 01/13] bpf: Move the PTR_TO_BTF_ID check to check_reg_type()
  2020-09-25  0:03 [PATCH v4 bpf-next 00/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type Martin KaFai Lau
@ 2020-09-25  0:03 ` Martin KaFai Lau
  2020-09-25  8:22   ` Lorenz Bauer
  2020-09-25 13:36   ` John Fastabend
  2020-09-25  0:03 ` [PATCH v4 bpf-next 02/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type Martin KaFai Lau
                   ` (13 subsequent siblings)
  14 siblings, 2 replies; 26+ messages in thread
From: Martin KaFai Lau @ 2020-09-25  0:03 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team, Lorenz Bauer, netdev

check_reg_type() checks whether a reg can be used as an arg of a
func_proto.  For PTR_TO_BTF_ID, the check is actually not
completely done until the reg->btf_id is pointing to a
kernel struct that is acceptable by the func_proto.

Thus, this patch moves the btf_id check into check_reg_type().
"arg_type" and "arg_btf_id" are passed to check_reg_type() instead of
"compatible".  The compatible_reg_types[] usage is localized in
check_reg_type() now.

The "if (!btf_id) verbose(...); " is also removed since it won't happen.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 kernel/bpf/verifier.c | 60 ++++++++++++++++++++++---------------------
 1 file changed, 31 insertions(+), 29 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 42dee5dcbc74..945fa2b4d096 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4028,19 +4028,27 @@ static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
 };
 
 static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
-			  const struct bpf_reg_types *compatible)
+			  enum bpf_arg_type arg_type,
+			  const u32 *arg_btf_id)
 {
 	struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
 	enum bpf_reg_type expected, type = reg->type;
+	const struct bpf_reg_types *compatible;
 	int i, j;
 
+	compatible = compatible_reg_types[arg_type];
+	if (!compatible) {
+		verbose(env, "verifier internal error: unsupported arg type %d\n", arg_type);
+		return -EFAULT;
+	}
+
 	for (i = 0; i < ARRAY_SIZE(compatible->types); i++) {
 		expected = compatible->types[i];
 		if (expected == NOT_INIT)
 			break;
 
 		if (type == expected)
-			return 0;
+			goto found;
 	}
 
 	verbose(env, "R%d type=%s expected=", regno, reg_type_str[type]);
@@ -4048,6 +4056,25 @@ static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
 		verbose(env, "%s, ", reg_type_str[compatible->types[j]]);
 	verbose(env, "%s\n", reg_type_str[compatible->types[j]]);
 	return -EACCES;
+
+found:
+	if (type == PTR_TO_BTF_ID) {
+		if (!btf_struct_ids_match(&env->log, reg->off, reg->btf_id,
+					  *arg_btf_id)) {
+			verbose(env, "R%d is of type %s but %s is expected\n",
+				regno, kernel_type_name(reg->btf_id),
+				kernel_type_name(*arg_btf_id));
+			return -EACCES;
+		}
+
+		if (!tnum_is_const(reg->var_off) || reg->var_off.value) {
+			verbose(env, "R%d is a pointer to in-kernel struct with non-zero offset\n",
+				regno);
+			return -EACCES;
+		}
+	}
+
+	return 0;
 }
 
 static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
@@ -4057,7 +4084,6 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
 	u32 regno = BPF_REG_1 + arg;
 	struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
 	enum bpf_arg_type arg_type = fn->arg_type[arg];
-	const struct bpf_reg_types *compatible;
 	enum bpf_reg_type type = reg->type;
 	int err = 0;
 
@@ -4097,35 +4123,11 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
 		 */
 		goto skip_type_check;
 
-	compatible = compatible_reg_types[arg_type];
-	if (!compatible) {
-		verbose(env, "verifier internal error: unsupported arg type %d\n", arg_type);
-		return -EFAULT;
-	}
-
-	err = check_reg_type(env, regno, compatible);
+	err = check_reg_type(env, regno, arg_type, fn->arg_btf_id[arg]);
 	if (err)
 		return err;
 
-	if (type == PTR_TO_BTF_ID) {
-		const u32 *btf_id = fn->arg_btf_id[arg];
-
-		if (!btf_id) {
-			verbose(env, "verifier internal error: missing BTF ID\n");
-			return -EFAULT;
-		}
-
-		if (!btf_struct_ids_match(&env->log, reg->off, reg->btf_id, *btf_id)) {
-			verbose(env, "R%d is of type %s but %s is expected\n",
-				regno, kernel_type_name(reg->btf_id), kernel_type_name(*btf_id));
-			return -EACCES;
-		}
-		if (!tnum_is_const(reg->var_off) || reg->var_off.value) {
-			verbose(env, "R%d is a pointer to in-kernel struct with non-zero offset\n",
-				regno);
-			return -EACCES;
-		}
-	} else if (type == PTR_TO_CTX) {
+	if (type == PTR_TO_CTX) {
 		err = check_ctx_reg(env, reg, regno);
 		if (err < 0)
 			return err;
-- 
2.24.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v4 bpf-next 02/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type
  2020-09-25  0:03 [PATCH v4 bpf-next 00/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type Martin KaFai Lau
  2020-09-25  0:03 ` [PATCH v4 bpf-next 01/13] bpf: Move the PTR_TO_BTF_ID check to check_reg_type() Martin KaFai Lau
@ 2020-09-25  0:03 ` Martin KaFai Lau
  2020-09-25  8:26   ` Lorenz Bauer
  2020-09-25 14:21   ` John Fastabend
  2020-09-25  0:03 ` [PATCH v4 bpf-next 03/13] bpf: Change bpf_sk_release and bpf_sk_*cgroup_id to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON Martin KaFai Lau
                   ` (12 subsequent siblings)
  14 siblings, 2 replies; 26+ messages in thread
From: Martin KaFai Lau @ 2020-09-25  0:03 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team, Lorenz Bauer, netdev

There is a constant need to add more fields into the bpf_tcp_sock
for the bpf programs running at tc, sock_ops...etc.

A current workaround could be to use bpf_probe_read_kernel().  However,
other than making another helper call for reading each field and missing
CO-RE, it is also not as intuitive to use as directly reading
"tp->lsndtime" for example.  While already having perfmon cap to do
bpf_probe_read_kernel(), it will be much easier if the bpf prog can
directly read from the tcp_sock.

This patch tries to do that by using the existing casting-helpers
bpf_skc_to_*() whose func_proto returns a btf_id.  For example, the
func_proto of bpf_skc_to_tcp_sock returns the btf_id of the
kernel "struct tcp_sock".

These helpers are also added to is_ptr_cast_function().
It ensures the returning reg (BPF_REF_0) will also carries the ref_obj_id.
That will keep the ref-tracking works properly.

The bpf_skc_to_* helpers are made available to most of the bpf prog
types in filter.c. The bpf_skc_to_* helpers will be limited by
perfmon cap.

This patch adds a ARG_PTR_TO_BTF_ID_SOCK_COMMON.  The helper accepting
this arg can accept a btf-id-ptr (PTR_TO_BTF_ID + &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON])
or a legacy-ctx-convert-skc-ptr (PTR_TO_SOCK_COMMON).  The bpf_skc_to_*()
helpers are changed to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that
they will accept pointer obtained from skb->sk.

Instead of specifying both arg_type and arg_btf_id in the same func_proto
which is how the current ARG_PTR_TO_BTF_ID does, the arg_btf_id of
the new ARG_PTR_TO_BTF_ID_SOCK_COMMON is specified in the
compatible_reg_types[] in verifier.c.  The reason is the arg_btf_id is
always the same.  Discussion in this thread:
https://lore.kernel.org/bpf/20200922070422.1917351-1-kafai@fb.com/

The ARG_PTR_TO_BTF_ID_ part gives a clear expectation that the helper is
expecting a PTR_TO_BTF_ID which could be NULL.  This is the same
behavior as the existing helper taking ARG_PTR_TO_BTF_ID.

The _SOCK_COMMON part means the helper is also expecting the legacy
SOCK_COMMON pointer.

By excluding the _OR_NULL part, the bpf prog cannot call helper
with a literal NULL which doesn't make sense in most cases.
e.g. bpf_skc_to_tcp_sock(NULL) will be rejected.  All PTR_TO_*_OR_NULL
reg has to do a NULL check first before passing into the helper or else
the bpf prog will be rejected.  This behavior is nothing new and
consistent with the current expectation during bpf-prog-load.

[ ARG_PTR_TO_BTF_ID_SOCK_COMMON will be used to replace
  ARG_PTR_TO_SOCK* of other existing helpers later such that
  those existing helpers can take the PTR_TO_BTF_ID returned by
  the bpf_skc_to_*() helpers.

  The only special case is bpf_sk_lookup_assign() which can accept a
  literal NULL ptr.  It has to be handled specially in another follow
  up patch if there is a need (e.g. by renaming ARG_PTR_TO_SOCKET_OR_NULL
  to ARG_PTR_TO_BTF_ID_SOCK_COMMON_OR_NULL). ]

[ When converting the older helpers that take ARG_PTR_TO_SOCK* in
  the later patch, if the kernel does not support BTF,
  ARG_PTR_TO_BTF_ID_SOCK_COMMON will behave like ARG_PTR_TO_SOCK_COMMON
  because no reg->type could have PTR_TO_BTF_ID in this case.

  It is not a concern for the newer-btf-only helper like the bpf_skc_to_*()
  here though because these helpers must require BTF vmlinux to begin
  with. ]

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 include/linux/bpf.h   |  1 +
 kernel/bpf/verifier.c | 34 +++++++++++++++++++--
 net/core/filter.c     | 69 ++++++++++++++++++++++++++++++-------------
 3 files changed, 82 insertions(+), 22 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index fc5c901c7542..d0937f1d2980 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -292,6 +292,7 @@ enum bpf_arg_type {
 	ARG_PTR_TO_ALLOC_MEM,	/* pointer to dynamically allocated memory */
 	ARG_PTR_TO_ALLOC_MEM_OR_NULL,	/* pointer to dynamically allocated memory or NULL */
 	ARG_CONST_ALLOC_SIZE_OR_ZERO,	/* number of allocated bytes requested */
+	ARG_PTR_TO_BTF_ID_SOCK_COMMON,	/* pointer to in-kernel sock_common or bpf-mirrored bpf_sock */
 	__BPF_ARG_TYPE_MAX,
 };
 
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 945fa2b4d096..d4ba29fb17a6 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -486,7 +486,12 @@ static bool is_acquire_function(enum bpf_func_id func_id,
 static bool is_ptr_cast_function(enum bpf_func_id func_id)
 {
 	return func_id == BPF_FUNC_tcp_sock ||
-		func_id == BPF_FUNC_sk_fullsock;
+		func_id == BPF_FUNC_sk_fullsock ||
+		func_id == BPF_FUNC_skc_to_tcp_sock ||
+		func_id == BPF_FUNC_skc_to_tcp6_sock ||
+		func_id == BPF_FUNC_skc_to_udp6_sock ||
+		func_id == BPF_FUNC_skc_to_tcp_timewait_sock ||
+		func_id == BPF_FUNC_skc_to_tcp_request_sock;
 }
 
 /* string representation of 'enum bpf_reg_type' */
@@ -3953,6 +3958,7 @@ static int resolve_map_arg_type(struct bpf_verifier_env *env,
 
 struct bpf_reg_types {
 	const enum bpf_reg_type types[10];
+	u32 *btf_id;
 };
 
 static const struct bpf_reg_types map_key_value_types = {
@@ -3973,6 +3979,17 @@ static const struct bpf_reg_types sock_types = {
 	},
 };
 
+static const struct bpf_reg_types btf_id_sock_common_types = {
+	.types = {
+		PTR_TO_SOCK_COMMON,
+		PTR_TO_SOCKET,
+		PTR_TO_TCP_SOCK,
+		PTR_TO_XDP_SOCK,
+		PTR_TO_BTF_ID,
+	},
+	.btf_id = &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON],
+};
+
 static const struct bpf_reg_types mem_types = {
 	.types = {
 		PTR_TO_STACK,
@@ -4014,6 +4031,7 @@ static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
 	[ARG_PTR_TO_CTX]		= &context_types,
 	[ARG_PTR_TO_CTX_OR_NULL]	= &context_types,
 	[ARG_PTR_TO_SOCK_COMMON]	= &sock_types,
+	[ARG_PTR_TO_BTF_ID_SOCK_COMMON]	= &btf_id_sock_common_types,
 	[ARG_PTR_TO_SOCKET]		= &fullsock_types,
 	[ARG_PTR_TO_SOCKET_OR_NULL]	= &fullsock_types,
 	[ARG_PTR_TO_BTF_ID]		= &btf_ptr_types,
@@ -4059,6 +4077,14 @@ static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
 
 found:
 	if (type == PTR_TO_BTF_ID) {
+		if (!arg_btf_id) {
+			if (!compatible->btf_id) {
+				verbose(env, "verifier internal error: missing arg compatible BTF ID\n");
+				return -EFAULT;
+			}
+			arg_btf_id = compatible->btf_id;
+		}
+
 		if (!btf_struct_ids_match(&env->log, reg->off, reg->btf_id,
 					  *arg_btf_id)) {
 			verbose(env, "R%d is of type %s but %s is expected\n",
@@ -4575,10 +4601,14 @@ static bool check_btf_id_ok(const struct bpf_func_proto *fn)
 {
 	int i;
 
-	for (i = 0; i < ARRAY_SIZE(fn->arg_type); i++)
+	for (i = 0; i < ARRAY_SIZE(fn->arg_type); i++) {
 		if (fn->arg_type[i] == ARG_PTR_TO_BTF_ID && !fn->arg_btf_id[i])
 			return false;
 
+		if (fn->arg_type[i] != ARG_PTR_TO_BTF_ID && fn->arg_btf_id[i])
+			return false;
+	}
+
 	return true;
 }
 
diff --git a/net/core/filter.c b/net/core/filter.c
index 706f8db0ccf8..6d1864f2bd51 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -77,6 +77,9 @@
 #include <net/transp_v6.h>
 #include <linux/btf_ids.h>
 
+static const struct bpf_func_proto *
+bpf_sk_base_func_proto(enum bpf_func_id func_id);
+
 int copy_bpf_fprog_from_user(struct sock_fprog *dst, sockptr_t src, int len)
 {
 	if (in_compat_syscall()) {
@@ -6620,7 +6623,7 @@ sock_addr_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 			return NULL;
 		}
 	default:
-		return bpf_base_func_proto(func_id);
+		return bpf_sk_base_func_proto(func_id);
 	}
 }
 
@@ -6639,7 +6642,7 @@ sk_filter_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	case BPF_FUNC_perf_event_output:
 		return &bpf_skb_event_output_proto;
 	default:
-		return bpf_base_func_proto(func_id);
+		return bpf_sk_base_func_proto(func_id);
 	}
 }
 
@@ -6800,7 +6803,7 @@ tc_cls_act_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_sk_assign_proto;
 #endif
 	default:
-		return bpf_base_func_proto(func_id);
+		return bpf_sk_base_func_proto(func_id);
 	}
 }
 
@@ -6841,7 +6844,7 @@ xdp_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_tcp_gen_syncookie_proto;
 #endif
 	default:
-		return bpf_base_func_proto(func_id);
+		return bpf_sk_base_func_proto(func_id);
 	}
 }
 
@@ -6883,7 +6886,7 @@ sock_ops_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_tcp_sock_proto;
 #endif /* CONFIG_INET */
 	default:
-		return bpf_base_func_proto(func_id);
+		return bpf_sk_base_func_proto(func_id);
 	}
 }
 
@@ -6929,7 +6932,7 @@ sk_msg_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_get_cgroup_classid_curr_proto;
 #endif
 	default:
-		return bpf_base_func_proto(func_id);
+		return bpf_sk_base_func_proto(func_id);
 	}
 }
 
@@ -6971,7 +6974,7 @@ sk_skb_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_skc_lookup_tcp_proto;
 #endif
 	default:
-		return bpf_base_func_proto(func_id);
+		return bpf_sk_base_func_proto(func_id);
 	}
 }
 
@@ -6982,7 +6985,7 @@ flow_dissector_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	case BPF_FUNC_skb_load_bytes:
 		return &bpf_flow_dissector_load_bytes_proto;
 	default:
-		return bpf_base_func_proto(func_id);
+		return bpf_sk_base_func_proto(func_id);
 	}
 }
 
@@ -7009,7 +7012,7 @@ lwt_out_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	case BPF_FUNC_skb_under_cgroup:
 		return &bpf_skb_under_cgroup_proto;
 	default:
-		return bpf_base_func_proto(func_id);
+		return bpf_sk_base_func_proto(func_id);
 	}
 }
 
@@ -9746,7 +9749,7 @@ sk_lookup_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	case BPF_FUNC_sk_release:
 		return &bpf_sk_release_proto;
 	default:
-		return bpf_base_func_proto(func_id);
+		return bpf_sk_base_func_proto(func_id);
 	}
 }
 
@@ -9913,8 +9916,7 @@ const struct bpf_func_proto bpf_skc_to_tcp6_sock_proto = {
 	.func			= bpf_skc_to_tcp6_sock,
 	.gpl_only		= false,
 	.ret_type		= RET_PTR_TO_BTF_ID_OR_NULL,
-	.arg1_type		= ARG_PTR_TO_BTF_ID,
-	.arg1_btf_id		= &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON],
+	.arg1_type		= ARG_PTR_TO_BTF_ID_SOCK_COMMON,
 	.ret_btf_id		= &btf_sock_ids[BTF_SOCK_TYPE_TCP6],
 };
 
@@ -9930,8 +9932,7 @@ const struct bpf_func_proto bpf_skc_to_tcp_sock_proto = {
 	.func			= bpf_skc_to_tcp_sock,
 	.gpl_only		= false,
 	.ret_type		= RET_PTR_TO_BTF_ID_OR_NULL,
-	.arg1_type		= ARG_PTR_TO_BTF_ID,
-	.arg1_btf_id		= &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON],
+	.arg1_type		= ARG_PTR_TO_BTF_ID_SOCK_COMMON,
 	.ret_btf_id		= &btf_sock_ids[BTF_SOCK_TYPE_TCP],
 };
 
@@ -9954,8 +9955,7 @@ const struct bpf_func_proto bpf_skc_to_tcp_timewait_sock_proto = {
 	.func			= bpf_skc_to_tcp_timewait_sock,
 	.gpl_only		= false,
 	.ret_type		= RET_PTR_TO_BTF_ID_OR_NULL,
-	.arg1_type		= ARG_PTR_TO_BTF_ID,
-	.arg1_btf_id		= &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON],
+	.arg1_type		= ARG_PTR_TO_BTF_ID_SOCK_COMMON,
 	.ret_btf_id		= &btf_sock_ids[BTF_SOCK_TYPE_TCP_TW],
 };
 
@@ -9978,8 +9978,7 @@ const struct bpf_func_proto bpf_skc_to_tcp_request_sock_proto = {
 	.func			= bpf_skc_to_tcp_request_sock,
 	.gpl_only		= false,
 	.ret_type		= RET_PTR_TO_BTF_ID_OR_NULL,
-	.arg1_type		= ARG_PTR_TO_BTF_ID,
-	.arg1_btf_id		= &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON],
+	.arg1_type		= ARG_PTR_TO_BTF_ID_SOCK_COMMON,
 	.ret_btf_id		= &btf_sock_ids[BTF_SOCK_TYPE_TCP_REQ],
 };
 
@@ -10000,7 +9999,37 @@ const struct bpf_func_proto bpf_skc_to_udp6_sock_proto = {
 	.func			= bpf_skc_to_udp6_sock,
 	.gpl_only		= false,
 	.ret_type		= RET_PTR_TO_BTF_ID_OR_NULL,
-	.arg1_type		= ARG_PTR_TO_BTF_ID,
-	.arg1_btf_id		= &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON],
+	.arg1_type		= ARG_PTR_TO_BTF_ID_SOCK_COMMON,
 	.ret_btf_id		= &btf_sock_ids[BTF_SOCK_TYPE_UDP6],
 };
+
+static const struct bpf_func_proto *
+bpf_sk_base_func_proto(enum bpf_func_id func_id)
+{
+	const struct bpf_func_proto *func;
+
+	switch (func_id) {
+	case BPF_FUNC_skc_to_tcp6_sock:
+		func = &bpf_skc_to_tcp6_sock_proto;
+		break;
+	case BPF_FUNC_skc_to_tcp_sock:
+		func = &bpf_skc_to_tcp_sock_proto;
+		break;
+	case BPF_FUNC_skc_to_tcp_timewait_sock:
+		func = &bpf_skc_to_tcp_timewait_sock_proto;
+		break;
+	case BPF_FUNC_skc_to_tcp_request_sock:
+		func = &bpf_skc_to_tcp_request_sock_proto;
+		break;
+	case BPF_FUNC_skc_to_udp6_sock:
+		func = &bpf_skc_to_udp6_sock_proto;
+		break;
+	default:
+		return bpf_base_func_proto(func_id);
+	}
+
+	if (!perfmon_capable())
+		return NULL;
+
+	return func;
+}
-- 
2.24.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v4 bpf-next 03/13] bpf: Change bpf_sk_release and bpf_sk_*cgroup_id to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON
  2020-09-25  0:03 [PATCH v4 bpf-next 00/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type Martin KaFai Lau
  2020-09-25  0:03 ` [PATCH v4 bpf-next 01/13] bpf: Move the PTR_TO_BTF_ID check to check_reg_type() Martin KaFai Lau
  2020-09-25  0:03 ` [PATCH v4 bpf-next 02/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type Martin KaFai Lau
@ 2020-09-25  0:03 ` Martin KaFai Lau
  2020-09-25  0:04 ` [PATCH v4 bpf-next 04/13] bpf: Change bpf_sk_storage_*() " Martin KaFai Lau
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Martin KaFai Lau @ 2020-09-25  0:03 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team, Lorenz Bauer, netdev

The previous patch allows the networking bpf prog to use the
bpf_skc_to_*() helpers to get a PTR_TO_BTF_ID socket pointer,
e.g. "struct tcp_sock *".  It allows the bpf prog to read all the
fields of the tcp_sock.

This patch changes the bpf_sk_release() and bpf_sk_*cgroup_id()
to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will
work with the pointer returned by the bpf_skc_to_*() helpers
also.  For example, the following will work:

	sk = bpf_skc_lookup_tcp(skb, tuple, tuplen, BPF_F_CURRENT_NETNS, 0);
	if (!sk)
		return;
	tp = bpf_skc_to_tcp_sock(sk);
	if (!tp) {
		bpf_sk_release(sk);
		return;
	}
	lsndtime = tp->lsndtime;
	/* Pass tp to bpf_sk_release() will also work */
	bpf_sk_release(tp);

Since PTR_TO_BTF_ID could be NULL, the helper taking
ARG_PTR_TO_BTF_ID_SOCK_COMMON has to check for NULL at runtime.

A btf_id of "struct sock" may not always mean a fullsock.  Regardless
the helper's running context may get a non-fullsock or not,
considering fullsock check/handling is pretty cheap, it is better to
keep the same verifier expectation on helper that takes ARG_PTR_TO_BTF_ID*
will be able to handle the minisock situation.  In the bpf_sk_*cgroup_id()
case,  it will try to get a fullsock by using sk_to_full_sk() as its
skb variant bpf_sk"b"_*cgroup_id() has already been doing.

bpf_sk_release can already handle minisock, so nothing special has to
be done.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 include/uapi/linux/bpf.h       |  8 ++++----
 net/core/filter.c              | 30 ++++++++++++++----------------
 tools/include/uapi/linux/bpf.h |  8 ++++----
 3 files changed, 22 insertions(+), 24 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index a22812561064..c96a56d9c3be 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2512,7 +2512,7 @@ union bpf_attr {
  *		result is from *reuse*\ **->socks**\ [] using the hash of the
  *		tuple.
  *
- * long bpf_sk_release(struct bpf_sock *sock)
+ * long bpf_sk_release(void *sock)
  *	Description
  *		Release the reference held by *sock*. *sock* must be a
  *		non-**NULL** pointer that was returned from
@@ -3234,11 +3234,11 @@ union bpf_attr {
  *
  *		**-EOVERFLOW** if an overflow happened: The same object will be tried again.
  *
- * u64 bpf_sk_cgroup_id(struct bpf_sock *sk)
+ * u64 bpf_sk_cgroup_id(void *sk)
  *	Description
  *		Return the cgroup v2 id of the socket *sk*.
  *
- *		*sk* must be a non-**NULL** pointer to a full socket, e.g. one
+ *		*sk* must be a non-**NULL** pointer to a socket, e.g. one
  *		returned from **bpf_sk_lookup_xxx**\ (),
  *		**bpf_sk_fullsock**\ (), etc. The format of returned id is
  *		same as in **bpf_skb_cgroup_id**\ ().
@@ -3248,7 +3248,7 @@ union bpf_attr {
  *	Return
  *		The id is returned or 0 in case the id could not be retrieved.
  *
- * u64 bpf_sk_ancestor_cgroup_id(struct bpf_sock *sk, int ancestor_level)
+ * u64 bpf_sk_ancestor_cgroup_id(void *sk, int ancestor_level)
  *	Description
  *		Return id of cgroup v2 that is ancestor of cgroup associated
  *		with the *sk* at the *ancestor_level*.  The root cgroup is at
diff --git a/net/core/filter.c b/net/core/filter.c
index 6d1864f2bd51..06d397eeef2a 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -4088,18 +4088,17 @@ static inline u64 __bpf_sk_cgroup_id(struct sock *sk)
 {
 	struct cgroup *cgrp;
 
+	sk = sk_to_full_sk(sk);
+	if (!sk || !sk_fullsock(sk))
+		return 0;
+
 	cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
 	return cgroup_id(cgrp);
 }
 
 BPF_CALL_1(bpf_skb_cgroup_id, const struct sk_buff *, skb)
 {
-	struct sock *sk = skb_to_full_sk(skb);
-
-	if (!sk || !sk_fullsock(sk))
-		return 0;
-
-	return __bpf_sk_cgroup_id(sk);
+	return __bpf_sk_cgroup_id(skb->sk);
 }
 
 static const struct bpf_func_proto bpf_skb_cgroup_id_proto = {
@@ -4115,6 +4114,10 @@ static inline u64 __bpf_sk_ancestor_cgroup_id(struct sock *sk,
 	struct cgroup *ancestor;
 	struct cgroup *cgrp;
 
+	sk = sk_to_full_sk(sk);
+	if (!sk || !sk_fullsock(sk))
+		return 0;
+
 	cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
 	ancestor = cgroup_ancestor(cgrp, ancestor_level);
 	if (!ancestor)
@@ -4126,12 +4129,7 @@ static inline u64 __bpf_sk_ancestor_cgroup_id(struct sock *sk,
 BPF_CALL_2(bpf_skb_ancestor_cgroup_id, const struct sk_buff *, skb, int,
 	   ancestor_level)
 {
-	struct sock *sk = skb_to_full_sk(skb);
-
-	if (!sk || !sk_fullsock(sk))
-		return 0;
-
-	return __bpf_sk_ancestor_cgroup_id(sk, ancestor_level);
+	return __bpf_sk_ancestor_cgroup_id(skb->sk, ancestor_level);
 }
 
 static const struct bpf_func_proto bpf_skb_ancestor_cgroup_id_proto = {
@@ -4151,7 +4149,7 @@ static const struct bpf_func_proto bpf_sk_cgroup_id_proto = {
 	.func           = bpf_sk_cgroup_id,
 	.gpl_only       = false,
 	.ret_type       = RET_INTEGER,
-	.arg1_type      = ARG_PTR_TO_SOCKET,
+	.arg1_type      = ARG_PTR_TO_BTF_ID_SOCK_COMMON,
 };
 
 BPF_CALL_2(bpf_sk_ancestor_cgroup_id, struct sock *, sk, int, ancestor_level)
@@ -4163,7 +4161,7 @@ static const struct bpf_func_proto bpf_sk_ancestor_cgroup_id_proto = {
 	.func           = bpf_sk_ancestor_cgroup_id,
 	.gpl_only       = false,
 	.ret_type       = RET_INTEGER,
-	.arg1_type      = ARG_PTR_TO_SOCKET,
+	.arg1_type      = ARG_PTR_TO_BTF_ID_SOCK_COMMON,
 	.arg2_type      = ARG_ANYTHING,
 };
 #endif
@@ -5697,7 +5695,7 @@ static const struct bpf_func_proto bpf_sk_lookup_udp_proto = {
 
 BPF_CALL_1(bpf_sk_release, struct sock *, sk)
 {
-	if (sk_is_refcounted(sk))
+	if (sk && sk_is_refcounted(sk))
 		sock_gen_put(sk);
 	return 0;
 }
@@ -5706,7 +5704,7 @@ static const struct bpf_func_proto bpf_sk_release_proto = {
 	.func		= bpf_sk_release,
 	.gpl_only	= false,
 	.ret_type	= RET_INTEGER,
-	.arg1_type	= ARG_PTR_TO_SOCK_COMMON,
+	.arg1_type	= ARG_PTR_TO_BTF_ID_SOCK_COMMON,
 };
 
 BPF_CALL_5(bpf_xdp_sk_lookup_udp, struct xdp_buff *, ctx,
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index a22812561064..c96a56d9c3be 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2512,7 +2512,7 @@ union bpf_attr {
  *		result is from *reuse*\ **->socks**\ [] using the hash of the
  *		tuple.
  *
- * long bpf_sk_release(struct bpf_sock *sock)
+ * long bpf_sk_release(void *sock)
  *	Description
  *		Release the reference held by *sock*. *sock* must be a
  *		non-**NULL** pointer that was returned from
@@ -3234,11 +3234,11 @@ union bpf_attr {
  *
  *		**-EOVERFLOW** if an overflow happened: The same object will be tried again.
  *
- * u64 bpf_sk_cgroup_id(struct bpf_sock *sk)
+ * u64 bpf_sk_cgroup_id(void *sk)
  *	Description
  *		Return the cgroup v2 id of the socket *sk*.
  *
- *		*sk* must be a non-**NULL** pointer to a full socket, e.g. one
+ *		*sk* must be a non-**NULL** pointer to a socket, e.g. one
  *		returned from **bpf_sk_lookup_xxx**\ (),
  *		**bpf_sk_fullsock**\ (), etc. The format of returned id is
  *		same as in **bpf_skb_cgroup_id**\ ().
@@ -3248,7 +3248,7 @@ union bpf_attr {
  *	Return
  *		The id is returned or 0 in case the id could not be retrieved.
  *
- * u64 bpf_sk_ancestor_cgroup_id(struct bpf_sock *sk, int ancestor_level)
+ * u64 bpf_sk_ancestor_cgroup_id(void *sk, int ancestor_level)
  *	Description
  *		Return id of cgroup v2 that is ancestor of cgroup associated
  *		with the *sk* at the *ancestor_level*.  The root cgroup is at
-- 
2.24.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v4 bpf-next 04/13] bpf: Change bpf_sk_storage_*() to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON
  2020-09-25  0:03 [PATCH v4 bpf-next 00/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type Martin KaFai Lau
                   ` (2 preceding siblings ...)
  2020-09-25  0:03 ` [PATCH v4 bpf-next 03/13] bpf: Change bpf_sk_release and bpf_sk_*cgroup_id to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON Martin KaFai Lau
@ 2020-09-25  0:04 ` Martin KaFai Lau
  2020-09-25  0:04 ` [PATCH v4 bpf-next 05/13] bpf: Change bpf_tcp_*_syncookie " Martin KaFai Lau
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Martin KaFai Lau @ 2020-09-25  0:04 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team, Lorenz Bauer, netdev

This patch changes the bpf_sk_storage_*() to take
ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer
returned by the bpf_skc_to_*() helpers also.

A micro benchmark has been done on a "cgroup_skb/egress" bpf program
which does a bpf_sk_storage_get().  It was driven by netperf doing
a 4096 connected UDP_STREAM test with 64bytes packet.
The stats from "kernel.bpf_stats_enabled" shows no meaningful difference.

The sk_storage_get_btf_proto, sk_storage_delete_btf_proto,
btf_sk_storage_get_proto, and btf_sk_storage_delete_proto are
no longer needed, so they are removed.

Acked-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 include/net/bpf_sk_storage.h   |  2 --
 include/uapi/linux/bpf.h       |  1 +
 kernel/bpf/bpf_lsm.c           |  4 ++--
 net/core/bpf_sk_storage.c      | 29 ++++++-----------------------
 net/ipv4/bpf_tcp_ca.c          | 23 ++---------------------
 tools/include/uapi/linux/bpf.h |  1 +
 6 files changed, 12 insertions(+), 48 deletions(-)

diff --git a/include/net/bpf_sk_storage.h b/include/net/bpf_sk_storage.h
index 119f4c9c3a9c..3c516dd07caf 100644
--- a/include/net/bpf_sk_storage.h
+++ b/include/net/bpf_sk_storage.h
@@ -20,8 +20,6 @@ void bpf_sk_storage_free(struct sock *sk);
 
 extern const struct bpf_func_proto bpf_sk_storage_get_proto;
 extern const struct bpf_func_proto bpf_sk_storage_delete_proto;
-extern const struct bpf_func_proto sk_storage_get_btf_proto;
-extern const struct bpf_func_proto sk_storage_delete_btf_proto;
 
 struct bpf_local_storage_elem;
 struct bpf_sk_storage_diag;
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c96a56d9c3be..0ec6dbeb17a5 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2861,6 +2861,7 @@ union bpf_attr {
  *		0 on success.
  *
  *		**-ENOENT** if the bpf-local-storage cannot be found.
+ *		**-EINVAL** if sk is not a fullsock (e.g. a request_sock).
  *
  * long bpf_send_signal(u32 sig)
  *	Description
diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
index 9cd1428c7199..78ea8a7bd27f 100644
--- a/kernel/bpf/bpf_lsm.c
+++ b/kernel/bpf/bpf_lsm.c
@@ -56,9 +56,9 @@ bpf_lsm_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	case BPF_FUNC_inode_storage_delete:
 		return &bpf_inode_storage_delete_proto;
 	case BPF_FUNC_sk_storage_get:
-		return &sk_storage_get_btf_proto;
+		return &bpf_sk_storage_get_proto;
 	case BPF_FUNC_sk_storage_delete:
-		return &sk_storage_delete_btf_proto;
+		return &bpf_sk_storage_delete_proto;
 	default:
 		return tracing_prog_func_proto(func_id, prog);
 	}
diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
index 838efc682cff..c907f0dc7f87 100644
--- a/net/core/bpf_sk_storage.c
+++ b/net/core/bpf_sk_storage.c
@@ -269,7 +269,7 @@ BPF_CALL_4(bpf_sk_storage_get, struct bpf_map *, map, struct sock *, sk,
 {
 	struct bpf_local_storage_data *sdata;
 
-	if (flags > BPF_SK_STORAGE_GET_F_CREATE)
+	if (!sk || !sk_fullsock(sk) || flags > BPF_SK_STORAGE_GET_F_CREATE)
 		return (unsigned long)NULL;
 
 	sdata = sk_storage_lookup(sk, map, true);
@@ -299,6 +299,9 @@ BPF_CALL_4(bpf_sk_storage_get, struct bpf_map *, map, struct sock *, sk,
 
 BPF_CALL_2(bpf_sk_storage_delete, struct bpf_map *, map, struct sock *, sk)
 {
+	if (!sk || !sk_fullsock(sk))
+		return -EINVAL;
+
 	if (refcount_inc_not_zero(&sk->sk_refcnt)) {
 		int err;
 
@@ -355,7 +358,7 @@ const struct bpf_func_proto bpf_sk_storage_get_proto = {
 	.gpl_only	= false,
 	.ret_type	= RET_PTR_TO_MAP_VALUE_OR_NULL,
 	.arg1_type	= ARG_CONST_MAP_PTR,
-	.arg2_type	= ARG_PTR_TO_SOCKET,
+	.arg2_type	= ARG_PTR_TO_BTF_ID_SOCK_COMMON,
 	.arg3_type	= ARG_PTR_TO_MAP_VALUE_OR_NULL,
 	.arg4_type	= ARG_ANYTHING,
 };
@@ -375,27 +378,7 @@ const struct bpf_func_proto bpf_sk_storage_delete_proto = {
 	.gpl_only	= false,
 	.ret_type	= RET_INTEGER,
 	.arg1_type	= ARG_CONST_MAP_PTR,
-	.arg2_type	= ARG_PTR_TO_SOCKET,
-};
-
-const struct bpf_func_proto sk_storage_get_btf_proto = {
-	.func		= bpf_sk_storage_get,
-	.gpl_only	= false,
-	.ret_type	= RET_PTR_TO_MAP_VALUE_OR_NULL,
-	.arg1_type	= ARG_CONST_MAP_PTR,
-	.arg2_type	= ARG_PTR_TO_BTF_ID,
-	.arg2_btf_id	= &btf_sock_ids[BTF_SOCK_TYPE_SOCK],
-	.arg3_type	= ARG_PTR_TO_MAP_VALUE_OR_NULL,
-	.arg4_type	= ARG_ANYTHING,
-};
-
-const struct bpf_func_proto sk_storage_delete_btf_proto = {
-	.func		= bpf_sk_storage_delete,
-	.gpl_only	= false,
-	.ret_type	= RET_INTEGER,
-	.arg1_type	= ARG_CONST_MAP_PTR,
-	.arg2_type	= ARG_PTR_TO_BTF_ID,
-	.arg2_btf_id	= &btf_sock_ids[BTF_SOCK_TYPE_SOCK],
+	.arg2_type	= ARG_PTR_TO_BTF_ID_SOCK_COMMON,
 };
 
 struct bpf_sk_storage_diag {
diff --git a/net/ipv4/bpf_tcp_ca.c b/net/ipv4/bpf_tcp_ca.c
index 74a2ef598c31..618954f82764 100644
--- a/net/ipv4/bpf_tcp_ca.c
+++ b/net/ipv4/bpf_tcp_ca.c
@@ -28,22 +28,6 @@ static u32 unsupported_ops[] = {
 static const struct btf_type *tcp_sock_type;
 static u32 tcp_sock_id, sock_id;
 
-static struct bpf_func_proto btf_sk_storage_get_proto __read_mostly;
-static struct bpf_func_proto btf_sk_storage_delete_proto __read_mostly;
-
-static void convert_sk_func_proto(struct bpf_func_proto *to, const struct bpf_func_proto *from)
-{
-	int i;
-
-	*to = *from;
-	for (i = 0; i < ARRAY_SIZE(to->arg_type); i++) {
-		if (to->arg_type[i] == ARG_PTR_TO_SOCKET) {
-			to->arg_type[i] = ARG_PTR_TO_BTF_ID;
-			to->arg_btf_id[i] = &tcp_sock_id;
-		}
-	}
-}
-
 static int bpf_tcp_ca_init(struct btf *btf)
 {
 	s32 type_id;
@@ -59,9 +43,6 @@ static int bpf_tcp_ca_init(struct btf *btf)
 	tcp_sock_id = type_id;
 	tcp_sock_type = btf_type_by_id(btf, tcp_sock_id);
 
-	convert_sk_func_proto(&btf_sk_storage_get_proto, &bpf_sk_storage_get_proto);
-	convert_sk_func_proto(&btf_sk_storage_delete_proto, &bpf_sk_storage_delete_proto);
-
 	return 0;
 }
 
@@ -188,9 +169,9 @@ bpf_tcp_ca_get_func_proto(enum bpf_func_id func_id,
 	case BPF_FUNC_tcp_send_ack:
 		return &bpf_tcp_send_ack_proto;
 	case BPF_FUNC_sk_storage_get:
-		return &btf_sk_storage_get_proto;
+		return &bpf_sk_storage_get_proto;
 	case BPF_FUNC_sk_storage_delete:
-		return &btf_sk_storage_delete_proto;
+		return &bpf_sk_storage_delete_proto;
 	default:
 		return bpf_base_func_proto(func_id);
 	}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index c96a56d9c3be..0ec6dbeb17a5 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2861,6 +2861,7 @@ union bpf_attr {
  *		0 on success.
  *
  *		**-ENOENT** if the bpf-local-storage cannot be found.
+ *		**-EINVAL** if sk is not a fullsock (e.g. a request_sock).
  *
  * long bpf_send_signal(u32 sig)
  *	Description
-- 
2.24.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v4 bpf-next 05/13] bpf: Change bpf_tcp_*_syncookie to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON
  2020-09-25  0:03 [PATCH v4 bpf-next 00/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type Martin KaFai Lau
                   ` (3 preceding siblings ...)
  2020-09-25  0:04 ` [PATCH v4 bpf-next 04/13] bpf: Change bpf_sk_storage_*() " Martin KaFai Lau
@ 2020-09-25  0:04 ` Martin KaFai Lau
  2020-09-25  0:04 ` [PATCH v4 bpf-next 06/13] bpf: Change bpf_sk_assign " Martin KaFai Lau
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Martin KaFai Lau @ 2020-09-25  0:04 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team, Lorenz Bauer, netdev

This patch changes the bpf_tcp_*_syncookie() to take
ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer
returned by the bpf_skc_to_*() helpers also.

Acked-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 include/uapi/linux/bpf.h       | 4 ++--
 net/core/filter.c              | 8 ++++----
 tools/include/uapi/linux/bpf.h | 4 ++--
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 0ec6dbeb17a5..69b9e30375bc 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2692,7 +2692,7 @@ union bpf_attr {
  *		result is from *reuse*\ **->socks**\ [] using the hash of the
  *		tuple.
  *
- * long bpf_tcp_check_syncookie(struct bpf_sock *sk, void *iph, u32 iph_len, struct tcphdr *th, u32 th_len)
+ * long bpf_tcp_check_syncookie(void *sk, void *iph, u32 iph_len, struct tcphdr *th, u32 th_len)
  * 	Description
  * 		Check whether *iph* and *th* contain a valid SYN cookie ACK for
  * 		the listening socket in *sk*.
@@ -2878,7 +2878,7 @@ union bpf_attr {
  *
  *		**-EAGAIN** if bpf program can try again.
  *
- * s64 bpf_tcp_gen_syncookie(struct bpf_sock *sk, void *iph, u32 iph_len, struct tcphdr *th, u32 th_len)
+ * s64 bpf_tcp_gen_syncookie(void *sk, void *iph, u32 iph_len, struct tcphdr *th, u32 th_len)
  *	Description
  *		Try to issue a SYN cookie for the packet with corresponding
  *		IP/TCP headers, *iph* and *th*, on the listening socket in *sk*.
diff --git a/net/core/filter.c b/net/core/filter.c
index 06d397eeef2a..1d88e9b498eb 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -6086,7 +6086,7 @@ BPF_CALL_5(bpf_tcp_check_syncookie, struct sock *, sk, void *, iph, u32, iph_len
 	u32 cookie;
 	int ret;
 
-	if (unlikely(th_len < sizeof(*th)))
+	if (unlikely(!sk || th_len < sizeof(*th)))
 		return -EINVAL;
 
 	/* sk_listener() allows TCP_NEW_SYN_RECV, which makes no sense here. */
@@ -6139,7 +6139,7 @@ static const struct bpf_func_proto bpf_tcp_check_syncookie_proto = {
 	.gpl_only	= true,
 	.pkt_access	= true,
 	.ret_type	= RET_INTEGER,
-	.arg1_type	= ARG_PTR_TO_SOCK_COMMON,
+	.arg1_type	= ARG_PTR_TO_BTF_ID_SOCK_COMMON,
 	.arg2_type	= ARG_PTR_TO_MEM,
 	.arg3_type	= ARG_CONST_SIZE,
 	.arg4_type	= ARG_PTR_TO_MEM,
@@ -6153,7 +6153,7 @@ BPF_CALL_5(bpf_tcp_gen_syncookie, struct sock *, sk, void *, iph, u32, iph_len,
 	u32 cookie;
 	u16 mss;
 
-	if (unlikely(th_len < sizeof(*th) || th_len != th->doff * 4))
+	if (unlikely(!sk || th_len < sizeof(*th) || th_len != th->doff * 4))
 		return -EINVAL;
 
 	if (sk->sk_protocol != IPPROTO_TCP || sk->sk_state != TCP_LISTEN)
@@ -6208,7 +6208,7 @@ static const struct bpf_func_proto bpf_tcp_gen_syncookie_proto = {
 	.gpl_only	= true, /* __cookie_v*_init_sequence() is GPL */
 	.pkt_access	= true,
 	.ret_type	= RET_INTEGER,
-	.arg1_type	= ARG_PTR_TO_SOCK_COMMON,
+	.arg1_type	= ARG_PTR_TO_BTF_ID_SOCK_COMMON,
 	.arg2_type	= ARG_PTR_TO_MEM,
 	.arg3_type	= ARG_CONST_SIZE,
 	.arg4_type	= ARG_PTR_TO_MEM,
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 0ec6dbeb17a5..69b9e30375bc 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2692,7 +2692,7 @@ union bpf_attr {
  *		result is from *reuse*\ **->socks**\ [] using the hash of the
  *		tuple.
  *
- * long bpf_tcp_check_syncookie(struct bpf_sock *sk, void *iph, u32 iph_len, struct tcphdr *th, u32 th_len)
+ * long bpf_tcp_check_syncookie(void *sk, void *iph, u32 iph_len, struct tcphdr *th, u32 th_len)
  * 	Description
  * 		Check whether *iph* and *th* contain a valid SYN cookie ACK for
  * 		the listening socket in *sk*.
@@ -2878,7 +2878,7 @@ union bpf_attr {
  *
  *		**-EAGAIN** if bpf program can try again.
  *
- * s64 bpf_tcp_gen_syncookie(struct bpf_sock *sk, void *iph, u32 iph_len, struct tcphdr *th, u32 th_len)
+ * s64 bpf_tcp_gen_syncookie(void *sk, void *iph, u32 iph_len, struct tcphdr *th, u32 th_len)
  *	Description
  *		Try to issue a SYN cookie for the packet with corresponding
  *		IP/TCP headers, *iph* and *th*, on the listening socket in *sk*.
-- 
2.24.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v4 bpf-next 06/13] bpf: Change bpf_sk_assign to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON
  2020-09-25  0:03 [PATCH v4 bpf-next 00/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type Martin KaFai Lau
                   ` (4 preceding siblings ...)
  2020-09-25  0:04 ` [PATCH v4 bpf-next 05/13] bpf: Change bpf_tcp_*_syncookie " Martin KaFai Lau
@ 2020-09-25  0:04 ` Martin KaFai Lau
  2020-09-25  0:04 ` [PATCH v4 bpf-next 07/13] bpf: selftest: Add ref_tracking verifier test for bpf_skc casting Martin KaFai Lau
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Martin KaFai Lau @ 2020-09-25  0:04 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team, Lorenz Bauer, netdev

This patch changes the bpf_sk_assign() to take
ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer
returned by the bpf_skc_to_*() helpers also.

The bpf_sk_lookup_assign() is taking ARG_PTR_TO_SOCKET_"OR_NULL".  Meaning
it specifically takes a literal NULL.  ARG_PTR_TO_BTF_ID_SOCK_COMMON
does not allow a literal NULL, so another ARG type is required
for this purpose and another follow-up patch can be used if
there is such need.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 include/uapi/linux/bpf.h       | 2 +-
 net/core/filter.c              | 4 ++--
 tools/include/uapi/linux/bpf.h | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 69b9e30375bc..2d6519a2ed77 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -3107,7 +3107,7 @@ union bpf_attr {
  * 	Return
  * 		The id is returned or 0 in case the id could not be retrieved.
  *
- * long bpf_sk_assign(struct sk_buff *skb, struct bpf_sock *sk, u64 flags)
+ * long bpf_sk_assign(struct sk_buff *skb, void *sk, u64 flags)
  *	Description
  *		Helper is overloaded depending on BPF program type. This
  *		description applies to **BPF_PROG_TYPE_SCHED_CLS** and
diff --git a/net/core/filter.c b/net/core/filter.c
index 1d88e9b498eb..af88935e24b1 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -6217,7 +6217,7 @@ static const struct bpf_func_proto bpf_tcp_gen_syncookie_proto = {
 
 BPF_CALL_3(bpf_sk_assign, struct sk_buff *, skb, struct sock *, sk, u64, flags)
 {
-	if (flags != 0)
+	if (!sk || flags != 0)
 		return -EINVAL;
 	if (!skb_at_tc_ingress(skb))
 		return -EOPNOTSUPP;
@@ -6241,7 +6241,7 @@ static const struct bpf_func_proto bpf_sk_assign_proto = {
 	.gpl_only	= false,
 	.ret_type	= RET_INTEGER,
 	.arg1_type      = ARG_PTR_TO_CTX,
-	.arg2_type      = ARG_PTR_TO_SOCK_COMMON,
+	.arg2_type      = ARG_PTR_TO_BTF_ID_SOCK_COMMON,
 	.arg3_type	= ARG_ANYTHING,
 };
 
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 69b9e30375bc..2d6519a2ed77 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -3107,7 +3107,7 @@ union bpf_attr {
  * 	Return
  * 		The id is returned or 0 in case the id could not be retrieved.
  *
- * long bpf_sk_assign(struct sk_buff *skb, struct bpf_sock *sk, u64 flags)
+ * long bpf_sk_assign(struct sk_buff *skb, void *sk, u64 flags)
  *	Description
  *		Helper is overloaded depending on BPF program type. This
  *		description applies to **BPF_PROG_TYPE_SCHED_CLS** and
-- 
2.24.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v4 bpf-next 07/13] bpf: selftest: Add ref_tracking verifier test for bpf_skc casting
  2020-09-25  0:03 [PATCH v4 bpf-next 00/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type Martin KaFai Lau
                   ` (5 preceding siblings ...)
  2020-09-25  0:04 ` [PATCH v4 bpf-next 06/13] bpf: Change bpf_sk_assign " Martin KaFai Lau
@ 2020-09-25  0:04 ` Martin KaFai Lau
  2020-09-25  8:30   ` Lorenz Bauer
  2020-09-25  0:04 ` [PATCH v4 bpf-next 08/13] bpf: selftest: Move sock_fields test into test_progs Martin KaFai Lau
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 26+ messages in thread
From: Martin KaFai Lau @ 2020-09-25  0:04 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team, Lorenz Bauer, netdev

The patch tests for:
1. bpf_sk_release() can be called on a tcp_sock btf_id ptr.

2. Ensure the tcp_sock btf_id pointer cannot be used
   after bpf_sk_release().

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 .../selftests/bpf/verifier/ref_tracking.c     | 47 +++++++++++++++++++
 1 file changed, 47 insertions(+)

diff --git a/tools/testing/selftests/bpf/verifier/ref_tracking.c b/tools/testing/selftests/bpf/verifier/ref_tracking.c
index 056e0273bf12..006b5bd99c08 100644
--- a/tools/testing/selftests/bpf/verifier/ref_tracking.c
+++ b/tools/testing/selftests/bpf/verifier/ref_tracking.c
@@ -854,3 +854,50 @@
 	.errstr = "Unreleased reference",
 	.result = REJECT,
 },
+{
+	"reference tracking: bpf_sk_release(btf_tcp_sock)",
+	.insns = {
+	BPF_SK_LOOKUP(sk_lookup_tcp),
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
+	BPF_EXIT_INSN(),
+	BPF_MOV64_REG(BPF_REG_6, BPF_REG_0),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
+	BPF_EMIT_CALL(BPF_FUNC_skc_to_tcp_sock),
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 3),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+	BPF_EMIT_CALL(BPF_FUNC_sk_release),
+	BPF_EXIT_INSN(),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
+	BPF_EMIT_CALL(BPF_FUNC_sk_release),
+	BPF_EXIT_INSN(),
+	},
+	.prog_type = BPF_PROG_TYPE_SCHED_CLS,
+	.result = ACCEPT,
+	.result_unpriv = REJECT,
+	.errstr_unpriv = "unknown func",
+},
+{
+	"reference tracking: use ptr from bpf_skc_to_tcp_sock() after release",
+	.insns = {
+	BPF_SK_LOOKUP(sk_lookup_tcp),
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
+	BPF_EXIT_INSN(),
+	BPF_MOV64_REG(BPF_REG_6, BPF_REG_0),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
+	BPF_EMIT_CALL(BPF_FUNC_skc_to_tcp_sock),
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 3),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+	BPF_EMIT_CALL(BPF_FUNC_sk_release),
+	BPF_EXIT_INSN(),
+	BPF_MOV64_REG(BPF_REG_7, BPF_REG_0),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+	BPF_EMIT_CALL(BPF_FUNC_sk_release),
+	BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_7, 0),
+	BPF_EXIT_INSN(),
+	},
+	.prog_type = BPF_PROG_TYPE_SCHED_CLS,
+	.result = REJECT,
+	.errstr = "invalid mem access",
+	.result_unpriv = REJECT,
+	.errstr_unpriv = "unknown func",
+},
-- 
2.24.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v4 bpf-next 08/13] bpf: selftest: Move sock_fields test into test_progs
  2020-09-25  0:03 [PATCH v4 bpf-next 00/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type Martin KaFai Lau
                   ` (6 preceding siblings ...)
  2020-09-25  0:04 ` [PATCH v4 bpf-next 07/13] bpf: selftest: Add ref_tracking verifier test for bpf_skc casting Martin KaFai Lau
@ 2020-09-25  0:04 ` Martin KaFai Lau
  2020-09-25  0:04 ` [PATCH v4 bpf-next 09/13] bpf: selftest: Adapt sock_fields test to use skel and global variables Martin KaFai Lau
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Martin KaFai Lau @ 2020-09-25  0:04 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team, Lorenz Bauer, netdev

This is a mechanical change to
1. move test_sock_fields.c to prog_tests/sock_fields.c
2. rename progs/test_sock_fields_kern.c to progs/test_sock_fields.c

Minimal change is made to the code itself.  Next patch will make
changes to use new ways of writing test, e.g. use skel and global
variables.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 tools/testing/selftests/bpf/.gitignore                      | 1 -
 tools/testing/selftests/bpf/Makefile                        | 2 +-
 .../bpf/{test_sock_fields.c => prog_tests/sock_fields.c}    | 6 ++----
 .../progs/{test_sock_fields_kern.c => test_sock_fields.c}   | 0
 4 files changed, 3 insertions(+), 6 deletions(-)
 rename tools/testing/selftests/bpf/{test_sock_fields.c => prog_tests/sock_fields.c} (99%)
 rename tools/testing/selftests/bpf/progs/{test_sock_fields_kern.c => test_sock_fields.c} (100%)

diff --git a/tools/testing/selftests/bpf/.gitignore b/tools/testing/selftests/bpf/.gitignore
index e8fed558b8b8..3ab1200e172f 100644
--- a/tools/testing/selftests/bpf/.gitignore
+++ b/tools/testing/selftests/bpf/.gitignore
@@ -13,7 +13,6 @@ test_verifier_log
 feature
 test_sock
 test_sock_addr
-test_sock_fields
 urandom_read
 test_sockmap
 test_lirc_mode2_user
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 59a5fa5fe837..bdbeafec371b 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -35,7 +35,7 @@ TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map test
 	test_verifier_log test_dev_cgroup test_tcpbpf_user \
 	test_sock test_sockmap get_cgroup_id_user test_socket_cookie \
 	test_cgroup_storage \
-	test_netcnt test_tcpnotify_user test_sock_fields test_sysctl \
+	test_netcnt test_tcpnotify_user test_sysctl \
 	test_progs-no_alu32 \
 	test_current_pid_tgid_new_ns
 
diff --git a/tools/testing/selftests/bpf/test_sock_fields.c b/tools/testing/selftests/bpf/prog_tests/sock_fields.c
similarity index 99%
rename from tools/testing/selftests/bpf/test_sock_fields.c
rename to tools/testing/selftests/bpf/prog_tests/sock_fields.c
index 6c9f269c396d..1138223780fc 100644
--- a/tools/testing/selftests/bpf/test_sock_fields.c
+++ b/tools/testing/selftests/bpf/prog_tests/sock_fields.c
@@ -409,10 +409,10 @@ static void test(void)
 	check_result();
 }
 
-int main(int argc, char **argv)
+void test_sock_fields(void)
 {
 	struct bpf_prog_load_attr attr = {
-		.file = "test_sock_fields_kern.o",
+		.file = "test_sock_fields.o",
 		.prog_type = BPF_PROG_TYPE_CGROUP_SKB,
 		.prog_flags = BPF_F_TEST_RND_HI32,
 	};
@@ -477,6 +477,4 @@ int main(int argc, char **argv)
 	cleanup_cgroup_environment();
 
 	printf("PASS\n");
-
-	return 0;
 }
diff --git a/tools/testing/selftests/bpf/progs/test_sock_fields_kern.c b/tools/testing/selftests/bpf/progs/test_sock_fields.c
similarity index 100%
rename from tools/testing/selftests/bpf/progs/test_sock_fields_kern.c
rename to tools/testing/selftests/bpf/progs/test_sock_fields.c
-- 
2.24.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v4 bpf-next 09/13] bpf: selftest: Adapt sock_fields test to use skel and global variables
  2020-09-25  0:03 [PATCH v4 bpf-next 00/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type Martin KaFai Lau
                   ` (7 preceding siblings ...)
  2020-09-25  0:04 ` [PATCH v4 bpf-next 08/13] bpf: selftest: Move sock_fields test into test_progs Martin KaFai Lau
@ 2020-09-25  0:04 ` Martin KaFai Lau
  2020-09-25  0:04 ` [PATCH v4 bpf-next 10/13] bpf: selftest: Use network_helpers in the sock_fields test Martin KaFai Lau
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Martin KaFai Lau @ 2020-09-25  0:04 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team, Lorenz Bauer, netdev

skel is used.

Global variables are used to store the result from bpf prog.
addr_map, sock_result_map, and tcp_sock_result_map are gone.
Instead, global variables listen_tp, srv_sa6, cli_tp,, srv_tp,
listen_sk, srv_sk, and cli_sk are added.
Because of that, bpf_addr_array_idx and bpf_result_array_idx are also
no longer needed.

CHECK() macro from test_progs.h is reused and bail as soon as
a CHECK failure.

shutdown() is used to ensure the previous data-ack is received.
The bytes_acked, bytes_received, and the pkt_out_cnt checks are
using "<" to accommodate the final ack may not have been received/sent.
It is enough since it is not the focus of this test.

The sk local storage is all initialized to 0xeB9F now, so the
check_sk_pkt_out_cnt() always checks with the 0xeB9F base.  It is to
keep things simple.

The next patch will reuse helpers from network_helpers.h to simplify
things further.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 .../selftests/bpf/prog_tests/sock_fields.c    | 380 ++++++++----------
 .../selftests/bpf/progs/test_sock_fields.c    | 154 +++----
 2 files changed, 229 insertions(+), 305 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/sock_fields.c b/tools/testing/selftests/bpf/prog_tests/sock_fields.c
index 1138223780fc..d96b718639fb 100644
--- a/tools/testing/selftests/bpf/prog_tests/sock_fields.c
+++ b/tools/testing/selftests/bpf/prog_tests/sock_fields.c
@@ -14,20 +14,9 @@
 #include <bpf/libbpf.h>
 
 #include "cgroup_helpers.h"
+#include "test_progs.h"
 #include "bpf_rlimit.h"
-
-enum bpf_addr_array_idx {
-	ADDR_SRV_IDX,
-	ADDR_CLI_IDX,
-	__NR_BPF_ADDR_ARRAY_IDX,
-};
-
-enum bpf_result_array_idx {
-	EGRESS_SRV_IDX,
-	EGRESS_CLI_IDX,
-	INGRESS_LISTEN_IDX,
-	__NR_BPF_RESULT_ARRAY_IDX,
-};
+#include "test_sock_fields.skel.h"
 
 enum bpf_linum_array_idx {
 	EGRESS_LINUM_IDX,
@@ -40,34 +29,16 @@ struct bpf_spinlock_cnt {
 	__u32 cnt;
 };
 
-#define CHECK(condition, tag, format...) ({				\
-	int __ret = !!(condition);					\
-	if (__ret) {							\
-		printf("%s(%d):FAIL:%s ", __func__, __LINE__, tag);	\
-		printf(format);						\
-		printf("\n");						\
-		exit(-1);						\
-	}								\
-})
-
 #define TEST_CGROUP "/test-bpf-sock-fields"
 #define DATA "Hello BPF!"
 #define DATA_LEN sizeof(DATA)
 
 static struct sockaddr_in6 srv_sa6, cli_sa6;
 static int sk_pkt_out_cnt10_fd;
+struct test_sock_fields *skel;
 static int sk_pkt_out_cnt_fd;
 static int linum_map_fd;
-static int addr_map_fd;
-static int tp_map_fd;
-static int sk_map_fd;
-
-static __u32 addr_srv_idx = ADDR_SRV_IDX;
-static __u32 addr_cli_idx = ADDR_CLI_IDX;
-
-static __u32 egress_srv_idx = EGRESS_SRV_IDX;
-static __u32 egress_cli_idx = EGRESS_CLI_IDX;
-static __u32 ingress_listen_idx = INGRESS_LISTEN_IDX;
+static __u32 duration;
 
 static __u32 egress_linum_idx = EGRESS_LINUM_IDX;
 static __u32 ingress_linum_idx = INGRESS_LINUM_IDX;
@@ -79,7 +50,7 @@ static void init_loopback6(struct sockaddr_in6 *sa6)
 	sa6->sin6_addr = in6addr_loopback;
 }
 
-static void print_sk(const struct bpf_sock *sk)
+static void print_sk(const struct bpf_sock *sk, const char *prefix)
 {
 	char src_ip4[24], dst_ip4[24];
 	char src_ip6[64], dst_ip6[64];
@@ -89,9 +60,10 @@ static void print_sk(const struct bpf_sock *sk)
 	inet_ntop(AF_INET, &sk->dst_ip4, dst_ip4, sizeof(dst_ip4));
 	inet_ntop(AF_INET6, &sk->dst_ip6, dst_ip6, sizeof(dst_ip6));
 
-	printf("state:%u bound_dev_if:%u family:%u type:%u protocol:%u mark:%u priority:%u "
+	printf("%s: state:%u bound_dev_if:%u family:%u type:%u protocol:%u mark:%u priority:%u "
 	       "src_ip4:%x(%s) src_ip6:%x:%x:%x:%x(%s) src_port:%u "
 	       "dst_ip4:%x(%s) dst_ip6:%x:%x:%x:%x(%s) dst_port:%u\n",
+	       prefix,
 	       sk->state, sk->bound_dev_if, sk->family, sk->type, sk->protocol,
 	       sk->mark, sk->priority,
 	       sk->src_ip4, src_ip4,
@@ -102,14 +74,15 @@ static void print_sk(const struct bpf_sock *sk)
 	       dst_ip6, ntohs(sk->dst_port));
 }
 
-static void print_tp(const struct bpf_tcp_sock *tp)
+static void print_tp(const struct bpf_tcp_sock *tp, const char *prefix)
 {
-	printf("snd_cwnd:%u srtt_us:%u rtt_min:%u snd_ssthresh:%u rcv_nxt:%u "
+	printf("%s: snd_cwnd:%u srtt_us:%u rtt_min:%u snd_ssthresh:%u rcv_nxt:%u "
 	       "snd_nxt:%u snd:una:%u mss_cache:%u ecn_flags:%u "
 	       "rate_delivered:%u rate_interval_us:%u packets_out:%u "
 	       "retrans_out:%u total_retrans:%u segs_in:%u data_segs_in:%u "
 	       "segs_out:%u data_segs_out:%u lost_out:%u sacked_out:%u "
 	       "bytes_received:%llu bytes_acked:%llu\n",
+	       prefix,
 	       tp->snd_cwnd, tp->srtt_us, tp->rtt_min, tp->snd_ssthresh,
 	       tp->rcv_nxt, tp->snd_nxt, tp->snd_una, tp->mss_cache,
 	       tp->ecn_flags, tp->rate_delivered, tp->rate_interval_us,
@@ -129,57 +102,26 @@ static void check_result(void)
 	err = bpf_map_lookup_elem(linum_map_fd, &egress_linum_idx,
 				  &egress_linum);
 	CHECK(err == -1, "bpf_map_lookup_elem(linum_map_fd)",
-	      "err:%d errno:%d", err, errno);
+	      "err:%d errno:%d\n", err, errno);
 
 	err = bpf_map_lookup_elem(linum_map_fd, &ingress_linum_idx,
 				  &ingress_linum);
 	CHECK(err == -1, "bpf_map_lookup_elem(linum_map_fd)",
-	      "err:%d errno:%d", err, errno);
-
-	err = bpf_map_lookup_elem(sk_map_fd, &egress_srv_idx, &srv_sk);
-	CHECK(err == -1, "bpf_map_lookup_elem(sk_map_fd, &egress_srv_idx)",
-	      "err:%d errno:%d", err, errno);
-	err = bpf_map_lookup_elem(tp_map_fd, &egress_srv_idx, &srv_tp);
-	CHECK(err == -1, "bpf_map_lookup_elem(tp_map_fd, &egress_srv_idx)",
-	      "err:%d errno:%d", err, errno);
-
-	err = bpf_map_lookup_elem(sk_map_fd, &egress_cli_idx, &cli_sk);
-	CHECK(err == -1, "bpf_map_lookup_elem(sk_map_fd, &egress_cli_idx)",
-	      "err:%d errno:%d", err, errno);
-	err = bpf_map_lookup_elem(tp_map_fd, &egress_cli_idx, &cli_tp);
-	CHECK(err == -1, "bpf_map_lookup_elem(tp_map_fd, &egress_cli_idx)",
-	      "err:%d errno:%d", err, errno);
-
-	err = bpf_map_lookup_elem(sk_map_fd, &ingress_listen_idx, &listen_sk);
-	CHECK(err == -1, "bpf_map_lookup_elem(sk_map_fd, &ingress_listen_idx)",
-	      "err:%d errno:%d", err, errno);
-	err = bpf_map_lookup_elem(tp_map_fd, &ingress_listen_idx, &listen_tp);
-	CHECK(err == -1, "bpf_map_lookup_elem(tp_map_fd, &ingress_listen_idx)",
-	      "err:%d errno:%d", err, errno);
-
-	printf("listen_sk: ");
-	print_sk(&listen_sk);
-	printf("\n");
-
-	printf("srv_sk: ");
-	print_sk(&srv_sk);
-	printf("\n");
-
-	printf("cli_sk: ");
-	print_sk(&cli_sk);
-	printf("\n");
-
-	printf("listen_tp: ");
-	print_tp(&listen_tp);
-	printf("\n");
-
-	printf("srv_tp: ");
-	print_tp(&srv_tp);
-	printf("\n");
-
-	printf("cli_tp: ");
-	print_tp(&cli_tp);
-	printf("\n");
+	      "err:%d errno:%d\n", err, errno);
+
+	memcpy(&srv_sk, &skel->bss->srv_sk, sizeof(srv_sk));
+	memcpy(&srv_tp, &skel->bss->srv_tp, sizeof(srv_tp));
+	memcpy(&cli_sk, &skel->bss->cli_sk, sizeof(cli_sk));
+	memcpy(&cli_tp, &skel->bss->cli_tp, sizeof(cli_tp));
+	memcpy(&listen_sk, &skel->bss->listen_sk, sizeof(listen_sk));
+	memcpy(&listen_tp, &skel->bss->listen_tp, sizeof(listen_tp));
+
+	print_sk(&listen_sk, "listen_sk");
+	print_sk(&srv_sk, "srv_sk");
+	print_sk(&cli_sk, "cli_sk");
+	print_tp(&listen_tp, "listen_tp");
+	print_tp(&srv_tp, "srv_tp");
+	print_tp(&cli_tp, "cli_tp");
 
 	CHECK(listen_sk.state != 10 ||
 	      listen_sk.family != AF_INET6 ||
@@ -190,8 +132,8 @@ static void check_result(void)
 	      listen_sk.dst_ip6[2] || listen_sk.dst_ip6[3] ||
 	      listen_sk.src_port != ntohs(srv_sa6.sin6_port) ||
 	      listen_sk.dst_port,
-	      "Unexpected listen_sk",
-	      "Check listen_sk output. ingress_linum:%u",
+	      "listen_sk",
+	      "Unexpected. Check listen_sk output. ingress_linum:%u\n",
 	      ingress_linum);
 
 	CHECK(srv_sk.state == 10 ||
@@ -204,7 +146,7 @@ static void check_result(void)
 		     sizeof(srv_sk.dst_ip6)) ||
 	      srv_sk.src_port != ntohs(srv_sa6.sin6_port) ||
 	      srv_sk.dst_port != cli_sa6.sin6_port,
-	      "Unexpected srv_sk", "Check srv_sk output. egress_linum:%u",
+	      "srv_sk", "Unexpected. Check srv_sk output. egress_linum:%u\n",
 	      egress_linum);
 
 	CHECK(cli_sk.state == 10 ||
@@ -217,30 +159,31 @@ static void check_result(void)
 		     sizeof(cli_sk.dst_ip6)) ||
 	      cli_sk.src_port != ntohs(cli_sa6.sin6_port) ||
 	      cli_sk.dst_port != srv_sa6.sin6_port,
-	      "Unexpected cli_sk", "Check cli_sk output. egress_linum:%u",
+	      "cli_sk", "Unexpected. Check cli_sk output. egress_linum:%u\n",
 	      egress_linum);
 
 	CHECK(listen_tp.data_segs_out ||
 	      listen_tp.data_segs_in ||
 	      listen_tp.total_retrans ||
 	      listen_tp.bytes_acked,
-	      "Unexpected listen_tp", "Check listen_tp output. ingress_linum:%u",
+	      "listen_tp",
+	      "Unexpected. Check listen_tp output. ingress_linum:%u\n",
 	      ingress_linum);
 
 	CHECK(srv_tp.data_segs_out != 2 ||
 	      srv_tp.data_segs_in ||
 	      srv_tp.snd_cwnd != 10 ||
 	      srv_tp.total_retrans ||
-	      srv_tp.bytes_acked != 2 * DATA_LEN,
-	      "Unexpected srv_tp", "Check srv_tp output. egress_linum:%u",
+	      srv_tp.bytes_acked < 2 * DATA_LEN,
+	      "srv_tp", "Unexpected. Check srv_tp output. egress_linum:%u\n",
 	      egress_linum);
 
 	CHECK(cli_tp.data_segs_out ||
 	      cli_tp.data_segs_in != 2 ||
 	      cli_tp.snd_cwnd != 10 ||
 	      cli_tp.total_retrans ||
-	      cli_tp.bytes_received != 2 * DATA_LEN,
-	      "Unexpected cli_tp", "Check cli_tp output. egress_linum:%u",
+	      cli_tp.bytes_received < 2 * DATA_LEN,
+	      "cli_tp", "Unexpected. Check cli_tp output. egress_linum:%u\n",
 	      egress_linum);
 }
 
@@ -257,15 +200,14 @@ static void check_sk_pkt_out_cnt(int accept_fd, int cli_fd)
 					  &pkt_out_cnt10);
 
 	/* The bpf prog only counts for fullsock and
-	 * passive conneciton did not become fullsock until 3WHS
-	 * had been finished.
-	 * The bpf prog only counted two data packet out but we
-	 * specially init accept_fd's pkt_out_cnt by 2 in
-	 * init_sk_storage().  Hence, 4 here.
+	 * passive connection did not become fullsock until 3WHS
+	 * had been finished, so the bpf prog only counted two data
+	 * packet out.
 	 */
-	CHECK(err || pkt_out_cnt.cnt != 4 || pkt_out_cnt10.cnt != 40,
+	CHECK(err || pkt_out_cnt.cnt < 0xeB9F + 2 ||
+	      pkt_out_cnt10.cnt < 0xeB9F + 20,
 	      "bpf_map_lookup_elem(sk_pkt_out_cnt, &accept_fd)",
-	      "err:%d errno:%d pkt_out_cnt:%u pkt_out_cnt10:%u",
+	      "err:%d errno:%d pkt_out_cnt:%u pkt_out_cnt10:%u\n",
 	      err, errno, pkt_out_cnt.cnt, pkt_out_cnt10.cnt);
 
 	pkt_out_cnt.cnt = ~0;
@@ -280,14 +222,14 @@ static void check_sk_pkt_out_cnt(int accept_fd, int cli_fd)
 	 *
 	 * The bpf_prog initialized it to 0xeB9F.
 	 */
-	CHECK(err || pkt_out_cnt.cnt != 0xeB9F + 4 ||
-	      pkt_out_cnt10.cnt != 0xeB9F + 40,
+	CHECK(err || pkt_out_cnt.cnt < 0xeB9F + 4 ||
+	      pkt_out_cnt10.cnt < 0xeB9F + 40,
 	      "bpf_map_lookup_elem(sk_pkt_out_cnt, &cli_fd)",
-	      "err:%d errno:%d pkt_out_cnt:%u pkt_out_cnt10:%u",
+	      "err:%d errno:%d pkt_out_cnt:%u pkt_out_cnt10:%u\n",
 	      err, errno, pkt_out_cnt.cnt, pkt_out_cnt10.cnt);
 }
 
-static void init_sk_storage(int sk_fd, __u32 pkt_out_cnt)
+static int init_sk_storage(int sk_fd, __u32 pkt_out_cnt)
 {
 	struct bpf_spinlock_cnt scnt = {};
 	int err;
@@ -295,186 +237,190 @@ static void init_sk_storage(int sk_fd, __u32 pkt_out_cnt)
 	scnt.cnt = pkt_out_cnt;
 	err = bpf_map_update_elem(sk_pkt_out_cnt_fd, &sk_fd, &scnt,
 				  BPF_NOEXIST);
-	CHECK(err, "bpf_map_update_elem(sk_pkt_out_cnt_fd)",
-	      "err:%d errno:%d", err, errno);
+	if (CHECK(err, "bpf_map_update_elem(sk_pkt_out_cnt_fd)",
+		  "err:%d errno:%d\n", err, errno))
+		return err;
 
-	scnt.cnt *= 10;
 	err = bpf_map_update_elem(sk_pkt_out_cnt10_fd, &sk_fd, &scnt,
 				  BPF_NOEXIST);
-	CHECK(err, "bpf_map_update_elem(sk_pkt_out_cnt10_fd)",
-	      "err:%d errno:%d", err, errno);
+	if (CHECK(err, "bpf_map_update_elem(sk_pkt_out_cnt10_fd)",
+		  "err:%d errno:%d\n", err, errno))
+		return err;
+
+	return 0;
 }
 
 static void test(void)
 {
-	int listen_fd, cli_fd, accept_fd, epfd, err;
+	int listen_fd = -1, cli_fd = -1, accept_fd = -1, epfd, err, i;
 	struct epoll_event ev;
 	socklen_t addrlen;
-	int i;
+	char buf[DATA_LEN];
 
 	addrlen = sizeof(struct sockaddr_in6);
 	ev.events = EPOLLIN;
 
 	epfd = epoll_create(1);
-	CHECK(epfd == -1, "epoll_create()", "epfd:%d errno:%d", epfd, errno);
+	if (CHECK(epfd == -1, "epoll_create()", "epfd:%d errno:%d\n",
+		  epfd, errno))
+		return;
 
 	/* Prepare listen_fd */
 	listen_fd = socket(AF_INET6, SOCK_STREAM | SOCK_NONBLOCK, 0);
-	CHECK(listen_fd == -1, "socket()", "listen_fd:%d errno:%d",
-	      listen_fd, errno);
+	if (CHECK(listen_fd == -1, "socket()", "listen_fd:%d errno:%d\n",
+		  listen_fd, errno))
+		goto done;
 
 	init_loopback6(&srv_sa6);
 	err = bind(listen_fd, (struct sockaddr *)&srv_sa6, sizeof(srv_sa6));
-	CHECK(err, "bind(listen_fd)", "err:%d errno:%d", err, errno);
+	if (CHECK(err, "bind(listen_fd)", "err:%d errno:%d\n", err, errno))
+		goto done;
 
 	err = getsockname(listen_fd, (struct sockaddr *)&srv_sa6, &addrlen);
-	CHECK(err, "getsockname(listen_fd)", "err:%d errno:%d", err, errno);
+	if (CHECK(err, "getsockname(listen_fd)", "err:%d errno:%d\n", err,
+		  errno))
+		goto done;
+	memcpy(&skel->bss->srv_sa6, &srv_sa6, sizeof(srv_sa6));
 
 	err = listen(listen_fd, 1);
-	CHECK(err, "listen(listen_fd)", "err:%d errno:%d", err, errno);
+	if (CHECK(err, "listen(listen_fd)", "err:%d errno:%d\n", err, errno))
+		goto done;
 
 	/* Prepare cli_fd */
 	cli_fd = socket(AF_INET6, SOCK_STREAM | SOCK_NONBLOCK, 0);
-	CHECK(cli_fd == -1, "socket()", "cli_fd:%d errno:%d", cli_fd, errno);
+	if (CHECK(cli_fd == -1, "socket()", "cli_fd:%d errno:%d\n", cli_fd,
+		  errno))
+		goto done;
 
 	init_loopback6(&cli_sa6);
 	err = bind(cli_fd, (struct sockaddr *)&cli_sa6, sizeof(cli_sa6));
-	CHECK(err, "bind(cli_fd)", "err:%d errno:%d", err, errno);
+	if (CHECK(err, "bind(cli_fd)", "err:%d errno:%d\n", err, errno))
+		goto done;
 
 	err = getsockname(cli_fd, (struct sockaddr *)&cli_sa6, &addrlen);
-	CHECK(err, "getsockname(cli_fd)", "err:%d errno:%d",
-	      err, errno);
-
-	/* Update addr_map with srv_sa6 and cli_sa6 */
-	err = bpf_map_update_elem(addr_map_fd, &addr_srv_idx, &srv_sa6, 0);
-	CHECK(err, "map_update", "err:%d errno:%d", err, errno);
-
-	err = bpf_map_update_elem(addr_map_fd, &addr_cli_idx, &cli_sa6, 0);
-	CHECK(err, "map_update", "err:%d errno:%d", err, errno);
+	if (CHECK(err, "getsockname(cli_fd)", "err:%d errno:%d\n",
+		  err, errno))
+		goto done;
 
 	/* Connect from cli_sa6 to srv_sa6 */
 	err = connect(cli_fd, (struct sockaddr *)&srv_sa6, addrlen);
 	printf("srv_sa6.sin6_port:%u cli_sa6.sin6_port:%u\n\n",
 	       ntohs(srv_sa6.sin6_port), ntohs(cli_sa6.sin6_port));
-	CHECK(err && errno != EINPROGRESS,
-	      "connect(cli_fd)", "err:%d errno:%d", err, errno);
+	if (CHECK(err && errno != EINPROGRESS,
+		  "connect(cli_fd)", "err:%d errno:%d\n", err, errno))
+		goto done;
 
 	ev.data.fd = listen_fd;
 	err = epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);
-	CHECK(err, "epoll_ctl(EPOLL_CTL_ADD, listen_fd)", "err:%d errno:%d",
-	      err, errno);
+	if (CHECK(err, "epoll_ctl(EPOLL_CTL_ADD, listen_fd)",
+		  "err:%d errno:%d\n", err, errno))
+		goto done;
 
 	/* Accept the connection */
 	/* Have some timeout in accept(listen_fd). Just in case. */
 	err = epoll_wait(epfd, &ev, 1, 1000);
-	CHECK(err != 1 || ev.data.fd != listen_fd,
-	      "epoll_wait(listen_fd)",
-	      "err:%d errno:%d ev.data.fd:%d listen_fd:%d",
-	      err, errno, ev.data.fd, listen_fd);
+	if (CHECK(err != 1 || ev.data.fd != listen_fd,
+		  "epoll_wait(listen_fd)",
+		  "err:%d errno:%d ev.data.fd:%d listen_fd:%d\n",
+		  err, errno, ev.data.fd, listen_fd))
+		goto done;
 
 	accept_fd = accept(listen_fd, NULL, NULL);
-	CHECK(accept_fd == -1, "accept(listen_fd)", "accept_fd:%d errno:%d",
-	      accept_fd, errno);
-	close(listen_fd);
+	if (CHECK(accept_fd == -1, "accept(listen_fd)",
+		  "accept_fd:%d errno:%d\n",
+		  accept_fd, errno))
+		goto done;
 
 	ev.data.fd = cli_fd;
 	err = epoll_ctl(epfd, EPOLL_CTL_ADD, cli_fd, &ev);
-	CHECK(err, "epoll_ctl(EPOLL_CTL_ADD, cli_fd)", "err:%d errno:%d",
-	      err, errno);
+	if (CHECK(err, "epoll_ctl(EPOLL_CTL_ADD, cli_fd)",
+		  "err:%d errno:%d\n", err, errno))
+		goto done;
 
-	init_sk_storage(accept_fd, 2);
+	if (init_sk_storage(accept_fd, 0xeB9F))
+		goto done;
 
 	for (i = 0; i < 2; i++) {
-		/* Send some data from accept_fd to cli_fd */
-		err = send(accept_fd, DATA, DATA_LEN, 0);
-		CHECK(err != DATA_LEN, "send(accept_fd)", "err:%d errno:%d",
-		      err, errno);
+		/* Send some data from accept_fd to cli_fd.
+		 * MSG_EOR to stop kernel from coalescing two pkts.
+		 */
+		err = send(accept_fd, DATA, DATA_LEN, MSG_EOR);
+		if (CHECK(err != DATA_LEN, "send(accept_fd)",
+			  "err:%d errno:%d\n", err, errno))
+			goto done;
 
 		/* Have some timeout in recv(cli_fd). Just in case. */
 		err = epoll_wait(epfd, &ev, 1, 1000);
-		CHECK(err != 1 || ev.data.fd != cli_fd,
-		      "epoll_wait(cli_fd)", "err:%d errno:%d ev.data.fd:%d cli_fd:%d",
-		      err, errno, ev.data.fd, cli_fd);
-
-		err = recv(cli_fd, NULL, 0, MSG_TRUNC);
-		CHECK(err, "recv(cli_fd)", "err:%d errno:%d", err, errno);
+		if (CHECK(err != 1 || ev.data.fd != cli_fd,
+			  "epoll_wait(cli_fd)",
+			  "err:%d errno:%d ev.data.fd:%d cli_fd:%d\n",
+			  err, errno, ev.data.fd, cli_fd))
+			goto done;
+
+		err = recv(cli_fd, buf, DATA_LEN, 0);
+		if (CHECK(err != DATA_LEN, "recv(cli_fd)", "err:%d errno:%d\n",
+			  err, errno))
+			goto done;
 	}
 
+	shutdown(cli_fd, SHUT_WR);
+	err = recv(accept_fd, buf, 1, 0);
+	if (CHECK(err, "recv(accept_fd) for fin", "err:%d errno:%d\n",
+		  err, errno))
+		goto done;
+	shutdown(accept_fd, SHUT_WR);
+	err = recv(cli_fd, buf, 1, 0);
+	if (CHECK(err, "recv(cli_fd) for fin", "err:%d errno:%d\n",
+		  err, errno))
+		goto done;
 	check_sk_pkt_out_cnt(accept_fd, cli_fd);
+	check_result();
 
+done:
+	if (accept_fd != -1)
+		close(accept_fd);
+	if (cli_fd != -1)
+		close(cli_fd);
+	if (listen_fd != -1)
+		close(listen_fd);
 	close(epfd);
-	close(accept_fd);
-	close(cli_fd);
-
-	check_result();
 }
 
 void test_sock_fields(void)
 {
-	struct bpf_prog_load_attr attr = {
-		.file = "test_sock_fields.o",
-		.prog_type = BPF_PROG_TYPE_CGROUP_SKB,
-		.prog_flags = BPF_F_TEST_RND_HI32,
-	};
-	int cgroup_fd, egress_fd, ingress_fd, err;
-	struct bpf_program *ingress_prog;
-	struct bpf_object *obj;
-	struct bpf_map *map;
+	struct bpf_link *egress_link = NULL, *ingress_link = NULL;
+	int cgroup_fd;
 
 	/* Create a cgroup, get fd, and join it */
-	cgroup_fd = cgroup_setup_and_join(TEST_CGROUP);
-	CHECK(cgroup_fd < 0, "cgroup_setup_and_join()",
-	      "cgroup_fd:%d errno:%d", cgroup_fd, errno);
-	atexit(cleanup_cgroup_environment);
-
-	err = bpf_prog_load_xattr(&attr, &obj, &egress_fd);
-	CHECK(err, "bpf_prog_load_xattr()", "err:%d", err);
-
-	ingress_prog = bpf_object__find_program_by_title(obj,
-							 "cgroup_skb/ingress");
-	CHECK(!ingress_prog,
-	      "bpf_object__find_program_by_title(cgroup_skb/ingress)",
-	      "not found");
-	ingress_fd = bpf_program__fd(ingress_prog);
-
-	err = bpf_prog_attach(egress_fd, cgroup_fd, BPF_CGROUP_INET_EGRESS, 0);
-	CHECK(err == -1, "bpf_prog_attach(CPF_CGROUP_INET_EGRESS)",
-	      "err:%d errno%d", err, errno);
-
-	err = bpf_prog_attach(ingress_fd, cgroup_fd,
-			      BPF_CGROUP_INET_INGRESS, 0);
-	CHECK(err == -1, "bpf_prog_attach(CPF_CGROUP_INET_INGRESS)",
-	      "err:%d errno%d", err, errno);
-	close(cgroup_fd);
-
-	map = bpf_object__find_map_by_name(obj, "addr_map");
-	CHECK(!map, "cannot find addr_map", "(null)");
-	addr_map_fd = bpf_map__fd(map);
-
-	map = bpf_object__find_map_by_name(obj, "sock_result_map");
-	CHECK(!map, "cannot find sock_result_map", "(null)");
-	sk_map_fd = bpf_map__fd(map);
-
-	map = bpf_object__find_map_by_name(obj, "tcp_sock_result_map");
-	CHECK(!map, "cannot find tcp_sock_result_map", "(null)");
-	tp_map_fd = bpf_map__fd(map);
-
-	map = bpf_object__find_map_by_name(obj, "linum_map");
-	CHECK(!map, "cannot find linum_map", "(null)");
-	linum_map_fd = bpf_map__fd(map);
-
-	map = bpf_object__find_map_by_name(obj, "sk_pkt_out_cnt");
-	CHECK(!map, "cannot find sk_pkt_out_cnt", "(null)");
-	sk_pkt_out_cnt_fd = bpf_map__fd(map);
-
-	map = bpf_object__find_map_by_name(obj, "sk_pkt_out_cnt10");
-	CHECK(!map, "cannot find sk_pkt_out_cnt10", "(null)");
-	sk_pkt_out_cnt10_fd = bpf_map__fd(map);
+	cgroup_fd = test__join_cgroup(TEST_CGROUP);
+	if (CHECK_FAIL(cgroup_fd < 0))
+		return;
+
+	skel = test_sock_fields__open_and_load();
+	if (CHECK(!skel, "test_sock_fields__open_and_load", "failed\n"))
+		goto done;
+
+	egress_link = bpf_program__attach_cgroup(skel->progs.egress_read_sock_fields,
+						 cgroup_fd);
+	if (CHECK(IS_ERR(egress_link), "attach_cgroup(egress)", "err:%ld\n",
+		  PTR_ERR(egress_link)))
+		goto done;
+
+	ingress_link = bpf_program__attach_cgroup(skel->progs.ingress_read_sock_fields,
+						  cgroup_fd);
+	if (CHECK(IS_ERR(ingress_link), "attach_cgroup(ingress)", "err:%ld\n",
+		  PTR_ERR(ingress_link)))
+		goto done;
+
+	linum_map_fd = bpf_map__fd(skel->maps.linum_map);
+	sk_pkt_out_cnt_fd = bpf_map__fd(skel->maps.sk_pkt_out_cnt);
+	sk_pkt_out_cnt10_fd = bpf_map__fd(skel->maps.sk_pkt_out_cnt10);
 
 	test();
 
-	bpf_object__close(obj);
-	cleanup_cgroup_environment();
-
-	printf("PASS\n");
+done:
+	bpf_link__destroy(egress_link);
+	bpf_link__destroy(ingress_link);
+	test_sock_fields__destroy(skel);
+	close(cgroup_fd);
 }
diff --git a/tools/testing/selftests/bpf/progs/test_sock_fields.c b/tools/testing/selftests/bpf/progs/test_sock_fields.c
index 9bcaa37f476a..370e33a858db 100644
--- a/tools/testing/selftests/bpf/progs/test_sock_fields.c
+++ b/tools/testing/selftests/bpf/progs/test_sock_fields.c
@@ -8,46 +8,12 @@
 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_endian.h>
 
-enum bpf_addr_array_idx {
-	ADDR_SRV_IDX,
-	ADDR_CLI_IDX,
-	__NR_BPF_ADDR_ARRAY_IDX,
-};
-
-enum bpf_result_array_idx {
-	EGRESS_SRV_IDX,
-	EGRESS_CLI_IDX,
-	INGRESS_LISTEN_IDX,
-	__NR_BPF_RESULT_ARRAY_IDX,
-};
-
 enum bpf_linum_array_idx {
 	EGRESS_LINUM_IDX,
 	INGRESS_LINUM_IDX,
 	__NR_BPF_LINUM_ARRAY_IDX,
 };
 
-struct {
-	__uint(type, BPF_MAP_TYPE_ARRAY);
-	__uint(max_entries, __NR_BPF_ADDR_ARRAY_IDX);
-	__type(key, __u32);
-	__type(value, struct sockaddr_in6);
-} addr_map SEC(".maps");
-
-struct {
-	__uint(type, BPF_MAP_TYPE_ARRAY);
-	__uint(max_entries, __NR_BPF_RESULT_ARRAY_IDX);
-	__type(key, __u32);
-	__type(value, struct bpf_sock);
-} sock_result_map SEC(".maps");
-
-struct {
-	__uint(type, BPF_MAP_TYPE_ARRAY);
-	__uint(max_entries, __NR_BPF_RESULT_ARRAY_IDX);
-	__type(key, __u32);
-	__type(value, struct bpf_tcp_sock);
-} tcp_sock_result_map SEC(".maps");
-
 struct {
 	__uint(type, BPF_MAP_TYPE_ARRAY);
 	__uint(max_entries, __NR_BPF_LINUM_ARRAY_IDX);
@@ -74,6 +40,14 @@ struct {
 	__type(value, struct bpf_spinlock_cnt);
 } sk_pkt_out_cnt10 SEC(".maps");
 
+struct bpf_tcp_sock listen_tp = {};
+struct sockaddr_in6 srv_sa6 = {};
+struct bpf_tcp_sock cli_tp = {};
+struct bpf_tcp_sock srv_tp = {};
+struct bpf_sock listen_sk = {};
+struct bpf_sock srv_sk = {};
+struct bpf_sock cli_sk = {};
+
 static bool is_loopback6(__u32 *a6)
 {
 	return !a6[0] && !a6[1] && !a6[2] && a6[3] == bpf_htonl(1);
@@ -130,19 +104,20 @@ static void tpcpy(struct bpf_tcp_sock *dst,
 	dst->bytes_acked = src->bytes_acked;
 }
 
-#define RETURN {						\
+/* Always return CG_OK so that no pkt will be filtered out */
+#define CG_OK 1
+
+#define RET_LOG() ({						\
 	linum = __LINE__;					\
-	bpf_map_update_elem(&linum_map, &linum_idx, &linum, 0);	\
-	return 1;						\
-}
+	bpf_map_update_elem(&linum_map, &linum_idx, &linum, BPF_NOEXIST);	\
+	return CG_OK;						\
+})
 
 SEC("cgroup_skb/egress")
 int egress_read_sock_fields(struct __sk_buff *skb)
 {
 	struct bpf_spinlock_cnt cli_cnt_init = { .lock = 0, .cnt = 0xeB9F };
-	__u32 srv_idx = ADDR_SRV_IDX, cli_idx = ADDR_CLI_IDX, result_idx;
 	struct bpf_spinlock_cnt *pkt_out_cnt, *pkt_out_cnt10;
-	struct sockaddr_in6 *srv_sa6, *cli_sa6;
 	struct bpf_tcp_sock *tp, *tp_ret;
 	struct bpf_sock *sk, *sk_ret;
 	__u32 linum, linum_idx;
@@ -150,39 +125,46 @@ int egress_read_sock_fields(struct __sk_buff *skb)
 	linum_idx = EGRESS_LINUM_IDX;
 
 	sk = skb->sk;
-	if (!sk || sk->state == 10)
-		RETURN;
+	if (!sk)
+		RET_LOG();
 
+	/* Not the testing egress traffic or
+	 * TCP_LISTEN (10) socket will be copied at the ingress side.
+	 */
+	if (sk->family != AF_INET6 || !is_loopback6(sk->src_ip6) ||
+	    sk->state == 10)
+		return CG_OK;
+
+	if (sk->src_port == bpf_ntohs(srv_sa6.sin6_port)) {
+		/* Server socket */
+		sk_ret = &srv_sk;
+		tp_ret = &srv_tp;
+	} else if (sk->dst_port == srv_sa6.sin6_port) {
+		/* Client socket */
+		sk_ret = &cli_sk;
+		tp_ret = &cli_tp;
+	} else {
+		/* Not the testing egress traffic */
+		return CG_OK;
+	}
+
+	/* It must be a fullsock for cgroup_skb/egress prog */
 	sk = bpf_sk_fullsock(sk);
-	if (!sk || sk->family != AF_INET6 || sk->protocol != IPPROTO_TCP ||
-	    !is_loopback6(sk->src_ip6))
-		RETURN;
+	if (!sk)
+		RET_LOG();
+
+	/* Not the testing egress traffic */
+	if (sk->protocol != IPPROTO_TCP)
+		return CG_OK;
 
 	tp = bpf_tcp_sock(sk);
 	if (!tp)
-		RETURN;
-
-	srv_sa6 = bpf_map_lookup_elem(&addr_map, &srv_idx);
-	cli_sa6 = bpf_map_lookup_elem(&addr_map, &cli_idx);
-	if (!srv_sa6 || !cli_sa6)
-		RETURN;
-
-	if (sk->src_port == bpf_ntohs(srv_sa6->sin6_port))
-		result_idx = EGRESS_SRV_IDX;
-	else if (sk->src_port == bpf_ntohs(cli_sa6->sin6_port))
-		result_idx = EGRESS_CLI_IDX;
-	else
-		RETURN;
-
-	sk_ret = bpf_map_lookup_elem(&sock_result_map, &result_idx);
-	tp_ret = bpf_map_lookup_elem(&tcp_sock_result_map, &result_idx);
-	if (!sk_ret || !tp_ret)
-		RETURN;
+		RET_LOG();
 
 	skcpy(sk_ret, sk);
 	tpcpy(tp_ret, tp);
 
-	if (result_idx == EGRESS_SRV_IDX) {
+	if (sk_ret == &srv_sk) {
 		/* The userspace has created it for srv sk */
 		pkt_out_cnt = bpf_sk_storage_get(&sk_pkt_out_cnt, sk, 0, 0);
 		pkt_out_cnt10 = bpf_sk_storage_get(&sk_pkt_out_cnt10, sk,
@@ -197,7 +179,7 @@ int egress_read_sock_fields(struct __sk_buff *skb)
 	}
 
 	if (!pkt_out_cnt || !pkt_out_cnt10)
-		RETURN;
+		RET_LOG();
 
 	/* Even both cnt and cnt10 have lock defined in their BTF,
 	 * intentionally one cnt takes lock while one does not
@@ -208,48 +190,44 @@ int egress_read_sock_fields(struct __sk_buff *skb)
 	pkt_out_cnt10->cnt += 10;
 	bpf_spin_unlock(&pkt_out_cnt10->lock);
 
-	RETURN;
+	return CG_OK;
 }
 
 SEC("cgroup_skb/ingress")
 int ingress_read_sock_fields(struct __sk_buff *skb)
 {
-	__u32 srv_idx = ADDR_SRV_IDX, result_idx = INGRESS_LISTEN_IDX;
-	struct bpf_tcp_sock *tp, *tp_ret;
-	struct bpf_sock *sk, *sk_ret;
-	struct sockaddr_in6 *srv_sa6;
+	struct bpf_tcp_sock *tp;
 	__u32 linum, linum_idx;
+	struct bpf_sock *sk;
 
 	linum_idx = INGRESS_LINUM_IDX;
 
 	sk = skb->sk;
-	if (!sk || sk->family != AF_INET6 || !is_loopback6(sk->src_ip6))
-		RETURN;
+	if (!sk)
+		RET_LOG();
 
-	srv_sa6 = bpf_map_lookup_elem(&addr_map, &srv_idx);
-	if (!srv_sa6 || sk->src_port != bpf_ntohs(srv_sa6->sin6_port))
-		RETURN;
+	/* Not the testing ingress traffic to the server */
+	if (sk->family != AF_INET6 || !is_loopback6(sk->src_ip6) ||
+	    sk->src_port != bpf_ntohs(srv_sa6.sin6_port))
+		return CG_OK;
 
-	if (sk->state != 10 && sk->state != 12)
-		RETURN;
+	/* Only interested in TCP_LISTEN */
+	if (sk->state != 10)
+		return CG_OK;
 
-	sk = bpf_get_listener_sock(sk);
+	/* It must be a fullsock for cgroup_skb/ingress prog */
+	sk = bpf_sk_fullsock(sk);
 	if (!sk)
-		RETURN;
+		RET_LOG();
 
 	tp = bpf_tcp_sock(sk);
 	if (!tp)
-		RETURN;
-
-	sk_ret = bpf_map_lookup_elem(&sock_result_map, &result_idx);
-	tp_ret = bpf_map_lookup_elem(&tcp_sock_result_map, &result_idx);
-	if (!sk_ret || !tp_ret)
-		RETURN;
+		RET_LOG();
 
-	skcpy(sk_ret, sk);
-	tpcpy(tp_ret, tp);
+	skcpy(&listen_sk, sk);
+	tpcpy(&listen_tp, tp);
 
-	RETURN;
+	return CG_OK;
 }
 
 char _license[] SEC("license") = "GPL";
-- 
2.24.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v4 bpf-next 10/13] bpf: selftest: Use network_helpers in the sock_fields test
  2020-09-25  0:03 [PATCH v4 bpf-next 00/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type Martin KaFai Lau
                   ` (8 preceding siblings ...)
  2020-09-25  0:04 ` [PATCH v4 bpf-next 09/13] bpf: selftest: Adapt sock_fields test to use skel and global variables Martin KaFai Lau
@ 2020-09-25  0:04 ` Martin KaFai Lau
  2020-09-25  0:04 ` [PATCH v4 bpf-next 11/13] bpf: selftest: Use bpf_skc_to_tcp_sock() " Martin KaFai Lau
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Martin KaFai Lau @ 2020-09-25  0:04 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team, Lorenz Bauer, netdev

This patch uses start_server() and connect_to_fd() from network_helpers.h
to remove the network testing boiler plate codes.  epoll is no longer
needed also since the timeout has already been taken care of also.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 .../selftests/bpf/prog_tests/sock_fields.c    | 88 ++-----------------
 1 file changed, 9 insertions(+), 79 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/sock_fields.c b/tools/testing/selftests/bpf/prog_tests/sock_fields.c
index d96b718639fb..eea8b96bb1be 100644
--- a/tools/testing/selftests/bpf/prog_tests/sock_fields.c
+++ b/tools/testing/selftests/bpf/prog_tests/sock_fields.c
@@ -1,8 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright (c) 2019 Facebook */
 
-#include <sys/socket.h>
-#include <sys/epoll.h>
 #include <netinet/in.h>
 #include <arpa/inet.h>
 #include <unistd.h>
@@ -12,7 +10,9 @@
 
 #include <bpf/bpf.h>
 #include <bpf/libbpf.h>
+#include <linux/compiler.h>
 
+#include "network_helpers.h"
 #include "cgroup_helpers.h"
 #include "test_progs.h"
 #include "bpf_rlimit.h"
@@ -43,13 +43,6 @@ static __u32 duration;
 static __u32 egress_linum_idx = EGRESS_LINUM_IDX;
 static __u32 ingress_linum_idx = INGRESS_LINUM_IDX;
 
-static void init_loopback6(struct sockaddr_in6 *sa6)
-{
-	memset(sa6, 0, sizeof(*sa6));
-	sa6->sin6_family = AF_INET6;
-	sa6->sin6_addr = in6addr_loopback;
-}
-
 static void print_sk(const struct bpf_sock *sk, const char *prefix)
 {
 	char src_ip4[24], dst_ip4[24];
@@ -252,28 +245,14 @@ static int init_sk_storage(int sk_fd, __u32 pkt_out_cnt)
 
 static void test(void)
 {
-	int listen_fd = -1, cli_fd = -1, accept_fd = -1, epfd, err, i;
-	struct epoll_event ev;
-	socklen_t addrlen;
+	int listen_fd = -1, cli_fd = -1, accept_fd = -1, err, i;
+	socklen_t addrlen = sizeof(struct sockaddr_in6);
 	char buf[DATA_LEN];
 
-	addrlen = sizeof(struct sockaddr_in6);
-	ev.events = EPOLLIN;
-
-	epfd = epoll_create(1);
-	if (CHECK(epfd == -1, "epoll_create()", "epfd:%d errno:%d\n",
-		  epfd, errno))
-		return;
-
 	/* Prepare listen_fd */
-	listen_fd = socket(AF_INET6, SOCK_STREAM | SOCK_NONBLOCK, 0);
-	if (CHECK(listen_fd == -1, "socket()", "listen_fd:%d errno:%d\n",
-		  listen_fd, errno))
-		goto done;
-
-	init_loopback6(&srv_sa6);
-	err = bind(listen_fd, (struct sockaddr *)&srv_sa6, sizeof(srv_sa6));
-	if (CHECK(err, "bind(listen_fd)", "err:%d errno:%d\n", err, errno))
+	listen_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0);
+	/* start_server() has logged the error details */
+	if (CHECK_FAIL(listen_fd == -1))
 		goto done;
 
 	err = getsockname(listen_fd, (struct sockaddr *)&srv_sa6, &addrlen);
@@ -282,19 +261,8 @@ static void test(void)
 		goto done;
 	memcpy(&skel->bss->srv_sa6, &srv_sa6, sizeof(srv_sa6));
 
-	err = listen(listen_fd, 1);
-	if (CHECK(err, "listen(listen_fd)", "err:%d errno:%d\n", err, errno))
-		goto done;
-
-	/* Prepare cli_fd */
-	cli_fd = socket(AF_INET6, SOCK_STREAM | SOCK_NONBLOCK, 0);
-	if (CHECK(cli_fd == -1, "socket()", "cli_fd:%d errno:%d\n", cli_fd,
-		  errno))
-		goto done;
-
-	init_loopback6(&cli_sa6);
-	err = bind(cli_fd, (struct sockaddr *)&cli_sa6, sizeof(cli_sa6));
-	if (CHECK(err, "bind(cli_fd)", "err:%d errno:%d\n", err, errno))
+	cli_fd = connect_to_fd(listen_fd, 0);
+	if (CHECK_FAIL(cli_fd == -1))
 		goto done;
 
 	err = getsockname(cli_fd, (struct sockaddr *)&cli_sa6, &addrlen);
@@ -302,41 +270,12 @@ static void test(void)
 		  err, errno))
 		goto done;
 
-	/* Connect from cli_sa6 to srv_sa6 */
-	err = connect(cli_fd, (struct sockaddr *)&srv_sa6, addrlen);
-	printf("srv_sa6.sin6_port:%u cli_sa6.sin6_port:%u\n\n",
-	       ntohs(srv_sa6.sin6_port), ntohs(cli_sa6.sin6_port));
-	if (CHECK(err && errno != EINPROGRESS,
-		  "connect(cli_fd)", "err:%d errno:%d\n", err, errno))
-		goto done;
-
-	ev.data.fd = listen_fd;
-	err = epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);
-	if (CHECK(err, "epoll_ctl(EPOLL_CTL_ADD, listen_fd)",
-		  "err:%d errno:%d\n", err, errno))
-		goto done;
-
-	/* Accept the connection */
-	/* Have some timeout in accept(listen_fd). Just in case. */
-	err = epoll_wait(epfd, &ev, 1, 1000);
-	if (CHECK(err != 1 || ev.data.fd != listen_fd,
-		  "epoll_wait(listen_fd)",
-		  "err:%d errno:%d ev.data.fd:%d listen_fd:%d\n",
-		  err, errno, ev.data.fd, listen_fd))
-		goto done;
-
 	accept_fd = accept(listen_fd, NULL, NULL);
 	if (CHECK(accept_fd == -1, "accept(listen_fd)",
 		  "accept_fd:%d errno:%d\n",
 		  accept_fd, errno))
 		goto done;
 
-	ev.data.fd = cli_fd;
-	err = epoll_ctl(epfd, EPOLL_CTL_ADD, cli_fd, &ev);
-	if (CHECK(err, "epoll_ctl(EPOLL_CTL_ADD, cli_fd)",
-		  "err:%d errno:%d\n", err, errno))
-		goto done;
-
 	if (init_sk_storage(accept_fd, 0xeB9F))
 		goto done;
 
@@ -349,14 +288,6 @@ static void test(void)
 			  "err:%d errno:%d\n", err, errno))
 			goto done;
 
-		/* Have some timeout in recv(cli_fd). Just in case. */
-		err = epoll_wait(epfd, &ev, 1, 1000);
-		if (CHECK(err != 1 || ev.data.fd != cli_fd,
-			  "epoll_wait(cli_fd)",
-			  "err:%d errno:%d ev.data.fd:%d cli_fd:%d\n",
-			  err, errno, ev.data.fd, cli_fd))
-			goto done;
-
 		err = recv(cli_fd, buf, DATA_LEN, 0);
 		if (CHECK(err != DATA_LEN, "recv(cli_fd)", "err:%d errno:%d\n",
 			  err, errno))
@@ -383,7 +314,6 @@ static void test(void)
 		close(cli_fd);
 	if (listen_fd != -1)
 		close(listen_fd);
-	close(epfd);
 }
 
 void test_sock_fields(void)
-- 
2.24.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v4 bpf-next 11/13] bpf: selftest: Use bpf_skc_to_tcp_sock() in the sock_fields test
  2020-09-25  0:03 [PATCH v4 bpf-next 00/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type Martin KaFai Lau
                   ` (9 preceding siblings ...)
  2020-09-25  0:04 ` [PATCH v4 bpf-next 10/13] bpf: selftest: Use network_helpers in the sock_fields test Martin KaFai Lau
@ 2020-09-25  0:04 ` Martin KaFai Lau
  2020-09-25  0:04 ` [PATCH v4 bpf-next 12/13] bpf: selftest: Remove enum tcp_ca_state from bpf_tcp_helpers.h Martin KaFai Lau
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Martin KaFai Lau @ 2020-09-25  0:04 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team, Lorenz Bauer, netdev

This test uses bpf_skc_to_tcp_sock() to get a kernel tcp_sock ptr "ktp".
Access the ktp->lsndtime and also pass ktp to bpf_sk_storage_get().

It also exercises the bpf_sk_cgroup_id() and bpf_sk_ancestor_cgroup_id()
with the "ktp".  To do that, a parent cgroup and a child cgroup are
created.  The bpf prog is attached to the child cgroup.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 .../selftests/bpf/prog_tests/sock_fields.c    | 40 +++++++++++++++----
 .../selftests/bpf/progs/test_sock_fields.c    | 24 ++++++++++-
 2 files changed, 55 insertions(+), 9 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/sock_fields.c b/tools/testing/selftests/bpf/prog_tests/sock_fields.c
index eea8b96bb1be..66e83b8fc69d 100644
--- a/tools/testing/selftests/bpf/prog_tests/sock_fields.c
+++ b/tools/testing/selftests/bpf/prog_tests/sock_fields.c
@@ -29,7 +29,8 @@ struct bpf_spinlock_cnt {
 	__u32 cnt;
 };
 
-#define TEST_CGROUP "/test-bpf-sock-fields"
+#define PARENT_CGROUP	"/test-bpf-sock-fields"
+#define CHILD_CGROUP	"/test-bpf-sock-fields/child"
 #define DATA "Hello BPF!"
 #define DATA_LEN sizeof(DATA)
 
@@ -37,6 +38,8 @@ static struct sockaddr_in6 srv_sa6, cli_sa6;
 static int sk_pkt_out_cnt10_fd;
 struct test_sock_fields *skel;
 static int sk_pkt_out_cnt_fd;
+static __u64 parent_cg_id;
+static __u64 child_cg_id;
 static int linum_map_fd;
 static __u32 duration;
 
@@ -142,6 +145,8 @@ static void check_result(void)
 	      "srv_sk", "Unexpected. Check srv_sk output. egress_linum:%u\n",
 	      egress_linum);
 
+	CHECK(!skel->bss->lsndtime, "srv_tp", "Unexpected lsndtime:0\n");
+
 	CHECK(cli_sk.state == 10 ||
 	      !cli_sk.state ||
 	      cli_sk.family != AF_INET6 ||
@@ -178,6 +183,14 @@ static void check_result(void)
 	      cli_tp.bytes_received < 2 * DATA_LEN,
 	      "cli_tp", "Unexpected. Check cli_tp output. egress_linum:%u\n",
 	      egress_linum);
+
+	CHECK(skel->bss->parent_cg_id != parent_cg_id,
+	      "parent_cg_id", "%zu != %zu\n",
+	      (size_t)skel->bss->parent_cg_id, (size_t)parent_cg_id);
+
+	CHECK(skel->bss->child_cg_id != child_cg_id,
+	      "child_cg_id", "%zu != %zu\n",
+	       (size_t)skel->bss->child_cg_id, (size_t)child_cg_id);
 }
 
 static void check_sk_pkt_out_cnt(int accept_fd, int cli_fd)
@@ -319,25 +332,35 @@ static void test(void)
 void test_sock_fields(void)
 {
 	struct bpf_link *egress_link = NULL, *ingress_link = NULL;
-	int cgroup_fd;
+	int parent_cg_fd = -1, child_cg_fd = -1;
 
 	/* Create a cgroup, get fd, and join it */
-	cgroup_fd = test__join_cgroup(TEST_CGROUP);
-	if (CHECK_FAIL(cgroup_fd < 0))
+	parent_cg_fd = test__join_cgroup(PARENT_CGROUP);
+	if (CHECK_FAIL(parent_cg_fd < 0))
 		return;
+	parent_cg_id = get_cgroup_id(PARENT_CGROUP);
+	if (CHECK_FAIL(!parent_cg_id))
+		goto done;
+
+	child_cg_fd = test__join_cgroup(CHILD_CGROUP);
+	if (CHECK_FAIL(child_cg_fd < 0))
+		goto done;
+	child_cg_id = get_cgroup_id(CHILD_CGROUP);
+	if (CHECK_FAIL(!child_cg_id))
+		goto done;
 
 	skel = test_sock_fields__open_and_load();
 	if (CHECK(!skel, "test_sock_fields__open_and_load", "failed\n"))
 		goto done;
 
 	egress_link = bpf_program__attach_cgroup(skel->progs.egress_read_sock_fields,
-						 cgroup_fd);
+						 child_cg_fd);
 	if (CHECK(IS_ERR(egress_link), "attach_cgroup(egress)", "err:%ld\n",
 		  PTR_ERR(egress_link)))
 		goto done;
 
 	ingress_link = bpf_program__attach_cgroup(skel->progs.ingress_read_sock_fields,
-						  cgroup_fd);
+						  child_cg_fd);
 	if (CHECK(IS_ERR(ingress_link), "attach_cgroup(ingress)", "err:%ld\n",
 		  PTR_ERR(ingress_link)))
 		goto done;
@@ -352,5 +375,8 @@ void test_sock_fields(void)
 	bpf_link__destroy(egress_link);
 	bpf_link__destroy(ingress_link);
 	test_sock_fields__destroy(skel);
-	close(cgroup_fd);
+	if (child_cg_fd != -1)
+		close(child_cg_fd);
+	if (parent_cg_fd != -1)
+		close(parent_cg_fd);
 }
diff --git a/tools/testing/selftests/bpf/progs/test_sock_fields.c b/tools/testing/selftests/bpf/progs/test_sock_fields.c
index 370e33a858db..81b57b9aaaea 100644
--- a/tools/testing/selftests/bpf/progs/test_sock_fields.c
+++ b/tools/testing/selftests/bpf/progs/test_sock_fields.c
@@ -7,6 +7,7 @@
 
 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_endian.h>
+#include "bpf_tcp_helpers.h"
 
 enum bpf_linum_array_idx {
 	EGRESS_LINUM_IDX,
@@ -47,6 +48,9 @@ struct bpf_tcp_sock srv_tp = {};
 struct bpf_sock listen_sk = {};
 struct bpf_sock srv_sk = {};
 struct bpf_sock cli_sk = {};
+__u64 parent_cg_id = 0;
+__u64 child_cg_id = 0;
+__u64 lsndtime = 0;
 
 static bool is_loopback6(__u32 *a6)
 {
@@ -121,6 +125,7 @@ int egress_read_sock_fields(struct __sk_buff *skb)
 	struct bpf_tcp_sock *tp, *tp_ret;
 	struct bpf_sock *sk, *sk_ret;
 	__u32 linum, linum_idx;
+	struct tcp_sock *ktp;
 
 	linum_idx = EGRESS_LINUM_IDX;
 
@@ -165,9 +170,24 @@ int egress_read_sock_fields(struct __sk_buff *skb)
 	tpcpy(tp_ret, tp);
 
 	if (sk_ret == &srv_sk) {
+		ktp = bpf_skc_to_tcp_sock(sk);
+
+		if (!ktp)
+			RET_LOG();
+
+		lsndtime = ktp->lsndtime;
+
+		child_cg_id = bpf_sk_cgroup_id(ktp);
+		if (!child_cg_id)
+			RET_LOG();
+
+		parent_cg_id = bpf_sk_ancestor_cgroup_id(ktp, 2);
+		if (!parent_cg_id)
+			RET_LOG();
+
 		/* The userspace has created it for srv sk */
-		pkt_out_cnt = bpf_sk_storage_get(&sk_pkt_out_cnt, sk, 0, 0);
-		pkt_out_cnt10 = bpf_sk_storage_get(&sk_pkt_out_cnt10, sk,
+		pkt_out_cnt = bpf_sk_storage_get(&sk_pkt_out_cnt, ktp, 0, 0);
+		pkt_out_cnt10 = bpf_sk_storage_get(&sk_pkt_out_cnt10, ktp,
 						   0, 0);
 	} else {
 		pkt_out_cnt = bpf_sk_storage_get(&sk_pkt_out_cnt, sk,
-- 
2.24.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v4 bpf-next 12/13] bpf: selftest: Remove enum tcp_ca_state from bpf_tcp_helpers.h
  2020-09-25  0:03 [PATCH v4 bpf-next 00/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type Martin KaFai Lau
                   ` (10 preceding siblings ...)
  2020-09-25  0:04 ` [PATCH v4 bpf-next 11/13] bpf: selftest: Use bpf_skc_to_tcp_sock() " Martin KaFai Lau
@ 2020-09-25  0:04 ` Martin KaFai Lau
  2020-09-25  0:04 ` [PATCH v4 bpf-next 13/13] bpf: selftest: Add test_btf_skc_cls_ingress Martin KaFai Lau
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Martin KaFai Lau @ 2020-09-25  0:04 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team, Lorenz Bauer, netdev

The enum tcp_ca_state is available in <linux/tcp.h>.
Remove it from the bpf_tcp_helpers.h to avoid conflict when the bpf prog
needs to include both both <linux/tcp.h> and bpf_tcp_helpers.h.

Modify the bpf_cubic.c and bpf_dctcp.c to use <linux/tcp.h> instead.
The <linux/stddef.h> is needed by <linux/tcp.h>.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 tools/testing/selftests/bpf/bpf_tcp_helpers.h | 8 --------
 tools/testing/selftests/bpf/progs/bpf_cubic.c | 2 ++
 tools/testing/selftests/bpf/progs/bpf_dctcp.c | 2 ++
 3 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/bpf/bpf_tcp_helpers.h b/tools/testing/selftests/bpf/bpf_tcp_helpers.h
index 5bf2fe9b1efa..a0e8b3758bd7 100644
--- a/tools/testing/selftests/bpf/bpf_tcp_helpers.h
+++ b/tools/testing/selftests/bpf/bpf_tcp_helpers.h
@@ -115,14 +115,6 @@ enum tcp_ca_event {
 	CA_EVENT_ECN_IS_CE = 5,
 };
 
-enum tcp_ca_state {
-	TCP_CA_Open = 0,
-	TCP_CA_Disorder = 1,
-	TCP_CA_CWR = 2,
-	TCP_CA_Recovery = 3,
-	TCP_CA_Loss = 4
-};
-
 struct ack_sample {
 	__u32 pkts_acked;
 	__s32 rtt_us;
diff --git a/tools/testing/selftests/bpf/progs/bpf_cubic.c b/tools/testing/selftests/bpf/progs/bpf_cubic.c
index ef574087f1e1..6939bfd8690f 100644
--- a/tools/testing/selftests/bpf/progs/bpf_cubic.c
+++ b/tools/testing/selftests/bpf/progs/bpf_cubic.c
@@ -15,6 +15,8 @@
  */
 
 #include <linux/bpf.h>
+#include <linux/stddef.h>
+#include <linux/tcp.h>
 #include "bpf_tcp_helpers.h"
 
 char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/bpf_dctcp.c b/tools/testing/selftests/bpf/progs/bpf_dctcp.c
index 3fb4260570b1..4dc1a967776a 100644
--- a/tools/testing/selftests/bpf/progs/bpf_dctcp.c
+++ b/tools/testing/selftests/bpf/progs/bpf_dctcp.c
@@ -9,6 +9,8 @@
 #include <stddef.h>
 #include <linux/bpf.h>
 #include <linux/types.h>
+#include <linux/stddef.h>
+#include <linux/tcp.h>
 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_tracing.h>
 #include "bpf_tcp_helpers.h"
-- 
2.24.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v4 bpf-next 13/13] bpf: selftest: Add test_btf_skc_cls_ingress
  2020-09-25  0:03 [PATCH v4 bpf-next 00/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type Martin KaFai Lau
                   ` (11 preceding siblings ...)
  2020-09-25  0:04 ` [PATCH v4 bpf-next 12/13] bpf: selftest: Remove enum tcp_ca_state from bpf_tcp_helpers.h Martin KaFai Lau
@ 2020-09-25  0:04 ` Martin KaFai Lau
  2020-09-25 16:24   ` John Fastabend
  2020-09-25  9:40 ` [PATCH v4 bpf-next 00/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type Lorenz Bauer
  2020-09-25 23:22 ` Alexei Starovoitov
  14 siblings, 1 reply; 26+ messages in thread
From: Martin KaFai Lau @ 2020-09-25  0:04 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team, Lorenz Bauer, netdev

This patch attaches a classifier prog to the ingress filter.
It exercises the following helpers with different socket pointer
types in different logical branches:
1. bpf_sk_release()
2. bpf_sk_assign()
3. bpf_skc_to_tcp_request_sock(), bpf_skc_to_tcp_sock()
4. bpf_tcp_gen_syncookie, bpf_tcp_check_syncookie

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 tools/testing/selftests/bpf/bpf_tcp_helpers.h |   5 +
 .../bpf/prog_tests/btf_skc_cls_ingress.c      | 234 ++++++++++++++++++
 .../bpf/progs/test_btf_skc_cls_ingress.c      | 174 +++++++++++++
 3 files changed, 413 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_skc_cls_ingress.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_btf_skc_cls_ingress.c

diff --git a/tools/testing/selftests/bpf/bpf_tcp_helpers.h b/tools/testing/selftests/bpf/bpf_tcp_helpers.h
index a0e8b3758bd7..2915664c335d 100644
--- a/tools/testing/selftests/bpf/bpf_tcp_helpers.h
+++ b/tools/testing/selftests/bpf/bpf_tcp_helpers.h
@@ -16,6 +16,7 @@ BPF_PROG(name, args)
 
 struct sock_common {
 	unsigned char	skc_state;
+	__u16		skc_num;
 } __attribute__((preserve_access_index));
 
 enum sk_pacing {
@@ -45,6 +46,10 @@ struct inet_connection_sock {
 	__u64			  icsk_ca_priv[104 / sizeof(__u64)];
 } __attribute__((preserve_access_index));
 
+struct request_sock {
+	struct sock_common		__req_common;
+} __attribute__((preserve_access_index));
+
 struct tcp_sock {
 	struct inet_connection_sock	inet_conn;
 
diff --git a/tools/testing/selftests/bpf/prog_tests/btf_skc_cls_ingress.c b/tools/testing/selftests/bpf/prog_tests/btf_skc_cls_ingress.c
new file mode 100644
index 000000000000..4ce0e8a25bc5
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/btf_skc_cls_ingress.c
@@ -0,0 +1,234 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2020 Facebook */
+
+#define _GNU_SOURCE
+#include <netinet/in.h>
+#include <arpa/inet.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <string.h>
+#include <errno.h>
+#include <sched.h>
+#include <linux/compiler.h>
+#include <bpf/libbpf.h>
+
+#include "network_helpers.h"
+#include "test_progs.h"
+#include "test_btf_skc_cls_ingress.skel.h"
+
+struct test_btf_skc_cls_ingress *skel;
+struct sockaddr_in6 srv_sa6;
+static __u32 duration;
+
+#define PROG_PIN_FILE "/sys/fs/bpf/btf_skc_cls_ingress"
+
+static int write_sysctl(const char *sysctl, const char *value)
+{
+	int fd, err, len;
+
+	fd = open(sysctl, O_WRONLY);
+	if (CHECK(fd == -1, "open sysctl", "open(%s): %s (%d)\n",
+		  sysctl, strerror(errno), errno))
+		return -1;
+
+	len = strlen(value);
+	err = write(fd, value, len);
+	close(fd);
+	if (CHECK(err != len, "write sysctl",
+		  "write(%s, %s, %d): err:%d %s (%d)\n",
+		  sysctl, value, len, err, strerror(errno), errno))
+		return -1;
+
+	return 0;
+}
+
+static int prepare_netns(void)
+{
+	if (CHECK(unshare(CLONE_NEWNET), "create netns",
+		  "unshare(CLONE_NEWNET): %s (%d)",
+		  strerror(errno), errno))
+		return -1;
+
+	if (CHECK(system("ip link set dev lo up"),
+		  "ip link set dev lo up", "failed\n"))
+		return -1;
+
+	if (CHECK(system("tc qdisc add dev lo clsact"),
+		  "tc qdisc add dev lo clsact", "failed\n"))
+		return -1;
+
+	if (CHECK(system("tc filter add dev lo ingress bpf direct-action object-pinned " PROG_PIN_FILE),
+		  "install tc cls-prog at ingress", "failed\n"))
+		return -1;
+
+	/* Ensure 20 bytes options (i.e. in total 40 bytes tcp header) for the
+	 * bpf_tcp_gen_syncookie() helper.
+	 */
+	if (write_sysctl("/proc/sys/net/ipv4/tcp_window_scaling", "1") ||
+	    write_sysctl("/proc/sys/net/ipv4/tcp_timestamps", "1") ||
+	    write_sysctl("/proc/sys/net/ipv4/tcp_sack", "1"))
+		return -1;
+
+	return 0;
+}
+
+static void reset_test(void)
+{
+	memset(&skel->bss->srv_sa6, 0, sizeof(skel->bss->srv_sa6));
+	skel->bss->listen_tp_sport = 0;
+	skel->bss->req_sk_sport = 0;
+	skel->bss->recv_cookie = 0;
+	skel->bss->gen_cookie = 0;
+	skel->bss->linum = 0;
+}
+
+static void print_err_line(void)
+{
+	if (skel->bss->linum)
+		printf("bpf prog error at line %u\n", skel->bss->linum);
+}
+
+static void test_conn(void)
+{
+	int listen_fd = -1, cli_fd = -1, err;
+	socklen_t addrlen = sizeof(srv_sa6);
+	int srv_port;
+
+	if (write_sysctl("/proc/sys/net/ipv4/tcp_syncookies", "1"))
+		return;
+
+	listen_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0);
+	if (CHECK_FAIL(listen_fd == -1))
+		return;
+
+	err = getsockname(listen_fd, (struct sockaddr *)&srv_sa6, &addrlen);
+	if (CHECK(err, "getsockname(listen_fd)", "err:%d errno:%d\n", err,
+		  errno))
+		goto done;
+	memcpy(&skel->bss->srv_sa6, &srv_sa6, sizeof(srv_sa6));
+	srv_port = ntohs(srv_sa6.sin6_port);
+
+	cli_fd = connect_to_fd(listen_fd, 0);
+	if (CHECK_FAIL(cli_fd == -1))
+		goto done;
+
+	if (CHECK(skel->bss->listen_tp_sport != srv_port ||
+		  skel->bss->req_sk_sport != srv_port,
+		  "Unexpected sk src port",
+		  "listen_tp_sport:%u req_sk_sport:%u expected:%u\n",
+		  skel->bss->listen_tp_sport, skel->bss->req_sk_sport,
+		  srv_port))
+		goto done;
+
+	if (CHECK(skel->bss->gen_cookie || skel->bss->recv_cookie,
+		  "Unexpected syncookie states",
+		  "gen_cookie:%u recv_cookie:%u\n",
+		  skel->bss->gen_cookie, skel->bss->recv_cookie))
+		goto done;
+
+	CHECK(skel->bss->linum, "bpf prog detected error", "at line %u\n",
+	      skel->bss->linum);
+
+done:
+	if (listen_fd != -1)
+		close(listen_fd);
+	if (cli_fd != -1)
+		close(cli_fd);
+}
+
+static void test_syncookie(void)
+{
+	int listen_fd = -1, cli_fd = -1, err;
+	socklen_t addrlen = sizeof(srv_sa6);
+	int srv_port;
+
+	/* Enforce syncookie mode */
+	if (write_sysctl("/proc/sys/net/ipv4/tcp_syncookies", "2"))
+		return;
+
+	listen_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0);
+	if (CHECK_FAIL(listen_fd == -1))
+		return;
+
+	err = getsockname(listen_fd, (struct sockaddr *)&srv_sa6, &addrlen);
+	if (CHECK(err, "getsockname(listen_fd)", "err:%d errno:%d\n", err,
+		  errno))
+		goto done;
+	memcpy(&skel->bss->srv_sa6, &srv_sa6, sizeof(srv_sa6));
+	srv_port = ntohs(srv_sa6.sin6_port);
+
+	cli_fd = connect_to_fd(listen_fd, 0);
+	if (CHECK_FAIL(cli_fd == -1))
+		goto done;
+
+	if (CHECK(skel->bss->listen_tp_sport != srv_port,
+		  "Unexpected tp src port",
+		  "listen_tp_sport:%u expected:%u\n",
+		  skel->bss->listen_tp_sport, srv_port))
+		goto done;
+
+	if (CHECK(skel->bss->req_sk_sport,
+		  "Unexpected req_sk src port",
+		  "req_sk_sport:%u expected:0\n",
+		   skel->bss->req_sk_sport))
+		goto done;
+
+	if (CHECK(!skel->bss->gen_cookie ||
+		  skel->bss->gen_cookie != skel->bss->recv_cookie,
+		  "Unexpected syncookie states",
+		  "gen_cookie:%u recv_cookie:%u\n",
+		  skel->bss->gen_cookie, skel->bss->recv_cookie))
+		goto done;
+
+	CHECK(skel->bss->linum, "bpf prog detected error", "at line %u\n",
+	      skel->bss->linum);
+
+done:
+	if (listen_fd != -1)
+		close(listen_fd);
+	if (cli_fd != -1)
+		close(cli_fd);
+}
+
+struct test {
+	const char *desc;
+	void (*run)(void);
+};
+
+#define DEF_TEST(name) { #name, test_##name }
+static struct test tests[] = {
+	DEF_TEST(conn),
+	DEF_TEST(syncookie),
+};
+
+void test_btf_skc_cls_ingress(void)
+{
+	int i, err;
+
+	skel = test_btf_skc_cls_ingress__open_and_load();
+	if (CHECK(!skel, "test_btf_skc_cls_ingress__open_and_load", "failed\n"))
+		return;
+
+	err = bpf_program__pin(skel->progs.cls_ingress, PROG_PIN_FILE);
+	if (CHECK(err, "bpf_program__pin",
+		  "cannot pin bpf prog to %s. err:%d\n", PROG_PIN_FILE, err)) {
+		test_btf_skc_cls_ingress__destroy(skel);
+		return;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(tests); i++) {
+		if (!test__start_subtest(tests[i].desc))
+			continue;
+
+		if (prepare_netns())
+			break;
+
+		tests[i].run();
+
+		print_err_line();
+		reset_test();
+	}
+
+	bpf_program__unpin(skel->progs.cls_ingress, PROG_PIN_FILE);
+	test_btf_skc_cls_ingress__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_btf_skc_cls_ingress.c b/tools/testing/selftests/bpf/progs/test_btf_skc_cls_ingress.c
new file mode 100644
index 000000000000..9a6b85dd52d2
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_btf_skc_cls_ingress.c
@@ -0,0 +1,174 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2020 Facebook */
+
+#include <string.h>
+#include <errno.h>
+#include <netinet/in.h>
+#include <linux/stddef.h>
+#include <linux/bpf.h>
+#include <linux/ipv6.h>
+#include <linux/tcp.h>
+#include <linux/if_ether.h>
+#include <linux/pkt_cls.h>
+
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_endian.h>
+#include "bpf_tcp_helpers.h"
+
+struct sockaddr_in6 srv_sa6 = {};
+__u16 listen_tp_sport = 0;
+__u16 req_sk_sport = 0;
+__u32 recv_cookie = 0;
+__u32 gen_cookie = 0;
+__u32 linum = 0;
+
+#define LOG() ({ if (!linum) linum = __LINE__; })
+
+static void test_syncookie_helper(struct ipv6hdr *ip6h, struct tcphdr *th,
+				  struct tcp_sock *tp,
+				  struct __sk_buff *skb)
+{
+	if (th->syn) {
+		__s64 mss_cookie;
+		void *data_end;
+
+		data_end = (void *)(long)(skb->data_end);
+
+		if (th->doff * 4 != 40) {
+			LOG();
+			return;
+		}
+
+		if ((void *)th + 40 > data_end) {
+			LOG();
+			return;
+		}
+
+		mss_cookie = bpf_tcp_gen_syncookie(tp, ip6h, sizeof(*ip6h),
+						   th, 40);
+		if (mss_cookie < 0) {
+			if (mss_cookie != -ENOENT)
+				LOG();
+		} else {
+			gen_cookie = (__u32)mss_cookie;
+		}
+	} else if (gen_cookie) {
+		/* It was in cookie mode */
+		int ret = bpf_tcp_check_syncookie(tp, ip6h, sizeof(*ip6h),
+						  th, sizeof(*th));
+
+		if (ret < 0) {
+			if (ret != -ENOENT)
+				LOG();
+		} else {
+			recv_cookie = bpf_ntohl(th->ack_seq) - 1;
+		}
+	}
+}
+
+static int handle_ip6_tcp(struct ipv6hdr *ip6h, struct __sk_buff *skb)
+{
+	struct bpf_sock_tuple *tuple;
+	struct bpf_sock *bpf_skc;
+	unsigned int tuple_len;
+	struct tcphdr *th;
+	void *data_end;
+
+	data_end = (void *)(long)(skb->data_end);
+
+	th = (struct tcphdr *)(ip6h + 1);
+	if (th + 1 > data_end)
+		return TC_ACT_OK;
+
+	/* Is it the testing traffic? */
+	if (th->dest != srv_sa6.sin6_port)
+		return TC_ACT_OK;
+
+	tuple_len = sizeof(tuple->ipv6);
+	tuple = (struct bpf_sock_tuple *)&ip6h->saddr;
+	if ((void *)tuple + tuple_len > data_end) {
+		LOG();
+		return TC_ACT_OK;
+	}
+
+	bpf_skc = bpf_skc_lookup_tcp(skb, tuple, tuple_len,
+				     BPF_F_CURRENT_NETNS, 0);
+	if (!bpf_skc) {
+		LOG();
+		return TC_ACT_OK;
+	}
+
+	if (bpf_skc->state == BPF_TCP_NEW_SYN_RECV) {
+		struct request_sock *req_sk;
+
+		req_sk = (struct request_sock *)bpf_skc_to_tcp_request_sock(bpf_skc);
+		if (!req_sk) {
+			LOG();
+			goto release;
+		}
+
+		if (bpf_sk_assign(skb, req_sk, 0)) {
+			LOG();
+			goto release;
+		}
+
+		req_sk_sport = req_sk->__req_common.skc_num;
+
+		bpf_sk_release(req_sk);
+		return TC_ACT_OK;
+	} else if (bpf_skc->state == BPF_TCP_LISTEN) {
+		struct tcp_sock *tp;
+
+		tp = bpf_skc_to_tcp_sock(bpf_skc);
+		if (!tp) {
+			LOG();
+			goto release;
+		}
+
+		if (bpf_sk_assign(skb, tp, 0)) {
+			LOG();
+			goto release;
+		}
+
+		listen_tp_sport = tp->inet_conn.icsk_inet.sk.__sk_common.skc_num;
+
+		test_syncookie_helper(ip6h, th, tp, skb);
+		bpf_sk_release(tp);
+		return TC_ACT_OK;
+	}
+
+	if (bpf_sk_assign(skb, bpf_skc, 0))
+		LOG();
+
+release:
+	bpf_sk_release(bpf_skc);
+	return TC_ACT_OK;
+}
+
+SEC("classifier/ingress")
+int cls_ingress(struct __sk_buff *skb)
+{
+	struct ipv6hdr *ip6h;
+	struct ethhdr *eth;
+	void *data_end;
+
+	data_end = (void *)(long)(skb->data_end);
+
+	eth = (struct ethhdr *)(long)(skb->data);
+	if (eth + 1 > data_end)
+		return TC_ACT_OK;
+
+	if (eth->h_proto != bpf_htons(ETH_P_IPV6))
+		return TC_ACT_OK;
+
+	ip6h = (struct ipv6hdr *)(eth + 1);
+	if (ip6h + 1 > data_end)
+		return TC_ACT_OK;
+
+	if (ip6h->nexthdr == IPPROTO_TCP)
+		return handle_ip6_tcp(ip6h, skb);
+
+	return TC_ACT_OK;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.24.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 bpf-next 01/13] bpf: Move the PTR_TO_BTF_ID check to check_reg_type()
  2020-09-25  0:03 ` [PATCH v4 bpf-next 01/13] bpf: Move the PTR_TO_BTF_ID check to check_reg_type() Martin KaFai Lau
@ 2020-09-25  8:22   ` Lorenz Bauer
  2020-09-25 13:36   ` John Fastabend
  1 sibling, 0 replies; 26+ messages in thread
From: Lorenz Bauer @ 2020-09-25  8:22 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Kernel Team, Networking

On Fri, 25 Sep 2020 at 01:03, Martin KaFai Lau <kafai@fb.com> wrote:
>
> check_reg_type() checks whether a reg can be used as an arg of a
> func_proto.  For PTR_TO_BTF_ID, the check is actually not
> completely done until the reg->btf_id is pointing to a
> kernel struct that is acceptable by the func_proto.
>
> Thus, this patch moves the btf_id check into check_reg_type().
> "arg_type" and "arg_btf_id" are passed to check_reg_type() instead of
> "compatible".  The compatible_reg_types[] usage is localized in
> check_reg_type() now.
>
> The "if (!btf_id) verbose(...); " is also removed since it won't happen.
>
> Signed-off-by: Martin KaFai Lau <kafai@fb.com>

Acked-by: Lorenz Bauer <lmb@cloudflare.com>

> ---
>  kernel/bpf/verifier.c | 60 ++++++++++++++++++++++---------------------
>  1 file changed, 31 insertions(+), 29 deletions(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 42dee5dcbc74..945fa2b4d096 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -4028,19 +4028,27 @@ static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
>  };
>
>  static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
> -                         const struct bpf_reg_types *compatible)
> +                         enum bpf_arg_type arg_type,
> +                         const u32 *arg_btf_id)
>  {
>         struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
>         enum bpf_reg_type expected, type = reg->type;
> +       const struct bpf_reg_types *compatible;
>         int i, j;
>
> +       compatible = compatible_reg_types[arg_type];
> +       if (!compatible) {
> +               verbose(env, "verifier internal error: unsupported arg type %d\n", arg_type);
> +               return -EFAULT;
> +       }
> +
>         for (i = 0; i < ARRAY_SIZE(compatible->types); i++) {
>                 expected = compatible->types[i];
>                 if (expected == NOT_INIT)
>                         break;
>
>                 if (type == expected)
> -                       return 0;
> +                       goto found;
>         }
>
>         verbose(env, "R%d type=%s expected=", regno, reg_type_str[type]);
> @@ -4048,6 +4056,25 @@ static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
>                 verbose(env, "%s, ", reg_type_str[compatible->types[j]]);
>         verbose(env, "%s\n", reg_type_str[compatible->types[j]]);
>         return -EACCES;
> +
> +found:
> +       if (type == PTR_TO_BTF_ID) {
> +               if (!btf_struct_ids_match(&env->log, reg->off, reg->btf_id,
> +                                         *arg_btf_id)) {
> +                       verbose(env, "R%d is of type %s but %s is expected\n",
> +                               regno, kernel_type_name(reg->btf_id),
> +                               kernel_type_name(*arg_btf_id));
> +                       return -EACCES;
> +               }
> +
> +               if (!tnum_is_const(reg->var_off) || reg->var_off.value) {
> +                       verbose(env, "R%d is a pointer to in-kernel struct with non-zero offset\n",
> +                               regno);
> +                       return -EACCES;
> +               }
> +       }
> +
> +       return 0;
>  }
>
>  static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
> @@ -4057,7 +4084,6 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
>         u32 regno = BPF_REG_1 + arg;
>         struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
>         enum bpf_arg_type arg_type = fn->arg_type[arg];
> -       const struct bpf_reg_types *compatible;
>         enum bpf_reg_type type = reg->type;
>         int err = 0;
>
> @@ -4097,35 +4123,11 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
>                  */
>                 goto skip_type_check;
>
> -       compatible = compatible_reg_types[arg_type];
> -       if (!compatible) {
> -               verbose(env, "verifier internal error: unsupported arg type %d\n", arg_type);
> -               return -EFAULT;
> -       }
> -
> -       err = check_reg_type(env, regno, compatible);
> +       err = check_reg_type(env, regno, arg_type, fn->arg_btf_id[arg]);
>         if (err)
>                 return err;
>
> -       if (type == PTR_TO_BTF_ID) {
> -               const u32 *btf_id = fn->arg_btf_id[arg];
> -
> -               if (!btf_id) {
> -                       verbose(env, "verifier internal error: missing BTF ID\n");
> -                       return -EFAULT;
> -               }
> -
> -               if (!btf_struct_ids_match(&env->log, reg->off, reg->btf_id, *btf_id)) {
> -                       verbose(env, "R%d is of type %s but %s is expected\n",
> -                               regno, kernel_type_name(reg->btf_id), kernel_type_name(*btf_id));
> -                       return -EACCES;
> -               }
> -               if (!tnum_is_const(reg->var_off) || reg->var_off.value) {
> -                       verbose(env, "R%d is a pointer to in-kernel struct with non-zero offset\n",
> -                               regno);
> -                       return -EACCES;
> -               }
> -       } else if (type == PTR_TO_CTX) {
> +       if (type == PTR_TO_CTX) {
>                 err = check_ctx_reg(env, reg, regno);
>                 if (err < 0)
>                         return err;
> --
> 2.24.1
>


-- 
Lorenz Bauer  |  Systems Engineer
6th Floor, County Hall/The Riverside Building, SE1 7PB, UK

www.cloudflare.com

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 bpf-next 02/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type
  2020-09-25  0:03 ` [PATCH v4 bpf-next 02/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type Martin KaFai Lau
@ 2020-09-25  8:26   ` Lorenz Bauer
  2020-09-25 13:18     ` Martin KaFai Lau
  2020-09-25 14:21   ` John Fastabend
  1 sibling, 1 reply; 26+ messages in thread
From: Lorenz Bauer @ 2020-09-25  8:26 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Kernel Team, Networking

On Fri, 25 Sep 2020 at 01:04, Martin KaFai Lau <kafai@fb.com> wrote:
>
> There is a constant need to add more fields into the bpf_tcp_sock
> for the bpf programs running at tc, sock_ops...etc.
>
> A current workaround could be to use bpf_probe_read_kernel().  However,
> other than making another helper call for reading each field and missing
> CO-RE, it is also not as intuitive to use as directly reading
> "tp->lsndtime" for example.  While already having perfmon cap to do
> bpf_probe_read_kernel(), it will be much easier if the bpf prog can
> directly read from the tcp_sock.
>
> This patch tries to do that by using the existing casting-helpers
> bpf_skc_to_*() whose func_proto returns a btf_id.  For example, the
> func_proto of bpf_skc_to_tcp_sock returns the btf_id of the
> kernel "struct tcp_sock".
>
> These helpers are also added to is_ptr_cast_function().
> It ensures the returning reg (BPF_REF_0) will also carries the ref_obj_id.
> That will keep the ref-tracking works properly.
>
> The bpf_skc_to_* helpers are made available to most of the bpf prog
> types in filter.c. The bpf_skc_to_* helpers will be limited by
> perfmon cap.
>
> This patch adds a ARG_PTR_TO_BTF_ID_SOCK_COMMON.  The helper accepting
> this arg can accept a btf-id-ptr (PTR_TO_BTF_ID + &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON])
> or a legacy-ctx-convert-skc-ptr (PTR_TO_SOCK_COMMON).  The bpf_skc_to_*()
> helpers are changed to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that
> they will accept pointer obtained from skb->sk.
>
> Instead of specifying both arg_type and arg_btf_id in the same func_proto
> which is how the current ARG_PTR_TO_BTF_ID does, the arg_btf_id of
> the new ARG_PTR_TO_BTF_ID_SOCK_COMMON is specified in the
> compatible_reg_types[] in verifier.c.  The reason is the arg_btf_id is
> always the same.  Discussion in this thread:
> https://lore.kernel.org/bpf/20200922070422.1917351-1-kafai@fb.com/
>
> The ARG_PTR_TO_BTF_ID_ part gives a clear expectation that the helper is
> expecting a PTR_TO_BTF_ID which could be NULL.  This is the same
> behavior as the existing helper taking ARG_PTR_TO_BTF_ID.
>
> The _SOCK_COMMON part means the helper is also expecting the legacy
> SOCK_COMMON pointer.
>
> By excluding the _OR_NULL part, the bpf prog cannot call helper
> with a literal NULL which doesn't make sense in most cases.
> e.g. bpf_skc_to_tcp_sock(NULL) will be rejected.  All PTR_TO_*_OR_NULL
> reg has to do a NULL check first before passing into the helper or else
> the bpf prog will be rejected.  This behavior is nothing new and
> consistent with the current expectation during bpf-prog-load.
>
> [ ARG_PTR_TO_BTF_ID_SOCK_COMMON will be used to replace
>   ARG_PTR_TO_SOCK* of other existing helpers later such that
>   those existing helpers can take the PTR_TO_BTF_ID returned by
>   the bpf_skc_to_*() helpers.
>
>   The only special case is bpf_sk_lookup_assign() which can accept a
>   literal NULL ptr.  It has to be handled specially in another follow
>   up patch if there is a need (e.g. by renaming ARG_PTR_TO_SOCKET_OR_NULL
>   to ARG_PTR_TO_BTF_ID_SOCK_COMMON_OR_NULL). ]
>
> [ When converting the older helpers that take ARG_PTR_TO_SOCK* in
>   the later patch, if the kernel does not support BTF,
>   ARG_PTR_TO_BTF_ID_SOCK_COMMON will behave like ARG_PTR_TO_SOCK_COMMON
>   because no reg->type could have PTR_TO_BTF_ID in this case.
>
>   It is not a concern for the newer-btf-only helper like the bpf_skc_to_*()
>   here though because these helpers must require BTF vmlinux to begin
>   with. ]
>
> Signed-off-by: Martin KaFai Lau <kafai@fb.com>
> ---
>  include/linux/bpf.h   |  1 +
>  kernel/bpf/verifier.c | 34 +++++++++++++++++++--
>  net/core/filter.c     | 69 ++++++++++++++++++++++++++++++-------------
>  3 files changed, 82 insertions(+), 22 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index fc5c901c7542..d0937f1d2980 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -292,6 +292,7 @@ enum bpf_arg_type {
>         ARG_PTR_TO_ALLOC_MEM,   /* pointer to dynamically allocated memory */
>         ARG_PTR_TO_ALLOC_MEM_OR_NULL,   /* pointer to dynamically allocated memory or NULL */
>         ARG_CONST_ALLOC_SIZE_OR_ZERO,   /* number of allocated bytes requested */
> +       ARG_PTR_TO_BTF_ID_SOCK_COMMON,  /* pointer to in-kernel sock_common or bpf-mirrored bpf_sock */
>         __BPF_ARG_TYPE_MAX,
>  };
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 945fa2b4d096..d4ba29fb17a6 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -486,7 +486,12 @@ static bool is_acquire_function(enum bpf_func_id func_id,
>  static bool is_ptr_cast_function(enum bpf_func_id func_id)
>  {
>         return func_id == BPF_FUNC_tcp_sock ||
> -               func_id == BPF_FUNC_sk_fullsock;
> +               func_id == BPF_FUNC_sk_fullsock ||
> +               func_id == BPF_FUNC_skc_to_tcp_sock ||
> +               func_id == BPF_FUNC_skc_to_tcp6_sock ||
> +               func_id == BPF_FUNC_skc_to_udp6_sock ||
> +               func_id == BPF_FUNC_skc_to_tcp_timewait_sock ||
> +               func_id == BPF_FUNC_skc_to_tcp_request_sock;
>  }
>
>  /* string representation of 'enum bpf_reg_type' */
> @@ -3953,6 +3958,7 @@ static int resolve_map_arg_type(struct bpf_verifier_env *env,
>
>  struct bpf_reg_types {
>         const enum bpf_reg_type types[10];
> +       u32 *btf_id;
>  };
>
>  static const struct bpf_reg_types map_key_value_types = {
> @@ -3973,6 +3979,17 @@ static const struct bpf_reg_types sock_types = {
>         },
>  };
>
> +static const struct bpf_reg_types btf_id_sock_common_types = {
> +       .types = {
> +               PTR_TO_SOCK_COMMON,
> +               PTR_TO_SOCKET,
> +               PTR_TO_TCP_SOCK,
> +               PTR_TO_XDP_SOCK,
> +               PTR_TO_BTF_ID,
> +       },
> +       .btf_id = &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON],
> +};
> +
>  static const struct bpf_reg_types mem_types = {
>         .types = {
>                 PTR_TO_STACK,
> @@ -4014,6 +4031,7 @@ static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
>         [ARG_PTR_TO_CTX]                = &context_types,
>         [ARG_PTR_TO_CTX_OR_NULL]        = &context_types,
>         [ARG_PTR_TO_SOCK_COMMON]        = &sock_types,
> +       [ARG_PTR_TO_BTF_ID_SOCK_COMMON] = &btf_id_sock_common_types,
>         [ARG_PTR_TO_SOCKET]             = &fullsock_types,
>         [ARG_PTR_TO_SOCKET_OR_NULL]     = &fullsock_types,
>         [ARG_PTR_TO_BTF_ID]             = &btf_ptr_types,
> @@ -4059,6 +4077,14 @@ static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
>
>  found:
>         if (type == PTR_TO_BTF_ID) {
> +               if (!arg_btf_id) {
> +                       if (!compatible->btf_id) {
> +                               verbose(env, "verifier internal error: missing arg compatible BTF ID\n");
> +                               return -EFAULT;
> +                       }
> +                       arg_btf_id = compatible->btf_id;
> +               }
> +
>                 if (!btf_struct_ids_match(&env->log, reg->off, reg->btf_id,
>                                           *arg_btf_id)) {
>                         verbose(env, "R%d is of type %s but %s is expected\n",
> @@ -4575,10 +4601,14 @@ static bool check_btf_id_ok(const struct bpf_func_proto *fn)
>  {
>         int i;
>
> -       for (i = 0; i < ARRAY_SIZE(fn->arg_type); i++)
> +       for (i = 0; i < ARRAY_SIZE(fn->arg_type); i++) {
>                 if (fn->arg_type[i] == ARG_PTR_TO_BTF_ID && !fn->arg_btf_id[i])
>                         return false;
>
> +               if (fn->arg_type[i] != ARG_PTR_TO_BTF_ID && fn->arg_btf_id[i])
> +                       return false;
> +       }
> +

This is a hold over from the previous patchset?

>         return true;
>  }
>
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 706f8db0ccf8..6d1864f2bd51 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -77,6 +77,9 @@
>  #include <net/transp_v6.h>
>  #include <linux/btf_ids.h>
>
> +static const struct bpf_func_proto *
> +bpf_sk_base_func_proto(enum bpf_func_id func_id);
> +
>  int copy_bpf_fprog_from_user(struct sock_fprog *dst, sockptr_t src, int len)
>  {
>         if (in_compat_syscall()) {
> @@ -6620,7 +6623,7 @@ sock_addr_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>                         return NULL;
>                 }
>         default:
> -               return bpf_base_func_proto(func_id);
> +               return bpf_sk_base_func_proto(func_id);
>         }
>  }
>
> @@ -6639,7 +6642,7 @@ sk_filter_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>         case BPF_FUNC_perf_event_output:
>                 return &bpf_skb_event_output_proto;
>         default:
> -               return bpf_base_func_proto(func_id);
> +               return bpf_sk_base_func_proto(func_id);
>         }
>  }
>
> @@ -6800,7 +6803,7 @@ tc_cls_act_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>                 return &bpf_sk_assign_proto;
>  #endif
>         default:
> -               return bpf_base_func_proto(func_id);
> +               return bpf_sk_base_func_proto(func_id);
>         }
>  }
>
> @@ -6841,7 +6844,7 @@ xdp_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>                 return &bpf_tcp_gen_syncookie_proto;
>  #endif
>         default:
> -               return bpf_base_func_proto(func_id);
> +               return bpf_sk_base_func_proto(func_id);
>         }
>  }
>
> @@ -6883,7 +6886,7 @@ sock_ops_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>                 return &bpf_tcp_sock_proto;
>  #endif /* CONFIG_INET */
>         default:
> -               return bpf_base_func_proto(func_id);
> +               return bpf_sk_base_func_proto(func_id);
>         }
>  }
>
> @@ -6929,7 +6932,7 @@ sk_msg_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>                 return &bpf_get_cgroup_classid_curr_proto;
>  #endif
>         default:
> -               return bpf_base_func_proto(func_id);
> +               return bpf_sk_base_func_proto(func_id);
>         }
>  }
>
> @@ -6971,7 +6974,7 @@ sk_skb_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>                 return &bpf_skc_lookup_tcp_proto;
>  #endif
>         default:
> -               return bpf_base_func_proto(func_id);
> +               return bpf_sk_base_func_proto(func_id);
>         }
>  }
>
> @@ -6982,7 +6985,7 @@ flow_dissector_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>         case BPF_FUNC_skb_load_bytes:
>                 return &bpf_flow_dissector_load_bytes_proto;
>         default:
> -               return bpf_base_func_proto(func_id);
> +               return bpf_sk_base_func_proto(func_id);
>         }
>  }
>
> @@ -7009,7 +7012,7 @@ lwt_out_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>         case BPF_FUNC_skb_under_cgroup:
>                 return &bpf_skb_under_cgroup_proto;
>         default:
> -               return bpf_base_func_proto(func_id);
> +               return bpf_sk_base_func_proto(func_id);
>         }
>  }
>
> @@ -9746,7 +9749,7 @@ sk_lookup_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>         case BPF_FUNC_sk_release:
>                 return &bpf_sk_release_proto;
>         default:
> -               return bpf_base_func_proto(func_id);
> +               return bpf_sk_base_func_proto(func_id);
>         }
>  }
>
> @@ -9913,8 +9916,7 @@ const struct bpf_func_proto bpf_skc_to_tcp6_sock_proto = {
>         .func                   = bpf_skc_to_tcp6_sock,
>         .gpl_only               = false,
>         .ret_type               = RET_PTR_TO_BTF_ID_OR_NULL,
> -       .arg1_type              = ARG_PTR_TO_BTF_ID,
> -       .arg1_btf_id            = &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON],
> +       .arg1_type              = ARG_PTR_TO_BTF_ID_SOCK_COMMON,
>         .ret_btf_id             = &btf_sock_ids[BTF_SOCK_TYPE_TCP6],
>  };
>
> @@ -9930,8 +9932,7 @@ const struct bpf_func_proto bpf_skc_to_tcp_sock_proto = {
>         .func                   = bpf_skc_to_tcp_sock,
>         .gpl_only               = false,
>         .ret_type               = RET_PTR_TO_BTF_ID_OR_NULL,
> -       .arg1_type              = ARG_PTR_TO_BTF_ID,
> -       .arg1_btf_id            = &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON],
> +       .arg1_type              = ARG_PTR_TO_BTF_ID_SOCK_COMMON,
>         .ret_btf_id             = &btf_sock_ids[BTF_SOCK_TYPE_TCP],
>  };
>
> @@ -9954,8 +9955,7 @@ const struct bpf_func_proto bpf_skc_to_tcp_timewait_sock_proto = {
>         .func                   = bpf_skc_to_tcp_timewait_sock,
>         .gpl_only               = false,
>         .ret_type               = RET_PTR_TO_BTF_ID_OR_NULL,
> -       .arg1_type              = ARG_PTR_TO_BTF_ID,
> -       .arg1_btf_id            = &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON],
> +       .arg1_type              = ARG_PTR_TO_BTF_ID_SOCK_COMMON,
>         .ret_btf_id             = &btf_sock_ids[BTF_SOCK_TYPE_TCP_TW],
>  };
>
> @@ -9978,8 +9978,7 @@ const struct bpf_func_proto bpf_skc_to_tcp_request_sock_proto = {
>         .func                   = bpf_skc_to_tcp_request_sock,
>         .gpl_only               = false,
>         .ret_type               = RET_PTR_TO_BTF_ID_OR_NULL,
> -       .arg1_type              = ARG_PTR_TO_BTF_ID,
> -       .arg1_btf_id            = &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON],
> +       .arg1_type              = ARG_PTR_TO_BTF_ID_SOCK_COMMON,
>         .ret_btf_id             = &btf_sock_ids[BTF_SOCK_TYPE_TCP_REQ],
>  };
>
> @@ -10000,7 +9999,37 @@ const struct bpf_func_proto bpf_skc_to_udp6_sock_proto = {
>         .func                   = bpf_skc_to_udp6_sock,
>         .gpl_only               = false,
>         .ret_type               = RET_PTR_TO_BTF_ID_OR_NULL,
> -       .arg1_type              = ARG_PTR_TO_BTF_ID,
> -       .arg1_btf_id            = &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON],
> +       .arg1_type              = ARG_PTR_TO_BTF_ID_SOCK_COMMON,
>         .ret_btf_id             = &btf_sock_ids[BTF_SOCK_TYPE_UDP6],
>  };
> +
> +static const struct bpf_func_proto *
> +bpf_sk_base_func_proto(enum bpf_func_id func_id)
> +{
> +       const struct bpf_func_proto *func;
> +
> +       switch (func_id) {
> +       case BPF_FUNC_skc_to_tcp6_sock:
> +               func = &bpf_skc_to_tcp6_sock_proto;
> +               break;
> +       case BPF_FUNC_skc_to_tcp_sock:
> +               func = &bpf_skc_to_tcp_sock_proto;
> +               break;
> +       case BPF_FUNC_skc_to_tcp_timewait_sock:
> +               func = &bpf_skc_to_tcp_timewait_sock_proto;
> +               break;
> +       case BPF_FUNC_skc_to_tcp_request_sock:
> +               func = &bpf_skc_to_tcp_request_sock_proto;
> +               break;
> +       case BPF_FUNC_skc_to_udp6_sock:
> +               func = &bpf_skc_to_udp6_sock_proto;
> +               break;
> +       default:
> +               return bpf_base_func_proto(func_id);
> +       }
> +
> +       if (!perfmon_capable())
> +               return NULL;
> +
> +       return func;
> +}
> --
> 2.24.1
>


-- 
Lorenz Bauer  |  Systems Engineer
6th Floor, County Hall/The Riverside Building, SE1 7PB, UK

www.cloudflare.com

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 bpf-next 07/13] bpf: selftest: Add ref_tracking verifier test for bpf_skc casting
  2020-09-25  0:04 ` [PATCH v4 bpf-next 07/13] bpf: selftest: Add ref_tracking verifier test for bpf_skc casting Martin KaFai Lau
@ 2020-09-25  8:30   ` Lorenz Bauer
  0 siblings, 0 replies; 26+ messages in thread
From: Lorenz Bauer @ 2020-09-25  8:30 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Kernel Team, Networking

On Fri, 25 Sep 2020 at 01:04, Martin KaFai Lau <kafai@fb.com> wrote:
>
> The patch tests for:
> 1. bpf_sk_release() can be called on a tcp_sock btf_id ptr.
>
> 2. Ensure the tcp_sock btf_id pointer cannot be used
>    after bpf_sk_release().
>
> Signed-off-by: Martin KaFai Lau <kafai@fb.com>

Acked-by: Lorenz Bauer <lmb@cloudflare.com>

> ---
>  .../selftests/bpf/verifier/ref_tracking.c     | 47 +++++++++++++++++++
>  1 file changed, 47 insertions(+)
>
> diff --git a/tools/testing/selftests/bpf/verifier/ref_tracking.c b/tools/testing/selftests/bpf/verifier/ref_tracking.c
> index 056e0273bf12..006b5bd99c08 100644
> --- a/tools/testing/selftests/bpf/verifier/ref_tracking.c
> +++ b/tools/testing/selftests/bpf/verifier/ref_tracking.c
> @@ -854,3 +854,50 @@
>         .errstr = "Unreleased reference",
>         .result = REJECT,
>  },
> +{
> +       "reference tracking: bpf_sk_release(btf_tcp_sock)",
> +       .insns = {
> +       BPF_SK_LOOKUP(sk_lookup_tcp),
> +       BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
> +       BPF_EXIT_INSN(),
> +       BPF_MOV64_REG(BPF_REG_6, BPF_REG_0),
> +       BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
> +       BPF_EMIT_CALL(BPF_FUNC_skc_to_tcp_sock),
> +       BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 3),
> +       BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
> +       BPF_EMIT_CALL(BPF_FUNC_sk_release),
> +       BPF_EXIT_INSN(),
> +       BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
> +       BPF_EMIT_CALL(BPF_FUNC_sk_release),
> +       BPF_EXIT_INSN(),
> +       },
> +       .prog_type = BPF_PROG_TYPE_SCHED_CLS,
> +       .result = ACCEPT,
> +       .result_unpriv = REJECT,
> +       .errstr_unpriv = "unknown func",
> +},
> +{
> +       "reference tracking: use ptr from bpf_skc_to_tcp_sock() after release",
> +       .insns = {
> +       BPF_SK_LOOKUP(sk_lookup_tcp),
> +       BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
> +       BPF_EXIT_INSN(),
> +       BPF_MOV64_REG(BPF_REG_6, BPF_REG_0),
> +       BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
> +       BPF_EMIT_CALL(BPF_FUNC_skc_to_tcp_sock),
> +       BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 3),
> +       BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
> +       BPF_EMIT_CALL(BPF_FUNC_sk_release),
> +       BPF_EXIT_INSN(),
> +       BPF_MOV64_REG(BPF_REG_7, BPF_REG_0),
> +       BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
> +       BPF_EMIT_CALL(BPF_FUNC_sk_release),
> +       BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_7, 0),
> +       BPF_EXIT_INSN(),
> +       },
> +       .prog_type = BPF_PROG_TYPE_SCHED_CLS,
> +       .result = REJECT,
> +       .errstr = "invalid mem access",
> +       .result_unpriv = REJECT,
> +       .errstr_unpriv = "unknown func",
> +},
> --
> 2.24.1
>


-- 
Lorenz Bauer  |  Systems Engineer
6th Floor, County Hall/The Riverside Building, SE1 7PB, UK

www.cloudflare.com

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 bpf-next 00/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type
  2020-09-25  0:03 [PATCH v4 bpf-next 00/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type Martin KaFai Lau
                   ` (12 preceding siblings ...)
  2020-09-25  0:04 ` [PATCH v4 bpf-next 13/13] bpf: selftest: Add test_btf_skc_cls_ingress Martin KaFai Lau
@ 2020-09-25  9:40 ` Lorenz Bauer
  2020-09-25 23:22 ` Alexei Starovoitov
  14 siblings, 0 replies; 26+ messages in thread
From: Lorenz Bauer @ 2020-09-25  9:40 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Kernel Team, Networking

On Fri, 25 Sep 2020 at 01:03, Martin KaFai Lau <kafai@fb.com> wrote:
>
> This set allows networking prog type to directly read fields from
> the in-kernel socket type, e.g. "struct tcp_sock".
>
> Patch 2 has the details on the use case.
>
> v3:
> - Pass arg_btf_id instead of fn into check_reg_type() in Patch 1 (Lorenz)
> - Move arg_btf_id from func_proto to struct bpf_reg_types in Patch 2 (Lorenz)
> - Remove test_sock_fields from .gitignore in Patch 8 (Andrii)
> - Add tests to have better coverage on the modified helpers (Alexei)
>   Patch 13 is added.
> - Use "void *sk" as the helper argument in UAPI bpf.h
>
> v3:
> - ARG_PTR_TO_SOCK_COMMON_OR_NULL was attempted in v2.  The _OR_NULL was
>   needed because the PTR_TO_BTF_ID could be NULL but note that a could be NULL
>   PTR_TO_BTF_ID is not a scalar NULL to the verifier.  "_OR_NULL" implicitly
>   gives an expectation that the helper can take a scalar NULL which does
>   not make sense in most (except one) helpers.  Passing scalar NULL
>   should be rejected at the verification time.
>
>   Thus, this patch uses ARG_PTR_TO_BTF_ID_SOCK_COMMON to specify that the
>   helper can take both the btf-id ptr or the legacy PTR_TO_SOCK_COMMON but
>   not scalar NULL.  It requires the func_proto to explicitly specify the
>   arg_btf_id such that there is a very clear expectation that the helper
>   can handle a NULL PTR_TO_BTF_ID.
>
> v2:
> - Add ARG_PTR_TO_SOCK_COMMON_OR_NULL (Lorenz)
>
> Martin KaFai Lau (13):
>   bpf: Move the PTR_TO_BTF_ID check to check_reg_type()
>   bpf: Enable bpf_skc_to_* sock casting helper to networking prog type
>   bpf: Change bpf_sk_release and bpf_sk_*cgroup_id to accept
>     ARG_PTR_TO_BTF_ID_SOCK_COMMON
>   bpf: Change bpf_sk_storage_*() to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON
>   bpf: Change bpf_tcp_*_syncookie to accept
>     ARG_PTR_TO_BTF_ID_SOCK_COMMON
>   bpf: Change bpf_sk_assign to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON
>   bpf: selftest: Add ref_tracking verifier test for bpf_skc casting
>   bpf: selftest: Move sock_fields test into test_progs
>   bpf: selftest: Adapt sock_fields test to use skel and global variables
>   bpf: selftest: Use network_helpers in the sock_fields test
>   bpf: selftest: Use bpf_skc_to_tcp_sock() in the sock_fields test
>   bpf: selftest: Remove enum tcp_ca_state from bpf_tcp_helpers.h
>   bpf: selftest: Add test_btf_skc_cls_ingress


LGTM, thanks!

-- 
Lorenz Bauer  |  Systems Engineer
6th Floor, County Hall/The Riverside Building, SE1 7PB, UK

www.cloudflare.com

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 bpf-next 02/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type
  2020-09-25  8:26   ` Lorenz Bauer
@ 2020-09-25 13:18     ` Martin KaFai Lau
  2020-09-25 13:50       ` Lorenz Bauer
  0 siblings, 1 reply; 26+ messages in thread
From: Martin KaFai Lau @ 2020-09-25 13:18 UTC (permalink / raw)
  To: Lorenz Bauer
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Kernel Team, Networking

On Fri, Sep 25, 2020 at 09:26:36AM +0100, Lorenz Bauer wrote:
> On Fri, 25 Sep 2020 at 01:04, Martin KaFai Lau <kafai@fb.com> wrote:
> >
> > There is a constant need to add more fields into the bpf_tcp_sock
> > for the bpf programs running at tc, sock_ops...etc.
> >
> > A current workaround could be to use bpf_probe_read_kernel().  However,
> > other than making another helper call for reading each field and missing
> > CO-RE, it is also not as intuitive to use as directly reading
> > "tp->lsndtime" for example.  While already having perfmon cap to do
> > bpf_probe_read_kernel(), it will be much easier if the bpf prog can
> > directly read from the tcp_sock.
> >
> > This patch tries to do that by using the existing casting-helpers
> > bpf_skc_to_*() whose func_proto returns a btf_id.  For example, the
> > func_proto of bpf_skc_to_tcp_sock returns the btf_id of the
> > kernel "struct tcp_sock".
> >
> > These helpers are also added to is_ptr_cast_function().
> > It ensures the returning reg (BPF_REF_0) will also carries the ref_obj_id.
> > That will keep the ref-tracking works properly.
> >
> > The bpf_skc_to_* helpers are made available to most of the bpf prog
> > types in filter.c. The bpf_skc_to_* helpers will be limited by
> > perfmon cap.
> >
> > This patch adds a ARG_PTR_TO_BTF_ID_SOCK_COMMON.  The helper accepting
> > this arg can accept a btf-id-ptr (PTR_TO_BTF_ID + &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON])
> > or a legacy-ctx-convert-skc-ptr (PTR_TO_SOCK_COMMON).  The bpf_skc_to_*()
> > helpers are changed to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that
> > they will accept pointer obtained from skb->sk.
> >
> > Instead of specifying both arg_type and arg_btf_id in the same func_proto
> > which is how the current ARG_PTR_TO_BTF_ID does, the arg_btf_id of
> > the new ARG_PTR_TO_BTF_ID_SOCK_COMMON is specified in the
> > compatible_reg_types[] in verifier.c.  The reason is the arg_btf_id is
> > always the same.  Discussion in this thread:
> > https://lore.kernel.org/bpf/20200922070422.1917351-1-kafai@fb.com/
> >
> > The ARG_PTR_TO_BTF_ID_ part gives a clear expectation that the helper is
> > expecting a PTR_TO_BTF_ID which could be NULL.  This is the same
> > behavior as the existing helper taking ARG_PTR_TO_BTF_ID.
> >
> > The _SOCK_COMMON part means the helper is also expecting the legacy
> > SOCK_COMMON pointer.
> >
> > By excluding the _OR_NULL part, the bpf prog cannot call helper
> > with a literal NULL which doesn't make sense in most cases.
> > e.g. bpf_skc_to_tcp_sock(NULL) will be rejected.  All PTR_TO_*_OR_NULL
> > reg has to do a NULL check first before passing into the helper or else
> > the bpf prog will be rejected.  This behavior is nothing new and
> > consistent with the current expectation during bpf-prog-load.
> >
> > [ ARG_PTR_TO_BTF_ID_SOCK_COMMON will be used to replace
> >   ARG_PTR_TO_SOCK* of other existing helpers later such that
> >   those existing helpers can take the PTR_TO_BTF_ID returned by
> >   the bpf_skc_to_*() helpers.
> >
> >   The only special case is bpf_sk_lookup_assign() which can accept a
> >   literal NULL ptr.  It has to be handled specially in another follow
> >   up patch if there is a need (e.g. by renaming ARG_PTR_TO_SOCKET_OR_NULL
> >   to ARG_PTR_TO_BTF_ID_SOCK_COMMON_OR_NULL). ]
> >
> > [ When converting the older helpers that take ARG_PTR_TO_SOCK* in
> >   the later patch, if the kernel does not support BTF,
> >   ARG_PTR_TO_BTF_ID_SOCK_COMMON will behave like ARG_PTR_TO_SOCK_COMMON
> >   because no reg->type could have PTR_TO_BTF_ID in this case.
> >
> >   It is not a concern for the newer-btf-only helper like the bpf_skc_to_*()
> >   here though because these helpers must require BTF vmlinux to begin
> >   with. ]
> >
> > Signed-off-by: Martin KaFai Lau <kafai@fb.com>
> > ---
> >  include/linux/bpf.h   |  1 +
> >  kernel/bpf/verifier.c | 34 +++++++++++++++++++--
> >  net/core/filter.c     | 69 ++++++++++++++++++++++++++++++-------------
> >  3 files changed, 82 insertions(+), 22 deletions(-)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index fc5c901c7542..d0937f1d2980 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -292,6 +292,7 @@ enum bpf_arg_type {
> >         ARG_PTR_TO_ALLOC_MEM,   /* pointer to dynamically allocated memory */
> >         ARG_PTR_TO_ALLOC_MEM_OR_NULL,   /* pointer to dynamically allocated memory or NULL */
> >         ARG_CONST_ALLOC_SIZE_OR_ZERO,   /* number of allocated bytes requested */
> > +       ARG_PTR_TO_BTF_ID_SOCK_COMMON,  /* pointer to in-kernel sock_common or bpf-mirrored bpf_sock */
> >         __BPF_ARG_TYPE_MAX,
> >  };
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 945fa2b4d096..d4ba29fb17a6 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -486,7 +486,12 @@ static bool is_acquire_function(enum bpf_func_id func_id,
> >  static bool is_ptr_cast_function(enum bpf_func_id func_id)
> >  {
> >         return func_id == BPF_FUNC_tcp_sock ||
> > -               func_id == BPF_FUNC_sk_fullsock;
> > +               func_id == BPF_FUNC_sk_fullsock ||
> > +               func_id == BPF_FUNC_skc_to_tcp_sock ||
> > +               func_id == BPF_FUNC_skc_to_tcp6_sock ||
> > +               func_id == BPF_FUNC_skc_to_udp6_sock ||
> > +               func_id == BPF_FUNC_skc_to_tcp_timewait_sock ||
> > +               func_id == BPF_FUNC_skc_to_tcp_request_sock;
> >  }
> >
> >  /* string representation of 'enum bpf_reg_type' */
> > @@ -3953,6 +3958,7 @@ static int resolve_map_arg_type(struct bpf_verifier_env *env,
> >
> >  struct bpf_reg_types {
> >         const enum bpf_reg_type types[10];
> > +       u32 *btf_id;
> >  };
> >
> >  static const struct bpf_reg_types map_key_value_types = {
> > @@ -3973,6 +3979,17 @@ static const struct bpf_reg_types sock_types = {
> >         },
> >  };
> >
> > +static const struct bpf_reg_types btf_id_sock_common_types = {
> > +       .types = {
> > +               PTR_TO_SOCK_COMMON,
> > +               PTR_TO_SOCKET,
> > +               PTR_TO_TCP_SOCK,
> > +               PTR_TO_XDP_SOCK,
> > +               PTR_TO_BTF_ID,
> > +       },
> > +       .btf_id = &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON],
> > +};
> > +
> >  static const struct bpf_reg_types mem_types = {
> >         .types = {
> >                 PTR_TO_STACK,
> > @@ -4014,6 +4031,7 @@ static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
> >         [ARG_PTR_TO_CTX]                = &context_types,
> >         [ARG_PTR_TO_CTX_OR_NULL]        = &context_types,
> >         [ARG_PTR_TO_SOCK_COMMON]        = &sock_types,
> > +       [ARG_PTR_TO_BTF_ID_SOCK_COMMON] = &btf_id_sock_common_types,
> >         [ARG_PTR_TO_SOCKET]             = &fullsock_types,
> >         [ARG_PTR_TO_SOCKET_OR_NULL]     = &fullsock_types,
> >         [ARG_PTR_TO_BTF_ID]             = &btf_ptr_types,
> > @@ -4059,6 +4077,14 @@ static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
> >
> >  found:
> >         if (type == PTR_TO_BTF_ID) {
> > +               if (!arg_btf_id) {
> > +                       if (!compatible->btf_id) {
> > +                               verbose(env, "verifier internal error: missing arg compatible BTF ID\n");
> > +                               return -EFAULT;
> > +                       }
> > +                       arg_btf_id = compatible->btf_id;
> > +               }
> > +
> >                 if (!btf_struct_ids_match(&env->log, reg->off, reg->btf_id,
> >                                           *arg_btf_id)) {
> >                         verbose(env, "R%d is of type %s but %s is expected\n",
> > @@ -4575,10 +4601,14 @@ static bool check_btf_id_ok(const struct bpf_func_proto *fn)
> >  {
> >         int i;
> >
> > -       for (i = 0; i < ARRAY_SIZE(fn->arg_type); i++)
> > +       for (i = 0; i < ARRAY_SIZE(fn->arg_type); i++) {
> >                 if (fn->arg_type[i] == ARG_PTR_TO_BTF_ID && !fn->arg_btf_id[i])
> >                         return false;
> >
> > +               if (fn->arg_type[i] != ARG_PTR_TO_BTF_ID && fn->arg_btf_id[i])
> > +                       return false;
> > +       }
> > +
> 
> This is a hold over from the previous patchset?
hmm... what do you mean?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH v4 bpf-next 01/13] bpf: Move the PTR_TO_BTF_ID check to check_reg_type()
  2020-09-25  0:03 ` [PATCH v4 bpf-next 01/13] bpf: Move the PTR_TO_BTF_ID check to check_reg_type() Martin KaFai Lau
  2020-09-25  8:22   ` Lorenz Bauer
@ 2020-09-25 13:36   ` John Fastabend
  1 sibling, 0 replies; 26+ messages in thread
From: John Fastabend @ 2020-09-25 13:36 UTC (permalink / raw)
  To: Martin KaFai Lau, bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team, Lorenz Bauer, netdev

Martin KaFai Lau wrote:
> check_reg_type() checks whether a reg can be used as an arg of a
> func_proto.  For PTR_TO_BTF_ID, the check is actually not
> completely done until the reg->btf_id is pointing to a
> kernel struct that is acceptable by the func_proto.
> 
> Thus, this patch moves the btf_id check into check_reg_type().
> "arg_type" and "arg_btf_id" are passed to check_reg_type() instead of
> "compatible".  The compatible_reg_types[] usage is localized in
> check_reg_type() now.
> 
> The "if (!btf_id) verbose(...); " is also removed since it won't happen.
> 
> Signed-off-by: Martin KaFai Lau <kafai@fb.com>
> ---

Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 bpf-next 02/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type
  2020-09-25 13:18     ` Martin KaFai Lau
@ 2020-09-25 13:50       ` Lorenz Bauer
  2020-09-25 15:47         ` Alexei Starovoitov
  0 siblings, 1 reply; 26+ messages in thread
From: Lorenz Bauer @ 2020-09-25 13:50 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Kernel Team, Networking

On Fri, 25 Sep 2020 at 14:18, Martin KaFai Lau <kafai@fb.com> wrote:
>
> On Fri, Sep 25, 2020 at 09:26:36AM +0100, Lorenz Bauer wrote:
> > On Fri, 25 Sep 2020 at 01:04, Martin KaFai Lau <kafai@fb.com> wrote:
> > >
> > > There is a constant need to add more fields into the bpf_tcp_sock
> > > for the bpf programs running at tc, sock_ops...etc.
> > >
> > > A current workaround could be to use bpf_probe_read_kernel().  However,
> > > other than making another helper call for reading each field and missing
> > > CO-RE, it is also not as intuitive to use as directly reading
> > > "tp->lsndtime" for example.  While already having perfmon cap to do
> > > bpf_probe_read_kernel(), it will be much easier if the bpf prog can
> > > directly read from the tcp_sock.
> > >
> > > This patch tries to do that by using the existing casting-helpers
> > > bpf_skc_to_*() whose func_proto returns a btf_id.  For example, the
> > > func_proto of bpf_skc_to_tcp_sock returns the btf_id of the
> > > kernel "struct tcp_sock".
> > >
> > > These helpers are also added to is_ptr_cast_function().
> > > It ensures the returning reg (BPF_REF_0) will also carries the ref_obj_id.
> > > That will keep the ref-tracking works properly.
> > >
> > > The bpf_skc_to_* helpers are made available to most of the bpf prog
> > > types in filter.c. The bpf_skc_to_* helpers will be limited by
> > > perfmon cap.
> > >
> > > This patch adds a ARG_PTR_TO_BTF_ID_SOCK_COMMON.  The helper accepting
> > > this arg can accept a btf-id-ptr (PTR_TO_BTF_ID + &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON])
> > > or a legacy-ctx-convert-skc-ptr (PTR_TO_SOCK_COMMON).  The bpf_skc_to_*()
> > > helpers are changed to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that
> > > they will accept pointer obtained from skb->sk.
> > >
> > > Instead of specifying both arg_type and arg_btf_id in the same func_proto
> > > which is how the current ARG_PTR_TO_BTF_ID does, the arg_btf_id of
> > > the new ARG_PTR_TO_BTF_ID_SOCK_COMMON is specified in the
> > > compatible_reg_types[] in verifier.c.  The reason is the arg_btf_id is
> > > always the same.  Discussion in this thread:
> > > https://lore.kernel.org/bpf/20200922070422.1917351-1-kafai@fb.com/
> > >
> > > The ARG_PTR_TO_BTF_ID_ part gives a clear expectation that the helper is
> > > expecting a PTR_TO_BTF_ID which could be NULL.  This is the same
> > > behavior as the existing helper taking ARG_PTR_TO_BTF_ID.
> > >
> > > The _SOCK_COMMON part means the helper is also expecting the legacy
> > > SOCK_COMMON pointer.
> > >
> > > By excluding the _OR_NULL part, the bpf prog cannot call helper
> > > with a literal NULL which doesn't make sense in most cases.
> > > e.g. bpf_skc_to_tcp_sock(NULL) will be rejected.  All PTR_TO_*_OR_NULL
> > > reg has to do a NULL check first before passing into the helper or else
> > > the bpf prog will be rejected.  This behavior is nothing new and
> > > consistent with the current expectation during bpf-prog-load.
> > >
> > > [ ARG_PTR_TO_BTF_ID_SOCK_COMMON will be used to replace
> > >   ARG_PTR_TO_SOCK* of other existing helpers later such that
> > >   those existing helpers can take the PTR_TO_BTF_ID returned by
> > >   the bpf_skc_to_*() helpers.
> > >
> > >   The only special case is bpf_sk_lookup_assign() which can accept a
> > >   literal NULL ptr.  It has to be handled specially in another follow
> > >   up patch if there is a need (e.g. by renaming ARG_PTR_TO_SOCKET_OR_NULL
> > >   to ARG_PTR_TO_BTF_ID_SOCK_COMMON_OR_NULL). ]
> > >
> > > [ When converting the older helpers that take ARG_PTR_TO_SOCK* in
> > >   the later patch, if the kernel does not support BTF,
> > >   ARG_PTR_TO_BTF_ID_SOCK_COMMON will behave like ARG_PTR_TO_SOCK_COMMON
> > >   because no reg->type could have PTR_TO_BTF_ID in this case.
> > >
> > >   It is not a concern for the newer-btf-only helper like the bpf_skc_to_*()
> > >   here though because these helpers must require BTF vmlinux to begin
> > >   with. ]
> > >
> > > Signed-off-by: Martin KaFai Lau <kafai@fb.com>
> > > ---
> > >  include/linux/bpf.h   |  1 +
> > >  kernel/bpf/verifier.c | 34 +++++++++++++++++++--
> > >  net/core/filter.c     | 69 ++++++++++++++++++++++++++++++-------------
> > >  3 files changed, 82 insertions(+), 22 deletions(-)
> > >
> > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > index fc5c901c7542..d0937f1d2980 100644
> > > --- a/include/linux/bpf.h
> > > +++ b/include/linux/bpf.h
> > > @@ -292,6 +292,7 @@ enum bpf_arg_type {
> > >         ARG_PTR_TO_ALLOC_MEM,   /* pointer to dynamically allocated memory */
> > >         ARG_PTR_TO_ALLOC_MEM_OR_NULL,   /* pointer to dynamically allocated memory or NULL */
> > >         ARG_CONST_ALLOC_SIZE_OR_ZERO,   /* number of allocated bytes requested */
> > > +       ARG_PTR_TO_BTF_ID_SOCK_COMMON,  /* pointer to in-kernel sock_common or bpf-mirrored bpf_sock */
> > >         __BPF_ARG_TYPE_MAX,
> > >  };
> > >
> > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > index 945fa2b4d096..d4ba29fb17a6 100644
> > > --- a/kernel/bpf/verifier.c
> > > +++ b/kernel/bpf/verifier.c
> > > @@ -486,7 +486,12 @@ static bool is_acquire_function(enum bpf_func_id func_id,
> > >  static bool is_ptr_cast_function(enum bpf_func_id func_id)
> > >  {
> > >         return func_id == BPF_FUNC_tcp_sock ||
> > > -               func_id == BPF_FUNC_sk_fullsock;
> > > +               func_id == BPF_FUNC_sk_fullsock ||
> > > +               func_id == BPF_FUNC_skc_to_tcp_sock ||
> > > +               func_id == BPF_FUNC_skc_to_tcp6_sock ||
> > > +               func_id == BPF_FUNC_skc_to_udp6_sock ||
> > > +               func_id == BPF_FUNC_skc_to_tcp_timewait_sock ||
> > > +               func_id == BPF_FUNC_skc_to_tcp_request_sock;
> > >  }
> > >
> > >  /* string representation of 'enum bpf_reg_type' */
> > > @@ -3953,6 +3958,7 @@ static int resolve_map_arg_type(struct bpf_verifier_env *env,
> > >
> > >  struct bpf_reg_types {
> > >         const enum bpf_reg_type types[10];
> > > +       u32 *btf_id;
> > >  };
> > >
> > >  static const struct bpf_reg_types map_key_value_types = {
> > > @@ -3973,6 +3979,17 @@ static const struct bpf_reg_types sock_types = {
> > >         },
> > >  };
> > >
> > > +static const struct bpf_reg_types btf_id_sock_common_types = {
> > > +       .types = {
> > > +               PTR_TO_SOCK_COMMON,
> > > +               PTR_TO_SOCKET,
> > > +               PTR_TO_TCP_SOCK,
> > > +               PTR_TO_XDP_SOCK,
> > > +               PTR_TO_BTF_ID,
> > > +       },
> > > +       .btf_id = &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON],
> > > +};
> > > +
> > >  static const struct bpf_reg_types mem_types = {
> > >         .types = {
> > >                 PTR_TO_STACK,
> > > @@ -4014,6 +4031,7 @@ static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
> > >         [ARG_PTR_TO_CTX]                = &context_types,
> > >         [ARG_PTR_TO_CTX_OR_NULL]        = &context_types,
> > >         [ARG_PTR_TO_SOCK_COMMON]        = &sock_types,
> > > +       [ARG_PTR_TO_BTF_ID_SOCK_COMMON] = &btf_id_sock_common_types,
> > >         [ARG_PTR_TO_SOCKET]             = &fullsock_types,
> > >         [ARG_PTR_TO_SOCKET_OR_NULL]     = &fullsock_types,
> > >         [ARG_PTR_TO_BTF_ID]             = &btf_ptr_types,
> > > @@ -4059,6 +4077,14 @@ static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
> > >
> > >  found:
> > >         if (type == PTR_TO_BTF_ID) {
> > > +               if (!arg_btf_id) {
> > > +                       if (!compatible->btf_id) {
> > > +                               verbose(env, "verifier internal error: missing arg compatible BTF ID\n");
> > > +                               return -EFAULT;
> > > +                       }
> > > +                       arg_btf_id = compatible->btf_id;
> > > +               }
> > > +
> > >                 if (!btf_struct_ids_match(&env->log, reg->off, reg->btf_id,
> > >                                           *arg_btf_id)) {
> > >                         verbose(env, "R%d is of type %s but %s is expected\n",
> > > @@ -4575,10 +4601,14 @@ static bool check_btf_id_ok(const struct bpf_func_proto *fn)
> > >  {
> > >         int i;
> > >
> > > -       for (i = 0; i < ARRAY_SIZE(fn->arg_type); i++)
> > > +       for (i = 0; i < ARRAY_SIZE(fn->arg_type); i++) {
> > >                 if (fn->arg_type[i] == ARG_PTR_TO_BTF_ID && !fn->arg_btf_id[i])
> > >                         return false;
> > >
> > > +               if (fn->arg_type[i] != ARG_PTR_TO_BTF_ID && fn->arg_btf_id[i])
> > > +                       return false;
> > > +       }
> > > +
> >
> > This is a hold over from the previous patchset?
> hmm... what do you mean?

Sorry, I misread the patch!

-- 
Lorenz Bauer  |  Systems Engineer
6th Floor, County Hall/The Riverside Building, SE1 7PB, UK

www.cloudflare.com

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH v4 bpf-next 02/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type
  2020-09-25  0:03 ` [PATCH v4 bpf-next 02/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type Martin KaFai Lau
  2020-09-25  8:26   ` Lorenz Bauer
@ 2020-09-25 14:21   ` John Fastabend
  1 sibling, 0 replies; 26+ messages in thread
From: John Fastabend @ 2020-09-25 14:21 UTC (permalink / raw)
  To: Martin KaFai Lau, bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team, Lorenz Bauer, netdev

Martin KaFai Lau wrote:
> There is a constant need to add more fields into the bpf_tcp_sock
> for the bpf programs running at tc, sock_ops...etc.
> 
> A current workaround could be to use bpf_probe_read_kernel().  However,
> other than making another helper call for reading each field and missing
> CO-RE, it is also not as intuitive to use as directly reading
> "tp->lsndtime" for example.  While already having perfmon cap to do
> bpf_probe_read_kernel(), it will be much easier if the bpf prog can
> directly read from the tcp_sock.
> 
> This patch tries to do that by using the existing casting-helpers
> bpf_skc_to_*() whose func_proto returns a btf_id.  For example, the
> func_proto of bpf_skc_to_tcp_sock returns the btf_id of the
> kernel "struct tcp_sock".
> 
> These helpers are also added to is_ptr_cast_function().
> It ensures the returning reg (BPF_REF_0) will also carries the ref_obj_id.
> That will keep the ref-tracking works properly.
> 
> The bpf_skc_to_* helpers are made available to most of the bpf prog
> types in filter.c. The bpf_skc_to_* helpers will be limited by
> perfmon cap.
> 
> This patch adds a ARG_PTR_TO_BTF_ID_SOCK_COMMON.  The helper accepting
> this arg can accept a btf-id-ptr (PTR_TO_BTF_ID + &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON])
> or a legacy-ctx-convert-skc-ptr (PTR_TO_SOCK_COMMON).  The bpf_skc_to_*()
> helpers are changed to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that
> they will accept pointer obtained from skb->sk.
> 
> Instead of specifying both arg_type and arg_btf_id in the same func_proto
> which is how the current ARG_PTR_TO_BTF_ID does, the arg_btf_id of
> the new ARG_PTR_TO_BTF_ID_SOCK_COMMON is specified in the
> compatible_reg_types[] in verifier.c.  The reason is the arg_btf_id is
> always the same.  Discussion in this thread:
> https://lore.kernel.org/bpf/20200922070422.1917351-1-kafai@fb.com/
> 
> The ARG_PTR_TO_BTF_ID_ part gives a clear expectation that the helper is
> expecting a PTR_TO_BTF_ID which could be NULL.  This is the same
> behavior as the existing helper taking ARG_PTR_TO_BTF_ID.
> 
> The _SOCK_COMMON part means the helper is also expecting the legacy
> SOCK_COMMON pointer.
> 
> By excluding the _OR_NULL part, the bpf prog cannot call helper
> with a literal NULL which doesn't make sense in most cases.
> e.g. bpf_skc_to_tcp_sock(NULL) will be rejected.  All PTR_TO_*_OR_NULL
> reg has to do a NULL check first before passing into the helper or else
> the bpf prog will be rejected.  This behavior is nothing new and
> consistent with the current expectation during bpf-prog-load.
> 
> [ ARG_PTR_TO_BTF_ID_SOCK_COMMON will be used to replace
>   ARG_PTR_TO_SOCK* of other existing helpers later such that
>   those existing helpers can take the PTR_TO_BTF_ID returned by
>   the bpf_skc_to_*() helpers.
> 
>   The only special case is bpf_sk_lookup_assign() which can accept a
>   literal NULL ptr.  It has to be handled specially in another follow
>   up patch if there is a need (e.g. by renaming ARG_PTR_TO_SOCKET_OR_NULL
>   to ARG_PTR_TO_BTF_ID_SOCK_COMMON_OR_NULL). ]
> 
> [ When converting the older helpers that take ARG_PTR_TO_SOCK* in
>   the later patch, if the kernel does not support BTF,
>   ARG_PTR_TO_BTF_ID_SOCK_COMMON will behave like ARG_PTR_TO_SOCK_COMMON
>   because no reg->type could have PTR_TO_BTF_ID in this case.
> 
>   It is not a concern for the newer-btf-only helper like the bpf_skc_to_*()
>   here though because these helpers must require BTF vmlinux to begin
>   with. ]
> 
> Signed-off-by: Martin KaFai Lau <kafai@fb.com>
> ---

LGTM it took a bit of looking around to convince myself that
we have ret_type set to PTR_TO_SOCK_*_OR_NULL types in the sk lookup
helpers so that we force a null check before passing these into the
skc_to_* helpers here, but I didn't see any holes. Thanks.

Acked-by: John Fastabend <john.fastabend@gmail.com>

> @@ -4575,10 +4601,14 @@ static bool check_btf_id_ok(const struct bpf_func_proto *fn)
>  {
>  	int i;
>  
> -	for (i = 0; i < ARRAY_SIZE(fn->arg_type); i++)
> +	for (i = 0; i < ARRAY_SIZE(fn->arg_type); i++) {
>  		if (fn->arg_type[i] == ARG_PTR_TO_BTF_ID && !fn->arg_btf_id[i])
>  			return false;
>  
> +		if (fn->arg_type[i] != ARG_PTR_TO_BTF_ID && fn->arg_btf_id[i])
> +			return false;
> +	}
> +

I guess this case is harmless? If some other arg has a btf_id its setup
wrong, so nice to fail here I think.

>  	return true;
>  }
>  

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 bpf-next 02/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type
  2020-09-25 13:50       ` Lorenz Bauer
@ 2020-09-25 15:47         ` Alexei Starovoitov
  0 siblings, 0 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2020-09-25 15:47 UTC (permalink / raw)
  To: Lorenz Bauer, Martin KaFai Lau
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Kernel Team, Networking

On 9/25/20 6:50 AM, Lorenz Bauer wrote:
>>>> -       for (i = 0; i < ARRAY_SIZE(fn->arg_type); i++)
>>>> +       for (i = 0; i < ARRAY_SIZE(fn->arg_type); i++) {
>>>>                  if (fn->arg_type[i] == ARG_PTR_TO_BTF_ID && !fn->arg_btf_id[i])
>>>>                          return false;
>>>>
>>>> +               if (fn->arg_type[i] != ARG_PTR_TO_BTF_ID && fn->arg_btf_id[i])
>>>> +                       return false;
>>>> +       }
>>>> +
>>> This is a hold over from the previous patchset?
>> hmm... what do you mean?
> Sorry, I misread the patch!

Folks, please trim your replies.
You could have quoted just few lines above instead of most of the patch.
Scrolling takes time for those of us who used tbird, mutt, etc.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH v4 bpf-next 13/13] bpf: selftest: Add test_btf_skc_cls_ingress
  2020-09-25  0:04 ` [PATCH v4 bpf-next 13/13] bpf: selftest: Add test_btf_skc_cls_ingress Martin KaFai Lau
@ 2020-09-25 16:24   ` John Fastabend
  2020-09-25 17:58     ` Martin KaFai Lau
  0 siblings, 1 reply; 26+ messages in thread
From: John Fastabend @ 2020-09-25 16:24 UTC (permalink / raw)
  To: Martin KaFai Lau, bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team, Lorenz Bauer, netdev

Martin KaFai Lau wrote:
> This patch attaches a classifier prog to the ingress filter.
> It exercises the following helpers with different socket pointer
> types in different logical branches:
> 1. bpf_sk_release()
> 2. bpf_sk_assign()
> 3. bpf_skc_to_tcp_request_sock(), bpf_skc_to_tcp_sock()
> 4. bpf_tcp_gen_syncookie, bpf_tcp_check_syncookie
> 
> Signed-off-by: Martin KaFai Lau <kafai@fb.com>
> ---
>  tools/testing/selftests/bpf/bpf_tcp_helpers.h |   5 +
>  .../bpf/prog_tests/btf_skc_cls_ingress.c      | 234 ++++++++++++++++++
>  .../bpf/progs/test_btf_skc_cls_ingress.c      | 174 +++++++++++++
>  3 files changed, 413 insertions(+)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_skc_cls_ingress.c
>  create mode 100644 tools/testing/selftests/bpf/progs/test_btf_skc_cls_ingress.c
> 


Hi Martin,

One piece I'm missing is how does this handle null pointer dereferences
from network side when reading BTF objects? I've not got through all the
code yet so maybe I'm just not there yet.

For example,

> diff --git a/tools/testing/selftests/bpf/bpf_tcp_helpers.h b/tools/testing/selftests/bpf/bpf_tcp_helpers.h
> index a0e8b3758bd7..2915664c335d 100644
> --- a/tools/testing/selftests/bpf/bpf_tcp_helpers.h
> +++ b/tools/testing/selftests/bpf/bpf_tcp_helpers.h
> @@ -16,6 +16,7 @@ BPF_PROG(name, args)
>  
>  struct sock_common {
>  	unsigned char	skc_state;
> +	__u16		skc_num;
>  } __attribute__((preserve_access_index));
>  
>  enum sk_pacing {
> @@ -45,6 +46,10 @@ struct inet_connection_sock {
>  	__u64			  icsk_ca_priv[104 / sizeof(__u64)];
>  } __attribute__((preserve_access_index));
>  
> +struct request_sock {
> +	struct sock_common		__req_common;
> +} __attribute__((preserve_access_index));
> +
>  struct tcp_sock {
>  	struct inet_connection_sock	inet_conn;

add some pointer from tcp_sock which is likely not set so should be NULL,

        struct tcp_fastopen_request *fastopen_req;

[...]

> +	if (bpf_skc->state == BPF_TCP_NEW_SYN_RECV) {
> +		struct request_sock *req_sk;
> +
> +		req_sk = (struct request_sock *)bpf_skc_to_tcp_request_sock(bpf_skc);
> +		if (!req_sk) {
> +			LOG();
> +			goto release;
> +		}
> +
> +		if (bpf_sk_assign(skb, req_sk, 0)) {
> +			LOG();
> +			goto release;
> +		}
> +
> +		req_sk_sport = req_sk->__req_common.skc_num;
> +
> +		bpf_sk_release(req_sk);
> +		return TC_ACT_OK;
> +	} else if (bpf_skc->state == BPF_TCP_LISTEN) {
> +		struct tcp_sock *tp;
> +
> +		tp = bpf_skc_to_tcp_sock(bpf_skc);
> +		if (!tp) {
> +			LOG();
> +			goto release;
> +		}
> +
> +		if (bpf_sk_assign(skb, tp, 0)) {
> +			LOG();
> +			goto release;
> +		}
> +


Then use it here without a null check in the BPF program,

                fastopen = tp->fastopen_req;
		if (fastopen->size > 0x1234)
                      (do something)

> +		listen_tp_sport = tp->inet_conn.icsk_inet.sk.__sk_common.skc_num;
> +
> +		test_syncookie_helper(ip6h, th, tp, skb);
> +		bpf_sk_release(tp);
> +		return TC_ACT_OK;
> +	}

My quick check shows this didn't actually fault and the xlated
looks like it did the read and dereference. Must be missing
something? We shouldn't have fault_handler set for cls_ingress.

Perhaps a comment in the cover letter about this would be
helpful? Or if I'm just being dense this morning let me know
as well. ;)

> +
> +	if (bpf_sk_assign(skb, bpf_skc, 0))
> +		LOG();
> +
> +release:
> +	bpf_sk_release(bpf_skc);
> +	return TC_ACT_OK;
> +}

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 bpf-next 13/13] bpf: selftest: Add test_btf_skc_cls_ingress
  2020-09-25 16:24   ` John Fastabend
@ 2020-09-25 17:58     ` Martin KaFai Lau
  0 siblings, 0 replies; 26+ messages in thread
From: Martin KaFai Lau @ 2020-09-25 17:58 UTC (permalink / raw)
  To: John Fastabend
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	Lorenz Bauer, netdev

On Fri, Sep 25, 2020 at 09:24:02AM -0700, John Fastabend wrote:
[ ... ]

> Hi Martin,
> 
> One piece I'm missing is how does this handle null pointer dereferences
> from network side when reading BTF objects? I've not got through all the
> code yet so maybe I'm just not there yet.
> 
> For example,
> 
> > diff --git a/tools/testing/selftests/bpf/bpf_tcp_helpers.h b/tools/testing/selftests/bpf/bpf_tcp_helpers.h
> > index a0e8b3758bd7..2915664c335d 100644
> > --- a/tools/testing/selftests/bpf/bpf_tcp_helpers.h
> > +++ b/tools/testing/selftests/bpf/bpf_tcp_helpers.h
> > @@ -16,6 +16,7 @@ BPF_PROG(name, args)
> >  
> >  struct sock_common {
> >  	unsigned char	skc_state;
> > +	__u16		skc_num;
> >  } __attribute__((preserve_access_index));
> >  
> >  enum sk_pacing {
> > @@ -45,6 +46,10 @@ struct inet_connection_sock {
> >  	__u64			  icsk_ca_priv[104 / sizeof(__u64)];
> >  } __attribute__((preserve_access_index));
> >  
> > +struct request_sock {
> > +	struct sock_common		__req_common;
> > +} __attribute__((preserve_access_index));
> > +
> >  struct tcp_sock {
> >  	struct inet_connection_sock	inet_conn;
> 
> add some pointer from tcp_sock which is likely not set so should be NULL,
> 
>         struct tcp_fastopen_request *fastopen_req;
> 
> [...]
> 
> > +	if (bpf_skc->state == BPF_TCP_NEW_SYN_RECV) {
> > +		struct request_sock *req_sk;
> > +
> > +		req_sk = (struct request_sock *)bpf_skc_to_tcp_request_sock(bpf_skc);
> > +		if (!req_sk) {
> > +			LOG();
> > +			goto release;
> > +		}
> > +
> > +		if (bpf_sk_assign(skb, req_sk, 0)) {
> > +			LOG();
> > +			goto release;
> > +		}
> > +
> > +		req_sk_sport = req_sk->__req_common.skc_num;
> > +
> > +		bpf_sk_release(req_sk);
> > +		return TC_ACT_OK;
> > +	} else if (bpf_skc->state == BPF_TCP_LISTEN) {
> > +		struct tcp_sock *tp;
> > +
> > +		tp = bpf_skc_to_tcp_sock(bpf_skc);
> > +		if (!tp) {
> > +			LOG();
> > +			goto release;
> > +		}
> > +
> > +		if (bpf_sk_assign(skb, tp, 0)) {
> > +			LOG();
> > +			goto release;
> > +		}
> > +
> 
> 
> Then use it here without a null check in the BPF program,
> 
>                 fastopen = tp->fastopen_req;
fastopen is in PTR_TO_BTF_ID here.

> 		if (fastopen->size > 0x1234)
This load will be marked with BPF_PROBE_MEM.

>                       (do something)
> 
> > +		listen_tp_sport = tp->inet_conn.icsk_inet.sk.__sk_common.skc_num;
> > +
> > +		test_syncookie_helper(ip6h, th, tp, skb);
> > +		bpf_sk_release(tp);
> > +		return TC_ACT_OK;
> > +	}
> 
> My quick check shows this didn't actually fault and the xlated
> looks like it did the read and dereference. Must be missing
> something? We shouldn't have fault_handler set for cls_ingress.
By xlated, do you mean the interpreter mode?  The LDX_PROBE_MEM
is done by bpf_probe_read_kernel() in bpf/core.c.

I don't think the handling of PTR_TO_BTF_ID is depending on
prog->type.

> 
> Perhaps a comment in the cover letter about this would be
> helpful? Or if I'm just being dense this morning let me know
> as well. ;)
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 bpf-next 00/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type
  2020-09-25  0:03 [PATCH v4 bpf-next 00/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type Martin KaFai Lau
                   ` (13 preceding siblings ...)
  2020-09-25  9:40 ` [PATCH v4 bpf-next 00/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type Lorenz Bauer
@ 2020-09-25 23:22 ` Alexei Starovoitov
  14 siblings, 0 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2020-09-25 23:22 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Kernel Team,
	Lorenz Bauer, Network Development

On Thu, Sep 24, 2020 at 5:03 PM Martin KaFai Lau <kafai@fb.com> wrote:
>
> This set allows networking prog type to directly read fields from
> the in-kernel socket type, e.g. "struct tcp_sock".

Applied. Thanks

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2020-09-25 23:22 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-25  0:03 [PATCH v4 bpf-next 00/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type Martin KaFai Lau
2020-09-25  0:03 ` [PATCH v4 bpf-next 01/13] bpf: Move the PTR_TO_BTF_ID check to check_reg_type() Martin KaFai Lau
2020-09-25  8:22   ` Lorenz Bauer
2020-09-25 13:36   ` John Fastabend
2020-09-25  0:03 ` [PATCH v4 bpf-next 02/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type Martin KaFai Lau
2020-09-25  8:26   ` Lorenz Bauer
2020-09-25 13:18     ` Martin KaFai Lau
2020-09-25 13:50       ` Lorenz Bauer
2020-09-25 15:47         ` Alexei Starovoitov
2020-09-25 14:21   ` John Fastabend
2020-09-25  0:03 ` [PATCH v4 bpf-next 03/13] bpf: Change bpf_sk_release and bpf_sk_*cgroup_id to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON Martin KaFai Lau
2020-09-25  0:04 ` [PATCH v4 bpf-next 04/13] bpf: Change bpf_sk_storage_*() " Martin KaFai Lau
2020-09-25  0:04 ` [PATCH v4 bpf-next 05/13] bpf: Change bpf_tcp_*_syncookie " Martin KaFai Lau
2020-09-25  0:04 ` [PATCH v4 bpf-next 06/13] bpf: Change bpf_sk_assign " Martin KaFai Lau
2020-09-25  0:04 ` [PATCH v4 bpf-next 07/13] bpf: selftest: Add ref_tracking verifier test for bpf_skc casting Martin KaFai Lau
2020-09-25  8:30   ` Lorenz Bauer
2020-09-25  0:04 ` [PATCH v4 bpf-next 08/13] bpf: selftest: Move sock_fields test into test_progs Martin KaFai Lau
2020-09-25  0:04 ` [PATCH v4 bpf-next 09/13] bpf: selftest: Adapt sock_fields test to use skel and global variables Martin KaFai Lau
2020-09-25  0:04 ` [PATCH v4 bpf-next 10/13] bpf: selftest: Use network_helpers in the sock_fields test Martin KaFai Lau
2020-09-25  0:04 ` [PATCH v4 bpf-next 11/13] bpf: selftest: Use bpf_skc_to_tcp_sock() " Martin KaFai Lau
2020-09-25  0:04 ` [PATCH v4 bpf-next 12/13] bpf: selftest: Remove enum tcp_ca_state from bpf_tcp_helpers.h Martin KaFai Lau
2020-09-25  0:04 ` [PATCH v4 bpf-next 13/13] bpf: selftest: Add test_btf_skc_cls_ingress Martin KaFai Lau
2020-09-25 16:24   ` John Fastabend
2020-09-25 17:58     ` Martin KaFai Lau
2020-09-25  9:40 ` [PATCH v4 bpf-next 00/13] bpf: Enable bpf_skc_to_* sock casting helper to networking prog type Lorenz Bauer
2020-09-25 23:22 ` Alexei Starovoitov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).