* [PATCH bpf-next v7 00/11] bpf: cgroup_sock lsm flavor
@ 2022-05-18 22:55 Stanislav Fomichev
  2022-05-18 22:55 ` [PATCH bpf-next v7 01/11] bpf: add bpf_func_t and trampoline helpers Stanislav Fomichev
                   ` (11 more replies)
  0 siblings, 12 replies; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-18 22:55 UTC (permalink / raw)
  To: netdev, bpf
  Cc: ast, daniel, andrii, Stanislav Fomichev, kafai, kpsingh, jakub

This series implements a new LSM flavor for attaching per-cgroup
programs to existing LSM hooks. The cgroup is taken from 'current',
unless the first argument of the hook is 'struct socket', in which
case the cgroup association is taken from the socket. The attachment
looks like a regular per-cgroup attachment: we add a new BPF_LSM_CGROUP
attach type which, together with attach_btf_id, signals per-cgroup LSM.
Behind the scenes, we allocate a trampoline shim program and
attach it to the LSM hook. This shim looks up the cgroup from
current/socket and runs the cgroup's effective prog array. The rest of
the per-cgroup BPF machinery stays the same: hierarchy, local storage,
retval conventions (return 1 == success).
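
To make this concrete, a minimal sketch of what such a program could
look like (hypothetical example, not part of the series; the exact
SEC() name is an assumption, see the libbpf patch later in the series):

/* Hypothetical BPF_LSM_CGROUP program: reject bind() for sockets
 * belonging to the attached cgroup. Per the retval convention above,
 * returning 1 allows the operation and returning 0 rejects it.
 */
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

char _license[] SEC("license") = "GPL";

SEC("lsm_cgroup/socket_bind")
int BPF_PROG(deny_bind, struct socket *sock, struct sockaddr *address,
	     int addrlen)
{
	return 0;
}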

Current limitations:
* haven't considered sleepable bpf; can be extended later on
* not sure the verifier does the right thing with null checks;
  see latest selftest for details
* total of 10 (global) per-cgroup LSM attach points

Cc: ast@kernel.org
Cc: daniel@iogearbox.net
Cc: kafai@fb.com
Cc: kpsingh@kernel.org
Cc: jakub@cloudflare.com

v7:
- there were a lot of comments last time; hope I didn't forget anything.
  Some of the bigger ones:
  - Martin: use/extend BTF_SOCK_TYPE_SOCKET
  - Martin: expose bpf_set_retval
  - Martin: reject 'return 0' at the verifier for 'void' hooks
  - Martin: prog_query returns all BPF_LSM_CGROUP, prog_info
    returns attach_btf_func_id
  - Andrii: split libbpf changes
  - Andrii: add field access test to test_progs, not test_verifier (still
    using asm though)
- things that I haven't addressed, stated here explicitly; let
  me know if any of these are still problematic:
  1. Andrii: exposing only link-based API: the changes needed to
     support non-link-based attachment seem minimal, a couple of
     lines, so it seems worth keeping it?
  2. Alexei: applying cgroup_atype to all cgroup hooks, not only
     cgroup LSM: this looks a bit harder to apply everywhere than I
     originally thought; with LSM cgroup, we have a shim_prog pointer
     where we store cgroup_atype; for non-LSM programs, we don't have
     a trace program to store it in, so we still need some kind of
     global table to map from a "static" hook to a "dynamic" slot.
     So I'm dropping the "can be easily extended" clause from the
     description for now. I have converted this whole machinery
     to an RCU-managed list to remove synchronize_rcu().
- also note that I had to introduce a new bpf_shim_tramp_link and
  move the refcnt there; we need something to manage the new
  bpf_tramp_link

v6:
- remove active count & stats for shim program (Martin KaFai Lau)
- remove NULL/error check for btf_vmlinux (Martin)
- don't check cgroup_atype in bpf_cgroup_lsm_shim_release (Martin)
- use old_prog (instead of passed one) in __cgroup_bpf_detach (Martin)
- make sure attach_btf_id is the same in __cgroup_bpf_replace (Martin)
- enable cgroup local storage and test it (Martin)
- properly implement prog query and add bpftool & tests (Martin)
- prohibit non-shared cgroup storage mode for BPF_LSM_CGROUP (Martin)

v5:
- __cgroup_bpf_run_lsm_socket remove NULL sock/sk checks (Martin KaFai Lau)
- __cgroup_bpf_run_lsm_{socket,current} s/prog/shim_prog/ (Martin)
- make sure bpf_lsm_find_cgroup_shim works for hooks without args (Martin)
- __cgroup_bpf_attach make sure attach_btf_id is the same when replacing (Martin)
- call bpf_cgroup_lsm_shim_release only for LSM_CGROUP (Martin)
- drop BPF_LSM_CGROUP from bpf_attach_type_to_tramp (Martin)
- drop jited check from cgroup_shim_find (Martin)
- new patch to convert cgroup_bpf to hlist_node (Jakub Sitnicki)
- new shim flavor for 'struct sock' + list of exceptions (Martin)

v4:
- fix build when jit is on but syscall is off

v3:
- add BPF_LSM_CGROUP to bpftool
- use simple int instead of refcnt_t (to avoid use-after-free
  false positive)

v2:
- addressed build bot failures

Stanislav Fomichev (11):
  bpf: add bpf_func_t and trampoline helpers
  bpf: convert cgroup_bpf.progs to hlist
  bpf: per-cgroup lsm flavor
  bpf: minimize number of allocated lsm slots per program
  bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  bpf: allow writing to a subset of sock fields from lsm progtype
  libbpf: implement bpf_prog_query_opts
  libbpf: add lsm_cgroup_sock type
  bpftool: implement cgroup tree for BPF_LSM_CGROUP
  selftests/bpf: lsm_cgroup functional test
  selftests/bpf: verify lsm_cgroup struct sock access

 arch/x86/net/bpf_jit_comp.c                   |  24 +-
 include/linux/bpf-cgroup-defs.h               |  11 +-
 include/linux/bpf-cgroup.h                    |   9 +-
 include/linux/bpf.h                           |  36 +-
 include/linux/bpf_lsm.h                       |   8 +
 include/linux/btf_ids.h                       |   3 +-
 include/uapi/linux/bpf.h                      |   6 +
 kernel/bpf/bpf_lsm.c                          | 103 ++++
 kernel/bpf/btf.c                              |  11 +
 kernel/bpf/cgroup.c                           | 487 +++++++++++++++---
 kernel/bpf/core.c                             |   2 +
 kernel/bpf/syscall.c                          |  14 +-
 kernel/bpf/trampoline.c                       | 244 ++++++++-
 kernel/bpf/verifier.c                         |  31 +-
 tools/bpf/bpftool/cgroup.c                    |  77 ++-
 tools/bpf/bpftool/common.c                    |   1 +
 tools/include/linux/btf_ids.h                 |   4 +-
 tools/include/uapi/linux/bpf.h                |   6 +
 tools/lib/bpf/bpf.c                           |  42 +-
 tools/lib/bpf/bpf.h                           |  15 +
 tools/lib/bpf/libbpf.c                        |   2 +
 tools/lib/bpf/libbpf.map                      |   1 +
 .../selftests/bpf/prog_tests/lsm_cgroup.c     | 346 +++++++++++++
 .../testing/selftests/bpf/progs/lsm_cgroup.c  | 160 ++++++
 24 files changed, 1480 insertions(+), 163 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
 create mode 100644 tools/testing/selftests/bpf/progs/lsm_cgroup.c

-- 
2.36.1.124.g0e6072fb45-goog


^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH bpf-next v7 01/11] bpf: add bpf_func_t and trampoline helpers
  2022-05-18 22:55 [PATCH bpf-next v7 00/11] bpf: cgroup_sock lsm flavor Stanislav Fomichev
@ 2022-05-18 22:55 ` Stanislav Fomichev
  2022-05-20  0:45   ` Yonghong Song
  2022-05-18 22:55 ` [PATCH bpf-next v7 02/11] bpf: convert cgroup_bpf.progs to hlist Stanislav Fomichev
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-18 22:55 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

I'll be adding lsm cgroup specific helpers that grab the trampoline
mutex; split the existing link/unlink helpers into locking wrappers
around new unlocked __ variants in preparation.

No functional changes.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 include/linux/bpf.h     | 11 ++++----
 kernel/bpf/trampoline.c | 62 ++++++++++++++++++++++-------------------
 2 files changed, 38 insertions(+), 35 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index c107392b0ba7..ea3674a415f9 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -53,6 +53,8 @@ typedef u64 (*bpf_callback_t)(u64, u64, u64, u64, u64);
 typedef int (*bpf_iter_init_seq_priv_t)(void *private_data,
 					struct bpf_iter_aux_info *aux);
 typedef void (*bpf_iter_fini_seq_priv_t)(void *private_data);
+typedef unsigned int (*bpf_func_t)(const void *,
+				   const struct bpf_insn *);
 struct bpf_iter_seq_info {
 	const struct seq_operations *seq_ops;
 	bpf_iter_init_seq_priv_t init_seq_private;
@@ -853,8 +855,7 @@ struct bpf_dispatcher {
 static __always_inline __nocfi unsigned int bpf_dispatcher_nop_func(
 	const void *ctx,
 	const struct bpf_insn *insnsi,
-	unsigned int (*bpf_func)(const void *,
-				 const struct bpf_insn *))
+	bpf_func_t bpf_func)
 {
 	return bpf_func(ctx, insnsi);
 }
@@ -883,8 +884,7 @@ int arch_prepare_bpf_dispatcher(void *image, s64 *funcs, int num_funcs);
 	noinline __nocfi unsigned int bpf_dispatcher_##name##_func(	\
 		const void *ctx,					\
 		const struct bpf_insn *insnsi,				\
-		unsigned int (*bpf_func)(const void *,			\
-					 const struct bpf_insn *))	\
+		bpf_func_t bpf_func)					\
 	{								\
 		return bpf_func(ctx, insnsi);				\
 	}								\
@@ -895,8 +895,7 @@ int arch_prepare_bpf_dispatcher(void *image, s64 *funcs, int num_funcs);
 	unsigned int bpf_dispatcher_##name##_func(			\
 		const void *ctx,					\
 		const struct bpf_insn *insnsi,				\
-		unsigned int (*bpf_func)(const void *,			\
-					 const struct bpf_insn *));	\
+		bpf_func_t bpf_func);					\
 	extern struct bpf_dispatcher bpf_dispatcher_##name;
 #define BPF_DISPATCHER_FUNC(name) bpf_dispatcher_##name##_func
 #define BPF_DISPATCHER_PTR(name) (&bpf_dispatcher_##name)
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index 93c7675f0c9e..01ce78c1df80 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -410,7 +410,7 @@ static enum bpf_tramp_prog_type bpf_attach_type_to_tramp(struct bpf_prog *prog)
 	}
 }
 
-int bpf_trampoline_link_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr)
+static int __bpf_trampoline_link_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr)
 {
 	enum bpf_tramp_prog_type kind;
 	struct bpf_tramp_link *link_exiting;
@@ -418,44 +418,33 @@ int bpf_trampoline_link_prog(struct bpf_tramp_link *link, struct bpf_trampoline
 	int cnt = 0, i;
 
 	kind = bpf_attach_type_to_tramp(link->link.prog);
-	mutex_lock(&tr->mutex);
-	if (tr->extension_prog) {
+	if (tr->extension_prog)
 		/* cannot attach fentry/fexit if extension prog is attached.
 		 * cannot overwrite extension prog either.
 		 */
-		err = -EBUSY;
-		goto out;
-	}
+		return -EBUSY;
 
 	for (i = 0; i < BPF_TRAMP_MAX; i++)
 		cnt += tr->progs_cnt[i];
 
 	if (kind == BPF_TRAMP_REPLACE) {
 		/* Cannot attach extension if fentry/fexit are in use. */
-		if (cnt) {
-			err = -EBUSY;
-			goto out;
-		}
+		if (cnt)
+			return -EBUSY;
 		tr->extension_prog = link->link.prog;
-		err = bpf_arch_text_poke(tr->func.addr, BPF_MOD_JUMP, NULL,
-					 link->link.prog->bpf_func);
-		goto out;
-	}
-	if (cnt >= BPF_MAX_TRAMP_LINKS) {
-		err = -E2BIG;
-		goto out;
+		return bpf_arch_text_poke(tr->func.addr, BPF_MOD_JUMP, NULL,
+					  link->link.prog->bpf_func);
 	}
-	if (!hlist_unhashed(&link->tramp_hlist)) {
+	if (cnt >= BPF_MAX_TRAMP_LINKS)
+		return -E2BIG;
+	if (!hlist_unhashed(&link->tramp_hlist))
 		/* prog already linked */
-		err = -EBUSY;
-		goto out;
-	}
+		return -EBUSY;
 	hlist_for_each_entry(link_exiting, &tr->progs_hlist[kind], tramp_hlist) {
 		if (link_exiting->link.prog != link->link.prog)
 			continue;
 		/* prog already linked */
-		err = -EBUSY;
-		goto out;
+		return -EBUSY;
 	}
 
 	hlist_add_head(&link->tramp_hlist, &tr->progs_hlist[kind]);
@@ -465,30 +454,45 @@ int bpf_trampoline_link_prog(struct bpf_tramp_link *link, struct bpf_trampoline
 		hlist_del_init(&link->tramp_hlist);
 		tr->progs_cnt[kind]--;
 	}
-out:
+	return err;
+}
+
+int bpf_trampoline_link_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr)
+{
+	int err;
+
+	mutex_lock(&tr->mutex);
+	err = __bpf_trampoline_link_prog(link, tr);
 	mutex_unlock(&tr->mutex);
 	return err;
 }
 
 /* bpf_trampoline_unlink_prog() should never fail. */
-int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr)
+static int __bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr)
 {
 	enum bpf_tramp_prog_type kind;
 	int err;
 
 	kind = bpf_attach_type_to_tramp(link->link.prog);
-	mutex_lock(&tr->mutex);
 	if (kind == BPF_TRAMP_REPLACE) {
 		WARN_ON_ONCE(!tr->extension_prog);
 		err = bpf_arch_text_poke(tr->func.addr, BPF_MOD_JUMP,
 					 tr->extension_prog->bpf_func, NULL);
 		tr->extension_prog = NULL;
-		goto out;
+		return err;
 	}
 	hlist_del_init(&link->tramp_hlist);
 	tr->progs_cnt[kind]--;
-	err = bpf_trampoline_update(tr);
-out:
+	return bpf_trampoline_update(tr);
+}
+
+/* bpf_trampoline_unlink_prog() should never fail. */
+int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr)
+{
+	int err;
+
+	mutex_lock(&tr->mutex);
+	err = __bpf_trampoline_unlink_prog(link, tr);
 	mutex_unlock(&tr->mutex);
 	return err;
 }
-- 
2.36.1.124.g0e6072fb45-goog


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH bpf-next v7 02/11] bpf: convert cgroup_bpf.progs to hlist
  2022-05-18 22:55 [PATCH bpf-next v7 00/11] bpf: cgroup_sock lsm flavor Stanislav Fomichev
  2022-05-18 22:55 ` [PATCH bpf-next v7 01/11] bpf: add bpf_func_t and trampoline helpers Stanislav Fomichev
@ 2022-05-18 22:55 ` Stanislav Fomichev
  2022-05-18 22:55 ` [PATCH bpf-next v7 03/11] bpf: per-cgroup lsm flavor Stanislav Fomichev
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-18 22:55 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev, Jakub Sitnicki

This lets us reclaim some space to be used by the new cgroup LSM
slots (see the note after the struct layouts below).

Before:
struct cgroup_bpf {
	struct bpf_prog_array *    effective[23];        /*     0   184 */
	/* --- cacheline 2 boundary (128 bytes) was 56 bytes ago --- */
	struct list_head           progs[23];            /*   184   368 */
	/* --- cacheline 8 boundary (512 bytes) was 40 bytes ago --- */
	u32                        flags[23];            /*   552    92 */

	/* XXX 4 bytes hole, try to pack */

	/* --- cacheline 10 boundary (640 bytes) was 8 bytes ago --- */
	struct list_head           storages;             /*   648    16 */
	struct bpf_prog_array *    inactive;             /*   664     8 */
	struct percpu_ref          refcnt;               /*   672    16 */
	struct work_struct         release_work;         /*   688    32 */

	/* size: 720, cachelines: 12, members: 7 */
	/* sum members: 716, holes: 1, sum holes: 4 */
	/* last cacheline: 16 bytes */
};

After:
struct cgroup_bpf {
	struct bpf_prog_array *    effective[23];        /*     0   184 */
	/* --- cacheline 2 boundary (128 bytes) was 56 bytes ago --- */
	struct hlist_head          progs[23];            /*   184   184 */
	/* --- cacheline 5 boundary (320 bytes) was 48 bytes ago --- */
	u8                         flags[23];            /*   368    23 */

	/* XXX 1 byte hole, try to pack */

	/* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */
	struct list_head           storages;             /*   392    16 */
	struct bpf_prog_array *    inactive;             /*   408     8 */
	struct percpu_ref          refcnt;               /*   416    16 */
	struct work_struct         release_work;         /*   432    72 */

	/* size: 504, cachelines: 8, members: 7 */
	/* sum members: 503, holes: 1, sum holes: 1 */
	/* last cacheline: 56 bytes */
};
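
For reference, most of the shrink in 'progs' above comes from
hlist_head being a single pointer while list_head carries two; the
'flags' change from u32 to u8 accounts for the rest. The relevant
kernel definitions, shown only to illustrate the per-entry size
difference (16 vs 8 bytes on 64-bit):

struct list_head {
	struct list_head *next, *prev;
};

struct hlist_head {
	struct hlist_node *first;
};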

Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 include/linux/bpf-cgroup-defs.h |  4 +-
 include/linux/bpf-cgroup.h      |  2 +-
 kernel/bpf/cgroup.c             | 72 +++++++++++++++++++--------------
 3 files changed, 45 insertions(+), 33 deletions(-)

diff --git a/include/linux/bpf-cgroup-defs.h b/include/linux/bpf-cgroup-defs.h
index 695d1224a71b..5d268e76d8e6 100644
--- a/include/linux/bpf-cgroup-defs.h
+++ b/include/linux/bpf-cgroup-defs.h
@@ -47,8 +47,8 @@ struct cgroup_bpf {
 	 * have either zero or one element
 	 * when BPF_F_ALLOW_MULTI the list can have up to BPF_CGROUP_MAX_PROGS
 	 */
-	struct list_head progs[MAX_CGROUP_BPF_ATTACH_TYPE];
-	u32 flags[MAX_CGROUP_BPF_ATTACH_TYPE];
+	struct hlist_head progs[MAX_CGROUP_BPF_ATTACH_TYPE];
+	u8 flags[MAX_CGROUP_BPF_ATTACH_TYPE];
 
 	/* list of cgroup shared storages */
 	struct list_head storages;
diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
index 669d96d074ad..6673acfbf2ef 100644
--- a/include/linux/bpf-cgroup.h
+++ b/include/linux/bpf-cgroup.h
@@ -95,7 +95,7 @@ struct bpf_cgroup_link {
 };
 
 struct bpf_prog_list {
-	struct list_head node;
+	struct hlist_node node;
 	struct bpf_prog *prog;
 	struct bpf_cgroup_link *link;
 	struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE];
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index afb414b26d01..134785ab487c 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -157,11 +157,12 @@ static void cgroup_bpf_release(struct work_struct *work)
 	mutex_lock(&cgroup_mutex);
 
 	for (atype = 0; atype < ARRAY_SIZE(cgrp->bpf.progs); atype++) {
-		struct list_head *progs = &cgrp->bpf.progs[atype];
-		struct bpf_prog_list *pl, *pltmp;
+		struct hlist_head *progs = &cgrp->bpf.progs[atype];
+		struct bpf_prog_list *pl;
+		struct hlist_node *pltmp;
 
-		list_for_each_entry_safe(pl, pltmp, progs, node) {
-			list_del(&pl->node);
+		hlist_for_each_entry_safe(pl, pltmp, progs, node) {
+			hlist_del(&pl->node);
 			if (pl->prog)
 				bpf_prog_put(pl->prog);
 			if (pl->link)
@@ -217,12 +218,12 @@ static struct bpf_prog *prog_list_prog(struct bpf_prog_list *pl)
 /* count number of elements in the list.
  * it's slow but the list cannot be long
  */
-static u32 prog_list_length(struct list_head *head)
+static u32 prog_list_length(struct hlist_head *head)
 {
 	struct bpf_prog_list *pl;
 	u32 cnt = 0;
 
-	list_for_each_entry(pl, head, node) {
+	hlist_for_each_entry(pl, head, node) {
 		if (!prog_list_prog(pl))
 			continue;
 		cnt++;
@@ -291,7 +292,7 @@ static int compute_effective_progs(struct cgroup *cgrp,
 		if (cnt > 0 && !(p->bpf.flags[atype] & BPF_F_ALLOW_MULTI))
 			continue;
 
-		list_for_each_entry(pl, &p->bpf.progs[atype], node) {
+		hlist_for_each_entry(pl, &p->bpf.progs[atype], node) {
 			if (!prog_list_prog(pl))
 				continue;
 
@@ -342,7 +343,7 @@ int cgroup_bpf_inherit(struct cgroup *cgrp)
 		cgroup_bpf_get(p);
 
 	for (i = 0; i < NR; i++)
-		INIT_LIST_HEAD(&cgrp->bpf.progs[i]);
+		INIT_HLIST_HEAD(&cgrp->bpf.progs[i]);
 
 	INIT_LIST_HEAD(&cgrp->bpf.storages);
 
@@ -418,7 +419,7 @@ static int update_effective_progs(struct cgroup *cgrp,
 
 #define BPF_CGROUP_MAX_PROGS 64
 
-static struct bpf_prog_list *find_attach_entry(struct list_head *progs,
+static struct bpf_prog_list *find_attach_entry(struct hlist_head *progs,
 					       struct bpf_prog *prog,
 					       struct bpf_cgroup_link *link,
 					       struct bpf_prog *replace_prog,
@@ -428,12 +429,12 @@ static struct bpf_prog_list *find_attach_entry(struct list_head *progs,
 
 	/* single-attach case */
 	if (!allow_multi) {
-		if (list_empty(progs))
+		if (hlist_empty(progs))
 			return NULL;
-		return list_first_entry(progs, typeof(*pl), node);
+		return hlist_entry(progs->first, typeof(*pl), node);
 	}
 
-	list_for_each_entry(pl, progs, node) {
+	hlist_for_each_entry(pl, progs, node) {
 		if (prog && pl->prog == prog && prog != replace_prog)
 			/* disallow attaching the same prog twice */
 			return ERR_PTR(-EINVAL);
@@ -444,7 +445,7 @@ static struct bpf_prog_list *find_attach_entry(struct list_head *progs,
 
 	/* direct prog multi-attach w/ replacement case */
 	if (replace_prog) {
-		list_for_each_entry(pl, progs, node) {
+		hlist_for_each_entry(pl, progs, node) {
 			if (pl->prog == replace_prog)
 				/* a match found */
 				return pl;
@@ -480,7 +481,7 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 	struct bpf_cgroup_storage *new_storage[MAX_BPF_CGROUP_STORAGE_TYPE] = {};
 	enum cgroup_bpf_attach_type atype;
 	struct bpf_prog_list *pl;
-	struct list_head *progs;
+	struct hlist_head *progs;
 	int err;
 
 	if (((flags & BPF_F_ALLOW_OVERRIDE) && (flags & BPF_F_ALLOW_MULTI)) ||
@@ -503,7 +504,7 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 	if (!hierarchy_allows_attach(cgrp, atype))
 		return -EPERM;
 
-	if (!list_empty(progs) && cgrp->bpf.flags[atype] != saved_flags)
+	if (!hlist_empty(progs) && cgrp->bpf.flags[atype] != saved_flags)
 		/* Disallow attaching non-overridable on top
 		 * of existing overridable in this cgroup.
 		 * Disallow attaching multi-prog if overridable or none
@@ -525,12 +526,22 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 	if (pl) {
 		old_prog = pl->prog;
 	} else {
+		struct hlist_node *last = NULL;
+
 		pl = kmalloc(sizeof(*pl), GFP_KERNEL);
 		if (!pl) {
 			bpf_cgroup_storages_free(new_storage);
 			return -ENOMEM;
 		}
-		list_add_tail(&pl->node, progs);
+		if (hlist_empty(progs))
+			hlist_add_head(&pl->node, progs);
+		else
+			hlist_for_each(last, progs) {
+				if (last->next)
+					continue;
+				hlist_add_behind(&pl->node, last);
+				break;
+			}
 	}
 
 	pl->prog = prog;
@@ -556,7 +567,7 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 	}
 	bpf_cgroup_storages_free(new_storage);
 	if (!old_prog) {
-		list_del(&pl->node);
+		hlist_del(&pl->node);
 		kfree(pl);
 	}
 	return err;
@@ -587,7 +598,7 @@ static void replace_effective_prog(struct cgroup *cgrp,
 	struct cgroup_subsys_state *css;
 	struct bpf_prog_array *progs;
 	struct bpf_prog_list *pl;
-	struct list_head *head;
+	struct hlist_head *head;
 	struct cgroup *cg;
 	int pos;
 
@@ -603,7 +614,7 @@ static void replace_effective_prog(struct cgroup *cgrp,
 				continue;
 
 			head = &cg->bpf.progs[atype];
-			list_for_each_entry(pl, head, node) {
+			hlist_for_each_entry(pl, head, node) {
 				if (!prog_list_prog(pl))
 					continue;
 				if (pl->link == link)
@@ -637,7 +648,7 @@ static int __cgroup_bpf_replace(struct cgroup *cgrp,
 	enum cgroup_bpf_attach_type atype;
 	struct bpf_prog *old_prog;
 	struct bpf_prog_list *pl;
-	struct list_head *progs;
+	struct hlist_head *progs;
 	bool found = false;
 
 	atype = to_cgroup_bpf_attach_type(link->type);
@@ -649,7 +660,7 @@ static int __cgroup_bpf_replace(struct cgroup *cgrp,
 	if (link->link.prog->type != new_prog->type)
 		return -EINVAL;
 
-	list_for_each_entry(pl, progs, node) {
+	hlist_for_each_entry(pl, progs, node) {
 		if (pl->link == link) {
 			found = true;
 			break;
@@ -688,7 +699,7 @@ static int cgroup_bpf_replace(struct bpf_link *link, struct bpf_prog *new_prog,
 	return ret;
 }
 
-static struct bpf_prog_list *find_detach_entry(struct list_head *progs,
+static struct bpf_prog_list *find_detach_entry(struct hlist_head *progs,
 					       struct bpf_prog *prog,
 					       struct bpf_cgroup_link *link,
 					       bool allow_multi)
@@ -696,14 +707,14 @@ static struct bpf_prog_list *find_detach_entry(struct list_head *progs,
 	struct bpf_prog_list *pl;
 
 	if (!allow_multi) {
-		if (list_empty(progs))
+		if (hlist_empty(progs))
 			/* report error when trying to detach and nothing is attached */
 			return ERR_PTR(-ENOENT);
 
 		/* to maintain backward compatibility NONE and OVERRIDE cgroups
 		 * allow detaching with invalid FD (prog==NULL) in legacy mode
 		 */
-		return list_first_entry(progs, typeof(*pl), node);
+		return hlist_entry(progs->first, typeof(*pl), node);
 	}
 
 	if (!prog && !link)
@@ -713,7 +724,7 @@ static struct bpf_prog_list *find_detach_entry(struct list_head *progs,
 		return ERR_PTR(-EINVAL);
 
 	/* find the prog or link and detach it */
-	list_for_each_entry(pl, progs, node) {
+	hlist_for_each_entry(pl, progs, node) {
 		if (pl->prog == prog && pl->link == link)
 			return pl;
 	}
@@ -737,7 +748,7 @@ static int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
 	enum cgroup_bpf_attach_type atype;
 	struct bpf_prog *old_prog;
 	struct bpf_prog_list *pl;
-	struct list_head *progs;
+	struct hlist_head *progs;
 	u32 flags;
 	int err;
 
@@ -766,9 +777,10 @@ static int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
 		goto cleanup;
 
 	/* now can actually delete it from this cgroup list */
-	list_del(&pl->node);
+	hlist_del(&pl->node);
+
 	kfree(pl);
-	if (list_empty(progs))
+	if (hlist_empty(progs))
 		/* last program was detached, reset flags to zero */
 		cgrp->bpf.flags[atype] = 0;
 	if (old_prog)
@@ -802,7 +814,7 @@ static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
 	enum bpf_attach_type type = attr->query.attach_type;
 	enum cgroup_bpf_attach_type atype;
 	struct bpf_prog_array *effective;
-	struct list_head *progs;
+	struct hlist_head *progs;
 	struct bpf_prog *prog;
 	int cnt, ret = 0, i;
 	u32 flags;
@@ -841,7 +853,7 @@ static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
 		u32 id;
 
 		i = 0;
-		list_for_each_entry(pl, progs, node) {
+		hlist_for_each_entry(pl, progs, node) {
 			prog = prog_list_prog(pl);
 			id = prog->aux->id;
 			if (copy_to_user(prog_ids + i, &id, sizeof(id)))
-- 
2.36.1.124.g0e6072fb45-goog


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH bpf-next v7 03/11] bpf: per-cgroup lsm flavor
  2022-05-18 22:55 [PATCH bpf-next v7 00/11] bpf: cgroup_sock lsm flavor Stanislav Fomichev
  2022-05-18 22:55 ` [PATCH bpf-next v7 01/11] bpf: add bpf_func_t and trampoline helpers Stanislav Fomichev
  2022-05-18 22:55 ` [PATCH bpf-next v7 02/11] bpf: convert cgroup_bpf.progs to hlist Stanislav Fomichev
@ 2022-05-18 22:55 ` Stanislav Fomichev
  2022-05-20  1:00   ` Yonghong Song
  2022-05-21  0:53   ` Martin KaFai Lau
  2022-05-18 22:55 ` [PATCH bpf-next v7 04/11] bpf: minimize number of allocated lsm slots per program Stanislav Fomichev
                   ` (8 subsequent siblings)
  11 siblings, 2 replies; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-18 22:55 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

Allow attaching to lsm hooks in the cgroup context.

Attaching to per-cgroup LSM works exactly like attaching
to other per-cgroup hooks. A new BPF_LSM_CGROUP attach type is added
to trigger the new mode; the actual LSM hook we attach to is
signaled via the existing attach_btf_id.
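
As a rough illustration of the attach flow from userspace
(hypothetical snippet; the skeleton name, program name and error
handling are made up, and it assumes the generic libbpf cgroup attach
helper can be used for this program type):

/* Attach a BPF_LSM_CGROUP program from an assumed libbpf skeleton to
 * a cgroup. The LSM hook itself is already encoded in the program's
 * attach_btf_id at load time, so the only runtime input here is the
 * target cgroup fd.
 */
#include <fcntl.h>
#include <unistd.h>
#include <bpf/libbpf.h>
#include "lsm_cgroup_example.skel.h"	/* assumed skeleton */

int attach_to_cgroup(struct lsm_cgroup_example *skel, const char *cg_path)
{
	struct bpf_link *link;
	int cgroup_fd, err;

	cgroup_fd = open(cg_path, O_RDONLY | O_DIRECTORY);
	if (cgroup_fd < 0)
		return -1;

	link = bpf_program__attach_cgroup(skel->progs.deny_bind, cgroup_fd);
	err = libbpf_get_error(link);
	close(cgroup_fd);
	return err ? -1 : 0;
}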

For the hooks that have 'struct socket' or 'struct sock' as their first
argument, we use the cgroup associated with that socket. For the rest,
we use the 'current' task's cgroup (this is all on the default
hierarchy, i.e. cgroup v2, only). Note that for some hooks that work on
'struct sock' we still take the cgroup from 'current' because they
operate on a socket that hasn't been properly initialized yet.

Behind the scenes, we allocate a shim program that is attached
to the trampoline and runs the cgroup's effective BPF program array.
This shim has some rudimentary ref counting and can be shared
between several programs attaching to the same per-cgroup LSM hook.

Note that this patch bloats the cgroup size because we add 211
cgroup_bpf_attach_type(s) for simplicity's sake. This will be
addressed in the subsequent patch.

Also note that we only add the non-sleepable flavor for now. To enable
sleepable use-cases, bpf_prog_run_array_cg has to grab trace RCU,
shim programs have to be freed via trace RCU, cgroup_bpf.effective
should also be trace-RCU-managed, plus maybe some other changes that
I'm not aware of.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 arch/x86/net/bpf_jit_comp.c     |  24 +++--
 include/linux/bpf-cgroup-defs.h |   6 ++
 include/linux/bpf-cgroup.h      |   7 ++
 include/linux/bpf.h             |  25 +++++
 include/linux/bpf_lsm.h         |  14 +++
 include/linux/btf_ids.h         |   3 +-
 include/uapi/linux/bpf.h        |   1 +
 kernel/bpf/bpf_lsm.c            |  50 +++++++++
 kernel/bpf/btf.c                |  11 ++
 kernel/bpf/cgroup.c             | 181 ++++++++++++++++++++++++++++---
 kernel/bpf/core.c               |   2 +
 kernel/bpf/syscall.c            |  10 ++
 kernel/bpf/trampoline.c         | 184 ++++++++++++++++++++++++++++++++
 kernel/bpf/verifier.c           |  28 +++++
 tools/include/linux/btf_ids.h   |   4 +-
 tools/include/uapi/linux/bpf.h  |   1 +
 16 files changed, 527 insertions(+), 24 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index a2b6d197c226..5cdebf4312da 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1765,6 +1765,10 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
 			   struct bpf_tramp_link *l, int stack_size,
 			   int run_ctx_off, bool save_ret)
 {
+	void (*exit)(struct bpf_prog *prog, u64 start,
+		     struct bpf_tramp_run_ctx *run_ctx) = __bpf_prog_exit;
+	u64 (*enter)(struct bpf_prog *prog,
+		     struct bpf_tramp_run_ctx *run_ctx) = __bpf_prog_enter;
 	u8 *prog = *pprog;
 	u8 *jmp_insn;
 	int ctx_cookie_off = offsetof(struct bpf_tramp_run_ctx, bpf_cookie);
@@ -1783,15 +1787,21 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
 	 */
 	emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_1, -run_ctx_off + ctx_cookie_off);
 
+	if (p->aux->sleepable) {
+		enter = __bpf_prog_enter_sleepable;
+		exit = __bpf_prog_exit_sleepable;
+	} else if (p->expected_attach_type == BPF_LSM_CGROUP) {
+		enter = __bpf_prog_enter_lsm_cgroup;
+		exit = __bpf_prog_exit_lsm_cgroup;
+	}
+
 	/* arg1: mov rdi, progs[i] */
 	emit_mov_imm64(&prog, BPF_REG_1, (long) p >> 32, (u32) (long) p);
 	/* arg2: lea rsi, [rbp - ctx_cookie_off] */
 	EMIT4(0x48, 0x8D, 0x75, -run_ctx_off);
 
-	if (emit_call(&prog,
-		      p->aux->sleepable ? __bpf_prog_enter_sleepable :
-		      __bpf_prog_enter, prog))
-			return -EINVAL;
+	if (emit_call(&prog, enter, prog))
+		return -EINVAL;
 	/* remember prog start time returned by __bpf_prog_enter */
 	emit_mov_reg(&prog, true, BPF_REG_6, BPF_REG_0);
 
@@ -1835,10 +1845,8 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
 	emit_mov_reg(&prog, true, BPF_REG_2, BPF_REG_6);
 	/* arg3: lea rdx, [rbp - run_ctx_off] */
 	EMIT4(0x48, 0x8D, 0x55, -run_ctx_off);
-	if (emit_call(&prog,
-		      p->aux->sleepable ? __bpf_prog_exit_sleepable :
-		      __bpf_prog_exit, prog))
-			return -EINVAL;
+	if (emit_call(&prog, exit, prog))
+		return -EINVAL;
 
 	*pprog = prog;
 	return 0;
diff --git a/include/linux/bpf-cgroup-defs.h b/include/linux/bpf-cgroup-defs.h
index 5d268e76d8e6..d5a70a35dace 100644
--- a/include/linux/bpf-cgroup-defs.h
+++ b/include/linux/bpf-cgroup-defs.h
@@ -10,6 +10,8 @@
 
 struct bpf_prog_array;
 
+#define CGROUP_LSM_NUM 211 /* will be addressed in the next patch */
+
 enum cgroup_bpf_attach_type {
 	CGROUP_BPF_ATTACH_TYPE_INVALID = -1,
 	CGROUP_INET_INGRESS = 0,
@@ -35,6 +37,10 @@ enum cgroup_bpf_attach_type {
 	CGROUP_INET4_GETSOCKNAME,
 	CGROUP_INET6_GETSOCKNAME,
 	CGROUP_INET_SOCK_RELEASE,
+#ifdef CONFIG_BPF_LSM
+	CGROUP_LSM_START,
+	CGROUP_LSM_END = CGROUP_LSM_START + CGROUP_LSM_NUM - 1,
+#endif
 	MAX_CGROUP_BPF_ATTACH_TYPE
 };
 
diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
index 6673acfbf2ef..2bd1b5f8de9b 100644
--- a/include/linux/bpf-cgroup.h
+++ b/include/linux/bpf-cgroup.h
@@ -23,6 +23,13 @@ struct ctl_table;
 struct ctl_table_header;
 struct task_struct;
 
+unsigned int __cgroup_bpf_run_lsm_sock(const void *ctx,
+				       const struct bpf_insn *insn);
+unsigned int __cgroup_bpf_run_lsm_socket(const void *ctx,
+					 const struct bpf_insn *insn);
+unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
+					  const struct bpf_insn *insn);
+
 #ifdef CONFIG_CGROUP_BPF
 
 #define CGROUP_ATYPE(type) \
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index ea3674a415f9..70cf1dad91df 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -768,6 +768,10 @@ void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start, struct bpf_tramp_
 u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog, struct bpf_tramp_run_ctx *run_ctx);
 void notrace __bpf_prog_exit_sleepable(struct bpf_prog *prog, u64 start,
 				       struct bpf_tramp_run_ctx *run_ctx);
+u64 notrace __bpf_prog_enter_lsm_cgroup(struct bpf_prog *prog,
+					struct bpf_tramp_run_ctx *run_ctx);
+void notrace __bpf_prog_exit_lsm_cgroup(struct bpf_prog *prog, u64 start,
+					struct bpf_tramp_run_ctx *run_ctx);
 void notrace __bpf_tramp_enter(struct bpf_tramp_image *tr);
 void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr);
 
@@ -1035,6 +1039,7 @@ struct bpf_prog_aux {
 	u64 load_time; /* ns since boottime */
 	u32 verified_insns;
 	struct bpf_map *cgroup_storage[MAX_BPF_CGROUP_STORAGE_TYPE];
+	int cgroup_atype; /* enum cgroup_bpf_attach_type */
 	char name[BPF_OBJ_NAME_LEN];
 #ifdef CONFIG_SECURITY
 	void *security;
@@ -1107,6 +1112,12 @@ struct bpf_tramp_link {
 	u64 cookie;
 };
 
+struct bpf_shim_tramp_link {
+	struct bpf_tramp_link tramp_link;
+	struct bpf_trampoline *tr;
+	atomic64_t refcnt;
+};
+
 struct bpf_tracing_link {
 	struct bpf_tramp_link link;
 	enum bpf_attach_type attach_type;
@@ -1185,6 +1196,9 @@ struct bpf_dummy_ops {
 int bpf_struct_ops_test_run(struct bpf_prog *prog, const union bpf_attr *kattr,
 			    union bpf_attr __user *uattr);
 #endif
+int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
+				    struct bpf_attach_target_info *tgt_info);
+void bpf_trampoline_unlink_cgroup_shim(struct bpf_prog *prog);
 #else
 static inline const struct bpf_struct_ops *bpf_struct_ops_find(u32 type_id)
 {
@@ -1208,6 +1222,14 @@ static inline int bpf_struct_ops_map_sys_lookup_elem(struct bpf_map *map,
 {
 	return -EINVAL;
 }
+static inline int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
+						  struct bpf_attach_target_info *tgt_info)
+{
+	return -EOPNOTSUPP;
+}
+static inline void bpf_trampoline_unlink_cgroup_shim(struct bpf_prog *prog)
+{
+}
 #endif
 
 struct bpf_array {
@@ -2250,6 +2272,8 @@ extern const struct bpf_func_proto bpf_loop_proto;
 extern const struct bpf_func_proto bpf_strncmp_proto;
 extern const struct bpf_func_proto bpf_copy_from_user_task_proto;
 extern const struct bpf_func_proto bpf_kptr_xchg_proto;
+extern const struct bpf_func_proto bpf_set_retval_proto;
+extern const struct bpf_func_proto bpf_get_retval_proto;
 
 const struct bpf_func_proto *tracing_prog_func_proto(
   enum bpf_func_id func_id, const struct bpf_prog *prog);
@@ -2366,6 +2390,7 @@ void *bpf_arch_text_copy(void *dst, void *src, size_t len);
 
 struct btf_id_set;
 bool btf_id_set_contains(const struct btf_id_set *set, u32 id);
+int btf_id_set_index(const struct btf_id_set *set, u32 id);
 
 #define MAX_BPRINTF_VARARGS		12
 
diff --git a/include/linux/bpf_lsm.h b/include/linux/bpf_lsm.h
index 479c101546ad..7f0e59f5f9be 100644
--- a/include/linux/bpf_lsm.h
+++ b/include/linux/bpf_lsm.h
@@ -42,6 +42,9 @@ extern const struct bpf_func_proto bpf_inode_storage_get_proto;
 extern const struct bpf_func_proto bpf_inode_storage_delete_proto;
 void bpf_inode_storage_free(struct inode *inode);
 
+int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog, bpf_func_t *bpf_func);
+int bpf_lsm_hook_idx(u32 btf_id);
+
 #else /* !CONFIG_BPF_LSM */
 
 static inline bool bpf_lsm_is_sleepable_hook(u32 btf_id)
@@ -65,6 +68,17 @@ static inline void bpf_inode_storage_free(struct inode *inode)
 {
 }
 
+static inline int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
+					   bpf_func_t *bpf_func)
+{
+	return -ENOENT;
+}
+
+static inline int bpf_lsm_hook_idx(u32 btf_id)
+{
+	return -EINVAL;
+}
+
 #endif /* CONFIG_BPF_LSM */
 
 #endif /* _LINUX_BPF_LSM_H */
diff --git a/include/linux/btf_ids.h b/include/linux/btf_ids.h
index bc5d9cc34e4c..857cc37094da 100644
--- a/include/linux/btf_ids.h
+++ b/include/linux/btf_ids.h
@@ -178,7 +178,8 @@ extern struct btf_id_set name;
 	BTF_SOCK_TYPE(BTF_SOCK_TYPE_TCP6, tcp6_sock)			\
 	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UDP, udp_sock)			\
 	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UDP6, udp6_sock)			\
-	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UNIX, unix_sock)
+	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UNIX, unix_sock)			\
+	BTF_SOCK_TYPE(BTF_SOCK_TYPE_SOCKET, socket)
 
 enum {
 #define BTF_SOCK_TYPE(name, str) name,
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 0210f85131b3..b9d2d6de63a7 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -998,6 +998,7 @@ enum bpf_attach_type {
 	BPF_SK_REUSEPORT_SELECT_OR_MIGRATE,
 	BPF_PERF_EVENT,
 	BPF_TRACE_KPROBE_MULTI,
+	BPF_LSM_CGROUP,
 	__MAX_BPF_ATTACH_TYPE
 };
 
diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
index c1351df9f7ee..654c23577ad3 100644
--- a/kernel/bpf/bpf_lsm.c
+++ b/kernel/bpf/bpf_lsm.c
@@ -16,6 +16,7 @@
 #include <linux/bpf_local_storage.h>
 #include <linux/btf_ids.h>
 #include <linux/ima.h>
+#include <linux/bpf-cgroup.h>
 
 /* For every LSM hook that allows attachment of BPF programs, declare a nop
  * function where a BPF program can be attached.
@@ -35,6 +36,46 @@ BTF_SET_START(bpf_lsm_hooks)
 #undef LSM_HOOK
 BTF_SET_END(bpf_lsm_hooks)
 
+/* List of LSM hooks that should operate on 'current' cgroup regardless
+ * of function signature.
+ */
+BTF_SET_START(bpf_lsm_current_hooks)
+/* operate on freshly allocated sk without any cgroup association */
+BTF_ID(func, bpf_lsm_sk_alloc_security)
+BTF_ID(func, bpf_lsm_sk_free_security)
+BTF_SET_END(bpf_lsm_current_hooks)
+
+int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
+			     bpf_func_t *bpf_func)
+{
+	const struct btf_param *args;
+
+	if (btf_type_vlen(prog->aux->attach_func_proto) < 1 ||
+	    btf_id_set_contains(&bpf_lsm_current_hooks,
+				prog->aux->attach_btf_id)) {
+		*bpf_func = __cgroup_bpf_run_lsm_current;
+		return 0;
+	}
+
+	args = btf_params(prog->aux->attach_func_proto);
+
+#ifdef CONFIG_NET
+	if (args[0].type == btf_sock_ids[BTF_SOCK_TYPE_SOCKET])
+		*bpf_func = __cgroup_bpf_run_lsm_socket;
+	else if (args[0].type == btf_sock_ids[BTF_SOCK_TYPE_SOCK])
+		*bpf_func = __cgroup_bpf_run_lsm_sock;
+	else
+#endif
+		*bpf_func = __cgroup_bpf_run_lsm_current;
+
+	return 0;
+}
+
+int bpf_lsm_hook_idx(u32 btf_id)
+{
+	return btf_id_set_index(&bpf_lsm_hooks, btf_id);
+}
+
 int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
 			const struct bpf_prog *prog)
 {
@@ -158,6 +199,15 @@ bpf_lsm_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return prog->aux->sleepable ? &bpf_ima_file_hash_proto : NULL;
 	case BPF_FUNC_get_attach_cookie:
 		return bpf_prog_has_trampoline(prog) ? &bpf_get_attach_cookie_proto : NULL;
+	case BPF_FUNC_get_local_storage:
+		return prog->expected_attach_type == BPF_LSM_CGROUP ?
+			&bpf_get_local_storage_proto : NULL;
+	case BPF_FUNC_set_retval:
+		return prog->expected_attach_type == BPF_LSM_CGROUP ?
+			&bpf_set_retval_proto : NULL;
+	case BPF_FUNC_get_retval:
+		return prog->expected_attach_type == BPF_LSM_CGROUP ?
+			&bpf_get_retval_proto : NULL;
 	default:
 		return tracing_prog_func_proto(func_id, prog);
 	}
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 2f0b0440131c..a90f04a8a8ee 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -5248,6 +5248,7 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
 
 	if (arg == nr_args) {
 		switch (prog->expected_attach_type) {
+		case BPF_LSM_CGROUP:
 		case BPF_LSM_MAC:
 		case BPF_TRACE_FEXIT:
 			/* When LSM programs are attached to void LSM hooks
@@ -6726,6 +6727,16 @@ static int btf_id_cmp_func(const void *a, const void *b)
 	return *pa - *pb;
 }
 
+int btf_id_set_index(const struct btf_id_set *set, u32 id)
+{
+	const u32 *p;
+
+	p = bsearch(&id, set->ids, set->cnt, sizeof(u32), btf_id_cmp_func);
+	if (!p)
+		return -1;
+	return p - set->ids;
+}
+
 bool btf_id_set_contains(const struct btf_id_set *set, u32 id)
 {
 	return bsearch(&id, set->ids, set->cnt, sizeof(u32), btf_id_cmp_func) != NULL;
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 134785ab487c..2c356a38f4cf 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -14,6 +14,9 @@
 #include <linux/string.h>
 #include <linux/bpf.h>
 #include <linux/bpf-cgroup.h>
+#include <linux/btf_ids.h>
+#include <linux/bpf_lsm.h>
+#include <linux/bpf_verifier.h>
 #include <net/sock.h>
 #include <net/bpf_sk_storage.h>
 
@@ -61,6 +64,85 @@ bpf_prog_run_array_cg(const struct cgroup_bpf *cgrp,
 	return run_ctx.retval;
 }
 
+unsigned int __cgroup_bpf_run_lsm_sock(const void *ctx,
+				       const struct bpf_insn *insn)
+{
+	const struct bpf_prog *shim_prog;
+	struct sock *sk;
+	struct cgroup *cgrp;
+	int ret = 0;
+	u64 *regs;
+
+	regs = (u64 *)ctx;
+	sk = (void *)(unsigned long)regs[BPF_REG_0];
+	/*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
+	shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
+
+	cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
+	if (likely(cgrp))
+		ret = bpf_prog_run_array_cg(&cgrp->bpf,
+					    shim_prog->aux->cgroup_atype,
+					    ctx, bpf_prog_run, 0, NULL);
+	return ret;
+}
+
+unsigned int __cgroup_bpf_run_lsm_socket(const void *ctx,
+					 const struct bpf_insn *insn)
+{
+	const struct bpf_prog *shim_prog;
+	struct socket *sock;
+	struct cgroup *cgrp;
+	int ret = 0;
+	u64 *regs;
+
+	regs = (u64 *)ctx;
+	sock = (void *)(unsigned long)regs[BPF_REG_0];
+	/*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
+	shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
+
+	cgrp = sock_cgroup_ptr(&sock->sk->sk_cgrp_data);
+	if (likely(cgrp))
+		ret = bpf_prog_run_array_cg(&cgrp->bpf,
+					    shim_prog->aux->cgroup_atype,
+					    ctx, bpf_prog_run, 0, NULL);
+	return ret;
+}
+
+unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
+					  const struct bpf_insn *insn)
+{
+	const struct bpf_prog *shim_prog;
+	struct cgroup *cgrp;
+	int ret = 0;
+
+	if (unlikely(!current))
+		return 0;
+
+	/*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
+	shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
+
+	rcu_read_lock();
+	cgrp = task_dfl_cgroup(current);
+	if (likely(cgrp))
+		ret = bpf_prog_run_array_cg(&cgrp->bpf,
+					    shim_prog->aux->cgroup_atype,
+					    ctx, bpf_prog_run, 0, NULL);
+	rcu_read_unlock();
+	return ret;
+}
+
+#ifdef CONFIG_BPF_LSM
+static enum cgroup_bpf_attach_type bpf_lsm_attach_type_get(u32 attach_btf_id)
+{
+	return CGROUP_LSM_START + bpf_lsm_hook_idx(attach_btf_id);
+}
+#else
+static enum cgroup_bpf_attach_type bpf_lsm_attach_type_get(u32 attach_btf_id)
+{
+	return -EOPNOTSUPP;
+}
+#endif /* CONFIG_BPF_LSM */
+
 void cgroup_bpf_offline(struct cgroup *cgrp)
 {
 	cgroup_get(cgrp);
@@ -139,6 +221,11 @@ static void bpf_cgroup_link_auto_detach(struct bpf_cgroup_link *link)
 	link->cgroup = NULL;
 }
 
+static void bpf_cgroup_lsm_shim_release(struct bpf_prog *prog)
+{
+	bpf_trampoline_unlink_cgroup_shim(prog);
+}
+
 /**
  * cgroup_bpf_release() - put references of all bpf programs and
  *                        release all cgroup bpf data
@@ -163,10 +250,16 @@ static void cgroup_bpf_release(struct work_struct *work)
 
 		hlist_for_each_entry_safe(pl, pltmp, progs, node) {
 			hlist_del(&pl->node);
-			if (pl->prog)
+			if (pl->prog) {
+				if (atype == BPF_LSM_CGROUP)
+					bpf_cgroup_lsm_shim_release(pl->prog);
 				bpf_prog_put(pl->prog);
-			if (pl->link)
+			}
+			if (pl->link) {
+				if (atype == BPF_LSM_CGROUP)
+					bpf_cgroup_lsm_shim_release(pl->link->link.prog);
 				bpf_cgroup_link_auto_detach(pl->link);
+			}
 			kfree(pl);
 			static_branch_dec(&cgroup_bpf_enabled_key[atype]);
 		}
@@ -479,6 +572,8 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 	struct bpf_prog *old_prog = NULL;
 	struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE] = {};
 	struct bpf_cgroup_storage *new_storage[MAX_BPF_CGROUP_STORAGE_TYPE] = {};
+	struct bpf_prog *new_prog = prog ? : link->link.prog;
+	struct bpf_attach_target_info tgt_info = {};
 	enum cgroup_bpf_attach_type atype;
 	struct bpf_prog_list *pl;
 	struct hlist_head *progs;
@@ -495,9 +590,32 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 		/* replace_prog implies BPF_F_REPLACE, and vice versa */
 		return -EINVAL;
 
-	atype = to_cgroup_bpf_attach_type(type);
-	if (atype < 0)
-		return -EINVAL;
+	if (type == BPF_LSM_CGROUP) {
+		if (replace_prog) {
+			/* Reusing shim from the original program. */
+			if (replace_prog->aux->attach_btf_id !=
+			    new_prog->aux->attach_btf_id)
+				return -EINVAL;
+
+			atype = replace_prog->aux->cgroup_atype;
+		} else {
+			err = bpf_check_attach_target(NULL, new_prog, NULL,
+						      new_prog->aux->attach_btf_id,
+						      &tgt_info);
+			if (err)
+				return -EINVAL;
+
+			atype = bpf_lsm_attach_type_get(new_prog->aux->attach_btf_id);
+			if (atype < 0)
+				return atype;
+		}
+
+		new_prog->aux->cgroup_atype = atype;
+	} else {
+		atype = to_cgroup_bpf_attach_type(type);
+		if (atype < 0)
+			return -EINVAL;
+	}
 
 	progs = &cgrp->bpf.progs[atype];
 
@@ -549,9 +667,15 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 	bpf_cgroup_storages_assign(pl->storage, storage);
 	cgrp->bpf.flags[atype] = saved_flags;
 
+	if (type == BPF_LSM_CGROUP && !old_prog) {
+		err = bpf_trampoline_link_cgroup_shim(new_prog, &tgt_info);
+		if (err)
+			goto cleanup;
+	}
+
 	err = update_effective_progs(cgrp, atype);
 	if (err)
-		goto cleanup;
+		goto cleanup_trampoline;
 
 	if (old_prog)
 		bpf_prog_put(old_prog);
@@ -560,6 +684,10 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 	bpf_cgroup_storages_link(new_storage, cgrp, type);
 	return 0;
 
+cleanup_trampoline:
+	if (type == BPF_LSM_CGROUP && !old_prog)
+		bpf_trampoline_unlink_cgroup_shim(new_prog);
+
 cleanup:
 	if (old_prog) {
 		pl->prog = old_prog;
@@ -651,9 +779,18 @@ static int __cgroup_bpf_replace(struct cgroup *cgrp,
 	struct hlist_head *progs;
 	bool found = false;
 
-	atype = to_cgroup_bpf_attach_type(link->type);
-	if (atype < 0)
-		return -EINVAL;
+	if (link->type == BPF_LSM_CGROUP) {
+		atype = link->link.prog->aux->cgroup_atype;
+
+		/* Reusing shim from the original program. */
+		if (new_prog->aux->attach_btf_id !=
+		    link->link.prog->aux->attach_btf_id)
+			return -EINVAL;
+	} else {
+		atype = to_cgroup_bpf_attach_type(link->type);
+		if (atype < 0)
+			return -EINVAL;
+	}
 
 	progs = &cgrp->bpf.progs[atype];
 
@@ -669,6 +806,9 @@ static int __cgroup_bpf_replace(struct cgroup *cgrp,
 	if (!found)
 		return -ENOENT;
 
+	if (link->type == BPF_LSM_CGROUP)
+		new_prog->aux->cgroup_atype = atype;
+
 	old_prog = xchg(&link->link.prog, new_prog);
 	replace_effective_prog(cgrp, atype, link);
 	bpf_prog_put(old_prog);
@@ -752,9 +892,15 @@ static int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
 	u32 flags;
 	int err;
 
-	atype = to_cgroup_bpf_attach_type(type);
-	if (atype < 0)
-		return -EINVAL;
+	if (type == BPF_LSM_CGROUP) {
+		struct bpf_prog *p = prog ? : link->link.prog;
+
+		atype = p->aux->cgroup_atype;
+	} else {
+		atype = to_cgroup_bpf_attach_type(type);
+		if (atype < 0)
+			return -EINVAL;
+	}
 
 	progs = &cgrp->bpf.progs[atype];
 	flags = cgrp->bpf.flags[atype];
@@ -776,6 +922,13 @@ static int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
 	if (err)
 		goto cleanup;
 
+
+	if (type == BPF_LSM_CGROUP) {
+		struct bpf_prog *p = old_prog ? : link->link.prog;
+
+		bpf_cgroup_lsm_shim_release(p);
+	}
+
 	/* now can actually delete it from this cgroup list */
 	hlist_del(&pl->node);
 
@@ -1293,7 +1446,7 @@ BPF_CALL_0(bpf_get_retval)
 	return ctx->retval;
 }
 
-static const struct bpf_func_proto bpf_get_retval_proto = {
+const struct bpf_func_proto bpf_get_retval_proto = {
 	.func		= bpf_get_retval,
 	.gpl_only	= false,
 	.ret_type	= RET_INTEGER,
@@ -1308,7 +1461,7 @@ BPF_CALL_1(bpf_set_retval, int, retval)
 	return 0;
 }
 
-static const struct bpf_func_proto bpf_set_retval_proto = {
+const struct bpf_func_proto bpf_set_retval_proto = {
 	.func		= bpf_set_retval,
 	.gpl_only	= false,
 	.ret_type	= RET_INTEGER,
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 9cc91f0f3115..b9c408dbb155 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2650,6 +2650,8 @@ const struct bpf_func_proto bpf_get_local_storage_proto __weak;
 const struct bpf_func_proto bpf_get_ns_current_pid_tgid_proto __weak;
 const struct bpf_func_proto bpf_snprintf_btf_proto __weak;
 const struct bpf_func_proto bpf_seq_printf_btf_proto __weak;
+const struct bpf_func_proto bpf_set_retval_proto __weak;
+const struct bpf_func_proto bpf_get_retval_proto __weak;
 
 const struct bpf_func_proto * __weak bpf_get_trace_printk_proto(void)
 {
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 72e53489165d..5ed2093e51cc 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3416,6 +3416,8 @@ attach_type_to_prog_type(enum bpf_attach_type attach_type)
 		return BPF_PROG_TYPE_SK_LOOKUP;
 	case BPF_XDP:
 		return BPF_PROG_TYPE_XDP;
+	case BPF_LSM_CGROUP:
+		return BPF_PROG_TYPE_LSM;
 	default:
 		return BPF_PROG_TYPE_UNSPEC;
 	}
@@ -3469,6 +3471,11 @@ static int bpf_prog_attach(const union bpf_attr *attr)
 	case BPF_PROG_TYPE_CGROUP_SOCKOPT:
 	case BPF_PROG_TYPE_CGROUP_SYSCTL:
 	case BPF_PROG_TYPE_SOCK_OPS:
+	case BPF_PROG_TYPE_LSM:
+		if (ptype == BPF_PROG_TYPE_LSM &&
+		    prog->expected_attach_type != BPF_LSM_CGROUP)
+			return -EINVAL;
+
 		ret = cgroup_bpf_prog_attach(attr, ptype, prog);
 		break;
 	default:
@@ -3506,6 +3513,7 @@ static int bpf_prog_detach(const union bpf_attr *attr)
 	case BPF_PROG_TYPE_CGROUP_SOCKOPT:
 	case BPF_PROG_TYPE_CGROUP_SYSCTL:
 	case BPF_PROG_TYPE_SOCK_OPS:
+	case BPF_PROG_TYPE_LSM:
 		return cgroup_bpf_prog_detach(attr, ptype);
 	default:
 		return -EINVAL;
@@ -4539,6 +4547,8 @@ static int link_create(union bpf_attr *attr, bpfptr_t uattr)
 			ret = bpf_raw_tp_link_attach(prog, NULL);
 		else if (prog->expected_attach_type == BPF_TRACE_ITER)
 			ret = bpf_iter_link_attach(attr, uattr, prog);
+		else if (prog->expected_attach_type == BPF_LSM_CGROUP)
+			ret = cgroup_bpf_link_attach(attr, prog);
 		else
 			ret = bpf_tracing_prog_attach(prog,
 						      attr->link_create.target_fd,
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index 01ce78c1df80..c424056f0b35 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -11,6 +11,8 @@
 #include <linux/rcupdate_wait.h>
 #include <linux/module.h>
 #include <linux/static_call.h>
+#include <linux/bpf_verifier.h>
+#include <linux/bpf_lsm.h>
 
 /* dummy _ops. The verifier will operate on target program's ops. */
 const struct bpf_verifier_ops bpf_extension_verifier_ops = {
@@ -497,6 +499,163 @@ int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampolin
 	return err;
 }
 
+#if defined(CONFIG_BPF_JIT) && defined(CONFIG_BPF_SYSCALL)
+static struct bpf_shim_tramp_link *cgroup_shim_alloc(const struct bpf_prog *prog,
+						     bpf_func_t bpf_func)
+{
+	struct bpf_shim_tramp_link *shim_link = NULL;
+	struct bpf_prog *p;
+
+	shim_link = kzalloc(sizeof(*shim_link), GFP_USER);
+	if (!shim_link)
+		return NULL;
+
+	p = bpf_prog_alloc(1, 0);
+	if (!p) {
+		kfree(shim_link);
+		return NULL;
+	}
+
+	p->jited = false;
+	p->bpf_func = bpf_func;
+
+	p->aux->cgroup_atype = prog->aux->cgroup_atype;
+	p->aux->attach_func_proto = prog->aux->attach_func_proto;
+	p->aux->attach_btf_id = prog->aux->attach_btf_id;
+	p->aux->attach_btf = prog->aux->attach_btf;
+	btf_get(p->aux->attach_btf);
+	p->type = BPF_PROG_TYPE_LSM;
+	p->expected_attach_type = BPF_LSM_MAC;
+	bpf_prog_inc(p);
+	bpf_link_init(&shim_link->tramp_link.link, BPF_LINK_TYPE_TRACING, NULL, p);
+	atomic64_set(&shim_link->refcnt, 1);
+
+	return shim_link;
+}
+
+static struct bpf_shim_tramp_link *cgroup_shim_find(struct bpf_trampoline *tr,
+						    bpf_func_t bpf_func)
+{
+	struct bpf_tramp_link *link;
+	int kind;
+
+	for (kind = 0; kind < BPF_TRAMP_MAX; kind++) {
+		hlist_for_each_entry(link, &tr->progs_hlist[kind], tramp_hlist) {
+			struct bpf_prog *p = link->link.prog;
+
+			if (p->bpf_func == bpf_func)
+				return container_of(link, struct bpf_shim_tramp_link, tramp_link);
+		}
+	}
+
+	return NULL;
+}
+
+static void cgroup_shim_put(struct bpf_shim_tramp_link *shim_link)
+{
+	if (shim_link->tr)
+		bpf_trampoline_put(shim_link->tr);
+
+	if (!atomic64_dec_and_test(&shim_link->refcnt))
+		return;
+
+	if (!shim_link->tr)
+		return;
+
+	WARN_ON_ONCE(__bpf_trampoline_unlink_prog(&shim_link->tramp_link, shim_link->tr));
+	kfree(shim_link);
+}
+
+int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
+				    struct bpf_attach_target_info *tgt_info)
+{
+	struct bpf_shim_tramp_link *shim_link = NULL;
+	struct bpf_trampoline *tr;
+	bpf_func_t bpf_func;
+	u64 key;
+	int err;
+
+	key = bpf_trampoline_compute_key(NULL, prog->aux->attach_btf,
+					 prog->aux->attach_btf_id);
+
+	err = bpf_lsm_find_cgroup_shim(prog, &bpf_func);
+	if (err)
+		return err;
+
+	tr = bpf_trampoline_get(key, tgt_info);
+	if (!tr)
+		return  -ENOMEM;
+
+	mutex_lock(&tr->mutex);
+
+	shim_link = cgroup_shim_find(tr, bpf_func);
+	if (shim_link) {
+		/* Reusing existing shim attached by the other program. */
+		atomic64_inc(&shim_link->refcnt);
+		/* note, we're still holding tr refcnt from above */
+
+		mutex_unlock(&tr->mutex);
+		return 0;
+	}
+
+	/* Allocate and install new shim. */
+
+	shim_link = cgroup_shim_alloc(prog, bpf_func);
+	if (!shim_link) {
+		bpf_trampoline_put(tr);
+		err = -ENOMEM;
+		goto out;
+	}
+
+	err = __bpf_trampoline_link_prog(&shim_link->tramp_link, tr);
+	if (err)
+		goto out;
+
+	shim_link->tr = tr;
+
+	mutex_unlock(&tr->mutex);
+
+	return 0;
+out:
+	mutex_unlock(&tr->mutex);
+
+	if (shim_link)
+		cgroup_shim_put(shim_link);
+
+	return err;
+}
+
+void bpf_trampoline_unlink_cgroup_shim(struct bpf_prog *prog)
+{
+	struct bpf_shim_tramp_link *shim_link = NULL;
+	struct bpf_trampoline *tr;
+	bpf_func_t bpf_func;
+	u64 key;
+	int err;
+
+	key = bpf_trampoline_compute_key(NULL, prog->aux->attach_btf,
+					 prog->aux->attach_btf_id);
+
+	err = bpf_lsm_find_cgroup_shim(prog, &bpf_func);
+	if (err)
+		return;
+
+	tr = bpf_trampoline_lookup(key);
+	if (!tr)
+		return;
+
+	mutex_lock(&tr->mutex);
+
+	shim_link = cgroup_shim_find(tr, bpf_func);
+	if (shim_link)
+		cgroup_shim_put(shim_link);
+
+	mutex_unlock(&tr->mutex);
+
+	bpf_trampoline_put(tr); /* bpf_trampoline_lookup above */
+}
+#endif
+
 struct bpf_trampoline *bpf_trampoline_get(u64 key,
 					  struct bpf_attach_target_info *tgt_info)
 {
@@ -629,6 +788,31 @@ void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start, struct bpf_tramp_
 	rcu_read_unlock();
 }
 
+u64 notrace __bpf_prog_enter_lsm_cgroup(struct bpf_prog *prog,
+					struct bpf_tramp_run_ctx *run_ctx)
+	__acquires(RCU)
+{
+	/* Runtime stats are exported via actual BPF_LSM_CGROUP
+	 * programs, not the shims.
+	 */
+	rcu_read_lock();
+	migrate_disable();
+
+	run_ctx->saved_run_ctx = bpf_set_run_ctx(&run_ctx->run_ctx);
+
+	return NO_START_TIME;
+}
+
+void notrace __bpf_prog_exit_lsm_cgroup(struct bpf_prog *prog, u64 start,
+					struct bpf_tramp_run_ctx *run_ctx)
+	__releases(RCU)
+{
+	bpf_reset_run_ctx(run_ctx->saved_run_ctx);
+
+	migrate_enable();
+	rcu_read_unlock();
+}
+
 u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog, struct bpf_tramp_run_ctx *run_ctx)
 {
 	rcu_read_lock_trace();
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9b59581026f8..ff43188e3040 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -7021,6 +7021,19 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 		err = __check_func_call(env, insn, insn_idx_p, meta.subprogno,
 					set_loop_callback_state);
 		break;
+
+	case BPF_FUNC_set_retval:
+		if (env->prog->expected_attach_type == BPF_LSM_CGROUP) {
+			if (!env->prog->aux->attach_func_proto->type) {
+				/* Make sure programs that attach to void
+				 * hooks don't try to modify return value.
+				 */
+				err = -EINVAL;
+				verbose(env, "BPF_LSM_CGROUP that attach to void LSM hooks can't modify return value!\n");
+			}
+		}
+
+		break;
 	}
 
 	if (err)
@@ -10216,6 +10229,18 @@ static int check_return_code(struct bpf_verifier_env *env)
 	case BPF_PROG_TYPE_SK_LOOKUP:
 		range = tnum_range(SK_DROP, SK_PASS);
 		break;
+
+	case BPF_PROG_TYPE_LSM:
+		if (env->prog->expected_attach_type == BPF_LSM_CGROUP) {
+			if (!env->prog->aux->attach_func_proto->type) {
+				/* Make sure programs that attach to void
+				 * hooks don't try to modify return value.
+				 */
+				range = tnum_range(1, 1);
+			}
+		}
+		break;
+
 	case BPF_PROG_TYPE_EXT:
 		/* freplace program can return anything as its return value
 		 * depends on the to-be-replaced kernel func or bpf program.
@@ -10232,6 +10257,8 @@ static int check_return_code(struct bpf_verifier_env *env)
 
 	if (!tnum_in(range, reg->var_off)) {
 		verbose_invalid_scalar(env, reg, &range, "program exit", "R0");
+		if (env->prog->expected_attach_type == BPF_LSM_CGROUP)
+			verbose(env, "BPF_LSM_CGROUP that attach to void LSM hooks can't modify return value!\n");
 		return -EINVAL;
 	}
 
@@ -14455,6 +14482,7 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
 		fallthrough;
 	case BPF_MODIFY_RETURN:
 	case BPF_LSM_MAC:
+	case BPF_LSM_CGROUP:
 	case BPF_TRACE_FENTRY:
 	case BPF_TRACE_FEXIT:
 		if (!btf_type_is_func(t)) {
diff --git a/tools/include/linux/btf_ids.h b/tools/include/linux/btf_ids.h
index 57890b357f85..2345b502b439 100644
--- a/tools/include/linux/btf_ids.h
+++ b/tools/include/linux/btf_ids.h
@@ -172,7 +172,9 @@ extern struct btf_id_set name;
 	BTF_SOCK_TYPE(BTF_SOCK_TYPE_TCP_TW, tcp_timewait_sock)		\
 	BTF_SOCK_TYPE(BTF_SOCK_TYPE_TCP6, tcp6_sock)			\
 	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UDP, udp_sock)			\
-	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UDP6, udp6_sock)
+	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UDP6, udp6_sock)			\
+	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UNIX, unix_sock)			\
+	BTF_SOCK_TYPE(BTF_SOCK_TYPE_SOCKET, socket)
 
 enum {
 #define BTF_SOCK_TYPE(name, str) name,
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 0210f85131b3..b9d2d6de63a7 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -998,6 +998,7 @@ enum bpf_attach_type {
 	BPF_SK_REUSEPORT_SELECT_OR_MIGRATE,
 	BPF_PERF_EVENT,
 	BPF_TRACE_KPROBE_MULTI,
+	BPF_LSM_CGROUP,
 	__MAX_BPF_ATTACH_TYPE
 };
 
-- 
2.36.1.124.g0e6072fb45-goog


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH bpf-next v7 04/11] bpf: minimize number of allocated lsm slots per program
  2022-05-18 22:55 [PATCH bpf-next v7 00/11] bpf: cgroup_sock lsm flavor Stanislav Fomichev
                   ` (2 preceding siblings ...)
  2022-05-18 22:55 ` [PATCH bpf-next v7 03/11] bpf: per-cgroup lsm flavor Stanislav Fomichev
@ 2022-05-18 22:55 ` Stanislav Fomichev
  2022-05-21  6:56   ` Martin KaFai Lau
  2022-05-18 22:55 ` [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP Stanislav Fomichev
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-18 22:55 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

The previous patch adds a 1:1 mapping between all 211 LSM hooks
and the bpf_cgroup program array. Instead of reserving a slot per
possible hook, reserve 10 slots per cgroup for lsm programs.
Those slots are dynamically allocated on demand and reclaimed.

struct cgroup_bpf {
	struct bpf_prog_array *    effective[33];        /*     0   264 */
	/* --- cacheline 4 boundary (256 bytes) was 8 bytes ago --- */
	struct hlist_head          progs[33];            /*   264   264 */
	/* --- cacheline 8 boundary (512 bytes) was 16 bytes ago --- */
	u8                         flags[33];            /*   528    33 */

	/* XXX 7 bytes hole, try to pack */

	struct list_head           storages;             /*   568    16 */
	/* --- cacheline 9 boundary (576 bytes) was 8 bytes ago --- */
	struct bpf_prog_array *    inactive;             /*   584     8 */
	struct percpu_ref          refcnt;               /*   592    16 */
	struct work_struct         release_work;         /*   608    72 */

	/* size: 680, cachelines: 11, members: 7 */
	/* sum members: 673, holes: 1, sum holes: 7 */
	/* last cacheline: 40 bytes */
};
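
To make the user-visible side of this limit concrete, below is a small
hypothetical userspace sketch (not part of the patch; it assumes prog_fds[]
holds programs loaded for distinct lsm_cgroup hooks and uapi headers from
this series). Programs attached to the same hook share a slot; attaching to
an 11th distinct hook is expected to fail with E2BIG:

#include <errno.h>
#include <bpf/bpf.h>

/* Hypothetical sketch: attach prog_fds[0..n-1] to a cgroup; each distinct
 * LSM hook consumes one of the CGROUP_LSM_NUM (10) per-cgroup slots.
 */
static int attach_many(int cgroup_fd, const int *prog_fds, int n)
{
	int i, err;

	for (i = 0; i < n; i++) {
		err = bpf_prog_attach(prog_fds[i], cgroup_fd, BPF_LSM_CGROUP, 0);
		if (err == -E2BIG || (err < 0 && errno == E2BIG))
			return i; /* all 10 per-cgroup LSM slots are taken */
		if (err < 0)
			return err;
	}
	return n;
}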

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 include/linux/bpf-cgroup-defs.h |   3 +-
 include/linux/bpf_lsm.h         |   6 --
 kernel/bpf/bpf_lsm.c            |   5 --
 kernel/bpf/cgroup.c             | 135 +++++++++++++++++++++++++++++---
 4 files changed, 125 insertions(+), 24 deletions(-)

diff --git a/include/linux/bpf-cgroup-defs.h b/include/linux/bpf-cgroup-defs.h
index d5a70a35dace..359d3f16abea 100644
--- a/include/linux/bpf-cgroup-defs.h
+++ b/include/linux/bpf-cgroup-defs.h
@@ -10,7 +10,8 @@
 
 struct bpf_prog_array;
 
-#define CGROUP_LSM_NUM 211 /* will be addressed in the next patch */
+/* Maximum number of concurrently attachable per-cgroup LSM hooks. */
+#define CGROUP_LSM_NUM 10
 
 enum cgroup_bpf_attach_type {
 	CGROUP_BPF_ATTACH_TYPE_INVALID = -1,
diff --git a/include/linux/bpf_lsm.h b/include/linux/bpf_lsm.h
index 7f0e59f5f9be..613de44aa429 100644
--- a/include/linux/bpf_lsm.h
+++ b/include/linux/bpf_lsm.h
@@ -43,7 +43,6 @@ extern const struct bpf_func_proto bpf_inode_storage_delete_proto;
 void bpf_inode_storage_free(struct inode *inode);
 
 int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog, bpf_func_t *bpf_func);
-int bpf_lsm_hook_idx(u32 btf_id);
 
 #else /* !CONFIG_BPF_LSM */
 
@@ -74,11 +73,6 @@ static inline int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
 	return -ENOENT;
 }
 
-static inline int bpf_lsm_hook_idx(u32 btf_id)
-{
-	return -EINVAL;
-}
-
 #endif /* CONFIG_BPF_LSM */
 
 #endif /* _LINUX_BPF_LSM_H */
diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
index 654c23577ad3..96503c3e7a71 100644
--- a/kernel/bpf/bpf_lsm.c
+++ b/kernel/bpf/bpf_lsm.c
@@ -71,11 +71,6 @@ int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
 	return 0;
 }
 
-int bpf_lsm_hook_idx(u32 btf_id)
-{
-	return btf_id_set_index(&bpf_lsm_hooks, btf_id);
-}
-
 int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
 			const struct bpf_prog *prog)
 {
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 2c356a38f4cf..a959cdd22870 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -132,15 +132,110 @@ unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
 }
 
 #ifdef CONFIG_BPF_LSM
+struct list_head unused_bpf_lsm_atypes;
+struct list_head used_bpf_lsm_atypes;
+
+struct bpf_lsm_attach_type {
+	int index;
+	u32 btf_id;
+	int usecnt;
+	struct list_head atypes;
+	struct rcu_head rcu_head;
+};
+
+static int __init bpf_lsm_attach_type_init(void)
+{
+	struct bpf_lsm_attach_type *atype;
+	int i;
+
+	INIT_LIST_HEAD_RCU(&unused_bpf_lsm_atypes);
+	INIT_LIST_HEAD_RCU(&used_bpf_lsm_atypes);
+
+	for (i = 0; i < CGROUP_LSM_NUM; i++) {
+		atype = kzalloc(sizeof(*atype), GFP_KERNEL);
+		if (!atype)
+			continue;
+
+		atype->index = i;
+		list_add_tail_rcu(&atype->atypes, &unused_bpf_lsm_atypes);
+	}
+
+	return 0;
+}
+late_initcall(bpf_lsm_attach_type_init);
+
 static enum cgroup_bpf_attach_type bpf_lsm_attach_type_get(u32 attach_btf_id)
 {
-	return CGROUP_LSM_START + bpf_lsm_hook_idx(attach_btf_id);
+	struct bpf_lsm_attach_type *atype;
+
+	lockdep_assert_held(&cgroup_mutex);
+
+	list_for_each_entry_rcu(atype, &used_bpf_lsm_atypes, atypes) {
+		if (atype->btf_id != attach_btf_id)
+			continue;
+
+		atype->usecnt++;
+		return CGROUP_LSM_START + atype->index;
+	}
+
+	atype = list_first_or_null_rcu(&unused_bpf_lsm_atypes, struct bpf_lsm_attach_type, atypes);
+	if (!atype)
+		return -E2BIG;
+
+	list_del_rcu(&atype->atypes);
+	atype->btf_id = attach_btf_id;
+	atype->usecnt = 1;
+	list_add_tail_rcu(&atype->atypes, &used_bpf_lsm_atypes);
+
+	return CGROUP_LSM_START + atype->index;
+}
+
+static void bpf_lsm_attach_type_reclaim(struct rcu_head *head)
+{
+	struct bpf_lsm_attach_type *atype =
+		container_of(head, struct bpf_lsm_attach_type, rcu_head);
+
+	atype->btf_id = 0;
+	atype->usecnt = 0;
+	list_add_tail_rcu(&atype->atypes, &unused_bpf_lsm_atypes);
+}
+
+static void bpf_lsm_attach_type_put(u32 attach_btf_id)
+{
+	struct bpf_lsm_attach_type *atype;
+
+	lockdep_assert_held(&cgroup_mutex);
+
+	list_for_each_entry_rcu(atype, &used_bpf_lsm_atypes, atypes) {
+		if (atype->btf_id != attach_btf_id)
+			continue;
+
+		if (--atype->usecnt <= 0) {
+			list_del_rcu(&atype->atypes);
+			WARN_ON_ONCE(atype->usecnt < 0);
+
+			/* call_rcu here prevents atype reuse within
+			 * the same rcu grace period.
+			 * shim programs use __bpf_prog_enter_lsm_cgroup
+			 * which starts RCU read section.
+			 */
+			call_rcu(&atype->rcu_head, bpf_lsm_attach_type_reclaim);
+		}
+
+		return;
+	}
+
+	WARN_ON_ONCE(1);
 }
 #else
 static enum cgroup_bpf_attach_type bpf_lsm_attach_type_get(u32 attach_btf_id)
 {
 	return -EOPNOTSUPP;
 }
+
+static void bpf_lsm_attach_type_put(u32 attach_btf_id)
+{
+}
 #endif /* CONFIG_BPF_LSM */
 
 void cgroup_bpf_offline(struct cgroup *cgrp)
@@ -224,6 +319,7 @@ static void bpf_cgroup_link_auto_detach(struct bpf_cgroup_link *link)
 static void bpf_cgroup_lsm_shim_release(struct bpf_prog *prog)
 {
 	bpf_trampoline_unlink_cgroup_shim(prog);
+	bpf_lsm_attach_type_put(prog->aux->attach_btf_id);
 }
 
 /**
@@ -619,27 +715,37 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 
 	progs = &cgrp->bpf.progs[atype];
 
-	if (!hierarchy_allows_attach(cgrp, atype))
-		return -EPERM;
+	if (!hierarchy_allows_attach(cgrp, atype)) {
+		err = -EPERM;
+		goto cleanup_attach_type;
+	}
 
-	if (!hlist_empty(progs) && cgrp->bpf.flags[atype] != saved_flags)
+	if (!hlist_empty(progs) && cgrp->bpf.flags[atype] != saved_flags) {
 		/* Disallow attaching non-overridable on top
 		 * of existing overridable in this cgroup.
 		 * Disallow attaching multi-prog if overridable or none
 		 */
-		return -EPERM;
+		err = -EPERM;
+		goto cleanup_attach_type;
+	}
 
-	if (prog_list_length(progs) >= BPF_CGROUP_MAX_PROGS)
-		return -E2BIG;
+	if (prog_list_length(progs) >= BPF_CGROUP_MAX_PROGS) {
+		err = -E2BIG;
+		goto cleanup_attach_type;
+	}
 
 	pl = find_attach_entry(progs, prog, link, replace_prog,
 			       flags & BPF_F_ALLOW_MULTI);
-	if (IS_ERR(pl))
-		return PTR_ERR(pl);
+	if (IS_ERR(pl)) {
+		err = PTR_ERR(pl);
+		goto cleanup_attach_type;
+	}
 
 	if (bpf_cgroup_storages_alloc(storage, new_storage, type,
-				      prog ? : link->link.prog, cgrp))
-		return -ENOMEM;
+				      prog ? : link->link.prog, cgrp)) {
+		err = -ENOMEM;
+		goto cleanup_attach_type;
+	}
 
 	if (pl) {
 		old_prog = pl->prog;
@@ -649,7 +755,8 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 		pl = kmalloc(sizeof(*pl), GFP_KERNEL);
 		if (!pl) {
 			bpf_cgroup_storages_free(new_storage);
-			return -ENOMEM;
+			err = -ENOMEM;
+			goto cleanup_attach_type;
 		}
 		if (hlist_empty(progs))
 			hlist_add_head(&pl->node, progs);
@@ -698,6 +805,10 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 		hlist_del(&pl->node);
 		kfree(pl);
 	}
+
+cleanup_attach_type:
+	if (type == BPF_LSM_CGROUP)
+		bpf_lsm_attach_type_put(new_prog->aux->attach_btf_id);
 	return err;
 }
 
-- 
2.36.1.124.g0e6072fb45-goog


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-05-18 22:55 [PATCH bpf-next v7 00/11] bpf: cgroup_sock lsm flavor Stanislav Fomichev
                   ` (3 preceding siblings ...)
  2022-05-18 22:55 ` [PATCH bpf-next v7 04/11] bpf: minimize number of allocated lsm slots per program Stanislav Fomichev
@ 2022-05-18 22:55 ` Stanislav Fomichev
  2022-05-19  2:31   ` kernel test robot
                     ` (3 more replies)
  2022-05-18 22:55 ` [PATCH bpf-next v7 06/11] bpf: allow writing to a subset of sock fields from lsm progtype Stanislav Fomichev
                   ` (6 subsequent siblings)
  11 siblings, 4 replies; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-18 22:55 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

We have two options:
1. Treat all BPF_LSM_CGROUP the same, regardless of attach_btf_id
2. Treat BPF_LSM_CGROUP+attach_btf_id as a separate hook point

I was doing (2) in the original patch, but switching to (1) here:

* bpf_prog_query returns all attached BPF_LSM_CGROUP programs
regardless of attach_btf_id
* attach_btf_id is exported via bpf_prog_info
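
For illustration, here is a hypothetical sketch (not part of the patch) of
how userspace could map a queried program id back to its LSM hook via the
new attach_btf_func_id field; it assumes the updated bpf_prog_info layout
from this patch and that the program is attached against vmlinux BTF:

#include <bpf/bpf.h>
#include <bpf/btf.h>

/* Hypothetical sketch: resolve the LSM hook name a BPF_LSM_CGROUP program
 * is attached to.
 */
static const char *lsm_hook_name(int prog_fd, const struct btf *vmlinux)
{
	struct bpf_prog_info info = {};
	__u32 len = sizeof(info);
	const struct btf_type *t;

	if (bpf_obj_get_info_by_fd(prog_fd, &info, &len))
		return NULL;
	if (!info.attach_btf_func_id)
		return NULL;

	/* attach_btf_func_id is a FUNC id within the BTF object identified
	 * by info.btf_id; vmlinux BTF is assumed here.
	 */
	t = btf__type_by_id(vmlinux, info.attach_btf_func_id);
	return t ? btf__name_by_offset(vmlinux, t->name_off) : NULL;
}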

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 include/uapi/linux/bpf.h |   5 ++
 kernel/bpf/cgroup.c      | 103 +++++++++++++++++++++++++++------------
 kernel/bpf/syscall.c     |   4 +-
 3 files changed, 81 insertions(+), 31 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index b9d2d6de63a7..432fc5f49567 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1432,6 +1432,7 @@ union bpf_attr {
 		__u32		attach_flags;
 		__aligned_u64	prog_ids;
 		__u32		prog_cnt;
+		__aligned_u64	prog_attach_flags; /* output: per-program attach_flags */
 	} query;
 
 	struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
@@ -5911,6 +5912,10 @@ struct bpf_prog_info {
 	__u64 run_cnt;
 	__u64 recursion_misses;
 	__u32 verified_insns;
+	/* BTF ID of the function to attach to within BTF object identified
+	 * by btf_id.
+	 */
+	__u32 attach_btf_func_id;
 } __attribute__((aligned(8)));
 
 struct bpf_map_info {
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index a959cdd22870..08a1015ee09e 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -1074,6 +1074,7 @@ static int cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
 static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
 			      union bpf_attr __user *uattr)
 {
+	__u32 __user *prog_attach_flags = u64_to_user_ptr(attr->query.prog_attach_flags);
 	__u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids);
 	enum bpf_attach_type type = attr->query.attach_type;
 	enum cgroup_bpf_attach_type atype;
@@ -1081,50 +1082,92 @@ static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
 	struct hlist_head *progs;
 	struct bpf_prog *prog;
 	int cnt, ret = 0, i;
+	int total_cnt = 0;
 	u32 flags;
 
-	atype = to_cgroup_bpf_attach_type(type);
-	if (atype < 0)
-		return -EINVAL;
+	enum cgroup_bpf_attach_type from_atype, to_atype;
 
-	progs = &cgrp->bpf.progs[atype];
-	flags = cgrp->bpf.flags[atype];
+	if (type == BPF_LSM_CGROUP) {
+		from_atype = CGROUP_LSM_START;
+		to_atype = CGROUP_LSM_END;
+	} else {
+		from_atype = to_cgroup_bpf_attach_type(type);
+		if (from_atype < 0)
+			return -EINVAL;
+		to_atype = from_atype;
+	}
 
-	effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
-					      lockdep_is_held(&cgroup_mutex));
+	for (atype = from_atype; atype <= to_atype; atype++) {
+		progs = &cgrp->bpf.progs[atype];
+		flags = cgrp->bpf.flags[atype];
 
-	if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
-		cnt = bpf_prog_array_length(effective);
-	else
-		cnt = prog_list_length(progs);
+		effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
+						      lockdep_is_held(&cgroup_mutex));
 
-	if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
-		return -EFAULT;
-	if (copy_to_user(&uattr->query.prog_cnt, &cnt, sizeof(cnt)))
+		if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
+			total_cnt += bpf_prog_array_length(effective);
+		else
+			total_cnt += prog_list_length(progs);
+	}
+
+	if (type != BPF_LSM_CGROUP)
+		if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
+			return -EFAULT;
+	if (copy_to_user(&uattr->query.prog_cnt, &total_cnt, sizeof(total_cnt)))
 		return -EFAULT;
-	if (attr->query.prog_cnt == 0 || !prog_ids || !cnt)
+	if (attr->query.prog_cnt == 0 || !prog_ids || !total_cnt)
 		/* return early if user requested only program count + flags */
 		return 0;
-	if (attr->query.prog_cnt < cnt) {
-		cnt = attr->query.prog_cnt;
+
+	if (attr->query.prog_cnt < total_cnt) {
+		total_cnt = attr->query.prog_cnt;
 		ret = -ENOSPC;
 	}
 
-	if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
-		return bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
-	} else {
-		struct bpf_prog_list *pl;
-		u32 id;
+	for (atype = from_atype; atype <= to_atype; atype++) {
+		if (total_cnt <= 0)
+			break;
 
-		i = 0;
-		hlist_for_each_entry(pl, progs, node) {
-			prog = prog_list_prog(pl);
-			id = prog->aux->id;
-			if (copy_to_user(prog_ids + i, &id, sizeof(id)))
-				return -EFAULT;
-			if (++i == cnt)
-				break;
+		progs = &cgrp->bpf.progs[atype];
+		flags = cgrp->bpf.flags[atype];
+
+		effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
+						      lockdep_is_held(&cgroup_mutex));
+
+		if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
+			cnt = bpf_prog_array_length(effective);
+		else
+			cnt = prog_list_length(progs);
+
+		if (cnt >= total_cnt)
+			cnt = total_cnt;
+
+		if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
+			ret = bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
+		} else {
+			struct bpf_prog_list *pl;
+			u32 id;
+
+			i = 0;
+			hlist_for_each_entry(pl, progs, node) {
+				prog = prog_list_prog(pl);
+				id = prog->aux->id;
+				if (copy_to_user(prog_ids + i, &id, sizeof(id)))
+					return -EFAULT;
+				if (++i == cnt)
+					break;
+			}
 		}
+
+		if (prog_attach_flags)
+			for (i = 0; i < cnt; i++)
+				if (copy_to_user(prog_attach_flags + i, &flags, sizeof(flags)))
+					return -EFAULT;
+
+		prog_ids += cnt;
+		total_cnt -= cnt;
+		if (prog_attach_flags)
+			prog_attach_flags += cnt;
 	}
 	return ret;
 }
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 5ed2093e51cc..4137583c04a2 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3520,7 +3520,7 @@ static int bpf_prog_detach(const union bpf_attr *attr)
 	}
 }
 
-#define BPF_PROG_QUERY_LAST_FIELD query.prog_cnt
+#define BPF_PROG_QUERY_LAST_FIELD query.prog_attach_flags
 
 static int bpf_prog_query(const union bpf_attr *attr,
 			  union bpf_attr __user *uattr)
@@ -3556,6 +3556,7 @@ static int bpf_prog_query(const union bpf_attr *attr,
 	case BPF_CGROUP_SYSCTL:
 	case BPF_CGROUP_GETSOCKOPT:
 	case BPF_CGROUP_SETSOCKOPT:
+	case BPF_LSM_CGROUP:
 		return cgroup_bpf_prog_query(attr, uattr);
 	case BPF_LIRC_MODE2:
 		return lirc_prog_query(attr, uattr);
@@ -4066,6 +4067,7 @@ static int bpf_prog_get_info_by_fd(struct file *file,
 
 	if (prog->aux->btf)
 		info.btf_id = btf_obj_id(prog->aux->btf);
+	info.attach_btf_func_id = prog->aux->attach_btf_id;
 
 	ulen = info.nr_func_info;
 	info.nr_func_info = prog->aux->func_info_cnt;
-- 
2.36.1.124.g0e6072fb45-goog


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH bpf-next v7 06/11] bpf: allow writing to a subset of sock fields from lsm progtype
  2022-05-18 22:55 [PATCH bpf-next v7 00/11] bpf: cgroup_sock lsm flavor Stanislav Fomichev
                   ` (4 preceding siblings ...)
  2022-05-18 22:55 ` [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP Stanislav Fomichev
@ 2022-05-18 22:55 ` Stanislav Fomichev
  2022-05-18 22:55 ` [PATCH bpf-next v7 07/11] libbpf: implement bpf_prog_query_opts Stanislav Fomichev
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-18 22:55 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

For now, allow only the obvious ones, like sk_priority and sk_mark.
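
As a rough sketch of what this enables (hypothetical program, closely
mirroring the selftest added later in this series; assumes the lsm_cgroup
section support from the libbpf patch):

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

char _license[] SEC("license") = "GPL";

/* Hypothetical sketch: set a default priority on every new socket. */
SEC("lsm_cgroup/socket_post_create")
int BPF_PROG(set_default_prio, struct socket *sock, int family, int type,
	     int protocol, int kern)
{
	struct sock *sk = sock->sk;

	if (!sk)
		return 1;

	/* Writes to sk_priority (and sk_mark) are allowed by this patch;
	 * other struct sock fields stay read-only.
	 */
	sk->sk_priority = 123;
	return 1; /* 1 == success for BPF_LSM_CGROUP */
}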

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 kernel/bpf/bpf_lsm.c  | 58 +++++++++++++++++++++++++++++++++++++++++++
 kernel/bpf/verifier.c |  3 ++-
 2 files changed, 60 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
index 96503c3e7a71..b7e9ed8325f5 100644
--- a/kernel/bpf/bpf_lsm.c
+++ b/kernel/bpf/bpf_lsm.c
@@ -305,7 +305,65 @@ bool bpf_lsm_is_sleepable_hook(u32 btf_id)
 const struct bpf_prog_ops lsm_prog_ops = {
 };
 
+static int lsm_btf_struct_access(struct bpf_verifier_log *log,
+					const struct btf *btf,
+					const struct btf_type *t, int off,
+					int size, enum bpf_access_type atype,
+					u32 *next_btf_id,
+					enum bpf_type_flag *flag)
+{
+	const struct btf_type *sock_type;
+	struct btf *btf_vmlinux;
+	s32 type_id;
+	size_t end;
+
+	if (atype == BPF_READ)
+		return btf_struct_access(log, btf, t, off, size, atype, next_btf_id,
+					 flag);
+
+	btf_vmlinux = bpf_get_btf_vmlinux();
+	if (!btf_vmlinux) {
+		bpf_log(log, "no vmlinux btf\n");
+		return -EOPNOTSUPP;
+	}
+
+	type_id = btf_find_by_name_kind(btf_vmlinux, "sock", BTF_KIND_STRUCT);
+	if (type_id < 0) {
+		bpf_log(log, "'struct sock' not found in vmlinux btf\n");
+		return -EINVAL;
+	}
+
+	sock_type = btf_type_by_id(btf_vmlinux, type_id);
+
+	if (t != sock_type) {
+		bpf_log(log, "only 'struct sock' writes are supported\n");
+		return -EACCES;
+	}
+
+	switch (off) {
+	case bpf_ctx_range(struct sock, sk_priority):
+		end = offsetofend(struct sock, sk_priority);
+		break;
+	case bpf_ctx_range(struct sock, sk_mark):
+		end = offsetofend(struct sock, sk_mark);
+		break;
+	default:
+		bpf_log(log, "no write support to 'struct sock' at off %d\n", off);
+		return -EACCES;
+	}
+
+	if (off + size > end) {
+		bpf_log(log,
+			"write access at off %d with size %d beyond the member of 'struct sock' ended at %zu\n",
+			off, size, end);
+		return -EACCES;
+	}
+
+	return NOT_INIT;
+}
+
 const struct bpf_verifier_ops lsm_verifier_ops = {
 	.get_func_proto = bpf_lsm_func_proto,
 	.is_valid_access = btf_ctx_access,
+	.btf_struct_access = lsm_btf_struct_access,
 };
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index ff43188e3040..af9b9ba8b796 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13151,7 +13151,8 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
 				insn->code = BPF_LDX | BPF_PROBE_MEM |
 					BPF_SIZE((insn)->code);
 				env->prog->aux->num_exentries++;
-			} else if (resolve_prog_type(env->prog) != BPF_PROG_TYPE_STRUCT_OPS) {
+			} else if (resolve_prog_type(env->prog) != BPF_PROG_TYPE_STRUCT_OPS &&
+				   resolve_prog_type(env->prog) != BPF_PROG_TYPE_LSM) {
 				verbose(env, "Writes through BTF pointers are not allowed\n");
 				return -EINVAL;
 			}
-- 
2.36.1.124.g0e6072fb45-goog


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH bpf-next v7 07/11] libbpf: implement bpf_prog_query_opts
  2022-05-18 22:55 [PATCH bpf-next v7 00/11] bpf: cgroup_sock lsm flavor Stanislav Fomichev
                   ` (5 preceding siblings ...)
  2022-05-18 22:55 ` [PATCH bpf-next v7 06/11] bpf: allow writing to a subset of sock fields from lsm progtype Stanislav Fomichev
@ 2022-05-18 22:55 ` Stanislav Fomichev
  2022-05-23 23:22   ` Andrii Nakryiko
  2022-05-18 22:55 ` [PATCH bpf-next v7 08/11] libbpf: add lsm_cgroup_sock type Stanislav Fomichev
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-18 22:55 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

Implement bpf_prog_query_opts as a more extendable version of
bpf_prog_query. Expose new prog_attach_flags and attach_btf_func_id as
well:

* prog_attach_flags is an array of per-program attach_flags; relevant
  only for lsm cgroup programs which might have different attach_flags
  per attach_btf_id
* attach_btf_func_id is a new field exposed in bpf_prog_info which
  specifies the real btf function id for lsm cgroup attachments (see
  the usage sketch below)
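
A minimal usage sketch (hypothetical, not part of the patch; buffer sizes
are arbitrary) of the new opts-based query for BPF_LSM_CGROUP:

#include <bpf/bpf.h>

/* Hypothetical sketch: return the number of BPF_LSM_CGROUP programs
 * attached to a cgroup and fill in their ids and per-program attach flags.
 */
static int count_lsm_cgroup_progs(int cgroup_fd)
{
	__u32 ids[64] = {}, flags[64] = {};
	LIBBPF_OPTS(bpf_prog_query_opts, opts,
		    .prog_ids = ids,
		    .prog_attach_flags = flags,
		    .prog_cnt = 64);
	int err;

	err = bpf_prog_query_opts(cgroup_fd, BPF_LSM_CGROUP, &opts);
	if (err)
		return err;

	/* opts.prog_cnt now holds the number of attached programs;
	 * flags[i] is the attach flag (e.g. BPF_F_ALLOW_MULTI) for ids[i].
	 */
	return opts.prog_cnt;
}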

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/include/uapi/linux/bpf.h |  5 ++++
 tools/lib/bpf/bpf.c            | 42 +++++++++++++++++++++++++++-------
 tools/lib/bpf/bpf.h            | 15 ++++++++++++
 tools/lib/bpf/libbpf.map       |  1 +
 4 files changed, 55 insertions(+), 8 deletions(-)

diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index b9d2d6de63a7..432fc5f49567 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1432,6 +1432,7 @@ union bpf_attr {
 		__u32		attach_flags;
 		__aligned_u64	prog_ids;
 		__u32		prog_cnt;
+		__aligned_u64	prog_attach_flags; /* output: per-program attach_flags */
 	} query;
 
 	struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
@@ -5911,6 +5912,10 @@ struct bpf_prog_info {
 	__u64 run_cnt;
 	__u64 recursion_misses;
 	__u32 verified_insns;
+	/* BTF ID of the function to attach to within BTF object identified
+	 * by btf_id.
+	 */
+	__u32 attach_btf_func_id;
 } __attribute__((aligned(8)));
 
 struct bpf_map_info {
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 4677644d80f4..f0c2c2ea5a93 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -968,28 +968,54 @@ int bpf_iter_create(int link_fd)
 	return libbpf_err_errno(fd);
 }
 
-int bpf_prog_query(int target_fd, enum bpf_attach_type type, __u32 query_flags,
-		   __u32 *attach_flags, __u32 *prog_ids, __u32 *prog_cnt)
+int bpf_prog_query_opts(int target_fd,
+			enum bpf_attach_type type,
+			struct bpf_prog_query_opts *opts)
 {
 	union bpf_attr attr;
 	int ret;
 
 	memset(&attr, 0, sizeof(attr));
+
+	if (!OPTS_VALID(opts, bpf_prog_query_opts))
+		return libbpf_err(-EINVAL);
+
 	attr.query.target_fd	= target_fd;
 	attr.query.attach_type	= type;
-	attr.query.query_flags	= query_flags;
-	attr.query.prog_cnt	= *prog_cnt;
-	attr.query.prog_ids	= ptr_to_u64(prog_ids);
+	attr.query.query_flags	= OPTS_GET(opts, query_flags, 0);
+	attr.query.prog_cnt	= OPTS_GET(opts, prog_cnt, 0);
+	attr.query.prog_ids	= ptr_to_u64(OPTS_GET(opts, prog_ids, NULL));
+	attr.query.prog_attach_flags = ptr_to_u64(OPTS_GET(opts, prog_attach_flags, NULL));
 
 	ret = sys_bpf(BPF_PROG_QUERY, &attr, sizeof(attr));
 
-	if (attach_flags)
-		*attach_flags = attr.query.attach_flags;
-	*prog_cnt = attr.query.prog_cnt;
+	if (OPTS_HAS(opts, prog_cnt))
+		opts->prog_cnt = attr.query.prog_cnt;
+	if (OPTS_HAS(opts, attach_flags))
+		opts->attach_flags = attr.query.attach_flags;
 
 	return libbpf_err_errno(ret);
 }
 
+int bpf_prog_query(int target_fd, enum bpf_attach_type type, __u32 query_flags,
+		   __u32 *attach_flags, __u32 *prog_ids, __u32 *prog_cnt)
+{
+	LIBBPF_OPTS(bpf_prog_query_opts, p);
+	int ret;
+
+	p.query_flags = query_flags;
+	p.prog_ids = prog_ids;
+	p.prog_cnt = *prog_cnt;
+
+	ret = bpf_prog_query_opts(target_fd, type, &p);
+
+	if (attach_flags)
+		*attach_flags = p.attach_flags;
+	*prog_cnt = p.prog_cnt;
+
+	return ret;
+}
+
 int bpf_prog_test_run(int prog_fd, int repeat, void *data, __u32 size,
 		      void *data_out, __u32 *size_out, __u32 *retval,
 		      __u32 *duration)
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index 2e0d3731e4c0..11ffbed99637 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -484,9 +484,24 @@ LIBBPF_API int bpf_map_get_fd_by_id(__u32 id);
 LIBBPF_API int bpf_btf_get_fd_by_id(__u32 id);
 LIBBPF_API int bpf_link_get_fd_by_id(__u32 id);
 LIBBPF_API int bpf_obj_get_info_by_fd(int bpf_fd, void *info, __u32 *info_len);
+
+struct bpf_prog_query_opts {
+	size_t sz; /* size of this struct for forward/backward compatibility */
+	__u32 query_flags;
+	__u32 attach_flags; /* output argument */
+	__u32 *prog_ids;
+	__u32 prog_cnt; /* input+output argument */
+	__u32 *prog_attach_flags;
+};
+#define bpf_prog_query_opts__last_field prog_attach_flags
+
+LIBBPF_API int bpf_prog_query_opts(int target_fd,
+				   enum bpf_attach_type type,
+				   struct bpf_prog_query_opts *opts);
 LIBBPF_API int bpf_prog_query(int target_fd, enum bpf_attach_type type,
 			      __u32 query_flags, __u32 *attach_flags,
 			      __u32 *prog_ids, __u32 *prog_cnt);
+
 LIBBPF_API int bpf_raw_tracepoint_open(const char *name, int prog_fd);
 LIBBPF_API int bpf_task_fd_query(int pid, int fd, __u32 flags, char *buf,
 				 __u32 *buf_len, __u32 *prog_id, __u32 *fd_type,
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index 6b36f46ab5d8..24f7a5147bf2 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -452,6 +452,7 @@ LIBBPF_0.8.0 {
 		bpf_map_delete_elem_flags;
 		bpf_object__destroy_subskeleton;
 		bpf_object__open_subskeleton;
+		bpf_prog_query_opts;
 		bpf_program__attach_kprobe_multi_opts;
 		bpf_program__attach_trace_opts;
 		bpf_program__attach_usdt;
-- 
2.36.1.124.g0e6072fb45-goog


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH bpf-next v7 08/11] libbpf: add lsm_cgroup_sock type
  2022-05-18 22:55 [PATCH bpf-next v7 00/11] bpf: cgroup_sock lsm flavor Stanislav Fomichev
                   ` (6 preceding siblings ...)
  2022-05-18 22:55 ` [PATCH bpf-next v7 07/11] libbpf: implement bpf_prog_query_opts Stanislav Fomichev
@ 2022-05-18 22:55 ` Stanislav Fomichev
  2022-05-23 23:26   ` Andrii Nakryiko
  2022-05-18 22:55 ` [PATCH bpf-next v7 09/11] bpftool: implement cgroup tree for BPF_LSM_CGROUP Stanislav Fomichev
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-18 22:55 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

lsm_cgroup/ is the prefix for BPF_LSM_CGROUP.
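
A rough loader-side sketch (hypothetical, not part of the patch): since the
section definition below has no auto-attach handler, attachment goes through
bpf_link_create()/bpf_prog_attach() with a cgroup fd, as the selftest later
in this series does. The skeleton and its socket_bind program are borrowed
from that selftest:

#include <bpf/bpf.h>
#include <bpf/libbpf.h>
#include "lsm_cgroup.skel.h"

/* Hypothetical sketch: load the BPF object and attach its
 * SEC("lsm_cgroup/socket_bind") program to the given cgroup;
 * returns a link fd (or a negative error).
 */
static int attach_socket_bind(int cgroup_fd)
{
	struct lsm_cgroup *skel;

	skel = lsm_cgroup__open_and_load();
	if (!skel)
		return -1;

	/* expected_attach_type BPF_LSM_CGROUP is inferred from the
	 * section name by the new SEC_DEF.
	 */
	return bpf_link_create(bpf_program__fd(skel->progs.socket_bind),
			       cgroup_fd, BPF_LSM_CGROUP, NULL);
}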

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/lib/bpf/libbpf.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index ef7f302e542f..854449dcd072 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -9027,6 +9027,7 @@ static const struct bpf_sec_def section_defs[] = {
 	SEC_DEF("fmod_ret.s+",		TRACING, BPF_MODIFY_RETURN, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
 	SEC_DEF("fexit.s+",		TRACING, BPF_TRACE_FEXIT, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
 	SEC_DEF("freplace+",		EXT, 0, SEC_ATTACH_BTF, attach_trace),
+	SEC_DEF("lsm_cgroup+",		LSM, BPF_LSM_CGROUP, SEC_ATTACH_BTF),
 	SEC_DEF("lsm+",			LSM, BPF_LSM_MAC, SEC_ATTACH_BTF, attach_lsm),
 	SEC_DEF("lsm.s+",		LSM, BPF_LSM_MAC, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_lsm),
 	SEC_DEF("iter+",		TRACING, BPF_TRACE_ITER, SEC_ATTACH_BTF, attach_iter),
@@ -9450,6 +9451,7 @@ void btf_get_kernel_prefix_kind(enum bpf_attach_type attach_type,
 		*kind = BTF_KIND_TYPEDEF;
 		break;
 	case BPF_LSM_MAC:
+	case BPF_LSM_CGROUP:
 		*prefix = BTF_LSM_PREFIX;
 		*kind = BTF_KIND_FUNC;
 		break;
-- 
2.36.1.124.g0e6072fb45-goog


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH bpf-next v7 09/11] bpftool: implement cgroup tree for BPF_LSM_CGROUP
  2022-05-18 22:55 [PATCH bpf-next v7 00/11] bpf: cgroup_sock lsm flavor Stanislav Fomichev
                   ` (7 preceding siblings ...)
  2022-05-18 22:55 ` [PATCH bpf-next v7 08/11] libbpf: add lsm_cgroup_sock type Stanislav Fomichev
@ 2022-05-18 22:55 ` Stanislav Fomichev
  2022-05-18 22:55 ` [PATCH bpf-next v7 10/11] selftests/bpf: lsm_cgroup functional test Stanislav Fomichev
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-18 22:55 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

$ bpftool --nomount prog loadall $KDIR/tools/testing/selftests/bpf/lsm_cgroup.o /sys/fs/bpf/x
$ bpftool cgroup attach /sys/fs/cgroup lsm_cgroup pinned /sys/fs/bpf/x/socket_alloc
$ bpftool cgroup attach /sys/fs/cgroup lsm_cgroup pinned /sys/fs/bpf/x/socket_bind
$ bpftool cgroup attach /sys/fs/cgroup lsm_cgroup pinned /sys/fs/bpf/x/socket_clone
$ bpftool cgroup attach /sys/fs/cgroup lsm_cgroup pinned /sys/fs/bpf/x/socket_post_create
$ bpftool cgroup tree
CgroupPath
ID       AttachType      AttachFlags     Name
/sys/fs/cgroup
6        lsm_cgroup                      socket_post_create bpf_lsm_socket_post_create
8        lsm_cgroup                      socket_bind     bpf_lsm_socket_bind
10       lsm_cgroup                      socket_alloc    bpf_lsm_sk_alloc_security
11       lsm_cgroup                      socket_clone    bpf_lsm_inet_csk_clone

$ bpftool cgroup detach /sys/fs/cgroup lsm_cgroup pinned /sys/fs/bpf/x/socket_post_create
$ bpftool cgroup tree
CgroupPath
ID       AttachType      AttachFlags     Name
/sys/fs/cgroup
8        lsm_cgroup                      socket_bind     bpf_lsm_socket_bind
10       lsm_cgroup                      socket_alloc    bpf_lsm_sk_alloc_security
11       lsm_cgroup                      socket_clone    bpf_lsm_inet_csk_clone

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/bpf/bpftool/cgroup.c | 77 +++++++++++++++++++++++++++-----------
 tools/bpf/bpftool/common.c |  1 +
 2 files changed, 56 insertions(+), 22 deletions(-)

diff --git a/tools/bpf/bpftool/cgroup.c b/tools/bpf/bpftool/cgroup.c
index effe136119d7..23e2d8a21e28 100644
--- a/tools/bpf/bpftool/cgroup.c
+++ b/tools/bpf/bpftool/cgroup.c
@@ -15,6 +15,7 @@
 #include <unistd.h>
 
 #include <bpf/bpf.h>
+#include <bpf/btf.h>
 
 #include "main.h"
 
@@ -32,6 +33,7 @@
 	"                        sock_release }"
 
 static unsigned int query_flags;
+static struct btf *btf_vmlinux;
 
 static enum bpf_attach_type parse_attach_type(const char *str)
 {
@@ -51,6 +53,7 @@ static int show_bpf_prog(int id, enum bpf_attach_type attach_type,
 			 int level)
 {
 	char prog_name[MAX_PROG_FULL_NAME];
+	const char *attach_btf_name = NULL;
 	struct bpf_prog_info info = {};
 	__u32 info_len = sizeof(info);
 	int prog_fd;
@@ -64,6 +67,18 @@ static int show_bpf_prog(int id, enum bpf_attach_type attach_type,
 		return -1;
 	}
 
+	if (btf_vmlinux &&
+	    info.attach_btf_func_id < btf__type_cnt(btf_vmlinux)) {
+		/* Note, we ignore info.btf_id for now. There
+		 * is no good way to resolve btf_id to vmlinux
+		 * or module btf.
+		 */
+		const struct btf_type *t = btf__type_by_id(btf_vmlinux,
+							   info.attach_btf_func_id);
+		attach_btf_name = btf__name_by_offset(btf_vmlinux,
+						      t->name_off);
+	}
+
 	get_prog_full_name(&info, prog_fd, prog_name, sizeof(prog_name));
 	if (json_output) {
 		jsonw_start_object(json_wtr);
@@ -76,6 +91,10 @@ static int show_bpf_prog(int id, enum bpf_attach_type attach_type,
 		jsonw_string_field(json_wtr, "attach_flags",
 				   attach_flags_str);
 		jsonw_string_field(json_wtr, "name", prog_name);
+		if (attach_btf_name)
+			jsonw_string_field(json_wtr, "attach_btf_name", attach_btf_name);
+		jsonw_uint_field(json_wtr, "btf_id", info.btf_id);
+		jsonw_uint_field(json_wtr, "attach_btf_func_id", info.attach_btf_func_id);
 		jsonw_end_object(json_wtr);
 	} else {
 		printf("%s%-8u ", level ? "    " : "", info.id);
@@ -83,7 +102,12 @@ static int show_bpf_prog(int id, enum bpf_attach_type attach_type,
 			printf("%-15s", attach_type_name[attach_type]);
 		else
 			printf("type %-10u", attach_type);
-		printf(" %-15s %-15s\n", attach_flags_str, prog_name);
+		printf(" %-15s %-15s", attach_flags_str, prog_name);
+		if (attach_btf_name)
+			printf(" %-15s", attach_btf_name);
+		else if (info.attach_btf_func_id)
+			printf(" btf_id=%d btf_func_id=%d", info.btf_id, info.attach_btf_func_id);
+		printf("\n");
 	}
 
 	close(prog_fd);
@@ -125,40 +149,48 @@ static int cgroup_has_attached_progs(int cgroup_fd)
 static int show_attached_bpf_progs(int cgroup_fd, enum bpf_attach_type type,
 				   int level)
 {
+	LIBBPF_OPTS(bpf_prog_query_opts, p);
 	const char *attach_flags_str;
 	__u32 prog_ids[1024] = {0};
-	__u32 prog_cnt, iter;
-	__u32 attach_flags;
+	__u32 attach_prog_flags[1024] = {0};
 	char buf[32];
+	__u32 iter;
 	int ret;
 
-	prog_cnt = ARRAY_SIZE(prog_ids);
-	ret = bpf_prog_query(cgroup_fd, type, query_flags, &attach_flags,
-			     prog_ids, &prog_cnt);
+	p.query_flags = query_flags;
+	p.prog_cnt = ARRAY_SIZE(prog_ids);
+	p.prog_ids = prog_ids;
+	p.prog_attach_flags = attach_prog_flags;
+
+	ret = bpf_prog_query_opts(cgroup_fd, type, &p);
 	if (ret)
 		return ret;
 
-	if (prog_cnt == 0)
+	if (p.prog_cnt == 0)
 		return 0;
 
-	switch (attach_flags) {
-	case BPF_F_ALLOW_MULTI:
-		attach_flags_str = "multi";
-		break;
-	case BPF_F_ALLOW_OVERRIDE:
-		attach_flags_str = "override";
-		break;
-	case 0:
-		attach_flags_str = "";
-		break;
-	default:
-		snprintf(buf, sizeof(buf), "unknown(%x)", attach_flags);
-		attach_flags_str = buf;
-	}
+	for (iter = 0; iter < p.prog_cnt; iter++) {
+		__u32 attach_flags;
+
+		attach_flags = attach_prog_flags[iter] ?: p.attach_flags;
+
+		switch (attach_flags) {
+		case BPF_F_ALLOW_MULTI:
+			attach_flags_str = "multi";
+			break;
+		case BPF_F_ALLOW_OVERRIDE:
+			attach_flags_str = "override";
+			break;
+		case 0:
+			attach_flags_str = "";
+			break;
+		default:
+			snprintf(buf, sizeof(buf), "unknown(%x)", attach_flags);
+			attach_flags_str = buf;
+		}
 
-	for (iter = 0; iter < prog_cnt; iter++)
 		show_bpf_prog(prog_ids[iter], type,
 			      attach_flags_str, level);
+	}
 
 	return 0;
 }
@@ -523,5 +555,6 @@ static const struct cmd cmds[] = {
 
 int do_cgroup(int argc, char **argv)
 {
+	btf_vmlinux = libbpf_find_kernel_btf();
 	return cmd_select(cmds, argc, argv, do_help);
 }
diff --git a/tools/bpf/bpftool/common.c b/tools/bpf/bpftool/common.c
index c740142c24d8..a7a913784c47 100644
--- a/tools/bpf/bpftool/common.c
+++ b/tools/bpf/bpftool/common.c
@@ -66,6 +66,7 @@ const char * const attach_type_name[__MAX_BPF_ATTACH_TYPE] = {
 	[BPF_TRACE_FEXIT]		= "fexit",
 	[BPF_MODIFY_RETURN]		= "mod_ret",
 	[BPF_LSM_MAC]			= "lsm_mac",
+	[BPF_LSM_CGROUP]		= "lsm_cgroup",
 	[BPF_SK_LOOKUP]			= "sk_lookup",
 	[BPF_TRACE_ITER]		= "trace_iter",
 	[BPF_XDP_DEVMAP]		= "xdp_devmap",
-- 
2.36.1.124.g0e6072fb45-goog


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH bpf-next v7 10/11] selftests/bpf: lsm_cgroup functional test
  2022-05-18 22:55 [PATCH bpf-next v7 00/11] bpf: cgroup_sock lsm flavor Stanislav Fomichev
                   ` (8 preceding siblings ...)
  2022-05-18 22:55 ` [PATCH bpf-next v7 09/11] bpftool: implement cgroup tree for BPF_LSM_CGROUP Stanislav Fomichev
@ 2022-05-18 22:55 ` Stanislav Fomichev
  2022-05-18 22:55 ` [PATCH bpf-next v7 11/11] selftests/bpf: verify lsm_cgroup struct sock access Stanislav Fomichev
  2022-05-19 23:34 ` [PATCH bpf-next v7 00/11] bpf: cgroup_sock lsm flavor Yonghong Song
  11 siblings, 0 replies; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-18 22:55 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

Functional test that exercises the following:

1. apply default sk_priority policy
2. permit TX-only AF_PACKET socket
3. cgroup attach/detach/replace
4. reusing trampoline shim

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 .../selftests/bpf/prog_tests/lsm_cgroup.c     | 277 ++++++++++++++++++
 .../testing/selftests/bpf/progs/lsm_cgroup.c  | 160 ++++++++++
 2 files changed, 437 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
 create mode 100644 tools/testing/selftests/bpf/progs/lsm_cgroup.c

diff --git a/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c b/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
new file mode 100644
index 000000000000..29292ec40343
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
@@ -0,0 +1,277 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <test_progs.h>
+#include <bpf/btf.h>
+
+#include "lsm_cgroup.skel.h"
+#include "cgroup_helpers.h"
+#include "network_helpers.h"
+
+static __u32 query_prog_cnt(int cgroup_fd, const char *attach_func)
+{
+	LIBBPF_OPTS(bpf_prog_query_opts, p);
+	static struct btf *btf;
+	int cnt = 0;
+	int i;
+
+	ASSERT_OK(bpf_prog_query_opts(cgroup_fd, BPF_LSM_CGROUP, &p), "prog_query");
+
+	if (!attach_func)
+		return p.prog_cnt;
+
+	/* When attach_func is provided, count the number of progs that
+	 * attach to the given symbol.
+	 */
+
+	if (!btf)
+		btf = btf__load_vmlinux_btf();
+	if (!ASSERT_OK(libbpf_get_error(btf), "btf_vmlinux"))
+		return -1;
+
+	p.prog_ids = malloc(sizeof(u32) * p.prog_cnt);
+	p.prog_attach_flags = malloc(sizeof(u32) * p.prog_cnt);
+	ASSERT_OK(bpf_prog_query_opts(cgroup_fd, BPF_LSM_CGROUP, &p), "prog_query");
+
+	for (i = 0; i < p.prog_cnt; i++) {
+		struct bpf_prog_info info = {};
+		__u32 info_len = sizeof(info);
+		int fd;
+
+		fd = bpf_prog_get_fd_by_id(p.prog_ids[i]);
+		ASSERT_GE(fd, 0, "prog_get_fd_by_id");
+		ASSERT_OK(bpf_obj_get_info_by_fd(fd, &info, &info_len), "prog_info_by_fd");
+		close(fd);
+
+		if (info.attach_btf_func_id ==
+		    btf__find_by_name_kind(btf, attach_func, BTF_KIND_FUNC))
+			cnt++;
+	}
+
+	return cnt;
+}
+
+static void test_lsm_cgroup_functional(void)
+{
+	DECLARE_LIBBPF_OPTS(bpf_prog_attach_opts, attach_opts);
+	DECLARE_LIBBPF_OPTS(bpf_link_update_opts, update_opts);
+	int cgroup_fd, cgroup_fd2, err, fd, prio;
+	int listen_fd, client_fd, accepted_fd;
+	struct lsm_cgroup *skel = NULL;
+	int post_create_prog_fd2 = -1;
+	int post_create_prog_fd = -1;
+	int bind_link_fd2 = -1;
+	int bind_prog_fd2 = -1;
+	int alloc_prog_fd = -1;
+	int bind_prog_fd = -1;
+	int bind_link_fd = -1;
+	int clone_prog_fd = -1;
+	socklen_t socklen;
+
+	cgroup_fd = test__join_cgroup("/sock_policy");
+	if (!ASSERT_GE(cgroup_fd, 0, "join_cgroup"))
+		goto close_skel;
+
+	cgroup_fd2 = create_and_get_cgroup("/sock_policy2");
+	if (!ASSERT_GE(cgroup_fd2, 0, "create second cgroup"))
+		goto close_skel;
+
+	skel = lsm_cgroup__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "open_and_load"))
+		goto close_cgroup;
+
+	post_create_prog_fd = bpf_program__fd(skel->progs.socket_post_create);
+	post_create_prog_fd2 = bpf_program__fd(skel->progs.socket_post_create2);
+	bind_prog_fd = bpf_program__fd(skel->progs.socket_bind);
+	bind_prog_fd2 = bpf_program__fd(skel->progs.socket_bind2);
+	alloc_prog_fd = bpf_program__fd(skel->progs.socket_alloc);
+	clone_prog_fd = bpf_program__fd(skel->progs.socket_clone);
+
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_sk_alloc_security"), 0, "prog count");
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 0, "total prog count");
+	err = bpf_prog_attach(alloc_prog_fd, cgroup_fd, BPF_LSM_CGROUP, 0);
+	if (!ASSERT_OK(err, "attach alloc_prog_fd"))
+		goto detach_cgroup;
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_sk_alloc_security"), 1, "prog count");
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 1, "total prog count");
+
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_inet_csk_clone"), 0, "prog count");
+	err = bpf_prog_attach(clone_prog_fd, cgroup_fd, BPF_LSM_CGROUP, 0);
+	if (!ASSERT_OK(err, "attach clone_prog_fd"))
+		goto detach_cgroup;
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_inet_csk_clone"), 1, "prog count");
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 2, "total prog count");
+
+	/* Make sure replacing works. */
+
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_post_create"), 0, "prog count");
+	err = bpf_prog_attach(post_create_prog_fd, cgroup_fd,
+			      BPF_LSM_CGROUP, 0);
+	if (!ASSERT_OK(err, "attach post_create_prog_fd"))
+		goto close_cgroup;
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_post_create"), 1, "prog count");
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 3, "total prog count");
+
+	attach_opts.replace_prog_fd = post_create_prog_fd;
+	err = bpf_prog_attach_opts(post_create_prog_fd2, cgroup_fd,
+				   BPF_LSM_CGROUP, &attach_opts);
+	if (!ASSERT_OK(err, "prog replace post_create_prog_fd"))
+		goto detach_cgroup;
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_post_create"), 1, "prog count");
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 3, "total prog count");
+
+	/* Try the same attach/replace via link API. */
+
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_bind"), 0, "prog count");
+	bind_link_fd = bpf_link_create(bind_prog_fd, cgroup_fd,
+				       BPF_LSM_CGROUP, NULL);
+	if (!ASSERT_GE(bind_link_fd, 0, "link create bind_prog_fd"))
+		goto detach_cgroup;
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_bind"), 1, "prog count");
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 4, "total prog count");
+
+	update_opts.old_prog_fd = bind_prog_fd;
+	update_opts.flags = BPF_F_REPLACE;
+
+	err = bpf_link_update(bind_link_fd, bind_prog_fd2, &update_opts);
+	if (!ASSERT_OK(err, "link update bind_prog_fd"))
+		goto detach_cgroup;
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_bind"), 1, "prog count");
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 4, "total prog count");
+
+	/* Attach another instance of bind program to another cgroup.
+	 * This should trigger the reuse of the trampoline shim (two
+	 * programs attaching to the same btf_id).
+	 */
+
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_bind"), 1, "prog count");
+	ASSERT_EQ(query_prog_cnt(cgroup_fd2, "bpf_lsm_socket_bind"), 0, "prog count");
+	bind_link_fd2 = bpf_link_create(bind_prog_fd2, cgroup_fd2,
+					BPF_LSM_CGROUP, NULL);
+	if (!ASSERT_GE(bind_link_fd2, 0, "link create bind_prog_fd2"))
+		goto detach_cgroup;
+	ASSERT_EQ(query_prog_cnt(cgroup_fd2, "bpf_lsm_socket_bind"), 1, "prog count");
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 4, "total prog count");
+	ASSERT_EQ(query_prog_cnt(cgroup_fd2, NULL), 1, "total prog count");
+
+	/* AF_UNIX is prohibited. */
+
+	fd = socket(AF_UNIX, SOCK_STREAM, 0);
+	ASSERT_LT(fd, 0, "socket(AF_UNIX)");
+
+	/* AF_INET6 gets default policy (sk_priority). */
+
+	fd = socket(AF_INET6, SOCK_STREAM, 0);
+	if (!ASSERT_GE(fd, 0, "socket(SOCK_STREAM)"))
+		goto detach_cgroup;
+
+	prio = 0;
+	socklen = sizeof(prio);
+	ASSERT_GE(getsockopt(fd, SOL_SOCKET, SO_PRIORITY, &prio, &socklen), 0,
+		  "getsockopt");
+	ASSERT_EQ(prio, 123, "sk_priority");
+
+	close(fd);
+
+	/* TX-only AF_PACKET is allowed. */
+
+	ASSERT_LT(socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL)), 0,
+		  "socket(AF_PACKET, ..., ETH_P_ALL)");
+
+	fd = socket(AF_PACKET, SOCK_RAW, 0);
+	ASSERT_GE(fd, 0, "socket(AF_PACKET, ..., 0)");
+
+	/* TX-only AF_PACKET can not be rebound. */
+
+	struct sockaddr_ll sa = {
+		.sll_family = AF_PACKET,
+		.sll_protocol = htons(ETH_P_ALL),
+	};
+	ASSERT_LT(bind(fd, (struct sockaddr *)&sa, sizeof(sa)), 0,
+		  "bind(ETH_P_ALL)");
+
+	close(fd);
+
+	/* Trigger passive open. */
+
+	listen_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0);
+	ASSERT_GE(listen_fd, 0, "start_server");
+	client_fd = connect_to_fd(listen_fd, 0);
+	ASSERT_GE(client_fd, 0, "connect_to_fd");
+	accepted_fd = accept(listen_fd, NULL, NULL);
+	ASSERT_GE(accepted_fd, 0, "accept");
+
+	prio = 0;
+	socklen = sizeof(prio);
+	ASSERT_GE(getsockopt(accepted_fd, SOL_SOCKET, SO_PRIORITY, &prio, &socklen), 0,
+		  "getsockopt");
+	ASSERT_EQ(prio, 234, "sk_priority");
+
+	/* These are replaced and never called. */
+	ASSERT_EQ(skel->bss->called_socket_post_create, 0, "called_create");
+	ASSERT_EQ(skel->bss->called_socket_bind, 0, "called_bind");
+
+	/* AF_INET6+SOCK_STREAM
+	 * AF_PACKET+SOCK_RAW
+	 * listen_fd
+	 * client_fd
+	 * accepted_fd
+	 */
+	ASSERT_EQ(skel->bss->called_socket_post_create2, 5, "called_create2");
+
+	/* start_server
+	 * bind(ETH_P_ALL)
+	 */
+	ASSERT_EQ(skel->bss->called_socket_bind2, 2, "called_bind2");
+	/* Single accept(). */
+	ASSERT_EQ(skel->bss->called_socket_clone, 1, "called_clone");
+
+	/* AF_UNIX+SOCK_STREAM (failed)
+	 * AF_INET6+SOCK_STREAM
+	 * AF_PACKET+SOCK_RAW (failed)
+	 * AF_PACKET+SOCK_RAW
+	 * listen_fd
+	 * client_fd
+	 * accepted_fd
+	 */
+	ASSERT_EQ(skel->bss->called_socket_alloc, 7, "called_alloc");
+
+	/* Make sure other cgroup doesn't trigger the programs. */
+
+	if (!ASSERT_OK(join_cgroup(""), "join root cgroup"))
+		goto detach_cgroup;
+
+	fd = socket(AF_INET6, SOCK_STREAM, 0);
+	if (!ASSERT_GE(fd, 0, "socket(SOCK_STREAM)"))
+		goto detach_cgroup;
+
+	prio = 0;
+	socklen = sizeof(prio);
+	ASSERT_GE(getsockopt(fd, SOL_SOCKET, SO_PRIORITY, &prio, &socklen), 0,
+		  "getsockopt");
+	ASSERT_EQ(prio, 0, "sk_priority");
+
+	close(fd);
+
+detach_cgroup:
+	ASSERT_GE(bpf_prog_detach2(post_create_prog_fd2, cgroup_fd,
+				   BPF_LSM_CGROUP), 0, "detach_create");
+	close(bind_link_fd);
+	/* Don't close bind_link_fd2, exercise cgroup release cleanup. */
+	ASSERT_GE(bpf_prog_detach2(alloc_prog_fd, cgroup_fd,
+				   BPF_LSM_CGROUP), 0, "detach_alloc");
+	ASSERT_GE(bpf_prog_detach2(clone_prog_fd, cgroup_fd,
+				   BPF_LSM_CGROUP), 0, "detach_clone");
+
+close_cgroup:
+	close(cgroup_fd);
+close_skel:
+	lsm_cgroup__destroy(skel);
+}
+
+void test_lsm_cgroup(void)
+{
+	if (test__start_subtest("functional"))
+		test_lsm_cgroup_functional();
+}
diff --git a/tools/testing/selftests/bpf/progs/lsm_cgroup.c b/tools/testing/selftests/bpf/progs/lsm_cgroup.c
new file mode 100644
index 000000000000..a263830900e2
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/lsm_cgroup.c
@@ -0,0 +1,160 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+#ifndef AF_PACKET
+#define AF_PACKET 17
+#endif
+
+#ifndef AF_UNIX
+#define AF_UNIX 1
+#endif
+
+#ifndef EPERM
+#define EPERM 1
+#endif
+
+struct {
+	__uint(type, BPF_MAP_TYPE_CGROUP_STORAGE);
+	__type(key, __u64);
+	__type(value, __u64);
+} cgroup_storage SEC(".maps");
+
+int called_socket_post_create;
+int called_socket_post_create2;
+int called_socket_bind;
+int called_socket_bind2;
+int called_socket_alloc;
+int called_socket_clone;
+
+static __always_inline int test_local_storage(void)
+{
+	__u64 *val;
+
+	val = bpf_get_local_storage(&cgroup_storage, 0);
+	if (!val)
+		return 0;
+	*val += 1;
+
+	return 1;
+}
+
+static __always_inline int real_create(struct socket *sock, int family,
+				       int protocol)
+{
+	struct sock *sk;
+
+	/* Reject non-tx-only AF_PACKET. */
+	if (family == AF_PACKET && protocol != 0)
+		return 0; /* EPERM */
+
+	sk = sock->sk;
+	if (!sk)
+		return 1;
+
+	/* The rest of the sockets get default policy. */
+	sk->sk_priority = 123;
+
+	/* Can access cgroup local storage. */
+	if (!test_local_storage())
+		return 0; /* EPERM */
+
+	return 1;
+}
+
+/* __cgroup_bpf_run_lsm_socket */
+SEC("lsm_cgroup/socket_post_create")
+int BPF_PROG(socket_post_create, struct socket *sock, int family,
+	     int type, int protocol, int kern)
+{
+	called_socket_post_create++;
+	return real_create(sock, family, protocol);
+}
+
+/* __cgroup_bpf_run_lsm_socket */
+SEC("lsm_cgroup/socket_post_create")
+int BPF_PROG(socket_post_create2, struct socket *sock, int family,
+	     int type, int protocol, int kern)
+{
+	called_socket_post_create2++;
+	return real_create(sock, family, protocol);
+}
+
+static __always_inline int real_bind(struct socket *sock,
+				     struct sockaddr *address,
+				     int addrlen)
+{
+	struct sockaddr_ll sa = {};
+
+	if (sock->sk->__sk_common.skc_family != AF_PACKET)
+		return 1;
+
+	if (sock->sk->sk_kern_sock)
+		return 1;
+
+	bpf_probe_read_kernel(&sa, sizeof(sa), address);
+	if (sa.sll_protocol)
+		return 0; /* EPERM */
+
+	/* Can access cgroup local storage. */
+	if (!test_local_storage())
+		return 0; /* EPERM */
+
+	return 1;
+}
+
+/* __cgroup_bpf_run_lsm_socket */
+SEC("lsm_cgroup/socket_bind")
+int BPF_PROG(socket_bind, struct socket *sock, struct sockaddr *address,
+	     int addrlen)
+{
+	called_socket_bind++;
+	return real_bind(sock, address, addrlen);
+}
+
+/* __cgroup_bpf_run_lsm_socket */
+SEC("lsm_cgroup/socket_bind")
+int BPF_PROG(socket_bind2, struct socket *sock, struct sockaddr *address,
+	     int addrlen)
+{
+	called_socket_bind2++;
+	return real_bind(sock, address, addrlen);
+}
+
+/* __cgroup_bpf_run_lsm_current (via bpf_lsm_current_hooks) */
+SEC("lsm_cgroup/sk_alloc_security")
+int BPF_PROG(socket_alloc, struct sock *sk, int family, gfp_t priority)
+{
+	called_socket_alloc++;
+	if (family == AF_UNIX)
+		return 0; /* EPERM */
+
+	/* Can access cgroup local storage. */
+	if (!test_local_storage())
+		return 0; /* EPERM */
+
+	return 1;
+}
+
+/* __cgroup_bpf_run_lsm_sock */
+SEC("lsm_cgroup/inet_csk_clone")
+int BPF_PROG(socket_clone, struct sock *newsk, const struct request_sock *req)
+{
+	called_socket_clone++;
+
+	if (!newsk)
+		return 1;
+
+	/* Accepted request sockets get a different priority. */
+	newsk->sk_priority = 234;
+
+	/* Can access cgroup local storage. */
+	if (!test_local_storage())
+		return 0; /* EPERM */
+
+	return 1;
+}
-- 
2.36.1.124.g0e6072fb45-goog


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH bpf-next v7 11/11] selftests/bpf: verify lsm_cgroup struct sock access
  2022-05-18 22:55 [PATCH bpf-next v7 00/11] bpf: cgroup_sock lsm flavor Stanislav Fomichev
                   ` (9 preceding siblings ...)
  2022-05-18 22:55 ` [PATCH bpf-next v7 10/11] selftests/bpf: lsm_cgroup functional test Stanislav Fomichev
@ 2022-05-18 22:55 ` Stanislav Fomichev
  2022-05-23 23:33   ` Andrii Nakryiko
  2022-05-19 23:34 ` [PATCH bpf-next v7 00/11] bpf: cgroup_sock lsm flavor Yonghong Song
  11 siblings, 1 reply; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-18 22:55 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

sk_priority & sk_mark are writable, the rest is readonly.

One interesting thing here is that the verifier doesn't
really force me to add NULL checks anywhere :-/

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 .../selftests/bpf/prog_tests/lsm_cgroup.c     | 69 +++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c b/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
index 29292ec40343..64b6830e03f5 100644
--- a/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
+++ b/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
@@ -270,8 +270,77 @@ static void test_lsm_cgroup_functional(void)
 	lsm_cgroup__destroy(skel);
 }
 
+static int field_offset(const char *type, const char *field)
+{
+	const struct btf_member *memb;
+	const struct btf_type *tp;
+	const char *name;
+	struct btf *btf;
+	int btf_id;
+	int i;
+
+	btf = btf__load_vmlinux_btf();
+	if (!btf)
+		return -1;
+
+	btf_id = btf__find_by_name_kind(btf, type, BTF_KIND_STRUCT);
+	if (btf_id < 0)
+		return -1;
+
+	tp = btf__type_by_id(btf, btf_id);
+	memb = btf_members(tp);
+
+	for (i = 0; i < btf_vlen(tp); i++) {
+		name = btf__name_by_offset(btf,
+					   memb->name_off);
+		if (strcmp(field, name) == 0)
+			return memb->offset / 8;
+		memb++;
+	}
+
+	return -1;
+}
+
+static bool sk_writable_field(const char *type, const char *field, int size)
+{
+	LIBBPF_OPTS(bpf_prog_load_opts, opts,
+		    .expected_attach_type = BPF_LSM_CGROUP);
+	struct bpf_insn	insns[] = {
+		/* r1 = *(u64 *)(r1 + 0) */
+		BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, 0),
+		/* r1 = *(u64 *)(r1 + offsetof(struct socket, sk)) */
+		BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, field_offset("socket", "sk")),
+		/* r2 = *(u64 *)(r1 + offsetof(struct sock, <field>)) */
+		BPF_LDX_MEM(size, BPF_REG_2, BPF_REG_1, field_offset(type, field)),
+		/* *(u64 *)(r1 + offsetof(struct sock, <field>)) = r2 */
+		BPF_STX_MEM(size, BPF_REG_1, BPF_REG_2, field_offset(type, field)),
+		BPF_MOV64_IMM(BPF_REG_0, 1),
+		BPF_EXIT_INSN(),
+	};
+	int fd;
+
+	opts.attach_btf_id = libbpf_find_vmlinux_btf_id("socket_post_create",
+							opts.expected_attach_type);
+
+	fd = bpf_prog_load(BPF_PROG_TYPE_LSM, NULL, "GPL", insns, ARRAY_SIZE(insns), &opts);
+	if (fd >= 0)
+		close(fd);
+	return fd >= 0;
+}
+
+static void test_lsm_cgroup_access(void)
+{
+	ASSERT_FALSE(sk_writable_field("sock_common", "skc_family", BPF_H), "skc_family");
+	ASSERT_FALSE(sk_writable_field("sock", "sk_sndtimeo", BPF_DW), "sk_sndtimeo");
+	ASSERT_TRUE(sk_writable_field("sock", "sk_priority", BPF_W), "sk_priority");
+	ASSERT_TRUE(sk_writable_field("sock", "sk_mark", BPF_W), "sk_mark");
+	ASSERT_FALSE(sk_writable_field("sock", "sk_pacing_rate", BPF_DW), "sk_pacing_rate");
+}
+
 void test_lsm_cgroup(void)
 {
 	if (test__start_subtest("functional"))
 		test_lsm_cgroup_functional();
+	if (test__start_subtest("access"))
+		test_lsm_cgroup_access();
 }
-- 
2.36.1.124.g0e6072fb45-goog


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-05-18 22:55 ` [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP Stanislav Fomichev
@ 2022-05-19  2:31   ` kernel test robot
  2022-05-19 14:57   ` kernel test robot
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 54+ messages in thread
From: kernel test robot @ 2022-05-19  2:31 UTC (permalink / raw)
  To: Stanislav Fomichev, netdev, bpf
  Cc: kbuild-all, ast, daniel, andrii, Stanislav Fomichev

Hi Stanislav,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Stanislav-Fomichev/bpf-cgroup_sock-lsm-flavor/20220519-065944
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: m68k-allyesconfig (https://download.01.org/0day-ci/archive/20220519/202205191035.Ja5udws3-lkp@intel.com/config)
compiler: m68k-linux-gcc (GCC) 11.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/f3b115441e4b11ef3e65cad30e1c8fb7a2becfab
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Stanislav-Fomichev/bpf-cgroup_sock-lsm-flavor/20220519-065944
        git checkout f3b115441e4b11ef3e65cad30e1c8fb7a2becfab
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.3.0 make.cross W=1 O=build_dir ARCH=m68k SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   kernel/bpf/cgroup.c: In function '__cgroup_bpf_query':
>> kernel/bpf/cgroup.c:1091:30: error: 'CGROUP_LSM_START' undeclared (first use in this function); did you mean 'CGROUP_LSM_NUM'?
    1091 |                 from_atype = CGROUP_LSM_START;
         |                              ^~~~~~~~~~~~~~~~
         |                              CGROUP_LSM_NUM
   kernel/bpf/cgroup.c:1091:30: note: each undeclared identifier is reported only once for each function it appears in
>> kernel/bpf/cgroup.c:1092:28: error: 'CGROUP_LSM_END' undeclared (first use in this function); did you mean 'CGROUP_LSM_NUM'?
    1092 |                 to_atype = CGROUP_LSM_END;
         |                            ^~~~~~~~~~~~~~
         |                            CGROUP_LSM_NUM
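
CGROUP_LSM_START/CGROUP_LSM_END are presumably only defined when
CONFIG_BPF_LSM is enabled, and this config ends up without it (note the
compiler can only suggest CGROUP_LSM_NUM). A minimal sketch of one way to
keep __cgroup_bpf_query() building in that case (illustrative only, not
necessarily the fix taken in a later revision) would be to guard the use
site:

	if (type == BPF_LSM_CGROUP) {
#ifdef CONFIG_BPF_LSM
		from_atype = CGROUP_LSM_START;
		to_atype = CGROUP_LSM_END;
#else
		return -EOPNOTSUPP;
#endif
	} else {
		from_atype = to_cgroup_bpf_attach_type(type);
		if (from_atype < 0)
			return -EINVAL;
		to_atype = from_atype;
	}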


vim +1091 kernel/bpf/cgroup.c

  1072	
  1073	/* Must be called with cgroup_mutex held to avoid races. */
  1074	static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
  1075				      union bpf_attr __user *uattr)
  1076	{
  1077		__u32 __user *prog_attach_flags = u64_to_user_ptr(attr->query.prog_attach_flags);
  1078		__u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids);
  1079		enum bpf_attach_type type = attr->query.attach_type;
  1080		enum cgroup_bpf_attach_type atype;
  1081		struct bpf_prog_array *effective;
  1082		struct hlist_head *progs;
  1083		struct bpf_prog *prog;
  1084		int cnt, ret = 0, i;
  1085		int total_cnt = 0;
  1086		u32 flags;
  1087	
  1088		enum cgroup_bpf_attach_type from_atype, to_atype;
  1089	
  1090		if (type == BPF_LSM_CGROUP) {
> 1091			from_atype = CGROUP_LSM_START;
> 1092			to_atype = CGROUP_LSM_END;
  1093		} else {
  1094			from_atype = to_cgroup_bpf_attach_type(type);
  1095			if (from_atype < 0)
  1096				return -EINVAL;
  1097			to_atype = from_atype;
  1098		}
  1099	
  1100		for (atype = from_atype; atype <= to_atype; atype++) {
  1101			progs = &cgrp->bpf.progs[atype];
  1102			flags = cgrp->bpf.flags[atype];
  1103	
  1104			effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
  1105							      lockdep_is_held(&cgroup_mutex));
  1106	
  1107			if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
  1108				total_cnt += bpf_prog_array_length(effective);
  1109			else
  1110				total_cnt += prog_list_length(progs);
  1111		}
  1112	
  1113		if (type != BPF_LSM_CGROUP)
  1114			if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
  1115				return -EFAULT;
  1116		if (copy_to_user(&uattr->query.prog_cnt, &total_cnt, sizeof(total_cnt)))
  1117			return -EFAULT;
  1118		if (attr->query.prog_cnt == 0 || !prog_ids || !total_cnt)
  1119			/* return early if user requested only program count + flags */
  1120			return 0;
  1121	
  1122		if (attr->query.prog_cnt < total_cnt) {
  1123			total_cnt = attr->query.prog_cnt;
  1124			ret = -ENOSPC;
  1125		}
  1126	
  1127		for (atype = from_atype; atype <= to_atype; atype++) {
  1128			if (total_cnt <= 0)
  1129				break;
  1130	
  1131			progs = &cgrp->bpf.progs[atype];
  1132			flags = cgrp->bpf.flags[atype];
  1133	
  1134			effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
  1135							      lockdep_is_held(&cgroup_mutex));
  1136	
  1137			if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
  1138				cnt = bpf_prog_array_length(effective);
  1139			else
  1140				cnt = prog_list_length(progs);
  1141	
  1142			if (cnt >= total_cnt)
  1143				cnt = total_cnt;
  1144	
  1145			if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
  1146				ret = bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
  1147			} else {
  1148				struct bpf_prog_list *pl;
  1149				u32 id;
  1150	
  1151				i = 0;
  1152				hlist_for_each_entry(pl, progs, node) {
  1153					prog = prog_list_prog(pl);
  1154					id = prog->aux->id;
  1155					if (copy_to_user(prog_ids + i, &id, sizeof(id)))
  1156						return -EFAULT;
  1157					if (++i == cnt)
  1158						break;
  1159				}
  1160			}
  1161	
  1162			if (prog_attach_flags)
  1163				for (i = 0; i < cnt; i++)
  1164					if (copy_to_user(prog_attach_flags + i, &flags, sizeof(flags)))
  1165						return -EFAULT;
  1166	
  1167			prog_ids += cnt;
  1168			total_cnt -= cnt;
  1169			if (prog_attach_flags)
  1170				prog_attach_flags += cnt;
  1171		}
  1172		return ret;
  1173	}
  1174	

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-05-18 22:55 ` [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP Stanislav Fomichev
  2022-05-19  2:31   ` kernel test robot
@ 2022-05-19 14:57   ` kernel test robot
  2022-05-23 23:23   ` Andrii Nakryiko
  2022-05-24  3:48   ` Martin KaFai Lau
  3 siblings, 0 replies; 54+ messages in thread
From: kernel test robot @ 2022-05-19 14:57 UTC (permalink / raw)
  To: Stanislav Fomichev, netdev, bpf
  Cc: llvm, kbuild-all, ast, daniel, andrii, Stanislav Fomichev

Hi Stanislav,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Stanislav-Fomichev/bpf-cgroup_sock-lsm-flavor/20220519-065944
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: hexagon-randconfig-r035-20220519 (https://download.01.org/0day-ci/archive/20220519/202205192210.FPl5GoFS-lkp@intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project e00cbbec06c08dc616a0d52a20f678b8fbd4e304)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/f3b115441e4b11ef3e65cad30e1c8fb7a2becfab
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Stanislav-Fomichev/bpf-cgroup_sock-lsm-flavor/20220519-065944
        git checkout f3b115441e4b11ef3e65cad30e1c8fb7a2becfab
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=hexagon SHELL=/bin/bash kernel/bpf/ lib/

If you fix the issue, kindly add the following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

>> kernel/bpf/cgroup.c:1091:16: error: use of undeclared identifier 'CGROUP_LSM_START'
                   from_atype = CGROUP_LSM_START;
                                ^
>> kernel/bpf/cgroup.c:1092:14: error: use of undeclared identifier 'CGROUP_LSM_END'
                   to_atype = CGROUP_LSM_END;
                              ^
   2 errors generated.


vim +/CGROUP_LSM_START +1091 kernel/bpf/cgroup.c

  1072	
  1073	/* Must be called with cgroup_mutex held to avoid races. */
  1074	static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
  1075				      union bpf_attr __user *uattr)
  1076	{
  1077		__u32 __user *prog_attach_flags = u64_to_user_ptr(attr->query.prog_attach_flags);
  1078		__u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids);
  1079		enum bpf_attach_type type = attr->query.attach_type;
  1080		enum cgroup_bpf_attach_type atype;
  1081		struct bpf_prog_array *effective;
  1082		struct hlist_head *progs;
  1083		struct bpf_prog *prog;
  1084		int cnt, ret = 0, i;
  1085		int total_cnt = 0;
  1086		u32 flags;
  1087	
  1088		enum cgroup_bpf_attach_type from_atype, to_atype;
  1089	
  1090		if (type == BPF_LSM_CGROUP) {
> 1091			from_atype = CGROUP_LSM_START;
> 1092			to_atype = CGROUP_LSM_END;
  1093		} else {
  1094			from_atype = to_cgroup_bpf_attach_type(type);
  1095			if (from_atype < 0)
  1096				return -EINVAL;
  1097			to_atype = from_atype;
  1098		}
  1099	
  1100		for (atype = from_atype; atype <= to_atype; atype++) {
  1101			progs = &cgrp->bpf.progs[atype];
  1102			flags = cgrp->bpf.flags[atype];
  1103	
  1104			effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
  1105							      lockdep_is_held(&cgroup_mutex));
  1106	
  1107			if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
  1108				total_cnt += bpf_prog_array_length(effective);
  1109			else
  1110				total_cnt += prog_list_length(progs);
  1111		}
  1112	
  1113		if (type != BPF_LSM_CGROUP)
  1114			if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
  1115				return -EFAULT;
  1116		if (copy_to_user(&uattr->query.prog_cnt, &total_cnt, sizeof(total_cnt)))
  1117			return -EFAULT;
  1118		if (attr->query.prog_cnt == 0 || !prog_ids || !total_cnt)
  1119			/* return early if user requested only program count + flags */
  1120			return 0;
  1121	
  1122		if (attr->query.prog_cnt < total_cnt) {
  1123			total_cnt = attr->query.prog_cnt;
  1124			ret = -ENOSPC;
  1125		}
  1126	
  1127		for (atype = from_atype; atype <= to_atype; atype++) {
  1128			if (total_cnt <= 0)
  1129				break;
  1130	
  1131			progs = &cgrp->bpf.progs[atype];
  1132			flags = cgrp->bpf.flags[atype];
  1133	
  1134			effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
  1135							      lockdep_is_held(&cgroup_mutex));
  1136	
  1137			if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
  1138				cnt = bpf_prog_array_length(effective);
  1139			else
  1140				cnt = prog_list_length(progs);
  1141	
  1142			if (cnt >= total_cnt)
  1143				cnt = total_cnt;
  1144	
  1145			if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
  1146				ret = bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
  1147			} else {
  1148				struct bpf_prog_list *pl;
  1149				u32 id;
  1150	
  1151				i = 0;
  1152				hlist_for_each_entry(pl, progs, node) {
  1153					prog = prog_list_prog(pl);
  1154					id = prog->aux->id;
  1155					if (copy_to_user(prog_ids + i, &id, sizeof(id)))
  1156						return -EFAULT;
  1157					if (++i == cnt)
  1158						break;
  1159				}
  1160			}
  1161	
  1162			if (prog_attach_flags)
  1163				for (i = 0; i < cnt; i++)
  1164					if (copy_to_user(prog_attach_flags + i, &flags, sizeof(flags)))
  1165						return -EFAULT;
  1166	
  1167			prog_ids += cnt;
  1168			total_cnt -= cnt;
  1169			if (prog_attach_flags)
  1170				prog_attach_flags += cnt;
  1171		}
  1172		return ret;
  1173	}
  1174	
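
Same error as the gcc/m68k report above. For context, a rough sketch of how
this query path is meant to be driven from userspace once the libbpf side
(patch 07, bpf_prog_query_opts) is in place; the struct field names below
are assumed from that patch and may not match the final API exactly:

	#include <bpf/bpf.h>

	/* hypothetical helper: list BPF_LSM_CGROUP programs on a cgroup fd */
	static int dump_lsm_cgroup_progs(int cgroup_fd)
	{
		LIBBPF_OPTS(bpf_prog_query_opts, opts);
		__u32 prog_ids[16] = {};
		__u32 prog_flags[16] = {};

		opts.prog_ids = prog_ids;
		opts.prog_attach_flags = prog_flags;
		opts.prog_cnt = 16;

		/* BPF_LSM_CGROUP walks CGROUP_LSM_START..CGROUP_LSM_END, so
		 * the result may mix programs attached to different LSM hooks */
		return bpf_prog_query_opts(cgroup_fd, BPF_LSM_CGROUP, &opts);
	}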

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 00/11] bpf: cgroup_sock lsm flavor
  2022-05-18 22:55 [PATCH bpf-next v7 00/11] bpf: cgroup_sock lsm flavor Stanislav Fomichev
                   ` (10 preceding siblings ...)
  2022-05-18 22:55 ` [PATCH bpf-next v7 11/11] selftests/bpf: verify lsm_cgroup struct sock access Stanislav Fomichev
@ 2022-05-19 23:34 ` Yonghong Song
  2022-05-19 23:39   ` Stanislav Fomichev
  11 siblings, 1 reply; 54+ messages in thread
From: Yonghong Song @ 2022-05-19 23:34 UTC (permalink / raw)
  To: Stanislav Fomichev, netdev, bpf
  Cc: ast, daniel, andrii, kafai, kpsingh, jakub



On 5/18/22 3:55 PM, Stanislav Fomichev wrote:
> This series implements new lsm flavor for attaching per-cgroup programs to
> existing lsm hooks. The cgroup is taken out of 'current', unless
> the first argument of the hook is 'struct socket'. In this case,
> the cgroup association is taken out of socket. The attachment
> looks like a regular per-cgroup attachment: we add new BPF_LSM_CGROUP
> attach type which, together with attach_btf_id, signals per-cgroup lsm.
> Behind the scenes, we allocate trampoline shim program and
> attach to lsm. This program looks up cgroup from current/socket
> and runs cgroup's effective prog array. The rest of the per-cgroup BPF
> stays the same: hierarchy, local storage, retval conventions
> (return 1 == success).
> 
> Current limitations:
> * haven't considered sleepable bpf; can be extended later on
> * not sure the verifier does the right thing with null checks;
>    see latest selftest for details
> * total of 10 (global) per-cgroup LSM attach points
> 
> Cc: ast@kernel.org
> Cc: daniel@iogearbox.net
> Cc: kafai@fb.com
> Cc: kpsingh@kernel.org
> Cc: jakub@cloudflare.com
> 
> v7:
> - there were a lot of comments last time, hope I didn't forget anything,
>    some of the bigger ones:
>    - Martin: use/extend BTF_SOCK_TYPE_SOCKET
>    - Martin: expose bpf_set_retval
>    - Martin: reject 'return 0' at the verifier for 'void' hooks
>    - Martin: prog_query returns all BPF_LSM_CGROUP, prog_info
>      returns attach_btf_func_id
>    - Andrii: split libbpf changes
>    - Andrii: add field access test to test_progs, not test_verifier (still
>      using asm though)
> - things that I haven't addressed, stating them here explicitly, let
>    me know if some of these are still problematic:
>    1. Andrii: exposing only link-based api: seems like the changes
>       to support non-link-based ones are minimal, couple of lines,
>       so seems like it worth having it?
>    2. Alexei: applying cgroup_atype for all cgroup hooks, not only
>       cgroup lsm: looks a bit harder to apply everywhere that I
>       originally thought; with lsm cgroup, we have a shim_prog pointer where
>       we store cgroup_atype; for non-lsm programs, we don't have a
>       trace program where to store it, so we still need some kind
>       of global table to map from "static" hook to "dynamic" slot.
>       So I'm dropping this "can be easily extended" clause from the
>       description for now. I have converted this whole machinery
>       to an RCU-managed list to remove synchronize_rcu().
> - also note that I had to introduce new bpf_shim_tramp_link and
>    moved refcnt there; we need something to manage new bpf_tramp_link
> 
> v6:
> - remove active count & stats for shim program (Martin KaFai Lau)
> - remove NULL/error check for btf_vmlinux (Martin)
> - don't check cgroup_atype in bpf_cgroup_lsm_shim_release (Martin)
> - use old_prog (instead of passed one) in __cgroup_bpf_detach (Martin)
> - make sure attach_btf_id is the same in __cgroup_bpf_replace (Martin)
> - enable cgroup local storage and test it (Martin)
> - properly implement prog query and add bpftool & tests (Martin)
> - prohibit non-shared cgroup storage mode for BPF_LSM_CGROUP (Martin)
> 
> v5:
> - __cgroup_bpf_run_lsm_socket remove NULL sock/sk checks (Martin KaFai Lau)
> - __cgroup_bpf_run_lsm_{socket,current} s/prog/shim_prog/ (Martin)
> - make sure bpf_lsm_find_cgroup_shim works for hooks without args (Martin)
> - __cgroup_bpf_attach make sure attach_btf_id is the same when replacing (Martin)
> - call bpf_cgroup_lsm_shim_release only for LSM_CGROUP (Martin)
> - drop BPF_LSM_CGROUP from bpf_attach_type_to_tramp (Martin)
> - drop jited check from cgroup_shim_find (Martin)
> - new patch to convert cgroup_bpf to hlist_node (Jakub Sitnicki)
> - new shim flavor for 'struct sock' + list of exceptions (Martin)
> 
> v4:
> - fix build when jit is on but syscall is off
> 
> v3:
> - add BPF_LSM_CGROUP to bpftool
> - use simple int instead of refcnt_t (to avoid use-after-free
>    false positive)
> 
> v2:
> - addressed build bot failures
> 
> Stanislav Fomichev (11):
>    bpf: add bpf_func_t and trampoline helpers
>    bpf: convert cgroup_bpf.progs to hlist
>    bpf: per-cgroup lsm flavor
>    bpf: minimize number of allocated lsm slots per program
>    bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
>    bpf: allow writing to a subset of sock fields from lsm progtype
>    libbpf: implement bpf_prog_query_opts
>    libbpf: add lsm_cgoup_sock type
>    bpftool: implement cgroup tree for BPF_LSM_CGROUP
>    selftests/bpf: lsm_cgroup functional test
>    selftests/bpf: verify lsm_cgroup struct sock access
> 
>   arch/x86/net/bpf_jit_comp.c                   |  24 +-
>   include/linux/bpf-cgroup-defs.h               |  11 +-
>   include/linux/bpf-cgroup.h                    |   9 +-
>   include/linux/bpf.h                           |  36 +-
>   include/linux/bpf_lsm.h                       |   8 +
>   include/linux/btf_ids.h                       |   3 +-
>   include/uapi/linux/bpf.h                      |   6 +
>   kernel/bpf/bpf_lsm.c                          | 103 ++++
>   kernel/bpf/btf.c                              |  11 +
>   kernel/bpf/cgroup.c                           | 487 +++++++++++++++---
>   kernel/bpf/core.c                             |   2 +
>   kernel/bpf/syscall.c                          |  14 +-
>   kernel/bpf/trampoline.c                       | 244 ++++++++-
>   kernel/bpf/verifier.c                         |  31 +-
>   tools/bpf/bpftool/cgroup.c                    |  77 ++-
>   tools/bpf/bpftool/common.c                    |   1 +
>   tools/include/linux/btf_ids.h                 |   4 +-
>   tools/include/uapi/linux/bpf.h                |   6 +
>   tools/lib/bpf/bpf.c                           |  42 +-
>   tools/lib/bpf/bpf.h                           |  15 +
>   tools/lib/bpf/libbpf.c                        |   2 +
>   tools/lib/bpf/libbpf.map                      |   1 +
>   .../selftests/bpf/prog_tests/lsm_cgroup.c     | 346 +++++++++++++
>   .../testing/selftests/bpf/progs/lsm_cgroup.c  | 160 ++++++
>   24 files changed, 1480 insertions(+), 163 deletions(-)
>   create mode 100644 tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
>   create mode 100644 tools/testing/selftests/bpf/progs/lsm_cgroup.c

There are 4 test failures for test_progs in CI.
 
https://github.com/kernel-patches/bpf/runs/6511113546?check_suite_focus=true
All have error messages like:
     At program exit the register R0 has value (0xffffffff; 0x0) should 
have been in (0x0; 0x1)

Could you take a look?



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 00/11] bpf: cgroup_sock lsm flavor
  2022-05-19 23:34 ` [PATCH bpf-next v7 00/11] bpf: cgroup_sock lsm flavor Yonghong Song
@ 2022-05-19 23:39   ` Stanislav Fomichev
  0 siblings, 0 replies; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-19 23:39 UTC (permalink / raw)
  To: Yonghong Song; +Cc: netdev, bpf, ast, daniel, andrii, kafai, kpsingh, jakub

On Thu, May 19, 2022 at 4:34 PM Yonghong Song <yhs@fb.com> wrote:
>
>
>
> On 5/18/22 3:55 PM, Stanislav Fomichev wrote:
> > This series implements new lsm flavor for attaching per-cgroup programs to
> > existing lsm hooks. The cgroup is taken out of 'current', unless
> > the first argument of the hook is 'struct socket'. In this case,
> > the cgroup association is taken out of socket. The attachment
> > looks like a regular per-cgroup attachment: we add new BPF_LSM_CGROUP
> > attach type which, together with attach_btf_id, signals per-cgroup lsm.
> > Behind the scenes, we allocate trampoline shim program and
> > attach to lsm. This program looks up cgroup from current/socket
> > and runs cgroup's effective prog array. The rest of the per-cgroup BPF
> > stays the same: hierarchy, local storage, retval conventions
> > (return 1 == success).
> >
> > Current limitations:
> > * haven't considered sleepable bpf; can be extended later on
> > * not sure the verifier does the right thing with null checks;
> >    see latest selftest for details
> > * total of 10 (global) per-cgroup LSM attach points
> >
> > Cc: ast@kernel.org
> > Cc: daniel@iogearbox.net
> > Cc: kafai@fb.com
> > Cc: kpsingh@kernel.org
> > Cc: jakub@cloudflare.com
> >
> > v7:
> > - there were a lot of comments last time, hope I didn't forget anything,
> >    some of the bigger ones:
> >    - Martin: use/extend BTF_SOCK_TYPE_SOCKET
> >    - Martin: expose bpf_set_retval
> >    - Martin: reject 'return 0' at the verifier for 'void' hooks
> >    - Martin: prog_query returns all BPF_LSM_CGROUP, prog_info
> >      returns attach_btf_func_id
> >    - Andrii: split libbpf changes
> >    - Andrii: add field access test to test_progs, not test_verifier (still
> >      using asm though)
> > - things that I haven't addressed, stating them here explicitly, let
> >    me know if some of these are still problematic:
> >    1. Andrii: exposing only link-based api: seems like the changes
> >       to support non-link-based ones are minimal, couple of lines,
> >       so seems like it worth having it?
> >    2. Alexei: applying cgroup_atype for all cgroup hooks, not only
> >       cgroup lsm: looks a bit harder to apply everywhere that I
> >       originally thought; with lsm cgroup, we have a shim_prog pointer where
> >       we store cgroup_atype; for non-lsm programs, we don't have a
> >       trace program where to store it, so we still need some kind
> >       of global table to map from "static" hook to "dynamic" slot.
> >       So I'm dropping this "can be easily extended" clause from the
> >       description for now. I have converted this whole machinery
> >       to an RCU-managed list to remove synchronize_rcu().
> > - also note that I had to introduce new bpf_shim_tramp_link and
> >    moved refcnt there; we need something to manage new bpf_tramp_link
> >
> > v6:
> > - remove active count & stats for shim program (Martin KaFai Lau)
> > - remove NULL/error check for btf_vmlinux (Martin)
> > - don't check cgroup_atype in bpf_cgroup_lsm_shim_release (Martin)
> > - use old_prog (instead of passed one) in __cgroup_bpf_detach (Martin)
> > - make sure attach_btf_id is the same in __cgroup_bpf_replace (Martin)
> > - enable cgroup local storage and test it (Martin)
> > - properly implement prog query and add bpftool & tests (Martin)
> > - prohibit non-shared cgroup storage mode for BPF_LSM_CGROUP (Martin)
> >
> > v5:
> > - __cgroup_bpf_run_lsm_socket remove NULL sock/sk checks (Martin KaFai Lau)
> > - __cgroup_bpf_run_lsm_{socket,current} s/prog/shim_prog/ (Martin)
> > - make sure bpf_lsm_find_cgroup_shim works for hooks without args (Martin)
> > - __cgroup_bpf_attach make sure attach_btf_id is the same when replacing (Martin)
> > - call bpf_cgroup_lsm_shim_release only for LSM_CGROUP (Martin)
> > - drop BPF_LSM_CGROUP from bpf_attach_type_to_tramp (Martin)
> > - drop jited check from cgroup_shim_find (Martin)
> > - new patch to convert cgroup_bpf to hlist_node (Jakub Sitnicki)
> > - new shim flavor for 'struct sock' + list of exceptions (Martin)
> >
> > v4:
> > - fix build when jit is on but syscall is off
> >
> > v3:
> > - add BPF_LSM_CGROUP to bpftool
> > - use simple int instead of refcnt_t (to avoid use-after-free
> >    false positive)
> >
> > v2:
> > - addressed build bot failures
> >
> > Stanislav Fomichev (11):
> >    bpf: add bpf_func_t and trampoline helpers
> >    bpf: convert cgroup_bpf.progs to hlist
> >    bpf: per-cgroup lsm flavor
> >    bpf: minimize number of allocated lsm slots per program
> >    bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
> >    bpf: allow writing to a subset of sock fields from lsm progtype
> >    libbpf: implement bpf_prog_query_opts
> >    libbpf: add lsm_cgoup_sock type
> >    bpftool: implement cgroup tree for BPF_LSM_CGROUP
> >    selftests/bpf: lsm_cgroup functional test
> >    selftests/bpf: verify lsm_cgroup struct sock access
> >
> >   arch/x86/net/bpf_jit_comp.c                   |  24 +-
> >   include/linux/bpf-cgroup-defs.h               |  11 +-
> >   include/linux/bpf-cgroup.h                    |   9 +-
> >   include/linux/bpf.h                           |  36 +-
> >   include/linux/bpf_lsm.h                       |   8 +
> >   include/linux/btf_ids.h                       |   3 +-
> >   include/uapi/linux/bpf.h                      |   6 +
> >   kernel/bpf/bpf_lsm.c                          | 103 ++++
> >   kernel/bpf/btf.c                              |  11 +
> >   kernel/bpf/cgroup.c                           | 487 +++++++++++++++---
> >   kernel/bpf/core.c                             |   2 +
> >   kernel/bpf/syscall.c                          |  14 +-
> >   kernel/bpf/trampoline.c                       | 244 ++++++++-
> >   kernel/bpf/verifier.c                         |  31 +-
> >   tools/bpf/bpftool/cgroup.c                    |  77 ++-
> >   tools/bpf/bpftool/common.c                    |   1 +
> >   tools/include/linux/btf_ids.h                 |   4 +-
> >   tools/include/uapi/linux/bpf.h                |   6 +
> >   tools/lib/bpf/bpf.c                           |  42 +-
> >   tools/lib/bpf/bpf.h                           |  15 +
> >   tools/lib/bpf/libbpf.c                        |   2 +
> >   tools/lib/bpf/libbpf.map                      |   1 +
> >   .../selftests/bpf/prog_tests/lsm_cgroup.c     | 346 +++++++++++++
> >   .../testing/selftests/bpf/progs/lsm_cgroup.c  | 160 ++++++
> >   24 files changed, 1480 insertions(+), 163 deletions(-)
> >   create mode 100644 tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
> >   create mode 100644 tools/testing/selftests/bpf/progs/lsm_cgroup.c
>
> There are 4 test failures for test_progs in CI.
>
> https://github.com/kernel-patches/bpf/runs/6511113546?check_suite_focus=true
> All have error messages like:
>      At program exit the register R0 has value (0xffffffff; 0x0) should
> have been in (0x0; 0x1)
>
> Could you take a look?

Ugh, definitely, thanks for the pointer!

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 01/11] bpf: add bpf_func_t and trampoline helpers
  2022-05-18 22:55 ` [PATCH bpf-next v7 01/11] bpf: add bpf_func_t and trampoline helpers Stanislav Fomichev
@ 2022-05-20  0:45   ` Yonghong Song
  2022-05-21  0:03     ` Stanislav Fomichev
  0 siblings, 1 reply; 54+ messages in thread
From: Yonghong Song @ 2022-05-20  0:45 UTC (permalink / raw)
  To: Stanislav Fomichev, netdev, bpf; +Cc: ast, daniel, andrii



On 5/18/22 3:55 PM, Stanislav Fomichev wrote:
> I'll be adding lsm cgroup specific helpers that grab
> trampoline mutex.
> 
> No functional changes.
> 
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>   include/linux/bpf.h     | 11 ++++----
>   kernel/bpf/trampoline.c | 62 ++++++++++++++++++++++-------------------
>   2 files changed, 38 insertions(+), 35 deletions(-)
> 
[...]
> +
> +int bpf_trampoline_link_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr)
> +{
> +	int err;
> +
> +	mutex_lock(&tr->mutex);
> +	err = __bpf_trampoline_link_prog(link, tr);
>   	mutex_unlock(&tr->mutex);
>   	return err;
>   }
>   
>   /* bpf_trampoline_unlink_prog() should never fail. */

The comment here can be removed.

> -int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr)
> +static int __bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr)
>   {
>   	enum bpf_tramp_prog_type kind;
>   	int err;
>   
>   	kind = bpf_attach_type_to_tramp(link->link.prog);
> -	mutex_lock(&tr->mutex);
>   	if (kind == BPF_TRAMP_REPLACE) {
>   		WARN_ON_ONCE(!tr->extension_prog);
>   		err = bpf_arch_text_poke(tr->func.addr, BPF_MOD_JUMP,
>   					 tr->extension_prog->bpf_func, NULL);
>   		tr->extension_prog = NULL;
> -		goto out;
> +		return err;
>   	}
>   	hlist_del_init(&link->tramp_hlist);
>   	tr->progs_cnt[kind]--;
> -	err = bpf_trampoline_update(tr);
> -out:
> +	return bpf_trampoline_update(tr);
> +}
> +
> +/* bpf_trampoline_unlink_prog() should never fail. */
> +int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr)
> +{
> +	int err;
> +
> +	mutex_lock(&tr->mutex);
> +	err = __bpf_trampoline_unlink_prog(link, tr);
>   	mutex_unlock(&tr->mutex);
>   	return err;
>   }

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 03/11] bpf: per-cgroup lsm flavor
  2022-05-18 22:55 ` [PATCH bpf-next v7 03/11] bpf: per-cgroup lsm flavor Stanislav Fomichev
@ 2022-05-20  1:00   ` Yonghong Song
  2022-05-21  0:03     ` Stanislav Fomichev
  2022-05-21  0:53   ` Martin KaFai Lau
  1 sibling, 1 reply; 54+ messages in thread
From: Yonghong Song @ 2022-05-20  1:00 UTC (permalink / raw)
  To: Stanislav Fomichev, netdev, bpf; +Cc: ast, daniel, andrii



On 5/18/22 3:55 PM, Stanislav Fomichev wrote:
> Allow attaching to lsm hooks in the cgroup context.
> 
> Attaching to per-cgroup LSM works exactly like attaching
> to other per-cgroup hooks. New BPF_LSM_CGROUP is added
> to trigger new mode; the actual lsm hook we attach to is
> signaled via existing attach_btf_id.
> 
> For the hooks that have 'struct socket' or 'struct sock' as its first
> argument, we use the cgroup associated with that socket. For the rest,
> we use 'current' cgroup (this is all on default hierarchy == v2 only).
> Note that for some hooks that work on 'struct sock' we still
> take the cgroup from 'current' because some of them work on the socket
> that hasn't been properly initialized yet.
> 
> Behind the scenes, we allocate a shim program that is attached
> to the trampoline and runs cgroup effective BPF programs array.
> This shim has some rudimentary ref counting and can be shared
> between several programs attaching to the same per-cgroup lsm hook.
> 
> Note that this patch bloats cgroup size because we add 211
> cgroup_bpf_attach_type(s) for simplicity sake. This will be
> addressed in the subsequent patch.
> 
> Also note that we only add non-sleepable flavor for now. To enable
> sleepable use-cases, bpf_prog_run_array_cg has to grab trace rcu,
> shim programs have to be freed via trace rcu, cgroup_bpf.effective
> should be also trace-rcu-managed + maybe some other changes that
> I'm not aware of.
> 
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>   arch/x86/net/bpf_jit_comp.c     |  24 +++--
>   include/linux/bpf-cgroup-defs.h |   6 ++
>   include/linux/bpf-cgroup.h      |   7 ++
>   include/linux/bpf.h             |  25 +++++
>   include/linux/bpf_lsm.h         |  14 +++
>   include/linux/btf_ids.h         |   3 +-
>   include/uapi/linux/bpf.h        |   1 +
>   kernel/bpf/bpf_lsm.c            |  50 +++++++++
>   kernel/bpf/btf.c                |  11 ++
>   kernel/bpf/cgroup.c             | 181 ++++++++++++++++++++++++++++---
>   kernel/bpf/core.c               |   2 +
>   kernel/bpf/syscall.c            |  10 ++
>   kernel/bpf/trampoline.c         | 184 ++++++++++++++++++++++++++++++++
>   kernel/bpf/verifier.c           |  28 +++++
>   tools/include/linux/btf_ids.h   |   4 +-
>   tools/include/uapi/linux/bpf.h  |   1 +
>   16 files changed, 527 insertions(+), 24 deletions(-)

A few nits below.

> 
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index a2b6d197c226..5cdebf4312da 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -1765,6 +1765,10 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
>   			   struct bpf_tramp_link *l, int stack_size,
>   			   int run_ctx_off, bool save_ret)
>   {
> +	void (*exit)(struct bpf_prog *prog, u64 start,
> +		     struct bpf_tramp_run_ctx *run_ctx) = __bpf_prog_exit;
> +	u64 (*enter)(struct bpf_prog *prog,
> +		     struct bpf_tramp_run_ctx *run_ctx) = __bpf_prog_enter;
>   	u8 *prog = *pprog;
>   	u8 *jmp_insn;
>   	int ctx_cookie_off = offsetof(struct bpf_tramp_run_ctx, bpf_cookie);
[...]
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index ea3674a415f9..70cf1dad91df 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -768,6 +768,10 @@ void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start, struct bpf_tramp_
>   u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog, struct bpf_tramp_run_ctx *run_ctx);
>   void notrace __bpf_prog_exit_sleepable(struct bpf_prog *prog, u64 start,
>   				       struct bpf_tramp_run_ctx *run_ctx);
> +u64 notrace __bpf_prog_enter_lsm_cgroup(struct bpf_prog *prog,
> +					struct bpf_tramp_run_ctx *run_ctx);
> +void notrace __bpf_prog_exit_lsm_cgroup(struct bpf_prog *prog, u64 start,
> +					struct bpf_tramp_run_ctx *run_ctx);
>   void notrace __bpf_tramp_enter(struct bpf_tramp_image *tr);
>   void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr);
>   
> @@ -1035,6 +1039,7 @@ struct bpf_prog_aux {
>   	u64 load_time; /* ns since boottime */
>   	u32 verified_insns;
>   	struct bpf_map *cgroup_storage[MAX_BPF_CGROUP_STORAGE_TYPE];
> +	int cgroup_atype; /* enum cgroup_bpf_attach_type */

Move cgroup_atype right after verified_insns to fill the existing gap?

>   	char name[BPF_OBJ_NAME_LEN];
>   #ifdef CONFIG_SECURITY
>   	void *security;
> @@ -1107,6 +1112,12 @@ struct bpf_tramp_link {
>   	u64 cookie;
>   };
>   
> +struct bpf_shim_tramp_link {
> +	struct bpf_tramp_link tramp_link;
> +	struct bpf_trampoline *tr;
> +	atomic64_t refcnt;
> +};
> +
>   struct bpf_tracing_link {
>   	struct bpf_tramp_link link;
>   	enum bpf_attach_type attach_type;
> @@ -1185,6 +1196,9 @@ struct bpf_dummy_ops {
>   int bpf_struct_ops_test_run(struct bpf_prog *prog, const union bpf_attr *kattr,
>   			    union bpf_attr __user *uattr);
>   #endif
> +int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
> +				    struct bpf_attach_target_info *tgt_info);
> +void bpf_trampoline_unlink_cgroup_shim(struct bpf_prog *prog);
>   #else
>   static inline const struct bpf_struct_ops *bpf_struct_ops_find(u32 type_id)
>   {
> @@ -1208,6 +1222,14 @@ static inline int bpf_struct_ops_map_sys_lookup_elem(struct bpf_map *map,
>   {
>   	return -EINVAL;
>   }
> +static inline int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
> +						  struct bpf_attach_target_info *tgt_info)
> +{
> +	return -EOPNOTSUPP;
> +}
> +static inline void bpf_trampoline_unlink_cgroup_shim(struct bpf_prog *prog)
> +{
> +}
>   #endif
>   
>   struct bpf_array {
> @@ -2250,6 +2272,8 @@ extern const struct bpf_func_proto bpf_loop_proto;
>   extern const struct bpf_func_proto bpf_strncmp_proto;
>   extern const struct bpf_func_proto bpf_copy_from_user_task_proto;
>   extern const struct bpf_func_proto bpf_kptr_xchg_proto;
> +extern const struct bpf_func_proto bpf_set_retval_proto;
> +extern const struct bpf_func_proto bpf_get_retval_proto;
>   
>   const struct bpf_func_proto *tracing_prog_func_proto(
>     enum bpf_func_id func_id, const struct bpf_prog *prog);
> @@ -2366,6 +2390,7 @@ void *bpf_arch_text_copy(void *dst, void *src, size_t len);
>   
>   struct btf_id_set;
>   bool btf_id_set_contains(const struct btf_id_set *set, u32 id);
> +int btf_id_set_index(const struct btf_id_set *set, u32 id);
>   
>   #define MAX_BPRINTF_VARARGS		12
>   
> diff --git a/include/linux/bpf_lsm.h b/include/linux/bpf_lsm.h
> index 479c101546ad..7f0e59f5f9be 100644
> --- a/include/linux/bpf_lsm.h
> +++ b/include/linux/bpf_lsm.h
> @@ -42,6 +42,9 @@ extern const struct bpf_func_proto bpf_inode_storage_get_proto;
>   extern const struct bpf_func_proto bpf_inode_storage_delete_proto;
>   void bpf_inode_storage_free(struct inode *inode);
>   
> +int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog, bpf_func_t *bpf_func);
> +int bpf_lsm_hook_idx(u32 btf_id);
> +
>   #else /* !CONFIG_BPF_LSM */
>   
>   static inline bool bpf_lsm_is_sleepable_hook(u32 btf_id)
> @@ -65,6 +68,17 @@ static inline void bpf_inode_storage_free(struct inode *inode)
>   {
>   }
>   
> +static inline int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
> +					   bpf_func_t *bpf_func)
> +{
> +	return -ENOENT;
> +}
> +
> +static inline int bpf_lsm_hook_idx(u32 btf_id)
> +{
> +	return -EINVAL;
> +}
> +
>   #endif /* CONFIG_BPF_LSM */
>   
>   #endif /* _LINUX_BPF_LSM_H */
> diff --git a/include/linux/btf_ids.h b/include/linux/btf_ids.h
> index bc5d9cc34e4c..857cc37094da 100644
> --- a/include/linux/btf_ids.h
> +++ b/include/linux/btf_ids.h
> @@ -178,7 +178,8 @@ extern struct btf_id_set name;
>   	BTF_SOCK_TYPE(BTF_SOCK_TYPE_TCP6, tcp6_sock)			\
>   	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UDP, udp_sock)			\
>   	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UDP6, udp6_sock)			\
> -	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UNIX, unix_sock)
> +	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UNIX, unix_sock)			\
> +	BTF_SOCK_TYPE(BTF_SOCK_TYPE_SOCKET, socket)
>   
>   enum {
>   #define BTF_SOCK_TYPE(name, str) name,
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 0210f85131b3..b9d2d6de63a7 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -998,6 +998,7 @@ enum bpf_attach_type {
>   	BPF_SK_REUSEPORT_SELECT_OR_MIGRATE,
>   	BPF_PERF_EVENT,
>   	BPF_TRACE_KPROBE_MULTI,
> +	BPF_LSM_CGROUP,
>   	__MAX_BPF_ATTACH_TYPE
>   };
>   
> diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
> index c1351df9f7ee..654c23577ad3 100644
> --- a/kernel/bpf/bpf_lsm.c
> +++ b/kernel/bpf/bpf_lsm.c
> @@ -16,6 +16,7 @@
>   #include <linux/bpf_local_storage.h>
>   #include <linux/btf_ids.h>
>   #include <linux/ima.h>
> +#include <linux/bpf-cgroup.h>
>   
>   /* For every LSM hook that allows attachment of BPF programs, declare a nop
>    * function where a BPF program can be attached.
> @@ -35,6 +36,46 @@ BTF_SET_START(bpf_lsm_hooks)
>   #undef LSM_HOOK
>   BTF_SET_END(bpf_lsm_hooks)
>   
> +/* List of LSM hooks that should operate on 'current' cgroup regardless
> + * of function signature.
> + */
> +BTF_SET_START(bpf_lsm_current_hooks)
> +/* operate on freshly allocated sk without any cgroup association */
> +BTF_ID(func, bpf_lsm_sk_alloc_security)
> +BTF_ID(func, bpf_lsm_sk_free_security)
> +BTF_SET_END(bpf_lsm_current_hooks)
> +
> +int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
> +			     bpf_func_t *bpf_func)
> +{
> +	const struct btf_param *args;
> +
> +	if (btf_type_vlen(prog->aux->attach_func_proto) < 1 ||
> +	    btf_id_set_contains(&bpf_lsm_current_hooks,
> +				prog->aux->attach_btf_id)) {
> +		*bpf_func = __cgroup_bpf_run_lsm_current;
> +		return 0;
> +	}
> +
> +	args = btf_params(prog->aux->attach_func_proto);
> +
> +#ifdef CONFIG_NET
> +	if (args[0].type == btf_sock_ids[BTF_SOCK_TYPE_SOCKET])
> +		*bpf_func = __cgroup_bpf_run_lsm_socket;
> +	else if (args[0].type == btf_sock_ids[BTF_SOCK_TYPE_SOCK])
> +		*bpf_func = __cgroup_bpf_run_lsm_sock;
> +	else
> +#endif
> +		*bpf_func = __cgroup_bpf_run_lsm_current;
> +
> +	return 0;

This function always returns 0; change the return type to void?

> +}
> +
> +int bpf_lsm_hook_idx(u32 btf_id)
> +{
> +	return btf_id_set_index(&bpf_lsm_hooks, btf_id);
> +}
> +
>   int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
>   			const struct bpf_prog *prog)
>   {
> @@ -158,6 +199,15 @@ bpf_lsm_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>   		return prog->aux->sleepable ? &bpf_ima_file_hash_proto : NULL;
>   	case BPF_FUNC_get_attach_cookie:
>   		return bpf_prog_has_trampoline(prog) ? &bpf_get_attach_cookie_proto : NULL;
> +	case BPF_FUNC_get_local_storage:
> +		return prog->expected_attach_type == BPF_LSM_CGROUP ?
> +			&bpf_get_local_storage_proto : NULL;
> +	case BPF_FUNC_set_retval:
> +		return prog->expected_attach_type == BPF_LSM_CGROUP ?
> +			&bpf_set_retval_proto : NULL;
> +	case BPF_FUNC_get_retval:
> +		return prog->expected_attach_type == BPF_LSM_CGROUP ?
> +			&bpf_get_retval_proto : NULL;
>   	default:
>   		return tracing_prog_func_proto(func_id, prog);
>   	}
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index 2f0b0440131c..a90f04a8a8ee 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -5248,6 +5248,7 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
>   
>   	if (arg == nr_args) {
>   		switch (prog->expected_attach_type) {
> +		case BPF_LSM_CGROUP:
>   		case BPF_LSM_MAC:
>   		case BPF_TRACE_FEXIT:
>   			/* When LSM programs are attached to void LSM hooks
> @@ -6726,6 +6727,16 @@ static int btf_id_cmp_func(const void *a, const void *b)
>   	return *pa - *pb;
>   }
>   
> +int btf_id_set_index(const struct btf_id_set *set, u32 id)
> +{
> +	const u32 *p;
> +
> +	p = bsearch(&id, set->ids, set->cnt, sizeof(u32), btf_id_cmp_func);
> +	if (!p)
> +		return -1;
> +	return p - set->ids;
> +}
> +
>   bool btf_id_set_contains(const struct btf_id_set *set, u32 id)
>   {
>   	return bsearch(&id, set->ids, set->cnt, sizeof(u32), btf_id_cmp_func) != NULL;
> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> index 134785ab487c..2c356a38f4cf 100644
> --- a/kernel/bpf/cgroup.c
> +++ b/kernel/bpf/cgroup.c
> @@ -14,6 +14,9 @@
>   #include <linux/string.h>
>   #include <linux/bpf.h>
>   #include <linux/bpf-cgroup.h>
> +#include <linux/btf_ids.h>
> +#include <linux/bpf_lsm.h>
> +#include <linux/bpf_verifier.h>
>   #include <net/sock.h>
>   #include <net/bpf_sk_storage.h>
>   
> @@ -61,6 +64,85 @@ bpf_prog_run_array_cg(const struct cgroup_bpf *cgrp,
>   	return run_ctx.retval;
>   }
>   
> +unsigned int __cgroup_bpf_run_lsm_sock(const void *ctx,
> +				       const struct bpf_insn *insn)
> +{
> +	const struct bpf_prog *shim_prog;
> +	struct sock *sk;
> +	struct cgroup *cgrp;
> +	int ret = 0;
> +	u64 *regs;
> +
> +	regs = (u64 *)ctx;
> +	sk = (void *)(unsigned long)regs[BPF_REG_0];

Maybe just my own opinion. Using BPF_REG_0 as an index is a little bit
confusing. Maybe just use '0' to indicate the first parameter.
Maybe changing 'regs' to 'params' would also be a better choice?
In reality, the trampoline just passes an array of parameters to
the program. The same goes for a few places below.

> +	/*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
> +	shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));

I didn't experiment, but why wouldn't container_of work?

> +
> +	cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
> +	if (likely(cgrp))
> +		ret = bpf_prog_run_array_cg(&cgrp->bpf,
> +					    shim_prog->aux->cgroup_atype,
> +					    ctx, bpf_prog_run, 0, NULL);
> +	return ret;
> +}
> +
> +unsigned int __cgroup_bpf_run_lsm_socket(const void *ctx,
> +					 const struct bpf_insn *insn)
> +{
> +	const struct bpf_prog *shim_prog;
> +	struct socket *sock;
> +	struct cgroup *cgrp;
> +	int ret = 0;
> +	u64 *regs;
> +
> +	regs = (u64 *)ctx;
> +	sock = (void *)(unsigned long)regs[BPF_REG_0];
> +	/*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
> +	shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
> +
> +	cgrp = sock_cgroup_ptr(&sock->sk->sk_cgrp_data);
> +	if (likely(cgrp))
> +		ret = bpf_prog_run_array_cg(&cgrp->bpf,
> +					    shim_prog->aux->cgroup_atype,
> +					    ctx, bpf_prog_run, 0, NULL);
> +	return ret;
> +}
> +
> +unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
> +					  const struct bpf_insn *insn)
> +{
> +	const struct bpf_prog *shim_prog;
> +	struct cgroup *cgrp;
> +	int ret = 0;
> +
> +	if (unlikely(!current))
> +		return 0;

I think we don't need this check.

> +
> +	/*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
> +	shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
> +
> +	rcu_read_lock();
> +	cgrp = task_dfl_cgroup(current);
> +	if (likely(cgrp))
> +		ret = bpf_prog_run_array_cg(&cgrp->bpf,
> +					    shim_prog->aux->cgroup_atype,
> +					    ctx, bpf_prog_run, 0, NULL);
> +	rcu_read_unlock();
> +	return ret;
> +}
> +
[...]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 03/11] bpf: per-cgroup lsm flavor
  2022-05-20  1:00   ` Yonghong Song
@ 2022-05-21  0:03     ` Stanislav Fomichev
  2022-05-23 15:41       ` Yonghong Song
  0 siblings, 1 reply; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-21  0:03 UTC (permalink / raw)
  To: Yonghong Song; +Cc: netdev, bpf, ast, daniel, andrii

On Thu, May 19, 2022 at 6:01 PM Yonghong Song <yhs@fb.com> wrote:
>
>
>
> On 5/18/22 3:55 PM, Stanislav Fomichev wrote:
> > Allow attaching to lsm hooks in the cgroup context.
> >
> > Attaching to per-cgroup LSM works exactly like attaching
> > to other per-cgroup hooks. New BPF_LSM_CGROUP is added
> > to trigger new mode; the actual lsm hook we attach to is
> > signaled via existing attach_btf_id.
> >
> > For the hooks that have 'struct socket' or 'struct sock' as its first
> > argument, we use the cgroup associated with that socket. For the rest,
> > we use 'current' cgroup (this is all on default hierarchy == v2 only).
> > Note that for some hooks that work on 'struct sock' we still
> > take the cgroup from 'current' because some of them work on the socket
> > that hasn't been properly initialized yet.
> >
> > Behind the scenes, we allocate a shim program that is attached
> > to the trampoline and runs cgroup effective BPF programs array.
> > This shim has some rudimentary ref counting and can be shared
> > between several programs attaching to the same per-cgroup lsm hook.
> >
> > Note that this patch bloats cgroup size because we add 211
> > cgroup_bpf_attach_type(s) for simplicity sake. This will be
> > addressed in the subsequent patch.
> >
> > Also note that we only add non-sleepable flavor for now. To enable
> > sleepable use-cases, bpf_prog_run_array_cg has to grab trace rcu,
> > shim programs have to be freed via trace rcu, cgroup_bpf.effective
> > should be also trace-rcu-managed + maybe some other changes that
> > I'm not aware of.
> >
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >   arch/x86/net/bpf_jit_comp.c     |  24 +++--
> >   include/linux/bpf-cgroup-defs.h |   6 ++
> >   include/linux/bpf-cgroup.h      |   7 ++
> >   include/linux/bpf.h             |  25 +++++
> >   include/linux/bpf_lsm.h         |  14 +++
> >   include/linux/btf_ids.h         |   3 +-
> >   include/uapi/linux/bpf.h        |   1 +
> >   kernel/bpf/bpf_lsm.c            |  50 +++++++++
> >   kernel/bpf/btf.c                |  11 ++
> >   kernel/bpf/cgroup.c             | 181 ++++++++++++++++++++++++++++---
> >   kernel/bpf/core.c               |   2 +
> >   kernel/bpf/syscall.c            |  10 ++
> >   kernel/bpf/trampoline.c         | 184 ++++++++++++++++++++++++++++++++
> >   kernel/bpf/verifier.c           |  28 +++++
> >   tools/include/linux/btf_ids.h   |   4 +-
> >   tools/include/uapi/linux/bpf.h  |   1 +
> >   16 files changed, 527 insertions(+), 24 deletions(-)
>
> A few nits below.
>
> >
> > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > index a2b6d197c226..5cdebf4312da 100644
> > --- a/arch/x86/net/bpf_jit_comp.c
> > +++ b/arch/x86/net/bpf_jit_comp.c
> > @@ -1765,6 +1765,10 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
> >                          struct bpf_tramp_link *l, int stack_size,
> >                          int run_ctx_off, bool save_ret)
> >   {
> > +     void (*exit)(struct bpf_prog *prog, u64 start,
> > +                  struct bpf_tramp_run_ctx *run_ctx) = __bpf_prog_exit;
> > +     u64 (*enter)(struct bpf_prog *prog,
> > +                  struct bpf_tramp_run_ctx *run_ctx) = __bpf_prog_enter;
> >       u8 *prog = *pprog;
> >       u8 *jmp_insn;
> >       int ctx_cookie_off = offsetof(struct bpf_tramp_run_ctx, bpf_cookie);
> [...]
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index ea3674a415f9..70cf1dad91df 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -768,6 +768,10 @@ void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start, struct bpf_tramp_
> >   u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog, struct bpf_tramp_run_ctx *run_ctx);
> >   void notrace __bpf_prog_exit_sleepable(struct bpf_prog *prog, u64 start,
> >                                      struct bpf_tramp_run_ctx *run_ctx);
> > +u64 notrace __bpf_prog_enter_lsm_cgroup(struct bpf_prog *prog,
> > +                                     struct bpf_tramp_run_ctx *run_ctx);
> > +void notrace __bpf_prog_exit_lsm_cgroup(struct bpf_prog *prog, u64 start,
> > +                                     struct bpf_tramp_run_ctx *run_ctx);
> >   void notrace __bpf_tramp_enter(struct bpf_tramp_image *tr);
> >   void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr);
> >
> > @@ -1035,6 +1039,7 @@ struct bpf_prog_aux {
> >       u64 load_time; /* ns since boottime */
> >       u32 verified_insns;
> >       struct bpf_map *cgroup_storage[MAX_BPF_CGROUP_STORAGE_TYPE];
> > +     int cgroup_atype; /* enum cgroup_bpf_attach_type */
>
> Move cgroup_atype right after verified_insns to fill the existing gap?

Good idea!

> >       char name[BPF_OBJ_NAME_LEN];
> >   #ifdef CONFIG_SECURITY
> >       void *security;
> > @@ -1107,6 +1112,12 @@ struct bpf_tramp_link {
> >       u64 cookie;
> >   };
> >
> > +struct bpf_shim_tramp_link {
> > +     struct bpf_tramp_link tramp_link;
> > +     struct bpf_trampoline *tr;
> > +     atomic64_t refcnt;
> > +};
> > +
> >   struct bpf_tracing_link {
> >       struct bpf_tramp_link link;
> >       enum bpf_attach_type attach_type;
> > @@ -1185,6 +1196,9 @@ struct bpf_dummy_ops {
> >   int bpf_struct_ops_test_run(struct bpf_prog *prog, const union bpf_attr *kattr,
> >                           union bpf_attr __user *uattr);
> >   #endif
> > +int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
> > +                                 struct bpf_attach_target_info *tgt_info);
> > +void bpf_trampoline_unlink_cgroup_shim(struct bpf_prog *prog);
> >   #else
> >   static inline const struct bpf_struct_ops *bpf_struct_ops_find(u32 type_id)
> >   {
> > @@ -1208,6 +1222,14 @@ static inline int bpf_struct_ops_map_sys_lookup_elem(struct bpf_map *map,
> >   {
> >       return -EINVAL;
> >   }
> > +static inline int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
> > +                                               struct bpf_attach_target_info *tgt_info)
> > +{
> > +     return -EOPNOTSUPP;
> > +}
> > +static inline void bpf_trampoline_unlink_cgroup_shim(struct bpf_prog *prog)
> > +{
> > +}
> >   #endif
> >
> >   struct bpf_array {
> > @@ -2250,6 +2272,8 @@ extern const struct bpf_func_proto bpf_loop_proto;
> >   extern const struct bpf_func_proto bpf_strncmp_proto;
> >   extern const struct bpf_func_proto bpf_copy_from_user_task_proto;
> >   extern const struct bpf_func_proto bpf_kptr_xchg_proto;
> > +extern const struct bpf_func_proto bpf_set_retval_proto;
> > +extern const struct bpf_func_proto bpf_get_retval_proto;
> >
> >   const struct bpf_func_proto *tracing_prog_func_proto(
> >     enum bpf_func_id func_id, const struct bpf_prog *prog);
> > @@ -2366,6 +2390,7 @@ void *bpf_arch_text_copy(void *dst, void *src, size_t len);
> >
> >   struct btf_id_set;
> >   bool btf_id_set_contains(const struct btf_id_set *set, u32 id);
> > +int btf_id_set_index(const struct btf_id_set *set, u32 id);
> >
> >   #define MAX_BPRINTF_VARARGS         12
> >
> > diff --git a/include/linux/bpf_lsm.h b/include/linux/bpf_lsm.h
> > index 479c101546ad..7f0e59f5f9be 100644
> > --- a/include/linux/bpf_lsm.h
> > +++ b/include/linux/bpf_lsm.h
> > @@ -42,6 +42,9 @@ extern const struct bpf_func_proto bpf_inode_storage_get_proto;
> >   extern const struct bpf_func_proto bpf_inode_storage_delete_proto;
> >   void bpf_inode_storage_free(struct inode *inode);
> >
> > +int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog, bpf_func_t *bpf_func);
> > +int bpf_lsm_hook_idx(u32 btf_id);
> > +
> >   #else /* !CONFIG_BPF_LSM */
> >
> >   static inline bool bpf_lsm_is_sleepable_hook(u32 btf_id)
> > @@ -65,6 +68,17 @@ static inline void bpf_inode_storage_free(struct inode *inode)
> >   {
> >   }
> >
> > +static inline int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
> > +                                        bpf_func_t *bpf_func)
> > +{
> > +     return -ENOENT;
> > +}
> > +
> > +static inline int bpf_lsm_hook_idx(u32 btf_id)
> > +{
> > +     return -EINVAL;
> > +}
> > +
> >   #endif /* CONFIG_BPF_LSM */
> >
> >   #endif /* _LINUX_BPF_LSM_H */
> > diff --git a/include/linux/btf_ids.h b/include/linux/btf_ids.h
> > index bc5d9cc34e4c..857cc37094da 100644
> > --- a/include/linux/btf_ids.h
> > +++ b/include/linux/btf_ids.h
> > @@ -178,7 +178,8 @@ extern struct btf_id_set name;
> >       BTF_SOCK_TYPE(BTF_SOCK_TYPE_TCP6, tcp6_sock)                    \
> >       BTF_SOCK_TYPE(BTF_SOCK_TYPE_UDP, udp_sock)                      \
> >       BTF_SOCK_TYPE(BTF_SOCK_TYPE_UDP6, udp6_sock)                    \
> > -     BTF_SOCK_TYPE(BTF_SOCK_TYPE_UNIX, unix_sock)
> > +     BTF_SOCK_TYPE(BTF_SOCK_TYPE_UNIX, unix_sock)                    \
> > +     BTF_SOCK_TYPE(BTF_SOCK_TYPE_SOCKET, socket)
> >
> >   enum {
> >   #define BTF_SOCK_TYPE(name, str) name,
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 0210f85131b3..b9d2d6de63a7 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -998,6 +998,7 @@ enum bpf_attach_type {
> >       BPF_SK_REUSEPORT_SELECT_OR_MIGRATE,
> >       BPF_PERF_EVENT,
> >       BPF_TRACE_KPROBE_MULTI,
> > +     BPF_LSM_CGROUP,
> >       __MAX_BPF_ATTACH_TYPE
> >   };
> >
> > diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
> > index c1351df9f7ee..654c23577ad3 100644
> > --- a/kernel/bpf/bpf_lsm.c
> > +++ b/kernel/bpf/bpf_lsm.c
> > @@ -16,6 +16,7 @@
> >   #include <linux/bpf_local_storage.h>
> >   #include <linux/btf_ids.h>
> >   #include <linux/ima.h>
> > +#include <linux/bpf-cgroup.h>
> >
> >   /* For every LSM hook that allows attachment of BPF programs, declare a nop
> >    * function where a BPF program can be attached.
> > @@ -35,6 +36,46 @@ BTF_SET_START(bpf_lsm_hooks)
> >   #undef LSM_HOOK
> >   BTF_SET_END(bpf_lsm_hooks)
> >
> > +/* List of LSM hooks that should operate on 'current' cgroup regardless
> > + * of function signature.
> > + */
> > +BTF_SET_START(bpf_lsm_current_hooks)
> > +/* operate on freshly allocated sk without any cgroup association */
> > +BTF_ID(func, bpf_lsm_sk_alloc_security)
> > +BTF_ID(func, bpf_lsm_sk_free_security)
> > +BTF_SET_END(bpf_lsm_current_hooks)
> > +
> > +int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
> > +                          bpf_func_t *bpf_func)
> > +{
> > +     const struct btf_param *args;
> > +
> > +     if (btf_type_vlen(prog->aux->attach_func_proto) < 1 ||
> > +         btf_id_set_contains(&bpf_lsm_current_hooks,
> > +                             prog->aux->attach_btf_id)) {
> > +             *bpf_func = __cgroup_bpf_run_lsm_current;
> > +             return 0;
> > +     }
> > +
> > +     args = btf_params(prog->aux->attach_func_proto);
> > +
> > +#ifdef CONFIG_NET
> > +     if (args[0].type == btf_sock_ids[BTF_SOCK_TYPE_SOCKET])
> > +             *bpf_func = __cgroup_bpf_run_lsm_socket;
> > +     else if (args[0].type == btf_sock_ids[BTF_SOCK_TYPE_SOCK])
> > +             *bpf_func = __cgroup_bpf_run_lsm_sock;
> > +     else
> > +#endif
> > +             *bpf_func = __cgroup_bpf_run_lsm_current;
> > +
> > +     return 0;
>
> This function always returns 0; change the return type to void?

Oh, good catch, over time we've removed all error cases from it, will
convert to void.

> > +}
> > +
> > +int bpf_lsm_hook_idx(u32 btf_id)
> > +{
> > +     return btf_id_set_index(&bpf_lsm_hooks, btf_id);
> > +}
> > +
> >   int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
> >                       const struct bpf_prog *prog)
> >   {
> > @@ -158,6 +199,15 @@ bpf_lsm_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> >               return prog->aux->sleepable ? &bpf_ima_file_hash_proto : NULL;
> >       case BPF_FUNC_get_attach_cookie:
> >               return bpf_prog_has_trampoline(prog) ? &bpf_get_attach_cookie_proto : NULL;
> > +     case BPF_FUNC_get_local_storage:
> > +             return prog->expected_attach_type == BPF_LSM_CGROUP ?
> > +                     &bpf_get_local_storage_proto : NULL;
> > +     case BPF_FUNC_set_retval:
> > +             return prog->expected_attach_type == BPF_LSM_CGROUP ?
> > +                     &bpf_set_retval_proto : NULL;
> > +     case BPF_FUNC_get_retval:
> > +             return prog->expected_attach_type == BPF_LSM_CGROUP ?
> > +                     &bpf_get_retval_proto : NULL;
> >       default:
> >               return tracing_prog_func_proto(func_id, prog);
> >       }
> > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > index 2f0b0440131c..a90f04a8a8ee 100644
> > --- a/kernel/bpf/btf.c
> > +++ b/kernel/bpf/btf.c
> > @@ -5248,6 +5248,7 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
> >
> >       if (arg == nr_args) {
> >               switch (prog->expected_attach_type) {
> > +             case BPF_LSM_CGROUP:
> >               case BPF_LSM_MAC:
> >               case BPF_TRACE_FEXIT:
> >                       /* When LSM programs are attached to void LSM hooks
> > @@ -6726,6 +6727,16 @@ static int btf_id_cmp_func(const void *a, const void *b)
> >       return *pa - *pb;
> >   }
> >
> > +int btf_id_set_index(const struct btf_id_set *set, u32 id)
> > +{
> > +     const u32 *p;
> > +
> > +     p = bsearch(&id, set->ids, set->cnt, sizeof(u32), btf_id_cmp_func);
> > +     if (!p)
> > +             return -1;
> > +     return p - set->ids;
> > +}
> > +
> >   bool btf_id_set_contains(const struct btf_id_set *set, u32 id)
> >   {
> >       return bsearch(&id, set->ids, set->cnt, sizeof(u32), btf_id_cmp_func) != NULL;
> > diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> > index 134785ab487c..2c356a38f4cf 100644
> > --- a/kernel/bpf/cgroup.c
> > +++ b/kernel/bpf/cgroup.c
> > @@ -14,6 +14,9 @@
> >   #include <linux/string.h>
> >   #include <linux/bpf.h>
> >   #include <linux/bpf-cgroup.h>
> > +#include <linux/btf_ids.h>
> > +#include <linux/bpf_lsm.h>
> > +#include <linux/bpf_verifier.h>
> >   #include <net/sock.h>
> >   #include <net/bpf_sk_storage.h>
> >
> > @@ -61,6 +64,85 @@ bpf_prog_run_array_cg(const struct cgroup_bpf *cgrp,
> >       return run_ctx.retval;
> >   }
> >
> > +unsigned int __cgroup_bpf_run_lsm_sock(const void *ctx,
> > +                                    const struct bpf_insn *insn)
> > +{
> > +     const struct bpf_prog *shim_prog;
> > +     struct sock *sk;
> > +     struct cgroup *cgrp;
> > +     int ret = 0;
> > +     u64 *regs;
> > +
> > +     regs = (u64 *)ctx;
> > +     sk = (void *)(unsigned long)regs[BPF_REG_0];
>
> Maybe just my own opinion. Using BPF_REG_0 as an index is a little bit
> confusing. Maybe just use '0' to indicate the first parameter.
> Maybe changing 'regs' to 'params' would also be a better choice?
> In reality, the trampoline just passes an array of parameters to
> the program. The same for a few places below.

Sure, let's rename it and use 0. I'll do args instead of params maybe?
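
I.e. something roughly like this for __cgroup_bpf_run_lsm_sock (sketch of
the rename only, untested):

	u64 *args = (u64 *)ctx;
	struct sock *sk;

	/* args[0] is the hook's first argument, 'struct sock *' here */
	sk = (void *)(unsigned long)args[0];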

> > +     /*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
> > +     shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
>
> I didn't experiment, but why container_of won't work?

There is a type check in container_of that doesn't seem to work for flex arrays:

kernel/bpf/cgroup.c:78:14: error: static_assert failed due to
requirement '__builtin_types_compatible_p(const struct bpf_insn,
struct bpf_insn []"
        shim_prog = container_of(insn, struct bpf_prog, insnsi);
                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/container_of.h:19:2: note: expanded from macro 'container_of'
        static_assert(__same_type(*(ptr), ((type *)0)->member) ||       \
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/build_bug.h:77:34: note: expanded from macro 'static_assert'
#define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/build_bug.h:78:41: note: expanded from macro '__static_assert'
#define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
                                        ^              ~~~~
1 error generated.


Am I doing it wrong?
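
For reference, a minimal userspace illustration of what seems to trip the
check (struct names here are made up, only the shape matters): the
flexible array member's type is 'struct bpf_insn []', and
__builtin_types_compatible_p() does not treat that as the same type as
'const struct bpf_insn', so the member-type arm of the static_assert fails.

struct bpf_insn { int dummy; };

struct bpf_prog_like {
	int aux;
	struct bpf_insn insnsi[];	/* flexible array member */
};

/* holds: array-of-T is not compatible with T itself */
_Static_assert(!__builtin_types_compatible_p(
			const struct bpf_insn,
			__typeof__(((struct bpf_prog_like *)0)->insnsi)),
	       "container_of()'s type check sees two different types");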

> > +
> > +     cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
> > +     if (likely(cgrp))
> > +             ret = bpf_prog_run_array_cg(&cgrp->bpf,
> > +                                         shim_prog->aux->cgroup_atype,
> > +                                         ctx, bpf_prog_run, 0, NULL);
> > +     return ret;
> > +}
> > +
> > +unsigned int __cgroup_bpf_run_lsm_socket(const void *ctx,
> > +                                      const struct bpf_insn *insn)
> > +{
> > +     const struct bpf_prog *shim_prog;
> > +     struct socket *sock;
> > +     struct cgroup *cgrp;
> > +     int ret = 0;
> > +     u64 *regs;
> > +
> > +     regs = (u64 *)ctx;
> > +     sock = (void *)(unsigned long)regs[BPF_REG_0];
> > +     /*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
> > +     shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
> > +
> > +     cgrp = sock_cgroup_ptr(&sock->sk->sk_cgrp_data);
> > +     if (likely(cgrp))
> > +             ret = bpf_prog_run_array_cg(&cgrp->bpf,
> > +                                         shim_prog->aux->cgroup_atype,
> > +                                         ctx, bpf_prog_run, 0, NULL);
> > +     return ret;
> > +}
> > +
> > +unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
> > +                                       const struct bpf_insn *insn)
> > +{
> > +     const struct bpf_prog *shim_prog;
> > +     struct cgroup *cgrp;
> > +     int ret = 0;
> > +
> > +     if (unlikely(!current))
> > +             return 0;
>
> I think we don't need this check.

SG, will remove it. Indeed, there don't seem to be many "if
(current)" checks elsewhere.

Thank you for the review! Will try to address everything and respin
sometime next week (in case others want to have a quick look).

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 01/11] bpf: add bpf_func_t and trampoline helpers
  2022-05-20  0:45   ` Yonghong Song
@ 2022-05-21  0:03     ` Stanislav Fomichev
  0 siblings, 0 replies; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-21  0:03 UTC (permalink / raw)
  To: Yonghong Song; +Cc: netdev, bpf, ast, daniel, andrii

On Thu, May 19, 2022 at 5:45 PM Yonghong Song <yhs@fb.com> wrote:
>
>
>
> On 5/18/22 3:55 PM, Stanislav Fomichev wrote:
> > I'll be adding lsm cgroup specific helpers that grab
> > trampoline mutex.
> >
> > No functional changes.
> >
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >   include/linux/bpf.h     | 11 ++++----
> >   kernel/bpf/trampoline.c | 62 ++++++++++++++++++++++-------------------
> >   2 files changed, 38 insertions(+), 35 deletions(-)
> >
> [...]
> > +
> > +int bpf_trampoline_link_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr)
> > +{
> > +     int err;
> > +
> > +     mutex_lock(&tr->mutex);
> > +     err = __bpf_trampoline_link_prog(link, tr);
> >       mutex_unlock(&tr->mutex);
> >       return err;
> >   }
> >
> >   /* bpf_trampoline_unlink_prog() should never fail. */
>
> The comment here can be removed.

Will do, thank you!


> > -int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr)
> > +static int __bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr)
> >   {
> >       enum bpf_tramp_prog_type kind;
> >       int err;
> >
> >       kind = bpf_attach_type_to_tramp(link->link.prog);
> > -     mutex_lock(&tr->mutex);
> >       if (kind == BPF_TRAMP_REPLACE) {
> >               WARN_ON_ONCE(!tr->extension_prog);
> >               err = bpf_arch_text_poke(tr->func.addr, BPF_MOD_JUMP,
> >                                        tr->extension_prog->bpf_func, NULL);
> >               tr->extension_prog = NULL;
> > -             goto out;
> > +             return err;
> >       }
> >       hlist_del_init(&link->tramp_hlist);
> >       tr->progs_cnt[kind]--;
> > -     err = bpf_trampoline_update(tr);
> > -out:
> > +     return bpf_trampoline_update(tr);
> > +}
> > +
> > +/* bpf_trampoline_unlink_prog() should never fail. */
> > +int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr)
> > +{
> > +     int err;
> > +
> > +     mutex_lock(&tr->mutex);
> > +     err = __bpf_trampoline_unlink_prog(link, tr);
> >       mutex_unlock(&tr->mutex);
> >       return err;
> >   }

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 03/11] bpf: per-cgroup lsm flavor
  2022-05-18 22:55 ` [PATCH bpf-next v7 03/11] bpf: per-cgroup lsm flavor Stanislav Fomichev
  2022-05-20  1:00   ` Yonghong Song
@ 2022-05-21  0:53   ` Martin KaFai Lau
  2022-05-24  2:15     ` Stanislav Fomichev
  1 sibling, 1 reply; 54+ messages in thread
From: Martin KaFai Lau @ 2022-05-21  0:53 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii

On Wed, May 18, 2022 at 03:55:23PM -0700, Stanislav Fomichev wrote:

[ ... ]

> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index ea3674a415f9..70cf1dad91df 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -768,6 +768,10 @@ void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start, struct bpf_tramp_
>  u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog, struct bpf_tramp_run_ctx *run_ctx);
>  void notrace __bpf_prog_exit_sleepable(struct bpf_prog *prog, u64 start,
>  				       struct bpf_tramp_run_ctx *run_ctx);
> +u64 notrace __bpf_prog_enter_lsm_cgroup(struct bpf_prog *prog,
> +					struct bpf_tramp_run_ctx *run_ctx);
> +void notrace __bpf_prog_exit_lsm_cgroup(struct bpf_prog *prog, u64 start,
> +					struct bpf_tramp_run_ctx *run_ctx);
>  void notrace __bpf_tramp_enter(struct bpf_tramp_image *tr);
>  void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr);
>  
> @@ -1035,6 +1039,7 @@ struct bpf_prog_aux {
>  	u64 load_time; /* ns since boottime */
>  	u32 verified_insns;
>  	struct bpf_map *cgroup_storage[MAX_BPF_CGROUP_STORAGE_TYPE];
> +	int cgroup_atype; /* enum cgroup_bpf_attach_type */
>  	char name[BPF_OBJ_NAME_LEN];
>  #ifdef CONFIG_SECURITY
>  	void *security;
> @@ -1107,6 +1112,12 @@ struct bpf_tramp_link {
>  	u64 cookie;
>  };
>  
> +struct bpf_shim_tramp_link {
> +	struct bpf_tramp_link tramp_link;
> +	struct bpf_trampoline *tr;
> +	atomic64_t refcnt;
There is already a refcnt in 'struct bpf_link'.
Reuse that one if possible.

[ ... ]

> diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
> index 01ce78c1df80..c424056f0b35 100644
> --- a/kernel/bpf/trampoline.c
> +++ b/kernel/bpf/trampoline.c
> @@ -11,6 +11,8 @@
>  #include <linux/rcupdate_wait.h>
>  #include <linux/module.h>
>  #include <linux/static_call.h>
> +#include <linux/bpf_verifier.h>
> +#include <linux/bpf_lsm.h>
>  
>  /* dummy _ops. The verifier will operate on target program's ops. */
>  const struct bpf_verifier_ops bpf_extension_verifier_ops = {
> @@ -497,6 +499,163 @@ int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampolin
>  	return err;
>  }
>  
> +#if defined(CONFIG_BPF_JIT) && defined(CONFIG_BPF_SYSCALL)
> +static struct bpf_shim_tramp_link *cgroup_shim_alloc(const struct bpf_prog *prog,
> +						     bpf_func_t bpf_func)
> +{
> +	struct bpf_shim_tramp_link *shim_link = NULL;
> +	struct bpf_prog *p;
> +
> +	shim_link = kzalloc(sizeof(*shim_link), GFP_USER);
> +	if (!shim_link)
> +		return NULL;
> +
> +	p = bpf_prog_alloc(1, 0);
> +	if (!p) {
> +		kfree(shim_link);
> +		return NULL;
> +	}
> +
> +	p->jited = false;
> +	p->bpf_func = bpf_func;
> +
> +	p->aux->cgroup_atype = prog->aux->cgroup_atype;
> +	p->aux->attach_func_proto = prog->aux->attach_func_proto;
> +	p->aux->attach_btf_id = prog->aux->attach_btf_id;
> +	p->aux->attach_btf = prog->aux->attach_btf;
> +	btf_get(p->aux->attach_btf);
> +	p->type = BPF_PROG_TYPE_LSM;
> +	p->expected_attach_type = BPF_LSM_MAC;
> +	bpf_prog_inc(p);
> +	bpf_link_init(&shim_link->tramp_link.link, BPF_LINK_TYPE_TRACING, NULL, p);
> +	atomic64_set(&shim_link->refcnt, 1);
> +
> +	return shim_link;
> +}
> +
> +static struct bpf_shim_tramp_link *cgroup_shim_find(struct bpf_trampoline *tr,
> +						    bpf_func_t bpf_func)
> +{
> +	struct bpf_tramp_link *link;
> +	int kind;
> +
> +	for (kind = 0; kind < BPF_TRAMP_MAX; kind++) {
> +		hlist_for_each_entry(link, &tr->progs_hlist[kind], tramp_hlist) {
> +			struct bpf_prog *p = link->link.prog;
> +
> +			if (p->bpf_func == bpf_func)
> +				return container_of(link, struct bpf_shim_tramp_link, tramp_link);
> +		}
> +	}
> +
> +	return NULL;
> +}
> +
> +static void cgroup_shim_put(struct bpf_shim_tramp_link *shim_link)
> +{
> +	if (shim_link->tr)
I have been spinning back and forth with this "shim_link->tr" test and
the "!shim_link->tr" test below with an atomic64_dec_and_test() test
in between  :)

> +		bpf_trampoline_put(shim_link->tr);
Why put(tr) here? 

Intuitive thinking is that should be done after __bpf_trampoline_unlink_prog(.., tr)
which is still using the tr.
or I missed something inside __bpf_trampoline_unlink_prog(..., tr) ?

> +
> +	if (!atomic64_dec_and_test(&shim_link->refcnt))
> +		return;
> +
> +	if (!shim_link->tr)
And this is only for the error case in bpf_trampoline_link_cgroup_shim()?
Can it be handled locally in bpf_trampoline_link_cgroup_shim()
where it could actually happen ?

> +		return;
> +
> +	WARN_ON_ONCE(__bpf_trampoline_unlink_prog(&shim_link->tramp_link, shim_link->tr));
> +	kfree(shim_link);
How about shim_link->tramp_link.link.prog, is the prog freed ?

Considering the bpf_link_put() does bpf_prog_put(link->prog).
Is there a reason the bpf_link_put() not used and needs to
manage its own shim_link->refcnt here ?

> +}
> +
> +int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
> +				    struct bpf_attach_target_info *tgt_info)
> +{
> +	struct bpf_shim_tramp_link *shim_link = NULL;
> +	struct bpf_trampoline *tr;
> +	bpf_func_t bpf_func;
> +	u64 key;
> +	int err;
> +
> +	key = bpf_trampoline_compute_key(NULL, prog->aux->attach_btf,
> +					 prog->aux->attach_btf_id);
> +
> +	err = bpf_lsm_find_cgroup_shim(prog, &bpf_func);
> +	if (err)
> +		return err;
> +
> +	tr = bpf_trampoline_get(key, tgt_info);
> +	if (!tr)
> +		return  -ENOMEM;
> +
> +	mutex_lock(&tr->mutex);
> +
> +	shim_link = cgroup_shim_find(tr, bpf_func);
> +	if (shim_link) {
> +		/* Reusing existing shim attached by the other program. */
> +		atomic64_inc(&shim_link->refcnt);
> +		/* note, we're still holding tr refcnt from above */
hmm... why it still needs to hold the tr refcnt ?

> +
> +		mutex_unlock(&tr->mutex);
> +		return 0;
> +	}
> +
> +	/* Allocate and install new shim. */
> +
> +	shim_link = cgroup_shim_alloc(prog, bpf_func);
> +	if (!shim_link) {
> +		bpf_trampoline_put(tr);
> +		err = -ENOMEM;
> +		goto out;
> +	}
> +
> +	err = __bpf_trampoline_link_prog(&shim_link->tramp_link, tr);
> +	if (err)
> +		goto out;
> +
> +	shim_link->tr = tr;
> +
> +	mutex_unlock(&tr->mutex);
> +
> +	return 0;
> +out:
> +	mutex_unlock(&tr->mutex);
> +
> +	if (shim_link)
> +		cgroup_shim_put(shim_link);
> +
> +	return err;
> +}
> +

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 04/11] bpf: minimize number of allocated lsm slots per program
  2022-05-18 22:55 ` [PATCH bpf-next v7 04/11] bpf: minimize number of allocated lsm slots per program Stanislav Fomichev
@ 2022-05-21  6:56   ` Martin KaFai Lau
  2022-05-24  2:14     ` Stanislav Fomichev
  0 siblings, 1 reply; 54+ messages in thread
From: Martin KaFai Lau @ 2022-05-21  6:56 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii

On Wed, May 18, 2022 at 03:55:24PM -0700, Stanislav Fomichev wrote:
> Previous patch adds 1:1 mapping between all 211 LSM hooks
> and bpf_cgroup program array. Instead of reserving a slot per
> possible hook, reserve 10 slots per cgroup for lsm programs.
> Those slots are dynamically allocated on demand and reclaimed.
> 
> struct cgroup_bpf {
> 	struct bpf_prog_array *    effective[33];        /*     0   264 */
> 	/* --- cacheline 4 boundary (256 bytes) was 8 bytes ago --- */
> 	struct hlist_head          progs[33];            /*   264   264 */
> 	/* --- cacheline 8 boundary (512 bytes) was 16 bytes ago --- */
> 	u8                         flags[33];            /*   528    33 */
> 
> 	/* XXX 7 bytes hole, try to pack */
> 
> 	struct list_head           storages;             /*   568    16 */
> 	/* --- cacheline 9 boundary (576 bytes) was 8 bytes ago --- */
> 	struct bpf_prog_array *    inactive;             /*   584     8 */
> 	struct percpu_ref          refcnt;               /*   592    16 */
> 	struct work_struct         release_work;         /*   608    72 */
> 
> 	/* size: 680, cachelines: 11, members: 7 */
> 	/* sum members: 673, holes: 1, sum holes: 7 */
> 	/* last cacheline: 40 bytes */
> };
> 
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>  include/linux/bpf-cgroup-defs.h |   3 +-
>  include/linux/bpf_lsm.h         |   6 --
>  kernel/bpf/bpf_lsm.c            |   5 --
>  kernel/bpf/cgroup.c             | 135 +++++++++++++++++++++++++++++---
>  4 files changed, 125 insertions(+), 24 deletions(-)
> 
> diff --git a/include/linux/bpf-cgroup-defs.h b/include/linux/bpf-cgroup-defs.h
> index d5a70a35dace..359d3f16abea 100644
> --- a/include/linux/bpf-cgroup-defs.h
> +++ b/include/linux/bpf-cgroup-defs.h
> @@ -10,7 +10,8 @@
>  
>  struct bpf_prog_array;
>  
> -#define CGROUP_LSM_NUM 211 /* will be addressed in the next patch */
> +/* Maximum number of concurrently attachable per-cgroup LSM hooks. */
> +#define CGROUP_LSM_NUM 10
>  
>  enum cgroup_bpf_attach_type {
>  	CGROUP_BPF_ATTACH_TYPE_INVALID = -1,
> diff --git a/include/linux/bpf_lsm.h b/include/linux/bpf_lsm.h
> index 7f0e59f5f9be..613de44aa429 100644
> --- a/include/linux/bpf_lsm.h
> +++ b/include/linux/bpf_lsm.h
> @@ -43,7 +43,6 @@ extern const struct bpf_func_proto bpf_inode_storage_delete_proto;
>  void bpf_inode_storage_free(struct inode *inode);
>  
>  int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog, bpf_func_t *bpf_func);
> -int bpf_lsm_hook_idx(u32 btf_id);
>  
>  #else /* !CONFIG_BPF_LSM */
>  
> @@ -74,11 +73,6 @@ static inline int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
>  	return -ENOENT;
>  }
>  
> -static inline int bpf_lsm_hook_idx(u32 btf_id)
> -{
> -	return -EINVAL;
> -}
> -
>  #endif /* CONFIG_BPF_LSM */
>  
>  #endif /* _LINUX_BPF_LSM_H */
> diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
> index 654c23577ad3..96503c3e7a71 100644
> --- a/kernel/bpf/bpf_lsm.c
> +++ b/kernel/bpf/bpf_lsm.c
> @@ -71,11 +71,6 @@ int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
>  	return 0;
>  }
>  
> -int bpf_lsm_hook_idx(u32 btf_id)
> -{
> -	return btf_id_set_index(&bpf_lsm_hooks, btf_id);
> -}
> -
>  int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
>  			const struct bpf_prog *prog)
>  {
> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> index 2c356a38f4cf..a959cdd22870 100644
> --- a/kernel/bpf/cgroup.c
> +++ b/kernel/bpf/cgroup.c
> @@ -132,15 +132,110 @@ unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
>  }
>  
>  #ifdef CONFIG_BPF_LSM
> +struct list_head unused_bpf_lsm_atypes;
> +struct list_head used_bpf_lsm_atypes;
> +
> +struct bpf_lsm_attach_type {
> +	int index;
> +	u32 btf_id;
> +	int usecnt;
> +	struct list_head atypes;
> +	struct rcu_head rcu_head;
> +};
> +
> +static int __init bpf_lsm_attach_type_init(void)
> +{
> +	struct bpf_lsm_attach_type *atype;
> +	int i;
> +
> +	INIT_LIST_HEAD_RCU(&unused_bpf_lsm_atypes);
> +	INIT_LIST_HEAD_RCU(&used_bpf_lsm_atypes);
> +
> +	for (i = 0; i < CGROUP_LSM_NUM; i++) {
> +		atype = kzalloc(sizeof(*atype), GFP_KERNEL);
> +		if (!atype)
> +			continue;
> +
> +		atype->index = i;
> +		list_add_tail_rcu(&atype->atypes, &unused_bpf_lsm_atypes);
> +	}
> +
> +	return 0;
> +}
> +late_initcall(bpf_lsm_attach_type_init);
> +
>  static enum cgroup_bpf_attach_type bpf_lsm_attach_type_get(u32 attach_btf_id)
>  {
> -	return CGROUP_LSM_START + bpf_lsm_hook_idx(attach_btf_id);
> +	struct bpf_lsm_attach_type *atype;
> +
> +	lockdep_assert_held(&cgroup_mutex);
> +
> +	list_for_each_entry_rcu(atype, &used_bpf_lsm_atypes, atypes) {
> +		if (atype->btf_id != attach_btf_id)
> +			continue;
> +
> +		atype->usecnt++;
> +		return CGROUP_LSM_START + atype->index;
> +	}
> +
> +	atype = list_first_or_null_rcu(&unused_bpf_lsm_atypes, struct bpf_lsm_attach_type, atypes);
> +	if (!atype)
> +		return -E2BIG;
> +
> +	list_del_rcu(&atype->atypes);
> +	atype->btf_id = attach_btf_id;
> +	atype->usecnt = 1;
> +	list_add_tail_rcu(&atype->atypes, &used_bpf_lsm_atypes);
> +
> +	return CGROUP_LSM_START + atype->index;
> +}
> +
> +static void bpf_lsm_attach_type_reclaim(struct rcu_head *head)
> +{
> +	struct bpf_lsm_attach_type *atype =
> +		container_of(head, struct bpf_lsm_attach_type, rcu_head);
> +
> +	atype->btf_id = 0;
> +	atype->usecnt = 0;
> +	list_add_tail_rcu(&atype->atypes, &unused_bpf_lsm_atypes);
hmm...... no need to hold the cgroup_mutex when changing
the unused_bpf_lsm_atypes list ?
but it is a rcu callback, so spinlock is needed.

> +}
> +
> +static void bpf_lsm_attach_type_put(u32 attach_btf_id)
> +{
> +	struct bpf_lsm_attach_type *atype;
> +
> +	lockdep_assert_held(&cgroup_mutex);
> +
> +	list_for_each_entry_rcu(atype, &used_bpf_lsm_atypes, atypes) {
> +		if (atype->btf_id != attach_btf_id)
> +			continue;
> +
> +		if (--atype->usecnt <= 0) {
> +			list_del_rcu(&atype->atypes);
> +			WARN_ON_ONCE(atype->usecnt < 0);
> +
> +			/* call_rcu here prevents atype reuse within
> +			 * the same rcu grace period.
> +			 * shim programs use __bpf_prog_enter_lsm_cgroup
> +			 * which starts RCU read section.
It is a bit unclear for me to think through why
there is no need to assign 'shim_prog->aux->cgroup_atype = CGROUP_BPF_ATTACH_TYPE_INVALID'
here before reclaim and the shim_prog->bpf_func does not need to check
shim_prog->aux->cgroup_atype before using it.

It will be very useful to have a few word comments here to explain this.

> +			 */
> +			call_rcu(&atype->rcu_head, bpf_lsm_attach_type_reclaim);
How about doing this bpf_lsm_attach_type_put() in bpf_prog_free_deferred().
And only do it for the shim_prog but not the cgroup-lsm prog.
The shim_prog is the only one needing cgroup_atype.  Then the cgroup_atype
naturally can be reused when the shim_prog is being destroyed.

bpf_prog_free_deferred has already gone through a rcu grace
period (__bpf_prog_put_rcu) and it can block, so cgroup_mutex
can be used.

The need for the rcu_head here should go away also.  The v6 array approach
could be reconsidered.

The cgroup-lsm prog does not necessarily need to hold a usecnt to the cgroup_atype.
Their aux->cgroup_atype can be CGROUP_BPF_ATTACH_TYPE_INVALID.
My understanding is the replace_prog->aux->cgroup_atype during attach
is an optimization, it can always search again.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 03/11] bpf: per-cgroup lsm flavor
  2022-05-21  0:03     ` Stanislav Fomichev
@ 2022-05-23 15:41       ` Yonghong Song
  0 siblings, 0 replies; 54+ messages in thread
From: Yonghong Song @ 2022-05-23 15:41 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii



On 5/20/22 5:03 PM, Stanislav Fomichev wrote:
> On Thu, May 19, 2022 at 6:01 PM Yonghong Song <yhs@fb.com> wrote:
>>
>>
>>
>> On 5/18/22 3:55 PM, Stanislav Fomichev wrote:
>>> Allow attaching to lsm hooks in the cgroup context.
>>>
>>> Attaching to per-cgroup LSM works exactly like attaching
>>> to other per-cgroup hooks. New BPF_LSM_CGROUP is added
>>> to trigger new mode; the actual lsm hook we attach to is
>>> signaled via existing attach_btf_id.
>>>
>>> For the hooks that have 'struct socket' or 'struct sock' as its first
>>> argument, we use the cgroup associated with that socket. For the rest,
>>> we use 'current' cgroup (this is all on default hierarchy == v2 only).
>>> Note that for some hooks that work on 'struct sock' we still
>>> take the cgroup from 'current' because some of them work on the socket
>>> that hasn't been properly initialized yet.
>>>
>>> Behind the scenes, we allocate a shim program that is attached
>>> to the trampoline and runs cgroup effective BPF programs array.
>>> This shim has some rudimentary ref counting and can be shared
>>> between several programs attaching to the same per-cgroup lsm hook.
>>>
>>> Note that this patch bloats cgroup size because we add 211
>>> cgroup_bpf_attach_type(s) for simplicity sake. This will be
>>> addressed in the subsequent patch.
>>>
>>> Also note that we only add non-sleepable flavor for now. To enable
>>> sleepable use-cases, bpf_prog_run_array_cg has to grab trace rcu,
>>> shim programs have to be freed via trace rcu, cgroup_bpf.effective
>>> should be also trace-rcu-managed + maybe some other changes that
>>> I'm not aware of.
>>>
>>> Signed-off-by: Stanislav Fomichev <sdf@google.com>
>>> ---
>>>    arch/x86/net/bpf_jit_comp.c     |  24 +++--
>>>    include/linux/bpf-cgroup-defs.h |   6 ++
>>>    include/linux/bpf-cgroup.h      |   7 ++
>>>    include/linux/bpf.h             |  25 +++++
>>>    include/linux/bpf_lsm.h         |  14 +++
>>>    include/linux/btf_ids.h         |   3 +-
>>>    include/uapi/linux/bpf.h        |   1 +
>>>    kernel/bpf/bpf_lsm.c            |  50 +++++++++
>>>    kernel/bpf/btf.c                |  11 ++
>>>    kernel/bpf/cgroup.c             | 181 ++++++++++++++++++++++++++++---
>>>    kernel/bpf/core.c               |   2 +
>>>    kernel/bpf/syscall.c            |  10 ++
>>>    kernel/bpf/trampoline.c         | 184 ++++++++++++++++++++++++++++++++
>>>    kernel/bpf/verifier.c           |  28 +++++
>>>    tools/include/linux/btf_ids.h   |   4 +-
>>>    tools/include/uapi/linux/bpf.h  |   1 +
>>>    16 files changed, 527 insertions(+), 24 deletions(-)
>>
>> A few nits below.
>>
>>>
>>> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
>>> index a2b6d197c226..5cdebf4312da 100644
>>> --- a/arch/x86/net/bpf_jit_comp.c
>>> +++ b/arch/x86/net/bpf_jit_comp.c
>>> @@ -1765,6 +1765,10 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
>>>                           struct bpf_tramp_link *l, int stack_size,
>>>                           int run_ctx_off, bool save_ret)
>>>    {
>>> +     void (*exit)(struct bpf_prog *prog, u64 start,
>>> +                  struct bpf_tramp_run_ctx *run_ctx) = __bpf_prog_exit;
>>> +     u64 (*enter)(struct bpf_prog *prog,
>>> +                  struct bpf_tramp_run_ctx *run_ctx) = __bpf_prog_enter;
>>>        u8 *prog = *pprog;
>>>        u8 *jmp_insn;
>>>        int ctx_cookie_off = offsetof(struct bpf_tramp_run_ctx, bpf_cookie);
[...]
>>>        return bsearch(&id, set->ids, set->cnt, sizeof(u32), btf_id_cmp_func) != NULL;
>>> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
>>> index 134785ab487c..2c356a38f4cf 100644
>>> --- a/kernel/bpf/cgroup.c
>>> +++ b/kernel/bpf/cgroup.c
>>> @@ -14,6 +14,9 @@
>>>    #include <linux/string.h>
>>>    #include <linux/bpf.h>
>>>    #include <linux/bpf-cgroup.h>
>>> +#include <linux/btf_ids.h>
>>> +#include <linux/bpf_lsm.h>
>>> +#include <linux/bpf_verifier.h>
>>>    #include <net/sock.h>
>>>    #include <net/bpf_sk_storage.h>
>>>
>>> @@ -61,6 +64,85 @@ bpf_prog_run_array_cg(const struct cgroup_bpf *cgrp,
>>>        return run_ctx.retval;
>>>    }
>>>
>>> +unsigned int __cgroup_bpf_run_lsm_sock(const void *ctx,
>>> +                                    const struct bpf_insn *insn)
>>> +{
>>> +     const struct bpf_prog *shim_prog;
>>> +     struct sock *sk;
>>> +     struct cgroup *cgrp;
>>> +     int ret = 0;
>>> +     u64 *regs;
>>> +
>>> +     regs = (u64 *)ctx;
>>> +     sk = (void *)(unsigned long)regs[BPF_REG_0];
>>
>> Maybe just my own opinion. Using BPF_REG_0 as an index is a little bit
>> confusing. Maybe just use '0' to indicate the first parameter.
>> Maybe changing 'regs' to 'params' would also be a better choice?
>> In reality, the trampoline just passes an array of parameters to
>> the program. The same for a few places below.
> 
> Sure, let's rename it and use 0. I'll do args instead of params maybe?

'args' works for me too.

> 
>>> +     /*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
>>> +     shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
>>
>> I didn't experiment, but why container_of won't work?
> 
> There is a type check in container_of that doesn't seem to work for flex arrays:
> 
> kernel/bpf/cgroup.c:78:14: error: static_assert failed due to
> requirement '__builtin_types_compatible_p(const struct bpf_insn,
> struct bpf_insn []"
>          shim_prog = container_of(insn, struct bpf_prog, insnsi);
>                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ./include/linux/container_of.h:19:2: note: expanded from macro 'container_of'
>          static_assert(__same_type(*(ptr), ((type *)0)->member) ||       \
>          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ./include/linux/build_bug.h:77:34: note: expanded from macro 'static_assert'
> #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
>                                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ./include/linux/build_bug.h:78:41: note: expanded from macro '__static_assert'
> #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
>                                          ^              ~~~~
> 1 error generated.

You are right. Thanks for explanation.

[...]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 07/11] libbpf: implement bpf_prog_query_opts
  2022-05-18 22:55 ` [PATCH bpf-next v7 07/11] libbpf: implement bpf_prog_query_opts Stanislav Fomichev
@ 2022-05-23 23:22   ` Andrii Nakryiko
  2022-05-24  2:15     ` Stanislav Fomichev
  0 siblings, 1 reply; 54+ messages in thread
From: Andrii Nakryiko @ 2022-05-23 23:22 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko

On Wed, May 18, 2022 at 3:55 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> Implement bpf_prog_query_opts as a more extendable version of
> bpf_prog_query. Expose new prog_attach_flags and attach_btf_func_id as
> well:
>
> * prog_attach_flags is a per-program attach_flags; relevant only for
>   lsm cgroup programs which might have different attach_flags
>   per attach_btf_id
> * attach_btf_func_id is a new field exposed for prog_query which
>   specifies the real btf function id for lsm cgroup attachments
>

just thoughts aloud... Shouldn't bpf_prog_query() also return link_id
if the attachment was done with LINK_CREATE? And then attach flags
could actually be fetched through corresponding struct bpf_link_info.
That is, bpf_prog_query() returns a list of link_ids, and whatever
link-specific information can be fetched by querying individual links.
Seems more logical (and useful overall) to extend struct bpf_link_info
(you can get it more generically from bpftool, by querying fdinfo,
etc).

> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>  tools/include/uapi/linux/bpf.h |  5 ++++
>  tools/lib/bpf/bpf.c            | 42 +++++++++++++++++++++++++++-------
>  tools/lib/bpf/bpf.h            | 15 ++++++++++++
>  tools/lib/bpf/libbpf.map       |  1 +
>  4 files changed, 55 insertions(+), 8 deletions(-)
>

[...]

>         ret = sys_bpf(BPF_PROG_QUERY, &attr, sizeof(attr));
>
> -       if (attach_flags)
> -               *attach_flags = attr.query.attach_flags;
> -       *prog_cnt = attr.query.prog_cnt;
> +       if (OPTS_HAS(opts, prog_cnt))
> +               opts->prog_cnt = attr.query.prog_cnt;

just use OPTS_SET() instead of OPTS_HAS check

> +       if (OPTS_HAS(opts, attach_flags))
> +               opts->attach_flags = attr.query.attach_flags;
>
>         return libbpf_err_errno(ret);
>  }
>

[...]

> diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> index 6b36f46ab5d8..24f7a5147bf2 100644
> --- a/tools/lib/bpf/libbpf.map
> +++ b/tools/lib/bpf/libbpf.map
> @@ -452,6 +452,7 @@ LIBBPF_0.8.0 {
>                 bpf_map_delete_elem_flags;
>                 bpf_object__destroy_subskeleton;
>                 bpf_object__open_subskeleton;
> +               bpf_prog_query_opts;

please put it into LIBBPF_1.0.0 section, 0.8 is closed now


>                 bpf_program__attach_kprobe_multi_opts;
>                 bpf_program__attach_trace_opts;
>                 bpf_program__attach_usdt;
> --
> 2.36.1.124.g0e6072fb45-goog
>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-05-18 22:55 ` [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP Stanislav Fomichev
  2022-05-19  2:31   ` kernel test robot
  2022-05-19 14:57   ` kernel test robot
@ 2022-05-23 23:23   ` Andrii Nakryiko
  2022-05-24  2:15     ` Stanislav Fomichev
  2022-05-24  3:48   ` Martin KaFai Lau
  3 siblings, 1 reply; 54+ messages in thread
From: Andrii Nakryiko @ 2022-05-23 23:23 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko

On Wed, May 18, 2022 at 3:55 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> We have two options:
> 1. Treat all BPF_LSM_CGROUP the same, regardless of attach_btf_id
> 2. Treat BPF_LSM_CGROUP+attach_btf_id as a separate hook point
>
> I was doing (2) in the original patch, but switching to (1) here:
>
> * bpf_prog_query returns all attached BPF_LSM_CGROUP programs
> regardless of attach_btf_id
> * attach_btf_id is exported via bpf_prog_info
>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>  include/uapi/linux/bpf.h |   5 ++
>  kernel/bpf/cgroup.c      | 103 +++++++++++++++++++++++++++------------
>  kernel/bpf/syscall.c     |   4 +-
>  3 files changed, 81 insertions(+), 31 deletions(-)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index b9d2d6de63a7..432fc5f49567 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -1432,6 +1432,7 @@ union bpf_attr {
>                 __u32           attach_flags;
>                 __aligned_u64   prog_ids;
>                 __u32           prog_cnt;
> +               __aligned_u64   prog_attach_flags; /* output: per-program attach_flags */
>         } query;
>
>         struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
> @@ -5911,6 +5912,10 @@ struct bpf_prog_info {
>         __u64 run_cnt;
>         __u64 recursion_misses;
>         __u32 verified_insns;
> +       /* BTF ID of the function to attach to within BTF object identified
> +        * by btf_id.
> +        */
> +       __u32 attach_btf_func_id;

it's called attach_btf_id for PROG_LOAD command, keep it consistently
named (and a bit more generic)?

>  } __attribute__((aligned(8)));
>
>  struct bpf_map_info {

[...]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 08/11] libbpf: add lsm_cgoup_sock type
  2022-05-18 22:55 ` [PATCH bpf-next v7 08/11] libbpf: add lsm_cgoup_sock type Stanislav Fomichev
@ 2022-05-23 23:26   ` Andrii Nakryiko
  2022-05-24  2:15     ` Stanislav Fomichev
  0 siblings, 1 reply; 54+ messages in thread
From: Andrii Nakryiko @ 2022-05-23 23:26 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko

On Wed, May 18, 2022 at 3:55 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> lsm_cgroup/ is the prefix for BPF_LSM_CGROUP.
>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>  tools/lib/bpf/libbpf.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index ef7f302e542f..854449dcd072 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -9027,6 +9027,7 @@ static const struct bpf_sec_def section_defs[] = {
>         SEC_DEF("fmod_ret.s+",          TRACING, BPF_MODIFY_RETURN, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
>         SEC_DEF("fexit.s+",             TRACING, BPF_TRACE_FEXIT, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
>         SEC_DEF("freplace+",            EXT, 0, SEC_ATTACH_BTF, attach_trace),
> +       SEC_DEF("lsm_cgroup+",          LSM, BPF_LSM_CGROUP, SEC_ATTACH_BTF),

we don't do simplistic prefix match anymore, so this doesn't have to
go before lsm+ (we do prefix match only for legacy SEC_SLOPPY cases).
So total nit (but wanted to dispel preconception that we need to avoid
subprefix matches), I'd put this after lsm+


>         SEC_DEF("lsm+",                 LSM, BPF_LSM_MAC, SEC_ATTACH_BTF, attach_lsm),
>         SEC_DEF("lsm.s+",               LSM, BPF_LSM_MAC, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_lsm),
>         SEC_DEF("iter+",                TRACING, BPF_TRACE_ITER, SEC_ATTACH_BTF, attach_iter),
> @@ -9450,6 +9451,7 @@ void btf_get_kernel_prefix_kind(enum bpf_attach_type attach_type,
>                 *kind = BTF_KIND_TYPEDEF;
>                 break;
>         case BPF_LSM_MAC:
> +       case BPF_LSM_CGROUP:
>                 *prefix = BTF_LSM_PREFIX;
>                 *kind = BTF_KIND_FUNC;
>                 break;
> --
> 2.36.1.124.g0e6072fb45-goog
>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 11/11] selftests/bpf: verify lsm_cgroup struct sock access
  2022-05-18 22:55 ` [PATCH bpf-next v7 11/11] selftests/bpf: verify lsm_cgroup struct sock access Stanislav Fomichev
@ 2022-05-23 23:33   ` Andrii Nakryiko
  2022-05-24  2:15     ` Stanislav Fomichev
  0 siblings, 1 reply; 54+ messages in thread
From: Andrii Nakryiko @ 2022-05-23 23:33 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko

On Wed, May 18, 2022 at 3:56 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> sk_priority & sk_mark are writable, the rest is readonly.
>
> One interesting thing here is that the verifier doesn't
> really force me to add NULL checks anywhere :-/
>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>  .../selftests/bpf/prog_tests/lsm_cgroup.c     | 69 +++++++++++++++++++
>  1 file changed, 69 insertions(+)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c b/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
> index 29292ec40343..64b6830e03f5 100644
> --- a/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
> +++ b/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
> @@ -270,8 +270,77 @@ static void test_lsm_cgroup_functional(void)
>         lsm_cgroup__destroy(skel);
>  }
>
> +static int field_offset(const char *type, const char *field)
> +{
> +       const struct btf_member *memb;
> +       const struct btf_type *tp;
> +       const char *name;
> +       struct btf *btf;
> +       int btf_id;
> +       int i;
> +
> +       btf = btf__load_vmlinux_btf();
> +       if (!btf)
> +               return -1;
> +
> +       btf_id = btf__find_by_name_kind(btf, type, BTF_KIND_STRUCT);
> +       if (btf_id < 0)
> +               return -1;
> +
> +       tp = btf__type_by_id(btf, btf_id);
> +       memb = btf_members(tp);
> +
> +       for (i = 0; i < btf_vlen(tp); i++) {
> +               name = btf__name_by_offset(btf,
> +                                          memb->name_off);
> +               if (strcmp(field, name) == 0)
> +                       return memb->offset / 8;
> +               memb++;
> +       }
> +
> +       return -1;
> +}
> +
> +static bool sk_writable_field(const char *type, const char *field, int size)
> +{
> +       LIBBPF_OPTS(bpf_prog_load_opts, opts,
> +                   .expected_attach_type = BPF_LSM_CGROUP);
> +       struct bpf_insn insns[] = {
> +               /* r1 = *(u64 *)(r1 + 0) */
> +               BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, 0),
> +               /* r1 = *(u64 *)(r1 + offsetof(struct socket, sk)) */
> +               BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, field_offset("socket", "sk")),
> +               /* r2 = *(u64 *)(r1 + offsetof(struct sock, <field>)) */
> +               BPF_LDX_MEM(size, BPF_REG_2, BPF_REG_1, field_offset(type, field)),
> +               /* *(u64 *)(r1 + offsetof(struct sock, <field>)) = r2 */
> +               BPF_STX_MEM(size, BPF_REG_1, BPF_REG_2, field_offset(type, field)),
> +               BPF_MOV64_IMM(BPF_REG_0, 1),
> +               BPF_EXIT_INSN(),
> +       };
> +       int fd;

This is really not much better than test_verifier assembly. What I had
in mind when I was suggesting to use test_progs was that you'd have a
normal C source code for BPF part, something like this:

__u64 tmp;

SEC("?lsm_cgroup/socket_bind")
int BPF_PROG(access1_bad, struct socket *sock, struct sockaddr *address,
	     int addrlen)
{
	*(volatile u16 *)&sock->sk->__sk_common.skc_family =
		*(volatile u16 *)&sock->sk->__sk_common.skc_family;
	return 0;
}


SEC("?lsm_cgroup/socket_bind")
int BPF_PROG(access2_bad, struct socket *sock, struct sockaddr *address,
	     int addrlen)
{
	*(volatile u64 *)&sock->sk->sk_sndtimeo =
		*(volatile u64 *)&sock->sk->sk_sndtimeo;
	return 0;
}

and so on. From user-space you'd be loading just one of those
accessX_bad programs at a time (note SEC("?"))


But having said that, what you did is pretty self-contained, so not
too bad. It's just not what I was suggesting :)

> +
> +       opts.attach_btf_id = libbpf_find_vmlinux_btf_id("socket_post_create",
> +                                                       opts.expected_attach_type);
> +
> +       fd = bpf_prog_load(BPF_PROG_TYPE_LSM, NULL, "GPL", insns, ARRAY_SIZE(insns), &opts);
> +       if (fd >= 0)
> +               close(fd);
> +       return fd >= 0;
> +}
> +
> +static void test_lsm_cgroup_access(void)
> +{
> +       ASSERT_FALSE(sk_writable_field("sock_common", "skc_family", BPF_H), "skc_family");
> +       ASSERT_FALSE(sk_writable_field("sock", "sk_sndtimeo", BPF_DW), "sk_sndtimeo");
> +       ASSERT_TRUE(sk_writable_field("sock", "sk_priority", BPF_W), "sk_priority");
> +       ASSERT_TRUE(sk_writable_field("sock", "sk_mark", BPF_W), "sk_mark");
> +       ASSERT_FALSE(sk_writable_field("sock", "sk_pacing_rate", BPF_DW), "sk_pacing_rate");
> +}
> +
>  void test_lsm_cgroup(void)
>  {
>         if (test__start_subtest("functional"))
>                 test_lsm_cgroup_functional();
> +       if (test__start_subtest("access"))
> +               test_lsm_cgroup_access();
>  }
> --
> 2.36.1.124.g0e6072fb45-goog
>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 04/11] bpf: minimize number of allocated lsm slots per program
  2022-05-21  6:56   ` Martin KaFai Lau
@ 2022-05-24  2:14     ` Stanislav Fomichev
  2022-05-24  5:53       ` Martin KaFai Lau
  0 siblings, 1 reply; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-24  2:14 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: netdev, bpf, ast, daniel, andrii

On Fri, May 20, 2022 at 11:56 PM Martin KaFai Lau <kafai@fb.com> wrote:
>
> On Wed, May 18, 2022 at 03:55:24PM -0700, Stanislav Fomichev wrote:
> > Previous patch adds 1:1 mapping between all 211 LSM hooks
> > and bpf_cgroup program array. Instead of reserving a slot per
> > possible hook, reserve 10 slots per cgroup for lsm programs.
> > Those slots are dynamically allocated on demand and reclaimed.
> >
> > struct cgroup_bpf {
> >       struct bpf_prog_array *    effective[33];        /*     0   264 */
> >       /* --- cacheline 4 boundary (256 bytes) was 8 bytes ago --- */
> >       struct hlist_head          progs[33];            /*   264   264 */
> >       /* --- cacheline 8 boundary (512 bytes) was 16 bytes ago --- */
> >       u8                         flags[33];            /*   528    33 */
> >
> >       /* XXX 7 bytes hole, try to pack */
> >
> >       struct list_head           storages;             /*   568    16 */
> >       /* --- cacheline 9 boundary (576 bytes) was 8 bytes ago --- */
> >       struct bpf_prog_array *    inactive;             /*   584     8 */
> >       struct percpu_ref          refcnt;               /*   592    16 */
> >       struct work_struct         release_work;         /*   608    72 */
> >
> >       /* size: 680, cachelines: 11, members: 7 */
> >       /* sum members: 673, holes: 1, sum holes: 7 */
> >       /* last cacheline: 40 bytes */
> > };
> >
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >  include/linux/bpf-cgroup-defs.h |   3 +-
> >  include/linux/bpf_lsm.h         |   6 --
> >  kernel/bpf/bpf_lsm.c            |   5 --
> >  kernel/bpf/cgroup.c             | 135 +++++++++++++++++++++++++++++---
> >  4 files changed, 125 insertions(+), 24 deletions(-)
> >
> > diff --git a/include/linux/bpf-cgroup-defs.h b/include/linux/bpf-cgroup-defs.h
> > index d5a70a35dace..359d3f16abea 100644
> > --- a/include/linux/bpf-cgroup-defs.h
> > +++ b/include/linux/bpf-cgroup-defs.h
> > @@ -10,7 +10,8 @@
> >
> >  struct bpf_prog_array;
> >
> > -#define CGROUP_LSM_NUM 211 /* will be addressed in the next patch */
> > +/* Maximum number of concurrently attachable per-cgroup LSM hooks. */
> > +#define CGROUP_LSM_NUM 10
> >
> >  enum cgroup_bpf_attach_type {
> >       CGROUP_BPF_ATTACH_TYPE_INVALID = -1,
> > diff --git a/include/linux/bpf_lsm.h b/include/linux/bpf_lsm.h
> > index 7f0e59f5f9be..613de44aa429 100644
> > --- a/include/linux/bpf_lsm.h
> > +++ b/include/linux/bpf_lsm.h
> > @@ -43,7 +43,6 @@ extern const struct bpf_func_proto bpf_inode_storage_delete_proto;
> >  void bpf_inode_storage_free(struct inode *inode);
> >
> >  int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog, bpf_func_t *bpf_func);
> > -int bpf_lsm_hook_idx(u32 btf_id);
> >
> >  #else /* !CONFIG_BPF_LSM */
> >
> > @@ -74,11 +73,6 @@ static inline int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
> >       return -ENOENT;
> >  }
> >
> > -static inline int bpf_lsm_hook_idx(u32 btf_id)
> > -{
> > -     return -EINVAL;
> > -}
> > -
> >  #endif /* CONFIG_BPF_LSM */
> >
> >  #endif /* _LINUX_BPF_LSM_H */
> > diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
> > index 654c23577ad3..96503c3e7a71 100644
> > --- a/kernel/bpf/bpf_lsm.c
> > +++ b/kernel/bpf/bpf_lsm.c
> > @@ -71,11 +71,6 @@ int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
> >       return 0;
> >  }
> >
> > -int bpf_lsm_hook_idx(u32 btf_id)
> > -{
> > -     return btf_id_set_index(&bpf_lsm_hooks, btf_id);
> > -}
> > -
> >  int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
> >                       const struct bpf_prog *prog)
> >  {
> > diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> > index 2c356a38f4cf..a959cdd22870 100644
> > --- a/kernel/bpf/cgroup.c
> > +++ b/kernel/bpf/cgroup.c
> > @@ -132,15 +132,110 @@ unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
> >  }
> >
> >  #ifdef CONFIG_BPF_LSM
> > +struct list_head unused_bpf_lsm_atypes;
> > +struct list_head used_bpf_lsm_atypes;
> > +
> > +struct bpf_lsm_attach_type {
> > +     int index;
> > +     u32 btf_id;
> > +     int usecnt;
> > +     struct list_head atypes;
> > +     struct rcu_head rcu_head;
> > +};
> > +
> > +static int __init bpf_lsm_attach_type_init(void)
> > +{
> > +     struct bpf_lsm_attach_type *atype;
> > +     int i;
> > +
> > +     INIT_LIST_HEAD_RCU(&unused_bpf_lsm_atypes);
> > +     INIT_LIST_HEAD_RCU(&used_bpf_lsm_atypes);
> > +
> > +     for (i = 0; i < CGROUP_LSM_NUM; i++) {
> > +             atype = kzalloc(sizeof(*atype), GFP_KERNEL);
> > +             if (!atype)
> > +                     continue;
> > +
> > +             atype->index = i;
> > +             list_add_tail_rcu(&atype->atypes, &unused_bpf_lsm_atypes);
> > +     }
> > +
> > +     return 0;
> > +}
> > +late_initcall(bpf_lsm_attach_type_init);
> > +
> >  static enum cgroup_bpf_attach_type bpf_lsm_attach_type_get(u32 attach_btf_id)
> >  {
> > -     return CGROUP_LSM_START + bpf_lsm_hook_idx(attach_btf_id);
> > +     struct bpf_lsm_attach_type *atype;
> > +
> > +     lockdep_assert_held(&cgroup_mutex);
> > +
> > +     list_for_each_entry_rcu(atype, &used_bpf_lsm_atypes, atypes) {
> > +             if (atype->btf_id != attach_btf_id)
> > +                     continue;
> > +
> > +             atype->usecnt++;
> > +             return CGROUP_LSM_START + atype->index;
> > +     }
> > +
> > +     atype = list_first_or_null_rcu(&unused_bpf_lsm_atypes, struct bpf_lsm_attach_type, atypes);
> > +     if (!atype)
> > +             return -E2BIG;
> > +
> > +     list_del_rcu(&atype->atypes);
> > +     atype->btf_id = attach_btf_id;
> > +     atype->usecnt = 1;
> > +     list_add_tail_rcu(&atype->atypes, &used_bpf_lsm_atypes);
> > +
> > +     return CGROUP_LSM_START + atype->index;
> > +}
> > +
> > +static void bpf_lsm_attach_type_reclaim(struct rcu_head *head)
> > +{
> > +     struct bpf_lsm_attach_type *atype =
> > +             container_of(head, struct bpf_lsm_attach_type, rcu_head);
> > +
> > +     atype->btf_id = 0;
> > +     atype->usecnt = 0;
> > +     list_add_tail_rcu(&atype->atypes, &unused_bpf_lsm_atypes);
> hmm...... no need to hold the cgroup_mutex when changing
> the unused_bpf_lsm_atypes list ?
> but it is a rcu callback, so spinlock is needed.

Oh, good point.

> > +}
> > +
> > +static void bpf_lsm_attach_type_put(u32 attach_btf_id)
> > +{
> > +     struct bpf_lsm_attach_type *atype;
> > +
> > +     lockdep_assert_held(&cgroup_mutex);
> > +
> > +     list_for_each_entry_rcu(atype, &used_bpf_lsm_atypes, atypes) {
> > +             if (atype->btf_id != attach_btf_id)
> > +                     continue;
> > +
> > +             if (--atype->usecnt <= 0) {
> > +                     list_del_rcu(&atype->atypes);
> > +                     WARN_ON_ONCE(atype->usecnt < 0);
> > +
> > +                     /* call_rcu here prevents atype reuse within
> > +                      * the same rcu grace period.
> > +                      * shim programs use __bpf_prog_enter_lsm_cgroup
> > +                      * which starts RCU read section.
> It is a bit unclear for me to think through why
> there is no need to assign 'shim_prog->aux->cgroup_atype = CGROUP_BPF_ATTACH_TYPE_INVALID'
> here before reclaim and the shim_prog->bpf_func does not need to check
> shim_prog->aux->cgroup_atype before using it.
>
> It will be very useful to have a few word comments here to explain this.

My thinking is:
- the shim program starts an rcu read section (via __bpf_prog_enter_lsm_cgroup)
- on release (bpf_lsm_attach_type_put) we do
list_del_rcu(&atype->atypes) to make sure that particular atype stays
"reserved" until the grace period and is not reused
- we won't reuse that particular atype for new attachments until the grace period
- existing shim programs will still use this atype until the grace period,
but we rely on the cgroup effective array being empty by that point
- after the grace period, we reclaim that atype

Does that clarify your concern? Am I missing something? Not sure how to
put it into a small/concise comment :-)
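
Maybe something along these lines as a draft of that comment (assuming we
keep this approach):

	/* call_rcu() keeps the atype off the unused list until a grace
	 * period has elapsed, so it cannot be handed out to a new
	 * attachment while it may still be observed.  Shim programs run
	 * under rcu_read_lock() (via __bpf_prog_enter_lsm_cgroup), and by
	 * the time the grace period ends, the cgroup effective array for
	 * this slot is expected to be empty.
	 */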

(maybe moot after your next comment)

> > +                      */
> > +                     call_rcu(&atype->rcu_head, bpf_lsm_attach_type_reclaim);
> How about doing this bpf_lsm_attach_type_put() in bpf_prog_free_deferred().
> And only do it for the shim_prog but not the cgroup-lsm prog.
> The shim_prog is the only one needing cgroup_atype.  Then the cgroup_atype
> naturally can be reused when the shim_prog is being destroyed.
>
> bpf_prog_free_deferred has already gone through a rcu grace
> period (__bpf_prog_put_rcu) and it can block, so cgroup_mutex
> can be used.
>
> The need for the rcu_head here should go away also.  The v6 array approach
> could be reconsidered.
>
> The cgroup-lsm prog does not necessarily need to hold a usecnt to the cgroup_atype.
> Their aux->cgroup_atype can be CGROUP_BPF_ATTACH_TYPE_INVALID.
> My understanding is the replace_prog->aux->cgroup_atype during attach
> is an optimization, it can always search again.

I've considered using bpf_prog_free (I think Alexei also suggested
it?), but didn't end up using it because of the situation where the
program can be attached, then detached, but not actually freed (there
is a link or an fd holding it); in this case we'd be blocking that
atype's reuse. But I'm not sure if it's a real problem?

Let me try to see if it works again; your suggestions make sense
(especially the part about cgroup_atype for the shim only, I don't like
all these replace_prog->aux->cgroup_atype = atype assignments in weird places)
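
If I read your suggestion right, the shape would be roughly the below
(untested sketch; names as in this series, and it assumes
bpf_lsm_attach_type_put() becomes callable from core.c and that non-shim
programs keep cgroup_atype == CGROUP_BPF_ATTACH_TYPE_INVALID, as you
describe):

static void bpf_prog_free_deferred(struct work_struct *work)
{
	struct bpf_prog_aux *aux = container_of(work, struct bpf_prog_aux, work);

	/* ... existing teardown ... */
#ifdef CONFIG_BPF_LSM
	/* only the shim prog carries a valid cgroup_atype */
	if (aux->cgroup_atype != CGROUP_BPF_ATTACH_TYPE_INVALID)
		bpf_lsm_attach_type_put(aux->attach_btf_id);
#endif
	/* ... */
}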

Thank you for the review!

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 03/11] bpf: per-cgroup lsm flavor
  2022-05-21  0:53   ` Martin KaFai Lau
@ 2022-05-24  2:15     ` Stanislav Fomichev
  2022-05-24  5:40       ` Martin KaFai Lau
  2022-05-24  5:57       ` Martin KaFai Lau
  0 siblings, 2 replies; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-24  2:15 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: netdev, bpf, ast, daniel, andrii


On Fri, May 20, 2022 at 5:53 PM Martin KaFai Lau <kafai@fb.com> wrote:
>
> On Wed, May 18, 2022 at 03:55:23PM -0700, Stanislav Fomichev wrote:
>
> [ ... ]
>
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index ea3674a415f9..70cf1dad91df 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -768,6 +768,10 @@ void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start, struct bpf_tramp_
> >  u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog, struct bpf_tramp_run_ctx *run_ctx);
> >  void notrace __bpf_prog_exit_sleepable(struct bpf_prog *prog, u64 start,
> >                                      struct bpf_tramp_run_ctx *run_ctx);
> > +u64 notrace __bpf_prog_enter_lsm_cgroup(struct bpf_prog *prog,
> > +                                     struct bpf_tramp_run_ctx *run_ctx);
> > +void notrace __bpf_prog_exit_lsm_cgroup(struct bpf_prog *prog, u64 start,
> > +                                     struct bpf_tramp_run_ctx *run_ctx);
> >  void notrace __bpf_tramp_enter(struct bpf_tramp_image *tr);
> >  void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr);
> >
> > @@ -1035,6 +1039,7 @@ struct bpf_prog_aux {
> >       u64 load_time; /* ns since boottime */
> >       u32 verified_insns;
> >       struct bpf_map *cgroup_storage[MAX_BPF_CGROUP_STORAGE_TYPE];
> > +     int cgroup_atype; /* enum cgroup_bpf_attach_type */
> >       char name[BPF_OBJ_NAME_LEN];
> >  #ifdef CONFIG_SECURITY
> >       void *security;
> > @@ -1107,6 +1112,12 @@ struct bpf_tramp_link {
> >       u64 cookie;
> >  };
> >
> > +struct bpf_shim_tramp_link {
> > +     struct bpf_tramp_link tramp_link;
> > +     struct bpf_trampoline *tr;
> > +     atomic64_t refcnt;
> There is already a refcnt in 'struct bpf_link'.
> Reuse that one if possible.

I was assuming that having a per-bpf_shim_tramp_link refcnt might be
more readable. I'll switch to the one from bpf_link per the comments
below.

> [ ... ]
>
> > diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
> > index 01ce78c1df80..c424056f0b35 100644
> > --- a/kernel/bpf/trampoline.c
> > +++ b/kernel/bpf/trampoline.c
> > @@ -11,6 +11,8 @@
> >  #include <linux/rcupdate_wait.h>
> >  #include <linux/module.h>
> >  #include <linux/static_call.h>
> > +#include <linux/bpf_verifier.h>
> > +#include <linux/bpf_lsm.h>
> >
> >  /* dummy _ops. The verifier will operate on target program's ops. */
> >  const struct bpf_verifier_ops bpf_extension_verifier_ops = {
> > @@ -497,6 +499,163 @@ int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampolin
> >       return err;
> >  }
> >
> > +#if defined(CONFIG_BPF_JIT) && defined(CONFIG_BPF_SYSCALL)
> > +static struct bpf_shim_tramp_link *cgroup_shim_alloc(const struct bpf_prog *prog,
> > +                                                  bpf_func_t bpf_func)
> > +{
> > +     struct bpf_shim_tramp_link *shim_link = NULL;
> > +     struct bpf_prog *p;
> > +
> > +     shim_link = kzalloc(sizeof(*shim_link), GFP_USER);
> > +     if (!shim_link)
> > +             return NULL;
> > +
> > +     p = bpf_prog_alloc(1, 0);
> > +     if (!p) {
> > +             kfree(shim_link);
> > +             return NULL;
> > +     }
> > +
> > +     p->jited = false;
> > +     p->bpf_func = bpf_func;
> > +
> > +     p->aux->cgroup_atype = prog->aux->cgroup_atype;
> > +     p->aux->attach_func_proto = prog->aux->attach_func_proto;
> > +     p->aux->attach_btf_id = prog->aux->attach_btf_id;
> > +     p->aux->attach_btf = prog->aux->attach_btf;
> > +     btf_get(p->aux->attach_btf);
> > +     p->type = BPF_PROG_TYPE_LSM;
> > +     p->expected_attach_type = BPF_LSM_MAC;
> > +     bpf_prog_inc(p);
> > +     bpf_link_init(&shim_link->tramp_link.link, BPF_LINK_TYPE_TRACING, NULL, p);
> > +     atomic64_set(&shim_link->refcnt, 1);
> > +
> > +     return shim_link;
> > +}
> > +
> > +static struct bpf_shim_tramp_link *cgroup_shim_find(struct bpf_trampoline *tr,
> > +                                                 bpf_func_t bpf_func)
> > +{
> > +     struct bpf_tramp_link *link;
> > +     int kind;
> > +
> > +     for (kind = 0; kind < BPF_TRAMP_MAX; kind++) {
> > +             hlist_for_each_entry(link, &tr->progs_hlist[kind], tramp_hlist) {
> > +                     struct bpf_prog *p = link->link.prog;
> > +
> > +                     if (p->bpf_func == bpf_func)
> > +                             return container_of(link, struct bpf_shim_tramp_link, tramp_link);
> > +             }
> > +     }
> > +
> > +     return NULL;
> > +}
> > +
> > +static void cgroup_shim_put(struct bpf_shim_tramp_link *shim_link)
> > +{
> > +     if (shim_link->tr)
> I have been spinning back and forth with this "shim_link->tr" test and
> the "!shim_link->tr" test below with an atomic64_dec_and_test() test
> in between  :)

I did this dance so I can call cgroup_shim_put from
bpf_trampoline_link_cgroup_shim; I guess that's confusing.
bpf_trampoline_link_cgroup_shim can call cgroup_shim_put when
__bpf_trampoline_link_prog fails (shim_link->tr == NULL);
cgroup_shim_put can also be called to unlink the prog from the
trampoline (shim_link->tr != NULL).

> > +             bpf_trampoline_put(shim_link->tr);
> Why put(tr) here?
>
> Intuitive thinking is that should be done after __bpf_trampoline_unlink_prog(.., tr)
> which is still using the tr.
> or I missed something inside __bpf_trampoline_unlink_prog(..., tr) ?
>
> > +
> > +     if (!atomic64_dec_and_test(&shim_link->refcnt))
> > +             return;
> > +
> > +     if (!shim_link->tr)
> And this is only for the error case in bpf_trampoline_link_cgroup_shim()?
> Can it be handled locally in bpf_trampoline_link_cgroup_shim()
> where it could actually happen ?

Yeah, agreed, I'll move the cleanup path to
bpf_trampoline_link_cgroup_shim to make it less confusing here.

> > +             return;
> > +
> > +     WARN_ON_ONCE(__bpf_trampoline_unlink_prog(&shim_link->tramp_link, shim_link->tr));
> > +     kfree(shim_link);
> How about shim_link->tramp_link.link.prog, is the prog freed ?
>
> Considering the bpf_link_put() does bpf_prog_put(link->prog).
> Is there a reason the bpf_link_put() not used and needs to
> manage its own shim_link->refcnt here ?

Good catch, I missed the bpf_prog_put(link->prog) part. Let me see
if I can use the link's refcnt; it seems like I can define my own
link->ops->dealloc to call __bpf_trampoline_unlink_prog, and the rest
will be taken care of.

> > +}
> > +
> > +int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
> > +                                 struct bpf_attach_target_info *tgt_info)
> > +{
> > +     struct bpf_shim_tramp_link *shim_link = NULL;
> > +     struct bpf_trampoline *tr;
> > +     bpf_func_t bpf_func;
> > +     u64 key;
> > +     int err;
> > +
> > +     key = bpf_trampoline_compute_key(NULL, prog->aux->attach_btf,
> > +                                      prog->aux->attach_btf_id);
> > +
> > +     err = bpf_lsm_find_cgroup_shim(prog, &bpf_func);
> > +     if (err)
> > +             return err;
> > +
> > +     tr = bpf_trampoline_get(key, tgt_info);
> > +     if (!tr)
> > +             return  -ENOMEM;
> > +
> > +     mutex_lock(&tr->mutex);
> > +
> > +     shim_link = cgroup_shim_find(tr, bpf_func);
> > +     if (shim_link) {
> > +             /* Reusing existing shim attached by the other program. */
> > +             atomic64_inc(&shim_link->refcnt);
> > +             /* note, we're still holding tr refcnt from above */
> hmm... why it still needs to hold the tr refcnt ?

I'm assuming we need to hold the trampoline for as long as shim_prog
is attached to it, right? Otherwise it gets kfreed.



> > +
> > +             mutex_unlock(&tr->mutex);
> > +             return 0;
> > +     }
> > +
> > +     /* Allocate and install new shim. */
> > +
> > +     shim_link = cgroup_shim_alloc(prog, bpf_func);
> > +     if (!shim_link) {
> > +             bpf_trampoline_put(tr);
> > +             err = -ENOMEM;
> > +             goto out;
> > +     }
> > +
> > +     err = __bpf_trampoline_link_prog(&shim_link->tramp_link, tr);
> > +     if (err)
> > +             goto out;
> > +
> > +     shim_link->tr = tr;
> > +
> > +     mutex_unlock(&tr->mutex);
> > +
> > +     return 0;
> > +out:
> > +     mutex_unlock(&tr->mutex);
> > +
> > +     if (shim_link)
> > +             cgroup_shim_put(shim_link);
> > +
> > +     return err;
> > +}
> > +

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 07/11] libbpf: implement bpf_prog_query_opts
  2022-05-23 23:22   ` Andrii Nakryiko
@ 2022-05-24  2:15     ` Stanislav Fomichev
  2022-05-24  3:45       ` Andrii Nakryiko
  0 siblings, 1 reply; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-24  2:15 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko

On Mon, May 23, 2022 at 4:22 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Wed, May 18, 2022 at 3:55 PM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > Implement bpf_prog_query_opts as a more extendable version of
> > bpf_prog_query. Expose new prog_attach_flags and attach_btf_func_id as
> > well:
> >
> > * prog_attach_flags holds per-program attach flags; relevant only for
> >   lsm cgroup programs, which might have different attach_flags
> >   per attach_btf_id
> > * attach_btf_func_id is a new field exposed for prog_query which
> >   specifies the real btf function id for lsm cgroup attachments
> >
>
> just thoughts aloud... Shouldn't bpf_prog_query() also return link_id
> if the attachment was done with LINK_CREATE? And then attach flags
> could actually be fetched through corresponding struct bpf_link_info.
> That is, bpf_prog_query() returns a list of link_ids, and whatever
> link-specific information can be fetched by querying individual links.
> Seems more logical (and useful overall) to extend struct bpf_link_info
> (you can get it more generically from bpftool, by querying fdinfo,
> etc).

Note that I haven't removed the non-link-based APIs because they are easy
to support; your suggestion might be an argument in favor of dropping them,
though. Regarding the implementation: I'm not sure there is an easy way, in
the kernel, to find all links associated with a given bpf_prog?

> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >  tools/include/uapi/linux/bpf.h |  5 ++++
> >  tools/lib/bpf/bpf.c            | 42 +++++++++++++++++++++++++++-------
> >  tools/lib/bpf/bpf.h            | 15 ++++++++++++
> >  tools/lib/bpf/libbpf.map       |  1 +
> >  4 files changed, 55 insertions(+), 8 deletions(-)
> >
>
> [...]
>
> >         ret = sys_bpf(BPF_PROG_QUERY, &attr, sizeof(attr));
> >
> > -       if (attach_flags)
> > -               *attach_flags = attr.query.attach_flags;
> > -       *prog_cnt = attr.query.prog_cnt;
> > +       if (OPTS_HAS(opts, prog_cnt))
> > +               opts->prog_cnt = attr.query.prog_cnt;
>
> just use OPTS_SET() instead of OPTS_HAS check

Ah, definitely; for some reason I thought that these were "output"
arguments and OPTS_SET wouldn't work for them.
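
Something like this, for reference (assuming the OPTS_SET() helper from
libbpf_internal.h; not the actual patch):

        OPTS_SET(opts, prog_cnt, attr.query.prog_cnt);
        OPTS_SET(opts, attach_flags, attr.query.attach_flags);

OPTS_SET() only writes the field when the caller's opts struct is big
enough to contain it, so it should work for output fields too.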

> > +       if (OPTS_HAS(opts, attach_flags))
> > +               opts->attach_flags = attr.query.attach_flags;
> >
> >         return libbpf_err_errno(ret);
> >  }
> >
>
> [...]
>
> > diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> > index 6b36f46ab5d8..24f7a5147bf2 100644
> > --- a/tools/lib/bpf/libbpf.map
> > +++ b/tools/lib/bpf/libbpf.map
> > @@ -452,6 +452,7 @@ LIBBPF_0.8.0 {
> >                 bpf_map_delete_elem_flags;
> >                 bpf_object__destroy_subskeleton;
> >                 bpf_object__open_subskeleton;
> > +               bpf_prog_query_opts;
>
> please put it into LIBBPF_1.0.0 section, 0.8 is closed now

Definitely, will pull new changes and put them into proper place.

Thank you for your review!

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-05-23 23:23   ` Andrii Nakryiko
@ 2022-05-24  2:15     ` Stanislav Fomichev
  0 siblings, 0 replies; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-24  2:15 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko

On Mon, May 23, 2022 at 4:24 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Wed, May 18, 2022 at 3:55 PM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > We have two options:
> > 1. Treat all BPF_LSM_CGROUP the same, regardless of attach_btf_id
> > 2. Treat BPF_LSM_CGROUP+attach_btf_id as a separate hook point
> >
> > I was doing (2) in the original patch, but switching to (1) here:
> >
> > * bpf_prog_query returns all attached BPF_LSM_CGROUP programs
> > regardless of attach_btf_id
> > * attach_btf_id is exported via bpf_prog_info
> >
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >  include/uapi/linux/bpf.h |   5 ++
> >  kernel/bpf/cgroup.c      | 103 +++++++++++++++++++++++++++------------
> >  kernel/bpf/syscall.c     |   4 +-
> >  3 files changed, 81 insertions(+), 31 deletions(-)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index b9d2d6de63a7..432fc5f49567 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -1432,6 +1432,7 @@ union bpf_attr {
> >                 __u32           attach_flags;
> >                 __aligned_u64   prog_ids;
> >                 __u32           prog_cnt;
> > +               __aligned_u64   prog_attach_flags; /* output: per-program attach_flags */
> >         } query;
> >
> >         struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
> > @@ -5911,6 +5912,10 @@ struct bpf_prog_info {
> >         __u64 run_cnt;
> >         __u64 recursion_misses;
> >         __u32 verified_insns;
> > +       /* BTF ID of the function to attach to within BTF object identified
> > +        * by btf_id.
> > +        */
> > +       __u32 attach_btf_func_id;
>
> it's called attach_btf_id for PROG_LOAD command, keep it consistently
> named (and a bit more generic)?
>
> >  } __attribute__((aligned(8)));
> >
> >  struct bpf_map_info {
>
> [...]

SG. Making it generic makes sense.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 08/11] libbpf: add lsm_cgroup_sock type
  2022-05-23 23:26   ` Andrii Nakryiko
@ 2022-05-24  2:15     ` Stanislav Fomichev
  0 siblings, 0 replies; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-24  2:15 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko

On Mon, May 23, 2022 at 4:26 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Wed, May 18, 2022 at 3:55 PM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > lsm_cgroup/ is the prefix for BPF_LSM_CGROUP.
> >
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >  tools/lib/bpf/libbpf.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> > index ef7f302e542f..854449dcd072 100644
> > --- a/tools/lib/bpf/libbpf.c
> > +++ b/tools/lib/bpf/libbpf.c
> > @@ -9027,6 +9027,7 @@ static const struct bpf_sec_def section_defs[] = {
> >         SEC_DEF("fmod_ret.s+",          TRACING, BPF_MODIFY_RETURN, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
> >         SEC_DEF("fexit.s+",             TRACING, BPF_TRACE_FEXIT, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
> >         SEC_DEF("freplace+",            EXT, 0, SEC_ATTACH_BTF, attach_trace),
> > +       SEC_DEF("lsm_cgroup+",          LSM, BPF_LSM_CGROUP, SEC_ATTACH_BTF),
>
> we don't do simplistic prefix match anymore, so this doesn't have to
> go before lsm+ (we do prefix match only for legacy SEC_SLOPPY cases).
> So total nit (but wanted to dispel preconception that we need to avoid
> subprefix matches), I'd put this after lsm+

Sure, I didn't know the ordering doesn't matter; will do, thanks!
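
(For reference, that would be something like the ordering below; the
surrounding lsm entries are reproduced from memory, so treat them as
approximate:)

        SEC_DEF("lsm+",                 LSM, BPF_LSM_MAC, SEC_ATTACH_BTF, attach_lsm),
        SEC_DEF("lsm.s+",               LSM, BPF_LSM_MAC, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_lsm),
        SEC_DEF("lsm_cgroup+",          LSM, BPF_LSM_CGROUP, SEC_ATTACH_BTF),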

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 11/11] selftests/bpf: verify lsm_cgroup struct sock access
  2022-05-23 23:33   ` Andrii Nakryiko
@ 2022-05-24  2:15     ` Stanislav Fomichev
  2022-05-24  3:46       ` Andrii Nakryiko
  0 siblings, 1 reply; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-24  2:15 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko

On Mon, May 23, 2022 at 4:33 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Wed, May 18, 2022 at 3:56 PM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > sk_priority & sk_mark are writable, the rest is readonly.
> >
> > One interesting thing here is that the verifier doesn't
> > really force me to add NULL checks anywhere :-/
> >
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >  .../selftests/bpf/prog_tests/lsm_cgroup.c     | 69 +++++++++++++++++++
> >  1 file changed, 69 insertions(+)
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c b/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
> > index 29292ec40343..64b6830e03f5 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
> > @@ -270,8 +270,77 @@ static void test_lsm_cgroup_functional(void)
> >         lsm_cgroup__destroy(skel);
> >  }
> >
> > +static int field_offset(const char *type, const char *field)
> > +{
> > +       const struct btf_member *memb;
> > +       const struct btf_type *tp;
> > +       const char *name;
> > +       struct btf *btf;
> > +       int btf_id;
> > +       int i;
> > +
> > +       btf = btf__load_vmlinux_btf();
> > +       if (!btf)
> > +               return -1;
> > +
> > +       btf_id = btf__find_by_name_kind(btf, type, BTF_KIND_STRUCT);
> > +       if (btf_id < 0)
> > +               return -1;
> > +
> > +       tp = btf__type_by_id(btf, btf_id);
> > +       memb = btf_members(tp);
> > +
> > +       for (i = 0; i < btf_vlen(tp); i++) {
> > +               name = btf__name_by_offset(btf,
> > +                                          memb->name_off);
> > +               if (strcmp(field, name) == 0)
> > +                       return memb->offset / 8;
> > +               memb++;
> > +       }
> > +
> > +       return -1;
> > +}
> > +
> > +static bool sk_writable_field(const char *type, const char *field, int size)
> > +{
> > +       LIBBPF_OPTS(bpf_prog_load_opts, opts,
> > +                   .expected_attach_type = BPF_LSM_CGROUP);
> > +       struct bpf_insn insns[] = {
> > +               /* r1 = *(u64 *)(r1 + 0) */
> > +               BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, 0),
> > +               /* r1 = *(u64 *)(r1 + offsetof(struct socket, sk)) */
> > +               BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, field_offset("socket", "sk")),
> > +               /* r2 = *(u64 *)(r1 + offsetof(struct sock, <field>)) */
> > +               BPF_LDX_MEM(size, BPF_REG_2, BPF_REG_1, field_offset(type, field)),
> > +               /* *(u64 *)(r1 + offsetof(struct sock, <field>)) = r2 */
> > +               BPF_STX_MEM(size, BPF_REG_1, BPF_REG_2, field_offset(type, field)),
> > +               BPF_MOV64_IMM(BPF_REG_0, 1),
> > +               BPF_EXIT_INSN(),
> > +       };
> > +       int fd;
>
> This is really not much better than test_verifier assembly. What I had
> in mind when I was suggesting to use test_progs was that you'd have a
> normal C source code for BPF part, something like this:
>
> __u64 tmp;
>
> SEC("?lsm_cgroup/socket_bind")
> int BPF_PROG(access1_bad, struct socket *sock, struct sockaddr
> *address, int addrlen)
> {
>     *(volatile u16 *)(sock->sk.skc_family) = *(volatile u16
> *)sock->sk.skc_family;
>     return 0;
> }
>
>
> SEC("?lsm_cgroup/socket_bind")
> int BPF_PROG(access2_bad, struct socket *sock, struct sockaddr
> *address, int addrlen)
> {
>     *(volatile u64 *)(sock->sk.sk_sndtimeo) = *(volatile u64
> *)sock->sk.sk_sndtimeo;
>     return 0;
> }
>
> and so on. From user-space you'd be loading just one of those
> accessX_bad programs at a time (note SEC("?"))
>
>
> But having said that, what you did is pretty self-contained, so not
> too bad. It's just not what I was suggesting :)

Yeah, that's what I suggested I was gonna try in:
https://lore.kernel.org/bpf/CAKH8qBuHU7OAjTMk-6GU08Nmwnn6J7Cw1TzP6GwCEq0x1Wwd9w@mail.gmail.com/

I don't really want to separate the program from the test, it seems
like keeping everything in one file is easier to read.
So unless you strongly dislike this new self-contained version, I'd
keep it as is.



> > +
> > +       opts.attach_btf_id = libbpf_find_vmlinux_btf_id("socket_post_create",
> > +                                                       opts.expected_attach_type);
> > +
> > +       fd = bpf_prog_load(BPF_PROG_TYPE_LSM, NULL, "GPL", insns, ARRAY_SIZE(insns), &opts);
> > +       if (fd >= 0)
> > +               close(fd);
> > +       return fd >= 0;
> > +}
> > +
> > +static void test_lsm_cgroup_access(void)
> > +{
> > +       ASSERT_FALSE(sk_writable_field("sock_common", "skc_family", BPF_H), "skc_family");
> > +       ASSERT_FALSE(sk_writable_field("sock", "sk_sndtimeo", BPF_DW), "sk_sndtimeo");
> > +       ASSERT_TRUE(sk_writable_field("sock", "sk_priority", BPF_W), "sk_priority");
> > +       ASSERT_TRUE(sk_writable_field("sock", "sk_mark", BPF_W), "sk_mark");
> > +       ASSERT_FALSE(sk_writable_field("sock", "sk_pacing_rate", BPF_DW), "sk_pacing_rate");
> > +}
> > +
> >  void test_lsm_cgroup(void)
> >  {
> >         if (test__start_subtest("functional"))
> >                 test_lsm_cgroup_functional();
> > +       if (test__start_subtest("access"))
> > +               test_lsm_cgroup_access();
> >  }
> > --
> > 2.36.1.124.g0e6072fb45-goog
> >

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 07/11] libbpf: implement bpf_prog_query_opts
  2022-05-24  2:15     ` Stanislav Fomichev
@ 2022-05-24  3:45       ` Andrii Nakryiko
  2022-05-24  4:01         ` Martin KaFai Lau
  0 siblings, 1 reply; 54+ messages in thread
From: Andrii Nakryiko @ 2022-05-24  3:45 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko

On Mon, May 23, 2022 at 7:15 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> On Mon, May 23, 2022 at 4:22 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Wed, May 18, 2022 at 3:55 PM Stanislav Fomichev <sdf@google.com> wrote:
> > >
> > > Implement bpf_prog_query_opts as a more extendable version of
> > > bpf_prog_query. Expose new prog_attach_flags and attach_btf_func_id as
> > > well:
> > >
> > > * prog_attach_flags holds per-program attach flags; relevant only for
> > >   lsm cgroup programs, which might have different attach_flags
> > >   per attach_btf_id
> > > * attach_btf_func_id is a new field exposed for prog_query which
> > >   specifies the real btf function id for lsm cgroup attachments
> > >
> >
> > just thoughts aloud... Shouldn't bpf_prog_query() also return link_id
> > if the attachment was done with LINK_CREATE? And then attach flags
> > could actually be fetched through corresponding struct bpf_link_info.
> > That is, bpf_prog_query() returns a list of link_ids, and whatever
> > link-specific information can be fetched by querying individual links.
> > Seems more logical (and useful overall) to extend struct bpf_link_info
> > (you can get it more generically from bpftool, by querying fdinfo,
> > etc).
>
> Note that I haven't removed non-link-based APIs because they are easy
> to support. That might be an argument in favor of dropping them.
> Regarding the implementation: I'm not sure there is an easy way, in
> the kernel, to find all links associated with a given bpf_prog?

Nope, the kernel doesn't keep track of this explicitly, in general. If you
were building a tool for something like that, you'd probably use the
bpf_link iterator program which we recently added. But in this case the
kernel knows the links that are attached to cgroups (they are in
prog_item->link if it's not NULL), so you shouldn't need any extra
information.
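
Just to illustrate, a rough sketch of how __cgroup_bpf_query() could report
them (the prog_link_ids output array is hypothetical here, and this leans on
the bpf_prog_list/bpf_cgroup_link layout in kernel/bpf/cgroup.c):

        struct bpf_prog_list *pl;
        u32 link_id;

        i = 0;
        hlist_for_each_entry(pl, progs, node) {
                /* pl->link is only set for link-based attachments */
                link_id = pl->link ? pl->link->link.id : 0;
                if (copy_to_user(prog_link_ids + i, &link_id, sizeof(link_id)))
                        return -EFAULT;
                if (++i == cnt)
                        break;
        }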

>
> > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > ---
> > >  tools/include/uapi/linux/bpf.h |  5 ++++
> > >  tools/lib/bpf/bpf.c            | 42 +++++++++++++++++++++++++++-------
> > >  tools/lib/bpf/bpf.h            | 15 ++++++++++++
> > >  tools/lib/bpf/libbpf.map       |  1 +
> > >  4 files changed, 55 insertions(+), 8 deletions(-)
> > >
> >
> > [...]
> >
> > >         ret = sys_bpf(BPF_PROG_QUERY, &attr, sizeof(attr));
> > >
> > > -       if (attach_flags)
> > > -               *attach_flags = attr.query.attach_flags;
> > > -       *prog_cnt = attr.query.prog_cnt;
> > > +       if (OPTS_HAS(opts, prog_cnt))
> > > +               opts->prog_cnt = attr.query.prog_cnt;
> >
> > just use OPTS_SET() instead of OPTS_HAS check
>
> Ah, definitely; for some reason I thought that these were "output"
> arguments and OPTS_SET wouldn't work for them.
>
> > > +       if (OPTS_HAS(opts, attach_flags))
> > > +               opts->attach_flags = attr.query.attach_flags;
> > >
> > >         return libbpf_err_errno(ret);
> > >  }
> > >
> >
> > [...]
> >
> > > diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> > > index 6b36f46ab5d8..24f7a5147bf2 100644
> > > --- a/tools/lib/bpf/libbpf.map
> > > +++ b/tools/lib/bpf/libbpf.map
> > > @@ -452,6 +452,7 @@ LIBBPF_0.8.0 {
> > >                 bpf_map_delete_elem_flags;
> > >                 bpf_object__destroy_subskeleton;
> > >                 bpf_object__open_subskeleton;
> > > +               bpf_prog_query_opts;
> >
> > please put it into LIBBPF_1.0.0 section, 0.8 is closed now
>
> Definitely, will pull new changes and put them into proper place.
>
> Thank you for your review!

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 11/11] selftests/bpf: verify lsm_cgroup struct sock access
  2022-05-24  2:15     ` Stanislav Fomichev
@ 2022-05-24  3:46       ` Andrii Nakryiko
  0 siblings, 0 replies; 54+ messages in thread
From: Andrii Nakryiko @ 2022-05-24  3:46 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko

On Mon, May 23, 2022 at 7:15 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> On Mon, May 23, 2022 at 4:33 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Wed, May 18, 2022 at 3:56 PM Stanislav Fomichev <sdf@google.com> wrote:
> > >
> > > sk_priority & sk_mark are writable, the rest is readonly.
> > >
> > > One interesting thing here is that the verifier doesn't
> > > really force me to add NULL checks anywhere :-/
> > >
> > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > ---
> > >  .../selftests/bpf/prog_tests/lsm_cgroup.c     | 69 +++++++++++++++++++
> > >  1 file changed, 69 insertions(+)
> > >
> > > diff --git a/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c b/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
> > > index 29292ec40343..64b6830e03f5 100644
> > > --- a/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
> > > +++ b/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
> > > @@ -270,8 +270,77 @@ static void test_lsm_cgroup_functional(void)
> > >         lsm_cgroup__destroy(skel);
> > >  }
> > >
> > > +static int field_offset(const char *type, const char *field)
> > > +{
> > > +       const struct btf_member *memb;
> > > +       const struct btf_type *tp;
> > > +       const char *name;
> > > +       struct btf *btf;
> > > +       int btf_id;
> > > +       int i;
> > > +
> > > +       btf = btf__load_vmlinux_btf();
> > > +       if (!btf)
> > > +               return -1;
> > > +
> > > +       btf_id = btf__find_by_name_kind(btf, type, BTF_KIND_STRUCT);
> > > +       if (btf_id < 0)
> > > +               return -1;
> > > +
> > > +       tp = btf__type_by_id(btf, btf_id);
> > > +       memb = btf_members(tp);
> > > +
> > > +       for (i = 0; i < btf_vlen(tp); i++) {
> > > +               name = btf__name_by_offset(btf,
> > > +                                          memb->name_off);
> > > +               if (strcmp(field, name) == 0)
> > > +                       return memb->offset / 8;
> > > +               memb++;
> > > +       }
> > > +
> > > +       return -1;
> > > +}
> > > +
> > > +static bool sk_writable_field(const char *type, const char *field, int size)
> > > +{
> > > +       LIBBPF_OPTS(bpf_prog_load_opts, opts,
> > > +                   .expected_attach_type = BPF_LSM_CGROUP);
> > > +       struct bpf_insn insns[] = {
> > > +               /* r1 = *(u64 *)(r1 + 0) */
> > > +               BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, 0),
> > > +               /* r1 = *(u64 *)(r1 + offsetof(struct socket, sk)) */
> > > +               BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, field_offset("socket", "sk")),
> > > +               /* r2 = *(u64 *)(r1 + offsetof(struct sock, <field>)) */
> > > +               BPF_LDX_MEM(size, BPF_REG_2, BPF_REG_1, field_offset(type, field)),
> > > +               /* *(u64 *)(r1 + offsetof(struct sock, <field>)) = r2 */
> > > +               BPF_STX_MEM(size, BPF_REG_1, BPF_REG_2, field_offset(type, field)),
> > > +               BPF_MOV64_IMM(BPF_REG_0, 1),
> > > +               BPF_EXIT_INSN(),
> > > +       };
> > > +       int fd;
> >
> > This is really not much better than test_verifier assembly. What I had
> > in mind when I was suggesting to use test_progs was that you'd have a
> > normal C source code for BPF part, something like this:
> >
> > __u64 tmp;
> >
> > SEC("?lsm_cgroup/socket_bind")
> > int BPF_PROG(access1_bad, struct socket *sock, struct sockaddr
> > *address, int addrlen)
> > {
> >     *(volatile u16 *)(sock->sk.skc_family) = *(volatile u16
> > *)sock->sk.skc_family;
> >     return 0;
> > }
> >
> >
> > SEC("?lsm_cgroup/socket_bind")
> > int BPF_PROG(access2_bad, struct socket *sock, struct sockaddr
> > *address, int addrlen)
> > {
> >     *(volatile u64 *)(sock->sk.sk_sndtimeo) = *(volatile u64
> > *)sock->sk.sk_sndtimeo;
> >     return 0;
> > }
> >
> > and so on. From user-space you'd be loading just one of those
> > accessX_bad programs at a time (note SEC("?"))
> >
> >
> > But having said that, what you did is pretty self-contained, so not
> > too bad. It's just not what I was suggesting :)
>
> Yeah, that's what I suggested I was gonna try in:
> https://lore.kernel.org/bpf/CAKH8qBuHU7OAjTMk-6GU08Nmwnn6J7Cw1TzP6GwCEq0x1Wwd9w@mail.gmail.com/
>
> I don't really want to separate the program from the test, it seems
> like keeping everything in one file is easier to read.
> So unless you strongly dislike this new self-contained version, I'd
> keep it as is.
>

It's fine by me.

>
>
> > > +
> > > +       opts.attach_btf_id = libbpf_find_vmlinux_btf_id("socket_post_create",
> > > +                                                       opts.expected_attach_type);
> > > +
> > > +       fd = bpf_prog_load(BPF_PROG_TYPE_LSM, NULL, "GPL", insns, ARRAY_SIZE(insns), &opts);
> > > +       if (fd >= 0)
> > > +               close(fd);
> > > +       return fd >= 0;
> > > +}
> > > +
> > > +static void test_lsm_cgroup_access(void)
> > > +{
> > > +       ASSERT_FALSE(sk_writable_field("sock_common", "skc_family", BPF_H), "skc_family");
> > > +       ASSERT_FALSE(sk_writable_field("sock", "sk_sndtimeo", BPF_DW), "sk_sndtimeo");
> > > +       ASSERT_TRUE(sk_writable_field("sock", "sk_priority", BPF_W), "sk_priority");
> > > +       ASSERT_TRUE(sk_writable_field("sock", "sk_mark", BPF_W), "sk_mark");
> > > +       ASSERT_FALSE(sk_writable_field("sock", "sk_pacing_rate", BPF_DW), "sk_pacing_rate");
> > > +}
> > > +
> > >  void test_lsm_cgroup(void)
> > >  {
> > >         if (test__start_subtest("functional"))
> > >                 test_lsm_cgroup_functional();
> > > +       if (test__start_subtest("access"))
> > > +               test_lsm_cgroup_access();
> > >  }
> > > --
> > > 2.36.1.124.g0e6072fb45-goog
> > >

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-05-18 22:55 ` [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP Stanislav Fomichev
                     ` (2 preceding siblings ...)
  2022-05-23 23:23   ` Andrii Nakryiko
@ 2022-05-24  3:48   ` Martin KaFai Lau
  2022-05-24 15:55     ` Stanislav Fomichev
  3 siblings, 1 reply; 54+ messages in thread
From: Martin KaFai Lau @ 2022-05-24  3:48 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii

On Wed, May 18, 2022 at 03:55:25PM -0700, Stanislav Fomichev wrote:
> We have two options:
> 1. Treat all BPF_LSM_CGROUP the same, regardless of attach_btf_id
> 2. Treat BPF_LSM_CGROUP+attach_btf_id as a separate hook point
> 
> I was doing (2) in the original patch, but switching to (1) here:
> 
> * bpf_prog_query returns all attached BPF_LSM_CGROUP programs
> regardless of attach_btf_id
> * attach_btf_id is exported via bpf_prog_info
> 
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>  include/uapi/linux/bpf.h |   5 ++
>  kernel/bpf/cgroup.c      | 103 +++++++++++++++++++++++++++------------
>  kernel/bpf/syscall.c     |   4 +-
>  3 files changed, 81 insertions(+), 31 deletions(-)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index b9d2d6de63a7..432fc5f49567 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -1432,6 +1432,7 @@ union bpf_attr {
>  		__u32		attach_flags;
>  		__aligned_u64	prog_ids;
>  		__u32		prog_cnt;
> +		__aligned_u64	prog_attach_flags; /* output: per-program attach_flags */
>  	} query;
>  
>  	struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
> @@ -5911,6 +5912,10 @@ struct bpf_prog_info {
>  	__u64 run_cnt;
>  	__u64 recursion_misses;
>  	__u32 verified_insns;
> +	/* BTF ID of the function to attach to within BTF object identified
> +	 * by btf_id.
> +	 */
> +	__u32 attach_btf_func_id;
>  } __attribute__((aligned(8)));
>  
>  struct bpf_map_info {
> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> index a959cdd22870..08a1015ee09e 100644
> --- a/kernel/bpf/cgroup.c
> +++ b/kernel/bpf/cgroup.c
> @@ -1074,6 +1074,7 @@ static int cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
>  static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
>  			      union bpf_attr __user *uattr)
>  {
> +	__u32 __user *prog_attach_flags = u64_to_user_ptr(attr->query.prog_attach_flags);
>  	__u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids);
>  	enum bpf_attach_type type = attr->query.attach_type;
>  	enum cgroup_bpf_attach_type atype;
> @@ -1081,50 +1082,92 @@ static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
>  	struct hlist_head *progs;
>  	struct bpf_prog *prog;
>  	int cnt, ret = 0, i;
> +	int total_cnt = 0;
>  	u32 flags;
>  
> -	atype = to_cgroup_bpf_attach_type(type);
> -	if (atype < 0)
> -		return -EINVAL;
> +	enum cgroup_bpf_attach_type from_atype, to_atype;
>  
> -	progs = &cgrp->bpf.progs[atype];
> -	flags = cgrp->bpf.flags[atype];
> +	if (type == BPF_LSM_CGROUP) {
> +		from_atype = CGROUP_LSM_START;
> +		to_atype = CGROUP_LSM_END;
> +	} else {
> +		from_atype = to_cgroup_bpf_attach_type(type);
> +		if (from_atype < 0)
> +			return -EINVAL;
> +		to_atype = from_atype;
> +	}
>  
> -	effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> -					      lockdep_is_held(&cgroup_mutex));
> +	for (atype = from_atype; atype <= to_atype; atype++) {
> +		progs = &cgrp->bpf.progs[atype];
> +		flags = cgrp->bpf.flags[atype];
>  
> -	if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> -		cnt = bpf_prog_array_length(effective);
> -	else
> -		cnt = prog_list_length(progs);
> +		effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> +						      lockdep_is_held(&cgroup_mutex));
>  
> -	if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
> -		return -EFAULT;
> -	if (copy_to_user(&uattr->query.prog_cnt, &cnt, sizeof(cnt)))
> +		if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> +			total_cnt += bpf_prog_array_length(effective);
> +		else
> +			total_cnt += prog_list_length(progs);
> +	}
> +
> +	if (type != BPF_LSM_CGROUP)
> +		if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
> +			return -EFAULT;
> +	if (copy_to_user(&uattr->query.prog_cnt, &total_cnt, sizeof(total_cnt)))
>  		return -EFAULT;
> -	if (attr->query.prog_cnt == 0 || !prog_ids || !cnt)
> +	if (attr->query.prog_cnt == 0 || !prog_ids || !total_cnt)
>  		/* return early if user requested only program count + flags */
>  		return 0;
> -	if (attr->query.prog_cnt < cnt) {
> -		cnt = attr->query.prog_cnt;
> +
> +	if (attr->query.prog_cnt < total_cnt) {
> +		total_cnt = attr->query.prog_cnt;
>  		ret = -ENOSPC;
>  	}
>  
> -	if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
> -		return bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> -	} else {
> -		struct bpf_prog_list *pl;
> -		u32 id;
> +	for (atype = from_atype; atype <= to_atype; atype++) {
> +		if (total_cnt <= 0)
> +			break;
>  
> -		i = 0;
> -		hlist_for_each_entry(pl, progs, node) {
> -			prog = prog_list_prog(pl);
> -			id = prog->aux->id;
> -			if (copy_to_user(prog_ids + i, &id, sizeof(id)))
> -				return -EFAULT;
> -			if (++i == cnt)
> -				break;
> +		progs = &cgrp->bpf.progs[atype];
> +		flags = cgrp->bpf.flags[atype];
> +
> +		effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> +						      lockdep_is_held(&cgroup_mutex));
> +
> +		if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> +			cnt = bpf_prog_array_length(effective);
> +		else
> +			cnt = prog_list_length(progs);
> +
> +		if (cnt >= total_cnt)
> +			cnt = total_cnt;
> +
> +		if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
> +			ret = bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> +		} else {
> +			struct bpf_prog_list *pl;
> +			u32 id;
> +
> +			i = 0;
> +			hlist_for_each_entry(pl, progs, node) {
> +				prog = prog_list_prog(pl);
> +				id = prog->aux->id;
> +				if (copy_to_user(prog_ids + i, &id, sizeof(id)))
> +					return -EFAULT;
> +				if (++i == cnt)
> +					break;
> +			}
>  		}
> +
> +		if (prog_attach_flags)
> +			for (i = 0; i < cnt; i++)
> +				if (copy_to_user(prog_attach_flags + i, &flags, sizeof(flags)))
> +					return -EFAULT;
> +
> +		prog_ids += cnt;
> +		total_cnt -= cnt;
> +		if (prog_attach_flags)
> +			prog_attach_flags += cnt;
>  	}
>  	return ret;
>  }
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 5ed2093e51cc..4137583c04a2 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -3520,7 +3520,7 @@ static int bpf_prog_detach(const union bpf_attr *attr)
>  	}
>  }
>  
> -#define BPF_PROG_QUERY_LAST_FIELD query.prog_cnt
> +#define BPF_PROG_QUERY_LAST_FIELD query.prog_attach_flags
>  
>  static int bpf_prog_query(const union bpf_attr *attr,
>  			  union bpf_attr __user *uattr)
> @@ -3556,6 +3556,7 @@ static int bpf_prog_query(const union bpf_attr *attr,
>  	case BPF_CGROUP_SYSCTL:
>  	case BPF_CGROUP_GETSOCKOPT:
>  	case BPF_CGROUP_SETSOCKOPT:
> +	case BPF_LSM_CGROUP:
>  		return cgroup_bpf_prog_query(attr, uattr);
>  	case BPF_LIRC_MODE2:
>  		return lirc_prog_query(attr, uattr);
> @@ -4066,6 +4067,7 @@ static int bpf_prog_get_info_by_fd(struct file *file,
>  
>  	if (prog->aux->btf)
>  		info.btf_id = btf_obj_id(prog->aux->btf);
> +	info.attach_btf_func_id = prog->aux->attach_btf_id;
Note that exposing only prog->aux->attach_btf_id may not be enough
unless userspace can assume info.attach_btf_id always refers to btf_vmlinux
for all bpf prog types.
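
One way to make it unambiguous would be to also report which BTF object
the id belongs to, e.g. something like (the attach_btf_obj_id field name
is made up here):

        if (prog->aux->attach_btf)
                info.attach_btf_obj_id = btf_obj_id(prog->aux->attach_btf);
        info.attach_btf_id = prog->aux->attach_btf_id;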

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 07/11] libbpf: implement bpf_prog_query_opts
  2022-05-24  3:45       ` Andrii Nakryiko
@ 2022-05-24  4:01         ` Martin KaFai Lau
  0 siblings, 0 replies; 54+ messages in thread
From: Martin KaFai Lau @ 2022-05-24  4:01 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Stanislav Fomichev, Networking, bpf, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko

On Mon, May 23, 2022 at 08:45:13PM -0700, Andrii Nakryiko wrote:
> On Mon, May 23, 2022 at 7:15 PM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > On Mon, May 23, 2022 at 4:22 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Wed, May 18, 2022 at 3:55 PM Stanislav Fomichev <sdf@google.com> wrote:
> > > >
> > > > Implement bpf_prog_query_opts as a more extendable version of
> > > > bpf_prog_query. Expose new prog_attach_flags and attach_btf_func_id as
> > > > well:
> > > >
> > > > * prog_attach_flags holds per-program attach flags; relevant only for
> > > >   lsm cgroup programs, which might have different attach_flags
> > > >   per attach_btf_id
> > > > * attach_btf_func_id is a new field exposed for prog_query which
> > > >   specifies the real btf function id for lsm cgroup attachments
> > > >
> > >
> > > just thoughts aloud... Shouldn't bpf_prog_query() also return link_id
> > > if the attachment was done with LINK_CREATE? And then attach flags
> > > could actually be fetched through corresponding struct bpf_link_info.
> > > That is, bpf_prog_query() returns a list of link_ids, and whatever
> > > link-specific information can be fetched by querying individual links.
> > > Seems more logical (and useful overall) to extend struct bpf_link_info
> > > (you can get it more generically from bpftool, by querying fdinfo,
> > > etc).
> >
> > Note that I haven't removed non-link-based APIs because they are easy
> > to support. That might be an argument in favor of dropping them.
> > Regarding the implementation: I'm not sure there is an easy way, in
> > the kernel, to find all links associated with a given bpf_prog?
> 
> Nope, kernel doesn't keep track of this explicitly, in general. If you
> were building a tool for something like that you'd probably use
> bpf_link iterator program which we recently added. But in this case
> kernel knows links that are attached to cgroups (they are in
> prog_item->link if it's not NULL), so you shouldn't need any extra
> information.
It would be useful to be able to figure out the effective
bpf progs of a cgroup, something that bpftool currently supports.
With links, userspace can probably figure that out by
knowing how the kernel evaluates the effective array and
doing it similarly in userspace?

Or is it something that fits better with cgroup iter in the future?

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 03/11] bpf: per-cgroup lsm flavor
  2022-05-24  2:15     ` Stanislav Fomichev
@ 2022-05-24  5:40       ` Martin KaFai Lau
  2022-05-24 15:56         ` Stanislav Fomichev
  2022-05-24  5:57       ` Martin KaFai Lau
  1 sibling, 1 reply; 54+ messages in thread
From: Martin KaFai Lau @ 2022-05-24  5:40 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii

On Mon, May 23, 2022 at 07:15:03PM -0700, Stanislav Fomichev wrote:
> 
> On Fri, May 20, 2022 at 5:53 PM Martin KaFai Lau <kafai@fb.com> wrote:
> >
> > On Wed, May 18, 2022 at 03:55:23PM -0700, Stanislav Fomichev wrote:
> >
> > [ ... ]
> >
> > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > index ea3674a415f9..70cf1dad91df 100644
> > > --- a/include/linux/bpf.h
> > > +++ b/include/linux/bpf.h
> > > @@ -768,6 +768,10 @@ void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start, struct bpf_tramp_
> > >  u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog, struct bpf_tramp_run_ctx *run_ctx);
> > >  void notrace __bpf_prog_exit_sleepable(struct bpf_prog *prog, u64 start,
> > >                                      struct bpf_tramp_run_ctx *run_ctx);
> > > +u64 notrace __bpf_prog_enter_lsm_cgroup(struct bpf_prog *prog,
> > > +                                     struct bpf_tramp_run_ctx *run_ctx);
> > > +void notrace __bpf_prog_exit_lsm_cgroup(struct bpf_prog *prog, u64 start,
> > > +                                     struct bpf_tramp_run_ctx *run_ctx);
> > >  void notrace __bpf_tramp_enter(struct bpf_tramp_image *tr);
> > >  void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr);
> > >
> > > @@ -1035,6 +1039,7 @@ struct bpf_prog_aux {
> > >       u64 load_time; /* ns since boottime */
> > >       u32 verified_insns;
> > >       struct bpf_map *cgroup_storage[MAX_BPF_CGROUP_STORAGE_TYPE];
> > > +     int cgroup_atype; /* enum cgroup_bpf_attach_type */
> > >       char name[BPF_OBJ_NAME_LEN];
> > >  #ifdef CONFIG_SECURITY
> > >       void *security;
> > > @@ -1107,6 +1112,12 @@ struct bpf_tramp_link {
> > >       u64 cookie;
> > >  };
> > >
> > > +struct bpf_shim_tramp_link {
> > > +     struct bpf_tramp_link tramp_link;
> > > +     struct bpf_trampoline *tr;
> > > +     atomic64_t refcnt;
> > There is already a refcnt in 'struct bpf_link'.
> > Reuse that one if possible.
> 
> I was assuming that having a per-bpf_shim_tramp_link refcnt might be
> more readable. I'll switch to the one from bpf_link per comments
> below.
> 
> > [ ... ]
> >
> > > diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
> > > index 01ce78c1df80..c424056f0b35 100644
> > > --- a/kernel/bpf/trampoline.c
> > > +++ b/kernel/bpf/trampoline.c
> > > @@ -11,6 +11,8 @@
> > >  #include <linux/rcupdate_wait.h>
> > >  #include <linux/module.h>
> > >  #include <linux/static_call.h>
> > > +#include <linux/bpf_verifier.h>
> > > +#include <linux/bpf_lsm.h>
> > >
> > >  /* dummy _ops. The verifier will operate on target program's ops. */
> > >  const struct bpf_verifier_ops bpf_extension_verifier_ops = {
> > > @@ -497,6 +499,163 @@ int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampolin
> > >       return err;
> > >  }
> > >
> > > +#if defined(CONFIG_BPF_JIT) && defined(CONFIG_BPF_SYSCALL)
> > > +static struct bpf_shim_tramp_link *cgroup_shim_alloc(const struct bpf_prog *prog,
> > > +                                                  bpf_func_t bpf_func)
> > > +{
> > > +     struct bpf_shim_tramp_link *shim_link = NULL;
> > > +     struct bpf_prog *p;
> > > +
> > > +     shim_link = kzalloc(sizeof(*shim_link), GFP_USER);
> > > +     if (!shim_link)
> > > +             return NULL;
> > > +
> > > +     p = bpf_prog_alloc(1, 0);
> > > +     if (!p) {
> > > +             kfree(shim_link);
> > > +             return NULL;
> > > +     }
> > > +
> > > +     p->jited = false;
> > > +     p->bpf_func = bpf_func;
> > > +
> > > +     p->aux->cgroup_atype = prog->aux->cgroup_atype;
> > > +     p->aux->attach_func_proto = prog->aux->attach_func_proto;
> > > +     p->aux->attach_btf_id = prog->aux->attach_btf_id;
> > > +     p->aux->attach_btf = prog->aux->attach_btf;
> > > +     btf_get(p->aux->attach_btf);
> > > +     p->type = BPF_PROG_TYPE_LSM;
> > > +     p->expected_attach_type = BPF_LSM_MAC;
> > > +     bpf_prog_inc(p);
> > > +     bpf_link_init(&shim_link->tramp_link.link, BPF_LINK_TYPE_TRACING, NULL, p);
> > > +     atomic64_set(&shim_link->refcnt, 1);
> > > +
> > > +     return shim_link;
> > > +}
> > > +
> > > +static struct bpf_shim_tramp_link *cgroup_shim_find(struct bpf_trampoline *tr,
> > > +                                                 bpf_func_t bpf_func)
> > > +{
> > > +     struct bpf_tramp_link *link;
> > > +     int kind;
> > > +
> > > +     for (kind = 0; kind < BPF_TRAMP_MAX; kind++) {
> > > +             hlist_for_each_entry(link, &tr->progs_hlist[kind], tramp_hlist) {
> > > +                     struct bpf_prog *p = link->link.prog;
> > > +
> > > +                     if (p->bpf_func == bpf_func)
> > > +                             return container_of(link, struct bpf_shim_tramp_link, tramp_link);
> > > +             }
> > > +     }
> > > +
> > > +     return NULL;
> > > +}
> > > +
> > > +static void cgroup_shim_put(struct bpf_shim_tramp_link *shim_link)
> > > +{
> > > +     if (shim_link->tr)
> > I have been spinning back and forth with this "shim_link->tr" test and
> > the "!shim_link->tr" test below with an atomic64_dec_and_test() test
> > in between  :)
> 
> I did this dance so I can call cgroup_shim_put from
> bpf_trampoline_link_cgroup_shim; I guess that's confusing.
> bpf_trampoline_link_cgroup_shim can call cgroup_shim_put when
> __bpf_trampoline_link_prog fails (shim_link->tr==NULL);
> cgroup_shim_put can also be called to unlink the prog from the
> trampoline (shim_link->tr!=NULL).
> 
> > > +             bpf_trampoline_put(shim_link->tr);
> > Why put(tr) here?
> >
> > Intuitive thinking is that should be done after __bpf_trampoline_unlink_prog(.., tr)
> > which is still using the tr.
> > or I missed something inside __bpf_trampoline_unlink_prog(..., tr) ?
> >
> > > +
> > > +     if (!atomic64_dec_and_test(&shim_link->refcnt))
> > > +             return;
> > > +
> > > +     if (!shim_link->tr)
> > And this is only for the error case in bpf_trampoline_link_cgroup_shim()?
> > Can it be handled locally in bpf_trampoline_link_cgroup_shim()
> > where it could actually happen ?
> 
> Yeah, agreed, I'll move the cleanup path to
> bpf_trampoline_link_cgroup_shim to make it less confusing here.
> 
> > > +             return;
> > > +
> > > +     WARN_ON_ONCE(__bpf_trampoline_unlink_prog(&shim_link->tramp_link, shim_link->tr));
> > > +     kfree(shim_link);
> > How about shim_link->tramp_link.link.prog, is the prog freed ?
> >
> > Considering the bpf_link_put() does bpf_prog_put(link->prog).
> > Is there a reason the bpf_link_put() not used and needs to
> > manage its own shim_link->refcnt here ?
> 
> Good catch, I've missed the bpf_prog_put(link->prog) part. Let me see
> if I can use the link's refcnt, it seems like I can define my own
> link->ops->dealloc to call __bpf_trampoline_unlink_prog and the rest
> will be taken care of.
> 
> > > +}
> > > +
> > > +int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
> > > +                                 struct bpf_attach_target_info *tgt_info)
> > > +{
> > > +     struct bpf_shim_tramp_link *shim_link = NULL;
> > > +     struct bpf_trampoline *tr;
> > > +     bpf_func_t bpf_func;
> > > +     u64 key;
> > > +     int err;
> > > +
> > > +     key = bpf_trampoline_compute_key(NULL, prog->aux->attach_btf,
> > > +                                      prog->aux->attach_btf_id);
> > > +
> > > +     err = bpf_lsm_find_cgroup_shim(prog, &bpf_func);
> > > +     if (err)
> > > +             return err;
> > > +
> > > +     tr = bpf_trampoline_get(key, tgt_info);
> > > +     if (!tr)
> > > +             return  -ENOMEM;
> > > +
> > > +     mutex_lock(&tr->mutex);
> > > +
> > > +     shim_link = cgroup_shim_find(tr, bpf_func);
> > > +     if (shim_link) {
> > > +             /* Reusing existing shim attached by the other program. */
> > > +             atomic64_inc(&shim_link->refcnt);
> > > +             /* note, we're still holding tr refcnt from above */
> > hmm... why it still needs to hold the tr refcnt ?
> 
> I'm assuming we need to hold the trampoline for as long as shim_prog
> is attached to it, right? Otherwise it gets kfreed.
Each 'attached' cgroup-lsm prog holds the shim_link's refcnt.
The shim_link holds both the trampoline's and the shim_prog's refcnt.

As long as there are attached cgroup-lsm prog(s), the shim_link's refcnt
should not be zero.  The shim_link will stay, and so do the
shim_link's trampoline and shim_prog.

When the last cgroup-lsm prog is detached, bpf_link_put() should
unlink the shim_link (and its shim_prog) from the trampoline first and
then do a bpf_trampoline_put(tr) and bpf_prog_put(shim_prog).
I think bpf_tracing_link_release() is doing something similar.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 04/11] bpf: minimize number of allocated lsm slots per program
  2022-05-24  2:14     ` Stanislav Fomichev
@ 2022-05-24  5:53       ` Martin KaFai Lau
  0 siblings, 0 replies; 54+ messages in thread
From: Martin KaFai Lau @ 2022-05-24  5:53 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii

On Mon, May 23, 2022 at 07:14:42PM -0700, Stanislav Fomichev wrote:
> On Fri, May 20, 2022 at 11:56 PM Martin KaFai Lau <kafai@fb.com> wrote:
> >
> > On Wed, May 18, 2022 at 03:55:24PM -0700, Stanislav Fomichev wrote:
> > > Previous patch adds 1:1 mapping between all 211 LSM hooks
> > > and bpf_cgroup program array. Instead of reserving a slot per
> > > possible hook, reserve 10 slots per cgroup for lsm programs.
> > > Those slots are dynamically allocated on demand and reclaimed.
> > >
> > > struct cgroup_bpf {
> > >       struct bpf_prog_array *    effective[33];        /*     0   264 */
> > >       /* --- cacheline 4 boundary (256 bytes) was 8 bytes ago --- */
> > >       struct hlist_head          progs[33];            /*   264   264 */
> > >       /* --- cacheline 8 boundary (512 bytes) was 16 bytes ago --- */
> > >       u8                         flags[33];            /*   528    33 */
> > >
> > >       /* XXX 7 bytes hole, try to pack */
> > >
> > >       struct list_head           storages;             /*   568    16 */
> > >       /* --- cacheline 9 boundary (576 bytes) was 8 bytes ago --- */
> > >       struct bpf_prog_array *    inactive;             /*   584     8 */
> > >       struct percpu_ref          refcnt;               /*   592    16 */
> > >       struct work_struct         release_work;         /*   608    72 */
> > >
> > >       /* size: 680, cachelines: 11, members: 7 */
> > >       /* sum members: 673, holes: 1, sum holes: 7 */
> > >       /* last cacheline: 40 bytes */
> > > };
> > >
> > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > ---
> > >  include/linux/bpf-cgroup-defs.h |   3 +-
> > >  include/linux/bpf_lsm.h         |   6 --
> > >  kernel/bpf/bpf_lsm.c            |   5 --
> > >  kernel/bpf/cgroup.c             | 135 +++++++++++++++++++++++++++++---
> > >  4 files changed, 125 insertions(+), 24 deletions(-)
> > >
> > > diff --git a/include/linux/bpf-cgroup-defs.h b/include/linux/bpf-cgroup-defs.h
> > > index d5a70a35dace..359d3f16abea 100644
> > > --- a/include/linux/bpf-cgroup-defs.h
> > > +++ b/include/linux/bpf-cgroup-defs.h
> > > @@ -10,7 +10,8 @@
> > >
> > >  struct bpf_prog_array;
> > >
> > > -#define CGROUP_LSM_NUM 211 /* will be addressed in the next patch */
> > > +/* Maximum number of concurrently attachable per-cgroup LSM hooks. */
> > > +#define CGROUP_LSM_NUM 10
> > >
> > >  enum cgroup_bpf_attach_type {
> > >       CGROUP_BPF_ATTACH_TYPE_INVALID = -1,
> > > diff --git a/include/linux/bpf_lsm.h b/include/linux/bpf_lsm.h
> > > index 7f0e59f5f9be..613de44aa429 100644
> > > --- a/include/linux/bpf_lsm.h
> > > +++ b/include/linux/bpf_lsm.h
> > > @@ -43,7 +43,6 @@ extern const struct bpf_func_proto bpf_inode_storage_delete_proto;
> > >  void bpf_inode_storage_free(struct inode *inode);
> > >
> > >  int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog, bpf_func_t *bpf_func);
> > > -int bpf_lsm_hook_idx(u32 btf_id);
> > >
> > >  #else /* !CONFIG_BPF_LSM */
> > >
> > > @@ -74,11 +73,6 @@ static inline int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
> > >       return -ENOENT;
> > >  }
> > >
> > > -static inline int bpf_lsm_hook_idx(u32 btf_id)
> > > -{
> > > -     return -EINVAL;
> > > -}
> > > -
> > >  #endif /* CONFIG_BPF_LSM */
> > >
> > >  #endif /* _LINUX_BPF_LSM_H */
> > > diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
> > > index 654c23577ad3..96503c3e7a71 100644
> > > --- a/kernel/bpf/bpf_lsm.c
> > > +++ b/kernel/bpf/bpf_lsm.c
> > > @@ -71,11 +71,6 @@ int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
> > >       return 0;
> > >  }
> > >
> > > -int bpf_lsm_hook_idx(u32 btf_id)
> > > -{
> > > -     return btf_id_set_index(&bpf_lsm_hooks, btf_id);
> > > -}
> > > -
> > >  int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
> > >                       const struct bpf_prog *prog)
> > >  {
> > > diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> > > index 2c356a38f4cf..a959cdd22870 100644
> > > --- a/kernel/bpf/cgroup.c
> > > +++ b/kernel/bpf/cgroup.c
> > > @@ -132,15 +132,110 @@ unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
> > >  }
> > >
> > >  #ifdef CONFIG_BPF_LSM
> > > +struct list_head unused_bpf_lsm_atypes;
> > > +struct list_head used_bpf_lsm_atypes;
> > > +
> > > +struct bpf_lsm_attach_type {
> > > +     int index;
> > > +     u32 btf_id;
> > > +     int usecnt;
> > > +     struct list_head atypes;
> > > +     struct rcu_head rcu_head;
> > > +};
> > > +
> > > +static int __init bpf_lsm_attach_type_init(void)
> > > +{
> > > +     struct bpf_lsm_attach_type *atype;
> > > +     int i;
> > > +
> > > +     INIT_LIST_HEAD_RCU(&unused_bpf_lsm_atypes);
> > > +     INIT_LIST_HEAD_RCU(&used_bpf_lsm_atypes);
> > > +
> > > +     for (i = 0; i < CGROUP_LSM_NUM; i++) {
> > > +             atype = kzalloc(sizeof(*atype), GFP_KERNEL);
> > > +             if (!atype)
> > > +                     continue;
> > > +
> > > +             atype->index = i;
> > > +             list_add_tail_rcu(&atype->atypes, &unused_bpf_lsm_atypes);
> > > +     }
> > > +
> > > +     return 0;
> > > +}
> > > +late_initcall(bpf_lsm_attach_type_init);
> > > +
> > >  static enum cgroup_bpf_attach_type bpf_lsm_attach_type_get(u32 attach_btf_id)
> > >  {
> > > -     return CGROUP_LSM_START + bpf_lsm_hook_idx(attach_btf_id);
> > > +     struct bpf_lsm_attach_type *atype;
> > > +
> > > +     lockdep_assert_held(&cgroup_mutex);
> > > +
> > > +     list_for_each_entry_rcu(atype, &used_bpf_lsm_atypes, atypes) {
> > > +             if (atype->btf_id != attach_btf_id)
> > > +                     continue;
> > > +
> > > +             atype->usecnt++;
> > > +             return CGROUP_LSM_START + atype->index;
> > > +     }
> > > +
> > > +     atype = list_first_or_null_rcu(&unused_bpf_lsm_atypes, struct bpf_lsm_attach_type, atypes);
> > > +     if (!atype)
> > > +             return -E2BIG;
> > > +
> > > +     list_del_rcu(&atype->atypes);
> > > +     atype->btf_id = attach_btf_id;
> > > +     atype->usecnt = 1;
> > > +     list_add_tail_rcu(&atype->atypes, &used_bpf_lsm_atypes);
> > > +
> > > +     return CGROUP_LSM_START + atype->index;
> > > +}
> > > +
> > > +static void bpf_lsm_attach_type_reclaim(struct rcu_head *head)
> > > +{
> > > +     struct bpf_lsm_attach_type *atype =
> > > +             container_of(head, struct bpf_lsm_attach_type, rcu_head);
> > > +
> > > +     atype->btf_id = 0;
> > > +     atype->usecnt = 0;
> > > +     list_add_tail_rcu(&atype->atypes, &unused_bpf_lsm_atypes);
> > hmm...... no need to hold the cgroup_mutex when changing
> > the unused_bpf_lsm_atypes list ?
> > but it is a rcu callback, so spinlock is needed.
> 
> Oh, good point.
> 
> > > +}
> > > +
> > > +static void bpf_lsm_attach_type_put(u32 attach_btf_id)
> > > +{
> > > +     struct bpf_lsm_attach_type *atype;
> > > +
> > > +     lockdep_assert_held(&cgroup_mutex);
> > > +
> > > +     list_for_each_entry_rcu(atype, &used_bpf_lsm_atypes, atypes) {
> > > +             if (atype->btf_id != attach_btf_id)
> > > +                     continue;
> > > +
> > > +             if (--atype->usecnt <= 0) {
> > > +                     list_del_rcu(&atype->atypes);
> > > +                     WARN_ON_ONCE(atype->usecnt < 0);
> > > +
> > > +                     /* call_rcu here prevents atype reuse within
> > > +                      * the same rcu grace period.
> > > +                      * shim programs use __bpf_prog_enter_lsm_cgroup
> > > +                      * which starts RCU read section.
> > It is a bit unclear for me to think through why
> > there is no need to assign 'shim_prog->aux->cgroup_atype = CGROUP_BPF_ATTACH_TYPE_INVALID'
> > here before reclaim and the shim_prog->bpf_func does not need to check
> > shim_prog->aux->cgroup_atype before using it.
> >
> > It will be very useful to have a few word comments here to explain this.
> 
> My thinking is:
> - shim program starts an rcu read section (via __bpf_prog_enter_lsm_cgroup)
> - on release (bpf_lsm_attach_type_put) we do
> list_del_rcu(&atype->atypes) to make sure that particular atype is
> "reserved" until the grace period and not reused
> - we won't reuse that particular atype for new attachments until the grace period
> - existing shim programs will still use this atype until the grace period,
> but we rely on the cgroup effective array being empty by that point
> - after the grace period, we reclaim that atype
> 
> Does that clarify your concern? Am I missing something? Not sure how to
> put it into a small/concise comment :-)
Makes sense.  Thanks for the explanation.
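
Maybe something like this as the comment, as a rough attempt to compress
the above:

        /* The atype is removed from the used list with list_del_rcu() and
         * only returned to the unused list from an RCU callback.  Shim progs
         * run under rcu_read_lock() (__bpf_prog_enter_lsm_cgroup), so any
         * shim still running with this atype finishes before the slot can be
         * reused; by then the cgroup effective arrays for it are empty.
         */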

> 
> (maybe moot after your next comment)
> 
> > > +                      */
> > > +                     call_rcu(&atype->rcu_head, bpf_lsm_attach_type_reclaim);
> > How about doing this bpf_lsm_attach_type_put() in bpf_prog_free_deferred().
> > And only do it for the shim_prog but not the cgroup-lsm prog.
> > The shim_prog is the only one needing cgroup_atype.  Then the cgroup_atype
> > naturally can be reused when the shim_prog is being destroyed.
> >
> > bpf_prog_free_deferred has already gone through a rcu grace
> > period (__bpf_prog_put_rcu) and it can block, so cgroup_mutex
> > can be used.
> >
> > The need for the rcu_head here should go away also.  The v6 array approach
> > could be reconsidered.
> >
> > The cgroup-lsm prog does not necessarily need to hold a usecnt to the cgroup_atype.
> > Their aux->cgroup_atype can be CGROUP_BPF_ATTACH_TYPE_INVALID.
> > My understanding is the replace_prog->aux->cgroup_atype during attach
> > is an optimization, it can always search again.
> 
> I've considered using bpf_prog_free (I think Alexei also suggested
> it?), but didn't end up using it because of the situation where the
> program can be attached, then detached but not actually freed (there
> is a link or an fd holding it); in that case we'd be blocking that
> atype reuse. But I'm not sure if it's a real problem?
A cgroup-lsm prog that is loaded but not attached should not block the
atype from being reused.  It probably will not, as long as the cgroup-lsm
prog is not the one holding the atype's refcnt.

The shim_prog only goes through bpf_prog_free_deferred() when there is
no cgroup-lsm prog attached to it.  That will happen after an rcu
grace period, which should be fine.
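
i.e. something along these lines in bpf_prog_free_deferred(), assuming
only the shim_prog ever carries a valid cgroup_atype (sketch only):

        if (aux->prog->type == BPF_PROG_TYPE_LSM &&
            aux->cgroup_atype != CGROUP_BPF_ATTACH_TYPE_INVALID)
                bpf_lsm_attach_type_put(aux->attach_btf_id);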

> Let me try to see if it works again, your suggestions make sense.
> (especially the part about cgroup_atype for shim only, I don't like
> all these replace_prog->aux->cgroup_atype = atype in weird places)
Yep.  Sounds good.  It would be nice if some of the special handling for
BPF_LSM_CGROUP in cgroup.c could be simplified as a result.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 03/11] bpf: per-cgroup lsm flavor
  2022-05-24  2:15     ` Stanislav Fomichev
  2022-05-24  5:40       ` Martin KaFai Lau
@ 2022-05-24  5:57       ` Martin KaFai Lau
  1 sibling, 0 replies; 54+ messages in thread
From: Martin KaFai Lau @ 2022-05-24  5:57 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii

On Mon, May 23, 2022 at 07:15:03PM -0700, Stanislav Fomichev wrote:
> > > +             return;
> > > +
> > > +     WARN_ON_ONCE(__bpf_trampoline_unlink_prog(&shim_link->tramp_link, shim_link->tr));
> > > +     kfree(shim_link);
> > How about shim_link->tramp_link.link.prog, is the prog freed ?
> >
> > Considering the bpf_link_put() does bpf_prog_put(link->prog).
> > Is there a reason the bpf_link_put() is not used and it needs to
> > manage its own shim_link->refcnt here?
> 
> Good catch, I've missed the bpf_prog_put(link->prog) part. Let me see
> if I can use the link's refcnt, it seems like I can define my own
> link->ops->dealloc to call __bpf_trampoline_unlink_prog and the rest
> will be taken care of.
From looking at bpf_link_free(), link->ops->release may be a better fit
than ->dealloc, because the release() callback will still need to use the
shim_prog (e.g. shim_prog->aux->cgroup_atype).
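
In other words, something along these lines (a sketch only; names follow
the discussion above and error handling is omitted):

static void bpf_shim_tramp_link_release(struct bpf_link *link)
{
	struct bpf_shim_tramp_link *shim_link =
		container_of(link, struct bpf_shim_tramp_link, tramp_link.link);

	if (!shim_link->tr)	/* attach failed before linking to the trampoline */
		return;

	/* shim_prog (link->prog) is still alive at this point, so
	 * shim_prog->aux->cgroup_atype can be used during the unlink
	 */
	WARN_ON_ONCE(__bpf_trampoline_unlink_prog(&shim_link->tramp_link,
						  shim_link->tr));
	bpf_trampoline_put(shim_link->tr);
}

static void bpf_shim_tramp_link_dealloc(struct bpf_link *link)
{
	struct bpf_shim_tramp_link *shim_link =
		container_of(link, struct bpf_shim_tramp_link, tramp_link.link);

	kfree(shim_link);
}

static const struct bpf_link_ops bpf_shim_tramp_link_lops = {
	.release = bpf_shim_tramp_link_release,
	.dealloc = bpf_shim_tramp_link_dealloc,
};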

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-05-24  3:48   ` Martin KaFai Lau
@ 2022-05-24 15:55     ` Stanislav Fomichev
  2022-05-24 17:50       ` Martin KaFai Lau
  0 siblings, 1 reply; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-24 15:55 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: netdev, bpf, ast, daniel, andrii

On Mon, May 23, 2022 at 8:49 PM Martin KaFai Lau <kafai@fb.com> wrote:
>
> On Wed, May 18, 2022 at 03:55:25PM -0700, Stanislav Fomichev wrote:
> > We have two options:
> > 1. Treat all BPF_LSM_CGROUP the same, regardless of attach_btf_id
> > 2. Treat BPF_LSM_CGROUP+attach_btf_id as a separate hook point
> >
> > I was doing (2) in the original patch, but switching to (1) here:
> >
> > * bpf_prog_query returns all attached BPF_LSM_CGROUP programs
> > regardless of attach_btf_id
> > * attach_btf_id is exported via bpf_prog_info
> >
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >  include/uapi/linux/bpf.h |   5 ++
> >  kernel/bpf/cgroup.c      | 103 +++++++++++++++++++++++++++------------
> >  kernel/bpf/syscall.c     |   4 +-
> >  3 files changed, 81 insertions(+), 31 deletions(-)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index b9d2d6de63a7..432fc5f49567 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -1432,6 +1432,7 @@ union bpf_attr {
> >               __u32           attach_flags;
> >               __aligned_u64   prog_ids;
> >               __u32           prog_cnt;
> > +             __aligned_u64   prog_attach_flags; /* output: per-program attach_flags */
> >       } query;
> >
> >       struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
> > @@ -5911,6 +5912,10 @@ struct bpf_prog_info {
> >       __u64 run_cnt;
> >       __u64 recursion_misses;
> >       __u32 verified_insns;
> > +     /* BTF ID of the function to attach to within BTF object identified
> > +      * by btf_id.
> > +      */
> > +     __u32 attach_btf_func_id;
> >  } __attribute__((aligned(8)));
> >
> >  struct bpf_map_info {
> > diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> > index a959cdd22870..08a1015ee09e 100644
> > --- a/kernel/bpf/cgroup.c
> > +++ b/kernel/bpf/cgroup.c
> > @@ -1074,6 +1074,7 @@ static int cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
> >  static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
> >                             union bpf_attr __user *uattr)
> >  {
> > +     __u32 __user *prog_attach_flags = u64_to_user_ptr(attr->query.prog_attach_flags);
> >       __u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids);
> >       enum bpf_attach_type type = attr->query.attach_type;
> >       enum cgroup_bpf_attach_type atype;
> > @@ -1081,50 +1082,92 @@ static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
> >       struct hlist_head *progs;
> >       struct bpf_prog *prog;
> >       int cnt, ret = 0, i;
> > +     int total_cnt = 0;
> >       u32 flags;
> >
> > -     atype = to_cgroup_bpf_attach_type(type);
> > -     if (atype < 0)
> > -             return -EINVAL;
> > +     enum cgroup_bpf_attach_type from_atype, to_atype;
> >
> > -     progs = &cgrp->bpf.progs[atype];
> > -     flags = cgrp->bpf.flags[atype];
> > +     if (type == BPF_LSM_CGROUP) {
> > +             from_atype = CGROUP_LSM_START;
> > +             to_atype = CGROUP_LSM_END;
> > +     } else {
> > +             from_atype = to_cgroup_bpf_attach_type(type);
> > +             if (from_atype < 0)
> > +                     return -EINVAL;
> > +             to_atype = from_atype;
> > +     }
> >
> > -     effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > -                                           lockdep_is_held(&cgroup_mutex));
> > +     for (atype = from_atype; atype <= to_atype; atype++) {
> > +             progs = &cgrp->bpf.progs[atype];
> > +             flags = cgrp->bpf.flags[atype];
> >
> > -     if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > -             cnt = bpf_prog_array_length(effective);
> > -     else
> > -             cnt = prog_list_length(progs);
> > +             effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > +                                                   lockdep_is_held(&cgroup_mutex));
> >
> > -     if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
> > -             return -EFAULT;
> > -     if (copy_to_user(&uattr->query.prog_cnt, &cnt, sizeof(cnt)))
> > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > +                     total_cnt += bpf_prog_array_length(effective);
> > +             else
> > +                     total_cnt += prog_list_length(progs);
> > +     }
> > +
> > +     if (type != BPF_LSM_CGROUP)
> > +             if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
> > +                     return -EFAULT;
> > +     if (copy_to_user(&uattr->query.prog_cnt, &total_cnt, sizeof(total_cnt)))
> >               return -EFAULT;
> > -     if (attr->query.prog_cnt == 0 || !prog_ids || !cnt)
> > +     if (attr->query.prog_cnt == 0 || !prog_ids || !total_cnt)
> >               /* return early if user requested only program count + flags */
> >               return 0;
> > -     if (attr->query.prog_cnt < cnt) {
> > -             cnt = attr->query.prog_cnt;
> > +
> > +     if (attr->query.prog_cnt < total_cnt) {
> > +             total_cnt = attr->query.prog_cnt;
> >               ret = -ENOSPC;
> >       }
> >
> > -     if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
> > -             return bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> > -     } else {
> > -             struct bpf_prog_list *pl;
> > -             u32 id;
> > +     for (atype = from_atype; atype <= to_atype; atype++) {
> > +             if (total_cnt <= 0)
> > +                     break;
> >
> > -             i = 0;
> > -             hlist_for_each_entry(pl, progs, node) {
> > -                     prog = prog_list_prog(pl);
> > -                     id = prog->aux->id;
> > -                     if (copy_to_user(prog_ids + i, &id, sizeof(id)))
> > -                             return -EFAULT;
> > -                     if (++i == cnt)
> > -                             break;
> > +             progs = &cgrp->bpf.progs[atype];
> > +             flags = cgrp->bpf.flags[atype];
> > +
> > +             effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > +                                                   lockdep_is_held(&cgroup_mutex));
> > +
> > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > +                     cnt = bpf_prog_array_length(effective);
> > +             else
> > +                     cnt = prog_list_length(progs);
> > +
> > +             if (cnt >= total_cnt)
> > +                     cnt = total_cnt;
> > +
> > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
> > +                     ret = bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> > +             } else {
> > +                     struct bpf_prog_list *pl;
> > +                     u32 id;
> > +
> > +                     i = 0;
> > +                     hlist_for_each_entry(pl, progs, node) {
> > +                             prog = prog_list_prog(pl);
> > +                             id = prog->aux->id;
> > +                             if (copy_to_user(prog_ids + i, &id, sizeof(id)))
> > +                                     return -EFAULT;
> > +                             if (++i == cnt)
> > +                                     break;
> > +                     }
> >               }
> > +
> > +             if (prog_attach_flags)
> > +                     for (i = 0; i < cnt; i++)
> > +                             if (copy_to_user(prog_attach_flags + i, &flags, sizeof(flags)))
> > +                                     return -EFAULT;
> > +
> > +             prog_ids += cnt;
> > +             total_cnt -= cnt;
> > +             if (prog_attach_flags)
> > +                     prog_attach_flags += cnt;
> >       }
> >       return ret;
> >  }
> > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > index 5ed2093e51cc..4137583c04a2 100644
> > --- a/kernel/bpf/syscall.c
> > +++ b/kernel/bpf/syscall.c
> > @@ -3520,7 +3520,7 @@ static int bpf_prog_detach(const union bpf_attr *attr)
> >       }
> >  }
> >
> > -#define BPF_PROG_QUERY_LAST_FIELD query.prog_cnt
> > +#define BPF_PROG_QUERY_LAST_FIELD query.prog_attach_flags
> >
> >  static int bpf_prog_query(const union bpf_attr *attr,
> >                         union bpf_attr __user *uattr)
> > @@ -3556,6 +3556,7 @@ static int bpf_prog_query(const union bpf_attr *attr,
> >       case BPF_CGROUP_SYSCTL:
> >       case BPF_CGROUP_GETSOCKOPT:
> >       case BPF_CGROUP_SETSOCKOPT:
> > +     case BPF_LSM_CGROUP:
> >               return cgroup_bpf_prog_query(attr, uattr);
> >       case BPF_LIRC_MODE2:
> >               return lirc_prog_query(attr, uattr);
> > @@ -4066,6 +4067,7 @@ static int bpf_prog_get_info_by_fd(struct file *file,
> >
> >       if (prog->aux->btf)
> >               info.btf_id = btf_obj_id(prog->aux->btf);
> > +     info.attach_btf_func_id = prog->aux->attach_btf_id;
> Note that exposing prog->aux->attach_btf_id only may not be enough
> unless it can assume info.attach_btf_id is always referring to btf_vmlinux
> for all bpf prog types.

We also export btf_id two lines above, right? Btw, I left a comment in
the bpftool about those btf_ids; I'm not sure how to resolve them and
always assume vmlinux for now.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 03/11] bpf: per-cgroup lsm flavor
  2022-05-24  5:40       ` Martin KaFai Lau
@ 2022-05-24 15:56         ` Stanislav Fomichev
  0 siblings, 0 replies; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-24 15:56 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: netdev, bpf, ast, daniel, andrii

On Mon, May 23, 2022 at 10:40 PM Martin KaFai Lau <kafai@fb.com> wrote:
>
> On Mon, May 23, 2022 at 07:15:03PM -0700, Stanislav Fomichev wrote:
> > ,
> >
> > On Fri, May 20, 2022 at 5:53 PM Martin KaFai Lau <kafai@fb.com> wrote:
> > >
> > > On Wed, May 18, 2022 at 03:55:23PM -0700, Stanislav Fomichev wrote:
> > >
> > > [ ... ]
> > >
> > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > > index ea3674a415f9..70cf1dad91df 100644
> > > > --- a/include/linux/bpf.h
> > > > +++ b/include/linux/bpf.h
> > > > @@ -768,6 +768,10 @@ void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start, struct bpf_tramp_
> > > >  u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog, struct bpf_tramp_run_ctx *run_ctx);
> > > >  void notrace __bpf_prog_exit_sleepable(struct bpf_prog *prog, u64 start,
> > > >                                      struct bpf_tramp_run_ctx *run_ctx);
> > > > +u64 notrace __bpf_prog_enter_lsm_cgroup(struct bpf_prog *prog,
> > > > +                                     struct bpf_tramp_run_ctx *run_ctx);
> > > > +void notrace __bpf_prog_exit_lsm_cgroup(struct bpf_prog *prog, u64 start,
> > > > +                                     struct bpf_tramp_run_ctx *run_ctx);
> > > >  void notrace __bpf_tramp_enter(struct bpf_tramp_image *tr);
> > > >  void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr);
> > > >
> > > > @@ -1035,6 +1039,7 @@ struct bpf_prog_aux {
> > > >       u64 load_time; /* ns since boottime */
> > > >       u32 verified_insns;
> > > >       struct bpf_map *cgroup_storage[MAX_BPF_CGROUP_STORAGE_TYPE];
> > > > +     int cgroup_atype; /* enum cgroup_bpf_attach_type */
> > > >       char name[BPF_OBJ_NAME_LEN];
> > > >  #ifdef CONFIG_SECURITY
> > > >       void *security;
> > > > @@ -1107,6 +1112,12 @@ struct bpf_tramp_link {
> > > >       u64 cookie;
> > > >  };
> > > >
> > > > +struct bpf_shim_tramp_link {
> > > > +     struct bpf_tramp_link tramp_link;
> > > > +     struct bpf_trampoline *tr;
> > > > +     atomic64_t refcnt;
> > > There is already a refcnt in 'struct bpf_link'.
> > > Reuse that one if possible.
> >
> > I was assuming that having a per-bpf_shim_tramp_link refcnt might be
> > more readable. I'll switch to the one from bpf_link per comments
> > below.
> >
> > > [ ... ]
> > >
> > > > diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
> > > > index 01ce78c1df80..c424056f0b35 100644
> > > > --- a/kernel/bpf/trampoline.c
> > > > +++ b/kernel/bpf/trampoline.c
> > > > @@ -11,6 +11,8 @@
> > > >  #include <linux/rcupdate_wait.h>
> > > >  #include <linux/module.h>
> > > >  #include <linux/static_call.h>
> > > > +#include <linux/bpf_verifier.h>
> > > > +#include <linux/bpf_lsm.h>
> > > >
> > > >  /* dummy _ops. The verifier will operate on target program's ops. */
> > > >  const struct bpf_verifier_ops bpf_extension_verifier_ops = {
> > > > @@ -497,6 +499,163 @@ int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampolin
> > > >       return err;
> > > >  }
> > > >
> > > > +#if defined(CONFIG_BPF_JIT) && defined(CONFIG_BPF_SYSCALL)
> > > > +static struct bpf_shim_tramp_link *cgroup_shim_alloc(const struct bpf_prog *prog,
> > > > +                                                  bpf_func_t bpf_func)
> > > > +{
> > > > +     struct bpf_shim_tramp_link *shim_link = NULL;
> > > > +     struct bpf_prog *p;
> > > > +
> > > > +     shim_link = kzalloc(sizeof(*shim_link), GFP_USER);
> > > > +     if (!shim_link)
> > > > +             return NULL;
> > > > +
> > > > +     p = bpf_prog_alloc(1, 0);
> > > > +     if (!p) {
> > > > +             kfree(shim_link);
> > > > +             return NULL;
> > > > +     }
> > > > +
> > > > +     p->jited = false;
> > > > +     p->bpf_func = bpf_func;
> > > > +
> > > > +     p->aux->cgroup_atype = prog->aux->cgroup_atype;
> > > > +     p->aux->attach_func_proto = prog->aux->attach_func_proto;
> > > > +     p->aux->attach_btf_id = prog->aux->attach_btf_id;
> > > > +     p->aux->attach_btf = prog->aux->attach_btf;
> > > > +     btf_get(p->aux->attach_btf);
> > > > +     p->type = BPF_PROG_TYPE_LSM;
> > > > +     p->expected_attach_type = BPF_LSM_MAC;
> > > > +     bpf_prog_inc(p);
> > > > +     bpf_link_init(&shim_link->tramp_link.link, BPF_LINK_TYPE_TRACING, NULL, p);
> > > > +     atomic64_set(&shim_link->refcnt, 1);
> > > > +
> > > > +     return shim_link;
> > > > +}
> > > > +
> > > > +static struct bpf_shim_tramp_link *cgroup_shim_find(struct bpf_trampoline *tr,
> > > > +                                                 bpf_func_t bpf_func)
> > > > +{
> > > > +     struct bpf_tramp_link *link;
> > > > +     int kind;
> > > > +
> > > > +     for (kind = 0; kind < BPF_TRAMP_MAX; kind++) {
> > > > +             hlist_for_each_entry(link, &tr->progs_hlist[kind], tramp_hlist) {
> > > > +                     struct bpf_prog *p = link->link.prog;
> > > > +
> > > > +                     if (p->bpf_func == bpf_func)
> > > > +                             return container_of(link, struct bpf_shim_tramp_link, tramp_link);
> > > > +             }
> > > > +     }
> > > > +
> > > > +     return NULL;
> > > > +}
> > > > +
> > > > +static void cgroup_shim_put(struct bpf_shim_tramp_link *shim_link)
> > > > +{
> > > > +     if (shim_link->tr)
> > > I have been spinning back and forth with this "shim_link->tr" test and
> > > the "!shim_link->tr" test below with an atomic64_dec_and_test() test
> > > in between  :)
> >
> > I did this dance so I can call cgroup_shim_put from
> > bpf_trampoline_link_cgroup_shim; I guess that's confusing.
> > bpf_trampoline_link_cgroup_shim can call cgroup_shim_put when
> > __bpf_trampoline_link_prog fails (shim_link->tr==NULL);
> > cgroup_shim_put can also be called to unlink the prog from the
> > trampoline (shim_link->tr!=NULL).
> >
> > > > +             bpf_trampoline_put(shim_link->tr);
> > > Why put(tr) here?
> > >
> > > Intuitive thinking is that should be done after __bpf_trampoline_unlink_prog(.., tr)
> > > which is still using the tr.
> > > or I missed something inside __bpf_trampoline_unlink_prog(..., tr) ?
> > >
> > > > +
> > > > +     if (!atomic64_dec_and_test(&shim_link->refcnt))
> > > > +             return;
> > > > +
> > > > +     if (!shim_link->tr)
> > > And this is only for the error case in bpf_trampoline_link_cgroup_shim()?
> > > Can it be handled locally in bpf_trampoline_link_cgroup_shim()
> > > where it could actually happen ?
> >
> > Yeah, agreed, I'll move the cleanup path to
> > bpf_trampoline_link_cgroup_shim to make it less confusing here.
> >
> > > > +             return;
> > > > +
> > > > +     WARN_ON_ONCE(__bpf_trampoline_unlink_prog(&shim_link->tramp_link, shim_link->tr));
> > > > +     kfree(shim_link);
> > > How about shim_link->tramp_link.link.prog, is the prog freed ?
> > >
> > > Considering the bpf_link_put() does bpf_prog_put(link->prog).
> > > Is there a reason the bpf_link_put() is not used and it needs to
> > > manage its own shim_link->refcnt here?
> >
> > Good catch, I've missed the bpf_prog_put(link->prog) part. Let me see
> > if I can use the link's refcnt, it seems like I can define my own
> > link->ops->dealloc to call __bpf_trampoline_unlink_prog and the rest
> > will be taken care of.
> >
> > > > +}
> > > > +
> > > > +int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
> > > > +                                 struct bpf_attach_target_info *tgt_info)
> > > > +{
> > > > +     struct bpf_shim_tramp_link *shim_link = NULL;
> > > > +     struct bpf_trampoline *tr;
> > > > +     bpf_func_t bpf_func;
> > > > +     u64 key;
> > > > +     int err;
> > > > +
> > > > +     key = bpf_trampoline_compute_key(NULL, prog->aux->attach_btf,
> > > > +                                      prog->aux->attach_btf_id);
> > > > +
> > > > +     err = bpf_lsm_find_cgroup_shim(prog, &bpf_func);
> > > > +     if (err)
> > > > +             return err;
> > > > +
> > > > +     tr = bpf_trampoline_get(key, tgt_info);
> > > > +     if (!tr)
> > > > +             return  -ENOMEM;
> > > > +
> > > > +     mutex_lock(&tr->mutex);
> > > > +
> > > > +     shim_link = cgroup_shim_find(tr, bpf_func);
> > > > +     if (shim_link) {
> > > > +             /* Reusing existing shim attached by the other program. */
> > > > +             atomic64_inc(&shim_link->refcnt);
> > > > +             /* note, we're still holding tr refcnt from above */
> > > hmm... why does it still need to hold the tr refcnt?
> >
> > I'm assuming we need to hold the trampoline for as long as shim_prog
> > is attached to it, right? Otherwise it gets kfreed.
> Each 'attached' cgroup-lsm prog holds the shim_link's refcnt.
> shim_link holds both the trampoline's and the shim_prog's refcnt.
>
> As long as there are attached cgroup-lsm prog(s), shim_link's refcnt
> should not be zero.  The shim_link will stay, and so will the
> shim_link's trampoline and shim_prog.
>
> When the last cgroup-lsm prog is detached, bpf_link_put() should
> unlink itself (and its shim_prog) from the trampoline first and
> then do a bpf_trampoline_put(tr) and bpf_prog_put(shim_prog).
> I think bpf_tracing_link_release() is doing something similar also.

Yeah, I played with it a bit yesterday and ended up with the same
contents as bpf_tracing_link_release. Thanks for the pointers!
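
For completeness, the attach side then looks roughly like this (a sketch;
the allocation path for a new shim link is elided and error handling is
simplified):

int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
				    struct bpf_attach_target_info *tgt_info)
{
	struct bpf_shim_tramp_link *shim_link;
	struct bpf_trampoline *tr;
	bpf_func_t bpf_func;
	u64 key;
	int err;

	key = bpf_trampoline_compute_key(NULL, prog->aux->attach_btf,
					 prog->aux->attach_btf_id);
	err = bpf_lsm_find_cgroup_shim(prog, &bpf_func);
	if (err)
		return err;

	tr = bpf_trampoline_get(key, tgt_info);
	if (!tr)
		return -ENOMEM;

	mutex_lock(&tr->mutex);
	shim_link = cgroup_shim_find(tr, bpf_func);
	if (shim_link) {
		/* Reuse: the existing link already pins the trampoline and
		 * the shim prog, so only take a link reference and drop the
		 * extra trampoline ref taken by bpf_trampoline_get() above.
		 */
		bpf_link_inc(&shim_link->tramp_link.link);
		mutex_unlock(&tr->mutex);
		bpf_trampoline_put(tr);
		return 0;
	}

	/* ... otherwise allocate a new shim link (which keeps the trampoline
	 * reference in shim_link->tr) and __bpf_trampoline_link_prog() it ...
	 */
	err = -ENOMEM;	/* placeholder for the elided allocation path */
	mutex_unlock(&tr->mutex);
	bpf_trampoline_put(tr);
	return err;
}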

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-05-24 15:55     ` Stanislav Fomichev
@ 2022-05-24 17:50       ` Martin KaFai Lau
  2022-05-24 23:45         ` Andrii Nakryiko
  0 siblings, 1 reply; 54+ messages in thread
From: Martin KaFai Lau @ 2022-05-24 17:50 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii

On Tue, May 24, 2022 at 08:55:04AM -0700, Stanislav Fomichev wrote:
> On Mon, May 23, 2022 at 8:49 PM Martin KaFai Lau <kafai@fb.com> wrote:
> >
> > On Wed, May 18, 2022 at 03:55:25PM -0700, Stanislav Fomichev wrote:
> > > We have two options:
> > > 1. Treat all BPF_LSM_CGROUP the same, regardless of attach_btf_id
> > > 2. Treat BPF_LSM_CGROUP+attach_btf_id as a separate hook point
> > >
> > > I was doing (2) in the original patch, but switching to (1) here:
> > >
> > > * bpf_prog_query returns all attached BPF_LSM_CGROUP programs
> > > regardless of attach_btf_id
> > > * attach_btf_id is exported via bpf_prog_info
> > >
> > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > ---
> > >  include/uapi/linux/bpf.h |   5 ++
> > >  kernel/bpf/cgroup.c      | 103 +++++++++++++++++++++++++++------------
> > >  kernel/bpf/syscall.c     |   4 +-
> > >  3 files changed, 81 insertions(+), 31 deletions(-)
> > >
> > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > index b9d2d6de63a7..432fc5f49567 100644
> > > --- a/include/uapi/linux/bpf.h
> > > +++ b/include/uapi/linux/bpf.h
> > > @@ -1432,6 +1432,7 @@ union bpf_attr {
> > >               __u32           attach_flags;
> > >               __aligned_u64   prog_ids;
> > >               __u32           prog_cnt;
> > > +             __aligned_u64   prog_attach_flags; /* output: per-program attach_flags */
> > >       } query;
> > >
> > >       struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
> > > @@ -5911,6 +5912,10 @@ struct bpf_prog_info {
> > >       __u64 run_cnt;
> > >       __u64 recursion_misses;
> > >       __u32 verified_insns;
> > > +     /* BTF ID of the function to attach to within BTF object identified
> > > +      * by btf_id.
> > > +      */
> > > +     __u32 attach_btf_func_id;
> > >  } __attribute__((aligned(8)));
> > >
> > >  struct bpf_map_info {
> > > diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> > > index a959cdd22870..08a1015ee09e 100644
> > > --- a/kernel/bpf/cgroup.c
> > > +++ b/kernel/bpf/cgroup.c
> > > @@ -1074,6 +1074,7 @@ static int cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
> > >  static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
> > >                             union bpf_attr __user *uattr)
> > >  {
> > > +     __u32 __user *prog_attach_flags = u64_to_user_ptr(attr->query.prog_attach_flags);
> > >       __u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids);
> > >       enum bpf_attach_type type = attr->query.attach_type;
> > >       enum cgroup_bpf_attach_type atype;
> > > @@ -1081,50 +1082,92 @@ static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
> > >       struct hlist_head *progs;
> > >       struct bpf_prog *prog;
> > >       int cnt, ret = 0, i;
> > > +     int total_cnt = 0;
> > >       u32 flags;
> > >
> > > -     atype = to_cgroup_bpf_attach_type(type);
> > > -     if (atype < 0)
> > > -             return -EINVAL;
> > > +     enum cgroup_bpf_attach_type from_atype, to_atype;
> > >
> > > -     progs = &cgrp->bpf.progs[atype];
> > > -     flags = cgrp->bpf.flags[atype];
> > > +     if (type == BPF_LSM_CGROUP) {
> > > +             from_atype = CGROUP_LSM_START;
> > > +             to_atype = CGROUP_LSM_END;
> > > +     } else {
> > > +             from_atype = to_cgroup_bpf_attach_type(type);
> > > +             if (from_atype < 0)
> > > +                     return -EINVAL;
> > > +             to_atype = from_atype;
> > > +     }
> > >
> > > -     effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > -                                           lockdep_is_held(&cgroup_mutex));
> > > +     for (atype = from_atype; atype <= to_atype; atype++) {
> > > +             progs = &cgrp->bpf.progs[atype];
> > > +             flags = cgrp->bpf.flags[atype];
> > >
> > > -     if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > > -             cnt = bpf_prog_array_length(effective);
> > > -     else
> > > -             cnt = prog_list_length(progs);
> > > +             effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > +                                                   lockdep_is_held(&cgroup_mutex));
> > >
> > > -     if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
> > > -             return -EFAULT;
> > > -     if (copy_to_user(&uattr->query.prog_cnt, &cnt, sizeof(cnt)))
> > > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > > +                     total_cnt += bpf_prog_array_length(effective);
> > > +             else
> > > +                     total_cnt += prog_list_length(progs);
> > > +     }
> > > +
> > > +     if (type != BPF_LSM_CGROUP)
> > > +             if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
> > > +                     return -EFAULT;
> > > +     if (copy_to_user(&uattr->query.prog_cnt, &total_cnt, sizeof(total_cnt)))
> > >               return -EFAULT;
> > > -     if (attr->query.prog_cnt == 0 || !prog_ids || !cnt)
> > > +     if (attr->query.prog_cnt == 0 || !prog_ids || !total_cnt)
> > >               /* return early if user requested only program count + flags */
> > >               return 0;
> > > -     if (attr->query.prog_cnt < cnt) {
> > > -             cnt = attr->query.prog_cnt;
> > > +
> > > +     if (attr->query.prog_cnt < total_cnt) {
> > > +             total_cnt = attr->query.prog_cnt;
> > >               ret = -ENOSPC;
> > >       }
> > >
> > > -     if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
> > > -             return bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> > > -     } else {
> > > -             struct bpf_prog_list *pl;
> > > -             u32 id;
> > > +     for (atype = from_atype; atype <= to_atype; atype++) {
> > > +             if (total_cnt <= 0)
> > > +                     break;
> > >
> > > -             i = 0;
> > > -             hlist_for_each_entry(pl, progs, node) {
> > > -                     prog = prog_list_prog(pl);
> > > -                     id = prog->aux->id;
> > > -                     if (copy_to_user(prog_ids + i, &id, sizeof(id)))
> > > -                             return -EFAULT;
> > > -                     if (++i == cnt)
> > > -                             break;
> > > +             progs = &cgrp->bpf.progs[atype];
> > > +             flags = cgrp->bpf.flags[atype];
> > > +
> > > +             effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > +                                                   lockdep_is_held(&cgroup_mutex));
> > > +
> > > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > > +                     cnt = bpf_prog_array_length(effective);
> > > +             else
> > > +                     cnt = prog_list_length(progs);
> > > +
> > > +             if (cnt >= total_cnt)
> > > +                     cnt = total_cnt;
> > > +
> > > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
> > > +                     ret = bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> > > +             } else {
> > > +                     struct bpf_prog_list *pl;
> > > +                     u32 id;
> > > +
> > > +                     i = 0;
> > > +                     hlist_for_each_entry(pl, progs, node) {
> > > +                             prog = prog_list_prog(pl);
> > > +                             id = prog->aux->id;
> > > +                             if (copy_to_user(prog_ids + i, &id, sizeof(id)))
> > > +                                     return -EFAULT;
> > > +                             if (++i == cnt)
> > > +                                     break;
> > > +                     }
> > >               }
> > > +
> > > +             if (prog_attach_flags)
> > > +                     for (i = 0; i < cnt; i++)
> > > +                             if (copy_to_user(prog_attach_flags + i, &flags, sizeof(flags)))
> > > +                                     return -EFAULT;
> > > +
> > > +             prog_ids += cnt;
> > > +             total_cnt -= cnt;
> > > +             if (prog_attach_flags)
> > > +                     prog_attach_flags += cnt;
> > >       }
> > >       return ret;
> > >  }
> > > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > > index 5ed2093e51cc..4137583c04a2 100644
> > > --- a/kernel/bpf/syscall.c
> > > +++ b/kernel/bpf/syscall.c
> > > @@ -3520,7 +3520,7 @@ static int bpf_prog_detach(const union bpf_attr *attr)
> > >       }
> > >  }
> > >
> > > -#define BPF_PROG_QUERY_LAST_FIELD query.prog_cnt
> > > +#define BPF_PROG_QUERY_LAST_FIELD query.prog_attach_flags
> > >
> > >  static int bpf_prog_query(const union bpf_attr *attr,
> > >                         union bpf_attr __user *uattr)
> > > @@ -3556,6 +3556,7 @@ static int bpf_prog_query(const union bpf_attr *attr,
> > >       case BPF_CGROUP_SYSCTL:
> > >       case BPF_CGROUP_GETSOCKOPT:
> > >       case BPF_CGROUP_SETSOCKOPT:
> > > +     case BPF_LSM_CGROUP:
> > >               return cgroup_bpf_prog_query(attr, uattr);
> > >       case BPF_LIRC_MODE2:
> > >               return lirc_prog_query(attr, uattr);
> > > @@ -4066,6 +4067,7 @@ static int bpf_prog_get_info_by_fd(struct file *file,
> > >
> > >       if (prog->aux->btf)
> > >               info.btf_id = btf_obj_id(prog->aux->btf);
> > > +     info.attach_btf_func_id = prog->aux->attach_btf_id;
> > Note that exposing prog->aux->attach_btf_id only may not be enough
> > unless it can assume info.attach_btf_id is always referring to btf_vmlinux
> > for all bpf prog types.
> 
> We also export btf_id two lines above, right? Btw, I left a comment in
> the bpftool about those btf_ids; I'm not sure how to resolve them and
> always assume vmlinux for now.
yeah, that btf_id above is the cgroup-lsm prog's btf_id, which has its
func info, line info, etc.  It is not the one the attach_btf_id corresponds
to.  attach_btf_id refers to either aux->attach_btf or aux->dst_prog's btf (or
target btf id here).

It needs a consensus on where this attach_btf_id, target btf id, and
prog_attach_flags should be.  If I read the patch 7 thread correctly,
I think Andrii is suggesting to expose them to userspace through the link, so
potentially putting them in bpf_link_info.  The bpf_prog_query will
output a list of link ids.  The same probably applies to
the BPF_F_QUERY_EFFECTIVE query_flags, but I'm not sure about the prog_attach_flags
in this case; probably the userspace can figure that out by using
the cgroup_id in the link?  That is all I can think of right now
and don't have a better idea :)
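
Just to make that concrete, one possible (purely illustrative) shape would
be to hang the attach information off the link instead of the prog, e.g.:

/* sketch of a bpf_link_info extension; field names and layout are not
 * a proposal, just the general idea of "everything lives in the link"
 */
struct bpf_link_info {
	__u32 type;
	__u32 id;
	__u32 prog_id;
	union {
		/* ... existing members (raw_tracepoint, tracing, cgroup, ...) ... */
		struct {
			__u64 cgroup_id;
			__u32 attach_type;	  /* BPF_LSM_CGROUP */
			__u32 target_btf_obj_id;  /* BTF object the hook lives in */
			__u32 target_btf_func_id; /* the lsm hook itself */
		} lsm_cgroup;
	};
} __attribute__((aligned(8)));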

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-05-24 17:50       ` Martin KaFai Lau
@ 2022-05-24 23:45         ` Andrii Nakryiko
  2022-05-25  4:03           ` Stanislav Fomichev
  0 siblings, 1 reply; 54+ messages in thread
From: Andrii Nakryiko @ 2022-05-24 23:45 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Stanislav Fomichev, Networking, bpf, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko

On Tue, May 24, 2022 at 10:50 AM Martin KaFai Lau <kafai@fb.com> wrote:
>
> On Tue, May 24, 2022 at 08:55:04AM -0700, Stanislav Fomichev wrote:
> > On Mon, May 23, 2022 at 8:49 PM Martin KaFai Lau <kafai@fb.com> wrote:
> > >
> > > On Wed, May 18, 2022 at 03:55:25PM -0700, Stanislav Fomichev wrote:
> > > > We have two options:
> > > > 1. Treat all BPF_LSM_CGROUP the same, regardless of attach_btf_id
> > > > 2. Treat BPF_LSM_CGROUP+attach_btf_id as a separate hook point
> > > >
> > > > I was doing (2) in the original patch, but switching to (1) here:
> > > >
> > > > * bpf_prog_query returns all attached BPF_LSM_CGROUP programs
> > > > regardless of attach_btf_id
> > > > * attach_btf_id is exported via bpf_prog_info
> > > >
> > > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > > ---
> > > >  include/uapi/linux/bpf.h |   5 ++
> > > >  kernel/bpf/cgroup.c      | 103 +++++++++++++++++++++++++++------------
> > > >  kernel/bpf/syscall.c     |   4 +-
> > > >  3 files changed, 81 insertions(+), 31 deletions(-)
> > > >
> > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > > index b9d2d6de63a7..432fc5f49567 100644
> > > > --- a/include/uapi/linux/bpf.h
> > > > +++ b/include/uapi/linux/bpf.h
> > > > @@ -1432,6 +1432,7 @@ union bpf_attr {
> > > >               __u32           attach_flags;
> > > >               __aligned_u64   prog_ids;
> > > >               __u32           prog_cnt;
> > > > +             __aligned_u64   prog_attach_flags; /* output: per-program attach_flags */
> > > >       } query;
> > > >
> > > >       struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
> > > > @@ -5911,6 +5912,10 @@ struct bpf_prog_info {
> > > >       __u64 run_cnt;
> > > >       __u64 recursion_misses;
> > > >       __u32 verified_insns;
> > > > +     /* BTF ID of the function to attach to within BTF object identified
> > > > +      * by btf_id.
> > > > +      */
> > > > +     __u32 attach_btf_func_id;
> > > >  } __attribute__((aligned(8)));
> > > >
> > > >  struct bpf_map_info {
> > > > diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> > > > index a959cdd22870..08a1015ee09e 100644
> > > > --- a/kernel/bpf/cgroup.c
> > > > +++ b/kernel/bpf/cgroup.c
> > > > @@ -1074,6 +1074,7 @@ static int cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
> > > >  static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
> > > >                             union bpf_attr __user *uattr)
> > > >  {
> > > > +     __u32 __user *prog_attach_flags = u64_to_user_ptr(attr->query.prog_attach_flags);
> > > >       __u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids);
> > > >       enum bpf_attach_type type = attr->query.attach_type;
> > > >       enum cgroup_bpf_attach_type atype;
> > > > @@ -1081,50 +1082,92 @@ static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
> > > >       struct hlist_head *progs;
> > > >       struct bpf_prog *prog;
> > > >       int cnt, ret = 0, i;
> > > > +     int total_cnt = 0;
> > > >       u32 flags;
> > > >
> > > > -     atype = to_cgroup_bpf_attach_type(type);
> > > > -     if (atype < 0)
> > > > -             return -EINVAL;
> > > > +     enum cgroup_bpf_attach_type from_atype, to_atype;
> > > >
> > > > -     progs = &cgrp->bpf.progs[atype];
> > > > -     flags = cgrp->bpf.flags[atype];
> > > > +     if (type == BPF_LSM_CGROUP) {
> > > > +             from_atype = CGROUP_LSM_START;
> > > > +             to_atype = CGROUP_LSM_END;
> > > > +     } else {
> > > > +             from_atype = to_cgroup_bpf_attach_type(type);
> > > > +             if (from_atype < 0)
> > > > +                     return -EINVAL;
> > > > +             to_atype = from_atype;
> > > > +     }
> > > >
> > > > -     effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > > -                                           lockdep_is_held(&cgroup_mutex));
> > > > +     for (atype = from_atype; atype <= to_atype; atype++) {
> > > > +             progs = &cgrp->bpf.progs[atype];
> > > > +             flags = cgrp->bpf.flags[atype];
> > > >
> > > > -     if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > > > -             cnt = bpf_prog_array_length(effective);
> > > > -     else
> > > > -             cnt = prog_list_length(progs);
> > > > +             effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > > +                                                   lockdep_is_held(&cgroup_mutex));
> > > >
> > > > -     if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
> > > > -             return -EFAULT;
> > > > -     if (copy_to_user(&uattr->query.prog_cnt, &cnt, sizeof(cnt)))
> > > > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > > > +                     total_cnt += bpf_prog_array_length(effective);
> > > > +             else
> > > > +                     total_cnt += prog_list_length(progs);
> > > > +     }
> > > > +
> > > > +     if (type != BPF_LSM_CGROUP)
> > > > +             if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
> > > > +                     return -EFAULT;
> > > > +     if (copy_to_user(&uattr->query.prog_cnt, &total_cnt, sizeof(total_cnt)))
> > > >               return -EFAULT;
> > > > -     if (attr->query.prog_cnt == 0 || !prog_ids || !cnt)
> > > > +     if (attr->query.prog_cnt == 0 || !prog_ids || !total_cnt)
> > > >               /* return early if user requested only program count + flags */
> > > >               return 0;
> > > > -     if (attr->query.prog_cnt < cnt) {
> > > > -             cnt = attr->query.prog_cnt;
> > > > +
> > > > +     if (attr->query.prog_cnt < total_cnt) {
> > > > +             total_cnt = attr->query.prog_cnt;
> > > >               ret = -ENOSPC;
> > > >       }
> > > >
> > > > -     if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
> > > > -             return bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> > > > -     } else {
> > > > -             struct bpf_prog_list *pl;
> > > > -             u32 id;
> > > > +     for (atype = from_atype; atype <= to_atype; atype++) {
> > > > +             if (total_cnt <= 0)
> > > > +                     break;
> > > >
> > > > -             i = 0;
> > > > -             hlist_for_each_entry(pl, progs, node) {
> > > > -                     prog = prog_list_prog(pl);
> > > > -                     id = prog->aux->id;
> > > > -                     if (copy_to_user(prog_ids + i, &id, sizeof(id)))
> > > > -                             return -EFAULT;
> > > > -                     if (++i == cnt)
> > > > -                             break;
> > > > +             progs = &cgrp->bpf.progs[atype];
> > > > +             flags = cgrp->bpf.flags[atype];
> > > > +
> > > > +             effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > > +                                                   lockdep_is_held(&cgroup_mutex));
> > > > +
> > > > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > > > +                     cnt = bpf_prog_array_length(effective);
> > > > +             else
> > > > +                     cnt = prog_list_length(progs);
> > > > +
> > > > +             if (cnt >= total_cnt)
> > > > +                     cnt = total_cnt;
> > > > +
> > > > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
> > > > +                     ret = bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> > > > +             } else {
> > > > +                     struct bpf_prog_list *pl;
> > > > +                     u32 id;
> > > > +
> > > > +                     i = 0;
> > > > +                     hlist_for_each_entry(pl, progs, node) {
> > > > +                             prog = prog_list_prog(pl);
> > > > +                             id = prog->aux->id;
> > > > +                             if (copy_to_user(prog_ids + i, &id, sizeof(id)))
> > > > +                                     return -EFAULT;
> > > > +                             if (++i == cnt)
> > > > +                                     break;
> > > > +                     }
> > > >               }
> > > > +
> > > > +             if (prog_attach_flags)
> > > > +                     for (i = 0; i < cnt; i++)
> > > > +                             if (copy_to_user(prog_attach_flags + i, &flags, sizeof(flags)))
> > > > +                                     return -EFAULT;
> > > > +
> > > > +             prog_ids += cnt;
> > > > +             total_cnt -= cnt;
> > > > +             if (prog_attach_flags)
> > > > +                     prog_attach_flags += cnt;
> > > >       }
> > > >       return ret;
> > > >  }
> > > > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > > > index 5ed2093e51cc..4137583c04a2 100644
> > > > --- a/kernel/bpf/syscall.c
> > > > +++ b/kernel/bpf/syscall.c
> > > > @@ -3520,7 +3520,7 @@ static int bpf_prog_detach(const union bpf_attr *attr)
> > > >       }
> > > >  }
> > > >
> > > > -#define BPF_PROG_QUERY_LAST_FIELD query.prog_cnt
> > > > +#define BPF_PROG_QUERY_LAST_FIELD query.prog_attach_flags
> > > >
> > > >  static int bpf_prog_query(const union bpf_attr *attr,
> > > >                         union bpf_attr __user *uattr)
> > > > @@ -3556,6 +3556,7 @@ static int bpf_prog_query(const union bpf_attr *attr,
> > > >       case BPF_CGROUP_SYSCTL:
> > > >       case BPF_CGROUP_GETSOCKOPT:
> > > >       case BPF_CGROUP_SETSOCKOPT:
> > > > +     case BPF_LSM_CGROUP:
> > > >               return cgroup_bpf_prog_query(attr, uattr);
> > > >       case BPF_LIRC_MODE2:
> > > >               return lirc_prog_query(attr, uattr);
> > > > @@ -4066,6 +4067,7 @@ static int bpf_prog_get_info_by_fd(struct file *file,
> > > >
> > > >       if (prog->aux->btf)
> > > >               info.btf_id = btf_obj_id(prog->aux->btf);
> > > > +     info.attach_btf_func_id = prog->aux->attach_btf_id;
> > > Note that exposing prog->aux->attach_btf_id only may not be enough
> > > unless it can assume info.attach_btf_id is always referring to btf_vmlinux
> > > for all bpf prog types.
> >
> > We also export btf_id two lines above, right? Btw, I left a comment in
> > the bpftool about those btf_ids; I'm not sure how to resolve them and
> > always assume vmlinux for now.
> yeah, that btf_id above is the cgroup-lsm prog's btf_id, which has its
> func info, line info, etc.  It is not the one the attach_btf_id corresponds
> to.  attach_btf_id refers to either aux->attach_btf or aux->dst_prog's btf (or
> target btf id here).
>
> It needs a consensus on where this attach_btf_id, target btf id, and
> prog_attach_flags should be.  If I read the patch 7 thread correctly,
> I think Andrii is suggesting to expose them to userspace through the link, so
> potentially putting them in bpf_link_info.  The bpf_prog_query will
> output a list of link ids.  The same probably applies to

Yep and I think it makes sense because link is representing one
specific attachment (and I presume flags can be stored inside the link
itself as well, right?).

But if legacy non-link BPF_PROG_ATTACH is supported then using
bpf_link_info won't cover legacy prog-only attachments.

> the BPF_F_QUERY_EFFECTIVE query_flags, but I'm not sure about the prog_attach_flags
> in this case; probably the userspace can figure that out by using
> the cgroup_id in the link?  That is all I can think of right now
> and don't have a better idea :)

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-05-24 23:45         ` Andrii Nakryiko
@ 2022-05-25  4:03           ` Stanislav Fomichev
  2022-05-25  4:39             ` Andrii Nakryiko
  0 siblings, 1 reply; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-25  4:03 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Martin KaFai Lau, Networking, bpf, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko

On Tue, May 24, 2022 at 4:45 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, May 24, 2022 at 10:50 AM Martin KaFai Lau <kafai@fb.com> wrote:
> >
> > On Tue, May 24, 2022 at 08:55:04AM -0700, Stanislav Fomichev wrote:
> > > On Mon, May 23, 2022 at 8:49 PM Martin KaFai Lau <kafai@fb.com> wrote:
> > > >
> > > > On Wed, May 18, 2022 at 03:55:25PM -0700, Stanislav Fomichev wrote:
> > > > > We have two options:
> > > > > 1. Treat all BPF_LSM_CGROUP the same, regardless of attach_btf_id
> > > > > 2. Treat BPF_LSM_CGROUP+attach_btf_id as a separate hook point
> > > > >
> > > > > I was doing (2) in the original patch, but switching to (1) here:
> > > > >
> > > > > * bpf_prog_query returns all attached BPF_LSM_CGROUP programs
> > > > > regardless of attach_btf_id
> > > > > * attach_btf_id is exported via bpf_prog_info
> > > > >
> > > > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > > > ---
> > > > >  include/uapi/linux/bpf.h |   5 ++
> > > > >  kernel/bpf/cgroup.c      | 103 +++++++++++++++++++++++++++------------
> > > > >  kernel/bpf/syscall.c     |   4 +-
> > > > >  3 files changed, 81 insertions(+), 31 deletions(-)
> > > > >
> > > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > > > index b9d2d6de63a7..432fc5f49567 100644
> > > > > --- a/include/uapi/linux/bpf.h
> > > > > +++ b/include/uapi/linux/bpf.h
> > > > > @@ -1432,6 +1432,7 @@ union bpf_attr {
> > > > >               __u32           attach_flags;
> > > > >               __aligned_u64   prog_ids;
> > > > >               __u32           prog_cnt;
> > > > > +             __aligned_u64   prog_attach_flags; /* output: per-program attach_flags */
> > > > >       } query;
> > > > >
> > > > >       struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
> > > > > @@ -5911,6 +5912,10 @@ struct bpf_prog_info {
> > > > >       __u64 run_cnt;
> > > > >       __u64 recursion_misses;
> > > > >       __u32 verified_insns;
> > > > > +     /* BTF ID of the function to attach to within BTF object identified
> > > > > +      * by btf_id.
> > > > > +      */
> > > > > +     __u32 attach_btf_func_id;
> > > > >  } __attribute__((aligned(8)));
> > > > >
> > > > >  struct bpf_map_info {
> > > > > diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> > > > > index a959cdd22870..08a1015ee09e 100644
> > > > > --- a/kernel/bpf/cgroup.c
> > > > > +++ b/kernel/bpf/cgroup.c
> > > > > @@ -1074,6 +1074,7 @@ static int cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
> > > > >  static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
> > > > >                             union bpf_attr __user *uattr)
> > > > >  {
> > > > > +     __u32 __user *prog_attach_flags = u64_to_user_ptr(attr->query.prog_attach_flags);
> > > > >       __u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids);
> > > > >       enum bpf_attach_type type = attr->query.attach_type;
> > > > >       enum cgroup_bpf_attach_type atype;
> > > > > @@ -1081,50 +1082,92 @@ static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
> > > > >       struct hlist_head *progs;
> > > > >       struct bpf_prog *prog;
> > > > >       int cnt, ret = 0, i;
> > > > > +     int total_cnt = 0;
> > > > >       u32 flags;
> > > > >
> > > > > -     atype = to_cgroup_bpf_attach_type(type);
> > > > > -     if (atype < 0)
> > > > > -             return -EINVAL;
> > > > > +     enum cgroup_bpf_attach_type from_atype, to_atype;
> > > > >
> > > > > -     progs = &cgrp->bpf.progs[atype];
> > > > > -     flags = cgrp->bpf.flags[atype];
> > > > > +     if (type == BPF_LSM_CGROUP) {
> > > > > +             from_atype = CGROUP_LSM_START;
> > > > > +             to_atype = CGROUP_LSM_END;
> > > > > +     } else {
> > > > > +             from_atype = to_cgroup_bpf_attach_type(type);
> > > > > +             if (from_atype < 0)
> > > > > +                     return -EINVAL;
> > > > > +             to_atype = from_atype;
> > > > > +     }
> > > > >
> > > > > -     effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > > > -                                           lockdep_is_held(&cgroup_mutex));
> > > > > +     for (atype = from_atype; atype <= to_atype; atype++) {
> > > > > +             progs = &cgrp->bpf.progs[atype];
> > > > > +             flags = cgrp->bpf.flags[atype];
> > > > >
> > > > > -     if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > > > > -             cnt = bpf_prog_array_length(effective);
> > > > > -     else
> > > > > -             cnt = prog_list_length(progs);
> > > > > +             effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > > > +                                                   lockdep_is_held(&cgroup_mutex));
> > > > >
> > > > > -     if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
> > > > > -             return -EFAULT;
> > > > > -     if (copy_to_user(&uattr->query.prog_cnt, &cnt, sizeof(cnt)))
> > > > > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > > > > +                     total_cnt += bpf_prog_array_length(effective);
> > > > > +             else
> > > > > +                     total_cnt += prog_list_length(progs);
> > > > > +     }
> > > > > +
> > > > > +     if (type != BPF_LSM_CGROUP)
> > > > > +             if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
> > > > > +                     return -EFAULT;
> > > > > +     if (copy_to_user(&uattr->query.prog_cnt, &total_cnt, sizeof(total_cnt)))
> > > > >               return -EFAULT;
> > > > > -     if (attr->query.prog_cnt == 0 || !prog_ids || !cnt)
> > > > > +     if (attr->query.prog_cnt == 0 || !prog_ids || !total_cnt)
> > > > >               /* return early if user requested only program count + flags */
> > > > >               return 0;
> > > > > -     if (attr->query.prog_cnt < cnt) {
> > > > > -             cnt = attr->query.prog_cnt;
> > > > > +
> > > > > +     if (attr->query.prog_cnt < total_cnt) {
> > > > > +             total_cnt = attr->query.prog_cnt;
> > > > >               ret = -ENOSPC;
> > > > >       }
> > > > >
> > > > > -     if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
> > > > > -             return bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> > > > > -     } else {
> > > > > -             struct bpf_prog_list *pl;
> > > > > -             u32 id;
> > > > > +     for (atype = from_atype; atype <= to_atype; atype++) {
> > > > > +             if (total_cnt <= 0)
> > > > > +                     break;
> > > > >
> > > > > -             i = 0;
> > > > > -             hlist_for_each_entry(pl, progs, node) {
> > > > > -                     prog = prog_list_prog(pl);
> > > > > -                     id = prog->aux->id;
> > > > > -                     if (copy_to_user(prog_ids + i, &id, sizeof(id)))
> > > > > -                             return -EFAULT;
> > > > > -                     if (++i == cnt)
> > > > > -                             break;
> > > > > +             progs = &cgrp->bpf.progs[atype];
> > > > > +             flags = cgrp->bpf.flags[atype];
> > > > > +
> > > > > +             effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > > > +                                                   lockdep_is_held(&cgroup_mutex));
> > > > > +
> > > > > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > > > > +                     cnt = bpf_prog_array_length(effective);
> > > > > +             else
> > > > > +                     cnt = prog_list_length(progs);
> > > > > +
> > > > > +             if (cnt >= total_cnt)
> > > > > +                     cnt = total_cnt;
> > > > > +
> > > > > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
> > > > > +                     ret = bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> > > > > +             } else {
> > > > > +                     struct bpf_prog_list *pl;
> > > > > +                     u32 id;
> > > > > +
> > > > > +                     i = 0;
> > > > > +                     hlist_for_each_entry(pl, progs, node) {
> > > > > +                             prog = prog_list_prog(pl);
> > > > > +                             id = prog->aux->id;
> > > > > +                             if (copy_to_user(prog_ids + i, &id, sizeof(id)))
> > > > > +                                     return -EFAULT;
> > > > > +                             if (++i == cnt)
> > > > > +                                     break;
> > > > > +                     }
> > > > >               }
> > > > > +
> > > > > +             if (prog_attach_flags)
> > > > > +                     for (i = 0; i < cnt; i++)
> > > > > +                             if (copy_to_user(prog_attach_flags + i, &flags, sizeof(flags)))
> > > > > +                                     return -EFAULT;
> > > > > +
> > > > > +             prog_ids += cnt;
> > > > > +             total_cnt -= cnt;
> > > > > +             if (prog_attach_flags)
> > > > > +                     prog_attach_flags += cnt;
> > > > >       }
> > > > >       return ret;
> > > > >  }
> > > > > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > > > > index 5ed2093e51cc..4137583c04a2 100644
> > > > > --- a/kernel/bpf/syscall.c
> > > > > +++ b/kernel/bpf/syscall.c
> > > > > @@ -3520,7 +3520,7 @@ static int bpf_prog_detach(const union bpf_attr *attr)
> > > > >       }
> > > > >  }
> > > > >
> > > > > -#define BPF_PROG_QUERY_LAST_FIELD query.prog_cnt
> > > > > +#define BPF_PROG_QUERY_LAST_FIELD query.prog_attach_flags
> > > > >
> > > > >  static int bpf_prog_query(const union bpf_attr *attr,
> > > > >                         union bpf_attr __user *uattr)
> > > > > @@ -3556,6 +3556,7 @@ static int bpf_prog_query(const union bpf_attr *attr,
> > > > >       case BPF_CGROUP_SYSCTL:
> > > > >       case BPF_CGROUP_GETSOCKOPT:
> > > > >       case BPF_CGROUP_SETSOCKOPT:
> > > > > +     case BPF_LSM_CGROUP:
> > > > >               return cgroup_bpf_prog_query(attr, uattr);
> > > > >       case BPF_LIRC_MODE2:
> > > > >               return lirc_prog_query(attr, uattr);
> > > > > @@ -4066,6 +4067,7 @@ static int bpf_prog_get_info_by_fd(struct file *file,
> > > > >
> > > > >       if (prog->aux->btf)
> > > > >               info.btf_id = btf_obj_id(prog->aux->btf);
> > > > > +     info.attach_btf_func_id = prog->aux->attach_btf_id;
> > > > Note that exposing prog->aux->attach_btf_id only may not be enough
> > > > unless it can assume info.attach_btf_id is always referring to btf_vmlinux
> > > > for all bpf prog types.
> > >
> > > We also export btf_id two lines above, right? Btw, I left a comment in
> > > the bpftool about those btf_ids; I'm not sure how to resolve them and
> > > always assume vmlinux for now.
> > yeah, that btf_id above is the cgroup-lsm prog's btf_id, which has its
> > func info, line info, etc.  It is not the one the attach_btf_id corresponds
> > to.  attach_btf_id refers to either aux->attach_btf or aux->dst_prog's btf (or
> > target btf id here).
> >
> > It needs a consensus on where this attach_btf_id, target btf id, and
> > prog_attach_flags should be.  If I read the patch 7 thread correctly,
> > I think Andrii is suggesting to expose them to userspace through the link, so
> > potentially putting them in bpf_link_info.  The bpf_prog_query will
> > output a list of link ids.  The same probably applies to
>
> Yep and I think it makes sense because link is representing one
> specific attachment (and I presume flags can be stored inside the link
> itself as well, right?).
>
> But if legacy non-link BPF_PROG_ATTACH is supported then using
> bpf_link_info won't cover legacy prog-only attachments.

I don't have any attachment to the legacy apis; I'm supporting them
only because it takes two lines of code. We can go link-only if there
is an agreement that it's inherently better.

How about I keep sys_bpf(BPF_PROG_QUERY) as is and do a loop in
userspace (for BPF_LSM_CGROUP only) over all links
(BPF_LINK_GET_NEXT_ID) and find the ones with matching prog
ids (BPF_LINK_GET_FD_BY_ID+BPF_OBJ_GET_INFO_BY_FD)?
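
Concretely, the userspace side of that loop could look something like this
(a sketch using libbpf's low-level wrappers; error handling is trimmed, and
reading the attach btf id out of bpf_link_info assumes such a field gets
added there as discussed):

#include <unistd.h>
#include <bpf/bpf.h>

/* Walk all link ids and return an fd to the link whose prog_id matches one
 * of the ids returned by BPF_PROG_QUERY; the caller can then read the rest
 * of bpf_link_info from that fd.
 */
static int find_link_fd_for_prog(__u32 prog_id)
{
	__u32 id = 0;
	int fd;

	while (!bpf_link_get_next_id(id, &id)) {
		struct bpf_link_info info = {};
		__u32 len = sizeof(info);

		fd = bpf_link_get_fd_by_id(id);
		if (fd < 0)
			continue;

		if (!bpf_obj_get_info_by_fd(fd, &info, &len) &&
		    info.prog_id == prog_id)
			return fd;

		close(fd);
	}

	return -1;
}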

That way we keep new fields in bpf_link_info, but we don't have to
extend sys_bpf(BPF_PROG_QUERY) because there doesn't seem to be a good
way to do it. Exporting links via new link_fds would mean we'd have to
support BPF_F_QUERY_EFFECTIVE, but getting an effective array of links
seems to be messy. If, in the future, we figure out a better way to
expose a list of attached/effective links per cgroup, we can
convert/optimize bpftool.

WDYT?

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-05-25  4:03           ` Stanislav Fomichev
@ 2022-05-25  4:39             ` Andrii Nakryiko
  2022-05-25 16:01               ` Stanislav Fomichev
  0 siblings, 1 reply; 54+ messages in thread
From: Andrii Nakryiko @ 2022-05-25  4:39 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Martin KaFai Lau, Networking, bpf, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko

On Tue, May 24, 2022 at 9:03 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> On Tue, May 24, 2022 at 4:45 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Tue, May 24, 2022 at 10:50 AM Martin KaFai Lau <kafai@fb.com> wrote:
> > >
> > > On Tue, May 24, 2022 at 08:55:04AM -0700, Stanislav Fomichev wrote:
> > > > On Mon, May 23, 2022 at 8:49 PM Martin KaFai Lau <kafai@fb.com> wrote:
> > > > >
> > > > > On Wed, May 18, 2022 at 03:55:25PM -0700, Stanislav Fomichev wrote:
> > > > > > We have two options:
> > > > > > 1. Treat all BPF_LSM_CGROUP the same, regardless of attach_btf_id
> > > > > > 2. Treat BPF_LSM_CGROUP+attach_btf_id as a separate hook point
> > > > > >
> > > > > > I was doing (2) in the original patch, but switching to (1) here:
> > > > > >
> > > > > > * bpf_prog_query returns all attached BPF_LSM_CGROUP programs
> > > > > > regardless of attach_btf_id
> > > > > > * attach_btf_id is exported via bpf_prog_info
> > > > > >
> > > > > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > > > > ---
> > > > > >  include/uapi/linux/bpf.h |   5 ++
> > > > > >  kernel/bpf/cgroup.c      | 103 +++++++++++++++++++++++++++------------
> > > > > >  kernel/bpf/syscall.c     |   4 +-
> > > > > >  3 files changed, 81 insertions(+), 31 deletions(-)
> > > > > >
> > > > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > > > > index b9d2d6de63a7..432fc5f49567 100644
> > > > > > --- a/include/uapi/linux/bpf.h
> > > > > > +++ b/include/uapi/linux/bpf.h
> > > > > > @@ -1432,6 +1432,7 @@ union bpf_attr {
> > > > > >               __u32           attach_flags;
> > > > > >               __aligned_u64   prog_ids;
> > > > > >               __u32           prog_cnt;
> > > > > > +             __aligned_u64   prog_attach_flags; /* output: per-program attach_flags */
> > > > > >       } query;
> > > > > >
> > > > > >       struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
> > > > > > @@ -5911,6 +5912,10 @@ struct bpf_prog_info {
> > > > > >       __u64 run_cnt;
> > > > > >       __u64 recursion_misses;
> > > > > >       __u32 verified_insns;
> > > > > > +     /* BTF ID of the function to attach to within BTF object identified
> > > > > > +      * by btf_id.
> > > > > > +      */
> > > > > > +     __u32 attach_btf_func_id;
> > > > > >  } __attribute__((aligned(8)));
> > > > > >
> > > > > >  struct bpf_map_info {
> > > > > > diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> > > > > > index a959cdd22870..08a1015ee09e 100644
> > > > > > --- a/kernel/bpf/cgroup.c
> > > > > > +++ b/kernel/bpf/cgroup.c
> > > > > > @@ -1074,6 +1074,7 @@ static int cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
> > > > > >  static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
> > > > > >                             union bpf_attr __user *uattr)
> > > > > >  {
> > > > > > +     __u32 __user *prog_attach_flags = u64_to_user_ptr(attr->query.prog_attach_flags);
> > > > > >       __u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids);
> > > > > >       enum bpf_attach_type type = attr->query.attach_type;
> > > > > >       enum cgroup_bpf_attach_type atype;
> > > > > > @@ -1081,50 +1082,92 @@ static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
> > > > > >       struct hlist_head *progs;
> > > > > >       struct bpf_prog *prog;
> > > > > >       int cnt, ret = 0, i;
> > > > > > +     int total_cnt = 0;
> > > > > >       u32 flags;
> > > > > >
> > > > > > -     atype = to_cgroup_bpf_attach_type(type);
> > > > > > -     if (atype < 0)
> > > > > > -             return -EINVAL;
> > > > > > +     enum cgroup_bpf_attach_type from_atype, to_atype;
> > > > > >
> > > > > > -     progs = &cgrp->bpf.progs[atype];
> > > > > > -     flags = cgrp->bpf.flags[atype];
> > > > > > +     if (type == BPF_LSM_CGROUP) {
> > > > > > +             from_atype = CGROUP_LSM_START;
> > > > > > +             to_atype = CGROUP_LSM_END;
> > > > > > +     } else {
> > > > > > +             from_atype = to_cgroup_bpf_attach_type(type);
> > > > > > +             if (from_atype < 0)
> > > > > > +                     return -EINVAL;
> > > > > > +             to_atype = from_atype;
> > > > > > +     }
> > > > > >
> > > > > > -     effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > > > > -                                           lockdep_is_held(&cgroup_mutex));
> > > > > > +     for (atype = from_atype; atype <= to_atype; atype++) {
> > > > > > +             progs = &cgrp->bpf.progs[atype];
> > > > > > +             flags = cgrp->bpf.flags[atype];
> > > > > >
> > > > > > -     if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > > > > > -             cnt = bpf_prog_array_length(effective);
> > > > > > -     else
> > > > > > -             cnt = prog_list_length(progs);
> > > > > > +             effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > > > > +                                                   lockdep_is_held(&cgroup_mutex));
> > > > > >
> > > > > > -     if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
> > > > > > -             return -EFAULT;
> > > > > > -     if (copy_to_user(&uattr->query.prog_cnt, &cnt, sizeof(cnt)))
> > > > > > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > > > > > +                     total_cnt += bpf_prog_array_length(effective);
> > > > > > +             else
> > > > > > +                     total_cnt += prog_list_length(progs);
> > > > > > +     }
> > > > > > +
> > > > > > +     if (type != BPF_LSM_CGROUP)
> > > > > > +             if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
> > > > > > +                     return -EFAULT;
> > > > > > +     if (copy_to_user(&uattr->query.prog_cnt, &total_cnt, sizeof(total_cnt)))
> > > > > >               return -EFAULT;
> > > > > > -     if (attr->query.prog_cnt == 0 || !prog_ids || !cnt)
> > > > > > +     if (attr->query.prog_cnt == 0 || !prog_ids || !total_cnt)
> > > > > >               /* return early if user requested only program count + flags */
> > > > > >               return 0;
> > > > > > -     if (attr->query.prog_cnt < cnt) {
> > > > > > -             cnt = attr->query.prog_cnt;
> > > > > > +
> > > > > > +     if (attr->query.prog_cnt < total_cnt) {
> > > > > > +             total_cnt = attr->query.prog_cnt;
> > > > > >               ret = -ENOSPC;
> > > > > >       }
> > > > > >
> > > > > > -     if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
> > > > > > -             return bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> > > > > > -     } else {
> > > > > > -             struct bpf_prog_list *pl;
> > > > > > -             u32 id;
> > > > > > +     for (atype = from_atype; atype <= to_atype; atype++) {
> > > > > > +             if (total_cnt <= 0)
> > > > > > +                     break;
> > > > > >
> > > > > > -             i = 0;
> > > > > > -             hlist_for_each_entry(pl, progs, node) {
> > > > > > -                     prog = prog_list_prog(pl);
> > > > > > -                     id = prog->aux->id;
> > > > > > -                     if (copy_to_user(prog_ids + i, &id, sizeof(id)))
> > > > > > -                             return -EFAULT;
> > > > > > -                     if (++i == cnt)
> > > > > > -                             break;
> > > > > > +             progs = &cgrp->bpf.progs[atype];
> > > > > > +             flags = cgrp->bpf.flags[atype];
> > > > > > +
> > > > > > +             effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > > > > +                                                   lockdep_is_held(&cgroup_mutex));
> > > > > > +
> > > > > > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > > > > > +                     cnt = bpf_prog_array_length(effective);
> > > > > > +             else
> > > > > > +                     cnt = prog_list_length(progs);
> > > > > > +
> > > > > > +             if (cnt >= total_cnt)
> > > > > > +                     cnt = total_cnt;
> > > > > > +
> > > > > > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
> > > > > > +                     ret = bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> > > > > > +             } else {
> > > > > > +                     struct bpf_prog_list *pl;
> > > > > > +                     u32 id;
> > > > > > +
> > > > > > +                     i = 0;
> > > > > > +                     hlist_for_each_entry(pl, progs, node) {
> > > > > > +                             prog = prog_list_prog(pl);
> > > > > > +                             id = prog->aux->id;
> > > > > > +                             if (copy_to_user(prog_ids + i, &id, sizeof(id)))
> > > > > > +                                     return -EFAULT;
> > > > > > +                             if (++i == cnt)
> > > > > > +                                     break;
> > > > > > +                     }
> > > > > >               }
> > > > > > +
> > > > > > +             if (prog_attach_flags)
> > > > > > +                     for (i = 0; i < cnt; i++)
> > > > > > +                             if (copy_to_user(prog_attach_flags + i, &flags, sizeof(flags)))
> > > > > > +                                     return -EFAULT;
> > > > > > +
> > > > > > +             prog_ids += cnt;
> > > > > > +             total_cnt -= cnt;
> > > > > > +             if (prog_attach_flags)
> > > > > > +                     prog_attach_flags += cnt;
> > > > > >       }
> > > > > >       return ret;
> > > > > >  }
> > > > > > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > > > > > index 5ed2093e51cc..4137583c04a2 100644
> > > > > > --- a/kernel/bpf/syscall.c
> > > > > > +++ b/kernel/bpf/syscall.c
> > > > > > @@ -3520,7 +3520,7 @@ static int bpf_prog_detach(const union bpf_attr *attr)
> > > > > >       }
> > > > > >  }
> > > > > >
> > > > > > -#define BPF_PROG_QUERY_LAST_FIELD query.prog_cnt
> > > > > > +#define BPF_PROG_QUERY_LAST_FIELD query.prog_attach_flags
> > > > > >
> > > > > >  static int bpf_prog_query(const union bpf_attr *attr,
> > > > > >                         union bpf_attr __user *uattr)
> > > > > > @@ -3556,6 +3556,7 @@ static int bpf_prog_query(const union bpf_attr *attr,
> > > > > >       case BPF_CGROUP_SYSCTL:
> > > > > >       case BPF_CGROUP_GETSOCKOPT:
> > > > > >       case BPF_CGROUP_SETSOCKOPT:
> > > > > > +     case BPF_LSM_CGROUP:
> > > > > >               return cgroup_bpf_prog_query(attr, uattr);
> > > > > >       case BPF_LIRC_MODE2:
> > > > > >               return lirc_prog_query(attr, uattr);
> > > > > > @@ -4066,6 +4067,7 @@ static int bpf_prog_get_info_by_fd(struct file *file,
> > > > > >
> > > > > >       if (prog->aux->btf)
> > > > > >               info.btf_id = btf_obj_id(prog->aux->btf);
> > > > > > +     info.attach_btf_func_id = prog->aux->attach_btf_id;
> > > > > Note that exposing prog->aux->attach_btf_id only may not be enough
> > > > > unless it can assume info.attach_btf_id is always referring to btf_vmlinux
> > > > > for all bpf prog types.
> > > >
> > > > We also export btf_id two lines above, right? Btw, I left a comment in
> > > > the bpftool about those btf_ids, I'm not sure how resolve them and
> > > > always assume vmlinux for now.
> > > yeah, that btf_id above is the cgroup-lsm prog's btf_id which has its
> > > func info, line info...etc.   It is not the one the attach_btf_id correspond
> > > to.  attach_btf_id refers to either aux->attach_btf or aux->dst_prog's btf (or
> > > target btf id here).
> > >
> > > It needs a consensus on where this attach_btf_id, target btf id, and
> > > prog_attach_flags should be.  If I read the patch 7 thread correctly,
> > > I think Andrii is suggesting to expose them to userspace through link, so
> > > potentially putting them in bpf_link_info.  The bpf_prog_query will
> > > output a list of link ids.  The same probably applies to
> >
> > Yep and I think it makes sense because link is representing one
> > specific attachment (and I presume flags can be stored inside the link
> > itself as well, right?).
> >
> > But if legacy non-link BPF_PROG_ATTACH is supported then using
> > bpf_link_info won't cover legacy prog-only attachments.
>
> I don't have any attachment to the legacy apis, I'm supporting them
> only because it takes two lines of code; we can go link-only if there
> is an agreement that it's inherently better.
>
> How about I keep sys_bpf(BPF_PROG_QUERY) as is and I do a loop in the
> userspace (for BPF_LSM_CGROUP only) over all links
> (BPF_LINK_GET_NEXT_ID) and will find the the ones with matching prog
> ids (BPF_LINK_GET_FD_BY_ID+BPF_OBJ_GET_INFO_BY_FD)?
>
> That way we keep new fields in bpf_link_info, but we don't have to
> extend sys_bpf(BPF_PROG_QUERY) because there doesn't seem to be a good
> way to do it. Exporting links via new link_fds would mean we'd have to
> support BPF_F_QUERY_EFFECTIVE, but getting an effective array of links
> seems to be messy. If, in the future, we figure out a better way to
> expose a list of attached/effective links per cgroup, we can
> convert/optimize bpftool.

Why not use an iter/bpf_link program (see progs/bpf_iter_bpf_link.c for
an example) instead? Once you have struct bpf_link and you know it's a
cgroup link, you can cast it to struct bpf_cgroup_link and get access
to the prog and the cgroup. From the cgroup you can even reach cgroup_bpf
and the effective array. Basically, whatever the kernel has access to,
you can get at from bpftool without extending any UAPIs.
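
A very rough sketch of what that could look like (not tested; struct
layouts come from vmlinux BTF, and whether the verifier accepts the
cast below is exactly the kind of open question raised in the reply):

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_core_read.h>
#include <bpf/bpf_tracing.h>

char _license[] SEC("license") = "GPL";

SEC("iter/bpf_link")
int dump_cgroup_links(struct bpf_iter__bpf_link *ctx)
{
	struct seq_file *seq = ctx->meta->seq;
	struct bpf_link *link = ctx->link;
	struct bpf_cgroup_link *cg_link;
	__u32 link_id, prog_id, attach_type;

	if (!link || BPF_CORE_READ(link, type) != BPF_LINK_TYPE_CGROUP)
		return 0;

	/* struct bpf_link is the first member of struct bpf_cgroup_link,
	 * so a plain cast stands in for container_of() here
	 */
	cg_link = (struct bpf_cgroup_link *)link;

	link_id = BPF_CORE_READ(link, id);
	prog_id = BPF_CORE_READ(link, prog, aux, id);
	attach_type = BPF_CORE_READ(cg_link, type);

	BPF_SEQ_PRINTF(seq, "link %u prog %u attach_type %u\n",
		       link_id, prog_id, attach_type);
	return 0;
}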

>
> WDYT?

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-05-25  4:39             ` Andrii Nakryiko
@ 2022-05-25 16:01               ` Stanislav Fomichev
  2022-05-25 17:02                 ` Stanislav Fomichev
  0 siblings, 1 reply; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-25 16:01 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Martin KaFai Lau, Networking, bpf, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko

On Tue, May 24, 2022 at 9:39 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, May 24, 2022 at 9:03 PM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > On Tue, May 24, 2022 at 4:45 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Tue, May 24, 2022 at 10:50 AM Martin KaFai Lau <kafai@fb.com> wrote:
> > > >
> > > > On Tue, May 24, 2022 at 08:55:04AM -0700, Stanislav Fomichev wrote:
> > > > > On Mon, May 23, 2022 at 8:49 PM Martin KaFai Lau <kafai@fb.com> wrote:
> > > > > >
> > > > > > On Wed, May 18, 2022 at 03:55:25PM -0700, Stanislav Fomichev wrote:
> > > > > > > We have two options:
> > > > > > > 1. Treat all BPF_LSM_CGROUP the same, regardless of attach_btf_id
> > > > > > > 2. Treat BPF_LSM_CGROUP+attach_btf_id as a separate hook point
> > > > > > >
> > > > > > > I was doing (2) in the original patch, but switching to (1) here:
> > > > > > >
> > > > > > > * bpf_prog_query returns all attached BPF_LSM_CGROUP programs
> > > > > > > regardless of attach_btf_id
> > > > > > > * attach_btf_id is exported via bpf_prog_info
> > > > > > >
> > > > > > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > > > > > ---
> > > > > > >  include/uapi/linux/bpf.h |   5 ++
> > > > > > >  kernel/bpf/cgroup.c      | 103 +++++++++++++++++++++++++++------------
> > > > > > >  kernel/bpf/syscall.c     |   4 +-
> > > > > > >  3 files changed, 81 insertions(+), 31 deletions(-)
> > > > > > >
> > > > > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > > > > > index b9d2d6de63a7..432fc5f49567 100644
> > > > > > > --- a/include/uapi/linux/bpf.h
> > > > > > > +++ b/include/uapi/linux/bpf.h
> > > > > > > @@ -1432,6 +1432,7 @@ union bpf_attr {
> > > > > > >               __u32           attach_flags;
> > > > > > >               __aligned_u64   prog_ids;
> > > > > > >               __u32           prog_cnt;
> > > > > > > +             __aligned_u64   prog_attach_flags; /* output: per-program attach_flags */
> > > > > > >       } query;
> > > > > > >
> > > > > > >       struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
> > > > > > > @@ -5911,6 +5912,10 @@ struct bpf_prog_info {
> > > > > > >       __u64 run_cnt;
> > > > > > >       __u64 recursion_misses;
> > > > > > >       __u32 verified_insns;
> > > > > > > +     /* BTF ID of the function to attach to within BTF object identified
> > > > > > > +      * by btf_id.
> > > > > > > +      */
> > > > > > > +     __u32 attach_btf_func_id;
> > > > > > >  } __attribute__((aligned(8)));
> > > > > > >
> > > > > > >  struct bpf_map_info {
> > > > > > > diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> > > > > > > index a959cdd22870..08a1015ee09e 100644
> > > > > > > --- a/kernel/bpf/cgroup.c
> > > > > > > +++ b/kernel/bpf/cgroup.c
> > > > > > > @@ -1074,6 +1074,7 @@ static int cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
> > > > > > >  static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
> > > > > > >                             union bpf_attr __user *uattr)
> > > > > > >  {
> > > > > > > +     __u32 __user *prog_attach_flags = u64_to_user_ptr(attr->query.prog_attach_flags);
> > > > > > >       __u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids);
> > > > > > >       enum bpf_attach_type type = attr->query.attach_type;
> > > > > > >       enum cgroup_bpf_attach_type atype;
> > > > > > > @@ -1081,50 +1082,92 @@ static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
> > > > > > >       struct hlist_head *progs;
> > > > > > >       struct bpf_prog *prog;
> > > > > > >       int cnt, ret = 0, i;
> > > > > > > +     int total_cnt = 0;
> > > > > > >       u32 flags;
> > > > > > >
> > > > > > > -     atype = to_cgroup_bpf_attach_type(type);
> > > > > > > -     if (atype < 0)
> > > > > > > -             return -EINVAL;
> > > > > > > +     enum cgroup_bpf_attach_type from_atype, to_atype;
> > > > > > >
> > > > > > > -     progs = &cgrp->bpf.progs[atype];
> > > > > > > -     flags = cgrp->bpf.flags[atype];
> > > > > > > +     if (type == BPF_LSM_CGROUP) {
> > > > > > > +             from_atype = CGROUP_LSM_START;
> > > > > > > +             to_atype = CGROUP_LSM_END;
> > > > > > > +     } else {
> > > > > > > +             from_atype = to_cgroup_bpf_attach_type(type);
> > > > > > > +             if (from_atype < 0)
> > > > > > > +                     return -EINVAL;
> > > > > > > +             to_atype = from_atype;
> > > > > > > +     }
> > > > > > >
> > > > > > > -     effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > > > > > -                                           lockdep_is_held(&cgroup_mutex));
> > > > > > > +     for (atype = from_atype; atype <= to_atype; atype++) {
> > > > > > > +             progs = &cgrp->bpf.progs[atype];
> > > > > > > +             flags = cgrp->bpf.flags[atype];
> > > > > > >
> > > > > > > -     if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > > > > > > -             cnt = bpf_prog_array_length(effective);
> > > > > > > -     else
> > > > > > > -             cnt = prog_list_length(progs);
> > > > > > > +             effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > > > > > +                                                   lockdep_is_held(&cgroup_mutex));
> > > > > > >
> > > > > > > -     if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
> > > > > > > -             return -EFAULT;
> > > > > > > -     if (copy_to_user(&uattr->query.prog_cnt, &cnt, sizeof(cnt)))
> > > > > > > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > > > > > > +                     total_cnt += bpf_prog_array_length(effective);
> > > > > > > +             else
> > > > > > > +                     total_cnt += prog_list_length(progs);
> > > > > > > +     }
> > > > > > > +
> > > > > > > +     if (type != BPF_LSM_CGROUP)
> > > > > > > +             if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
> > > > > > > +                     return -EFAULT;
> > > > > > > +     if (copy_to_user(&uattr->query.prog_cnt, &total_cnt, sizeof(total_cnt)))
> > > > > > >               return -EFAULT;
> > > > > > > -     if (attr->query.prog_cnt == 0 || !prog_ids || !cnt)
> > > > > > > +     if (attr->query.prog_cnt == 0 || !prog_ids || !total_cnt)
> > > > > > >               /* return early if user requested only program count + flags */
> > > > > > >               return 0;
> > > > > > > -     if (attr->query.prog_cnt < cnt) {
> > > > > > > -             cnt = attr->query.prog_cnt;
> > > > > > > +
> > > > > > > +     if (attr->query.prog_cnt < total_cnt) {
> > > > > > > +             total_cnt = attr->query.prog_cnt;
> > > > > > >               ret = -ENOSPC;
> > > > > > >       }
> > > > > > >
> > > > > > > -     if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
> > > > > > > -             return bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> > > > > > > -     } else {
> > > > > > > -             struct bpf_prog_list *pl;
> > > > > > > -             u32 id;
> > > > > > > +     for (atype = from_atype; atype <= to_atype; atype++) {
> > > > > > > +             if (total_cnt <= 0)
> > > > > > > +                     break;
> > > > > > >
> > > > > > > -             i = 0;
> > > > > > > -             hlist_for_each_entry(pl, progs, node) {
> > > > > > > -                     prog = prog_list_prog(pl);
> > > > > > > -                     id = prog->aux->id;
> > > > > > > -                     if (copy_to_user(prog_ids + i, &id, sizeof(id)))
> > > > > > > -                             return -EFAULT;
> > > > > > > -                     if (++i == cnt)
> > > > > > > -                             break;
> > > > > > > +             progs = &cgrp->bpf.progs[atype];
> > > > > > > +             flags = cgrp->bpf.flags[atype];
> > > > > > > +
> > > > > > > +             effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > > > > > +                                                   lockdep_is_held(&cgroup_mutex));
> > > > > > > +
> > > > > > > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > > > > > > +                     cnt = bpf_prog_array_length(effective);
> > > > > > > +             else
> > > > > > > +                     cnt = prog_list_length(progs);
> > > > > > > +
> > > > > > > +             if (cnt >= total_cnt)
> > > > > > > +                     cnt = total_cnt;
> > > > > > > +
> > > > > > > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
> > > > > > > +                     ret = bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> > > > > > > +             } else {
> > > > > > > +                     struct bpf_prog_list *pl;
> > > > > > > +                     u32 id;
> > > > > > > +
> > > > > > > +                     i = 0;
> > > > > > > +                     hlist_for_each_entry(pl, progs, node) {
> > > > > > > +                             prog = prog_list_prog(pl);
> > > > > > > +                             id = prog->aux->id;
> > > > > > > +                             if (copy_to_user(prog_ids + i, &id, sizeof(id)))
> > > > > > > +                                     return -EFAULT;
> > > > > > > +                             if (++i == cnt)
> > > > > > > +                                     break;
> > > > > > > +                     }
> > > > > > >               }
> > > > > > > +
> > > > > > > +             if (prog_attach_flags)
> > > > > > > +                     for (i = 0; i < cnt; i++)
> > > > > > > +                             if (copy_to_user(prog_attach_flags + i, &flags, sizeof(flags)))
> > > > > > > +                                     return -EFAULT;
> > > > > > > +
> > > > > > > +             prog_ids += cnt;
> > > > > > > +             total_cnt -= cnt;
> > > > > > > +             if (prog_attach_flags)
> > > > > > > +                     prog_attach_flags += cnt;
> > > > > > >       }
> > > > > > >       return ret;
> > > > > > >  }
> > > > > > > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > > > > > > index 5ed2093e51cc..4137583c04a2 100644
> > > > > > > --- a/kernel/bpf/syscall.c
> > > > > > > +++ b/kernel/bpf/syscall.c
> > > > > > > @@ -3520,7 +3520,7 @@ static int bpf_prog_detach(const union bpf_attr *attr)
> > > > > > >       }
> > > > > > >  }
> > > > > > >
> > > > > > > -#define BPF_PROG_QUERY_LAST_FIELD query.prog_cnt
> > > > > > > +#define BPF_PROG_QUERY_LAST_FIELD query.prog_attach_flags
> > > > > > >
> > > > > > >  static int bpf_prog_query(const union bpf_attr *attr,
> > > > > > >                         union bpf_attr __user *uattr)
> > > > > > > @@ -3556,6 +3556,7 @@ static int bpf_prog_query(const union bpf_attr *attr,
> > > > > > >       case BPF_CGROUP_SYSCTL:
> > > > > > >       case BPF_CGROUP_GETSOCKOPT:
> > > > > > >       case BPF_CGROUP_SETSOCKOPT:
> > > > > > > +     case BPF_LSM_CGROUP:
> > > > > > >               return cgroup_bpf_prog_query(attr, uattr);
> > > > > > >       case BPF_LIRC_MODE2:
> > > > > > >               return lirc_prog_query(attr, uattr);
> > > > > > > @@ -4066,6 +4067,7 @@ static int bpf_prog_get_info_by_fd(struct file *file,
> > > > > > >
> > > > > > >       if (prog->aux->btf)
> > > > > > >               info.btf_id = btf_obj_id(prog->aux->btf);
> > > > > > > +     info.attach_btf_func_id = prog->aux->attach_btf_id;
> > > > > > Note that exposing prog->aux->attach_btf_id only may not be enough
> > > > > > unless it can assume info.attach_btf_id is always referring to btf_vmlinux
> > > > > > for all bpf prog types.
> > > > >
> > > > > We also export btf_id two lines above, right? Btw, I left a comment in
> > > > > the bpftool about those btf_ids, I'm not sure how resolve them and
> > > > > always assume vmlinux for now.
> > > > yeah, that btf_id above is the cgroup-lsm prog's btf_id which has its
> > > > func info, line info...etc.   It is not the one the attach_btf_id correspond
> > > > to.  attach_btf_id refers to either aux->attach_btf or aux->dst_prog's btf (or
> > > > target btf id here).
> > > >
> > > > It needs a consensus on where this attach_btf_id, target btf id, and
> > > > prog_attach_flags should be.  If I read the patch 7 thread correctly,
> > > > I think Andrii is suggesting to expose them to userspace through link, so
> > > > potentially putting them in bpf_link_info.  The bpf_prog_query will
> > > > output a list of link ids.  The same probably applies to
> > >
> > > Yep and I think it makes sense because link is representing one
> > > specific attachment (and I presume flags can be stored inside the link
> > > itself as well, right?).
> > >
> > > But if legacy non-link BPF_PROG_ATTACH is supported then using
> > > bpf_link_info won't cover legacy prog-only attachments.
> >
> > I don't have any attachment to the legacy apis, I'm supporting them
> > only because it takes two lines of code; we can go link-only if there
> > is an agreement that it's inherently better.
> >
> > How about I keep sys_bpf(BPF_PROG_QUERY) as is and I do a loop in the
> > userspace (for BPF_LSM_CGROUP only) over all links
> > (BPF_LINK_GET_NEXT_ID) and will find the the ones with matching prog
> > ids (BPF_LINK_GET_FD_BY_ID+BPF_OBJ_GET_INFO_BY_FD)?
> >
> > That way we keep new fields in bpf_link_info, but we don't have to
> > extend sys_bpf(BPF_PROG_QUERY) because there doesn't seem to be a good
> > way to do it. Exporting links via new link_fds would mean we'd have to
> > support BPF_F_QUERY_EFFECTIVE, but getting an effective array of links
> > seems to be messy. If, in the future, we figure out a better way to
> > expose a list of attached/effective links per cgroup, we can
> > convert/optimize bpftool.
>
> Why not use iter/bpf_link program (see progs/bpf_iter_bpf_link.c for
> an example) instead? Once you have struct bpf_link and you know it's
> cgroup link, you can cast it to struct bpf_cgroup_link and get access
> to prog and cgroup. From cgroup to cgroup_bpf you can even get access
> to effective array. Basically whatever kernel has access to you can
> have access to from bpftool without extending any UAPIs.

Seems a bit too involved just to read back the fields? I might as well
use drgn? I'm also not sure about the implementation: will I be able
to upcast bpf_link to bpf_cgroup_link in the bpf prog? And getting
attach_type might be problematic from the iterator program as well: I'd
need to call the kernel's bpf_lsm_attach_type_get to find the atype for
an attach_btf_id, so I'd have to export it as a kfunc?

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-05-25 16:01               ` Stanislav Fomichev
@ 2022-05-25 17:02                 ` Stanislav Fomichev
  2022-05-25 20:39                   ` Martin KaFai Lau
  0 siblings, 1 reply; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-25 17:02 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Martin KaFai Lau, Networking, bpf, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko

On Wed, May 25, 2022 at 9:01 AM Stanislav Fomichev <sdf@google.com> wrote:
>
> On Tue, May 24, 2022 at 9:39 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Tue, May 24, 2022 at 9:03 PM Stanislav Fomichev <sdf@google.com> wrote:
> > >
> > > On Tue, May 24, 2022 at 4:45 PM Andrii Nakryiko
> > > <andrii.nakryiko@gmail.com> wrote:
> > > >
> > > > On Tue, May 24, 2022 at 10:50 AM Martin KaFai Lau <kafai@fb.com> wrote:
> > > > >
> > > > > On Tue, May 24, 2022 at 08:55:04AM -0700, Stanislav Fomichev wrote:
> > > > > > On Mon, May 23, 2022 at 8:49 PM Martin KaFai Lau <kafai@fb.com> wrote:
> > > > > > >
> > > > > > > On Wed, May 18, 2022 at 03:55:25PM -0700, Stanislav Fomichev wrote:
> > > > > > > > We have two options:
> > > > > > > > 1. Treat all BPF_LSM_CGROUP the same, regardless of attach_btf_id
> > > > > > > > 2. Treat BPF_LSM_CGROUP+attach_btf_id as a separate hook point
> > > > > > > >
> > > > > > > > I was doing (2) in the original patch, but switching to (1) here:
> > > > > > > >
> > > > > > > > * bpf_prog_query returns all attached BPF_LSM_CGROUP programs
> > > > > > > > regardless of attach_btf_id
> > > > > > > > * attach_btf_id is exported via bpf_prog_info
> > > > > > > >
> > > > > > > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > > > > > > ---
> > > > > > > >  include/uapi/linux/bpf.h |   5 ++
> > > > > > > >  kernel/bpf/cgroup.c      | 103 +++++++++++++++++++++++++++------------
> > > > > > > >  kernel/bpf/syscall.c     |   4 +-
> > > > > > > >  3 files changed, 81 insertions(+), 31 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > > > > > > index b9d2d6de63a7..432fc5f49567 100644
> > > > > > > > --- a/include/uapi/linux/bpf.h
> > > > > > > > +++ b/include/uapi/linux/bpf.h
> > > > > > > > @@ -1432,6 +1432,7 @@ union bpf_attr {
> > > > > > > >               __u32           attach_flags;
> > > > > > > >               __aligned_u64   prog_ids;
> > > > > > > >               __u32           prog_cnt;
> > > > > > > > +             __aligned_u64   prog_attach_flags; /* output: per-program attach_flags */
> > > > > > > >       } query;
> > > > > > > >
> > > > > > > >       struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
> > > > > > > > @@ -5911,6 +5912,10 @@ struct bpf_prog_info {
> > > > > > > >       __u64 run_cnt;
> > > > > > > >       __u64 recursion_misses;
> > > > > > > >       __u32 verified_insns;
> > > > > > > > +     /* BTF ID of the function to attach to within BTF object identified
> > > > > > > > +      * by btf_id.
> > > > > > > > +      */
> > > > > > > > +     __u32 attach_btf_func_id;
> > > > > > > >  } __attribute__((aligned(8)));
> > > > > > > >
> > > > > > > >  struct bpf_map_info {
> > > > > > > > diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> > > > > > > > index a959cdd22870..08a1015ee09e 100644
> > > > > > > > --- a/kernel/bpf/cgroup.c
> > > > > > > > +++ b/kernel/bpf/cgroup.c
> > > > > > > > @@ -1074,6 +1074,7 @@ static int cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
> > > > > > > >  static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
> > > > > > > >                             union bpf_attr __user *uattr)
> > > > > > > >  {
> > > > > > > > +     __u32 __user *prog_attach_flags = u64_to_user_ptr(attr->query.prog_attach_flags);
> > > > > > > >       __u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids);
> > > > > > > >       enum bpf_attach_type type = attr->query.attach_type;
> > > > > > > >       enum cgroup_bpf_attach_type atype;
> > > > > > > > @@ -1081,50 +1082,92 @@ static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
> > > > > > > >       struct hlist_head *progs;
> > > > > > > >       struct bpf_prog *prog;
> > > > > > > >       int cnt, ret = 0, i;
> > > > > > > > +     int total_cnt = 0;
> > > > > > > >       u32 flags;
> > > > > > > >
> > > > > > > > -     atype = to_cgroup_bpf_attach_type(type);
> > > > > > > > -     if (atype < 0)
> > > > > > > > -             return -EINVAL;
> > > > > > > > +     enum cgroup_bpf_attach_type from_atype, to_atype;
> > > > > > > >
> > > > > > > > -     progs = &cgrp->bpf.progs[atype];
> > > > > > > > -     flags = cgrp->bpf.flags[atype];
> > > > > > > > +     if (type == BPF_LSM_CGROUP) {
> > > > > > > > +             from_atype = CGROUP_LSM_START;
> > > > > > > > +             to_atype = CGROUP_LSM_END;
> > > > > > > > +     } else {
> > > > > > > > +             from_atype = to_cgroup_bpf_attach_type(type);
> > > > > > > > +             if (from_atype < 0)
> > > > > > > > +                     return -EINVAL;
> > > > > > > > +             to_atype = from_atype;
> > > > > > > > +     }
> > > > > > > >
> > > > > > > > -     effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > > > > > > -                                           lockdep_is_held(&cgroup_mutex));
> > > > > > > > +     for (atype = from_atype; atype <= to_atype; atype++) {
> > > > > > > > +             progs = &cgrp->bpf.progs[atype];
> > > > > > > > +             flags = cgrp->bpf.flags[atype];
> > > > > > > >
> > > > > > > > -     if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > > > > > > > -             cnt = bpf_prog_array_length(effective);
> > > > > > > > -     else
> > > > > > > > -             cnt = prog_list_length(progs);
> > > > > > > > +             effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > > > > > > +                                                   lockdep_is_held(&cgroup_mutex));
> > > > > > > >
> > > > > > > > -     if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
> > > > > > > > -             return -EFAULT;
> > > > > > > > -     if (copy_to_user(&uattr->query.prog_cnt, &cnt, sizeof(cnt)))
> > > > > > > > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > > > > > > > +                     total_cnt += bpf_prog_array_length(effective);
> > > > > > > > +             else
> > > > > > > > +                     total_cnt += prog_list_length(progs);
> > > > > > > > +     }
> > > > > > > > +
> > > > > > > > +     if (type != BPF_LSM_CGROUP)
> > > > > > > > +             if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
> > > > > > > > +                     return -EFAULT;
> > > > > > > > +     if (copy_to_user(&uattr->query.prog_cnt, &total_cnt, sizeof(total_cnt)))
> > > > > > > >               return -EFAULT;
> > > > > > > > -     if (attr->query.prog_cnt == 0 || !prog_ids || !cnt)
> > > > > > > > +     if (attr->query.prog_cnt == 0 || !prog_ids || !total_cnt)
> > > > > > > >               /* return early if user requested only program count + flags */
> > > > > > > >               return 0;
> > > > > > > > -     if (attr->query.prog_cnt < cnt) {
> > > > > > > > -             cnt = attr->query.prog_cnt;
> > > > > > > > +
> > > > > > > > +     if (attr->query.prog_cnt < total_cnt) {
> > > > > > > > +             total_cnt = attr->query.prog_cnt;
> > > > > > > >               ret = -ENOSPC;
> > > > > > > >       }
> > > > > > > >
> > > > > > > > -     if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
> > > > > > > > -             return bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> > > > > > > > -     } else {
> > > > > > > > -             struct bpf_prog_list *pl;
> > > > > > > > -             u32 id;
> > > > > > > > +     for (atype = from_atype; atype <= to_atype; atype++) {
> > > > > > > > +             if (total_cnt <= 0)
> > > > > > > > +                     break;
> > > > > > > >
> > > > > > > > -             i = 0;
> > > > > > > > -             hlist_for_each_entry(pl, progs, node) {
> > > > > > > > -                     prog = prog_list_prog(pl);
> > > > > > > > -                     id = prog->aux->id;
> > > > > > > > -                     if (copy_to_user(prog_ids + i, &id, sizeof(id)))
> > > > > > > > -                             return -EFAULT;
> > > > > > > > -                     if (++i == cnt)
> > > > > > > > -                             break;
> > > > > > > > +             progs = &cgrp->bpf.progs[atype];
> > > > > > > > +             flags = cgrp->bpf.flags[atype];
> > > > > > > > +
> > > > > > > > +             effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > > > > > > +                                                   lockdep_is_held(&cgroup_mutex));
> > > > > > > > +
> > > > > > > > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > > > > > > > +                     cnt = bpf_prog_array_length(effective);
> > > > > > > > +             else
> > > > > > > > +                     cnt = prog_list_length(progs);
> > > > > > > > +
> > > > > > > > +             if (cnt >= total_cnt)
> > > > > > > > +                     cnt = total_cnt;
> > > > > > > > +
> > > > > > > > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
> > > > > > > > +                     ret = bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> > > > > > > > +             } else {
> > > > > > > > +                     struct bpf_prog_list *pl;
> > > > > > > > +                     u32 id;
> > > > > > > > +
> > > > > > > > +                     i = 0;
> > > > > > > > +                     hlist_for_each_entry(pl, progs, node) {
> > > > > > > > +                             prog = prog_list_prog(pl);
> > > > > > > > +                             id = prog->aux->id;
> > > > > > > > +                             if (copy_to_user(prog_ids + i, &id, sizeof(id)))
> > > > > > > > +                                     return -EFAULT;
> > > > > > > > +                             if (++i == cnt)
> > > > > > > > +                                     break;
> > > > > > > > +                     }
> > > > > > > >               }
> > > > > > > > +
> > > > > > > > +             if (prog_attach_flags)
> > > > > > > > +                     for (i = 0; i < cnt; i++)
> > > > > > > > +                             if (copy_to_user(prog_attach_flags + i, &flags, sizeof(flags)))
> > > > > > > > +                                     return -EFAULT;
> > > > > > > > +
> > > > > > > > +             prog_ids += cnt;
> > > > > > > > +             total_cnt -= cnt;
> > > > > > > > +             if (prog_attach_flags)
> > > > > > > > +                     prog_attach_flags += cnt;
> > > > > > > >       }
> > > > > > > >       return ret;
> > > > > > > >  }
> > > > > > > > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > > > > > > > index 5ed2093e51cc..4137583c04a2 100644
> > > > > > > > --- a/kernel/bpf/syscall.c
> > > > > > > > +++ b/kernel/bpf/syscall.c
> > > > > > > > @@ -3520,7 +3520,7 @@ static int bpf_prog_detach(const union bpf_attr *attr)
> > > > > > > >       }
> > > > > > > >  }
> > > > > > > >
> > > > > > > > -#define BPF_PROG_QUERY_LAST_FIELD query.prog_cnt
> > > > > > > > +#define BPF_PROG_QUERY_LAST_FIELD query.prog_attach_flags
> > > > > > > >
> > > > > > > >  static int bpf_prog_query(const union bpf_attr *attr,
> > > > > > > >                         union bpf_attr __user *uattr)
> > > > > > > > @@ -3556,6 +3556,7 @@ static int bpf_prog_query(const union bpf_attr *attr,
> > > > > > > >       case BPF_CGROUP_SYSCTL:
> > > > > > > >       case BPF_CGROUP_GETSOCKOPT:
> > > > > > > >       case BPF_CGROUP_SETSOCKOPT:
> > > > > > > > +     case BPF_LSM_CGROUP:
> > > > > > > >               return cgroup_bpf_prog_query(attr, uattr);
> > > > > > > >       case BPF_LIRC_MODE2:
> > > > > > > >               return lirc_prog_query(attr, uattr);
> > > > > > > > @@ -4066,6 +4067,7 @@ static int bpf_prog_get_info_by_fd(struct file *file,
> > > > > > > >
> > > > > > > >       if (prog->aux->btf)
> > > > > > > >               info.btf_id = btf_obj_id(prog->aux->btf);
> > > > > > > > +     info.attach_btf_func_id = prog->aux->attach_btf_id;
> > > > > > > Note that exposing prog->aux->attach_btf_id only may not be enough
> > > > > > > unless it can assume info.attach_btf_id is always referring to btf_vmlinux
> > > > > > > for all bpf prog types.
> > > > > >
> > > > > > We also export btf_id two lines above, right? Btw, I left a comment in
> > > > > > the bpftool about those btf_ids, I'm not sure how resolve them and
> > > > > > always assume vmlinux for now.
> > > > > yeah, that btf_id above is the cgroup-lsm prog's btf_id which has its
> > > > > func info, line info...etc.   It is not the one the attach_btf_id correspond
> > > > > to.  attach_btf_id refers to either aux->attach_btf or aux->dst_prog's btf (or
> > > > > target btf id here).
> > > > >
> > > > > It needs a consensus on where this attach_btf_id, target btf id, and
> > > > > prog_attach_flags should be.  If I read the patch 7 thread correctly,
> > > > > I think Andrii is suggesting to expose them to userspace through link, so
> > > > > potentially putting them in bpf_link_info.  The bpf_prog_query will
> > > > > output a list of link ids.  The same probably applies to
> > > >
> > > > Yep and I think it makes sense because link is representing one
> > > > specific attachment (and I presume flags can be stored inside the link
> > > > itself as well, right?).
> > > >
> > > > But if legacy non-link BPF_PROG_ATTACH is supported then using
> > > > bpf_link_info won't cover legacy prog-only attachments.
> > >
> > > I don't have any attachment to the legacy apis, I'm supporting them
> > > only because it takes two lines of code; we can go link-only if there
> > > is an agreement that it's inherently better.
> > >
> > > How about I keep sys_bpf(BPF_PROG_QUERY) as is and I do a loop in the
> > > userspace (for BPF_LSM_CGROUP only) over all links
> > > (BPF_LINK_GET_NEXT_ID) and will find the the ones with matching prog
> > > ids (BPF_LINK_GET_FD_BY_ID+BPF_OBJ_GET_INFO_BY_FD)?
> > >
> > > That way we keep new fields in bpf_link_info, but we don't have to
> > > extend sys_bpf(BPF_PROG_QUERY) because there doesn't seem to be a good
> > > way to do it. Exporting links via new link_fds would mean we'd have to
> > > support BPF_F_QUERY_EFFECTIVE, but getting an effective array of links
> > > seems to be messy. If, in the future, we figure out a better way to
> > > expose a list of attached/effective links per cgroup, we can
> > > convert/optimize bpftool.
> >
> > Why not use iter/bpf_link program (see progs/bpf_iter_bpf_link.c for
> > an example) instead? Once you have struct bpf_link and you know it's
> > cgroup link, you can cast it to struct bpf_cgroup_link and get access
> > to prog and cgroup. From cgroup to cgroup_bpf you can even get access
> > to effective array. Basically whatever kernel has access to you can
> > have access to from bpftool without extending any UAPIs.
>
> Seems a bit too involved just to read back the fields? I might as well
> use drgn? I'm also not sure about the implementation: will I be able
> to upcast bpf_link to bpf_cgroup_link in the bpf prog? And getting
> attach_type might be problematic from the iterator program as well: I
> need to call kernel's bpf_lsm_attach_type_get to find atype for
> attach_btf_id, I'd have to export it as kfunc?

I've prototyped what I suggested above and there is another
problem with going link-only: bpftool currently uses bpf_prog_attach
unconditionally; we'd have to change that to use links for
BPF_LSM_CGROUP (and pin them in some hard-coded locations?) :-(
I'm leaning towards keeping those legacy APIs around and exporting via
prog_info; there doesn't seem to be a clear benefit to going link-only :-(
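
For reference, the link-based path in bpftool would boil down to
something like this (rough sketch; BPF_LSM_CGROUP is the attach type
added by this series, and the pin path is a made-up example the caller
would have to choose):

#include <bpf/bpf.h>

/* Attach prog_fd to cgroup_fd via a link and pin it so the attachment
 * survives bpftool exiting.
 */
int attach_lsm_cgroup(int prog_fd, int cgroup_fd, const char *pin_path)
{
	int link_fd;

	link_fd = bpf_link_create(prog_fd, cgroup_fd, BPF_LSM_CGROUP, NULL);
	if (link_fd < 0)
		return link_fd;

	return bpf_obj_pin(link_fd, pin_path);
}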

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-05-25 17:02                 ` Stanislav Fomichev
@ 2022-05-25 20:39                   ` Martin KaFai Lau
  2022-05-25 21:25                     ` sdf
  0 siblings, 1 reply; 54+ messages in thread
From: Martin KaFai Lau @ 2022-05-25 20:39 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Andrii Nakryiko, Networking, bpf, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko

On Wed, May 25, 2022 at 10:02:07AM -0700, Stanislav Fomichev wrote:
> On Wed, May 25, 2022 at 9:01 AM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > On Tue, May 24, 2022 at 9:39 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Tue, May 24, 2022 at 9:03 PM Stanislav Fomichev <sdf@google.com> wrote:
> > > >
> > > > On Tue, May 24, 2022 at 4:45 PM Andrii Nakryiko
> > > > <andrii.nakryiko@gmail.com> wrote:
> > > > >
> > > > > On Tue, May 24, 2022 at 10:50 AM Martin KaFai Lau <kafai@fb.com> wrote:
> > > > > >
> > > > > > On Tue, May 24, 2022 at 08:55:04AM -0700, Stanislav Fomichev wrote:
> > > > > > > On Mon, May 23, 2022 at 8:49 PM Martin KaFai Lau <kafai@fb.com> wrote:
> > > > > > > >
> > > > > > > > On Wed, May 18, 2022 at 03:55:25PM -0700, Stanislav Fomichev wrote:
> > > > > > > > > We have two options:
> > > > > > > > > 1. Treat all BPF_LSM_CGROUP the same, regardless of attach_btf_id
> > > > > > > > > 2. Treat BPF_LSM_CGROUP+attach_btf_id as a separate hook point
> > > > > > > > >
> > > > > > > > > I was doing (2) in the original patch, but switching to (1) here:
> > > > > > > > >
> > > > > > > > > * bpf_prog_query returns all attached BPF_LSM_CGROUP programs
> > > > > > > > > regardless of attach_btf_id
> > > > > > > > > * attach_btf_id is exported via bpf_prog_info
> > > > > > > > >
> > > > > > > > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > > > > > > > ---
> > > > > > > > >  include/uapi/linux/bpf.h |   5 ++
> > > > > > > > >  kernel/bpf/cgroup.c      | 103 +++++++++++++++++++++++++++------------
> > > > > > > > >  kernel/bpf/syscall.c     |   4 +-
> > > > > > > > >  3 files changed, 81 insertions(+), 31 deletions(-)
> > > > > > > > >
> > > > > > > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > > > > > > > index b9d2d6de63a7..432fc5f49567 100644
> > > > > > > > > --- a/include/uapi/linux/bpf.h
> > > > > > > > > +++ b/include/uapi/linux/bpf.h
> > > > > > > > > @@ -1432,6 +1432,7 @@ union bpf_attr {
> > > > > > > > >               __u32           attach_flags;
> > > > > > > > >               __aligned_u64   prog_ids;
> > > > > > > > >               __u32           prog_cnt;
> > > > > > > > > +             __aligned_u64   prog_attach_flags; /* output: per-program attach_flags */
> > > > > > > > >       } query;
> > > > > > > > >
> > > > > > > > >       struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
> > > > > > > > > @@ -5911,6 +5912,10 @@ struct bpf_prog_info {
> > > > > > > > >       __u64 run_cnt;
> > > > > > > > >       __u64 recursion_misses;
> > > > > > > > >       __u32 verified_insns;
> > > > > > > > > +     /* BTF ID of the function to attach to within BTF object identified
> > > > > > > > > +      * by btf_id.
> > > > > > > > > +      */
> > > > > > > > > +     __u32 attach_btf_func_id;
> > > > > > > > >  } __attribute__((aligned(8)));
> > > > > > > > >
> > > > > > > > >  struct bpf_map_info {
> > > > > > > > > diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> > > > > > > > > index a959cdd22870..08a1015ee09e 100644
> > > > > > > > > --- a/kernel/bpf/cgroup.c
> > > > > > > > > +++ b/kernel/bpf/cgroup.c
> > > > > > > > > @@ -1074,6 +1074,7 @@ static int cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
> > > > > > > > >  static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
> > > > > > > > >                             union bpf_attr __user *uattr)
> > > > > > > > >  {
> > > > > > > > > +     __u32 __user *prog_attach_flags = u64_to_user_ptr(attr->query.prog_attach_flags);
> > > > > > > > >       __u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids);
> > > > > > > > >       enum bpf_attach_type type = attr->query.attach_type;
> > > > > > > > >       enum cgroup_bpf_attach_type atype;
> > > > > > > > > @@ -1081,50 +1082,92 @@ static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
> > > > > > > > >       struct hlist_head *progs;
> > > > > > > > >       struct bpf_prog *prog;
> > > > > > > > >       int cnt, ret = 0, i;
> > > > > > > > > +     int total_cnt = 0;
> > > > > > > > >       u32 flags;
> > > > > > > > >
> > > > > > > > > -     atype = to_cgroup_bpf_attach_type(type);
> > > > > > > > > -     if (atype < 0)
> > > > > > > > > -             return -EINVAL;
> > > > > > > > > +     enum cgroup_bpf_attach_type from_atype, to_atype;
> > > > > > > > >
> > > > > > > > > -     progs = &cgrp->bpf.progs[atype];
> > > > > > > > > -     flags = cgrp->bpf.flags[atype];
> > > > > > > > > +     if (type == BPF_LSM_CGROUP) {
> > > > > > > > > +             from_atype = CGROUP_LSM_START;
> > > > > > > > > +             to_atype = CGROUP_LSM_END;
> > > > > > > > > +     } else {
> > > > > > > > > +             from_atype = to_cgroup_bpf_attach_type(type);
> > > > > > > > > +             if (from_atype < 0)
> > > > > > > > > +                     return -EINVAL;
> > > > > > > > > +             to_atype = from_atype;
> > > > > > > > > +     }
> > > > > > > > >
> > > > > > > > > -     effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > > > > > > > -                                           lockdep_is_held(&cgroup_mutex));
> > > > > > > > > +     for (atype = from_atype; atype <= to_atype; atype++) {
> > > > > > > > > +             progs = &cgrp->bpf.progs[atype];
> > > > > > > > > +             flags = cgrp->bpf.flags[atype];
> > > > > > > > >
> > > > > > > > > -     if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > > > > > > > > -             cnt = bpf_prog_array_length(effective);
> > > > > > > > > -     else
> > > > > > > > > -             cnt = prog_list_length(progs);
> > > > > > > > > +             effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > > > > > > > +                                                   lockdep_is_held(&cgroup_mutex));
> > > > > > > > >
> > > > > > > > > -     if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
> > > > > > > > > -             return -EFAULT;
> > > > > > > > > -     if (copy_to_user(&uattr->query.prog_cnt, &cnt, sizeof(cnt)))
> > > > > > > > > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > > > > > > > > +                     total_cnt += bpf_prog_array_length(effective);
> > > > > > > > > +             else
> > > > > > > > > +                     total_cnt += prog_list_length(progs);
> > > > > > > > > +     }
> > > > > > > > > +
> > > > > > > > > +     if (type != BPF_LSM_CGROUP)
> > > > > > > > > +             if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
> > > > > > > > > +                     return -EFAULT;
> > > > > > > > > +     if (copy_to_user(&uattr->query.prog_cnt, &total_cnt, sizeof(total_cnt)))
> > > > > > > > >               return -EFAULT;
> > > > > > > > > -     if (attr->query.prog_cnt == 0 || !prog_ids || !cnt)
> > > > > > > > > +     if (attr->query.prog_cnt == 0 || !prog_ids || !total_cnt)
> > > > > > > > >               /* return early if user requested only program count + flags */
> > > > > > > > >               return 0;
> > > > > > > > > -     if (attr->query.prog_cnt < cnt) {
> > > > > > > > > -             cnt = attr->query.prog_cnt;
> > > > > > > > > +
> > > > > > > > > +     if (attr->query.prog_cnt < total_cnt) {
> > > > > > > > > +             total_cnt = attr->query.prog_cnt;
> > > > > > > > >               ret = -ENOSPC;
> > > > > > > > >       }
> > > > > > > > >
> > > > > > > > > -     if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
> > > > > > > > > -             return bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> > > > > > > > > -     } else {
> > > > > > > > > -             struct bpf_prog_list *pl;
> > > > > > > > > -             u32 id;
> > > > > > > > > +     for (atype = from_atype; atype <= to_atype; atype++) {
> > > > > > > > > +             if (total_cnt <= 0)
> > > > > > > > > +                     break;
> > > > > > > > >
> > > > > > > > > -             i = 0;
> > > > > > > > > -             hlist_for_each_entry(pl, progs, node) {
> > > > > > > > > -                     prog = prog_list_prog(pl);
> > > > > > > > > -                     id = prog->aux->id;
> > > > > > > > > -                     if (copy_to_user(prog_ids + i, &id, sizeof(id)))
> > > > > > > > > -                             return -EFAULT;
> > > > > > > > > -                     if (++i == cnt)
> > > > > > > > > -                             break;
> > > > > > > > > +             progs = &cgrp->bpf.progs[atype];
> > > > > > > > > +             flags = cgrp->bpf.flags[atype];
> > > > > > > > > +
> > > > > > > > > +             effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > > > > > > > +                                                   lockdep_is_held(&cgroup_mutex));
> > > > > > > > > +
> > > > > > > > > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > > > > > > > > +                     cnt = bpf_prog_array_length(effective);
> > > > > > > > > +             else
> > > > > > > > > +                     cnt = prog_list_length(progs);
> > > > > > > > > +
> > > > > > > > > +             if (cnt >= total_cnt)
> > > > > > > > > +                     cnt = total_cnt;
> > > > > > > > > +
> > > > > > > > > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
> > > > > > > > > +                     ret = bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> > > > > > > > > +             } else {
> > > > > > > > > +                     struct bpf_prog_list *pl;
> > > > > > > > > +                     u32 id;
> > > > > > > > > +
> > > > > > > > > +                     i = 0;
> > > > > > > > > +                     hlist_for_each_entry(pl, progs, node) {
> > > > > > > > > +                             prog = prog_list_prog(pl);
> > > > > > > > > +                             id = prog->aux->id;
> > > > > > > > > +                             if (copy_to_user(prog_ids + i, &id, sizeof(id)))
> > > > > > > > > +                                     return -EFAULT;
> > > > > > > > > +                             if (++i == cnt)
> > > > > > > > > +                                     break;
> > > > > > > > > +                     }
> > > > > > > > >               }
> > > > > > > > > +
> > > > > > > > > +             if (prog_attach_flags)
> > > > > > > > > +                     for (i = 0; i < cnt; i++)
> > > > > > > > > +                             if (copy_to_user(prog_attach_flags + i, &flags, sizeof(flags)))
> > > > > > > > > +                                     return -EFAULT;
> > > > > > > > > +
> > > > > > > > > +             prog_ids += cnt;
> > > > > > > > > +             total_cnt -= cnt;
> > > > > > > > > +             if (prog_attach_flags)
> > > > > > > > > +                     prog_attach_flags += cnt;
> > > > > > > > >       }
> > > > > > > > >       return ret;
> > > > > > > > >  }
> > > > > > > > > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > > > > > > > > index 5ed2093e51cc..4137583c04a2 100644
> > > > > > > > > --- a/kernel/bpf/syscall.c
> > > > > > > > > +++ b/kernel/bpf/syscall.c
> > > > > > > > > @@ -3520,7 +3520,7 @@ static int bpf_prog_detach(const union bpf_attr *attr)
> > > > > > > > >       }
> > > > > > > > >  }
> > > > > > > > >
> > > > > > > > > -#define BPF_PROG_QUERY_LAST_FIELD query.prog_cnt
> > > > > > > > > +#define BPF_PROG_QUERY_LAST_FIELD query.prog_attach_flags
> > > > > > > > >
> > > > > > > > >  static int bpf_prog_query(const union bpf_attr *attr,
> > > > > > > > >                         union bpf_attr __user *uattr)
> > > > > > > > > @@ -3556,6 +3556,7 @@ static int bpf_prog_query(const union bpf_attr *attr,
> > > > > > > > >       case BPF_CGROUP_SYSCTL:
> > > > > > > > >       case BPF_CGROUP_GETSOCKOPT:
> > > > > > > > >       case BPF_CGROUP_SETSOCKOPT:
> > > > > > > > > +     case BPF_LSM_CGROUP:
> > > > > > > > >               return cgroup_bpf_prog_query(attr, uattr);
> > > > > > > > >       case BPF_LIRC_MODE2:
> > > > > > > > >               return lirc_prog_query(attr, uattr);
> > > > > > > > > @@ -4066,6 +4067,7 @@ static int bpf_prog_get_info_by_fd(struct file *file,
> > > > > > > > >
> > > > > > > > >       if (prog->aux->btf)
> > > > > > > > >               info.btf_id = btf_obj_id(prog->aux->btf);
> > > > > > > > > +     info.attach_btf_func_id = prog->aux->attach_btf_id;
> > > > > > > > Note that exposing prog->aux->attach_btf_id only may not be enough
> > > > > > > > unless it can assume info.attach_btf_id is always referring to btf_vmlinux
> > > > > > > > for all bpf prog types.
> > > > > > >
> > > > > > > We also export btf_id two lines above, right? Btw, I left a comment in
> > > > > > > the bpftool about those btf_ids, I'm not sure how resolve them and
> > > > > > > always assume vmlinux for now.
> > > > > > yeah, that btf_id above is the cgroup-lsm prog's btf_id which has its
> > > > > > func info, line info...etc.   It is not the one the attach_btf_id correspond
> > > > > > to.  attach_btf_id refers to either aux->attach_btf or aux->dst_prog's btf (or
> > > > > > target btf id here).
> > > > > >
> > > > > > It needs a consensus on where this attach_btf_id, target btf id, and
> > > > > > prog_attach_flags should be.  If I read the patch 7 thread correctly,
> > > > > > I think Andrii is suggesting to expose them to userspace through link, so
> > > > > > potentially putting them in bpf_link_info.  The bpf_prog_query will
> > > > > > output a list of link ids.  The same probably applies to
> > > > >
> > > > > Yep and I think it makes sense because link is representing one
> > > > > specific attachment (and I presume flags can be stored inside the link
> > > > > itself as well, right?).
> > > > >
> > > > > But if legacy non-link BPF_PROG_ATTACH is supported then using
> > > > > bpf_link_info won't cover legacy prog-only attachments.
> > > >
> > > > I don't have any attachment to the legacy apis, I'm supporting them
> > > > only because it takes two lines of code; we can go link-only if there
> > > > is an agreement that it's inherently better.
> > > >
> > > > How about I keep sys_bpf(BPF_PROG_QUERY) as is and I do a loop in the
> > > > userspace (for BPF_LSM_CGROUP only) over all links
> > > > (BPF_LINK_GET_NEXT_ID) and will find the the ones with matching prog
> > > > ids (BPF_LINK_GET_FD_BY_ID+BPF_OBJ_GET_INFO_BY_FD)?
> > > >
> > > > That way we keep new fields in bpf_link_info, but we don't have to
> > > > extend sys_bpf(BPF_PROG_QUERY) because there doesn't seem to be a good
> > > > way to do it. Exporting links via new link_fds would mean we'd have to
> > > > support BPF_F_QUERY_EFFECTIVE, but getting an effective array of links
> > > > seems to be messy. If, in the future, we figure out a better way to
I don't see a clean way to get the effective array from one individual
link[_info] through link iteration.  The effective array is the progs that
will run at a cgroup, and in that order.  A prog running at a cgroup isn't
necessarily linked to that cgroup.

If staying with BPF_PROG_QUERY+BPF_F_QUERY_EFFECTIVE to get the effective
array, and if it is decided the addition should be done in bpf_link_info,
then a list of link ids needs to be output instead of the current list of
prog ids.  The old attach types will still have to stay with the list of
prog ids though :/

It would be sad not to be able to get the effective array for BPF_LSM_CGROUP
only.  I find it more useful to show what will run at a cgroup, and in which
order, rather than what is linked to a cgroup.
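
To make that concrete, the effective view is roughly what this untested
sketch would print once BPF_PROG_QUERY accepts BPF_LSM_CGROUP as in this
patch (the cgroup path and buffer size are made up):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <bpf/bpf.h>

static void dump_effective_lsm_cgroup(const char *cg_path)
{
	__u32 prog_ids[64], prog_cnt = 64, attach_flags = 0, i;
	int cg_fd = open(cg_path, O_RDONLY);

	if (cg_fd < 0)
		return;

	/* "effective" == the progs that will actually run at this cgroup,
	 * in execution order, including ones attached to ancestor cgroups.
	 */
	if (!bpf_prog_query(cg_fd, BPF_LSM_CGROUP, BPF_F_QUERY_EFFECTIVE,
			    &attach_flags, prog_ids, &prog_cnt))
		for (i = 0; i < prog_cnt; i++)
			printf("prog id %u\n", prog_ids[i]);

	close(cg_fd);
}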

> > > > expose a list of attached/effective links per cgroup, we can
> > > > convert/optimize bpftool.
> > >
> > > Why not use iter/bpf_link program (see progs/bpf_iter_bpf_link.c for
> > > an example) instead? Once you have struct bpf_link and you know it's
> > > cgroup link, you can cast it to struct bpf_cgroup_link and get access
> > > to prog and cgroup. From cgroup to cgroup_bpf you can even get access
> > > to effective array. Basically whatever kernel has access to you can
> > > have access to from bpftool without extending any UAPIs.
> >
> > Seems a bit too involved just to read back the fields? I might as well
> > use drgn? I'm also not sure about the implementation: will I be able
> > to upcast bpf_link to bpf_cgroup_link in the bpf prog? And getting
> > attach_type might be problematic from the iterator program as well: I
> > need to call kernel's bpf_lsm_attach_type_get to find atype for
> > attach_btf_id, I'd have to export it as kfunc?
> 
> I've prototyped whatever I've suggested above and there is another
> problem with going link-only: bpftool currently uses bpf_prog_attach
> unconditionally; we'd have to change that to use links for
> BPF_LSM_CGROUP (and pin them in some hard-coded locations?) :-(
> I'm leaning towards keeping those legacy apis around and exporting via
> prog_info; there doesn't seem to be a clear benefit :-(

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-05-25 20:39                   ` Martin KaFai Lau
@ 2022-05-25 21:25                     ` sdf
  2022-05-26  0:03                       ` Martin KaFai Lau
  0 siblings, 1 reply; 54+ messages in thread
From: sdf @ 2022-05-25 21:25 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Andrii Nakryiko, Networking, bpf, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko

On 05/25, Martin KaFai Lau wrote:
> On Wed, May 25, 2022 at 10:02:07AM -0700, Stanislav Fomichev wrote:
> > On Wed, May 25, 2022 at 9:01 AM Stanislav Fomichev <sdf@google.com>  
> wrote:
> > >
> > > On Tue, May 24, 2022 at 9:39 PM Andrii Nakryiko
> > > <andrii.nakryiko@gmail.com> wrote:
> > > >
> > > > On Tue, May 24, 2022 at 9:03 PM Stanislav Fomichev <sdf@google.com>  
> wrote:
> > > > >
> > > > > On Tue, May 24, 2022 at 4:45 PM Andrii Nakryiko
> > > > > <andrii.nakryiko@gmail.com> wrote:
> > > > > >
> > > > > > On Tue, May 24, 2022 at 10:50 AM Martin KaFai Lau  
> <kafai@fb.com> wrote:
> > > > > > >
> > > > > > > On Tue, May 24, 2022 at 08:55:04AM -0700, Stanislav Fomichev  
> wrote:
> > > > > > > > On Mon, May 23, 2022 at 8:49 PM Martin KaFai Lau  
> <kafai@fb.com> wrote:
> > > > > > > > >
> > > > > > > > > On Wed, May 18, 2022 at 03:55:25PM -0700, Stanislav  
> Fomichev wrote:
> > > > > > > > > > We have two options:
> > > > > > > > > > 1. Treat all BPF_LSM_CGROUP the same, regardless of  
> attach_btf_id
> > > > > > > > > > 2. Treat BPF_LSM_CGROUP+attach_btf_id as a separate  
> hook point
> > > > > > > > > >
> > > > > > > > > > I was doing (2) in the original patch, but switching to  
> (1) here:
> > > > > > > > > >
> > > > > > > > > > * bpf_prog_query returns all attached BPF_LSM_CGROUP  
> programs
> > > > > > > > > > regardless of attach_btf_id
> > > > > > > > > > * attach_btf_id is exported via bpf_prog_info
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > > > > > > > > ---
> > > > > > > > > >  include/uapi/linux/bpf.h |   5 ++
> > > > > > > > > >  kernel/bpf/cgroup.c      | 103  
> +++++++++++++++++++++++++++------------
> > > > > > > > > >  kernel/bpf/syscall.c     |   4 +-
> > > > > > > > > >  3 files changed, 81 insertions(+), 31 deletions(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/include/uapi/linux/bpf.h  
> b/include/uapi/linux/bpf.h
> > > > > > > > > > index b9d2d6de63a7..432fc5f49567 100644
> > > > > > > > > > --- a/include/uapi/linux/bpf.h
> > > > > > > > > > +++ b/include/uapi/linux/bpf.h
> > > > > > > > > > @@ -1432,6 +1432,7 @@ union bpf_attr {
> > > > > > > > > >               __u32           attach_flags;
> > > > > > > > > >               __aligned_u64   prog_ids;
> > > > > > > > > >               __u32           prog_cnt;
> > > > > > > > > > +             __aligned_u64   prog_attach_flags; /*  
> output: per-program attach_flags */
> > > > > > > > > >       } query;
> > > > > > > > > >
> > > > > > > > > >       struct { /* anonymous struct used by  
> BPF_RAW_TRACEPOINT_OPEN command */
> > > > > > > > > > @@ -5911,6 +5912,10 @@ struct bpf_prog_info {
> > > > > > > > > >       __u64 run_cnt;
> > > > > > > > > >       __u64 recursion_misses;
> > > > > > > > > >       __u32 verified_insns;
> > > > > > > > > > +     /* BTF ID of the function to attach to within BTF  
> object identified
> > > > > > > > > > +      * by btf_id.
> > > > > > > > > > +      */
> > > > > > > > > > +     __u32 attach_btf_func_id;
> > > > > > > > > >  } __attribute__((aligned(8)));
> > > > > > > > > >
> > > > > > > > > >  struct bpf_map_info {
> > > > > > > > > > diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> > > > > > > > > > index a959cdd22870..08a1015ee09e 100644
> > > > > > > > > > --- a/kernel/bpf/cgroup.c
> > > > > > > > > > +++ b/kernel/bpf/cgroup.c
> > > > > > > > > > @@ -1074,6 +1074,7 @@ static int  
> cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
> > > > > > > > > >  static int __cgroup_bpf_query(struct cgroup *cgrp,  
> const union bpf_attr *attr,
> > > > > > > > > >                             union bpf_attr __user  
> *uattr)
> > > > > > > > > >  {
> > > > > > > > > > +     __u32 __user *prog_attach_flags =  
> u64_to_user_ptr(attr->query.prog_attach_flags);
> > > > > > > > > >       __u32 __user *prog_ids =  
> u64_to_user_ptr(attr->query.prog_ids);
> > > > > > > > > >       enum bpf_attach_type type =  
> attr->query.attach_type;
> > > > > > > > > >       enum cgroup_bpf_attach_type atype;
> > > > > > > > > > @@ -1081,50 +1082,92 @@ static int  
> __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
> > > > > > > > > >       struct hlist_head *progs;
> > > > > > > > > >       struct bpf_prog *prog;
> > > > > > > > > >       int cnt, ret = 0, i;
> > > > > > > > > > +     int total_cnt = 0;
> > > > > > > > > >       u32 flags;
> > > > > > > > > >
> > > > > > > > > > -     atype = to_cgroup_bpf_attach_type(type);
> > > > > > > > > > -     if (atype < 0)
> > > > > > > > > > -             return -EINVAL;
> > > > > > > > > > +     enum cgroup_bpf_attach_type from_atype, to_atype;
> > > > > > > > > >
> > > > > > > > > > -     progs = &cgrp->bpf.progs[atype];
> > > > > > > > > > -     flags = cgrp->bpf.flags[atype];
> > > > > > > > > > +     if (type == BPF_LSM_CGROUP) {
> > > > > > > > > > +             from_atype = CGROUP_LSM_START;
> > > > > > > > > > +             to_atype = CGROUP_LSM_END;
> > > > > > > > > > +     } else {
> > > > > > > > > > +             from_atype =  
> to_cgroup_bpf_attach_type(type);
> > > > > > > > > > +             if (from_atype < 0)
> > > > > > > > > > +                     return -EINVAL;
> > > > > > > > > > +             to_atype = from_atype;
> > > > > > > > > > +     }
> > > > > > > > > >
> > > > > > > > > > -     effective =  
> rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > > > > > > > > -                                            
> lockdep_is_held(&cgroup_mutex));
> > > > > > > > > > +     for (atype = from_atype; atype <= to_atype;  
> atype++) {
> > > > > > > > > > +             progs = &cgrp->bpf.progs[atype];
> > > > > > > > > > +             flags = cgrp->bpf.flags[atype];
> > > > > > > > > >
> > > > > > > > > > -     if (attr->query.query_flags &  
> BPF_F_QUERY_EFFECTIVE)
> > > > > > > > > > -             cnt = bpf_prog_array_length(effective);
> > > > > > > > > > -     else
> > > > > > > > > > -             cnt = prog_list_length(progs);
> > > > > > > > > > +             effective =  
> rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > > > > > > > > +                                                    
> lockdep_is_held(&cgroup_mutex));
> > > > > > > > > >
> > > > > > > > > > -     if (copy_to_user(&uattr->query.attach_flags,  
> &flags, sizeof(flags)))
> > > > > > > > > > -             return -EFAULT;
> > > > > > > > > > -     if (copy_to_user(&uattr->query.prog_cnt, &cnt,  
> sizeof(cnt)))
> > > > > > > > > > +             if (attr->query.query_flags &  
> BPF_F_QUERY_EFFECTIVE)
> > > > > > > > > > +                     total_cnt +=  
> bpf_prog_array_length(effective);
> > > > > > > > > > +             else
> > > > > > > > > > +                     total_cnt +=  
> prog_list_length(progs);
> > > > > > > > > > +     }
> > > > > > > > > > +
> > > > > > > > > > +     if (type != BPF_LSM_CGROUP)
> > > > > > > > > > +             if  
> (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
> > > > > > > > > > +                     return -EFAULT;
> > > > > > > > > > +     if (copy_to_user(&uattr->query.prog_cnt,  
> &total_cnt, sizeof(total_cnt)))
> > > > > > > > > >               return -EFAULT;
> > > > > > > > > > -     if (attr->query.prog_cnt == 0 || !prog_ids | 
> | !cnt)
> > > > > > > > > > +     if (attr->query.prog_cnt == 0 || !prog_ids | 
> | !total_cnt)
> > > > > > > > > >               /* return early if user requested only  
> program count + flags */
> > > > > > > > > >               return 0;
> > > > > > > > > > -     if (attr->query.prog_cnt < cnt) {
> > > > > > > > > > -             cnt = attr->query.prog_cnt;
> > > > > > > > > > +
> > > > > > > > > > +     if (attr->query.prog_cnt < total_cnt) {
> > > > > > > > > > +             total_cnt = attr->query.prog_cnt;
> > > > > > > > > >               ret = -ENOSPC;
> > > > > > > > > >       }
> > > > > > > > > >
> > > > > > > > > > -     if (attr->query.query_flags &  
> BPF_F_QUERY_EFFECTIVE) {
> > > > > > > > > > -             return  
> bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> > > > > > > > > > -     } else {
> > > > > > > > > > -             struct bpf_prog_list *pl;
> > > > > > > > > > -             u32 id;
> > > > > > > > > > +     for (atype = from_atype; atype <= to_atype;  
> atype++) {
> > > > > > > > > > +             if (total_cnt <= 0)
> > > > > > > > > > +                     break;
> > > > > > > > > >
> > > > > > > > > > -             i = 0;
> > > > > > > > > > -             hlist_for_each_entry(pl, progs, node) {
> > > > > > > > > > -                     prog = prog_list_prog(pl);
> > > > > > > > > > -                     id = prog->aux->id;
> > > > > > > > > > -                     if (copy_to_user(prog_ids + i,  
> &id, sizeof(id)))
> > > > > > > > > > -                             return -EFAULT;
> > > > > > > > > > -                     if (++i == cnt)
> > > > > > > > > > -                             break;
> > > > > > > > > > +             progs = &cgrp->bpf.progs[atype];
> > > > > > > > > > +             flags = cgrp->bpf.flags[atype];
> > > > > > > > > > +
> > > > > > > > > > +             effective =  
> rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > > > > > > > > +                                                    
> lockdep_is_held(&cgroup_mutex));
> > > > > > > > > > +
> > > > > > > > > > +             if (attr->query.query_flags &  
> BPF_F_QUERY_EFFECTIVE)
> > > > > > > > > > +                     cnt =  
> bpf_prog_array_length(effective);
> > > > > > > > > > +             else
> > > > > > > > > > +                     cnt = prog_list_length(progs);
> > > > > > > > > > +
> > > > > > > > > > +             if (cnt >= total_cnt)
> > > > > > > > > > +                     cnt = total_cnt;
> > > > > > > > > > +
> > > > > > > > > > +             if (attr->query.query_flags &  
> BPF_F_QUERY_EFFECTIVE) {
> > > > > > > > > > +                     ret =  
> bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> > > > > > > > > > +             } else {
> > > > > > > > > > +                     struct bpf_prog_list *pl;
> > > > > > > > > > +                     u32 id;
> > > > > > > > > > +
> > > > > > > > > > +                     i = 0;
> > > > > > > > > > +                     hlist_for_each_entry(pl, progs,  
> node) {
> > > > > > > > > > +                             prog = prog_list_prog(pl);
> > > > > > > > > > +                             id = prog->aux->id;
> > > > > > > > > > +                             if (copy_to_user(prog_ids  
> + i, &id, sizeof(id)))
> > > > > > > > > > +                                     return -EFAULT;
> > > > > > > > > > +                             if (++i == cnt)
> > > > > > > > > > +                                     break;
> > > > > > > > > > +                     }
> > > > > > > > > >               }
> > > > > > > > > > +
> > > > > > > > > > +             if (prog_attach_flags)
> > > > > > > > > > +                     for (i = 0; i < cnt; i++)
> > > > > > > > > > +                             if  
> (copy_to_user(prog_attach_flags + i, &flags, sizeof(flags)))
> > > > > > > > > > +                                     return -EFAULT;
> > > > > > > > > > +
> > > > > > > > > > +             prog_ids += cnt;
> > > > > > > > > > +             total_cnt -= cnt;
> > > > > > > > > > +             if (prog_attach_flags)
> > > > > > > > > > +                     prog_attach_flags += cnt;
> > > > > > > > > >       }
> > > > > > > > > >       return ret;
> > > > > > > > > >  }
> > > > > > > > > > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > > > > > > > > > index 5ed2093e51cc..4137583c04a2 100644
> > > > > > > > > > --- a/kernel/bpf/syscall.c
> > > > > > > > > > +++ b/kernel/bpf/syscall.c
> > > > > > > > > > @@ -3520,7 +3520,7 @@ static int bpf_prog_detach(const  
> union bpf_attr *attr)
> > > > > > > > > >       }
> > > > > > > > > >  }
> > > > > > > > > >
> > > > > > > > > > -#define BPF_PROG_QUERY_LAST_FIELD query.prog_cnt
> > > > > > > > > > +#define BPF_PROG_QUERY_LAST_FIELD  
> query.prog_attach_flags
> > > > > > > > > >
> > > > > > > > > >  static int bpf_prog_query(const union bpf_attr *attr,
> > > > > > > > > >                         union bpf_attr __user *uattr)
> > > > > > > > > > @@ -3556,6 +3556,7 @@ static int bpf_prog_query(const  
> union bpf_attr *attr,
> > > > > > > > > >       case BPF_CGROUP_SYSCTL:
> > > > > > > > > >       case BPF_CGROUP_GETSOCKOPT:
> > > > > > > > > >       case BPF_CGROUP_SETSOCKOPT:
> > > > > > > > > > +     case BPF_LSM_CGROUP:
> > > > > > > > > >               return cgroup_bpf_prog_query(attr, uattr);
> > > > > > > > > >       case BPF_LIRC_MODE2:
> > > > > > > > > >               return lirc_prog_query(attr, uattr);
> > > > > > > > > > @@ -4066,6 +4067,7 @@ static int  
> bpf_prog_get_info_by_fd(struct file *file,
> > > > > > > > > >
> > > > > > > > > >       if (prog->aux->btf)
> > > > > > > > > >               info.btf_id = btf_obj_id(prog->aux->btf);
> > > > > > > > > > +     info.attach_btf_func_id =  
> prog->aux->attach_btf_id;
> > > > > > > > > Note that exposing prog->aux->attach_btf_id only may not  
> be enough
> > > > > > > > > unless it can assume info.attach_btf_id is always  
> referring to btf_vmlinux
> > > > > > > > > for all bpf prog types.
> > > > > > > >
> > > > > > > > We also export btf_id two lines above, right? Btw, I left a  
> comment in
> > > > > > > > the bpftool about those btf_ids, I'm not sure how resolve  
> them and
> > > > > > > > always assume vmlinux for now.
> > > > > > > yeah, that btf_id above is the cgroup-lsm prog's btf_id which  
> has its
> > > > > > > func info, line info...etc.   It is not the one the  
> attach_btf_id correspond
> > > > > > > to.  attach_btf_id refers to either aux->attach_btf or  
> aux->dst_prog's btf (or
> > > > > > > target btf id here).
> > > > > > >
> > > > > > > It needs a consensus on where this attach_btf_id, target btf  
> id, and
> > > > > > > prog_attach_flags should be.  If I read the patch 7 thread  
> correctly,
> > > > > > > I think Andrii is suggesting to expose them to userspace  
> through link, so
> > > > > > > potentially putting them in bpf_link_info.  The  
> bpf_prog_query will
> > > > > > > output a list of link ids.  The same probably applies to
> > > > > >
> > > > > > Yep and I think it makes sense because link is representing one
> > > > > > specific attachment (and I presume flags can be stored inside  
> the link
> > > > > > itself as well, right?).
> > > > > >
> > > > > > But if legacy non-link BPF_PROG_ATTACH is supported then using
> > > > > > bpf_link_info won't cover legacy prog-only attachments.
> > > > >
> > > > > I don't have any attachment to the legacy apis, I'm supporting  
> them
> > > > > only because it takes two lines of code; we can go link-only if  
> there
> > > > > is an agreement that it's inherently better.
> > > > >
> > > > > How about I keep sys_bpf(BPF_PROG_QUERY) as is and I do a loop in  
> the
> > > > > userspace (for BPF_LSM_CGROUP only) over all links
> > > > > (BPF_LINK_GET_NEXT_ID) and will find the the ones with matching  
> prog
> > > > > ids (BPF_LINK_GET_FD_BY_ID+BPF_OBJ_GET_INFO_BY_FD)?
> > > > >
> > > > > That way we keep new fields in bpf_link_info, but we don't have to
> > > > > extend sys_bpf(BPF_PROG_QUERY) because there doesn't seem to be a  
> good
> > > > > way to do it. Exporting links via new link_fds would mean we'd  
> have to
> > > > > support BPF_F_QUERY_EFFECTIVE, but getting an effective array of  
> links
> > > > > seems to be messy. If, in the future, we figure out a better way  
> to
> I don't see a clean way to get effective array from one individual
> link[_info] through link iteration.  effective array is the progs that
> will be run at a cgroup and in such order.  The prog running at a
> cgroup doesn't necessarily linked to that cgroup.

Yeah, that's the problem with exposing links via prog_info; getting an
effective list is painful.

> If staying with BPF_PROG_QUERY+BPF_F_QUERY_EFFECTIVE to get effective  
> array
> and if it is decided the addition should be done in bpf_link_info,
> then a list of link ids needs to be output instead of the current list of
> prog ids.  The old attach type will still have to stay with the list of
> prog ids though :/

> It will be sad not to be able to get effective only for BPF_LSM_CGROUP.
> I found it more useful to show what will be run at a cgroup and in such
> order instead of what is linked to a cgroup.

See my hacky proof-of-concept below (on top of this series).

I think if we keep prog_info as is (don't export anything new, don't
export the list of links), iterating through all links on the host should
work, right? We get the prog_ids list (effective or not, doesn't matter),
then we go through all the links and find the ones with the same
prog_id (we can ignore the cgroup, it shouldn't matter). Then we can export
attach_type/attach_btf_id/etc. If it happens to be slow in the future,
we can improve with some tbd interface to get the list of links for a cgroup
(and then we'd have to care about the effective list).
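
Condensed, that lookup is just (untested sketch of the loop the bpftool diff
below adds; error handling and the prog_ids matching are elided, and it
assumes the usual <bpf/bpf.h> helpers):

	__u32 id = 0, len;
	struct bpf_link_info info;
	int fd;

	while (!bpf_link_get_next_id(id, &id)) {
		fd = bpf_link_get_fd_by_id(id);
		if (fd < 0)
			continue;

		len = sizeof(info);
		memset(&info, 0, len);
		if (!bpf_obj_get_info_by_fd(fd, &info, &len) &&
		    info.type == BPF_LINK_TYPE_CGROUP) {
			/* compare info.prog_id against the ids returned by
			 * BPF_PROG_QUERY, then read the new cgroup.attach_*
			 * fields from info
			 */
		}
		close(fd);
	}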

But the problem with going link-only is that I'd have to teach bpftool
to use links for BPF_LSM_CGROUP, and that brings a bunch of problems:
* I'd have to pin those links somewhere to make them stick around
* Those pin paths essentially become an API now because "detach" now
  depends on them?
* (right now it automatically works with the legacy apis without any changes)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 269ad43b68c1..f34b64b9ba97 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -6049,6 +6049,9 @@ struct bpf_link_info {
  		struct {
  			__u64 cgroup_id;
  			__u32 attach_type;
+			__u32 attach_flags;
+			__u32 attach_btf_id;
+			__u32 attach_btf_obj_id;
  		} cgroup;
  		struct {
  			__aligned_u64 target_name; /* in/out: target_name buffer ptr */
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index e05f7a11b45a..b5159e7f64f5 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -1211,6 +1211,20 @@ static int bpf_cgroup_link_fill_link_info(const struct bpf_link *link,

  	info->cgroup.cgroup_id = cg_id;
  	info->cgroup.attach_type = cg_link->type;
+
+	info->cgroup.attach_btf_id = cg_link->link.prog->aux->attach_btf_id;
+	if (cg_link->link.prog->aux->attach_btf)
+		info->cgroup.attach_btf_obj_id = btf_obj_id(cg_link->link.prog->aux->attach_btf);
+
+	if (cg_link->cgroup) {
+		int atype;
+
+		mutex_lock(&cgroup_mutex);
+		atype = bpf_cgroup_atype_find(cg_link->type, cg_link->link.prog->aux->attach_btf_id);
+		info->cgroup.attach_flags = cg_link->cgroup->bpf.flags[atype];
+		mutex_unlock(&cgroup_mutex);
+	}
+
  	return 0;
  }

diff --git a/tools/bpf/bpftool/cgroup.c b/tools/bpf/bpftool/cgroup.c
index f40d4745711c..3cece48ebaa9 100644
--- a/tools/bpf/bpftool/cgroup.c
+++ b/tools/bpf/bpftool/cgroup.c
@@ -49,15 +49,34 @@ static enum bpf_attach_type parse_attach_type(const char *str)
  }

  static int show_bpf_prog(int id, enum bpf_attach_type attach_type,
-			 const char *attach_flags_str,
+			 __u32 attach_flags,
+			 __u32 attach_btf_id,
+			 __u32 attach_btf_obj_id,
  			 int level)
  {
  	char prog_name[MAX_PROG_FULL_NAME];
  	const char *attach_btf_name = NULL;
  	struct bpf_prog_info info = {};
  	__u32 info_len = sizeof(info);
+	const char *attach_flags_str;
+	char buf[32];
  	int prog_fd;

+	switch (attach_flags) {
+	case BPF_F_ALLOW_MULTI:
+		attach_flags_str = "multi";
+		break;
+	case BPF_F_ALLOW_OVERRIDE:
+		attach_flags_str = "override";
+		break;
+	case 0:
+		attach_flags_str = "";
+		break;
+	default:
+		snprintf(buf, sizeof(buf), "unknown(%x)", attach_flags);
+		attach_flags_str = buf;
+	}
+
  	prog_fd = bpf_prog_get_fd_by_id(id);
  	if (prog_fd < 0)
  		return -1;
@@ -68,13 +87,13 @@ static int show_bpf_prog(int id, enum bpf_attach_type attach_type,
  	}

  	if (btf_vmlinux &&
-	    info.attach_btf_id < btf__type_cnt(btf_vmlinux)) {
-		/* Note, we ignore info.btf_id for now. There
+	    attach_btf_id < btf__type_cnt(btf_vmlinux)) {
+		/* Note, we ignore attach_btf_obj_id for now. There
  		 * is no good way to resolve btf_id to vmlinux
  		 * or module btf.
  		 */
  		const struct btf_type *t = btf__type_by_id(btf_vmlinux,
-							   info.attach_btf_id);
+							   attach_btf_id);
  		attach_btf_name = btf__name_by_offset(btf_vmlinux,
  						      t->name_off);
  	}
@@ -93,8 +112,8 @@ static int show_bpf_prog(int id, enum bpf_attach_type attach_type,
  		jsonw_string_field(json_wtr, "name", prog_name);
  		if (attach_btf_name)
  			jsonw_string_field(json_wtr, "attach_btf_name", attach_btf_name);
-		jsonw_uint_field(json_wtr, "btf_id", info.btf_id);
-		jsonw_uint_field(json_wtr, "attach_btf_id", info.attach_btf_id);
+		jsonw_uint_field(json_wtr, "btf_id", attach_btf_obj_id);
+		jsonw_uint_field(json_wtr, "attach_btf_id", attach_btf_id);
  		jsonw_end_object(json_wtr);
  	} else {
  		printf("%s%-8u ", level ? "    " : "", info.id);
@@ -105,8 +124,8 @@ static int show_bpf_prog(int id, enum bpf_attach_type attach_type,
  		printf(" %-15s %-15s", attach_flags_str, prog_name);
  		if (attach_btf_name)
  			printf(" %-15s", attach_btf_name);
-		else if (info.attach_btf_id)
-			printf(" btf_id=%d attach_btf_id=%d", info.btf_id, info.attach_btf_id);
+		else if (attach_btf_id)
+			printf(" btf_id=%d attach_btf_id=%d", attach_btf_obj_id, attach_btf_id);
  		printf("\n");
  	}

@@ -150,10 +169,10 @@ static int show_attached_bpf_progs(int cgroup_fd, enum bpf_attach_type type,
  				   int level)
  {
  	LIBBPF_OPTS(bpf_prog_query_opts, p);
-	const char *attach_flags_str;
  	__u32 prog_ids[1024] = {0};
  	__u32 attach_prog_flags[1024] = {0};
-	char buf[32];
+	__u32 attach_btf_id[1024] = {0};
+	__u32 attach_btf_obj_id[1024] = {0};
  	__u32 iter;
  	int ret;

@@ -168,29 +187,58 @@ static int show_attached_bpf_progs(int cgroup_fd, enum bpf_attach_type type,
  	if (p.prog_cnt == 0)
  		return 0;

-	for (iter = 0; iter < p.prog_cnt; iter++) {
-		__u32 attach_flags;
+	if (type == BPF_LSM_CGROUP) {
+		/* Match prog_id to the link to find out link info. */
+		struct bpf_link_info link_info;
+		__u32 id = 0;
+		__u32 link_len;
+		int err;
+		int fd;
+
+		while (1) {
+			err = bpf_link_get_next_id(id, &id);
+			if (err) {
+				if (errno == ENOENT)
+					break;
+				continue;
+			}

-		attach_flags = attach_prog_flags[iter] ?: p.attach_flags;
+			fd = bpf_link_get_fd_by_id(id);
+			if (fd < 0)
+				continue;

-		switch (attach_flags) {
-		case BPF_F_ALLOW_MULTI:
-			attach_flags_str = "multi";
-			break;
-		case BPF_F_ALLOW_OVERRIDE:
-			attach_flags_str = "override";
-			break;
-		case 0:
-			attach_flags_str = "";
-			break;
-		default:
-			snprintf(buf, sizeof(buf), "unknown(%x)", attach_flags);
-			attach_flags_str = buf;
+			link_len = sizeof(struct bpf_link_info);
+			memset(&link_info, 0, link_len);
+			err = bpf_obj_get_info_by_fd(fd, &link_info, &link_len);
+			if (err) {
+				close(fd);
+				continue;
+			}
+
+			if (link_info.type != BPF_LINK_TYPE_CGROUP) {
+				close(fd);
+				continue;
+			}
+
+			for (iter = 0; iter < p.prog_cnt; iter++) {
+				if (prog_ids[iter] != link_info.prog_id)
+					continue;
+
+				fprintf(stderr, "\nprog_fd=%d btf_id=%d\n", link_info.prog_id, link_info.cgroup.attach_btf_id);
+
+				attach_prog_flags[iter] = link_info.cgroup.attach_flags;
+				attach_btf_id[iter] = link_info.cgroup.attach_btf_id;
+				attach_btf_obj_id[iter] = link_info.cgroup.attach_btf_obj_id;
+			}
  		}
+	}

+	for (iter = 0; iter < p.prog_cnt; iter++)
  		show_bpf_prog(prog_ids[iter], type,
-			      attach_flags_str, level);
-	}
+			      attach_prog_flags[iter] ?: p.attach_flags,
+			      attach_btf_id[iter],
+			      attach_btf_obj_id[iter],
+			      level);

  	return 0;
  }
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 269ad43b68c1..f34b64b9ba97 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -6049,6 +6049,9 @@ struct bpf_link_info {
  		struct {
  			__u64 cgroup_id;
  			__u32 attach_type;
+			__u32 attach_flags;
+			__u32 attach_btf_id;
+			__u32 attach_btf_obj_id;
  		} cgroup;
  		struct {
  			__aligned_u64 target_name; /* in/out: target_name buffer ptr */

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-05-25 21:25                     ` sdf
@ 2022-05-26  0:03                       ` Martin KaFai Lau
  2022-05-26  1:23                         ` Martin KaFai Lau
  0 siblings, 1 reply; 54+ messages in thread
From: Martin KaFai Lau @ 2022-05-26  0:03 UTC (permalink / raw)
  To: sdf
  Cc: Andrii Nakryiko, Networking, bpf, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko

On Wed, May 25, 2022 at 02:25:54PM -0700, sdf@google.com wrote:
> On 05/25, Martin KaFai Lau wrote:
> > On Wed, May 25, 2022 at 10:02:07AM -0700, Stanislav Fomichev wrote:
> > > On Wed, May 25, 2022 at 9:01 AM Stanislav Fomichev <sdf@google.com>
> > wrote:
> > > >
> > > > On Tue, May 24, 2022 at 9:39 PM Andrii Nakryiko
> > > > <andrii.nakryiko@gmail.com> wrote:
> > > > >
> > > > > On Tue, May 24, 2022 at 9:03 PM Stanislav Fomichev
> > <sdf@google.com> wrote:
> > > > > >
> > > > > > On Tue, May 24, 2022 at 4:45 PM Andrii Nakryiko
> > > > > > <andrii.nakryiko@gmail.com> wrote:
> > > > > > >
> > > > > > > On Tue, May 24, 2022 at 10:50 AM Martin KaFai Lau
> > <kafai@fb.com> wrote:
> > > > > > > >
> > > > > > > > On Tue, May 24, 2022 at 08:55:04AM -0700, Stanislav Fomichev
> > wrote:
> > > > > > > > > On Mon, May 23, 2022 at 8:49 PM Martin KaFai Lau
> > <kafai@fb.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Wed, May 18, 2022 at 03:55:25PM -0700, Stanislav
> > Fomichev wrote:
> > > > > > > > > > > We have two options:
> > > > > > > > > > > 1. Treat all BPF_LSM_CGROUP the same, regardless of
> > attach_btf_id
> > > > > > > > > > > 2. Treat BPF_LSM_CGROUP+attach_btf_id as a separate
> > hook point
> > > > > > > > > > >
> > > > > > > > > > > I was doing (2) in the original patch, but switching
> > to (1) here:
> > > > > > > > > > >
> > > > > > > > > > > * bpf_prog_query returns all attached BPF_LSM_CGROUP
> > programs
> > > > > > > > > > > regardless of attach_btf_id
> > > > > > > > > > > * attach_btf_id is exported via bpf_prog_info
> > > > > > > > > > >
> > > > > > > > > > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > > > > > > > > > ---
> > > > > > > > > > >  include/uapi/linux/bpf.h |   5 ++
> > > > > > > > > > >  kernel/bpf/cgroup.c      | 103
> > +++++++++++++++++++++++++++------------
> > > > > > > > > > >  kernel/bpf/syscall.c     |   4 +-
> > > > > > > > > > >  3 files changed, 81 insertions(+), 31 deletions(-)
> > > > > > > > > > >
> > > > > > > > > > > diff --git a/include/uapi/linux/bpf.h
> > b/include/uapi/linux/bpf.h
> > > > > > > > > > > index b9d2d6de63a7..432fc5f49567 100644
> > > > > > > > > > > --- a/include/uapi/linux/bpf.h
> > > > > > > > > > > +++ b/include/uapi/linux/bpf.h
> > > > > > > > > > > @@ -1432,6 +1432,7 @@ union bpf_attr {
> > > > > > > > > > >               __u32           attach_flags;
> > > > > > > > > > >               __aligned_u64   prog_ids;
> > > > > > > > > > >               __u32           prog_cnt;
> > > > > > > > > > > +             __aligned_u64   prog_attach_flags; /*
> > output: per-program attach_flags */
> > > > > > > > > > >       } query;
> > > > > > > > > > >
> > > > > > > > > > >       struct { /* anonymous struct used by
> > BPF_RAW_TRACEPOINT_OPEN command */
> > > > > > > > > > > @@ -5911,6 +5912,10 @@ struct bpf_prog_info {
> > > > > > > > > > >       __u64 run_cnt;
> > > > > > > > > > >       __u64 recursion_misses;
> > > > > > > > > > >       __u32 verified_insns;
> > > > > > > > > > > +     /* BTF ID of the function to attach to within
> > BTF object identified
> > > > > > > > > > > +      * by btf_id.
> > > > > > > > > > > +      */
> > > > > > > > > > > +     __u32 attach_btf_func_id;
> > > > > > > > > > >  } __attribute__((aligned(8)));
> > > > > > > > > > >
> > > > > > > > > > >  struct bpf_map_info {
> > > > > > > > > > > diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> > > > > > > > > > > index a959cdd22870..08a1015ee09e 100644
> > > > > > > > > > > --- a/kernel/bpf/cgroup.c
> > > > > > > > > > > +++ b/kernel/bpf/cgroup.c
> > > > > > > > > > > @@ -1074,6 +1074,7 @@ static int
> > cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
> > > > > > > > > > >  static int __cgroup_bpf_query(struct cgroup *cgrp,
> > const union bpf_attr *attr,
> > > > > > > > > > >                             union bpf_attr __user
> > *uattr)
> > > > > > > > > > >  {
> > > > > > > > > > > +     __u32 __user *prog_attach_flags =
> > u64_to_user_ptr(attr->query.prog_attach_flags);
> > > > > > > > > > >       __u32 __user *prog_ids =
> > u64_to_user_ptr(attr->query.prog_ids);
> > > > > > > > > > >       enum bpf_attach_type type =
> > attr->query.attach_type;
> > > > > > > > > > >       enum cgroup_bpf_attach_type atype;
> > > > > > > > > > > @@ -1081,50 +1082,92 @@ static int
> > __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
> > > > > > > > > > >       struct hlist_head *progs;
> > > > > > > > > > >       struct bpf_prog *prog;
> > > > > > > > > > >       int cnt, ret = 0, i;
> > > > > > > > > > > +     int total_cnt = 0;
> > > > > > > > > > >       u32 flags;
> > > > > > > > > > >
> > > > > > > > > > > -     atype = to_cgroup_bpf_attach_type(type);
> > > > > > > > > > > -     if (atype < 0)
> > > > > > > > > > > -             return -EINVAL;
> > > > > > > > > > > +     enum cgroup_bpf_attach_type from_atype, to_atype;
> > > > > > > > > > >
> > > > > > > > > > > -     progs = &cgrp->bpf.progs[atype];
> > > > > > > > > > > -     flags = cgrp->bpf.flags[atype];
> > > > > > > > > > > +     if (type == BPF_LSM_CGROUP) {
> > > > > > > > > > > +             from_atype = CGROUP_LSM_START;
> > > > > > > > > > > +             to_atype = CGROUP_LSM_END;
> > > > > > > > > > > +     } else {
> > > > > > > > > > > +             from_atype =
> > to_cgroup_bpf_attach_type(type);
> > > > > > > > > > > +             if (from_atype < 0)
> > > > > > > > > > > +                     return -EINVAL;
> > > > > > > > > > > +             to_atype = from_atype;
> > > > > > > > > > > +     }
> > > > > > > > > > >
> > > > > > > > > > > -     effective =
> > rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > > > > > > > > > -
> > lockdep_is_held(&cgroup_mutex));
> > > > > > > > > > > +     for (atype = from_atype; atype <= to_atype;
> > atype++) {
> > > > > > > > > > > +             progs = &cgrp->bpf.progs[atype];
> > > > > > > > > > > +             flags = cgrp->bpf.flags[atype];
> > > > > > > > > > >
> > > > > > > > > > > -     if (attr->query.query_flags &
> > BPF_F_QUERY_EFFECTIVE)
> > > > > > > > > > > -             cnt = bpf_prog_array_length(effective);
> > > > > > > > > > > -     else
> > > > > > > > > > > -             cnt = prog_list_length(progs);
> > > > > > > > > > > +             effective =
> > rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > > > > > > > > > +
> > lockdep_is_held(&cgroup_mutex));
> > > > > > > > > > >
> > > > > > > > > > > -     if (copy_to_user(&uattr->query.attach_flags,
> > &flags, sizeof(flags)))
> > > > > > > > > > > -             return -EFAULT;
> > > > > > > > > > > -     if (copy_to_user(&uattr->query.prog_cnt, &cnt,
> > sizeof(cnt)))
> > > > > > > > > > > +             if (attr->query.query_flags &
> > BPF_F_QUERY_EFFECTIVE)
> > > > > > > > > > > +                     total_cnt +=
> > bpf_prog_array_length(effective);
> > > > > > > > > > > +             else
> > > > > > > > > > > +                     total_cnt +=
> > prog_list_length(progs);
> > > > > > > > > > > +     }
> > > > > > > > > > > +
> > > > > > > > > > > +     if (type != BPF_LSM_CGROUP)
> > > > > > > > > > > +             if
> > (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
> > > > > > > > > > > +                     return -EFAULT;
> > > > > > > > > > > +     if (copy_to_user(&uattr->query.prog_cnt,
> > &total_cnt, sizeof(total_cnt)))
> > > > > > > > > > >               return -EFAULT;
> > > > > > > > > > > -     if (attr->query.prog_cnt == 0 || !prog_ids ||
> > !cnt)
> > > > > > > > > > > +     if (attr->query.prog_cnt == 0 || !prog_ids ||
> > !total_cnt)
> > > > > > > > > > >               /* return early if user requested only
> > program count + flags */
> > > > > > > > > > >               return 0;
> > > > > > > > > > > -     if (attr->query.prog_cnt < cnt) {
> > > > > > > > > > > -             cnt = attr->query.prog_cnt;
> > > > > > > > > > > +
> > > > > > > > > > > +     if (attr->query.prog_cnt < total_cnt) {
> > > > > > > > > > > +             total_cnt = attr->query.prog_cnt;
> > > > > > > > > > >               ret = -ENOSPC;
> > > > > > > > > > >       }
> > > > > > > > > > >
> > > > > > > > > > > -     if (attr->query.query_flags &
> > BPF_F_QUERY_EFFECTIVE) {
> > > > > > > > > > > -             return
> > bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> > > > > > > > > > > -     } else {
> > > > > > > > > > > -             struct bpf_prog_list *pl;
> > > > > > > > > > > -             u32 id;
> > > > > > > > > > > +     for (atype = from_atype; atype <= to_atype;
> > atype++) {
> > > > > > > > > > > +             if (total_cnt <= 0)
> > > > > > > > > > > +                     break;
> > > > > > > > > > >
> > > > > > > > > > > -             i = 0;
> > > > > > > > > > > -             hlist_for_each_entry(pl, progs, node) {
> > > > > > > > > > > -                     prog = prog_list_prog(pl);
> > > > > > > > > > > -                     id = prog->aux->id;
> > > > > > > > > > > -                     if (copy_to_user(prog_ids + i,
> > &id, sizeof(id)))
> > > > > > > > > > > -                             return -EFAULT;
> > > > > > > > > > > -                     if (++i == cnt)
> > > > > > > > > > > -                             break;
> > > > > > > > > > > +             progs = &cgrp->bpf.progs[atype];
> > > > > > > > > > > +             flags = cgrp->bpf.flags[atype];
> > > > > > > > > > > +
> > > > > > > > > > > +             effective =
> > rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > > > > > > > > > +
> > lockdep_is_held(&cgroup_mutex));
> > > > > > > > > > > +
> > > > > > > > > > > +             if (attr->query.query_flags &
> > BPF_F_QUERY_EFFECTIVE)
> > > > > > > > > > > +                     cnt =
> > bpf_prog_array_length(effective);
> > > > > > > > > > > +             else
> > > > > > > > > > > +                     cnt = prog_list_length(progs);
> > > > > > > > > > > +
> > > > > > > > > > > +             if (cnt >= total_cnt)
> > > > > > > > > > > +                     cnt = total_cnt;
> > > > > > > > > > > +
> > > > > > > > > > > +             if (attr->query.query_flags &
> > BPF_F_QUERY_EFFECTIVE) {
> > > > > > > > > > > +                     ret =
> > bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> > > > > > > > > > > +             } else {
> > > > > > > > > > > +                     struct bpf_prog_list *pl;
> > > > > > > > > > > +                     u32 id;
> > > > > > > > > > > +
> > > > > > > > > > > +                     i = 0;
> > > > > > > > > > > +                     hlist_for_each_entry(pl, progs,
> > node) {
> > > > > > > > > > > +                             prog = prog_list_prog(pl);
> > > > > > > > > > > +                             id = prog->aux->id;
> > > > > > > > > > > +                             if
> > (copy_to_user(prog_ids + i, &id, sizeof(id)))
> > > > > > > > > > > +                                     return -EFAULT;
> > > > > > > > > > > +                             if (++i == cnt)
> > > > > > > > > > > +                                     break;
> > > > > > > > > > > +                     }
> > > > > > > > > > >               }
> > > > > > > > > > > +
> > > > > > > > > > > +             if (prog_attach_flags)
> > > > > > > > > > > +                     for (i = 0; i < cnt; i++)
> > > > > > > > > > > +                             if
> > (copy_to_user(prog_attach_flags + i, &flags, sizeof(flags)))
> > > > > > > > > > > +                                     return -EFAULT;
> > > > > > > > > > > +
> > > > > > > > > > > +             prog_ids += cnt;
> > > > > > > > > > > +             total_cnt -= cnt;
> > > > > > > > > > > +             if (prog_attach_flags)
> > > > > > > > > > > +                     prog_attach_flags += cnt;
> > > > > > > > > > >       }
> > > > > > > > > > >       return ret;
> > > > > > > > > > >  }
> > > > > > > > > > > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > > > > > > > > > > index 5ed2093e51cc..4137583c04a2 100644
> > > > > > > > > > > --- a/kernel/bpf/syscall.c
> > > > > > > > > > > +++ b/kernel/bpf/syscall.c
> > > > > > > > > > > @@ -3520,7 +3520,7 @@ static int bpf_prog_detach(const
> > union bpf_attr *attr)
> > > > > > > > > > >       }
> > > > > > > > > > >  }
> > > > > > > > > > >
> > > > > > > > > > > -#define BPF_PROG_QUERY_LAST_FIELD query.prog_cnt
> > > > > > > > > > > +#define BPF_PROG_QUERY_LAST_FIELD
> > query.prog_attach_flags
> > > > > > > > > > >
> > > > > > > > > > >  static int bpf_prog_query(const union bpf_attr *attr,
> > > > > > > > > > >                         union bpf_attr __user *uattr)
> > > > > > > > > > > @@ -3556,6 +3556,7 @@ static int bpf_prog_query(const
> > union bpf_attr *attr,
> > > > > > > > > > >       case BPF_CGROUP_SYSCTL:
> > > > > > > > > > >       case BPF_CGROUP_GETSOCKOPT:
> > > > > > > > > > >       case BPF_CGROUP_SETSOCKOPT:
> > > > > > > > > > > +     case BPF_LSM_CGROUP:
> > > > > > > > > > >               return cgroup_bpf_prog_query(attr, uattr);
> > > > > > > > > > >       case BPF_LIRC_MODE2:
> > > > > > > > > > >               return lirc_prog_query(attr, uattr);
> > > > > > > > > > > @@ -4066,6 +4067,7 @@ static int
> > bpf_prog_get_info_by_fd(struct file *file,
> > > > > > > > > > >
> > > > > > > > > > >       if (prog->aux->btf)
> > > > > > > > > > >               info.btf_id = btf_obj_id(prog->aux->btf);
> > > > > > > > > > > +     info.attach_btf_func_id =
> > prog->aux->attach_btf_id;
> > > > > > > > > > Note that exposing prog->aux->attach_btf_id only may not
> > be enough
> > > > > > > > > > unless it can assume info.attach_btf_id is always
> > referring to btf_vmlinux
> > > > > > > > > > for all bpf prog types.
> > > > > > > > >
> > > > > > > > > We also export btf_id two lines above, right? Btw, I left
> > a comment in
> > > > > > > > > the bpftool about those btf_ids, I'm not sure how resolve
> > them and
> > > > > > > > > always assume vmlinux for now.
> > > > > > > > yeah, that btf_id above is the cgroup-lsm prog's btf_id
> > which has its
> > > > > > > > func info, line info...etc.   It is not the one the
> > attach_btf_id correspond
> > > > > > > > to.  attach_btf_id refers to either aux->attach_btf or
> > aux->dst_prog's btf (or
> > > > > > > > target btf id here).
> > > > > > > >
> > > > > > > > It needs a consensus on where this attach_btf_id, target btf
> > id, and
> > > > > > > > prog_attach_flags should be.  If I read the patch 7 thread
> > correctly,
> > > > > > > > I think Andrii is suggesting to expose them to userspace
> > through link, so
> > > > > > > > potentially putting them in bpf_link_info.  The
> > bpf_prog_query will
> > > > > > > > output a list of link ids.  The same probably applies to
> > > > > > >
> > > > > > > Yep and I think it makes sense because link is representing one
> > > > > > > specific attachment (and I presume flags can be stored inside
> > the link
> > > > > > > itself as well, right?).
> > > > > > >
> > > > > > > But if legacy non-link BPF_PROG_ATTACH is supported then using
> > > > > > > bpf_link_info won't cover legacy prog-only attachments.
> > > > > >
> > > > > > I don't have any attachment to the legacy apis, I'm supporting
> > them
> > > > > > only because it takes two lines of code; we can go link-only if
> > there
> > > > > > is an agreement that it's inherently better.
> > > > > >
> > > > > > How about I keep sys_bpf(BPF_PROG_QUERY) as is and I do a loop
> > in the
> > > > > > userspace (for BPF_LSM_CGROUP only) over all links
> > > > > > (BPF_LINK_GET_NEXT_ID) and will find the the ones with matching
> > prog
> > > > > > ids (BPF_LINK_GET_FD_BY_ID+BPF_OBJ_GET_INFO_BY_FD)?
> > > > > >
> > > > > > That way we keep new fields in bpf_link_info, but we don't have to
> > > > > > extend sys_bpf(BPF_PROG_QUERY) because there doesn't seem to be
> > a good
> > > > > > way to do it. Exporting links via new link_fds would mean we'd
> > have to
> > > > > > support BPF_F_QUERY_EFFECTIVE, but getting an effective array of
> > links
> > > > > > seems to be messy. If, in the future, we figure out a better way
> > to
> > I don't see a clean way to get effective array from one individual
> > link[_info] through link iteration.  effective array is the progs that
> > will be run at a cgroup and in such order.  The prog running at a
> > cgroup doesn't necessarily linked to that cgroup.
> 
> Yeah, that's the problem with exposing links via prog_info; getting an
> effective list is painful.
> 
> > If staying with BPF_PROG_QUERY+BPF_F_QUERY_EFFECTIVE to get effective
> > array
> > and if it is decided the addition should be done in bpf_link_info,
> > then a list of link ids needs to be output instead of the current list of
> > prog ids.  The old attach type will still have to stay with the list of
> > prog ids though :/
> 
> > It will be sad not to be able to get effective only for BPF_LSM_CGROUP.
> > I found it more useful to show what will be run at a cgroup and in such
> > order instead of what is linked to a cgroup.
> 
> See my hacky proof-of-concept below (on top of this series).
Yeah, the PoC makes sense, and I don't mind it, considering that
adding these fields to bpf_link_info (or bpf_prog_info) will be useful in
general even without this use case.

A quick thought: this is sort of partly going back to v6, but
iterating over different things instead of the bpf_lsm hooks.

> 
> I think if we keep prog_info as is (don't export anything new, don't
> export the list of links), iterating through all links on the host should
> work,
> right? We get prog_ids list (effective or not, doesn't matter), then we
> go through all the links and find the ones with with the same
> prog_id (we can ignore cgroup, it shouldn't matter). Then we can export
> attach_type/attach_btf_id/etc. If it happens to be slow in the future,
> we can improve with some tbd interface to get the list of links for cgroup
> (and then we'd have to care about effective list).
> 
> But the problem with going link-only is that I'd have to teach bpftool
> to use links for BPF_LSM_CGROUP and it brings a bunch of problems:
> * I'd have to pin those links somewhere to make them stick around
> * Those pin paths essentially become an API now because "detach" now
>   depends on them?
> * (right now it automatically works with the legacy apis without any
> changes)
That is already the current API for all links (tracing, cgroup...).  The
attachment goes away (detaches) with the process unless the link is pinned.
But yeah, it would be a new exception in the "bpftool cgroup" subcommand,
only for BPF_LSM_CGROUP.

If that is an issue for your use case, maybe go back to v6, which extends
the query bpf_attr with attach_btf_id and supports both attach APIs?
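
For reference, in libbpf terms the link flow for any cgroup prog is roughly
(untested sketch; the prog name and pin path are made up):

	struct bpf_link *link;

	link = bpf_program__attach_cgroup(skel->progs.egress_filter, cgroup_fd);
	if (libbpf_get_error(link))
		return;

	/* Without the pin, the attachment goes away once the last fd to
	 * the link is closed (e.g. when the attaching process exits).
	 */
	if (bpf_link__pin(link, "/sys/fs/bpf/my_cgroup_link"))
		/* still attached, but only for the lifetime of the fd */;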

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-05-26  0:03                       ` Martin KaFai Lau
@ 2022-05-26  1:23                         ` Martin KaFai Lau
  2022-05-26  2:50                           ` Stanislav Fomichev
  0 siblings, 1 reply; 54+ messages in thread
From: Martin KaFai Lau @ 2022-05-26  1:23 UTC (permalink / raw)
  To: sdf
  Cc: Andrii Nakryiko, Networking, bpf, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko

On Wed, May 25, 2022 at 05:03:40PM -0700, Martin KaFai Lau wrote:
> > But the problem with going link-only is that I'd have to teach bpftool
> > to use links for BPF_LSM_CGROUP and it brings a bunch of problems:
> > * I'd have to pin those links somewhere to make them stick around
> > * Those pin paths essentially become an API now because "detach" now
> >   depends on them?
> > * (right now it automatically works with the legacy apis without any
> > changes)
> It is already the current API for all links (tracing, cgroup...).  It goes
> away (detach) with the process unless it is pinned.  but yeah, it will
> be a new exception in the "bpftool cgroup" subcommand only for
> BPF_LSM_CGROUP.
> 
> If it is an issue with your use case, may be going back to v6 that extends
> the query bpf_attr with attach_btf_id and support both attach API ?
[ hit send too early... ]
...or extend bpf_prog_info, as you also mentioned in the earlier reply.
They all seem to have their ups and downs.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-05-26  1:23                         ` Martin KaFai Lau
@ 2022-05-26  2:50                           ` Stanislav Fomichev
  2022-05-31 23:08                             ` Andrii Nakryiko
  0 siblings, 1 reply; 54+ messages in thread
From: Stanislav Fomichev @ 2022-05-26  2:50 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Andrii Nakryiko, Networking, bpf, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko

On Wed, May 25, 2022 at 6:23 PM Martin KaFai Lau <kafai@fb.com> wrote:
>
> On Wed, May 25, 2022 at 05:03:40PM -0700, Martin KaFai Lau wrote:
> > > But the problem with going link-only is that I'd have to teach bpftool
> > > to use links for BPF_LSM_CGROUP and it brings a bunch of problems:
> > > * I'd have to pin those links somewhere to make them stick around
> > > * Those pin paths essentially become an API now because "detach" now
> > >   depends on them?
> > > * (right now it automatically works with the legacy apis without any
> > > changes)
> > It is already the current API for all links (tracing, cgroup...).  It goes
> > away (detach) with the process unless it is pinned.  but yeah, it will
> > be a new exception in the "bpftool cgroup" subcommand only for
> > BPF_LSM_CGROUP.
> >
> > If it is an issue with your use case, may be going back to v6 that extends
> > the query bpf_attr with attach_btf_id and support both attach API ?
> [ hit sent too early... ]
> or extending the bpf_prog_info as you also mentioned in the earlier reply.
> It seems all have their ups and downs.

I'm thinking of putting everything I need into bpf_prog_info and
exporting a list of attach_flags in prog_query (as it's done here in
v7, plus adding attach_btf_obj_id).
I'm a bit concerned about special-casing bpf_lsm_cgroup even more if we
go with a link-only api :-(
I can definitely also put this info into bpf_link_info, but I'm not
sure what Andrii's preference is. I'm assuming he was suggesting to do
either bpf_prog_info or bpf_link_info, but not both?
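
The consumer side of that would look roughly like this (untested sketch
against the v7 uapi quoted above; raw syscall and array sizes are just for
illustration):

	union bpf_attr attr = {};
	__u32 prog_ids[64], prog_attach_flags[64];

	attr.query.target_fd = cgroup_fd;
	attr.query.attach_type = BPF_LSM_CGROUP;
	attr.query.prog_cnt = 64;
	attr.query.prog_ids = (__u64)(unsigned long)prog_ids;
	attr.query.prog_attach_flags = (__u64)(unsigned long)prog_attach_flags;

	if (!syscall(__NR_bpf, BPF_PROG_QUERY, &attr, sizeof(attr))) {
		/* prog_attach_flags[i] holds the flags for prog_ids[i];
		 * attach_btf_func_id (and, per the plan above,
		 * attach_btf_obj_id) would then come from each prog's
		 * bpf_prog_info via BPF_OBJ_GET_INFO_BY_FD.
		 */
	}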

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-05-26  2:50                           ` Stanislav Fomichev
@ 2022-05-31 23:08                             ` Andrii Nakryiko
  0 siblings, 0 replies; 54+ messages in thread
From: Andrii Nakryiko @ 2022-05-31 23:08 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Martin KaFai Lau, Networking, bpf, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko

On Wed, May 25, 2022 at 7:50 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> On Wed, May 25, 2022 at 6:23 PM Martin KaFai Lau <kafai@fb.com> wrote:
> >
> > On Wed, May 25, 2022 at 05:03:40PM -0700, Martin KaFai Lau wrote:
> > > > But the problem with going link-only is that I'd have to teach bpftool
> > > > to use links for BPF_LSM_CGROUP and it brings a bunch of problems:
> > > > * I'd have to pin those links somewhere to make them stick around
> > > > * Those pin paths essentially become an API now because "detach" now
> > > >   depends on them?
> > > > * (right now it automatically works with the legacy apis without any
> > > > changes)
> > > It is already the current API for all links (tracing, cgroup...).  It goes
> > > away (detach) with the process unless it is pinned.  but yeah, it will
> > > be a new exception in the "bpftool cgroup" subcommand only for
> > > BPF_LSM_CGROUP.
> > >
> > > If it is an issue with your use case, may be going back to v6 that extends
> > > the query bpf_attr with attach_btf_id and support both attach API ?
> > [ hit sent too early... ]
> > or extending the bpf_prog_info as you also mentioned in the earlier reply.
> > It seems all have their ups and downs.
>
> I'm thinking on putting everything I need into bpf_prog_info and
> exporting a list of attach_flags in prog_query (as it's done here in
> v7 + add attach_btf_obj_id).
> I'm a bit concerned with special casing bpf_lsm_cgroup even more if we
> go with a link-only api :-(
> I can definitely also put this info into bpf_link_info, but I'm not
> sure what's Andrii's preference? I'm assuming he was suggesting to do
> either bpf_prog_info or bpf_link_info, but not both?

I don't care much, tbh. Whichever makes most sense to you.

^ permalink raw reply	[flat|nested] 54+ messages in thread

end of thread, other threads:[~2022-05-31 23:08 UTC | newest]

Thread overview: 54+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-18 22:55 [PATCH bpf-next v7 00/11] bpf: cgroup_sock lsm flavor Stanislav Fomichev
2022-05-18 22:55 ` [PATCH bpf-next v7 01/11] bpf: add bpf_func_t and trampoline helpers Stanislav Fomichev
2022-05-20  0:45   ` Yonghong Song
2022-05-21  0:03     ` Stanislav Fomichev
2022-05-18 22:55 ` [PATCH bpf-next v7 02/11] bpf: convert cgroup_bpf.progs to hlist Stanislav Fomichev
2022-05-18 22:55 ` [PATCH bpf-next v7 03/11] bpf: per-cgroup lsm flavor Stanislav Fomichev
2022-05-20  1:00   ` Yonghong Song
2022-05-21  0:03     ` Stanislav Fomichev
2022-05-23 15:41       ` Yonghong Song
2022-05-21  0:53   ` Martin KaFai Lau
2022-05-24  2:15     ` Stanislav Fomichev
2022-05-24  5:40       ` Martin KaFai Lau
2022-05-24 15:56         ` Stanislav Fomichev
2022-05-24  5:57       ` Martin KaFai Lau
2022-05-18 22:55 ` [PATCH bpf-next v7 04/11] bpf: minimize number of allocated lsm slots per program Stanislav Fomichev
2022-05-21  6:56   ` Martin KaFai Lau
2022-05-24  2:14     ` Stanislav Fomichev
2022-05-24  5:53       ` Martin KaFai Lau
2022-05-18 22:55 ` [PATCH bpf-next v7 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP Stanislav Fomichev
2022-05-19  2:31   ` kernel test robot
2022-05-19 14:57   ` kernel test robot
2022-05-23 23:23   ` Andrii Nakryiko
2022-05-24  2:15     ` Stanislav Fomichev
2022-05-24  3:48   ` Martin KaFai Lau
2022-05-24 15:55     ` Stanislav Fomichev
2022-05-24 17:50       ` Martin KaFai Lau
2022-05-24 23:45         ` Andrii Nakryiko
2022-05-25  4:03           ` Stanislav Fomichev
2022-05-25  4:39             ` Andrii Nakryiko
2022-05-25 16:01               ` Stanislav Fomichev
2022-05-25 17:02                 ` Stanislav Fomichev
2022-05-25 20:39                   ` Martin KaFai Lau
2022-05-25 21:25                     ` sdf
2022-05-26  0:03                       ` Martin KaFai Lau
2022-05-26  1:23                         ` Martin KaFai Lau
2022-05-26  2:50                           ` Stanislav Fomichev
2022-05-31 23:08                             ` Andrii Nakryiko
2022-05-18 22:55 ` [PATCH bpf-next v7 06/11] bpf: allow writing to a subset of sock fields from lsm progtype Stanislav Fomichev
2022-05-18 22:55 ` [PATCH bpf-next v7 07/11] libbpf: implement bpf_prog_query_opts Stanislav Fomichev
2022-05-23 23:22   ` Andrii Nakryiko
2022-05-24  2:15     ` Stanislav Fomichev
2022-05-24  3:45       ` Andrii Nakryiko
2022-05-24  4:01         ` Martin KaFai Lau
2022-05-18 22:55 ` [PATCH bpf-next v7 08/11] libbpf: add lsm_cgoup_sock type Stanislav Fomichev
2022-05-23 23:26   ` Andrii Nakryiko
2022-05-24  2:15     ` Stanislav Fomichev
2022-05-18 22:55 ` [PATCH bpf-next v7 09/11] bpftool: implement cgroup tree for BPF_LSM_CGROUP Stanislav Fomichev
2022-05-18 22:55 ` [PATCH bpf-next v7 10/11] selftests/bpf: lsm_cgroup functional test Stanislav Fomichev
2022-05-18 22:55 ` [PATCH bpf-next v7 11/11] selftests/bpf: verify lsm_cgroup struct sock access Stanislav Fomichev
2022-05-23 23:33   ` Andrii Nakryiko
2022-05-24  2:15     ` Stanislav Fomichev
2022-05-24  3:46       ` Andrii Nakryiko
2022-05-19 23:34 ` [PATCH bpf-next v7 00/11] bpf: cgroup_sock lsm flavor Yonghong Song
2022-05-19 23:39   ` Stanislav Fomichev
