* [PATCH bpf-next v8 00/11] bpf: cgroup_sock lsm flavor
@ 2022-06-01 19:02 Stanislav Fomichev
  2022-06-01 19:02 ` [PATCH bpf-next v8 01/11] bpf: add bpf_func_t and trampoline helpers Stanislav Fomichev
                   ` (10 more replies)
  0 siblings, 11 replies; 35+ messages in thread
From: Stanislav Fomichev @ 2022-06-01 19:02 UTC (permalink / raw)
  To: netdev, bpf
  Cc: ast, daniel, andrii, Stanislav Fomichev, kafai, kpsingh, jakub

This series implements a new lsm flavor for attaching per-cgroup
programs to existing lsm hooks. The cgroup is taken from 'current',
unless the first argument of the hook is 'struct socket', in which
case the cgroup association is taken from the socket. The attachment
looks like a regular per-cgroup attachment: we add a new
BPF_LSM_CGROUP attach type which, together with attach_btf_id,
signals per-cgroup lsm. Behind the scenes, we allocate a trampoline
shim program and attach it to the lsm hook. This program looks up the
cgroup from current/socket and runs the cgroup's effective prog
array. The rest of the per-cgroup BPF machinery stays the same:
hierarchy, local storage, retval conventions (return 1 == success).
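
For illustration, a minimal BPF-side sketch of what a per-cgroup lsm
attachment looks like (the hook and the SEC("lsm_cgroup/...") name are
illustrative; the libbpf handling for it is added later in the series):

	SEC("lsm_cgroup/socket_bind")
	int BPF_PROG(bind_hook, struct socket *sock,
		     struct sockaddr *address, int addrlen)
	{
		/* return 1 == allow; return 0 == reject with -EPERM
		 * (a custom errno can be set via bpf_set_retval)
		 */
		return 1;
	}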

Current limitations:
* haven't considered sleepable bpf; can be extended later on
* not sure the verifier does the right thing with null checks;
  see latest selftest for details
* total of 10 (global) per-cgroup LSM attach points

Cc: ast@kernel.org
Cc: daniel@iogearbox.net
Cc: kafai@fb.com
Cc: kpsingh@kernel.org
Cc: jakub@cloudflare.com

v8:
- CI: fix compile issue
- CI: fix broken bpf_cookie
- Yonghong: remove __bpf_trampoline_unlink_prog comment
- Yonghong: move cgroup_atype around to fill the gap
- Yonghong: make bpf_lsm_find_cgroup_shim void
- Yonghong: rename regs to args
- Yonghong: remove if(current) check
- Martin: move refcnt into bpf_link
- Martin: move shim management to bpf_link ops
- Martin: use cgroup_atype for shim only
- Martin: go back to arrays for managing cgroup_atype(s)
- Martin: export bpf_obj_id(aux->attach_btf)
- Andrii: reorder SEC_DEF("lsm_cgroup+")
- Andrii: OPTS_SET instead of OPTS_HAS
- Andrii: rename attach_btf_func_id
- Andrii: move into 1.0 map

v7:
- there were a lot of comments last time; I hope I didn't forget
  anything. Some of the bigger ones:
  - Martin: use/extend BTF_SOCK_TYPE_SOCKET
  - Martin: expose bpf_set_retval
  - Martin: reject 'return 0' at the verifier for 'void' hooks
  - Martin: prog_query returns all BPF_LSM_CGROUP, prog_info
    returns attach_btf_func_id
  - Andrii: split libbpf changes
  - Andrii: add field access test to test_progs, not test_verifier (still
    using asm though)
- things that I haven't addressed, stated here explicitly; let
  me know if any of these are still problematic:
  1. Andrii: exposing only a link-based api: the changes to support
     the non-link-based one seem minimal, a couple of lines, so it
     seems worth keeping?
  2. Alexei: applying cgroup_atype to all cgroup hooks, not only
     cgroup lsm: this looks a bit harder to apply everywhere than I
     originally thought; with lsm cgroup, we have a shim_prog pointer
     where we store cgroup_atype; for non-lsm programs, we don't have
     a trace program to store it in, so we still need some kind of
     global table to map from the "static" hook to the "dynamic"
     slot. So I'm dropping the "can be easily extended" clause from
     the description for now. I have converted this whole machinery
     to an RCU-managed list to remove synchronize_rcu().
- also note that I had to introduce a new bpf_shim_tramp_link and
  move the refcnt there; we need something to manage the new
  bpf_tramp_link

v6:
- remove active count & stats for shim program (Martin KaFai Lau)
- remove NULL/error check for btf_vmlinux (Martin)
- don't check cgroup_atype in bpf_cgroup_lsm_shim_release (Martin)
- use old_prog (instead of passed one) in __cgroup_bpf_detach (Martin)
- make sure attach_btf_id is the same in __cgroup_bpf_replace (Martin)
- enable cgroup local storage and test it (Martin)
- properly implement prog query and add bpftool & tests (Martin)
- prohibit non-shared cgroup storage mode for BPF_LSM_CGROUP (Martin)

v5:
- __cgroup_bpf_run_lsm_socket remove NULL sock/sk checks (Martin KaFai Lau)
- __cgroup_bpf_run_lsm_{socket,current} s/prog/shim_prog/ (Martin)
- make sure bpf_lsm_find_cgroup_shim works for hooks without args (Martin)
- __cgroup_bpf_attach make sure attach_btf_id is the same when replacing (Martin)
- call bpf_cgroup_lsm_shim_release only for LSM_CGROUP (Martin)
- drop BPF_LSM_CGROUP from bpf_attach_type_to_tramp (Martin)
- drop jited check from cgroup_shim_find (Martin)
- new patch to convert cgroup_bpf to hlist_node (Jakub Sitnicki)
- new shim flavor for 'struct sock' + list of exceptions (Martin)

v4:
- fix build when jit is on but syscall is off

v3:
- add BPF_LSM_CGROUP to bpftool
- use a simple int instead of refcount_t (to avoid a use-after-free
  false positive)

v2:
- addressed build bot failures

Stanislav Fomichev (11):
  bpf: add bpf_func_t and trampoline helpers
  bpf: convert cgroup_bpf.progs to hlist
  bpf: per-cgroup lsm flavor
  bpf: minimize number of allocated lsm slots per program
  bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  bpf: allow writing to a subset of sock fields from lsm progtype
  libbpf: add lsm_cgroup_sock type
  libbpf: implement bpf_prog_query_opts
  bpftool: implement cgroup tree for BPF_LSM_CGROUP
  selftests/bpf: lsm_cgroup functional test
  selftests/bpf: verify lsm_cgroup struct sock access

 arch/x86/net/bpf_jit_comp.c                   |  24 +-
 include/linux/bpf-cgroup-defs.h               |  13 +-
 include/linux/bpf-cgroup.h                    |   9 +-
 include/linux/bpf.h                           |  39 +-
 include/linux/bpf_lsm.h                       |   7 +
 include/linux/btf_ids.h                       |   3 +-
 include/uapi/linux/bpf.h                      |   4 +
 kernel/bpf/bpf_lsm.c                          | 101 +++++
 kernel/bpf/btf.c                              |  11 +
 kernel/bpf/cgroup.c                           | 370 ++++++++++++++----
 kernel/bpf/core.c                             |   9 +
 kernel/bpf/syscall.c                          |  16 +-
 kernel/bpf/trampoline.c                       | 251 ++++++++++--
 kernel/bpf/verifier.c                         |  35 +-
 tools/bpf/bpftool/cgroup.c                    |  79 ++--
 tools/include/linux/btf_ids.h                 |   4 +-
 tools/include/uapi/linux/bpf.h                |   4 +
 tools/lib/bpf/bpf.c                           |  40 +-
 tools/lib/bpf/bpf.h                           |  15 +
 tools/lib/bpf/libbpf.c                        |   3 +
 tools/lib/bpf/libbpf.map                      |   2 +-
 .../selftests/bpf/prog_tests/lsm_cgroup.c     | 346 ++++++++++++++++
 .../testing/selftests/bpf/progs/lsm_cgroup.c  | 160 ++++++++
 23 files changed, 1394 insertions(+), 151 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
 create mode 100644 tools/testing/selftests/bpf/progs/lsm_cgroup.c

-- 
2.36.1.255.ge46751e96f-goog



* [PATCH bpf-next v8 01/11] bpf: add bpf_func_t and trampoline helpers
  2022-06-01 19:02 [PATCH bpf-next v8 00/11] bpf: cgroup_sock lsm flavor Stanislav Fomichev
@ 2022-06-01 19:02 ` Stanislav Fomichev
  2022-06-01 19:02 ` [PATCH bpf-next v8 02/11] bpf: convert cgroup_bpf.progs to hlist Stanislav Fomichev
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 35+ messages in thread
From: Stanislav Fomichev @ 2022-06-01 19:02 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

I'll be adding lsm cgroup specific helpers that need to grab the
trampoline mutex; split the existing link/unlink helpers into locked
wrappers and mutex-free __ variants.

No functional changes.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 include/linux/bpf.h     | 11 ++++---
 kernel/bpf/trampoline.c | 63 +++++++++++++++++++++--------------------
 2 files changed, 38 insertions(+), 36 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 8e6092d0ea95..b0af68b6297b 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -53,6 +53,8 @@ typedef u64 (*bpf_callback_t)(u64, u64, u64, u64, u64);
 typedef int (*bpf_iter_init_seq_priv_t)(void *private_data,
 					struct bpf_iter_aux_info *aux);
 typedef void (*bpf_iter_fini_seq_priv_t)(void *private_data);
+typedef unsigned int (*bpf_func_t)(const void *,
+				   const struct bpf_insn *);
 struct bpf_iter_seq_info {
 	const struct seq_operations *seq_ops;
 	bpf_iter_init_seq_priv_t init_seq_private;
@@ -863,8 +865,7 @@ struct bpf_dispatcher {
 static __always_inline __nocfi unsigned int bpf_dispatcher_nop_func(
 	const void *ctx,
 	const struct bpf_insn *insnsi,
-	unsigned int (*bpf_func)(const void *,
-				 const struct bpf_insn *))
+	bpf_func_t bpf_func)
 {
 	return bpf_func(ctx, insnsi);
 }
@@ -893,8 +894,7 @@ int arch_prepare_bpf_dispatcher(void *image, s64 *funcs, int num_funcs);
 	noinline __nocfi unsigned int bpf_dispatcher_##name##_func(	\
 		const void *ctx,					\
 		const struct bpf_insn *insnsi,				\
-		unsigned int (*bpf_func)(const void *,			\
-					 const struct bpf_insn *))	\
+		bpf_func_t bpf_func)					\
 	{								\
 		return bpf_func(ctx, insnsi);				\
 	}								\
@@ -905,8 +905,7 @@ int arch_prepare_bpf_dispatcher(void *image, s64 *funcs, int num_funcs);
 	unsigned int bpf_dispatcher_##name##_func(			\
 		const void *ctx,					\
 		const struct bpf_insn *insnsi,				\
-		unsigned int (*bpf_func)(const void *,			\
-					 const struct bpf_insn *));	\
+		bpf_func_t bpf_func);					\
 	extern struct bpf_dispatcher bpf_dispatcher_##name;
 #define BPF_DISPATCHER_FUNC(name) bpf_dispatcher_##name##_func
 #define BPF_DISPATCHER_PTR(name) (&bpf_dispatcher_##name)
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index 93c7675f0c9e..5466e15be61f 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -410,7 +410,7 @@ static enum bpf_tramp_prog_type bpf_attach_type_to_tramp(struct bpf_prog *prog)
 	}
 }
 
-int bpf_trampoline_link_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr)
+static int __bpf_trampoline_link_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr)
 {
 	enum bpf_tramp_prog_type kind;
 	struct bpf_tramp_link *link_exiting;
@@ -418,44 +418,33 @@ int bpf_trampoline_link_prog(struct bpf_tramp_link *link, struct bpf_trampoline
 	int cnt = 0, i;
 
 	kind = bpf_attach_type_to_tramp(link->link.prog);
-	mutex_lock(&tr->mutex);
-	if (tr->extension_prog) {
+	if (tr->extension_prog)
 		/* cannot attach fentry/fexit if extension prog is attached.
 		 * cannot overwrite extension prog either.
 		 */
-		err = -EBUSY;
-		goto out;
-	}
+		return -EBUSY;
 
 	for (i = 0; i < BPF_TRAMP_MAX; i++)
 		cnt += tr->progs_cnt[i];
 
 	if (kind == BPF_TRAMP_REPLACE) {
 		/* Cannot attach extension if fentry/fexit are in use. */
-		if (cnt) {
-			err = -EBUSY;
-			goto out;
-		}
+		if (cnt)
+			return -EBUSY;
 		tr->extension_prog = link->link.prog;
-		err = bpf_arch_text_poke(tr->func.addr, BPF_MOD_JUMP, NULL,
-					 link->link.prog->bpf_func);
-		goto out;
-	}
-	if (cnt >= BPF_MAX_TRAMP_LINKS) {
-		err = -E2BIG;
-		goto out;
+		return bpf_arch_text_poke(tr->func.addr, BPF_MOD_JUMP, NULL,
+					  link->link.prog->bpf_func);
 	}
-	if (!hlist_unhashed(&link->tramp_hlist)) {
+	if (cnt >= BPF_MAX_TRAMP_LINKS)
+		return -E2BIG;
+	if (!hlist_unhashed(&link->tramp_hlist))
 		/* prog already linked */
-		err = -EBUSY;
-		goto out;
-	}
+		return -EBUSY;
 	hlist_for_each_entry(link_exiting, &tr->progs_hlist[kind], tramp_hlist) {
 		if (link_exiting->link.prog != link->link.prog)
 			continue;
 		/* prog already linked */
-		err = -EBUSY;
-		goto out;
+		return -EBUSY;
 	}
 
 	hlist_add_head(&link->tramp_hlist, &tr->progs_hlist[kind]);
@@ -465,30 +454,44 @@ int bpf_trampoline_link_prog(struct bpf_tramp_link *link, struct bpf_trampoline
 		hlist_del_init(&link->tramp_hlist);
 		tr->progs_cnt[kind]--;
 	}
-out:
+	return err;
+}
+
+int bpf_trampoline_link_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr)
+{
+	int err;
+
+	mutex_lock(&tr->mutex);
+	err = __bpf_trampoline_link_prog(link, tr);
 	mutex_unlock(&tr->mutex);
 	return err;
 }
 
-/* bpf_trampoline_unlink_prog() should never fail. */
-int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr)
+static int __bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr)
 {
 	enum bpf_tramp_prog_type kind;
 	int err;
 
 	kind = bpf_attach_type_to_tramp(link->link.prog);
-	mutex_lock(&tr->mutex);
 	if (kind == BPF_TRAMP_REPLACE) {
 		WARN_ON_ONCE(!tr->extension_prog);
 		err = bpf_arch_text_poke(tr->func.addr, BPF_MOD_JUMP,
 					 tr->extension_prog->bpf_func, NULL);
 		tr->extension_prog = NULL;
-		goto out;
+		return err;
 	}
 	hlist_del_init(&link->tramp_hlist);
 	tr->progs_cnt[kind]--;
-	err = bpf_trampoline_update(tr);
-out:
+	return bpf_trampoline_update(tr);
+}
+
+/* bpf_trampoline_unlink_prog() should never fail. */
+int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr)
+{
+	int err;
+
+	mutex_lock(&tr->mutex);
+	err = __bpf_trampoline_unlink_prog(link, tr);
 	mutex_unlock(&tr->mutex);
 	return err;
 }
-- 
2.36.1.255.ge46751e96f-goog



* [PATCH bpf-next v8 02/11] bpf: convert cgroup_bpf.progs to hlist
  2022-06-01 19:02 [PATCH bpf-next v8 00/11] bpf: cgroup_sock lsm flavor Stanislav Fomichev
  2022-06-01 19:02 ` [PATCH bpf-next v8 01/11] bpf: add bpf_func_t and trampoline helpers Stanislav Fomichev
@ 2022-06-01 19:02 ` Stanislav Fomichev
  2022-06-01 19:02 ` [PATCH bpf-next v8 03/11] bpf: per-cgroup lsm flavor Stanislav Fomichev
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 35+ messages in thread
From: Stanislav Fomichev @ 2022-06-01 19:02 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev, Jakub Sitnicki

This lets us reclaim some space to be used by the new cgroup lsm slots.

Before:
struct cgroup_bpf {
	struct bpf_prog_array *    effective[23];        /*     0   184 */
	/* --- cacheline 2 boundary (128 bytes) was 56 bytes ago --- */
	struct list_head           progs[23];            /*   184   368 */
	/* --- cacheline 8 boundary (512 bytes) was 40 bytes ago --- */
	u32                        flags[23];            /*   552    92 */

	/* XXX 4 bytes hole, try to pack */

	/* --- cacheline 10 boundary (640 bytes) was 8 bytes ago --- */
	struct list_head           storages;             /*   648    16 */
	struct bpf_prog_array *    inactive;             /*   664     8 */
	struct percpu_ref          refcnt;               /*   672    16 */
	struct work_struct         release_work;         /*   688    32 */

	/* size: 720, cachelines: 12, members: 7 */
	/* sum members: 716, holes: 1, sum holes: 4 */
	/* last cacheline: 16 bytes */
};

After:
struct cgroup_bpf {
	struct bpf_prog_array *    effective[23];        /*     0   184 */
	/* --- cacheline 2 boundary (128 bytes) was 56 bytes ago --- */
	struct hlist_head          progs[23];            /*   184   184 */
	/* --- cacheline 5 boundary (320 bytes) was 48 bytes ago --- */
	u8                         flags[23];            /*   368    23 */

	/* XXX 1 byte hole, try to pack */

	/* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */
	struct list_head           storages;             /*   392    16 */
	struct bpf_prog_array *    inactive;             /*   408     8 */
	struct percpu_ref          refcnt;               /*   416    16 */
	struct work_struct         release_work;         /*   432    72 */

	/* size: 504, cachelines: 8, members: 7 */
	/* sum members: 503, holes: 1, sum holes: 1 */
	/* last cacheline: 56 bytes */
};
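
For reference, the bulk of the savings comes from hlist_head being a
single pointer (plus flags shrinking from u32 to u8); roughly, per
include/linux/types.h:

	struct list_head  { struct list_head *next, *prev; };	/* 16 bytes on 64-bit */
	struct hlist_head { struct hlist_node *first; };	/* 8 bytes */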

Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 include/linux/bpf-cgroup-defs.h |  4 +-
 include/linux/bpf-cgroup.h      |  2 +-
 kernel/bpf/cgroup.c             | 76 +++++++++++++++++++--------------
 3 files changed, 47 insertions(+), 35 deletions(-)

diff --git a/include/linux/bpf-cgroup-defs.h b/include/linux/bpf-cgroup-defs.h
index 695d1224a71b..5d268e76d8e6 100644
--- a/include/linux/bpf-cgroup-defs.h
+++ b/include/linux/bpf-cgroup-defs.h
@@ -47,8 +47,8 @@ struct cgroup_bpf {
 	 * have either zero or one element
 	 * when BPF_F_ALLOW_MULTI the list can have up to BPF_CGROUP_MAX_PROGS
 	 */
-	struct list_head progs[MAX_CGROUP_BPF_ATTACH_TYPE];
-	u32 flags[MAX_CGROUP_BPF_ATTACH_TYPE];
+	struct hlist_head progs[MAX_CGROUP_BPF_ATTACH_TYPE];
+	u8 flags[MAX_CGROUP_BPF_ATTACH_TYPE];
 
 	/* list of cgroup shared storages */
 	struct list_head storages;
diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
index 669d96d074ad..6673acfbf2ef 100644
--- a/include/linux/bpf-cgroup.h
+++ b/include/linux/bpf-cgroup.h
@@ -95,7 +95,7 @@ struct bpf_cgroup_link {
 };
 
 struct bpf_prog_list {
-	struct list_head node;
+	struct hlist_node node;
 	struct bpf_prog *prog;
 	struct bpf_cgroup_link *link;
 	struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE];
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 7a394f7c205c..4adb4f3ecb7f 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -157,11 +157,12 @@ static void cgroup_bpf_release(struct work_struct *work)
 	mutex_lock(&cgroup_mutex);
 
 	for (atype = 0; atype < ARRAY_SIZE(cgrp->bpf.progs); atype++) {
-		struct list_head *progs = &cgrp->bpf.progs[atype];
-		struct bpf_prog_list *pl, *pltmp;
+		struct hlist_head *progs = &cgrp->bpf.progs[atype];
+		struct bpf_prog_list *pl;
+		struct hlist_node *pltmp;
 
-		list_for_each_entry_safe(pl, pltmp, progs, node) {
-			list_del(&pl->node);
+		hlist_for_each_entry_safe(pl, pltmp, progs, node) {
+			hlist_del(&pl->node);
 			if (pl->prog)
 				bpf_prog_put(pl->prog);
 			if (pl->link)
@@ -217,12 +218,12 @@ static struct bpf_prog *prog_list_prog(struct bpf_prog_list *pl)
 /* count number of elements in the list.
  * it's slow but the list cannot be long
  */
-static u32 prog_list_length(struct list_head *head)
+static u32 prog_list_length(struct hlist_head *head)
 {
 	struct bpf_prog_list *pl;
 	u32 cnt = 0;
 
-	list_for_each_entry(pl, head, node) {
+	hlist_for_each_entry(pl, head, node) {
 		if (!prog_list_prog(pl))
 			continue;
 		cnt++;
@@ -291,7 +292,7 @@ static int compute_effective_progs(struct cgroup *cgrp,
 		if (cnt > 0 && !(p->bpf.flags[atype] & BPF_F_ALLOW_MULTI))
 			continue;
 
-		list_for_each_entry(pl, &p->bpf.progs[atype], node) {
+		hlist_for_each_entry(pl, &p->bpf.progs[atype], node) {
 			if (!prog_list_prog(pl))
 				continue;
 
@@ -342,7 +343,7 @@ int cgroup_bpf_inherit(struct cgroup *cgrp)
 		cgroup_bpf_get(p);
 
 	for (i = 0; i < NR; i++)
-		INIT_LIST_HEAD(&cgrp->bpf.progs[i]);
+		INIT_HLIST_HEAD(&cgrp->bpf.progs[i]);
 
 	INIT_LIST_HEAD(&cgrp->bpf.storages);
 
@@ -418,7 +419,7 @@ static int update_effective_progs(struct cgroup *cgrp,
 
 #define BPF_CGROUP_MAX_PROGS 64
 
-static struct bpf_prog_list *find_attach_entry(struct list_head *progs,
+static struct bpf_prog_list *find_attach_entry(struct hlist_head *progs,
 					       struct bpf_prog *prog,
 					       struct bpf_cgroup_link *link,
 					       struct bpf_prog *replace_prog,
@@ -428,12 +429,12 @@ static struct bpf_prog_list *find_attach_entry(struct list_head *progs,
 
 	/* single-attach case */
 	if (!allow_multi) {
-		if (list_empty(progs))
+		if (hlist_empty(progs))
 			return NULL;
-		return list_first_entry(progs, typeof(*pl), node);
+		return hlist_entry(progs->first, typeof(*pl), node);
 	}
 
-	list_for_each_entry(pl, progs, node) {
+	hlist_for_each_entry(pl, progs, node) {
 		if (prog && pl->prog == prog && prog != replace_prog)
 			/* disallow attaching the same prog twice */
 			return ERR_PTR(-EINVAL);
@@ -444,7 +445,7 @@ static struct bpf_prog_list *find_attach_entry(struct list_head *progs,
 
 	/* direct prog multi-attach w/ replacement case */
 	if (replace_prog) {
-		list_for_each_entry(pl, progs, node) {
+		hlist_for_each_entry(pl, progs, node) {
 			if (pl->prog == replace_prog)
 				/* a match found */
 				return pl;
@@ -480,7 +481,7 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 	struct bpf_cgroup_storage *new_storage[MAX_BPF_CGROUP_STORAGE_TYPE] = {};
 	enum cgroup_bpf_attach_type atype;
 	struct bpf_prog_list *pl;
-	struct list_head *progs;
+	struct hlist_head *progs;
 	int err;
 
 	if (((flags & BPF_F_ALLOW_OVERRIDE) && (flags & BPF_F_ALLOW_MULTI)) ||
@@ -503,7 +504,7 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 	if (!hierarchy_allows_attach(cgrp, atype))
 		return -EPERM;
 
-	if (!list_empty(progs) && cgrp->bpf.flags[atype] != saved_flags)
+	if (!hlist_empty(progs) && cgrp->bpf.flags[atype] != saved_flags)
 		/* Disallow attaching non-overridable on top
 		 * of existing overridable in this cgroup.
 		 * Disallow attaching multi-prog if overridable or none
@@ -525,12 +526,22 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 	if (pl) {
 		old_prog = pl->prog;
 	} else {
+		struct hlist_node *last = NULL;
+
 		pl = kmalloc(sizeof(*pl), GFP_KERNEL);
 		if (!pl) {
 			bpf_cgroup_storages_free(new_storage);
 			return -ENOMEM;
 		}
-		list_add_tail(&pl->node, progs);
+		if (hlist_empty(progs))
+			hlist_add_head(&pl->node, progs);
+		else
+			hlist_for_each(last, progs) {
+				if (last->next)
+					continue;
+				hlist_add_behind(&pl->node, last);
+				break;
+			}
 	}
 
 	pl->prog = prog;
@@ -556,7 +567,7 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 	}
 	bpf_cgroup_storages_free(new_storage);
 	if (!old_prog) {
-		list_del(&pl->node);
+		hlist_del(&pl->node);
 		kfree(pl);
 	}
 	return err;
@@ -587,7 +598,7 @@ static void replace_effective_prog(struct cgroup *cgrp,
 	struct cgroup_subsys_state *css;
 	struct bpf_prog_array *progs;
 	struct bpf_prog_list *pl;
-	struct list_head *head;
+	struct hlist_head *head;
 	struct cgroup *cg;
 	int pos;
 
@@ -603,7 +614,7 @@ static void replace_effective_prog(struct cgroup *cgrp,
 				continue;
 
 			head = &cg->bpf.progs[atype];
-			list_for_each_entry(pl, head, node) {
+			hlist_for_each_entry(pl, head, node) {
 				if (!prog_list_prog(pl))
 					continue;
 				if (pl->link == link)
@@ -637,7 +648,7 @@ static int __cgroup_bpf_replace(struct cgroup *cgrp,
 	enum cgroup_bpf_attach_type atype;
 	struct bpf_prog *old_prog;
 	struct bpf_prog_list *pl;
-	struct list_head *progs;
+	struct hlist_head *progs;
 	bool found = false;
 
 	atype = to_cgroup_bpf_attach_type(link->type);
@@ -649,7 +660,7 @@ static int __cgroup_bpf_replace(struct cgroup *cgrp,
 	if (link->link.prog->type != new_prog->type)
 		return -EINVAL;
 
-	list_for_each_entry(pl, progs, node) {
+	hlist_for_each_entry(pl, progs, node) {
 		if (pl->link == link) {
 			found = true;
 			break;
@@ -688,7 +699,7 @@ static int cgroup_bpf_replace(struct bpf_link *link, struct bpf_prog *new_prog,
 	return ret;
 }
 
-static struct bpf_prog_list *find_detach_entry(struct list_head *progs,
+static struct bpf_prog_list *find_detach_entry(struct hlist_head *progs,
 					       struct bpf_prog *prog,
 					       struct bpf_cgroup_link *link,
 					       bool allow_multi)
@@ -696,14 +707,14 @@ static struct bpf_prog_list *find_detach_entry(struct list_head *progs,
 	struct bpf_prog_list *pl;
 
 	if (!allow_multi) {
-		if (list_empty(progs))
+		if (hlist_empty(progs))
 			/* report error when trying to detach and nothing is attached */
 			return ERR_PTR(-ENOENT);
 
 		/* to maintain backward compatibility NONE and OVERRIDE cgroups
 		 * allow detaching with invalid FD (prog==NULL) in legacy mode
 		 */
-		return list_first_entry(progs, typeof(*pl), node);
+		return hlist_entry(progs->first, typeof(*pl), node);
 	}
 
 	if (!prog && !link)
@@ -713,7 +724,7 @@ static struct bpf_prog_list *find_detach_entry(struct list_head *progs,
 		return ERR_PTR(-EINVAL);
 
 	/* find the prog or link and detach it */
-	list_for_each_entry(pl, progs, node) {
+	hlist_for_each_entry(pl, progs, node) {
 		if (pl->prog == prog && pl->link == link)
 			return pl;
 	}
@@ -737,7 +748,7 @@ static void purge_effective_progs(struct cgroup *cgrp, struct bpf_prog *prog,
 	struct cgroup_subsys_state *css;
 	struct bpf_prog_array *progs;
 	struct bpf_prog_list *pl;
-	struct list_head *head;
+	struct hlist_head *head;
 	struct cgroup *cg;
 	int pos;
 
@@ -754,7 +765,7 @@ static void purge_effective_progs(struct cgroup *cgrp, struct bpf_prog *prog,
 				continue;
 
 			head = &cg->bpf.progs[atype];
-			list_for_each_entry(pl, head, node) {
+			hlist_for_each_entry(pl, head, node) {
 				if (!prog_list_prog(pl))
 					continue;
 				if (pl->prog == prog && pl->link == link)
@@ -791,7 +802,7 @@ static int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
 	enum cgroup_bpf_attach_type atype;
 	struct bpf_prog *old_prog;
 	struct bpf_prog_list *pl;
-	struct list_head *progs;
+	struct hlist_head *progs;
 	u32 flags;
 
 	atype = to_cgroup_bpf_attach_type(type);
@@ -822,9 +833,10 @@ static int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
 	}
 
 	/* now can actually delete it from this cgroup list */
-	list_del(&pl->node);
+	hlist_del(&pl->node);
+
 	kfree(pl);
-	if (list_empty(progs))
+	if (hlist_empty(progs))
 		/* last program was detached, reset flags to zero */
 		cgrp->bpf.flags[atype] = 0;
 	if (old_prog)
@@ -852,7 +864,7 @@ static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
 	enum bpf_attach_type type = attr->query.attach_type;
 	enum cgroup_bpf_attach_type atype;
 	struct bpf_prog_array *effective;
-	struct list_head *progs;
+	struct hlist_head *progs;
 	struct bpf_prog *prog;
 	int cnt, ret = 0, i;
 	u32 flags;
@@ -891,7 +903,7 @@ static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
 		u32 id;
 
 		i = 0;
-		list_for_each_entry(pl, progs, node) {
+		hlist_for_each_entry(pl, progs, node) {
 			prog = prog_list_prog(pl);
 			id = prog->aux->id;
 			if (copy_to_user(prog_ids + i, &id, sizeof(id)))
-- 
2.36.1.255.ge46751e96f-goog



* [PATCH bpf-next v8 03/11] bpf: per-cgroup lsm flavor
  2022-06-01 19:02 [PATCH bpf-next v8 00/11] bpf: cgroup_sock lsm flavor Stanislav Fomichev
  2022-06-01 19:02 ` [PATCH bpf-next v8 01/11] bpf: add bpf_func_t and trampoline helpers Stanislav Fomichev
  2022-06-01 19:02 ` [PATCH bpf-next v8 02/11] bpf: convert cgroup_bpf.progs to hlist Stanislav Fomichev
@ 2022-06-01 19:02 ` Stanislav Fomichev
  2022-06-02  6:16   ` kernel test robot
  2022-06-04  6:11   ` Martin KaFai Lau
  2022-06-01 19:02 ` [PATCH bpf-next v8 04/11] bpf: minimize number of allocated lsm slots per program Stanislav Fomichev
                   ` (7 subsequent siblings)
  10 siblings, 2 replies; 35+ messages in thread
From: Stanislav Fomichev @ 2022-06-01 19:02 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

Allow attaching to lsm hooks in the cgroup context.

Attaching to per-cgroup LSM works exactly like attaching
to other per-cgroup hooks. A new BPF_LSM_CGROUP attach type is added
to trigger the new mode; the actual lsm hook we attach to is
signaled via the existing attach_btf_id.
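
A rough userspace attach sketch (the uapi/libbpf plumbing lands later
in the series; this is illustrative, not the exact API surface):

	/* prog is loaded as BPF_PROG_TYPE_LSM with
	 * expected_attach_type = BPF_LSM_CGROUP and attach_btf_id
	 * pointing at the lsm hook, same as for regular lsm programs
	 */
	link_fd = bpf_link_create(prog_fd, cgroup_fd, BPF_LSM_CGROUP, NULL);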

For the hooks that have 'struct socket' or 'struct sock' as their first
argument, we use the cgroup associated with that socket. For the rest,
we use the 'current' cgroup (this all applies to the default hierarchy,
i.e. cgroup v2, only). Note that for some hooks that work on
'struct sock' we still take the cgroup from 'current' because some of
them operate on a socket that hasn't been properly initialized yet.

Behind the scenes, we allocate a shim program that is attached
to the trampoline and runs the cgroup's effective BPF program array.
This shim has some rudimentary ref counting and can be shared
between several programs attaching to the same per-cgroup lsm hook.

Note that this patch bloats the cgroup size because we add 211
cgroup_bpf_attach_type(s) for simplicity's sake. This will be
addressed in the subsequent patch.

Also note that we only add the non-sleepable flavor for now. To enable
sleepable use-cases, bpf_prog_run_array_cg has to grab the trace rcu
lock, shim programs have to be freed via trace rcu, cgroup_bpf.effective
should also be trace-rcu-managed, plus maybe some other changes that
I'm not aware of.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 arch/x86/net/bpf_jit_comp.c     |  24 ++--
 include/linux/bpf-cgroup-defs.h |   8 ++
 include/linux/bpf-cgroup.h      |   7 ++
 include/linux/bpf.h             |  26 +++++
 include/linux/bpf_lsm.h         |  13 +++
 include/linux/btf_ids.h         |   3 +-
 include/uapi/linux/bpf.h        |   1 +
 kernel/bpf/bpf_lsm.c            |  48 ++++++++
 kernel/bpf/btf.c                |  11 ++
 kernel/bpf/cgroup.c             | 148 +++++++++++++++++++++++--
 kernel/bpf/core.c               |   2 +
 kernel/bpf/syscall.c            |  10 ++
 kernel/bpf/trampoline.c         | 190 ++++++++++++++++++++++++++++++++
 kernel/bpf/verifier.c           |  32 ++++++
 tools/include/linux/btf_ids.h   |   4 +-
 tools/include/uapi/linux/bpf.h  |   1 +
 16 files changed, 510 insertions(+), 18 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index f298b18a9a3d..54a814fa41e0 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1770,6 +1770,10 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
 			   struct bpf_tramp_link *l, int stack_size,
 			   int run_ctx_off, bool save_ret)
 {
+	void (*exit)(struct bpf_prog *prog, u64 start,
+		     struct bpf_tramp_run_ctx *run_ctx) = __bpf_prog_exit;
+	u64 (*enter)(struct bpf_prog *prog,
+		     struct bpf_tramp_run_ctx *run_ctx) = __bpf_prog_enter;
 	u8 *prog = *pprog;
 	u8 *jmp_insn;
 	int ctx_cookie_off = offsetof(struct bpf_tramp_run_ctx, bpf_cookie);
@@ -1788,15 +1792,21 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
 	 */
 	emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_1, -run_ctx_off + ctx_cookie_off);
 
+	if (p->aux->sleepable) {
+		enter = __bpf_prog_enter_sleepable;
+		exit = __bpf_prog_exit_sleepable;
+	} else if (p->expected_attach_type == BPF_LSM_CGROUP) {
+		enter = __bpf_prog_enter_lsm_cgroup;
+		exit = __bpf_prog_exit_lsm_cgroup;
+	}
+
 	/* arg1: mov rdi, progs[i] */
 	emit_mov_imm64(&prog, BPF_REG_1, (long) p >> 32, (u32) (long) p);
 	/* arg2: lea rsi, [rbp - ctx_cookie_off] */
 	EMIT4(0x48, 0x8D, 0x75, -run_ctx_off);
 
-	if (emit_call(&prog,
-		      p->aux->sleepable ? __bpf_prog_enter_sleepable :
-		      __bpf_prog_enter, prog))
-			return -EINVAL;
+	if (emit_call(&prog, enter, prog))
+		return -EINVAL;
 	/* remember prog start time returned by __bpf_prog_enter */
 	emit_mov_reg(&prog, true, BPF_REG_6, BPF_REG_0);
 
@@ -1840,10 +1850,8 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
 	emit_mov_reg(&prog, true, BPF_REG_2, BPF_REG_6);
 	/* arg3: lea rdx, [rbp - run_ctx_off] */
 	EMIT4(0x48, 0x8D, 0x55, -run_ctx_off);
-	if (emit_call(&prog,
-		      p->aux->sleepable ? __bpf_prog_exit_sleepable :
-		      __bpf_prog_exit, prog))
-			return -EINVAL;
+	if (emit_call(&prog, exit, prog))
+		return -EINVAL;
 
 	*pprog = prog;
 	return 0;
diff --git a/include/linux/bpf-cgroup-defs.h b/include/linux/bpf-cgroup-defs.h
index 5d268e76d8e6..b99f8c3e37ea 100644
--- a/include/linux/bpf-cgroup-defs.h
+++ b/include/linux/bpf-cgroup-defs.h
@@ -10,6 +10,12 @@
 
 struct bpf_prog_array;
 
+#ifdef CONFIG_BPF_LSM
+#define CGROUP_LSM_NUM 211 /* will be addressed in the next patch */
+#else
+#define CGROUP_LSM_NUM 0
+#endif
+
 enum cgroup_bpf_attach_type {
 	CGROUP_BPF_ATTACH_TYPE_INVALID = -1,
 	CGROUP_INET_INGRESS = 0,
@@ -35,6 +41,8 @@ enum cgroup_bpf_attach_type {
 	CGROUP_INET4_GETSOCKNAME,
 	CGROUP_INET6_GETSOCKNAME,
 	CGROUP_INET_SOCK_RELEASE,
+	CGROUP_LSM_START,
+	CGROUP_LSM_END = CGROUP_LSM_START + CGROUP_LSM_NUM - 1,
 	MAX_CGROUP_BPF_ATTACH_TYPE
 };
 
diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
index 6673acfbf2ef..2bd1b5f8de9b 100644
--- a/include/linux/bpf-cgroup.h
+++ b/include/linux/bpf-cgroup.h
@@ -23,6 +23,13 @@ struct ctl_table;
 struct ctl_table_header;
 struct task_struct;
 
+unsigned int __cgroup_bpf_run_lsm_sock(const void *ctx,
+				       const struct bpf_insn *insn);
+unsigned int __cgroup_bpf_run_lsm_socket(const void *ctx,
+					 const struct bpf_insn *insn);
+unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
+					  const struct bpf_insn *insn);
+
 #ifdef CONFIG_CGROUP_BPF
 
 #define CGROUP_ATYPE(type) \
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index b0af68b6297b..210c4203db78 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -778,6 +778,10 @@ void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start, struct bpf_tramp_
 u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog, struct bpf_tramp_run_ctx *run_ctx);
 void notrace __bpf_prog_exit_sleepable(struct bpf_prog *prog, u64 start,
 				       struct bpf_tramp_run_ctx *run_ctx);
+u64 notrace __bpf_prog_enter_lsm_cgroup(struct bpf_prog *prog,
+					struct bpf_tramp_run_ctx *run_ctx);
+void notrace __bpf_prog_exit_lsm_cgroup(struct bpf_prog *prog, u64 start,
+					struct bpf_tramp_run_ctx *run_ctx);
 void notrace __bpf_tramp_enter(struct bpf_tramp_image *tr);
 void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr);
 
@@ -1044,6 +1048,7 @@ struct bpf_prog_aux {
 	struct user_struct *user;
 	u64 load_time; /* ns since boottime */
 	u32 verified_insns;
+	int cgroup_atype; /* enum cgroup_bpf_attach_type */
 	struct bpf_map *cgroup_storage[MAX_BPF_CGROUP_STORAGE_TYPE];
 	char name[BPF_OBJ_NAME_LEN];
 #ifdef CONFIG_SECURITY
@@ -1117,6 +1122,11 @@ struct bpf_tramp_link {
 	u64 cookie;
 };
 
+struct bpf_shim_tramp_link {
+	struct bpf_tramp_link link;
+	struct bpf_trampoline *trampoline;
+};
+
 struct bpf_tracing_link {
 	struct bpf_tramp_link link;
 	enum bpf_attach_type attach_type;
@@ -1195,6 +1205,10 @@ struct bpf_dummy_ops {
 int bpf_struct_ops_test_run(struct bpf_prog *prog, const union bpf_attr *kattr,
 			    union bpf_attr __user *uattr);
 #endif
+int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
+				    struct bpf_attach_target_info *tgt_info,
+				    int cgroup_atype);
+void bpf_trampoline_unlink_cgroup_shim(struct bpf_prog *prog);
 #else
 static inline const struct bpf_struct_ops *bpf_struct_ops_find(u32 type_id)
 {
@@ -1218,6 +1232,15 @@ static inline int bpf_struct_ops_map_sys_lookup_elem(struct bpf_map *map,
 {
 	return -EINVAL;
 }
+static inline int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
+						  struct bpf_attach_target_info *tgt_info,
+						  int cgroup_atype)
+{
+	return -EOPNOTSUPP;
+}
+static inline void bpf_trampoline_unlink_cgroup_shim(struct bpf_prog *prog)
+{
+}
 #endif
 
 struct bpf_array {
@@ -2267,6 +2290,8 @@ extern const struct bpf_func_proto bpf_loop_proto;
 extern const struct bpf_func_proto bpf_strncmp_proto;
 extern const struct bpf_func_proto bpf_copy_from_user_task_proto;
 extern const struct bpf_func_proto bpf_kptr_xchg_proto;
+extern const struct bpf_func_proto bpf_set_retval_proto;
+extern const struct bpf_func_proto bpf_get_retval_proto;
 
 const struct bpf_func_proto *tracing_prog_func_proto(
   enum bpf_func_id func_id, const struct bpf_prog *prog);
@@ -2384,6 +2409,7 @@ int bpf_arch_text_invalidate(void *dst, size_t len);
 
 struct btf_id_set;
 bool btf_id_set_contains(const struct btf_id_set *set, u32 id);
+int btf_id_set_index(const struct btf_id_set *set, u32 id);
 
 #define MAX_BPRINTF_VARARGS		12
 
diff --git a/include/linux/bpf_lsm.h b/include/linux/bpf_lsm.h
index 479c101546ad..61787a5f6af9 100644
--- a/include/linux/bpf_lsm.h
+++ b/include/linux/bpf_lsm.h
@@ -42,6 +42,9 @@ extern const struct bpf_func_proto bpf_inode_storage_get_proto;
 extern const struct bpf_func_proto bpf_inode_storage_delete_proto;
 void bpf_inode_storage_free(struct inode *inode);
 
+int bpf_lsm_hook_idx(u32 btf_id);
+void bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog, bpf_func_t *bpf_func);
+
 #else /* !CONFIG_BPF_LSM */
 
 static inline bool bpf_lsm_is_sleepable_hook(u32 btf_id)
@@ -65,6 +68,16 @@ static inline void bpf_inode_storage_free(struct inode *inode)
 {
 }
 
+static inline void bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
+					   bpf_func_t *bpf_func)
+{
+}
+
+static inline int bpf_lsm_hook_idx(u32 btf_id)
+{
+	return -EINVAL;
+}
+
 #endif /* CONFIG_BPF_LSM */
 
 #endif /* _LINUX_BPF_LSM_H */
diff --git a/include/linux/btf_ids.h b/include/linux/btf_ids.h
index 335a19092368..252a4befeab1 100644
--- a/include/linux/btf_ids.h
+++ b/include/linux/btf_ids.h
@@ -179,7 +179,8 @@ extern struct btf_id_set name;
 	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UDP, udp_sock)			\
 	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UDP6, udp6_sock)			\
 	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UNIX, unix_sock)			\
-	BTF_SOCK_TYPE(BTF_SOCK_TYPE_MPTCP, mptcp_sock)
+	BTF_SOCK_TYPE(BTF_SOCK_TYPE_MPTCP, mptcp_sock)			\
+	BTF_SOCK_TYPE(BTF_SOCK_TYPE_SOCKET, socket)
 
 enum {
 #define BTF_SOCK_TYPE(name, str) name,
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index f4009dbdf62d..fa64b0b612fd 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -998,6 +998,7 @@ enum bpf_attach_type {
 	BPF_SK_REUSEPORT_SELECT_OR_MIGRATE,
 	BPF_PERF_EVENT,
 	BPF_TRACE_KPROBE_MULTI,
+	BPF_LSM_CGROUP,
 	__MAX_BPF_ATTACH_TYPE
 };
 
diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
index c1351df9f7ee..0f72020bfdcf 100644
--- a/kernel/bpf/bpf_lsm.c
+++ b/kernel/bpf/bpf_lsm.c
@@ -16,6 +16,7 @@
 #include <linux/bpf_local_storage.h>
 #include <linux/btf_ids.h>
 #include <linux/ima.h>
+#include <linux/bpf-cgroup.h>
 
 /* For every LSM hook that allows attachment of BPF programs, declare a nop
  * function where a BPF program can be attached.
@@ -35,6 +36,44 @@ BTF_SET_START(bpf_lsm_hooks)
 #undef LSM_HOOK
 BTF_SET_END(bpf_lsm_hooks)
 
+/* List of LSM hooks that should operate on 'current' cgroup regardless
+ * of function signature.
+ */
+BTF_SET_START(bpf_lsm_current_hooks)
+/* operate on freshly allocated sk without any cgroup association */
+BTF_ID(func, bpf_lsm_sk_alloc_security)
+BTF_ID(func, bpf_lsm_sk_free_security)
+BTF_SET_END(bpf_lsm_current_hooks)
+
+void bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
+			     bpf_func_t *bpf_func)
+{
+	const struct btf_param *args;
+
+	if (btf_type_vlen(prog->aux->attach_func_proto) < 1 ||
+	    btf_id_set_contains(&bpf_lsm_current_hooks,
+				prog->aux->attach_btf_id)) {
+		*bpf_func = __cgroup_bpf_run_lsm_current;
+		return;
+	}
+
+	args = btf_params(prog->aux->attach_func_proto);
+
+#ifdef CONFIG_NET
+	if (args[0].type == btf_sock_ids[BTF_SOCK_TYPE_SOCKET])
+		*bpf_func = __cgroup_bpf_run_lsm_socket;
+	else if (args[0].type == btf_sock_ids[BTF_SOCK_TYPE_SOCK])
+		*bpf_func = __cgroup_bpf_run_lsm_sock;
+	else
+#endif
+		*bpf_func = __cgroup_bpf_run_lsm_current;
+}
+
+int bpf_lsm_hook_idx(u32 btf_id)
+{
+	return btf_id_set_index(&bpf_lsm_hooks, btf_id);
+}
+
 int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
 			const struct bpf_prog *prog)
 {
@@ -158,6 +197,15 @@ bpf_lsm_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return prog->aux->sleepable ? &bpf_ima_file_hash_proto : NULL;
 	case BPF_FUNC_get_attach_cookie:
 		return bpf_prog_has_trampoline(prog) ? &bpf_get_attach_cookie_proto : NULL;
+	case BPF_FUNC_get_local_storage:
+		return prog->expected_attach_type == BPF_LSM_CGROUP ?
+			&bpf_get_local_storage_proto : NULL;
+	case BPF_FUNC_set_retval:
+		return prog->expected_attach_type == BPF_LSM_CGROUP ?
+			&bpf_set_retval_proto : NULL;
+	case BPF_FUNC_get_retval:
+		return prog->expected_attach_type == BPF_LSM_CGROUP ?
+			&bpf_get_retval_proto : NULL;
 	default:
 		return tracing_prog_func_proto(func_id, prog);
 	}
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 7bccaa4646e5..23a710dfca59 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -5250,6 +5250,7 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
 
 	if (arg == nr_args) {
 		switch (prog->expected_attach_type) {
+		case BPF_LSM_CGROUP:
 		case BPF_LSM_MAC:
 		case BPF_TRACE_FEXIT:
 			/* When LSM programs are attached to void LSM hooks
@@ -6728,6 +6729,16 @@ static int btf_id_cmp_func(const void *a, const void *b)
 	return *pa - *pb;
 }
 
+int btf_id_set_index(const struct btf_id_set *set, u32 id)
+{
+	const u32 *p;
+
+	p = bsearch(&id, set->ids, set->cnt, sizeof(u32), btf_id_cmp_func);
+	if (!p)
+		return -1;
+	return p - set->ids;
+}
+
 bool btf_id_set_contains(const struct btf_id_set *set, u32 id)
 {
 	return bsearch(&id, set->ids, set->cnt, sizeof(u32), btf_id_cmp_func) != NULL;
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 4adb4f3ecb7f..66b644a76a69 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -14,6 +14,9 @@
 #include <linux/string.h>
 #include <linux/bpf.h>
 #include <linux/bpf-cgroup.h>
+#include <linux/btf_ids.h>
+#include <linux/bpf_lsm.h>
+#include <linux/bpf_verifier.h>
 #include <net/sock.h>
 #include <net/bpf_sk_storage.h>
 
@@ -61,6 +64,88 @@ bpf_prog_run_array_cg(const struct cgroup_bpf *cgrp,
 	return run_ctx.retval;
 }
 
+unsigned int __cgroup_bpf_run_lsm_sock(const void *ctx,
+				       const struct bpf_insn *insn)
+{
+	const struct bpf_prog *shim_prog;
+	struct sock *sk;
+	struct cgroup *cgrp;
+	int ret = 0;
+	u64 *args;
+
+	args = (u64 *)ctx;
+	sk = (void *)(unsigned long)args[0];
+	/*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
+	shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
+
+	cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
+	if (likely(cgrp))
+		ret = bpf_prog_run_array_cg(&cgrp->bpf,
+					    shim_prog->aux->cgroup_atype,
+					    ctx, bpf_prog_run, 0, NULL);
+	return ret;
+}
+
+unsigned int __cgroup_bpf_run_lsm_socket(const void *ctx,
+					 const struct bpf_insn *insn)
+{
+	const struct bpf_prog *shim_prog;
+	struct socket *sock;
+	struct cgroup *cgrp;
+	int ret = 0;
+	u64 *args;
+
+	args = (u64 *)ctx;
+	sock = (void *)(unsigned long)args[0];
+	/*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
+	shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
+
+	cgrp = sock_cgroup_ptr(&sock->sk->sk_cgrp_data);
+	if (likely(cgrp))
+		ret = bpf_prog_run_array_cg(&cgrp->bpf,
+					    shim_prog->aux->cgroup_atype,
+					    ctx, bpf_prog_run, 0, NULL);
+	return ret;
+}
+
+unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
+					  const struct bpf_insn *insn)
+{
+	const struct bpf_prog *shim_prog;
+	struct cgroup *cgrp;
+	int ret = 0;
+
+	/*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
+	shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
+
+	rcu_read_lock();
+	cgrp = task_dfl_cgroup(current);
+	if (likely(cgrp))
+		ret = bpf_prog_run_array_cg(&cgrp->bpf,
+					    shim_prog->aux->cgroup_atype,
+					    ctx, bpf_prog_run, 0, NULL);
+	rcu_read_unlock();
+	return ret;
+}
+
+#ifdef CONFIG_BPF_LSM
+static enum cgroup_bpf_attach_type
+bpf_cgroup_atype_find(enum bpf_attach_type attach_type, u32 attach_btf_id)
+{
+	if (attach_type != BPF_LSM_CGROUP)
+		return to_cgroup_bpf_attach_type(attach_type);
+	return CGROUP_LSM_START + bpf_lsm_hook_idx(attach_btf_id);
+}
+#else
+static enum cgroup_bpf_attach_type
+bpf_cgroup_atype_find(enum bpf_attach_type attach_type, u32 attach_btf_id)
+{
+	if (attach_type != BPF_LSM_CGROUP)
+		return to_cgroup_bpf_attach_type(attach_type);
+	return -EOPNOTSUPP;
+}
+#endif /* CONFIG_BPF_LSM */
+
 void cgroup_bpf_offline(struct cgroup *cgrp)
 {
 	cgroup_get(cgrp);
@@ -163,10 +248,16 @@ static void cgroup_bpf_release(struct work_struct *work)
 
 		hlist_for_each_entry_safe(pl, pltmp, progs, node) {
 			hlist_del(&pl->node);
-			if (pl->prog)
+			if (pl->prog) {
+				if (atype >= CGROUP_LSM_START && atype <= CGROUP_LSM_END)
+					bpf_trampoline_unlink_cgroup_shim(pl->prog);
 				bpf_prog_put(pl->prog);
-			if (pl->link)
+			}
+			if (pl->link) {
+				if (atype >= CGROUP_LSM_START && atype <= CGROUP_LSM_END)
+					bpf_trampoline_unlink_cgroup_shim(pl->link->link.prog);
 				bpf_cgroup_link_auto_detach(pl->link);
+			}
 			kfree(pl);
 			static_branch_dec(&cgroup_bpf_enabled_key[atype]);
 		}
@@ -479,6 +570,8 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 	struct bpf_prog *old_prog = NULL;
 	struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE] = {};
 	struct bpf_cgroup_storage *new_storage[MAX_BPF_CGROUP_STORAGE_TYPE] = {};
+	struct bpf_prog *new_prog = prog ? : link->link.prog;
+	struct bpf_attach_target_info tgt_info = {};
 	enum cgroup_bpf_attach_type atype;
 	struct bpf_prog_list *pl;
 	struct hlist_head *progs;
@@ -495,7 +588,20 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 		/* replace_prog implies BPF_F_REPLACE, and vice versa */
 		return -EINVAL;
 
-	atype = to_cgroup_bpf_attach_type(type);
+	if (type == BPF_LSM_CGROUP) {
+		if (replace_prog) {
+			if (replace_prog->aux->attach_btf_id !=
+			    new_prog->aux->attach_btf_id)
+				return -EINVAL;
+		}
+		err = bpf_check_attach_target(NULL, new_prog, NULL,
+					      new_prog->aux->attach_btf_id,
+					      &tgt_info);
+		if (err)
+			return -EINVAL;
+	}
+
+	atype = bpf_cgroup_atype_find(type, new_prog->aux->attach_btf_id);
 	if (atype < 0)
 		return -EINVAL;
 
@@ -549,9 +655,15 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 	bpf_cgroup_storages_assign(pl->storage, storage);
 	cgrp->bpf.flags[atype] = saved_flags;
 
+	if (type == BPF_LSM_CGROUP && !old_prog) {
+		err = bpf_trampoline_link_cgroup_shim(new_prog, &tgt_info, atype);
+		if (err)
+			goto cleanup;
+	}
+
 	err = update_effective_progs(cgrp, atype);
 	if (err)
-		goto cleanup;
+		goto cleanup_trampoline;
 
 	if (old_prog)
 		bpf_prog_put(old_prog);
@@ -560,6 +672,10 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 	bpf_cgroup_storages_link(new_storage, cgrp, type);
 	return 0;
 
+cleanup_trampoline:
+	if (type == BPF_LSM_CGROUP && !old_prog)
+		bpf_trampoline_unlink_cgroup_shim(new_prog);
+
 cleanup:
 	if (old_prog) {
 		pl->prog = old_prog;
@@ -651,7 +767,13 @@ static int __cgroup_bpf_replace(struct cgroup *cgrp,
 	struct hlist_head *progs;
 	bool found = false;
 
-	atype = to_cgroup_bpf_attach_type(link->type);
+	if (link->type == BPF_LSM_CGROUP) {
+		if (new_prog->aux->attach_btf_id !=
+		    link->link.prog->aux->attach_btf_id)
+			return -EINVAL;
+	}
+
+	atype = bpf_cgroup_atype_find(link->type, new_prog->aux->attach_btf_id);
 	if (atype < 0)
 		return -EINVAL;
 
@@ -803,9 +925,15 @@ static int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
 	struct bpf_prog *old_prog;
 	struct bpf_prog_list *pl;
 	struct hlist_head *progs;
+	u32 attach_btf_id = 0;
 	u32 flags;
 
-	atype = to_cgroup_bpf_attach_type(type);
+	if (prog)
+		attach_btf_id = prog->aux->attach_btf_id;
+	if (link)
+		attach_btf_id = link->link.prog->aux->attach_btf_id;
+
+	atype = bpf_cgroup_atype_find(type, attach_btf_id);
 	if (atype < 0)
 		return -EINVAL;
 
@@ -832,6 +960,10 @@ static int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
 		purge_effective_progs(cgrp, old_prog, link, atype);
 	}
 
+
+	if (type == BPF_LSM_CGROUP)
+		bpf_trampoline_unlink_cgroup_shim(old_prog ? : link->link.prog);
+
 	/* now can actually delete it from this cgroup list */
 	hlist_del(&pl->node);
 
@@ -1343,7 +1475,7 @@ BPF_CALL_0(bpf_get_retval)
 	return ctx->retval;
 }
 
-static const struct bpf_func_proto bpf_get_retval_proto = {
+const struct bpf_func_proto bpf_get_retval_proto = {
 	.func		= bpf_get_retval,
 	.gpl_only	= false,
 	.ret_type	= RET_INTEGER,
@@ -1358,7 +1490,7 @@ BPF_CALL_1(bpf_set_retval, int, retval)
 	return 0;
 }
 
-static const struct bpf_func_proto bpf_set_retval_proto = {
+const struct bpf_func_proto bpf_set_retval_proto = {
 	.func		= bpf_set_retval,
 	.gpl_only	= false,
 	.ret_type	= RET_INTEGER,
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 4cbc20a98e65..091ee210842f 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2655,6 +2655,8 @@ const struct bpf_func_proto bpf_get_local_storage_proto __weak;
 const struct bpf_func_proto bpf_get_ns_current_pid_tgid_proto __weak;
 const struct bpf_func_proto bpf_snprintf_btf_proto __weak;
 const struct bpf_func_proto bpf_seq_printf_btf_proto __weak;
+const struct bpf_func_proto bpf_set_retval_proto __weak;
+const struct bpf_func_proto bpf_get_retval_proto __weak;
 
 const struct bpf_func_proto * __weak bpf_get_trace_printk_proto(void)
 {
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index aeb31137b2ed..a237be4f8bb3 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3416,6 +3416,8 @@ attach_type_to_prog_type(enum bpf_attach_type attach_type)
 		return BPF_PROG_TYPE_SK_LOOKUP;
 	case BPF_XDP:
 		return BPF_PROG_TYPE_XDP;
+	case BPF_LSM_CGROUP:
+		return BPF_PROG_TYPE_LSM;
 	default:
 		return BPF_PROG_TYPE_UNSPEC;
 	}
@@ -3469,6 +3471,11 @@ static int bpf_prog_attach(const union bpf_attr *attr)
 	case BPF_PROG_TYPE_CGROUP_SOCKOPT:
 	case BPF_PROG_TYPE_CGROUP_SYSCTL:
 	case BPF_PROG_TYPE_SOCK_OPS:
+	case BPF_PROG_TYPE_LSM:
+		if (ptype == BPF_PROG_TYPE_LSM &&
+		    prog->expected_attach_type != BPF_LSM_CGROUP)
+			return -EINVAL;
+
 		ret = cgroup_bpf_prog_attach(attr, ptype, prog);
 		break;
 	default:
@@ -3506,6 +3513,7 @@ static int bpf_prog_detach(const union bpf_attr *attr)
 	case BPF_PROG_TYPE_CGROUP_SOCKOPT:
 	case BPF_PROG_TYPE_CGROUP_SYSCTL:
 	case BPF_PROG_TYPE_SOCK_OPS:
+	case BPF_PROG_TYPE_LSM:
 		return cgroup_bpf_prog_detach(attr, ptype);
 	default:
 		return -EINVAL;
@@ -4540,6 +4548,8 @@ static int link_create(union bpf_attr *attr, bpfptr_t uattr)
 			ret = bpf_raw_tp_link_attach(prog, NULL);
 		else if (prog->expected_attach_type == BPF_TRACE_ITER)
 			ret = bpf_iter_link_attach(attr, uattr, prog);
+		else if (prog->expected_attach_type == BPF_LSM_CGROUP)
+			ret = cgroup_bpf_link_attach(attr, prog);
 		else
 			ret = bpf_tracing_prog_attach(prog,
 						      attr->link_create.target_fd,
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index 5466e15be61f..45dfcece76e7 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -11,6 +11,8 @@
 #include <linux/rcupdate_wait.h>
 #include <linux/module.h>
 #include <linux/static_call.h>
+#include <linux/bpf_verifier.h>
+#include <linux/bpf_lsm.h>
 
 /* dummy _ops. The verifier will operate on target program's ops. */
 const struct bpf_verifier_ops bpf_extension_verifier_ops = {
@@ -496,6 +498,169 @@ int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampolin
 	return err;
 }
 
+#if defined(CONFIG_BPF_JIT) && defined(CONFIG_BPF_SYSCALL)
+static void bpf_shim_tramp_link_release(struct bpf_link *link)
+{
+	struct bpf_shim_tramp_link *shim_link =
+		container_of(link, struct bpf_shim_tramp_link, link.link);
+
+	if (!shim_link->trampoline)
+		return;
+
+	WARN_ON_ONCE(bpf_trampoline_unlink_prog(&shim_link->link, shim_link->trampoline));
+	bpf_trampoline_put(shim_link->trampoline);
+}
+
+static void bpf_shim_tramp_link_dealloc(struct bpf_link *link)
+{
+	struct bpf_shim_tramp_link *shim_link =
+		container_of(link, struct bpf_shim_tramp_link, link.link);
+
+	kfree(shim_link);
+}
+
+static const struct bpf_link_ops bpf_shim_tramp_link_lops = {
+	.release = bpf_shim_tramp_link_release,
+	.dealloc = bpf_shim_tramp_link_dealloc,
+};
+
+static struct bpf_shim_tramp_link *cgroup_shim_alloc(const struct bpf_prog *prog,
+						     bpf_func_t bpf_func,
+						     int cgroup_atype)
+{
+	struct bpf_shim_tramp_link *shim_link = NULL;
+	struct bpf_prog *p;
+
+	shim_link = kzalloc(sizeof(*shim_link), GFP_USER);
+	if (!shim_link)
+		return NULL;
+
+	p = bpf_prog_alloc(1, 0);
+	if (!p) {
+		kfree(shim_link);
+		return NULL;
+	}
+
+	p->jited = false;
+	p->bpf_func = bpf_func;
+
+	p->aux->cgroup_atype = cgroup_atype;
+	p->aux->attach_func_proto = prog->aux->attach_func_proto;
+	p->aux->attach_btf_id = prog->aux->attach_btf_id;
+	p->aux->attach_btf = prog->aux->attach_btf;
+	btf_get(p->aux->attach_btf);
+	p->type = BPF_PROG_TYPE_LSM;
+	p->expected_attach_type = BPF_LSM_MAC;
+	bpf_prog_inc(p);
+	bpf_link_init(&shim_link->link.link, BPF_LINK_TYPE_UNSPEC,
+		      &bpf_shim_tramp_link_lops, p);
+
+	return shim_link;
+}
+
+static struct bpf_shim_tramp_link *cgroup_shim_find(struct bpf_trampoline *tr,
+						    bpf_func_t bpf_func)
+{
+	struct bpf_tramp_link *link;
+	int kind;
+
+	for (kind = 0; kind < BPF_TRAMP_MAX; kind++) {
+		hlist_for_each_entry(link, &tr->progs_hlist[kind], tramp_hlist) {
+			struct bpf_prog *p = link->link.prog;
+
+			if (p->bpf_func == bpf_func)
+				return container_of(link, struct bpf_shim_tramp_link, link);
+		}
+	}
+
+	return NULL;
+}
+
+int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
+				    struct bpf_attach_target_info *tgt_info,
+				    int cgroup_atype)
+{
+	struct bpf_shim_tramp_link *shim_link = NULL;
+	struct bpf_trampoline *tr;
+	bpf_func_t bpf_func;
+	u64 key;
+	int err;
+
+	key = bpf_trampoline_compute_key(NULL, prog->aux->attach_btf,
+					 prog->aux->attach_btf_id);
+
+	bpf_lsm_find_cgroup_shim(prog, &bpf_func);
+	tr = bpf_trampoline_get(key, tgt_info);
+	if (!tr)
+		return  -ENOMEM;
+
+	mutex_lock(&tr->mutex);
+
+	shim_link = cgroup_shim_find(tr, bpf_func);
+	if (shim_link) {
+		/* Reusing existing shim attached by the other program. */
+		atomic64_inc(&shim_link->link.link.refcnt);
+		/* note, we're still holding tr refcnt from above */
+
+		mutex_unlock(&tr->mutex);
+		return 0;
+	}
+
+	/* Allocate and install new shim. */
+
+	shim_link = cgroup_shim_alloc(prog, bpf_func, cgroup_atype);
+	if (!shim_link) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	err = __bpf_trampoline_link_prog(&shim_link->link, tr);
+	if (err)
+		goto out;
+
+	shim_link->trampoline = tr;
+	/* note, we're still holding tr refcnt from above */
+
+	mutex_unlock(&tr->mutex);
+
+	return 0;
+out:
+	mutex_unlock(&tr->mutex);
+
+	if (shim_link)
+		bpf_link_put(&shim_link->link.link);
+
+	bpf_trampoline_put(tr); /* bpf_trampoline_get above */
+
+	return err;
+}
+
+void bpf_trampoline_unlink_cgroup_shim(struct bpf_prog *prog)
+{
+	struct bpf_shim_tramp_link *shim_link = NULL;
+	struct bpf_trampoline *tr;
+	bpf_func_t bpf_func;
+	u64 key;
+
+	key = bpf_trampoline_compute_key(NULL, prog->aux->attach_btf,
+					 prog->aux->attach_btf_id);
+
+	bpf_lsm_find_cgroup_shim(prog, &bpf_func);
+	tr = bpf_trampoline_lookup(key);
+	if (!tr)
+		return;
+
+	mutex_lock(&tr->mutex);
+	shim_link = cgroup_shim_find(tr, bpf_func);
+	mutex_unlock(&tr->mutex);
+
+	if (shim_link)
+		bpf_link_put(&shim_link->link.link);
+
+	bpf_trampoline_put(tr); /* bpf_trampoline_lookup above */
+}
+#endif
+
 struct bpf_trampoline *bpf_trampoline_get(u64 key,
 					  struct bpf_attach_target_info *tgt_info)
 {
@@ -628,6 +793,31 @@ void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start, struct bpf_tramp_
 	rcu_read_unlock();
 }
 
+u64 notrace __bpf_prog_enter_lsm_cgroup(struct bpf_prog *prog,
+					struct bpf_tramp_run_ctx *run_ctx)
+	__acquires(RCU)
+{
+	/* Runtime stats are exported via actual BPF_LSM_CGROUP
+	 * programs, not the shims.
+	 */
+	rcu_read_lock();
+	migrate_disable();
+
+	run_ctx->saved_run_ctx = bpf_set_run_ctx(&run_ctx->run_ctx);
+
+	return NO_START_TIME;
+}
+
+void notrace __bpf_prog_exit_lsm_cgroup(struct bpf_prog *prog, u64 start,
+					struct bpf_tramp_run_ctx *run_ctx)
+	__releases(RCU)
+{
+	bpf_reset_run_ctx(run_ctx->saved_run_ctx);
+
+	migrate_enable();
+	rcu_read_unlock();
+}
+
 u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog, struct bpf_tramp_run_ctx *run_ctx)
 {
 	rcu_read_lock_trace();
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index aedac2ac02b9..caa5740b39b3 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -7264,6 +7264,18 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 				reg_type_str(env, regs[BPF_REG_1].type));
 			return -EACCES;
 		}
+		break;
+	case BPF_FUNC_set_retval:
+		if (env->prog->expected_attach_type == BPF_LSM_CGROUP) {
+			if (!env->prog->aux->attach_func_proto->type) {
+				/* Make sure programs that attach to void
+				 * hooks don't try to modify return value.
+				 */
+				err = -EINVAL;
+				verbose(env, "BPF_LSM_CGROUP that attach to void LSM hooks can't modify return value!\n");
+			}
+		}
+		break;
 	}
 
 	if (err)
@@ -10474,6 +10486,23 @@ static int check_return_code(struct bpf_verifier_env *env)
 	case BPF_PROG_TYPE_SK_LOOKUP:
 		range = tnum_range(SK_DROP, SK_PASS);
 		break;
+
+	case BPF_PROG_TYPE_LSM:
+		if (env->prog->expected_attach_type == BPF_LSM_CGROUP) {
+			if (!env->prog->aux->attach_func_proto->type) {
+				/* Make sure programs that attach to void
+				 * hooks don't try to modify return value.
+				 */
+				range = tnum_range(1, 1);
+			}
+		} else {
+			/* regular BPF_PROG_TYPE_LSM programs can return
+			 * any value.
+			 */
+			return 0;
+		}
+		break;
+
 	case BPF_PROG_TYPE_EXT:
 		/* freplace program can return anything as its return value
 		 * depends on the to-be-replaced kernel func or bpf program.
@@ -10490,6 +10519,8 @@ static int check_return_code(struct bpf_verifier_env *env)
 
 	if (!tnum_in(range, reg->var_off)) {
 		verbose_invalid_scalar(env, reg, &range, "program exit", "R0");
+		if (env->prog->expected_attach_type == BPF_LSM_CGROUP)
+			verbose(env, "BPF_LSM_CGROUP that attach to void LSM hooks can't modify return value!\n");
 		return -EINVAL;
 	}
 
@@ -14713,6 +14744,7 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
 		fallthrough;
 	case BPF_MODIFY_RETURN:
 	case BPF_LSM_MAC:
+	case BPF_LSM_CGROUP:
 	case BPF_TRACE_FENTRY:
 	case BPF_TRACE_FEXIT:
 		if (!btf_type_is_func(t)) {
diff --git a/tools/include/linux/btf_ids.h b/tools/include/linux/btf_ids.h
index 57890b357f85..2345b502b439 100644
--- a/tools/include/linux/btf_ids.h
+++ b/tools/include/linux/btf_ids.h
@@ -172,7 +172,9 @@ extern struct btf_id_set name;
 	BTF_SOCK_TYPE(BTF_SOCK_TYPE_TCP_TW, tcp_timewait_sock)		\
 	BTF_SOCK_TYPE(BTF_SOCK_TYPE_TCP6, tcp6_sock)			\
 	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UDP, udp_sock)			\
-	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UDP6, udp6_sock)
+	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UDP6, udp6_sock)			\
+	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UNIX, unix_sock)			\
+	BTF_SOCK_TYPE(BTF_SOCK_TYPE_SOCKET, socket)
 
 enum {
 #define BTF_SOCK_TYPE(name, str) name,
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index f4009dbdf62d..fa64b0b612fd 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -998,6 +998,7 @@ enum bpf_attach_type {
 	BPF_SK_REUSEPORT_SELECT_OR_MIGRATE,
 	BPF_PERF_EVENT,
 	BPF_TRACE_KPROBE_MULTI,
+	BPF_LSM_CGROUP,
 	__MAX_BPF_ATTACH_TYPE
 };
 
-- 
2.36.1.255.ge46751e96f-goog


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH bpf-next v8 04/11] bpf: minimize number of allocated lsm slots per program
  2022-06-01 19:02 [PATCH bpf-next v8 00/11] bpf: cgroup_sock lsm flavor Stanislav Fomichev
                   ` (2 preceding siblings ...)
  2022-06-01 19:02 ` [PATCH bpf-next v8 03/11] bpf: per-cgroup lsm flavor Stanislav Fomichev
@ 2022-06-01 19:02 ` Stanislav Fomichev
  2022-06-04  6:35   ` Martin KaFai Lau
  2022-06-01 19:02 ` [PATCH bpf-next v8 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP Stanislav Fomichev
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 35+ messages in thread
From: Stanislav Fomichev @ 2022-06-01 19:02 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

The previous patch adds a 1:1 mapping between all 211 LSM hooks
and the bpf_cgroup program array. Instead of reserving a slot per
possible hook, reserve 10 slots per cgroup for LSM programs.
Those slots are dynamically allocated on demand and reclaimed.

struct cgroup_bpf {
	struct bpf_prog_array *    effective[33];        /*     0   264 */
	/* --- cacheline 4 boundary (256 bytes) was 8 bytes ago --- */
	struct hlist_head          progs[33];            /*   264   264 */
	/* --- cacheline 8 boundary (512 bytes) was 16 bytes ago --- */
	u8                         flags[33];            /*   528    33 */

	/* XXX 7 bytes hole, try to pack */

	struct list_head           storages;             /*   568    16 */
	/* --- cacheline 9 boundary (576 bytes) was 8 bytes ago --- */
	struct bpf_prog_array *    inactive;             /*   584     8 */
	struct percpu_ref          refcnt;               /*   592    16 */
	struct work_struct         release_work;         /*   608    72 */

	/* size: 680, cachelines: 11, members: 7 */
	/* sum members: 673, holes: 1, sum holes: 7 */
	/* last cacheline: 40 bytes */
};

Move 'ifdef CONFIG_CGROUP_BPF' to expose CGROUP_BPF_ATTACH_TYPE_INVALID
to non-cgroup (core) parts.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 include/linux/bpf-cgroup-defs.h |  3 ++-
 include/linux/bpf.h             |  2 ++
 include/linux/bpf_lsm.h         |  6 -----
 kernel/bpf/bpf_lsm.c            |  5 ----
 kernel/bpf/cgroup.c             | 45 ++++++++++++++++++++++++++++++++-
 kernel/bpf/core.c               |  7 +++++
 6 files changed, 55 insertions(+), 13 deletions(-)

diff --git a/include/linux/bpf-cgroup-defs.h b/include/linux/bpf-cgroup-defs.h
index b99f8c3e37ea..7b121bd780eb 100644
--- a/include/linux/bpf-cgroup-defs.h
+++ b/include/linux/bpf-cgroup-defs.h
@@ -11,7 +11,8 @@
 struct bpf_prog_array;
 
 #ifdef CONFIG_BPF_LSM
-#define CGROUP_LSM_NUM 211 /* will be addressed in the next patch */
+/* Maximum number of concurrently attachable per-cgroup LSM hooks. */
+#define CGROUP_LSM_NUM 10
 #else
 #define CGROUP_LSM_NUM 0
 #endif
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 210c4203db78..1a547fd6443c 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -2446,4 +2446,6 @@ void bpf_dynptr_init(struct bpf_dynptr_kern *ptr, void *data,
 void bpf_dynptr_set_null(struct bpf_dynptr_kern *ptr);
 int bpf_dynptr_check_size(u32 size);
 
+void bpf_cgroup_atype_free(int cgroup_atype);
+
 #endif /* _LINUX_BPF_H */
diff --git a/include/linux/bpf_lsm.h b/include/linux/bpf_lsm.h
index 61787a5f6af9..4bcf76a9bb06 100644
--- a/include/linux/bpf_lsm.h
+++ b/include/linux/bpf_lsm.h
@@ -42,7 +42,6 @@ extern const struct bpf_func_proto bpf_inode_storage_get_proto;
 extern const struct bpf_func_proto bpf_inode_storage_delete_proto;
 void bpf_inode_storage_free(struct inode *inode);
 
-int bpf_lsm_hook_idx(u32 btf_id);
 void bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog, bpf_func_t *bpf_func);
 
 #else /* !CONFIG_BPF_LSM */
@@ -73,11 +72,6 @@ static inline void bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
 {
 }
 
-static inline int bpf_lsm_hook_idx(u32 btf_id)
-{
-	return -EINVAL;
-}
-
 #endif /* CONFIG_BPF_LSM */
 
 #endif /* _LINUX_BPF_LSM_H */
diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
index 0f72020bfdcf..83aa431dd52e 100644
--- a/kernel/bpf/bpf_lsm.c
+++ b/kernel/bpf/bpf_lsm.c
@@ -69,11 +69,6 @@ void bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
 		*bpf_func = __cgroup_bpf_run_lsm_current;
 }
 
-int bpf_lsm_hook_idx(u32 btf_id)
-{
-	return btf_id_set_index(&bpf_lsm_hooks, btf_id);
-}
-
 int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
 			const struct bpf_prog *prog)
 {
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 66b644a76a69..a27a6a7bd852 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -129,12 +129,46 @@ unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
 }
 
 #ifdef CONFIG_BPF_LSM
+u32 cgroup_lsm_atype_btf_id[CGROUP_LSM_NUM];
+
 static enum cgroup_bpf_attach_type
 bpf_cgroup_atype_find(enum bpf_attach_type attach_type, u32 attach_btf_id)
 {
+	int i;
+
+	lockdep_assert_held(&cgroup_mutex);
+
 	if (attach_type != BPF_LSM_CGROUP)
 		return to_cgroup_bpf_attach_type(attach_type);
-	return CGROUP_LSM_START + bpf_lsm_hook_idx(attach_btf_id);
+
+	for (i = 0; i < ARRAY_SIZE(cgroup_lsm_atype_btf_id); i++)
+		if (cgroup_lsm_atype_btf_id[i] == attach_btf_id)
+			return CGROUP_LSM_START + i;
+
+	for (i = 0; i < ARRAY_SIZE(cgroup_lsm_atype_btf_id); i++)
+		if (cgroup_lsm_atype_btf_id[i] == 0)
+			return CGROUP_LSM_START + i;
+
+	return -E2BIG;
+
+}
+
+static void bpf_cgroup_atype_alloc(u32 attach_btf_id, int cgroup_atype)
+{
+	int i = cgroup_atype - CGROUP_LSM_START;
+
+	lockdep_assert_held(&cgroup_mutex);
+
+	cgroup_lsm_atype_btf_id[i] = attach_btf_id;
+}
+
+void bpf_cgroup_atype_free(int cgroup_atype)
+{
+	int i = cgroup_atype - CGROUP_LSM_START;
+
+	mutex_lock(&cgroup_mutex);
+	cgroup_lsm_atype_btf_id[i] = 0;
+	mutex_unlock(&cgroup_mutex);
 }
 #else
 static enum cgroup_bpf_attach_type
@@ -144,6 +178,14 @@ bpf_cgroup_atype_find(enum bpf_attach_type attach_type, u32 attach_btf_id)
 		return to_cgroup_bpf_attach_type(attach_type);
 	return -EOPNOTSUPP;
 }
+
+static void bpf_cgroup_atype_alloc(u32 attach_btf_id, int cgroup_atype)
+{
+}
+
+void bpf_cgroup_atype_free(int cgroup_atype)
+{
+}
 #endif /* CONFIG_BPF_LSM */
 
 void cgroup_bpf_offline(struct cgroup *cgrp)
@@ -659,6 +701,7 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 		err = bpf_trampoline_link_cgroup_shim(new_prog, &tgt_info, atype);
 		if (err)
 			goto cleanup;
+		bpf_cgroup_atype_alloc(new_prog->aux->attach_btf_id, atype);
 	}
 
 	err = update_effective_progs(cgrp, atype);
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 091ee210842f..224bb4d4fe4e 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -107,6 +107,9 @@ struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flag
 	fp->aux->prog = fp;
 	fp->jit_requested = ebpf_jit_enabled();
 	fp->blinding_requested = bpf_jit_blinding_enabled(fp);
+#ifdef CONFIG_BPF_LSM
+	aux->cgroup_atype = CGROUP_BPF_ATTACH_TYPE_INVALID;
+#endif
 
 	INIT_LIST_HEAD_RCU(&fp->aux->ksym.lnode);
 	mutex_init(&fp->aux->used_maps_mutex);
@@ -2558,6 +2561,10 @@ static void bpf_prog_free_deferred(struct work_struct *work)
 	aux = container_of(work, struct bpf_prog_aux, work);
 #ifdef CONFIG_BPF_SYSCALL
 	bpf_free_kfunc_btf_tab(aux->kfunc_btf_tab);
+#endif
+#ifdef CONFIG_BPF_LSM
+	if (aux->cgroup_atype != CGROUP_BPF_ATTACH_TYPE_INVALID)
+		bpf_cgroup_atype_free(aux->cgroup_atype);
 #endif
 	bpf_free_used_maps(aux);
 	bpf_free_used_btfs(aux);
-- 
2.36.1.255.ge46751e96f-goog


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH bpf-next v8 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-06-01 19:02 [PATCH bpf-next v8 00/11] bpf: cgroup_sock lsm flavor Stanislav Fomichev
                   ` (3 preceding siblings ...)
  2022-06-01 19:02 ` [PATCH bpf-next v8 04/11] bpf: minimize number of allocated lsm slots per program Stanislav Fomichev
@ 2022-06-01 19:02 ` Stanislav Fomichev
  2022-06-04  6:59   ` Martin KaFai Lau
  2022-06-01 19:02 ` [PATCH bpf-next v8 06/11] bpf: allow writing to a subset of sock fields from lsm progtype Stanislav Fomichev
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 35+ messages in thread
From: Stanislav Fomichev @ 2022-06-01 19:02 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

We have two options:
1. Treat all BPF_LSM_CGROUP the same, regardless of attach_btf_id
2. Treat BPF_LSM_CGROUP+attach_btf_id as a separate hook point

I was doing (2) in the original patch, but switching to (1) here:

* bpf_prog_query returns all attached BPF_LSM_CGROUP programs
regardless of attach_btf_id
* attach_btf_id is exported via bpf_prog_info

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 include/uapi/linux/bpf.h |   3 ++
 kernel/bpf/cgroup.c      | 103 +++++++++++++++++++++++++++------------
 kernel/bpf/syscall.c     |   6 ++-
 3 files changed, 81 insertions(+), 31 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index fa64b0b612fd..4271ef3c2afb 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1432,6 +1432,7 @@ union bpf_attr {
 		__u32		attach_flags;
 		__aligned_u64	prog_ids;
 		__u32		prog_cnt;
+		__aligned_u64	prog_attach_flags; /* output: per-program attach_flags */
 	} query;
 
 	struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
@@ -5996,6 +5997,8 @@ struct bpf_prog_info {
 	__u64 run_cnt;
 	__u64 recursion_misses;
 	__u32 verified_insns;
+	__u32 attach_btf_obj_id;
+	__u32 attach_btf_id;
 } __attribute__((aligned(8)));
 
 struct bpf_map_info {
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index a27a6a7bd852..cb3338ef01e0 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -1035,6 +1035,7 @@ static int cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
 static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
 			      union bpf_attr __user *uattr)
 {
+	__u32 __user *prog_attach_flags = u64_to_user_ptr(attr->query.prog_attach_flags);
 	__u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids);
 	enum bpf_attach_type type = attr->query.attach_type;
 	enum cgroup_bpf_attach_type atype;
@@ -1042,50 +1043,92 @@ static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
 	struct hlist_head *progs;
 	struct bpf_prog *prog;
 	int cnt, ret = 0, i;
+	int total_cnt = 0;
 	u32 flags;
 
-	atype = to_cgroup_bpf_attach_type(type);
-	if (atype < 0)
-		return -EINVAL;
+	enum cgroup_bpf_attach_type from_atype, to_atype;
 
-	progs = &cgrp->bpf.progs[atype];
-	flags = cgrp->bpf.flags[atype];
+	if (type == BPF_LSM_CGROUP) {
+		from_atype = CGROUP_LSM_START;
+		to_atype = CGROUP_LSM_END;
+	} else {
+		from_atype = to_cgroup_bpf_attach_type(type);
+		if (from_atype < 0)
+			return -EINVAL;
+		to_atype = from_atype;
+	}
 
-	effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
-					      lockdep_is_held(&cgroup_mutex));
+	for (atype = from_atype; atype <= to_atype; atype++) {
+		progs = &cgrp->bpf.progs[atype];
+		flags = cgrp->bpf.flags[atype];
 
-	if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
-		cnt = bpf_prog_array_length(effective);
-	else
-		cnt = prog_list_length(progs);
+		effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
+						      lockdep_is_held(&cgroup_mutex));
 
-	if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
-		return -EFAULT;
-	if (copy_to_user(&uattr->query.prog_cnt, &cnt, sizeof(cnt)))
+		if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
+			total_cnt += bpf_prog_array_length(effective);
+		else
+			total_cnt += prog_list_length(progs);
+	}
+
+	if (type != BPF_LSM_CGROUP)
+		if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
+			return -EFAULT;
+	if (copy_to_user(&uattr->query.prog_cnt, &total_cnt, sizeof(total_cnt)))
 		return -EFAULT;
-	if (attr->query.prog_cnt == 0 || !prog_ids || !cnt)
+	if (attr->query.prog_cnt == 0 || !prog_ids || !total_cnt)
 		/* return early if user requested only program count + flags */
 		return 0;
-	if (attr->query.prog_cnt < cnt) {
-		cnt = attr->query.prog_cnt;
+
+	if (attr->query.prog_cnt < total_cnt) {
+		total_cnt = attr->query.prog_cnt;
 		ret = -ENOSPC;
 	}
 
-	if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
-		return bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
-	} else {
-		struct bpf_prog_list *pl;
-		u32 id;
+	for (atype = from_atype; atype <= to_atype; atype++) {
+		if (total_cnt <= 0)
+			break;
 
-		i = 0;
-		hlist_for_each_entry(pl, progs, node) {
-			prog = prog_list_prog(pl);
-			id = prog->aux->id;
-			if (copy_to_user(prog_ids + i, &id, sizeof(id)))
-				return -EFAULT;
-			if (++i == cnt)
-				break;
+		progs = &cgrp->bpf.progs[atype];
+		flags = cgrp->bpf.flags[atype];
+
+		effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
+						      lockdep_is_held(&cgroup_mutex));
+
+		if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
+			cnt = bpf_prog_array_length(effective);
+		else
+			cnt = prog_list_length(progs);
+
+		if (cnt >= total_cnt)
+			cnt = total_cnt;
+
+		if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
+			ret = bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
+		} else {
+			struct bpf_prog_list *pl;
+			u32 id;
+
+			i = 0;
+			hlist_for_each_entry(pl, progs, node) {
+				prog = prog_list_prog(pl);
+				id = prog->aux->id;
+				if (copy_to_user(prog_ids + i, &id, sizeof(id)))
+					return -EFAULT;
+				if (++i == cnt)
+					break;
+			}
 		}
+
+		if (prog_attach_flags)
+			for (i = 0; i < cnt; i++)
+				if (copy_to_user(prog_attach_flags + i, &flags, sizeof(flags)))
+					return -EFAULT;
+
+		prog_ids += cnt;
+		total_cnt -= cnt;
+		if (prog_attach_flags)
+			prog_attach_flags += cnt;
 	}
 	return ret;
 }
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index a237be4f8bb3..27492d44133f 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3520,7 +3520,7 @@ static int bpf_prog_detach(const union bpf_attr *attr)
 	}
 }
 
-#define BPF_PROG_QUERY_LAST_FIELD query.prog_cnt
+#define BPF_PROG_QUERY_LAST_FIELD query.prog_attach_flags
 
 static int bpf_prog_query(const union bpf_attr *attr,
 			  union bpf_attr __user *uattr)
@@ -3556,6 +3556,7 @@ static int bpf_prog_query(const union bpf_attr *attr,
 	case BPF_CGROUP_SYSCTL:
 	case BPF_CGROUP_GETSOCKOPT:
 	case BPF_CGROUP_SETSOCKOPT:
+	case BPF_LSM_CGROUP:
 		return cgroup_bpf_prog_query(attr, uattr);
 	case BPF_LIRC_MODE2:
 		return lirc_prog_query(attr, uattr);
@@ -4066,6 +4067,9 @@ static int bpf_prog_get_info_by_fd(struct file *file,
 
 	if (prog->aux->btf)
 		info.btf_id = btf_obj_id(prog->aux->btf);
+	info.attach_btf_id = prog->aux->attach_btf_id;
+	if (prog->aux->attach_btf)
+		info.attach_btf_obj_id = btf_obj_id(prog->aux->attach_btf);
 
 	ulen = info.nr_func_info;
 	info.nr_func_info = prog->aux->func_info_cnt;
-- 
2.36.1.255.ge46751e96f-goog


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH bpf-next v8 06/11] bpf: allow writing to a subset of sock fields from lsm progtype
  2022-06-01 19:02 [PATCH bpf-next v8 00/11] bpf: cgroup_sock lsm flavor Stanislav Fomichev
                   ` (4 preceding siblings ...)
  2022-06-01 19:02 ` [PATCH bpf-next v8 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP Stanislav Fomichev
@ 2022-06-01 19:02 ` Stanislav Fomichev
  2022-06-04  7:18   ` Martin KaFai Lau
  2022-06-01 19:02 ` [PATCH bpf-next v8 07/11] libbpf: add lsm_cgroup_sock type Stanislav Fomichev
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 35+ messages in thread
From: Stanislav Fomichev @ 2022-06-01 19:02 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

For now, allow only the obvious ones, like sk_priority and sk_mark.
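
For illustration, a minimal lsm_cgroup program exercising such a write
(hook and value are arbitrary; this mirrors what the selftest in patch
10 does):

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

char _license[] SEC("license") = "GPL";

SEC("lsm_cgroup/socket_post_create")
int BPF_PROG(set_prio, struct socket *sock, int family, int type,
	     int protocol, int kern)
{
	struct sock *sk = sock->sk;

	if (!sk)
		return 1; /* allow */

	/* Direct write through the BTF pointer, now permitted. */
	sk->sk_priority = 42;
	return 1; /* allow */
}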

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 kernel/bpf/bpf_lsm.c  | 58 +++++++++++++++++++++++++++++++++++++++++++
 kernel/bpf/verifier.c |  3 ++-
 2 files changed, 60 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
index 83aa431dd52e..feba8e96f58d 100644
--- a/kernel/bpf/bpf_lsm.c
+++ b/kernel/bpf/bpf_lsm.c
@@ -303,7 +303,65 @@ bool bpf_lsm_is_sleepable_hook(u32 btf_id)
 const struct bpf_prog_ops lsm_prog_ops = {
 };
 
+static int lsm_btf_struct_access(struct bpf_verifier_log *log,
+					const struct btf *btf,
+					const struct btf_type *t, int off,
+					int size, enum bpf_access_type atype,
+					u32 *next_btf_id,
+					enum bpf_type_flag *flag)
+{
+	const struct btf_type *sock_type;
+	struct btf *btf_vmlinux;
+	s32 type_id;
+	size_t end;
+
+	if (atype == BPF_READ)
+		return btf_struct_access(log, btf, t, off, size, atype, next_btf_id,
+					 flag);
+
+	btf_vmlinux = bpf_get_btf_vmlinux();
+	if (!btf_vmlinux) {
+		bpf_log(log, "no vmlinux btf\n");
+		return -EOPNOTSUPP;
+	}
+
+	type_id = btf_find_by_name_kind(btf_vmlinux, "sock", BTF_KIND_STRUCT);
+	if (type_id < 0) {
+		bpf_log(log, "'struct sock' not found in vmlinux btf\n");
+		return -EINVAL;
+	}
+
+	sock_type = btf_type_by_id(btf_vmlinux, type_id);
+
+	if (t != sock_type) {
+		bpf_log(log, "only 'struct sock' writes are supported\n");
+		return -EACCES;
+	}
+
+	switch (off) {
+	case bpf_ctx_range(struct sock, sk_priority):
+		end = offsetofend(struct sock, sk_priority);
+		break;
+	case bpf_ctx_range(struct sock, sk_mark):
+		end = offsetofend(struct sock, sk_mark);
+		break;
+	default:
+		bpf_log(log, "no write support to 'struct sock' at off %d\n", off);
+		return -EACCES;
+	}
+
+	if (off + size > end) {
+		bpf_log(log,
+			"write access at off %d with size %d beyond the member of 'struct sock' ended at %zu\n",
+			off, size, end);
+		return -EACCES;
+	}
+
+	return NOT_INIT;
+}
+
 const struct bpf_verifier_ops lsm_verifier_ops = {
 	.get_func_proto = bpf_lsm_func_proto,
 	.is_valid_access = btf_ctx_access,
+	.btf_struct_access = lsm_btf_struct_access,
 };
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index caa5740b39b3..c54e171d9c23 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13413,7 +13413,8 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
 				insn->code = BPF_LDX | BPF_PROBE_MEM |
 					BPF_SIZE((insn)->code);
 				env->prog->aux->num_exentries++;
-			} else if (resolve_prog_type(env->prog) != BPF_PROG_TYPE_STRUCT_OPS) {
+			} else if (resolve_prog_type(env->prog) != BPF_PROG_TYPE_STRUCT_OPS &&
+				   resolve_prog_type(env->prog) != BPF_PROG_TYPE_LSM) {
 				verbose(env, "Writes through BTF pointers are not allowed\n");
 				return -EINVAL;
 			}
-- 
2.36.1.255.ge46751e96f-goog


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH bpf-next v8 07/11] libbpf: add lsm_cgroup_sock type
  2022-06-01 19:02 [PATCH bpf-next v8 00/11] bpf: cgroup_sock lsm flavor Stanislav Fomichev
                   ` (5 preceding siblings ...)
  2022-06-01 19:02 ` [PATCH bpf-next v8 06/11] bpf: allow writing to a subset of sock fields from lsm progtype Stanislav Fomichev
@ 2022-06-01 19:02 ` Stanislav Fomichev
  2022-06-03 21:30   ` Andrii Nakryiko
  2022-06-01 19:02 ` [PATCH bpf-next v8 08/11] libbpf: implement bpf_prog_query_opts Stanislav Fomichev
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 35+ messages in thread
From: Stanislav Fomichev @ 2022-06-01 19:02 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

lsm_cgroup/ is the prefix for BPF_LSM_CGROUP.
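
There is no auto-attach for this section type; attachment goes through
the cgroup APIs. A rough sketch (function and variable names are made
up):

#include <bpf/bpf.h>
#include <bpf/libbpf.h>

/* Attach an lsm_cgroup program to a cgroup fd. */
static int attach_lsm_cgroup(struct bpf_program *prog, int cgroup_fd)
{
	int prog_fd = bpf_program__fd(prog);

	/* Link-based alternative:
	 * bpf_link_create(prog_fd, cgroup_fd, BPF_LSM_CGROUP, NULL);
	 */
	return bpf_prog_attach(prog_fd, cgroup_fd, BPF_LSM_CGROUP, 0);
}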

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/lib/bpf/libbpf.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 5afe4cbd684f..a4209a1ad02f 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -107,6 +107,7 @@ static const char * const attach_type_name[] = {
 	[BPF_TRACE_FEXIT]		= "trace_fexit",
 	[BPF_MODIFY_RETURN]		= "modify_return",
 	[BPF_LSM_MAC]			= "lsm_mac",
+	[BPF_LSM_CGROUP]		= "lsm_cgroup",
 	[BPF_SK_LOOKUP]			= "sk_lookup",
 	[BPF_TRACE_ITER]		= "trace_iter",
 	[BPF_XDP_DEVMAP]		= "xdp_devmap",
@@ -9157,6 +9158,7 @@ static const struct bpf_sec_def section_defs[] = {
 	SEC_DEF("freplace+",		EXT, 0, SEC_ATTACH_BTF, attach_trace),
 	SEC_DEF("lsm+",			LSM, BPF_LSM_MAC, SEC_ATTACH_BTF, attach_lsm),
 	SEC_DEF("lsm.s+",		LSM, BPF_LSM_MAC, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_lsm),
+	SEC_DEF("lsm_cgroup+",		LSM, BPF_LSM_CGROUP, SEC_ATTACH_BTF),
 	SEC_DEF("iter+",		TRACING, BPF_TRACE_ITER, SEC_ATTACH_BTF, attach_iter),
 	SEC_DEF("iter.s+",		TRACING, BPF_TRACE_ITER, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_iter),
 	SEC_DEF("syscall",		SYSCALL, 0, SEC_SLEEPABLE),
@@ -9610,6 +9612,7 @@ void btf_get_kernel_prefix_kind(enum bpf_attach_type attach_type,
 		*kind = BTF_KIND_TYPEDEF;
 		break;
 	case BPF_LSM_MAC:
+	case BPF_LSM_CGROUP:
 		*prefix = BTF_LSM_PREFIX;
 		*kind = BTF_KIND_FUNC;
 		break;
-- 
2.36.1.255.ge46751e96f-goog


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH bpf-next v8 08/11] libbpf: implement bpf_prog_query_opts
  2022-06-01 19:02 [PATCH bpf-next v8 00/11] bpf: cgroup_sock lsm flavor Stanislav Fomichev
                   ` (6 preceding siblings ...)
  2022-06-01 19:02 ` [PATCH bpf-next v8 07/11] libbpf: add lsm_cgroup_sock type Stanislav Fomichev
@ 2022-06-01 19:02 ` Stanislav Fomichev
  2022-06-03 21:34   ` Andrii Nakryiko
  2022-06-01 19:02 ` [PATCH bpf-next v8 09/11] bpftool: implement cgroup tree for BPF_LSM_CGROUP Stanislav Fomichev
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 35+ messages in thread
From: Stanislav Fomichev @ 2022-06-01 19:02 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

Implement bpf_prog_query_opts as a more extensible version of
bpf_prog_query. Expose the new prog_attach_flags and attach_btf_id as
well:

* prog_attach_flags is a per-program set of attach flags; relevant only
  for lsm cgroup programs, which might have different attach_flags
  per attach_btf_id
* attach_btf_id (together with attach_btf_obj_id) is a new field
  exposed via bpf_prog_info which specifies the real btf function id
  for lsm cgroup attachments
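
A rough usage sketch (buffer sizes and names are illustrative):

#include <bpf/bpf.h>

/* Query all BPF_LSM_CGROUP programs attached to a cgroup, together
 * with their per-program attach flags.
 */
static int query_lsm_cgroup(int cgroup_fd)
{
	__u32 ids[64], flags[64];
	LIBBPF_OPTS(bpf_prog_query_opts, opts,
		.prog_ids = ids,
		.prog_attach_flags = flags,
		.prog_cnt = 64,
	);
	int err;

	err = bpf_prog_query_opts(cgroup_fd, BPF_LSM_CGROUP, &opts);
	if (err)
		return err;

	/* opts.prog_cnt now holds the number of entries filled in;
	 * ids[i]/flags[i] describe each attached program.
	 */
	return opts.prog_cnt;
}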

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/include/uapi/linux/bpf.h |  3 +++
 tools/lib/bpf/bpf.c            | 40 +++++++++++++++++++++++++++-------
 tools/lib/bpf/bpf.h            | 15 +++++++++++++
 tools/lib/bpf/libbpf.map       |  2 +-
 4 files changed, 51 insertions(+), 9 deletions(-)

diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index fa64b0b612fd..4271ef3c2afb 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1432,6 +1432,7 @@ union bpf_attr {
 		__u32		attach_flags;
 		__aligned_u64	prog_ids;
 		__u32		prog_cnt;
+		__aligned_u64	prog_attach_flags; /* output: per-program attach_flags */
 	} query;
 
 	struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
@@ -5996,6 +5997,8 @@ struct bpf_prog_info {
 	__u64 run_cnt;
 	__u64 recursion_misses;
 	__u32 verified_insns;
+	__u32 attach_btf_obj_id;
+	__u32 attach_btf_id;
 } __attribute__((aligned(8)));
 
 struct bpf_map_info {
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 240186aac8e6..c7af7db53725 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -888,28 +888,52 @@ int bpf_iter_create(int link_fd)
 	return libbpf_err_errno(fd);
 }
 
-int bpf_prog_query(int target_fd, enum bpf_attach_type type, __u32 query_flags,
-		   __u32 *attach_flags, __u32 *prog_ids, __u32 *prog_cnt)
+int bpf_prog_query_opts(int target_fd,
+			enum bpf_attach_type type,
+			struct bpf_prog_query_opts *opts)
 {
 	union bpf_attr attr;
 	int ret;
 
 	memset(&attr, 0, sizeof(attr));
+
+	if (!OPTS_VALID(opts, bpf_prog_query_opts))
+		return libbpf_err(-EINVAL);
+
 	attr.query.target_fd	= target_fd;
 	attr.query.attach_type	= type;
-	attr.query.query_flags	= query_flags;
-	attr.query.prog_cnt	= *prog_cnt;
-	attr.query.prog_ids	= ptr_to_u64(prog_ids);
+	attr.query.query_flags	= OPTS_GET(opts, query_flags, 0);
+	attr.query.prog_cnt	= OPTS_GET(opts, prog_cnt, 0);
+	attr.query.prog_ids	= ptr_to_u64(OPTS_GET(opts, prog_ids, NULL));
+	attr.query.prog_attach_flags = ptr_to_u64(OPTS_GET(opts, prog_attach_flags, NULL));
 
 	ret = sys_bpf(BPF_PROG_QUERY, &attr, sizeof(attr));
 
-	if (attach_flags)
-		*attach_flags = attr.query.attach_flags;
-	*prog_cnt = attr.query.prog_cnt;
+	OPTS_SET(opts, attach_flags, attr.query.attach_flags);
+	OPTS_SET(opts, prog_cnt, attr.query.prog_cnt);
 
 	return libbpf_err_errno(ret);
 }
 
+int bpf_prog_query(int target_fd, enum bpf_attach_type type, __u32 query_flags,
+		   __u32 *attach_flags, __u32 *prog_ids, __u32 *prog_cnt)
+{
+	LIBBPF_OPTS(bpf_prog_query_opts, p);
+	int ret;
+
+	p.query_flags = query_flags;
+	p.prog_ids = prog_ids;
+	p.prog_cnt = *prog_cnt;
+
+	ret = bpf_prog_query_opts(target_fd, type, &p);
+
+	if (attach_flags)
+		*attach_flags = p.attach_flags;
+	*prog_cnt = p.prog_cnt;
+
+	return ret;
+}
+
 int bpf_prog_test_run(int prog_fd, int repeat, void *data, __u32 size,
 		      void *data_out, __u32 *size_out, __u32 *retval,
 		      __u32 *duration)
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index cabc03703e29..e8f70ce6b537 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -442,9 +442,24 @@ LIBBPF_API int bpf_map_get_fd_by_id(__u32 id);
 LIBBPF_API int bpf_btf_get_fd_by_id(__u32 id);
 LIBBPF_API int bpf_link_get_fd_by_id(__u32 id);
 LIBBPF_API int bpf_obj_get_info_by_fd(int bpf_fd, void *info, __u32 *info_len);
+
+struct bpf_prog_query_opts {
+	size_t sz; /* size of this struct for forward/backward compatibility */
+	__u32 query_flags;
+	__u32 attach_flags; /* output argument */
+	__u32 *prog_ids;
+	__u32 prog_cnt; /* input+output argument */
+	__u32 *prog_attach_flags;
+};
+#define bpf_prog_query_opts__last_field prog_attach_flags
+
+LIBBPF_API int bpf_prog_query_opts(int target_fd,
+				   enum bpf_attach_type type,
+				   struct bpf_prog_query_opts *opts);
 LIBBPF_API int bpf_prog_query(int target_fd, enum bpf_attach_type type,
 			      __u32 query_flags, __u32 *attach_flags,
 			      __u32 *prog_ids, __u32 *prog_cnt);
+
 LIBBPF_API int bpf_raw_tracepoint_open(const char *name, int prog_fd);
 LIBBPF_API int bpf_task_fd_query(int pid, int fd, __u32 flags, char *buf,
 				 __u32 *buf_len, __u32 *prog_id, __u32 *fd_type,
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index 38e284ff057d..f2f713161775 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -462,10 +462,10 @@ LIBBPF_0.8.0 {
 
 LIBBPF_1.0.0 {
 	global:
+		bpf_prog_query_opts;
 		libbpf_bpf_attach_type_str;
 		libbpf_bpf_link_type_str;
 		libbpf_bpf_map_type_str;
 		libbpf_bpf_prog_type_str;
-
 	local: *;
 };
-- 
2.36.1.255.ge46751e96f-goog


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH bpf-next v8 09/11] bpftool: implement cgroup tree for BPF_LSM_CGROUP
  2022-06-01 19:02 [PATCH bpf-next v8 00/11] bpf: cgroup_sock lsm flavor Stanislav Fomichev
                   ` (7 preceding siblings ...)
  2022-06-01 19:02 ` [PATCH bpf-next v8 08/11] libbpf: implement bpf_prog_query_opts Stanislav Fomichev
@ 2022-06-01 19:02 ` Stanislav Fomichev
  2022-06-01 19:02 ` [PATCH bpf-next v8 10/11] selftests/bpf: lsm_cgroup functional test Stanislav Fomichev
  2022-06-01 19:02 ` [PATCH bpf-next v8 11/11] selftests/bpf: verify lsm_cgroup struct sock access Stanislav Fomichev
  10 siblings, 0 replies; 35+ messages in thread
From: Stanislav Fomichev @ 2022-06-01 19:02 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

$ bpftool --nomount prog loadall $KDIR/tools/testing/selftests/bpf/lsm_cgroup.o /sys/fs/bpf/x
$ bpftool cgroup attach /sys/fs/cgroup lsm_cgroup pinned /sys/fs/bpf/x/socket_alloc
$ bpftool cgroup attach /sys/fs/cgroup lsm_cgroup pinned /sys/fs/bpf/x/socket_bind
$ bpftool cgroup attach /sys/fs/cgroup lsm_cgroup pinned /sys/fs/bpf/x/socket_clone
$ bpftool cgroup attach /sys/fs/cgroup lsm_cgroup pinned /sys/fs/bpf/x/socket_post_create
$ bpftool cgroup tree
CgroupPath
ID       AttachType      AttachFlags     Name
/sys/fs/cgroup
6        lsm_cgroup                      socket_post_create bpf_lsm_socket_post_create
8        lsm_cgroup                      socket_bind     bpf_lsm_socket_bind
10       lsm_cgroup                      socket_alloc    bpf_lsm_sk_alloc_security
11       lsm_cgroup                      socket_clone    bpf_lsm_inet_csk_clone

$ bpftool cgroup detach /sys/fs/cgroup lsm_cgroup pinned /sys/fs/bpf/x/socket_post_create
$ bpftool cgroup tree
CgroupPath
ID       AttachType      AttachFlags     Name
/sys/fs/cgroup
8        lsm_cgroup                      socket_bind     bpf_lsm_socket_bind
10       lsm_cgroup                      socket_alloc    bpf_lsm_sk_alloc_security
11       lsm_cgroup                      socket_clone    bpf_lsm_inet_csk_clone

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/bpf/bpftool/cgroup.c | 79 +++++++++++++++++++++++++++-----------
 1 file changed, 57 insertions(+), 22 deletions(-)

diff --git a/tools/bpf/bpftool/cgroup.c b/tools/bpf/bpftool/cgroup.c
index 42421fe47a58..28f0f16b96ce 100644
--- a/tools/bpf/bpftool/cgroup.c
+++ b/tools/bpf/bpftool/cgroup.c
@@ -15,6 +15,7 @@
 #include <unistd.h>
 
 #include <bpf/bpf.h>
+#include <bpf/btf.h>
 
 #include "main.h"
 
@@ -36,6 +37,7 @@
 	"                        cgroup_inet_sock_release }"
 
 static unsigned int query_flags;
+static struct btf *btf_vmlinux;
 
 static enum bpf_attach_type parse_attach_type(const char *str)
 {
@@ -69,6 +71,7 @@ static int show_bpf_prog(int id, enum bpf_attach_type attach_type,
 			 int level)
 {
 	char prog_name[MAX_PROG_FULL_NAME];
+	const char *attach_btf_name = NULL;
 	struct bpf_prog_info info = {};
 	const char *attach_type_str;
 	__u32 info_len = sizeof(info);
@@ -84,6 +87,19 @@ static int show_bpf_prog(int id, enum bpf_attach_type attach_type,
 	}
 
 	attach_type_str = libbpf_bpf_attach_type_str(attach_type);
+
+	if (btf_vmlinux &&
+	    info.attach_btf_id < btf__type_cnt(btf_vmlinux)) {
+		/* Note, we ignore info.attach_btf_obj_id for now. There
+		 * is no good way to resolve btf_id to vmlinux
+		 * or module btf.
+		 */
+		const struct btf_type *t = btf__type_by_id(btf_vmlinux,
+							   info.attach_btf_id);
+		attach_btf_name = btf__name_by_offset(btf_vmlinux,
+						      t->name_off);
+	}
+
 	get_prog_full_name(&info, prog_fd, prog_name, sizeof(prog_name));
 	if (json_output) {
 		jsonw_start_object(json_wtr);
@@ -95,6 +111,10 @@ static int show_bpf_prog(int id, enum bpf_attach_type attach_type,
 		jsonw_string_field(json_wtr, "attach_flags",
 				   attach_flags_str);
 		jsonw_string_field(json_wtr, "name", prog_name);
+		if (attach_btf_name)
+			jsonw_string_field(json_wtr, "attach_btf_name", attach_btf_name);
+		jsonw_uint_field(json_wtr, "attach_btf_obj_id", info.attach_btf_obj_id);
+		jsonw_uint_field(json_wtr, "attach_btf_id", info.attach_btf_id);
 		jsonw_end_object(json_wtr);
 	} else {
 		printf("%s%-8u ", level ? "    " : "", info.id);
@@ -102,7 +122,13 @@ static int show_bpf_prog(int id, enum bpf_attach_type attach_type,
 			printf("%-15s", attach_type_str);
 		else
 			printf("type %-10u", attach_type);
-		printf(" %-15s %-15s\n", attach_flags_str, prog_name);
+		printf(" %-15s %-15s", attach_flags_str, prog_name);
+		if (attach_btf_name)
+			printf(" %-15s", attach_btf_name);
+		else if (info.attach_btf_id)
+			printf(" attach_btf_obj_id=%d attach_btf_id=%d",
+			       info.attach_btf_obj_id, info.attach_btf_id);
+		printf("\n");
 	}
 
 	close(prog_fd);
@@ -144,40 +170,48 @@ static int cgroup_has_attached_progs(int cgroup_fd)
 static int show_attached_bpf_progs(int cgroup_fd, enum bpf_attach_type type,
 				   int level)
 {
+	LIBBPF_OPTS(bpf_prog_query_opts, p);
 	const char *attach_flags_str;
 	__u32 prog_ids[1024] = {0};
-	__u32 prog_cnt, iter;
-	__u32 attach_flags;
+	__u32 attach_prog_flags[1024] = {0};
 	char buf[32];
+	__u32 iter;
 	int ret;
 
-	prog_cnt = ARRAY_SIZE(prog_ids);
-	ret = bpf_prog_query(cgroup_fd, type, query_flags, &attach_flags,
-			     prog_ids, &prog_cnt);
+	p.query_flags = query_flags;
+	p.prog_cnt = ARRAY_SIZE(prog_ids);
+	p.prog_ids = prog_ids;
+
+	ret = bpf_prog_query_opts(cgroup_fd, type, &p);
 	if (ret)
 		return ret;
 
-	if (prog_cnt == 0)
+	if (p.prog_cnt == 0)
 		return 0;
 
-	switch (attach_flags) {
-	case BPF_F_ALLOW_MULTI:
-		attach_flags_str = "multi";
-		break;
-	case BPF_F_ALLOW_OVERRIDE:
-		attach_flags_str = "override";
-		break;
-	case 0:
-		attach_flags_str = "";
-		break;
-	default:
-		snprintf(buf, sizeof(buf), "unknown(%x)", attach_flags);
-		attach_flags_str = buf;
-	}
+	for (iter = 0; iter < p.prog_cnt; iter++) {
+		__u32 attach_flags;
+
+		attach_flags = attach_prog_flags[iter] ?: p.attach_flags;
+
+		switch (attach_flags) {
+		case BPF_F_ALLOW_MULTI:
+			attach_flags_str = "multi";
+			break;
+		case BPF_F_ALLOW_OVERRIDE:
+			attach_flags_str = "override";
+			break;
+		case 0:
+			attach_flags_str = "";
+			break;
+		default:
+			snprintf(buf, sizeof(buf), "unknown(%x)", attach_flags);
+			attach_flags_str = buf;
+		}
 
-	for (iter = 0; iter < prog_cnt; iter++)
 		show_bpf_prog(prog_ids[iter], type,
 			      attach_flags_str, level);
+	}
 
 	return 0;
 }
@@ -542,5 +576,6 @@ static const struct cmd cmds[] = {
 
 int do_cgroup(int argc, char **argv)
 {
+	btf_vmlinux = libbpf_find_kernel_btf();
 	return cmd_select(cmds, argc, argv, do_help);
 }
-- 
2.36.1.255.ge46751e96f-goog


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH bpf-next v8 10/11] selftests/bpf: lsm_cgroup functional test
  2022-06-01 19:02 [PATCH bpf-next v8 00/11] bpf: cgroup_sock lsm flavor Stanislav Fomichev
                   ` (8 preceding siblings ...)
  2022-06-01 19:02 ` [PATCH bpf-next v8 09/11] bpftool: implement cgroup tree for BPF_LSM_CGROUP Stanislav Fomichev
@ 2022-06-01 19:02 ` Stanislav Fomichev
  2022-06-01 19:02 ` [PATCH bpf-next v8 11/11] selftests/bpf: verify lsm_cgroup struct sock access Stanislav Fomichev
  10 siblings, 0 replies; 35+ messages in thread
From: Stanislav Fomichev @ 2022-06-01 19:02 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

Functional test that exercises the following:

1. apply default sk_priority policy
2. permit TX-only AF_PACKET socket
3. cgroup attach/detach/replace
4. reusing trampoline shim

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 .../selftests/bpf/prog_tests/lsm_cgroup.c     | 277 ++++++++++++++++++
 .../testing/selftests/bpf/progs/lsm_cgroup.c  | 160 ++++++++++
 2 files changed, 437 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
 create mode 100644 tools/testing/selftests/bpf/progs/lsm_cgroup.c

diff --git a/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c b/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
new file mode 100644
index 000000000000..a96057ec7dd4
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
@@ -0,0 +1,277 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <test_progs.h>
+#include <bpf/btf.h>
+
+#include "lsm_cgroup.skel.h"
+#include "cgroup_helpers.h"
+#include "network_helpers.h"
+
+static __u32 query_prog_cnt(int cgroup_fd, const char *attach_func)
+{
+	LIBBPF_OPTS(bpf_prog_query_opts, p);
+	static struct btf *btf;
+	int cnt = 0;
+	int i;
+
+	ASSERT_OK(bpf_prog_query_opts(cgroup_fd, BPF_LSM_CGROUP, &p), "prog_query");
+
+	if (!attach_func)
+		return p.prog_cnt;
+
+	/* When attach_func is provided, count the number of progs that
+	 * attach to the given symbol.
+	 */
+
+	if (!btf)
+		btf = btf__load_vmlinux_btf();
+	if (!ASSERT_OK(libbpf_get_error(btf), "btf_vmlinux"))
+		return -1;
+
+	p.prog_ids = malloc(sizeof(u32) * p.prog_cnt);
+	p.prog_attach_flags = malloc(sizeof(u32) * p.prog_cnt);
+	ASSERT_OK(bpf_prog_query_opts(cgroup_fd, BPF_LSM_CGROUP, &p), "prog_query");
+
+	for (i = 0; i < p.prog_cnt; i++) {
+		struct bpf_prog_info info = {};
+		__u32 info_len = sizeof(info);
+		int fd;
+
+		fd = bpf_prog_get_fd_by_id(p.prog_ids[i]);
+		ASSERT_GE(fd, 0, "prog_get_fd_by_id");
+		ASSERT_OK(bpf_obj_get_info_by_fd(fd, &info, &info_len), "prog_info_by_fd");
+		close(fd);
+
+		if (info.attach_btf_id ==
+		    btf__find_by_name_kind(btf, attach_func, BTF_KIND_FUNC))
+			cnt++;
+	}
+
+	return cnt;
+}
+
+static void test_lsm_cgroup_functional(void)
+{
+	DECLARE_LIBBPF_OPTS(bpf_prog_attach_opts, attach_opts);
+	DECLARE_LIBBPF_OPTS(bpf_link_update_opts, update_opts);
+	int cgroup_fd, cgroup_fd2, err, fd, prio;
+	int listen_fd, client_fd, accepted_fd;
+	struct lsm_cgroup *skel = NULL;
+	int post_create_prog_fd2 = -1;
+	int post_create_prog_fd = -1;
+	int bind_link_fd2 = -1;
+	int bind_prog_fd2 = -1;
+	int alloc_prog_fd = -1;
+	int bind_prog_fd = -1;
+	int bind_link_fd = -1;
+	int clone_prog_fd = -1;
+	socklen_t socklen;
+
+	cgroup_fd = test__join_cgroup("/sock_policy");
+	if (!ASSERT_GE(cgroup_fd, 0, "join_cgroup"))
+		goto close_skel;
+
+	cgroup_fd2 = create_and_get_cgroup("/sock_policy2");
+	if (!ASSERT_GE(cgroup_fd2, 0, "create second cgroup"))
+		goto close_skel;
+
+	skel = lsm_cgroup__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "open_and_load"))
+		goto close_cgroup;
+
+	post_create_prog_fd = bpf_program__fd(skel->progs.socket_post_create);
+	post_create_prog_fd2 = bpf_program__fd(skel->progs.socket_post_create2);
+	bind_prog_fd = bpf_program__fd(skel->progs.socket_bind);
+	bind_prog_fd2 = bpf_program__fd(skel->progs.socket_bind2);
+	alloc_prog_fd = bpf_program__fd(skel->progs.socket_alloc);
+	clone_prog_fd = bpf_program__fd(skel->progs.socket_clone);
+
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_sk_alloc_security"), 0, "prog count");
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 0, "total prog count");
+	err = bpf_prog_attach(alloc_prog_fd, cgroup_fd, BPF_LSM_CGROUP, 0);
+	if (!ASSERT_OK(err, "attach alloc_prog_fd"))
+		goto detach_cgroup;
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_sk_alloc_security"), 1, "prog count");
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 1, "total prog count");
+
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_inet_csk_clone"), 0, "prog count");
+	err = bpf_prog_attach(clone_prog_fd, cgroup_fd, BPF_LSM_CGROUP, 0);
+	if (!ASSERT_OK(err, "attach clone_prog_fd"))
+		goto detach_cgroup;
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_inet_csk_clone"), 1, "prog count");
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 2, "total prog count");
+
+	/* Make sure replacing works. */
+
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_post_create"), 0, "prog count");
+	err = bpf_prog_attach(post_create_prog_fd, cgroup_fd,
+			      BPF_LSM_CGROUP, 0);
+	if (!ASSERT_OK(err, "attach post_create_prog_fd"))
+		goto close_cgroup;
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_post_create"), 1, "prog count");
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 3, "total prog count");
+
+	attach_opts.replace_prog_fd = post_create_prog_fd;
+	err = bpf_prog_attach_opts(post_create_prog_fd2, cgroup_fd,
+				   BPF_LSM_CGROUP, &attach_opts);
+	if (!ASSERT_OK(err, "prog replace post_create_prog_fd"))
+		goto detach_cgroup;
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_post_create"), 1, "prog count");
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 3, "total prog count");
+
+	/* Try the same attach/replace via link API. */
+
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_bind"), 0, "prog count");
+	bind_link_fd = bpf_link_create(bind_prog_fd, cgroup_fd,
+				       BPF_LSM_CGROUP, NULL);
+	if (!ASSERT_GE(bind_link_fd, 0, "link create bind_prog_fd"))
+		goto detach_cgroup;
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_bind"), 1, "prog count");
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 4, "total prog count");
+
+	update_opts.old_prog_fd = bind_prog_fd;
+	update_opts.flags = BPF_F_REPLACE;
+
+	err = bpf_link_update(bind_link_fd, bind_prog_fd2, &update_opts);
+	if (!ASSERT_OK(err, "link update bind_prog_fd"))
+		goto detach_cgroup;
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_bind"), 1, "prog count");
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 4, "total prog count");
+
+	/* Attach another instance of bind program to another cgroup.
+	 * This should trigger the reuse of the trampoline shim (two
+	 * programs attaching to the same btf_id).
+	 */
+
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_bind"), 1, "prog count");
+	ASSERT_EQ(query_prog_cnt(cgroup_fd2, "bpf_lsm_socket_bind"), 0, "prog count");
+	bind_link_fd2 = bpf_link_create(bind_prog_fd2, cgroup_fd2,
+					BPF_LSM_CGROUP, NULL);
+	if (!ASSERT_GE(bind_link_fd2, 0, "link create bind_prog_fd2"))
+		goto detach_cgroup;
+	ASSERT_EQ(query_prog_cnt(cgroup_fd2, "bpf_lsm_socket_bind"), 1, "prog count");
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 4, "total prog count");
+	ASSERT_EQ(query_prog_cnt(cgroup_fd2, NULL), 1, "total prog count");
+
+	/* AF_UNIX is prohibited. */
+
+	fd = socket(AF_UNIX, SOCK_STREAM, 0);
+	ASSERT_LT(fd, 0, "socket(AF_UNIX)");
+
+	/* AF_INET6 gets default policy (sk_priority). */
+
+	fd = socket(AF_INET6, SOCK_STREAM, 0);
+	if (!ASSERT_GE(fd, 0, "socket(SOCK_STREAM)"))
+		goto detach_cgroup;
+
+	prio = 0;
+	socklen = sizeof(prio);
+	ASSERT_GE(getsockopt(fd, SOL_SOCKET, SO_PRIORITY, &prio, &socklen), 0,
+		  "getsockopt");
+	ASSERT_EQ(prio, 123, "sk_priority");
+
+	close(fd);
+
+	/* TX-only AF_PACKET is allowed. */
+
+	ASSERT_LT(socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL)), 0,
+		  "socket(AF_PACKET, ..., ETH_P_ALL)");
+
+	fd = socket(AF_PACKET, SOCK_RAW, 0);
+	ASSERT_GE(fd, 0, "socket(AF_PACKET, ..., 0)");
+
+	/* TX-only AF_PACKET can not be rebound. */
+
+	struct sockaddr_ll sa = {
+		.sll_family = AF_PACKET,
+		.sll_protocol = htons(ETH_P_ALL),
+	};
+	ASSERT_LT(bind(fd, (struct sockaddr *)&sa, sizeof(sa)), 0,
+		  "bind(ETH_P_ALL)");
+
+	close(fd);
+
+	/* Trigger passive open. */
+
+	listen_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0);
+	ASSERT_GE(listen_fd, 0, "start_server");
+	client_fd = connect_to_fd(listen_fd, 0);
+	ASSERT_GE(client_fd, 0, "connect_to_fd");
+	accepted_fd = accept(listen_fd, NULL, NULL);
+	ASSERT_GE(accepted_fd, 0, "accept");
+
+	prio = 0;
+	socklen = sizeof(prio);
+	ASSERT_GE(getsockopt(accepted_fd, SOL_SOCKET, SO_PRIORITY, &prio, &socklen), 0,
+		  "getsockopt");
+	ASSERT_EQ(prio, 234, "sk_priority");
+
+	/* These are replaced and never called. */
+	ASSERT_EQ(skel->bss->called_socket_post_create, 0, "called_create");
+	ASSERT_EQ(skel->bss->called_socket_bind, 0, "called_bind");
+
+	/* AF_INET6+SOCK_STREAM
+	 * AF_PACKET+SOCK_RAW
+	 * listen_fd
+	 * client_fd
+	 * accepted_fd
+	 */
+	ASSERT_EQ(skel->bss->called_socket_post_create2, 5, "called_create2");
+
+	/* start_server
+	 * bind(ETH_P_ALL)
+	 */
+	ASSERT_EQ(skel->bss->called_socket_bind2, 2, "called_bind2");
+	/* Single accept(). */
+	ASSERT_EQ(skel->bss->called_socket_clone, 1, "called_clone");
+
+	/* AF_UNIX+SOCK_STREAM (failed)
+	 * AF_INET6+SOCK_STREAM
+	 * AF_PACKET+SOCK_RAW (failed)
+	 * AF_PACKET+SOCK_RAW
+	 * listen_fd
+	 * client_fd
+	 * accepted_fd
+	 */
+	ASSERT_EQ(skel->bss->called_socket_alloc, 7, "called_alloc");
+
+	/* Make sure other cgroup doesn't trigger the programs. */
+
+	if (!ASSERT_OK(join_cgroup(""), "join root cgroup"))
+		goto detach_cgroup;
+
+	fd = socket(AF_INET6, SOCK_STREAM, 0);
+	if (!ASSERT_GE(fd, 0, "socket(SOCK_STREAM)"))
+		goto detach_cgroup;
+
+	prio = 0;
+	socklen = sizeof(prio);
+	ASSERT_GE(getsockopt(fd, SOL_SOCKET, SO_PRIORITY, &prio, &socklen), 0,
+		  "getsockopt");
+	ASSERT_EQ(prio, 0, "sk_priority");
+
+	close(fd);
+
+detach_cgroup:
+	ASSERT_GE(bpf_prog_detach2(post_create_prog_fd2, cgroup_fd,
+				   BPF_LSM_CGROUP), 0, "detach_create");
+	close(bind_link_fd);
+	/* Don't close bind_link_fd2, exercise cgroup release cleanup. */
+	ASSERT_GE(bpf_prog_detach2(alloc_prog_fd, cgroup_fd,
+				   BPF_LSM_CGROUP), 0, "detach_alloc");
+	ASSERT_GE(bpf_prog_detach2(clone_prog_fd, cgroup_fd,
+				   BPF_LSM_CGROUP), 0, "detach_clone");
+
+close_cgroup:
+	close(cgroup_fd);
+close_skel:
+	lsm_cgroup__destroy(skel);
+}
+
+void test_lsm_cgroup(void)
+{
+	if (test__start_subtest("functional"))
+		test_lsm_cgroup_functional();
+}
diff --git a/tools/testing/selftests/bpf/progs/lsm_cgroup.c b/tools/testing/selftests/bpf/progs/lsm_cgroup.c
new file mode 100644
index 000000000000..a263830900e2
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/lsm_cgroup.c
@@ -0,0 +1,160 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+#ifndef AF_PACKET
+#define AF_PACKET 17
+#endif
+
+#ifndef AF_UNIX
+#define AF_UNIX 1
+#endif
+
+#ifndef EPERM
+#define EPERM 1
+#endif
+
+struct {
+	__uint(type, BPF_MAP_TYPE_CGROUP_STORAGE);
+	__type(key, __u64);
+	__type(value, __u64);
+} cgroup_storage SEC(".maps");
+
+int called_socket_post_create;
+int called_socket_post_create2;
+int called_socket_bind;
+int called_socket_bind2;
+int called_socket_alloc;
+int called_socket_clone;
+
+static __always_inline int test_local_storage(void)
+{
+	__u64 *val;
+
+	val = bpf_get_local_storage(&cgroup_storage, 0);
+	if (!val)
+		return 0;
+	*val += 1;
+
+	return 1;
+}
+
+static __always_inline int real_create(struct socket *sock, int family,
+				       int protocol)
+{
+	struct sock *sk;
+
+	/* Reject non-tx-only AF_PACKET. */
+	if (family == AF_PACKET && protocol != 0)
+		return 0; /* EPERM */
+
+	sk = sock->sk;
+	if (!sk)
+		return 1;
+
+	/* The rest of the sockets get default policy. */
+	sk->sk_priority = 123;
+
+	/* Can access cgroup local storage. */
+	if (!test_local_storage())
+		return 0; /* EPERM */
+
+	return 1;
+}
+
+/* __cgroup_bpf_run_lsm_socket */
+SEC("lsm_cgroup/socket_post_create")
+int BPF_PROG(socket_post_create, struct socket *sock, int family,
+	     int type, int protocol, int kern)
+{
+	called_socket_post_create++;
+	return real_create(sock, family, protocol);
+}
+
+/* __cgroup_bpf_run_lsm_socket */
+SEC("lsm_cgroup/socket_post_create")
+int BPF_PROG(socket_post_create2, struct socket *sock, int family,
+	     int type, int protocol, int kern)
+{
+	called_socket_post_create2++;
+	return real_create(sock, family, protocol);
+}
+
+static __always_inline int real_bind(struct socket *sock,
+				     struct sockaddr *address,
+				     int addrlen)
+{
+	struct sockaddr_ll sa = {};
+
+	if (sock->sk->__sk_common.skc_family != AF_PACKET)
+		return 1;
+
+	if (sock->sk->sk_kern_sock)
+		return 1;
+
+	bpf_probe_read_kernel(&sa, sizeof(sa), address);
+	if (sa.sll_protocol)
+		return 0; /* EPERM */
+
+	/* Can access cgroup local storage. */
+	if (!test_local_storage())
+		return 0; /* EPERM */
+
+	return 1;
+}
+
+/* __cgroup_bpf_run_lsm_socket */
+SEC("lsm_cgroup/socket_bind")
+int BPF_PROG(socket_bind, struct socket *sock, struct sockaddr *address,
+	     int addrlen)
+{
+	called_socket_bind++;
+	return real_bind(sock, address, addrlen);
+}
+
+/* __cgroup_bpf_run_lsm_socket */
+SEC("lsm_cgroup/socket_bind")
+int BPF_PROG(socket_bind2, struct socket *sock, struct sockaddr *address,
+	     int addrlen)
+{
+	called_socket_bind2++;
+	return real_bind(sock, address, addrlen);
+}
+
+/* __cgroup_bpf_run_lsm_current (via bpf_lsm_current_hooks) */
+SEC("lsm_cgroup/sk_alloc_security")
+int BPF_PROG(socket_alloc, struct sock *sk, int family, gfp_t priority)
+{
+	called_socket_alloc++;
+	if (family == AF_UNIX)
+		return 0; /* EPERM */
+
+	/* Can access cgroup local storage. */
+	if (!test_local_storage())
+		return 0; /* EPERM */
+
+	return 1;
+}
+
+/* __cgroup_bpf_run_lsm_sock */
+SEC("lsm_cgroup/inet_csk_clone")
+int BPF_PROG(socket_clone, struct sock *newsk, const struct request_sock *req)
+{
+	called_socket_clone++;
+
+	if (!newsk)
+		return 1;
+
+	/* Accepted request sockets get a different priority. */
+	newsk->sk_priority = 234;
+
+	/* Can access cgroup local storage. */
+	if (!test_local_storage())
+		return 0; /* EPERM */
+
+	return 1;
+}
-- 
2.36.1.255.ge46751e96f-goog


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH bpf-next v8 11/11] selftests/bpf: verify lsm_cgroup struct sock access
  2022-06-01 19:02 [PATCH bpf-next v8 00/11] bpf: cgroup_sock lsm flavor Stanislav Fomichev
                   ` (9 preceding siblings ...)
  2022-06-01 19:02 ` [PATCH bpf-next v8 10/11] selftests/bpf: lsm_cgroup functional test Stanislav Fomichev
@ 2022-06-01 19:02 ` Stanislav Fomichev
  2022-06-03  1:52   ` Martin KaFai Lau
  10 siblings, 1 reply; 35+ messages in thread
From: Stanislav Fomichev @ 2022-06-01 19:02 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

sk_priority & sk_mark are writable, the rest is read-only.

One interesting thing here is that the verifier doesn't
really force me to add NULL checks anywhere :-/

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 .../selftests/bpf/prog_tests/lsm_cgroup.c     | 69 +++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c b/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
index a96057ec7dd4..a730acd9dfdd 100644
--- a/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
+++ b/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
@@ -270,8 +270,77 @@ static void test_lsm_cgroup_functional(void)
 	lsm_cgroup__destroy(skel);
 }
 
+static int field_offset(const char *type, const char *field)
+{
+	const struct btf_member *memb;
+	const struct btf_type *tp;
+	const char *name;
+	struct btf *btf;
+	int btf_id;
+	int i;
+
+	btf = btf__load_vmlinux_btf();
+	if (!btf)
+		return -1;
+
+	btf_id = btf__find_by_name_kind(btf, type, BTF_KIND_STRUCT);
+	if (btf_id < 0)
+		return -1;
+
+	tp = btf__type_by_id(btf, btf_id);
+	memb = btf_members(tp);
+
+	for (i = 0; i < btf_vlen(tp); i++) {
+		name = btf__name_by_offset(btf,
+					   memb->name_off);
+		if (strcmp(field, name) == 0)
+			return memb->offset / 8;
+		memb++;
+	}
+
+	return -1;
+}
+
+static bool sk_writable_field(const char *type, const char *field, int size)
+{
+	LIBBPF_OPTS(bpf_prog_load_opts, opts,
+		    .expected_attach_type = BPF_LSM_CGROUP);
+	struct bpf_insn	insns[] = {
+		/* r1 = *(u64 *)(r1 + 0) */
+		BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, 0),
+		/* r1 = *(u64 *)(r1 + offsetof(struct socket, sk)) */
+		BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, field_offset("socket", "sk")),
+		/* r2 = *(u64 *)(r1 + offsetof(struct sock, <field>)) */
+		BPF_LDX_MEM(size, BPF_REG_2, BPF_REG_1, field_offset(type, field)),
+		/* *(u64 *)(r1 + offsetof(struct sock, <field>)) = r2 */
+		BPF_STX_MEM(size, BPF_REG_1, BPF_REG_2, field_offset(type, field)),
+		BPF_MOV64_IMM(BPF_REG_0, 1),
+		BPF_EXIT_INSN(),
+	};
+	int fd;
+
+	opts.attach_btf_id = libbpf_find_vmlinux_btf_id("socket_post_create",
+							opts.expected_attach_type);
+
+	fd = bpf_prog_load(BPF_PROG_TYPE_LSM, NULL, "GPL", insns, ARRAY_SIZE(insns), &opts);
+	if (fd >= 0)
+		close(fd);
+	return fd >= 0;
+}
+
+static void test_lsm_cgroup_access(void)
+{
+	ASSERT_FALSE(sk_writable_field("sock_common", "skc_family", BPF_H), "skc_family");
+	ASSERT_FALSE(sk_writable_field("sock", "sk_sndtimeo", BPF_DW), "sk_sndtimeo");
+	ASSERT_TRUE(sk_writable_field("sock", "sk_priority", BPF_W), "sk_priority");
+	ASSERT_TRUE(sk_writable_field("sock", "sk_mark", BPF_W), "sk_mark");
+	ASSERT_FALSE(sk_writable_field("sock", "sk_pacing_rate", BPF_DW), "sk_pacing_rate");
+}
+
 void test_lsm_cgroup(void)
 {
 	if (test__start_subtest("functional"))
 		test_lsm_cgroup_functional();
+	if (test__start_subtest("access"))
+		test_lsm_cgroup_access();
 }
-- 
2.36.1.255.ge46751e96f-goog


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH bpf-next v8 03/11] bpf: per-cgroup lsm flavor
  2022-06-01 19:02 ` [PATCH bpf-next v8 03/11] bpf: per-cgroup lsm flavor Stanislav Fomichev
@ 2022-06-02  6:16   ` kernel test robot
  2022-06-02 16:49     ` Stanislav Fomichev
  2022-06-04  6:11   ` Martin KaFai Lau
  1 sibling, 1 reply; 35+ messages in thread
From: kernel test robot @ 2022-06-02  6:16 UTC (permalink / raw)
  To: Stanislav Fomichev, netdev, bpf
  Cc: llvm, kbuild-all, ast, daniel, andrii, Stanislav Fomichev

Hi Stanislav,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Stanislav-Fomichev/bpf-cgroup_sock-lsm-flavor/20220602-050600
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: i386-randconfig-a004 (https://download.01.org/0day-ci/archive/20220602/202206021403.M9hFZdbY-lkp@intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project b364c76683f8ef241025a9556300778c07b590c2)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/584b25fdd30894c312d577f4b6b83f93d64e464b
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Stanislav-Fomichev/bpf-cgroup_sock-lsm-flavor/20220602-050600
        git checkout 584b25fdd30894c312d577f4b6b83f93d64e464b
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 SHELL=/bin/bash kernel/bpf/

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> kernel/bpf/cgroup.c:257:35: warning: overlapping comparisons always evaluate to false [-Wtautological-overlap-compare]
                                   if (atype >= CGROUP_LSM_START && atype <= CGROUP_LSM_END)
                                       ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
   kernel/bpf/cgroup.c:252:35: warning: overlapping comparisons always evaluate to false [-Wtautological-overlap-compare]
                                   if (atype >= CGROUP_LSM_START && atype <= CGROUP_LSM_END)
                                       ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
   2 warnings generated.


vim +257 kernel/bpf/cgroup.c

   226	
   227	/**
   228	 * cgroup_bpf_release() - put references of all bpf programs and
   229	 *                        release all cgroup bpf data
   230	 * @work: work structure embedded into the cgroup to modify
   231	 */
   232	static void cgroup_bpf_release(struct work_struct *work)
   233	{
   234		struct cgroup *p, *cgrp = container_of(work, struct cgroup,
   235						       bpf.release_work);
   236		struct bpf_prog_array *old_array;
   237		struct list_head *storages = &cgrp->bpf.storages;
   238		struct bpf_cgroup_storage *storage, *stmp;
   239	
   240		unsigned int atype;
   241	
   242		mutex_lock(&cgroup_mutex);
   243	
   244		for (atype = 0; atype < ARRAY_SIZE(cgrp->bpf.progs); atype++) {
   245			struct hlist_head *progs = &cgrp->bpf.progs[atype];
   246			struct bpf_prog_list *pl;
   247			struct hlist_node *pltmp;
   248	
   249			hlist_for_each_entry_safe(pl, pltmp, progs, node) {
   250				hlist_del(&pl->node);
   251				if (pl->prog) {
   252					if (atype >= CGROUP_LSM_START && atype <= CGROUP_LSM_END)
   253						bpf_trampoline_unlink_cgroup_shim(pl->prog);
   254					bpf_prog_put(pl->prog);
   255				}
   256				if (pl->link) {
 > 257					if (atype >= CGROUP_LSM_START && atype <= CGROUP_LSM_END)
   258						bpf_trampoline_unlink_cgroup_shim(pl->link->link.prog);
   259					bpf_cgroup_link_auto_detach(pl->link);
   260				}
   261				kfree(pl);
   262				static_branch_dec(&cgroup_bpf_enabled_key[atype]);
   263			}
   264			old_array = rcu_dereference_protected(
   265					cgrp->bpf.effective[atype],
   266					lockdep_is_held(&cgroup_mutex));
   267			bpf_prog_array_free(old_array);
   268		}
   269	
   270		list_for_each_entry_safe(storage, stmp, storages, list_cg) {
   271			bpf_cgroup_storage_unlink(storage);
   272			bpf_cgroup_storage_free(storage);
   273		}
   274	
   275		mutex_unlock(&cgroup_mutex);
   276	
   277		for (p = cgroup_parent(cgrp); p; p = cgroup_parent(p))
   278			cgroup_bpf_put(p);
   279	
   280		percpu_ref_exit(&cgrp->bpf.refcnt);
   281		cgroup_put(cgrp);
   282	}
   283	

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH bpf-next v8 03/11] bpf: per-cgroup lsm flavor
  2022-06-02  6:16   ` kernel test robot
@ 2022-06-02 16:49     ` Stanislav Fomichev
  0 siblings, 0 replies; 35+ messages in thread
From: Stanislav Fomichev @ 2022-06-02 16:49 UTC (permalink / raw)
  To: kernel test robot; +Cc: netdev, bpf, llvm, kbuild-all, ast, daniel, andrii

On Wed, Jun 1, 2022 at 11:17 PM kernel test robot <lkp@intel.com> wrote:
>
> Hi Stanislav,
>
> Thank you for the patch! Perhaps something to improve:

This is a config without CONFIG_BPF_LSM, which makes CGROUP_LSM_START
greater than CGROUP_LSM_END (so we don't waste slots on
non-CONFIG_BPF_LSM builds), and that breaks the
(atype >= CGROUP_LSM_START && atype <= CGROUP_LSM_END) check.
I'll add an ifdef around that.
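
Something along these lines (just a sketch; the helper name is made up
and the exact shape may differ in the respin):

static bool atype_is_cgroup_lsm(unsigned int atype)
{
#ifdef CONFIG_BPF_LSM
	return atype >= CGROUP_LSM_START && atype <= CGROUP_LSM_END;
#else
	return false;
#endif
}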


> [auto build test WARNING on bpf-next/master]
>
> url:    https://github.com/intel-lab-lkp/linux/commits/Stanislav-Fomichev/bpf-cgroup_sock-lsm-flavor/20220602-050600
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
> config: i386-randconfig-a004 (https://download.01.org/0day-ci/archive/20220602/202206021403.M9hFZdbY-lkp@intel.com/config)
> compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project b364c76683f8ef241025a9556300778c07b590c2)
> reproduce (this is a W=1 build):
>         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         # https://github.com/intel-lab-lkp/linux/commit/584b25fdd30894c312d577f4b6b83f93d64e464b
>         git remote add linux-review https://github.com/intel-lab-lkp/linux
>         git fetch --no-tags linux-review Stanislav-Fomichev/bpf-cgroup_sock-lsm-flavor/20220602-050600
>         git checkout 584b25fdd30894c312d577f4b6b83f93d64e464b
>         # save the config file
>         mkdir build_dir && cp config build_dir/.config
>         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 SHELL=/bin/bash kernel/bpf/
>
> If you fix the issue, kindly add following tag where applicable
> Reported-by: kernel test robot <lkp@intel.com>
>
> All warnings (new ones prefixed by >>):
>
> >> kernel/bpf/cgroup.c:257:35: warning: overlapping comparisons always evaluate to false [-Wtautological-overlap-compare]
>                                    if (atype >= CGROUP_LSM_START && atype <= CGROUP_LSM_END)
>                                        ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
>    kernel/bpf/cgroup.c:252:35: warning: overlapping comparisons always evaluate to false [-Wtautological-overlap-compare]
>                                    if (atype >= CGROUP_LSM_START && atype <= CGROUP_LSM_END)
>                                        ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
>    2 warnings generated.
>
>
> vim +257 kernel/bpf/cgroup.c
>
>    226
>    227  /**
>    228   * cgroup_bpf_release() - put references of all bpf programs and
>    229   *                        release all cgroup bpf data
>    230   * @work: work structure embedded into the cgroup to modify
>    231   */
>    232  static void cgroup_bpf_release(struct work_struct *work)
>    233  {
>    234          struct cgroup *p, *cgrp = container_of(work, struct cgroup,
>    235                                                 bpf.release_work);
>    236          struct bpf_prog_array *old_array;
>    237          struct list_head *storages = &cgrp->bpf.storages;
>    238          struct bpf_cgroup_storage *storage, *stmp;
>    239
>    240          unsigned int atype;
>    241
>    242          mutex_lock(&cgroup_mutex);
>    243
>    244          for (atype = 0; atype < ARRAY_SIZE(cgrp->bpf.progs); atype++) {
>    245                  struct hlist_head *progs = &cgrp->bpf.progs[atype];
>    246                  struct bpf_prog_list *pl;
>    247                  struct hlist_node *pltmp;
>    248
>    249                  hlist_for_each_entry_safe(pl, pltmp, progs, node) {
>    250                          hlist_del(&pl->node);
>    251                          if (pl->prog) {
>    252                                  if (atype >= CGROUP_LSM_START && atype <= CGROUP_LSM_END)
>    253                                          bpf_trampoline_unlink_cgroup_shim(pl->prog);
>    254                                  bpf_prog_put(pl->prog);
>    255                          }
>    256                          if (pl->link) {
>  > 257                                  if (atype >= CGROUP_LSM_START && atype <= CGROUP_LSM_END)
>    258                                          bpf_trampoline_unlink_cgroup_shim(pl->link->link.prog);
>    259                                  bpf_cgroup_link_auto_detach(pl->link);
>    260                          }
>    261                          kfree(pl);
>    262                          static_branch_dec(&cgroup_bpf_enabled_key[atype]);
>    263                  }
>    264                  old_array = rcu_dereference_protected(
>    265                                  cgrp->bpf.effective[atype],
>    266                                  lockdep_is_held(&cgroup_mutex));
>    267                  bpf_prog_array_free(old_array);
>    268          }
>    269
>    270          list_for_each_entry_safe(storage, stmp, storages, list_cg) {
>    271                  bpf_cgroup_storage_unlink(storage);
>    272                  bpf_cgroup_storage_free(storage);
>    273          }
>    274
>    275          mutex_unlock(&cgroup_mutex);
>    276
>    277          for (p = cgroup_parent(cgrp); p; p = cgroup_parent(p))
>    278                  cgroup_bpf_put(p);
>    279
>    280          percpu_ref_exit(&cgrp->bpf.refcnt);
>    281          cgroup_put(cgrp);
>    282  }
>    283
>
> --
> 0-DAY CI Kernel Test Service
> https://01.org/lkp

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH bpf-next v8 11/11] selftests/bpf: verify lsm_cgroup struct sock access
  2022-06-01 19:02 ` [PATCH bpf-next v8 11/11] selftests/bpf: verify lsm_cgroup struct sock access Stanislav Fomichev
@ 2022-06-03  1:52   ` Martin KaFai Lau
  2022-06-03  1:59     ` Stanislav Fomichev
  0 siblings, 1 reply; 35+ messages in thread
From: Martin KaFai Lau @ 2022-06-03  1:52 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii

On Wed, Jun 01, 2022 at 12:02:18PM -0700, Stanislav Fomichev wrote:
> sk_priority & sk_mark are writable, the rest is readonly.
> 
> One interesting thing here is that the verifier doesn't
> really force me to add NULL checks anywhere :-/
Are you aware if it is possible to get a NULL sk from some of the
bpf_lsm hooks ?

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH bpf-next v8 11/11] selftests/bpf: verify lsm_cgroup struct sock access
  2022-06-03  1:52   ` Martin KaFai Lau
@ 2022-06-03  1:59     ` Stanislav Fomichev
  2022-06-03  5:38       ` Martin KaFai Lau
  0 siblings, 1 reply; 35+ messages in thread
From: Stanislav Fomichev @ 2022-06-03  1:59 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: netdev, bpf, ast, daniel, andrii

On Thu, Jun 2, 2022 at 6:52 PM Martin KaFai Lau <kafai@fb.com> wrote:
>
> On Wed, Jun 01, 2022 at 12:02:18PM -0700, Stanislav Fomichev wrote:
> > sk_priority & sk_mark are writable, the rest is readonly.
> >
> > One interesting thing here is that the verifier doesn't
> > really force me to add NULL checks anywhere :-/
> Are you aware if it is possible to get a NULL sk from some of the
> bpf_lsm hooks ?

No, I don't think it's relevant for lsm hooks. I'm more concerned
about fentry/fexit which supposedly should go through the same
verifier path and can be attached everywhere?

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH bpf-next v8 11/11] selftests/bpf: verify lsm_cgroup struct sock access
  2022-06-03  1:59     ` Stanislav Fomichev
@ 2022-06-03  5:38       ` Martin KaFai Lau
  0 siblings, 0 replies; 35+ messages in thread
From: Martin KaFai Lau @ 2022-06-03  5:38 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii

On Thu, Jun 02, 2022 at 06:59:47PM -0700, Stanislav Fomichev wrote:
> On Thu, Jun 2, 2022 at 6:52 PM Martin KaFai Lau <kafai@fb.com> wrote:
> >
> > On Wed, Jun 01, 2022 at 12:02:18PM -0700, Stanislav Fomichev wrote:
> > > sk_priority & sk_mark are writable, the rest is readonly.
> > >
> > > One interesting thing here is that the verifier doesn't
> > > really force me to add NULL checks anywhere :-/
> > Are you aware if it is possible to get a NULL sk from some of the
> > bpf_lsm hooks ?
> 
> No, I don't think it's relevant for lsm hooks. I'm more concerned
> about fentry/fexit which supposedly should go through the same
> verifier path and can be attached everywhere?
fentry/fexit is BPF_READ.  It will be marked with BPF_PROBE_MEM
and the fault will be handled by the bpf extable handler.

If the lsm hooks cannot get a NULL sk, BPF_WRITE on sk_priority and
sk_mark is fine.

Took a first pass on the set and will take a closer look.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH bpf-next v8 07/11] libbpf: add lsm_cgroup_sock type
  2022-06-01 19:02 ` [PATCH bpf-next v8 07/11] libbpf: add lsm_cgroup_sock type Stanislav Fomichev
@ 2022-06-03 21:30   ` Andrii Nakryiko
  0 siblings, 0 replies; 35+ messages in thread
From: Andrii Nakryiko @ 2022-06-03 21:30 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko

On Wed, Jun 1, 2022 at 12:02 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> lsm_cgroup/ is the prefix for BPF_LSM_CGROUP.
>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---

Acked-by: Andrii Nakryiko <andrii@kernel.org>

>  tools/lib/bpf/libbpf.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 5afe4cbd684f..a4209a1ad02f 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -107,6 +107,7 @@ static const char * const attach_type_name[] = {
>         [BPF_TRACE_FEXIT]               = "trace_fexit",
>         [BPF_MODIFY_RETURN]             = "modify_return",
>         [BPF_LSM_MAC]                   = "lsm_mac",
> +       [BPF_LSM_CGROUP]                = "lsm_cgroup",
>         [BPF_SK_LOOKUP]                 = "sk_lookup",
>         [BPF_TRACE_ITER]                = "trace_iter",
>         [BPF_XDP_DEVMAP]                = "xdp_devmap",
> @@ -9157,6 +9158,7 @@ static const struct bpf_sec_def section_defs[] = {
>         SEC_DEF("freplace+",            EXT, 0, SEC_ATTACH_BTF, attach_trace),
>         SEC_DEF("lsm+",                 LSM, BPF_LSM_MAC, SEC_ATTACH_BTF, attach_lsm),
>         SEC_DEF("lsm.s+",               LSM, BPF_LSM_MAC, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_lsm),
> +       SEC_DEF("lsm_cgroup+",          LSM, BPF_LSM_CGROUP, SEC_ATTACH_BTF),
>         SEC_DEF("iter+",                TRACING, BPF_TRACE_ITER, SEC_ATTACH_BTF, attach_iter),
>         SEC_DEF("iter.s+",              TRACING, BPF_TRACE_ITER, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_iter),
>         SEC_DEF("syscall",              SYSCALL, 0, SEC_SLEEPABLE),
> @@ -9610,6 +9612,7 @@ void btf_get_kernel_prefix_kind(enum bpf_attach_type attach_type,
>                 *kind = BTF_KIND_TYPEDEF;
>                 break;
>         case BPF_LSM_MAC:
> +       case BPF_LSM_CGROUP:
>                 *prefix = BTF_LSM_PREFIX;
>                 *kind = BTF_KIND_FUNC;
>                 break;
> --
> 2.36.1.255.ge46751e96f-goog
>
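
For reference, a minimal program using the new section prefix would look
roughly like this (a sketch; socket_bind is just an example hook, the
"return 1 == success" retval convention is from the cover letter, and
the rest is the usual libbpf boilerplate):

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

char _license[] SEC("license") = "GPL";

SEC("lsm_cgroup/socket_bind")
int BPF_PROG(socket_bind, struct socket *sock, struct sockaddr *address,
	     int addrlen)
{
	/* 1 == allow, per the per-cgroup retval convention */
	return 1;
}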

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH bpf-next v8 08/11] libbpf: implement bpf_prog_query_opts
  2022-06-01 19:02 ` [PATCH bpf-next v8 08/11] libbpf: implement bpf_prog_query_opts Stanislav Fomichev
@ 2022-06-03 21:34   ` Andrii Nakryiko
  2022-06-03 23:13     ` Stanislav Fomichev
  0 siblings, 1 reply; 35+ messages in thread
From: Andrii Nakryiko @ 2022-06-03 21:34 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko

On Wed, Jun 1, 2022 at 12:02 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> Implement bpf_prog_query_opts as a more extensible version of
> bpf_prog_query. Expose new prog_attach_flags and attach_btf_func_id as
> well:
>
> * prog_attach_flags returns per-program attach_flags; relevant only for
>   lsm cgroup programs which might have different attach_flags
>   per attach_btf_id
> * attach_btf_func_id is a new field exposed for prog_query which
>   specifies the real BTF function id for lsm cgroup attachments
>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>  tools/include/uapi/linux/bpf.h |  3 +++
>  tools/lib/bpf/bpf.c            | 40 +++++++++++++++++++++++++++-------
>  tools/lib/bpf/bpf.h            | 15 +++++++++++++
>  tools/lib/bpf/libbpf.map       |  2 +-
>  4 files changed, 51 insertions(+), 9 deletions(-)
>

Few consistency nits, but otherwise:

Acked-by: Andrii Nakryiko <andrii@kernel.org>


> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index fa64b0b612fd..4271ef3c2afb 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -1432,6 +1432,7 @@ union bpf_attr {
>                 __u32           attach_flags;
>                 __aligned_u64   prog_ids;
>                 __u32           prog_cnt;
> +               __aligned_u64   prog_attach_flags; /* output: per-program attach_flags */
>         } query;
>
>         struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
> @@ -5996,6 +5997,8 @@ struct bpf_prog_info {
>         __u64 run_cnt;
>         __u64 recursion_misses;
>         __u32 verified_insns;
> +       __u32 attach_btf_obj_id;
> +       __u32 attach_btf_id;
>  } __attribute__((aligned(8)));
>
>  struct bpf_map_info {
> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> index 240186aac8e6..c7af7db53725 100644
> --- a/tools/lib/bpf/bpf.c
> +++ b/tools/lib/bpf/bpf.c
> @@ -888,28 +888,52 @@ int bpf_iter_create(int link_fd)
>         return libbpf_err_errno(fd);
>  }
>
> -int bpf_prog_query(int target_fd, enum bpf_attach_type type, __u32 query_flags,
> -                  __u32 *attach_flags, __u32 *prog_ids, __u32 *prog_cnt)
> +int bpf_prog_query_opts(int target_fd,
> +                       enum bpf_attach_type type,
> +                       struct bpf_prog_query_opts *opts)
>  {
>         union bpf_attr attr;
>         int ret;
>
>         memset(&attr, 0, sizeof(attr));
> +
> +       if (!OPTS_VALID(opts, bpf_prog_query_opts))
> +               return libbpf_err(-EINVAL);
> +

nit: check input args before you do work (memset), but it's very minor

>         attr.query.target_fd    = target_fd;
>         attr.query.attach_type  = type;
> -       attr.query.query_flags  = query_flags;
> -       attr.query.prog_cnt     = *prog_cnt;
> -       attr.query.prog_ids     = ptr_to_u64(prog_ids);
> +       attr.query.query_flags  = OPTS_GET(opts, query_flags, 0);
> +       attr.query.prog_cnt     = OPTS_GET(opts, prog_cnt, 0);
> +       attr.query.prog_ids     = ptr_to_u64(OPTS_GET(opts, prog_ids, NULL));
> +       attr.query.prog_attach_flags = ptr_to_u64(OPTS_GET(opts, prog_attach_flags, NULL));
>
>         ret = sys_bpf(BPF_PROG_QUERY, &attr, sizeof(attr));
>
> -       if (attach_flags)
> -               *attach_flags = attr.query.attach_flags;
> -       *prog_cnt = attr.query.prog_cnt;
> +       OPTS_SET(opts, attach_flags, attr.query.attach_flags);
> +       OPTS_SET(opts, prog_cnt, attr.query.prog_cnt);
>
>         return libbpf_err_errno(ret);
>  }
>
> +int bpf_prog_query(int target_fd, enum bpf_attach_type type, __u32 query_flags,
> +                  __u32 *attach_flags, __u32 *prog_ids, __u32 *prog_cnt)
> +{
> +       LIBBPF_OPTS(bpf_prog_query_opts, p);

nit: for consistency it would be good to call variable "opts" as we do
pretty much everywhere else?

> +       int ret;
> +
> +       p.query_flags = query_flags;
> +       p.prog_ids = prog_ids;
> +       p.prog_cnt = *prog_cnt;
> +
> +       ret = bpf_prog_query_opts(target_fd, type, &p);
> +
> +       if (attach_flags)
> +               *attach_flags = p.attach_flags;
> +       *prog_cnt = p.prog_cnt;
> +
> +       return ret;

maybe use libbpf_err() here for consistency and just in case we add
something that can clobber errno
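
i.e. something like:

	return libbpf_err(ret);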

> +}
> +
>  int bpf_prog_test_run(int prog_fd, int repeat, void *data, __u32 size,
>                       void *data_out, __u32 *size_out, __u32 *retval,
>                       __u32 *duration)

[...]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH bpf-next v8 08/11] libbpf: implement bpf_prog_query_opts
  2022-06-03 21:34   ` Andrii Nakryiko
@ 2022-06-03 23:13     ` Stanislav Fomichev
  0 siblings, 0 replies; 35+ messages in thread
From: Stanislav Fomichev @ 2022-06-03 23:13 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko

On Fri, Jun 3, 2022 at 2:35 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Wed, Jun 1, 2022 at 12:02 PM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > Implement bpf_prog_query_opts as a more extensible version of
> > bpf_prog_query. Expose new prog_attach_flags and attach_btf_func_id as
> > well:
> >
> > * prog_attach_flags returns per-program attach_flags; relevant only for
> >   lsm cgroup programs which might have different attach_flags
> >   per attach_btf_id
> > * attach_btf_func_id is a new field exposed for prog_query which
> >   specifies the real BTF function id for lsm cgroup attachments
> >
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >  tools/include/uapi/linux/bpf.h |  3 +++
> >  tools/lib/bpf/bpf.c            | 40 +++++++++++++++++++++++++++-------
> >  tools/lib/bpf/bpf.h            | 15 +++++++++++++
> >  tools/lib/bpf/libbpf.map       |  2 +-
> >  4 files changed, 51 insertions(+), 9 deletions(-)
> >
>
> Few consistency nits, but otherwise:
>
> Acked-by: Andrii Nakryiko <andrii@kernel.org>

Thank you for the review, agreed, will address everything in a respin.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH bpf-next v8 03/11] bpf: per-cgroup lsm flavor
  2022-06-01 19:02 ` [PATCH bpf-next v8 03/11] bpf: per-cgroup lsm flavor Stanislav Fomichev
  2022-06-02  6:16   ` kernel test robot
@ 2022-06-04  6:11   ` Martin KaFai Lau
  2022-06-04  8:27     ` Martin KaFai Lau
  2022-06-06 22:46     ` Stanislav Fomichev
  1 sibling, 2 replies; 35+ messages in thread
From: Martin KaFai Lau @ 2022-06-04  6:11 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii

On Wed, Jun 01, 2022 at 12:02:10PM -0700, Stanislav Fomichev wrote:
> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> index 4adb4f3ecb7f..66b644a76a69 100644
> --- a/kernel/bpf/cgroup.c
> +++ b/kernel/bpf/cgroup.c
> @@ -14,6 +14,9 @@
>  #include <linux/string.h>
>  #include <linux/bpf.h>
>  #include <linux/bpf-cgroup.h>
> +#include <linux/btf_ids.h>
This should not be needed ?

> +#include <linux/bpf_lsm.h>
> +#include <linux/bpf_verifier.h>
>  #include <net/sock.h>
>  #include <net/bpf_sk_storage.h>
>  
> @@ -61,6 +64,88 @@ bpf_prog_run_array_cg(const struct cgroup_bpf *cgrp,
>  	return run_ctx.retval;
>  }
>  
> +unsigned int __cgroup_bpf_run_lsm_sock(const void *ctx,
> +				       const struct bpf_insn *insn)
> +{
> +	const struct bpf_prog *shim_prog;
> +	struct sock *sk;
> +	struct cgroup *cgrp;
> +	int ret = 0;
> +	u64 *args;
> +
> +	args = (u64 *)ctx;
> +	sk = (void *)(unsigned long)args[0];
> +	/*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
> +	shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
> +
> +	cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
> +	if (likely(cgrp))
> +		ret = bpf_prog_run_array_cg(&cgrp->bpf,
> +					    shim_prog->aux->cgroup_atype,
> +					    ctx, bpf_prog_run, 0, NULL);
> +	return ret;
> +}
> +
> +unsigned int __cgroup_bpf_run_lsm_socket(const void *ctx,
> +					 const struct bpf_insn *insn)
> +{
> +	const struct bpf_prog *shim_prog;
> +	struct socket *sock;
> +	struct cgroup *cgrp;
> +	int ret = 0;
> +	u64 *args;
> +
> +	args = (u64 *)ctx;
> +	sock = (void *)(unsigned long)args[0];
> +	/*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
> +	shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
> +
> +	cgrp = sock_cgroup_ptr(&sock->sk->sk_cgrp_data);
> +	if (likely(cgrp))
> +		ret = bpf_prog_run_array_cg(&cgrp->bpf,
> +					    shim_prog->aux->cgroup_atype,
> +					    ctx, bpf_prog_run, 0, NULL);
> +	return ret;
> +}
> +
> +unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
> +					  const struct bpf_insn *insn)
> +{
> +	const struct bpf_prog *shim_prog;
> +	struct cgroup *cgrp;
> +	int ret = 0;
> +
> +	/*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
> +	shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
> +
> +	rcu_read_lock();
> +	cgrp = task_dfl_cgroup(current);
> +	if (likely(cgrp))
> +		ret = bpf_prog_run_array_cg(&cgrp->bpf,
> +					    shim_prog->aux->cgroup_atype,
> +					    ctx, bpf_prog_run, 0, NULL);
> +	rcu_read_unlock();
> +	return ret;
> +}
> +
> +#ifdef CONFIG_BPF_LSM
> +static enum cgroup_bpf_attach_type
> +bpf_cgroup_atype_find(enum bpf_attach_type attach_type, u32 attach_btf_id)
> +{
> +	if (attach_type != BPF_LSM_CGROUP)
> +		return to_cgroup_bpf_attach_type(attach_type);
> +	return CGROUP_LSM_START + bpf_lsm_hook_idx(attach_btf_id);
> +}
> +#else
> +static enum cgroup_bpf_attach_type
> +bpf_cgroup_atype_find(enum bpf_attach_type attach_type, u32 attach_btf_id)
> +{
> +	if (attach_type != BPF_LSM_CGROUP)
> +		return to_cgroup_bpf_attach_type(attach_type);
> +	return -EOPNOTSUPP;
> +}
> +#endif /* CONFIG_BPF_LSM */
> +
>  void cgroup_bpf_offline(struct cgroup *cgrp)
>  {
>  	cgroup_get(cgrp);
> @@ -163,10 +248,16 @@ static void cgroup_bpf_release(struct work_struct *work)
>  
>  		hlist_for_each_entry_safe(pl, pltmp, progs, node) {
>  			hlist_del(&pl->node);
> -			if (pl->prog)
> +			if (pl->prog) {
> +				if (atype >= CGROUP_LSM_START && atype <= CGROUP_LSM_END)
nit.  Check expected_attach_type == BPF_LSM_CGROUP like
other places ?
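
i.e. roughly:

		if (pl->prog->expected_attach_type == BPF_LSM_CGROUP)
			bpf_trampoline_unlink_cgroup_shim(pl->prog);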

> +					bpf_trampoline_unlink_cgroup_shim(pl->prog);
>  				bpf_prog_put(pl->prog);
> -			if (pl->link)
> +			}
> +			if (pl->link) {
> +				if (atype >= CGROUP_LSM_START && atype <= CGROUP_LSM_END)
Same here.

> +					bpf_trampoline_unlink_cgroup_shim(pl->link->link.prog);
>  				bpf_cgroup_link_auto_detach(pl->link);
> +			}
>  			kfree(pl);
>  			static_branch_dec(&cgroup_bpf_enabled_key[atype]);
>  		}
> @@ -479,6 +570,8 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
>  	struct bpf_prog *old_prog = NULL;
>  	struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE] = {};
>  	struct bpf_cgroup_storage *new_storage[MAX_BPF_CGROUP_STORAGE_TYPE] = {};
> +	struct bpf_prog *new_prog = prog ? : link->link.prog;
> +	struct bpf_attach_target_info tgt_info = {};
>  	enum cgroup_bpf_attach_type atype;
>  	struct bpf_prog_list *pl;
>  	struct hlist_head *progs;
> @@ -495,7 +588,20 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
>  		/* replace_prog implies BPF_F_REPLACE, and vice versa */
>  		return -EINVAL;
>  
> -	atype = to_cgroup_bpf_attach_type(type);
> +	if (type == BPF_LSM_CGROUP) {
> +		if (replace_prog) {
> +			if (replace_prog->aux->attach_btf_id !=
> +			    new_prog->aux->attach_btf_id)
> +				return -EINVAL;
> +		}
This is no longer needed if it does not reuse the
replace_prog's cgroup_atype.

If old_prog is not NULL, it must have the same attach_btf_id.

This should still be true even if bpf_cgroup_atype_find() returns
a new cgroup_atype.  In that case, 'pl = find_attach_entry();'
must be NULL.

WDYT?

> +		err = bpf_check_attach_target(NULL, new_prog, NULL,
> +					      new_prog->aux->attach_btf_id,
> +					      &tgt_info);
If the above attach_btf_id check is unnecessary,
this can be done in bpf_trampoline_link_cgroup_shim(), which
is the only place that tgt_info is used.
It should also never fail because the same check was already done
once at prog load time, so doing it later is fine.

Then this whole "if (type == BPF_LSM_CGROUP)" case can be removed
from __cgroup_bpf_attach().

> +		if (err)
> +			return -EINVAL;
> +	}
> +
> +	atype = bpf_cgroup_atype_find(type, new_prog->aux->attach_btf_id);
>  	if (atype < 0)
>  		return -EINVAL;
>  
> @@ -549,9 +655,15 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
>  	bpf_cgroup_storages_assign(pl->storage, storage);
>  	cgrp->bpf.flags[atype] = saved_flags;
>  
> +	if (type == BPF_LSM_CGROUP && !old_prog) {
hmm... I think this "!old_prog" test should not be here.

In allow_multi, old_prog can be NULL but it still needs
to bump the shim_link's refcnt by calling
bpf_trampoline_link_cgroup_shim().

This is a bit tricky.  Does it make sense ?

> +		err = bpf_trampoline_link_cgroup_shim(new_prog, &tgt_info, atype);
> +		if (err)
> +			goto cleanup;
> +	}
> +
>  	err = update_effective_progs(cgrp, atype);
>  	if (err)
> -		goto cleanup;
> +		goto cleanup_trampoline;
>  
>  	if (old_prog)
Then it needs a bpf_trampoline_unlink_cgroup_shim(old_prog) here.

>  		bpf_prog_put(old_prog);
> @@ -560,6 +672,10 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
>  	bpf_cgroup_storages_link(new_storage, cgrp, type);
>  	return 0;
>  
> +cleanup_trampoline:
> +	if (type == BPF_LSM_CGROUP && !old_prog)
The "!old_prog" test should also be removed.

> +		bpf_trampoline_unlink_cgroup_shim(new_prog);
> +
>  cleanup:
>  	if (old_prog) {
>  		pl->prog = old_prog;
> @@ -651,7 +767,13 @@ static int __cgroup_bpf_replace(struct cgroup *cgrp,
>  	struct hlist_head *progs;
>  	bool found = false;
>  
> -	atype = to_cgroup_bpf_attach_type(link->type);
> +	if (link->type == BPF_LSM_CGROUP) {
> +		if (new_prog->aux->attach_btf_id !=
> +		    link->link.prog->aux->attach_btf_id)
> +			return -EINVAL;
This should be no longer needed also.
It will return -ENOENT later.

> +	}
> +
> +	atype = bpf_cgroup_atype_find(link->type, new_prog->aux->attach_btf_id);
>  	if (atype < 0)
>  		return -EINVAL;
>  
> @@ -803,9 +925,15 @@ static int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
>  	struct bpf_prog *old_prog;
>  	struct bpf_prog_list *pl;
>  	struct hlist_head *progs;
> +	u32 attach_btf_id = 0;
>  	u32 flags;
>  
> -	atype = to_cgroup_bpf_attach_type(type);
> +	if (prog)
> +		attach_btf_id = prog->aux->attach_btf_id;
> +	if (link)
> +		attach_btf_id = link->link.prog->aux->attach_btf_id;
> +
> +	atype = bpf_cgroup_atype_find(type, attach_btf_id);
>  	if (atype < 0)
>  		return -EINVAL;
>  
> @@ -832,6 +960,10 @@ static int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
>  		purge_effective_progs(cgrp, old_prog, link, atype);
>  	}
>  
> +
> +	if (type == BPF_LSM_CGROUP)
> +		bpf_trampoline_unlink_cgroup_shim(old_prog ? : link->link.prog);
For the '!old_prog' case (link case), do the
bpf_trampoline_unlink_cgroup_shim() in bpf_cgroup_link_release().

For the old_prog case (non-link case),
the bpf_trampoline_unlink_cgroup_shim() can be done
under the same 'if (old_prog)' check a few lines below.

Then for both cases the shim unlink will be closer to where bpf_prog_put()
is done and will follow the same pattern as __cgroup_bpf_attach() and
cgroup_bpf_release(), which is easier to reason about in the future:
shim unlink and then prog put.
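
i.e. something like (sketch):

	if (old_prog) {
		if (type == BPF_LSM_CGROUP)
			bpf_trampoline_unlink_cgroup_shim(old_prog);
		bpf_prog_put(old_prog);
	}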

[ ... ]

> diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
> index 5466e15be61f..45dfcece76e7 100644
> --- a/kernel/bpf/trampoline.c
> +++ b/kernel/bpf/trampoline.c
> @@ -11,6 +11,8 @@
>  #include <linux/rcupdate_wait.h>
>  #include <linux/module.h>
>  #include <linux/static_call.h>
> +#include <linux/bpf_verifier.h>
> +#include <linux/bpf_lsm.h>
>  
>  /* dummy _ops. The verifier will operate on target program's ops. */
>  const struct bpf_verifier_ops bpf_extension_verifier_ops = {
> @@ -496,6 +498,169 @@ int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampolin
>  	return err;
>  }
>  
> +#if defined(CONFIG_BPF_JIT) && defined(CONFIG_BPF_SYSCALL)
> +static void bpf_shim_tramp_link_release(struct bpf_link *link)
> +{
> +	struct bpf_shim_tramp_link *shim_link =
> +		container_of(link, struct bpf_shim_tramp_link, link.link);
> +
> +	if (!shim_link->trampoline)
> +		return;
> +
> +	WARN_ON_ONCE(bpf_trampoline_unlink_prog(&shim_link->link, shim_link->trampoline));
> +	bpf_trampoline_put(shim_link->trampoline);
> +}
> +
> +static void bpf_shim_tramp_link_dealloc(struct bpf_link *link)
> +{
> +	struct bpf_shim_tramp_link *shim_link =
> +		container_of(link, struct bpf_shim_tramp_link, link.link);
> +
> +	kfree(shim_link);
> +}
> +
> +static const struct bpf_link_ops bpf_shim_tramp_link_lops = {
> +	.release = bpf_shim_tramp_link_release,
> +	.dealloc = bpf_shim_tramp_link_dealloc,
> +};
> +
> +static struct bpf_shim_tramp_link *cgroup_shim_alloc(const struct bpf_prog *prog,
> +						     bpf_func_t bpf_func,
> +						     int cgroup_atype)
> +{
> +	struct bpf_shim_tramp_link *shim_link = NULL;
> +	struct bpf_prog *p;
> +
> +	shim_link = kzalloc(sizeof(*shim_link), GFP_USER);
> +	if (!shim_link)
> +		return NULL;
> +
> +	p = bpf_prog_alloc(1, 0);
> +	if (!p) {
> +		kfree(shim_link);
> +		return NULL;
> +	}
> +
> +	p->jited = false;
> +	p->bpf_func = bpf_func;
> +
> +	p->aux->cgroup_atype = cgroup_atype;
> +	p->aux->attach_func_proto = prog->aux->attach_func_proto;
> +	p->aux->attach_btf_id = prog->aux->attach_btf_id;
> +	p->aux->attach_btf = prog->aux->attach_btf;
> +	btf_get(p->aux->attach_btf);
> +	p->type = BPF_PROG_TYPE_LSM;
> +	p->expected_attach_type = BPF_LSM_MAC;
> +	bpf_prog_inc(p);
> +	bpf_link_init(&shim_link->link.link, BPF_LINK_TYPE_UNSPEC,
> +		      &bpf_shim_tramp_link_lops, p);
> +
> +	return shim_link;
> +}
> +
> +static struct bpf_shim_tramp_link *cgroup_shim_find(struct bpf_trampoline *tr,
> +						    bpf_func_t bpf_func)
> +{
> +	struct bpf_tramp_link *link;
> +	int kind;
> +
> +	for (kind = 0; kind < BPF_TRAMP_MAX; kind++) {
> +		hlist_for_each_entry(link, &tr->progs_hlist[kind], tramp_hlist) {
> +			struct bpf_prog *p = link->link.prog;
> +
> +			if (p->bpf_func == bpf_func)
> +				return container_of(link, struct bpf_shim_tramp_link, link);
> +		}
> +	}
> +
> +	return NULL;
> +}
> +
> +int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
> +				    struct bpf_attach_target_info *tgt_info,
> +				    int cgroup_atype)
> +{
> +	struct bpf_shim_tramp_link *shim_link = NULL;
> +	struct bpf_trampoline *tr;
> +	bpf_func_t bpf_func;
> +	u64 key;
> +	int err;
> +
> +	key = bpf_trampoline_compute_key(NULL, prog->aux->attach_btf,
> +					 prog->aux->attach_btf_id);
Directly get tgt_info here instead of doing it
in __cgroup_bpf_attach() and then passing in.
This is the only place that needs it.

	err = bpf_check_attach_target(NULL, prog, NULL,
				prog->aux->attach_btf_id,
				&tgt_info);
	if (err)
		return err;

> +
> +	bpf_lsm_find_cgroup_shim(prog, &bpf_func);
> +	tr = bpf_trampoline_get(key, tgt_info);
> +	if (!tr)
> +		return  -ENOMEM;
> +
> +	mutex_lock(&tr->mutex);
> +
> +	shim_link = cgroup_shim_find(tr, bpf_func);
> +	if (shim_link) {
> +		/* Reusing existing shim attached by the other program. */
> +		atomic64_inc(&shim_link->link.link.refcnt);
Use bpf_link_inc() instead to pair with bpf_link_put().

> +		/* note, we're still holding tr refcnt from above */
It has to do a bpf_trampoline_put(tr) after mutex_unlock(&tr->mutex).
shim_link already holds one refcnt to tr.

> +
> +		mutex_unlock(&tr->mutex);
> +		return 0;
> +	}
> +
> +	/* Allocate and install new shim. */
> +
> +	shim_link = cgroup_shim_alloc(prog, bpf_func, cgroup_atype);
> +	if (!shim_link) {
> +		err = -ENOMEM;
> +		goto out;
> +	}
> +
> +	err = __bpf_trampoline_link_prog(&shim_link->link, tr);
> +	if (err)
> +		goto out;
> +
> +	shim_link->trampoline = tr;
> +	/* note, we're still holding tr refcnt from above */
> +
> +	mutex_unlock(&tr->mutex);
> +
> +	return 0;
> +out:
> +	mutex_unlock(&tr->mutex);
> +
> +	if (shim_link)
> +		bpf_link_put(&shim_link->link.link);
> +
> +	bpf_trampoline_put(tr); /* bpf_trampoline_get above */
Is it done here because mutex_unlock(&tr->mutex) has
to be done first?  A comment would be useful.

How about passing tr to cgroup_shim_alloc(..., tr),
which initializes everything else in shim_link anyway?
Then the 'if (!shim_link->trampoline)' check in bpf_shim_tramp_link_release()
can go away as well.

Like:

static struct bpf_shim_tramp_link *cgroup_shim_alloc(const struct bpf_prog *prog,
						     bpf_func_t bpf_func,
						     struct bpf_trampoline *tr)

{
	/* ... */
	shim_link->trampoline = tr;

	return shim_link;
}

int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
				    int cgroup_atype)
{

	/* ... */

	/* Allocate and install new shim. */

	shim_link = cgroup_shim_alloc(prog, bpf_func, cgroup_atype, tr);
	if (!shim_link) {
		err = -ENOMEM;
		goto error;
	}

	err = __bpf_trampoline_link_prog(&shim_link->link, tr);
	if (err)
		goto error;

	mutex_unlock(&tr->mutex);

	return 0;

error:
	mutex_unlock(&tr->mutex);
	if (shim_link)
		bpf_link_put(&shim_link->link.link);
	else
		bpf_trampoline_put(tr);

	return err;
}

[ ... ]

> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index aedac2ac02b9..caa5740b39b3 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -7264,6 +7264,18 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
>  				reg_type_str(env, regs[BPF_REG_1].type));
>  			return -EACCES;
>  		}
> +		break;
> +	case BPF_FUNC_set_retval:
> +		if (env->prog->expected_attach_type == BPF_LSM_CGROUP) {
> +			if (!env->prog->aux->attach_func_proto->type) {
> +				/* Make sure programs that attach to void
> +				 * hooks don't try to modify return value.
> +				 */
> +				err = -EINVAL;
nit. Directly 'return -EINVAL;' after verbose() logging.

> +				verbose(env, "BPF_LSM_CGROUP that attach to void LSM hooks can't modify return value!\n");
> +			}
> +		}
> +		break;
>  	}
>  
>  	if (err)
> @@ -10474,6 +10486,23 @@ static int check_return_code(struct bpf_verifier_env *env)
>  	case BPF_PROG_TYPE_SK_LOOKUP:
>  		range = tnum_range(SK_DROP, SK_PASS);
>  		break;
> +
> +	case BPF_PROG_TYPE_LSM:
> +		if (env->prog->expected_attach_type == BPF_LSM_CGROUP) {
> +			if (!env->prog->aux->attach_func_proto->type) {
nit. Check 'if ( ... != BPF_LSM_CGROUP) return 0;' first to remove
one level of indentation.

> +				/* Make sure programs that attach to void
> +				 * hooks don't try to modify return value.
> +				 */
> +				range = tnum_range(1, 1);
> +			}
> +		} else {
> +			/* regular BPF_PROG_TYPE_LSM programs can return
> +			 * any value.
> +			 */
> +			return 0;
> +		}
> +		break;
> +
>  	case BPF_PROG_TYPE_EXT:
>  		/* freplace program can return anything as its return value
>  		 * depends on the to-be-replaced kernel func or bpf program.
> @@ -10490,6 +10519,8 @@ static int check_return_code(struct bpf_verifier_env *env)
>  
>  	if (!tnum_in(range, reg->var_off)) {
>  		verbose_invalid_scalar(env, reg, &range, "program exit", "R0");
> +		if (env->prog->expected_attach_type == BPF_LSM_CGROUP)
> +			verbose(env, "BPF_LSM_CGROUP that attach to void LSM hooks can't modify return value!\n");
It is not accurate to print this message for void-return hooks only.
For an int-return lsm hook, a BPF_LSM_CGROUP prog returning
neither 0 nor 1 will also trigger this range check failure.

>  		return -EINVAL;
>  	}
>  
> @@ -14713,6 +14744,7 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
>  		fallthrough;
>  	case BPF_MODIFY_RETURN:
>  	case BPF_LSM_MAC:
> +	case BPF_LSM_CGROUP:
>  	case BPF_TRACE_FENTRY:
>  	case BPF_TRACE_FEXIT:
>  		if (!btf_type_is_func(t)) {

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH bpf-next v8 04/11] bpf: minimize number of allocated lsm slots per program
  2022-06-01 19:02 ` [PATCH bpf-next v8 04/11] bpf: minimize number of allocated lsm slots per program Stanislav Fomichev
@ 2022-06-04  6:35   ` Martin KaFai Lau
  2022-06-06 22:46     ` Stanislav Fomichev
  0 siblings, 1 reply; 35+ messages in thread
From: Martin KaFai Lau @ 2022-06-04  6:35 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii

On Wed, Jun 01, 2022 at 12:02:11PM -0700, Stanislav Fomichev wrote:
> diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
> index 0f72020bfdcf..83aa431dd52e 100644
> --- a/kernel/bpf/bpf_lsm.c
> +++ b/kernel/bpf/bpf_lsm.c
> @@ -69,11 +69,6 @@ void bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
>  		*bpf_func = __cgroup_bpf_run_lsm_current;
>  }
>  
> -int bpf_lsm_hook_idx(u32 btf_id)
> -{
> -	return btf_id_set_index(&bpf_lsm_hooks, btf_id);
The implementation of btf_id_set_index() added in patch 3
should be removed also.

> -}
> -
>  int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
>  			const struct bpf_prog *prog)
>  {
> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> index 66b644a76a69..a27a6a7bd852 100644
> --- a/kernel/bpf/cgroup.c
> +++ b/kernel/bpf/cgroup.c
> @@ -129,12 +129,46 @@ unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
>  }
>  
>  #ifdef CONFIG_BPF_LSM
> +u32 cgroup_lsm_atype_btf_id[CGROUP_LSM_NUM];
static

> +
>  static enum cgroup_bpf_attach_type
>  bpf_cgroup_atype_find(enum bpf_attach_type attach_type, u32 attach_btf_id)
>  {
> +	int i;
> +
> +	lockdep_assert_held(&cgroup_mutex);
> +
>  	if (attach_type != BPF_LSM_CGROUP)
>  		return to_cgroup_bpf_attach_type(attach_type);
> -	return CGROUP_LSM_START + bpf_lsm_hook_idx(attach_btf_id);
> +
> +	for (i = 0; i < ARRAY_SIZE(cgroup_lsm_atype_btf_id); i++)
> +		if (cgroup_lsm_atype_btf_id[i] == attach_btf_id)
> +			return CGROUP_LSM_START + i;
> +
> +	for (i = 0; i < ARRAY_SIZE(cgroup_lsm_atype_btf_id); i++)
> +		if (cgroup_lsm_atype_btf_id[i] == 0)
> +			return CGROUP_LSM_START + i;
> +
> +	return -E2BIG;
> +
> +}
> +
> +static void bpf_cgroup_atype_alloc(u32 attach_btf_id, int cgroup_atype)
> +{
> +	int i = cgroup_atype - CGROUP_LSM_START;
> +
> +	lockdep_assert_held(&cgroup_mutex);
> +
> +	cgroup_lsm_atype_btf_id[i] = attach_btf_id;
> +}
> +
> +void bpf_cgroup_atype_free(int cgroup_atype)
> +{
> +	int i = cgroup_atype - CGROUP_LSM_START;
> +
> +	mutex_lock(&cgroup_mutex);
> +	cgroup_lsm_atype_btf_id[i] = 0;
I think holding cgroup_mutex in the __cgroup_bpf_attach() and
bpf_cgroup_atype_free() is not enough.

Consider a case where __cgroup_bpf_attach() runs first and bpf_trampoline_link_cgroup_shim()
cannot find the shim_link because it has been unlinked and its shim_prog
is still sitting on bpf_prog_free_deferred()'s dead row.
bpf_prog_free_deferred() then runs and does the bpf_cgroup_atype_free().

A refcnt is still needed.  It is better to put them together in a
struct instead of having two arrays, like

struct cgroup_lsm_atype {
       u32 attach_btf_id;
       u32 refcnt;
};
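
A refcounted alloc/free could then look roughly like this (a sketch;
names are illustrative):

static struct cgroup_lsm_atype cgroup_lsm_atype[CGROUP_LSM_NUM];

static void bpf_cgroup_atype_get(u32 attach_btf_id, int cgroup_atype)
{
	int i = cgroup_atype - CGROUP_LSM_START;

	lockdep_assert_held(&cgroup_mutex);
	cgroup_lsm_atype[i].attach_btf_id = attach_btf_id;
	cgroup_lsm_atype[i].refcnt++;
}

void bpf_cgroup_atype_put(int cgroup_atype)
{
	int i = cgroup_atype - CGROUP_LSM_START;

	mutex_lock(&cgroup_mutex);
	if (--cgroup_lsm_atype[i].refcnt <= 0)
		cgroup_lsm_atype[i].attach_btf_id = 0;
	mutex_unlock(&cgroup_mutex);
}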

> +	mutex_unlock(&cgroup_mutex);
>  }
>  #else
>  static enum cgroup_bpf_attach_type
> @@ -144,6 +178,14 @@ bpf_cgroup_atype_find(enum bpf_attach_type attach_type, u32 attach_btf_id)
>  		return to_cgroup_bpf_attach_type(attach_type);
>  	return -EOPNOTSUPP;
>  }
> +
> +static void bpf_cgroup_atype_alloc(u32 attach_btf_id, int cgroup_atype)
> +{
> +}
> +
> +void bpf_cgroup_atype_free(int cgroup_atype)
> +{
> +}
>  #endif /* CONFIG_BPF_LSM */
>  
>  void cgroup_bpf_offline(struct cgroup *cgrp)
> @@ -659,6 +701,7 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
>  		err = bpf_trampoline_link_cgroup_shim(new_prog, &tgt_info, atype);
>  		if (err)
>  			goto cleanup;
> +		bpf_cgroup_atype_alloc(new_prog->aux->attach_btf_id, atype);
This atype alloc (or refcnt inc) should be done in
cgroup_shim_alloc(), where the shim_prog is the one holding
the refcnt of the atype.  If the "!old_prog" test above needs to be
removed per the earlier comment on patch 3, bumping the atype refcnt
here will be wrong.

>  	}
>  
>  	err = update_effective_progs(cgrp, atype);
> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> index 091ee210842f..224bb4d4fe4e 100644
> --- a/kernel/bpf/core.c
> +++ b/kernel/bpf/core.c
> @@ -107,6 +107,9 @@ struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flag
>  	fp->aux->prog = fp;
>  	fp->jit_requested = ebpf_jit_enabled();
>  	fp->blinding_requested = bpf_jit_blinding_enabled(fp);
> +#ifdef CONFIG_BPF_LSM
I don't think this is needed.

> +	aux->cgroup_atype = CGROUP_BPF_ATTACH_TYPE_INVALID;
> +#endif
>  
>  	INIT_LIST_HEAD_RCU(&fp->aux->ksym.lnode);
>  	mutex_init(&fp->aux->used_maps_mutex);
> @@ -2558,6 +2561,10 @@ static void bpf_prog_free_deferred(struct work_struct *work)
>  	aux = container_of(work, struct bpf_prog_aux, work);
>  #ifdef CONFIG_BPF_SYSCALL
>  	bpf_free_kfunc_btf_tab(aux->kfunc_btf_tab);
> +#endif
> +#ifdef CONFIG_BPF_LSM
Same here.

> +	if (aux->cgroup_atype != CGROUP_BPF_ATTACH_TYPE_INVALID)
> +		bpf_cgroup_atype_free(aux->cgroup_atype);
>  #endif
>  	bpf_free_used_maps(aux);
>  	bpf_free_used_btfs(aux);
> -- 
> 2.36.1.255.ge46751e96f-goog
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH bpf-next v8 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-06-01 19:02 ` [PATCH bpf-next v8 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP Stanislav Fomichev
@ 2022-06-04  6:59   ` Martin KaFai Lau
  2022-06-04  8:05     ` Martin KaFai Lau
  2022-06-06 22:46     ` Stanislav Fomichev
  0 siblings, 2 replies; 35+ messages in thread
From: Martin KaFai Lau @ 2022-06-04  6:59 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii

On Wed, Jun 01, 2022 at 12:02:12PM -0700, Stanislav Fomichev wrote:
> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> index a27a6a7bd852..cb3338ef01e0 100644
> --- a/kernel/bpf/cgroup.c
> +++ b/kernel/bpf/cgroup.c
> @@ -1035,6 +1035,7 @@ static int cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
>  static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
>  			      union bpf_attr __user *uattr)
>  {
> +	__u32 __user *prog_attach_flags = u64_to_user_ptr(attr->query.prog_attach_flags);
>  	__u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids);
>  	enum bpf_attach_type type = attr->query.attach_type;
>  	enum cgroup_bpf_attach_type atype;
> @@ -1042,50 +1043,92 @@ static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
>  	struct hlist_head *progs;
>  	struct bpf_prog *prog;
>  	int cnt, ret = 0, i;
> +	int total_cnt = 0;
>  	u32 flags;
>  
> -	atype = to_cgroup_bpf_attach_type(type);
> -	if (atype < 0)
> -		return -EINVAL;
> +	enum cgroup_bpf_attach_type from_atype, to_atype;
>  
> -	progs = &cgrp->bpf.progs[atype];
> -	flags = cgrp->bpf.flags[atype];
> +	if (type == BPF_LSM_CGROUP) {
> +		from_atype = CGROUP_LSM_START;
> +		to_atype = CGROUP_LSM_END;
Enforce prog_attach_flags for BPF_LSM_CGROUP:

		if (total_cnt && !prog_attach_flags)
			return -EINVAL;
			
> +	} else {
> +		from_atype = to_cgroup_bpf_attach_type(type);
> +		if (from_atype < 0)
> +			return -EINVAL;
> +		to_atype = from_atype;
> +	}
>  
> -	effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> -					      lockdep_is_held(&cgroup_mutex));
> +	for (atype = from_atype; atype <= to_atype; atype++) {
> +		progs = &cgrp->bpf.progs[atype];
> +		flags = cgrp->bpf.flags[atype];
>  
> -	if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> -		cnt = bpf_prog_array_length(effective);
> -	else
> -		cnt = prog_list_length(progs);
> +		effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> +						      lockdep_is_held(&cgroup_mutex));
nit. This can be done under the BPF_F_QUERY_EFFECTIVE case below.

>  
> -	if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
> -		return -EFAULT;
> -	if (copy_to_user(&uattr->query.prog_cnt, &cnt, sizeof(cnt)))
> +		if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> +			total_cnt += bpf_prog_array_length(effective);
> +		else
> +			total_cnt += prog_list_length(progs);
> +	}
> +
> +	if (type != BPF_LSM_CGROUP)
> +		if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
> +			return -EFAULT;
nit. Move this copy_to_user(&uattr->query.attach_flags,...) to the
'if (type == BPF_LSM_CGROUP) { from_atype = .... } else { ... }' above.
That will save a BPF_LSM_CGROUP test.

I think the '== BPF_LSM_CGROUP' case needs to copy a 0 to user also.

> +	if (copy_to_user(&uattr->query.prog_cnt, &total_cnt, sizeof(total_cnt)))
>  		return -EFAULT;
> -	if (attr->query.prog_cnt == 0 || !prog_ids || !cnt)
> +	if (attr->query.prog_cnt == 0 || !prog_ids || !total_cnt)
>  		/* return early if user requested only program count + flags */
>  		return 0;
> -	if (attr->query.prog_cnt < cnt) {
> -		cnt = attr->query.prog_cnt;
> +
> +	if (attr->query.prog_cnt < total_cnt) {
> +		total_cnt = attr->query.prog_cnt;
>  		ret = -ENOSPC;
>  	}
>  
> -	if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
> -		return bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> -	} else {
> -		struct bpf_prog_list *pl;
> -		u32 id;
> +	for (atype = from_atype; atype <= to_atype; atype++) {
> +		if (total_cnt <= 0)
nit. total_cnt cannot be -ve ?
!total_cnt instead ?
and maybe do it in the for-loop test.

> +			break;
>  
> -		i = 0;
> -		hlist_for_each_entry(pl, progs, node) {
> -			prog = prog_list_prog(pl);
> -			id = prog->aux->id;
> -			if (copy_to_user(prog_ids + i, &id, sizeof(id)))
> -				return -EFAULT;
> -			if (++i == cnt)
> -				break;
> +		progs = &cgrp->bpf.progs[atype];
> +		flags = cgrp->bpf.flags[atype];
> +
> +		effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> +						      lockdep_is_held(&cgroup_mutex));
nit. This can be done under the BPF_F_QUERY_EFFECTIVE case below.

> +
> +		if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> +			cnt = bpf_prog_array_length(effective);
> +		else
> +			cnt = prog_list_length(progs);
> +
> +		if (cnt >= total_cnt)
> +			cnt = total_cnt;
nit. This seems to be the only reason that
the "if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)"
needs to be broken up into two halves, one above and one below.
It does not save much code.  How about repeating this one line
'cnt = min_t(int, cnt, total_cnt);' instead ?

> +
> +		if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
> +			ret = bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> +		} else {
> +			struct bpf_prog_list *pl;
> +			u32 id;
> +
> +			i = 0;
> +			hlist_for_each_entry(pl, progs, node) {
> +				prog = prog_list_prog(pl);
> +				id = prog->aux->id;
> +				if (copy_to_user(prog_ids + i, &id, sizeof(id)))
> +					return -EFAULT;
> +				if (++i == cnt)
> +					break;
> +			}
>  		}
> +
> +		if (prog_attach_flags)
> +			for (i = 0; i < cnt; i++)
> +				if (copy_to_user(prog_attach_flags + i, &flags, sizeof(flags)))
> +					return -EFAULT;
> +
> +		prog_ids += cnt;
> +		total_cnt -= cnt;
> +		if (prog_attach_flags)
> +			prog_attach_flags += cnt;
nit. Merge this into the above "if (prog_attach_flags)" case.

>  	}
>  	return ret;
>  }
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index a237be4f8bb3..27492d44133f 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -3520,7 +3520,7 @@ static int bpf_prog_detach(const union bpf_attr *attr)
>  	}
>  }
>  
> -#define BPF_PROG_QUERY_LAST_FIELD query.prog_cnt
> +#define BPF_PROG_QUERY_LAST_FIELD query.prog_attach_flags
>  
>  static int bpf_prog_query(const union bpf_attr *attr,
>  			  union bpf_attr __user *uattr)
> @@ -3556,6 +3556,7 @@ static int bpf_prog_query(const union bpf_attr *attr,
>  	case BPF_CGROUP_SYSCTL:
>  	case BPF_CGROUP_GETSOCKOPT:
>  	case BPF_CGROUP_SETSOCKOPT:
> +	case BPF_LSM_CGROUP:
>  		return cgroup_bpf_prog_query(attr, uattr);
>  	case BPF_LIRC_MODE2:
>  		return lirc_prog_query(attr, uattr);
> @@ -4066,6 +4067,9 @@ static int bpf_prog_get_info_by_fd(struct file *file,
>  
>  	if (prog->aux->btf)
>  		info.btf_id = btf_obj_id(prog->aux->btf);
> +	info.attach_btf_id = prog->aux->attach_btf_id;
> +	if (prog->aux->attach_btf)
> +		info.attach_btf_obj_id = btf_obj_id(prog->aux->attach_btf);
Need this also:

	else if (prog->aux->dst_prog)
		info.attach_btf_obj_id = btf_obj_id(prog->aux->dst_prog->aux->attach_btf);

>  
>  	ulen = info.nr_func_info;
>  	info.nr_func_info = prog->aux->func_info_cnt;
> -- 
> 2.36.1.255.ge46751e96f-goog
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH bpf-next v8 06/11] bpf: allow writing to a subset of sock fields from lsm progtype
  2022-06-01 19:02 ` [PATCH bpf-next v8 06/11] bpf: allow writing to a subset of sock fields from lsm progtype Stanislav Fomichev
@ 2022-06-04  7:18   ` Martin KaFai Lau
  2022-06-06 22:46     ` Stanislav Fomichev
  0 siblings, 1 reply; 35+ messages in thread
From: Martin KaFai Lau @ 2022-06-04  7:18 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii

On Wed, Jun 01, 2022 at 12:02:13PM -0700, Stanislav Fomichev wrote:
> For now, allow only the obvious ones, like sk_priority and sk_mark.
> 
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>  kernel/bpf/bpf_lsm.c  | 58 +++++++++++++++++++++++++++++++++++++++++++
>  kernel/bpf/verifier.c |  3 ++-
>  2 files changed, 60 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
> index 83aa431dd52e..feba8e96f58d 100644
> --- a/kernel/bpf/bpf_lsm.c
> +++ b/kernel/bpf/bpf_lsm.c
> @@ -303,7 +303,65 @@ bool bpf_lsm_is_sleepable_hook(u32 btf_id)
>  const struct bpf_prog_ops lsm_prog_ops = {
>  };
>  
> +static int lsm_btf_struct_access(struct bpf_verifier_log *log,
> +					const struct btf *btf,
> +					const struct btf_type *t, int off,
> +					int size, enum bpf_access_type atype,
> +					u32 *next_btf_id,
> +					enum bpf_type_flag *flag)
> +{
> +	const struct btf_type *sock_type;
> +	struct btf *btf_vmlinux;
> +	s32 type_id;
> +	size_t end;
> +
> +	if (atype == BPF_READ)
> +		return btf_struct_access(log, btf, t, off, size, atype, next_btf_id,
> +					 flag);
> +
> +	btf_vmlinux = bpf_get_btf_vmlinux();
> +	if (!btf_vmlinux) {
> +		bpf_log(log, "no vmlinux btf\n");
> +		return -EOPNOTSUPP;
> +	}
> +
> +	type_id = btf_find_by_name_kind(btf_vmlinux, "sock", BTF_KIND_STRUCT);
> +	if (type_id < 0) {
> +		bpf_log(log, "'struct sock' not found in vmlinux btf\n");
> +		return -EINVAL;
> +	}
> +
> +	sock_type = btf_type_by_id(btf_vmlinux, type_id);
> +
> +	if (t != sock_type) {
> +		bpf_log(log, "only 'struct sock' writes are supported\n");
> +		return -EACCES;
> +	}
> +
> +	switch (off) {
> +	case bpf_ctx_range(struct sock, sk_priority):
This looks wrong.  It should not allow writes at
arbitrary byte offsets within the '__u32 sk_priority'.

> +		end = offsetofend(struct sock, sk_priority);
> +		break;
> +	case bpf_ctx_range(struct sock, sk_mark):
Same here.

It just came to mind:
if the current need is only sk_priority and sk_mark,
do you think allowing bpf_setsockopt would be more useful instead ?
It currently has SO_MARK, SO_PRIORITY, and other options.
Also, changing SO_MARK requires clearing sk->sk_dst_cache.
In general, is it safe to do bpf_setsockopt in all bpf_lsm hooks ?

> +		end = offsetofend(struct sock, sk_mark);
> +		break;
> +	default:
> +		bpf_log(log, "no write support to 'struct sock' at off %d\n", off);
> +		return -EACCES;
> +	}
> +
> +	if (off + size > end) {
> +		bpf_log(log,
> +			"write access at off %d with size %d beyond the member of 'struct sock' ended at %zu\n",
> +			off, size, end);
> +		return -EACCES;
> +	}
> +
> +	return NOT_INIT;
> +}
> +
>  const struct bpf_verifier_ops lsm_verifier_ops = {
>  	.get_func_proto = bpf_lsm_func_proto,
>  	.is_valid_access = btf_ctx_access,
> +	.btf_struct_access = lsm_btf_struct_access,
>  };
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index caa5740b39b3..c54e171d9c23 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -13413,7 +13413,8 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
>  				insn->code = BPF_LDX | BPF_PROBE_MEM |
>  					BPF_SIZE((insn)->code);
>  				env->prog->aux->num_exentries++;
> -			} else if (resolve_prog_type(env->prog) != BPF_PROG_TYPE_STRUCT_OPS) {
> +			} else if (resolve_prog_type(env->prog) != BPF_PROG_TYPE_STRUCT_OPS &&
> +				   resolve_prog_type(env->prog) != BPF_PROG_TYPE_LSM) {
>  				verbose(env, "Writes through BTF pointers are not allowed\n");
>  				return -EINVAL;
>  			}
> -- 
> 2.36.1.255.ge46751e96f-goog
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH bpf-next v8 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-06-04  6:59   ` Martin KaFai Lau
@ 2022-06-04  8:05     ` Martin KaFai Lau
  2022-06-06 22:46     ` Stanislav Fomichev
  1 sibling, 0 replies; 35+ messages in thread
From: Martin KaFai Lau @ 2022-06-04  8:05 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii

On Fri, Jun 03, 2022 at 11:59:38PM -0700, Martin KaFai Lau wrote:
> On Wed, Jun 01, 2022 at 12:02:12PM -0700, Stanislav Fomichev wrote:
> > diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> > index a27a6a7bd852..cb3338ef01e0 100644
> > --- a/kernel/bpf/cgroup.c
> > +++ b/kernel/bpf/cgroup.c
> > @@ -1035,6 +1035,7 @@ static int cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
> >  static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
> >  			      union bpf_attr __user *uattr)
> >  {
> > +	__u32 __user *prog_attach_flags = u64_to_user_ptr(attr->query.prog_attach_flags);
> >  	__u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids);
> >  	enum bpf_attach_type type = attr->query.attach_type;
> >  	enum cgroup_bpf_attach_type atype;
> > @@ -1042,50 +1043,92 @@ static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
> >  	struct hlist_head *progs;
> >  	struct bpf_prog *prog;
> >  	int cnt, ret = 0, i;
> > +	int total_cnt = 0;
> >  	u32 flags;
> >  
> > -	atype = to_cgroup_bpf_attach_type(type);
> > -	if (atype < 0)
> > -		return -EINVAL;
> > +	enum cgroup_bpf_attach_type from_atype, to_atype;
> >  
> > -	progs = &cgrp->bpf.progs[atype];
> > -	flags = cgrp->bpf.flags[atype];
> > +	if (type == BPF_LSM_CGROUP) {
> > +		from_atype = CGROUP_LSM_START;
> > +		to_atype = CGROUP_LSM_END;
> Enforce prog_attach_flags for BPF_LSM_CGROUP:
> 
> 		if (total_cnt && !prog_attach_flags)
Correction.  Mixed up with the total_cnt below.

Should be "attr->query.prog_cnt && prog_ids && !prog_attach_flags".

> 			return -EINVAL;
> 			


[ ... ]

> > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > index a237be4f8bb3..27492d44133f 100644
> > --- a/kernel/bpf/syscall.c
> > +++ b/kernel/bpf/syscall.c
> > @@ -3520,7 +3520,7 @@ static int bpf_prog_detach(const union bpf_attr *attr)
> >  	}
> >  }
> >  
> > -#define BPF_PROG_QUERY_LAST_FIELD query.prog_cnt
> > +#define BPF_PROG_QUERY_LAST_FIELD query.prog_attach_flags
> >  
> >  static int bpf_prog_query(const union bpf_attr *attr,
> >  			  union bpf_attr __user *uattr)
> > @@ -3556,6 +3556,7 @@ static int bpf_prog_query(const union bpf_attr *attr,
> >  	case BPF_CGROUP_SYSCTL:
> >  	case BPF_CGROUP_GETSOCKOPT:
> >  	case BPF_CGROUP_SETSOCKOPT:
> > +	case BPF_LSM_CGROUP:
> >  		return cgroup_bpf_prog_query(attr, uattr);
> >  	case BPF_LIRC_MODE2:
> >  		return lirc_prog_query(attr, uattr);
> > @@ -4066,6 +4067,9 @@ static int bpf_prog_get_info_by_fd(struct file *file,
> >  
> >  	if (prog->aux->btf)
> >  		info.btf_id = btf_obj_id(prog->aux->btf);
> > +	info.attach_btf_id = prog->aux->attach_btf_id;
> > +	if (prog->aux->attach_btf)
> > +		info.attach_btf_obj_id = btf_obj_id(prog->aux->attach_btf);
> Need this also:
> 
> 	else if (prog->aux->dst_prog)
> 		info.attach_btf_obj_id = btf_obj_id(prog->aux->dst_prog->aux->attach_btf);
Should be btf_obj_id(prog->aux->dst_prog->aux->btf) instead.

> 
> >  
> >  	ulen = info.nr_func_info;
> >  	info.nr_func_info = prog->aux->func_info_cnt;
> > -- 
> > 2.36.1.255.ge46751e96f-goog
> > 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH bpf-next v8 03/11] bpf: per-cgroup lsm flavor
  2022-06-04  6:11   ` Martin KaFai Lau
@ 2022-06-04  8:27     ` Martin KaFai Lau
  2022-06-06 22:46       ` Stanislav Fomichev
  2022-06-06 22:46     ` Stanislav Fomichev
  1 sibling, 1 reply; 35+ messages in thread
From: Martin KaFai Lau @ 2022-06-04  8:27 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii

On Fri, Jun 03, 2022 at 11:11:58PM -0700, Martin KaFai Lau wrote:
> > @@ -549,9 +655,15 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
> >  	bpf_cgroup_storages_assign(pl->storage, storage);
> >  	cgrp->bpf.flags[atype] = saved_flags;
> >  
> > +	if (type == BPF_LSM_CGROUP && !old_prog) {
> hmm... I think this "!old_prog" test should not be here.
> 
> In allow_multi, old_prog can be NULL but it still needs
> to bump the shim_link's refcnt by calling
> bpf_trampoline_link_cgroup_shim().
> 
> This is a bit tricky.  Does it make sense ?
I think I read the "!"old_prog upside-down.  I think I got the
intention here now after reading some later patches.
It is to save a bpf_trampoline_link_cgroup_shim() and unlink
for the replace case ?  I would prefer not to do this.
It is quite confusing to read and does not save much.

> 
> > +		err = bpf_trampoline_link_cgroup_shim(new_prog, &tgt_info, atype);
> > +		if (err)
> > +			goto cleanup;
> > +	}
> > +
> >  	err = update_effective_progs(cgrp, atype);
> >  	if (err)
> > -		goto cleanup;
> > +		goto cleanup_trampoline;
> >  
> >  	if (old_prog)
> Then it needs a bpf_trampoline_unlink_cgroup_shim(old_prog) here.
> 
> >  		bpf_prog_put(old_prog);
> > @@ -560,6 +672,10 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
> >  	bpf_cgroup_storages_link(new_storage, cgrp, type);
> >  	return 0;
> >  
> > +cleanup_trampoline:
> > +	if (type == BPF_LSM_CGROUP && !old_prog)
> The "!old_prog" test should also be removed.
> 
> > +		bpf_trampoline_unlink_cgroup_shim(new_prog);
> > +
> >  cleanup:
> >  	if (old_prog) {
> >  		pl->prog = old_prog;

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH bpf-next v8 03/11] bpf: per-cgroup lsm flavor
  2022-06-04  8:27     ` Martin KaFai Lau
@ 2022-06-06 22:46       ` Stanislav Fomichev
  0 siblings, 0 replies; 35+ messages in thread
From: Stanislav Fomichev @ 2022-06-06 22:46 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: netdev, bpf, ast, daniel, andrii

On Sat, Jun 4, 2022 at 1:27 AM Martin KaFai Lau <kafai@fb.com> wrote:
>
> On Fri, Jun 03, 2022 at 11:11:58PM -0700, Martin KaFai Lau wrote:
> > > @@ -549,9 +655,15 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
> > >     bpf_cgroup_storages_assign(pl->storage, storage);
> > >     cgrp->bpf.flags[atype] = saved_flags;
> > >
> > > +   if (type == BPF_LSM_CGROUP && !old_prog) {
> > hmm... I think this "!old_prog" test should not be here.
> >
> > In allow_multi, old_prog can be NULL but it still needs
> > to bump the shim_link's refcnt by calling
> > bpf_trampoline_link_cgroup_shim().
> >
> > This is a bit tricky.  Does it make sense ?
> I think I read the "!"old_prog upside-down.  I think I got the
> intention here now after reading some later patches.
> It is to save a bpf_trampoline_link_cgroup_shim() and unlink
> for the replace case ?  I would prefer not to do this.
> It is quite confusing to read and does not save much.

Ok, let me try to drop it!


> > > +           err = bpf_trampoline_link_cgroup_shim(new_prog, &tgt_info, atype);
> > > +           if (err)
> > > +                   goto cleanup;
> > > +   }
> > > +
> > >     err = update_effective_progs(cgrp, atype);
> > >     if (err)
> > > -           goto cleanup;
> > > +           goto cleanup_trampoline;
> > >
> > >     if (old_prog)
> > Then it needs a bpf_trampoline_unlink_cgroup_shim(old_prog) here.
> >
> > >             bpf_prog_put(old_prog);
> > > @@ -560,6 +672,10 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
> > >     bpf_cgroup_storages_link(new_storage, cgrp, type);
> > >     return 0;
> > >
> > > +cleanup_trampoline:
> > > +   if (type == BPF_LSM_CGROUP && !old_prog)
> > The "!old_prog" test should also be removed.
> >
> > > +           bpf_trampoline_unlink_cgroup_shim(new_prog);
> > > +
> > >  cleanup:
> > >     if (old_prog) {
> > >             pl->prog = old_prog;

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH bpf-next v8 03/11] bpf: per-cgroup lsm flavor
  2022-06-04  6:11   ` Martin KaFai Lau
  2022-06-04  8:27     ` Martin KaFai Lau
@ 2022-06-06 22:46     ` Stanislav Fomichev
  2022-06-07 19:24       ` Martin KaFai Lau
  1 sibling, 1 reply; 35+ messages in thread
From: Stanislav Fomichev @ 2022-06-06 22:46 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: netdev, bpf, ast, daniel, andrii

"

On Fri, Jun 3, 2022 at 11:12 PM Martin KaFai Lau <kafai@fb.com> wrote:
>
> On Wed, Jun 01, 2022 at 12:02:10PM -0700, Stanislav Fomichev wrote:
> > diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> > index 4adb4f3ecb7f..66b644a76a69 100644
> > --- a/kernel/bpf/cgroup.c
> > +++ b/kernel/bpf/cgroup.c
> > @@ -14,6 +14,9 @@
> >  #include <linux/string.h>
> >  #include <linux/bpf.h>
> >  #include <linux/bpf-cgroup.h>
> > +#include <linux/btf_ids.h>
> This should not be needed ?

For some reason I thought that was needed for BTF_SOCK_TYPE_SOCKET here..

> > +#include <linux/bpf_lsm.h>
> > +#include <linux/bpf_verifier.h>
> >  #include <net/sock.h>
> >  #include <net/bpf_sk_storage.h>
> >
> > @@ -61,6 +64,88 @@ bpf_prog_run_array_cg(const struct cgroup_bpf *cgrp,
> >       return run_ctx.retval;
> >  }
> >
> > +unsigned int __cgroup_bpf_run_lsm_sock(const void *ctx,
> > +                                    const struct bpf_insn *insn)
> > +{
> > +     const struct bpf_prog *shim_prog;
> > +     struct sock *sk;
> > +     struct cgroup *cgrp;
> > +     int ret = 0;
> > +     u64 *args;
> > +
> > +     args = (u64 *)ctx;
> > +     sk = (void *)(unsigned long)args[0];
> > +     /*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
> > +     shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
> > +
> > +     cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
> > +     if (likely(cgrp))
> > +             ret = bpf_prog_run_array_cg(&cgrp->bpf,
> > +                                         shim_prog->aux->cgroup_atype,
> > +                                         ctx, bpf_prog_run, 0, NULL);
> > +     return ret;
> > +}
> > +
> > +unsigned int __cgroup_bpf_run_lsm_socket(const void *ctx,
> > +                                      const struct bpf_insn *insn)
> > +{
> > +     const struct bpf_prog *shim_prog;
> > +     struct socket *sock;
> > +     struct cgroup *cgrp;
> > +     int ret = 0;
> > +     u64 *args;
> > +
> > +     args = (u64 *)ctx;
> > +     sock = (void *)(unsigned long)args[0];
> > +     /*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
> > +     shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
> > +
> > +     cgrp = sock_cgroup_ptr(&sock->sk->sk_cgrp_data);
> > +     if (likely(cgrp))
> > +             ret = bpf_prog_run_array_cg(&cgrp->bpf,
> > +                                         shim_prog->aux->cgroup_atype,
> > +                                         ctx, bpf_prog_run, 0, NULL);
> > +     return ret;
> > +}
> > +
> > +unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
> > +                                       const struct bpf_insn *insn)
> > +{
> > +     const struct bpf_prog *shim_prog;
> > +     struct cgroup *cgrp;
> > +     int ret = 0;
> > +
> > +     /*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
> > +     shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
> > +
> > +     rcu_read_lock();
> > +     cgrp = task_dfl_cgroup(current);
> > +     if (likely(cgrp))
> > +             ret = bpf_prog_run_array_cg(&cgrp->bpf,
> > +                                         shim_prog->aux->cgroup_atype,
> > +                                         ctx, bpf_prog_run, 0, NULL);
> > +     rcu_read_unlock();
> > +     return ret;
> > +}
> > +
> > +#ifdef CONFIG_BPF_LSM
> > +static enum cgroup_bpf_attach_type
> > +bpf_cgroup_atype_find(enum bpf_attach_type attach_type, u32 attach_btf_id)
> > +{
> > +     if (attach_type != BPF_LSM_CGROUP)
> > +             return to_cgroup_bpf_attach_type(attach_type);
> > +     return CGROUP_LSM_START + bpf_lsm_hook_idx(attach_btf_id);
> > +}
> > +#else
> > +static enum cgroup_bpf_attach_type
> > +bpf_cgroup_atype_find(enum bpf_attach_type attach_type, u32 attach_btf_id)
> > +{
> > +     if (attach_type != BPF_LSM_CGROUP)
> > +             return to_cgroup_bpf_attach_type(attach_type);
> > +     return -EOPNOTSUPP;
> > +}
> > +#endif /* CONFIG_BPF_LSM */
> > +
> >  void cgroup_bpf_offline(struct cgroup *cgrp)
> >  {
> >       cgroup_get(cgrp);
> > @@ -163,10 +248,16 @@ static void cgroup_bpf_release(struct work_struct *work)
> >
> >               hlist_for_each_entry_safe(pl, pltmp, progs, node) {
> >                       hlist_del(&pl->node);
> > -                     if (pl->prog)
> > +                     if (pl->prog) {
> > +                             if (atype >= CGROUP_LSM_START && atype <= CGROUP_LSM_END)
> nit.  Check expected_attach_type == BPF_LSM_CGROUP like
> other places ?

Yeah, looks better, let's do this.

> > +                                     bpf_trampoline_unlink_cgroup_shim(pl->prog);
> >                               bpf_prog_put(pl->prog);
> > -                     if (pl->link)
> > +                     }
> > +                     if (pl->link) {
> > +                             if (atype >= CGROUP_LSM_START && atype <= CGROUP_LSM_END)
> Same here.
>
> > +                                     bpf_trampoline_unlink_cgroup_shim(pl->link->link.prog);
> >                               bpf_cgroup_link_auto_detach(pl->link);
> > +                     }
> >                       kfree(pl);
> >                       static_branch_dec(&cgroup_bpf_enabled_key[atype]);
> >               }
> > @@ -479,6 +570,8 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
> >       struct bpf_prog *old_prog = NULL;
> >       struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE] = {};
> >       struct bpf_cgroup_storage *new_storage[MAX_BPF_CGROUP_STORAGE_TYPE] = {};
> > +     struct bpf_prog *new_prog = prog ? : link->link.prog;
> > +     struct bpf_attach_target_info tgt_info = {};
> >       enum cgroup_bpf_attach_type atype;
> >       struct bpf_prog_list *pl;
> >       struct hlist_head *progs;
> > @@ -495,7 +588,20 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
> >               /* replace_prog implies BPF_F_REPLACE, and vice versa */
> >               return -EINVAL;
> >
> > -     atype = to_cgroup_bpf_attach_type(type);
> > +     if (type == BPF_LSM_CGROUP) {
> > +             if (replace_prog) {
> > +                     if (replace_prog->aux->attach_btf_id !=
> > +                         new_prog->aux->attach_btf_id)
> > +                             return -EINVAL;
> > +             }
> This is no longer needed if it does not reuse the
> replace_prog's cgroup_atype.
>
> If old_prog is not NULL, it must have the same attach_btf_id.
>
> This should still be true even bpf_cgroup_atype_find() return
> a new cgroup_atype.  In that case, 'pl = find_attach_entry();'
> must be NULL.
>
> WDYT?

As you mentioned in a separate email, let's remove it since it brings
more confusion instead of doing something useful :-)

> > +             err = bpf_check_attach_target(NULL, new_prog, NULL,
> > +                                           new_prog->aux->attach_btf_id,
> > +                                           &tgt_info);
> If the above attach_btf_id check in unnecessary,
> this can be done in bpf_trampoline_link_cgroup_shim() where
> it is the only place that tgt_info will be used.
> This should never fail also because this had been done
> once during the prog load time, so doing it later is fine.
>
> Then this whole "if (type == BPF_LSM_CGROUP)" case can be removed
> from __cgroup_bpf_attach().

+1, having less of those special 'if (type == BPF_LSM_CGROUP)' seems
inherently better, thanks!

> > +             if (err)
> > +                     return -EINVAL;
> > +     }
> > +
> > +     atype = bpf_cgroup_atype_find(type, new_prog->aux->attach_btf_id);
> >       if (atype < 0)
> >               return -EINVAL;
> >
> > @@ -549,9 +655,15 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
> >       bpf_cgroup_storages_assign(pl->storage, storage);
> >       cgrp->bpf.flags[atype] = saved_flags;
> >
> > +     if (type == BPF_LSM_CGROUP && !old_prog) {
> hmm... I think this "!old_prog" test should not be here.
>
> In allow_multi, old_prog can be NULL but it still needs
> to bump the shim_link's refcnt by calling
> bpf_trampoline_link_cgroup_shim().
>
> This is a bit tricky.  Does it make sense ?

Yeah, this is all, again, due to "smart" reusing of the existing shim.
Let's not do this and rely on proper shim prog refcnt; will drop
"!old_prog" from here and from the cleanup.

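Roughly what I have in mind for the attach path after dropping it
(untested sketch; assumes tgt_info also moves into
bpf_trampoline_link_cgroup_shim as suggested elsewhere in this thread):

	if (type == BPF_LSM_CGROUP) {
		/* always take a shim reference; the replace case drops
		 * old_prog's reference below
		 */
		err = bpf_trampoline_link_cgroup_shim(new_prog, atype);
		if (err)
			goto cleanup;
	}

	err = update_effective_progs(cgrp, atype);
	if (err)
		goto cleanup_trampoline;

	if (old_prog) {
		if (type == BPF_LSM_CGROUP)
			bpf_trampoline_unlink_cgroup_shim(old_prog);
		bpf_prog_put(old_prog);
	}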
> > +             err = bpf_trampoline_link_cgroup_shim(new_prog, &tgt_info, atype);
> > +             if (err)
> > +                     goto cleanup;
> > +     }
> > +
> >       err = update_effective_progs(cgrp, atype);
> >       if (err)
> > -             goto cleanup;
> > +             goto cleanup_trampoline;
> >
> >       if (old_prog)
> Then it needs a bpf_trampoline_unlink_cgroup_shim(old_prog) here.
>
> >               bpf_prog_put(old_prog);
> > @@ -560,6 +672,10 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
> >       bpf_cgroup_storages_link(new_storage, cgrp, type);
> >       return 0;
> >
> > +cleanup_trampoline:
> > +     if (type == BPF_LSM_CGROUP && !old_prog)
> The "!old_prog" test should also be removed.
>
> > +             bpf_trampoline_unlink_cgroup_shim(new_prog);
> > +
> >  cleanup:
> >       if (old_prog) {
> >               pl->prog = old_prog;
> > @@ -651,7 +767,13 @@ static int __cgroup_bpf_replace(struct cgroup *cgrp,
> >       struct hlist_head *progs;
> >       bool found = false;
> >
> > -     atype = to_cgroup_bpf_attach_type(link->type);
> > +     if (link->type == BPF_LSM_CGROUP) {
> > +             if (new_prog->aux->attach_btf_id !=
> > +                 link->link.prog->aux->attach_btf_id)
> > +                     return -EINVAL;
> This should be no longer needed also.
> It will return -ENOENT later.

Sounds good, let's rely on bpf_cgroup_atype_find to return the same
slot for the replacement to work.

> > +     }
> > +
> > +     atype = bpf_cgroup_atype_find(link->type, new_prog->aux->attach_btf_id);
> >       if (atype < 0)
> >               return -EINVAL;
> >
> > @@ -803,9 +925,15 @@ static int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
> >       struct bpf_prog *old_prog;
> >       struct bpf_prog_list *pl;
> >       struct hlist_head *progs;
> > +     u32 attach_btf_id = 0;
> >       u32 flags;
> >
> > -     atype = to_cgroup_bpf_attach_type(type);
> > +     if (prog)
> > +             attach_btf_id = prog->aux->attach_btf_id;
> > +     if (link)
> > +             attach_btf_id = link->link.prog->aux->attach_btf_id;
> > +
> > +     atype = bpf_cgroup_atype_find(type, attach_btf_id);
> >       if (atype < 0)
> >               return -EINVAL;
> >
> > @@ -832,6 +960,10 @@ static int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
> >               purge_effective_progs(cgrp, old_prog, link, atype);
> >       }
> >
> > +
> > +     if (type == BPF_LSM_CGROUP)
> > +             bpf_trampoline_unlink_cgroup_shim(old_prog ? : link->link.prog);
> For the '!old_prog' case (link case), do the
> bpf_trampoline_unlink_cgroup_shim() in bpf_cgroup_link_release().
>
> For the old_prog case (non-link case),
> the bpf_trampoline_unlink_cgroup_shim() can be done
> under the same 'if (old_prog)' check a few lines below.
>
> Then for both cases shim unlink will be closer to where bpf_prog_put()
> will be done and have similar pattern as in __cgroup_bpf_attach() and
> cgroup_bpf_release(), so easier to reason in the future:
> shim unlink and then prog put.
>
> [ ... ]
>
> > diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
> > index 5466e15be61f..45dfcece76e7 100644
> > --- a/kernel/bpf/trampoline.c
> > +++ b/kernel/bpf/trampoline.c
> > @@ -11,6 +11,8 @@
> >  #include <linux/rcupdate_wait.h>
> >  #include <linux/module.h>
> >  #include <linux/static_call.h>
> > +#include <linux/bpf_verifier.h>
> > +#include <linux/bpf_lsm.h>
> >
> >  /* dummy _ops. The verifier will operate on target program's ops. */
> >  const struct bpf_verifier_ops bpf_extension_verifier_ops = {
> > @@ -496,6 +498,169 @@ int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampolin
> >       return err;
> >  }
> >
> > +#if defined(CONFIG_BPF_JIT) && defined(CONFIG_BPF_SYSCALL)
> > +static void bpf_shim_tramp_link_release(struct bpf_link *link)
> > +{
> > +     struct bpf_shim_tramp_link *shim_link =
> > +             container_of(link, struct bpf_shim_tramp_link, link.link);
> > +
> > +     if (!shim_link->trampoline)
> > +             return;
> > +
> > +     WARN_ON_ONCE(bpf_trampoline_unlink_prog(&shim_link->link, shim_link->trampoline));
> > +     bpf_trampoline_put(shim_link->trampoline);
> > +}
> > +
> > +static void bpf_shim_tramp_link_dealloc(struct bpf_link *link)
> > +{
> > +     struct bpf_shim_tramp_link *shim_link =
> > +             container_of(link, struct bpf_shim_tramp_link, link.link);
> > +
> > +     kfree(shim_link);
> > +}
> > +
> > +static const struct bpf_link_ops bpf_shim_tramp_link_lops = {
> > +     .release = bpf_shim_tramp_link_release,
> > +     .dealloc = bpf_shim_tramp_link_dealloc,
> > +};
> > +
> > +static struct bpf_shim_tramp_link *cgroup_shim_alloc(const struct bpf_prog *prog,
> > +                                                  bpf_func_t bpf_func,
> > +                                                  int cgroup_atype)
> > +{
> > +     struct bpf_shim_tramp_link *shim_link = NULL;
> > +     struct bpf_prog *p;
> > +
> > +     shim_link = kzalloc(sizeof(*shim_link), GFP_USER);
> > +     if (!shim_link)
> > +             return NULL;
> > +
> > +     p = bpf_prog_alloc(1, 0);
> > +     if (!p) {
> > +             kfree(shim_link);
> > +             return NULL;
> > +     }
> > +
> > +     p->jited = false;
> > +     p->bpf_func = bpf_func;
> > +
> > +     p->aux->cgroup_atype = cgroup_atype;
> > +     p->aux->attach_func_proto = prog->aux->attach_func_proto;
> > +     p->aux->attach_btf_id = prog->aux->attach_btf_id;
> > +     p->aux->attach_btf = prog->aux->attach_btf;
> > +     btf_get(p->aux->attach_btf);
> > +     p->type = BPF_PROG_TYPE_LSM;
> > +     p->expected_attach_type = BPF_LSM_MAC;
> > +     bpf_prog_inc(p);
> > +     bpf_link_init(&shim_link->link.link, BPF_LINK_TYPE_UNSPEC,
> > +                   &bpf_shim_tramp_link_lops, p);
> > +
> > +     return shim_link;
> > +}
> > +
> > +static struct bpf_shim_tramp_link *cgroup_shim_find(struct bpf_trampoline *tr,
> > +                                                 bpf_func_t bpf_func)
> > +{
> > +     struct bpf_tramp_link *link;
> > +     int kind;
> > +
> > +     for (kind = 0; kind < BPF_TRAMP_MAX; kind++) {
> > +             hlist_for_each_entry(link, &tr->progs_hlist[kind], tramp_hlist) {
> > +                     struct bpf_prog *p = link->link.prog;
> > +
> > +                     if (p->bpf_func == bpf_func)
> > +                             return container_of(link, struct bpf_shim_tramp_link, link);
> > +             }
> > +     }
> > +
> > +     return NULL;
> > +}
> > +
> > +int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
> > +                                 struct bpf_attach_target_info *tgt_info,
> > +                                 int cgroup_atype)
> > +{
> > +     struct bpf_shim_tramp_link *shim_link = NULL;
> > +     struct bpf_trampoline *tr;
> > +     bpf_func_t bpf_func;
> > +     u64 key;
> > +     int err;
> > +
> > +     key = bpf_trampoline_compute_key(NULL, prog->aux->attach_btf,
> > +                                      prog->aux->attach_btf_id);
> Directly get tgt_info here instead of doing it
> in __cgroup_bpf_attach() and then passing in.
> This is the only place needed it.

Ack.

>         err = bpf_check_attach_target(NULL, prog, NULL,
>                                 prog->aux->attach_btf_id,
>                                 &tgt_info);
>         if (err)
>                 return err;
>
> > +
> > +     bpf_lsm_find_cgroup_shim(prog, &bpf_func);
> > +     tr = bpf_trampoline_get(key, tgt_info);
> > +     if (!tr)
> > +             return  -ENOMEM;
> > +
> > +     mutex_lock(&tr->mutex);
> > +
> > +     shim_link = cgroup_shim_find(tr, bpf_func);
> > +     if (shim_link) {
> > +             /* Reusing existing shim attached by the other program. */
> > +             atomic64_inc(&shim_link->link.link.refcnt);
> Use bpf_link_inc() instead to pair with bpf_link_put().

SG!

> > +             /* note, we're still holding tr refcnt from above */
> It has to do a bpf_trampoline_put(tr) after mutex_unlock(&tr->mutex).
> shim_link already holds one refcnt to tr.

Right, since we are not doing that reuse anymore; will add a deref here.




> > +
> > +             mutex_unlock(&tr->mutex);
> > +             return 0;
> > +     }
> > +
> > +     /* Allocate and install new shim. */
> > +
> > +     shim_link = cgroup_shim_alloc(prog, bpf_func, cgroup_atype);
> > +     if (!shim_link) {
> > +             err = -ENOMEM;
> > +             goto out;
> > +     }
> > +
> > +     err = __bpf_trampoline_link_prog(&shim_link->link, tr);
> > +     if (err)
> > +             goto out;
> > +
> > +     shim_link->trampoline = tr;
> > +     /* note, we're still holding tr refcnt from above */
> > +
> > +     mutex_unlock(&tr->mutex);
> > +
> > +     return 0;
> > +out:
> > +     mutex_unlock(&tr->mutex);
> > +
> > +     if (shim_link)
> > +             bpf_link_put(&shim_link->link.link);
> > +
> > +     bpf_trampoline_put(tr); /* bpf_trampoline_get above */
> Doing it here is because mutex_unlock(&tr->mutex) has
> to be done first?  A comment will be useful.
>
> How about passing tr to cgroup_shim_alloc(..., tr)
> which is initializing everything else in shim_link anyway.
> Then the 'if (!shim_link->trampoline)' in bpf_shim_tramp_link_release()
> can go away also.
> Like:
>
> static struct bpf_shim_tramp_link *cgroup_shim_alloc(const struct bpf_prog *prog,
>                                                      bpf_func_t bpf_func,
>                                                      struct bpf_trampoline *tr)
>
> {
>         /* ... */
>         shim_link->trampoline = tr;

I believe this part has to happen after __bpf_trampoline_link_prog;
otherwise bpf_shim_tramp_link_release might try to
unlink_prog/bpf_trampoline_put on the shim that wasn't fully linked?

>
>         return shim_link;
> }
>
> int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
>                                     int cgroup_atype)
> {
>
>         /* ... */
>
>         /* Allocate and install new shim. */
>
>         shim_link = cgroup_shim_alloc(prog, bpf_func, cgroup_atype, tr);
>         if (!shim_link) {
>                 err = -ENOMEM;
>                 goto error;
>         }
>
>         err = __bpf_trampoline_link_prog(&shim_link->link, tr);
>         if (err)
>                 goto error;
>
>         mutex_unlock(&tr->mutex);
>
>         return 0;
>
> error:
>         mutex_unlock(&tr->mutex);
>         if (shim_link)
>                 bpf_link_put(&shim_link->link.link);
>         else
>                 bpf_trampoline_put(tr);
>
>         return err;
> }
>
> [ ... ]
>
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index aedac2ac02b9..caa5740b39b3 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -7264,6 +7264,18 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
> >                               reg_type_str(env, regs[BPF_REG_1].type));
> >                       return -EACCES;
> >               }
> > +             break;
> > +     case BPF_FUNC_set_retval:
> > +             if (env->prog->expected_attach_type == BPF_LSM_CGROUP) {
> > +                     if (!env->prog->aux->attach_func_proto->type) {
> > +                             /* Make sure programs that attach to void
> > +                              * hooks don't try to modify return value.
> > +                              */
> > +                             err = -EINVAL;
> nit. Directly 'return -EINVAL;' after verbose() logging.

SG!

> > +                             verbose(env, "BPF_LSM_CGROUP that attach to void LSM hooks can't modify return value!\n");
> > +                     }
> > +             }
> > +             break;
> >       }
> >
> >       if (err)
> > @@ -10474,6 +10486,23 @@ static int check_return_code(struct bpf_verifier_env *env)
> >       case BPF_PROG_TYPE_SK_LOOKUP:
> >               range = tnum_range(SK_DROP, SK_PASS);
> >               break;
> > +
> > +     case BPF_PROG_TYPE_LSM:
> > +             if (env->prog->expected_attach_type == BPF_LSM_CGROUP) {
> > +                     if (!env->prog->aux->attach_func_proto->type) {
> nit. Check 'if ( ... != BPF_LSM_CGROUP) return 0;' first to remove
> one level of indentation.

SG!

> > +                             /* Make sure programs that attach to void
> > +                              * hooks don't try to modify return value.
> > +                              */
> > +                             range = tnum_range(1, 1);
> > +                     }
> > +             } else {
> > +                     /* regular BPF_PROG_TYPE_LSM programs can return
> > +                      * any value.
> > +                      */
> > +                     return 0;
> > +             }
> > +             break;
> > +
> >       case BPF_PROG_TYPE_EXT:
> >               /* freplace program can return anything as its return value
> >                * depends on the to-be-replaced kernel func or bpf program.
> > @@ -10490,6 +10519,8 @@ static int check_return_code(struct bpf_verifier_env *env)
> >
> >       if (!tnum_in(range, reg->var_off)) {
> >               verbose_invalid_scalar(env, reg, &range, "program exit", "R0");
> > +             if (env->prog->expected_attach_type == BPF_LSM_CGROUP)
> > +                     verbose(env, "BPF_LSM_CGROUP that attach to void LSM hooks can't modify return value!\n");
> This is not accurate to verbose on void-return only.
> For int-return lsm hook, the BPF_LSM_CGROUP prog returning
> neither 0 nor 1 will also trigger this range check failure.

verbose_invalid_scalar will handle the case for int-returning ones?

Maybe change that new verbose to "Note, BPF_LSM_CGROUP that attach to
void LSM hooks can't modify return value!" ?


> >               return -EINVAL;
> >       }
> >
> > @@ -14713,6 +14744,7 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
> >               fallthrough;
> >       case BPF_MODIFY_RETURN:
> >       case BPF_LSM_MAC:
> > +     case BPF_LSM_CGROUP:
> >       case BPF_TRACE_FENTRY:
> >       case BPF_TRACE_FEXIT:
> >               if (!btf_type_is_func(t)) {

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH bpf-next v8 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-06-04  6:59   ` Martin KaFai Lau
  2022-06-04  8:05     ` Martin KaFai Lau
@ 2022-06-06 22:46     ` Stanislav Fomichev
  2022-06-07 20:44       ` Martin KaFai Lau
  1 sibling, 1 reply; 35+ messages in thread
From: Stanislav Fomichev @ 2022-06-06 22:46 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: netdev, bpf, ast, daniel, andrii

On Fri, Jun 3, 2022 at 11:59 PM Martin KaFai Lau <kafai@fb.com> wrote:
>
> On Wed, Jun 01, 2022 at 12:02:12PM -0700, Stanislav Fomichev wrote:
> > diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> > index a27a6a7bd852..cb3338ef01e0 100644
> > --- a/kernel/bpf/cgroup.c
> > +++ b/kernel/bpf/cgroup.c
> > @@ -1035,6 +1035,7 @@ static int cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
> >  static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
> >                             union bpf_attr __user *uattr)
> >  {
> > +     __u32 __user *prog_attach_flags = u64_to_user_ptr(attr->query.prog_attach_flags);
> >       __u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids);
> >       enum bpf_attach_type type = attr->query.attach_type;
> >       enum cgroup_bpf_attach_type atype;
> > @@ -1042,50 +1043,92 @@ static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
> >       struct hlist_head *progs;
> >       struct bpf_prog *prog;
> >       int cnt, ret = 0, i;
> > +     int total_cnt = 0;
> >       u32 flags;
> >
> > -     atype = to_cgroup_bpf_attach_type(type);
> > -     if (atype < 0)
> > -             return -EINVAL;
> > +     enum cgroup_bpf_attach_type from_atype, to_atype;
> >
> > -     progs = &cgrp->bpf.progs[atype];
> > -     flags = cgrp->bpf.flags[atype];
> > +     if (type == BPF_LSM_CGROUP) {
> > +             from_atype = CGROUP_LSM_START;
> > +             to_atype = CGROUP_LSM_END;
> Enforce prog_attach_flags for BPF_LSM_CGROUP:
>
>                 if (total_cnt && !prog_attach_flags)
>                         return -EINVAL;

All the comments make sense, will apply. The only thing that I'll
probably keep as is is the copy_to_user(flags) part. We can't move it
above because it hasn't been properly initialized there.

What I think I'll do is:

if (atype != to_atype) /* exported via prog_attach_flags */
  flags = 0;
copy_to_user(.., flags,..);

That seems to better describe why we're not doing it?


> > +     } else {
> > +             from_atype = to_cgroup_bpf_attach_type(type);
> > +             if (from_atype < 0)
> > +                     return -EINVAL;
> > +             to_atype = from_atype;
> > +     }
> >
> > -     effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > -                                           lockdep_is_held(&cgroup_mutex));
> > +     for (atype = from_atype; atype <= to_atype; atype++) {
> > +             progs = &cgrp->bpf.progs[atype];
> > +             flags = cgrp->bpf.flags[atype];
> >
> > -     if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > -             cnt = bpf_prog_array_length(effective);
> > -     else
> > -             cnt = prog_list_length(progs);
> > +             effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > +                                                   lockdep_is_held(&cgroup_mutex));
> nit. This can be done under the BPF_F_QUERY_EFFECTIVE case below.
>
> >
> > -     if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
> > -             return -EFAULT;
> > -     if (copy_to_user(&uattr->query.prog_cnt, &cnt, sizeof(cnt)))
> > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > +                     total_cnt += bpf_prog_array_length(effective);
> > +             else
> > +                     total_cnt += prog_list_length(progs);
> > +     }
> > +
> > +     if (type != BPF_LSM_CGROUP)
> > +             if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
> > +                     return -EFAULT;
> nit. Move this copy_to_user(&uattr->query.attach_flags,...) to the
> 'if (type == BPF_LSM_CGROUP) { from_atype = .... } else { ... }' above.
> That will save a BPF_LSM_CGROUP test.
>
> I think the '== BPF_LSM_CGROUP' case needs to copy a 0 to user also.
>
> > +     if (copy_to_user(&uattr->query.prog_cnt, &total_cnt, sizeof(total_cnt)))
> >               return -EFAULT;
> > -     if (attr->query.prog_cnt == 0 || !prog_ids || !cnt)
> > +     if (attr->query.prog_cnt == 0 || !prog_ids || !total_cnt)
> >               /* return early if user requested only program count + flags */
> >               return 0;
> > -     if (attr->query.prog_cnt < cnt) {
> > -             cnt = attr->query.prog_cnt;
> > +
> > +     if (attr->query.prog_cnt < total_cnt) {
> > +             total_cnt = attr->query.prog_cnt;
> >               ret = -ENOSPC;
> >       }
> >
> > -     if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
> > -             return bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> > -     } else {
> > -             struct bpf_prog_list *pl;
> > -             u32 id;
> > +     for (atype = from_atype; atype <= to_atype; atype++) {
> > +             if (total_cnt <= 0)
> nit. total_cnt cannot be -ve ?
> !total_cnt instead ?
> and may be do it in the for-loop test.
>
> > +                     break;
> >
> > -             i = 0;
> > -             hlist_for_each_entry(pl, progs, node) {
> > -                     prog = prog_list_prog(pl);
> > -                     id = prog->aux->id;
> > -                     if (copy_to_user(prog_ids + i, &id, sizeof(id)))
> > -                             return -EFAULT;
> > -                     if (++i == cnt)
> > -                             break;
> > +             progs = &cgrp->bpf.progs[atype];
> > +             flags = cgrp->bpf.flags[atype];
> > +
> > +             effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > +                                                   lockdep_is_held(&cgroup_mutex));
> nit. This can be done under the BPF_F_QUERY_EFFECTIVE case below.
>
> > +
> > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > +                     cnt = bpf_prog_array_length(effective);
> > +             else
> > +                     cnt = prog_list_length(progs);
> > +
> > +             if (cnt >= total_cnt)
> > +                     cnt = total_cnt;
> nit. This seems to be the only reason that
> the "if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)"
> need to be broken up into two halves.  One above and one below.
> It does not save much code.  How about repeating this one line
> 'cnt = min_t(int, cnt, total_cnt);' instead ?
>
> > +
> > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
> > +                     ret = bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> > +             } else {
> > +                     struct bpf_prog_list *pl;
> > +                     u32 id;
> > +
> > +                     i = 0;
> > +                     hlist_for_each_entry(pl, progs, node) {
> > +                             prog = prog_list_prog(pl);
> > +                             id = prog->aux->id;
> > +                             if (copy_to_user(prog_ids + i, &id, sizeof(id)))
> > +                                     return -EFAULT;
> > +                             if (++i == cnt)
> > +                                     break;
> > +                     }
> >               }
> > +
> > +             if (prog_attach_flags)
> > +                     for (i = 0; i < cnt; i++)
> > +                             if (copy_to_user(prog_attach_flags + i, &flags, sizeof(flags)))
> > +                                     return -EFAULT;
> > +
> > +             prog_ids += cnt;
> > +             total_cnt -= cnt;
> > +             if (prog_attach_flags)
> > +                     prog_attach_flags += cnt;
> nit. Merge this into the above "if (prog_attach_flags)" case.
>
> >       }
> >       return ret;
> >  }
> > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > index a237be4f8bb3..27492d44133f 100644
> > --- a/kernel/bpf/syscall.c
> > +++ b/kernel/bpf/syscall.c
> > @@ -3520,7 +3520,7 @@ static int bpf_prog_detach(const union bpf_attr *attr)
> >       }
> >  }
> >
> > -#define BPF_PROG_QUERY_LAST_FIELD query.prog_cnt
> > +#define BPF_PROG_QUERY_LAST_FIELD query.prog_attach_flags
> >
> >  static int bpf_prog_query(const union bpf_attr *attr,
> >                         union bpf_attr __user *uattr)
> > @@ -3556,6 +3556,7 @@ static int bpf_prog_query(const union bpf_attr *attr,
> >       case BPF_CGROUP_SYSCTL:
> >       case BPF_CGROUP_GETSOCKOPT:
> >       case BPF_CGROUP_SETSOCKOPT:
> > +     case BPF_LSM_CGROUP:
> >               return cgroup_bpf_prog_query(attr, uattr);
> >       case BPF_LIRC_MODE2:
> >               return lirc_prog_query(attr, uattr);
> > @@ -4066,6 +4067,9 @@ static int bpf_prog_get_info_by_fd(struct file *file,
> >
> >       if (prog->aux->btf)
> >               info.btf_id = btf_obj_id(prog->aux->btf);
> > +     info.attach_btf_id = prog->aux->attach_btf_id;
> > +     if (prog->aux->attach_btf)
> > +             info.attach_btf_obj_id = btf_obj_id(prog->aux->attach_btf);
> Need this also:
>
>         else if (prog->aux->dst_prog)
>                 info.attach_btf_obj_id = btf_obj_id(prog->aux->dst_prog->aux->attach_btf);
>
> >
> >       ulen = info.nr_func_info;
> >       info.nr_func_info = prog->aux->func_info_cnt;
> > --
> > 2.36.1.255.ge46751e96f-goog
> >

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH bpf-next v8 06/11] bpf: allow writing to a subset of sock fields from lsm progtype
  2022-06-04  7:18   ` Martin KaFai Lau
@ 2022-06-06 22:46     ` Stanislav Fomichev
  2022-06-07 22:17       ` Martin KaFai Lau
  0 siblings, 1 reply; 35+ messages in thread
From: Stanislav Fomichev @ 2022-06-06 22:46 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: netdev, bpf, ast, daniel, andrii

On Sat, Jun 4, 2022 at 12:18 AM Martin KaFai Lau <kafai@fb.com> wrote:
>
> On Wed, Jun 01, 2022 at 12:02:13PM -0700, Stanislav Fomichev wrote:
> > For now, allow only the obvious ones, like sk_priority and sk_mark.
> >
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >  kernel/bpf/bpf_lsm.c  | 58 +++++++++++++++++++++++++++++++++++++++++++
> >  kernel/bpf/verifier.c |  3 ++-
> >  2 files changed, 60 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
> > index 83aa431dd52e..feba8e96f58d 100644
> > --- a/kernel/bpf/bpf_lsm.c
> > +++ b/kernel/bpf/bpf_lsm.c
> > @@ -303,7 +303,65 @@ bool bpf_lsm_is_sleepable_hook(u32 btf_id)
> >  const struct bpf_prog_ops lsm_prog_ops = {
> >  };
> >
> > +static int lsm_btf_struct_access(struct bpf_verifier_log *log,
> > +                                     const struct btf *btf,
> > +                                     const struct btf_type *t, int off,
> > +                                     int size, enum bpf_access_type atype,
> > +                                     u32 *next_btf_id,
> > +                                     enum bpf_type_flag *flag)
> > +{
> > +     const struct btf_type *sock_type;
> > +     struct btf *btf_vmlinux;
> > +     s32 type_id;
> > +     size_t end;
> > +
> > +     if (atype == BPF_READ)
> > +             return btf_struct_access(log, btf, t, off, size, atype, next_btf_id,
> > +                                      flag);
> > +
> > +     btf_vmlinux = bpf_get_btf_vmlinux();
> > +     if (!btf_vmlinux) {
> > +             bpf_log(log, "no vmlinux btf\n");
> > +             return -EOPNOTSUPP;
> > +     }
> > +
> > +     type_id = btf_find_by_name_kind(btf_vmlinux, "sock", BTF_KIND_STRUCT);
> > +     if (type_id < 0) {
> > +             bpf_log(log, "'struct sock' not found in vmlinux btf\n");
> > +             return -EINVAL;
> > +     }
> > +
> > +     sock_type = btf_type_by_id(btf_vmlinux, type_id);
> > +
> > +     if (t != sock_type) {
> > +             bpf_log(log, "only 'struct sock' writes are supported\n");
> > +             return -EACCES;
> > +     }
> > +
> > +     switch (off) {
> > +     case bpf_ctx_range(struct sock, sk_priority):
> This looks wrong.  It should not allow to write at
> any bytes of the '__u32 sk_priority'.

SG, I'll change to offsetof() and will enforce u32 size.

> > +             end = offsetofend(struct sock, sk_priority);
> > +             break;
> > +     case bpf_ctx_range(struct sock, sk_mark):
> Same here.
>
> Just came to my mind,
> if the current need is only sk_priority and sk_mark,
> do you think allowing bpf_setsockopt will be more useful instead ?

For my use-case I only need sk_priority, but I was thinking that we
can later extend that list as needed. But you suggestion to use
bpf_setsockopt sounds good, let me try to use that. That might be
better than poking directly into the fields.
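Something like this is what I'd expect on the BPF side if we go the
bpf_setsockopt route (untested sketch; assumes the helper gets exposed
to lsm_cgroup programs and accepts the sk pointer directly):

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

#define SOL_SOCKET	1	/* values from asm-generic/socket.h */
#define SO_PRIORITY	12

char LICENSE[] SEC("license") = "GPL";

SEC("lsm_cgroup/socket_post_create")
int BPF_PROG(set_prio, struct socket *sock, int family, int type,
	     int protocol, int kern)
{
	int prio = 4;

	/* hypothetical: only valid if bpf_setsockopt is allowed here */
	bpf_setsockopt(sock->sk, SOL_SOCKET, SO_PRIORITY,
		       &prio, sizeof(prio));
	return 1; /* allow */
}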

> It currently has SO_MARK, SO_PRIORITY, and other options.
> Also, changing SO_MARK requires to clear the sk->sk_dst_cache.
> In general, is it safe to do bpf_setsockopt in all bpf_lsm hooks ?

It seems that we might need to more strictly control it (regardless of
helper or direct field access). Not all lsm hooks lock the sk argument,
so maybe we start with some allowlist of attach_btf_ids that can do
bpf_setsockopt? (I'll add existing hooks that work on new/unreferenced
or locked sockets to the list)
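As a strawman, the allowlist could be a BTF id set keyed on the
bpf_lsm_* stubs, something like (hook names below are illustrative,
the real list needs auditing):

BTF_SET_START(bpf_lsm_locked_sock_hooks)
BTF_ID(func, bpf_lsm_socket_sock_rcv_skb)
BTF_ID(func, bpf_lsm_sock_graft)
BTF_SET_END(bpf_lsm_locked_sock_hooks)

static bool bpf_lsm_hook_locks_sock(u32 attach_btf_id)
{
	return btf_id_set_contains(&bpf_lsm_locked_sock_hooks,
				   attach_btf_id);
}

and then bpf_lsm_func_proto() would only hand out the bpf_setsockopt
proto when bpf_lsm_hook_locks_sock(prog->aux->attach_btf_id) is true.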


> > +             end = offsetofend(struct sock, sk_mark);
> > +             break;
> > +     default:
> > +             bpf_log(log, "no write support to 'struct sock' at off %d\n", off);
> > +             return -EACCES;
> > +     }
> > +
> > +     if (off + size > end) {
> > +             bpf_log(log,
> > +                     "write access at off %d with size %d beyond the member of 'struct sock' ended at %zu\n",
> > +                     off, size, end);
> > +             return -EACCES;
> > +     }
> > +
> > +     return NOT_INIT;
> > +}
> > +
> >  const struct bpf_verifier_ops lsm_verifier_ops = {
> >       .get_func_proto = bpf_lsm_func_proto,
> >       .is_valid_access = btf_ctx_access,
> > +     .btf_struct_access = lsm_btf_struct_access,
> >  };
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index caa5740b39b3..c54e171d9c23 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -13413,7 +13413,8 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
> >                               insn->code = BPF_LDX | BPF_PROBE_MEM |
> >                                       BPF_SIZE((insn)->code);
> >                               env->prog->aux->num_exentries++;
> > -                     } else if (resolve_prog_type(env->prog) != BPF_PROG_TYPE_STRUCT_OPS) {
> > +                     } else if (resolve_prog_type(env->prog) != BPF_PROG_TYPE_STRUCT_OPS &&
> > +                                resolve_prog_type(env->prog) != BPF_PROG_TYPE_LSM) {
> >                               verbose(env, "Writes through BTF pointers are not allowed\n");
> >                               return -EINVAL;
> >                       }
> > --
> > 2.36.1.255.ge46751e96f-goog
> >

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH bpf-next v8 04/11] bpf: minimize number of allocated lsm slots per program
  2022-06-04  6:35   ` Martin KaFai Lau
@ 2022-06-06 22:46     ` Stanislav Fomichev
  2022-06-07 19:57       ` Martin KaFai Lau
  0 siblings, 1 reply; 35+ messages in thread
From: Stanislav Fomichev @ 2022-06-06 22:46 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: netdev, bpf, ast, daniel, andrii

On Fri, Jun 3, 2022 at 11:36 PM Martin KaFai Lau <kafai@fb.com> wrote:
>
> On Wed, Jun 01, 2022 at 12:02:11PM -0700, Stanislav Fomichev wrote:
> > diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
> > index 0f72020bfdcf..83aa431dd52e 100644
> > --- a/kernel/bpf/bpf_lsm.c
> > +++ b/kernel/bpf/bpf_lsm.c
> > @@ -69,11 +69,6 @@ void bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
> >               *bpf_func = __cgroup_bpf_run_lsm_current;
> >  }
> >
> > -int bpf_lsm_hook_idx(u32 btf_id)
> > -{
> > -     return btf_id_set_index(&bpf_lsm_hooks, btf_id);
> The implementation of btf_id_set_index() added in patch 3
> should be removed also.

Thank you for the review!

At some point I was using btf_id_set_index inside of
btf_id_set_contains but since then reverted this part, will fix!

> > -}
> > -
> >  int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
> >                       const struct bpf_prog *prog)
> >  {
> > diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> > index 66b644a76a69..a27a6a7bd852 100644
> > --- a/kernel/bpf/cgroup.c
> > +++ b/kernel/bpf/cgroup.c
> > @@ -129,12 +129,46 @@ unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
> >  }
> >
> >  #ifdef CONFIG_BPF_LSM
> > +u32 cgroup_lsm_atype_btf_id[CGROUP_LSM_NUM];
> static

Thx!

> > +
> >  static enum cgroup_bpf_attach_type
> >  bpf_cgroup_atype_find(enum bpf_attach_type attach_type, u32 attach_btf_id)
> >  {
> > +     int i;
> > +
> > +     lockdep_assert_held(&cgroup_mutex);
> > +
> >       if (attach_type != BPF_LSM_CGROUP)
> >               return to_cgroup_bpf_attach_type(attach_type);
> > -     return CGROUP_LSM_START + bpf_lsm_hook_idx(attach_btf_id);
> > +
> > +     for (i = 0; i < ARRAY_SIZE(cgroup_lsm_atype_btf_id); i++)
> > +             if (cgroup_lsm_atype_btf_id[i] == attach_btf_id)
> > +                     return CGROUP_LSM_START + i;
> > +
> > +     for (i = 0; i < ARRAY_SIZE(cgroup_lsm_atype_btf_id); i++)
> > +             if (cgroup_lsm_atype_btf_id[i] == 0)
> > +                     return CGROUP_LSM_START + i;
> > +
> > +     return -E2BIG;
> > +
> > +}
> > +
> > +static void bpf_cgroup_atype_alloc(u32 attach_btf_id, int cgroup_atype)
> > +{
> > +     int i = cgroup_atype - CGROUP_LSM_START;
> > +
> > +     lockdep_assert_held(&cgroup_mutex);
> > +
> > +     cgroup_lsm_atype_btf_id[i] = attach_btf_id;
> > +}
> > +
> > +void bpf_cgroup_atype_free(int cgroup_atype)
> > +{
> > +     int i = cgroup_atype - CGROUP_LSM_START;
> > +
> > +     mutex_lock(&cgroup_mutex);
> > +     cgroup_lsm_atype_btf_id[i] = 0;
> I think holding cgroup_mutex in the __cgroup_bpf_attach() and
> bpf_cgroup_atype_free() is not enough.
>
> Consider a case that __cgroup_bpf_attach() runs first and bpf_trampoline_link_cgroup_shim()
> cannot find the shim_link because it is unlinked and its shim_prog
> is currently still under the bpf_prog_free_deferred's deadrow.
> Then bpf_prog_free_deferred() got run and do the bpf_cgroup_atype_free().
>
> A refcnt is still needed.  It is better to put them together in a
> struct instead of having two arrays, like
>
> struct cgroup_lsm_atype {
>        u32 attach_btf_id;
>        u32 refcnt;
> };

Ugh, that's true, very good point, will add.
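Something along these lines then (sketch; naming/locking may still
change):

struct cgroup_lsm_atype {
	u32 attach_btf_id;
	int refcnt;
};

static struct cgroup_lsm_atype cgroup_lsm_atype[CGROUP_LSM_NUM];

static void bpf_cgroup_atype_get(u32 attach_btf_id, int cgroup_atype)
{
	int i = cgroup_atype - CGROUP_LSM_START;

	lockdep_assert_held(&cgroup_mutex);

	cgroup_lsm_atype[i].attach_btf_id = attach_btf_id;
	cgroup_lsm_atype[i].refcnt++;
}

void bpf_cgroup_atype_put(int cgroup_atype)
{
	int i = cgroup_atype - CGROUP_LSM_START;

	mutex_lock(&cgroup_mutex);
	if (--cgroup_lsm_atype[i].refcnt <= 0)
		cgroup_lsm_atype[i].attach_btf_id = 0;
	mutex_unlock(&cgroup_mutex);
}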

> > +     mutex_unlock(&cgroup_mutex);
> >  }
> >  #else
> >  static enum cgroup_bpf_attach_type
> > @@ -144,6 +178,14 @@ bpf_cgroup_atype_find(enum bpf_attach_type attach_type, u32 attach_btf_id)
> >               return to_cgroup_bpf_attach_type(attach_type);
> >       return -EOPNOTSUPP;
> >  }
> > +
> > +static void bpf_cgroup_atype_alloc(u32 attach_btf_id, int cgroup_atype)
> > +{
> > +}
> > +
> > +void bpf_cgroup_atype_free(int cgroup_atype)
> > +{
> > +}
> >  #endif /* CONFIG_BPF_LSM */
> >
> >  void cgroup_bpf_offline(struct cgroup *cgrp)
> > @@ -659,6 +701,7 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
> >               err = bpf_trampoline_link_cgroup_shim(new_prog, &tgt_info, atype);
> >               if (err)
> >                       goto cleanup;
> > +             bpf_cgroup_atype_alloc(new_prog->aux->attach_btf_id, atype);
> This atype alloc (or refcnt inc) should be done in
> cgroup_shim_alloc() where the shim_prog is the one holding
> the refcnt of the atype.  If the above "!old_prog" needs to be
> removed as the earlier comment in patch 3, bumping the atype refcnt
> here will be wrong.
>
> >       }

Agreed. The only thing here that might be a bit unclear is the need to
hold cgroup_mutex. I'll keep lockdep_assert_held to make it more
apparent.

> >       err = update_effective_progs(cgrp, atype);
> > diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> > index 091ee210842f..224bb4d4fe4e 100644
> > --- a/kernel/bpf/core.c
> > +++ b/kernel/bpf/core.c
> > @@ -107,6 +107,9 @@ struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flag
> >       fp->aux->prog = fp;
> >       fp->jit_requested = ebpf_jit_enabled();
> >       fp->blinding_requested = bpf_jit_blinding_enabled(fp);
> > +#ifdef CONFIG_BPF_LSM
> I don't think this is needed.

enum cgroup_bpf_attach_type is under '#ifdef CONFIG_CGROUP_BPF', so it
fails in some configs without cgroup/bpf. I was trying to move that
enum out of the ifdef but then decided that it's easier to fix here.
Should I instead try to move the ifdef in bpf-cgroup-defs.h around?



> > +     aux->cgroup_atype = CGROUP_BPF_ATTACH_TYPE_INVALID;
> > +#endif
> >
> >       INIT_LIST_HEAD_RCU(&fp->aux->ksym.lnode);
> >       mutex_init(&fp->aux->used_maps_mutex);
> > @@ -2558,6 +2561,10 @@ static void bpf_prog_free_deferred(struct work_struct *work)
> >       aux = container_of(work, struct bpf_prog_aux, work);
> >  #ifdef CONFIG_BPF_SYSCALL
> >       bpf_free_kfunc_btf_tab(aux->kfunc_btf_tab);
> > +#endif
> > +#ifdef CONFIG_BPF_LSM
> Same here.
>
> > +     if (aux->cgroup_atype != CGROUP_BPF_ATTACH_TYPE_INVALID)
> > +             bpf_cgroup_atype_free(aux->cgroup_atype);
> >  #endif
> >       bpf_free_used_maps(aux);
> >       bpf_free_used_btfs(aux);
> > --
> > 2.36.1.255.ge46751e96f-goog
> >

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH bpf-next v8 03/11] bpf: per-cgroup lsm flavor
  2022-06-06 22:46     ` Stanislav Fomichev
@ 2022-06-07 19:24       ` Martin KaFai Lau
  0 siblings, 0 replies; 35+ messages in thread
From: Martin KaFai Lau @ 2022-06-07 19:24 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii

On Mon, Jun 06, 2022 at 03:46:21PM -0700, Stanislav Fomichev wrote:
> > > +int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
> > > +                                 struct bpf_attach_target_info *tgt_info,
> > > +                                 int cgroup_atype)
> > > +{
> > > +     struct bpf_shim_tramp_link *shim_link = NULL;
> > > +     struct bpf_trampoline *tr;
> > > +     bpf_func_t bpf_func;
> > > +     u64 key;
> > > +     int err;
> > > +
> > > +     key = bpf_trampoline_compute_key(NULL, prog->aux->attach_btf,
> > > +                                      prog->aux->attach_btf_id);
> > Directly get tgt_info here instead of doing it
> > in __cgroup_bpf_attach() and then passing in.
> > This is the only place needed it.
> 
> Ack.
> 
> >         err = bpf_check_attach_target(NULL, prog, NULL,
> >                                 prog->aux->attach_btf_id,
> >                                 &tgt_info);
> >         if (err)
> >                 return err;
> >
> > > +
> > > +     bpf_lsm_find_cgroup_shim(prog, &bpf_func);
> > > +     tr = bpf_trampoline_get(key, tgt_info);
> > > +     if (!tr)
> > > +             return  -ENOMEM;
> > > +
> > > +     mutex_lock(&tr->mutex);
> > > +
> > > +     shim_link = cgroup_shim_find(tr, bpf_func);
> > > +     if (shim_link) {
> > > +             /* Reusing existing shim attached by the other program. */
> > > +             atomic64_inc(&shim_link->link.link.refcnt);
> > Use bpf_link_inc() instead to pair with bpf_link_put().
> 
> SG!
> 
> > > +             /* note, we're still holding tr refcnt from above */
> > It has to do a bpf_trampoline_put(tr) after mutex_unlock(&tr->mutex).
> > shim_link already holds one refcnt to tr.
> 
> Right, since we are not doing that reuse anymore; will add a deref here.
> 
> 
> 
> 
> > > +
> > > +             mutex_unlock(&tr->mutex);
> > > +             return 0;
> > > +     }
> > > +
> > > +     /* Allocate and install new shim. */
> > > +
> > > +     shim_link = cgroup_shim_alloc(prog, bpf_func, cgroup_atype);
> > > +     if (!shim_link) {
> > > +             err = -ENOMEM;
> > > +             goto out;
> > > +     }
> > > +
> > > +     err = __bpf_trampoline_link_prog(&shim_link->link, tr);
> > > +     if (err)
> > > +             goto out;
> > > +
> > > +     shim_link->trampoline = tr;
> > > +     /* note, we're still holding tr refcnt from above */
> > > +
> > > +     mutex_unlock(&tr->mutex);
> > > +
> > > +     return 0;
> > > +out:
> > > +     mutex_unlock(&tr->mutex);
> > > +
> > > +     if (shim_link)
> > > +             bpf_link_put(&shim_link->link.link);
> > > +
> > > +     bpf_trampoline_put(tr); /* bpf_trampoline_get above */
> > Doing it here is because mutex_unlock(&tr->mutex) has
> > to be done first?  A comment will be useful.
> >
> > How about passing tr to cgroup_shim_alloc(..., tr)
> > which is initializing everything else in shim_link anyway.
> > Then the 'if (!shim_link->trampoline)' in bpf_shim_tramp_link_release()
> > can go away also.
> > Like:
> >
> > static struct bpf_shim_tramp_link *cgroup_shim_alloc(const struct bpf_prog *prog,
> >                                                      bpf_func_t bpf_func,
> >                                                      struct bpf_trampoline *tr)
> >
> > {
> >         /* ... */
> >         shim_link->trampoline = tr;
> 
> I believe this part has to happen after __bpf_trampoline_link_prog;
> otherwise bpf_shim_tramp_link_release might try to
> unlink_prog/bpf_trampoline_put on the shim that wasn't fully linked?
Ah.  You are correct.  missed that.

Yeah, I don't see a better way out unless a separate shim_link cleanup
func is used during the error case.  That may be overkill.
The current approach in the patch is better.

> > > @@ -10474,6 +10486,23 @@ static int check_return_code(struct bpf_verifier_env *env)
> > >       case BPF_PROG_TYPE_SK_LOOKUP:
> > >               range = tnum_range(SK_DROP, SK_PASS);
> > >               break;
> > > +
> > > +     case BPF_PROG_TYPE_LSM:
> > > +             if (env->prog->expected_attach_type == BPF_LSM_CGROUP) {
> > > +                     if (!env->prog->aux->attach_func_proto->type) {
> > nit. Check 'if ( ... != BPF_LSM_CGROUP) return 0;' first to remove
> > one level of indentation.
> 
> SG!
> 
> > > +                             /* Make sure programs that attach to void
> > > +                              * hooks don't try to modify return value.
> > > +                              */
> > > +                             range = tnum_range(1, 1);
> > > +                     }
> > > +             } else {
> > > +                     /* regular BPF_PROG_TYPE_LSM programs can return
> > > +                      * any value.
> > > +                      */
> > > +                     return 0;
> > > +             }
> > > +             break;
> > > +
> > >       case BPF_PROG_TYPE_EXT:
> > >               /* freplace program can return anything as its return value
> > >                * depends on the to-be-replaced kernel func or bpf program.
> > > @@ -10490,6 +10519,8 @@ static int check_return_code(struct bpf_verifier_env *env)
> > >
> > >       if (!tnum_in(range, reg->var_off)) {
> > >               verbose_invalid_scalar(env, reg, &range, "program exit", "R0");
> > > +             if (env->prog->expected_attach_type == BPF_LSM_CGROUP)
> > > +                     verbose(env, "BPF_LSM_CGROUP that attach to void LSM hooks can't modify return value!\n");
> > This is not accurate to verbose on void-return only.
> > For int-return lsm hook, the BPF_LSM_CGROUP prog returning
> > neither 0 nor 1 will also trigger this range check failure.
> 
> verbose_invalid_scalar will handle the case for int-returning ones?
> 
> Maybe change that new verbose to "Note, BPF_LSM_CGROUP that attach to
> void LSM hooks can't modify return value!" ?
I was thinking only verbose_invalid_scalar is good enough.

Having the new 'void LSM hooks' message may be confusing for lsm hooks
that have an int ret type when the bpf prog returns -1.
If keeping this verbose, I think adding a
'!attach_func_proto->type' check would be useful before
printing.  No strong opinion here.
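i.e. roughly (sketch):

	if (!tnum_in(range, reg->var_off)) {
		verbose_invalid_scalar(env, reg, &range, "program exit", "R0");
		if (env->prog->expected_attach_type == BPF_LSM_CGROUP &&
		    !env->prog->aux->attach_func_proto->type)
			verbose(env, "Note, BPF_LSM_CGROUP that attach to void LSM hooks can't modify return value!\n");
		return -EINVAL;
	}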

> 
> 
> > >               return -EINVAL;
> > >       }
> > >
> > > @@ -14713,6 +14744,7 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
> > >               fallthrough;
> > >       case BPF_MODIFY_RETURN:
> > >       case BPF_LSM_MAC:
> > > +     case BPF_LSM_CGROUP:
> > >       case BPF_TRACE_FENTRY:
> > >       case BPF_TRACE_FEXIT:
> > >               if (!btf_type_is_func(t)) {

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH bpf-next v8 04/11] bpf: minimize number of allocated lsm slots per program
  2022-06-06 22:46     ` Stanislav Fomichev
@ 2022-06-07 19:57       ` Martin KaFai Lau
  0 siblings, 0 replies; 35+ messages in thread
From: Martin KaFai Lau @ 2022-06-07 19:57 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii

On Mon, Jun 06, 2022 at 03:46:32PM -0700, Stanislav Fomichev wrote:
> > > diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> > > index 091ee210842f..224bb4d4fe4e 100644
> > > --- a/kernel/bpf/core.c
> > > +++ b/kernel/bpf/core.c
> > > @@ -107,6 +107,9 @@ struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flag
> > >       fp->aux->prog = fp;
> > >       fp->jit_requested = ebpf_jit_enabled();
> > >       fp->blinding_requested = bpf_jit_blinding_enabled(fp);
> > > +#ifdef CONFIG_BPF_LSM
> > I don't think this is needed.
> 
> enum cgroup_bpf_attach_type is under '#ifdef CONFIG_CGROUP_BPF', so it
> fails to build in some configs without cgroup/bpf. I was trying to move that
> enum out of the ifdef but then decided that it's easier to fix it here.
> Should I instead try to move the ifdef in bpf-cgroup-defs.h around?
Having the ifdef here is ok.  Should CONFIG_CGROUP_BPF be checked instead?
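
Roughly, the suggestion is something like the below (the exact
initialization inside the guard depends on the patch; the sentinel used
here is only an illustration):

	/* enum cgroup_bpf_attach_type (and hence the cgroup_atype field)
	 * only exists under CONFIG_CGROUP_BPF, so that is the precise guard.
	 */
#ifdef CONFIG_CGROUP_BPF
	fp->aux->cgroup_atype = CGROUP_BPF_ATTACH_TYPE_INVALID;
#endif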

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH bpf-next v8 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-06-06 22:46     ` Stanislav Fomichev
@ 2022-06-07 20:44       ` Martin KaFai Lau
  0 siblings, 0 replies; 35+ messages in thread
From: Martin KaFai Lau @ 2022-06-07 20:44 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii

On Mon, Jun 06, 2022 at 03:46:25PM -0700, Stanislav Fomichev wrote:
> On Fri, Jun 3, 2022 at 11:59 PM Martin KaFai Lau <kafai@fb.com> wrote:
> >
> > On Wed, Jun 01, 2022 at 12:02:12PM -0700, Stanislav Fomichev wrote:
> > > diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> > > index a27a6a7bd852..cb3338ef01e0 100644
> > > --- a/kernel/bpf/cgroup.c
> > > +++ b/kernel/bpf/cgroup.c
> > > @@ -1035,6 +1035,7 @@ static int cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
> > >  static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
> > >                             union bpf_attr __user *uattr)
> > >  {
> > > +     __u32 __user *prog_attach_flags = u64_to_user_ptr(attr->query.prog_attach_flags);
> > >       __u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids);
> > >       enum bpf_attach_type type = attr->query.attach_type;
> > >       enum cgroup_bpf_attach_type atype;
> > > @@ -1042,50 +1043,92 @@ static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
> > >       struct hlist_head *progs;
> > >       struct bpf_prog *prog;
> > >       int cnt, ret = 0, i;
> > > +     int total_cnt = 0;
> > >       u32 flags;
> > >
> > > -     atype = to_cgroup_bpf_attach_type(type);
> > > -     if (atype < 0)
> > > -             return -EINVAL;
> > > +     enum cgroup_bpf_attach_type from_atype, to_atype;
> > >
> > > -     progs = &cgrp->bpf.progs[atype];
> > > -     flags = cgrp->bpf.flags[atype];
> > > +     if (type == BPF_LSM_CGROUP) {
> > > +             from_atype = CGROUP_LSM_START;
> > > +             to_atype = CGROUP_LSM_END;
> > Enforce prog_attach_flags for BPF_LSM_CGROUP:
> >
> >                 if (total_cnt && !prog_attach_flags)
> >                         return -EINVAL;
> 
> All the comments make sense, will apply. The only thing that I'll
> probably keep as is is the copy_to_user(flags) part. We can't move it
> above because it hasn't been properly initialized there.
Initialize earlier:

if (type == BPF_LSM_CGROUP) {
	from_atype = CGROUP_LSM_START;
	to_atype = CGROUP_LSM_END;
	flags = 0;
} else {
	from_atype = to_cgroup_bpf_attach_type(type);
	if (from_atype < 0)
		 return -EINVAL;
	to_atype = from_atype;
	flags = cgrp->bpf.flags[to_atype];
}

if (copy_to_user(&uattr->query.attach_flags, &flags...))
	/* ... */

for (atype = from_atype; atype <= to_atype; atype++) {
	/* Figuring out total_cnt.
	 * No need to get the flags here.
	 */
}

> 
> What I think I'll do is:
> 
> if (atype != to_atype) /* exported via prog_attach_flags */
Check "from_atype != to_atype" and then reset flags back to 0?
Yeah, that will work.  I think in that case checking
BPF_LSM_CGROUP is more straightforward.

Also no strong opinion here.  I was just thinking to avoid
multiple BPF_LSM_CGROUP checks, since copying the flags
before the for-loop and after it is the same.


>   flags = 0;
> copy_to_user(.., flags,..);

> 
> That seems to better describe why we're not doing it?
> 
> 
> > > +     } else {
> > > +             from_atype = to_cgroup_bpf_attach_type(type);
> > > +             if (from_atype < 0)
> > > +                     return -EINVAL;
> > > +             to_atype = from_atype;
> > > +     }
> > >
> > > -     effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > -                                           lockdep_is_held(&cgroup_mutex));
> > > +     for (atype = from_atype; atype <= to_atype; atype++) {
> > > +             progs = &cgrp->bpf.progs[atype];
> > > +             flags = cgrp->bpf.flags[atype];
> > >
> > > -     if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > > -             cnt = bpf_prog_array_length(effective);
> > > -     else
> > > -             cnt = prog_list_length(progs);
> > > +             effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > +                                                   lockdep_is_held(&cgroup_mutex));
> > nit. This can be done under the BPF_F_QUERY_EFFECTIVE case below.
> >
> > >
> > > -     if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
> > > -             return -EFAULT;
> > > -     if (copy_to_user(&uattr->query.prog_cnt, &cnt, sizeof(cnt)))
> > > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > > +                     total_cnt += bpf_prog_array_length(effective);
> > > +             else
> > > +                     total_cnt += prog_list_length(progs);
> > > +     }
> > > +
> > > +     if (type != BPF_LSM_CGROUP)
> > > +             if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
> > > +                     return -EFAULT;
> > nit. Move this copy_to_user(&uattr->query.attach_flags,...) to the
> > 'if (type == BPF_LSM_CGROUP) { from_atype = .... } else { ... }' above.
> > That will save a BPF_LSM_CGROUP test.
> >
> > I think the '== BPF_LSM_CGROUP' case needs to copy a 0 to user also.
> >
> > > +     if (copy_to_user(&uattr->query.prog_cnt, &total_cnt, sizeof(total_cnt)))
> > >               return -EFAULT;
> > > -     if (attr->query.prog_cnt == 0 || !prog_ids || !cnt)
> > > +     if (attr->query.prog_cnt == 0 || !prog_ids || !total_cnt)
> > >               /* return early if user requested only program count + flags */
> > >               return 0;
> > > -     if (attr->query.prog_cnt < cnt) {
> > > -             cnt = attr->query.prog_cnt;
> > > +
> > > +     if (attr->query.prog_cnt < total_cnt) {
> > > +             total_cnt = attr->query.prog_cnt;
> > >               ret = -ENOSPC;
> > >       }
> > >
> > > -     if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
> > > -             return bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> > > -     } else {
> > > -             struct bpf_prog_list *pl;
> > > -             u32 id;
> > > +     for (atype = from_atype; atype <= to_atype; atype++) {
> > > +             if (total_cnt <= 0)
> > nit. total_cnt cannot be -ve ?
> > !total_cnt instead ?
> > and maybe do it in the for-loop test.
> >
> > > +                     break;
> > >
> > > -             i = 0;
> > > -             hlist_for_each_entry(pl, progs, node) {
> > > -                     prog = prog_list_prog(pl);
> > > -                     id = prog->aux->id;
> > > -                     if (copy_to_user(prog_ids + i, &id, sizeof(id)))
> > > -                             return -EFAULT;
> > > -                     if (++i == cnt)
> > > -                             break;
> > > +             progs = &cgrp->bpf.progs[atype];
> > > +             flags = cgrp->bpf.flags[atype];
> > > +
> > > +             effective = rcu_dereference_protected(cgrp->bpf.effective[atype],
> > > +                                                   lockdep_is_held(&cgroup_mutex));
> > nit. This can be done under the BPF_F_QUERY_EFFECTIVE case below.
> >
> > > +
> > > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)
> > > +                     cnt = bpf_prog_array_length(effective);
> > > +             else
> > > +                     cnt = prog_list_length(progs);
> > > +
> > > +             if (cnt >= total_cnt)
> > > +                     cnt = total_cnt;
> > nit. This seems to be the only reason that
> > the "if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE)"
> > needs to be broken up into two halves, one above and one below.
> > It does not save much code.  How about repeating this one line
> > 'cnt = min_t(int, cnt, total_cnt);' instead ?
> >
> > > +
> > > +             if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
> > > +                     ret = bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
> > > +             } else {
> > > +                     struct bpf_prog_list *pl;
> > > +                     u32 id;
> > > +
> > > +                     i = 0;
> > > +                     hlist_for_each_entry(pl, progs, node) {
> > > +                             prog = prog_list_prog(pl);
> > > +                             id = prog->aux->id;
> > > +                             if (copy_to_user(prog_ids + i, &id, sizeof(id)))
> > > +                                     return -EFAULT;
> > > +                             if (++i == cnt)
> > > +                                     break;
> > > +                     }
> > >               }
> > > +
> > > +             if (prog_attach_flags)
> > > +                     for (i = 0; i < cnt; i++)
> > > +                             if (copy_to_user(prog_attach_flags + i, &flags, sizeof(flags)))
> > > +                                     return -EFAULT;
> > > +
> > > +             prog_ids += cnt;
> > > +             total_cnt -= cnt;
> > > +             if (prog_attach_flags)
> > > +                     prog_attach_flags += cnt;
> > nit. Merge this into the above "if (prog_attach_flags)" case.
> >
> > >       }
> > >       return ret;
> > >  }
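
For the two prog_attach_flags nits, the merged version would roughly be
(v8 variable names assumed):

		if (prog_attach_flags) {
			for (i = 0; i < cnt; i++)
				if (copy_to_user(prog_attach_flags + i,
						 &flags, sizeof(flags)))
					return -EFAULT;
			prog_attach_flags += cnt;
		}

		prog_ids += cnt;
		total_cnt -= cnt;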
> > > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > > index a237be4f8bb3..27492d44133f 100644
> > > --- a/kernel/bpf/syscall.c
> > > +++ b/kernel/bpf/syscall.c
> > > @@ -3520,7 +3520,7 @@ static int bpf_prog_detach(const union bpf_attr *attr)
> > >       }
> > >  }
> > >
> > > -#define BPF_PROG_QUERY_LAST_FIELD query.prog_cnt
> > > +#define BPF_PROG_QUERY_LAST_FIELD query.prog_attach_flags
> > >
> > >  static int bpf_prog_query(const union bpf_attr *attr,
> > >                         union bpf_attr __user *uattr)
> > > @@ -3556,6 +3556,7 @@ static int bpf_prog_query(const union bpf_attr *attr,
> > >       case BPF_CGROUP_SYSCTL:
> > >       case BPF_CGROUP_GETSOCKOPT:
> > >       case BPF_CGROUP_SETSOCKOPT:
> > > +     case BPF_LSM_CGROUP:
> > >               return cgroup_bpf_prog_query(attr, uattr);
> > >       case BPF_LIRC_MODE2:
> > >               return lirc_prog_query(attr, uattr);
> > > @@ -4066,6 +4067,9 @@ static int bpf_prog_get_info_by_fd(struct file *file,
> > >
> > >       if (prog->aux->btf)
> > >               info.btf_id = btf_obj_id(prog->aux->btf);
> > > +     info.attach_btf_id = prog->aux->attach_btf_id;
> > > +     if (prog->aux->attach_btf)
> > > +             info.attach_btf_obj_id = btf_obj_id(prog->aux->attach_btf);
> > Need this also:
> >
> >         else if (prog->aux->dst_prog)
> >                 info.attach_btf_obj_id = btf_obj_id(prog->aux->dst_prog->aux->attach_btf);
> >
> > >
> > >       ulen = info.nr_func_info;
> > >       info.nr_func_info = prog->aux->func_info_cnt;
> > > --
> > > 2.36.1.255.ge46751e96f-goog
> > >
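
Combining the quoted hunk with the suggested else-branch, the resulting
logic would roughly be (v8 field names assumed):

	info.attach_btf_id = prog->aux->attach_btf_id;
	if (prog->aux->attach_btf)
		info.attach_btf_obj_id = btf_obj_id(prog->aux->attach_btf);
	else if (prog->aux->dst_prog)
		info.attach_btf_obj_id = btf_obj_id(prog->aux->dst_prog->aux->attach_btf);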

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH bpf-next v8 06/11] bpf: allow writing to a subset of sock fields from lsm progtype
  2022-06-06 22:46     ` Stanislav Fomichev
@ 2022-06-07 22:17       ` Martin KaFai Lau
  0 siblings, 0 replies; 35+ messages in thread
From: Martin KaFai Lau @ 2022-06-07 22:17 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii

On Mon, Jun 06, 2022 at 03:46:29PM -0700, Stanislav Fomichev wrote:
> On Sat, Jun 4, 2022 at 12:18 AM Martin KaFai Lau <kafai@fb.com> wrote:
> >
> > On Wed, Jun 01, 2022 at 12:02:13PM -0700, Stanislav Fomichev wrote:
> > > For now, allow only the obvious ones, like sk_priority and sk_mark.
> > >
> > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > ---
> > >  kernel/bpf/bpf_lsm.c  | 58 +++++++++++++++++++++++++++++++++++++++++++
> > >  kernel/bpf/verifier.c |  3 ++-
> > >  2 files changed, 60 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
> > > index 83aa431dd52e..feba8e96f58d 100644
> > > --- a/kernel/bpf/bpf_lsm.c
> > > +++ b/kernel/bpf/bpf_lsm.c
> > > @@ -303,7 +303,65 @@ bool bpf_lsm_is_sleepable_hook(u32 btf_id)
> > >  const struct bpf_prog_ops lsm_prog_ops = {
> > >  };
> > >
> > > +static int lsm_btf_struct_access(struct bpf_verifier_log *log,
> > > +                                     const struct btf *btf,
> > > +                                     const struct btf_type *t, int off,
> > > +                                     int size, enum bpf_access_type atype,
> > > +                                     u32 *next_btf_id,
> > > +                                     enum bpf_type_flag *flag)
> > > +{
> > > +     const struct btf_type *sock_type;
> > > +     struct btf *btf_vmlinux;
> > > +     s32 type_id;
> > > +     size_t end;
> > > +
> > > +     if (atype == BPF_READ)
> > > +             return btf_struct_access(log, btf, t, off, size, atype, next_btf_id,
> > > +                                      flag);
> > > +
> > > +     btf_vmlinux = bpf_get_btf_vmlinux();
> > > +     if (!btf_vmlinux) {
> > > +             bpf_log(log, "no vmlinux btf\n");
> > > +             return -EOPNOTSUPP;
> > > +     }
> > > +
> > > +     type_id = btf_find_by_name_kind(btf_vmlinux, "sock", BTF_KIND_STRUCT);
> > > +     if (type_id < 0) {
> > > +             bpf_log(log, "'struct sock' not found in vmlinux btf\n");
> > > +             return -EINVAL;
> > > +     }
> > > +
> > > +     sock_type = btf_type_by_id(btf_vmlinux, type_id);
> > > +
> > > +     if (t != sock_type) {
> > > +             bpf_log(log, "only 'struct sock' writes are supported\n");
> > > +             return -EACCES;
> > > +     }
> > > +
> > > +     switch (off) {
> > > +     case bpf_ctx_range(struct sock, sk_priority):
> > This looks wrong.  It should not allow writes at
> > arbitrary byte offsets within the '__u32 sk_priority'.
> 
> SG, I'll change to offsetof() and will enforce the u32 size.
> 
> > > +             end = offsetofend(struct sock, sk_priority);
> > > +             break;
> > > +     case bpf_ctx_range(struct sock, sk_mark):
> > Same here.
> >
> > Just came to my mind,
> > if the current need is only sk_priority and sk_mark,
> > do you think allowing bpf_setsockopt will be more useful instead ?
> 
> For my use-case I only need sk_priority, but I was thinking that we
> can later extend that list as needed. But your suggestion to use
> bpf_setsockopt sounds good, let me try to use that. That might be
> better than poking directly into the fields.
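
To make the two options concrete, a rough sketch of the BPF side (not
taken from the series; the section name assumes the lsm_cgroup SEC added
in patch 07, and 'return 1' follows the cover letter's retval
convention).  The bpf_setsockopt(SO_PRIORITY) alternative discussed
below would replace the direct store with a helper call:

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

char LICENSE[] SEC("license") = "GPL";

SEC("lsm_cgroup/socket_post_create")
int BPF_PROG(set_prio, struct socket *sock, int family, int type,
	     int protocol, int kern)
{
	if (sock->sk)
		sock->sk->sk_priority = 123;	/* direct write, as enabled by this patch */
	return 1;				/* success */
}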
Semi-related: bpf_setsockopt() will gain more options (and become more
useful) in the future.  I am thinking of refactoring bpf_setsockopt() to
avoid duplicating code with the syscall setsockopt().  With sockptr_t,
that may be easier to do now.  It will then be easier to allow most of
the options in bpf_setsockopt().

> 
> > It currently has SO_MARK, SO_PRIORITY, and other options.
> > Also, changing SO_MARK requires to clear the sk->sk_dst_cache.
> > In general, is it safe to do bpf_setsockopt in all bpf_lsm hooks ?
> 
> It seems that we might need to control it more strictly (regardless of
> helper vs direct field access). Not all lsm hooks lock the sk argument,
> so maybe start with some allowlist of attach_btf_ids that can do
> bpf_setsockopt? (I'll add the existing hooks that work on new/unreferenced
> or locked sockets to the list)
Yeah, it seems a btf id list is needed.  Regardless of whether
bpf_setsockopt() is used, locking the sk is needed to make changes.
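
A rough sketch of such an allowlist, reusing the existing BTF id set
machinery from <linux/btf_ids.h> (the set/helper names and the hooks
listed are hypothetical; which hooks actually qualify is the open
question here):

BTF_SET_START(bpf_lsm_setsockopt_hooks)
BTF_ID(func, bpf_lsm_socket_post_create)	/* example entries only */
BTF_ID(func, bpf_lsm_socket_socketpair)
BTF_SET_END(bpf_lsm_setsockopt_hooks)

static bool bpf_lsm_can_setsockopt(u32 attach_btf_id)
{
	return btf_id_set_contains(&bpf_lsm_setsockopt_hooks, attach_btf_id);
}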

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2022-06-08  0:23 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-01 19:02 [PATCH bpf-next v8 00/11] bpf: cgroup_sock lsm flavor Stanislav Fomichev
2022-06-01 19:02 ` [PATCH bpf-next v8 01/11] bpf: add bpf_func_t and trampoline helpers Stanislav Fomichev
2022-06-01 19:02 ` [PATCH bpf-next v8 02/11] bpf: convert cgroup_bpf.progs to hlist Stanislav Fomichev
2022-06-01 19:02 ` [PATCH bpf-next v8 03/11] bpf: per-cgroup lsm flavor Stanislav Fomichev
2022-06-02  6:16   ` kernel test robot
2022-06-02 16:49     ` Stanislav Fomichev
2022-06-04  6:11   ` Martin KaFai Lau
2022-06-04  8:27     ` Martin KaFai Lau
2022-06-06 22:46       ` Stanislav Fomichev
2022-06-06 22:46     ` Stanislav Fomichev
2022-06-07 19:24       ` Martin KaFai Lau
2022-06-01 19:02 ` [PATCH bpf-next v8 04/11] bpf: minimize number of allocated lsm slots per program Stanislav Fomichev
2022-06-04  6:35   ` Martin KaFai Lau
2022-06-06 22:46     ` Stanislav Fomichev
2022-06-07 19:57       ` Martin KaFai Lau
2022-06-01 19:02 ` [PATCH bpf-next v8 05/11] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP Stanislav Fomichev
2022-06-04  6:59   ` Martin KaFai Lau
2022-06-04  8:05     ` Martin KaFai Lau
2022-06-06 22:46     ` Stanislav Fomichev
2022-06-07 20:44       ` Martin KaFai Lau
2022-06-01 19:02 ` [PATCH bpf-next v8 06/11] bpf: allow writing to a subset of sock fields from lsm progtype Stanislav Fomichev
2022-06-04  7:18   ` Martin KaFai Lau
2022-06-06 22:46     ` Stanislav Fomichev
2022-06-07 22:17       ` Martin KaFai Lau
2022-06-01 19:02 ` [PATCH bpf-next v8 07/11] libbpf: add lsm_cgoup_sock type Stanislav Fomichev
2022-06-03 21:30   ` Andrii Nakryiko
2022-06-01 19:02 ` [PATCH bpf-next v8 08/11] libbpf: implement bpf_prog_query_opts Stanislav Fomichev
2022-06-03 21:34   ` Andrii Nakryiko
2022-06-03 23:13     ` Stanislav Fomichev
2022-06-01 19:02 ` [PATCH bpf-next v8 09/11] bpftool: implement cgroup tree for BPF_LSM_CGROUP Stanislav Fomichev
2022-06-01 19:02 ` [PATCH bpf-next v8 10/11] selftests/bpf: lsm_cgroup functional test Stanislav Fomichev
2022-06-01 19:02 ` [PATCH bpf-next v8 11/11] selftests/bpf: verify lsm_cgroup struct sock access Stanislav Fomichev
2022-06-03  1:52   ` Martin KaFai Lau
2022-06-03  1:59     ` Stanislav Fomichev
2022-06-03  5:38       ` Martin KaFai Lau
