* [PATCH bpf-next v6 00/10] bpf: cgroup_sock lsm flavor
From: Stanislav Fomichev @ 2022-04-29 21:15 UTC
  To: netdev, bpf
  Cc: ast, daniel, andrii, Stanislav Fomichev, kafai, kpsingh, jakub

This series implements a new LSM flavor for attaching per-cgroup
programs to existing LSM hooks. The cgroup is taken from 'current',
unless the first argument of the hook is 'struct socket', in which
case the cgroup association is taken from the socket. The attachment
looks like a regular per-cgroup attachment: we add a new BPF_LSM_CGROUP
attach type which, together with attach_btf_id, signals per-cgroup LSM.
Behind the scenes, we allocate a trampoline shim program and
attach it to the LSM hook. This program looks up the cgroup from
current/socket and runs the cgroup's effective prog array. The rest of
the per-cgroup BPF machinery stays the same: hierarchy, local storage,
retval conventions (return 1 == success).
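
For illustration, a minimal sketch of what a program using this flavor
could look like. The SEC() prefix is inferred from the title of patch 7
("lsm_cgroup_sock") and the hook choice is arbitrary, so treat both as
assumptions rather than final UAPI:

  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>
  #include <bpf/bpf_tracing.h>

  /* Runs for the socket_bind LSM hook, but only for sockets that
   * belong to the cgroup(s) the program is attached to.
   */
  SEC("lsm_cgroup_sock/socket_bind")
  int BPF_PROG(bind_hook, struct socket *sock, struct sockaddr *address,
               int addrlen)
  {
          return 1; /* retval convention: 1 == allow, 0 == reject */
  }

  char _license[] SEC("license") = "GPL";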

Current limitations:
* sleepable bpf hasn't been considered; it can be extended later on
* not sure the verifier does the right thing with null checks;
  see the latest selftest for details
* a total of 10 (global) per-cgroup LSM attach points; this bloats
  struct cgroup_bpf a bit

Cc: ast@kernel.org
Cc: daniel@iogearbox.net
Cc: kafai@fb.com
Cc: kpsingh@kernel.org
Cc: jakub@cloudflare.com

v6:
- remove active count & stats for shim program (Martin KaFai Lau)
- remove NULL/error check for btf_vmlinux (Martin)
- don't check cgroup_atype in bpf_cgroup_lsm_shim_release (Martin)
- use old_prog (instead of passed one) in __cgroup_bpf_detach (Martin)
- make sure attach_btf_id is the same in __cgroup_bpf_replace (Martin)
- enable cgroup local storage and test it (Martin)
- properly implement prog query and add bpftool & tests (Martin)
- prohibit non-shared cgroup storage mode for BPF_LSM_CGROUP (Martin)

v5:
- __cgroup_bpf_run_lsm_socket remove NULL sock/sk checks (Martin KaFai Lau)
- __cgroup_bpf_run_lsm_{socket,current} s/prog/shim_prog/ (Martin)
- make sure bpf_lsm_find_cgroup_shim works for hooks without args (Martin)
- __cgroup_bpf_attach make sure attach_btf_id is the same when replacing (Martin)
- call bpf_cgroup_lsm_shim_release only for LSM_CGROUP (Martin)
- drop BPF_LSM_CGROUP from bpf_attach_type_to_tramp (Martin)
- drop jited check from cgroup_shim_find (Martin)
- new patch to convert cgroup_bpf to hlist_node (Jakub Sitnicki)
- new shim flavor for 'struct sock' + list of exceptions (Martin)

v4:
- fix build when jit is on but syscall is off

v3:
- add BPF_LSM_CGROUP to bpftool
- use simple int instead of refcnt_t (to avoid use-after-free
  false positive)

v2:
- addressed build bot failures

Stanislav Fomichev (10):
  bpf: add bpf_func_t and trampoline helpers
  bpf: convert cgroup_bpf.progs to hlist
  bpf: per-cgroup lsm flavor
  bpf: minimize number of allocated lsm slots per program
  bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  bpf: allow writing to a subset of sock fields from lsm progtype
  libbpf: add lsm_cgroup_sock type
  bpftool: implement cgroup tree for BPF_LSM_CGROUP
  selftests/bpf: lsm_cgroup functional test
  selftests/bpf: verify lsm_cgroup struct sock access

 arch/x86/net/bpf_jit_comp.c                   |  22 +-
 include/linux/bpf-cgroup-defs.h               |  11 +-
 include/linux/bpf-cgroup.h                    |   9 +-
 include/linux/bpf.h                           |  26 +-
 include/linux/bpf_lsm.h                       |   8 +
 include/uapi/linux/bpf.h                      |   2 +
 kernel/bpf/bpf_lsm.c                          | 117 ++++++
 kernel/bpf/btf.c                              |  11 +
 kernel/bpf/cgroup.c                           | 391 +++++++++++++++---
 kernel/bpf/syscall.c                          |  13 +-
 kernel/bpf/trampoline.c                       | 222 ++++++++--
 kernel/bpf/verifier.c                         |  35 +-
 tools/bpf/bpftool/cgroup.c                    | 138 +++++--
 tools/bpf/bpftool/common.c                    |   1 +
 tools/include/uapi/linux/bpf.h                |   2 +
 tools/lib/bpf/bpf.c                           |  42 +-
 tools/lib/bpf/bpf.h                           |  15 +
 tools/lib/bpf/libbpf.c                        |   2 +
 tools/lib/bpf/libbpf.map                      |   1 +
 .../selftests/bpf/prog_tests/lsm_cgroup.c     | 236 +++++++++++
 .../testing/selftests/bpf/progs/lsm_cgroup.c  | 160 +++++++
 tools/testing/selftests/bpf/test_verifier.c   |  54 ++-
 .../selftests/bpf/verifier/lsm_cgroup.c       |  34 ++
 23 files changed, 1406 insertions(+), 146 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
 create mode 100644 tools/testing/selftests/bpf/progs/lsm_cgroup.c
 create mode 100644 tools/testing/selftests/bpf/verifier/lsm_cgroup.c

-- 
2.36.0.464.gb9c8b46e94-goog



* [PATCH bpf-next v6 01/10] bpf: add bpf_func_t and trampoline helpers
From: Stanislav Fomichev @ 2022-04-29 21:15 UTC
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

I'll be adding LSM cgroup-specific helpers that grab
the trampoline mutex.

No functional changes.
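
For context, a later patch in the series needs to link shim programs
while already holding tr->mutex, which is why the unlocked __ variants
are split out here. A minimal sketch of that calling pattern (the shim
naming is taken from patch 3):

  mutex_lock(&tr->mutex);
  /* The __ variant assumes tr->mutex is already held. */
  err = __bpf_trampoline_link_prog(shim_prog, tr);
  mutex_unlock(&tr->mutex);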

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 include/linux/bpf.h     | 11 ++++---
 kernel/bpf/trampoline.c | 63 +++++++++++++++++++++++------------------
 2 files changed, 40 insertions(+), 34 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index be94833d390a..0a8927103018 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -53,6 +53,8 @@ typedef u64 (*bpf_callback_t)(u64, u64, u64, u64, u64);
 typedef int (*bpf_iter_init_seq_priv_t)(void *private_data,
 					struct bpf_iter_aux_info *aux);
 typedef void (*bpf_iter_fini_seq_priv_t)(void *private_data);
+typedef unsigned int (*bpf_func_t)(const void *,
+				   const struct bpf_insn *);
 struct bpf_iter_seq_info {
 	const struct seq_operations *seq_ops;
 	bpf_iter_init_seq_priv_t init_seq_private;
@@ -847,8 +849,7 @@ struct bpf_dispatcher {
 static __always_inline __nocfi unsigned int bpf_dispatcher_nop_func(
 	const void *ctx,
 	const struct bpf_insn *insnsi,
-	unsigned int (*bpf_func)(const void *,
-				 const struct bpf_insn *))
+	bpf_func_t bpf_func)
 {
 	return bpf_func(ctx, insnsi);
 }
@@ -876,8 +877,7 @@ int arch_prepare_bpf_dispatcher(void *image, s64 *funcs, int num_funcs);
 	noinline __nocfi unsigned int bpf_dispatcher_##name##_func(	\
 		const void *ctx,					\
 		const struct bpf_insn *insnsi,				\
-		unsigned int (*bpf_func)(const void *,			\
-					 const struct bpf_insn *))	\
+		bpf_func_t bpf_func)					\
 	{								\
 		return bpf_func(ctx, insnsi);				\
 	}								\
@@ -888,8 +888,7 @@ int arch_prepare_bpf_dispatcher(void *image, s64 *funcs, int num_funcs);
 	unsigned int bpf_dispatcher_##name##_func(			\
 		const void *ctx,					\
 		const struct bpf_insn *insnsi,				\
-		unsigned int (*bpf_func)(const void *,			\
-					 const struct bpf_insn *));	\
+		bpf_func_t bpf_func);					\
 	extern struct bpf_dispatcher bpf_dispatcher_##name;
 #define BPF_DISPATCHER_FUNC(name) bpf_dispatcher_##name##_func
 #define BPF_DISPATCHER_PTR(name) (&bpf_dispatcher_##name)
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index ada97751ae1b..0c4fd194e801 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -407,42 +407,34 @@ static enum bpf_tramp_prog_type bpf_attach_type_to_tramp(struct bpf_prog *prog)
 	}
 }
 
-int bpf_trampoline_link_prog(struct bpf_prog *prog, struct bpf_trampoline *tr)
+static int __bpf_trampoline_link_prog(struct bpf_prog *prog,
+				      struct bpf_trampoline *tr)
 {
 	enum bpf_tramp_prog_type kind;
 	int err = 0;
 	int cnt;
 
 	kind = bpf_attach_type_to_tramp(prog);
-	mutex_lock(&tr->mutex);
-	if (tr->extension_prog) {
+	if (tr->extension_prog)
 		/* cannot attach fentry/fexit if extension prog is attached.
 		 * cannot overwrite extension prog either.
 		 */
-		err = -EBUSY;
-		goto out;
-	}
+		return -EBUSY;
+
 	cnt = tr->progs_cnt[BPF_TRAMP_FENTRY] + tr->progs_cnt[BPF_TRAMP_FEXIT];
 	if (kind == BPF_TRAMP_REPLACE) {
 		/* Cannot attach extension if fentry/fexit are in use. */
-		if (cnt) {
-			err = -EBUSY;
-			goto out;
-		}
+		if (cnt)
+			return -EBUSY;
 		tr->extension_prog = prog;
-		err = bpf_arch_text_poke(tr->func.addr, BPF_MOD_JUMP, NULL,
-					 prog->bpf_func);
-		goto out;
-	}
-	if (cnt >= BPF_MAX_TRAMP_PROGS) {
-		err = -E2BIG;
-		goto out;
+		return bpf_arch_text_poke(tr->func.addr, BPF_MOD_JUMP, NULL,
+					  prog->bpf_func);
 	}
-	if (!hlist_unhashed(&prog->aux->tramp_hlist)) {
+	if (cnt >= BPF_MAX_TRAMP_PROGS)
+		return -E2BIG;
+	if (!hlist_unhashed(&prog->aux->tramp_hlist))
 		/* prog already linked */
-		err = -EBUSY;
-		goto out;
-	}
+		return -EBUSY;
 	hlist_add_head(&prog->aux->tramp_hlist, &tr->progs_hlist[kind]);
 	tr->progs_cnt[kind]++;
 	err = bpf_trampoline_update(tr);
@@ -450,30 +442,45 @@ int bpf_trampoline_link_prog(struct bpf_prog *prog, struct bpf_trampoline *tr)
 		hlist_del_init(&prog->aux->tramp_hlist);
 		tr->progs_cnt[kind]--;
 	}
-out:
+	return err;
+}
+
+int bpf_trampoline_link_prog(struct bpf_prog *prog, struct bpf_trampoline *tr)
+{
+	int err;
+
+	mutex_lock(&tr->mutex);
+	err = __bpf_trampoline_link_prog(prog, tr);
 	mutex_unlock(&tr->mutex);
 	return err;
 }
 
-/* bpf_trampoline_unlink_prog() should never fail. */
-int bpf_trampoline_unlink_prog(struct bpf_prog *prog, struct bpf_trampoline *tr)
+static int __bpf_trampoline_unlink_prog(struct bpf_prog *prog,
+					struct bpf_trampoline *tr)
 {
 	enum bpf_tramp_prog_type kind;
 	int err;
 
 	kind = bpf_attach_type_to_tramp(prog);
-	mutex_lock(&tr->mutex);
 	if (kind == BPF_TRAMP_REPLACE) {
 		WARN_ON_ONCE(!tr->extension_prog);
 		err = bpf_arch_text_poke(tr->func.addr, BPF_MOD_JUMP,
 					 tr->extension_prog->bpf_func, NULL);
 		tr->extension_prog = NULL;
-		goto out;
+		return err;
 	}
 	hlist_del_init(&prog->aux->tramp_hlist);
 	tr->progs_cnt[kind]--;
-	err = bpf_trampoline_update(tr);
-out:
+	return bpf_trampoline_update(tr);
+}
+
+/* bpf_trampoline_unlink_prog() should never fail. */
+int bpf_trampoline_unlink_prog(struct bpf_prog *prog, struct bpf_trampoline *tr)
+{
+	int err;
+
+	mutex_lock(&tr->mutex);
+	err = __bpf_trampoline_unlink_prog(prog, tr);
 	mutex_unlock(&tr->mutex);
 	return err;
 }
-- 
2.36.0.464.gb9c8b46e94-goog



* [PATCH bpf-next v6 02/10] bpf: convert cgroup_bpf.progs to hlist
From: Stanislav Fomichev @ 2022-04-29 21:15 UTC
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev, Jakub Sitnicki

This lets us reclaim some space to be used by the new cgroup LSM slots.

Before:
struct cgroup_bpf {
	struct bpf_prog_array *    effective[23];        /*     0   184 */
	/* --- cacheline 2 boundary (128 bytes) was 56 bytes ago --- */
	struct list_head           progs[23];            /*   184   368 */
	/* --- cacheline 8 boundary (512 bytes) was 40 bytes ago --- */
	u32                        flags[23];            /*   552    92 */

	/* XXX 4 bytes hole, try to pack */

	/* --- cacheline 10 boundary (640 bytes) was 8 bytes ago --- */
	struct list_head           storages;             /*   648    16 */
	struct bpf_prog_array *    inactive;             /*   664     8 */
	struct percpu_ref          refcnt;               /*   672    16 */
	struct work_struct         release_work;         /*   688    32 */

	/* size: 720, cachelines: 12, members: 7 */
	/* sum members: 716, holes: 1, sum holes: 4 */
	/* last cacheline: 16 bytes */
};

After:
struct cgroup_bpf {
	struct bpf_prog_array *    effective[23];        /*     0   184 */
	/* --- cacheline 2 boundary (128 bytes) was 56 bytes ago --- */
	struct hlist_head          progs[23];            /*   184   184 */
	/* --- cacheline 5 boundary (320 bytes) was 48 bytes ago --- */
	u8                         flags[23];            /*   368    23 */

	/* XXX 1 byte hole, try to pack */

	/* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */
	struct list_head           storages;             /*   392    16 */
	struct bpf_prog_array *    inactive;             /*   408     8 */
	struct percpu_ref          refcnt;               /*   416    16 */
	struct work_struct         release_work;         /*   432    72 */

	/* size: 504, cachelines: 8, members: 7 */
	/* sum members: 503, holes: 1, sum holes: 1 */
	/* last cacheline: 56 bytes */
};
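
The progs[] shrinkage comes from hlist_head carrying a single pointer
where list_head carries two; a minimal sketch of the kernel definitions
as they look on a 64-bit build:

  struct list_head  { struct list_head *next, *prev; }; /* 16 bytes */
  struct hlist_head { struct hlist_node *first; };      /*  8 bytes */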

Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 include/linux/bpf-cgroup-defs.h |  4 +-
 include/linux/bpf-cgroup.h      |  2 +-
 kernel/bpf/cgroup.c             | 72 +++++++++++++++++++--------------
 3 files changed, 45 insertions(+), 33 deletions(-)

diff --git a/include/linux/bpf-cgroup-defs.h b/include/linux/bpf-cgroup-defs.h
index 695d1224a71b..5d268e76d8e6 100644
--- a/include/linux/bpf-cgroup-defs.h
+++ b/include/linux/bpf-cgroup-defs.h
@@ -47,8 +47,8 @@ struct cgroup_bpf {
 	 * have either zero or one element
 	 * when BPF_F_ALLOW_MULTI the list can have up to BPF_CGROUP_MAX_PROGS
 	 */
-	struct list_head progs[MAX_CGROUP_BPF_ATTACH_TYPE];
-	u32 flags[MAX_CGROUP_BPF_ATTACH_TYPE];
+	struct hlist_head progs[MAX_CGROUP_BPF_ATTACH_TYPE];
+	u8 flags[MAX_CGROUP_BPF_ATTACH_TYPE];
 
 	/* list of cgroup shared storages */
 	struct list_head storages;
diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
index 669d96d074ad..6673acfbf2ef 100644
--- a/include/linux/bpf-cgroup.h
+++ b/include/linux/bpf-cgroup.h
@@ -95,7 +95,7 @@ struct bpf_cgroup_link {
 };
 
 struct bpf_prog_list {
-	struct list_head node;
+	struct hlist_node node;
 	struct bpf_prog *prog;
 	struct bpf_cgroup_link *link;
 	struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE];
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index afb414b26d01..134785ab487c 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -157,11 +157,12 @@ static void cgroup_bpf_release(struct work_struct *work)
 	mutex_lock(&cgroup_mutex);
 
 	for (atype = 0; atype < ARRAY_SIZE(cgrp->bpf.progs); atype++) {
-		struct list_head *progs = &cgrp->bpf.progs[atype];
-		struct bpf_prog_list *pl, *pltmp;
+		struct hlist_head *progs = &cgrp->bpf.progs[atype];
+		struct bpf_prog_list *pl;
+		struct hlist_node *pltmp;
 
-		list_for_each_entry_safe(pl, pltmp, progs, node) {
-			list_del(&pl->node);
+		hlist_for_each_entry_safe(pl, pltmp, progs, node) {
+			hlist_del(&pl->node);
 			if (pl->prog)
 				bpf_prog_put(pl->prog);
 			if (pl->link)
@@ -217,12 +218,12 @@ static struct bpf_prog *prog_list_prog(struct bpf_prog_list *pl)
 /* count number of elements in the list.
  * it's slow but the list cannot be long
  */
-static u32 prog_list_length(struct list_head *head)
+static u32 prog_list_length(struct hlist_head *head)
 {
 	struct bpf_prog_list *pl;
 	u32 cnt = 0;
 
-	list_for_each_entry(pl, head, node) {
+	hlist_for_each_entry(pl, head, node) {
 		if (!prog_list_prog(pl))
 			continue;
 		cnt++;
@@ -291,7 +292,7 @@ static int compute_effective_progs(struct cgroup *cgrp,
 		if (cnt > 0 && !(p->bpf.flags[atype] & BPF_F_ALLOW_MULTI))
 			continue;
 
-		list_for_each_entry(pl, &p->bpf.progs[atype], node) {
+		hlist_for_each_entry(pl, &p->bpf.progs[atype], node) {
 			if (!prog_list_prog(pl))
 				continue;
 
@@ -342,7 +343,7 @@ int cgroup_bpf_inherit(struct cgroup *cgrp)
 		cgroup_bpf_get(p);
 
 	for (i = 0; i < NR; i++)
-		INIT_LIST_HEAD(&cgrp->bpf.progs[i]);
+		INIT_HLIST_HEAD(&cgrp->bpf.progs[i]);
 
 	INIT_LIST_HEAD(&cgrp->bpf.storages);
 
@@ -418,7 +419,7 @@ static int update_effective_progs(struct cgroup *cgrp,
 
 #define BPF_CGROUP_MAX_PROGS 64
 
-static struct bpf_prog_list *find_attach_entry(struct list_head *progs,
+static struct bpf_prog_list *find_attach_entry(struct hlist_head *progs,
 					       struct bpf_prog *prog,
 					       struct bpf_cgroup_link *link,
 					       struct bpf_prog *replace_prog,
@@ -428,12 +429,12 @@ static struct bpf_prog_list *find_attach_entry(struct list_head *progs,
 
 	/* single-attach case */
 	if (!allow_multi) {
-		if (list_empty(progs))
+		if (hlist_empty(progs))
 			return NULL;
-		return list_first_entry(progs, typeof(*pl), node);
+		return hlist_entry(progs->first, typeof(*pl), node);
 	}
 
-	list_for_each_entry(pl, progs, node) {
+	hlist_for_each_entry(pl, progs, node) {
 		if (prog && pl->prog == prog && prog != replace_prog)
 			/* disallow attaching the same prog twice */
 			return ERR_PTR(-EINVAL);
@@ -444,7 +445,7 @@ static struct bpf_prog_list *find_attach_entry(struct list_head *progs,
 
 	/* direct prog multi-attach w/ replacement case */
 	if (replace_prog) {
-		list_for_each_entry(pl, progs, node) {
+		hlist_for_each_entry(pl, progs, node) {
 			if (pl->prog == replace_prog)
 				/* a match found */
 				return pl;
@@ -480,7 +481,7 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 	struct bpf_cgroup_storage *new_storage[MAX_BPF_CGROUP_STORAGE_TYPE] = {};
 	enum cgroup_bpf_attach_type atype;
 	struct bpf_prog_list *pl;
-	struct list_head *progs;
+	struct hlist_head *progs;
 	int err;
 
 	if (((flags & BPF_F_ALLOW_OVERRIDE) && (flags & BPF_F_ALLOW_MULTI)) ||
@@ -503,7 +504,7 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 	if (!hierarchy_allows_attach(cgrp, atype))
 		return -EPERM;
 
-	if (!list_empty(progs) && cgrp->bpf.flags[atype] != saved_flags)
+	if (!hlist_empty(progs) && cgrp->bpf.flags[atype] != saved_flags)
 		/* Disallow attaching non-overridable on top
 		 * of existing overridable in this cgroup.
 		 * Disallow attaching multi-prog if overridable or none
@@ -525,12 +526,22 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 	if (pl) {
 		old_prog = pl->prog;
 	} else {
+		struct hlist_node *last = NULL;
+
 		pl = kmalloc(sizeof(*pl), GFP_KERNEL);
 		if (!pl) {
 			bpf_cgroup_storages_free(new_storage);
 			return -ENOMEM;
 		}
-		list_add_tail(&pl->node, progs);
+		if (hlist_empty(progs))
+			hlist_add_head(&pl->node, progs);
+		else
+			hlist_for_each(last, progs) {
+				if (last->next)
+					continue;
+				hlist_add_behind(&pl->node, last);
+				break;
+			}
 	}
 
 	pl->prog = prog;
@@ -556,7 +567,7 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 	}
 	bpf_cgroup_storages_free(new_storage);
 	if (!old_prog) {
-		list_del(&pl->node);
+		hlist_del(&pl->node);
 		kfree(pl);
 	}
 	return err;
@@ -587,7 +598,7 @@ static void replace_effective_prog(struct cgroup *cgrp,
 	struct cgroup_subsys_state *css;
 	struct bpf_prog_array *progs;
 	struct bpf_prog_list *pl;
-	struct list_head *head;
+	struct hlist_head *head;
 	struct cgroup *cg;
 	int pos;
 
@@ -603,7 +614,7 @@ static void replace_effective_prog(struct cgroup *cgrp,
 				continue;
 
 			head = &cg->bpf.progs[atype];
-			list_for_each_entry(pl, head, node) {
+			hlist_for_each_entry(pl, head, node) {
 				if (!prog_list_prog(pl))
 					continue;
 				if (pl->link == link)
@@ -637,7 +648,7 @@ static int __cgroup_bpf_replace(struct cgroup *cgrp,
 	enum cgroup_bpf_attach_type atype;
 	struct bpf_prog *old_prog;
 	struct bpf_prog_list *pl;
-	struct list_head *progs;
+	struct hlist_head *progs;
 	bool found = false;
 
 	atype = to_cgroup_bpf_attach_type(link->type);
@@ -649,7 +660,7 @@ static int __cgroup_bpf_replace(struct cgroup *cgrp,
 	if (link->link.prog->type != new_prog->type)
 		return -EINVAL;
 
-	list_for_each_entry(pl, progs, node) {
+	hlist_for_each_entry(pl, progs, node) {
 		if (pl->link == link) {
 			found = true;
 			break;
@@ -688,7 +699,7 @@ static int cgroup_bpf_replace(struct bpf_link *link, struct bpf_prog *new_prog,
 	return ret;
 }
 
-static struct bpf_prog_list *find_detach_entry(struct list_head *progs,
+static struct bpf_prog_list *find_detach_entry(struct hlist_head *progs,
 					       struct bpf_prog *prog,
 					       struct bpf_cgroup_link *link,
 					       bool allow_multi)
@@ -696,14 +707,14 @@ static struct bpf_prog_list *find_detach_entry(struct list_head *progs,
 	struct bpf_prog_list *pl;
 
 	if (!allow_multi) {
-		if (list_empty(progs))
+		if (hlist_empty(progs))
 			/* report error when trying to detach and nothing is attached */
 			return ERR_PTR(-ENOENT);
 
 		/* to maintain backward compatibility NONE and OVERRIDE cgroups
 		 * allow detaching with invalid FD (prog==NULL) in legacy mode
 		 */
-		return list_first_entry(progs, typeof(*pl), node);
+		return hlist_entry(progs->first, typeof(*pl), node);
 	}
 
 	if (!prog && !link)
@@ -713,7 +724,7 @@ static struct bpf_prog_list *find_detach_entry(struct list_head *progs,
 		return ERR_PTR(-EINVAL);
 
 	/* find the prog or link and detach it */
-	list_for_each_entry(pl, progs, node) {
+	hlist_for_each_entry(pl, progs, node) {
 		if (pl->prog == prog && pl->link == link)
 			return pl;
 	}
@@ -737,7 +748,7 @@ static int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
 	enum cgroup_bpf_attach_type atype;
 	struct bpf_prog *old_prog;
 	struct bpf_prog_list *pl;
-	struct list_head *progs;
+	struct hlist_head *progs;
 	u32 flags;
 	int err;
 
@@ -766,9 +777,10 @@ static int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
 		goto cleanup;
 
 	/* now can actually delete it from this cgroup list */
-	list_del(&pl->node);
+	hlist_del(&pl->node);
+
 	kfree(pl);
-	if (list_empty(progs))
+	if (hlist_empty(progs))
 		/* last program was detached, reset flags to zero */
 		cgrp->bpf.flags[atype] = 0;
 	if (old_prog)
@@ -802,7 +814,7 @@ static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
 	enum bpf_attach_type type = attr->query.attach_type;
 	enum cgroup_bpf_attach_type atype;
 	struct bpf_prog_array *effective;
-	struct list_head *progs;
+	struct hlist_head *progs;
 	struct bpf_prog *prog;
 	int cnt, ret = 0, i;
 	u32 flags;
@@ -841,7 +853,7 @@ static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
 		u32 id;
 
 		i = 0;
-		list_for_each_entry(pl, progs, node) {
+		hlist_for_each_entry(pl, progs, node) {
 			prog = prog_list_prog(pl);
 			id = prog->aux->id;
 			if (copy_to_user(prog_ids + i, &id, sizeof(id)))
-- 
2.36.0.464.gb9c8b46e94-goog



* [PATCH bpf-next v6 03/10] bpf: per-cgroup lsm flavor
From: Stanislav Fomichev @ 2022-04-29 21:15 UTC
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

Allow attaching to LSM hooks in the cgroup context.

Attaching to per-cgroup LSM works exactly like attaching
to other per-cgroup hooks. A new BPF_LSM_CGROUP attach type is added
to trigger the new mode; the actual LSM hook we attach to is
signaled via the existing attach_btf_id.

For the hooks that have 'struct socket' or 'struct sock' as their
first argument, we use the cgroup associated with that socket. For the
rest, we use the 'current' cgroup (this all applies to the default
hierarchy, i.e. v2 only). Note that for some hooks that work on
'struct sock' we still take the cgroup from 'current' because they
operate on a socket that hasn't been fully initialized yet.

Behind the scenes, we allocate a shim program that is attached
to the trampoline and runs the cgroup's effective BPF program array.
This shim has some rudimentary ref counting and can be shared
between several programs attaching to the same per-cgroup LSM hook.

Note that this patch bloats the cgroup size because we add 211
cgroup_bpf_attach_type(s) for simplicity's sake. This is
addressed in the subsequent patch.

Also note that we only add the non-sleepable flavor for now. To enable
sleepable use-cases, bpf_prog_run_array_cg has to grab trace rcu,
shim programs have to be freed via trace rcu, cgroup_bpf.effective
should also be trace-rcu-managed, plus maybe some other changes that
I'm not aware of.
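
Attachment from userspace is expected to look like any other
per-cgroup attach, just with the new attach type. A minimal sketch
using libbpf's existing wrapper (the cgroup path, flags and error
handling are illustrative):

  int cgroup_fd = open("/sys/fs/cgroup/my-cgroup",
                       O_RDONLY | O_DIRECTORY);
  int err = bpf_prog_attach(prog_fd, cgroup_fd,
                            BPF_LSM_CGROUP, BPF_F_ALLOW_MULTI);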

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 arch/x86/net/bpf_jit_comp.c     |  22 ++--
 include/linux/bpf-cgroup-defs.h |   6 ++
 include/linux/bpf-cgroup.h      |   7 ++
 include/linux/bpf.h             |  15 +++
 include/linux/bpf_lsm.h         |  14 +++
 include/uapi/linux/bpf.h        |   1 +
 kernel/bpf/bpf_lsm.c            |  64 ++++++++++++
 kernel/bpf/btf.c                |  11 ++
 kernel/bpf/cgroup.c             | 179 +++++++++++++++++++++++++++++---
 kernel/bpf/syscall.c            |  10 ++
 kernel/bpf/trampoline.c         | 161 ++++++++++++++++++++++++++++
 kernel/bpf/verifier.c           |  32 ++++++
 tools/include/uapi/linux/bpf.h  |   1 +
 13 files changed, 503 insertions(+), 20 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 16b6efacf7c6..0b11be002c28 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1764,15 +1764,23 @@ static void restore_regs(const struct btf_func_model *m, u8 **prog, int nr_args,
 static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
 			   struct bpf_prog *p, int stack_size, bool save_ret)
 {
+	void (*exit)(struct bpf_prog *prog, u64 start) = __bpf_prog_exit;
+	u64 (*enter)(struct bpf_prog *prog) = __bpf_prog_enter;
 	u8 *prog = *pprog;
 	u8 *jmp_insn;
 
+	if (p->aux->sleepable) {
+		enter = __bpf_prog_enter_sleepable;
+		exit = __bpf_prog_exit_sleepable;
+	} else if (p->expected_attach_type == BPF_LSM_CGROUP) {
+		enter = __bpf_prog_enter_lsm_cgroup;
+		exit = __bpf_prog_exit_lsm_cgroup;
+	}
+
 	/* arg1: mov rdi, progs[i] */
 	emit_mov_imm64(&prog, BPF_REG_1, (long) p >> 32, (u32) (long) p);
-	if (emit_call(&prog,
-		      p->aux->sleepable ? __bpf_prog_enter_sleepable :
-		      __bpf_prog_enter, prog))
-			return -EINVAL;
+	if (emit_call(&prog, enter, prog))
+		return -EINVAL;
 	/* remember prog start time returned by __bpf_prog_enter */
 	emit_mov_reg(&prog, true, BPF_REG_6, BPF_REG_0);
 
@@ -1814,10 +1822,8 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
 	emit_mov_imm64(&prog, BPF_REG_1, (long) p >> 32, (u32) (long) p);
 	/* arg2: mov rsi, rbx <- start time in nsec */
 	emit_mov_reg(&prog, true, BPF_REG_2, BPF_REG_6);
-	if (emit_call(&prog,
-		      p->aux->sleepable ? __bpf_prog_exit_sleepable :
-		      __bpf_prog_exit, prog))
-			return -EINVAL;
+	if (emit_call(&prog, exit, prog))
+		return -EINVAL;
 
 	*pprog = prog;
 	return 0;
diff --git a/include/linux/bpf-cgroup-defs.h b/include/linux/bpf-cgroup-defs.h
index 5d268e76d8e6..d5a70a35dace 100644
--- a/include/linux/bpf-cgroup-defs.h
+++ b/include/linux/bpf-cgroup-defs.h
@@ -10,6 +10,8 @@
 
 struct bpf_prog_array;
 
+#define CGROUP_LSM_NUM 211 /* will be addressed in the next patch */
+
 enum cgroup_bpf_attach_type {
 	CGROUP_BPF_ATTACH_TYPE_INVALID = -1,
 	CGROUP_INET_INGRESS = 0,
@@ -35,6 +37,10 @@ enum cgroup_bpf_attach_type {
 	CGROUP_INET4_GETSOCKNAME,
 	CGROUP_INET6_GETSOCKNAME,
 	CGROUP_INET_SOCK_RELEASE,
+#ifdef CONFIG_BPF_LSM
+	CGROUP_LSM_START,
+	CGROUP_LSM_END = CGROUP_LSM_START + CGROUP_LSM_NUM - 1,
+#endif
 	MAX_CGROUP_BPF_ATTACH_TYPE
 };
 
diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
index 6673acfbf2ef..2bd1b5f8de9b 100644
--- a/include/linux/bpf-cgroup.h
+++ b/include/linux/bpf-cgroup.h
@@ -23,6 +23,13 @@ struct ctl_table;
 struct ctl_table_header;
 struct task_struct;
 
+unsigned int __cgroup_bpf_run_lsm_sock(const void *ctx,
+				       const struct bpf_insn *insn);
+unsigned int __cgroup_bpf_run_lsm_socket(const void *ctx,
+					 const struct bpf_insn *insn);
+unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
+					  const struct bpf_insn *insn);
+
 #ifdef CONFIG_CGROUP_BPF
 
 #define CGROUP_ATYPE(type) \
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 0a8927103018..b4ab73c76140 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -762,6 +762,8 @@ u64 notrace __bpf_prog_enter(struct bpf_prog *prog);
 void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start);
 u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog);
 void notrace __bpf_prog_exit_sleepable(struct bpf_prog *prog, u64 start);
+u64 notrace __bpf_prog_enter_lsm_cgroup(struct bpf_prog *prog);
+void notrace __bpf_prog_exit_lsm_cgroup(struct bpf_prog *prog, u64 start);
 void notrace __bpf_tramp_enter(struct bpf_tramp_image *tr);
 void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr);
 
@@ -1029,6 +1031,7 @@ struct bpf_prog_aux {
 	u64 load_time; /* ns since boottime */
 	u32 verified_insns;
 	struct bpf_map *cgroup_storage[MAX_BPF_CGROUP_STORAGE_TYPE];
+	int cgroup_atype; /* enum cgroup_bpf_attach_type */
 	char name[BPF_OBJ_NAME_LEN];
 #ifdef CONFIG_SECURITY
 	void *security;
@@ -1166,6 +1169,9 @@ struct bpf_dummy_ops {
 int bpf_struct_ops_test_run(struct bpf_prog *prog, const union bpf_attr *kattr,
 			    union bpf_attr __user *uattr);
 #endif
+int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
+				    struct bpf_attach_target_info *tgt_info);
+void bpf_trampoline_unlink_cgroup_shim(struct bpf_prog *prog);
 #else
 static inline const struct bpf_struct_ops *bpf_struct_ops_find(u32 type_id)
 {
@@ -1189,6 +1195,14 @@ static inline int bpf_struct_ops_map_sys_lookup_elem(struct bpf_map *map,
 {
 	return -EINVAL;
 }
+static inline int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
+						  struct bpf_attach_target_info *tgt_info)
+{
+	return -EOPNOTSUPP;
+}
+static inline void bpf_trampoline_unlink_cgroup_shim(struct bpf_prog *prog)
+{
+}
 #endif
 
 struct bpf_array {
@@ -2338,6 +2352,7 @@ void *bpf_arch_text_copy(void *dst, void *src, size_t len);
 
 struct btf_id_set;
 bool btf_id_set_contains(const struct btf_id_set *set, u32 id);
+int btf_id_set_index(const struct btf_id_set *set, u32 id);
 
 #define MAX_BPRINTF_VARARGS		12
 
diff --git a/include/linux/bpf_lsm.h b/include/linux/bpf_lsm.h
index 479c101546ad..7f0e59f5f9be 100644
--- a/include/linux/bpf_lsm.h
+++ b/include/linux/bpf_lsm.h
@@ -42,6 +42,9 @@ extern const struct bpf_func_proto bpf_inode_storage_get_proto;
 extern const struct bpf_func_proto bpf_inode_storage_delete_proto;
 void bpf_inode_storage_free(struct inode *inode);
 
+int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog, bpf_func_t *bpf_func);
+int bpf_lsm_hook_idx(u32 btf_id);
+
 #else /* !CONFIG_BPF_LSM */
 
 static inline bool bpf_lsm_is_sleepable_hook(u32 btf_id)
@@ -65,6 +68,17 @@ static inline void bpf_inode_storage_free(struct inode *inode)
 {
 }
 
+static inline int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
+					   bpf_func_t *bpf_func)
+{
+	return -ENOENT;
+}
+
+static inline int bpf_lsm_hook_idx(u32 btf_id)
+{
+	return -EINVAL;
+}
+
 #endif /* CONFIG_BPF_LSM */
 
 #endif /* _LINUX_BPF_LSM_H */
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 444fe6f1cf35..112e396bbe65 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -998,6 +998,7 @@ enum bpf_attach_type {
 	BPF_SK_REUSEPORT_SELECT_OR_MIGRATE,
 	BPF_PERF_EVENT,
 	BPF_TRACE_KPROBE_MULTI,
+	BPF_LSM_CGROUP,
 	__MAX_BPF_ATTACH_TYPE
 };
 
diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
index 064eccba641d..a0e68ef5dfb1 100644
--- a/kernel/bpf/bpf_lsm.c
+++ b/kernel/bpf/bpf_lsm.c
@@ -16,6 +16,7 @@
 #include <linux/bpf_local_storage.h>
 #include <linux/btf_ids.h>
 #include <linux/ima.h>
+#include <linux/bpf-cgroup.h>
 
 /* For every LSM hook that allows attachment of BPF programs, declare a nop
  * function where a BPF program can be attached.
@@ -35,6 +36,66 @@ BTF_SET_START(bpf_lsm_hooks)
 #undef LSM_HOOK
 BTF_SET_END(bpf_lsm_hooks)
 
+/* List of LSM hooks that should operate on 'current' cgroup regardless
+ * of function signature.
+ */
+BTF_SET_START(bpf_lsm_current_hooks)
+/* operate on freshly allocated sk without any cgroup association */
+BTF_ID(func, bpf_lsm_sk_alloc_security)
+BTF_ID(func, bpf_lsm_sk_free_security)
+BTF_SET_END(bpf_lsm_current_hooks)
+
+int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
+			     bpf_func_t *bpf_func)
+{
+	const struct btf_type *first_arg_type;
+	const struct btf_type *sock_type;
+	const struct btf_type *sk_type;
+	const struct btf *btf_vmlinux;
+	const struct btf_param *args;
+	s32 type_id;
+
+	if (!prog->aux->attach_func_proto ||
+	    !btf_type_is_func_proto(prog->aux->attach_func_proto))
+		return -EINVAL;
+
+	if (btf_type_vlen(prog->aux->attach_func_proto) < 1 ||
+	    btf_id_set_contains(&bpf_lsm_current_hooks,
+				prog->aux->attach_btf_id)) {
+		*bpf_func = __cgroup_bpf_run_lsm_current;
+		return 0;
+	}
+
+	args = btf_params(prog->aux->attach_func_proto);
+
+	btf_vmlinux = bpf_get_btf_vmlinux();
+
+	type_id = btf_find_by_name_kind(btf_vmlinux, "socket", BTF_KIND_STRUCT);
+	if (type_id < 0)
+		return -EINVAL;
+	sock_type = btf_type_by_id(btf_vmlinux, type_id);
+
+	type_id = btf_find_by_name_kind(btf_vmlinux, "sock", BTF_KIND_STRUCT);
+	if (type_id < 0)
+		return -EINVAL;
+	sk_type = btf_type_by_id(btf_vmlinux, type_id);
+
+	first_arg_type = btf_type_resolve_ptr(btf_vmlinux, args[0].type, NULL);
+	if (first_arg_type == sock_type)
+		*bpf_func = __cgroup_bpf_run_lsm_socket;
+	else if (first_arg_type == sk_type)
+		*bpf_func = __cgroup_bpf_run_lsm_sock;
+	else
+		*bpf_func = __cgroup_bpf_run_lsm_current;
+
+	return 0;
+}
+
+int bpf_lsm_hook_idx(u32 btf_id)
+{
+	return btf_id_set_index(&bpf_lsm_hooks, btf_id);
+}
+
 int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
 			const struct bpf_prog *prog)
 {
@@ -141,6 +202,9 @@ bpf_lsm_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return prog->aux->sleepable ? &bpf_ima_inode_hash_proto : NULL;
 	case BPF_FUNC_ima_file_hash:
 		return prog->aux->sleepable ? &bpf_ima_file_hash_proto : NULL;
+	case BPF_FUNC_get_local_storage:
+		return prog->expected_attach_type == BPF_LSM_CGROUP ?
+			&bpf_get_local_storage_proto : NULL;
 	default:
 		return tracing_prog_func_proto(func_id, prog);
 	}
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 2f0b0440131c..a90f04a8a8ee 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -5248,6 +5248,7 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
 
 	if (arg == nr_args) {
 		switch (prog->expected_attach_type) {
+		case BPF_LSM_CGROUP:
 		case BPF_LSM_MAC:
 		case BPF_TRACE_FEXIT:
 			/* When LSM programs are attached to void LSM hooks
@@ -6726,6 +6727,16 @@ static int btf_id_cmp_func(const void *a, const void *b)
 	return *pa - *pb;
 }
 
+int btf_id_set_index(const struct btf_id_set *set, u32 id)
+{
+	const u32 *p;
+
+	p = bsearch(&id, set->ids, set->cnt, sizeof(u32), btf_id_cmp_func);
+	if (!p)
+		return -1;
+	return p - set->ids;
+}
+
 bool btf_id_set_contains(const struct btf_id_set *set, u32 id)
 {
 	return bsearch(&id, set->ids, set->cnt, sizeof(u32), btf_id_cmp_func) != NULL;
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 134785ab487c..9cc38454e402 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -14,6 +14,9 @@
 #include <linux/string.h>
 #include <linux/bpf.h>
 #include <linux/bpf-cgroup.h>
+#include <linux/btf_ids.h>
+#include <linux/bpf_lsm.h>
+#include <linux/bpf_verifier.h>
 #include <net/sock.h>
 #include <net/bpf_sk_storage.h>
 
@@ -61,6 +64,85 @@ bpf_prog_run_array_cg(const struct cgroup_bpf *cgrp,
 	return run_ctx.retval;
 }
 
+unsigned int __cgroup_bpf_run_lsm_sock(const void *ctx,
+				       const struct bpf_insn *insn)
+{
+	const struct bpf_prog *shim_prog;
+	struct sock *sk;
+	struct cgroup *cgrp;
+	int ret = 0;
+	u64 *regs;
+
+	regs = (u64 *)ctx;
+	sk = (void *)(unsigned long)regs[BPF_REG_0];
+	/*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
+	shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
+
+	cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
+	if (likely(cgrp))
+		ret = bpf_prog_run_array_cg(&cgrp->bpf,
+					    shim_prog->aux->cgroup_atype,
+					    ctx, bpf_prog_run, 0, NULL);
+	return ret;
+}
+
+unsigned int __cgroup_bpf_run_lsm_socket(const void *ctx,
+					 const struct bpf_insn *insn)
+{
+	const struct bpf_prog *shim_prog;
+	struct socket *sock;
+	struct cgroup *cgrp;
+	int ret = 0;
+	u64 *regs;
+
+	regs = (u64 *)ctx;
+	sock = (void *)(unsigned long)regs[BPF_REG_0];
+	/*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
+	shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
+
+	cgrp = sock_cgroup_ptr(&sock->sk->sk_cgrp_data);
+	if (likely(cgrp))
+		ret = bpf_prog_run_array_cg(&cgrp->bpf,
+					    shim_prog->aux->cgroup_atype,
+					    ctx, bpf_prog_run, 0, NULL);
+	return ret;
+}
+
+unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
+					  const struct bpf_insn *insn)
+{
+	const struct bpf_prog *shim_prog;
+	struct cgroup *cgrp;
+	int ret = 0;
+
+	if (unlikely(!current))
+		return 0;
+
+	/*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
+	shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
+
+	rcu_read_lock();
+	cgrp = task_dfl_cgroup(current);
+	if (likely(cgrp))
+		ret = bpf_prog_run_array_cg(&cgrp->bpf,
+					    shim_prog->aux->cgroup_atype,
+					    ctx, bpf_prog_run, 0, NULL);
+	rcu_read_unlock();
+	return ret;
+}
+
+#ifdef CONFIG_BPF_LSM
+static enum cgroup_bpf_attach_type bpf_lsm_attach_type_get(u32 attach_btf_id)
+{
+	return CGROUP_LSM_START + bpf_lsm_hook_idx(attach_btf_id);
+}
+#else
+static enum cgroup_bpf_attach_type bpf_lsm_attach_type_get(u32 attach_btf_id)
+{
+	return -EOPNOTSUPP;
+}
+#endif /* CONFIG_BPF_LSM */
+
 void cgroup_bpf_offline(struct cgroup *cgrp)
 {
 	cgroup_get(cgrp);
@@ -139,6 +221,11 @@ static void bpf_cgroup_link_auto_detach(struct bpf_cgroup_link *link)
 	link->cgroup = NULL;
 }
 
+static void bpf_cgroup_lsm_shim_release(struct bpf_prog *prog)
+{
+	bpf_trampoline_unlink_cgroup_shim(prog);
+}
+
 /**
  * cgroup_bpf_release() - put references of all bpf programs and
  *                        release all cgroup bpf data
@@ -163,10 +250,16 @@ static void cgroup_bpf_release(struct work_struct *work)
 
 		hlist_for_each_entry_safe(pl, pltmp, progs, node) {
 			hlist_del(&pl->node);
-			if (pl->prog)
+			if (pl->prog) {
+				if (atype == BPF_LSM_CGROUP)
+					bpf_cgroup_lsm_shim_release(pl->prog);
 				bpf_prog_put(pl->prog);
-			if (pl->link)
+			}
+			if (pl->link) {
+				if (atype == BPF_LSM_CGROUP)
+					bpf_cgroup_lsm_shim_release(pl->link->link.prog);
 				bpf_cgroup_link_auto_detach(pl->link);
+			}
 			kfree(pl);
 			static_branch_dec(&cgroup_bpf_enabled_key[atype]);
 		}
@@ -479,6 +572,7 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 	struct bpf_prog *old_prog = NULL;
 	struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE] = {};
 	struct bpf_cgroup_storage *new_storage[MAX_BPF_CGROUP_STORAGE_TYPE] = {};
+	struct bpf_attach_target_info tgt_info = {};
 	enum cgroup_bpf_attach_type atype;
 	struct bpf_prog_list *pl;
 	struct hlist_head *progs;
@@ -495,9 +589,34 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 		/* replace_prog implies BPF_F_REPLACE, and vice versa */
 		return -EINVAL;
 
-	atype = to_cgroup_bpf_attach_type(type);
-	if (atype < 0)
-		return -EINVAL;
+	if (type == BPF_LSM_CGROUP) {
+		struct bpf_prog *p = prog ? : link->link.prog;
+
+		if (replace_prog) {
+			/* Reusing shim from the original program. */
+			if (replace_prog->aux->attach_btf_id !=
+			    p->aux->attach_btf_id)
+				return -EINVAL;
+
+			atype = replace_prog->aux->cgroup_atype;
+		} else {
+			err = bpf_check_attach_target(NULL, p, NULL,
+						      p->aux->attach_btf_id,
+						      &tgt_info);
+			if (err)
+				return -EINVAL;
+
+			atype = bpf_lsm_attach_type_get(p->aux->attach_btf_id);
+			if (atype < 0)
+				return atype;
+		}
+
+		p->aux->cgroup_atype = atype;
+	} else {
+		atype = to_cgroup_bpf_attach_type(type);
+		if (atype < 0)
+			return -EINVAL;
+	}
 
 	progs = &cgrp->bpf.progs[atype];
 
@@ -549,9 +668,17 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 	bpf_cgroup_storages_assign(pl->storage, storage);
 	cgrp->bpf.flags[atype] = saved_flags;
 
+	if (type == BPF_LSM_CGROUP && !old_prog) {
+		struct bpf_prog *p = prog ? : link->link.prog;
+
+		err = bpf_trampoline_link_cgroup_shim(p, &tgt_info);
+		if (err)
+			goto cleanup;
+	}
+
 	err = update_effective_progs(cgrp, atype);
 	if (err)
-		goto cleanup;
+		goto cleanup_trampoline;
 
 	if (old_prog)
 		bpf_prog_put(old_prog);
@@ -560,6 +687,13 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 	bpf_cgroup_storages_link(new_storage, cgrp, type);
 	return 0;
 
+cleanup_trampoline:
+	if (type == BPF_LSM_CGROUP && !old_prog) {
+		struct bpf_prog *p = prog ? : link->link.prog;
+
+		bpf_trampoline_unlink_cgroup_shim(p);
+	}
+
 cleanup:
 	if (old_prog) {
 		pl->prog = old_prog;
@@ -651,9 +785,18 @@ static int __cgroup_bpf_replace(struct cgroup *cgrp,
 	struct hlist_head *progs;
 	bool found = false;
 
-	atype = to_cgroup_bpf_attach_type(link->type);
-	if (atype < 0)
-		return -EINVAL;
+	if (link->type == BPF_LSM_CGROUP) {
+		atype = link->link.prog->aux->cgroup_atype;
+
+		/* Reusing shim from the original program. */
+		if (new_prog->aux->attach_btf_id !=
+		    link->link.prog->aux->attach_btf_id)
+			return -EINVAL;
+	} else {
+		atype = to_cgroup_bpf_attach_type(link->type);
+		if (atype < 0)
+			return -EINVAL;
+	}
 
 	progs = &cgrp->bpf.progs[atype];
 
@@ -669,6 +812,9 @@ static int __cgroup_bpf_replace(struct cgroup *cgrp,
 	if (!found)
 		return -ENOENT;
 
+	if (link->type == BPF_LSM_CGROUP)
+		new_prog->aux->cgroup_atype = atype;
+
 	old_prog = xchg(&link->link.prog, new_prog);
 	replace_effective_prog(cgrp, atype, link);
 	bpf_prog_put(old_prog);
@@ -752,9 +898,15 @@ static int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
 	u32 flags;
 	int err;
 
-	atype = to_cgroup_bpf_attach_type(type);
-	if (atype < 0)
-		return -EINVAL;
+	if (type == BPF_LSM_CGROUP) {
+		struct bpf_prog *p = prog ? : link->link.prog;
+
+		atype = p->aux->cgroup_atype;
+	} else {
+		atype = to_cgroup_bpf_attach_type(type);
+		if (atype < 0)
+			return -EINVAL;
+	}
 
 	progs = &cgrp->bpf.progs[atype];
 	flags = cgrp->bpf.flags[atype];
@@ -776,6 +928,9 @@ static int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
 	if (err)
 		goto cleanup;
 
+	if (old_prog && old_prog->expected_attach_type == BPF_LSM_CGROUP)
+		bpf_cgroup_lsm_shim_release(old_prog);
+
 	/* now can actually delete it from this cgroup list */
 	hlist_del(&pl->node);
 
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index e0aead17dff4..53fe0ae9f8cf 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3421,6 +3421,8 @@ attach_type_to_prog_type(enum bpf_attach_type attach_type)
 		return BPF_PROG_TYPE_SK_LOOKUP;
 	case BPF_XDP:
 		return BPF_PROG_TYPE_XDP;
+	case BPF_LSM_CGROUP:
+		return BPF_PROG_TYPE_LSM;
 	default:
 		return BPF_PROG_TYPE_UNSPEC;
 	}
@@ -3474,6 +3476,11 @@ static int bpf_prog_attach(const union bpf_attr *attr)
 	case BPF_PROG_TYPE_CGROUP_SOCKOPT:
 	case BPF_PROG_TYPE_CGROUP_SYSCTL:
 	case BPF_PROG_TYPE_SOCK_OPS:
+	case BPF_PROG_TYPE_LSM:
+		if (ptype == BPF_PROG_TYPE_LSM &&
+		    prog->expected_attach_type != BPF_LSM_CGROUP)
+			return -EINVAL;
+
 		ret = cgroup_bpf_prog_attach(attr, ptype, prog);
 		break;
 	default:
@@ -3511,6 +3518,7 @@ static int bpf_prog_detach(const union bpf_attr *attr)
 	case BPF_PROG_TYPE_CGROUP_SOCKOPT:
 	case BPF_PROG_TYPE_CGROUP_SYSCTL:
 	case BPF_PROG_TYPE_SOCK_OPS:
+	case BPF_PROG_TYPE_LSM:
 		return cgroup_bpf_prog_detach(attr, ptype);
 	default:
 		return -EINVAL;
@@ -4543,6 +4551,8 @@ static int link_create(union bpf_attr *attr, bpfptr_t uattr)
 			ret = bpf_raw_tp_link_attach(prog, NULL);
 		else if (prog->expected_attach_type == BPF_TRACE_ITER)
 			ret = bpf_iter_link_attach(attr, uattr, prog);
+		else if (prog->expected_attach_type == BPF_LSM_CGROUP)
+			ret = cgroup_bpf_link_attach(attr, prog);
 		else
 			ret = bpf_tracing_prog_attach(prog,
 						      attr->link_create.target_fd,
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index 0c4fd194e801..77dfa517c47c 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -11,6 +11,8 @@
 #include <linux/rcupdate_wait.h>
 #include <linux/module.h>
 #include <linux/static_call.h>
+#include <linux/bpf_verifier.h>
+#include <linux/bpf_lsm.h>
 
 /* dummy _ops. The verifier will operate on target program's ops. */
 const struct bpf_verifier_ops bpf_extension_verifier_ops = {
@@ -485,6 +487,147 @@ int bpf_trampoline_unlink_prog(struct bpf_prog *prog, struct bpf_trampoline *tr)
 	return err;
 }
 
+#if defined(CONFIG_BPF_JIT) && defined(CONFIG_BPF_SYSCALL)
+static struct bpf_prog *cgroup_shim_alloc(const struct bpf_prog *prog,
+					  bpf_func_t bpf_func)
+{
+	struct bpf_prog *p;
+
+	p = bpf_prog_alloc(1, 0);
+	if (!p)
+		return NULL;
+
+	p->jited = false;
+	p->bpf_func = bpf_func;
+
+	p->aux->cgroup_atype = prog->aux->cgroup_atype;
+	p->aux->attach_func_proto = prog->aux->attach_func_proto;
+	p->aux->attach_btf_id = prog->aux->attach_btf_id;
+	p->aux->attach_btf = prog->aux->attach_btf;
+	btf_get(p->aux->attach_btf);
+	p->type = BPF_PROG_TYPE_LSM;
+	p->expected_attach_type = BPF_LSM_MAC;
+	bpf_prog_inc(p);
+
+	return p;
+}
+
+static struct bpf_prog *cgroup_shim_find(struct bpf_trampoline *tr,
+					 bpf_func_t bpf_func)
+{
+	const struct bpf_prog_aux *aux;
+	int kind;
+
+	for (kind = 0; kind < BPF_TRAMP_MAX; kind++) {
+		hlist_for_each_entry(aux, &tr->progs_hlist[kind], tramp_hlist) {
+			struct bpf_prog *p = aux->prog;
+
+			if (p->bpf_func == bpf_func)
+				return p;
+		}
+	}
+
+	return NULL;
+}
+
+int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
+				    struct bpf_attach_target_info *tgt_info)
+{
+	struct bpf_prog *shim_prog = NULL;
+	struct bpf_trampoline *tr;
+	bpf_func_t bpf_func;
+	u64 key;
+	int err;
+
+	key = bpf_trampoline_compute_key(NULL, prog->aux->attach_btf,
+					 prog->aux->attach_btf_id);
+
+	err = bpf_lsm_find_cgroup_shim(prog, &bpf_func);
+	if (err)
+		return err;
+
+	tr = bpf_trampoline_get(key, tgt_info);
+	if (!tr)
+		return  -ENOMEM;
+
+	mutex_lock(&tr->mutex);
+
+	shim_prog = cgroup_shim_find(tr, bpf_func);
+	if (shim_prog) {
+		/* Reusing existing shim attached by the other program. */
+		bpf_prog_inc(shim_prog);
+		mutex_unlock(&tr->mutex);
+		return 0;
+	}
+
+	/* Allocate and install new shim. */
+
+	shim_prog = cgroup_shim_alloc(prog, bpf_func);
+	if (!shim_prog) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	err = __bpf_trampoline_link_prog(shim_prog, tr);
+	if (err)
+		goto out;
+
+	mutex_unlock(&tr->mutex);
+
+	return 0;
+out:
+	if (shim_prog)
+		bpf_prog_put(shim_prog);
+
+	mutex_unlock(&tr->mutex);
+	return err;
+}
+
+void bpf_trampoline_unlink_cgroup_shim(struct bpf_prog *prog)
+{
+	struct bpf_prog *shim_prog;
+	struct bpf_trampoline *tr;
+	bpf_func_t bpf_func;
+	u64 key;
+	int err;
+
+	key = bpf_trampoline_compute_key(NULL, prog->aux->attach_btf,
+					 prog->aux->attach_btf_id);
+
+	err = bpf_lsm_find_cgroup_shim(prog, &bpf_func);
+	if (err)
+		return;
+
+	tr = bpf_trampoline_lookup(key);
+	if (!tr)
+		return;
+
+	mutex_lock(&tr->mutex);
+
+	shim_prog = cgroup_shim_find(tr, bpf_func);
+	if (shim_prog) {
+		/* We use shim_prog refcnt for tracking whether to
+		 * remove the shim program from the trampoline.
+		 * Trampoline's mutex is held while refcnt is
+		 * added/subtracted so we don't need to care about
+		 * potential races.
+		 */
+
+		if (atomic64_read(&shim_prog->aux->refcnt) == 1)
+			WARN_ON_ONCE(__bpf_trampoline_unlink_prog(shim_prog, tr));
+
+		bpf_prog_put(shim_prog);
+	}
+
+	mutex_unlock(&tr->mutex);
+
+	bpf_trampoline_put(tr); /* bpf_trampoline_lookup */
+
+	if (shim_prog)
+		bpf_trampoline_put(tr);
+}
+#endif
+
 struct bpf_trampoline *bpf_trampoline_get(u64 key,
 					  struct bpf_attach_target_info *tgt_info)
 {
@@ -609,6 +752,24 @@ void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start)
 	rcu_read_unlock();
 }
 
+u64 notrace __bpf_prog_enter_lsm_cgroup(struct bpf_prog *prog)
+	__acquires(RCU)
+{
+	/* Runtime stats are exported via actual BPF_LSM_CGROUP
+	 * programs, not the shims.
+	 */
+	rcu_read_lock();
+	migrate_disable();
+	return NO_START_TIME;
+}
+
+void notrace __bpf_prog_exit_lsm_cgroup(struct bpf_prog *prog, u64 start)
+	__releases(RCU)
+{
+	migrate_enable();
+	rcu_read_unlock();
+}
+
 u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog)
 {
 	rcu_read_lock_trace();
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 813f6ee80419..99703d96c579 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -14455,6 +14455,7 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
 		fallthrough;
 	case BPF_MODIFY_RETURN:
 	case BPF_LSM_MAC:
+	case BPF_LSM_CGROUP:
 	case BPF_TRACE_FENTRY:
 	case BPF_TRACE_FEXIT:
 		if (!btf_type_is_func(t)) {
@@ -14633,6 +14634,33 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
 	return 0;
 }
 
+static int check_used_maps(struct bpf_verifier_env *env)
+{
+	int i;
+
+	for (i = 0; i < env->used_map_cnt; i++) {
+		struct bpf_map *map = env->used_maps[i];
+
+		switch (env->prog->expected_attach_type) {
+		case BPF_LSM_CGROUP:
+			if (map->map_type != BPF_MAP_TYPE_CGROUP_STORAGE &&
+			    map->map_type != BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE)
+				break;
+
+			if (map->key_size != sizeof(__u64)) {
+				verbose(env, "only global cgroup local storage is supported for BPF_LSM_CGROUP\n");
+				return -EINVAL;
+			}
+
+			break;
+		default:
+			break;
+		}
+	}
+
+	return 0;
+}
+
 struct btf *bpf_get_btf_vmlinux(void)
 {
 	if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) {
@@ -14854,6 +14882,10 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr)
 		convert_pseudo_ld_imm64(env);
 	}
 
+	ret = check_used_maps(env);
+	if (ret < 0)
+		goto err_release_maps;
+
 	adjust_btf_func(env);
 
 err_release_maps:
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 444fe6f1cf35..112e396bbe65 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -998,6 +998,7 @@ enum bpf_attach_type {
 	BPF_SK_REUSEPORT_SELECT_OR_MIGRATE,
 	BPF_PERF_EVENT,
 	BPF_TRACE_KPROBE_MULTI,
+	BPF_LSM_CGROUP,
 	__MAX_BPF_ATTACH_TYPE
 };
 
-- 
2.36.0.464.gb9c8b46e94-goog



* [PATCH bpf-next v6 04/10] bpf: minimize number of allocated lsm slots per program
From: Stanislav Fomichev @ 2022-04-29 21:15 UTC
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

The previous patch adds a 1:1 mapping between all 211 LSM hooks
and the cgroup_bpf program array. Instead of reserving a slot per
possible hook, reserve 10 slots per cgroup for LSM programs.
Those slots are dynamically allocated on demand and reclaimed.

It should be possible to eventually extend this idea to all hooks if
their memory consumption becomes unacceptable, shrinking the overall
effective programs array.

struct cgroup_bpf {
	struct bpf_prog_array *    effective[33];        /*     0   264 */
	/* --- cacheline 4 boundary (256 bytes) was 8 bytes ago --- */
	struct hlist_head          progs[33];            /*   264   264 */
	/* --- cacheline 8 boundary (512 bytes) was 16 bytes ago --- */
	u8                         flags[33];            /*   528    33 */

	/* XXX 7 bytes hole, try to pack */

	struct list_head           storages;             /*   568    16 */
	/* --- cacheline 9 boundary (576 bytes) was 8 bytes ago --- */
	struct bpf_prog_array *    inactive;             /*   584     8 */
	struct percpu_ref          refcnt;               /*   592    16 */
	struct work_struct         release_work;         /*   608    72 */

	/* size: 680, cachelines: 11, members: 7 */
	/* sum members: 673, holes: 1, sum holes: 7 */
	/* last cacheline: 40 bytes */
};
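
For scale, the effect on just the effective[] pointer array on a
64-bit kernel (23 pre-existing attach types, 8-byte pointers):

  211 LSM slots: (23 + 211) * 8 = 1872 bytes per cgroup
   10 LSM slots: (23 +  10) * 8 =  264 bytes per cgroup

The latter matches the effective[33] line in the pahole output above.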

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 include/linux/bpf-cgroup-defs.h |   3 +-
 include/linux/bpf_lsm.h         |   6 --
 kernel/bpf/bpf_lsm.c            |   5 --
 kernel/bpf/cgroup.c             | 107 +++++++++++++++++++++++++++-----
 4 files changed, 94 insertions(+), 27 deletions(-)

diff --git a/include/linux/bpf-cgroup-defs.h b/include/linux/bpf-cgroup-defs.h
index d5a70a35dace..359d3f16abea 100644
--- a/include/linux/bpf-cgroup-defs.h
+++ b/include/linux/bpf-cgroup-defs.h
@@ -10,7 +10,8 @@
 
 struct bpf_prog_array;
 
-#define CGROUP_LSM_NUM 211 /* will be addressed in the next patch */
+/* Maximum number of concurrently attachable per-cgroup LSM hooks. */
+#define CGROUP_LSM_NUM 10
 
 enum cgroup_bpf_attach_type {
 	CGROUP_BPF_ATTACH_TYPE_INVALID = -1,
diff --git a/include/linux/bpf_lsm.h b/include/linux/bpf_lsm.h
index 7f0e59f5f9be..613de44aa429 100644
--- a/include/linux/bpf_lsm.h
+++ b/include/linux/bpf_lsm.h
@@ -43,7 +43,6 @@ extern const struct bpf_func_proto bpf_inode_storage_delete_proto;
 void bpf_inode_storage_free(struct inode *inode);
 
 int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog, bpf_func_t *bpf_func);
-int bpf_lsm_hook_idx(u32 btf_id);
 
 #else /* !CONFIG_BPF_LSM */
 
@@ -74,11 +73,6 @@ static inline int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
 	return -ENOENT;
 }
 
-static inline int bpf_lsm_hook_idx(u32 btf_id)
-{
-	return -EINVAL;
-}
-
 #endif /* CONFIG_BPF_LSM */
 
 #endif /* _LINUX_BPF_LSM_H */
diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
index a0e68ef5dfb1..1079c747e061 100644
--- a/kernel/bpf/bpf_lsm.c
+++ b/kernel/bpf/bpf_lsm.c
@@ -91,11 +91,6 @@ int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
 	return 0;
 }
 
-int bpf_lsm_hook_idx(u32 btf_id)
-{
-	return btf_id_set_index(&bpf_lsm_hooks, btf_id);
-}
-
 int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
 			const struct bpf_prog *prog)
 {
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 9cc38454e402..787ff6cf8d42 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -79,10 +79,13 @@ unsigned int __cgroup_bpf_run_lsm_sock(const void *ctx,
 	shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
 
 	cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
-	if (likely(cgrp))
+	if (likely(cgrp)) {
+		rcu_read_lock(); /* See bpf_lsm_attach_type_get(). */
 		ret = bpf_prog_run_array_cg(&cgrp->bpf,
 					    shim_prog->aux->cgroup_atype,
 					    ctx, bpf_prog_run, 0, NULL);
+		rcu_read_unlock();
+	}
 	return ret;
 }
 
@@ -101,10 +104,13 @@ unsigned int __cgroup_bpf_run_lsm_socket(const void *ctx,
 	shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
 
 	cgrp = sock_cgroup_ptr(&sock->sk->sk_cgrp_data);
-	if (likely(cgrp))
+	if (likely(cgrp)) {
+		rcu_read_lock(); /* See bpf_lsm_attach_type_get(). */
 		ret = bpf_prog_run_array_cg(&cgrp->bpf,
 					    shim_prog->aux->cgroup_atype,
 					    ctx, bpf_prog_run, 0, NULL);
+		rcu_read_unlock();
+	}
 	return ret;
 }
 
@@ -121,7 +127,7 @@ unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
 	/*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
 	shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
 
-	rcu_read_lock();
+	rcu_read_lock(); /* See bpf_lsm_attach_type_get(). */
 	cgrp = task_dfl_cgroup(current);
 	if (likely(cgrp))
 		ret = bpf_prog_run_array_cg(&cgrp->bpf,
@@ -132,15 +138,67 @@ unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
 }
 
 #ifdef CONFIG_BPF_LSM
+/* Readers are protected by rcu+synchronize_rcu.
+ * Writers are protected by cgroup_mutex.
+ */
+int cgroup_lsm_atype_usecnt[CGROUP_LSM_NUM];
+u32 cgroup_lsm_atype_btf_id[CGROUP_LSM_NUM];
+
 static enum cgroup_bpf_attach_type bpf_lsm_attach_type_get(u32 attach_btf_id)
 {
-	return CGROUP_LSM_START + bpf_lsm_hook_idx(attach_btf_id);
+	int i;
+
+	WARN_ON_ONCE(!mutex_is_locked(&cgroup_mutex));
+
+	for (i = 0; i < ARRAY_SIZE(cgroup_lsm_atype_btf_id); i++) {
+		if (cgroup_lsm_atype_btf_id[i] != attach_btf_id)
+			continue;
+
+		cgroup_lsm_atype_usecnt[i]++;
+		return CGROUP_LSM_START + i;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(cgroup_lsm_atype_usecnt); i++) {
+		if (cgroup_lsm_atype_usecnt[i] != 0)
+			continue;
+
+		cgroup_lsm_atype_btf_id[i] = attach_btf_id;
+		cgroup_lsm_atype_usecnt[i] = 1;
+		return CGROUP_LSM_START + i;
+	}
+
+	return -E2BIG;
+}
+
+static void bpf_lsm_attach_type_put(u32 attach_btf_id)
+{
+	int i;
+
+	WARN_ON_ONCE(!mutex_is_locked(&cgroup_mutex));
+
+	for (i = 0; i < ARRAY_SIZE(cgroup_lsm_atype_btf_id); i++) {
+		if (cgroup_lsm_atype_btf_id[i] != attach_btf_id)
+			continue;
+
+		if (--cgroup_lsm_atype_usecnt[i] <= 0) {
+			/* Wait for any existing users to finish. */
+			synchronize_rcu();
+			WARN_ON_ONCE(cgroup_lsm_atype_usecnt[i] < 0);
+		}
+		return;
+	}
+
+	WARN_ON_ONCE(1);
 }
 #else
 static enum cgroup_bpf_attach_type bpf_lsm_attach_type_get(u32 attach_btf_id)
 {
 	return -EOPNOTSUPP;
 }
+
+static void bpf_lsm_attach_type_put(u32 attach_btf_id)
+{
+}
 #endif /* CONFIG_BPF_LSM */
 
 void cgroup_bpf_offline(struct cgroup *cgrp)
@@ -224,6 +282,7 @@ static void bpf_cgroup_link_auto_detach(struct bpf_cgroup_link *link)
 static void bpf_cgroup_lsm_shim_release(struct bpf_prog *prog)
 {
 	bpf_trampoline_unlink_cgroup_shim(prog);
+	bpf_lsm_attach_type_put(prog->aux->attach_btf_id);
 }
 
 /**
@@ -620,27 +679,37 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 
 	progs = &cgrp->bpf.progs[atype];
 
-	if (!hierarchy_allows_attach(cgrp, atype))
-		return -EPERM;
+	if (!hierarchy_allows_attach(cgrp, atype)) {
+		err = -EPERM;
+		goto cleanup_attach_type;
+	}
 
-	if (!hlist_empty(progs) && cgrp->bpf.flags[atype] != saved_flags)
+	if (!hlist_empty(progs) && cgrp->bpf.flags[atype] != saved_flags) {
 		/* Disallow attaching non-overridable on top
 		 * of existing overridable in this cgroup.
 		 * Disallow attaching multi-prog if overridable or none
 		 */
-		return -EPERM;
+		err = -EPERM;
+		goto cleanup_attach_type;
+	}
 
-	if (prog_list_length(progs) >= BPF_CGROUP_MAX_PROGS)
-		return -E2BIG;
+	if (prog_list_length(progs) >= BPF_CGROUP_MAX_PROGS) {
+		err = -E2BIG;
+		goto cleanup_attach_type;
+	}
 
 	pl = find_attach_entry(progs, prog, link, replace_prog,
 			       flags & BPF_F_ALLOW_MULTI);
-	if (IS_ERR(pl))
-		return PTR_ERR(pl);
+	if (IS_ERR(pl)) {
+		err = PTR_ERR(pl);
+		goto cleanup_attach_type;
+	}
 
 	if (bpf_cgroup_storages_alloc(storage, new_storage, type,
-				      prog ? : link->link.prog, cgrp))
-		return -ENOMEM;
+				      prog ? : link->link.prog, cgrp)) {
+		err = -ENOMEM;
+		goto cleanup_attach_type;
+	}
 
 	if (pl) {
 		old_prog = pl->prog;
@@ -650,7 +719,8 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 		pl = kmalloc(sizeof(*pl), GFP_KERNEL);
 		if (!pl) {
 			bpf_cgroup_storages_free(new_storage);
-			return -ENOMEM;
+			err = -ENOMEM;
+			goto cleanup_attach_type;
 		}
 		if (hlist_empty(progs))
 			hlist_add_head(&pl->node, progs);
@@ -704,6 +774,13 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
 		hlist_del(&pl->node);
 		kfree(pl);
 	}
+
+cleanup_attach_type:
+	if (type == BPF_LSM_CGROUP) {
+		struct bpf_prog *p = prog ? : link->link.prog;
+
+		bpf_lsm_attach_type_put(p->aux->attach_btf_id);
+	}
 	return err;
 }
 
-- 
2.36.0.464.gb9c8b46e94-goog


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH bpf-next v6 05/10] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-04-29 21:15 [PATCH bpf-next v6 00/10] bpf: cgroup_sock lsm flavor Stanislav Fomichev
                   ` (3 preceding siblings ...)
  2022-04-29 21:15 ` [PATCH bpf-next v6 04/10] bpf: minimize number of allocated lsm slots per program Stanislav Fomichev
@ 2022-04-29 21:15 ` Stanislav Fomichev
  2022-05-07  0:12   ` Martin KaFai Lau
  2022-05-09 21:49   ` Andrii Nakryiko
  2022-04-29 21:15 ` [PATCH bpf-next v6 06/10] bpf: allow writing to a subset of sock fields from lsm progtype Stanislav Fomichev
                   ` (4 subsequent siblings)
  9 siblings, 2 replies; 33+ messages in thread
From: Stanislav Fomichev @ 2022-04-29 21:15 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

We have two options:
1. Treat all BPF_LSM_CGROUP as the same, regardless of attach_btf_id
2. Treat BPF_LSM_CGROUP+attach_btf_id as a separate hook point

I'm doing (2) here and adding attach_btf_id as a new BPF_PROG_QUERY
argument. The downside is that it requires iterating over all possible
bpf_lsm_ hook points in userspace, which might take some time.
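
For reference, a minimal userspace sketch of that iteration (it assumes
the bpf_prog_query2() and struct bpf_prog_query_opts additions from this
patch; error handling trimmed):

#include <stdio.h>
#include <string.h>
#include <bpf/bpf.h>
#include <bpf/btf.h>
#include <bpf/libbpf.h>

/* Walk all bpf_lsm_* BTF FUNCs in vmlinux and query each attach_btf_id
 * as its own BPF_LSM_CGROUP hook point.
 */
static void query_all_lsm_cgroup(int cgroup_fd)
{
	struct btf *btf = btf__load_vmlinux_btf();
	__u32 i;

	if (libbpf_get_error(btf))
		return;

	for (i = 1; i < btf__type_cnt(btf); i++) {
		LIBBPF_OPTS(bpf_prog_query_opts, p, .attach_btf_id = i);
		const struct btf_type *t = btf__type_by_id(btf, i);
		const char *name = btf__name_by_offset(btf, t->name_off);

		if (btf_kind(t) != BTF_KIND_FUNC)
			continue;
		if (!name || strncmp(name, "bpf_lsm_", 8) != 0)
			continue;

		/* prog_ids == NULL: only the count comes back. */
		if (!bpf_prog_query2(cgroup_fd, BPF_LSM_CGROUP, &p) &&
		    p.prog_cnt)
			printf("%s: %u prog(s)\n", name, p.prog_cnt);
	}

	btf__free(btf);
}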

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 include/uapi/linux/bpf.h       |  1 +
 kernel/bpf/cgroup.c            | 43 ++++++++++++++++++++++++----------
 kernel/bpf/syscall.c           |  3 ++-
 tools/include/uapi/linux/bpf.h |  1 +
 tools/lib/bpf/bpf.c            | 42 ++++++++++++++++++++++++++-------
 tools/lib/bpf/bpf.h            | 15 ++++++++++++
 tools/lib/bpf/libbpf.map       |  1 +
 7 files changed, 85 insertions(+), 21 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 112e396bbe65..e38ea0b47b6a 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1431,6 +1431,7 @@ union bpf_attr {
 		__u32		attach_flags;
 		__aligned_u64	prog_ids;
 		__u32		prog_cnt;
+		__u32		attach_btf_id;	/* for BPF_LSM_CGROUP */
 	} query;
 
 	struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 787ff6cf8d42..0add465c06b9 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -1051,9 +1051,15 @@ static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
 	int cnt, ret = 0, i;
 	u32 flags;
 
-	atype = to_cgroup_bpf_attach_type(type);
-	if (atype < 0)
-		return -EINVAL;
+	if (type == BPF_LSM_CGROUP) {
+		atype = bpf_lsm_attach_type_get(attr->query.attach_btf_id);
+		if (atype < 0)
+			return atype;
+	} else {
+		atype = to_cgroup_bpf_attach_type(type);
+		if (atype < 0)
+			return -EINVAL;
+	}
 
 	progs = &cgrp->bpf.progs[atype];
 	flags = cgrp->bpf.flags[atype];
@@ -1066,20 +1072,26 @@ static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
 	else
 		cnt = prog_list_length(progs);
 
-	if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags)))
-		return -EFAULT;
-	if (copy_to_user(&uattr->query.prog_cnt, &cnt, sizeof(cnt)))
-		return -EFAULT;
-	if (attr->query.prog_cnt == 0 || !prog_ids || !cnt)
+	if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags))) {
+		ret = -EFAULT;
+		goto cleanup_attach_type;
+	}
+	if (copy_to_user(&uattr->query.prog_cnt, &cnt, sizeof(cnt))) {
+		ret = -EFAULT;
+		goto cleanup_attach_type;
+	}
+	if (attr->query.prog_cnt == 0 || !prog_ids || !cnt) {
 		/* return early if user requested only program count + flags */
-		return 0;
+		ret = 0;
+		goto cleanup_attach_type;
+	}
 	if (attr->query.prog_cnt < cnt) {
 		cnt = attr->query.prog_cnt;
 		ret = -ENOSPC;
 	}
 
 	if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) {
-		return bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
+		ret = bpf_prog_array_copy_to_user(effective, prog_ids, cnt);
 	} else {
 		struct bpf_prog_list *pl;
 		u32 id;
@@ -1088,12 +1100,19 @@ static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
 		hlist_for_each_entry(pl, progs, node) {
 			prog = prog_list_prog(pl);
 			id = prog->aux->id;
-			if (copy_to_user(prog_ids + i, &id, sizeof(id)))
-				return -EFAULT;
+			if (copy_to_user(prog_ids + i, &id, sizeof(id))) {
+				ret = -EFAULT;
+				goto cleanup_attach_type;
+			}
 			if (++i == cnt)
 				break;
 		}
 	}
+
+cleanup_attach_type:
+	if (type == BPF_LSM_CGROUP)
+		bpf_lsm_attach_type_put(attr->query.attach_btf_id);
+
 	return ret;
 }
 
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 53fe0ae9f8cf..cdf1937b71ca 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3525,7 +3525,7 @@ static int bpf_prog_detach(const union bpf_attr *attr)
 	}
 }
 
-#define BPF_PROG_QUERY_LAST_FIELD query.prog_cnt
+#define BPF_PROG_QUERY_LAST_FIELD query.attach_btf_id
 
 static int bpf_prog_query(const union bpf_attr *attr,
 			  union bpf_attr __user *uattr)
@@ -3561,6 +3561,7 @@ static int bpf_prog_query(const union bpf_attr *attr,
 	case BPF_CGROUP_SYSCTL:
 	case BPF_CGROUP_GETSOCKOPT:
 	case BPF_CGROUP_SETSOCKOPT:
+	case BPF_LSM_CGROUP:
 		return cgroup_bpf_prog_query(attr, uattr);
 	case BPF_LIRC_MODE2:
 		return lirc_prog_query(attr, uattr);
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 112e396bbe65..e38ea0b47b6a 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1431,6 +1431,7 @@ union bpf_attr {
 		__u32		attach_flags;
 		__aligned_u64	prog_ids;
 		__u32		prog_cnt;
+		__u32		attach_btf_id;	/* for BPF_LSM_CGROUP */
 	} query;
 
 	struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index a9d292c106c2..f62823451b99 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -946,28 +946,54 @@ int bpf_iter_create(int link_fd)
 	return libbpf_err_errno(fd);
 }
 
-int bpf_prog_query(int target_fd, enum bpf_attach_type type, __u32 query_flags,
-		   __u32 *attach_flags, __u32 *prog_ids, __u32 *prog_cnt)
+int bpf_prog_query2(int target_fd,
+		    enum bpf_attach_type type,
+		    struct bpf_prog_query_opts *opts)
 {
 	union bpf_attr attr;
 	int ret;
 
 	memset(&attr, 0, sizeof(attr));
+
+	if (!OPTS_VALID(opts, bpf_prog_query_opts))
+		return libbpf_err(-EINVAL);
+
 	attr.query.target_fd	= target_fd;
 	attr.query.attach_type	= type;
-	attr.query.query_flags	= query_flags;
-	attr.query.prog_cnt	= *prog_cnt;
-	attr.query.prog_ids	= ptr_to_u64(prog_ids);
+	attr.query.query_flags	= OPTS_GET(opts, query_flags, 0);
+	attr.query.prog_cnt	= OPTS_GET(opts, prog_cnt, 0);
+	attr.query.prog_ids	= ptr_to_u64(OPTS_GET(opts, prog_ids, NULL));
+	attr.query.attach_btf_id = OPTS_GET(opts, attach_btf_id, 0);
 
 	ret = sys_bpf(BPF_PROG_QUERY, &attr, sizeof(attr));
 
-	if (attach_flags)
-		*attach_flags = attr.query.attach_flags;
-	*prog_cnt = attr.query.prog_cnt;
+	if (OPTS_HAS(opts, prog_cnt))
+		opts->prog_cnt = attr.query.prog_cnt;
+	if (OPTS_HAS(opts, attach_flags))
+		opts->attach_flags = attr.query.attach_flags;
 
 	return libbpf_err_errno(ret);
 }
 
+int bpf_prog_query(int target_fd, enum bpf_attach_type type, __u32 query_flags,
+		   __u32 *attach_flags, __u32 *prog_ids, __u32 *prog_cnt)
+{
+	LIBBPF_OPTS(bpf_prog_query_opts, p);
+	int ret;
+
+	p.query_flags = query_flags;
+	p.prog_ids = prog_ids;
+	p.prog_cnt = *prog_cnt;
+
+	ret = bpf_prog_query2(target_fd, type, &p);
+
+	if (attach_flags)
+		*attach_flags = p.attach_flags;
+	*prog_cnt = p.prog_cnt;
+
+	return ret;
+}
+
 int bpf_prog_test_run(int prog_fd, int repeat, void *data, __u32 size,
 		      void *data_out, __u32 *size_out, __u32 *retval,
 		      __u32 *duration)
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index f4b4afb6d4ba..6b1ff77ac984 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -480,9 +480,24 @@ LIBBPF_API int bpf_map_get_fd_by_id(__u32 id);
 LIBBPF_API int bpf_btf_get_fd_by_id(__u32 id);
 LIBBPF_API int bpf_link_get_fd_by_id(__u32 id);
 LIBBPF_API int bpf_obj_get_info_by_fd(int bpf_fd, void *info, __u32 *info_len);
+
+struct bpf_prog_query_opts {
+	size_t sz; /* size of this struct for forward/backward compatibility */
+	__u32 query_flags;
+	__u32 attach_flags; /* output argument */
+	__u32 *prog_ids;
+	__u32 prog_cnt; /* input+output argument */
+	__u32 attach_btf_id;
+};
+#define bpf_prog_query_opts__last_field attach_btf_id
+
+LIBBPF_API int bpf_prog_query2(int target_fd,
+			       enum bpf_attach_type type,
+			       struct bpf_prog_query_opts *opts);
 LIBBPF_API int bpf_prog_query(int target_fd, enum bpf_attach_type type,
 			      __u32 query_flags, __u32 *attach_flags,
 			      __u32 *prog_ids, __u32 *prog_cnt);
+
 LIBBPF_API int bpf_raw_tracepoint_open(const char *name, int prog_fd);
 LIBBPF_API int bpf_task_fd_query(int pid, int fd, __u32 flags, char *buf,
 				 __u32 *buf_len, __u32 *prog_id, __u32 *fd_type,
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index b5bc84039407..5e5bb3e437cc 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -450,4 +450,5 @@ LIBBPF_0.8.0 {
 		bpf_program__attach_usdt;
 		libbpf_register_prog_handler;
 		libbpf_unregister_prog_handler;
+		bpf_prog_query2;
 } LIBBPF_0.7.0;
-- 
2.36.0.464.gb9c8b46e94-goog


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH bpf-next v6 06/10] bpf: allow writing to a subset of sock fields from lsm progtype
  2022-04-29 21:15 [PATCH bpf-next v6 00/10] bpf: cgroup_sock lsm flavor Stanislav Fomichev
                   ` (4 preceding siblings ...)
  2022-04-29 21:15 ` [PATCH bpf-next v6 05/10] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP Stanislav Fomichev
@ 2022-04-29 21:15 ` Stanislav Fomichev
  2022-04-29 21:15 ` [PATCH bpf-next v6 07/10] libbpf: add lsm_cgroup_sock type Stanislav Fomichev
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 33+ messages in thread
From: Stanislav Fomichev @ 2022-04-29 21:15 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

For now, allow only the obvious ones, like sk_priority and sk_mark.
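
As an illustration, a minimal program sketch; it relies on the
lsm_cgroup/ section prefix added to libbpf later in this series, and
the program name is arbitrary:

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

char _license[] SEC("license") = "GPL";

SEC("lsm_cgroup/socket_post_create")
int BPF_PROG(set_sk_defaults, struct socket *sock, int family,
	     int type, int protocol, int kern)
{
	struct sock *sk = sock->sk;

	if (!sk)
		return 1;

	sk->sk_priority = 123;	/* writable after this patch */
	sk->sk_mark = 42;	/* writable after this patch */
	/* Stores to any other field, e.g. sk->sk_sndbuf, are rejected
	 * with "no write support to 'struct sock' at off ...".
	 */
	return 1;	/* allow */
}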

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 kernel/bpf/bpf_lsm.c  | 58 +++++++++++++++++++++++++++++++++++++++++++
 kernel/bpf/verifier.c |  3 ++-
 2 files changed, 60 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
index 1079c747e061..64406d39e861 100644
--- a/kernel/bpf/bpf_lsm.c
+++ b/kernel/bpf/bpf_lsm.c
@@ -302,7 +302,65 @@ bool bpf_lsm_is_sleepable_hook(u32 btf_id)
 const struct bpf_prog_ops lsm_prog_ops = {
 };
 
+static int lsm_btf_struct_access(struct bpf_verifier_log *log,
+					const struct btf *btf,
+					const struct btf_type *t, int off,
+					int size, enum bpf_access_type atype,
+					u32 *next_btf_id,
+					enum bpf_type_flag *flag)
+{
+	const struct btf_type *sock_type;
+	struct btf *btf_vmlinux;
+	s32 type_id;
+	size_t end;
+
+	if (atype == BPF_READ)
+		return btf_struct_access(log, btf, t, off, size, atype, next_btf_id,
+					 flag);
+
+	btf_vmlinux = bpf_get_btf_vmlinux();
+	if (!btf_vmlinux) {
+		bpf_log(log, "no vmlinux btf\n");
+		return -EOPNOTSUPP;
+	}
+
+	type_id = btf_find_by_name_kind(btf_vmlinux, "sock", BTF_KIND_STRUCT);
+	if (type_id < 0) {
+		bpf_log(log, "'struct sock' not found in vmlinux btf\n");
+		return -EINVAL;
+	}
+
+	sock_type = btf_type_by_id(btf_vmlinux, type_id);
+
+	if (t != sock_type) {
+		bpf_log(log, "only 'struct sock' writes are supported\n");
+		return -EACCES;
+	}
+
+	switch (off) {
+	case bpf_ctx_range(struct sock, sk_priority):
+		end = offsetofend(struct sock, sk_priority);
+		break;
+	case bpf_ctx_range(struct sock, sk_mark):
+		end = offsetofend(struct sock, sk_mark);
+		break;
+	default:
+		bpf_log(log, "no write support to 'struct sock' at off %d\n", off);
+		return -EACCES;
+	}
+
+	if (off + size > end) {
+		bpf_log(log,
+			"write access at off %d with size %d beyond the member of 'struct sock' ended at %zu\n",
+			off, size, end);
+		return -EACCES;
+	}
+
+	return NOT_INIT;
+}
+
 const struct bpf_verifier_ops lsm_verifier_ops = {
 	.get_func_proto = bpf_lsm_func_proto,
 	.is_valid_access = btf_ctx_access,
+	.btf_struct_access = lsm_btf_struct_access,
 };
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 99703d96c579..2ada6fd48638 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13130,7 +13130,8 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
 				insn->code = BPF_LDX | BPF_PROBE_MEM |
 					BPF_SIZE((insn)->code);
 				env->prog->aux->num_exentries++;
-			} else if (resolve_prog_type(env->prog) != BPF_PROG_TYPE_STRUCT_OPS) {
+			} else if (resolve_prog_type(env->prog) != BPF_PROG_TYPE_STRUCT_OPS &&
+				   resolve_prog_type(env->prog) != BPF_PROG_TYPE_LSM) {
 				verbose(env, "Writes through BTF pointers are not allowed\n");
 				return -EINVAL;
 			}
-- 
2.36.0.464.gb9c8b46e94-goog


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH bpf-next v6 07/10] libbpf: add lsm_cgroup_sock type
  2022-04-29 21:15 [PATCH bpf-next v6 00/10] bpf: cgroup_sock lsm flavor Stanislav Fomichev
                   ` (5 preceding siblings ...)
  2022-04-29 21:15 ` [PATCH bpf-next v6 06/10] bpf: allow writing to a subset of sock fields from lsm progtype Stanislav Fomichev
@ 2022-04-29 21:15 ` Stanislav Fomichev
  2022-04-29 21:15 ` [PATCH bpf-next v6 08/10] bpftool: implement cgroup tree for BPF_LSM_CGROUP Stanislav Fomichev
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 33+ messages in thread
From: Stanislav Fomichev @ 2022-04-29 21:15 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

lsm_cgroup/ is the prefix for BPF_LSM_CGROUP.
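
For example, a section like the following now loads as prog type LSM
with expected attach type BPF_LSM_CGROUP, with attach_btf_id resolved
from the bpf_lsm_socket_bind BTF FUNC (minimal sketch; the program name
is arbitrary):

SEC("lsm_cgroup/socket_bind")
int BPF_PROG(allow_bind, struct socket *sock, struct sockaddr *address,
	     int addrlen)
{
	return 1;	/* allow */
}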

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/lib/bpf/libbpf.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 63c0f412266c..3177bf2fbf5e 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -8967,6 +8967,7 @@ static const struct bpf_sec_def section_defs[] = {
 	SEC_DEF("fmod_ret.s+",		TRACING, BPF_MODIFY_RETURN, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
 	SEC_DEF("fexit.s+",		TRACING, BPF_TRACE_FEXIT, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
 	SEC_DEF("freplace+",		EXT, 0, SEC_ATTACH_BTF, attach_trace),
+	SEC_DEF("lsm_cgroup+",		LSM, BPF_LSM_CGROUP, SEC_ATTACH_BTF),
 	SEC_DEF("lsm+",			LSM, BPF_LSM_MAC, SEC_ATTACH_BTF, attach_lsm),
 	SEC_DEF("lsm.s+",		LSM, BPF_LSM_MAC, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_lsm),
 	SEC_DEF("iter+",		TRACING, BPF_TRACE_ITER, SEC_ATTACH_BTF, attach_iter),
@@ -9390,6 +9391,7 @@ void btf_get_kernel_prefix_kind(enum bpf_attach_type attach_type,
 		*kind = BTF_KIND_TYPEDEF;
 		break;
 	case BPF_LSM_MAC:
+	case BPF_LSM_CGROUP:
 		*prefix = BTF_LSM_PREFIX;
 		*kind = BTF_KIND_FUNC;
 		break;
-- 
2.36.0.464.gb9c8b46e94-goog


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH bpf-next v6 08/10] bpftool: implement cgroup tree for BPF_LSM_CGROUP
  2022-04-29 21:15 [PATCH bpf-next v6 00/10] bpf: cgroup_sock lsm flavor Stanislav Fomichev
                   ` (6 preceding siblings ...)
  2022-04-29 21:15 ` [PATCH bpf-next v6 07/10] libbpf: add lsm_cgroup_sock type Stanislav Fomichev
@ 2022-04-29 21:15 ` Stanislav Fomichev
  2022-04-29 21:15 ` [PATCH bpf-next v6 09/10] selftests/bpf: lsm_cgroup functional test Stanislav Fomichev
  2022-04-29 21:15 ` [PATCH bpf-next v6 10/10] selftests/bpf: verify lsm_cgroup struct sock access Stanislav Fomichev
  9 siblings, 0 replies; 33+ messages in thread
From: Stanislav Fomichev @ 2022-04-29 21:15 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

$ bpftool --nomount prog loadall $KDIR/tools/testing/selftests/bpf/lsm_cgroup.o /sys/fs/bpf/x
$ bpftool cgroup attach /sys/fs/cgroup lsm_cgroup pinned /sys/fs/bpf/x/socket_alloc
$ bpftool cgroup attach /sys/fs/cgroup lsm_cgroup pinned /sys/fs/bpf/x/socket_bind
$ bpftool cgroup attach /sys/fs/cgroup lsm_cgroup pinned /sys/fs/bpf/x/socket_clone
$ bpftool cgroup attach /sys/fs/cgroup lsm_cgroup pinned /sys/fs/bpf/x/socket_post_create
$ bpftool cgroup tree
CgroupPath
ID       AttachType      AttachFlags     Name
/sys/fs/cgroup
6        lsm_cgroup                      socket_post_create bpf_lsm_socket_post_create 33733
8        lsm_cgroup                      socket_bind     bpf_lsm_socket_bind 33737
10       lsm_cgroup                      socket_alloc    bpf_lsm_sk_alloc_security 33761
11       lsm_cgroup                      socket_clone    bpf_lsm_inet_csk_clone 33772

$ bpftool cgroup detach /sys/fs/cgroup lsm_cgroup pinned /sys/fs/bpf/x/socket_post_create
$ bpftool cgroup tree
CgroupPath
ID       AttachType      AttachFlags     Name
/sys/fs/cgroup
8        lsm_cgroup                      socket_bind     bpf_lsm_socket_bind 33737
10       lsm_cgroup                      socket_alloc    bpf_lsm_sk_alloc_security 33761
11       lsm_cgroup                      socket_clone    bpf_lsm_inet_csk_clone 33772

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/bpf/bpftool/cgroup.c | 138 +++++++++++++++++++++++++++++--------
 tools/bpf/bpftool/common.c |   1 +
 2 files changed, 112 insertions(+), 27 deletions(-)

diff --git a/tools/bpf/bpftool/cgroup.c b/tools/bpf/bpftool/cgroup.c
index effe136119d7..d271600c76dd 100644
--- a/tools/bpf/bpftool/cgroup.c
+++ b/tools/bpf/bpftool/cgroup.c
@@ -15,6 +15,7 @@
 #include <unistd.h>
 
 #include <bpf/bpf.h>
+#include <bpf/btf.h>
 
 #include "main.h"
 
@@ -32,6 +33,41 @@
 	"                        sock_release }"
 
 static unsigned int query_flags;
+static struct btf *btf_vmlinux;
+static __u32 bpf_lsm_id[1024];
+static int bpf_lsm_num;
+
+static void load_btf_lsm(void)
+{
+	__u32 i;
+
+	btf_vmlinux = libbpf_find_kernel_btf();
+	if (libbpf_get_error(btf_vmlinux))
+		goto no_btf;
+
+	for (i = 1; i < btf__type_cnt(btf_vmlinux); i++) {
+		const struct btf_type *t = btf__type_by_id(btf_vmlinux, i);
+		const char *name = btf__name_by_offset(btf_vmlinux,
+						       t->name_off);
+
+		if (btf_kind(t) != BTF_KIND_FUNC)
+			continue;
+
+		if (name && strstr(name, "bpf_lsm_") != name)
+			continue;
+
+		bpf_lsm_id[bpf_lsm_num++] = i;
+	}
+
+	if (!bpf_lsm_num)
+		goto no_btf;
+
+	return;
+
+no_btf:
+	bpf_lsm_num = 1;
+	bpf_lsm_id[0] = 0;
+}
 
 static enum bpf_attach_type parse_attach_type(const char *str)
 {
@@ -48,9 +84,11 @@ static enum bpf_attach_type parse_attach_type(const char *str)
 
 static int show_bpf_prog(int id, enum bpf_attach_type attach_type,
 			 const char *attach_flags_str,
+			 __u32 attach_btf_id,
 			 int level)
 {
 	char prog_name[MAX_PROG_FULL_NAME];
+	const char *attach_btf_name = NULL;
 	struct bpf_prog_info info = {};
 	__u32 info_len = sizeof(info);
 	int prog_fd;
@@ -64,6 +102,16 @@ static int show_bpf_prog(int id, enum bpf_attach_type attach_type,
 		return -1;
 	}
 
+	if (attach_btf_id) {
+		if (btf_vmlinux &&
+		    attach_btf_id < btf__type_cnt(btf_vmlinux)) {
+			const struct btf_type *t = btf__type_by_id(btf_vmlinux,
+								   attach_btf_id);
+			attach_btf_name = btf__name_by_offset(btf_vmlinux,
+							      t->name_off);
+		}
+	}
+
 	get_prog_full_name(&info, prog_fd, prog_name, sizeof(prog_name));
 	if (json_output) {
 		jsonw_start_object(json_wtr);
@@ -76,6 +124,10 @@ static int show_bpf_prog(int id, enum bpf_attach_type attach_type,
 		jsonw_string_field(json_wtr, "attach_flags",
 				   attach_flags_str);
 		jsonw_string_field(json_wtr, "name", prog_name);
+		if (attach_btf_name)
+			jsonw_string_field(json_wtr, "attach_btf_name", attach_btf_name);
+		if (attach_btf_id)
+			jsonw_uint_field(json_wtr, "attach_btf_id", attach_btf_id);
 		jsonw_end_object(json_wtr);
 	} else {
 		printf("%s%-8u ", level ? "    " : "", info.id);
@@ -83,65 +135,82 @@ static int show_bpf_prog(int id, enum bpf_attach_type attach_type,
 			printf("%-15s", attach_type_name[attach_type]);
 		else
 			printf("type %-10u", attach_type);
-		printf(" %-15s %-15s\n", attach_flags_str, prog_name);
+		printf(" %-15s %-15s", attach_flags_str, prog_name);
+		if (attach_btf_name)
+			printf(" %-15s", attach_btf_name);
+		if (attach_btf_id)
+			printf(" %d", attach_btf_id);
+		printf("\n");
 	}
 
 	close(prog_fd);
 	return 0;
 }
 
-static int count_attached_bpf_progs(int cgroup_fd, enum bpf_attach_type type)
+static int count_attached_bpf_progs(int cgroup_fd, enum bpf_attach_type type,
+				    __u32 attach_btf_id)
 {
-	__u32 prog_cnt = 0;
+	LIBBPF_OPTS(bpf_prog_query_opts, p);
 	int ret;
 
-	ret = bpf_prog_query(cgroup_fd, type, query_flags, NULL,
-			     NULL, &prog_cnt);
+	p.query_flags = query_flags;
+	p.attach_btf_id = attach_btf_id;
+
+	ret = bpf_prog_query2(cgroup_fd, type, &p);
 	if (ret)
 		return -1;
 
-	return prog_cnt;
+	return p.prog_cnt;
 }
 
 static int cgroup_has_attached_progs(int cgroup_fd)
 {
 	enum bpf_attach_type type;
 	bool no_prog = true;
+	int i;
 
 	for (type = 0; type < __MAX_BPF_ATTACH_TYPE; type++) {
-		int count = count_attached_bpf_progs(cgroup_fd, type);
+		for (i = 0; i < bpf_lsm_num; i++) {
+			int count = count_attached_bpf_progs(cgroup_fd, type, bpf_lsm_id[i]);
 
-		if (count < 0 && errno != EINVAL)
-			return -1;
+			if (count < 0 && errno != EINVAL)
+				return -1;
+
+			if (count > 0) {
+				no_prog = false;
+				break;
+			}
 
-		if (count > 0) {
-			no_prog = false;
-			break;
+			if (type != BPF_LSM_CGROUP)
+				break;
 		}
 	}
 
 	return no_prog ? 0 : 1;
 }
 static int show_attached_bpf_progs(int cgroup_fd, enum bpf_attach_type type,
-				   int level)
+				   __u32 attach_btf_id, int level)
 {
+	LIBBPF_OPTS(bpf_prog_query_opts, p);
 	const char *attach_flags_str;
 	__u32 prog_ids[1024] = {0};
-	__u32 prog_cnt, iter;
-	__u32 attach_flags;
 	char buf[32];
+	__u32 iter;
 	int ret;
 
-	prog_cnt = ARRAY_SIZE(prog_ids);
-	ret = bpf_prog_query(cgroup_fd, type, query_flags, &attach_flags,
-			     prog_ids, &prog_cnt);
+	p.attach_btf_id = attach_btf_id;
+	p.query_flags = query_flags;
+	p.prog_cnt = ARRAY_SIZE(prog_ids);
+	p.prog_ids = prog_ids;
+
+	ret = bpf_prog_query2(cgroup_fd, type, &p);
 	if (ret)
 		return ret;
 
-	if (prog_cnt == 0)
+	if (p.prog_cnt == 0)
 		return 0;
 
-	switch (attach_flags) {
+	switch (p.attach_flags) {
 	case BPF_F_ALLOW_MULTI:
 		attach_flags_str = "multi";
 		break;
@@ -152,13 +221,13 @@ static int show_attached_bpf_progs(int cgroup_fd, enum bpf_attach_type type,
 		attach_flags_str = "";
 		break;
 	default:
-		snprintf(buf, sizeof(buf), "unknown(%x)", attach_flags);
+		snprintf(buf, sizeof(buf), "unknown(%x)", p.attach_flags);
 		attach_flags_str = buf;
 	}
 
-	for (iter = 0; iter < prog_cnt; iter++)
+	for (iter = 0; iter < p.prog_cnt; iter++)
 		show_bpf_prog(prog_ids[iter], type,
-			      attach_flags_str, level);
+			      attach_flags_str, attach_btf_id, level);
 
 	return 0;
 }
@@ -170,6 +239,7 @@ static int do_show(int argc, char **argv)
 	const char *path;
 	int cgroup_fd;
 	int ret = -1;
+	int i;
 
 	query_flags = 0;
 
@@ -221,8 +291,14 @@ static int do_show(int argc, char **argv)
 		 * If we were able to get the show for at least one
 		 * attach type, let's return 0.
 		 */
-		if (show_attached_bpf_progs(cgroup_fd, type, 0) == 0)
-			ret = 0;
+
+		for (i = 0; i < bpf_lsm_num; i++) {
+			if (show_attached_bpf_progs(cgroup_fd, type, bpf_lsm_id[i], 0) == 0)
+				ret = 0;
+
+			if (type != BPF_LSM_CGROUP)
+				break;
+		}
 	}
 
 	if (json_output)
@@ -247,6 +323,7 @@ static int do_show_tree_fn(const char *fpath, const struct stat *sb,
 	enum bpf_attach_type type;
 	int has_attached_progs;
 	int cgroup_fd;
+	int i;
 
 	if (typeflag != FTW_D)
 		return 0;
@@ -277,8 +354,14 @@ static int do_show_tree_fn(const char *fpath, const struct stat *sb,
 		printf("%s\n", fpath);
 	}
 
-	for (type = 0; type < __MAX_BPF_ATTACH_TYPE; type++)
-		show_attached_bpf_progs(cgroup_fd, type, ftw->level);
+	for (type = 0; type < __MAX_BPF_ATTACH_TYPE; type++) {
+		for (i = 0; i < bpf_lsm_num; i++) {
+			show_attached_bpf_progs(cgroup_fd, type, bpf_lsm_id[i], 0);
+
+			if (type != BPF_LSM_CGROUP)
+				break;
+		}
+	}
 
 	if (errno == EINVAL)
 		/* Last attach type does not support query.
@@ -523,5 +606,6 @@ static const struct cmd cmds[] = {
 
 int do_cgroup(int argc, char **argv)
 {
+	load_btf_lsm();
 	return cmd_select(cmds, argc, argv, do_help);
 }
diff --git a/tools/bpf/bpftool/common.c b/tools/bpf/bpftool/common.c
index c740142c24d8..a7a913784c47 100644
--- a/tools/bpf/bpftool/common.c
+++ b/tools/bpf/bpftool/common.c
@@ -66,6 +66,7 @@ const char * const attach_type_name[__MAX_BPF_ATTACH_TYPE] = {
 	[BPF_TRACE_FEXIT]		= "fexit",
 	[BPF_MODIFY_RETURN]		= "mod_ret",
 	[BPF_LSM_MAC]			= "lsm_mac",
+	[BPF_LSM_CGROUP]		= "lsm_cgroup",
 	[BPF_SK_LOOKUP]			= "sk_lookup",
 	[BPF_TRACE_ITER]		= "trace_iter",
 	[BPF_XDP_DEVMAP]		= "xdp_devmap",
-- 
2.36.0.464.gb9c8b46e94-goog


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH bpf-next v6 09/10] selftests/bpf: lsm_cgroup functional test
  2022-04-29 21:15 [PATCH bpf-next v6 00/10] bpf: cgroup_sock lsm flavor Stanislav Fomichev
                   ` (7 preceding siblings ...)
  2022-04-29 21:15 ` [PATCH bpf-next v6 08/10] bpftool: implement cgroup tree for BPF_LSM_CGROUP Stanislav Fomichev
@ 2022-04-29 21:15 ` Stanislav Fomichev
  2022-04-29 21:15 ` [PATCH bpf-next v6 10/10] selftests/bpf: verify lsm_cgroup struct sock access Stanislav Fomichev
  9 siblings, 0 replies; 33+ messages in thread
From: Stanislav Fomichev @ 2022-04-29 21:15 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

Functional test that exercises the following:

1. apply default sk_priority policy
2. permit TX-only AF_PACKET socket
3. cgroup attach/detach/replace
4. reusing trampoline shim
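
To run just this test through the usual selftests flow:

$ cd tools/testing/selftests/bpf
$ make
$ ./test_progs -t lsm_cgroup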

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 .../selftests/bpf/prog_tests/lsm_cgroup.c     | 236 ++++++++++++++++++
 .../testing/selftests/bpf/progs/lsm_cgroup.c  | 160 ++++++++++++
 2 files changed, 396 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
 create mode 100644 tools/testing/selftests/bpf/progs/lsm_cgroup.c

diff --git a/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c b/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
new file mode 100644
index 000000000000..667fb7b5cb03
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
@@ -0,0 +1,236 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <test_progs.h>
+#include <bpf/btf.h>
+
+#include "lsm_cgroup.skel.h"
+#include "cgroup_helpers.h"
+#include "network_helpers.h"
+
+static __u32 query_prog_cnt(int cgroup_fd, const char *attach_func)
+{
+	LIBBPF_OPTS(bpf_prog_query_opts, p);
+	static struct btf *btf;
+
+	if (!btf)
+		btf = btf__load_vmlinux_btf();
+	if (!ASSERT_OK(libbpf_get_error(btf), "btf_vmlinux"))
+		return -1;
+
+	p.attach_btf_id = btf__find_by_name_kind(btf, attach_func, BTF_KIND_FUNC);
+	ASSERT_GE(p.attach_btf_id, 0, "attach_btf_id");
+
+	ASSERT_OK(bpf_prog_query2(cgroup_fd, BPF_LSM_CGROUP, &p), "prog_query");
+	return p.prog_cnt;
+}
+
+void test_lsm_cgroup(void)
+{
+	DECLARE_LIBBPF_OPTS(bpf_prog_attach_opts, attach_opts);
+	DECLARE_LIBBPF_OPTS(bpf_link_update_opts, update_opts);
+	int cgroup_fd, cgroup_fd2, err, fd, prio;
+	int listen_fd, client_fd, accepted_fd;
+	struct lsm_cgroup *skel = NULL;
+	int post_create_prog_fd2 = -1;
+	int post_create_prog_fd = -1;
+	int bind_link_fd2 = -1;
+	int bind_prog_fd2 = -1;
+	int alloc_prog_fd = -1;
+	int bind_prog_fd = -1;
+	int bind_link_fd = -1;
+	int clone_prog_fd = -1;
+	socklen_t socklen;
+
+	cgroup_fd = test__join_cgroup("/sock_policy");
+	if (!ASSERT_GE(cgroup_fd, 0, "join_cgroup"))
+		goto close_skel;
+
+	cgroup_fd2 = create_and_get_cgroup("/sock_policy2");
+	if (!ASSERT_GE(cgroup_fd2, 0, "create second cgroup"))
+		goto close_skel;
+
+	skel = lsm_cgroup__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "open_and_load"))
+		goto close_cgroup;
+
+	post_create_prog_fd = bpf_program__fd(skel->progs.socket_post_create);
+	post_create_prog_fd2 = bpf_program__fd(skel->progs.socket_post_create2);
+	bind_prog_fd = bpf_program__fd(skel->progs.socket_bind);
+	bind_prog_fd2 = bpf_program__fd(skel->progs.socket_bind2);
+	alloc_prog_fd = bpf_program__fd(skel->progs.socket_alloc);
+	clone_prog_fd = bpf_program__fd(skel->progs.socket_clone);
+
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_sk_alloc_security"), 0, "prog count");
+	err = bpf_prog_attach(alloc_prog_fd, cgroup_fd, BPF_LSM_CGROUP, 0);
+	if (!ASSERT_OK(err, "attach alloc_prog_fd"))
+		goto detach_cgroup;
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_sk_alloc_security"), 1, "prog count");
+
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_inet_csk_clone"), 0, "prog count");
+	err = bpf_prog_attach(clone_prog_fd, cgroup_fd, BPF_LSM_CGROUP, 0);
+	if (!ASSERT_OK(err, "attach clone_prog_fd"))
+		goto detach_cgroup;
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_inet_csk_clone"), 1, "prog count");
+
+	/* Make sure replacing works. */
+
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_post_create"), 0, "prog count");
+	err = bpf_prog_attach(post_create_prog_fd, cgroup_fd,
+			      BPF_LSM_CGROUP, 0);
+	if (!ASSERT_OK(err, "attach post_create_prog_fd"))
+		goto close_cgroup;
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_post_create"), 1, "prog count");
+
+	attach_opts.replace_prog_fd = post_create_prog_fd;
+	err = bpf_prog_attach_opts(post_create_prog_fd2, cgroup_fd,
+				   BPF_LSM_CGROUP, &attach_opts);
+	if (!ASSERT_OK(err, "prog replace post_create_prog_fd"))
+		goto detach_cgroup;
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_post_create"), 1, "prog count");
+
+	/* Try the same attach/replace via link API. */
+
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_bind"), 0, "prog count");
+	bind_link_fd = bpf_link_create(bind_prog_fd, cgroup_fd,
+				       BPF_LSM_CGROUP, NULL);
+	if (!ASSERT_GE(bind_link_fd, 0, "link create bind_prog_fd"))
+		goto detach_cgroup;
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_bind"), 1, "prog count");
+
+	update_opts.old_prog_fd = bind_prog_fd;
+	update_opts.flags = BPF_F_REPLACE;
+
+	err = bpf_link_update(bind_link_fd, bind_prog_fd2, &update_opts);
+	if (!ASSERT_OK(err, "link update bind_prog_fd"))
+		goto detach_cgroup;
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_bind"), 1, "prog count");
+
+	/* Attach another instance of bind program to another cgroup.
+	 * This should trigger the reuse of the trampoline shim (two
+	 * programs attaching to the same btf_id).
+	 */
+
+	ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_bind"), 1, "prog count");
+	ASSERT_EQ(query_prog_cnt(cgroup_fd2, "bpf_lsm_socket_bind"), 0, "prog count");
+	bind_link_fd2 = bpf_link_create(bind_prog_fd2, cgroup_fd2,
+					BPF_LSM_CGROUP, NULL);
+	if (!ASSERT_GE(bind_link_fd2, 0, "link create bind_prog_fd2"))
+		goto detach_cgroup;
+	ASSERT_EQ(query_prog_cnt(cgroup_fd2, "bpf_lsm_socket_bind"), 1, "prog count");
+
+	/* AF_UNIX is prohibited. */
+
+	fd = socket(AF_UNIX, SOCK_STREAM, 0);
+	ASSERT_LT(fd, 0, "socket(AF_UNIX)");
+
+	/* AF_INET6 gets default policy (sk_priority). */
+
+	fd = socket(AF_INET6, SOCK_STREAM, 0);
+	if (!ASSERT_GE(fd, 0, "socket(SOCK_STREAM)"))
+		goto detach_cgroup;
+
+	prio = 0;
+	socklen = sizeof(prio);
+	ASSERT_GE(getsockopt(fd, SOL_SOCKET, SO_PRIORITY, &prio, &socklen), 0,
+		  "getsockopt");
+	ASSERT_EQ(prio, 123, "sk_priority");
+
+	close(fd);
+
+	/* TX-only AF_PACKET is allowed. */
+
+	ASSERT_LT(socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL)), 0,
+		  "socket(AF_PACKET, ..., ETH_P_ALL)");
+
+	fd = socket(AF_PACKET, SOCK_RAW, 0);
+	ASSERT_GE(fd, 0, "socket(AF_PACKET, ..., 0)");
+
+	/* TX-only AF_PACKET can not be rebound. */
+
+	struct sockaddr_ll sa = {
+		.sll_family = AF_PACKET,
+		.sll_protocol = htons(ETH_P_ALL),
+	};
+	ASSERT_LT(bind(fd, (struct sockaddr *)&sa, sizeof(sa)), 0,
+		  "bind(ETH_P_ALL)");
+
+	close(fd);
+
+	/* Trigger passive open. */
+
+	listen_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0);
+	ASSERT_GE(listen_fd, 0, "start_server");
+	client_fd = connect_to_fd(listen_fd, 0);
+	ASSERT_GE(client_fd, 0, "connect_to_fd");
+	accepted_fd = accept(listen_fd, NULL, NULL);
+	ASSERT_GE(accepted_fd, 0, "accept");
+
+	prio = 0;
+	socklen = sizeof(prio);
+	ASSERT_GE(getsockopt(accepted_fd, SOL_SOCKET, SO_PRIORITY, &prio, &socklen), 0,
+		  "getsockopt");
+	ASSERT_EQ(prio, 234, "sk_priority");
+
+	/* These are replaced and never called. */
+	ASSERT_EQ(skel->bss->called_socket_post_create, 0, "called_create");
+	ASSERT_EQ(skel->bss->called_socket_bind, 0, "called_bind");
+
+	/* AF_INET6+SOCK_STREAM
+	 * AF_PACKET+SOCK_RAW
+	 * listen_fd
+	 * client_fd
+	 * accepted_fd
+	 */
+	ASSERT_EQ(skel->bss->called_socket_post_create2, 5, "called_create2");
+
+	/* start_server
+	 * bind(ETH_P_ALL)
+	 */
+	ASSERT_EQ(skel->bss->called_socket_bind2, 2, "called_bind2");
+	/* Single accept(). */
+	ASSERT_EQ(skel->bss->called_socket_clone, 1, "called_clone");
+
+	/* AF_UNIX+SOCK_STREAM (failed)
+	 * AF_INET6+SOCK_STREAM
+	 * AF_PACKET+SOCK_RAW (failed)
+	 * AF_PACKET+SOCK_RAW
+	 * listen_fd
+	 * client_fd
+	 * accepted_fd
+	 */
+	ASSERT_EQ(skel->bss->called_socket_alloc, 7, "called_alloc");
+
+	/* Make sure other cgroup doesn't trigger the programs. */
+
+	if (!ASSERT_OK(join_cgroup(""), "join root cgroup"))
+		goto detach_cgroup;
+
+	fd = socket(AF_INET6, SOCK_STREAM, 0);
+	if (!ASSERT_GE(fd, 0, "socket(SOCK_STREAM)"))
+		goto detach_cgroup;
+
+	prio = 0;
+	socklen = sizeof(prio);
+	ASSERT_GE(getsockopt(fd, SOL_SOCKET, SO_PRIORITY, &prio, &socklen), 0,
+		  "getsockopt");
+	ASSERT_EQ(prio, 0, "sk_priority");
+
+	close(fd);
+
+detach_cgroup:
+	ASSERT_GE(bpf_prog_detach2(post_create_prog_fd2, cgroup_fd,
+				   BPF_LSM_CGROUP), 0, "detach_create");
+	close(bind_link_fd);
+	/* Don't close bind_link_fd2, exercise cgroup release cleanup. */
+	ASSERT_GE(bpf_prog_detach2(alloc_prog_fd, cgroup_fd,
+				   BPF_LSM_CGROUP), 0, "detach_alloc");
+	ASSERT_GE(bpf_prog_detach2(clone_prog_fd, cgroup_fd,
+				   BPF_LSM_CGROUP), 0, "detach_clone");
+
+close_cgroup:
+	close(cgroup_fd);
+close_skel:
+	lsm_cgroup__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/lsm_cgroup.c b/tools/testing/selftests/bpf/progs/lsm_cgroup.c
new file mode 100644
index 000000000000..a263830900e2
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/lsm_cgroup.c
@@ -0,0 +1,160 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+#ifndef AF_PACKET
+#define AF_PACKET 17
+#endif
+
+#ifndef AF_UNIX
+#define AF_UNIX 1
+#endif
+
+#ifndef EPERM
+#define EPERM 1
+#endif
+
+struct {
+	__uint(type, BPF_MAP_TYPE_CGROUP_STORAGE);
+	__type(key, __u64);
+	__type(value, __u64);
+} cgroup_storage SEC(".maps");
+
+int called_socket_post_create;
+int called_socket_post_create2;
+int called_socket_bind;
+int called_socket_bind2;
+int called_socket_alloc;
+int called_socket_clone;
+
+static __always_inline int test_local_storage(void)
+{
+	__u64 *val;
+
+	val = bpf_get_local_storage(&cgroup_storage, 0);
+	if (!val)
+		return 0;
+	*val += 1;
+
+	return 1;
+}
+
+static __always_inline int real_create(struct socket *sock, int family,
+				       int protocol)
+{
+	struct sock *sk;
+
+	/* Reject non-tx-only AF_PACKET. */
+	if (family == AF_PACKET && protocol != 0)
+		return 0; /* EPERM */
+
+	sk = sock->sk;
+	if (!sk)
+		return 1;
+
+	/* The rest of the sockets get default policy. */
+	sk->sk_priority = 123;
+
+	/* Can access cgroup local storage. */
+	if (!test_local_storage())
+		return 0; /* EPERM */
+
+	return 1;
+}
+
+/* __cgroup_bpf_run_lsm_socket */
+SEC("lsm_cgroup/socket_post_create")
+int BPF_PROG(socket_post_create, struct socket *sock, int family,
+	     int type, int protocol, int kern)
+{
+	called_socket_post_create++;
+	return real_create(sock, family, protocol);
+}
+
+/* __cgroup_bpf_run_lsm_socket */
+SEC("lsm_cgroup/socket_post_create")
+int BPF_PROG(socket_post_create2, struct socket *sock, int family,
+	     int type, int protocol, int kern)
+{
+	called_socket_post_create2++;
+	return real_create(sock, family, protocol);
+}
+
+static __always_inline int real_bind(struct socket *sock,
+				     struct sockaddr *address,
+				     int addrlen)
+{
+	struct sockaddr_ll sa = {};
+
+	if (sock->sk->__sk_common.skc_family != AF_PACKET)
+		return 1;
+
+	if (sock->sk->sk_kern_sock)
+		return 1;
+
+	bpf_probe_read_kernel(&sa, sizeof(sa), address);
+	if (sa.sll_protocol)
+		return 0; /* EPERM */
+
+	/* Can access cgroup local storage. */
+	if (!test_local_storage())
+		return 0; /* EPERM */
+
+	return 1;
+}
+
+/* __cgroup_bpf_run_lsm_socket */
+SEC("lsm_cgroup/socket_bind")
+int BPF_PROG(socket_bind, struct socket *sock, struct sockaddr *address,
+	     int addrlen)
+{
+	called_socket_bind++;
+	return real_bind(sock, address, addrlen);
+}
+
+/* __cgroup_bpf_run_lsm_socket */
+SEC("lsm_cgroup/socket_bind")
+int BPF_PROG(socket_bind2, struct socket *sock, struct sockaddr *address,
+	     int addrlen)
+{
+	called_socket_bind2++;
+	return real_bind(sock, address, addrlen);
+}
+
+/* __cgroup_bpf_run_lsm_current (via bpf_lsm_current_hooks) */
+SEC("lsm_cgroup/sk_alloc_security")
+int BPF_PROG(socket_alloc, struct sock *sk, int family, gfp_t priority)
+{
+	called_socket_alloc++;
+	if (family == AF_UNIX)
+		return 0; /* EPERM */
+
+	/* Can access cgroup local storage. */
+	if (!test_local_storage())
+		return 0; /* EPERM */
+
+	return 1;
+}
+
+/* __cgroup_bpf_run_lsm_sock */
+SEC("lsm_cgroup/inet_csk_clone")
+int BPF_PROG(socket_clone, struct sock *newsk, const struct request_sock *req)
+{
+	called_socket_clone++;
+
+	if (!newsk)
+		return 1;
+
+	/* Accepted request sockets get a different priority. */
+	newsk->sk_priority = 234;
+
+	/* Can access cgroup local storage. */
+	if (!test_local_storage())
+		return 0; /* EPERM */
+
+	return 1;
+}
-- 
2.36.0.464.gb9c8b46e94-goog


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH bpf-next v6 10/10] selftests/bpf: verify lsm_cgroup struct sock access
  2022-04-29 21:15 [PATCH bpf-next v6 00/10] bpf: cgroup_sock lsm flavor Stanislav Fomichev
                   ` (8 preceding siblings ...)
  2022-04-29 21:15 ` [PATCH bpf-next v6 09/10] selftests/bpf: lsm_cgroup functional test Stanislav Fomichev
@ 2022-04-29 21:15 ` Stanislav Fomichev
  2022-05-09 21:54   ` Andrii Nakryiko
  9 siblings, 1 reply; 33+ messages in thread
From: Stanislav Fomichev @ 2022-04-29 21:15 UTC (permalink / raw)
  To: netdev, bpf; +Cc: ast, daniel, andrii, Stanislav Fomichev

sk_priority & sk_mark are writable, the rest are read-only.

Add new ldx_offset fixups to look up the offset of a struct field.
Allow using test.kfunc regardless of prog_type.

One interesting thing here is that the verifier doesn't
really force me to add NULL checks anywhere :-/

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/testing/selftests/bpf/test_verifier.c   | 54 ++++++++++++++++++-
 .../selftests/bpf/verifier/lsm_cgroup.c       | 34 ++++++++++++
 2 files changed, 87 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/verifier/lsm_cgroup.c

diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index 372579c9f45e..49961492cbd4 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -75,6 +75,12 @@ struct kfunc_btf_id_pair {
 	int insn_idx;
 };
 
+struct ldx_offset {
+	const char *strct;
+	const char *field;
+	int insn_idx;
+};
+
 struct bpf_test {
 	const char *descr;
 	struct bpf_insn	insns[MAX_INSNS];
@@ -103,6 +109,7 @@ struct bpf_test {
 	int fixup_map_timer[MAX_FIXUPS];
 	int fixup_map_kptr[MAX_FIXUPS];
 	struct kfunc_btf_id_pair fixup_kfunc_btf_id[MAX_FIXUPS];
+	struct ldx_offset fixup_ldx[MAX_FIXUPS];
 	/* Expected verifier log output for result REJECT or VERBOSE_ACCEPT.
 	 * Can be a tab-separated sequence of expected strings. An empty string
 	 * means no log verification.
@@ -799,6 +806,7 @@ static void do_test_fixup(struct bpf_test *test, enum bpf_prog_type prog_type,
 	int *fixup_map_timer = test->fixup_map_timer;
 	int *fixup_map_kptr = test->fixup_map_kptr;
 	struct kfunc_btf_id_pair *fixup_kfunc_btf_id = test->fixup_kfunc_btf_id;
+	struct ldx_offset *fixup_ldx = test->fixup_ldx;
 
 	if (test->fill_helper) {
 		test->fill_insns = calloc(MAX_TEST_INSNS, sizeof(struct bpf_insn));
@@ -1018,6 +1026,50 @@ static void do_test_fixup(struct bpf_test *test, enum bpf_prog_type prog_type,
 			fixup_kfunc_btf_id++;
 		} while (fixup_kfunc_btf_id->kfunc);
 	}
+
+	if (fixup_ldx->strct) {
+		const struct btf_member *memb;
+		const struct btf_type *tp;
+		const char *name;
+		struct btf *btf;
+		int btf_id;
+		int off;
+		int i;
+
+		btf = btf__load_vmlinux_btf();
+
+		do {
+			off = -1;
+			if (!btf)
+				goto next_ldx;
+
+			btf_id = btf__find_by_name_kind(btf,
+							fixup_ldx->strct,
+							BTF_KIND_STRUCT);
+			if (btf_id < 0)
+				goto next_ldx;
+
+			tp = btf__type_by_id(btf, btf_id);
+			memb = btf_members(tp);
+
+			for (i = 0; i < btf_vlen(tp); i++) {
+				name = btf__name_by_offset(btf,
+							   memb->name_off);
+				if (strcmp(fixup_ldx->field, name) == 0) {
+					off = memb->offset / 8;
+					break;
+				}
+				memb++;
+			}
+
+next_ldx:
+			prog[fixup_ldx->insn_idx].off = off;
+			fixup_ldx++;
+
+		} while (fixup_ldx->strct);
+
+		btf__free(btf);
+	}
 }
 
 struct libcap {
@@ -1182,7 +1234,7 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
 		opts.log_level = 4;
 	opts.prog_flags = pflags;
 
-	if (prog_type == BPF_PROG_TYPE_TRACING && test->kfunc) {
+	if (test->kfunc) {
 		int attach_btf_id;
 
 		attach_btf_id = libbpf_find_vmlinux_btf_id(test->kfunc,
diff --git a/tools/testing/selftests/bpf/verifier/lsm_cgroup.c b/tools/testing/selftests/bpf/verifier/lsm_cgroup.c
new file mode 100644
index 000000000000..af0efe783511
--- /dev/null
+++ b/tools/testing/selftests/bpf/verifier/lsm_cgroup.c
@@ -0,0 +1,34 @@
+#define SK_WRITABLE_FIELD(tp, field, size, res) \
+{ \
+	.descr = field, \
+	.insns = { \
+		/* r1 = *(u64 *)(r1 + 0) */ \
+		BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, 0), \
+		/* r1 = *(u64 *)(r1 + offsetof(struct socket, sk)) */ \
+		BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, 0), \
+		/* r2 = *(u64 *)(r1 + offsetof(struct sock, <field>)) */ \
+		BPF_LDX_MEM(size, BPF_REG_2, BPF_REG_1, 0), \
+		/* *(u64 *)(r1 + offsetof(struct sock, <field>)) = r2 */ \
+		BPF_STX_MEM(size, BPF_REG_1, BPF_REG_2, 0), \
+		BPF_MOV64_IMM(BPF_REG_0, 1), \
+		BPF_EXIT_INSN(), \
+	}, \
+	.result = res, \
+	.errstr = res ? "no write support to 'struct sock' at off" : "", \
+	.prog_type = BPF_PROG_TYPE_LSM, \
+	.expected_attach_type = BPF_LSM_CGROUP, \
+	.kfunc = "socket_post_create", \
+	.fixup_ldx = { \
+		{ "socket", "sk", 1 }, \
+		{ tp, field, 2 }, \
+		{ tp, field, 3 }, \
+	}, \
+}
+
+SK_WRITABLE_FIELD("sock_common", "skc_family", BPF_H, REJECT),
+SK_WRITABLE_FIELD("sock", "sk_sndtimeo", BPF_DW, REJECT),
+SK_WRITABLE_FIELD("sock", "sk_priority", BPF_W, ACCEPT),
+SK_WRITABLE_FIELD("sock", "sk_mark", BPF_W, ACCEPT),
+SK_WRITABLE_FIELD("sock", "sk_pacing_rate", BPF_DW, REJECT),
+
+#undef SK_WRITABLE_FIELD
-- 
2.36.0.464.gb9c8b46e94-goog


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v6 03/10] bpf: per-cgroup lsm flavor
  2022-04-29 21:15 ` [PATCH bpf-next v6 03/10] bpf: per-cgroup lsm flavor Stanislav Fomichev
@ 2022-05-06 23:02   ` Martin KaFai Lau
  2022-05-09 23:38     ` Stanislav Fomichev
  2022-05-09 21:51   ` Andrii Nakryiko
  1 sibling, 1 reply; 33+ messages in thread
From: Martin KaFai Lau @ 2022-05-06 23:02 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii

On Fri, Apr 29, 2022 at 02:15:33PM -0700, Stanislav Fomichev wrote:
> diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
> index 064eccba641d..a0e68ef5dfb1 100644
> --- a/kernel/bpf/bpf_lsm.c
> +++ b/kernel/bpf/bpf_lsm.c
> @@ -16,6 +16,7 @@
>  #include <linux/bpf_local_storage.h>
>  #include <linux/btf_ids.h>
>  #include <linux/ima.h>
> +#include <linux/bpf-cgroup.h>
>  
>  /* For every LSM hook that allows attachment of BPF programs, declare a nop
>   * function where a BPF program can be attached.
> @@ -35,6 +36,66 @@ BTF_SET_START(bpf_lsm_hooks)
>  #undef LSM_HOOK
>  BTF_SET_END(bpf_lsm_hooks)
>  
> +/* List of LSM hooks that should operate on 'current' cgroup regardless
> + * of function signature.
> + */
> +BTF_SET_START(bpf_lsm_current_hooks)
> +/* operate on freshly allocated sk without any cgroup association */
> +BTF_ID(func, bpf_lsm_sk_alloc_security)
> +BTF_ID(func, bpf_lsm_sk_free_security)
> +BTF_SET_END(bpf_lsm_current_hooks)
> +
> +int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
> +			     bpf_func_t *bpf_func)
> +{
> +	const struct btf_type *first_arg_type;
> +	const struct btf_type *sock_type;
> +	const struct btf_type *sk_type;
> +	const struct btf *btf_vmlinux;
> +	const struct btf_param *args;
> +	s32 type_id;
> +
> +	if (!prog->aux->attach_func_proto ||
> +	    !btf_type_is_func_proto(prog->aux->attach_func_proto))
Also remove these tests at attach time.  They should have
already been checked at load time.

> +		return -EINVAL;
> +
> +	if (btf_type_vlen(prog->aux->attach_func_proto) < 1 ||
> +	    btf_id_set_contains(&bpf_lsm_current_hooks,
> +				prog->aux->attach_btf_id)) {
> +		*bpf_func = __cgroup_bpf_run_lsm_current;
> +		return 0;
> +	}
> +
> +	args = btf_params(prog->aux->attach_func_proto);
> +
> +	btf_vmlinux = bpf_get_btf_vmlinux();
> +
> +	type_id = btf_find_by_name_kind(btf_vmlinux, "socket", BTF_KIND_STRUCT);
> +	if (type_id < 0)
> +		return -EINVAL;
> +	sock_type = btf_type_by_id(btf_vmlinux, type_id);
> +
> +	type_id = btf_find_by_name_kind(btf_vmlinux, "sock", BTF_KIND_STRUCT);
> +	if (type_id < 0)
> +		return -EINVAL;
> +	sk_type = btf_type_by_id(btf_vmlinux, type_id);
Although a practical kconfig should have CONFIG_NET, I'm not sure btf has
"socket" and "sock" in some unusual kconfig setup.  If it does not, this will
return an error too early for the other hooks.

Maybe put them under "#ifdef CONFIG_NET"?  While adding CONFIG_NET,
it will be easier to just use btf_sock_ids[].  "socket" needs to be
added to btf_sock_ids[].  Take a look at btf_ids.h and filter.c.
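
Roughly something like this (untested sketch; BTF_SOCK_TYPE_SOCKET is a
new entry that would have to be added next to the existing
BTF_SOCK_TYPE_SOCK):

#ifdef CONFIG_NET
	u32 arg_btf_id = 0;

	first_arg_type = btf_type_resolve_ptr(btf_vmlinux, args[0].type,
					      &arg_btf_id);
	if (first_arg_type && arg_btf_id == btf_sock_ids[BTF_SOCK_TYPE_SOCKET])
		*bpf_func = __cgroup_bpf_run_lsm_socket;
	else if (first_arg_type && arg_btf_id == btf_sock_ids[BTF_SOCK_TYPE_SOCK])
		*bpf_func = __cgroup_bpf_run_lsm_sock;
	else
		*bpf_func = __cgroup_bpf_run_lsm_current;
#else
	*bpf_func = __cgroup_bpf_run_lsm_current;
#endif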

> +
> +	first_arg_type = btf_type_resolve_ptr(btf_vmlinux, args[0].type, NULL);
> +	if (first_arg_type == sock_type)
> +		*bpf_func = __cgroup_bpf_run_lsm_socket;
> +	else if (first_arg_type == sk_type)
> +		*bpf_func = __cgroup_bpf_run_lsm_sock;
> +	else
> +		*bpf_func = __cgroup_bpf_run_lsm_current;
> +
> +	return 0;
> +}

[ ... ]

> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> index 134785ab487c..9cc38454e402 100644
> --- a/kernel/bpf/cgroup.c
> +++ b/kernel/bpf/cgroup.c
> @@ -14,6 +14,9 @@
>  #include <linux/string.h>
>  #include <linux/bpf.h>
>  #include <linux/bpf-cgroup.h>
> +#include <linux/btf_ids.h>
> +#include <linux/bpf_lsm.h>
> +#include <linux/bpf_verifier.h>
>  #include <net/sock.h>
>  #include <net/bpf_sk_storage.h>
>  
> @@ -61,6 +64,85 @@ bpf_prog_run_array_cg(const struct cgroup_bpf *cgrp,
>  	return run_ctx.retval;
>  }
>  
> +unsigned int __cgroup_bpf_run_lsm_sock(const void *ctx,
> +				       const struct bpf_insn *insn)
> +{
> +	const struct bpf_prog *shim_prog;
> +	struct sock *sk;
> +	struct cgroup *cgrp;
> +	int ret = 0;
> +	u64 *regs;
> +
> +	regs = (u64 *)ctx;
> +	sk = (void *)(unsigned long)regs[BPF_REG_0];
> +	/*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
> +	shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
> +
> +	cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
> +	if (likely(cgrp))
> +		ret = bpf_prog_run_array_cg(&cgrp->bpf,
> +					    shim_prog->aux->cgroup_atype,
> +					    ctx, bpf_prog_run, 0, NULL);
> +	return ret;
> +}
> +
> +unsigned int __cgroup_bpf_run_lsm_socket(const void *ctx,
> +					 const struct bpf_insn *insn)
> +{
> +	const struct bpf_prog *shim_prog;
> +	struct socket *sock;
> +	struct cgroup *cgrp;
> +	int ret = 0;
> +	u64 *regs;
> +
> +	regs = (u64 *)ctx;
> +	sock = (void *)(unsigned long)regs[BPF_REG_0];
> +	/*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
> +	shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
> +
> +	cgrp = sock_cgroup_ptr(&sock->sk->sk_cgrp_data);
> +	if (likely(cgrp))
> +		ret = bpf_prog_run_array_cg(&cgrp->bpf,
> +					    shim_prog->aux->cgroup_atype,
> +					    ctx, bpf_prog_run, 0, NULL);
> +	return ret;
> +}
> +
> +unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
> +					  const struct bpf_insn *insn)
> +{
> +	const struct bpf_prog *shim_prog;
> +	struct cgroup *cgrp;
> +	int ret = 0;
From lsm_hook_defs.h, there are some default return values that are not 0.
Is it OK to always return 0 in cases like when the cgroup array is empty?

> +
> +	if (unlikely(!current))
> +		return 0;
> +
> +	/*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
> +	shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
> +
> +	rcu_read_lock();
> +	cgrp = task_dfl_cgroup(current);
> +	if (likely(cgrp))
> +		ret = bpf_prog_run_array_cg(&cgrp->bpf,
> +					    shim_prog->aux->cgroup_atype,
> +					    ctx, bpf_prog_run, 0, NULL);
> +	rcu_read_unlock();
> +	return ret;
> +}

[ ... ]

> @@ -479,6 +572,7 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
>  	struct bpf_prog *old_prog = NULL;
>  	struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE] = {};
>  	struct bpf_cgroup_storage *new_storage[MAX_BPF_CGROUP_STORAGE_TYPE] = {};
> +	struct bpf_attach_target_info tgt_info = {};
>  	enum cgroup_bpf_attach_type atype;
>  	struct bpf_prog_list *pl;
>  	struct hlist_head *progs;
> @@ -495,9 +589,34 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
>  		/* replace_prog implies BPF_F_REPLACE, and vice versa */
>  		return -EINVAL;
>  
> -	atype = to_cgroup_bpf_attach_type(type);
> -	if (atype < 0)
> -		return -EINVAL;
> +	if (type == BPF_LSM_CGROUP) {
> +		struct bpf_prog *p = prog ? : link->link.prog;
nit. This "prog ? ...." has been done multiple times (new and existing codes)
in this function.  may be do it once at the very beginning.

> +
> +		if (replace_prog) {
> +			/* Reusing shim from the original program. */
> +			if (replace_prog->aux->attach_btf_id !=
> +			    p->aux->attach_btf_id)
> +				return -EINVAL;
> +
> +			atype = replace_prog->aux->cgroup_atype;
> +		} else {
> +			err = bpf_check_attach_target(NULL, p, NULL,
> +						      p->aux->attach_btf_id,
> +						      &tgt_info);
> +			if (err)
> +				return -EINVAL;
return err;

> +
> +			atype = bpf_lsm_attach_type_get(p->aux->attach_btf_id);
> +			if (atype < 0)
> +				return atype;
> +		}
> +
> +		p->aux->cgroup_atype = atype;
> +	} else {
> +		atype = to_cgroup_bpf_attach_type(type);
> +		if (atype < 0)
> +			return -EINVAL;
> +	}
>  
>  	progs = &cgrp->bpf.progs[atype];

[ ... ]

> diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
> index 0c4fd194e801..77dfa517c47c 100644
> --- a/kernel/bpf/trampoline.c
> +++ b/kernel/bpf/trampoline.c
> @@ -11,6 +11,8 @@
>  #include <linux/rcupdate_wait.h>
>  #include <linux/module.h>
>  #include <linux/static_call.h>
> +#include <linux/bpf_verifier.h>
> +#include <linux/bpf_lsm.h>
>  
>  /* dummy _ops. The verifier will operate on target program's ops. */
>  const struct bpf_verifier_ops bpf_extension_verifier_ops = {
> @@ -485,6 +487,147 @@ int bpf_trampoline_unlink_prog(struct bpf_prog *prog, struct bpf_trampoline *tr)
>  	return err;
>  }
>  
> +#if defined(CONFIG_BPF_JIT) && defined(CONFIG_BPF_SYSCALL)
> +static struct bpf_prog *cgroup_shim_alloc(const struct bpf_prog *prog,
> +					  bpf_func_t bpf_func)
> +{
> +	struct bpf_prog *p;
> +
> +	p = bpf_prog_alloc(1, 0);
> +	if (!p)
> +		return NULL;
> +
> +	p->jited = false;
> +	p->bpf_func = bpf_func;
> +
> +	p->aux->cgroup_atype = prog->aux->cgroup_atype;
> +	p->aux->attach_func_proto = prog->aux->attach_func_proto;
> +	p->aux->attach_btf_id = prog->aux->attach_btf_id;
> +	p->aux->attach_btf = prog->aux->attach_btf;
> +	btf_get(p->aux->attach_btf);
> +	p->type = BPF_PROG_TYPE_LSM;
> +	p->expected_attach_type = BPF_LSM_MAC;
> +	bpf_prog_inc(p);
> +
> +	return p;
> +}
> +
> +static struct bpf_prog *cgroup_shim_find(struct bpf_trampoline *tr,
> +					 bpf_func_t bpf_func)
> +{
> +	const struct bpf_prog_aux *aux;
> +	int kind;
> +
> +	for (kind = 0; kind < BPF_TRAMP_MAX; kind++) {
> +		hlist_for_each_entry(aux, &tr->progs_hlist[kind], tramp_hlist) {
> +			struct bpf_prog *p = aux->prog;
> +
> +			if (p->bpf_func == bpf_func)
> +				return p;
> +		}
> +	}
> +
> +	return NULL;
> +}
> +
> +int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
> +				    struct bpf_attach_target_info *tgt_info)
> +{
> +	struct bpf_prog *shim_prog = NULL;
> +	struct bpf_trampoline *tr;
> +	bpf_func_t bpf_func;
> +	u64 key;
> +	int err;
> +
> +	key = bpf_trampoline_compute_key(NULL, prog->aux->attach_btf,
> +					 prog->aux->attach_btf_id);
> +
> +	err = bpf_lsm_find_cgroup_shim(prog, &bpf_func);
> +	if (err)
> +		return err;
> +
> +	tr = bpf_trampoline_get(key, tgt_info);
> +	if (!tr)
> +		return  -ENOMEM;
> +
> +	mutex_lock(&tr->mutex);
> +
> +	shim_prog = cgroup_shim_find(tr, bpf_func);
> +	if (shim_prog) {
> +		/* Reusing existing shim attached by the other program. */
> +		bpf_prog_inc(shim_prog);
> +		mutex_unlock(&tr->mutex);
> +		return 0;
> +	}
> +
> +	/* Allocate and install new shim. */
> +
> +	shim_prog = cgroup_shim_alloc(prog, bpf_func);
> +	if (!shim_prog) {
> +		err = -ENOMEM;
> +		goto out;
> +	}
> +
> +	err = __bpf_trampoline_link_prog(shim_prog, tr);
> +	if (err)
> +		goto out;
> +
> +	mutex_unlock(&tr->mutex);
> +
> +	return 0;
> +out:
> +	if (shim_prog)
> +		bpf_prog_put(shim_prog);
> +
bpf_trampoline_get() was done earlier.
Does it need to call bpf_trampoline_put(tr) in the error cases?
More on the tr refcnt later.
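
i.e. the error path would become something like (untested sketch):

	out:
		if (shim_prog)
			bpf_prog_put(shim_prog);
		mutex_unlock(&tr->mutex);
		bpf_trampoline_put(tr); /* pairs with bpf_trampoline_get() above */
		return err;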

> +	mutex_unlock(&tr->mutex);
> +	return err;
> +}
> +
> +void bpf_trampoline_unlink_cgroup_shim(struct bpf_prog *prog)
> +{
> +	struct bpf_prog *shim_prog;
> +	struct bpf_trampoline *tr;
> +	bpf_func_t bpf_func;
> +	u64 key;
> +	int err;
> +
> +	key = bpf_trampoline_compute_key(NULL, prog->aux->attach_btf,
> +					 prog->aux->attach_btf_id);
> +
> +	err = bpf_lsm_find_cgroup_shim(prog, &bpf_func);
> +	if (err)
> +		return;
> +
> +	tr = bpf_trampoline_lookup(key);
> +	if (!tr)
> +		return;
> +
> +	mutex_lock(&tr->mutex);
> +
> +	shim_prog = cgroup_shim_find(tr, bpf_func);
> +	if (shim_prog) {
> +		/* We use shim_prog refcnt for tracking whether to
> +		 * remove the shim program from the trampoline.
> +		 * Trampoline's mutex is held while refcnt is
> +		 * added/subtracted so we don't need to care about
> +		 * potential races.
> +		 */
> +
> +		if (atomic64_read(&shim_prog->aux->refcnt) == 1)
> +			WARN_ON_ONCE(__bpf_trampoline_unlink_prog(shim_prog, tr));
> +
> +		bpf_prog_put(shim_prog);
> +	}
> +
> +	mutex_unlock(&tr->mutex);
> +
> +	bpf_trampoline_put(tr); /* bpf_trampoline_lookup */
> +
> +	if (shim_prog)
> +		bpf_trampoline_put(tr);
While looking at the bpf_trampoline_{link,unlink}_cgroup_shim() pair again:
which prog (the cgroup lsm progs or the shim_prog) actually owns the tr?
It should be the shim_prog?

How about storing tr in "shim_prog->aux->dst_trampoline = tr;"?
Then the earlier bpf_prog_put(shim_prog) in this function will take care
of the bpf_trampoline_put(shim_prog->aux->dst_trampoline),
instead of each cgroup lsm prog owning a refcnt on the tr,
and this individual bpf_trampoline_put(tr) here can also
go away.
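
A rough sketch of that idea (untested; it assumes the prog-free path
already drops aux->dst_trampoline, the way it does for the fentry/fexit
dst_trampoline today):

	/* in bpf_trampoline_link_cgroup_shim(), once the shim is linked: */
	shim_prog->aux->dst_trampoline = tr;

	/* in bpf_trampoline_unlink_cgroup_shim(), only the put that pairs
	 * with bpf_trampoline_lookup() remains:
	 */
	bpf_trampoline_put(tr);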

> +}
> +#endif
> +
>  struct bpf_trampoline *bpf_trampoline_get(u64 key,
>  					  struct bpf_attach_target_info *tgt_info)
>  {
> @@ -609,6 +752,24 @@ void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start)
>  	rcu_read_unlock();
>  }
>  
> +u64 notrace __bpf_prog_enter_lsm_cgroup(struct bpf_prog *prog)
> +	__acquires(RCU)
> +{
> +	/* Runtime stats are exported via actual BPF_LSM_CGROUP
> +	 * programs, not the shims.
> +	 */
> +	rcu_read_lock();
> +	migrate_disable();
> +	return NO_START_TIME;
> +}
> +
> +void notrace __bpf_prog_exit_lsm_cgroup(struct bpf_prog *prog, u64 start)
> +	__releases(RCU)
> +{
> +	migrate_enable();
> +	rcu_read_unlock();
> +}
> +
>  u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog)
>  {
>  	rcu_read_lock_trace();
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 813f6ee80419..99703d96c579 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -14455,6 +14455,7 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
>  		fallthrough;
>  	case BPF_MODIFY_RETURN:
>  	case BPF_LSM_MAC:
> +	case BPF_LSM_CGROUP:
>  	case BPF_TRACE_FENTRY:
>  	case BPF_TRACE_FEXIT:
>  		if (!btf_type_is_func(t)) {
> @@ -14633,6 +14634,33 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
>  	return 0;
>  }
>  
> +static int check_used_maps(struct bpf_verifier_env *env)
> +{
> +	int i;
> +
> +	for (i = 0; i < env->used_map_cnt; i++) {
> +		struct bpf_map *map = env->used_maps[i];
> +
> +		switch (env->prog->expected_attach_type) {
> +		case BPF_LSM_CGROUP:
> +			if (map->map_type != BPF_MAP_TYPE_CGROUP_STORAGE &&
> +			    map->map_type != BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE)
> +				break;
> +
> +			if (map->key_size != sizeof(__u64)) {
Following up on my very last comment in v5: I think we should not
limit the bpf_cgroup_storage_key and should allow using both
cgroup_inode_id and attach_type as the key, to avoid an inconsistent
surprise (some keys not being allowed for a specific attach_type);
this check_used_maps() function can then also be avoided.
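
For reference, the two key flavors are the plain __u64 cgroup_inode_id
that this patch enforces, and the full key from include/uapi/linux/bpf.h:

	struct bpf_cgroup_storage_key {
		__u64	cgroup_inode_id;	/* cgroup inode id */
		__u32	attach_type;		/* program attach type */
	};

Allowing both would keep BPF_LSM_CGROUP consistent with the other cgroup
attach types.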

> +				verbose(env, "only global cgroup local storage is supported for BPF_LSM_CGROUP\n");
> +				return -EINVAL;
> +			}
> +
> +			break;
> +		default:
> +			break;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
>  struct btf *bpf_get_btf_vmlinux(void)
>  {
>  	if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) {
> @@ -14854,6 +14882,10 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr)
>  		convert_pseudo_ld_imm64(env);
>  	}
>  
> +	ret = check_used_maps(env);
> +	if (ret < 0)
> +		goto err_release_maps;
> +
>  	adjust_btf_func(env);
>  
>  err_release_maps:
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 444fe6f1cf35..112e396bbe65 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -998,6 +998,7 @@ enum bpf_attach_type {
>  	BPF_SK_REUSEPORT_SELECT_OR_MIGRATE,
>  	BPF_PERF_EVENT,
>  	BPF_TRACE_KPROBE_MULTI,
> +	BPF_LSM_CGROUP,
>  	__MAX_BPF_ATTACH_TYPE
>  };
>  
> -- 
> 2.36.0.464.gb9c8b46e94-goog
> 



* Re: [PATCH bpf-next v6 05/10] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-04-29 21:15 ` [PATCH bpf-next v6 05/10] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP Stanislav Fomichev
@ 2022-05-07  0:12   ` Martin KaFai Lau
  2022-05-09 23:38     ` Stanislav Fomichev
  2022-05-09 21:49   ` Andrii Nakryiko
  1 sibling, 1 reply; 33+ messages in thread
From: Martin KaFai Lau @ 2022-05-07  0:12 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii

On Fri, Apr 29, 2022 at 02:15:35PM -0700, Stanislav Fomichev wrote:
> We have two options:
> 1. Treat all BPF_LSM_CGROUP as the same, regardless of attach_btf_id
> 2. Treat BPF_LSM_CGROUP+attach_btf_id as a separate hook point
> 
> I'm doing (2) here and adding attach_btf_id as a new BPF_PROG_QUERY
> argument. The downside is that it requires iterating over all possible
> bpf_lsm_ hook points in the userspace which might take some time.
> 
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>  include/uapi/linux/bpf.h       |  1 +
>  kernel/bpf/cgroup.c            | 43 ++++++++++++++++++++++++----------
>  kernel/bpf/syscall.c           |  3 ++-
>  tools/include/uapi/linux/bpf.h |  1 +
>  tools/lib/bpf/bpf.c            | 42 ++++++++++++++++++++++++++-------
>  tools/lib/bpf/bpf.h            | 15 ++++++++++++
>  tools/lib/bpf/libbpf.map       |  1 +
>  7 files changed, 85 insertions(+), 21 deletions(-)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 112e396bbe65..e38ea0b47b6a 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -1431,6 +1431,7 @@ union bpf_attr {
>  		__u32		attach_flags;
>  		__aligned_u64	prog_ids;
>  		__u32		prog_cnt;
> +		__u32		attach_btf_id;	/* for BPF_LSM_CGROUP */
If the downside/concern with (1) is that bpftool cannot show
which bpf_lsm_* hook a cgroup-lsm prog is attached to,
how about adding this attach_btf_id to bpf_prog_info instead?
The bpftool side is getting the bpf_prog_info (e.g. for the name) anyway.

Probably need to rename it to attach_func_btf_id (== prog->aux->attach_btf_id)
and then also add the attach_btf_id (== prog->aux->attach_btf->id) to
bpf_prog_info.
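
Something along these lines in the uapi (a sketch only; the exact names
and placement are to be decided):

	struct bpf_prog_info {
		...
		__u32	attach_btf_id;		/* == prog->aux->attach_btf->id */
		__u32	attach_func_btf_id;	/* == prog->aux->attach_btf_id */
	} __attribute__((aligned(8)));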

bpftool will then work mostly the same: there is no need to iterate
btf_vmlinux to figure out the btf_ids of all the bpf_lsm_* hooks, and no
need to worry about the total number of lsm hooks growing in the future,
whereas the bpftool patch later in this set has a static 1024 limit.

If you also agree on (1), then the kernel-side concern for this patch
is that it needs to return all BPF_LSM_CGROUP progs to userspace.

Feel free to put the bpf_prog_info modification and the bpftool changes
in a follow-up patch.  In this same set is also fine.  Suggesting it
because this set is getting long already.


* Re: [PATCH bpf-next v6 05/10] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-04-29 21:15 ` [PATCH bpf-next v6 05/10] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP Stanislav Fomichev
  2022-05-07  0:12   ` Martin KaFai Lau
@ 2022-05-09 21:49   ` Andrii Nakryiko
  2022-05-09 23:38     ` Stanislav Fomichev
  1 sibling, 1 reply; 33+ messages in thread
From: Andrii Nakryiko @ 2022-05-09 21:49 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko

On Fri, Apr 29, 2022 at 2:15 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> We have two options:
> 1. Treat all BPF_LSM_CGROUP as the same, regardless of attach_btf_id
> 2. Treat BPF_LSM_CGROUP+attach_btf_id as a separate hook point
>
> I'm doing (2) here and adding attach_btf_id as a new BPF_PROG_QUERY
> argument. The downside is that it requires iterating over all possible
> bpf_lsm_ hook points in the userspace which might take some time.
>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>  include/uapi/linux/bpf.h       |  1 +
>  kernel/bpf/cgroup.c            | 43 ++++++++++++++++++++++++----------
>  kernel/bpf/syscall.c           |  3 ++-
>  tools/include/uapi/linux/bpf.h |  1 +
>  tools/lib/bpf/bpf.c            | 42 ++++++++++++++++++++++++++-------
>  tools/lib/bpf/bpf.h            | 15 ++++++++++++
>  tools/lib/bpf/libbpf.map       |  1 +

please split kernel and libbpf changes into separate patches and mark
libbpf's with "libbpf: " prefix

>  7 files changed, 85 insertions(+), 21 deletions(-)
>

[...]

> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> index a9d292c106c2..f62823451b99 100644
> --- a/tools/lib/bpf/bpf.c
> +++ b/tools/lib/bpf/bpf.c
> @@ -946,28 +946,54 @@ int bpf_iter_create(int link_fd)
>         return libbpf_err_errno(fd);
>  }
>
> -int bpf_prog_query(int target_fd, enum bpf_attach_type type, __u32 query_flags,
> -                  __u32 *attach_flags, __u32 *prog_ids, __u32 *prog_cnt)
> +int bpf_prog_query2(int target_fd,

this would have to be named bpf_prog_query_opts()
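
i.e. following the usual OPTS convention, something like (a sketch, not
the final field list):

	struct bpf_prog_query_opts {
		size_t sz;		/* size of this struct for fwd/bwd compat */
		__u32 query_flags;
		__u32 attach_flags;	/* output */
		__u32 *prog_ids;	/* output */
		__u32 prog_cnt;		/* input+output */
		__u32 attach_btf_id;	/* for BPF_LSM_CGROUP */
	};
	#define bpf_prog_query_opts__last_field attach_btf_id

	LIBBPF_API int bpf_prog_query_opts(int target_fd,
					   enum bpf_attach_type type,
					   struct bpf_prog_query_opts *opts);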

> +                   enum bpf_attach_type type,
> +                   struct bpf_prog_query_opts *opts)
>  {
>         union bpf_attr attr;
>         int ret;
>
>         memset(&attr, 0, sizeof(attr));
> +

[...]

> diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> index b5bc84039407..5e5bb3e437cc 100644
> --- a/tools/lib/bpf/libbpf.map
> +++ b/tools/lib/bpf/libbpf.map
> @@ -450,4 +450,5 @@ LIBBPF_0.8.0 {
>                 bpf_program__attach_usdt;
>                 libbpf_register_prog_handler;
>                 libbpf_unregister_prog_handler;
> +               bpf_prog_query2;

this list is alphabetically ordered

>  } LIBBPF_0.7.0;
> --
> 2.36.0.464.gb9c8b46e94-goog
>


* Re: [PATCH bpf-next v6 03/10] bpf: per-cgroup lsm flavor
  2022-04-29 21:15 ` [PATCH bpf-next v6 03/10] bpf: per-cgroup lsm flavor Stanislav Fomichev
  2022-05-06 23:02   ` Martin KaFai Lau
@ 2022-05-09 21:51   ` Andrii Nakryiko
  2022-05-09 23:38     ` Stanislav Fomichev
  1 sibling, 1 reply; 33+ messages in thread
From: Andrii Nakryiko @ 2022-05-09 21:51 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko

On Fri, Apr 29, 2022 at 2:15 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> Allow attaching to lsm hooks in the cgroup context.
>
> Attaching to per-cgroup LSM works exactly like attaching
> to other per-cgroup hooks. New BPF_LSM_CGROUP is added
> to trigger new mode; the actual lsm hook we attach to is
> signaled via existing attach_btf_id.
>
> For the hooks that have 'struct socket' or 'struct sock' as its first
> argument, we use the cgroup associated with that socket. For the rest,
> we use 'current' cgroup (this is all on default hierarchy == v2 only).
> Note that for some hooks that work on 'struct sock' we still
> take the cgroup from 'current' because some of them work on the socket
> that hasn't been properly initialized yet.
>
> Behind the scenes, we allocate a shim program that is attached
> to the trampoline and runs cgroup effective BPF programs array.
> This shim has some rudimentary ref counting and can be shared
> between several programs attaching to the same per-cgroup lsm hook.
>
> Note that this patch bloats cgroup size because we add 211
> cgroup_bpf_attach_type(s) for simplicity sake. This will be
> addressed in the subsequent patch.
>
> Also note that we only add non-sleepable flavor for now. To enable
> sleepable use-cases, bpf_prog_run_array_cg has to grab trace rcu,
> shim programs have to be freed via trace rcu, cgroup_bpf.effective
> should be also trace-rcu-managed + maybe some other changes that
> I'm not aware of.
>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>  arch/x86/net/bpf_jit_comp.c     |  22 ++--
>  include/linux/bpf-cgroup-defs.h |   6 ++
>  include/linux/bpf-cgroup.h      |   7 ++
>  include/linux/bpf.h             |  15 +++
>  include/linux/bpf_lsm.h         |  14 +++
>  include/uapi/linux/bpf.h        |   1 +
>  kernel/bpf/bpf_lsm.c            |  64 ++++++++++++
>  kernel/bpf/btf.c                |  11 ++
>  kernel/bpf/cgroup.c             | 179 +++++++++++++++++++++++++++++---
>  kernel/bpf/syscall.c            |  10 ++
>  kernel/bpf/trampoline.c         | 161 ++++++++++++++++++++++++++++
>  kernel/bpf/verifier.c           |  32 ++++++
>  tools/include/uapi/linux/bpf.h  |   1 +
>  13 files changed, 503 insertions(+), 20 deletions(-)
>

[...]

> @@ -3474,6 +3476,11 @@ static int bpf_prog_attach(const union bpf_attr *attr)
>         case BPF_PROG_TYPE_CGROUP_SOCKOPT:
>         case BPF_PROG_TYPE_CGROUP_SYSCTL:
>         case BPF_PROG_TYPE_SOCK_OPS:
> +       case BPF_PROG_TYPE_LSM:
> +               if (ptype == BPF_PROG_TYPE_LSM &&
> +                   prog->expected_attach_type != BPF_LSM_CGROUP)
> +                       return -EINVAL;
> +

Is it a hard requirement to support non-bpf_link attach for these BPF
trampoline-backed programs? Can we keep it bpf_link-only and use
LINK_CREATE for attachment? That way we won't need to extend query
command and instead add new field to bpf_link_info?

>                 ret = cgroup_bpf_prog_attach(attr, ptype, prog);
>                 break;
>         default:

[...]


* Re: [PATCH bpf-next v6 10/10] selftests/bpf: verify lsm_cgroup struct sock access
  2022-04-29 21:15 ` [PATCH bpf-next v6 10/10] selftests/bpf: verify lsm_cgroup struct sock access Stanislav Fomichev
@ 2022-05-09 21:54   ` Andrii Nakryiko
  2022-05-09 23:38     ` Stanislav Fomichev
  0 siblings, 1 reply; 33+ messages in thread
From: Andrii Nakryiko @ 2022-05-09 21:54 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko

On Fri, Apr 29, 2022 at 2:16 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> sk_priority & sk_mark are writable, the rest is readonly.
>
> Add new ldx_offset fixups to lookup the offset of struct field.
> Allow using test.kfunc regardless of prog_type.
>
> One interesting thing here is that the verifier doesn't
> really force me to add NULL checks anywhere :-/
>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>  tools/testing/selftests/bpf/test_verifier.c   | 54 ++++++++++++++++++-
>  .../selftests/bpf/verifier/lsm_cgroup.c       | 34 ++++++++++++
>  2 files changed, 87 insertions(+), 1 deletion(-)
>  create mode 100644 tools/testing/selftests/bpf/verifier/lsm_cgroup.c
>

[...]

> diff --git a/tools/testing/selftests/bpf/verifier/lsm_cgroup.c b/tools/testing/selftests/bpf/verifier/lsm_cgroup.c
> new file mode 100644
> index 000000000000..af0efe783511
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/verifier/lsm_cgroup.c
> @@ -0,0 +1,34 @@
> +#define SK_WRITABLE_FIELD(tp, field, size, res) \
> +{ \
> +       .descr = field, \
> +       .insns = { \
> +               /* r1 = *(u64 *)(r1 + 0) */ \
> +               BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, 0), \
> +               /* r1 = *(u64 *)(r1 + offsetof(struct socket, sk)) */ \
> +               BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, 0), \
> +               /* r2 = *(u64 *)(r1 + offsetof(struct sock, <field>)) */ \
> +               BPF_LDX_MEM(size, BPF_REG_2, BPF_REG_1, 0), \
> +               /* *(u64 *)(r1 + offsetof(struct sock, <field>)) = r2 */ \
> +               BPF_STX_MEM(size, BPF_REG_1, BPF_REG_2, 0), \
> +               BPF_MOV64_IMM(BPF_REG_0, 1), \
> +               BPF_EXIT_INSN(), \
> +       }, \
> +       .result = res, \
> +       .errstr = res ? "no write support to 'struct sock' at off" : "", \
> +       .prog_type = BPF_PROG_TYPE_LSM, \
> +       .expected_attach_type = BPF_LSM_CGROUP, \
> +       .kfunc = "socket_post_create", \
> +       .fixup_ldx = { \
> +               { "socket", "sk", 1 }, \
> +               { tp, field, 2 }, \
> +               { tp, field, 3 }, \
> +       }, \
> +}
> +
> +SK_WRITABLE_FIELD("sock_common", "skc_family", BPF_H, REJECT),
> +SK_WRITABLE_FIELD("sock", "sk_sndtimeo", BPF_DW, REJECT),
> +SK_WRITABLE_FIELD("sock", "sk_priority", BPF_W, ACCEPT),
> +SK_WRITABLE_FIELD("sock", "sk_mark", BPF_W, ACCEPT),
> +SK_WRITABLE_FIELD("sock", "sk_pacing_rate", BPF_DW, REJECT),
> +

have you tried writing it as a C program and adding the test to
test_progs? Does something not work there?

> +#undef SK_WRITABLE_FIELD
> --
> 2.36.0.464.gb9c8b46e94-goog
>


* Re: [PATCH bpf-next v6 05/10] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-05-07  0:12   ` Martin KaFai Lau
@ 2022-05-09 23:38     ` Stanislav Fomichev
  0 siblings, 0 replies; 33+ messages in thread
From: Stanislav Fomichev @ 2022-05-09 23:38 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: netdev, bpf, ast, daniel, andrii

On Fri, May 6, 2022 at 5:12 PM Martin KaFai Lau <kafai@fb.com> wrote:
>
> On Fri, Apr 29, 2022 at 02:15:35PM -0700, Stanislav Fomichev wrote:
> > We have two options:
> > 1. Treat all BPF_LSM_CGROUP as the same, regardless of attach_btf_id
> > 2. Treat BPF_LSM_CGROUP+attach_btf_id as a separate hook point
> >
> > I'm doing (2) here and adding attach_btf_id as a new BPF_PROG_QUERY
> > argument. The downside is that it requires iterating over all possible
> > bpf_lsm_ hook points in the userspace which might take some time.
> >
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >  include/uapi/linux/bpf.h       |  1 +
> >  kernel/bpf/cgroup.c            | 43 ++++++++++++++++++++++++----------
> >  kernel/bpf/syscall.c           |  3 ++-
> >  tools/include/uapi/linux/bpf.h |  1 +
> >  tools/lib/bpf/bpf.c            | 42 ++++++++++++++++++++++++++-------
> >  tools/lib/bpf/bpf.h            | 15 ++++++++++++
> >  tools/lib/bpf/libbpf.map       |  1 +
> >  7 files changed, 85 insertions(+), 21 deletions(-)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 112e396bbe65..e38ea0b47b6a 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -1431,6 +1431,7 @@ union bpf_attr {
> >               __u32           attach_flags;
> >               __aligned_u64   prog_ids;
> >               __u32           prog_cnt;
> > +             __u32           attach_btf_id;  /* for BPF_LSM_CGROUP */
> If the downside/concern on (1) is the bpftool cannot show
> which bpf_lsm_* hook that a cgroup-lsm is attached to,
> how about adding this attach_btf_id to the bpf_prog_info instead.
> The bpftool side is getting the bpf_prog_info (e.g. for the name) anyway.
>
> Probably need to rename it to attach_func_btf_id (== prog->aux->attach_btf_id)
> and then also add the attach_btf_id (== prog->aux->attach_btf->id) to
> bpf_prog_info.
>
> The bpftool then will work mostly the same and no need to iterate btf_vmlinux
> to figure out the btf_id for all bpf_lsm_* hooks and no need to worry about
> the increasing total number of lsm hooks in the future while
> the latter bpftool patch has a static 1024.
>
> If you also agree on (1), for this patch on the kernel side concern,
> it needs to return all BPF_LSM_CGROUP progs to the userspace.

I was exploring this initially but with this scheme I'm not sure how
to export attach_flags. I'm assuming that one lsm hook can, say, be
attached as BPF_F_ALLOW_OVERRIDE and the other one can use
BPF_F_ALLOW_MULTI. Now, if we return all BPF_LSM_CGROUP programs (1),
we have a problem because there is only one attach_flags field per
attach_type.

I can extend BPF_PROG_QUERY with another user-provided pointer where
the kernel can put per-program attach_flags; that doesn't seem like it
would be a problem, right?
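
Something like this on the uapi side (a sketch; the field name is made up):

	union bpf_attr {
		struct { /* used by BPF_PROG_QUERY */
			...
			/* out: array of per-program attach_flags,
			 * prog_cnt entries
			 */
			__aligned_u64	prog_attach_flags;
		} query;
	};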

> Feel free to put the bpf_prog_info modification and bpftool changes as a follow up
> patch.  In this same set is also fine.  Suggesting it because this set is
> getting long already.

SG, let's discuss it first. I can do a follow-up series to add this
query API; the series is getting long indeed :-(


* Re: [PATCH bpf-next v6 05/10] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
  2022-05-09 21:49   ` Andrii Nakryiko
@ 2022-05-09 23:38     ` Stanislav Fomichev
  0 siblings, 0 replies; 33+ messages in thread
From: Stanislav Fomichev @ 2022-05-09 23:38 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko

On Mon, May 9, 2022 at 2:49 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Fri, Apr 29, 2022 at 2:15 PM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > We have two options:
> > 1. Treat all BPF_LSM_CGROUP as the same, regardless of attach_btf_id
> > 2. Treat BPF_LSM_CGROUP+attach_btf_id as a separate hook point
> >
> > I'm doing (2) here and adding attach_btf_id as a new BPF_PROG_QUERY
> > argument. The downside is that it requires iterating over all possible
> > bpf_lsm_ hook points in the userspace which might take some time.
> >
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >  include/uapi/linux/bpf.h       |  1 +
> >  kernel/bpf/cgroup.c            | 43 ++++++++++++++++++++++++----------
> >  kernel/bpf/syscall.c           |  3 ++-
> >  tools/include/uapi/linux/bpf.h |  1 +
> >  tools/lib/bpf/bpf.c            | 42 ++++++++++++++++++++++++++-------
> >  tools/lib/bpf/bpf.h            | 15 ++++++++++++
> >  tools/lib/bpf/libbpf.map       |  1 +
>
> please split kernel and libbpf changes into separate patches and mark
> libbpf's with "libbpf: " prefix

Ack, thanks, will do this and will address the rest.


> >  7 files changed, 85 insertions(+), 21 deletions(-)
> >
>
> [...]
>
> > diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> > index a9d292c106c2..f62823451b99 100644
> > --- a/tools/lib/bpf/bpf.c
> > +++ b/tools/lib/bpf/bpf.c
> > @@ -946,28 +946,54 @@ int bpf_iter_create(int link_fd)
> >         return libbpf_err_errno(fd);
> >  }
> >
> > -int bpf_prog_query(int target_fd, enum bpf_attach_type type, __u32 query_flags,
> > -                  __u32 *attach_flags, __u32 *prog_ids, __u32 *prog_cnt)
> > +int bpf_prog_query2(int target_fd,
>
> this would have to be named bpf_prog_query_opts()
>
> > +                   enum bpf_attach_type type,
> > +                   struct bpf_prog_query_opts *opts)
> >  {
> >         union bpf_attr attr;
> >         int ret;
> >
> >         memset(&attr, 0, sizeof(attr));
> > +
>
> [...]
>
> > diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> > index b5bc84039407..5e5bb3e437cc 100644
> > --- a/tools/lib/bpf/libbpf.map
> > +++ b/tools/lib/bpf/libbpf.map
> > @@ -450,4 +450,5 @@ LIBBPF_0.8.0 {
> >                 bpf_program__attach_usdt;
> >                 libbpf_register_prog_handler;
> >                 libbpf_unregister_prog_handler;
> > +               bpf_prog_query2;
>
> this list is alphabetically ordered
>
> >  } LIBBPF_0.7.0;
> > --
> > 2.36.0.464.gb9c8b46e94-goog
> >


* Re: [PATCH bpf-next v6 10/10] selftests/bpf: verify lsm_cgroup struct sock access
  2022-05-09 21:54   ` Andrii Nakryiko
@ 2022-05-09 23:38     ` Stanislav Fomichev
  2022-05-09 23:43       ` Andrii Nakryiko
  0 siblings, 1 reply; 33+ messages in thread
From: Stanislav Fomichev @ 2022-05-09 23:38 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko

On Mon, May 9, 2022 at 2:54 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Fri, Apr 29, 2022 at 2:16 PM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > sk_priority & sk_mark are writable, the rest is readonly.
> >
> > Add new ldx_offset fixups to lookup the offset of struct field.
> > Allow using test.kfunc regardless of prog_type.
> >
> > One interesting thing here is that the verifier doesn't
> > really force me to add NULL checks anywhere :-/
> >
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >  tools/testing/selftests/bpf/test_verifier.c   | 54 ++++++++++++++++++-
> >  .../selftests/bpf/verifier/lsm_cgroup.c       | 34 ++++++++++++
> >  2 files changed, 87 insertions(+), 1 deletion(-)
> >  create mode 100644 tools/testing/selftests/bpf/verifier/lsm_cgroup.c
> >
>
> [...]
>
> > diff --git a/tools/testing/selftests/bpf/verifier/lsm_cgroup.c b/tools/testing/selftests/bpf/verifier/lsm_cgroup.c
> > new file mode 100644
> > index 000000000000..af0efe783511
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/verifier/lsm_cgroup.c
> > @@ -0,0 +1,34 @@
> > +#define SK_WRITABLE_FIELD(tp, field, size, res) \
> > +{ \
> > +       .descr = field, \
> > +       .insns = { \
> > +               /* r1 = *(u64 *)(r1 + 0) */ \
> > +               BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, 0), \
> > +               /* r1 = *(u64 *)(r1 + offsetof(struct socket, sk)) */ \
> > +               BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, 0), \
> > +               /* r2 = *(u64 *)(r1 + offsetof(struct sock, <field>)) */ \
> > +               BPF_LDX_MEM(size, BPF_REG_2, BPF_REG_1, 0), \
> > +               /* *(u64 *)(r1 + offsetof(struct sock, <field>)) = r2 */ \
> > +               BPF_STX_MEM(size, BPF_REG_1, BPF_REG_2, 0), \
> > +               BPF_MOV64_IMM(BPF_REG_0, 1), \
> > +               BPF_EXIT_INSN(), \
> > +       }, \
> > +       .result = res, \
> > +       .errstr = res ? "no write support to 'struct sock' at off" : "", \
> > +       .prog_type = BPF_PROG_TYPE_LSM, \
> > +       .expected_attach_type = BPF_LSM_CGROUP, \
> > +       .kfunc = "socket_post_create", \
> > +       .fixup_ldx = { \
> > +               { "socket", "sk", 1 }, \
> > +               { tp, field, 2 }, \
> > +               { tp, field, 3 }, \
> > +       }, \
> > +}
> > +
> > +SK_WRITABLE_FIELD("sock_common", "skc_family", BPF_H, REJECT),
> > +SK_WRITABLE_FIELD("sock", "sk_sndtimeo", BPF_DW, REJECT),
> > +SK_WRITABLE_FIELD("sock", "sk_priority", BPF_W, ACCEPT),
> > +SK_WRITABLE_FIELD("sock", "sk_mark", BPF_W, ACCEPT),
> > +SK_WRITABLE_FIELD("sock", "sk_pacing_rate", BPF_DW, REJECT),
> > +
>
> have you tried writing it as a C program and adding the test to
> test_progs? Does something not work there?

Seems like it should work; I don't see any issues with writing 5
programs to test each field.
But test_verifier still feels like a better fit? Any reason in
particular you'd prefer test_progs over test_verifier?
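
Fwiw, a C equivalent would look roughly like this (untested sketch; it
assumes a "lsm_cgroup/"-style SEC prefix, with the exact section name
coming from the libbpf patch):

	SEC("lsm_cgroup/socket_post_create")
	int BPF_PROG(write_sk_priority, struct socket *sock, int family,
		     int type, int protocol, int kern)
	{
		sock->sk->sk_priority = 1; /* writable, should load fine */
		return 1;
	}

plus a variant writing e.g. sk_pacing_rate that is expected to be
rejected at load time.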


* Re: [PATCH bpf-next v6 03/10] bpf: per-cgroup lsm flavor
  2022-05-09 21:51   ` Andrii Nakryiko
@ 2022-05-09 23:38     ` Stanislav Fomichev
  0 siblings, 0 replies; 33+ messages in thread
From: Stanislav Fomichev @ 2022-05-09 23:38 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko

On Mon, May 9, 2022 at 2:51 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Fri, Apr 29, 2022 at 2:15 PM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > Allow attaching to lsm hooks in the cgroup context.
> >
> > Attaching to per-cgroup LSM works exactly like attaching
> > to other per-cgroup hooks. New BPF_LSM_CGROUP is added
> > to trigger new mode; the actual lsm hook we attach to is
> > signaled via existing attach_btf_id.
> >
> > For the hooks that have 'struct socket' or 'struct sock' as its first
> > argument, we use the cgroup associated with that socket. For the rest,
> > we use 'current' cgroup (this is all on default hierarchy == v2 only).
> > Note that for some hooks that work on 'struct sock' we still
> > take the cgroup from 'current' because some of them work on the socket
> > that hasn't been properly initialized yet.
> >
> > Behind the scenes, we allocate a shim program that is attached
> > to the trampoline and runs cgroup effective BPF programs array.
> > This shim has some rudimentary ref counting and can be shared
> > between several programs attaching to the same per-cgroup lsm hook.
> >
> > Note that this patch bloats cgroup size because we add 211
> > cgroup_bpf_attach_type(s) for simplicity sake. This will be
> > addressed in the subsequent patch.
> >
> > Also note that we only add non-sleepable flavor for now. To enable
> > sleepable use-cases, bpf_prog_run_array_cg has to grab trace rcu,
> > shim programs have to be freed via trace rcu, cgroup_bpf.effective
> > should be also trace-rcu-managed + maybe some other changes that
> > I'm not aware of.
> >
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >  arch/x86/net/bpf_jit_comp.c     |  22 ++--
> >  include/linux/bpf-cgroup-defs.h |   6 ++
> >  include/linux/bpf-cgroup.h      |   7 ++
> >  include/linux/bpf.h             |  15 +++
> >  include/linux/bpf_lsm.h         |  14 +++
> >  include/uapi/linux/bpf.h        |   1 +
> >  kernel/bpf/bpf_lsm.c            |  64 ++++++++++++
> >  kernel/bpf/btf.c                |  11 ++
> >  kernel/bpf/cgroup.c             | 179 +++++++++++++++++++++++++++++---
> >  kernel/bpf/syscall.c            |  10 ++
> >  kernel/bpf/trampoline.c         | 161 ++++++++++++++++++++++++++++
> >  kernel/bpf/verifier.c           |  32 ++++++
> >  tools/include/uapi/linux/bpf.h  |   1 +
> >  13 files changed, 503 insertions(+), 20 deletions(-)
> >
>
> [...]
>
> > @@ -3474,6 +3476,11 @@ static int bpf_prog_attach(const union bpf_attr *attr)
> >         case BPF_PROG_TYPE_CGROUP_SOCKOPT:
> >         case BPF_PROG_TYPE_CGROUP_SYSCTL:
> >         case BPF_PROG_TYPE_SOCK_OPS:
> > +       case BPF_PROG_TYPE_LSM:
> > +               if (ptype == BPF_PROG_TYPE_LSM &&
> > +                   prog->expected_attach_type != BPF_LSM_CGROUP)
> > +                       return -EINVAL;
> > +
>
> Is it a hard requirement to support non-bpf_link attach for these BPF
> trampoline-backed programs? Can we keep it bpf_link-only and use
> LINK_CREATE for attachment? That way we won't need to extend query
> command and instead add new field to bpf_link_info?

I didn't think it was an option :-) So if the non-link-based APIs are
deprecated, I'll drop them from the patch series.


* Re: [PATCH bpf-next v6 03/10] bpf: per-cgroup lsm flavor
  2022-05-06 23:02   ` Martin KaFai Lau
@ 2022-05-09 23:38     ` Stanislav Fomichev
  2022-05-10  7:13       ` Martin KaFai Lau
  0 siblings, 1 reply; 33+ messages in thread
From: Stanislav Fomichev @ 2022-05-09 23:38 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: netdev, bpf, ast, daniel, andrii

On Fri, May 6, 2022 at 4:02 PM Martin KaFai Lau <kafai@fb.com> wrote:
>
> On Fri, Apr 29, 2022 at 02:15:33PM -0700, Stanislav Fomichev wrote:
> > diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
> > index 064eccba641d..a0e68ef5dfb1 100644
> > --- a/kernel/bpf/bpf_lsm.c
> > +++ b/kernel/bpf/bpf_lsm.c
> > @@ -16,6 +16,7 @@
> >  #include <linux/bpf_local_storage.h>
> >  #include <linux/btf_ids.h>
> >  #include <linux/ima.h>
> > +#include <linux/bpf-cgroup.h>
> >
> >  /* For every LSM hook that allows attachment of BPF programs, declare a nop
> >   * function where a BPF program can be attached.
> > @@ -35,6 +36,66 @@ BTF_SET_START(bpf_lsm_hooks)
> >  #undef LSM_HOOK
> >  BTF_SET_END(bpf_lsm_hooks)
> >
> > +/* List of LSM hooks that should operate on 'current' cgroup regardless
> > + * of function signature.
> > + */
> > +BTF_SET_START(bpf_lsm_current_hooks)
> > +/* operate on freshly allocated sk without any cgroup association */
> > +BTF_ID(func, bpf_lsm_sk_alloc_security)
> > +BTF_ID(func, bpf_lsm_sk_free_security)
> > +BTF_SET_END(bpf_lsm_current_hooks)
> > +
> > +int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
> > +                          bpf_func_t *bpf_func)
> > +{
> > +     const struct btf_type *first_arg_type;
> > +     const struct btf_type *sock_type;
> > +     const struct btf_type *sk_type;
> > +     const struct btf *btf_vmlinux;
> > +     const struct btf_param *args;
> > +     s32 type_id;
> > +
> > +     if (!prog->aux->attach_func_proto ||
> > +         !btf_type_is_func_proto(prog->aux->attach_func_proto))
> Also remove these tests at attach time.  They should have
> already been checked at load time.

Ack, makes sense!

> > +             return -EINVAL;
> > +
> > +     if (btf_type_vlen(prog->aux->attach_func_proto) < 1 ||
> > +         btf_id_set_contains(&bpf_lsm_current_hooks,
> > +                             prog->aux->attach_btf_id)) {
> > +             *bpf_func = __cgroup_bpf_run_lsm_current;
> > +             return 0;
> > +     }
> > +
> > +     args = btf_params(prog->aux->attach_func_proto);
> > +
> > +     btf_vmlinux = bpf_get_btf_vmlinux();
> > +
> > +     type_id = btf_find_by_name_kind(btf_vmlinux, "socket", BTF_KIND_STRUCT);
> > +     if (type_id < 0)
> > +             return -EINVAL;
> > +     sock_type = btf_type_by_id(btf_vmlinux, type_id);
> > +
> > +     type_id = btf_find_by_name_kind(btf_vmlinux, "sock", BTF_KIND_STRUCT);
> > +     if (type_id < 0)
> > +             return -EINVAL;
> > +     sk_type = btf_type_by_id(btf_vmlinux, type_id);
> Although any practical kconfig should have CONFIG_NET, I'm not sure btf
> has "socket" and "sock" in some unusual kconfig setups.  If it does not,
> this will return an error too early for the other hooks.
>
> Maybe put them under "#ifdef CONFIG_NET"?  While adding the CONFIG_NET
> guard, it will be easier to just use btf_sock_ids[].  "socket" needs to
> be added to btf_sock_ids[].  Take a look at btf_ids.h and filter.c.

That looks better, thx!
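
i.e. something like (untested sketch; "socket" would need a new
BTF_SOCK_TYPE_SOCKET entry in btf_sock_ids[]):

	#ifdef CONFIG_NET
		sock_type = btf_type_by_id(btf_vmlinux,
					   btf_sock_ids[BTF_SOCK_TYPE_SOCKET]);
		sk_type = btf_type_by_id(btf_vmlinux,
					 btf_sock_ids[BTF_SOCK_TYPE_SOCK]);
	#endif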

> > +
> > +     first_arg_type = btf_type_resolve_ptr(btf_vmlinux, args[0].type, NULL);
> > +     if (first_arg_type == sock_type)
> > +             *bpf_func = __cgroup_bpf_run_lsm_socket;
> > +     else if (first_arg_type == sk_type)
> > +             *bpf_func = __cgroup_bpf_run_lsm_sock;
> > +     else
> > +             *bpf_func = __cgroup_bpf_run_lsm_current;
> > +
> > +     return 0;
> > +}
>
> [ ... ]
>
> > diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> > index 134785ab487c..9cc38454e402 100644
> > --- a/kernel/bpf/cgroup.c
> > +++ b/kernel/bpf/cgroup.c
> > @@ -14,6 +14,9 @@
> >  #include <linux/string.h>
> >  #include <linux/bpf.h>
> >  #include <linux/bpf-cgroup.h>
> > +#include <linux/btf_ids.h>
> > +#include <linux/bpf_lsm.h>
> > +#include <linux/bpf_verifier.h>
> >  #include <net/sock.h>
> >  #include <net/bpf_sk_storage.h>
> >
> > @@ -61,6 +64,85 @@ bpf_prog_run_array_cg(const struct cgroup_bpf *cgrp,
> >       return run_ctx.retval;
> >  }
> >
> > +unsigned int __cgroup_bpf_run_lsm_sock(const void *ctx,
> > +                                    const struct bpf_insn *insn)
> > +{
> > +     const struct bpf_prog *shim_prog;
> > +     struct sock *sk;
> > +     struct cgroup *cgrp;
> > +     int ret = 0;
> > +     u64 *regs;
> > +
> > +     regs = (u64 *)ctx;
> > +     sk = (void *)(unsigned long)regs[BPF_REG_0];
> > +     /*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
> > +     shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
> > +
> > +     cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
> > +     if (likely(cgrp))
> > +             ret = bpf_prog_run_array_cg(&cgrp->bpf,
> > +                                         shim_prog->aux->cgroup_atype,
> > +                                         ctx, bpf_prog_run, 0, NULL);
> > +     return ret;
> > +}
> > +
> > +unsigned int __cgroup_bpf_run_lsm_socket(const void *ctx,
> > +                                      const struct bpf_insn *insn)
> > +{
> > +     const struct bpf_prog *shim_prog;
> > +     struct socket *sock;
> > +     struct cgroup *cgrp;
> > +     int ret = 0;
> > +     u64 *regs;
> > +
> > +     regs = (u64 *)ctx;
> > +     sock = (void *)(unsigned long)regs[BPF_REG_0];
> > +     /*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
> > +     shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
> > +
> > +     cgrp = sock_cgroup_ptr(&sock->sk->sk_cgrp_data);
> > +     if (likely(cgrp))
> > +             ret = bpf_prog_run_array_cg(&cgrp->bpf,
> > +                                         shim_prog->aux->cgroup_atype,
> > +                                         ctx, bpf_prog_run, 0, NULL);
> > +     return ret;
> > +}
> > +
> > +unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
> > +                                       const struct bpf_insn *insn)
> > +{
> > +     const struct bpf_prog *shim_prog;
> > +     struct cgroup *cgrp;
> > +     int ret = 0;
> From lsm_hook_defs.h, there are some default return values that are not 0.
> Is it ok to always return 0 in cases like when the cgroup array is empty?

That's a good point; I hadn't thought about it. You're right, it
seems like attaching to these hooks for some LSMs will change the
default from some error to zero.
Let's start by prohibiting those hooks for now? I guess, in theory,
when we generate a trampoline, we can pass this default value as an
input arg to these new __cgroup_bpf_run_lsm_xxx helpers (in the
future)?

Another thing that seems to be related: there are a bunch of hooks
that return void, so returning EPERM from the cgroup programs won't
work as expected.
I can probably record, at verification time, whether lsm_cgroup
programs return any "non-success" return codes and prohibit attaching
these progs to the void hooks?
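
For reference, lsm_hook_defs.h spells out both the per-hook default
return value and the void hooks, e.g.:

	LSM_HOOK(int, 0, socket_post_create, struct socket *sock, int family,
		 int type, int protocol, int kern)
	LSM_HOOK(int, -EOPNOTSUPP, inode_copy_up_xattr, const char *name)
	LSM_HOOK(void, LSM_RET_VOID, sk_free_security, struct sock *sk)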

> > +
> > +     if (unlikely(!current))
> > +             return 0;
> > +
> > +     /*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
> > +     shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
> > +
> > +     rcu_read_lock();
> > +     cgrp = task_dfl_cgroup(current);
> > +     if (likely(cgrp))
> > +             ret = bpf_prog_run_array_cg(&cgrp->bpf,
> > +                                         shim_prog->aux->cgroup_atype,
> > +                                         ctx, bpf_prog_run, 0, NULL);
> > +     rcu_read_unlock();
> > +     return ret;
> > +}
>
> [ ... ]
>
> > @@ -479,6 +572,7 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
> >       struct bpf_prog *old_prog = NULL;
> >       struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE] = {};
> >       struct bpf_cgroup_storage *new_storage[MAX_BPF_CGROUP_STORAGE_TYPE] = {};
> > +     struct bpf_attach_target_info tgt_info = {};
> >       enum cgroup_bpf_attach_type atype;
> >       struct bpf_prog_list *pl;
> >       struct hlist_head *progs;
> > @@ -495,9 +589,34 @@ static int __cgroup_bpf_attach(struct cgroup *cgrp,
> >               /* replace_prog implies BPF_F_REPLACE, and vice versa */
> >               return -EINVAL;
> >
> > -     atype = to_cgroup_bpf_attach_type(type);
> > -     if (atype < 0)
> > -             return -EINVAL;
> > +     if (type == BPF_LSM_CGROUP) {
> > +             struct bpf_prog *p = prog ? : link->link.prog;
> nit. This "prog ? ...." has been done multiple times (new and existing codes)
> in this function.  may be do it once at the very beginning.

Yeah, but we need to do it in some clear way. Having p, prog and link
is probably more confusing? I'll try:

struct bpf_prog *new_prog = prog ? : link->link.prog;

> > +
> > +             if (replace_prog) {
> > +                     /* Reusing shim from the original program. */
> > +                     if (replace_prog->aux->attach_btf_id !=
> > +                         p->aux->attach_btf_id)
> > +                             return -EINVAL;
> > +
> > +                     atype = replace_prog->aux->cgroup_atype;
> > +             } else {
> > +                     err = bpf_check_attach_target(NULL, p, NULL,
> > +                                                   p->aux->attach_btf_id,
> > +                                                   &tgt_info);
> > +                     if (err)
> > +                             return -EINVAL;
> This should be "return err;" to propagate the error from bpf_check_attach_target().
>
> > +
> > +                     atype = bpf_lsm_attach_type_get(p->aux->attach_btf_id);
> > +                     if (atype < 0)
> > +                             return atype;
> > +             }
> > +
> > +             p->aux->cgroup_atype = atype;
> > +     } else {
> > +             atype = to_cgroup_bpf_attach_type(type);
> > +             if (atype < 0)
> > +                     return -EINVAL;
> > +     }
> >
> >       progs = &cgrp->bpf.progs[atype];
>
> [ ... ]
>
> > diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
> > index 0c4fd194e801..77dfa517c47c 100644
> > --- a/kernel/bpf/trampoline.c
> > +++ b/kernel/bpf/trampoline.c
> > @@ -11,6 +11,8 @@
> >  #include <linux/rcupdate_wait.h>
> >  #include <linux/module.h>
> >  #include <linux/static_call.h>
> > +#include <linux/bpf_verifier.h>
> > +#include <linux/bpf_lsm.h>
> >
> >  /* dummy _ops. The verifier will operate on target program's ops. */
> >  const struct bpf_verifier_ops bpf_extension_verifier_ops = {
> > @@ -485,6 +487,147 @@ int bpf_trampoline_unlink_prog(struct bpf_prog *prog, struct bpf_trampoline *tr)
> >       return err;
> >  }
> >
> > +#if defined(CONFIG_BPF_JIT) && defined(CONFIG_BPF_SYSCALL)
> > +static struct bpf_prog *cgroup_shim_alloc(const struct bpf_prog *prog,
> > +                                       bpf_func_t bpf_func)
> > +{
> > +     struct bpf_prog *p;
> > +
> > +     p = bpf_prog_alloc(1, 0);
> > +     if (!p)
> > +             return NULL;
> > +
> > +     p->jited = false;
> > +     p->bpf_func = bpf_func;
> > +
> > +     p->aux->cgroup_atype = prog->aux->cgroup_atype;
> > +     p->aux->attach_func_proto = prog->aux->attach_func_proto;
> > +     p->aux->attach_btf_id = prog->aux->attach_btf_id;
> > +     p->aux->attach_btf = prog->aux->attach_btf;
> > +     btf_get(p->aux->attach_btf);
> > +     p->type = BPF_PROG_TYPE_LSM;
> > +     p->expected_attach_type = BPF_LSM_MAC;
> > +     bpf_prog_inc(p);
> > +
> > +     return p;
> > +}
> > +
> > +static struct bpf_prog *cgroup_shim_find(struct bpf_trampoline *tr,
> > +                                      bpf_func_t bpf_func)
> > +{
> > +     const struct bpf_prog_aux *aux;
> > +     int kind;
> > +
> > +     for (kind = 0; kind < BPF_TRAMP_MAX; kind++) {
> > +             hlist_for_each_entry(aux, &tr->progs_hlist[kind], tramp_hlist) {
> > +                     struct bpf_prog *p = aux->prog;
> > +
> > +                     if (p->bpf_func == bpf_func)
> > +                             return p;
> > +             }
> > +     }
> > +
> > +     return NULL;
> > +}
> > +
> > +int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
> > +                                 struct bpf_attach_target_info *tgt_info)
> > +{
> > +     struct bpf_prog *shim_prog = NULL;
> > +     struct bpf_trampoline *tr;
> > +     bpf_func_t bpf_func;
> > +     u64 key;
> > +     int err;
> > +
> > +     key = bpf_trampoline_compute_key(NULL, prog->aux->attach_btf,
> > +                                      prog->aux->attach_btf_id);
> > +
> > +     err = bpf_lsm_find_cgroup_shim(prog, &bpf_func);
> > +     if (err)
> > +             return err;
> > +
> > +     tr = bpf_trampoline_get(key, tgt_info);
> > +     if (!tr)
> > +             return  -ENOMEM;
> > +
> > +     mutex_lock(&tr->mutex);
> > +
> > +     shim_prog = cgroup_shim_find(tr, bpf_func);
> > +     if (shim_prog) {
> > +             /* Reusing existing shim attached by the other program. */
> > +             bpf_prog_inc(shim_prog);
> > +             mutex_unlock(&tr->mutex);
> > +             return 0;
> > +     }
> > +
> > +     /* Allocate and install new shim. */
> > +
> > +     shim_prog = cgroup_shim_alloc(prog, bpf_func);
> > +     if (!shim_prog) {
> > +             err = -ENOMEM;
> > +             goto out;
> > +     }
> > +
> > +     err = __bpf_trampoline_link_prog(shim_prog, tr);
> > +     if (err)
> > +             goto out;
> > +
> > +     mutex_unlock(&tr->mutex);
> > +
> > +     return 0;
> > +out:
> > +     if (shim_prog)
> > +             bpf_prog_put(shim_prog);
> > +
> bpf_trampoline_get() was done earlier.
> Does it need to call bpf_trampoline_put(tr) in the error cases?
> More on the tr refcnt later.
>
> > +     mutex_unlock(&tr->mutex);
> > +     return err;
> > +}
> > +
> > +void bpf_trampoline_unlink_cgroup_shim(struct bpf_prog *prog)
> > +{
> > +     struct bpf_prog *shim_prog;
> > +     struct bpf_trampoline *tr;
> > +     bpf_func_t bpf_func;
> > +     u64 key;
> > +     int err;
> > +
> > +     key = bpf_trampoline_compute_key(NULL, prog->aux->attach_btf,
> > +                                      prog->aux->attach_btf_id);
> > +
> > +     err = bpf_lsm_find_cgroup_shim(prog, &bpf_func);
> > +     if (err)
> > +             return;
> > +
> > +     tr = bpf_trampoline_lookup(key);
> > +     if (!tr)
> > +             return;
> > +
> > +     mutex_lock(&tr->mutex);
> > +
> > +     shim_prog = cgroup_shim_find(tr, bpf_func);
> > +     if (shim_prog) {
> > +             /* We use shim_prog refcnt for tracking whether to
> > +              * remove the shim program from the trampoline.
> > +              * Trampoline's mutex is held while refcnt is
> > +              * added/subtracted so we don't need to care about
> > +              * potential races.
> > +              */
> > +
> > +             if (atomic64_read(&shim_prog->aux->refcnt) == 1)
> > +                     WARN_ON_ONCE(__bpf_trampoline_unlink_prog(shim_prog, tr));
> > +
> > +             bpf_prog_put(shim_prog);
> > +     }
> > +
> > +     mutex_unlock(&tr->mutex);
> > +
> > +     bpf_trampoline_put(tr); /* bpf_trampoline_lookup */
> > +
> > +     if (shim_prog)
> > +             bpf_trampoline_put(tr);
> While looking at the bpf_trampoline_{link,unlink}_cgroup_shim() pair again:
> which prog (the cgroup lsm progs or the shim_prog) actually owns the tr?
> It should be the shim_prog?
>
> How about storing tr in "shim_prog->aux->dst_trampoline = tr;"?
> Then the earlier bpf_prog_put(shim_prog) in this function will take care
> of the bpf_trampoline_put(shim_prog->aux->dst_trampoline),
> instead of each cgroup lsm prog owning a refcnt on the tr,
> and this individual bpf_trampoline_put(tr) here can also
> go away.

Yeah, that seems like a much better way to handle it; I didn't know
about dst_trampoline.

> > +}
> > +#endif
> > +
> >  struct bpf_trampoline *bpf_trampoline_get(u64 key,
> >                                         struct bpf_attach_target_info *tgt_info)
> >  {
> > @@ -609,6 +752,24 @@ void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start)
> >       rcu_read_unlock();
> >  }
> >
> > +u64 notrace __bpf_prog_enter_lsm_cgroup(struct bpf_prog *prog)
> > +     __acquires(RCU)
> > +{
> > +     /* Runtime stats are exported via actual BPF_LSM_CGROUP
> > +      * programs, not the shims.
> > +      */
> > +     rcu_read_lock();
> > +     migrate_disable();
> > +     return NO_START_TIME;
> > +}
> > +
> > +void notrace __bpf_prog_exit_lsm_cgroup(struct bpf_prog *prog, u64 start)
> > +     __releases(RCU)
> > +{
> > +     migrate_enable();
> > +     rcu_read_unlock();
> > +}
> > +
> >  u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog)
> >  {
> >       rcu_read_lock_trace();
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 813f6ee80419..99703d96c579 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -14455,6 +14455,7 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
> >               fallthrough;
> >       case BPF_MODIFY_RETURN:
> >       case BPF_LSM_MAC:
> > +     case BPF_LSM_CGROUP:
> >       case BPF_TRACE_FENTRY:
> >       case BPF_TRACE_FEXIT:
> >               if (!btf_type_is_func(t)) {
> > @@ -14633,6 +14634,33 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
> >       return 0;
> >  }
> >
> > +static int check_used_maps(struct bpf_verifier_env *env)
> > +{
> > +     int i;
> > +
> > +     for (i = 0; i < env->used_map_cnt; i++) {
> > +             struct bpf_map *map = env->used_maps[i];
> > +
> > +             switch (env->prog->expected_attach_type) {
> > +             case BPF_LSM_CGROUP:
> > +                     if (map->map_type != BPF_MAP_TYPE_CGROUP_STORAGE &&
> > +                         map->map_type != BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE)
> > +                             break;
> > +
> > +                     if (map->key_size != sizeof(__u64)) {
> Following up on my very last comment in v5: I think we should not
> limit the bpf_cgroup_storage_key and should allow using both
> cgroup_inode_id and attach_type as the key, to avoid an inconsistent
> surprise (some keys not being allowed for a specific attach_type);
> this check_used_maps() function can then also be avoided.

I should've replied to your original mail: it felt a bit confusing to
me. But I guess we can enable this per-attach_type mode and let all
lsm_cgroup programs use the same underlying storage. As you mentioned,
we can later extend it with attach_btf_id if needed. Will remove this
new check_used_maps.


* Re: [PATCH bpf-next v6 10/10] selftests/bpf: verify lsm_cgroup struct sock access
  2022-05-09 23:38     ` Stanislav Fomichev
@ 2022-05-09 23:43       ` Andrii Nakryiko
  2022-05-10 17:31         ` Stanislav Fomichev
  0 siblings, 1 reply; 33+ messages in thread
From: Andrii Nakryiko @ 2022-05-09 23:43 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko

On Mon, May 9, 2022 at 4:38 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> On Mon, May 9, 2022 at 2:54 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Fri, Apr 29, 2022 at 2:16 PM Stanislav Fomichev <sdf@google.com> wrote:
> > >
> > > sk_priority & sk_mark are writable, the rest is readonly.
> > >
> > > Add new ldx_offset fixups to lookup the offset of struct field.
> > > Allow using test.kfunc regardless of prog_type.
> > >
> > > One interesting thing here is that the verifier doesn't
> > > really force me to add NULL checks anywhere :-/
> > >
> > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > ---
> > >  tools/testing/selftests/bpf/test_verifier.c   | 54 ++++++++++++++++++-
> > >  .../selftests/bpf/verifier/lsm_cgroup.c       | 34 ++++++++++++
> > >  2 files changed, 87 insertions(+), 1 deletion(-)
> > >  create mode 100644 tools/testing/selftests/bpf/verifier/lsm_cgroup.c
> > >
> >
> > [...]
> >
> > > diff --git a/tools/testing/selftests/bpf/verifier/lsm_cgroup.c b/tools/testing/selftests/bpf/verifier/lsm_cgroup.c
> > > new file mode 100644
> > > index 000000000000..af0efe783511
> > > --- /dev/null
> > > +++ b/tools/testing/selftests/bpf/verifier/lsm_cgroup.c
> > > @@ -0,0 +1,34 @@
> > > +#define SK_WRITABLE_FIELD(tp, field, size, res) \
> > > +{ \
> > > +       .descr = field, \
> > > +       .insns = { \
> > > +               /* r1 = *(u64 *)(r1 + 0) */ \
> > > +               BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, 0), \
> > > +               /* r1 = *(u64 *)(r1 + offsetof(struct socket, sk)) */ \
> > > +               BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, 0), \
> > > +               /* r2 = *(u64 *)(r1 + offsetof(struct sock, <field>)) */ \
> > > +               BPF_LDX_MEM(size, BPF_REG_2, BPF_REG_1, 0), \
> > > +               /* *(u64 *)(r1 + offsetof(struct sock, <field>)) = r2 */ \
> > > +               BPF_STX_MEM(size, BPF_REG_1, BPF_REG_2, 0), \
> > > +               BPF_MOV64_IMM(BPF_REG_0, 1), \
> > > +               BPF_EXIT_INSN(), \
> > > +       }, \
> > > +       .result = res, \
> > > +       .errstr = res ? "no write support to 'struct sock' at off" : "", \
> > > +       .prog_type = BPF_PROG_TYPE_LSM, \
> > > +       .expected_attach_type = BPF_LSM_CGROUP, \
> > > +       .kfunc = "socket_post_create", \
> > > +       .fixup_ldx = { \
> > > +               { "socket", "sk", 1 }, \
> > > +               { tp, field, 2 }, \
> > > +               { tp, field, 3 }, \
> > > +       }, \
> > > +}
> > > +
> > > +SK_WRITABLE_FIELD("sock_common", "skc_family", BPF_H, REJECT),
> > > +SK_WRITABLE_FIELD("sock", "sk_sndtimeo", BPF_DW, REJECT),
> > > +SK_WRITABLE_FIELD("sock", "sk_priority", BPF_W, ACCEPT),
> > > +SK_WRITABLE_FIELD("sock", "sk_mark", BPF_W, ACCEPT),
> > > +SK_WRITABLE_FIELD("sock", "sk_pacing_rate", BPF_DW, REJECT),
> > > +
> >
> > have you tried writing it as a C program and adding the test to
> > test_progs? Does something not work there?
>
> Seems like it should work; I don't see any issues with writing 5
> programs to test each field.
> But test_verifier still feels like a better fit? Any reason in
> particular you'd prefer test_progs over test_verifier?

Adding that fixup_ldx->strct special handling didn't feel like the
best fit, tbh. test_progs is generally much nicer to deal with in
terms of CI and in terms of comprehending what's going on and
supporting the code longer term.


* Re: [PATCH bpf-next v6 04/10] bpf: minimize number of allocated lsm slots per program
  2022-04-29 21:15 ` [PATCH bpf-next v6 04/10] bpf: minimize number of allocated lsm slots per program Stanislav Fomichev
@ 2022-05-10  5:05   ` Alexei Starovoitov
  2022-05-10 17:31     ` sdf
  0 siblings, 1 reply; 33+ messages in thread
From: Alexei Starovoitov @ 2022-05-10  5:05 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii

On Fri, Apr 29, 2022 at 02:15:34PM -0700, Stanislav Fomichev wrote:
> Previous patch adds 1:1 mapping between all 211 LSM hooks
> and bpf_cgroup program array. Instead of reserving a slot per
> possible hook, reserve 10 slots per cgroup for lsm programs.
> Those slots are dynamically allocated on demand and reclaimed.
> 
> It should be possible to eventually extend this idea to all hooks if
> the memory consumption is unacceptable and shrink overall effective
> programs array.
> 
> struct cgroup_bpf {
> 	struct bpf_prog_array *    effective[33];        /*     0   264 */
> 	/* --- cacheline 4 boundary (256 bytes) was 8 bytes ago --- */
> 	struct hlist_head          progs[33];            /*   264   264 */
> 	/* --- cacheline 8 boundary (512 bytes) was 16 bytes ago --- */
> 	u8                         flags[33];            /*   528    33 */
> 
> 	/* XXX 7 bytes hole, try to pack */
> 
> 	struct list_head           storages;             /*   568    16 */
> 	/* --- cacheline 9 boundary (576 bytes) was 8 bytes ago --- */
> 	struct bpf_prog_array *    inactive;             /*   584     8 */
> 	struct percpu_ref          refcnt;               /*   592    16 */
> 	struct work_struct         release_work;         /*   608    72 */
> 
> 	/* size: 680, cachelines: 11, members: 7 */
> 	/* sum members: 673, holes: 1, sum holes: 7 */
> 	/* last cacheline: 40 bytes */
> };
> 
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>  include/linux/bpf-cgroup-defs.h |   3 +-
>  include/linux/bpf_lsm.h         |   6 --
>  kernel/bpf/bpf_lsm.c            |   5 --
>  kernel/bpf/cgroup.c             | 107 +++++++++++++++++++++++++++-----
>  4 files changed, 94 insertions(+), 27 deletions(-)
> 
> diff --git a/include/linux/bpf-cgroup-defs.h b/include/linux/bpf-cgroup-defs.h
> index d5a70a35dace..359d3f16abea 100644
> --- a/include/linux/bpf-cgroup-defs.h
> +++ b/include/linux/bpf-cgroup-defs.h
> @@ -10,7 +10,8 @@
>  
>  struct bpf_prog_array;
>  
> -#define CGROUP_LSM_NUM 211 /* will be addressed in the next patch */
> +/* Maximum number of concurrently attachable per-cgroup LSM hooks. */
> +#define CGROUP_LSM_NUM 10
>  
>  enum cgroup_bpf_attach_type {
>  	CGROUP_BPF_ATTACH_TYPE_INVALID = -1,
> diff --git a/include/linux/bpf_lsm.h b/include/linux/bpf_lsm.h
> index 7f0e59f5f9be..613de44aa429 100644
> --- a/include/linux/bpf_lsm.h
> +++ b/include/linux/bpf_lsm.h
> @@ -43,7 +43,6 @@ extern const struct bpf_func_proto bpf_inode_storage_delete_proto;
>  void bpf_inode_storage_free(struct inode *inode);
>  
>  int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog, bpf_func_t *bpf_func);
> -int bpf_lsm_hook_idx(u32 btf_id);
>  
>  #else /* !CONFIG_BPF_LSM */
>  
> @@ -74,11 +73,6 @@ static inline int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
>  	return -ENOENT;
>  }
>  
> -static inline int bpf_lsm_hook_idx(u32 btf_id)
> -{
> -	return -EINVAL;
> -}
> -
>  #endif /* CONFIG_BPF_LSM */
>  
>  #endif /* _LINUX_BPF_LSM_H */
> diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
> index a0e68ef5dfb1..1079c747e061 100644
> --- a/kernel/bpf/bpf_lsm.c
> +++ b/kernel/bpf/bpf_lsm.c
> @@ -91,11 +91,6 @@ int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
>  	return 0;
>  }
>  
> -int bpf_lsm_hook_idx(u32 btf_id)
> -{
> -	return btf_id_set_index(&bpf_lsm_hooks, btf_id);
> -}
> -
>  int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
>  			const struct bpf_prog *prog)
>  {
> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> index 9cc38454e402..787ff6cf8d42 100644
> --- a/kernel/bpf/cgroup.c
> +++ b/kernel/bpf/cgroup.c
> @@ -79,10 +79,13 @@ unsigned int __cgroup_bpf_run_lsm_sock(const void *ctx,
>  	shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
>  
>  	cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
> -	if (likely(cgrp))
> +	if (likely(cgrp)) {
> +		rcu_read_lock(); /* See bpf_lsm_attach_type_get(). */

I've looked at bpf_lsm_attach_type_get/put, but still don't get it :)
shim_prog->aux->cgroup_atype stays the same for the life of shim_prog.
atype_usecnt will go up and down, but atype_usecnt == 0 is the only
interesting one from the pov of selecting atype in _get().
And there shim_prog will be detached and trampoline destroyed.
The shim_prog->aux->cgroup_atype deref below cannot be happening on
freed shim_prog.
So what is the point of this critical section and sync_rcu() ?
It seems none of it is necessary.

> It should be possible to eventually extend this idea to all hooks if
> the memory consumption is unacceptable, shrinking the overall effective
> programs array.

Having BPF_LSM_CGROUP do atype differently looks too special.
Why not do this generalization right now?
Do atype_get for all cgroup hooks and get rid of to_cgroup_bpf_attach_type?
Combine the ranges of attach_btf_id for lsm_cgroup and enum bpf_attach_type
for traditional cgroup hooks into a single _get() method that returns a slot
in the effective[] array?
attach/detach/query APIs won't notice this internal implementation detail.
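Roughly, as a sketch (every name below is made up):

/* one allocator for all cgroup hooks: traditional hooks keyed by
 * their enum bpf_attach_type, lsm_cgroup hooks by attach_btf_id,
 * both getting a slot in the effective[] array on demand.
 */
static int cgroup_atype_get(u64 key)
{
	int i, free = -1;

	for (i = 0; i < CGROUP_ATYPE_MAX; i++) {
		if (cgroup_atype_key[i] == key && cgroup_atype_usecnt[i]) {
			cgroup_atype_usecnt[i]++;
			return i;	/* existing slot for this hook */
		}
		if (free < 0 && !cgroup_atype_usecnt[i])
			free = i;
	}
	if (free < 0)
		return -E2BIG;	/* all slots busy */
	cgroup_atype_key[free] = key;
	cgroup_atype_usecnt[free] = 1;
	return free;		/* index into cgroup_bpf->effective[] */
}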

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v6 03/10] bpf: per-cgroup lsm flavor
  2022-05-09 23:38     ` Stanislav Fomichev
@ 2022-05-10  7:13       ` Martin KaFai Lau
  2022-05-10 17:30         ` Stanislav Fomichev
  0 siblings, 1 reply; 33+ messages in thread
From: Martin KaFai Lau @ 2022-05-10  7:13 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii

On Mon, May 09, 2022 at 04:38:36PM -0700, Stanislav Fomichev wrote:
> > > +unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
> > > +                                       const struct bpf_insn *insn)
> > > +{
> > > +     const struct bpf_prog *shim_prog;
> > > +     struct cgroup *cgrp;
> > > +     int ret = 0;
> > From lsm_hook_defs.h, there are some default return values that are not 0.
> > Is it ok to always return 0 in cases like the cgroup array is empty ?
> 
> That's a good point, I haven't thought about it. You're right, it
> seems like attaching to this hook for some LSMs will change the
> default from some error to zero.
> Let's start by prohibiting those hooks for now? I guess in theory,
> when we generate a trampoline, we can put this default value as an
> input arg to these new __cgroup_bpf_run_lsm_xxx helpers (in the
> future)?
After looking at arch_prepare_bpf_trampoline, return 0 here should be fine.
If I read it correctly, when the shim_prog returns 0, the trampoline
will call the original kernel function which is the bpf_lsm_##NAME()
defined in bpf_lsm.c and it will then return the zero/-ve DEFAULT.
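For reference, those defaults come from the stubs bpf_lsm.c generates
for every hook, roughly:

/* one default-returning stub per hook; this is what the trampoline
 * falls through to when the shim_prog returns 0.
 */
#define LSM_HOOK(RET, DEFAULT, NAME, ...)	\
noinline RET bpf_lsm_##NAME(__VA_ARGS__)	\
{						\
	return DEFAULT;				\
}

#include <linux/lsm_hook_defs.h>
#undef LSM_HOOK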

> 
> Another thing that seems to be related: there are a bunch of hooks
> that return void, so returning EPERM from the cgroup programs won't
> work as expected.
> I can probably record, at verification time, whether lsm_cgroup
> programs return any "non-success" return codes and prohibit attaching
> these progs to the void hooks?
hmm...yeah, BPF_LSM_CGROUP can be enforced to return either 0 or 1 as
most other cgroup-progs do.

Do you have a use case that needs to return something other than -EPERM ?

> 
> > > +
> > > +     if (unlikely(!current))
> > > +             return 0;
> > > +
> > > +     /*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
> > > +     shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
> > > +
> > > +     rcu_read_lock();
> > > +     cgrp = task_dfl_cgroup(current);
> > > +     if (likely(cgrp))
> > > +             ret = bpf_prog_run_array_cg(&cgrp->bpf,
> > > +                                         shim_prog->aux->cgroup_atype,
> > > +                                         ctx, bpf_prog_run, 0, NULL);
> > > +     rcu_read_unlock();
> > > +     return ret;
> > > +}

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v6 03/10] bpf: per-cgroup lsm flavor
  2022-05-10  7:13       ` Martin KaFai Lau
@ 2022-05-10 17:30         ` Stanislav Fomichev
  2022-05-10 19:18           ` Martin KaFai Lau
  0 siblings, 1 reply; 33+ messages in thread
From: Stanislav Fomichev @ 2022-05-10 17:30 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: netdev, bpf, ast, daniel, andrii

On Tue, May 10, 2022 at 12:13 AM Martin KaFai Lau <kafai@fb.com> wrote:
>
> On Mon, May 09, 2022 at 04:38:36PM -0700, Stanislav Fomichev wrote:
> > > > +unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
> > > > +                                       const struct bpf_insn *insn)
> > > > +{
> > > > +     const struct bpf_prog *shim_prog;
> > > > +     struct cgroup *cgrp;
> > > > +     int ret = 0;
> > > From lsm_hook_defs.h, there are some default return values that are not 0.
> > > Is it ok to always return 0 in cases like the cgroup array is empty ?
> >
> > That's a good point, I haven't thought about it. You're right, it
> > seems like attaching to this hook for some LSMs will change the
> > default from some error to zero.
> > Let's start by prohibiting those hooks for now? I guess in theory,
> > when we generate a trampoline, we can put this default value as an
> > input arg to these new __cgroup_bpf_run_lsm_xxx helpers (in the
> > future)?
> After looking at arch_prepare_bpf_trampoline, return 0 here should be fine.
> If I read it correctly, when the shim_prog returns 0, the trampoline
> will call the original kernel function which is the bpf_lsm_##NAME()
> defined in bpf_lsm.c and it will then return the zero/-ve DEFAULT.

Not sure I read the same :-/ I'm assuming that for those cases we
actually end up generating fmod_ret trampoline which seems to be
unconditionally saving r0 into fp-8 ?

> > Another thing that seems to be related: there are a bunch of hooks
> > that return void, so returning EPERM from the cgroup programs won't
> > work as expected.
> > I can probably record, at verification time, whether lsm_cgroup
> > programs return any "non-success" return codes and prohibit attaching
> > these progs to the void hooks?
> hmm...yeah, BPF_LSM_CGROUP can be enforced to return either 0 or 1 as
> most other cgroup-progs do.
>
> Do you have a use case that needs to return something other than -EPERM ?

We do already enforce 0/1 for cgroup progs (and we have helpers to
expose custom errno). What I want to avoid is letting users attach
programs that try to return the error for the void hooks. And it seems
like we can record that return range for a particular cgroup program and
verify it at attach time, WDYT?

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v6 10/10] selftests/bpf: verify lsm_cgroup struct sock access
  2022-05-09 23:43       ` Andrii Nakryiko
@ 2022-05-10 17:31         ` Stanislav Fomichev
  2022-05-12  3:37           ` Andrii Nakryiko
  0 siblings, 1 reply; 33+ messages in thread
From: Stanislav Fomichev @ 2022-05-10 17:31 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko

On Mon, May 9, 2022 at 4:44 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Mon, May 9, 2022 at 4:38 PM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > On Mon, May 9, 2022 at 2:54 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Fri, Apr 29, 2022 at 2:16 PM Stanislav Fomichev <sdf@google.com> wrote:
> > > >
> > > > sk_priority & sk_mark are writable, the rest is readonly.
> > > >
> > > > Add new ldx_offset fixups to lookup the offset of struct field.
> > > > Allow using test.kfunc regardless of prog_type.
> > > >
> > > > One interesting thing here is that the verifier doesn't
> > > > really force me to add NULL checks anywhere :-/
> > > >
> > > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > > ---
> > > >  tools/testing/selftests/bpf/test_verifier.c   | 54 ++++++++++++++++++-
> > > >  .../selftests/bpf/verifier/lsm_cgroup.c       | 34 ++++++++++++
> > > >  2 files changed, 87 insertions(+), 1 deletion(-)
> > > >  create mode 100644 tools/testing/selftests/bpf/verifier/lsm_cgroup.c
> > > >
> > >
> > > [...]
> > >
> > > > diff --git a/tools/testing/selftests/bpf/verifier/lsm_cgroup.c b/tools/testing/selftests/bpf/verifier/lsm_cgroup.c
> > > > new file mode 100644
> > > > index 000000000000..af0efe783511
> > > > --- /dev/null
> > > > +++ b/tools/testing/selftests/bpf/verifier/lsm_cgroup.c
> > > > @@ -0,0 +1,34 @@
> > > > +#define SK_WRITABLE_FIELD(tp, field, size, res) \
> > > > +{ \
> > > > +       .descr = field, \
> > > > +       .insns = { \
> > > > +               /* r1 = *(u64 *)(r1 + 0) */ \
> > > > +               BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, 0), \
> > > > +               /* r1 = *(u64 *)(r1 + offsetof(struct socket, sk)) */ \
> > > > +               BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, 0), \
> > > > +               /* r2 = *(u64 *)(r1 + offsetof(struct sock, <field>)) */ \
> > > > +               BPF_LDX_MEM(size, BPF_REG_2, BPF_REG_1, 0), \
> > > > +               /* *(u64 *)(r1 + offsetof(struct sock, <field>)) = r2 */ \
> > > > +               BPF_STX_MEM(size, BPF_REG_1, BPF_REG_2, 0), \
> > > > +               BPF_MOV64_IMM(BPF_REG_0, 1), \
> > > > +               BPF_EXIT_INSN(), \
> > > > +       }, \
> > > > +       .result = res, \
> > > > +       .errstr = res ? "no write support to 'struct sock' at off" : "", \
> > > > +       .prog_type = BPF_PROG_TYPE_LSM, \
> > > > +       .expected_attach_type = BPF_LSM_CGROUP, \
> > > > +       .kfunc = "socket_post_create", \
> > > > +       .fixup_ldx = { \
> > > > +               { "socket", "sk", 1 }, \
> > > > +               { tp, field, 2 }, \
> > > > +               { tp, field, 3 }, \
> > > > +       }, \
> > > > +}
> > > > +
> > > > +SK_WRITABLE_FIELD("sock_common", "skc_family", BPF_H, REJECT),
> > > > +SK_WRITABLE_FIELD("sock", "sk_sndtimeo", BPF_DW, REJECT),
> > > > +SK_WRITABLE_FIELD("sock", "sk_priority", BPF_W, ACCEPT),
> > > > +SK_WRITABLE_FIELD("sock", "sk_mark", BPF_W, ACCEPT),
> > > > +SK_WRITABLE_FIELD("sock", "sk_pacing_rate", BPF_DW, REJECT),
> > > > +
> > >
> > > have you tried writing it as C program and adding the test to
> > > test_progs? Does something not work there?
> >
> > Seems like it should work; I don't see any issues with writing 5
> > programs, one to test each field.
> > But test_verifier still feels like a better fit? Any reason in
> > particular you'd prefer test_progs over test_verifier?
>
> Adding that fixup_ldx->strct special handling didn't feel like the
> best fit, tbh. test_progs is generally much nicer to deal with in
> terms of CI and in terms of comprehending what's going on and
> supporting the code longer term.

This is not new, right? We already have a bunch of fixup_xxx things.
I can try to move this into test_progs in largely the same manner if
you prefer, having a C file per field seems like an overkill.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v6 04/10] bpf: minimize number of allocated lsm slots per program
  2022-05-10  5:05   ` Alexei Starovoitov
@ 2022-05-10 17:31     ` sdf
  2022-05-12  4:07       ` Alexei Starovoitov
  0 siblings, 1 reply; 33+ messages in thread
From: sdf @ 2022-05-10 17:31 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: netdev, bpf, ast, daniel, andrii

On 05/09, Alexei Starovoitov wrote:
> On Fri, Apr 29, 2022 at 02:15:34PM -0700, Stanislav Fomichev wrote:
> > [...]
> >
> >  	cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
> > -	if (likely(cgrp))
> > +	if (likely(cgrp)) {
> > +		rcu_read_lock(); /* See bpf_lsm_attach_type_get(). */

> I've looked at bpf_lsm_attach_type_get/put, but still don't get it :)
> shim_prog->aux->cgroup_atype stays the same for the life of shim_prog.
> atype_usecnt will go up and down, but atype_usecnt == 0 is the only
> interesting one from the pov of selecting atype in _get().
> And there shim_prog will be detached and trampoline destroyed.
> The shim_prog->aux->cgroup_atype deref below cannot be happening on
> freed shim_prog.
> So what is the point of this critical section and sync_rcu() ?
> It seems none of it is necessary.

I was trying to guard against the reuse of the same cgroup_atype:

CPU0                                     CPU1
__cgroup_bpf_run_lsm_socket:
atype = shim_prog->aux->cgroup_atype
                                         __cgroup_bpf_detach
                                         bpf_lsm_attach_type_put(shim_prog attach_btf_id)
                                         __cgroup_bpf_attach(another hook)
                                         bpf_lsm_attach_type_get(another btf_id)
                                         ^^^ can reuse the same cgroup_atype
array = cgrp->effective[atype]
^^^ run effective from another btf_id?

So I added that sync_rcu to wait for existing shim_prog users to exit.
Am I too paranoid? Maybe if I move bpf_lsm_attach_type_put deep into
bpf_prog_put (deferred path) that won't be an issue and we can drop
rcu_sync+read lock?

> > It should be possible to eventually extend this idea to all hooks if
> > the memory consumption is unacceptable, shrinking the overall effective
> > programs array.

> Having BPF_LSM_CGROUP do atype differently looks too special.
> Why not do this generalization right now?
> Do atype_get for all cgroup hooks and get rid of to_cgroup_bpf_attach_type?
> Combine the ranges of attach_btf_id for lsm_cgroup and enum bpf_attach_type
> for traditional cgroup hooks into a single _get() method that returns a slot
> in the effective[] array?
> attach/detach/query APIs won't notice this internal implementation detail.

I'm being extra cautious by using this new allocation scheme for LSM only.
If there is no general pushback, I can try to convert everything at the
same time.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v6 03/10] bpf: per-cgroup lsm flavor
  2022-05-10 17:30         ` Stanislav Fomichev
@ 2022-05-10 19:18           ` Martin KaFai Lau
  2022-05-10 21:14             ` Stanislav Fomichev
  0 siblings, 1 reply; 33+ messages in thread
From: Martin KaFai Lau @ 2022-05-10 19:18 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii

On Tue, May 10, 2022 at 10:30:57AM -0700, Stanislav Fomichev wrote:
> On Tue, May 10, 2022 at 12:13 AM Martin KaFai Lau <kafai@fb.com> wrote:
> >
> > On Mon, May 09, 2022 at 04:38:36PM -0700, Stanislav Fomichev wrote:
> > > > > +unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
> > > > > +                                       const struct bpf_insn *insn)
> > > > > +{
> > > > > +     const struct bpf_prog *shim_prog;
> > > > > +     struct cgroup *cgrp;
> > > > > +     int ret = 0;
> > > > From lsm_hook_defs.h, there are some default return values that are not 0.
> > > > Is it ok to always return 0 in cases like the cgroup array is empty ?
> > >
> > > That's a good point, I haven't thought about it. You're right, it
> > > seems like attaching to this hook for some LSMs will change the
> > > default from some error to zero.
> > > Let's start by prohibiting those hooks for now? I guess in theory,
> > > when we generate a trampoline, we can put this default value as an
> > > input arg to these new __cgroup_bpf_run_lsm_xxx helpers (in the
> > > future)?
> > After looking at arch_prepare_bpf_trampoline, return 0 here should be fine.
> > If I read it correctly, when the shim_prog returns 0, the trampoline
> > will call the original kernel function which is the bpf_lsm_##NAME()
> > defined in bpf_lsm.c and it will then return the zero/-ve DEFAULT.
> 
> Not sure I read the same :-/ I'm assuming that for those cases we
> actually end up generating fmod_ret trampoline which seems to be
> unconditionally saving r0 into fp-8 ?
invoke_bpf_mod_ret() calls invoke_bpf_prog(..., true) that saves the r0.

Later, the "if (flags & BPF_TRAMP_F_CALL_ORIG)" will still
"/* call the original function */" and then stores the r0 retval
from the original function, no? or I mis-read something ?
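In pseudocode, my reading of the generated code for a fmod_ret prog
with BPF_TRAMP_F_CALL_ORIG is roughly:

	ret = shim_prog(ctx);		/* r0 saved to fp-8 */
	if (ret)
		goto out;		/* non-zero return skips the original */
	ret = bpf_lsm_<hook>(args);	/* original stub returns its DEFAULT */
out:
	return ret;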

> 
> > > Another thing that seems to be related: there are a bunch of hooks
> > > that return void, so returning EPERM from the cgroup programs won't
> > > work as expected.
> > > I can probably record, at verification time, whether lsm_cgroup
> > > programs return any "non-success" return codes and prohibit attaching
> > > these progs to the void hooks?
> > hmm...yeah, BPF_LSM_CGROUP can be enforced to return either 0 or 1 as
> > most other cgroup-progs do.
> >
> > Do you have a use case that needs to return something other than -EPERM ?
> 
> We do already enforce 0/1 for cgroup progs (and we have helpers to
> expose custom errno). What I want to avoid is letting users attach
> programs that try to return the error for the void hooks. And it seems
> like we can record that return range for a particular cgroup program and
> verify it at attach time, WDYT?
Makes sense.  Do that in check_return_code() at load time instead of
attach time?
To be specific, meaning enforce BPF_LSM_CGROUP to 0/1 for int return type
and always 1 for void return type?

Ah, I forgot there is a bpf_set_retval().  I assume we eventually want
to allow that for BPF_LSM_CGROUP later?  Once it is allowed,
the verifier should also reject bpf_set_retval() when the
attach_btf_id has a void return type?

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v6 03/10] bpf: per-cgroup lsm flavor
  2022-05-10 19:18           ` Martin KaFai Lau
@ 2022-05-10 21:14             ` Stanislav Fomichev
  0 siblings, 0 replies; 33+ messages in thread
From: Stanislav Fomichev @ 2022-05-10 21:14 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: netdev, bpf, ast, daniel, andrii

On Tue, May 10, 2022 at 12:18 PM Martin KaFai Lau <kafai@fb.com> wrote:
>
> On Tue, May 10, 2022 at 10:30:57AM -0700, Stanislav Fomichev wrote:
> > On Tue, May 10, 2022 at 12:13 AM Martin KaFai Lau <kafai@fb.com> wrote:
> > >
> > > On Mon, May 09, 2022 at 04:38:36PM -0700, Stanislav Fomichev wrote:
> > > > > > +unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
> > > > > > +                                       const struct bpf_insn *insn)
> > > > > > +{
> > > > > > +     const struct bpf_prog *shim_prog;
> > > > > > +     struct cgroup *cgrp;
> > > > > > +     int ret = 0;
> > > > > From lsm_hook_defs.h, there are some default return values that are not 0.
> > > > > Is it ok to always return 0 in cases like the cgroup array is empty ?
> > > >
> > > > That's a good point, I haven't thought about it. You're right, it
> > > > seems like attaching to this hook for some LSMs will change the
> > > > default from some error to zero.
> > > > Let's start by prohibiting those hooks for now? I guess in theory,
> > > > when we generate a trampoline, we can put this default value as an
> > > > input arg to these new __cgroup_bpf_run_lsm_xxx helpers (in the
> > > > future)?
> > > After looking at arch_prepare_bpf_trampoline, return 0 here should be fine.
> > > If I read it correctly, when the shim_prog returns 0, the trampoline
> > > will call the original kernel function which is the bpf_lsm_##NAME()
> > > defined in bpf_lsm.c and it will then return the zero/-ve DEFAULT.
> >
> > Not sure I read the same :-/ I'm assuming that for those cases we
> > actually end up generating fmod_ret trampoline which seems to be
> > unconditionally saving r0 into fp-8 ?
> invoke_bpf_mod_ret() calls invoke_bpf_prog(..., true) that saves the r0.
>
> Later, the "if (flags & BPF_TRAMP_F_CALL_ORIG)" will still
> "/* call the original function */" and then stores the r0 retval
> from the original function, no? or I mis-read something ?

I was under the wrong assumption this whole time that fmod_ret
programs run after the original one and the first bpf program sees the
output of the original one.
Turns out it's not the case; agreed that we already do the right
thing; thanks for pointing it out!

> > > > Another thing that seems to be related: there are a bunch of hooks
> > > > that return void, so returning EPERM from the cgroup programs won't
> > > > work as expected.
> > > > I can probably record, at verification time, whether lsm_cgroup
> > > > programs return any "non-success" return codes and prohibit attaching
> > > > these progs to the void hooks?
> > > hmm...yeah, BPF_LSM_CGROUP can be enforced to return either 0 or 1 as
> > > most other cgroup-progs do.
> > >
> > > Do you have a use case that needs to return something other than -EPERM ?
> >
> > We do already enforce 0/1 for cgroup progs (and we have helpers to
> > expose custom errno). What I want to avoid is letting users attach
> > programs that try to return the error for the void hooks. And it seems
> > like we can record that return range for a particular cgroup program and
> > verify it at attach time, WDYT?
> Makes sense.  Do that in check_return_code() at load time instead of
> attach time?
> To be specific, meaning enforce BPF_LSM_CGROUP to 0/1 for int return type
> and always 1 for void return type?

Yeah, let's try to enforce the following at load time (rough sketch below):
- returning 0 or calling bpf_set_retval should be allowed only for the hooks
  that return int
- for the void ones, only 'return 1' should be accepted
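Something along these lines in check_return_code(), as a sketch;
hook_returns_void() is a made-up helper that would inspect the BTF
return type of the attach_btf_id:

	case BPF_PROG_TYPE_LSM:
		if (env->prog->expected_attach_type == BPF_LSM_CGROUP) {
			if (hook_returns_void(env->prog->aux->attach_btf_id))
				range = tnum_const(1);	/* only 'return 1' */
			else
				range = tnum_range(0, 1);
		}
		break;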

> Ah, I forgot there is a bpf_set_retval().  I assume we eventually want
> to allow that for BPF_LSM_CGROUP later?  Once it is allowed,
> the verifier should also reject bpf_set_retval() when the
> attach_btf_id has a void return type?

Right, let me actually try to add bpf_set_retval to the set of allowed
helpers from the start, shouldn't be hard..
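e.g. roughly this in bpf_lsm_func_proto(), assuming the existing
bpf_set_retval proto from the cgroup code can be made visible there:

static const struct bpf_func_proto *
bpf_lsm_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
{
	switch (func_id) {
	case BPF_FUNC_set_retval:
		return prog->expected_attach_type == BPF_LSM_CGROUP ?
		       &bpf_set_retval_proto : NULL;
	/* ... existing cases ... */
	default:
		return tracing_prog_func_proto(func_id, prog);
	}
}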

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v6 10/10] selftests/bpf: verify lsm_cgroup struct sock access
  2022-05-10 17:31         ` Stanislav Fomichev
@ 2022-05-12  3:37           ` Andrii Nakryiko
  2022-05-12 17:11             ` Stanislav Fomichev
  0 siblings, 1 reply; 33+ messages in thread
From: Andrii Nakryiko @ 2022-05-12  3:37 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko

On Tue, May 10, 2022 at 10:31 AM Stanislav Fomichev <sdf@google.com> wrote:
>
> On Mon, May 9, 2022 at 4:44 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Mon, May 9, 2022 at 4:38 PM Stanislav Fomichev <sdf@google.com> wrote:
> > >
> > > On Mon, May 9, 2022 at 2:54 PM Andrii Nakryiko
> > > <andrii.nakryiko@gmail.com> wrote:
> > > >
> > > > On Fri, Apr 29, 2022 at 2:16 PM Stanislav Fomichev <sdf@google.com> wrote:
> > > > > [...]
> > > >
> > > > have you tried writing it as C program and adding the test to
> > > > test_progs? Does something not work there?
> > >
> > > Seems like it should work; I don't see any issues with writing 5
> > > programs, one to test each field.
> > > But test_verifier still feels like a better fit? Any reason in
> > > particular you'd prefer test_progs over test_verifier?
> >
> > Adding that fixup_ldx->strct special handling didn't feel like the
> > best fit, tbh. test_progs is generally much nicer to deal with in
> > terms of CI and in terms of comprehending what's going on and
> > supporting the code longer term.
>
> This is not new, right? We already have a bunch of fixup_xxx things.

I'm not saying it's wrong, but we don't have to keep adding extra
custom fixup_xxx things and having hand-crafted assembly test cases if
we can do C tests, right? BPF assembly tests are sometimes necessary
if we need to craft some special conditions which are hard to
guarantee from Clang side during C to BPF assembly translation. But
this one doesn't seem to be the case.

> I can try to move this into test_progs in largely the same manner if
> you prefer, having a C file per field seems like an overkill.

You don't need a separate C file for each case. See what Joanne does
with SEC("?...") for dynptr tests, or what Kumar did for his kptr
tests. You can put multiple negative tests as separate BPF programs in
one file with auto-load disabled through SEC("?...") and then
open/load skeleton each time for each program, one at a time.
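Concretely, the userspace side could look roughly like this (the
skeleton and program names below are made up):

	struct lsm_cgroup *skel;	/* hypothetical skeleton */
	int err;

	skel = lsm_cgroup__open();
	if (!ASSERT_OK_PTR(skel, "open"))
		return;
	/* everything uses SEC("?...") == autoload off by default;
	 * enable exactly one program per iteration.
	 */
	bpf_program__set_autoload(skel->progs.write_pacing_rate, true);
	err = lsm_cgroup__load(skel);
	ASSERT_ERR(err, "load should fail for a REJECT case");
	lsm_cgroup__destroy(skel);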

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v6 04/10] bpf: minimize number of allocated lsm slots per program
  2022-05-10 17:31     ` sdf
@ 2022-05-12  4:07       ` Alexei Starovoitov
  0 siblings, 0 replies; 33+ messages in thread
From: Alexei Starovoitov @ 2022-05-12  4:07 UTC (permalink / raw)
  To: sdf; +Cc: netdev, bpf, ast, daniel, andrii

On Tue, May 10, 2022 at 10:31:05AM -0700, sdf@google.com wrote:
> On 05/09, Alexei Starovoitov wrote:
> > On Fri, Apr 29, 2022 at 02:15:34PM -0700, Stanislav Fomichev wrote:
> > > [...]
> > >
> > >  	cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
> > > -	if (likely(cgrp))
> > > +	if (likely(cgrp)) {
> > > +		rcu_read_lock(); /* See bpf_lsm_attach_type_get(). */
> 
> > I've looked at bpf_lsm_attach_type_get/put, but still don't get it :)
> > shim_prog->aux->cgroup_atype stays the same for the life of shim_prog.
> > atype_usecnt will go up and down, but atype_usecnt == 0 is the only
> > interesting one from the pov of selecting atype in _get().
> > And there shim_prog will be detached and trampoline destroyed.
> > The shim_prog->aux->cgroup_atype deref below cannot be happening on
> > freed shim_prog.
> > So what is the point of this critical section and sync_rcu() ?
> > It seems none of it is necessary.
> 
> I was trying to guard against the reuse of the same cgroup_atype:
> 
> CPU0                                     CPU1
> __cgroup_bpf_run_lsm_socket:
> atype = shim_prog->aux->cgroup_atype
>                                          __cgroup_bpf_detach
>                                          bpf_lsm_attach_type_put(shim_prog attach_btf_id)

but in between the two ops on CPU1 you're doing:
bpf_trampoline_unlink_cgroup_shim
which will go through the trampoline update.
I guess if CPU0 runs a sleepable prog for a long time
there is a chance...
Maybe put bpf_lsm_attach_type_put in the shim_prog free path?

>                                          __cgroup_bpf_attach(another hook)
>                                          bpf_lsm_attach_type_get(another btf_id)
>                                          ^^^ can reuse the same cgroup_atype
> array = cgrp->effective[atype]
> ^^^ run effective from another btf_id?
> 
> So I added that sync_rcu to wait for existing shim_prog users to exit.
> Am I too paranoid? Maybe if I move bpf_lsm_attach_type_put deep into
> bpf_prog_put (deferred path) that won't be an issue and we can drop
> rcu_sync+read lock?

Mainly not excited about sync_rcu. It was causing latency issues in the past.
Please use call_rcu whenever possible.
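E.g. the put could move into the prog's RCU free path, roughly (a
sketch; the callback name is made up, and it assumes aux keeps an
rcu_head for deferred free as it does today):

static void bpf_prog_aux_put_atype_rcu(struct rcu_head *rcu)
{
	struct bpf_prog_aux *aux = container_of(rcu, struct bpf_prog_aux, rcu);

	bpf_lsm_attach_type_put(aux->attach_btf_id);
	/* ... continue with the existing deferred free ... */
}

	/* on the put path, instead of synchronize_rcu(): */
	call_rcu(&aux->rcu, bpf_prog_aux_put_atype_rcu);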

> > > It should be possible to eventually extend this idea to all hooks if
> > > the memory consumption is unacceptable, shrinking the overall effective
> > > programs array.
> 
> > Having BPF_LSM_CGROUP do atype differently looks too special.
> > Why not do this generalization right now?
> > Do atype_get for all cgroup hooks and get rid of to_cgroup_bpf_attach_type?
> > Combine the ranges of attach_btf_id for lsm_cgroup and enum bpf_attach_type
> > for traditional cgroup hooks into a single _get() method that returns a slot
> > in the effective[] array?
> > attach/detach/query APIs won't notice this internal implementation detail.
> 
> I'm being extra cautious by using this new allocation scheme for LSM only.
> If there is no general pushback, I can try to convert everything at the
> same time.

Hopefully the generalization of the mechanism will move the atype_get/put
logic into a path that is clearly race-free.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v6 10/10] selftests/bpf: verify lsm_cgroup struct sock access
  2022-05-12  3:37           ` Andrii Nakryiko
@ 2022-05-12 17:11             ` Stanislav Fomichev
  0 siblings, 0 replies; 33+ messages in thread
From: Stanislav Fomichev @ 2022-05-12 17:11 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko

On Wed, May 11, 2022 at 8:38 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, May 10, 2022 at 10:31 AM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > On Mon, May 9, 2022 at 4:44 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Mon, May 9, 2022 at 4:38 PM Stanislav Fomichev <sdf@google.com> wrote:
> > > >
> > > > On Mon, May 9, 2022 at 2:54 PM Andrii Nakryiko
> > > > <andrii.nakryiko@gmail.com> wrote:
> > > > >
> > > > > On Fri, Apr 29, 2022 at 2:16 PM Stanislav Fomichev <sdf@google.com> wrote:
> > > > > > [...]
> > > > >
> > > > > have you tried writing it as C program and adding the test to
> > > > > test_progs? Does something not work there?
> > > >
> > > > Seems like it should work; I don't see any issues with writing 5
> > > > programs, one to test each field.
> > > > But test_verifier still feels like a better fit? Any reason in
> > > > particular you'd prefer test_progs over test_verifier?
> > >
> > > Adding that fixup_ldx->strct special handling didn't feel like the
> > > best fit, tbh. test_progs is generally much nicer to deal with in
> > > terms of CI and in terms of comprehending what's going on and
> > > supporting the code longer term.
> >
> > This is not new, right? We already have a bunch of fixup_xxx things.
>
> I'm not saying it's wrong, but we don't have to keep adding extra
> custom fixup_xxx things and having hand-crafted assembly test cases if
> we can do C tests, right? BPF assembly tests are sometimes necessary
> if we need to craft some special conditions which are hard to
> guarantee from Clang side during C to BPF assembly translation. But
> this one doesn't seem to be the case.
>
> > I can try to move this into test_progs in largely the same manner if
> > you prefer, having a C file per field seems like an overkill.
>
> You don't need a separate C file for each case. See what Joanne does
> with SEC("?...") for dynptr tests, or what Kumar did for his kptr
> tests. You can put multiple negative tests as separate BPF programs in
> one file with auto-load disabled through SEC("?...") and then
> open/load skeleton each time for each program, one at a time.

I'm gonna start with keeping the assembly, but moving it into
test_progs. I think it looks a bit nicer than the fixup stuff I'm
currently doing and I like how everything is in the same place. So
please yell at me if you still don't like it next time I send it out
and I'll try to explore SEC("?...").

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH bpf-next v6 02/10] bpf: convert cgroup_bpf.progs to hlist
  2022-04-29 21:15 ` [PATCH bpf-next v6 02/10] bpf: convert cgroup_bpf.progs to hlist Stanislav Fomichev
@ 2022-05-18 15:16   ` Jakub Sitnicki
  0 siblings, 0 replies; 33+ messages in thread
From: Jakub Sitnicki @ 2022-05-18 15:16 UTC (permalink / raw)
  To: Stanislav Fomichev; +Cc: netdev, bpf, ast, daniel, andrii

On Fri, Apr 29, 2022 at 02:15 PM -07, Stanislav Fomichev wrote:
> This lets us reclaim some space to be used by new cgroup lsm slots.
>
> Before:
> struct cgroup_bpf {
> 	struct bpf_prog_array *    effective[23];        /*     0   184 */
> 	/* --- cacheline 2 boundary (128 bytes) was 56 bytes ago --- */
> 	struct list_head           progs[23];            /*   184   368 */
> 	/* --- cacheline 8 boundary (512 bytes) was 40 bytes ago --- */
> 	u32                        flags[23];            /*   552    92 */
>
> 	/* XXX 4 bytes hole, try to pack */
>
> 	/* --- cacheline 10 boundary (640 bytes) was 8 bytes ago --- */
> 	struct list_head           storages;             /*   648    16 */
> 	struct bpf_prog_array *    inactive;             /*   664     8 */
> 	struct percpu_ref          refcnt;               /*   672    16 */
> 	struct work_struct         release_work;         /*   688    32 */
>
> 	/* size: 720, cachelines: 12, members: 7 */
> 	/* sum members: 716, holes: 1, sum holes: 4 */
> 	/* last cacheline: 16 bytes */
> };
>
> After:
> struct cgroup_bpf {
> 	struct bpf_prog_array *    effective[23];        /*     0   184 */
> 	/* --- cacheline 2 boundary (128 bytes) was 56 bytes ago --- */
> 	struct hlist_head          progs[23];            /*   184   184 */
> 	/* --- cacheline 5 boundary (320 bytes) was 48 bytes ago --- */
> 	u8                         flags[23];            /*   368    23 */
>
> 	/* XXX 1 byte hole, try to pack */
>
> 	/* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */
> 	struct list_head           storages;             /*   392    16 */
> 	struct bpf_prog_array *    inactive;             /*   408     8 */
> 	struct percpu_ref          refcnt;               /*   416    16 */
> 	struct work_struct         release_work;         /*   432    72 */
>
> 	/* size: 504, cachelines: 8, members: 7 */
> 	/* sum members: 503, holes: 1, sum holes: 1 */
> 	/* last cacheline: 56 bytes */
> };
>
> Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---

Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
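
For context, the progs[] shrinkage above is simply hlist_head being
half the size of list_head per bucket on 64-bit; the kernel
definitions are roughly:

struct list_head {
	struct list_head *next, *prev;	/* 16 bytes per head */
};

struct hlist_head {
	struct hlist_node *first;	/* 8 bytes per head */
};

With 23 attach types that is 23 * 8 = 184 bytes, matching progs[]
going from 368 to 184 bytes in the pahole output.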

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2022-05-18 15:16 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-29 21:15 [PATCH bpf-next v6 00/10] bpf: cgroup_sock lsm flavor Stanislav Fomichev
2022-04-29 21:15 ` [PATCH bpf-next v6 01/10] bpf: add bpf_func_t and trampoline helpers Stanislav Fomichev
2022-04-29 21:15 ` [PATCH bpf-next v6 02/10] bpf: convert cgroup_bpf.progs to hlist Stanislav Fomichev
2022-05-18 15:16   ` Jakub Sitnicki
2022-04-29 21:15 ` [PATCH bpf-next v6 03/10] bpf: per-cgroup lsm flavor Stanislav Fomichev
2022-05-06 23:02   ` Martin KaFai Lau
2022-05-09 23:38     ` Stanislav Fomichev
2022-05-10  7:13       ` Martin KaFai Lau
2022-05-10 17:30         ` Stanislav Fomichev
2022-05-10 19:18           ` Martin KaFai Lau
2022-05-10 21:14             ` Stanislav Fomichev
2022-05-09 21:51   ` Andrii Nakryiko
2022-05-09 23:38     ` Stanislav Fomichev
2022-04-29 21:15 ` [PATCH bpf-next v6 04/10] bpf: minimize number of allocated lsm slots per program Stanislav Fomichev
2022-05-10  5:05   ` Alexei Starovoitov
2022-05-10 17:31     ` sdf
2022-05-12  4:07       ` Alexei Starovoitov
2022-04-29 21:15 ` [PATCH bpf-next v6 05/10] bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP Stanislav Fomichev
2022-05-07  0:12   ` Martin KaFai Lau
2022-05-09 23:38     ` Stanislav Fomichev
2022-05-09 21:49   ` Andrii Nakryiko
2022-05-09 23:38     ` Stanislav Fomichev
2022-04-29 21:15 ` [PATCH bpf-next v6 06/10] bpf: allow writing to a subset of sock fields from lsm progtype Stanislav Fomichev
2022-04-29 21:15 ` [PATCH bpf-next v6 07/10] libbpf: add lsm_cgoup_sock type Stanislav Fomichev
2022-04-29 21:15 ` [PATCH bpf-next v6 08/10] bpftool: implement cgroup tree for BPF_LSM_CGROUP Stanislav Fomichev
2022-04-29 21:15 ` [PATCH bpf-next v6 09/10] selftests/bpf: lsm_cgroup functional test Stanislav Fomichev
2022-04-29 21:15 ` [PATCH bpf-next v6 10/10] selftests/bpf: verify lsm_cgroup struct sock access Stanislav Fomichev
2022-05-09 21:54   ` Andrii Nakryiko
2022-05-09 23:38     ` Stanislav Fomichev
2022-05-09 23:43       ` Andrii Nakryiko
2022-05-10 17:31         ` Stanislav Fomichev
2022-05-12  3:37           ` Andrii Nakryiko
2022-05-12 17:11             ` Stanislav Fomichev
