* [PATCH RESEND bpf-next 00/18] BPF token
@ 2023-06-02 14:59 Andrii Nakryiko
  2023-06-02 14:59 ` [PATCH RESEND bpf-next 01/18] bpf: introduce BPF token object Andrii Nakryiko
                   ` (18 more replies)
  0 siblings, 19 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2023-06-02 14:59 UTC (permalink / raw)
  To: bpf; +Cc: linux-security-module, keescook, brauner, lennart, cyphar, luto

*Resending with trimmed CC list because original version didn't make it to
the mailing list.*

This patch set introduces a new BPF object, the BPF token, which allows delegating
a subset of BPF functionality from a privileged system-wide daemon (e.g.,
systemd or any other container manager) to a *trusted* unprivileged
application. Trust is the key here. This functionality is not about allowing
unconditional unprivileged BPF usage. Establishing trust, though, is
completely up to the discretion of the respective privileged application that
creates the BPF token.

The main motivation for the BPF token is a desire to enable containerized
BPF applications to be used together with user namespaces. This is currently
impossible, as CAP_BPF, required for BPF subsystem usage, cannot be namespaced
or sandboxed, as a general rule. E.g., tracing BPF programs, thanks to BPF
helpers like bpf_probe_read_kernel() and bpf_probe_read_user(), can safely
read arbitrary memory, and it's impossible to ensure that they only read memory
of processes belonging to any given namespace. This means that it's impossible
to have a namespace-aware CAP_BPF capability, and as such another mechanism to
allow safe usage of BPF functionality is necessary. A BPF token, delegated to
a trusted unprivileged application, is such a mechanism. The kernel makes
no assumption about what "trusted" constitutes in any particular case, and
it's up to specific privileged applications and their surrounding
infrastructure to decide that. What the kernel provides is a set of APIs to
create and tune a BPF token, and to pass it to privileged BPF commands that
create new BPF objects like BPF programs, BPF maps, etc.

A previous attempt at addressing this very same problem ([0]) used the
authoritative LSM approach, but was conclusively rejected by upstream
LSM maintainers. The BPF token concept doesn't change anything about the LSM
approach, but can be combined with LSM hooks for very fine-grained security
policy. Some ideas about making the BPF token more convenient to use with LSM
(in particular custom BPF LSM programs) were briefly described in a recent
LSF/MM/BPF 2023 presentation ([1]): e.g., an ability to specify user-provided
data (context), which in combination with BPF LSM would allow implementing
very dynamic and fine-grained custom security policies on top of BPF tokens.
In the interest of minimizing API surface area discussions, this is going to
be added in follow-up patches, as it's not essential to the fundamental
concept of a delegatable BPF token.

It should be noted that the BPF token is conceptually quite similar to the
idea of a /dev/bpf device file, proposed by Song a while ago ([2]). The biggest
difference is the idea of using a virtual anon_inode file to hold the BPF
token and allowing multiple independent instances of it, each with its own set
of restrictions. BPF pinning solves the problem of exposing such a BPF token
through the file system (BPF FS, in this case) for cases where transferring
FDs over Unix domain sockets is not convenient. And also, crucially, the BPF
token approach doesn't use any special stateful task-scoped flags. Instead,
the bpf() syscall accepts a token_fd parameter explicitly for each relevant
BPF command. This addresses the main concerns brought up during the /dev/bpf
discussion, and fits better with the overall BPF subsystem design.

This patch set adds a basic minimum of functionality to make the BPF token
useful and to discuss the API and functionality. Currently only low-level
libbpf APIs support passing a BPF token around, which allows testing kernel
functionality, but is for the most part not sufficient for real-world
applications, which typically use high-level libbpf APIs based on the
`struct bpf_object` type. This was done with the intent to limit the size of
the patch set and concentrate on mostly kernel-side changes. All the necessary
plumbing for libbpf will be sent as a separate follow-up patch set once kernel
support makes it upstream.

Another part that should happen once the kernel-side BPF token is established
is a set of conventions between applications (e.g., systemd), tools (e.g.,
bpftool), and libraries (e.g., libbpf) about sharing BPF tokens through BPF FS
at well-defined locations, to allow applications to take advantage of this
automatically, without explicit code changes on the BPF application's side.
But I'd like to postpone this discussion until after the BPF token concept lands.

  [0] https://lore.kernel.org/bpf/20230412043300.360803-1-andrii@kernel.org/
  [1] http://vger.kernel.org/bpfconf2023_material/Trusted_unprivileged_BPF_LSFMM2023.pdf
  [2] https://lore.kernel.org/bpf/20190627201923.2589391-2-songliubraving@fb.com/

Andrii Nakryiko (18):
  bpf: introduce BPF token object
  libbpf: add bpf_token_create() API
  selftests/bpf: add BPF_TOKEN_CREATE test
  bpf: move unprivileged checks into map_create() and bpf_prog_load()
  bpf: inline map creation logic in map_create() function
  bpf: centralize permissions checks for all BPF map types
  bpf: add BPF token support to BPF_MAP_CREATE command
  libbpf: add BPF token support to bpf_map_create() API
  selftests/bpf: add BPF token-enabled test for BPF_MAP_CREATE command
  bpf: add BPF token support to BPF_BTF_LOAD command
  libbpf: add BPF token support to bpf_btf_load() API
  selftests/bpf: add BPF token-enabled BPF_BTF_LOAD selftest
  bpf: keep BPF_PROG_LOAD permission checks clear of validations
  bpf: add BPF token support to BPF_PROG_LOAD command
  bpf: take into account BPF token when fetching helper protos
  bpf: consistently use BPF token throughout BPF verifier logic
  libbpf: add BPF token support to bpf_prog_load() API
  selftests/bpf: add BPF token-enabled BPF_PROG_LOAD tests

 drivers/media/rc/bpf-lirc.c                   |   2 +-
 include/linux/bpf.h                           |  66 ++-
 include/linux/filter.h                        |   2 +-
 include/uapi/linux/bpf.h                      |  74 +++
 kernel/bpf/Makefile                           |   2 +-
 kernel/bpf/arraymap.c                         |   2 +-
 kernel/bpf/bloom_filter.c                     |   3 -
 kernel/bpf/bpf_local_storage.c                |   3 -
 kernel/bpf/bpf_struct_ops.c                   |   3 -
 kernel/bpf/cgroup.c                           |   6 +-
 kernel/bpf/core.c                             |   3 +-
 kernel/bpf/cpumap.c                           |   4 -
 kernel/bpf/devmap.c                           |   3 -
 kernel/bpf/hashtab.c                          |   6 -
 kernel/bpf/helpers.c                          |   6 +-
 kernel/bpf/inode.c                            |  26 ++
 kernel/bpf/lpm_trie.c                         |   3 -
 kernel/bpf/queue_stack_maps.c                 |   4 -
 kernel/bpf/reuseport_array.c                  |   3 -
 kernel/bpf/stackmap.c                         |   3 -
 kernel/bpf/syscall.c                          | 429 ++++++++++++++----
 kernel/bpf/token.c                            | 141 ++++++
 kernel/bpf/verifier.c                         |  13 +-
 kernel/trace/bpf_trace.c                      |   2 +-
 net/core/filter.c                             |  36 +-
 net/core/sock_map.c                           |   4 -
 net/ipv4/bpf_tcp_ca.c                         |   2 +-
 net/netfilter/nf_bpf_link.c                   |   2 +-
 net/xdp/xskmap.c                              |   4 -
 tools/include/uapi/linux/bpf.h                |  74 +++
 tools/lib/bpf/bpf.c                           |  32 +-
 tools/lib/bpf/bpf.h                           |  24 +-
 tools/lib/bpf/libbpf.map                      |   1 +
 .../selftests/bpf/prog_tests/libbpf_probes.c  |   4 +
 .../selftests/bpf/prog_tests/libbpf_str.c     |   6 +
 .../testing/selftests/bpf/prog_tests/token.c  | 282 ++++++++++++
 .../bpf/prog_tests/unpriv_bpf_disabled.c      |   6 +-
 37 files changed, 1098 insertions(+), 188 deletions(-)
 create mode 100644 kernel/bpf/token.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/token.c

-- 
2.34.1



* [PATCH RESEND bpf-next 01/18] bpf: introduce BPF token object
  2023-06-02 14:59 [PATCH RESEND bpf-next 00/18] BPF token Andrii Nakryiko
@ 2023-06-02 14:59 ` Andrii Nakryiko
  2023-06-02 17:41   ` kernel test robot
                     ` (2 more replies)
  2023-06-02 14:59 ` [PATCH RESEND bpf-next 02/18] libbpf: add bpf_token_create() API Andrii Nakryiko
                   ` (17 subsequent siblings)
  18 siblings, 3 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2023-06-02 14:59 UTC (permalink / raw)
  To: bpf; +Cc: linux-security-module, keescook, brauner, lennart, cyphar, luto

Add a new kind of BPF kernel object, the BPF token. The BPF token is meant to
allow delegating privileged BPF functionality, like loading a BPF
program or creating a BPF map, from a privileged process to a *trusted*
unprivileged process, all while retaining a good amount of control over which
privileged operations can be performed using the provided BPF token.

This patch adds a new BPF_TOKEN_CREATE command to the bpf() syscall, which
allows creating a new BPF token object along with a set of allowed
commands. Currently only the BPF_TOKEN_CREATE command itself can be
delegated, but subsequent patches gradually add the ability to delegate the
BPF_MAP_CREATE, BPF_BTF_LOAD, and BPF_PROG_LOAD commands.

The above means that BPF token creation can be allowed by another
existing BPF token, if the original privileged creator allowed that. A new
derived BPF token cannot be more powerful than the original BPF token.

The BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS flag is added to allow an application to
express "all supported BPF commands should be allowed" without worrying
about which subset of the desired commands is actually supported by a
potentially outdated kernel. Allowing these semantics doesn't seem to
introduce any backwards compatibility issues and doesn't introduce any
risk of abusing or misusing the bit set field, but it makes the backwards
compatibility story for various applications and tools much more
straightforward, making it unnecessary to probe support for each
individual possible bit. This is especially useful in follow-up patches
where we add bit sets of BPF map types and program types.

Lastly, a BPF token can be pinned in and retrieved from BPF FS, just like
progs, maps, BTFs, and links. This allows applications (like container
managers) to share a BPF token with other applications through the file
system just like any other BPF object, and to further control access to it
using file system permissions, if desired.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 include/linux/bpf.h            |  34 +++++++++
 include/uapi/linux/bpf.h       |  42 ++++++++++++
 kernel/bpf/Makefile            |   2 +-
 kernel/bpf/inode.c             |  26 +++++++
 kernel/bpf/syscall.c           |  74 ++++++++++++++++++++
 kernel/bpf/token.c             | 122 +++++++++++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h |  40 +++++++++++
 7 files changed, 339 insertions(+), 1 deletion(-)
 create mode 100644 kernel/bpf/token.c

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index f58895830ada..fe6d51c3a5b1 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -51,6 +51,7 @@ struct module;
 struct bpf_func_state;
 struct ftrace_ops;
 struct cgroup;
+struct bpf_token;
 
 extern struct idr btf_idr;
 extern spinlock_t btf_idr_lock;
@@ -1533,6 +1534,12 @@ struct bpf_link_primer {
 	u32 id;
 };
 
+struct bpf_token {
+	struct work_struct work;
+	atomic64_t refcnt;
+	u64 allowed_cmds;
+};
+
 struct bpf_struct_ops_value;
 struct btf_member;
 
@@ -2077,6 +2084,15 @@ struct file *bpf_link_new_file(struct bpf_link *link, int *reserved_fd);
 struct bpf_link *bpf_link_get_from_fd(u32 ufd);
 struct bpf_link *bpf_link_get_curr_or_next(u32 *id);
 
+void bpf_token_inc(struct bpf_token *token);
+void bpf_token_put(struct bpf_token *token);
+struct bpf_token *bpf_token_alloc(void);
+int bpf_token_new_fd(struct bpf_token *token);
+struct bpf_token *bpf_token_get_from_fd(u32 ufd);
+
+bool bpf_token_capable(const struct bpf_token *token, int cap);
+bool bpf_token_allow_cmd(const struct bpf_token *token, enum bpf_cmd cmd);
+
 int bpf_obj_pin_user(u32 ufd, int path_fd, const char __user *pathname);
 int bpf_obj_get_user(int path_fd, const char __user *pathname, int flags);
 
@@ -2436,6 +2452,24 @@ static inline int bpf_obj_get_user(const char __user *pathname, int flags)
 	return -EOPNOTSUPP;
 }
 
+static inline void bpf_token_inc(struct bpf_token *token)
+{
+}
+
+static inline void bpf_token_put(struct bpf_token *token)
+{
+}
+
+static inline int bpf_token_new_fd(struct bpf_token *token)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline struct bpf_token *bpf_token_get_from_fd(u32 ufd)
+{
+	return ERR_PTR(-EOPNOTSUPP);
+}
+
 static inline void __dev_flush(void)
 {
 }
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 9273c654743c..01ab79f2ad9f 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -846,6 +846,16 @@ union bpf_iter_link_info {
  *		Returns zero on success. On error, -1 is returned and *errno*
  *		is set appropriately.
  *
+ * BPF_TOKEN_CREATE
+ *	Description
+ *		Create BPF token with embedded information about what
+ *		BPF-related functionality is allowed. This BPF token can be
+ *		passed as an extra parameter to various bpf() syscall commands.
+ *
+ *	Return
+ *		A new file descriptor (a nonnegative integer), or -1 if an
+ *		error occurred (in which case, *errno* is set appropriately).
+ *
  * NOTES
  *	eBPF objects (maps and programs) can be shared between processes.
  *
@@ -900,6 +910,7 @@ enum bpf_cmd {
 	BPF_ITER_CREATE,
 	BPF_LINK_DETACH,
 	BPF_PROG_BIND_MAP,
+	BPF_TOKEN_CREATE,
 };
 
 enum bpf_map_type {
@@ -1169,6 +1180,24 @@ enum bpf_link_type {
  */
 #define BPF_F_KPROBE_MULTI_RETURN	(1U << 0)
 
+/* BPF_TOKEN_CREATE command flags
+ */
+enum {
+	/* Ignore unrecognized bits in token_create.allowed_cmds bit set.  If
+	 * this flag is set, kernel won't return -EINVAL for a bit
+	 * corresponding to a non-existing command or the one that doesn't
+	 * support BPF token passing. This flag allows an application to request
+	 * BPF token creation for a desired set of commands without worrying
+	 * about older kernels not supporting some of the commands.
+	 * Presumably, deployed applications will do separate feature
+	 * detection and will avoid calling not-yet-supported bpf() commands,
+	 * so this BPF token will work equally well both on older and newer
+	 * kernels, even if some of the requested commands won't be BPF
+	 * token-enabled.
+	 */
+	BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS		  = 1U << 0,
+};
+
 /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
  * the following extensions:
  *
@@ -1621,6 +1650,19 @@ union bpf_attr {
 		__u32		flags;		/* extra flags */
 	} prog_bind_map;
 
+	struct { /* struct used by BPF_TOKEN_CREATE command */
+		__u32		flags;
+		__u32		token_fd;
+		/* a bit set of allowed bpf() syscall commands,
+		 * e.g., (1ULL << BPF_TOKEN_CREATE) | (1ULL << BPF_PROG_LOAD)
+		 * will allow creating derived BPF tokens and loading new BPF
+		 * programs;
+		 * see also BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS for its effect on
+		 * validity checking of this set
+		 */
+		__u64		allowed_cmds;
+	} token_create;
+
 } __attribute__((aligned(8)));
 
 /* The description below is an attempt at providing documentation to eBPF
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index 1d3892168d32..bbc17ea3878f 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -6,7 +6,7 @@ cflags-nogcse-$(CONFIG_X86)$(CONFIG_CC_IS_GCC) := -fno-gcse
 endif
 CFLAGS_core.o += $(call cc-disable-warning, override-init) $(cflags-nogcse-yy)
 
-obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o log.o
+obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o log.o token.o
 obj-$(CONFIG_BPF_SYSCALL) += bpf_iter.o map_iter.o task_iter.o prog_iter.o link_iter.o
 obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o map_in_map.o bloom_filter.o
 obj-$(CONFIG_BPF_SYSCALL) += local_storage.o queue_stack_maps.o ringbuf.o
diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c
index 4174f76133df..55d9a945ad18 100644
--- a/kernel/bpf/inode.c
+++ b/kernel/bpf/inode.c
@@ -27,6 +27,7 @@ enum bpf_type {
 	BPF_TYPE_PROG,
 	BPF_TYPE_MAP,
 	BPF_TYPE_LINK,
+	BPF_TYPE_TOKEN,
 };
 
 static void *bpf_any_get(void *raw, enum bpf_type type)
@@ -41,6 +42,9 @@ static void *bpf_any_get(void *raw, enum bpf_type type)
 	case BPF_TYPE_LINK:
 		bpf_link_inc(raw);
 		break;
+	case BPF_TYPE_TOKEN:
+		bpf_token_inc(raw);
+		break;
 	default:
 		WARN_ON_ONCE(1);
 		break;
@@ -61,6 +65,9 @@ static void bpf_any_put(void *raw, enum bpf_type type)
 	case BPF_TYPE_LINK:
 		bpf_link_put(raw);
 		break;
+	case BPF_TYPE_TOKEN:
+		bpf_token_put(raw);
+		break;
 	default:
 		WARN_ON_ONCE(1);
 		break;
@@ -89,6 +96,12 @@ static void *bpf_fd_probe_obj(u32 ufd, enum bpf_type *type)
 		return raw;
 	}
 
+	raw = bpf_token_get_from_fd(ufd);
+	if (!IS_ERR(raw)) {
+		*type = BPF_TYPE_TOKEN;
+		return raw;
+	}
+
 	return ERR_PTR(-EINVAL);
 }
 
@@ -97,6 +110,7 @@ static const struct inode_operations bpf_dir_iops;
 static const struct inode_operations bpf_prog_iops = { };
 static const struct inode_operations bpf_map_iops  = { };
 static const struct inode_operations bpf_link_iops  = { };
+static const struct inode_operations bpf_token_iops  = { };
 
 static struct inode *bpf_get_inode(struct super_block *sb,
 				   const struct inode *dir,
@@ -136,6 +150,8 @@ static int bpf_inode_type(const struct inode *inode, enum bpf_type *type)
 		*type = BPF_TYPE_MAP;
 	else if (inode->i_op == &bpf_link_iops)
 		*type = BPF_TYPE_LINK;
+	else if (inode->i_op == &bpf_token_iops)
+		*type = BPF_TYPE_TOKEN;
 	else
 		return -EACCES;
 
@@ -369,6 +385,11 @@ static int bpf_mklink(struct dentry *dentry, umode_t mode, void *arg)
 			     &bpf_iter_fops : &bpffs_obj_fops);
 }
 
+static int bpf_mktoken(struct dentry *dentry, umode_t mode, void *arg)
+{
+	return bpf_mkobj_ops(dentry, mode, arg, &bpf_token_iops, &bpffs_obj_fops);
+}
+
 static struct dentry *
 bpf_lookup(struct inode *dir, struct dentry *dentry, unsigned flags)
 {
@@ -469,6 +490,9 @@ static int bpf_obj_do_pin(int path_fd, const char __user *pathname, void *raw,
 	case BPF_TYPE_LINK:
 		ret = vfs_mkobj(dentry, mode, bpf_mklink, raw);
 		break;
+	case BPF_TYPE_TOKEN:
+		ret = vfs_mkobj(dentry, mode, bpf_mktoken, raw);
+		break;
 	default:
 		ret = -EPERM;
 	}
@@ -547,6 +571,8 @@ int bpf_obj_get_user(int path_fd, const char __user *pathname, int flags)
 		ret = bpf_map_new_fd(raw, f_flags);
 	else if (type == BPF_TYPE_LINK)
 		ret = (f_flags != O_RDWR) ? -EINVAL : bpf_link_new_fd(raw);
+	else if (type == BPF_TYPE_TOKEN)
+		ret = (f_flags != O_RDWR) ? -EINVAL : bpf_token_new_fd(raw);
 	else
 		return -ENOENT;
 
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 92a57efc77de..edafb0f3053f 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -5024,6 +5024,77 @@ static int bpf_prog_bind_map(union bpf_attr *attr)
 	return ret;
 }
 
+#define BPF_TOKEN_FLAGS_MASK (BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS)
+#define BPF_TOKEN_CMDS_MASK ((1ULL << BPF_TOKEN_CREATE))
+
+#define BPF_TOKEN_CREATE_LAST_FIELD token_create.allowed_cmds
+
+static int token_create(union bpf_attr *attr)
+{
+	struct bpf_token *new_token, *token = NULL;
+	u64 allowed_cmds;
+	int fd, err;
+
+	if (CHECK_ATTR(BPF_TOKEN_CREATE))
+		return -EINVAL;
+
+	if (attr->token_create.flags & ~BPF_TOKEN_FLAGS_MASK)
+		return -EINVAL;
+
+	if (attr->token_create.token_fd) {
+		token = bpf_token_get_from_fd(attr->token_create.token_fd);
+		if (IS_ERR(token))
+			return PTR_ERR(token);
+		/* if provided BPF token doesn't allow creating new tokens,
+		 * then use system-wide capability checks only
+		 */
+		if (!bpf_token_allow_cmd(token, BPF_TOKEN_CREATE)) {
+			bpf_token_put(token);
+			token = NULL;
+		}
+	}
+
+	allowed_cmds = attr->token_create.allowed_cmds;
+	if (!(attr->token_create.flags & BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS) &&
+	    allowed_cmds & ~BPF_TOKEN_CMDS_MASK) {
+		err = -ENOTSUPP;
+		goto err_out;
+	}
+
+	if (!bpf_token_capable(token, CAP_SYS_ADMIN)) {
+		err = -EPERM;
+		goto err_out;
+	}
+
+	/* requested cmds should be a subset of associated token's set */
+	if (token && (token->allowed_cmds & allowed_cmds) != allowed_cmds) {
+		err = -EPERM;
+		goto err_out;
+	}
+
+	new_token = bpf_token_alloc();
+	if (!new_token) {
+		err = -ENOMEM;
+		goto err_out;
+	}
+
+	new_token->allowed_cmds = allowed_cmds & BPF_TOKEN_CMDS_MASK;
+
+	fd = bpf_token_new_fd(new_token);
+	if (fd < 0) {
+		bpf_token_put(new_token);
+		err = fd;
+		goto err_out;
+	}
+
+	bpf_token_put(token);
+	return fd;
+
+err_out:
+	bpf_token_put(token);
+	return err;
+}
+
 static int __sys_bpf(int cmd, bpfptr_t uattr, unsigned int size)
 {
 	union bpf_attr attr;
@@ -5172,6 +5243,9 @@ static int __sys_bpf(int cmd, bpfptr_t uattr, unsigned int size)
 	case BPF_PROG_BIND_MAP:
 		err = bpf_prog_bind_map(&attr);
 		break;
+	case BPF_TOKEN_CREATE:
+		err = token_create(&attr);
+		break;
 	default:
 		err = -EINVAL;
 		break;
diff --git a/kernel/bpf/token.c b/kernel/bpf/token.c
new file mode 100644
index 000000000000..7e989b25fa06
--- /dev/null
+++ b/kernel/bpf/token.c
@@ -0,0 +1,122 @@
+#include <linux/bpf.h>
+#include <linux/vmalloc.h>
+#include <linux/anon_inodes.h>
+#include <linux/fdtable.h>
+#include <linux/file.h>
+#include <linux/fs.h>
+#include <linux/kernel.h>
+#include <linux/idr.h>
+
+DEFINE_IDR(token_idr);
+DEFINE_SPINLOCK(token_idr_lock);
+
+void bpf_token_inc(struct bpf_token *token)
+{
+	atomic64_inc(&token->refcnt);
+}
+
+static void bpf_token_put_deferred(struct work_struct *work)
+{
+	struct bpf_token *token = container_of(work, struct bpf_token, work);
+
+	kvfree(token);
+}
+
+void bpf_token_put(struct bpf_token *token)
+{
+	if (!token)
+		return;
+
+	if (!atomic64_dec_and_test(&token->refcnt))
+		return;
+
+	INIT_WORK(&token->work, bpf_token_put_deferred);
+	schedule_work(&token->work);
+}
+
+static int bpf_token_release(struct inode *inode, struct file *filp)
+{
+	struct bpf_token *token = filp->private_data;
+
+	bpf_token_put(token);
+	return 0;
+}
+
+static ssize_t bpf_dummy_read(struct file *filp, char __user *buf, size_t siz,
+			      loff_t *ppos)
+{
+	/* We need this handler such that alloc_file() enables
+	 * f_mode with FMODE_CAN_READ.
+	 */
+	return -EINVAL;
+}
+
+static ssize_t bpf_dummy_write(struct file *filp, const char __user *buf,
+			       size_t siz, loff_t *ppos)
+{
+	/* We need this handler such that alloc_file() enables
+	 * f_mode with FMODE_CAN_WRITE.
+	 */
+	return -EINVAL;
+}
+
+static const struct file_operations bpf_token_fops = {
+	.release	= bpf_token_release,
+	.read		= bpf_dummy_read,
+	.write		= bpf_dummy_write,
+};
+
+struct bpf_token *bpf_token_alloc(void)
+{
+	struct bpf_token *token;
+
+	token = kvzalloc(sizeof(*token), GFP_USER);
+	if (token == NULL)
+		return NULL;
+
+	atomic64_set(&token->refcnt, 1);
+
+	return token;
+}
+
+#define BPF_TOKEN_INODE_NAME "bpf-token"
+
+/* Alloc anon_inode and FD for prepared token.
+ * Returns fd >= 0 on success; negative error, otherwise.
+ */
+int bpf_token_new_fd(struct bpf_token *token)
+{
+	return anon_inode_getfd(BPF_TOKEN_INODE_NAME, &bpf_token_fops, token, O_CLOEXEC);
+}
+
+struct bpf_token *bpf_token_get_from_fd(u32 ufd)
+{
+	struct fd f = fdget(ufd);
+	struct bpf_token *token;
+
+	if (!f.file)
+		return ERR_PTR(-EBADF);
+	if (f.file->f_op != &bpf_token_fops) {
+		fdput(f);
+		return ERR_PTR(-EINVAL);
+	}
+
+	token = f.file->private_data;
+	bpf_token_inc(token);
+	fdput(f);
+
+	return token;
+}
+
+bool bpf_token_allow_cmd(const struct bpf_token *token, enum bpf_cmd cmd)
+{
+	if (!token)
+		return false;
+
+	return token->allowed_cmds & (1ULL << cmd);
+}
+
+bool bpf_token_capable(const struct bpf_token *token, int cap)
+{
+	return token || capable(cap) || (cap != CAP_SYS_ADMIN && capable(CAP_SYS_ADMIN));
+}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 9273c654743c..d1d7ca71756f 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -846,6 +846,16 @@ union bpf_iter_link_info {
  *		Returns zero on success. On error, -1 is returned and *errno*
  *		is set appropriately.
  *
+ * BPF_TOKEN_CREATE
+ *	Description
+ *		Create BPF token with embedded information about what
+ *		BPF-related functionality is allowed. This BPF token can be
+ *		passed as an extra parameter to various bpf() syscall commands.
+ *
+ *	Return
+ *		A new file descriptor (a nonnegative integer), or -1 if an
+ *		error occurred (in which case, *errno* is set appropriately).
+ *
  * NOTES
  *	eBPF objects (maps and programs) can be shared between processes.
  *
@@ -900,6 +910,7 @@ enum bpf_cmd {
 	BPF_ITER_CREATE,
 	BPF_LINK_DETACH,
 	BPF_PROG_BIND_MAP,
+	BPF_TOKEN_CREATE,
 };
 
 enum bpf_map_type {
@@ -1169,6 +1180,24 @@ enum bpf_link_type {
  */
 #define BPF_F_KPROBE_MULTI_RETURN	(1U << 0)
 
+/* BPF_TOKEN_CREATE command flags
+ */
+enum {
+	/* Ignore unrecognized bits in token_create.allowed_cmds bit set.  If
+	 * this flag is set, kernel won't return -EINVAL for a bit
+	 * corresponding to a non-existing command or the one that doesn't
+	 * support BPF token passing. This flag allows an application to request
+	 * BPF token creation for a desired set of commands without worrying
+	 * about older kernels not supporting some of the commands.
+	 * Presumably, deployed applications will do separate feature
+	 * detection and will avoid calling not-yet-supported bpf() commands,
+	 * so this BPF token will work equally well both on older and newer
+	 * kernels, even if some of the requested commands won't be BPF
+	 * token-enabled.
+	 */
+	BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS		  = 1U << 0,
+};
+
 /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
  * the following extensions:
  *
@@ -1621,6 +1650,17 @@ union bpf_attr {
 		__u32		flags;		/* extra flags */
 	} prog_bind_map;
 
+	struct { /* struct used by BPF_TOKEN_CREATE command */
+		__u32		flags;
+		__u32		token_fd;
+		/* a bit set of allowed bpf() syscall commands,
+		 * e.g., (1ULL << BPF_TOKEN_CREATE) | (1ULL << BPF_PROG_LOAD)
+		 * will allow creating derived BPF tokens and loading new BPF
+		 * programs
+		 */
+		__u64		allowed_cmds;
+	} token_create;
+
 } __attribute__((aligned(8)));
 
 /* The description below is an attempt at providing documentation to eBPF
-- 
2.34.1



* [PATCH RESEND bpf-next 02/18] libbpf: add bpf_token_create() API
  2023-06-02 14:59 [PATCH RESEND bpf-next 00/18] BPF token Andrii Nakryiko
  2023-06-02 14:59 ` [PATCH RESEND bpf-next 01/18] bpf: introduce BPF token object Andrii Nakryiko
@ 2023-06-02 14:59 ` Andrii Nakryiko
  2023-06-02 14:59 ` [PATCH RESEND bpf-next 03/18] selftests/bpf: add BPF_TOKEN_CREATE test Andrii Nakryiko
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2023-06-02 14:59 UTC (permalink / raw)
  To: bpf; +Cc: linux-security-module, keescook, brauner, lennart, cyphar, luto

Add a low-level wrapper API for the BPF_TOKEN_CREATE command of the bpf() syscall.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/lib/bpf/bpf.c      | 18 ++++++++++++++++++
 tools/lib/bpf/bpf.h      | 11 +++++++++++
 tools/lib/bpf/libbpf.map |  1 +
 3 files changed, 30 insertions(+)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index ed86b37d8024..38be66719485 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -1201,3 +1201,21 @@ int bpf_prog_bind_map(int prog_fd, int map_fd,
 	ret = sys_bpf(BPF_PROG_BIND_MAP, &attr, attr_sz);
 	return libbpf_err_errno(ret);
 }
+
+int bpf_token_create(struct bpf_token_create_opts *opts)
+{
+	const size_t attr_sz = offsetofend(union bpf_attr, token_create);
+	union bpf_attr attr;
+	int ret;
+
+	if (!OPTS_VALID(opts, bpf_token_create_opts))
+		return libbpf_err(-EINVAL);
+
+	memset(&attr, 0, attr_sz);
+	attr.token_create.flags = OPTS_GET(opts, flags, 0);
+	attr.token_create.token_fd = OPTS_GET(opts, token_fd, 0);
+	attr.token_create.allowed_cmds = OPTS_GET(opts, allowed_cmds, 0);
+
+	ret = sys_bpf_fd(BPF_TOKEN_CREATE, &attr, attr_sz);
+	return libbpf_err_errno(ret);
+}
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index 9aa0ee473754..f2b8041ca27a 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -551,6 +551,17 @@ struct bpf_test_run_opts {
 LIBBPF_API int bpf_prog_test_run_opts(int prog_fd,
 				      struct bpf_test_run_opts *opts);
 
+struct bpf_token_create_opts {
+	size_t sz; /* size of this struct for forward/backward compatibility */
+	__u32 flags;
+	__u32 token_fd;
+	__u64 allowed_cmds;
+	size_t :0;
+};
+#define bpf_token_create_opts__last_field allowed_cmds
+
+LIBBPF_API int bpf_token_create(struct bpf_token_create_opts *opts);
+
 #ifdef __cplusplus
 } /* extern "C" */
 #endif
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index 7521a2fb7626..62cbe4775081 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -395,4 +395,5 @@ LIBBPF_1.2.0 {
 LIBBPF_1.3.0 {
 	global:
 		bpf_obj_pin_opts;
+		bpf_token_create;
 } LIBBPF_1.2.0;
-- 
2.34.1



* [PATCH RESEND bpf-next 03/18] selftests/bpf: add BPF_TOKEN_CREATE test
  2023-06-02 14:59 [PATCH RESEND bpf-next 00/18] BPF token Andrii Nakryiko
  2023-06-02 14:59 ` [PATCH RESEND bpf-next 01/18] bpf: introduce BPF token object Andrii Nakryiko
  2023-06-02 14:59 ` [PATCH RESEND bpf-next 02/18] libbpf: add bpf_token_create() API Andrii Nakryiko
@ 2023-06-02 14:59 ` Andrii Nakryiko
  2023-06-02 14:59 ` [PATCH RESEND bpf-next 04/18] bpf: move unprivileged checks into map_create() and bpf_prog_load() Andrii Nakryiko
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2023-06-02 14:59 UTC (permalink / raw)
  To: bpf; +Cc: linux-security-module, keescook, brauner, lennart, cyphar, luto

Add a subtest validating the BPF_TOKEN_CREATE command, pinning/getting a BPF
token in/from BPF FS, and creating derived BPF tokens using the token_fd
parameter.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 .../testing/selftests/bpf/prog_tests/token.c  | 95 +++++++++++++++++++
 1 file changed, 95 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/token.c

diff --git a/tools/testing/selftests/bpf/prog_tests/token.c b/tools/testing/selftests/bpf/prog_tests/token.c
new file mode 100644
index 000000000000..fe78b558d697
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/token.c
@@ -0,0 +1,95 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */
+#include "linux/bpf.h"
+#include <test_progs.h>
+#include <bpf/btf.h>
+#include "cap_helpers.h"
+
+static int drop_priv_caps(__u64 *old_caps)
+{
+	return cap_disable_effective((1ULL << CAP_BPF) |
+				     (1ULL << CAP_PERFMON) |
+				     (1ULL << CAP_NET_ADMIN) |
+				     (1ULL << CAP_SYS_ADMIN), old_caps);
+}
+
+static int restore_priv_caps(__u64 old_caps)
+{
+	return cap_enable_effective(old_caps, NULL);
+}
+
+#define TOKEN_PATH "/sys/fs/bpf/test_token"
+
+static void subtest_token_create(void)
+{
+	LIBBPF_OPTS(bpf_token_create_opts, opts);
+	int token_fd = 0, limited_token_fd = 0, tmp_fd = 0, err;
+	__u64 old_caps = 0;
+
+	/* create BPF token which allows creating derived BPF tokens */
+	opts.allowed_cmds = 1ULL << BPF_TOKEN_CREATE;
+	token_fd = bpf_token_create(&opts);
+	if (!ASSERT_GT(token_fd, 0, "token_create"))
+		return;
+
+	/* check that IGNORE_UNKNOWN_CMDS flag is respected */
+	opts.flags = BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS;
+	opts.allowed_cmds = ~0ULL; /* any current and future cmd is allowed */
+	tmp_fd = bpf_token_create(&opts);
+	if (!ASSERT_GT(tmp_fd, 0, "token_create_future_proof"))
+		goto cleanup;
+	close(tmp_fd);
+	tmp_fd = 0;
+
+	/* validate pinning and getting works as expected */
+	err = bpf_obj_pin(token_fd, TOKEN_PATH);
+	if (!ASSERT_OK(err, "token_pin"))
+		goto cleanup;
+
+	tmp_fd = bpf_obj_get(TOKEN_PATH);
+	ASSERT_GT(tmp_fd, 0, "token_get");
+	close(tmp_fd);
+	tmp_fd = 0;
+	unlink(TOKEN_PATH);
+
+	/* drop privileges to test token_fd passing */
+	if (!ASSERT_OK(drop_priv_caps(&old_caps), "drop_caps"))
+		goto cleanup;
+
+	/* unprivileged BPF_TOKEN_CREATE should fail */
+	tmp_fd = bpf_token_create(NULL);
+	if (!ASSERT_LT(tmp_fd, 0, "token_create_unpriv_fail"))
+		goto cleanup;
+
+	/* unprivileged BPF_TOKEN_CREATE with associated BPF token succeeds */
+	opts.flags = 0;
+	opts.allowed_cmds = 0; /* ask for BPF token which doesn't allow new tokens */
+	opts.token_fd = token_fd;
+	limited_token_fd = bpf_token_create(&opts);
+	if (!ASSERT_GT(limited_token_fd, 0, "token_create_limited"))
+		goto cleanup;
+
+	/* creating yet another token using "limited" BPF token should fail */
+	opts.flags = 0;
+	opts.allowed_cmds = 0;
+	opts.token_fd = limited_token_fd;
+	tmp_fd = bpf_token_create(&opts);
+	if (!ASSERT_LT(tmp_fd, 0, "token_create_from_lim_fail"))
+		goto cleanup;
+
+cleanup:
+	if (tmp_fd)
+		close(tmp_fd);
+	if (token_fd)
+		close(token_fd);
+	if (limited_token_fd)
+		close(limited_token_fd);
+	if (old_caps)
+		ASSERT_OK(restore_priv_caps(old_caps), "restore_caps");
+}
+
+void test_token(void)
+{
+	if (test__start_subtest("token_create"))
+		subtest_token_create();
+}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH RESEND bpf-next 04/18] bpf: move unprivileged checks into map_create() and bpf_prog_load()
  2023-06-02 14:59 [PATCH RESEND bpf-next 00/18] BPF token Andrii Nakryiko
                   ` (2 preceding siblings ...)
  2023-06-02 14:59 ` [PATCH RESEND bpf-next 03/18] selftests/bpf: add BPF_TOKEN_CREATE test Andrii Nakryiko
@ 2023-06-02 14:59 ` Andrii Nakryiko
  2023-06-02 14:59 ` [PATCH RESEND bpf-next 05/18] bpf: inline map creation logic in map_create() function Andrii Nakryiko
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2023-06-02 14:59 UTC (permalink / raw)
  To: bpf; +Cc: linux-security-module, keescook, brauner, lennart, cyphar, luto

Make each bpf() syscall command a bit more self-contained, making it
easier to further enhance it. We move sysctl_unprivileged_bpf_disabled
handling down to map_create() and bpf_prog_load(), two special commands
in this regard.

Also swap the order of checks, calling bpf_capable() only if
sysctl_unprivileged_bpf_disabled is true, avoiding unnecessary audit
messages.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 kernel/bpf/syscall.c | 34 +++++++++++++++++++---------------
 1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index edafb0f3053f..4c9e79ec40e2 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1157,6 +1157,15 @@ static int map_create(union bpf_attr *attr)
 	     !node_online(numa_node)))
 		return -EINVAL;
 
+	/* Intent here is for unprivileged_bpf_disabled to block BPF map
+	 * creation for unprivileged users; other actions depend
+	 * on fd availability and access to bpffs, so are dependent on
+	 * object creation success. Even with unprivileged BPF disabled,
+	 * capability checks are still carried out.
+	 */
+	if (sysctl_unprivileged_bpf_disabled && !bpf_capable())
+		return -EPERM;
+
 	/* find map type and init map: hashtable vs rbtree vs bloom vs ... */
 	map = find_and_alloc_map(attr);
 	if (IS_ERR(map))
@@ -2532,6 +2541,16 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
 	/* eBPF programs must be GPL compatible to use GPL-ed functions */
 	is_gpl = license_is_gpl_compatible(license);
 
+	/* Intent here is for unprivileged_bpf_disabled to block BPF program
+	 * creation for unprivileged users; other actions depend
+	 * on fd availability and access to bpffs, so are dependent on
+	 * object creation success. Even with unprivileged BPF disabled,
+	 * capability checks are still carried out for these
+	 * and other operations.
+	 */
+	if (sysctl_unprivileged_bpf_disabled && !bpf_capable())
+		return -EPERM;
+
 	if (attr->insn_cnt == 0 ||
 	    attr->insn_cnt > (bpf_capable() ? BPF_COMPLEXITY_LIMIT_INSNS : BPF_MAXINSNS))
 		return -E2BIG;
@@ -5098,23 +5117,8 @@ static int token_create(union bpf_attr *attr)
 static int __sys_bpf(int cmd, bpfptr_t uattr, unsigned int size)
 {
 	union bpf_attr attr;
-	bool capable;
 	int err;
 
-	capable = bpf_capable() || !sysctl_unprivileged_bpf_disabled;
-
-	/* Intent here is for unprivileged_bpf_disabled to block key object
-	 * creation commands for unprivileged users; other actions depend
-	 * of fd availability and access to bpffs, so are dependent on
-	 * object creation success.  Capabilities are later verified for
-	 * operations such as load and map create, so even with unprivileged
-	 * BPF disabled, capability checks are still carried out for these
-	 * and other operations.
-	 */
-	if (!capable &&
-	    (cmd == BPF_MAP_CREATE || cmd == BPF_PROG_LOAD))
-		return -EPERM;
-
 	err = bpf_check_uarg_tail_zero(uattr, sizeof(attr), size);
 	if (err)
 		return err;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH RESEND bpf-next 05/18] bpf: inline map creation logic in map_create() function
  2023-06-02 14:59 [PATCH RESEND bpf-next 00/18] BPF token Andrii Nakryiko
                   ` (3 preceding siblings ...)
  2023-06-02 14:59 ` [PATCH RESEND bpf-next 04/18] bpf: move unprivileged checks into map_create() and bpf_prog_load() Andrii Nakryiko
@ 2023-06-02 14:59 ` Andrii Nakryiko
  2023-06-02 14:59 ` [PATCH RESEND bpf-next 06/18] bpf: centralize permissions checks for all BPF map types Andrii Nakryiko
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2023-06-02 14:59 UTC (permalink / raw)
  To: bpf; +Cc: linux-security-module, keescook, brauner, lennart, cyphar, luto

Currently find_and_alloc_map() performs two separate functions: some
argument sanity checking and partial handling of the map creation
workflow. Neither of these is self-sufficient; both are augmented by
further checks and initialization logic in the caller (the map_create()
function). So unify all the sanity checks, permission checks, and
creation and initialization logic in one linear piece of code in
map_create() instead. This also makes it easier to further enhance
permission checks and keep them located in one place.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 kernel/bpf/syscall.c | 57 +++++++++++++++++++-------------------------
 1 file changed, 24 insertions(+), 33 deletions(-)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 4c9e79ec40e2..cd68c57c0689 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -109,37 +109,6 @@ const struct bpf_map_ops bpf_map_offload_ops = {
 	.map_mem_usage = bpf_map_offload_map_mem_usage,
 };
 
-static struct bpf_map *find_and_alloc_map(union bpf_attr *attr)
-{
-	const struct bpf_map_ops *ops;
-	u32 type = attr->map_type;
-	struct bpf_map *map;
-	int err;
-
-	if (type >= ARRAY_SIZE(bpf_map_types))
-		return ERR_PTR(-EINVAL);
-	type = array_index_nospec(type, ARRAY_SIZE(bpf_map_types));
-	ops = bpf_map_types[type];
-	if (!ops)
-		return ERR_PTR(-EINVAL);
-
-	if (ops->map_alloc_check) {
-		err = ops->map_alloc_check(attr);
-		if (err)
-			return ERR_PTR(err);
-	}
-	if (attr->map_ifindex)
-		ops = &bpf_map_offload_ops;
-	if (!ops->map_mem_usage)
-		return ERR_PTR(-EINVAL);
-	map = ops->map_alloc(attr);
-	if (IS_ERR(map))
-		return map;
-	map->ops = ops;
-	map->map_type = type;
-	return map;
-}
-
 static void bpf_map_write_active_inc(struct bpf_map *map)
 {
 	atomic64_inc(&map->writecnt);
@@ -1127,7 +1096,9 @@ static int map_check_btf(struct bpf_map *map, const struct btf *btf,
 /* called via syscall */
 static int map_create(union bpf_attr *attr)
 {
+	const struct bpf_map_ops *ops;
 	int numa_node = bpf_map_attr_numa_node(attr);
+	u32 map_type = attr->map_type;
 	struct bpf_map *map;
 	int f_flags;
 	int err;
@@ -1157,6 +1128,25 @@ static int map_create(union bpf_attr *attr)
 	     !node_online(numa_node)))
 		return -EINVAL;
 
+	/* find map type and init map: hashtable vs rbtree vs bloom vs ... */
+	map_type = attr->map_type;
+	if (map_type >= ARRAY_SIZE(bpf_map_types))
+		return -EINVAL;
+	map_type = array_index_nospec(map_type, ARRAY_SIZE(bpf_map_types));
+	ops = bpf_map_types[map_type];
+	if (!ops)
+		return -EINVAL;
+
+	if (ops->map_alloc_check) {
+		err = ops->map_alloc_check(attr);
+		if (err)
+			return err;
+	}
+	if (attr->map_ifindex)
+		ops = &bpf_map_offload_ops;
+	if (!ops->map_mem_usage)
+		return -EINVAL;
+
 	/* Intent here is for unprivileged_bpf_disabled to block BPF map
 	 * creation for unprivileged users; other actions depend
 	 * on fd availability and access to bpffs, so are dependent on
@@ -1166,10 +1156,11 @@ static int map_create(union bpf_attr *attr)
 	if (sysctl_unprivileged_bpf_disabled && !bpf_capable())
 		return -EPERM;
 
-	/* find map type and init map: hashtable vs rbtree vs bloom vs ... */
-	map = find_and_alloc_map(attr);
+	map = ops->map_alloc(attr);
 	if (IS_ERR(map))
 		return PTR_ERR(map);
+	map->ops = ops;
+	map->map_type = map_type;
 
 	err = bpf_obj_name_cpy(map->name, attr->map_name,
 			       sizeof(attr->map_name));
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH RESEND bpf-next 06/18] bpf: centralize permissions checks for all BPF map types
  2023-06-02 14:59 [PATCH RESEND bpf-next 00/18] BPF token Andrii Nakryiko
                   ` (4 preceding siblings ...)
  2023-06-02 14:59 ` [PATCH RESEND bpf-next 05/18] bpf: inline map creation logic in map_create() function Andrii Nakryiko
@ 2023-06-02 14:59 ` Andrii Nakryiko
  2023-06-02 15:00 ` [PATCH RESEND bpf-next 07/18] bpf: add BPF token support to BPF_MAP_CREATE command Andrii Nakryiko
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2023-06-02 14:59 UTC (permalink / raw)
  To: bpf; +Cc: linux-security-module, keescook, brauner, lennart, cyphar, luto

This allows for more centralized decisions later on, and generally
makes it explicit which map types are privileged and which are not
(e.g., LRU_HASH and LRU_PERCPU_HASH, which are privileged HASH variants,
as opposed to unprivileged HASH and PERCPU_HASH; now this is explicit
and easy to verify).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 kernel/bpf/bloom_filter.c                     |  3 --
 kernel/bpf/bpf_local_storage.c                |  3 --
 kernel/bpf/bpf_struct_ops.c                   |  3 --
 kernel/bpf/cpumap.c                           |  4 --
 kernel/bpf/devmap.c                           |  3 --
 kernel/bpf/hashtab.c                          |  6 ---
 kernel/bpf/lpm_trie.c                         |  3 --
 kernel/bpf/queue_stack_maps.c                 |  4 --
 kernel/bpf/reuseport_array.c                  |  3 --
 kernel/bpf/stackmap.c                         |  3 --
 kernel/bpf/syscall.c                          | 47 +++++++++++++++++++
 net/core/sock_map.c                           |  4 --
 net/xdp/xskmap.c                              |  4 --
 .../bpf/prog_tests/unpriv_bpf_disabled.c      |  6 ++-
 14 files changed, 52 insertions(+), 44 deletions(-)

diff --git a/kernel/bpf/bloom_filter.c b/kernel/bpf/bloom_filter.c
index 540331b610a9..addf3dd57b59 100644
--- a/kernel/bpf/bloom_filter.c
+++ b/kernel/bpf/bloom_filter.c
@@ -86,9 +86,6 @@ static struct bpf_map *bloom_map_alloc(union bpf_attr *attr)
 	int numa_node = bpf_map_attr_numa_node(attr);
 	struct bpf_bloom_filter *bloom;
 
-	if (!bpf_capable())
-		return ERR_PTR(-EPERM);
-
 	if (attr->key_size != 0 || attr->value_size == 0 ||
 	    attr->max_entries == 0 ||
 	    attr->map_flags & ~BLOOM_CREATE_FLAG_MASK ||
diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
index 47d9948d768f..b5149cfce7d4 100644
--- a/kernel/bpf/bpf_local_storage.c
+++ b/kernel/bpf/bpf_local_storage.c
@@ -723,9 +723,6 @@ int bpf_local_storage_map_alloc_check(union bpf_attr *attr)
 	    !attr->btf_key_type_id || !attr->btf_value_type_id)
 		return -EINVAL;
 
-	if (!bpf_capable())
-		return -EPERM;
-
 	if (attr->value_size > BPF_LOCAL_STORAGE_MAX_VALUE_SIZE)
 		return -E2BIG;
 
diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
index d3f0a4825fa6..116a0ce378ec 100644
--- a/kernel/bpf/bpf_struct_ops.c
+++ b/kernel/bpf/bpf_struct_ops.c
@@ -655,9 +655,6 @@ static struct bpf_map *bpf_struct_ops_map_alloc(union bpf_attr *attr)
 	const struct btf_type *t, *vt;
 	struct bpf_map *map;
 
-	if (!bpf_capable())
-		return ERR_PTR(-EPERM);
-
 	st_ops = bpf_struct_ops_find_value(attr->btf_vmlinux_value_type_id);
 	if (!st_ops)
 		return ERR_PTR(-ENOTSUPP);
diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
index 8ec18faa74ac..8a33e8747a0e 100644
--- a/kernel/bpf/cpumap.c
+++ b/kernel/bpf/cpumap.c
@@ -28,7 +28,6 @@
 #include <linux/sched.h>
 #include <linux/workqueue.h>
 #include <linux/kthread.h>
-#include <linux/capability.h>
 #include <trace/events/xdp.h>
 #include <linux/btf_ids.h>
 
@@ -89,9 +88,6 @@ static struct bpf_map *cpu_map_alloc(union bpf_attr *attr)
 	u32 value_size = attr->value_size;
 	struct bpf_cpu_map *cmap;
 
-	if (!bpf_capable())
-		return ERR_PTR(-EPERM);
-
 	/* check sanity of attributes */
 	if (attr->max_entries == 0 || attr->key_size != 4 ||
 	    (value_size != offsetofend(struct bpf_cpumap_val, qsize) &&
diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index 802692fa3905..49cc0b5671c6 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -160,9 +160,6 @@ static struct bpf_map *dev_map_alloc(union bpf_attr *attr)
 	struct bpf_dtab *dtab;
 	int err;
 
-	if (!capable(CAP_NET_ADMIN))
-		return ERR_PTR(-EPERM);
-
 	dtab = bpf_map_area_alloc(sizeof(*dtab), NUMA_NO_NODE);
 	if (!dtab)
 		return ERR_PTR(-ENOMEM);
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 9901efee4339..56d3da7d0bc6 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -422,12 +422,6 @@ static int htab_map_alloc_check(union bpf_attr *attr)
 	BUILD_BUG_ON(offsetof(struct htab_elem, fnode.next) !=
 		     offsetof(struct htab_elem, hash_node.pprev));
 
-	if (lru && !bpf_capable())
-		/* LRU implementation is much complicated than other
-		 * maps.  Hence, limit to CAP_BPF.
-		 */
-		return -EPERM;
-
 	if (zero_seed && !capable(CAP_SYS_ADMIN))
 		/* Guard against local DoS, and discourage production use. */
 		return -EPERM;
diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c
index e0d3ddf2037a..17c7e7782a1f 100644
--- a/kernel/bpf/lpm_trie.c
+++ b/kernel/bpf/lpm_trie.c
@@ -544,9 +544,6 @@ static struct bpf_map *trie_alloc(union bpf_attr *attr)
 {
 	struct lpm_trie *trie;
 
-	if (!bpf_capable())
-		return ERR_PTR(-EPERM);
-
 	/* check sanity of attributes */
 	if (attr->max_entries == 0 ||
 	    !(attr->map_flags & BPF_F_NO_PREALLOC) ||
diff --git a/kernel/bpf/queue_stack_maps.c b/kernel/bpf/queue_stack_maps.c
index 601609164ef3..8d2ddcb7566b 100644
--- a/kernel/bpf/queue_stack_maps.c
+++ b/kernel/bpf/queue_stack_maps.c
@@ -7,7 +7,6 @@
 #include <linux/bpf.h>
 #include <linux/list.h>
 #include <linux/slab.h>
-#include <linux/capability.h>
 #include <linux/btf_ids.h>
 #include "percpu_freelist.h"
 
@@ -46,9 +45,6 @@ static bool queue_stack_map_is_full(struct bpf_queue_stack *qs)
 /* Called from syscall */
 static int queue_stack_map_alloc_check(union bpf_attr *attr)
 {
-	if (!bpf_capable())
-		return -EPERM;
-
 	/* check sanity of attributes */
 	if (attr->max_entries == 0 || attr->key_size != 0 ||
 	    attr->value_size == 0 ||
diff --git a/kernel/bpf/reuseport_array.c b/kernel/bpf/reuseport_array.c
index cbf2d8d784b8..4b4f9670f1a9 100644
--- a/kernel/bpf/reuseport_array.c
+++ b/kernel/bpf/reuseport_array.c
@@ -151,9 +151,6 @@ static struct bpf_map *reuseport_array_alloc(union bpf_attr *attr)
 	int numa_node = bpf_map_attr_numa_node(attr);
 	struct reuseport_array *array;
 
-	if (!bpf_capable())
-		return ERR_PTR(-EPERM);
-
 	/* allocate all map elements and zero-initialize them */
 	array = bpf_map_area_alloc(struct_size(array, ptrs, attr->max_entries), numa_node);
 	if (!array)
diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index b25fce425b2c..458bb80b14d5 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -74,9 +74,6 @@ static struct bpf_map *stack_map_alloc(union bpf_attr *attr)
 	u64 cost, n_buckets;
 	int err;
 
-	if (!bpf_capable())
-		return ERR_PTR(-EPERM);
-
 	if (attr->map_flags & ~STACK_CREATE_FLAG_MASK)
 		return ERR_PTR(-EINVAL);
 
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index cd68c57c0689..6e7ccbd54524 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1156,6 +1156,53 @@ static int map_create(union bpf_attr *attr)
 	if (sysctl_unprivileged_bpf_disabled && !bpf_capable())
 		return -EPERM;
 
+	/* check privileged map type permissions */
+	switch (map_type) {
+	case BPF_MAP_TYPE_ARRAY:
+	case BPF_MAP_TYPE_PERCPU_ARRAY:
+	case BPF_MAP_TYPE_PROG_ARRAY:
+	case BPF_MAP_TYPE_PERF_EVENT_ARRAY:
+	case BPF_MAP_TYPE_CGROUP_ARRAY:
+	case BPF_MAP_TYPE_ARRAY_OF_MAPS:
+	case BPF_MAP_TYPE_HASH:
+	case BPF_MAP_TYPE_PERCPU_HASH:
+	case BPF_MAP_TYPE_HASH_OF_MAPS:
+	case BPF_MAP_TYPE_RINGBUF:
+	case BPF_MAP_TYPE_USER_RINGBUF:
+	case BPF_MAP_TYPE_CGROUP_STORAGE:
+	case BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE:
+		/* unprivileged */
+		break;
+	case BPF_MAP_TYPE_SK_STORAGE:
+	case BPF_MAP_TYPE_INODE_STORAGE:
+	case BPF_MAP_TYPE_TASK_STORAGE:
+	case BPF_MAP_TYPE_CGRP_STORAGE:
+	case BPF_MAP_TYPE_BLOOM_FILTER:
+	case BPF_MAP_TYPE_LPM_TRIE:
+	case BPF_MAP_TYPE_REUSEPORT_SOCKARRAY:
+	case BPF_MAP_TYPE_STACK_TRACE:
+	case BPF_MAP_TYPE_QUEUE:
+	case BPF_MAP_TYPE_STACK:
+	case BPF_MAP_TYPE_LRU_HASH:
+	case BPF_MAP_TYPE_LRU_PERCPU_HASH:
+	case BPF_MAP_TYPE_STRUCT_OPS:
+	case BPF_MAP_TYPE_CPUMAP:
+		if (!bpf_capable())
+			return -EPERM;
+		break;
+	case BPF_MAP_TYPE_SOCKMAP:
+	case BPF_MAP_TYPE_SOCKHASH:
+	case BPF_MAP_TYPE_DEVMAP:
+	case BPF_MAP_TYPE_DEVMAP_HASH:
+	case BPF_MAP_TYPE_XSKMAP:
+		if (!capable(CAP_NET_ADMIN))
+			return -EPERM;
+		break;
+	default:
+		WARN(1, "unsupported map type %d", map_type);
+		return -EPERM;
+	}
+
 	map = ops->map_alloc(attr);
 	if (IS_ERR(map))
 		return PTR_ERR(map);
diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index 00afb66cd095..19538d628714 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -32,8 +32,6 @@ static struct bpf_map *sock_map_alloc(union bpf_attr *attr)
 {
 	struct bpf_stab *stab;
 
-	if (!capable(CAP_NET_ADMIN))
-		return ERR_PTR(-EPERM);
 	if (attr->max_entries == 0 ||
 	    attr->key_size    != 4 ||
 	    (attr->value_size != sizeof(u32) &&
@@ -1085,8 +1083,6 @@ static struct bpf_map *sock_hash_alloc(union bpf_attr *attr)
 	struct bpf_shtab *htab;
 	int i, err;
 
-	if (!capable(CAP_NET_ADMIN))
-		return ERR_PTR(-EPERM);
 	if (attr->max_entries == 0 ||
 	    attr->key_size    == 0 ||
 	    (attr->value_size != sizeof(u32) &&
diff --git a/net/xdp/xskmap.c b/net/xdp/xskmap.c
index 2c1427074a3b..e1c526f97ce3 100644
--- a/net/xdp/xskmap.c
+++ b/net/xdp/xskmap.c
@@ -5,7 +5,6 @@
 
 #include <linux/bpf.h>
 #include <linux/filter.h>
-#include <linux/capability.h>
 #include <net/xdp_sock.h>
 #include <linux/slab.h>
 #include <linux/sched.h>
@@ -68,9 +67,6 @@ static struct bpf_map *xsk_map_alloc(union bpf_attr *attr)
 	int numa_node;
 	u64 size;
 
-	if (!capable(CAP_NET_ADMIN))
-		return ERR_PTR(-EPERM);
-
 	if (attr->max_entries == 0 || attr->key_size != 4 ||
 	    attr->value_size != 4 ||
 	    attr->map_flags & ~(BPF_F_NUMA_NODE | BPF_F_RDONLY | BPF_F_WRONLY))
diff --git a/tools/testing/selftests/bpf/prog_tests/unpriv_bpf_disabled.c b/tools/testing/selftests/bpf/prog_tests/unpriv_bpf_disabled.c
index 8383a99f610f..0adf8d9475cb 100644
--- a/tools/testing/selftests/bpf/prog_tests/unpriv_bpf_disabled.c
+++ b/tools/testing/selftests/bpf/prog_tests/unpriv_bpf_disabled.c
@@ -171,7 +171,11 @@ static void test_unpriv_bpf_disabled_negative(struct test_unpriv_bpf_disabled *s
 				prog_insns, prog_insn_cnt, &load_opts),
 		  -EPERM, "prog_load_fails");
 
-	for (i = BPF_MAP_TYPE_HASH; i <= BPF_MAP_TYPE_BLOOM_FILTER; i++)
+	/* some map types require particular correct parameters which could be
+	 * sanity-checked before enforcing -EPERM, so only validate that
+	 * the simple ARRAY and HASH maps are failing with -EPERM
+	 */
+	for (i = BPF_MAP_TYPE_HASH; i <= BPF_MAP_TYPE_ARRAY; i++)
 		ASSERT_EQ(bpf_map_create(i, NULL, sizeof(int), sizeof(int), 1, NULL),
 			  -EPERM, "map_create_fails");
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH RESEND bpf-next 07/18] bpf: add BPF token support to BPF_MAP_CREATE command
  2023-06-02 14:59 [PATCH RESEND bpf-next 00/18] BPF token Andrii Nakryiko
                   ` (5 preceding siblings ...)
  2023-06-02 14:59 ` [PATCH RESEND bpf-next 06/18] bpf: centralize permissions checks for all BPF map types Andrii Nakryiko
@ 2023-06-02 15:00 ` Andrii Nakryiko
  2023-06-02 15:00 ` [PATCH RESEND bpf-next 08/18] libbpf: add BPF token support to bpf_map_create() API Andrii Nakryiko
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2023-06-02 15:00 UTC (permalink / raw)
  To: bpf; +Cc: linux-security-module, keescook, brauner, lennart, cyphar, luto

Allow providing token_fd for the BPF_MAP_CREATE command to enable
controlled BPF map creation from an unprivileged process through a
delegated BPF token.

Further, add a filter of allowed BPF map types to the BPF token,
specified at BPF token creation time. This, in combination with
allowed_cmds, allows creating a narrowly-focused BPF token (controlled
by a privileged agent) with a restrictive set of BPF map types that the
application can attempt to create.

The BPF_F_TOKEN_IGNORE_UNKNOWN_MAP_TYPES flag allows an application to
express "any supported BPF map type" without having to do elaborate
feature detection for each supported BPF map type. This is a decidedly
non-trivial process, especially as some BPF maps have special
requirements just to instantiate a minimal instance (e.g., custom BTF).
Allowing the application to just specify ~0 as the bit set makes
writing applications much simpler without compromising any of the
kernel's safety or backwards compatibility concerns.
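
The masking arithmetic behind "~0 means anything you support" can be
illustrated standalone. The sketch below mirrors the shape of the
BPF_TOKEN_MAP_TYPES_MASK definition from this patch, but all names and the
type count are local stand-ins, not the kernel's definitions:

```c
#include <assert.h>
#include <stdint.h>

#define BIT_ULL(n) (1ULL << (n))

/* Stand-ins: map type 0 is UNSPEC and never allowed; MAX_MAP_TYPE
 * models __MAX_BPF_MAP_TYPE, the first value unknown to this kernel.
 * The count 33 is illustrative only.
 */
#define MAP_TYPE_UNSPEC	0
#define MAX_MAP_TYPE	33

/* all currently known map types except UNSPEC */
#define MAP_TYPES_MASK \
	((BIT_ULL(MAX_MAP_TYPE) - 1) & ~BIT_ULL(MAP_TYPE_UNSPEC))

/* With the ignore-unknown flag set, a request of ~0ULL is silently
 * trimmed to the known set; without it, any unknown bit is rejected
 * (models the -ENOTSUPP path in token_create()).
 */
static int resolve_allowed(uint64_t requested, int ignore_unknown,
			   uint64_t *out)
{
	if (!ignore_unknown && (requested & ~MAP_TYPES_MASK))
		return -1;	/* models -ENOTSUPP */
	*out = requested & MAP_TYPES_MASK;
	return 0;
}
```

A future kernel with more map types simply widens MAP_TYPES_MASK, so an
application passing ~0ULL with the flag keeps working without recompilation.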

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 include/linux/bpf.h                           |  3 +
 include/uapi/linux/bpf.h                      | 12 +++
 kernel/bpf/syscall.c                          | 82 +++++++++++++++----
 kernel/bpf/token.c                            |  8 ++
 tools/include/uapi/linux/bpf.h                | 16 +++-
 .../selftests/bpf/prog_tests/libbpf_str.c     |  3 +
 6 files changed, 108 insertions(+), 16 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index fe6d51c3a5b1..657bec546351 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -251,6 +251,7 @@ struct bpf_map {
 	u32 btf_value_type_id;
 	u32 btf_vmlinux_value_type_id;
 	struct btf *btf;
+	struct bpf_token *token;
 #ifdef CONFIG_MEMCG_KMEM
 	struct obj_cgroup *objcg;
 #endif
@@ -1538,6 +1539,7 @@ struct bpf_token {
 	struct work_struct work;
 	atomic64_t refcnt;
 	u64 allowed_cmds;
+	u64 allowed_map_types;
 };
 
 struct bpf_struct_ops_value;
@@ -2092,6 +2094,7 @@ struct bpf_token *bpf_token_get_from_fd(u32 ufd);
 
 bool bpf_token_capable(const struct bpf_token *token, int cap);
 bool bpf_token_allow_cmd(const struct bpf_token *token, enum bpf_cmd cmd);
+bool bpf_token_allow_map_type(const struct bpf_token *token, enum bpf_map_type type);
 
 int bpf_obj_pin_user(u32 ufd, int path_fd, const char __user *pathname);
 int bpf_obj_get_user(int path_fd, const char __user *pathname, int flags);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 01ab79f2ad9f..7cfaa2da84ee 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -954,6 +954,7 @@ enum bpf_map_type {
 	BPF_MAP_TYPE_BLOOM_FILTER,
 	BPF_MAP_TYPE_USER_RINGBUF,
 	BPF_MAP_TYPE_CGRP_STORAGE,
+	__MAX_BPF_MAP_TYPE
 };
 
 /* Note that tracing related programs such as
@@ -1196,6 +1197,10 @@ enum {
 	 * token-enabled.
 	 */
 	BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS		  = 1U << 0,
+	/* Similar to BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS flag, but for
+	 * token_create.allowed_map_types bit set.
+	 */
+	BPF_F_TOKEN_IGNORE_UNKNOWN_MAP_TYPES	  = 1U << 1,
 };
 
 /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
@@ -1377,6 +1382,7 @@ union bpf_attr {
 		 * to using 5 hash functions).
 		 */
 		__u64	map_extra;
+		__u32	map_token_fd;
 	};
 
 	struct { /* anonymous struct used by BPF_MAP_*_ELEM commands */
@@ -1661,6 +1667,12 @@ union bpf_attr {
 		 * validity checking of this set
 		 */
 		__u64		allowed_cmds;
+		/* similarly to allowed_cmds, a bit set of BPF map types that
+		 * are allowed to be created by requested BPF token;
+		 * see also BPF_F_TOKEN_IGNORE_UNKNOWN_MAP_TYPES for its
+		 * effect on validity checking of this set
+		 */
+		__u64		allowed_map_types;
 	} token_create;
 
 } __attribute__((aligned(8)));
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 6e7ccbd54524..eb77ba71fbcf 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -691,6 +691,7 @@ static void bpf_map_free_deferred(struct work_struct *work)
 {
 	struct bpf_map *map = container_of(work, struct bpf_map, work);
 	struct btf_record *rec = map->record;
+	struct bpf_token *token = map->token;
 
 	security_bpf_map_free(map);
 	bpf_map_release_memcg(map);
@@ -706,6 +707,7 @@ static void bpf_map_free_deferred(struct work_struct *work)
 	 * template bpf_map struct used during verification.
 	 */
 	btf_record_free(rec);
+	bpf_token_put(token);
 }
 
 static void bpf_map_put_uref(struct bpf_map *map)
@@ -1010,7 +1012,7 @@ static int map_check_btf(struct bpf_map *map, const struct btf *btf,
 	if (!IS_ERR_OR_NULL(map->record)) {
 		int i;
 
-		if (!bpf_capable()) {
+		if (!bpf_token_capable(map->token, CAP_BPF)) {
 			ret = -EPERM;
 			goto free_map_tab;
 		}
@@ -1092,11 +1094,12 @@ static int map_check_btf(struct bpf_map *map, const struct btf *btf,
 	return ret;
 }
 
-#define BPF_MAP_CREATE_LAST_FIELD map_extra
+#define BPF_MAP_CREATE_LAST_FIELD map_token_fd
 /* called via syscall */
 static int map_create(union bpf_attr *attr)
 {
 	const struct bpf_map_ops *ops;
+	struct bpf_token *token = NULL;
 	int numa_node = bpf_map_attr_numa_node(attr);
 	u32 map_type = attr->map_type;
 	struct bpf_map *map;
@@ -1147,14 +1150,32 @@ static int map_create(union bpf_attr *attr)
 	if (!ops->map_mem_usage)
 		return -EINVAL;
 
+	if (attr->map_token_fd) {
+		token = bpf_token_get_from_fd(attr->map_token_fd);
+		if (IS_ERR(token))
+			return PTR_ERR(token);
+
+		/* if current token doesn't grant map creation permissions,
+		 * then we can't use this token, so ignore it and rely on
+		 * system-wide capabilities checks
+		 */
+		if (!bpf_token_allow_cmd(token, BPF_MAP_CREATE) ||
+		    !bpf_token_allow_map_type(token, attr->map_type)) {
+			bpf_token_put(token);
+			token = NULL;
+		}
+	}
+
+	err = -EPERM;
+
 	/* Intent here is for unprivileged_bpf_disabled to block BPF map
 	 * creation for unprivileged users; other actions depend
 	 * on fd availability and access to bpffs, so are dependent on
 	 * object creation success. Even with unprivileged BPF disabled,
 	 * capability checks are still carried out.
 	 */
-	if (sysctl_unprivileged_bpf_disabled && !bpf_capable())
-		return -EPERM;
+	if (sysctl_unprivileged_bpf_disabled && !bpf_token_capable(token, CAP_BPF))
+		goto put_token;
 
 	/* check privileged map type permissions */
 	switch (map_type) {
@@ -1187,28 +1208,36 @@ static int map_create(union bpf_attr *attr)
 	case BPF_MAP_TYPE_LRU_PERCPU_HASH:
 	case BPF_MAP_TYPE_STRUCT_OPS:
 	case BPF_MAP_TYPE_CPUMAP:
-		if (!bpf_capable())
-			return -EPERM;
+		if (!bpf_token_capable(token, CAP_BPF))
+			goto put_token;
 		break;
 	case BPF_MAP_TYPE_SOCKMAP:
 	case BPF_MAP_TYPE_SOCKHASH:
 	case BPF_MAP_TYPE_DEVMAP:
 	case BPF_MAP_TYPE_DEVMAP_HASH:
 	case BPF_MAP_TYPE_XSKMAP:
-		if (!capable(CAP_NET_ADMIN))
-			return -EPERM;
+		if (!bpf_token_capable(token, CAP_NET_ADMIN))
+			goto put_token;
 		break;
 	default:
 		WARN(1, "unsupported map type %d", map_type);
-		return -EPERM;
+		goto put_token;
 	}
 
 	map = ops->map_alloc(attr);
-	if (IS_ERR(map))
-		return PTR_ERR(map);
+	if (IS_ERR(map)) {
+		err = PTR_ERR(map);
+		goto put_token;
+	}
 	map->ops = ops;
 	map->map_type = map_type;
 
+	if (token) {
+		/* move token reference into map->token, reuse our refcnt */
+		map->token = token;
+		token = NULL;
+	}
+
 	err = bpf_obj_name_cpy(map->name, attr->map_name,
 			       sizeof(attr->map_name));
 	if (err < 0)
@@ -1281,8 +1310,11 @@ static int map_create(union bpf_attr *attr)
 free_map_sec:
 	security_bpf_map_free(map);
 free_map:
+	bpf_token_put(map->token);
 	btf_put(map->btf);
 	map->ops->map_free(map);
+put_token:
+	bpf_token_put(token);
 	return err;
 }
 
@@ -5081,15 +5113,23 @@ static int bpf_prog_bind_map(union bpf_attr *attr)
 	return ret;
 }
 
-#define BPF_TOKEN_FLAGS_MASK (BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS)
-#define BPF_TOKEN_CMDS_MASK ((1ULL << BPF_TOKEN_CREATE))
+#define BPF_TOKEN_FLAGS_MASK (			\
+	BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS		\
+	| BPF_F_TOKEN_IGNORE_UNKNOWN_MAP_TYPES	\
+)
+#define BPF_TOKEN_CMDS_MASK (			\
+	(1ULL << BPF_TOKEN_CREATE)		\
+	| (1ULL << BPF_MAP_CREATE)		\
+)
+#define BPF_TOKEN_MAP_TYPES_MASK \
+	((BIT_ULL(__MAX_BPF_MAP_TYPE) - 1) & ~BIT_ULL(BPF_MAP_TYPE_UNSPEC))
 
-#define BPF_TOKEN_CREATE_LAST_FIELD token_create.allowed_cmds
+#define BPF_TOKEN_CREATE_LAST_FIELD token_create.allowed_map_types
 
 static int token_create(union bpf_attr *attr)
 {
 	struct bpf_token *new_token, *token = NULL;
-	u64 allowed_cmds;
+	u64 allowed_cmds, allowed_map_types;
 	int fd, err;
 
 	if (CHECK_ATTR(BPF_TOKEN_CREATE))
@@ -5117,6 +5157,12 @@ static int token_create(union bpf_attr *attr)
 		err = -ENOTSUPP;
 		goto err_out;
 	}
+	allowed_map_types = attr->token_create.allowed_map_types;
+	if (!(attr->token_create.flags & BPF_F_TOKEN_IGNORE_UNKNOWN_MAP_TYPES) &&
+	    allowed_map_types & ~BPF_TOKEN_MAP_TYPES_MASK) {
+		err = -ENOTSUPP;
+		goto err_out;
+	}
 
 	if (!bpf_token_capable(token, CAP_SYS_ADMIN)) {
 		err = -EPERM;
@@ -5128,6 +5174,11 @@ static int token_create(union bpf_attr *attr)
 		err = -EPERM;
 		goto err_out;
 	}
+	/* requested map types should be a subset of associated token's set */
+	if (token && (token->allowed_map_types & allowed_map_types) != allowed_map_types) {
+		err = -EPERM;
+		goto err_out;
+	}
 
 	new_token = bpf_token_alloc();
 	if (!new_token) {
@@ -5136,6 +5187,7 @@ static int token_create(union bpf_attr *attr)
 	}
 
 	new_token->allowed_cmds = allowed_cmds & BPF_TOKEN_CMDS_MASK;
+	new_token->allowed_map_types = allowed_map_types & BPF_TOKEN_MAP_TYPES_MASK;
 
 	fd = bpf_token_new_fd(new_token);
 	if (fd < 0) {
diff --git a/kernel/bpf/token.c b/kernel/bpf/token.c
index 7e989b25fa06..ef053c48d7db 100644
--- a/kernel/bpf/token.c
+++ b/kernel/bpf/token.c
@@ -116,6 +116,14 @@ bool bpf_token_allow_cmd(const struct bpf_token *token, enum bpf_cmd cmd)
 	return token->allowed_cmds & (1ULL << cmd);
 }
 
+bool bpf_token_allow_map_type(const struct bpf_token *token, enum bpf_map_type type)
+{
+	if (!token || type >= __MAX_BPF_MAP_TYPE)
+		return false;
+
+	return token->allowed_map_types & (1ULL << type);
+}
+
 bool bpf_token_capable(const struct bpf_token *token, int cap)
 {
 	return token || capable(cap) || (cap != CAP_SYS_ADMIN && capable(CAP_SYS_ADMIN));
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index d1d7ca71756f..7cfaa2da84ee 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -954,6 +954,7 @@ enum bpf_map_type {
 	BPF_MAP_TYPE_BLOOM_FILTER,
 	BPF_MAP_TYPE_USER_RINGBUF,
 	BPF_MAP_TYPE_CGRP_STORAGE,
+	__MAX_BPF_MAP_TYPE
 };
 
 /* Note that tracing related programs such as
@@ -1196,6 +1197,10 @@ enum {
 	 * token-enabled.
 	 */
 	BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS		  = 1U << 0,
+	/* Similar to BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS flag, but for
+	 * token_create.allowed_map_types bit set.
+	 */
+	BPF_F_TOKEN_IGNORE_UNKNOWN_MAP_TYPES	  = 1U << 1,
 };
 
 /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
@@ -1377,6 +1382,7 @@ union bpf_attr {
 		 * to using 5 hash functions).
 		 */
 		__u64	map_extra;
+		__u32	map_token_fd;
 	};
 
 	struct { /* anonymous struct used by BPF_MAP_*_ELEM commands */
@@ -1656,9 +1662,17 @@ union bpf_attr {
 		/* a bit set of allowed bpf() syscall commands,
 		 * e.g., (1ULL << BPF_TOKEN_CREATE) | (1ULL << BPF_PROG_LOAD)
 		 * will allow creating derived BPF tokens and loading new BPF
-		 * programs
+		 * programs;
+		 * see also BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS for its effect on
+		 * validity checking of this set
 		 */
 		__u64		allowed_cmds;
+		/* similarly to allowed_cmds, a bit set of BPF map types that
+		 * are allowed to be created by requested BPF token;
+		 * see also BPF_F_TOKEN_IGNORE_UNKNOWN_MAP_TYPES for its
+		 * effect on validity checking of this set
+		 */
+		__u64		allowed_map_types;
 	} token_create;
 
 } __attribute__((aligned(8)));
diff --git a/tools/testing/selftests/bpf/prog_tests/libbpf_str.c b/tools/testing/selftests/bpf/prog_tests/libbpf_str.c
index efb8bd43653c..e677c0435cec 100644
--- a/tools/testing/selftests/bpf/prog_tests/libbpf_str.c
+++ b/tools/testing/selftests/bpf/prog_tests/libbpf_str.c
@@ -132,6 +132,9 @@ static void test_libbpf_bpf_map_type_str(void)
 		const char *map_type_str;
 		char buf[256];
 
+		if (map_type == __MAX_BPF_MAP_TYPE)
+			continue;
+
 		map_type_name = btf__str_by_offset(btf, e->name_off);
 		map_type_str = libbpf_bpf_map_type_str(map_type);
 		ASSERT_OK_PTR(map_type_str, map_type_name);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH RESEND bpf-next 08/18] libbpf: add BPF token support to bpf_map_create() API
  2023-06-02 14:59 [PATCH RESEND bpf-next 00/18] BPF token Andrii Nakryiko
                   ` (6 preceding siblings ...)
  2023-06-02 15:00 ` [PATCH RESEND bpf-next 07/18] bpf: add BPF token support to BPF_MAP_CREATE command Andrii Nakryiko
@ 2023-06-02 15:00 ` Andrii Nakryiko
  2023-06-02 15:00 ` [PATCH RESEND bpf-next 09/18] selftests/bpf: add BPF token-enabled test for BPF_MAP_CREATE command Andrii Nakryiko
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2023-06-02 15:00 UTC (permalink / raw)
  To: bpf; +Cc: linux-security-module, keescook, brauner, lennart, cyphar, luto

Add the ability to provide a token_fd for the BPF_MAP_CREATE command
through the bpf_map_create() API.

Also wire through the token_create.allowed_map_types param for the
BPF_TOKEN_CREATE command.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/lib/bpf/bpf.c | 5 ++++-
 tools/lib/bpf/bpf.h | 7 +++++--
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 38be66719485..0318538d43eb 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -169,7 +169,7 @@ int bpf_map_create(enum bpf_map_type map_type,
 		   __u32 max_entries,
 		   const struct bpf_map_create_opts *opts)
 {
-	const size_t attr_sz = offsetofend(union bpf_attr, map_extra);
+	const size_t attr_sz = offsetofend(union bpf_attr, map_token_fd);
 	union bpf_attr attr;
 	int fd;
 
@@ -198,6 +198,8 @@ int bpf_map_create(enum bpf_map_type map_type,
 	attr.numa_node = OPTS_GET(opts, numa_node, 0);
 	attr.map_ifindex = OPTS_GET(opts, map_ifindex, 0);
 
+	attr.map_token_fd = OPTS_GET(opts, token_fd, 0);
+
 	fd = sys_bpf_fd(BPF_MAP_CREATE, &attr, attr_sz);
 	return libbpf_err_errno(fd);
 }
@@ -1215,6 +1217,7 @@ int bpf_token_create(struct bpf_token_create_opts *opts)
 	attr.token_create.flags = OPTS_GET(opts, flags, 0);
 	attr.token_create.token_fd = OPTS_GET(opts, token_fd, 0);
 	attr.token_create.allowed_cmds = OPTS_GET(opts, allowed_cmds, 0);
+	attr.token_create.allowed_map_types = OPTS_GET(opts, allowed_map_types, 0);
 
 	ret = sys_bpf_fd(BPF_TOKEN_CREATE, &attr, attr_sz);
 	return libbpf_err_errno(ret);
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index f2b8041ca27a..19a43201d1af 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -51,8 +51,10 @@ struct bpf_map_create_opts {
 
 	__u32 numa_node;
 	__u32 map_ifindex;
+
+	__u32 token_fd;
 };
-#define bpf_map_create_opts__last_field map_ifindex
+#define bpf_map_create_opts__last_field token_fd
 
 LIBBPF_API int bpf_map_create(enum bpf_map_type map_type,
 			      const char *map_name,
@@ -556,9 +558,10 @@ struct bpf_token_create_opts {
 	__u32 flags;
 	__u32 token_fd;
 	__u64 allowed_cmds;
+	__u64 allowed_map_types;
 	size_t :0;
 };
-#define bpf_token_create_opts__last_field allowed_cmds
+#define bpf_token_create_opts__last_field allowed_map_types
 
 LIBBPF_API int bpf_token_create(struct bpf_token_create_opts *opts);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH RESEND bpf-next 09/18] selftests/bpf: add BPF token-enabled test for BPF_MAP_CREATE command
  2023-06-02 14:59 [PATCH RESEND bpf-next 00/18] BPF token Andrii Nakryiko
                   ` (7 preceding siblings ...)
  2023-06-02 15:00 ` [PATCH RESEND bpf-next 08/18] libbpf: add BPF token support to bpf_map_create() API Andrii Nakryiko
@ 2023-06-02 15:00 ` Andrii Nakryiko
  2023-06-02 15:00 ` [PATCH RESEND bpf-next 10/18] bpf: add BPF token support to BPF_BTF_LOAD command Andrii Nakryiko
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2023-06-02 15:00 UTC (permalink / raw)
  To: bpf; +Cc: linux-security-module, keescook, brauner, lennart, cyphar, luto

Add a test for creating a BPF token with support for BPF_MAP_CREATE
delegation. Validate that its allowed_map_types filter works as expected
and allows creating privileged BPF maps through the delegated token, as
long as they are allowed by the privileged creator of the token.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 .../testing/selftests/bpf/prog_tests/token.c  | 52 +++++++++++++++++++
 1 file changed, 52 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/token.c b/tools/testing/selftests/bpf/prog_tests/token.c
index fe78b558d697..0b6e699d2439 100644
--- a/tools/testing/selftests/bpf/prog_tests/token.c
+++ b/tools/testing/selftests/bpf/prog_tests/token.c
@@ -88,8 +88,60 @@ static void subtest_token_create(void)
 		ASSERT_OK(restore_priv_caps(old_caps), "restore_caps");
 }
 
+static void subtest_map_token(void)
+{
+	LIBBPF_OPTS(bpf_token_create_opts, token_opts);
+	LIBBPF_OPTS(bpf_map_create_opts, map_opts);
+	int token_fd = 0, map_fd = 0;

+	__u64 old_caps = 0;
+
+	/* check that IGNORE_UNKNOWN_MAP_TYPES flag is respected */
+	token_opts.flags = BPF_F_TOKEN_IGNORE_UNKNOWN_MAP_TYPES;
+	token_opts.allowed_map_types = ~0ULL; /* any current and future map types are allowed */
+	token_fd = bpf_token_create(&token_opts);
+	if (!ASSERT_GT(token_fd, 0, "token_create_future_proof"))
+		return;
+	close(token_fd);
+
+	/* create BPF token allowing STACK, but not QUEUE map */
+	token_opts.flags = 0;
+	token_opts.allowed_cmds = 1ULL << BPF_MAP_CREATE;
+	token_opts.allowed_map_types = 1ULL << BPF_MAP_TYPE_STACK; /* but not QUEUE */
+	token_fd = bpf_token_create(&token_opts);
+	if (!ASSERT_GT(token_fd, 0, "token_create"))
+		return;
+
+	/* drop privileges to test token_fd passing */
+	if (!ASSERT_OK(drop_priv_caps(&old_caps), "drop_caps"))
+		goto cleanup;
+
+	/* BPF_MAP_TYPE_STACK is privileged, but with given token_fd should succeed */
+	map_opts.token_fd = token_fd;
+	map_fd = bpf_map_create(BPF_MAP_TYPE_STACK, "token_stack", 0, 8, 1, &map_opts);
+	if (!ASSERT_GT(map_fd, 0, "stack_map_fd"))
+		goto cleanup;
+	close(map_fd);
+	map_fd = 0;
+
+	/* BPF_MAP_TYPE_QUEUE is privileged, and token doesn't allow it, so should fail */
+	map_opts.token_fd = token_fd;
+	map_fd = bpf_map_create(BPF_MAP_TYPE_QUEUE, "token_queue", 0, 8, 1, &map_opts);
+	if (!ASSERT_EQ(map_fd, -EPERM, "queue_map_fd"))
+		goto cleanup;
+
+cleanup:
+	if (map_fd > 0)
+		close(map_fd);
+	if (token_fd)
+		close(token_fd);
+	if (old_caps)
+		ASSERT_OK(restore_priv_caps(old_caps), "restore_caps");
+}
+
 void test_token(void)
 {
 	if (test__start_subtest("token_create"))
 		subtest_token_create();
+	if (test__start_subtest("map_token"))
+		subtest_map_token();
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH RESEND bpf-next 10/18] bpf: add BPF token support to BPF_BTF_LOAD command
  2023-06-02 14:59 [PATCH RESEND bpf-next 00/18] BPF token Andrii Nakryiko
                   ` (8 preceding siblings ...)
  2023-06-02 15:00 ` [PATCH RESEND bpf-next 09/18] selftests/bpf: add BPF token-enabled test for BPF_MAP_CREATE command Andrii Nakryiko
@ 2023-06-02 15:00 ` Andrii Nakryiko
  2023-06-02 15:00 ` [PATCH RESEND bpf-next 11/18] libbpf: add BPF token support to bpf_btf_load() API Andrii Nakryiko
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2023-06-02 15:00 UTC (permalink / raw)
  To: bpf; +Cc: linux-security-module, keescook, brauner, lennart, cyphar, luto

Accept a BPF token FD in the BPF_BTF_LOAD command to allow BTF data
loading through a delegated BPF token. BTF loading is a pretty
straightforward operation, so as long as the BPF token is created with
allowed_cmds granting the BPF_BTF_LOAD command, the kernel proceeds to
parsing BTF data and creating a BTF object.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 include/uapi/linux/bpf.h                      |  1 +
 kernel/bpf/syscall.c                          | 21 +++++++++++++++++--
 tools/include/uapi/linux/bpf.h                |  1 +
 .../selftests/bpf/prog_tests/libbpf_probes.c  |  2 ++
 4 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 7cfaa2da84ee..d30fb567d22a 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1549,6 +1549,7 @@ union bpf_attr {
 		 * truncated), or smaller (if log buffer wasn't filled completely).
 		 */
 		__u32		btf_log_true_size;
+		__u32		btf_token_fd;
 	};
 
 	struct {
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index eb77ba71fbcf..05e941e9bbe6 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -4475,15 +4475,31 @@ static int bpf_obj_get_info_by_fd(const union bpf_attr *attr,
 	return err;
 }
 
-#define BPF_BTF_LOAD_LAST_FIELD btf_log_true_size
+#define BPF_BTF_LOAD_LAST_FIELD btf_token_fd
 
 static int bpf_btf_load(const union bpf_attr *attr, bpfptr_t uattr, __u32 uattr_size)
 {
+	struct bpf_token *token = NULL;
+
 	if (CHECK_ATTR(BPF_BTF_LOAD))
 		return -EINVAL;
 
-	if (!bpf_capable())
+	if (attr->btf_token_fd) {
+		token = bpf_token_get_from_fd(attr->btf_token_fd);
+		if (IS_ERR(token))
+			return PTR_ERR(token);
+		if (!bpf_token_allow_cmd(token, BPF_BTF_LOAD)) {
+			bpf_token_put(token);
+			token = NULL;
+		}
+	}
+
+	if (!bpf_token_capable(token, CAP_BPF)) {
+		bpf_token_put(token);
 		return -EPERM;
+	}
+
+	bpf_token_put(token);
 
 	return btf_new_fd(attr, uattr, uattr_size);
 }
@@ -5120,6 +5136,7 @@ static int bpf_prog_bind_map(union bpf_attr *attr)
 #define BPF_TOKEN_CMDS_MASK (			\
 	(1ULL << BPF_TOKEN_CREATE)		\
 	| (1ULL << BPF_MAP_CREATE)		\
+	| (1ULL << BPF_BTF_LOAD)		\
 )
 #define BPF_TOKEN_MAP_TYPES_MASK \
 	((BIT_ULL(__MAX_BPF_MAP_TYPE) - 1) & ~BIT_ULL(BPF_MAP_TYPE_UNSPEC))
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 7cfaa2da84ee..d30fb567d22a 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1549,6 +1549,7 @@ union bpf_attr {
 		 * truncated), or smaller (if log buffer wasn't filled completely).
 		 */
 		__u32		btf_log_true_size;
+		__u32		btf_token_fd;
 	};
 
 	struct {
diff --git a/tools/testing/selftests/bpf/prog_tests/libbpf_probes.c b/tools/testing/selftests/bpf/prog_tests/libbpf_probes.c
index 9f766ddd946a..573249a2814d 100644
--- a/tools/testing/selftests/bpf/prog_tests/libbpf_probes.c
+++ b/tools/testing/selftests/bpf/prog_tests/libbpf_probes.c
@@ -68,6 +68,8 @@ void test_libbpf_probe_map_types(void)
 
 		if (map_type == BPF_MAP_TYPE_UNSPEC)
 			continue;
+		if (strcmp(map_type_name, "__MAX_BPF_MAP_TYPE") == 0)
+			continue;
 
 		if (!test__start_subtest(map_type_name))
 			continue;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH RESEND bpf-next 11/18] libbpf: add BPF token support to bpf_btf_load() API
  2023-06-02 14:59 [PATCH RESEND bpf-next 00/18] BPF token Andrii Nakryiko
                   ` (9 preceding siblings ...)
  2023-06-02 15:00 ` [PATCH RESEND bpf-next 10/18] bpf: add BPF token support to BPF_BTF_LOAD command Andrii Nakryiko
@ 2023-06-02 15:00 ` Andrii Nakryiko
  2023-06-02 15:00 ` [PATCH RESEND bpf-next 12/18] selftests/bpf: add BPF token-enabled BPF_BTF_LOAD selftest Andrii Nakryiko
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2023-06-02 15:00 UTC (permalink / raw)
  To: bpf; +Cc: linux-security-module, keescook, brauner, lennart, cyphar, luto

Allow the user to specify a token_fd for the bpf_btf_load() API, which
wraps the kernel's BPF_BTF_LOAD command. This allows loading BTF from an
unprivileged process, as long as it has a BPF token allowing the
BPF_BTF_LOAD command; such a token can be created and delegated by a
privileged process.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/lib/bpf/bpf.c | 4 +++-
 tools/lib/bpf/bpf.h | 3 ++-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 0318538d43eb..193993dbbdc4 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -1098,7 +1098,7 @@ int bpf_raw_tracepoint_open(const char *name, int prog_fd)
 
 int bpf_btf_load(const void *btf_data, size_t btf_size, struct bpf_btf_load_opts *opts)
 {
-	const size_t attr_sz = offsetofend(union bpf_attr, btf_log_true_size);
+	const size_t attr_sz = offsetofend(union bpf_attr, btf_token_fd);
 	union bpf_attr attr;
 	char *log_buf;
 	size_t log_size;
@@ -1123,6 +1123,8 @@ int bpf_btf_load(const void *btf_data, size_t btf_size, struct bpf_btf_load_opts
 
 	attr.btf = ptr_to_u64(btf_data);
 	attr.btf_size = btf_size;
+	attr.btf_token_fd = OPTS_GET(opts, token_fd, 0);
+
 	/* log_level == 0 and log_buf != NULL means "try loading without
 	 * log_buf, but retry with log_buf and log_level=1 on error", which is
 	 * consistent across low-level and high-level BTF and program loading
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index 19a43201d1af..3153a9e697e2 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -132,9 +132,10 @@ struct bpf_btf_load_opts {
 	 * If kernel doesn't support this feature, log_size is left unchanged.
 	 */
 	__u32 log_true_size;
+	__u32 token_fd;
 	size_t :0;
 };
-#define bpf_btf_load_opts__last_field log_true_size
+#define bpf_btf_load_opts__last_field token_fd
 
 LIBBPF_API int bpf_btf_load(const void *btf_data, size_t btf_size,
 			    struct bpf_btf_load_opts *opts);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH RESEND bpf-next 12/18] selftests/bpf: add BPF token-enabled BPF_BTF_LOAD selftest
  2023-06-02 14:59 [PATCH RESEND bpf-next 00/18] BPF token Andrii Nakryiko
                   ` (10 preceding siblings ...)
  2023-06-02 15:00 ` [PATCH RESEND bpf-next 11/18] libbpf: add BPF token support to bpf_btf_load() API Andrii Nakryiko
@ 2023-06-02 15:00 ` Andrii Nakryiko
  2023-06-02 15:00 ` [PATCH RESEND bpf-next 13/18] bpf: keep BPF_PROG_LOAD permission checks clear of validations Andrii Nakryiko
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2023-06-02 15:00 UTC (permalink / raw)
  To: bpf; +Cc: linux-security-module, keescook, brauner, lennart, cyphar, luto

Add a simple test validating that BTF loading can be done from an
unprivileged process through a delegated BPF token.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 .../testing/selftests/bpf/prog_tests/token.c  | 55 +++++++++++++++++++
 1 file changed, 55 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/token.c b/tools/testing/selftests/bpf/prog_tests/token.c
index 0b6e699d2439..b141f722c0c6 100644
--- a/tools/testing/selftests/bpf/prog_tests/token.c
+++ b/tools/testing/selftests/bpf/prog_tests/token.c
@@ -138,10 +138,65 @@ static void subtest_map_token(void)
 		ASSERT_OK(restore_priv_caps(old_caps), "restore_caps");
 }
 
+static void subtest_btf_token(void)
+{
+	LIBBPF_OPTS(bpf_token_create_opts, token_opts);
+	LIBBPF_OPTS(bpf_btf_load_opts, btf_opts);
+	int token_fd = 0, btf_fd = 0;
+	const void *raw_btf_data;
+	struct btf *btf = NULL;
+	__u32 raw_btf_size;
+	__u64 old_caps = 0;
+
+	/* create BPF token allowing BPF_BTF_LOAD command */
+	token_opts.allowed_cmds = 1ULL << BPF_BTF_LOAD;
+	token_fd = bpf_token_create(&token_opts);
+	if (!ASSERT_GT(token_fd, 0, "token_create"))
+		return;
+
+	/* drop privileges to test token_fd passing */
+	if (!ASSERT_OK(drop_priv_caps(&old_caps), "drop_caps"))
+		goto cleanup;
+
+	btf = btf__new_empty();
+	if (!ASSERT_OK_PTR(btf, "empty_btf"))
+		goto cleanup;
+
+	ASSERT_GT(btf__add_int(btf, "int", 4, 0), 0, "int_type");
+
+	raw_btf_data = btf__raw_data(btf, &raw_btf_size);
+	if (!ASSERT_OK_PTR(raw_btf_data, "raw_btf_data"))
+		goto cleanup;
+
+	/* validate we can successfully load new BTF with token */
+	btf_opts.token_fd = token_fd;
+	btf_fd = bpf_btf_load(raw_btf_data, raw_btf_size, &btf_opts);
+	if (!ASSERT_GT(btf_fd, 0, "btf_fd"))
+		goto cleanup;
+	close(btf_fd);
+
+	/* now validate that we *cannot* load BTF without token */
+	btf_opts.token_fd = 0;
+	btf_fd = bpf_btf_load(raw_btf_data, raw_btf_size, &btf_opts);
+	if (!ASSERT_EQ(btf_fd, -EPERM, "btf_fd_eperm"))
+		goto cleanup;
+
+cleanup:
+	btf__free(btf);
+	if (btf_fd > 0)
+		close(btf_fd);
+	if (token_fd)
+		close(token_fd);
+	if (old_caps)
+		ASSERT_OK(restore_priv_caps(old_caps), "restore_caps");
+}
+
 void test_token(void)
 {
 	if (test__start_subtest("token_create"))
 		subtest_token_create();
 	if (test__start_subtest("map_token"))
 		subtest_map_token();
+	if (test__start_subtest("btf_token"))
+		subtest_btf_token();
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH RESEND bpf-next 13/18] bpf: keep BPF_PROG_LOAD permission checks clear of validations
  2023-06-02 14:59 [PATCH RESEND bpf-next 00/18] BPF token Andrii Nakryiko
                   ` (11 preceding siblings ...)
  2023-06-02 15:00 ` [PATCH RESEND bpf-next 12/18] selftests/bpf: add BPF token-enabled BPF_BTF_LOAD selftest Andrii Nakryiko
@ 2023-06-02 15:00 ` Andrii Nakryiko
  2023-06-02 15:00 ` [PATCH RESEND bpf-next 14/18] bpf: add BPF token support to BPF_PROG_LOAD command Andrii Nakryiko
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2023-06-02 15:00 UTC (permalink / raw)
  To: bpf; +Cc: linux-security-module, keescook, brauner, lennart, cyphar, luto

Move flags validation and license checks out of the permission checks.
They were intermingled, which makes subsequent changes harder. Clean this
up: perform straightforward flag validation upfront, and fetch and check
the license later, right where it is used. Also consolidate the
capability checks in one block, right after basic attribute sanity
checks.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 kernel/bpf/syscall.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 05e941e9bbe6..b0a85ac9a42f 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -2582,7 +2582,6 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
 	struct btf *attach_btf = NULL;
 	int err;
 	char license[128];
-	bool is_gpl;
 
 	if (CHECK_ATTR(BPF_PROG_LOAD))
 		return -EINVAL;
@@ -2601,16 +2600,6 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
 	    !bpf_capable())
 		return -EPERM;
 
-	/* copy eBPF program license from user space */
-	if (strncpy_from_bpfptr(license,
-				make_bpfptr(attr->license, uattr.is_kernel),
-				sizeof(license) - 1) < 0)
-		return -EFAULT;
-	license[sizeof(license) - 1] = 0;
-
-	/* eBPF programs must be GPL compatible to use GPL-ed functions */
-	is_gpl = license_is_gpl_compatible(license);
-
 	/* Intent here is for unprivileged_bpf_disabled to block BPF program
 	 * creation for unprivileged users; other actions depend
 	 * on fd availability and access to bpffs, so are dependent on
@@ -2703,12 +2692,20 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
 			     make_bpfptr(attr->insns, uattr.is_kernel),
 			     bpf_prog_insn_size(prog)) != 0)
 		goto free_prog_sec;
+	/* copy eBPF program license from user space */
+	if (strncpy_from_bpfptr(license,
+				make_bpfptr(attr->license, uattr.is_kernel),
+				sizeof(license) - 1) < 0)
+		goto free_prog_sec;
+	license[sizeof(license) - 1] = 0;
+
+	/* eBPF programs must be GPL compatible to use GPL-ed functions */
+	prog->gpl_compatible = license_is_gpl_compatible(license) ? 1 : 0;
 
 	prog->orig_prog = NULL;
 	prog->jited = 0;
 
 	atomic64_set(&prog->aux->refcnt, 1);
-	prog->gpl_compatible = is_gpl ? 1 : 0;
 
 	if (bpf_prog_is_dev_bound(prog->aux)) {
 		err = bpf_prog_dev_bound_init(prog, attr);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH RESEND bpf-next 14/18] bpf: add BPF token support to BPF_PROG_LOAD command
  2023-06-02 14:59 [PATCH RESEND bpf-next 00/18] BPF token Andrii Nakryiko
                   ` (12 preceding siblings ...)
  2023-06-02 15:00 ` [PATCH RESEND bpf-next 13/18] bpf: keep BPF_PROG_LOAD permission checks clear of validations Andrii Nakryiko
@ 2023-06-02 15:00 ` Andrii Nakryiko
  2023-06-02 15:00 ` [PATCH RESEND bpf-next 15/18] bpf: take into account BPF token when fetching helper protos Andrii Nakryiko
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2023-06-02 15:00 UTC (permalink / raw)
  To: bpf; +Cc: linux-security-module, keescook, brauner, lennart, cyphar, luto

Add basic support for BPF token to BPF_PROG_LOAD. Extend BPF token to
allow specifying BPF_PROG_LOAD as an allowed command, and also allow
specifying bit sets of program type and attach type combinations that
are allowed to be loaded with the requested BPF token.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 include/linux/bpf.h                           |   6 +
 include/uapi/linux/bpf.h                      |  19 +++
 kernel/bpf/core.c                             |   1 +
 kernel/bpf/syscall.c                          | 118 ++++++++++++++----
 kernel/bpf/token.c                            |  11 ++
 tools/include/uapi/linux/bpf.h                |  19 +++
 .../selftests/bpf/prog_tests/libbpf_probes.c  |   2 +
 .../selftests/bpf/prog_tests/libbpf_str.c     |   3 +
 8 files changed, 154 insertions(+), 25 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 657bec546351..320d93c542ed 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1411,6 +1411,7 @@ struct bpf_prog_aux {
 #ifdef CONFIG_SECURITY
 	void *security;
 #endif
+	struct bpf_token *token;
 	struct bpf_prog_offload *offload;
 	struct btf *btf;
 	struct bpf_func_info *func_info;
@@ -1540,6 +1541,8 @@ struct bpf_token {
 	atomic64_t refcnt;
 	u64 allowed_cmds;
 	u64 allowed_map_types;
+	u64 allowed_prog_types;
+	u64 allowed_attach_types;
 };
 
 struct bpf_struct_ops_value;
@@ -2095,6 +2098,9 @@ struct bpf_token *bpf_token_get_from_fd(u32 ufd);
 bool bpf_token_capable(const struct bpf_token *token, int cap);
 bool bpf_token_allow_cmd(const struct bpf_token *token, enum bpf_cmd cmd);
 bool bpf_token_allow_map_type(const struct bpf_token *token, enum bpf_map_type type);
+bool bpf_token_allow_prog_type(const struct bpf_token *token,
+			       enum bpf_prog_type prog_type,
+			       enum bpf_attach_type attach_type);
 
 int bpf_obj_pin_user(u32 ufd, int path_fd, const char __user *pathname);
 int bpf_obj_get_user(int path_fd, const char __user *pathname, int flags);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index d30fb567d22a..c2867e622c30 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -999,6 +999,7 @@ enum bpf_prog_type {
 	BPF_PROG_TYPE_SK_LOOKUP,
 	BPF_PROG_TYPE_SYSCALL, /* a program that can execute syscalls */
 	BPF_PROG_TYPE_NETFILTER,
+	__MAX_BPF_PROG_TYPE
 };
 
 enum bpf_attach_type {
@@ -1201,6 +1202,14 @@ enum {
 	 * token_create.allowed_map_types bit set.
 	 */
 	BPF_F_TOKEN_IGNORE_UNKNOWN_MAP_TYPES	  = 1U << 1,
+	/* Similar to BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS flag, but for
+	 * token_create.allowed_prog_types bit set.
+	 */
+	BPF_F_TOKEN_IGNORE_UNKNOWN_PROG_TYPES	  = 1U << 2,
+	/* Similar to BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS flag, but for
+	 * token_create.allowed_attach_types bit set.
+	 */
+	BPF_F_TOKEN_IGNORE_UNKNOWN_ATTACH_TYPES	  = 1U << 3,
 };
 
 /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
@@ -1452,6 +1461,7 @@ union bpf_attr {
 		 * truncated), or smaller (if log buffer wasn't filled completely).
 		 */
 		__u32		log_true_size;
+		__u32		prog_token_fd;
 	};
 
 	struct { /* anonymous struct used by BPF_OBJ_* commands */
@@ -1674,6 +1684,15 @@ union bpf_attr {
 		 * effect on validity checking of this set
 		 */
 		__u64		allowed_map_types;
+		/* similarly to allowed_map_types, bit sets of BPF program
+		 * types and BPF program attach types that are allowed to be
+		 * loaded by requested BPF token;
+		 * see also BPF_F_TOKEN_IGNORE_UNKNOWN_PROG_TYPES and
+		 * BPF_F_TOKEN_IGNORE_UNKNOWN_ATTACH_TYPES for their
+		 * effect on validity checking of these sets
+		 */
+		__u64		allowed_prog_types;
+		__u64		allowed_attach_types;
 	} token_create;
 
 } __attribute__((aligned(8)));
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 7421487422d4..cd0a93968009 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2597,6 +2597,7 @@ void bpf_prog_free(struct bpf_prog *fp)
 
 	if (aux->dst_prog)
 		bpf_prog_put(aux->dst_prog);
+	bpf_token_put(aux->token);
 	INIT_WORK(&aux->work, bpf_prog_free_deferred);
 	schedule_work(&aux->work);
 }
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index b0a85ac9a42f..e02688bebf8e 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -2573,13 +2573,15 @@ static bool is_perfmon_prog_type(enum bpf_prog_type prog_type)
 }
 
 /* last field in 'union bpf_attr' used by this command */
-#define	BPF_PROG_LOAD_LAST_FIELD log_true_size
+#define BPF_PROG_LOAD_LAST_FIELD prog_token_fd
 
 static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
 {
 	enum bpf_prog_type type = attr->prog_type;
 	struct bpf_prog *prog, *dst_prog = NULL;
 	struct btf *attach_btf = NULL;
+	struct bpf_token *token = NULL;
+	bool bpf_cap;
 	int err;
 	char license[128];
 
@@ -2595,10 +2597,31 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
 				 BPF_F_XDP_DEV_BOUND_ONLY))
 		return -EINVAL;
 
+	bpf_prog_load_fixup_attach_type(attr);
+
+	if (attr->prog_token_fd) {
+		token = bpf_token_get_from_fd(attr->prog_token_fd);
+		if (IS_ERR(token))
+			return PTR_ERR(token);
+		/* if current token doesn't grant prog loading permissions,
+		 * then we can't use this token, so ignore it and rely on
+		 * system-wide capabilities checks
+		 */
+		if (!bpf_token_allow_cmd(token, BPF_PROG_LOAD) ||
+		    !bpf_token_allow_prog_type(token, attr->prog_type,
+					       attr->expected_attach_type)) {
+			bpf_token_put(token);
+			token = NULL;
+		}
+	}
+
+	bpf_cap = bpf_token_capable(token, CAP_BPF);
+	err = -EPERM;
+
 	if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) &&
 	    (attr->prog_flags & BPF_F_ANY_ALIGNMENT) &&
-	    !bpf_capable())
-		return -EPERM;
+	    !bpf_cap)
+		goto put_token;
 
 	/* Intent here is for unprivileged_bpf_disabled to block BPF program
 	 * creation for unprivileged users; other actions depend
@@ -2607,21 +2630,23 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
 	 * capability checks are still carried out for these
 	 * and other operations.
 	 */
-	if (sysctl_unprivileged_bpf_disabled && !bpf_capable())
-		return -EPERM;
+	if (sysctl_unprivileged_bpf_disabled && !bpf_cap)
+		goto put_token;
 
 	if (attr->insn_cnt == 0 ||
-	    attr->insn_cnt > (bpf_capable() ? BPF_COMPLEXITY_LIMIT_INSNS : BPF_MAXINSNS))
-		return -E2BIG;
+	    attr->insn_cnt > (bpf_cap ? BPF_COMPLEXITY_LIMIT_INSNS : BPF_MAXINSNS)) {
+		err = -E2BIG;
+		goto put_token;
+	}
 	if (type != BPF_PROG_TYPE_SOCKET_FILTER &&
 	    type != BPF_PROG_TYPE_CGROUP_SKB &&
-	    !bpf_capable())
-		return -EPERM;
+	    !bpf_cap)
+		goto put_token;
 
-	if (is_net_admin_prog_type(type) && !capable(CAP_NET_ADMIN) && !capable(CAP_SYS_ADMIN))
-		return -EPERM;
-	if (is_perfmon_prog_type(type) && !perfmon_capable())
-		return -EPERM;
+	if (is_net_admin_prog_type(type) && !bpf_token_capable(token, CAP_NET_ADMIN))
+		goto put_token;
+	if (is_perfmon_prog_type(type) && !bpf_token_capable(token, CAP_PERFMON))
+		goto put_token;
 
 	/* attach_prog_fd/attach_btf_obj_fd can specify fd of either bpf_prog
 	 * or btf, we need to check which one it is
@@ -2631,27 +2656,33 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
 		if (IS_ERR(dst_prog)) {
 			dst_prog = NULL;
 			attach_btf = btf_get_by_fd(attr->attach_btf_obj_fd);
-			if (IS_ERR(attach_btf))
-				return -EINVAL;
+			if (IS_ERR(attach_btf)) {
+				err = -EINVAL;
+				goto put_token;
+			}
 			if (!btf_is_kernel(attach_btf)) {
 				/* attaching through specifying bpf_prog's BTF
 				 * objects directly might be supported eventually
 				 */
 				btf_put(attach_btf);
-				return -ENOTSUPP;
+				err = -ENOTSUPP;
+				goto put_token;
 			}
 		}
 	} else if (attr->attach_btf_id) {
 		/* fall back to vmlinux BTF, if BTF type ID is specified */
 		attach_btf = bpf_get_btf_vmlinux();
-		if (IS_ERR(attach_btf))
-			return PTR_ERR(attach_btf);
-		if (!attach_btf)
-			return -EINVAL;
+		if (IS_ERR(attach_btf)) {
+			err = PTR_ERR(attach_btf);
+			goto put_token;
+		}
+		if (!attach_btf) {
+			err = -EINVAL;
+			goto put_token;
+		}
 		btf_get(attach_btf);
 	}
 
-	bpf_prog_load_fixup_attach_type(attr);
 	if (bpf_prog_load_check_attach(type, attr->expected_attach_type,
 				       attach_btf, attr->attach_btf_id,
 				       dst_prog)) {
@@ -2659,7 +2690,8 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
 			bpf_prog_put(dst_prog);
 		if (attach_btf)
 			btf_put(attach_btf);
-		return -EINVAL;
+		err = -EINVAL;
+		goto put_token;
 	}
 
 	/* plain bpf_prog allocation */
@@ -2669,7 +2701,8 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
 			bpf_prog_put(dst_prog);
 		if (attach_btf)
 			btf_put(attach_btf);
-		return -ENOMEM;
+		err = -ENOMEM;
+		goto put_token;
 	}
 
 	prog->expected_attach_type = attr->expected_attach_type;
@@ -2680,6 +2713,10 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
 	prog->aux->sleepable = attr->prog_flags & BPF_F_SLEEPABLE;
 	prog->aux->xdp_has_frags = attr->prog_flags & BPF_F_XDP_HAS_FRAGS;
 
+	/* move token into prog->aux, reuse taken refcnt */
+	prog->aux->token = token;
+	token = NULL;
+
 	err = security_bpf_prog_alloc(prog->aux);
 	if (err)
 		goto free_prog;
@@ -2781,6 +2818,8 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
 	if (prog->aux->attach_btf)
 		btf_put(prog->aux->attach_btf);
 	bpf_prog_free(prog);
+put_token:
+	bpf_token_put(token);
 	return err;
 }
 
@@ -3537,7 +3576,7 @@ static int bpf_prog_attach_check_attach_type(const struct bpf_prog *prog,
 	case BPF_PROG_TYPE_SK_LOOKUP:
 		return attach_type == prog->expected_attach_type ? 0 : -EINVAL;
 	case BPF_PROG_TYPE_CGROUP_SKB:
-		if (!capable(CAP_NET_ADMIN))
+		if (!bpf_token_capable(prog->aux->token, CAP_NET_ADMIN))
 			/* cg-skb progs can be loaded by unpriv user.
 			 * check permissions at attach time.
 			 */
@@ -5129,21 +5168,29 @@ static int bpf_prog_bind_map(union bpf_attr *attr)
 #define BPF_TOKEN_FLAGS_MASK (			\
 	BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS		\
 	| BPF_F_TOKEN_IGNORE_UNKNOWN_MAP_TYPES	\
+	| BPF_F_TOKEN_IGNORE_UNKNOWN_PROG_TYPES	\
+	| BPF_F_TOKEN_IGNORE_UNKNOWN_ATTACH_TYPES	\
 )
 #define BPF_TOKEN_CMDS_MASK (			\
 	(1ULL << BPF_TOKEN_CREATE)		\
 	| (1ULL << BPF_MAP_CREATE)		\
 	| (1ULL << BPF_BTF_LOAD)		\
+	| (1ULL << BPF_PROG_LOAD)		\
 )
 #define BPF_TOKEN_MAP_TYPES_MASK \
 	((BIT_ULL(__MAX_BPF_MAP_TYPE) - 1) & ~BIT_ULL(BPF_MAP_TYPE_UNSPEC))
+#define BPF_TOKEN_PROG_TYPES_MASK \
+	((BIT_ULL(__MAX_BPF_PROG_TYPE) - 1) & ~BIT_ULL(BPF_PROG_TYPE_UNSPEC))
+#define BPF_TOKEN_ATTACH_TYPES_MASK \
+	(BIT_ULL(__MAX_BPF_ATTACH_TYPE) - 1)
 
-#define BPF_TOKEN_CREATE_LAST_FIELD token_create.allowed_map_types
+#define BPF_TOKEN_CREATE_LAST_FIELD token_create.allowed_attach_types
 
 static int token_create(union bpf_attr *attr)
 {
 	struct bpf_token *new_token, *token = NULL;
 	u64 allowed_cmds, allowed_map_types;
+	u64 allowed_prog_types, allowed_attach_types;
 	int fd, err;
 
 	if (CHECK_ATTR(BPF_TOKEN_CREATE))
@@ -5177,6 +5224,18 @@ static int token_create(union bpf_attr *attr)
 		err = -ENOTSUPP;
 		goto err_out;
 	}
+	allowed_prog_types = attr->token_create.allowed_prog_types;
+	if (!(attr->token_create.flags & BPF_F_TOKEN_IGNORE_UNKNOWN_PROG_TYPES) &&
+	    allowed_prog_types & ~BPF_TOKEN_PROG_TYPES_MASK) {
+		err = -ENOTSUPP;
+		goto err_out;
+	}
+	allowed_attach_types = attr->token_create.allowed_attach_types;
+	if (!(attr->token_create.flags & BPF_F_TOKEN_IGNORE_UNKNOWN_ATTACH_TYPES) &&
+	    allowed_attach_types & ~BPF_TOKEN_ATTACH_TYPES_MASK) {
+		err = -ENOTSUPP;
+		goto err_out;
+	}
 
 	if (!bpf_token_capable(token, CAP_SYS_ADMIN)) {
 		err = -EPERM;
@@ -5193,6 +5252,13 @@ static int token_create(union bpf_attr *attr)
 		err = -EPERM;
 		goto err_out;
 	}
+	/* requested prog/attach types should be a subset of associated token's set */
+	if (token &&
+	    (((token->allowed_prog_types & allowed_prog_types) != allowed_prog_types) ||
+	    ((token->allowed_attach_types & allowed_attach_types) != allowed_attach_types))) {
+		err = -EPERM;
+		goto err_out;
+	}
 
 	new_token = bpf_token_alloc();
 	if (!new_token) {
@@ -5202,6 +5268,8 @@ static int token_create(union bpf_attr *attr)
 
 	new_token->allowed_cmds = allowed_cmds & BPF_TOKEN_CMDS_MASK;
 	new_token->allowed_map_types = allowed_map_types & BPF_TOKEN_MAP_TYPES_MASK;
+	new_token->allowed_prog_types = allowed_prog_types & BPF_TOKEN_PROG_TYPES_MASK;
+	new_token->allowed_attach_types = allowed_attach_types & BPF_TOKEN_ATTACH_TYPES_MASK;
 
 	fd = bpf_token_new_fd(new_token);
 	if (fd < 0) {
diff --git a/kernel/bpf/token.c b/kernel/bpf/token.c
index ef053c48d7db..e9f651ba07da 100644
--- a/kernel/bpf/token.c
+++ b/kernel/bpf/token.c
@@ -124,6 +124,17 @@ bool bpf_token_allow_map_type(const struct bpf_token *token, enum bpf_map_type t
 	return token->allowed_map_types & (1ULL << type);
 }
 
+bool bpf_token_allow_prog_type(const struct bpf_token *token,
+			       enum bpf_prog_type prog_type,
+			       enum bpf_attach_type attach_type)
+{
+	if (!token || prog_type >= __MAX_BPF_PROG_TYPE || attach_type >= __MAX_BPF_ATTACH_TYPE)
+		return false;
+
+	return (token->allowed_prog_types & (1ULL << prog_type)) &&
+	       (token->allowed_attach_types & (1ULL << attach_type));
+}
+
 bool bpf_token_capable(const struct bpf_token *token, int cap)
 {
 	return token || capable(cap) || (cap != CAP_SYS_ADMIN && capable(CAP_SYS_ADMIN));
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index d30fb567d22a..c2867e622c30 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -999,6 +999,7 @@ enum bpf_prog_type {
 	BPF_PROG_TYPE_SK_LOOKUP,
 	BPF_PROG_TYPE_SYSCALL, /* a program that can execute syscalls */
 	BPF_PROG_TYPE_NETFILTER,
+	__MAX_BPF_PROG_TYPE
 };
 
 enum bpf_attach_type {
@@ -1201,6 +1202,14 @@ enum {
 	 * token_create.allowed_map_types bit set.
 	 */
 	BPF_F_TOKEN_IGNORE_UNKNOWN_MAP_TYPES	  = 1U << 1,
+	/* Similar to BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS flag, but for
+	 * token_create.allowed_prog_types bit set.
+	 */
+	BPF_F_TOKEN_IGNORE_UNKNOWN_PROG_TYPES	  = 1U << 2,
+	/* Similar to BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS flag, but for
+	 * token_create.allowed_attach_types bit set.
+	 */
+	BPF_F_TOKEN_IGNORE_UNKNOWN_ATTACH_TYPES	  = 1U << 3,
 };
 
 /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
@@ -1452,6 +1461,7 @@ union bpf_attr {
 		 * truncated), or smaller (if log buffer wasn't filled completely).
 		 */
 		__u32		log_true_size;
+		__u32		prog_token_fd;
 	};
 
 	struct { /* anonymous struct used by BPF_OBJ_* commands */
@@ -1674,6 +1684,15 @@ union bpf_attr {
 		 * effect on validity checking of this set
 		 */
 		__u64		allowed_map_types;
+		/* similarly to allowed_map_types, bit sets of BPF program
+		 * types and BPF program attach types that are allowed to be
+		 * loaded by requested BPF token;
+		 * see also BPF_F_TOKEN_IGNORE_UNKNOWN_PROG_TYPES and
+		 * BPF_F_TOKEN_IGNORE_UNKNOWN_ATTACH_TYPES for their
+		 * effect on validity checking of these sets
+		 */
+		__u64		allowed_prog_types;
+		__u64		allowed_attach_types;
 	} token_create;
 
 } __attribute__((aligned(8)));
diff --git a/tools/testing/selftests/bpf/prog_tests/libbpf_probes.c b/tools/testing/selftests/bpf/prog_tests/libbpf_probes.c
index 573249a2814d..4ed46ed58a7b 100644
--- a/tools/testing/selftests/bpf/prog_tests/libbpf_probes.c
+++ b/tools/testing/selftests/bpf/prog_tests/libbpf_probes.c
@@ -30,6 +30,8 @@ void test_libbpf_probe_prog_types(void)
 
 		if (prog_type == BPF_PROG_TYPE_UNSPEC)
 			continue;
+		if (strcmp(prog_type_name, "__MAX_BPF_PROG_TYPE") == 0)
+			continue;
 
 		if (!test__start_subtest(prog_type_name))
 			continue;
diff --git a/tools/testing/selftests/bpf/prog_tests/libbpf_str.c b/tools/testing/selftests/bpf/prog_tests/libbpf_str.c
index e677c0435cec..ea2a8c4063a8 100644
--- a/tools/testing/selftests/bpf/prog_tests/libbpf_str.c
+++ b/tools/testing/selftests/bpf/prog_tests/libbpf_str.c
@@ -185,6 +185,9 @@ static void test_libbpf_bpf_prog_type_str(void)
 		const char *prog_type_str;
 		char buf[256];
 
+		if (prog_type == __MAX_BPF_PROG_TYPE)
+			continue;
+
 		prog_type_name = btf__str_by_offset(btf, e->name_off);
 		prog_type_str = libbpf_bpf_prog_type_str(prog_type);
 		ASSERT_OK_PTR(prog_type_str, prog_type_name);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH RESEND bpf-next 15/18] bpf: take into account BPF token when fetching helper protos
  2023-06-02 14:59 [PATCH RESEND bpf-next 00/18] BPF token Andrii Nakryiko
                   ` (13 preceding siblings ...)
  2023-06-02 15:00 ` [PATCH RESEND bpf-next 14/18] bpf: add BPF token support to BPF_PROG_LOAD command Andrii Nakryiko
@ 2023-06-02 15:00 ` Andrii Nakryiko
  2023-06-02 18:46   ` kernel test robot
  2023-06-02 15:00 ` [PATCH RESEND bpf-next 16/18] bpf: consistently use BPF token throughout BPF verifier logic Andrii Nakryiko
                   ` (3 subsequent siblings)
  18 siblings, 1 reply; 36+ messages in thread
From: Andrii Nakryiko @ 2023-06-02 15:00 UTC (permalink / raw)
  To: bpf; +Cc: linux-security-module, keescook, brauner, lennart, cyphar, luto

Instead of performing unconditional system-wide bpf_capable() and
perfmon_capable() calls inside the bpf_base_func_proto() function (and
other similar ones) to determine a given BPF helper's eligibility for
a given program, use the BPF token recorded during BPF_PROG_LOAD
command handling to inform the decision.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 drivers/media/rc/bpf-lirc.c |  2 +-
 include/linux/bpf.h         |  5 +++--
 kernel/bpf/cgroup.c         |  6 +++---
 kernel/bpf/helpers.c        |  6 +++---
 kernel/bpf/syscall.c        |  5 +++--
 kernel/trace/bpf_trace.c    |  2 +-
 net/core/filter.c           | 32 ++++++++++++++++----------------
 net/ipv4/bpf_tcp_ca.c       |  2 +-
 net/netfilter/nf_bpf_link.c |  2 +-
 9 files changed, 32 insertions(+), 30 deletions(-)

diff --git a/drivers/media/rc/bpf-lirc.c b/drivers/media/rc/bpf-lirc.c
index fe17c7f98e81..6d07693c6b9f 100644
--- a/drivers/media/rc/bpf-lirc.c
+++ b/drivers/media/rc/bpf-lirc.c
@@ -110,7 +110,7 @@ lirc_mode2_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	case BPF_FUNC_get_prandom_u32:
 		return &bpf_get_prandom_u32_proto;
 	case BPF_FUNC_trace_printk:
-		if (perfmon_capable())
+		if (bpf_token_capable(prog->aux->token, CAP_PERFMON))
 			return bpf_get_trace_printk_proto();
 		fallthrough;
 	default:
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 320d93c542ed..9467d093e88e 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -2345,7 +2345,8 @@ int btf_check_type_match(struct bpf_verifier_log *log, const struct bpf_prog *pr
 struct bpf_prog *bpf_prog_by_id(u32 id);
 struct bpf_link *bpf_link_by_id(u32 id);
 
-const struct bpf_func_proto *bpf_base_func_proto(enum bpf_func_id func_id);
+const struct bpf_func_proto *bpf_base_func_proto(enum bpf_func_id func_id,
+						 const struct bpf_prog *prog);
 void bpf_task_storage_free(struct task_struct *task);
 void bpf_cgrp_storage_free(struct cgroup *cgroup);
 bool bpf_prog_has_kfunc_call(const struct bpf_prog *prog);
@@ -2602,7 +2603,7 @@ static inline int btf_struct_access(struct bpf_verifier_log *log,
 }
 
 static inline const struct bpf_func_proto *
-bpf_base_func_proto(enum bpf_func_id func_id)
+bpf_base_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
 	return NULL;
 }
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 5b2741aa0d9b..39d6cfb6f304 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -1615,7 +1615,7 @@ cgroup_dev_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	case BPF_FUNC_perf_event_output:
 		return &bpf_event_output_data_proto;
 	default:
-		return bpf_base_func_proto(func_id);
+		return bpf_base_func_proto(func_id, prog);
 	}
 }
 
@@ -2173,7 +2173,7 @@ sysctl_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	case BPF_FUNC_perf_event_output:
 		return &bpf_event_output_data_proto;
 	default:
-		return bpf_base_func_proto(func_id);
+		return bpf_base_func_proto(func_id, prog);
 	}
 }
 
@@ -2330,7 +2330,7 @@ cg_sockopt_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	case BPF_FUNC_perf_event_output:
 		return &bpf_event_output_data_proto;
 	default:
-		return bpf_base_func_proto(func_id);
+		return bpf_base_func_proto(func_id, prog);
 	}
 }
 
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 4ef4c4f8a355..31cd0b956c7e 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1663,7 +1663,7 @@ const struct bpf_func_proto bpf_probe_read_kernel_str_proto __weak;
 const struct bpf_func_proto bpf_task_pt_regs_proto __weak;
 
 const struct bpf_func_proto *
-bpf_base_func_proto(enum bpf_func_id func_id)
+bpf_base_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
 	switch (func_id) {
 	case BPF_FUNC_map_lookup_elem:
@@ -1714,7 +1714,7 @@ bpf_base_func_proto(enum bpf_func_id func_id)
 		break;
 	}
 
-	if (!bpf_capable())
+	if (!bpf_token_capable(prog->aux->token, CAP_BPF))
 		return NULL;
 
 	switch (func_id) {
@@ -1772,7 +1772,7 @@ bpf_base_func_proto(enum bpf_func_id func_id)
 		break;
 	}
 
-	if (!perfmon_capable())
+	if (!bpf_token_capable(prog->aux->token, CAP_PERFMON))
 		return NULL;
 
 	switch (func_id) {
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index e02688bebf8e..4ec366f20760 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -5528,7 +5528,7 @@ static const struct bpf_func_proto bpf_sys_bpf_proto = {
 const struct bpf_func_proto * __weak
 tracing_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
-	return bpf_base_func_proto(func_id);
+	return bpf_base_func_proto(func_id, prog);
 }
 
 BPF_CALL_1(bpf_sys_close, u32, fd)
@@ -5578,7 +5578,8 @@ syscall_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
 	switch (func_id) {
 	case BPF_FUNC_sys_bpf:
-		return !perfmon_capable() ? NULL : &bpf_sys_bpf_proto;
+		return !bpf_token_capable(prog->aux->token, CAP_PERFMON)
+		       ? NULL : &bpf_sys_bpf_proto;
 	case BPF_FUNC_btf_find_by_name_kind:
 		return &bpf_btf_find_by_name_kind_proto;
 	case BPF_FUNC_sys_close:
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 2bc41e6ac9fe..f5382d8bb690 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1511,7 +1511,7 @@ bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	case BPF_FUNC_trace_vprintk:
 		return bpf_get_trace_vprintk_proto();
 	default:
-		return bpf_base_func_proto(func_id);
+		return bpf_base_func_proto(func_id, prog);
 	}
 }
 
diff --git a/net/core/filter.c b/net/core/filter.c
index 968139f4a1ac..10d655c140c9 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -83,7 +83,7 @@
 #include <net/netfilter/nf_conntrack_bpf.h>
 
 static const struct bpf_func_proto *
-bpf_sk_base_func_proto(enum bpf_func_id func_id);
+bpf_sk_base_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog);
 
 int copy_bpf_fprog_from_user(struct sock_fprog *dst, sockptr_t src, int len)
 {
@@ -7726,7 +7726,7 @@ sock_filter_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	case BPF_FUNC_ktime_get_coarse_ns:
 		return &bpf_ktime_get_coarse_ns_proto;
 	default:
-		return bpf_base_func_proto(func_id);
+		return bpf_base_func_proto(func_id, prog);
 	}
 }
 
@@ -7809,7 +7809,7 @@ sock_addr_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 			return NULL;
 		}
 	default:
-		return bpf_sk_base_func_proto(func_id);
+		return bpf_sk_base_func_proto(func_id, prog);
 	}
 }
 
@@ -7828,7 +7828,7 @@ sk_filter_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	case BPF_FUNC_perf_event_output:
 		return &bpf_skb_event_output_proto;
 	default:
-		return bpf_sk_base_func_proto(func_id);
+		return bpf_sk_base_func_proto(func_id, prog);
 	}
 }
 
@@ -8015,7 +8015,7 @@ tc_cls_act_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 #endif
 #endif
 	default:
-		return bpf_sk_base_func_proto(func_id);
+		return bpf_sk_base_func_proto(func_id, prog);
 	}
 }
 
@@ -8074,7 +8074,7 @@ xdp_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 #endif
 #endif
 	default:
-		return bpf_sk_base_func_proto(func_id);
+		return bpf_sk_base_func_proto(func_id, prog);
 	}
 
 #if IS_MODULE(CONFIG_NF_CONNTRACK) && IS_ENABLED(CONFIG_DEBUG_INFO_BTF_MODULES)
@@ -8135,7 +8135,7 @@ sock_ops_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_tcp_sock_proto;
 #endif /* CONFIG_INET */
 	default:
-		return bpf_sk_base_func_proto(func_id);
+		return bpf_sk_base_func_proto(func_id, prog);
 	}
 }
 
@@ -8177,7 +8177,7 @@ sk_msg_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_get_cgroup_classid_curr_proto;
 #endif
 	default:
-		return bpf_sk_base_func_proto(func_id);
+		return bpf_sk_base_func_proto(func_id, prog);
 	}
 }
 
@@ -8221,7 +8221,7 @@ sk_skb_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_skc_lookup_tcp_proto;
 #endif
 	default:
-		return bpf_sk_base_func_proto(func_id);
+		return bpf_sk_base_func_proto(func_id, prog);
 	}
 }
 
@@ -8232,7 +8232,7 @@ flow_dissector_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	case BPF_FUNC_skb_load_bytes:
 		return &bpf_flow_dissector_load_bytes_proto;
 	default:
-		return bpf_sk_base_func_proto(func_id);
+		return bpf_sk_base_func_proto(func_id, prog);
 	}
 }
 
@@ -8259,7 +8259,7 @@ lwt_out_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	case BPF_FUNC_skb_under_cgroup:
 		return &bpf_skb_under_cgroup_proto;
 	default:
-		return bpf_sk_base_func_proto(func_id);
+		return bpf_sk_base_func_proto(func_id, prog);
 	}
 }
 
@@ -11090,7 +11090,7 @@ sk_reuseport_func_proto(enum bpf_func_id func_id,
 	case BPF_FUNC_ktime_get_coarse_ns:
 		return &bpf_ktime_get_coarse_ns_proto;
 	default:
-		return bpf_base_func_proto(func_id);
+		return bpf_base_func_proto(func_id, prog);
 	}
 }
 
@@ -11272,7 +11272,7 @@ sk_lookup_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	case BPF_FUNC_sk_release:
 		return &bpf_sk_release_proto;
 	default:
-		return bpf_sk_base_func_proto(func_id);
+		return bpf_sk_base_func_proto(func_id, prog);
 	}
 }
 
@@ -11606,7 +11606,7 @@ const struct bpf_func_proto bpf_sock_from_file_proto = {
 };
 
 static const struct bpf_func_proto *
-bpf_sk_base_func_proto(enum bpf_func_id func_id)
+bpf_sk_base_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
 	const struct bpf_func_proto *func;
 
@@ -11635,10 +11635,10 @@ bpf_sk_base_func_proto(enum bpf_func_id func_id)
 	case BPF_FUNC_ktime_get_coarse_ns:
 		return &bpf_ktime_get_coarse_ns_proto;
 	default:
-		return bpf_base_func_proto(func_id);
+		return bpf_base_func_proto(func_id, prog);
 	}
 
-	if (!perfmon_capable())
+	if (!bpf_token_capable(prog->aux->token, CAP_PERFMON))
 		return NULL;
 
 	return func;
diff --git a/net/ipv4/bpf_tcp_ca.c b/net/ipv4/bpf_tcp_ca.c
index 4406d796cc2f..0a3a60e7c282 100644
--- a/net/ipv4/bpf_tcp_ca.c
+++ b/net/ipv4/bpf_tcp_ca.c
@@ -193,7 +193,7 @@ bpf_tcp_ca_get_func_proto(enum bpf_func_id func_id,
 	case BPF_FUNC_ktime_get_coarse_ns:
 		return &bpf_ktime_get_coarse_ns_proto;
 	default:
-		return bpf_base_func_proto(func_id);
+		return bpf_base_func_proto(func_id, prog);
 	}
 }
 
diff --git a/net/netfilter/nf_bpf_link.c b/net/netfilter/nf_bpf_link.c
index c36da56d756f..d7786ea9c01a 100644
--- a/net/netfilter/nf_bpf_link.c
+++ b/net/netfilter/nf_bpf_link.c
@@ -219,7 +219,7 @@ static bool nf_is_valid_access(int off, int size, enum bpf_access_type type,
 static const struct bpf_func_proto *
 bpf_nf_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
-	return bpf_base_func_proto(func_id);
+	return bpf_base_func_proto(func_id, prog);
 }
 
 const struct bpf_verifier_ops netfilter_verifier_ops = {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH RESEND bpf-next 16/18] bpf: consistently use BPF token throughout BPF verifier logic
  2023-06-02 14:59 [PATCH RESEND bpf-next 00/18] BPF token Andrii Nakryiko
                   ` (14 preceding siblings ...)
  2023-06-02 15:00 ` [PATCH RESEND bpf-next 15/18] bpf: take into account BPF token when fetching helper protos Andrii Nakryiko
@ 2023-06-02 15:00 ` Andrii Nakryiko
  2023-06-02 15:00 ` [PATCH RESEND bpf-next 17/18] libbpf: add BPF token support to bpf_prog_load() API Andrii Nakryiko
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2023-06-02 15:00 UTC (permalink / raw)
  To: bpf; +Cc: linux-security-module, keescook, brauner, lennart, cyphar, luto

Remove remaining direct queries to perfmon_capable() and bpf_capable()
in BPF verifier logic and instead use BPF token (if available) to make
decisions about privileges.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 include/linux/bpf.h    | 18 ++++++++++--------
 include/linux/filter.h |  2 +-
 kernel/bpf/arraymap.c  |  2 +-
 kernel/bpf/core.c      |  2 +-
 kernel/bpf/verifier.c  | 13 ++++++-------
 net/core/filter.c      |  4 ++--
 6 files changed, 21 insertions(+), 20 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 9467d093e88e..bd0b448e1a22 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -2054,24 +2054,26 @@ bpf_map_alloc_percpu(const struct bpf_map *map, size_t size, size_t align,
 
 extern int sysctl_unprivileged_bpf_disabled;
 
-static inline bool bpf_allow_ptr_leaks(void)
+bool bpf_token_capable(const struct bpf_token *token, int cap);
+
+static inline bool bpf_allow_ptr_leaks(const struct bpf_token *token)
 {
-	return perfmon_capable();
+	return bpf_token_capable(token, CAP_PERFMON);
 }
 
-static inline bool bpf_allow_uninit_stack(void)
+static inline bool bpf_allow_uninit_stack(const struct bpf_token *token)
 {
-	return perfmon_capable();
+	return bpf_token_capable(token, CAP_PERFMON);
 }
 
-static inline bool bpf_bypass_spec_v1(void)
+static inline bool bpf_bypass_spec_v1(const struct bpf_token *token)
 {
-	return perfmon_capable();
+	return bpf_token_capable(token, CAP_PERFMON);
 }
 
-static inline bool bpf_bypass_spec_v4(void)
+static inline bool bpf_bypass_spec_v4(const struct bpf_token *token)
 {
-	return perfmon_capable();
+	return bpf_token_capable(token, CAP_PERFMON);
 }
 
 int bpf_map_new_fd(struct bpf_map *map, int flags);
diff --git a/include/linux/filter.h b/include/linux/filter.h
index bbce89937fde..60c0420143b3 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -1110,7 +1110,7 @@ static inline bool bpf_jit_blinding_enabled(struct bpf_prog *prog)
 		return false;
 	if (!bpf_jit_harden)
 		return false;
-	if (bpf_jit_harden == 1 && bpf_capable())
+	if (bpf_jit_harden == 1 && bpf_token_capable(prog->aux->token, CAP_BPF))
 		return false;
 
 	return true;
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 2058e89b5ddd..f0c64df6b6ff 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -82,7 +82,7 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
 	bool percpu = attr->map_type == BPF_MAP_TYPE_PERCPU_ARRAY;
 	int numa_node = bpf_map_attr_numa_node(attr);
 	u32 elem_size, index_mask, max_entries;
-	bool bypass_spec_v1 = bpf_bypass_spec_v1();
+	bool bypass_spec_v1 = bpf_bypass_spec_v1(NULL);
 	u64 array_size, mask64;
 	struct bpf_array *array;
 
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index cd0a93968009..c48303e097ec 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -661,7 +661,7 @@ static bool bpf_prog_kallsyms_candidate(const struct bpf_prog *fp)
 void bpf_prog_kallsyms_add(struct bpf_prog *fp)
 {
 	if (!bpf_prog_kallsyms_candidate(fp) ||
-	    !bpf_capable())
+	    !bpf_token_capable(fp->aux->token, CAP_BPF))
 		return;
 
 	bpf_prog_ksym_set_addr(fp);
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 086b2a14905b..9a0a93fa2c15 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -19219,7 +19219,12 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 	env->prog = *prog;
 	env->ops = bpf_verifier_ops[env->prog->type];
 	env->fd_array = make_bpfptr(attr->fd_array, uattr.is_kernel);
-	is_priv = bpf_capable();
+
+	env->allow_ptr_leaks = bpf_allow_ptr_leaks(env->prog->aux->token);
+	env->allow_uninit_stack = bpf_allow_uninit_stack(env->prog->aux->token);
+	env->bypass_spec_v1 = bpf_bypass_spec_v1(env->prog->aux->token);
+	env->bypass_spec_v4 = bpf_bypass_spec_v4(env->prog->aux->token);
+	env->bpf_capable = is_priv = bpf_token_capable(env->prog->aux->token, CAP_BPF);
 
 	bpf_get_btf_vmlinux();
 
@@ -19251,12 +19256,6 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 	if (attr->prog_flags & BPF_F_ANY_ALIGNMENT)
 		env->strict_alignment = false;
 
-	env->allow_ptr_leaks = bpf_allow_ptr_leaks();
-	env->allow_uninit_stack = bpf_allow_uninit_stack();
-	env->bypass_spec_v1 = bpf_bypass_spec_v1();
-	env->bypass_spec_v4 = bpf_bypass_spec_v4();
-	env->bpf_capable = bpf_capable();
-
 	if (is_priv)
 		env->test_state_freq = attr->prog_flags & BPF_F_TEST_STATE_FREQ;
 
diff --git a/net/core/filter.c b/net/core/filter.c
index 10d655c140c9..1b4b2ae1cedc 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -8434,7 +8434,7 @@ static bool cg_skb_is_valid_access(int off, int size,
 		return false;
 	case bpf_ctx_range(struct __sk_buff, data):
 	case bpf_ctx_range(struct __sk_buff, data_end):
-		if (!bpf_capable())
+		if (!bpf_token_capable(prog->aux->token, CAP_BPF))
 			return false;
 		break;
 	}
@@ -8446,7 +8446,7 @@ static bool cg_skb_is_valid_access(int off, int size,
 		case bpf_ctx_range_till(struct __sk_buff, cb[0], cb[4]):
 			break;
 		case bpf_ctx_range(struct __sk_buff, tstamp):
-			if (!bpf_capable())
+			if (!bpf_token_capable(prog->aux->token, CAP_BPF))
 				return false;
 			break;
 		default:
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH RESEND bpf-next 17/18] libbpf: add BPF token support to bpf_prog_load() API
  2023-06-02 14:59 [PATCH RESEND bpf-next 00/18] BPF token Andrii Nakryiko
                   ` (15 preceding siblings ...)
  2023-06-02 15:00 ` [PATCH RESEND bpf-next 16/18] bpf: consistently use BPF token throughout BPF verifier logic Andrii Nakryiko
@ 2023-06-02 15:00 ` Andrii Nakryiko
  2023-06-02 15:00 ` [PATCH RESEND bpf-next 18/18] selftests/bpf: add BPF token-enabled BPF_PROG_LOAD tests Andrii Nakryiko
  2023-06-02 15:55 ` [PATCH RESEND bpf-next 00/18] BPF token Casey Schaufler
  18 siblings, 0 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2023-06-02 15:00 UTC (permalink / raw)
  To: bpf; +Cc: linux-security-module, keescook, brauner, lennart, cyphar, luto

Wire token_fd through to bpf_prog_load(). Also make sure to pass
allowed_{prog,attach}_types to the kernel in bpf_token_create().

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/lib/bpf/bpf.c | 5 ++++-
 tools/lib/bpf/bpf.h | 7 +++++--
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 193993dbbdc4..cd8f0c525de6 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -234,7 +234,7 @@ int bpf_prog_load(enum bpf_prog_type prog_type,
 		  const struct bpf_insn *insns, size_t insn_cnt,
 		  struct bpf_prog_load_opts *opts)
 {
-	const size_t attr_sz = offsetofend(union bpf_attr, log_true_size);
+	const size_t attr_sz = offsetofend(union bpf_attr, prog_token_fd);
 	void *finfo = NULL, *linfo = NULL;
 	const char *func_info, *line_info;
 	__u32 log_size, log_level, attach_prog_fd, attach_btf_obj_fd;
@@ -263,6 +263,7 @@ int bpf_prog_load(enum bpf_prog_type prog_type,
 	attr.prog_flags = OPTS_GET(opts, prog_flags, 0);
 	attr.prog_ifindex = OPTS_GET(opts, prog_ifindex, 0);
 	attr.kern_version = OPTS_GET(opts, kern_version, 0);
+	attr.prog_token_fd = OPTS_GET(opts, token_fd, 0);
 
 	if (prog_name && kernel_supports(NULL, FEAT_PROG_NAME))
 		libbpf_strlcpy(attr.prog_name, prog_name, sizeof(attr.prog_name));
@@ -1220,6 +1221,8 @@ int bpf_token_create(struct bpf_token_create_opts *opts)
 	attr.token_create.token_fd = OPTS_GET(opts, token_fd, 0);
 	attr.token_create.allowed_cmds = OPTS_GET(opts, allowed_cmds, 0);
 	attr.token_create.allowed_map_types = OPTS_GET(opts, allowed_map_types, 0);
+	attr.token_create.allowed_prog_types = OPTS_GET(opts, allowed_prog_types, 0);
+	attr.token_create.allowed_attach_types = OPTS_GET(opts, allowed_attach_types, 0);
 
 	ret = sys_bpf_fd(BPF_TOKEN_CREATE, &attr, attr_sz);
 	return libbpf_err_errno(ret);
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index 3153a9e697e2..f9afc7846762 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -104,9 +104,10 @@ struct bpf_prog_load_opts {
 	 * If kernel doesn't support this feature, log_size is left unchanged.
 	 */
 	__u32 log_true_size;
+	__u32 token_fd;
 	size_t :0;
 };
-#define bpf_prog_load_opts__last_field log_true_size
+#define bpf_prog_load_opts__last_field token_fd
 
 LIBBPF_API int bpf_prog_load(enum bpf_prog_type prog_type,
 			     const char *prog_name, const char *license,
@@ -560,9 +561,11 @@ struct bpf_token_create_opts {
 	__u32 token_fd;
 	__u64 allowed_cmds;
 	__u64 allowed_map_types;
+	__u64 allowed_prog_types;
+	__u64 allowed_attach_types;
 	size_t :0;
 };
-#define bpf_token_create_opts__last_field allowed_map_types
+#define bpf_token_create_opts__last_field allowed_attach_types
 
 LIBBPF_API int bpf_token_create(struct bpf_token_create_opts *opts);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH RESEND bpf-next 18/18] selftests/bpf: add BPF token-enabled BPF_PROG_LOAD tests
  2023-06-02 14:59 [PATCH RESEND bpf-next 00/18] BPF token Andrii Nakryiko
                   ` (16 preceding siblings ...)
  2023-06-02 15:00 ` [PATCH RESEND bpf-next 17/18] libbpf: add BPF token support to bpf_prog_load() API Andrii Nakryiko
@ 2023-06-02 15:00 ` Andrii Nakryiko
  2023-06-02 15:55 ` [PATCH RESEND bpf-next 00/18] BPF token Casey Schaufler
  18 siblings, 0 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2023-06-02 15:00 UTC (permalink / raw)
  To: bpf; +Cc: linux-security-module, keescook, brauner, lennart, cyphar, luto

Add a test validating that a privileged BPF program using privileged BPF
helpers can be loaded through a delegated BPF token created by
a privileged process.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 .../testing/selftests/bpf/prog_tests/token.c  | 80 +++++++++++++++++++
 1 file changed, 80 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/token.c b/tools/testing/selftests/bpf/prog_tests/token.c
index b141f722c0c6..d5093ededf06 100644
--- a/tools/testing/selftests/bpf/prog_tests/token.c
+++ b/tools/testing/selftests/bpf/prog_tests/token.c
@@ -4,6 +4,7 @@
 #include <test_progs.h>
 #include <bpf/btf.h>
 #include "cap_helpers.h"
+#include <linux/filter.h>
 
 static int drop_priv_caps(__u64 *old_caps)
 {
@@ -191,6 +192,83 @@ static void subtest_btf_token(void)
 		ASSERT_OK(restore_priv_caps(old_caps), "restore_caps");
 }
 
+static void subtest_prog_token(void)
+{
+	LIBBPF_OPTS(bpf_token_create_opts, token_opts);
+	LIBBPF_OPTS(bpf_prog_load_opts, prog_opts);
+	int token_fd = 0, prog_fd = 0;
+	__u64 old_caps = 0;
+	struct bpf_insn insns[] = {
+		/* bpf_jiffies64() requires CAP_BPF */
+		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_jiffies64),
+		/* bpf_get_current_task() requires CAP_PERFMON */
+		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_get_current_task),
+		/* r0 = 0; exit; */
+		BPF_MOV64_IMM(BPF_REG_0, 0),
+		BPF_EXIT_INSN(),
+	};
+	size_t insn_cnt = ARRAY_SIZE(insns);
+
+	/* check that IGNORE_UNKNOWN_PROG_TYPES flag is respected */
+	token_opts.flags = BPF_F_TOKEN_IGNORE_UNKNOWN_PROG_TYPES;
+	token_opts.allowed_prog_types = ~0ULL; /* any current and future prog type is allowed */
+	token_opts.allowed_attach_types = 0;
+	token_fd = bpf_token_create(&token_opts);
+	if (!ASSERT_GT(token_fd, 0, "token_create_prog_type_future_proof"))
+		return;
+	close(token_fd);
+
+	/* check that IGNORE_UNKNOWN_ATTACH_TYPES flag is respected */
+	token_opts.flags = BPF_F_TOKEN_IGNORE_UNKNOWN_ATTACH_TYPES;
+	token_opts.allowed_prog_types = 0;
+	token_opts.allowed_attach_types = ~0ULL; /* any current and future attach type is allowed */
+	token_fd = bpf_token_create(&token_opts);
+	if (!ASSERT_GT(token_fd, 0, "token_create_attach_type_future_proof"))
+		return;
+	close(token_fd);
+
+	/* create BPF token allowing BPF_PROG_LOAD command */
+	token_opts.flags = 0;
+	token_opts.allowed_cmds = 1ULL << BPF_PROG_LOAD;
+	token_opts.allowed_prog_types = 1ULL << BPF_PROG_TYPE_XDP;
+	token_opts.allowed_attach_types = 1ULL << BPF_XDP;
+	token_fd = bpf_token_create(&token_opts);
+	if (!ASSERT_GT(token_fd, 0, "token_create"))
+		return;
+
+	/* drop privileges to test token_fd passing */
+	if (!ASSERT_OK(drop_priv_caps(&old_caps), "drop_caps"))
+		goto cleanup;
+
+	/* validate we can successfully load BPF program with token; this
+	 * being XDP program (CAP_NET_ADMIN) using bpf_jiffies64() (CAP_BPF)
+	 * and bpf_get_current_task() (CAP_PERFMON) helpers validates we have
+	 * BPF token wired properly in a bunch of places in the kernel
+	 */
+	prog_opts.token_fd = token_fd;
+	prog_opts.expected_attach_type = BPF_XDP;
+	prog_fd = bpf_prog_load(BPF_PROG_TYPE_XDP, "token_prog", "GPL",
+				insns, insn_cnt, &prog_opts);
+	if (!ASSERT_GT(prog_fd, 0, "prog_fd"))
+		goto cleanup;
+	close(prog_fd);
+
+	/* now validate that we *cannot* load BPF program without token */
+	prog_opts.token_fd = 0;
+	prog_fd = bpf_prog_load(BPF_PROG_TYPE_XDP, "token_prog", "GPL",
+				insns, insn_cnt, &prog_opts);
+	if (!ASSERT_EQ(prog_fd, -EPERM, "prog_fd_eperm"))
+		goto cleanup;
+
+cleanup:
+	if (prog_fd > 0)
+		close(prog_fd);
+	if (token_fd)
+		close(token_fd);
+	if (old_caps)
+		ASSERT_OK(restore_priv_caps(old_caps), "restore_caps");
+}
+
 void test_token(void)
 {
 	if (test__start_subtest("token_create"))
@@ -199,4 +277,6 @@ void test_token(void)
 		subtest_map_token();
 	if (test__start_subtest("btf_token"))
 		subtest_btf_token();
+	if (test__start_subtest("prog_token"))
+		subtest_prog_token();
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH RESEND bpf-next 00/18] BPF token
  2023-06-02 14:59 [PATCH RESEND bpf-next 00/18] BPF token Andrii Nakryiko
                   ` (17 preceding siblings ...)
  2023-06-02 15:00 ` [PATCH RESEND bpf-next 18/18] selftests/bpf: add BPF token-enabled BPF_PROG_LOAD tests Andrii Nakryiko
@ 2023-06-02 15:55 ` Casey Schaufler
  2023-06-05 20:41   ` Andrii Nakryiko
  18 siblings, 1 reply; 36+ messages in thread
From: Casey Schaufler @ 2023-06-02 15:55 UTC (permalink / raw)
  To: Andrii Nakryiko, bpf
  Cc: linux-security-module, keescook, brauner, lennart, cyphar, luto,
	Casey Schaufler

On 6/2/2023 7:59 AM, Andrii Nakryiko wrote:
> *Resending with trimmed CC list because original version didn't make it to
> the mailing list.*
>
> This patch set introduces a new BPF object, the BPF token, which allows
> delegating a subset of BPF functionality from a privileged system-wide
> daemon (e.g., systemd or any other container manager) to a *trusted*
> unprivileged application. Trust is the key here. This functionality is not
> about allowing unconditional unprivileged BPF usage. Establishing trust,
> though, is completely at the discretion of the respective privileged
> application that creates the BPF token.

Token-based privilege has a number of well-understood weaknesses,
none of which I see addressed here. I also have a real problem with
the notion of "trusted unprivileged" where trust is established by
a user space application. Ignoring the possibility of malicious code
for the moment, the opportunity for accidental privilege leakage is
huge. It would be trivial (and tempting) to create a privileged BPF
"shell" that would then be allowed to "trust" any application and
run it with privilege by passing it a token.

>
> The main motivation for BPF token is a desire to enable containerized
> BPF applications to be used together with user namespaces. This is currently
> impossible, as CAP_BPF, required for BPF subsystem usage, cannot be namespaced
> or sandboxed, as a general rule. E.g., tracing BPF programs, thanks to BPF
> helpers like bpf_probe_read_kernel() and bpf_probe_read_user(), can safely
> read arbitrary memory, and it's impossible to ensure that they only read
> memory of processes belonging to any given namespace. This means that it's
> impossible to have a namespace-aware CAP_BPF capability, and as such another
> mechanism to allow safe usage of BPF functionality is necessary. The BPF
> token, and its delegation to trusted unprivileged applications, is such a
> mechanism. The kernel makes no assumptions about what "trusted" constitutes
> in any particular case; it's up to specific privileged applications and
> their surrounding infrastructure to decide that. What the kernel provides
> is a set of APIs to create and tune a BPF token, and to pass it to
> privileged BPF commands that create new BPF objects like BPF programs, BPF
> maps, etc.
>
> A previous attempt at addressing this very same problem ([0]) utilized an
> authoritative LSM approach, but was conclusively rejected by upstream LSM
> maintainers. The BPF token concept doesn't change anything about the LSM
> approach, but can be combined with LSM hooks for very fine-grained security
> policy. Some ideas about making BPF token more convenient to use with LSM
> (in particular custom BPF LSM programs) were briefly described in a recent
> LSF/MM/BPF 2023 presentation ([1]), e.g., an ability to specify
> user-provided data (context), which in combination with BPF LSM would allow
> implementing very dynamic and fine-grained custom security policies on top
> of BPF token. In the interest of minimizing API surface area discussions,
> this is going to be added in follow-up patches, as it's not essential to
> the fundamental concept of a delegatable BPF token.
>
> It should be noted that BPF token is conceptually quite similar to the idea
> of a /dev/bpf device file proposed by Song a while ago ([2]). The biggest
> difference is the idea of using a virtual anon_inode file to hold a BPF
> token, and allowing multiple independent instances of them, each with its
> own set of restrictions. BPF pinning solves the problem of exposing such a
> BPF token through the file system (BPF FS, in this case) for cases where
> transferring FDs over Unix domain sockets is not convenient. Also,
> crucially, the BPF token approach does not use any special stateful
> task-scoped flags. Instead, the bpf() syscall accepts a token_fd parameter
> explicitly for each relevant BPF command. This addresses the main concerns
> brought up during the /dev/bpf discussion, and fits better with the overall
> BPF subsystem design.
>
> This patch set adds a basic minimum of functionality to make BPF token useful
> and to discuss API and functionality. Currently only low-level libbpf APIs
> support passing BPF token around, which allows testing kernel functionality
> but is, for the most part, not sufficient for real-world applications,
> which typically use high-level libbpf APIs based on the `struct bpf_object`
> type. This was done with the intent to limit the size of the patch set and
> to concentrate on mostly kernel-side changes. All the necessary plumbing
> for libbpf will be sent as a separate follow-up patch set once kernel
> support makes it upstream.
>
> Another part that should happen once kernel-side BPF token is established, is
> a set of conventions between applications (e.g., systemd), tools (e.g.,
> bpftool), and libraries (e.g., libbpf) about sharing BPF tokens through BPF FS
> at well-defined locations to allow applications to take advantage of this
> in an automatic fashion, without explicit code changes on the BPF
> application's side.
> But I'd like to postpone this discussion to after BPF token concept lands.
>
>   [0] https://lore.kernel.org/bpf/20230412043300.360803-1-andrii@kernel.org/
>   [1] http://vger.kernel.org/bpfconf2023_material/Trusted_unprivileged_BPF_LSFMM2023.pdf
>   [2] https://lore.kernel.org/bpf/20190627201923.2589391-2-songliubraving@fb.com/
>
> Andrii Nakryiko (18):
>   bpf: introduce BPF token object
>   libbpf: add bpf_token_create() API
>   selftests/bpf: add BPF_TOKEN_CREATE test
>   bpf: move unprivileged checks into map_create() and bpf_prog_load()
>   bpf: inline map creation logic in map_create() function
>   bpf: centralize permissions checks for all BPF map types
>   bpf: add BPF token support to BPF_MAP_CREATE command
>   libbpf: add BPF token support to bpf_map_create() API
>   selftests/bpf: add BPF token-enabled test for BPF_MAP_CREATE command
>   bpf: add BPF token support to BPF_BTF_LOAD command
>   libbpf: add BPF token support to bpf_btf_load() API
>   selftests/bpf: add BPF token-enabled BPF_BTF_LOAD selftest
>   bpf: keep BPF_PROG_LOAD permission checks clear of validations
>   bpf: add BPF token support to BPF_PROG_LOAD command
>   bpf: take into account BPF token when fetching helper protos
>   bpf: consistently use BPF token throughout BPF verifier logic
>   libbpf: add BPF token support to bpf_prog_load() API
>   selftests/bpf: add BPF token-enabled BPF_PROG_LOAD tests
>
>  drivers/media/rc/bpf-lirc.c                   |   2 +-
>  include/linux/bpf.h                           |  66 ++-
>  include/linux/filter.h                        |   2 +-
>  include/uapi/linux/bpf.h                      |  74 +++
>  kernel/bpf/Makefile                           |   2 +-
>  kernel/bpf/arraymap.c                         |   2 +-
>  kernel/bpf/bloom_filter.c                     |   3 -
>  kernel/bpf/bpf_local_storage.c                |   3 -
>  kernel/bpf/bpf_struct_ops.c                   |   3 -
>  kernel/bpf/cgroup.c                           |   6 +-
>  kernel/bpf/core.c                             |   3 +-
>  kernel/bpf/cpumap.c                           |   4 -
>  kernel/bpf/devmap.c                           |   3 -
>  kernel/bpf/hashtab.c                          |   6 -
>  kernel/bpf/helpers.c                          |   6 +-
>  kernel/bpf/inode.c                            |  26 ++
>  kernel/bpf/lpm_trie.c                         |   3 -
>  kernel/bpf/queue_stack_maps.c                 |   4 -
>  kernel/bpf/reuseport_array.c                  |   3 -
>  kernel/bpf/stackmap.c                         |   3 -
>  kernel/bpf/syscall.c                          | 429 ++++++++++++++----
>  kernel/bpf/token.c                            | 141 ++++++
>  kernel/bpf/verifier.c                         |  13 +-
>  kernel/trace/bpf_trace.c                      |   2 +-
>  net/core/filter.c                             |  36 +-
>  net/core/sock_map.c                           |   4 -
>  net/ipv4/bpf_tcp_ca.c                         |   2 +-
>  net/netfilter/nf_bpf_link.c                   |   2 +-
>  net/xdp/xskmap.c                              |   4 -
>  tools/include/uapi/linux/bpf.h                |  74 +++
>  tools/lib/bpf/bpf.c                           |  32 +-
>  tools/lib/bpf/bpf.h                           |  24 +-
>  tools/lib/bpf/libbpf.map                      |   1 +
>  .../selftests/bpf/prog_tests/libbpf_probes.c  |   4 +
>  .../selftests/bpf/prog_tests/libbpf_str.c     |   6 +
>  .../testing/selftests/bpf/prog_tests/token.c  | 282 ++++++++++++
>  .../bpf/prog_tests/unpriv_bpf_disabled.c      |   6 +-
>  37 files changed, 1098 insertions(+), 188 deletions(-)
>  create mode 100644 kernel/bpf/token.c
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/token.c
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH RESEND bpf-next 01/18] bpf: introduce BPF token object
  2023-06-02 14:59 ` [PATCH RESEND bpf-next 01/18] bpf: introduce BPF token object Andrii Nakryiko
@ 2023-06-02 17:41   ` kernel test robot
  2023-06-02 20:41   ` kernel test robot
  2023-06-03  1:32   ` Stanislav Fomichev
  2 siblings, 0 replies; 36+ messages in thread
From: kernel test robot @ 2023-06-02 17:41 UTC (permalink / raw)
  To: Andrii Nakryiko, bpf
  Cc: oe-kbuild-all, linux-security-module, keescook, brauner, lennart,
	cyphar, luto

Hi Andrii,

kernel test robot noticed the following build warnings:

[auto build test WARNING on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Andrii-Nakryiko/bpf-introduce-BPF-token-object/20230602-230448
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/20230602150011.1657856-2-andrii%40kernel.org
patch subject: [PATCH RESEND bpf-next 01/18] bpf: introduce BPF token object
config: um-x86_64_defconfig (https://download.01.org/0day-ci/archive/20230603/202306030138.u9AeNgUk-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build):
        # https://github.com/intel-lab-lkp/linux/commit/59e6ef2000a056ce3386db8481e477e5abfbbe15
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Andrii-Nakryiko/bpf-introduce-BPF-token-object/20230602-230448
        git checkout 59e6ef2000a056ce3386db8481e477e5abfbbe15
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        make W=1 O=build_dir ARCH=um SUBARCH=x86_64 olddefconfig
        make W=1 O=build_dir ARCH=um SUBARCH=x86_64 SHELL=/bin/bash kernel/ net/core/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202306030138.u9AeNgUk-lkp@intel.com/

All warnings (new ones prefixed by >>):

   In file included from kernel/fork.c:98:
   include/linux/bpf.h: In function 'bpf_token_new_fd':
>> include/linux/bpf.h:2465:16: warning: returning 'int' from a function with return type 'struct bpf_token *' makes pointer from integer without a cast [-Wint-conversion]
    2465 |         return -EOPNOTSUPP;
         |                ^
   kernel/fork.c: At top level:
   kernel/fork.c:164:13: warning: no previous prototype for 'arch_release_task_struct' [-Wmissing-prototypes]
     164 | void __weak arch_release_task_struct(struct task_struct *tsk)
         |             ^~~~~~~~~~~~~~~~~~~~~~~~
   kernel/fork.c:991:20: warning: no previous prototype for 'arch_task_cache_init' [-Wmissing-prototypes]
     991 | void __init __weak arch_task_cache_init(void) { }
         |                    ^~~~~~~~~~~~~~~~~~~~
   kernel/fork.c:1086:12: warning: no previous prototype for 'arch_dup_task_struct' [-Wmissing-prototypes]
    1086 | int __weak arch_dup_task_struct(struct task_struct *dst,
         |            ^~~~~~~~~~~~~~~~~~~~
--
   In file included from include/linux/filter.h:9,
                    from kernel/sysctl.c:35:
   include/linux/bpf.h: In function 'bpf_token_new_fd':
>> include/linux/bpf.h:2465:16: warning: returning 'int' from a function with return type 'struct bpf_token *' makes pointer from integer without a cast [-Wint-conversion]
    2465 |         return -EOPNOTSUPP;
         |                ^
--
   In file included from include/linux/filter.h:9,
                    from kernel/kallsyms.c:25:
   include/linux/bpf.h: In function 'bpf_token_new_fd':
>> include/linux/bpf.h:2465:16: warning: returning 'int' from a function with return type 'struct bpf_token *' makes pointer from integer without a cast [-Wint-conversion]
    2465 |         return -EOPNOTSUPP;
         |                ^
   kernel/kallsyms.c: At top level:
   kernel/kallsyms.c:662:12: warning: no previous prototype for 'arch_get_kallsym' [-Wmissing-prototypes]
     662 | int __weak arch_get_kallsym(unsigned int symnum, unsigned long *value,
         |            ^~~~~~~~~~~~~~~~
--
   In file included from include/linux/filter.h:9,
                    from kernel/bpf/core.c:21:
   include/linux/bpf.h: In function 'bpf_token_new_fd':
>> include/linux/bpf.h:2465:16: warning: returning 'int' from a function with return type 'struct bpf_token *' makes pointer from integer without a cast [-Wint-conversion]
    2465 |         return -EOPNOTSUPP;
         |                ^
   kernel/bpf/core.c: At top level:
   kernel/bpf/core.c:1638:12: warning: no previous prototype for 'bpf_probe_read_kernel' [-Wmissing-prototypes]
    1638 | u64 __weak bpf_probe_read_kernel(void *dst, u32 size, const void *unsafe_ptr)
         |            ^~~~~~~~~~~~~~~~~~~~~
   kernel/bpf/core.c:2075:6: warning: no previous prototype for 'bpf_patch_call_args' [-Wmissing-prototypes]
    2075 | void bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth)
         |      ^~~~~~~~~~~~~~~~~~~


vim +2465 include/linux/bpf.h

  2462	
  2463	static inline struct bpf_token *bpf_token_new_fd(struct bpf_token *token)
  2464	{
> 2465		return -EOPNOTSUPP;
  2466	}
  2467	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH RESEND bpf-next 15/18] bpf: take into account BPF token when fetching helper protos
  2023-06-02 15:00 ` [PATCH RESEND bpf-next 15/18] bpf: take into account BPF token when fetching helper protos Andrii Nakryiko
@ 2023-06-02 18:46   ` kernel test robot
  2023-06-02 20:07     ` Andrii Nakryiko
  0 siblings, 1 reply; 36+ messages in thread
From: kernel test robot @ 2023-06-02 18:46 UTC (permalink / raw)
  To: Andrii Nakryiko, bpf
  Cc: oe-kbuild-all, linux-security-module, keescook, brauner, lennart,
	cyphar, luto

Hi Andrii,

kernel test robot noticed the following build errors:

[auto build test ERROR on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Andrii-Nakryiko/bpf-introduce-BPF-token-object/20230602-230448
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/20230602150011.1657856-16-andrii%40kernel.org
patch subject: [PATCH RESEND bpf-next 15/18] bpf: take into account BPF token when fetching helper protos
config: um-x86_64_defconfig (https://download.01.org/0day-ci/archive/20230603/202306030252.UOXkWZTK-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build):
        # https://github.com/intel-lab-lkp/linux/commit/3d830ca845b075ab4132487aaaa69b70a467863c
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Andrii-Nakryiko/bpf-introduce-BPF-token-object/20230602-230448
        git checkout 3d830ca845b075ab4132487aaaa69b70a467863c
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        make W=1 O=build_dir ARCH=um SUBARCH=x86_64 olddefconfig
        make W=1 O=build_dir ARCH=um SUBARCH=x86_64 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202306030252.UOXkWZTK-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from include/linux/bpf_verifier.h:7,
                    from net/core/filter.c:21:
   include/linux/bpf.h: In function 'bpf_token_new_fd':
   include/linux/bpf.h:2475:16: warning: returning 'int' from a function with return type 'struct bpf_token *' makes pointer from integer without a cast [-Wint-conversion]
    2475 |         return -EOPNOTSUPP;
         |                ^
   net/core/filter.c: In function 'bpf_sk_base_func_proto':
>> net/core/filter.c:11653:14: error: implicit declaration of function 'bpf_token_capable'; did you mean 'bpf_token_put'? [-Werror=implicit-function-declaration]
   11653 |         if (!bpf_token_capable(prog->aux->token, CAP_PERFMON))
         |              ^~~~~~~~~~~~~~~~~
         |              bpf_token_put
   cc1: some warnings being treated as errors


vim +11653 net/core/filter.c

 11619	
 11620	static const struct bpf_func_proto *
 11621	bpf_sk_base_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 11622	{
 11623		const struct bpf_func_proto *func;
 11624	
 11625		switch (func_id) {
 11626		case BPF_FUNC_skc_to_tcp6_sock:
 11627			func = &bpf_skc_to_tcp6_sock_proto;
 11628			break;
 11629		case BPF_FUNC_skc_to_tcp_sock:
 11630			func = &bpf_skc_to_tcp_sock_proto;
 11631			break;
 11632		case BPF_FUNC_skc_to_tcp_timewait_sock:
 11633			func = &bpf_skc_to_tcp_timewait_sock_proto;
 11634			break;
 11635		case BPF_FUNC_skc_to_tcp_request_sock:
 11636			func = &bpf_skc_to_tcp_request_sock_proto;
 11637			break;
 11638		case BPF_FUNC_skc_to_udp6_sock:
 11639			func = &bpf_skc_to_udp6_sock_proto;
 11640			break;
 11641		case BPF_FUNC_skc_to_unix_sock:
 11642			func = &bpf_skc_to_unix_sock_proto;
 11643			break;
 11644		case BPF_FUNC_skc_to_mptcp_sock:
 11645			func = &bpf_skc_to_mptcp_sock_proto;
 11646			break;
 11647		case BPF_FUNC_ktime_get_coarse_ns:
 11648			return &bpf_ktime_get_coarse_ns_proto;
 11649		default:
 11650			return bpf_base_func_proto(func_id, prog);
 11651		}
 11652	
 11653		if (!bpf_token_capable(prog->aux->token, CAP_PERFMON))
 11654			return NULL;
 11655	
 11656		return func;
 11657	}
 11658	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH RESEND bpf-next 15/18] bpf: take into account BPF token when fetching helper protos
  2023-06-02 18:46   ` kernel test robot
@ 2023-06-02 20:07     ` Andrii Nakryiko
  0 siblings, 0 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2023-06-02 20:07 UTC (permalink / raw)
  To: kernel test robot
  Cc: Andrii Nakryiko, bpf, oe-kbuild-all, linux-security-module,
	keescook, brauner, lennart, cyphar, luto

On Fri, Jun 2, 2023 at 11:48 AM kernel test robot <lkp@intel.com> wrote:
>
> Hi Andrii,
>
> kernel test robot noticed the following build errors:
>
> [auto build test ERROR on bpf-next/master]
>
> url:    https://github.com/intel-lab-lkp/linux/commits/Andrii-Nakryiko/bpf-introduce-BPF-token-object/20230602-230448
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
> patch link:    https://lore.kernel.org/r/20230602150011.1657856-16-andrii%40kernel.org
> patch subject: [PATCH RESEND bpf-next 15/18] bpf: take into account BPF token when fetching helper protos
> config: um-x86_64_defconfig (https://download.01.org/0day-ci/archive/20230603/202306030252.UOXkWZTK-lkp@intel.com/config)
> compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
> reproduce (this is a W=1 build):
>         # https://github.com/intel-lab-lkp/linux/commit/3d830ca845b075ab4132487aaaa69b70a467863c
>         git remote add linux-review https://github.com/intel-lab-lkp/linux
>         git fetch --no-tags linux-review Andrii-Nakryiko/bpf-introduce-BPF-token-object/20230602-230448
>         git checkout 3d830ca845b075ab4132487aaaa69b70a467863c
>         # save the config file
>         mkdir build_dir && cp config build_dir/.config
>         make W=1 O=build_dir ARCH=um SUBARCH=x86_64 olddefconfig
>         make W=1 O=build_dir ARCH=um SUBARCH=x86_64 SHELL=/bin/bash
>
> If you fix the issue, kindly add following tag where applicable
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202306030252.UOXkWZTK-lkp@intel.com/
>
> All errors (new ones prefixed by >>):
>
>    In file included from include/linux/bpf_verifier.h:7,
>                     from net/core/filter.c:21:
>    include/linux/bpf.h: In function 'bpf_token_new_fd':
>    include/linux/bpf.h:2475:16: warning: returning 'int' from a function with return type 'struct bpf_token *' makes pointer from integer without a cast [-Wint-conversion]
>     2475 |         return -EOPNOTSUPP;
>          |                ^

bad copy/paste, this function should return int. I forgot to test that
everything compiles without CONFIG_BPF_SYSCALL.
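
[Editor's note: for reference, the fix described here is just giving the
!CONFIG_BPF_SYSCALL stub the same int return type as the real declaration
(a new fd on success, negative errno on failure). A minimal standalone
sketch, not the exact kernel hunk:]

```c
#include <errno.h>

struct bpf_token;

/* Corrected stub: bpf_token_new_fd() returns an int (fd or -errno),
 * so the fallback must not be declared as returning a pointer. */
static inline int bpf_token_new_fd(struct bpf_token *token)
{
	(void)token;
	return -EOPNOTSUPP;
}
```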


>    net/core/filter.c: In function 'bpf_sk_base_func_proto':
> >> net/core/filter.c:11653:14: error: implicit declaration of function 'bpf_token_capable'; did you mean 'bpf_token_put'? [-Werror=implicit-function-declaration]
>    11653 |         if (!bpf_token_capable(prog->aux->token, CAP_PERFMON))
>          |              ^~~~~~~~~~~~~~~~~
>          |              bpf_token_put
>    cc1: some warnings being treated as errors
>
>

hm.. maybe I'll just make bpf_token_capable() a static inline function
in include/linux/bpf.h
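
[Editor's note: one plausible shape for such a static inline fallback,
sketched standalone below; `capable()` is mocked so the snippet compiles
outside the kernel, and the real helper's logic may differ:]

```c
#include <stdbool.h>

struct bpf_token;

/* Mock of the kernel's capable() so this sketch is self-contained;
 * in the kernel it checks the current task's effective capabilities. */
static bool capable(int cap)
{
	(void)cap;
	return false; /* model an unprivileged caller */
}

/* Sketch of a static inline fallback for configs without BPF token
 * support: with no token, the check degrades to a plain capability
 * check on the caller. */
static inline bool bpf_token_capable(const struct bpf_token *token, int cap)
{
	if (!token)
		return capable(cap);
	return false; /* no tokens exist in this config; deny defensively */
}
```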

> vim +11653 net/core/filter.c
>
>  11619
>  11620  static const struct bpf_func_proto *
>  11621  bpf_sk_base_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>  11622  {
>  11623          const struct bpf_func_proto *func;
>  11624
>  11625          switch (func_id) {
>  11626          case BPF_FUNC_skc_to_tcp6_sock:
>  11627                  func = &bpf_skc_to_tcp6_sock_proto;
>  11628                  break;
>  11629          case BPF_FUNC_skc_to_tcp_sock:
>  11630                  func = &bpf_skc_to_tcp_sock_proto;
>  11631                  break;
>  11632          case BPF_FUNC_skc_to_tcp_timewait_sock:
>  11633                  func = &bpf_skc_to_tcp_timewait_sock_proto;
>  11634                  break;
>  11635          case BPF_FUNC_skc_to_tcp_request_sock:
>  11636                  func = &bpf_skc_to_tcp_request_sock_proto;
>  11637                  break;
>  11638          case BPF_FUNC_skc_to_udp6_sock:
>  11639                  func = &bpf_skc_to_udp6_sock_proto;
>  11640                  break;
>  11641          case BPF_FUNC_skc_to_unix_sock:
>  11642                  func = &bpf_skc_to_unix_sock_proto;
>  11643                  break;
>  11644          case BPF_FUNC_skc_to_mptcp_sock:
>  11645                  func = &bpf_skc_to_mptcp_sock_proto;
>  11646                  break;
>  11647          case BPF_FUNC_ktime_get_coarse_ns:
>  11648                  return &bpf_ktime_get_coarse_ns_proto;
>  11649          default:
>  11650                  return bpf_base_func_proto(func_id, prog);
>  11651          }
>  11652
>  11653          if (!bpf_token_capable(prog->aux->token, CAP_PERFMON))
>  11654                  return NULL;
>  11655
>  11656          return func;
>  11657  }
>  11658
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH RESEND bpf-next 01/18] bpf: introduce BPF token object
  2023-06-02 14:59 ` [PATCH RESEND bpf-next 01/18] bpf: introduce BPF token object Andrii Nakryiko
  2023-06-02 17:41   ` kernel test robot
@ 2023-06-02 20:41   ` kernel test robot
  2023-06-03  1:32   ` Stanislav Fomichev
  2 siblings, 0 replies; 36+ messages in thread
From: kernel test robot @ 2023-06-02 20:41 UTC (permalink / raw)
  To: Andrii Nakryiko, bpf
  Cc: oe-kbuild-all, linux-security-module, keescook, brauner, lennart,
	cyphar, luto

Hi Andrii,

kernel test robot noticed the following build errors:

[auto build test ERROR on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Andrii-Nakryiko/bpf-introduce-BPF-token-object/20230602-230448
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/20230602150011.1657856-2-andrii%40kernel.org
patch subject: [PATCH RESEND bpf-next 01/18] bpf: introduce BPF token object
config: sparc-randconfig-r022-20230531 (https://download.01.org/0day-ci/archive/20230603/202306030402.Nn38A6qD-lkp@intel.com/config)
compiler: sparc64-linux-gcc (GCC) 12.3.0
reproduce (this is a W=1 build):
        mkdir -p ~/bin
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/59e6ef2000a056ce3386db8481e477e5abfbbe15
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Andrii-Nakryiko/bpf-introduce-BPF-token-object/20230602-230448
        git checkout 59e6ef2000a056ce3386db8481e477e5abfbbe15
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.3.0 ~/bin/make.cross W=1 O=build_dir ARCH=sparc olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.3.0 ~/bin/make.cross W=1 O=build_dir ARCH=sparc SHELL=/bin/bash arch/sparc/kernel/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202306030402.Nn38A6qD-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from include/linux/filter.h:9,
                    from arch/sparc/kernel/sys_sparc32.c:29:
   include/linux/bpf.h: In function 'bpf_token_new_fd':
>> include/linux/bpf.h:2465:16: error: returning 'int' from a function with return type 'struct bpf_token *' makes pointer from integer without a cast [-Werror=int-conversion]
    2465 |         return -EOPNOTSUPP;
         |                ^
   cc1: all warnings being treated as errors


vim +2465 include/linux/bpf.h

  2462	
  2463	static inline struct bpf_token *bpf_token_new_fd(struct bpf_token *token)
  2464	{
> 2465		return -EOPNOTSUPP;
  2466	}
  2467	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH RESEND bpf-next 01/18] bpf: introduce BPF token object
  2023-06-02 14:59 ` [PATCH RESEND bpf-next 01/18] bpf: introduce BPF token object Andrii Nakryiko
  2023-06-02 17:41   ` kernel test robot
  2023-06-02 20:41   ` kernel test robot
@ 2023-06-03  1:32   ` Stanislav Fomichev
  2023-06-05 20:56     ` Andrii Nakryiko
  2 siblings, 1 reply; 36+ messages in thread
From: Stanislav Fomichev @ 2023-06-03  1:32 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, linux-security-module, keescook, brauner, lennart, cyphar, luto

On 06/02, Andrii Nakryiko wrote:
> Add a new kind of BPF kernel object, the BPF token. The BPF token is meant
> to allow delegating privileged BPF functionality, like loading a BPF
> program or creating a BPF map, from a privileged process to a *trusted*
> unprivileged process, all while having a good amount of control over which
> privileged operations can be done using the provided BPF token.
> 
> This patch adds a new BPF_TOKEN_CREATE command to the bpf() syscall, which
> allows creating a new BPF token object along with a set of allowed
> commands. Currently only the BPF_TOKEN_CREATE command itself can be
> delegated, but other patches gradually add ability to delegate
> BPF_MAP_CREATE, BPF_BTF_LOAD, and BPF_PROG_LOAD commands.
> 
> The above means that BPF token creation can be allowed by another
> existing BPF token, if the original privileged creator allowed that. A
> newly derived BPF token cannot be more powerful than the original BPF
> token.
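
[Editor's note: the "cannot be more powerful than the original" rule can be
illustrated with a tiny user-space model. The names here are made up for
illustration, not kernel code: a derived token's command bitmask is simply
intersected with its parent's.]

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative model: allowed_cmds is a bitmask indexed by
 * enum bpf_cmd values. */
struct token_model {
	uint64_t allowed_cmds;
};

static bool cmd_allowed(const struct token_model *t, int cmd)
{
	return (t->allowed_cmds & (1ULL << cmd)) != 0;
}

/* A derived token is masked by its parent's permissions, so it can
 * never allow a command that the parent forbids. */
static struct token_model derive_token(const struct token_model *parent,
				       uint64_t requested_cmds)
{
	struct token_model t = {
		.allowed_cmds = requested_cmds & parent->allowed_cmds,
	};
	return t;
}
```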
> 
> BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS flag is added to allow an application to
> express "all supported BPF commands should be allowed" without worrying
> about which subset of the desired commands is actually supported by a
> potentially outdated kernel. Allowing these semantics doesn't seem to
> introduce any backwards compatibility issues and doesn't introduce any
> risk of abusing or misusing bit set field, but makes backwards
> compatibility story for various applications and tools much more
> straightforward, making it unnecessary to probe support for each
> individual possible bit. This is especially useful in follow up patches
> where we add BPF map types and prog types bit sets.
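
[Editor's note: the flag's semantics can be sketched as a small
illustrative model with made-up names — not the kernel implementation:
unknown bits either fail token creation or are silently masked out,
depending on the flag.]

```c
#include <errno.h>
#include <stdint.h>

#define MODEL_F_IGNORE_UNKNOWN_CMDS (1u << 0)

/* Illustrative sketch: the kernel knows which command bits it
 * supports; a request containing unknown bits fails with -EINVAL
 * unless the ignore-unknown flag is set, in which case the unknown
 * bits are simply dropped. */
static int model_token_create(uint64_t supported_cmds,
			      uint64_t requested_cmds,
			      unsigned int flags,
			      uint64_t *granted_cmds)
{
	if ((requested_cmds & ~supported_cmds) &&
	    !(flags & MODEL_F_IGNORE_UNKNOWN_CMDS))
		return -EINVAL;
	*granted_cmds = requested_cmds & supported_cmds;
	return 0;
}
```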
> 
> Lastly, a BPF token can be pinned in and retrieved from BPF FS, just like
> progs, maps, BTFs, and links. This allows applications (like container
> managers) to share a BPF token with other applications through the file
> system, just like any other BPF object, and to further control access to
> it using file system permissions, if desired.
> 
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> ---
>  include/linux/bpf.h            |  34 +++++++++
>  include/uapi/linux/bpf.h       |  42 ++++++++++++
>  kernel/bpf/Makefile            |   2 +-
>  kernel/bpf/inode.c             |  26 +++++++
>  kernel/bpf/syscall.c           |  74 ++++++++++++++++++++
>  kernel/bpf/token.c             | 122 +++++++++++++++++++++++++++++++++
>  tools/include/uapi/linux/bpf.h |  40 +++++++++++
>  7 files changed, 339 insertions(+), 1 deletion(-)
>  create mode 100644 kernel/bpf/token.c
> 
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index f58895830ada..fe6d51c3a5b1 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -51,6 +51,7 @@ struct module;
>  struct bpf_func_state;
>  struct ftrace_ops;
>  struct cgroup;
> +struct bpf_token;
>  
>  extern struct idr btf_idr;
>  extern spinlock_t btf_idr_lock;
> @@ -1533,6 +1534,12 @@ struct bpf_link_primer {
>  	u32 id;
>  };
>  
> +struct bpf_token {
> +	struct work_struct work;
> +	atomic64_t refcnt;
> +	u64 allowed_cmds;
> +};
> +
>  struct bpf_struct_ops_value;
>  struct btf_member;
>  
> @@ -2077,6 +2084,15 @@ struct file *bpf_link_new_file(struct bpf_link *link, int *reserved_fd);
>  struct bpf_link *bpf_link_get_from_fd(u32 ufd);
>  struct bpf_link *bpf_link_get_curr_or_next(u32 *id);
>  
> +void bpf_token_inc(struct bpf_token *token);
> +void bpf_token_put(struct bpf_token *token);
> +struct bpf_token *bpf_token_alloc(void);
> +int bpf_token_new_fd(struct bpf_token *token);
> +struct bpf_token *bpf_token_get_from_fd(u32 ufd);
> +
> +bool bpf_token_capable(const struct bpf_token *token, int cap);
> +bool bpf_token_allow_cmd(const struct bpf_token *token, enum bpf_cmd cmd);
> +
>  int bpf_obj_pin_user(u32 ufd, int path_fd, const char __user *pathname);
>  int bpf_obj_get_user(int path_fd, const char __user *pathname, int flags);
>  
> @@ -2436,6 +2452,24 @@ static inline int bpf_obj_get_user(const char __user *pathname, int flags)
>  	return -EOPNOTSUPP;
>  }
>  
> +static inline void bpf_token_inc(struct bpf_token *token)
> +{
> +}
> +
> +static inline void bpf_token_put(struct bpf_token *token)
> +{
> +}
> +
> +static inline int bpf_token_new_fd(struct bpf_token *token)
> +{
> +	return -EOPNOTSUPP;
> +}
> +
> +static inline struct bpf_token *bpf_token_get_from_fd(u32 ufd)
> +{
> +	return ERR_PTR(-EOPNOTSUPP);
> +}
> +
>  static inline void __dev_flush(void)
>  {
>  }
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 9273c654743c..01ab79f2ad9f 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -846,6 +846,16 @@ union bpf_iter_link_info {
>   *		Returns zero on success. On error, -1 is returned and *errno*
>   *		is set appropriately.
>   *
> + * BPF_TOKEN_CREATE
> + *	Description
> + *		Create BPF token with embedded information about what
> + *		BPF-related functionality is allowed. This BPF token can be
> + *		passed as an extra parameter to various bpf() syscall commands.
> + *
> + *	Return
> + *		A new file descriptor (a nonnegative integer), or -1 if an
> + *		error occurred (in which case, *errno* is set appropriately).
> + *
>   * NOTES
>   *	eBPF objects (maps and programs) can be shared between processes.
>   *
> @@ -900,6 +910,7 @@ enum bpf_cmd {
>  	BPF_ITER_CREATE,
>  	BPF_LINK_DETACH,
>  	BPF_PROG_BIND_MAP,
> +	BPF_TOKEN_CREATE,
>  };
>  
>  enum bpf_map_type {
> @@ -1169,6 +1180,24 @@ enum bpf_link_type {
>   */
>  #define BPF_F_KPROBE_MULTI_RETURN	(1U << 0)
>  
> +/* BPF_TOKEN_CREATE command flags
> + */
> +enum {
> +	/* Ignore unrecognized bits in token_create.allowed_cmds bit set.  If
> +	 * this flag is set, kernel won't return -EINVAL for a bit
> +	 * corresponding to a non-existing command or the one that doesn't
> +	 * support BPF token passing. This flag allows an application to request
> +	 * BPF token creation for a desired set of commands without worrying
> +	 * about older kernels not supporting some of the commands.
> +	 * Presumably, deployed applications will do separate feature
> +	 * detection and will avoid calling not-yet-supported bpf() commands,
> +	 * so this BPF token will work equally well both on older and newer
> +	 * kernels, even if some of the requested commands won't be BPF
> +	 * token-enabled.
> +	 */
> +	BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS		  = 1U << 0,
> +};
> +
>  /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
>   * the following extensions:
>   *
> @@ -1621,6 +1650,19 @@ union bpf_attr {
>  		__u32		flags;		/* extra flags */
>  	} prog_bind_map;
>  
> +	struct { /* struct used by BPF_TOKEN_CREATE command */
> +		__u32		flags;
> +		__u32		token_fd;
> +		/* a bit set of allowed bpf() syscall commands,
> +		 * e.g., (1ULL << BPF_TOKEN_CREATE) | (1ULL << BPF_PROG_LOAD)
> +		 * will allow creating derived BPF tokens and loading new BPF
> +		 * programs;
> +		 * see also BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS for its effect on
> +		 * validity checking of this set
> +		 */
> +		__u64		allowed_cmds;
> +	} token_create;

Do you think this might eventually grow into something like
"allow only lookup operation for this specific map"? If yes, maybe it
makes sense to separate token-create vs token-add-capability operations?

BPF_TOKEN_CREATE would create a token that can't do anything. Then you
would call a bunch of BPF_TOKEN_ALLOW with maybe op=SYSCALL_CMD
value=BPF_TOKEN_CREATE.

This will be more future-proof plus won't really depend on having a
bitmask in the uapi. Then the users will be able to handle
BPF_TOKEN_ALLOW{op=SYSCALL_CMD value=SOME_VALUE_NOT_SUPPORTED_ON_THIS_KERNEL}
themselves (IOW, BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS won't be needed).
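To make the flag semantics being debated here concrete, the validation behavior described in the quoted comment can be modeled in a few lines. This is an illustrative userspace sketch, not the kernel code from the patch; the function name and the `supported` mask are made up for the example:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS (1U << 0)

/* Model of the validation described for token_create.allowed_cmds:
 * without the flag, any requested bit the kernel doesn't recognize
 * fails with -EINVAL; with the flag, unknown bits are silently
 * masked out and only the supported subset is kept. */
static int validate_allowed_cmds(uint32_t flags, uint64_t requested,
				 uint64_t supported, uint64_t *out)
{
	if (!(flags & BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS) &&
	    (requested & ~supported))
		return -EINVAL;
	*out = requested & supported;
	return 0;
}
```

On an older kernel, `supported` simply has fewer bits set, so the same `requested` mask keeps working when the flag is passed; Stanislav's alternative would move this decision to userspace instead.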

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH RESEND bpf-next 00/18] BPF token
  2023-06-02 15:55 ` [PATCH RESEND bpf-next 00/18] BPF token Casey Schaufler
@ 2023-06-05 20:41   ` Andrii Nakryiko
  2023-06-05 22:26     ` Casey Schaufler
  0 siblings, 1 reply; 36+ messages in thread
From: Andrii Nakryiko @ 2023-06-05 20:41 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Andrii Nakryiko, bpf, linux-security-module, keescook, brauner,
	lennart, cyphar, luto

On Fri, Jun 2, 2023 at 8:55 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
>
> On 6/2/2023 7:59 AM, Andrii Nakryiko wrote:
> > *Resending with trimmed CC list because original version didn't make it to
> > the mailing list.*
> >
> > This patch set introduces new BPF object, BPF token, which allows to delegate
> > a subset of BPF functionality from privileged system-wide daemon (e.g.,
> > systemd or any other container manager) to a *trusted* unprivileged
> > application. Trust is the key here. This functionality is not about allowing
> > unconditional unprivileged BPF usage. Establishing trust, though, is
> > completely up to the discretion of respective privileged application that
> > would create a BPF token.
>
> Token based privilege has a number of well understood weaknesses,
> none of which I see addressed here. I also have a real problem with

Can you please provide some more details about those weaknesses? Hard
to respond without knowing exactly what we are talking about.

> the notion of "trusted unprivileged" where trust is established by
> a user space application. Ignoring the possibility of malicious code
> for the moment, the opportunity for accidental privilege leakage is
> huge. It would be trivial (and tempting) to create a privileged BPF
> "shell" that would then be allowed to "trust" any application and
> run it with privilege by passing it a token.

Right now most BPF applications are running as real root in
production. Users have to trust such applications to not do anything
bad with their full root capabilities. How that trust is established
depends on specific production and organizational setups, and could
involve code review, audits, LSM, etc. So in that sense BPF token
doesn't make things worse. It actually allows us to improve the
situation by creating and sharing more restrictive BPF tokens that
limit which parts of the bpf() syscall can be used.

>
> >
> > The main motivation for BPF token is a desire to enable containerized
> > BPF applications to be used together with user namespaces. This is currently
> > impossible, as CAP_BPF, required for BPF subsystem usage, cannot be namespaced
> > or sandboxed, as a general rule. E.g., tracing BPF programs, thanks to BPF
> > helpers like bpf_probe_read_kernel() and bpf_probe_read_user() can safely read
> > arbitrary memory, and it's impossible to ensure that they only read memory of
> > processes belonging to any given namespace. This means that it's impossible to
> > have namespace-aware CAP_BPF capability, and as such another mechanism to
> > allow safe usage of BPF functionality is necessary. BPF token and its
> > delegation to a trusted unprivileged application is such a mechanism. The
> > kernel makes no assumptions about what "trusted" constitutes in any particular case, and
> > it's up to specific privileged applications and their surrounding
> > infrastructure to decide that. What kernel provides is a set of APIs to create
> > and tune BPF token, and pass it around to privileged BPF commands that are
> > creating new BPF objects like BPF programs, BPF maps, etc.
> >
> > Previous attempt at addressing this very same problem ([0]) attempted to
> > utilize authoritative LSM approach, but was conclusively rejected by upstream
> > LSM maintainers. BPF token concept is not changing anything about LSM
> > approach, but can be combined with LSM hooks for very fine-grained security
> > policy. Some ideas about making BPF token more convenient to use with LSM (in
> > particular custom BPF LSM programs) were briefly described in a recent LSF/MM/BPF
> > 2023 presentation ([1]). E.g., an ability to specify user-provided data
> > (context), which in combination with BPF LSM would allow implementing very
> > dynamic and fine-grained custom security policies on top of BPF token. In the
> > interest of minimizing API surface area discussions this is going to be
> > added in follow up patches, as it's not essential to the fundamental concept
> > of delegatable BPF token.
> >
> > It should be noted that BPF token is conceptually quite similar to the idea of
> > /dev/bpf device file, proposed by Song a while ago ([2]). The biggest
> > difference is the idea of using virtual anon_inode file to hold BPF token and
> > allowing multiple independent instances of them, each with its own set of
> > restrictions. BPF pinning solves the problem of exposing such BPF token
> > through file system (BPF FS, in this case) for cases where transferring FDs
> > over Unix domain sockets is not convenient. And also, crucially, BPF token
> > approach is not using any special stateful task-scoped flags. Instead, bpf()
> > syscall accepts token_fd parameters explicitly for each relevant BPF command.
> > This addresses main concerns brought up during the /dev/bpf discussion, and
> > fits better with overall BPF subsystem design.
> >
> > This patch set adds a basic minimum of functionality to make BPF token useful
> > and to discuss API and functionality. Currently only low-level libbpf APIs
> > support passing BPF token around, which allows testing kernel
> > functionality, but for the most part this is not sufficient for
> > real-world applications, which
> > typically use high-level libbpf APIs based on `struct bpf_object` type. This
> > was done with the intent to limit the size of patch set and concentrate on
> > mostly kernel-side changes. All the necessary plumbing for libbpf will be sent
> > as a separate follow-up patch set once kernel support makes it upstream.
> >
> > Another part that should happen once kernel-side BPF token is established, is
> > a set of conventions between applications (e.g., systemd), tools (e.g.,
> > bpftool), and libraries (e.g., libbpf) about sharing BPF tokens through BPF FS
> > at well-defined locations to allow applications to take advantage of this in
> > automatic fashion without explicit code changes on BPF application's side.
> > But I'd like to postpone this discussion to after BPF token concept lands.
> >
> >   [0] https://lore.kernel.org/bpf/20230412043300.360803-1-andrii@kernel.org/
> >   [1] http://vger.kernel.org/bpfconf2023_material/Trusted_unprivileged_BPF_LSFMM2023.pdf
> >   [2] https://lore.kernel.org/bpf/20190627201923.2589391-2-songliubraving@fb.com/
> >
> > Andrii Nakryiko (18):
> >   bpf: introduce BPF token object
> >   libbpf: add bpf_token_create() API
> >   selftests/bpf: add BPF_TOKEN_CREATE test
> >   bpf: move unprivileged checks into map_create() and bpf_prog_load()
> >   bpf: inline map creation logic in map_create() function
> >   bpf: centralize permissions checks for all BPF map types
> >   bpf: add BPF token support to BPF_MAP_CREATE command
> >   libbpf: add BPF token support to bpf_map_create() API
> >   selftests/bpf: add BPF token-enabled test for BPF_MAP_CREATE command
> >   bpf: add BPF token support to BPF_BTF_LOAD command
> >   libbpf: add BPF token support to bpf_btf_load() API
> >   selftests/bpf: add BPF token-enabled BPF_BTF_LOAD selftest
> >   bpf: keep BPF_PROG_LOAD permission checks clear of validations
> >   bpf: add BPF token support to BPF_PROG_LOAD command
> >   bpf: take into account BPF token when fetching helper protos
> >   bpf: consistenly use BPF token throughout BPF verifier logic
> >   libbpf: add BPF token support to bpf_prog_load() API
> >   selftests/bpf: add BPF token-enabled BPF_PROG_LOAD tests
> >
> >  drivers/media/rc/bpf-lirc.c                   |   2 +-
> >  include/linux/bpf.h                           |  66 ++-
> >  include/linux/filter.h                        |   2 +-
> >  include/uapi/linux/bpf.h                      |  74 +++
> >  kernel/bpf/Makefile                           |   2 +-
> >  kernel/bpf/arraymap.c                         |   2 +-
> >  kernel/bpf/bloom_filter.c                     |   3 -
> >  kernel/bpf/bpf_local_storage.c                |   3 -
> >  kernel/bpf/bpf_struct_ops.c                   |   3 -
> >  kernel/bpf/cgroup.c                           |   6 +-
> >  kernel/bpf/core.c                             |   3 +-
> >  kernel/bpf/cpumap.c                           |   4 -
> >  kernel/bpf/devmap.c                           |   3 -
> >  kernel/bpf/hashtab.c                          |   6 -
> >  kernel/bpf/helpers.c                          |   6 +-
> >  kernel/bpf/inode.c                            |  26 ++
> >  kernel/bpf/lpm_trie.c                         |   3 -
> >  kernel/bpf/queue_stack_maps.c                 |   4 -
> >  kernel/bpf/reuseport_array.c                  |   3 -
> >  kernel/bpf/stackmap.c                         |   3 -
> >  kernel/bpf/syscall.c                          | 429 ++++++++++++++----
> >  kernel/bpf/token.c                            | 141 ++++++
> >  kernel/bpf/verifier.c                         |  13 +-
> >  kernel/trace/bpf_trace.c                      |   2 +-
> >  net/core/filter.c                             |  36 +-
> >  net/core/sock_map.c                           |   4 -
> >  net/ipv4/bpf_tcp_ca.c                         |   2 +-
> >  net/netfilter/nf_bpf_link.c                   |   2 +-
> >  net/xdp/xskmap.c                              |   4 -
> >  tools/include/uapi/linux/bpf.h                |  74 +++
> >  tools/lib/bpf/bpf.c                           |  32 +-
> >  tools/lib/bpf/bpf.h                           |  24 +-
> >  tools/lib/bpf/libbpf.map                      |   1 +
> >  .../selftests/bpf/prog_tests/libbpf_probes.c  |   4 +
> >  .../selftests/bpf/prog_tests/libbpf_str.c     |   6 +
> >  .../testing/selftests/bpf/prog_tests/token.c  | 282 ++++++++++++
> >  .../bpf/prog_tests/unpriv_bpf_disabled.c      |   6 +-
> >  37 files changed, 1098 insertions(+), 188 deletions(-)
> >  create mode 100644 kernel/bpf/token.c
> >  create mode 100644 tools/testing/selftests/bpf/prog_tests/token.c
> >

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH RESEND bpf-next 01/18] bpf: introduce BPF token object
  2023-06-03  1:32   ` Stanislav Fomichev
@ 2023-06-05 20:56     ` Andrii Nakryiko
  2023-06-05 21:48       ` Stanislav Fomichev
  0 siblings, 1 reply; 36+ messages in thread
From: Andrii Nakryiko @ 2023-06-05 20:56 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Andrii Nakryiko, bpf, linux-security-module, keescook, brauner,
	lennart, cyphar, luto

On Fri, Jun 2, 2023 at 6:32 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> On 06/02, Andrii Nakryiko wrote:
> > Add a new kind of BPF kernel object, BPF token. BPF token is meant to
> > allow delegating privileged BPF functionality, like loading a BPF
> > program or creating a BPF map, from a privileged process to a *trusted*
> > unprivileged process, all while having a good amount of control over
> > which privileged operations can be done using the provided BPF token.
> >
> > This patch adds a new BPF_TOKEN_CREATE command to the bpf() syscall,
> > which allows creating a new BPF token object along with a set of
> > allowed commands. Currently only the BPF_TOKEN_CREATE command itself
> > can be delegated, but subsequent patches gradually add the ability to
> > delegate BPF_MAP_CREATE, BPF_BTF_LOAD, and BPF_PROG_LOAD commands.
> >
> > The above means that BPF token creation can be allowed by another
> > existing BPF token, if original privileged creator allowed that. New
> > derived BPF token cannot be more powerful than the original BPF token.
> >
> > BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS flag is added to allow an application
> > to express "all supported BPF commands should be allowed" without
> > worrying about which subset of the desired commands is actually
> > supported by a potentially outdated kernel. Allowing these semantics
> > doesn't seem to introduce any backwards compatibility issues and
> > doesn't introduce any risk of abusing or misusing the bit set field,
> > but it makes the backwards compatibility story for various
> > applications and tools much more straightforward, making it
> > unnecessary to probe support for each individual possible bit. This is
> > especially useful in follow-up patches where we add BPF map type and
> > prog type bit sets.
> >
> > Lastly, BPF token can be pinned in and retrieved from BPF FS, just like
> > progs, maps, BTFs, and links. This allows applications (like container
> > managers) to share BPF token with other applications through file system
> > just like any other BPF object, and further control access to it using
> > file system permissions, if desired.
> >
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > ---
> >  include/linux/bpf.h            |  34 +++++++++
> >  include/uapi/linux/bpf.h       |  42 ++++++++++++
> >  kernel/bpf/Makefile            |   2 +-
> >  kernel/bpf/inode.c             |  26 +++++++
> >  kernel/bpf/syscall.c           |  74 ++++++++++++++++++++
> >  kernel/bpf/token.c             | 122 +++++++++++++++++++++++++++++++++
> >  tools/include/uapi/linux/bpf.h |  40 +++++++++++
> >  7 files changed, 339 insertions(+), 1 deletion(-)
> >  create mode 100644 kernel/bpf/token.c
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index f58895830ada..fe6d51c3a5b1 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -51,6 +51,7 @@ struct module;
> >  struct bpf_func_state;
> >  struct ftrace_ops;
> >  struct cgroup;
> > +struct bpf_token;
> >
> >  extern struct idr btf_idr;
> >  extern spinlock_t btf_idr_lock;
> > @@ -1533,6 +1534,12 @@ struct bpf_link_primer {
> >       u32 id;
> >  };
> >
> > +struct bpf_token {
> > +     struct work_struct work;
> > +     atomic64_t refcnt;
> > +     u64 allowed_cmds;
> > +};
> > +
> >  struct bpf_struct_ops_value;
> >  struct btf_member;
> >
> > @@ -2077,6 +2084,15 @@ struct file *bpf_link_new_file(struct bpf_link *link, int *reserved_fd);
> >  struct bpf_link *bpf_link_get_from_fd(u32 ufd);
> >  struct bpf_link *bpf_link_get_curr_or_next(u32 *id);
> >
> > +void bpf_token_inc(struct bpf_token *token);
> > +void bpf_token_put(struct bpf_token *token);
> > +struct bpf_token *bpf_token_alloc(void);
> > +int bpf_token_new_fd(struct bpf_token *token);
> > +struct bpf_token *bpf_token_get_from_fd(u32 ufd);
> > +
> > +bool bpf_token_capable(const struct bpf_token *token, int cap);
> > +bool bpf_token_allow_cmd(const struct bpf_token *token, enum bpf_cmd cmd);
> > +
> >  int bpf_obj_pin_user(u32 ufd, int path_fd, const char __user *pathname);
> >  int bpf_obj_get_user(int path_fd, const char __user *pathname, int flags);
> >
> > @@ -2436,6 +2452,24 @@ static inline int bpf_obj_get_user(const char __user *pathname, int flags)
> >       return -EOPNOTSUPP;
> >  }
> >
> > +static inline void bpf_token_inc(struct bpf_token *token)
> > +{
> > +}
> > +
> > +static inline void bpf_token_put(struct bpf_token *token)
> > +{
> > +}
> > +
> > +static inline int bpf_token_new_fd(struct bpf_token *token)
> > +{
> > +     return -EOPNOTSUPP;
> > +}
> > +
> > +static inline struct bpf_token *bpf_token_get_from_fd(u32 ufd)
> > +{
> > +     return ERR_PTR(-EOPNOTSUPP);
> > +}
> > +
> >  static inline void __dev_flush(void)
> >  {
> >  }
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 9273c654743c..01ab79f2ad9f 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -846,6 +846,16 @@ union bpf_iter_link_info {
> >   *           Returns zero on success. On error, -1 is returned and *errno*
> >   *           is set appropriately.
> >   *
> > + * BPF_TOKEN_CREATE
> > + *   Description
> > + *           Create BPF token with embedded information about what
> > + *           BPF-related functionality is allowed. This BPF token can be
> > + *           passed as an extra parameter to various bpf() syscall commands.
> > + *
> > + *   Return
> > + *           A new file descriptor (a nonnegative integer), or -1 if an
> > + *           error occurred (in which case, *errno* is set appropriately).
> > + *
> >   * NOTES
> >   *   eBPF objects (maps and programs) can be shared between processes.
> >   *
> > @@ -900,6 +910,7 @@ enum bpf_cmd {
> >       BPF_ITER_CREATE,
> >       BPF_LINK_DETACH,
> >       BPF_PROG_BIND_MAP,
> > +     BPF_TOKEN_CREATE,
> >  };
> >
> >  enum bpf_map_type {
> > @@ -1169,6 +1180,24 @@ enum bpf_link_type {
> >   */
> >  #define BPF_F_KPROBE_MULTI_RETURN    (1U << 0)
> >
> > +/* BPF_TOKEN_CREATE command flags
> > + */
> > +enum {
> > +     /* Ignore unrecognized bits in token_create.allowed_cmds bit set.  If
> > +      * this flag is set, kernel won't return -EINVAL for a bit
> > +      * corresponding to a non-existing command or the one that doesn't
> > +      * support BPF token passing. This flag allows an application to request
> > +      * BPF token creation for a desired set of commands without worrying
> > +      * about older kernels not supporting some of the commands.
> > +      * Presumably, deployed applications will do separate feature
> > +      * detection and will avoid calling not-yet-supported bpf() commands,
> > +      * so this BPF token will work equally well both on older and newer
> > +      * kernels, even if some of the requested commands won't be BPF
> > +      * token-enabled.
> > +      */
> > +     BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS           = 1U << 0,
> > +};
> > +
> >  /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
> >   * the following extensions:
> >   *
> > @@ -1621,6 +1650,19 @@ union bpf_attr {
> >               __u32           flags;          /* extra flags */
> >       } prog_bind_map;
> >
> > +     struct { /* struct used by BPF_TOKEN_CREATE command */
> > +             __u32           flags;
> > +             __u32           token_fd;
> > +             /* a bit set of allowed bpf() syscall commands,
> > +              * e.g., (1ULL << BPF_TOKEN_CREATE) | (1ULL << BPF_PROG_LOAD)
> > +              * will allow creating derived BPF tokens and loading new BPF
> > +              * programs;
> > +              * see also BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS for its effect on
> > +              * validity checking of this set
> > +              */
> > +             __u64           allowed_cmds;
> > +     } token_create;
>
> Do you think this might eventually grow into something like
> "allow only lookup operation for this specific map"? If yes, maybe it

If it were strictly up to me, then no. I think fine-grained and
highly dynamic restrictions are more the (BPF) LSM domain. In practice
I envision that users will use BPF token to specify, in "broad
strokes", what BPF functionality can be used by applications, e.g.,
specifying that only networking programs and ARRAY/HASHMAP/SK_STORAGE
maps can be used while disallowing most of the tracing functionality.
Then, on top of that, LSM can be utilized to provide more nuanced (and,
as I said, more dynamic) controls over what operations an application
can perform on a BPF map.

If you look at the final set of token_create parameters, you can see
that I only aim to control and restrict BPF commands that create new
BPF objects (BTF, map, prog; we might do similar stuff for links
later, perhaps). In that sense BPF token controls "constructors",
while controlling operations on already created BPF objects (maps and
programs, for the most part) is a bit outside of BPF token's scope. I
also don't think we should do more fine-grained control of
construction parameters. E.g., I think it's too much to enforce which
attach_btf_id can be provided.

It's all code, though, so we could push it in any direction we want,
but in my view BPF token is about a somewhat static prescription of
what bpf() functionality is accessible to the application, broadly.
And LSM can complement it with more dynamic abilities.


> makes sense to separate token-create vs token-add-capability operations?
>
> BPF_TOKEN_CREATE would create a token that can't do anything. Then you
> would call a bunch of BPF_TOKEN_ALLOW with maybe op=SYSCALL_CMD
> value=BPF_TOKEN_CREATE.
>
> This will be more future-proof plus won't really depend on having a
> bitmask in the uapi. Then the users will be able to handle
> BPF_TOKEN_ALLOW{op=SYSCALL_CMD value=SOME_VALUE_NOT_SUPPORTED_ON_THIS_KERNEL}
> themselves (IOW, BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS won't be needed).

So I very much intentionally wanted to keep the BPF token immutable
once created. This makes it simple to reason about what a BPF token
allows and to guarantee that it won't change after the fact. It's
doable to make BPF token mutable and then "finalize" it (and
BPF_MAP_FREEZE stands as a good reminder of the races and complications
such a model introduces), creating a sort of builder-pattern API, but
that seems like overkill and an unnecessary complication.

But let me address that "more future-proof" part. What about our
extensible binary bpf_attr approach is not future-proof? In both cases
we'll have to specify as part of UAPI that there is a possibility to
restrict a set of bpf() syscall commands, right? In one case you'll do
it through multiple syscall invocations, while I chose a
straightforward bit mask. I could have done it as a pointer to an
array of `enum bpf_cmd` items, but I think it's extremely unlikely
we'll get to >64, ever. But even if we do, adding `u64 allowed_cmds2`
doesn't seem like a big deal to me.
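For reference, the 64-command limit is just the width of the bit set; a membership check along the lines of the quoted bpf_token_allow_cmd() declaration might look like this (illustrative sketch only, assuming commands map directly to bit positions; the struct and function names here are not the patch's):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the shape of the quoted `struct bpf_token`; only the
 * bit-set member matters for this sketch. */
struct bpf_token_sketch {
	uint64_t allowed_cmds;
};

/* A command is allowed iff its bit is set; anything beyond 64
 * commands would need an extra field (e.g. the hypothetical
 * `allowed_cmds2` mentioned in the discussion). */
static bool token_allow_cmd(const struct bpf_token_sketch *token,
			    unsigned int cmd)
{
	if (cmd >= 64)
		return false;
	return token->allowed_cmds & (1ULL << cmd);
}
```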

The main point though, both approaches are equally extensible. But
making BPF token mutable adds a lot of undesirable (IMO)
complications.


As for BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS, I'm thinking of dropping this
flag for simplicity.
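One more property from the commit message worth making explicit: since BPF_TOKEN_CREATE itself can be delegated, a derived token must be a subset of its parent. Whether the kernel masks or rejects extra bits is an implementation choice of the patch; either way the invariant reduces to a bitwise AND (illustrative sketch, not the patch's code):

```c
#include <assert.h>
#include <stdint.h>

/* Deriving a token from a parent: ANDing the requested set with the
 * parent's guarantees the child can never allow a command the parent
 * doesn't, i.e. "new derived BPF token cannot be more powerful than
 * the original BPF token". */
static uint64_t derive_allowed_cmds(uint64_t parent_cmds, uint64_t requested)
{
	return requested & parent_cmds;
}
```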

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH RESEND bpf-next 01/18] bpf: introduce BPF token object
  2023-06-05 20:56     ` Andrii Nakryiko
@ 2023-06-05 21:48       ` Stanislav Fomichev
  2023-06-05 23:00         ` Andrii Nakryiko
  0 siblings, 1 reply; 36+ messages in thread
From: Stanislav Fomichev @ 2023-06-05 21:48 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, bpf, linux-security-module, keescook, brauner,
	lennart, cyphar, luto

On 06/05, Andrii Nakryiko wrote:
> On Fri, Jun 2, 2023 at 6:32 PM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > On 06/02, Andrii Nakryiko wrote:
> > > Add a new kind of BPF kernel object, BPF token. BPF token is meant to
> > > allow delegating privileged BPF functionality, like loading a BPF
> > > program or creating a BPF map, from a privileged process to a *trusted*
> > > unprivileged process, all while having a good amount of control over
> > > which privileged operations can be done using the provided BPF token.
> > >
> > > This patch adds a new BPF_TOKEN_CREATE command to the bpf() syscall,
> > > which allows creating a new BPF token object along with a set of
> > > allowed commands. Currently only the BPF_TOKEN_CREATE command itself
> > > can be delegated, but subsequent patches gradually add the ability to
> > > delegate BPF_MAP_CREATE, BPF_BTF_LOAD, and BPF_PROG_LOAD commands.
> > >
> > > The above means that BPF token creation can be allowed by another
> > > existing BPF token, if original privileged creator allowed that. New
> > > derived BPF token cannot be more powerful than the original BPF token.
> > >
> > > BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS flag is added to allow an application
> > > to express "all supported BPF commands should be allowed" without
> > > worrying about which subset of the desired commands is actually
> > > supported by a potentially outdated kernel. Allowing these semantics
> > > doesn't seem to introduce any backwards compatibility issues and
> > > doesn't introduce any risk of abusing or misusing the bit set field,
> > > but it makes the backwards compatibility story for various
> > > applications and tools much more straightforward, making it
> > > unnecessary to probe support for each individual possible bit. This is
> > > especially useful in follow-up patches where we add BPF map type and
> > > prog type bit sets.
> > >
> > > Lastly, BPF token can be pinned in and retrieved from BPF FS, just like
> > > progs, maps, BTFs, and links. This allows applications (like container
> > > managers) to share BPF token with other applications through file system
> > > just like any other BPF object, and further control access to it using
> > > file system permissions, if desired.
> > >
> > > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > > ---
> > >  include/linux/bpf.h            |  34 +++++++++
> > >  include/uapi/linux/bpf.h       |  42 ++++++++++++
> > >  kernel/bpf/Makefile            |   2 +-
> > >  kernel/bpf/inode.c             |  26 +++++++
> > >  kernel/bpf/syscall.c           |  74 ++++++++++++++++++++
> > >  kernel/bpf/token.c             | 122 +++++++++++++++++++++++++++++++++
> > >  tools/include/uapi/linux/bpf.h |  40 +++++++++++
> > >  7 files changed, 339 insertions(+), 1 deletion(-)
> > >  create mode 100644 kernel/bpf/token.c
> > >
> > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > index f58895830ada..fe6d51c3a5b1 100644
> > > --- a/include/linux/bpf.h
> > > +++ b/include/linux/bpf.h
> > > @@ -51,6 +51,7 @@ struct module;
> > >  struct bpf_func_state;
> > >  struct ftrace_ops;
> > >  struct cgroup;
> > > +struct bpf_token;
> > >
> > >  extern struct idr btf_idr;
> > >  extern spinlock_t btf_idr_lock;
> > > @@ -1533,6 +1534,12 @@ struct bpf_link_primer {
> > >       u32 id;
> > >  };
> > >
> > > +struct bpf_token {
> > > +     struct work_struct work;
> > > +     atomic64_t refcnt;
> > > +     u64 allowed_cmds;
> > > +};
> > > +
> > >  struct bpf_struct_ops_value;
> > >  struct btf_member;
> > >
> > > @@ -2077,6 +2084,15 @@ struct file *bpf_link_new_file(struct bpf_link *link, int *reserved_fd);
> > >  struct bpf_link *bpf_link_get_from_fd(u32 ufd);
> > >  struct bpf_link *bpf_link_get_curr_or_next(u32 *id);
> > >
> > > +void bpf_token_inc(struct bpf_token *token);
> > > +void bpf_token_put(struct bpf_token *token);
> > > +struct bpf_token *bpf_token_alloc(void);
> > > +int bpf_token_new_fd(struct bpf_token *token);
> > > +struct bpf_token *bpf_token_get_from_fd(u32 ufd);
> > > +
> > > +bool bpf_token_capable(const struct bpf_token *token, int cap);
> > > +bool bpf_token_allow_cmd(const struct bpf_token *token, enum bpf_cmd cmd);
> > > +
> > >  int bpf_obj_pin_user(u32 ufd, int path_fd, const char __user *pathname);
> > >  int bpf_obj_get_user(int path_fd, const char __user *pathname, int flags);
> > >
> > > @@ -2436,6 +2452,24 @@ static inline int bpf_obj_get_user(const char __user *pathname, int flags)
> > >       return -EOPNOTSUPP;
> > >  }
> > >
> > > +static inline void bpf_token_inc(struct bpf_token *token)
> > > +{
> > > +}
> > > +
> > > +static inline void bpf_token_put(struct bpf_token *token)
> > > +{
> > > +}
> > > +
> > > +static inline int bpf_token_new_fd(struct bpf_token *token)
> > > +{
> > > +     return -EOPNOTSUPP;
> > > +}
> > > +
> > > +static inline struct bpf_token *bpf_token_get_from_fd(u32 ufd)
> > > +{
> > > +     return ERR_PTR(-EOPNOTSUPP);
> > > +}
> > > +
> > >  static inline void __dev_flush(void)
> > >  {
> > >  }
> > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > index 9273c654743c..01ab79f2ad9f 100644
> > > --- a/include/uapi/linux/bpf.h
> > > +++ b/include/uapi/linux/bpf.h
> > > @@ -846,6 +846,16 @@ union bpf_iter_link_info {
> > >   *           Returns zero on success. On error, -1 is returned and *errno*
> > >   *           is set appropriately.
> > >   *
> > > + * BPF_TOKEN_CREATE
> > > + *   Description
> > > + *           Create BPF token with embedded information about what
> > > + *           BPF-related functionality is allowed. This BPF token can be
> > > + *           passed as an extra parameter to various bpf() syscall commands.
> > > + *
> > > + *   Return
> > > + *           A new file descriptor (a nonnegative integer), or -1 if an
> > > + *           error occurred (in which case, *errno* is set appropriately).
> > > + *
> > >   * NOTES
> > >   *   eBPF objects (maps and programs) can be shared between processes.
> > >   *
> > > @@ -900,6 +910,7 @@ enum bpf_cmd {
> > >       BPF_ITER_CREATE,
> > >       BPF_LINK_DETACH,
> > >       BPF_PROG_BIND_MAP,
> > > +     BPF_TOKEN_CREATE,
> > >  };
> > >
> > >  enum bpf_map_type {
> > > @@ -1169,6 +1180,24 @@ enum bpf_link_type {
> > >   */
> > >  #define BPF_F_KPROBE_MULTI_RETURN    (1U << 0)
> > >
> > > +/* BPF_TOKEN_CREATE command flags
> > > + */
> > > +enum {
> > > +     /* Ignore unrecognized bits in token_create.allowed_cmds bit set.  If
> > > +      * this flag is set, kernel won't return -EINVAL for a bit
> > > +      * corresponding to a non-existing command or the one that doesn't
> > > +      * support BPF token passing. This flag allows an application to request
> > > +      * BPF token creation for a desired set of commands without worrying
> > > +      * about older kernels not supporting some of the commands.
> > > +      * Presumably, deployed applications will do separate feature
> > > +      * detection and will avoid calling not-yet-supported bpf() commands,
> > > +      * so this BPF token will work equally well both on older and newer
> > > +      * kernels, even if some of the requested commands won't be BPF
> > > +      * token-enabled.
> > > +      */
> > > +     BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS           = 1U << 0,
> > > +};
> > > +
> > >  /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
> > >   * the following extensions:
> > >   *
> > > @@ -1621,6 +1650,19 @@ union bpf_attr {
> > >               __u32           flags;          /* extra flags */
> > >       } prog_bind_map;
> > >
> > > +     struct { /* struct used by BPF_TOKEN_CREATE command */
> > > +             __u32           flags;
> > > +             __u32           token_fd;
> > > +             /* a bit set of allowed bpf() syscall commands,
> > > +              * e.g., (1ULL << BPF_TOKEN_CREATE) | (1ULL << BPF_PROG_LOAD)
> > > +              * will allow creating derived BPF tokens and loading new BPF
> > > +              * programs;
> > > +              * see also BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS for its effect on
> > > +              * validity checking of this set
> > > +              */
> > > +             __u64           allowed_cmds;
> > > +     } token_create;
> >
> > Do you think this might eventually grow into something like
> > "allow only lookup operation for this specific map"? If yes, maybe it
> 
> If it were strictly up to me, then no. I think fine-granular and
> highly-dynamic restrictions are more the (BPF) LSM domain. In practice
> I envision that users will use a combination of BPF token to specify
> what BPF functionality can be used by applications in "broad strokes",
> e.g., specifying that only networking programs and
> ARRAY/HASHMAP/SK_STORAGE maps can be used, but disallow most of
> tracing functionality. And then on top of that LSM can be utilized to
> provide more nuanced (and as I said, more dynamic) controls over what
> operations over BPF map application can perform.

In this case, why not fully embrace lsm here?

Maybe all we really need is:
- a BPF_TOKEN_CREATE command (without any granularity)
- a holder of the token (passed as you're suggesting, via new uapi
  field) would be equivalent to capable(CAP_BPF)
- security_bpf() will provide fine-grained control
- extend landlock to provide coarse-grained policy (and later
  finer granularity)

?

Or we still want the token to carry the policy somehow? (why? because
of the filesystem pinning?)

> If you look at the final set of token_create parameters, you can see
> that I only aim to control and restrict BPF commands that are creating
> new BPF objects (BTF, map, prog; we might do similar stuff for links
> later, perhaps) only. In that sense BPF token controls "constructors",
> while if users want to control operation on BPF objects that were
> created (maps and programs, for the most part), I see this a bit
> outside of BPF token scope. I also don't think we should do more
> fine-grained control of construction parameters. E.g., I think it's
> too much to enforce which attach_btf_id can be provided.
> 
> It's all code, though, so we could push it in any direction we want,
> but in my view BPF token is about a somewhat static prescription of
> what bpf() functionality is accessible to the application, broadly.
> And LSM can complement it with more dynamic abilities.

Are you planning to follow up with the other, non-constructing commands?
Somebody here recently was proposing to namespacify CAP_BPF; something
like a read-only-capable token should, in theory, solve it?

> > makes sense to separate token-create vs token-add-capability operations?
> >
> > BPF_TOKEN_CREATE would create a token that can't do anything. Then you
> > would call a bunch of BPF_TOKEN_ALLOW with maybe op=SYSCALL_CMD
> > value=BPF_TOKEN_CREATE.
> >
> > This will be more future-proof plus won't really depend on having a
> > bitmask in the uapi. Then the users will be able to handle
> > BPF_TOKEN_ALLOW{op=SYSCALL_CMD value=SOME_VALUE_NOT_SUPPORTED_ON_THIS_KERNEL}
> > themselves (IOW, BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS won't be needed).
> 
> So I very much intentionally wanted to keep the BPF token immutable
> once created. This makes it simple to reason about what BPF token
> allows and guarantee that it won't change after the fact. It's doable
> to make BPF token mutable and then "finalize" it (and BPF_MAP_FREEZE
> stands as a good reminder of races and complications such model
> introduces), creating a sort of builder pattern APIs, but that seems
> like an overkill and unnecessary complication.
> 
> But let me address that "more future-proof" part. What about our
> binary bpf_attr extensible approach is not future proof? In both cases
> we'll have to specify as part of UAPI that there is a possibility to
> restrict a set of bpf() syscall commands, right? In one case you'll do
> it through multiple syscall invocations, while I chose a
> straightforward bit mask. I could have done it as a pointer to an
> array of `enum bpf_cmd` items, but I think it's extremely unlikely
> we'll get to >64, ever. But even if we do, adding `u64 allowed_cmds2`
> doesn't seem like a big deal to me.
> 
> The main point though, both approaches are equally extensible. But
> making BPF token mutable adds a lot of undesirable (IMO)
> complications.
> 
> 
> As for BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS, I'm thinking of dropping such
> flags for simplicity.

Ack, I just hope we're not inventing another landlock here. As mentioned
above, maybe doing simple BPF_TOKEN_CREATE + pushing the rest of the
policy into lsm/landlock is a good alternative, idk.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH RESEND bpf-next 00/18] BPF token
  2023-06-05 20:41   ` Andrii Nakryiko
@ 2023-06-05 22:26     ` Casey Schaufler
  2023-06-05 23:12       ` Andrii Nakryiko
  0 siblings, 1 reply; 36+ messages in thread
From: Casey Schaufler @ 2023-06-05 22:26 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, bpf, linux-security-module, keescook, brauner,
	lennart, cyphar, luto, Casey Schaufler

On 6/5/2023 1:41 PM, Andrii Nakryiko wrote:
> On Fri, Jun 2, 2023 at 8:55 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
>> On 6/2/2023 7:59 AM, Andrii Nakryiko wrote:
>>> *Resending with trimmed CC list because original version didn't make it to
>>> the mailing list.*
>>>
>>> This patch set introduces a new BPF object, BPF token, which allows delegating
>>> a subset of BPF functionality from a privileged system-wide daemon (e.g.,
>>> systemd or any other container manager) to a *trusted* unprivileged
>>> application. Trust is the key here. This functionality is not about allowing
>>> unconditional unprivileged BPF usage. Establishing trust, though, is
>>> completely up to the discretion of the respective privileged application that
>>> would create a BPF token.
>> Token based privilege has a number of well understood weaknesses,
>> none of which I see addressed here. I also have a real problem with
> Can you please provide some more details about those weaknesses? Hard
> to respond without knowing exactly what we are talking about.

Privileged Process (PP) sends a Token to Trusted Process (TP).
TP sends the Token along to Untrusted Process, which performs nefarious
deeds.

Privileged Process (PP) sends a Token to Trusted Process (TP).
TP uses Token, and then saves it in its toolbox. PP later sends
TP a different Token. TP realizes that with the combination of
Tokens it now has it can do considerably more than what PP
intended in either of the cases it sent Token for. TP performs
nefarious deeds.

Granted, in both cases TP does not deserve to be trusted. 
Because TP does not run with privilege of its own, it is not
treated with the same level of caution as it would be if it did.

Privileged Process (PP) sends a Token to what it thinks is a Trusted
Process (TP) but is in fact an Imposter Process (IP) that has been
enabled on the system using any number of K33L techniques.

I don't see anything that ensures that PP communicates Tokens only
to TP, nor any criteria for "trust" are met.

Those are the issues I'm most familiar with, although I believe
there are others.

>> the notion of "trusted unprivileged" where trust is established by
>> a user space application. Ignoring the possibility of malicious code
>> for the moment, the opportunity for accidental privilege leakage is
>> huge. It would be trivial (and tempting) to create a privileged BPF
>> "shell" that would then be allowed to "trust" any application and
>> run it with privilege by passing it a token.
> Right now most BPF applications are running as real root in
> production. Users have to trust such applications to not do anything
> bad with their full root capabilities. How it is done depends on
> specific production and organizational setups, and could be code
> reviewing, audits, LSM, etc. So in that sense BPF token doesn't make
> things worse. And it actually allows us to improve the situation by
> creating and sharing more restrictive BPF tokens that limit what bpf()
> syscall parts are allowed to be used.
>
>>> The main motivation for BPF token is a desire to enable containerized
>>> BPF applications to be used together with user namespaces. This is currently
>>> impossible, as CAP_BPF, required for BPF subsystem usage, cannot be namespaced
>>> or sandboxed, as a general rule. E.g., tracing BPF programs, thanks to BPF
>>> helpers like bpf_probe_read_kernel() and bpf_probe_read_user() can safely read
>>> arbitrary memory, and it's impossible to ensure that they only read memory of
>>> processes belonging to any given namespace. This means that it's impossible to
>>> have namespace-aware CAP_BPF capability, and as such another mechanism to
>>> allow safe usage of BPF functionality is necessary. BPF token and delegation
>>> of it to trusted unprivileged applications is such a mechanism. The kernel makes
>>> no assumption about what "trusted" constitutes in any particular case, and
>>> it's up to specific privileged applications and their surrounding
>>> infrastructure to decide that. What kernel provides is a set of APIs to create
>>> and tune BPF token, and pass it around to privileged BPF commands that are
>>> creating new BPF objects like BPF programs, BPF maps, etc.
>>>
>>> Previous attempt at addressing this very same problem ([0]) attempted to
>>> utilize authoritative LSM approach, but was conclusively rejected by upstream
>>> LSM maintainers. BPF token concept is not changing anything about LSM
>>> approach, but can be combined with LSM hooks for very fine-grained security
>>> policy. Some ideas about making BPF token more convenient to use with LSM (in
>>> particular custom BPF LSM programs) were briefly described in a recent LSF/MM/BPF
>>> 2023 presentation ([1]). E.g., an ability to specify user-provided data
>>> (context), which in combination with BPF LSM would allow implementing very
>>> dynamic and fine-granular custom security policies on top of BPF token. In the
>>> interest of minimizing API surface area discussions this is going to be
>>> added in follow up patches, as it's not essential to the fundamental concept
>>> of delegatable BPF token.
>>>
>>> It should be noted that BPF token is conceptually quite similar to the idea of
>>> /dev/bpf device file, proposed by Song a while ago ([2]). The biggest
>>> difference is the idea of using virtual anon_inode file to hold BPF token and
>>> allowing multiple independent instances of them, each with its own set of
>>> restrictions. BPF pinning solves the problem of exposing such BPF token
>>> through file system (BPF FS, in this case) for cases where transferring FDs
>>> over Unix domain sockets is not convenient. And also, crucially, BPF token
>>> approach is not using any special stateful task-scoped flags. Instead, bpf()
>>> syscall accepts token_fd parameters explicitly for each relevant BPF command.
>>> This addresses main concerns brought up during the /dev/bpf discussion, and
>>> fits better with overall BPF subsystem design.
>>>
>>> This patch set adds a basic minimum of functionality to make BPF token useful
>>> and to discuss API and functionality. Currently only low-level libbpf APIs
>>> support passing BPF token around, allowing testing of kernel functionality, but
>>> this is for the most part not sufficient for real-world applications, which
>>> typically use high-level libbpf APIs based on `struct bpf_object` type. This
>>> was done with the intent to limit the size of patch set and concentrate on
>>> mostly kernel-side changes. All the necessary plumbing for libbpf will be sent
>>> as a separate follow-up patch set once kernel support makes it upstream.
>>>
>>> Another part that should happen once kernel-side BPF token is established, is
>>> a set of conventions between applications (e.g., systemd), tools (e.g.,
>>> bpftool), and libraries (e.g., libbpf) about sharing BPF tokens through BPF FS
>>> at well-defined locations to allow applications to take advantage of this in
>>> an automatic fashion without explicit code changes on the BPF application's side.
>>> But I'd like to postpone this discussion to after BPF token concept lands.
>>>
>>>   [0] https://lore.kernel.org/bpf/20230412043300.360803-1-andrii@kernel.org/
>>>   [1] http://vger.kernel.org/bpfconf2023_material/Trusted_unprivileged_BPF_LSFMM2023.pdf
>>>   [2] https://lore.kernel.org/bpf/20190627201923.2589391-2-songliubraving@fb.com/
>>>
>>> Andrii Nakryiko (18):
>>>   bpf: introduce BPF token object
>>>   libbpf: add bpf_token_create() API
>>>   selftests/bpf: add BPF_TOKEN_CREATE test
>>>   bpf: move unprivileged checks into map_create() and bpf_prog_load()
>>>   bpf: inline map creation logic in map_create() function
>>>   bpf: centralize permissions checks for all BPF map types
>>>   bpf: add BPF token support to BPF_MAP_CREATE command
>>>   libbpf: add BPF token support to bpf_map_create() API
>>>   selftests/bpf: add BPF token-enabled test for BPF_MAP_CREATE command
>>>   bpf: add BPF token support to BPF_BTF_LOAD command
>>>   libbpf: add BPF token support to bpf_btf_load() API
>>>   selftests/bpf: add BPF token-enabled BPF_BTF_LOAD selftest
>>>   bpf: keep BPF_PROG_LOAD permission checks clear of validations
>>>   bpf: add BPF token support to BPF_PROG_LOAD command
>>>   bpf: take into account BPF token when fetching helper protos
>>>   bpf: consistently use BPF token throughout BPF verifier logic
>>>   libbpf: add BPF token support to bpf_prog_load() API
>>>   selftests/bpf: add BPF token-enabled BPF_PROG_LOAD tests
>>>
>>>  drivers/media/rc/bpf-lirc.c                   |   2 +-
>>>  include/linux/bpf.h                           |  66 ++-
>>>  include/linux/filter.h                        |   2 +-
>>>  include/uapi/linux/bpf.h                      |  74 +++
>>>  kernel/bpf/Makefile                           |   2 +-
>>>  kernel/bpf/arraymap.c                         |   2 +-
>>>  kernel/bpf/bloom_filter.c                     |   3 -
>>>  kernel/bpf/bpf_local_storage.c                |   3 -
>>>  kernel/bpf/bpf_struct_ops.c                   |   3 -
>>>  kernel/bpf/cgroup.c                           |   6 +-
>>>  kernel/bpf/core.c                             |   3 +-
>>>  kernel/bpf/cpumap.c                           |   4 -
>>>  kernel/bpf/devmap.c                           |   3 -
>>>  kernel/bpf/hashtab.c                          |   6 -
>>>  kernel/bpf/helpers.c                          |   6 +-
>>>  kernel/bpf/inode.c                            |  26 ++
>>>  kernel/bpf/lpm_trie.c                         |   3 -
>>>  kernel/bpf/queue_stack_maps.c                 |   4 -
>>>  kernel/bpf/reuseport_array.c                  |   3 -
>>>  kernel/bpf/stackmap.c                         |   3 -
>>>  kernel/bpf/syscall.c                          | 429 ++++++++++++++----
>>>  kernel/bpf/token.c                            | 141 ++++++
>>>  kernel/bpf/verifier.c                         |  13 +-
>>>  kernel/trace/bpf_trace.c                      |   2 +-
>>>  net/core/filter.c                             |  36 +-
>>>  net/core/sock_map.c                           |   4 -
>>>  net/ipv4/bpf_tcp_ca.c                         |   2 +-
>>>  net/netfilter/nf_bpf_link.c                   |   2 +-
>>>  net/xdp/xskmap.c                              |   4 -
>>>  tools/include/uapi/linux/bpf.h                |  74 +++
>>>  tools/lib/bpf/bpf.c                           |  32 +-
>>>  tools/lib/bpf/bpf.h                           |  24 +-
>>>  tools/lib/bpf/libbpf.map                      |   1 +
>>>  .../selftests/bpf/prog_tests/libbpf_probes.c  |   4 +
>>>  .../selftests/bpf/prog_tests/libbpf_str.c     |   6 +
>>>  .../testing/selftests/bpf/prog_tests/token.c  | 282 ++++++++++++
>>>  .../bpf/prog_tests/unpriv_bpf_disabled.c      |   6 +-
>>>  37 files changed, 1098 insertions(+), 188 deletions(-)
>>>  create mode 100644 kernel/bpf/token.c
>>>  create mode 100644 tools/testing/selftests/bpf/prog_tests/token.c
>>>


* Re: [PATCH RESEND bpf-next 01/18] bpf: introduce BPF token object
  2023-06-05 21:48       ` Stanislav Fomichev
@ 2023-06-05 23:00         ` Andrii Nakryiko
  2023-06-06 16:58           ` Stanislav Fomichev
  0 siblings, 1 reply; 36+ messages in thread
From: Andrii Nakryiko @ 2023-06-05 23:00 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Andrii Nakryiko, bpf, linux-security-module, keescook, brauner,
	lennart, cyphar, luto

On Mon, Jun 5, 2023 at 2:48 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> On 06/05, Andrii Nakryiko wrote:
> > On Fri, Jun 2, 2023 at 6:32 PM Stanislav Fomichev <sdf@google.com> wrote:
> > >
> > > On 06/02, Andrii Nakryiko wrote:
> > > > Add a new kind of BPF kernel object, BPF token. BPF token is meant to
> > > > allow delegating privileged BPF functionality, like loading a BPF
> > > > program or creating a BPF map, from a privileged process to a *trusted*
> > > > unprivileged process, all while having a good amount of control over which
> > > > privileged operations could be done using the provided BPF token.
> > > >
> > > > This patch adds a new BPF_TOKEN_CREATE command to the bpf() syscall, which
> > > > allows creating a new BPF token object along with a set of allowed
> > > > commands. Currently only BPF_TOKEN_CREATE command itself can be
> > > > delegated, but other patches gradually add ability to delegate
> > > > BPF_MAP_CREATE, BPF_BTF_LOAD, and BPF_PROG_LOAD commands.
> > > >
> > > > The above means that BPF token creation can be allowed by another
> > > > existing BPF token, if the original privileged creator allowed that. New
> > > > derived BPF token cannot be more powerful than the original BPF token.
> > > >
> > > > BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS flag is added to allow an application to
> > > > express "all supported BPF commands should be allowed" without worrying
> > > > about which subset of desired commands is actually supported by a
> > > > potentially outdated kernel. Allowing these semantics doesn't seem to
> > > > introduce any backwards compatibility issues and doesn't introduce any
> > > > risk of abusing or misusing bit set field, but makes backwards
> > > > compatibility story for various applications and tools much more
> > > > straightforward, making it unnecessary to probe support for each
> > > > individual possible bit. This is especially useful in follow up patches
> > > > where we add BPF map types and prog types bit sets.
> > > >
> > > > Lastly, BPF token can be pinned in and retrieved from BPF FS, just like
> > > > progs, maps, BTFs, and links. This allows applications (like container
> > > > managers) to share BPF token with other applications through file system
> > > > just like any other BPF object, and further control access to it using
> > > > file system permissions, if desired.
> > > >
> > > > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > > > ---
> > > >  include/linux/bpf.h            |  34 +++++++++
> > > >  include/uapi/linux/bpf.h       |  42 ++++++++++++
> > > >  kernel/bpf/Makefile            |   2 +-
> > > >  kernel/bpf/inode.c             |  26 +++++++
> > > >  kernel/bpf/syscall.c           |  74 ++++++++++++++++++++
> > > >  kernel/bpf/token.c             | 122 +++++++++++++++++++++++++++++++++
> > > >  tools/include/uapi/linux/bpf.h |  40 +++++++++++
> > > >  7 files changed, 339 insertions(+), 1 deletion(-)
> > > >  create mode 100644 kernel/bpf/token.c
> > > >
> > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > > index f58895830ada..fe6d51c3a5b1 100644
> > > > --- a/include/linux/bpf.h
> > > > +++ b/include/linux/bpf.h
> > > > @@ -51,6 +51,7 @@ struct module;
> > > >  struct bpf_func_state;
> > > >  struct ftrace_ops;
> > > >  struct cgroup;
> > > > +struct bpf_token;
> > > >
> > > >  extern struct idr btf_idr;
> > > >  extern spinlock_t btf_idr_lock;
> > > > @@ -1533,6 +1534,12 @@ struct bpf_link_primer {
> > > >       u32 id;
> > > >  };
> > > >
> > > > +struct bpf_token {
> > > > +     struct work_struct work;
> > > > +     atomic64_t refcnt;
> > > > +     u64 allowed_cmds;
> > > > +};
> > > > +
> > > >  struct bpf_struct_ops_value;
> > > >  struct btf_member;
> > > >
> > > > @@ -2077,6 +2084,15 @@ struct file *bpf_link_new_file(struct bpf_link *link, int *reserved_fd);
> > > >  struct bpf_link *bpf_link_get_from_fd(u32 ufd);
> > > >  struct bpf_link *bpf_link_get_curr_or_next(u32 *id);
> > > >
> > > > +void bpf_token_inc(struct bpf_token *token);
> > > > +void bpf_token_put(struct bpf_token *token);
> > > > +struct bpf_token *bpf_token_alloc(void);
> > > > +int bpf_token_new_fd(struct bpf_token *token);
> > > > +struct bpf_token *bpf_token_get_from_fd(u32 ufd);
> > > > +
> > > > +bool bpf_token_capable(const struct bpf_token *token, int cap);
> > > > +bool bpf_token_allow_cmd(const struct bpf_token *token, enum bpf_cmd cmd);
> > > > +
> > > >  int bpf_obj_pin_user(u32 ufd, int path_fd, const char __user *pathname);
> > > >  int bpf_obj_get_user(int path_fd, const char __user *pathname, int flags);
> > > >
> > > > @@ -2436,6 +2452,24 @@ static inline int bpf_obj_get_user(const char __user *pathname, int flags)
> > > >       return -EOPNOTSUPP;
> > > >  }
> > > >
> > > > +static inline void bpf_token_inc(struct bpf_token *token)
> > > > +{
> > > > +}
> > > > +
> > > > +static inline void bpf_token_put(struct bpf_token *token)
> > > > +{
> > > > +}
> > > > +
> > > > +static inline int bpf_token_new_fd(struct bpf_token *token)
> > > > +{
> > > > +     return -EOPNOTSUPP;
> > > > +}
> > > > +
> > > > +static inline struct bpf_token *bpf_token_get_from_fd(u32 ufd)
> > > > +{
> > > > +     return ERR_PTR(-EOPNOTSUPP);
> > > > +}
> > > > +
> > > >  static inline void __dev_flush(void)
> > > >  {
> > > >  }
> > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > > index 9273c654743c..01ab79f2ad9f 100644
> > > > --- a/include/uapi/linux/bpf.h
> > > > +++ b/include/uapi/linux/bpf.h
> > > > @@ -846,6 +846,16 @@ union bpf_iter_link_info {
> > > >   *           Returns zero on success. On error, -1 is returned and *errno*
> > > >   *           is set appropriately.
> > > >   *
> > > > + * BPF_TOKEN_CREATE
> > > > + *   Description
> > > > + *           Create BPF token with embedded information about what
> > > > + *           BPF-related functionality is allowed. This BPF token can be
> > > > + *           passed as an extra parameter to various bpf() syscall commands.
> > > > + *
> > > > + *   Return
> > > > + *           A new file descriptor (a nonnegative integer), or -1 if an
> > > > + *           error occurred (in which case, *errno* is set appropriately).
> > > > + *
> > > >   * NOTES
> > > >   *   eBPF objects (maps and programs) can be shared between processes.
> > > >   *
> > > > @@ -900,6 +910,7 @@ enum bpf_cmd {
> > > >       BPF_ITER_CREATE,
> > > >       BPF_LINK_DETACH,
> > > >       BPF_PROG_BIND_MAP,
> > > > +     BPF_TOKEN_CREATE,
> > > >  };
> > > >
> > > >  enum bpf_map_type {
> > > > @@ -1169,6 +1180,24 @@ enum bpf_link_type {
> > > >   */
> > > >  #define BPF_F_KPROBE_MULTI_RETURN    (1U << 0)
> > > >
> > > > +/* BPF_TOKEN_CREATE command flags
> > > > + */
> > > > +enum {
> > > > +     /* Ignore unrecognized bits in token_create.allowed_cmds bit set.  If
> > > > +      * this flag is set, kernel won't return -EINVAL for a bit
> > > > +      * corresponding to a non-existing command or the one that doesn't
> > > > +      * support BPF token passing. This flag allows an application to request
> > > > +      * BPF token creation for a desired set of commands without worrying
> > > > +      * about older kernels not supporting some of the commands.
> > > > +      * Presumably, deployed applications will do separate feature
> > > > +      * detection and will avoid calling not-yet-supported bpf() commands,
> > > > +      * so this BPF token will work equally well both on older and newer
> > > > +      * kernels, even if some of the requested commands won't be BPF
> > > > +      * token-enabled.
> > > > +      */
> > > > +     BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS           = 1U << 0,
> > > > +};
> > > > +
> > > >  /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
> > > >   * the following extensions:
> > > >   *
> > > > @@ -1621,6 +1650,19 @@ union bpf_attr {
> > > >               __u32           flags;          /* extra flags */
> > > >       } prog_bind_map;
> > > >
> > > > +     struct { /* struct used by BPF_TOKEN_CREATE command */
> > > > +             __u32           flags;
> > > > +             __u32           token_fd;
> > > > +             /* a bit set of allowed bpf() syscall commands,
> > > > +              * e.g., (1ULL << BPF_TOKEN_CREATE) | (1ULL << BPF_PROG_LOAD)
> > > > +              * will allow creating derived BPF tokens and loading new BPF
> > > > +              * programs;
> > > > +              * see also BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS for its effect on
> > > > +              * validity checking of this set
> > > > +              */
> > > > +             __u64           allowed_cmds;
> > > > +     } token_create;
> > >
> > > Do you think this might eventually grow into something like
> > > "allow only lookup operation for this specific map"? If yes, maybe it
> >
> > If it were strictly up to me, then no. I think fine-granular and
> > highly-dynamic restrictions are more the (BPF) LSM domain. In practice
> > I envision that users will use a combination of BPF token to specify
> > what BPF functionality can be used by applications in "broad strokes",
> > e.g., specifying that only networking programs and
> > ARRAY/HASHMAP/SK_STORAGE maps can be used, but disallow most of
> > tracing functionality. And then on top of that LSM can be utilized to
> > provide more nuanced (and as I said, more dynamic) controls over what
> > operations over BPF map application can perform.
>
> In this case, why not fully embrace lsm here?
>
> Maybe all we really need is:
> - a BPF_TOKEN_CREATE command (without any granularity)
> - a holder of the token (passed as you're suggesting, via new uapi
>   field) would be equivalent to capable(CAP_BPF)
> - security_bpf() will provide fine-grained control
> - extend landlock to provide coarse-grained policy (and later
>   finer granularity)
>
> ?

That's one option, yes. But I got the feeling at LSF/MM/BPF that
people are worried about having a BPF token that allows the entire
bpf() syscall with no control. I think this coarse-grained control
strikes a reasonable and pragmatic balance, but I'm open to just going
all in. :)

>
> Or we still want the token to carry the policy somehow? (why? because
> of the filesystem pinning?)

I think it's nice to be able to say "this application can only do
networking programs and no fancy data structures" with purely BPF
token, with no BPF LSM involved. Or on the other hand, "just tracing,
no networking" for another class of programs.

LSM and BPF LSM is definitely a more logistical hurdle, so if it can
be avoided in some scenarios, that seems like a win.


>
> > If you look at the final set of token_create parameters, you can see
> > that I only aim to control and restrict BPF commands that are creating
> > new BPF objects (BTF, map, prog; we might do similar stuff for links
> > later, perhaps) only. In that sense BPF token controls "constructors",
> > while if users want to control operation on BPF objects that were
> > created (maps and programs, for the most part), I see this a bit
> > outside of BPF token scope. I also don't think we should do more
> > fine-grained control of construction parameters. E.g., I think it's
> > too much to enforce which attach_btf_id can be provided.
> >
> > It's all code, though, so we could push it in any direction we want,
> > but in my view BPF token is about a somewhat static prescription of
> > what bpf() functionality is accessible to the application, broadly.
> > And LSM can complement it with more dynamic abilities.
>
> Are you planning to follow up with the other, non-constructing commands?
> Somebody here recently was proposing to namespacify CAP_BPF; something
> like a read-only-capable token should, in theory, solve it?

Maybe for LINK_CREATE. Most other commands are already unprivileged
and rely on FD (prog, map, link) availability as proof of being able
to work with that object. GET_FD_BY_ID is another candidate for BPF
token, but I wanted to get real production feedback before making
exact decisions here.

>
> > > makes sense to separate token-create vs token-add-capability operations?
> > >
> > > BPF_TOKEN_CREATE would create a token that can't do anything. Then you
> > > would call a bunch of BPF_TOKEN_ALLOW with maybe op=SYSCALL_CMD
> > > value=BPF_TOKEN_CREATE.
> > >
> > > This will be more future-proof plus won't really depend on having a
> > > bitmask in the uapi. Then the users will be able to handle
> > > BPF_TOKEN_ALLOW{op=SYSCALL_CMD value=SOME_VALUE_NOT_SUPPORTED_ON_THIS_KERNEL}
> > > themselves (IOW, BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS won't be needed).
> >
> > So I very much intentionally wanted to keep the BPF token immutable
> > once created. This makes it simple to reason about what BPF token
> > allows and guarantee that it won't change after the fact. It's doable
> > to make BPF token mutable and then "finalize" it (and BPF_MAP_FREEZE
> > stands as a good reminder of races and complications such model
> > introduces), creating a sort of builder pattern APIs, but that seems
> > like an overkill and unnecessary complication.
> >
> > But let me address that "more future-proof" part. What about our
> > binary bpf_attr extensible approach is not future proof? In both cases
> > we'll have to specify as part of UAPI that there is a possibility to
> > restrict a set of bpf() syscall commands, right? In one case you'll do
> > it through multiple syscall invocations, while I chose a
> > straightforward bit mask. I could have done it as a pointer to an
> > array of `enum bpf_cmd` items, but I think it's extremely unlikely
> > we'll get to >64, ever. But even if we do, adding `u64 allowed_cmds2`
> > doesn't seem like a big deal to me.
> >
> > The main point though, both approaches are equally extensible. But
> > making BPF token mutable adds a lot of undesirable (IMO)
> > complications.
> >
> >
> > As for BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS, I'm thinking of dropping such
> > flags for simplicity.
>
> Ack, I just hope we're not inventing another landlock here. As mentioned
> above, maybe doing simple BPF_TOKEN_CREATE + pushing the rest of the
> policy into lsm/landlock is a good alternative, idk.

The biggest blocker today is the incompatibility of BPF usage with
user namespaces. Having a simple BPF token would allow us to make
progress here. The rest is a matter of striking the right balance, yep.
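As a rough sketch of the `allowed_cmds` bitmask idea discussed above: the command numbers below are the real `enum bpf_cmd` values from the UAPI header, but the `allowed_cmds` field itself is this patch set's proposal and may change, so treat this as an illustration of the encoding rather than kernel code:

```python
# enum bpf_cmd values from include/uapi/linux/bpf.h
BPF_MAP_CREATE = 0
BPF_PROG_LOAD = 5
BPF_BTF_LOAD = 18
BPF_LINK_CREATE = 28

def cmds_to_mask(cmds):
    """Encode a list of bpf_cmd values as a u64-style bitmask."""
    mask = 0
    for cmd in cmds:
        # >64 commands would need a hypothetical allowed_cmds2 field
        assert cmd < 64
        mask |= 1 << cmd
    return mask

def cmd_allowed(mask, cmd):
    """Check whether a given bpf_cmd is permitted by the token's mask."""
    return bool(mask & (1 << cmd))

# A token restricted to the three "constructor" commands:
allowed_cmds = cmds_to_mask([BPF_MAP_CREATE, BPF_PROG_LOAD, BPF_BTF_LOAD])
print(hex(allowed_cmds))                      # 0x40021
assert cmd_allowed(allowed_cmds, BPF_PROG_LOAD)
assert not cmd_allowed(allowed_cmds, BPF_LINK_CREATE)
```

The point of the bitmask over an array of `enum bpf_cmd` is simply that it fits in one fixed-size `bpf_attr` field and stays immutable once the token is created.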


* Re: [PATCH RESEND bpf-next 00/18] BPF token
  2023-06-05 22:26     ` Casey Schaufler
@ 2023-06-05 23:12       ` Andrii Nakryiko
  2023-06-06  0:05         ` Casey Schaufler
  0 siblings, 1 reply; 36+ messages in thread
From: Andrii Nakryiko @ 2023-06-05 23:12 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Andrii Nakryiko, bpf, linux-security-module, keescook, brauner,
	lennart, cyphar, luto

On Mon, Jun 5, 2023 at 3:26 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>
> On 6/5/2023 1:41 PM, Andrii Nakryiko wrote:
> > On Fri, Jun 2, 2023 at 8:55 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >> On 6/2/2023 7:59 AM, Andrii Nakryiko wrote:
> >>> *Resending with trimmed CC list because original version didn't make it to
> >>> the mailing list.*
> >>>
> >>> This patch set introduces new BPF object, BPF token, which allows to delegate
> >>> a subset of BPF functionality from privileged system-wide daemon (e.g.,
> >>> systemd or any other container manager) to a *trusted* unprivileged
> >>> application. Trust is the key here. This functionality is not about allowing
> >>> unconditional unprivileged BPF usage. Establishing trust, though, is
> >>> completely up to the discretion of respective privileged application that
> >>> would create a BPF token.
> >> Token based privilege has a number of well understood weaknesses,
> >> none of which I see addressed here. I also have a real problem with
> > Can you please provide some more details about those weaknesses? Hard
> > to respond without knowing exactly what we are talking about.
>
> Privileged Process (PP) sends a Token to Trusted Process (TP).
> TP sends the Token along to Untrusted Process, which performs nefarious
> deeds.
>
> Privileged Process (PP) sends a Token to Trusted Process (TP).
> TP uses Token, and then saves it in its toolbox. PP later sends
> TP a different Token. TP realizes that with the combination of
> Tokens it now has it can do considerably more than what PP
> intended in either of the cases it sent Token for. TP performs
> nefarious deeds.
>
> Granted, in both cases TP does not deserve to be trusted.

Right, exactly. The intended use case here is a controlled production
containerized environment, where the container manager is privileged
and controls which applications run inside the container. Those
applications come from code that is reviewed and controlled by
whichever organization operates the system.

> Because TP does not run with privilege of its own, it is not
> treated with the same level of caution as it would be if it did.
>
> Privileged Process (PP) sends a Token to what it thinks is a Trusted
> Process (TP) but is in fact an Imposter Process (IP) that has been
> enabled on the system using any number of K33L techniques.

So if there is a probability of an Imposter Process, neither a BPF
token nor CAP_BPF should be granted at all. In production, no one
gives CAP_BPF to processes that we cannot be reasonably sure are safe
to use BPF. As I mentioned in the cover letter, BPF token is not a
mechanism to implement unprivileged BPF.

What I'm trying to achieve here: instead of needing to grant root
capabilities to any (trusted, otherwise no one would do this)
BPF-using application, we'd like to grant a BPF token, which is more
limited in scope and confers far fewer privileges over the system.
And, crucially, CAP_BPF is incompatible with user namespaces, while a
BPF token is not.

Basically, I'd like to go from having root/CAP_BPF processes in the
init namespace to having unprivileged processes under a user
namespace, but with a BPF token that would still allow them controlled
BPF usage (through a combination of code review, audit, and security
enforcement).

>
> I don't see anything that ensures that PP communicates Tokens only
> to TP, nor any criteria for "trust" are met.

That is up to PP to organize, and it will differ across production
setups. E.g., for something like systemd or a container manager, one
way to communicate the token is to create a dedicated instance of
BPF FS, pin the BPF token in it, and expose that specific instance of
BPF FS in the container's mount namespace.
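The other delegation channel the cover letter mentions is passing the token FD over a Unix domain socket. A minimal sketch of that SCM_RIGHTS handoff, with an ordinary pipe standing in for the token FD (a real one would come from BPF_TOKEN_CREATE):

```python
import array
import os
import socket

def send_fd(sock, fd):
    """PP side: pass a file descriptor over a Unix socket via SCM_RIGHTS."""
    sock.sendmsg([b"tok"],
                 [(socket.SOL_SOCKET, socket.SCM_RIGHTS,
                   array.array("i", [fd]))])

def recv_fd(sock):
    """TP side: receive a descriptor referring to the same open file."""
    msg, ancdata, flags, addr = sock.recvmsg(
        16, socket.CMSG_LEN(array.array("i").itemsize))
    level, ctype, data = ancdata[0]
    assert level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS
    return array.array("i", bytes(data))[0]

# Demo: a pipe read-end stands in for the BPF token FD.
pp_sock, tp_sock = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
r, w = os.pipe()
os.write(w, b"hi")
send_fd(pp_sock, r)
fd = recv_fd(tp_sock)
assert os.read(fd, 2) == b"hi"  # TP's fd references the same pipe
for x in (r, w, fd):
    os.close(x)
pp_sock.close()
tp_sock.close()
```

Whether PP hands the FD over a socket like this or pins it in a dedicated BPF FS instance, the security model is the same: possession of the FD is the delegated capability, so PP must control exactly who can obtain it.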

>
> Those are the issues I'm most familiar with, although I believe
> there are others.
>
> >> the notion of "trusted unprivileged" where trust is established by
> >> a user space application. Ignoring the possibility of malicious code
> >> for the moment, the opportunity for accidental privilege leakage is
> >> huge. It would be trivial (and tempting) to create a privileged BPF
> >> "shell" that would then be allowed to "trust" any application and
> >> run it with privilege by passing it a token.
> > Right now most BPF applications are running as real root in
> > production. Users have to trust such applications to not do anything
> > bad with their full root capabilities. How it is done depends on
> > specific production and organizational setups, and could be code
> > reviewing, audits, LSM, etc. So in that sense BPF token doesn't make
> > things worse. And it actually allows us to improve the situation by
> > creating and sharing more restrictive BPF tokens that limit what bpf()
> > syscall parts are allowed to be used.
> >
> >>> The main motivation for BPF token is a desire to enable containerized
> >>> BPF applications to be used together with user namespaces. This is currently
> >>> impossible, as CAP_BPF, required for BPF subsystem usage, cannot be namespaced
> >>> or sandboxed, as a general rule. E.g., tracing BPF programs, thanks to BPF
> >>> helpers like bpf_probe_read_kernel() and bpf_probe_read_user() can safely read
> >>> arbitrary memory, and it's impossible to ensure that they only read memory of
> >>> processes belonging to any given namespace. This means that it's impossible to
> >>> have namespace-aware CAP_BPF capability, and as such another mechanism to
> >>> allow safe usage of BPF functionality is necessary. BPF token and delegation
> >>> of it to a trusted unprivileged applications is such mechanism. Kernel makes
> >>> no assumption about what "trusted" constitutes in any particular case, and
> >>> it's up to specific privileged applications and their surrounding
> >>> infrastructure to decide that. What kernel provides is a set of APIs to create
> >>> and tune BPF token, and pass it around to privileged BPF commands that are
> >>> creating new BPF objects like BPF programs, BPF maps, etc.
> >>>
> >>> Previous attempt at addressing this very same problem ([0]) attempted to
> >>> utilize authoritative LSM approach, but was conclusively rejected by upstream
> >>> LSM maintainers. BPF token concept is not changing anything about LSM
> >>> approach, but can be combined with LSM hooks for very fine-grained security
> >>> policy. Some ideas about making BPF token more convenient to use with LSM (in
> >>> particular custom BPF LSM programs) was briefly described in recent LSF/MM/BPF
> >>> 2023 presentation ([1]). E.g., an ability to specify user-provided data
> >>> (context), which in combination with BPF LSM would allow implementing a very
> >>> dynamic and fine-granular custom security policies on top of BPF token. In the
> >>> interest of minimizing API surface area discussions this is going to be
> >>> added in follow up patches, as it's not essential to the fundamental concept
> >>> of delegatable BPF token.
> >>>
> >>> It should be noted that BPF token is conceptually quite similar to the idea of
> >>> /dev/bpf device file, proposed by Song a while ago ([2]). The biggest
> >>> difference is the idea of using virtual anon_inode file to hold BPF token and
> >>> allowing multiple independent instances of them, each with its own set of
> >>> restrictions. BPF pinning solves the problem of exposing such BPF token
> >>> through file system (BPF FS, in this case) for cases where transferring FDs
> >>> over Unix domain sockets is not convenient. And also, crucially, BPF token
> >>> approach is not using any special stateful task-scoped flags. Instead, bpf()
> >>> syscall accepts token_fd parameters explicitly for each relevant BPF command.
> >>> This addresses main concerns brought up during the /dev/bpf discussion, and
> >>> fits better with overall BPF subsystem design.
> >>>
> >>> This patch set adds a basic minimum of functionality to make BPF token useful
> >>> and to discuss API and functionality. Currently only low-level libbpf APIs
> >>> support passing BPF token around, allowing to test kernel functionality, but
> >>> for the most part is not sufficient for real-world applications, which
> >>> typically use high-level libbpf APIs based on `struct bpf_object` type. This
> >>> was done with the intent to limit the size of patch set and concentrate on
> >>> mostly kernel-side changes. All the necessary plumbing for libbpf will be sent
> >>> as a separate follow up patch set kernel support makes it upstream.
> >>>
> >>> Another part that should happen once kernel-side BPF token is established, is
> >>> a set of conventions between applications (e.g., systemd), tools (e.g.,
> >>> bpftool), and libraries (e.g., libbpf) about sharing BPF tokens through BPF FS
> >>> at well-defined locations to allow applications take advantage of this in
> >>> automatic fashion without explicit code changes on BPF application's side.
> >>> But I'd like to postpone this discussion to after BPF token concept lands.
> >>>
> >>>   [0] https://lore.kernel.org/bpf/20230412043300.360803-1-andrii@kernel.org/
> >>>   [1] http://vger.kernel.org/bpfconf2023_material/Trusted_unprivileged_BPF_LSFMM2023.pdf
> >>>   [2] https://lore.kernel.org/bpf/20190627201923.2589391-2-songliubraving@fb.com/
> >>>
> >>> Andrii Nakryiko (18):
> >>>   bpf: introduce BPF token object
> >>>   libbpf: add bpf_token_create() API
> >>>   selftests/bpf: add BPF_TOKEN_CREATE test
> >>>   bpf: move unprivileged checks into map_create() and bpf_prog_load()
> >>>   bpf: inline map creation logic in map_create() function
> >>>   bpf: centralize permissions checks for all BPF map types
> >>>   bpf: add BPF token support to BPF_MAP_CREATE command
> >>>   libbpf: add BPF token support to bpf_map_create() API
> >>>   selftests/bpf: add BPF token-enabled test for BPF_MAP_CREATE command
> >>>   bpf: add BPF token support to BPF_BTF_LOAD command
> >>>   libbpf: add BPF token support to bpf_btf_load() API
> >>>   selftests/bpf: add BPF token-enabled BPF_BTF_LOAD selftest
> >>>   bpf: keep BPF_PROG_LOAD permission checks clear of validations
> >>>   bpf: add BPF token support to BPF_PROG_LOAD command
> >>>   bpf: take into account BPF token when fetching helper protos
> >>>   bpf: consistenly use BPF token throughout BPF verifier logic
> >>>   libbpf: add BPF token support to bpf_prog_load() API
> >>>   selftests/bpf: add BPF token-enabled BPF_PROG_LOAD tests
> >>>
> >>>  drivers/media/rc/bpf-lirc.c                   |   2 +-
> >>>  include/linux/bpf.h                           |  66 ++-
> >>>  include/linux/filter.h                        |   2 +-
> >>>  include/uapi/linux/bpf.h                      |  74 +++
> >>>  kernel/bpf/Makefile                           |   2 +-
> >>>  kernel/bpf/arraymap.c                         |   2 +-
> >>>  kernel/bpf/bloom_filter.c                     |   3 -
> >>>  kernel/bpf/bpf_local_storage.c                |   3 -
> >>>  kernel/bpf/bpf_struct_ops.c                   |   3 -
> >>>  kernel/bpf/cgroup.c                           |   6 +-
> >>>  kernel/bpf/core.c                             |   3 +-
> >>>  kernel/bpf/cpumap.c                           |   4 -
> >>>  kernel/bpf/devmap.c                           |   3 -
> >>>  kernel/bpf/hashtab.c                          |   6 -
> >>>  kernel/bpf/helpers.c                          |   6 +-
> >>>  kernel/bpf/inode.c                            |  26 ++
> >>>  kernel/bpf/lpm_trie.c                         |   3 -
> >>>  kernel/bpf/queue_stack_maps.c                 |   4 -
> >>>  kernel/bpf/reuseport_array.c                  |   3 -
> >>>  kernel/bpf/stackmap.c                         |   3 -
> >>>  kernel/bpf/syscall.c                          | 429 ++++++++++++++----
> >>>  kernel/bpf/token.c                            | 141 ++++++
> >>>  kernel/bpf/verifier.c                         |  13 +-
> >>>  kernel/trace/bpf_trace.c                      |   2 +-
> >>>  net/core/filter.c                             |  36 +-
> >>>  net/core/sock_map.c                           |   4 -
> >>>  net/ipv4/bpf_tcp_ca.c                         |   2 +-
> >>>  net/netfilter/nf_bpf_link.c                   |   2 +-
> >>>  net/xdp/xskmap.c                              |   4 -
> >>>  tools/include/uapi/linux/bpf.h                |  74 +++
> >>>  tools/lib/bpf/bpf.c                           |  32 +-
> >>>  tools/lib/bpf/bpf.h                           |  24 +-
> >>>  tools/lib/bpf/libbpf.map                      |   1 +
> >>>  .../selftests/bpf/prog_tests/libbpf_probes.c  |   4 +
> >>>  .../selftests/bpf/prog_tests/libbpf_str.c     |   6 +
> >>>  .../testing/selftests/bpf/prog_tests/token.c  | 282 ++++++++++++
> >>>  .../bpf/prog_tests/unpriv_bpf_disabled.c      |   6 +-
> >>>  37 files changed, 1098 insertions(+), 188 deletions(-)
> >>>  create mode 100644 kernel/bpf/token.c
> >>>  create mode 100644 tools/testing/selftests/bpf/prog_tests/token.c
> >>>


* Re: [PATCH RESEND bpf-next 00/18] BPF token
  2023-06-05 23:12       ` Andrii Nakryiko
@ 2023-06-06  0:05         ` Casey Schaufler
  2023-06-06 16:38           ` Andrii Nakryiko
  0 siblings, 1 reply; 36+ messages in thread
From: Casey Schaufler @ 2023-06-06  0:05 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, bpf, linux-security-module, keescook, brauner,
	lennart, cyphar, luto, Casey Schaufler

On 6/5/2023 4:12 PM, Andrii Nakryiko wrote:
> On Mon, Jun 5, 2023 at 3:26 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>> On 6/5/2023 1:41 PM, Andrii Nakryiko wrote:
>>> On Fri, Jun 2, 2023 at 8:55 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
>>>> On 6/2/2023 7:59 AM, Andrii Nakryiko wrote:
>>>>> *Resending with trimmed CC list because original version didn't make it to
>>>>> the mailing list.*
>>>>>
>>>>> This patch set introduces new BPF object, BPF token, which allows to delegate
>>>>> a subset of BPF functionality from privileged system-wide daemon (e.g.,
>>>>> systemd or any other container manager) to a *trusted* unprivileged
>>>>> application. Trust is the key here. This functionality is not about allowing
>>>>> unconditional unprivileged BPF usage. Establishing trust, though, is
>>>>> completely up to the discretion of respective privileged application that
>>>>> would create a BPF token.
>>>> Token based privilege has a number of well understood weaknesses,
>>>> none of which I see addressed here. I also have a real problem with
>>> Can you please provide some more details about those weaknesses? Hard
>>> to respond without knowing exactly what we are talking about.
>> Privileged Process (PP) sends a Token to Trusted Process (TP).
>> TP sends the Token along to Untrusted Process, which performs nefarious
>> deeds.
>>
>> Privileged Process (PP) sends a Token to Trusted Process (TP).
>> TP uses Token, and then saves it in its toolbox. PP later sends
>> TP a different Token. TP realizes that with the combination of
>> Tokens it now has it can do considerably more than what PP
>> intended in either of the cases it sent Token for. TP performs
>> nefarious deeds.
>>
>> Granted, in both cases TP does not deserve to be trusted.
> Right, exactly. The intended use case here is a controlled production
> containerized environment, where the container manager is privileged
> and controls which applications are run inside the container. These
> are coming from applications that are code reviewed and controlled by
> whichever organization.

I understand the intended use case. You have to allow for unintended abuse
cases when you implement a security mechanism. You can't wave your hand and
say that everything that is trusted is worthy of trust. You have to have a
mechanism to ensure that. The existing security mechanisms (uids, capabilities
and so forth) have explicit criteria for how they delegate privilege, and
for what the consequences of doing so might be.

>
>> Because TP does not run with privilege of its own, it is not
>> treated with the same level of caution as it would be if it did.
>>
>> Privileged Process (PP) sends a Token to what it thinks is a Trusted
>> Process (TP) but is in fact an Imposter Process (IP) that has been
>> enabled on the system using any number of K33L techniques.
> So if there is a probability of Imposter Process, neither BPF token
> nor CAP_BPF should be granted at all. In production no one gives
> CAP_BPF to processes that we cannot be reasonably sure is safe to use
> BPF. As I mentioned in the cover letter, BPF token is not a mechanism
> to implement unprivileged BPF.

You're correct, PP *shouldn't* grant IP a Token. But it *can* do so.
Think of the military definition of a threat: it's what the other guy
is capable of doing to you, not what the other guy is expected to do to you.
PP is capable of giving a Token to IP, even if PP does not intend to.

> What I'm trying to achieve here is instead of needing to grant root
> capabilities to any (trusted, otherwise no one would do this)
> BPF-using application, we'd like to grant BPF token which is more
> limited in scope and gives much less privileges to do anything with
> the system. And, crucially, CAP_BPF is incompatible with user
> namespaces, while BPF token is.

I get that. Dynamically increasing a process's (TP's) privilege from an
external source (PP), without somehow marking TP as worthy of the privilege,
is going to be insanely dangerous. Even in well-controlled environments.

>
> Basically, I'd like to go from having root/CAP_BPF processes in init
> namespace, to have unprivileged processes under user namespace, but
> with BPF token that would still allow to do them controlled (through
> combination of code reviews, audit, and security enforcements) BPF
> usage.

And the problem there is that if you put the feature in the kernel,
you can assume that some number of people will use it without code
review, audit, or security enforcement. Of course you can call out
"user error", but someone is going to want you to "fix" it.

>
>> I don't see anything that ensures that PP communicates Tokens only
>> to TP, nor any criteria for "trust" are met.
> This should be up to PP how to organize this and will differ in
> different production setups. E.g., for something like systemd or
> container manager, one way to communicate this is to create a
> dedicated instance of BPF FS, pin BPF token in it, and expose that
> specific instance of BPF FS in the container's mount namespace.

I have no doubt that you can make a system that works and works correctly.
I'm saying that it's very easy to create a system that has easily exploited
security holes. You won't do it, but someone who didn't design your
mechanism will.

>
>> Those are the issues I'm most familiar with, although I believe
>> there are others.
>>
>>>> the notion of "trusted unprivileged" where trust is established by
>>>> a user space application. Ignoring the possibility of malicious code
>>>> for the moment, the opportunity for accidental privilege leakage is
>>>> huge. It would be trivial (and tempting) to create a privileged BPF
>>>> "shell" that would then be allowed to "trust" any application and
>>>> run it with privilege by passing it a token.
>>> Right now most BPF applications are running as real root in
>>> production. Users have to trust such applications to not do anything
>>> bad with their full root capabilities. How it is done depends on
>>> specific production and organizational setups, and could be code
>>> reviewing, audits, LSM, etc. So in that sense BPF token doesn't make
>>> things worse. And it actually allows us to improve the situation by
>>> creating and sharing more restrictive BPF tokens that limit what bpf()
>>> syscall parts are allowed to be used.


* Re: [PATCH RESEND bpf-next 00/18] BPF token
  2023-06-06  0:05         ` Casey Schaufler
@ 2023-06-06 16:38           ` Andrii Nakryiko
  2023-06-06 20:13             ` Casey Schaufler
  0 siblings, 1 reply; 36+ messages in thread
From: Andrii Nakryiko @ 2023-06-06 16:38 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Andrii Nakryiko, bpf, linux-security-module, keescook, brauner,
	lennart, cyphar, luto

On Mon, Jun 5, 2023 at 5:06 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>
> On 6/5/2023 4:12 PM, Andrii Nakryiko wrote:
> > On Mon, Jun 5, 2023 at 3:26 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >> On 6/5/2023 1:41 PM, Andrii Nakryiko wrote:
> >>> On Fri, Jun 2, 2023 at 8:55 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >>>> On 6/2/2023 7:59 AM, Andrii Nakryiko wrote:
> >>>>> *Resending with trimmed CC list because original version didn't make it to
> >>>>> the mailing list.*
> >>>>>
> >>>>> This patch set introduces new BPF object, BPF token, which allows to delegate
> >>>>> a subset of BPF functionality from privileged system-wide daemon (e.g.,
> >>>>> systemd or any other container manager) to a *trusted* unprivileged
> >>>>> application. Trust is the key here. This functionality is not about allowing
> >>>>> unconditional unprivileged BPF usage. Establishing trust, though, is
> >>>>> completely up to the discretion of respective privileged application that
> >>>>> would create a BPF token.
> >>>> Token based privilege has a number of well understood weaknesses,
> >>>> none of which I see addressed here. I also have a real problem with
> >>> Can you please provide some more details about those weaknesses? Hard
> >>> to respond without knowing exactly what we are talking about.
> >> Privileged Process (PP) sends a Token to Trusted Process (TP).
> >> TP sends the Token along to Untrusted Process, which performs nefarious
> >> deeds.
> >>
> >> Privileged Process (PP) sends a Token to Trusted Process (TP).
> >> TP uses Token, and then saves it in its toolbox. PP later sends
> >> TP a different Token. TP realizes that with the combination of
> >> Tokens it now has it can do considerably more than what PP
> >> intended in either of the cases it sent Token for. TP performs
> >> nefarious deeds.
> >>
> >> Granted, in both cases TP does not deserve to be trusted.
> > Right, exactly. The intended use case here is a controlled production
> > containerized environment, where the container manager is privileged
> > and controls which applications are run inside the container. These
> > are coming from applications that are code reviewed and controlled by
> > whichever organization.
>
> I understand the intended use case. You have to allow for unintended abuse
> cases when you implement a security mechanism. You can't wave your hand and
> say that everything that is trusted in worthy of trust. You have to have
> mechanism to ensure that. The existing security mechanisms (uids, capabilities
> and so forth) have explicit criteria for how they delegate privilege, and
> what the consequences of doing so might be.
>

I'm sorry, I'm failing to see the point you are trying to make. Any
API (especially a privileged one, like BPF_TOKEN_CREATE) can be misused
or abused, but we still grant root permissions to various production
processes, right? So I'm not sure where this is going. If you have
something specific in mind, please do tell.

> >
> >> Because TP does not run with privilege of its own, it is not
> >> treated with the same level of caution as it would be if it did.
> >>
> >> Privileged Process (PP) sends a Token to what it thinks is a Trusted
> >> Process (TP) but is in fact an Imposter Process (IP) that has been
> >> enabled on the system using any number of K33L techniques.
> > So if there is a probability of Imposter Process, neither BPF token
> > nor CAP_BPF should be granted at all. In production no one gives
> > CAP_BPF to processes that we cannot be reasonably sure is safe to use
> > BPF. As I mentioned in the cover letter, BPF token is not a mechanism
> > to implement unprivileged BPF.
>
> You're correct, PP *shouldn't* grant IP a Token. But it *can* do so.
> Think of the military definition of a threat. It's what the other guy
> is capable of doing to you, not what the other guy is expected to do to you.
> PP is capable to giving a Token to IP, even if PP does not intend to.
>
> > What I'm trying to achieve here is instead of needing to grant root
> > capabilities to any (trusted, otherwise no one would do this)
> > BPF-using application, we'd like to grant BPF token which is more
> > limited in scope and gives much less privileges to do anything with
> > the system. And, crucially, CAP_BPF is incompatible with user
> > namespaces, while BPF token is.
>
> I get that. Dynamically increasing a process' privilege (TP) from an
> external source (PP) without somehow marking TP as worthy of the privilege
> is going to be insanely dangerous. Even in well controlled environments.
>
> >
> > Basically, I'd like to go from having root/CAP_BPF processes in init
> > namespace, to have unprivileged processes under user namespace, but
> > with BPF token that would still allow to do them controlled (through
> > combination of code reviews, audit, and security enforcements) BPF
> > usage.
>
> And the problem there is that if you put the feature in the kernel
> you can assume that some number of people will use it without code
> reviews, audit or security enforcement. Of course you can call out
> "user error", but someone is going to want you to "fix" it.
>
> >
> >> I don't see anything that ensures that PP communicates Tokens only
> >> to TP, nor any criteria for "trust" are met.
> > This should be up to PP how to organize this and will differ in
> > different production setups. E.g., for something like systemd or
> > container manager, one way to communicate this is to create a
> > dedicated instance of BPF FS, pin BPF token in it, and expose that
> > specific instance of BPF FS in the container's mount namespace.
>
> I have no doubt that you can make a system that works and works correctly.
> I'm saying that it's very easy to create a system that had easily exploited
> security holes. You won't do it, but someone who didn't design your
> mechanism will.
>

As I mentioned above, I'm failing to see where this is going... If you
give a process the CAP_SYS_ADMIN capability, it can create a BPF token.
And it can pass that token to some other process, either through a Unix
domain socket (SCM_RIGHTS) or through BPF FS. If you can't be sure the
privileged process will do the right thing -- don't give it
CAP_SYS_ADMIN. If you did, don't blame the API's existence for your
misuse of it.

There are many privileged APIs; they exist for a reason, and yes, they
can be dangerous (which is why they are privileged), but they help to
solve real problems. Same here: we need something like a BPF token to
allow use of BPF within containers, and to do that in a safer way than
granting the CAP_SYS_ADMIN, CAP_BPF, etc. capabilities.

> >
> >> Those are the issues I'm most familiar with, although I believe
> >> there are others.
> >>
> >>>> the notion of "trusted unprivileged" where trust is established by
> >>>> a user space application. Ignoring the possibility of malicious code
> >>>> for the moment, the opportunity for accidental privilege leakage is
> >>>> huge. It would be trivial (and tempting) to create a privileged BPF
> >>>> "shell" that would then be allowed to "trust" any application and
> >>>> run it with privilege by passing it a token.
> >>> Right now most BPF applications are running as real root in
> >>> production. Users have to trust such applications to not do anything
> >>> bad with their full root capabilities. How it is done depends on
> >>> specific production and organizational setups, and could be code
> >>> reviewing, audits, LSM, etc. So in that sense BPF token doesn't make
> >>> things worse. And it actually allows us to improve the situation by
> >>> creating and sharing more restrictive BPF tokens that limit what bpf()
> >>> syscall parts are allowed to be used.
> >>>
> >>>>> The main motivation for BPF token is a desire to enable containerized
> >>>>> BPF applications to be used together with user namespaces. This is currently
> >>>>> impossible, as CAP_BPF, required for BPF subsystem usage, cannot be namespaced
> >>>>> or sandboxed, as a general rule. E.g., tracing BPF programs, thanks to BPF
> >>>>> helpers like bpf_probe_read_kernel() and bpf_probe_read_user() can safely read
> >>>>> arbitrary memory, and it's impossible to ensure that they only read memory of
> >>>>> processes belonging to any given namespace. This means that it's impossible to
> >>>>> have namespace-aware CAP_BPF capability, and as such another mechanism to
> >>>>> allow safe usage of BPF functionality is necessary. BPF token and delegation
> >>>>> of it to a trusted unprivileged applications is such mechanism. Kernel makes
> >>>>> no assumption about what "trusted" constitutes in any particular case, and
> >>>>> it's up to specific privileged applications and their surrounding
> >>>>> infrastructure to decide that. What kernel provides is a set of APIs to create
> >>>>> and tune BPF token, and pass it around to privileged BPF commands that are
> >>>>> creating new BPF objects like BPF programs, BPF maps, etc.
> >>>>>
> >>>>> Previous attempt at addressing this very same problem ([0]) attempted to
> >>>>> utilize authoritative LSM approach, but was conclusively rejected by upstream
> >>>>> LSM maintainers. BPF token concept is not changing anything about LSM
> >>>>> approach, but can be combined with LSM hooks for very fine-grained security
> >>>>> policy. Some ideas about making BPF token more convenient to use with LSM (in
> >>>>> particular custom BPF LSM programs) was briefly described in recent LSF/MM/BPF
> >>>>> 2023 presentation ([1]). E.g., an ability to specify user-provided data
> >>>>> (context), which in combination with BPF LSM would allow implementing a very
> >>>>> dynamic and fine-granular custom security policies on top of BPF token. In the
> >>>>> interest of minimizing API surface area discussions this is going to be
> >>>>> added in follow up patches, as it's not essential to the fundamental concept
> >>>>> of delegatable BPF token.
> >>>>>
> >>>>> It should be noted that BPF token is conceptually quite similar to the idea of
> >>>>> /dev/bpf device file, proposed by Song a while ago ([2]). The biggest
> >>>>> difference is the idea of using virtual anon_inode file to hold BPF token and
> >>>>> allowing multiple independent instances of them, each with its own set of
> >>>>> restrictions. BPF pinning solves the problem of exposing such BPF token
> >>>>> through file system (BPF FS, in this case) for cases where transferring FDs
> >>>>> over Unix domain sockets is not convenient. And also, crucially, BPF token
> >>>>> approach is not using any special stateful task-scoped flags. Instead, bpf()
> >>>>> syscall accepts token_fd parameters explicitly for each relevant BPF command.
> >>>>> This addresses main concerns brought up during the /dev/bpf discussion, and
> >>>>> fits better with overall BPF subsystem design.
> >>>>>
> >>>>> This patch set adds a basic minimum of functionality to make BPF token useful
> >>>>> and to discuss API and functionality. Currently only low-level libbpf APIs
> >>>>> support passing BPF token around, allowing to test kernel functionality, but
> >>>>> for the most part is not sufficient for real-world applications, which
> >>>>> typically use high-level libbpf APIs based on `struct bpf_object` type. This
> >>>>> was done with the intent to limit the size of patch set and concentrate on
> >>>>> mostly kernel-side changes. All the necessary plumbing for libbpf will be sent
> >>>>> as a separate follow up patch set kernel support makes it upstream.
> >>>>>
> >>>>> Another part that should happen once kernel-side BPF token is established, is
> >>>>> a set of conventions between applications (e.g., systemd), tools (e.g.,
> >>>>> bpftool), and libraries (e.g., libbpf) about sharing BPF tokens through BPF FS
> >>>>> at well-defined locations to allow applications take advantage of this in
> >>>>> automatic fashion without explicit code changes on BPF application's side.
> >>>>> But I'd like to postpone this discussion to after BPF token concept lands.
> >>>>>
> >>>>>   [0] https://lore.kernel.org/bpf/20230412043300.360803-1-andrii@kernel.org/
> >>>>>   [1] http://vger.kernel.org/bpfconf2023_material/Trusted_unprivileged_BPF_LSFMM2023.pdf
> >>>>>   [2] https://lore.kernel.org/bpf/20190627201923.2589391-2-songliubraving@fb.com/
> >>>>>
> >>>>> Andrii Nakryiko (18):
> >>>>>   bpf: introduce BPF token object
> >>>>>   libbpf: add bpf_token_create() API
> >>>>>   selftests/bpf: add BPF_TOKEN_CREATE test
> >>>>>   bpf: move unprivileged checks into map_create() and bpf_prog_load()
> >>>>>   bpf: inline map creation logic in map_create() function
> >>>>>   bpf: centralize permissions checks for all BPF map types
> >>>>>   bpf: add BPF token support to BPF_MAP_CREATE command
> >>>>>   libbpf: add BPF token support to bpf_map_create() API
> >>>>>   selftests/bpf: add BPF token-enabled test for BPF_MAP_CREATE command
> >>>>>   bpf: add BPF token support to BPF_BTF_LOAD command
> >>>>>   libbpf: add BPF token support to bpf_btf_load() API
> >>>>>   selftests/bpf: add BPF token-enabled BPF_BTF_LOAD selftest
> >>>>>   bpf: keep BPF_PROG_LOAD permission checks clear of validations
> >>>>>   bpf: add BPF token support to BPF_PROG_LOAD command
> >>>>>   bpf: take into account BPF token when fetching helper protos
> >>>>>   bpf: consistenly use BPF token throughout BPF verifier logic
> >>>>>   libbpf: add BPF token support to bpf_prog_load() API
> >>>>>   selftests/bpf: add BPF token-enabled BPF_PROG_LOAD tests
> >>>>>
> >>>>>  drivers/media/rc/bpf-lirc.c                   |   2 +-
> >>>>>  include/linux/bpf.h                           |  66 ++-
> >>>>>  include/linux/filter.h                        |   2 +-
> >>>>>  include/uapi/linux/bpf.h                      |  74 +++
> >>>>>  kernel/bpf/Makefile                           |   2 +-
> >>>>>  kernel/bpf/arraymap.c                         |   2 +-
> >>>>>  kernel/bpf/bloom_filter.c                     |   3 -
> >>>>>  kernel/bpf/bpf_local_storage.c                |   3 -
> >>>>>  kernel/bpf/bpf_struct_ops.c                   |   3 -
> >>>>>  kernel/bpf/cgroup.c                           |   6 +-
> >>>>>  kernel/bpf/core.c                             |   3 +-
> >>>>>  kernel/bpf/cpumap.c                           |   4 -
> >>>>>  kernel/bpf/devmap.c                           |   3 -
> >>>>>  kernel/bpf/hashtab.c                          |   6 -
> >>>>>  kernel/bpf/helpers.c                          |   6 +-
> >>>>>  kernel/bpf/inode.c                            |  26 ++
> >>>>>  kernel/bpf/lpm_trie.c                         |   3 -
> >>>>>  kernel/bpf/queue_stack_maps.c                 |   4 -
> >>>>>  kernel/bpf/reuseport_array.c                  |   3 -
> >>>>>  kernel/bpf/stackmap.c                         |   3 -
> >>>>>  kernel/bpf/syscall.c                          | 429 ++++++++++++++----
> >>>>>  kernel/bpf/token.c                            | 141 ++++++
> >>>>>  kernel/bpf/verifier.c                         |  13 +-
> >>>>>  kernel/trace/bpf_trace.c                      |   2 +-
> >>>>>  net/core/filter.c                             |  36 +-
> >>>>>  net/core/sock_map.c                           |   4 -
> >>>>>  net/ipv4/bpf_tcp_ca.c                         |   2 +-
> >>>>>  net/netfilter/nf_bpf_link.c                   |   2 +-
> >>>>>  net/xdp/xskmap.c                              |   4 -
> >>>>>  tools/include/uapi/linux/bpf.h                |  74 +++
> >>>>>  tools/lib/bpf/bpf.c                           |  32 +-
> >>>>>  tools/lib/bpf/bpf.h                           |  24 +-
> >>>>>  tools/lib/bpf/libbpf.map                      |   1 +
> >>>>>  .../selftests/bpf/prog_tests/libbpf_probes.c  |   4 +
> >>>>>  .../selftests/bpf/prog_tests/libbpf_str.c     |   6 +
> >>>>>  .../testing/selftests/bpf/prog_tests/token.c  | 282 ++++++++++++
> >>>>>  .../bpf/prog_tests/unpriv_bpf_disabled.c      |   6 +-
> >>>>>  37 files changed, 1098 insertions(+), 188 deletions(-)
> >>>>>  create mode 100644 kernel/bpf/token.c
> >>>>>  create mode 100644 tools/testing/selftests/bpf/prog_tests/token.c
> >>>>>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH RESEND bpf-next 01/18] bpf: introduce BPF token object
  2023-06-05 23:00         ` Andrii Nakryiko
@ 2023-06-06 16:58           ` Stanislav Fomichev
  2023-06-06 17:04             ` Andrii Nakryiko
  0 siblings, 1 reply; 36+ messages in thread
From: Stanislav Fomichev @ 2023-06-06 16:58 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, bpf, linux-security-module, keescook, brauner,
	lennart, cyphar, luto

On 06/05, Andrii Nakryiko wrote:
> On Mon, Jun 5, 2023 at 2:48 PM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > On 06/05, Andrii Nakryiko wrote:
> > > On Fri, Jun 2, 2023 at 6:32 PM Stanislav Fomichev <sdf@google.com> wrote:
> > > >
> > > > On 06/02, Andrii Nakryiko wrote:
> > > > > Add new kind of BPF kernel object, BPF token. BPF token is meant to to
> > > > > allow delegating privileged BPF functionality, like loading a BPF
> > > > > program or creating a BPF map, from privileged process to a *trusted*
> > > > > unprivileged process, all while have a good amount of control over which
> > > > > privileged operation could be done using provided BPF token.
> > > > >
> > > > > This patch adds new BPF_TOKEN_CREATE command to bpf() syscall, which
> > > > > allows to create a new BPF token object along with a set of allowed
> > > > > commands. Currently only BPF_TOKEN_CREATE command itself can be
> > > > > delegated, but other patches gradually add ability to delegate
> > > > > BPF_MAP_CREATE, BPF_BTF_LOAD, and BPF_PROG_LOAD commands.
> > > > >
> > > > > The above means that BPF token creation can be allowed by another
> > > > > existing BPF token, if original privileged creator allowed that. New
> > > > > derived BPF token cannot be more powerful than the original BPF token.
> > > > >
> > > > > BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS flag is added to allow application to do
> > > > > express "all supported BPF commands should be allowed" without worrying
> > > > > about which subset of desired commands is actually supported by
> > > > > potentially outdated kernel. Allowing this semantics doesn't seem to
> > > > > introduce any backwards compatibility issues and doesn't introduce any
> > > > > risk of abusing or misusing bit set field, but makes backwards
> > > > > compatibility story for various applications and tools much more
> > > > > straightforward, making it unnecessary to probe support for each
> > > > > individual possible bit. This is especially useful in follow up patches
> > > > > where we add BPF map types and prog types bit sets.
> > > > >
> > > > > Lastly, BPF token can be pinned in and retrieved from BPF FS, just like
> > > > > progs, maps, BTFs, and links. This allows applications (like container
> > > > > managers) to share BPF token with other applications through file system
> > > > > just like any other BPF object, and further control access to it using
> > > > > file system permissions, if desired.
> > > > >
> > > > > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > > > > ---
> > > > >  include/linux/bpf.h            |  34 +++++++++
> > > > >  include/uapi/linux/bpf.h       |  42 ++++++++++++
> > > > >  kernel/bpf/Makefile            |   2 +-
> > > > >  kernel/bpf/inode.c             |  26 +++++++
> > > > >  kernel/bpf/syscall.c           |  74 ++++++++++++++++++++
> > > > >  kernel/bpf/token.c             | 122 +++++++++++++++++++++++++++++++++
> > > > >  tools/include/uapi/linux/bpf.h |  40 +++++++++++
> > > > >  7 files changed, 339 insertions(+), 1 deletion(-)
> > > > >  create mode 100644 kernel/bpf/token.c
> > > > >
> > > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > > > index f58895830ada..fe6d51c3a5b1 100644
> > > > > --- a/include/linux/bpf.h
> > > > > +++ b/include/linux/bpf.h
> > > > > @@ -51,6 +51,7 @@ struct module;
> > > > >  struct bpf_func_state;
> > > > >  struct ftrace_ops;
> > > > >  struct cgroup;
> > > > > +struct bpf_token;
> > > > >
> > > > >  extern struct idr btf_idr;
> > > > >  extern spinlock_t btf_idr_lock;
> > > > > @@ -1533,6 +1534,12 @@ struct bpf_link_primer {
> > > > >       u32 id;
> > > > >  };
> > > > >
> > > > > +struct bpf_token {
> > > > > +     struct work_struct work;
> > > > > +     atomic64_t refcnt;
> > > > > +     u64 allowed_cmds;
> > > > > +};
> > > > > +
> > > > >  struct bpf_struct_ops_value;
> > > > >  struct btf_member;
> > > > >
> > > > > @@ -2077,6 +2084,15 @@ struct file *bpf_link_new_file(struct bpf_link *link, int *reserved_fd);
> > > > >  struct bpf_link *bpf_link_get_from_fd(u32 ufd);
> > > > >  struct bpf_link *bpf_link_get_curr_or_next(u32 *id);
> > > > >
> > > > > +void bpf_token_inc(struct bpf_token *token);
> > > > > +void bpf_token_put(struct bpf_token *token);
> > > > > +struct bpf_token *bpf_token_alloc(void);
> > > > > +int bpf_token_new_fd(struct bpf_token *token);
> > > > > +struct bpf_token *bpf_token_get_from_fd(u32 ufd);
> > > > > +
> > > > > +bool bpf_token_capable(const struct bpf_token *token, int cap);
> > > > > +bool bpf_token_allow_cmd(const struct bpf_token *token, enum bpf_cmd cmd);
> > > > > +
> > > > >  int bpf_obj_pin_user(u32 ufd, int path_fd, const char __user *pathname);
> > > > >  int bpf_obj_get_user(int path_fd, const char __user *pathname, int flags);
> > > > >
> > > > > @@ -2436,6 +2452,24 @@ static inline int bpf_obj_get_user(const char __user *pathname, int flags)
> > > > >       return -EOPNOTSUPP;
> > > > >  }
> > > > >
> > > > > +static inline void bpf_token_inc(struct bpf_token *token)
> > > > > +{
> > > > > +}
> > > > > +
> > > > > +static inline void bpf_token_put(struct bpf_token *token)
> > > > > +{
> > > > > +}
> > > > > +
> > > > > +static inline struct bpf_token *bpf_token_new_fd(struct bpf_token *token)
> > > > > +{
> > > > > +     return -EOPNOTSUPP;
> > > > > +}
> > > > > +
> > > > > +static inline struct bpf_token *bpf_token_get_from_fd(u32 ufd)
> > > > > +{
> > > > > +     return ERR_PTR(-EOPNOTSUPP);
> > > > > +}
> > > > > +
> > > > >  static inline void __dev_flush(void)
> > > > >  {
> > > > >  }
> > > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > > > index 9273c654743c..01ab79f2ad9f 100644
> > > > > --- a/include/uapi/linux/bpf.h
> > > > > +++ b/include/uapi/linux/bpf.h
> > > > > @@ -846,6 +846,16 @@ union bpf_iter_link_info {
> > > > >   *           Returns zero on success. On error, -1 is returned and *errno*
> > > > >   *           is set appropriately.
> > > > >   *
> > > > > + * BPF_TOKEN_CREATE
> > > > > + *   Description
> > > > > + *           Create BPF token with embedded information about what
> > > > > + *           BPF-related functionality is allowed. This BPF token can be
> > > > > + *           passed as an extra parameter to various bpf() syscall command.
> > > > > + *
> > > > > + *   Return
> > > > > + *           A new file descriptor (a nonnegative integer), or -1 if an
> > > > > + *           error occurred (in which case, *errno* is set appropriately).
> > > > > + *
> > > > >   * NOTES
> > > > >   *   eBPF objects (maps and programs) can be shared between processes.
> > > > >   *
> > > > > @@ -900,6 +910,7 @@ enum bpf_cmd {
> > > > >       BPF_ITER_CREATE,
> > > > >       BPF_LINK_DETACH,
> > > > >       BPF_PROG_BIND_MAP,
> > > > > +     BPF_TOKEN_CREATE,
> > > > >  };
> > > > >
> > > > >  enum bpf_map_type {
> > > > > @@ -1169,6 +1180,24 @@ enum bpf_link_type {
> > > > >   */
> > > > >  #define BPF_F_KPROBE_MULTI_RETURN    (1U << 0)
> > > > >
> > > > > +/* BPF_TOKEN_CREATE command flags
> > > > > + */
> > > > > +enum {
> > > > > +     /* Ignore unrecognized bits in token_create.allowed_cmds bit set.  If
> > > > > +      * this flag is set, kernel won't return -EINVAL for a bit
> > > > > +      * corresponding to a non-existing command or the one that doesn't
> > > > > +      * support BPF token passing. This flags allows application to request
> > > > > +      * BPF token creation for a desired set of commands without worrying
> > > > > +      * about older kernels not supporting some of the commands.
> > > > > +      * Presumably, deployed applications will do separate feature
> > > > > +      * detection and will avoid calling not-yet-supported bpf() commands,
> > > > > +      * so this BPF token will work equally well both on older and newer
> > > > > +      * kernels, even if some of the requested commands won't be BPF
> > > > > +      * token-enabled.
> > > > > +      */
> > > > > +     BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS           = 1U << 0,
> > > > > +};
> > > > > +
> > > > >  /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
> > > > >   * the following extensions:
> > > > >   *
> > > > > @@ -1621,6 +1650,19 @@ union bpf_attr {
> > > > >               __u32           flags;          /* extra flags */
> > > > >       } prog_bind_map;
> > > > >
> > > > > +     struct { /* struct used by BPF_TOKEN_CREATE command */
> > > > > +             __u32           flags;
> > > > > +             __u32           token_fd;
> > > > > +             /* a bit set of allowed bpf() syscall commands,
> > > > > +              * e.g., (1ULL << BPF_TOKEN_CREATE) | (1ULL << BPF_PROG_LOAD)
> > > > > +              * will allow creating derived BPF tokens and loading new BPF
> > > > > +              * programs;
> > > > > +              * see also BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS for its effect on
> > > > > +              * validity checking of this set
> > > > > +              */
> > > > > +             __u64           allowed_cmds;
> > > > > +     } token_create;
> > > >
> > > > Do you think this might eventually grow into something like
> > > > "allow only lookup operation for this specific map"? If yes, maybe it
> > >
> > > If it was strictly up for me, then no. I think fine-granular and
> > > highly-dynamic restrictions are more the (BPF) LSM domain. In practice
> > > I envision that users will use a combination of BPF token to specify
> > > what BPF functionality can be used by applications in "broad strokes",
> > > e.g., specifying that only networking programs and
> > > ARRAY/HASHMAP/SK_STORAGE maps can be used, but disallow most of
> > > tracing functionality. And then on top of that LSM can be utilized to
> > > provide more nuanced (and as I said, more dynamic) controls over what
> > > operations over BPF map application can perform.
> >
> > In this case, why not fully embrace lsm here?
> >
> > Maybe all we really need is:
> > - a BPF_TOKEN_CREATE command (without any granularity)
> > - a holder of the token (passed as you're suggesting, via new uapi
> >   field) would be equivalent to capable(CAP_BPF)
> > - security_bpf() will provide fine-grained control
> > - extend landlock to provide coarse-grained policy (and later
> >   finer granularity)
> >
> > ?
> 
> That's one option, yes. But I got the feeling at LSF/MM/BPF that
> people are worried about having a BPF token that allows the entire
> bpf() syscall with no control. I think this coarse-grained control
> strikes a reasonable and pragmatic balance, but I'm open to just going
> all in. :)

Sounds good, let's see what the rest of the folks think.

> > Or we still want the token to carry the policy somehow? (why? because
> > of the filesystem pinning?)
> 
> I think it's nice to be able to say "this application can only do
> networking programs and no fancy data structures" with purely BPF
> token, with no BPF LSM involved. Or on the other hand, "just tracing,
> no networking" for another class of programs.
> 
> LSM and BPF LSM is definitely a more logistical hurdle, so if it can
> be avoided in some scenarios, that seems like a win.

Agreed, although it's pretty hard to define what a networking program
really is nowadays. Our networking programs have a bunch of tracepoints
in them. That's why I'm leaning a bit towards pushing all this policy
stuff to LSM.

> > > If you look at the final set of token_create parameters, you can see
> > > that I only aim to control and restrict BPF commands that are creating
> > > new BPF objects (BTF, map, prog; we might do similar stuff for links
> > > later, perhaps) only. In that sense BPF token controls "constructors",
> > > while if users want to control operation on BPF objects that were
> > > created (maps and programs, for the most part), I see this a bit
> > > outside of BPF token scope. I also don't think we should do more
> > > fine-grained control of construction parameters. E.g., I think it's
> > > too much to enforce which attach_btf_id can be provided.
> > >
> > > It's all code, though, so we could push it in any direction we want,
> > > but in my view BPF token is about a somewhat static prescription of
> > > what bpf() functionality is accessible to the application, broadly.
> > > And LSM can complement it with more dynamic abilities.
> >
> > Are you planning to follow up with the other, non-constructing commands?
> > Somebody here recently was proposing to namespacify CAP_BPF, something
> > like a read-only-capable token should, in theory, solve it?
> 
> Maybe for LINK_CREATE. Most other commands are already unprivileged
> and rely on FD (prog, map, link) availability as a proof of being able
> to work with that object. GET_FD_BY_ID is another candidate for BPF
> token, but I wanted to get real production feedback before making
> exact decisions here.

Yeah, I was mostly talking about GET_FD_BY_ID, let's follow up
separately.

> 
> >
> > > > makes sense to separate token-create vs token-add-capability operations?
> > > >
> > > > BPF_TOKEN_CREATE would create a token that can't do anything. Then you
> > > > would call a bunch of BPF_TOKEN_ALLOW with maybe op=SYSCALL_CMD
> > > > value=BPF_TOKEN_CREATE.
> > > >
> > > > This will be more future-proof plus won't really depend on having a
> > > > bitmask in the uapi. Then the users will be able to handle
> > > > BPF_TOKEN_ALLOW{op=SYSCALL_CMD value=SOME_VALUE_NOT_SUPPORTED_ON_THIS_KERNEL}
> > > > themselves (IOW, BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS won't be needed).
> > >
> > > So I very much intentionally wanted to keep the BPF token immutable
> > > once created. This makes it simple to reason about what BPF token
> > > allows and guarantee that it won't change after the fact. It's doable
> > > to make BPF token mutable and then "finalize" it (and BPF_MAP_FREEZE
> > > stands as a good reminder of races and complications such model
> > > introduces), creating a sort of builder pattern APIs, but that seems
> > > like an overkill and unnecessary complication.
> > >
> > > But let me address that "more future-proof" part. What about our
> > > binary bpf_attr extensible approach is not future proof? In both cases
> > > we'll have to specify as part of UAPI that there is a possibility to
> > > restrict a set of bpf() syscall commands, right? In one case you'll do
> > > it through multiple syscall invocations, while I chose a
> > > straightforward bit mask. I could have done it as a pointer to an
> > > array of `enum bpf_cmd` items, but I think it's extremely unlikely
> > > we'll get to >64, ever. But even if we do, adding `u64 allowed_cmds2`
> > > doesn't seem like a big deal to me.
> > >
> > > The main point though, both approaches are equally extensible. But
> > > making BPF token mutable adds a lot of undesirable (IMO)
> > > complications.
> > >
> > >
> > > As for BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS, I'm thinking of dropping such
> > > flags for simplicity.
> >
> > Ack, I just hope we're not inventing another landlock here. As mentioned
> > above, maybe doing simple BPF_TOKEN_CREATE + pushing the rest of the
> > policy into lsm/landlock is a good alternative, idk.
> 
> The biggest blocker today is incompatibility of BPF usage with user
> namespaces. Having a simple BPF token would allow us to make progress
> here. The rest is just something to strike a balance with, yep.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH RESEND bpf-next 01/18] bpf: introduce BPF token object
  2023-06-06 16:58           ` Stanislav Fomichev
@ 2023-06-06 17:04             ` Andrii Nakryiko
  0 siblings, 0 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2023-06-06 17:04 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Andrii Nakryiko, bpf, linux-security-module, keescook, brauner,
	lennart, cyphar, luto

On Tue, Jun 6, 2023 at 9:58 AM Stanislav Fomichev <sdf@google.com> wrote:
>
> On 06/05, Andrii Nakryiko wrote:
> > On Mon, Jun 5, 2023 at 2:48 PM Stanislav Fomichev <sdf@google.com> wrote:
> > >
> > > On 06/05, Andrii Nakryiko wrote:
> > > > On Fri, Jun 2, 2023 at 6:32 PM Stanislav Fomichev <sdf@google.com> wrote:
> > > > >
> > > > > On 06/02, Andrii Nakryiko wrote:
> > > > > > Add new kind of BPF kernel object, BPF token. BPF token is meant to to
> > > > > > allow delegating privileged BPF functionality, like loading a BPF
> > > > > > program or creating a BPF map, from privileged process to a *trusted*
> > > > > > unprivileged process, all while have a good amount of control over which
> > > > > > privileged operation could be done using provided BPF token.
> > > > > >
> > > > > > This patch adds new BPF_TOKEN_CREATE command to bpf() syscall, which
> > > > > > allows to create a new BPF token object along with a set of allowed
> > > > > > commands. Currently only BPF_TOKEN_CREATE command itself can be
> > > > > > delegated, but other patches gradually add ability to delegate
> > > > > > BPF_MAP_CREATE, BPF_BTF_LOAD, and BPF_PROG_LOAD commands.
> > > > > >
> > > > > > The above means that BPF token creation can be allowed by another
> > > > > > existing BPF token, if original privileged creator allowed that. New
> > > > > > derived BPF token cannot be more powerful than the original BPF token.
> > > > > >
> > > > > > The BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS flag is added to allow an application
> > > > > > to express "all supported BPF commands should be allowed" without worrying
> > > > > > about which subset of the desired commands is actually supported by a
> > > > > > potentially outdated kernel. Allowing these semantics doesn't seem to
> > > > > > introduce any backwards compatibility issues or any risk of abusing or
> > > > > > misusing the bit set field, but it makes the backwards compatibility story
> > > > > > for various applications and tools much more straightforward, making it
> > > > > > unnecessary to probe support for each individual possible bit. This is
> > > > > > especially useful in follow-up patches where we add BPF map type and prog
> > > > > > type bit sets.
> > > > > >
> > > > > > Lastly, a BPF token can be pinned in and retrieved from BPF FS, just like
> > > > > > progs, maps, BTFs, and links. This allows applications (like container
> > > > > > managers) to share a BPF token with other applications through the file system
> > > > > > just like any other BPF object, and further control access to it using
> > > > > > file system permissions, if desired.
> > > > > >
> > > > > > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > > > > > ---
> > > > > >  include/linux/bpf.h            |  34 +++++++++
> > > > > >  include/uapi/linux/bpf.h       |  42 ++++++++++++
> > > > > >  kernel/bpf/Makefile            |   2 +-
> > > > > >  kernel/bpf/inode.c             |  26 +++++++
> > > > > >  kernel/bpf/syscall.c           |  74 ++++++++++++++++++++
> > > > > >  kernel/bpf/token.c             | 122 +++++++++++++++++++++++++++++++++
> > > > > >  tools/include/uapi/linux/bpf.h |  40 +++++++++++
> > > > > >  7 files changed, 339 insertions(+), 1 deletion(-)
> > > > > >  create mode 100644 kernel/bpf/token.c
> > > > > >
> > > > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > > > > index f58895830ada..fe6d51c3a5b1 100644
> > > > > > --- a/include/linux/bpf.h
> > > > > > +++ b/include/linux/bpf.h
> > > > > > @@ -51,6 +51,7 @@ struct module;
> > > > > >  struct bpf_func_state;
> > > > > >  struct ftrace_ops;
> > > > > >  struct cgroup;
> > > > > > +struct bpf_token;
> > > > > >
> > > > > >  extern struct idr btf_idr;
> > > > > >  extern spinlock_t btf_idr_lock;
> > > > > > @@ -1533,6 +1534,12 @@ struct bpf_link_primer {
> > > > > >       u32 id;
> > > > > >  };
> > > > > >
> > > > > > +struct bpf_token {
> > > > > > +     struct work_struct work;
> > > > > > +     atomic64_t refcnt;
> > > > > > +     u64 allowed_cmds;
> > > > > > +};
> > > > > > +
> > > > > >  struct bpf_struct_ops_value;
> > > > > >  struct btf_member;
> > > > > >
> > > > > > @@ -2077,6 +2084,15 @@ struct file *bpf_link_new_file(struct bpf_link *link, int *reserved_fd);
> > > > > >  struct bpf_link *bpf_link_get_from_fd(u32 ufd);
> > > > > >  struct bpf_link *bpf_link_get_curr_or_next(u32 *id);
> > > > > >
> > > > > > +void bpf_token_inc(struct bpf_token *token);
> > > > > > +void bpf_token_put(struct bpf_token *token);
> > > > > > +struct bpf_token *bpf_token_alloc(void);
> > > > > > +int bpf_token_new_fd(struct bpf_token *token);
> > > > > > +struct bpf_token *bpf_token_get_from_fd(u32 ufd);
> > > > > > +
> > > > > > +bool bpf_token_capable(const struct bpf_token *token, int cap);
> > > > > > +bool bpf_token_allow_cmd(const struct bpf_token *token, enum bpf_cmd cmd);
> > > > > > +
> > > > > >  int bpf_obj_pin_user(u32 ufd, int path_fd, const char __user *pathname);
> > > > > >  int bpf_obj_get_user(int path_fd, const char __user *pathname, int flags);
> > > > > >
> > > > > > @@ -2436,6 +2452,24 @@ static inline int bpf_obj_get_user(const char __user *pathname, int flags)
> > > > > >       return -EOPNOTSUPP;
> > > > > >  }
> > > > > >
> > > > > > +static inline void bpf_token_inc(struct bpf_token *token)
> > > > > > +{
> > > > > > +}
> > > > > > +
> > > > > > +static inline void bpf_token_put(struct bpf_token *token)
> > > > > > +{
> > > > > > +}
> > > > > > +
> > > > > > +static inline int bpf_token_new_fd(struct bpf_token *token)
> > > > > > +{
> > > > > > +     return -EOPNOTSUPP;
> > > > > > +}
> > > > > > +
> > > > > > +static inline struct bpf_token *bpf_token_get_from_fd(u32 ufd)
> > > > > > +{
> > > > > > +     return ERR_PTR(-EOPNOTSUPP);
> > > > > > +}
> > > > > > +
> > > > > >  static inline void __dev_flush(void)
> > > > > >  {
> > > > > >  }
> > > > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > > > > index 9273c654743c..01ab79f2ad9f 100644
> > > > > > --- a/include/uapi/linux/bpf.h
> > > > > > +++ b/include/uapi/linux/bpf.h
> > > > > > @@ -846,6 +846,16 @@ union bpf_iter_link_info {
> > > > > >   *           Returns zero on success. On error, -1 is returned and *errno*
> > > > > >   *           is set appropriately.
> > > > > >   *
> > > > > > + * BPF_TOKEN_CREATE
> > > > > > + *   Description
> > > > > > + *           Create a BPF token with embedded information about what
> > > > > > + *           BPF-related functionality is allowed. This BPF token can be
> > > > > > + *           passed as an extra parameter to various bpf() syscall commands.
> > > > > > + *
> > > > > > + *   Return
> > > > > > + *           A new file descriptor (a nonnegative integer), or -1 if an
> > > > > > + *           error occurred (in which case, *errno* is set appropriately).
> > > > > > + *
> > > > > >   * NOTES
> > > > > >   *   eBPF objects (maps and programs) can be shared between processes.
> > > > > >   *
> > > > > > @@ -900,6 +910,7 @@ enum bpf_cmd {
> > > > > >       BPF_ITER_CREATE,
> > > > > >       BPF_LINK_DETACH,
> > > > > >       BPF_PROG_BIND_MAP,
> > > > > > +     BPF_TOKEN_CREATE,
> > > > > >  };
> > > > > >
> > > > > >  enum bpf_map_type {
> > > > > > @@ -1169,6 +1180,24 @@ enum bpf_link_type {
> > > > > >   */
> > > > > >  #define BPF_F_KPROBE_MULTI_RETURN    (1U << 0)
> > > > > >
> > > > > > +/* BPF_TOKEN_CREATE command flags
> > > > > > + */
> > > > > > +enum {
> > > > > > +     /* Ignore unrecognized bits in token_create.allowed_cmds bit set.  If
> > > > > > +      * this flag is set, the kernel won't return -EINVAL for a bit
> > > > > > +      * corresponding to a non-existent command or one that doesn't
> > > > > > +      * support BPF token passing. This flag allows an application to request
> > > > > > +      * BPF token creation for a desired set of commands without worrying
> > > > > > +      * about older kernels not supporting some of the commands.
> > > > > > +      * Presumably, deployed applications will do separate feature
> > > > > > +      * detection and will avoid calling not-yet-supported bpf() commands,
> > > > > > +      * so this BPF token will work equally well both on older and newer
> > > > > > +      * kernels, even if some of the requested commands won't be BPF
> > > > > > +      * token-enabled.
> > > > > > +      */
> > > > > > +     BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS           = 1U << 0,
> > > > > > +};
> > > > > > +
> > > > > >  /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
> > > > > >   * the following extensions:
> > > > > >   *
> > > > > > @@ -1621,6 +1650,19 @@ union bpf_attr {
> > > > > >               __u32           flags;          /* extra flags */
> > > > > >       } prog_bind_map;
> > > > > >
> > > > > > +     struct { /* struct used by BPF_TOKEN_CREATE command */
> > > > > > +             __u32           flags;
> > > > > > +             __u32           token_fd;
> > > > > > +             /* a bit set of allowed bpf() syscall commands,
> > > > > > +              * e.g., (1ULL << BPF_TOKEN_CREATE) | (1ULL << BPF_PROG_LOAD)
> > > > > > +              * will allow creating derived BPF tokens and loading new BPF
> > > > > > +              * programs;
> > > > > > +              * see also BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS for its effect on
> > > > > > +              * validity checking of this set
> > > > > > +              */
> > > > > > +             __u64           allowed_cmds;
> > > > > > +     } token_create;
> > > > >
> > > > > Do you think this might eventually grow into something like
> > > > > "allow only lookup operation for this specific map"? If yes, maybe it
> > > >
> > > > If it were strictly up to me, then no. I think fine-grained and
> > > > highly-dynamic restrictions are more the (BPF) LSM domain. In practice
> > > > I envision that users will use a combination of BPF token to specify
> > > > what BPF functionality can be used by applications in "broad strokes",
> > > > e.g., specifying that only networking programs and
> > > > ARRAY/HASHMAP/SK_STORAGE maps can be used, but disallow most of
> > > > tracing functionality. And then on top of that LSM can be utilized to
> > > > provide more nuanced (and as I said, more dynamic) controls over what
> > > > operations over BPF map application can perform.
> > >
> > > In this case, why not fully embrace lsm here?
> > >
> > > Maybe all we really need is:
> > > - a BPF_TOKEN_CREATE command (without any granularity)
> > > - a holder of the token (passed as you're suggesting, via new uapi
> > >   field) would be equivalent to capable(CAP_BPF)
> > > - security_bpf() will provide fine-grained control
> > > - extend landlock to provide coarse-grained policy (and later
> > >   finer granularity)
> > >
> > > ?
> >
> > That's one option, yes. But I got the feeling at LSF/MM/BPF that
> > people are worried about having a BPF token that allows the entire
> > bpf() syscall with no control. I think this coarse-grained control
> > strikes a reasonable and pragmatic balance, but I'm open to just going
> > all in. :)
>
> Sg, let's see what the rest of the folks think.
>

Cool, thanks.

> > > Or we still want the token to carry the policy somehow? (why? because
> > > of the filesystem pinning?)
> >
> > I think it's nice to be able to say "this application can only do
> > networking programs and no fancy data structures" with purely BPF
> > token, with no BPF LSM involved. Or on the other hand, "just tracing,
> > no networking" for another class of programs.
> >
> > LSM and BPF LSM is definitely a more logistical hurdle, so if it can
> > be avoided in some scenarios, that seems like a win.
>
> Agreed, although it's pretty hard to define what networking program
> really is nowadays. Our networking programs have a bunch of tracepoints
> in them. That's why I'm a bit tilting towards pushing all this policy
> stuff to lsm.

Agreed about tracing becoming part of typical networking applications.
In my defense, I'm trying to make it easy to say "don't care, allow
any program type" with an `allowed_prog_types = ~0;` mask, so if there
is nothing to be gained there, it's easy for the user to opt out. But
for simpler cases where we do not want kprobes to be used, for example,
it seems nice (and simple) to create a BPF token that disallows kprobes
(as one random example; the same can be said about struct_ops or any
other program type).

>
> > > > If you look at the final set of token_create parameters, you can see
> > > > that I only aim to control and restrict BPF commands that are creating
> > > > new BPF objects (BTF, map, prog; we might do similar stuff for links
> > > > later, perhaps) only. In that sense BPF token controls "constructors",
> > > > while if users want to control operation on BPF objects that were
> > > > created (maps and programs, for the most part), I see this a bit
> > > > outside of BPF token scope. I also don't think we should do more
> > > > fine-grained control of construction parameters. E.g., I think it's
> > > > too much to enforce which attach_btf_id can be provided.
> > > >
> > > > It's all code, though, so we could push it in any direction we want,
> > > > but in my view BPF token is about a somewhat static prescription of
> > > > what bpf() functionality is accessible to the application, broadly.
> > > > And LSM can complement it with more dynamic abilities.
> > >
> > > Are you planning to follow up with the other, non-constructing commands?
> > > Somebody here recently was proposing to namespacify CAP_BPF, something
> > > like a read-only-capable token should, in theory, solve it?
> >
> > Maybe for LINK_CREATE. Most other commands are already unprivileged
> > and rely on FD (prog, map, link) availability as a proof of being able
> > to work with that object. GET_FD_BY_ID is another candidate for BPF
> > token, but I wanted to get real production feedback before making
> > exact decisions here.
>
> Yeah, I was mostly talking about GET_FD_BY_ID, let's follow up
> separately.

Sounds good. I'm sure we'll have to tweak and adjust as this gets
applied in production.

>
> >
> > >
> > > > > makes sense to separate token-create vs token-add-capability operations?
> > > > >
> > > > > BPF_TOKEN_CREATE would create a token that can't do anything. Then you
> > > > > would call a bunch of BPF_TOKEN_ALLOW with maybe op=SYSCALL_CMD
> > > > > value=BPF_TOKEN_CREATE.
> > > > >
> > > > > This will be more future-proof plus won't really depend on having a
> > > > > bitmask in the uapi. Then the users will be able to handle
> > > > > BPF_TOKEN_ALLOW{op=SYSCALL_CMD value=SOME_VALUE_NOT_SUPPORTED_ON_THIS_KERNEL}
> > > > > themselves (IOW, BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS won't be needed).
> > > >
> > > > So I very much intentionally wanted to keep the BPF token immutable
> > > > once created. This makes it simple to reason about what BPF token
> > > > allows and guarantee that it won't change after the fact. It's doable
> > > > to make BPF token mutable and then "finalize" it (and BPF_MAP_FREEZE
> > > > stands as a good reminder of races and complications such model
> > > > introduces), creating a sort of builder pattern APIs, but that seems
> > > > like an overkill and unnecessary complication.
> > > >
> > > > But let me address that "more future-proof" part. What about our
> > > > binary bpf_attr extensible approach is not future proof? In both cases
> > > > we'll have to specify as part of UAPI that there is a possibility to
> > > > restrict a set of bpf() syscall commands, right? In one case you'll do
> > > > it through multiple syscall invocations, while I chose a
> > > > straightforward bit mask. I could have done it as a pointer to an
> > > > array of `enum bpf_cmd` items, but I think it's extremely unlikely
> > > > we'll get to >64, ever. But even if we do, adding `u64 allowed_cmds2`
> > > > doesn't seem like a big deal to me.
> > > >
> > > > The main point though, both approaches are equally extensible. But
> > > > making BPF token mutable adds a lot of undesirable (IMO)
> > > > complications.
> > > >
> > > >
> > > > As for BPF_F_TOKEN_IGNORE_UNKNOWN_CMDS, I'm thinking of dropping such
> > > > flags for simplicity.
> > >
> > > Ack, I just hope we're not inventing another landlock here. As mentioned
> > > above, maybe doing simple BPF_TOKEN_CREATE + pushing the rest of the
> > > policy into lsm/landlock is a good alternative, idk.
> >
> > The biggest blocker today is incompatibility of BPF usage with user
> > namespaces. Having a simple BPF token would allow us to make progress
> > here. The rest is just something to strike a balance with, yep.


* Re: [PATCH RESEND bpf-next 00/18] BPF token
  2023-06-06 16:38           ` Andrii Nakryiko
@ 2023-06-06 20:13             ` Casey Schaufler
  0 siblings, 0 replies; 36+ messages in thread
From: Casey Schaufler @ 2023-06-06 20:13 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, bpf, linux-security-module, keescook, brauner,
	lennart, cyphar, luto, Casey Schaufler

On 6/6/2023 9:38 AM, Andrii Nakryiko wrote:
> On Mon, Jun 5, 2023 at 5:06 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>> On 6/5/2023 4:12 PM, Andrii Nakryiko wrote:
>>> On Mon, Jun 5, 2023 at 3:26 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>>>> On 6/5/2023 1:41 PM, Andrii Nakryiko wrote:
>>>>> On Fri, Jun 2, 2023 at 8:55 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
>>>>>> On 6/2/2023 7:59 AM, Andrii Nakryiko wrote:
>>>>>>> *Resending with trimmed CC list because original version didn't make it to
>>>>>>> the mailing list.*
>>>>>>>
>>>>>>> This patch set introduces a new BPF object, BPF token, which allows delegating
>>>>>>> a subset of BPF functionality from a privileged system-wide daemon (e.g.,
>>>>>>> systemd or any other container manager) to a *trusted* unprivileged
>>>>>>> application. Trust is the key here. This functionality is not about allowing
>>>>>>> unconditional unprivileged BPF usage. Establishing trust, though, is
>>>>>>> completely up to the discretion of respective privileged application that
>>>>>>> would create a BPF token.
>>>>>> Token based privilege has a number of well understood weaknesses,
>>>>>> none of which I see addressed here. I also have a real problem with
>>>>> Can you please provide some more details about those weaknesses? Hard
>>>>> to respond without knowing exactly what we are talking about.
>>>> Privileged Process (PP) sends a Token to Trusted Process (TP).
>>>> TP sends the Token along to Untrusted Process, which performs nefarious
>>>> deeds.
>>>>
>>>> Privileged Process (PP) sends a Token to Trusted Process (TP).
>>>> TP uses Token, and then saves it in its toolbox. PP later sends
>>>> TP a different Token. TP realizes that with the combination of
>>>> Tokens it now has it can do considerably more than what PP
>>>> intended in either of the cases it sent Token for. TP performs
>>>> nefarious deeds.
>>>>
>>>> Granted, in both cases TP does not deserve to be trusted.
>>> Right, exactly. The intended use case here is a controlled production
>>> containerized environment, where the container manager is privileged
>>> and controls which applications are run inside the container. These
>>> are coming from applications that are code reviewed and controlled by
>>> whichever organization.
>> I understand the intended use case. You have to allow for unintended abuse
>> cases when you implement a security mechanism. You can't wave your hand and
>> say that everything that is trusted is worthy of trust. You have to have
>> mechanism to ensure that. The existing security mechanisms (uids, capabilities
>> and so forth) have explicit criteria for how they delegate privilege, and
>> what the consequences of doing so might be.
>>
> I'm sorry, I'm failing to see the point you are trying to make. Any
> API (especially privileged ones, like BPF_TOKEN_CREATE) can be misused
> or abused, but we still grant root permissions to various production
> processes, right? So I'm not sure where this is going. If you have
> something specific in mind, please do tell.

I'm sorry too. All the privilege mechanisms we currently have are tightly
controlled regarding how a process can acquire privilege. What you're
suggesting is a mechanism whereby a process can acquire privilege without
any constraint. Further, it's a feature of your mechanism that the
process that grants privilege is unconstrained in how it does so, the
judgement being completely at the discretion of the program.

It's as if you've proposed that a privileged process could change the
uid of process 42 by doing chown on /proc/42. Is the problem with that
difficult to see?

>
>>>> Because TP does not run with privilege of its own, it is not
>>>> treated with the same level of caution as it would be if it did.
>>>>
>>>> Privileged Process (PP) sends a Token to what it thinks is a Trusted
>>>> Process (TP) but is in fact an Imposter Process (IP) that has been
>>>> enabled on the system using any number of K33L techniques.
>>> So if there is a probability of Imposter Process, neither BPF token
>>> nor CAP_BPF should be granted at all. In production no one gives
>>> CAP_BPF to processes that we cannot be reasonably sure is safe to use
>>> BPF. As I mentioned in the cover letter, BPF token is not a mechanism
>>> to implement unprivileged BPF.
>> You're correct, PP *shouldn't* grant IP a Token. But it *can* do so.
>> Think of the military definition of a threat. It's what the other guy
>> is capable of doing to you, not what the other guy is expected to do to you.
>> PP is capable to giving a Token to IP, even if PP does not intend to.
>>
>>> What I'm trying to achieve here is instead of needing to grant root
>>> capabilities to any (trusted, otherwise no one would do this)
>>> BPF-using application, we'd like to grant BPF token which is more
>>> limited in scope and gives much less privileges to do anything with
>>> the system. And, crucially, CAP_BPF is incompatible with user
>>> namespaces, while BPF token is.
>> I get that. Dynamically increasing a process' privilege (TP) from an
>> external source (PP) without somehow marking TP as worthy of the privilege
>> is going to be insanely dangerous. Even in well controlled environments.
>>
>>> Basically, I'd like to go from having root/CAP_BPF processes in init
>>> namespace, to have unprivileged processes under user namespace, but
>>> with BPF token that would still allow to do them controlled (through
>>> combination of code reviews, audit, and security enforcements) BPF
>>> usage.
>> And the problem there is that if you put the feature in the kernel
>> you can assume that some number of people will use it without code
>> reviews, audit or security enforcement. Of course you can call out
>> "user error", but someone is going to want you to "fix" it.
>>
>>>> I don't see anything that ensures that PP communicates Tokens only
>>>> to TP, nor any criteria for "trust" are met.
>>> This should be up to PP how to organize this and will differ in
>>> different production setups. E.g., for something like systemd or
>>> container manager, one way to communicate this is to create a
>>> dedicated instance of BPF FS, pin BPF token in it, and expose that
>>> specific instance of BPF FS in the container's mount namespace.
>> I have no doubt that you can make a system that works and works correctly.
>> I'm saying that it's very easy to create a system that had easily exploited
>> security holes. You won't do it, but someone who didn't design your
>> mechanism will.
>>
> As I mentioned above, I'm failing to see where this is going... If you
> give a process CAP_SYS_ADMIN capabilities, it can create a BPF token.
> And it can pass it to some other process either through Unix domain
> socket (SCM_RIGHTS), or BPF FS. If you can't be sure the privileged
> process will do the right thing -- don't give it CAP_SYS_ADMIN. If you
> did, don't blame the API's existence for your misuse of it.
>
> There are many privileged APIs, they exist for a reason, and yes, they
> can be dangerous (which is why they are privileged), but they help to
> solve real problems. Same here, we need something like a BPF token to
> allow use of BPF within containers, and to do that in a safer way than
> granting CAP_SYS_ADMIN, CAP_BPF, etc. capabilities.

Containers are not kernel constructs. 

>
>>>> Those are the issues I'm most familiar with, although I believe
>>>> there are others.
>>>>
>>>>>> the notion of "trusted unprivileged" where trust is established by
>>>>>> a user space application. Ignoring the possibility of malicious code
>>>>>> for the moment, the opportunity for accidental privilege leakage is
>>>>>> huge. It would be trivial (and tempting) to create a privileged BPF
>>>>>> "shell" that would then be allowed to "trust" any application and
>>>>>> run it with privilege by passing it a token.
>>>>> Right now most BPF applications are running as real root in
>>>>> production. Users have to trust such applications to not do anything
>>>>> bad with their full root capabilities. How it is done depends on
>>>>> specific production and organizational setups, and could be code
>>>>> reviewing, audits, LSM, etc. So in that sense BPF token doesn't make
>>>>> things worse. And it actually allows us to improve the situation by
>>>>> creating and sharing more restrictive BPF tokens that limit what bpf()
>>>>> syscall parts are allowed to be used.
>>>>>
>>>>>>> The main motivation for BPF token is a desire to enable containerized
>>>>>>> BPF applications to be used together with user namespaces. This is currently
>>>>>>> impossible, as CAP_BPF, required for BPF subsystem usage, cannot be namespaced
>>>>>>> or sandboxed, as a general rule. E.g., tracing BPF programs, thanks to BPF
>>>>>>> helpers like bpf_probe_read_kernel() and bpf_probe_read_user() can safely read
>>>>>>> arbitrary memory, and it's impossible to ensure that they only read memory of
>>>>>>> processes belonging to any given namespace. This means that it's impossible to
>>>>>>> have namespace-aware CAP_BPF capability, and as such another mechanism to
>>>>>>> allow safe usage of BPF functionality is necessary. BPF token and delegation
>>>>>>> of it to trusted unprivileged applications is such a mechanism. Kernel makes
>>>>>>> no assumption about what "trusted" constitutes in any particular case, and
>>>>>>> it's up to specific privileged applications and their surrounding
>>>>>>> infrastructure to decide that. What kernel provides is a set of APIs to create
>>>>>>> and tune BPF token, and pass it around to privileged BPF commands that are
>>>>>>> creating new BPF objects like BPF programs, BPF maps, etc.
>>>>>>>
>>>>>>> Previous attempt at addressing this very same problem ([0]) attempted to
>>>>>>> utilize authoritative LSM approach, but was conclusively rejected by upstream
>>>>>>> LSM maintainers. BPF token concept is not changing anything about LSM
>>>>>>> approach, but can be combined with LSM hooks for very fine-grained security
>>>>>>> policy. Some ideas about making BPF token more convenient to use with LSM (in
>>>>>>> particular custom BPF LSM programs) were briefly described in the recent
>>>>>>> LSF/MM/BPF 2023 presentation ([1]), e.g., the ability to specify user-provided
>>>>>>> data (context), which in combination with BPF LSM would allow implementing very
>>>>>>> dynamic and fine-grained custom security policies on top of BPF token. In the
>>>>>>> interest of minimizing API surface area discussions, this is going to be
>>>>>>> added in follow up patches, as it's not essential to the fundamental concept
>>>>>>> of delegatable BPF token.
>>>>>>>
>>>>>>> It should be noted that BPF token is conceptually quite similar to the idea of
>>>>>>> /dev/bpf device file, proposed by Song a while ago ([2]). The biggest
>>>>>>> difference is the idea of using virtual anon_inode file to hold BPF token and
>>>>>>> allowing multiple independent instances of them, each with its own set of
>>>>>>> restrictions. BPF pinning solves the problem of exposing such BPF token
>>>>>>> through file system (BPF FS, in this case) for cases where transferring FDs
>>>>>>> over Unix domain sockets is not convenient. And also, crucially, BPF token
>>>>>>> approach is not using any special stateful task-scoped flags. Instead, bpf()
>>>>>>> syscall accepts token_fd parameters explicitly for each relevant BPF command.
>>>>>>> This addresses main concerns brought up during the /dev/bpf discussion, and
>>>>>>> fits better with overall BPF subsystem design.
>>>>>>>
>>>>>>> This patch set adds a basic minimum of functionality to make BPF token useful
>>>>>>> and to discuss API and functionality. Currently only low-level libbpf APIs
>>>>>>> support passing BPF token around, which allows testing kernel functionality but
>>>>>>> for the most part is not sufficient for real-world applications, which
>>>>>>> typically use high-level libbpf APIs based on `struct bpf_object` type. This
>>>>>>> was done with the intent to limit the size of patch set and concentrate on
>>>>>>> mostly kernel-side changes. All the necessary plumbing for libbpf will be sent
>>>>>>> as a separate follow-up patch set once kernel support makes it upstream.
>>>>>>>
>>>>>>> Another part that should happen once kernel-side BPF token is established, is
>>>>>>> a set of conventions between applications (e.g., systemd), tools (e.g.,
>>>>>>> bpftool), and libraries (e.g., libbpf) about sharing BPF tokens through BPF FS
>>>>>>> at well-defined locations to allow applications to take advantage of this in
>>>>>>> automatic fashion without explicit code changes on BPF application's side.
>>>>>>> But I'd like to postpone this discussion to after BPF token concept lands.
>>>>>>>
>>>>>>>   [0] https://lore.kernel.org/bpf/20230412043300.360803-1-andrii@kernel.org/
>>>>>>>   [1] http://vger.kernel.org/bpfconf2023_material/Trusted_unprivileged_BPF_LSFMM2023.pdf
>>>>>>>   [2] https://lore.kernel.org/bpf/20190627201923.2589391-2-songliubraving@fb.com/
>>>>>>>
>>>>>>> Andrii Nakryiko (18):
>>>>>>>   bpf: introduce BPF token object
>>>>>>>   libbpf: add bpf_token_create() API
>>>>>>>   selftests/bpf: add BPF_TOKEN_CREATE test
>>>>>>>   bpf: move unprivileged checks into map_create() and bpf_prog_load()
>>>>>>>   bpf: inline map creation logic in map_create() function
>>>>>>>   bpf: centralize permissions checks for all BPF map types
>>>>>>>   bpf: add BPF token support to BPF_MAP_CREATE command
>>>>>>>   libbpf: add BPF token support to bpf_map_create() API
>>>>>>>   selftests/bpf: add BPF token-enabled test for BPF_MAP_CREATE command
>>>>>>>   bpf: add BPF token support to BPF_BTF_LOAD command
>>>>>>>   libbpf: add BPF token support to bpf_btf_load() API
>>>>>>>   selftests/bpf: add BPF token-enabled BPF_BTF_LOAD selftest
>>>>>>>   bpf: keep BPF_PROG_LOAD permission checks clear of validations
>>>>>>>   bpf: add BPF token support to BPF_PROG_LOAD command
>>>>>>>   bpf: take into account BPF token when fetching helper protos
>>>>>>>   bpf: consistenly use BPF token throughout BPF verifier logic
>>>>>>>   libbpf: add BPF token support to bpf_prog_load() API
>>>>>>>   selftests/bpf: add BPF token-enabled BPF_PROG_LOAD tests
>>>>>>>
>>>>>>>  drivers/media/rc/bpf-lirc.c                   |   2 +-
>>>>>>>  include/linux/bpf.h                           |  66 ++-
>>>>>>>  include/linux/filter.h                        |   2 +-
>>>>>>>  include/uapi/linux/bpf.h                      |  74 +++
>>>>>>>  kernel/bpf/Makefile                           |   2 +-
>>>>>>>  kernel/bpf/arraymap.c                         |   2 +-
>>>>>>>  kernel/bpf/bloom_filter.c                     |   3 -
>>>>>>>  kernel/bpf/bpf_local_storage.c                |   3 -
>>>>>>>  kernel/bpf/bpf_struct_ops.c                   |   3 -
>>>>>>>  kernel/bpf/cgroup.c                           |   6 +-
>>>>>>>  kernel/bpf/core.c                             |   3 +-
>>>>>>>  kernel/bpf/cpumap.c                           |   4 -
>>>>>>>  kernel/bpf/devmap.c                           |   3 -
>>>>>>>  kernel/bpf/hashtab.c                          |   6 -
>>>>>>>  kernel/bpf/helpers.c                          |   6 +-
>>>>>>>  kernel/bpf/inode.c                            |  26 ++
>>>>>>>  kernel/bpf/lpm_trie.c                         |   3 -
>>>>>>>  kernel/bpf/queue_stack_maps.c                 |   4 -
>>>>>>>  kernel/bpf/reuseport_array.c                  |   3 -
>>>>>>>  kernel/bpf/stackmap.c                         |   3 -
>>>>>>>  kernel/bpf/syscall.c                          | 429 ++++++++++++++----
>>>>>>>  kernel/bpf/token.c                            | 141 ++++++
>>>>>>>  kernel/bpf/verifier.c                         |  13 +-
>>>>>>>  kernel/trace/bpf_trace.c                      |   2 +-
>>>>>>>  net/core/filter.c                             |  36 +-
>>>>>>>  net/core/sock_map.c                           |   4 -
>>>>>>>  net/ipv4/bpf_tcp_ca.c                         |   2 +-
>>>>>>>  net/netfilter/nf_bpf_link.c                   |   2 +-
>>>>>>>  net/xdp/xskmap.c                              |   4 -
>>>>>>>  tools/include/uapi/linux/bpf.h                |  74 +++
>>>>>>>  tools/lib/bpf/bpf.c                           |  32 +-
>>>>>>>  tools/lib/bpf/bpf.h                           |  24 +-
>>>>>>>  tools/lib/bpf/libbpf.map                      |   1 +
>>>>>>>  .../selftests/bpf/prog_tests/libbpf_probes.c  |   4 +
>>>>>>>  .../selftests/bpf/prog_tests/libbpf_str.c     |   6 +
>>>>>>>  .../testing/selftests/bpf/prog_tests/token.c  | 282 ++++++++++++
>>>>>>>  .../bpf/prog_tests/unpriv_bpf_disabled.c      |   6 +-
>>>>>>>  37 files changed, 1098 insertions(+), 188 deletions(-)
>>>>>>>  create mode 100644 kernel/bpf/token.c
>>>>>>>  create mode 100644 tools/testing/selftests/bpf/prog_tests/token.c
>>>>>>>



Thread overview: 36+ messages
2023-06-02 14:59 [PATCH RESEND bpf-next 00/18] BPF token Andrii Nakryiko
2023-06-02 14:59 ` [PATCH RESEND bpf-next 01/18] bpf: introduce BPF token object Andrii Nakryiko
2023-06-02 17:41   ` kernel test robot
2023-06-02 20:41   ` kernel test robot
2023-06-03  1:32   ` Stanislav Fomichev
2023-06-05 20:56     ` Andrii Nakryiko
2023-06-05 21:48       ` Stanislav Fomichev
2023-06-05 23:00         ` Andrii Nakryiko
2023-06-06 16:58           ` Stanislav Fomichev
2023-06-06 17:04             ` Andrii Nakryiko
2023-06-02 14:59 ` [PATCH RESEND bpf-next 02/18] libbpf: add bpf_token_create() API Andrii Nakryiko
2023-06-02 14:59 ` [PATCH RESEND bpf-next 03/18] selftests/bpf: add BPF_TOKEN_CREATE test Andrii Nakryiko
2023-06-02 14:59 ` [PATCH RESEND bpf-next 04/18] bpf: move unprivileged checks into map_create() and bpf_prog_load() Andrii Nakryiko
2023-06-02 14:59 ` [PATCH RESEND bpf-next 05/18] bpf: inline map creation logic in map_create() function Andrii Nakryiko
2023-06-02 14:59 ` [PATCH RESEND bpf-next 06/18] bpf: centralize permissions checks for all BPF map types Andrii Nakryiko
2023-06-02 15:00 ` [PATCH RESEND bpf-next 07/18] bpf: add BPF token support to BPF_MAP_CREATE command Andrii Nakryiko
2023-06-02 15:00 ` [PATCH RESEND bpf-next 08/18] libbpf: add BPF token support to bpf_map_create() API Andrii Nakryiko
2023-06-02 15:00 ` [PATCH RESEND bpf-next 09/18] selftests/bpf: add BPF token-enabled test for BPF_MAP_CREATE command Andrii Nakryiko
2023-06-02 15:00 ` [PATCH RESEND bpf-next 10/18] bpf: add BPF token support to BPF_BTF_LOAD command Andrii Nakryiko
2023-06-02 15:00 ` [PATCH RESEND bpf-next 11/18] libbpf: add BPF token support to bpf_btf_load() API Andrii Nakryiko
2023-06-02 15:00 ` [PATCH RESEND bpf-next 12/18] selftests/bpf: add BPF token-enabled BPF_BTF_LOAD selftest Andrii Nakryiko
2023-06-02 15:00 ` [PATCH RESEND bpf-next 13/18] bpf: keep BPF_PROG_LOAD permission checks clear of validations Andrii Nakryiko
2023-06-02 15:00 ` [PATCH RESEND bpf-next 14/18] bpf: add BPF token support to BPF_PROG_LOAD command Andrii Nakryiko
2023-06-02 15:00 ` [PATCH RESEND bpf-next 15/18] bpf: take into account BPF token when fetching helper protos Andrii Nakryiko
2023-06-02 18:46   ` kernel test robot
2023-06-02 20:07     ` Andrii Nakryiko
2023-06-02 15:00 ` [PATCH RESEND bpf-next 16/18] bpf: consistenly use BPF token throughout BPF verifier logic Andrii Nakryiko
2023-06-02 15:00 ` [PATCH RESEND bpf-next 17/18] libbpf: add BPF token support to bpf_prog_load() API Andrii Nakryiko
2023-06-02 15:00 ` [PATCH RESEND bpf-next 18/18] selftests/bpf: add BPF token-enabled BPF_PROG_LOAD tests Andrii Nakryiko
2023-06-02 15:55 ` [PATCH RESEND bpf-next 00/18] BPF token Casey Schaufler
2023-06-05 20:41   ` Andrii Nakryiko
2023-06-05 22:26     ` Casey Schaufler
2023-06-05 23:12       ` Andrii Nakryiko
2023-06-06  0:05         ` Casey Schaufler
2023-06-06 16:38           ` Andrii Nakryiko
2023-06-06 20:13             ` Casey Schaufler
