linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 bpf-next 0/4] Add O_PATH-based BPF_OBJ_PIN and BPF_OBJ_GET support
@ 2023-05-22 23:29 Andrii Nakryiko
  2023-05-22 23:29 ` [PATCH v3 bpf-next 1/4] bpf: validate BPF object in BPF_OBJ_PIN before calling LSM Andrii Nakryiko
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Andrii Nakryiko @ 2023-05-22 23:29 UTC (permalink / raw)
  To: bpf, ast, daniel, martin.lau
  Cc: cyphar, brauner, lennart, linux-fsdevel, Andrii Nakryiko

Add ability to specify pinning location within BPF FS using O_PATH-based FDs,
similar to openat() family of APIs. Patch #2 adds necessary kernel-side
changes. Patch #3 exposes this through libbpf APIs. Patch #4 uses new mount
APIs (fsopen, fsconfig, fsmount) to demonstrated how now it's possible to work
with detach-mounted BPF FS using new BPF_OBJ_PIN and BPF_OBJ_GET
functionality. We also add few more tests using various combinations of
path_fd and pathname to validate proper argument propagation in kernel code.

This feature is inspired as a result of recent conversations during
LSF/MM/BPF 2023 conference about shortcomings of being able to perform BPF
objects pinning only using lookup-based paths.

v2->v3:
  - __s32 for path_fd in union bpf_attr (Christian);
  - added subtest for absolute/relative paths during pinning/getting;
  - added pre-patch moving LSM hook (security_path_mknod) around (Christian);
v1->v2:
  - add BPF_F_PATH_FD flag that should go along with path FD (Christian).

Andrii Nakryiko (4):
  bpf: validate BPF object in BPF_OBJ_PIN before calling LSM
  bpf: support O_PATH FDs in BPF_OBJ_PIN and BPF_OBJ_GET commands
  libbpf: add opts-based bpf_obj_pin() API and add support for path_fd
  selftests/bpf: add path_fd-based BPF_OBJ_PIN and BPF_OBJ_GET tests

 include/linux/bpf.h                           |   4 +-
 include/uapi/linux/bpf.h                      |  10 +
 kernel/bpf/inode.c                            |  27 +-
 kernel/bpf/syscall.c                          |  25 +-
 tools/include/uapi/linux/bpf.h                |  10 +
 tools/lib/bpf/bpf.c                           |  17 +-
 tools/lib/bpf/bpf.h                           |  18 +-
 tools/lib/bpf/libbpf.map                      |   1 +
 .../bpf/prog_tests/bpf_obj_pinning.c          | 268 ++++++++++++++++++
 9 files changed, 354 insertions(+), 26 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/bpf_obj_pinning.c

-- 
2.34.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v3 bpf-next 1/4] bpf: validate BPF object in BPF_OBJ_PIN before calling LSM
  2023-05-22 23:29 [PATCH v3 bpf-next 0/4] Add O_PATH-based BPF_OBJ_PIN and BPF_OBJ_GET support Andrii Nakryiko
@ 2023-05-22 23:29 ` Andrii Nakryiko
  2023-05-23 10:03   ` Christian Brauner
  2023-05-23 15:00   ` Daniel Borkmann
  2023-05-22 23:29 ` [PATCH v3 bpf-next 2/4] bpf: support O_PATH FDs in BPF_OBJ_PIN and BPF_OBJ_GET commands Andrii Nakryiko
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 10+ messages in thread
From: Andrii Nakryiko @ 2023-05-22 23:29 UTC (permalink / raw)
  To: bpf, ast, daniel, martin.lau
  Cc: cyphar, brauner, lennart, linux-fsdevel, Andrii Nakryiko

Do a sanity check whether provided file-to-be-pinned is actually a BPF
object (prog, map, btf) before calling security_path_mknod LSM hook. If
it's not, LSM hook doesn't have to be triggered, as the operation has no
chance of succeeding anyways.

Suggested-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 kernel/bpf/inode.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c
index 9948b542a470..329f27d5cacf 100644
--- a/kernel/bpf/inode.c
+++ b/kernel/bpf/inode.c
@@ -448,18 +448,17 @@ static int bpf_obj_do_pin(const char __user *pathname, void *raw,
 	if (IS_ERR(dentry))
 		return PTR_ERR(dentry);
 
-	mode = S_IFREG | ((S_IRUSR | S_IWUSR) & ~current_umask());
-
-	ret = security_path_mknod(&path, dentry, mode, 0);
-	if (ret)
-		goto out;
-
 	dir = d_inode(path.dentry);
 	if (dir->i_op != &bpf_dir_iops) {
 		ret = -EPERM;
 		goto out;
 	}
 
+	mode = S_IFREG | ((S_IRUSR | S_IWUSR) & ~current_umask());
+	ret = security_path_mknod(&path, dentry, mode, 0);
+	if (ret)
+		goto out;
+
 	switch (type) {
 	case BPF_TYPE_PROG:
 		ret = vfs_mkobj(dentry, mode, bpf_mkprog, raw);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v3 bpf-next 2/4] bpf: support O_PATH FDs in BPF_OBJ_PIN and BPF_OBJ_GET commands
  2023-05-22 23:29 [PATCH v3 bpf-next 0/4] Add O_PATH-based BPF_OBJ_PIN and BPF_OBJ_GET support Andrii Nakryiko
  2023-05-22 23:29 ` [PATCH v3 bpf-next 1/4] bpf: validate BPF object in BPF_OBJ_PIN before calling LSM Andrii Nakryiko
@ 2023-05-22 23:29 ` Andrii Nakryiko
  2023-05-23 10:26   ` Christian Brauner
  2023-05-22 23:29 ` [PATCH v3 bpf-next 3/4] libbpf: add opts-based bpf_obj_pin() API and add support for path_fd Andrii Nakryiko
  2023-05-22 23:29 ` [PATCH v3 bpf-next 4/4] selftests/bpf: add path_fd-based BPF_OBJ_PIN and BPF_OBJ_GET tests Andrii Nakryiko
  3 siblings, 1 reply; 10+ messages in thread
From: Andrii Nakryiko @ 2023-05-22 23:29 UTC (permalink / raw)
  To: bpf, ast, daniel, martin.lau
  Cc: cyphar, brauner, lennart, linux-fsdevel, Andrii Nakryiko

Current UAPI of BPF_OBJ_PIN and BPF_OBJ_GET commands of bpf() syscall
forces users to specify pinning location as a string-based absolute or
relative (to current working directory) path. This has various
implications related to security (e.g., symlink-based attacks), forces
BPF FS to be exposed in the file system, which can cause races with
other applications.

One of the feedbacks we got from folks working with containers heavily
was that inability to use purely FD-based location specification was an
unfortunate limitation and hindrance for BPF_OBJ_PIN and BPF_OBJ_GET
commands. This patch closes this oversight, adding path_fd field to
BPF_OBJ_PIN and BPF_OBJ_GET UAPI, following conventions established by
*at() syscalls for dirfd + pathname combinations.

This now allows interesting possibilities like working with detached BPF
FS mount (e.g., to perform multiple pinnings without running a risk of
someone interfering with them), and generally making pinning/getting
more secure and not prone to any races and/or security attacks.

This is demonstrated by a selftest added in subsequent patch that takes
advantage of new mount APIs (fsopen, fsconfig, fsmount) to demonstrate
creating detached BPF FS mount, pinning, and then getting BPF map out of
it, all while never exposing this private instance of BPF FS to outside
worlds.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 include/linux/bpf.h            |  4 ++--
 include/uapi/linux/bpf.h       | 10 ++++++++++
 kernel/bpf/inode.c             | 16 ++++++++--------
 kernel/bpf/syscall.c           | 25 ++++++++++++++++++++-----
 tools/include/uapi/linux/bpf.h | 10 ++++++++++
 5 files changed, 50 insertions(+), 15 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 36e4b2d8cca2..f58895830ada 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -2077,8 +2077,8 @@ struct file *bpf_link_new_file(struct bpf_link *link, int *reserved_fd);
 struct bpf_link *bpf_link_get_from_fd(u32 ufd);
 struct bpf_link *bpf_link_get_curr_or_next(u32 *id);
 
-int bpf_obj_pin_user(u32 ufd, const char __user *pathname);
-int bpf_obj_get_user(const char __user *pathname, int flags);
+int bpf_obj_pin_user(u32 ufd, int path_fd, const char __user *pathname);
+int bpf_obj_get_user(int path_fd, const char __user *pathname, int flags);
 
 #define BPF_ITER_FUNC_PREFIX "bpf_iter_"
 #define DEFINE_BPF_ITER_FUNC(target, args...)			\
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 1bb11a6ee667..9273c654743c 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1272,6 +1272,9 @@ enum {
 
 /* Create a map that will be registered/unregesitered by the backed bpf_link */
 	BPF_F_LINK		= (1U << 13),
+
+/* Get path from provided FD in BPF_OBJ_PIN/BPF_OBJ_GET commands */
+	BPF_F_PATH_FD		= (1U << 14),
 };
 
 /* Flags for BPF_PROG_QUERY. */
@@ -1420,6 +1423,13 @@ union bpf_attr {
 		__aligned_u64	pathname;
 		__u32		bpf_fd;
 		__u32		file_flags;
+		/* Same as dirfd in openat() syscall; see openat(2)
+		 * manpage for details of path FD and pathname semantics;
+		 * path_fd should accompanied by BPF_F_PATH_FD flag set in
+		 * file_flags field, otherwise it should be set to zero;
+		 * if BPF_F_PATH_FD flag is not set, AT_FDCWD is assumed.
+		 */
+		__s32		path_fd;
 	};
 
 	struct { /* anonymous struct used by BPF_PROG_ATTACH/DETACH commands */
diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c
index 329f27d5cacf..4174f76133df 100644
--- a/kernel/bpf/inode.c
+++ b/kernel/bpf/inode.c
@@ -435,7 +435,7 @@ static int bpf_iter_link_pin_kernel(struct dentry *parent,
 	return ret;
 }
 
-static int bpf_obj_do_pin(const char __user *pathname, void *raw,
+static int bpf_obj_do_pin(int path_fd, const char __user *pathname, void *raw,
 			  enum bpf_type type)
 {
 	struct dentry *dentry;
@@ -444,7 +444,7 @@ static int bpf_obj_do_pin(const char __user *pathname, void *raw,
 	umode_t mode;
 	int ret;
 
-	dentry = user_path_create(AT_FDCWD, pathname, &path, 0);
+	dentry = user_path_create(path_fd, pathname, &path, 0);
 	if (IS_ERR(dentry))
 		return PTR_ERR(dentry);
 
@@ -477,7 +477,7 @@ static int bpf_obj_do_pin(const char __user *pathname, void *raw,
 	return ret;
 }
 
-int bpf_obj_pin_user(u32 ufd, const char __user *pathname)
+int bpf_obj_pin_user(u32 ufd, int path_fd, const char __user *pathname)
 {
 	enum bpf_type type;
 	void *raw;
@@ -487,14 +487,14 @@ int bpf_obj_pin_user(u32 ufd, const char __user *pathname)
 	if (IS_ERR(raw))
 		return PTR_ERR(raw);
 
-	ret = bpf_obj_do_pin(pathname, raw, type);
+	ret = bpf_obj_do_pin(path_fd, pathname, raw, type);
 	if (ret != 0)
 		bpf_any_put(raw, type);
 
 	return ret;
 }
 
-static void *bpf_obj_do_get(const char __user *pathname,
+static void *bpf_obj_do_get(int path_fd, const char __user *pathname,
 			    enum bpf_type *type, int flags)
 {
 	struct inode *inode;
@@ -502,7 +502,7 @@ static void *bpf_obj_do_get(const char __user *pathname,
 	void *raw;
 	int ret;
 
-	ret = user_path_at(AT_FDCWD, pathname, LOOKUP_FOLLOW, &path);
+	ret = user_path_at(path_fd, pathname, LOOKUP_FOLLOW, &path);
 	if (ret)
 		return ERR_PTR(ret);
 
@@ -526,7 +526,7 @@ static void *bpf_obj_do_get(const char __user *pathname,
 	return ERR_PTR(ret);
 }
 
-int bpf_obj_get_user(const char __user *pathname, int flags)
+int bpf_obj_get_user(int path_fd, const char __user *pathname, int flags)
 {
 	enum bpf_type type = BPF_TYPE_UNSPEC;
 	int f_flags;
@@ -537,7 +537,7 @@ int bpf_obj_get_user(const char __user *pathname, int flags)
 	if (f_flags < 0)
 		return f_flags;
 
-	raw = bpf_obj_do_get(pathname, &type, f_flags);
+	raw = bpf_obj_do_get(path_fd, pathname, &type, f_flags);
 	if (IS_ERR(raw))
 		return PTR_ERR(raw);
 
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index b2621089904b..c7f6807215e6 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -2697,23 +2697,38 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
 	return err;
 }
 
-#define BPF_OBJ_LAST_FIELD file_flags
+#define BPF_OBJ_LAST_FIELD path_fd
 
 static int bpf_obj_pin(const union bpf_attr *attr)
 {
-	if (CHECK_ATTR(BPF_OBJ) || attr->file_flags != 0)
+	int path_fd;
+
+	if (CHECK_ATTR(BPF_OBJ) || attr->file_flags & ~BPF_F_PATH_FD)
+		return -EINVAL;
+
+	/* path_fd has to be accompanied by BPF_F_PATH_FD flag */
+	if (!(attr->file_flags & BPF_F_PATH_FD) && attr->path_fd)
 		return -EINVAL;
 
-	return bpf_obj_pin_user(attr->bpf_fd, u64_to_user_ptr(attr->pathname));
+	path_fd = attr->file_flags & BPF_F_PATH_FD ? attr->path_fd : AT_FDCWD;
+	return bpf_obj_pin_user(attr->bpf_fd, path_fd,
+				u64_to_user_ptr(attr->pathname));
 }
 
 static int bpf_obj_get(const union bpf_attr *attr)
 {
+	int path_fd;
+
 	if (CHECK_ATTR(BPF_OBJ) || attr->bpf_fd != 0 ||
-	    attr->file_flags & ~BPF_OBJ_FLAG_MASK)
+	    attr->file_flags & ~(BPF_OBJ_FLAG_MASK | BPF_F_PATH_FD))
+		return -EINVAL;
+
+	/* path_fd has to be accompanied by BPF_F_PATH_FD flag */
+	if (!(attr->file_flags & BPF_F_PATH_FD) && attr->path_fd)
 		return -EINVAL;
 
-	return bpf_obj_get_user(u64_to_user_ptr(attr->pathname),
+	path_fd = attr->file_flags & BPF_F_PATH_FD ? attr->path_fd : AT_FDCWD;
+	return bpf_obj_get_user(path_fd, u64_to_user_ptr(attr->pathname),
 				attr->file_flags);
 }
 
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 1bb11a6ee667..9273c654743c 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1272,6 +1272,9 @@ enum {
 
 /* Create a map that will be registered/unregesitered by the backed bpf_link */
 	BPF_F_LINK		= (1U << 13),
+
+/* Get path from provided FD in BPF_OBJ_PIN/BPF_OBJ_GET commands */
+	BPF_F_PATH_FD		= (1U << 14),
 };
 
 /* Flags for BPF_PROG_QUERY. */
@@ -1420,6 +1423,13 @@ union bpf_attr {
 		__aligned_u64	pathname;
 		__u32		bpf_fd;
 		__u32		file_flags;
+		/* Same as dirfd in openat() syscall; see openat(2)
+		 * manpage for details of path FD and pathname semantics;
+		 * path_fd should accompanied by BPF_F_PATH_FD flag set in
+		 * file_flags field, otherwise it should be set to zero;
+		 * if BPF_F_PATH_FD flag is not set, AT_FDCWD is assumed.
+		 */
+		__s32		path_fd;
 	};
 
 	struct { /* anonymous struct used by BPF_PROG_ATTACH/DETACH commands */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v3 bpf-next 3/4] libbpf: add opts-based bpf_obj_pin() API and add support for path_fd
  2023-05-22 23:29 [PATCH v3 bpf-next 0/4] Add O_PATH-based BPF_OBJ_PIN and BPF_OBJ_GET support Andrii Nakryiko
  2023-05-22 23:29 ` [PATCH v3 bpf-next 1/4] bpf: validate BPF object in BPF_OBJ_PIN before calling LSM Andrii Nakryiko
  2023-05-22 23:29 ` [PATCH v3 bpf-next 2/4] bpf: support O_PATH FDs in BPF_OBJ_PIN and BPF_OBJ_GET commands Andrii Nakryiko
@ 2023-05-22 23:29 ` Andrii Nakryiko
  2023-05-23 14:38   ` Daniel Borkmann
  2023-05-22 23:29 ` [PATCH v3 bpf-next 4/4] selftests/bpf: add path_fd-based BPF_OBJ_PIN and BPF_OBJ_GET tests Andrii Nakryiko
  3 siblings, 1 reply; 10+ messages in thread
From: Andrii Nakryiko @ 2023-05-22 23:29 UTC (permalink / raw)
  To: bpf, ast, daniel, martin.lau
  Cc: cyphar, brauner, lennart, linux-fsdevel, Andrii Nakryiko

Add path_fd support for bpf_obj_pin() and bpf_obj_get() operations
(through their opts-based variants). This allows to take advantage of
new kernel-side support for O_PATH-based pin/get location specification.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/lib/bpf/bpf.c      | 17 ++++++++++++++---
 tools/lib/bpf/bpf.h      | 18 ++++++++++++++++--
 tools/lib/bpf/libbpf.map |  1 +
 3 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 128ac723c4ea..ed86b37d8024 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -572,20 +572,30 @@ int bpf_map_update_batch(int fd, const void *keys, const void *values, __u32 *co
 				    (void *)keys, (void *)values, count, opts);
 }
 
-int bpf_obj_pin(int fd, const char *pathname)
+int bpf_obj_pin_opts(int fd, const char *pathname, const struct bpf_obj_pin_opts *opts)
 {
-	const size_t attr_sz = offsetofend(union bpf_attr, file_flags);
+	const size_t attr_sz = offsetofend(union bpf_attr, path_fd);
 	union bpf_attr attr;
 	int ret;
 
+	if (!OPTS_VALID(opts, bpf_obj_pin_opts))
+		return libbpf_err(-EINVAL);
+
 	memset(&attr, 0, attr_sz);
+	attr.path_fd = OPTS_GET(opts, path_fd, 0);
 	attr.pathname = ptr_to_u64((void *)pathname);
+	attr.file_flags = OPTS_GET(opts, file_flags, 0);
 	attr.bpf_fd = fd;
 
 	ret = sys_bpf(BPF_OBJ_PIN, &attr, attr_sz);
 	return libbpf_err_errno(ret);
 }
 
+int bpf_obj_pin(int fd, const char *pathname)
+{
+	return bpf_obj_pin_opts(fd, pathname, NULL);
+}
+
 int bpf_obj_get(const char *pathname)
 {
 	return bpf_obj_get_opts(pathname, NULL);
@@ -593,7 +603,7 @@ int bpf_obj_get(const char *pathname)
 
 int bpf_obj_get_opts(const char *pathname, const struct bpf_obj_get_opts *opts)
 {
-	const size_t attr_sz = offsetofend(union bpf_attr, file_flags);
+	const size_t attr_sz = offsetofend(union bpf_attr, path_fd);
 	union bpf_attr attr;
 	int fd;
 
@@ -601,6 +611,7 @@ int bpf_obj_get_opts(const char *pathname, const struct bpf_obj_get_opts *opts)
 		return libbpf_err(-EINVAL);
 
 	memset(&attr, 0, attr_sz);
+	attr.path_fd = OPTS_GET(opts, path_fd, 0);
 	attr.pathname = ptr_to_u64((void *)pathname);
 	attr.file_flags = OPTS_GET(opts, file_flags, 0);
 
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index a2c091389b18..9aa0ee473754 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -284,16 +284,30 @@ LIBBPF_API int bpf_map_update_batch(int fd, const void *keys, const void *values
 				    __u32 *count,
 				    const struct bpf_map_batch_opts *opts);
 
-struct bpf_obj_get_opts {
+struct bpf_obj_pin_opts {
 	size_t sz; /* size of this struct for forward/backward compatibility */
 
 	__u32 file_flags;
+	int path_fd;
 
 	size_t :0;
 };
-#define bpf_obj_get_opts__last_field file_flags
+#define bpf_obj_pin_opts__last_field path_fd
 
 LIBBPF_API int bpf_obj_pin(int fd, const char *pathname);
+LIBBPF_API int bpf_obj_pin_opts(int fd, const char *pathname,
+				const struct bpf_obj_pin_opts *opts);
+
+struct bpf_obj_get_opts {
+	size_t sz; /* size of this struct for forward/backward compatibility */
+
+	__u32 file_flags;
+	int path_fd;
+
+	size_t :0;
+};
+#define bpf_obj_get_opts__last_field path_fd
+
 LIBBPF_API int bpf_obj_get(const char *pathname);
 LIBBPF_API int bpf_obj_get_opts(const char *pathname,
 				const struct bpf_obj_get_opts *opts);
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index a5aa3a383d69..7a4fe80da360 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -389,5 +389,6 @@ LIBBPF_1.2.0 {
 		bpf_link__update_map;
 		bpf_link_get_info_by_fd;
 		bpf_map_get_info_by_fd;
+		bpf_obj_pin_opts;
 		bpf_prog_get_info_by_fd;
 } LIBBPF_1.1.0;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v3 bpf-next 4/4] selftests/bpf: add path_fd-based BPF_OBJ_PIN and BPF_OBJ_GET tests
  2023-05-22 23:29 [PATCH v3 bpf-next 0/4] Add O_PATH-based BPF_OBJ_PIN and BPF_OBJ_GET support Andrii Nakryiko
                   ` (2 preceding siblings ...)
  2023-05-22 23:29 ` [PATCH v3 bpf-next 3/4] libbpf: add opts-based bpf_obj_pin() API and add support for path_fd Andrii Nakryiko
@ 2023-05-22 23:29 ` Andrii Nakryiko
  3 siblings, 0 replies; 10+ messages in thread
From: Andrii Nakryiko @ 2023-05-22 23:29 UTC (permalink / raw)
  To: bpf, ast, daniel, martin.lau
  Cc: cyphar, brauner, lennart, linux-fsdevel, Andrii Nakryiko

Add a selftest demonstrating using detach-mounted BPF FS using new mount
APIs, and pinning and getting BPF map using such mount. This
demonstrates how something like container manager could setup BPF FS,
pin and adjust all the necessary objects in it, all before exposing BPF
FS to a particular mount namespace.

Also add a few subtests validating all meaningful combinations of
path_fd and pathname. We use mounted /sys/fs/bpf location for these.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 .../bpf/prog_tests/bpf_obj_pinning.c          | 268 ++++++++++++++++++
 1 file changed, 268 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/bpf_obj_pinning.c

diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_obj_pinning.c b/tools/testing/selftests/bpf/prog_tests/bpf_obj_pinning.c
new file mode 100644
index 000000000000..31f1e815f671
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_obj_pinning.c
@@ -0,0 +1,268 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */
+#define _GNU_SOURCE
+#include <test_progs.h>
+#include <bpf/btf.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <linux/unistd.h>
+#include <linux/mount.h>
+#include <sys/syscall.h>
+
+static inline int sys_fsopen(const char *fsname, unsigned flags)
+{
+	return syscall(__NR_fsopen, fsname, flags);
+}
+
+static inline int sys_fsconfig(int fs_fd, unsigned cmd, const char *key, const void *val, int aux)
+{
+	return syscall(__NR_fsconfig, fs_fd, cmd, key, val, aux);
+}
+
+static inline int sys_fsmount(int fs_fd, unsigned flags, unsigned ms_flags)
+{
+	return syscall(__NR_fsmount, fs_fd, flags, ms_flags);
+}
+
+__attribute__((unused))
+static inline int sys_move_mount(int from_dfd, const char *from_path,
+			         int to_dfd, const char *to_path,
+			         unsigned int ms_flags)
+{
+	return syscall(__NR_move_mount, from_dfd, from_path, to_dfd, to_path, ms_flags);
+}
+
+static void bpf_obj_pinning_detached(void)
+{
+	LIBBPF_OPTS(bpf_obj_pin_opts, pin_opts);
+	LIBBPF_OPTS(bpf_obj_get_opts, get_opts);
+	int fs_fd = -1, mnt_fd = -1;
+	int map_fd = -1, map_fd2 = -1;
+	int zero = 0, src_value, dst_value, err;
+	const char *map_name = "fsmount_map";
+
+	/* A bunch of below UAPI calls are constructed based on reading:
+	 * https://brauner.io/2023/02/28/mounting-into-mount-namespaces.html
+	 */
+
+	/* create VFS context */
+	fs_fd = sys_fsopen("bpf", 0);
+	if (!ASSERT_GE(fs_fd, 0, "fs_fd"))
+		goto cleanup;
+
+	/* instantiate FS object */
+	err = sys_fsconfig(fs_fd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
+	if (!ASSERT_OK(err, "fs_create"))
+		goto cleanup;
+
+	/* create O_PATH fd for detached mount */
+	mnt_fd = sys_fsmount(fs_fd, 0, 0);
+	if (!ASSERT_GE(mnt_fd, 0, "mnt_fd"))
+		goto cleanup;
+
+	/* If we wanted to expose detached mount in the file system, we'd do
+	 * something like below. But the whole point is that we actually don't
+	 * even have to expose BPF FS in the file system to be able to work
+	 * (pin/get objects) with it.
+	 *
+	 * err = sys_move_mount(mnt_fd, "", -EBADF, mnt_path, MOVE_MOUNT_F_EMPTY_PATH);
+	 * if (!ASSERT_OK(err, "move_mount"))
+	 *	goto cleanup;
+	 */
+
+	/* create BPF map to pin */
+	map_fd = bpf_map_create(BPF_MAP_TYPE_ARRAY, map_name, 4, 4, 1, NULL);
+	if (!ASSERT_GE(map_fd, 0, "map_fd"))
+		goto cleanup;
+
+	/* pin BPF map into detached BPF FS through mnt_fd */
+	pin_opts.file_flags = BPF_F_PATH_FD;
+	pin_opts.path_fd = mnt_fd;
+	err = bpf_obj_pin_opts(map_fd, map_name, &pin_opts);
+	if (!ASSERT_OK(err, "map_pin"))
+		goto cleanup;
+
+	/* get BPF map from detached BPF FS through mnt_fd */
+	get_opts.file_flags = BPF_F_PATH_FD;
+	get_opts.path_fd = mnt_fd;
+	map_fd2 = bpf_obj_get_opts(map_name, &get_opts);
+	if (!ASSERT_GE(map_fd2, 0, "map_get"))
+		goto cleanup;
+
+	/* update map through one FD */
+	src_value = 0xcafebeef;
+	err = bpf_map_update_elem(map_fd, &zero, &src_value, 0);
+	ASSERT_OK(err, "map_update");
+
+	/* check values written/read through different FDs do match */
+	dst_value = 0;
+	err = bpf_map_lookup_elem(map_fd2, &zero, &dst_value);
+	ASSERT_OK(err, "map_lookup");
+	ASSERT_EQ(dst_value, src_value, "map_value_eq1");
+	ASSERT_EQ(dst_value, 0xcafebeef, "map_value_eq2");
+
+cleanup:
+	if (map_fd >= 0)
+		ASSERT_OK(close(map_fd), "close_map_fd");
+	if (map_fd2 >= 0)
+		ASSERT_OK(close(map_fd2), "close_map_fd2");
+	if (fs_fd >= 0)
+		ASSERT_OK(close(fs_fd), "close_fs_fd");
+	if (mnt_fd >= 0)
+		ASSERT_OK(close(mnt_fd), "close_mnt_fd");
+}
+
+enum path_kind
+{
+	PATH_STR_ABS,
+	PATH_STR_REL,
+	PATH_FD_REL,
+};
+
+static void validate_pin(int map_fd, const char *map_name, int src_value,
+			 enum path_kind path_kind)
+{
+	LIBBPF_OPTS(bpf_obj_pin_opts, pin_opts);
+	char abs_path[PATH_MAX], old_cwd[PATH_MAX];
+	const char *pin_path = NULL;
+	int zero = 0, dst_value, map_fd2, err;
+
+	snprintf(abs_path, sizeof(abs_path), "/sys/fs/bpf/%s", map_name);
+	old_cwd[0] = '\0';
+
+	switch (path_kind) {
+	case PATH_STR_ABS:
+		/* absolute path */
+		pin_path = abs_path;
+		break;
+	case PATH_STR_REL:
+		/* cwd + relative path */
+		ASSERT_OK_PTR(getcwd(old_cwd, sizeof(old_cwd)), "getcwd");
+		ASSERT_OK(chdir("/sys/fs/bpf"), "chdir");
+		pin_path = map_name;
+		break;
+	case PATH_FD_REL:
+		/* dir fd + relative path */
+		pin_opts.file_flags = BPF_F_PATH_FD;
+		pin_opts.path_fd = open("/sys/fs/bpf", O_PATH);
+		ASSERT_GE(pin_opts.path_fd, 0, "path_fd");
+		pin_path = map_name;
+		break;
+	}
+
+	/* pin BPF map using specified path definition */
+	err = bpf_obj_pin_opts(map_fd, pin_path, &pin_opts);
+	ASSERT_OK(err, "obj_pin");
+
+	/* cleanup */
+	if (pin_opts.path_fd >= 0)
+		close(pin_opts.path_fd);
+	if (old_cwd[0])
+		ASSERT_OK(chdir(old_cwd), "restore_cwd");
+
+	map_fd2 = bpf_obj_get(abs_path);
+	if (!ASSERT_GE(map_fd2, 0, "map_get"))
+		goto cleanup;
+
+	/* update map through one FD */
+	err = bpf_map_update_elem(map_fd, &zero, &src_value, 0);
+	ASSERT_OK(err, "map_update");
+
+	/* check values written/read through different FDs do match */
+	dst_value = 0;
+	err = bpf_map_lookup_elem(map_fd2, &zero, &dst_value);
+	ASSERT_OK(err, "map_lookup");
+	ASSERT_EQ(dst_value, src_value, "map_value_eq");
+cleanup:
+	if (map_fd2 >= 0)
+		ASSERT_OK(close(map_fd2), "close_map_fd2");
+	unlink(abs_path);
+}
+
+static void validate_get(int map_fd, const char *map_name, int src_value,
+			 enum path_kind path_kind)
+{
+	LIBBPF_OPTS(bpf_obj_get_opts, get_opts);
+	char abs_path[PATH_MAX], old_cwd[PATH_MAX];
+	const char *pin_path = NULL;
+	int zero = 0, dst_value, map_fd2, err;
+
+	snprintf(abs_path, sizeof(abs_path), "/sys/fs/bpf/%s", map_name);
+	/* pin BPF map using specified path definition */
+	err = bpf_obj_pin(map_fd, abs_path);
+	if (!ASSERT_OK(err, "pin_map"))
+		return;
+
+	old_cwd[0] = '\0';
+
+	switch (path_kind) {
+	case PATH_STR_ABS:
+		/* absolute path */
+		pin_path = abs_path;
+		break;
+	case PATH_STR_REL:
+		/* cwd + relative path */
+		ASSERT_OK_PTR(getcwd(old_cwd, sizeof(old_cwd)), "getcwd");
+		ASSERT_OK(chdir("/sys/fs/bpf"), "chdir");
+		pin_path = map_name;
+		break;
+	case PATH_FD_REL:
+		/* dir fd + relative path */
+		get_opts.file_flags = BPF_F_PATH_FD;
+		get_opts.path_fd = open("/sys/fs/bpf", O_PATH);
+		ASSERT_GE(get_opts.path_fd, 0, "path_fd");
+		pin_path = map_name;
+		break;
+	}
+
+	map_fd2 = bpf_obj_get_opts(pin_path, &get_opts);
+	if (!ASSERT_GE(map_fd2, 0, "map_get"))
+		goto cleanup;
+
+	/* cleanup */
+	if (get_opts.path_fd >= 0)
+		close(get_opts.path_fd);
+	if (old_cwd[0])
+		ASSERT_OK(chdir(old_cwd), "restore_cwd");
+
+	/* update map through one FD */
+	err = bpf_map_update_elem(map_fd, &zero, &src_value, 0);
+	ASSERT_OK(err, "map_update");
+
+	/* check values written/read through different FDs do match */
+	dst_value = 0;
+	err = bpf_map_lookup_elem(map_fd2, &zero, &dst_value);
+	ASSERT_OK(err, "map_lookup");
+	ASSERT_EQ(dst_value, src_value, "map_value_eq");
+cleanup:
+	if (map_fd2 >= 0)
+		ASSERT_OK(close(map_fd2), "close_map_fd2");
+	unlink(abs_path);
+}
+
+static void bpf_obj_pinning_mounted(enum path_kind path_kind)
+{
+	const char *map_name = "mounted_map";
+	int map_fd;
+
+	/* create BPF map to pin */
+	map_fd = bpf_map_create(BPF_MAP_TYPE_ARRAY, map_name, 4, 4, 1, NULL);
+	if (!ASSERT_GE(map_fd, 0, "map_fd"))
+		return;
+
+	validate_pin(map_fd, map_name, 100 + (int)path_kind, path_kind);
+	validate_get(map_fd, map_name, 200 + (int)path_kind, path_kind);
+	ASSERT_OK(close(map_fd), "close_map_fd");
+}
+
+void test_bpf_obj_pinning()
+{
+	if (test__start_subtest("detached"))
+		bpf_obj_pinning_detached();
+	if (test__start_subtest("mounted-str-abs"))
+		bpf_obj_pinning_mounted(PATH_STR_ABS);
+	if (test__start_subtest("mounted-str-rel"))
+		bpf_obj_pinning_mounted(PATH_STR_REL);
+	if (test__start_subtest("mounted-fd-rel"))
+		bpf_obj_pinning_mounted(PATH_FD_REL);
+}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 bpf-next 1/4] bpf: validate BPF object in BPF_OBJ_PIN before calling LSM
  2023-05-22 23:29 ` [PATCH v3 bpf-next 1/4] bpf: validate BPF object in BPF_OBJ_PIN before calling LSM Andrii Nakryiko
@ 2023-05-23 10:03   ` Christian Brauner
  2023-05-23 15:00   ` Daniel Borkmann
  1 sibling, 0 replies; 10+ messages in thread
From: Christian Brauner @ 2023-05-23 10:03 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, ast, daniel, martin.lau, cyphar, lennart, linux-fsdevel

On Mon, May 22, 2023 at 04:29:14PM -0700, Andrii Nakryiko wrote:
> Do a sanity check whether provided file-to-be-pinned is actually a BPF
> object (prog, map, btf) before calling security_path_mknod LSM hook. If
> it's not, LSM hook doesn't have to be triggered, as the operation has no
> chance of succeeding anyways.
> 
> Suggested-by: Christian Brauner <brauner@kernel.org>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> ---

Reviewed-by: Christian Brauner <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 bpf-next 2/4] bpf: support O_PATH FDs in BPF_OBJ_PIN and BPF_OBJ_GET commands
  2023-05-22 23:29 ` [PATCH v3 bpf-next 2/4] bpf: support O_PATH FDs in BPF_OBJ_PIN and BPF_OBJ_GET commands Andrii Nakryiko
@ 2023-05-23 10:26   ` Christian Brauner
  0 siblings, 0 replies; 10+ messages in thread
From: Christian Brauner @ 2023-05-23 10:26 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, ast, daniel, martin.lau, cyphar, lennart, linux-fsdevel

On Mon, May 22, 2023 at 04:29:15PM -0700, Andrii Nakryiko wrote:
> Current UAPI of BPF_OBJ_PIN and BPF_OBJ_GET commands of bpf() syscall
> forces users to specify pinning location as a string-based absolute or
> relative (to current working directory) path. This has various
> implications related to security (e.g., symlink-based attacks), forces
> BPF FS to be exposed in the file system, which can cause races with
> other applications.
> 
> One of the feedbacks we got from folks working with containers heavily
> was that inability to use purely FD-based location specification was an
> unfortunate limitation and hindrance for BPF_OBJ_PIN and BPF_OBJ_GET
> commands. This patch closes this oversight, adding path_fd field to
> BPF_OBJ_PIN and BPF_OBJ_GET UAPI, following conventions established by
> *at() syscalls for dirfd + pathname combinations.
> 
> This now allows interesting possibilities like working with detached BPF
> FS mount (e.g., to perform multiple pinnings without running a risk of
> someone interfering with them), and generally making pinning/getting
> more secure and not prone to any races and/or security attacks.
> 
> This is demonstrated by a selftest added in subsequent patch that takes
> advantage of new mount APIs (fsopen, fsconfig, fsmount) to demonstrate
> creating detached BPF FS mount, pinning, and then getting BPF map out of
> it, all while never exposing this private instance of BPF FS to outside
> worlds.
> 
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> ---

Reviewed-by: Christian Brauner <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 bpf-next 3/4] libbpf: add opts-based bpf_obj_pin() API and add support for path_fd
  2023-05-22 23:29 ` [PATCH v3 bpf-next 3/4] libbpf: add opts-based bpf_obj_pin() API and add support for path_fd Andrii Nakryiko
@ 2023-05-23 14:38   ` Daniel Borkmann
  2023-05-23 16:21     ` Andrii Nakryiko
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Borkmann @ 2023-05-23 14:38 UTC (permalink / raw)
  To: Andrii Nakryiko, bpf, ast, martin.lau
  Cc: cyphar, brauner, lennart, linux-fsdevel

On 5/23/23 1:29 AM, Andrii Nakryiko wrote:
[...]
> diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> index a5aa3a383d69..7a4fe80da360 100644
> --- a/tools/lib/bpf/libbpf.map
> +++ b/tools/lib/bpf/libbpf.map
> @@ -389,5 +389,6 @@ LIBBPF_1.2.0 {
>   		bpf_link__update_map;
>   		bpf_link_get_info_by_fd;
>   		bpf_map_get_info_by_fd;
> +		bpf_obj_pin_opts;

Given 1.2.0 went out [0], shouldn't this go into a new LIBBPF_1.3.0 section?

>   		bpf_prog_get_info_by_fd;
>   } LIBBPF_1.1.0;

Thanks,
Daniel

   [0] https://lore.kernel.org/bpf/CAEf4BzYJhzEDHarRGvidhPd-DRtu4VXxnQ=HhOG-LZjkbK-MwQ@mail.gmail.com/

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 bpf-next 1/4] bpf: validate BPF object in BPF_OBJ_PIN before calling LSM
  2023-05-22 23:29 ` [PATCH v3 bpf-next 1/4] bpf: validate BPF object in BPF_OBJ_PIN before calling LSM Andrii Nakryiko
  2023-05-23 10:03   ` Christian Brauner
@ 2023-05-23 15:00   ` Daniel Borkmann
  1 sibling, 0 replies; 10+ messages in thread
From: Daniel Borkmann @ 2023-05-23 15:00 UTC (permalink / raw)
  To: Andrii Nakryiko, bpf, ast, martin.lau
  Cc: cyphar, brauner, lennart, linux-fsdevel

On 5/23/23 1:29 AM, Andrii Nakryiko wrote:
> Do a sanity check whether provided file-to-be-pinned is actually a BPF
> object (prog, map, btf) before calling security_path_mknod LSM hook. If
> it's not, LSM hook doesn't have to be triggered, as the operation has no
> chance of succeeding anyways.
> 
> Suggested-by: Christian Brauner <brauner@kernel.org>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

(I took this one already in, thanks!)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 bpf-next 3/4] libbpf: add opts-based bpf_obj_pin() API and add support for path_fd
  2023-05-23 14:38   ` Daniel Borkmann
@ 2023-05-23 16:21     ` Andrii Nakryiko
  0 siblings, 0 replies; 10+ messages in thread
From: Andrii Nakryiko @ 2023-05-23 16:21 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Andrii Nakryiko, bpf, ast, martin.lau, cyphar, brauner, lennart,
	linux-fsdevel

On Tue, May 23, 2023 at 7:38 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 5/23/23 1:29 AM, Andrii Nakryiko wrote:
> [...]
> > diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> > index a5aa3a383d69..7a4fe80da360 100644
> > --- a/tools/lib/bpf/libbpf.map
> > +++ b/tools/lib/bpf/libbpf.map
> > @@ -389,5 +389,6 @@ LIBBPF_1.2.0 {
> >               bpf_link__update_map;
> >               bpf_link_get_info_by_fd;
> >               bpf_map_get_info_by_fd;
> > +             bpf_obj_pin_opts;
>
> Given 1.2.0 went out [0], shouldn't this go into a new LIBBPF_1.3.0 section?

yep, you are right, this should have gone into a new 1.3.0 sections.
I'll add "starting v1.3 dev cycle" patch into v4 and resend.

>
> >               bpf_prog_get_info_by_fd;
> >   } LIBBPF_1.1.0;
>
> Thanks,
> Daniel
>
>    [0] https://lore.kernel.org/bpf/CAEf4BzYJhzEDHarRGvidhPd-DRtu4VXxnQ=HhOG-LZjkbK-MwQ@mail.gmail.com/
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2023-05-23 16:21 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-22 23:29 [PATCH v3 bpf-next 0/4] Add O_PATH-based BPF_OBJ_PIN and BPF_OBJ_GET support Andrii Nakryiko
2023-05-22 23:29 ` [PATCH v3 bpf-next 1/4] bpf: validate BPF object in BPF_OBJ_PIN before calling LSM Andrii Nakryiko
2023-05-23 10:03   ` Christian Brauner
2023-05-23 15:00   ` Daniel Borkmann
2023-05-22 23:29 ` [PATCH v3 bpf-next 2/4] bpf: support O_PATH FDs in BPF_OBJ_PIN and BPF_OBJ_GET commands Andrii Nakryiko
2023-05-23 10:26   ` Christian Brauner
2023-05-22 23:29 ` [PATCH v3 bpf-next 3/4] libbpf: add opts-based bpf_obj_pin() API and add support for path_fd Andrii Nakryiko
2023-05-23 14:38   ` Daniel Borkmann
2023-05-23 16:21     ` Andrii Nakryiko
2023-05-22 23:29 ` [PATCH v3 bpf-next 4/4] selftests/bpf: add path_fd-based BPF_OBJ_PIN and BPF_OBJ_GET tests Andrii Nakryiko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).