From: Christian Brauner <brauner@kernel.org>
To: Andrii Nakryiko <andrii@kernel.org>
Cc: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net,
martin.lau@kernel.org, cyphar@cyphar.com, lennart@poettering.net,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v2 bpf-next 1/3] bpf: support O_PATH FDs in BPF_OBJ_PIN and BPF_OBJ_GET commands
Date: Fri, 19 May 2023 14:37:35 +0200
Message-ID: <20230519-ratschlag-gockel-c27d5fdfb72d@brauner>
In-Reply-To: <20230519-eiswasser-leibarzt-ed7e52934486@brauner>

On Fri, May 19, 2023 at 11:49:50AM +0200, Christian Brauner wrote:
> On Thu, May 18, 2023 at 02:54:42PM -0700, Andrii Nakryiko wrote:
> > The current UAPI of the BPF_OBJ_PIN and BPF_OBJ_GET commands of the
> > bpf() syscall forces users to specify the pinning location as a
> > string-based absolute or relative (to current working directory) path.
> > This has various security implications (e.g., symlink-based attacks)
> > and forces BPF FS to be exposed in the file system, which can cause
> > races with other applications.
> >
> > One piece of feedback we got from folks working heavily with containers
> > was that the inability to use a purely FD-based location specification
> > was an unfortunate limitation and hindrance for the BPF_OBJ_PIN and
> > BPF_OBJ_GET commands. This patch closes this gap by adding a path_fd
> > field to the BPF_OBJ_PIN and BPF_OBJ_GET UAPI, following the conventions
> > established by *at() syscalls for dirfd + pathname combinations.
> >
> > This now allows interesting possibilities like working with a detached
> > BPF FS mount (e.g., to perform multiple pinnings without running the risk
> > of someone interfering with them), and generally makes pinning/getting
> > more secure and less prone to races and security attacks.
> >
> > This is demonstrated by a selftest added in a subsequent patch that takes
> > advantage of the new mount APIs (fsopen, fsconfig, fsmount) to create a
> > detached BPF FS mount, pin a BPF map into it, and then get the map back
> > out of it, all while never exposing this private instance of BPF FS to
> > the outside world.
> >
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > ---
> > include/linux/bpf.h | 4 ++--
> > include/uapi/linux/bpf.h | 10 ++++++++++
> > kernel/bpf/inode.c | 16 ++++++++--------
> > kernel/bpf/syscall.c | 25 ++++++++++++++++++++-----
> > tools/include/uapi/linux/bpf.h | 10 ++++++++++
> > 5 files changed, 50 insertions(+), 15 deletions(-)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index 36e4b2d8cca2..f58895830ada 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -2077,8 +2077,8 @@ struct file *bpf_link_new_file(struct bpf_link *link, int *reserved_fd);
> > struct bpf_link *bpf_link_get_from_fd(u32 ufd);
> > struct bpf_link *bpf_link_get_curr_or_next(u32 *id);
> >
> > -int bpf_obj_pin_user(u32 ufd, const char __user *pathname);
> > -int bpf_obj_get_user(const char __user *pathname, int flags);
> > +int bpf_obj_pin_user(u32 ufd, int path_fd, const char __user *pathname);
> > +int bpf_obj_get_user(int path_fd, const char __user *pathname, int flags);
> >
> > #define BPF_ITER_FUNC_PREFIX "bpf_iter_"
> > #define DEFINE_BPF_ITER_FUNC(target, args...) \
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 1bb11a6ee667..3731284671e4 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -1272,6 +1272,9 @@ enum {
> >
> > /* Create a map that will be registered/unregesitered by the backed bpf_link */
> > BPF_F_LINK = (1U << 13),
> > +
> > +/* Get path from provided FD in BPF_OBJ_PIN/BPF_OBJ_GET commands */
> > + BPF_F_PATH_FD = (1U << 14),
> > };
> >
> > /* Flags for BPF_PROG_QUERY. */
> > @@ -1420,6 +1423,13 @@ union bpf_attr {
> > __aligned_u64 pathname;
> > __u32 bpf_fd;
> > __u32 file_flags;
> > + /* Same as dirfd in openat() syscall; see openat(2)
> > + * manpage for details of path FD and pathname semantics;
> > + * path_fd should be accompanied by BPF_F_PATH_FD flag set in
> > + * file_flags field, otherwise it should be set to zero;
> > + * if BPF_F_PATH_FD flag is not set, AT_FDCWD is assumed.
> > + */
> > + __u32 path_fd;
> > };
>
> Thanks for changing that.
>
> This is still odd though because you prevent users from specifying
> AT_FDCWD explicitly. They should be allowed to do that. Also, file
> descriptors are signed integers, so please s/__u32/__s32/. AT_FDCWD
> should be passable anywhere we have at* semantics. And if we ever add
> something like
> #define AT_ROOT -200
> to the vfs, you couldn't use it without coming up with your own custom
> flags. If you just follow what everyone else does and use __s32 then
> you're good.
>
> File descriptors really need to be signed. There's no way around that.
> See io_uring as a good example:
>
> io_uring_sqe {
>         __u8    opcode;         /* type of operation for this sqe */
>         __u8    flags;          /* IOSQE_ flags */
>         __u16   ioprio;         /* ioprio for the request */
>         __s32   fd;             /* file descriptor to do IO on */
> }
>
> where the __s32 fd is used in all fd based requests including
> io_openat*() (See io_uring/openclose.c) which are effectively the
> semantics you want to emulate here.

I should clarify that this is mainly for APIs that return fds or that
provide at* semantics. We certainly do use unsigned types in cases where
the system call operates directly on an fd without any lookup semantics.