All of lore.kernel.org
 help / color / mirror / Atom feed
From: Hao Luo <haoluo@google.com>
To: sdf@google.com
Cc: Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
	Yonghong Song <yhs@fb.com>, KP Singh <kpsingh@kernel.org>,
	Shakeel Butt <shakeelb@google.com>,
	Joe Burton <jevburton.kernel@gmail.com>,
	bpf@vger.kernel.org
Subject: Re: [PATCH RFC bpf-next v1 0/8] Pinning bpf objects outside bpffs
Date: Mon, 10 Jan 2022 10:55:54 -0800	[thread overview]
Message-ID: <CA+khW7ihrLZwvzPTGAy0GyFmKzB7tH-FU6D+-fthqbj4wuiwFg@mail.gmail.com> (raw)
In-Reply-To: <YdiTrq4Y7JwmQumc@google.com>

On Fri, Jan 7, 2022 at 11:25 AM <sdf@google.com> wrote:
>
> On 01/07, Hao Luo wrote:
> > On Thu, Jan 6, 2022 at 3:03 PM <sdf@google.com> wrote:
> > >
> > > On 01/06, Hao Luo wrote:
> > > > Bpffs is a pseudo file system that persists bpf objects. Previously
> > > > bpf objects can only be pinned in bpffs, this patchset extends pinning
> > > > to allow bpf objects to be pinned (or exposed) to other file systems.
> > >
> > > > In particular, this patchset allows pinning bpf objects in kernfs.
> > This
> > > > creates a new file entry in the kernfs file system and the created
> > file
> > > > is able to reference the bpf object. By doing so, bpf can be used to
> > > > customize the file's operations, such as seq_show.
> > >
> > > > As a concrete usecase of this feature, this patchset introduces a
> > > > simple new program type called 'bpf_view', which can be used to format
> > > > a seq file by a kernel object's state. By pinning a bpf_view program
> > > > into a cgroup directory, userspace is able to read the cgroup's state
> > > > from file in a format defined by the bpf program.
> > >
> > > > Different from bpffs, kernfs doesn't have a callback when a kernfs
> > node
> > > > is freed, which is problem if we allow the kernfs node to hold an
> > extra
> > > > reference of the bpf object, because there is no chance to dec the
> > > > object's refcnt. Therefore the kernfs node created by pinning doesn't
> > > > hold reference of the bpf object. The lifetime of the kernfs node
> > > > depends on the lifetime of the bpf object. Rather than "pinning in
> > > > kernfs", it is "exposing to kernfs". We require the bpf object to be
> > > > pinned in bpffs first before it can be pinned in kernfs. When the
> > > > object is unpinned from bpffs, their kernfs nodes will be removed
> > > > automatically. This somehow treats a pinned bpf object as a persistent
> > > > "device".
> > >
> > > > We rely on fsnotify to monitor the inode events in bpffs. A new
> > function
> > > > bpf_watch_inode() is introduced. It allows registering a callback
> > > > function at inode destruction. For the kernfs case, a callback that
> > > > removes kernfs node is registered at the destruction of bpffs inodes.
> > > > For other file systems such as sockfs, bpf_watch_inode() can monitor
> > the
> > > > destruction of sockfs inodes and the created file entry can hold the
> > bpf
> > > > object's reference. In this case, it is truly "pinning".
> > >
> > > > File operations other than seq_show can also be implemented using bpf.
> > > > For example, bpf may be of help for .poll and .mmap in kernfs.
> > >
> > > This looks awesome!
> > >
> > > One thing I don't understand is: why did go through the pinning
> > > interface VS regular attach/detach? IOW, why not allow regular
> > > sys_bpf(BPF_PROG_ATTACH, prog_id, cgroup_id) and attach to the cgroup
> > > (which, in turn, creates the kernfs nodes). Seems like this way you can
> > drop
> > > the requirement on the object being pinned in the bpffs first?
>
> > Thanks Stan.
>
> > Yeah, the attach/detach approach is definitely another option. IIUC,
> > in comparison to pinning, does attach/detach only work for cgroups?
>
> attach has target_fd argument that, in theory, can be whatever. We can
> add support for different fd types.
>

I see. With attach API, are we also able to specify some attributes
for the attachment? For example, a property that we may want is: let
descendent cgroups inherit their parent cgroup's programs.

> > Pinning may be used on other file systems, sockfs, sysfs or resctrl.
> > But I don't know whether this generality is welcome and implementing
> > seq_show is the only concrete use case I can think of right now. If
> > people think the ability of creating files in other subsystems is not
> > good, I'd be happy to take a look at the attach/detach approach and
> > that may be the right way.
>
> The reason I started thinking about attach/detach is because of clunky
> unlink that you have to do (aka echo "rm" > file). IMO, having standard
> attach/detach is a much more clear. But I might be missing some
> complexity associated with non-cgroup filesystems.

Oh, I see. Looks good. Let me play with it before sending the next version.

  reply	other threads:[~2022-01-10 18:56 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-06 21:50 [PATCH RFC bpf-next v1 0/8] Pinning bpf objects outside bpffs Hao Luo
2022-01-06 21:50 ` [PATCH RFC bpf-next v1 1/8] bpf: Support pinning in non-bpf file system Hao Luo
2022-01-07  0:04   ` kernel test robot
2022-01-07  0:33   ` Yonghong Song
2022-01-08  0:41   ` kernel test robot
2022-01-08  0:41     ` kernel test robot
2022-01-06 21:50 ` [PATCH RFC bpf-next v1 2/8] bpf: Record back pointer to the inode in bpffs Hao Luo
2022-01-06 21:50 ` [PATCH RFC bpf-next v1 3/8] bpf: Expose bpf object in kernfs Hao Luo
2022-01-06 21:50 ` [PATCH RFC bpf-next v1 4/8] bpf: Support removing kernfs entries Hao Luo
2022-01-06 21:50 ` [PATCH RFC bpf-next v1 5/8] bpf: Introduce a new program type bpf_view Hao Luo
2022-01-07  0:35   ` Yonghong Song
2022-01-06 21:50 ` [PATCH RFC bpf-next v1 6/8] libbpf: Support of bpf_view prog type Hao Luo
2022-01-06 21:50 ` [PATCH RFC bpf-next v1 7/8] bpf: Add seq_show operation for bpf in cgroupfs Hao Luo
2022-01-06 21:50 ` [PATCH RFC bpf-next v1 8/8] selftests/bpf: Test exposing bpf objects in kernfs Hao Luo
2022-01-06 23:02 ` [PATCH RFC bpf-next v1 0/8] Pinning bpf objects outside bpffs sdf
2022-01-07 18:59   ` Hao Luo
2022-01-07 19:25     ` sdf
2022-01-10 18:55       ` Hao Luo [this message]
2022-01-10 19:22         ` Stanislav Fomichev
2022-01-11  3:33         ` Alexei Starovoitov
2022-01-11 17:06           ` Stanislav Fomichev
2022-01-11 18:20           ` Hao Luo
2022-01-12 18:55             ` Song Liu
2022-01-12 19:19               ` Hao Luo
2022-01-07  0:30 ` Yonghong Song
2022-01-07 20:43   ` Hao Luo
2022-01-10 17:30     ` Yonghong Song
2022-01-10 18:56       ` Hao Luo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CA+khW7ihrLZwvzPTGAy0GyFmKzB7tH-FU6D+-fthqbj4wuiwFg@mail.gmail.com \
    --to=haoluo@google.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=jevburton.kernel@gmail.com \
    --cc=kafai@fb.com \
    --cc=kpsingh@kernel.org \
    --cc=sdf@google.com \
    --cc=shakeelb@google.com \
    --cc=songliubraving@fb.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.