All of lore.kernel.org
 help / color / mirror / Atom feed
From: Hao Luo <haoluo@google.com>
To: Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
	Yonghong Song <yhs@fb.com>, KP Singh <kpsingh@kernel.org>,
	Shakeel Butt <shakeelb@google.com>,
	Joe Burton <jevburton.kernel@gmail.com>,
	Stanislav Fomichev <sdf@google.com>,
	bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
	Hao Luo <haoluo@google.com>
Subject: [PATCH RFC bpf-next v2 0/5] Extend cgroup interface with bpf
Date: Tue,  1 Feb 2022 12:55:29 -0800	[thread overview]
Message-ID: <20220201205534.1962784-1-haoluo@google.com> (raw)

This patchset introduces a new program type to extend the cgroup interfaces.
It extends the bpf filesystem (bpffs) to allow creating a directory
hierarchy that tracks any kernfs hierarchy, in particular cgroupfs. Each
subdirectory in this hierarchy will keep a reference to a corresponding
kernfs node when created. This is done by associating a new data
structure (called "tag" in this patchset) to the bpffs directory inodes.
File inode in bpffs holds a reference to bpf object and directory inode
may point to a tag, which holds a kernfs node.

A bpf object can be pinned in these directories and objects can choose to
enable inheritance in tagged directories. In this patchset, a new
program type "cgroup_view" is introduced, which supports inheritance.
More specifically, when a link to cgroup_view prog is pinned in a bpffs
directory, it tags the directory and connects the directory to the root
cgroup. Subdirectories created underneath has to match a subcgroup, and
when created, they will inherit the pinned cgroup_view link from the
parent directory.

The pinned cgroup_view objects can be read as files. When the object is
read, it tries to get the cgroup its parent directory is matched to.
Failure to get the cgroup's reference will not run the cgroup_view prog.
Users can implement cgroup_view program to define what to print out to
the file, given the cgroup object.

See patch 5/5 for an example of how this works.

Userspace has to manually create/remove directories in bpffs to mirror
the cgroup hierarchy. It was suggested using overlayfs to create a
hierarchy that contains both cgroupfs and bpffs. But after some
experiments, I found overlayfs is not intended to be used this way:
overlayfs assumes the underlying filesystem will not change [1], but our
underlaying fs (i.e. cgroupfs) will change and cause weird behavior. So
I didn't pursue in that direction.

This patchset v2 is only for demonstrating the high level design. There
are a lot of places in its implementation that can be improved. Cgroup_view
is a type of bpf_iter, because seqfile printing has been supported well
in bpf_iter, although cgroup_view is not iterating kernel objects.

Changes v1->v2:
 - Complete redesign. v1 implements pinning bpf objects in cgroupfs[2].
   v2 implements object inheritance in bpffs. Due to its simplicity,
   bpffs is better for implementing inheritance compared to cgroupfs.
 - Extend selftest to include a more realistic use case. The selftests
   in v2 developed a cgroup-level sched performance metric and exported
   through the new prog type.

[1] https://www.kernel.org/doc/html/latest/filesystems/overlayfs.html#changes-to-underlying-filesystems
[2] https://lore.kernel.org/bpf/Ydd1IIUG7%2F3kQRcR@google.com/

Hao Luo (5):
  bpf: Bpffs directory tag
  bpf: Introduce inherit list for dir tag.
  bpf: cgroup_view iter
  bpf: Pin cgroup_view
  selftests/bpf: test for pinning for cgroup_view link

 include/linux/bpf.h                           |   2 +
 kernel/bpf/Makefile                           |   2 +-
 kernel/bpf/bpf_iter.c                         |  11 +
 kernel/bpf/cgroup_view_iter.c                 | 114 ++++++++
 kernel/bpf/inode.c                            | 272 +++++++++++++++++-
 kernel/bpf/inode.h                            |  55 ++++
 .../selftests/bpf/prog_tests/pinning_cgroup.c | 143 +++++++++
 tools/testing/selftests/bpf/progs/bpf_iter.h  |   7 +
 .../bpf/progs/bpf_iter_cgroup_view.c          | 232 +++++++++++++++
 9 files changed, 829 insertions(+), 9 deletions(-)
 create mode 100644 kernel/bpf/cgroup_view_iter.c
 create mode 100644 kernel/bpf/inode.h
 create mode 100644 tools/testing/selftests/bpf/prog_tests/pinning_cgroup.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_cgroup_view.c

-- 
2.35.0.rc2.247.g8bbb082509-goog


             reply	other threads:[~2022-02-01 20:55 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-01 20:55 Hao Luo [this message]
2022-02-01 20:55 ` [PATCH RFC bpf-next v2 1/5] bpf: Bpffs directory tag Hao Luo
2022-02-01 20:55 ` [PATCH RFC bpf-next v2 2/5] bpf: Introduce inherit list for dir tag Hao Luo
2022-02-01 20:55 ` [PATCH RFC bpf-next v2 3/5] bpf: cgroup_view iter Hao Luo
2022-02-02  1:17   ` kernel test robot
2022-02-02  3:39   ` kernel test robot
2022-02-02  3:39     ` kernel test robot
2022-02-01 20:55 ` [PATCH RFC bpf-next v2 4/5] bpf: Pin cgroup_view Hao Luo
2022-02-02  4:20   ` kernel test robot
2022-02-02  5:11   ` kernel test robot
2022-02-02  5:11     ` kernel test robot
2022-02-01 20:55 ` [PATCH RFC bpf-next v2 5/5] selftests/bpf: test for pinning for cgroup_view link Hao Luo
2022-02-03 18:04   ` Alexei Starovoitov
2022-02-03 22:46     ` Hao Luo
2022-02-04  3:33       ` Alexei Starovoitov
2022-02-04 18:26         ` Hao Luo
2022-02-06  4:29           ` Alexei Starovoitov
2022-02-08 20:07             ` Hao Luo
2022-02-08 21:20               ` Alexei Starovoitov
2022-02-08 21:34                 ` Hao Luo
2022-02-14 18:29                 ` Hao Luo
2022-02-14 19:24                   ` Alexei Starovoitov
2022-02-14 20:23                     ` Hao Luo
2022-02-14 20:27                       ` Alexei Starovoitov
2022-02-14 20:39                         ` Hao Luo
2022-02-02  9:15 [PATCH RFC bpf-next v2 2/5] bpf: Introduce inherit list for dir tag kernel test robot
2022-02-02  9:32 ` Dan Carpenter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220201205534.1962784-1-haoluo@google.com \
    --to=haoluo@google.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=jevburton.kernel@gmail.com \
    --cc=kafai@fb.com \
    --cc=kpsingh@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=sdf@google.com \
    --cc=shakeelb@google.com \
    --cc=songliubraving@fb.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.