linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH bpf-next 0/9] bpf: cgroup hierarchical stats collection
@ 2022-05-10  0:17 Yosry Ahmed
  2022-05-10  0:17 ` [RFC PATCH bpf-next 1/9] bpf: introduce CGROUP_SUBSYS_RSTAT program type Yosry Ahmed
                   ` (9 more replies)
  0 siblings, 10 replies; 30+ messages in thread
From: Yosry Ahmed @ 2022-05-10  0:17 UTC
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Hao Luo, Tejun Heo, Zefan Li, Johannes Weiner,
	Shuah Khan, Roman Gushchin, Michal Hocko
  Cc: Stanislav Fomichev, David Rientjes, Greg Thelen, Shakeel Butt,
	linux-kernel, netdev, bpf, cgroups, Yosry Ahmed

This patch series allows for using bpf to collect hierarchical cgroup
stats efficiently by integrating with the rstat framework. The rstat
framework provides an efficient way to collect cgroup stats and
propagate them through the cgroup hierarchy.

The last patch is a selftest that demonstrates the entire workflow.
The workflow consists of:
- bpf programs that collect per-cpu per-cgroup stats (tracing progs).
- bpf rstat flusher that contains the logic for aggregating stats
  across cpus and across the cgroup hierarchy.
- bpf cgroup_iter responsible for outputting the stats to userspace
  through reading a file in bpffs.

The first 3 patches include the new bpf rstat flusher program type and
the needed support in rstat code and libbpf. The rstat flusher program
is a callback that the rstat framework makes to bpf when a stat flush is
ongoing, similar to the css_rstat_flush() callback that rstat makes to
cgroup controllers. Each callback is parameterized by a (cgroup, cpu)
pair that has been updated. The program contains the logic for
aggregating the stats across cpus and across the cgroup hierarchy.
These programs can be attached to any cgroup subsystem, not only the
ones that implement the css_rstat_flush() callback in the kernel. This
gives bpf programs more flexibility, and more isolation from the kernel
implementation.
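
As a rough illustration of the per-(cgroup, cpu) callback described above,
here is a minimal userspace sketch in plain C. It is not the code from this
series: struct cg_stat, flush_one() and NR_CPUS are hypothetical names, and
the "pending" field is just one possible way to carry a child's delta up to
its parent. It assumes callbacks arrive children-first for each cpu.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 4	/* assumption: small fixed cpu count for the model */

/* Hypothetical per-cgroup stat record; not a kernel structure. */
struct cg_stat {
	struct cg_stat *parent;		/* NULL for the root */
	uint64_t percpu[NR_CPUS];	/* running counters written by collectors */
	uint64_t percpu_seen[NR_CPUS];	/* portion of percpu[] already folded in */
	uint64_t pending;		/* deltas pushed up by children, not yet folded in */
	uint64_t total;			/* hierarchical total for this cgroup */
};

/*
 * Models one flusher callback for a (cgroup, cpu) pair: fold in what this
 * cpu added since the last flush plus whatever the children pushed up,
 * then hand the same delta to the parent.
 */
static void flush_one(struct cg_stat *cg, int cpu)
{
	uint64_t delta;

	assert(cpu >= 0 && cpu < NR_CPUS);

	delta = cg->percpu[cpu] - cg->percpu_seen[cpu];
	cg->percpu_seen[cpu] = cg->percpu[cpu];

	delta += cg->pending;
	cg->pending = 0;

	cg->total += delta;
	if (cg->parent)
		cg->parent->pending += delta;
}
```

In this model, calling flush_one() for a child and then for its parent, once
per cpu, leaves the parent's total equal to its own counts plus the child's.
Repeating the calls without new updates changes nothing, because each delta
is consumed exactly once.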

The following 2 patches add necessary helpers for the stats collection
workflow. Helpers that call into cgroup_rstat_updated() and
cgroup_rstat_flush() are added to allow bpf programs collecting stats to
tell the rstat framework that a cgroup has been updated, and to allow
bpf programs outputting stats to tell the rstat framework to flush the
stats before they are displayed to the user. An additional helper,
bpf_map_lookup_percpu_elem(), is introduced to allow rstat flusher
programs to access the percpu stats of the cpu being flushed.
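
To make the division of labor between these helpers concrete, here is a
simplified userspace model in plain C. Every name in it (model_collect(),
model_lookup_percpu(), model_flush(), model_read()) is hypothetical and only
stands in for the roles described above; it is flat (no hierarchy) and is
not the helper API added by these patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS		4	/* assumption: fixed sizes for the model */
#define NR_CGROUPS	3

/* Hypothetical model state; none of these are kernel symbols. */
static uint64_t percpu_stat[NR_CGROUPS][NR_CPUS];	/* unflushed per-cpu counts */
static bool updated[NR_CGROUPS][NR_CPUS];		/* pairs touched since last flush */
static uint64_t flushed_total[NR_CGROUPS];
static unsigned int flush_callbacks;			/* number of pairs actually visited */

/* Collector side: record a value and mark the (cgroup, cpu) pair as updated. */
static void model_collect(int cg, int cpu, uint64_t value)
{
	assert(cg >= 0 && cg < NR_CGROUPS);
	assert(cpu >= 0 && cpu < NR_CPUS);
	percpu_stat[cg][cpu] += value;
	updated[cg][cpu] = true;
}

/* Percpu lookup: return the slot of a chosen cpu, not the running one. */
static uint64_t *model_lookup_percpu(int cg, int cpu)
{
	return &percpu_stat[cg][cpu];
}

/* Flush: visit only the pairs that were marked as updated. */
static void model_flush(int cg)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		uint64_t *slot;

		if (!updated[cg][cpu])
			continue;
		slot = model_lookup_percpu(cg, cpu);
		flushed_total[cg] += *slot;
		*slot = 0;
		updated[cg][cpu] = false;
		flush_callbacks++;
	}
}

/* Reader side: flush first so the value shown is up to date. */
static uint64_t model_read(int cg)
{
	model_flush(cg);
	return flushed_total[cg];
}
```

The point of the "updated" marking in this model is that a read only pays
for the pairs that changed: a second model_read() with no new events visits
no per-cpu slots at all.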

The following 3 patches add the cgroup_iter program type (v2). This was
originally introduced by Hao as a part of a different series [1].
The cgroup_iter use case is better showcased as part of this patch series. We also
make cgroup_get_from_id() cgroup v1 friendly to allow cgroup_iter programs
to display stats for cgroup v1 as well. This small change makes the
entire workflow cgroup v1 friendly without any other dedicated changes.

The final patch is a selftest demonstrating the entire workflow with a
set of bpf programs that collect per-cgroup latency of memcg reclaim.

[1] https://lore.kernel.org/lkml/20220225234339.2386398-9-haoluo@google.com/


Hao Luo (2):
  cgroup: Add cgroup_put() in !CONFIG_CGROUPS case
  bpf: Introduce cgroup iter

Yosry Ahmed (7):
  bpf: introduce CGROUP_SUBSYS_RSTAT program type
  cgroup: bpf: flush bpf stats on rstat flush
  libbpf: Add support for rstat progs and links
  bpf: add bpf rstat helpers
  bpf: add bpf_map_lookup_percpu_elem() helper
  cgroup: add v1 support to cgroup_get_from_id()
  bpf: add a selftest for cgroup hierarchical stats collection

 include/linux/bpf-cgroup-subsys.h             |  35 ++
 include/linux/bpf.h                           |   4 +
 include/linux/bpf_types.h                     |   2 +
 include/linux/cgroup-defs.h                   |   4 +
 include/linux/cgroup.h                        |   5 +
 include/uapi/linux/bpf.h                      |  45 +++
 kernel/bpf/Makefile                           |   3 +-
 kernel/bpf/arraymap.c                         |  11 +-
 kernel/bpf/cgroup_iter.c                      | 148 ++++++++
 kernel/bpf/cgroup_subsys.c                    | 212 +++++++++++
 kernel/bpf/hashtab.c                          |  25 +-
 kernel/bpf/helpers.c                          |  56 +++
 kernel/bpf/syscall.c                          |   6 +
 kernel/bpf/verifier.c                         |   6 +
 kernel/cgroup/cgroup.c                        |  16 +-
 kernel/cgroup/rstat.c                         |  11 +
 scripts/bpf_doc.py                            |   2 +
 tools/include/uapi/linux/bpf.h                |  45 +++
 tools/lib/bpf/bpf.c                           |   3 +
 tools/lib/bpf/bpf.h                           |   3 +
 tools/lib/bpf/libbpf.c                        |  35 ++
 tools/lib/bpf/libbpf.h                        |   3 +
 tools/lib/bpf/libbpf.map                      |   1 +
 .../test_cgroup_hierarchical_stats.c          | 335 ++++++++++++++++++
 tools/testing/selftests/bpf/progs/bpf_iter.h  |   7 +
 .../selftests/bpf/progs/cgroup_vmscan.c       | 211 +++++++++++
 26 files changed, 1212 insertions(+), 22 deletions(-)
 create mode 100644 include/linux/bpf-cgroup-subsys.h
 create mode 100644 kernel/bpf/cgroup_iter.c
 create mode 100644 kernel/bpf/cgroup_subsys.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/test_cgroup_hierarchical_stats.c
 create mode 100644 tools/testing/selftests/bpf/progs/cgroup_vmscan.c

-- 
2.36.0.512.ge40c2bad7a-goog


Thread overview: 30+ messages
2022-05-10  0:17 [RFC PATCH bpf-next 0/9] bpf: cgroup hierarchical stats collection Yosry Ahmed
2022-05-10  0:17 ` [RFC PATCH bpf-next 1/9] bpf: introduce CGROUP_SUBSYS_RSTAT program type Yosry Ahmed
2022-05-10 18:07   ` Yosry Ahmed
2022-05-10 19:21     ` Yosry Ahmed
2022-05-10 18:44   ` Tejun Heo
2022-05-10 19:34     ` Yosry Ahmed
2022-05-10 19:59       ` Tejun Heo
2022-05-10 20:43         ` Yosry Ahmed
2022-05-10 21:01           ` Tejun Heo
2022-05-10 21:55             ` Yosry Ahmed
2022-05-10 22:09               ` Tejun Heo
2022-05-10 22:10                 ` Yosry Ahmed
2022-05-10  0:18 ` [RFC PATCH bpf-next 2/9] cgroup: bpf: flush bpf stats on rstat flush Yosry Ahmed
2022-05-10 18:45   ` Tejun Heo
2022-05-10  0:18 ` [RFC PATCH bpf-next 3/9] libbpf: Add support for rstat progs and links Yosry Ahmed
2022-05-10  0:18 ` [RFC PATCH bpf-next 4/9] bpf: add bpf rstat helpers Yosry Ahmed
2022-05-10  0:18 ` [RFC PATCH bpf-next 5/9] bpf: add bpf_map_lookup_percpu_elem() helper Yosry Ahmed
2022-05-10  0:18 ` [RFC PATCH bpf-next 6/9] cgroup: add v1 support to cgroup_get_from_id() Yosry Ahmed
2022-05-10 18:33   ` Tejun Heo
2022-05-10 18:36     ` Yosry Ahmed
2022-05-10  0:18 ` [RFC PATCH bpf-next 7/9] cgroup: Add cgroup_put() in !CONFIG_CGROUPS case Yosry Ahmed
2022-05-10 18:25   ` Hao Luo
2022-05-10  0:18 ` [RFC PATCH bpf-next 8/9] bpf: Introduce cgroup iter Yosry Ahmed
2022-05-10 18:25   ` Hao Luo
2022-05-10 18:54   ` Tejun Heo
2022-05-10 21:12     ` Hao Luo
2022-05-10 22:07       ` Tejun Heo
2022-05-10 22:49         ` Hao Luo
2022-05-10  0:18 ` [RFC PATCH bpf-next 9/9] selftest/bpf: add a selftest for cgroup hierarchical stats Yosry Ahmed
2022-05-13  7:16 ` [RFC PATCH bpf-next 0/9] bpf: cgroup hierarchical stats collection Yosry Ahmed
