linux-kernel.vger.kernel.org archive mirror
* [RFC 0/2] perf core: Sharing events with multiple cgroups
@ 2021-03-23 16:21 Namhyung Kim
  2021-03-23 16:21 ` [PATCH 1/2] perf/core: Share an event " Namhyung Kim
  2021-03-23 16:21 ` [PATCH 2/2] perf/core: Support reading group events with shared cgroups Namhyung Kim
  0 siblings, 2 replies; 16+ messages in thread
From: Namhyung Kim @ 2021-03-23 16:21 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Jiri Olsa, Mark Rutland,
	Alexander Shishkin, LKML, Stephane Eranian, Andi Kleen,
	Ian Rogers, Song Liu

Hello,

This work is to make perf stat more scalable with a lot of cgroups.

Currently we need to open a separate perf_event to count an event in a
cgroup.  For a big machine, this requires lots of events like

  256 cpu x 8 events x 200 cgroups = 409600 events

This is very wasteful and not scalable.  In this case, perf stat
actually counts exactly the same events for each cgroup.  I think we
can just use a single event to measure all cgroups running on that
cpu; the same measurement would then need only 256 cpu x 8 events =
2048 events, regardless of the number of cgroups.

So I added new ioctl commands to add per-cgroup counters to an
existing perf_event and to read the per-cgroup counters from the
event.  The per-cgroup counters are updated during the context switch
if the cgroups of the outgoing and incoming tasks differ (with no need
to reprogram the HW PMU).  The event keeps the counters in a hash
table keyed by cgroup id.

With this change, the average processing time of my internal test
workload, which runs tasks in different cgroups and communicates via
pipes, dropped from 11.3 usec to 5.8 usec.

Thanks,
Namhyung


Namhyung Kim (2):
  perf/core: Share an event with multiple cgroups
  perf/core: Support reading group events with shared cgroups

 include/linux/perf_event.h      |  22 ++
 include/uapi/linux/perf_event.h |   2 +
 kernel/events/core.c            | 588 ++++++++++++++++++++++++++++++--
 3 files changed, 585 insertions(+), 27 deletions(-)

-- 
2.31.0.rc2.261.g7f71774620-goog



* [PATCH 1/2] perf/core: Share an event with multiple cgroups
  2021-03-23 16:21 [RFC 0/2] perf core: Sharing events with multiple cgroups Namhyung Kim
@ 2021-03-23 16:21 ` Namhyung Kim
  2021-03-24  0:30   ` Song Liu
                     ` (2 more replies)
  2021-03-23 16:21 ` [PATCH 2/2] perf/core: Support reading group events with shared cgroups Namhyung Kim
  1 sibling, 3 replies; 16+ messages in thread
From: Namhyung Kim @ 2021-03-23 16:21 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Jiri Olsa, Mark Rutland,
	Alexander Shishkin, LKML, Stephane Eranian, Andi Kleen,
	Ian Rogers, Song Liu, Tejun Heo

As we can run many jobs (in containers) on a big machine, we want to
measure each job's performance during the run.  To do that, a
perf_event can be associated with a cgroup so that it measures that
cgroup only.

However, such cgroup events need to be opened separately, which causes
significant overhead in event multiplexing during context switches as
well as resource consumption in file descriptors and memory footprint.

As a cgroup event is basically a cpu event, we can share a single cpu
event for multiple cgroups.  All we need is a separate counter (and
two timing variables) for each cgroup.  I added a hash table to map
from a cgroup id to the attached cgroup node.

With this change, the cpu event calculates a delta of the event
counter values when the cgroups of the current and the next task are
different, and attributes the delta to the current task's cgroup.

This patch adds two new ioctl commands to perf_event for light-weight
cgroup event counting (i.e. perf stat).

 * PERF_EVENT_IOC_ATTACH_CGROUP - takes a buffer consisting of a
     64-bit array to attach the given cgroups.  The first element is
     the number of cgroups in the buffer, and the rest is a list of
     cgroup ids whose counters should be added to the given event.

 * PERF_EVENT_IOC_READ_CGROUP - takes a buffer consisting of a 64-bit
     array to read the event counter values.  The first element is the
     size of the array in bytes, and the second element is the cgroup
     id to read.  The rest of the buffer is filled with the counter
     value and timings.

This attaches all cgroups in a single syscall; I deliberately didn't
add a DETACH command, to keep the implementation simple.  The attached
cgroup nodes are deleted when the file descriptor of the perf_event is
closed.
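
Below is a rough sketch of the expected usage from user space.  This
is not part of the patch; it only follows the buffer layouts described
above, and assumes the 64-bit cgroup ids are obtained elsewhere (e.g.
via name_to_handle_at() on the cgroup v2 directories).

  #include <stdint.h>
  #include <stdlib.h>
  #include <string.h>
  #include <sys/ioctl.h>
  #include <linux/perf_event.h>

  #ifndef PERF_EVENT_IOC_ATTACH_CGROUP
  #define PERF_EVENT_IOC_ATTACH_CGROUP	_IOW('$', 12, __u64 *)
  #define PERF_EVENT_IOC_READ_CGROUP	_IOWR('$', 13, __u64 *)
  #endif

  /* attach 'nr' cgroups to one (per-cpu) event fd in a single call */
  static int attach_cgroups(int event_fd, const uint64_t *ids, uint64_t nr)
  {
  	uint64_t *buf;
  	int ret;

  	buf = malloc((nr + 1) * sizeof(*buf));
  	if (buf == NULL)
  		return -1;

  	buf[0] = nr;				 /* number of cgroup ids */
  	memcpy(&buf[1], ids, nr * sizeof(*ids)); /* the cgroup ids      */

  	ret = ioctl(event_fd, PERF_EVENT_IOC_ATTACH_CGROUP, buf);
  	free(buf);
  	return ret;
  }

  /* read one cgroup's counter value (no PERF_FORMAT_GROUP here) */
  static int read_cgroup_count(int event_fd, uint64_t cgrp_id, uint64_t *count)
  {
  	uint64_t buf[6] = { sizeof(buf), cgrp_id };

  	/* the kernel fills buf[2..] with the count (and timings/id,
  	 * depending on read_format) and returns the bytes written */
  	if (ioctl(event_fd, PERF_EVENT_IOC_READ_CGROUP, buf) < 0)
  		return -1;

  	*count = buf[2];
  	return 0;
  }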

Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 include/linux/perf_event.h      |  22 ++
 include/uapi/linux/perf_event.h |   2 +
 kernel/events/core.c            | 474 ++++++++++++++++++++++++++++++--
 3 files changed, 471 insertions(+), 27 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 3f7f89ea5e51..2760f3b07534 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -771,6 +771,18 @@ struct perf_event {
 
 #ifdef CONFIG_CGROUP_PERF
 	struct perf_cgroup		*cgrp; /* cgroup event is attach to */
+
+	/* to share an event for multiple cgroups */
+	struct hlist_head		*cgrp_node_hash;
+	struct perf_cgroup_node		*cgrp_node_entries;
+	int				nr_cgrp_nodes;
+	int				cgrp_node_hash_bits;
+
+	struct list_head		cgrp_node_entry;
+
+	u64				cgrp_node_count;
+	u64				cgrp_node_time_enabled;
+	u64				cgrp_node_time_running;
 #endif
 
 #ifdef CONFIG_SECURITY
@@ -780,6 +792,14 @@ struct perf_event {
 #endif /* CONFIG_PERF_EVENTS */
 };
 
+struct perf_cgroup_node {
+	struct hlist_node		node;
+	u64				id;
+	u64				count;
+	u64				time_enabled;
+	u64				time_running;
+	u64				padding[2];
+};
 
 struct perf_event_groups {
 	struct rb_root	tree;
@@ -843,6 +863,8 @@ struct perf_event_context {
 	int				pin_count;
 #ifdef CONFIG_CGROUP_PERF
 	int				nr_cgroups;	 /* cgroup evts */
+	struct list_head		cgrp_node_list;
+	struct list_head		cgrp_ctx_entry;
 #endif
 	void				*task_ctx_data; /* pmu specific data */
 	struct rcu_head			rcu_head;
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index ad15e40d7f5d..06bc7ab13616 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -479,6 +479,8 @@ struct perf_event_query_bpf {
 #define PERF_EVENT_IOC_PAUSE_OUTPUT		_IOW('$', 9, __u32)
 #define PERF_EVENT_IOC_QUERY_BPF		_IOWR('$', 10, struct perf_event_query_bpf *)
 #define PERF_EVENT_IOC_MODIFY_ATTRIBUTES	_IOW('$', 11, struct perf_event_attr *)
+#define PERF_EVENT_IOC_ATTACH_CGROUP		_IOW('$', 12, __u64 *)
+#define PERF_EVENT_IOC_READ_CGROUP		_IOWR('$', 13, __u64 *)
 
 enum perf_event_ioc_flags {
 	PERF_IOC_FLAG_GROUP		= 1U << 0,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index f07943183041..38c26a23418a 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -379,6 +379,7 @@ enum event_type_t {
  * perf_cgroup_events: >0 per-cpu cgroup events exist on this cpu
  */
 
+static void perf_sched_enable(void);
 static void perf_sched_delayed(struct work_struct *work);
 DEFINE_STATIC_KEY_FALSE(perf_sched_events);
 static DECLARE_DELAYED_WORK(perf_sched_work, perf_sched_delayed);
@@ -2124,6 +2125,322 @@ static int perf_get_aux_event(struct perf_event *event,
 	return 1;
 }
 
+#ifdef CONFIG_CGROUP_PERF
+static DEFINE_PER_CPU(struct list_head, cgroup_ctx_list);
+
+static bool event_can_attach_cgroup(struct perf_event *event)
+{
+	if (is_sampling_event(event))
+		return false;
+	if (event->attach_state & PERF_ATTACH_TASK)
+		return false;
+	if (is_cgroup_event(event))
+		return false;
+
+	return true;
+}
+
+static bool event_has_cgroup_node(struct perf_event *event)
+{
+	return event->nr_cgrp_nodes > 0;
+}
+
+static struct perf_cgroup_node *
+find_cgroup_node(struct perf_event *event, u64 cgrp_id)
+{
+	struct perf_cgroup_node *cgrp_node;
+	int key = hash_64(cgrp_id, event->cgrp_node_hash_bits);
+
+	hlist_for_each_entry(cgrp_node, &event->cgrp_node_hash[key], node) {
+		if (cgrp_node->id == cgrp_id)
+			return cgrp_node;
+	}
+
+	return NULL;
+}
+
+static void perf_update_cgroup_node(struct perf_event *event, struct cgroup *cgrp)
+{
+	u64 delta_count, delta_time_enabled, delta_time_running;
+	int i;
+
+	if (event->cgrp_node_count == 0)
+		goto out;
+
+	delta_count = local64_read(&event->count) - event->cgrp_node_count;
+	delta_time_enabled = event->total_time_enabled - event->cgrp_node_time_enabled;
+	delta_time_running = event->total_time_running - event->cgrp_node_time_running;
+
+	/* account delta to all ancestor cgroups */
+	for (i = 0; i <= cgrp->level; i++) {
+		struct perf_cgroup_node *node;
+
+		node = find_cgroup_node(event, cgrp->ancestor_ids[i]);
+		if (node) {
+			node->count += delta_count;
+			node->time_enabled += delta_time_enabled;
+			node->time_running += delta_time_running;
+		}
+	}
+
+out:
+	event->cgrp_node_count = local64_read(&event->count);
+	event->cgrp_node_time_enabled = event->total_time_enabled;
+	event->cgrp_node_time_running = event->total_time_running;
+}
+
+static void update_cgroup_node(struct perf_event *event, struct cgroup *cgrp)
+{
+	if (event->state == PERF_EVENT_STATE_ACTIVE)
+		event->pmu->read(event);
+
+	perf_event_update_time(event);
+	perf_update_cgroup_node(event, cgrp);
+}
+
+/* this is called from context switch */
+static void update_cgroup_node_events(struct perf_event_context *ctx,
+				      struct cgroup *cgrp)
+{
+	struct perf_event *event;
+
+	lockdep_assert_held(&ctx->lock);
+
+	if (ctx->is_active & EVENT_TIME)
+		update_context_time(ctx);
+
+	list_for_each_entry(event, &ctx->cgrp_node_list, cgrp_node_entry)
+		update_cgroup_node(event, cgrp);
+}
+
+static void cgroup_node_sched_out(struct task_struct *task)
+{
+	struct list_head *cgrp_ctx_list = this_cpu_ptr(&cgroup_ctx_list);
+	struct perf_cgroup *cgrp = perf_cgroup_from_task(task, NULL);
+	struct perf_event_context *ctx;
+
+	list_for_each_entry(ctx, cgrp_ctx_list, cgrp_ctx_entry) {
+		raw_spin_lock(&ctx->lock);
+		update_cgroup_node_events(ctx, cgrp->css.cgroup);
+		raw_spin_unlock(&ctx->lock);
+	}
+}
+
+/* these are called when the event is enabled/disabled */
+static void perf_add_cgrp_node_list(struct perf_event *event,
+				    struct perf_event_context *ctx)
+{
+	struct list_head *cgrp_ctx_list = this_cpu_ptr(&cgroup_ctx_list);
+	struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx);
+	bool is_first;
+
+	lockdep_assert_irqs_disabled();
+	lockdep_assert_held(&ctx->lock);
+
+	is_first = list_empty(&ctx->cgrp_node_list);
+	list_add_tail(&event->cgrp_node_entry, &ctx->cgrp_node_list);
+
+	if (is_first)
+		list_add_tail(&ctx->cgrp_ctx_entry, cgrp_ctx_list);
+
+	update_cgroup_node(event, cgrp->css.cgroup);
+}
+
+static void perf_del_cgrp_node_list(struct perf_event *event,
+				    struct perf_event_context *ctx)
+{
+	struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx);
+
+	lockdep_assert_irqs_disabled();
+	lockdep_assert_held(&ctx->lock);
+
+	update_cgroup_node(event, cgrp->css.cgroup);
+	/* to refresh delta when it's enabled */
+	event->cgrp_node_count = 0;
+
+	list_del(&event->cgrp_node_entry);
+
+	if (list_empty(&ctx->cgrp_node_list))
+		list_del(&ctx->cgrp_ctx_entry);
+}
+
+static void perf_attach_cgroup_node(struct perf_event *event,
+				    struct perf_cpu_context *cpuctx,
+				    struct perf_event_context *ctx,
+				    void *data)
+{
+	if (ctx->is_active & EVENT_TIME)
+		update_context_time(ctx);
+
+	perf_add_cgrp_node_list(event, ctx);
+}
+
+#define MIN_CGRP_NODE_HASH  4
+#define MAX_CGRP_NODE_HASH  (4 * 1024)
+
+/* this is called from ioctl */
+static int perf_event_attach_cgroup_node(struct perf_event *event, u64 nr_cgrps,
+					 u64 *cgroup_ids)
+{
+	struct perf_cgroup_node *cgrp_node;
+	struct perf_event_context *ctx = event->ctx;
+	struct hlist_head *cgrp_node_hash;
+	int node = (event->cpu >= 0) ? cpu_to_node(event->cpu) : -1;
+	unsigned long flags;
+	bool is_first = true;
+	bool enabled;
+	int i, nr_hash;
+	int hash_bits;
+
+	if (nr_cgrps < MIN_CGRP_NODE_HASH)
+		nr_hash = MIN_CGRP_NODE_HASH;
+	else
+		nr_hash = roundup_pow_of_two(nr_cgrps);
+	hash_bits = ilog2(nr_hash);
+
+	cgrp_node_hash = kcalloc_node(nr_hash, sizeof(*cgrp_node_hash),
+				      GFP_KERNEL, node);
+	if (cgrp_node_hash == NULL)
+		return -ENOMEM;
+
+	cgrp_node = kcalloc_node(nr_cgrps, sizeof(*cgrp_node), GFP_KERNEL, node);
+	if (cgrp_node == NULL) {
+		kfree(cgrp_node_hash);
+		return -ENOMEM;
+	}
+
+	for (i = 0; i < (int)nr_cgrps; i++) {
+		int key = hash_64(cgroup_ids[i], hash_bits);
+
+		cgrp_node[i].id = cgroup_ids[i];
+		hlist_add_head(&cgrp_node[i].node, &cgrp_node_hash[key]);
+	}
+
+	raw_spin_lock_irqsave(&ctx->lock, flags);
+
+	enabled = event->state >= PERF_EVENT_STATE_INACTIVE;
+
+	if (event->nr_cgrp_nodes != 0) {
+		kfree(event->cgrp_node_hash);
+		kfree(event->cgrp_node_entries);
+		is_first = false;
+	}
+
+	event->cgrp_node_hash = cgrp_node_hash;
+	event->cgrp_node_entries = cgrp_node;
+	event->cgrp_node_hash_bits = hash_bits;
+	event->nr_cgrp_nodes = nr_cgrps;
+
+	raw_spin_unlock_irqrestore(&ctx->lock, flags);
+
+	if (is_first && enabled)
+		event_function_call(event, perf_attach_cgroup_node, NULL);
+
+	return 0;
+}
+
+static void perf_event_destroy_cgroup_nodes(struct perf_event *event)
+{
+	struct perf_event_context *ctx = event->ctx;
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&ctx->lock, flags);
+
+	if (event_has_cgroup_node(event)) {
+		if (!atomic_add_unless(&perf_sched_count, -1, 1))
+			schedule_delayed_work(&perf_sched_work, HZ);
+	}
+
+	kfree(event->cgrp_node_hash);
+	kfree(event->cgrp_node_entries);
+	event->nr_cgrp_nodes = 0;
+
+	raw_spin_unlock_irqrestore(&ctx->lock, flags);
+}
+
+static int perf_event_read(struct perf_event *event, bool group);
+
+static void __perf_read_cgroup_node(struct perf_event *event)
+{
+	struct perf_cgroup *cgrp;
+
+	if (event_has_cgroup_node(event)) {
+		cgrp = perf_cgroup_from_task(current, NULL);
+		perf_update_cgroup_node(event, cgrp->css.cgroup);
+	}
+}
+
+static int perf_event_read_cgroup_node(struct perf_event *event, u64 read_size,
+				       u64 cgrp_id, char __user *buf)
+{
+	struct perf_cgroup_node *cgrp;
+	struct perf_event_context *ctx = event->ctx;
+	unsigned long flags;
+	u64 read_format = event->attr.read_format;
+	u64 values[4];
+	int n = 0;
+
+	/* update event count and times (possibly run on other cpu) */
+	(void)perf_event_read(event, false);
+
+	raw_spin_lock_irqsave(&ctx->lock, flags);
+
+	cgrp = find_cgroup_node(event, cgrp_id);
+	if (cgrp == NULL) {
+		raw_spin_unlock_irqrestore(&ctx->lock, flags);
+		return -ENOENT;
+	}
+
+	values[n++] = cgrp->count;
+	if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
+		values[n++] = cgrp->time_enabled;
+	if (read_format & PERF_FORMAT_TOTAL_TIME_RUNNING)
+		values[n++] = cgrp->time_running;
+	if (read_format & PERF_FORMAT_ID)
+		values[n++] = primary_event_id(event);
+
+	raw_spin_unlock_irqrestore(&ctx->lock, flags);
+
+	if (copy_to_user(buf, values, n * sizeof(u64)))
+		return -EFAULT;
+
+	return n * sizeof(u64);
+}
+#else  /* !CONFIG_CGROUP_PERF */
+static inline bool event_can_attach_cgroup(struct perf_event *event)
+{
+	return false;
+}
+
+static inline bool event_has_cgroup_node(struct perf_event *event)
+{
+	return false;
+}
+
+static void cgroup_node_sched_out(struct task_struct *task) {}
+
+static inline void perf_add_cgrp_node_list(struct perf_event *event,
+					   struct perf_event_context *ctx) {}
+static inline void perf_del_cgrp_node_list(struct perf_event *event,
+					   struct perf_event_context *ctx) {}
+
+#define MAX_CGRP_NODE_HASH  1
+static inline int perf_event_attach_cgroup_node(struct perf_event *event,
+						u64 nr_cgrps, u64 *cgrp_ids)
+{
+	return -ENODEV;
+}
+
+static inline void perf_event_destroy_cgroup_nodes(struct perf_event *event) {}
+static inline void __perf_read_cgroup_node(struct perf_event *event) {}
+
+static inline int perf_event_read_cgroup_node(struct perf_event *event,
+					      u64 read_size, u64 cgrp_id,
+					      char __user *buf)
+{
+	return -EINVAL;
+}
+#endif  /* CONFIG_CGROUP_PERF */
+
 static inline struct list_head *get_event_list(struct perf_event *event)
 {
 	struct perf_event_context *ctx = event->ctx;
@@ -2407,6 +2724,7 @@ static void __perf_event_disable(struct perf_event *event,
 
 	perf_event_set_state(event, PERF_EVENT_STATE_OFF);
 	perf_cgroup_event_disable(event, ctx);
+	perf_del_cgrp_node_list(event, ctx);
 }
 
 /*
@@ -2946,6 +3264,7 @@ static void __perf_event_enable(struct perf_event *event,
 
 	perf_event_set_state(event, PERF_EVENT_STATE_INACTIVE);
 	perf_cgroup_event_enable(event, ctx);
+	perf_add_cgrp_node_list(event, ctx);
 
 	if (!ctx->is_active)
 		return;
@@ -3568,6 +3887,11 @@ void __perf_event_task_sched_out(struct task_struct *task,
 	 */
 	if (atomic_read(this_cpu_ptr(&perf_cgroup_events)))
 		perf_cgroup_sched_out(task, next);
+
+	if (!list_empty(this_cpu_ptr(&cgroup_ctx_list)) &&
+	    perf_cgroup_from_task(task, NULL) !=
+	    perf_cgroup_from_task(next, NULL))
+		cgroup_node_sched_out(task);
 }
 
 /*
@@ -4268,6 +4592,7 @@ static void __perf_event_read(void *info)
 
 	if (!data->group) {
 		pmu->read(event);
+		__perf_read_cgroup_node(event);
 		data->ret = 0;
 		goto unlock;
 	}
@@ -4283,6 +4608,7 @@ static void __perf_event_read(void *info)
 			 * sibling could be on different (eg: software) PMU.
 			 */
 			sub->pmu->read(sub);
+			__perf_read_cgroup_node(sub);
 		}
 	}
 
@@ -4461,6 +4787,8 @@ static void __perf_event_init_context(struct perf_event_context *ctx)
 	INIT_LIST_HEAD(&ctx->event_list);
 	INIT_LIST_HEAD(&ctx->pinned_active);
 	INIT_LIST_HEAD(&ctx->flexible_active);
+	INIT_LIST_HEAD(&ctx->cgrp_ctx_entry);
+	INIT_LIST_HEAD(&ctx->cgrp_node_list);
 	refcount_set(&ctx->refcount, 1);
 }
 
@@ -4851,6 +5179,8 @@ static void _free_event(struct perf_event *event)
 	if (is_cgroup_event(event))
 		perf_detach_cgroup(event);
 
+	perf_event_destroy_cgroup_nodes(event);
+
 	if (!event->parent) {
 		if (event->attr.sample_type & PERF_SAMPLE_CALLCHAIN)
 			put_callchain_buffers();
@@ -5571,6 +5901,58 @@ static long _perf_ioctl(struct perf_event *event, unsigned int cmd, unsigned lon
 
 		return perf_event_modify_attr(event,  &new_attr);
 	}
+
+	case PERF_EVENT_IOC_ATTACH_CGROUP: {
+		u64 nr_cgrps;
+		u64 *cgrp_buf;
+		size_t cgrp_bufsz;
+		int ret;
+
+		if (!event_can_attach_cgroup(event))
+			return -EINVAL;
+
+		if (copy_from_user(&nr_cgrps, (u64 __user *)arg,
+				   sizeof(nr_cgrps)))
+			return -EFAULT;
+
+		if (nr_cgrps == 0 || nr_cgrps > MAX_CGRP_NODE_HASH)
+			return -EINVAL;
+
+		cgrp_bufsz = nr_cgrps * sizeof(*cgrp_buf);
+
+		cgrp_buf = kmalloc(cgrp_bufsz, GFP_KERNEL);
+		if (cgrp_buf == NULL)
+			return -ENOMEM;
+
+		if (copy_from_user(cgrp_buf, (u64 __user *)(arg + 8),
+				   cgrp_bufsz)) {
+			kfree(cgrp_buf);
+			return -EFAULT;
+		}
+
+		ret = perf_event_attach_cgroup_node(event, nr_cgrps, cgrp_buf);
+
+		kfree(cgrp_buf);
+		return ret;
+	}
+
+	case PERF_EVENT_IOC_READ_CGROUP: {
+		u64 read_size, cgrp_id;
+
+		if (!event_can_attach_cgroup(event))
+			return -EINVAL;
+
+		if (copy_from_user(&read_size, (u64 __user *)arg,
+				   sizeof(read_size)))
+			return -EFAULT;
+		if (copy_from_user(&cgrp_id, (u64 __user *)(arg + 8),
+				   sizeof(cgrp_id)))
+			return -EFAULT;
+
+		return perf_event_read_cgroup_node(event, read_size, cgrp_id,
+						   (char __user *)(arg + 16));
+	}
+
 	default:
 		return -ENOTTY;
 	}
@@ -5583,10 +5965,39 @@ static long _perf_ioctl(struct perf_event *event, unsigned int cmd, unsigned lon
 	return 0;
 }
 
+static void perf_sched_enable(void)
+{
+	/*
+	 * We need the mutex here because static_branch_enable()
+	 * must complete *before* the perf_sched_count increment
+	 * becomes visible.
+	 */
+	if (atomic_inc_not_zero(&perf_sched_count))
+		return;
+
+	mutex_lock(&perf_sched_mutex);
+	if (!atomic_read(&perf_sched_count)) {
+		static_branch_enable(&perf_sched_events);
+		/*
+		 * Guarantee that all CPUs observe they key change and
+		 * call the perf scheduling hooks before proceeding to
+		 * install events that need them.
+		 */
+		synchronize_rcu();
+	}
+	/*
+	 * Now that we have waited for the sync_sched(), allow further
+	 * increments to by-pass the mutex.
+	 */
+	atomic_inc(&perf_sched_count);
+	mutex_unlock(&perf_sched_mutex);
+}
+
 static long perf_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 {
 	struct perf_event *event = file->private_data;
 	struct perf_event_context *ctx;
+	bool do_sched_enable = false;
 	long ret;
 
 	/* Treat ioctl like writes as it is likely a mutating operation. */
@@ -5595,9 +6006,19 @@ static long perf_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 		return ret;
 
 	ctx = perf_event_ctx_lock(event);
+	/* ATTACH_CGROUP requires context switch callback */
+	if (cmd == PERF_EVENT_IOC_ATTACH_CGROUP && !event_has_cgroup_node(event))
+		do_sched_enable = true;
 	ret = _perf_ioctl(event, cmd, arg);
 	perf_event_ctx_unlock(event, ctx);
 
+	/*
+	 * Due to the circular lock dependency, it cannot call
+	 * static_branch_enable() under the ctx->mutex.
+	 */
+	if (do_sched_enable && ret >= 0)
+		perf_sched_enable();
+
 	return ret;
 }
 
@@ -11240,33 +11661,8 @@ static void account_event(struct perf_event *event)
 	if (event->attr.text_poke)
 		atomic_inc(&nr_text_poke_events);
 
-	if (inc) {
-		/*
-		 * We need the mutex here because static_branch_enable()
-		 * must complete *before* the perf_sched_count increment
-		 * becomes visible.
-		 */
-		if (atomic_inc_not_zero(&perf_sched_count))
-			goto enabled;
-
-		mutex_lock(&perf_sched_mutex);
-		if (!atomic_read(&perf_sched_count)) {
-			static_branch_enable(&perf_sched_events);
-			/*
-			 * Guarantee that all CPUs observe they key change and
-			 * call the perf scheduling hooks before proceeding to
-			 * install events that need them.
-			 */
-			synchronize_rcu();
-		}
-		/*
-		 * Now that we have waited for the sync_sched(), allow further
-		 * increments to by-pass the mutex.
-		 */
-		atomic_inc(&perf_sched_count);
-		mutex_unlock(&perf_sched_mutex);
-	}
-enabled:
+	if (inc)
+		perf_sched_enable();
 
 	account_event_cpu(event, event->cpu);
 
@@ -13008,6 +13404,7 @@ static void __init perf_event_init_all_cpus(void)
 
 #ifdef CONFIG_CGROUP_PERF
 		INIT_LIST_HEAD(&per_cpu(cgrp_cpuctx_list, cpu));
+		INIT_LIST_HEAD(&per_cpu(cgroup_ctx_list, cpu));
 #endif
 		INIT_LIST_HEAD(&per_cpu(sched_cb_list, cpu));
 	}
@@ -13218,6 +13615,28 @@ static int perf_cgroup_css_online(struct cgroup_subsys_state *css)
 	return 0;
 }
 
+static int __perf_cgroup_update_node(void *info)
+{
+	struct task_struct *task = info;
+
+	rcu_read_lock();
+	cgroup_node_sched_out(task);
+	rcu_read_unlock();
+
+	return 0;
+}
+
+static int perf_cgroup_can_attach(struct cgroup_taskset *tset)
+{
+	struct task_struct *task;
+	struct cgroup_subsys_state *css;
+
+	cgroup_taskset_for_each(task, css, tset)
+		task_function_call(task, __perf_cgroup_update_node, task);
+
+	return 0;
+}
+
 static int __perf_cgroup_move(void *info)
 {
 	struct task_struct *task = info;
@@ -13240,6 +13659,7 @@ struct cgroup_subsys perf_event_cgrp_subsys = {
 	.css_alloc	= perf_cgroup_css_alloc,
 	.css_free	= perf_cgroup_css_free,
 	.css_online	= perf_cgroup_css_online,
+	.can_attach	= perf_cgroup_can_attach,
 	.attach		= perf_cgroup_attach,
 	/*
 	 * Implicitly enable on dfl hierarchy so that perf events can
-- 
2.31.0.rc2.261.g7f71774620-goog



* [PATCH 2/2] perf/core: Support reading group events with shared cgroups
  2021-03-23 16:21 [RFC 0/2] perf core: Sharing events with multiple cgroups Namhyung Kim
  2021-03-23 16:21 ` [PATCH 1/2] perf/core: Share an event " Namhyung Kim
@ 2021-03-23 16:21 ` Namhyung Kim
  2021-03-28 17:31   ` Song Liu
  1 sibling, 1 reply; 16+ messages in thread
From: Namhyung Kim @ 2021-03-23 16:21 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Jiri Olsa, Mark Rutland,
	Alexander Shishkin, LKML, Stephane Eranian, Andi Kleen,
	Ian Rogers, Song Liu

This enables reading an event group's counter values together with the
PERF_EVENT_IOC_READ_CGROUP command, like we do in the regular read().
Users should pass a buffer large enough for the whole group read.
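
For reference, below is a rough sketch (not part of the patch) of the
user-side buffer for a group read, assuming read_format contains
PERF_FORMAT_GROUP | PERF_FORMAT_ID | PERF_FORMAT_TOTAL_TIME_ENABLED |
PERF_FORMAT_TOTAL_TIME_RUNNING.  The first two elements are inputs as
before; the rest mirrors the regular PERF_FORMAT_GROUP read layout
filled by perf_event_read_cgrp_node_group() below.

  #include <linux/types.h>

  struct cgrp_group_read {
  	__u64	size;		/* [in]  buffer size in bytes          */
  	__u64	cgrp_id;	/* [in]  cgroup id to read             */
  	__u64	nr;		/* [out] number of events in the group */
  	__u64	time_enabled;	/* [out] cgroup's enabled time         */
  	__u64	time_running;	/* [out] cgroup's running time         */
  	struct {
  		__u64	value;	/* [out] per-cgroup counter value      */
  		__u64	id;	/* [out] primary event id              */
  	} cnt[];		/* group leader first, then siblings   */
  };

The size field must cover the event's regular read_size plus the two
leading u64 words, otherwise the ioctl returns -EINVAL.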

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 kernel/events/core.c | 119 +++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 116 insertions(+), 3 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 38c26a23418a..3225177e54d5 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2232,13 +2232,24 @@ static void perf_add_cgrp_node_list(struct perf_event *event,
 {
 	struct list_head *cgrp_ctx_list = this_cpu_ptr(&cgroup_ctx_list);
 	struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx);
+	struct perf_event *sibling;
 	bool is_first;
 
 	lockdep_assert_irqs_disabled();
 	lockdep_assert_held(&ctx->lock);
 
+	/* only group leader can be added directly */
+	if (event->group_leader != event)
+		return;
+
+	if (!event_has_cgroup_node(event))
+		return;
+
 	is_first = list_empty(&ctx->cgrp_node_list);
+
 	list_add_tail(&event->cgrp_node_entry, &ctx->cgrp_node_list);
+	for_each_sibling_event(sibling, event)
+		list_add_tail(&sibling->cgrp_node_entry, &ctx->cgrp_node_list);
 
 	if (is_first)
 		list_add_tail(&ctx->cgrp_ctx_entry, cgrp_ctx_list);
@@ -2250,15 +2261,25 @@ static void perf_del_cgrp_node_list(struct perf_event *event,
 				    struct perf_event_context *ctx)
 {
 	struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx);
+	struct perf_event *sibling;
 
 	lockdep_assert_irqs_disabled();
 	lockdep_assert_held(&ctx->lock);
 
+	/* only group leader can be deleted directly */
+	if (event->group_leader != event)
+		return;
+
+	if (!event_has_cgroup_node(event))
+		return;
+
 	update_cgroup_node(event, cgrp->css.cgroup);
 	/* to refresh delta when it's enabled */
 	event->cgrp_node_count = 0;
 
 	list_del(&event->cgrp_node_entry);
+	for_each_sibling_event(sibling, event)
+		list_del(&sibling->cgrp_node_entry);
 
 	if (list_empty(&ctx->cgrp_node_list))
 		list_del(&ctx->cgrp_ctx_entry);
@@ -2333,7 +2354,7 @@ static int perf_event_attach_cgroup_node(struct perf_event *event, u64 nr_cgrps,
 
 	raw_spin_unlock_irqrestore(&ctx->lock, flags);
 
-	if (is_first && enabled)
+	if (is_first && enabled && event->group_leader == event)
 		event_function_call(event, perf_attach_cgroup_node, NULL);
 
 	return 0;
@@ -2370,8 +2391,8 @@ static void __perf_read_cgroup_node(struct perf_event *event)
 	}
 }
 
-static int perf_event_read_cgroup_node(struct perf_event *event, u64 read_size,
-				       u64 cgrp_id, char __user *buf)
+static int perf_event_read_cgrp_node_one(struct perf_event *event, u64 cgrp_id,
+					 char __user *buf)
 {
 	struct perf_cgroup_node *cgrp;
 	struct perf_event_context *ctx = event->ctx;
@@ -2406,6 +2427,91 @@ static int perf_event_read_cgroup_node(struct perf_event *event, u64 read_size,
 
 	return n * sizeof(u64);
 }
+
+static int perf_event_read_cgrp_node_sibling(struct perf_event *event,
+					     u64 read_format, u64 cgrp_id,
+					     u64 *values)
+{
+	struct perf_cgroup_node *cgrp;
+	int n = 0;
+
+	cgrp = find_cgroup_node(event, cgrp_id);
+	if (cgrp == NULL)
+		return (read_format & PERF_FORMAT_ID) ? 2 : 1;
+
+	values[n++] = cgrp->count;
+	if (read_format & PERF_FORMAT_ID)
+		values[n++] = primary_event_id(event);
+	return n;
+}
+
+static int perf_event_read_cgrp_node_group(struct perf_event *event, u64 cgrp_id,
+					   char __user *buf)
+{
+	struct perf_cgroup_node *cgrp;
+	struct perf_event_context *ctx = event->ctx;
+	struct perf_event *sibling;
+	u64 read_format = event->attr.read_format;
+	unsigned long flags;
+	u64 *values;
+	int n = 1;
+	int ret;
+
+	values = kzalloc(event->read_size, GFP_KERNEL);
+	if (!values)
+		return -ENOMEM;
+
+	values[0] = 1 + event->nr_siblings;
+
+	/* update event count and times (possibly run on other cpu) */
+	(void)perf_event_read(event, true);
+
+	raw_spin_lock_irqsave(&ctx->lock, flags);
+
+	cgrp = find_cgroup_node(event, cgrp_id);
+	if (cgrp == NULL) {
+		raw_spin_unlock_irqrestore(&ctx->lock, flags);
+		kfree(values);
+		return -ENOENT;
+	}
+
+	if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
+		values[n++] = cgrp->time_enabled;
+	if (read_format & PERF_FORMAT_TOTAL_TIME_RUNNING)
+		values[n++] = cgrp->time_running;
+
+	values[n++] = cgrp->count;
+	if (read_format & PERF_FORMAT_ID)
+		values[n++] = primary_event_id(event);
+
+	for_each_sibling_event(sibling, event) {
+		n += perf_event_read_cgrp_node_sibling(sibling, read_format,
+						       cgrp_id, &values[n]);
+	}
+
+	raw_spin_unlock_irqrestore(&ctx->lock, flags);
+
+	ret = copy_to_user(buf, values, n * sizeof(u64));
+	kfree(values);
+	if (ret)
+		return -EFAULT;
+
+	return n * sizeof(u64);
+}
+
+static int perf_event_read_cgroup_node(struct perf_event *event, u64 read_size,
+				       u64 cgrp_id, char __user *buf)
+{
+	u64 read_format = event->attr.read_format;
+
+	if (read_size < event->read_size + 2 * sizeof(u64))
+		return -EINVAL;
+
+	if (read_format & PERF_FORMAT_GROUP)
+		return perf_event_read_cgrp_node_group(event, cgrp_id, buf);
+
+	return perf_event_read_cgrp_node_one(event, cgrp_id, buf);
+}
 #else  /* !CONFIG_CGROUP_PERF */
 static inline bool event_can_attach_cgroup(struct perf_event *event)
 {
@@ -2511,6 +2617,7 @@ static void perf_group_detach(struct perf_event *event)
 			if (sibling->state == PERF_EVENT_STATE_ACTIVE)
 				list_add_tail(&sibling->active_list, get_event_list(sibling));
 		}
+		perf_add_cgrp_node_list(sibling, event->ctx);
 
 		WARN_ON_ONCE(sibling->ctx != event->ctx);
 	}
@@ -2654,6 +2761,9 @@ __perf_remove_from_context(struct perf_event *event,
 		perf_group_detach(event);
 	list_del_event(event, ctx);
 
+	if (event->state > PERF_EVENT_STATE_OFF)
+		perf_del_cgrp_node_list(event, ctx);
+
 	if (!ctx->nr_events && ctx->is_active) {
 		ctx->is_active = 0;
 		ctx->rotate_necessary = 0;
@@ -3112,6 +3222,9 @@ static int  __perf_install_in_context(void *info)
 		reprogram = cgroup_is_descendant(cgrp->css.cgroup,
 					event->cgrp->css.cgroup);
 	}
+
+	if (event->state > PERF_EVENT_STATE_OFF)
+		perf_add_cgrp_node_list(event, ctx);
 #endif
 
 	if (reprogram) {
-- 
2.31.0.rc2.261.g7f71774620-goog



* Re: [PATCH 1/2] perf/core: Share an event with multiple cgroups
  2021-03-23 16:21 ` [PATCH 1/2] perf/core: Share an event " Namhyung Kim
@ 2021-03-24  0:30   ` Song Liu
  2021-03-24  1:06     ` Namhyung Kim
  2021-03-25  0:55   ` Song Liu
  2021-03-28 17:17   ` Song Liu
  2 siblings, 1 reply; 16+ messages in thread
From: Song Liu @ 2021-03-24  0:30 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Jiri Olsa,
	Mark Rutland, Alexander Shishkin, LKML, Stephane Eranian,
	Andi Kleen, Ian Rogers, Tejun Heo



> On Mar 23, 2021, at 9:21 AM, Namhyung Kim <namhyung@kernel.org> wrote:
> 
> As we can run many jobs (in container) on a big machine, we want to
> measure each job's performance during the run.  To do that, the
> perf_event can be associated to a cgroup to measure it only.
> 
> However such cgroup events need to be opened separately and it causes
> significant overhead in event multiplexing during the context switch
> as well as resource consumption like in file descriptors and memory
> footprint.
> 
> As a cgroup event is basically a cpu event, we can share a single cpu
> event for multiple cgroups.  All we need is a separate counter (and
> two timing variables) for each cgroup.  I added a hash table to map
> from cgroup id to the attached cgroups.
> 
> With this change, the cpu event needs to calculate a delta of event
> counter values when the cgroups of current and the next task are
> different.  And it attributes the delta to the current task's cgroup.
> 
> This patch adds two new ioctl commands to perf_event for light-weight
> cgroup event counting (i.e. perf stat).
> 
> * PERF_EVENT_IOC_ATTACH_CGROUP - it takes a buffer consists of a
>     64-bit array to attach given cgroups.  The first element is a
>     number of cgroups in the buffer, and the rest is a list of cgroup
>     ids to add a cgroup info to the given event.
> 
> * PERF_EVENT_IOC_READ_CGROUP - it takes a buffer consists of a 64-bit
>     array to get the event counter values.  The first element is size
>     of the array in byte, and the second element is a cgroup id to
>     read.  The rest is to save the counter value and timings.
> 
> This attaches all cgroups in a single syscall and I didn't add the
> DETACH command deliberately to make the implementation simple.  The
> attached cgroup nodes would be deleted when the file descriptor of the
> perf_event is closed.

This is a very interesting idea!

Could you please add some description of the relationship between
perf_events and contexts? The code is a little confusing. For example,
why do we need cgroup_ctx_list?

Thanks,
Song 

[...]



* Re: [PATCH 1/2] perf/core: Share an event with multiple cgroups
  2021-03-24  0:30   ` Song Liu
@ 2021-03-24  1:06     ` Namhyung Kim
  0 siblings, 0 replies; 16+ messages in thread
From: Namhyung Kim @ 2021-03-24  1:06 UTC (permalink / raw)
  To: Song Liu
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Jiri Olsa,
	Mark Rutland, Alexander Shishkin, LKML, Stephane Eranian,
	Andi Kleen, Ian Rogers, Tejun Heo

Hi Song,

On Wed, Mar 24, 2021 at 9:30 AM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Mar 23, 2021, at 9:21 AM, Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > As we can run many jobs (in container) on a big machine, we want to
> > measure each job's performance during the run.  To do that, the
> > perf_event can be associated to a cgroup to measure it only.
> >
> > However such cgroup events need to be opened separately and it causes
> > significant overhead in event multiplexing during the context switch
> > as well as resource consumption like in file descriptors and memory
> > footprint.
> >
> > As a cgroup event is basically a cpu event, we can share a single cpu
> > event for multiple cgroups.  All we need is a separate counter (and
> > two timing variables) for each cgroup.  I added a hash table to map
> > from cgroup id to the attached cgroups.
> >
> > With this change, the cpu event needs to calculate a delta of event
> > counter values when the cgroups of current and the next task are
> > different.  And it attributes the delta to the current task's cgroup.
> >
> > This patch adds two new ioctl commands to perf_event for light-weight
> > cgroup event counting (i.e. perf stat).
> >
> > * PERF_EVENT_IOC_ATTACH_CGROUP - it takes a buffer consists of a
> >     64-bit array to attach given cgroups.  The first element is a
> >     number of cgroups in the buffer, and the rest is a list of cgroup
> >     ids to add a cgroup info to the given event.
> >
> > * PERF_EVENT_IOC_READ_CGROUP - it takes a buffer consists of a 64-bit
> >     array to get the event counter values.  The first element is size
> >     of the array in byte, and the second element is a cgroup id to
> >     read.  The rest is to save the counter value and timings.
> >
> > This attaches all cgroups in a single syscall and I didn't add the
> > DETACH command deliberately to make the implementation simple.  The
> > attached cgroup nodes would be deleted when the file descriptor of the
> > perf_event is closed.
>
> This is very interesting idea!

Thanks!

>
> Could you please add some description of the relationship among
> perf_event and contexts? The code is a little confusing. For example,
> why do we need cgroup_ctx_list?

Sure, a perf_event belongs to an event context (hw or sw, mostly) which
takes care of multiplexing, timing, locking and so on.  So many of the
fields in the perf_event are protected by the context lock.  A context has
a list of perf_events and there are per-cpu contexts and per-task contexts.

The cgroup_ctx_list is used to traverse only the contexts (on that
cpu) that have perf_events with attached cgroups.
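
Roughly, the links look like this (just restating the data structures
in the patch, nothing new):

  per-cpu cgroup_ctx_list
    -> perf_event_context (linked via ctx->cgrp_ctx_entry)
         ctx->cgrp_node_list
           -> perf_event (linked via event->cgrp_node_entry)
                event->cgrp_node_hash[]
                  -> perf_cgroup_node (per-cgroup count and timings)

At context switch, cgroup_node_sched_out() walks only the per-cpu
cgroup_ctx_list, so contexts without such events are skipped.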

Hope this makes it clear.  Please let me know if you need more. :)

Thanks,
Namhyung


* Re: [PATCH 1/2] perf/core: Share an event with multiple cgroups
  2021-03-23 16:21 ` [PATCH 1/2] perf/core: Share an event " Namhyung Kim
  2021-03-24  0:30   ` Song Liu
@ 2021-03-25  0:55   ` Song Liu
  2021-03-25  2:44     ` Namhyung Kim
  2021-03-25 12:57     ` Arnaldo Carvalho de Melo
  2021-03-28 17:17   ` Song Liu
  2 siblings, 2 replies; 16+ messages in thread
From: Song Liu @ 2021-03-25  0:55 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Jiri Olsa,
	Mark Rutland, Alexander Shishkin, LKML, Stephane Eranian,
	Andi Kleen, Ian Rogers, Tejun Heo



> On Mar 23, 2021, at 9:21 AM, Namhyung Kim <namhyung@kernel.org> wrote:
> 
> As we can run many jobs (in container) on a big machine, we want to
> measure each job's performance during the run.  To do that, the
> perf_event can be associated to a cgroup to measure it only.
> 
> However such cgroup events need to be opened separately and it causes
> significant overhead in event multiplexing during the context switch
> as well as resource consumption like in file descriptors and memory
> footprint.
> 
> As a cgroup event is basically a cpu event, we can share a single cpu
> event for multiple cgroups.  All we need is a separate counter (and
> two timing variables) for each cgroup.  I added a hash table to map
> from cgroup id to the attached cgroups.
> 
> With this change, the cpu event needs to calculate a delta of event
> counter values when the cgroups of current and the next task are
> different.  And it attributes the delta to the current task's cgroup.
> 
> This patch adds two new ioctl commands to perf_event for light-weight
> cgroup event counting (i.e. perf stat).
> 
> * PERF_EVENT_IOC_ATTACH_CGROUP - it takes a buffer consists of a
>     64-bit array to attach given cgroups.  The first element is a
>     number of cgroups in the buffer, and the rest is a list of cgroup
>     ids to add a cgroup info to the given event.
> 
> * PERF_EVENT_IOC_READ_CGROUP - it takes a buffer consists of a 64-bit
>     array to get the event counter values.  The first element is size
>     of the array in byte, and the second element is a cgroup id to
>     read.  The rest is to save the counter value and timings.
> 
> This attaches all cgroups in a single syscall and I didn't add the
> DETACH command deliberately to make the implementation simple.  The
> attached cgroup nodes would be deleted when the file descriptor of the
> perf_event is closed.
> 
> Cc: Tejun Heo <tj@kernel.org>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
> include/linux/perf_event.h      |  22 ++
> include/uapi/linux/perf_event.h |   2 +
> kernel/events/core.c            | 474 ++++++++++++++++++++++++++++++--
> 3 files changed, 471 insertions(+), 27 deletions(-)
> 
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 3f7f89ea5e51..2760f3b07534 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -771,6 +771,18 @@ struct perf_event {
> 
> #ifdef CONFIG_CGROUP_PERF
> 	struct perf_cgroup		*cgrp; /* cgroup event is attach to */
> +
> +	/* to share an event for multiple cgroups */
> +	struct hlist_head		*cgrp_node_hash;
> +	struct perf_cgroup_node		*cgrp_node_entries;
> +	int				nr_cgrp_nodes;
> +	int				cgrp_node_hash_bits;
> +
> +	struct list_head		cgrp_node_entry;
> +
> +	u64				cgrp_node_count;
> +	u64				cgrp_node_time_enabled;
> +	u64				cgrp_node_time_running;

A comment saying the above values are from the previous reading would be helpful.

> #endif
> 
> #ifdef CONFIG_SECURITY
> @@ -780,6 +792,14 @@ struct perf_event {
> #endif /* CONFIG_PERF_EVENTS */
> };
> 
> +struct perf_cgroup_node {
> +	struct hlist_node		node;
> +	u64				id;
> +	u64				count;
> +	u64				time_enabled;
> +	u64				time_running;
> +	u64				padding[2];

Do we really need the padding? For cache line alignment? 

> +};
> 
> struct perf_event_groups {
> 	struct rb_root	tree;
> @@ -843,6 +863,8 @@ struct perf_event_context {
> 	int				pin_count;
> #ifdef CONFIG_CGROUP_PERF
> 	int				nr_cgroups;	 /* cgroup evts */
> +	struct list_head		cgrp_node_list;
> +	struct list_head		cgrp_ctx_entry;
> #endif
> 	void				*task_ctx_data; /* pmu specific data */
> 	struct rcu_head			rcu_head;
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index ad15e40d7f5d..06bc7ab13616 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -479,6 +479,8 @@ struct perf_event_query_bpf {
> #define PERF_EVENT_IOC_PAUSE_OUTPUT		_IOW('$', 9, __u32)
> #define PERF_EVENT_IOC_QUERY_BPF		_IOWR('$', 10, struct perf_event_query_bpf *)
> #define PERF_EVENT_IOC_MODIFY_ATTRIBUTES	_IOW('$', 11, struct perf_event_attr *)
> +#define PERF_EVENT_IOC_ATTACH_CGROUP		_IOW('$', 12, __u64 *)
> +#define PERF_EVENT_IOC_READ_CGROUP		_IOWR('$', 13, __u64 *)
> 
> enum perf_event_ioc_flags {
> 	PERF_IOC_FLAG_GROUP		= 1U << 0,
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index f07943183041..38c26a23418a 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -379,6 +379,7 @@ enum event_type_t {
>  * perf_cgroup_events: >0 per-cpu cgroup events exist on this cpu
>  */
> 
> +static void perf_sched_enable(void);
> static void perf_sched_delayed(struct work_struct *work);
> DEFINE_STATIC_KEY_FALSE(perf_sched_events);
> static DECLARE_DELAYED_WORK(perf_sched_work, perf_sched_delayed);
> @@ -2124,6 +2125,322 @@ static int perf_get_aux_event(struct perf_event *event,
> 	return 1;
> }
> 
> +#ifdef CONFIG_CGROUP_PERF
> +static DEFINE_PER_CPU(struct list_head, cgroup_ctx_list);
> +
> +static bool event_can_attach_cgroup(struct perf_event *event)
> +{
> +	if (is_sampling_event(event))
> +		return false;
> +	if (event->attach_state & PERF_ATTACH_TASK)
> +		return false;
> +	if (is_cgroup_event(event))
> +		return false;
> +
> +	return true;
> +}
> +
> +static bool event_has_cgroup_node(struct perf_event *event)
> +{
> +	return event->nr_cgrp_nodes > 0;
> +}
> +
> +static struct perf_cgroup_node *
> +find_cgroup_node(struct perf_event *event, u64 cgrp_id)
> +{
> +	struct perf_cgroup_node *cgrp_node;
> +	int key = hash_64(cgrp_id, event->cgrp_node_hash_bits);
> +
> +	hlist_for_each_entry(cgrp_node, &event->cgrp_node_hash[key], node) {
> +		if (cgrp_node->id == cgrp_id)
> +			return cgrp_node;
> +	}
> +
> +	return NULL;
> +}
> +
> +static void perf_update_cgroup_node(struct perf_event *event, struct cgroup *cgrp)
> +{
> +	u64 delta_count, delta_time_enabled, delta_time_running;
> +	int i;
> +
> +	if (event->cgrp_node_count == 0)

Do you mean to use nr_cgrp_nodes above? 

> +		goto out;
> +
> +	delta_count = local64_read(&event->count) - event->cgrp_node_count;
> +	delta_time_enabled = event->total_time_enabled - event->cgrp_node_time_enabled;
> +	delta_time_running = event->total_time_running - event->cgrp_node_time_running;
> +
> +	/* account delta to all ancestor cgroups */
> +	for (i = 0; i <= cgrp->level; i++) {
> +		struct perf_cgroup_node *node;
> +
> +		node = find_cgroup_node(event, cgrp->ancestor_ids[i]);
> +		if (node) {
> +			node->count += delta_count;
> +			node->time_enabled += delta_time_enabled;
> +			node->time_running += delta_time_running;
> +		}
> +	}
> +
> +out:
> +	event->cgrp_node_count = local64_read(&event->count);
> +	event->cgrp_node_time_enabled = event->total_time_enabled;
> +	event->cgrp_node_time_running = event->total_time_running;
> +}
> +
> +static void update_cgroup_node(struct perf_event *event, struct cgroup *cgrp)
> +{
> +	if (event->state == PERF_EVENT_STATE_ACTIVE)
> +		event->pmu->read(event);
> +
> +	perf_event_update_time(event);
> +	perf_update_cgroup_node(event, cgrp);
> +}
> +
> +/* this is called from context switch */
> +static void update_cgroup_node_events(struct perf_event_context *ctx,
> +				      struct cgroup *cgrp)
> +{
> +	struct perf_event *event;
> +
> +	lockdep_assert_held(&ctx->lock);
> +
> +	if (ctx->is_active & EVENT_TIME)
> +		update_context_time(ctx);
> +
> +	list_for_each_entry(event, &ctx->cgrp_node_list, cgrp_node_entry)
> +		update_cgroup_node(event, cgrp);
> +}
> +
> +static void cgroup_node_sched_out(struct task_struct *task)
> +{
> +	struct list_head *cgrp_ctx_list = this_cpu_ptr(&cgroup_ctx_list);
> +	struct perf_cgroup *cgrp = perf_cgroup_from_task(task, NULL);
> +	struct perf_event_context *ctx;
> +
> +	list_for_each_entry(ctx, cgrp_ctx_list, cgrp_ctx_entry) {
> +		raw_spin_lock(&ctx->lock);
> +		update_cgroup_node_events(ctx, cgrp->css.cgroup);
> +		raw_spin_unlock(&ctx->lock);
> +	}
> +}
> +
> +/* this is called from the when event is enabled/disabled */

I don't think we call this when the event is disabled. 

> +static void perf_add_cgrp_node_list(struct perf_event *event,
> +				    struct perf_event_context *ctx)
> +{
> +	struct list_head *cgrp_ctx_list = this_cpu_ptr(&cgroup_ctx_list);
> +	struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx);
> +	bool is_first;
> +
> +	lockdep_assert_irqs_disabled();
> +	lockdep_assert_held(&ctx->lock);
> +
> +	is_first = list_empty(&ctx->cgrp_node_list);
> +	list_add_tail(&event->cgrp_node_entry, &ctx->cgrp_node_list);
> +
> +	if (is_first)
> +		list_add_tail(&ctx->cgrp_ctx_entry, cgrp_ctx_list);
> +
> +	update_cgroup_node(event, cgrp->css.cgroup);

Will this add some readings before PERF_EVENT_IOC_ATTACH_CGROUP to the counters?

> +
> }
> +
> +static void perf_del_cgrp_node_list(struct perf_event *event,
> +				    struct perf_event_context *ctx)
> +{
> +	struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx);
> +
> +	lockdep_assert_irqs_disabled();
> +	lockdep_assert_held(&ctx->lock);
> +
> +	update_cgroup_node(event, cgrp->css.cgroup);
> +	/* to refresh delta when it's enabled */
> +	event->cgrp_node_count = 0;
> +
> +	list_del(&event->cgrp_node_entry);
> +
> +	if (list_empty(&ctx->cgrp_node_list))
> +		list_del(&ctx->cgrp_ctx_entry);
> +}
> +
[...]
> +
> +/* this is called from ioctl */
> +static int perf_event_attach_cgroup_node(struct perf_event *event, u64 nr_cgrps,
> +					 u64 *cgroup_ids)
> +{
> +	struct perf_cgroup_node *cgrp_node;
> +	struct perf_event_context *ctx = event->ctx;
> +	struct hlist_head *cgrp_node_hash;
> +	int node = (event->cpu >= 0) ? cpu_to_node(event->cpu) : -1;
> +	unsigned long flags;
> +	bool is_first = true;
> +	bool enabled;
> +	int i, nr_hash;
> +	int hash_bits;
> +
> +	if (nr_cgrps < MIN_CGRP_NODE_HASH)
> +		nr_hash = MIN_CGRP_NODE_HASH;
> +	else
> +		nr_hash = roundup_pow_of_two(nr_cgrps);
> +	hash_bits = ilog2(nr_hash);
> +
> +	cgrp_node_hash = kcalloc_node(nr_hash, sizeof(*cgrp_node_hash),
> +				      GFP_KERNEL, node);
> +	if (cgrp_node_hash == NULL)
> +		return -ENOMEM;
> +
> +	cgrp_node = kcalloc_node(nr_cgrps, sizeof(*cgrp_node), GFP_KERNEL, node);
> +	if (cgrp_node == NULL) {
> +		kfree(cgrp_node_hash);
> +		return -ENOMEM;
> +	}
> +
> +	for (i = 0; i < (int)nr_cgrps; i++) {
> +		int key = hash_64(cgroup_ids[i], hash_bits);
> +
> +		cgrp_node[i].id = cgroup_ids[i];
> +		hlist_add_head(&cgrp_node[i].node, &cgrp_node_hash[key]);
> +	}
> +
> +	raw_spin_lock_irqsave(&ctx->lock, flags);
> +
> +	enabled = event->state >= PERF_EVENT_STATE_INACTIVE;
> +
> +	if (event->nr_cgrp_nodes != 0) {
> +		kfree(event->cgrp_node_hash);
> +		kfree(event->cgrp_node_entries);
> +		is_first = false;
> +	}

To add another cgroup to the list, we use PERF_EVENT_IOC_ATTACH_CGROUP to
replace the whole list.  So we may lose some readings during this, right?

> +
> +	event->cgrp_node_hash = cgrp_node_hash;
> +	event->cgrp_node_entries = cgrp_node;
> +	event->cgrp_node_hash_bits = hash_bits;
> +	event->nr_cgrp_nodes = nr_cgrps;
> +
> +	raw_spin_unlock_irqrestore(&ctx->lock, flags);
> +
> +	if (is_first && enabled)
> +		event_function_call(event, perf_attach_cgroup_node, NULL);
> +
> +	return 0;
> +}

[...]
> 



* Re: [PATCH 1/2] perf/core: Share an event with multiple cgroups
  2021-03-25  0:55   ` Song Liu
@ 2021-03-25  2:44     ` Namhyung Kim
  2021-03-25 12:57     ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 16+ messages in thread
From: Namhyung Kim @ 2021-03-25  2:44 UTC (permalink / raw)
  To: Song Liu
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Jiri Olsa,
	Mark Rutland, Alexander Shishkin, LKML, Stephane Eranian,
	Andi Kleen, Ian Rogers, Tejun Heo

Hi Song,

Thanks for your review!

On Thu, Mar 25, 2021 at 9:56 AM Song Liu <songliubraving@fb.com> wrote:
> > On Mar 23, 2021, at 9:21 AM, Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > As we can run many jobs (in container) on a big machine, we want to
> > measure each job's performance during the run.  To do that, the
> > perf_event can be associated to a cgroup to measure it only.
> >
> > However such cgroup events need to be opened separately and it causes
> > significant overhead in event multiplexing during the context switch
> > as well as resource consumption like in file descriptors and memory
> > footprint.
> >
> > As a cgroup event is basically a cpu event, we can share a single cpu
> > event for multiple cgroups.  All we need is a separate counter (and
> > two timing variables) for each cgroup.  I added a hash table to map
> > from cgroup id to the attached cgroups.
> >
> > With this change, the cpu event needs to calculate a delta of event
> > counter values when the cgroups of current and the next task are
> > different.  And it attributes the delta to the current task's cgroup.
> >
> > This patch adds two new ioctl commands to perf_event for light-weight
> > cgroup event counting (i.e. perf stat).
> >
> > * PERF_EVENT_IOC_ATTACH_CGROUP - it takes a buffer consists of a
> >     64-bit array to attach given cgroups.  The first element is a
> >     number of cgroups in the buffer, and the rest is a list of cgroup
> >     ids to add a cgroup info to the given event.
> >
> > * PERF_EVENT_IOC_READ_CGROUP - it takes a buffer consists of a 64-bit
> >     array to get the event counter values.  The first element is size
> >     of the array in byte, and the second element is a cgroup id to
> >     read.  The rest is to save the counter value and timings.
> >
> > This attaches all cgroups in a single syscall and I didn't add the
> > DETACH command deliberately to make the implementation simple.  The
> > attached cgroup nodes would be deleted when the file descriptor of the
> > perf_event is closed.
> >
> > Cc: Tejun Heo <tj@kernel.org>
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> > include/linux/perf_event.h      |  22 ++
> > include/uapi/linux/perf_event.h |   2 +
> > kernel/events/core.c            | 474 ++++++++++++++++++++++++++++++--
> > 3 files changed, 471 insertions(+), 27 deletions(-)
> >
> > diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> > index 3f7f89ea5e51..2760f3b07534 100644
> > --- a/include/linux/perf_event.h
> > +++ b/include/linux/perf_event.h
> > @@ -771,6 +771,18 @@ struct perf_event {
> >
> > #ifdef CONFIG_CGROUP_PERF
> >       struct perf_cgroup              *cgrp; /* cgroup event is attach to */
> > +
> > +     /* to share an event for multiple cgroups */
> > +     struct hlist_head               *cgrp_node_hash;
> > +     struct perf_cgroup_node         *cgrp_node_entries;
> > +     int                             nr_cgrp_nodes;
> > +     int                             cgrp_node_hash_bits;
> > +
> > +     struct list_head                cgrp_node_entry;
> > +
> > +     u64                             cgrp_node_count;
> > +     u64                             cgrp_node_time_enabled;
> > +     u64                             cgrp_node_time_running;
>
> A comment saying the above values are from previous reading would be helpful.

Sure, will add.

>
> > #endif
> >
> > #ifdef CONFIG_SECURITY
> > @@ -780,6 +792,14 @@ struct perf_event {
> > #endif /* CONFIG_PERF_EVENTS */
> > };
> >
> > +struct perf_cgroup_node {
> > +     struct hlist_node               node;
> > +     u64                             id;
> > +     u64                             count;
> > +     u64                             time_enabled;
> > +     u64                             time_running;
> > +     u64                             padding[2];
>
> Do we really need the padding? For cache line alignment?

Yeah I was thinking about it.  It seems I need to use the
___cacheline_aligned macro instead.

>
> > +};
> >
> > struct perf_event_groups {
> >       struct rb_root  tree;
> > @@ -843,6 +863,8 @@ struct perf_event_context {
> >       int                             pin_count;
> > #ifdef CONFIG_CGROUP_PERF
> >       int                             nr_cgroups;      /* cgroup evts */
> > +     struct list_head                cgrp_node_list;
> > +     struct list_head                cgrp_ctx_entry;
> > #endif
> >       void                            *task_ctx_data; /* pmu specific data */
> >       struct rcu_head                 rcu_head;
> > diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> > index ad15e40d7f5d..06bc7ab13616 100644
> > --- a/include/uapi/linux/perf_event.h
> > +++ b/include/uapi/linux/perf_event.h
> > @@ -479,6 +479,8 @@ struct perf_event_query_bpf {
> > #define PERF_EVENT_IOC_PAUSE_OUTPUT           _IOW('$', 9, __u32)
> > #define PERF_EVENT_IOC_QUERY_BPF              _IOWR('$', 10, struct perf_event_query_bpf *)
> > #define PERF_EVENT_IOC_MODIFY_ATTRIBUTES      _IOW('$', 11, struct perf_event_attr *)
> > +#define PERF_EVENT_IOC_ATTACH_CGROUP         _IOW('$', 12, __u64 *)
> > +#define PERF_EVENT_IOC_READ_CGROUP           _IOWR('$', 13, __u64 *)
> >
> > enum perf_event_ioc_flags {
> >       PERF_IOC_FLAG_GROUP             = 1U << 0,
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index f07943183041..38c26a23418a 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -379,6 +379,7 @@ enum event_type_t {
> >  * perf_cgroup_events: >0 per-cpu cgroup events exist on this cpu
> >  */
> >
> > +static void perf_sched_enable(void);
> > static void perf_sched_delayed(struct work_struct *work);
> > DEFINE_STATIC_KEY_FALSE(perf_sched_events);
> > static DECLARE_DELAYED_WORK(perf_sched_work, perf_sched_delayed);
> > @@ -2124,6 +2125,322 @@ static int perf_get_aux_event(struct perf_event *event,
> >       return 1;
> > }
> >
> > +#ifdef CONFIG_CGROUP_PERF
> > +static DEFINE_PER_CPU(struct list_head, cgroup_ctx_list);
> > +
> > +static bool event_can_attach_cgroup(struct perf_event *event)
> > +{
> > +     if (is_sampling_event(event))
> > +             return false;
> > +     if (event->attach_state & PERF_ATTACH_TASK)
> > +             return false;
> > +     if (is_cgroup_event(event))
> > +             return false;
> > +
> > +     return true;
> > +}
> > +
> > +static bool event_has_cgroup_node(struct perf_event *event)
> > +{
> > +     return event->nr_cgrp_nodes > 0;
> > +}
> > +
> > +static struct perf_cgroup_node *
> > +find_cgroup_node(struct perf_event *event, u64 cgrp_id)
> > +{
> > +     struct perf_cgroup_node *cgrp_node;
> > +     int key = hash_64(cgrp_id, event->cgrp_node_hash_bits);
> > +
> > +     hlist_for_each_entry(cgrp_node, &event->cgrp_node_hash[key], node) {
> > +             if (cgrp_node->id == cgrp_id)
> > +                     return cgrp_node;
> > +     }
> > +
> > +     return NULL;
> > +}
> > +
> > +static void perf_update_cgroup_node(struct perf_event *event, struct cgroup *cgrp)
> > +{
> > +     u64 delta_count, delta_time_enabled, delta_time_running;
> > +     int i;
> > +
> > +     if (event->cgrp_node_count == 0)
>
> Do you mean to use nr_cgrp_nodes above?

No, this is to calculate the delta, so it needs to be set first.
If it's the first call, it just updates the count and time and
skips the delta accounting.

>
> > +             goto out;
> > +
> > +     delta_count = local64_read(&event->count) - event->cgrp_node_count;
> > +     delta_time_enabled = event->total_time_enabled - event->cgrp_node_time_enabled;
> > +     delta_time_running = event->total_time_running - event->cgrp_node_time_running;
> > +
> > +     /* account delta to all ancestor cgroups */
> > +     for (i = 0; i <= cgrp->level; i++) {
> > +             struct perf_cgroup_node *node;
> > +
> > +             node = find_cgroup_node(event, cgrp->ancestor_ids[i]);
> > +             if (node) {
> > +                     node->count += delta_count;
> > +                     node->time_enabled += delta_time_enabled;
> > +                     node->time_running += delta_time_running;
> > +             }
> > +     }
> > +
> > +out:
> > +     event->cgrp_node_count = local64_read(&event->count);
> > +     event->cgrp_node_time_enabled = event->total_time_enabled;
> > +     event->cgrp_node_time_running = event->total_time_running;
> > +}
> > +
> > +static void update_cgroup_node(struct perf_event *event, struct cgroup *cgrp)
> > +{
> > +     if (event->state == PERF_EVENT_STATE_ACTIVE)
> > +             event->pmu->read(event);
> > +
> > +     perf_event_update_time(event);
> > +     perf_update_cgroup_node(event, cgrp);
> > +}
> > +
> > +/* this is called from context switch */
> > +static void update_cgroup_node_events(struct perf_event_context *ctx,
> > +                                   struct cgroup *cgrp)
> > +{
> > +     struct perf_event *event;
> > +
> > +     lockdep_assert_held(&ctx->lock);
> > +
> > +     if (ctx->is_active & EVENT_TIME)
> > +             update_context_time(ctx);
> > +
> > +     list_for_each_entry(event, &ctx->cgrp_node_list, cgrp_node_entry)
> > +             update_cgroup_node(event, cgrp);
> > +}
> > +
> > +static void cgroup_node_sched_out(struct task_struct *task)
> > +{
> > +     struct list_head *cgrp_ctx_list = this_cpu_ptr(&cgroup_ctx_list);
> > +     struct perf_cgroup *cgrp = perf_cgroup_from_task(task, NULL);
> > +     struct perf_event_context *ctx;
> > +
> > +     list_for_each_entry(ctx, cgrp_ctx_list, cgrp_ctx_entry) {
> > +             raw_spin_lock(&ctx->lock);
> > +             update_cgroup_node_events(ctx, cgrp->css.cgroup);
> > +             raw_spin_unlock(&ctx->lock);
> > +     }
> > +}
> > +
> > +/* this is called from the when event is enabled/disabled */
>
> I don't think we call this when the event is disabled.

Oh, sorry.  I meant 'add' for enable and 'del' for disable.
Maybe I can change it to 'these are called from ...'.

>
> > +static void perf_add_cgrp_node_list(struct perf_event *event,
> > +                                 struct perf_event_context *ctx)
> > +{
> > +     struct list_head *cgrp_ctx_list = this_cpu_ptr(&cgroup_ctx_list);
> > +     struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx);
> > +     bool is_first;
> > +
> > +     lockdep_assert_irqs_disabled();
> > +     lockdep_assert_held(&ctx->lock);
> > +
> > +     is_first = list_empty(&ctx->cgrp_node_list);
> > +     list_add_tail(&event->cgrp_node_entry, &ctx->cgrp_node_list);
> > +
> > +     if (is_first)
> > +             list_add_tail(&ctx->cgrp_ctx_entry, cgrp_ctx_list);
> > +
> > +     update_cgroup_node(event, cgrp->css.cgroup);
>
> Will this add some readings before PERF_EVENT_IOC_ATTACH_CGROUP to the counters?

At this moment, the event is just enabled so the cgrp_node_count
is 0 like I said above.  So it'll update the timestamp and count in
the event but won't update the cgroup nodes.

>
> > +
> > }
> > +
> > +static void perf_del_cgrp_node_list(struct perf_event *event,
> > +                                 struct perf_event_context *ctx)
> > +{
> > +     struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx);
> > +
> > +     lockdep_assert_irqs_disabled();
> > +     lockdep_assert_held(&ctx->lock);
> > +
> > +     update_cgroup_node(event, cgrp->css.cgroup);
> > +     /* to refresh delta when it's enabled */
> > +     event->cgrp_node_count = 0;
> > +
> > +     list_del(&event->cgrp_node_entry);
> > +
> > +     if (list_empty(&ctx->cgrp_node_list))
> > +             list_del(&ctx->cgrp_ctx_entry);
> > +}
> > +
> [...]
> > +
> > +/* this is called from ioctl */
> > +static int perf_event_attach_cgroup_node(struct perf_event *event, u64 nr_cgrps,
> > +                                      u64 *cgroup_ids)
> > +{
> > +     struct perf_cgroup_node *cgrp_node;
> > +     struct perf_event_context *ctx = event->ctx;
> > +     struct hlist_head *cgrp_node_hash;
> > +     int node = (event->cpu >= 0) ? cpu_to_node(event->cpu) : -1;
> > +     unsigned long flags;
> > +     bool is_first = true;
> > +     bool enabled;
> > +     int i, nr_hash;
> > +     int hash_bits;
> > +
> > +     if (nr_cgrps < MIN_CGRP_NODE_HASH)
> > +             nr_hash = MIN_CGRP_NODE_HASH;
> > +     else
> > +             nr_hash = roundup_pow_of_two(nr_cgrps);
> > +     hash_bits = ilog2(nr_hash);
> > +
> > +     cgrp_node_hash = kcalloc_node(nr_hash, sizeof(*cgrp_node_hash),
> > +                                   GFP_KERNEL, node);
> > +     if (cgrp_node_hash == NULL)
> > +             return -ENOMEM;
> > +
> > +     cgrp_node = kcalloc_node(nr_cgrps, sizeof(*cgrp_node), GFP_KERNEL, node);
> > +     if (cgrp_node == NULL) {
> > +             kfree(cgrp_node_hash);
> > +             return -ENOMEM;
> > +     }
> > +
> > +     for (i = 0; i < (int)nr_cgrps; i++) {
> > +             int key = hash_64(cgroup_ids[i], hash_bits);
> > +
> > +             cgrp_node[i].id = cgroup_ids[i];
> > +             hlist_add_head(&cgrp_node[i].node, &cgrp_node_hash[key]);
> > +     }
> > +
> > +     raw_spin_lock_irqsave(&ctx->lock, flags);
> > +
> > +     enabled = event->state >= PERF_EVENT_STATE_INACTIVE;
> > +
> > +     if (event->nr_cgrp_nodes != 0) {
> > +             kfree(event->cgrp_node_hash);
> > +             kfree(event->cgrp_node_entries);
> > +             is_first = false;
> > +     }
>
> To add another cgroup to the list, we use PERF_EVENT_IOC_ATTACH_CGROUP to
> redo the whole list.  So we may lose some readings during this, right?

So the basic use case is perf stat, which has the list of all cgroups
to measure when it calls the ioctl.  Then it creates all the nodes in
the table at once and sets it up.

If someone wants to measure more cgroups, [s]he can call the ioctl
again with the updated cgroup list (original + new).
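
Just for illustration, here is a minimal userspace sketch of the
attach side, assuming the ioctl number from this patch (the
attach_cgroups() helper is a made-up name, and how the 64-bit cgroup
ids are obtained is left to the tool):

#include <linux/perf_event.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>

/* from this patch; not in the released uapi header yet */
#ifndef PERF_EVENT_IOC_ATTACH_CGROUP
#define PERF_EVENT_IOC_ATTACH_CGROUP	_IOW('$', 12, __u64 *)
#endif

/*
 * Attach a list of cgroup ids to an already opened cpu event
 * (perf_event_open() with pid == -1, cpu >= 0).
 * Buffer layout, per the patch description: buf[0] is the number of
 * cgroup ids that follow, buf[1..n] are the 64-bit cgroup ids.
 */
static int attach_cgroups(int event_fd, const __u64 *cgrp_ids, __u64 nr)
{
	__u64 *buf;
	int ret;

	buf = calloc(nr + 1, sizeof(*buf));
	if (buf == NULL)
		return -1;

	buf[0] = nr;
	memcpy(&buf[1], cgrp_ids, nr * sizeof(*cgrp_ids));

	ret = ioctl(event_fd, PERF_EVENT_IOC_ATTACH_CGROUP, buf);
	free(buf);
	return ret;
}

Calling it again later with the original + new ids just rebuilds the
hash table as described above.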

Thanks,
Namhyung

>
> > +
> > +     event->cgrp_node_hash = cgrp_node_hash;
> > +     event->cgrp_node_entries = cgrp_node;
> > +     event->cgrp_node_hash_bits = hash_bits;
> > +     event->nr_cgrp_nodes = nr_cgrps;
> > +
> > +     raw_spin_unlock_irqrestore(&ctx->lock, flags);
> > +
> > +     if (is_first && enabled)
> > +             event_function_call(event, perf_attach_cgroup_node, NULL);
> > +
> > +     return 0;
> > +}
>
> [...]
> >
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] perf/core: Share an event with multiple cgroups
  2021-03-25  0:55   ` Song Liu
  2021-03-25  2:44     ` Namhyung Kim
@ 2021-03-25 12:57     ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 16+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-03-25 12:57 UTC (permalink / raw)
  To: Song Liu
  Cc: Namhyung Kim, Peter Zijlstra, Ingo Molnar, Jiri Olsa,
	Mark Rutland, Alexander Shishkin, LKML, Stephane Eranian,
	Andi Kleen, Ian Rogers, Tejun Heo

On Thu, Mar 25, 2021 at 12:55:50AM +0000, Song Liu wrote:
> > On Mar 23, 2021, at 9:21 AM, Namhyung Kim <namhyung@kernel.org> wrote:
> > #ifdef CONFIG_SECURITY
> > @@ -780,6 +792,14 @@ struct perf_event {
> > #endif /* CONFIG_PERF_EVENTS */
> > };

> > +struct perf_cgroup_node {
> > +	struct hlist_node		node;
> > +	u64				id;
> > +	u64				count;
> > +	u64				time_enabled;
> > +	u64				time_running;
> > +	u64				padding[2];
> 
> Do we really need the padding? For cache line alignment? 

I guess so, to get it to 64 bytes, then having it as:

struct perf_cgroup_node {
	struct hlist_node		node;
	u64				id;
	u64				count;
	u64				time_enabled;
	u64				time_running;
} ____cacheline_aligned;

Seems better :-)

Testing:

[acme@five c]$ cat cacheline_aligned.c
#ifndef ____cacheline_aligned
#define ____cacheline_aligned __attribute__((__aligned__(SMP_CACHE_BYTES)))
#endif

// from ../build/v5.12.0-rc4+/include/generated/autoconf.h
#define CONFIG_X86_L1_CACHE_SHIFT 6

#define L1_CACHE_SHIFT  (CONFIG_X86_L1_CACHE_SHIFT)
#define L1_CACHE_BYTES  (1 << L1_CACHE_SHIFT)

#ifndef SMP_CACHE_BYTES
#define SMP_CACHE_BYTES L1_CACHE_BYTES
#endif

typedef long long unsigned int u64;

struct hlist_node {
	struct hlist_node *        next;                 /*     0     8 */
	struct hlist_node * *      pprev;                /*     8     8 */

	/* size: 16, cachelines: 1, members: 2 */
	/* last cacheline: 16 bytes */
};

struct perf_cgroup_node {
        struct hlist_node               node;
        u64                             id;
        u64                             count;
        u64                             time_enabled;
        u64                             time_running;
} ____cacheline_aligned foo;

[acme@five c]$ cc  -g  -c -o cacheline_aligned.o cacheline_aligned.c
[acme@five c]$ pahole cacheline_aligned.o
struct hlist_node {
	struct hlist_node *        next;                 /*     0     8 */
	struct hlist_node * *      pprev;                /*     8     8 */

	/* size: 16, cachelines: 1, members: 2 */
	/* last cacheline: 16 bytes */
};
struct perf_cgroup_node {
	struct hlist_node          node;                 /*     0    16 */
	u64                        id;                   /*    16     8 */
	u64                        count;                /*    24     8 */
	u64                        time_enabled;         /*    32     8 */
	u64                        time_running;         /*    40     8 */

	/* size: 64, cachelines: 1, members: 5 */
	/* padding: 16 */
} __attribute__((__aligned__(64)));
[acme@five c]$

- Arnaldo

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] perf/core: Share an event with multiple cgroups
  2021-03-23 16:21 ` [PATCH 1/2] perf/core: Share an event " Namhyung Kim
  2021-03-24  0:30   ` Song Liu
  2021-03-25  0:55   ` Song Liu
@ 2021-03-28 17:17   ` Song Liu
  2021-03-29 11:33     ` Namhyung Kim
  2 siblings, 1 reply; 16+ messages in thread
From: Song Liu @ 2021-03-28 17:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Jiri Olsa,
	Mark Rutland, Alexander Shishkin, LKML, Stephane Eranian,
	Andi Kleen, Ian Rogers, Tejun Heo



> On Mar 23, 2021, at 9:21 AM, Namhyung Kim <namhyung@kernel.org> wrote:
> 
> As we can run many jobs (in container) on a big machine, we want to
> measure each job's performance during the run.  To do that, the
> perf_event can be associated to a cgroup to measure it only.
> 
> However such cgroup events need to be opened separately and it causes
> significant overhead in event multiplexing during the context switch
> as well as resource consumption like in file descriptors and memory
> footprint.
> 
> As a cgroup event is basically a cpu event, we can share a single cpu
> event for multiple cgroups.  All we need is a separate counter (and
> two timing variables) for each cgroup.  I added a hash table to map
> from cgroup id to the attached cgroups.
> 
> With this change, the cpu event needs to calculate a delta of event
> counter values when the cgroups of current and the next task are
> different.  And it attributes the delta to the current task's cgroup.
> 
> This patch adds two new ioctl commands to perf_event for light-weight
> cgroup event counting (i.e. perf stat).
> 
> * PERF_EVENT_IOC_ATTACH_CGROUP - it takes a buffer consisting of a
>     64-bit array to attach the given cgroups.  The first element is
>     the number of cgroups in the buffer, and the rest is a list of
>     cgroup ids to add cgroup info to the given event.
> 
> * PERF_EVENT_IOC_READ_CGROUP - it takes a buffer consisting of a
>     64-bit array to get the event counter values.  The first element
>     is the size of the array in bytes, and the second element is the
>     cgroup id to read.  The rest is used to save the counter value
>     and timings.
> 
> This attaches all cgroups in a single syscall, and I deliberately
> didn't add a DETACH command to keep the implementation simple.  The
> attached cgroup nodes are deleted when the file descriptor of the
> perf_event is closed.
> 
> Cc: Tejun Heo <tj@kernel.org>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
> include/linux/perf_event.h      |  22 ++
> include/uapi/linux/perf_event.h |   2 +
> kernel/events/core.c            | 474 ++++++++++++++++++++++++++++++--
> 3 files changed, 471 insertions(+), 27 deletions(-)

[...]

> @@ -4461,6 +4787,8 @@ static void __perf_event_init_context(struct perf_event_context *ctx)
> 	INIT_LIST_HEAD(&ctx->event_list);
> 	INIT_LIST_HEAD(&ctx->pinned_active);
> 	INIT_LIST_HEAD(&ctx->flexible_active);
> +	INIT_LIST_HEAD(&ctx->cgrp_ctx_entry);
> +	INIT_LIST_HEAD(&ctx->cgrp_node_list);

I guess we need ifdef CONFIG_CGROUP_PERF here?

> 	refcount_set(&ctx->refcount, 1);
> }
> 
> @@ -4851,6 +5179,8 @@ static void _free_event(struct perf_event *event)
> 	if (is_cgroup_event(event))
> 		perf_detach_cgroup(event);
> 
> +	perf_event_destroy_cgroup_nodes(event);
> +
> 	if (!event->parent) {
> 		if (event->attr.sample_type & PERF_SAMPLE_CALLCHAIN)
> 			put_callchain_buffers();

[...]

> +static void perf_sched_enable(void)
> +{
> +	/*
> +	 * We need the mutex here because static_branch_enable()
> +	 * must complete *before* the perf_sched_count increment
> +	 * becomes visible.
> +	 */
> +	if (atomic_inc_not_zero(&perf_sched_count))
> +		return;

Why don't we use perf_cgroup_events for the new use case? 

> +
> +	mutex_lock(&perf_sched_mutex);
> +	if (!atomic_read(&perf_sched_count)) {
> +		static_branch_enable(&perf_sched_events);
> +		/*
> +		 * Guarantee that all CPUs observe they key change and
> +		 * call the perf scheduling hooks before proceeding to
> +		 * install events that need them.
> +		 */
> +		synchronize_rcu();
> +	}
> +	/*
> +	 * Now that we have waited for the sync_sched(), allow further
> +	 * increments to by-pass the mutex.
> +	 */
> +	atomic_inc(&perf_sched_count);
> +	mutex_unlock(&perf_sched_mutex);
> +}
> +
> static long perf_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
> {
> 	struct perf_event *event = file->private_data;
> 	struct perf_event_context *ctx;
> +	bool do_sched_enable = false;
> 	long ret;
> 
> 	/* Treat ioctl like writes as it is likely a mutating operation. */
> @@ -5595,9 +6006,19 @@ static long perf_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
> 		return ret;
> 
> 	ctx = perf_event_ctx_lock(event);
> +	/* ATTACH_CGROUP requires context switch callback */
> +	if (cmd == PERF_EVENT_IOC_ATTACH_CGROUP && !event_has_cgroup_node(event))
> +		do_sched_enable = true;
> 	ret = _perf_ioctl(event, cmd, arg);
> 	perf_event_ctx_unlock(event, ctx);
> 
> +	/*
> +	 * Due to the circular lock dependency, it cannot call
> +	 * static_branch_enable() under the ctx->mutex.
> +	 */
> +	if (do_sched_enable && ret >= 0)
> +		perf_sched_enable();
> +
> 	return ret;
> }
> 
> @@ -11240,33 +11661,8 @@ static void account_event(struct perf_event *event)
> 	if (event->attr.text_poke)
> 		atomic_inc(&nr_text_poke_events);
> 
> -	if (inc) {
> -		/*
> -		 * We need the mutex here because static_branch_enable()
> -		 * must complete *before* the perf_sched_count increment
> -		 * becomes visible.
> -		 */
> -		if (atomic_inc_not_zero(&perf_sched_count))
> -			goto enabled;
> -
> -		mutex_lock(&perf_sched_mutex);
> -		if (!atomic_read(&perf_sched_count)) {
> -			static_branch_enable(&perf_sched_events);
> -			/*
> -			 * Guarantee that all CPUs observe they key change and
> -			 * call the perf scheduling hooks before proceeding to
> -			 * install events that need them.
> -			 */
> -			synchronize_rcu();
> -		}
> -		/*
> -		 * Now that we have waited for the sync_sched(), allow further
> -		 * increments to by-pass the mutex.
> -		 */
> -		atomic_inc(&perf_sched_count);
> -		mutex_unlock(&perf_sched_mutex);
> -	}
> -enabled:
> +	if (inc)
> +		perf_sched_enable();
> 
> 	account_event_cpu(event, event->cpu);
> 
> @@ -13008,6 +13404,7 @@ static void __init perf_event_init_all_cpus(void)
> 
> #ifdef CONFIG_CGROUP_PERF
> 		INIT_LIST_HEAD(&per_cpu(cgrp_cpuctx_list, cpu));
> +		INIT_LIST_HEAD(&per_cpu(cgroup_ctx_list, cpu));
> #endif
> 		INIT_LIST_HEAD(&per_cpu(sched_cb_list, cpu));
> 	}
> @@ -13218,6 +13615,28 @@ static int perf_cgroup_css_online(struct cgroup_subsys_state *css)
> 	return 0;
> }
> 
> +static int __perf_cgroup_update_node(void *info)
> +{
> +	struct task_struct *task = info;
> +
> +	rcu_read_lock();
> +	cgroup_node_sched_out(task);
> +	rcu_read_unlock();
> +
> +	return 0;
> +}
> +
> +static int perf_cgroup_can_attach(struct cgroup_taskset *tset)
> +{
> +	struct task_struct *task;
> +	struct cgroup_subsys_state *css;
> +
> +	cgroup_taskset_for_each(task, css, tset)
> +		task_function_call(task, __perf_cgroup_update_node, task);
> +
> +	return 0;
> +}

Could you please explain why we need this logic in can_attach? 

Thanks,
Song

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] perf/core: Support reading group events with shared cgroups
  2021-03-23 16:21 ` [PATCH 2/2] perf/core: Support reading group events with shared cgroups Namhyung Kim
@ 2021-03-28 17:31   ` Song Liu
  2021-03-29 11:36     ` Namhyung Kim
  0 siblings, 1 reply; 16+ messages in thread
From: Song Liu @ 2021-03-28 17:31 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Jiri Olsa,
	Mark Rutland, Alexander Shishkin, LKML, Stephane Eranian,
	Andi Kleen, Ian Rogers



> On Mar 23, 2021, at 9:21 AM, Namhyung Kim <namhyung@kernel.org> wrote:
> 
> This enables reading event group's counter values together with a
> PERF_EVENT_IOC_READ_CGROUP command like we do in the regular read().
> Users should give a correct size of buffer to be read.
> 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
> kernel/events/core.c | 119 +++++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 116 insertions(+), 3 deletions(-)
> 

[...]

> +}
> +
> +static int perf_event_read_cgrp_node_group(struct perf_event *event, u64 cgrp_id,
> +					   char __user *buf)
> +{
> +	struct perf_cgroup_node *cgrp;
> +	struct perf_event_context *ctx = event->ctx;
> +	struct perf_event *sibling;
> +	u64 read_format = event->attr.read_format;
> +	unsigned long flags;
> +	u64 *values;
> +	int n = 1;
> +	int ret;
> +
> +	values = kzalloc(event->read_size, GFP_KERNEL);
> +	if (!values)
> +		return -ENOMEM;
> +
> +	values[0] = 1 + event->nr_siblings;
> +
> +	/* update event count and times (possibly run on other cpu) */
> +	(void)perf_event_read(event, true);
> +
> +	raw_spin_lock_irqsave(&ctx->lock, flags);
> +
> +	cgrp = find_cgroup_node(event, cgrp_id);
> +	if (cgrp == NULL) {
> +		raw_spin_unlock_irqrestore(&ctx->lock, flags);
> +		kfree(values);
> +		return -ENOENT;
> +	}
> +
> +	if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
> +		values[n++] = cgrp->time_enabled;
> +	if (read_format & PERF_FORMAT_TOTAL_TIME_RUNNING)
> +		values[n++] = cgrp->time_running;
> +
> +	values[n++] = cgrp->count;
> +	if (read_format & PERF_FORMAT_ID)
> +		values[n++] = primary_event_id(event);
> +
> +	for_each_sibling_event(sibling, event) {
> +		n += perf_event_read_cgrp_node_sibling(sibling, read_format,
> +						       cgrp_id, &values[n]);
> +	}
> +
> +	raw_spin_unlock_irqrestore(&ctx->lock, flags);
> +
> +	ret = copy_to_user(buf, values, n * sizeof(u64));
> +	kfree(values);
> +	if (ret)
> +		return -EFAULT;
> +
> +	return n * sizeof(u64);
> +}
> +
> +static int perf_event_read_cgroup_node(struct perf_event *event, u64 read_size,
> +				       u64 cgrp_id, char __user *buf)
> +{
> +	u64 read_format = event->attr.read_format;
> +
> +	if (read_size < event->read_size + 2 * sizeof(u64))

Why do we need read_size + 2 u64 here? 

Thanks,
Song

[...]


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] perf/core: Share an event with multiple cgroups
  2021-03-28 17:17   ` Song Liu
@ 2021-03-29 11:33     ` Namhyung Kim
  2021-03-30  6:33       ` Song Liu
  0 siblings, 1 reply; 16+ messages in thread
From: Namhyung Kim @ 2021-03-29 11:33 UTC (permalink / raw)
  To: Song Liu
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Jiri Olsa,
	Mark Rutland, Alexander Shishkin, LKML, Stephane Eranian,
	Andi Kleen, Ian Rogers, Tejun Heo

On Mon, Mar 29, 2021 at 2:17 AM Song Liu <songliubraving@fb.com> wrote:
> > On Mar 23, 2021, at 9:21 AM, Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > As we can run many jobs (in container) on a big machine, we want to
> > measure each job's performance during the run.  To do that, the
> > perf_event can be associated to a cgroup to measure it only.
> >
> > However such cgroup events need to be opened separately and it causes
> > significant overhead in event multiplexing during the context switch
> > as well as resource consumption like in file descriptors and memory
> > footprint.
> >
> > As a cgroup event is basically a cpu event, we can share a single cpu
> > event for multiple cgroups.  All we need is a separate counter (and
> > two timing variables) for each cgroup.  I added a hash table to map
> > from cgroup id to the attached cgroups.
> >
> > With this change, the cpu event needs to calculate a delta of event
> > counter values when the cgroups of current and the next task are
> > different.  And it attributes the delta to the current task's cgroup.
> >
> > This patch adds two new ioctl commands to perf_event for light-weight
> > cgroup event counting (i.e. perf stat).
> >
> > * PERF_EVENT_IOC_ATTACH_CGROUP - it takes a buffer consisting of a
> >     64-bit array to attach the given cgroups.  The first element is
> >     the number of cgroups in the buffer, and the rest is a list of
> >     cgroup ids to add cgroup info to the given event.
> >
> > * PERF_EVENT_IOC_READ_CGROUP - it takes a buffer consisting of a
> >     64-bit array to get the event counter values.  The first element
> >     is the size of the array in bytes, and the second element is the
> >     cgroup id to read.  The rest is used to save the counter value
> >     and timings.
> >
> > This attaches all cgroups in a single syscall, and I deliberately
> > didn't add a DETACH command to keep the implementation simple.  The
> > attached cgroup nodes are deleted when the file descriptor of the
> > perf_event is closed.
> >
> > Cc: Tejun Heo <tj@kernel.org>
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> > include/linux/perf_event.h      |  22 ++
> > include/uapi/linux/perf_event.h |   2 +
> > kernel/events/core.c            | 474 ++++++++++++++++++++++++++++++--
> > 3 files changed, 471 insertions(+), 27 deletions(-)
>
> [...]
>
> > @@ -4461,6 +4787,8 @@ static void __perf_event_init_context(struct perf_event_context *ctx)
> >       INIT_LIST_HEAD(&ctx->event_list);
> >       INIT_LIST_HEAD(&ctx->pinned_active);
> >       INIT_LIST_HEAD(&ctx->flexible_active);
> > +     INIT_LIST_HEAD(&ctx->cgrp_ctx_entry);
> > +     INIT_LIST_HEAD(&ctx->cgrp_node_list);
>
> I guess we need ifdef CONFIG_CGROUP_PERF here?

Correct.  Thanks for pointing that out.

>
> >       refcount_set(&ctx->refcount, 1);
> > }
> >
> > @@ -4851,6 +5179,8 @@ static void _free_event(struct perf_event *event)
> >       if (is_cgroup_event(event))
> >               perf_detach_cgroup(event);
> >
> > +     perf_event_destroy_cgroup_nodes(event);
> > +
> >       if (!event->parent) {
> >               if (event->attr.sample_type & PERF_SAMPLE_CALLCHAIN)
> >                       put_callchain_buffers();
>
> [...]
>
> > +static void perf_sched_enable(void)
> > +{
> > +     /*
> > +      * We need the mutex here because static_branch_enable()
> > +      * must complete *before* the perf_sched_count increment
> > +      * becomes visible.
> > +      */
> > +     if (atomic_inc_not_zero(&perf_sched_count))
> > +             return;
>
> Why don't we use perf_cgroup_events for the new use case?

Maybe.  The two methods are mutually exclusive, and I think this one
will be preferred in the future due to its lower overhead.  I'd also
like to keep it separate from the existing code to avoid possible
confusion.

For perf_sched_enable(), the difference between the existing cgroup
events and this approach is when the function above gets called.
Usually it is called during account_event(), which is part of event
initialization.  But this approach calls it after the event has been
created.  That's why I have the do_sched_enable variable in
perf_ioctl() below, to ensure it's called exactly once for each event.


>
> > +
> > +     mutex_lock(&perf_sched_mutex);
> > +     if (!atomic_read(&perf_sched_count)) {
> > +             static_branch_enable(&perf_sched_events);
> > +             /*
> > +              * Guarantee that all CPUs observe they key change and
> > +              * call the perf scheduling hooks before proceeding to
> > +              * install events that need them.
> > +              */
> > +             synchronize_rcu();
> > +     }
> > +     /*
> > +      * Now that we have waited for the sync_sched(), allow further
> > +      * increments to by-pass the mutex.
> > +      */
> > +     atomic_inc(&perf_sched_count);
> > +     mutex_unlock(&perf_sched_mutex);
> > +}
> > +
> > static long perf_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
> > {
> >       struct perf_event *event = file->private_data;
> >       struct perf_event_context *ctx;
> > +     bool do_sched_enable = false;
> >       long ret;
> >
> >       /* Treat ioctl like writes as it is likely a mutating operation. */
> > @@ -5595,9 +6006,19 @@ static long perf_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
> >               return ret;
> >
> >       ctx = perf_event_ctx_lock(event);
> > +     /* ATTACH_CGROUP requires context switch callback */
> > +     if (cmd == PERF_EVENT_IOC_ATTACH_CGROUP && !event_has_cgroup_node(event))
> > +             do_sched_enable = true;
> >       ret = _perf_ioctl(event, cmd, arg);
> >       perf_event_ctx_unlock(event, ctx);
> >
> > +     /*
> > +      * Due to the circular lock dependency, it cannot call
> > +      * static_branch_enable() under the ctx->mutex.
> > +      */
> > +     if (do_sched_enable && ret >= 0)
> > +             perf_sched_enable();
> > +
> >       return ret;
> > }
> >
> > @@ -11240,33 +11661,8 @@ static void account_event(struct perf_event *event)
> >       if (event->attr.text_poke)
> >               atomic_inc(&nr_text_poke_events);
> >
> > -     if (inc) {
> > -             /*
> > -              * We need the mutex here because static_branch_enable()
> > -              * must complete *before* the perf_sched_count increment
> > -              * becomes visible.
> > -              */
> > -             if (atomic_inc_not_zero(&perf_sched_count))
> > -                     goto enabled;
> > -
> > -             mutex_lock(&perf_sched_mutex);
> > -             if (!atomic_read(&perf_sched_count)) {
> > -                     static_branch_enable(&perf_sched_events);
> > -                     /*
> > -                      * Guarantee that all CPUs observe they key change and
> > -                      * call the perf scheduling hooks before proceeding to
> > -                      * install events that need them.
> > -                      */
> > -                     synchronize_rcu();
> > -             }
> > -             /*
> > -              * Now that we have waited for the sync_sched(), allow further
> > -              * increments to by-pass the mutex.
> > -              */
> > -             atomic_inc(&perf_sched_count);
> > -             mutex_unlock(&perf_sched_mutex);
> > -     }
> > -enabled:
> > +     if (inc)
> > +             perf_sched_enable();
> >
> >       account_event_cpu(event, event->cpu);
> >
> > @@ -13008,6 +13404,7 @@ static void __init perf_event_init_all_cpus(void)
> >
> > #ifdef CONFIG_CGROUP_PERF
> >               INIT_LIST_HEAD(&per_cpu(cgrp_cpuctx_list, cpu));
> > +             INIT_LIST_HEAD(&per_cpu(cgroup_ctx_list, cpu));
> > #endif
> >               INIT_LIST_HEAD(&per_cpu(sched_cb_list, cpu));
> >       }
> > @@ -13218,6 +13615,28 @@ static int perf_cgroup_css_online(struct cgroup_subsys_state *css)
> >       return 0;
> > }
> >
> > +static int __perf_cgroup_update_node(void *info)
> > +{
> > +     struct task_struct *task = info;
> > +
> > +     rcu_read_lock();
> > +     cgroup_node_sched_out(task);
> > +     rcu_read_unlock();
> > +
> > +     return 0;
> > +}
> > +
> > +static int perf_cgroup_can_attach(struct cgroup_taskset *tset)
> > +{
> > +     struct task_struct *task;
> > +     struct cgroup_subsys_state *css;
> > +
> > +     cgroup_taskset_for_each(task, css, tset)
> > +             task_function_call(task, __perf_cgroup_update_node, task);
> > +
> > +     return 0;
> > +}
>
> Could you please explain why we need this logic in can_attach?

IIUC the ss->attach() is called after a task's cgroup membership
is changed.  But we want to collect the performance numbers for
the old cgroup just before the change.  As the logic merely checks
the current task's cgroup, it should be done in the can_attach()
which is called before the cgroup change.
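
(The hunk that registers the callback isn't quoted in this thread, but
presumably it just adds ->can_attach next to the existing perf_event
cgroup callbacks, roughly like the sketch below; everything except the
.can_attach line is from the current kernel, not from this patch.)

/* kernel/events/core.c -- sketch of the registration, not the actual hunk */
struct cgroup_subsys perf_event_cgrp_subsys = {
	.css_alloc	= perf_cgroup_css_alloc,
	.css_free	= perf_cgroup_css_free,
	.css_online	= perf_cgroup_css_online,
	/* flush counts for the old cgroup before the membership changes */
	.can_attach	= perf_cgroup_can_attach,
	.attach		= perf_cgroup_attach,
	.implicit_on_dfl = true,
	.threaded	= true,
};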

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] perf/core: Support reading group events with shared cgroups
  2021-03-28 17:31   ` Song Liu
@ 2021-03-29 11:36     ` Namhyung Kim
  0 siblings, 0 replies; 16+ messages in thread
From: Namhyung Kim @ 2021-03-29 11:36 UTC (permalink / raw)
  To: Song Liu
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Jiri Olsa,
	Mark Rutland, Alexander Shishkin, LKML, Stephane Eranian,
	Andi Kleen, Ian Rogers

On Mon, Mar 29, 2021 at 2:32 AM Song Liu <songliubraving@fb.com> wrote:
> > On Mar 23, 2021, at 9:21 AM, Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > This enables reading event group's counter values together with a
> > PERF_EVENT_IOC_READ_CGROUP command like we do in the regular read().
> > Users should give a correct size of buffer to be read.
> >
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> > kernel/events/core.c | 119 +++++++++++++++++++++++++++++++++++++++++--
> > 1 file changed, 116 insertions(+), 3 deletions(-)
> >
>
> [...]
>
> > +}
> > +
> > +static int perf_event_read_cgrp_node_group(struct perf_event *event, u64 cgrp_id,
> > +                                        char __user *buf)
> > +{
> > +     struct perf_cgroup_node *cgrp;
> > +     struct perf_event_context *ctx = event->ctx;
> > +     struct perf_event *sibling;
> > +     u64 read_format = event->attr.read_format;
> > +     unsigned long flags;
> > +     u64 *values;
> > +     int n = 1;
> > +     int ret;
> > +
> > +     values = kzalloc(event->read_size, GFP_KERNEL);
> > +     if (!values)
> > +             return -ENOMEM;
> > +
> > +     values[0] = 1 + event->nr_siblings;
> > +
> > +     /* update event count and times (possibly run on other cpu) */
> > +     (void)perf_event_read(event, true);
> > +
> > +     raw_spin_lock_irqsave(&ctx->lock, flags);
> > +
> > +     cgrp = find_cgroup_node(event, cgrp_id);
> > +     if (cgrp == NULL) {
> > +             raw_spin_unlock_irqrestore(&ctx->lock, flags);
> > +             kfree(values);
> > +             return -ENOENT;
> > +     }
> > +
> > +     if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
> > +             values[n++] = cgrp->time_enabled;
> > +     if (read_format & PERF_FORMAT_TOTAL_TIME_RUNNING)
> > +             values[n++] = cgrp->time_running;
> > +
> > +     values[n++] = cgrp->count;
> > +     if (read_format & PERF_FORMAT_ID)
> > +             values[n++] = primary_event_id(event);
> > +
> > +     for_each_sibling_event(sibling, event) {
> > +             n += perf_event_read_cgrp_node_sibling(sibling, read_format,
> > +                                                    cgrp_id, &values[n]);
> > +     }
> > +
> > +     raw_spin_unlock_irqrestore(&ctx->lock, flags);
> > +
> > +     ret = copy_to_user(buf, values, n * sizeof(u64));
> > +     kfree(values);
> > +     if (ret)
> > +             return -EFAULT;
> > +
> > +     return n * sizeof(u64);
> > +}
> > +
> > +static int perf_event_read_cgroup_node(struct perf_event *event, u64 read_size,
> > +                                    u64 cgrp_id, char __user *buf)
> > +{
> > +     u64 read_format = event->attr.read_format;
> > +
> > +     if (read_size < event->read_size + 2 * sizeof(u64))
>
> Why do we need read_size + 2 u64 here?

I should've repeated the following description from patch 1.

 * PERF_EVENT_IOC_READ_CGROUP - it takes a buffer consisting of a
     64-bit array to get the event counter values.  The first element
     is the size of the array in bytes, and the second element is the
     cgroup id to read.  The rest is used to save the counter value
     and timings.
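
To make that concrete, here is a minimal userspace sketch of the read
side, under the assumptions in the comments (read_cgroup_count() is a
made-up helper; with read_format flags or a group event the values
follow the usual read() layout, as in the group-read code quoted
above):

#include <linux/perf_event.h>
#include <sys/ioctl.h>

/* from patch 1; not in the released uapi header yet */
#ifndef PERF_EVENT_IOC_READ_CGROUP
#define PERF_EVENT_IOC_READ_CGROUP	_IOWR('$', 13, __u64 *)
#endif

/*
 * Read one cgroup's counter from a shared event opened without any
 * read_format flags.  buf[0] is the buffer size in bytes, buf[1] is
 * the cgroup id to read, and the values are written after these two
 * header elements.
 */
static int read_cgroup_count(int event_fd, __u64 cgrp_id, __u64 *count)
{
	__u64 buf[16] = { sizeof(buf), cgrp_id };

	if (ioctl(event_fd, PERF_EVENT_IOC_READ_CGROUP, buf) < 0)
		return -1;

	*count = buf[2];	/* no read_format flags assumed */
	return 0;
}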

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] perf/core: Share an event with multiple cgroups
  2021-03-29 11:33     ` Namhyung Kim
@ 2021-03-30  6:33       ` Song Liu
  2021-03-30 15:11         ` Namhyung Kim
  0 siblings, 1 reply; 16+ messages in thread
From: Song Liu @ 2021-03-30  6:33 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Jiri Olsa,
	Mark Rutland, Alexander Shishkin, LKML, Stephane Eranian,
	Andi Kleen, Ian Rogers, Tejun Heo



> On Mar 29, 2021, at 4:33 AM, Namhyung Kim <namhyung@kernel.org> wrote:
> 
> On Mon, Mar 29, 2021 at 2:17 AM Song Liu <songliubraving@fb.com> wrote:
>>> On Mar 23, 2021, at 9:21 AM, Namhyung Kim <namhyung@kernel.org> wrote:
>>> 
>>> As we can run many jobs (in container) on a big machine, we want to
>>> measure each job's performance during the run.  To do that, the
>>> perf_event can be associated to a cgroup to measure it only.
>>> 

[...]

>>> +     return 0;
>>> +}
>> 
>> Could you please explain why we need this logic in can_attach?
> 
> IIUC the ss->attach() is called after a task's cgroup membership
> is changed.  But we want to collect the performance numbers for
> the old cgroup just before the change.  As the logic merely checks
> the current task's cgroup, it should be done in the can_attach()
> which is called before the cgroup change.

Thanks for the explanations. 

Overall, I really like the core idea, especially that the overhead on 
context switch is bounded (by the depth of cgroup tree). 

Is it possible to make PERF_EVENT_IOC_ATTACH_CGROUP more flexible? 
Specifically, if we can have
 
  PERF_EVENT_IOC_ADD_CGROUP	add a cgroup to the list 
  PERF_EVENT_IOC_DEL_CGROUP	delete a cgroup from the list

we can probably share these events among multiple processes, and
these processes don't need to know each other's cgroup lists. I think
this will be useful for users to build customized monitoring in
their own containers.

Does this make sense?

Thanks,
Song


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] perf/core: Share an event with multiple cgroups
  2021-03-30  6:33       ` Song Liu
@ 2021-03-30 15:11         ` Namhyung Kim
  2021-04-01  6:00           ` Stephane Eranian
  2021-04-01  6:19           ` Song Liu
  0 siblings, 2 replies; 16+ messages in thread
From: Namhyung Kim @ 2021-03-30 15:11 UTC (permalink / raw)
  To: Song Liu
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Jiri Olsa,
	Mark Rutland, Alexander Shishkin, LKML, Stephane Eranian,
	Andi Kleen, Ian Rogers, Tejun Heo

On Tue, Mar 30, 2021 at 3:33 PM Song Liu <songliubraving@fb.com> wrote:
> > On Mar 29, 2021, at 4:33 AM, Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > On Mon, Mar 29, 2021 at 2:17 AM Song Liu <songliubraving@fb.com> wrote:
> >>> On Mar 23, 2021, at 9:21 AM, Namhyung Kim <namhyung@kernel.org> wrote:
> >>>
> >>> As we can run many jobs (in container) on a big machine, we want to
> >>> measure each job's performance during the run.  To do that, the
> >>> perf_event can be associated to a cgroup to measure it only.
> >>>
>
> [...]
>
> >>> +     return 0;
> >>> +}
> >>
> >> Could you please explain why we need this logic in can_attach?
> >
> > IIUC the ss->attach() is called after a task's cgroup membership
> > is changed.  But we want to collect the performance numbers for
> > the old cgroup just before the change.  As the logic merely checks
> > the current task's cgroup, it should be done in the can_attach()
> > which is called before the cgroup change.
>
> Thanks for the explanations.
>
> Overall, I really like the core idea, especially that the overhead on
> context switch is bounded (by the depth of cgroup tree).

Thanks!

>
> Is it possible to make PERF_EVENT_IOC_ATTACH_CGROUP more flexible?
> Specifically, if we can have
>
>   PERF_EVENT_IOC_ADD_CGROUP     add a cgroup to the list
>   PERF_EVENT_IOC_DEL_CGROUP     delete a cgroup from the list
>
> we can probably share these events among multiple processes, and
> these processes don't need to know each other's cgroup lists. I think
> this will be useful for users to build customized monitoring in
> their own containers.
>
> Does this make sense?

Maybe we can add an ADD/DEL interface for more flexible monitoring,
but I'm not sure in which use cases it would actually be used.

For your multi-process sharing case, the original events' file
descriptors should be shared first.  Also adding and deleting
(or just reading) arbitrary cgroups from a container can be a
security concern IMHO.

So I just focused on the single-process multi-cgroup case which is
already used (perf stat --for-each-cgroup) and very important in my
company's setup.  In this case we have a list of interested cgroups
from the beginning so it's more efficient to create a properly sized
hash table and all the nodes at once.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] perf/core: Share an event with multiple cgroups
  2021-03-30 15:11         ` Namhyung Kim
@ 2021-04-01  6:00           ` Stephane Eranian
  2021-04-01  6:19           ` Song Liu
  1 sibling, 0 replies; 16+ messages in thread
From: Stephane Eranian @ 2021-04-01  6:00 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Song Liu, Namhyung Kim, Ingo Molnar, Arnaldo Carvalho de Melo,
	Jiri Olsa, Mark Rutland, Alexander Shishkin, LKML, Andi Kleen,
	Ian Rogers, Tejun Heo

Hi,

I would like to re-emphasize why this patch is important. As Namhyung
outlined in his cover message, cgroup monitoring builds on top of
per-cpu monitoring and offers maximum flexibility by allowing each
event to be attached to a single cgroup. Although this was fine when
machines were much smaller and the number of simultaneous cgroups was
also small, it does not work anymore with today's machines, and even
less with future ones.  Over the last couple of years, we have tried
to make cgroup monitoring more scalable. Ian Rogers' patch series
addressed the RB-tree handling of events to avoid walking the whole
tree to find the events of the sched-in cgroup. This helped reduce
some of the overhead we are seeing, which is causing serious problems
for our end users, forcing them to tone down monitoring and slice
cgroup collection over time, which is far from ideal.

Namhyung's series goes a lot further, by addressing two key overhead factors:
  1- the file descriptor consumption explosion
  2- the context switch overhead

Again, these are a major cause of problems for us and needed to be
addressed in a way that maintained backward compatibility. We are
interested in the case where the same events are measured across all
cgroups, and I believe this is a common usage model.

1/ File descriptor issue

With the current interface, if you want to monitor 10 events on a 112
CPU server across 200 cgroups, you need:

    num_fds = num_events x num_cpus x num_cgroups
            = 10 x 112 x 200 = 224,000 descriptors

A usual Linux distribution allows around 1024. Although you could
increase the limit if you are root, this has other impacts on the
system: the kernel memory footprint needed to back these file
descriptors and struct perf_event is large (see our presentation at
LPC2019).

2/ Context switch overhead

Each time you have a cgroup switch, i.e., a context switch where you
switch cgroups, you incur a PMU event reschedule. A cgroup sched-in is
more expensive than a per-process sched-in because you have to find
the events which are relevant to the next cgroup, i.e., you may have
to walk more entries in the RB-tree. If the events are identical
across cgroups, you may end up paying that cost to reinstall the same
events (ignoring multiplexing). Furthermore, event scheduling is an
expensive operation because of memory accesses and PMU register
accesses; it is best avoided if it can be. From our experience, this
has caused significant overhead in our systems, to the point where we
have to reduce the interval at which we collect the data and the
number of cgroups we can monitor at once.


3/ Namhyung's solution

I personally like Namhyung's solution to the problem because it fits
within the existing interface and does not break the existing
per-cgroup mode. The implementation is fairly simple and non-invasive.
It provides a very significant reduction of BOTH the file descriptor
pressure and the context switch overhead. It matches perfectly with
the common usage model of monitoring the same events across multiple
cgroups simultaneously. The patch does not disrupt the existing
perf_event_open() or read()/close() syscalls. Everything is handled
via a pair of new ioctls.

It eliminates the file descriptor overhead as follows using the same
example as before:

Before:
    num_fds = num_events x num_cpus x num_cgroups
            = 10 x 112 x 200 = 224,000 descriptors
After:
    num_fds = num_events x num_cpus = 10 x 112 = 1120 descriptors
(200x reduction in fds and the memory savings that go with that in the
kernel)

In other words, it reduces the file descriptor consumption to what is
necessary for plain system-wide monitoring.

On context switch, the kernel computes the event delta and stores it
into a hash table, i.e., a single PMU register access instead of a
full PMU reschedule. The delta is propagated to the proper ancestors
in the cgroup hierarchy if needed.

The change is generic and benefits ALL processor architectures in the
same manner.

We have tested the patch on our servers with large configurations, and
it has delivered significant savings and enabled monitoring of more
cgroups simultaneously, instead of monitoring in batches, which never
yielded a consistent view of the system.

Furthermore, the patches could be extended to add, as Song Liu
suggested, the possibility of deleting cgroups attached to an event,
to allow continuous monitoring without having to restart the
monitoring tool. I believe the extension can be further improved by
allowing a vector read of the counts as well; that would eliminate a
significant number of ioctl(READ) syscalls.

Overall, I think this patch series delivers significant value-add to
the perf_events interface and should be committed ASAP.

Thanks.




On Tue, Mar 30, 2021 at 8:11 AM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Tue, Mar 30, 2021 at 3:33 PM Song Liu <songliubraving@fb.com> wrote:
> > > On Mar 29, 2021, at 4:33 AM, Namhyung Kim <namhyung@kernel.org> wrote:
> > >
> > > On Mon, Mar 29, 2021 at 2:17 AM Song Liu <songliubraving@fb.com> wrote:
> > >>> On Mar 23, 2021, at 9:21 AM, Namhyung Kim <namhyung@kernel.org> wrote:
> > >>>
> > >>> As we can run many jobs (in container) on a big machine, we want to
> > >>> measure each job's performance during the run.  To do that, the
> > >>> perf_event can be associated to a cgroup to measure it only.
> > >>>
> >
> > [...]
> >
> > >>> +     return 0;
> > >>> +}
> > >>
> > >> Could you please explain why we need this logic in can_attach?
> > >
> > > IIUC the ss->attach() is called after a task's cgroup membership
> > > is changed.  But we want to collect the performance numbers for
> > > the old cgroup just before the change.  As the logic merely checks
> > > the current task's cgroup, it should be done in the can_attach()
> > > which is called before the cgroup change.
> >
> > Thanks for the explanations.
> >
> > Overall, I really like the core idea, especially that the overhead on
> > context switch is bounded (by the depth of cgroup tree).
>
> Thanks!
>
> >
> > Is it possible to make PERF_EVENT_IOC_ATTACH_CGROUP more flexible?
> > Specifically, if we can have
> >
> >   PERF_EVENT_IOC_ADD_CGROUP     add a cgroup to the list
> >   PERF_EVENT_IOC_DEL_CGROUP      delete a cgroup from the list
> >
> > we can probably share these events among multiple processes, and
> > these processes don't need to know each other's cgroup lists. I think
> > this will be useful for users to build customized monitoring in
> > their own containers.
> >
> > Does this make sense?
>
> Maybe we can add an ADD/DEL interface for more flexible monitoring,
> but I'm not sure in which use cases it would actually be used.
>
> For your multi-process sharing case, the original events' file
> descriptors should be shared first.  Also adding and deleting
> (or just reading) arbitrary cgroups from a container can be a
> security concern IMHO.
>
> So I just focused on the single-process multi-cgroup case which is
> already used (perf stat --for-each-cgroup) and very important in my
> company's setup.  In this case we have a list of interested cgroups
> from the beginning so it's more efficient to create a properly sized
> hash table and all the nodes at once.
>
> Thanks,
> Namhyung

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] perf/core: Share an event with multiple cgroups
  2021-03-30 15:11         ` Namhyung Kim
  2021-04-01  6:00           ` Stephane Eranian
@ 2021-04-01  6:19           ` Song Liu
  1 sibling, 0 replies; 16+ messages in thread
From: Song Liu @ 2021-04-01  6:19 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Jiri Olsa,
	Mark Rutland, Alexander Shishkin, LKML, Stephane Eranian,
	Andi Kleen, Ian Rogers, Tejun Heo



> On Mar 30, 2021, at 8:11 AM, Namhyung Kim <namhyung@kernel.org> wrote:
> 
> On Tue, Mar 30, 2021 at 3:33 PM Song Liu <songliubraving@fb.com> wrote:
>>> On Mar 29, 2021, at 4:33 AM, Namhyung Kim <namhyung@kernel.org> wrote:
>>> 
>>> On Mon, Mar 29, 2021 at 2:17 AM Song Liu <songliubraving@fb.com> wrote:
>>>>> On Mar 23, 2021, at 9:21 AM, Namhyung Kim <namhyung@kernel.org> wrote:
>>>>> 
>>>>> As we can run many jobs (in container) on a big machine, we want to
>>>>> measure each job's performance during the run.  To do that, the
>>>>> perf_event can be associated to a cgroup to measure it only.
>>>>> 
>> 
>> [...]
>> 
>>>>> +     return 0;
>>>>> +}
>>>> 
>>>> Could you please explain why we need this logic in can_attach?
>>> 
>>> IIUC the ss->attach() is called after a task's cgroup membership
>>> is changed.  But we want to collect the performance numbers for
>>> the old cgroup just before the change.  As the logic merely checks
>>> the current task's cgroup, it should be done in the can_attach()
>>> which is called before the cgroup change.
>> 
>> Thanks for the explanations.
>> 
>> Overall, I really like the core idea, especially that the overhead on
>> context switch is bounded (by the depth of cgroup tree).
> 
> Thanks!
> 
>> 
>> Is it possible to make PERF_EVENT_IOC_ATTACH_CGROUP more flexible?
>> Specifically, if we can have
>> 
>>  PERF_EVENT_IOC_ADD_CGROUP     add a cgroup to the list
>>  PERF_EVENT_IOC_DEL_CGROUP     delete a cgroup from the list
>> 
>> we can probably share these events among multiple processes, and
>> these processes don't need to know each other's cgroup lists. I think
>> this will be useful for users to build customized monitoring in
>> their own containers.
>> 
>> Does this make sense?
> 
> Maybe we can add an ADD/DEL interface for more flexible monitoring,
> but I'm not sure in which use cases it would actually be used.
> 
> For your multi-process sharing case, the original events' file
> descriptors should be shared first.  

Yes, we will need some other work to make the ADD/DEL interface 
work properly. Let's worry about that later. 

For both patches:

Acked-by: Song Liu <songliubraving@fb.com>

Thanks,
Song

[...]

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2021-04-01  6:20 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-23 16:21 [RFC 0/2] perf core: Sharing events with multiple cgroups Namhyung Kim
2021-03-23 16:21 ` [PATCH 1/2] perf/core: Share an event " Namhyung Kim
2021-03-24  0:30   ` Song Liu
2021-03-24  1:06     ` Namhyung Kim
2021-03-25  0:55   ` Song Liu
2021-03-25  2:44     ` Namhyung Kim
2021-03-25 12:57     ` Arnaldo Carvalho de Melo
2021-03-28 17:17   ` Song Liu
2021-03-29 11:33     ` Namhyung Kim
2021-03-30  6:33       ` Song Liu
2021-03-30 15:11         ` Namhyung Kim
2021-04-01  6:00           ` Stephane Eranian
2021-04-01  6:19           ` Song Liu
2021-03-23 16:21 ` [PATCH 2/2] perf/core: Support reading group events with shared cgroups Namhyung Kim
2021-03-28 17:31   ` Song Liu
2021-03-29 11:36     ` Namhyung Kim
