linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/5] Benchmark and improve event synthesis performance
@ 2020-04-01 23:39 Ian Rogers
From: Ian Rogers @ 2020-04-01 23:39 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Petr Mladek, Andrey Zhizhikin, Kefeng Wang, Thomas Gleixner,
	Kan Liang, linux-kernel, linux-perf-users
  Cc: Stephane Eranian, Ian Rogers


Event synthesis is performance critical in common tasks using perf. For
example, when perf record starts in system-wide mode, the /proc file
system is scanned and events are synthesized for each process and all
executable mmaps. On large machines with many processes, we have seen
O(seconds) of wall clock time spent in synthesis.

This patch set adds a benchmark for synthesis performance in a new
benchmark collection called 'internals'. The benchmark uses the
machine__synthesize_threads function, single threaded on the perf process
with a 'tool' that just drops the events, to measure how long synthesis
takes.
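
As a rough illustration of the measurement approach described above (not
the code in tools/perf/bench/synthesize.c), the sketch below times a
function over a number of iterations and reports the mean in usec. The
names workload, elapsed_usec and average_usec are hypothetical, and
workload is only a stand-in for a call to machine__synthesize_threads
with a tool that drops events.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/*
 * Hypothetical stand-in for the work being measured. In the benchmark
 * described above this would be event synthesis; here it is a dummy loop.
 */
static void workload(void)
{
	volatile unsigned long sink = 0;
	unsigned long i;

	for (i = 0; i < 100000; i++)
		sink += i;
}

/* Elapsed microseconds between two CLOCK_MONOTONIC timestamps. */
static double elapsed_usec(const struct timespec *start,
			   const struct timespec *end)
{
	return (end->tv_sec - start->tv_sec) * 1e6 +
	       (end->tv_nsec - start->tv_nsec) / 1e3;
}

/* Run fn() `iterations` times and return the mean wall clock time in usec. */
static double average_usec(void (*fn)(void), int iterations)
{
	double total = 0.0;
	int i;

	for (i = 0; i < iterations; i++) {
		struct timespec start, end;

		clock_gettime(CLOCK_MONOTONIC, &start);
		fn();
		clock_gettime(CLOCK_MONOTONIC, &end);
		total += elapsed_usec(&start, &end);
	}
	return total / iterations;
}
```

Printing average_usec(workload, 1000) with a "%f usec" format would give
output in the same shape as the "Average synthesis took" lines in this
message. CLOCK_MONOTONIC is used in the sketch so that wall clock
adjustments do not skew the measurement.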

Profiling this benchmark identified two performance bottlenecks:
hugetlbfs_mountpoint and stdio. The impact of the fixes is:

Before:
Average synthesis took: 167.616800 usec
Average data synthesis took: 208.655600 usec

After hugetlbfs_mountpoint scalability fix:
Average synthesis took: 120.195100 usec
Average data synthesis took: 156.582300 usec

After removal of stdio in /proc/pid/maps code:
Average synthesis took: 67.189100 usec
Average data synthesis took: 102.451600 usec

Time was measured on an Intel Xeon 6154 compiling with Debian gcc 9.2.1.
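
To make the stdio removal more concrete, here is a minimal sketch of the
general technique: read() into a fixed buffer and parse hex fields by
hand instead of going through fgets/sscanf. It is not the API added in
tools/lib/api/io.h; struct reader, reader_init, reader_getc,
reader_get_hex and parse_range are hypothetical names used only for
illustration, and read errors are simply treated as end of input.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical buffered reader over a file descriptor (illustrative only). */
struct reader {
	int fd;		/* file descriptor being read */
	char buf[4096];	/* fixed buffer, no heap allocation */
	size_t pos;	/* next unread byte in buf */
	size_t len;	/* number of valid bytes in buf */
	bool eof;	/* set once read() returns 0 or fails */
};

static void reader_init(struct reader *r, int fd)
{
	r->fd = fd;
	r->pos = 0;
	r->len = 0;
	r->eof = false;
}

/* Return the next byte, refilling the buffer when empty; -1 at EOF or error. */
static int reader_getc(struct reader *r)
{
	if (r->pos == r->len) {
		ssize_t n;

		if (r->eof)
			return -1;
		n = read(r->fd, r->buf, sizeof(r->buf));
		if (n <= 0) {
			r->eof = true;
			return -1;
		}
		r->pos = 0;
		r->len = (size_t)n;
	}
	return (unsigned char)r->buf[r->pos++];
}

/*
 * Parse a run of hex digits into *out. Returns the character that ended
 * the run (which is consumed), -1 if input ended after at least one digit,
 * or -2 if no hex digit was read.
 */
static int reader_get_hex(struct reader *r, uint64_t *out)
{
	bool any = false;

	*out = 0;
	for (;;) {
		int c = reader_getc(r);
		int digit;

		if (c >= '0' && c <= '9')
			digit = c - '0';
		else if (c >= 'a' && c <= 'f')
			digit = c - 'a' + 10;
		else if (c >= 'A' && c <= 'F')
			digit = c - 'A' + 10;
		else
			return any ? c : -2;
		*out = (*out << 4) | (uint64_t)digit;
		any = true;
	}
}

/* Parse the "start-end " address range that begins a /proc/<pid>/maps line. */
static bool parse_range(struct reader *r, uint64_t *start, uint64_t *end)
{
	return reader_get_hex(r, start) == '-' &&
	       reader_get_hex(r, end) == ' ';
}
```

A caller would open /proc/<pid>/maps, pass the descriptor to
reader_init, and call parse_range at the start of each line. The sketch
only shows where the saving comes from: one read() per 4kb of input and
no format string interpretation per field.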

Two patches in the set were sent to LKML previously but are included
here for context around the benchmark performance impact:
https://lore.kernel.org/lkml/20200327172914.28603-1-irogers@google.com/T/#u
https://lore.kernel.org/lkml/20200328014221.168130-1-irogers@google.com/T/#u

A future area of improvement could be to add the perf top
num-thread-synthesize option more widely to other perf commands, and
also to benchmark its effectiveness.

Ian Rogers (4):
  perf bench: add event synthesis benchmark
  perf synthetic-events: save 4kb from 2 stack frames
  tools api: add a lightweight buffered reading api
  perf synthetic events: Remove use of sscanf from /proc reading

Stephane Eranian (1):
  tools api fs: make xxx__mountpoint() more scalable

 tools/lib/api/fs/fs.c              |  17 +++
 tools/lib/api/fs/fs.h              |  12 ++
 tools/lib/api/io.h                 | 103 +++++++++++++++++
 tools/perf/bench/Build             |   2 +-
 tools/perf/bench/bench.h           |   2 +-
 tools/perf/bench/synthesize.c      | 101 ++++++++++++++++
 tools/perf/builtin-bench.c         |   6 +
 tools/perf/util/synthetic-events.c | 177 +++++++++++++++++++----------
 8 files changed, 355 insertions(+), 65 deletions(-)
 create mode 100644 tools/lib/api/io.h
 create mode 100644 tools/perf/bench/synthesize.c

-- 
2.26.0.rc2.310.g2932bb562d-goog



Thread overview: 10+ messages
2020-04-01 23:39 [PATCH 0/5] Benchmark and improve event synthesis performance Ian Rogers
2020-04-01 23:39 ` [PATCH 1/5] perf bench: add event synthesis benchmark Ian Rogers
2020-04-02 13:41   ` Jiri Olsa
2020-04-01 23:39 ` [PATCH 2/5] tools api fs: make xxx__mountpoint() more scalable Ian Rogers
2020-04-01 23:39 ` [PATCH 3/5] perf synthetic-events: save 4kb from 2 stack frames Ian Rogers
2020-04-01 23:39 ` [PATCH 4/5] tools api: add a lightweight buffered reading api Ian Rogers
2020-04-02 13:41   ` Jiri Olsa
2020-04-01 23:39 ` [PATCH 5/5] perf synthetic events: Remove use of sscanf from /proc reading Ian Rogers
2020-04-02 13:40   ` Jiri Olsa
2020-04-02 13:41   ` Jiri Olsa
