linux-kernel.vger.kernel.org archive mirror
From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ingo Molnar <mingo@kernel.org>, Thomas Gleixner <tglx@linutronix.de>
Cc: Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>,
	Clark Williams <williams@redhat.com>,
	linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Andi Kleen <ak@linux.intel.com>,
	Michael Petlan <mpetlan@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>
Subject: [PATCH 01/57] perf tools: Allow to build with -ltcmalloc
Date: Mon, 21 Oct 2019 10:37:38 -0300
Message-ID: <20191021133834.25998-2-acme@kernel.org>
In-Reply-To: <20191021133834.25998-1-acme@kernel.org>

From: Jiri Olsa <jolsa@kernel.org>

By using "make TCMALLOC=1" you can build perf linked against
libtcmalloc.so (gperftools), which enables heap profiling.

To get a heap profile (from the tools/perf directory):

  $ <install gperftools>
  $ make TCMALLOC=1 DEBUG=1
  $ HEAPPROFILE=/tmp/heapprof ./perf ...
  $ pprof ./perf /tmp/heapprof.000*
  (pprof) top
  Total: 2335.5 MB
    1735.1  74.3%  74.3%   1735.1  74.3% memdup
     402.0  17.2%  91.5%    402.0  17.2% zalloc
     140.2   6.0%  97.5%    145.8   6.2% map__new
      33.6   1.4%  98.9%     33.6   1.4% symbol__new
      12.4   0.5%  99.5%     12.4   0.5% alloc_event
       6.2   0.3%  99.7%      6.2   0.3% nsinfo__new
       5.5   0.2% 100.0%      5.5   0.2% nsinfo__copy
       0.3   0.0% 100.0%      0.3   0.0% dso__new
       0.1   0.0% 100.0%      0.1   0.0% do_read_string
       0.0   0.0% 100.0%      0.0   0.0% __GI__IO_file_doallocate

To see the call stacks:
  $ pprof --pdf ./perf /tmp/heapprof.00* > callstack.pdf
  $ pprof --web ./perf /tmp/heapprof.00*

Committer testing:

Install gperftools, on Fedora:

  # dnf install gperftools-devel

Then build:

 $ make TCMALLOC=1 DEBUG=1 -C tools/perf O=/tmp/build/perf install-bin

Verify that it linked against the right library:

  $ ldd ~/bin/perf | grep tcma
	libtcmalloc.so.4 => /lib64/libtcmalloc.so.4 (0x00007fb2953a7000)
  $
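
The same check can be scripted. This is only a minimal sketch:
links_tcmalloc is a hypothetical helper name, not part of perf or
gperftools, and it merely pattern-matches ldd-style text read from stdin.

```shell
# Hypothetical helper (not part of perf or gperftools): succeed if the
# ldd-style text on stdin lists a libtcmalloc shared object.
links_tcmalloc() {
    grep -q 'libtcmalloc\.so'
}
```

It could be used like this, assuming perf was installed to ~/bin as above:

  $ ldd ~/bin/perf | links_tcmalloc && echo "linked against tcmalloc"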

Run 'perf trace' system wide for 1 minute:

  # HEAPPROFILE=/tmp/heapprof perf trace -a sleep 1m
  <SNIP>
   59985.524 ( 0.006 ms): Web Content/20354 recvmsg(fd: 9<socket:[1762817]>, msg: 0x7ffee5fdafb0) = -1 EAGAIN (Resource temporarily unavailable)
   59985.536 ( 0.005 ms): Web Content/20354 recvmsg(fd: 9<socket:[1762817]>, msg: 0x7ffee5fdafc0) = -1 EAGAIN (Resource temporarily unavailable)
   59981.956 (10.143 ms): SCTP timer/21716  ... [continued]: select())                            = 0 (Timeout)
   59985.549 (         ): Web Content/20354 poll(ufds: 0x7f1df38af180, nfds: 3, timeout_msecs: 4294967295) ...
       0.926 (59999.481 ms): sleep/29764  ... [continued]: nanosleep())                           = 0
   59992.133 (         ): SCTP timer/21716 select(tvp: 0x7ff5bf7fee80)                            ...
   60000.477 ( 0.009 ms): sleep/29764 close(fd: 1)                                                = 0
   60000.493 ( 0.005 ms): sleep/29764 close(fd: 2)                                                = 0
   60000.514 (         ): sleep/29764 exit_group()                                                = ?
  Dumping heap profile to /tmp/heapprof.0001.heap (Exiting, 3 MB in use)
  #

Install pprof:

  # dnf install pprof

And run it:

  # pprof ~/bin/perf /tmp/heapprof.0001.heap
  Using local file /root/bin/perf.
  Using local file /tmp/heapprof.0001.heap.
  Welcome to pprof!  For help, type 'help'.
  (pprof) top
  Total: 4.0 MB
       1.7  42.0%  42.0%      2.2  54.1% map__new
       0.9  23.3%  65.3%      0.9  23.3% zalloc
       0.5  11.4%  76.7%      0.5  11.4% dso__new
       0.2   5.6%  82.3%      0.3   8.5% trace__sys_enter
       0.2   4.9%  87.2%      0.2   4.9% __GI___strdup
       0.2   3.8%  91.0%      0.2   3.8% new_term
       0.1   2.2%  93.2%      0.4  10.1% __perf_pmu__new_alias
       0.0   1.0%  94.3%      0.0   1.2% event_read_fields
       0.0   0.8%  95.1%      0.0   0.8% nsinfo__new
       0.0   0.7%  95.8%      0.1   3.2% trace__read_syscall_info
  (pprof)
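
In the 'top' output the first column is the memory (MB) allocated directly
in the function and the fourth is the memory allocated in the function plus
its callees. As a rough sketch for post-processing that text, the
hypothetical helper below (callee_heavy is an illustrative name, not a pprof
feature) prints the functions whose fourth column exceeds the first, i.e.
those that allocate mostly through callees. It assumes six
whitespace-separated fields per row, so symbol names must not contain spaces.

```shell
# Hypothetical helper: read pprof 'top' text rows on stdin and print the
# functions whose cumulative MB (column 4) exceeds their own MB (column 1).
# Header lines such as "Total: 4.0 MB" are skipped because they do not
# have exactly six fields.
callee_heavy() {
    awk 'NF == 6 && ($4 + 0) > ($1 + 0) { print $6 }'
}
```

For example, with the 'top' output above saved to a file (the file name is
an assumption):

  $ callee_heavy < /tmp/pprof-top.txt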

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191013151427.11941-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Makefile.config | 5 +++++
 tools/perf/Makefile.perf   | 2 ++
 2 files changed, 7 insertions(+)

diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 063202c53b64..1783427da9b0 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -265,6 +265,11 @@ LDFLAGS += -Wl,-z,noexecstack
 
 EXTLIBS = -lpthread -lrt -lm -ldl
 
+ifneq ($(TCMALLOC),)
+  CFLAGS += -fno-builtin-malloc -fno-builtin-calloc -fno-builtin-realloc -fno-builtin-free
+  EXTLIBS += -ltcmalloc
+endif
+
 ifeq ($(FEATURES_DUMP),)
 include $(srctree)/tools/build/Makefile.feature
 else
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index a099a8a89447..8f1ba986d3bf 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -114,6 +114,8 @@ include ../scripts/utilities.mak
 # Define NO_LIBZSTD if you do not want support of Zstandard based runtime
 # trace compression in record mode.
 #
+# Define TCMALLOC to enable tcmalloc heap profiling.
+#
 
 # As per kernel Makefile, avoid funny character set dependencies
 unexport LC_ALL
-- 
2.21.0

