All of lore.kernel.org
 help / color / mirror / Atom feed
From: Brendan Gregg <brendan.d.gregg@gmail.com>
To: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Borislav Petkov <bp@suse.de>,
	Chandler Carruth <chandlerc@gmail.com>,
	David Ahern <dsahern@gmail.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Jiri Olsa <jolsa@redhat.com>, Namhyung Kim <namhyung@kernel.org>,
	Stephane Eranian <eranian@google.com>,
	Wang Nan <wangnan0@huawei.com>
Subject: Re: [PATCH 13/16] perf callchain: Switch default to 'graph,0.5,caller'
Date: Fri, 9 Oct 2015 15:10:29 -0700	[thread overview]
Message-ID: <CAE40pddVCjHVC4nD=xBcMmE6dy+xPe2rVgK8a1QM9sv0zZWEgw@mail.gmail.com> (raw)
In-Reply-To: <20151009215626.GM14409@kernel.org>

On Fri, Oct 9, 2015 at 2:56 PM, Arnaldo Carvalho de Melo
<arnaldo.melo@gmail.com> wrote:
>
> Em Fri, Oct 09, 2015 at 01:34:33PM -0700, Brendan Gregg escreveu:
> > On Mon, Oct 5, 2015 at 2:03 PM, Arnaldo Carvalho de Melo
> > <acme@kernel.org> wrote:
> > >
> > > From: Arnaldo Carvalho de Melo <acme@redhat.com>
> > >
> > > Which is the most common default found in other similar tools.
> >
> > Interactive tools, sure, like the perf report TUI.
>
> > But this also changes the ordering of the non-interactive tools which
> > dump stacks: "perf report -n --stdio" and "perf script". The most
> > common default for dumping stacks is caller. Eg:
>
> And you use that for scripting?

Yes; how I typically CPU profile:

git clone https://github.com/brendangregg/FlameGraph
cd FlameGraph
perf record -F 99 -a -g -- sleep 60
perf script | ./stackcollapse-perf.pl | /flamegraph.pl > flame.svg

Then open flame.svg in a browser and click around. Try it. :)

But it's not just scripting; We often email around "perf report -n
--stdio" output, or attach it to tickets, when working on an issue.
Easier than trying to grab the right TUI screenshot.

>
> > # perf report -n --stdio
> > [...]
> >     16.87%        334           iperf  [kernel.kallsyms]     [k] copy_user_enhanced_fast_string
> >                      |
> >                      --- 0x7f0683ba1ccd
> >                          system_call_fastpath
> >                          sys_write
> >                          vfs_write
> >                          do_sync_write
> >                          sock_aio_write
> >                          do_sock_write.isra.10
> >                          inet_sendmsg
> >                          copy_user_enhanced_fast_string
> > [...]
> >
> > That's upside down. The current default preserves ordering from the
> > informational line onwards:
> >
> > # perf report -n --stdio -g fractal,0.5,callee
> > [...]
> >     16.87%        334           iperf  [kernel.kallsyms]     [k] copy_user_enhanced_fast_string
> >                      |
> >                      --- copy_user_enhanced_fast_string
> >                         |
> >                         |--64.37%-- inet_sendmsg
> >                         |          do_sock_write.isra.10
> >                         |          sock_aio_write
> >                         |          do_sync_write
> >                         |          vfs_write
> >                         |          sys_write
> >                         |          system_call_fastpath
> >                         |          0x7f0683ba1ccd
> >
> > ... Those are just short examples. Another profile I'm working on now
> > gets really messy on "perf report -n --stdio"; eg:
> >
> > perf report -n --stdio -g graph,0.5,caller
> >     94.80%     0.10%             2  iperf     [kernel.vmlinux]    [k]
> > entry_SYSCALL_64_fastpath
> >                |
> >                |--94.70%-- entry_SYSCALL_64_fastpath
> >                |          |
>
> >
> > The current default never gets beyond 5 levels deep. The new default
> > goes to 25 levels. At least with perf report I can override the
> > default using "-g". perf script doesn't support that.
>
> Ok, so changing defaults is not nice, but in this case looked sensible,
> ends up not being for you...

I'm pretty sure this would surprise anyone looking at dumped stacks,
where the convention is caller. pstack, jstack, gdb, systemtap,
dtrace, oops message, etc. I get that we want this for the TUI, but
not dumped stacks.

>
> > Can this patch please preserve the callee ordering for non-interactive
> > output? (perf script, perf report -n --stdio). Thanks,
>
> If this is because you do scripting on it? Wouldn't it be better to not
> depend on defaults, always specify what you want and then the bug would
> be constrained to 'perf script' where we need to provide a way to change
> the default?

Actually, for my flame graphs we should really have perf report have a
--folded output to emit folded output (I emailed perf-users), callee.

For scripting we can always specify -g.

I'm thinking of others who use perf report/script at the CLI, and
expect callee output.

Brendan

  reply	other threads:[~2015-10-09 22:10 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-05 21:03 [GIT PULL 00/16] perf/core improvements and fixes Arnaldo Carvalho de Melo
2015-10-05 21:03 ` [PATCH 01/16] tools lib api fs: No need to use PATH_MAX + 1 Arnaldo Carvalho de Melo
2015-10-05 21:03 ` [PATCH 02/16] perf evlist: Display DATA_SRC sample type bit Arnaldo Carvalho de Melo
2015-10-05 21:03 ` [PATCH 03/16] perf annotate: Fix sizeof_sym_hist overflow issue Arnaldo Carvalho de Melo
2015-10-05 21:03 ` [PATCH 04/16] perf tools: Export perf_event_attr__set_max_precise_ip() Arnaldo Carvalho de Melo
2015-10-05 21:03 ` [PATCH 05/16] perf tools: Introduce 'P' modifier to request max precision Arnaldo Carvalho de Melo
2015-10-05 21:03 ` [PATCH 06/16] perf tests: Add parsing test for 'P' modifier Arnaldo Carvalho de Melo
2015-10-05 21:03 ` [PATCH 07/16] perf tools: Add support for sorting on the iaddr Arnaldo Carvalho de Melo
2015-10-05 21:03 ` [PATCH 08/16] perf tools: Setup proper width for symbol_iaddr field Arnaldo Carvalho de Melo
2015-10-05 21:03 ` [PATCH 09/16] perf tools: Handle -h and -v options Arnaldo Carvalho de Melo
2015-10-05 21:03 ` [PATCH 10/16] perf tests: Add arch tests Arnaldo Carvalho de Melo
2015-10-05 21:03 ` [PATCH 11/16] perf tests: Move x86 tests into arch directory Arnaldo Carvalho de Melo
2015-10-05 21:03 ` [PATCH 12/16] perf tests: Add Intel CQM test Arnaldo Carvalho de Melo
2015-10-05 21:03 ` [PATCH 13/16] perf callchain: Switch default to 'graph,0.5,caller' Arnaldo Carvalho de Melo
2015-10-09 20:34   ` Brendan Gregg
2015-10-09 21:56     ` Arnaldo Carvalho de Melo
2015-10-09 22:10       ` Brendan Gregg [this message]
2015-10-09 22:25         ` Arnaldo Carvalho de Melo
2015-10-20  0:16           ` Brendan Gregg
2015-10-20 12:00             ` Arnaldo Carvalho de Melo
2015-10-20 12:19               ` Frederic Weisbecker
2015-10-20 13:06                 ` Arnaldo Carvalho de Melo
2015-10-20 17:21                   ` Frederic Weisbecker
2015-10-20 18:44                     ` Arnaldo Carvalho de Melo
2015-10-21  1:21                       ` Namhyung Kim
2015-10-21 13:24                         ` Arnaldo Carvalho de Melo
2015-10-21  8:09                     ` Namhyung Kim
2015-10-21 11:57                       ` Wangnan (F)
2015-10-21 16:35                       ` Frederic Weisbecker
     [not found]                   ` <CAAwGriEtYeBytGt9x24=uUqSEy5oJ2HigfA2KXnKyrAioKrtNg@mail.gmail.com>
2015-10-21 16:27                     ` Frederic Weisbecker
2015-10-21 18:28                     ` Brendan Gregg
2015-10-21 19:23                       ` Arnaldo Carvalho de Melo
2015-10-22  0:44                         ` Brendan Gregg
2015-10-21  8:06               ` Ingo Molnar
2015-10-21 13:21                 ` Arnaldo Carvalho de Melo
2015-10-21 19:18                 ` Brendan Gregg
2015-10-10  7:09         ` Ingo Molnar
2015-10-10  7:34           ` Brendan Gregg
2015-10-10  9:07             ` Ingo Molnar
2015-10-12 15:27   ` Frederic Weisbecker
2015-10-13  4:26   ` Namhyung Kim
2015-10-19 23:50     ` Brendan Gregg
2015-10-21  7:29       ` Namhyung Kim
2015-10-20 13:23   ` Wangnan (F)
2015-10-20 13:38     ` Arnaldo Carvalho de Melo
2015-10-21  1:44       ` Namhyung Kim
2015-10-21  8:48       ` Ingo Molnar
2015-10-21 13:43         ` Arnaldo Carvalho de Melo
2015-10-21 13:46           ` Arnaldo Carvalho de Melo
2015-10-22  8:46           ` Ingo Molnar
2015-10-22 12:36             ` Namhyung Kim
2015-10-05 21:03 ` [PATCH 14/16] perf ui browser: Optional horizontal scrolling key binding Arnaldo Carvalho de Melo
2015-10-05 21:03 ` [PATCH 15/16] perf hists browser: Implement horizontal scrolling Arnaldo Carvalho de Melo
2015-10-05 21:03 ` [PATCH 16/16] perf tools: Fail properly in case pattern matching fails to find tracepoint Arnaldo Carvalho de Melo
2015-10-06  7:09 ` [GIT PULL 00/16] perf/core improvements and fixes Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAE40pddVCjHVC4nD=xBcMmE6dy+xPe2rVgK8a1QM9sv0zZWEgw@mail.gmail.com' \
    --to=brendan.d.gregg@gmail.com \
    --cc=adrian.hunter@intel.com \
    --cc=arnaldo.melo@gmail.com \
    --cc=bp@suse.de \
    --cc=chandlerc@gmail.com \
    --cc=dsahern@gmail.com \
    --cc=eranian@google.com \
    --cc=fweisbec@gmail.com \
    --cc=jolsa@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=namhyung@kernel.org \
    --cc=wangnan0@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.