From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Namhyung Kim <namhyung@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>,
	Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Jiri Olsa <jolsa@redhat.com>, LKML <linux-kernel@vger.kernel.org>,
	David Ahern <dsahern@gmail.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Andi Kleen <andi@firstfloor.org>, Kan Liang <kan.liang@intel.com>
Subject: Re: [RFC/PATCH 0/4] perf report: Support folded callchain output (v2)
Date: Mon, 2 Nov 2015 21:46:47 -0300
Message-ID: <20151103004647.GE21609@kernel.org>
In-Reply-To: <20151102234606.GB11498@danjae.kornet>

On Tue, Nov 03, 2015 at 08:46:06AM +0900, Namhyung Kim wrote:
> On Mon, Nov 02, 2015 at 08:04:36PM -0300, Arnaldo Carvalho de Melo wrote:
> > On Tue, Nov 03, 2015 at 07:49:27AM +0900, Namhyung Kim wrote:
> > > On Mon, Nov 02, 2015 at 07:28:42PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > On Tue, Nov 03, 2015 at 07:12:04AM +0900, Namhyung Kim wrote:
> > > > > On Mon, Nov 02, 2015 at 06:30:21PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > > On Mon, Nov 02, 2015 at 12:37:28PM -0800, Brendan Gregg wrote:
> > > > > > > On Mon, Nov 2, 2015 at 4:57 AM, Namhyung Kim <namhyung@kernel.org> wrote:
> > > > > > > > This is what Brendan requested on the perf-users mailing list [1] to
> > > > > > > > support FlameGraphs [2] more efficiently.  This patchset adds a few
> > > > > > > > more callchain options to adjust the output for it.
> > > > 
> > > > > > > > At first, 'folded' output mode was added.  The folded output puts all
> > > > > > > > callchain nodes in a line separated by semicolons, a space and the
> > > > > > > > value.  Now it only supports --stdio as other UI provides some way of
> > > > > > > > folding/expanding callchains dynamically.
> > > > 
> > > > > > > > The value can now be one of 'percent', 'period', or 'count'.  The
> > > > > > > > percent is the current default output and the period is the raw number of
> > > > > > > > sample periods.  The count is the number of samples for each callchain.
> > > > 
> > > > > > > > Here's an example:
> > > > 
> > > > > > > >   $ perf report --no-children --show-nr-samples --stdio -g folded,count
> > > > > > > >   ...
> > > > > > > >     39.93%     80  swapper  [kernel.vmlinux]  [k] intel_idle
> > > > > > > >   intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary 57
> > > > > > > >   intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;... 23

> > > > > > > So for the folded output I don't need the summary line (the row of
> > > > > > > columns printed by hist_entry__snprintf()), and don't need anything
> > > > > > > except folded stacks and the counts. If working with the existing
> > > > > > > stdio interface is making it harder than it needs to be, might it be

> > > > > > I don't think so, just add some flag asking for that
> > > > > > hist_entry__snprintf() to be suppressed, ideas for a long option name?

> > > > > > Having it as Namhyung did may have value for some people as a more
> > > > > > compact way to show the callchains together with the hist_entry line.

> > > > > Yeah, I'd keep the hist entry line unless it's too hard to
> > > > > parse/filter.  IMHO it's just a way to show callchains, so no need to

> > > > What I suggested was to have something like:

> > > >   $ perf report --no-children --no-hists --stdio -g folded,count
> > > >                               ^^^^^^^^^^
> > > >                               ^^^^^^^^^^
> > > >   ...
> > > >   intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary 57
> > > >   intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;... 23
> > > > 
> > > > I.e. the first entry in the callchain is 'intel_idle', just like in what
> > > > Brendan called the 'summary line', i.e. redundant when what he wants is
> > > > just all the callchains and how many times they were sampled.

> > > Yep, I know.  But isn't 'perf report' all for seeing hist lines? :)

> > Well, so far, yes, but he is presenting a usecase where what we want to
> > see is just callchains, and we can achieve that rather easily, no?
 
> But it's also easy to filter from the script side.

Why not go all the way and provide just what the script wants?
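
For reference, the script-side filtering mentioned above could look roughly
like the sketch below. It assumes the output shape shown in the RFC example:
a hist entry line that starts with an overhead percentage, followed by folded
lines of the form "frame;frame;... count". The name parse_folded and both
regular expressions are illustrative only, they are not part of perf.

```python
import re

# Assumption: a folded line is "frame1;frame2;... <count>", as in the RFC example.
FOLDED_RE = re.compile(r"^\s*(?P<stack>\S.*?)\s+(?P<count>\d+)\s*$")
# Assumption: a hist entry line starts with an overhead percentage, e.g. "39.93%".
HIST_RE = re.compile(r"^\s*\d+(?:\.\d+)?%")


def parse_folded(lines):
    """Yield (frames, count) for folded callchain lines, skipping all other lines."""
    for line in lines:
        stripped = line.strip()
        # Drop blank lines, '#' header lines and hist entry lines.
        if not stripped or stripped.startswith("#") or HIST_RE.match(line):
            continue
        match = FOLDED_RE.match(line)
        if match is None:
            # Anything without a trailing integer (e.g. a bare "...") is ignored.
            continue
        yield match.group("stack").split(";"), int(match.group("count"))


if __name__ == "__main__":
    sample = [
        "    39.93%     80  swapper  [kernel.vmlinux]  [k] intel_idle",
        "  intel_idle;cpuidle_enter_state;cpuidle_enter 57",
    ]
    for frames, count in parse_folded(sample):
        print(count, " <- ".join(frames))
```

If the hist entry lines were suppressed at the source, the HIST_RE check would
simply never match and the rest of the sketch would stay the same.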
 
> > > I'm not insisting on it strongly, but it's a bit strange for me if perf
> > > report doesn't show any hist lines..
> > 
> > If that is of no use in this use case, why not?
> 
> Well, I think FlameGraphs is a rather unusual case and folded output
> seems useful to other use cases too.

Sure thing, I agreed with that, it's just one flag to tell if the
hist_entry__snprintf should be used or not.

> > > > > have separate output mode..
> > > >  
> > > > > Brendan, I guess you still need to know other info like cpu or pid, no?
> > > > 
> > > > Possibly, but just with the callchains he has enough info for the basic
> > > > flame graph, no?
> > > >  
> > > > > And I feel like it'd be better to put the count before the callchains
> > > > > for consistency like below.  Is it OK to you?
> > > > 
> > > > Consistency with what?
> > > 
> > > Oh, I meant consistency with other callchain output style like graph,
> > > fractal or flat - They all show the numbers before callchains.  And I
> > > think it's easier to read for humans. :)
> > 
> > Well, as I said, isn't the main object here the callchain? :-)
> > 
> > And Brendan's request is for something to be consumed by scripts, i.e.
> > something like we have for perf stat:
> > 
> > For humans:
> > 
> > [root@felicio ~]# perf stat -e cycles -I 1000 -a
> > #           time             counts unit events
> >      1.000304391          1,820,038      cycles                   
> >      2.000490191      1,005,477,007      cycles                   
> >      3.000657813          1,717,007      cycles                   
> > ^C     3.917890293          2,804,034      cycles                   
> > 
> > For machines/scripts:
> > 
> > [root@felicio ~]# perf stat -x, -e cycles -I 1000 -a
> >      1.000291954,1923360,,cycles,3998167210,100.00
> >      2.000477154,1005608105,,cycles,3998475482,100.00
> >      3.000612612,1345483,,cycles,3998332391,100.00
> >      4.000744469,1005046913,,cycles,3998258199,100.00
> > ^C     4.331684347,1551327,,cycles,3463190970,100.00
> > 
> > [root@felicio ~]#
 
> Yes, I thought about it too.  Maybe -t/--field-separator option can be
> used to separate folded callchains too.

What I meant here was: for humans, we don't want a field separator, and
we want headers, we want alignment, etc, while for scripts, it's better
to have something easily parseable, with one record per line, no
alignment needed, etc.
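
To make the "for machines/scripts" case concrete, here is a minimal sketch
that reads the 'perf stat -x,' lines quoted above. It only names the first
four fields, matching the columns of the human-readable header (time, counts,
unit, events), and keeps whatever follows as opaque extras. The function name
parse_perf_stat_csv and the handling of the "^C" prefix are assumptions made
for this example.

```python
import csv


def parse_perf_stat_csv(lines, sep=","):
    """Parse 'perf stat -x<sep> -I ...' interval lines into dictionaries."""
    rows = []
    for line in lines:
        line = line.strip()
        # The terminal echoes "^C" in front of the last record in the example.
        if line.startswith("^C"):
            line = line[2:].strip()
        if not line or line.startswith("#"):
            continue
        fields = next(csv.reader([line], delimiter=sep))
        if len(fields) < 4:
            continue  # not a record in the assumed shape
        time_s, count, unit, event = fields[:4]
        try:
            value = int(count)
        except ValueError:
            value = None  # non-numeric counter value, kept as "unknown"
        rows.append({
            "time": float(time_s),
            "count": value,
            "unit": unit,
            "event": event,
            "extra": fields[4:],  # remaining fields, left uninterpreted here
        })
    return rows


if __name__ == "__main__":
    sample = [
        "     1.000291954,1923360,,cycles,3998167210,100.00",
        "^C     4.331684347,1551327,,cycles,3463190970,100.00",
    ]
    for row in parse_perf_stat_csv(sample):
        print(row["time"], row["event"], row["count"])
```

One record per line and a fixed separator is all the script needs; no header
or alignment handling is required.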
 
> > > > The main thing here is the callchain, all the other stuff are things
> > > > related to it, so showing it first makes sense to me.
> > > > 
> > > > Having some way to list the desired info to have for each callchain may
> > > > be interesting, and if he could do it like:
> > > > 
> > > >    -g folded,count,cpu,other,fields
> > > > 
> > > > then he would know how to parse the per-callchain info at the end of
> > > > each line, right?
> > > 
> > > Hmm.. looks like it ends up having redundant info.  I don't think
> > 
> > What is redundant, and with what?
> 
> When it's used with normal perf report cases, that other info in
> callchain lines is redundant with the hist lines.  Also if a hist entry has

Sure, but if the user doesn't want to see the output of
hist_entry__snprintf()... :-)

> many callchains, each callchain line will have the same info in the other fields.

Sure, but that would be what the script expects to consume, i.e. one
line per callchain.
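
A rough sketch of such a consumer follows. The line format here is purely
hypothetical: it assumes each line is "frame;frame;... v1,v2,...", with the
trailing values in the same order as the field names given after "folded" in
a '-g folded,count,cpu' style request. Nothing in perf emits this today, and
the names parse_callchain_line and total_counts are made up for the example.

```python
from collections import Counter


def parse_callchain_line(line, field_names):
    """Split one hypothetical 'stack v1,v2,...' line into (frames, fields)."""
    stack, _, raw = line.strip().rpartition(" ")
    values = raw.split(",")
    if not stack or len(values) != len(field_names):
        raise ValueError("unexpected line: %r" % line)
    return stack.split(";"), dict(zip(field_names, values))


def total_counts(lines, field_names):
    """Sum the 'count' field per callchain, ignoring the other fields."""
    totals = Counter()
    for line in lines:
        frames, fields = parse_callchain_line(line, field_names)
        totals[";".join(frames)] += int(fields["count"])
    return totals


if __name__ == "__main__":
    # Hypothetical per-callchain lines carrying "count,cpu" after the stack.
    sample = ["a;b;c 57,0", "a;b;c 23,1", "a;d 5,0"]
    for stack, count in total_counts(sample, ["count", "cpu"]).items():
        print(stack, count)
```

Because every line is self-describing, repeated per-callchain values cost
some bytes but keep the consumer this small.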

> > > it's generally useful to other 'perf report' stuffs.  Wouldn't it be
> > > better just adding minimal support and let the external tool parse the
> > > output?
> > 
> > Oh well, perhaps we could have a 'perf callchain' tool that would be
> > centered on callchains and would provide one line per callchain, which
> > would have:
> > 
> > callchain;separated;colons series,of,desired,fields,for,this,callchain
> > 
> > Which would reuse heavily the 'perf report' / 'perf top' code for
> > histograms, no?
 
> I guess the callchain code is pretty isolated or can be isolated
> easily though.
 
> > I still think that this is a 'perf report' thing, but one that is
> > centered on callchains, and that is to be consumed by scripts, not
> > humans.
 
> Agreed.
 
> I'm just looking for a way to support it with minimal change. :)

Hey, me too. A --no-hists flag looks like a quickie, no need to isolate
callchain code, or anything like that, just one long option switch and
we get what we need.

- Arnaldo
