linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH] perf annotate: Support to display the LBR data in tui mode
  2018-02-23 14:35 [PATCH] perf annotate: Support to display the LBR data in tui mode Jin Yao
@ 2018-02-23  8:25 ` Peter Zijlstra
  2018-02-23  9:08   ` Jin, Yao
  2018-02-23 15:29   ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 12+ messages in thread
From: Peter Zijlstra @ 2018-02-23  8:25 UTC (permalink / raw)
  To: Jin Yao
  Cc: acme, jolsa, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Fri, Feb 23, 2018 at 10:35:58PM +0800, Jin Yao wrote:
> Unlike the perf report interactive annotate mode, the perf annotate
> doesn't display the LBR data.
> 
> perf record -b ...
> perf annotate function
> 
> It should show IPC/cycle, but it doesn't.

There is far more than IPC/cycle for the LBR data, so this Changelog is
misleading.

Also, I think that this patch goes the wrong way, we should reduce the
divergence of the various modes, not make it worse.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] perf annotate: Support to display the LBR data in tui mode
  2018-02-23  8:25 ` Peter Zijlstra
@ 2018-02-23  9:08   ` Jin, Yao
  2018-02-23 15:29   ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 12+ messages in thread
From: Jin, Yao @ 2018-02-23  9:08 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: acme, jolsa, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 2/23/2018 4:25 PM, Peter Zijlstra wrote:
> On Fri, Feb 23, 2018 at 10:35:58PM +0800, Jin Yao wrote:
>> Unlike the perf report interactive annotate mode, the perf annotate
>> doesn't display the LBR data.
>>
>> perf record -b ...
>> perf annotate function
>>
>> It should show IPC/cycle, but it doesn't.
> 
> There is far more than IPC/cycle for the LBR data, so this Changelog is
> misleading.
> 

I will change the changelog to make it more clear.

> Also, I think that this patch goes the wrong way, we should reduce the
> divergence of the various modes, not make it worse.
> 
I do plan to support stdio mode. While stdio mode needs more changes 
than tui mode, so I plan to do it in a follow-up patch.

Posting this patch now is because I want to listen from community first 
for this feature. If the tui patch could be accepted, then it's worth 
putting more efforts on stdio version. That's my thoughts.

Thanks
Jin Yao

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH] perf annotate: Support to display the LBR data in tui mode
@ 2018-02-23 14:35 Jin Yao
  2018-02-23  8:25 ` Peter Zijlstra
  0 siblings, 1 reply; 12+ messages in thread
From: Jin Yao @ 2018-02-23 14:35 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

Unlike the perf report interactive annotate mode, the perf annotate
doesn't display the LBR data.

perf record -b ...
perf annotate function

It should show IPC/cycle, but it doesn't.

This patch lets perf annotate support the displaying of LBR data.

For example,

perf annotate compute_flag

Percent│ IPC Cycle
       │
       │
       │                Disassembly of section .text:
       │
       │                0000000000400640 <compute_flag>:
       │                compute_flag():
       │                volatile int count;
       │                static unsigned int s_randseed;
       │
       │                __attribute__((noinline))
       │                int compute_flag()
       │                {
 22.96 │1.18   584        sub    $0x8,%rsp
       │                        int i;
       │
       │                        i = rand() % 2;
 23.02 │1.18     1      → callq  rand@plt
       │
       │                        return i;
 27.05 │3.37              mov    %eax,%edx
       │                }
       │3.37              add    $0x8,%rsp
       │                {
       │                        int i;
       │
       │                        i = rand() % 2;
       │
       │                        return i;
       │3.37              shr    $0x1f,%edx
       │3.37              add    %edx,%eax
       │3.37              and    $0x1,%eax
       │3.37              sub    %edx,%eax
       │                }
 26.97 │3.37     2      ← retq

Note that, this patch only supports tui mode. For other mode like --stdio,
it just keeps original behavior.

perf annotate compute_flag --stdio

 Percent |      Source code & Disassembly of div for cycles:ppp (7993 samples)
------------------------------------------------------------------------------
         :
         :
         :
         :            Disassembly of section .text:
         :
         :            0000000000400640 <compute_flag>:
         :            compute_flag():
         :            volatile int count;
         :            static unsigned int s_randseed;
         :
         :            __attribute__((noinline))
         :            int compute_flag()
         :            {
    0.29 :   400640:       sub    $0x8,%rsp     # +100.00%
         :                    int i;
         :
         :                    i = rand() % 2;
   42.93 :   400644:       callq  400490 <rand@plt>     # -100.00% (p:100.00%)
         :
         :                    return i;
    0.10 :   400649:       mov    %eax,%edx     # +100.00%
         :            }
    0.94 :   40064b:       add    $0x8,%rsp
         :            {
         :                    int i;
         :
         :                    i = rand() % 2;
         :
         :                    return i;
   27.02 :   40064f:       shr    $0x1f,%edx
    0.15 :   400652:       add    %edx,%eax
    1.24 :   400654:       and    $0x1,%eax
    2.08 :   400657:       sub    %edx,%eax
         :            }
   25.26 :   400659:       retq # -100.00% (p:100.00%)

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-annotate.c | 88 ++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 82 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index f15731a..ead6ae4 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -44,6 +44,7 @@ struct perf_annotate {
 	bool	   full_paths;
 	bool	   print_line;
 	bool	   skip_missing;
+	bool	   has_br_stack;
 	const char *sym_hist_filter;
 	const char *cpu_list;
 	DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
@@ -146,16 +147,73 @@ static void process_branch_stack(struct branch_stack *bs, struct addr_location *
 	free(bi);
 }
 
+static int hist_iter__branch_callback(struct hist_entry_iter *iter,
+				      struct addr_location *al __maybe_unused,
+				      bool single __maybe_unused,
+				      void *arg __maybe_unused)
+{
+	struct hist_entry *he = iter->he;
+	struct branch_info *bi;
+	struct perf_sample *sample = iter->sample;
+	struct perf_evsel *evsel = iter->evsel;
+	int err;
+
+	hist__account_cycles(sample->branch_stack, al, sample, false);
+
+	bi = he->branch_info;
+	err = addr_map_symbol__inc_samples(&bi->from, sample, evsel->idx);
+
+	if (err)
+		goto out;
+
+	err = addr_map_symbol__inc_samples(&bi->to, sample, evsel->idx);
+
+out:
+	return err;
+}
+
+static int process_branch_callback(struct perf_evsel *evsel,
+				   struct perf_sample *sample,
+				   struct addr_location *al __maybe_unused,
+				   struct perf_annotate *ann,
+				   struct machine *machine)
+{
+	struct hist_entry_iter iter = {
+		.evsel		= evsel,
+		.sample		= sample,
+		.add_entry_cb	= hist_iter__branch_callback,
+		.hide_unresolved	= symbol_conf.hide_unresolved,
+		.ops		= &hist_iter_branch,
+	};
+
+	struct addr_location a;
+	int ret;
+
+	if (machine__resolve(machine, &a, sample) < 0)
+		return -1;
+
+	if (a.sym == NULL)
+		return 0;
+
+	if (a.map != NULL)
+		a.map->dso->hit = 1;
+
+	ret = hist_entry_iter__add(&iter, &a, PERF_MAX_STACK_DEPTH, ann);
+	return ret;
+}
+
 static int perf_evsel__add_sample(struct perf_evsel *evsel,
 				  struct perf_sample *sample,
 				  struct addr_location *al,
-				  struct perf_annotate *ann)
+				  struct perf_annotate *ann,
+				  struct machine *machine)
 {
 	struct hists *hists = evsel__hists(evsel);
 	struct hist_entry *he;
 	int ret;
 
-	if (ann->sym_hist_filter != NULL &&
+	if ((!ann->has_br_stack || !ui__has_annotation()) &&
+	    ann->sym_hist_filter != NULL &&
 	    (al->sym == NULL ||
 	     strcmp(ann->sym_hist_filter, al->sym->name) != 0)) {
 		/* We're only interested in a symbol named sym_hist_filter */
@@ -178,6 +236,9 @@ static int perf_evsel__add_sample(struct perf_evsel *evsel,
 	 */
 	process_branch_stack(sample->branch_stack, al, sample);
 
+	if (ann->has_br_stack && ui__has_annotation())
+		return process_branch_callback(evsel, sample, al, ann, machine);
+
 	he = hists__add_entry(hists, al, NULL, NULL, NULL, sample, true);
 	if (he == NULL)
 		return -ENOMEM;
@@ -206,7 +267,8 @@ static int process_sample_event(struct perf_tool *tool,
 	if (ann->cpu_list && !test_bit(sample->cpu, ann->cpu_bitmap))
 		goto out_put;
 
-	if (!al.filtered && perf_evsel__add_sample(evsel, sample, &al, ann)) {
+	if (!al.filtered &&
+	    perf_evsel__add_sample(evsel, sample, &al, ann, machine)) {
 		pr_warning("problem incrementing symbol count, "
 			   "skipping event\n");
 		ret = -1;
@@ -238,6 +300,10 @@ static void hists__find_annotations(struct hists *hists,
 		if (he->ms.sym == NULL || he->ms.map->dso->annotate_warned)
 			goto find_next;
 
+		if (ann->sym_hist_filter &&
+		    (strcmp(he->ms.sym->name, ann->sym_hist_filter) != 0))
+			goto find_next;
+
 		notes = symbol__annotation(he->ms.sym);
 		if (notes->src == NULL) {
 find_next:
@@ -269,6 +335,7 @@ static void hists__find_annotations(struct hists *hists,
 			nd = rb_next(nd);
 		} else if (use_browser == 1) {
 			key = hist_entry__tui_annotate(he, evsel, NULL);
+
 			switch (key) {
 			case -1:
 				if (!ann->skip_missing)
@@ -489,6 +556,9 @@ int cmd_annotate(int argc, const char **argv)
 	if (annotate.session == NULL)
 		return -1;
 
+	annotate.has_br_stack = perf_header__has_feat(&annotate.session->header,
+						      HEADER_BRANCH_STACK);
+
 	ret = symbol__annotation_init();
 	if (ret < 0)
 		goto out_delete;
@@ -499,9 +569,6 @@ int cmd_annotate(int argc, const char **argv)
 	if (ret < 0)
 		goto out_delete;
 
-	if (setup_sorting(NULL) < 0)
-		usage_with_options(annotate_usage, options);
-
 	if (annotate.use_stdio)
 		use_browser = 0;
 	else if (annotate.use_tui)
@@ -511,6 +578,15 @@ int cmd_annotate(int argc, const char **argv)
 
 	setup_browser(true);
 
+	if (use_browser == 1 && annotate.has_br_stack) {
+		sort__mode = SORT_MODE__BRANCH;
+		if (setup_sorting(annotate.session->evlist) < 0)
+			usage_with_options(annotate_usage, options);
+	} else {
+		if (setup_sorting(NULL) < 0)
+			usage_with_options(annotate_usage, options);
+	}
+
 	ret = __cmd_annotate(&annotate);
 
 out_delete:
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] perf annotate: Support to display the LBR data in tui mode
  2018-02-23  8:25 ` Peter Zijlstra
  2018-02-23  9:08   ` Jin, Yao
@ 2018-02-23 15:29   ` Arnaldo Carvalho de Melo
  2018-02-23 16:59     ` Peter Zijlstra
                       ` (2 more replies)
  1 sibling, 3 replies; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-02-23 15:29 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jin Yao, jolsa, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

Em Fri, Feb 23, 2018 at 09:25:00AM +0100, Peter Zijlstra escreveu:
> On Fri, Feb 23, 2018 at 10:35:58PM +0800, Jin Yao wrote:
> > Unlike the perf report interactive annotate mode, the perf annotate
> > doesn't display the LBR data.

> > perf record -b ...
> > perf annotate function

> > It should show IPC/cycle, but it doesn't.

> There is far more than IPC/cycle for the LBR data, so this Changelog is
> misleading.

> Also, I think that this patch goes the wrong way, we should reduce the
> divergence of the various modes, not make it worse.

Right, Peter, what would you think if I made --stdio use the same
routines used to format the TUI, i.e. stdio would be equal to the TUI
modulo de navigation/jump arrows, etc.

We would have switches to provide the TUI output options that make sense
for non-interactive mode, like:

  J   Toggle showing number of jump sources on targets
  o   Toggle disassembler output/simplified view
  s   Toggle source code view
  t   Circulate percent, total period, samples view 
  k   Toggle line numbers

And would still have e --stdio-classic (deprecated), that we would keep
for a while.

I think that this new mode with "dissassembler output" would be the same
as the current --stdio, I'll check.

To further clarify, this wouldn't use any ncurses/slang TUI code, just
plain printf with things formatted using what is used now for the TUI
mode.

This way there would never be any drift amongst the output modes and we
would have less work to do when implementing new stuff like this LBR
case.

What do you think?

- Arnaldo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] perf annotate: Support to display the LBR data in tui mode
  2018-02-23 15:29   ` Arnaldo Carvalho de Melo
@ 2018-02-23 16:59     ` Peter Zijlstra
  2018-02-23 17:02     ` Andi Kleen
  2018-02-24  1:40     ` [PATCH] perf annotate: Support to display the LBR data in tui mode Jin, Yao
  2 siblings, 0 replies; 12+ messages in thread
From: Peter Zijlstra @ 2018-02-23 16:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jin Yao, jolsa, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Fri, Feb 23, 2018 at 12:29:06PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Fri, Feb 23, 2018 at 09:25:00AM +0100, Peter Zijlstra escreveu:
> > On Fri, Feb 23, 2018 at 10:35:58PM +0800, Jin Yao wrote:
> > > Unlike the perf report interactive annotate mode, the perf annotate
> > > doesn't display the LBR data.
> 
> > > perf record -b ...
> > > perf annotate function
> 
> > > It should show IPC/cycle, but it doesn't.
> 
> > There is far more than IPC/cycle for the LBR data, so this Changelog is
> > misleading.
> 
> > Also, I think that this patch goes the wrong way, we should reduce the
> > divergence of the various modes, not make it worse.
> 
> Right, Peter, what would you think if I made --stdio use the same
> routines used to format the TUI, i.e. stdio would be equal to the TUI
> modulo de navigation/jump arrows, etc.

Ideally we'd share the whole lot between stdio/TUI/GUI. That said, I
think stdio currently has a bunch of features that the other lack (my
fault).

> We would have switches to provide the TUI output options that make sense
> for non-interactive mode, like:
> 
>   J   Toggle showing number of jump sources on targets
>   o   Toggle disassembler output/simplified view
>   s   Toggle source code view
>   t   Circulate percent, total period, samples view 
>   k   Toggle line numbers

I really have no idea what you're talking about; this is because I've
just _never_ seen TUI mode. I exclusively use stdio.

> I think that this new mode with "dissassembler output" would be the same
> as the current --stdio, I'll check.

When I did the LBR coverage stuff I only did stdio; at the time we
talked about merging all this further, and IIRC you said you had
something like that on the TODO already so I left it there.

> This way there would never be any drift amongst the output modes and we
> would have less work to do when implementing new stuff like this LBR
> case.

Yes, I think that's the right direction, but I fear there's quite a bit
of work before we're at that point.

My only fear is that the resulting output code will be impenetrable,
there's a reason I only ever touch stdio output :/

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] perf annotate: Support to display the LBR data in tui mode
  2018-02-23 15:29   ` Arnaldo Carvalho de Melo
  2018-02-23 16:59     ` Peter Zijlstra
@ 2018-02-23 17:02     ` Andi Kleen
  2018-02-27  9:38       ` [PATCH v2] perf annotate: Support to display the IPC/Cycle " Jin Yao
  2018-02-24  1:40     ` [PATCH] perf annotate: Support to display the LBR data in tui mode Jin, Yao
  2 siblings, 1 reply; 12+ messages in thread
From: Andi Kleen @ 2018-02-23 17:02 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Jin Yao, jolsa, mingo, alexander.shishkin,
	Linux-kernel, kan.liang, yao.jin

On Fri, Feb 23, 2018 at 12:29:06PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Fri, Feb 23, 2018 at 09:25:00AM +0100, Peter Zijlstra escreveu:
> > On Fri, Feb 23, 2018 at 10:35:58PM +0800, Jin Yao wrote:
> > > Unlike the perf report interactive annotate mode, the perf annotate
> > > doesn't display the LBR data.
> 
> > > perf record -b ...
> > > perf annotate function
> 
> > > It should show IPC/cycle, but it doesn't.
> 
> > There is far more than IPC/cycle for the LBR data, so this Changelog is
> > misleading.
> 
> > Also, I think that this patch goes the wrong way, we should reduce the
> > divergence of the various modes, not make it worse.
> 
> Right, Peter, what would you think if I made --stdio use the same
> routines used to format the TUI, i.e. stdio would be equal to the TUI
> modulo de navigation/jump arrows, etc.

With the unification Jin Yao's patch would be the right thing anyways
-- it would magically work for stdio mode too. So it seems to me
merging the patch would be the right thing to do.

-Andi

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] perf annotate: Support to display the LBR data in tui mode
  2018-02-23 15:29   ` Arnaldo Carvalho de Melo
  2018-02-23 16:59     ` Peter Zijlstra
  2018-02-23 17:02     ` Andi Kleen
@ 2018-02-24  1:40     ` Jin, Yao
  2018-02-26 13:57       ` Arnaldo Carvalho de Melo
  2 siblings, 1 reply; 12+ messages in thread
From: Jin, Yao @ 2018-02-24  1:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra
  Cc: jolsa, mingo, alexander.shishkin, Linux-kernel, ak, kan.liang, yao.jin



On 2/23/2018 11:29 PM, Arnaldo Carvalho de Melo wrote:
> Em Fri, Feb 23, 2018 at 09:25:00AM +0100, Peter Zijlstra escreveu:
>> On Fri, Feb 23, 2018 at 10:35:58PM +0800, Jin Yao wrote:
>>> Unlike the perf report interactive annotate mode, the perf annotate
>>> doesn't display the LBR data.
> 
>>> perf record -b ...
>>> perf annotate function
> 
>>> It should show IPC/cycle, but it doesn't.
> 
>> There is far more than IPC/cycle for the LBR data, so this Changelog is
>> misleading.
> 
>> Also, I think that this patch goes the wrong way, we should reduce the
>> divergence of the various modes, not make it worse.
> 
> Right, Peter, what would you think if I made --stdio use the same
> routines used to format the TUI, i.e. stdio would be equal to the TUI
> modulo de navigation/jump arrows, etc.
> 
> We would have switches to provide the TUI output options that make sense
> for non-interactive mode, like:
> 
>    J   Toggle showing number of jump sources on targets
>    o   Toggle disassembler output/simplified view
>    s   Toggle source code view
>    t   Circulate percent, total period, samples view
>    k   Toggle line numbers
> 

Hi Arnaldo, looks your idea is very similar as my idea. In my 
understanding, for example, we may provide switch to tui routine like 
annotate_browser__write() and use fprintf() to replace 
ui_browser__printf()/ui_browser_write__xxx() if switch is on for stdio.

Is that your idea?

For this approach, I think, the benefit is we can reuse most of existing 
code but the disadvantage is we have to mix tui and stdio up.

Thanks
Jin Yao

> And would still have e --stdio-classic (deprecated), that we would keep
> for a while.
> 
> I think that this new mode with "dissassembler output" would be the same
> as the current --stdio, I'll check.
> 
> To further clarify, this wouldn't use any ncurses/slang TUI code, just
> plain printf with things formatted using what is used now for the TUI
> mode.
> 
> This way there would never be any drift amongst the output modes and we
> would have less work to do when implementing new stuff like this LBR
> case.
> 
> What do you think?
> 
> - Arnaldo
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] perf annotate: Support to display the LBR data in tui mode
  2018-02-24  1:40     ` [PATCH] perf annotate: Support to display the LBR data in tui mode Jin, Yao
@ 2018-02-26 13:57       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-02-26 13:57 UTC (permalink / raw)
  To: Jin, Yao
  Cc: Peter Zijlstra, jolsa, mingo, alexander.shishkin, Linux-kernel,
	ak, kan.liang, yao.jin

Em Sat, Feb 24, 2018 at 09:40:02AM +0800, Jin, Yao escreveu:
> 
> 
> On 2/23/2018 11:29 PM, Arnaldo Carvalho de Melo wrote:
> > Em Fri, Feb 23, 2018 at 09:25:00AM +0100, Peter Zijlstra escreveu:
> > > On Fri, Feb 23, 2018 at 10:35:58PM +0800, Jin Yao wrote:
> > > > Unlike the perf report interactive annotate mode, the perf annotate
> > > > doesn't display the LBR data.
> > 
> > > > perf record -b ...
> > > > perf annotate function
> > 
> > > > It should show IPC/cycle, but it doesn't.
> > 
> > > There is far more than IPC/cycle for the LBR data, so this Changelog is
> > > misleading.
> > 
> > > Also, I think that this patch goes the wrong way, we should reduce the
> > > divergence of the various modes, not make it worse.
> > 
> > Right, Peter, what would you think if I made --stdio use the same
> > routines used to format the TUI, i.e. stdio would be equal to the TUI
> > modulo de navigation/jump arrows, etc.
> > 
> > We would have switches to provide the TUI output options that make sense
> > for non-interactive mode, like:
> > 
> >    J   Toggle showing number of jump sources on targets
> >    o   Toggle disassembler output/simplified view
> >    s   Toggle source code view
> >    t   Circulate percent, total period, samples view
> >    k   Toggle line numbers
> > 
> 
> Hi Arnaldo, looks your idea is very similar as my idea. In my understanding,
> for example, we may provide switch to tui routine like
> annotate_browser__write() and use fprintf() to replace
> ui_browser__printf()/ui_browser_write__xxx() if switch is on for stdio.
> 
> Is that your idea?

Yes, right now the TUI simply uses foo__scnprintf() routines to then
pass formatted buffers to the TUI routines, we would then just pass them
to plain fprintf(sdtout).
 
> For this approach, I think, the benefit is we can reuse most of existing
> code but the disadvantage is we have to mix tui and stdio up.

- Arnaldo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2] perf annotate: Support to display the IPC/Cycle in tui mode
@ 2018-02-27  9:38       ` Jin Yao
  2018-03-08  5:22         ` Jin, Yao
  2018-03-09  8:57         ` [tip:perf/core] perf annotate: Support to display the IPC/Cycle in TUI mode tip-bot for Jin Yao
  0 siblings, 2 replies; 12+ messages in thread
From: Jin Yao @ 2018-02-27  9:38 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

Unlike the perf report interactive annotate mode, the perf annotate
doesn't display the IPC/Cycle even if branch info is recorded in perf
data file.

perf record -b ...
perf annotate function

It should show IPC/cycle, but it doesn't.

This patch lets perf annotate support the displaying of IPC/Cycle if
branch info is in perf data.

For example,

perf annotate compute_flag

Percent│ IPC Cycle
       │
       │
       │                Disassembly of section .text:
       │
       │                0000000000400640 <compute_flag>:
       │                compute_flag():
       │                volatile int count;
       │                static unsigned int s_randseed;
       │
       │                __attribute__((noinline))
       │                int compute_flag()
       │                {
 22.96 │1.18   584        sub    $0x8,%rsp
       │                        int i;
       │
       │                        i = rand() % 2;
 23.02 │1.18     1      → callq  rand@plt
       │
       │                        return i;
 27.05 │3.37              mov    %eax,%edx
       │                }
       │3.37              add    $0x8,%rsp
       │                {
       │                        int i;
       │
       │                        i = rand() % 2;
       │
       │                        return i;
       │3.37              shr    $0x1f,%edx
       │3.37              add    %edx,%eax
       │3.37              and    $0x1,%eax
       │3.37              sub    %edx,%eax
       │                }
 26.97 │3.37     2      ← retq

Note that, this patch only supports tui mode. For stdio,
now it just keeps original behavior. Will support it in a
follow-up patch.

perf annotate compute_flag --stdio

 Percent |      Source code & Disassembly of div for cycles:ppp (7993 samples)
------------------------------------------------------------------------------
         :
         :
         :
         :            Disassembly of section .text:
         :
         :            0000000000400640 <compute_flag>:
         :            compute_flag():
         :            volatile int count;
         :            static unsigned int s_randseed;
         :
         :            __attribute__((noinline))
         :            int compute_flag()
         :            {
    0.29 :   400640:       sub    $0x8,%rsp     # +100.00%
         :                    int i;
         :
         :                    i = rand() % 2;
   42.93 :   400644:       callq  400490 <rand@plt>     # -100.00% (p:100.00%)
         :
         :                    return i;
    0.10 :   400649:       mov    %eax,%edx     # +100.00%
         :            }
    0.94 :   40064b:       add    $0x8,%rsp
         :            {
         :                    int i;
         :
         :                    i = rand() % 2;
         :
         :                    return i;
   27.02 :   40064f:       shr    $0x1f,%edx
    0.15 :   400652:       add    %edx,%eax
    1.24 :   400654:       and    $0x1,%eax
    2.08 :   400657:       sub    %edx,%eax
         :            }
   25.26 :   400659:       retq # -100.00% (p:100.00%)

Change-log:
-----------
v2:
---
No code change. Just change the patch description to make it more clear.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-annotate.c | 88 ++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 82 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index f15731a..ead6ae4 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -44,6 +44,7 @@ struct perf_annotate {
 	bool	   full_paths;
 	bool	   print_line;
 	bool	   skip_missing;
+	bool	   has_br_stack;
 	const char *sym_hist_filter;
 	const char *cpu_list;
 	DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
@@ -146,16 +147,73 @@ static void process_branch_stack(struct branch_stack *bs, struct addr_location *
 	free(bi);
 }
 
+static int hist_iter__branch_callback(struct hist_entry_iter *iter,
+				      struct addr_location *al __maybe_unused,
+				      bool single __maybe_unused,
+				      void *arg __maybe_unused)
+{
+	struct hist_entry *he = iter->he;
+	struct branch_info *bi;
+	struct perf_sample *sample = iter->sample;
+	struct perf_evsel *evsel = iter->evsel;
+	int err;
+
+	hist__account_cycles(sample->branch_stack, al, sample, false);
+
+	bi = he->branch_info;
+	err = addr_map_symbol__inc_samples(&bi->from, sample, evsel->idx);
+
+	if (err)
+		goto out;
+
+	err = addr_map_symbol__inc_samples(&bi->to, sample, evsel->idx);
+
+out:
+	return err;
+}
+
+static int process_branch_callback(struct perf_evsel *evsel,
+				   struct perf_sample *sample,
+				   struct addr_location *al __maybe_unused,
+				   struct perf_annotate *ann,
+				   struct machine *machine)
+{
+	struct hist_entry_iter iter = {
+		.evsel		= evsel,
+		.sample		= sample,
+		.add_entry_cb	= hist_iter__branch_callback,
+		.hide_unresolved	= symbol_conf.hide_unresolved,
+		.ops		= &hist_iter_branch,
+	};
+
+	struct addr_location a;
+	int ret;
+
+	if (machine__resolve(machine, &a, sample) < 0)
+		return -1;
+
+	if (a.sym == NULL)
+		return 0;
+
+	if (a.map != NULL)
+		a.map->dso->hit = 1;
+
+	ret = hist_entry_iter__add(&iter, &a, PERF_MAX_STACK_DEPTH, ann);
+	return ret;
+}
+
 static int perf_evsel__add_sample(struct perf_evsel *evsel,
 				  struct perf_sample *sample,
 				  struct addr_location *al,
-				  struct perf_annotate *ann)
+				  struct perf_annotate *ann,
+				  struct machine *machine)
 {
 	struct hists *hists = evsel__hists(evsel);
 	struct hist_entry *he;
 	int ret;
 
-	if (ann->sym_hist_filter != NULL &&
+	if ((!ann->has_br_stack || !ui__has_annotation()) &&
+	    ann->sym_hist_filter != NULL &&
 	    (al->sym == NULL ||
 	     strcmp(ann->sym_hist_filter, al->sym->name) != 0)) {
 		/* We're only interested in a symbol named sym_hist_filter */
@@ -178,6 +236,9 @@ static int perf_evsel__add_sample(struct perf_evsel *evsel,
 	 */
 	process_branch_stack(sample->branch_stack, al, sample);
 
+	if (ann->has_br_stack && ui__has_annotation())
+		return process_branch_callback(evsel, sample, al, ann, machine);
+
 	he = hists__add_entry(hists, al, NULL, NULL, NULL, sample, true);
 	if (he == NULL)
 		return -ENOMEM;
@@ -206,7 +267,8 @@ static int process_sample_event(struct perf_tool *tool,
 	if (ann->cpu_list && !test_bit(sample->cpu, ann->cpu_bitmap))
 		goto out_put;
 
-	if (!al.filtered && perf_evsel__add_sample(evsel, sample, &al, ann)) {
+	if (!al.filtered &&
+	    perf_evsel__add_sample(evsel, sample, &al, ann, machine)) {
 		pr_warning("problem incrementing symbol count, "
 			   "skipping event\n");
 		ret = -1;
@@ -238,6 +300,10 @@ static void hists__find_annotations(struct hists *hists,
 		if (he->ms.sym == NULL || he->ms.map->dso->annotate_warned)
 			goto find_next;
 
+		if (ann->sym_hist_filter &&
+		    (strcmp(he->ms.sym->name, ann->sym_hist_filter) != 0))
+			goto find_next;
+
 		notes = symbol__annotation(he->ms.sym);
 		if (notes->src == NULL) {
 find_next:
@@ -269,6 +335,7 @@ static void hists__find_annotations(struct hists *hists,
 			nd = rb_next(nd);
 		} else if (use_browser == 1) {
 			key = hist_entry__tui_annotate(he, evsel, NULL);
+
 			switch (key) {
 			case -1:
 				if (!ann->skip_missing)
@@ -489,6 +556,9 @@ int cmd_annotate(int argc, const char **argv)
 	if (annotate.session == NULL)
 		return -1;
 
+	annotate.has_br_stack = perf_header__has_feat(&annotate.session->header,
+						      HEADER_BRANCH_STACK);
+
 	ret = symbol__annotation_init();
 	if (ret < 0)
 		goto out_delete;
@@ -499,9 +569,6 @@ int cmd_annotate(int argc, const char **argv)
 	if (ret < 0)
 		goto out_delete;
 
-	if (setup_sorting(NULL) < 0)
-		usage_with_options(annotate_usage, options);
-
 	if (annotate.use_stdio)
 		use_browser = 0;
 	else if (annotate.use_tui)
@@ -511,6 +578,15 @@ int cmd_annotate(int argc, const char **argv)
 
 	setup_browser(true);
 
+	if (use_browser == 1 && annotate.has_br_stack) {
+		sort__mode = SORT_MODE__BRANCH;
+		if (setup_sorting(annotate.session->evlist) < 0)
+			usage_with_options(annotate_usage, options);
+	} else {
+		if (setup_sorting(NULL) < 0)
+			usage_with_options(annotate_usage, options);
+	}
+
 	ret = __cmd_annotate(&annotate);
 
 out_delete:
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] perf annotate: Support to display the IPC/Cycle in tui mode
  2018-02-27  9:38       ` [PATCH v2] perf annotate: Support to display the IPC/Cycle " Jin Yao
@ 2018-03-08  5:22         ` Jin, Yao
  2018-03-08 13:12           ` Arnaldo Carvalho de Melo
  2018-03-09  8:57         ` [tip:perf/core] perf annotate: Support to display the IPC/Cycle in TUI mode tip-bot for Jin Yao
  1 sibling, 1 reply; 12+ messages in thread
From: Jin, Yao @ 2018-03-08  5:22 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin

Hi,

Could this patch be accepted?

I'm working on the supporting for stdio mode. I just wish to post the 
patch series after this patch being merged.

Thanks
Jin Yao

On 2/27/2018 5:38 PM, Jin Yao wrote:
> Unlike the perf report interactive annotate mode, the perf annotate
> doesn't display the IPC/Cycle even if branch info is recorded in perf
> data file.
> 
> perf record -b ...
> perf annotate function
> 
> It should show IPC/cycle, but it doesn't.
> 
> This patch lets perf annotate support the displaying of IPC/Cycle if
> branch info is in perf data.
> 
> For example,
> 
> perf annotate compute_flag
> 
> Percent│ IPC Cycle
>         │
>         │
>         │                Disassembly of section .text:
>         │
>         │                0000000000400640 <compute_flag>:
>         │                compute_flag():
>         │                volatile int count;
>         │                static unsigned int s_randseed;
>         │
>         │                __attribute__((noinline))
>         │                int compute_flag()
>         │                {
>   22.96 │1.18   584        sub    $0x8,%rsp
>         │                        int i;
>         │
>         │                        i = rand() % 2;
>   23.02 │1.18     1      → callq  rand@plt
>         │
>         │                        return i;
>   27.05 │3.37              mov    %eax,%edx
>         │                }
>         │3.37              add    $0x8,%rsp
>         │                {
>         │                        int i;
>         │
>         │                        i = rand() % 2;
>         │
>         │                        return i;
>         │3.37              shr    $0x1f,%edx
>         │3.37              add    %edx,%eax
>         │3.37              and    $0x1,%eax
>         │3.37              sub    %edx,%eax
>         │                }
>   26.97 │3.37     2      ← retq
> 
> Note that, this patch only supports tui mode. For stdio,
> now it just keeps original behavior. Will support it in a
> follow-up patch.
> 
> perf annotate compute_flag --stdio
> 
>   Percent |      Source code & Disassembly of div for cycles:ppp (7993 samples)
> ------------------------------------------------------------------------------
>           :
>           :
>           :
>           :            Disassembly of section .text:
>           :
>           :            0000000000400640 <compute_flag>:
>           :            compute_flag():
>           :            volatile int count;
>           :            static unsigned int s_randseed;
>           :
>           :            __attribute__((noinline))
>           :            int compute_flag()
>           :            {
>      0.29 :   400640:       sub    $0x8,%rsp     # +100.00%
>           :                    int i;
>           :
>           :                    i = rand() % 2;
>     42.93 :   400644:       callq  400490 <rand@plt>     # -100.00% (p:100.00%)
>           :
>           :                    return i;
>      0.10 :   400649:       mov    %eax,%edx     # +100.00%
>           :            }
>      0.94 :   40064b:       add    $0x8,%rsp
>           :            {
>           :                    int i;
>           :
>           :                    i = rand() % 2;
>           :
>           :                    return i;
>     27.02 :   40064f:       shr    $0x1f,%edx
>      0.15 :   400652:       add    %edx,%eax
>      1.24 :   400654:       and    $0x1,%eax
>      2.08 :   400657:       sub    %edx,%eax
>           :            }
>     25.26 :   400659:       retq # -100.00% (p:100.00%)
> 
> Change-log:
> -----------
> v2:
> ---
> No code change. Just change the patch description to make it more clear.
> 
> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
> ---
>   tools/perf/builtin-annotate.c | 88 ++++++++++++++++++++++++++++++++++++++++---
>   1 file changed, 82 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
> index f15731a..ead6ae4 100644
> --- a/tools/perf/builtin-annotate.c
> +++ b/tools/perf/builtin-annotate.c
> @@ -44,6 +44,7 @@ struct perf_annotate {
>   	bool	   full_paths;
>   	bool	   print_line;
>   	bool	   skip_missing;
> +	bool	   has_br_stack;
>   	const char *sym_hist_filter;
>   	const char *cpu_list;
>   	DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
> @@ -146,16 +147,73 @@ static void process_branch_stack(struct branch_stack *bs, struct addr_location *
>   	free(bi);
>   }
>   
> +static int hist_iter__branch_callback(struct hist_entry_iter *iter,
> +				      struct addr_location *al __maybe_unused,
> +				      bool single __maybe_unused,
> +				      void *arg __maybe_unused)
> +{
> +	struct hist_entry *he = iter->he;
> +	struct branch_info *bi;
> +	struct perf_sample *sample = iter->sample;
> +	struct perf_evsel *evsel = iter->evsel;
> +	int err;
> +
> +	hist__account_cycles(sample->branch_stack, al, sample, false);
> +
> +	bi = he->branch_info;
> +	err = addr_map_symbol__inc_samples(&bi->from, sample, evsel->idx);
> +
> +	if (err)
> +		goto out;
> +
> +	err = addr_map_symbol__inc_samples(&bi->to, sample, evsel->idx);
> +
> +out:
> +	return err;
> +}
> +
> +static int process_branch_callback(struct perf_evsel *evsel,
> +				   struct perf_sample *sample,
> +				   struct addr_location *al __maybe_unused,
> +				   struct perf_annotate *ann,
> +				   struct machine *machine)
> +{
> +	struct hist_entry_iter iter = {
> +		.evsel		= evsel,
> +		.sample		= sample,
> +		.add_entry_cb	= hist_iter__branch_callback,
> +		.hide_unresolved	= symbol_conf.hide_unresolved,
> +		.ops		= &hist_iter_branch,
> +	};
> +
> +	struct addr_location a;
> +	int ret;
> +
> +	if (machine__resolve(machine, &a, sample) < 0)
> +		return -1;
> +
> +	if (a.sym == NULL)
> +		return 0;
> +
> +	if (a.map != NULL)
> +		a.map->dso->hit = 1;
> +
> +	ret = hist_entry_iter__add(&iter, &a, PERF_MAX_STACK_DEPTH, ann);
> +	return ret;
> +}
> +
>   static int perf_evsel__add_sample(struct perf_evsel *evsel,
>   				  struct perf_sample *sample,
>   				  struct addr_location *al,
> -				  struct perf_annotate *ann)
> +				  struct perf_annotate *ann,
> +				  struct machine *machine)
>   {
>   	struct hists *hists = evsel__hists(evsel);
>   	struct hist_entry *he;
>   	int ret;
>   
> -	if (ann->sym_hist_filter != NULL &&
> +	if ((!ann->has_br_stack || !ui__has_annotation()) &&
> +	    ann->sym_hist_filter != NULL &&
>   	    (al->sym == NULL ||
>   	     strcmp(ann->sym_hist_filter, al->sym->name) != 0)) {
>   		/* We're only interested in a symbol named sym_hist_filter */
> @@ -178,6 +236,9 @@ static int perf_evsel__add_sample(struct perf_evsel *evsel,
>   	 */
>   	process_branch_stack(sample->branch_stack, al, sample);
>   
> +	if (ann->has_br_stack && ui__has_annotation())
> +		return process_branch_callback(evsel, sample, al, ann, machine);
> +
>   	he = hists__add_entry(hists, al, NULL, NULL, NULL, sample, true);
>   	if (he == NULL)
>   		return -ENOMEM;
> @@ -206,7 +267,8 @@ static int process_sample_event(struct perf_tool *tool,
>   	if (ann->cpu_list && !test_bit(sample->cpu, ann->cpu_bitmap))
>   		goto out_put;
>   
> -	if (!al.filtered && perf_evsel__add_sample(evsel, sample, &al, ann)) {
> +	if (!al.filtered &&
> +	    perf_evsel__add_sample(evsel, sample, &al, ann, machine)) {
>   		pr_warning("problem incrementing symbol count, "
>   			   "skipping event\n");
>   		ret = -1;
> @@ -238,6 +300,10 @@ static void hists__find_annotations(struct hists *hists,
>   		if (he->ms.sym == NULL || he->ms.map->dso->annotate_warned)
>   			goto find_next;
>   
> +		if (ann->sym_hist_filter &&
> +		    (strcmp(he->ms.sym->name, ann->sym_hist_filter) != 0))
> +			goto find_next;
> +
>   		notes = symbol__annotation(he->ms.sym);
>   		if (notes->src == NULL) {
>   find_next:
> @@ -269,6 +335,7 @@ static void hists__find_annotations(struct hists *hists,
>   			nd = rb_next(nd);
>   		} else if (use_browser == 1) {
>   			key = hist_entry__tui_annotate(he, evsel, NULL);
> +
>   			switch (key) {
>   			case -1:
>   				if (!ann->skip_missing)
> @@ -489,6 +556,9 @@ int cmd_annotate(int argc, const char **argv)
>   	if (annotate.session == NULL)
>   		return -1;
>   
> +	annotate.has_br_stack = perf_header__has_feat(&annotate.session->header,
> +						      HEADER_BRANCH_STACK);
> +
>   	ret = symbol__annotation_init();
>   	if (ret < 0)
>   		goto out_delete;
> @@ -499,9 +569,6 @@ int cmd_annotate(int argc, const char **argv)
>   	if (ret < 0)
>   		goto out_delete;
>   
> -	if (setup_sorting(NULL) < 0)
> -		usage_with_options(annotate_usage, options);
> -
>   	if (annotate.use_stdio)
>   		use_browser = 0;
>   	else if (annotate.use_tui)
> @@ -511,6 +578,15 @@ int cmd_annotate(int argc, const char **argv)
>   
>   	setup_browser(true);
>   
> +	if (use_browser == 1 && annotate.has_br_stack) {
> +		sort__mode = SORT_MODE__BRANCH;
> +		if (setup_sorting(annotate.session->evlist) < 0)
> +			usage_with_options(annotate_usage, options);
> +	} else {
> +		if (setup_sorting(NULL) < 0)
> +			usage_with_options(annotate_usage, options);
> +	}
> +
>   	ret = __cmd_annotate(&annotate);
>   
>   out_delete:
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] perf annotate: Support to display the IPC/Cycle in tui mode
  2018-03-08  5:22         ` Jin, Yao
@ 2018-03-08 13:12           ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-03-08 13:12 UTC (permalink / raw)
  To: Jin, Yao
  Cc: jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

Em Thu, Mar 08, 2018 at 01:22:15PM +0800, Jin, Yao escreveu:
> Hi,
> 
> Could this patch be accepted?
 
> I'm working on the supporting for stdio mode. I just wish to post the patch
> series after this patch being merged.

merged, please proceed.

- Arnaldo
 
> Thanks
> Jin Yao
> 
> On 2/27/2018 5:38 PM, Jin Yao wrote:
> > Unlike the perf report interactive annotate mode, the perf annotate
> > doesn't display the IPC/Cycle even if branch info is recorded in perf
> > data file.
> > 
> > perf record -b ...
> > perf annotate function
> > 
> > It should show IPC/cycle, but it doesn't.
> > 
> > This patch lets perf annotate support the displaying of IPC/Cycle if
> > branch info is in perf data.
> > 
> > For example,
> > 
> > perf annotate compute_flag
> > 
> > Percent│ IPC Cycle
> >         │
> >         │
> >         │                Disassembly of section .text:
> >         │
> >         │                0000000000400640 <compute_flag>:
> >         │                compute_flag():
> >         │                volatile int count;
> >         │                static unsigned int s_randseed;
> >         │
> >         │                __attribute__((noinline))
> >         │                int compute_flag()
> >         │                {
> >   22.96 │1.18   584        sub    $0x8,%rsp
> >         │                        int i;
> >         │
> >         │                        i = rand() % 2;
> >   23.02 │1.18     1      → callq  rand@plt
> >         │
> >         │                        return i;
> >   27.05 │3.37              mov    %eax,%edx
> >         │                }
> >         │3.37              add    $0x8,%rsp
> >         │                {
> >         │                        int i;
> >         │
> >         │                        i = rand() % 2;
> >         │
> >         │                        return i;
> >         │3.37              shr    $0x1f,%edx
> >         │3.37              add    %edx,%eax
> >         │3.37              and    $0x1,%eax
> >         │3.37              sub    %edx,%eax
> >         │                }
> >   26.97 │3.37     2      ← retq
> > 
> > Note that, this patch only supports tui mode. For stdio,
> > now it just keeps original behavior. Will support it in a
> > follow-up patch.
> > 
> > perf annotate compute_flag --stdio
> > 
> >   Percent |      Source code & Disassembly of div for cycles:ppp (7993 samples)
> > ------------------------------------------------------------------------------
> >           :
> >           :
> >           :
> >           :            Disassembly of section .text:
> >           :
> >           :            0000000000400640 <compute_flag>:
> >           :            compute_flag():
> >           :            volatile int count;
> >           :            static unsigned int s_randseed;
> >           :
> >           :            __attribute__((noinline))
> >           :            int compute_flag()
> >           :            {
> >      0.29 :   400640:       sub    $0x8,%rsp     # +100.00%
> >           :                    int i;
> >           :
> >           :                    i = rand() % 2;
> >     42.93 :   400644:       callq  400490 <rand@plt>     # -100.00% (p:100.00%)
> >           :
> >           :                    return i;
> >      0.10 :   400649:       mov    %eax,%edx     # +100.00%
> >           :            }
> >      0.94 :   40064b:       add    $0x8,%rsp
> >           :            {
> >           :                    int i;
> >           :
> >           :                    i = rand() % 2;
> >           :
> >           :                    return i;
> >     27.02 :   40064f:       shr    $0x1f,%edx
> >      0.15 :   400652:       add    %edx,%eax
> >      1.24 :   400654:       and    $0x1,%eax
> >      2.08 :   400657:       sub    %edx,%eax
> >           :            }
> >     25.26 :   400659:       retq # -100.00% (p:100.00%)
> > 
> > Change-log:
> > -----------
> > v2:
> > ---
> > No code change. Just change the patch description to make it more clear.
> > 
> > Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
> > ---
> >   tools/perf/builtin-annotate.c | 88 ++++++++++++++++++++++++++++++++++++++++---
> >   1 file changed, 82 insertions(+), 6 deletions(-)
> > 
> > diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
> > index f15731a..ead6ae4 100644
> > --- a/tools/perf/builtin-annotate.c
> > +++ b/tools/perf/builtin-annotate.c
> > @@ -44,6 +44,7 @@ struct perf_annotate {
> >   	bool	   full_paths;
> >   	bool	   print_line;
> >   	bool	   skip_missing;
> > +	bool	   has_br_stack;
> >   	const char *sym_hist_filter;
> >   	const char *cpu_list;
> >   	DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
> > @@ -146,16 +147,73 @@ static void process_branch_stack(struct branch_stack *bs, struct addr_location *
> >   	free(bi);
> >   }
> > +static int hist_iter__branch_callback(struct hist_entry_iter *iter,
> > +				      struct addr_location *al __maybe_unused,
> > +				      bool single __maybe_unused,
> > +				      void *arg __maybe_unused)
> > +{
> > +	struct hist_entry *he = iter->he;
> > +	struct branch_info *bi;
> > +	struct perf_sample *sample = iter->sample;
> > +	struct perf_evsel *evsel = iter->evsel;
> > +	int err;
> > +
> > +	hist__account_cycles(sample->branch_stack, al, sample, false);
> > +
> > +	bi = he->branch_info;
> > +	err = addr_map_symbol__inc_samples(&bi->from, sample, evsel->idx);
> > +
> > +	if (err)
> > +		goto out;
> > +
> > +	err = addr_map_symbol__inc_samples(&bi->to, sample, evsel->idx);
> > +
> > +out:
> > +	return err;
> > +}
> > +
> > +static int process_branch_callback(struct perf_evsel *evsel,
> > +				   struct perf_sample *sample,
> > +				   struct addr_location *al __maybe_unused,
> > +				   struct perf_annotate *ann,
> > +				   struct machine *machine)
> > +{
> > +	struct hist_entry_iter iter = {
> > +		.evsel		= evsel,
> > +		.sample		= sample,
> > +		.add_entry_cb	= hist_iter__branch_callback,
> > +		.hide_unresolved	= symbol_conf.hide_unresolved,
> > +		.ops		= &hist_iter_branch,
> > +	};
> > +
> > +	struct addr_location a;
> > +	int ret;
> > +
> > +	if (machine__resolve(machine, &a, sample) < 0)
> > +		return -1;
> > +
> > +	if (a.sym == NULL)
> > +		return 0;
> > +
> > +	if (a.map != NULL)
> > +		a.map->dso->hit = 1;
> > +
> > +	ret = hist_entry_iter__add(&iter, &a, PERF_MAX_STACK_DEPTH, ann);
> > +	return ret;
> > +}
> > +
> >   static int perf_evsel__add_sample(struct perf_evsel *evsel,
> >   				  struct perf_sample *sample,
> >   				  struct addr_location *al,
> > -				  struct perf_annotate *ann)
> > +				  struct perf_annotate *ann,
> > +				  struct machine *machine)
> >   {
> >   	struct hists *hists = evsel__hists(evsel);
> >   	struct hist_entry *he;
> >   	int ret;
> > -	if (ann->sym_hist_filter != NULL &&
> > +	if ((!ann->has_br_stack || !ui__has_annotation()) &&
> > +	    ann->sym_hist_filter != NULL &&
> >   	    (al->sym == NULL ||
> >   	     strcmp(ann->sym_hist_filter, al->sym->name) != 0)) {
> >   		/* We're only interested in a symbol named sym_hist_filter */
> > @@ -178,6 +236,9 @@ static int perf_evsel__add_sample(struct perf_evsel *evsel,
> >   	 */
> >   	process_branch_stack(sample->branch_stack, al, sample);
> > +	if (ann->has_br_stack && ui__has_annotation())
> > +		return process_branch_callback(evsel, sample, al, ann, machine);
> > +
> >   	he = hists__add_entry(hists, al, NULL, NULL, NULL, sample, true);
> >   	if (he == NULL)
> >   		return -ENOMEM;
> > @@ -206,7 +267,8 @@ static int process_sample_event(struct perf_tool *tool,
> >   	if (ann->cpu_list && !test_bit(sample->cpu, ann->cpu_bitmap))
> >   		goto out_put;
> > -	if (!al.filtered && perf_evsel__add_sample(evsel, sample, &al, ann)) {
> > +	if (!al.filtered &&
> > +	    perf_evsel__add_sample(evsel, sample, &al, ann, machine)) {
> >   		pr_warning("problem incrementing symbol count, "
> >   			   "skipping event\n");
> >   		ret = -1;
> > @@ -238,6 +300,10 @@ static void hists__find_annotations(struct hists *hists,
> >   		if (he->ms.sym == NULL || he->ms.map->dso->annotate_warned)
> >   			goto find_next;
> > +		if (ann->sym_hist_filter &&
> > +		    (strcmp(he->ms.sym->name, ann->sym_hist_filter) != 0))
> > +			goto find_next;
> > +
> >   		notes = symbol__annotation(he->ms.sym);
> >   		if (notes->src == NULL) {
> >   find_next:
> > @@ -269,6 +335,7 @@ static void hists__find_annotations(struct hists *hists,
> >   			nd = rb_next(nd);
> >   		} else if (use_browser == 1) {
> >   			key = hist_entry__tui_annotate(he, evsel, NULL);
> > +
> >   			switch (key) {
> >   			case -1:
> >   				if (!ann->skip_missing)
> > @@ -489,6 +556,9 @@ int cmd_annotate(int argc, const char **argv)
> >   	if (annotate.session == NULL)
> >   		return -1;
> > +	annotate.has_br_stack = perf_header__has_feat(&annotate.session->header,
> > +						      HEADER_BRANCH_STACK);
> > +
> >   	ret = symbol__annotation_init();
> >   	if (ret < 0)
> >   		goto out_delete;
> > @@ -499,9 +569,6 @@ int cmd_annotate(int argc, const char **argv)
> >   	if (ret < 0)
> >   		goto out_delete;
> > -	if (setup_sorting(NULL) < 0)
> > -		usage_with_options(annotate_usage, options);
> > -
> >   	if (annotate.use_stdio)
> >   		use_browser = 0;
> >   	else if (annotate.use_tui)
> > @@ -511,6 +578,15 @@ int cmd_annotate(int argc, const char **argv)
> >   	setup_browser(true);
> > +	if (use_browser == 1 && annotate.has_br_stack) {
> > +		sort__mode = SORT_MODE__BRANCH;
> > +		if (setup_sorting(annotate.session->evlist) < 0)
> > +			usage_with_options(annotate_usage, options);
> > +	} else {
> > +		if (setup_sorting(NULL) < 0)
> > +			usage_with_options(annotate_usage, options);
> > +	}
> > +
> >   	ret = __cmd_annotate(&annotate);
> >   out_delete:
> > 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [tip:perf/core] perf annotate: Support to display the IPC/Cycle in TUI mode
  2018-02-27  9:38       ` [PATCH v2] perf annotate: Support to display the IPC/Cycle " Jin Yao
  2018-03-08  5:22         ` Jin, Yao
@ 2018-03-09  8:57         ` tip-bot for Jin Yao
  1 sibling, 0 replies; 12+ messages in thread
From: tip-bot for Jin Yao @ 2018-03-09  8:57 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, yao.jin, mingo, tglx, ak, hpa, jolsa, peterz, acme,
	alexander.shishkin

Commit-ID:  bb848c14f80d93059cb10b1e1446cc6823d77142
Gitweb:     https://git.kernel.org/tip/bb848c14f80d93059cb10b1e1446cc6823d77142
Author:     Jin Yao <yao.jin@linux.intel.com>
AuthorDate: Tue, 27 Feb 2018 17:38:47 +0800
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 8 Mar 2018 11:30:52 -0300

perf annotate: Support to display the IPC/Cycle in TUI mode

Unlike the perf report interactive annotate mode, the perf annotate
doesn't display the IPC/Cycle even if branch info is recorded in perf
data file.

perf record -b ...
perf annotate function

It should show IPC/cycle, but it doesn't.

This patch lets perf annotate support the displaying of IPC/Cycle if
branch info is in perf data.

For example,

  perf annotate compute_flag

  Percent│ IPC Cycle
         │
         │
         │                Disassembly of section .text:
         │
         │                0000000000400640 <compute_flag>:
         │                compute_flag():
         │                volatile int count;
         │                static unsigned int s_randseed;
         │
         │                __attribute__((noinline))
         │                int compute_flag()
         │                {
   22.96 │1.18   584        sub    $0x8,%rsp
         │                        int i;
         │
         │                        i = rand() % 2;
   23.02 │1.18     1      → callq  rand@plt
         │
         │                        return i;
   27.05 │3.37              mov    %eax,%edx
         │                }
         │3.37              add    $0x8,%rsp
         │                {
         │                        int i;
         │
         │                        i = rand() % 2;
         │
         │                        return i;
         │3.37              shr    $0x1f,%edx
         │3.37              add    %edx,%eax
         │3.37              and    $0x1,%eax
         │3.37              sub    %edx,%eax
         │                }
   26.97 │3.37     2      ← retq

Note that, this patch only supports TUI mode. For stdio, now it just keeps
original behavior. Will support it in a follow-up patch.

  $ perf annotate compute_flag --stdio

   Percent |      Source code & Disassembly of div for cycles:ppp (7993 samples)
  ------------------------------------------------------------------------------
           :
           :
           :
           :            Disassembly of section .text:
           :
           :            0000000000400640 <compute_flag>:
           :            compute_flag():
           :            volatile int count;
           :            static unsigned int s_randseed;
           :
           :            __attribute__((noinline))
           :            int compute_flag()
           :            {
      0.29 :   400640:       sub    $0x8,%rsp     # +100.00%
           :                    int i;
           :
           :                    i = rand() % 2;
     42.93 :   400644:       callq  400490 <rand@plt>     # -100.00% (p:100.00%)
           :
           :                    return i;
      0.10 :   400649:       mov    %eax,%edx     # +100.00%
           :            }
      0.94 :   40064b:       add    $0x8,%rsp
           :            {
           :                    int i;
           :
           :                    i = rand() % 2;
           :
           :                    return i;
     27.02 :   40064f:       shr    $0x1f,%edx
      0.15 :   400652:       add    %edx,%eax
      1.24 :   400654:       and    $0x1,%eax
      2.08 :   400657:       sub    %edx,%eax
           :            }
     25.26 :   400659:       retq # -100.00% (p:100.00%)

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/20180223170210.GC7045@tassilo.jf.intel.com
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1519724327-7773-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-annotate.c | 88 ++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 82 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index f15731a3d438..ead6ae4549e5 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -44,6 +44,7 @@ struct perf_annotate {
 	bool	   full_paths;
 	bool	   print_line;
 	bool	   skip_missing;
+	bool	   has_br_stack;
 	const char *sym_hist_filter;
 	const char *cpu_list;
 	DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
@@ -146,16 +147,73 @@ static void process_branch_stack(struct branch_stack *bs, struct addr_location *
 	free(bi);
 }
 
+static int hist_iter__branch_callback(struct hist_entry_iter *iter,
+				      struct addr_location *al __maybe_unused,
+				      bool single __maybe_unused,
+				      void *arg __maybe_unused)
+{
+	struct hist_entry *he = iter->he;
+	struct branch_info *bi;
+	struct perf_sample *sample = iter->sample;
+	struct perf_evsel *evsel = iter->evsel;
+	int err;
+
+	hist__account_cycles(sample->branch_stack, al, sample, false);
+
+	bi = he->branch_info;
+	err = addr_map_symbol__inc_samples(&bi->from, sample, evsel->idx);
+
+	if (err)
+		goto out;
+
+	err = addr_map_symbol__inc_samples(&bi->to, sample, evsel->idx);
+
+out:
+	return err;
+}
+
+static int process_branch_callback(struct perf_evsel *evsel,
+				   struct perf_sample *sample,
+				   struct addr_location *al __maybe_unused,
+				   struct perf_annotate *ann,
+				   struct machine *machine)
+{
+	struct hist_entry_iter iter = {
+		.evsel		= evsel,
+		.sample		= sample,
+		.add_entry_cb	= hist_iter__branch_callback,
+		.hide_unresolved	= symbol_conf.hide_unresolved,
+		.ops		= &hist_iter_branch,
+	};
+
+	struct addr_location a;
+	int ret;
+
+	if (machine__resolve(machine, &a, sample) < 0)
+		return -1;
+
+	if (a.sym == NULL)
+		return 0;
+
+	if (a.map != NULL)
+		a.map->dso->hit = 1;
+
+	ret = hist_entry_iter__add(&iter, &a, PERF_MAX_STACK_DEPTH, ann);
+	return ret;
+}
+
 static int perf_evsel__add_sample(struct perf_evsel *evsel,
 				  struct perf_sample *sample,
 				  struct addr_location *al,
-				  struct perf_annotate *ann)
+				  struct perf_annotate *ann,
+				  struct machine *machine)
 {
 	struct hists *hists = evsel__hists(evsel);
 	struct hist_entry *he;
 	int ret;
 
-	if (ann->sym_hist_filter != NULL &&
+	if ((!ann->has_br_stack || !ui__has_annotation()) &&
+	    ann->sym_hist_filter != NULL &&
 	    (al->sym == NULL ||
 	     strcmp(ann->sym_hist_filter, al->sym->name) != 0)) {
 		/* We're only interested in a symbol named sym_hist_filter */
@@ -178,6 +236,9 @@ static int perf_evsel__add_sample(struct perf_evsel *evsel,
 	 */
 	process_branch_stack(sample->branch_stack, al, sample);
 
+	if (ann->has_br_stack && ui__has_annotation())
+		return process_branch_callback(evsel, sample, al, ann, machine);
+
 	he = hists__add_entry(hists, al, NULL, NULL, NULL, sample, true);
 	if (he == NULL)
 		return -ENOMEM;
@@ -206,7 +267,8 @@ static int process_sample_event(struct perf_tool *tool,
 	if (ann->cpu_list && !test_bit(sample->cpu, ann->cpu_bitmap))
 		goto out_put;
 
-	if (!al.filtered && perf_evsel__add_sample(evsel, sample, &al, ann)) {
+	if (!al.filtered &&
+	    perf_evsel__add_sample(evsel, sample, &al, ann, machine)) {
 		pr_warning("problem incrementing symbol count, "
 			   "skipping event\n");
 		ret = -1;
@@ -238,6 +300,10 @@ static void hists__find_annotations(struct hists *hists,
 		if (he->ms.sym == NULL || he->ms.map->dso->annotate_warned)
 			goto find_next;
 
+		if (ann->sym_hist_filter &&
+		    (strcmp(he->ms.sym->name, ann->sym_hist_filter) != 0))
+			goto find_next;
+
 		notes = symbol__annotation(he->ms.sym);
 		if (notes->src == NULL) {
 find_next:
@@ -269,6 +335,7 @@ find_next:
 			nd = rb_next(nd);
 		} else if (use_browser == 1) {
 			key = hist_entry__tui_annotate(he, evsel, NULL);
+
 			switch (key) {
 			case -1:
 				if (!ann->skip_missing)
@@ -489,6 +556,9 @@ int cmd_annotate(int argc, const char **argv)
 	if (annotate.session == NULL)
 		return -1;
 
+	annotate.has_br_stack = perf_header__has_feat(&annotate.session->header,
+						      HEADER_BRANCH_STACK);
+
 	ret = symbol__annotation_init();
 	if (ret < 0)
 		goto out_delete;
@@ -499,9 +569,6 @@ int cmd_annotate(int argc, const char **argv)
 	if (ret < 0)
 		goto out_delete;
 
-	if (setup_sorting(NULL) < 0)
-		usage_with_options(annotate_usage, options);
-
 	if (annotate.use_stdio)
 		use_browser = 0;
 	else if (annotate.use_tui)
@@ -511,6 +578,15 @@ int cmd_annotate(int argc, const char **argv)
 
 	setup_browser(true);
 
+	if (use_browser == 1 && annotate.has_br_stack) {
+		sort__mode = SORT_MODE__BRANCH;
+		if (setup_sorting(annotate.session->evlist) < 0)
+			usage_with_options(annotate_usage, options);
+	} else {
+		if (setup_sorting(NULL) < 0)
+			usage_with_options(annotate_usage, options);
+	}
+
 	ret = __cmd_annotate(&annotate);
 
 out_delete:

^ permalink raw reply related	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2018-03-09  8:59 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-23 14:35 [PATCH] perf annotate: Support to display the LBR data in tui mode Jin Yao
2018-02-23  8:25 ` Peter Zijlstra
2018-02-23  9:08   ` Jin, Yao
2018-02-23 15:29   ` Arnaldo Carvalho de Melo
2018-02-23 16:59     ` Peter Zijlstra
2018-02-23 17:02     ` Andi Kleen
2018-02-27  9:38       ` [PATCH v2] perf annotate: Support to display the IPC/Cycle " Jin Yao
2018-03-08  5:22         ` Jin, Yao
2018-03-08 13:12           ` Arnaldo Carvalho de Melo
2018-03-09  8:57         ` [tip:perf/core] perf annotate: Support to display the IPC/Cycle in TUI mode tip-bot for Jin Yao
2018-02-24  1:40     ` [PATCH] perf annotate: Support to display the LBR data in tui mode Jin, Yao
2018-02-26 13:57       ` Arnaldo Carvalho de Melo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).