linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v7] perf annotate: Fix sample events lost in stdio mode
@ 2021-03-19 12:35 Yang Jihong
  2021-03-26  2:24 ` Yang Jihong
  0 siblings, 1 reply; 10+ messages in thread
From: Yang Jihong @ 2021-03-19 12:35 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, yao.jin, gustavoars, mliska, linux-kernel
  Cc: yangjihong1, zhangjinhao2

In hist__find_annotations function, since different hist_entry may point to same
symbol, we free notes->src to signal already processed this symbol in stdio mode;
when annotate, entry will skipped if notes->src is NULL to avoid repeated output.

However, there is a problem, for example, run the following command:

 # perf record -e branch-misses -e branch-instructions -a sleep 1

perf.data file contains different types of sample event.

If the same IP sample event exists in branch-misses and branch-instructions,
this event uses the same symbol. When annotate branch-misses events, notes->src
corresponding to this event is set to null, as a result, when annotate
branch-instructions events, this event is skipped and no annotate is output.

Solution of this patch is to remove zfree in hists__find_annotations and
change sort order to "dso,symbol" to avoid duplicate output when different
processes correspond to the same symbol.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
---

Changes since v6:
  - Remove separate setup_sorting() for branch mode.

Changes since v5:
  - Add Signed-off-by tag.

Changes since v4:
  - Use the same sort key "dso,symbol" in branch stack mode.

Changes since v3:
  - Modify the first line of comments.

Changes since v2:
  - Remove zfree in hists__find_annotations.
  - Change sort order to avoid duplicate output.

Changes since v1:
  - Change processed flag variable from u8 to bool.

 tools/perf/builtin-annotate.c | 29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index a23ba6bb99b6..795c2ac7fcd1 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -374,13 +374,6 @@ static void hists__find_annotations(struct hists *hists,
 		} else {
 			hist_entry__tty_annotate(he, evsel, ann);
 			nd = rb_next(nd);
-			/*
-			 * Since we have a hist_entry per IP for the same
-			 * symbol, free he->ms.sym->src to signal we already
-			 * processed this symbol.
-			 */
-			zfree(&notes->src->cycles_hist);
-			zfree(&notes->src);
 		}
 	}
 }
@@ -619,14 +612,22 @@ int cmd_annotate(int argc, const char **argv)
 
 	setup_browser(true);
 
-	if ((use_browser == 1 || annotate.use_stdio2) && annotate.has_br_stack) {
+	/*
+	 * Events of different processes may correspond to the same
+	 * symbol, we do not care about the processes in annotate,
+	 * set sort order to avoid repeated output.
+	 */
+	sort_order = "dso,symbol";
+
+	/*
+	 * Set SORT_MODE__BRANCH so that annotate display IPC/Cycle
+	 * if branch info is in perf data in TUI mode.
+	 */
+	if ((use_browser == 1 || annotate.use_stdio2) && annotate.has_br_stack)
 		sort__mode = SORT_MODE__BRANCH;
-		if (setup_sorting(annotate.session->evlist) < 0)
-			usage_with_options(annotate_usage, options);
-	} else {
-		if (setup_sorting(NULL) < 0)
-			usage_with_options(annotate_usage, options);
-	}
+
+	if (setup_sorting(NULL) < 0)
+		usage_with_options(annotate_usage, options);
 
 	ret = __cmd_annotate(&annotate);
 
-- 
2.30.GIT


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v7] perf annotate: Fix sample events lost in stdio mode
  2021-03-19 12:35 [PATCH v7] perf annotate: Fix sample events lost in stdio mode Yang Jihong
@ 2021-03-26  2:24 ` Yang Jihong
  2021-03-26  3:25   ` Namhyung Kim
  0 siblings, 1 reply; 10+ messages in thread
From: Yang Jihong @ 2021-03-26  2:24 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, yao.jin, gustavoars, mliska, linux-kernel
  Cc: zhangjinhao2

Hello,
ping :)

On 2021/3/19 20:35, Yang Jihong wrote:
> In hist__find_annotations function, since different hist_entry may point to same
> symbol, we free notes->src to signal already processed this symbol in stdio mode;
> when annotate, entry will skipped if notes->src is NULL to avoid repeated output.
> 
> However, there is a problem, for example, run the following command:
> 
>   # perf record -e branch-misses -e branch-instructions -a sleep 1
> 
> perf.data file contains different types of sample event.
> 
> If the same IP sample event exists in branch-misses and branch-instructions,
> this event uses the same symbol. When annotate branch-misses events, notes->src
> corresponding to this event is set to null, as a result, when annotate
> branch-instructions events, this event is skipped and no annotate is output.
> 
> Solution of this patch is to remove zfree in hists__find_annotations and
> change sort order to "dso,symbol" to avoid duplicate output when different
> processes correspond to the same symbol.
> 
> Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
> ---
> 
> Changes since v6:
>    - Remove separate setup_sorting() for branch mode.
> 
> Changes since v5:
>    - Add Signed-off-by tag.
> 
> Changes since v4:
>    - Use the same sort key "dso,symbol" in branch stack mode.
> 
> Changes since v3:
>    - Modify the first line of comments.
> 
> Changes since v2:
>    - Remove zfree in hists__find_annotations.
>    - Change sort order to avoid duplicate output.
> 
> Changes since v1:
>    - Change processed flag variable from u8 to bool.
> 
>   tools/perf/builtin-annotate.c | 29 +++++++++++++++--------------
>   1 file changed, 15 insertions(+), 14 deletions(-)
> 
> diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
> index a23ba6bb99b6..795c2ac7fcd1 100644
> --- a/tools/perf/builtin-annotate.c
> +++ b/tools/perf/builtin-annotate.c
> @@ -374,13 +374,6 @@ static void hists__find_annotations(struct hists *hists,
>   		} else {
>   			hist_entry__tty_annotate(he, evsel, ann);
>   			nd = rb_next(nd);
> -			/*
> -			 * Since we have a hist_entry per IP for the same
> -			 * symbol, free he->ms.sym->src to signal we already
> -			 * processed this symbol.
> -			 */
> -			zfree(&notes->src->cycles_hist);
> -			zfree(&notes->src);
>   		}
>   	}
>   }
> @@ -619,14 +612,22 @@ int cmd_annotate(int argc, const char **argv)
>   
>   	setup_browser(true);
>   
> -	if ((use_browser == 1 || annotate.use_stdio2) && annotate.has_br_stack) {
> +	/*
> +	 * Events of different processes may correspond to the same
> +	 * symbol, we do not care about the processes in annotate,
> +	 * set sort order to avoid repeated output.
> +	 */
> +	sort_order = "dso,symbol";
> +
> +	/*
> +	 * Set SORT_MODE__BRANCH so that annotate display IPC/Cycle
> +	 * if branch info is in perf data in TUI mode.
> +	 */
> +	if ((use_browser == 1 || annotate.use_stdio2) && annotate.has_br_stack)
>   		sort__mode = SORT_MODE__BRANCH;
> -		if (setup_sorting(annotate.session->evlist) < 0)
> -			usage_with_options(annotate_usage, options);
> -	} else {
> -		if (setup_sorting(NULL) < 0)
> -			usage_with_options(annotate_usage, options);
> -	}
> +
> +	if (setup_sorting(NULL) < 0)
> +		usage_with_options(annotate_usage, options);
>   
>   	ret = __cmd_annotate(&annotate);
>   
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v7] perf annotate: Fix sample events lost in stdio mode
  2021-03-26  2:24 ` Yang Jihong
@ 2021-03-26  3:25   ` Namhyung Kim
  2021-03-26 12:06     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 10+ messages in thread
From: Namhyung Kim @ 2021-03-26  3:25 UTC (permalink / raw)
  To: Yang Jihong
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Yao Jin, gustavoars,
	mliska, linux-kernel, zhangjinhao2

Hello,

On Fri, Mar 26, 2021 at 11:24 AM Yang Jihong <yangjihong1@huawei.com> wrote:
>
> Hello,
> ping :)
>
> On 2021/3/19 20:35, Yang Jihong wrote:
> > In hist__find_annotations function, since different hist_entry may point to same
> > symbol, we free notes->src to signal already processed this symbol in stdio mode;
> > when annotate, entry will skipped if notes->src is NULL to avoid repeated output.
> >
> > However, there is a problem, for example, run the following command:
> >
> >   # perf record -e branch-misses -e branch-instructions -a sleep 1
> >
> > perf.data file contains different types of sample event.
> >
> > If the same IP sample event exists in branch-misses and branch-instructions,
> > this event uses the same symbol. When annotate branch-misses events, notes->src
> > corresponding to this event is set to null, as a result, when annotate
> > branch-instructions events, this event is skipped and no annotate is output.
> >
> > Solution of this patch is to remove zfree in hists__find_annotations and
> > change sort order to "dso,symbol" to avoid duplicate output when different
> > processes correspond to the same symbol.
> >
> > Signed-off-by: Yang Jihong <yangjihong1@huawei.com>

Acked-by: Namhyung Kim <namhyung@kernel.org>

Thanks,
Namhyung


> > ---
> >
> > Changes since v6:
> >    - Remove separate setup_sorting() for branch mode.
> >
> > Changes since v5:
> >    - Add Signed-off-by tag.
> >
> > Changes since v4:
> >    - Use the same sort key "dso,symbol" in branch stack mode.
> >
> > Changes since v3:
> >    - Modify the first line of comments.
> >
> > Changes since v2:
> >    - Remove zfree in hists__find_annotations.
> >    - Change sort order to avoid duplicate output.
> >
> > Changes since v1:
> >    - Change processed flag variable from u8 to bool.
> >
> >   tools/perf/builtin-annotate.c | 29 +++++++++++++++--------------
> >   1 file changed, 15 insertions(+), 14 deletions(-)
> >
> > diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
> > index a23ba6bb99b6..795c2ac7fcd1 100644
> > --- a/tools/perf/builtin-annotate.c
> > +++ b/tools/perf/builtin-annotate.c
> > @@ -374,13 +374,6 @@ static void hists__find_annotations(struct hists *hists,
> >               } else {
> >                       hist_entry__tty_annotate(he, evsel, ann);
> >                       nd = rb_next(nd);
> > -                     /*
> > -                      * Since we have a hist_entry per IP for the same
> > -                      * symbol, free he->ms.sym->src to signal we already
> > -                      * processed this symbol.
> > -                      */
> > -                     zfree(&notes->src->cycles_hist);
> > -                     zfree(&notes->src);
> >               }
> >       }
> >   }
> > @@ -619,14 +612,22 @@ int cmd_annotate(int argc, const char **argv)
> >
> >       setup_browser(true);
> >
> > -     if ((use_browser == 1 || annotate.use_stdio2) && annotate.has_br_stack) {
> > +     /*
> > +      * Events of different processes may correspond to the same
> > +      * symbol, we do not care about the processes in annotate,
> > +      * set sort order to avoid repeated output.
> > +      */
> > +     sort_order = "dso,symbol";
> > +
> > +     /*
> > +      * Set SORT_MODE__BRANCH so that annotate display IPC/Cycle
> > +      * if branch info is in perf data in TUI mode.
> > +      */
> > +     if ((use_browser == 1 || annotate.use_stdio2) && annotate.has_br_stack)
> >               sort__mode = SORT_MODE__BRANCH;
> > -             if (setup_sorting(annotate.session->evlist) < 0)
> > -                     usage_with_options(annotate_usage, options);
> > -     } else {
> > -             if (setup_sorting(NULL) < 0)
> > -                     usage_with_options(annotate_usage, options);
> > -     }
> > +
> > +     if (setup_sorting(NULL) < 0)
> > +             usage_with_options(annotate_usage, options);
> >
> >       ret = __cmd_annotate(&annotate);
> >
> >

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v7] perf annotate: Fix sample events lost in stdio mode
  2021-03-26  3:25   ` Namhyung Kim
@ 2021-03-26 12:06     ` Arnaldo Carvalho de Melo
  2021-03-27  2:15       ` Yang Jihong
  0 siblings, 1 reply; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-03-26 12:06 UTC (permalink / raw)
  To: Namhyung Kim, Yang Jihong
  Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Yao Jin, gustavoars, mliska, linux-kernel,
	zhangjinhao2

Em Fri, Mar 26, 2021 at 12:25:37PM +0900, Namhyung Kim escreveu:
> On Fri, Mar 26, 2021 at 11:24 AM Yang Jihong <yangjihong1@huawei.com> wrote:
> > On 2021/3/19 20:35, Yang Jihong wrote:
> > > In hist__find_annotations function, since different hist_entry may point to same
> > > symbol, we free notes->src to signal already processed this symbol in stdio mode;
> > > when annotate, entry will skipped if notes->src is NULL to avoid repeated output.
> > >
> > > However, there is a problem, for example, run the following command:
> > >
> > >   # perf record -e branch-misses -e branch-instructions -a sleep 1
> > >
> > > perf.data file contains different types of sample event.
> > >
> > > If the same IP sample event exists in branch-misses and branch-instructions,
> > > this event uses the same symbol. When annotate branch-misses events, notes->src
> > > corresponding to this event is set to null, as a result, when annotate
> > > branch-instructions events, this event is skipped and no annotate is output.
> > >
> > > Solution of this patch is to remove zfree in hists__find_annotations and
> > > change sort order to "dso,symbol" to avoid duplicate output when different
> > > processes correspond to the same symbol.

> > > Signed-off-by: Yang Jihong <yangjihong1@huawei.com>

> Acked-by: Namhyung Kim <namhyung@kernel.org>

Without looking at the patch, just at its description of the problem, I
tried to annotate two events in a group, to get the annotate group view
output with both events, and it seems I'm getting samples accounted for
both events:

[root@five ~]# perf record -e '{branch-misses,branch-instructions}' -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.296 MB perf.data (2507 samples) ]
[root@five ~]#
[root@five ~]# perf report | grep -v '^#' | head -5
    17.49%  19.19%  ThreadPoolForeg  chromium-browser              [.] v8::internal::ConcurrentMarking::Run
    12.17%  17.04%  ThreadPoolForeg  chromium-browser              [.] v8::internal::Sweeper::RawSweep
    11.14%  11.63%  ThreadPoolForeg  chromium-browser              [.] v8::internal::MarkingVisitorBase<v8::internal::ConcurrentMarkingVisitor, v8::internal::ConcurrentMarkingState>::ProcessStrongHeapObject<v8::internal::CompressedHeapObjectSlot>
     7.65%   7.84%  ThreadPoolForeg  chromium-browser              [.] v8::internal::ConcurrentMarkingVisitor::ShouldVisit
     5.66%   6.72%  ThreadPoolForeg  chromium-browser              [.] v8::internal::ConcurrentMarkingVisitor::VisitPointersInSnapshot

[root@five ~]# perf annotate --stdio2 v8::internal::ConcurrentMarking::Run
Samples: 444  of events 'anon group { branch-misses, branch-instructions }', 4000 Hz, Event count (approx.): 596221, [percent: local period]
v8::internal::ConcurrentMarking::Run() /usr/lib64/chromium-browser/chromium-browser
Percent                      
                             
                             
                     Disassembly of section .text:
                             
                     0000000003290b30 <v8::internal::ConcurrentMarking::Run(v8::JobDelegate*, unsigned int, bool)>:
                     v8::internal::ConcurrentMarking::Run(v8::JobDelegate*, unsigned int, bool):
                       push       %rbp        
                       mov        %rsp,%rbp   
                       push       %r15        
                       push       %r14        
                       mov        %rdi,%r14   
                       push       %r13        
                       mov        %edx,%r13d  
                       push       %r12        
                       mov        %ecx,%r12d  
                       push       %rbx        
                       sub        $0x1298,%rsp
                       mov        %rsi,-0x1228(%rbp)
                       mov        %fs:0x28,%rax
                       mov        %rax,-0x38(%rbp)
<SNIP>		       
                       movzwl     0x2(%rbx),%eax
                       test       %ax,%ax     
                     ↓ jne        4a9         
                       mov        -0x10e8(%rbp),%rdx
                       cmpw       $0x0,0x2(%rdx)
  0.41   0.39        ↓ je         4b90        
                       movq       %rbx,%xmm0  
                       movq       %rdx,%xmm2  
                       mov        %rdx,%rbx   
                       punpcklqdq %xmm2,%xmm0 
                       movups     %xmm0,-0x10e8(%rbp)
                       movzwl     0x2(%rdx),%eax
                4a9:   sub        $0x1,%eax   
                       mov        %ax,0x2(%rbx)
  0.36   0.91          movzwl     %ax,%eax    
  0.60   0.00          mov        0x10(%rbx,%rax,8),%rax
  3.44   2.46          mov        %rax,-0x11e0(%rbp)
  0.00   0.34   4bf:   mov        0x8(%r13),%rax
  0.00   0.36          add        $0x1,%r15d  
  0.00   0.34          mov        0x110(%rax),%rax
                       mov        0x128(%rax),%rcx
  0.88   0.36          mov        0x8(%r13),%rax
                       mov        0x110(%rax),%rdx
                       mov        0x130(%rdx),%rdx
  0.00   0.48          mov        0x140(%rax),%rax
                       mov        0x110(%rax),%rsi
  0.61   0.47          mov        -0x11e0(%rbp),%rax
  2.01   2.32          sub        $0x1,%rax   
                       cmp        %rcx,%rax   
  0.00   0.35          setae      %cl         
  1.31   0.33          cmp        %rdx,%rax   
                       setb       %dl         
  0.00   0.24          test       %dl,%cl     
  0.00   0.12        ↓ jne        4b70        
                       cmp        %rsi,%rax   
                     ↓ je         4b70        
                       mov        (%rax),%eax 
 29.10  29.90          add        -0x1220(%rbp),%rax
                       cmpb       $0x0,-0x1218(%rbp)
                       mov        %rax,-0x1210(%rbp)
  0.00   0.65        ↓ jne        4fa0        
                       mov        -0x11e0(%rbp),%r9
                       lea        0x6(%rax),%rbx
  0.38   0.00   545:   movzbl     (%rbx),%eax 
  4.90   5.34          cmp        $0x4c,%al   
                     ↓ ja         5026        
  0.58   0.00   550:   lea        v8::internal::FLAGDEFAULT_abort_on_contradictory_flags+0x457,%rdi
                       movslq     (%rdi,%rax,4),%rax
  4.97   3.19          add        %rdi,%rax   
<SNIP>

If I ask for number of samples:

[root@five ~]# perf config annotate.show_nr_samples=true
[root@five ~]# perf annotate --stdio2 v8::internal::ConcurrentMarking::Run

Samples: 444  of events 'anon group { branch-misses, branch-instructions }', 4000 Hz, Event count (approx.): 596221, [percent: local period]
v8::internal::ConcurrentMarking::Run() /usr/lib64/chromium-browser/chromium-browser
Samples                      
                             
                             
                     Disassembly of section .text:
                             
                     0000000003290b30 <v8::internal::ConcurrentMarking::Run(v8::JobDelegate*, unsigned int, bool)>:
                     v8::internal::ConcurrentMarking::Run(v8::JobDelegate*, unsigned int, bool):
                       push       %rbp        
                       mov        %rsp,%rbp   
                       push       %r15        
                       push       %r14        
                       mov        %rdi,%r14   
                       push       %r13        
                       mov        %edx,%r13d  
                       push       %r12        
                       mov        %ecx,%r12d  
                       push       %rbx        
                       sub        $0x1298,%rsp
                       mov        %rsi,-0x1228(%rbp)
                       mov        %fs:0x28,%rax
                       mov        %rax,-0x38(%rbp)
<SNIP>
                       movzwl     0x2(%rbx),%eax
                       test       %ax,%ax     
                     ↓ jne        4a9         
                       mov        -0x10e8(%rbp),%rdx
                       cmpw       $0x0,0x2(%rdx)
     1      1        ↓ je         4b90        
                       movq       %rbx,%xmm0  
                       movq       %rdx,%xmm2  
                       mov        %rdx,%rbx   
                       punpcklqdq %xmm2,%xmm0 
                       movups     %xmm0,-0x10e8(%rbp)
                       movzwl     0x2(%rdx),%eax
                4a9:   sub        $0x1,%eax   
                       mov        %ax,0x2(%rbx)
     1      2          movzwl     %ax,%eax    
     1      0          mov        0x10(%rbx,%rax,8),%rax
     8      5          mov        %rax,-0x11e0(%rbp)
     0      1   4bf:   mov        0x8(%r13),%rax
     0      1          add        $0x1,%r15d  
     0      1          mov        0x110(%rax),%rax
                       mov        0x128(%rax),%rcx
     3      1          mov        0x8(%r13),%rax
                       mov        0x110(%rax),%rdx
                       mov        0x130(%rdx),%rdx
     0      1          mov        0x140(%rax),%rax
                       mov        0x110(%rax),%rsi
     2      1          mov        -0x11e0(%rbp),%rax
     6      6          sub        $0x1,%rax   
                       cmp        %rcx,%rax   
     0      1          setae      %cl         
     2      1          cmp        %rdx,%rax   
                       setb       %dl         
     0      1          test       %dl,%cl     
     0      1        ↓ jne        4b70        
                       cmp        %rsi,%rax   
                     ↓ je         4b70        
                       mov        (%rax),%eax 
    58     73          add        -0x1220(%rbp),%rax
                       cmpb       $0x0,-0x1218(%rbp)
                       mov        %rax,-0x1210(%rbp)
     0      1        ↓ jne        4fa0        
                       mov        -0x11e0(%rbp),%r9
                       lea        0x6(%rax),%rbx
     1      0   545:   movzbl     (%rbx),%eax 
    10     13          cmp        $0x4c,%al   
                     ↓ ja         5026        
     1      0   550:   lea        v8::internal::FLAGDEFAULT_abort_on_contradictory_flags+0x457,%rdi
                       movslq     (%rdi,%rax,4),%rax
     9      8          add        %rdi,%rax   

<SNIP>

So it seems to be working, what am I missing? Is this strictly non
group related?

- Arnaldo

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v7] perf annotate: Fix sample events lost in stdio mode
  2021-03-26 12:06     ` Arnaldo Carvalho de Melo
@ 2021-03-27  2:15       ` Yang Jihong
  2021-03-30  7:26         ` Namhyung Kim
  0 siblings, 1 reply; 10+ messages in thread
From: Yang Jihong @ 2021-03-27  2:15 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Namhyung Kim
  Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Yao Jin, gustavoars, mliska, linux-kernel,
	zhangjinhao2

Hello,

On 2021/3/26 20:06, Arnaldo Carvalho de Melo wrote:
> Em Fri, Mar 26, 2021 at 12:25:37PM +0900, Namhyung Kim escreveu:
>> On Fri, Mar 26, 2021 at 11:24 AM Yang Jihong <yangjihong1@huawei.com> wrote:
>>> On 2021/3/19 20:35, Yang Jihong wrote:
>>>> In hist__find_annotations function, since different hist_entry may point to same
>>>> symbol, we free notes->src to signal already processed this symbol in stdio mode;
>>>> when annotate, entry will skipped if notes->src is NULL to avoid repeated output.
>>>>
>>>> However, there is a problem, for example, run the following command:
>>>>
>>>>    # perf record -e branch-misses -e branch-instructions -a sleep 1
>>>>
>>>> perf.data file contains different types of sample event.
>>>>
>>>> If the same IP sample event exists in branch-misses and branch-instructions,
>>>> this event uses the same symbol. When annotate branch-misses events, notes->src
>>>> corresponding to this event is set to null, as a result, when annotate
>>>> branch-instructions events, this event is skipped and no annotate is output.
>>>>
>>>> Solution of this patch is to remove zfree in hists__find_annotations and
>>>> change sort order to "dso,symbol" to avoid duplicate output when different
>>>> processes correspond to the same symbol.
> 
>>>> Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
> 
>> Acked-by: Namhyung Kim <namhyung@kernel.org>
> 
> Without looking at the patch, just at its description of the problem, I
> tried to annotate two events in a group, to get the annotate group view
> output with both events, and it seems I'm getting samples accounted for
> both events:
> 
> [root@five ~]# perf record -e '{branch-misses,branch-instructions}' -a sleep 1
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 2.296 MB perf.data (2507 samples) ]
> [root@five ~]#
> [root@five ~]# perf report | grep -v '^#' | head -5
>      17.49%  19.19%  ThreadPoolForeg  chromium-browser              [.] v8::internal::ConcurrentMarking::Run
>      12.17%  17.04%  ThreadPoolForeg  chromium-browser              [.] v8::internal::Sweeper::RawSweep
>      11.14%  11.63%  ThreadPoolForeg  chromium-browser              [.] v8::internal::MarkingVisitorBase<v8::internal::ConcurrentMarkingVisitor, v8::internal::ConcurrentMarkingState>::ProcessStrongHeapObject<v8::internal::CompressedHeapObjectSlot>
>       7.65%   7.84%  ThreadPoolForeg  chromium-browser              [.] v8::internal::ConcurrentMarkingVisitor::ShouldVisit
>       5.66%   6.72%  ThreadPoolForeg  chromium-browser              [.] v8::internal::ConcurrentMarkingVisitor::VisitPointersInSnapshot
> 
> [root@five ~]# perf annotate --stdio2 v8::internal::ConcurrentMarking::Run
> Samples: 444  of events 'anon group { branch-misses, branch-instructions }', 4000 Hz, Event count (approx.): 596221, [percent: local period]
> v8::internal::ConcurrentMarking::Run() /usr/lib64/chromium-browser/chromium-browser
> Percent
>                               
>                               
>                       Disassembly of section .text:
>                               
>                       0000000003290b30 <v8::internal::ConcurrentMarking::Run(v8::JobDelegate*, unsigned int, bool)>:
>                       v8::internal::ConcurrentMarking::Run(v8::JobDelegate*, unsigned int, bool):
>                         push       %rbp
>                         mov        %rsp,%rbp
>                         push       %r15
>                         push       %r14
>                         mov        %rdi,%r14
>                         push       %r13
>                         mov        %edx,%r13d
>                         push       %r12
>                         mov        %ecx,%r12d
>                         push       %rbx
>                         sub        $0x1298,%rsp
>                         mov        %rsi,-0x1228(%rbp)
>                         mov        %fs:0x28,%rax
>                         mov        %rax,-0x38(%rbp)
> <SNIP>		
>                         movzwl     0x2(%rbx),%eax
>                         test       %ax,%ax
>                       ↓ jne        4a9
>                         mov        -0x10e8(%rbp),%rdx
>                         cmpw       $0x0,0x2(%rdx)
>    0.41   0.39        ↓ je         4b90
>                         movq       %rbx,%xmm0
>                         movq       %rdx,%xmm2
>                         mov        %rdx,%rbx
>                         punpcklqdq %xmm2,%xmm0
>                         movups     %xmm0,-0x10e8(%rbp)
>                         movzwl     0x2(%rdx),%eax
>                  4a9:   sub        $0x1,%eax
>                         mov        %ax,0x2(%rbx)
>    0.36   0.91          movzwl     %ax,%eax
>    0.60   0.00          mov        0x10(%rbx,%rax,8),%rax
>    3.44   2.46          mov        %rax,-0x11e0(%rbp)
>    0.00   0.34   4bf:   mov        0x8(%r13),%rax
>    0.00   0.36          add        $0x1,%r15d
>    0.00   0.34          mov        0x110(%rax),%rax
>                         mov        0x128(%rax),%rcx
>    0.88   0.36          mov        0x8(%r13),%rax
>                         mov        0x110(%rax),%rdx
>                         mov        0x130(%rdx),%rdx
>    0.00   0.48          mov        0x140(%rax),%rax
>                         mov        0x110(%rax),%rsi
>    0.61   0.47          mov        -0x11e0(%rbp),%rax
>    2.01   2.32          sub        $0x1,%rax
>                         cmp        %rcx,%rax
>    0.00   0.35          setae      %cl
>    1.31   0.33          cmp        %rdx,%rax
>                         setb       %dl
>    0.00   0.24          test       %dl,%cl
>    0.00   0.12        ↓ jne        4b70
>                         cmp        %rsi,%rax
>                       ↓ je         4b70
>                         mov        (%rax),%eax
>   29.10  29.90          add        -0x1220(%rbp),%rax
>                         cmpb       $0x0,-0x1218(%rbp)
>                         mov        %rax,-0x1210(%rbp)
>    0.00   0.65        ↓ jne        4fa0
>                         mov        -0x11e0(%rbp),%r9
>                         lea        0x6(%rax),%rbx
>    0.38   0.00   545:   movzbl     (%rbx),%eax
>    4.90   5.34          cmp        $0x4c,%al
>                       ↓ ja         5026
>    0.58   0.00   550:   lea        v8::internal::FLAGDEFAULT_abort_on_contradictory_flags+0x457,%rdi
>                         movslq     (%rdi,%rax,4),%rax
>    4.97   3.19          add        %rdi,%rax
> <SNIP>
> 
> If I ask for number of samples:
> 
> [root@five ~]# perf config annotate.show_nr_samples=true
> [root@five ~]# perf annotate --stdio2 v8::internal::ConcurrentMarking::Run
> 
> Samples: 444  of events 'anon group { branch-misses, branch-instructions }', 4000 Hz, Event count (approx.): 596221, [percent: local period]
> v8::internal::ConcurrentMarking::Run() /usr/lib64/chromium-browser/chromium-browser
> Samples
>                               
>                               
>                       Disassembly of section .text:
>                               
>                       0000000003290b30 <v8::internal::ConcurrentMarking::Run(v8::JobDelegate*, unsigned int, bool)>:
>                       v8::internal::ConcurrentMarking::Run(v8::JobDelegate*, unsigned int, bool):
>                         push       %rbp
>                         mov        %rsp,%rbp
>                         push       %r15
>                         push       %r14
>                         mov        %rdi,%r14
>                         push       %r13
>                         mov        %edx,%r13d
>                         push       %r12
>                         mov        %ecx,%r12d
>                         push       %rbx
>                         sub        $0x1298,%rsp
>                         mov        %rsi,-0x1228(%rbp)
>                         mov        %fs:0x28,%rax
>                         mov        %rax,-0x38(%rbp)
> <SNIP>
>                         movzwl     0x2(%rbx),%eax
>                         test       %ax,%ax
>                       ↓ jne        4a9
>                         mov        -0x10e8(%rbp),%rdx
>                         cmpw       $0x0,0x2(%rdx)
>       1      1        ↓ je         4b90
>                         movq       %rbx,%xmm0
>                         movq       %rdx,%xmm2
>                         mov        %rdx,%rbx
>                         punpcklqdq %xmm2,%xmm0
>                         movups     %xmm0,-0x10e8(%rbp)
>                         movzwl     0x2(%rdx),%eax
>                  4a9:   sub        $0x1,%eax
>                         mov        %ax,0x2(%rbx)
>       1      2          movzwl     %ax,%eax
>       1      0          mov        0x10(%rbx,%rax,8),%rax
>       8      5          mov        %rax,-0x11e0(%rbp)
>       0      1   4bf:   mov        0x8(%r13),%rax
>       0      1          add        $0x1,%r15d
>       0      1          mov        0x110(%rax),%rax
>                         mov        0x128(%rax),%rcx
>       3      1          mov        0x8(%r13),%rax
>                         mov        0x110(%rax),%rdx
>                         mov        0x130(%rdx),%rdx
>       0      1          mov        0x140(%rax),%rax
>                         mov        0x110(%rax),%rsi
>       2      1          mov        -0x11e0(%rbp),%rax
>       6      6          sub        $0x1,%rax
>                         cmp        %rcx,%rax
>       0      1          setae      %cl
>       2      1          cmp        %rdx,%rax
>                         setb       %dl
>       0      1          test       %dl,%cl
>       0      1        ↓ jne        4b70
>                         cmp        %rsi,%rax
>                       ↓ je         4b70
>                         mov        (%rax),%eax
>      58     73          add        -0x1220(%rbp),%rax
>                         cmpb       $0x0,-0x1218(%rbp)
>                         mov        %rax,-0x1210(%rbp)
>       0      1        ↓ jne        4fa0
>                         mov        -0x11e0(%rbp),%r9
>                         lea        0x6(%rax),%rbx
>       1      0   545:   movzbl     (%rbx),%eax
>      10     13          cmp        $0x4c,%al
>                       ↓ ja         5026
>       1      0   550:   lea        v8::internal::FLAGDEFAULT_abort_on_contradictory_flags+0x457,%rdi
>                         movslq     (%rdi,%rax,4),%rax
>       9      8          add        %rdi,%rax
> 
> <SNIP>
> 
> So it seems to be working, what am I missing? Is this strictly non
> group related?
> 
Yes, it is non group related.
This problem occurs only when different events need to be recorded at 
the same time, i.e.:
perf record -e branch-misses -e branch-instructions -a sleep 1

The output results of perf script and perf annotate do not match.
Some events are not output in perf annotate.

> - Arnaldo
> .
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v7] perf annotate: Fix sample events lost in stdio mode
  2021-03-27  2:15       ` Yang Jihong
@ 2021-03-30  7:26         ` Namhyung Kim
  2021-03-31  2:18           ` Yang Jihong
  0 siblings, 1 reply; 10+ messages in thread
From: Namhyung Kim @ 2021-03-30  7:26 UTC (permalink / raw)
  To: Yang Jihong
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Yao Jin, gustavoars,
	mliska, linux-kernel, zhangjinhao2

Hi Yang and Arnaldo,

On Sat, Mar 27, 2021 at 11:16 AM Yang Jihong <yangjihong1@huawei.com> wrote:
> On 2021/3/26 20:06, Arnaldo Carvalho de Melo wrote:
> > So it seems to be working, what am I missing? Is this strictly non
> > group related?
> >
> Yes, it is non group related.
> This problem occurs only when different events need to be recorded at
> the same time, i.e.:
> perf record -e branch-misses -e branch-instructions -a sleep 1
>
> The output results of perf script and perf annotate do not match.
> Some events are not output in perf annotate.

Yeah I think it's related to sort keys.  The code works with a single
hist_entry for each event and symbol.  But the default sort key
creates multiple entries for different threads and it causes the
confusion.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v7] perf annotate: Fix sample events lost in stdio mode
  2021-03-30  7:26         ` Namhyung Kim
@ 2021-03-31  2:18           ` Yang Jihong
  2021-04-12  7:22             ` Yang Jihong
  0 siblings, 1 reply; 10+ messages in thread
From: Yang Jihong @ 2021-03-31  2:18 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Yao Jin, gustavoars,
	mliska, linux-kernel, zhangjinhao2

Hello, Namhyung  and Arnaldo,

On 2021/3/30 15:26, Namhyung Kim wrote:
> Hi Yang and Arnaldo,
> 
> On Sat, Mar 27, 2021 at 11:16 AM Yang Jihong <yangjihong1@huawei.com> wrote:
>> On 2021/3/26 20:06, Arnaldo Carvalho de Melo wrote:
>>> So it seems to be working, what am I missing? Is this strictly non
>>> group related?
>>>
>> Yes, it is non group related.
>> This problem occurs only when different events need to be recorded at
>> the same time, i.e.:
>> perf record -e branch-misses -e branch-instructions -a sleep 1
>>
>> The output results of perf script and perf annotate do not match.
>> Some events are not output in perf annotate.
> 
> Yeah I think it's related to sort keys.  The code works with a single
> hist_entry for each event and symbol.  But the default sort key
> creates multiple entries for different threads and it causes the
> confusion.
> 
Yes, After revome zfree from hists__find_annotations, the output of perf 
annotate is repeated, which is related to sort keys.

The original problem is that notes->src may correspond to multiple 
sample events. Therefore, we cannot simply zfree notes->src to avoid 
repeated output.

Arnaldo, is there any problem with this patch? :)

Thanks,
Yang
> Thanks,
> Namhyung
> .
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v7] perf annotate: Fix sample events lost in stdio mode
  2021-03-31  2:18           ` Yang Jihong
@ 2021-04-12  7:22             ` Yang Jihong
  2021-04-14 12:23               ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 10+ messages in thread
From: Yang Jihong @ 2021-04-12  7:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Yao Jin, gustavoars, mliska, linux-kernel,
	zhangjinhao2, Namhyung Kim

Hello, Arnaldo,

On 2021/3/31 10:18, Yang Jihong wrote:
> Hello, Namhyung  and Arnaldo,
> 
> On 2021/3/30 15:26, Namhyung Kim wrote:
>> Hi Yang and Arnaldo,
>>
>> On Sat, Mar 27, 2021 at 11:16 AM Yang Jihong <yangjihong1@huawei.com> 
>> wrote:
>>> On 2021/3/26 20:06, Arnaldo Carvalho de Melo wrote:
>>>> So it seems to be working, what am I missing? Is this strictly non
>>>> group related?
>>>>
>>> Yes, it is non group related.
>>> This problem occurs only when different events need to be recorded at
>>> the same time, i.e.:
>>> perf record -e branch-misses -e branch-instructions -a sleep 1
>>>
>>> The output results of perf script and perf annotate do not match.
>>> Some events are not output in perf annotate.
>>
>> Yeah I think it's related to sort keys.  The code works with a single
>> hist_entry for each event and symbol.  But the default sort key
>> creates multiple entries for different threads and it causes the
>> confusion.
>>
> Yes, After revome zfree from hists__find_annotations, the output of perf 
> annotate is repeated, which is related to sort keys.
> 
> The original problem is that notes->src may correspond to multiple 
> sample events. Therefore, we cannot simply zfree notes->src to avoid 
> repeated output.
> 
> Arnaldo, is there any problem with this patch? :)
> 
> Thanks,
> Yang
>> Thanks,
>> Namhyung
>> .
>>
PING :)
Is there any problem with this patch that needs to be modified?

Thanks,
Yang

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v7] perf annotate: Fix sample events lost in stdio mode
  2021-04-12  7:22             ` Yang Jihong
@ 2021-04-14 12:23               ` Arnaldo Carvalho de Melo
  2021-04-14 13:15                 ` Namhyung Kim
  0 siblings, 1 reply; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-04-14 12:23 UTC (permalink / raw)
  To: Yang Jihong
  Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Yao Jin, gustavoars, mliska, linux-kernel,
	zhangjinhao2, Namhyung Kim

Em Mon, Apr 12, 2021 at 03:22:29PM +0800, Yang Jihong escreveu:
> On 2021/3/31 10:18, Yang Jihong wrote:
> > On 2021/3/30 15:26, Namhyung Kim wrote:
> > > On Sat, Mar 27, 2021 at 11:16 AM Yang Jihong <yangjihong1@huawei.com> wrote:
> > > > On 2021/3/26 20:06, Arnaldo Carvalho de Melo wrote:
> > > > > So it seems to be working, what am I missing? Is this strictly non
> > > > > group related?

> > > > Yes, it is non group related.
> > > > This problem occurs only when different events need to be recorded at
> > > > the same time, i.e.:
> > > > perf record -e branch-misses -e branch-instructions -a sleep 1

> > > > The output results of perf script and perf annotate do not match.
> > > > Some events are not output in perf annotate.

> > > Yeah I think it's related to sort keys.  The code works with a single
> > > hist_entry for each event and symbol.  But the default sort key
> > > creates multiple entries for different threads and it causes the
> > > confusion.

> > Yes, After revome zfree from hists__find_annotations, the output of perf
> > annotate is repeated, which is related to sort keys.

> > The original problem is that notes->src may correspond to multiple
> > sample events. Therefore, we cannot simply zfree notes->src to avoid
> > repeated output.

> > Arnaldo, is there any problem with this patch? :)

> PING :)
> Is there any problem with this patch that needs to be modified?

I continue having a feeling this is kinda a bandaid, i.e. avoid the
problem, and since we have a way to work this when using a group, I fail
to see why it couldn't work when not grouping events.

But since I have no time to dive into this and Namhyung is ok with it,
I'll merge it now.

- Arnaldo

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v7] perf annotate: Fix sample events lost in stdio mode
  2021-04-14 12:23               ` Arnaldo Carvalho de Melo
@ 2021-04-14 13:15                 ` Namhyung Kim
  0 siblings, 0 replies; 10+ messages in thread
From: Namhyung Kim @ 2021-04-14 13:15 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Yang Jihong, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Yao Jin, gustavoars, mliska,
	linux-kernel, zhangjinhao2

Hi Arnaldo,

On Wed, Apr 14, 2021 at 9:23 PM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> Em Mon, Apr 12, 2021 at 03:22:29PM +0800, Yang Jihong escreveu:
> > On 2021/3/31 10:18, Yang Jihong wrote:
> > > On 2021/3/30 15:26, Namhyung Kim wrote:
> > > > On Sat, Mar 27, 2021 at 11:16 AM Yang Jihong <yangjihong1@huawei.com> wrote:
> > > > > On 2021/3/26 20:06, Arnaldo Carvalho de Melo wrote:
> > > > > > So it seems to be working, what am I missing? Is this strictly non
> > > > > > group related?
>
> > > > > Yes, it is non group related.
> > > > > This problem occurs only when different events need to be recorded at
> > > > > the same time, i.e.:
> > > > > perf record -e branch-misses -e branch-instructions -a sleep 1
>
> > > > > The output results of perf script and perf annotate do not match.
> > > > > Some events are not output in perf annotate.
>
> > > > Yeah I think it's related to sort keys.  The code works with a single
> > > > hist_entry for each event and symbol.  But the default sort key
> > > > creates multiple entries for different threads and it causes the
> > > > confusion.
>
> > > Yes, After revome zfree from hists__find_annotations, the output of perf
> > > annotate is repeated, which is related to sort keys.
>
> > > The original problem is that notes->src may correspond to multiple
> > > sample events. Therefore, we cannot simply zfree notes->src to avoid
> > > repeated output.
>
> > > Arnaldo, is there any problem with this patch? :)
>
> > PING :)
> > Is there any problem with this patch that needs to be modified?
>
> I continue having a feeling this is kinda a bandaid, i.e. avoid the
> problem, and since we have a way to work this when using a group, I fail
> to see why it couldn't work when not grouping events.

When we use a group, there's a single iteration only for the leader event.
But if not, it'll iterate the hist entries twice (for two events).
Each iteration
used to have multiple entries for the same symbol (due to the sort key),
so it marked the symbol (by freeing notes->src) to skip the same symbol
during the iteration.

However as the first iteration freed sym->notes->src, then the second
(or later) event cannot see the annotation for the deleted symbols
for that event even if it has some samples.

>
> But since I have no time to dive into this and Namhyung is ok with it,
> I'll merge it now.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-04-14 13:15 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-19 12:35 [PATCH v7] perf annotate: Fix sample events lost in stdio mode Yang Jihong
2021-03-26  2:24 ` Yang Jihong
2021-03-26  3:25   ` Namhyung Kim
2021-03-26 12:06     ` Arnaldo Carvalho de Melo
2021-03-27  2:15       ` Yang Jihong
2021-03-30  7:26         ` Namhyung Kim
2021-03-31  2:18           ` Yang Jihong
2021-04-12  7:22             ` Yang Jihong
2021-04-14 12:23               ` Arnaldo Carvalho de Melo
2021-04-14 13:15                 ` Namhyung Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).