* [PATCH 0/4] perf annotate: Bugfixes @ 2017-03-20 2:56 Taeung Song 2017-03-20 2:56 ` [PATCH 1/4] perf annotate: Use build-id dir when reading link name Taeung Song ` (3 more replies) 0 siblings, 4 replies; 19+ messages in thread From: Taeung Song @ 2017-03-20 2:56 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: linux-kernel, Jiri Olsa, Namhyung Kim, Ingo Molnar, Peter Zijlstra, Wang Nan, Masami Hiramatsu, Taeung Song Hi, perf-annotate has little bugs so I fix them. I'd appreciate some feedback on this patchset. :) Thanks, Taeung Taeung Song (4): perf annotate: Use build-id dir when reading link name perf annotate: Avoid division by zero when calculating percent perf annotate: Fix missing setting nr samples on source_line perf annotate: More exactly grep -v of the objdump command tools/perf/util/annotate.c | 26 ++++++++++++++++++++------ tools/perf/util/annotate.h | 2 +- 2 files changed, 21 insertions(+), 7 deletions(-) -- 2.7.4 ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 1/4] perf annotate: Use build-id dir when reading link name 2017-03-20 2:56 [PATCH 0/4] perf annotate: Bugfixes Taeung Song @ 2017-03-20 2:56 ` Taeung Song 2017-03-20 2:56 ` [PATCH 2/4] perf annotate: Avoid division by zero when calculating percent Taeung Song ` (2 subsequent siblings) 3 siblings, 0 replies; 19+ messages in thread From: Taeung Song @ 2017-03-20 2:56 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: linux-kernel, Jiri Olsa, Namhyung Kim, Ingo Molnar, Peter Zijlstra, Wang Nan, Masami Hiramatsu, Taeung Song, Jiri Olsa In dso__disassemble_filename() when reading link name from a build-id file, it is failed each time because a build-id file gotten from dso__build_id_filename() is not symbolic link. So use build-id directory path instead of a build-id file name. For example, if build-id file name gotten from dso__build_id_filename() is as below, /root/.debug/.build-id/4f/75c7d197c951659d1c1b8b5fd49bcdf8f3f8b1/elf instead of the above build-id file name, use the build-id dir path that is a symbolic link as below. /root/.debug/.build-id/4f/75c7d197c951659d1c1b8b5fd49bcdf8f3f8b1 Cc: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Taeung Song <treeze.taeung@gmail.com> --- tools/perf/util/annotate.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 273f21f..fc91c6b 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -1307,6 +1307,7 @@ static int dso__disassemble_filename(struct dso *dso, char *filename, size_t fil { char linkname[PATH_MAX]; char *build_id_filename; + char *build_id_path = NULL; if (dso->symtab_type == DSO_BINARY_TYPE__KALLSYMS && !dso__is_kcore(dso)) @@ -1322,8 +1323,14 @@ static int dso__disassemble_filename(struct dso *dso, char *filename, size_t fil goto fallback; } + build_id_path = strdup(filename); + if (!build_id_path) + return -1; + + dirname(build_id_path); + if (dso__is_kcore(dso) || - readlink(filename, linkname, sizeof(linkname)) < 0 || + readlink(build_id_path, linkname, sizeof(linkname)) < 0 || strstr(linkname, DSO__NAME_KALLSYMS) || access(filename, R_OK)) { fallback: @@ -1335,6 +1342,7 @@ static int dso__disassemble_filename(struct dso *dso, char *filename, size_t fil __symbol__join_symfs(filename, filename_size, dso->long_name); } + free(build_id_path); return 0; } -- 2.7.4 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 2/4] perf annotate: Avoid division by zero when calculating percent 2017-03-20 2:56 [PATCH 0/4] perf annotate: Bugfixes Taeung Song 2017-03-20 2:56 ` [PATCH 1/4] perf annotate: Use build-id dir when reading link name Taeung Song @ 2017-03-20 2:56 ` Taeung Song 2017-03-20 18:15 ` Arnaldo Carvalho de Melo 2017-03-20 2:56 ` [PATCH 3/4] perf annotate: Fix missing setting nr samples on source_line Taeung Song 2017-03-20 2:56 ` [PATCH 4/4] perf annotate: More exactly grep -v of the objdump command Taeung Song 3 siblings, 1 reply; 19+ messages in thread From: Taeung Song @ 2017-03-20 2:56 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: linux-kernel, Jiri Olsa, Namhyung Kim, Ingo Molnar, Peter Zijlstra, Wang Nan, Masami Hiramatsu, Taeung Song, Jiri Olsa Currently perf-annotate with --print-line can print -nan(0x8000000000000) because of division by zero when calculating percent. So if a sum of samples is zero, skip calculating percent. Before: $ perf annotate --stdio -l Sorted summary for file /home/taeung/workspace/a.out ---------------------------------------------- 32.89 -nan 7.04 a.c:38 25.14 -nan 0.00 a.c:34 16.26 -nan 56.34 a.c:31 15.88 -nan 1.41 a.c:37 5.67 -nan 0.00 a.c:39 1.13 -nan 35.21 a.c:26 0.95 -nan 0.00 a.c:44 0.57 -nan 0.00 a.c:32 Percent | Source code & Disassembly of a.out for cycles (529 samples) ----------------------------------------------------------------------------------------- : ... a.c:26 0.57 -nan 4.23 : 40081a: mov %edi,-0x24(%rbp) a.c:26 0.00 -nan 9.86 : 40081d: mov %rsi,-0x30(%rbp) ... After: $ perf annotate --stdio -l Sorted summary for file /home/taeung/workspace/a.out ---------------------------------------------- 32.89 0.00 7.04 a.c:38 25.14 0.00 0.00 a.c:34 16.26 0.00 56.34 a.c:31 15.88 0.00 1.41 a.c:37 5.67 0.00 0.00 a.c:39 1.13 0.00 35.21 a.c:26 0.95 0.00 0.00 a.c:44 0.57 0.00 0.00 a.c:32 Percent | Source code & Disassembly of old for cycles (529 samples) ----------------------------------------------------------------------------------------- : ... a.c:26 0.57 0.00 4.23 : 40081a: mov %edi,-0x24(%rbp) a.c:26 0.00 0.00 9.86 : 40081d: mov %rsi,-0x30(%rbp) ... Cc: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Taeung Song <treeze.taeung@gmail.com> --- tools/perf/util/annotate.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index fc91c6b..9bb43cd 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -1665,11 +1665,15 @@ static int symbol__get_source_line(struct symbol *sym, struct map *map, src_line->nr_pcnt = nr_pcnt; for (k = 0; k < nr_pcnt; k++) { + double percent = 0.0; + h = annotation__histogram(notes, evidx + k); - src_line->samples[k].percent = 100.0 * h->addr[i] / h->sum; + if (h->sum) + percent = 100.0 * h->addr[i] / h->sum; - if (src_line->samples[k].percent > percent_max) - percent_max = src_line->samples[k].percent; + if (percent > percent_max) + percent_max = percent; + src_line->samples[k].percent = percent; } if (percent_max <= 0.5) -- 2.7.4 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 2/4] perf annotate: Avoid division by zero when calculating percent 2017-03-20 2:56 ` [PATCH 2/4] perf annotate: Avoid division by zero when calculating percent Taeung Song @ 2017-03-20 18:15 ` Arnaldo Carvalho de Melo 2017-03-20 22:11 ` Taeung Song 0 siblings, 1 reply; 19+ messages in thread From: Arnaldo Carvalho de Melo @ 2017-03-20 18:15 UTC (permalink / raw) To: Taeung Song Cc: linux-kernel, Jiri Olsa, Namhyung Kim, Ingo Molnar, Peter Zijlstra, Wang Nan, Masami Hiramatsu, Jiri Olsa Em Mon, Mar 20, 2017 at 11:56:55AM +0900, Taeung Song escreveu: > Currently perf-annotate with --print-line can print > -nan(0x8000000000000) because of division by zero > when calculating percent. > > So if a sum of samples is zero, skip calculating percent. Tried to reproduce it here, couldn't, syswide record: [root@jouet ~]# perf evlist -v cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 [root@jouet ~]# perf annotate --stdio -l 2> /dev/null | grep -i nan [root@jouet ~]# Can you please send me a perf.data file with this problem? I have to go thru the code to see how this can take place... - Arnaldo > Before: > > $ perf annotate --stdio -l > > Sorted summary for file /home/taeung/workspace/a.out > ---------------------------------------------- > > 32.89 -nan 7.04 a.c:38 > 25.14 -nan 0.00 a.c:34 > 16.26 -nan 56.34 a.c:31 > 15.88 -nan 1.41 a.c:37 > 5.67 -nan 0.00 a.c:39 > 1.13 -nan 35.21 a.c:26 > 0.95 -nan 0.00 a.c:44 > 0.57 -nan 0.00 a.c:32 > Percent | Source code & Disassembly of a.out for cycles (529 samples) > ----------------------------------------------------------------------------------------- > : > ... > > a.c:26 0.57 -nan 4.23 : 40081a: mov %edi,-0x24(%rbp) > a.c:26 0.00 -nan 9.86 : 40081d: mov %rsi,-0x30(%rbp) > > ... > > After: > > $ perf annotate --stdio -l > > Sorted summary for file /home/taeung/workspace/a.out > ---------------------------------------------- > > 32.89 0.00 7.04 a.c:38 > 25.14 0.00 0.00 a.c:34 > 16.26 0.00 56.34 a.c:31 > 15.88 0.00 1.41 a.c:37 > 5.67 0.00 0.00 a.c:39 > 1.13 0.00 35.21 a.c:26 > 0.95 0.00 0.00 a.c:44 > 0.57 0.00 0.00 a.c:32 > Percent | Source code & Disassembly of old for cycles (529 samples) > ----------------------------------------------------------------------------------------- > : > ... > > a.c:26 0.57 0.00 4.23 : 40081a: mov %edi,-0x24(%rbp) > a.c:26 0.00 0.00 9.86 : 40081d: mov %rsi,-0x30(%rbp) > > ... > > Cc: Namhyung Kim <namhyung@kernel.org> > Cc: Jiri Olsa <jolsa@redhat.com> > Signed-off-by: Taeung Song <treeze.taeung@gmail.com> > --- > tools/perf/util/annotate.c | 10 +++++++--- > 1 file changed, 7 insertions(+), 3 deletions(-) > > diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c > index fc91c6b..9bb43cd 100644 > --- a/tools/perf/util/annotate.c > +++ b/tools/perf/util/annotate.c > @@ -1665,11 +1665,15 @@ static int symbol__get_source_line(struct symbol *sym, struct map *map, > src_line->nr_pcnt = nr_pcnt; > > for (k = 0; k < nr_pcnt; k++) { > + double percent = 0.0; > + > h = annotation__histogram(notes, evidx + k); > - src_line->samples[k].percent = 100.0 * h->addr[i] / h->sum; > + if (h->sum) > + percent = 100.0 * h->addr[i] / h->sum; > > - if (src_line->samples[k].percent > percent_max) > - percent_max = src_line->samples[k].percent; > + if (percent > percent_max) > + percent_max = percent; > + src_line->samples[k].percent = percent; > } > > if (percent_max <= 0.5) > -- > 2.7.4 ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/4] perf annotate: Avoid division by zero when calculating percent 2017-03-20 18:15 ` Arnaldo Carvalho de Melo @ 2017-03-20 22:11 ` Taeung Song 2017-03-20 22:20 ` Taeung Song 0 siblings, 1 reply; 19+ messages in thread From: Taeung Song @ 2017-03-20 22:11 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: linux-kernel, Jiri Olsa, Namhyung Kim, Ingo Molnar, Peter Zijlstra, Wang Nan, Masami Hiramatsu, Jiri Olsa Hi Arnaldo :) Here the perf.data is, https://www.dropbox.com/s/nr4nnv8g3cipluf/perf.data?dl=1&pl=1 I tested as below. $ perf record -e "{cycles,page-faults,branch-misses}" ./old <input.txt $ perf annotate --stdio -l -f 2> /dev/null | grep -i nan | head -3 29.04 -nan 1.52 old_pack_knapsack.c:34 28.27 -nan 0.00 old_pack_knapsack.c:38 16.37 -nan 0.00 old_pack_knapsack.c:37 Thanks, Taeung On 03/21/2017 03:15 AM, Arnaldo Carvalho de Melo wrote: > Em Mon, Mar 20, 2017 at 11:56:55AM +0900, Taeung Song escreveu: >> Currently perf-annotate with --print-line can print >> -nan(0x8000000000000) because of division by zero >> when calculating percent. >> >> So if a sum of samples is zero, skip calculating percent. > > Tried to reproduce it here, couldn't, syswide record: > > [root@jouet ~]# perf evlist -v > cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 > [root@jouet ~]# perf annotate --stdio -l 2> /dev/null | grep -i nan > [root@jouet ~]# > > Can you please send me a perf.data file with this problem? I have to go > thru the code to see how this can take place... > > - Arnaldo > > >> Before: >> >> $ perf annotate --stdio -l >> >> Sorted summary for file /home/taeung/workspace/a.out >> ---------------------------------------------- >> >> 32.89 -nan 7.04 a.c:38 >> 25.14 -nan 0.00 a.c:34 >> 16.26 -nan 56.34 a.c:31 >> 15.88 -nan 1.41 a.c:37 >> 5.67 -nan 0.00 a.c:39 >> 1.13 -nan 35.21 a.c:26 >> 0.95 -nan 0.00 a.c:44 >> 0.57 -nan 0.00 a.c:32 >> Percent | Source code & Disassembly of a.out for cycles (529 samples) >> ----------------------------------------------------------------------------------------- >> : >> ... >> >> a.c:26 0.57 -nan 4.23 : 40081a: mov %edi,-0x24(%rbp) >> a.c:26 0.00 -nan 9.86 : 40081d: mov %rsi,-0x30(%rbp) >> >> ... >> >> After: >> >> $ perf annotate --stdio -l >> >> Sorted summary for file /home/taeung/workspace/a.out >> ---------------------------------------------- >> >> 32.89 0.00 7.04 a.c:38 >> 25.14 0.00 0.00 a.c:34 >> 16.26 0.00 56.34 a.c:31 >> 15.88 0.00 1.41 a.c:37 >> 5.67 0.00 0.00 a.c:39 >> 1.13 0.00 35.21 a.c:26 >> 0.95 0.00 0.00 a.c:44 >> 0.57 0.00 0.00 a.c:32 >> Percent | Source code & Disassembly of old for cycles (529 samples) >> ----------------------------------------------------------------------------------------- >> : >> ... >> >> a.c:26 0.57 0.00 4.23 : 40081a: mov %edi,-0x24(%rbp) >> a.c:26 0.00 0.00 9.86 : 40081d: mov %rsi,-0x30(%rbp) >> >> ... >> >> Cc: Namhyung Kim <namhyung@kernel.org> >> Cc: Jiri Olsa <jolsa@redhat.com> >> Signed-off-by: Taeung Song <treeze.taeung@gmail.com> >> --- >> tools/perf/util/annotate.c | 10 +++++++--- >> 1 file changed, 7 insertions(+), 3 deletions(-) >> >> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c >> index fc91c6b..9bb43cd 100644 >> --- a/tools/perf/util/annotate.c >> +++ b/tools/perf/util/annotate.c >> @@ -1665,11 +1665,15 @@ static int symbol__get_source_line(struct symbol *sym, struct map *map, >> src_line->nr_pcnt = nr_pcnt; >> >> for (k = 0; k < nr_pcnt; k++) { >> + double percent = 0.0; >> + >> h = annotation__histogram(notes, evidx + k); >> - src_line->samples[k].percent = 100.0 * h->addr[i] / h->sum; >> + if (h->sum) >> + percent = 100.0 * h->addr[i] / h->sum; >> >> - if (src_line->samples[k].percent > percent_max) >> - percent_max = src_line->samples[k].percent; >> + if (percent > percent_max) >> + percent_max = percent; >> + src_line->samples[k].percent = percent; >> } >> >> if (percent_max <= 0.5) >> -- >> 2.7.4 ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/4] perf annotate: Avoid division by zero when calculating percent 2017-03-20 22:11 ` Taeung Song @ 2017-03-20 22:20 ` Taeung Song 2017-03-21 14:14 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 19+ messages in thread From: Taeung Song @ 2017-03-20 22:20 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: linux-kernel, Jiri Olsa, Namhyung Kim, Ingo Molnar, Peter Zijlstra, Wang Nan, Masami Hiramatsu, Jiri Olsa And, I tested by perf-stat on the same situation as below. $ perf stat -e "{cycles,page-faults,branch-misses}" ./old <input.txt 6623856 Performance counter stats for './old': 472,007,763 cycles (99.85%) 71 page-faults (99.85%) 220,073 branch-misses (99.85%) 0.170768608 seconds time elapsed Many times, the number of samples 'page-faults' was 68 ~ 71. In spite of it, how did the below 'h->sum' is zero.. util/annotate.c:1660~1661 1660 h = annotation__histogram(notes, evidx + k); 1661 src_line->samples[k].percent = 100.0 * h->addr[i] / h->sum; This patch just add if statement 'if (h->sum)' to handle the case that h->sum is zero. But now I wonder how h->sum could be zero.. I'll dig the problem to find the root cause of it, too ! Thanks, Taeung On 03/21/2017 07:11 AM, Taeung Song wrote: > Hi Arnaldo :) > > Here the perf.data is, > https://www.dropbox.com/s/nr4nnv8g3cipluf/perf.data?dl=1&pl=1 > > I tested as below. > > $ perf record -e "{cycles,page-faults,branch-misses}" ./old <input.txt > > $ perf annotate --stdio -l -f 2> /dev/null | grep -i nan | head -3 > 29.04 -nan 1.52 old_pack_knapsack.c:34 > 28.27 -nan 0.00 old_pack_knapsack.c:38 > 16.37 -nan 0.00 old_pack_knapsack.c:37 > > > Thanks, > Taeung > > On 03/21/2017 03:15 AM, Arnaldo Carvalho de Melo wrote: >> Em Mon, Mar 20, 2017 at 11:56:55AM +0900, Taeung Song escreveu: >>> Currently perf-annotate with --print-line can print >>> -nan(0x8000000000000) because of division by zero >>> when calculating percent. >>> >>> So if a sum of samples is zero, skip calculating percent. >> >> Tried to reproduce it here, couldn't, syswide record: >> >> [root@jouet ~]# perf evlist -v >> cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: >> IP|TID|TIME|CPU|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, >> freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, >> comm_exec: 1 >> [root@jouet ~]# perf annotate --stdio -l 2> /dev/null | grep -i nan >> [root@jouet ~]# >> >> Can you please send me a perf.data file with this problem? I have to go >> thru the code to see how this can take place... >> >> - Arnaldo >> >> >>> Before: >>> >>> $ perf annotate --stdio -l >>> >>> Sorted summary for file /home/taeung/workspace/a.out >>> ---------------------------------------------- >>> >>> 32.89 -nan 7.04 a.c:38 >>> 25.14 -nan 0.00 a.c:34 >>> 16.26 -nan 56.34 a.c:31 >>> 15.88 -nan 1.41 a.c:37 >>> 5.67 -nan 0.00 a.c:39 >>> 1.13 -nan 35.21 a.c:26 >>> 0.95 -nan 0.00 a.c:44 >>> 0.57 -nan 0.00 a.c:32 >>> Percent | Source code & Disassembly of a.out >>> for cycles (529 samples) >>> ----------------------------------------------------------------------------------------- >>> >>> : >>> ... >>> >>> a.c:26 0.57 -nan 4.23 : 40081a: mov >>> %edi,-0x24(%rbp) >>> a.c:26 0.00 -nan 9.86 : 40081d: mov >>> %rsi,-0x30(%rbp) >>> >>> ... >>> >>> After: >>> >>> $ perf annotate --stdio -l >>> >>> Sorted summary for file /home/taeung/workspace/a.out >>> ---------------------------------------------- >>> >>> 32.89 0.00 7.04 a.c:38 >>> 25.14 0.00 0.00 a.c:34 >>> 16.26 0.00 56.34 a.c:31 >>> 15.88 0.00 1.41 a.c:37 >>> 5.67 0.00 0.00 a.c:39 >>> 1.13 0.00 35.21 a.c:26 >>> 0.95 0.00 0.00 a.c:44 >>> 0.57 0.00 0.00 a.c:32 >>> Percent | Source code & Disassembly of old for >>> cycles (529 samples) >>> ----------------------------------------------------------------------------------------- >>> >>> : >>> ... >>> >>> a.c:26 0.57 0.00 4.23 : 40081a: mov >>> %edi,-0x24(%rbp) >>> a.c:26 0.00 0.00 9.86 : 40081d: mov >>> %rsi,-0x30(%rbp) >>> >>> ... >>> >>> Cc: Namhyung Kim <namhyung@kernel.org> >>> Cc: Jiri Olsa <jolsa@redhat.com> >>> Signed-off-by: Taeung Song <treeze.taeung@gmail.com> >>> --- >>> tools/perf/util/annotate.c | 10 +++++++--- >>> 1 file changed, 7 insertions(+), 3 deletions(-) >>> >>> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c >>> index fc91c6b..9bb43cd 100644 >>> --- a/tools/perf/util/annotate.c >>> +++ b/tools/perf/util/annotate.c >>> @@ -1665,11 +1665,15 @@ static int symbol__get_source_line(struct >>> symbol *sym, struct map *map, >>> src_line->nr_pcnt = nr_pcnt; >>> >>> for (k = 0; k < nr_pcnt; k++) { >>> + double percent = 0.0; >>> + >>> h = annotation__histogram(notes, evidx + k); >>> - src_line->samples[k].percent = 100.0 * h->addr[i] / h->sum; >>> + if (h->sum) >>> + percent = 100.0 * h->addr[i] / h->sum; >>> >>> - if (src_line->samples[k].percent > percent_max) >>> - percent_max = src_line->samples[k].percent; >>> + if (percent > percent_max) >>> + percent_max = percent; >>> + src_line->samples[k].percent = percent; >>> } >>> >>> if (percent_max <= 0.5) >>> -- >>> 2.7.4 ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/4] perf annotate: Avoid division by zero when calculating percent 2017-03-20 22:20 ` Taeung Song @ 2017-03-21 14:14 ` Arnaldo Carvalho de Melo 2017-03-21 14:21 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 19+ messages in thread From: Arnaldo Carvalho de Melo @ 2017-03-21 14:14 UTC (permalink / raw) To: Taeung Song Cc: Arnaldo Carvalho de Melo, linux-kernel, Jiri Olsa, Namhyung Kim, Ingo Molnar, Peter Zijlstra, Wang Nan, Masami Hiramatsu, Jiri Olsa Em Tue, Mar 21, 2017 at 07:20:20AM +0900, Taeung Song escreveu: > And, > I tested by perf-stat on the same situation as below. > > $ perf stat -e "{cycles,page-faults,branch-misses}" ./old <input.txt > 6623856 Please always try to spell out all the steps needed to get to some result, for instance, in this case the info above, that you are asking for three counters to be recorded at once probably has the key to reproduce this, as I think that you may run your workload and sometimes not get one page fault, leading tho that division by zero, but I have to try to reproduce it now that I have this clue. Thanks, - Arnaldo > Performance counter stats for './old': > > 472,007,763 cycles (99.85%) > 71 page-faults (99.85%) > 220,073 branch-misses (99.85%) > > 0.170768608 seconds time elapsed > > Many times, the number of samples 'page-faults' was 68 ~ 71. > In spite of it, how did the below 'h->sum' is zero.. > > util/annotate.c:1660~1661 > > 1660 h = annotation__histogram(notes, evidx + k); > 1661 src_line->samples[k].percent = 100.0 * h->addr[i] / h->sum; > > > This patch just add if statement 'if (h->sum)' to handle the case > that h->sum is zero. But now I wonder how h->sum could be zero.. > > I'll dig the problem to find the root cause of it, too ! > > Thanks, > Taeung > > On 03/21/2017 07:11 AM, Taeung Song wrote: > > Hi Arnaldo :) > > > > Here the perf.data is, > > https://www.dropbox.com/s/nr4nnv8g3cipluf/perf.data?dl=1&pl=1 > > > > I tested as below. > > > > $ perf record -e "{cycles,page-faults,branch-misses}" ./old <input.txt > > > > $ perf annotate --stdio -l -f 2> /dev/null | grep -i nan | head -3 > > 29.04 -nan 1.52 old_pack_knapsack.c:34 > > 28.27 -nan 0.00 old_pack_knapsack.c:38 > > 16.37 -nan 0.00 old_pack_knapsack.c:37 > > > > > > Thanks, > > Taeung > > > > On 03/21/2017 03:15 AM, Arnaldo Carvalho de Melo wrote: > > > Em Mon, Mar 20, 2017 at 11:56:55AM +0900, Taeung Song escreveu: > > > > Currently perf-annotate with --print-line can print > > > > -nan(0x8000000000000) because of division by zero > > > > when calculating percent. > > > > > > > > So if a sum of samples is zero, skip calculating percent. > > > > > > Tried to reproduce it here, couldn't, syswide record: > > > > > > [root@jouet ~]# perf evlist -v > > > cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: > > > IP|TID|TIME|CPU|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, > > > freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, > > > comm_exec: 1 > > > [root@jouet ~]# perf annotate --stdio -l 2> /dev/null | grep -i nan > > > [root@jouet ~]# > > > > > > Can you please send me a perf.data file with this problem? I have to go > > > thru the code to see how this can take place... > > > > > > - Arnaldo > > > > > > > > > > Before: > > > > > > > > $ perf annotate --stdio -l > > > > > > > > Sorted summary for file /home/taeung/workspace/a.out > > > > ---------------------------------------------- > > > > > > > > 32.89 -nan 7.04 a.c:38 > > > > 25.14 -nan 0.00 a.c:34 > > > > 16.26 -nan 56.34 a.c:31 > > > > 15.88 -nan 1.41 a.c:37 > > > > 5.67 -nan 0.00 a.c:39 > > > > 1.13 -nan 35.21 a.c:26 > > > > 0.95 -nan 0.00 a.c:44 > > > > 0.57 -nan 0.00 a.c:32 > > > > Percent | Source code & Disassembly of a.out > > > > for cycles (529 samples) > > > > ----------------------------------------------------------------------------------------- > > > > > > > > : > > > > ... > > > > > > > > a.c:26 0.57 -nan 4.23 : 40081a: mov > > > > %edi,-0x24(%rbp) > > > > a.c:26 0.00 -nan 9.86 : 40081d: mov > > > > %rsi,-0x30(%rbp) > > > > > > > > ... > > > > > > > > After: > > > > > > > > $ perf annotate --stdio -l > > > > > > > > Sorted summary for file /home/taeung/workspace/a.out > > > > ---------------------------------------------- > > > > > > > > 32.89 0.00 7.04 a.c:38 > > > > 25.14 0.00 0.00 a.c:34 > > > > 16.26 0.00 56.34 a.c:31 > > > > 15.88 0.00 1.41 a.c:37 > > > > 5.67 0.00 0.00 a.c:39 > > > > 1.13 0.00 35.21 a.c:26 > > > > 0.95 0.00 0.00 a.c:44 > > > > 0.57 0.00 0.00 a.c:32 > > > > Percent | Source code & Disassembly of old for > > > > cycles (529 samples) > > > > ----------------------------------------------------------------------------------------- > > > > > > > > : > > > > ... > > > > > > > > a.c:26 0.57 0.00 4.23 : 40081a: mov > > > > %edi,-0x24(%rbp) > > > > a.c:26 0.00 0.00 9.86 : 40081d: mov > > > > %rsi,-0x30(%rbp) > > > > > > > > ... > > > > > > > > Cc: Namhyung Kim <namhyung@kernel.org> > > > > Cc: Jiri Olsa <jolsa@redhat.com> > > > > Signed-off-by: Taeung Song <treeze.taeung@gmail.com> > > > > --- > > > > tools/perf/util/annotate.c | 10 +++++++--- > > > > 1 file changed, 7 insertions(+), 3 deletions(-) > > > > > > > > diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c > > > > index fc91c6b..9bb43cd 100644 > > > > --- a/tools/perf/util/annotate.c > > > > +++ b/tools/perf/util/annotate.c > > > > @@ -1665,11 +1665,15 @@ static int symbol__get_source_line(struct > > > > symbol *sym, struct map *map, > > > > src_line->nr_pcnt = nr_pcnt; > > > > > > > > for (k = 0; k < nr_pcnt; k++) { > > > > + double percent = 0.0; > > > > + > > > > h = annotation__histogram(notes, evidx + k); > > > > - src_line->samples[k].percent = 100.0 * h->addr[i] / h->sum; > > > > + if (h->sum) > > > > + percent = 100.0 * h->addr[i] / h->sum; > > > > > > > > - if (src_line->samples[k].percent > percent_max) > > > > - percent_max = src_line->samples[k].percent; > > > > + if (percent > percent_max) > > > > + percent_max = percent; > > > > + src_line->samples[k].percent = percent; > > > > } > > > > > > > > if (percent_max <= 0.5) > > > > -- > > > > 2.7.4 ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/4] perf annotate: Avoid division by zero when calculating percent 2017-03-21 14:14 ` Arnaldo Carvalho de Melo @ 2017-03-21 14:21 ` Arnaldo Carvalho de Melo 2017-03-21 14:36 ` Taeung Song 2017-03-22 12:00 ` Taeung Song 0 siblings, 2 replies; 19+ messages in thread From: Arnaldo Carvalho de Melo @ 2017-03-21 14:21 UTC (permalink / raw) To: Taeung Song Cc: Arnaldo Carvalho de Melo, linux-kernel, Jiri Olsa, Namhyung Kim, Ingo Molnar, Peter Zijlstra, Wang Nan, Masami Hiramatsu, Jiri Olsa Em Tue, Mar 21, 2017 at 11:14:07AM -0300, Arnaldo Carvalho de Melo escreveu: > Em Tue, Mar 21, 2017 at 07:20:20AM +0900, Taeung Song escreveu: > > And, > > I tested by perf-stat on the same situation as below. > > > > $ perf stat -e "{cycles,page-faults,branch-misses}" ./old <input.txt > > 6623856 > > Please always try to spell out all the steps needed to get to some > result, for instance, in this case the info above, that you are asking > for three counters to be recorded at once probably has the key to > reproduce this, as I think that you may run your workload and sometimes > not get one page fault, leading tho that division by zero, but I have to > try to reproduce it now that I have this clue. Bingo! [root@jouet ~]# perf record -e "{cycles,page-faults,branch-misses}" usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.020 MB perf.data (20 samples) ] [root@jouet ~]# perf annotate -l --stdio 2>&1 | grep -w nan | head 100.00 -nan 100.00 msr.h:104 0.00 -nan 0.00 : ffffffff81060bf0: mov %edi,%ecx 0.00 -nan 0.00 : ffffffff81060bf2: mov %esi,%eax 0.00 -nan 0.00 : ffffffff81060bf4: wrmsr msr.h:104 100.00 -nan 100.00 : ffffffff81060bf6: nopl 0x0(%rax,%rax,1) 0.00 -nan 0.00 : ffffffff81060bfb: retq 0.00 -nan 0.00 : ffffffff81060bfc: push %rbp 0.00 -nan 0.00 : ffffffff81060bfd: mov %esi,%eax 0.00 -nan 0.00 : ffffffff81060bff: mov %rdx,%rsi 0.00 -nan 0.00 : ffffffff81060c02: shl $0x20,%rsi [root@jouet ~]# So, if we don't use -l/--print-line we get: [root@jouet ~]# perf annotate --stdio --no-source native_write_msr Percent | Source code & Disassembly of vmlinux for cycles (3 samples) ------------------------------------------------------------------------------------------- : ffffffff81060bf0 <native_write_msr>: : __wrmsr(): 0.00 0.00 0.00 : ffffffff81060bf0: mov %edi,%ecx 0.00 0.00 0.00 : ffffffff81060bf2: mov %esi,%eax 0.00 0.00 0.00 : ffffffff81060bf4: wrmsr : arch_static_branch(): 100.00 0.00 100.00 : ffffffff81060bf6: nopl 0x0(%rax,%rax,1) : native_write_msr(): 0.00 0.00 0.00 : ffffffff81060bfb: retq 0.00 0.00 0.00 : ffffffff81060bfc: push %rbp 0.00 0.00 0.00 : ffffffff81060bfd: mov %esi,%eax 0.00 0.00 0.00 : ffffffff81060bff: mov %rdx,%rsi 0.00 0.00 0.00 : ffffffff81060c02: shl $0x20,%rsi 0.00 0.00 0.00 : ffffffff81060c06: mov %eax,%edx 0.00 0.00 0.00 : ffffffff81060c08: mov %rsp,%rbp 0.00 0.00 0.00 : ffffffff81060c0b: or %rdx,%rsi 0.00 0.00 0.00 : ffffffff81060c0e: xor %edx,%edx 0.00 0.00 0.00 : ffffffff81060c10: callq ffffffff81457df0 <do_trace_write_msr> 0.00 0.00 0.00 : ffffffff81060c15: pop %rbp 0.00 0.00 0.00 : ffffffff81060c16: retq [root@jouet ~]# I.e. one column per event in that group, and no -nan, all is well. But if we ask for --lines it gets b0rked: [root@jouet ~]# perf annotate --stdio --no-source native_write_msr --print-line Sorted summary for file /lib/modules/4.11.0-rc2+/build/vmlinux ---------------------------------------------- 100.00 -nan 100.00 msr.h:104 Percent | Source code & Disassembly of vmlinux for cycles (3 samples) ------------------------------------------------------------------------------------------- : ffffffff81060bf0 <native_write_msr>: : __wrmsr(): 0.00 -nan 0.00 : ffffffff81060bf0: mov %edi,%ecx 0.00 -nan 0.00 : ffffffff81060bf2: mov %esi,%eax 0.00 -nan 0.00 : ffffffff81060bf4: wrmsr : arch_static_branch(): msr.h:104 100.00 -nan 100.00 : ffffffff81060bf6: nopl 0x0(%rax,%rax,1) : native_write_msr(): 0.00 -nan 0.00 : ffffffff81060bfb: retq 0.00 -nan 0.00 : ffffffff81060bfc: push %rbp 0.00 -nan 0.00 : ffffffff81060bfd: mov %esi,%eax 0.00 -nan 0.00 : ffffffff81060bff: mov %rdx,%rsi 0.00 -nan 0.00 : ffffffff81060c02: shl $0x20,%rsi 0.00 -nan 0.00 : ffffffff81060c06: mov %eax,%edx 0.00 -nan 0.00 : ffffffff81060c08: mov %rsp,%rbp 0.00 -nan 0.00 : ffffffff81060c0b: or %rdx,%rsi 0.00 -nan 0.00 : ffffffff81060c0e: xor %edx,%edx 0.00 -nan 0.00 : ffffffff81060c10: callq ffffffff81457df0 <do_trace_write_msr> 0.00 -nan 0.00 : ffffffff81060c15: pop %rbp 0.00 -nan 0.00 : ffffffff81060c16: retq [root@jouet ~]# I don't have that much time to continue pursuing this, try to continue investigating, I'll try to help and to fix this later. - Arnaldo ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/4] perf annotate: Avoid division by zero when calculating percent 2017-03-21 14:21 ` Arnaldo Carvalho de Melo @ 2017-03-21 14:36 ` Taeung Song 2017-03-22 12:00 ` Taeung Song 1 sibling, 0 replies; 19+ messages in thread From: Taeung Song @ 2017-03-21 14:36 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: linux-kernel, Jiri Olsa, Namhyung Kim, Ingo Molnar, Peter Zijlstra, Wang Nan, Masami Hiramatsu, Jiri Olsa Hi, On 03/21/2017 11:21 PM, Arnaldo Carvalho de Melo wrote: > Em Tue, Mar 21, 2017 at 11:14:07AM -0300, Arnaldo Carvalho de Melo escreveu: >> Em Tue, Mar 21, 2017 at 07:20:20AM +0900, Taeung Song escreveu: >>> And, >>> I tested by perf-stat on the same situation as below. >>> >>> $ perf stat -e "{cycles,page-faults,branch-misses}" ./old <input.txt >>> 6623856 >> >> Please always try to spell out all the steps needed to get to some >> result, for instance, in this case the info above, that you are asking >> for three counters to be recorded at once probably has the key to >> reproduce this, as I think that you may run your workload and sometimes >> not get one page fault, leading tho that division by zero, but I have to >> try to reproduce it now that I have this clue. > > Bingo! > > [root@jouet ~]# perf record -e "{cycles,page-faults,branch-misses}" usleep 1 > [ perf record: Woken up 1 times to write data ] > [ perf record: Captured and wrote 0.020 MB perf.data (20 samples) ] > [root@jouet ~]# perf annotate -l --stdio 2>&1 | grep -w nan | head > 100.00 -nan 100.00 msr.h:104 > 0.00 -nan 0.00 : ffffffff81060bf0: mov %edi,%ecx > 0.00 -nan 0.00 : ffffffff81060bf2: mov %esi,%eax > 0.00 -nan 0.00 : ffffffff81060bf4: wrmsr > msr.h:104 100.00 -nan 100.00 : ffffffff81060bf6: nopl 0x0(%rax,%rax,1) > 0.00 -nan 0.00 : ffffffff81060bfb: retq > 0.00 -nan 0.00 : ffffffff81060bfc: push %rbp > 0.00 -nan 0.00 : ffffffff81060bfd: mov %esi,%eax > 0.00 -nan 0.00 : ffffffff81060bff: mov %rdx,%rsi > 0.00 -nan 0.00 : ffffffff81060c02: shl $0x20,%rsi > [root@jouet ~]# > > So, if we don't use -l/--print-line we get: > > [root@jouet ~]# perf annotate --stdio --no-source native_write_msr > Percent | Source code & Disassembly of vmlinux for cycles (3 samples) > ------------------------------------------------------------------------------------------- > : ffffffff81060bf0 <native_write_msr>: > : __wrmsr(): > 0.00 0.00 0.00 : ffffffff81060bf0: mov %edi,%ecx > 0.00 0.00 0.00 : ffffffff81060bf2: mov %esi,%eax > 0.00 0.00 0.00 : ffffffff81060bf4: wrmsr > : arch_static_branch(): > 100.00 0.00 100.00 : ffffffff81060bf6: nopl 0x0(%rax,%rax,1) > : native_write_msr(): > 0.00 0.00 0.00 : ffffffff81060bfb: retq > 0.00 0.00 0.00 : ffffffff81060bfc: push %rbp > 0.00 0.00 0.00 : ffffffff81060bfd: mov %esi,%eax > 0.00 0.00 0.00 : ffffffff81060bff: mov %rdx,%rsi > 0.00 0.00 0.00 : ffffffff81060c02: shl $0x20,%rsi > 0.00 0.00 0.00 : ffffffff81060c06: mov %eax,%edx > 0.00 0.00 0.00 : ffffffff81060c08: mov %rsp,%rbp > 0.00 0.00 0.00 : ffffffff81060c0b: or %rdx,%rsi > 0.00 0.00 0.00 : ffffffff81060c0e: xor %edx,%edx > 0.00 0.00 0.00 : ffffffff81060c10: callq ffffffff81457df0 <do_trace_write_msr> > 0.00 0.00 0.00 : ffffffff81060c15: pop %rbp > 0.00 0.00 0.00 : ffffffff81060c16: retq > [root@jouet ~]# > > I.e. one column per event in that group, and no -nan, all is well. > > But if we ask for --lines it gets b0rked: > > [root@jouet ~]# perf annotate --stdio --no-source native_write_msr --print-line > > Sorted summary for file /lib/modules/4.11.0-rc2+/build/vmlinux > ---------------------------------------------- > > 100.00 -nan 100.00 msr.h:104 > Percent | Source code & Disassembly of vmlinux for cycles (3 samples) > ------------------------------------------------------------------------------------------- > : ffffffff81060bf0 <native_write_msr>: > : __wrmsr(): > 0.00 -nan 0.00 : ffffffff81060bf0: mov %edi,%ecx > 0.00 -nan 0.00 : ffffffff81060bf2: mov %esi,%eax > 0.00 -nan 0.00 : ffffffff81060bf4: wrmsr > : arch_static_branch(): > msr.h:104 100.00 -nan 100.00 : ffffffff81060bf6: nopl 0x0(%rax,%rax,1) > : native_write_msr(): > 0.00 -nan 0.00 : ffffffff81060bfb: retq > 0.00 -nan 0.00 : ffffffff81060bfc: push %rbp > 0.00 -nan 0.00 : ffffffff81060bfd: mov %esi,%eax > 0.00 -nan 0.00 : ffffffff81060bff: mov %rdx,%rsi > 0.00 -nan 0.00 : ffffffff81060c02: shl $0x20,%rsi > 0.00 -nan 0.00 : ffffffff81060c06: mov %eax,%edx > 0.00 -nan 0.00 : ffffffff81060c08: mov %rsp,%rbp > 0.00 -nan 0.00 : ffffffff81060c0b: or %rdx,%rsi > 0.00 -nan 0.00 : ffffffff81060c0e: xor %edx,%edx > 0.00 -nan 0.00 : ffffffff81060c10: callq ffffffff81457df0 <do_trace_write_msr> > 0.00 -nan 0.00 : ffffffff81060c15: pop %rbp > 0.00 -nan 0.00 : ffffffff81060c16: retq > [root@jouet ~]# > > I don't have that much time to continue pursuing this, try to continue > investigating, I'll try to help and to fix this later. > Okey, I found the cause about -nan and sent this patch to you. But do you want to get other way to fix this problem ? instead of this patch ? Or change the commit message ? Thanks, Taeung ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/4] perf annotate: Avoid division by zero when calculating percent 2017-03-21 14:21 ` Arnaldo Carvalho de Melo 2017-03-21 14:36 ` Taeung Song @ 2017-03-22 12:00 ` Taeung Song 1 sibling, 0 replies; 19+ messages in thread From: Taeung Song @ 2017-03-22 12:00 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: linux-kernel, Jiri Olsa, Namhyung Kim, Ingo Molnar, Peter Zijlstra, Wang Nan, Masami Hiramatsu, Jiri Olsa On 03/21/2017 11:21 PM, Arnaldo Carvalho de Melo wrote: > Em Tue, Mar 21, 2017 at 11:14:07AM -0300, Arnaldo Carvalho de Melo escreveu: >> Em Tue, Mar 21, 2017 at 07:20:20AM +0900, Taeung Song escreveu: >>> And, >>> I tested by perf-stat on the same situation as below. >>> >>> $ perf stat -e "{cycles,page-faults,branch-misses}" ./old <input.txt >>> 6623856 >> >> Please always try to spell out all the steps needed to get to some >> result, for instance, in this case the info above, that you are asking >> for three counters to be recorded at once probably has the key to >> reproduce this, as I think that you may run your workload and sometimes >> not get one page fault, leading tho that division by zero, but I have to >> try to reproduce it now that I have this clue. > > Bingo! > > [root@jouet ~]# perf record -e "{cycles,page-faults,branch-misses}" usleep 1 > [ perf record: Woken up 1 times to write data ] > [ perf record: Captured and wrote 0.020 MB perf.data (20 samples) ] > [root@jouet ~]# perf annotate -l --stdio 2>&1 | grep -w nan | head > 100.00 -nan 100.00 msr.h:104 > 0.00 -nan 0.00 : ffffffff81060bf0: mov %edi,%ecx > 0.00 -nan 0.00 : ffffffff81060bf2: mov %esi,%eax > 0.00 -nan 0.00 : ffffffff81060bf4: wrmsr > msr.h:104 100.00 -nan 100.00 : ffffffff81060bf6: nopl 0x0(%rax,%rax,1) > 0.00 -nan 0.00 : ffffffff81060bfb: retq > 0.00 -nan 0.00 : ffffffff81060bfc: push %rbp > 0.00 -nan 0.00 : ffffffff81060bfd: mov %esi,%eax > 0.00 -nan 0.00 : ffffffff81060bff: mov %rdx,%rsi > 0.00 -nan 0.00 : ffffffff81060c02: shl $0x20,%rsi > [root@jouet ~]# > > So, if we don't use -l/--print-line we get: > > [root@jouet ~]# perf annotate --stdio --no-source native_write_msr > Percent | Source code & Disassembly of vmlinux for cycles (3 samples) > ------------------------------------------------------------------------------------------- > : ffffffff81060bf0 <native_write_msr>: > : __wrmsr(): > 0.00 0.00 0.00 : ffffffff81060bf0: mov %edi,%ecx > 0.00 0.00 0.00 : ffffffff81060bf2: mov %esi,%eax > 0.00 0.00 0.00 : ffffffff81060bf4: wrmsr > : arch_static_branch(): > 100.00 0.00 100.00 : ffffffff81060bf6: nopl 0x0(%rax,%rax,1) > : native_write_msr(): > 0.00 0.00 0.00 : ffffffff81060bfb: retq > 0.00 0.00 0.00 : ffffffff81060bfc: push %rbp > 0.00 0.00 0.00 : ffffffff81060bfd: mov %esi,%eax > 0.00 0.00 0.00 : ffffffff81060bff: mov %rdx,%rsi > 0.00 0.00 0.00 : ffffffff81060c02: shl $0x20,%rsi > 0.00 0.00 0.00 : ffffffff81060c06: mov %eax,%edx > 0.00 0.00 0.00 : ffffffff81060c08: mov %rsp,%rbp > 0.00 0.00 0.00 : ffffffff81060c0b: or %rdx,%rsi > 0.00 0.00 0.00 : ffffffff81060c0e: xor %edx,%edx > 0.00 0.00 0.00 : ffffffff81060c10: callq ffffffff81457df0 <do_trace_write_msr> > 0.00 0.00 0.00 : ffffffff81060c15: pop %rbp > 0.00 0.00 0.00 : ffffffff81060c16: retq > [root@jouet ~]# > > I.e. one column per event in that group, and no -nan, all is well. The reason is the below if statement at util/annotate.c:910. 881 double disasm__calc_percent(struct annotation *notes, int evidx, s64 offset, ... 904 struct sym_hist *h = annotation__histogram(notes, evidx); 905 unsigned int hits = 0; 906 907 while (offset < end) 908 hits += h->addr[offset++]; 909 910 if (h->sum) { 911 *nr_samples = hits; 912 percent = 100.0 * hits / h->sum; 913 } 914 } The if statement 'if (h->sum)' prevent division by zero, so all is well. So I sent this patch because of same reason adding the if statement 'if (h->sum)' in symbol__get_source_line(). But will resend v2 with more clearly commit log message! I think previous log message is a bit insufficient.. Thanks, Taeung > > But if we ask for --lines it gets b0rked: > > [root@jouet ~]# perf annotate --stdio --no-source native_write_msr --print-line > > Sorted summary for file /lib/modules/4.11.0-rc2+/build/vmlinux > ---------------------------------------------- > > 100.00 -nan 100.00 msr.h:104 > Percent | Source code & Disassembly of vmlinux for cycles (3 samples) > ------------------------------------------------------------------------------------------- > : ffffffff81060bf0 <native_write_msr>: > : __wrmsr(): > 0.00 -nan 0.00 : ffffffff81060bf0: mov %edi,%ecx > 0.00 -nan 0.00 : ffffffff81060bf2: mov %esi,%eax > 0.00 -nan 0.00 : ffffffff81060bf4: wrmsr > : arch_static_branch(): > msr.h:104 100.00 -nan 100.00 : ffffffff81060bf6: nopl 0x0(%rax,%rax,1) > : native_write_msr(): > 0.00 -nan 0.00 : ffffffff81060bfb: retq > 0.00 -nan 0.00 : ffffffff81060bfc: push %rbp > 0.00 -nan 0.00 : ffffffff81060bfd: mov %esi,%eax > 0.00 -nan 0.00 : ffffffff81060bff: mov %rdx,%rsi > 0.00 -nan 0.00 : ffffffff81060c02: shl $0x20,%rsi > 0.00 -nan 0.00 : ffffffff81060c06: mov %eax,%edx > 0.00 -nan 0.00 : ffffffff81060c08: mov %rsp,%rbp > 0.00 -nan 0.00 : ffffffff81060c0b: or %rdx,%rsi > 0.00 -nan 0.00 : ffffffff81060c0e: xor %edx,%edx > 0.00 -nan 0.00 : ffffffff81060c10: callq ffffffff81457df0 <do_trace_write_msr> > 0.00 -nan 0.00 : ffffffff81060c15: pop %rbp > 0.00 -nan 0.00 : ffffffff81060c16: retq > [root@jouet ~]# > > I don't have that much time to continue pursuing this, try to continue > investigating, I'll try to help and to fix this later. > > - Arnaldo > ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 3/4] perf annotate: Fix missing setting nr samples on source_line 2017-03-20 2:56 [PATCH 0/4] perf annotate: Bugfixes Taeung Song 2017-03-20 2:56 ` [PATCH 1/4] perf annotate: Use build-id dir when reading link name Taeung Song 2017-03-20 2:56 ` [PATCH 2/4] perf annotate: Avoid division by zero when calculating percent Taeung Song @ 2017-03-20 2:56 ` Taeung Song 2017-03-20 2:56 ` [PATCH 4/4] perf annotate: More exactly grep -v of the objdump command Taeung Song 3 siblings, 0 replies; 19+ messages in thread From: Taeung Song @ 2017-03-20 2:56 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: linux-kernel, Jiri Olsa, Namhyung Kim, Ingo Molnar, Peter Zijlstra, Wang Nan, Masami Hiramatsu, Taeung Song, Jiri Olsa If using --show-total-period with -l, disasm__calc_percent() use a 'samples' array of source_line. But samples[evidx].nr is always zero so print 0 values as below. Before: $ perf annotate --stdio -l --show-total-period ... 0 : 400816: push %rbp 0 : 400817: mov %rsp,%rbp 0 : 40081a: mov %edi,-0x24(%rbp) 0 : 40081d: mov %rsi,-0x30(%rbp) 0 : 400821: mov -0x24(%rbp),%eax 0 : 400824: mov -0x30(%rbp),%rdx 0 : 400828: mov (%rdx),%esi 0 : 40082a: mov $0x0,%edx ... The reason is that in symbol__get_source_line() did't set number of samples into a memeber variable 'nr' of samples of source_line. So fix it for correct number of samples when using --show-total-period option as below. After: $ perf annotate --stdio -l --show-total-period ... 3 : 400816: push %rbp 4 : 400817: mov %rsp,%rbp 0 : 40081a: mov %edi,-0x24(%rbp) 0 : 40081d: mov %rsi,-0x30(%rbp) 1 : 400821: mov -0x24(%rbp),%eax 2 : 400824: mov -0x30(%rbp),%rdx 0 : 400828: mov (%rdx),%esi 1 : 40082a: mov $0x0,%edx ... Cc: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Taeung Song <treeze.taeung@gmail.com> --- tools/perf/util/annotate.c | 6 ++++-- tools/perf/util/annotate.h | 2 +- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 9bb43cd..63130ec 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -1659,7 +1659,7 @@ static int symbol__get_source_line(struct symbol *sym, struct map *map, start = map__rip_2objdump(map, sym->start); for (i = 0; i < len; i++) { - u64 offset; + u64 offset, nr_samples; double percent_max = 0.0; src_line->nr_pcnt = nr_pcnt; @@ -1668,12 +1668,14 @@ static int symbol__get_source_line(struct symbol *sym, struct map *map, double percent = 0.0; h = annotation__histogram(notes, evidx + k); + nr_samples = h->addr[i]; if (h->sum) - percent = 100.0 * h->addr[i] / h->sum; + percent = 100.0 * nr_samples / h->sum; if (percent > percent_max) percent_max = percent; src_line->samples[k].percent = percent; + src_line->samples[k].nr = nr_samples; } if (percent_max <= 0.5) diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h index 09776b5..948aa8e 100644 --- a/tools/perf/util/annotate.h +++ b/tools/perf/util/annotate.h @@ -98,7 +98,7 @@ struct cyc_hist { struct source_line_samples { double percent; double percent_sum; - double nr; + u64 nr; }; struct source_line { -- 2.7.4 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 4/4] perf annotate: More exactly grep -v of the objdump command 2017-03-20 2:56 [PATCH 0/4] perf annotate: Bugfixes Taeung Song ` (2 preceding siblings ...) 2017-03-20 2:56 ` [PATCH 3/4] perf annotate: Fix missing setting nr samples on source_line Taeung Song @ 2017-03-20 2:56 ` Taeung Song 2017-03-21 14:37 ` Arnaldo Carvalho de Melo 2017-03-24 18:45 ` [tip:perf/core] " tip-bot for Taeung Song 3 siblings, 2 replies; 19+ messages in thread From: Taeung Song @ 2017-03-20 2:56 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: linux-kernel, Jiri Olsa, Namhyung Kim, Ingo Molnar, Peter Zijlstra, Wang Nan, Masami Hiramatsu, Taeung Song, Jiri Olsa grep -v "file name" in the objdump command cause a side effect eliminating filename:linenr of output of 'objdump -l' if the object file name and source file name are about the same so fix it. The objdump command in symbol__disassemble() can be as below $ objdump -l -d -S -C /home/taeung/hello --start-address=... /home/taeung/hello: file format elf64-x86-64 Disassembly of section .text: 0000000000400526 <main>: main(): /home/taeung/hello.c:4 void main() { 400526: 55 push %rbp 400527: 48 89 e5 mov %rsp,%rbp /home/taeung/hello.c:5 ... But currently it use grep -v "file name" e.g. "/home/taeung/hello" in the objdump command to remove the first line containing file name and file format such as, Before: $ objdump -l -d -S -C /home/taeung/hello | grep /home/taeung/hello But it cause a side effect removing filename:linenr because the object file and source file has same name e.g. "/home/taueng/hello", "/home/taeung/hello.c" So more exactly grep -v as below to correctly remove the one line e.g. "/home/taeung/hello: file format elf64-x86-64" After: $ objdump -l -d -S -C /home/taeung/hello | grep /home/taeung/hello: Cc: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Taeung Song <treeze.taeung@gmail.com> --- tools/perf/util/annotate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 63130ec..e49eb7e 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -1443,7 +1443,7 @@ int symbol__disassemble(struct symbol *sym, struct map *map, const char *arch_na snprintf(command, sizeof(command), "%s %s%s --start-address=0x%016" PRIx64 " --stop-address=0x%016" PRIx64 - " -l -d %s %s -C %s 2>/dev/null|grep -v %s|expand", + " -l -d %s %s -C %s 2>/dev/null|grep -v %s: |expand", objdump_path ? objdump_path : "objdump", disassembler_style ? "-M " : "", disassembler_style ? disassembler_style : "", -- 2.7.4 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 4/4] perf annotate: More exactly grep -v of the objdump command 2017-03-20 2:56 ` [PATCH 4/4] perf annotate: More exactly grep -v of the objdump command Taeung Song @ 2017-03-21 14:37 ` Arnaldo Carvalho de Melo 2017-03-21 16:19 ` Taeung Song 2017-03-21 16:19 ` Taeung Song 2017-03-24 18:45 ` [tip:perf/core] " tip-bot for Taeung Song 1 sibling, 2 replies; 19+ messages in thread From: Arnaldo Carvalho de Melo @ 2017-03-21 14:37 UTC (permalink / raw) To: Taeung Song Cc: linux-kernel, Jiri Olsa, Namhyung Kim, Ingo Molnar, Peter Zijlstra, Wang Nan, Masami Hiramatsu, Jiri Olsa Em Mon, Mar 20, 2017 at 11:56:57AM +0900, Taeung Song escreveu: > grep -v "file name" in the objdump command > cause a side effect eliminating filename:linenr of > output of 'objdump -l' if the object file name and > source file name are about the same so fix it. > > The objdump command in symbol__disassemble() can be as below > > $ objdump -l -d -S -C /home/taeung/hello --start-address=... > > /home/taeung/hello: file format elf64-x86-64 > > Disassembly of section .text: > > 0000000000400526 <main>: > main(): > /home/taeung/hello.c:4 > > void main() > { > 400526: 55 push %rbp > 400527: 48 89 e5 mov %rsp,%rbp > /home/taeung/hello.c:5 > ... > > But currently it use grep -v "file name" e.g. "/home/taeung/hello" > in the objdump command to remove the first line containing file name > and file format such as, > > Before: > $ objdump -l -d -S -C /home/taeung/hello | grep /home/taeung/hello > > But it cause a side effect removing filename:linenr > because the object file and source file has same name > e.g. "/home/taueng/hello", "/home/taeung/hello.c" > > So more exactly grep -v as below > to correctly remove the one line > e.g. "/home/taeung/hello: file format elf64-x86-64" > > After: > $ objdump -l -d -S -C /home/taeung/hello | grep /home/taeung/hello: What environment is this? I tried to reproduce this here but got different results, and I had to figure out the 'perf record' you used, you forgot to mention it, i.e. 'perf record ./hello' doesn't work, one has to use 'perf record /full/path/to/hello', anyway, with my environemnt, which is: [root@jouet ~]# objdump -v GNU objdump version 2.26.1-1.fc25 I end up getting: Sorted summary for file /home/acme/c/hello ---------------------------------------------- 52.67 hello.c:7 47.32 hello.c:8 Percent | Source code & Disassembly of hello for cycles:u (83555 samples) ------------------------------------------------------------------------------- : 00000000004004f6 <main>: : main(): : #include <stdio.h> : : int main(void) : { 0.00 : 4004f6: push %rbp 0.00 : 4004f7: mov %rsp,%rbp 0.00 : 4004fa: sub $0x10,%rsp : unsigned long long i; : printf("hello, world\n"); 0.00 : 4004fe: mov $0x4005d0,%edi 0.00 : 400503: callq 4003f0 <puts@plt> : while (i < (unsigned long long)-1) { 0.00 : 400508: jmp 40052d <main+0x37> : i *= 171; hello.c:8 8.66 : 40050a: mov -0x8(%rbp),%rdx 0.01 : 40050e: mov %rdx,%rax 0.00 : 400511: shl $0x3,%rax hello.c:8 0.84 : 400515: add %rdx,%rax hello.c:8 8.60 : 400518: add %rax,%rax hello.c:8 3.46 : 40051b: add %rdx,%rax hello.c:8 8.57 : 40051e: lea 0x0(,%rax,8),%rdx 8.65 : 400526: add %rdx,%rax 8.56 : 400529: mov %rax,-0x8(%rbp) <SNIP> Which looks ok, i.e. since this version of objdump puts just the basename (hello.c), that grep doesn't nuke it. But yeah, your patch is more robust, just fix what I described below, i.e. use quotes in the grep -v bit, probably. - Arnaldo > Cc: Namhyung Kim <namhyung@kernel.org> > Cc: Jiri Olsa <jolsa@redhat.com> > Signed-off-by: Taeung Song <treeze.taeung@gmail.com> > --- > tools/perf/util/annotate.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c > index 63130ec..e49eb7e 100644 > --- a/tools/perf/util/annotate.c > +++ b/tools/perf/util/annotate.c > @@ -1443,7 +1443,7 @@ int symbol__disassemble(struct symbol *sym, struct map *map, const char *arch_na > snprintf(command, sizeof(command), > "%s %s%s --start-address=0x%016" PRIx64 > " --stop-address=0x%016" PRIx64 > - " -l -d %s %s -C %s 2>/dev/null|grep -v %s|expand", > + " -l -d %s %s -C %s 2>/dev/null|grep -v %s: |expand", Are you sure this works as you expect? I.e. both lines will have '%s:' that space just before the '|' will not be considered by grep > objdump_path ? objdump_path : "objdump", > disassembler_style ? "-M " : "", > disassembler_style ? disassembler_style : "", > -- > 2.7.4 ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 4/4] perf annotate: More exactly grep -v of the objdump command 2017-03-21 14:37 ` Arnaldo Carvalho de Melo @ 2017-03-21 16:19 ` Taeung Song 2017-03-21 16:19 ` Taeung Song 1 sibling, 0 replies; 19+ messages in thread From: Taeung Song @ 2017-03-21 16:19 UTC (permalink / raw) To: Arnaldo Carvalho de Melo, Taeung Song Cc: linux-kernel, Jiri Olsa, Namhyung Kim, Ingo Molnar, Peter Zijlstra, Wang Nan, Masami Hiramatsu, Jiri Olsa Thank you for your review :) On 03/21/2017 11:37 PM, Arnaldo Carvalho de Melo wrote: > Em Mon, Mar 20, 2017 at 11:56:57AM +0900, Taeung Song escreveu: >> grep -v "file name" in the objdump command >> cause a side effect eliminating filename:linenr of >> output of 'objdump -l' if the object file name and >> source file name are about the same so fix it. >> >> The objdump command in symbol__disassemble() can be as below >> >> $ objdump -l -d -S -C /home/taeung/hello --start-address=... >> >> /home/taeung/hello: file format elf64-x86-64 >> >> Disassembly of section .text: >> >> 0000000000400526 <main>: >> main(): >> /home/taeung/hello.c:4 >> >> void main() >> { >> 400526: 55 push %rbp >> 400527: 48 89 e5 mov %rsp,%rbp >> /home/taeung/hello.c:5 >> ... >> >> But currently it use grep -v "file name" e.g. "/home/taeung/hello" >> in the objdump command to remove the first line containing file name >> and file format such as, >> >> Before: >> $ objdump -l -d -S -C /home/taeung/hello | grep /home/taeung/hello >> >> But it cause a side effect removing filename:linenr >> because the object file and source file has same name >> e.g. "/home/taueng/hello", "/home/taeung/hello.c" >> >> So more exactly grep -v as below >> to correctly remove the one line >> e.g. "/home/taeung/hello: file format elf64-x86-64" >> >> After: >> $ objdump -l -d -S -C /home/taeung/hello | grep /home/taeung/hello: > > What environment is this? I tried to reproduce this here but got > different results, and I had to figure out the 'perf record' you used, > you forgot to mention it, i.e. 'perf record ./hello' doesn't work, one > has to use 'perf record /full/path/to/hello', anyway, with my Sorry for my insufficient commit message. $ objdump -v GNU objdump (GNU Binutils) 2.28.51.20170309 ... $ pwd /home/taeung $ perf record ./hello hello, world ^C[ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 0.676 MB perf.data (17380 samples) ] And, when running 'perf annotate', I internally checked the full objdump command line with gdb and printf. Here it is, objdump --start-address=0x0000000000400526 --stop-address=0x0000000000400567 -l -d --no-show-raw -S -C /home/taeung/hello 2>/dev/null|grep -v /home/taeung/hello|expand If just running the command, the output is like below. $ objdump --start-address=0x0000000000400526 --stop-address=0x0000000000400567 -l -d --no-show-raw -S -C /home/taeung/hello 2>/dev/null|grep -v /home/taeung/hello|expand|head -16 Disassembly of section .text: 0000000000400526 <main>: main(): #include <stdio.h> void main() 400526: push %rbp 400527: mov %rsp,%rbp 40052a: sub $0x10,%rsp { unsigned long long i; As you can see, grep -v eliminate "filename:linenr" such as /home/taeung/hello.c:3 and etc But if running the command line without grep -v as below, we can see "filename:linenr". $ objdump --start-address=0x0000000000400526 --stop-address=0x0000000000400567 -l -d --no-show-raw -S -C /home/taeung/hello 2>/dev/null|expand|head -16 /home/taeung/hello: file format elf64-x86-64 Disassembly of section .text: 0000000000400526 <main>: main(): /home/taeung/hello.c:3 #include <stdio.h> void main() 400526: push %rbp 400527: mov %rsp,%rbp 40052a: sub $0x10,%rsp /home/taeung/hello.c:6 So append ":" into grep -v as below. (it is contents of my patch) $ objdump --start-address=0x0000000000400526 --stop-address=0x0000000000400567 -l -d --no-show-raw -S -C /home/taeung/hello 2>/dev/null|grep -v /home/taeung/hello:|expand|head -16 Disassembly of section .text: 0000000000400526 <main>: main(): /home/taeung/hello.c:3 #include <stdio.h> void main() 400526: push %rbp 400527: mov %rsp,%rbp 40052a: sub $0x10,%rsp /home/taeung/hello.c:6 { I'll change the commit message to be more clearly.. > environemnt, which is: > > [root@jouet ~]# objdump -v > GNU objdump version 2.26.1-1.fc25 > > I end up getting: > > Sorted summary for file /home/acme/c/hello > ---------------------------------------------- > > 52.67 hello.c:7 > 47.32 hello.c:8 > Percent | Source code & Disassembly of hello for cycles:u (83555 samples) > ------------------------------------------------------------------------------- > : 00000000004004f6 <main>: > : main(): > : #include <stdio.h> > : > : int main(void) > : { > 0.00 : 4004f6: push %rbp > 0.00 : 4004f7: mov %rsp,%rbp > 0.00 : 4004fa: sub $0x10,%rsp > : unsigned long long i; > : printf("hello, world\n"); > 0.00 : 4004fe: mov $0x4005d0,%edi > 0.00 : 400503: callq 4003f0 <puts@plt> > : while (i < (unsigned long long)-1) { > 0.00 : 400508: jmp 40052d <main+0x37> > : i *= 171; > hello.c:8 8.66 : 40050a: mov -0x8(%rbp),%rdx > 0.01 : 40050e: mov %rdx,%rax > 0.00 : 400511: shl $0x3,%rax > hello.c:8 0.84 : 400515: add %rdx,%rax > hello.c:8 8.60 : 400518: add %rax,%rax > hello.c:8 3.46 : 40051b: add %rdx,%rax > hello.c:8 8.57 : 40051e: lea 0x0(,%rax,8),%rdx > 8.65 : 400526: add %rdx,%rax > 8.56 : 400529: mov %rax,-0x8(%rbp) > <SNIP> > > Which looks ok, i.e. since this version of objdump puts just the basename (hello.c), > that grep doesn't nuke it. > > But yeah, your patch is more robust, just fix what I described below, i.e. use > quotes in the grep -v bit, probably. > > - Arnaldo > >> Cc: Namhyung Kim <namhyung@kernel.org> >> Cc: Jiri Olsa <jolsa@redhat.com> >> Signed-off-by: Taeung Song <treeze.taeung@gmail.com> >> --- >> tools/perf/util/annotate.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c >> index 63130ec..e49eb7e 100644 >> --- a/tools/perf/util/annotate.c >> +++ b/tools/perf/util/annotate.c >> @@ -1443,7 +1443,7 @@ int symbol__disassemble(struct symbol *sym, struct map *map, const char *arch_na >> snprintf(command, sizeof(command), >> "%s %s%s --start-address=0x%016" PRIx64 >> " --stop-address=0x%016" PRIx64 >> - " -l -d %s %s -C %s 2>/dev/null|grep -v %s|expand", >> + " -l -d %s %s -C %s 2>/dev/null|grep -v %s: |expand", > > > Are you sure this works as you expect? I.e. both lines will have '%s:' Yes, if we do, we can exactly the only below line without the side effect IMHO. "/home/taeung/hello: file format elf64-x86-64" > that space just before the '|' will not be considered by grep > IIUC, Just append ":" into grep -v leaving out " " space before the '|' ? as below diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 273f21f..4d325cd 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -1435,7 +1435,7 @@ int symbol__disassemble(struct symbol *sym, struct map *map, const char *arch_na snprintf(command, sizeof(command), "%s %s%s --start-address=0x%016" PRIx64 " --stop-address=0x%016" PRIx64 - " -l -d %s %s -C %s 2>/dev/null|grep -v %s|expand", + " -l -d %s %s -C %s 2>/dev/null|grep -v %s:|expand", objdump_path ? objdump_path : "objdump", disassembler_style ? "-M " : "", disassembler_style ? disassembler_style : "", Thanks, Taeung ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 4/4] perf annotate: More exactly grep -v of the objdump command 2017-03-21 14:37 ` Arnaldo Carvalho de Melo 2017-03-21 16:19 ` Taeung Song @ 2017-03-21 16:19 ` Taeung Song 2017-03-21 18:29 ` Arnaldo Carvalho de Melo 1 sibling, 1 reply; 19+ messages in thread From: Taeung Song @ 2017-03-21 16:19 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: linux-kernel, Jiri Olsa, Namhyung Kim, Ingo Molnar, Peter Zijlstra, Wang Nan, Masami Hiramatsu, Jiri Olsa Thank you for your review :) On 03/21/2017 11:37 PM, Arnaldo Carvalho de Melo wrote: > Em Mon, Mar 20, 2017 at 11:56:57AM +0900, Taeung Song escreveu: >> grep -v "file name" in the objdump command >> cause a side effect eliminating filename:linenr of >> output of 'objdump -l' if the object file name and >> source file name are about the same so fix it. >> >> The objdump command in symbol__disassemble() can be as below >> >> $ objdump -l -d -S -C /home/taeung/hello --start-address=... >> >> /home/taeung/hello: file format elf64-x86-64 >> >> Disassembly of section .text: >> >> 0000000000400526 <main>: >> main(): >> /home/taeung/hello.c:4 >> >> void main() >> { >> 400526: 55 push %rbp >> 400527: 48 89 e5 mov %rsp,%rbp >> /home/taeung/hello.c:5 >> ... >> >> But currently it use grep -v "file name" e.g. "/home/taeung/hello" >> in the objdump command to remove the first line containing file name >> and file format such as, >> >> Before: >> $ objdump -l -d -S -C /home/taeung/hello | grep /home/taeung/hello >> >> But it cause a side effect removing filename:linenr >> because the object file and source file has same name >> e.g. "/home/taueng/hello", "/home/taeung/hello.c" >> >> So more exactly grep -v as below >> to correctly remove the one line >> e.g. "/home/taeung/hello: file format elf64-x86-64" >> >> After: >> $ objdump -l -d -S -C /home/taeung/hello | grep /home/taeung/hello: > > What environment is this? I tried to reproduce this here but got > different results, and I had to figure out the 'perf record' you used, > you forgot to mention it, i.e. 'perf record ./hello' doesn't work, one > has to use 'perf record /full/path/to/hello', anyway, with my Sorry for my insufficient commit message. $ objdump -v GNU objdump (GNU Binutils) 2.28.51.20170309 ... $ pwd /home/taeung $ perf record ./hello hello, world ^C[ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 0.676 MB perf.data (17380 samples) ] And, when running 'perf annotate', I internally checked the full objdump command line with gdb and printf. Here it is, objdump --start-address=0x0000000000400526 --stop-address=0x0000000000400567 -l -d --no-show-raw -S -C /home/taeung/hello 2>/dev/null|grep -v /home/taeung/hello|expand If just running the command, the output is like below. $ objdump --start-address=0x0000000000400526 --stop-address=0x0000000000400567 -l -d --no-show-raw -S -C /home/taeung/hello 2>/dev/null|grep -v /home/taeung/hello|expand|head -16 Disassembly of section .text: 0000000000400526 <main>: main(): #include <stdio.h> void main() 400526: push %rbp 400527: mov %rsp,%rbp 40052a: sub $0x10,%rsp { unsigned long long i; As you can see, grep -v eliminate "filename:linenr" such as /home/taeung/hello.c:3 and etc But if running the command line without grep -v as below, we can see "filename:linenr". $ objdump --start-address=0x0000000000400526 --stop-address=0x0000000000400567 -l -d --no-show-raw -S -C /home/taeung/hello 2>/dev/null|expand|head -16 /home/taeung/hello: file format elf64-x86-64 Disassembly of section .text: 0000000000400526 <main>: main(): /home/taeung/hello.c:3 #include <stdio.h> void main() 400526: push %rbp 400527: mov %rsp,%rbp 40052a: sub $0x10,%rsp /home/taeung/hello.c:6 So append ":" into grep -v as below. (it is contents of my patch) $ objdump --start-address=0x0000000000400526 --stop-address=0x0000000000400567 -l -d --no-show-raw -S -C /home/taeung/hello 2>/dev/null|grep -v /home/taeung/hello:|expand|head -16 Disassembly of section .text: 0000000000400526 <main>: main(): /home/taeung/hello.c:3 #include <stdio.h> void main() 400526: push %rbp 400527: mov %rsp,%rbp 40052a: sub $0x10,%rsp /home/taeung/hello.c:6 { I'll change the commit message to be more clearly.. > environemnt, which is: > > [root@jouet ~]# objdump -v > GNU objdump version 2.26.1-1.fc25 > > I end up getting: > > Sorted summary for file /home/acme/c/hello > ---------------------------------------------- > > 52.67 hello.c:7 > 47.32 hello.c:8 > Percent | Source code & Disassembly of hello for cycles:u (83555 samples) > ------------------------------------------------------------------------------- > : 00000000004004f6 <main>: > : main(): > : #include <stdio.h> > : > : int main(void) > : { > 0.00 : 4004f6: push %rbp > 0.00 : 4004f7: mov %rsp,%rbp > 0.00 : 4004fa: sub $0x10,%rsp > : unsigned long long i; > : printf("hello, world\n"); > 0.00 : 4004fe: mov $0x4005d0,%edi > 0.00 : 400503: callq 4003f0 <puts@plt> > : while (i < (unsigned long long)-1) { > 0.00 : 400508: jmp 40052d <main+0x37> > : i *= 171; > hello.c:8 8.66 : 40050a: mov -0x8(%rbp),%rdx > 0.01 : 40050e: mov %rdx,%rax > 0.00 : 400511: shl $0x3,%rax > hello.c:8 0.84 : 400515: add %rdx,%rax > hello.c:8 8.60 : 400518: add %rax,%rax > hello.c:8 3.46 : 40051b: add %rdx,%rax > hello.c:8 8.57 : 40051e: lea 0x0(,%rax,8),%rdx > 8.65 : 400526: add %rdx,%rax > 8.56 : 400529: mov %rax,-0x8(%rbp) > <SNIP> > > Which looks ok, i.e. since this version of objdump puts just the basename (hello.c), > that grep doesn't nuke it. > > But yeah, your patch is more robust, just fix what I described below, i.e. use > quotes in the grep -v bit, probably. > > - Arnaldo > >> Cc: Namhyung Kim <namhyung@kernel.org> >> Cc: Jiri Olsa <jolsa@redhat.com> >> Signed-off-by: Taeung Song <treeze.taeung@gmail.com> >> --- >> tools/perf/util/annotate.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c >> index 63130ec..e49eb7e 100644 >> --- a/tools/perf/util/annotate.c >> +++ b/tools/perf/util/annotate.c >> @@ -1443,7 +1443,7 @@ int symbol__disassemble(struct symbol *sym, struct map *map, const char *arch_na >> snprintf(command, sizeof(command), >> "%s %s%s --start-address=0x%016" PRIx64 >> " --stop-address=0x%016" PRIx64 >> - " -l -d %s %s -C %s 2>/dev/null|grep -v %s|expand", >> + " -l -d %s %s -C %s 2>/dev/null|grep -v %s: |expand", > > > Are you sure this works as you expect? I.e. both lines will have '%s:' Yes, if we do, we can exactly the only below line without the side effect IMHO. "/home/taeung/hello: file format elf64-x86-64" > that space just before the '|' will not be considered by grep > IIUC, Just append ":" into grep -v leaving out " " space before the '|' ? as below diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 273f21f..4d325cd 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -1435,7 +1435,7 @@ int symbol__disassemble(struct symbol *sym, struct map *map, const char *arch_na snprintf(command, sizeof(command), "%s %s%s --start-address=0x%016" PRIx64 " --stop-address=0x%016" PRIx64 - " -l -d %s %s -C %s 2>/dev/null|grep -v %s|expand", + " -l -d %s %s -C %s 2>/dev/null|grep -v %s:|expand", objdump_path ? objdump_path : "objdump", disassembler_style ? "-M " : "", disassembler_style ? disassembler_style : "", Thanks, Taeung ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 4/4] perf annotate: More exactly grep -v of the objdump command 2017-03-21 16:19 ` Taeung Song @ 2017-03-21 18:29 ` Arnaldo Carvalho de Melo 2017-03-21 18:32 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 19+ messages in thread From: Arnaldo Carvalho de Melo @ 2017-03-21 18:29 UTC (permalink / raw) To: Taeung Song Cc: linux-kernel, Jiri Olsa, Namhyung Kim, Ingo Molnar, Peter Zijlstra, Wang Nan, Masami Hiramatsu, Jiri Olsa Em Wed, Mar 22, 2017 at 01:19:49AM +0900, Taeung Song escreveu: > On 03/21/2017 11:37 PM, Arnaldo Carvalho de Melo wrote: > > > +++ b/tools/perf/util/annotate.c > > > @@ -1443,7 +1443,7 @@ int symbol__disassemble(struct symbol *sym, struct map *map, const char *arch_na > > > snprintf(command, sizeof(command), > > > "%s %s%s --start-address=0x%016" PRIx64 > > > " --stop-address=0x%016" PRIx64 > > > - " -l -d %s %s -C %s 2>/dev/null|grep -v %s|expand", > > > + " -l -d %s %s -C %s 2>/dev/null|grep -v %s: |expand", > > Are you sure this works as you expect? I.e. both lines will have '%s:' > Yes, if we do, we can exactly the only below line without the side effect > IMHO. yeah, because the .c is not there, ok gotcha, will get your new commit log message, massage it a bit and apply the patch. - Arnaldo ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 4/4] perf annotate: More exactly grep -v of the objdump command 2017-03-21 18:29 ` Arnaldo Carvalho de Melo @ 2017-03-21 18:32 ` Arnaldo Carvalho de Melo 2017-03-22 7:32 ` Taeung Song 0 siblings, 1 reply; 19+ messages in thread From: Arnaldo Carvalho de Melo @ 2017-03-21 18:32 UTC (permalink / raw) To: Taeung Song Cc: linux-kernel, Jiri Olsa, Namhyung Kim, Ingo Molnar, Peter Zijlstra, Wang Nan, Masami Hiramatsu, Jiri Olsa Em Tue, Mar 21, 2017 at 03:29:50PM -0300, Arnaldo Carvalho de Melo escreveu: > Em Wed, Mar 22, 2017 at 01:19:49AM +0900, Taeung Song escreveu: > > On 03/21/2017 11:37 PM, Arnaldo Carvalho de Melo wrote: > > > > +++ b/tools/perf/util/annotate.c > > > > @@ -1443,7 +1443,7 @@ int symbol__disassemble(struct symbol *sym, struct map *map, const char *arch_na > > > > snprintf(command, sizeof(command), > > > > "%s %s%s --start-address=0x%016" PRIx64 > > > > " --stop-address=0x%016" PRIx64 > > > > - " -l -d %s %s -C %s 2>/dev/null|grep -v %s|expand", > > > > + " -l -d %s %s -C %s 2>/dev/null|grep -v %s: |expand", > > > > Are you sure this works as you expect? I.e. both lines will have '%s:' > > > Yes, if we do, we can exactly the only below line without the side effect > > IMHO. > > yeah, because the .c is not there, ok gotcha, will get your new commit > log message, massage it a bit and apply the patch. Nah, with all this in mind, the original message is good enough, sorry for the confusion with the .c and the space after ':' (that I removed, needless and potentially confusing, as we saw here). - Arnaldo ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 4/4] perf annotate: More exactly grep -v of the objdump command 2017-03-21 18:32 ` Arnaldo Carvalho de Melo @ 2017-03-22 7:32 ` Taeung Song 0 siblings, 0 replies; 19+ messages in thread From: Taeung Song @ 2017-03-22 7:32 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: linux-kernel, Jiri Olsa, Namhyung Kim, Ingo Molnar, Peter Zijlstra, Wang Nan, Masami Hiramatsu, Jiri Olsa On 03/22/2017 03:32 AM, Arnaldo Carvalho de Melo wrote: > Em Tue, Mar 21, 2017 at 03:29:50PM -0300, Arnaldo Carvalho de Melo escreveu: >> Em Wed, Mar 22, 2017 at 01:19:49AM +0900, Taeung Song escreveu: >>> On 03/21/2017 11:37 PM, Arnaldo Carvalho de Melo wrote: >>>>> +++ b/tools/perf/util/annotate.c >>>>> @@ -1443,7 +1443,7 @@ int symbol__disassemble(struct symbol *sym, struct map *map, const char *arch_na >>>>> snprintf(command, sizeof(command), >>>>> "%s %s%s --start-address=0x%016" PRIx64 >>>>> " --stop-address=0x%016" PRIx64 >>>>> - " -l -d %s %s -C %s 2>/dev/null|grep -v %s|expand", >>>>> + " -l -d %s %s -C %s 2>/dev/null|grep -v %s: |expand", >> >>>> Are you sure this works as you expect? I.e. both lines will have '%s:' >> >>> Yes, if we do, we can exactly the only below line without the side effect >>> IMHO. >> >> yeah, because the .c is not there, ok gotcha, will get your new commit >> log message, massage it a bit and apply the patch. > > Nah, with all this in mind, the original message is good enough, sorry > for the confusion with the .c and the space after ':' (that I removed, > needless and potentially confusing, as we saw here). > > - Arnaldo I got it! Thanks for your review, will also resend v2 with the rest bugfix patches making the commit log messages of them be more clearly. :) Thanks, Taeung ^ permalink raw reply [flat|nested] 19+ messages in thread
* [tip:perf/core] perf annotate: More exactly grep -v of the objdump command 2017-03-20 2:56 ` [PATCH 4/4] perf annotate: More exactly grep -v of the objdump command Taeung Song 2017-03-21 14:37 ` Arnaldo Carvalho de Melo @ 2017-03-24 18:45 ` tip-bot for Taeung Song 1 sibling, 0 replies; 19+ messages in thread From: tip-bot for Taeung Song @ 2017-03-24 18:45 UTC (permalink / raw) To: linux-tip-commits Cc: hpa, peterz, mhiramat, namhyung, linux-kernel, mingo, wangnan0, tglx, acme, treeze.taeung, jolsa Commit-ID: e7cb9de211ebb2924d87fdeb77e50d74c2e673d1 Gitweb: http://git.kernel.org/tip/e7cb9de211ebb2924d87fdeb77e50d74c2e673d1 Author: Taeung Song <treeze.taeung@gmail.com> AuthorDate: Mon, 20 Mar 2017 11:56:57 +0900 Committer: Arnaldo Carvalho de Melo <acme@redhat.com> CommitDate: Tue, 21 Mar 2017 15:42:25 -0300 perf annotate: More exactly grep -v of the objdump command The 'grep -v "filename"' applied to the objdump command output cause a side effect eliminating filename:linenr of output of 'objdump -l' if the object file name and source file name are the same, fix it. E.g. the output of the following objdump command in symbol__disassemble(): $ objdump -l -d -S -C /home/taeung/hello --start-address=... /home/taeung/hello: file format elf64-x86-64 Disassembly of section .text: 0000000000400526 <main>: main(): /home/taeung/hello.c:4 void main() { 400526: 55 push %rbp 400527: 48 89 e5 mov %rsp,%rbp /home/taeung/hello.c:5 ... But it uses grep -v "filename" e.g. "/home/taeung/hello" in the objdump command to remove the first line containing file name and file format ("/home/taeung/hello: file format elf64-x86-64"): Before: $ objdump -l -d -S -C /home/taeung/hello | grep /home/taeung/hello But this causes a side effect, removing filename:linenr too, because the object file and source file have the same name e.g. "/home/taueng/hello", "/home/taeung/hello.c" So more do a better match by using grep -v as below to correctly remove that first line: "/home/taeung/hello: file format elf64-x86-64" After: $ objdump -l -d -S -C /home/taeung/hello | grep /home/taeung/hello: Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1489978617-31396-5-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/perf/util/annotate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 273f21f..4d325cd 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -1435,7 +1435,7 @@ int symbol__disassemble(struct symbol *sym, struct map *map, const char *arch_na snprintf(command, sizeof(command), "%s %s%s --start-address=0x%016" PRIx64 " --stop-address=0x%016" PRIx64 - " -l -d %s %s -C %s 2>/dev/null|grep -v %s|expand", + " -l -d %s %s -C %s 2>/dev/null|grep -v %s:|expand", objdump_path ? objdump_path : "objdump", disassembler_style ? "-M " : "", disassembler_style ? disassembler_style : "", ^ permalink raw reply related [flat|nested] 19+ messages in thread
end of thread, other threads:[~2017-03-24 18:49 UTC | newest] Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-03-20 2:56 [PATCH 0/4] perf annotate: Bugfixes Taeung Song 2017-03-20 2:56 ` [PATCH 1/4] perf annotate: Use build-id dir when reading link name Taeung Song 2017-03-20 2:56 ` [PATCH 2/4] perf annotate: Avoid division by zero when calculating percent Taeung Song 2017-03-20 18:15 ` Arnaldo Carvalho de Melo 2017-03-20 22:11 ` Taeung Song 2017-03-20 22:20 ` Taeung Song 2017-03-21 14:14 ` Arnaldo Carvalho de Melo 2017-03-21 14:21 ` Arnaldo Carvalho de Melo 2017-03-21 14:36 ` Taeung Song 2017-03-22 12:00 ` Taeung Song 2017-03-20 2:56 ` [PATCH 3/4] perf annotate: Fix missing setting nr samples on source_line Taeung Song 2017-03-20 2:56 ` [PATCH 4/4] perf annotate: More exactly grep -v of the objdump command Taeung Song 2017-03-21 14:37 ` Arnaldo Carvalho de Melo 2017-03-21 16:19 ` Taeung Song 2017-03-21 16:19 ` Taeung Song 2017-03-21 18:29 ` Arnaldo Carvalho de Melo 2017-03-21 18:32 ` Arnaldo Carvalho de Melo 2017-03-22 7:32 ` Taeung Song 2017-03-24 18:45 ` [tip:perf/core] " tip-bot for Taeung Song
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.