All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V7 1/1] perf tool:perf diff support for different binaries
@ 2015-02-09  5:39 kan.liang
  2015-02-25 15:14 ` Liang, Kan
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: kan.liang @ 2015-02-09  5:39 UTC (permalink / raw)
  To: acme; +Cc: jolsa, namhyung, linux-kernel, ak, Kan Liang

From: Kan Liang <kan.liang@intel.com>

Currently, the perf diff only works with same binaries. That's because
it compares the symbol start address. It doesn't work if the perf.data
comes from different binaries. This patch matches the symbol names.

Actually, perf diff once intended to compare the symbol names.
The commit as below can look for a pair by name.
604c5c92972d (perf diff: Change the default sort order to "dso,symbol")
However, at that time, perf diff used a global list of dsos. That means
the binaries which has same name can only be loaded once. That's a
problem for different binaries compare. For example, we have an old
binary and an updated binary. They very likely have same name and most
of the function, so only dsos from old binary will be loaded. When
processing the data from updated binary, perf still use the symbol
information from old binary. That's wrong.

Then the commit as below used IP to replace symbol name.
9c443dfdd31e ("perf diff: Fix support for all --sort combinations")
>From that time, perf diff starts to compare the symbol address.

The global dsos is discarded from a patch in 2010.
a1645ce12adb ("perf: 'perf kvm' tool for monitoring guest performance
from host")
However, at that time, perf diff already compared by address. So perf
diff cannot work for different binaries as well.

This patch actually rolls back the perf diff to original design. The
document is also changed, so everybody knows the original design is to
compare the symbol names.

Here is an examples.
The only difference between example_v1.c and example_v2.c is the
location of f2 and f3. There is no change in behavior, but the previous
perf diff display the wrong differential profile.

example_v1.c
noinline void f3(void)
{
        volatile int i;
        for (i = 0; i < 10000;) {

                if(i%2)
                        i++;
                else
                        i++;
        }
}

noinline void f2(void)
{
        volatile int a = 100, b, c;
        for (b = 0; b < 10000; b++)
                c = a * b;

}

noinline void f1(void)
{
                f2();
                f3();
}

int main()
{
        int i;
        for (i = 0; i < 100000; i++)
                f1();
}

example_v2.c
noinline void f2(void)
{
        volatile int a = 100, b, c;
        for (b = 0; b < 10000; b++)
                c = a * b;
}

noinline void f3(void)
{
        volatile int i;
        for (i = 0; i < 10000;) {
                if(i%2)
                        i++;
                else
                        i++;
        }
}

noinline void f1(void)
{
                f2();
                f3();
}

int main()
{
        int i;
        for (i = 0; i < 100000; i++)
                f1();
}

[lk@localhost perf_diff]$ gcc example_v1.c -o example
[lk@localhost perf_diff]$ perf record -o example_v1.data ./example
[ perf record: Woken up 4 times to write data ]
[ perf record: Captured and wrote 0.813 MB example_v1.data (~35522
samples) ]

[lk@localhost perf_diff]$ gcc example_v2.c -o example
[lk@localhost perf_diff]$ perf record -o example_v2.data ./example
[ perf record: Woken up 4 times to write data ]
[ perf record: Captured and wrote 0.824 MB example_v2.data (~36015
samples) ]

Old perf diff result.
[lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
 Event 'cycles'
 Baseline    Delta  Shared Object     Symbol
 ........  .......  ................  ...............................

                     [kernel.vmlinux]  [k] __perf_event_task_sched_out
     0.00%           [kernel.vmlinux]  [k] apic_timer_interrupt
                     [kernel.vmlinux]  [k] idle_cpu
                     [kernel.vmlinux]  [k] intel_pstate_timer_func
                     [kernel.vmlinux]  [k] native_read_msr_safe
     0.00%           [kernel.vmlinux]  [k] native_read_tsc
     0.00%           [kernel.vmlinux]  [k] native_write_msr_safe
                     [kernel.vmlinux]  [k] ntp_tick_length
     0.00%           [kernel.vmlinux]  [k] rb_erase
     0.00%           [kernel.vmlinux]  [k] tick_sched_timer
     0.00%           [kernel.vmlinux]  [k] unmap_single_vma
     0.00%           [kernel.vmlinux]  [k] update_wall_time
     0.00%           example           [.] f1
    46.24%           example           [.] f2
    53.71%   -7.55%  example           [.] f3
            +53.81%  example           [.] f3
     0.02%           example           [.] main

New perf diff result.
[lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
                     [kernel.vmlinux]  [k] __perf_event_task_sched_out
     0.00%           [kernel.vmlinux]  [k] apic_timer_interrupt
                     [kernel.vmlinux]  [k] idle_cpu
                     [kernel.vmlinux]  [k] intel_pstate_timer_func
                     [kernel.vmlinux]  [k] native_read_msr_safe
     0.00%           [kernel.vmlinux]  [k] native_read_tsc
     0.00%           [kernel.vmlinux]  [k] native_write_msr_safe
                     [kernel.vmlinux]  [k] ntp_tick_length
     0.00%           [kernel.vmlinux]  [k] rb_erase
     0.00%           [kernel.vmlinux]  [k] tick_sched_timer
     0.00%           [kernel.vmlinux]  [k] unmap_single_vma
     0.00%           [kernel.vmlinux]  [k] update_wall_time
     0.00%           example           [.] f1
    46.24%   -0.08%  example           [.] f2
    53.71%   +0.11%  example           [.] f3
     0.02%           example           [.] main

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>

---

Changes since V4: 
 - Seperate from old patch set 
 - Add an example in changelog 
 - Update man page 

Changes since V5:
 - Correct patch style

Changes since V6:
 - Update description

 tools/perf/Documentation/perf-diff.txt | 5 +++++
 tools/perf/util/sort.c                 | 9 +++++++++
 2 files changed, 14 insertions(+)

diff --git a/tools/perf/Documentation/perf-diff.txt b/tools/perf/Documentation/perf-diff.txt
index e463caa..5182661 100644
--- a/tools/perf/Documentation/perf-diff.txt
+++ b/tools/perf/Documentation/perf-diff.txt
@@ -20,6 +20,11 @@ If no parameters are passed it will assume perf.data.old and perf.data.
 The differential profile is displayed only for events matching both
 specified perf.data files.
 
+If no parameters are passed the samples will be sorted by dso and symbol.
+As the perf.data files could come from different binaries, the symbols addresses
+could vary. So perf diff is based on the comparison of the files and
+symbols name.
+
 OPTIONS
 -------
 -D::
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 7a39c1e..4593f36 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1463,6 +1463,15 @@ int sort_dimension__add(const char *tok)
 			sort__has_parent = 1;
 		} else if (sd->entry == &sort_sym) {
 			sort__has_sym = 1;
+			/*
+			 * perf diff displays the performance difference amongst
+			 * two or more perf.data files. Those files could come
+			 * from different binaries. So we should not compare
+			 * their ips, but the name of symbol.
+			 */
+			if (sort__mode == SORT_MODE__DIFF)
+				sd->entry->se_collapse = sort__sym_sort;
+
 		} else if (sd->entry == &sort_dso) {
 			sort__has_dso = 1;
 		}
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* RE: [PATCH V7 1/1] perf tool:perf diff support for different binaries
  2015-02-09  5:39 [PATCH V7 1/1] perf tool:perf diff support for different binaries kan.liang
@ 2015-02-25 15:14 ` Liang, Kan
  2015-02-25 15:30   ` acme
  2015-02-26 15:17 ` Arnaldo Carvalho de Melo
  2015-02-28  9:32 ` [tip:perf/core] perf diff: Support " tip-bot for Kan Liang
  2 siblings, 1 reply; 8+ messages in thread
From: Liang, Kan @ 2015-02-25 15:14 UTC (permalink / raw)
  To: acme; +Cc: jolsa, namhyung, linux-kernel, ak

Hi Arnaldo,

Could you please review the patch?
I've already updated the patch description to try to address your concern.
Please let me know if you have any questions.

Thanks,
Kan

> 
> From: Kan Liang <kan.liang@intel.com>
> 
> Currently, the perf diff only works with same binaries. That's because it
> compares the symbol start address. It doesn't work if the perf.data comes
> from different binaries. This patch matches the symbol names.
> 
> Actually, perf diff once intended to compare the symbol names.
> The commit as below can look for a pair by name.
> 604c5c92972d (perf diff: Change the default sort order to "dso,symbol")
> However, at that time, perf diff used a global list of dsos. That means the
> binaries which has same name can only be loaded once. That's a problem
> for different binaries compare. For example, we have an old binary and an
> updated binary. They very likely have same name and most of the function,
> so only dsos from old binary will be loaded. When processing the data from
> updated binary, perf still use the symbol information from old binary.
> That's wrong.
> 
> Then the commit as below used IP to replace symbol name.
> 9c443dfdd31e ("perf diff: Fix support for all --sort combinations") From that
> time, perf diff starts to compare the symbol address.
> 
> The global dsos is discarded from a patch in 2010.
> a1645ce12adb ("perf: 'perf kvm' tool for monitoring guest performance
> from host") However, at that time, perf diff already compared by address.
> So perf diff cannot work for different binaries as well.
> 
> This patch actually rolls back the perf diff to original design. The document
> is also changed, so everybody knows the original design is to compare the
> symbol names.
> 
> Here is an examples.
> The only difference between example_v1.c and example_v2.c is the
> location of f2 and f3. There is no change in behavior, but the previous perf
> diff display the wrong differential profile.
> 
> example_v1.c
> noinline void f3(void)
> {
>         volatile int i;
>         for (i = 0; i < 10000;) {
> 
>                 if(i%2)
>                         i++;
>                 else
>                         i++;
>         }
> }
> 
> noinline void f2(void)
> {
>         volatile int a = 100, b, c;
>         for (b = 0; b < 10000; b++)
>                 c = a * b;
> 
> }
> 
> noinline void f1(void)
> {
>                 f2();
>                 f3();
> }
> 
> int main()
> {
>         int i;
>         for (i = 0; i < 100000; i++)
>                 f1();
> }
> 
> example_v2.c
> noinline void f2(void)
> {
>         volatile int a = 100, b, c;
>         for (b = 0; b < 10000; b++)
>                 c = a * b;
> }
> 
> noinline void f3(void)
> {
>         volatile int i;
>         for (i = 0; i < 10000;) {
>                 if(i%2)
>                         i++;
>                 else
>                         i++;
>         }
> }
> 
> noinline void f1(void)
> {
>                 f2();
>                 f3();
> }
> 
> int main()
> {
>         int i;
>         for (i = 0; i < 100000; i++)
>                 f1();
> }
> 
> [lk@localhost perf_diff]$ gcc example_v1.c -o example [lk@localhost
> perf_diff]$ perf record -o example_v1.data ./example [ perf record:
> Woken up 4 times to write data ] [ perf record: Captured and wrote 0.813
> MB example_v1.data (~35522
> samples) ]
> 
> [lk@localhost perf_diff]$ gcc example_v2.c -o example [lk@localhost
> perf_diff]$ perf record -o example_v2.data ./example [ perf record:
> Woken up 4 times to write data ] [ perf record: Captured and wrote 0.824
> MB example_v2.data (~36015
> samples) ]
> 
> Old perf diff result.
> [lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
> Event 'cycles'
>  Baseline    Delta  Shared Object     Symbol
>  ........  .......  ................  ...............................
> 
>                      [kernel.vmlinux]  [k] __perf_event_task_sched_out
>      0.00%           [kernel.vmlinux]  [k] apic_timer_interrupt
>                      [kernel.vmlinux]  [k] idle_cpu
>                      [kernel.vmlinux]  [k] intel_pstate_timer_func
>                      [kernel.vmlinux]  [k] native_read_msr_safe
>      0.00%           [kernel.vmlinux]  [k] native_read_tsc
>      0.00%           [kernel.vmlinux]  [k] native_write_msr_safe
>                      [kernel.vmlinux]  [k] ntp_tick_length
>      0.00%           [kernel.vmlinux]  [k] rb_erase
>      0.00%           [kernel.vmlinux]  [k] tick_sched_timer
>      0.00%           [kernel.vmlinux]  [k] unmap_single_vma
>      0.00%           [kernel.vmlinux]  [k] update_wall_time
>      0.00%           example           [.] f1
>     46.24%           example           [.] f2
>     53.71%   -7.55%  example           [.] f3
>             +53.81%  example           [.] f3
>      0.02%           example           [.] main
> 
> New perf diff result.
> [lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
>                      [kernel.vmlinux]  [k] __perf_event_task_sched_out
>      0.00%           [kernel.vmlinux]  [k] apic_timer_interrupt
>                      [kernel.vmlinux]  [k] idle_cpu
>                      [kernel.vmlinux]  [k] intel_pstate_timer_func
>                      [kernel.vmlinux]  [k] native_read_msr_safe
>      0.00%           [kernel.vmlinux]  [k] native_read_tsc
>      0.00%           [kernel.vmlinux]  [k] native_write_msr_safe
>                      [kernel.vmlinux]  [k] ntp_tick_length
>      0.00%           [kernel.vmlinux]  [k] rb_erase
>      0.00%           [kernel.vmlinux]  [k] tick_sched_timer
>      0.00%           [kernel.vmlinux]  [k] unmap_single_vma
>      0.00%           [kernel.vmlinux]  [k] update_wall_time
>      0.00%           example           [.] f1
>     46.24%   -0.08%  example           [.] f2
>     53.71%   +0.11%  example           [.] f3
>      0.02%           example           [.] main
> 
> Signed-off-by: Kan Liang <kan.liang@intel.com>
> Acked-by: Namhyung Kim <namhyung@kernel.org>
> Acked-by: Jiri Olsa <jolsa@kernel.org>
> 
> ---
> 
> Changes since V4:
>  - Seperate from old patch set
>  - Add an example in changelog
>  - Update man page
> 
> Changes since V5:
>  - Correct patch style
> 
> Changes since V6:
>  - Update description
> 
>  tools/perf/Documentation/perf-diff.txt | 5 +++++
>  tools/perf/util/sort.c                 | 9 +++++++++
>  2 files changed, 14 insertions(+)
> 
> diff --git a/tools/perf/Documentation/perf-diff.txt
> b/tools/perf/Documentation/perf-diff.txt
> index e463caa..5182661 100644
> --- a/tools/perf/Documentation/perf-diff.txt
> +++ b/tools/perf/Documentation/perf-diff.txt
> @@ -20,6 +20,11 @@ If no parameters are passed it will assume
> perf.data.old and perf.data.
>  The differential profile is displayed only for events matching both
> specified perf.data files.
> 
> +If no parameters are passed the samples will be sorted by dso and symbol.
> +As the perf.data files could come from different binaries, the symbols
> +addresses could vary. So perf diff is based on the comparison of the
> +files and symbols name.
> +
>  OPTIONS
>  -------
>  -D::
> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c index
> 7a39c1e..4593f36 100644
> --- a/tools/perf/util/sort.c
> +++ b/tools/perf/util/sort.c
> @@ -1463,6 +1463,15 @@ int sort_dimension__add(const char *tok)
>  			sort__has_parent = 1;
>  		} else if (sd->entry == &sort_sym) {
>  			sort__has_sym = 1;
> +			/*
> +			 * perf diff displays the performance difference
> amongst
> +			 * two or more perf.data files. Those files could
> come
> +			 * from different binaries. So we should not
> compare
> +			 * their ips, but the name of symbol.
> +			 */
> +			if (sort__mode == SORT_MODE__DIFF)
> +				sd->entry->se_collapse = sort__sym_sort;
> +
>  		} else if (sd->entry == &sort_dso) {
>  			sort__has_dso = 1;
>  		}
> --
> 1.7.11.7


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH V7 1/1] perf tool:perf diff support for different binaries
  2015-02-25 15:14 ` Liang, Kan
@ 2015-02-25 15:30   ` acme
  0 siblings, 0 replies; 8+ messages in thread
From: acme @ 2015-02-25 15:30 UTC (permalink / raw)
  To: Liang, Kan; +Cc: jolsa, namhyung, linux-kernel, ak

Em Wed, Feb 25, 2015 at 03:14:06PM +0000, Liang, Kan escreveu:
> Hi Arnaldo,
> 
> Could you please review the patch?
> I've already updated the patch description to try to address your concern.
> Please let me know if you have any questions.

Just out of time, sorry, will get to it eventually.

- Arnaldo
 
> Thanks,
> Kan
> 
> > 
> > From: Kan Liang <kan.liang@intel.com>
> > 
> > Currently, the perf diff only works with same binaries. That's because it
> > compares the symbol start address. It doesn't work if the perf.data comes
> > from different binaries. This patch matches the symbol names.
> > 
> > Actually, perf diff once intended to compare the symbol names.
> > The commit as below can look for a pair by name.
> > 604c5c92972d (perf diff: Change the default sort order to "dso,symbol")
> > However, at that time, perf diff used a global list of dsos. That means the
> > binaries which has same name can only be loaded once. That's a problem
> > for different binaries compare. For example, we have an old binary and an
> > updated binary. They very likely have same name and most of the function,
> > so only dsos from old binary will be loaded. When processing the data from
> > updated binary, perf still use the symbol information from old binary.
> > That's wrong.
> > 
> > Then the commit as below used IP to replace symbol name.
> > 9c443dfdd31e ("perf diff: Fix support for all --sort combinations") From that
> > time, perf diff starts to compare the symbol address.
> > 
> > The global dsos is discarded from a patch in 2010.
> > a1645ce12adb ("perf: 'perf kvm' tool for monitoring guest performance
> > from host") However, at that time, perf diff already compared by address.
> > So perf diff cannot work for different binaries as well.
> > 
> > This patch actually rolls back the perf diff to original design. The document
> > is also changed, so everybody knows the original design is to compare the
> > symbol names.
> > 
> > Here is an examples.
> > The only difference between example_v1.c and example_v2.c is the
> > location of f2 and f3. There is no change in behavior, but the previous perf
> > diff display the wrong differential profile.
> > 
> > example_v1.c
> > noinline void f3(void)
> > {
> >         volatile int i;
> >         for (i = 0; i < 10000;) {
> > 
> >                 if(i%2)
> >                         i++;
> >                 else
> >                         i++;
> >         }
> > }
> > 
> > noinline void f2(void)
> > {
> >         volatile int a = 100, b, c;
> >         for (b = 0; b < 10000; b++)
> >                 c = a * b;
> > 
> > }
> > 
> > noinline void f1(void)
> > {
> >                 f2();
> >                 f3();
> > }
> > 
> > int main()
> > {
> >         int i;
> >         for (i = 0; i < 100000; i++)
> >                 f1();
> > }
> > 
> > example_v2.c
> > noinline void f2(void)
> > {
> >         volatile int a = 100, b, c;
> >         for (b = 0; b < 10000; b++)
> >                 c = a * b;
> > }
> > 
> > noinline void f3(void)
> > {
> >         volatile int i;
> >         for (i = 0; i < 10000;) {
> >                 if(i%2)
> >                         i++;
> >                 else
> >                         i++;
> >         }
> > }
> > 
> > noinline void f1(void)
> > {
> >                 f2();
> >                 f3();
> > }
> > 
> > int main()
> > {
> >         int i;
> >         for (i = 0; i < 100000; i++)
> >                 f1();
> > }
> > 
> > [lk@localhost perf_diff]$ gcc example_v1.c -o example [lk@localhost
> > perf_diff]$ perf record -o example_v1.data ./example [ perf record:
> > Woken up 4 times to write data ] [ perf record: Captured and wrote 0.813
> > MB example_v1.data (~35522
> > samples) ]
> > 
> > [lk@localhost perf_diff]$ gcc example_v2.c -o example [lk@localhost
> > perf_diff]$ perf record -o example_v2.data ./example [ perf record:
> > Woken up 4 times to write data ] [ perf record: Captured and wrote 0.824
> > MB example_v2.data (~36015
> > samples) ]
> > 
> > Old perf diff result.
> > [lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
> > Event 'cycles'
> >  Baseline    Delta  Shared Object     Symbol
> >  ........  .......  ................  ...............................
> > 
> >                      [kernel.vmlinux]  [k] __perf_event_task_sched_out
> >      0.00%           [kernel.vmlinux]  [k] apic_timer_interrupt
> >                      [kernel.vmlinux]  [k] idle_cpu
> >                      [kernel.vmlinux]  [k] intel_pstate_timer_func
> >                      [kernel.vmlinux]  [k] native_read_msr_safe
> >      0.00%           [kernel.vmlinux]  [k] native_read_tsc
> >      0.00%           [kernel.vmlinux]  [k] native_write_msr_safe
> >                      [kernel.vmlinux]  [k] ntp_tick_length
> >      0.00%           [kernel.vmlinux]  [k] rb_erase
> >      0.00%           [kernel.vmlinux]  [k] tick_sched_timer
> >      0.00%           [kernel.vmlinux]  [k] unmap_single_vma
> >      0.00%           [kernel.vmlinux]  [k] update_wall_time
> >      0.00%           example           [.] f1
> >     46.24%           example           [.] f2
> >     53.71%   -7.55%  example           [.] f3
> >             +53.81%  example           [.] f3
> >      0.02%           example           [.] main
> > 
> > New perf diff result.
> > [lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
> >                      [kernel.vmlinux]  [k] __perf_event_task_sched_out
> >      0.00%           [kernel.vmlinux]  [k] apic_timer_interrupt
> >                      [kernel.vmlinux]  [k] idle_cpu
> >                      [kernel.vmlinux]  [k] intel_pstate_timer_func
> >                      [kernel.vmlinux]  [k] native_read_msr_safe
> >      0.00%           [kernel.vmlinux]  [k] native_read_tsc
> >      0.00%           [kernel.vmlinux]  [k] native_write_msr_safe
> >                      [kernel.vmlinux]  [k] ntp_tick_length
> >      0.00%           [kernel.vmlinux]  [k] rb_erase
> >      0.00%           [kernel.vmlinux]  [k] tick_sched_timer
> >      0.00%           [kernel.vmlinux]  [k] unmap_single_vma
> >      0.00%           [kernel.vmlinux]  [k] update_wall_time
> >      0.00%           example           [.] f1
> >     46.24%   -0.08%  example           [.] f2
> >     53.71%   +0.11%  example           [.] f3
> >      0.02%           example           [.] main
> > 
> > Signed-off-by: Kan Liang <kan.liang@intel.com>
> > Acked-by: Namhyung Kim <namhyung@kernel.org>
> > Acked-by: Jiri Olsa <jolsa@kernel.org>
> > 
> > ---
> > 
> > Changes since V4:
> >  - Seperate from old patch set
> >  - Add an example in changelog
> >  - Update man page
> > 
> > Changes since V5:
> >  - Correct patch style
> > 
> > Changes since V6:
> >  - Update description
> > 
> >  tools/perf/Documentation/perf-diff.txt | 5 +++++
> >  tools/perf/util/sort.c                 | 9 +++++++++
> >  2 files changed, 14 insertions(+)
> > 
> > diff --git a/tools/perf/Documentation/perf-diff.txt
> > b/tools/perf/Documentation/perf-diff.txt
> > index e463caa..5182661 100644
> > --- a/tools/perf/Documentation/perf-diff.txt
> > +++ b/tools/perf/Documentation/perf-diff.txt
> > @@ -20,6 +20,11 @@ If no parameters are passed it will assume
> > perf.data.old and perf.data.
> >  The differential profile is displayed only for events matching both
> > specified perf.data files.
> > 
> > +If no parameters are passed the samples will be sorted by dso and symbol.
> > +As the perf.data files could come from different binaries, the symbols
> > +addresses could vary. So perf diff is based on the comparison of the
> > +files and symbols name.
> > +
> >  OPTIONS
> >  -------
> >  -D::
> > diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c index
> > 7a39c1e..4593f36 100644
> > --- a/tools/perf/util/sort.c
> > +++ b/tools/perf/util/sort.c
> > @@ -1463,6 +1463,15 @@ int sort_dimension__add(const char *tok)
> >  			sort__has_parent = 1;
> >  		} else if (sd->entry == &sort_sym) {
> >  			sort__has_sym = 1;
> > +			/*
> > +			 * perf diff displays the performance difference
> > amongst
> > +			 * two or more perf.data files. Those files could
> > come
> > +			 * from different binaries. So we should not
> > compare
> > +			 * their ips, but the name of symbol.
> > +			 */
> > +			if (sort__mode == SORT_MODE__DIFF)
> > +				sd->entry->se_collapse = sort__sym_sort;
> > +
> >  		} else if (sd->entry == &sort_dso) {
> >  			sort__has_dso = 1;
> >  		}
> > --
> > 1.7.11.7

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH V7 1/1] perf tool:perf diff support for different binaries
  2015-02-09  5:39 [PATCH V7 1/1] perf tool:perf diff support for different binaries kan.liang
  2015-02-25 15:14 ` Liang, Kan
@ 2015-02-26 15:17 ` Arnaldo Carvalho de Melo
  2015-02-26 20:34   ` Andi Kleen
  2015-02-28  9:32 ` [tip:perf/core] perf diff: Support " tip-bot for Kan Liang
  2 siblings, 1 reply; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-02-26 15:17 UTC (permalink / raw)
  To: kan.liang; +Cc: jolsa, namhyung, linux-kernel, ak

Em Mon, Feb 09, 2015 at 05:39:44AM +0000, kan.liang@intel.com escreveu:
> From: Kan Liang <kan.liang@intel.com>
> Currently, the perf diff only works with same binaries. That's because
> it compares the symbol start address. It doesn't work if the perf.data
> comes from different binaries. This patch matches the symbol names.
 
> Actually, perf diff once intended to compare the symbol names.
> The commit as below can look for a pair by name.
> 604c5c92972d (perf diff: Change the default sort order to "dso,symbol")
> However, at that time, perf diff used a global list of dsos. That means
> the binaries which has same name can only be loaded once. That's a
> problem for different binaries compare. For example, we have an old
> binary and an updated binary. They very likely have same name and most
> of the function, so only dsos from old binary will be loaded. When
> processing the data from updated binary, perf still use the symbol
> information from old binary. That's wrong.
 
> Then the commit as below used IP to replace symbol name.
> 9c443dfdd31e ("perf diff: Fix support for all --sort combinations")
> >From that time, perf diff starts to compare the symbol address.
 
> The global dsos is discarded from a patch in 2010.
> a1645ce12adb ("perf: 'perf kvm' tool for monitoring guest performance
> from host")
> However, at that time, perf diff already compared by address. So perf
> diff cannot work for different binaries as well.
 
> This patch actually rolls back the perf diff to original design. The
> document is also changed, so everybody knows the original design is to
> compare the symbol names.
 
> Here is an examples.
> The only difference between example_v1.c and example_v2.c is the
> location of f2 and f3. There is no change in behavior, but the previous
> perf diff display the wrong differential profile.

Thanks for being persistent and addressing the comments.

Having a nice explanation of the problem helps, as my first reaction to
this patch was: "What? This is what this tool is supposed to do, to
compare two versions of a binary, one that is being developed from the
same source, the other with slight modifications, etc", while the
description of the patch made it look as this was a feature that was now
being introduced.

The problem was that over time new features made it stop working for the
initial purpose of the tool, gack!

It would be _really_ nice if you (or someone else :-) ) could get this
patch description and made it a script that would get the two versions
of the source code, build it, then called perf diff and checked that the
output was the expected one, then added it as an entry to 'perf test',
so that we would catch a problem like this quickly if we ever
reintroduce this problem or something else breaks 'perf diff' :-)

Applying the patch, thanks.

- Arnaldo

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH V7 1/1] perf tool:perf diff support for different binaries
  2015-02-26 15:17 ` Arnaldo Carvalho de Melo
@ 2015-02-26 20:34   ` Andi Kleen
  2015-02-26 20:44     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 8+ messages in thread
From: Andi Kleen @ 2015-02-26 20:34 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: kan.liang, jolsa, namhyung, linux-kernel

> Having a nice explanation of the problem helps, as my first reaction to
> this patch was: "What? This is what this tool is supposed to do, to
> compare two versions of a binary, one that is being developed from the
> same source, the other with slight modifications, etc", while the
> description of the patch made it look as this was a feature that was now
> being introduced.

AFAIK it was actually to compare multiple runs of the same binary
with different setups: e.g. varying number of threads.

However in the Linux kernel development world it turns out comparing
multiple similar binaries is much more useful.

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH V7 1/1] perf tool:perf diff support for different binaries
  2015-02-26 20:34   ` Andi Kleen
@ 2015-02-26 20:44     ` Arnaldo Carvalho de Melo
  2015-02-26 22:49       ` Andi Kleen
  0 siblings, 1 reply; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-02-26 20:44 UTC (permalink / raw)
  To: Andi Kleen; +Cc: kan.liang, jolsa, namhyung, linux-kernel

Em Thu, Feb 26, 2015 at 12:34:04PM -0800, Andi Kleen escreveu:
> > Having a nice explanation of the problem helps, as my first reaction to
> > this patch was: "What? This is what this tool is supposed to do, to
> > compare two versions of a binary, one that is being developed from the
> > same source, the other with slight modifications, etc", while the
> > description of the patch made it look as this was a feature that was now
> > being introduced.
> 
> AFAIK it was actually to compare multiple runs of the same binary
> with different setups: e.g. varying number of threads.
> 
> However in the Linux kernel development world it turns out comparing
> multiple similar binaries is much more useful.

Surely I am getting old and forgetting things, but when I wrote it, my
intent was to do:

vi a.c
build it
perf record a
vi a.c # change it
build it
perf record a
perf diff

And see if what I did while vi'ing it matched what I thought it would.

- Arnaldo

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH V7 1/1] perf tool:perf diff support for different binaries
  2015-02-26 20:44     ` Arnaldo Carvalho de Melo
@ 2015-02-26 22:49       ` Andi Kleen
  0 siblings, 0 replies; 8+ messages in thread
From: Andi Kleen @ 2015-02-26 22:49 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: kan.liang, jolsa, namhyung, linux-kernel

> Surely I am getting old and forgetting things, but when I wrote it, my
> intent was to do:
> 
> vi a.c
> build it
> perf record a
> vi a.c # change it
> build it
> perf record a
> perf diff
> 
> And see if what I did while vi'ing it matched what I thought it would.

You're right of course. As you wrote it you know.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [tip:perf/core] perf diff: Support for different binaries
  2015-02-09  5:39 [PATCH V7 1/1] perf tool:perf diff support for different binaries kan.liang
  2015-02-25 15:14 ` Liang, Kan
  2015-02-26 15:17 ` Arnaldo Carvalho de Melo
@ 2015-02-28  9:32 ` tip-bot for Kan Liang
  2 siblings, 0 replies; 8+ messages in thread
From: tip-bot for Kan Liang @ 2015-02-28  9:32 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, namhyung, ak, kan.liang, linux-kernel, tglx, mingo, jolsa, acme

Commit-ID:  94ba462d69efeba2f97111321a9ba1aa8141da57
Gitweb:     http://git.kernel.org/tip/94ba462d69efeba2f97111321a9ba1aa8141da57
Author:     Kan Liang <kan.liang@intel.com>
AuthorDate: Mon, 9 Feb 2015 05:39:44 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 27 Feb 2015 10:08:38 -0300

perf diff: Support for different binaries

Currently, the perf diff only works with same binaries. That's because
it compares the symbol start address. It doesn't work if the perf.data
comes from different binaries. This patch matches the symbol names.

Actually, perf diff once intended to compare the symbol names.  The
commit as below can look for a pair by name.

604c5c92972d (perf diff: Change the default sort order to "dso,symbol")
However, at that time, perf diff used a global list of dsos. That means
the binaries which has same name can only be loaded once. That's a
problem for comparing different binaries.

For example, we have an old binary and an updated binary. They very
likely have same name and most of the functions, so only dsos from old
binary will be loaded. When processing the data from updated binary,
perf still use the symbol information from old binary. That's wrong.

Then the commit as below used IP to replace symbol name.
9c443dfdd31e ("perf diff: Fix support for all --sort combinations")
>From that time, perf diff starts to compare the symbol address.

The global dsos is discarded from a patch in 2010.
a1645ce12adb ("perf: 'perf kvm' tool for monitoring guest performance
from host")
However, at that time, perf diff already compared by address. So perf
diff cannot work for different binaries as well.

This patch actually rolls back the perf diff to original design. The
document is also changed, so everybody knows the original design is to
compare the symbol names.

Here are some examples:

The only difference between example_v1.c and example_v2.c is the
location of f2 and f3. There is no change in behavior, but the previous
perf diff display the wrong differential profile.

example_v1.c
noinline void f3(void)
{
        volatile int i;
        for (i = 0; i < 10000;) {

                if(i%2)
                        i++;
                else
                        i++;
        }
}

noinline void f2(void)
{
        volatile int a = 100, b, c;
        for (b = 0; b < 10000; b++)
                c = a * b;

}

noinline void f1(void)
{
                f2();
                f3();
}

int main()
{
        int i;
        for (i = 0; i < 100000; i++)
                f1();
}

example_v2.c
noinline void f2(void)
{
        volatile int a = 100, b, c;
        for (b = 0; b < 10000; b++)
                c = a * b;
}

noinline void f3(void)
{
        volatile int i;
        for (i = 0; i < 10000;) {
                if(i%2)
                        i++;
                else
                        i++;
        }
}

noinline void f1(void)
{
                f2();
                f3();
}

int main()
{
        int i;
        for (i = 0; i < 100000; i++)
                f1();
}

[lk@localhost perf_diff]$ gcc example_v1.c -o example
[lk@localhost perf_diff]$ perf record -o example_v1.data ./example
[ perf record: Woken up 4 times to write data ]
[ perf record: Captured and wrote 0.813 MB example_v1.data (~35522 samples) ]

[lk@localhost perf_diff]$ gcc example_v2.c -o example
[lk@localhost perf_diff]$ perf record -o example_v2.data ./example
[ perf record: Woken up 4 times to write data ]
[ perf record: Captured and wrote 0.824 MB example_v2.data (~36015 samples) ]

Old perf diff result:

[lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
 Event 'cycles'
 Baseline    Delta  Shared Object     Symbol
 ........  .......  ................  ...............................

                     [kernel.vmlinux]  [k] __perf_event_task_sched_out
     0.00%           [kernel.vmlinux]  [k] apic_timer_interrupt
                     [kernel.vmlinux]  [k] idle_cpu
                     [kernel.vmlinux]  [k] intel_pstate_timer_func
                     [kernel.vmlinux]  [k] native_read_msr_safe
     0.00%           [kernel.vmlinux]  [k] native_read_tsc
     0.00%           [kernel.vmlinux]  [k] native_write_msr_safe
                     [kernel.vmlinux]  [k] ntp_tick_length
     0.00%           [kernel.vmlinux]  [k] rb_erase
     0.00%           [kernel.vmlinux]  [k] tick_sched_timer
     0.00%           [kernel.vmlinux]  [k] unmap_single_vma
     0.00%           [kernel.vmlinux]  [k] update_wall_time
     0.00%           example           [.] f1
    46.24%           example           [.] f2
    53.71%   -7.55%  example           [.] f3
            +53.81%  example           [.] f3
     0.02%           example           [.] main

New perf diff result:

[lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
                     [kernel.vmlinux]  [k] __perf_event_task_sched_out
     0.00%           [kernel.vmlinux]  [k] apic_timer_interrupt
                     [kernel.vmlinux]  [k] idle_cpu
                     [kernel.vmlinux]  [k] intel_pstate_timer_func
                     [kernel.vmlinux]  [k] native_read_msr_safe
     0.00%           [kernel.vmlinux]  [k] native_read_tsc
     0.00%           [kernel.vmlinux]  [k] native_write_msr_safe
                     [kernel.vmlinux]  [k] ntp_tick_length
     0.00%           [kernel.vmlinux]  [k] rb_erase
     0.00%           [kernel.vmlinux]  [k] tick_sched_timer
     0.00%           [kernel.vmlinux]  [k] unmap_single_vma
     0.00%           [kernel.vmlinux]  [k] update_wall_time
     0.00%           example           [.] f1
    46.24%   -0.08%  example           [.] f2
    53.71%   +0.11%  example           [.] f3
     0.02%           example           [.] main

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1423460384-11645-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-diff.txt | 5 +++++
 tools/perf/util/sort.c                 | 9 +++++++++
 2 files changed, 14 insertions(+)

diff --git a/tools/perf/Documentation/perf-diff.txt b/tools/perf/Documentation/perf-diff.txt
index e463caa..5182661 100644
--- a/tools/perf/Documentation/perf-diff.txt
+++ b/tools/perf/Documentation/perf-diff.txt
@@ -20,6 +20,11 @@ If no parameters are passed it will assume perf.data.old and perf.data.
 The differential profile is displayed only for events matching both
 specified perf.data files.
 
+If no parameters are passed the samples will be sorted by dso and symbol.
+As the perf.data files could come from different binaries, the symbols addresses
+could vary. So perf diff is based on the comparison of the files and
+symbols name.
+
 OPTIONS
 -------
 -D::
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 7a39c1e..4593f36 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1463,6 +1463,15 @@ int sort_dimension__add(const char *tok)
 			sort__has_parent = 1;
 		} else if (sd->entry == &sort_sym) {
 			sort__has_sym = 1;
+			/*
+			 * perf diff displays the performance difference amongst
+			 * two or more perf.data files. Those files could come
+			 * from different binaries. So we should not compare
+			 * their ips, but the name of symbol.
+			 */
+			if (sort__mode == SORT_MODE__DIFF)
+				sd->entry->se_collapse = sort__sym_sort;
+
 		} else if (sd->entry == &sort_dso) {
 			sort__has_dso = 1;
 		}

^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2015-02-28  9:32 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-09  5:39 [PATCH V7 1/1] perf tool:perf diff support for different binaries kan.liang
2015-02-25 15:14 ` Liang, Kan
2015-02-25 15:30   ` acme
2015-02-26 15:17 ` Arnaldo Carvalho de Melo
2015-02-26 20:34   ` Andi Kleen
2015-02-26 20:44     ` Arnaldo Carvalho de Melo
2015-02-26 22:49       ` Andi Kleen
2015-02-28  9:32 ` [tip:perf/core] perf diff: Support " tip-bot for Kan Liang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.