linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH v2 0/7] perf diff: diff cycles at basic block level
  2019-06-03 14:36 [PATCH v2 0/7] perf diff: diff cycles at basic block level Jin Yao
@ 2019-06-03  6:51 ` Jin, Yao
  2019-06-03 14:36 ` [PATCH v2 1/7] perf util: Create block_info structure Jin Yao
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: Jin, Yao @ 2019-06-03  6:51 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin



On 6/3/2019 10:36 PM, Jin Yao wrote:
> In some cases small changes in hot loops can show big differences.
> But it's difficult to identify these differences.
> 
> perf diff currently can only diff symbols (functions). We can also expand
> it to diff cycles of individual programs blocks as reported by timed LBR.
> This would allow to identify changes in specific code accurately.
> 
> With this patch set, for example,
> 
>   # perf record -b ./div
>   # perf record -b ./div
>   # perf diff -s cycles
> 

Sorry, the command line should be perf diff -c cycles.

Thanks
Jin Yao

>   # Event 'cycles'
>   #
>   # Baseline         Block cycles diff [start:end]  Shared Object     Symbol
>   # ........  ....................................  ................  ....................................
>   #
>       49.03%        -9 [         4ef:         520]  div               [.] main
>       49.03%         0 [         4e8:         4ea]  div               [.] main
>       49.03%         0 [         4ef:         500]  div               [.] main
>       49.03%         0 [         4ef:         51c]  div               [.] main
>       49.03%         0 [         4ef:         535]  div               [.] main
>       18.82%         0 [       3ac40:       3ac4d]  libc-2.23.so      [.] __random_r
>       18.82%         0 [       3ac40:       3ac5c]  libc-2.23.so      [.] __random_r
>       18.82%         0 [       3ac40:       3ac76]  libc-2.23.so      [.] __random_r
>       18.82%         0 [       3ac40:       3ac88]  libc-2.23.so      [.] __random_r
>       18.82%         0 [       3ac90:       3ac9c]  libc-2.23.so      [.] __random_r
>       16.29%        -8 [       3aac0:       3aac0]  libc-2.23.so      [.] __random
>       16.29%         0 [       3aac0:       3aad2]  libc-2.23.so      [.] __random
>       16.29%         0 [       3aae0:       3aae7]  libc-2.23.so      [.] __random
>       16.29%         0 [       3ab03:       3ab0f]  libc-2.23.so      [.] __random
>       16.29%         0 [       3ab14:       3ab1b]  libc-2.23.so      [.] __random
>       16.29%         0 [       3ab28:       3ab2e]  libc-2.23.so      [.] __random
>       16.29%         0 [       3ab4a:       3ab53]  libc-2.23.so      [.] __random
>        8.11%         0 [         640:         644]  div               [.] compute_flag
>        8.11%         0 [         649:         659]  div               [.] compute_flag
>        5.46%         0 [       3af60:       3af60]  libc-2.23.so      [.] rand
>        5.46%         0 [       3af60:       3af64]  libc-2.23.so      [.] rand
>        2.25%         0 [         490:         490]  div               [.] rand@plt
>        0.01%        26 [      c00a27:      c00a27]  [kernel.vmlinux]  [k] native_irq_return_iret
>        0.00%      -157 [      2bf9f2:      2bfa63]  [kernel.vmlinux]  [k] update_blocked_averages
>        0.00%       -56 [      2bf980:      2bf9d3]  [kernel.vmlinux]  [k] update_blocked_averages
>        0.00%        48 [      2bf934:      2bf942]  [kernel.vmlinux]  [k] update_blocked_averages
>        0.00%         3 [      2bfb38:      2bfb67]  [kernel.vmlinux]  [k] update_blocked_averages
> 
> The 'cycles' is a new perf-diff computation selection, which enables
> the displaying of cycles difference of same program basic block amongst
> two perf.data. The program basic block is the code block between two
> branches in a function.
> 
>   v2:
>   ---
>   Keep standard perf diff format.
> 
>   Following is the v1 output.
> 
>   # perf diff --basic-block
> 
>   # Cycles diff  Basic block (start:end)
>   # ...........  .......................
>   #
>            -208  hrtimer_interrupt (30b9e0:30ba42)
>            -157  update_blocked_averages (2bf9f2:2bfa63)
>            -126  interrupt_entry (c00880:c0093a)
>             -86  hrtimer_interrupt (30bb29:30bb32)
>             -74  hrtimer_interrupt (30ba65:30bac4)
>             -56  update_blocked_averages (2bf980:2bf9d3)
>              48  update_blocked_averages (2bf934:2bf942)
>             -35  native_write_msr (267900:26790b)
>              26  native_irq_return_iret (c00a27:c00a27)
>              22  rcu_check_callbacks (2febb6:2febdc)
>             -21  __hrtimer_run_queues (30b220:30b2a3)
>              19  pvclock_gtod_notify (14ba0:14c1b)
>             -18  task_tick_fair (2c5d29:2c5d41)
> 
> Jin Yao (7):
>    perf util: Create block_info structure
>    perf util: Add block_info in hist_entry
>    perf diff: Check if all data files with branch stacks
>    perf diff: Use hists to manage basic blocks per symbol
>    perf diff: Link same basic blocks among different data files
>    perf diff: Print the basic block cycles diff
>    perf diff: Documentation -c cycles option
> 
>   tools/perf/Documentation/perf-diff.txt |  14 +-
>   tools/perf/builtin-diff.c              | 373 ++++++++++++++++++++++++++++++++-
>   tools/perf/ui/stdio/hist.c             |  26 +++
>   tools/perf/util/hist.c                 |  42 +++-
>   tools/perf/util/hist.h                 |   9 +
>   tools/perf/util/sort.h                 |   8 +
>   tools/perf/util/symbol.c               |  22 ++
>   tools/perf/util/symbol.h               |  23 ++
>   8 files changed, 509 insertions(+), 8 deletions(-)
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 0/7] perf diff: diff cycles at basic block level
@ 2019-06-03 14:36 Jin Yao
  2019-06-03  6:51 ` Jin, Yao
                   ` (8 more replies)
  0 siblings, 9 replies; 27+ messages in thread
From: Jin Yao @ 2019-06-03 14:36 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

In some cases small changes in hot loops can show big differences.
But it's difficult to identify these differences.

perf diff currently can only diff symbols (functions). We can also expand
it to diff cycles of individual programs blocks as reported by timed LBR.
This would allow to identify changes in specific code accurately.

With this patch set, for example,

 # perf record -b ./div
 # perf record -b ./div
 # perf diff -s cycles

 # Event 'cycles'
 #
 # Baseline         Block cycles diff [start:end]  Shared Object     Symbol
 # ........  ....................................  ................  ....................................
 #
     49.03%        -9 [         4ef:         520]  div               [.] main
     49.03%         0 [         4e8:         4ea]  div               [.] main
     49.03%         0 [         4ef:         500]  div               [.] main
     49.03%         0 [         4ef:         51c]  div               [.] main
     49.03%         0 [         4ef:         535]  div               [.] main
     18.82%         0 [       3ac40:       3ac4d]  libc-2.23.so      [.] __random_r
     18.82%         0 [       3ac40:       3ac5c]  libc-2.23.so      [.] __random_r
     18.82%         0 [       3ac40:       3ac76]  libc-2.23.so      [.] __random_r
     18.82%         0 [       3ac40:       3ac88]  libc-2.23.so      [.] __random_r
     18.82%         0 [       3ac90:       3ac9c]  libc-2.23.so      [.] __random_r
     16.29%        -8 [       3aac0:       3aac0]  libc-2.23.so      [.] __random
     16.29%         0 [       3aac0:       3aad2]  libc-2.23.so      [.] __random
     16.29%         0 [       3aae0:       3aae7]  libc-2.23.so      [.] __random
     16.29%         0 [       3ab03:       3ab0f]  libc-2.23.so      [.] __random
     16.29%         0 [       3ab14:       3ab1b]  libc-2.23.so      [.] __random
     16.29%         0 [       3ab28:       3ab2e]  libc-2.23.so      [.] __random
     16.29%         0 [       3ab4a:       3ab53]  libc-2.23.so      [.] __random
      8.11%         0 [         640:         644]  div               [.] compute_flag
      8.11%         0 [         649:         659]  div               [.] compute_flag
      5.46%         0 [       3af60:       3af60]  libc-2.23.so      [.] rand
      5.46%         0 [       3af60:       3af64]  libc-2.23.so      [.] rand
      2.25%         0 [         490:         490]  div               [.] rand@plt
      0.01%        26 [      c00a27:      c00a27]  [kernel.vmlinux]  [k] native_irq_return_iret
      0.00%      -157 [      2bf9f2:      2bfa63]  [kernel.vmlinux]  [k] update_blocked_averages
      0.00%       -56 [      2bf980:      2bf9d3]  [kernel.vmlinux]  [k] update_blocked_averages
      0.00%        48 [      2bf934:      2bf942]  [kernel.vmlinux]  [k] update_blocked_averages
      0.00%         3 [      2bfb38:      2bfb67]  [kernel.vmlinux]  [k] update_blocked_averages

The 'cycles' is a new perf-diff computation selection, which enables
the displaying of cycles difference of same program basic block amongst
two perf.data. The program basic block is the code block between two
branches in a function.

 v2:
 ---
 Keep standard perf diff format.

 Following is the v1 output.

 # perf diff --basic-block

 # Cycles diff  Basic block (start:end)
 # ...........  .......................
 #
          -208  hrtimer_interrupt (30b9e0:30ba42)
          -157  update_blocked_averages (2bf9f2:2bfa63)
          -126  interrupt_entry (c00880:c0093a)
           -86  hrtimer_interrupt (30bb29:30bb32)
           -74  hrtimer_interrupt (30ba65:30bac4)
           -56  update_blocked_averages (2bf980:2bf9d3)
            48  update_blocked_averages (2bf934:2bf942)
           -35  native_write_msr (267900:26790b)
            26  native_irq_return_iret (c00a27:c00a27)
            22  rcu_check_callbacks (2febb6:2febdc)
           -21  __hrtimer_run_queues (30b220:30b2a3)
            19  pvclock_gtod_notify (14ba0:14c1b)
           -18  task_tick_fair (2c5d29:2c5d41)

Jin Yao (7):
  perf util: Create block_info structure
  perf util: Add block_info in hist_entry
  perf diff: Check if all data files with branch stacks
  perf diff: Use hists to manage basic blocks per symbol
  perf diff: Link same basic blocks among different data files
  perf diff: Print the basic block cycles diff
  perf diff: Documentation -c cycles option

 tools/perf/Documentation/perf-diff.txt |  14 +-
 tools/perf/builtin-diff.c              | 373 ++++++++++++++++++++++++++++++++-
 tools/perf/ui/stdio/hist.c             |  26 +++
 tools/perf/util/hist.c                 |  42 +++-
 tools/perf/util/hist.h                 |   9 +
 tools/perf/util/sort.h                 |   8 +
 tools/perf/util/symbol.c               |  22 ++
 tools/perf/util/symbol.h               |  23 ++
 8 files changed, 509 insertions(+), 8 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 1/7] perf util: Create block_info structure
  2019-06-03 14:36 [PATCH v2 0/7] perf diff: diff cycles at basic block level Jin Yao
  2019-06-03  6:51 ` Jin, Yao
@ 2019-06-03 14:36 ` Jin Yao
  2019-06-03 14:36 ` [PATCH v2 2/7] perf util: Add block_info in hist_entry Jin Yao
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: Jin Yao @ 2019-06-03 14:36 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

perf diff currently can only diff symbols(functions). We should expand it
to diff cycles of individual programs blocks as reported by timed LBR.
This would allow to identify changes in specific code accurately.

We need a new structure to maintain the basic block information, such as,
symbol(function), start/end address of this block, cycles. This patch
creates this structure and with some ops.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/util/symbol.c | 22 ++++++++++++++++++++++
 tools/perf/util/symbol.h | 23 +++++++++++++++++++++++
 2 files changed, 45 insertions(+)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index f4540f8..4e0a7b3 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -2351,3 +2351,25 @@ struct mem_info *mem_info__new(void)
 		refcount_set(&mi->refcnt, 1);
 	return mi;
 }
+
+struct block_info *block_info__get(struct block_info *bi)
+{
+	if (bi)
+		refcount_inc(&bi->refcnt);
+	return bi;
+}
+
+void block_info__put(struct block_info *bi)
+{
+	if (bi && refcount_dec_and_test(&bi->refcnt))
+		free(bi);
+}
+
+struct block_info *block_info__new(void)
+{
+	struct block_info *bi = zalloc(sizeof(*bi));
+
+	if (bi)
+		refcount_set(&bi->refcnt, 1);
+	return bi;
+}
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 9a8fe01..12755b4 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -131,6 +131,17 @@ struct mem_info {
 	refcount_t		refcnt;
 };
 
+struct block_info {
+	struct symbol		*sym;
+	u64			start;
+	u64			end;
+	u64			cycles;
+	u64			cycles_aggr;
+	int			num;
+	int			num_aggr;
+	refcount_t		refcnt;
+};
+
 struct addr_location {
 	struct machine *machine;
 	struct thread *thread;
@@ -332,4 +343,16 @@ static inline void __mem_info__zput(struct mem_info **mi)
 
 #define mem_info__zput(mi) __mem_info__zput(&mi)
 
+struct block_info *block_info__new(void);
+struct block_info *block_info__get(struct block_info *bi);
+void   block_info__put(struct block_info *bi);
+
+static inline void __block_info__zput(struct block_info **bi)
+{
+	block_info__put(*bi);
+	*bi = NULL;
+}
+
+#define block_info__zput(bi) __block_info__zput(&bi)
+
 #endif /* __PERF_SYMBOL */
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 2/7] perf util: Add block_info in hist_entry
  2019-06-03 14:36 [PATCH v2 0/7] perf diff: diff cycles at basic block level Jin Yao
  2019-06-03  6:51 ` Jin, Yao
  2019-06-03 14:36 ` [PATCH v2 1/7] perf util: Create block_info structure Jin Yao
@ 2019-06-03 14:36 ` Jin Yao
  2019-06-03 14:36 ` [PATCH v2 3/7] perf diff: Check if all data files with branch stacks Jin Yao
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: Jin Yao @ 2019-06-03 14:36 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

The block_info contains the program basic block information, i.e,
contains the start address and the end address of this basic block and
how much cycles it takes. We need to compare, sort and even print out
the basic block by some orders, i.e. sort by cycles.

For this purpose, we add block_info field to hist_entry. In order not to
impact current interface, we creates a new function hists__add_entry_block.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/util/hist.c | 22 ++++++++++++++++++++--
 tools/perf/util/hist.h |  6 ++++++
 tools/perf/util/sort.h |  1 +
 3 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index fb3271f..680ad93 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -574,6 +574,8 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
 			 */
 			mem_info__zput(entry->mem_info);
 
+			block_info__zput(entry->block_info);
+
 			/* If the map of an existing hist_entry has
 			 * become out-of-date due to an exec() or
 			 * similar, update it.  Otherwise we will
@@ -645,6 +647,7 @@ __hists__add_entry(struct hists *hists,
 		   struct symbol *sym_parent,
 		   struct branch_info *bi,
 		   struct mem_info *mi,
+		   struct block_info *block_info,
 		   struct perf_sample *sample,
 		   bool sample_self,
 		   struct hist_entry_ops *ops)
@@ -677,6 +680,7 @@ __hists__add_entry(struct hists *hists,
 		.hists	= hists,
 		.branch_info = bi,
 		.mem_info = mi,
+		.block_info = block_info,
 		.transaction = sample->transaction,
 		.raw_data = sample->raw_data,
 		.raw_size = sample->raw_size,
@@ -699,7 +703,7 @@ struct hist_entry *hists__add_entry(struct hists *hists,
 				    struct perf_sample *sample,
 				    bool sample_self)
 {
-	return __hists__add_entry(hists, al, sym_parent, bi, mi,
+	return __hists__add_entry(hists, al, sym_parent, bi, mi, NULL,
 				  sample, sample_self, NULL);
 }
 
@@ -712,10 +716,24 @@ struct hist_entry *hists__add_entry_ops(struct hists *hists,
 					struct perf_sample *sample,
 					bool sample_self)
 {
-	return __hists__add_entry(hists, al, sym_parent, bi, mi,
+	return __hists__add_entry(hists, al, sym_parent, bi, mi, NULL,
 				  sample, sample_self, ops);
 }
 
+struct hist_entry *hists__add_entry_block(struct hists *hists,
+					  struct hist_entry_ops *ops,
+					  struct addr_location *al,
+					  struct block_info *block_info)
+{
+	struct hist_entry entry = {
+		.ops = ops,
+		.block_info = block_info,
+		.hists = hists,
+	}, *he = hists__findnew_entry(hists, &entry, al, false);
+
+	return he;
+}
+
 static int
 iter_next_nop_entry(struct hist_entry_iter *iter __maybe_unused,
 		    struct addr_location *al __maybe_unused)
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 76ff6c6..c8f7d66 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -16,6 +16,7 @@ struct addr_location;
 struct map_symbol;
 struct mem_info;
 struct branch_info;
+struct block_info;
 struct symbol;
 
 enum hist_filter {
@@ -149,6 +150,11 @@ struct hist_entry *hists__add_entry_ops(struct hists *hists,
 					struct perf_sample *sample,
 					bool sample_self);
 
+struct hist_entry *hists__add_entry_block(struct hists *hists,
+					  struct hist_entry_ops *ops,
+					  struct addr_location *al,
+					  struct block_info *bi);
+
 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
 			 int max_stack_depth, void *arg);
 
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index ce376a7..43623fa 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -144,6 +144,7 @@ struct hist_entry {
 	long			time;
 	struct hists		*hists;
 	struct mem_info		*mem_info;
+	struct block_info	*block_info;
 	void			*raw_data;
 	u32			raw_size;
 	int			num_res;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 3/7] perf diff: Check if all data files with branch stacks
  2019-06-03 14:36 [PATCH v2 0/7] perf diff: diff cycles at basic block level Jin Yao
                   ` (2 preceding siblings ...)
  2019-06-03 14:36 ` [PATCH v2 2/7] perf util: Add block_info in hist_entry Jin Yao
@ 2019-06-03 14:36 ` Jin Yao
  2019-06-03 14:36 ` [PATCH v2 4/7] perf diff: Use hists to manage basic blocks per symbol Jin Yao
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: Jin Yao @ 2019-06-03 14:36 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

We will expand perf diff to support diff cycles of individual programs
blocks, so it requires all data files having branch stacks.

This patch checks HEADER_BRANCH_STACK in header, and only set the flag
has_br_stack when HEADER_BRANCH_STACK are set in all data files.

 v2:
 ---
 Move check_file_brstack() from __cmd_diff() to cmd_diff().
 Because later patch will check flag 'has_br_stack' before
 ui_init().

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-diff.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 6e79207..a7e0420 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -32,6 +32,7 @@ struct perf_diff {
 	struct perf_time_interval	*ptime_range;
 	int				 range_size;
 	int				 range_num;
+	bool				 has_br_stack;
 };
 
 /* Diff command specific HPP columns. */
@@ -873,6 +874,31 @@ static int parse_time_str(struct data__file *d, char *abstime_ostr,
 	return ret;
 }
 
+static int check_file_brstack(void)
+{
+	struct data__file *d;
+	bool has_br_stack;
+	int i;
+
+	data__for_each_file(i, d) {
+		d->session = perf_session__new(&d->data, false, &pdiff.tool);
+		if (!d->session) {
+			pr_err("Failed to open %s\n", d->data.path);
+			return -1;
+		}
+
+		has_br_stack = perf_header__has_feat(&d->session->header,
+						     HEADER_BRANCH_STACK);
+		perf_session__delete(d->session);
+		if (!has_br_stack)
+			return 0;
+	}
+
+	/* Set only all files having branch stacks */
+	pdiff.has_br_stack = true;
+	return 0;
+}
+
 static int __cmd_diff(void)
 {
 	struct data__file *d;
@@ -1487,6 +1513,9 @@ int cmd_diff(int argc, const char **argv)
 	if (data_init(argc, argv) < 0)
 		return -1;
 
+	if (check_file_brstack() < 0)
+		return -1;
+
 	if (ui_init() < 0)
 		return -1;
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 4/7] perf diff: Use hists to manage basic blocks per symbol
  2019-06-03 14:36 [PATCH v2 0/7] perf diff: diff cycles at basic block level Jin Yao
                   ` (3 preceding siblings ...)
  2019-06-03 14:36 ` [PATCH v2 3/7] perf diff: Check if all data files with branch stacks Jin Yao
@ 2019-06-03 14:36 ` Jin Yao
  2019-06-05 11:44   ` Jiri Olsa
                     ` (2 more replies)
  2019-06-03 14:36 ` [PATCH v2 5/7] perf diff: Link same basic blocks among different data files Jin Yao
                   ` (3 subsequent siblings)
  8 siblings, 3 replies; 27+ messages in thread
From: Jin Yao @ 2019-06-03 14:36 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

The hist__account_cycles() can account cycles per basic
block. The basic block information is saved in cycles_hist
structure.

This patch processes each symbol, get basic blocks from
cycles_hist and add the basic block entries to a new hists
(block_hists in hist_entry). Using a hists is because
we need to compare, sort and print the basic blocks.

 v2:
 ---
 v1 adds the basic block entries to per data-file hists
 but v2 adds the basic block entries to per symbol hists.
 That is to keep current perf-diff format. Will show the
 result in next patches.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-diff.c | 194 ++++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/sort.h    |   6 ++
 2 files changed, 200 insertions(+)

diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index a7e0420..310ba2a 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -20,6 +20,7 @@
 #include "util/data.h"
 #include "util/config.h"
 #include "util/time-utils.h"
+#include "util/annotate.h"
 
 #include <errno.h>
 #include <inttypes.h>
@@ -64,6 +65,17 @@ struct data__file {
 	struct diff_hpp_fmt	 fmt[PERF_HPP_DIFF__MAX_INDEX];
 };
 
+struct block_hpp_fmt {
+	struct perf_hpp_fmt	fmt;
+	struct data__file	*file;
+};
+
+struct block_hists {
+	struct hists		hists;
+	struct perf_hpp_list	list;
+	struct block_hpp_fmt	block_fmt;
+};
+
 static struct data__file *data__files;
 static int data__files_cnt;
 
@@ -87,11 +99,14 @@ static s64 compute_wdiff_w2;
 static const char		*cpu_list;
 static DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
 
+static struct addr_location dummy_al;
+
 enum {
 	COMPUTE_DELTA,
 	COMPUTE_RATIO,
 	COMPUTE_WEIGHTED_DIFF,
 	COMPUTE_DELTA_ABS,
+	COMPUTE_CYCLES,
 	COMPUTE_MAX,
 };
 
@@ -100,6 +115,7 @@ const char *compute_names[COMPUTE_MAX] = {
 	[COMPUTE_DELTA_ABS] = "delta-abs",
 	[COMPUTE_RATIO] = "ratio",
 	[COMPUTE_WEIGHTED_DIFF] = "wdiff",
+	[COMPUTE_CYCLES] = "cycles",
 };
 
 static int compute = COMPUTE_DELTA_ABS;
@@ -234,6 +250,8 @@ static int setup_compute(const struct option *opt, const char *str,
 	for (i = 0; i < COMPUTE_MAX; i++)
 		if (!strcmp(cstr, compute_names[i])) {
 			*cp = i;
+			if (i == COMPUTE_CYCLES)
+				break;
 			return setup_compute_opt(option);
 		}
 
@@ -368,6 +386,9 @@ static int diff__process_sample_event(struct perf_tool *tool,
 		goto out_put;
 	}
 
+	if (compute == COMPUTE_CYCLES)
+		hist__account_cycles(sample->branch_stack, &al, sample, false);
+
 	/*
 	 * The total_period is updated here before going to the output
 	 * tree since normally only the baseline hists will call
@@ -475,6 +496,155 @@ static void hists__baseline_only(struct hists *hists)
 	}
 }
 
+static int64_t block_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+			 struct hist_entry *left, struct hist_entry *right)
+{
+	struct block_info *bi_l = left->block_info;
+	struct block_info *bi_r = right->block_info;
+	int cmp;
+
+	if (!bi_l->sym || !bi_r->sym) {
+		if (!bi_l->sym && !bi_r->sym)
+			return 0;
+		else if (!bi_l->sym)
+			return -1;
+		else
+			return 1;
+	}
+
+	if (bi_l->sym == bi_r->sym) {
+		if (bi_l->start == bi_r->start) {
+			if (bi_l->end == bi_r->end)
+				return 0;
+			else
+				return (int64_t)(bi_r->end - bi_l->end);
+		} else
+			return (int64_t)(bi_r->start - bi_l->start);
+	} else {
+		cmp = strcmp(bi_l->sym->name, bi_r->sym->name);
+		return cmp;
+	}
+
+	if (bi_l->sym->start != bi_r->sym->start)
+		return (int64_t)(bi_r->sym->start - bi_l->sym->start);
+
+	return (int64_t)(bi_r->sym->end - bi_l->sym->end);
+}
+
+static int64_t block_cycles_diff_cmp(struct hist_entry *left,
+				     struct hist_entry *right)
+{
+	bool pairs_left  = hist_entry__has_pairs(left);
+	bool pairs_right = hist_entry__has_pairs(right);
+	s64 l, r;
+
+	if (!pairs_left && !pairs_right)
+		return 0;
+
+	l = abs(left->diff.cycles);
+	r = abs(right->diff.cycles);
+	return r - l;
+}
+
+static int64_t block_sort(struct perf_hpp_fmt *fmt __maybe_unused,
+			  struct hist_entry *left, struct hist_entry *right)
+{
+	return block_cycles_diff_cmp(right, left);
+}
+
+static struct block_hists *alloc_block_hists(struct data__file *d)
+{
+	struct block_hists *block_hists = zalloc(sizeof(*block_hists));
+	struct block_hpp_fmt *block_fmt;
+
+	if (block_hists) {
+		__hists__init(&block_hists->hists, &block_hists->list);
+		perf_hpp_list__init(&block_hists->list);
+
+		block_fmt = &block_hists->block_fmt;
+		INIT_LIST_HEAD(&block_fmt->fmt.list);
+		INIT_LIST_HEAD(&block_fmt->fmt.sort_list);
+		block_fmt->fmt.cmp = block_cmp;
+		block_fmt->fmt.sort = block_sort;
+		block_fmt->file = d;
+		perf_hpp_list__register_sort_field(&block_hists->list,
+						   &block_fmt->fmt);
+	}
+
+	return block_hists;
+}
+
+static void init_block_info(struct block_info *bi, struct symbol *sym,
+			    struct cyc_hist *ch, int offset)
+{
+	bi->sym = sym;
+	bi->start = ch->start;
+	bi->end = offset;
+	bi->cycles = ch->cycles;
+	bi->cycles_aggr = ch->cycles_aggr;
+	bi->num = ch->num;
+	bi->num_aggr = ch->num_aggr;
+}
+
+static void *block_he_zalloc(size_t size)
+{
+	return zalloc(size + sizeof(struct hist_entry));
+}
+
+static void block_he_free(void *he)
+{
+	struct block_info *bi = ((struct hist_entry *)he)->block_info;
+
+	block_info__put(bi);
+	free(he);
+}
+
+struct hist_entry_ops block_he_ops = {
+	.new    = block_he_zalloc,
+	.free   = block_he_free,
+};
+
+static int process_block_per_sym(struct hist_entry *he, struct data__file *d)
+{
+	struct annotation *notes;
+	struct cyc_hist *ch;
+
+	if (!he->ms.map || !he->ms.sym)
+		return 0;
+
+	notes = symbol__annotation(he->ms.sym);
+	if (!notes || !notes->src || !notes->src->cycles_hist)
+		return 0;
+
+	he->block_hists = alloc_block_hists(d);
+	if (!he->block_hists)
+		return -1;
+
+	ch = notes->src->cycles_hist;
+	for (unsigned int i = 0; i < symbol__size(he->ms.sym); i++) {
+		if (ch[i].num_aggr) {
+			struct block_info *bi;
+			struct hist_entry *he_block;
+			struct block_hists *block_hists = he->block_hists;
+
+			bi = block_info__new();
+			if (!bi)
+				return -1;
+
+			init_block_info(bi, he->ms.sym, &ch[i], i);
+			he_block = hists__add_entry_block(&block_hists->hists,
+							  &block_he_ops,
+							  &dummy_al, bi);
+			if (!he_block) {
+				block_info__put(bi);
+				return -1;
+			}
+		}
+	}
+
+	return 0;
+}
+
 static void hists__precompute(struct hists *hists)
 {
 	struct rb_root_cached *root;
@@ -494,6 +664,9 @@ static void hists__precompute(struct hists *hists)
 		he   = rb_entry(next, struct hist_entry, rb_node_in);
 		next = rb_next(&he->rb_node_in);
 
+		if (compute == COMPUTE_CYCLES)
+			process_block_per_sym(he, &data__files[0]);
+
 		data__for_each_file_new(i, d) {
 			pair = get_pair_data(he, d);
 			if (!pair)
@@ -510,6 +683,9 @@ static void hists__precompute(struct hists *hists)
 			case COMPUTE_WEIGHTED_DIFF:
 				compute_wdiff(he, pair);
 				break;
+			case COMPUTE_CYCLES:
+				process_block_per_sym(pair, d);
+				break;
 			default:
 				BUG_ON(1);
 			}
@@ -713,6 +889,14 @@ hist_entry__cmp_wdiff_idx(struct perf_hpp_fmt *fmt __maybe_unused,
 					   sort_compute);
 }
 
+static int64_t
+hist_entry__cmp_cycles_idx(struct perf_hpp_fmt *fmt __maybe_unused,
+			   struct hist_entry *left __maybe_unused,
+			   struct hist_entry *right __maybe_unused)
+{
+	return 0;
+}
+
 static void hists__process(struct hists *hists)
 {
 	if (show_baseline_only)
@@ -746,6 +930,8 @@ static void data_process(void)
 	struct perf_evsel *evsel_base;
 	bool first = true;
 
+	memset(&dummy_al, 0, sizeof(dummy_al));
+
 	evlist__for_each_entry(evlist_base, evsel_base) {
 		struct hists *hists_base = evsel__hists(evsel_base);
 		struct data__file *d;
@@ -1411,6 +1597,9 @@ static int ui_init(void)
 	case COMPUTE_DELTA_ABS:
 		fmt->sort = hist_entry__cmp_delta_abs_idx;
 		break;
+	case COMPUTE_CYCLES:
+		fmt->sort = hist_entry__cmp_cycles_idx;
+		break;
 	default:
 		BUG_ON(1);
 	}
@@ -1507,6 +1696,8 @@ int cmd_diff(int argc, const char **argv)
 	if (quiet)
 		perf_quiet_option();
 
+	symbol__annotation_init();
+
 	if (symbol__init(NULL) < 0)
 		return -1;
 
@@ -1516,6 +1707,9 @@ int cmd_diff(int argc, const char **argv)
 	if (check_file_brstack() < 0)
 		return -1;
 
+	if (compute == COMPUTE_CYCLES && !pdiff.has_br_stack)
+		return -1;
+
 	if (ui_init() < 0)
 		return -1;
 
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 43623fa..d1641da 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -79,6 +79,9 @@ struct hist_entry_diff {
 
 		/* HISTC_WEIGHTED_DIFF */
 		s64	wdiff;
+
+		/* PERF_HPP_DIFF__CYCLES */
+		s64	cycles;
 	};
 };
 
@@ -143,6 +146,9 @@ struct hist_entry {
 	struct branch_info	*branch_info;
 	long			time;
 	struct hists		*hists;
+	void			*block_hists;
+	int			block_idx;
+	int			block_num;
 	struct mem_info		*mem_info;
 	struct block_info	*block_info;
 	void			*raw_data;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 5/7] perf diff: Link same basic blocks among different data files
  2019-06-03 14:36 [PATCH v2 0/7] perf diff: diff cycles at basic block level Jin Yao
                   ` (4 preceding siblings ...)
  2019-06-03 14:36 ` [PATCH v2 4/7] perf diff: Use hists to manage basic blocks per symbol Jin Yao
@ 2019-06-03 14:36 ` Jin Yao
  2019-06-03 14:36 ` [PATCH v2 6/7] perf diff: Print the basic block cycles diff Jin Yao
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: Jin Yao @ 2019-06-03 14:36 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

The target is to compare the performance difference (cycles
diff) for the same basic blocks in different data files.

The same basic block means same function, same start address
and same end address. This patch finds the same basic blocks
from different data files and link them together and resort
by the cycles diff.

 v2:
 ---
 Since now the basic block hists is changed to per symbol,
 the patch only links the basic block hists for the same
 symbol in different data files.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-diff.c | 93 ++++++++++++++++++++++++++++++++++++++++++++++-
 tools/perf/util/hist.c    |  2 +-
 tools/perf/util/sort.h    |  1 +
 3 files changed, 94 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 310ba2a..b50ed70 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -645,6 +645,84 @@ static int process_block_per_sym(struct hist_entry *he, struct data__file *d)
 	return 0;
 }
 
+static int block_pair_cmp(struct hist_entry *a, struct hist_entry *b)
+{
+	struct block_info *bi_a = a->block_info;
+	struct block_info *bi_b = b->block_info;
+	int cmp;
+
+	if (!bi_a->sym || !bi_b->sym)
+		return -1;
+
+	if (bi_a->sym->name && bi_b->sym->name) {
+		cmp = strcmp(bi_a->sym->name, bi_b->sym->name);
+		if ((!cmp) && (bi_a->start == bi_b->start) &&
+		    (bi_a->end == bi_b->end)) {
+			return 0;
+		}
+	}
+
+	return -1;
+}
+
+static struct hist_entry *get_block_pair(struct hist_entry *he,
+					 struct hists *hists_pair)
+{
+	struct rb_root_cached *root = hists_pair->entries_in;
+	struct rb_node *next = rb_first_cached(root);
+	int cmp;
+
+	while (next != NULL) {
+		struct hist_entry *he_pair = rb_entry(next, struct hist_entry,
+						      rb_node_in);
+
+		next = rb_next(&he_pair->rb_node_in);
+
+		cmp = block_pair_cmp(he_pair, he);
+		if (!cmp)
+			return he_pair;
+	}
+
+	return NULL;
+}
+
+static void compute_cycles_diff(struct hist_entry *he,
+				struct hist_entry *pair)
+{
+	pair->diff.computed = true;
+	if (pair->block_info->num && he->block_info->num) {
+		pair->diff.cycles =
+			pair->block_info->cycles_aggr / pair->block_info->num_aggr -
+			he->block_info->cycles_aggr / he->block_info->num_aggr;
+	}
+}
+
+static void block_hists_match(struct hists *hists_base,
+			      struct hists *hists_pair)
+{
+	struct rb_root_cached *root = hists_base->entries_in;
+	struct rb_node *next = rb_first_cached(root);
+
+	while (next != NULL) {
+		struct hist_entry *he = rb_entry(next, struct hist_entry,
+						 rb_node_in);
+		struct hist_entry *pair = get_block_pair(he, hists_pair);
+
+		next = rb_next(&he->rb_node_in);
+
+		if (pair) {
+			hist_entry__add_pair(pair, he);
+			compute_cycles_diff(he, pair);
+		}
+	}
+}
+
+static int filter_cb(struct hist_entry *he, void *arg __maybe_unused)
+{
+	he->not_collen = true;
+	return 0;
+}
+
 static void hists__precompute(struct hists *hists)
 {
 	struct rb_root_cached *root;
@@ -659,13 +737,19 @@ static void hists__precompute(struct hists *hists)
 	while (next != NULL) {
 		struct hist_entry *he, *pair;
 		struct data__file *d;
+		struct block_hists *b;
+		struct block_hists *p;
 		int i;
 
 		he   = rb_entry(next, struct hist_entry, rb_node_in);
 		next = rb_next(&he->rb_node_in);
 
-		if (compute == COMPUTE_CYCLES)
+		if (compute == COMPUTE_CYCLES) {
 			process_block_per_sym(he, &data__files[0]);
+			b = he->block_hists;
+			if (b)
+				he->block_num = b->hists.nr_entries;
+		}
 
 		data__for_each_file_new(i, d) {
 			pair = get_pair_data(he, d);
@@ -685,6 +769,13 @@ static void hists__precompute(struct hists *hists)
 				break;
 			case COMPUTE_CYCLES:
 				process_block_per_sym(pair, d);
+				b = he->block_hists;
+				p = pair->block_hists;
+				if (b && p) {
+					block_hists_match(&b->hists, &p->hists);
+					hists__output_resort_cb(&p->hists, NULL,
+								filter_cb);
+				}
 				break;
 			default:
 				BUG_ON(1);
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 680ad93..10ab45a 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1846,7 +1846,7 @@ static void output_resort(struct hists *hists, struct ui_progress *prog,
 		__hists__insert_output_entry(&hists->entries, n, min_callchain_hits, use_callchain);
 		hists__inc_stats(hists, n);
 
-		if (!n->filtered)
+		if ((!n->filtered) && (!n->not_collen))
 			hists__calc_col_len(hists, n);
 
 		if (prog)
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index d1641da..1b9752d 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -122,6 +122,7 @@ struct hist_entry {
 
 	char			level;
 	u8			filtered;
+	bool			not_collen;
 
 	u16			callchain_size;
 	union {
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 6/7] perf diff: Print the basic block cycles diff
  2019-06-03 14:36 [PATCH v2 0/7] perf diff: diff cycles at basic block level Jin Yao
                   ` (5 preceding siblings ...)
  2019-06-03 14:36 ` [PATCH v2 5/7] perf diff: Link same basic blocks among different data files Jin Yao
@ 2019-06-03 14:36 ` Jin Yao
  2019-06-05 11:44   ` Jiri Olsa
  2019-06-05 11:44   ` Jiri Olsa
  2019-06-03 14:36 ` [PATCH v2 7/7] perf diff: Documentation -c cycles option Jin Yao
  2019-06-05 11:44 ` [PATCH v2 0/7] perf diff: diff cycles at basic block level Jiri Olsa
  8 siblings, 2 replies; 27+ messages in thread
From: Jin Yao @ 2019-06-03 14:36 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

perf record -b ./div
perf record -b ./div

Following is the default perf diff output

 # perf diff

 # Event 'cycles'
 #
 # Baseline  Delta Abs  Shared Object     Symbol
 # ........  .........  ................  ....................................
 #
     49.03%     +0.30%  div               [.] main
     16.29%     -0.20%  libc-2.23.so      [.] __random
     18.82%     -0.07%  libc-2.23.so      [.] __random_r
      8.11%     -0.04%  div               [.] compute_flag
      2.25%     +0.01%  div               [.] rand@plt
      0.00%     +0.01%  [kernel.vmlinux]  [k] task_tick_fair
      5.46%     +0.01%  libc-2.23.so      [.] rand
      0.01%     -0.01%  [kernel.vmlinux]  [k] native_irq_return_iret
      0.00%     -0.00%  [kernel.vmlinux]  [k] interrupt_entry

This patch creates a new computation selection 'cycles'.

 # perf diff -c cycles

 # Event 'cycles'
 #
 # Baseline         Block cycles diff [start:end]  Shared Object     Symbol
 # ........  ....................................  ................  ....................................
 #
     49.03%        -9 [         4ef:         520]  div               [.] main
     49.03%         0 [         4e8:         4ea]  div               [.] main
     49.03%         0 [         4ef:         500]  div               [.] main
     49.03%         0 [         4ef:         51c]  div               [.] main
     49.03%         0 [         4ef:         535]  div               [.] main
     18.82%         0 [       3ac40:       3ac4d]  libc-2.23.so      [.] __random_r
     18.82%         0 [       3ac40:       3ac5c]  libc-2.23.so      [.] __random_r
     18.82%         0 [       3ac40:       3ac76]  libc-2.23.so      [.] __random_r
     18.82%         0 [       3ac40:       3ac88]  libc-2.23.so      [.] __random_r
     18.82%         0 [       3ac90:       3ac9c]  libc-2.23.so      [.] __random_r
     16.29%        -8 [       3aac0:       3aac0]  libc-2.23.so      [.] __random
     16.29%         0 [       3aac0:       3aad2]  libc-2.23.so      [.] __random
     16.29%         0 [       3aae0:       3aae7]  libc-2.23.so      [.] __random
     16.29%         0 [       3ab03:       3ab0f]  libc-2.23.so      [.] __random
     16.29%         0 [       3ab14:       3ab1b]  libc-2.23.so      [.] __random
     16.29%         0 [       3ab28:       3ab2e]  libc-2.23.so      [.] __random
     16.29%         0 [       3ab4a:       3ab53]  libc-2.23.so      [.] __random
      8.11%         0 [         640:         644]  div               [.] compute_flag
      8.11%         0 [         649:         659]  div               [.] compute_flag
      5.46%         0 [       3af60:       3af60]  libc-2.23.so      [.] rand
      5.46%         0 [       3af60:       3af64]  libc-2.23.so      [.] rand
      2.25%         0 [         490:         490]  div               [.] rand@plt
      0.01%        26 [      c00a27:      c00a27]  [kernel.vmlinux]  [k] native_irq_return_iret
      0.00%      -157 [      2bf9f2:      2bfa63]  [kernel.vmlinux]  [k] update_blocked_averages
      0.00%       -56 [      2bf980:      2bf9d3]  [kernel.vmlinux]  [k] update_blocked_averages
      0.00%        48 [      2bf934:      2bf942]  [kernel.vmlinux]  [k] update_blocked_averages
      0.00%         3 [      2bfb38:      2bfb67]  [kernel.vmlinux]  [k] update_blocked_averages
      0.00%         0 [      2bf968:      2bf97b]  [kernel.vmlinux]  [k] update_blocked_averages

"[start:end]" indicates the basic block range. The output is sorted
by "Baseline" and the basic blocks in the same function are sorted
by cycles diff.

 v2:
 ---
 Keep standard perf diff format and display the 'Baseline' and
 'Shared Object'.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-diff.c  | 63 +++++++++++++++++++++++++++++++++++++++++++---
 tools/perf/ui/stdio/hist.c | 26 +++++++++++++++++++
 tools/perf/util/hist.c     | 18 +++++++++++++
 tools/perf/util/hist.h     |  3 +++
 4 files changed, 106 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index b50ed70..35e4c36 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -46,6 +46,7 @@ enum {
 	PERF_HPP_DIFF__WEIGHTED_DIFF,
 	PERF_HPP_DIFF__FORMULA,
 	PERF_HPP_DIFF__DELTA_ABS,
+	PERF_HPP_DIFF__CYCLES,
 
 	PERF_HPP_DIFF__MAX_INDEX
 };
@@ -125,6 +126,7 @@ static int compute_2_hpp[COMPUTE_MAX] = {
 	[COMPUTE_DELTA_ABS]	= PERF_HPP_DIFF__DELTA_ABS,
 	[COMPUTE_RATIO]		= PERF_HPP_DIFF__RATIO,
 	[COMPUTE_WEIGHTED_DIFF]	= PERF_HPP_DIFF__WEIGHTED_DIFF,
+	[COMPUTE_CYCLES]	= PERF_HPP_DIFF__CYCLES,
 };
 
 #define MAX_COL_WIDTH 70
@@ -163,6 +165,10 @@ static struct header_column {
 	[PERF_HPP_DIFF__FORMULA] = {
 		.name  = "Formula",
 		.width = MAX_COL_WIDTH,
+	},
+	[PERF_HPP_DIFF__CYCLES] = {
+		.name  = "Block cycles diff [start:end]",
+		.width = 36,
 	}
 };
 
@@ -250,8 +256,6 @@ static int setup_compute(const struct option *opt, const char *str,
 	for (i = 0; i < COMPUTE_MAX; i++)
 		if (!strcmp(cstr, compute_names[i])) {
 			*cp = i;
-			if (i == COMPUTE_CYCLES)
-				break;
 			return setup_compute_opt(option);
 		}
 
@@ -949,6 +953,14 @@ hist_entry__cmp_wdiff(struct perf_hpp_fmt *fmt,
 }
 
 static int64_t
+hist_entry__cmp_cycles(struct perf_hpp_fmt *fmt __maybe_unused,
+		       struct hist_entry *left __maybe_unused,
+		       struct hist_entry *right __maybe_unused)
+{
+	return 0;
+}
+
+static int64_t
 hist_entry__cmp_delta_idx(struct perf_hpp_fmt *fmt __maybe_unused,
 			  struct hist_entry *left, struct hist_entry *right)
 {
@@ -1253,7 +1265,7 @@ static const struct option options[] = {
 	OPT_BOOLEAN('b', "baseline-only", &show_baseline_only,
 		    "Show only items with match in baseline"),
 	OPT_CALLBACK('c', "compute", &compute,
-		     "delta,delta-abs,ratio,wdiff:w1,w2 (default delta-abs)",
+		     "delta,delta-abs,ratio,wdiff:w1,w2 (default delta-abs),cycles",
 		     "Entries differential computation selection",
 		     setup_compute),
 	OPT_BOOLEAN('p', "period", &show_period,
@@ -1331,6 +1343,33 @@ static int hpp__entry_baseline(struct hist_entry *he, char *buf, size_t size)
 	return ret;
 }
 
+static int cycles_printf(struct hist_entry *he, struct hist_entry *pair,
+			 struct perf_hpp *hpp, int width)
+{
+	struct block_hists *p = pair->block_hists;
+	struct hist_entry *block_he;
+	struct block_info *bi;
+	char buf[128];
+
+	if (!p) {
+		hpp->skip = true;
+		return 0;
+	}
+
+	block_he = hists__get_entry(&p->hists, he->block_idx);
+	if (!block_he) {
+		hpp->skip = true;
+		return 0;
+	}
+
+	bi = block_he->block_info;
+	scnprintf(buf, sizeof(buf), "%ld [%12lx:%12lx]",
+		  block_he->diff.cycles, bi->sym->start + bi->start,
+		  bi->sym->start + bi->end);
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", width, buf);
+}
+
 static int __hpp__color_compare(struct perf_hpp_fmt *fmt,
 				struct perf_hpp *hpp, struct hist_entry *he,
 				int comparison_method)
@@ -1342,8 +1381,12 @@ static int __hpp__color_compare(struct perf_hpp_fmt *fmt,
 	s64 wdiff;
 	char pfmt[20] = " ";
 
-	if (!pair)
+	if (!pair) {
+		if (comparison_method == COMPUTE_CYCLES && he->block_idx)
+			hpp->skip = true;
+
 		goto no_print;
+	}
 
 	switch (comparison_method) {
 	case COMPUTE_DELTA:
@@ -1378,6 +1421,8 @@ static int __hpp__color_compare(struct perf_hpp_fmt *fmt,
 		return color_snprintf(hpp->buf, hpp->size,
 				get_percent_color(wdiff),
 				pfmt, wdiff);
+	case COMPUTE_CYCLES:
+		return cycles_printf(he, pair, hpp, dfmt->header_width);
 	default:
 		BUG_ON(1);
 	}
@@ -1407,6 +1452,12 @@ static int hpp__color_wdiff(struct perf_hpp_fmt *fmt,
 	return __hpp__color_compare(fmt, hpp, he, COMPUTE_WEIGHTED_DIFF);
 }
 
+static int hpp__color_cycles(struct perf_hpp_fmt *fmt,
+			     struct perf_hpp *hpp, struct hist_entry *he)
+{
+	return __hpp__color_compare(fmt, hpp, he, COMPUTE_CYCLES);
+}
+
 static void
 hpp__entry_unpair(struct hist_entry *he, int idx, char *buf, size_t size)
 {
@@ -1608,6 +1659,10 @@ static void data__hpp_register(struct data__file *d, int idx)
 		fmt->color = hpp__color_delta;
 		fmt->sort  = hist_entry__cmp_delta_abs;
 		break;
+	case PERF_HPP_DIFF__CYCLES:
+		fmt->color = hpp__color_cycles;
+		fmt->sort  = hist_entry__cmp_cycles;
+		break;
 	default:
 		fmt->sort  = hist_entry__cmp_nop;
 		break;
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index a60f299..0bc03b8 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -531,6 +531,29 @@ static int hist_entry__hierarchy_fprintf(struct hist_entry *he,
 	return printed;
 }
 
+static int hist_entry__block_fprintf(struct hist_entry *he,
+				     char *bf, size_t size,
+				     FILE *fp)
+{
+	int ret = 0;
+
+	for (int i = 0; i < he->block_num; i++) {
+		struct perf_hpp hpp = {
+			.buf		= bf,
+			.size		= size,
+			.skip		= false,
+		};
+
+		he->block_idx = i;
+		hist_entry__snprintf(he, &hpp);
+
+		if (!hpp.skip)
+			ret += fprintf(fp, "%s\n", bf);
+	}
+
+	return ret;
+}
+
 static int hist_entry__fprintf(struct hist_entry *he, size_t size,
 			       char *bf, size_t bfsz, FILE *fp,
 			       bool ignore_callchains)
@@ -550,6 +573,9 @@ static int hist_entry__fprintf(struct hist_entry *he, size_t size,
 	if (symbol_conf.report_hierarchy)
 		return hist_entry__hierarchy_fprintf(he, &hpp, hists, fp);
 
+	if (he->block_hists)
+		return hist_entry__block_fprintf(he, bf, size, fp);
+
 	hist_entry__snprintf(he, &hpp);
 
 	ret = fprintf(fp, "%s\n", bf);
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 10ab45a..169899f6 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -376,6 +376,24 @@ void hists__delete_entries(struct hists *hists)
 	}
 }
 
+struct hist_entry *hists__get_entry(struct hists *hists, int idx)
+{
+	struct rb_node *next = rb_first_cached(&hists->entries);
+	struct hist_entry *n;
+	int i = 0;
+
+	while (next) {
+		n = rb_entry(next, struct hist_entry, rb_node);
+		if (i == idx)
+			return n;
+
+		next = rb_next(&n->rb_node);
+		i++;
+	}
+
+	return NULL;
+}
+
 /*
  * histogram, sorted on item, collects periods
  */
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index c8f7d66..9124b54 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -184,6 +184,8 @@ void hists__decay_entries(struct hists *hists, bool zap_user, bool zap_kernel);
 void hists__delete_entries(struct hists *hists);
 void hists__output_recalc_col_len(struct hists *hists, int max_rows);
 
+struct hist_entry *hists__get_entry(struct hists *hists, int idx);
+
 u64 hists__total_period(struct hists *hists);
 void hists__reset_stats(struct hists *hists);
 void hists__inc_stats(struct hists *hists, struct hist_entry *h);
@@ -249,6 +251,7 @@ struct perf_hpp {
 	size_t size;
 	const char *sep;
 	void *ptr;
+	bool skip;
 };
 
 struct perf_hpp_fmt {
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 7/7] perf diff: Documentation -c cycles option
  2019-06-03 14:36 [PATCH v2 0/7] perf diff: diff cycles at basic block level Jin Yao
                   ` (6 preceding siblings ...)
  2019-06-03 14:36 ` [PATCH v2 6/7] perf diff: Print the basic block cycles diff Jin Yao
@ 2019-06-03 14:36 ` Jin Yao
  2019-06-05 11:44 ` [PATCH v2 0/7] perf diff: diff cycles at basic block level Jiri Olsa
  8 siblings, 0 replies; 27+ messages in thread
From: Jin Yao @ 2019-06-03 14:36 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

Documentation the new computation selection 'cycles'.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/Documentation/perf-diff.txt | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-diff.txt b/tools/perf/Documentation/perf-diff.txt
index da7809b..198119b 100644
--- a/tools/perf/Documentation/perf-diff.txt
+++ b/tools/perf/Documentation/perf-diff.txt
@@ -90,9 +90,10 @@ OPTIONS
 
 -c::
 --compute::
-        Differential computation selection - delta, ratio, wdiff, delta-abs
-        (default is delta-abs).  Default can be changed using diff.compute
-        config option.  See COMPARISON METHODS section for more info.
+        Differential computation selection - delta, ratio, wdiff, cycles,
+        delta-abs (default is delta-abs).  Default can be changed using
+        diff.compute config option.  See COMPARISON METHODS section for
+        more info.
 
 -p::
 --period::
@@ -278,6 +279,13 @@ If specified the 'Weighted diff' column is displayed with value 'd' computed as:
     - WEIGHT-A being the weight of the data file
     - WEIGHT-B being the weight of the baseline data file
 
+cycles
+~~~~~~
+If specified the 'Block cycles diff [start:end]' column is displayed.
+It displays the cycles difference of same program basic block amongst
+two perf.data. The program basic block is the code block between
+two branches in a function (indicated by [start:end]).
+
 SEE ALSO
 --------
 linkperf:perf-record[1], linkperf:perf-report[1]
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 6/7] perf diff: Print the basic block cycles diff
  2019-06-03 14:36 ` [PATCH v2 6/7] perf diff: Print the basic block cycles diff Jin Yao
@ 2019-06-05 11:44   ` Jiri Olsa
  2019-06-06  4:06     ` Jin, Yao
  2019-06-05 11:44   ` Jiri Olsa
  1 sibling, 1 reply; 27+ messages in thread
From: Jiri Olsa @ 2019-06-05 11:44 UTC (permalink / raw)
  To: Jin Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Mon, Jun 03, 2019 at 10:36:16PM +0800, Jin Yao wrote:
> perf record -b ./div
> perf record -b ./div
> 
> Following is the default perf diff output
> 
>  # perf diff
> 
>  # Event 'cycles'
>  #
>  # Baseline  Delta Abs  Shared Object     Symbol
>  # ........  .........  ................  ....................................
>  #
>      49.03%     +0.30%  div               [.] main
>      16.29%     -0.20%  libc-2.23.so      [.] __random
>      18.82%     -0.07%  libc-2.23.so      [.] __random_r
>       8.11%     -0.04%  div               [.] compute_flag
>       2.25%     +0.01%  div               [.] rand@plt
>       0.00%     +0.01%  [kernel.vmlinux]  [k] task_tick_fair
>       5.46%     +0.01%  libc-2.23.so      [.] rand
>       0.01%     -0.01%  [kernel.vmlinux]  [k] native_irq_return_iret
>       0.00%     -0.00%  [kernel.vmlinux]  [k] interrupt_entry
> 
> This patch creates a new computation selection 'cycles'.
> 
>  # perf diff -c cycles
> 
>  # Event 'cycles'
>  #
>  # Baseline         Block cycles diff [start:end]  Shared Object     Symbol
>  # ........  ....................................  ................  ....................................
>  #
>      49.03%        -9 [         4ef:         520]  div               [.] main
>      49.03%         0 [         4e8:         4ea]  div               [.] main
>      49.03%         0 [         4ef:         500]  div               [.] main
>      49.03%         0 [         4ef:         51c]  div               [.] main
>      49.03%         0 [         4ef:         535]  div               [.] main
>      18.82%         0 [       3ac40:       3ac4d]  libc-2.23.so      [.] __random_r
>      18.82%         0 [       3ac40:       3ac5c]  libc-2.23.so      [.] __random_r
>      18.82%         0 [       3ac40:       3ac76]  libc-2.23.so      [.] __random_r
>      18.82%         0 [       3ac40:       3ac88]  libc-2.23.so      [.] __random_r
>      18.82%         0 [       3ac90:       3ac9c]  libc-2.23.so      [.] __random_r
>      16.29%        -8 [       3aac0:       3aac0]  libc-2.23.so      [.] __random
>      16.29%         0 [       3aac0:       3aad2]  libc-2.23.so      [.] __random
>      16.29%         0 [       3aae0:       3aae7]  libc-2.23.so      [.] __random
>      16.29%         0 [       3ab03:       3ab0f]  libc-2.23.so      [.] __random
>      16.29%         0 [       3ab14:       3ab1b]  libc-2.23.so      [.] __random
>      16.29%         0 [       3ab28:       3ab2e]  libc-2.23.so      [.] __random
>      16.29%         0 [       3ab4a:       3ab53]  libc-2.23.so      [.] __random
>       8.11%         0 [         640:         644]  div               [.] compute_flag
>       8.11%         0 [         649:         659]  div               [.] compute_flag
>       5.46%         0 [       3af60:       3af60]  libc-2.23.so      [.] rand
>       5.46%         0 [       3af60:       3af64]  libc-2.23.so      [.] rand
>       2.25%         0 [         490:         490]  div               [.] rand@plt
>       0.01%        26 [      c00a27:      c00a27]  [kernel.vmlinux]  [k] native_irq_return_iret
>       0.00%      -157 [      2bf9f2:      2bfa63]  [kernel.vmlinux]  [k] update_blocked_averages
>       0.00%       -56 [      2bf980:      2bf9d3]  [kernel.vmlinux]  [k] update_blocked_averages
>       0.00%        48 [      2bf934:      2bf942]  [kernel.vmlinux]  [k] update_blocked_averages
>       0.00%         3 [      2bfb38:      2bfb67]  [kernel.vmlinux]  [k] update_blocked_averages
>       0.00%         0 [      2bf968:      2bf97b]  [kernel.vmlinux]  [k] update_blocked_averages
> 

so what I'd expect would be Baseline column with cycles and another
column showing the differrence (in cycles) for given symbol

> "[start:end]" indicates the basic block range. The output is sorted
> by "Baseline" and the basic blocks in the same function are sorted
> by cycles diff.

hum, why is there multiple basic blocks [start:end] for a symbol?

thanks,
jirka

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 4/7] perf diff: Use hists to manage basic blocks per symbol
  2019-06-03 14:36 ` [PATCH v2 4/7] perf diff: Use hists to manage basic blocks per symbol Jin Yao
@ 2019-06-05 11:44   ` Jiri Olsa
  2019-06-06  1:15     ` Jin, Yao
  2019-06-08 11:41     ` Jin, Yao
  2019-06-05 11:44   ` Jiri Olsa
  2019-06-05 11:44   ` Jiri Olsa
  2 siblings, 2 replies; 27+ messages in thread
From: Jiri Olsa @ 2019-06-05 11:44 UTC (permalink / raw)
  To: Jin Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Mon, Jun 03, 2019 at 10:36:14PM +0800, Jin Yao wrote:

SNIP

> diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
> index 43623fa..d1641da 100644
> --- a/tools/perf/util/sort.h
> +++ b/tools/perf/util/sort.h
> @@ -79,6 +79,9 @@ struct hist_entry_diff {
>  
>  		/* HISTC_WEIGHTED_DIFF */
>  		s64	wdiff;
> +
> +		/* PERF_HPP_DIFF__CYCLES */
> +		s64	cycles;
>  	};
>  };
>  
> @@ -143,6 +146,9 @@ struct hist_entry {
>  	struct branch_info	*branch_info;
>  	long			time;
>  	struct hists		*hists;
> +	void			*block_hists;
> +	int			block_idx;
> +	int			block_num;
>  	struct mem_info		*mem_info;
>  	struct block_info	*block_info;

could you please not add the new block* stuff in here,
and instead use the "c2c model" and use yourr own struct
on top of hist_entry? we are trying to librarize this
stuff and keep only necessary things in here..

you're already using hist_entry_ops, so should be easy

something like:

	struct block_hist_entry {
		void			*block_hists;
		int			block_idx;
		int			block_num;
		struct block_info	*block_info;

		struct hist_entry	he;
	};



jirka

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 0/7] perf diff: diff cycles at basic block level
  2019-06-03 14:36 [PATCH v2 0/7] perf diff: diff cycles at basic block level Jin Yao
                   ` (7 preceding siblings ...)
  2019-06-03 14:36 ` [PATCH v2 7/7] perf diff: Documentation -c cycles option Jin Yao
@ 2019-06-05 11:44 ` Jiri Olsa
  2019-06-06  1:05   ` Jin, Yao
  8 siblings, 1 reply; 27+ messages in thread
From: Jiri Olsa @ 2019-06-05 11:44 UTC (permalink / raw)
  To: Jin Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Mon, Jun 03, 2019 at 10:36:10PM +0800, Jin Yao wrote:
> In some cases small changes in hot loops can show big differences.
> But it's difficult to identify these differences.
> 
> perf diff currently can only diff symbols (functions). We can also expand
> it to diff cycles of individual programs blocks as reported by timed LBR.
> This would allow to identify changes in specific code accurately.

can't compile on Fedora 30

builtin-diff.c: In function ‘block_cycles_diff_cmp’:
builtin-diff.c:544:6: error: absolute value function ‘abs’ given an argument of type ‘s64’ {aka ‘long int’} but has parameter of type ‘int’ which may cause truncation of value [-Werror=absolute-value]
  544 |  l = abs(left->diff.cycles);
      |      ^~~
builtin-diff.c:545:6: error: absolute value function ‘abs’ given an argument of type ‘s64’ {aka ‘long int’} but has parameter of type ‘int’ which may cause truncation of value [-Werror=absolute-value]
  545 |  r = abs(right->diff.cycles);
      |      ^~~

[jolsa@krava perf]$ gcc --version
gcc (GCC) 9.1.1 20190503 (Red Hat 9.1.1-1)

jirka

> 
> With this patch set, for example,
> 
>  # perf record -b ./div
>  # perf record -b ./div
>  # perf diff -s cycles
> 
>  # Event 'cycles'
>  #
>  # Baseline         Block cycles diff [start:end]  Shared Object     Symbol
>  # ........  ....................................  ................  ....................................
>  #
>      49.03%        -9 [         4ef:         520]  div               [.] main
>      49.03%         0 [         4e8:         4ea]  div               [.] main
>      49.03%         0 [         4ef:         500]  div               [.] main
>      49.03%         0 [         4ef:         51c]  div               [.] main
>      49.03%         0 [         4ef:         535]  div               [.] main
>      18.82%         0 [       3ac40:       3ac4d]  libc-2.23.so      [.] __random_r
>      18.82%         0 [       3ac40:       3ac5c]  libc-2.23.so      [.] __random_r
>      18.82%         0 [       3ac40:       3ac76]  libc-2.23.so      [.] __random_r
>      18.82%         0 [       3ac40:       3ac88]  libc-2.23.so      [.] __random_r
>      18.82%         0 [       3ac90:       3ac9c]  libc-2.23.so      [.] __random_r
>      16.29%        -8 [       3aac0:       3aac0]  libc-2.23.so      [.] __random
>      16.29%         0 [       3aac0:       3aad2]  libc-2.23.so      [.] __random
>      16.29%         0 [       3aae0:       3aae7]  libc-2.23.so      [.] __random
>      16.29%         0 [       3ab03:       3ab0f]  libc-2.23.so      [.] __random
>      16.29%         0 [       3ab14:       3ab1b]  libc-2.23.so      [.] __random
>      16.29%         0 [       3ab28:       3ab2e]  libc-2.23.so      [.] __random
>      16.29%         0 [       3ab4a:       3ab53]  libc-2.23.so      [.] __random
>       8.11%         0 [         640:         644]  div               [.] compute_flag
>       8.11%         0 [         649:         659]  div               [.] compute_flag
>       5.46%         0 [       3af60:       3af60]  libc-2.23.so      [.] rand
>       5.46%         0 [       3af60:       3af64]  libc-2.23.so      [.] rand
>       2.25%         0 [         490:         490]  div               [.] rand@plt
>       0.01%        26 [      c00a27:      c00a27]  [kernel.vmlinux]  [k] native_irq_return_iret
>       0.00%      -157 [      2bf9f2:      2bfa63]  [kernel.vmlinux]  [k] update_blocked_averages
>       0.00%       -56 [      2bf980:      2bf9d3]  [kernel.vmlinux]  [k] update_blocked_averages
>       0.00%        48 [      2bf934:      2bf942]  [kernel.vmlinux]  [k] update_blocked_averages
>       0.00%         3 [      2bfb38:      2bfb67]  [kernel.vmlinux]  [k] update_blocked_averages
> 
> The 'cycles' is a new perf-diff computation selection, which enables
> the displaying of cycles difference of same program basic block amongst
> two perf.data. The program basic block is the code block between two
> branches in a function.
> 
>  v2:
>  ---
>  Keep standard perf diff format.
> 
>  Following is the v1 output.
> 
>  # perf diff --basic-block
> 
>  # Cycles diff  Basic block (start:end)
>  # ...........  .......................
>  #
>           -208  hrtimer_interrupt (30b9e0:30ba42)
>           -157  update_blocked_averages (2bf9f2:2bfa63)
>           -126  interrupt_entry (c00880:c0093a)
>            -86  hrtimer_interrupt (30bb29:30bb32)
>            -74  hrtimer_interrupt (30ba65:30bac4)
>            -56  update_blocked_averages (2bf980:2bf9d3)
>             48  update_blocked_averages (2bf934:2bf942)
>            -35  native_write_msr (267900:26790b)
>             26  native_irq_return_iret (c00a27:c00a27)
>             22  rcu_check_callbacks (2febb6:2febdc)
>            -21  __hrtimer_run_queues (30b220:30b2a3)
>             19  pvclock_gtod_notify (14ba0:14c1b)
>            -18  task_tick_fair (2c5d29:2c5d41)
> 
> Jin Yao (7):
>   perf util: Create block_info structure
>   perf util: Add block_info in hist_entry
>   perf diff: Check if all data files with branch stacks
>   perf diff: Use hists to manage basic blocks per symbol
>   perf diff: Link same basic blocks among different data files
>   perf diff: Print the basic block cycles diff
>   perf diff: Documentation -c cycles option
> 
>  tools/perf/Documentation/perf-diff.txt |  14 +-
>  tools/perf/builtin-diff.c              | 373 ++++++++++++++++++++++++++++++++-
>  tools/perf/ui/stdio/hist.c             |  26 +++
>  tools/perf/util/hist.c                 |  42 +++-
>  tools/perf/util/hist.h                 |   9 +
>  tools/perf/util/sort.h                 |   8 +
>  tools/perf/util/symbol.c               |  22 ++
>  tools/perf/util/symbol.h               |  23 ++
>  8 files changed, 509 insertions(+), 8 deletions(-)
> 
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 4/7] perf diff: Use hists to manage basic blocks per symbol
  2019-06-03 14:36 ` [PATCH v2 4/7] perf diff: Use hists to manage basic blocks per symbol Jin Yao
  2019-06-05 11:44   ` Jiri Olsa
@ 2019-06-05 11:44   ` Jiri Olsa
  2019-06-06  1:26     ` Jin, Yao
  2019-06-05 11:44   ` Jiri Olsa
  2 siblings, 1 reply; 27+ messages in thread
From: Jiri Olsa @ 2019-06-05 11:44 UTC (permalink / raw)
  To: Jin Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Mon, Jun 03, 2019 at 10:36:14PM +0800, Jin Yao wrote:

SNIP

>  					   sort_compute);
>  }
>  
> +static int64_t
> +hist_entry__cmp_cycles_idx(struct perf_hpp_fmt *fmt __maybe_unused,
> +			   struct hist_entry *left __maybe_unused,
> +			   struct hist_entry *right __maybe_unused)
> +{
> +	return 0;
> +}
> +
>  static void hists__process(struct hists *hists)
>  {
>  	if (show_baseline_only)
> @@ -746,6 +930,8 @@ static void data_process(void)
>  	struct perf_evsel *evsel_base;
>  	bool first = true;
>  
> +	memset(&dummy_al, 0, sizeof(dummy_al));

why is this needed? it's zero by static declaration, and it's never set, right?

jirka

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 6/7] perf diff: Print the basic block cycles diff
  2019-06-03 14:36 ` [PATCH v2 6/7] perf diff: Print the basic block cycles diff Jin Yao
  2019-06-05 11:44   ` Jiri Olsa
@ 2019-06-05 11:44   ` Jiri Olsa
  2019-06-06  2:02     ` Jin, Yao
  1 sibling, 1 reply; 27+ messages in thread
From: Jiri Olsa @ 2019-06-05 11:44 UTC (permalink / raw)
  To: Jin Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Mon, Jun 03, 2019 at 10:36:16PM +0800, Jin Yao wrote:

SNIP

> -				break;
>  			return setup_compute_opt(option);
>  		}
>  
> @@ -949,6 +953,14 @@ hist_entry__cmp_wdiff(struct perf_hpp_fmt *fmt,
>  }
>  
>  static int64_t
> +hist_entry__cmp_cycles(struct perf_hpp_fmt *fmt __maybe_unused,
> +		       struct hist_entry *left __maybe_unused,
> +		       struct hist_entry *right __maybe_unused)
> +{
> +	return 0;
> +}

we have hist_entry__cmp_nop for that

SNIP

>  	default:
>  		BUG_ON(1);
>  	}
> @@ -1407,6 +1452,12 @@ static int hpp__color_wdiff(struct perf_hpp_fmt *fmt,
>  	return __hpp__color_compare(fmt, hpp, he, COMPUTE_WEIGHTED_DIFF);
>  }
>  
> +static int hpp__color_cycles(struct perf_hpp_fmt *fmt,
> +			     struct perf_hpp *hpp, struct hist_entry *he)
> +{
> +	return __hpp__color_compare(fmt, hpp, he, COMPUTE_CYCLES);
> +}
> +
>  static void
>  hpp__entry_unpair(struct hist_entry *he, int idx, char *buf, size_t size)
>  {
> @@ -1608,6 +1659,10 @@ static void data__hpp_register(struct data__file *d, int idx)
>  		fmt->color = hpp__color_delta;
>  		fmt->sort  = hist_entry__cmp_delta_abs;
>  		break;
> +	case PERF_HPP_DIFF__CYCLES:
> +		fmt->color = hpp__color_cycles;
> +		fmt->sort  = hist_entry__cmp_cycles;

also please explain in comment why it's nop

jirka

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 4/7] perf diff: Use hists to manage basic blocks per symbol
  2019-06-03 14:36 ` [PATCH v2 4/7] perf diff: Use hists to manage basic blocks per symbol Jin Yao
  2019-06-05 11:44   ` Jiri Olsa
  2019-06-05 11:44   ` Jiri Olsa
@ 2019-06-05 11:44   ` Jiri Olsa
  2019-06-06  1:57     ` Jin, Yao
  2 siblings, 1 reply; 27+ messages in thread
From: Jiri Olsa @ 2019-06-05 11:44 UTC (permalink / raw)
  To: Jin Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Mon, Jun 03, 2019 at 10:36:14PM +0800, Jin Yao wrote:

SNIP

>  		data__for_each_file_new(i, d) {
>  			pair = get_pair_data(he, d);
>  			if (!pair)
> @@ -510,6 +683,9 @@ static void hists__precompute(struct hists *hists)
>  			case COMPUTE_WEIGHTED_DIFF:
>  				compute_wdiff(he, pair);
>  				break;
> +			case COMPUTE_CYCLES:
> +				process_block_per_sym(pair, d);
> +				break;
>  			default:
>  				BUG_ON(1);
>  			}
> @@ -713,6 +889,14 @@ hist_entry__cmp_wdiff_idx(struct perf_hpp_fmt *fmt __maybe_unused,
>  					   sort_compute);
>  }
>  
> +static int64_t
> +hist_entry__cmp_cycles_idx(struct perf_hpp_fmt *fmt __maybe_unused,
> +			   struct hist_entry *left __maybe_unused,
> +			   struct hist_entry *right __maybe_unused)
> +{
> +	return 0;
> +}

this is hist_entry__cmp_nop.. please use it instead and
explain in comment why for COMPUTE_CYCLES we need the
default sort

jirka

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 0/7] perf diff: diff cycles at basic block level
  2019-06-05 11:44 ` [PATCH v2 0/7] perf diff: diff cycles at basic block level Jiri Olsa
@ 2019-06-06  1:05   ` Jin, Yao
  0 siblings, 0 replies; 27+ messages in thread
From: Jin, Yao @ 2019-06-06  1:05 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 6/5/2019 7:44 PM, Jiri Olsa wrote:
> On Mon, Jun 03, 2019 at 10:36:10PM +0800, Jin Yao wrote:
>> In some cases small changes in hot loops can show big differences.
>> But it's difficult to identify these differences.
>>
>> perf diff currently can only diff symbols (functions). We can also expand
>> it to diff cycles of individual programs blocks as reported by timed LBR.
>> This would allow to identify changes in specific code accurately.
> 
> can't compile on Fedora 30
> 
> builtin-diff.c: In function ‘block_cycles_diff_cmp’:
> builtin-diff.c:544:6: error: absolute value function ‘abs’ given an argument of type ‘s64’ {aka ‘long int’} but has parameter of type ‘int’ which may cause truncation of value [-Werror=absolute-value]
>    544 |  l = abs(left->diff.cycles);
>        |      ^~~
> builtin-diff.c:545:6: error: absolute value function ‘abs’ given an argument of type ‘s64’ {aka ‘long int’} but has parameter of type ‘int’ which may cause truncation of value [-Werror=absolute-value]
>    545 |  r = abs(right->diff.cycles);
>        |      ^~~
> 
> [jolsa@krava perf]$ gcc --version
> gcc (GCC) 9.1.1 20190503 (Red Hat 9.1.1-1)
> 
> jirka
> 

Thanks for pointing out this issue.

I should use labs()

Thanks
Jin Yao

>>
>> With this patch set, for example,
>>
>>   # perf record -b ./div
>>   # perf record -b ./div
>>   # perf diff -s cycles
>>
>>   # Event 'cycles'
>>   #
>>   # Baseline         Block cycles diff [start:end]  Shared Object     Symbol
>>   # ........  ....................................  ................  ....................................
>>   #
>>       49.03%        -9 [         4ef:         520]  div               [.] main
>>       49.03%         0 [         4e8:         4ea]  div               [.] main
>>       49.03%         0 [         4ef:         500]  div               [.] main
>>       49.03%         0 [         4ef:         51c]  div               [.] main
>>       49.03%         0 [         4ef:         535]  div               [.] main
>>       18.82%         0 [       3ac40:       3ac4d]  libc-2.23.so      [.] __random_r
>>       18.82%         0 [       3ac40:       3ac5c]  libc-2.23.so      [.] __random_r
>>       18.82%         0 [       3ac40:       3ac76]  libc-2.23.so      [.] __random_r
>>       18.82%         0 [       3ac40:       3ac88]  libc-2.23.so      [.] __random_r
>>       18.82%         0 [       3ac90:       3ac9c]  libc-2.23.so      [.] __random_r
>>       16.29%        -8 [       3aac0:       3aac0]  libc-2.23.so      [.] __random
>>       16.29%         0 [       3aac0:       3aad2]  libc-2.23.so      [.] __random
>>       16.29%         0 [       3aae0:       3aae7]  libc-2.23.so      [.] __random
>>       16.29%         0 [       3ab03:       3ab0f]  libc-2.23.so      [.] __random
>>       16.29%         0 [       3ab14:       3ab1b]  libc-2.23.so      [.] __random
>>       16.29%         0 [       3ab28:       3ab2e]  libc-2.23.so      [.] __random
>>       16.29%         0 [       3ab4a:       3ab53]  libc-2.23.so      [.] __random
>>        8.11%         0 [         640:         644]  div               [.] compute_flag
>>        8.11%         0 [         649:         659]  div               [.] compute_flag
>>        5.46%         0 [       3af60:       3af60]  libc-2.23.so      [.] rand
>>        5.46%         0 [       3af60:       3af64]  libc-2.23.so      [.] rand
>>        2.25%         0 [         490:         490]  div               [.] rand@plt
>>        0.01%        26 [      c00a27:      c00a27]  [kernel.vmlinux]  [k] native_irq_return_iret
>>        0.00%      -157 [      2bf9f2:      2bfa63]  [kernel.vmlinux]  [k] update_blocked_averages
>>        0.00%       -56 [      2bf980:      2bf9d3]  [kernel.vmlinux]  [k] update_blocked_averages
>>        0.00%        48 [      2bf934:      2bf942]  [kernel.vmlinux]  [k] update_blocked_averages
>>        0.00%         3 [      2bfb38:      2bfb67]  [kernel.vmlinux]  [k] update_blocked_averages
>>
>> The 'cycles' is a new perf-diff computation selection, which enables
>> the displaying of cycles difference of same program basic block amongst
>> two perf.data. The program basic block is the code block between two
>> branches in a function.
>>
>>   v2:
>>   ---
>>   Keep standard perf diff format.
>>
>>   Following is the v1 output.
>>
>>   # perf diff --basic-block
>>
>>   # Cycles diff  Basic block (start:end)
>>   # ...........  .......................
>>   #
>>            -208  hrtimer_interrupt (30b9e0:30ba42)
>>            -157  update_blocked_averages (2bf9f2:2bfa63)
>>            -126  interrupt_entry (c00880:c0093a)
>>             -86  hrtimer_interrupt (30bb29:30bb32)
>>             -74  hrtimer_interrupt (30ba65:30bac4)
>>             -56  update_blocked_averages (2bf980:2bf9d3)
>>              48  update_blocked_averages (2bf934:2bf942)
>>             -35  native_write_msr (267900:26790b)
>>              26  native_irq_return_iret (c00a27:c00a27)
>>              22  rcu_check_callbacks (2febb6:2febdc)
>>             -21  __hrtimer_run_queues (30b220:30b2a3)
>>              19  pvclock_gtod_notify (14ba0:14c1b)
>>             -18  task_tick_fair (2c5d29:2c5d41)
>>
>> Jin Yao (7):
>>    perf util: Create block_info structure
>>    perf util: Add block_info in hist_entry
>>    perf diff: Check if all data files with branch stacks
>>    perf diff: Use hists to manage basic blocks per symbol
>>    perf diff: Link same basic blocks among different data files
>>    perf diff: Print the basic block cycles diff
>>    perf diff: Documentation -c cycles option
>>
>>   tools/perf/Documentation/perf-diff.txt |  14 +-
>>   tools/perf/builtin-diff.c              | 373 ++++++++++++++++++++++++++++++++-
>>   tools/perf/ui/stdio/hist.c             |  26 +++
>>   tools/perf/util/hist.c                 |  42 +++-
>>   tools/perf/util/hist.h                 |   9 +
>>   tools/perf/util/sort.h                 |   8 +
>>   tools/perf/util/symbol.c               |  22 ++
>>   tools/perf/util/symbol.h               |  23 ++
>>   8 files changed, 509 insertions(+), 8 deletions(-)
>>
>> -- 
>> 2.7.4
>>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 4/7] perf diff: Use hists to manage basic blocks per symbol
  2019-06-05 11:44   ` Jiri Olsa
@ 2019-06-06  1:15     ` Jin, Yao
  2019-06-08 11:41     ` Jin, Yao
  1 sibling, 0 replies; 27+ messages in thread
From: Jin, Yao @ 2019-06-06  1:15 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 6/5/2019 7:44 PM, Jiri Olsa wrote:
> On Mon, Jun 03, 2019 at 10:36:14PM +0800, Jin Yao wrote:
> 
> SNIP
> 
>> diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
>> index 43623fa..d1641da 100644
>> --- a/tools/perf/util/sort.h
>> +++ b/tools/perf/util/sort.h
>> @@ -79,6 +79,9 @@ struct hist_entry_diff {
>>   
>>   		/* HISTC_WEIGHTED_DIFF */
>>   		s64	wdiff;
>> +
>> +		/* PERF_HPP_DIFF__CYCLES */
>> +		s64	cycles;
>>   	};
>>   };
>>   
>> @@ -143,6 +146,9 @@ struct hist_entry {
>>   	struct branch_info	*branch_info;
>>   	long			time;
>>   	struct hists		*hists;
>> +	void			*block_hists;
>> +	int			block_idx;
>> +	int			block_num;
>>   	struct mem_info		*mem_info;
>>   	struct block_info	*block_info;
> 
> could you please not add the new block* stuff in here,
> and instead use the "c2c model" and use yourr own struct
> on top of hist_entry? we are trying to librarize this
> stuff and keep only necessary things in here..
> 
> you're already using hist_entry_ops, so should be easy
> 
> something like:
> 
> 	struct block_hist_entry {
> 		void			*block_hists;
> 		int			block_idx;
> 		int			block_num;
> 		struct block_info	*block_info;
> 
> 		struct hist_entry	he;
> 	};
> 
> 
> 
> jirka
> 

Oh, yes, I should refer to 'c2c_hist_entry'. That avoids to add more 
stuffs in hist_entry. That's a good idea!

Thanks
Jin Yao

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 4/7] perf diff: Use hists to manage basic blocks per symbol
  2019-06-05 11:44   ` Jiri Olsa
@ 2019-06-06  1:26     ` Jin, Yao
  0 siblings, 0 replies; 27+ messages in thread
From: Jin, Yao @ 2019-06-06  1:26 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 6/5/2019 7:44 PM, Jiri Olsa wrote:
> On Mon, Jun 03, 2019 at 10:36:14PM +0800, Jin Yao wrote:
> 
> SNIP
> 
>>   					   sort_compute);
>>   }
>>   
>> +static int64_t
>> +hist_entry__cmp_cycles_idx(struct perf_hpp_fmt *fmt __maybe_unused,
>> +			   struct hist_entry *left __maybe_unused,
>> +			   struct hist_entry *right __maybe_unused)
>> +{
>> +	return 0;
>> +}
>> +
>>   static void hists__process(struct hists *hists)
>>   {
>>   	if (show_baseline_only)
>> @@ -746,6 +930,8 @@ static void data_process(void)
>>   	struct perf_evsel *evsel_base;
>>   	bool first = true;
>>   
>> +	memset(&dummy_al, 0, sizeof(dummy_al));
> 
> why is this needed? it's zero by static declaration, and it's never set, right?
> 
> jirka
> 

C standard says yes for initializing static variables to 0, but I just 
want to avoid any potential for unexpected behavior. Maybe I was over 
thinking it. I will remove this line.

Thanks
Jin Yao

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 4/7] perf diff: Use hists to manage basic blocks per symbol
  2019-06-05 11:44   ` Jiri Olsa
@ 2019-06-06  1:57     ` Jin, Yao
  0 siblings, 0 replies; 27+ messages in thread
From: Jin, Yao @ 2019-06-06  1:57 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 6/5/2019 7:44 PM, Jiri Olsa wrote:
> On Mon, Jun 03, 2019 at 10:36:14PM +0800, Jin Yao wrote:
> 
> SNIP
> 
>>   		data__for_each_file_new(i, d) {
>>   			pair = get_pair_data(he, d);
>>   			if (!pair)
>> @@ -510,6 +683,9 @@ static void hists__precompute(struct hists *hists)
>>   			case COMPUTE_WEIGHTED_DIFF:
>>   				compute_wdiff(he, pair);
>>   				break;
>> +			case COMPUTE_CYCLES:
>> +				process_block_per_sym(pair, d);
>> +				break;
>>   			default:
>>   				BUG_ON(1);
>>   			}
>> @@ -713,6 +889,14 @@ hist_entry__cmp_wdiff_idx(struct perf_hpp_fmt *fmt __maybe_unused,
>>   					   sort_compute);
>>   }
>>   
>> +static int64_t
>> +hist_entry__cmp_cycles_idx(struct perf_hpp_fmt *fmt __maybe_unused,
>> +			   struct hist_entry *left __maybe_unused,
>> +			   struct hist_entry *right __maybe_unused)
>> +{
>> +	return 0;
>> +}
> 
> this is hist_entry__cmp_nop.. please use it instead and
> explain in comment why for COMPUTE_CYCLES we need the
> default sort
> 
> jirka
> 

fmt->sort should be set otherwise since fmt->sort will be called without 
checking valid, the crash happens. Yes, I should use hist_entry__cmp_nop 
instead, and will add some comments for COMPUTE_CYCLES.

Thanks
Jin Yao



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 6/7] perf diff: Print the basic block cycles diff
  2019-06-05 11:44   ` Jiri Olsa
@ 2019-06-06  2:02     ` Jin, Yao
  0 siblings, 0 replies; 27+ messages in thread
From: Jin, Yao @ 2019-06-06  2:02 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 6/5/2019 7:44 PM, Jiri Olsa wrote:
> On Mon, Jun 03, 2019 at 10:36:16PM +0800, Jin Yao wrote:
> 
> SNIP
> 
>> -				break;
>>   			return setup_compute_opt(option);
>>   		}
>>   
>> @@ -949,6 +953,14 @@ hist_entry__cmp_wdiff(struct perf_hpp_fmt *fmt,
>>   }
>>   
>>   static int64_t
>> +hist_entry__cmp_cycles(struct perf_hpp_fmt *fmt __maybe_unused,
>> +		       struct hist_entry *left __maybe_unused,
>> +		       struct hist_entry *right __maybe_unused)
>> +{
>> +	return 0;
>> +}
> 
> we have hist_entry__cmp_nop for that
> 
> SNIP
> 
>>   	default:
>>   		BUG_ON(1);
>>   	}
>> @@ -1407,6 +1452,12 @@ static int hpp__color_wdiff(struct perf_hpp_fmt *fmt,
>>   	return __hpp__color_compare(fmt, hpp, he, COMPUTE_WEIGHTED_DIFF);
>>   }
>>   
>> +static int hpp__color_cycles(struct perf_hpp_fmt *fmt,
>> +			     struct perf_hpp *hpp, struct hist_entry *he)
>> +{
>> +	return __hpp__color_compare(fmt, hpp, he, COMPUTE_CYCLES);
>> +}
>> +
>>   static void
>>   hpp__entry_unpair(struct hist_entry *he, int idx, char *buf, size_t size)
>>   {
>> @@ -1608,6 +1659,10 @@ static void data__hpp_register(struct data__file *d, int idx)
>>   		fmt->color = hpp__color_delta;
>>   		fmt->sort  = hist_entry__cmp_delta_abs;
>>   		break;
>> +	case PERF_HPP_DIFF__CYCLES:
>> +		fmt->color = hpp__color_cycles;
>> +		fmt->sort  = hist_entry__cmp_cycles;
> 
> also please explain in comment why it's nop
> 
> jirka
> 

Got it, I will update the patch.

Thanks
Jin Yao

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 6/7] perf diff: Print the basic block cycles diff
  2019-06-05 11:44   ` Jiri Olsa
@ 2019-06-06  4:06     ` Jin, Yao
  0 siblings, 0 replies; 27+ messages in thread
From: Jin, Yao @ 2019-06-06  4:06 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 6/5/2019 7:44 PM, Jiri Olsa wrote:
> On Mon, Jun 03, 2019 at 10:36:16PM +0800, Jin Yao wrote:
>> perf record -b ./div
>> perf record -b ./div
>>
>> Following is the default perf diff output
>>
>>   # perf diff
>>
>>   # Event 'cycles'
>>   #
>>   # Baseline  Delta Abs  Shared Object     Symbol
>>   # ........  .........  ................  ....................................
>>   #
>>       49.03%     +0.30%  div               [.] main
>>       16.29%     -0.20%  libc-2.23.so      [.] __random
>>       18.82%     -0.07%  libc-2.23.so      [.] __random_r
>>        8.11%     -0.04%  div               [.] compute_flag
>>        2.25%     +0.01%  div               [.] rand@plt
>>        0.00%     +0.01%  [kernel.vmlinux]  [k] task_tick_fair
>>        5.46%     +0.01%  libc-2.23.so      [.] rand
>>        0.01%     -0.01%  [kernel.vmlinux]  [k] native_irq_return_iret
>>        0.00%     -0.00%  [kernel.vmlinux]  [k] interrupt_entry
>>
>> This patch creates a new computation selection 'cycles'.
>>
>>   # perf diff -c cycles
>>
>>   # Event 'cycles'
>>   #
>>   # Baseline         Block cycles diff [start:end]  Shared Object     Symbol
>>   # ........  ....................................  ................  ....................................
>>   #
>>       49.03%        -9 [         4ef:         520]  div               [.] main
>>       49.03%         0 [         4e8:         4ea]  div               [.] main
>>       49.03%         0 [         4ef:         500]  div               [.] main
>>       49.03%         0 [         4ef:         51c]  div               [.] main
>>       49.03%         0 [         4ef:         535]  div               [.] main
>>       18.82%         0 [       3ac40:       3ac4d]  libc-2.23.so      [.] __random_r
>>       18.82%         0 [       3ac40:       3ac5c]  libc-2.23.so      [.] __random_r
>>       18.82%         0 [       3ac40:       3ac76]  libc-2.23.so      [.] __random_r
>>       18.82%         0 [       3ac40:       3ac88]  libc-2.23.so      [.] __random_r
>>       18.82%         0 [       3ac90:       3ac9c]  libc-2.23.so      [.] __random_r
>>       16.29%        -8 [       3aac0:       3aac0]  libc-2.23.so      [.] __random
>>       16.29%         0 [       3aac0:       3aad2]  libc-2.23.so      [.] __random
>>       16.29%         0 [       3aae0:       3aae7]  libc-2.23.so      [.] __random
>>       16.29%         0 [       3ab03:       3ab0f]  libc-2.23.so      [.] __random
>>       16.29%         0 [       3ab14:       3ab1b]  libc-2.23.so      [.] __random
>>       16.29%         0 [       3ab28:       3ab2e]  libc-2.23.so      [.] __random
>>       16.29%         0 [       3ab4a:       3ab53]  libc-2.23.so      [.] __random
>>        8.11%         0 [         640:         644]  div               [.] compute_flag
>>        8.11%         0 [         649:         659]  div               [.] compute_flag
>>        5.46%         0 [       3af60:       3af60]  libc-2.23.so      [.] rand
>>        5.46%         0 [       3af60:       3af64]  libc-2.23.so      [.] rand
>>        2.25%         0 [         490:         490]  div               [.] rand@plt
>>        0.01%        26 [      c00a27:      c00a27]  [kernel.vmlinux]  [k] native_irq_return_iret
>>        0.00%      -157 [      2bf9f2:      2bfa63]  [kernel.vmlinux]  [k] update_blocked_averages
>>        0.00%       -56 [      2bf980:      2bf9d3]  [kernel.vmlinux]  [k] update_blocked_averages
>>        0.00%        48 [      2bf934:      2bf942]  [kernel.vmlinux]  [k] update_blocked_averages
>>        0.00%         3 [      2bfb38:      2bfb67]  [kernel.vmlinux]  [k] update_blocked_averages
>>        0.00%         0 [      2bf968:      2bf97b]  [kernel.vmlinux]  [k] update_blocked_averages
>>
> 
> so what I'd expect would be Baseline column with cycles and another
> column showing the differrence (in cycles) for given symbol
> 
>> "[start:end]" indicates the basic block range. The output is sorted
>> by "Baseline" and the basic blocks in the same function are sorted
>> by cycles diff.
> 
> hum, why is there multiple basic blocks [start:end] for a symbol?
> 
> thanks,
> jirka
> 

The basic block is the code between 2 branches (for one branch, for 
example, jmp, call, ret, interrupt, ...). So it's expected that one 
function (symbol is function) may contain multiple basic blocks.

The idea is, sorting by baseline to display the hottest functions and 
the second column shows the cycles diff of blocks in this function 
(comparing between different perf data files). This would allow to 
identify performance changes in specific code accurately and effectively.

Thanks
Jin Yao



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 4/7] perf diff: Use hists to manage basic blocks per symbol
  2019-06-05 11:44   ` Jiri Olsa
  2019-06-06  1:15     ` Jin, Yao
@ 2019-06-08 11:41     ` Jin, Yao
  2019-06-11  2:22       ` Jin, Yao
  2019-06-11  8:56       ` Jiri Olsa
  1 sibling, 2 replies; 27+ messages in thread
From: Jin, Yao @ 2019-06-08 11:41 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 6/5/2019 7:44 PM, Jiri Olsa wrote:
> On Mon, Jun 03, 2019 at 10:36:14PM +0800, Jin Yao wrote:
> 
> SNIP
> 
>> diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
>> index 43623fa..d1641da 100644
>> --- a/tools/perf/util/sort.h
>> +++ b/tools/perf/util/sort.h
>> @@ -79,6 +79,9 @@ struct hist_entry_diff {
>>   
>>   		/* HISTC_WEIGHTED_DIFF */
>>   		s64	wdiff;
>> +
>> +		/* PERF_HPP_DIFF__CYCLES */
>> +		s64	cycles;
>>   	};
>>   };
>>   
>> @@ -143,6 +146,9 @@ struct hist_entry {
>>   	struct branch_info	*branch_info;
>>   	long			time;
>>   	struct hists		*hists;
>> +	void			*block_hists;
>> +	int			block_idx;
>> +	int			block_num;
>>   	struct mem_info		*mem_info;
>>   	struct block_info	*block_info;
> 
> could you please not add the new block* stuff in here,
> and instead use the "c2c model" and use yourr own struct
> on top of hist_entry? we are trying to librarize this
> stuff and keep only necessary things in here..
> 
> you're already using hist_entry_ops, so should be easy
> 
> something like:
> 
> 	struct block_hist_entry {
> 		void			*block_hists;
> 		int			block_idx;
> 		int			block_num;
> 		struct block_info	*block_info;
> 
> 		struct hist_entry	he;
> 	};
> 
> 
> 
> jirka
> 

Hi Jiri,

After more considerations, maybe I can't move these stuffs from 
hist_entry to block_hist_entry.

Actually we use 2 kinds of hist_entry in this patch series. On kind of 
hist_entry is for symbol/function. The other kind of hist_entry is for 
basic block.

@@ -143,6 +146,9 @@ struct hist_entry {
   	struct branch_info	*branch_info;
   	long			time;
   	struct hists		*hists;
+	void			*block_hists;
+	int			block_idx;
+	int			block_num;
   	struct mem_info		*mem_info;
   	struct block_info	*block_info;

The above hist_entry is actually for symbol/function. This patch series 
collects all basic blocks in a symbol/function, so it needs a hists in 
struct hist_entry (block_hists) to point to the hists of basic blocks.

Correct me if I'm wrong.

Thanks
Jin Yao


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 4/7] perf diff: Use hists to manage basic blocks per symbol
  2019-06-08 11:41     ` Jin, Yao
@ 2019-06-11  2:22       ` Jin, Yao
  2019-06-11  8:56       ` Jiri Olsa
  1 sibling, 0 replies; 27+ messages in thread
From: Jin, Yao @ 2019-06-11  2:22 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 6/8/2019 7:41 PM, Jin, Yao wrote:
> 
> 
> On 6/5/2019 7:44 PM, Jiri Olsa wrote:
>> On Mon, Jun 03, 2019 at 10:36:14PM +0800, Jin Yao wrote:
>>
>> SNIP
>>
>>> diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
>>> index 43623fa..d1641da 100644
>>> --- a/tools/perf/util/sort.h
>>> +++ b/tools/perf/util/sort.h
>>> @@ -79,6 +79,9 @@ struct hist_entry_diff {
>>>           /* HISTC_WEIGHTED_DIFF */
>>>           s64    wdiff;
>>> +
>>> +        /* PERF_HPP_DIFF__CYCLES */
>>> +        s64    cycles;
>>>       };
>>>   };
>>> @@ -143,6 +146,9 @@ struct hist_entry {
>>>       struct branch_info    *branch_info;
>>>       long            time;
>>>       struct hists        *hists;
>>> +    void            *block_hists;
>>> +    int            block_idx;
>>> +    int            block_num;
>>>       struct mem_info        *mem_info;
>>>       struct block_info    *block_info;
>>
>> could you please not add the new block* stuff in here,
>> and instead use the "c2c model" and use yourr own struct
>> on top of hist_entry? we are trying to librarize this
>> stuff and keep only necessary things in here..
>>
>> you're already using hist_entry_ops, so should be easy
>>
>> something like:
>>
>>     struct block_hist_entry {
>>         void            *block_hists;
>>         int            block_idx;
>>         int            block_num;
>>         struct block_info    *block_info;
>>
>>         struct hist_entry    he;
>>     };
>>
>>
>>
>> jirka
>>
> 
> Hi Jiri,
> 
> After more considerations, maybe I can't move these stuffs from 
> hist_entry to block_hist_entry.
> 
> Actually we use 2 kinds of hist_entry in this patch series. On kind of 
> hist_entry is for symbol/function. The other kind of hist_entry is for 
> basic block.
> 
> @@ -143,6 +146,9 @@ struct hist_entry {
>        struct branch_info    *branch_info;
>        long            time;
>        struct hists        *hists;
> +    void            *block_hists;
> +    int            block_idx;
> +    int            block_num;
>        struct mem_info        *mem_info;
>        struct block_info    *block_info;
> 
> The above hist_entry is actually for symbol/function. This patch series 
> collects all basic blocks in a symbol/function, so it needs a hists in 
> struct hist_entry (block_hists) to point to the hists of basic blocks.
> 
> Correct me if I'm wrong.
> 
> Thanks
> Jin Yao
> 

Hi Jiri,

Either adding a new pointer 'priv' in 'struct map_symbol'?

struct map_symbol {
	struct map	*map;
	struct symbol	*sym;
+	void		*priv;
};

We create a struct outside and assign the pointer to priv. Logically it 
should make sense since the symbol/function may have private data for 
processing.

Any idea?

Thanks
Jin Yao





^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 4/7] perf diff: Use hists to manage basic blocks per symbol
  2019-06-08 11:41     ` Jin, Yao
  2019-06-11  2:22       ` Jin, Yao
@ 2019-06-11  8:56       ` Jiri Olsa
  2019-06-12  6:11         ` Jin, Yao
  1 sibling, 1 reply; 27+ messages in thread
From: Jiri Olsa @ 2019-06-11  8:56 UTC (permalink / raw)
  To: Jin, Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Sat, Jun 08, 2019 at 07:41:47PM +0800, Jin, Yao wrote:
> 
> 
> On 6/5/2019 7:44 PM, Jiri Olsa wrote:
> > On Mon, Jun 03, 2019 at 10:36:14PM +0800, Jin Yao wrote:
> > 
> > SNIP
> > 
> > > diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
> > > index 43623fa..d1641da 100644
> > > --- a/tools/perf/util/sort.h
> > > +++ b/tools/perf/util/sort.h
> > > @@ -79,6 +79,9 @@ struct hist_entry_diff {
> > >   		/* HISTC_WEIGHTED_DIFF */
> > >   		s64	wdiff;
> > > +
> > > +		/* PERF_HPP_DIFF__CYCLES */
> > > +		s64	cycles;
> > >   	};
> > >   };
> > > @@ -143,6 +146,9 @@ struct hist_entry {
> > >   	struct branch_info	*branch_info;
> > >   	long			time;
> > >   	struct hists		*hists;
> > > +	void			*block_hists;
> > > +	int			block_idx;
> > > +	int			block_num;
> > >   	struct mem_info		*mem_info;
> > >   	struct block_info	*block_info;
> > 
> > could you please not add the new block* stuff in here,
> > and instead use the "c2c model" and use yourr own struct
> > on top of hist_entry? we are trying to librarize this
> > stuff and keep only necessary things in here..
> > 
> > you're already using hist_entry_ops, so should be easy
> > 
> > something like:
> > 
> > 	struct block_hist_entry {
> > 		void			*block_hists;
> > 		int			block_idx;
> > 		int			block_num;
> > 		struct block_info	*block_info;
> > 
> > 		struct hist_entry	he;
> > 	};
> > 
> > 
> > 
> > jirka
> > 
> 
> Hi Jiri,
> 
> After more considerations, maybe I can't move these stuffs from hist_entry
> to block_hist_entry.

why?

> 
> Actually we use 2 kinds of hist_entry in this patch series. On kind of
> hist_entry is for symbol/function. The other kind of hist_entry is for basic
> block.

correct

so the way I see it the processing goes like this:


1) there's standard hist_entry processing ending up
   with evsel->hists->rb_root full of hist entries

2) then you process every hist_entry and create
   new 'struct hists' for each and fill it with
   symbol counts data



you could add 'struct hist_entry_ops' for the 1) processing
that adds the 'struct hists' object for each hist_entry

and add another 'struct hist_entry_ops' for 2) processing
to carry the block data for each hist_entry

jirka

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 4/7] perf diff: Use hists to manage basic blocks per symbol
  2019-06-11  8:56       ` Jiri Olsa
@ 2019-06-12  6:11         ` Jin, Yao
  2019-06-12  7:44           ` Jiri Olsa
  0 siblings, 1 reply; 27+ messages in thread
From: Jin, Yao @ 2019-06-12  6:11 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 6/11/2019 4:56 PM, Jiri Olsa wrote:
> On Sat, Jun 08, 2019 at 07:41:47PM +0800, Jin, Yao wrote:
>>
>>
>> On 6/5/2019 7:44 PM, Jiri Olsa wrote:
>>> On Mon, Jun 03, 2019 at 10:36:14PM +0800, Jin Yao wrote:
>>>
>>> SNIP
>>>
>>>> diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
>>>> index 43623fa..d1641da 100644
>>>> --- a/tools/perf/util/sort.h
>>>> +++ b/tools/perf/util/sort.h
>>>> @@ -79,6 +79,9 @@ struct hist_entry_diff {
>>>>    		/* HISTC_WEIGHTED_DIFF */
>>>>    		s64	wdiff;
>>>> +
>>>> +		/* PERF_HPP_DIFF__CYCLES */
>>>> +		s64	cycles;
>>>>    	};
>>>>    };
>>>> @@ -143,6 +146,9 @@ struct hist_entry {
>>>>    	struct branch_info	*branch_info;
>>>>    	long			time;
>>>>    	struct hists		*hists;
>>>> +	void			*block_hists;
>>>> +	int			block_idx;
>>>> +	int			block_num;
>>>>    	struct mem_info		*mem_info;
>>>>    	struct block_info	*block_info;
>>>
>>> could you please not add the new block* stuff in here,
>>> and instead use the "c2c model" and use yourr own struct
>>> on top of hist_entry? we are trying to librarize this
>>> stuff and keep only necessary things in here..
>>>
>>> you're already using hist_entry_ops, so should be easy
>>>
>>> something like:
>>>
>>> 	struct block_hist_entry {
>>> 		void			*block_hists;
>>> 		int			block_idx;
>>> 		int			block_num;
>>> 		struct block_info	*block_info;
>>>
>>> 		struct hist_entry	he;
>>> 	};
>>>
>>>
>>>
>>> jirka
>>>
>>
>> Hi Jiri,
>>
>> After more considerations, maybe I can't move these stuffs from hist_entry
>> to block_hist_entry.
> 
> why?
> 
>>
>> Actually we use 2 kinds of hist_entry in this patch series. On kind of
>> hist_entry is for symbol/function. The other kind of hist_entry is for basic
>> block.
> 
> correct
> 
> so the way I see it the processing goes like this:
> 
> 
> 1) there's standard hist_entry processing ending up
>     with evsel->hists->rb_root full of hist entries
> 
> 2) then you process every hist_entry and create
>     new 'struct hists' for each and fill it with
>     symbol counts data
> 
> 
> 
> you could add 'struct hist_entry_ops' for the 1) processing
> that adds the 'struct hists' object for each hist_entry
> 
> and add another 'struct hist_entry_ops' for 2) processing
> to carry the block data for each hist_entry
> 
> jirka
> 

Hi Jiri,

Yes, I can use two hist_entry_ops but one thing is still difficult to 
handle that is the printing of blocks.

One function may contain multiple blocks so I add 'block_num' in 'struct 
hist_entry' to record the number of blocks.

In patch "perf diff: Print the basic block cycles diff", I reuse most of 
current code to print the blocks. The major change is:

  static int hist_entry__fprintf(struct hist_entry *he, size_t size,
                                char *bf, size_t bfsz, FILE *fp,
                                bool ignore_callchains) {

+       if (he->block_hists)
+               return hist_entry__block_fprintf(he, bf, size, fp);
+
         hist_entry__snprintf(he, &hpp);
}

+static int hist_entry__block_fprintf(struct hist_entry *he,
+                                    char *bf, size_t size,
+                                    FILE *fp)
+{
+       int ret = 0;
+
+       for (int i = 0; i < he->block_num; i++) {
+               struct perf_hpp hpp = {
+                       .buf            = bf,
+                       .size           = size,
+                       .skip           = false,
+               };
+
+               he->block_idx = i;
+               hist_entry__snprintf(he, &hpp);
+
+               if (!hpp.skip)
+                       ret += fprintf(fp, "%s\n", bf);
+       }
+
+       return ret;
+}
+

So it looks at least I need to add 'block_num' to 'struct hist_entry', 
otherwise I can't reuse most of codes.

Any idea for 'block_num'?

Thanks
Jin Yao

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 4/7] perf diff: Use hists to manage basic blocks per symbol
  2019-06-12  6:11         ` Jin, Yao
@ 2019-06-12  7:44           ` Jiri Olsa
  2019-06-12 12:54             ` Jin, Yao
  0 siblings, 1 reply; 27+ messages in thread
From: Jiri Olsa @ 2019-06-12  7:44 UTC (permalink / raw)
  To: Jin, Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Wed, Jun 12, 2019 at 02:11:44PM +0800, Jin, Yao wrote:
> 
> 
> On 6/11/2019 4:56 PM, Jiri Olsa wrote:
> > On Sat, Jun 08, 2019 at 07:41:47PM +0800, Jin, Yao wrote:
> > > 
> > > 
> > > On 6/5/2019 7:44 PM, Jiri Olsa wrote:
> > > > On Mon, Jun 03, 2019 at 10:36:14PM +0800, Jin Yao wrote:
> > > > 
> > > > SNIP
> > > > 
> > > > > diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
> > > > > index 43623fa..d1641da 100644
> > > > > --- a/tools/perf/util/sort.h
> > > > > +++ b/tools/perf/util/sort.h
> > > > > @@ -79,6 +79,9 @@ struct hist_entry_diff {
> > > > >    		/* HISTC_WEIGHTED_DIFF */
> > > > >    		s64	wdiff;
> > > > > +
> > > > > +		/* PERF_HPP_DIFF__CYCLES */
> > > > > +		s64	cycles;
> > > > >    	};
> > > > >    };
> > > > > @@ -143,6 +146,9 @@ struct hist_entry {
> > > > >    	struct branch_info	*branch_info;
> > > > >    	long			time;
> > > > >    	struct hists		*hists;
> > > > > +	void			*block_hists;
> > > > > +	int			block_idx;
> > > > > +	int			block_num;
> > > > >    	struct mem_info		*mem_info;
> > > > >    	struct block_info	*block_info;
> > > > 
> > > > could you please not add the new block* stuff in here,
> > > > and instead use the "c2c model" and use yourr own struct
> > > > on top of hist_entry? we are trying to librarize this
> > > > stuff and keep only necessary things in here..
> > > > 
> > > > you're already using hist_entry_ops, so should be easy
> > > > 
> > > > something like:
> > > > 
> > > > 	struct block_hist_entry {
> > > > 		void			*block_hists;
> > > > 		int			block_idx;
> > > > 		int			block_num;
> > > > 		struct block_info	*block_info;
> > > > 
> > > > 		struct hist_entry	he;
> > > > 	};
> > > > 
> > > > 
> > > > 
> > > > jirka
> > > > 
> > > 
> > > Hi Jiri,
> > > 
> > > After more considerations, maybe I can't move these stuffs from hist_entry
> > > to block_hist_entry.
> > 
> > why?
> > 
> > > 
> > > Actually we use 2 kinds of hist_entry in this patch series. On kind of
> > > hist_entry is for symbol/function. The other kind of hist_entry is for basic
> > > block.
> > 
> > correct
> > 
> > so the way I see it the processing goes like this:
> > 
> > 
> > 1) there's standard hist_entry processing ending up
> >     with evsel->hists->rb_root full of hist entries
> > 
> > 2) then you process every hist_entry and create
> >     new 'struct hists' for each and fill it with
> >     symbol counts data
> > 
> > 
> > 
> > you could add 'struct hist_entry_ops' for the 1) processing
> > that adds the 'struct hists' object for each hist_entry
> > 
> > and add another 'struct hist_entry_ops' for 2) processing
> > to carry the block data for each hist_entry
> > 
> > jirka
> > 
> 
> Hi Jiri,
> 
> Yes, I can use two hist_entry_ops but one thing is still difficult to handle
> that is the printing of blocks.
> 
> One function may contain multiple blocks so I add 'block_num' in 'struct
> hist_entry' to record the number of blocks.
> 
> In patch "perf diff: Print the basic block cycles diff", I reuse most of
> current code to print the blocks. The major change is:
> 
>  static int hist_entry__fprintf(struct hist_entry *he, size_t size,
>                                char *bf, size_t bfsz, FILE *fp,
>                                bool ignore_callchains) {
> 
> +       if (he->block_hists)
> +               return hist_entry__block_fprintf(he, bf, size, fp);
> +


you could do it the way we do hierarchy and have
something like 'symbol_conf.report_block'

        if (symbol_conf.report_hierarchy)
                return hist_entry__hierarchy_fprintf(he, &hpp, hists, fp);

and in hist_entry__block_fprintf you cast the hist_entry
to your struct.. so you'll have all the data

jirka

>         hist_entry__snprintf(he, &hpp);
> }
> 
> +static int hist_entry__block_fprintf(struct hist_entry *he,
> +                                    char *bf, size_t size,
> +                                    FILE *fp)
> +{
> +       int ret = 0;
> +
> +       for (int i = 0; i < he->block_num; i++) {
> +               struct perf_hpp hpp = {
> +                       .buf            = bf,
> +                       .size           = size,
> +                       .skip           = false,
> +               };
> +
> +               he->block_idx = i;
> +               hist_entry__snprintf(he, &hpp);
> +
> +               if (!hpp.skip)
> +                       ret += fprintf(fp, "%s\n", bf);
> +       }
> +
> +       return ret;
> +}
> +
> 
> So it looks at least I need to add 'block_num' to 'struct hist_entry',
> otherwise I can't reuse most of codes.
> 
> Any idea for 'block_num'?
> 
> Thanks
> Jin Yao

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 4/7] perf diff: Use hists to manage basic blocks per symbol
  2019-06-12  7:44           ` Jiri Olsa
@ 2019-06-12 12:54             ` Jin, Yao
  0 siblings, 0 replies; 27+ messages in thread
From: Jin, Yao @ 2019-06-12 12:54 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 6/12/2019 3:44 PM, Jiri Olsa wrote:
> On Wed, Jun 12, 2019 at 02:11:44PM +0800, Jin, Yao wrote:
>>
>>
>> On 6/11/2019 4:56 PM, Jiri Olsa wrote:
>>> On Sat, Jun 08, 2019 at 07:41:47PM +0800, Jin, Yao wrote:
>>>>
>>>>
>>>> On 6/5/2019 7:44 PM, Jiri Olsa wrote:
>>>>> On Mon, Jun 03, 2019 at 10:36:14PM +0800, Jin Yao wrote:
>>>>>
>>>>> SNIP
>>>>>
>>>>>> diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
>>>>>> index 43623fa..d1641da 100644
>>>>>> --- a/tools/perf/util/sort.h
>>>>>> +++ b/tools/perf/util/sort.h
>>>>>> @@ -79,6 +79,9 @@ struct hist_entry_diff {
>>>>>>     		/* HISTC_WEIGHTED_DIFF */
>>>>>>     		s64	wdiff;
>>>>>> +
>>>>>> +		/* PERF_HPP_DIFF__CYCLES */
>>>>>> +		s64	cycles;
>>>>>>     	};
>>>>>>     };
>>>>>> @@ -143,6 +146,9 @@ struct hist_entry {
>>>>>>     	struct branch_info	*branch_info;
>>>>>>     	long			time;
>>>>>>     	struct hists		*hists;
>>>>>> +	void			*block_hists;
>>>>>> +	int			block_idx;
>>>>>> +	int			block_num;
>>>>>>     	struct mem_info		*mem_info;
>>>>>>     	struct block_info	*block_info;
>>>>>
>>>>> could you please not add the new block* stuff in here,
>>>>> and instead use the "c2c model" and use yourr own struct
>>>>> on top of hist_entry? we are trying to librarize this
>>>>> stuff and keep only necessary things in here..
>>>>>
>>>>> you're already using hist_entry_ops, so should be easy
>>>>>
>>>>> something like:
>>>>>
>>>>> 	struct block_hist_entry {
>>>>> 		void			*block_hists;
>>>>> 		int			block_idx;
>>>>> 		int			block_num;
>>>>> 		struct block_info	*block_info;
>>>>>
>>>>> 		struct hist_entry	he;
>>>>> 	};
>>>>>
>>>>>
>>>>>
>>>>> jirka
>>>>>
>>>>
>>>> Hi Jiri,
>>>>
>>>> After more considerations, maybe I can't move these stuffs from hist_entry
>>>> to block_hist_entry.
>>>
>>> why?
>>>
>>>>
>>>> Actually we use 2 kinds of hist_entry in this patch series. On kind of
>>>> hist_entry is for symbol/function. The other kind of hist_entry is for basic
>>>> block.
>>>
>>> correct
>>>
>>> so the way I see it the processing goes like this:
>>>
>>>
>>> 1) there's standard hist_entry processing ending up
>>>      with evsel->hists->rb_root full of hist entries
>>>
>>> 2) then you process every hist_entry and create
>>>      new 'struct hists' for each and fill it with
>>>      symbol counts data
>>>
>>>
>>>
>>> you could add 'struct hist_entry_ops' for the 1) processing
>>> that adds the 'struct hists' object for each hist_entry
>>>
>>> and add another 'struct hist_entry_ops' for 2) processing
>>> to carry the block data for each hist_entry
>>>
>>> jirka
>>>
>>
>> Hi Jiri,
>>
>> Yes, I can use two hist_entry_ops but one thing is still difficult to handle
>> that is the printing of blocks.
>>
>> One function may contain multiple blocks so I add 'block_num' in 'struct
>> hist_entry' to record the number of blocks.
>>
>> In patch "perf diff: Print the basic block cycles diff", I reuse most of
>> current code to print the blocks. The major change is:
>>
>>   static int hist_entry__fprintf(struct hist_entry *he, size_t size,
>>                                 char *bf, size_t bfsz, FILE *fp,
>>                                 bool ignore_callchains) {
>>
>> +       if (he->block_hists)
>> +               return hist_entry__block_fprintf(he, bf, size, fp);
>> +
> 
> 
> you could do it the way we do hierarchy and have
> something like 'symbol_conf.report_block'
> 
>          if (symbol_conf.report_hierarchy)
>                  return hist_entry__hierarchy_fprintf(he, &hpp, hists, fp);
> 
> and in hist_entry__block_fprintf you cast the hist_entry
> to your struct.. so you'll have all the data
> 
> jirka
> 

Thanks Jiri. So it looks I need to define 'struct block_hist_entry' in 
util/sort.h, then hist_entry__block_fprintf can know this struct. 
Previously I just defined this struct in builtin-diff.c.

Thanks
Jin Yao


^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2019-06-12 12:54 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-03 14:36 [PATCH v2 0/7] perf diff: diff cycles at basic block level Jin Yao
2019-06-03  6:51 ` Jin, Yao
2019-06-03 14:36 ` [PATCH v2 1/7] perf util: Create block_info structure Jin Yao
2019-06-03 14:36 ` [PATCH v2 2/7] perf util: Add block_info in hist_entry Jin Yao
2019-06-03 14:36 ` [PATCH v2 3/7] perf diff: Check if all data files with branch stacks Jin Yao
2019-06-03 14:36 ` [PATCH v2 4/7] perf diff: Use hists to manage basic blocks per symbol Jin Yao
2019-06-05 11:44   ` Jiri Olsa
2019-06-06  1:15     ` Jin, Yao
2019-06-08 11:41     ` Jin, Yao
2019-06-11  2:22       ` Jin, Yao
2019-06-11  8:56       ` Jiri Olsa
2019-06-12  6:11         ` Jin, Yao
2019-06-12  7:44           ` Jiri Olsa
2019-06-12 12:54             ` Jin, Yao
2019-06-05 11:44   ` Jiri Olsa
2019-06-06  1:26     ` Jin, Yao
2019-06-05 11:44   ` Jiri Olsa
2019-06-06  1:57     ` Jin, Yao
2019-06-03 14:36 ` [PATCH v2 5/7] perf diff: Link same basic blocks among different data files Jin Yao
2019-06-03 14:36 ` [PATCH v2 6/7] perf diff: Print the basic block cycles diff Jin Yao
2019-06-05 11:44   ` Jiri Olsa
2019-06-06  4:06     ` Jin, Yao
2019-06-05 11:44   ` Jiri Olsa
2019-06-06  2:02     ` Jin, Yao
2019-06-03 14:36 ` [PATCH v2 7/7] perf diff: Documentation -c cycles option Jin Yao
2019-06-05 11:44 ` [PATCH v2 0/7] perf diff: diff cycles at basic block level Jiri Olsa
2019-06-06  1:05   ` Jin, Yao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).