linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/7] perf diff: diff cycles at basic block level
@ 2019-06-20 14:36 Jin Yao
  2019-06-20 14:36 ` [PATCH v4 1/7] perf util: Create block_info structure Jin Yao
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Jin Yao @ 2019-06-20 14:36 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

In some cases small changes in hot loops can show big differences.
But it's difficult to identify these differences.

perf diff currently can only diff symbols (functions). We can also expand
it to diff cycles of individual programs blocks as reported by timed LBR.
This would allow to identify changes in specific code accurately.

With this patch set, for example,

 $ perf record -b ./div
 $ perf record -b ./div
 $ perf diff -c cycles

 # Event 'cycles'
 #
 # Baseline                                       [Program Block Range] Cycles Diff  Shared Object     Symbol
 # ........  ......................................................................  ................  ..................................
 #
     48.75%                                             [div.c:42 -> div.c:45]  147  div               [.] main
     48.75%                                             [div.c:31 -> div.c:40]    4  div               [.] main
     48.75%                                             [div.c:40 -> div.c:40]    0  div               [.] main
     48.75%                                             [div.c:42 -> div.c:42]    0  div               [.] main
     48.75%                                             [div.c:42 -> div.c:44]    0  div               [.] main
     19.02%                                 [random_r.c:357 -> random_r.c:360]    0  libc-2.23.so      [.] __random_r
     19.02%                                 [random_r.c:357 -> random_r.c:373]    0  libc-2.23.so      [.] __random_r
     19.02%                                 [random_r.c:357 -> random_r.c:376]    0  libc-2.23.so      [.] __random_r
     19.02%                                 [random_r.c:357 -> random_r.c:380]    0  libc-2.23.so      [.] __random_r
     19.02%                                 [random_r.c:357 -> random_r.c:392]    0  libc-2.23.so      [.] __random_r
     16.17%                                     [random.c:288 -> random.c:291]    0  libc-2.23.so      [.] __random
     16.17%                                     [random.c:288 -> random.c:291]    0  libc-2.23.so      [.] __random
     16.17%                                     [random.c:288 -> random.c:295]    0  libc-2.23.so      [.] __random
     16.17%                                     [random.c:288 -> random.c:297]    0  libc-2.23.so      [.] __random
     16.17%                                     [random.c:291 -> random.c:291]    0  libc-2.23.so      [.] __random
     16.17%                                     [random.c:293 -> random.c:293]    0  libc-2.23.so      [.] __random
      8.21%                                             [div.c:22 -> div.c:22]  148  div               [.] compute_flag
      8.21%                                             [div.c:22 -> div.c:25]    0  div               [.] compute_flag
      8.21%                                             [div.c:27 -> div.c:28]    0  div               [.] compute_flag
      5.52%                                           [rand.c:26 -> rand.c:27]    0  libc-2.23.so      [.] rand
      5.52%                                           [rand.c:26 -> rand.c:28]    0  libc-2.23.so      [.] rand
      2.27%                                         [rand@plt+0 -> rand@plt+0]    0  div               [.] rand@plt
      0.01%                                 [entry_64.S:694 -> entry_64.S:694]   16  [kernel.vmlinux]  [k] native_irq_return_iret
      0.00%                                       [fair.c:7676 -> fair.c:7665]  162  [kernel.vmlinux]  [k] update_blocked_averages

 '[Program Block Range]' indicates the range of program basic block
 (start -> end). If we can find the source line it prints the source
 line otherwise it prints the symbol+offset instead.

 v4:
 ---
 Use source lines or symbol+offset to indicate the basic block.

 Changed patches:
  perf diff: Print the basic block cycles diff
  perf diff: Documentation -c cycles option  

 v3:
 ---
 In v3, the major change is to move most of block stuffs from
 'struct hist_entry' to new structure 'struct block_hist' and
 update the code accordingly. But we still have to keep the
 block_info in 'struct hist_entry' since we need to compare by
 block info when inserting new entry to hists.

 Others are minor changes, such as abs() -> labs(), removing
 duplicated ops and etc.

 Changed patches:
  perf diff: Use hists to manage basic blocks per symbol
  perf diff: Link same basic blocks among different data
  perf diff: Print the basic block cycles diff

 v2:
 ---
 Keep standard perf diff format.

 Following is the v1 output.

 # perf diff --basic-block

 # Cycles diff  Basic block (start:end)
 # ...........  .......................
 #
          -208  hrtimer_interrupt (30b9e0:30ba42)
          -157  update_blocked_averages (2bf9f2:2bfa63)
          -126  interrupt_entry (c00880:c0093a)
           -86  hrtimer_interrupt (30bb29:30bb32)
           -74  hrtimer_interrupt (30ba65:30bac4)
           -56  update_blocked_averages (2bf980:2bf9d3)
            48  update_blocked_averages (2bf934:2bf942)
           -35  native_write_msr (267900:26790b)
            26  native_irq_return_iret (c00a27:c00a27)
            22  rcu_check_callbacks (2febb6:2febdc)
           -21  __hrtimer_run_queues (30b220:30b2a3)
            19  pvclock_gtod_notify (14ba0:14c1b)
           -18  task_tick_fair (2c5d29:2c5d41)

Jin Yao (7):
  perf util: Create block_info structure
  perf util: Add block_info in hist_entry
  perf diff: Check if all data files with branch stacks
  perf diff: Use hists to manage basic blocks per symbol
  perf diff: Link same basic blocks among different data
  perf diff: Print the basic block cycles diff
  perf diff: Documentation -c cycles option

 tools/perf/Documentation/perf-diff.txt |  17 +-
 tools/perf/builtin-diff.c              | 404 ++++++++++++++++++++++++++++++++-
 tools/perf/ui/stdio/hist.c             |  27 +++
 tools/perf/util/hist.c                 |  40 +++-
 tools/perf/util/hist.h                 |   9 +
 tools/perf/util/sort.h                 |  13 ++
 tools/perf/util/srcline.c              |   4 +-
 tools/perf/util/symbol.c               |  22 ++
 tools/perf/util/symbol.h               |  23 ++
 tools/perf/util/symbol_conf.h          |   4 +-
 10 files changed, 551 insertions(+), 12 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v4 1/7] perf util: Create block_info structure
  2019-06-20 14:36 [PATCH v4 0/7] perf diff: diff cycles at basic block level Jin Yao
@ 2019-06-20 14:36 ` Jin Yao
  2019-06-20 14:36 ` [PATCH v4 2/7] perf util: Add block_info in hist_entry Jin Yao
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Jin Yao @ 2019-06-20 14:36 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

perf diff currently can only diff symbols(functions). We should expand it
to diff cycles of individual programs blocks as reported by timed LBR.
This would allow to identify changes in specific code accurately.

We need a new structure to maintain the basic block information, such as,
symbol(function), start/end address of this block, cycles. This patch
creates this structure and with some ops.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/util/symbol.c | 22 ++++++++++++++++++++++
 tools/perf/util/symbol.h | 23 +++++++++++++++++++++++
 2 files changed, 45 insertions(+)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index f4540f8..4e0a7b3 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -2351,3 +2351,25 @@ struct mem_info *mem_info__new(void)
 		refcount_set(&mi->refcnt, 1);
 	return mi;
 }
+
+struct block_info *block_info__get(struct block_info *bi)
+{
+	if (bi)
+		refcount_inc(&bi->refcnt);
+	return bi;
+}
+
+void block_info__put(struct block_info *bi)
+{
+	if (bi && refcount_dec_and_test(&bi->refcnt))
+		free(bi);
+}
+
+struct block_info *block_info__new(void)
+{
+	struct block_info *bi = zalloc(sizeof(*bi));
+
+	if (bi)
+		refcount_set(&bi->refcnt, 1);
+	return bi;
+}
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 9a8fe01..12755b4 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -131,6 +131,17 @@ struct mem_info {
 	refcount_t		refcnt;
 };
 
+struct block_info {
+	struct symbol		*sym;
+	u64			start;
+	u64			end;
+	u64			cycles;
+	u64			cycles_aggr;
+	int			num;
+	int			num_aggr;
+	refcount_t		refcnt;
+};
+
 struct addr_location {
 	struct machine *machine;
 	struct thread *thread;
@@ -332,4 +343,16 @@ static inline void __mem_info__zput(struct mem_info **mi)
 
 #define mem_info__zput(mi) __mem_info__zput(&mi)
 
+struct block_info *block_info__new(void);
+struct block_info *block_info__get(struct block_info *bi);
+void   block_info__put(struct block_info *bi);
+
+static inline void __block_info__zput(struct block_info **bi)
+{
+	block_info__put(*bi);
+	*bi = NULL;
+}
+
+#define block_info__zput(bi) __block_info__zput(&bi)
+
 #endif /* __PERF_SYMBOL */
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v4 2/7] perf util: Add block_info in hist_entry
  2019-06-20 14:36 [PATCH v4 0/7] perf diff: diff cycles at basic block level Jin Yao
  2019-06-20 14:36 ` [PATCH v4 1/7] perf util: Create block_info structure Jin Yao
@ 2019-06-20 14:36 ` Jin Yao
  2019-06-20 14:36 ` [PATCH v4 3/7] perf diff: Check if all data files with branch stacks Jin Yao
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Jin Yao @ 2019-06-20 14:36 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

The block_info contains the program basic block information, i.e,
contains the start address and the end address of this basic block and
how much cycles it takes. We need to compare, sort and even print out
the basic block by some orders, i.e. sort by cycles.

For this purpose, we add block_info field to hist_entry. In order not to
impact current interface, we creates a new function hists__add_entry_block.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/util/hist.c | 22 ++++++++++++++++++++--
 tools/perf/util/hist.h |  6 ++++++
 tools/perf/util/sort.h |  1 +
 3 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index fb3271f..680ad93 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -574,6 +574,8 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
 			 */
 			mem_info__zput(entry->mem_info);
 
+			block_info__zput(entry->block_info);
+
 			/* If the map of an existing hist_entry has
 			 * become out-of-date due to an exec() or
 			 * similar, update it.  Otherwise we will
@@ -645,6 +647,7 @@ __hists__add_entry(struct hists *hists,
 		   struct symbol *sym_parent,
 		   struct branch_info *bi,
 		   struct mem_info *mi,
+		   struct block_info *block_info,
 		   struct perf_sample *sample,
 		   bool sample_self,
 		   struct hist_entry_ops *ops)
@@ -677,6 +680,7 @@ __hists__add_entry(struct hists *hists,
 		.hists	= hists,
 		.branch_info = bi,
 		.mem_info = mi,
+		.block_info = block_info,
 		.transaction = sample->transaction,
 		.raw_data = sample->raw_data,
 		.raw_size = sample->raw_size,
@@ -699,7 +703,7 @@ struct hist_entry *hists__add_entry(struct hists *hists,
 				    struct perf_sample *sample,
 				    bool sample_self)
 {
-	return __hists__add_entry(hists, al, sym_parent, bi, mi,
+	return __hists__add_entry(hists, al, sym_parent, bi, mi, NULL,
 				  sample, sample_self, NULL);
 }
 
@@ -712,10 +716,24 @@ struct hist_entry *hists__add_entry_ops(struct hists *hists,
 					struct perf_sample *sample,
 					bool sample_self)
 {
-	return __hists__add_entry(hists, al, sym_parent, bi, mi,
+	return __hists__add_entry(hists, al, sym_parent, bi, mi, NULL,
 				  sample, sample_self, ops);
 }
 
+struct hist_entry *hists__add_entry_block(struct hists *hists,
+					  struct hist_entry_ops *ops,
+					  struct addr_location *al,
+					  struct block_info *block_info)
+{
+	struct hist_entry entry = {
+		.ops = ops,
+		.block_info = block_info,
+		.hists = hists,
+	}, *he = hists__findnew_entry(hists, &entry, al, false);
+
+	return he;
+}
+
 static int
 iter_next_nop_entry(struct hist_entry_iter *iter __maybe_unused,
 		    struct addr_location *al __maybe_unused)
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 76ff6c6..c8f7d66 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -16,6 +16,7 @@ struct addr_location;
 struct map_symbol;
 struct mem_info;
 struct branch_info;
+struct block_info;
 struct symbol;
 
 enum hist_filter {
@@ -149,6 +150,11 @@ struct hist_entry *hists__add_entry_ops(struct hists *hists,
 					struct perf_sample *sample,
 					bool sample_self);
 
+struct hist_entry *hists__add_entry_block(struct hists *hists,
+					  struct hist_entry_ops *ops,
+					  struct addr_location *al,
+					  struct block_info *bi);
+
 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
 			 int max_stack_depth, void *arg);
 
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index ce376a7..43623fa 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -144,6 +144,7 @@ struct hist_entry {
 	long			time;
 	struct hists		*hists;
 	struct mem_info		*mem_info;
+	struct block_info	*block_info;
 	void			*raw_data;
 	u32			raw_size;
 	int			num_res;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v4 3/7] perf diff: Check if all data files with branch stacks
  2019-06-20 14:36 [PATCH v4 0/7] perf diff: diff cycles at basic block level Jin Yao
  2019-06-20 14:36 ` [PATCH v4 1/7] perf util: Create block_info structure Jin Yao
  2019-06-20 14:36 ` [PATCH v4 2/7] perf util: Add block_info in hist_entry Jin Yao
@ 2019-06-20 14:36 ` Jin Yao
  2019-06-20 14:36 ` [PATCH v4 4/7] perf diff: Use hists to manage basic blocks per symbol Jin Yao
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Jin Yao @ 2019-06-20 14:36 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

We will expand perf diff to support diff cycles of individual programs
blocks, so it requires all data files having branch stacks.

This patch checks HEADER_BRANCH_STACK in header, and only set the flag
has_br_stack when HEADER_BRANCH_STACK are set in all data files.

 v2:
 ---
 Move check_file_brstack() from __cmd_diff() to cmd_diff().
 Because later patch will check flag 'has_br_stack' before
 ui_init().

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-diff.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 6e79207..a7e0420 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -32,6 +32,7 @@ struct perf_diff {
 	struct perf_time_interval	*ptime_range;
 	int				 range_size;
 	int				 range_num;
+	bool				 has_br_stack;
 };
 
 /* Diff command specific HPP columns. */
@@ -873,6 +874,31 @@ static int parse_time_str(struct data__file *d, char *abstime_ostr,
 	return ret;
 }
 
+static int check_file_brstack(void)
+{
+	struct data__file *d;
+	bool has_br_stack;
+	int i;
+
+	data__for_each_file(i, d) {
+		d->session = perf_session__new(&d->data, false, &pdiff.tool);
+		if (!d->session) {
+			pr_err("Failed to open %s\n", d->data.path);
+			return -1;
+		}
+
+		has_br_stack = perf_header__has_feat(&d->session->header,
+						     HEADER_BRANCH_STACK);
+		perf_session__delete(d->session);
+		if (!has_br_stack)
+			return 0;
+	}
+
+	/* Set only all files having branch stacks */
+	pdiff.has_br_stack = true;
+	return 0;
+}
+
 static int __cmd_diff(void)
 {
 	struct data__file *d;
@@ -1487,6 +1513,9 @@ int cmd_diff(int argc, const char **argv)
 	if (data_init(argc, argv) < 0)
 		return -1;
 
+	if (check_file_brstack() < 0)
+		return -1;
+
 	if (ui_init() < 0)
 		return -1;
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v4 4/7] perf diff: Use hists to manage basic blocks per symbol
  2019-06-20 14:36 [PATCH v4 0/7] perf diff: diff cycles at basic block level Jin Yao
                   ` (2 preceding siblings ...)
  2019-06-20 14:36 ` [PATCH v4 3/7] perf diff: Check if all data files with branch stacks Jin Yao
@ 2019-06-20 14:36 ` Jin Yao
  2019-06-24  7:57   ` Jiri Olsa
  2019-06-25  9:11   ` Jiri Olsa
  2019-06-20 14:36 ` [PATCH v4 5/7] perf diff: Link same basic blocks among different data Jin Yao
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 13+ messages in thread
From: Jin Yao @ 2019-06-20 14:36 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

The hist__account_cycles() can account cycles per basic
block. The basic block information is saved in cycles_hist
structure.

This patch processes each symbol, get basic blocks from
cycles_hist and add the basic block entries to a new hists
(in 'struct block_hist'). Using a hists is because
we need to compare, sort and print the basic blocks later.

 v3:
 ---
 1. In v2, we put block stuffs in 'struct hist_entry', but
 it's not a good design. In v3, we create a new
 'struct block_hist' and cast the 'struct hist_entry' to
 'struct block_hist' in some places, which can avoid adding
 new stuffs in 'struct hist_entry'.

 2. abs() -> labs(), in block_cycles_diff_cmp().

 v2:
 ---
 v1 adds the basic block entries to per data-file hists
 but v2 adds the basic block entries to per symbol hists.
 That is to keep current perf-diff format. Will show the
 result in next patches.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-diff.c | 209 +++++++++++++++++++++++++++++++++++++++++++++-
 tools/perf/util/sort.h    |  12 +++
 2 files changed, 218 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index a7e0420..c6ec84d 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -20,6 +20,7 @@
 #include "util/data.h"
 #include "util/config.h"
 #include "util/time-utils.h"
+#include "util/annotate.h"
 
 #include <errno.h>
 #include <inttypes.h>
@@ -87,11 +88,14 @@ static s64 compute_wdiff_w2;
 static const char		*cpu_list;
 static DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
 
+static struct addr_location dummy_al;
+
 enum {
 	COMPUTE_DELTA,
 	COMPUTE_RATIO,
 	COMPUTE_WEIGHTED_DIFF,
 	COMPUTE_DELTA_ABS,
+	COMPUTE_CYCLES,
 	COMPUTE_MAX,
 };
 
@@ -100,6 +104,7 @@ const char *compute_names[COMPUTE_MAX] = {
 	[COMPUTE_DELTA_ABS] = "delta-abs",
 	[COMPUTE_RATIO] = "ratio",
 	[COMPUTE_WEIGHTED_DIFF] = "wdiff",
+	[COMPUTE_CYCLES] = "cycles",
 };
 
 static int compute = COMPUTE_DELTA_ABS;
@@ -234,6 +239,8 @@ static int setup_compute(const struct option *opt, const char *str,
 	for (i = 0; i < COMPUTE_MAX; i++)
 		if (!strcmp(cstr, compute_names[i])) {
 			*cp = i;
+			if (i == COMPUTE_CYCLES)
+				break;
 			return setup_compute_opt(option);
 		}
 
@@ -336,6 +343,31 @@ static int formula_fprintf(struct hist_entry *he, struct hist_entry *pair,
 	return -1;
 }
 
+static void *block_hist_zalloc(size_t size)
+{
+	struct block_hist *bh;
+
+	bh = zalloc(size + sizeof(*bh));
+	if (!bh)
+		return NULL;
+
+	return &bh->he;
+}
+
+static void block_hist_free(void *he)
+{
+	struct block_hist *bh;
+
+	bh = container_of(he, struct block_hist, he);
+	hists__delete_entries(&bh->block_hists);
+	free(bh);
+}
+
+struct hist_entry_ops block_hist_ops = {
+	.new    = block_hist_zalloc,
+	.free   = block_hist_free,
+};
+
 static int diff__process_sample_event(struct perf_tool *tool,
 				      union perf_event *event,
 				      struct perf_sample *sample,
@@ -363,9 +395,22 @@ static int diff__process_sample_event(struct perf_tool *tool,
 		goto out_put;
 	}
 
-	if (!hists__add_entry(hists, &al, NULL, NULL, NULL, sample, true)) {
-		pr_warning("problem incrementing symbol period, skipping event\n");
-		goto out_put;
+	if (compute != COMPUTE_CYCLES) {
+		if (!hists__add_entry(hists, &al, NULL, NULL, NULL, sample,
+				      true)) {
+			pr_warning("problem incrementing symbol period, "
+				   "skipping event\n");
+			goto out_put;
+		}
+	} else {
+		if (!hists__add_entry_ops(hists, &block_hist_ops, &al, NULL,
+					  NULL, NULL, sample, true)) {
+			pr_warning("problem incrementing symbol period, "
+				   "skipping event\n");
+			goto out_put;
+		}
+
+		hist__account_cycles(sample->branch_stack, &al, sample, false);
 	}
 
 	/*
@@ -475,6 +520,146 @@ static void hists__baseline_only(struct hists *hists)
 	}
 }
 
+static int64_t block_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+			 struct hist_entry *left, struct hist_entry *right)
+{
+	struct block_info *bi_l = left->block_info;
+	struct block_info *bi_r = right->block_info;
+	int cmp;
+
+	if (!bi_l->sym || !bi_r->sym) {
+		if (!bi_l->sym && !bi_r->sym)
+			return 0;
+		else if (!bi_l->sym)
+			return -1;
+		else
+			return 1;
+	}
+
+	if (bi_l->sym == bi_r->sym) {
+		if (bi_l->start == bi_r->start) {
+			if (bi_l->end == bi_r->end)
+				return 0;
+			else
+				return (int64_t)(bi_r->end - bi_l->end);
+		} else
+			return (int64_t)(bi_r->start - bi_l->start);
+	} else {
+		cmp = strcmp(bi_l->sym->name, bi_r->sym->name);
+		return cmp;
+	}
+
+	if (bi_l->sym->start != bi_r->sym->start)
+		return (int64_t)(bi_r->sym->start - bi_l->sym->start);
+
+	return (int64_t)(bi_r->sym->end - bi_l->sym->end);
+}
+
+static int64_t block_cycles_diff_cmp(struct hist_entry *left,
+				     struct hist_entry *right)
+{
+	bool pairs_left  = hist_entry__has_pairs(left);
+	bool pairs_right = hist_entry__has_pairs(right);
+	s64 l, r;
+
+	if (!pairs_left && !pairs_right)
+		return 0;
+
+	l = labs(left->diff.cycles);
+	r = labs(right->diff.cycles);
+	return r - l;
+}
+
+static int64_t block_sort(struct perf_hpp_fmt *fmt __maybe_unused,
+			  struct hist_entry *left, struct hist_entry *right)
+{
+	return block_cycles_diff_cmp(right, left);
+}
+
+static void init_block_hist(struct block_hist *bh)
+{
+	__hists__init(&bh->block_hists, &bh->block_list);
+	perf_hpp_list__init(&bh->block_list);
+
+	INIT_LIST_HEAD(&bh->block_fmt.list);
+	INIT_LIST_HEAD(&bh->block_fmt.sort_list);
+	bh->block_fmt.cmp = block_cmp;
+	bh->block_fmt.sort = block_sort;
+	perf_hpp_list__register_sort_field(&bh->block_list,
+					   &bh->block_fmt);
+	bh->valid = true;
+}
+
+static void init_block_info(struct block_info *bi, struct symbol *sym,
+			    struct cyc_hist *ch, int offset)
+{
+	bi->sym = sym;
+	bi->start = ch->start;
+	bi->end = offset;
+	bi->cycles = ch->cycles;
+	bi->cycles_aggr = ch->cycles_aggr;
+	bi->num = ch->num;
+	bi->num_aggr = ch->num_aggr;
+}
+
+static void *block_entry_zalloc(size_t size)
+{
+	return zalloc(size + sizeof(struct hist_entry));
+}
+
+static void block_entry_free(void *he)
+{
+	struct block_info *bi = ((struct hist_entry *)he)->block_info;
+
+	block_info__put(bi);
+	free(he);
+}
+
+struct hist_entry_ops block_entry_ops = {
+	.new    = block_entry_zalloc,
+	.free   = block_entry_free,
+};
+
+static int process_block_per_sym(struct hist_entry *he)
+{
+	struct annotation *notes;
+	struct cyc_hist *ch;
+	struct block_hist *bh;
+
+	if (!he->ms.map || !he->ms.sym)
+		return 0;
+
+	notes = symbol__annotation(he->ms.sym);
+	if (!notes || !notes->src || !notes->src->cycles_hist)
+		return 0;
+
+	bh = container_of(he, struct block_hist, he);
+	init_block_hist(bh);
+
+	ch = notes->src->cycles_hist;
+	for (unsigned int i = 0; i < symbol__size(he->ms.sym); i++) {
+		if (ch[i].num_aggr) {
+			struct block_info *bi;
+			struct hist_entry *he_block;
+
+			bi = block_info__new();
+			if (!bi)
+				return -1;
+
+			init_block_info(bi, he->ms.sym, &ch[i], i);
+			he_block = hists__add_entry_block(&bh->block_hists,
+							  &block_entry_ops,
+							  &dummy_al, bi);
+			if (!he_block) {
+				block_info__put(bi);
+				return -1;
+			}
+		}
+	}
+
+	return 0;
+}
+
 static void hists__precompute(struct hists *hists)
 {
 	struct rb_root_cached *root;
@@ -494,6 +679,9 @@ static void hists__precompute(struct hists *hists)
 		he   = rb_entry(next, struct hist_entry, rb_node_in);
 		next = rb_next(&he->rb_node_in);
 
+		if (compute == COMPUTE_CYCLES)
+			process_block_per_sym(he);
+
 		data__for_each_file_new(i, d) {
 			pair = get_pair_data(he, d);
 			if (!pair)
@@ -510,6 +698,9 @@ static void hists__precompute(struct hists *hists)
 			case COMPUTE_WEIGHTED_DIFF:
 				compute_wdiff(he, pair);
 				break;
+			case COMPUTE_CYCLES:
+				process_block_per_sym(pair);
+				break;
 			default:
 				BUG_ON(1);
 			}
@@ -1411,6 +1602,13 @@ static int ui_init(void)
 	case COMPUTE_DELTA_ABS:
 		fmt->sort = hist_entry__cmp_delta_abs_idx;
 		break;
+	case COMPUTE_CYCLES:
+		/*
+		 * Should set since 'fmt->sort' is called without
+		 * checking valid during sorting
+		 */
+		fmt->sort = hist_entry__cmp_nop;
+		break;
 	default:
 		BUG_ON(1);
 	}
@@ -1507,6 +1705,8 @@ int cmd_diff(int argc, const char **argv)
 	if (quiet)
 		perf_quiet_option();
 
+	symbol__annotation_init();
+
 	if (symbol__init(NULL) < 0)
 		return -1;
 
@@ -1516,6 +1716,9 @@ int cmd_diff(int argc, const char **argv)
 	if (check_file_brstack() < 0)
 		return -1;
 
+	if (compute == COMPUTE_CYCLES && !pdiff.has_br_stack)
+		return -1;
+
 	if (ui_init() < 0)
 		return -1;
 
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 43623fa..a0f2321 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -79,6 +79,9 @@ struct hist_entry_diff {
 
 		/* HISTC_WEIGHTED_DIFF */
 		s64	wdiff;
+
+		/* PERF_HPP_DIFF__CYCLES */
+		s64	cycles;
 	};
 };
 
@@ -286,6 +289,15 @@ struct sort_entry {
 	u8	se_width_idx;
 };
 
+struct block_hist {
+	struct hists		block_hists;
+	struct perf_hpp_list	block_list;
+	struct perf_hpp_fmt	block_fmt;
+	int			block_idx;
+	bool			valid;
+	struct hist_entry	he;
+};
+
 extern struct sort_entry sort_thread;
 extern struct list_head hist_entry__sort_list;
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v4 5/7] perf diff: Link same basic blocks among different data
  2019-06-20 14:36 [PATCH v4 0/7] perf diff: diff cycles at basic block level Jin Yao
                   ` (3 preceding siblings ...)
  2019-06-20 14:36 ` [PATCH v4 4/7] perf diff: Use hists to manage basic blocks per symbol Jin Yao
@ 2019-06-20 14:36 ` Jin Yao
  2019-06-20 14:36 ` [PATCH v4 6/7] perf diff: Print the basic block cycles diff Jin Yao
  2019-06-20 14:36 ` [PATCH v4 7/7] perf diff: Documentation -c cycles option Jin Yao
  6 siblings, 0 replies; 13+ messages in thread
From: Jin Yao @ 2019-06-20 14:36 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

The target is to compare the performance difference (cycles
diff) for the same basic blocks in different data files.

The same basic block means same function, same start address
and same end address. This patch finds the same basic blocks
from different data files and link them together and resort
by the cycles diff.

 v3:
 ---
 The block stuffs are maintained by new structure 'block_hist',
 so this patch is update accordingly.

 v2:
 ---
 Since now the basic block hists is changed to per symbol,
 the patch only links the basic block hists for the same
 symbol in different data files.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-diff.c | 90 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 90 insertions(+)

diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index c6ec84d..5f5a9a9 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -660,6 +660,85 @@ static int process_block_per_sym(struct hist_entry *he)
 	return 0;
 }
 
+static int block_pair_cmp(struct hist_entry *a, struct hist_entry *b)
+{
+	struct block_info *bi_a = a->block_info;
+	struct block_info *bi_b = b->block_info;
+	int cmp;
+
+	if (!bi_a->sym || !bi_b->sym)
+		return -1;
+
+	if (bi_a->sym->name && bi_b->sym->name) {
+		cmp = strcmp(bi_a->sym->name, bi_b->sym->name);
+		if ((!cmp) && (bi_a->start == bi_b->start) &&
+		    (bi_a->end == bi_b->end)) {
+			return 0;
+		}
+	}
+
+	return -1;
+}
+
+static struct hist_entry *get_block_pair(struct hist_entry *he,
+					 struct hists *hists_pair)
+{
+	struct rb_root_cached *root = hists_pair->entries_in;
+	struct rb_node *next = rb_first_cached(root);
+	int cmp;
+
+	while (next != NULL) {
+		struct hist_entry *he_pair = rb_entry(next, struct hist_entry,
+						      rb_node_in);
+
+		next = rb_next(&he_pair->rb_node_in);
+
+		cmp = block_pair_cmp(he_pair, he);
+		if (!cmp)
+			return he_pair;
+	}
+
+	return NULL;
+}
+
+static void compute_cycles_diff(struct hist_entry *he,
+				struct hist_entry *pair)
+{
+	pair->diff.computed = true;
+	if (pair->block_info->num && he->block_info->num) {
+		pair->diff.cycles =
+			pair->block_info->cycles_aggr / pair->block_info->num_aggr -
+			he->block_info->cycles_aggr / he->block_info->num_aggr;
+	}
+}
+
+static void block_hists_match(struct hists *hists_base,
+			      struct hists *hists_pair)
+{
+	struct rb_root_cached *root = hists_base->entries_in;
+	struct rb_node *next = rb_first_cached(root);
+
+	while (next != NULL) {
+		struct hist_entry *he = rb_entry(next, struct hist_entry,
+						 rb_node_in);
+		struct hist_entry *pair = get_block_pair(he, hists_pair);
+
+		next = rb_next(&he->rb_node_in);
+
+		if (pair) {
+			hist_entry__add_pair(pair, he);
+			compute_cycles_diff(he, pair);
+		}
+	}
+}
+
+static int filter_cb(struct hist_entry *he, void *arg __maybe_unused)
+{
+	/* Skip the calculation of column length in output_resort */
+	he->filtered = true;
+	return 0;
+}
+
 static void hists__precompute(struct hists *hists)
 {
 	struct rb_root_cached *root;
@@ -672,6 +751,7 @@ static void hists__precompute(struct hists *hists)
 
 	next = rb_first_cached(root);
 	while (next != NULL) {
+		struct block_hist *bh, *pair_bh;
 		struct hist_entry *he, *pair;
 		struct data__file *d;
 		int i;
@@ -700,6 +780,16 @@ static void hists__precompute(struct hists *hists)
 				break;
 			case COMPUTE_CYCLES:
 				process_block_per_sym(pair);
+				bh = container_of(he, struct block_hist, he);
+				pair_bh = container_of(pair, struct block_hist,
+						       he);
+
+				if (bh->valid && pair_bh->valid) {
+					block_hists_match(&bh->block_hists,
+							  &pair_bh->block_hists);
+					hists__output_resort_cb(&pair_bh->block_hists,
+								NULL, filter_cb);
+				}
 				break;
 			default:
 				BUG_ON(1);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v4 6/7] perf diff: Print the basic block cycles diff
  2019-06-20 14:36 [PATCH v4 0/7] perf diff: diff cycles at basic block level Jin Yao
                   ` (4 preceding siblings ...)
  2019-06-20 14:36 ` [PATCH v4 5/7] perf diff: Link same basic blocks among different data Jin Yao
@ 2019-06-20 14:36 ` Jin Yao
  2019-06-20 14:36 ` [PATCH v4 7/7] perf diff: Documentation -c cycles option Jin Yao
  6 siblings, 0 replies; 13+ messages in thread
From: Jin Yao @ 2019-06-20 14:36 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

 $ perf record -b ./div
 $ perf record -b ./div

Following is the default perf diff output

 $ perf diff

 # Event 'cycles'
 #
 # Baseline  Delta Abs  Shared Object     Symbol
 # ........  .........  ................  ..................................
 #
     48.75%     +0.33%  div               [.] main
      8.21%     -0.20%  div               [.] compute_flag
     19.02%     -0.12%  libc-2.23.so      [.] __random_r
     16.17%     -0.09%  libc-2.23.so      [.] __random
      2.27%     -0.03%  div               [.] rand@plt
                +0.02%  [i915]            [k] gen8_irq_handler
      5.52%     +0.02%  libc-2.23.so      [.] rand

This patch creates a new computation selection 'cycles'.

 $ perf diff -c cycles

 # Event 'cycles'
 #
 # Baseline                                       [Program Block Range] Cycles Diff  Shared Object     Symbol
 # ........  ......................................................................  ................  ..................................
 #
     48.75%                                             [div.c:42 -> div.c:45]  147  div               [.] main
     48.75%                                             [div.c:31 -> div.c:40]    4  div               [.] main
     48.75%                                             [div.c:40 -> div.c:40]    0  div               [.] main
     48.75%                                             [div.c:42 -> div.c:42]    0  div               [.] main
     48.75%                                             [div.c:42 -> div.c:44]    0  div               [.] main
     19.02%                                 [random_r.c:357 -> random_r.c:360]    0  libc-2.23.so      [.] __random_r
     19.02%                                 [random_r.c:357 -> random_r.c:373]    0  libc-2.23.so      [.] __random_r
     19.02%                                 [random_r.c:357 -> random_r.c:376]    0  libc-2.23.so      [.] __random_r
     19.02%                                 [random_r.c:357 -> random_r.c:380]    0  libc-2.23.so      [.] __random_r
     19.02%                                 [random_r.c:357 -> random_r.c:392]    0  libc-2.23.so      [.] __random_r
     16.17%                                     [random.c:288 -> random.c:291]    0  libc-2.23.so      [.] __random
     16.17%                                     [random.c:288 -> random.c:291]    0  libc-2.23.so      [.] __random
     16.17%                                     [random.c:288 -> random.c:295]    0  libc-2.23.so      [.] __random
     16.17%                                     [random.c:288 -> random.c:297]    0  libc-2.23.so      [.] __random
     16.17%                                     [random.c:291 -> random.c:291]    0  libc-2.23.so      [.] __random
     16.17%                                     [random.c:293 -> random.c:293]    0  libc-2.23.so      [.] __random
      8.21%                                             [div.c:22 -> div.c:22]  148  div               [.] compute_flag
      8.21%                                             [div.c:22 -> div.c:25]    0  div               [.] compute_flag
      8.21%                                             [div.c:27 -> div.c:28]    0  div               [.] compute_flag
      5.52%                                           [rand.c:26 -> rand.c:27]    0  libc-2.23.so      [.] rand
      5.52%                                           [rand.c:26 -> rand.c:28]    0  libc-2.23.so      [.] rand
      2.27%                                         [rand@plt+0 -> rand@plt+0]    0  div               [.] rand@plt
      0.01%                                 [entry_64.S:694 -> entry_64.S:694]   16  [kernel.vmlinux]  [k] native_irq_return_iret
      0.00%                                       [fair.c:7676 -> fair.c:7665]  162  [kernel.vmlinux]  [k] update_blocked_averages

"[Program Block Range]" indicates the range of program basic block
(start -> end). If we can find the source line it prints the source
line otherwise it prints the symbol+offset instead.

 v4:
 ---
 Use source lines or symbol+offset to indicate the basic block. It should
 be easier to understand.

 v3:
 ---
 Cast 'struct hist_entry' to 'struct block_hist' in hist_entry__block_fprintf.
 Use symbol_conf.report_block to check if executing hist_entry__block_fprintf.

 v2:
 ---
 Keep standard perf diff format and display the 'Baseline' and
 'Shared Object'.

The output is sorted by "Baseline" and the basic blocks in the same
function are sorted by cycles diff.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-diff.c     | 80 ++++++++++++++++++++++++++++++++++++++++---
 tools/perf/ui/stdio/hist.c    | 27 +++++++++++++++
 tools/perf/util/hist.c        | 18 ++++++++++
 tools/perf/util/hist.h        |  3 ++
 tools/perf/util/srcline.c     |  4 ++-
 tools/perf/util/symbol_conf.h |  4 ++-
 6 files changed, 130 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 5f5a9a9..d909c82 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -21,6 +21,7 @@
 #include "util/config.h"
 #include "util/time-utils.h"
 #include "util/annotate.h"
+#include "util/map.h"
 
 #include <errno.h>
 #include <inttypes.h>
@@ -46,6 +47,7 @@ enum {
 	PERF_HPP_DIFF__WEIGHTED_DIFF,
 	PERF_HPP_DIFF__FORMULA,
 	PERF_HPP_DIFF__DELTA_ABS,
+	PERF_HPP_DIFF__CYCLES,
 
 	PERF_HPP_DIFF__MAX_INDEX
 };
@@ -114,6 +116,7 @@ static int compute_2_hpp[COMPUTE_MAX] = {
 	[COMPUTE_DELTA_ABS]	= PERF_HPP_DIFF__DELTA_ABS,
 	[COMPUTE_RATIO]		= PERF_HPP_DIFF__RATIO,
 	[COMPUTE_WEIGHTED_DIFF]	= PERF_HPP_DIFF__WEIGHTED_DIFF,
+	[COMPUTE_CYCLES]	= PERF_HPP_DIFF__CYCLES,
 };
 
 #define MAX_COL_WIDTH 70
@@ -152,6 +155,10 @@ static struct header_column {
 	[PERF_HPP_DIFF__FORMULA] = {
 		.name  = "Formula",
 		.width = MAX_COL_WIDTH,
+	},
+	[PERF_HPP_DIFF__CYCLES] = {
+		.name  = "[Program Block Range] Cycles Diff",
+		.width = 70,
 	}
 };
 
@@ -239,8 +246,6 @@ static int setup_compute(const struct option *opt, const char *str,
 	for (i = 0; i < COMPUTE_MAX; i++)
 		if (!strcmp(cstr, compute_names[i])) {
 			*cp = i;
-			if (i == COMPUTE_CYCLES)
-				break;
 			return setup_compute_opt(option);
 		}
 
@@ -1002,6 +1007,9 @@ static void hists__process(struct hists *hists)
 	hists__precompute(hists);
 	hists__output_resort(hists, NULL);
 
+	if (compute == COMPUTE_CYCLES)
+		symbol_conf.report_block = true;
+
 	hists__fprintf(hists, !quiet, 0, 0, 0, stdout,
 		       !symbol_conf.use_callchain);
 }
@@ -1257,7 +1265,7 @@ static const struct option options[] = {
 	OPT_BOOLEAN('b', "baseline-only", &show_baseline_only,
 		    "Show only items with match in baseline"),
 	OPT_CALLBACK('c', "compute", &compute,
-		     "delta,delta-abs,ratio,wdiff:w1,w2 (default delta-abs)",
+		     "delta,delta-abs,ratio,wdiff:w1,w2 (default delta-abs),cycles",
 		     "Entries differential computation selection",
 		     setup_compute),
 	OPT_BOOLEAN('p', "period", &show_period,
@@ -1335,6 +1343,49 @@ static int hpp__entry_baseline(struct hist_entry *he, char *buf, size_t size)
 	return ret;
 }
 
+static int cycles_printf(struct hist_entry *he, struct hist_entry *pair,
+			 struct perf_hpp *hpp, int width)
+{
+	struct block_hist *bh = container_of(he, struct block_hist, he);
+	struct block_hist *bh_pair = container_of(pair, struct block_hist, he);
+	struct hist_entry *block_he;
+	struct block_info *bi;
+	char buf[128];
+	char *start_line, *end_line;
+
+	block_he = hists__get_entry(&bh_pair->block_hists, bh->block_idx);
+	if (!block_he) {
+		hpp->skip = true;
+		return 0;
+	}
+
+	/*
+	 * Avoid printing the warning "addr2line_init failed for ..."
+	 */
+	symbol_conf.disable_add2line_warn = true;
+
+	bi = block_he->block_info;
+
+	start_line = map__srcline(he->ms.map, bi->sym->start + bi->start,
+				  he->ms.sym);
+
+	end_line = map__srcline(he->ms.map, bi->sym->start + bi->end,
+				he->ms.sym);
+
+	if ((start_line != SRCLINE_UNKNOWN) && (end_line != SRCLINE_UNKNOWN)) {
+		scnprintf(buf, sizeof(buf), "[%s -> %s] %4ld",
+			  start_line, end_line, block_he->diff.cycles);
+	} else {
+		scnprintf(buf, sizeof(buf), "[%7lx -> %7lx] %4ld",
+			  bi->start, bi->end, block_he->diff.cycles);
+	}
+
+	free_srcline(start_line);
+	free_srcline(end_line);
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", width, buf);
+}
+
 static int __hpp__color_compare(struct perf_hpp_fmt *fmt,
 				struct perf_hpp *hpp, struct hist_entry *he,
 				int comparison_method)
@@ -1346,8 +1397,17 @@ static int __hpp__color_compare(struct perf_hpp_fmt *fmt,
 	s64 wdiff;
 	char pfmt[20] = " ";
 
-	if (!pair)
+	if (!pair) {
+		if (comparison_method == COMPUTE_CYCLES) {
+			struct block_hist *bh;
+
+			bh = container_of(he, struct block_hist, he);
+			if (bh->block_idx)
+				hpp->skip = true;
+		}
+
 		goto no_print;
+	}
 
 	switch (comparison_method) {
 	case COMPUTE_DELTA:
@@ -1382,6 +1442,8 @@ static int __hpp__color_compare(struct perf_hpp_fmt *fmt,
 		return color_snprintf(hpp->buf, hpp->size,
 				get_percent_color(wdiff),
 				pfmt, wdiff);
+	case COMPUTE_CYCLES:
+		return cycles_printf(he, pair, hpp, dfmt->header_width);
 	default:
 		BUG_ON(1);
 	}
@@ -1411,6 +1473,12 @@ static int hpp__color_wdiff(struct perf_hpp_fmt *fmt,
 	return __hpp__color_compare(fmt, hpp, he, COMPUTE_WEIGHTED_DIFF);
 }
 
+static int hpp__color_cycles(struct perf_hpp_fmt *fmt,
+			     struct perf_hpp *hpp, struct hist_entry *he)
+{
+	return __hpp__color_compare(fmt, hpp, he, COMPUTE_CYCLES);
+}
+
 static void
 hpp__entry_unpair(struct hist_entry *he, int idx, char *buf, size_t size)
 {
@@ -1612,6 +1680,10 @@ static void data__hpp_register(struct data__file *d, int idx)
 		fmt->color = hpp__color_delta;
 		fmt->sort  = hist_entry__cmp_delta_abs;
 		break;
+	case PERF_HPP_DIFF__CYCLES:
+		fmt->color = hpp__color_cycles;
+		fmt->sort  = hist_entry__cmp_nop;
+		break;
 	default:
 		fmt->sort  = hist_entry__cmp_nop;
 		break;
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index a60f299..4e74bb7 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -531,6 +531,30 @@ static int hist_entry__hierarchy_fprintf(struct hist_entry *he,
 	return printed;
 }
 
+static int hist_entry__block_fprintf(struct hist_entry *he,
+				     char *bf, size_t size,
+				     FILE *fp)
+{
+	struct block_hist *bh = container_of(he, struct block_hist, he);
+	int ret = 0;
+
+	for (unsigned int i = 0; i < bh->block_hists.nr_entries; i++) {
+		struct perf_hpp hpp = {
+			.buf		= bf,
+			.size		= size,
+			.skip		= false,
+		};
+
+		bh->block_idx = i;
+		hist_entry__snprintf(he, &hpp);
+
+		if (!hpp.skip)
+			ret += fprintf(fp, "%s\n", bf);
+	}
+
+	return ret;
+}
+
 static int hist_entry__fprintf(struct hist_entry *he, size_t size,
 			       char *bf, size_t bfsz, FILE *fp,
 			       bool ignore_callchains)
@@ -550,6 +574,9 @@ static int hist_entry__fprintf(struct hist_entry *he, size_t size,
 	if (symbol_conf.report_hierarchy)
 		return hist_entry__hierarchy_fprintf(he, &hpp, hists, fp);
 
+	if (symbol_conf.report_block)
+		return hist_entry__block_fprintf(he, bf, size, fp);
+
 	hist_entry__snprintf(he, &hpp);
 
 	ret = fprintf(fp, "%s\n", bf);
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 680ad93..2fb9cb2 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -376,6 +376,24 @@ void hists__delete_entries(struct hists *hists)
 	}
 }
 
+struct hist_entry *hists__get_entry(struct hists *hists, int idx)
+{
+	struct rb_node *next = rb_first_cached(&hists->entries);
+	struct hist_entry *n;
+	int i = 0;
+
+	while (next) {
+		n = rb_entry(next, struct hist_entry, rb_node);
+		if (i == idx)
+			return n;
+
+		next = rb_next(&n->rb_node);
+		i++;
+	}
+
+	return NULL;
+}
+
 /*
  * histogram, sorted on item, collects periods
  */
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index c8f7d66..9124b54 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -184,6 +184,8 @@ void hists__decay_entries(struct hists *hists, bool zap_user, bool zap_kernel);
 void hists__delete_entries(struct hists *hists);
 void hists__output_recalc_col_len(struct hists *hists, int max_rows);
 
+struct hist_entry *hists__get_entry(struct hists *hists, int idx);
+
 u64 hists__total_period(struct hists *hists);
 void hists__reset_stats(struct hists *hists);
 void hists__inc_stats(struct hists *hists, struct hist_entry *h);
@@ -249,6 +251,7 @@ struct perf_hpp {
 	size_t size;
 	const char *sep;
 	void *ptr;
+	bool skip;
 };
 
 struct perf_hpp_fmt {
diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index 10ca153..0675ec9 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -10,6 +10,7 @@
 #include "util/util.h"
 #include "util/debug.h"
 #include "util/callchain.h"
+#include "util/symbol_conf.h"
 #include "srcline.h"
 #include "string2.h"
 #include "symbol.h"
@@ -287,7 +288,8 @@ static int addr2line(const char *dso_name, u64 addr,
 	}
 
 	if (a2l == NULL) {
-		pr_warning("addr2line_init failed for %s\n", dso_name);
+		if (!symbol_conf.disable_add2line_warn)
+			pr_warning("addr2line_init failed for %s\n", dso_name);
 		return 0;
 	}
 
diff --git a/tools/perf/util/symbol_conf.h b/tools/perf/util/symbol_conf.h
index 382ba63..e688078 100644
--- a/tools/perf/util/symbol_conf.h
+++ b/tools/perf/util/symbol_conf.h
@@ -39,7 +39,9 @@ struct symbol_conf {
 			hide_unresolved,
 			raw_trace,
 			report_hierarchy,
-			inline_name;
+			report_block,
+			inline_name,
+			disable_add2line_warn;
 	const char	*vmlinux_name,
 			*kallsyms_name,
 			*source_prefix,
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v4 7/7] perf diff: Documentation -c cycles option
  2019-06-20 14:36 [PATCH v4 0/7] perf diff: diff cycles at basic block level Jin Yao
                   ` (5 preceding siblings ...)
  2019-06-20 14:36 ` [PATCH v4 6/7] perf diff: Print the basic block cycles diff Jin Yao
@ 2019-06-20 14:36 ` Jin Yao
  6 siblings, 0 replies; 13+ messages in thread
From: Jin Yao @ 2019-06-20 14:36 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

Documentation the new computation selection 'cycles'.

 v4:
 ---
 Change the column 'Block cycles diff [start:end]' to
 '[Program Block Range] Cycles Diff'

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/Documentation/perf-diff.txt | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-diff.txt b/tools/perf/Documentation/perf-diff.txt
index facd91e..d5cc15e 100644
--- a/tools/perf/Documentation/perf-diff.txt
+++ b/tools/perf/Documentation/perf-diff.txt
@@ -90,9 +90,10 @@ OPTIONS
 
 -c::
 --compute::
-        Differential computation selection - delta, ratio, wdiff, delta-abs
-        (default is delta-abs).  Default can be changed using diff.compute
-        config option.  See COMPARISON METHODS section for more info.
+        Differential computation selection - delta, ratio, wdiff, cycles,
+        delta-abs (default is delta-abs).  Default can be changed using
+        diff.compute config option.  See COMPARISON METHODS section for
+        more info.
 
 -p::
 --period::
@@ -280,6 +281,16 @@ If specified the 'Weighted diff' column is displayed with value 'd' computed as:
     - WEIGHT-A being the weight of the data file
     - WEIGHT-B being the weight of the baseline data file
 
+cycles
+~~~~~~
+If specified the '[Program Block Range] Cycles Diff' column is displayed.
+It displays the cycles difference of same program basic block amongst
+two perf.data. The program basic block is the code between two branches.
+
+'[Program Block Range]' indicates the range of a program basic block.
+Source line is reported if it can be found otherwise uses symbol+offset
+instead.
+
 SEE ALSO
 --------
 linkperf:perf-record[1], linkperf:perf-report[1]
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 4/7] perf diff: Use hists to manage basic blocks per symbol
  2019-06-20 14:36 ` [PATCH v4 4/7] perf diff: Use hists to manage basic blocks per symbol Jin Yao
@ 2019-06-24  7:57   ` Jiri Olsa
  2019-06-24 14:49     ` Jin, Yao
  2019-06-25  9:11   ` Jiri Olsa
  1 sibling, 1 reply; 13+ messages in thread
From: Jiri Olsa @ 2019-06-24  7:57 UTC (permalink / raw)
  To: Jin Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Thu, Jun 20, 2019 at 10:36:39PM +0800, Jin Yao wrote:

SNIP

> +
> +static void *block_entry_zalloc(size_t size)
> +{
> +	return zalloc(size + sizeof(struct hist_entry));
> +}
> +
> +static void block_entry_free(void *he)
> +{
> +	struct block_info *bi = ((struct hist_entry *)he)->block_info;
> +
> +	block_info__put(bi);
> +	free(he);
> +}
> +
> +struct hist_entry_ops block_entry_ops = {
> +	.new    = block_entry_zalloc,
> +	.free   = block_entry_free,
> +};

hum, so there's already block_hist_ops moving that stuff into 'struct block_hist',
which is great, but why don't we have 'struct block_entry' in here? that would
keep the 'struct block_info'

thanks,
jirka

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 4/7] perf diff: Use hists to manage basic blocks per symbol
  2019-06-24  7:57   ` Jiri Olsa
@ 2019-06-24 14:49     ` Jin, Yao
  2019-06-25  8:50       ` Jiri Olsa
  0 siblings, 1 reply; 13+ messages in thread
From: Jin, Yao @ 2019-06-24 14:49 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 6/24/2019 3:57 PM, Jiri Olsa wrote:
> On Thu, Jun 20, 2019 at 10:36:39PM +0800, Jin Yao wrote:
> 
> SNIP
> 
>> +
>> +static void *block_entry_zalloc(size_t size)
>> +{
>> +	return zalloc(size + sizeof(struct hist_entry));
>> +}
>> +
>> +static void block_entry_free(void *he)
>> +{
>> +	struct block_info *bi = ((struct hist_entry *)he)->block_info;
>> +
>> +	block_info__put(bi);
>> +	free(he);
>> +}
>> +
>> +struct hist_entry_ops block_entry_ops = {
>> +	.new    = block_entry_zalloc,
>> +	.free   = block_entry_free,
>> +};
> 
> hum, so there's already block_hist_ops moving that stuff into 'struct block_hist',
> which is great, but why don't we have 'struct block_entry' in here? that would
> keep the 'struct block_info'
> 
> thanks,
> jirka
> 

Hi Jiri,

If I define 'struct block_entry' as following and cast 'he' to 'struct 
block_entry' in some places, such as in block_cmp(), we can get the 
'struct block_entry'.

struct block_entry {
	struct block_info bi;
	struct hist_entry he;
};

But I don't know when I can set the 'bi' of 'struct block_entry'. Before 
or after calling hists__add_entry_xxx()? Before calling 
hists__add_entry_xxx(), we don't know the hist_entry. After calling 
hists__add_entry_xxx(), actually the hist_entry__cmp doesn't work (no bi ).

That's why I create block_info in hist_entry. Maybe I misunderstand what 
your suggested, correct me if I'm wrong.

Thanks
Jin Yao

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 4/7] perf diff: Use hists to manage basic blocks per symbol
  2019-06-24 14:49     ` Jin, Yao
@ 2019-06-25  8:50       ` Jiri Olsa
  0 siblings, 0 replies; 13+ messages in thread
From: Jiri Olsa @ 2019-06-25  8:50 UTC (permalink / raw)
  To: Jin, Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Mon, Jun 24, 2019 at 10:49:25PM +0800, Jin, Yao wrote:
> 
> 
> On 6/24/2019 3:57 PM, Jiri Olsa wrote:
> > On Thu, Jun 20, 2019 at 10:36:39PM +0800, Jin Yao wrote:
> > 
> > SNIP
> > 
> > > +
> > > +static void *block_entry_zalloc(size_t size)
> > > +{
> > > +	return zalloc(size + sizeof(struct hist_entry));
> > > +}
> > > +
> > > +static void block_entry_free(void *he)
> > > +{
> > > +	struct block_info *bi = ((struct hist_entry *)he)->block_info;
> > > +
> > > +	block_info__put(bi);
> > > +	free(he);
> > > +}
> > > +
> > > +struct hist_entry_ops block_entry_ops = {
> > > +	.new    = block_entry_zalloc,
> > > +	.free   = block_entry_free,
> > > +};
> > 
> > hum, so there's already block_hist_ops moving that stuff into 'struct block_hist',
> > which is great, but why don't we have 'struct block_entry' in here? that would
> > keep the 'struct block_info'
> > 
> > thanks,
> > jirka
> > 
> 
> Hi Jiri,
> 
> If I define 'struct block_entry' as following and cast 'he' to 'struct
> block_entry' in some places, such as in block_cmp(), we can get the 'struct
> block_entry'.
> 
> struct block_entry {
> 	struct block_info bi;
> 	struct hist_entry he;
> };
> 
> But I don't know when I can set the 'bi' of 'struct block_entry'. Before or
> after calling hists__add_entry_xxx()? Before calling hists__add_entry_xxx(),
> we don't know the hist_entry. After calling hists__add_entry_xxx(), actually
> the hist_entry__cmp doesn't work (no bi ).
> 
> That's why I create block_info in hist_entry. Maybe I misunderstand what
> your suggested, correct me if I'm wrong.

ah you need the data already in place when calling hists__add_entry_block,
because of the block_cmp sorting function.. ok, it's the same tyhing we
do in c2c with mem_info.. I guess we can survive one pointer

I'll check the rest

thanks,
jirka

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 4/7] perf diff: Use hists to manage basic blocks per symbol
  2019-06-20 14:36 ` [PATCH v4 4/7] perf diff: Use hists to manage basic blocks per symbol Jin Yao
  2019-06-24  7:57   ` Jiri Olsa
@ 2019-06-25  9:11   ` Jiri Olsa
  2019-06-26  0:57     ` Jin, Yao
  1 sibling, 1 reply; 13+ messages in thread
From: Jiri Olsa @ 2019-06-25  9:11 UTC (permalink / raw)
  To: Jin Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Thu, Jun 20, 2019 at 10:36:39PM +0800, Jin Yao wrote:

SNIP

> +
> +static void *block_entry_zalloc(size_t size)
> +{
> +	return zalloc(size + sizeof(struct hist_entry));
> +}
> +
> +static void block_entry_free(void *he)
> +{
> +	struct block_info *bi = ((struct hist_entry *)he)->block_info;
> +
> +	block_info__put(bi);
> +	free(he);
> +}

so if we carry block_info in 'struct hist_entry' we don't need
to use our own allocation in the 2nd level hist entries.. just
for the first level to carry the new 'struct block_hists'

the block_info should be released in hist_entry__delete then,
same as for other *info pointers

the rest of the patchset looks ok to me

thanks,
jirka

> +
> +struct hist_entry_ops block_entry_ops = {
> +	.new    = block_entry_zalloc,
> +	.free   = block_entry_free,
> +};
> +
> +static int process_block_per_sym(struct hist_entry *he)
> +{
> +	struct annotation *notes;
> +	struct cyc_hist *ch;
> +	struct block_hist *bh;
> +
> +	if (!he->ms.map || !he->ms.sym)
> +		return 0;
> +
> +	notes = symbol__annotation(he->ms.sym);
> +	if (!notes || !notes->src || !notes->src->cycles_hist)
> +		return 0;
> +
> +	bh = container_of(he, struct block_hist, he);
> +	init_block_hist(bh);
> +
> +	ch = notes->src->cycles_hist;
> +	for (unsigned int i = 0; i < symbol__size(he->ms.sym); i++) {
> +		if (ch[i].num_aggr) {
> +			struct block_info *bi;
> +			struct hist_entry *he_block;
> +
> +			bi = block_info__new();
> +			if (!bi)
> +				return -1;
> +
> +			init_block_info(bi, he->ms.sym, &ch[i], i);
> +			he_block = hists__add_entry_block(&bh->block_hists,
> +							  &block_entry_ops,
> +							  &dummy_al, bi);
> +			if (!he_block) {
> +				block_info__put(bi);
> +				return -1;
> +			}
> +		}
> +	}
> +
> +	return 0;

SNIP

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 4/7] perf diff: Use hists to manage basic blocks per symbol
  2019-06-25  9:11   ` Jiri Olsa
@ 2019-06-26  0:57     ` Jin, Yao
  0 siblings, 0 replies; 13+ messages in thread
From: Jin, Yao @ 2019-06-26  0:57 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 6/25/2019 5:11 PM, Jiri Olsa wrote:
> On Thu, Jun 20, 2019 at 10:36:39PM +0800, Jin Yao wrote:
> 
> SNIP
> 
>> +
>> +static void *block_entry_zalloc(size_t size)
>> +{
>> +	return zalloc(size + sizeof(struct hist_entry));
>> +}
>> +
>> +static void block_entry_free(void *he)
>> +{
>> +	struct block_info *bi = ((struct hist_entry *)he)->block_info;
>> +
>> +	block_info__put(bi);
>> +	free(he);
>> +}
> 
> so if we carry block_info in 'struct hist_entry' we don't need
> to use our own allocation in the 2nd level hist entries.. just
> for the first level to carry the new 'struct block_hists'
> 
> the block_info should be released in hist_entry__delete then,
> same as for other *info pointers
> 
> the rest of the patchset looks ok to me
> 
> thanks,
> jirka
> 

Yes, releasing the block_info in hist_entry__delete is a better way. 
Thanks Jiri, I will update the patch.

Thanks
Jin Yao

>> +
>> +struct hist_entry_ops block_entry_ops = {
>> +	.new    = block_entry_zalloc,
>> +	.free   = block_entry_free,
>> +};
>> +
>> +static int process_block_per_sym(struct hist_entry *he)
>> +{
>> +	struct annotation *notes;
>> +	struct cyc_hist *ch;
>> +	struct block_hist *bh;
>> +
>> +	if (!he->ms.map || !he->ms.sym)
>> +		return 0;
>> +
>> +	notes = symbol__annotation(he->ms.sym);
>> +	if (!notes || !notes->src || !notes->src->cycles_hist)
>> +		return 0;
>> +
>> +	bh = container_of(he, struct block_hist, he);
>> +	init_block_hist(bh);
>> +
>> +	ch = notes->src->cycles_hist;
>> +	for (unsigned int i = 0; i < symbol__size(he->ms.sym); i++) {
>> +		if (ch[i].num_aggr) {
>> +			struct block_info *bi;
>> +			struct hist_entry *he_block;
>> +
>> +			bi = block_info__new();
>> +			if (!bi)
>> +				return -1;
>> +
>> +			init_block_info(bi, he->ms.sym, &ch[i], i);
>> +			he_block = hists__add_entry_block(&bh->block_hists,
>> +							  &block_entry_ops,
>> +							  &dummy_al, bi);
>> +			if (!he_block) {
>> +				block_info__put(bi);
>> +				return -1;
>> +			}
>> +		}
>> +	}
>> +
>> +	return 0;
> 
> SNIP
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-06-26  0:57 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-20 14:36 [PATCH v4 0/7] perf diff: diff cycles at basic block level Jin Yao
2019-06-20 14:36 ` [PATCH v4 1/7] perf util: Create block_info structure Jin Yao
2019-06-20 14:36 ` [PATCH v4 2/7] perf util: Add block_info in hist_entry Jin Yao
2019-06-20 14:36 ` [PATCH v4 3/7] perf diff: Check if all data files with branch stacks Jin Yao
2019-06-20 14:36 ` [PATCH v4 4/7] perf diff: Use hists to manage basic blocks per symbol Jin Yao
2019-06-24  7:57   ` Jiri Olsa
2019-06-24 14:49     ` Jin, Yao
2019-06-25  8:50       ` Jiri Olsa
2019-06-25  9:11   ` Jiri Olsa
2019-06-26  0:57     ` Jin, Yao
2019-06-20 14:36 ` [PATCH v4 5/7] perf diff: Link same basic blocks among different data Jin Yao
2019-06-20 14:36 ` [PATCH v4 6/7] perf diff: Print the basic block cycles diff Jin Yao
2019-06-20 14:36 ` [PATCH v4 7/7] perf diff: Documentation -c cycles option Jin Yao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).