linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v7 0/7] perf report: Support sorting all blocks by cycles
@ 2019-11-07  7:47 Jin Yao
  2019-11-07  7:47 ` [PATCH v7 1/7] perf diff: Don't use hack to skip column length calculation Jin Yao
                   ` (7 more replies)
  0 siblings, 8 replies; 22+ messages in thread
From: Jin Yao @ 2019-11-07  7:47 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

It would be useful to support sorting for all blocks by the
sampled cycles percent per block. This is useful to concentrate
on the globally hottest blocks.

This patch series implements a new option "--total-cycles" which
sorts all blocks by 'Sampled Cycles%'. The 'Sampled Cycles%' is
block sampled cycles aggregation / total sampled cycles

For example,

perf record -b ./div
perf report --total-cycles --stdio

 # To display the perf.data header info, please use --header/--header-only options.
 #
 #
 # Total Lost Samples: 0
 #
 # Samples: 2M of event 'cycles'
 # Event count (approx.): 2753248
 #
 # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                              [Program Block Range]         Shared Object
 # ...............  ..............  ...........  ..........  .................................................................  ....................
 #
            26.04%            2.8M        0.40%          18                                             [div.c:42 -> div.c:39]                   div
            15.17%            1.2M        0.16%           7                                 [random_r.c:357 -> random_r.c:380]          libc-2.27.so
             5.11%          402.0K        0.04%           2                                             [div.c:27 -> div.c:28]                   div
             4.87%          381.6K        0.04%           2                                     [random.c:288 -> random.c:291]          libc-2.27.so
             4.53%          381.0K        0.04%           2                                             [div.c:40 -> div.c:40]                   div
             3.85%          300.9K        0.02%           1                                             [div.c:22 -> div.c:25]                   div
             3.08%          241.1K        0.02%           1                                           [rand.c:26 -> rand.c:27]          libc-2.27.so
             3.06%          240.0K        0.02%           1                                     [random.c:291 -> random.c:291]          libc-2.27.so
             2.78%          215.7K        0.02%           1                                     [random.c:298 -> random.c:298]          libc-2.27.so
             2.52%          198.3K        0.02%           1                                     [random.c:293 -> random.c:293]          libc-2.27.so
             2.36%          184.8K        0.02%           1                                           [rand.c:28 -> rand.c:28]          libc-2.27.so
             2.33%          180.5K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
             2.28%          176.7K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
             2.20%          168.8K        0.02%           1                                         [rand@plt+0 -> rand@plt+0]                   div
             1.98%          158.2K        0.02%           1                                 [random_r.c:388 -> random_r.c:388]          libc-2.27.so
             1.57%          123.3K        0.02%           1                                             [div.c:42 -> div.c:44]                   div
             1.44%          116.0K        0.42%          19                                 [random_r.c:357 -> random_r.c:394]          libc-2.27.so
 ......

This patch series supports both stdio and tui. And also with the supporting
of --percent-limit.

 v7:
 ---
 Use use_browser in report__browse_block_hists and support
 reporting for both stdio and tui modes.
 
 Move block tui browser code from ui/browsers/hists.c
 to block-info.c.

 Impacted patches:
 -----------------
  perf report: Sort by sampled cycles percent per block for stdio
  perf report: Support --percent-limit for --total-cycles
  perf report: Sort by sampled cycles percent per block for tui

 v6:
 ---
 Move block displaying codes from builtin-report.c to
 block-info.c. Two new functions are created in block-info.c
 (report__browse_block_hists, report__tui_browse_block_hists).

 Impacted patches:
 -----------------
  perf report: Sort by sampled cycles percent per block for stdio
  perf report: Support --percent-limit for --total-cycles
  perf report: Sort by sampled cycles percent per block for tui

 v5:
 ---
 1. Move all block functions to block-info.c

 2. Move the code of setting ms(map+sym) in block hist_entry to
    patch 'perf util: Support block formats with compare/sort/display'.
    Because this info is needed for reporting the block range
    (i.e. source line)

 3. Fix a crash issue when tui mode is enabled.

 Impacted patches:
 -----------------
  perf util: Support block formats with compare/sort/display
  perf report: Sort by sampled cycles percent per block for stdio
  perf report: Support --percent-limit for --total-cycles
  perf report: Sort by sampled cycles percent per block for tui

 v4:
 ---
 1. Move the block collection out of block printing.

 2. Use new option '--total-cycles' to replace '-s total_cycles'

 3. Move code for skipping column length calculation to patch:
    'perf diff: Don't use hack to skip column length calculation'

 4. Some minor updates and cleanup.

 v3:
 ---
 1. Move common block info functions to block-info.h/block-info.c

 2. Remove nasty hack for skipping calculation of column length.

 3. Some minor cleanup.

 v2:
 ---
 Rebase to perf/core branch

Jin Yao (7):
  perf diff: Don't use hack to skip column length calculation
  perf util: Cleanup and refactor block info functions
  perf util: Count the total cycles of all samples
  perf util: Support block formats with compare/sort/display
  perf report: Sort by sampled cycles percent per block for stdio
  perf report: Support --percent-limit for --total-cycles
  perf report: Sort by sampled cycles percent per block for tui

 tools/perf/Documentation/perf-report.txt |  11 +
 tools/perf/builtin-annotate.c            |   2 +-
 tools/perf/builtin-diff.c                | 121 +----
 tools/perf/builtin-report.c              |  61 ++-
 tools/perf/builtin-top.c                 |   3 +-
 tools/perf/ui/browsers/hists.c           |   7 +-
 tools/perf/ui/browsers/hists.h           |   2 +
 tools/perf/ui/stdio/hist.c               |  29 +-
 tools/perf/util/Build                    |   1 +
 tools/perf/util/block-info.c             | 545 +++++++++++++++++++++++
 tools/perf/util/block-info.h             |  78 ++++
 tools/perf/util/hist.c                   |  13 +-
 tools/perf/util/hist.h                   |   3 +-
 tools/perf/util/symbol.c                 |  22 -
 tools/perf/util/symbol.h                 |  24 -
 tools/perf/util/symbol_conf.h            |   1 +
 16 files changed, 762 insertions(+), 161 deletions(-)
 create mode 100644 tools/perf/util/block-info.c
 create mode 100644 tools/perf/util/block-info.h

-- 
2.17.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v7 1/7] perf diff: Don't use hack to skip column length calculation
  2019-11-07  7:47 [PATCH v7 0/7] perf report: Support sorting all blocks by cycles Jin Yao
@ 2019-11-07  7:47 ` Jin Yao
  2019-11-12 11:17   ` [tip: perf/core] " tip-bot2 for Jin Yao
  2019-11-07  7:47 ` [PATCH v7 2/7] perf util: Cleanup and refactor block info functions Jin Yao
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 22+ messages in thread
From: Jin Yao @ 2019-11-07  7:47 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

Previously we use a nasty hack to skip the hists__calc_col_len for
block since this function is not very suitable for block column
length calculation.

This patch removes the hack code and add a check at the entry of
hists__calc_col_len to skip for block case.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-diff.c | 11 ++---------
 tools/perf/util/hist.c    |  2 ++
 2 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 5281629c27b1..faf99a81ad3e 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -765,13 +765,6 @@ static void block_hists_match(struct hists *hists_base,
 	}
 }
 
-static int filter_cb(struct hist_entry *he, void *arg __maybe_unused)
-{
-	/* Skip the calculation of column length in output_resort */
-	he->filtered = true;
-	return 0;
-}
-
 static void hists__precompute(struct hists *hists)
 {
 	struct rb_root_cached *root;
@@ -820,8 +813,8 @@ static void hists__precompute(struct hists *hists)
 				if (bh->valid && pair_bh->valid) {
 					block_hists_match(&bh->block_hists,
 							  &pair_bh->block_hists);
-					hists__output_resort_cb(&pair_bh->block_hists,
-								NULL, filter_cb);
+					hists__output_resort(&pair_bh->block_hists,
+							     NULL);
 				}
 				break;
 			default:
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 679a1d75090c..daa6eef4fde0 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -80,6 +80,8 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
 	int symlen;
 	u16 len;
 
+	if (h->block_info)
+		return;
 	/*
 	 * +4 accounts for '[x] ' priv level info
 	 * +2 accounts for 0x prefix on raw addresses
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v7 2/7] perf util: Cleanup and refactor block info functions
  2019-11-07  7:47 [PATCH v7 0/7] perf report: Support sorting all blocks by cycles Jin Yao
  2019-11-07  7:47 ` [PATCH v7 1/7] perf diff: Don't use hack to skip column length calculation Jin Yao
@ 2019-11-07  7:47 ` Jin Yao
  2019-11-12 11:17   ` [tip: perf/core] perf block: " tip-bot2 for Jin Yao
  2019-11-07  7:47 ` [PATCH v7 3/7] perf util: Count the total cycles of all samples Jin Yao
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 22+ messages in thread
From: Jin Yao @ 2019-11-07  7:47 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

We have already implemented some block-info related functions.
Now it's time to do some cleanup, refactoring and move the
functions and structures to new block-info.h/block-info.c.

 v4:
 ---
 Move code for skipping column length calculation to patch:
 'perf diff: Don't use hack to skip column length calculation'

 v3:
 ---
 1. Rename the patch title
 2. Rename from block.h/block.c to block-info.h/block-info.c
 3. Move more common part to block-info, such as
    block_info__process_sym.
 4. Remove the nasty hack for skipping calculation of column
    length

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-diff.c    | 107 +++--------------------------
 tools/perf/util/Build        |   1 +
 tools/perf/util/block-info.c | 129 +++++++++++++++++++++++++++++++++++
 tools/perf/util/block-info.h |  43 ++++++++++++
 tools/perf/util/hist.c       |   1 +
 tools/perf/util/symbol.c     |  22 ------
 tools/perf/util/symbol.h     |  24 -------
 7 files changed, 185 insertions(+), 142 deletions(-)
 create mode 100644 tools/perf/util/block-info.c
 create mode 100644 tools/perf/util/block-info.h

diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index faf99a81ad3e..6728568fe5c4 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -24,6 +24,7 @@
 #include "util/annotate.h"
 #include "util/map.h"
 #include "util/spark.h"
+#include "util/block-info.h"
 #include <linux/err.h>
 #include <linux/zalloc.h>
 #include <subcmd/pager.h>
@@ -98,8 +99,6 @@ static s64 compute_wdiff_w2;
 static const char		*cpu_list;
 static DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
 
-static struct addr_location dummy_al;
-
 enum {
 	COMPUTE_DELTA,
 	COMPUTE_RATIO,
@@ -537,41 +536,6 @@ static void hists__baseline_only(struct hists *hists)
 	}
 }
 
-static int64_t block_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
-			 struct hist_entry *left, struct hist_entry *right)
-{
-	struct block_info *bi_l = left->block_info;
-	struct block_info *bi_r = right->block_info;
-	int cmp;
-
-	if (!bi_l->sym || !bi_r->sym) {
-		if (!bi_l->sym && !bi_r->sym)
-			return 0;
-		else if (!bi_l->sym)
-			return -1;
-		else
-			return 1;
-	}
-
-	if (bi_l->sym == bi_r->sym) {
-		if (bi_l->start == bi_r->start) {
-			if (bi_l->end == bi_r->end)
-				return 0;
-			else
-				return (int64_t)(bi_r->end - bi_l->end);
-		} else
-			return (int64_t)(bi_r->start - bi_l->start);
-	} else {
-		cmp = strcmp(bi_l->sym->name, bi_r->sym->name);
-		return cmp;
-	}
-
-	if (bi_l->sym->start != bi_r->sym->start)
-		return (int64_t)(bi_r->sym->start - bi_l->sym->start);
-
-	return (int64_t)(bi_r->sym->end - bi_l->sym->end);
-}
-
 static int64_t block_cycles_diff_cmp(struct hist_entry *left,
 				     struct hist_entry *right)
 {
@@ -600,67 +564,13 @@ static void init_block_hist(struct block_hist *bh)
 
 	INIT_LIST_HEAD(&bh->block_fmt.list);
 	INIT_LIST_HEAD(&bh->block_fmt.sort_list);
-	bh->block_fmt.cmp = block_cmp;
+	bh->block_fmt.cmp = block_info__cmp;
 	bh->block_fmt.sort = block_sort;
 	perf_hpp_list__register_sort_field(&bh->block_list,
 					   &bh->block_fmt);
 	bh->valid = true;
 }
 
-static void init_block_info(struct block_info *bi, struct symbol *sym,
-			    struct cyc_hist *ch, int offset)
-{
-	bi->sym = sym;
-	bi->start = ch->start;
-	bi->end = offset;
-	bi->cycles = ch->cycles;
-	bi->cycles_aggr = ch->cycles_aggr;
-	bi->num = ch->num;
-	bi->num_aggr = ch->num_aggr;
-
-	memcpy(bi->cycles_spark, ch->cycles_spark,
-	       NUM_SPARKS * sizeof(u64));
-}
-
-static int process_block_per_sym(struct hist_entry *he)
-{
-	struct annotation *notes;
-	struct cyc_hist *ch;
-	struct block_hist *bh;
-
-	if (!he->ms.map || !he->ms.sym)
-		return 0;
-
-	notes = symbol__annotation(he->ms.sym);
-	if (!notes || !notes->src || !notes->src->cycles_hist)
-		return 0;
-
-	bh = container_of(he, struct block_hist, he);
-	init_block_hist(bh);
-
-	ch = notes->src->cycles_hist;
-	for (unsigned int i = 0; i < symbol__size(he->ms.sym); i++) {
-		if (ch[i].num_aggr) {
-			struct block_info *bi;
-			struct hist_entry *he_block;
-
-			bi = block_info__new();
-			if (!bi)
-				return -1;
-
-			init_block_info(bi, he->ms.sym, &ch[i], i);
-			he_block = hists__add_entry_block(&bh->block_hists,
-							  &dummy_al, bi);
-			if (!he_block) {
-				block_info__put(bi);
-				return -1;
-			}
-		}
-	}
-
-	return 0;
-}
-
 static int block_pair_cmp(struct hist_entry *a, struct hist_entry *b)
 {
 	struct block_info *bi_a = a->block_info;
@@ -785,8 +695,11 @@ static void hists__precompute(struct hists *hists)
 		he   = rb_entry(next, struct hist_entry, rb_node_in);
 		next = rb_next(&he->rb_node_in);
 
-		if (compute == COMPUTE_CYCLES)
-			process_block_per_sym(he);
+		if (compute == COMPUTE_CYCLES) {
+			bh = container_of(he, struct block_hist, he);
+			init_block_hist(bh);
+			block_info__process_sym(he, bh, NULL, 0);
+		}
 
 		data__for_each_file_new(i, d) {
 			pair = get_pair_data(he, d);
@@ -805,10 +718,12 @@ static void hists__precompute(struct hists *hists)
 				compute_wdiff(he, pair);
 				break;
 			case COMPUTE_CYCLES:
-				process_block_per_sym(pair);
-				bh = container_of(he, struct block_hist, he);
 				pair_bh = container_of(pair, struct block_hist,
 						       he);
+				init_block_hist(pair_bh);
+				block_info__process_sym(pair, pair_bh, NULL, 0);
+
+				bh = container_of(he, struct block_hist, he);
 
 				if (bh->valid && pair_bh->valid) {
 					block_hists_match(&bh->block_hists,
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 39814b1806a6..b8e05a147b2b 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -1,4 +1,5 @@
 perf-y += annotate.o
+perf-y += block-info.o
 perf-y += block-range.o
 perf-y += build-id.o
 perf-y += cacheline.o
diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
new file mode 100644
index 000000000000..b9954a32b8f4
--- /dev/null
+++ b/tools/perf/util/block-info.c
@@ -0,0 +1,129 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdlib.h>
+#include <string.h>
+#include <linux/zalloc.h>
+#include "block-info.h"
+#include "sort.h"
+#include "annotate.h"
+#include "symbol.h"
+
+struct block_info *block_info__get(struct block_info *bi)
+{
+	if (bi)
+		refcount_inc(&bi->refcnt);
+	return bi;
+}
+
+void block_info__put(struct block_info *bi)
+{
+	if (bi && refcount_dec_and_test(&bi->refcnt))
+		free(bi);
+}
+
+struct block_info *block_info__new(void)
+{
+	struct block_info *bi = zalloc(sizeof(*bi));
+
+	if (bi)
+		refcount_set(&bi->refcnt, 1);
+	return bi;
+}
+
+int64_t block_info__cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+			struct hist_entry *left, struct hist_entry *right)
+{
+	struct block_info *bi_l = left->block_info;
+	struct block_info *bi_r = right->block_info;
+	int cmp;
+
+	if (!bi_l->sym || !bi_r->sym) {
+		if (!bi_l->sym && !bi_r->sym)
+			return 0;
+		else if (!bi_l->sym)
+			return -1;
+		else
+			return 1;
+	}
+
+	if (bi_l->sym == bi_r->sym) {
+		if (bi_l->start == bi_r->start) {
+			if (bi_l->end == bi_r->end)
+				return 0;
+			else
+				return (int64_t)(bi_r->end - bi_l->end);
+		} else
+			return (int64_t)(bi_r->start - bi_l->start);
+	} else {
+		cmp = strcmp(bi_l->sym->name, bi_r->sym->name);
+		return cmp;
+	}
+
+	if (bi_l->sym->start != bi_r->sym->start)
+		return (int64_t)(bi_r->sym->start - bi_l->sym->start);
+
+	return (int64_t)(bi_r->sym->end - bi_l->sym->end);
+}
+
+static void init_block_info(struct block_info *bi, struct symbol *sym,
+			    struct cyc_hist *ch, int offset,
+			    u64 total_cycles)
+{
+	bi->sym = sym;
+	bi->start = ch->start;
+	bi->end = offset;
+	bi->cycles = ch->cycles;
+	bi->cycles_aggr = ch->cycles_aggr;
+	bi->num = ch->num;
+	bi->num_aggr = ch->num_aggr;
+	bi->total_cycles = total_cycles;
+
+	memcpy(bi->cycles_spark, ch->cycles_spark,
+	       NUM_SPARKS * sizeof(u64));
+}
+
+int block_info__process_sym(struct hist_entry *he, struct block_hist *bh,
+			    u64 *block_cycles_aggr, u64 total_cycles)
+{
+	struct annotation *notes;
+	struct cyc_hist *ch;
+	static struct addr_location al;
+	u64 cycles = 0;
+
+	if (!he->ms.map || !he->ms.sym)
+		return 0;
+
+	memset(&al, 0, sizeof(al));
+	al.map = he->ms.map;
+	al.sym = he->ms.sym;
+
+	notes = symbol__annotation(he->ms.sym);
+	if (!notes || !notes->src || !notes->src->cycles_hist)
+		return 0;
+	ch = notes->src->cycles_hist;
+	for (unsigned int i = 0; i < symbol__size(he->ms.sym); i++) {
+		if (ch[i].num_aggr) {
+			struct block_info *bi;
+			struct hist_entry *he_block;
+
+			bi = block_info__new();
+			if (!bi)
+				return -1;
+
+			init_block_info(bi, he->ms.sym, &ch[i], i,
+					total_cycles);
+			cycles += bi->cycles_aggr / bi->num_aggr;
+
+			he_block = hists__add_entry_block(&bh->block_hists,
+							  &al, bi);
+			if (!he_block) {
+				block_info__put(bi);
+				return -1;
+			}
+		}
+	}
+
+	if (block_cycles_aggr)
+		*block_cycles_aggr += cycles;
+
+	return 0;
+}
diff --git a/tools/perf/util/block-info.h b/tools/perf/util/block-info.h
new file mode 100644
index 000000000000..d55dfc2fda6f
--- /dev/null
+++ b/tools/perf/util/block-info.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __PERF_BLOCK_H
+#define __PERF_BLOCK_H
+
+#include <linux/types.h>
+#include <linux/refcount.h>
+#include "util/hist.h"
+#include "util/symbol.h"
+
+struct block_info {
+	struct symbol		*sym;
+	u64			start;
+	u64			end;
+	u64			cycles;
+	u64			cycles_aggr;
+	s64			cycles_spark[NUM_SPARKS];
+	u64			total_cycles;
+	int			num;
+	int			num_aggr;
+	refcount_t		refcnt;
+};
+
+struct block_hist;
+
+struct block_info *block_info__new(void);
+struct block_info *block_info__get(struct block_info *bi);
+void   block_info__put(struct block_info *bi);
+
+static inline void __block_info__zput(struct block_info **bi)
+{
+	block_info__put(*bi);
+	*bi = NULL;
+}
+
+#define block_info__zput(bi) __block_info__zput(&bi)
+
+int64_t block_info__cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+			struct hist_entry *left, struct hist_entry *right);
+
+int block_info__process_sym(struct hist_entry *he, struct block_hist *bh,
+			    u64 *block_cycles_aggr, u64 total_cycles);
+
+#endif /* __PERF_BLOCK_H */
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index daa6eef4fde0..a7fa061987e4 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -18,6 +18,7 @@
 #include "srcline.h"
 #include "symbol.h"
 #include "thread.h"
+#include "block-info.h"
 #include "ui/progress.h"
 #include <errno.h>
 #include <math.h>
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 4ad39cc6368d..2764863212b1 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -2351,25 +2351,3 @@ struct mem_info *mem_info__new(void)
 		refcount_set(&mi->refcnt, 1);
 	return mi;
 }
-
-struct block_info *block_info__get(struct block_info *bi)
-{
-	if (bi)
-		refcount_inc(&bi->refcnt);
-	return bi;
-}
-
-void block_info__put(struct block_info *bi)
-{
-	if (bi && refcount_dec_and_test(&bi->refcnt))
-		free(bi);
-}
-
-struct block_info *block_info__new(void)
-{
-	struct block_info *bi = zalloc(sizeof(*bi));
-
-	if (bi)
-		refcount_set(&bi->refcnt, 1);
-	return bi;
-}
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index cc2a89b99d3d..c3bd16d75d5d 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -106,18 +106,6 @@ struct ref_reloc_sym {
 	u64		unrelocated_addr;
 };
 
-struct block_info {
-	struct symbol		*sym;
-	u64			start;
-	u64			end;
-	u64			cycles;
-	u64			cycles_aggr;
-	s64			cycles_spark[NUM_SPARKS];
-	int			num;
-	int			num_aggr;
-	refcount_t		refcnt;
-};
-
 struct addr_location {
 	struct machine *machine;
 	struct thread *thread;
@@ -291,16 +279,4 @@ static inline void __mem_info__zput(struct mem_info **mi)
 
 #define mem_info__zput(mi) __mem_info__zput(&mi)
 
-struct block_info *block_info__new(void);
-struct block_info *block_info__get(struct block_info *bi);
-void   block_info__put(struct block_info *bi);
-
-static inline void __block_info__zput(struct block_info **bi)
-{
-	block_info__put(*bi);
-	*bi = NULL;
-}
-
-#define block_info__zput(bi) __block_info__zput(&bi)
-
 #endif /* __PERF_SYMBOL */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v7 3/7] perf util: Count the total cycles of all samples
  2019-11-07  7:47 [PATCH v7 0/7] perf report: Support sorting all blocks by cycles Jin Yao
  2019-11-07  7:47 ` [PATCH v7 1/7] perf diff: Don't use hack to skip column length calculation Jin Yao
  2019-11-07  7:47 ` [PATCH v7 2/7] perf util: Cleanup and refactor block info functions Jin Yao
@ 2019-11-07  7:47 ` Jin Yao
  2019-11-12 11:17   ` [tip: perf/core] perf hist: " tip-bot2 for Jin Yao
  2019-11-07  7:47 ` [PATCH v7 4/7] perf util: Support block formats with compare/sort/display Jin Yao
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 22+ messages in thread
From: Jin Yao @ 2019-11-07  7:47 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

We can get the per sample cycles by hist__account_cycles(). It's also
useful to know the total cycles of all samples in order to get the
cycles coverage for a single program block in further. For example,

coverage = per block sampled cycles / total sampled cycles

This patch creates a new argument 'total_cycles' in hist__account_cycles(),
which will be added with the cycles of each sample.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-annotate.c | 2 +-
 tools/perf/builtin-diff.c     | 3 ++-
 tools/perf/builtin-report.c   | 2 +-
 tools/perf/builtin-top.c      | 3 ++-
 tools/perf/util/hist.c        | 6 +++++-
 tools/perf/util/hist.h        | 3 ++-
 6 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 8db8fc9bddef..6ab0cc45b287 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -201,7 +201,7 @@ static int process_branch_callback(struct evsel *evsel,
 	if (a.map != NULL)
 		a.map->dso->hit = 1;
 
-	hist__account_cycles(sample->branch_stack, al, sample, false);
+	hist__account_cycles(sample->branch_stack, al, sample, false, NULL);
 
 	ret = hist_entry_iter__add(&iter, &a, PERF_MAX_STACK_DEPTH, ann);
 	return ret;
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 6728568fe5c4..376dbf10ad64 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -426,7 +426,8 @@ static int diff__process_sample_event(struct perf_tool *tool,
 			goto out_put;
 		}
 
-		hist__account_cycles(sample->branch_stack, &al, sample, false);
+		hist__account_cycles(sample->branch_stack, &al, sample, false,
+				     NULL);
 	}
 
 	/*
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 3bbad039abf2..bc15b9dcccd6 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -292,7 +292,7 @@ static int process_sample_event(struct perf_tool *tool,
 
 	if (ui__has_annotation() || rep->symbol_ipc) {
 		hist__account_cycles(sample->branch_stack, &al, sample,
-				     rep->nonany_branch_mode);
+				     rep->nonany_branch_mode, NULL);
 	}
 
 	ret = hist_entry_iter__add(&iter, &al, rep->max_stack, rep);
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index d96f24c8770d..14c52e4d47f6 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -725,7 +725,8 @@ static int hist_iter__top_callback(struct hist_entry_iter *iter,
 		perf_top__record_precise_ip(top, he, iter->sample, evsel, al->addr);
 
 	hist__account_cycles(iter->sample->branch_stack, al, iter->sample,
-		     !(top->record_opts.branch_stack & PERF_SAMPLE_BRANCH_ANY));
+		     !(top->record_opts.branch_stack & PERF_SAMPLE_BRANCH_ANY),
+		     NULL);
 	return 0;
 }
 
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index a7fa061987e4..0e27d6830011 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -2572,7 +2572,8 @@ int hists__unlink(struct hists *hists)
 }
 
 void hist__account_cycles(struct branch_stack *bs, struct addr_location *al,
-			  struct perf_sample *sample, bool nonany_branch_mode)
+			  struct perf_sample *sample, bool nonany_branch_mode,
+			  u64 *total_cycles)
 {
 	struct branch_info *bi;
 
@@ -2599,6 +2600,9 @@ void hist__account_cycles(struct branch_stack *bs, struct addr_location *al,
 					nonany_branch_mode ? NULL : prev,
 					bi[i].flags.cycles);
 				prev = &bi[i].to;
+
+				if (total_cycles)
+					*total_cycles += bi[i].flags.cycles;
 			}
 			free(bi);
 		}
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 6a186b668303..4d87c7b4c1b2 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -527,7 +527,8 @@ unsigned int hists__sort_list_width(struct hists *hists);
 unsigned int hists__overhead_width(struct hists *hists);
 
 void hist__account_cycles(struct branch_stack *bs, struct addr_location *al,
-			  struct perf_sample *sample, bool nonany_branch_mode);
+			  struct perf_sample *sample, bool nonany_branch_mode,
+			  u64 *total_cycles);
 
 struct option;
 int parse_filter_percentage(const struct option *opt, const char *arg, int unset);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v7 4/7] perf util: Support block formats with compare/sort/display
  2019-11-07  7:47 [PATCH v7 0/7] perf report: Support sorting all blocks by cycles Jin Yao
                   ` (2 preceding siblings ...)
  2019-11-07  7:47 ` [PATCH v7 3/7] perf util: Count the total cycles of all samples Jin Yao
@ 2019-11-07  7:47 ` Jin Yao
  2019-11-07 12:45   ` Arnaldo Carvalho de Melo
  2019-11-12 11:17   ` [tip: perf/core] perf hist: " tip-bot2 for Jin Yao
  2019-11-07  7:47 ` [PATCH v7 5/7] perf report: Sort by sampled cycles percent per block for stdio Jin Yao
                   ` (3 subsequent siblings)
  7 siblings, 2 replies; 22+ messages in thread
From: Jin Yao @ 2019-11-07  7:47 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

This patch provides helper routines to support new
columns for block info output.

The new columns are:

Sampled Cycles%
Sampled Cycles
Avg Cycles%
Avg Cycles
[Program Block Range]
Shared Object

 v5:
 ---
 1. Move more block related functions from builtin-report.c to
    block-info.c

 2. Set ms (map+sym) in block hist_entry. Because this info
    is needed for reporting the block range (i.e. source line)

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/util/block-info.c | 317 +++++++++++++++++++++++++++++++++++
 tools/perf/util/block-info.h |  33 +++-
 tools/perf/util/hist.c       |   4 +
 3 files changed, 352 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
index b9954a32b8f4..1242c3a33509 100644
--- a/tools/perf/util/block-info.c
+++ b/tools/perf/util/block-info.c
@@ -6,6 +6,40 @@
 #include "sort.h"
 #include "annotate.h"
 #include "symbol.h"
+#include "dso.h"
+#include "map.h"
+#include "srcline.h"
+#include "evlist.h"
+
+static struct block_header_column{
+	const char *name;
+	int width;
+} block_columns[PERF_HPP_REPORT__BLOCK_MAX_INDEX] = {
+	[PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT] = {
+		.name = "Sampled Cycles%",
+		.width = 15,
+	},
+	[PERF_HPP_REPORT__BLOCK_LBR_CYCLES] = {
+		.name = "Sampled Cycles",
+		.width = 14,
+	},
+	[PERF_HPP_REPORT__BLOCK_CYCLES_PCT] = {
+		.name = "Avg Cycles%",
+		.width = 11,
+	},
+	[PERF_HPP_REPORT__BLOCK_AVG_CYCLES] = {
+		.name = "Avg Cycles",
+		.width = 10,
+	},
+	[PERF_HPP_REPORT__BLOCK_RANGE] = {
+		.name = "[Program Block Range]",
+		.width = 70,
+	},
+	[PERF_HPP_REPORT__BLOCK_DSO] = {
+		.name = "Shared Object",
+		.width = 20,
+	}
+};
 
 struct block_info *block_info__get(struct block_info *bi)
 {
@@ -127,3 +161,286 @@ int block_info__process_sym(struct hist_entry *he, struct block_hist *bh,
 
 	return 0;
 }
+
+static int block_column_header(struct perf_hpp_fmt *fmt,
+			       struct perf_hpp *hpp,
+			       struct hists *hists __maybe_unused,
+			       int line __maybe_unused,
+			       int *span __maybe_unused)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
+			 block_fmt->header);
+}
+
+static int block_column_width(struct perf_hpp_fmt *fmt,
+			      struct perf_hpp *hpp __maybe_unused,
+			      struct hists *hists __maybe_unused)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+
+	return block_fmt->width;
+}
+
+static int block_total_cycles_pct_entry(struct perf_hpp_fmt *fmt,
+					struct perf_hpp *hpp,
+					struct hist_entry *he)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+	struct block_info *bi = he->block_info;
+	double ratio = 0.0;
+	char buf[16];
+
+	if (block_fmt->total_cycles)
+		ratio = (double)bi->cycles / (double)block_fmt->total_cycles;
+
+	sprintf(buf, "%.2f%%", 100.0 * ratio);
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width, buf);
+}
+
+static int64_t block_total_cycles_pct_sort(struct perf_hpp_fmt *fmt,
+					   struct hist_entry *left,
+					   struct hist_entry *right)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+	struct block_info *bi_l = left->block_info;
+	struct block_info *bi_r = right->block_info;
+	double l, r;
+
+	if (block_fmt->total_cycles) {
+		l = ((double)bi_l->cycles /
+			(double)block_fmt->total_cycles) * 100000.0;
+		r = ((double)bi_r->cycles /
+			(double)block_fmt->total_cycles) * 100000.0;
+		return (int64_t)l - (int64_t)r;
+	}
+
+	return 0;
+}
+
+static void cycles_string(u64 cycles, char *buf, int size)
+{
+	if (cycles >= 1000000)
+		scnprintf(buf, size, "%.1fM", (double)cycles / 1000000.0);
+	else if (cycles >= 1000)
+		scnprintf(buf, size, "%.1fK", (double)cycles / 1000.0);
+	else
+		scnprintf(buf, size, "%1d", cycles);
+}
+
+static int block_cycles_lbr_entry(struct perf_hpp_fmt *fmt,
+				  struct perf_hpp *hpp, struct hist_entry *he)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+	struct block_info *bi = he->block_info;
+	char cycles_buf[16];
+
+	cycles_string(bi->cycles_aggr, cycles_buf, sizeof(cycles_buf));
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
+			 cycles_buf);
+}
+
+static int block_cycles_pct_entry(struct perf_hpp_fmt *fmt,
+				  struct perf_hpp *hpp, struct hist_entry *he)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+	struct block_info *bi = he->block_info;
+	double ratio = 0.0;
+	u64 avg;
+	char buf[16];
+
+	if (block_fmt->block_cycles && bi->num_aggr) {
+		avg = bi->cycles_aggr / bi->num_aggr;
+		ratio = (double)avg / (double)block_fmt->block_cycles;
+	}
+
+	sprintf(buf, "%.2f%%", 100.0 * ratio);
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width, buf);
+}
+
+static int block_avg_cycles_entry(struct perf_hpp_fmt *fmt,
+				  struct perf_hpp *hpp,
+				  struct hist_entry *he)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+	struct block_info *bi = he->block_info;
+	char cycles_buf[16];
+
+	cycles_string(bi->cycles_aggr / bi->num_aggr, cycles_buf,
+		      sizeof(cycles_buf));
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
+			 cycles_buf);
+}
+
+static int block_range_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+			     struct hist_entry *he)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+	struct block_info *bi = he->block_info;
+	char buf[128];
+	char *start_line, *end_line;
+
+	symbol_conf.disable_add2line_warn = true;
+
+	start_line = map__srcline(he->ms.map, bi->sym->start + bi->start,
+				  he->ms.sym);
+
+	end_line = map__srcline(he->ms.map, bi->sym->start + bi->end,
+				he->ms.sym);
+
+	if ((start_line != SRCLINE_UNKNOWN) && (end_line != SRCLINE_UNKNOWN)) {
+		scnprintf(buf, sizeof(buf), "[%s -> %s]",
+			  start_line, end_line);
+	} else {
+		scnprintf(buf, sizeof(buf), "[%7lx -> %7lx]",
+			  bi->start, bi->end);
+	}
+
+	free_srcline(start_line);
+	free_srcline(end_line);
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width, buf);
+}
+
+static int block_dso_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+			   struct hist_entry *he)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+	struct map *map = he->ms.map;
+
+	if (map && map->dso) {
+		return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
+				 map->dso->short_name);
+	}
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
+			 "[unknown]");
+}
+
+static void init_block_header(struct block_fmt *block_fmt)
+{
+	struct perf_hpp_fmt *fmt = &block_fmt->fmt;
+
+	BUG_ON(block_fmt->idx >= PERF_HPP_REPORT__BLOCK_MAX_INDEX);
+
+	block_fmt->header = block_columns[block_fmt->idx].name;
+	block_fmt->width = block_columns[block_fmt->idx].width;
+
+	fmt->header = block_column_header;
+	fmt->width = block_column_width;
+}
+
+static void hpp_register(struct block_fmt *block_fmt, int idx,
+			 struct perf_hpp_list *hpp_list)
+{
+	struct perf_hpp_fmt *fmt = &block_fmt->fmt;
+
+	block_fmt->idx = idx;
+	INIT_LIST_HEAD(&fmt->list);
+	INIT_LIST_HEAD(&fmt->sort_list);
+
+	switch (idx) {
+	case PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT:
+		fmt->entry = block_total_cycles_pct_entry;
+		fmt->cmp = block_info__cmp;
+		fmt->sort = block_total_cycles_pct_sort;
+		break;
+	case PERF_HPP_REPORT__BLOCK_LBR_CYCLES:
+		fmt->entry = block_cycles_lbr_entry;
+		break;
+	case PERF_HPP_REPORT__BLOCK_CYCLES_PCT:
+		fmt->entry = block_cycles_pct_entry;
+		break;
+	case PERF_HPP_REPORT__BLOCK_AVG_CYCLES:
+		fmt->entry = block_avg_cycles_entry;
+		break;
+	case PERF_HPP_REPORT__BLOCK_RANGE:
+		fmt->entry = block_range_entry;
+		break;
+	case PERF_HPP_REPORT__BLOCK_DSO:
+		fmt->entry = block_dso_entry;
+		break;
+	default:
+		return;
+	}
+
+	init_block_header(block_fmt);
+	perf_hpp_list__column_register(hpp_list, fmt);
+}
+
+static void register_block_columns(struct perf_hpp_list *hpp_list,
+				   struct block_fmt *block_fmts)
+{
+	for (int i = 0; i < PERF_HPP_REPORT__BLOCK_MAX_INDEX; i++)
+		hpp_register(&block_fmts[i], i, hpp_list);
+}
+
+static void init_block_hist(struct block_hist *bh, struct block_fmt *block_fmts)
+{
+	__hists__init(&bh->block_hists, &bh->block_list);
+	perf_hpp_list__init(&bh->block_list);
+	bh->block_list.nr_header_lines = 1;
+
+	register_block_columns(&bh->block_list, block_fmts);
+
+	perf_hpp_list__register_sort_field(&bh->block_list,
+		&block_fmts[PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT].fmt);
+}
+
+static inline void set_fmt(struct block_fmt *block_fmt,
+			   u64 total_cycles, u64 block_cycles)
+{
+	block_fmt->total_cycles = total_cycles;
+	block_fmt->block_cycles = block_cycles;
+}
+
+static void process_block_report(struct hists *hists,
+				 struct block_report *block_report,
+				 u64 total_cycles)
+{
+	struct rb_node *next = rb_first_cached(&hists->entries);
+	struct block_hist *bh = &block_report->hist;
+	struct hist_entry *he;
+
+	init_block_hist(bh, block_report->fmts);
+
+	while (next) {
+		he = rb_entry(next, struct hist_entry, rb_node);
+		block_info__process_sym(he, bh, &block_report->cycles,
+					total_cycles);
+		next = rb_next(&he->rb_node);
+	}
+
+	for (int i = 0; i < PERF_HPP_REPORT__BLOCK_MAX_INDEX; i++) {
+		block_report->fmts[i].total_cycles = total_cycles;
+		block_report->fmts[i].block_cycles = block_report->cycles;
+	}
+
+	hists__output_resort(&bh->block_hists, NULL);
+}
+
+struct block_report *block_info__create_report(struct evlist *evlist,
+					       u64 total_cycles)
+{
+	struct block_report *block_reports;
+	int nr_hists = evlist->core.nr_entries, i = 0;
+	struct evsel *pos;
+
+	block_reports = calloc(nr_hists, sizeof(struct block_report));
+	if (!block_reports)
+		return NULL;
+
+	evlist__for_each_entry(evlist, pos) {
+		struct hists *hists = evsel__hists(pos);
+
+		process_block_report(hists, &block_reports[i], total_cycles);
+		i++;
+	}
+
+	return block_reports;
+}
diff --git a/tools/perf/util/block-info.h b/tools/perf/util/block-info.h
index d55dfc2fda6f..b5266588d476 100644
--- a/tools/perf/util/block-info.h
+++ b/tools/perf/util/block-info.h
@@ -4,8 +4,9 @@
 
 #include <linux/types.h>
 #include <linux/refcount.h>
-#include "util/hist.h"
-#include "util/symbol.h"
+#include "hist.h"
+#include "symbol.h"
+#include "sort.h"
 
 struct block_info {
 	struct symbol		*sym;
@@ -20,6 +21,31 @@ struct block_info {
 	refcount_t		refcnt;
 };
 
+struct block_fmt {
+	struct perf_hpp_fmt	fmt;
+	int			idx;
+	int			width;
+	const char		*header;
+	u64			total_cycles;
+	u64			block_cycles;
+};
+
+enum {
+	PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT,
+	PERF_HPP_REPORT__BLOCK_LBR_CYCLES,
+	PERF_HPP_REPORT__BLOCK_CYCLES_PCT,
+	PERF_HPP_REPORT__BLOCK_AVG_CYCLES,
+	PERF_HPP_REPORT__BLOCK_RANGE,
+	PERF_HPP_REPORT__BLOCK_DSO,
+	PERF_HPP_REPORT__BLOCK_MAX_INDEX
+};
+
+struct block_report {
+	struct block_hist	hist;
+	u64			cycles;
+	struct block_fmt	fmts[PERF_HPP_REPORT__BLOCK_MAX_INDEX];
+};
+
 struct block_hist;
 
 struct block_info *block_info__new(void);
@@ -40,4 +66,7 @@ int64_t block_info__cmp(struct perf_hpp_fmt *fmt __maybe_unused,
 int block_info__process_sym(struct hist_entry *he, struct block_hist *bh,
 			    u64 *block_cycles_aggr, u64 total_cycles);
 
+struct block_report *block_info__create_report(struct evlist *evlist,
+					       u64 total_cycles);
+
 #endif /* __PERF_BLOCK_H */
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 0e27d6830011..7cf137b0451b 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -758,6 +758,10 @@ struct hist_entry *hists__add_entry_block(struct hists *hists,
 	struct hist_entry entry = {
 		.block_info = block_info,
 		.hists = hists,
+		.ms = {
+			.map = al->map,
+			.sym = al->sym,
+		},
 	}, *he = hists__findnew_entry(hists, &entry, al, false);
 
 	return he;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v7 5/7] perf report: Sort by sampled cycles percent per block for stdio
  2019-11-07  7:47 [PATCH v7 0/7] perf report: Support sorting all blocks by cycles Jin Yao
                   ` (3 preceding siblings ...)
  2019-11-07  7:47 ` [PATCH v7 4/7] perf util: Support block formats with compare/sort/display Jin Yao
@ 2019-11-07  7:47 ` Jin Yao
  2019-11-12 11:17   ` [tip: perf/core] " tip-bot2 for Jin Yao
  2019-11-07  7:47 ` [PATCH v7 6/7] perf report: Support --percent-limit for --total-cycles Jin Yao
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 22+ messages in thread
From: Jin Yao @ 2019-11-07  7:47 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

It would be useful to support sorting for all blocks by the
sampled cycles percent per block. This is useful to concentrate
on the globally hottest blocks.

This patch implements a new option "--total-cycles" which sorts
all blocks by 'Sampled Cycles%'. The 'Sampled Cycles%' is the
percent:

 percent = block sampled cycles aggregation / total sampled cycles

Note that, this patch only supports "--stdio" mode.

For example,

perf record -b ./div
perf report --total-cycles --stdio

 # To display the perf.data header info, please use --header/--header-only options.
 #
 #
 # Total Lost Samples: 0
 #
 # Samples: 2M of event 'cycles'
 # Event count (approx.): 2753248
 #
 # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                              [Program Block Range]         Shared Object
 # ...............  ..............  ...........  ..........  .................................................................  ....................
 #
            26.04%            2.8M        0.40%          18                                             [div.c:42 -> div.c:39]                   div
            15.17%            1.2M        0.16%           7                                 [random_r.c:357 -> random_r.c:380]          libc-2.27.so
             5.11%          402.0K        0.04%           2                                             [div.c:27 -> div.c:28]                   div
             4.87%          381.6K        0.04%           2                                     [random.c:288 -> random.c:291]          libc-2.27.so
             4.53%          381.0K        0.04%           2                                             [div.c:40 -> div.c:40]                   div
             3.85%          300.9K        0.02%           1                                             [div.c:22 -> div.c:25]                   div
             3.08%          241.1K        0.02%           1                                           [rand.c:26 -> rand.c:27]          libc-2.27.so
             3.06%          240.0K        0.02%           1                                     [random.c:291 -> random.c:291]          libc-2.27.so
             2.78%          215.7K        0.02%           1                                     [random.c:298 -> random.c:298]          libc-2.27.so
             2.52%          198.3K        0.02%           1                                     [random.c:293 -> random.c:293]          libc-2.27.so
             2.36%          184.8K        0.02%           1                                           [rand.c:28 -> rand.c:28]          libc-2.27.so
             2.33%          180.5K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
             2.28%          176.7K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
             2.20%          168.8K        0.02%           1                                         [rand@plt+0 -> rand@plt+0]                   div
             1.98%          158.2K        0.02%           1                                 [random_r.c:388 -> random_r.c:388]          libc-2.27.so
             1.57%          123.3K        0.02%           1                                             [div.c:42 -> div.c:44]                   div
             1.44%          116.0K        0.42%          19                                 [random_r.c:357 -> random_r.c:394]          libc-2.27.so
             0.25%          182.5K        0.02%           1                                 [random_r.c:388 -> random_r.c:391]          libc-2.27.so
             0.00%              48        1.07%          48                         [x86_pmu_enable+284 -> x86_pmu_enable+298]     [kernel.kallsyms]
             0.00%              74        1.64%          74                              [vm_mmap_pgoff+0 -> vm_mmap_pgoff+92]     [kernel.kallsyms]
             0.00%              73        1.62%          73                                          [vm_mmap+0 -> vm_mmap+48]     [kernel.kallsyms]
             0.00%              63        0.69%          31                                        [up_write+0 -> up_write+34]     [kernel.kallsyms]
             0.00%              13        0.29%          13                       [setup_arg_pages+396 -> setup_arg_pages+413]     [kernel.kallsyms]
             0.00%               3        0.07%           3                       [setup_arg_pages+418 -> setup_arg_pages+450]     [kernel.kallsyms]
             0.00%             616        6.84%         308                    [security_mmap_file+0 -> security_mmap_file+72]     [kernel.kallsyms]
             0.00%              23        0.51%          23                   [security_mmap_file+77 -> security_mmap_file+87]     [kernel.kallsyms]
             0.00%               4        0.02%           1                                   [sched_clock+0 -> sched_clock+4]     [kernel.kallsyms]
             0.00%               4        0.02%           1                                  [sched_clock+9 -> sched_clock+12]     [kernel.kallsyms]
             0.00%               1        0.02%           1                                 [rcu_nmi_exit+0 -> rcu_nmi_exit+9]     [kernel.kallsyms]

 v7:
 ---
 Use use_browser in report__browse_block_hists for supporting
 stdio and potential tui mode.

 v6:
 ---
 Create report__browse_block_hists in block-info.c (codes are
 moved from builtin-report.c). It's called from
 perf_evlist__tty_browse_hists.

 v5:
 ---
 1. Move all block functions to block-info.c

 2. Move the code of setting ms in block hist_entry to
    other patch.

 v4:
 ---
 1. Use new option '--total-cycles' to replace
    '-s total_cycles' in v3.

 2. Move block info collection out of block info
    printing.

 v3:
 ---
 1. Use common function block_info__process_sym to
    process the blocks per symbol.

 2. Remove the nasty hack for skipping calculation
    of column length

 3. Some minor cleanup

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/Documentation/perf-report.txt | 11 ++++++
 tools/perf/builtin-report.c              | 44 ++++++++++++++++++++++--
 tools/perf/ui/stdio/hist.c               | 22 ++++++++++++
 tools/perf/util/block-info.c             | 17 +++++++++
 tools/perf/util/block-info.h             |  4 +++
 tools/perf/util/symbol_conf.h            |  1 +
 6 files changed, 96 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 7315f155803f..8dbe2119686a 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -525,6 +525,17 @@ include::itrace.txt[]
 	Configure time quantum for time sort key. Default 100ms.
 	Accepts s, us, ms, ns units.
 
+--total-cycles::
+	When --total-cycles is specified, it supports sorting for all blocks by
+	'Sampled Cycles%'. This is useful to concentrate on the globally hottest
+	blocks. In output, there are some new columns:
+
+	'Sampled Cycles%' - block sampled cycles aggregation / total sampled cycles
+	'Sampled Cycles'  - block sampled cycles aggregation
+	'Avg Cycles%'     - block average sampled cycles / sum of total block average
+			    sampled cycles
+	'Avg Cycles'      - block average sampled cycles
+
 include::callchain-overhead-calculation.txt[]
 
 SEE ALSO
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index bc15b9dcccd6..992b18bdd723 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -51,6 +51,7 @@
 #include "util/util.h" // perf_tip()
 #include "ui/ui.h"
 #include "ui/progress.h"
+#include "util/block-info.h"
 
 #include <dlfcn.h>
 #include <errno.h>
@@ -96,10 +97,13 @@ struct report {
 	float			min_percent;
 	u64			nr_entries;
 	u64			queue_size;
+	u64			total_cycles;
 	int			socket_filter;
 	DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
 	struct branch_type_stat	brtype_stat;
 	bool			symbol_ipc;
+	bool			total_cycles_mode;
+	struct block_report	*block_reports;
 };
 
 static int report__config(const char *var, const char *value, void *cb)
@@ -290,9 +294,10 @@ static int process_sample_event(struct perf_tool *tool,
 	if (al.map != NULL)
 		al.map->dso->hit = 1;
 
-	if (ui__has_annotation() || rep->symbol_ipc) {
+	if (ui__has_annotation() || rep->symbol_ipc || rep->total_cycles_mode) {
 		hist__account_cycles(sample->branch_stack, &al, sample,
-				     rep->nonany_branch_mode, NULL);
+				     rep->nonany_branch_mode,
+				     &rep->total_cycles);
 	}
 
 	ret = hist_entry_iter__add(&iter, &al, rep->max_stack, rep);
@@ -485,6 +490,7 @@ static int perf_evlist__tty_browse_hists(struct evlist *evlist,
 					 const char *help)
 {
 	struct evsel *pos;
+	int i = 0;
 
 	if (!quiet) {
 		fprintf(stdout, "#\n# Total Lost Samples: %" PRIu64 "\n#\n",
@@ -500,6 +506,13 @@ static int perf_evlist__tty_browse_hists(struct evlist *evlist,
 			continue;
 
 		hists__fprintf_nr_sample_events(hists, rep, evname, stdout);
+
+		if (rep->total_cycles_mode) {
+			report__browse_block_hists(&rep->block_reports[i++].hist,
+						   0, pos);
+			continue;
+		}
+
 		hists__fprintf(hists, !quiet, 0, 0, rep->min_percent, stdout,
 			       !(symbol_conf.use_callchain ||
 			         symbol_conf.show_branchflag_count));
@@ -925,6 +938,13 @@ static int __cmd_report(struct report *rep)
 
 	report__output_resort(rep);
 
+	if (rep->total_cycles_mode) {
+		rep->block_reports = block_info__create_report(session->evlist,
+							       rep->total_cycles);
+		if (!rep->block_reports)
+			return -1;
+	}
+
 	return report__browse_hists(rep);
 }
 
@@ -1209,6 +1229,8 @@ int cmd_report(int argc, const char **argv)
 		     "Set time quantum for time sort key (default 100ms)",
 		     parse_time_quantum),
 	OPTS_EVSWITCH(&report.evswitch),
+	OPT_BOOLEAN(0, "total-cycles", &report.total_cycles_mode,
+		    "Sort all blocks by 'Sampled Cycles%'"),
 	OPT_END()
 	};
 	struct perf_data data = {
@@ -1371,6 +1393,17 @@ int cmd_report(int argc, const char **argv)
 		goto error;
 	}
 
+	if (report.total_cycles_mode) {
+		if (sort__mode != SORT_MODE__BRANCH)
+			report.total_cycles_mode = false;
+		else if (!report.use_stdio) {
+			pr_err("Error: --total-cycles can be only used together with --stdio\n");
+			goto error;
+		} else {
+			sort_order = "sym";
+		}
+	}
+
 	if (strcmp(input_name, "-") != 0)
 		setup_browser(true);
 	else
@@ -1421,7 +1454,8 @@ int cmd_report(int argc, const char **argv)
 	 * so don't allocate extra space that won't be used in the stdio
 	 * implementation.
 	 */
-	if (ui__has_annotation() || report.symbol_ipc) {
+	if (ui__has_annotation() || report.symbol_ipc ||
+	    report.total_cycles_mode) {
 		ret = symbol__annotation_init();
 		if (ret < 0)
 			goto error;
@@ -1482,6 +1516,10 @@ int cmd_report(int argc, const char **argv)
 		itrace_synth_opts__clear_time_range(&itrace_synth_opts);
 		zfree(&report.ptime_range);
 	}
+
+	if (report.block_reports)
+		zfree(&report.block_reports);
+
 	zstd_fini(&(session->zstd_data));
 	perf_session__delete(session);
 	return ret;
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 5365606e9dad..655ef7708cd0 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -558,6 +558,25 @@ static int hist_entry__block_fprintf(struct hist_entry *he,
 	return ret;
 }
 
+static int hist_entry__individual_block_fprintf(struct hist_entry *he,
+						char *bf, size_t size,
+						FILE *fp)
+{
+	int ret = 0;
+
+	struct perf_hpp hpp = {
+		.buf		= bf,
+		.size		= size,
+		.skip		= false,
+	};
+
+	hist_entry__snprintf(he, &hpp);
+	if (!hpp.skip)
+		ret += fprintf(fp, "%s\n", bf);
+
+	return ret;
+}
+
 static int hist_entry__fprintf(struct hist_entry *he, size_t size,
 			       char *bf, size_t bfsz, FILE *fp,
 			       bool ignore_callchains)
@@ -580,6 +599,9 @@ static int hist_entry__fprintf(struct hist_entry *he, size_t size,
 	if (symbol_conf.report_block)
 		return hist_entry__block_fprintf(he, bf, size, fp);
 
+	if (symbol_conf.report_individual_block)
+		return hist_entry__individual_block_fprintf(he, bf, size, fp);
+
 	hist_entry__snprintf(he, &hpp);
 
 	ret = fprintf(fp, "%s\n", bf);
diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
index 1242c3a33509..3fe94f2a8743 100644
--- a/tools/perf/util/block-info.c
+++ b/tools/perf/util/block-info.c
@@ -444,3 +444,20 @@ struct block_report *block_info__create_report(struct evlist *evlist,
 
 	return block_reports;
 }
+
+int report__browse_block_hists(struct block_hist *bh, float min_percent,
+			       struct evsel *evsel __maybe_unused)
+{
+	switch (use_browser) {
+	case 0:
+		symbol_conf.report_individual_block = true;
+		hists__fprintf(&bh->block_hists, true, 0, 0, min_percent,
+			       stdout, true);
+		hists__delete_entries(&bh->block_hists);
+		return 0;
+	default:
+		return -1;
+	}
+
+	return 0;
+}
diff --git a/tools/perf/util/block-info.h b/tools/perf/util/block-info.h
index b5266588d476..8309297a6e8f 100644
--- a/tools/perf/util/block-info.h
+++ b/tools/perf/util/block-info.h
@@ -7,6 +7,7 @@
 #include "hist.h"
 #include "symbol.h"
 #include "sort.h"
+#include "ui/ui.h"
 
 struct block_info {
 	struct symbol		*sym;
@@ -69,4 +70,7 @@ int block_info__process_sym(struct hist_entry *he, struct block_hist *bh,
 struct block_report *block_info__create_report(struct evlist *evlist,
 					       u64 total_cycles);
 
+int report__browse_block_hists(struct block_hist *bh, float min_percent,
+			       struct evsel *evsel);
+
 #endif /* __PERF_BLOCK_H */
diff --git a/tools/perf/util/symbol_conf.h b/tools/perf/util/symbol_conf.h
index e6880789864c..10f1ec3e0349 100644
--- a/tools/perf/util/symbol_conf.h
+++ b/tools/perf/util/symbol_conf.h
@@ -40,6 +40,7 @@ struct symbol_conf {
 			raw_trace,
 			report_hierarchy,
 			report_block,
+			report_individual_block,
 			inline_name,
 			disable_add2line_warn;
 	const char	*vmlinux_name,
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v7 6/7] perf report: Support --percent-limit for --total-cycles
  2019-11-07  7:47 [PATCH v7 0/7] perf report: Support sorting all blocks by cycles Jin Yao
                   ` (4 preceding siblings ...)
  2019-11-07  7:47 ` [PATCH v7 5/7] perf report: Sort by sampled cycles percent per block for stdio Jin Yao
@ 2019-11-07  7:47 ` Jin Yao
  2019-11-12 11:17   ` [tip: perf/core] " tip-bot2 for Jin Yao
  2019-11-07  7:47 ` [PATCH v7 7/7] perf report: Sort by sampled cycles percent per block for tui Jin Yao
  2019-11-07  9:28 ` [PATCH v7 0/7] perf report: Support sorting all blocks by cycles Jiri Olsa
  7 siblings, 1 reply; 22+ messages in thread
From: Jin Yao @ 2019-11-07  7:47 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

We have already supported the '--total-cycles' option in previous
patch. It's also useful to show entries only above a threshold
percent.

This patch enables '--percent-limit' for not showing entries
under that percent.

For example,

 perf report --total-cycles --stdio --percent-limit 1

 # To display the perf.data header info, please use --header/--header-only options.
 #
 #
 # Total Lost Samples: 0
 #
 # Samples: 2M of event 'cycles'
 # Event count (approx.): 2753248
 #
 # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                              [Program Block Range]         Shared Object
 # ...............  ..............  ...........  ..........  .................................................................  ....................
 #
            26.04%            2.8M        0.40%          18                                             [div.c:42 -> div.c:39]                   div
            15.17%            1.2M        0.16%           7                                 [random_r.c:357 -> random_r.c:380]          libc-2.27.so
             5.11%          402.0K        0.04%           2                                             [div.c:27 -> div.c:28]                   div
             4.87%          381.6K        0.04%           2                                     [random.c:288 -> random.c:291]          libc-2.27.so
             4.53%          381.0K        0.04%           2                                             [div.c:40 -> div.c:40]                   div
             3.85%          300.9K        0.02%           1                                             [div.c:22 -> div.c:25]                   div
             3.08%          241.1K        0.02%           1                                           [rand.c:26 -> rand.c:27]          libc-2.27.so
             3.06%          240.0K        0.02%           1                                     [random.c:291 -> random.c:291]          libc-2.27.so
             2.78%          215.7K        0.02%           1                                     [random.c:298 -> random.c:298]          libc-2.27.so
             2.52%          198.3K        0.02%           1                                     [random.c:293 -> random.c:293]          libc-2.27.so
             2.36%          184.8K        0.02%           1                                           [rand.c:28 -> rand.c:28]          libc-2.27.so
             2.33%          180.5K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
             2.28%          176.7K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
             2.20%          168.8K        0.02%           1                                         [rand@plt+0 -> rand@plt+0]                   div
             1.98%          158.2K        0.02%           1                                 [random_r.c:388 -> random_r.c:388]          libc-2.27.so
             1.57%          123.3K        0.02%           1                                             [div.c:42 -> div.c:44]                   div
             1.44%          116.0K        0.42%          19                                 [random_r.c:357 -> random_r.c:394]          libc-2.27.so

It only shows the entries which 'Sampled Cycles%' > 1%.

 v7:
 ---
 No functional change. Only fix the conflict issue because
 previous patches are changed.

 v6:
 ---
 No functional change. Only fix the conflict issue because
 previous patches are changed.

 v5:
 ---
 No functional change. Only fix the conflict issue because
 previous patches are changed.

 v4:
 ---
 No functional change. Only fix the build issue because
 previous patches are changed.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-report.c  |  2 +-
 tools/perf/ui/stdio/hist.c   |  7 ++++++-
 tools/perf/util/block-info.c | 10 ++++++++++
 tools/perf/util/block-info.h |  2 ++
 4 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 992b18bdd723..ca41187525ed 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -509,7 +509,7 @@ static int perf_evlist__tty_browse_hists(struct evlist *evlist,
 
 		if (rep->total_cycles_mode) {
 			report__browse_block_hists(&rep->block_reports[i++].hist,
-						   0, pos);
+						   rep->min_percent, pos);
 			continue;
 		}
 
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 655ef7708cd0..132056c7d5b7 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -15,6 +15,7 @@
 #include "../../util/srcline.h"
 #include "../../util/string2.h"
 #include "../../util/thread.h"
+#include "../../util/block-info.h"
 #include <linux/ctype.h>
 #include <linux/zalloc.h>
 
@@ -856,7 +857,11 @@ size_t hists__fprintf(struct hists *hists, bool show_header, int max_rows,
 		if (h->filtered)
 			continue;
 
-		percent = hist_entry__get_percent_limit(h);
+		if (symbol_conf.report_individual_block)
+			percent = block_info__total_cycles_percent(h);
+		else
+			percent = hist_entry__get_percent_limit(h);
+
 		if (percent < min_pcnt)
 			continue;
 
diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
index 3fe94f2a8743..a6c2a3eea6f7 100644
--- a/tools/perf/util/block-info.c
+++ b/tools/perf/util/block-info.c
@@ -461,3 +461,13 @@ int report__browse_block_hists(struct block_hist *bh, float min_percent,
 
 	return 0;
 }
+
+float block_info__total_cycles_percent(struct hist_entry *he)
+{
+	struct block_info *bi = he->block_info;
+
+	if (bi->total_cycles)
+		return bi->cycles * 100.0 / bi->total_cycles;
+
+	return 0.0;
+}
diff --git a/tools/perf/util/block-info.h b/tools/perf/util/block-info.h
index 8309297a6e8f..e4d20bccd9b6 100644
--- a/tools/perf/util/block-info.h
+++ b/tools/perf/util/block-info.h
@@ -73,4 +73,6 @@ struct block_report *block_info__create_report(struct evlist *evlist,
 int report__browse_block_hists(struct block_hist *bh, float min_percent,
 			       struct evsel *evsel);
 
+float block_info__total_cycles_percent(struct hist_entry *he);
+
 #endif /* __PERF_BLOCK_H */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v7 7/7] perf report: Sort by sampled cycles percent per block for tui
  2019-11-07  7:47 [PATCH v7 0/7] perf report: Support sorting all blocks by cycles Jin Yao
                   ` (5 preceding siblings ...)
  2019-11-07  7:47 ` [PATCH v7 6/7] perf report: Support --percent-limit for --total-cycles Jin Yao
@ 2019-11-07  7:47 ` Jin Yao
  2019-11-07 12:38   ` Arnaldo Carvalho de Melo
  2019-11-12 11:17   ` [tip: perf/core] " tip-bot2 for Jin Yao
  2019-11-07  9:28 ` [PATCH v7 0/7] perf report: Support sorting all blocks by cycles Jiri Olsa
  7 siblings, 2 replies; 22+ messages in thread
From: Jin Yao @ 2019-11-07  7:47 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

Previous patch has implemented a new option "--total-cycles".
But only stdio mode is supported.

This patch supports the tui mode and support '--percent-limit'.

For example,

 perf record -b ./div
 perf report --total-cycles --percent-limit 1

 # Samples: 2753248 of event 'cycles'
 Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                              [Program Block Range]         Shared Object
          26.04%            2.8M        0.40%          18                                             [div.c:42 -> div.c:39]                   div
          15.17%            1.2M        0.16%           7                                 [random_r.c:357 -> random_r.c:380]          libc-2.27.so
           5.11%          402.0K        0.04%           2                                             [div.c:27 -> div.c:28]                   div
           4.87%          381.6K        0.04%           2                                     [random.c:288 -> random.c:291]          libc-2.27.so
           4.53%          381.0K        0.04%           2                                             [div.c:40 -> div.c:40]                   div
           3.85%          300.9K        0.02%           1                                             [div.c:22 -> div.c:25]                   div
           3.08%          241.1K        0.02%           1                                           [rand.c:26 -> rand.c:27]          libc-2.27.so
           3.06%          240.0K        0.02%           1                                     [random.c:291 -> random.c:291]          libc-2.27.so
           2.78%          215.7K        0.02%           1                                     [random.c:298 -> random.c:298]          libc-2.27.so
           2.52%          198.3K        0.02%           1                                     [random.c:293 -> random.c:293]          libc-2.27.so
           2.36%          184.8K        0.02%           1                                           [rand.c:28 -> rand.c:28]          libc-2.27.so
           2.33%          180.5K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
           2.28%          176.7K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
           2.20%          168.8K        0.02%           1                                         [rand@plt+0 -> rand@plt+0]                   div
           1.98%          158.2K        0.02%           1                                 [random_r.c:388 -> random_r.c:388]          libc-2.27.so
           1.57%          123.3K        0.02%           1                                             [div.c:42 -> div.c:44]                   div
           1.44%          116.0K        0.42%          19                                 [random_r.c:357 -> random_r.c:394]          libc-2.27.so

 v7:
 ---
 1. Since we have used use_browser in report__browse_block_hists
    to support stdio mode, now we also add supporting for tui.

 2. Move block tui browser code from ui/browsers/hists.c
    to block-info.c.

 v6:
 ---
 Create report__tui_browse_block_hists in block-info.c
 (codes are moved from builtin-report.c).

 v5:
 ---
 Fix a crash issue when running perf report without
 '--total-cycles'. The issue is because the internal flag
 is renamed from 'total_cycles' to 'total_cycles_mode' in
 previous patch but this patch still uses 'total_cycles'
 to check if the '--total-cycles' option is enabled, which
 causes the code to be inconsistent.

 v4:
 ---
 Since the block collection is moved out of printing in
 previous patch, this patch is updated accordingly for
 tui supporting.

 v3:
 ---
 Minor change since the function name is changed:
 block_total_cycles_percent -> block_info__total_cycles_percent

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-report.c    | 27 ++++++++++---
 tools/perf/ui/browsers/hists.c |  7 +++-
 tools/perf/ui/browsers/hists.h |  2 +
 tools/perf/util/block-info.c   | 74 +++++++++++++++++++++++++++++++++-
 4 files changed, 103 insertions(+), 7 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index ca41187525ed..1e81985b7d56 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -485,6 +485,22 @@ static size_t hists__fprintf_nr_sample_events(struct hists *hists, struct report
 	return ret + fprintf(fp, "\n#\n");
 }
 
+static int perf_evlist__tui_block_hists_browse(struct evlist *evlist,
+					       struct report *rep)
+{
+	struct evsel *pos;
+	int i = 0, ret;
+
+	evlist__for_each_entry(evlist, pos) {
+		ret = report__browse_block_hists(&rep->block_reports[i++].hist,
+						 rep->min_percent, pos);
+		if (ret != 0)
+			return ret;
+	}
+
+	return 0;
+}
+
 static int perf_evlist__tty_browse_hists(struct evlist *evlist,
 					 struct report *rep,
 					 const char *help)
@@ -595,6 +611,11 @@ static int report__browse_hists(struct report *rep)
 
 	switch (use_browser) {
 	case 1:
+		if (rep->total_cycles_mode) {
+			ret = perf_evlist__tui_block_hists_browse(evlist, rep);
+			break;
+		}
+
 		ret = perf_evlist__tui_browse_hists(evlist, help, NULL,
 						    rep->min_percent,
 						    &session->header.env,
@@ -1396,12 +1417,8 @@ int cmd_report(int argc, const char **argv)
 	if (report.total_cycles_mode) {
 		if (sort__mode != SORT_MODE__BRANCH)
 			report.total_cycles_mode = false;
-		else if (!report.use_stdio) {
-			pr_err("Error: --total-cycles can be only used together with --stdio\n");
-			goto error;
-		} else {
+		else
 			sort_order = "sym";
-		}
 	}
 
 	if (strcmp(input_name, "-") != 0)
diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 7a7187e069b4..334afc2139e7 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -26,6 +26,7 @@
 #include "../../util/sort.h"
 #include "../../util/top.h"
 #include "../../util/thread.h"
+#include "../../util/block-info.h"
 #include "../../arch/common.h"
 #include "../../perf.h"
 
@@ -1783,7 +1784,11 @@ static unsigned int hist_browser__refresh(struct ui_browser *browser)
 			continue;
 		}
 
-		percent = hist_entry__get_percent_limit(h);
+		if (symbol_conf.report_individual_block)
+			percent = block_info__total_cycles_percent(h);
+		else
+			percent = hist_entry__get_percent_limit(h);
+
 		if (percent < hb->min_pcnt)
 			continue;
 
diff --git a/tools/perf/ui/browsers/hists.h b/tools/perf/ui/browsers/hists.h
index 91d3e18b50aa..078f2f2c7abd 100644
--- a/tools/perf/ui/browsers/hists.h
+++ b/tools/perf/ui/browsers/hists.h
@@ -5,6 +5,7 @@
 #include "ui/browser.h"
 
 struct annotation_options;
+struct evsel;
 
 struct hist_browser {
 	struct ui_browser   b;
@@ -15,6 +16,7 @@ struct hist_browser {
 	struct pstack	    *pstack;
 	struct perf_env	    *env;
 	struct annotation_options *annotation_opts;
+	struct evsel	    *block_evsel;
 	int		     print_seq;
 	bool		     show_dso;
 	bool		     show_headers;
diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
index a6c2a3eea6f7..73a2dfe5ca68 100644
--- a/tools/perf/util/block-info.c
+++ b/tools/perf/util/block-info.c
@@ -10,6 +10,7 @@
 #include "map.h"
 #include "srcline.h"
 #include "evlist.h"
+#include "ui/browsers/hists.h"
 
 static struct block_header_column{
 	const char *name;
@@ -445,9 +446,75 @@ struct block_report *block_info__create_report(struct evlist *evlist,
 	return block_reports;
 }
 
+#ifdef HAVE_SLANG_SUPPORT
+static int block_hists_browser__title(struct hist_browser *browser, char *bf,
+				      size_t size)
+{
+	struct hists *hists = evsel__hists(browser->block_evsel);
+	const char *evname = perf_evsel__name(browser->block_evsel);
+	unsigned long nr_samples = hists->stats.nr_events[PERF_RECORD_SAMPLE];
+	int ret;
+
+	ret = scnprintf(bf, size, "# Samples: %lu", nr_samples);
+	if (evname)
+		scnprintf(bf + ret, size -  ret, " of event '%s'", evname);
+
+	return 0;
+}
+
+static int block_hists_tui_browse(struct block_hist *bh, struct evsel *evsel,
+				  float min_percent)
+{
+	struct hists *hists = &bh->block_hists;
+	struct hist_browser *browser;
+	int key = -1;
+	static const char help[] =
+	" q             Quit \n";
+
+	browser = hist_browser__new(hists);
+	if (!browser)
+		return -1;
+
+	browser->block_evsel = evsel;
+	browser->title = block_hists_browser__title;
+	browser->min_pcnt = min_percent;
+
+	/* reset abort key so that it can get Ctrl-C as a key */
+	SLang_reset_tty();
+	SLang_init_tty(0, 0, 0);
+
+	while (1) {
+		key = hist_browser__run(browser, "? - help", true);
+
+		switch (key) {
+		case 'q':
+			goto out;
+		case '?':
+			ui_browser__help_window(&browser->b, help);
+			break;
+		default:
+			break;
+		}
+	}
+
+out:
+	hist_browser__delete(browser);
+	return 0;
+}
+#else
+static int block_hists_tui_browse(struct block_hist *bh __maybe_unused,
+				  struct evsel *evsel __maybe_unused,
+				  float min_percent __maybe_unused)
+{
+	return 0;
+}
+#endif
+
 int report__browse_block_hists(struct block_hist *bh, float min_percent,
-			       struct evsel *evsel __maybe_unused)
+			       struct evsel *evsel)
 {
+	int ret;
+
 	switch (use_browser) {
 	case 0:
 		symbol_conf.report_individual_block = true;
@@ -455,6 +522,11 @@ int report__browse_block_hists(struct block_hist *bh, float min_percent,
 			       stdout, true);
 		hists__delete_entries(&bh->block_hists);
 		return 0;
+	case 1:
+		symbol_conf.report_individual_block = true;
+		ret = block_hists_tui_browse(bh, evsel, min_percent);
+		hists__delete_entries(&bh->block_hists);
+		return ret;
 	default:
 		return -1;
 	}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v7 0/7] perf report: Support sorting all blocks by cycles
  2019-11-07  7:47 [PATCH v7 0/7] perf report: Support sorting all blocks by cycles Jin Yao
                   ` (6 preceding siblings ...)
  2019-11-07  7:47 ` [PATCH v7 7/7] perf report: Sort by sampled cycles percent per block for tui Jin Yao
@ 2019-11-07  9:28 ` Jiri Olsa
  2019-11-07 13:09   ` Jin, Yao
  7 siblings, 1 reply; 22+ messages in thread
From: Jiri Olsa @ 2019-11-07  9:28 UTC (permalink / raw)
  To: Jin Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Thu, Nov 07, 2019 at 03:47:12PM +0800, Jin Yao wrote:
> It would be useful to support sorting for all blocks by the
> sampled cycles percent per block. This is useful to concentrate
> on the globally hottest blocks.
> 
> This patch series implements a new option "--total-cycles" which
> sorts all blocks by 'Sampled Cycles%'. The 'Sampled Cycles%' is
> block sampled cycles aggregation / total sampled cycles
> 
> For example,
> 
> perf record -b ./div
> perf report --total-cycles --stdio
> 
>  # To display the perf.data header info, please use --header/--header-only options.
>  #
>  #
>  # Total Lost Samples: 0
>  #
>  # Samples: 2M of event 'cycles'
>  # Event count (approx.): 2753248
>  #
>  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                              [Program Block Range]         Shared Object
>  # ...............  ..............  ...........  ..........  .................................................................  ....................
>  #
>             26.04%            2.8M        0.40%          18                                             [div.c:42 -> div.c:39]                   div
>             15.17%            1.2M        0.16%           7                                 [random_r.c:357 -> random_r.c:380]          libc-2.27.so
>              5.11%          402.0K        0.04%           2                                             [div.c:27 -> div.c:28]                   div
>              4.87%          381.6K        0.04%           2                                     [random.c:288 -> random.c:291]          libc-2.27.so
>              4.53%          381.0K        0.04%           2                                             [div.c:40 -> div.c:40]                   div
>              3.85%          300.9K        0.02%           1                                             [div.c:22 -> div.c:25]                   div
>              3.08%          241.1K        0.02%           1                                           [rand.c:26 -> rand.c:27]          libc-2.27.so
>              3.06%          240.0K        0.02%           1                                     [random.c:291 -> random.c:291]          libc-2.27.so
>              2.78%          215.7K        0.02%           1                                     [random.c:298 -> random.c:298]          libc-2.27.so
>              2.52%          198.3K        0.02%           1                                     [random.c:293 -> random.c:293]          libc-2.27.so
>              2.36%          184.8K        0.02%           1                                           [rand.c:28 -> rand.c:28]          libc-2.27.so
>              2.33%          180.5K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
>              2.28%          176.7K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
>              2.20%          168.8K        0.02%           1                                         [rand@plt+0 -> rand@plt+0]                   div
>              1.98%          158.2K        0.02%           1                                 [random_r.c:388 -> random_r.c:388]          libc-2.27.so
>              1.57%          123.3K        0.02%           1                                             [div.c:42 -> div.c:44]                   div
>              1.44%          116.0K        0.42%          19                                 [random_r.c:357 -> random_r.c:394]          libc-2.27.so
>  ......
> 
> This patch series supports both stdio and tui. And also with the supporting
> of --percent-limit.
> 
>  v7:
>  ---
>  Use use_browser in report__browse_block_hists and support
>  reporting for both stdio and tui modes.
>  
>  Move block tui browser code from ui/browsers/hists.c
>  to block-info.c.

thanks for bearing with me ;-)

Reviewed-by: Jiri Olsa <jolsa@kernel.org>

jirka


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v7 7/7] perf report: Sort by sampled cycles percent per block for tui
  2019-11-07  7:47 ` [PATCH v7 7/7] perf report: Sort by sampled cycles percent per block for tui Jin Yao
@ 2019-11-07 12:38   ` Arnaldo Carvalho de Melo
  2019-11-12 11:17   ` [tip: perf/core] " tip-bot2 for Jin Yao
  1 sibling, 0 replies; 22+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-07 12:38 UTC (permalink / raw)
  To: Jin Yao
  Cc: jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

I prefer that all the browsers be in tools/perf/ui/browsers, but won't,
ho-hum, block this patch because of that, we can fix that later, and
don't need to hurry, I have patches that touch 'struct map'/'struct
addr_location', that I'll have to fixup to account for this, so its
better to fix anything here after I get my work merged,

Thanks a lot for this new feature, it looks awesome!

To further clarify: I've applied everything and also test it, very good
so far, we can fix things we find later, now going thru tests on my
build farm.

Thanks Jiri for all the reviewing, appreciated, as usual,

- Arnaldo
 
> +++ b/tools/perf/util/block-info.c
> @@ -10,6 +10,7 @@
>  #include "map.h"
>  #include "srcline.h"
>  #include "evlist.h"
> +#include "ui/browsers/hists.h"
>  
>  static struct block_header_column{
>  	const char *name;
> @@ -445,9 +446,75 @@ struct block_report *block_info__create_report(struct evlist *evlist,
>  	return block_reports;
>  }
>  
> +#ifdef HAVE_SLANG_SUPPORT
> +static int block_hists_browser__title(struct hist_browser *browser, char *bf,
> +				      size_t size)
> +{
> +	struct hists *hists = evsel__hists(browser->block_evsel);
> +	const char *evname = perf_evsel__name(browser->block_evsel);
> +	unsigned long nr_samples = hists->stats.nr_events[PERF_RECORD_SAMPLE];
> +	int ret;
> +
> +	ret = scnprintf(bf, size, "# Samples: %lu", nr_samples);
> +	if (evname)
> +		scnprintf(bf + ret, size -  ret, " of event '%s'", evname);
> +
> +	return 0;
> +}
> +
> +static int block_hists_tui_browse(struct block_hist *bh, struct evsel *evsel,
> +				  float min_percent)
> +{
> +	struct hists *hists = &bh->block_hists;
> +	struct hist_browser *browser;
> +	int key = -1;
> +	static const char help[] =
> +	" q             Quit \n";
> +
> +	browser = hist_browser__new(hists);
> +	if (!browser)
> +		return -1;
> +
> +	browser->block_evsel = evsel;
> +	browser->title = block_hists_browser__title;
> +	browser->min_pcnt = min_percent;
> +
> +	/* reset abort key so that it can get Ctrl-C as a key */
> +	SLang_reset_tty();
> +	SLang_init_tty(0, 0, 0);
> +
> +	while (1) {
> +		key = hist_browser__run(browser, "? - help", true);
> +
> +		switch (key) {
> +		case 'q':
> +			goto out;
> +		case '?':
> +			ui_browser__help_window(&browser->b, help);
> +			break;
> +		default:
> +			break;
> +		}
> +	}
> +
> +out:
> +	hist_browser__delete(browser);
> +	return 0;
> +}
> +#else
> +static int block_hists_tui_browse(struct block_hist *bh __maybe_unused,
> +				  struct evsel *evsel __maybe_unused,
> +				  float min_percent __maybe_unused)
> +{
> +	return 0;
> +}
> +#endif
> +
>  int report__browse_block_hists(struct block_hist *bh, float min_percent,
> -			       struct evsel *evsel __maybe_unused)
> +			       struct evsel *evsel)
>  {
> +	int ret;
> +
>  	switch (use_browser) {
>  	case 0:
>  		symbol_conf.report_individual_block = true;
> @@ -455,6 +522,11 @@ int report__browse_block_hists(struct block_hist *bh, float min_percent,
>  			       stdout, true);
>  		hists__delete_entries(&bh->block_hists);
>  		return 0;
> +	case 1:
> +		symbol_conf.report_individual_block = true;
> +		ret = block_hists_tui_browse(bh, evsel, min_percent);
> +		hists__delete_entries(&bh->block_hists);
> +		return ret;
>  	default:
>  		return -1;
>  	}
> -- 
> 2.17.1

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v7 4/7] perf util: Support block formats with compare/sort/display
  2019-11-07  7:47 ` [PATCH v7 4/7] perf util: Support block formats with compare/sort/display Jin Yao
@ 2019-11-07 12:45   ` Arnaldo Carvalho de Melo
  2019-11-07 13:05     ` Jin, Yao
  2019-11-12 11:17   ` [tip: perf/core] perf hist: " tip-bot2 for Jin Yao
  1 sibling, 1 reply; 22+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-07 12:45 UTC (permalink / raw)
  To: Jin Yao
  Cc: jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

Em Thu, Nov 07, 2019 at 03:47:16PM +0800, Jin Yao escreveu:
> This patch provides helper routines to support new
> columns for block info output.

This breaks bisection, some of my build containers complained about:

util/block-info.c:395:20: error: unused function 'set_fmt' [-Werror,-Wunused-function]
static inline void set_fmt(struct block_fmt *block_fmt,
                   ^
1 error generated.
mv: can't rename '/tmp/build/perf/util/.block-info.o.tmp': No such file or directory


Because you add a static inline function to a .c file and don't use it,
I'll try moving it to when it finally gets used.
 
> The new columns are:
> 
> Sampled Cycles%
> Sampled Cycles
> Avg Cycles%
> Avg Cycles
> [Program Block Range]
> Shared Object
> 
>  v5:
>  ---
>  1. Move more block related functions from builtin-report.c to
>     block-info.c
> 
>  2. Set ms (map+sym) in block hist_entry. Because this info
>     is needed for reporting the block range (i.e. source line)
> 
> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
> ---
>  tools/perf/util/block-info.c | 317 +++++++++++++++++++++++++++++++++++
>  tools/perf/util/block-info.h |  33 +++-
>  tools/perf/util/hist.c       |   4 +
>  3 files changed, 352 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
> index b9954a32b8f4..1242c3a33509 100644
> --- a/tools/perf/util/block-info.c
> +++ b/tools/perf/util/block-info.c
> @@ -6,6 +6,40 @@
>  #include "sort.h"
>  #include "annotate.h"
>  #include "symbol.h"
> +#include "dso.h"
> +#include "map.h"
> +#include "srcline.h"
> +#include "evlist.h"
> +
> +static struct block_header_column{
> +	const char *name;
> +	int width;
> +} block_columns[PERF_HPP_REPORT__BLOCK_MAX_INDEX] = {
> +	[PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT] = {
> +		.name = "Sampled Cycles%",
> +		.width = 15,
> +	},
> +	[PERF_HPP_REPORT__BLOCK_LBR_CYCLES] = {
> +		.name = "Sampled Cycles",
> +		.width = 14,
> +	},
> +	[PERF_HPP_REPORT__BLOCK_CYCLES_PCT] = {
> +		.name = "Avg Cycles%",
> +		.width = 11,
> +	},
> +	[PERF_HPP_REPORT__BLOCK_AVG_CYCLES] = {
> +		.name = "Avg Cycles",
> +		.width = 10,
> +	},
> +	[PERF_HPP_REPORT__BLOCK_RANGE] = {
> +		.name = "[Program Block Range]",
> +		.width = 70,
> +	},
> +	[PERF_HPP_REPORT__BLOCK_DSO] = {
> +		.name = "Shared Object",
> +		.width = 20,
> +	}
> +};
>  
>  struct block_info *block_info__get(struct block_info *bi)
>  {
> @@ -127,3 +161,286 @@ int block_info__process_sym(struct hist_entry *he, struct block_hist *bh,
>  
>  	return 0;
>  }
> +
> +static int block_column_header(struct perf_hpp_fmt *fmt,
> +			       struct perf_hpp *hpp,
> +			       struct hists *hists __maybe_unused,
> +			       int line __maybe_unused,
> +			       int *span __maybe_unused)
> +{
> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
> +
> +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
> +			 block_fmt->header);
> +}
> +
> +static int block_column_width(struct perf_hpp_fmt *fmt,
> +			      struct perf_hpp *hpp __maybe_unused,
> +			      struct hists *hists __maybe_unused)
> +{
> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
> +
> +	return block_fmt->width;
> +}
> +
> +static int block_total_cycles_pct_entry(struct perf_hpp_fmt *fmt,
> +					struct perf_hpp *hpp,
> +					struct hist_entry *he)
> +{
> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
> +	struct block_info *bi = he->block_info;
> +	double ratio = 0.0;
> +	char buf[16];
> +
> +	if (block_fmt->total_cycles)
> +		ratio = (double)bi->cycles / (double)block_fmt->total_cycles;
> +
> +	sprintf(buf, "%.2f%%", 100.0 * ratio);
> +
> +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width, buf);
> +}
> +
> +static int64_t block_total_cycles_pct_sort(struct perf_hpp_fmt *fmt,
> +					   struct hist_entry *left,
> +					   struct hist_entry *right)
> +{
> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
> +	struct block_info *bi_l = left->block_info;
> +	struct block_info *bi_r = right->block_info;
> +	double l, r;
> +
> +	if (block_fmt->total_cycles) {
> +		l = ((double)bi_l->cycles /
> +			(double)block_fmt->total_cycles) * 100000.0;
> +		r = ((double)bi_r->cycles /
> +			(double)block_fmt->total_cycles) * 100000.0;
> +		return (int64_t)l - (int64_t)r;
> +	}
> +
> +	return 0;
> +}
> +
> +static void cycles_string(u64 cycles, char *buf, int size)
> +{
> +	if (cycles >= 1000000)
> +		scnprintf(buf, size, "%.1fM", (double)cycles / 1000000.0);
> +	else if (cycles >= 1000)
> +		scnprintf(buf, size, "%.1fK", (double)cycles / 1000.0);
> +	else
> +		scnprintf(buf, size, "%1d", cycles);
> +}
> +
> +static int block_cycles_lbr_entry(struct perf_hpp_fmt *fmt,
> +				  struct perf_hpp *hpp, struct hist_entry *he)
> +{
> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
> +	struct block_info *bi = he->block_info;
> +	char cycles_buf[16];
> +
> +	cycles_string(bi->cycles_aggr, cycles_buf, sizeof(cycles_buf));
> +
> +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
> +			 cycles_buf);
> +}
> +
> +static int block_cycles_pct_entry(struct perf_hpp_fmt *fmt,
> +				  struct perf_hpp *hpp, struct hist_entry *he)
> +{
> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
> +	struct block_info *bi = he->block_info;
> +	double ratio = 0.0;
> +	u64 avg;
> +	char buf[16];
> +
> +	if (block_fmt->block_cycles && bi->num_aggr) {
> +		avg = bi->cycles_aggr / bi->num_aggr;
> +		ratio = (double)avg / (double)block_fmt->block_cycles;
> +	}
> +
> +	sprintf(buf, "%.2f%%", 100.0 * ratio);
> +
> +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width, buf);
> +}
> +
> +static int block_avg_cycles_entry(struct perf_hpp_fmt *fmt,
> +				  struct perf_hpp *hpp,
> +				  struct hist_entry *he)
> +{
> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
> +	struct block_info *bi = he->block_info;
> +	char cycles_buf[16];
> +
> +	cycles_string(bi->cycles_aggr / bi->num_aggr, cycles_buf,
> +		      sizeof(cycles_buf));
> +
> +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
> +			 cycles_buf);
> +}
> +
> +static int block_range_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
> +			     struct hist_entry *he)
> +{
> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
> +	struct block_info *bi = he->block_info;
> +	char buf[128];
> +	char *start_line, *end_line;
> +
> +	symbol_conf.disable_add2line_warn = true;
> +
> +	start_line = map__srcline(he->ms.map, bi->sym->start + bi->start,
> +				  he->ms.sym);
> +
> +	end_line = map__srcline(he->ms.map, bi->sym->start + bi->end,
> +				he->ms.sym);
> +
> +	if ((start_line != SRCLINE_UNKNOWN) && (end_line != SRCLINE_UNKNOWN)) {
> +		scnprintf(buf, sizeof(buf), "[%s -> %s]",
> +			  start_line, end_line);
> +	} else {
> +		scnprintf(buf, sizeof(buf), "[%7lx -> %7lx]",
> +			  bi->start, bi->end);
> +	}
> +
> +	free_srcline(start_line);
> +	free_srcline(end_line);
> +
> +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width, buf);
> +}
> +
> +static int block_dso_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
> +			   struct hist_entry *he)
> +{
> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
> +	struct map *map = he->ms.map;
> +
> +	if (map && map->dso) {
> +		return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
> +				 map->dso->short_name);
> +	}
> +
> +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
> +			 "[unknown]");
> +}
> +
> +static void init_block_header(struct block_fmt *block_fmt)
> +{
> +	struct perf_hpp_fmt *fmt = &block_fmt->fmt;
> +
> +	BUG_ON(block_fmt->idx >= PERF_HPP_REPORT__BLOCK_MAX_INDEX);
> +
> +	block_fmt->header = block_columns[block_fmt->idx].name;
> +	block_fmt->width = block_columns[block_fmt->idx].width;
> +
> +	fmt->header = block_column_header;
> +	fmt->width = block_column_width;
> +}
> +
> +static void hpp_register(struct block_fmt *block_fmt, int idx,
> +			 struct perf_hpp_list *hpp_list)
> +{
> +	struct perf_hpp_fmt *fmt = &block_fmt->fmt;
> +
> +	block_fmt->idx = idx;
> +	INIT_LIST_HEAD(&fmt->list);
> +	INIT_LIST_HEAD(&fmt->sort_list);
> +
> +	switch (idx) {
> +	case PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT:
> +		fmt->entry = block_total_cycles_pct_entry;
> +		fmt->cmp = block_info__cmp;
> +		fmt->sort = block_total_cycles_pct_sort;
> +		break;
> +	case PERF_HPP_REPORT__BLOCK_LBR_CYCLES:
> +		fmt->entry = block_cycles_lbr_entry;
> +		break;
> +	case PERF_HPP_REPORT__BLOCK_CYCLES_PCT:
> +		fmt->entry = block_cycles_pct_entry;
> +		break;
> +	case PERF_HPP_REPORT__BLOCK_AVG_CYCLES:
> +		fmt->entry = block_avg_cycles_entry;
> +		break;
> +	case PERF_HPP_REPORT__BLOCK_RANGE:
> +		fmt->entry = block_range_entry;
> +		break;
> +	case PERF_HPP_REPORT__BLOCK_DSO:
> +		fmt->entry = block_dso_entry;
> +		break;
> +	default:
> +		return;
> +	}
> +
> +	init_block_header(block_fmt);
> +	perf_hpp_list__column_register(hpp_list, fmt);
> +}
> +
> +static void register_block_columns(struct perf_hpp_list *hpp_list,
> +				   struct block_fmt *block_fmts)
> +{
> +	for (int i = 0; i < PERF_HPP_REPORT__BLOCK_MAX_INDEX; i++)
> +		hpp_register(&block_fmts[i], i, hpp_list);
> +}
> +
> +static void init_block_hist(struct block_hist *bh, struct block_fmt *block_fmts)
> +{
> +	__hists__init(&bh->block_hists, &bh->block_list);
> +	perf_hpp_list__init(&bh->block_list);
> +	bh->block_list.nr_header_lines = 1;
> +
> +	register_block_columns(&bh->block_list, block_fmts);
> +
> +	perf_hpp_list__register_sort_field(&bh->block_list,
> +		&block_fmts[PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT].fmt);
> +}
> +
> +static inline void set_fmt(struct block_fmt *block_fmt,
> +			   u64 total_cycles, u64 block_cycles)
> +{
> +	block_fmt->total_cycles = total_cycles;
> +	block_fmt->block_cycles = block_cycles;
> +}
> +
> +static void process_block_report(struct hists *hists,
> +				 struct block_report *block_report,
> +				 u64 total_cycles)
> +{
> +	struct rb_node *next = rb_first_cached(&hists->entries);
> +	struct block_hist *bh = &block_report->hist;
> +	struct hist_entry *he;
> +
> +	init_block_hist(bh, block_report->fmts);
> +
> +	while (next) {
> +		he = rb_entry(next, struct hist_entry, rb_node);
> +		block_info__process_sym(he, bh, &block_report->cycles,
> +					total_cycles);
> +		next = rb_next(&he->rb_node);
> +	}
> +
> +	for (int i = 0; i < PERF_HPP_REPORT__BLOCK_MAX_INDEX; i++) {
> +		block_report->fmts[i].total_cycles = total_cycles;
> +		block_report->fmts[i].block_cycles = block_report->cycles;
> +	}
> +
> +	hists__output_resort(&bh->block_hists, NULL);
> +}
> +
> +struct block_report *block_info__create_report(struct evlist *evlist,
> +					       u64 total_cycles)
> +{
> +	struct block_report *block_reports;
> +	int nr_hists = evlist->core.nr_entries, i = 0;
> +	struct evsel *pos;
> +
> +	block_reports = calloc(nr_hists, sizeof(struct block_report));
> +	if (!block_reports)
> +		return NULL;
> +
> +	evlist__for_each_entry(evlist, pos) {
> +		struct hists *hists = evsel__hists(pos);
> +
> +		process_block_report(hists, &block_reports[i], total_cycles);
> +		i++;
> +	}
> +
> +	return block_reports;
> +}
> diff --git a/tools/perf/util/block-info.h b/tools/perf/util/block-info.h
> index d55dfc2fda6f..b5266588d476 100644
> --- a/tools/perf/util/block-info.h
> +++ b/tools/perf/util/block-info.h
> @@ -4,8 +4,9 @@
>  
>  #include <linux/types.h>
>  #include <linux/refcount.h>
> -#include "util/hist.h"
> -#include "util/symbol.h"
> +#include "hist.h"
> +#include "symbol.h"
> +#include "sort.h"
>  
>  struct block_info {
>  	struct symbol		*sym;
> @@ -20,6 +21,31 @@ struct block_info {
>  	refcount_t		refcnt;
>  };
>  
> +struct block_fmt {
> +	struct perf_hpp_fmt	fmt;
> +	int			idx;
> +	int			width;
> +	const char		*header;
> +	u64			total_cycles;
> +	u64			block_cycles;
> +};
> +
> +enum {
> +	PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT,
> +	PERF_HPP_REPORT__BLOCK_LBR_CYCLES,
> +	PERF_HPP_REPORT__BLOCK_CYCLES_PCT,
> +	PERF_HPP_REPORT__BLOCK_AVG_CYCLES,
> +	PERF_HPP_REPORT__BLOCK_RANGE,
> +	PERF_HPP_REPORT__BLOCK_DSO,
> +	PERF_HPP_REPORT__BLOCK_MAX_INDEX
> +};
> +
> +struct block_report {
> +	struct block_hist	hist;
> +	u64			cycles;
> +	struct block_fmt	fmts[PERF_HPP_REPORT__BLOCK_MAX_INDEX];
> +};
> +
>  struct block_hist;
>  
>  struct block_info *block_info__new(void);
> @@ -40,4 +66,7 @@ int64_t block_info__cmp(struct perf_hpp_fmt *fmt __maybe_unused,
>  int block_info__process_sym(struct hist_entry *he, struct block_hist *bh,
>  			    u64 *block_cycles_aggr, u64 total_cycles);
>  
> +struct block_report *block_info__create_report(struct evlist *evlist,
> +					       u64 total_cycles);
> +
>  #endif /* __PERF_BLOCK_H */
> diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> index 0e27d6830011..7cf137b0451b 100644
> --- a/tools/perf/util/hist.c
> +++ b/tools/perf/util/hist.c
> @@ -758,6 +758,10 @@ struct hist_entry *hists__add_entry_block(struct hists *hists,
>  	struct hist_entry entry = {
>  		.block_info = block_info,
>  		.hists = hists,
> +		.ms = {
> +			.map = al->map,
> +			.sym = al->sym,
> +		},
>  	}, *he = hists__findnew_entry(hists, &entry, al, false);
>  
>  	return he;
> -- 
> 2.17.1

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v7 4/7] perf util: Support block formats with compare/sort/display
  2019-11-07 12:45   ` Arnaldo Carvalho de Melo
@ 2019-11-07 13:05     ` Jin, Yao
  2019-11-07 13:17       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 22+ messages in thread
From: Jin, Yao @ 2019-11-07 13:05 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 11/7/2019 8:45 PM, Arnaldo Carvalho de Melo wrote:
> Em Thu, Nov 07, 2019 at 03:47:16PM +0800, Jin Yao escreveu:
>> This patch provides helper routines to support new
>> columns for block info output.
> 
> This breaks bisection, some of my build containers complained about:
> 
> util/block-info.c:395:20: error: unused function 'set_fmt' [-Werror,-Wunused-function]
> static inline void set_fmt(struct block_fmt *block_fmt,
>                     ^
> 1 error generated.
> mv: can't rename '/tmp/build/perf/util/.block-info.o.tmp': No such file or directory
> 
> 
> Because you add a static inline function to a .c file and don't use it,
> I'll try moving it to when it finally gets used.
>   

Oh, very sorry, my fault. I should delete the set_fmt(). In new version, 
it's not used. But unfortunately my build machine didn't report the 
error. :(

Thanks
Jin Yao

>> The new columns are:
>>
>> Sampled Cycles%
>> Sampled Cycles
>> Avg Cycles%
>> Avg Cycles
>> [Program Block Range]
>> Shared Object
>>
>>   v5:
>>   ---
>>   1. Move more block related functions from builtin-report.c to
>>      block-info.c
>>
>>   2. Set ms (map+sym) in block hist_entry. Because this info
>>      is needed for reporting the block range (i.e. source line)
>>
>> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
>> ---
>>   tools/perf/util/block-info.c | 317 +++++++++++++++++++++++++++++++++++
>>   tools/perf/util/block-info.h |  33 +++-
>>   tools/perf/util/hist.c       |   4 +
>>   3 files changed, 352 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
>> index b9954a32b8f4..1242c3a33509 100644
>> --- a/tools/perf/util/block-info.c
>> +++ b/tools/perf/util/block-info.c
>> @@ -6,6 +6,40 @@
>>   #include "sort.h"
>>   #include "annotate.h"
>>   #include "symbol.h"
>> +#include "dso.h"
>> +#include "map.h"
>> +#include "srcline.h"
>> +#include "evlist.h"
>> +
>> +static struct block_header_column{
>> +	const char *name;
>> +	int width;
>> +} block_columns[PERF_HPP_REPORT__BLOCK_MAX_INDEX] = {
>> +	[PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT] = {
>> +		.name = "Sampled Cycles%",
>> +		.width = 15,
>> +	},
>> +	[PERF_HPP_REPORT__BLOCK_LBR_CYCLES] = {
>> +		.name = "Sampled Cycles",
>> +		.width = 14,
>> +	},
>> +	[PERF_HPP_REPORT__BLOCK_CYCLES_PCT] = {
>> +		.name = "Avg Cycles%",
>> +		.width = 11,
>> +	},
>> +	[PERF_HPP_REPORT__BLOCK_AVG_CYCLES] = {
>> +		.name = "Avg Cycles",
>> +		.width = 10,
>> +	},
>> +	[PERF_HPP_REPORT__BLOCK_RANGE] = {
>> +		.name = "[Program Block Range]",
>> +		.width = 70,
>> +	},
>> +	[PERF_HPP_REPORT__BLOCK_DSO] = {
>> +		.name = "Shared Object",
>> +		.width = 20,
>> +	}
>> +};
>>   
>>   struct block_info *block_info__get(struct block_info *bi)
>>   {
>> @@ -127,3 +161,286 @@ int block_info__process_sym(struct hist_entry *he, struct block_hist *bh,
>>   
>>   	return 0;
>>   }
>> +
>> +static int block_column_header(struct perf_hpp_fmt *fmt,
>> +			       struct perf_hpp *hpp,
>> +			       struct hists *hists __maybe_unused,
>> +			       int line __maybe_unused,
>> +			       int *span __maybe_unused)
>> +{
>> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
>> +
>> +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
>> +			 block_fmt->header);
>> +}
>> +
>> +static int block_column_width(struct perf_hpp_fmt *fmt,
>> +			      struct perf_hpp *hpp __maybe_unused,
>> +			      struct hists *hists __maybe_unused)
>> +{
>> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
>> +
>> +	return block_fmt->width;
>> +}
>> +
>> +static int block_total_cycles_pct_entry(struct perf_hpp_fmt *fmt,
>> +					struct perf_hpp *hpp,
>> +					struct hist_entry *he)
>> +{
>> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
>> +	struct block_info *bi = he->block_info;
>> +	double ratio = 0.0;
>> +	char buf[16];
>> +
>> +	if (block_fmt->total_cycles)
>> +		ratio = (double)bi->cycles / (double)block_fmt->total_cycles;
>> +
>> +	sprintf(buf, "%.2f%%", 100.0 * ratio);
>> +
>> +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width, buf);
>> +}
>> +
>> +static int64_t block_total_cycles_pct_sort(struct perf_hpp_fmt *fmt,
>> +					   struct hist_entry *left,
>> +					   struct hist_entry *right)
>> +{
>> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
>> +	struct block_info *bi_l = left->block_info;
>> +	struct block_info *bi_r = right->block_info;
>> +	double l, r;
>> +
>> +	if (block_fmt->total_cycles) {
>> +		l = ((double)bi_l->cycles /
>> +			(double)block_fmt->total_cycles) * 100000.0;
>> +		r = ((double)bi_r->cycles /
>> +			(double)block_fmt->total_cycles) * 100000.0;
>> +		return (int64_t)l - (int64_t)r;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static void cycles_string(u64 cycles, char *buf, int size)
>> +{
>> +	if (cycles >= 1000000)
>> +		scnprintf(buf, size, "%.1fM", (double)cycles / 1000000.0);
>> +	else if (cycles >= 1000)
>> +		scnprintf(buf, size, "%.1fK", (double)cycles / 1000.0);
>> +	else
>> +		scnprintf(buf, size, "%1d", cycles);
>> +}
>> +
>> +static int block_cycles_lbr_entry(struct perf_hpp_fmt *fmt,
>> +				  struct perf_hpp *hpp, struct hist_entry *he)
>> +{
>> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
>> +	struct block_info *bi = he->block_info;
>> +	char cycles_buf[16];
>> +
>> +	cycles_string(bi->cycles_aggr, cycles_buf, sizeof(cycles_buf));
>> +
>> +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
>> +			 cycles_buf);
>> +}
>> +
>> +static int block_cycles_pct_entry(struct perf_hpp_fmt *fmt,
>> +				  struct perf_hpp *hpp, struct hist_entry *he)
>> +{
>> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
>> +	struct block_info *bi = he->block_info;
>> +	double ratio = 0.0;
>> +	u64 avg;
>> +	char buf[16];
>> +
>> +	if (block_fmt->block_cycles && bi->num_aggr) {
>> +		avg = bi->cycles_aggr / bi->num_aggr;
>> +		ratio = (double)avg / (double)block_fmt->block_cycles;
>> +	}
>> +
>> +	sprintf(buf, "%.2f%%", 100.0 * ratio);
>> +
>> +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width, buf);
>> +}
>> +
>> +static int block_avg_cycles_entry(struct perf_hpp_fmt *fmt,
>> +				  struct perf_hpp *hpp,
>> +				  struct hist_entry *he)
>> +{
>> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
>> +	struct block_info *bi = he->block_info;
>> +	char cycles_buf[16];
>> +
>> +	cycles_string(bi->cycles_aggr / bi->num_aggr, cycles_buf,
>> +		      sizeof(cycles_buf));
>> +
>> +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
>> +			 cycles_buf);
>> +}
>> +
>> +static int block_range_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
>> +			     struct hist_entry *he)
>> +{
>> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
>> +	struct block_info *bi = he->block_info;
>> +	char buf[128];
>> +	char *start_line, *end_line;
>> +
>> +	symbol_conf.disable_add2line_warn = true;
>> +
>> +	start_line = map__srcline(he->ms.map, bi->sym->start + bi->start,
>> +				  he->ms.sym);
>> +
>> +	end_line = map__srcline(he->ms.map, bi->sym->start + bi->end,
>> +				he->ms.sym);
>> +
>> +	if ((start_line != SRCLINE_UNKNOWN) && (end_line != SRCLINE_UNKNOWN)) {
>> +		scnprintf(buf, sizeof(buf), "[%s -> %s]",
>> +			  start_line, end_line);
>> +	} else {
>> +		scnprintf(buf, sizeof(buf), "[%7lx -> %7lx]",
>> +			  bi->start, bi->end);
>> +	}
>> +
>> +	free_srcline(start_line);
>> +	free_srcline(end_line);
>> +
>> +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width, buf);
>> +}
>> +
>> +static int block_dso_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
>> +			   struct hist_entry *he)
>> +{
>> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
>> +	struct map *map = he->ms.map;
>> +
>> +	if (map && map->dso) {
>> +		return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
>> +				 map->dso->short_name);
>> +	}
>> +
>> +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
>> +			 "[unknown]");
>> +}
>> +
>> +static void init_block_header(struct block_fmt *block_fmt)
>> +{
>> +	struct perf_hpp_fmt *fmt = &block_fmt->fmt;
>> +
>> +	BUG_ON(block_fmt->idx >= PERF_HPP_REPORT__BLOCK_MAX_INDEX);
>> +
>> +	block_fmt->header = block_columns[block_fmt->idx].name;
>> +	block_fmt->width = block_columns[block_fmt->idx].width;
>> +
>> +	fmt->header = block_column_header;
>> +	fmt->width = block_column_width;
>> +}
>> +
>> +static void hpp_register(struct block_fmt *block_fmt, int idx,
>> +			 struct perf_hpp_list *hpp_list)
>> +{
>> +	struct perf_hpp_fmt *fmt = &block_fmt->fmt;
>> +
>> +	block_fmt->idx = idx;
>> +	INIT_LIST_HEAD(&fmt->list);
>> +	INIT_LIST_HEAD(&fmt->sort_list);
>> +
>> +	switch (idx) {
>> +	case PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT:
>> +		fmt->entry = block_total_cycles_pct_entry;
>> +		fmt->cmp = block_info__cmp;
>> +		fmt->sort = block_total_cycles_pct_sort;
>> +		break;
>> +	case PERF_HPP_REPORT__BLOCK_LBR_CYCLES:
>> +		fmt->entry = block_cycles_lbr_entry;
>> +		break;
>> +	case PERF_HPP_REPORT__BLOCK_CYCLES_PCT:
>> +		fmt->entry = block_cycles_pct_entry;
>> +		break;
>> +	case PERF_HPP_REPORT__BLOCK_AVG_CYCLES:
>> +		fmt->entry = block_avg_cycles_entry;
>> +		break;
>> +	case PERF_HPP_REPORT__BLOCK_RANGE:
>> +		fmt->entry = block_range_entry;
>> +		break;
>> +	case PERF_HPP_REPORT__BLOCK_DSO:
>> +		fmt->entry = block_dso_entry;
>> +		break;
>> +	default:
>> +		return;
>> +	}
>> +
>> +	init_block_header(block_fmt);
>> +	perf_hpp_list__column_register(hpp_list, fmt);
>> +}
>> +
>> +static void register_block_columns(struct perf_hpp_list *hpp_list,
>> +				   struct block_fmt *block_fmts)
>> +{
>> +	for (int i = 0; i < PERF_HPP_REPORT__BLOCK_MAX_INDEX; i++)
>> +		hpp_register(&block_fmts[i], i, hpp_list);
>> +}
>> +
>> +static void init_block_hist(struct block_hist *bh, struct block_fmt *block_fmts)
>> +{
>> +	__hists__init(&bh->block_hists, &bh->block_list);
>> +	perf_hpp_list__init(&bh->block_list);
>> +	bh->block_list.nr_header_lines = 1;
>> +
>> +	register_block_columns(&bh->block_list, block_fmts);
>> +
>> +	perf_hpp_list__register_sort_field(&bh->block_list,
>> +		&block_fmts[PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT].fmt);
>> +}
>> +
>> +static inline void set_fmt(struct block_fmt *block_fmt,
>> +			   u64 total_cycles, u64 block_cycles)
>> +{
>> +	block_fmt->total_cycles = total_cycles;
>> +	block_fmt->block_cycles = block_cycles;
>> +}
>> +
>> +static void process_block_report(struct hists *hists,
>> +				 struct block_report *block_report,
>> +				 u64 total_cycles)
>> +{
>> +	struct rb_node *next = rb_first_cached(&hists->entries);
>> +	struct block_hist *bh = &block_report->hist;
>> +	struct hist_entry *he;
>> +
>> +	init_block_hist(bh, block_report->fmts);
>> +
>> +	while (next) {
>> +		he = rb_entry(next, struct hist_entry, rb_node);
>> +		block_info__process_sym(he, bh, &block_report->cycles,
>> +					total_cycles);
>> +		next = rb_next(&he->rb_node);
>> +	}
>> +
>> +	for (int i = 0; i < PERF_HPP_REPORT__BLOCK_MAX_INDEX; i++) {
>> +		block_report->fmts[i].total_cycles = total_cycles;
>> +		block_report->fmts[i].block_cycles = block_report->cycles;
>> +	}
>> +
>> +	hists__output_resort(&bh->block_hists, NULL);
>> +}
>> +
>> +struct block_report *block_info__create_report(struct evlist *evlist,
>> +					       u64 total_cycles)
>> +{
>> +	struct block_report *block_reports;
>> +	int nr_hists = evlist->core.nr_entries, i = 0;
>> +	struct evsel *pos;
>> +
>> +	block_reports = calloc(nr_hists, sizeof(struct block_report));
>> +	if (!block_reports)
>> +		return NULL;
>> +
>> +	evlist__for_each_entry(evlist, pos) {
>> +		struct hists *hists = evsel__hists(pos);
>> +
>> +		process_block_report(hists, &block_reports[i], total_cycles);
>> +		i++;
>> +	}
>> +
>> +	return block_reports;
>> +}
>> diff --git a/tools/perf/util/block-info.h b/tools/perf/util/block-info.h
>> index d55dfc2fda6f..b5266588d476 100644
>> --- a/tools/perf/util/block-info.h
>> +++ b/tools/perf/util/block-info.h
>> @@ -4,8 +4,9 @@
>>   
>>   #include <linux/types.h>
>>   #include <linux/refcount.h>
>> -#include "util/hist.h"
>> -#include "util/symbol.h"
>> +#include "hist.h"
>> +#include "symbol.h"
>> +#include "sort.h"
>>   
>>   struct block_info {
>>   	struct symbol		*sym;
>> @@ -20,6 +21,31 @@ struct block_info {
>>   	refcount_t		refcnt;
>>   };
>>   
>> +struct block_fmt {
>> +	struct perf_hpp_fmt	fmt;
>> +	int			idx;
>> +	int			width;
>> +	const char		*header;
>> +	u64			total_cycles;
>> +	u64			block_cycles;
>> +};
>> +
>> +enum {
>> +	PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT,
>> +	PERF_HPP_REPORT__BLOCK_LBR_CYCLES,
>> +	PERF_HPP_REPORT__BLOCK_CYCLES_PCT,
>> +	PERF_HPP_REPORT__BLOCK_AVG_CYCLES,
>> +	PERF_HPP_REPORT__BLOCK_RANGE,
>> +	PERF_HPP_REPORT__BLOCK_DSO,
>> +	PERF_HPP_REPORT__BLOCK_MAX_INDEX
>> +};
>> +
>> +struct block_report {
>> +	struct block_hist	hist;
>> +	u64			cycles;
>> +	struct block_fmt	fmts[PERF_HPP_REPORT__BLOCK_MAX_INDEX];
>> +};
>> +
>>   struct block_hist;
>>   
>>   struct block_info *block_info__new(void);
>> @@ -40,4 +66,7 @@ int64_t block_info__cmp(struct perf_hpp_fmt *fmt __maybe_unused,
>>   int block_info__process_sym(struct hist_entry *he, struct block_hist *bh,
>>   			    u64 *block_cycles_aggr, u64 total_cycles);
>>   
>> +struct block_report *block_info__create_report(struct evlist *evlist,
>> +					       u64 total_cycles);
>> +
>>   #endif /* __PERF_BLOCK_H */
>> diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
>> index 0e27d6830011..7cf137b0451b 100644
>> --- a/tools/perf/util/hist.c
>> +++ b/tools/perf/util/hist.c
>> @@ -758,6 +758,10 @@ struct hist_entry *hists__add_entry_block(struct hists *hists,
>>   	struct hist_entry entry = {
>>   		.block_info = block_info,
>>   		.hists = hists,
>> +		.ms = {
>> +			.map = al->map,
>> +			.sym = al->sym,
>> +		},
>>   	}, *he = hists__findnew_entry(hists, &entry, al, false);
>>   
>>   	return he;
>> -- 
>> 2.17.1
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v7 0/7] perf report: Support sorting all blocks by cycles
  2019-11-07  9:28 ` [PATCH v7 0/7] perf report: Support sorting all blocks by cycles Jiri Olsa
@ 2019-11-07 13:09   ` Jin, Yao
  0 siblings, 0 replies; 22+ messages in thread
From: Jin, Yao @ 2019-11-07 13:09 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 11/7/2019 5:28 PM, Jiri Olsa wrote:
> On Thu, Nov 07, 2019 at 03:47:12PM +0800, Jin Yao wrote:
>> It would be useful to support sorting for all blocks by the
>> sampled cycles percent per block. This is useful to concentrate
>> on the globally hottest blocks.
>>
>> This patch series implements a new option "--total-cycles" which
>> sorts all blocks by 'Sampled Cycles%'. The 'Sampled Cycles%' is
>> block sampled cycles aggregation / total sampled cycles
>>
>> For example,
>>
>> perf record -b ./div
>> perf report --total-cycles --stdio
>>
>>   # To display the perf.data header info, please use --header/--header-only options.
>>   #
>>   #
>>   # Total Lost Samples: 0
>>   #
>>   # Samples: 2M of event 'cycles'
>>   # Event count (approx.): 2753248
>>   #
>>   # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                              [Program Block Range]         Shared Object
>>   # ...............  ..............  ...........  ..........  .................................................................  ....................
>>   #
>>              26.04%            2.8M        0.40%          18                                             [div.c:42 -> div.c:39]                   div
>>              15.17%            1.2M        0.16%           7                                 [random_r.c:357 -> random_r.c:380]          libc-2.27.so
>>               5.11%          402.0K        0.04%           2                                             [div.c:27 -> div.c:28]                   div
>>               4.87%          381.6K        0.04%           2                                     [random.c:288 -> random.c:291]          libc-2.27.so
>>               4.53%          381.0K        0.04%           2                                             [div.c:40 -> div.c:40]                   div
>>               3.85%          300.9K        0.02%           1                                             [div.c:22 -> div.c:25]                   div
>>               3.08%          241.1K        0.02%           1                                           [rand.c:26 -> rand.c:27]          libc-2.27.so
>>               3.06%          240.0K        0.02%           1                                     [random.c:291 -> random.c:291]          libc-2.27.so
>>               2.78%          215.7K        0.02%           1                                     [random.c:298 -> random.c:298]          libc-2.27.so
>>               2.52%          198.3K        0.02%           1                                     [random.c:293 -> random.c:293]          libc-2.27.so
>>               2.36%          184.8K        0.02%           1                                           [rand.c:28 -> rand.c:28]          libc-2.27.so
>>               2.33%          180.5K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
>>               2.28%          176.7K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
>>               2.20%          168.8K        0.02%           1                                         [rand@plt+0 -> rand@plt+0]                   div
>>               1.98%          158.2K        0.02%           1                                 [random_r.c:388 -> random_r.c:388]          libc-2.27.so
>>               1.57%          123.3K        0.02%           1                                             [div.c:42 -> div.c:44]                   div
>>               1.44%          116.0K        0.42%          19                                 [random_r.c:357 -> random_r.c:394]          libc-2.27.so
>>   ......
>>
>> This patch series supports both stdio and tui. And also with the supporting
>> of --percent-limit.
>>
>>   v7:
>>   ---
>>   Use use_browser in report__browse_block_hists and support
>>   reporting for both stdio and tui modes.
>>   
>>   Move block tui browser code from ui/browsers/hists.c
>>   to block-info.c.
> 
> thanks for bearing with me ;-)
> 
> Reviewed-by: Jiri Olsa <jolsa@kernel.org>
> 
> jirka
> 

Hi Jiri,

I really appreciate your help! Your review comments really let the v7 be 
much better than the initial version.

Thanks
Jin Yao

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v7 4/7] perf util: Support block formats with compare/sort/display
  2019-11-07 13:05     ` Jin, Yao
@ 2019-11-07 13:17       ` Arnaldo Carvalho de Melo
  2019-11-07 13:21         ` Jin, Yao
  0 siblings, 1 reply; 22+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-07 13:17 UTC (permalink / raw)
  To: Jin, Yao
  Cc: Arnaldo Carvalho de Melo, jolsa, peterz, mingo,
	alexander.shishkin, Linux-kernel, ak, kan.liang, yao.jin

Em Thu, Nov 07, 2019 at 09:05:21PM +0800, Jin, Yao escreveu:
> 
> 
> On 11/7/2019 8:45 PM, Arnaldo Carvalho de Melo wrote:
> > Em Thu, Nov 07, 2019 at 03:47:16PM +0800, Jin Yao escreveu:
> > > This patch provides helper routines to support new
> > > columns for block info output.
> > 
> > This breaks bisection, some of my build containers complained about:
> > 
> > util/block-info.c:395:20: error: unused function 'set_fmt' [-Werror,-Wunused-function]
> > static inline void set_fmt(struct block_fmt *block_fmt,
> >                     ^
> > 1 error generated.
> > mv: can't rename '/tmp/build/perf/util/.block-info.o.tmp': No such file or directory
> > 
> > 
> > Because you add a static inline function to a .c file and don't use it,
> > I'll try moving it to when it finally gets used.
> 
> Oh, very sorry, my fault. I should delete the set_fmt(). In new version,
> it's not used. But unfortunately my build machine didn't report the error.
> :(

I got it just because I had the containers building it as I was
applying your patches, got it from several, but had difficulty
reproducing, no problem fixed now, I double checked and it really isn't
being used.

No need to resend, I already fixed this and restarted the full container
build series.

- Arnaldo
 
> Thanks
> Jin Yao
> 
> > > The new columns are:
> > > 
> > > Sampled Cycles%
> > > Sampled Cycles
> > > Avg Cycles%
> > > Avg Cycles
> > > [Program Block Range]
> > > Shared Object
> > > 
> > >   v5:
> > >   ---
> > >   1. Move more block related functions from builtin-report.c to
> > >      block-info.c
> > > 
> > >   2. Set ms (map+sym) in block hist_entry. Because this info
> > >      is needed for reporting the block range (i.e. source line)
> > > 
> > > Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
> > > ---
> > >   tools/perf/util/block-info.c | 317 +++++++++++++++++++++++++++++++++++
> > >   tools/perf/util/block-info.h |  33 +++-
> > >   tools/perf/util/hist.c       |   4 +
> > >   3 files changed, 352 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
> > > index b9954a32b8f4..1242c3a33509 100644
> > > --- a/tools/perf/util/block-info.c
> > > +++ b/tools/perf/util/block-info.c
> > > @@ -6,6 +6,40 @@
> > >   #include "sort.h"
> > >   #include "annotate.h"
> > >   #include "symbol.h"
> > > +#include "dso.h"
> > > +#include "map.h"
> > > +#include "srcline.h"
> > > +#include "evlist.h"
> > > +
> > > +static struct block_header_column{
> > > +	const char *name;
> > > +	int width;
> > > +} block_columns[PERF_HPP_REPORT__BLOCK_MAX_INDEX] = {
> > > +	[PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT] = {
> > > +		.name = "Sampled Cycles%",
> > > +		.width = 15,
> > > +	},
> > > +	[PERF_HPP_REPORT__BLOCK_LBR_CYCLES] = {
> > > +		.name = "Sampled Cycles",
> > > +		.width = 14,
> > > +	},
> > > +	[PERF_HPP_REPORT__BLOCK_CYCLES_PCT] = {
> > > +		.name = "Avg Cycles%",
> > > +		.width = 11,
> > > +	},
> > > +	[PERF_HPP_REPORT__BLOCK_AVG_CYCLES] = {
> > > +		.name = "Avg Cycles",
> > > +		.width = 10,
> > > +	},
> > > +	[PERF_HPP_REPORT__BLOCK_RANGE] = {
> > > +		.name = "[Program Block Range]",
> > > +		.width = 70,
> > > +	},
> > > +	[PERF_HPP_REPORT__BLOCK_DSO] = {
> > > +		.name = "Shared Object",
> > > +		.width = 20,
> > > +	}
> > > +};
> > >   struct block_info *block_info__get(struct block_info *bi)
> > >   {
> > > @@ -127,3 +161,286 @@ int block_info__process_sym(struct hist_entry *he, struct block_hist *bh,
> > >   	return 0;
> > >   }
> > > +
> > > +static int block_column_header(struct perf_hpp_fmt *fmt,
> > > +			       struct perf_hpp *hpp,
> > > +			       struct hists *hists __maybe_unused,
> > > +			       int line __maybe_unused,
> > > +			       int *span __maybe_unused)
> > > +{
> > > +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
> > > +
> > > +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
> > > +			 block_fmt->header);
> > > +}
> > > +
> > > +static int block_column_width(struct perf_hpp_fmt *fmt,
> > > +			      struct perf_hpp *hpp __maybe_unused,
> > > +			      struct hists *hists __maybe_unused)
> > > +{
> > > +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
> > > +
> > > +	return block_fmt->width;
> > > +}
> > > +
> > > +static int block_total_cycles_pct_entry(struct perf_hpp_fmt *fmt,
> > > +					struct perf_hpp *hpp,
> > > +					struct hist_entry *he)
> > > +{
> > > +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
> > > +	struct block_info *bi = he->block_info;
> > > +	double ratio = 0.0;
> > > +	char buf[16];
> > > +
> > > +	if (block_fmt->total_cycles)
> > > +		ratio = (double)bi->cycles / (double)block_fmt->total_cycles;
> > > +
> > > +	sprintf(buf, "%.2f%%", 100.0 * ratio);
> > > +
> > > +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width, buf);
> > > +}
> > > +
> > > +static int64_t block_total_cycles_pct_sort(struct perf_hpp_fmt *fmt,
> > > +					   struct hist_entry *left,
> > > +					   struct hist_entry *right)
> > > +{
> > > +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
> > > +	struct block_info *bi_l = left->block_info;
> > > +	struct block_info *bi_r = right->block_info;
> > > +	double l, r;
> > > +
> > > +	if (block_fmt->total_cycles) {
> > > +		l = ((double)bi_l->cycles /
> > > +			(double)block_fmt->total_cycles) * 100000.0;
> > > +		r = ((double)bi_r->cycles /
> > > +			(double)block_fmt->total_cycles) * 100000.0;
> > > +		return (int64_t)l - (int64_t)r;
> > > +	}
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static void cycles_string(u64 cycles, char *buf, int size)
> > > +{
> > > +	if (cycles >= 1000000)
> > > +		scnprintf(buf, size, "%.1fM", (double)cycles / 1000000.0);
> > > +	else if (cycles >= 1000)
> > > +		scnprintf(buf, size, "%.1fK", (double)cycles / 1000.0);
> > > +	else
> > > +		scnprintf(buf, size, "%1d", cycles);
> > > +}
> > > +
> > > +static int block_cycles_lbr_entry(struct perf_hpp_fmt *fmt,
> > > +				  struct perf_hpp *hpp, struct hist_entry *he)
> > > +{
> > > +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
> > > +	struct block_info *bi = he->block_info;
> > > +	char cycles_buf[16];
> > > +
> > > +	cycles_string(bi->cycles_aggr, cycles_buf, sizeof(cycles_buf));
> > > +
> > > +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
> > > +			 cycles_buf);
> > > +}
> > > +
> > > +static int block_cycles_pct_entry(struct perf_hpp_fmt *fmt,
> > > +				  struct perf_hpp *hpp, struct hist_entry *he)
> > > +{
> > > +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
> > > +	struct block_info *bi = he->block_info;
> > > +	double ratio = 0.0;
> > > +	u64 avg;
> > > +	char buf[16];
> > > +
> > > +	if (block_fmt->block_cycles && bi->num_aggr) {
> > > +		avg = bi->cycles_aggr / bi->num_aggr;
> > > +		ratio = (double)avg / (double)block_fmt->block_cycles;
> > > +	}
> > > +
> > > +	sprintf(buf, "%.2f%%", 100.0 * ratio);
> > > +
> > > +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width, buf);
> > > +}
> > > +
> > > +static int block_avg_cycles_entry(struct perf_hpp_fmt *fmt,
> > > +				  struct perf_hpp *hpp,
> > > +				  struct hist_entry *he)
> > > +{
> > > +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
> > > +	struct block_info *bi = he->block_info;
> > > +	char cycles_buf[16];
> > > +
> > > +	cycles_string(bi->cycles_aggr / bi->num_aggr, cycles_buf,
> > > +		      sizeof(cycles_buf));
> > > +
> > > +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
> > > +			 cycles_buf);
> > > +}
> > > +
> > > +static int block_range_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
> > > +			     struct hist_entry *he)
> > > +{
> > > +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
> > > +	struct block_info *bi = he->block_info;
> > > +	char buf[128];
> > > +	char *start_line, *end_line;
> > > +
> > > +	symbol_conf.disable_add2line_warn = true;
> > > +
> > > +	start_line = map__srcline(he->ms.map, bi->sym->start + bi->start,
> > > +				  he->ms.sym);
> > > +
> > > +	end_line = map__srcline(he->ms.map, bi->sym->start + bi->end,
> > > +				he->ms.sym);
> > > +
> > > +	if ((start_line != SRCLINE_UNKNOWN) && (end_line != SRCLINE_UNKNOWN)) {
> > > +		scnprintf(buf, sizeof(buf), "[%s -> %s]",
> > > +			  start_line, end_line);
> > > +	} else {
> > > +		scnprintf(buf, sizeof(buf), "[%7lx -> %7lx]",
> > > +			  bi->start, bi->end);
> > > +	}
> > > +
> > > +	free_srcline(start_line);
> > > +	free_srcline(end_line);
> > > +
> > > +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width, buf);
> > > +}
> > > +
> > > +static int block_dso_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
> > > +			   struct hist_entry *he)
> > > +{
> > > +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
> > > +	struct map *map = he->ms.map;
> > > +
> > > +	if (map && map->dso) {
> > > +		return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
> > > +				 map->dso->short_name);
> > > +	}
> > > +
> > > +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
> > > +			 "[unknown]");
> > > +}
> > > +
> > > +static void init_block_header(struct block_fmt *block_fmt)
> > > +{
> > > +	struct perf_hpp_fmt *fmt = &block_fmt->fmt;
> > > +
> > > +	BUG_ON(block_fmt->idx >= PERF_HPP_REPORT__BLOCK_MAX_INDEX);
> > > +
> > > +	block_fmt->header = block_columns[block_fmt->idx].name;
> > > +	block_fmt->width = block_columns[block_fmt->idx].width;
> > > +
> > > +	fmt->header = block_column_header;
> > > +	fmt->width = block_column_width;
> > > +}
> > > +
> > > +static void hpp_register(struct block_fmt *block_fmt, int idx,
> > > +			 struct perf_hpp_list *hpp_list)
> > > +{
> > > +	struct perf_hpp_fmt *fmt = &block_fmt->fmt;
> > > +
> > > +	block_fmt->idx = idx;
> > > +	INIT_LIST_HEAD(&fmt->list);
> > > +	INIT_LIST_HEAD(&fmt->sort_list);
> > > +
> > > +	switch (idx) {
> > > +	case PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT:
> > > +		fmt->entry = block_total_cycles_pct_entry;
> > > +		fmt->cmp = block_info__cmp;
> > > +		fmt->sort = block_total_cycles_pct_sort;
> > > +		break;
> > > +	case PERF_HPP_REPORT__BLOCK_LBR_CYCLES:
> > > +		fmt->entry = block_cycles_lbr_entry;
> > > +		break;
> > > +	case PERF_HPP_REPORT__BLOCK_CYCLES_PCT:
> > > +		fmt->entry = block_cycles_pct_entry;
> > > +		break;
> > > +	case PERF_HPP_REPORT__BLOCK_AVG_CYCLES:
> > > +		fmt->entry = block_avg_cycles_entry;
> > > +		break;
> > > +	case PERF_HPP_REPORT__BLOCK_RANGE:
> > > +		fmt->entry = block_range_entry;
> > > +		break;
> > > +	case PERF_HPP_REPORT__BLOCK_DSO:
> > > +		fmt->entry = block_dso_entry;
> > > +		break;
> > > +	default:
> > > +		return;
> > > +	}
> > > +
> > > +	init_block_header(block_fmt);
> > > +	perf_hpp_list__column_register(hpp_list, fmt);
> > > +}
> > > +
> > > +static void register_block_columns(struct perf_hpp_list *hpp_list,
> > > +				   struct block_fmt *block_fmts)
> > > +{
> > > +	for (int i = 0; i < PERF_HPP_REPORT__BLOCK_MAX_INDEX; i++)
> > > +		hpp_register(&block_fmts[i], i, hpp_list);
> > > +}
> > > +
> > > +static void init_block_hist(struct block_hist *bh, struct block_fmt *block_fmts)
> > > +{
> > > +	__hists__init(&bh->block_hists, &bh->block_list);
> > > +	perf_hpp_list__init(&bh->block_list);
> > > +	bh->block_list.nr_header_lines = 1;
> > > +
> > > +	register_block_columns(&bh->block_list, block_fmts);
> > > +
> > > +	perf_hpp_list__register_sort_field(&bh->block_list,
> > > +		&block_fmts[PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT].fmt);
> > > +}
> > > +
> > > +static inline void set_fmt(struct block_fmt *block_fmt,
> > > +			   u64 total_cycles, u64 block_cycles)
> > > +{
> > > +	block_fmt->total_cycles = total_cycles;
> > > +	block_fmt->block_cycles = block_cycles;
> > > +}
> > > +
> > > +static void process_block_report(struct hists *hists,
> > > +				 struct block_report *block_report,
> > > +				 u64 total_cycles)
> > > +{
> > > +	struct rb_node *next = rb_first_cached(&hists->entries);
> > > +	struct block_hist *bh = &block_report->hist;
> > > +	struct hist_entry *he;
> > > +
> > > +	init_block_hist(bh, block_report->fmts);
> > > +
> > > +	while (next) {
> > > +		he = rb_entry(next, struct hist_entry, rb_node);
> > > +		block_info__process_sym(he, bh, &block_report->cycles,
> > > +					total_cycles);
> > > +		next = rb_next(&he->rb_node);
> > > +	}
> > > +
> > > +	for (int i = 0; i < PERF_HPP_REPORT__BLOCK_MAX_INDEX; i++) {
> > > +		block_report->fmts[i].total_cycles = total_cycles;
> > > +		block_report->fmts[i].block_cycles = block_report->cycles;
> > > +	}
> > > +
> > > +	hists__output_resort(&bh->block_hists, NULL);
> > > +}
> > > +
> > > +struct block_report *block_info__create_report(struct evlist *evlist,
> > > +					       u64 total_cycles)
> > > +{
> > > +	struct block_report *block_reports;
> > > +	int nr_hists = evlist->core.nr_entries, i = 0;
> > > +	struct evsel *pos;
> > > +
> > > +	block_reports = calloc(nr_hists, sizeof(struct block_report));
> > > +	if (!block_reports)
> > > +		return NULL;
> > > +
> > > +	evlist__for_each_entry(evlist, pos) {
> > > +		struct hists *hists = evsel__hists(pos);
> > > +
> > > +		process_block_report(hists, &block_reports[i], total_cycles);
> > > +		i++;
> > > +	}
> > > +
> > > +	return block_reports;
> > > +}
> > > diff --git a/tools/perf/util/block-info.h b/tools/perf/util/block-info.h
> > > index d55dfc2fda6f..b5266588d476 100644
> > > --- a/tools/perf/util/block-info.h
> > > +++ b/tools/perf/util/block-info.h
> > > @@ -4,8 +4,9 @@
> > >   #include <linux/types.h>
> > >   #include <linux/refcount.h>
> > > -#include "util/hist.h"
> > > -#include "util/symbol.h"
> > > +#include "hist.h"
> > > +#include "symbol.h"
> > > +#include "sort.h"
> > >   struct block_info {
> > >   	struct symbol		*sym;
> > > @@ -20,6 +21,31 @@ struct block_info {
> > >   	refcount_t		refcnt;
> > >   };
> > > +struct block_fmt {
> > > +	struct perf_hpp_fmt	fmt;
> > > +	int			idx;
> > > +	int			width;
> > > +	const char		*header;
> > > +	u64			total_cycles;
> > > +	u64			block_cycles;
> > > +};
> > > +
> > > +enum {
> > > +	PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT,
> > > +	PERF_HPP_REPORT__BLOCK_LBR_CYCLES,
> > > +	PERF_HPP_REPORT__BLOCK_CYCLES_PCT,
> > > +	PERF_HPP_REPORT__BLOCK_AVG_CYCLES,
> > > +	PERF_HPP_REPORT__BLOCK_RANGE,
> > > +	PERF_HPP_REPORT__BLOCK_DSO,
> > > +	PERF_HPP_REPORT__BLOCK_MAX_INDEX
> > > +};
> > > +
> > > +struct block_report {
> > > +	struct block_hist	hist;
> > > +	u64			cycles;
> > > +	struct block_fmt	fmts[PERF_HPP_REPORT__BLOCK_MAX_INDEX];
> > > +};
> > > +
> > >   struct block_hist;
> > >   struct block_info *block_info__new(void);
> > > @@ -40,4 +66,7 @@ int64_t block_info__cmp(struct perf_hpp_fmt *fmt __maybe_unused,
> > >   int block_info__process_sym(struct hist_entry *he, struct block_hist *bh,
> > >   			    u64 *block_cycles_aggr, u64 total_cycles);
> > > +struct block_report *block_info__create_report(struct evlist *evlist,
> > > +					       u64 total_cycles);
> > > +
> > >   #endif /* __PERF_BLOCK_H */
> > > diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> > > index 0e27d6830011..7cf137b0451b 100644
> > > --- a/tools/perf/util/hist.c
> > > +++ b/tools/perf/util/hist.c
> > > @@ -758,6 +758,10 @@ struct hist_entry *hists__add_entry_block(struct hists *hists,
> > >   	struct hist_entry entry = {
> > >   		.block_info = block_info,
> > >   		.hists = hists,
> > > +		.ms = {
> > > +			.map = al->map,
> > > +			.sym = al->sym,
> > > +		},
> > >   	}, *he = hists__findnew_entry(hists, &entry, al, false);
> > >   	return he;
> > > -- 
> > > 2.17.1
> > 

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v7 4/7] perf util: Support block formats with compare/sort/display
  2019-11-07 13:17       ` Arnaldo Carvalho de Melo
@ 2019-11-07 13:21         ` Jin, Yao
  0 siblings, 0 replies; 22+ messages in thread
From: Jin, Yao @ 2019-11-07 13:21 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 11/7/2019 9:17 PM, Arnaldo Carvalho de Melo wrote:
> Em Thu, Nov 07, 2019 at 09:05:21PM +0800, Jin, Yao escreveu:
>>
>>
>> On 11/7/2019 8:45 PM, Arnaldo Carvalho de Melo wrote:
>>> Em Thu, Nov 07, 2019 at 03:47:16PM +0800, Jin Yao escreveu:
>>>> This patch provides helper routines to support new
>>>> columns for block info output.
>>>
>>> This breaks bisection, some of my build containers complained about:
>>>
>>> util/block-info.c:395:20: error: unused function 'set_fmt' [-Werror,-Wunused-function]
>>> static inline void set_fmt(struct block_fmt *block_fmt,
>>>                      ^
>>> 1 error generated.
>>> mv: can't rename '/tmp/build/perf/util/.block-info.o.tmp': No such file or directory
>>>
>>>
>>> Because you add a static inline function to a .c file and don't use it,
>>> I'll try moving it to when it finally gets used.
>>
>> Oh, very sorry, my fault. I should delete the set_fmt(). In new version,
>> it's not used. But unfortunately my build machine didn't report the error.
>> :(
> 
> I got it just because I had the containers building it as I was
> applying your patches, got it from several, but had difficulty
> reproducing, no problem fixed now, I double checked and it really isn't
> being used.
> 
> No need to resend, I already fixed this and restarted the full container
> build series.
> 
> - Arnaldo
>   

Oh, thanks Arnaldo! :)

Thanks
Jin Yao

>> Thanks
>> Jin Yao
>>
>>>> The new columns are:
>>>>
>>>> Sampled Cycles%
>>>> Sampled Cycles
>>>> Avg Cycles%
>>>> Avg Cycles
>>>> [Program Block Range]
>>>> Shared Object
>>>>
>>>>    v5:
>>>>    ---
>>>>    1. Move more block related functions from builtin-report.c to
>>>>       block-info.c
>>>>
>>>>    2. Set ms (map+sym) in block hist_entry. Because this info
>>>>       is needed for reporting the block range (i.e. source line)
>>>>
>>>> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
>>>> ---
>>>>    tools/perf/util/block-info.c | 317 +++++++++++++++++++++++++++++++++++
>>>>    tools/perf/util/block-info.h |  33 +++-
>>>>    tools/perf/util/hist.c       |   4 +
>>>>    3 files changed, 352 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
>>>> index b9954a32b8f4..1242c3a33509 100644
>>>> --- a/tools/perf/util/block-info.c
>>>> +++ b/tools/perf/util/block-info.c
>>>> @@ -6,6 +6,40 @@
>>>>    #include "sort.h"
>>>>    #include "annotate.h"
>>>>    #include "symbol.h"
>>>> +#include "dso.h"
>>>> +#include "map.h"
>>>> +#include "srcline.h"
>>>> +#include "evlist.h"
>>>> +
>>>> +static struct block_header_column{
>>>> +	const char *name;
>>>> +	int width;
>>>> +} block_columns[PERF_HPP_REPORT__BLOCK_MAX_INDEX] = {
>>>> +	[PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT] = {
>>>> +		.name = "Sampled Cycles%",
>>>> +		.width = 15,
>>>> +	},
>>>> +	[PERF_HPP_REPORT__BLOCK_LBR_CYCLES] = {
>>>> +		.name = "Sampled Cycles",
>>>> +		.width = 14,
>>>> +	},
>>>> +	[PERF_HPP_REPORT__BLOCK_CYCLES_PCT] = {
>>>> +		.name = "Avg Cycles%",
>>>> +		.width = 11,
>>>> +	},
>>>> +	[PERF_HPP_REPORT__BLOCK_AVG_CYCLES] = {
>>>> +		.name = "Avg Cycles",
>>>> +		.width = 10,
>>>> +	},
>>>> +	[PERF_HPP_REPORT__BLOCK_RANGE] = {
>>>> +		.name = "[Program Block Range]",
>>>> +		.width = 70,
>>>> +	},
>>>> +	[PERF_HPP_REPORT__BLOCK_DSO] = {
>>>> +		.name = "Shared Object",
>>>> +		.width = 20,
>>>> +	}
>>>> +};
>>>>    struct block_info *block_info__get(struct block_info *bi)
>>>>    {
>>>> @@ -127,3 +161,286 @@ int block_info__process_sym(struct hist_entry *he, struct block_hist *bh,
>>>>    	return 0;
>>>>    }
>>>> +
>>>> +static int block_column_header(struct perf_hpp_fmt *fmt,
>>>> +			       struct perf_hpp *hpp,
>>>> +			       struct hists *hists __maybe_unused,
>>>> +			       int line __maybe_unused,
>>>> +			       int *span __maybe_unused)
>>>> +{
>>>> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
>>>> +
>>>> +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
>>>> +			 block_fmt->header);
>>>> +}
>>>> +
>>>> +static int block_column_width(struct perf_hpp_fmt *fmt,
>>>> +			      struct perf_hpp *hpp __maybe_unused,
>>>> +			      struct hists *hists __maybe_unused)
>>>> +{
>>>> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
>>>> +
>>>> +	return block_fmt->width;
>>>> +}
>>>> +
>>>> +static int block_total_cycles_pct_entry(struct perf_hpp_fmt *fmt,
>>>> +					struct perf_hpp *hpp,
>>>> +					struct hist_entry *he)
>>>> +{
>>>> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
>>>> +	struct block_info *bi = he->block_info;
>>>> +	double ratio = 0.0;
>>>> +	char buf[16];
>>>> +
>>>> +	if (block_fmt->total_cycles)
>>>> +		ratio = (double)bi->cycles / (double)block_fmt->total_cycles;
>>>> +
>>>> +	sprintf(buf, "%.2f%%", 100.0 * ratio);
>>>> +
>>>> +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width, buf);
>>>> +}
>>>> +
>>>> +static int64_t block_total_cycles_pct_sort(struct perf_hpp_fmt *fmt,
>>>> +					   struct hist_entry *left,
>>>> +					   struct hist_entry *right)
>>>> +{
>>>> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
>>>> +	struct block_info *bi_l = left->block_info;
>>>> +	struct block_info *bi_r = right->block_info;
>>>> +	double l, r;
>>>> +
>>>> +	if (block_fmt->total_cycles) {
>>>> +		l = ((double)bi_l->cycles /
>>>> +			(double)block_fmt->total_cycles) * 100000.0;
>>>> +		r = ((double)bi_r->cycles /
>>>> +			(double)block_fmt->total_cycles) * 100000.0;
>>>> +		return (int64_t)l - (int64_t)r;
>>>> +	}
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static void cycles_string(u64 cycles, char *buf, int size)
>>>> +{
>>>> +	if (cycles >= 1000000)
>>>> +		scnprintf(buf, size, "%.1fM", (double)cycles / 1000000.0);
>>>> +	else if (cycles >= 1000)
>>>> +		scnprintf(buf, size, "%.1fK", (double)cycles / 1000.0);
>>>> +	else
>>>> +		scnprintf(buf, size, "%1d", cycles);
>>>> +}
>>>> +
>>>> +static int block_cycles_lbr_entry(struct perf_hpp_fmt *fmt,
>>>> +				  struct perf_hpp *hpp, struct hist_entry *he)
>>>> +{
>>>> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
>>>> +	struct block_info *bi = he->block_info;
>>>> +	char cycles_buf[16];
>>>> +
>>>> +	cycles_string(bi->cycles_aggr, cycles_buf, sizeof(cycles_buf));
>>>> +
>>>> +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
>>>> +			 cycles_buf);
>>>> +}
>>>> +
>>>> +static int block_cycles_pct_entry(struct perf_hpp_fmt *fmt,
>>>> +				  struct perf_hpp *hpp, struct hist_entry *he)
>>>> +{
>>>> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
>>>> +	struct block_info *bi = he->block_info;
>>>> +	double ratio = 0.0;
>>>> +	u64 avg;
>>>> +	char buf[16];
>>>> +
>>>> +	if (block_fmt->block_cycles && bi->num_aggr) {
>>>> +		avg = bi->cycles_aggr / bi->num_aggr;
>>>> +		ratio = (double)avg / (double)block_fmt->block_cycles;
>>>> +	}
>>>> +
>>>> +	sprintf(buf, "%.2f%%", 100.0 * ratio);
>>>> +
>>>> +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width, buf);
>>>> +}
>>>> +
>>>> +static int block_avg_cycles_entry(struct perf_hpp_fmt *fmt,
>>>> +				  struct perf_hpp *hpp,
>>>> +				  struct hist_entry *he)
>>>> +{
>>>> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
>>>> +	struct block_info *bi = he->block_info;
>>>> +	char cycles_buf[16];
>>>> +
>>>> +	cycles_string(bi->cycles_aggr / bi->num_aggr, cycles_buf,
>>>> +		      sizeof(cycles_buf));
>>>> +
>>>> +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
>>>> +			 cycles_buf);
>>>> +}
>>>> +
>>>> +static int block_range_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
>>>> +			     struct hist_entry *he)
>>>> +{
>>>> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
>>>> +	struct block_info *bi = he->block_info;
>>>> +	char buf[128];
>>>> +	char *start_line, *end_line;
>>>> +
>>>> +	symbol_conf.disable_add2line_warn = true;
>>>> +
>>>> +	start_line = map__srcline(he->ms.map, bi->sym->start + bi->start,
>>>> +				  he->ms.sym);
>>>> +
>>>> +	end_line = map__srcline(he->ms.map, bi->sym->start + bi->end,
>>>> +				he->ms.sym);
>>>> +
>>>> +	if ((start_line != SRCLINE_UNKNOWN) && (end_line != SRCLINE_UNKNOWN)) {
>>>> +		scnprintf(buf, sizeof(buf), "[%s -> %s]",
>>>> +			  start_line, end_line);
>>>> +	} else {
>>>> +		scnprintf(buf, sizeof(buf), "[%7lx -> %7lx]",
>>>> +			  bi->start, bi->end);
>>>> +	}
>>>> +
>>>> +	free_srcline(start_line);
>>>> +	free_srcline(end_line);
>>>> +
>>>> +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width, buf);
>>>> +}
>>>> +
>>>> +static int block_dso_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
>>>> +			   struct hist_entry *he)
>>>> +{
>>>> +	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
>>>> +	struct map *map = he->ms.map;
>>>> +
>>>> +	if (map && map->dso) {
>>>> +		return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
>>>> +				 map->dso->short_name);
>>>> +	}
>>>> +
>>>> +	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
>>>> +			 "[unknown]");
>>>> +}
>>>> +
>>>> +static void init_block_header(struct block_fmt *block_fmt)
>>>> +{
>>>> +	struct perf_hpp_fmt *fmt = &block_fmt->fmt;
>>>> +
>>>> +	BUG_ON(block_fmt->idx >= PERF_HPP_REPORT__BLOCK_MAX_INDEX);
>>>> +
>>>> +	block_fmt->header = block_columns[block_fmt->idx].name;
>>>> +	block_fmt->width = block_columns[block_fmt->idx].width;
>>>> +
>>>> +	fmt->header = block_column_header;
>>>> +	fmt->width = block_column_width;
>>>> +}
>>>> +
>>>> +static void hpp_register(struct block_fmt *block_fmt, int idx,
>>>> +			 struct perf_hpp_list *hpp_list)
>>>> +{
>>>> +	struct perf_hpp_fmt *fmt = &block_fmt->fmt;
>>>> +
>>>> +	block_fmt->idx = idx;
>>>> +	INIT_LIST_HEAD(&fmt->list);
>>>> +	INIT_LIST_HEAD(&fmt->sort_list);
>>>> +
>>>> +	switch (idx) {
>>>> +	case PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT:
>>>> +		fmt->entry = block_total_cycles_pct_entry;
>>>> +		fmt->cmp = block_info__cmp;
>>>> +		fmt->sort = block_total_cycles_pct_sort;
>>>> +		break;
>>>> +	case PERF_HPP_REPORT__BLOCK_LBR_CYCLES:
>>>> +		fmt->entry = block_cycles_lbr_entry;
>>>> +		break;
>>>> +	case PERF_HPP_REPORT__BLOCK_CYCLES_PCT:
>>>> +		fmt->entry = block_cycles_pct_entry;
>>>> +		break;
>>>> +	case PERF_HPP_REPORT__BLOCK_AVG_CYCLES:
>>>> +		fmt->entry = block_avg_cycles_entry;
>>>> +		break;
>>>> +	case PERF_HPP_REPORT__BLOCK_RANGE:
>>>> +		fmt->entry = block_range_entry;
>>>> +		break;
>>>> +	case PERF_HPP_REPORT__BLOCK_DSO:
>>>> +		fmt->entry = block_dso_entry;
>>>> +		break;
>>>> +	default:
>>>> +		return;
>>>> +	}
>>>> +
>>>> +	init_block_header(block_fmt);
>>>> +	perf_hpp_list__column_register(hpp_list, fmt);
>>>> +}
>>>> +
>>>> +static void register_block_columns(struct perf_hpp_list *hpp_list,
>>>> +				   struct block_fmt *block_fmts)
>>>> +{
>>>> +	for (int i = 0; i < PERF_HPP_REPORT__BLOCK_MAX_INDEX; i++)
>>>> +		hpp_register(&block_fmts[i], i, hpp_list);
>>>> +}
>>>> +
>>>> +static void init_block_hist(struct block_hist *bh, struct block_fmt *block_fmts)
>>>> +{
>>>> +	__hists__init(&bh->block_hists, &bh->block_list);
>>>> +	perf_hpp_list__init(&bh->block_list);
>>>> +	bh->block_list.nr_header_lines = 1;
>>>> +
>>>> +	register_block_columns(&bh->block_list, block_fmts);
>>>> +
>>>> +	perf_hpp_list__register_sort_field(&bh->block_list,
>>>> +		&block_fmts[PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT].fmt);
>>>> +}
>>>> +
>>>> +static inline void set_fmt(struct block_fmt *block_fmt,
>>>> +			   u64 total_cycles, u64 block_cycles)
>>>> +{
>>>> +	block_fmt->total_cycles = total_cycles;
>>>> +	block_fmt->block_cycles = block_cycles;
>>>> +}
>>>> +
>>>> +static void process_block_report(struct hists *hists,
>>>> +				 struct block_report *block_report,
>>>> +				 u64 total_cycles)
>>>> +{
>>>> +	struct rb_node *next = rb_first_cached(&hists->entries);
>>>> +	struct block_hist *bh = &block_report->hist;
>>>> +	struct hist_entry *he;
>>>> +
>>>> +	init_block_hist(bh, block_report->fmts);
>>>> +
>>>> +	while (next) {
>>>> +		he = rb_entry(next, struct hist_entry, rb_node);
>>>> +		block_info__process_sym(he, bh, &block_report->cycles,
>>>> +					total_cycles);
>>>> +		next = rb_next(&he->rb_node);
>>>> +	}
>>>> +
>>>> +	for (int i = 0; i < PERF_HPP_REPORT__BLOCK_MAX_INDEX; i++) {
>>>> +		block_report->fmts[i].total_cycles = total_cycles;
>>>> +		block_report->fmts[i].block_cycles = block_report->cycles;
>>>> +	}
>>>> +
>>>> +	hists__output_resort(&bh->block_hists, NULL);
>>>> +}
>>>> +
>>>> +struct block_report *block_info__create_report(struct evlist *evlist,
>>>> +					       u64 total_cycles)
>>>> +{
>>>> +	struct block_report *block_reports;
>>>> +	int nr_hists = evlist->core.nr_entries, i = 0;
>>>> +	struct evsel *pos;
>>>> +
>>>> +	block_reports = calloc(nr_hists, sizeof(struct block_report));
>>>> +	if (!block_reports)
>>>> +		return NULL;
>>>> +
>>>> +	evlist__for_each_entry(evlist, pos) {
>>>> +		struct hists *hists = evsel__hists(pos);
>>>> +
>>>> +		process_block_report(hists, &block_reports[i], total_cycles);
>>>> +		i++;
>>>> +	}
>>>> +
>>>> +	return block_reports;
>>>> +}
>>>> diff --git a/tools/perf/util/block-info.h b/tools/perf/util/block-info.h
>>>> index d55dfc2fda6f..b5266588d476 100644
>>>> --- a/tools/perf/util/block-info.h
>>>> +++ b/tools/perf/util/block-info.h
>>>> @@ -4,8 +4,9 @@
>>>>    #include <linux/types.h>
>>>>    #include <linux/refcount.h>
>>>> -#include "util/hist.h"
>>>> -#include "util/symbol.h"
>>>> +#include "hist.h"
>>>> +#include "symbol.h"
>>>> +#include "sort.h"
>>>>    struct block_info {
>>>>    	struct symbol		*sym;
>>>> @@ -20,6 +21,31 @@ struct block_info {
>>>>    	refcount_t		refcnt;
>>>>    };
>>>> +struct block_fmt {
>>>> +	struct perf_hpp_fmt	fmt;
>>>> +	int			idx;
>>>> +	int			width;
>>>> +	const char		*header;
>>>> +	u64			total_cycles;
>>>> +	u64			block_cycles;
>>>> +};
>>>> +
>>>> +enum {
>>>> +	PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT,
>>>> +	PERF_HPP_REPORT__BLOCK_LBR_CYCLES,
>>>> +	PERF_HPP_REPORT__BLOCK_CYCLES_PCT,
>>>> +	PERF_HPP_REPORT__BLOCK_AVG_CYCLES,
>>>> +	PERF_HPP_REPORT__BLOCK_RANGE,
>>>> +	PERF_HPP_REPORT__BLOCK_DSO,
>>>> +	PERF_HPP_REPORT__BLOCK_MAX_INDEX
>>>> +};
>>>> +
>>>> +struct block_report {
>>>> +	struct block_hist	hist;
>>>> +	u64			cycles;
>>>> +	struct block_fmt	fmts[PERF_HPP_REPORT__BLOCK_MAX_INDEX];
>>>> +};
>>>> +
>>>>    struct block_hist;
>>>>    struct block_info *block_info__new(void);
>>>> @@ -40,4 +66,7 @@ int64_t block_info__cmp(struct perf_hpp_fmt *fmt __maybe_unused,
>>>>    int block_info__process_sym(struct hist_entry *he, struct block_hist *bh,
>>>>    			    u64 *block_cycles_aggr, u64 total_cycles);
>>>> +struct block_report *block_info__create_report(struct evlist *evlist,
>>>> +					       u64 total_cycles);
>>>> +
>>>>    #endif /* __PERF_BLOCK_H */
>>>> diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
>>>> index 0e27d6830011..7cf137b0451b 100644
>>>> --- a/tools/perf/util/hist.c
>>>> +++ b/tools/perf/util/hist.c
>>>> @@ -758,6 +758,10 @@ struct hist_entry *hists__add_entry_block(struct hists *hists,
>>>>    	struct hist_entry entry = {
>>>>    		.block_info = block_info,
>>>>    		.hists = hists,
>>>> +		.ms = {
>>>> +			.map = al->map,
>>>> +			.sym = al->sym,
>>>> +		},
>>>>    	}, *he = hists__findnew_entry(hists, &entry, al, false);
>>>>    	return he;
>>>> -- 
>>>> 2.17.1
>>>
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [tip: perf/core] perf report: Sort by sampled cycles percent per block for tui
  2019-11-07  7:47 ` [PATCH v7 7/7] perf report: Sort by sampled cycles percent per block for tui Jin Yao
  2019-11-07 12:38   ` Arnaldo Carvalho de Melo
@ 2019-11-12 11:17   ` tip-bot2 for Jin Yao
  1 sibling, 0 replies; 22+ messages in thread
From: tip-bot2 for Jin Yao @ 2019-11-12 11:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Jin Yao, Jiri Olsa, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Andi Kleen, Jin Yao, Kan Liang, Peter Zijlstra, Ingo Molnar,
	Borislav Petkov, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     7fa46cbf20d327d78114b1c8c7e69fabe7c57794
Gitweb:        https://git.kernel.org/tip/7fa46cbf20d327d78114b1c8c7e69fabe7c57794
Author:        Jin Yao <yao.jin@linux.intel.com>
AuthorDate:    Thu, 07 Nov 2019 15:47:19 +08:00
Committer:     Arnaldo Carvalho de Melo <acme@redhat.com>
CommitterDate: Thu, 07 Nov 2019 10:14:48 -03:00

perf report: Sort by sampled cycles percent per block for tui

Previous patch has implemented a new option "--total-cycles".  But only
stdio mode is supported.

This patch supports the tui mode and support '--percent-limit'.

For example,

 perf record -b ./div
 perf report --total-cycles --percent-limit 1

 # Samples: 2753248 of event 'cycles'
 Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                              [Program Block Range]         Shared Object
          26.04%            2.8M        0.40%          18                                             [div.c:42 -> div.c:39]                   div
          15.17%            1.2M        0.16%           7                                 [random_r.c:357 -> random_r.c:380]          libc-2.27.so
           5.11%          402.0K        0.04%           2                                             [div.c:27 -> div.c:28]                   div
           4.87%          381.6K        0.04%           2                                     [random.c:288 -> random.c:291]          libc-2.27.so
           4.53%          381.0K        0.04%           2                                             [div.c:40 -> div.c:40]                   div
           3.85%          300.9K        0.02%           1                                             [div.c:22 -> div.c:25]                   div
           3.08%          241.1K        0.02%           1                                           [rand.c:26 -> rand.c:27]          libc-2.27.so
           3.06%          240.0K        0.02%           1                                     [random.c:291 -> random.c:291]          libc-2.27.so
           2.78%          215.7K        0.02%           1                                     [random.c:298 -> random.c:298]          libc-2.27.so
           2.52%          198.3K        0.02%           1                                     [random.c:293 -> random.c:293]          libc-2.27.so
           2.36%          184.8K        0.02%           1                                           [rand.c:28 -> rand.c:28]          libc-2.27.so
           2.33%          180.5K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
           2.28%          176.7K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
           2.20%          168.8K        0.02%           1                                         [rand@plt+0 -> rand@plt+0]                   div
           1.98%          158.2K        0.02%           1                                 [random_r.c:388 -> random_r.c:388]          libc-2.27.so
           1.57%          123.3K        0.02%           1                                             [div.c:42 -> div.c:44]                   div
           1.44%          116.0K        0.42%          19                                 [random_r.c:357 -> random_r.c:394]          libc-2.27.so

--------------------------------------------------

 v7:
 ---
 1. Since we have used use_browser in report__browse_block_hists
    to support stdio mode, now we also add supporting for tui.

 2. Move block tui browser code from ui/browsers/hists.c
    to block-info.c.

 v6:
 ---
 Create report__tui_browse_block_hists in block-info.c
 (codes are moved from builtin-report.c).

 v5:
 ---
 Fix a crash issue when running perf report without
 '--total-cycles'. The issue is because the internal flag
 is renamed from 'total_cycles' to 'total_cycles_mode' in
 previous patch but this patch still uses 'total_cycles'
 to check if the '--total-cycles' option is enabled, which
 causes the code to be inconsistent.

 v4:
 ---
 Since the block collection is moved out of printing in
 previous patch, this patch is updated accordingly for
 tui supporting.

 v3:
 ---
 Minor change since the function name is changed:
 block_total_cycles_percent -> block_info__total_cycles_percent

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191107074719.26139-8-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-report.c    | 27 +++++++++---
 tools/perf/ui/browsers/hists.c |  7 ++-
 tools/perf/ui/browsers/hists.h |  2 +-
 tools/perf/util/block-info.c   | 74 ++++++++++++++++++++++++++++++++-
 4 files changed, 103 insertions(+), 7 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index ca41187..1e81985 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -485,6 +485,22 @@ static size_t hists__fprintf_nr_sample_events(struct hists *hists, struct report
 	return ret + fprintf(fp, "\n#\n");
 }
 
+static int perf_evlist__tui_block_hists_browse(struct evlist *evlist,
+					       struct report *rep)
+{
+	struct evsel *pos;
+	int i = 0, ret;
+
+	evlist__for_each_entry(evlist, pos) {
+		ret = report__browse_block_hists(&rep->block_reports[i++].hist,
+						 rep->min_percent, pos);
+		if (ret != 0)
+			return ret;
+	}
+
+	return 0;
+}
+
 static int perf_evlist__tty_browse_hists(struct evlist *evlist,
 					 struct report *rep,
 					 const char *help)
@@ -595,6 +611,11 @@ static int report__browse_hists(struct report *rep)
 
 	switch (use_browser) {
 	case 1:
+		if (rep->total_cycles_mode) {
+			ret = perf_evlist__tui_block_hists_browse(evlist, rep);
+			break;
+		}
+
 		ret = perf_evlist__tui_browse_hists(evlist, help, NULL,
 						    rep->min_percent,
 						    &session->header.env,
@@ -1396,12 +1417,8 @@ repeat:
 	if (report.total_cycles_mode) {
 		if (sort__mode != SORT_MODE__BRANCH)
 			report.total_cycles_mode = false;
-		else if (!report.use_stdio) {
-			pr_err("Error: --total-cycles can be only used together with --stdio\n");
-			goto error;
-		} else {
+		else
 			sort_order = "sym";
-		}
 	}
 
 	if (strcmp(input_name, "-") != 0)
diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 7a7187e..334afc2 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -26,6 +26,7 @@
 #include "../../util/sort.h"
 #include "../../util/top.h"
 #include "../../util/thread.h"
+#include "../../util/block-info.h"
 #include "../../arch/common.h"
 #include "../../perf.h"
 
@@ -1783,7 +1784,11 @@ static unsigned int hist_browser__refresh(struct ui_browser *browser)
 			continue;
 		}
 
-		percent = hist_entry__get_percent_limit(h);
+		if (symbol_conf.report_individual_block)
+			percent = block_info__total_cycles_percent(h);
+		else
+			percent = hist_entry__get_percent_limit(h);
+
 		if (percent < hb->min_pcnt)
 			continue;
 
diff --git a/tools/perf/ui/browsers/hists.h b/tools/perf/ui/browsers/hists.h
index 91d3e18..078f2f2 100644
--- a/tools/perf/ui/browsers/hists.h
+++ b/tools/perf/ui/browsers/hists.h
@@ -5,6 +5,7 @@
 #include "ui/browser.h"
 
 struct annotation_options;
+struct evsel;
 
 struct hist_browser {
 	struct ui_browser   b;
@@ -15,6 +16,7 @@ struct hist_browser {
 	struct pstack	    *pstack;
 	struct perf_env	    *env;
 	struct annotation_options *annotation_opts;
+	struct evsel	    *block_evsel;
 	int		     print_seq;
 	bool		     show_dso;
 	bool		     show_headers;
diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
index 597d120..9abc201 100644
--- a/tools/perf/util/block-info.c
+++ b/tools/perf/util/block-info.c
@@ -10,6 +10,7 @@
 #include "map.h"
 #include "srcline.h"
 #include "evlist.h"
+#include "ui/browsers/hists.h"
 
 static struct block_header_column {
 	const char *name;
@@ -438,9 +439,75 @@ struct block_report *block_info__create_report(struct evlist *evlist,
 	return block_reports;
 }
 
+#ifdef HAVE_SLANG_SUPPORT
+static int block_hists_browser__title(struct hist_browser *browser, char *bf,
+				      size_t size)
+{
+	struct hists *hists = evsel__hists(browser->block_evsel);
+	const char *evname = perf_evsel__name(browser->block_evsel);
+	unsigned long nr_samples = hists->stats.nr_events[PERF_RECORD_SAMPLE];
+	int ret;
+
+	ret = scnprintf(bf, size, "# Samples: %lu", nr_samples);
+	if (evname)
+		scnprintf(bf + ret, size -  ret, " of event '%s'", evname);
+
+	return 0;
+}
+
+static int block_hists_tui_browse(struct block_hist *bh, struct evsel *evsel,
+				  float min_percent)
+{
+	struct hists *hists = &bh->block_hists;
+	struct hist_browser *browser;
+	int key = -1;
+	static const char help[] =
+	" q             Quit \n";
+
+	browser = hist_browser__new(hists);
+	if (!browser)
+		return -1;
+
+	browser->block_evsel = evsel;
+	browser->title = block_hists_browser__title;
+	browser->min_pcnt = min_percent;
+
+	/* reset abort key so that it can get Ctrl-C as a key */
+	SLang_reset_tty();
+	SLang_init_tty(0, 0, 0);
+
+	while (1) {
+		key = hist_browser__run(browser, "? - help", true);
+
+		switch (key) {
+		case 'q':
+			goto out;
+		case '?':
+			ui_browser__help_window(&browser->b, help);
+			break;
+		default:
+			break;
+		}
+	}
+
+out:
+	hist_browser__delete(browser);
+	return 0;
+}
+#else
+static int block_hists_tui_browse(struct block_hist *bh __maybe_unused,
+				  struct evsel *evsel __maybe_unused,
+				  float min_percent __maybe_unused)
+{
+	return 0;
+}
+#endif
+
 int report__browse_block_hists(struct block_hist *bh, float min_percent,
-			       struct evsel *evsel __maybe_unused)
+			       struct evsel *evsel)
 {
+	int ret;
+
 	switch (use_browser) {
 	case 0:
 		symbol_conf.report_individual_block = true;
@@ -448,6 +515,11 @@ int report__browse_block_hists(struct block_hist *bh, float min_percent,
 			       stdout, true);
 		hists__delete_entries(&bh->block_hists);
 		return 0;
+	case 1:
+		symbol_conf.report_individual_block = true;
+		ret = block_hists_tui_browse(bh, evsel, min_percent);
+		hists__delete_entries(&bh->block_hists);
+		return ret;
 	default:
 		return -1;
 	}

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [tip: perf/core] perf report: Sort by sampled cycles percent per block for stdio
  2019-11-07  7:47 ` [PATCH v7 5/7] perf report: Sort by sampled cycles percent per block for stdio Jin Yao
@ 2019-11-12 11:17   ` tip-bot2 for Jin Yao
  0 siblings, 0 replies; 22+ messages in thread
From: tip-bot2 for Jin Yao @ 2019-11-12 11:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Jin Yao, Jiri Olsa, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Andi Kleen, Jin Yao, Kan Liang, Peter Zijlstra, Ingo Molnar,
	Borislav Petkov, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     6f7164fa231a5f360e576593c547bea7dc56ddbc
Gitweb:        https://git.kernel.org/tip/6f7164fa231a5f360e576593c547bea7dc56ddbc
Author:        Jin Yao <yao.jin@linux.intel.com>
AuthorDate:    Thu, 07 Nov 2019 15:47:17 +08:00
Committer:     Arnaldo Carvalho de Melo <acme@redhat.com>
CommitterDate: Thu, 07 Nov 2019 10:14:48 -03:00

perf report: Sort by sampled cycles percent per block for stdio

It would be useful to support sorting for all blocks by the sampled
cycles percent per block. This is useful to concentrate on the globally
hottest blocks.

This patch implements a new option "--total-cycles" which sorts all
blocks by 'Sampled Cycles%'. The 'Sampled Cycles%' is the percent:

 percent = block sampled cycles aggregation / total sampled cycles

Note that, this patch only supports "--stdio" mode.

For example,

  # perf record -b ./div
  # perf report --total-cycles --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  # Total Lost Samples: 0
  #
  # Samples: 2M of event 'cycles'
  # Event count (approx.): 2753248
  #
  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                             [Program Block Range]      Shared Object
  # ...............  ..............  ...........  ..........  ................................................  .................
  #
             26.04%            2.8M        0.40%          18                            [div.c:42 -> div.c:39]                div
             15.17%            1.2M        0.16%           7                [random_r.c:357 -> random_r.c:380]       libc-2.27.so
              5.11%          402.0K        0.04%           2                            [div.c:27 -> div.c:28]                div
              4.87%          381.6K        0.04%           2                    [random.c:288 -> random.c:291]       libc-2.27.so
              4.53%          381.0K        0.04%           2                            [div.c:40 -> div.c:40]                div
              3.85%          300.9K        0.02%           1                            [div.c:22 -> div.c:25]                div
              3.08%          241.1K        0.02%           1                          [rand.c:26 -> rand.c:27]       libc-2.27.so
              3.06%          240.0K        0.02%           1                    [random.c:291 -> random.c:291]       libc-2.27.so
              2.78%          215.7K        0.02%           1                    [random.c:298 -> random.c:298]       libc-2.27.so
              2.52%          198.3K        0.02%           1                    [random.c:293 -> random.c:293]       libc-2.27.so
              2.36%          184.8K        0.02%           1                          [rand.c:28 -> rand.c:28]       libc-2.27.so
              2.33%          180.5K        0.02%           1                    [random.c:295 -> random.c:295]       libc-2.27.so
              2.28%          176.7K        0.02%           1                    [random.c:295 -> random.c:295]       libc-2.27.so
              2.20%          168.8K        0.02%           1                        [rand@plt+0 -> rand@plt+0]                div
              1.98%          158.2K        0.02%           1                [random_r.c:388 -> random_r.c:388]       libc-2.27.so
              1.57%          123.3K        0.02%           1                            [div.c:42 -> div.c:44]                div
              1.44%          116.0K        0.42%          19                [random_r.c:357 -> random_r.c:394]       libc-2.27.so
              0.25%          182.5K        0.02%           1                [random_r.c:388 -> random_r.c:391]       libc-2.27.so
              0.00%              48        1.07%          48        [x86_pmu_enable+284 -> x86_pmu_enable+298]  [kernel.kallsyms]
              0.00%              74        1.64%          74             [vm_mmap_pgoff+0 -> vm_mmap_pgoff+92]  [kernel.kallsyms]
              0.00%              73        1.62%          73                         [vm_mmap+0 -> vm_mmap+48]  [kernel.kallsyms]
              0.00%              63        0.69%          31                       [up_write+0 -> up_write+34]  [kernel.kallsyms]
              0.00%              13        0.29%          13      [setup_arg_pages+396 -> setup_arg_pages+413]  [kernel.kallsyms]
              0.00%               3        0.07%           3      [setup_arg_pages+418 -> setup_arg_pages+450]  [kernel.kallsyms]
              0.00%             616        6.84%         308   [security_mmap_file+0 -> security_mmap_file+72]  [kernel.kallsyms]
              0.00%              23        0.51%          23  [security_mmap_file+77 -> security_mmap_file+87]  [kernel.kallsyms]
              0.00%               4        0.02%           1                  [sched_clock+0 -> sched_clock+4]  [kernel.kallsyms]
              0.00%               4        0.02%           1                 [sched_clock+9 -> sched_clock+12]  [kernel.kallsyms]
              0.00%               1        0.02%           1                [rcu_nmi_exit+0 -> rcu_nmi_exit+9]  [kernel.kallsyms]

Committer testing:

This should provide material for hours of endless joy, both from looking
for suspicious things in the implementation of this patch, such as the
top one:

  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                          [Program Block Range]     Shared Object
              2.17%            1.7M        0.08%         607   [compiler.h:199 -> common.c:221]              [kernel.vmlinux]

As well from things that look legit:

  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                          [Program Block Range]     Shared Object
              0.16%          123.0K        0.60%        4.7K   [nospec-branch.h:265 -> nospec-branch.h:278]  [kernel.vmlinux]

:-)

Very short system wide taken branches session:

  # perf record -h -b

   Usage: perf record [<options>] [<command>]
      or: perf record [<options>] -- <command> [<options>]

      -b, --branch-any      sample any taken branches

  #
  # perf record -b
  ^C[ perf record: Woken up 595 times to write data ]
  [ perf record: Captured and wrote 156.672 MB perf.data (196873 samples) ]

  #
  # perf evlist -v
  cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|BRANCH_STACK, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: ANY
  #
  # perf report --total-cycles --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  # Total Lost Samples: 0
  #
  # Samples: 6M of event 'cycles'
  # Event count (approx.): 6299936
  #
  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                                   [Program Block Range]         Shared Object
  # ...............  ..............  ...........  ..........  ......................................................................  ....................
  #
              2.17%            1.7M        0.08%         607                                        [compiler.h:199 -> common.c:221]      [kernel.vmlinux]
              1.75%            1.3M        8.34%       65.5K    [memset-vec-unaligned-erms.S:147 -> memset-vec-unaligned-erms.S:151]          libc-2.29.so
              0.72%          544.5K        0.03%         230                                      [entry_64.S:657 -> entry_64.S:662]      [kernel.vmlinux]
              0.56%          541.8K        0.09%         672                                        [compiler.h:199 -> common.c:300]      [kernel.vmlinux]
              0.39%          293.2K        0.01%         104                                    [list_debug.c:43 -> list_debug.c:61]      [kernel.vmlinux]
              0.36%          278.6K        0.03%         272                                    [entry_64.S:1289 -> entry_64.S:1308]      [kernel.vmlinux]
              0.30%          260.8K        0.07%         564                              [clear_page_64.S:47 -> clear_page_64.S:50]      [kernel.vmlinux]
              0.28%          215.3K        0.05%         369                                            [traps.c:623 -> traps.c:628]      [kernel.vmlinux]
              0.23%          178.1K        0.04%         278                                      [entry_64.S:271 -> entry_64.S:275]      [kernel.vmlinux]
              0.20%          152.6K        0.09%         706                                      [paravirt.c:177 -> paravirt.c:179]      [kernel.vmlinux]
              0.20%          155.8K        0.05%         373                                      [entry_64.S:153 -> entry_64.S:175]      [kernel.vmlinux]
              0.18%          136.6K        0.03%         222                                                [msr.h:105 -> msr.h:166]      [kernel.vmlinux]
              0.16%          123.0K        0.60%        4.7K                            [nospec-branch.h:265 -> nospec-branch.h:278]      [kernel.vmlinux]
              0.16%          118.3K        0.01%          44                                      [entry_64.S:632 -> entry_64.S:657]      [kernel.vmlinux]
              0.14%          104.5K        0.00%          28                                          [rwsem.c:1541 -> rwsem.c:1544]      [kernel.vmlinux]
              0.13%           99.2K        0.01%          53                                      [spinlock.c:150 -> spinlock.c:152]      [kernel.vmlinux]
              0.13%           95.5K        0.00%          35                                              [swap.c:456 -> swap.c:471]      [kernel.vmlinux]
              0.12%           96.2K        0.05%         407                              [copy_user_64.S:175 -> copy_user_64.S:209]      [kernel.vmlinux]
              0.11%           85.9K        0.00%          31                                        [swap.c:400 -> page-flags.h:188]      [kernel.vmlinux]
              0.10%           73.0K        0.01%          52                                          [paravirt.h:763 -> list.h:131]      [kernel.vmlinux]
              0.07%           56.2K        0.03%         214                                      [filemap.c:1524 -> filemap.c:1557]      [kernel.vmlinux]
              0.07%           54.2K        0.02%         145                                        [memory.c:1032 -> memory.c:1049]      [kernel.vmlinux]
              0.07%           50.3K        0.00%          39                                            [mmzone.c:49 -> mmzone.c:69]      [kernel.vmlinux]
              0.06%           48.3K        0.01%          40                                   [paravirt.h:768 -> page_alloc.c:3304]      [kernel.vmlinux]
              0.06%           46.7K        0.02%         155                                        [memory.c:1032 -> memory.c:1056]      [kernel.vmlinux]
              0.06%           46.9K        0.01%         103                                              [swap.c:867 -> swap.c:902]      [kernel.vmlinux]
              0.06%           47.8K        0.00%          34                                    [entry_64.S:1201 -> entry_64.S:1202]      [kernel.vmlinux]

 -----------------------------------------------------------

 v7:
 ---
 Use use_browser in report__browse_block_hists for supporting
 stdio and potential tui mode.

 v6:
 ---
 Create report__browse_block_hists in block-info.c (codes are
 moved from builtin-report.c). It's called from
 perf_evlist__tty_browse_hists.

 v5:
 ---
 1. Move all block functions to block-info.c

 2. Move the code of setting ms in block hist_entry to
    other patch.

 v4:
 ---
 1. Use new option '--total-cycles' to replace
    '-s total_cycles' in v3.

 2. Move block info collection out of block info
    printing.

 v3:
 ---
 1. Use common function block_info__process_sym to
    process the blocks per symbol.

 2. Remove the nasty hack for skipping calculation
    of column length

 3. Some minor cleanup

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191107074719.26139-6-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-report.txt | 11 ++++++-
 tools/perf/builtin-report.c              | 44 +++++++++++++++++++++--
 tools/perf/ui/stdio/hist.c               | 22 ++++++++++++-
 tools/perf/util/block-info.c             | 17 +++++++++-
 tools/perf/util/block-info.h             |  4 ++-
 tools/perf/util/symbol_conf.h            |  1 +-
 6 files changed, 96 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 7315f15..8dbe211 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -525,6 +525,17 @@ include::itrace.txt[]
 	Configure time quantum for time sort key. Default 100ms.
 	Accepts s, us, ms, ns units.
 
+--total-cycles::
+	When --total-cycles is specified, it supports sorting for all blocks by
+	'Sampled Cycles%'. This is useful to concentrate on the globally hottest
+	blocks. In output, there are some new columns:
+
+	'Sampled Cycles%' - block sampled cycles aggregation / total sampled cycles
+	'Sampled Cycles'  - block sampled cycles aggregation
+	'Avg Cycles%'     - block average sampled cycles / sum of total block average
+			    sampled cycles
+	'Avg Cycles'      - block average sampled cycles
+
 include::callchain-overhead-calculation.txt[]
 
 SEE ALSO
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index bc15b9d..992b18b 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -51,6 +51,7 @@
 #include "util/util.h" // perf_tip()
 #include "ui/ui.h"
 #include "ui/progress.h"
+#include "util/block-info.h"
 
 #include <dlfcn.h>
 #include <errno.h>
@@ -96,10 +97,13 @@ struct report {
 	float			min_percent;
 	u64			nr_entries;
 	u64			queue_size;
+	u64			total_cycles;
 	int			socket_filter;
 	DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
 	struct branch_type_stat	brtype_stat;
 	bool			symbol_ipc;
+	bool			total_cycles_mode;
+	struct block_report	*block_reports;
 };
 
 static int report__config(const char *var, const char *value, void *cb)
@@ -290,9 +294,10 @@ static int process_sample_event(struct perf_tool *tool,
 	if (al.map != NULL)
 		al.map->dso->hit = 1;
 
-	if (ui__has_annotation() || rep->symbol_ipc) {
+	if (ui__has_annotation() || rep->symbol_ipc || rep->total_cycles_mode) {
 		hist__account_cycles(sample->branch_stack, &al, sample,
-				     rep->nonany_branch_mode, NULL);
+				     rep->nonany_branch_mode,
+				     &rep->total_cycles);
 	}
 
 	ret = hist_entry_iter__add(&iter, &al, rep->max_stack, rep);
@@ -485,6 +490,7 @@ static int perf_evlist__tty_browse_hists(struct evlist *evlist,
 					 const char *help)
 {
 	struct evsel *pos;
+	int i = 0;
 
 	if (!quiet) {
 		fprintf(stdout, "#\n# Total Lost Samples: %" PRIu64 "\n#\n",
@@ -500,6 +506,13 @@ static int perf_evlist__tty_browse_hists(struct evlist *evlist,
 			continue;
 
 		hists__fprintf_nr_sample_events(hists, rep, evname, stdout);
+
+		if (rep->total_cycles_mode) {
+			report__browse_block_hists(&rep->block_reports[i++].hist,
+						   0, pos);
+			continue;
+		}
+
 		hists__fprintf(hists, !quiet, 0, 0, rep->min_percent, stdout,
 			       !(symbol_conf.use_callchain ||
 			         symbol_conf.show_branchflag_count));
@@ -925,6 +938,13 @@ static int __cmd_report(struct report *rep)
 
 	report__output_resort(rep);
 
+	if (rep->total_cycles_mode) {
+		rep->block_reports = block_info__create_report(session->evlist,
+							       rep->total_cycles);
+		if (!rep->block_reports)
+			return -1;
+	}
+
 	return report__browse_hists(rep);
 }
 
@@ -1209,6 +1229,8 @@ int cmd_report(int argc, const char **argv)
 		     "Set time quantum for time sort key (default 100ms)",
 		     parse_time_quantum),
 	OPTS_EVSWITCH(&report.evswitch),
+	OPT_BOOLEAN(0, "total-cycles", &report.total_cycles_mode,
+		    "Sort all blocks by 'Sampled Cycles%'"),
 	OPT_END()
 	};
 	struct perf_data data = {
@@ -1371,6 +1393,17 @@ repeat:
 		goto error;
 	}
 
+	if (report.total_cycles_mode) {
+		if (sort__mode != SORT_MODE__BRANCH)
+			report.total_cycles_mode = false;
+		else if (!report.use_stdio) {
+			pr_err("Error: --total-cycles can be only used together with --stdio\n");
+			goto error;
+		} else {
+			sort_order = "sym";
+		}
+	}
+
 	if (strcmp(input_name, "-") != 0)
 		setup_browser(true);
 	else
@@ -1421,7 +1454,8 @@ repeat:
 	 * so don't allocate extra space that won't be used in the stdio
 	 * implementation.
 	 */
-	if (ui__has_annotation() || report.symbol_ipc) {
+	if (ui__has_annotation() || report.symbol_ipc ||
+	    report.total_cycles_mode) {
 		ret = symbol__annotation_init();
 		if (ret < 0)
 			goto error;
@@ -1482,6 +1516,10 @@ error:
 		itrace_synth_opts__clear_time_range(&itrace_synth_opts);
 		zfree(&report.ptime_range);
 	}
+
+	if (report.block_reports)
+		zfree(&report.block_reports);
+
 	zstd_fini(&(session->zstd_data));
 	perf_session__delete(session);
 	return ret;
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 5365606..655ef77 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -558,6 +558,25 @@ static int hist_entry__block_fprintf(struct hist_entry *he,
 	return ret;
 }
 
+static int hist_entry__individual_block_fprintf(struct hist_entry *he,
+						char *bf, size_t size,
+						FILE *fp)
+{
+	int ret = 0;
+
+	struct perf_hpp hpp = {
+		.buf		= bf,
+		.size		= size,
+		.skip		= false,
+	};
+
+	hist_entry__snprintf(he, &hpp);
+	if (!hpp.skip)
+		ret += fprintf(fp, "%s\n", bf);
+
+	return ret;
+}
+
 static int hist_entry__fprintf(struct hist_entry *he, size_t size,
 			       char *bf, size_t bfsz, FILE *fp,
 			       bool ignore_callchains)
@@ -580,6 +599,9 @@ static int hist_entry__fprintf(struct hist_entry *he, size_t size,
 	if (symbol_conf.report_block)
 		return hist_entry__block_fprintf(he, bf, size, fp);
 
+	if (symbol_conf.report_individual_block)
+		return hist_entry__individual_block_fprintf(he, bf, size, fp);
+
 	hist_entry__snprintf(he, &hpp);
 
 	ret = fprintf(fp, "%s\n", bf);
diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
index 4a7bac9..ba89175 100644
--- a/tools/perf/util/block-info.c
+++ b/tools/perf/util/block-info.c
@@ -437,3 +437,20 @@ struct block_report *block_info__create_report(struct evlist *evlist,
 
 	return block_reports;
 }
+
+int report__browse_block_hists(struct block_hist *bh, float min_percent,
+			       struct evsel *evsel __maybe_unused)
+{
+	switch (use_browser) {
+	case 0:
+		symbol_conf.report_individual_block = true;
+		hists__fprintf(&bh->block_hists, true, 0, 0, min_percent,
+			       stdout, true);
+		hists__delete_entries(&bh->block_hists);
+		return 0;
+	default:
+		return -1;
+	}
+
+	return 0;
+}
diff --git a/tools/perf/util/block-info.h b/tools/perf/util/block-info.h
index b526658..8309297 100644
--- a/tools/perf/util/block-info.h
+++ b/tools/perf/util/block-info.h
@@ -7,6 +7,7 @@
 #include "hist.h"
 #include "symbol.h"
 #include "sort.h"
+#include "ui/ui.h"
 
 struct block_info {
 	struct symbol		*sym;
@@ -69,4 +70,7 @@ int block_info__process_sym(struct hist_entry *he, struct block_hist *bh,
 struct block_report *block_info__create_report(struct evlist *evlist,
 					       u64 total_cycles);
 
+int report__browse_block_hists(struct block_hist *bh, float min_percent,
+			       struct evsel *evsel);
+
 #endif /* __PERF_BLOCK_H */
diff --git a/tools/perf/util/symbol_conf.h b/tools/perf/util/symbol_conf.h
index e688078..10f1ec3 100644
--- a/tools/perf/util/symbol_conf.h
+++ b/tools/perf/util/symbol_conf.h
@@ -40,6 +40,7 @@ struct symbol_conf {
 			raw_trace,
 			report_hierarchy,
 			report_block,
+			report_individual_block,
 			inline_name,
 			disable_add2line_warn;
 	const char	*vmlinux_name,

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [tip: perf/core] perf report: Support --percent-limit for --total-cycles
  2019-11-07  7:47 ` [PATCH v7 6/7] perf report: Support --percent-limit for --total-cycles Jin Yao
@ 2019-11-12 11:17   ` tip-bot2 for Jin Yao
  0 siblings, 0 replies; 22+ messages in thread
From: tip-bot2 for Jin Yao @ 2019-11-12 11:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Jin Yao, Jiri Olsa, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Andi Kleen, Jin Yao, Kan Liang, Peter Zijlstra, Ingo Molnar,
	Borislav Petkov, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     0b49f83657d5fb9a35c592faab7c5ea16e387539
Gitweb:        https://git.kernel.org/tip/0b49f83657d5fb9a35c592faab7c5ea16e387539
Author:        Jin Yao <yao.jin@linux.intel.com>
AuthorDate:    Thu, 07 Nov 2019 15:47:18 +08:00
Committer:     Arnaldo Carvalho de Melo <acme@redhat.com>
CommitterDate: Thu, 07 Nov 2019 10:14:48 -03:00

perf report: Support --percent-limit for --total-cycles

We have already supported the '--total-cycles' option in previous patch.
It's also useful to show entries only above a threshold percent.

This patch enables '--percent-limit' for not showing entries
under that percent.

For example:

 perf report --total-cycles --stdio --percent-limit 1

 # To display the perf.data header info, please use --header/--header-only options.
 #
 #
 # Total Lost Samples: 0
 #
 # Samples: 2M of event 'cycles'
 # Event count (approx.): 2753248
 #
 # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                              [Program Block Range]         Shared Object
 # ...............  ..............  ...........  ..........  .................................................................  ....................
 #
            26.04%            2.8M        0.40%          18                                             [div.c:42 -> div.c:39]                   div
            15.17%            1.2M        0.16%           7                                 [random_r.c:357 -> random_r.c:380]          libc-2.27.so
             5.11%          402.0K        0.04%           2                                             [div.c:27 -> div.c:28]                   div
             4.87%          381.6K        0.04%           2                                     [random.c:288 -> random.c:291]          libc-2.27.so
             4.53%          381.0K        0.04%           2                                             [div.c:40 -> div.c:40]                   div
             3.85%          300.9K        0.02%           1                                             [div.c:22 -> div.c:25]                   div
             3.08%          241.1K        0.02%           1                                           [rand.c:26 -> rand.c:27]          libc-2.27.so
             3.06%          240.0K        0.02%           1                                     [random.c:291 -> random.c:291]          libc-2.27.so
             2.78%          215.7K        0.02%           1                                     [random.c:298 -> random.c:298]          libc-2.27.so
             2.52%          198.3K        0.02%           1                                     [random.c:293 -> random.c:293]          libc-2.27.so
             2.36%          184.8K        0.02%           1                                           [rand.c:28 -> rand.c:28]          libc-2.27.so
             2.33%          180.5K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
             2.28%          176.7K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
             2.20%          168.8K        0.02%           1                                         [rand@plt+0 -> rand@plt+0]                   div
             1.98%          158.2K        0.02%           1                                 [random_r.c:388 -> random_r.c:388]          libc-2.27.so
             1.57%          123.3K        0.02%           1                                             [div.c:42 -> div.c:44]                   div
             1.44%          116.0K        0.42%          19                                 [random_r.c:357 -> random_r.c:394]          libc-2.27.so

Committer testing:

>From second exapmple onwards slightly edited for brevity:

  # perf report --total-cycles --percent-limit 2 --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 6M of event 'cycles'
  # Event count (approx.): 6299936
  #
  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                                   [Program Block Range]         Shared Object
  # ...............  ..............  ...........  ..........  ......................................................................  ....................
  #
              2.17%            1.7M        0.08%         607                                        [compiler.h:199 -> common.c:221]      [kernel.vmlinux]
  #
  # (Tip: Create an archive with symtabs to analyse on other machine: perf archive)
  #
  # perf report --total-cycles --percent-limit 1 --stdio
  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                                   [Program Block Range]         Shared Object
              2.17%            1.7M        0.08%         607                                        [compiler.h:199 -> common.c:221]      [kernel.vmlinux]
              1.75%            1.3M        8.34%       65.5K    [memset-vec-unaligned-erms.S:147 -> memset-vec-unaligned-erms.S:151]          libc-2.29.so
  #
  # perf report --total-cycles --percent-limit 0.7 --stdio
  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                                   [Program Block Range]         Shared Object
              2.17%            1.7M        0.08%         607                                        [compiler.h:199 -> common.c:221]      [kernel.vmlinux]
              1.75%            1.3M        8.34%       65.5K    [memset-vec-unaligned-erms.S:147 -> memset-vec-unaligned-erms.S:151]          libc-2.29.so
              0.72%          544.5K        0.03%         230                                      [entry_64.S:657 -> entry_64.S:662]      [kernel.vmlinux]
  #

-------------------------------------------

It only shows the entries which 'Sampled Cycles%' > 1%.

 v7:
 ---
 No functional change. Only fix the conflict issue because
 previous patches are changed.

 v6:
 ---
 No functional change. Only fix the conflict issue because
 previous patches are changed.

 v5:
 ---
 No functional change. Only fix the conflict issue because
 previous patches are changed.

 v4:
 ---
 No functional change. Only fix the build issue because
 previous patches are changed.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191107074719.26139-7-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-report.c  |  2 +-
 tools/perf/ui/stdio/hist.c   |  7 ++++++-
 tools/perf/util/block-info.c | 10 ++++++++++
 tools/perf/util/block-info.h |  2 ++
 4 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 992b18b..ca41187 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -509,7 +509,7 @@ static int perf_evlist__tty_browse_hists(struct evlist *evlist,
 
 		if (rep->total_cycles_mode) {
 			report__browse_block_hists(&rep->block_reports[i++].hist,
-						   0, pos);
+						   rep->min_percent, pos);
 			continue;
 		}
 
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 655ef77..132056c 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -15,6 +15,7 @@
 #include "../../util/srcline.h"
 #include "../../util/string2.h"
 #include "../../util/thread.h"
+#include "../../util/block-info.h"
 #include <linux/ctype.h>
 #include <linux/zalloc.h>
 
@@ -856,7 +857,11 @@ size_t hists__fprintf(struct hists *hists, bool show_header, int max_rows,
 		if (h->filtered)
 			continue;
 
-		percent = hist_entry__get_percent_limit(h);
+		if (symbol_conf.report_individual_block)
+			percent = block_info__total_cycles_percent(h);
+		else
+			percent = hist_entry__get_percent_limit(h);
+
 		if (percent < min_pcnt)
 			continue;
 
diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
index ba89175..597d120 100644
--- a/tools/perf/util/block-info.c
+++ b/tools/perf/util/block-info.c
@@ -454,3 +454,13 @@ int report__browse_block_hists(struct block_hist *bh, float min_percent,
 
 	return 0;
 }
+
+float block_info__total_cycles_percent(struct hist_entry *he)
+{
+	struct block_info *bi = he->block_info;
+
+	if (bi->total_cycles)
+		return bi->cycles * 100.0 / bi->total_cycles;
+
+	return 0.0;
+}
diff --git a/tools/perf/util/block-info.h b/tools/perf/util/block-info.h
index 8309297..e4d20bc 100644
--- a/tools/perf/util/block-info.h
+++ b/tools/perf/util/block-info.h
@@ -73,4 +73,6 @@ struct block_report *block_info__create_report(struct evlist *evlist,
 int report__browse_block_hists(struct block_hist *bh, float min_percent,
 			       struct evsel *evsel);
 
+float block_info__total_cycles_percent(struct hist_entry *he);
+
 #endif /* __PERF_BLOCK_H */

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [tip: perf/core] perf hist: Support block formats with compare/sort/display
  2019-11-07  7:47 ` [PATCH v7 4/7] perf util: Support block formats with compare/sort/display Jin Yao
  2019-11-07 12:45   ` Arnaldo Carvalho de Melo
@ 2019-11-12 11:17   ` tip-bot2 for Jin Yao
  1 sibling, 0 replies; 22+ messages in thread
From: tip-bot2 for Jin Yao @ 2019-11-12 11:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Jin Yao, Jiri Olsa, Alexander Shishkin, Andi Kleen, Jin Yao,
	Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar,
	Borislav Petkov, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     b65a7d372b1a55db6fb48a5b3c48941eb68716cb
Gitweb:        https://git.kernel.org/tip/b65a7d372b1a55db6fb48a5b3c48941eb68716cb
Author:        Jin Yao <yao.jin@linux.intel.com>
AuthorDate:    Thu, 07 Nov 2019 15:47:16 +08:00
Committer:     Arnaldo Carvalho de Melo <acme@redhat.com>
CommitterDate: Thu, 07 Nov 2019 10:14:05 -03:00

perf hist: Support block formats with compare/sort/display

This patch provides helper routines to support new columns for block
info output.

The new columns are:

  Sampled Cycles%
  Sampled Cycles
  Avg Cycles%
  Avg Cycles
  [Program Block Range]
  Shared Object

 v5:
 ---
 1. Move more block related functions from builtin-report.c to
    block-info.c

 2. Set ms (map+sym) in block hist_entry. Because this info
    is needed for reporting the block range (i.e. source line)

Committer notes:

Remove unused set_fmt() function, some build were not completing with:

  util/block-info.c:396:20: error: unused function 'set_fmt' [-Werror,-Wunused-function]
  static inline void set_fmt(struct block_fmt *block_fmt,
                     ^
  1 error generated.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191107074719.26139-5-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/block-info.c | 310 ++++++++++++++++++++++++++++++++++-
 tools/perf/util/block-info.h |  33 +++-
 tools/perf/util/hist.c       |   4 +-
 3 files changed, 345 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
index b9954a3..4a7bac9 100644
--- a/tools/perf/util/block-info.c
+++ b/tools/perf/util/block-info.c
@@ -6,6 +6,40 @@
 #include "sort.h"
 #include "annotate.h"
 #include "symbol.h"
+#include "dso.h"
+#include "map.h"
+#include "srcline.h"
+#include "evlist.h"
+
+static struct block_header_column {
+	const char *name;
+	int width;
+} block_columns[PERF_HPP_REPORT__BLOCK_MAX_INDEX] = {
+	[PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT] = {
+		.name = "Sampled Cycles%",
+		.width = 15,
+	},
+	[PERF_HPP_REPORT__BLOCK_LBR_CYCLES] = {
+		.name = "Sampled Cycles",
+		.width = 14,
+	},
+	[PERF_HPP_REPORT__BLOCK_CYCLES_PCT] = {
+		.name = "Avg Cycles%",
+		.width = 11,
+	},
+	[PERF_HPP_REPORT__BLOCK_AVG_CYCLES] = {
+		.name = "Avg Cycles",
+		.width = 10,
+	},
+	[PERF_HPP_REPORT__BLOCK_RANGE] = {
+		.name = "[Program Block Range]",
+		.width = 70,
+	},
+	[PERF_HPP_REPORT__BLOCK_DSO] = {
+		.name = "Shared Object",
+		.width = 20,
+	}
+};
 
 struct block_info *block_info__get(struct block_info *bi)
 {
@@ -127,3 +161,279 @@ int block_info__process_sym(struct hist_entry *he, struct block_hist *bh,
 
 	return 0;
 }
+
+static int block_column_header(struct perf_hpp_fmt *fmt,
+			       struct perf_hpp *hpp,
+			       struct hists *hists __maybe_unused,
+			       int line __maybe_unused,
+			       int *span __maybe_unused)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
+			 block_fmt->header);
+}
+
+static int block_column_width(struct perf_hpp_fmt *fmt,
+			      struct perf_hpp *hpp __maybe_unused,
+			      struct hists *hists __maybe_unused)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+
+	return block_fmt->width;
+}
+
+static int block_total_cycles_pct_entry(struct perf_hpp_fmt *fmt,
+					struct perf_hpp *hpp,
+					struct hist_entry *he)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+	struct block_info *bi = he->block_info;
+	double ratio = 0.0;
+	char buf[16];
+
+	if (block_fmt->total_cycles)
+		ratio = (double)bi->cycles / (double)block_fmt->total_cycles;
+
+	sprintf(buf, "%.2f%%", 100.0 * ratio);
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width, buf);
+}
+
+static int64_t block_total_cycles_pct_sort(struct perf_hpp_fmt *fmt,
+					   struct hist_entry *left,
+					   struct hist_entry *right)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+	struct block_info *bi_l = left->block_info;
+	struct block_info *bi_r = right->block_info;
+	double l, r;
+
+	if (block_fmt->total_cycles) {
+		l = ((double)bi_l->cycles /
+			(double)block_fmt->total_cycles) * 100000.0;
+		r = ((double)bi_r->cycles /
+			(double)block_fmt->total_cycles) * 100000.0;
+		return (int64_t)l - (int64_t)r;
+	}
+
+	return 0;
+}
+
+static void cycles_string(u64 cycles, char *buf, int size)
+{
+	if (cycles >= 1000000)
+		scnprintf(buf, size, "%.1fM", (double)cycles / 1000000.0);
+	else if (cycles >= 1000)
+		scnprintf(buf, size, "%.1fK", (double)cycles / 1000.0);
+	else
+		scnprintf(buf, size, "%1d", cycles);
+}
+
+static int block_cycles_lbr_entry(struct perf_hpp_fmt *fmt,
+				  struct perf_hpp *hpp, struct hist_entry *he)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+	struct block_info *bi = he->block_info;
+	char cycles_buf[16];
+
+	cycles_string(bi->cycles_aggr, cycles_buf, sizeof(cycles_buf));
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
+			 cycles_buf);
+}
+
+static int block_cycles_pct_entry(struct perf_hpp_fmt *fmt,
+				  struct perf_hpp *hpp, struct hist_entry *he)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+	struct block_info *bi = he->block_info;
+	double ratio = 0.0;
+	u64 avg;
+	char buf[16];
+
+	if (block_fmt->block_cycles && bi->num_aggr) {
+		avg = bi->cycles_aggr / bi->num_aggr;
+		ratio = (double)avg / (double)block_fmt->block_cycles;
+	}
+
+	sprintf(buf, "%.2f%%", 100.0 * ratio);
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width, buf);
+}
+
+static int block_avg_cycles_entry(struct perf_hpp_fmt *fmt,
+				  struct perf_hpp *hpp,
+				  struct hist_entry *he)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+	struct block_info *bi = he->block_info;
+	char cycles_buf[16];
+
+	cycles_string(bi->cycles_aggr / bi->num_aggr, cycles_buf,
+		      sizeof(cycles_buf));
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
+			 cycles_buf);
+}
+
+static int block_range_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+			     struct hist_entry *he)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+	struct block_info *bi = he->block_info;
+	char buf[128];
+	char *start_line, *end_line;
+
+	symbol_conf.disable_add2line_warn = true;
+
+	start_line = map__srcline(he->ms.map, bi->sym->start + bi->start,
+				  he->ms.sym);
+
+	end_line = map__srcline(he->ms.map, bi->sym->start + bi->end,
+				he->ms.sym);
+
+	if ((start_line != SRCLINE_UNKNOWN) && (end_line != SRCLINE_UNKNOWN)) {
+		scnprintf(buf, sizeof(buf), "[%s -> %s]",
+			  start_line, end_line);
+	} else {
+		scnprintf(buf, sizeof(buf), "[%7lx -> %7lx]",
+			  bi->start, bi->end);
+	}
+
+	free_srcline(start_line);
+	free_srcline(end_line);
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width, buf);
+}
+
+static int block_dso_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+			   struct hist_entry *he)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+	struct map *map = he->ms.map;
+
+	if (map && map->dso) {
+		return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
+				 map->dso->short_name);
+	}
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
+			 "[unknown]");
+}
+
+static void init_block_header(struct block_fmt *block_fmt)
+{
+	struct perf_hpp_fmt *fmt = &block_fmt->fmt;
+
+	BUG_ON(block_fmt->idx >= PERF_HPP_REPORT__BLOCK_MAX_INDEX);
+
+	block_fmt->header = block_columns[block_fmt->idx].name;
+	block_fmt->width = block_columns[block_fmt->idx].width;
+
+	fmt->header = block_column_header;
+	fmt->width = block_column_width;
+}
+
+static void hpp_register(struct block_fmt *block_fmt, int idx,
+			 struct perf_hpp_list *hpp_list)
+{
+	struct perf_hpp_fmt *fmt = &block_fmt->fmt;
+
+	block_fmt->idx = idx;
+	INIT_LIST_HEAD(&fmt->list);
+	INIT_LIST_HEAD(&fmt->sort_list);
+
+	switch (idx) {
+	case PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT:
+		fmt->entry = block_total_cycles_pct_entry;
+		fmt->cmp = block_info__cmp;
+		fmt->sort = block_total_cycles_pct_sort;
+		break;
+	case PERF_HPP_REPORT__BLOCK_LBR_CYCLES:
+		fmt->entry = block_cycles_lbr_entry;
+		break;
+	case PERF_HPP_REPORT__BLOCK_CYCLES_PCT:
+		fmt->entry = block_cycles_pct_entry;
+		break;
+	case PERF_HPP_REPORT__BLOCK_AVG_CYCLES:
+		fmt->entry = block_avg_cycles_entry;
+		break;
+	case PERF_HPP_REPORT__BLOCK_RANGE:
+		fmt->entry = block_range_entry;
+		break;
+	case PERF_HPP_REPORT__BLOCK_DSO:
+		fmt->entry = block_dso_entry;
+		break;
+	default:
+		return;
+	}
+
+	init_block_header(block_fmt);
+	perf_hpp_list__column_register(hpp_list, fmt);
+}
+
+static void register_block_columns(struct perf_hpp_list *hpp_list,
+				   struct block_fmt *block_fmts)
+{
+	for (int i = 0; i < PERF_HPP_REPORT__BLOCK_MAX_INDEX; i++)
+		hpp_register(&block_fmts[i], i, hpp_list);
+}
+
+static void init_block_hist(struct block_hist *bh, struct block_fmt *block_fmts)
+{
+	__hists__init(&bh->block_hists, &bh->block_list);
+	perf_hpp_list__init(&bh->block_list);
+	bh->block_list.nr_header_lines = 1;
+
+	register_block_columns(&bh->block_list, block_fmts);
+
+	perf_hpp_list__register_sort_field(&bh->block_list,
+		&block_fmts[PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT].fmt);
+}
+
+static void process_block_report(struct hists *hists,
+				 struct block_report *block_report,
+				 u64 total_cycles)
+{
+	struct rb_node *next = rb_first_cached(&hists->entries);
+	struct block_hist *bh = &block_report->hist;
+	struct hist_entry *he;
+
+	init_block_hist(bh, block_report->fmts);
+
+	while (next) {
+		he = rb_entry(next, struct hist_entry, rb_node);
+		block_info__process_sym(he, bh, &block_report->cycles,
+					total_cycles);
+		next = rb_next(&he->rb_node);
+	}
+
+	for (int i = 0; i < PERF_HPP_REPORT__BLOCK_MAX_INDEX; i++) {
+		block_report->fmts[i].total_cycles = total_cycles;
+		block_report->fmts[i].block_cycles = block_report->cycles;
+	}
+
+	hists__output_resort(&bh->block_hists, NULL);
+}
+
+struct block_report *block_info__create_report(struct evlist *evlist,
+					       u64 total_cycles)
+{
+	struct block_report *block_reports;
+	int nr_hists = evlist->core.nr_entries, i = 0;
+	struct evsel *pos;
+
+	block_reports = calloc(nr_hists, sizeof(struct block_report));
+	if (!block_reports)
+		return NULL;
+
+	evlist__for_each_entry(evlist, pos) {
+		struct hists *hists = evsel__hists(pos);
+
+		process_block_report(hists, &block_reports[i], total_cycles);
+		i++;
+	}
+
+	return block_reports;
+}
diff --git a/tools/perf/util/block-info.h b/tools/perf/util/block-info.h
index d55dfc2..b526658 100644
--- a/tools/perf/util/block-info.h
+++ b/tools/perf/util/block-info.h
@@ -4,8 +4,9 @@
 
 #include <linux/types.h>
 #include <linux/refcount.h>
-#include "util/hist.h"
-#include "util/symbol.h"
+#include "hist.h"
+#include "symbol.h"
+#include "sort.h"
 
 struct block_info {
 	struct symbol		*sym;
@@ -20,6 +21,31 @@ struct block_info {
 	refcount_t		refcnt;
 };
 
+struct block_fmt {
+	struct perf_hpp_fmt	fmt;
+	int			idx;
+	int			width;
+	const char		*header;
+	u64			total_cycles;
+	u64			block_cycles;
+};
+
+enum {
+	PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT,
+	PERF_HPP_REPORT__BLOCK_LBR_CYCLES,
+	PERF_HPP_REPORT__BLOCK_CYCLES_PCT,
+	PERF_HPP_REPORT__BLOCK_AVG_CYCLES,
+	PERF_HPP_REPORT__BLOCK_RANGE,
+	PERF_HPP_REPORT__BLOCK_DSO,
+	PERF_HPP_REPORT__BLOCK_MAX_INDEX
+};
+
+struct block_report {
+	struct block_hist	hist;
+	u64			cycles;
+	struct block_fmt	fmts[PERF_HPP_REPORT__BLOCK_MAX_INDEX];
+};
+
 struct block_hist;
 
 struct block_info *block_info__new(void);
@@ -40,4 +66,7 @@ int64_t block_info__cmp(struct perf_hpp_fmt *fmt __maybe_unused,
 int block_info__process_sym(struct hist_entry *he, struct block_hist *bh,
 			    u64 *block_cycles_aggr, u64 total_cycles);
 
+struct block_report *block_info__create_report(struct evlist *evlist,
+					       u64 total_cycles);
+
 #endif /* __PERF_BLOCK_H */
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 0e27d68..7cf137b 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -758,6 +758,10 @@ struct hist_entry *hists__add_entry_block(struct hists *hists,
 	struct hist_entry entry = {
 		.block_info = block_info,
 		.hists = hists,
+		.ms = {
+			.map = al->map,
+			.sym = al->sym,
+		},
 	}, *he = hists__findnew_entry(hists, &entry, al, false);
 
 	return he;

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [tip: perf/core] perf hist: Count the total cycles of all samples
  2019-11-07  7:47 ` [PATCH v7 3/7] perf util: Count the total cycles of all samples Jin Yao
@ 2019-11-12 11:17   ` tip-bot2 for Jin Yao
  0 siblings, 0 replies; 22+ messages in thread
From: tip-bot2 for Jin Yao @ 2019-11-12 11:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Jin Yao, Jiri Olsa, Alexander Shishkin, Andi Kleen, Jin Yao,
	Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar,
	Borislav Petkov, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     7841f40aed933dd3838f8d9f2dfcf286c352b7ee
Gitweb:        https://git.kernel.org/tip/7841f40aed933dd3838f8d9f2dfcf286c352b7ee
Author:        Jin Yao <yao.jin@linux.intel.com>
AuthorDate:    Thu, 07 Nov 2019 15:47:15 +08:00
Committer:     Arnaldo Carvalho de Melo <acme@redhat.com>
CommitterDate: Thu, 07 Nov 2019 09:14:15 -03:00

perf hist: Count the total cycles of all samples

We can get the per sample cycles by hist__account_cycles(). It's also
useful to know the total cycles of all samples in order to get the
cycles coverage for a single program block in further. For example:

  coverage = per block sampled cycles / total sampled cycles

This patch creates a new argument 'total_cycles' in hist__account_cycles(),
which will be added with the cycles of each sample.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191107074719.26139-4-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-annotate.c | 2 +-
 tools/perf/builtin-diff.c     | 3 ++-
 tools/perf/builtin-report.c   | 2 +-
 tools/perf/builtin-top.c      | 3 ++-
 tools/perf/util/hist.c        | 6 +++++-
 tools/perf/util/hist.h        | 3 ++-
 6 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 8db8fc9..6ab0cc4 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -201,7 +201,7 @@ static int process_branch_callback(struct evsel *evsel,
 	if (a.map != NULL)
 		a.map->dso->hit = 1;
 
-	hist__account_cycles(sample->branch_stack, al, sample, false);
+	hist__account_cycles(sample->branch_stack, al, sample, false, NULL);
 
 	ret = hist_entry_iter__add(&iter, &a, PERF_MAX_STACK_DEPTH, ann);
 	return ret;
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 6728568..376dbf1 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -426,7 +426,8 @@ static int diff__process_sample_event(struct perf_tool *tool,
 			goto out_put;
 		}
 
-		hist__account_cycles(sample->branch_stack, &al, sample, false);
+		hist__account_cycles(sample->branch_stack, &al, sample, false,
+				     NULL);
 	}
 
 	/*
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 3bbad03..bc15b9d 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -292,7 +292,7 @@ static int process_sample_event(struct perf_tool *tool,
 
 	if (ui__has_annotation() || rep->symbol_ipc) {
 		hist__account_cycles(sample->branch_stack, &al, sample,
-				     rep->nonany_branch_mode);
+				     rep->nonany_branch_mode, NULL);
 	}
 
 	ret = hist_entry_iter__add(&iter, &al, rep->max_stack, rep);
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index d96f24c..14c52e4 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -725,7 +725,8 @@ static int hist_iter__top_callback(struct hist_entry_iter *iter,
 		perf_top__record_precise_ip(top, he, iter->sample, evsel, al->addr);
 
 	hist__account_cycles(iter->sample->branch_stack, al, iter->sample,
-		     !(top->record_opts.branch_stack & PERF_SAMPLE_BRANCH_ANY));
+		     !(top->record_opts.branch_stack & PERF_SAMPLE_BRANCH_ANY),
+		     NULL);
 	return 0;
 }
 
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index a7fa061..0e27d68 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -2572,7 +2572,8 @@ int hists__unlink(struct hists *hists)
 }
 
 void hist__account_cycles(struct branch_stack *bs, struct addr_location *al,
-			  struct perf_sample *sample, bool nonany_branch_mode)
+			  struct perf_sample *sample, bool nonany_branch_mode,
+			  u64 *total_cycles)
 {
 	struct branch_info *bi;
 
@@ -2599,6 +2600,9 @@ void hist__account_cycles(struct branch_stack *bs, struct addr_location *al,
 					nonany_branch_mode ? NULL : prev,
 					bi[i].flags.cycles);
 				prev = &bi[i].to;
+
+				if (total_cycles)
+					*total_cycles += bi[i].flags.cycles;
 			}
 			free(bi);
 		}
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 6a186b6..4d87c7b 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -527,7 +527,8 @@ unsigned int hists__sort_list_width(struct hists *hists);
 unsigned int hists__overhead_width(struct hists *hists);
 
 void hist__account_cycles(struct branch_stack *bs, struct addr_location *al,
-			  struct perf_sample *sample, bool nonany_branch_mode);
+			  struct perf_sample *sample, bool nonany_branch_mode,
+			  u64 *total_cycles);
 
 struct option;
 int parse_filter_percentage(const struct option *opt, const char *arg, int unset);

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [tip: perf/core] perf block: Cleanup and refactor block info functions
  2019-11-07  7:47 ` [PATCH v7 2/7] perf util: Cleanup and refactor block info functions Jin Yao
@ 2019-11-12 11:17   ` tip-bot2 for Jin Yao
  0 siblings, 0 replies; 22+ messages in thread
From: tip-bot2 for Jin Yao @ 2019-11-12 11:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Jin Yao, Jiri Olsa, Alexander Shishkin, Andi Kleen, Jin Yao,
	Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar,
	Borislav Petkov, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     6041441870ab521a2652f1d558a770c586b790be
Gitweb:        https://git.kernel.org/tip/6041441870ab521a2652f1d558a770c586b790be
Author:        Jin Yao <yao.jin@linux.intel.com>
AuthorDate:    Thu, 07 Nov 2019 15:47:14 +08:00
Committer:     Arnaldo Carvalho de Melo <acme@redhat.com>
CommitterDate: Thu, 07 Nov 2019 09:09:18 -03:00

perf block: Cleanup and refactor block info functions

We have already implemented some block-info related functions.
Now it's time to do some cleanup, refactoring and move the
functions and structures to new block-info.h/block-info.c.

 v4:
 ---
 Move code for skipping column length calculation to patch:
 'perf diff: Don't use hack to skip column length calculation'

 v3:
 ---
 1. Rename the patch title
 2. Rename from block.h/block.c to block-info.h/block-info.c
 3. Move more common part to block-info, such as
    block_info__process_sym.
 4. Remove the nasty hack for skipping calculation of column
    length

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191107074719.26139-3-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-diff.c    | 107 ++--------------------------
 tools/perf/util/Build        |   1 +-
 tools/perf/util/block-info.c | 129 ++++++++++++++++++++++++++++++++++-
 tools/perf/util/block-info.h |  43 +++++++++++-
 tools/perf/util/hist.c       |   1 +-
 tools/perf/util/symbol.c     |  22 +------
 tools/perf/util/symbol.h     |  24 +------
 7 files changed, 185 insertions(+), 142 deletions(-)
 create mode 100644 tools/perf/util/block-info.c
 create mode 100644 tools/perf/util/block-info.h

diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index faf99a8..6728568 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -24,6 +24,7 @@
 #include "util/annotate.h"
 #include "util/map.h"
 #include "util/spark.h"
+#include "util/block-info.h"
 #include <linux/err.h>
 #include <linux/zalloc.h>
 #include <subcmd/pager.h>
@@ -98,8 +99,6 @@ static s64 compute_wdiff_w2;
 static const char		*cpu_list;
 static DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
 
-static struct addr_location dummy_al;
-
 enum {
 	COMPUTE_DELTA,
 	COMPUTE_RATIO,
@@ -537,41 +536,6 @@ static void hists__baseline_only(struct hists *hists)
 	}
 }
 
-static int64_t block_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
-			 struct hist_entry *left, struct hist_entry *right)
-{
-	struct block_info *bi_l = left->block_info;
-	struct block_info *bi_r = right->block_info;
-	int cmp;
-
-	if (!bi_l->sym || !bi_r->sym) {
-		if (!bi_l->sym && !bi_r->sym)
-			return 0;
-		else if (!bi_l->sym)
-			return -1;
-		else
-			return 1;
-	}
-
-	if (bi_l->sym == bi_r->sym) {
-		if (bi_l->start == bi_r->start) {
-			if (bi_l->end == bi_r->end)
-				return 0;
-			else
-				return (int64_t)(bi_r->end - bi_l->end);
-		} else
-			return (int64_t)(bi_r->start - bi_l->start);
-	} else {
-		cmp = strcmp(bi_l->sym->name, bi_r->sym->name);
-		return cmp;
-	}
-
-	if (bi_l->sym->start != bi_r->sym->start)
-		return (int64_t)(bi_r->sym->start - bi_l->sym->start);
-
-	return (int64_t)(bi_r->sym->end - bi_l->sym->end);
-}
-
 static int64_t block_cycles_diff_cmp(struct hist_entry *left,
 				     struct hist_entry *right)
 {
@@ -600,67 +564,13 @@ static void init_block_hist(struct block_hist *bh)
 
 	INIT_LIST_HEAD(&bh->block_fmt.list);
 	INIT_LIST_HEAD(&bh->block_fmt.sort_list);
-	bh->block_fmt.cmp = block_cmp;
+	bh->block_fmt.cmp = block_info__cmp;
 	bh->block_fmt.sort = block_sort;
 	perf_hpp_list__register_sort_field(&bh->block_list,
 					   &bh->block_fmt);
 	bh->valid = true;
 }
 
-static void init_block_info(struct block_info *bi, struct symbol *sym,
-			    struct cyc_hist *ch, int offset)
-{
-	bi->sym = sym;
-	bi->start = ch->start;
-	bi->end = offset;
-	bi->cycles = ch->cycles;
-	bi->cycles_aggr = ch->cycles_aggr;
-	bi->num = ch->num;
-	bi->num_aggr = ch->num_aggr;
-
-	memcpy(bi->cycles_spark, ch->cycles_spark,
-	       NUM_SPARKS * sizeof(u64));
-}
-
-static int process_block_per_sym(struct hist_entry *he)
-{
-	struct annotation *notes;
-	struct cyc_hist *ch;
-	struct block_hist *bh;
-
-	if (!he->ms.map || !he->ms.sym)
-		return 0;
-
-	notes = symbol__annotation(he->ms.sym);
-	if (!notes || !notes->src || !notes->src->cycles_hist)
-		return 0;
-
-	bh = container_of(he, struct block_hist, he);
-	init_block_hist(bh);
-
-	ch = notes->src->cycles_hist;
-	for (unsigned int i = 0; i < symbol__size(he->ms.sym); i++) {
-		if (ch[i].num_aggr) {
-			struct block_info *bi;
-			struct hist_entry *he_block;
-
-			bi = block_info__new();
-			if (!bi)
-				return -1;
-
-			init_block_info(bi, he->ms.sym, &ch[i], i);
-			he_block = hists__add_entry_block(&bh->block_hists,
-							  &dummy_al, bi);
-			if (!he_block) {
-				block_info__put(bi);
-				return -1;
-			}
-		}
-	}
-
-	return 0;
-}
-
 static int block_pair_cmp(struct hist_entry *a, struct hist_entry *b)
 {
 	struct block_info *bi_a = a->block_info;
@@ -785,8 +695,11 @@ static void hists__precompute(struct hists *hists)
 		he   = rb_entry(next, struct hist_entry, rb_node_in);
 		next = rb_next(&he->rb_node_in);
 
-		if (compute == COMPUTE_CYCLES)
-			process_block_per_sym(he);
+		if (compute == COMPUTE_CYCLES) {
+			bh = container_of(he, struct block_hist, he);
+			init_block_hist(bh);
+			block_info__process_sym(he, bh, NULL, 0);
+		}
 
 		data__for_each_file_new(i, d) {
 			pair = get_pair_data(he, d);
@@ -805,10 +718,12 @@ static void hists__precompute(struct hists *hists)
 				compute_wdiff(he, pair);
 				break;
 			case COMPUTE_CYCLES:
-				process_block_per_sym(pair);
-				bh = container_of(he, struct block_hist, he);
 				pair_bh = container_of(pair, struct block_hist,
 						       he);
+				init_block_hist(pair_bh);
+				block_info__process_sym(pair, pair_bh, NULL, 0);
+
+				bh = container_of(he, struct block_hist, he);
 
 				if (bh->valid && pair_bh->valid) {
 					block_hists_match(&bh->block_hists,
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 39814b1..b8e05a1 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -1,4 +1,5 @@
 perf-y += annotate.o
+perf-y += block-info.o
 perf-y += block-range.o
 perf-y += build-id.o
 perf-y += cacheline.o
diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
new file mode 100644
index 0000000..b9954a3
--- /dev/null
+++ b/tools/perf/util/block-info.c
@@ -0,0 +1,129 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdlib.h>
+#include <string.h>
+#include <linux/zalloc.h>
+#include "block-info.h"
+#include "sort.h"
+#include "annotate.h"
+#include "symbol.h"
+
+struct block_info *block_info__get(struct block_info *bi)
+{
+	if (bi)
+		refcount_inc(&bi->refcnt);
+	return bi;
+}
+
+void block_info__put(struct block_info *bi)
+{
+	if (bi && refcount_dec_and_test(&bi->refcnt))
+		free(bi);
+}
+
+struct block_info *block_info__new(void)
+{
+	struct block_info *bi = zalloc(sizeof(*bi));
+
+	if (bi)
+		refcount_set(&bi->refcnt, 1);
+	return bi;
+}
+
+int64_t block_info__cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+			struct hist_entry *left, struct hist_entry *right)
+{
+	struct block_info *bi_l = left->block_info;
+	struct block_info *bi_r = right->block_info;
+	int cmp;
+
+	if (!bi_l->sym || !bi_r->sym) {
+		if (!bi_l->sym && !bi_r->sym)
+			return 0;
+		else if (!bi_l->sym)
+			return -1;
+		else
+			return 1;
+	}
+
+	if (bi_l->sym == bi_r->sym) {
+		if (bi_l->start == bi_r->start) {
+			if (bi_l->end == bi_r->end)
+				return 0;
+			else
+				return (int64_t)(bi_r->end - bi_l->end);
+		} else
+			return (int64_t)(bi_r->start - bi_l->start);
+	} else {
+		cmp = strcmp(bi_l->sym->name, bi_r->sym->name);
+		return cmp;
+	}
+
+	if (bi_l->sym->start != bi_r->sym->start)
+		return (int64_t)(bi_r->sym->start - bi_l->sym->start);
+
+	return (int64_t)(bi_r->sym->end - bi_l->sym->end);
+}
+
+static void init_block_info(struct block_info *bi, struct symbol *sym,
+			    struct cyc_hist *ch, int offset,
+			    u64 total_cycles)
+{
+	bi->sym = sym;
+	bi->start = ch->start;
+	bi->end = offset;
+	bi->cycles = ch->cycles;
+	bi->cycles_aggr = ch->cycles_aggr;
+	bi->num = ch->num;
+	bi->num_aggr = ch->num_aggr;
+	bi->total_cycles = total_cycles;
+
+	memcpy(bi->cycles_spark, ch->cycles_spark,
+	       NUM_SPARKS * sizeof(u64));
+}
+
+int block_info__process_sym(struct hist_entry *he, struct block_hist *bh,
+			    u64 *block_cycles_aggr, u64 total_cycles)
+{
+	struct annotation *notes;
+	struct cyc_hist *ch;
+	static struct addr_location al;
+	u64 cycles = 0;
+
+	if (!he->ms.map || !he->ms.sym)
+		return 0;
+
+	memset(&al, 0, sizeof(al));
+	al.map = he->ms.map;
+	al.sym = he->ms.sym;
+
+	notes = symbol__annotation(he->ms.sym);
+	if (!notes || !notes->src || !notes->src->cycles_hist)
+		return 0;
+	ch = notes->src->cycles_hist;
+	for (unsigned int i = 0; i < symbol__size(he->ms.sym); i++) {
+		if (ch[i].num_aggr) {
+			struct block_info *bi;
+			struct hist_entry *he_block;
+
+			bi = block_info__new();
+			if (!bi)
+				return -1;
+
+			init_block_info(bi, he->ms.sym, &ch[i], i,
+					total_cycles);
+			cycles += bi->cycles_aggr / bi->num_aggr;
+
+			he_block = hists__add_entry_block(&bh->block_hists,
+							  &al, bi);
+			if (!he_block) {
+				block_info__put(bi);
+				return -1;
+			}
+		}
+	}
+
+	if (block_cycles_aggr)
+		*block_cycles_aggr += cycles;
+
+	return 0;
+}
diff --git a/tools/perf/util/block-info.h b/tools/perf/util/block-info.h
new file mode 100644
index 0000000..d55dfc2
--- /dev/null
+++ b/tools/perf/util/block-info.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __PERF_BLOCK_H
+#define __PERF_BLOCK_H
+
+#include <linux/types.h>
+#include <linux/refcount.h>
+#include "util/hist.h"
+#include "util/symbol.h"
+
+struct block_info {
+	struct symbol		*sym;
+	u64			start;
+	u64			end;
+	u64			cycles;
+	u64			cycles_aggr;
+	s64			cycles_spark[NUM_SPARKS];
+	u64			total_cycles;
+	int			num;
+	int			num_aggr;
+	refcount_t		refcnt;
+};
+
+struct block_hist;
+
+struct block_info *block_info__new(void);
+struct block_info *block_info__get(struct block_info *bi);
+void   block_info__put(struct block_info *bi);
+
+static inline void __block_info__zput(struct block_info **bi)
+{
+	block_info__put(*bi);
+	*bi = NULL;
+}
+
+#define block_info__zput(bi) __block_info__zput(&bi)
+
+int64_t block_info__cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+			struct hist_entry *left, struct hist_entry *right);
+
+int block_info__process_sym(struct hist_entry *he, struct block_hist *bh,
+			    u64 *block_cycles_aggr, u64 total_cycles);
+
+#endif /* __PERF_BLOCK_H */
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index daa6eef..a7fa061 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -18,6 +18,7 @@
 #include "srcline.h"
 #include "symbol.h"
 #include "thread.h"
+#include "block-info.h"
 #include "ui/progress.h"
 #include <errno.h>
 #include <math.h>
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 4ad39cc..2764863 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -2351,25 +2351,3 @@ struct mem_info *mem_info__new(void)
 		refcount_set(&mi->refcnt, 1);
 	return mi;
 }
-
-struct block_info *block_info__get(struct block_info *bi)
-{
-	if (bi)
-		refcount_inc(&bi->refcnt);
-	return bi;
-}
-
-void block_info__put(struct block_info *bi)
-{
-	if (bi && refcount_dec_and_test(&bi->refcnt))
-		free(bi);
-}
-
-struct block_info *block_info__new(void)
-{
-	struct block_info *bi = zalloc(sizeof(*bi));
-
-	if (bi)
-		refcount_set(&bi->refcnt, 1);
-	return bi;
-}
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index cc2a89b..c3bd16d 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -106,18 +106,6 @@ struct ref_reloc_sym {
 	u64		unrelocated_addr;
 };
 
-struct block_info {
-	struct symbol		*sym;
-	u64			start;
-	u64			end;
-	u64			cycles;
-	u64			cycles_aggr;
-	s64			cycles_spark[NUM_SPARKS];
-	int			num;
-	int			num_aggr;
-	refcount_t		refcnt;
-};
-
 struct addr_location {
 	struct machine *machine;
 	struct thread *thread;
@@ -291,16 +279,4 @@ static inline void __mem_info__zput(struct mem_info **mi)
 
 #define mem_info__zput(mi) __mem_info__zput(&mi)
 
-struct block_info *block_info__new(void);
-struct block_info *block_info__get(struct block_info *bi);
-void   block_info__put(struct block_info *bi);
-
-static inline void __block_info__zput(struct block_info **bi)
-{
-	block_info__put(*bi);
-	*bi = NULL;
-}
-
-#define block_info__zput(bi) __block_info__zput(&bi)
-
 #endif /* __PERF_SYMBOL */

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [tip: perf/core] perf diff: Don't use hack to skip column length calculation
  2019-11-07  7:47 ` [PATCH v7 1/7] perf diff: Don't use hack to skip column length calculation Jin Yao
@ 2019-11-12 11:17   ` tip-bot2 for Jin Yao
  0 siblings, 0 replies; 22+ messages in thread
From: tip-bot2 for Jin Yao @ 2019-11-12 11:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Jin Yao, Jiri Olsa, Alexander Shishkin, Andi Kleen, Jin Yao,
	Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar,
	Borislav Petkov, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     0bdf181fe0e5b6f6d5764ff482d7ae4707f8986b
Gitweb:        https://git.kernel.org/tip/0bdf181fe0e5b6f6d5764ff482d7ae4707f8986b
Author:        Jin Yao <yao.jin@linux.intel.com>
AuthorDate:    Thu, 07 Nov 2019 15:47:13 +08:00
Committer:     Arnaldo Carvalho de Melo <acme@redhat.com>
CommitterDate: Thu, 07 Nov 2019 09:08:03 -03:00

perf diff: Don't use hack to skip column length calculation

Previously we use a nasty hack to skip the hists__calc_col_len for block
since this function is not very suitable for block column length
calculation.

This patch removes the hack code and add a check at the entry of
hists__calc_col_len to skip for block case.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191107074719.26139-2-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-diff.c | 11 ++---------
 tools/perf/util/hist.c    |  2 ++
 2 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 5281629..faf99a8 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -765,13 +765,6 @@ static void block_hists_match(struct hists *hists_base,
 	}
 }
 
-static int filter_cb(struct hist_entry *he, void *arg __maybe_unused)
-{
-	/* Skip the calculation of column length in output_resort */
-	he->filtered = true;
-	return 0;
-}
-
 static void hists__precompute(struct hists *hists)
 {
 	struct rb_root_cached *root;
@@ -820,8 +813,8 @@ static void hists__precompute(struct hists *hists)
 				if (bh->valid && pair_bh->valid) {
 					block_hists_match(&bh->block_hists,
 							  &pair_bh->block_hists);
-					hists__output_resort_cb(&pair_bh->block_hists,
-								NULL, filter_cb);
+					hists__output_resort(&pair_bh->block_hists,
+							     NULL);
 				}
 				break;
 			default:
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 679a1d7..daa6eef 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -80,6 +80,8 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
 	int symlen;
 	u16 len;
 
+	if (h->block_info)
+		return;
 	/*
 	 * +4 accounts for '[x] ' priv level info
 	 * +2 accounts for 0x prefix on raw addresses

^ permalink raw reply related	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2019-11-12 11:22 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-07  7:47 [PATCH v7 0/7] perf report: Support sorting all blocks by cycles Jin Yao
2019-11-07  7:47 ` [PATCH v7 1/7] perf diff: Don't use hack to skip column length calculation Jin Yao
2019-11-12 11:17   ` [tip: perf/core] " tip-bot2 for Jin Yao
2019-11-07  7:47 ` [PATCH v7 2/7] perf util: Cleanup and refactor block info functions Jin Yao
2019-11-12 11:17   ` [tip: perf/core] perf block: " tip-bot2 for Jin Yao
2019-11-07  7:47 ` [PATCH v7 3/7] perf util: Count the total cycles of all samples Jin Yao
2019-11-12 11:17   ` [tip: perf/core] perf hist: " tip-bot2 for Jin Yao
2019-11-07  7:47 ` [PATCH v7 4/7] perf util: Support block formats with compare/sort/display Jin Yao
2019-11-07 12:45   ` Arnaldo Carvalho de Melo
2019-11-07 13:05     ` Jin, Yao
2019-11-07 13:17       ` Arnaldo Carvalho de Melo
2019-11-07 13:21         ` Jin, Yao
2019-11-12 11:17   ` [tip: perf/core] perf hist: " tip-bot2 for Jin Yao
2019-11-07  7:47 ` [PATCH v7 5/7] perf report: Sort by sampled cycles percent per block for stdio Jin Yao
2019-11-12 11:17   ` [tip: perf/core] " tip-bot2 for Jin Yao
2019-11-07  7:47 ` [PATCH v7 6/7] perf report: Support --percent-limit for --total-cycles Jin Yao
2019-11-12 11:17   ` [tip: perf/core] " tip-bot2 for Jin Yao
2019-11-07  7:47 ` [PATCH v7 7/7] perf report: Sort by sampled cycles percent per block for tui Jin Yao
2019-11-07 12:38   ` Arnaldo Carvalho de Melo
2019-11-12 11:17   ` [tip: perf/core] " tip-bot2 for Jin Yao
2019-11-07  9:28 ` [PATCH v7 0/7] perf report: Support sorting all blocks by cycles Jiri Olsa
2019-11-07 13:09   ` Jin, Yao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).