linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 0/7] perf report: Support sorting all blocks by cycles
@ 2019-10-30  6:04 Jin Yao
  2019-10-30  6:04 ` [PATCH v5 1/7] perf diff: Don't use hack to skip column length calculation Jin Yao
                   ` (6 more replies)
  0 siblings, 7 replies; 15+ messages in thread
From: Jin Yao @ 2019-10-30  6:04 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

It would be useful to support sorting for all blocks by the
sampled cycles percent per block. This is useful to concentrate
on the globally hottest blocks.

This patch series implements a new option "--total-cycles" which
sorts all blocks by 'Sampled Cycles%'. The 'Sampled Cycles%' is
block sampled cycles aggregation / total sampled cycles

For example,

perf record -b ./div
perf report --total-cycles --stdio

 # To display the perf.data header info, please use --header/--header-only options.
 #
 #
 # Total Lost Samples: 0
 #
 # Samples: 2M of event 'cycles'
 # Event count (approx.): 2753248
 #
 # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                              [Program Block Range]         Shared Object
 # ...............  ..............  ...........  ..........  .................................................................  ....................
 #
            26.04%            2.8M        0.40%          18                                             [div.c:42 -> div.c:39]                   div
            15.17%            1.2M        0.16%           7                                 [random_r.c:357 -> random_r.c:380]          libc-2.27.so
             5.11%          402.0K        0.04%           2                                             [div.c:27 -> div.c:28]                   div
             4.87%          381.6K        0.04%           2                                     [random.c:288 -> random.c:291]          libc-2.27.so
             4.53%          381.0K        0.04%           2                                             [div.c:40 -> div.c:40]                   div
             3.85%          300.9K        0.02%           1                                             [div.c:22 -> div.c:25]                   div
             3.08%          241.1K        0.02%           1                                           [rand.c:26 -> rand.c:27]          libc-2.27.so
             3.06%          240.0K        0.02%           1                                     [random.c:291 -> random.c:291]          libc-2.27.so
             2.78%          215.7K        0.02%           1                                     [random.c:298 -> random.c:298]          libc-2.27.so
             2.52%          198.3K        0.02%           1                                     [random.c:293 -> random.c:293]          libc-2.27.so
             2.36%          184.8K        0.02%           1                                           [rand.c:28 -> rand.c:28]          libc-2.27.so
             2.33%          180.5K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
             2.28%          176.7K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
             2.20%          168.8K        0.02%           1                                         [rand@plt+0 -> rand@plt+0]                   div
             1.98%          158.2K        0.02%           1                                 [random_r.c:388 -> random_r.c:388]          libc-2.27.so
             1.57%          123.3K        0.02%           1                                             [div.c:42 -> div.c:44]                   div
             1.44%          116.0K        0.42%          19                                 [random_r.c:357 -> random_r.c:394]          libc-2.27.so
 ......

This patch series supports both stdio and tui. And also with the supporting
of --percent-limit.

 v5:
 ---
 1. Move all block functions to block-info.c

 2. Move the code of setting ms(map+sym) in block hist_entry to
    patch 'perf util: Support block formats with compare/sort/display'.
    Because this info is needed for reporting the block range
    (i.e. source line)

 3. Fix a crash issue when tui mode is enabled.

 Impacted patches:
 -----------------
  perf util: Support block formats with compare/sort/display
  perf report: Sort by sampled cycles percent per block for stdio
  perf report: Support --percent-limit for --total-cycles
  perf report: Sort by sampled cycles percent per block for tui 

 v4:
 ---
 1. Move the block collection out of block printing.

 2. Use new option '--total-cycles' to replace '-s total_cycles'

 3. Move code for skipping column length calculation to patch:
    'perf diff: Don't use hack to skip column length calculation'

 4. Some minor updates and cleanup.

 v3:
 ---
 1. Move common block info functions to block-info.h/block-info.c

 2. Remove nasty hack for skipping calculation of column length.

 3. Some minor cleanup.

 v2:
 ---
 Rebase to perf/core branch

Jin Yao (7):
  perf diff: Don't use hack to skip column length calculation
  perf util: Cleanup and refactor block info functions
  perf util: Count the total cycles of all samples
  perf util: Support block formats with compare/sort/display
  perf report: Sort by sampled cycles percent per block for stdio
  perf report: Support --percent-limit for --total-cycles
  perf report: Sort by sampled cycles percent per block for tui

 tools/perf/Documentation/perf-report.txt |  11 +
 tools/perf/builtin-annotate.c            |   2 +-
 tools/perf/builtin-diff.c                | 121 +-----
 tools/perf/builtin-report.c              |  74 +++-
 tools/perf/builtin-top.c                 |   3 +-
 tools/perf/ui/browsers/hists.c           |  62 ++-
 tools/perf/ui/browsers/hists.h           |   2 +
 tools/perf/ui/stdio/hist.c               |  29 +-
 tools/perf/util/Build                    |   1 +
 tools/perf/util/block-info.c             | 456 +++++++++++++++++++++++
 tools/perf/util/block-info.h             |  74 ++++
 tools/perf/util/hist.c                   |  13 +-
 tools/perf/util/hist.h                   |  15 +-
 tools/perf/util/symbol.c                 |  22 --
 tools/perf/util/symbol.h                 |  24 --
 tools/perf/util/symbol_conf.h            |   1 +
 16 files changed, 749 insertions(+), 161 deletions(-)
 create mode 100644 tools/perf/util/block-info.c
 create mode 100644 tools/perf/util/block-info.h

-- 
2.17.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v5 1/7] perf diff: Don't use hack to skip column length calculation
  2019-10-30  6:04 [PATCH v5 0/7] perf report: Support sorting all blocks by cycles Jin Yao
@ 2019-10-30  6:04 ` Jin Yao
  2019-10-30  6:04 ` [PATCH v5 2/7] perf util: Cleanup and refactor block info functions Jin Yao
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Jin Yao @ 2019-10-30  6:04 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

Previously we use a nasty hack to skip the hists__calc_col_len for
block since this function is not very suitable for block column
length calculation.

This patch removes the hack code and add a check at the entry of
hists__calc_col_len to skip for block case.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-diff.c | 11 ++---------
 tools/perf/util/hist.c    |  2 ++
 2 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 5281629c27b1..faf99a81ad3e 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -765,13 +765,6 @@ static void block_hists_match(struct hists *hists_base,
 	}
 }
 
-static int filter_cb(struct hist_entry *he, void *arg __maybe_unused)
-{
-	/* Skip the calculation of column length in output_resort */
-	he->filtered = true;
-	return 0;
-}
-
 static void hists__precompute(struct hists *hists)
 {
 	struct rb_root_cached *root;
@@ -820,8 +813,8 @@ static void hists__precompute(struct hists *hists)
 				if (bh->valid && pair_bh->valid) {
 					block_hists_match(&bh->block_hists,
 							  &pair_bh->block_hists);
-					hists__output_resort_cb(&pair_bh->block_hists,
-								NULL, filter_cb);
+					hists__output_resort(&pair_bh->block_hists,
+							     NULL);
 				}
 				break;
 			default:
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 679a1d75090c..daa6eef4fde0 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -80,6 +80,8 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
 	int symlen;
 	u16 len;
 
+	if (h->block_info)
+		return;
 	/*
 	 * +4 accounts for '[x] ' priv level info
 	 * +2 accounts for 0x prefix on raw addresses
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v5 2/7] perf util: Cleanup and refactor block info functions
  2019-10-30  6:04 [PATCH v5 0/7] perf report: Support sorting all blocks by cycles Jin Yao
  2019-10-30  6:04 ` [PATCH v5 1/7] perf diff: Don't use hack to skip column length calculation Jin Yao
@ 2019-10-30  6:04 ` Jin Yao
  2019-10-30  6:04 ` [PATCH v5 3/7] perf util: Count the total cycles of all samples Jin Yao
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Jin Yao @ 2019-10-30  6:04 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

We have already implemented some block-info related functions.
Now it's time to do some cleanup, refactoring and move the
functions and structures to new block-info.h/block-info.c.

 v4:
 ---
 Move code for skipping column length calculation to patch:
 'perf diff: Don't use hack to skip column length calculation'

 v3:
 ---
 1. Rename the patch title
 2. Rename from block.h/block.c to block-info.h/block-info.c
 3. Move more common part to block-info, such as
    block_info__process_sym.
 4. Remove the nasty hack for skipping calculation of column
    length

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-diff.c    | 107 +++--------------------------
 tools/perf/util/Build        |   1 +
 tools/perf/util/block-info.c | 129 +++++++++++++++++++++++++++++++++++
 tools/perf/util/block-info.h |  43 ++++++++++++
 tools/perf/util/hist.c       |   1 +
 tools/perf/util/symbol.c     |  22 ------
 tools/perf/util/symbol.h     |  24 -------
 7 files changed, 185 insertions(+), 142 deletions(-)
 create mode 100644 tools/perf/util/block-info.c
 create mode 100644 tools/perf/util/block-info.h

diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index faf99a81ad3e..6728568fe5c4 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -24,6 +24,7 @@
 #include "util/annotate.h"
 #include "util/map.h"
 #include "util/spark.h"
+#include "util/block-info.h"
 #include <linux/err.h>
 #include <linux/zalloc.h>
 #include <subcmd/pager.h>
@@ -98,8 +99,6 @@ static s64 compute_wdiff_w2;
 static const char		*cpu_list;
 static DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
 
-static struct addr_location dummy_al;
-
 enum {
 	COMPUTE_DELTA,
 	COMPUTE_RATIO,
@@ -537,41 +536,6 @@ static void hists__baseline_only(struct hists *hists)
 	}
 }
 
-static int64_t block_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
-			 struct hist_entry *left, struct hist_entry *right)
-{
-	struct block_info *bi_l = left->block_info;
-	struct block_info *bi_r = right->block_info;
-	int cmp;
-
-	if (!bi_l->sym || !bi_r->sym) {
-		if (!bi_l->sym && !bi_r->sym)
-			return 0;
-		else if (!bi_l->sym)
-			return -1;
-		else
-			return 1;
-	}
-
-	if (bi_l->sym == bi_r->sym) {
-		if (bi_l->start == bi_r->start) {
-			if (bi_l->end == bi_r->end)
-				return 0;
-			else
-				return (int64_t)(bi_r->end - bi_l->end);
-		} else
-			return (int64_t)(bi_r->start - bi_l->start);
-	} else {
-		cmp = strcmp(bi_l->sym->name, bi_r->sym->name);
-		return cmp;
-	}
-
-	if (bi_l->sym->start != bi_r->sym->start)
-		return (int64_t)(bi_r->sym->start - bi_l->sym->start);
-
-	return (int64_t)(bi_r->sym->end - bi_l->sym->end);
-}
-
 static int64_t block_cycles_diff_cmp(struct hist_entry *left,
 				     struct hist_entry *right)
 {
@@ -600,67 +564,13 @@ static void init_block_hist(struct block_hist *bh)
 
 	INIT_LIST_HEAD(&bh->block_fmt.list);
 	INIT_LIST_HEAD(&bh->block_fmt.sort_list);
-	bh->block_fmt.cmp = block_cmp;
+	bh->block_fmt.cmp = block_info__cmp;
 	bh->block_fmt.sort = block_sort;
 	perf_hpp_list__register_sort_field(&bh->block_list,
 					   &bh->block_fmt);
 	bh->valid = true;
 }
 
-static void init_block_info(struct block_info *bi, struct symbol *sym,
-			    struct cyc_hist *ch, int offset)
-{
-	bi->sym = sym;
-	bi->start = ch->start;
-	bi->end = offset;
-	bi->cycles = ch->cycles;
-	bi->cycles_aggr = ch->cycles_aggr;
-	bi->num = ch->num;
-	bi->num_aggr = ch->num_aggr;
-
-	memcpy(bi->cycles_spark, ch->cycles_spark,
-	       NUM_SPARKS * sizeof(u64));
-}
-
-static int process_block_per_sym(struct hist_entry *he)
-{
-	struct annotation *notes;
-	struct cyc_hist *ch;
-	struct block_hist *bh;
-
-	if (!he->ms.map || !he->ms.sym)
-		return 0;
-
-	notes = symbol__annotation(he->ms.sym);
-	if (!notes || !notes->src || !notes->src->cycles_hist)
-		return 0;
-
-	bh = container_of(he, struct block_hist, he);
-	init_block_hist(bh);
-
-	ch = notes->src->cycles_hist;
-	for (unsigned int i = 0; i < symbol__size(he->ms.sym); i++) {
-		if (ch[i].num_aggr) {
-			struct block_info *bi;
-			struct hist_entry *he_block;
-
-			bi = block_info__new();
-			if (!bi)
-				return -1;
-
-			init_block_info(bi, he->ms.sym, &ch[i], i);
-			he_block = hists__add_entry_block(&bh->block_hists,
-							  &dummy_al, bi);
-			if (!he_block) {
-				block_info__put(bi);
-				return -1;
-			}
-		}
-	}
-
-	return 0;
-}
-
 static int block_pair_cmp(struct hist_entry *a, struct hist_entry *b)
 {
 	struct block_info *bi_a = a->block_info;
@@ -785,8 +695,11 @@ static void hists__precompute(struct hists *hists)
 		he   = rb_entry(next, struct hist_entry, rb_node_in);
 		next = rb_next(&he->rb_node_in);
 
-		if (compute == COMPUTE_CYCLES)
-			process_block_per_sym(he);
+		if (compute == COMPUTE_CYCLES) {
+			bh = container_of(he, struct block_hist, he);
+			init_block_hist(bh);
+			block_info__process_sym(he, bh, NULL, 0);
+		}
 
 		data__for_each_file_new(i, d) {
 			pair = get_pair_data(he, d);
@@ -805,10 +718,12 @@ static void hists__precompute(struct hists *hists)
 				compute_wdiff(he, pair);
 				break;
 			case COMPUTE_CYCLES:
-				process_block_per_sym(pair);
-				bh = container_of(he, struct block_hist, he);
 				pair_bh = container_of(pair, struct block_hist,
 						       he);
+				init_block_hist(pair_bh);
+				block_info__process_sym(pair, pair_bh, NULL, 0);
+
+				bh = container_of(he, struct block_hist, he);
 
 				if (bh->valid && pair_bh->valid) {
 					block_hists_match(&bh->block_hists,
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 39814b1806a6..b8e05a147b2b 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -1,4 +1,5 @@
 perf-y += annotate.o
+perf-y += block-info.o
 perf-y += block-range.o
 perf-y += build-id.o
 perf-y += cacheline.o
diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
new file mode 100644
index 000000000000..b9954a32b8f4
--- /dev/null
+++ b/tools/perf/util/block-info.c
@@ -0,0 +1,129 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdlib.h>
+#include <string.h>
+#include <linux/zalloc.h>
+#include "block-info.h"
+#include "sort.h"
+#include "annotate.h"
+#include "symbol.h"
+
+struct block_info *block_info__get(struct block_info *bi)
+{
+	if (bi)
+		refcount_inc(&bi->refcnt);
+	return bi;
+}
+
+void block_info__put(struct block_info *bi)
+{
+	if (bi && refcount_dec_and_test(&bi->refcnt))
+		free(bi);
+}
+
+struct block_info *block_info__new(void)
+{
+	struct block_info *bi = zalloc(sizeof(*bi));
+
+	if (bi)
+		refcount_set(&bi->refcnt, 1);
+	return bi;
+}
+
+int64_t block_info__cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+			struct hist_entry *left, struct hist_entry *right)
+{
+	struct block_info *bi_l = left->block_info;
+	struct block_info *bi_r = right->block_info;
+	int cmp;
+
+	if (!bi_l->sym || !bi_r->sym) {
+		if (!bi_l->sym && !bi_r->sym)
+			return 0;
+		else if (!bi_l->sym)
+			return -1;
+		else
+			return 1;
+	}
+
+	if (bi_l->sym == bi_r->sym) {
+		if (bi_l->start == bi_r->start) {
+			if (bi_l->end == bi_r->end)
+				return 0;
+			else
+				return (int64_t)(bi_r->end - bi_l->end);
+		} else
+			return (int64_t)(bi_r->start - bi_l->start);
+	} else {
+		cmp = strcmp(bi_l->sym->name, bi_r->sym->name);
+		return cmp;
+	}
+
+	if (bi_l->sym->start != bi_r->sym->start)
+		return (int64_t)(bi_r->sym->start - bi_l->sym->start);
+
+	return (int64_t)(bi_r->sym->end - bi_l->sym->end);
+}
+
+static void init_block_info(struct block_info *bi, struct symbol *sym,
+			    struct cyc_hist *ch, int offset,
+			    u64 total_cycles)
+{
+	bi->sym = sym;
+	bi->start = ch->start;
+	bi->end = offset;
+	bi->cycles = ch->cycles;
+	bi->cycles_aggr = ch->cycles_aggr;
+	bi->num = ch->num;
+	bi->num_aggr = ch->num_aggr;
+	bi->total_cycles = total_cycles;
+
+	memcpy(bi->cycles_spark, ch->cycles_spark,
+	       NUM_SPARKS * sizeof(u64));
+}
+
+int block_info__process_sym(struct hist_entry *he, struct block_hist *bh,
+			    u64 *block_cycles_aggr, u64 total_cycles)
+{
+	struct annotation *notes;
+	struct cyc_hist *ch;
+	static struct addr_location al;
+	u64 cycles = 0;
+
+	if (!he->ms.map || !he->ms.sym)
+		return 0;
+
+	memset(&al, 0, sizeof(al));
+	al.map = he->ms.map;
+	al.sym = he->ms.sym;
+
+	notes = symbol__annotation(he->ms.sym);
+	if (!notes || !notes->src || !notes->src->cycles_hist)
+		return 0;
+	ch = notes->src->cycles_hist;
+	for (unsigned int i = 0; i < symbol__size(he->ms.sym); i++) {
+		if (ch[i].num_aggr) {
+			struct block_info *bi;
+			struct hist_entry *he_block;
+
+			bi = block_info__new();
+			if (!bi)
+				return -1;
+
+			init_block_info(bi, he->ms.sym, &ch[i], i,
+					total_cycles);
+			cycles += bi->cycles_aggr / bi->num_aggr;
+
+			he_block = hists__add_entry_block(&bh->block_hists,
+							  &al, bi);
+			if (!he_block) {
+				block_info__put(bi);
+				return -1;
+			}
+		}
+	}
+
+	if (block_cycles_aggr)
+		*block_cycles_aggr += cycles;
+
+	return 0;
+}
diff --git a/tools/perf/util/block-info.h b/tools/perf/util/block-info.h
new file mode 100644
index 000000000000..d55dfc2fda6f
--- /dev/null
+++ b/tools/perf/util/block-info.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __PERF_BLOCK_H
+#define __PERF_BLOCK_H
+
+#include <linux/types.h>
+#include <linux/refcount.h>
+#include "util/hist.h"
+#include "util/symbol.h"
+
+struct block_info {
+	struct symbol		*sym;
+	u64			start;
+	u64			end;
+	u64			cycles;
+	u64			cycles_aggr;
+	s64			cycles_spark[NUM_SPARKS];
+	u64			total_cycles;
+	int			num;
+	int			num_aggr;
+	refcount_t		refcnt;
+};
+
+struct block_hist;
+
+struct block_info *block_info__new(void);
+struct block_info *block_info__get(struct block_info *bi);
+void   block_info__put(struct block_info *bi);
+
+static inline void __block_info__zput(struct block_info **bi)
+{
+	block_info__put(*bi);
+	*bi = NULL;
+}
+
+#define block_info__zput(bi) __block_info__zput(&bi)
+
+int64_t block_info__cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+			struct hist_entry *left, struct hist_entry *right);
+
+int block_info__process_sym(struct hist_entry *he, struct block_hist *bh,
+			    u64 *block_cycles_aggr, u64 total_cycles);
+
+#endif /* __PERF_BLOCK_H */
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index daa6eef4fde0..a7fa061987e4 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -18,6 +18,7 @@
 #include "srcline.h"
 #include "symbol.h"
 #include "thread.h"
+#include "block-info.h"
 #include "ui/progress.h"
 #include <errno.h>
 #include <math.h>
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index a8f80e427674..fac8887a6759 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -2371,25 +2371,3 @@ struct mem_info *mem_info__new(void)
 		refcount_set(&mi->refcnt, 1);
 	return mi;
 }
-
-struct block_info *block_info__get(struct block_info *bi)
-{
-	if (bi)
-		refcount_inc(&bi->refcnt);
-	return bi;
-}
-
-void block_info__put(struct block_info *bi)
-{
-	if (bi && refcount_dec_and_test(&bi->refcnt))
-		free(bi);
-}
-
-struct block_info *block_info__new(void)
-{
-	struct block_info *bi = zalloc(sizeof(*bi));
-
-	if (bi)
-		refcount_set(&bi->refcnt, 1);
-	return bi;
-}
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index cc2a89b99d3d..c3bd16d75d5d 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -106,18 +106,6 @@ struct ref_reloc_sym {
 	u64		unrelocated_addr;
 };
 
-struct block_info {
-	struct symbol		*sym;
-	u64			start;
-	u64			end;
-	u64			cycles;
-	u64			cycles_aggr;
-	s64			cycles_spark[NUM_SPARKS];
-	int			num;
-	int			num_aggr;
-	refcount_t		refcnt;
-};
-
 struct addr_location {
 	struct machine *machine;
 	struct thread *thread;
@@ -291,16 +279,4 @@ static inline void __mem_info__zput(struct mem_info **mi)
 
 #define mem_info__zput(mi) __mem_info__zput(&mi)
 
-struct block_info *block_info__new(void);
-struct block_info *block_info__get(struct block_info *bi);
-void   block_info__put(struct block_info *bi);
-
-static inline void __block_info__zput(struct block_info **bi)
-{
-	block_info__put(*bi);
-	*bi = NULL;
-}
-
-#define block_info__zput(bi) __block_info__zput(&bi)
-
 #endif /* __PERF_SYMBOL */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v5 3/7] perf util: Count the total cycles of all samples
  2019-10-30  6:04 [PATCH v5 0/7] perf report: Support sorting all blocks by cycles Jin Yao
  2019-10-30  6:04 ` [PATCH v5 1/7] perf diff: Don't use hack to skip column length calculation Jin Yao
  2019-10-30  6:04 ` [PATCH v5 2/7] perf util: Cleanup and refactor block info functions Jin Yao
@ 2019-10-30  6:04 ` Jin Yao
  2019-10-30  6:04 ` [PATCH v5 4/7] perf util: Support block formats with compare/sort/display Jin Yao
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Jin Yao @ 2019-10-30  6:04 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

We can get the per sample cycles by hist__account_cycles(). It's also
useful to know the total cycles of all samples in order to get the
cycles coverage for a single program block in further. For example,

coverage = per block sampled cycles / total sampled cycles

This patch creates a new argument 'total_cycles' in hist__account_cycles(),
which will be added with the cycles of each sample.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-annotate.c | 2 +-
 tools/perf/builtin-diff.c     | 3 ++-
 tools/perf/builtin-report.c   | 2 +-
 tools/perf/builtin-top.c      | 3 ++-
 tools/perf/util/hist.c        | 6 +++++-
 tools/perf/util/hist.h        | 3 ++-
 6 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 8db8fc9bddef..6ab0cc45b287 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -201,7 +201,7 @@ static int process_branch_callback(struct evsel *evsel,
 	if (a.map != NULL)
 		a.map->dso->hit = 1;
 
-	hist__account_cycles(sample->branch_stack, al, sample, false);
+	hist__account_cycles(sample->branch_stack, al, sample, false, NULL);
 
 	ret = hist_entry_iter__add(&iter, &a, PERF_MAX_STACK_DEPTH, ann);
 	return ret;
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 6728568fe5c4..376dbf10ad64 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -426,7 +426,8 @@ static int diff__process_sample_event(struct perf_tool *tool,
 			goto out_put;
 		}
 
-		hist__account_cycles(sample->branch_stack, &al, sample, false);
+		hist__account_cycles(sample->branch_stack, &al, sample, false,
+				     NULL);
 	}
 
 	/*
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 7accaf8ef689..cdb436d6e11f 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -292,7 +292,7 @@ static int process_sample_event(struct perf_tool *tool,
 
 	if (ui__has_annotation() || rep->symbol_ipc) {
 		hist__account_cycles(sample->branch_stack, &al, sample,
-				     rep->nonany_branch_mode);
+				     rep->nonany_branch_mode, NULL);
 	}
 
 	ret = hist_entry_iter__add(&iter, &al, rep->max_stack, rep);
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index d96f24c8770d..14c52e4d47f6 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -725,7 +725,8 @@ static int hist_iter__top_callback(struct hist_entry_iter *iter,
 		perf_top__record_precise_ip(top, he, iter->sample, evsel, al->addr);
 
 	hist__account_cycles(iter->sample->branch_stack, al, iter->sample,
-		     !(top->record_opts.branch_stack & PERF_SAMPLE_BRANCH_ANY));
+		     !(top->record_opts.branch_stack & PERF_SAMPLE_BRANCH_ANY),
+		     NULL);
 	return 0;
 }
 
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index a7fa061987e4..0e27d6830011 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -2572,7 +2572,8 @@ int hists__unlink(struct hists *hists)
 }
 
 void hist__account_cycles(struct branch_stack *bs, struct addr_location *al,
-			  struct perf_sample *sample, bool nonany_branch_mode)
+			  struct perf_sample *sample, bool nonany_branch_mode,
+			  u64 *total_cycles)
 {
 	struct branch_info *bi;
 
@@ -2599,6 +2600,9 @@ void hist__account_cycles(struct branch_stack *bs, struct addr_location *al,
 					nonany_branch_mode ? NULL : prev,
 					bi[i].flags.cycles);
 				prev = &bi[i].to;
+
+				if (total_cycles)
+					*total_cycles += bi[i].flags.cycles;
 			}
 			free(bi);
 		}
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 6a186b668303..4d87c7b4c1b2 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -527,7 +527,8 @@ unsigned int hists__sort_list_width(struct hists *hists);
 unsigned int hists__overhead_width(struct hists *hists);
 
 void hist__account_cycles(struct branch_stack *bs, struct addr_location *al,
-			  struct perf_sample *sample, bool nonany_branch_mode);
+			  struct perf_sample *sample, bool nonany_branch_mode,
+			  u64 *total_cycles);
 
 struct option;
 int parse_filter_percentage(const struct option *opt, const char *arg, int unset);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v5 4/7] perf util: Support block formats with compare/sort/display
  2019-10-30  6:04 [PATCH v5 0/7] perf report: Support sorting all blocks by cycles Jin Yao
                   ` (2 preceding siblings ...)
  2019-10-30  6:04 ` [PATCH v5 3/7] perf util: Count the total cycles of all samples Jin Yao
@ 2019-10-30  6:04 ` Jin Yao
  2019-10-30  6:04 ` [PATCH v5 5/7] perf report: Sort by sampled cycles percent per block for stdio Jin Yao
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Jin Yao @ 2019-10-30  6:04 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

This patch provides helper routines to support new
columns for block info output.

The new columns are:

Sampled Cycles%
Sampled Cycles
Avg Cycles%
Avg Cycles
[Program Block Range]
Shared Object

 v5:
 ---
 1. Move more block related functions from builtin-report.c to
    block-info.c

 2. Set ms (map+sym) in block hist_entry. Because this info
    is needed for reporting the block range (i.e. source line)

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/util/block-info.c | 317 +++++++++++++++++++++++++++++++++++
 tools/perf/util/block-info.h |  33 +++-
 tools/perf/util/hist.c       |   4 +
 3 files changed, 352 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
index b9954a32b8f4..1242c3a33509 100644
--- a/tools/perf/util/block-info.c
+++ b/tools/perf/util/block-info.c
@@ -6,6 +6,40 @@
 #include "sort.h"
 #include "annotate.h"
 #include "symbol.h"
+#include "dso.h"
+#include "map.h"
+#include "srcline.h"
+#include "evlist.h"
+
+static struct block_header_column{
+	const char *name;
+	int width;
+} block_columns[PERF_HPP_REPORT__BLOCK_MAX_INDEX] = {
+	[PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT] = {
+		.name = "Sampled Cycles%",
+		.width = 15,
+	},
+	[PERF_HPP_REPORT__BLOCK_LBR_CYCLES] = {
+		.name = "Sampled Cycles",
+		.width = 14,
+	},
+	[PERF_HPP_REPORT__BLOCK_CYCLES_PCT] = {
+		.name = "Avg Cycles%",
+		.width = 11,
+	},
+	[PERF_HPP_REPORT__BLOCK_AVG_CYCLES] = {
+		.name = "Avg Cycles",
+		.width = 10,
+	},
+	[PERF_HPP_REPORT__BLOCK_RANGE] = {
+		.name = "[Program Block Range]",
+		.width = 70,
+	},
+	[PERF_HPP_REPORT__BLOCK_DSO] = {
+		.name = "Shared Object",
+		.width = 20,
+	}
+};
 
 struct block_info *block_info__get(struct block_info *bi)
 {
@@ -127,3 +161,286 @@ int block_info__process_sym(struct hist_entry *he, struct block_hist *bh,
 
 	return 0;
 }
+
+static int block_column_header(struct perf_hpp_fmt *fmt,
+			       struct perf_hpp *hpp,
+			       struct hists *hists __maybe_unused,
+			       int line __maybe_unused,
+			       int *span __maybe_unused)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
+			 block_fmt->header);
+}
+
+static int block_column_width(struct perf_hpp_fmt *fmt,
+			      struct perf_hpp *hpp __maybe_unused,
+			      struct hists *hists __maybe_unused)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+
+	return block_fmt->width;
+}
+
+static int block_total_cycles_pct_entry(struct perf_hpp_fmt *fmt,
+					struct perf_hpp *hpp,
+					struct hist_entry *he)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+	struct block_info *bi = he->block_info;
+	double ratio = 0.0;
+	char buf[16];
+
+	if (block_fmt->total_cycles)
+		ratio = (double)bi->cycles / (double)block_fmt->total_cycles;
+
+	sprintf(buf, "%.2f%%", 100.0 * ratio);
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width, buf);
+}
+
+static int64_t block_total_cycles_pct_sort(struct perf_hpp_fmt *fmt,
+					   struct hist_entry *left,
+					   struct hist_entry *right)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+	struct block_info *bi_l = left->block_info;
+	struct block_info *bi_r = right->block_info;
+	double l, r;
+
+	if (block_fmt->total_cycles) {
+		l = ((double)bi_l->cycles /
+			(double)block_fmt->total_cycles) * 1000.0;
+		r = ((double)bi_r->cycles /
+			(double)block_fmt->total_cycles) * 1000.0;
+		return (int64_t)l - (int64_t)r;
+	}
+
+	return 0;
+}
+
+static void cycles_string(u64 cycles, char *buf, int size)
+{
+	if (cycles >= 1000000)
+		scnprintf(buf, size, "%.1fM", (double)cycles / 1000000.0);
+	else if (cycles >= 1000)
+		scnprintf(buf, size, "%.1fK", (double)cycles / 1000.0);
+	else
+		scnprintf(buf, size, "%1d", cycles);
+}
+
+static int block_cycles_lbr_entry(struct perf_hpp_fmt *fmt,
+				  struct perf_hpp *hpp, struct hist_entry *he)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+	struct block_info *bi = he->block_info;
+	char cycles_buf[16];
+
+	cycles_string(bi->cycles_aggr, cycles_buf, sizeof(cycles_buf));
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
+			 cycles_buf);
+}
+
+static int block_cycles_pct_entry(struct perf_hpp_fmt *fmt,
+				  struct perf_hpp *hpp, struct hist_entry *he)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+	struct block_info *bi = he->block_info;
+	double ratio = 0.0;
+	u64 avg;
+	char buf[16];
+
+	if (block_fmt->block_cycles && bi->num_aggr) {
+		avg = bi->cycles_aggr / bi->num_aggr;
+		ratio = (double)avg / (double)block_fmt->block_cycles;
+	}
+
+	sprintf(buf, "%.2f%%", 100.0 * ratio);
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width, buf);
+}
+
+static int block_avg_cycles_entry(struct perf_hpp_fmt *fmt,
+				  struct perf_hpp *hpp,
+				  struct hist_entry *he)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+	struct block_info *bi = he->block_info;
+	char cycles_buf[16];
+
+	cycles_string(bi->cycles_aggr / bi->num_aggr, cycles_buf,
+		      sizeof(cycles_buf));
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
+			 cycles_buf);
+}
+
+static int block_range_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+			     struct hist_entry *he)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+	struct block_info *bi = he->block_info;
+	char buf[128];
+	char *start_line, *end_line;
+
+	symbol_conf.disable_add2line_warn = true;
+
+	start_line = map__srcline(he->ms.map, bi->sym->start + bi->start,
+				  he->ms.sym);
+
+	end_line = map__srcline(he->ms.map, bi->sym->start + bi->end,
+				he->ms.sym);
+
+	if ((start_line != SRCLINE_UNKNOWN) && (end_line != SRCLINE_UNKNOWN)) {
+		scnprintf(buf, sizeof(buf), "[%s -> %s]",
+			  start_line, end_line);
+	} else {
+		scnprintf(buf, sizeof(buf), "[%7lx -> %7lx]",
+			  bi->start, bi->end);
+	}
+
+	free_srcline(start_line);
+	free_srcline(end_line);
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width, buf);
+}
+
+static int block_dso_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+			   struct hist_entry *he)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+	struct map *map = he->ms.map;
+
+	if (map && map->dso) {
+		return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
+				 map->dso->short_name);
+	}
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
+			 "[unknown]");
+}
+
+static void init_block_header(struct block_fmt *block_fmt)
+{
+	struct perf_hpp_fmt *fmt = &block_fmt->fmt;
+
+	BUG_ON(block_fmt->idx >= PERF_HPP_REPORT__BLOCK_MAX_INDEX);
+
+	block_fmt->header = block_columns[block_fmt->idx].name;
+	block_fmt->width = block_columns[block_fmt->idx].width;
+
+	fmt->header = block_column_header;
+	fmt->width = block_column_width;
+}
+
+static void hpp_register(struct block_fmt *block_fmt, int idx,
+			 struct perf_hpp_list *hpp_list)
+{
+	struct perf_hpp_fmt *fmt = &block_fmt->fmt;
+
+	block_fmt->idx = idx;
+	INIT_LIST_HEAD(&fmt->list);
+	INIT_LIST_HEAD(&fmt->sort_list);
+
+	switch (idx) {
+	case PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT:
+		fmt->entry = block_total_cycles_pct_entry;
+		fmt->cmp = block_info__cmp;
+		fmt->sort = block_total_cycles_pct_sort;
+		break;
+	case PERF_HPP_REPORT__BLOCK_LBR_CYCLES:
+		fmt->entry = block_cycles_lbr_entry;
+		break;
+	case PERF_HPP_REPORT__BLOCK_CYCLES_PCT:
+		fmt->entry = block_cycles_pct_entry;
+		break;
+	case PERF_HPP_REPORT__BLOCK_AVG_CYCLES:
+		fmt->entry = block_avg_cycles_entry;
+		break;
+	case PERF_HPP_REPORT__BLOCK_RANGE:
+		fmt->entry = block_range_entry;
+		break;
+	case PERF_HPP_REPORT__BLOCK_DSO:
+		fmt->entry = block_dso_entry;
+		break;
+	default:
+		return;
+	}
+
+	init_block_header(block_fmt);
+	perf_hpp_list__column_register(hpp_list, fmt);
+}
+
+static void register_block_columns(struct perf_hpp_list *hpp_list,
+				   struct block_fmt *block_fmts)
+{
+	for (int i = 0; i < PERF_HPP_REPORT__BLOCK_MAX_INDEX; i++)
+		hpp_register(&block_fmts[i], i, hpp_list);
+}
+
+static void init_block_hist(struct block_hist *bh, struct block_fmt *block_fmts)
+{
+	__hists__init(&bh->block_hists, &bh->block_list);
+	perf_hpp_list__init(&bh->block_list);
+	bh->block_list.nr_header_lines = 1;
+
+	register_block_columns(&bh->block_list, block_fmts);
+
+	perf_hpp_list__register_sort_field(&bh->block_list,
+		&block_fmts[PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT].fmt);
+}
+
+static inline void set_fmt(struct block_fmt *block_fmt,
+			   u64 total_cycles, u64 block_cycles)
+{
+	block_fmt->total_cycles = total_cycles;
+	block_fmt->block_cycles = block_cycles;
+}
+
+static void process_block_report(struct hists *hists,
+				 struct block_report *block_report,
+				 u64 total_cycles)
+{
+	struct rb_node *next = rb_first_cached(&hists->entries);
+	struct block_hist *bh = &block_report->hist;
+	struct hist_entry *he;
+
+	init_block_hist(bh, block_report->fmts);
+
+	while (next) {
+		he = rb_entry(next, struct hist_entry, rb_node);
+		block_info__process_sym(he, bh, &block_report->cycles,
+					total_cycles);
+		next = rb_next(&he->rb_node);
+	}
+
+	for (int i = 0; i < PERF_HPP_REPORT__BLOCK_MAX_INDEX; i++) {
+		block_report->fmts[i].total_cycles = total_cycles;
+		block_report->fmts[i].block_cycles = block_report->cycles;
+	}
+
+	hists__output_resort(&bh->block_hists, NULL);
+}
+
+struct block_report *block_info__create_report(struct evlist *evlist,
+					       u64 total_cycles)
+{
+	struct block_report *block_reports;
+	int nr_hists = evlist->core.nr_entries, i = 0;
+	struct evsel *pos;
+
+	block_reports = calloc(nr_hists, sizeof(struct block_report));
+	if (!block_reports)
+		return NULL;
+
+	evlist__for_each_entry(evlist, pos) {
+		struct hists *hists = evsel__hists(pos);
+
+		process_block_report(hists, &block_reports[i], total_cycles);
+		i++;
+	}
+
+	return block_reports;
+}
diff --git a/tools/perf/util/block-info.h b/tools/perf/util/block-info.h
index d55dfc2fda6f..b5266588d476 100644
--- a/tools/perf/util/block-info.h
+++ b/tools/perf/util/block-info.h
@@ -4,8 +4,9 @@
 
 #include <linux/types.h>
 #include <linux/refcount.h>
-#include "util/hist.h"
-#include "util/symbol.h"
+#include "hist.h"
+#include "symbol.h"
+#include "sort.h"
 
 struct block_info {
 	struct symbol		*sym;
@@ -20,6 +21,31 @@ struct block_info {
 	refcount_t		refcnt;
 };
 
+struct block_fmt {
+	struct perf_hpp_fmt	fmt;
+	int			idx;
+	int			width;
+	const char		*header;
+	u64			total_cycles;
+	u64			block_cycles;
+};
+
+enum {
+	PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT,
+	PERF_HPP_REPORT__BLOCK_LBR_CYCLES,
+	PERF_HPP_REPORT__BLOCK_CYCLES_PCT,
+	PERF_HPP_REPORT__BLOCK_AVG_CYCLES,
+	PERF_HPP_REPORT__BLOCK_RANGE,
+	PERF_HPP_REPORT__BLOCK_DSO,
+	PERF_HPP_REPORT__BLOCK_MAX_INDEX
+};
+
+struct block_report {
+	struct block_hist	hist;
+	u64			cycles;
+	struct block_fmt	fmts[PERF_HPP_REPORT__BLOCK_MAX_INDEX];
+};
+
 struct block_hist;
 
 struct block_info *block_info__new(void);
@@ -40,4 +66,7 @@ int64_t block_info__cmp(struct perf_hpp_fmt *fmt __maybe_unused,
 int block_info__process_sym(struct hist_entry *he, struct block_hist *bh,
 			    u64 *block_cycles_aggr, u64 total_cycles);
 
+struct block_report *block_info__create_report(struct evlist *evlist,
+					       u64 total_cycles);
+
 #endif /* __PERF_BLOCK_H */
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 0e27d6830011..7cf137b0451b 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -758,6 +758,10 @@ struct hist_entry *hists__add_entry_block(struct hists *hists,
 	struct hist_entry entry = {
 		.block_info = block_info,
 		.hists = hists,
+		.ms = {
+			.map = al->map,
+			.sym = al->sym,
+		},
 	}, *he = hists__findnew_entry(hists, &entry, al, false);
 
 	return he;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v5 5/7] perf report: Sort by sampled cycles percent per block for stdio
  2019-10-30  6:04 [PATCH v5 0/7] perf report: Support sorting all blocks by cycles Jin Yao
                   ` (3 preceding siblings ...)
  2019-10-30  6:04 ` [PATCH v5 4/7] perf util: Support block formats with compare/sort/display Jin Yao
@ 2019-10-30  6:04 ` Jin Yao
  2019-11-04 14:04   ` Jiri Olsa
  2019-10-30  6:04 ` [PATCH v5 6/7] perf report: Support --percent-limit for --total-cycles Jin Yao
  2019-10-30  6:04 ` [PATCH v5 7/7] perf report: Sort by sampled cycles percent per block for tui Jin Yao
  6 siblings, 1 reply; 15+ messages in thread
From: Jin Yao @ 2019-10-30  6:04 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

It would be useful to support sorting for all blocks by the
sampled cycles percent per block. This is useful to concentrate
on the globally hottest blocks.

This patch implements a new option "--total-cycles" which sorts
all blocks by 'Sampled Cycles%'. The 'Sampled Cycles%' is the
percent:

 percent = block sampled cycles aggregation / total sampled cycles

Note that, this patch only supports "--stdio" mode.

For example,

perf record -b ./div
perf report --total-cycles --stdio

 # To display the perf.data header info, please use --header/--header-only options.
 #
 #
 # Total Lost Samples: 0
 #
 # Samples: 2M of event 'cycles'
 # Event count (approx.): 2753248
 #
 # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                              [Program Block Range]         Shared Object
 # ...............  ..............  ...........  ..........  .................................................................  ....................
 #
            26.04%            2.8M        0.40%          18                                             [div.c:42 -> div.c:39]                   div
            15.17%            1.2M        0.16%           7                                 [random_r.c:357 -> random_r.c:380]          libc-2.27.so
             5.11%          402.0K        0.04%           2                                             [div.c:27 -> div.c:28]                   div
             4.87%          381.6K        0.04%           2                                     [random.c:288 -> random.c:291]          libc-2.27.so
             4.53%          381.0K        0.04%           2                                             [div.c:40 -> div.c:40]                   div
             3.85%          300.9K        0.02%           1                                             [div.c:22 -> div.c:25]                   div
             3.08%          241.1K        0.02%           1                                           [rand.c:26 -> rand.c:27]          libc-2.27.so
             3.06%          240.0K        0.02%           1                                     [random.c:291 -> random.c:291]          libc-2.27.so
             2.78%          215.7K        0.02%           1                                     [random.c:298 -> random.c:298]          libc-2.27.so
             2.52%          198.3K        0.02%           1                                     [random.c:293 -> random.c:293]          libc-2.27.so
             2.36%          184.8K        0.02%           1                                           [rand.c:28 -> rand.c:28]          libc-2.27.so
             2.33%          180.5K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
             2.28%          176.7K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
             2.20%          168.8K        0.02%           1                                         [rand@plt+0 -> rand@plt+0]                   div
             1.98%          158.2K        0.02%           1                                 [random_r.c:388 -> random_r.c:388]          libc-2.27.so
             1.57%          123.3K        0.02%           1                                             [div.c:42 -> div.c:44]                   div
             1.44%          116.0K        0.42%          19                                 [random_r.c:357 -> random_r.c:394]          libc-2.27.so
             0.25%          182.5K        0.02%           1                                 [random_r.c:388 -> random_r.c:391]          libc-2.27.so
             0.00%              48        1.07%          48                         [x86_pmu_enable+284 -> x86_pmu_enable+298]     [kernel.kallsyms]
             0.00%              74        1.64%          74                              [vm_mmap_pgoff+0 -> vm_mmap_pgoff+92]     [kernel.kallsyms]
             0.00%              73        1.62%          73                                          [vm_mmap+0 -> vm_mmap+48]     [kernel.kallsyms]
             0.00%              63        0.69%          31                                        [up_write+0 -> up_write+34]     [kernel.kallsyms]
             0.00%              13        0.29%          13                       [setup_arg_pages+396 -> setup_arg_pages+413]     [kernel.kallsyms]
             0.00%               3        0.07%           3                       [setup_arg_pages+418 -> setup_arg_pages+450]     [kernel.kallsyms]
             0.00%             616        6.84%         308                    [security_mmap_file+0 -> security_mmap_file+72]     [kernel.kallsyms]
             0.00%              23        0.51%          23                   [security_mmap_file+77 -> security_mmap_file+87]     [kernel.kallsyms]
             0.00%               4        0.02%           1                                   [sched_clock+0 -> sched_clock+4]     [kernel.kallsyms]
             0.00%               4        0.02%           1                                  [sched_clock+9 -> sched_clock+12]     [kernel.kallsyms]
             0.00%               1        0.02%           1                                 [rcu_nmi_exit+0 -> rcu_nmi_exit+9]     [kernel.kallsyms]

 v5:
 ---
 1. Move all block functions to block-info.c

 2. Move the code of setting ms in block hist_entry to
    other patch.

 v4:
 ---
 1. Use new option '--total-cycles' to replace
    '-s total_cycles' in v3.

 2. Move block info collection out of block info
    printing.

 v3:
 ---
 1. Use common function block_info__process_sym to
    process the blocks per symbol.

 2. Remove the nasty hack for skipping calculation
    of column length

 3. Some minor cleanup

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/Documentation/perf-report.txt | 11 +++++
 tools/perf/builtin-report.c              | 54 ++++++++++++++++++++++--
 tools/perf/ui/stdio/hist.c               | 22 ++++++++++
 tools/perf/util/symbol_conf.h            |  1 +
 4 files changed, 85 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 7315f155803f..8dbe2119686a 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -525,6 +525,17 @@ include::itrace.txt[]
 	Configure time quantum for time sort key. Default 100ms.
 	Accepts s, us, ms, ns units.
 
+--total-cycles::
+	When --total-cycles is specified, it supports sorting for all blocks by
+	'Sampled Cycles%'. This is useful to concentrate on the globally hottest
+	blocks. In output, there are some new columns:
+
+	'Sampled Cycles%' - block sampled cycles aggregation / total sampled cycles
+	'Sampled Cycles'  - block sampled cycles aggregation
+	'Avg Cycles%'     - block average sampled cycles / sum of total block average
+			    sampled cycles
+	'Avg Cycles'      - block average sampled cycles
+
 include::callchain-overhead-calculation.txt[]
 
 SEE ALSO
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index cdb436d6e11f..0c53371c286a 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -51,6 +51,7 @@
 #include "util/util.h" // perf_tip()
 #include "ui/ui.h"
 #include "ui/progress.h"
+#include "util/block-info.h"
 
 #include <dlfcn.h>
 #include <errno.h>
@@ -96,10 +97,13 @@ struct report {
 	float			min_percent;
 	u64			nr_entries;
 	u64			queue_size;
+	u64			total_cycles;
 	int			socket_filter;
 	DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
 	struct branch_type_stat	brtype_stat;
 	bool			symbol_ipc;
+	bool			total_cycles_mode;
+	struct block_report	*block_reports;
 };
 
 static int report__config(const char *var, const char *value, void *cb)
@@ -290,9 +294,10 @@ static int process_sample_event(struct perf_tool *tool,
 	if (al.map != NULL)
 		al.map->dso->hit = 1;
 
-	if (ui__has_annotation() || rep->symbol_ipc) {
+	if (ui__has_annotation() || rep->symbol_ipc || rep->total_cycles_mode) {
 		hist__account_cycles(sample->branch_stack, &al, sample,
-				     rep->nonany_branch_mode, NULL);
+				     rep->nonany_branch_mode,
+				     &rep->total_cycles);
 	}
 
 	ret = hist_entry_iter__add(&iter, &al, rep->max_stack, rep);
@@ -480,11 +485,21 @@ static size_t hists__fprintf_nr_sample_events(struct hists *hists, struct report
 	return ret + fprintf(fp, "\n#\n");
 }
 
+static int hists__fprintf_all_blocks(struct block_hist *bh)
+{
+	symbol_conf.report_individual_block = true;
+	hists__fprintf(&bh->block_hists, true, 0, 0, 0,
+		       stdout, true);
+	hists__delete_entries(&bh->block_hists);
+	return 0;
+}
+
 static int perf_evlist__tty_browse_hists(struct evlist *evlist,
 					 struct report *rep,
 					 const char *help)
 {
 	struct evsel *pos;
+	int i = 0;
 
 	if (!quiet) {
 		fprintf(stdout, "#\n# Total Lost Samples: %" PRIu64 "\n#\n",
@@ -494,12 +509,20 @@ static int perf_evlist__tty_browse_hists(struct evlist *evlist,
 	evlist__for_each_entry(evlist, pos) {
 		struct hists *hists = evsel__hists(pos);
 		const char *evname = perf_evsel__name(pos);
+		struct block_hist *block_hist;
 
 		if (symbol_conf.event_group &&
 		    !perf_evsel__is_group_leader(pos))
 			continue;
 
 		hists__fprintf_nr_sample_events(hists, rep, evname, stdout);
+
+		if (rep->total_cycles_mode) {
+			block_hist = &rep->block_reports[i++].hist;
+			hists__fprintf_all_blocks(block_hist);
+			continue;
+		}
+
 		hists__fprintf(hists, !quiet, 0, 0, rep->min_percent, stdout,
 			       !(symbol_conf.use_callchain ||
 			         symbol_conf.show_branchflag_count));
@@ -927,6 +950,13 @@ static int __cmd_report(struct report *rep)
 
 	report__output_resort(rep);
 
+	if (rep->total_cycles_mode) {
+		rep->block_reports = block_info__create_report(session->evlist,
+							       rep->total_cycles);
+		if (!rep->block_reports)
+			return -1;
+	}
+
 	return report__browse_hists(rep);
 }
 
@@ -1211,6 +1241,8 @@ int cmd_report(int argc, const char **argv)
 		     "Set time quantum for time sort key (default 100ms)",
 		     parse_time_quantum),
 	OPTS_EVSWITCH(&report.evswitch),
+	OPT_BOOLEAN(0, "total-cycles", &report.total_cycles_mode,
+		    "Sort all blocks by 'Sampled Cycles%'"),
 	OPT_END()
 	};
 	struct perf_data data = {
@@ -1373,6 +1405,17 @@ int cmd_report(int argc, const char **argv)
 		goto error;
 	}
 
+	if (report.total_cycles_mode) {
+		if (sort__mode != SORT_MODE__BRANCH)
+			report.total_cycles_mode = false;
+		else if (!report.use_stdio) {
+			pr_err("Error: --total-cycles can be only used together with --stdio\n");
+			goto error;
+		} else {
+			sort_order = "sym";
+		}
+	}
+
 	if (strcmp(input_name, "-") != 0)
 		setup_browser(true);
 	else
@@ -1423,7 +1466,8 @@ int cmd_report(int argc, const char **argv)
 	 * so don't allocate extra space that won't be used in the stdio
 	 * implementation.
 	 */
-	if (ui__has_annotation() || report.symbol_ipc) {
+	if (ui__has_annotation() || report.symbol_ipc ||
+	    report.total_cycles_mode) {
 		ret = symbol__annotation_init();
 		if (ret < 0)
 			goto error;
@@ -1484,6 +1528,10 @@ int cmd_report(int argc, const char **argv)
 		itrace_synth_opts__clear_time_range(&itrace_synth_opts);
 		zfree(&report.ptime_range);
 	}
+
+	if (report.block_reports)
+		zfree(&report.block_reports);
+
 	zstd_fini(&(session->zstd_data));
 	perf_session__delete(session);
 	return ret;
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 5365606e9dad..655ef7708cd0 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -558,6 +558,25 @@ static int hist_entry__block_fprintf(struct hist_entry *he,
 	return ret;
 }
 
+static int hist_entry__individual_block_fprintf(struct hist_entry *he,
+						char *bf, size_t size,
+						FILE *fp)
+{
+	int ret = 0;
+
+	struct perf_hpp hpp = {
+		.buf		= bf,
+		.size		= size,
+		.skip		= false,
+	};
+
+	hist_entry__snprintf(he, &hpp);
+	if (!hpp.skip)
+		ret += fprintf(fp, "%s\n", bf);
+
+	return ret;
+}
+
 static int hist_entry__fprintf(struct hist_entry *he, size_t size,
 			       char *bf, size_t bfsz, FILE *fp,
 			       bool ignore_callchains)
@@ -580,6 +599,9 @@ static int hist_entry__fprintf(struct hist_entry *he, size_t size,
 	if (symbol_conf.report_block)
 		return hist_entry__block_fprintf(he, bf, size, fp);
 
+	if (symbol_conf.report_individual_block)
+		return hist_entry__individual_block_fprintf(he, bf, size, fp);
+
 	hist_entry__snprintf(he, &hpp);
 
 	ret = fprintf(fp, "%s\n", bf);
diff --git a/tools/perf/util/symbol_conf.h b/tools/perf/util/symbol_conf.h
index e6880789864c..10f1ec3e0349 100644
--- a/tools/perf/util/symbol_conf.h
+++ b/tools/perf/util/symbol_conf.h
@@ -40,6 +40,7 @@ struct symbol_conf {
 			raw_trace,
 			report_hierarchy,
 			report_block,
+			report_individual_block,
 			inline_name,
 			disable_add2line_warn;
 	const char	*vmlinux_name,
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v5 6/7] perf report: Support --percent-limit for --total-cycles
  2019-10-30  6:04 [PATCH v5 0/7] perf report: Support sorting all blocks by cycles Jin Yao
                   ` (4 preceding siblings ...)
  2019-10-30  6:04 ` [PATCH v5 5/7] perf report: Sort by sampled cycles percent per block for stdio Jin Yao
@ 2019-10-30  6:04 ` Jin Yao
  2019-10-30  6:04 ` [PATCH v5 7/7] perf report: Sort by sampled cycles percent per block for tui Jin Yao
  6 siblings, 0 replies; 15+ messages in thread
From: Jin Yao @ 2019-10-30  6:04 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

We have already supported the '--total-cycles' option in previous
patch. It's also useful to show entries only above a threshold
percent.

This patch enables '--percent-limit' for not showing entries
under that percent.

For example,

 perf report --total-cycles --stdio --percent-limit 1

 # To display the perf.data header info, please use --header/--header-only options.
 #
 #
 # Total Lost Samples: 0
 #
 # Samples: 2M of event 'cycles'
 # Event count (approx.): 2753248
 #
 # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                              [Program Block Range]         Shared Object
 # ...............  ..............  ...........  ..........  .................................................................  ....................
 #
            26.04%            2.8M        0.40%          18                                             [div.c:42 -> div.c:39]                   div
            15.17%            1.2M        0.16%           7                                 [random_r.c:357 -> random_r.c:380]          libc-2.27.so
             5.11%          402.0K        0.04%           2                                             [div.c:27 -> div.c:28]                   div
             4.87%          381.6K        0.04%           2                                     [random.c:288 -> random.c:291]          libc-2.27.so
             4.53%          381.0K        0.04%           2                                             [div.c:40 -> div.c:40]                   div
             3.85%          300.9K        0.02%           1                                             [div.c:22 -> div.c:25]                   div
             3.08%          241.1K        0.02%           1                                           [rand.c:26 -> rand.c:27]          libc-2.27.so
             3.06%          240.0K        0.02%           1                                     [random.c:291 -> random.c:291]          libc-2.27.so
             2.78%          215.7K        0.02%           1                                     [random.c:298 -> random.c:298]          libc-2.27.so
             2.52%          198.3K        0.02%           1                                     [random.c:293 -> random.c:293]          libc-2.27.so
             2.36%          184.8K        0.02%           1                                           [rand.c:28 -> rand.c:28]          libc-2.27.so
             2.33%          180.5K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
             2.28%          176.7K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
             2.20%          168.8K        0.02%           1                                         [rand@plt+0 -> rand@plt+0]                   div
             1.98%          158.2K        0.02%           1                                 [random_r.c:388 -> random_r.c:388]          libc-2.27.so
             1.57%          123.3K        0.02%           1                                             [div.c:42 -> div.c:44]                   div
             1.44%          116.0K        0.42%          19                                 [random_r.c:357 -> random_r.c:394]          libc-2.27.so

It only shows the entries which 'Sampled Cycles%' > 1%.

 v5:
 ---
 No functional change. Only fix the conflict issue because
 previous patches are changed.

 v4:
 ---
 No functional change. Only fix the build issue because
 previous patches are changed.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-report.c  |  6 +++---
 tools/perf/ui/stdio/hist.c   |  7 ++++++-
 tools/perf/util/block-info.c | 10 ++++++++++
 tools/perf/util/block-info.h |  2 ++
 4 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 0c53371c286a..516243715190 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -485,10 +485,10 @@ static size_t hists__fprintf_nr_sample_events(struct hists *hists, struct report
 	return ret + fprintf(fp, "\n#\n");
 }
 
-static int hists__fprintf_all_blocks(struct block_hist *bh)
+static int hists__fprintf_all_blocks(struct block_hist *bh, float min_percent)
 {
 	symbol_conf.report_individual_block = true;
-	hists__fprintf(&bh->block_hists, true, 0, 0, 0,
+	hists__fprintf(&bh->block_hists, true, 0, 0, min_percent,
 		       stdout, true);
 	hists__delete_entries(&bh->block_hists);
 	return 0;
@@ -519,7 +519,7 @@ static int perf_evlist__tty_browse_hists(struct evlist *evlist,
 
 		if (rep->total_cycles_mode) {
 			block_hist = &rep->block_reports[i++].hist;
-			hists__fprintf_all_blocks(block_hist);
+			hists__fprintf_all_blocks(block_hist, rep->min_percent);
 			continue;
 		}
 
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 655ef7708cd0..132056c7d5b7 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -15,6 +15,7 @@
 #include "../../util/srcline.h"
 #include "../../util/string2.h"
 #include "../../util/thread.h"
+#include "../../util/block-info.h"
 #include <linux/ctype.h>
 #include <linux/zalloc.h>
 
@@ -856,7 +857,11 @@ size_t hists__fprintf(struct hists *hists, bool show_header, int max_rows,
 		if (h->filtered)
 			continue;
 
-		percent = hist_entry__get_percent_limit(h);
+		if (symbol_conf.report_individual_block)
+			percent = block_info__total_cycles_percent(h);
+		else
+			percent = hist_entry__get_percent_limit(h);
+
 		if (percent < min_pcnt)
 			continue;
 
diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
index 1242c3a33509..a7d108fe53c8 100644
--- a/tools/perf/util/block-info.c
+++ b/tools/perf/util/block-info.c
@@ -444,3 +444,13 @@ struct block_report *block_info__create_report(struct evlist *evlist,
 
 	return block_reports;
 }
+
+float block_info__total_cycles_percent(struct hist_entry *he)
+{
+	struct block_info *bi = he->block_info;
+
+	if (bi->total_cycles)
+		return bi->cycles * 100.0 / bi->total_cycles;
+
+	return 0.0;
+}
diff --git a/tools/perf/util/block-info.h b/tools/perf/util/block-info.h
index b5266588d476..4a0bccbec66e 100644
--- a/tools/perf/util/block-info.h
+++ b/tools/perf/util/block-info.h
@@ -69,4 +69,6 @@ int block_info__process_sym(struct hist_entry *he, struct block_hist *bh,
 struct block_report *block_info__create_report(struct evlist *evlist,
 					       u64 total_cycles);
 
+float block_info__total_cycles_percent(struct hist_entry *he);
+
 #endif /* __PERF_BLOCK_H */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v5 7/7] perf report: Sort by sampled cycles percent per block for tui
  2019-10-30  6:04 [PATCH v5 0/7] perf report: Support sorting all blocks by cycles Jin Yao
                   ` (5 preceding siblings ...)
  2019-10-30  6:04 ` [PATCH v5 6/7] perf report: Support --percent-limit for --total-cycles Jin Yao
@ 2019-10-30  6:04 ` Jin Yao
  2019-11-01  8:34   ` Jiri Olsa
  2019-11-01  8:34   ` Jiri Olsa
  6 siblings, 2 replies; 15+ messages in thread
From: Jin Yao @ 2019-10-30  6:04 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

Previous patch has implemented a new option "--total-cycles".
But only stdio mode is supported.

This patch supports the tui mode and support '--percent-limit'.

For example,

 perf record -b ./div
 perf report --total-cycles --percent-limit 1

 # Samples: 2753248 of event 'cycles'
 Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                              [Program Block Range]         Shared Object
          26.04%            2.8M        0.40%          18                                             [div.c:42 -> div.c:39]                   div
          15.17%            1.2M        0.16%           7                                 [random_r.c:357 -> random_r.c:380]          libc-2.27.so
           5.11%          402.0K        0.04%           2                                             [div.c:27 -> div.c:28]                   div
           4.87%          381.6K        0.04%           2                                     [random.c:288 -> random.c:291]          libc-2.27.so
           4.53%          381.0K        0.04%           2                                             [div.c:40 -> div.c:40]                   div
           3.85%          300.9K        0.02%           1                                             [div.c:22 -> div.c:25]                   div
           3.08%          241.1K        0.02%           1                                           [rand.c:26 -> rand.c:27]          libc-2.27.so
           3.06%          240.0K        0.02%           1                                     [random.c:291 -> random.c:291]          libc-2.27.so
           2.78%          215.7K        0.02%           1                                     [random.c:298 -> random.c:298]          libc-2.27.so
           2.52%          198.3K        0.02%           1                                     [random.c:293 -> random.c:293]          libc-2.27.so
           2.36%          184.8K        0.02%           1                                           [rand.c:28 -> rand.c:28]          libc-2.27.so
           2.33%          180.5K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
           2.28%          176.7K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
           2.20%          168.8K        0.02%           1                                         [rand@plt+0 -> rand@plt+0]                   div
           1.98%          158.2K        0.02%           1                                 [random_r.c:388 -> random_r.c:388]          libc-2.27.so
           1.57%          123.3K        0.02%           1                                             [div.c:42 -> div.c:44]                   div
           1.44%          116.0K        0.42%          19                                 [random_r.c:357 -> random_r.c:394]          libc-2.27.so

 v5:
 ---
 Fix a crash issue when running perf report without
 '--total-cycles'. The issue is because the internal flag
 is renamed from 'total_cycles' to 'total_cycles_mode' in
 previous patch but this patch still uses 'total_cycles'
 to check if the '--total-cycles' option is enabled, which
 causes the code to be inconsistent.

 v4:
 ---
 Since the block collection is moved out of printing in
 previous patch, this patch is updated accordingly for
 tui supporting.

 v3:
 ---
 Minor change since the function name is changed:
 block_total_cycles_percent -> block_info__total_cycles_percent

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-report.c    | 30 +++++++++++++---
 tools/perf/ui/browsers/hists.c | 62 +++++++++++++++++++++++++++++++++-
 tools/perf/ui/browsers/hists.h |  2 ++
 tools/perf/util/hist.h         | 12 +++++++
 4 files changed, 100 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 516243715190..4437f4ffae90 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -494,6 +494,25 @@ static int hists__fprintf_all_blocks(struct block_hist *bh, float min_percent)
 	return 0;
 }
 
+static int perf_evlist__tui_block_hists_browse(struct evlist *evlist,
+					       struct report *rep)
+{
+	struct block_hist *bh;
+	struct evsel *pos;
+	int i = 0, ret;
+
+	evlist__for_each_entry(evlist, pos) {
+		bh = &rep->block_reports[i++].hist;
+		symbol_conf.report_individual_block = true;
+		ret = block_hists_tui_browse(bh, pos, rep->min_percent);
+		hists__delete_entries(&bh->block_hists);
+		if (ret != 0)
+			return ret;
+	}
+
+	return ret;
+}
+
 static int perf_evlist__tty_browse_hists(struct evlist *evlist,
 					 struct report *rep,
 					 const char *help)
@@ -605,6 +624,11 @@ static int report__browse_hists(struct report *rep)
 
 	switch (use_browser) {
 	case 1:
+		if (rep->total_cycles_mode) {
+			ret = perf_evlist__tui_block_hists_browse(evlist, rep);
+			break;
+		}
+
 		ret = perf_evlist__tui_browse_hists(evlist, help, NULL,
 						    rep->min_percent,
 						    &session->header.env,
@@ -1408,12 +1432,8 @@ int cmd_report(int argc, const char **argv)
 	if (report.total_cycles_mode) {
 		if (sort__mode != SORT_MODE__BRANCH)
 			report.total_cycles_mode = false;
-		else if (!report.use_stdio) {
-			pr_err("Error: --total-cycles can be only used together with --stdio\n");
-			goto error;
-		} else {
+		else
 			sort_order = "sym";
-		}
 	}
 
 	if (strcmp(input_name, "-") != 0)
diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 7a7187e069b4..04301303c246 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -26,6 +26,7 @@
 #include "../../util/sort.h"
 #include "../../util/top.h"
 #include "../../util/thread.h"
+#include "../../util/block-info.h"
 #include "../../arch/common.h"
 #include "../../perf.h"
 
@@ -1783,7 +1784,11 @@ static unsigned int hist_browser__refresh(struct ui_browser *browser)
 			continue;
 		}
 
-		percent = hist_entry__get_percent_limit(h);
+		if (symbol_conf.report_individual_block)
+			percent = block_info__total_cycles_percent(h);
+		else
+			percent = hist_entry__get_percent_limit(h);
+
 		if (percent < hb->min_pcnt)
 			continue;
 
@@ -3443,3 +3448,58 @@ int perf_evlist__tui_browse_hists(struct evlist *evlist, const char *help,
 					       warn_lost_event,
 					       annotation_opts);
 }
+
+static int block_hists_browser__title(struct hist_browser *browser, char *bf,
+				      size_t size)
+{
+	struct hists *hists = evsel__hists(browser->block_evsel);
+	const char *evname = perf_evsel__name(browser->block_evsel);
+	unsigned long nr_samples = hists->stats.nr_events[PERF_RECORD_SAMPLE];
+	int ret;
+
+	ret = scnprintf(bf, size, "# Samples: %lu", nr_samples);
+	if (evname)
+		scnprintf(bf + ret, size -  ret, " of event '%s'", evname);
+
+	return 0;
+}
+
+int block_hists_tui_browse(struct block_hist *bh, struct evsel *evsel,
+			   float min_percent)
+{
+	struct hists *hists = &bh->block_hists;
+	struct hist_browser *browser;
+	int key = -1;
+	static const char help[] =
+	" q             Quit \n";
+
+	browser = hist_browser__new(hists);
+	if (!browser)
+		return -1;
+
+	browser->block_evsel = evsel;
+	browser->title = block_hists_browser__title;
+	browser->min_pcnt = min_percent;
+
+	/* reset abort key so that it can get Ctrl-C as a key */
+	SLang_reset_tty();
+	SLang_init_tty(0, 0, 0);
+
+	while (1) {
+		key = hist_browser__run(browser, "? - help", true);
+
+		switch (key) {
+		case 'q':
+			goto out;
+		case '?':
+			ui_browser__help_window(&browser->b, help);
+			break;
+		default:
+			break;
+		}
+	}
+
+out:
+	hist_browser__delete(browser);
+	return 0;
+}
diff --git a/tools/perf/ui/browsers/hists.h b/tools/perf/ui/browsers/hists.h
index 91d3e18b50aa..078f2f2c7abd 100644
--- a/tools/perf/ui/browsers/hists.h
+++ b/tools/perf/ui/browsers/hists.h
@@ -5,6 +5,7 @@
 #include "ui/browser.h"
 
 struct annotation_options;
+struct evsel;
 
 struct hist_browser {
 	struct ui_browser   b;
@@ -15,6 +16,7 @@ struct hist_browser {
 	struct pstack	    *pstack;
 	struct perf_env	    *env;
 	struct annotation_options *annotation_opts;
+	struct evsel	    *block_evsel;
 	int		     print_seq;
 	bool		     show_dso;
 	bool		     show_headers;
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 4d87c7b4c1b2..f254fa349ad6 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -449,6 +449,8 @@ enum rstype {
 	A_SOURCE
 };
 
+struct block_hist;
+
 #ifdef HAVE_SLANG_SUPPORT
 #include "../ui/keysyms.h"
 void attr_to_script(char *buf, struct perf_event_attr *attr);
@@ -474,6 +476,9 @@ void run_script(char *cmd);
 int res_sample_browse(struct res_sample *res_samples, int num_res,
 		      struct evsel *evsel, enum rstype rstype);
 void res_sample_init(void);
+
+int block_hists_tui_browse(struct block_hist *bh, struct evsel *evsel,
+			   float min_percent);
 #else
 static inline
 int perf_evlist__tui_browse_hists(struct evlist *evlist __maybe_unused,
@@ -516,6 +521,13 @@ static inline int res_sample_browse(struct res_sample *res_samples __maybe_unuse
 	return 0;
 }
 
+static inline int block_hists_tui_browse(struct block_hist *bh __maybe_unused,
+					 struct evsel *evsel __maybe_unused,
+					 float min_percent __maybe_unused)
+{
+	return 0;
+}
+
 static inline void res_sample_init(void) {}
 
 #define K_LEFT  -1000
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v5 7/7] perf report: Sort by sampled cycles percent per block for tui
  2019-10-30  6:04 ` [PATCH v5 7/7] perf report: Sort by sampled cycles percent per block for tui Jin Yao
@ 2019-11-01  8:34   ` Jiri Olsa
  2019-11-01 13:05     ` Jin, Yao
  2019-11-01  8:34   ` Jiri Olsa
  1 sibling, 1 reply; 15+ messages in thread
From: Jiri Olsa @ 2019-11-01  8:34 UTC (permalink / raw)
  To: Jin Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Wed, Oct 30, 2019 at 02:04:30PM +0800, Jin Yao wrote:

SNIP

> +
>  static int perf_evlist__tty_browse_hists(struct evlist *evlist,
>  					 struct report *rep,
>  					 const char *help)
> @@ -605,6 +624,11 @@ static int report__browse_hists(struct report *rep)
>  
>  	switch (use_browser) {
>  	case 1:
> +		if (rep->total_cycles_mode) {
> +			ret = perf_evlist__tui_block_hists_browse(evlist, rep);
> +			break;
> +		}

does this have sense only for cycles event? what if I do:
  # perf record -b -e cycles,cache-misses

jirka

> +
>  		ret = perf_evlist__tui_browse_hists(evlist, help, NULL,
>  						    rep->min_percent,
>  						    &session->header.env,
> @@ -1408,12 +1432,8 @@ int cmd_report(int argc, const char **argv)
>  	if (report.total_cycles_mode) {
>  		if (sort__mode != SORT_MODE__BRANCH)
>  			report.total_cycles_mode = false;
> -		else if (!report.use_stdio) {
> -			pr_err("Error: --total-cycles can be only used together with --stdio\n");
> -			goto error;
> -		} else {
> +		else
>  			sort_order = "sym";
> -		}
>  	}
>  
>  	if (strcmp(input_name, "-") != 0)

SNIP


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v5 7/7] perf report: Sort by sampled cycles percent per block for tui
  2019-10-30  6:04 ` [PATCH v5 7/7] perf report: Sort by sampled cycles percent per block for tui Jin Yao
  2019-11-01  8:34   ` Jiri Olsa
@ 2019-11-01  8:34   ` Jiri Olsa
  2019-11-01 14:07     ` Jin, Yao
  1 sibling, 1 reply; 15+ messages in thread
From: Jiri Olsa @ 2019-11-01  8:34 UTC (permalink / raw)
  To: Jin Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Wed, Oct 30, 2019 at 02:04:30PM +0800, Jin Yao wrote:

SNIP

> diff --git a/tools/perf/ui/browsers/hists.h b/tools/perf/ui/browsers/hists.h
> index 91d3e18b50aa..078f2f2c7abd 100644
> --- a/tools/perf/ui/browsers/hists.h
> +++ b/tools/perf/ui/browsers/hists.h
> @@ -5,6 +5,7 @@
>  #include "ui/browser.h"
>  
>  struct annotation_options;
> +struct evsel;
>  
>  struct hist_browser {
>  	struct ui_browser   b;
> @@ -15,6 +16,7 @@ struct hist_browser {
>  	struct pstack	    *pstack;
>  	struct perf_env	    *env;
>  	struct annotation_options *annotation_opts;
> +	struct evsel	    *block_evsel;

you should be able to get the evsel from hists_to_evsel function

jirka


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v5 7/7] perf report: Sort by sampled cycles percent per block for tui
  2019-11-01  8:34   ` Jiri Olsa
@ 2019-11-01 13:05     ` Jin, Yao
  0 siblings, 0 replies; 15+ messages in thread
From: Jin, Yao @ 2019-11-01 13:05 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 11/1/2019 4:34 PM, Jiri Olsa wrote:
> On Wed, Oct 30, 2019 at 02:04:30PM +0800, Jin Yao wrote:
> 
> SNIP
> 
>> +
>>   static int perf_evlist__tty_browse_hists(struct evlist *evlist,
>>   					 struct report *rep,
>>   					 const char *help)
>> @@ -605,6 +624,11 @@ static int report__browse_hists(struct report *rep)
>>   
>>   	switch (use_browser) {
>>   	case 1:
>> +		if (rep->total_cycles_mode) {
>> +			ret = perf_evlist__tui_block_hists_browse(evlist, rep);
>> +			break;
>> +		}
> 
> does this have sense only for cycles event? what if I do:
>    # perf record -b -e cycles,cache-misses
> 
> jirka
> 

It can report both cycles and cache-misses. But I use a simple way.

When I run 'perf report --total-cycles', it displays the window for the 
first event ('cycles') by default. For example,

# Samples: 8384 of event 'cycles'
Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles
...

Once I press 'q' to quit current window, it then switches to another 
window for the second event ('cache-misses'). For example,

# Samples: 7072 of event 'cache-misses'
Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles
...

Thanks
Jin Yao

>> +
>>   		ret = perf_evlist__tui_browse_hists(evlist, help, NULL,
>>   						    rep->min_percent,
>>   						    &session->header.env,
>> @@ -1408,12 +1432,8 @@ int cmd_report(int argc, const char **argv)
>>   	if (report.total_cycles_mode) {
>>   		if (sort__mode != SORT_MODE__BRANCH)
>>   			report.total_cycles_mode = false;
>> -		else if (!report.use_stdio) {
>> -			pr_err("Error: --total-cycles can be only used together with --stdio\n");
>> -			goto error;
>> -		} else {
>> +		else
>>   			sort_order = "sym";
>> -		}
>>   	}
>>   
>>   	if (strcmp(input_name, "-") != 0)
> 
> SNIP
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v5 7/7] perf report: Sort by sampled cycles percent per block for tui
  2019-11-01  8:34   ` Jiri Olsa
@ 2019-11-01 14:07     ` Jin, Yao
  2019-11-04 13:53       ` Jiri Olsa
  0 siblings, 1 reply; 15+ messages in thread
From: Jin, Yao @ 2019-11-01 14:07 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 11/1/2019 4:34 PM, Jiri Olsa wrote:
> On Wed, Oct 30, 2019 at 02:04:30PM +0800, Jin Yao wrote:
> 
> SNIP
> 
>> diff --git a/tools/perf/ui/browsers/hists.h b/tools/perf/ui/browsers/hists.h
>> index 91d3e18b50aa..078f2f2c7abd 100644
>> --- a/tools/perf/ui/browsers/hists.h
>> +++ b/tools/perf/ui/browsers/hists.h
>> @@ -5,6 +5,7 @@
>>   #include "ui/browser.h"
>>   
>>   struct annotation_options;
>> +struct evsel;
>>   
>>   struct hist_browser {
>>   	struct ui_browser   b;
>> @@ -15,6 +16,7 @@ struct hist_browser {
>>   	struct pstack	    *pstack;
>>   	struct perf_env	    *env;
>>   	struct annotation_options *annotation_opts;
>> +	struct evsel	    *block_evsel;
> 
> you should be able to get the evsel from hists_to_evsel function
> 
> jirka
> 

Maybe we can't. The hists in hist_browser is set to block_hists (not the 
hists for evsel).

See block_hists_tui_browse,

int block_hists_tui_browse(struct block_hist *bh, struct evsel *evsel,
                           float min_percent)
{
        struct hists *hists = &bh->block_hists;
        struct hist_browser *browser;
        ......
        browser = hist_browser__new(hists);
        ......
}

So I have to pass the evsel in.

Thanks
Jin Yao

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v5 7/7] perf report: Sort by sampled cycles percent per block for tui
  2019-11-01 14:07     ` Jin, Yao
@ 2019-11-04 13:53       ` Jiri Olsa
  0 siblings, 0 replies; 15+ messages in thread
From: Jiri Olsa @ 2019-11-04 13:53 UTC (permalink / raw)
  To: Jin, Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Fri, Nov 01, 2019 at 10:07:33PM +0800, Jin, Yao wrote:
> 
> 
> On 11/1/2019 4:34 PM, Jiri Olsa wrote:
> > On Wed, Oct 30, 2019 at 02:04:30PM +0800, Jin Yao wrote:
> > 
> > SNIP
> > 
> > > diff --git a/tools/perf/ui/browsers/hists.h b/tools/perf/ui/browsers/hists.h
> > > index 91d3e18b50aa..078f2f2c7abd 100644
> > > --- a/tools/perf/ui/browsers/hists.h
> > > +++ b/tools/perf/ui/browsers/hists.h
> > > @@ -5,6 +5,7 @@
> > >   #include "ui/browser.h"
> > >   struct annotation_options;
> > > +struct evsel;
> > >   struct hist_browser {
> > >   	struct ui_browser   b;
> > > @@ -15,6 +16,7 @@ struct hist_browser {
> > >   	struct pstack	    *pstack;
> > >   	struct perf_env	    *env;
> > >   	struct annotation_options *annotation_opts;
> > > +	struct evsel	    *block_evsel;
> > 
> > you should be able to get the evsel from hists_to_evsel function
> > 
> > jirka
> > 
> 
> Maybe we can't. The hists in hist_browser is set to block_hists (not the
> hists for evsel).
> 
> See block_hists_tui_browse,
> 
> int block_hists_tui_browse(struct block_hist *bh, struct evsel *evsel,
>                           float min_percent)
> {
>        struct hists *hists = &bh->block_hists;
>        struct hist_browser *browser;
>        ......
>        browser = hist_browser__new(hists);
>        ......
> }
> 
> So I have to pass the evsel in.

I see, ok

jirka


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v5 5/7] perf report: Sort by sampled cycles percent per block for stdio
  2019-10-30  6:04 ` [PATCH v5 5/7] perf report: Sort by sampled cycles percent per block for stdio Jin Yao
@ 2019-11-04 14:04   ` Jiri Olsa
  2019-11-05  3:41     ` Jin, Yao
  0 siblings, 1 reply; 15+ messages in thread
From: Jiri Olsa @ 2019-11-04 14:04 UTC (permalink / raw)
  To: Jin Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Wed, Oct 30, 2019 at 02:04:28PM +0800, Jin Yao wrote:

SNIP

> +static int hists__fprintf_all_blocks(struct block_hist *bh)
> +{
> +	symbol_conf.report_individual_block = true;
> +	hists__fprintf(&bh->block_hists, true, 0, 0, 0,
> +		       stdout, true);
> +	hists__delete_entries(&bh->block_hists);
> +	return 0;
> +}
> +
>  static int perf_evlist__tty_browse_hists(struct evlist *evlist,
>  					 struct report *rep,
>  					 const char *help)
>  {
>  	struct evsel *pos;
> +	int i = 0;
>  
>  	if (!quiet) {
>  		fprintf(stdout, "#\n# Total Lost Samples: %" PRIu64 "\n#\n",
> @@ -494,12 +509,20 @@ static int perf_evlist__tty_browse_hists(struct evlist *evlist,
>  	evlist__for_each_entry(evlist, pos) {
>  		struct hists *hists = evsel__hists(pos);
>  		const char *evname = perf_evsel__name(pos);
> +		struct block_hist *block_hist;
>  
>  		if (symbol_conf.event_group &&
>  		    !perf_evsel__is_group_leader(pos))
>  			continue;
>  
>  		hists__fprintf_nr_sample_events(hists, rep, evname, stdout);
> +
> +		if (rep->total_cycles_mode) {
> +			block_hist = &rep->block_reports[i++].hist;
> +			hists__fprintf_all_blocks(block_hist);
> +			continue;
> +		}

hum, you don't need evsel in here, please make separate function like
perf_evlist__tty_browse_block_hists, where you will iterate directly
block_reports[i++]

IMO the best would be to have report__browse_block_hists in block-info.c
and handle all display modes from there

that's probably the last thing that would be moved to block-info.c
other than that I think the patchset is ok

thanks,
jirka


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v5 5/7] perf report: Sort by sampled cycles percent per block for stdio
  2019-11-04 14:04   ` Jiri Olsa
@ 2019-11-05  3:41     ` Jin, Yao
  0 siblings, 0 replies; 15+ messages in thread
From: Jin, Yao @ 2019-11-05  3:41 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 11/4/2019 10:04 PM, Jiri Olsa wrote:
> On Wed, Oct 30, 2019 at 02:04:28PM +0800, Jin Yao wrote:
> 
> SNIP
> 
>> +static int hists__fprintf_all_blocks(struct block_hist *bh)
>> +{
>> +	symbol_conf.report_individual_block = true;
>> +	hists__fprintf(&bh->block_hists, true, 0, 0, 0,
>> +		       stdout, true);
>> +	hists__delete_entries(&bh->block_hists);
>> +	return 0;
>> +}
>> +
>>   static int perf_evlist__tty_browse_hists(struct evlist *evlist,
>>   					 struct report *rep,
>>   					 const char *help)
>>   {
>>   	struct evsel *pos;
>> +	int i = 0;
>>   
>>   	if (!quiet) {
>>   		fprintf(stdout, "#\n# Total Lost Samples: %" PRIu64 "\n#\n",
>> @@ -494,12 +509,20 @@ static int perf_evlist__tty_browse_hists(struct evlist *evlist,
>>   	evlist__for_each_entry(evlist, pos) {
>>   		struct hists *hists = evsel__hists(pos);
>>   		const char *evname = perf_evsel__name(pos);
>> +		struct block_hist *block_hist;
>>   
>>   		if (symbol_conf.event_group &&
>>   		    !perf_evsel__is_group_leader(pos))
>>   			continue;
>>   
>>   		hists__fprintf_nr_sample_events(hists, rep, evname, stdout);
>> +
>> +		if (rep->total_cycles_mode) {
>> +			block_hist = &rep->block_reports[i++].hist;
>> +			hists__fprintf_all_blocks(block_hist);
>> +			continue;
>> +		}
> 
> hum, you don't need evsel in here, please make separate function like
> perf_evlist__tty_browse_block_hists, where you will iterate directly
> block_reports[i++]
> 
> IMO the best would be to have report__browse_block_hists in block-info.c
> and handle all display modes from there
> 
> that's probably the last thing that would be moved to block-info.c
> other than that I think the patchset is ok
> 
> thanks,
> jirka
> 

Thanks Jiri!

I just posted v6 which moved more block display codes from 
builtin-report.c to block-info.c. That should let the code be more clear.

Thanks
Jin Yao

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2019-11-05  3:41 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-30  6:04 [PATCH v5 0/7] perf report: Support sorting all blocks by cycles Jin Yao
2019-10-30  6:04 ` [PATCH v5 1/7] perf diff: Don't use hack to skip column length calculation Jin Yao
2019-10-30  6:04 ` [PATCH v5 2/7] perf util: Cleanup and refactor block info functions Jin Yao
2019-10-30  6:04 ` [PATCH v5 3/7] perf util: Count the total cycles of all samples Jin Yao
2019-10-30  6:04 ` [PATCH v5 4/7] perf util: Support block formats with compare/sort/display Jin Yao
2019-10-30  6:04 ` [PATCH v5 5/7] perf report: Sort by sampled cycles percent per block for stdio Jin Yao
2019-11-04 14:04   ` Jiri Olsa
2019-11-05  3:41     ` Jin, Yao
2019-10-30  6:04 ` [PATCH v5 6/7] perf report: Support --percent-limit for --total-cycles Jin Yao
2019-10-30  6:04 ` [PATCH v5 7/7] perf report: Sort by sampled cycles percent per block for tui Jin Yao
2019-11-01  8:34   ` Jiri Olsa
2019-11-01 13:05     ` Jin, Yao
2019-11-01  8:34   ` Jiri Olsa
2019-11-01 14:07     ` Jin, Yao
2019-11-04 13:53       ` Jiri Olsa

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).