* [PATCHv2 0/9] perf tools: Assorted fixes
@ 2018-03-09 10:14 Jiri Olsa
  2018-03-09 10:14 ` [PATCH 1/9] perf tools: Free memory nodes data Jiri Olsa
                   ` (8 more replies)
  0 siblings, 9 replies; 22+ messages in thread
From: Jiri Olsa @ 2018-03-09 10:14 UTC
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, David Ahern, Alexander Shishkin,
	Peter Zijlstra

hi,
sending assorted general fixes that have queued
up in my other branches.

v2 changes:
  - rebased, some patches already taken
  - multiple fixes suggested by Arnaldo
  - new patch that frees memory_nodes in perf_env__exit

Also available in here:
  https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
  perf/fixes

thanks,
jirka


---
Jiri Olsa (9):
      perf tools: Free memory nodes data
      perf tools: Add mem2node object
      perf tests: Add mem2node object test
      perf c2c record: Record physical addresses in samples
      perf c2c report: Make calc_width work with struct c2c_hist_entry
      perf c2c report: Call calc_width only for displayed entries
      perf c2c report: Display node for cacheline address
      perf c2c report: Add span header over cacheline data
      perf c2c report: Add cacheline address count column

 tools/perf/Documentation/perf-c2c.txt |   2 +-
 tools/perf/builtin-c2c.c              | 223 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------
 tools/perf/tests/Build                |   1 +
 tools/perf/tests/builtin-test.c       |   4 +++
 tools/perf/tests/mem2node.c           |  75 ++++++++++++++++++++++++++++++++++++++
 tools/perf/tests/tests.h              |   1 +
 tools/perf/util/Build                 |   1 +
 tools/perf/util/env.c                 |   4 +++
 tools/perf/util/mem2node.c            | 134 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/mem2node.h            |  19 ++++++++++
 10 files changed, 446 insertions(+), 18 deletions(-)
 create mode 100644 tools/perf/tests/mem2node.c
 create mode 100644 tools/perf/util/mem2node.c
 create mode 100644 tools/perf/util/mem2node.h


* [PATCH 1/9] perf tools: Free memory nodes data
  2018-03-09 10:14 [PATCHv2 0/9] perf tools: Assorted fixes Jiri Olsa
@ 2018-03-09 10:14 ` Jiri Olsa
  2018-03-20  6:16   ` [tip:perf/core] perf env: " tip-bot for Jiri Olsa
  2018-03-09 10:14 ` [PATCH 2/9] perf tools: Add mem2node object Jiri Olsa
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Jiri Olsa @ 2018-03-09 10:14 UTC
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, David Ahern, Alexander Shishkin,
	Peter Zijlstra

perf_env__exit() did not free the env's memory nodes;
add the missing cleanup.

Link: http://lkml.kernel.org/n/tip-ryyndcxqisxtfhbr6zp54jcy@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/env.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 6d311868d850..4c842762e3f2 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -32,6 +32,10 @@ void perf_env__exit(struct perf_env *env)
 	for (i = 0; i < env->caches_cnt; i++)
 		cpu_cache_level__free(&env->caches[i]);
 	zfree(&env->caches);
+
+	for (i = 0; i < env->nr_memory_nodes; i++)
+		free(env->memory_nodes[i].set);
+	zfree(&env->memory_nodes);
 }
 
 int perf_env__set_cmdline(struct perf_env *env, int argc, const char *argv[])
-- 
2.13.6


* [PATCH 2/9] perf tools: Add mem2node object
  2018-03-09 10:14 [PATCHv2 0/9] perf tools: Assorted fixes Jiri Olsa
  2018-03-09 10:14 ` [PATCH 1/9] perf tools: Free memory nodes data Jiri Olsa
@ 2018-03-09 10:14 ` Jiri Olsa
  2018-03-20  6:16   ` [tip:perf/core] " tip-bot for Jiri Olsa
  2018-03-09 10:14 ` [PATCH 3/9] perf tests: Add mem2node object test Jiri Olsa
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Jiri Olsa @ 2018-03-09 10:14 UTC
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, David Ahern, Alexander Shishkin,
	Peter Zijlstra

Add the mem2node object to allow easy lookup of the
NUMA node for a physical address.

It has the following interface:

  int  mem2node__init(struct mem2node *map, struct perf_env *env);
  void mem2node__exit(struct mem2node *map);
  int  mem2node__node(struct mem2node *map, u64 addr);

mem2node__init() initializes the object from the
MEM_TOPOLOGY feature data in the perf data file. Subsequent
calls to mem2node__node() return the node number for the
given physical address, or -1 if no node covers it. The
mem2node__exit() function frees the object.
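
A minimal standalone sketch of the lookup contract, not the perf code
itself. It assumes the ranges are already sorted by start address and
do not overlap, and it swaps in a binary search over a plain array for
the rbtree walk the patch uses. The names struct phys_range and
range_lookup() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the patch's struct phys_entry. */
struct phys_range {
	uint64_t start;	/* inclusive */
	uint64_t end;	/* exclusive */
	int	 node;
};

/*
 * Return the node owning addr, or -1 when no range covers it.
 * Assumes r[] is sorted by start and the ranges do not overlap.
 */
static int range_lookup(const struct phys_range *r, size_t nr, uint64_t addr)
{
	size_t lo = 0, hi = nr;

	assert(r != NULL || nr == 0);

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (addr < r[mid].start)
			hi = mid;		/* continue in the lower half */
		else if (addr >= r[mid].end)
			lo = mid + 1;		/* continue in the upper half */
		else
			return r[mid].node;	/* start <= addr < end */
	}

	return -1;
}
```

The comparison order mirrors mem2node__node(): go left when addr is
below the range start, go right when it is at or past the range end,
otherwise report the node.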

Link: http://lkml.kernel.org/n/tip-qq7sohu774wxq154n3my037z@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/Build      |   1 +
 tools/perf/util/mem2node.c | 134 +++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/mem2node.h |  19 +++++++
 3 files changed, 154 insertions(+)
 create mode 100644 tools/perf/util/mem2node.c
 create mode 100644 tools/perf/util/mem2node.h

diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index ea0a452550b0..8052373bcd6a 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -106,6 +106,7 @@ libperf-y += units.o
 libperf-y += time-utils.o
 libperf-y += expr-bison.o
 libperf-y += branch.o
+libperf-y += mem2node.o
 
 libperf-$(CONFIG_LIBBPF) += bpf-loader.o
 libperf-$(CONFIG_BPF_PROLOGUE) += bpf-prologue.o
diff --git a/tools/perf/util/mem2node.c b/tools/perf/util/mem2node.c
new file mode 100644
index 000000000000..c6fd81c02586
--- /dev/null
+++ b/tools/perf/util/mem2node.c
@@ -0,0 +1,134 @@
+#include <errno.h>
+#include <inttypes.h>
+#include <linux/bitmap.h>
+#include "mem2node.h"
+#include "util.h"
+
+struct phys_entry {
+	struct rb_node	rb_node;
+	u64	start;
+	u64	end;
+	u64	node;
+};
+
+static void phys_entry__insert(struct phys_entry *entry, struct rb_root *root)
+{
+	struct rb_node **p = &root->rb_node;
+	struct rb_node *parent = NULL;
+	struct phys_entry *e;
+
+	while (*p != NULL) {
+		parent = *p;
+		e = rb_entry(parent, struct phys_entry, rb_node);
+
+		if (entry->start < e->start)
+			p = &(*p)->rb_left;
+		else
+			p = &(*p)->rb_right;
+	}
+
+	rb_link_node(&entry->rb_node, parent, p);
+	rb_insert_color(&entry->rb_node, root);
+}
+
+static void
+phys_entry__init(struct phys_entry *entry, u64 start, u64 bsize, u64 node)
+{
+	entry->start = start;
+	entry->end   = start + bsize;
+	entry->node  = node;
+	RB_CLEAR_NODE(&entry->rb_node);
+}
+
+int mem2node__init(struct mem2node *map, struct perf_env *env)
+{
+	struct memory_node *n, *nodes = &env->memory_nodes[0];
+	struct phys_entry *entries, *tmp_entries;
+	u64 bsize = env->memory_bsize;
+	int i, j = 0, max = 0;
+
+	memset(map, 0x0, sizeof(*map));
+	map->root = RB_ROOT;
+
+	for (i = 0; i < env->nr_memory_nodes; i++) {
+		n = &nodes[i];
+		max += bitmap_weight(n->set, n->size);
+	}
+
+	entries = zalloc(sizeof(*entries) * max);
+	if (!entries)
+		return -ENOMEM;
+
+	for (i = 0; i < env->nr_memory_nodes; i++) {
+		u64 bit;
+
+		n = &nodes[i];
+
+		for (bit = 0; bit < n->size; bit++) {
+			u64 start;
+
+			if (!test_bit(bit, n->set))
+				continue;
+
+			start = bit * bsize;
+
+			/*
+			 * Merge nearby areas, we walk in order
+			 * through the bitmap, so no need to sort.
+			 */
+			if (j > 0) {
+				struct phys_entry *prev = &entries[j - 1];
+
+				if ((prev->end == start) &&
+				    (prev->node == n->node)) {
+					prev->end += bsize;
+					continue;
+				}
+			}
+
+			phys_entry__init(&entries[j++], start, bsize, n->node);
+		}
+	}
+
+	/* Cut unused entries, due to merging. */
+	tmp_entries = realloc(entries, sizeof(*entries) * j);
+	if (tmp_entries)
+		entries = tmp_entries;
+
+	for (i = 0; i < j; i++) {
+		pr_debug("mem2node %03" PRIu64 " [0x%016" PRIx64 "-0x%016" PRIx64 "]\n",
+			 entries[i].node, entries[i].start, entries[i].end);
+
+		phys_entry__insert(&entries[i], &map->root);
+	}
+
+	map->entries = entries;
+	return 0;
+}
+
+void mem2node__exit(struct mem2node *map)
+{
+	zfree(&map->entries);
+}
+
+int mem2node__node(struct mem2node *map, u64 addr)
+{
+	struct rb_node **p, *parent = NULL;
+	struct phys_entry *entry;
+
+	p = &map->root.rb_node;
+	while (*p != NULL) {
+		parent = *p;
+		entry = rb_entry(parent, struct phys_entry, rb_node);
+		if (addr < entry->start)
+			p = &(*p)->rb_left;
+		else if (addr >= entry->end)
+			p = &(*p)->rb_right;
+		else
+			goto out;
+	}
+
+	entry = NULL;
+out:
+	return entry ? (int) entry->node : -1;
+}
diff --git a/tools/perf/util/mem2node.h b/tools/perf/util/mem2node.h
new file mode 100644
index 000000000000..59c4752a2181
--- /dev/null
+++ b/tools/perf/util/mem2node.h
@@ -0,0 +1,19 @@
+#ifndef __MEM2NODE_H
+#define __MEM2NODE_H
+
+#include <linux/rbtree.h>
+#include "env.h"
+
+struct phys_entry;
+
+struct mem2node {
+	struct rb_root		 root;
+	struct phys_entry	*entries;
+	int			 cnt;
+};
+
+int  mem2node__init(struct mem2node *map, struct perf_env *env);
+void mem2node__exit(struct mem2node *map);
+int  mem2node__node(struct mem2node *map, u64 addr);
+
+#endif /* __MEM2NODE_H */
-- 
2.13.6


* [PATCH 3/9] perf tests: Add mem2node object test
  2018-03-09 10:14 [PATCHv2 0/9] perf tools: Assorted fixes Jiri Olsa
  2018-03-09 10:14 ` [PATCH 1/9] perf tools: Free memory nodes data Jiri Olsa
  2018-03-09 10:14 ` [PATCH 2/9] perf tools: Add mem2node object Jiri Olsa
@ 2018-03-09 10:14 ` Jiri Olsa
  2018-03-20  6:17   ` [tip:perf/core] " tip-bot for Jiri Olsa
  2018-03-09 10:14 ` [PATCH 4/9] perf c2c record: Record physical addresses in samples Jiri Olsa
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Jiri Olsa @ 2018-03-09 10:14 UTC
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, David Ahern, Alexander Shishkin,
	Peter Zijlstra

Add an automated test for the mem2node object.

The test prepares a few artificial node-to-memory maps and
verifies that the mem2node object returns the proper node
for the given addresses.
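
A rough standalone sketch of how per-node block bitmaps turn into
address ranges, including the merge of adjacent blocks. It is an
illustration under stated assumptions, not the perf implementation:
each node's block set is modelled as a single 64-bit mask, and the
names struct node_blocks, struct phys_range and build_ranges() are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical input: bit i set in mask means block i belongs to node. */
struct node_blocks {
	int	 node;
	uint64_t mask;
};

/* Hypothetical output: addresses in [start, end) belong to node. */
struct phys_range {
	uint64_t start;
	uint64_t end;
	int	 node;
};

/*
 * Walk every node's mask in bit order and emit one range per run of
 * consecutive blocks. Returns the number of ranges written to out[].
 */
static size_t build_ranges(const struct node_blocks *nodes, size_t nr_nodes,
			   uint64_t bsize, struct phys_range *out, size_t cap)
{
	size_t i, n = 0;
	unsigned int bit;

	for (i = 0; i < nr_nodes; i++) {
		for (bit = 0; bit < 64; bit++) {
			uint64_t start;

			if (!((nodes[i].mask >> bit) & 1))
				continue;

			start = (uint64_t)bit * bsize;

			/* Extend the previous range when this block follows it. */
			if (n > 0 && out[n - 1].end == start &&
			    out[n - 1].node == nodes[i].node) {
				out[n - 1].end += bsize;
				continue;
			}

			assert(n < cap);
			out[n].start = start;
			out[n].end   = start + bsize;
			out[n].node  = nodes[i].node;
			n++;
		}
	}

	return n;
}
```

With a block size of 0x100 and the maps "0", "1-2" and "5-7,9" from
the test (masks 0x1, 0x6 and 0x2e0), this yields four ranges:
[0x0, 0x100) for node 0, [0x100, 0x300) for node 1, and
[0x500, 0x800) and [0x900, 0xa00) for node 3. The output is ordered
node by node, not globally by address.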

Link: http://lkml.kernel.org/n/tip-17xdxr5k1zfca9o3fymowqa5@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/tests/Build          |  1 +
 tools/perf/tests/builtin-test.c |  4 +++
 tools/perf/tests/mem2node.c     | 75 +++++++++++++++++++++++++++++++++++++++++
 tools/perf/tests/tests.h        |  1 +
 4 files changed, 81 insertions(+)
 create mode 100644 tools/perf/tests/mem2node.c

diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index 87bf3edb037c..45782220ac23 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -47,6 +47,7 @@ perf-y += bitmap.o
 perf-y += perf-hooks.o
 perf-y += clang.o
 perf-y += unit_number__scnprintf.o
+perf-y += mem2node.o
 
 $(OUTPUT)tests/llvm-src-base.c: tests/bpf-script-example.c tests/Build
 	$(call rule_mkdir)
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index fafa014240cd..09071ef4434f 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -271,6 +271,10 @@ static struct test generic_tests[] = {
 		.func = test__unit_number__scnprint,
 	},
 	{
+		.desc = "mem2node",
+		.func = test__mem2node,
+	},
+	{
 		.func = NULL,
 	},
 };
diff --git a/tools/perf/tests/mem2node.c b/tools/perf/tests/mem2node.c
new file mode 100644
index 000000000000..0c3c87f86e03
--- /dev/null
+++ b/tools/perf/tests/mem2node.c
@@ -0,0 +1,75 @@
+#include <linux/compiler.h>
+#include <linux/bitmap.h>
+#include "cpumap.h"
+#include "mem2node.h"
+#include "tests.h"
+
+static struct node {
+	int		 node;
+	const char 	*map;
+} test_nodes[] = {
+	{ .node = 0, .map = "0"     },
+	{ .node = 1, .map = "1-2"   },
+	{ .node = 3, .map = "5-7,9" },
+};
+
+#define T TEST_ASSERT_VAL
+
+static unsigned long *get_bitmap(const char *str, int nbits)
+{
+	struct cpu_map *map = cpu_map__new(str);
+	unsigned long *bm = NULL;
+	int i;
+
+	bm = bitmap_alloc(nbits);
+
+	if (map && bm) {
+		bitmap_zero(bm, nbits);
+
+		for (i = 0; i < map->nr; i++) {
+			set_bit(map->map[i], bm);
+		}
+	}
+
+	if (map)
+		cpu_map__put(map);
+	else
+		free(bm);
+
+	return bm && map ? bm : NULL;
+}
+
+int test__mem2node(struct test *t __maybe_unused, int subtest __maybe_unused)
+{
+	struct mem2node map;
+	struct memory_node nodes[3];
+	struct perf_env env = {
+		.memory_nodes    = (struct memory_node *) &nodes[0],
+		.nr_memory_nodes = ARRAY_SIZE(nodes),
+		.memory_bsize    = 0x100,
+	};
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(nodes); i++) {
+		nodes[i].node = test_nodes[i].node;
+		nodes[i].size = 10;
+
+		T("failed: alloc bitmap",
+		  (nodes[i].set = get_bitmap(test_nodes[i].map, 10)));
+	}
+
+	T("failed: mem2node__init", !mem2node__init(&map, &env));
+	T("failed: mem2node__node",  0 == mem2node__node(&map,   0x50));
+	T("failed: mem2node__node",  1 == mem2node__node(&map,  0x100));
+	T("failed: mem2node__node",  1 == mem2node__node(&map,  0x250));
+	T("failed: mem2node__node",  3 == mem2node__node(&map,  0x500));
+	T("failed: mem2node__node",  3 == mem2node__node(&map,  0x650));
+	T("failed: mem2node__node", -1 == mem2node__node(&map,  0x450));
+	T("failed: mem2node__node", -1 == mem2node__node(&map, 0x1050));
+
+	for (i = 0; i < ARRAY_SIZE(nodes); i++)
+		free(nodes[i].set);
+
+	mem2node__exit(&map);
+	return 0;
+}
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 2862b80bc288..2e169819e647 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -102,6 +102,7 @@ int test__clang(struct test *test, int subtest);
 const char *test__clang_subtest_get_desc(int subtest);
 int test__clang_subtest_get_nr(void);
 int test__unit_number__scnprint(struct test *test, int subtest);
+int test__mem2node(struct test *t, int subtest);
 
 bool test__bp_signal_is_supported(void);
 
-- 
2.13.6


* [PATCH 4/9] perf c2c record: Record physical addresses in samples
  2018-03-09 10:14 [PATCHv2 0/9] perf tools: Assorted fixes Jiri Olsa
                   ` (2 preceding siblings ...)
  2018-03-09 10:14 ` [PATCH 3/9] perf tests: Add mem2node object test Jiri Olsa
@ 2018-03-09 10:14 ` Jiri Olsa
  2018-03-20  6:17   ` [tip:perf/core] " tip-bot for Jiri Olsa
  2018-03-09 10:14 ` [PATCH 5/9] perf c2c report: Make calc_width work with struct c2c_hist_entry Jiri Olsa
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Jiri Olsa @ 2018-03-09 10:14 UTC
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, David Ahern, Alexander Shishkin,
	Peter Zijlstra

We are going to display NUMA node information in the following
patches. For this we need physical address data in the
samples.

Add --phys-data as a default option for perf c2c record.

Link: http://lkml.kernel.org/n/tip-4d4lyozdbsknzqeny8kl1jg4@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/Documentation/perf-c2c.txt | 2 +-
 tools/perf/builtin-c2c.c              | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-c2c.txt b/tools/perf/Documentation/perf-c2c.txt
index 822414235170..095aebdc5bb7 100644
--- a/tools/perf/Documentation/perf-c2c.txt
+++ b/tools/perf/Documentation/perf-c2c.txt
@@ -116,7 +116,7 @@ and calls standard perf record command.
 Following perf record options are configured by default:
 (check perf record man page for details)
 
-  -W,-d,--sample-cpu
+  -W,-d,--phys-data,--sample-cpu
 
 Unless specified otherwise with '-e' option, following events are monitored by
 default:
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 98d243fa0c06..95765a1db903 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2704,7 +2704,7 @@ static int perf_c2c__record(int argc, const char **argv)
 	argc = parse_options(argc, argv, options, record_mem_usage,
 			     PARSE_OPT_KEEP_UNKNOWN);
 
-	rec_argc = argc + 10; /* max number of arguments */
+	rec_argc = argc + 11; /* max number of arguments */
 	rec_argv = calloc(rec_argc + 1, sizeof(char *));
 	if (!rec_argv)
 		return -1;
@@ -2720,6 +2720,7 @@ static int perf_c2c__record(int argc, const char **argv)
 		rec_argv[i++] = "-W";
 
 	rec_argv[i++] = "-d";
+	rec_argv[i++] = "--phys-data";
 	rec_argv[i++] = "--sample-cpu";
 
 	for (j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
-- 
2.13.6


* [PATCH 5/9] perf c2c report: Make calc_width work with struct c2c_hist_entry
  2018-03-09 10:14 [PATCHv2 0/9] perf tools: Assorted fixes Jiri Olsa
                   ` (3 preceding siblings ...)
  2018-03-09 10:14 ` [PATCH 4/9] perf c2c record: Record physical addresses in samples Jiri Olsa
@ 2018-03-09 10:14 ` Jiri Olsa
  2018-03-20  6:18   ` [tip:perf/core] " tip-bot for Jiri Olsa
  2018-03-09 10:14 ` [PATCH 6/9] perf c2c report: Call calc_width only for displayed entries Jiri Olsa
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Jiri Olsa @ 2018-03-09 10:14 UTC
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, David Ahern, Alexander Shishkin,
	Peter Zijlstra

We are going to calculate the column width based on
struct c2c_hist_entry data, so make calc_width() take
a struct c2c_hist_entry.

Link: http://lkml.kernel.org/n/tip-p775ak70wpca93dm9q5jowre@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/builtin-c2c.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 95765a1db903..43ce55550c9d 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1839,20 +1839,24 @@ static inline int valid_hitm_or_store(struct hist_entry *he)
 	return has_hitm || c2c_he->stats.store;
 }
 
-static void calc_width(struct hist_entry *he)
+static void calc_width(struct c2c_hist_entry *c2c_he)
 {
 	struct c2c_hists *c2c_hists;
 
-	c2c_hists = container_of(he->hists, struct c2c_hists, hists);
-	hists__calc_col_len(&c2c_hists->hists, he);
+	c2c_hists = container_of(c2c_he->he.hists, struct c2c_hists, hists);
+	hists__calc_col_len(&c2c_hists->hists, &c2c_he->he);
 }
 
 static int filter_cb(struct hist_entry *he)
 {
+	struct c2c_hist_entry *c2c_he;
+
+	c2c_he = container_of(he, struct c2c_hist_entry, he);
+
 	if (c2c.show_src && !he->srcline)
 		he->srcline = hist_entry__get_srcline(he);
 
-	calc_width(he);
+	calc_width(c2c_he);
 
 	if (!valid_hitm_or_store(he))
 		he->filtered = HIST_FILTER__C2C;
@@ -1869,7 +1873,7 @@ static int resort_cl_cb(struct hist_entry *he)
 	c2c_he = container_of(he, struct c2c_hist_entry, he);
 	c2c_hists = c2c_he->hists;
 
-	calc_width(he);
+	calc_width(c2c_he);
 
 	if (display && c2c_hists) {
 		static unsigned int idx;
-- 
2.13.6


* [PATCH 6/9] perf c2c report: Call calc_width only for displayed entries
  2018-03-09 10:14 [PATCHv2 0/9] perf tools: Assorted fixes Jiri Olsa
                   ` (4 preceding siblings ...)
  2018-03-09 10:14 ` [PATCH 5/9] perf c2c report: Make calc_width work with struct c2c_hist_entry Jiri Olsa
@ 2018-03-09 10:14 ` Jiri Olsa
  2018-03-20  6:18   ` [tip:perf/core] perf c2c report: Call calc_width() " tip-bot for Jiri Olsa
  2018-03-09 10:14 ` [PATCH 7/9] perf c2c report: Display node for cacheline address Jiri Olsa
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Jiri Olsa @ 2018-03-09 10:14 UTC
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, David Ahern, Alexander Shishkin,
	Peter Zijlstra

There's no need to calculate column width on entries
that are not going to be displayed.

Link: http://lkml.kernel.org/n/tip-l4k7dpiaj0b67t73nh8rszlf@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/builtin-c2c.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 43ce55550c9d..821112e8ba97 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1873,12 +1873,11 @@ static int resort_cl_cb(struct hist_entry *he)
 	c2c_he = container_of(he, struct c2c_hist_entry, he);
 	c2c_hists = c2c_he->hists;
 
-	calc_width(c2c_he);
-
 	if (display && c2c_hists) {
 		static unsigned int idx;
 
 		c2c_he->cacheline_idx = idx++;
+		calc_width(c2c_he);
 
 		c2c_hists__reinit(c2c_hists, c2c.cl_output, c2c.cl_resort);
 
-- 
2.13.6


* [PATCH 7/9] perf c2c report: Display node for cacheline address
  2018-03-09 10:14 [PATCHv2 0/9] perf tools: Assorted fixes Jiri Olsa
                   ` (5 preceding siblings ...)
  2018-03-09 10:14 ` [PATCH 6/9] perf c2c report: Call calc_width only for displayed entries Jiri Olsa
@ 2018-03-09 10:14 ` Jiri Olsa
  2018-03-20  6:19   ` [tip:perf/core] " tip-bot for Jiri Olsa
  2018-03-09 10:14 ` [PATCH 8/9] perf c2c report: Add span header over cacheline data Jiri Olsa
  2018-03-09 10:14 ` [PATCH 9/9] perf c2c report: Add cacheline address count column Jiri Olsa
  8 siblings, 1 reply; 22+ messages in thread
From: Jiri Olsa @ 2018-03-09 10:14 UTC
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, David Ahern, Alexander Shishkin,
	Peter Zijlstra

Add NUMA node info for the data cacheline. The new column
is added to both the Shared Data Cache Line Table and the
Shared Cache Line Distribution Pareto.

Note the new 'Node' column next to the 'Cacheline' column.

  $ perf c2c report --stdio
  =================================================
             Shared Data Cache Line Table
  =================================================
  #
  #                                    Total      Tot  ----- LLC Load Hitm -----
  # Index           Cacheline  Node  records     Hitm    Total      Lcl      Rmt
  # .....  ..................  ....  .......  .......  .......  .......  .......
  #
        0      0x7f0830100000     0       84   10.53%        8        8        0
        1  0xffff922a93154200     0        3    2.63%        2        2        0
        2  0xffff922a93154500     0        4    2.63%        2        2        0
  ...

Note the new 'Node' column next to the 'Offset' column.

  =================================================
        Shared Cache Line Distribution Pareto
  =================================================
  #
  #        ----- HITM -----  -- Store Refs --        Data address
  #   Num      Rmt      Lcl   L1 Hit  L1 Miss              Offset  Node      Pid
  # .....  .......  .......  .......  .......  ..................  ....  .......
  #
    -------------------------------------------------------------
        0        0        8       32        2      0x7f0830100000
    -------------------------------------------------------------
             0.00%   75.00%   21.88%    0.00%                0x18     0     1791
             0.00%   12.50%   37.50%    0.00%                0x18     0     1791
             0.00%    0.00%   34.38%    0.00%                0x18     0     1791

The mem2node object is used to get the NUMA node data.
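
A hedged sketch of how a per-entry node set could be rendered into
the 'Node' column text. The patch itself calls bitmap_scnprintf() on
c2c_he->nodeset and falls back to "N/A" when no node was recorded.
The helper below only approximates that behaviour for illustration:
the name nodeset_to_str(), the 64-bit mask representation and the
exact "0-1,3" range-list format are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format the nodes set in mask (bits 0..nr_nodes-1) as a range list,
 * e.g. "0-1,3", or "N/A" when no bit is set. Returns the string
 * length, or -1 when buf is too small.
 */
static int nodeset_to_str(uint64_t mask, int nr_nodes, char *buf, size_t size)
{
	static const char na[] = "N/A";
	size_t len = 0;
	int node = 0;

	assert(buf != NULL && size > 0 && nr_nodes >= 0 && nr_nodes <= 64);
	buf[0] = '\0';

	while (node < nr_nodes) {
		int first, last, ret;

		if (!((mask >> node) & 1)) {
			node++;
			continue;
		}

		/* Find the end of this run of consecutive nodes. */
		first = node;
		while (node + 1 < nr_nodes && ((mask >> (node + 1)) & 1))
			node++;
		last = node++;

		if (first == last)
			ret = snprintf(buf + len, size - len, "%s%d",
				       len ? "," : "", first);
		else
			ret = snprintf(buf + len, size - len, "%s%d-%d",
				       len ? "," : "", first, last);

		if (ret < 0 || (size_t)ret >= size - len)
			return -1;	/* output did not fit */
		len += (size_t)ret;
	}

	if (len == 0) {
		if (sizeof(na) > size)
			return -1;
		memcpy(buf, na, sizeof(na));	/* copies the trailing NUL too */
		len = sizeof(na) - 1;
	}

	return (int)len;
}
```

The returned length plays the role of the value that set_nodestr()
passes to set_node_width(), so the column can grow beyond its default
width of 4 when an entry spans several nodes.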

Link: http://lkml.kernel.org/n/tip-96n0f5vm714v3goct050xdmk@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/builtin-c2c.c | 119 +++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 114 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 821112e8ba97..45c047fdd7ac 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -32,6 +32,7 @@
 #include "evsel.h"
 #include "ui/browsers/hists.h"
 #include "thread.h"
+#include "mem2node.h"
 
 struct c2c_hists {
 	struct hists		hists;
@@ -49,6 +50,7 @@ struct c2c_hist_entry {
 	struct c2c_hists	*hists;
 	struct c2c_stats	 stats;
 	unsigned long		*cpuset;
+	unsigned long		*nodeset;
 	struct c2c_stats	*node_stats;
 	unsigned int		 cacheline_idx;
 
@@ -59,6 +61,11 @@ struct c2c_hist_entry {
 	 * because of its callchain dynamic entry
 	 */
 	struct hist_entry	he;
+
+	unsigned long		 paddr;
+	unsigned long		 paddr_cnt;
+	bool			 paddr_zero;
+	char			*nodestr;
 };
 
 static char const *coalesce_default = "pid,iaddr";
@@ -66,6 +73,7 @@ static char const *coalesce_default = "pid,iaddr";
 struct perf_c2c {
 	struct perf_tool	tool;
 	struct c2c_hists	hists;
+	struct mem2node		mem2node;
 
 	unsigned long		**nodes;
 	int			 nodes_cnt;
@@ -123,6 +131,10 @@ static void *c2c_he_zalloc(size_t size)
 	if (!c2c_he->cpuset)
 		return NULL;
 
+	c2c_he->nodeset = bitmap_alloc(c2c.nodes_cnt);
+	if (!c2c_he->nodeset)
+		return NULL;
+
 	c2c_he->node_stats = zalloc(c2c.nodes_cnt * sizeof(*c2c_he->node_stats));
 	if (!c2c_he->node_stats)
 		return NULL;
@@ -145,6 +157,8 @@ static void c2c_he_free(void *he)
 	}
 
 	free(c2c_he->cpuset);
+	free(c2c_he->nodeset);
+	free(c2c_he->nodestr);
 	free(c2c_he->node_stats);
 	free(c2c_he);
 }
@@ -194,6 +208,28 @@ static void c2c_he__set_cpu(struct c2c_hist_entry *c2c_he,
 	set_bit(sample->cpu, c2c_he->cpuset);
 }
 
+static void c2c_he__set_node(struct c2c_hist_entry *c2c_he,
+			     struct perf_sample *sample)
+{
+	int node;
+
+	if (!sample->phys_addr) {
+		c2c_he->paddr_zero = true;
+		return;
+	}
+
+	node = mem2node__node(&c2c.mem2node, sample->phys_addr);
+	if (WARN_ONCE(node < 0, "WARNING: failed to find node\n"))
+		return;
+
+	set_bit(node, c2c_he->nodeset);
+
+	if (c2c_he->paddr != sample->phys_addr) {
+		c2c_he->paddr_cnt++;
+		c2c_he->paddr = sample->phys_addr;
+	}
+}
+
 static void compute_stats(struct c2c_hist_entry *c2c_he,
 			  struct c2c_stats *stats,
 			  u64 weight)
@@ -257,6 +293,7 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
 	c2c_add_stats(&c2c_hists->stats, &stats);
 
 	c2c_he__set_cpu(c2c_he, sample);
+	c2c_he__set_node(c2c_he, sample);
 
 	hists__inc_nr_samples(&c2c_hists->hists, he->filtered);
 	ret = hist_entry__append_callchain(he, sample);
@@ -293,6 +330,7 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
 		compute_stats(c2c_he, &stats, sample->weight);
 
 		c2c_he__set_cpu(c2c_he, sample);
+		c2c_he__set_node(c2c_he, sample);
 
 		hists__inc_nr_samples(&c2c_hists->hists, he->filtered);
 		ret = hist_entry__append_callchain(he, sample);
@@ -455,6 +493,20 @@ static int dcacheline_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 	return scnprintf(hpp->buf, hpp->size, "%*s", width, HEX_STR(buf, addr));
 }
 
+static int
+dcacheline_node_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+		      struct hist_entry *he)
+{
+	struct c2c_hist_entry *c2c_he;
+	int width = c2c_width(fmt, hpp, he->hists);
+
+	c2c_he = container_of(he, struct c2c_hist_entry, he);
+	if (WARN_ON_ONCE(!c2c_he->nodestr))
+		return 0;
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", width, c2c_he->nodestr);
+}
+
 static int offset_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 			struct hist_entry *he)
 {
@@ -1207,6 +1259,14 @@ static struct c2c_dimension dim_dcacheline = {
 	.width		= 18,
 };
 
+static struct c2c_dimension dim_dcacheline_node = {
+	.header		= HEADER_LOW("Node"),
+	.name		= "dcacheline_node",
+	.cmp		= empty_cmp,
+	.entry		= dcacheline_node_entry,
+	.width		= 4,
+};
+
 static struct c2c_header header_offset_tui = HEADER_LOW("Off");
 
 static struct c2c_dimension dim_offset = {
@@ -1217,6 +1277,14 @@ static struct c2c_dimension dim_offset = {
 	.width		= 18,
 };
 
+static struct c2c_dimension dim_offset_node = {
+	.header		= HEADER_LOW("Node"),
+	.name		= "offset_node",
+	.cmp		= empty_cmp,
+	.entry		= dcacheline_node_entry,
+	.width		= 4,
+};
+
 static struct c2c_dimension dim_iaddr = {
 	.header		= HEADER_LOW("Code address"),
 	.name		= "iaddr",
@@ -1536,7 +1604,9 @@ static struct c2c_dimension dim_dcacheline_num_empty = {
 
 static struct c2c_dimension *dimensions[] = {
 	&dim_dcacheline,
+	&dim_dcacheline_node,
 	&dim_offset,
+	&dim_offset_node,
 	&dim_iaddr,
 	&dim_tot_hitm,
 	&dim_lcl_hitm,
@@ -1839,12 +1909,44 @@ static inline int valid_hitm_or_store(struct hist_entry *he)
 	return has_hitm || c2c_he->stats.store;
 }
 
+static void set_node_width(struct c2c_hist_entry *c2c_he, int len)
+{
+	struct c2c_dimension *dim;
+
+	dim = &c2c.hists == c2c_he->hists ?
+	      &dim_dcacheline_node : &dim_offset_node;
+
+	if (len > dim->width)
+		dim->width = len;
+}
+
+static int set_nodestr(struct c2c_hist_entry *c2c_he)
+{
+	char buf[30];
+	int len;
+
+	if (c2c_he->nodestr)
+		return 0;
+
+	if (bitmap_weight(c2c_he->nodeset, c2c.nodes_cnt)) {
+		len = bitmap_scnprintf(c2c_he->nodeset, c2c.nodes_cnt,
+				      buf, sizeof(buf));
+	} else {
+		len = scnprintf(buf, sizeof(buf), "N/A");
+	}
+
+	set_node_width(c2c_he, len);
+	c2c_he->nodestr = strdup(buf);
+	return c2c_he->nodestr ? 0 : -ENOMEM;
+}
+
 static void calc_width(struct c2c_hist_entry *c2c_he)
 {
 	struct c2c_hists *c2c_hists;
 
 	c2c_hists = container_of(c2c_he->he.hists, struct c2c_hists, hists);
 	hists__calc_col_len(&c2c_hists->hists, &c2c_he->he);
+	set_nodestr(c2c_he);
 }
 
 static int filter_cb(struct hist_entry *he)
@@ -2474,7 +2576,7 @@ static int build_cl_output(char *cl_sort, bool no_source)
 		"percent_lcl_hitm,"
 		"percent_stores_l1hit,"
 		"percent_stores_l1miss,"
-		"offset,",
+		"offset,offset_node,",
 		add_pid   ? "pid," : "",
 		add_tid   ? "tid," : "",
 		add_iaddr ? "iaddr," : "",
@@ -2603,17 +2705,21 @@ static int perf_c2c__report(int argc, const char **argv)
 		goto out;
 	}
 
-	err = setup_callchain(session->evlist);
+	err = mem2node__init(&c2c.mem2node, &session->header.env);
 	if (err)
 		goto out_session;
 
+	err = setup_callchain(session->evlist);
+	if (err)
+		goto out_mem2node;
+
 	if (symbol__init(&session->header.env) < 0)
-		goto out_session;
+		goto out_mem2node;
 
 	/* No pipe support at the moment. */
 	if (perf_data__is_pipe(session->data)) {
 		pr_debug("No pipe support at the moment.\n");
-		goto out_session;
+		goto out_mem2node;
 	}
 
 	if (c2c.use_stdio)
@@ -2626,12 +2732,13 @@ static int perf_c2c__report(int argc, const char **argv)
 	err = perf_session__process_events(session);
 	if (err) {
 		pr_err("failed to process sample\n");
-		goto out_session;
+		goto out_mem2node;
 	}
 
 	c2c_hists__reinit(&c2c.hists,
 			"cl_idx,"
 			"dcacheline,"
+			"dcacheline_node,"
 			"tot_recs,"
 			"percent_hitm,"
 			"tot_hitm,lcl_hitm,rmt_hitm,"
@@ -2657,6 +2764,8 @@ static int perf_c2c__report(int argc, const char **argv)
 
 	perf_c2c_display(session);
 
+out_mem2node:
+	mem2node__exit(&c2c.mem2node);
 out_session:
 	perf_session__delete(session);
 out:
-- 
2.13.6


* [PATCH 8/9] perf c2c report: Add span header over cacheline data
  2018-03-09 10:14 [PATCHv2 0/9] perf tools: Assorted fixes Jiri Olsa
                   ` (6 preceding siblings ...)
  2018-03-09 10:14 ` [PATCH 7/9] perf c2c report: Display node for cacheline address Jiri Olsa
@ 2018-03-09 10:14 ` Jiri Olsa
  2018-03-20  6:19   ` [tip:perf/core] " tip-bot for Jiri Olsa
  2018-03-09 10:14 ` [PATCH 9/9] perf c2c report: Add cacheline address count column Jiri Olsa
  8 siblings, 1 reply; 22+ messages in thread
From: Jiri Olsa @ 2018-03-09 10:14 UTC
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, David Ahern, Alexander Shishkin,
	Peter Zijlstra

Forcing the NUMA node output to be grouped with the Cacheline
column in both Shared Data Cache Line Table and Shared Cache
Line Distribution Pareto.

Before:
  #                                    Total      Tot  ----- LLC Load Hitm -----
  # Index           Cacheline  Node  records     Hitm    Total      Lcl      Rmt
  # .....  ..................  ....  .......  .......  .......  .......  .......
  #
        0      0x7f0830100000     0       84   10.53%        8        8        0
        1  0xffff922a93154200     0        3    2.63%        2        2        0
        2  0xffff922a93154500     0        4    2.63%        2        2        0

After:
  #        ------- Cacheline ------    Total      Tot  ----- LLC Load Hitm -----
  # Index             Address  Node  records     Hitm    Total      Lcl      Rmt
  # .....  ..................  ....  .......  .......  .......  .......  .......
  #
        0      0x7f0830100000     0       84   10.53%        8        8        0
        1  0xffff922a93154200     0        3    2.63%        2        2        0
        2  0xffff922a93154500     0        4    2.63%        2        2        0

Before:
  #        ----- HITM -----  -- Store Refs --        Data address
  #   Num      Rmt      Lcl   L1 Hit  L1 Miss              Offset  Node      Pid
  # .....  .......  .......  .......  .......  ..................  ....  .......
  #
    -------------------------------------------------------------
        0        0        8       32        2      0x7f0830100000
    -------------------------------------------------------------
             0.00%   75.00%   21.88%    0.00%                0x18     0     1791
             0.00%   12.50%   37.50%    0.00%                0x18     0     1791
             0.00%    0.00%   34.38%    0.00%                0x18     0     1791

After:
  #        ----- HITM -----  -- Store Refs --  ----- Data address -----
  #   Num      Rmt      Lcl   L1 Hit  L1 Miss              Offset  Node      Pid
  # .....  .......  .......  .......  .......  ..................  ....  .......
  #
    -------------------------------------------------------------
        0        0        8       32        2      0x7f0830100000
    -------------------------------------------------------------
             0.00%   75.00%   21.88%    0.00%                0x18     0     1791
             0.00%   12.50%   37.50%    0.00%                0x18     0     1791
             0.00%    0.00%   34.38%    0.00%                0x18     0     1791
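
The span headers above are a label centered in a dash-filled field
that covers the grouped columns. A minimal standalone sketch of that
idea follows; span_header is a hypothetical name, not the helper from
this patch, and it assumes the label plus one space on each side has
to fit into the field:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: center "label" in a
 * field of "len" characters, filled with '-' and with one space on
 * each side of the label. Returns a malloc'ed string, or NULL when
 * the label does not fit or the allocation fails.
 */
static char *span_header(const char *label, size_t len)
{
	size_t llen, left, pos;
	char *buf;

	assert(label != NULL);

	llen = strlen(label);
	if (llen + 2 > len)
		return NULL;

	buf = malloc(len + 1);
	if (!buf)
		return NULL;

	/* Start with a full line of dashes, then overwrite the middle. */
	memset(buf, '-', len);
	buf[len] = '\0';

	/* When the padding is odd, the extra dash goes on the left. */
	left = (len - (llen + 2) + 1) / 2;
	pos = left;

	buf[pos++] = ' ';
	memcpy(buf + pos, label, llen);
	pos += llen;
	buf[pos] = ' ';

	return buf;
}
```

With a field of 24 characters (18 for the address, 4 for the node and
2 for the separator) this gives the same "Cacheline" and
"Data address" headers as in the After output above. The caller owns
the returned buffer and has to free it.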

Link: http://lkml.kernel.org/n/tip-j8d4zhimuz1qh3obbaucc8eq@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/builtin-c2c.c | 63 ++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 58 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 45c047fdd7ac..a6336e4e2850 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1252,7 +1252,7 @@ cl_idx_empty_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 	}
 
 static struct c2c_dimension dim_dcacheline = {
-	.header		= HEADER_LOW("Cacheline"),
+	.header		= HEADER_SPAN("--- Cacheline ----", "Address", 1),
 	.name		= "dcacheline",
 	.cmp		= dcacheline_cmp,
 	.entry		= dcacheline_entry,
@@ -1267,10 +1267,10 @@ static struct c2c_dimension dim_dcacheline_node = {
 	.width		= 4,
 };
 
-static struct c2c_header header_offset_tui = HEADER_LOW("Off");
+static struct c2c_header header_offset_tui = HEADER_SPAN("-----", "Off", 1);
 
 static struct c2c_dimension dim_offset = {
-	.header		= HEADER_BOTH("Data address", "Offset"),
+	.header		= HEADER_SPAN("--- Data address -", "Offset", 1),
 	.name		= "offset",
 	.cmp		= offset_cmp,
 	.entry		= offset_entry,
@@ -2453,14 +2453,64 @@ static void perf_c2c_display(struct perf_session *session)
 }
 #endif /* HAVE_SLANG_SUPPORT */
 
-static void ui_quirks(void)
+static char *fill_line(const char *orig, int len)
 {
+	int i, j, olen = strlen(orig);
+	char *buf;
+
+	buf = zalloc(len + 1);
+	if (!buf)
+		return NULL;
+
+	j = len / 2 - olen / 2;
+
+	for (i = 0; i < j - 1; i++)
+		buf[i] = '-';
+
+	buf[i++] = ' ';
+
+	strcpy(buf + i, orig);
+
+	i += olen;
+
+	buf[i++] = ' ';
+
+	for (; i < len; i++)
+		buf[i] = '-';
+
+	return buf;
+}
+
+static int ui_quirks(void)
+{
+	const char *nodestr = "Data address";
+	char *buf;
+
 	if (!c2c.use_stdio) {
 		dim_offset.width  = 5;
 		dim_offset.header = header_offset_tui;
+		nodestr = "CL";
 	}
 
 	dim_percent_hitm.header = percent_hitm_header[c2c.display];
+
+	/* Fix the zero line for dcacheline column. */
+	buf = fill_line("Cacheline", dim_dcacheline.width +
+				     dim_dcacheline_node.width + 2);
+	if (!buf)
+		return -ENOMEM;
+
+	dim_dcacheline.header.line[0].text = buf;
+
+	/* Fix the zero line for offset column. */
+	buf = fill_line(nodestr, dim_offset.width +
+			      dim_offset_node.width + 2);
+	if (!buf)
+		return -ENOMEM;
+
+	dim_offset.header.line[0].text = buf;
+
+	return 0;
 }
 
 #define CALLCHAIN_DEFAULT_OPT  "graph,0.5,caller,function,percent"
@@ -2760,7 +2810,10 @@ static int perf_c2c__report(int argc, const char **argv)
 
 	ui_progress__finish();
 
-	ui_quirks();
+	if (ui_quirks()) {
+		pr_err("failed to setup UI\n");
+		goto out_mem2node;
+	}
 
 	perf_c2c_display(session);
 
-- 
2.13.6


* [PATCH 9/9] perf c2c report: Add cacheline address count column
  2018-03-09 10:14 [PATCHv2 0/9] perf tools: Assorted fixes Jiri Olsa
                   ` (7 preceding siblings ...)
  2018-03-09 10:14 ` [PATCH 8/9] perf c2c report: Add span header over cacheline data Jiri Olsa
@ 2018-03-09 10:14 ` Jiri Olsa
  2018-03-09 14:56   ` Arnaldo Carvalho de Melo
  2018-03-20  6:20   ` [tip:perf/core] " tip-bot for Jiri Olsa
  8 siblings, 2 replies; 22+ messages in thread
From: Jiri Olsa @ 2018-03-09 10:14 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, David Ahern, Alexander Shishkin,
	Peter Zijlstra

Adding the 'PA cnt' column grouped under data cacheline address.

It shows how many times the physical address changed for the
hist entry. It does not show the number of distinct physical
addresses for the entry, because we don't store those. We only track
the number of times we got a different address than the one we
currently hold, which is not expensive and gives similar info.

  $ perf c2c report --stdio

  #        ----------- Cacheline ----------    Total      Tot  ----- LLC Load Hitm -----
  # Index             Address  Node  PA cnt  records     Hitm    Total      Lcl      Rmt
  # .....  ..................  ....  ......  .......  .......  .......  .......  .......
  #
        0  0xffff9ad56dca0a80     0       9       10    7.69%        2        2        0
        1  0xffff9ad56dce0a80     0       9        9    7.69%        2        2        0
        2  0xffff9ad37659ad80     0       1        2    3.85%        1        1        0

  ...

  #        ----- HITM -----  -- Store Refs --  --------- Data address ---------
  #   Num      Rmt      Lcl   L1 Hit  L1 Miss              Offset  Node  PA cnt      Pid
  # .....  .......  .......  .......  .......  ..................  ....  ......  .......
  #
    -------------------------------------------------------------
        0        0        2        3        0  0xffff9ad56dca0a80
    -------------------------------------------------------------
             0.00%    0.00%   33.33%    0.00%                 0x0     0       1     2510
             0.00%    0.00%   33.33%    0.00%                 0x4     0       1     2476
             0.00%    0.00%   33.33%    0.00%                0x20     0       1        0
             0.00%  100.00%    0.00%    0.00%                0x38     0       1        0

Link: http://lkml.kernel.org/n/tip-j8d4zhimuz1qh3obbaucc8eq@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/builtin-c2c.c | 35 +++++++++++++++++++++++++++++------
 1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index a6336e4e2850..2126bfbcb385 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -507,6 +507,17 @@ dcacheline_node_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 	return scnprintf(hpp->buf, hpp->size, "%*s", width, c2c_he->nodestr);
 }
 
+static int
+dcacheline_node_count(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+		      struct hist_entry *he)
+{
+	struct c2c_hist_entry *c2c_he;
+	int width = c2c_width(fmt, hpp, he->hists);
+
+	c2c_he = container_of(he, struct c2c_hist_entry, he);
+	return scnprintf(hpp->buf, hpp->size, "%*lu", width, c2c_he->paddr_cnt);
+}
+
 static int offset_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 			struct hist_entry *he)
 {
@@ -1252,7 +1263,7 @@ cl_idx_empty_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 	}
 
 static struct c2c_dimension dim_dcacheline = {
-	.header		= HEADER_SPAN("--- Cacheline ----", "Address", 1),
+	.header		= HEADER_SPAN("--- Cacheline ----", "Address", 2),
 	.name		= "dcacheline",
 	.cmp		= dcacheline_cmp,
 	.entry		= dcacheline_entry,
@@ -1267,10 +1278,18 @@ static struct c2c_dimension dim_dcacheline_node = {
 	.width		= 4,
 };
 
-static struct c2c_header header_offset_tui = HEADER_SPAN("-----", "Off", 1);
+static struct c2c_dimension dim_dcacheline_count = {
+	.header		= HEADER_LOW("PA cnt"),
+	.name		= "dcacheline_count",
+	.cmp		= empty_cmp,
+	.entry		= dcacheline_node_count,
+	.width		= 6,
+};
+
+static struct c2c_header header_offset_tui = HEADER_SPAN("-----", "Off", 2);
 
 static struct c2c_dimension dim_offset = {
-	.header		= HEADER_SPAN("--- Data address -", "Offset", 1),
+	.header		= HEADER_SPAN("--- Data address -", "Offset", 2),
 	.name		= "offset",
 	.cmp		= offset_cmp,
 	.entry		= offset_entry,
@@ -1605,6 +1624,7 @@ static struct c2c_dimension dim_dcacheline_num_empty = {
 static struct c2c_dimension *dimensions[] = {
 	&dim_dcacheline,
 	&dim_dcacheline_node,
+	&dim_dcacheline_count,
 	&dim_offset,
 	&dim_offset_node,
 	&dim_iaddr,
@@ -2496,7 +2516,8 @@ static int ui_quirks(void)
 
 	/* Fix the zero line for dcacheline column. */
 	buf = fill_line("Cacheline", dim_dcacheline.width +
-				     dim_dcacheline_node.width + 2);
+				     dim_dcacheline_node.width +
+				     dim_dcacheline_count.width + 4);
 	if (!buf)
 		return -ENOMEM;
 
@@ -2504,7 +2525,8 @@ static int ui_quirks(void)
 
 	/* Fix the zero line for offset column. */
 	buf = fill_line(nodestr, dim_offset.width +
-			      dim_offset_node.width + 2);
+			         dim_offset_node.width +
+				 dim_dcacheline_count.width + 4);
 	if (!buf)
 		return -ENOMEM;
 
@@ -2626,7 +2648,7 @@ static int build_cl_output(char *cl_sort, bool no_source)
 		"percent_lcl_hitm,"
 		"percent_stores_l1hit,"
 		"percent_stores_l1miss,"
-		"offset,offset_node,",
+		"offset,offset_node,dcacheline_count,",
 		add_pid   ? "pid," : "",
 		add_tid   ? "tid," : "",
 		add_iaddr ? "iaddr," : "",
@@ -2789,6 +2811,7 @@ static int perf_c2c__report(int argc, const char **argv)
 			"cl_idx,"
 			"dcacheline,"
 			"dcacheline_node,"
+			"dcacheline_count,"
 			"tot_recs,"
 			"percent_hitm,"
 			"tot_hitm,lcl_hitm,rmt_hitm,"
-- 
2.13.6


* Re: [PATCH 9/9] perf c2c report: Add cacheline address count column
  2018-03-09 10:14 ` [PATCH 9/9] perf c2c report: Add cacheline address count column Jiri Olsa
@ 2018-03-09 14:56   ` Arnaldo Carvalho de Melo
  2018-03-09 16:28     ` Jiri Olsa
  2018-03-20  6:20   ` [tip:perf/core] " tip-bot for Jiri Olsa
  1 sibling, 1 reply; 22+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-03-09 14:56 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: lkml, Ingo Molnar, Namhyung Kim, David Ahern, Alexander Shishkin,
	Peter Zijlstra

Em Fri, Mar 09, 2018 at 11:14:42AM +0100, Jiri Olsa escreveu:
> Adding the 'PA cnt' column grouped under data cacheline address.
> 
> It shows how many times the physical addresses changed for the
> hist entry. It does not show the number of different physical
> addresses for entry, because we don't store those. We only track
> the number of times we got different address than we currently
> hold, which is not expensive and gives similar info.
> 
>   $ perf c2c report --stdio
> 
>   #        ----------- Cacheline ----------    Total      Tot  ----- LLC Load Hitm -----
>   # Index             Address  Node  PA cnt  records     Hitm    Total      Lcl      Rmt
>   # .....  ..................  ....  ......  .......  .......  .......  .......  .......
>   #


I'm adding this to the docs, ack?

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index cba16d8a970e..f4a280428e28 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -127,7 +127,7 @@ OPTIONS
 
 	If the --mem-mode option is used, the following sort keys are also available
 	(incompatible with --branch-stack):
-	symbol_daddr, dso_daddr, locked, tlb, mem, snoop, dcacheline.
+	symbol_daddr, dso_daddr, locked, tlb, mem, snoop, dcacheline, dcacheline_count.
 
 	- symbol_daddr: name of data symbol being executed on at the time of sample
 	- dso_daddr: name of library or module containing the data being executed
@@ -137,6 +137,7 @@ OPTIONS
 	- mem: type of memory access for the data at the time of the sample
 	- snoop: type of snoop (if any) for the data at the time of the sample
 	- dcacheline: the cacheline the data address is on at the time of the sample
+	- dcacheline_count: the number of physical addresses sampled for this dcacheline
 	- phys_daddr: physical address of data being executed on at the time of sample
 
 	And the default sort keys are changed to local_weight, mem, sym, dso,


* Re: [PATCH 9/9] perf c2c report: Add cacheline address count column
  2018-03-09 14:56   ` Arnaldo Carvalho de Melo
@ 2018-03-09 16:28     ` Jiri Olsa
  2018-03-09 17:34       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 22+ messages in thread
From: Jiri Olsa @ 2018-03-09 16:28 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, lkml, Ingo Molnar, Namhyung Kim, David Ahern,
	Alexander Shishkin, Peter Zijlstra

On Fri, Mar 09, 2018 at 11:56:43AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Fri, Mar 09, 2018 at 11:14:42AM +0100, Jiri Olsa escreveu:
> > Adding the 'PA cnt' column grouped under data cacheline address.
> > 
> > It shows how many times the physical addresses changed for the
> > hist entry. It does not show the number of different physical
> > addresses for entry, because we don't store those. We only track
> > the number of times we got different address than we currently
> > hold, which is not expensive and gives similar info.
> > 
> >   $ perf c2c report --stdio
> > 
> >   #        ----------- Cacheline ----------    Total      Tot  ----- LLC Load Hitm -----
> >   # Index             Address  Node  PA cnt  records     Hitm    Total      Lcl      Rmt
> >   # .....  ..................  ....  ......  .......  .......  .......  .......  .......
> >   #
> 
> 
> I'm adding this to the docs, ack?

nope.. those fields are c2c report only.. hardcoded

I'm preparing a change to share c2c fields with the
general report.. should come with other perf mem
changes later

jirka

> 
> diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
> index cba16d8a970e..f4a280428e28 100644
> --- a/tools/perf/Documentation/perf-report.txt
> +++ b/tools/perf/Documentation/perf-report.txt
> @@ -127,7 +127,7 @@ OPTIONS
>  
>  	If the --mem-mode option is used, the following sort keys are also available
>  	(incompatible with --branch-stack):
> -	symbol_daddr, dso_daddr, locked, tlb, mem, snoop, dcacheline.
> +	symbol_daddr, dso_daddr, locked, tlb, mem, snoop, dcacheline, dcacheline_count.
>  
>  	- symbol_daddr: name of data symbol being executed on at the time of sample
>  	- dso_daddr: name of library or module containing the data being executed
> @@ -137,6 +137,7 @@ OPTIONS
>  	- mem: type of memory access for the data at the time of the sample
>  	- snoop: type of snoop (if any) for the data at the time of the sample
>  	- dcacheline: the cacheline the data address is on at the time of the sample
> +	- dcacheline_count: the number of physical addresses sampled for this dcacheline
>  	- phys_daddr: physical address of data being executed on at the time of sample
>  
>  	And the default sort keys are changed to local_weight, mem, sym, dso,


* Re: [PATCH 9/9] perf c2c report: Add cacheline address count column
  2018-03-09 16:28     ` Jiri Olsa
@ 2018-03-09 17:34       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 22+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-03-09 17:34 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, lkml, Ingo Molnar, Namhyung Kim, David Ahern,
	Alexander Shishkin, Peter Zijlstra

Em Fri, Mar 09, 2018 at 05:28:28PM +0100, Jiri Olsa escreveu:
> On Fri, Mar 09, 2018 at 11:56:43AM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Fri, Mar 09, 2018 at 11:14:42AM +0100, Jiri Olsa escreveu:
> > > Adding the 'PA cnt' column grouped under data cacheline address.
> > > 
> > > It shows how many times the physical addresses changed for the
> > > hist entry. It does not show the number of different physical
> > > addresses for entry, because we don't store those. We only track
> > > the number of times we got different address than we currently
> > > hold, which is not expensive and gives similar info.
> > > 
> > >   $ perf c2c report --stdio
> > > 
> > >   #        ----------- Cacheline ----------    Total      Tot  ----- LLC Load Hitm -----
> > >   # Index             Address  Node  PA cnt  records     Hitm    Total      Lcl      Rmt
> > >   # .....  ..................  ....  ......  .......  .......  .......  .......  .......
> > >   #
> > 
> > 
> > I'm adding this to the docs, ack?
> 
> nope.. those fields are c2c report only.. hardcoded
> 
> I'm preparing a change to share c2c fields with the
> general report.. should come with other perf mem
> changes later

Ok, dropping that then
 
> jirka
> 
> > 
> > diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
> > index cba16d8a970e..f4a280428e28 100644
> > --- a/tools/perf/Documentation/perf-report.txt
> > +++ b/tools/perf/Documentation/perf-report.txt
> > @@ -127,7 +127,7 @@ OPTIONS
> >  
> >  	If the --mem-mode option is used, the following sort keys are also available
> >  	(incompatible with --branch-stack):
> > -	symbol_daddr, dso_daddr, locked, tlb, mem, snoop, dcacheline.
> > +	symbol_daddr, dso_daddr, locked, tlb, mem, snoop, dcacheline, dcacheline_count.
> >  
> >  	- symbol_daddr: name of data symbol being executed on at the time of sample
> >  	- dso_daddr: name of library or module containing the data being executed
> > @@ -137,6 +137,7 @@ OPTIONS
> >  	- mem: type of memory access for the data at the time of the sample
> >  	- snoop: type of snoop (if any) for the data at the time of the sample
> >  	- dcacheline: the cacheline the data address is on at the time of the sample
> > +	- dcacheline_count: the number of physical addresses sampled for this dcacheline
> >  	- phys_daddr: physical address of data being executed on at the time of sample
> >  
> >  	And the default sort keys are changed to local_weight, mem, sym, dso,


* [tip:perf/core] perf env: Free memory nodes data
  2018-03-09 10:14 ` [PATCH 1/9] perf tools: Free memory nodes data Jiri Olsa
@ 2018-03-20  6:16   ` tip-bot for Jiri Olsa
  0 siblings, 0 replies; 22+ messages in thread
From: tip-bot for Jiri Olsa @ 2018-03-20  6:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: namhyung, alexander.shishkin, tglx, mingo, linux-kernel, dsahern,
	hpa, jolsa, peterz, acme

Commit-ID:  e725920cdb1c79fdc71f2f164f59be8c411cad68
Gitweb:     https://git.kernel.org/tip/e725920cdb1c79fdc71f2f164f59be8c411cad68
Author:     Jiri Olsa <jolsa@kernel.org>
AuthorDate: Fri, 9 Mar 2018 11:14:34 +0100
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 16 Mar 2018 13:52:09 -0300

perf env: Free memory nodes data

The env's memory nodes were never freed, adding the needed code to
perf_env__exit.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180309101442.9224-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/env.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 6d311868d850..4c842762e3f2 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -32,6 +32,10 @@ void perf_env__exit(struct perf_env *env)
 	for (i = 0; i < env->caches_cnt; i++)
 		cpu_cache_level__free(&env->caches[i]);
 	zfree(&env->caches);
+
+	for (i = 0; i < env->nr_memory_nodes; i++)
+		free(env->memory_nodes[i].set);
+	zfree(&env->memory_nodes);
 }
 
 int perf_env__set_cmdline(struct perf_env *env, int argc, const char *argv[])


* [tip:perf/core] perf tools: Add mem2node object
  2018-03-09 10:14 ` [PATCH 2/9] perf tools: Add mem2node object Jiri Olsa
@ 2018-03-20  6:16   ` tip-bot for Jiri Olsa
  0 siblings, 0 replies; 22+ messages in thread
From: tip-bot for Jiri Olsa @ 2018-03-20  6:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: dsahern, peterz, mingo, alexander.shishkin, acme, hpa, jolsa,
	namhyung, linux-kernel, tglx

Commit-ID:  4acf6142de3fbc4fc9cc8da0a1aec073f05b724f
Gitweb:     https://git.kernel.org/tip/4acf6142de3fbc4fc9cc8da0a1aec073f05b724f
Author:     Jiri Olsa <jolsa@kernel.org>
AuthorDate: Fri, 9 Mar 2018 11:14:35 +0100
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 16 Mar 2018 13:52:37 -0300

perf tools: Add mem2node object

Adding the mem2node object to allow easy lookup of the NUMA node for
a physical address.

It has the following interface:

  int  mem2node__init(struct mem2node *map, struct perf_env *env);
  void mem2node__exit(struct mem2node *map);
  int  mem2node__node(struct mem2node *map, u64 addr);

The mem2node__init function initializes the object from the perf data
file MEM_TOPOLOGY feature data. Subsequent calls to mem2node__node
return the node number for the given physical address, or -1 if no
node covers it. The mem2node__exit function frees the object.
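
The lookup semantics can be illustrated without the rbtree. The sketch
below swaps in a sorted array with binary search, which is not what
the patch uses; struct range and range_lookup are hypothetical names.
It assumes non-overlapping ranges sorted by start, with start
inclusive and end exclusive:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical range record: [start, end) belongs to "node". */
struct range {
	uint64_t	start;	/* inclusive */
	uint64_t	end;	/* exclusive */
	int		node;
};

/*
 * Binary search over ranges sorted by start address.
 * Returns the node covering addr, or -1 when no range covers it.
 */
static int range_lookup(const struct range *r, size_t nr, uint64_t addr)
{
	size_t lo = 0, hi = nr;

	assert(r != NULL || nr == 0);

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (addr < r[mid].start)
			hi = mid;
		else if (addr >= r[mid].end)
			lo = mid + 1;
		else
			return r[mid].node;
	}

	return -1;
}
```

An address that falls into a hole between two ranges, or past the last
range, gets -1, matching the return value of mem2node__node for an
unknown address.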

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180309101442.9224-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/Build      |   1 +
 tools/perf/util/mem2node.c | 134 +++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/mem2node.h |  19 +++++++
 3 files changed, 154 insertions(+)

diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index ea0a452550b0..8052373bcd6a 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -106,6 +106,7 @@ libperf-y += units.o
 libperf-y += time-utils.o
 libperf-y += expr-bison.o
 libperf-y += branch.o
+libperf-y += mem2node.o
 
 libperf-$(CONFIG_LIBBPF) += bpf-loader.o
 libperf-$(CONFIG_BPF_PROLOGUE) += bpf-prologue.o
diff --git a/tools/perf/util/mem2node.c b/tools/perf/util/mem2node.c
new file mode 100644
index 000000000000..c6fd81c02586
--- /dev/null
+++ b/tools/perf/util/mem2node.c
@@ -0,0 +1,134 @@
+#include <errno.h>
+#include <inttypes.h>
+#include <linux/bitmap.h>
+#include "mem2node.h"
+#include "util.h"
+
+struct phys_entry {
+	struct rb_node	rb_node;
+	u64	start;
+	u64	end;
+	u64	node;
+};
+
+static void phys_entry__insert(struct phys_entry *entry, struct rb_root *root)
+{
+	struct rb_node **p = &root->rb_node;
+	struct rb_node *parent = NULL;
+	struct phys_entry *e;
+
+	while (*p != NULL) {
+		parent = *p;
+		e = rb_entry(parent, struct phys_entry, rb_node);
+
+		if (entry->start < e->start)
+			p = &(*p)->rb_left;
+		else
+			p = &(*p)->rb_right;
+	}
+
+	rb_link_node(&entry->rb_node, parent, p);
+	rb_insert_color(&entry->rb_node, root);
+}
+
+static void
+phys_entry__init(struct phys_entry *entry, u64 start, u64 bsize, u64 node)
+{
+	entry->start = start;
+	entry->end   = start + bsize;
+	entry->node  = node;
+	RB_CLEAR_NODE(&entry->rb_node);
+}
+
+int mem2node__init(struct mem2node *map, struct perf_env *env)
+{
+	struct memory_node *n, *nodes = &env->memory_nodes[0];
+	struct phys_entry *entries, *tmp_entries;
+	u64 bsize = env->memory_bsize;
+	int i, j = 0, max = 0;
+
+	memset(map, 0x0, sizeof(*map));
+	map->root = RB_ROOT;
+
+	for (i = 0; i < env->nr_memory_nodes; i++) {
+		n = &nodes[i];
+		max += bitmap_weight(n->set, n->size);
+	}
+
+	entries = zalloc(sizeof(*entries) * max);
+	if (!entries)
+		return -ENOMEM;
+
+	for (i = 0; i < env->nr_memory_nodes; i++) {
+		u64 bit;
+
+		n = &nodes[i];
+
+		for (bit = 0; bit < n->size; bit++) {
+			u64 start;
+
+			if (!test_bit(bit, n->set))
+				continue;
+
+			start = bit * bsize;
+
+			/*
+			 * Merge nearby areas, we walk in order
+			 * through the bitmap, so no need to sort.
+			 */
+			if (j > 0) {
+				struct phys_entry *prev = &entries[j - 1];
+
+				if ((prev->end == start) &&
+				    (prev->node == n->node)) {
+					prev->end += bsize;
+					continue;
+				}
+			}
+
+			phys_entry__init(&entries[j++], start, bsize, n->node);
+		}
+	}
+
+	/* Cut unused entries, due to merging. */
+	tmp_entries = realloc(entries, sizeof(*entries) * j);
+	if (tmp_entries)
+		entries = tmp_entries;
+
+	for (i = 0; i < j; i++) {
+		pr_debug("mem2node %03" PRIu64 " [0x%016" PRIx64 "-0x%016" PRIx64 "]\n",
+			 entries[i].node, entries[i].start, entries[i].end);
+
+		phys_entry__insert(&entries[i], &map->root);
+	}
+
+	map->entries = entries;
+	return 0;
+}
+
+void mem2node__exit(struct mem2node *map)
+{
+	zfree(&map->entries);
+}
+
+int mem2node__node(struct mem2node *map, u64 addr)
+{
+	struct rb_node **p, *parent = NULL;
+	struct phys_entry *entry;
+
+	p = &map->root.rb_node;
+	while (*p != NULL) {
+		parent = *p;
+		entry = rb_entry(parent, struct phys_entry, rb_node);
+		if (addr < entry->start)
+			p = &(*p)->rb_left;
+		else if (addr >= entry->end)
+			p = &(*p)->rb_right;
+		else
+			goto out;
+	}
+
+	entry = NULL;
+out:
+	return entry ? (int) entry->node : -1;
+}
diff --git a/tools/perf/util/mem2node.h b/tools/perf/util/mem2node.h
new file mode 100644
index 000000000000..59c4752a2181
--- /dev/null
+++ b/tools/perf/util/mem2node.h
@@ -0,0 +1,19 @@
+#ifndef __MEM2NODE_H
+#define __MEM2NODE_H
+
+#include <linux/rbtree.h>
+#include "env.h"
+
+struct phys_entry;
+
+struct mem2node {
+	struct rb_root		 root;
+	struct phys_entry	*entries;
+	int			 cnt;
+};
+
+int  mem2node__init(struct mem2node *map, struct perf_env *env);
+void mem2node__exit(struct mem2node *map);
+int  mem2node__node(struct mem2node *map, u64 addr);
+
+#endif /* __MEM2NODE_H */


* [tip:perf/core] perf tests: Add mem2node object test
  2018-03-09 10:14 ` [PATCH 3/9] perf tests: Add mem2node object test Jiri Olsa
@ 2018-03-20  6:17   ` tip-bot for Jiri Olsa
  0 siblings, 0 replies; 22+ messages in thread
From: tip-bot for Jiri Olsa @ 2018-03-20  6:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, linux-kernel, alexander.shishkin, peterz, dsahern, tglx,
	acme, namhyung, jolsa, mingo

Commit-ID:  8185850ad603acfc66f5b3d284955809dffa5d2c
Gitweb:     https://git.kernel.org/tip/8185850ad603acfc66f5b3d284955809dffa5d2c
Author:     Jiri Olsa <jolsa@kernel.org>
AuthorDate: Fri, 9 Mar 2018 11:14:36 +0100
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 16 Mar 2018 13:52:48 -0300

perf tests: Add mem2node object test

Adding mem2node object automated test.

The test prepares a few artificial nodes (memory maps) and verifies
that the mem2node object returns the proper node values for the given
addresses.
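
To see which addresses the artificial maps cover, here is a rough
sketch of turning a node's block map into address ranges, merging
adjacent blocks. It is an illustration under assumptions: it uses a
plain 64-bit mask instead of the bitmap helpers, and struct
block_range and add_node_blocks are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: blocks [start, end) belong to "node". */
struct block_range {
	uint64_t	start;
	uint64_t	end;
	int		node;
};

/*
 * Append the blocks set in "mask" to out[], where bit N stands for
 * the block starting at N * bsize. Adjacent blocks of the same node
 * are merged into one range. Returns the new number of ranges.
 */
static size_t add_node_blocks(struct block_range *out, size_t nr, size_t cap,
			      uint64_t mask, unsigned int nbits,
			      uint64_t bsize, int node)
{
	unsigned int bit;

	assert(nbits <= 64);

	for (bit = 0; bit < nbits; bit++) {
		uint64_t start = (uint64_t)bit * bsize;

		if (!(mask & (UINT64_C(1) << bit)))
			continue;

		/* Bits are walked in order, only the last range can touch. */
		if (nr > 0 && out[nr - 1].end == start &&
		    out[nr - 1].node == node) {
			out[nr - 1].end += bsize;
			continue;
		}

		assert(nr < cap);
		out[nr].start = start;
		out[nr].end   = start + bsize;
		out[nr].node  = node;
		nr++;
	}

	return nr;
}
```

With a block size of 0x100, the maps "0", "1-2" and "5-7,9" (masks
0x001, 0x006 and 0x2e0) give four ranges: 0x0-0x100 for node 0,
0x100-0x300 for node 1, and 0x500-0x800 plus 0x900-0xa00 for node 3.
That is why 0x450 and 0x1050 are expected to resolve to -1.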

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180309101442.9224-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/tests/Build          |  1 +
 tools/perf/tests/builtin-test.c |  4 +++
 tools/perf/tests/mem2node.c     | 75 +++++++++++++++++++++++++++++++++++++++++
 tools/perf/tests/tests.h        |  1 +
 4 files changed, 81 insertions(+)

diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index 62ca0174d5e1..6c108fa79ae3 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -48,6 +48,7 @@ perf-y += bitmap.o
 perf-y += perf-hooks.o
 perf-y += clang.o
 perf-y += unit_number__scnprintf.o
+perf-y += mem2node.o
 
 $(OUTPUT)tests/llvm-src-base.c: tests/bpf-script-example.c tests/Build
 	$(call rule_mkdir)
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 38bf109ce106..625f5a6772af 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -274,6 +274,10 @@ static struct test generic_tests[] = {
 		.desc = "unit_number__scnprintf",
 		.func = test__unit_number__scnprint,
 	},
+	{
+		.desc = "mem2node",
+		.func = test__mem2node,
+	},
 	{
 		.func = NULL,
 	},
diff --git a/tools/perf/tests/mem2node.c b/tools/perf/tests/mem2node.c
new file mode 100644
index 000000000000..0c3c87f86e03
--- /dev/null
+++ b/tools/perf/tests/mem2node.c
@@ -0,0 +1,75 @@
+#include <linux/compiler.h>
+#include <linux/bitmap.h>
+#include "cpumap.h"
+#include "mem2node.h"
+#include "tests.h"
+
+static struct node {
+	int		 node;
+	const char 	*map;
+} test_nodes[] = {
+	{ .node = 0, .map = "0"     },
+	{ .node = 1, .map = "1-2"   },
+	{ .node = 3, .map = "5-7,9" },
+};
+
+#define T TEST_ASSERT_VAL
+
+static unsigned long *get_bitmap(const char *str, int nbits)
+{
+	struct cpu_map *map = cpu_map__new(str);
+	unsigned long *bm = NULL;
+	int i;
+
+	bm = bitmap_alloc(nbits);
+
+	if (map && bm) {
+		bitmap_zero(bm, nbits);
+
+		for (i = 0; i < map->nr; i++) {
+			set_bit(map->map[i], bm);
+		}
+	}
+
+	if (map)
+		cpu_map__put(map);
+	else
+		free(bm);
+
+	return bm && map ? bm : NULL;
+}
+
+int test__mem2node(struct test *t __maybe_unused, int subtest __maybe_unused)
+{
+	struct mem2node map;
+	struct memory_node nodes[3];
+	struct perf_env env = {
+		.memory_nodes    = (struct memory_node *) &nodes[0],
+		.nr_memory_nodes = ARRAY_SIZE(nodes),
+		.memory_bsize    = 0x100,
+	};
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(nodes); i++) {
+		nodes[i].node = test_nodes[i].node;
+		nodes[i].size = 10;
+
+		T("failed: alloc bitmap",
+		  (nodes[i].set = get_bitmap(test_nodes[i].map, 10)));
+	}
+
+	T("failed: mem2node__init", !mem2node__init(&map, &env));
+	T("failed: mem2node__node",  0 == mem2node__node(&map,   0x50));
+	T("failed: mem2node__node",  1 == mem2node__node(&map,  0x100));
+	T("failed: mem2node__node",  1 == mem2node__node(&map,  0x250));
+	T("failed: mem2node__node",  3 == mem2node__node(&map,  0x500));
+	T("failed: mem2node__node",  3 == mem2node__node(&map,  0x650));
+	T("failed: mem2node__node", -1 == mem2node__node(&map,  0x450));
+	T("failed: mem2node__node", -1 == mem2node__node(&map, 0x1050));
+
+	for (i = 0; i < ARRAY_SIZE(nodes); i++)
+		free(nodes[i].set);
+
+	mem2node__exit(&map);
+	return 0;
+}
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 9f51edac44ae..a9760e790563 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -103,6 +103,7 @@ int test__clang(struct test *test, int subtest);
 const char *test__clang_subtest_get_desc(int subtest);
 int test__clang_subtest_get_nr(void);
 int test__unit_number__scnprint(struct test *test, int subtest);
+int test__mem2node(struct test *t, int subtest);
 
 bool test__bp_signal_is_supported(void);
 


* [tip:perf/core] perf c2c record: Record physical addresses in samples
  2018-03-09 10:14 ` [PATCH 4/9] perf c2c record: Record physical addresses in samples Jiri Olsa
@ 2018-03-20  6:17   ` tip-bot for Jiri Olsa
  0 siblings, 0 replies; 22+ messages in thread
From: tip-bot for Jiri Olsa @ 2018-03-20  6:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: alexander.shishkin, dsahern, tglx, jolsa, mingo, namhyung,
	jmario, acme, linux-kernel, hpa, peterz

Commit-ID:  8fab7843a15078814764e01c303d175c92b500c1
Gitweb:     https://git.kernel.org/tip/8fab7843a15078814764e01c303d175c92b500c1
Author:     Jiri Olsa <jolsa@kernel.org>
AuthorDate: Fri, 9 Mar 2018 11:14:37 +0100
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 16 Mar 2018 13:52:57 -0300

perf c2c record: Record physical addresses in samples

We are going to display NUMA node information in the following patches.
For this we need physical address data in the sample.

Add --phys-data as a default option for perf c2c record.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180309101442.9224-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-c2c.txt | 2 +-
 tools/perf/builtin-c2c.c              | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-c2c.txt b/tools/perf/Documentation/perf-c2c.txt
index 822414235170..095aebdc5bb7 100644
--- a/tools/perf/Documentation/perf-c2c.txt
+++ b/tools/perf/Documentation/perf-c2c.txt
@@ -116,7 +116,7 @@ and calls standard perf record command.
 Following perf record options are configured by default:
 (check perf record man page for details)
 
-  -W,-d,--sample-cpu
+  -W,-d,--phys-data,--sample-cpu
 
 Unless specified otherwise with '-e' option, following events are monitored by
 default:
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 98d243fa0c06..95765a1db903 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2704,7 +2704,7 @@ static int perf_c2c__record(int argc, const char **argv)
 	argc = parse_options(argc, argv, options, record_mem_usage,
 			     PARSE_OPT_KEEP_UNKNOWN);
 
-	rec_argc = argc + 10; /* max number of arguments */
+	rec_argc = argc + 11; /* max number of arguments */
 	rec_argv = calloc(rec_argc + 1, sizeof(char *));
 	if (!rec_argv)
 		return -1;
@@ -2720,6 +2720,7 @@ static int perf_c2c__record(int argc, const char **argv)
 		rec_argv[i++] = "-W";
 
 	rec_argv[i++] = "-d";
+	rec_argv[i++] = "--phys-data";
 	rec_argv[i++] = "--sample-cpu";
 
 	for (j = 0; j < PERF_MEM_EVENTS__MAX; j++) {

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [tip:perf/core] perf c2c report: Make calc_width work with struct c2c_hist_entry
  2018-03-09 10:14 ` [PATCH 5/9] perf c2c report: Make calc_width work with struct c2c_hist_entry Jiri Olsa
@ 2018-03-20  6:18   ` tip-bot for Jiri Olsa
  0 siblings, 0 replies; 22+ messages in thread
From: tip-bot for Jiri Olsa @ 2018-03-20  6:18 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jolsa, peterz, hpa, dsahern, jmario, alexander.shishkin, tglx,
	linux-kernel, namhyung, mingo, acme

Commit-ID:  3773138828b38f3f1364ef318cd876b16182388a
Gitweb:     https://git.kernel.org/tip/3773138828b38f3f1364ef318cd876b16182388a
Author:     Jiri Olsa <jolsa@kernel.org>
AuthorDate: Fri, 9 Mar 2018 11:14:38 +0100
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 16 Mar 2018 13:53:05 -0300

perf c2c report: Make calc_width work with struct c2c_hist_entry

We are going to calculate the column width based on the struct
c2c_hist_entry data, so make calc_width() work with struct
c2c_hist_entry.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180309101442.9224-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-c2c.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 95765a1db903..43ce55550c9d 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1839,20 +1839,24 @@ static inline int valid_hitm_or_store(struct hist_entry *he)
 	return has_hitm || c2c_he->stats.store;
 }
 
-static void calc_width(struct hist_entry *he)
+static void calc_width(struct c2c_hist_entry *c2c_he)
 {
 	struct c2c_hists *c2c_hists;
 
-	c2c_hists = container_of(he->hists, struct c2c_hists, hists);
-	hists__calc_col_len(&c2c_hists->hists, he);
+	c2c_hists = container_of(c2c_he->he.hists, struct c2c_hists, hists);
+	hists__calc_col_len(&c2c_hists->hists, &c2c_he->he);
 }
 
 static int filter_cb(struct hist_entry *he)
 {
+	struct c2c_hist_entry *c2c_he;
+
+	c2c_he = container_of(he, struct c2c_hist_entry, he);
+
 	if (c2c.show_src && !he->srcline)
 		he->srcline = hist_entry__get_srcline(he);
 
-	calc_width(he);
+	calc_width(c2c_he);
 
 	if (!valid_hitm_or_store(he))
 		he->filtered = HIST_FILTER__C2C;
@@ -1869,7 +1873,7 @@ static int resort_cl_cb(struct hist_entry *he)
 	c2c_he = container_of(he, struct c2c_hist_entry, he);
 	c2c_hists = c2c_he->hists;
 
-	calc_width(he);
+	calc_width(c2c_he);
 
 	if (display && c2c_hists) {
 		static unsigned int idx;

^ permalink raw reply related	[flat|nested] 22+ messages in thread
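
The callbacks in this patch recover the struct c2c_hist_entry from a pointer to its embedded struct hist_entry via container_of(). A minimal standalone sketch of that pattern follows; the macro is a simplified version without the kernel's type checking, and struct inner, struct outer and outer_of() are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Simplified container_of: step back from a member to its enclosing struct. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct inner {
	int id;
};

struct outer {
	unsigned long cnt;
	struct inner  in;	/* embedded member, like `he` in c2c_hist_entry */
};

/* Given a pointer to the embedded member, return the enclosing struct. */
static struct outer *outer_of(struct inner *in)
{
	return container_of(in, struct outer, in);
}
```

This only works when the pointer really addresses a member embedded in a struct outer, which is why filter_cb() and resort_cl_cb() can do the conversion: their entries were allocated as struct c2c_hist_entry.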

* [tip:perf/core] perf c2c report: Call calc_width() only for displayed entries
  2018-03-09 10:14 ` [PATCH 6/9] perf c2c report: Call calc_width only for displayed entries Jiri Olsa
@ 2018-03-20  6:18   ` tip-bot for Jiri Olsa
  0 siblings, 0 replies; 22+ messages in thread
From: tip-bot for Jiri Olsa @ 2018-03-20  6:18 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, hpa, dsahern, mingo, peterz, jolsa, alexander.shishkin,
	namhyung, acme, jmario, linux-kernel

Commit-ID:  bc229c21f2c79ef0f7b30d3a2fce8c2886ffa6c7
Gitweb:     https://git.kernel.org/tip/bc229c21f2c79ef0f7b30d3a2fce8c2886ffa6c7
Author:     Jiri Olsa <jolsa@kernel.org>
AuthorDate: Fri, 9 Mar 2018 11:14:39 +0100
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 16 Mar 2018 13:53:13 -0300

perf c2c report: Call calc_width() only for displayed entries

There's no need to calculate column widths for entries that are not
going to be displayed.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180309101442.9224-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-c2c.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 43ce55550c9d..821112e8ba97 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1873,12 +1873,11 @@ static int resort_cl_cb(struct hist_entry *he)
 	c2c_he = container_of(he, struct c2c_hist_entry, he);
 	c2c_hists = c2c_he->hists;
 
-	calc_width(c2c_he);
-
 	if (display && c2c_hists) {
 		static unsigned int idx;
 
 		c2c_he->cacheline_idx = idx++;
+		calc_width(c2c_he);
 
 		c2c_hists__reinit(c2c_hists, c2c.cl_output, c2c.cl_resort);
 

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [tip:perf/core] perf c2c report: Display node for cacheline address
  2018-03-09 10:14 ` [PATCH 7/9] perf c2c report: Display node for cacheline address Jiri Olsa
@ 2018-03-20  6:19   ` tip-bot for Jiri Olsa
  0 siblings, 0 replies; 22+ messages in thread
From: tip-bot for Jiri Olsa @ 2018-03-20  6:19 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: namhyung, tglx, peterz, mingo, linux-kernel, alexander.shishkin,
	jolsa, hpa, acme, dsahern, jmario

Commit-ID:  7f834c2e84bbcf94a1ed65a2ae648129e1901370
Gitweb:     https://git.kernel.org/tip/7f834c2e84bbcf94a1ed65a2ae648129e1901370
Author:     Jiri Olsa <jolsa@kernel.org>
AuthorDate: Fri, 9 Mar 2018 11:14:40 +0100
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 16 Mar 2018 13:53:23 -0300

perf c2c report: Display node for cacheline address

Add the NUMA node info for the data cacheline. The new column is added
to both the "Shared Data Cache Line Table" and the "Shared Cache Line
Distribution Pareto" tables.

Note the new 'Node' column next to the 'Cacheline'.

  $ perf c2c report --stdio
  =================================================
             Shared Data Cache Line Table
  =================================================
  #
  #                                    Total      Tot  ----- LLC Load Hitm -----
  # Index           Cacheline  Node  records     Hitm    Total      Lcl      Rmt
  # .....  ..................  ....  .......  .......  .......  .......  .......
  #
        0      0x7f0830100000     0       84   10.53%        8        8        0
        1  0xffff922a93154200     0        3    2.63%        2        2        0
        2  0xffff922a93154500     0        4    2.63%        2        2        0
  ...

Note the new 'Node' column next to the 'Offset'.

  =================================================
        Shared Cache Line Distribution Pareto
  =================================================
  #
  #        ----- HITM -----  -- Store Refs --        Data address
  #   Num      Rmt      Lcl   L1 Hit  L1 Miss              Offset  Node      Pid
  # .....  .......  .......  .......  .......  ..................  ....  .......
  #
    -------------------------------------------------------------
        0        0        8       32        2      0x7f0830100000
    -------------------------------------------------------------
             0.00%   75.00%   21.88%    0.00%                0x18     0     1791
             0.00%   12.50%   37.50%    0.00%                0x18     0     1791
             0.00%    0.00%   34.38%    0.00%                0x18     0     1791

The mem2node object is used to get the NUMA node data.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180309101442.9224-8-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-c2c.c | 119 +++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 114 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 821112e8ba97..45c047fdd7ac 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -32,6 +32,7 @@
 #include "evsel.h"
 #include "ui/browsers/hists.h"
 #include "thread.h"
+#include "mem2node.h"
 
 struct c2c_hists {
 	struct hists		hists;
@@ -49,6 +50,7 @@ struct c2c_hist_entry {
 	struct c2c_hists	*hists;
 	struct c2c_stats	 stats;
 	unsigned long		*cpuset;
+	unsigned long		*nodeset;
 	struct c2c_stats	*node_stats;
 	unsigned int		 cacheline_idx;
 
@@ -59,6 +61,11 @@ struct c2c_hist_entry {
 	 * because of its callchain dynamic entry
 	 */
 	struct hist_entry	he;
+
+	unsigned long		 paddr;
+	unsigned long		 paddr_cnt;
+	bool			 paddr_zero;
+	char			*nodestr;
 };
 
 static char const *coalesce_default = "pid,iaddr";
@@ -66,6 +73,7 @@ static char const *coalesce_default = "pid,iaddr";
 struct perf_c2c {
 	struct perf_tool	tool;
 	struct c2c_hists	hists;
+	struct mem2node		mem2node;
 
 	unsigned long		**nodes;
 	int			 nodes_cnt;
@@ -123,6 +131,10 @@ static void *c2c_he_zalloc(size_t size)
 	if (!c2c_he->cpuset)
 		return NULL;
 
+	c2c_he->nodeset = bitmap_alloc(c2c.nodes_cnt);
+	if (!c2c_he->nodeset)
+		return NULL;
+
 	c2c_he->node_stats = zalloc(c2c.nodes_cnt * sizeof(*c2c_he->node_stats));
 	if (!c2c_he->node_stats)
 		return NULL;
@@ -145,6 +157,8 @@ static void c2c_he_free(void *he)
 	}
 
 	free(c2c_he->cpuset);
+	free(c2c_he->nodeset);
+	free(c2c_he->nodestr);
 	free(c2c_he->node_stats);
 	free(c2c_he);
 }
@@ -194,6 +208,28 @@ static void c2c_he__set_cpu(struct c2c_hist_entry *c2c_he,
 	set_bit(sample->cpu, c2c_he->cpuset);
 }
 
+static void c2c_he__set_node(struct c2c_hist_entry *c2c_he,
+			     struct perf_sample *sample)
+{
+	int node;
+
+	if (!sample->phys_addr) {
+		c2c_he->paddr_zero = true;
+		return;
+	}
+
+	node = mem2node__node(&c2c.mem2node, sample->phys_addr);
+	if (WARN_ONCE(node < 0, "WARNING: failed to find node\n"))
+		return;
+
+	set_bit(node, c2c_he->nodeset);
+
+	if (c2c_he->paddr != sample->phys_addr) {
+		c2c_he->paddr_cnt++;
+		c2c_he->paddr = sample->phys_addr;
+	}
+}
+
 static void compute_stats(struct c2c_hist_entry *c2c_he,
 			  struct c2c_stats *stats,
 			  u64 weight)
@@ -257,6 +293,7 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
 	c2c_add_stats(&c2c_hists->stats, &stats);
 
 	c2c_he__set_cpu(c2c_he, sample);
+	c2c_he__set_node(c2c_he, sample);
 
 	hists__inc_nr_samples(&c2c_hists->hists, he->filtered);
 	ret = hist_entry__append_callchain(he, sample);
@@ -293,6 +330,7 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
 		compute_stats(c2c_he, &stats, sample->weight);
 
 		c2c_he__set_cpu(c2c_he, sample);
+		c2c_he__set_node(c2c_he, sample);
 
 		hists__inc_nr_samples(&c2c_hists->hists, he->filtered);
 		ret = hist_entry__append_callchain(he, sample);
@@ -455,6 +493,20 @@ static int dcacheline_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 	return scnprintf(hpp->buf, hpp->size, "%*s", width, HEX_STR(buf, addr));
 }
 
+static int
+dcacheline_node_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+		      struct hist_entry *he)
+{
+	struct c2c_hist_entry *c2c_he;
+	int width = c2c_width(fmt, hpp, he->hists);
+
+	c2c_he = container_of(he, struct c2c_hist_entry, he);
+	if (WARN_ON_ONCE(!c2c_he->nodestr))
+		return 0;
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", width, c2c_he->nodestr);
+}
+
 static int offset_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 			struct hist_entry *he)
 {
@@ -1207,6 +1259,14 @@ static struct c2c_dimension dim_dcacheline = {
 	.width		= 18,
 };
 
+static struct c2c_dimension dim_dcacheline_node = {
+	.header		= HEADER_LOW("Node"),
+	.name		= "dcacheline_node",
+	.cmp		= empty_cmp,
+	.entry		= dcacheline_node_entry,
+	.width		= 4,
+};
+
 static struct c2c_header header_offset_tui = HEADER_LOW("Off");
 
 static struct c2c_dimension dim_offset = {
@@ -1217,6 +1277,14 @@ static struct c2c_dimension dim_offset = {
 	.width		= 18,
 };
 
+static struct c2c_dimension dim_offset_node = {
+	.header		= HEADER_LOW("Node"),
+	.name		= "offset_node",
+	.cmp		= empty_cmp,
+	.entry		= dcacheline_node_entry,
+	.width		= 4,
+};
+
 static struct c2c_dimension dim_iaddr = {
 	.header		= HEADER_LOW("Code address"),
 	.name		= "iaddr",
@@ -1536,7 +1604,9 @@ static struct c2c_dimension dim_dcacheline_num_empty = {
 
 static struct c2c_dimension *dimensions[] = {
 	&dim_dcacheline,
+	&dim_dcacheline_node,
 	&dim_offset,
+	&dim_offset_node,
 	&dim_iaddr,
 	&dim_tot_hitm,
 	&dim_lcl_hitm,
@@ -1839,12 +1909,44 @@ static inline int valid_hitm_or_store(struct hist_entry *he)
 	return has_hitm || c2c_he->stats.store;
 }
 
+static void set_node_width(struct c2c_hist_entry *c2c_he, int len)
+{
+	struct c2c_dimension *dim;
+
+	dim = &c2c.hists == c2c_he->hists ?
+	      &dim_dcacheline_node : &dim_offset_node;
+
+	if (len > dim->width)
+		dim->width = len;
+}
+
+static int set_nodestr(struct c2c_hist_entry *c2c_he)
+{
+	char buf[30];
+	int len;
+
+	if (c2c_he->nodestr)
+		return 0;
+
+	if (bitmap_weight(c2c_he->nodeset, c2c.nodes_cnt)) {
+		len = bitmap_scnprintf(c2c_he->nodeset, c2c.nodes_cnt,
+				      buf, sizeof(buf));
+	} else {
+		len = scnprintf(buf, sizeof(buf), "N/A");
+	}
+
+	set_node_width(c2c_he, len);
+	c2c_he->nodestr = strdup(buf);
+	return c2c_he->nodestr ? 0 : -ENOMEM;
+}
+
 static void calc_width(struct c2c_hist_entry *c2c_he)
 {
 	struct c2c_hists *c2c_hists;
 
 	c2c_hists = container_of(c2c_he->he.hists, struct c2c_hists, hists);
 	hists__calc_col_len(&c2c_hists->hists, &c2c_he->he);
+	set_nodestr(c2c_he);
 }
 
 static int filter_cb(struct hist_entry *he)
@@ -2474,7 +2576,7 @@ static int build_cl_output(char *cl_sort, bool no_source)
 		"percent_lcl_hitm,"
 		"percent_stores_l1hit,"
 		"percent_stores_l1miss,"
-		"offset,",
+		"offset,offset_node,",
 		add_pid   ? "pid," : "",
 		add_tid   ? "tid," : "",
 		add_iaddr ? "iaddr," : "",
@@ -2603,17 +2705,21 @@ static int perf_c2c__report(int argc, const char **argv)
 		goto out;
 	}
 
-	err = setup_callchain(session->evlist);
+	err = mem2node__init(&c2c.mem2node, &session->header.env);
 	if (err)
 		goto out_session;
 
+	err = setup_callchain(session->evlist);
+	if (err)
+		goto out_mem2node;
+
 	if (symbol__init(&session->header.env) < 0)
-		goto out_session;
+		goto out_mem2node;
 
 	/* No pipe support at the moment. */
 	if (perf_data__is_pipe(session->data)) {
 		pr_debug("No pipe support at the moment.\n");
-		goto out_session;
+		goto out_mem2node;
 	}
 
 	if (c2c.use_stdio)
@@ -2626,12 +2732,13 @@ static int perf_c2c__report(int argc, const char **argv)
 	err = perf_session__process_events(session);
 	if (err) {
 		pr_err("failed to process sample\n");
-		goto out_session;
+		goto out_mem2node;
 	}
 
 	c2c_hists__reinit(&c2c.hists,
 			"cl_idx,"
 			"dcacheline,"
+			"dcacheline_node,"
 			"tot_recs,"
 			"percent_hitm,"
 			"tot_hitm,lcl_hitm,rmt_hitm,"
@@ -2657,6 +2764,8 @@ static int perf_c2c__report(int argc, const char **argv)
 
 	perf_c2c_display(session);
 
+out_mem2node:
+	mem2node__exit(&c2c.mem2node);
 out_session:
 	perf_session__delete(session);
 out:

^ permalink raw reply related	[flat|nested] 22+ messages in thread
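
set_nodestr() turns the per-entry node bitmap into the string shown in the 'Node' column, or "N/A" when no node was recorded. The sketch below shows one way to do that for a bitmap that fits in a single unsigned long. The function name nodes_to_str() is hypothetical, and the comma-separated range format ("0-1,3") is an assumption about what such output looks like, not a byte-for-byte copy of bitmap_scnprintf().

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format the low `nbits` bits of `mask` as a range list, e.g. "0-1,3".
 * Assumes nbits does not exceed the number of bits in unsigned long.
 * Returns the string length, or -1 if the buffer is too small.
 */
static int nodes_to_str(unsigned long mask, int nbits, char *buf, size_t size)
{
	size_t len = 0;
	int bit = 0;

	if (size == 0)
		return -1;
	buf[0] = '\0';

	if (!mask)
		return snprintf(buf, size, "N/A");

	while (bit < nbits) {
		int first, last, n;

		if (!(mask & (1UL << bit))) {
			bit++;
			continue;
		}

		/* Extend the run while the next bit is also set. */
		first = bit;
		while (bit + 1 < nbits && (mask & (1UL << (bit + 1))))
			bit++;
		last = bit++;

		if (first == last)
			n = snprintf(buf + len, size - len, "%s%d",
				     len ? "," : "", first);
		else
			n = snprintf(buf + len, size - len, "%s%d-%d",
				     len ? "," : "", first, last);

		if (n < 0 || (size_t)n >= size - len)
			return -1;	/* output would not fit */
		len += (size_t)n;
	}

	return (int)len;
}
```

The returned length plays the role of `len` in set_nodestr(), where it is used to widen the column when a node string is longer than the default width of 4.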

* [tip:perf/core] perf c2c report: Add span header over cacheline data
  2018-03-09 10:14 ` [PATCH 8/9] perf c2c report: Add span header over cacheline data Jiri Olsa
@ 2018-03-20  6:19   ` tip-bot for Jiri Olsa
  0 siblings, 0 replies; 22+ messages in thread
From: tip-bot for Jiri Olsa @ 2018-03-20  6:19 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, hpa, tglx, namhyung, jmario, dsahern, acme, linux-kernel,
	alexander.shishkin, jolsa, peterz

Commit-ID:  d0802b1ee2c8b95e960f46fa14fe0fee562cb79a
Gitweb:     https://git.kernel.org/tip/d0802b1ee2c8b95e960f46fa14fe0fee562cb79a
Author:     Jiri Olsa <jolsa@kernel.org>
AuthorDate: Fri, 9 Mar 2018 11:14:41 +0100
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 16 Mar 2018 13:53:30 -0300

perf c2c report: Add span header over cacheline data

Forcing the NUMA node output to be grouped with the "Cacheline" column
in both "Shared Data Cache Line Table" and "Shared Cache Line
Distribution Pareto" tables.

Before:
  #                                    Total      Tot  ----- LLC Load Hitm -----
  # Index           Cacheline  Node  records     Hitm    Total      Lcl      Rmt
  # .....  ..................  ....  .......  .......  .......  .......  .......
  #
        0      0x7f0830100000     0       84   10.53%        8        8        0
        1  0xffff922a93154200     0        3    2.63%        2        2        0
        2  0xffff922a93154500     0        4    2.63%        2        2        0

After:
  #        ------- Cacheline ------    Total      Tot  ----- LLC Load Hitm -----
  # Index             Address  Node  records     Hitm    Total      Lcl      Rmt
  # .....  ..................  ....  .......  .......  .......  .......  .......
  #
        0      0x7f0830100000     0       84   10.53%        8        8        0
        1  0xffff922a93154200     0        3    2.63%        2        2        0
        2  0xffff922a93154500     0        4    2.63%        2        2        0

Before:
  #        ----- HITM -----  -- Store Refs --        Data address
  #   Num      Rmt      Lcl   L1 Hit  L1 Miss              Offset  Node      Pid
  # .....  .......  .......  .......  .......  ..................  ....  .......
  #
    -------------------------------------------------------------
        0        0        8       32        2      0x7f0830100000
    -------------------------------------------------------------
             0.00%   75.00%   21.88%    0.00%                0x18     0     1791
             0.00%   12.50%   37.50%    0.00%                0x18     0     1791
             0.00%    0.00%   34.38%    0.00%                0x18     0     1791

After:
  #        ----- HITM -----  -- Store Refs --  ----- Data address -----
  #   Num      Rmt      Lcl   L1 Hit  L1 Miss              Offset  Node      Pid
  # .....  .......  .......  .......  .......  ..................  ....  .......
  #
    -------------------------------------------------------------
        0        0        8       32        2      0x7f0830100000
    -------------------------------------------------------------
             0.00%   75.00%   21.88%    0.00%                0x18     0     1791
             0.00%   12.50%   37.50%    0.00%                0x18     0     1791
             0.00%    0.00%   34.38%    0.00%                0x18     0     1791

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180309101442.9224-9-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-c2c.c | 63 ++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 58 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 45c047fdd7ac..a6336e4e2850 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1252,7 +1252,7 @@ cl_idx_empty_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 	}
 
 static struct c2c_dimension dim_dcacheline = {
-	.header		= HEADER_LOW("Cacheline"),
+	.header		= HEADER_SPAN("--- Cacheline ----", "Address", 1),
 	.name		= "dcacheline",
 	.cmp		= dcacheline_cmp,
 	.entry		= dcacheline_entry,
@@ -1267,10 +1267,10 @@ static struct c2c_dimension dim_dcacheline_node = {
 	.width		= 4,
 };
 
-static struct c2c_header header_offset_tui = HEADER_LOW("Off");
+static struct c2c_header header_offset_tui = HEADER_SPAN("-----", "Off", 1);
 
 static struct c2c_dimension dim_offset = {
-	.header		= HEADER_BOTH("Data address", "Offset"),
+	.header		= HEADER_SPAN("--- Data address -", "Offset", 1),
 	.name		= "offset",
 	.cmp		= offset_cmp,
 	.entry		= offset_entry,
@@ -2453,14 +2453,64 @@ static void perf_c2c_display(struct perf_session *session)
 }
 #endif /* HAVE_SLANG_SUPPORT */
 
-static void ui_quirks(void)
+static char *fill_line(const char *orig, int len)
 {
+	int i, j, olen = strlen(orig);
+	char *buf;
+
+	buf = zalloc(len + 1);
+	if (!buf)
+		return NULL;
+
+	j = len / 2 - olen / 2;
+
+	for (i = 0; i < j - 1; i++)
+		buf[i] = '-';
+
+	buf[i++] = ' ';
+
+	strcpy(buf + i, orig);
+
+	i += olen;
+
+	buf[i++] = ' ';
+
+	for (; i < len; i++)
+		buf[i] = '-';
+
+	return buf;
+}
+
+static int ui_quirks(void)
+{
+	const char *nodestr = "Data address";
+	char *buf;
+
 	if (!c2c.use_stdio) {
 		dim_offset.width  = 5;
 		dim_offset.header = header_offset_tui;
+		nodestr = "CL";
 	}
 
 	dim_percent_hitm.header = percent_hitm_header[c2c.display];
+
+	/* Fix the zero line for dcacheline column. */
+	buf = fill_line("Cacheline", dim_dcacheline.width +
+				     dim_dcacheline_node.width + 2);
+	if (!buf)
+		return -ENOMEM;
+
+	dim_dcacheline.header.line[0].text = buf;
+
+	/* Fix the zero line for offset column. */
+	buf = fill_line(nodestr, dim_offset.width +
+			      dim_offset_node.width + 2);
+	if (!buf)
+		return -ENOMEM;
+
+	dim_offset.header.line[0].text = buf;
+
+	return 0;
 }
 
 #define CALLCHAIN_DEFAULT_OPT  "graph,0.5,caller,function,percent"
@@ -2760,7 +2810,10 @@ static int perf_c2c__report(int argc, const char **argv)
 
 	ui_progress__finish();
 
-	ui_quirks();
+	if (ui_quirks()) {
+		pr_err("failed to setup UI\n");
+		goto out_mem2node;
+	}
 
 	perf_c2c_display(session);
 

^ permalink raw reply related	[flat|nested] 22+ messages in thread
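
The span headers in the "After" output come from fill_line(), which centres a title inside a run of dashes whose length is the combined width of the grouped columns. A minimal standalone sketch of the same arithmetic follows. The name span_title() is hypothetical, calloc() stands in for perf's zalloc(), and the length guard at the top is an addition that is not in the patch.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Centre `title` inside a line of `len` dashes, with one space on each
 * side of the title. Returns a heap string the caller must free(), or
 * NULL on allocation failure or when the title does not fit.
 */
static char *span_title(const char *title, size_t len)
{
	size_t olen = strlen(title);
	size_t left;
	char *buf;

	/* Extra guard (not in the patch): room for the title plus two spaces. */
	if (len < olen + 2)
		return NULL;

	buf = calloc(len + 1, 1);
	if (!buf)
		return NULL;

	/* Index where the title starts; same arithmetic as in the patch. */
	left = len / 2 - olen / 2;

	memset(buf, '-', len);
	buf[left - 1] = ' ';
	memcpy(buf + left, title, olen);
	buf[left + olen] = ' ';

	return buf;
}
```

For the stdio case the total width is 18 + 4 + 2 = 24 (address column, node column, and the two-character separator), which yields "------- Cacheline ------" and "----- Data address -----", the headers shown in the commit message.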

* [tip:perf/core] perf c2c report: Add cacheline address count column
  2018-03-09 10:14 ` [PATCH 9/9] perf c2c report: Add cacheline address count column Jiri Olsa
  2018-03-09 14:56   ` Arnaldo Carvalho de Melo
@ 2018-03-20  6:20   ` tip-bot for Jiri Olsa
  1 sibling, 0 replies; 22+ messages in thread
From: tip-bot for Jiri Olsa @ 2018-03-20  6:20 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, peterz, linux-kernel, tglx, mingo, alexander.shishkin,
	namhyung, dsahern, jolsa, hpa, jmario

Commit-ID:  03d9fcb701340de3446b4ff4ddb9f5407d1412f5
Gitweb:     https://git.kernel.org/tip/03d9fcb701340de3446b4ff4ddb9f5407d1412f5
Author:     Jiri Olsa <jolsa@kernel.org>
AuthorDate: Fri, 9 Mar 2018 11:14:42 +0100
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 16 Mar 2018 13:53:38 -0300

perf c2c report: Add cacheline address count column

Add the 'PA cnt' column, grouped under the data cacheline address.

It shows how many times the physical address changed for the hist
entry. It does not show the number of different physical addresses for
the entry, because we don't store those. We only track the number of
times we got a different address than the one we currently hold, which
is not expensive and gives similar info.

  $ perf c2c report --stdio

  #        ----------- Cacheline ----------    Total      Tot  ----- LLC Load Hitm -----
  # Index             Address  Node  PA cnt  records     Hitm    Total      Lcl      Rmt
  # .....  ..................  ....  ......  .......  .......  .......  .......  .......
  #
        0  0xffff9ad56dca0a80     0       9       10    7.69%        2        2        0
        1  0xffff9ad56dce0a80     0       9        9    7.69%        2        2        0
        2  0xffff9ad37659ad80     0       1        2    3.85%        1        1        0

  ...

  #        ----- HITM -----  -- Store Refs --  --------- Data address ---------
  #   Num      Rmt      Lcl   L1 Hit  L1 Miss              Offset  Node  PA cnt      Pid
  # .....  .......  .......  .......  .......  ..................  ....  ......  .......
  #
    -------------------------------------------------------------
        0        0        2        3        0  0xffff9ad56dca0a80
    -------------------------------------------------------------
             0.00%    0.00%   33.33%    0.00%                 0x0     0       1     2510
             0.00%    0.00%   33.33%    0.00%                 0x4     0       1     2476
             0.00%    0.00%   33.33%    0.00%                0x20     0       1        0
             0.00%  100.00%    0.00%    0.00%                0x38     0       1        0

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180309101442.9224-10-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-c2c.c | 35 +++++++++++++++++++++++++++++------
 1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index a6336e4e2850..2126bfbcb385 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -507,6 +507,17 @@ dcacheline_node_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 	return scnprintf(hpp->buf, hpp->size, "%*s", width, c2c_he->nodestr);
 }
 
+static int
+dcacheline_node_count(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+		      struct hist_entry *he)
+{
+	struct c2c_hist_entry *c2c_he;
+	int width = c2c_width(fmt, hpp, he->hists);
+
+	c2c_he = container_of(he, struct c2c_hist_entry, he);
+	return scnprintf(hpp->buf, hpp->size, "%*lu", width, c2c_he->paddr_cnt);
+}
+
 static int offset_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 			struct hist_entry *he)
 {
@@ -1252,7 +1263,7 @@ cl_idx_empty_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 	}
 
 static struct c2c_dimension dim_dcacheline = {
-	.header		= HEADER_SPAN("--- Cacheline ----", "Address", 1),
+	.header		= HEADER_SPAN("--- Cacheline ----", "Address", 2),
 	.name		= "dcacheline",
 	.cmp		= dcacheline_cmp,
 	.entry		= dcacheline_entry,
@@ -1267,10 +1278,18 @@ static struct c2c_dimension dim_dcacheline_node = {
 	.width		= 4,
 };
 
-static struct c2c_header header_offset_tui = HEADER_SPAN("-----", "Off", 1);
+static struct c2c_dimension dim_dcacheline_count = {
+	.header		= HEADER_LOW("PA cnt"),
+	.name		= "dcacheline_count",
+	.cmp		= empty_cmp,
+	.entry		= dcacheline_node_count,
+	.width		= 6,
+};
+
+static struct c2c_header header_offset_tui = HEADER_SPAN("-----", "Off", 2);
 
 static struct c2c_dimension dim_offset = {
-	.header		= HEADER_SPAN("--- Data address -", "Offset", 1),
+	.header		= HEADER_SPAN("--- Data address -", "Offset", 2),
 	.name		= "offset",
 	.cmp		= offset_cmp,
 	.entry		= offset_entry,
@@ -1605,6 +1624,7 @@ static struct c2c_dimension dim_dcacheline_num_empty = {
 static struct c2c_dimension *dimensions[] = {
 	&dim_dcacheline,
 	&dim_dcacheline_node,
+	&dim_dcacheline_count,
 	&dim_offset,
 	&dim_offset_node,
 	&dim_iaddr,
@@ -2496,7 +2516,8 @@ static int ui_quirks(void)
 
 	/* Fix the zero line for dcacheline column. */
 	buf = fill_line("Cacheline", dim_dcacheline.width +
-				     dim_dcacheline_node.width + 2);
+				     dim_dcacheline_node.width +
+				     dim_dcacheline_count.width + 4);
 	if (!buf)
 		return -ENOMEM;
 
@@ -2504,7 +2525,8 @@ static int ui_quirks(void)
 
 	/* Fix the zero line for offset column. */
 	buf = fill_line(nodestr, dim_offset.width +
-			      dim_offset_node.width + 2);
+			         dim_offset_node.width +
+				 dim_dcacheline_count.width + 4);
 	if (!buf)
 		return -ENOMEM;
 
@@ -2626,7 +2648,7 @@ static int build_cl_output(char *cl_sort, bool no_source)
 		"percent_lcl_hitm,"
 		"percent_stores_l1hit,"
 		"percent_stores_l1miss,"
-		"offset,offset_node,",
+		"offset,offset_node,dcacheline_count,",
 		add_pid   ? "pid," : "",
 		add_tid   ? "tid," : "",
 		add_iaddr ? "iaddr," : "",
@@ -2789,6 +2811,7 @@ static int perf_c2c__report(int argc, const char **argv)
 			"cl_idx,"
 			"dcacheline,"
 			"dcacheline_node,"
+			"dcacheline_count,"
 			"tot_recs,"
 			"percent_hitm,"
 			"tot_hitm,lcl_hitm,rmt_hitm,"

^ permalink raw reply related	[flat|nested] 22+ messages in thread
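
The difference between "number of changes" and "number of distinct addresses" described in the commit message can be seen in a small standalone sketch that mirrors the bookkeeping in c2c_he__set_node(). struct pa_tracker and pa_tracker_update() are hypothetical names, and uint64_t is used here for the address where the patch uses unsigned long.

```c
#include <stdbool.h>
#include <stdint.h>

struct pa_tracker {
	uint64_t      paddr;		/* last non-zero physical address seen */
	unsigned long paddr_cnt;	/* times a sample differed from `paddr` */
	bool          paddr_zero;	/* some sample had no physical address */
};

/* Record one sample's physical address. */
static void pa_tracker_update(struct pa_tracker *t, uint64_t phys_addr)
{
	if (!phys_addr) {
		t->paddr_zero = true;
		return;
	}

	/* Count a change only when the address differs from the last one. */
	if (t->paddr != phys_addr) {
		t->paddr_cnt++;
		t->paddr = phys_addr;
	}
}
```

Feeding the addresses A, B, A, A gives a count of 3 although only two distinct addresses occurred: the first sample counts because the stored address starts at zero, and the return to A counts again because only the most recent address is remembered.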

end of thread, other threads:[~2018-03-20  6:21 UTC | newest]

Thread overview: 22+ messages
2018-03-09 10:14 [PATCHv2 0/9] perf tools: Assorted fixes Jiri Olsa
2018-03-09 10:14 ` [PATCH 1/9] perf tools: Free memory nodes data Jiri Olsa
2018-03-20  6:16   ` [tip:perf/core] perf env: " tip-bot for Jiri Olsa
2018-03-09 10:14 ` [PATCH 2/9] perf tools: Add mem2node object Jiri Olsa
2018-03-20  6:16   ` [tip:perf/core] " tip-bot for Jiri Olsa
2018-03-09 10:14 ` [PATCH 3/9] perf tests: Add mem2node object test Jiri Olsa
2018-03-20  6:17   ` [tip:perf/core] " tip-bot for Jiri Olsa
2018-03-09 10:14 ` [PATCH 4/9] perf c2c record: Record physical addresses in samples Jiri Olsa
2018-03-20  6:17   ` [tip:perf/core] " tip-bot for Jiri Olsa
2018-03-09 10:14 ` [PATCH 5/9] perf c2c report: Make calc_width work with struct c2c_hist_entry Jiri Olsa
2018-03-20  6:18   ` [tip:perf/core] " tip-bot for Jiri Olsa
2018-03-09 10:14 ` [PATCH 6/9] perf c2c report: Call calc_width only for displayed entries Jiri Olsa
2018-03-20  6:18   ` [tip:perf/core] perf c2c report: Call calc_width() " tip-bot for Jiri Olsa
2018-03-09 10:14 ` [PATCH 7/9] perf c2c report: Display node for cacheline address Jiri Olsa
2018-03-20  6:19   ` [tip:perf/core] " tip-bot for Jiri Olsa
2018-03-09 10:14 ` [PATCH 8/9] perf c2c report: Add span header over cacheline data Jiri Olsa
2018-03-20  6:19   ` [tip:perf/core] " tip-bot for Jiri Olsa
2018-03-09 10:14 ` [PATCH 9/9] perf c2c report: Add cacheline address count column Jiri Olsa
2018-03-09 14:56   ` Arnaldo Carvalho de Melo
2018-03-09 16:28     ` Jiri Olsa
2018-03-09 17:34       ` Arnaldo Carvalho de Melo
2018-03-20  6:20   ` [tip:perf/core] " tip-bot for Jiri Olsa
