linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC 00/10] stat read during perf sampling
@ 2015-08-18  9:25 kan.liang
  2015-08-18  9:25 ` [PATCH RFC 01/10] perf,tools: open event on evsel cpus and threads kan.liang
                   ` (9 more replies)
  0 siblings, 10 replies; 29+ messages in thread
From: kan.liang @ 2015-08-18  9:25 UTC (permalink / raw)
  To: acme
  Cc: a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel,
	Kan Liang

From: Kan Liang <kan.liang@intel.com>

The patch series intends to read counter statistics during sampling.
The instant benefit is that we can read memory bandwidth from uncore
event during cpu PMU event is sampling. Also, there are more and more 
free running counter events (such as freq, power etc) we have supported
or plan to support on perf. So it could benefit more.

The patch series includs 10 patches.
  - Patch 1: This patch fixes a potential bug, when evlist and evsel
    have different CPU maps. It can be merged separately.
  - Patch 2-3: These patches introduce a new sort option --socket.
    The user can sort the perf result by socket. User also can get
    the socket view of samples from perf report --stdio --socket.
    This feature should be useful for per-socket event.
  - Patch 4-10: Introduce 'N' event/group modifier. The event with
    this modifier will do counting not sampling. If a group with this
    modifier, only group leader do sampling. The counter statistics will
    be wrote in new RECORD type PERF_RECORD_STAT_READ and stored in
    perf.data. So perf report can present the counter statistics data
    accordingly.
    There may be an alternative way to get counter statistics during
    sampling by running perf record and perf stat together by script.
    But the script way have various issue and complex to parses the
    output. For example, the sophisticated bandwidth analysis requires
    fine granularity (10-20ms interval), while the perf stat interval
    is 100ms. It's better to record all data in perf.data as the
    patchset does.

Example:

 #perf record -e 'cycles,uncore_imc_1/cas_count_read/N'
	--stat-read-interval 10 -a ./tchain_edit
 [ perf record: Woken up 520 times to write data ]
 [ perf record: Captured and wrote 1.454 MB perf.data (21328 samples) ]

 $perf report --stdio --socket

 # Samples: 21K of event 'cycles'
 # Event count (approx.): 12073951084
 #

 # Socket: 0
 #
 # Overhead  Command       Shared Object     Symbol
 # ........  ............  ................  .......................................
 #
    97.58%  tchain_edit   tchain_edit       [.] f3
     0.08%  tchain_edit   tchain_edit       [.] f2
     0.05%  swapper       [kernel.vmlinux]  [k] run_timer_softirq
 # Socket: 1
 #
 # Overhead  Command       Shared Object     Symbol
 # ........  ............  ................  .......................................
 #
     0.43%  swapper       [kernel.vmlinux]  [k] acpi_idle_do_entry
     0.24%  kworker/22:1  [kernel.vmlinux]  [k] delay_tsc
     0.17%  perf          [kernel.vmlinux]  [k] smp_call_function_single

 # Socket: 0
 #
 # Performance counter stats:
 # uncore_imc_1/cas_count_read/N  29004
 #
 # Socket: 1
 #
 # Performance counter stats:
 # uncore_imc_1/cas_count_read/N  11350


 $perf report -D

...
0x3e3a8 [0x20]: PERF_RECORD_STAT_READ: uncore_imc_0/cas_count_read/N CPU 0:
value 29 time: 78608435366512
0x3e3c8 [0x20]: PERF_RECORD_STAT_READ: uncore_imc_0/cas_count_read/N CPU 18:
value 15 time: 78608435429055
...
0x3e468 [0x20]: PERF_RECORD_STAT_READ: uncore_imc_0/cas_count_read/N CPU 0:
value 25 time: 78608445379258
0x3e488 [0x20]: PERF_RECORD_STAT_READ: uncore_imc_0/cas_count_read/N CPU 18:
value 12 time: 78608445423995
...

Kan Liang (10):
  perf,tools: open event on evsel cpus and threads
  perf,tools: Support new sort type --socket
  perf,tools: support option --socket
  perf,tools: Add 'N' event/group modifier
  perf,tools: Enable statistic read for perf record
  perf,tools: New RECORD type PERF_RECORD_STAT_READ
  perf,tools: record counter statistics during sampling
  perf,tools: option to set stat read interval
  perf,tools: don't validate non-sample event
  perf,tools: Show STAT_READ in perf report

 tools/perf/Documentation/perf-list.txt   |   5 ++
 tools/perf/Documentation/perf-record.txt |   7 ++
 tools/perf/Documentation/perf-report.txt |   6 +-
 tools/perf/builtin-diff.c                |   2 +-
 tools/perf/builtin-record.c              | 140 ++++++++++++++++++++++++++++++-
 tools/perf/builtin-report.c              | 108 +++++++++++++++++++++++-
 tools/perf/builtin-top.c                 |   2 +-
 tools/perf/ui/stdio/hist.c               |  14 +++-
 tools/perf/util/cpumap.c                 |  35 ++++++--
 tools/perf/util/cpumap.h                 |   4 +
 tools/perf/util/event.c                  |   1 +
 tools/perf/util/event.h                  |  10 +++
 tools/perf/util/evlist.c                 |   9 ++
 tools/perf/util/evsel.h                  |   1 +
 tools/perf/util/hist.h                   |   3 +-
 tools/perf/util/parse-events.c           |   8 +-
 tools/perf/util/parse-events.l           |   2 +-
 tools/perf/util/session.c                |  15 ++++
 tools/perf/util/sort.c                   |  34 ++++++++
 tools/perf/util/sort.h                   |   1 +
 tools/perf/util/symbol.h                 |   1 +
 tools/perf/util/tool.h                   |   1 +
 22 files changed, 387 insertions(+), 22 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH RFC 01/10] perf,tools: open event on evsel cpus and threads
  2015-08-18  9:25 [PATCH RFC 00/10] stat read during perf sampling kan.liang
@ 2015-08-18  9:25 ` kan.liang
  2015-08-20  8:57   ` Jiri Olsa
  2015-08-18  9:25 ` [PATCH RFC 02/10] perf,tools: Support new sort type --socket kan.liang
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 29+ messages in thread
From: kan.liang @ 2015-08-18  9:25 UTC (permalink / raw)
  To: acme
  Cc: a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel,
	Kan Liang

From: Kan Liang <kan.liang@intel.com>

evsel may have different cpus and threads as evlist's.
Use it's own cpus and threads, when open evsel in perf record.

Signed-off-by: Kan Liang <kan.liang@intel.com>
---
 tools/perf/builtin-record.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 25cf6b4..a0178bf 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -279,7 +279,7 @@ static int record__open(struct record *rec)
 
 	evlist__for_each(evlist, pos) {
 try_again:
-		if (perf_evsel__open(pos, evlist->cpus, evlist->threads) < 0) {
+		if (perf_evsel__open(pos, pos->cpus, pos->threads) < 0) {
 			if (perf_evsel__fallback(pos, errno, msg, sizeof(msg))) {
 				if (verbose)
 					ui__warning("%s\n", msg);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH RFC 02/10] perf,tools: Support new sort type --socket
  2015-08-18  9:25 [PATCH RFC 00/10] stat read during perf sampling kan.liang
  2015-08-18  9:25 ` [PATCH RFC 01/10] perf,tools: open event on evsel cpus and threads kan.liang
@ 2015-08-18  9:25 ` kan.liang
  2015-08-20  9:09   ` Jiri Olsa
  2015-08-18  9:25 ` [PATCH RFC 03/10] perf,tools: support option --socket kan.liang
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 29+ messages in thread
From: kan.liang @ 2015-08-18  9:25 UTC (permalink / raw)
  To: acme
  Cc: a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel,
	Kan Liang

From: Kan Liang <kan.liang@intel.com>

This patch enable perf report to sort by socket

$ perf report --stdio --sort socket,comm,dso,symbol
 # To display the perf.data header info, please use
 --header/--header-only options.
 #
 #
 # Total Lost Samples: 0
 #
 # Samples: 686  of event 'cycles'
 # Event count (approx.): 349215462
 #
 # Overhead  SOCKET  Command    Shared Object     Symbol
 # ........  ......  .........  ................
 .................................
 #
    97.05%     000  test       test              [.] plusB_c
     0.98%     000  test       test              [.] plusA_c
     0.93%     001  perf       [kernel.vmlinux]  [k]
smp_call_function_single
     0.19%     001  perf       [kernel.vmlinux]  [k] page_fault
     0.19%     001  swapper    [kernel.vmlinux]  [k] pm_qos_request
     0.16%     000  test       [kernel.vmlinux]  [k] add_mm_counter_fast

Signed-off-by: Kan Liang <kan.liang@intel.com>
---
 tools/perf/Documentation/perf-report.txt |  3 ++-
 tools/perf/util/cpumap.c                 | 21 ++++++++++++++-------
 tools/perf/util/cpumap.h                 |  1 +
 tools/perf/util/hist.h                   |  1 +
 tools/perf/util/sort.c                   | 28 ++++++++++++++++++++++++++++
 tools/perf/util/sort.h                   |  1 +
 6 files changed, 47 insertions(+), 8 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index a18ba75..ca76b0d 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -68,7 +68,7 @@ OPTIONS
 --sort=::
 	Sort histogram entries by given key(s) - multiple keys can be specified
 	in CSV format.  Following sort keys are available:
-	pid, comm, dso, symbol, parent, cpu, srcline, weight, local_weight.
+	pid, comm, dso, symbol, parent, cpu, socket, srcline, weight, local_weight.
 
 	Each key has following meaning:
 
@@ -79,6 +79,7 @@ OPTIONS
 	- parent: name of function matched to the parent regex filter. Unmatched
 	entries are displayed as "[other]".
 	- cpu: cpu number the task ran at the time of sample
+	- socket: socket number the task ran at the time of sample
 	- srcline: filename and line number executed at the time of sample.  The
 	DWARF debugging info must be provided.
 	- srcfile: file name of the source file of the same. Requires dwarf
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 3667e21..7f25e6c 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -225,17 +225,12 @@ void cpu_map__put(struct cpu_map *map)
 		cpu_map__delete(map);
 }
 
-int cpu_map__get_socket(struct cpu_map *map, int idx)
+int cpu__get_socket(int cpu)
 {
 	FILE *fp;
 	const char *mnt;
 	char path[PATH_MAX];
-	int cpu, ret;
-
-	if (idx > map->nr)
-		return -1;
-
-	cpu = map->map[idx];
+	int ret;
 
 	mnt = sysfs__mountpoint();
 	if (!mnt)
@@ -253,6 +248,18 @@ int cpu_map__get_socket(struct cpu_map *map, int idx)
 	return ret == 1 ? cpu : -1;
 }
 
+int cpu_map__get_socket(struct cpu_map *map, int idx)
+{
+	int cpu;
+
+	if (idx > map->nr)
+		return -1;
+
+	cpu = map->map[idx];
+
+	return cpu__get_socket(cpu);
+}
+
 static int cmp_ids(const void *a, const void *b)
 {
 	return *(int *)a - *(int *)b;
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 0af9cec..effb56e 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -18,6 +18,7 @@ struct cpu_map *cpu_map__new(const char *cpu_list);
 struct cpu_map *cpu_map__dummy_new(void);
 struct cpu_map *cpu_map__read(FILE *file);
 size_t cpu_map__fprintf(struct cpu_map *map, FILE *fp);
+int cpu__get_socket(int cpu);
 int cpu_map__get_socket(struct cpu_map *map, int idx);
 int cpu_map__get_core(struct cpu_map *map, int idx);
 int cpu_map__build_socket_map(struct cpu_map *cpus, struct cpu_map **sockp);
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index bc528d5..af80fb5 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -29,6 +29,7 @@ enum hist_column {
 	HISTC_COMM,
 	HISTC_PARENT,
 	HISTC_CPU,
+	HISTC_SOCKET,
 	HISTC_SRCLINE,
 	HISTC_SRCFILE,
 	HISTC_MISPREDICT,
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 7e38716..245e254 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -421,6 +421,33 @@ struct sort_entry sort_cpu = {
 	.se_width_idx	= HISTC_CPU,
 };
 
+/* --sort socket */
+
+static int64_t
+sort__socket_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+	int r_socket, l_socket;
+
+	r_socket = cpu__get_socket(right->cpu);
+	l_socket = cpu__get_socket(left->cpu);
+	return r_socket - l_socket;
+}
+
+static int hist_entry__socket_snprintf(struct hist_entry *he, char *bf,
+				    size_t size, unsigned int width)
+{
+	int socket = cpu__get_socket(he->cpu);
+
+	return repsep_snprintf(bf, size, "%*.*d", width, width-3, socket);
+}
+
+struct sort_entry sort_socket = {
+	.se_header      = "SOCKET",
+	.se_cmp	        = sort__socket_cmp,
+	.se_snprintf    = hist_entry__socket_snprintf,
+	.se_width_idx	= HISTC_SOCKET,
+};
+
 /* sort keys for branch stacks */
 
 static int64_t
@@ -1248,6 +1275,7 @@ static struct sort_dimension common_sort_dimensions[] = {
 	DIM(SORT_SYM, "symbol", sort_sym),
 	DIM(SORT_PARENT, "parent", sort_parent),
 	DIM(SORT_CPU, "cpu", sort_cpu),
+	DIM(SORT_SOCKET, "socket", sort_socket),
 	DIM(SORT_SRCLINE, "srcline", sort_srcline),
 	DIM(SORT_SRCFILE, "srcfile", sort_srcfile),
 	DIM(SORT_LOCAL_WEIGHT, "local_weight", sort_local_weight),
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 3c2a399..9d86a1f 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -172,6 +172,7 @@ enum sort_type {
 	SORT_SYM,
 	SORT_PARENT,
 	SORT_CPU,
+	SORT_SOCKET,
 	SORT_SRCLINE,
 	SORT_SRCFILE,
 	SORT_LOCAL_WEIGHT,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH RFC 03/10] perf,tools: support option --socket
  2015-08-18  9:25 [PATCH RFC 00/10] stat read during perf sampling kan.liang
  2015-08-18  9:25 ` [PATCH RFC 01/10] perf,tools: open event on evsel cpus and threads kan.liang
  2015-08-18  9:25 ` [PATCH RFC 02/10] perf,tools: Support new sort type --socket kan.liang
@ 2015-08-18  9:25 ` kan.liang
  2015-08-18 17:36   ` Stephane Eranian
  2015-08-20  9:30   ` Jiri Olsa
  2015-08-18  9:25 ` [PATCH RFC 04/10] perf,tools: Add 'N' event/group modifier kan.liang
                   ` (6 subsequent siblings)
  9 siblings, 2 replies; 29+ messages in thread
From: kan.liang @ 2015-08-18  9:25 UTC (permalink / raw)
  To: acme
  Cc: a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel,
	Kan Liang

From: Kan Liang <kan.liang@intel.com>

Introduce a new option for perf report to show the event information in
the same socket together.
When this option is set, perf report will force to sort by socket.

$ perf report --stdio --socket
 # To display the perf.data header info, please use
 --header/--header-only options.
 #
 #
 # Total Lost Samples: 0
 #
 # Samples: 686  of event 'cycles'
 # Event count (approx.): 349215462
 #
 #
 # Socket: 0
 #
 # Overhead  Command    Shared Object     Symbol
 # ........  .........  ................
 .................................
 #
    97.05%  test       test              [.] plusB_c
     0.98%  test       test              [.] plusA_c
     0.16%  test       [kernel.vmlinux]  [k] add_mm_counter_fast
     0.15%  swapper    [kernel.vmlinux]  [k] note_gp_changes
     0.15%  perf       [kernel.vmlinux]  [k] unmap_single_vma
     0.06%  perf       [kernel.vmlinux]  [k] run_timer_softirq
     0.00%  swapper    [kernel.vmlinux]  [k] native_write_msr
 #
 # Socket: 1
 #
 # Overhead  Command    Shared Object     Symbol
 # ........  .........  ................
 .................................
 #
     0.93%  perf       [kernel.vmlinux]  [k] smp_call_function_single
     0.19%  perf       [kernel.vmlinux]  [k] page_fault
     0.19%  swapper    [kernel.vmlinux]  [k] pm_qos_request
     0.12%  rcu_sched  [kernel.vmlinux]  [k]
dyntick_save_progress_counter
     0.00%  swapper    [kernel.vmlinux]  [k] wake_up_process
     0.00%  swapper    [kernel.vmlinux]  [k] __do_softirq
     0.00%  swapper    [kernel.vmlinux]  [k] run_timer_softirq
     0.00%  swapper    [kernel.vmlinux]  [k] native_write_msr
     0.00%  perf       [kernel.vmlinux]  [k] native_write_msr

Signed-off-by: Kan Liang <kan.liang@intel.com>
---
 tools/perf/Documentation/perf-report.txt |  3 +++
 tools/perf/builtin-diff.c                |  2 +-
 tools/perf/builtin-report.c              | 17 ++++++++++++++++-
 tools/perf/builtin-top.c                 |  2 +-
 tools/perf/ui/stdio/hist.c               | 14 ++++++++++----
 tools/perf/util/cpumap.c                 | 14 ++++++++++++++
 tools/perf/util/cpumap.h                 |  2 ++
 tools/perf/util/hist.h                   |  2 +-
 tools/perf/util/sort.c                   |  6 ++++++
 tools/perf/util/symbol.h                 |  1 +
 10 files changed, 55 insertions(+), 8 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index ca76b0d..49a42e4 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -293,6 +293,9 @@ OPTIONS
 --group::
 	Show event group information together.
 
+--socket::
+	Show event information in the same socket together.
+
 --demangle::
 	Demangle symbol names to human readable form. It's enabled by default,
 	disable with --no-demangle.
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 0b180a8..0dfd91f 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -667,7 +667,7 @@ static void hists__process(struct hists *hists)
 	hists__precompute(hists);
 	hists__output_resort(hists, NULL);
 
-	hists__fprintf(hists, true, 0, 0, 0, stdout);
+	hists__fprintf(hists, true, 0, 0, 0, stdout, -1);
 }
 
 static void data__fprintf(void)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 62b285e..6fdf9f4 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -345,7 +345,17 @@ static int perf_evlist__tty_browse_hists(struct perf_evlist *evlist,
 			continue;
 
 		hists__fprintf_nr_sample_events(hists, rep, evname, stdout);
-		hists__fprintf(hists, true, 0, 0, rep->min_percent, stdout);
+
+		if (symbol_conf.socket) {
+			int i;
+
+			for (i = 0; i < max_socket_num; i++) {
+				fprintf(stdout, "#\n# Socket: %d\n#\n", i);
+				hists__fprintf(hists, true, 0, 0, rep->min_percent, stdout, i);
+			}
+		} else
+			hists__fprintf(hists, true, 0, 0, rep->min_percent, stdout, -1);
+
 		fprintf(stdout, "\n\n");
 	}
 
@@ -724,6 +734,8 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 		    "Show a column with the sum of periods"),
 	OPT_BOOLEAN(0, "group", &symbol_conf.event_group,
 		    "Show event group information together"),
+	OPT_BOOLEAN(0, "socket", &symbol_conf.socket,
+		    "Show event information in the same socket together"),
 	OPT_CALLBACK_NOOPT('b', "branch-stack", &branch_mode, "",
 		    "use branch records for per branch histogram filling",
 		    parse_branch_mode),
@@ -909,6 +921,9 @@ repeat:
 
 	sort__setup_elide(stdout);
 
+	if (symbol_conf.socket)
+		set_max_socket_num();
+
 	ret = __cmd_report(&report);
 	if (ret == K_SWITCH_INPUT_DATA) {
 		perf_session__delete(session);
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index bfe24f1..deffe44 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -295,7 +295,7 @@ static void perf_top__print_sym_table(struct perf_top *top)
 	hists__output_recalc_col_len(hists, top->print_entries - printed);
 	putchar('\n');
 	hists__fprintf(hists, false, top->print_entries - printed, win_width,
-		       top->min_percent, stdout);
+		       top->min_percent, stdout, -1);
 }
 
 static void prompt_integer(int *target, const char *msg)
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index dfcbc90..2f512b8 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -323,7 +323,8 @@ static int hist_entry__snprintf(struct hist_entry *he, struct perf_hpp *hpp)
 		return 0;
 
 	perf_hpp__for_each_format(fmt) {
-		if (perf_hpp__should_skip(fmt))
+		if (perf_hpp__should_skip(fmt) ||
+		    !strcmp(fmt->name, "SOCKET"))
 			continue;
 
 		/*
@@ -371,7 +372,7 @@ static int hist_entry__fprintf(struct hist_entry *he, size_t size,
 }
 
 size_t hists__fprintf(struct hists *hists, bool show_header, int max_rows,
-		      int max_cols, float min_pcnt, FILE *fp)
+		      int max_cols, float min_pcnt, FILE *fp, int socket)
 {
 	struct perf_hpp_fmt *fmt;
 	struct rb_node *nd;
@@ -402,7 +403,8 @@ size_t hists__fprintf(struct hists *hists, bool show_header, int max_rows,
 	fprintf(fp, "# ");
 
 	perf_hpp__for_each_format(fmt) {
-		if (perf_hpp__should_skip(fmt))
+		if (perf_hpp__should_skip(fmt) ||
+		    !strcmp(fmt->name, "SOCKET"))
 			continue;
 
 		if (!first)
@@ -428,7 +430,8 @@ size_t hists__fprintf(struct hists *hists, bool show_header, int max_rows,
 	perf_hpp__for_each_format(fmt) {
 		unsigned int i;
 
-		if (perf_hpp__should_skip(fmt))
+		if (perf_hpp__should_skip(fmt) ||
+		    !strcmp(fmt->name, "SOCKET"))
 			continue;
 
 		if (!first)
@@ -465,6 +468,9 @@ print_entries:
 		if (h->filtered)
 			continue;
 
+		if ((socket >= 0) && (cpu__get_socket(h->cpu) != socket))
+			continue;
+
 		percent = hist_entry__get_percent_limit(h);
 		if (percent < min_pcnt)
 			continue;
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 7f25e6c..ae03426 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -407,6 +407,20 @@ out:
 		pr_err("Failed to read max cpus, using default of %d\n", max_cpu_num);
 }
 
+void set_max_socket_num(void)
+{
+	int cpu, socket;
+
+	set_max_cpu_num();
+
+	max_socket_num = 1;
+	for (cpu = 0; cpu < max_cpu_num; cpu++) {
+		socket = cpu__get_socket(cpu);
+		if (max_socket_num < (socket + 1))
+			max_socket_num = socket + 1;
+	}
+}
+
 /* Determine highest possible node in the system for sparse allocation */
 static void set_max_node_num(void)
 {
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index effb56e..094edd9 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -56,9 +56,11 @@ static inline bool cpu_map__empty(const struct cpu_map *map)
 
 int max_cpu_num;
 int max_node_num;
+int max_socket_num;
 int *cpunode_map;
 
 int cpu__setup_cpunode_map(void);
+void set_max_socket_num(void);
 
 static inline int cpu__max_node(void)
 {
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index af80fb5..b188a62 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -139,7 +139,7 @@ void events_stats__inc(struct events_stats *stats, u32 type);
 size_t events_stats__fprintf(struct events_stats *stats, FILE *fp);
 
 size_t hists__fprintf(struct hists *hists, bool show_header, int max_rows,
-		      int max_cols, float min_pcnt, FILE *fp);
+		      int max_cols, float min_pcnt, FILE *fp, int socket);
 size_t perf_evlist__fprintf_nr_events(struct perf_evlist *evlist, FILE *fp);
 
 void hists__filter_by_dso(struct hists *hists);
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 245e254..70f0526 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1930,6 +1930,12 @@ int setup_sorting(void)
 			return err;
 	}
 
+	if (symbol_conf.socket) {
+		err = sort_dimension__add("socket");
+		if (err < 0)
+			return err;
+	}
+
 	reset_dimensions();
 
 	/*
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index a4cde92..7c153c7 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -101,6 +101,7 @@ struct symbol_conf {
 			annotate_asm_raw,
 			annotate_src,
 			event_group,
+			socket,
 			demangle,
 			demangle_kernel,
 			filter_relative,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH RFC 04/10] perf,tools: Add 'N' event/group modifier
  2015-08-18  9:25 [PATCH RFC 00/10] stat read during perf sampling kan.liang
                   ` (2 preceding siblings ...)
  2015-08-18  9:25 ` [PATCH RFC 03/10] perf,tools: support option --socket kan.liang
@ 2015-08-18  9:25 ` kan.liang
  2015-08-18  9:25 ` [PATCH RFC 05/10] perf,tools: Enable statistic read for perf record kan.liang
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 29+ messages in thread
From: kan.liang @ 2015-08-18  9:25 UTC (permalink / raw)
  To: acme
  Cc: a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel,
	Kan Liang

From: Kan Liang <kan.liang@intel.com>

Add a new event/group modifier 'N' to mark the event which will be read
counter statistics during sampling.

Signed-off-by: Kan Liang <kan.liang@intel.com>
---
 tools/perf/util/evsel.h        | 1 +
 tools/perf/util/parse-events.c | 8 +++++++-
 tools/perf/util/parse-events.l | 2 +-
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 93ac6b1..e5ced73 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -110,6 +110,7 @@ struct perf_evsel {
 	int			exclude_GH;
 	int			nr_members;
 	int			sample_read;
+	int			stat_read;
 	unsigned long		*per_pkg_mask;
 	struct perf_evsel	*leader;
 	char			*group_name;
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index d826e6f..a322b60 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -821,6 +821,7 @@ struct event_modifier {
 	int precise;
 	int exclude_GH;
 	int sample_read;
+	int stat_read;
 	int pinned;
 };
 
@@ -835,6 +836,7 @@ static int get_event_modifier(struct event_modifier *mod, char *str,
 	int eI = evsel ? evsel->attr.exclude_idle : 0;
 	int precise = evsel ? evsel->attr.precise_ip : 0;
 	int sample_read = 0;
+	int stat_read = 0;
 	int pinned = evsel ? evsel->attr.pinned : 0;
 
 	int exclude = eu | ek | eh;
@@ -872,6 +874,8 @@ static int get_event_modifier(struct event_modifier *mod, char *str,
 				eG = 1;
 		} else if (*str == 'S') {
 			sample_read = 1;
+		} else if (*str == 'N') {
+			stat_read = 1;
 		} else if (*str == 'D') {
 			pinned = 1;
 		} else
@@ -902,6 +906,7 @@ static int get_event_modifier(struct event_modifier *mod, char *str,
 	mod->precise = precise;
 	mod->exclude_GH = exclude_GH;
 	mod->sample_read = sample_read;
+	mod->stat_read = stat_read;
 	mod->pinned = pinned;
 
 	return 0;
@@ -916,7 +921,7 @@ static int check_modifier(char *str)
 	char *p = str;
 
 	/* The sizeof includes 0 byte as well. */
-	if (strlen(str) > (sizeof("ukhGHpppSDI") - 1))
+	if (strlen(str) > (sizeof("ukhGHpppSNDI") - 1))
 		return -1;
 
 	while (*p) {
@@ -955,6 +960,7 @@ int parse_events__modifier_event(struct list_head *list, char *str, bool add)
 		evsel->attr.exclude_idle   = mod.eI;
 		evsel->exclude_GH          = mod.exclude_GH;
 		evsel->sample_read         = mod.sample_read;
+		evsel->stat_read           = mod.stat_read;
 
 		if (perf_evsel__is_group_leader(evsel))
 			evsel->attr.pinned = mod.pinned;
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 936d566..de2dd4b 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -122,7 +122,7 @@ num_raw_hex	[a-fA-F0-9]+
 name		[a-zA-Z_*?][a-zA-Z0-9_*?.]*
 name_minus	[a-zA-Z_*?][a-zA-Z0-9\-_*?.]*
 /* If you add a modifier you need to update check_modifier() */
-modifier_event	[ukhpGHSDI]+
+modifier_event	[ukhpGHSNDI]+
 modifier_bp	[rwx]{1,3}
 
 %%
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH RFC 05/10] perf,tools: Enable statistic read for perf record
  2015-08-18  9:25 [PATCH RFC 00/10] stat read during perf sampling kan.liang
                   ` (3 preceding siblings ...)
  2015-08-18  9:25 ` [PATCH RFC 04/10] perf,tools: Add 'N' event/group modifier kan.liang
@ 2015-08-18  9:25 ` kan.liang
  2015-08-20  9:39   ` Jiri Olsa
  2015-08-20  9:41   ` Jiri Olsa
  2015-08-18  9:25 ` [PATCH RFC 06/10] perf,tools: New RECORD type PERF_RECORD_STAT_READ kan.liang
                   ` (4 subsequent siblings)
  9 siblings, 2 replies; 29+ messages in thread
From: kan.liang @ 2015-08-18  9:25 UTC (permalink / raw)
  To: acme
  Cc: a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel,
	Kan Liang

From: Kan Liang <kan.liang@intel.com>

Using 'N' event/group modifier to specify the event which want to read
counter statistics during sampling. For this event, the sampling will
be disabled unless it's group leader.
The 'N' modifier only be available for in system-wide/CPU mode. If a
group is marked as 'N' modifier, only group members read counter
statistics. Group leader always do sampling.
The other limit is that the first event cannot be stat read event, since
many tools special handle first event.

Signed-off-by: Kan Liang <kan.liang@intel.com>
---
 tools/perf/Documentation/perf-list.txt |  5 +++++
 tools/perf/builtin-record.c            | 31 +++++++++++++++++++++++++++++++
 tools/perf/util/evlist.c               |  3 +++
 3 files changed, 39 insertions(+)

diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
index bada893..133dc2c 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -31,6 +31,11 @@ counted. The following modifiers exist:
  H - host counting (not in KVM guests)
  p - precise level
  S - read sample value (PERF_SAMPLE_READ)
+ N - read counter statistics during sampling (Can be used to read
+     counter statistics of PMU_A event when PMU_B events are sampling.
+     For example, getting memory bandwidth by uncore events during the
+     CPU PMU events run time)
+     (Only available for group members or non-first event in system-wide/CPU mode)
  D - pin the event to the PMU
 
 The 'p' modifier can be used for specifying how precise the instruction
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index a0178bf..5b09318 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -273,11 +273,42 @@ static int record__open(struct record *rec)
 	struct perf_evlist *evlist = rec->evlist;
 	struct perf_session *session = rec->session;
 	struct record_opts *opts = &rec->opts;
+	struct perf_evsel *first = perf_evlist__first(evlist);
+	struct perf_event_attr *attr;
 	int rc = 0;
 
 	perf_evlist__config(evlist, opts);
 
 	evlist__for_each(evlist, pos) {
+
+		if (pos->stat_read) {
+			if (!target__has_cpu(&opts->target)) {
+				pos->stat_read = 0;
+				ui__warning("Statistics read only available "
+					    "on system-wide/CPU mode. "
+					    "Remove :N modifier for event %s\n",
+					    pos->name);
+				goto out;
+			}
+			/* Don't do stat read for Group leader */
+			if ((pos->leader == pos) && (pos->leader->nr_members > 1))
+				pos->stat_read = 0;
+			else {
+				if (first == pos) {
+					pos->stat_read = 0;
+					ui__warning("The first event cannot "
+						    "be stat read event\n");
+					goto out;
+				}
+				attr = &pos->attr;
+				attr->read_format = PERF_FORMAT_TOTAL_TIME_ENABLED |
+						    PERF_FORMAT_TOTAL_TIME_RUNNING;
+				attr->sample_freq   = 0;
+				attr->sample_period = 0;
+				attr->sample_type = 0;
+				pos->sample_size = 0;
+			}
+		}
 try_again:
 		if (perf_evsel__open(pos, pos->cpus, pos->threads) < 0) {
 			if (perf_evsel__fallback(pos, errno, msg, sizeof(msg))) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 373f65b..ca7bf8d 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -854,6 +854,9 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 		if (evsel->system_wide && thread)
 			continue;
 
+		if (evsel->stat_read)
+			continue;
+
 		fd = FD(evsel, cpu, thread);
 
 		if (*output == -1) {
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH RFC 06/10] perf,tools: New RECORD type PERF_RECORD_STAT_READ
  2015-08-18  9:25 [PATCH RFC 00/10] stat read during perf sampling kan.liang
                   ` (4 preceding siblings ...)
  2015-08-18  9:25 ` [PATCH RFC 05/10] perf,tools: Enable statistic read for perf record kan.liang
@ 2015-08-18  9:25 ` kan.liang
  2015-08-20  9:49   ` Jiri Olsa
  2015-08-18  9:25 ` [PATCH RFC 07/10] perf,tools: record counter statistics during sampling kan.liang
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 29+ messages in thread
From: kan.liang @ 2015-08-18  9:25 UTC (permalink / raw)
  To: acme
  Cc: a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel,
	Kan Liang

From: Kan Liang <kan.liang@intel.com>

Introduce a new user RECORD type PERF_RECORD_STAT_READ to store the
stat event read result.

Signed-off-by: Kan Liang <kan.liang@intel.com>
---
 tools/perf/util/event.c   |  1 +
 tools/perf/util/event.h   | 10 ++++++++++
 tools/perf/util/session.c | 15 +++++++++++++++
 tools/perf/util/tool.h    |  1 +
 4 files changed, 27 insertions(+)

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 7ff6127..601f4be 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -37,6 +37,7 @@ static const char *perf_event__names[] = {
 	[PERF_RECORD_AUXTRACE_INFO]		= "AUXTRACE_INFO",
 	[PERF_RECORD_AUXTRACE]			= "AUXTRACE",
 	[PERF_RECORD_AUXTRACE_ERROR]		= "AUXTRACE_ERROR",
+	[PERF_RECORD_STAT_READ]			= "STAT_READ",
 };
 
 const char *perf_event__name(unsigned int id)
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index f729df5..f6932cf 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -226,6 +226,7 @@ enum perf_user_event_type { /* above any possible kernel type */
 	PERF_RECORD_AUXTRACE_INFO		= 70,
 	PERF_RECORD_AUXTRACE			= 71,
 	PERF_RECORD_AUXTRACE_ERROR		= 72,
+	PERF_RECORD_STAT_READ			= 73,
 	PERF_RECORD_HEADER_MAX
 };
 
@@ -355,6 +356,14 @@ struct context_switch_event {
 	u32 next_prev_tid;
 };
 
+struct stat_read_event {
+	struct perf_event_header header;
+	u64	value;		/* counter value delta */
+	u32	cpu;
+	u32	pos_id;		/* event position in evlist */
+	u64	time;		/* time stamp */
+};
+
 union perf_event {
 	struct perf_event_header	header;
 	struct mmap_event		mmap;
@@ -377,6 +386,7 @@ union perf_event {
 	struct aux_event		aux;
 	struct itrace_start_event	itrace_start;
 	struct context_switch_event	context_switch;
+	struct stat_read_event		stat_read;
 };
 
 void perf_event__print_totals(void);
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 18722e7..1547970 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -337,6 +337,8 @@ void perf_tool__fill_defaults(struct perf_tool *tool)
 		tool->context_switch = perf_event__process_switch;
 	if (tool->read == NULL)
 		tool->read = process_event_sample_stub;
+	if (tool->stat_read == NULL)
+		tool->stat_read = process_build_id_stub;
 	if (tool->throttle == NULL)
 		tool->throttle = process_event_stub;
 	if (tool->unthrottle == NULL)
@@ -440,6 +442,16 @@ static void perf_event__task_swap(union perf_event *event, bool sample_id_all)
 		swap_sample_id_all(event, &event->fork + 1);
 }
 
+static void perf_event__stat_read_swap(union perf_event *event,
+				       bool sample_id_all __maybe_unused)
+{
+
+	event->stat_read.value = bswap_64(event->stat_read.value);
+	event->stat_read.cpu = bswap_32(event->stat_read.cpu);
+	event->stat_read.pos_id = bswap_32(event->stat_read.pos_id);
+	event->stat_read.time = bswap_64(event->stat_read.time);
+}
+
 static void perf_event__read_swap(union perf_event *event, bool sample_id_all)
 {
 	event->read.pid		 = bswap_32(event->read.pid);
@@ -642,6 +654,7 @@ static perf_event__swap_op perf_event__swap_ops[] = {
 	[PERF_RECORD_EXIT]		  = perf_event__task_swap,
 	[PERF_RECORD_LOST]		  = perf_event__all64_swap,
 	[PERF_RECORD_READ]		  = perf_event__read_swap,
+	[PERF_RECORD_STAT_READ]		  = perf_event__stat_read_swap,
 	[PERF_RECORD_THROTTLE]		  = perf_event__throttle_swap,
 	[PERF_RECORD_UNTHROTTLE]	  = perf_event__throttle_swap,
 	[PERF_RECORD_SAMPLE]		  = perf_event__all64_swap,
@@ -1191,6 +1204,8 @@ static s64 perf_session__process_user_event(struct perf_session *session,
 	case PERF_RECORD_AUXTRACE_ERROR:
 		perf_session__auxtrace_error_inc(session, event);
 		return tool->auxtrace_error(tool, event, session);
+	case PERF_RECORD_STAT_READ:
+		return tool->stat_read(tool, event, session);
 	default:
 		return -EINVAL;
 	}
diff --git a/tools/perf/util/tool.h b/tools/perf/util/tool.h
index cab8cc2..3a002a2 100644
--- a/tools/perf/util/tool.h
+++ b/tools/perf/util/tool.h
@@ -57,6 +57,7 @@ struct perf_tool {
 			auxtrace_info,
 			auxtrace_error;
 	event_op3	auxtrace;
+	event_op2	stat_read;
 	bool		ordered_events;
 	bool		ordering_requires_timestamps;
 };
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH RFC 07/10] perf,tools: record counter statistics during sampling
  2015-08-18  9:25 [PATCH RFC 00/10] stat read during perf sampling kan.liang
                   ` (5 preceding siblings ...)
  2015-08-18  9:25 ` [PATCH RFC 06/10] perf,tools: New RECORD type PERF_RECORD_STAT_READ kan.liang
@ 2015-08-18  9:25 ` kan.liang
  2015-08-18  9:25 ` [PATCH RFC 08/10] perf,tools: option to set stat read interval kan.liang
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 29+ messages in thread
From: kan.liang @ 2015-08-18  9:25 UTC (permalink / raw)
  To: acme
  Cc: a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel,
	Kan Liang

From: Kan Liang <kan.liang@intel.com>

Reading at the start and the end of sampling, and save the counter
statistics value to PERF_RECORD_STAT_READ.
The absolate value from counter will be stored in prev_raw_counts. The
detla value will be caculated and saved in PERF_RECORD_STAT_READ.

Signed-off-by: Kan Liang <kan.liang@intel.com>
---
 tools/perf/builtin-record.c | 76 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 75 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 5b09318..0e5e4c0 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -29,6 +29,7 @@
 #include "util/data.h"
 #include "util/auxtrace.h"
 #include "util/parse-branch-options.h"
+#include "util/stat.h"
 
 #include <unistd.h>
 #include <sched.h>
@@ -266,6 +267,30 @@ int auxtrace_record__snapshot_start(struct auxtrace_record *itr __maybe_unused)
 
 #endif
 
+static void
+free_prev_stat_counts(struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel;
+
+	evlist__for_each(evlist, evsel) {
+		if (evsel->prev_raw_counts != NULL)
+			perf_evsel__free_prev_raw_counts(evsel);
+	}
+}
+
+static int perf_evlist__alloc_prev_stat(struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel;
+
+	evlist__for_each(evlist, evsel) {
+		if (evsel->stat_read &&
+		    (perf_evsel__alloc_prev_raw_counts(evsel, cpu_map__nr(evsel->cpus), 1) < 0))
+				return -ENOMEM;
+	}
+
+	return 0;
+}
+
 static int record__open(struct record *rec)
 {
 	char msg[512];
@@ -352,6 +377,9 @@ try_again:
 		goto out;
 	}
 
+	if (perf_evlist__alloc_prev_stat(evlist))
+		goto out;
+
 	session->evlist = evlist;
 	perf_session__set_id_hdr_size(session);
 out:
@@ -427,6 +455,50 @@ static struct perf_event_header finished_round_event = {
 	.type = PERF_RECORD_FINISHED_ROUND,
 };
 
+static void process_stat_read_event(struct record *rec, int id,
+				    struct perf_evsel *pos,
+				    struct perf_counts_values *count,
+				    u32 pos_id)
+{
+	union perf_event ev;
+	struct perf_counts_values tmp;
+
+	memset(&ev, 0, sizeof(ev));
+	ev.stat_read.header.type = PERF_RECORD_STAT_READ;
+	ev.stat_read.header.size = sizeof(ev.stat_read);
+	ev.stat_read.cpu = pos->cpus->map[id];
+	ev.stat_read.pos_id = pos_id;
+
+	BUG_ON(pos->prev_raw_counts == NULL);
+	tmp.val = perf_counts(pos->prev_raw_counts, id, 0)->val;
+	if (tmp.val)
+		ev.stat_read.value = count->val - tmp.val;
+	perf_counts(pos->prev_raw_counts, id, 0)->val = count->val;
+
+	ev.stat_read.time = rdclock();
+
+	record__write(rec, &ev, sizeof(ev.stat_read));
+}
+
+static void perf_read_counter(struct record *rec)
+{
+	struct perf_evlist *evlist = rec->evlist;
+	struct perf_counts_values count;
+	struct perf_evsel *pos;
+	int cpu;
+	u32 pos_id = 0;
+
+	evlist__for_each(evlist, pos) {
+		for (cpu = 0; cpu < cpu_map__nr(pos->cpus); cpu++) {
+			if (pos->stat_read &&
+			    !perf_evsel__read(pos, cpu, 0, &count)) {
+				process_stat_read_event(rec, cpu, pos, &count, pos_id);
+			}
+		}
+		pos_id++;
+	}
+}
+
 static int record__mmap_read_all(struct record *rec)
 {
 	u64 bytes_written = rec->bytes_written;
@@ -664,6 +736,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		perf_evlist__enable(rec->evlist);
 	}
 
+	perf_read_counter(rec);
 	auxtrace_snapshot_enabled = 1;
 	for (;;) {
 		int hits = rec->samples;
@@ -708,6 +781,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		 */
 		if (done && !disabled && !target__none(&opts->target)) {
 			auxtrace_snapshot_enabled = 0;
+			perf_read_counter(rec);
 			perf_evlist__disable(rec->evlist);
 			disabled = true;
 		}
@@ -776,7 +850,7 @@ out_child:
 			perf_data_file__size(file) / 1024.0 / 1024.0,
 			file->path, samples);
 	}
-
+	free_prev_stat_counts(rec->evlist);
 out_delete_session:
 	perf_session__delete(session);
 	return status;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH RFC 08/10] perf,tools: option to set stat read interval
  2015-08-18  9:25 [PATCH RFC 00/10] stat read during perf sampling kan.liang
                   ` (6 preceding siblings ...)
  2015-08-18  9:25 ` [PATCH RFC 07/10] perf,tools: record counter statistics during sampling kan.liang
@ 2015-08-18  9:25 ` kan.liang
  2015-08-18  9:25 ` [PATCH RFC 09/10] perf,tools: don't validate non-sample event kan.liang
  2015-08-18  9:25 ` [PATCH RFC 10/10] perf,tools: Show STAT_READ in perf report kan.liang
  9 siblings, 0 replies; 29+ messages in thread
From: kan.liang @ 2015-08-18  9:25 UTC (permalink / raw)
  To: acme
  Cc: a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel,
	Kan Liang

From: Kan Liang <kan.liang@intel.com>

Add a timer to read stat event regularly. Option --stat-read-interval
can be used to set the interval.
Only read stat event at the beginning and the end of the test is not
enough. Sometimes, we need fine granularity to do sophisticated
analysis. For example, 10-20ms is required to do sophisticated bandwidth
analysis.

Signed-off-by: Kan Liang <kan.liang@intel.com>
---
 tools/perf/Documentation/perf-record.txt |  7 +++++++
 tools/perf/builtin-record.c              | 31 +++++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 347a273..00dc000 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -304,6 +304,13 @@ This option sets the time out limit. The default value is 500 ms.
 Record context switch events i.e. events of type PERF_RECORD_SWITCH or
 PERF_RECORD_SWITCH_CPU_WIDE.
 
+--stat-read-interval::
+Sets the interval to do stat read regularly. This option is valid only with
+stat read event. This option is disabled by default. It means that counter
+only be read at the beginning and end of the test.
+But sometimes, we need fine granularity to do sophisticated analysis.
+For example, 10-20ms is required to do sophisticated bandwidth analysis.
+
 SEE ALSO
 --------
 linkperf:perf-stat[1], linkperf:perf-list[1]
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 0e5e4c0..ebf37e3 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -572,6 +572,14 @@ static void workload_exec_failed_signal(int signo __maybe_unused,
 
 static void snapshot_sig_handler(int sig);
 
+static unsigned int interval;
+struct record *g_rec;
+
+static void perf_read_alarm(int sig __maybe_unused)
+{
+	perf_read_counter(g_rec);
+}
+
 static int __cmd_record(struct record *rec, int argc, const char **argv)
 {
 	int err;
@@ -584,6 +592,9 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	struct perf_data_file *file = &rec->file;
 	struct perf_session *session;
 	bool disabled = false, draining = false;
+	static timer_t timerid;
+	struct itimerspec timeout;
+	struct sigevent sigevent;
 	int fd;
 
 	rec->progname = argv[0];
@@ -597,6 +608,19 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	else
 		signal(SIGUSR2, SIG_IGN);
 
+	if (interval) {
+		signal(SIGALRM, perf_read_alarm);
+		g_rec = rec;
+
+		/* Create a timer. */
+		sigevent.sigev_notify = SIGEV_SIGNAL;
+		sigevent.sigev_signo = SIGALRM;
+		timer_create(CLOCK_REALTIME, &sigevent, &timerid);
+		memset(&timeout, 0, sizeof(timeout));
+		timeout.it_value.tv_nsec = interval * 1000000ULL;
+		timeout.it_interval.tv_nsec = interval * 1000000ULL;
+	}
+
 	session = perf_session__new(file, false, tool);
 	if (session == NULL) {
 		pr_err("Perf session creation failed.\n");
@@ -737,6 +761,9 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	}
 
 	perf_read_counter(rec);
+	if (interval)
+		timer_settime(timerid, 0, &timeout, NULL);
+
 	auxtrace_snapshot_enabled = 1;
 	for (;;) {
 		int hits = rec->samples;
@@ -781,6 +808,8 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		 */
 		if (done && !disabled && !target__none(&opts->target)) {
 			auxtrace_snapshot_enabled = 0;
+			if (interval)
+				timer_delete(timerid);
 			perf_read_counter(rec);
 			perf_evlist__disable(rec->evlist);
 			disabled = true;
@@ -1187,6 +1216,8 @@ struct option __record_options[] = {
 			"per thread proc mmap processing timeout in ms"),
 	OPT_BOOLEAN(0, "switch-events", &record.opts.record_switch_events,
 		    "Record context switch events"),
+	OPT_UINTEGER(0, "stat-read-interval", &interval,
+		    "Stat read counter at regular interval in ms (>= 10)"),
 	OPT_END()
 };
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH RFC 09/10] perf,tools: don't validate non-sample event
  2015-08-18  9:25 [PATCH RFC 00/10] stat read during perf sampling kan.liang
                   ` (7 preceding siblings ...)
  2015-08-18  9:25 ` [PATCH RFC 08/10] perf,tools: option to set stat read interval kan.liang
@ 2015-08-18  9:25 ` kan.liang
  2015-08-18  9:25 ` [PATCH RFC 10/10] perf,tools: Show STAT_READ in perf report kan.liang
  9 siblings, 0 replies; 29+ messages in thread
From: kan.liang @ 2015-08-18  9:25 UTC (permalink / raw)
  To: acme
  Cc: a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel,
	Kan Liang

From: Kan Liang <kan.liang@intel.com>

Stat event (:N) doesn't do sampling. So perf tool should not validate
its sample_type, read_format and sample_id_all.

Signed-off-by: Kan Liang <kan.liang@intel.com>
---
 tools/perf/util/evlist.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index ca7bf8d..fc2da23 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1249,6 +1249,8 @@ bool perf_evlist__valid_sample_type(struct perf_evlist *evlist)
 		return false;
 
 	evlist__for_each(evlist, pos) {
+		if (!pos->attr.sample_type)
+			continue;
 		if (pos->id_pos != evlist->id_pos ||
 		    pos->is_pos != evlist->is_pos)
 			return false;
@@ -1293,6 +1295,8 @@ bool perf_evlist__valid_read_format(struct perf_evlist *evlist)
 	u64 sample_type = first->attr.sample_type;
 
 	evlist__for_each(evlist, pos) {
+		if (!pos->attr.sample_type)
+			continue;
 		if (read_format != pos->attr.read_format)
 			return false;
 	}
@@ -1350,6 +1354,8 @@ bool perf_evlist__valid_sample_id_all(struct perf_evlist *evlist)
 	struct perf_evsel *first = perf_evlist__first(evlist), *pos = first;
 
 	evlist__for_each_continue(evlist, pos) {
+		if (!pos->attr.sample_type)
+			continue;
 		if (first->attr.sample_id_all != pos->attr.sample_id_all)
 			return false;
 	}
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH RFC 10/10] perf,tools: Show STAT_READ in perf report
  2015-08-18  9:25 [PATCH RFC 00/10] stat read during perf sampling kan.liang
                   ` (8 preceding siblings ...)
  2015-08-18  9:25 ` [PATCH RFC 09/10] perf,tools: don't validate non-sample event kan.liang
@ 2015-08-18  9:25 ` kan.liang
  9 siblings, 0 replies; 29+ messages in thread
From: kan.liang @ 2015-08-18  9:25 UTC (permalink / raw)
  To: acme
  Cc: a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel,
	Kan Liang

From: Kan Liang <kan.liang@intel.com>

This patch parse the PERF_RECORD_STAT_READ in perf report.
With -D option, event name, CPU id and value will be dumped.
With --stdio option, the value will be shown in a new section
(Performance counter stats) between header and sample result.

Example 1:

 # perf record -e 'cycles,uncore_imc_0/cas_count_read/N'
--stat-read-interval 10 -a sleep 5
 [ perf record: Woken up 501 times to write data ]
 [ perf record: Captured and wrote 0.376 MB perf.data (816 samples) ]
 # perf report -D | tail
          SAMPLE events:        816
           MMAP2 events:       1787
  FINISHED_ROUND events:        330
       STAT_READ events:       1002
 cycles stats:
           TOTAL events:        816
          SAMPLE events:        816
 uncore_imc_0/cas_count_read/N stats:
           TOTAL events:       1002
       STAT_READ events:       1002

Example 2:
 $ perf record -e '{cycles,instructions}:N' -a sleep 2
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.247 MB perf.data (209 samples) ]

 $ perf report --stdio --group --socket
 # To display the perf.data header info, please use
 --header/--header-only options.
 #
 # Samples: 209  of event 'anon group { cycles, instructions }'
 # Event count (approx.): 41787924
 #
 # Socket: 0
 #
 # Performance counter stats:
 # instructions   2745005
 #
 #         Overhead  Command       Shared Object      Symbol
 # ................  ............  .................
 .................................
 #
     8.06%   0.00%  swapper       [kernel.vmlinux]   [k] ixgbe_read_reg
     2.03%   0.00%  sleep         [kernel.vmlinux]   [k]
 __rb_insert_augmented
     1.98%   0.00%  swapper       [kernel.vmlinux]   [k]
 run_timer_softirq

 # Socket: 1
 #
 # Performance counter stats:
 # instructions   6386942
 #
 #         Overhead  Command       Shared Object      Symbol
 # ................  ............  .................
 .................................
 #
    27.09%   0.00%  kworker/23:2  [kernel.vmlinux]   [k] delay_tsc
    17.53%   0.00%  perf          [kernel.vmlinux]   [k]
 smp_call_function_single
    13.62%   0.00%  kworker/23:2  [kernel.vmlinux]   [k] ixgbe_read_reg

Signed-off-by: Kan Liang <kan.liang@intel.com>
---
 tools/perf/builtin-report.c | 101 +++++++++++++++++++++++++++++++++++++++++---
 tools/perf/util/cpumap.c    |   4 +-
 tools/perf/util/cpumap.h    |   1 +
 3 files changed, 97 insertions(+), 9 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 6fdf9f4..4794a83 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -182,6 +182,34 @@ out_put:
 	return ret;
 }
 
+static int process_stat_read_event(struct perf_tool *tool __maybe_unused,
+				   union perf_event *event,
+				   struct perf_session *session)
+{
+	struct perf_evlist *evlist = session->evlist;
+	struct perf_evsel *evsel = perf_evlist__first(evlist);
+	int id = event->stat_read.pos_id;
+
+	while (id--) {
+		BUG_ON(evsel == NULL);
+		evsel = list_next_entry(evsel, node);
+	}
+
+	if ((evsel->counts == NULL) &&
+	    (perf_evsel__alloc_counts(evsel, max_cpu_num, 1) < 0))
+		return -ENOMEM;
+
+	perf_counts(evsel->counts, event->stat_read.cpu, 0)->val += event->stat_read.value;
+	hists__inc_nr_events(evsel__hists(evsel), event->header.type);
+
+	dump_printf(": %s CPU %d: value %" PRIu64 " time: %" PRIu64 "\n",
+		    evsel ? perf_evsel__name(evsel) : "FAIL",
+		    event->stat_read.cpu,
+		    event->stat_read.value,
+		    event->stat_read.time);
+	return 0;
+}
+
 static int process_read_event(struct perf_tool *tool,
 			      union perf_event *event,
 			      struct perf_sample *sample __maybe_unused,
@@ -329,6 +357,47 @@ static size_t hists__fprintf_nr_sample_events(struct hists *hists, struct report
 	return ret + fprintf(fp, "\n#\n");
 }
 
+static inline void _stat_fprintf(FILE *fp, struct perf_evsel *evsel,
+				 int socket, bool print_head)
+{
+	int i;
+	u64 stat_count = 0;
+
+	if (print_head)
+		fprintf(fp, "# Performance counter stats:\n");
+
+	for (i = 0; i < max_cpu_num; i++) {
+		if ((socket >= 0) && (cpu__get_socket(i) != socket))
+			continue;
+		stat_count += perf_counts(evsel->counts, i, 0)->val;
+	}
+
+	fprintf(fp, "# %s\t %" PRIu64, evsel->name, stat_count);
+}
+
+static void stat_fprintf(FILE *fp, struct perf_evsel *evsel, int socket)
+{
+	struct perf_evsel *leader = evsel->leader;
+	struct perf_evsel *pos;
+	bool print_head = true;
+
+	if (!symbol_conf.event_group &&
+	    (evsel->counts == NULL))
+		return;
+
+	if (symbol_conf.event_group) {
+		for_each_group_member(pos, leader) {
+			if (pos->counts == NULL)
+				continue;
+			_stat_fprintf(fp, pos, socket, print_head);
+			print_head = false;
+		}
+	} else
+		_stat_fprintf(fp, evsel, socket, print_head);
+
+	fprintf(fp, "\n#\n");
+}
+
 static int perf_evlist__tty_browse_hists(struct perf_evlist *evlist,
 					 struct report *rep,
 					 const char *help)
@@ -344,17 +413,23 @@ static int perf_evlist__tty_browse_hists(struct perf_evlist *evlist,
 		    !perf_evsel__is_group_leader(pos))
 			continue;
 
-		hists__fprintf_nr_sample_events(hists, rep, evname, stdout);
+		if (pos->counts == NULL)
+			hists__fprintf_nr_sample_events(hists, rep, evname, stdout);
 
 		if (symbol_conf.socket) {
 			int i;
 
 			for (i = 0; i < max_socket_num; i++) {
-				fprintf(stdout, "#\n# Socket: %d\n#\n", i);
-				hists__fprintf(hists, true, 0, 0, rep->min_percent, stdout, i);
+				fprintf(stdout, "\n# Socket: %d\n#\n", i);
+				stat_fprintf(stdout, pos, i);
+				if (pos->counts == NULL)
+					hists__fprintf(hists, true, 0, 0, rep->min_percent, stdout, i);
 			}
-		} else
-			hists__fprintf(hists, true, 0, 0, rep->min_percent, stdout, -1);
+		} else {
+			stat_fprintf(stdout, pos, -1);
+			if (pos->counts == NULL)
+				hists__fprintf(hists, true, 0, 0, rep->min_percent, stdout, -1);
+		}
 
 		fprintf(stdout, "\n\n");
 	}
@@ -611,6 +686,17 @@ parse_percent_limit(const struct option *opt, const char *str,
 	return 0;
 }
 
+static void
+free_stat_counts(struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel;
+
+	evlist__for_each(evlist, evsel) {
+		if (evsel->counts != NULL)
+			perf_evsel__free_counts(evsel);
+	}
+}
+
 int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 {
 	struct perf_session *session;
@@ -634,6 +720,7 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 			.fork		 = perf_event__process_fork,
 			.lost		 = perf_event__process_lost,
 			.read		 = process_read_event,
+			.stat_read	 = process_stat_read_event,
 			.attr		 = perf_event__process_attr,
 			.tracing_data	 = perf_event__process_tracing_data,
 			.build_id	 = perf_event__process_build_id,
@@ -920,18 +1007,20 @@ repeat:
 	}
 
 	sort__setup_elide(stdout);
-
+	set_max_cpu_num();
 	if (symbol_conf.socket)
 		set_max_socket_num();
 
 	ret = __cmd_report(&report);
 	if (ret == K_SWITCH_INPUT_DATA) {
+		free_stat_counts(session->evlist);
 		perf_session__delete(session);
 		goto repeat;
 	} else
 		ret = 0;
 
 error:
+	free_stat_counts(session->evlist);
 	perf_session__delete(session);
 	return ret;
 }
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index ae03426..9ce0f2a 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -380,7 +380,7 @@ out:
 }
 
 /* Determine highest possible cpu in the system for sparse allocation */
-static void set_max_cpu_num(void)
+void set_max_cpu_num(void)
 {
 	const char *mnt;
 	char path[PATH_MAX];
@@ -411,8 +411,6 @@ void set_max_socket_num(void)
 {
 	int cpu, socket;
 
-	set_max_cpu_num();
-
 	max_socket_num = 1;
 	for (cpu = 0; cpu < max_cpu_num; cpu++) {
 		socket = cpu__get_socket(cpu);
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 094edd9..ef2ca6e 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -61,6 +61,7 @@ int *cpunode_map;
 
 int cpu__setup_cpunode_map(void);
 void set_max_socket_num(void);
+void set_max_cpu_num(void);
 
 static inline int cpu__max_node(void)
 {
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 03/10] perf,tools: support option --socket
  2015-08-18  9:25 ` [PATCH RFC 03/10] perf,tools: support option --socket kan.liang
@ 2015-08-18 17:36   ` Stephane Eranian
  2015-08-20  9:30   ` Jiri Olsa
  1 sibling, 0 replies; 29+ messages in thread
From: Stephane Eranian @ 2015-08-18 17:36 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, mingo, Jiri Olsa,
	Namhyung Kim, ak, LKML

On Tue, Aug 18, 2015 at 2:25 AM,  <kan.liang@intel.com> wrote:
> From: Kan Liang <kan.liang@intel.com>
>
> Introduce a new option for perf report to show the event information in
> the same socket together.
> When this option is set, perf report will force to sort by socket.
>
I would say processor socket, otherwise it is a bit confusing. At
first, I thought
you were talking about network sockets.

> $ perf report --stdio --socket
>  # To display the perf.data header info, please use
>  --header/--header-only options.
>  #
>  #
>  # Total Lost Samples: 0
>  #
>  # Samples: 686  of event 'cycles'
>  # Event count (approx.): 349215462
>  #
>  #
>  # Socket: 0
>  #
Processor Socket:

>  # Overhead  Command    Shared Object     Symbol
>  # ........  .........  ................
>  .................................
>  #
>     97.05%  test       test              [.] plusB_c
>      0.98%  test       test              [.] plusA_c
>      0.16%  test       [kernel.vmlinux]  [k] add_mm_counter_fast
>      0.15%  swapper    [kernel.vmlinux]  [k] note_gp_changes
>      0.15%  perf       [kernel.vmlinux]  [k] unmap_single_vma
>      0.06%  perf       [kernel.vmlinux]  [k] run_timer_softirq
>      0.00%  swapper    [kernel.vmlinux]  [k] native_write_msr
>  #
>  # Socket: 1
>  #
>  # Overhead  Command    Shared Object     Symbol
>  # ........  .........  ................
>  .................................
>  #
>      0.93%  perf       [kernel.vmlinux]  [k] smp_call_function_single
>      0.19%  perf       [kernel.vmlinux]  [k] page_fault
>      0.19%  swapper    [kernel.vmlinux]  [k] pm_qos_request
>      0.12%  rcu_sched  [kernel.vmlinux]  [k]
> dyntick_save_progress_counter
>      0.00%  swapper    [kernel.vmlinux]  [k] wake_up_process
>      0.00%  swapper    [kernel.vmlinux]  [k] __do_softirq
>      0.00%  swapper    [kernel.vmlinux]  [k] run_timer_softirq
>      0.00%  swapper    [kernel.vmlinux]  [k] native_write_msr
>      0.00%  perf       [kernel.vmlinux]  [k] native_write_msr
>
> Signed-off-by: Kan Liang <kan.liang@intel.com>
> ---
>  tools/perf/Documentation/perf-report.txt |  3 +++
>  tools/perf/builtin-diff.c                |  2 +-
>  tools/perf/builtin-report.c              | 17 ++++++++++++++++-
>  tools/perf/builtin-top.c                 |  2 +-
>  tools/perf/ui/stdio/hist.c               | 14 ++++++++++----
>  tools/perf/util/cpumap.c                 | 14 ++++++++++++++
>  tools/perf/util/cpumap.h                 |  2 ++
>  tools/perf/util/hist.h                   |  2 +-
>  tools/perf/util/sort.c                   |  6 ++++++
>  tools/perf/util/symbol.h                 |  1 +
>  10 files changed, 55 insertions(+), 8 deletions(-)
>
> diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
> index ca76b0d..49a42e4 100644
> --- a/tools/perf/Documentation/perf-report.txt
> +++ b/tools/perf/Documentation/perf-report.txt
> @@ -293,6 +293,9 @@ OPTIONS
>  --group::
>         Show event group information together.
>
> +--socket::
> +       Show event information in the same socket together.
> +
>  --demangle::
>         Demangle symbol names to human readable form. It's enabled by default,
>         disable with --no-demangle.
> diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
> index 0b180a8..0dfd91f 100644
> --- a/tools/perf/builtin-diff.c
> +++ b/tools/perf/builtin-diff.c
> @@ -667,7 +667,7 @@ static void hists__process(struct hists *hists)
>         hists__precompute(hists);
>         hists__output_resort(hists, NULL);
>
> -       hists__fprintf(hists, true, 0, 0, 0, stdout);
> +       hists__fprintf(hists, true, 0, 0, 0, stdout, -1);
>  }
>
>  static void data__fprintf(void)
> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> index 62b285e..6fdf9f4 100644
> --- a/tools/perf/builtin-report.c
> +++ b/tools/perf/builtin-report.c
> @@ -345,7 +345,17 @@ static int perf_evlist__tty_browse_hists(struct perf_evlist *evlist,
>                         continue;
>
>                 hists__fprintf_nr_sample_events(hists, rep, evname, stdout);
> -               hists__fprintf(hists, true, 0, 0, rep->min_percent, stdout);
> +
> +               if (symbol_conf.socket) {
> +                       int i;
> +
> +                       for (i = 0; i < max_socket_num; i++) {
> +                               fprintf(stdout, "#\n# Socket: %d\n#\n", i);
> +                               hists__fprintf(hists, true, 0, 0, rep->min_percent, stdout, i);
> +                       }
> +               } else
> +                       hists__fprintf(hists, true, 0, 0, rep->min_percent, stdout, -1);
> +
>                 fprintf(stdout, "\n\n");
>         }
>
> @@ -724,6 +734,8 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
>                     "Show a column with the sum of periods"),
>         OPT_BOOLEAN(0, "group", &symbol_conf.event_group,
>                     "Show event group information together"),
> +       OPT_BOOLEAN(0, "socket", &symbol_conf.socket,
> +                   "Show event information in the same socket together"),
>         OPT_CALLBACK_NOOPT('b', "branch-stack", &branch_mode, "",
>                     "use branch records for per branch histogram filling",
>                     parse_branch_mode),
> @@ -909,6 +921,9 @@ repeat:
>
>         sort__setup_elide(stdout);
>
> +       if (symbol_conf.socket)
> +               set_max_socket_num();
> +
>         ret = __cmd_report(&report);
>         if (ret == K_SWITCH_INPUT_DATA) {
>                 perf_session__delete(session);
> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
> index bfe24f1..deffe44 100644
> --- a/tools/perf/builtin-top.c
> +++ b/tools/perf/builtin-top.c
> @@ -295,7 +295,7 @@ static void perf_top__print_sym_table(struct perf_top *top)
>         hists__output_recalc_col_len(hists, top->print_entries - printed);
>         putchar('\n');
>         hists__fprintf(hists, false, top->print_entries - printed, win_width,
> -                      top->min_percent, stdout);
> +                      top->min_percent, stdout, -1);
>  }
>
>  static void prompt_integer(int *target, const char *msg)
> diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
> index dfcbc90..2f512b8 100644
> --- a/tools/perf/ui/stdio/hist.c
> +++ b/tools/perf/ui/stdio/hist.c
> @@ -323,7 +323,8 @@ static int hist_entry__snprintf(struct hist_entry *he, struct perf_hpp *hpp)
>                 return 0;
>
>         perf_hpp__for_each_format(fmt) {
> -               if (perf_hpp__should_skip(fmt))
> +               if (perf_hpp__should_skip(fmt) ||
> +                   !strcmp(fmt->name, "SOCKET"))
>                         continue;
>
>                 /*
> @@ -371,7 +372,7 @@ static int hist_entry__fprintf(struct hist_entry *he, size_t size,
>  }
>
>  size_t hists__fprintf(struct hists *hists, bool show_header, int max_rows,
> -                     int max_cols, float min_pcnt, FILE *fp)
> +                     int max_cols, float min_pcnt, FILE *fp, int socket)
>  {
>         struct perf_hpp_fmt *fmt;
>         struct rb_node *nd;
> @@ -402,7 +403,8 @@ size_t hists__fprintf(struct hists *hists, bool show_header, int max_rows,
>         fprintf(fp, "# ");
>
>         perf_hpp__for_each_format(fmt) {
> -               if (perf_hpp__should_skip(fmt))
> +               if (perf_hpp__should_skip(fmt) ||
> +                   !strcmp(fmt->name, "SOCKET"))
>                         continue;
>
>                 if (!first)
> @@ -428,7 +430,8 @@ size_t hists__fprintf(struct hists *hists, bool show_header, int max_rows,
>         perf_hpp__for_each_format(fmt) {
>                 unsigned int i;
>
> -               if (perf_hpp__should_skip(fmt))
> +               if (perf_hpp__should_skip(fmt) ||
> +                   !strcmp(fmt->name, "SOCKET"))
>                         continue;
>
>                 if (!first)
> @@ -465,6 +468,9 @@ print_entries:
>                 if (h->filtered)
>                         continue;
>
> +               if ((socket >= 0) && (cpu__get_socket(h->cpu) != socket))
> +                       continue;
> +
>                 percent = hist_entry__get_percent_limit(h);
>                 if (percent < min_pcnt)
>                         continue;
> diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
> index 7f25e6c..ae03426 100644
> --- a/tools/perf/util/cpumap.c
> +++ b/tools/perf/util/cpumap.c
> @@ -407,6 +407,20 @@ out:
>                 pr_err("Failed to read max cpus, using default of %d\n", max_cpu_num);
>  }
>
> +void set_max_socket_num(void)
> +{
> +       int cpu, socket;
> +
> +       set_max_cpu_num();
> +
> +       max_socket_num = 1;
> +       for (cpu = 0; cpu < max_cpu_num; cpu++) {
> +               socket = cpu__get_socket(cpu);
> +               if (max_socket_num < (socket + 1))
> +                       max_socket_num = socket + 1;
> +       }
> +}
> +
>  /* Determine highest possible node in the system for sparse allocation */
>  static void set_max_node_num(void)
>  {
> diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
> index effb56e..094edd9 100644
> --- a/tools/perf/util/cpumap.h
> +++ b/tools/perf/util/cpumap.h
> @@ -56,9 +56,11 @@ static inline bool cpu_map__empty(const struct cpu_map *map)
>
>  int max_cpu_num;
>  int max_node_num;
> +int max_socket_num;
>  int *cpunode_map;
>
>  int cpu__setup_cpunode_map(void);
> +void set_max_socket_num(void);
>
>  static inline int cpu__max_node(void)
>  {
> diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
> index af80fb5..b188a62 100644
> --- a/tools/perf/util/hist.h
> +++ b/tools/perf/util/hist.h
> @@ -139,7 +139,7 @@ void events_stats__inc(struct events_stats *stats, u32 type);
>  size_t events_stats__fprintf(struct events_stats *stats, FILE *fp);
>
>  size_t hists__fprintf(struct hists *hists, bool show_header, int max_rows,
> -                     int max_cols, float min_pcnt, FILE *fp);
> +                     int max_cols, float min_pcnt, FILE *fp, int socket);
>  size_t perf_evlist__fprintf_nr_events(struct perf_evlist *evlist, FILE *fp);
>
>  void hists__filter_by_dso(struct hists *hists);
> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> index 245e254..70f0526 100644
> --- a/tools/perf/util/sort.c
> +++ b/tools/perf/util/sort.c
> @@ -1930,6 +1930,12 @@ int setup_sorting(void)
>                         return err;
>         }
>
> +       if (symbol_conf.socket) {
> +               err = sort_dimension__add("socket");
> +               if (err < 0)
> +                       return err;
> +       }
> +
>         reset_dimensions();
>
>         /*
> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
> index a4cde92..7c153c7 100644
> --- a/tools/perf/util/symbol.h
> +++ b/tools/perf/util/symbol.h
> @@ -101,6 +101,7 @@ struct symbol_conf {
>                         annotate_asm_raw,
>                         annotate_src,
>                         event_group,
> +                       socket,
>                         demangle,
>                         demangle_kernel,
>                         filter_relative,
> --
> 1.8.3.1
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 01/10] perf,tools: open event on evsel cpus and threads
  2015-08-18  9:25 ` [PATCH RFC 01/10] perf,tools: open event on evsel cpus and threads kan.liang
@ 2015-08-20  8:57   ` Jiri Olsa
  2015-08-20 17:24     ` Liang, Kan
  0 siblings, 1 reply; 29+ messages in thread
From: Jiri Olsa @ 2015-08-20  8:57 UTC (permalink / raw)
  To: kan.liang
  Cc: acme, a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel

On Tue, Aug 18, 2015 at 05:25:37AM -0400, kan.liang@intel.com wrote:
> From: Kan Liang <kan.liang@intel.com>
> 
> evsel may have different cpus and threads as evlist's.
> Use it's own cpus and threads, when open evsel in perf record.
> 
> Signed-off-by: Kan Liang <kan.liang@intel.com>
> ---
>  tools/perf/builtin-record.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 25cf6b4..a0178bf 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -279,7 +279,7 @@ static int record__open(struct record *rec)
>  
>  	evlist__for_each(evlist, pos) {
>  try_again:
> -		if (perf_evsel__open(pos, evlist->cpus, evlist->threads) < 0) {
> +		if (perf_evsel__open(pos, pos->cpus, pos->threads) < 0) {
>  			if (perf_evsel__fallback(pos, errno, msg, sizeof(msg))) {
>  				if (verbose)
>  					ui__warning("%s\n", msg);
> -- 
> 1.8.3.1
> 

dont we need then handle filters the same way?
like in attached change? totally untested..

jirka


---
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 7aa039bd379a..f5cdf678d504 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -244,12 +244,9 @@ static void handle_initial_delay(void)
 	struct perf_evsel *counter;
 
 	if (initial_delay) {
-		const int ncpus = cpu_map__nr(evsel_list->cpus),
-			nthreads = thread_map__nr(evsel_list->threads);
-
 		usleep(initial_delay * 1000);
 		evlist__for_each(evsel_list, counter)
-			perf_evsel__enable(counter, ncpus, nthreads);
+			perf_evsel__enable(counter);
 	}
 }
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 373f65b02545..0d8428458ae1 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1172,14 +1172,12 @@ int perf_evlist__apply_filters(struct perf_evlist *evlist, struct perf_evsel **e
 {
 	struct perf_evsel *evsel;
 	int err = 0;
-	const int ncpus = cpu_map__nr(evlist->cpus),
-		  nthreads = thread_map__nr(evlist->threads);
 
 	evlist__for_each(evlist, evsel) {
 		if (evsel->filter == NULL)
 			continue;
 
-		err = perf_evsel__apply_filter(evsel, ncpus, nthreads, evsel->filter);
+		err = perf_evsel__apply_filter(evsel, evsel->filter);
 		if (err) {
 			*err_evsel = evsel;
 			break;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index b096ef7a240c..e1e70c55c68d 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -898,9 +898,10 @@ static int perf_evsel__alloc_fd(struct perf_evsel *evsel, int ncpus, int nthread
 	return evsel->fd != NULL ? 0 : -ENOMEM;
 }
 
-static int perf_evsel__run_ioctl(struct perf_evsel *evsel, int ncpus, int nthreads,
-			  int ioc,  void *arg)
+static int perf_evsel__run_ioctl(struct perf_evsel *evsel, int ioc,  void *arg)
 {
+	int ncpus = cpu_map__nr(evsel->cpus),
+	    nthreads = thread_map__nr(evsel->threads);
 	int cpu, thread;
 
 	if (evsel->system_wide)
@@ -919,11 +920,9 @@ static int perf_evsel__run_ioctl(struct perf_evsel *evsel, int ncpus, int nthrea
 	return 0;
 }
 
-int perf_evsel__apply_filter(struct perf_evsel *evsel, int ncpus, int nthreads,
-			     const char *filter)
+int perf_evsel__apply_filter(struct perf_evsel *evsel, const char *filter)
 {
-	return perf_evsel__run_ioctl(evsel, ncpus, nthreads,
-				     PERF_EVENT_IOC_SET_FILTER,
+	return perf_evsel__run_ioctl(evsel, PERF_EVENT_IOC_SET_FILTER,
 				     (void *)filter);
 }
 
@@ -957,11 +956,9 @@ int perf_evsel__append_filter(struct perf_evsel *evsel,
 	return -1;
 }
 
-int perf_evsel__enable(struct perf_evsel *evsel, int ncpus, int nthreads)
+int perf_evsel__enable(struct perf_evsel *evsel)
 {
-	return perf_evsel__run_ioctl(evsel, ncpus, nthreads,
-				     PERF_EVENT_IOC_ENABLE,
-				     0);
+	return perf_evsel__run_ioctl(evsel, PERF_EVENT_IOC_ENABLE, 0);
 }
 
 int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads)
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 93ac6b128149..bb5b9c1d482b 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -214,9 +214,8 @@ void perf_evsel__set_sample_id(struct perf_evsel *evsel,
 int perf_evsel__set_filter(struct perf_evsel *evsel, const char *filter);
 int perf_evsel__append_filter(struct perf_evsel *evsel,
 			      const char *op, const char *filter);
-int perf_evsel__apply_filter(struct perf_evsel *evsel, int ncpus, int nthreads,
-			     const char *filter);
-int perf_evsel__enable(struct perf_evsel *evsel, int ncpus, int nthreads);
+int perf_evsel__apply_filter(struct perf_evsel *evsel, const char *filter);
+int perf_evsel__enable(struct perf_evsel *evsel);
 
 int perf_evsel__open_per_cpu(struct perf_evsel *evsel,
 			     struct cpu_map *cpus);

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 02/10] perf,tools: Support new sort type --socket
  2015-08-18  9:25 ` [PATCH RFC 02/10] perf,tools: Support new sort type --socket kan.liang
@ 2015-08-20  9:09   ` Jiri Olsa
  2015-08-21 20:25     ` Liang, Kan
  0 siblings, 1 reply; 29+ messages in thread
From: Jiri Olsa @ 2015-08-20  9:09 UTC (permalink / raw)
  To: kan.liang
  Cc: acme, a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel

On Tue, Aug 18, 2015 at 05:25:38AM -0400, kan.liang@intel.com wrote:
> From: Kan Liang <kan.liang@intel.com>
> 
> This patch enable perf report to sort by socket
> 

SNIP

> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> index 7e38716..245e254 100644
> --- a/tools/perf/util/sort.c
> +++ b/tools/perf/util/sort.c
> @@ -421,6 +421,33 @@ struct sort_entry sort_cpu = {
>  	.se_width_idx	= HISTC_CPU,
>  };
>  
> +/* --sort socket */
> +
> +static int64_t
> +sort__socket_cmp(struct hist_entry *left, struct hist_entry *right)
> +{
> +	int r_socket, l_socket;
> +
> +	r_socket = cpu__get_socket(right->cpu);
> +	l_socket = cpu__get_socket(left->cpu);
> +	return r_socket - l_socket;

we need global topology information in perf.data and use
the mapping from there, we can't use current server info

we currently store core_siblings_list and thread_siblings_list,
in topology FEATURE, which is probably not enough

I think we need new feature that stores topology info and
new interface that will provide all useful mappings:
  idx -> cpu
  cpu -> core
  cpu -> socket
  cpu -> node

in another patchset I used new CPUMAP event:
https://git.kernel.org/cgit/linux/kernel/git/jolsa/perf.git/commit/?h=perf/stat_script_3&id=37b7b8449aa23acdfe9dec5a7a371e91c5323da5

we might need both ways (new FEATURE and event) to support pipe reports

jirka

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 03/10] perf,tools: support option --socket
  2015-08-18  9:25 ` [PATCH RFC 03/10] perf,tools: support option --socket kan.liang
  2015-08-18 17:36   ` Stephane Eranian
@ 2015-08-20  9:30   ` Jiri Olsa
  1 sibling, 0 replies; 29+ messages in thread
From: Jiri Olsa @ 2015-08-20  9:30 UTC (permalink / raw)
  To: kan.liang
  Cc: acme, a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel

On Tue, Aug 18, 2015 at 05:25:39AM -0400, kan.liang@intel.com wrote:
> From: Kan Liang <kan.liang@intel.com>
> 
> Introduce a new option for perf report to show the event information in
> the same socket together.
> When this option is set, perf report will force to sort by socket.
> 
> $ perf report --stdio --socket
>  # To display the perf.data header info, please use
>  --header/--header-only options.
>  #
>  #
>  # Total Lost Samples: 0
>  #
>  # Samples: 686  of event 'cycles'
>  # Event count (approx.): 349215462
>  #
>  #
>  # Socket: 0
>  #
>  # Overhead  Command    Shared Object     Symbol
>  # ........  .........  ................
>  .................................
>  #
>     97.05%  test       test              [.] plusB_c
>      0.98%  test       test              [.] plusA_c
>      0.16%  test       [kernel.vmlinux]  [k] add_mm_counter_fast
>      0.15%  swapper    [kernel.vmlinux]  [k] note_gp_changes
>      0.15%  perf       [kernel.vmlinux]  [k] unmap_single_vma
>      0.06%  perf       [kernel.vmlinux]  [k] run_timer_softirq
>      0.00%  swapper    [kernel.vmlinux]  [k] native_write_msr
>  #
>  # Socket: 1
>  #
>  # Overhead  Command    Shared Object     Symbol
>  # ........  .........  ................
>  .................................
>  #
>      0.93%  perf       [kernel.vmlinux]  [k] smp_call_function_single
>      0.19%  perf       [kernel.vmlinux]  [k] page_fault
>      0.19%  swapper    [kernel.vmlinux]  [k] pm_qos_request
>      0.12%  rcu_sched  [kernel.vmlinux]  [k]
> dyntick_save_progress_counter
>      0.00%  swapper    [kernel.vmlinux]  [k] wake_up_process
>      0.00%  swapper    [kernel.vmlinux]  [k] __do_softirq
>      0.00%  swapper    [kernel.vmlinux]  [k] run_timer_softirq
>      0.00%  swapper    [kernel.vmlinux]  [k] native_write_msr
>      0.00%  perf       [kernel.vmlinux]  [k] native_write_msr

nice, but should this be handled by HIST_FILTER__* stuff?

also having generic ability to 'zoom' in TUI for SOCKET/CPU
would be great.. it's already there for thread and dso

jirka

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 05/10] perf,tools: Enable statistic read for perf record
  2015-08-18  9:25 ` [PATCH RFC 05/10] perf,tools: Enable statistic read for perf record kan.liang
@ 2015-08-20  9:39   ` Jiri Olsa
  2015-08-20  9:41   ` Jiri Olsa
  1 sibling, 0 replies; 29+ messages in thread
From: Jiri Olsa @ 2015-08-20  9:39 UTC (permalink / raw)
  To: kan.liang
  Cc: acme, a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel

On Tue, Aug 18, 2015 at 05:25:41AM -0400, kan.liang@intel.com wrote:

SNIP

>  	evlist__for_each(evlist, pos) {
> +
> +		if (pos->stat_read) {
> +			if (!target__has_cpu(&opts->target)) {
> +				pos->stat_read = 0;
> +				ui__warning("Statistics read only available "
> +					    "on system-wide/CPU mode. "
> +					    "Remove :N modifier for event %s\n",
> +					    pos->name);
> +				goto out;
> +			}
> +			/* Don't do stat read for Group leader */
> +			if ((pos->leader == pos) && (pos->leader->nr_members > 1))
> +				pos->stat_read = 0;
> +			else {
> +				if (first == pos) {
> +					pos->stat_read = 0;
> +					ui__warning("The first event cannot "
> +						    "be stat read event\n");
> +					goto out;
> +				}
> +				attr = &pos->attr;
> +				attr->read_format = PERF_FORMAT_TOTAL_TIME_ENABLED |
> +						    PERF_FORMAT_TOTAL_TIME_RUNNING;
> +				attr->sample_freq   = 0;
> +				attr->sample_period = 0;
> +				attr->sample_type = 0;
> +				pos->sample_size = 0;
> +			}
> +		}

seems like this should go under perf_evsel__config

jirka

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 05/10] perf,tools: Enable statistic read for perf record
  2015-08-18  9:25 ` [PATCH RFC 05/10] perf,tools: Enable statistic read for perf record kan.liang
  2015-08-20  9:39   ` Jiri Olsa
@ 2015-08-20  9:41   ` Jiri Olsa
  1 sibling, 0 replies; 29+ messages in thread
From: Jiri Olsa @ 2015-08-20  9:41 UTC (permalink / raw)
  To: kan.liang
  Cc: acme, a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel

On Tue, Aug 18, 2015 at 05:25:41AM -0400, kan.liang@intel.com wrote:

SNIP

> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index 373f65b..ca7bf8d 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -854,6 +854,9 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
>  		if (evsel->system_wide && thread)
>  			continue;
>  
> +		if (evsel->stat_read)
> +			continue;
> +

how about filters (perf_evlist__apply_filters) ... should be disabled as well IIUIC

jirka

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 06/10] perf,tools: New RECORD type PERF_RECORD_STAT_READ
  2015-08-18  9:25 ` [PATCH RFC 06/10] perf,tools: New RECORD type PERF_RECORD_STAT_READ kan.liang
@ 2015-08-20  9:49   ` Jiri Olsa
  0 siblings, 0 replies; 29+ messages in thread
From: Jiri Olsa @ 2015-08-20  9:49 UTC (permalink / raw)
  To: kan.liang
  Cc: acme, a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel

On Tue, Aug 18, 2015 at 05:25:42AM -0400, kan.liang@intel.com wrote:
> From: Kan Liang <kan.liang@intel.com>
> 
> Introduce a new user RECORD type PERF_RECORD_STAT_READ to store the
> stat event read result.
> 
> Signed-off-by: Kan Liang <kan.liang@intel.com>
> ---
>  tools/perf/util/event.c   |  1 +
>  tools/perf/util/event.h   | 10 ++++++++++
>  tools/perf/util/session.c | 15 +++++++++++++++
>  tools/perf/util/tool.h    |  1 +
>  4 files changed, 27 insertions(+)
> 
> diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
> index 7ff6127..601f4be 100644
> --- a/tools/perf/util/event.c
> +++ b/tools/perf/util/event.c
> @@ -37,6 +37,7 @@ static const char *perf_event__names[] = {
>  	[PERF_RECORD_AUXTRACE_INFO]		= "AUXTRACE_INFO",
>  	[PERF_RECORD_AUXTRACE]			= "AUXTRACE",
>  	[PERF_RECORD_AUXTRACE_ERROR]		= "AUXTRACE_ERROR",
> +	[PERF_RECORD_STAT_READ]			= "STAT_READ",
>  };
>  
>  const char *perf_event__name(unsigned int id)
> diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
> index f729df5..f6932cf 100644
> --- a/tools/perf/util/event.h
> +++ b/tools/perf/util/event.h
> @@ -226,6 +226,7 @@ enum perf_user_event_type { /* above any possible kernel type */
>  	PERF_RECORD_AUXTRACE_INFO		= 70,
>  	PERF_RECORD_AUXTRACE			= 71,
>  	PERF_RECORD_AUXTRACE_ERROR		= 72,
> +	PERF_RECORD_STAT_READ			= 73,
>  	PERF_RECORD_HEADER_MAX
>  };
>  
> @@ -355,6 +356,14 @@ struct context_switch_event {
>  	u32 next_prev_tid;
>  };
>  
> +struct stat_read_event {
> +	struct perf_event_header header;
> +	u64	value;		/* counter value delta */
> +	u32	cpu;
> +	u32	pos_id;		/* event position in evlist */
> +	u64	time;		/* time stamp */
> +};

could we make this more generic like:

struct stat_event {
        struct perf_event_header        header;

        u64     id;
        u32     cpu;
        u32     thread;

        union {
                struct {
                        u64 val;
                        u64 ena;
                        u64 run;
                };
                u64 values[3];
        };
};


I'd agreed with the time to be added, but I think it should
contains raw values, not deltas (and all 3 them: val, ena, run)

also I haven't got the value of the pos_id, so far seems
like this could be obtained in report time and does not
need to be part of the event


thanks,
jirka

^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: [PATCH RFC 01/10] perf,tools: open event on evsel cpus and threads
  2015-08-20  8:57   ` Jiri Olsa
@ 2015-08-20 17:24     ` Liang, Kan
  2015-08-21  7:43       ` Jiri Olsa
  0 siblings, 1 reply; 29+ messages in thread
From: Liang, Kan @ 2015-08-20 17:24 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel

> 
> On Tue, Aug 18, 2015 at 05:25:37AM -0400, kan.liang@intel.com wrote:
> > From: Kan Liang <kan.liang@intel.com>
> >
> > evsel may have different cpus and threads as evlist's.
> > Use it's own cpus and threads, when open evsel in perf record.
> >
> > Signed-off-by: Kan Liang <kan.liang@intel.com>
> > ---
> >  tools/perf/builtin-record.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> > index 25cf6b4..a0178bf 100644
> > --- a/tools/perf/builtin-record.c
> > +++ b/tools/perf/builtin-record.c
> > @@ -279,7 +279,7 @@ static int record__open(struct record *rec)
> >
> >  	evlist__for_each(evlist, pos) {
> >  try_again:
> > -		if (perf_evsel__open(pos, evlist->cpus, evlist->threads) <
> 0) {
> > +		if (perf_evsel__open(pos, pos->cpus, pos->threads) < 0) {
> >  			if (perf_evsel__fallback(pos, errno, msg,
> sizeof(msg))) {
> >  				if (verbose)
> >  					ui__warning("%s\n", msg);
> > --
> > 1.8.3.1
> >
> 
> dont we need then handle filters the same way?
> like in attached change? totally untested..

Filters look only work for tracepoint event, which doesn't have cpu limit.
So evlist and evsel should always be same.
I think we don't need to change it.

> 
> jirka
> 
> 
> ---
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 7aa039bd379a..f5cdf678d504 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -244,12 +244,9 @@ static void handle_initial_delay(void)
>  	struct perf_evsel *counter;
> 
>  	if (initial_delay) {
> -		const int ncpus = cpu_map__nr(evsel_list->cpus),
> -			nthreads = thread_map__nr(evsel_list->threads);
> -
>  		usleep(initial_delay * 1000);
>  		evlist__for_each(evsel_list, counter)
> -			perf_evsel__enable(counter, ncpus, nthreads);
> +			perf_evsel__enable(counter);
>  	}
>  }
>

Agree, we need to use evsel's cpu and threads here.
What about the code as below? It should be simpler.
+			perf_evsel__enable(counter, cpu_map__nr(counter->cpus), thread_map__nr(counter->threads));
 
 
Thanks,
Kan


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 01/10] perf,tools: open event on evsel cpus and threads
  2015-08-20 17:24     ` Liang, Kan
@ 2015-08-21  7:43       ` Jiri Olsa
  2015-08-21 13:24         ` Liang, Kan
  0 siblings, 1 reply; 29+ messages in thread
From: Jiri Olsa @ 2015-08-21  7:43 UTC (permalink / raw)
  To: Liang, Kan
  Cc: acme, a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel

On Thu, Aug 20, 2015 at 05:24:32PM +0000, Liang, Kan wrote:
> > 
> > On Tue, Aug 18, 2015 at 05:25:37AM -0400, kan.liang@intel.com wrote:
> > > From: Kan Liang <kan.liang@intel.com>
> > >
> > > evsel may have different cpus and threads as evlist's.
> > > Use it's own cpus and threads, when open evsel in perf record.
> > >
> > > Signed-off-by: Kan Liang <kan.liang@intel.com>
> > > ---
> > >  tools/perf/builtin-record.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> > > index 25cf6b4..a0178bf 100644
> > > --- a/tools/perf/builtin-record.c
> > > +++ b/tools/perf/builtin-record.c
> > > @@ -279,7 +279,7 @@ static int record__open(struct record *rec)
> > >
> > >  	evlist__for_each(evlist, pos) {
> > >  try_again:
> > > -		if (perf_evsel__open(pos, evlist->cpus, evlist->threads) <
> > 0) {
> > > +		if (perf_evsel__open(pos, pos->cpus, pos->threads) < 0) {
> > >  			if (perf_evsel__fallback(pos, errno, msg,
> > sizeof(msg))) {
> > >  				if (verbose)
> > >  					ui__warning("%s\n", msg);
> > > --
> > > 1.8.3.1
> > >
> > 
> > dont we need then handle filters the same way?
> > like in attached change? totally untested..
> 
> Filters look only work for tracepoint event, which doesn't have cpu limit.
> So evlist and evsel should always be same.
> I think we don't need to change it.

right.. at least please make a comment about that

> 
> > 
> > jirka
> > 
> > 
> > ---
> > diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> > index 7aa039bd379a..f5cdf678d504 100644
> > --- a/tools/perf/builtin-stat.c
> > +++ b/tools/perf/builtin-stat.c
> > @@ -244,12 +244,9 @@ static void handle_initial_delay(void)
> >  	struct perf_evsel *counter;
> > 
> >  	if (initial_delay) {
> > -		const int ncpus = cpu_map__nr(evsel_list->cpus),
> > -			nthreads = thread_map__nr(evsel_list->threads);
> > -
> >  		usleep(initial_delay * 1000);
> >  		evlist__for_each(evsel_list, counter)
> > -			perf_evsel__enable(counter, ncpus, nthreads);
> > +			perf_evsel__enable(counter);
> >  	}
> >  }
> >
> 
> Agree, we need to use evsel's cpu and threads here.
> What about the code as below? It should be simpler.
> +			perf_evsel__enable(counter, cpu_map__nr(counter->cpus), thread_map__nr(counter->threads));
>  

ok, maybe I'll submit that patch as a cleanup,
it seems more sane to use evsel cpus and threads
now that we always have it there

jirka

^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: [PATCH RFC 01/10] perf,tools: open event on evsel cpus and threads
  2015-08-21  7:43       ` Jiri Olsa
@ 2015-08-21 13:24         ` Liang, Kan
  0 siblings, 0 replies; 29+ messages in thread
From: Liang, Kan @ 2015-08-21 13:24 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel

> 
> On Thu, Aug 20, 2015 at 05:24:32PM +0000, Liang, Kan wrote:
> > >
> > > On Tue, Aug 18, 2015 at 05:25:37AM -0400, kan.liang@intel.com wrote:
> > > > From: Kan Liang <kan.liang@intel.com>
> > > >
> > > > evsel may have different cpus and threads as evlist's.
> > > > Use it's own cpus and threads, when open evsel in perf record.
> > > >
> > > > Signed-off-by: Kan Liang <kan.liang@intel.com>
> > > > ---
> > > >  tools/perf/builtin-record.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/tools/perf/builtin-record.c
> > > > b/tools/perf/builtin-record.c index 25cf6b4..a0178bf 100644
> > > > --- a/tools/perf/builtin-record.c
> > > > +++ b/tools/perf/builtin-record.c
> > > > @@ -279,7 +279,7 @@ static int record__open(struct record *rec)
> > > >
> > > >  	evlist__for_each(evlist, pos) {
> > > >  try_again:
> > > > -		if (perf_evsel__open(pos, evlist->cpus, evlist->threads) <
> > > 0) {
> > > > +		if (perf_evsel__open(pos, pos->cpus, pos->threads) < 0) {
> > > >  			if (perf_evsel__fallback(pos, errno, msg,
> > > sizeof(msg))) {
> > > >  				if (verbose)
> > > >  					ui__warning("%s\n", msg);
> > > > --
> > > > 1.8.3.1
> > > >
> > >
> > > dont we need then handle filters the same way?
> > > like in attached change? totally untested..
> >
> > Filters look only work for tracepoint event, which doesn't have cpu limit.
> > So evlist and evsel should always be same.
> > I think we don't need to change it.
> 
> right.. at least please make a comment about that
>
OK. I will do that.

> >
> > >
> > > jirka
> > >
> > >
> > > ---
> > > diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> > > index 7aa039bd379a..f5cdf678d504 100644
> > > --- a/tools/perf/builtin-stat.c
> > > +++ b/tools/perf/builtin-stat.c
> > > @@ -244,12 +244,9 @@ static void handle_initial_delay(void)
> > >  	struct perf_evsel *counter;
> > >
> > >  	if (initial_delay) {
> > > -		const int ncpus = cpu_map__nr(evsel_list->cpus),
> > > -			nthreads = thread_map__nr(evsel_list->threads);
> > > -
> > >  		usleep(initial_delay * 1000);
> > >  		evlist__for_each(evsel_list, counter)
> > > -			perf_evsel__enable(counter, ncpus, nthreads);
> > > +			perf_evsel__enable(counter);
> > >  	}
> > >  }
> > >
> >
> > Agree, we need to use evsel's cpu and threads here.
> > What about the code as below? It should be simpler.
> > +			perf_evsel__enable(counter,
> cpu_map__nr(counter->cpus),
> > +thread_map__nr(counter->threads));
> >
> 
> ok, maybe I'll submit that patch as a cleanup, it seems more sane to use
> evsel cpus and threads now that we always have it there
> 

I guess I will send out the builtin-record change and comments as a patch.
So you can either merge the cleanup code to that patch, or send out the
cleanup patch separately.
Either is fine for me.

Thanks,
Kan



^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: [PATCH RFC 02/10] perf,tools: Support new sort type --socket
  2015-08-20  9:09   ` Jiri Olsa
@ 2015-08-21 20:25     ` Liang, Kan
  2015-08-23 22:00       ` Jiri Olsa
  0 siblings, 1 reply; 29+ messages in thread
From: Liang, Kan @ 2015-08-21 20:25 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel



> On Tue, Aug 18, 2015 at 05:25:38AM -0400, kan.liang@intel.com wrote:
> > From: Kan Liang <kan.liang@intel.com>
> >
> > This patch enable perf report to sort by socket
> >
> 
> SNIP
> 
> > diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c index
> > 7e38716..245e254 100644
> > --- a/tools/perf/util/sort.c
> > +++ b/tools/perf/util/sort.c
> > @@ -421,6 +421,33 @@ struct sort_entry sort_cpu = {
> >  	.se_width_idx	= HISTC_CPU,
> >  };
> >
> > +/* --sort socket */
> > +
> > +static int64_t
> > +sort__socket_cmp(struct hist_entry *left, struct hist_entry *right) {
> > +	int r_socket, l_socket;
> > +
> > +	r_socket = cpu__get_socket(right->cpu);
> > +	l_socket = cpu__get_socket(left->cpu);
> > +	return r_socket - l_socket;
> 
> we need global topology information in perf.data and use the mapping
> from there, we can't use current server info
> 
> we currently store core_siblings_list and thread_siblings_list, in topology
> FEATURE, which is probably not enough
>

core_siblings_list  includes the cpu list in the same socket.
thread_siblings_list includes the cpu list in the same core.
numa_nodes includes the cpu list for each node.

It looks we have enough data from topology FEATURE.

What do you think about the function as below?
It gets the socket id from env.

+int
+perf_env_get_socket(struct perf_session_env *env, int cpu)
+{
+	int socket_nr, cpu_nr, i, j;
+	struct cpu_map *socket_map = NULL;
+	char *str;
+
+	if (env == NULL)
+		return -1;
+
+	socket_nr = env->nr_sibling_cores;
+	str = env->sibling_cores;
+
+	for (i = 0; i < socket_nr; i++) {
+		socket_map = cpu_map__new(str);
+		str += strlen(str) + 1;
+		if (!socket_map)
+			continue;
+		cpu_nr = socket_map->nr;
+		for (j = 0; j < cpu_nr; j++) {
+			if (cpu == socket_map->map[j]) {
+				free(socket_map);
+				return i;
+			}
+		}
+		free(socket_map);
+	}
+
+	return -1;
+}

Thanks,
Kan

> I think we need new feature that stores topology info and new interface
> that will provide all useful mappings:
>   idx -> cpu
>   cpu -> core
>   cpu -> socket
>   cpu -> node
> 
> in another patchset I used new CPUMAP event:
> https://git.kernel.org/cgit/linux/kernel/git/jolsa/perf.git/commit/?h=perf/
> stat_script_3&id=37b7b8449aa23acdfe9dec5a7a371e91c5323da5
> 
> we might need both ways (new FEATURE and event) to support pipe
> reports
> 
> jirka

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 02/10] perf,tools: Support new sort type --socket
  2015-08-21 20:25     ` Liang, Kan
@ 2015-08-23 22:00       ` Jiri Olsa
  2015-08-24 14:22         ` Liang, Kan
  0 siblings, 1 reply; 29+ messages in thread
From: Jiri Olsa @ 2015-08-23 22:00 UTC (permalink / raw)
  To: Liang, Kan
  Cc: acme, a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel

On Fri, Aug 21, 2015 at 08:25:24PM +0000, Liang, Kan wrote:

SNIP

> > 
> > we need global topology information in perf.data and use the mapping
> > from there, we can't use current server info
> > 
> > we currently store core_siblings_list and thread_siblings_list, in topology
> > FEATURE, which is probably not enough
> >
> 
> core_siblings_list  includes the cpu list in the same socket.
> thread_siblings_list includes the cpu list in the same core.
> numa_nodes includes the cpu list for each node.
> 
> It looks we have enough data from topology FEATURE.

hum, haven't hecked deeply.. how will you get core id for cpu?

> 
> What do you think about the function as below?
> It gets the socket id from env.

some sort of caching would be nice, I guess we could
store those cpumap objects within perf_session_env

jirka

> 
> +int
> +perf_env_get_socket(struct perf_session_env *env, int cpu)
> +{
> +	int socket_nr, cpu_nr, i, j;
> +	struct cpu_map *socket_map = NULL;
> +	char *str;
> +
> +	if (env == NULL)
> +		return -1;
> +
> +	socket_nr = env->nr_sibling_cores;
> +	str = env->sibling_cores;
> +
> +	for (i = 0; i < socket_nr; i++) {
> +		socket_map = cpu_map__new(str);
> +		str += strlen(str) + 1;
> +		if (!socket_map)
> +			continue;
> +		cpu_nr = socket_map->nr;
> +		for (j = 0; j < cpu_nr; j++) {
> +			if (cpu == socket_map->map[j]) {
> +				free(socket_map);
> +				return i;
> +			}
> +		}
> +		free(socket_map);
> +	}
> +
> +	return -1;
> +}
> 
> Thanks,
> Kan
> 
> > I think we need new feature that stores topology info and new interface
> > that will provide all useful mappings:
> >   idx -> cpu
> >   cpu -> core
> >   cpu -> socket
> >   cpu -> node
> > 
> > in another patchset I used new CPUMAP event:
> > https://git.kernel.org/cgit/linux/kernel/git/jolsa/perf.git/commit/?h=perf/
> > stat_script_3&id=37b7b8449aa23acdfe9dec5a7a371e91c5323da5
> > 
> > we might need both ways (new FEATURE and event) to support pipe
> > reports
> > 
> > jirka

^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: [PATCH RFC 02/10] perf,tools: Support new sort type --socket
  2015-08-23 22:00       ` Jiri Olsa
@ 2015-08-24 14:22         ` Liang, Kan
  2015-08-24 14:27           ` Jiri Olsa
  0 siblings, 1 reply; 29+ messages in thread
From: Liang, Kan @ 2015-08-24 14:22 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel

> 
> On Fri, Aug 21, 2015 at 08:25:24PM +0000, Liang, Kan wrote:
> 
> SNIP
> 
> > >
> > > we need global topology information in perf.data and use the mapping
> > > from there, we can't use current server info
> > >
> > > we currently store core_siblings_list and thread_siblings_list, in
> > > topology FEATURE, which is probably not enough
> > >
> >
> > core_siblings_list  includes the cpu list in the same socket.
> > thread_siblings_list includes the cpu list in the same core.
> > numa_nodes includes the cpu list for each node.
> >
> > It looks we have enough data from topology FEATURE.
> 
> hum, haven't hecked deeply.. how will you get core id for cpu?
>

from thread_siblings_list.
 I just noticed that svg_build_topology_map did the similar thing to
get topology map for timechart from perf header.

> >
> > What do you think about the function as below?
> > It gets the socket id from env.
> 
> some sort of caching would be nice, I guess we could store those cpumap
> objects within perf_session_env

Yes it will be stored in perf_session_env.

Thanks,
Kan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 02/10] perf,tools: Support new sort type --socket
  2015-08-24 14:22         ` Liang, Kan
@ 2015-08-24 14:27           ` Jiri Olsa
  2015-08-24 16:47             ` Liang, Kan
  0 siblings, 1 reply; 29+ messages in thread
From: Jiri Olsa @ 2015-08-24 14:27 UTC (permalink / raw)
  To: Liang, Kan
  Cc: acme, a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel

On Mon, Aug 24, 2015 at 02:22:08PM +0000, Liang, Kan wrote:
> > 
> > On Fri, Aug 21, 2015 at 08:25:24PM +0000, Liang, Kan wrote:
> > 
> > SNIP
> > 
> > > >
> > > > we need global topology information in perf.data and use the mapping
> > > > from there, we can't use current server info
> > > >
> > > > we currently store core_siblings_list and thread_siblings_list, in
> > > > topology FEATURE, which is probably not enough
> > > >
> > >
> > > core_siblings_list  includes the cpu list in the same socket.
> > > thread_siblings_list includes the cpu list in the same core.
> > > numa_nodes includes the cpu list for each node.
> > >
> > > It looks we have enough data from topology FEATURE.
> > 
> > hum, haven't hecked deeply.. how will you get core id for cpu?
> >
> 
> from thread_siblings_list.
>  I just noticed that svg_build_topology_map did the similar thing to
> get topology map for timechart from perf header.

could you please provide both functions then cpu -> core, cpu -> socket

thanks,
jirka

^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: [PATCH RFC 02/10] perf,tools: Support new sort type --socket
  2015-08-24 14:27           ` Jiri Olsa
@ 2015-08-24 16:47             ` Liang, Kan
  2015-08-24 19:30               ` Jiri Olsa
  0 siblings, 1 reply; 29+ messages in thread
From: Liang, Kan @ 2015-08-24 16:47 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel



> On Mon, Aug 24, 2015 at 02:22:08PM +0000, Liang, Kan wrote:
> > >
> > > On Fri, Aug 21, 2015 at 08:25:24PM +0000, Liang, Kan wrote:
> > >
> > > SNIP
> > >
> > > > >
> > > > > we need global topology information in perf.data and use the
> > > > > mapping from there, we can't use current server info
> > > > >
> > > > > we currently store core_siblings_list and thread_siblings_list,
> > > > > in topology FEATURE, which is probably not enough
> > > > >
> > > >
> > > > core_siblings_list  includes the cpu list in the same socket.
> > > > thread_siblings_list includes the cpu list in the same core.
> > > > numa_nodes includes the cpu list for each node.
> > > >
> > > > It looks we have enough data from topology FEATURE.
> > >
> > > hum, haven't hecked deeply.. how will you get core id for cpu?
> > >
> >
> > from thread_siblings_list.
> >  I just noticed that svg_build_topology_map did the similar thing to
> > get topology map for timechart from perf header.
> 
> could you please provide both functions then cpu -> core, cpu -> socket
> 

Do you mean something like this?
Store cpu->socket and cpu->core in perf_session_env.

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 179b2bd..a01c603 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -1590,10 +1596,17 @@ static int process_cpu_topology(struct perf_file_section *section __maybe_unused
 	u32 nr, i;
 	char *str;
 	struct strbuf sb;
+	int cpu_nr = ph->env.nr_cpus_online;
+	struct cpu_map *map;
+	int j;
+
+	ph->env.cpu = calloc(cpu_nr, sizeof(*ph->env.cpu));
+	if (!ph->env.cpu)
+		return -1;
 
 	ret = readn(fd, &nr, sizeof(nr));
 	if (ret != sizeof(nr))
-		return -1;
+		goto free_cpu;
 
 	if (ph->needs_swap)
 		nr = bswap_32(nr);
@@ -1608,6 +1621,14 @@ static int process_cpu_topology(struct perf_file_section *section __maybe_unused
 
 		/* include a NULL character at the end */
 		strbuf_add(&sb, str, strlen(str) + 1);
+
+		map = cpu_map__new(str);
+		if (!map)
+			goto error;
+		for (j = 0; j < map->nr; j++) {
+			 ph->env.cpu[map->map[j]].socket_id = i;
+		}
+		cpu_map__put(map);
 		free(str);
 	}
 	ph->env.sibling_cores = strbuf_detach(&sb, NULL);
@@ -1628,6 +1649,14 @@ static int process_cpu_topology(struct perf_file_section *section __maybe_unused
 
 		/* include a NULL character at the end */
 		strbuf_add(&sb, str, strlen(str) + 1);
+
+		map = cpu_map__new(str);
+		if (!map)
+			goto error;
+		for (j = 0; j < map->nr; j++) {
+			ph->env.cpu[map->map[j]].core_id = i;
+		}
+		cpu_map__put(map);
 		free(str);
 	}
 	ph->env.sibling_threads = strbuf_detach(&sb, NULL);
@@ -1635,6 +1664,8 @@ static int process_cpu_topology(struct perf_file_section *section __maybe_unused
 
 error:
 	strbuf_release(&sb);
+free_cpu:
+	free(ph->env.cpu);
 	return -1;
 }
 
diff --git a/tools/perf/util/header.h b/tools/perf/util/header.h
index 9b53b65..8b8c4fc 100644
--- a/tools/perf/util/header.h
+++ b/tools/perf/util/header.h
@@ -66,6 +66,11 @@ struct perf_header;
 int perf_file_header__read(struct perf_file_header *header,
 			   struct perf_header *ph, int fd);
 
+struct cpu_topology_map {
+	int	socket_id;
+	int	core_id;
+};
+
 struct perf_session_env {
 	char			*hostname;
 	char			*os_release;
@@ -89,6 +94,7 @@ struct perf_session_env {
 	char			*sibling_threads;
 	char			*numa_nodes;
 	char			*pmu_mappings;
+	struct cpu_topology_map	*cpu;
 };
 
 struct perf_header {
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 18722e7..51b4d5a 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -185,6 +185,7 @@ static void perf_session_env__exit(struct perf_session_env *env)
 	zfree(&env->sibling_threads);
 	zfree(&env->numa_nodes);
 	zfree(&env->pmu_mappings);
+	zfree(&env->cpu);
 }
 
 void perf_session__delete(struct perf_session *session)


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 02/10] perf,tools: Support new sort type --socket
  2015-08-24 16:47             ` Liang, Kan
@ 2015-08-24 19:30               ` Jiri Olsa
  2015-08-27 18:24                 ` Jiri Olsa
  0 siblings, 1 reply; 29+ messages in thread
From: Jiri Olsa @ 2015-08-24 19:30 UTC (permalink / raw)
  To: Liang, Kan
  Cc: acme, a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel

On Mon, Aug 24, 2015 at 04:47:12PM +0000, Liang, Kan wrote:
> 
> 
> > On Mon, Aug 24, 2015 at 02:22:08PM +0000, Liang, Kan wrote:
> > > >
> > > > On Fri, Aug 21, 2015 at 08:25:24PM +0000, Liang, Kan wrote:
> > > >
> > > > SNIP
> > > >
> > > > > >
> > > > > > we need global topology information in perf.data and use the
> > > > > > mapping from there, we can't use current server info
> > > > > >
> > > > > > we currently store core_siblings_list and thread_siblings_list,
> > > > > > in topology FEATURE, which is probably not enough
> > > > > >
> > > > >
> > > > > core_siblings_list  includes the cpu list in the same socket.
> > > > > thread_siblings_list includes the cpu list in the same core.
> > > > > numa_nodes includes the cpu list for each node.
> > > > >
> > > > > It looks we have enough data from topology FEATURE.
> > > >
> > > > hum, haven't hecked deeply.. how will you get core id for cpu?
> > > >
> > >
> > > from thread_siblings_list.
> > >  I just noticed that svg_build_topology_map did the similar thing to
> > > get topology map for timechart from perf header.
> > 
> > could you please provide both functions then cpu -> core, cpu -> socket
> > 
> 
> Do you mean something like this?
> Store cpu->socket and cpu->core in perf_session_env.

yep, seems ok

thanks,
jirka

> 
> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> index 179b2bd..a01c603 100644
> --- a/tools/perf/util/header.c
> +++ b/tools/perf/util/header.c
> @@ -1590,10 +1596,17 @@ static int process_cpu_topology(struct perf_file_section *section __maybe_unused
>  	u32 nr, i;
>  	char *str;
>  	struct strbuf sb;
> +	int cpu_nr = ph->env.nr_cpus_online;
> +	struct cpu_map *map;
> +	int j;
> +
> +	ph->env.cpu = calloc(cpu_nr, sizeof(*ph->env.cpu));
> +	if (!ph->env.cpu)
> +		return -1;
>  
>  	ret = readn(fd, &nr, sizeof(nr));
>  	if (ret != sizeof(nr))
> -		return -1;
> +		goto free_cpu;
>  
>  	if (ph->needs_swap)
>  		nr = bswap_32(nr);
> @@ -1608,6 +1621,14 @@ static int process_cpu_topology(struct perf_file_section *section __maybe_unused
>  
>  		/* include a NULL character at the end */
>  		strbuf_add(&sb, str, strlen(str) + 1);
> +
> +		map = cpu_map__new(str);
> +		if (!map)
> +			goto error;
> +		for (j = 0; j < map->nr; j++) {
> +			 ph->env.cpu[map->map[j]].socket_id = i;
> +		}
> +		cpu_map__put(map);
>  		free(str);
>  	}
>  	ph->env.sibling_cores = strbuf_detach(&sb, NULL);
> @@ -1628,6 +1649,14 @@ static int process_cpu_topology(struct perf_file_section *section __maybe_unused
>  
>  		/* include a NULL character at the end */
>  		strbuf_add(&sb, str, strlen(str) + 1);
> +
> +		map = cpu_map__new(str);
> +		if (!map)
> +			goto error;
> +		for (j = 0; j < map->nr; j++) {
> +			ph->env.cpu[map->map[j]].core_id = i;
> +		}
> +		cpu_map__put(map);
>  		free(str);
>  	}
>  	ph->env.sibling_threads = strbuf_detach(&sb, NULL);
> @@ -1635,6 +1664,8 @@ static int process_cpu_topology(struct perf_file_section *section __maybe_unused
>  
>  error:
>  	strbuf_release(&sb);
> +free_cpu:
> +	free(ph->env.cpu);
>  	return -1;
>  }
>  
> diff --git a/tools/perf/util/header.h b/tools/perf/util/header.h
> index 9b53b65..8b8c4fc 100644
> --- a/tools/perf/util/header.h
> +++ b/tools/perf/util/header.h
> @@ -66,6 +66,11 @@ struct perf_header;
>  int perf_file_header__read(struct perf_file_header *header,
>  			   struct perf_header *ph, int fd);
>  
> +struct cpu_topology_map {
> +	int	socket_id;
> +	int	core_id;
> +};
> +
>  struct perf_session_env {
>  	char			*hostname;
>  	char			*os_release;
> @@ -89,6 +94,7 @@ struct perf_session_env {
>  	char			*sibling_threads;
>  	char			*numa_nodes;
>  	char			*pmu_mappings;
> +	struct cpu_topology_map	*cpu;
>  };
>  
>  struct perf_header {
> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index 18722e7..51b4d5a 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -185,6 +185,7 @@ static void perf_session_env__exit(struct perf_session_env *env)
>  	zfree(&env->sibling_threads);
>  	zfree(&env->numa_nodes);
>  	zfree(&env->pmu_mappings);
> +	zfree(&env->cpu);
>  }
>  
>  void perf_session__delete(struct perf_session *session)
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 02/10] perf,tools: Support new sort type --socket
  2015-08-24 19:30               ` Jiri Olsa
@ 2015-08-27 18:24                 ` Jiri Olsa
  2015-08-27 18:29                   ` Liang, Kan
  0 siblings, 1 reply; 29+ messages in thread
From: Jiri Olsa @ 2015-08-27 18:24 UTC (permalink / raw)
  To: Liang, Kan
  Cc: acme, a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel

On Mon, Aug 24, 2015 at 09:30:21PM +0200, Jiri Olsa wrote:
> On Mon, Aug 24, 2015 at 04:47:12PM +0000, Liang, Kan wrote:
> > 
> > 
> > > On Mon, Aug 24, 2015 at 02:22:08PM +0000, Liang, Kan wrote:
> > > > >
> > > > > On Fri, Aug 21, 2015 at 08:25:24PM +0000, Liang, Kan wrote:
> > > > >
> > > > > SNIP
> > > > >
> > > > > > >
> > > > > > > we need global topology information in perf.data and use the
> > > > > > > mapping from there, we can't use current server info
> > > > > > >
> > > > > > > we currently store core_siblings_list and thread_siblings_list,
> > > > > > > in topology FEATURE, which is probably not enough
> > > > > > >
> > > > > >
> > > > > > core_siblings_list  includes the cpu list in the same socket.
> > > > > > thread_siblings_list includes the cpu list in the same core.
> > > > > > numa_nodes includes the cpu list for each node.
> > > > > >
> > > > > > It looks we have enough data from topology FEATURE.
> > > > >
> > > > > hum, haven't hecked deeply.. how will you get core id for cpu?
> > > > >
> > > >
> > > > from thread_siblings_list.
> > > >  I just noticed that svg_build_topology_map did the similar thing to
> > > > get topology map for timechart from perf header.
> > > 
> > > could you please provide both functions then cpu -> core, cpu -> socket
> > > 
> > 
> > Do you mean something like this?
> > Store cpu->socket and cpu->core in perf_session_env.
> 
> yep, seems ok

any chance you could send out this one? I'd need to use it in my other stuff

thanks,
jirka

^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: [PATCH RFC 02/10] perf,tools: Support new sort type --socket
  2015-08-27 18:24                 ` Jiri Olsa
@ 2015-08-27 18:29                   ` Liang, Kan
  0 siblings, 0 replies; 29+ messages in thread
From: Liang, Kan @ 2015-08-27 18:29 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, a.p.zijlstra, mingo, jolsa, namhyung, ak, eranian, linux-kernel


> 
> On Mon, Aug 24, 2015 at 09:30:21PM +0200, Jiri Olsa wrote:
> > On Mon, Aug 24, 2015 at 04:47:12PM +0000, Liang, Kan wrote:
> > >
> > >
> > > > On Mon, Aug 24, 2015 at 02:22:08PM +0000, Liang, Kan wrote:
> > > > > >
> > > > > > On Fri, Aug 21, 2015 at 08:25:24PM +0000, Liang, Kan wrote:
> > > > > >
> > > > > > SNIP
> > > > > >
> > > > > > > >
> > > > > > > > we need global topology information in perf.data and use
> > > > > > > > the mapping from there, we can't use current server info
> > > > > > > >
> > > > > > > > we currently store core_siblings_list and
> > > > > > > > thread_siblings_list, in topology FEATURE, which is
> > > > > > > > probably not enough
> > > > > > > >
> > > > > > >
> > > > > > > core_siblings_list  includes the cpu list in the same socket.
> > > > > > > thread_siblings_list includes the cpu list in the same core.
> > > > > > > numa_nodes includes the cpu list for each node.
> > > > > > >
> > > > > > > It looks we have enough data from topology FEATURE.
> > > > > >
> > > > > > hum, haven't hecked deeply.. how will you get core id for cpu?
> > > > > >
> > > > >
> > > > > from thread_siblings_list.
> > > > >  I just noticed that svg_build_topology_map did the similar
> > > > > thing to get topology map for timechart from perf header.
> > > >
> > > > could you please provide both functions then cpu -> core, cpu ->
> > > > socket
> > > >
> > >
> > > Do you mean something like this?
> > > Store cpu->socket and cpu->core in perf_session_env.
> >
> > yep, seems ok
> 
> any chance you could send out this one? I'd need to use it in my other stuff
> 

Ok I will send out this patch as a separate patch first.



^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2015-08-27 18:29 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-08-18  9:25 [PATCH RFC 00/10] stat read during perf sampling kan.liang
2015-08-18  9:25 ` [PATCH RFC 01/10] perf,tools: open event on evsel cpus and threads kan.liang
2015-08-20  8:57   ` Jiri Olsa
2015-08-20 17:24     ` Liang, Kan
2015-08-21  7:43       ` Jiri Olsa
2015-08-21 13:24         ` Liang, Kan
2015-08-18  9:25 ` [PATCH RFC 02/10] perf,tools: Support new sort type --socket kan.liang
2015-08-20  9:09   ` Jiri Olsa
2015-08-21 20:25     ` Liang, Kan
2015-08-23 22:00       ` Jiri Olsa
2015-08-24 14:22         ` Liang, Kan
2015-08-24 14:27           ` Jiri Olsa
2015-08-24 16:47             ` Liang, Kan
2015-08-24 19:30               ` Jiri Olsa
2015-08-27 18:24                 ` Jiri Olsa
2015-08-27 18:29                   ` Liang, Kan
2015-08-18  9:25 ` [PATCH RFC 03/10] perf,tools: support option --socket kan.liang
2015-08-18 17:36   ` Stephane Eranian
2015-08-20  9:30   ` Jiri Olsa
2015-08-18  9:25 ` [PATCH RFC 04/10] perf,tools: Add 'N' event/group modifier kan.liang
2015-08-18  9:25 ` [PATCH RFC 05/10] perf,tools: Enable statistic read for perf record kan.liang
2015-08-20  9:39   ` Jiri Olsa
2015-08-20  9:41   ` Jiri Olsa
2015-08-18  9:25 ` [PATCH RFC 06/10] perf,tools: New RECORD type PERF_RECORD_STAT_READ kan.liang
2015-08-20  9:49   ` Jiri Olsa
2015-08-18  9:25 ` [PATCH RFC 07/10] perf,tools: record counter statistics during sampling kan.liang
2015-08-18  9:25 ` [PATCH RFC 08/10] perf,tools: option to set stat read interval kan.liang
2015-08-18  9:25 ` [PATCH RFC 09/10] perf,tools: don't validate non-sample event kan.liang
2015-08-18  9:25 ` [PATCH RFC 10/10] perf,tools: Show STAT_READ in perf report kan.liang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).