linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/5] perf stat: Introduce iostat mode to provide I/O performance metrics
@ 2021-02-03 13:58 Alexander Antonov
  2021-02-03 13:58 ` [PATCH v4 1/5] perf stat: Add AGGR_PCIE_PORT mode Alexander Antonov
                   ` (5 more replies)
  0 siblings, 6 replies; 19+ messages in thread
From: Alexander Antonov @ 2021-02-03 13:58 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, jolsa, ak, alexander.shishkin, mark.rutland,
	namhyung, irogers, mingo, peterz, alexander.antonov

The previous version can be found at:
v3: https://lkml.kernel.org/r/20210126080619.30275-1-alexander.antonov@linux.intel.com/
Changes in this revision are:
v3 -> v4:
- Addressed comment from Namhyung Kim:
   1. Removed NULL-termination of root ports list

The previous version can be found at:
v2: https://lkml.kernel.org/r/20201223130320.3930-1-alexander.antonov@linux.intel.com

Changes in this revision are:
v2 -> v3:
- Addressed comments from Namhyung Kim:
  1. Removed perf_device pointer from evsel structure. Use priv field instead
  2. Renamed 'iiostat' to 'iostat'
  3. Renamed 'show' mode to 'list' mode
  4. Renamed iiostat_delete_root_ports() to iiostat_release() and
     iostat_show_root_ports() to iostat_list()

The previous version can be found at:
v1: https://lkml.kernel.org/r/20201210090340.14358-1-alexander.antonov@linux.intel.com

Changes in this revision are:
v1 -> v2:
- Addressed comment from Arnaldo Carvalho de Melo:
  1. Using 'perf iiostat' subcommand instead of 'perf stat --iiostat':
    - Added perf-iiostat.sh script to use short command
    - Updated manual pages to get help for 'perf iiostat'
    - Added 'perf-iiostat' to perf's gitignore file

Mode is intended to provide four I/O performance metrics in MB per each
root port:
 - Inbound Read:   I/O devices below root port read from the host memory
 - Inbound Write:  I/O devices below root port write to the host memory
 - Outbound Read:  CPU reads from I/O devices below root port
 - Outbound Write: CPU writes to I/O devices below root port

Each metric requiries only one uncore event which increments at every 4B
transfer in corresponding direction. The formulas to compute metrics
are generic:
    #EventCount * 4B / (1024 * 1024)

Note: iostat introduces new perf data aggregation mode - per PCIe root port
hence -e and -M options are not supported.

Usage examples:

1. List all PCIe root ports (example for 2-S platform):
   $ perf iostat list
   S0-uncore_iio_0<0000:00>
   S1-uncore_iio_0<0000:80>
   S0-uncore_iio_1<0000:17>
   S1-uncore_iio_1<0000:85>
   S0-uncore_iio_2<0000:3a>
   S1-uncore_iio_2<0000:ae>
   S0-uncore_iio_3<0000:5d>
   S1-uncore_iio_3<0000:d7>

2. Collect metrics for all PCIe root ports:
   $ perf iostat -- dd if=/dev/zero of=/dev/nvme0n1 bs=1M oflag=direct
   357708+0 records in
   357707+0 records out
   375083606016 bytes (375 GB, 349 GiB) copied, 215.974 s, 1.7 GB/s

    Performance counter stats for 'system wide':

      port             Inbound Read(MB)    Inbound Write(MB)    Outbound Read(MB)   Outbound Write(MB) 
   0000:00                    1                    0                    2                    3 
   0000:80                    0                    0                    0                    0 
   0000:17               352552                   43                    0                   21 
   0000:85                    0                    0                    0                    0 
   0000:3a                    3                    0                    0                    0 
   0000:ae                    0                    0                    0                    0 
   0000:5d                    0                    0                    0                    0 
   0000:d7                    0                    0                    0                    0

3. Collect metrics for comma separated list of PCIe root ports:
   $ perf iostat 0000:17,0:3a -- dd if=/dev/zero of=/dev/nvme0n1 bs=1M oflag=direct
   357708+0 records in
   357707+0 records out
   375083606016 bytes (375 GB, 349 GiB) copied, 197.08 s, 1.9 GB/s

    Performance counter stats for 'system wide':

      port             Inbound Read(MB)    Inbound Write(MB)    Outbound Read(MB)   Outbound Write(MB) 
   0000:17               358559                   44                    0                   22 
   0000:3a                    3                    2                    0                    0 

        197.081983474 seconds time elapsed


Alexander Antonov (5):
  perf stat: Add AGGR_PCIE_PORT mode
  perf stat: Basic support for iostat in perf
  perf stat: Helper functions for PCIe root ports list in iostat mode
  perf stat: Enable iostat mode for x86 platforms
  perf: Update .gitignore file

 tools/perf/.gitignore                         |   1 +
 tools/perf/Documentation/perf-iostat.txt      |  88 ++++
 tools/perf/Makefile.perf                      |   5 +-
 tools/perf/arch/x86/util/Build                |   1 +
 tools/perf/arch/x86/util/iostat.c             | 469 ++++++++++++++++++
 tools/perf/builtin-stat.c                     |  36 +-
 tools/perf/command-list.txt                   |   1 +
 tools/perf/perf-iostat.sh                     |  12 +
 tools/perf/util/iostat.h                      |  32 ++
 .../scripting-engines/trace-event-python.c    |   3 +-
 tools/perf/util/stat-display.c                |  53 +-
 tools/perf/util/stat-shadow.c                 |  11 +-
 tools/perf/util/stat.c                        |   4 +-
 tools/perf/util/stat.h                        |   2 +
 14 files changed, 710 insertions(+), 8 deletions(-)
 create mode 100644 tools/perf/Documentation/perf-iostat.txt
 create mode 100644 tools/perf/arch/x86/util/iostat.c
 create mode 100644 tools/perf/perf-iostat.sh
 create mode 100644 tools/perf/util/iostat.h


base-commit: b145b0eb2031a620ca010174240963e4d2c6ce26
-- 
2.19.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v4 1/5] perf stat: Add AGGR_PCIE_PORT mode
  2021-02-03 13:58 [PATCH v4 0/5] perf stat: Introduce iostat mode to provide I/O performance metrics Alexander Antonov
@ 2021-02-03 13:58 ` Alexander Antonov
  2021-02-04 12:07   ` Namhyung Kim
  2021-02-03 13:58 ` [PATCH v4 2/5] perf stat: Basic support for iostat in perf Alexander Antonov
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Alexander Antonov @ 2021-02-03 13:58 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, jolsa, ak, alexander.shishkin, mark.rutland,
	namhyung, irogers, mingo, peterz, alexander.antonov

Adding AGGR_PCIE_PORT mode to be able to distinguish aggr_mode
for root ports in following patches.

Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
---
 tools/perf/builtin-stat.c                           |  5 ++++-
 .../util/scripting-engines/trace-event-python.c     |  3 ++-
 tools/perf/util/stat-display.c                      | 13 +++++++++++--
 tools/perf/util/stat.c                              |  4 +++-
 tools/perf/util/stat.h                              |  1 +
 5 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 468fc49420ce..60fdb6a0805f 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -908,6 +908,7 @@ static int perf_stat_init_aggr_mode(void)
 		break;
 	case AGGR_GLOBAL:
 	case AGGR_THREAD:
+	case AGGR_PCIE_PORT:
 	case AGGR_UNSET:
 	default:
 		break;
@@ -1072,6 +1073,7 @@ static int perf_stat_init_aggr_mode_file(struct perf_stat *st)
 	case AGGR_NONE:
 	case AGGR_GLOBAL:
 	case AGGR_THREAD:
+	case AGGR_PCIE_PORT:
 	case AGGR_UNSET:
 	default:
 		break;
@@ -1844,7 +1846,8 @@ int cmd_stat(int argc, const char **argv)
 	 * --per-thread is aggregated per thread, we dont mix it with cpu mode
 	 */
 	if (((stat_config.aggr_mode != AGGR_GLOBAL &&
-	      stat_config.aggr_mode != AGGR_THREAD) || nr_cgroups) &&
+	      stat_config.aggr_mode != AGGR_THREAD &&
+	      stat_config.aggr_mode != AGGR_PCIE_PORT) || nr_cgroups) &&
 	    !target__has_cpu(&target)) {
 		fprintf(stderr, "both cgroup and no-aggregation "
 			"modes only available in system-wide mode\n");
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index 5d341efc3237..e604c199f493 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -1396,7 +1396,8 @@ static void python_process_stat(struct perf_stat_config *config,
 	struct perf_cpu_map *cpus = counter->core.cpus;
 	int cpu, thread;
 
-	if (config->aggr_mode == AGGR_GLOBAL) {
+	if (config->aggr_mode == AGGR_GLOBAL ||
+	    config->aggr_mode == AGGR_PCIE_PORT) {
 		process_stat(counter, -1, -1, tstamp,
 			     &counter->counts->aggr);
 		return;
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index ed3b0ac2f785..db1bec115d0b 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -123,6 +123,7 @@ static void aggr_printout(struct perf_stat_config *config,
 			config->csv_sep);
 		break;
 	case AGGR_GLOBAL:
+	case AGGR_PCIE_PORT:
 	case AGGR_UNSET:
 	default:
 		break;
@@ -322,7 +323,8 @@ static int first_shadow_cpu(struct perf_stat_config *config,
 	if (config->aggr_mode == AGGR_NONE)
 		return id;
 
-	if (config->aggr_mode == AGGR_GLOBAL)
+	if (config->aggr_mode == AGGR_GLOBAL ||
+	    config->aggr_mode == AGGR_PCIE_PORT)
 		return 0;
 
 	for (i = 0; i < perf_evsel__nr_cpus(evsel); i++) {
@@ -416,6 +418,7 @@ static void printout(struct perf_stat_config *config, int id, int nr,
 	if (config->csv_output && !config->metric_only) {
 		static int aggr_fields[] = {
 			[AGGR_GLOBAL] = 0,
+			[AGGR_PCIE_PORT] = 0,
 			[AGGR_THREAD] = 1,
 			[AGGR_NONE] = 1,
 			[AGGR_SOCKET] = 2,
@@ -899,6 +902,7 @@ static int aggr_header_lens[] = {
 	[AGGR_NONE] = 6,
 	[AGGR_THREAD] = 24,
 	[AGGR_GLOBAL] = 0,
+	[AGGR_PCIE_PORT] = 0,
 };
 
 static const char *aggr_header_csv[] = {
@@ -907,7 +911,8 @@ static const char *aggr_header_csv[] = {
 	[AGGR_SOCKET] 	= 	"socket,cpus",
 	[AGGR_NONE] 	= 	"cpu,",
 	[AGGR_THREAD] 	= 	"comm-pid,",
-	[AGGR_GLOBAL] 	=	""
+	[AGGR_GLOBAL]	=	"",
+	[AGGR_PCIE_PORT] =	"port,"
 };
 
 static void print_metric_headers(struct perf_stat_config *config,
@@ -990,6 +995,8 @@ static void print_interval(struct perf_stat_config *config,
 			if (!metric_only)
 				fprintf(output, "                  counts %*s events\n", unit_width, "unit");
 			break;
+		case AGGR_PCIE_PORT:
+			break;
 		case AGGR_GLOBAL:
 		default:
 			fprintf(output, "#           time");
@@ -1214,6 +1221,8 @@ perf_evlist__print_counters(struct evlist *evlist,
 			}
 		}
 		break;
+	case AGGR_PCIE_PORT:
+		break;
 	case AGGR_UNSET:
 	default:
 		break;
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index ebdd130557fb..9ffb97b31710 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -318,6 +318,7 @@ process_counter_values(struct perf_stat_config *config, struct evsel *evsel,
 		}
 		break;
 	case AGGR_GLOBAL:
+	case AGGR_PCIE_PORT:
 		aggr->val += count->val;
 		aggr->ena += count->ena;
 		aggr->run += count->run;
@@ -377,7 +378,8 @@ int perf_stat_process_counter(struct perf_stat_config *config,
 	if (ret)
 		return ret;
 
-	if (config->aggr_mode != AGGR_GLOBAL)
+	if (config->aggr_mode != AGGR_GLOBAL &&
+	    config->aggr_mode != AGGR_PCIE_PORT)
 		return 0;
 
 	if (!counter->snapshot)
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index edbeb2f63e8d..c7544c28c02a 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -46,6 +46,7 @@ enum aggr_mode {
 	AGGR_DIE,
 	AGGR_CORE,
 	AGGR_THREAD,
+	AGGR_PCIE_PORT,
 	AGGR_UNSET,
 };
 
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v4 2/5] perf stat: Basic support for iostat in perf
  2021-02-03 13:58 [PATCH v4 0/5] perf stat: Introduce iostat mode to provide I/O performance metrics Alexander Antonov
  2021-02-03 13:58 ` [PATCH v4 1/5] perf stat: Add AGGR_PCIE_PORT mode Alexander Antonov
@ 2021-02-03 13:58 ` Alexander Antonov
  2021-02-04 12:22   ` Namhyung Kim
  2021-02-03 13:58 ` [PATCH v4 3/5] perf stat: Helper functions for PCIe root ports list in iostat mode Alexander Antonov
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Alexander Antonov @ 2021-02-03 13:58 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, jolsa, ak, alexander.shishkin, mark.rutland,
	namhyung, irogers, mingo, peterz, alexander.antonov

Add basic flow for a new iostat mode in perf. Mode is intended to
provide four I/O performance metrics per each PCIe root port: Inbound Read,
Inbound Write, Outbound Read, Outbound Write.

The actual code to compute the metrics and attribute it to
root port is in follow-on patches.

Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
---
 tools/perf/builtin-stat.c      | 31 ++++++++++++++++++++++++++
 tools/perf/util/iostat.h       | 32 +++++++++++++++++++++++++++
 tools/perf/util/stat-display.c | 40 +++++++++++++++++++++++++++++++++-
 tools/perf/util/stat-shadow.c  | 11 +++++++++-
 tools/perf/util/stat.h         |  1 +
 5 files changed, 113 insertions(+), 2 deletions(-)
 create mode 100644 tools/perf/util/iostat.h

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 60fdb6a0805f..66c913692120 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -65,6 +65,7 @@
 #include "util/target.h"
 #include "util/time-utils.h"
 #include "util/top.h"
+#include "util/iostat.h"
 #include "asm/bug.h"
 
 #include <linux/time64.h>
@@ -186,6 +187,7 @@ static struct perf_stat_config stat_config = {
 	.metric_only_len	= METRIC_ONLY_LEN,
 	.walltime_nsecs_stats	= &walltime_nsecs_stats,
 	.big_num		= true,
+	.iostat_run		= false,
 };
 
 static inline void diff_timespec(struct timespec *r, struct timespec *a,
@@ -723,6 +725,14 @@ static int parse_metric_groups(const struct option *opt,
 	return metricgroup__parse_groups(opt, str, &stat_config.metric_events);
 }
 
+__weak int iostat_parse(const struct option *opt __maybe_unused,
+			 const char *str __maybe_unused,
+			 int unset __maybe_unused)
+{
+	pr_err("iostat mode is not supported\n");
+	return -1;
+}
+
 static struct option stat_options[] = {
 	OPT_BOOLEAN('T', "transaction", &transaction_run,
 		    "hardware transaction statistics"),
@@ -803,6 +813,8 @@ static struct option stat_options[] = {
 	OPT_CALLBACK('M', "metrics", &evsel_list, "metric/metric group list",
 		     "monitor specified metrics or metric groups (separated by ,)",
 		     parse_metric_groups),
+	OPT_CALLBACK_OPTARG(0, "iostat", &evsel_list, &stat_config, "root port",
+			    "measure PCIe metrics per root port", iostat_parse),
 	OPT_END()
 };
 
@@ -1131,6 +1143,12 @@ __weak void arch_topdown_group_warn(void)
 {
 }
 
+__weak int iostat_list(struct evlist *evlist __maybe_unused,
+			struct perf_stat_config *config __maybe_unused)
+{
+	return 0;
+}
+
 /*
  * Add default attributes, if there were no attributes specified or
  * if -d/--detailed, -d -d or -d -d -d is used:
@@ -1682,6 +1700,10 @@ static void setup_system_wide(int forks)
 	}
 }
 
+__weak void iostat_release(struct evlist *evlist __maybe_unused)
+{
+}
+
 int cmd_stat(int argc, const char **argv)
 {
 	const char * const stat_usage[] = {
@@ -1858,6 +1880,12 @@ int cmd_stat(int argc, const char **argv)
 		goto out;
 	}
 
+	if (stat_config.iostat_run) {
+		status = iostat_list(evsel_list, &stat_config);
+		if (status || !stat_config.iostat_run)
+			goto out;
+	}
+
 	if (add_default_attributes())
 		goto out;
 
@@ -2008,6 +2036,9 @@ int cmd_stat(int argc, const char **argv)
 	perf_stat__exit_aggr_mode();
 	perf_evlist__free_stats(evsel_list);
 out:
+	if (stat_config.iostat_run)
+		iostat_release(evsel_list);
+
 	zfree(&stat_config.walltime_run);
 
 	if (smi_cost && smi_reset)
diff --git a/tools/perf/util/iostat.h b/tools/perf/util/iostat.h
new file mode 100644
index 000000000000..b34ebedfd5e6
--- /dev/null
+++ b/tools/perf/util/iostat.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * perf iostat
+ *
+ * Copyright (C) 2020, Intel Corporation
+ *
+ * Authors: Alexander Antonov <alexander.antonov@linux.intel.com>
+ */
+
+#ifndef _IOSTAT_H
+#define _IOSTAT_H
+
+#include <subcmd/parse-options.h>
+#include "util/stat.h"
+#include "util/parse-events.h"
+#include "util/evlist.h"
+
+struct option;
+struct perf_stat_config;
+struct evlist;
+struct timespec;
+
+int iostat_parse(const struct option *opt, const char *str,
+		 int unset __maybe_unused);
+void iostat_prefix(struct perf_stat_config *config, struct evlist *evlist,
+		   char *prefix, struct timespec *ts);
+void iostat_print_metric(struct perf_stat_config *config, struct evsel *evsel,
+			 struct perf_stat_output_ctx *out);
+int iostat_list(struct evlist *evlist, struct perf_stat_config *config);
+void iostat_release(struct evlist *evlist);
+
+#endif /* _IOSTAT_H */
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index db1bec115d0b..de78cf6962b9 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -16,6 +16,7 @@
 #include <linux/ctype.h>
 #include "cgroup.h"
 #include <api/fs/fs.h>
+#include "iostat.h"
 
 #define CNTR_NOT_SUPPORTED	"<not supported>"
 #define CNTR_NOT_COUNTED	"<not counted>"
@@ -302,6 +303,11 @@ static void print_metric_header(struct perf_stat_config *config,
 	struct outstate *os = ctx;
 	char tbuf[1024];
 
+	/* In case of iostat, print metric header for first root port only */
+	if (config->iostat_run &&
+	    os->evsel->priv != os->evsel->evlist->selected->priv)
+		return;
+
 	if (!valid_only_metric(unit))
 		return;
 	unit = fixunit(tbuf, os->evsel, unit);
@@ -936,6 +942,8 @@ static void print_metric_headers(struct perf_stat_config *config,
 			fputs("time,", config->output);
 		fputs(aggr_header_csv[config->aggr_mode], config->output);
 	}
+	if (config->iostat_run && !config->interval && !config->csv_output)
+		fprintf(config->output, "   port         ");
 
 	/* Print metrics headers only */
 	evlist__for_each_entry(evlist, counter) {
@@ -954,6 +962,13 @@ static void print_metric_headers(struct perf_stat_config *config,
 	fputc('\n', config->output);
 }
 
+__weak void iostat_prefix(struct perf_stat_config *config __maybe_unused,
+			  struct evlist *evlist __maybe_unused,
+			  char *prefix __maybe_unused,
+			  struct timespec *ts __maybe_unused)
+{
+}
+
 static void print_interval(struct perf_stat_config *config,
 			   struct evlist *evlist,
 			   char *prefix, struct timespec *ts)
@@ -966,7 +981,10 @@ static void print_interval(struct perf_stat_config *config,
 	if (config->interval_clear)
 		puts(CONSOLE_CLEAR);
 
-	sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec, ts->tv_nsec, config->csv_sep);
+	if (!config->iostat_run)
+		sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec,
+						ts->tv_nsec,
+						config->csv_sep);
 
 	if ((num_print_interval == 0 && !config->csv_output) || config->interval_clear) {
 		switch (config->aggr_mode) {
@@ -996,6 +1014,7 @@ static void print_interval(struct perf_stat_config *config,
 				fprintf(output, "                  counts %*s events\n", unit_width, "unit");
 			break;
 		case AGGR_PCIE_PORT:
+			fprintf(output, "#           time    port        ");
 			break;
 		case AGGR_GLOBAL:
 		default:
@@ -1174,6 +1193,10 @@ perf_evlist__print_counters(struct evlist *evlist,
 	int interval = config->interval;
 	struct evsel *counter;
 	char buf[64], *prefix = NULL;
+	void *perf_device = NULL;
+
+	if (config->iostat_run)
+		evlist->selected = evlist__first(evlist);
 
 	if (interval)
 		print_interval(config, evlist, prefix = buf, ts);
@@ -1222,6 +1245,21 @@ perf_evlist__print_counters(struct evlist *evlist,
 		}
 		break;
 	case AGGR_PCIE_PORT:
+		counter = evlist__first(evlist);
+		perf_evlist__set_selected(evlist, counter);
+		iostat_prefix(config, evlist, prefix = buf, ts);
+		fprintf(config->output, "%s", prefix);
+		evlist__for_each_entry(evlist, counter) {
+			perf_device = evlist->selected->priv;
+			if (perf_device && perf_device != counter->priv) {
+				perf_evlist__set_selected(evlist, counter);
+				iostat_prefix(config, evlist, prefix, ts);
+				fprintf(config->output, "\n%s", prefix);
+			}
+			print_counter_aggr(config, counter, prefix);
+			if ((counter->idx + 1) == evlist->core.nr_entries)
+				fputc('\n', config->output);
+		}
 		break;
 	case AGGR_UNSET:
 	default:
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 2c41d47f6f83..083a450c6dc7 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -9,6 +9,7 @@
 #include "expr.h"
 #include "metricgroup.h"
 #include <linux/zalloc.h>
+#include "iostat.h"
 
 /*
  * AGGR_GLOBAL: Use CPU 0
@@ -814,6 +815,12 @@ static void generic_metric(struct perf_stat_config *config,
 		zfree(&pctx.ids[i].name);
 }
 
+__weak void iostat_print_metric(struct perf_stat_config *config __maybe_unused,
+				struct evsel *evsel __maybe_unused,
+				struct perf_stat_output_ctx *out __maybe_unused)
+{
+}
+
 void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 				   struct evsel *evsel,
 				   double avg, int cpu,
@@ -829,7 +836,9 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 	struct metric_event *me;
 	int num = 1;
 
-	if (perf_evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS)) {
+	if (config->iostat_run) {
+		iostat_print_metric(config, evsel, out);
+	} else if (perf_evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS)) {
 		total = runtime_stat_avg(st, STAT_CYCLES, ctx, cpu);
 
 		if (total) {
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index c7544c28c02a..c2a2b28effd6 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -107,6 +107,7 @@ struct perf_stat_config {
 	bool			 big_num;
 	bool			 no_merge;
 	bool			 walltime_run_table;
+	bool			 iostat_run;
 	FILE			*output;
 	unsigned int		 interval;
 	unsigned int		 timeout;
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v4 3/5] perf stat: Helper functions for PCIe root ports list in iostat mode
  2021-02-03 13:58 [PATCH v4 0/5] perf stat: Introduce iostat mode to provide I/O performance metrics Alexander Antonov
  2021-02-03 13:58 ` [PATCH v4 1/5] perf stat: Add AGGR_PCIE_PORT mode Alexander Antonov
  2021-02-03 13:58 ` [PATCH v4 2/5] perf stat: Basic support for iostat in perf Alexander Antonov
@ 2021-02-03 13:58 ` Alexander Antonov
  2021-02-04 12:32   ` Namhyung Kim
  2021-02-03 13:58 ` [PATCH v4 4/5] perf stat: Enable iostat mode for x86 platforms Alexander Antonov
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Alexander Antonov @ 2021-02-03 13:58 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, jolsa, ak, alexander.shishkin, mark.rutland,
	namhyung, irogers, mingo, peterz, alexander.antonov

Introduce helper functions to control PCIe root ports list.
These helpers will be used in the follow-up patch.

Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
---
 tools/perf/arch/x86/util/iostat.c | 124 ++++++++++++++++++++++++++++++
 1 file changed, 124 insertions(+)
 create mode 100644 tools/perf/arch/x86/util/iostat.c

diff --git a/tools/perf/arch/x86/util/iostat.c b/tools/perf/arch/x86/util/iostat.c
new file mode 100644
index 000000000000..961e540106e6
--- /dev/null
+++ b/tools/perf/arch/x86/util/iostat.c
@@ -0,0 +1,124 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * perf iostat
+ *
+ * Copyright (C) 2020, Intel Corporation
+ *
+ * Authors: Alexander Antonov <alexander.antonov@linux.intel.com>
+ */
+
+#include <api/fs/fs.h>
+#include <linux/kernel.h>
+#include <linux/err.h>
+#include <limits.h>
+#include <stdio.h>
+#include <string.h>
+#include <errno.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <dirent.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <regex.h>
+#include "util/cpumap.h"
+#include "util/debug.h"
+#include "util/iostat.h"
+#include "util/counts.h"
+#include "path.h"
+
+struct iio_root_port {
+	u32 domain;
+	u8 bus;
+	u8 die;
+	u8 pmu_idx;
+	int idx;
+};
+
+struct iio_root_ports_list {
+	struct iio_root_port **rps;
+	int nr_entries;
+};
+
+static void iio_root_port_show(FILE *output,
+			       const struct iio_root_port * const rp)
+{
+	if (output && rp)
+		fprintf(output, "S%d-uncore_iio_%d<%04x:%02x>\n",
+			rp->die, rp->pmu_idx, rp->domain, rp->bus);
+}
+
+static struct iio_root_port *iio_root_port_new(u32 domain, u8 bus,
+					       u8 die, u8 pmu_idx)
+{
+	struct iio_root_port *p = calloc(1, sizeof(*p));
+
+	if (p) {
+		p->domain = domain;
+		p->bus = bus;
+		p->die = die;
+		p->pmu_idx = pmu_idx;
+	}
+	return p;
+}
+
+static struct iio_root_ports_list *iio_root_ports_list_new(void)
+{
+	struct iio_root_ports_list *list = calloc(1, sizeof(*list));
+
+	if (list) {
+		list->rps = calloc(1, sizeof(struct iio_root_port *));
+		if (!list->rps) {
+			free(list);
+			list = NULL;
+		}
+	}
+	return list;
+}
+
+static void iio_root_ports_list_free(struct iio_root_ports_list *list)
+{
+	int idx;
+
+	if (list) {
+		for (idx = 0; idx < list->nr_entries; idx++)
+			free(list->rps[idx]);
+		free(list->rps);
+		free(list);
+	}
+}
+
+static struct iio_root_port *iio_root_port_find_by_notation(
+	const struct iio_root_ports_list * const list, u32 domain, u8 bus)
+{
+	int idx;
+	struct iio_root_port *rp;
+
+	if (list) {
+		for (idx = 0; idx < list->nr_entries; idx++) {
+			rp = list->rps[idx];
+			if (rp && rp->domain == domain && rp->bus == bus)
+				return rp;
+		}
+	}
+	return NULL;
+}
+
+static int iio_root_ports_list_insert(struct iio_root_ports_list *list,
+				      struct iio_root_port * const rp)
+{
+	struct iio_root_port **tmp_buf;
+
+	if (list && rp) {
+		rp->idx = list->nr_entries++;
+		tmp_buf = realloc(list->rps,
+				  list->nr_entries * sizeof(*list->rps));
+		if (!tmp_buf) {
+			pr_err("Failed to realloc memory\n");
+			return -ENOMEM;
+		}
+		tmp_buf[rp->idx] = rp;
+		list->rps = tmp_buf;
+	}
+	return 0;
+}
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v4 4/5] perf stat: Enable iostat mode for x86 platforms
  2021-02-03 13:58 [PATCH v4 0/5] perf stat: Introduce iostat mode to provide I/O performance metrics Alexander Antonov
                   ` (2 preceding siblings ...)
  2021-02-03 13:58 ` [PATCH v4 3/5] perf stat: Helper functions for PCIe root ports list in iostat mode Alexander Antonov
@ 2021-02-03 13:58 ` Alexander Antonov
  2021-03-09  7:51   ` liuqi (BA)
  2021-02-03 13:58 ` [PATCH v4 5/5] perf: Update .gitignore file Alexander Antonov
  2021-02-03 20:55 ` [PATCH v4 0/5] perf stat: Introduce iostat mode to provide I/O performance metrics Arnaldo Carvalho de Melo
  5 siblings, 1 reply; 19+ messages in thread
From: Alexander Antonov @ 2021-02-03 13:58 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, jolsa, ak, alexander.shishkin, mark.rutland,
	namhyung, irogers, mingo, peterz, alexander.antonov

This functionality is based on recently introduced sysfs attributes
for Intel® Xeon® Scalable processor family (code name Skylake-SP):
Commit bb42b3d39781 ("perf/x86/intel/uncore: Expose an Uncore unit to
IIO PMON mapping")

Mode is intended to provide four I/O performance metrics in MB per each
PCIe root port:
 - Inbound Read: I/O devices below root port read from the host memory
 - Inbound Write: I/O devices below root port write to the host memory
 - Outbound Read: CPU reads from I/O devices below root port
 - Outbound Write: CPU writes to I/O devices below root port

Each metric requiries only one uncore event which increments at every 4B
transfer in corresponding direction. The formulas to compute metrics
are generic:
    #EventCount * 4B / (1024 * 1024)

Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
---
 tools/perf/Documentation/perf-iostat.txt |  88 ++++++
 tools/perf/Makefile.perf                 |   5 +-
 tools/perf/arch/x86/util/Build           |   1 +
 tools/perf/arch/x86/util/iostat.c        | 345 +++++++++++++++++++++++
 tools/perf/command-list.txt              |   1 +
 tools/perf/perf-iostat.sh                |  12 +
 6 files changed, 451 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/Documentation/perf-iostat.txt
 create mode 100644 tools/perf/perf-iostat.sh

diff --git a/tools/perf/Documentation/perf-iostat.txt b/tools/perf/Documentation/perf-iostat.txt
new file mode 100644
index 000000000000..165176944031
--- /dev/null
+++ b/tools/perf/Documentation/perf-iostat.txt
@@ -0,0 +1,88 @@
+perf-iostat(1)
+===============
+
+NAME
+----
+perf-iostat - Show I/O performance metrics
+
+SYNOPSIS
+--------
+[verse]
+'perf iostat' list
+'perf iostat' <ports> -- <command> [<options>]
+
+DESCRIPTION
+-----------
+Mode is intended to provide four I/O performance metrics per each PCIe root port:
+
+- Inbound Read   - I/O devices below root port read from the host memory, in MB
+
+- Inbound Write  - I/O devices below root port write to the host memory, in MB
+
+- Outbound Read  - CPU reads from I/O devices below root port, in MB
+
+- Outbound Write - CPU writes to I/O devices below root port, in MB
+
+OPTIONS
+-------
+<command>...::
+	Any command you can specify in a shell.
+
+list::
+	List all PCIe root ports.
+
+<ports>::
+	Select the root ports for monitoring. Comma-separated list is supported.
+
+EXAMPLES
+--------
+
+1. List all PCIe root ports (example for 2-S platform):
+
+   $ perf iostat list
+   S0-uncore_iio_0<0000:00>
+   S1-uncore_iio_0<0000:80>
+   S0-uncore_iio_1<0000:17>
+   S1-uncore_iio_1<0000:85>
+   S0-uncore_iio_2<0000:3a>
+   S1-uncore_iio_2<0000:ae>
+   S0-uncore_iio_3<0000:5d>
+   S1-uncore_iio_3<0000:d7>
+
+2. Collect metrics for all PCIe root ports:
+
+   $ perf iostat -- dd if=/dev/zero of=/dev/nvme0n1 bs=1M oflag=direct
+   357708+0 records in
+   357707+0 records out
+   375083606016 bytes (375 GB, 349 GiB) copied, 215.974 s, 1.7 GB/s
+
+    Performance counter stats for 'system wide':
+
+      port             Inbound Read(MB)    Inbound Write(MB)    Outbound Read(MB)   Outbound Write(MB)
+   0000:00                    1                    0                    2                    3
+   0000:80                    0                    0                    0                    0
+   0000:17               352552                   43                    0                   21
+   0000:85                    0                    0                    0                    0
+   0000:3a                    3                    0                    0                    0
+   0000:ae                    0                    0                    0                    0
+   0000:5d                    0                    0                    0                    0
+   0000:d7                    0                    0                    0                    0
+
+3. Collect metrics for comma-separated list of PCIe root ports:
+
+   $ perf iostat 0000:17,0:3a -- dd if=/dev/zero of=/dev/nvme0n1 bs=1M oflag=direct
+   357708+0 records in
+   357707+0 records out
+   375083606016 bytes (375 GB, 349 GiB) copied, 197.08 s, 1.9 GB/s
+
+    Performance counter stats for 'system wide':
+
+      port             Inbound Read(MB)    Inbound Write(MB)    Outbound Read(MB)   Outbound Write(MB)
+   0000:17               358559                   44                    0                   22
+   0000:3a                    3                    2                    0                    0
+
+        197.081983474 seconds time elapsed
+
+SEE ALSO
+--------
+linkperf:perf-stat[1]
\ No newline at end of file
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 902c792f326a..b4ab48cc01e3 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -267,6 +267,7 @@ SCRIPT_SH =
 
 SCRIPT_SH += perf-archive.sh
 SCRIPT_SH += perf-with-kcore.sh
+SCRIPT_SH += perf-iostat.sh
 
 grep-libs = $(filter -l%,$(1))
 strip-libs = $(filter-out -l%,$(1))
@@ -890,6 +891,8 @@ endif
 		$(INSTALL) $(OUTPUT)perf-archive -t '$(DESTDIR_SQ)$(perfexec_instdir_SQ)'
 	$(call QUIET_INSTALL, perf-with-kcore) \
 		$(INSTALL) $(OUTPUT)perf-with-kcore -t '$(DESTDIR_SQ)$(perfexec_instdir_SQ)'
+	$(call QUIET_INSTALL, perf-iostat) \
+		$(INSTALL) $(OUTPUT)perf-iostat -t '$(DESTDIR_SQ)$(perfexec_instdir_SQ)'
 ifndef NO_LIBAUDIT
 	$(call QUIET_INSTALL, strace/groups) \
 		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(STRACE_GROUPS_INSTDIR_SQ)'; \
@@ -954,7 +957,7 @@ python-clean:
 	$(python-clean)
 
 clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean config-clean fixdep-clean python-clean
-	$(call QUIET_CLEAN, core-objs)  $(RM) $(LIBPERF_A) $(OUTPUT)perf-archive $(OUTPUT)perf-with-kcore $(LANG_BINDINGS)
+	$(call QUIET_CLEAN, core-objs)  $(RM) $(LIBPERF_A) $(OUTPUT)perf-archive $(OUTPUT)perf-with-kcore $(OUTPUT)perf-iostat $(LANG_BINDINGS)
 	$(Q)find $(if $(OUTPUT),$(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
 	$(Q)$(RM) $(OUTPUT).config-detected
 	$(call QUIET_CLEAN, core-progs) $(RM) $(ALL_PROGRAMS) perf perf-read-vdso32 perf-read-vdsox32 $(OUTPUT)pmu-events/jevents $(OUTPUT)$(LIBJVMTI).so
diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
index 47f9c56e744f..5b650bebee78 100644
--- a/tools/perf/arch/x86/util/Build
+++ b/tools/perf/arch/x86/util/Build
@@ -6,6 +6,7 @@ perf-y += perf_regs.o
 perf-y += group.o
 perf-y += machine.o
 perf-y += event.o
+perf-y += iostat.o
 
 perf-$(CONFIG_DWARF) += dwarf-regs.o
 perf-$(CONFIG_BPF_PROLOGUE) += dwarf-regs.o
diff --git a/tools/perf/arch/x86/util/iostat.c b/tools/perf/arch/x86/util/iostat.c
index 961e540106e6..468a9696a12d 100644
--- a/tools/perf/arch/x86/util/iostat.c
+++ b/tools/perf/arch/x86/util/iostat.c
@@ -27,6 +27,44 @@
 #include "util/counts.h"
 #include "path.h"
 
+#ifndef MAX_PATH
+#define MAX_PATH 1024
+#endif
+
+#define UNCORE_IIO_PMU_PATH	"devices/uncore_iio_%d"
+#define SYSFS_UNCORE_PMU_PATH	"%s/"UNCORE_IIO_PMU_PATH
+#define PLATFORM_MAPPING_PATH	UNCORE_IIO_PMU_PATH"/die%d"
+
+enum iostat_mode_t {
+	IOSTAT_NONE		= -1,
+	IOSTAT_RUN		= 0,
+	IOSTAT_LIST		= 1
+};
+
+static enum iostat_mode_t iostat_mode = IOSTAT_NONE;
+
+/*
+ * Each metric requiries one IIO event which increments at every 4B transfer
+ * in corresponding direction. The formulas to compute metrics are generic:
+ *     #EventCount * 4B / (1024 * 1024)
+ */
+static const char * const iostat_metrics[] = {
+	"Inbound Read(MB)",
+	"Inbound Write(MB)",
+	"Outbound Read(MB)",
+	"Outbound Write(MB)",
+};
+
+static inline int iostat_metrics_count(void)
+{
+	return sizeof(iostat_metrics) / sizeof(char *);
+}
+
+static const char *iostat_metric_by_idx(int idx)
+{
+	return *(iostat_metrics + idx % iostat_metrics_count());
+}
+
 struct iio_root_port {
 	u32 domain;
 	u8 bus;
@@ -122,3 +160,310 @@ static int iio_root_ports_list_insert(struct iio_root_ports_list *list,
 	}
 	return 0;
 }
+
+static int iio_mapping(u8 pmu_idx, struct iio_root_ports_list * const list)
+{
+	char *buf;
+	char path[MAX_PATH];
+	u32 domain;
+	u8 bus;
+	struct iio_root_port *rp;
+	size_t size;
+	int ret;
+
+	for (int die = 0; die < cpu__max_node(); die++) {
+		scnprintf(path, MAX_PATH, PLATFORM_MAPPING_PATH, pmu_idx, die);
+		if (sysfs__read_str(path, &buf, &size) < 0) {
+			if (pmu_idx)
+				goto out;
+			pr_err("Mode iostat is not supported\n");
+			return -1;
+		}
+		ret = sscanf(buf, "%04x:%02hhx", &domain, &bus);
+		free(buf);
+		if (ret != 2) {
+			pr_err("Invalid mapping data: iio_%d; die%d\n",
+			       pmu_idx, die);
+			return -1;
+		}
+		rp = iio_root_port_new(domain, bus, die, pmu_idx);
+		if (!rp || iio_root_ports_list_insert(list, rp)) {
+			free(rp);
+			return -ENOMEM;
+		}
+	}
+out:
+	return 0;
+}
+
+static u8 iio_pmu_count(void)
+{
+	u8 pmu_idx = 0;
+	char path[MAX_PATH];
+	const char *sysfs = sysfs__mountpoint();
+
+	if (sysfs) {
+		for (;; pmu_idx++) {
+			snprintf(path, sizeof(path), SYSFS_UNCORE_PMU_PATH,
+				 sysfs, pmu_idx);
+			if (access(path, F_OK) != 0)
+				break;
+		}
+	}
+	return pmu_idx;
+}
+
+static int iio_root_ports_scan(struct iio_root_ports_list **list)
+{
+	int ret = -ENOMEM;
+	struct iio_root_ports_list *tmp_list;
+	u8 pmu_count = iio_pmu_count();
+
+	if (!pmu_count) {
+		pr_err("Unsupported uncore pmu configuration\n");
+		return -1;
+	}
+
+	tmp_list = iio_root_ports_list_new();
+	if (!tmp_list)
+		goto err;
+
+	for (u8 pmu_idx = 0; pmu_idx < pmu_count; pmu_idx++) {
+		ret = iio_mapping(pmu_idx, tmp_list);
+		if (ret)
+			break;
+	}
+err:
+	if (!ret)
+		*list = tmp_list;
+	else
+		iio_root_ports_list_free(tmp_list);
+
+	return ret;
+}
+
+static int iio_root_port_parse_str(u32 *domain, u8 *bus, char *str)
+{
+	int ret;
+	regex_t regex;
+	/*
+	 * Expected format domain:bus:
+	 * Valid domain range [0:ffff]
+	 * Valid bus range [0:ff]
+	 * Example: 0000:af, 0:3d, 01:7
+	 */
+	regcomp(&regex, "^([a-f0-9A-F]{1,}):([a-f0-9A-F]{1,2})$", REG_EXTENDED);
+	ret = regexec(&regex, str, 0, NULL, 0);
+	if (ret || sscanf(str, "%08x:%02hhx", domain, bus) != 2)
+		pr_warning("Unrecognized root port format: %s\n"
+			   "Please use the following format:\n"
+			   "\t [domain]:[bus]\n"
+			   "\t for example: 0000:3d\n", str);
+
+	regfree(&regex);
+	return ret;
+}
+
+static int iio_root_ports_list_filter(struct iio_root_ports_list **list,
+				      const char *filter)
+{
+	char *tok, *tmp, *filter_copy = NULL;
+	struct iio_root_port *rp;
+	u32 domain;
+	u8 bus;
+	int ret = -ENOMEM;
+	struct iio_root_ports_list *tmp_list = iio_root_ports_list_new();
+
+	if (!tmp_list)
+		goto err;
+
+	filter_copy = strdup(filter);
+	if (!filter_copy)
+		goto err;
+
+	for (tok = strtok_r(filter_copy, ",", &tmp); tok;
+	     tok = strtok_r(NULL, ",", &tmp)) {
+		if (!iio_root_port_parse_str(&domain, &bus, tok)) {
+			rp = iio_root_port_find_by_notation(*list, domain, bus);
+			if (rp) {
+				(*list)->rps[rp->idx] = NULL;
+				ret = iio_root_ports_list_insert(tmp_list, rp);
+				if (ret) {
+					free(rp);
+					goto err;
+				}
+			} else if (!iio_root_port_find_by_notation(tmp_list,
+								   domain, bus))
+				pr_warning("Root port %04x:%02x were not found\n",
+					   domain, bus);
+		}
+	}
+
+	if (tmp_list->nr_entries == 0) {
+		pr_err("Requested root ports were not found\n");
+		ret = -EINVAL;
+	}
+err:
+	iio_root_ports_list_free(*list);
+	if (ret)
+		iio_root_ports_list_free(tmp_list);
+	else
+		*list = tmp_list;
+
+	free(filter_copy);
+	return ret;
+}
+
+static int iostat_event_group(struct evlist *evl,
+			      struct iio_root_ports_list *list)
+{
+	int ret;
+	int idx;
+	const char *iostat_cmd_template =
+	"{uncore_iio_%x/event=0x83,umask=0x04,ch_mask=0xF,fc_mask=0x07/,\
+	  uncore_iio_%x/event=0x83,umask=0x01,ch_mask=0xF,fc_mask=0x07/,\
+	  uncore_iio_%x/event=0xc0,umask=0x04,ch_mask=0xF,fc_mask=0x07/,\
+	  uncore_iio_%x/event=0xc0,umask=0x01,ch_mask=0xF,fc_mask=0x07/}";
+	const int len_template = strlen(iostat_cmd_template) + 1;
+	struct evsel *evsel = NULL;
+	int metrics_count = iostat_metrics_count();
+	char *iostat_cmd = calloc(len_template, 1);
+
+	if (!iostat_cmd)
+		return -ENOMEM;
+
+	for (idx = 0; idx < list->nr_entries; idx++) {
+		sprintf(iostat_cmd, iostat_cmd_template,
+			list->rps[idx]->pmu_idx, list->rps[idx]->pmu_idx,
+			list->rps[idx]->pmu_idx, list->rps[idx]->pmu_idx);
+		ret = parse_events(evl, iostat_cmd, NULL);
+		if (ret)
+			goto err;
+	}
+
+	evlist__for_each_entry(evl, evsel) {
+		evsel->priv = list->rps[evsel->idx / metrics_count];
+	}
+	list->nr_entries = 0;
+err:
+	iio_root_ports_list_free(list);
+	free(iostat_cmd);
+	return ret;
+}
+
+int iostat_parse(const struct option *opt, const char *str,
+		 int unset __maybe_unused)
+{
+	int ret;
+	struct iio_root_ports_list *list;
+	struct evlist *evl = *(struct evlist **)opt->value;
+	struct perf_stat_config *config = (struct perf_stat_config *)opt->data;
+
+	if (evl->core.nr_entries > 0) {
+		pr_err("Unsupported event configuration\n");
+		return -1;
+	}
+	config->metric_only = true;
+	config->aggr_mode = AGGR_PCIE_PORT;
+	config->iostat_run = true;
+	ret = iio_root_ports_scan(&list);
+	if (ret)
+		return ret;
+
+	if (!str) {
+		iostat_mode = IOSTAT_RUN;
+	} else if (!strcmp(str, "list")) {
+		iostat_mode = IOSTAT_LIST;
+	} else {
+		iostat_mode = IOSTAT_RUN;
+		ret = iio_root_ports_list_filter(&list, str);
+		if (ret)
+			return ret;
+	}
+	return iostat_event_group(evl, list);
+}
+
+void iostat_prefix(struct perf_stat_config *config,
+		   struct evlist *evlist,
+		   char *prefix, struct timespec *ts)
+{
+	struct iio_root_port *rp = evlist->selected->priv;
+
+	if (rp) {
+		if (ts)
+			sprintf(prefix, "%6lu.%09lu%s%04x:%02x%s",
+				ts->tv_sec, ts->tv_nsec,
+				config->csv_sep, rp->domain, rp->bus,
+				config->csv_sep);
+		else
+			sprintf(prefix, "%04x:%02x%s", rp->domain, rp->bus,
+				config->csv_sep);
+	}
+}
+
+void iostat_print_metric(struct perf_stat_config *config, struct evsel *evsel,
+			 struct perf_stat_output_ctx *out)
+{
+	double iostat_value = 0;
+	u64 prev_count_val = 0;
+	const char *iostat_metric = iostat_metric_by_idx(evsel->idx);
+	u8 die = ((struct iio_root_port *)evsel->priv)->die;
+	struct perf_counts_values *count = perf_counts(evsel->counts, die, 0);
+
+	if (count->run && count->ena) {
+		if (evsel->prev_raw_counts && !out->force_header) {
+			struct perf_counts_values *prev_count =
+				perf_counts(evsel->prev_raw_counts, die, 0);
+
+			prev_count_val = prev_count->val;
+			prev_count->val = count->val;
+		}
+		iostat_value = (count->val - prev_count_val) /
+			       ((double) count->run / count->ena);
+	}
+	out->print_metric(config, out->ctx, NULL, "%8.0f", iostat_metric,
+			  iostat_value / (256 * 1024));
+}
+
+int iostat_list(struct evlist *evlist, struct perf_stat_config *config)
+{
+	struct evsel *evsel;
+	struct iio_root_port *rp = NULL;
+	bool is_list_mode = (iostat_mode == IOSTAT_LIST);
+
+	if (config->aggr_mode != AGGR_PCIE_PORT) {
+		pr_err("Unsupported event configuration\n");
+		return -1;
+	}
+
+	if (is_list_mode || verbose) {
+		evlist__for_each_entry(evlist, evsel) {
+			if (!evsel->priv) {
+				pr_err("Unsupported event configuration\n");
+				return -1;
+			}
+			if (rp != evsel->priv) {
+				rp = evsel->priv;
+				iio_root_port_show(config->output, rp);
+			}
+		}
+	}
+	/* Stop iostat in list-mode*/
+	config->iostat_run = !is_list_mode;
+	if (is_list_mode)
+		iostat_release(evlist);
+	return 0;
+}
+
+void iostat_release(struct evlist *evlist)
+{
+	struct evsel *evsel;
+	struct iio_root_port *rp = NULL;
+
+	evlist__for_each_entry(evlist, evsel) {
+		if (rp != evsel->priv) {
+			rp = evsel->priv;
+			free(evsel->priv);
+		}
+	}
+}
diff --git a/tools/perf/command-list.txt b/tools/perf/command-list.txt
index bc6c585f74fc..620db1e8e6a7 100644
--- a/tools/perf/command-list.txt
+++ b/tools/perf/command-list.txt
@@ -14,6 +14,7 @@ perf-config			mainporcelain common
 perf-evlist			mainporcelain common
 perf-ftrace			mainporcelain common
 perf-inject			mainporcelain common
+perf-iostat			mainporcelain common
 perf-kallsyms			mainporcelain common
 perf-kmem			mainporcelain common
 perf-kvm			mainporcelain common
diff --git a/tools/perf/perf-iostat.sh b/tools/perf/perf-iostat.sh
new file mode 100644
index 000000000000..e562f252d56f
--- /dev/null
+++ b/tools/perf/perf-iostat.sh
@@ -0,0 +1,12 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# perf iostat
+# Alexander Antonov <alexander.antonov@linux.intel.com>
+
+if [[ "$1" == "list" ]] || [[ "$1" =~ ([a-f0-9A-F]{1,}):([a-f0-9A-F]{1,2})(,)? ]]; then
+        DELIMITER="="
+else
+        DELIMITER=" "
+fi
+
+perf stat --iostat$DELIMITER$*
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v4 5/5] perf: Update .gitignore file
  2021-02-03 13:58 [PATCH v4 0/5] perf stat: Introduce iostat mode to provide I/O performance metrics Alexander Antonov
                   ` (3 preceding siblings ...)
  2021-02-03 13:58 ` [PATCH v4 4/5] perf stat: Enable iostat mode for x86 platforms Alexander Antonov
@ 2021-02-03 13:58 ` Alexander Antonov
  2021-02-03 20:55 ` [PATCH v4 0/5] perf stat: Introduce iostat mode to provide I/O performance metrics Arnaldo Carvalho de Melo
  5 siblings, 0 replies; 19+ messages in thread
From: Alexander Antonov @ 2021-02-03 13:58 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, jolsa, ak, alexander.shishkin, mark.rutland,
	namhyung, irogers, mingo, peterz, alexander.antonov

After a "make -C tools/perf", git reports the following untracked file:
perf-iostat

Add this generated file to perf's .gitignore file.

Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
---
 tools/perf/.gitignore | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/.gitignore b/tools/perf/.gitignore
index bf1252dc2cb0..421f27e2b9af 100644
--- a/tools/perf/.gitignore
+++ b/tools/perf/.gitignore
@@ -19,6 +19,7 @@ perf.data.old
 output.svg
 perf-archive
 perf-with-kcore
+perf-iostat
 tags
 TAGS
 cscope*
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 0/5] perf stat: Introduce iostat mode to provide I/O performance metrics
  2021-02-03 13:58 [PATCH v4 0/5] perf stat: Introduce iostat mode to provide I/O performance metrics Alexander Antonov
                   ` (4 preceding siblings ...)
  2021-02-03 13:58 ` [PATCH v4 5/5] perf: Update .gitignore file Alexander Antonov
@ 2021-02-03 20:55 ` Arnaldo Carvalho de Melo
  2021-02-04 12:35   ` Namhyung Kim
  5 siblings, 1 reply; 19+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-02-03 20:55 UTC (permalink / raw)
  To: Namhyung Kim, Alexander Antonov
  Cc: linux-kernel, Jiri Olsa, Andi Kleen, Alexander Shishkin,
	Mark Rutland, Ian Rogers, Ingo Molnar, Peter Zijlstra

Em Wed, Feb 03, 2021 at 04:58:25PM +0300, Alexander Antonov escreveu:
> The previous version can be found at:
> v3: https://lkml.kernel.org/r/20210126080619.30275-1-alexander.antonov@linux.intel.com/
> Changes in this revision are:
> v3 -> v4:
> - Addressed comment from Namhyung Kim:
>    1. Removed NULL-termination of root ports list

Hi Namhyung,

	Are you ok with this patchkit now?

Thanks,

- Arnaldo
 
> The previous version can be found at:
> v2: https://lkml.kernel.org/r/20201223130320.3930-1-alexander.antonov@linux.intel.com
> 
> Changes in this revision are:
> v2 -> v3:
> - Addressed comments from Namhyung Kim:
>   1. Removed perf_device pointer from evsel structure. Use priv field instead
>   2. Renamed 'iiostat' to 'iostat'
>   3. Renamed 'show' mode to 'list' mode
>   4. Renamed iiostat_delete_root_ports() to iiostat_release() and
>      iostat_show_root_ports() to iostat_list()
> 
> The previous version can be found at:
> v1: https://lkml.kernel.org/r/20201210090340.14358-1-alexander.antonov@linux.intel.com
> 
> Changes in this revision are:
> v1 -> v2:
> - Addressed comment from Arnaldo Carvalho de Melo:
>   1. Using 'perf iiostat' subcommand instead of 'perf stat --iiostat':
>     - Added perf-iiostat.sh script to use short command
>     - Updated manual pages to get help for 'perf iiostat'
>     - Added 'perf-iiostat' to perf's gitignore file
> 
> Mode is intended to provide four I/O performance metrics in MB per each
> root port:
>  - Inbound Read:   I/O devices below root port read from the host memory
>  - Inbound Write:  I/O devices below root port write to the host memory
>  - Outbound Read:  CPU reads from I/O devices below root port
>  - Outbound Write: CPU writes to I/O devices below root port
> 
> Each metric requiries only one uncore event which increments at every 4B
> transfer in corresponding direction. The formulas to compute metrics
> are generic:
>     #EventCount * 4B / (1024 * 1024)
> 
> Note: iostat introduces new perf data aggregation mode - per PCIe root port
> hence -e and -M options are not supported.
> 
> Usage examples:
> 
> 1. List all PCIe root ports (example for 2-S platform):
>    $ perf iostat list
>    S0-uncore_iio_0<0000:00>
>    S1-uncore_iio_0<0000:80>
>    S0-uncore_iio_1<0000:17>
>    S1-uncore_iio_1<0000:85>
>    S0-uncore_iio_2<0000:3a>
>    S1-uncore_iio_2<0000:ae>
>    S0-uncore_iio_3<0000:5d>
>    S1-uncore_iio_3<0000:d7>
> 
> 2. Collect metrics for all PCIe root ports:
>    $ perf iostat -- dd if=/dev/zero of=/dev/nvme0n1 bs=1M oflag=direct
>    357708+0 records in
>    357707+0 records out
>    375083606016 bytes (375 GB, 349 GiB) copied, 215.974 s, 1.7 GB/s
> 
>     Performance counter stats for 'system wide':
> 
>       port             Inbound Read(MB)    Inbound Write(MB)    Outbound Read(MB)   Outbound Write(MB) 
>    0000:00                    1                    0                    2                    3 
>    0000:80                    0                    0                    0                    0 
>    0000:17               352552                   43                    0                   21 
>    0000:85                    0                    0                    0                    0 
>    0000:3a                    3                    0                    0                    0 
>    0000:ae                    0                    0                    0                    0 
>    0000:5d                    0                    0                    0                    0 
>    0000:d7                    0                    0                    0                    0
> 
> 3. Collect metrics for comma separated list of PCIe root ports:
>    $ perf iostat 0000:17,0:3a -- dd if=/dev/zero of=/dev/nvme0n1 bs=1M oflag=direct
>    357708+0 records in
>    357707+0 records out
>    375083606016 bytes (375 GB, 349 GiB) copied, 197.08 s, 1.9 GB/s
> 
>     Performance counter stats for 'system wide':
> 
>       port             Inbound Read(MB)    Inbound Write(MB)    Outbound Read(MB)   Outbound Write(MB) 
>    0000:17               358559                   44                    0                   22 
>    0000:3a                    3                    2                    0                    0 
> 
>         197.081983474 seconds time elapsed
> 
> 
> Alexander Antonov (5):
>   perf stat: Add AGGR_PCIE_PORT mode
>   perf stat: Basic support for iostat in perf
>   perf stat: Helper functions for PCIe root ports list in iostat mode
>   perf stat: Enable iostat mode for x86 platforms
>   perf: Update .gitignore file
> 
>  tools/perf/.gitignore                         |   1 +
>  tools/perf/Documentation/perf-iostat.txt      |  88 ++++
>  tools/perf/Makefile.perf                      |   5 +-
>  tools/perf/arch/x86/util/Build                |   1 +
>  tools/perf/arch/x86/util/iostat.c             | 469 ++++++++++++++++++
>  tools/perf/builtin-stat.c                     |  36 +-
>  tools/perf/command-list.txt                   |   1 +
>  tools/perf/perf-iostat.sh                     |  12 +
>  tools/perf/util/iostat.h                      |  32 ++
>  .../scripting-engines/trace-event-python.c    |   3 +-
>  tools/perf/util/stat-display.c                |  53 +-
>  tools/perf/util/stat-shadow.c                 |  11 +-
>  tools/perf/util/stat.c                        |   4 +-
>  tools/perf/util/stat.h                        |   2 +
>  14 files changed, 710 insertions(+), 8 deletions(-)
>  create mode 100644 tools/perf/Documentation/perf-iostat.txt
>  create mode 100644 tools/perf/arch/x86/util/iostat.c
>  create mode 100644 tools/perf/perf-iostat.sh
>  create mode 100644 tools/perf/util/iostat.h
> 
> 
> base-commit: b145b0eb2031a620ca010174240963e4d2c6ce26
> -- 
> 2.19.1
> 

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 1/5] perf stat: Add AGGR_PCIE_PORT mode
  2021-02-03 13:58 ` [PATCH v4 1/5] perf stat: Add AGGR_PCIE_PORT mode Alexander Antonov
@ 2021-02-04 12:07   ` Namhyung Kim
  2021-02-08 11:30     ` Alexander Antonov
  0 siblings, 1 reply; 19+ messages in thread
From: Namhyung Kim @ 2021-02-04 12:07 UTC (permalink / raw)
  To: Alexander Antonov
  Cc: Arnaldo Carvalho de Melo, linux-kernel, Jiri Olsa, Andi Kleen,
	Alexander Shishkin, Mark Rutland, Ian Rogers, Ingo Molnar,
	Peter Zijlstra

Hello,

On Wed, Feb 3, 2021 at 10:58 PM Alexander Antonov
<alexander.antonov@linux.intel.com> wrote:
>
> Adding AGGR_PCIE_PORT mode to be able to distinguish aggr_mode
> for root ports in following patches.

I'm not sure adding the AGGR_PCIE_PORT is the right way.
In my understanding, the aggr mode is to specify how we aggregate
counter values of a single event from different cpus.  But this seems
to aggregate counter values from different events.  Also the new
mode is basically the same as AGGR_GLOBAL.

As you will add stat_config.iostat_run to distinguish the iostat
command, probably we just want to use the global aggr mode
(and it's the default!) and get rid of the AGGR_PCIE_PORT.

Thoughts?

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 2/5] perf stat: Basic support for iostat in perf
  2021-02-03 13:58 ` [PATCH v4 2/5] perf stat: Basic support for iostat in perf Alexander Antonov
@ 2021-02-04 12:22   ` Namhyung Kim
  2021-02-08 11:55     ` Alexander Antonov
  0 siblings, 1 reply; 19+ messages in thread
From: Namhyung Kim @ 2021-02-04 12:22 UTC (permalink / raw)
  To: Alexander Antonov
  Cc: Arnaldo Carvalho de Melo, linux-kernel, Jiri Olsa, Andi Kleen,
	Alexander Shishkin, Mark Rutland, Ian Rogers, Ingo Molnar,
	Peter Zijlstra

On Wed, Feb 3, 2021 at 10:58 PM Alexander Antonov
<alexander.antonov@linux.intel.com> wrote:
>
> Add basic flow for a new iostat mode in perf. Mode is intended to
> provide four I/O performance metrics per each PCIe root port: Inbound Read,
> Inbound Write, Outbound Read, Outbound Write.
>
> The actual code to compute the metrics and attribute it to
> root port is in follow-on patches.
>
> Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
> ---
>  tools/perf/builtin-stat.c      | 31 ++++++++++++++++++++++++++
>  tools/perf/util/iostat.h       | 32 +++++++++++++++++++++++++++
>  tools/perf/util/stat-display.c | 40 +++++++++++++++++++++++++++++++++-
>  tools/perf/util/stat-shadow.c  | 11 +++++++++-
>  tools/perf/util/stat.h         |  1 +
>  5 files changed, 113 insertions(+), 2 deletions(-)
>  create mode 100644 tools/perf/util/iostat.h
>
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 60fdb6a0805f..66c913692120 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -65,6 +65,7 @@
>  #include "util/target.h"
>  #include "util/time-utils.h"
>  #include "util/top.h"
> +#include "util/iostat.h"
>  #include "asm/bug.h"
>
>  #include <linux/time64.h>
> @@ -186,6 +187,7 @@ static struct perf_stat_config stat_config = {
>         .metric_only_len        = METRIC_ONLY_LEN,
>         .walltime_nsecs_stats   = &walltime_nsecs_stats,
>         .big_num                = true,
> +       .iostat_run             = false,
>  };
>
>  static inline void diff_timespec(struct timespec *r, struct timespec *a,
> @@ -723,6 +725,14 @@ static int parse_metric_groups(const struct option *opt,
>         return metricgroup__parse_groups(opt, str, &stat_config.metric_events);
>  }
>
> +__weak int iostat_parse(const struct option *opt __maybe_unused,
> +                        const char *str __maybe_unused,
> +                        int unset __maybe_unused)
> +{
> +       pr_err("iostat mode is not supported\n");
> +       return -1;
> +}
> +
>  static struct option stat_options[] = {
>         OPT_BOOLEAN('T', "transaction", &transaction_run,
>                     "hardware transaction statistics"),
> @@ -803,6 +813,8 @@ static struct option stat_options[] = {
>         OPT_CALLBACK('M', "metrics", &evsel_list, "metric/metric group list",
>                      "monitor specified metrics or metric groups (separated by ,)",
>                      parse_metric_groups),
> +       OPT_CALLBACK_OPTARG(0, "iostat", &evsel_list, &stat_config, "root port",
> +                           "measure PCIe metrics per root port", iostat_parse),

Can we make the help string and default argument more generic?
Something like "measure IO metrics provided by arch/platform"
and the default value being "default". :)


>         OPT_END()
>  };
>
> @@ -1131,6 +1143,12 @@ __weak void arch_topdown_group_warn(void)
>  {
>  }
>
> +__weak int iostat_list(struct evlist *evlist __maybe_unused,
> +                       struct perf_stat_config *config __maybe_unused)
> +{
> +       return 0;
> +}
> +
>  /*
>   * Add default attributes, if there were no attributes specified or
>   * if -d/--detailed, -d -d or -d -d -d is used:
> @@ -1682,6 +1700,10 @@ static void setup_system_wide(int forks)
>         }
>  }
>
> +__weak void iostat_release(struct evlist *evlist __maybe_unused)
> +{
> +}
> +
>  int cmd_stat(int argc, const char **argv)
>  {
>         const char * const stat_usage[] = {
> @@ -1858,6 +1880,12 @@ int cmd_stat(int argc, const char **argv)
>                 goto out;
>         }
>
> +       if (stat_config.iostat_run) {
> +               status = iostat_list(evsel_list, &stat_config);

I think it's unnatural to call iostat_list() unconditionally here.
How about this?

    status = iostat_prepare(...);
    if (status < 0)
        goto out;

    if (status == IOSTAT_LIST)
        iostat_list(...);
    else
        ...


> +               if (status || !stat_config.iostat_run)
> +                       goto out;
> +       }
> +
>         if (add_default_attributes())
>                 goto out;
>
> @@ -2008,6 +2036,9 @@ int cmd_stat(int argc, const char **argv)
>         perf_stat__exit_aggr_mode();
>         perf_evlist__free_stats(evsel_list);
>  out:
> +       if (stat_config.iostat_run)
> +               iostat_release(evsel_list);
> +
>         zfree(&stat_config.walltime_run);
>
>         if (smi_cost && smi_reset)
> diff --git a/tools/perf/util/iostat.h b/tools/perf/util/iostat.h
> new file mode 100644
> index 000000000000..b34ebedfd5e6
> --- /dev/null
> +++ b/tools/perf/util/iostat.h
> @@ -0,0 +1,32 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * perf iostat
> + *
> + * Copyright (C) 2020, Intel Corporation
> + *
> + * Authors: Alexander Antonov <alexander.antonov@linux.intel.com>
> + */
> +
> +#ifndef _IOSTAT_H
> +#define _IOSTAT_H
> +
> +#include <subcmd/parse-options.h>
> +#include "util/stat.h"
> +#include "util/parse-events.h"
> +#include "util/evlist.h"
> +
> +struct option;
> +struct perf_stat_config;
> +struct evlist;
> +struct timespec;
> +
> +int iostat_parse(const struct option *opt, const char *str,
> +                int unset __maybe_unused);
> +void iostat_prefix(struct perf_stat_config *config, struct evlist *evlist,
> +                  char *prefix, struct timespec *ts);
> +void iostat_print_metric(struct perf_stat_config *config, struct evsel *evsel,
> +                        struct perf_stat_output_ctx *out);
> +int iostat_list(struct evlist *evlist, struct perf_stat_config *config);
> +void iostat_release(struct evlist *evlist);
> +
> +#endif /* _IOSTAT_H */
> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> index db1bec115d0b..de78cf6962b9 100644
> --- a/tools/perf/util/stat-display.c
> +++ b/tools/perf/util/stat-display.c
> @@ -16,6 +16,7 @@
>  #include <linux/ctype.h>
>  #include "cgroup.h"
>  #include <api/fs/fs.h>
> +#include "iostat.h"
>
>  #define CNTR_NOT_SUPPORTED     "<not supported>"
>  #define CNTR_NOT_COUNTED       "<not counted>"
> @@ -302,6 +303,11 @@ static void print_metric_header(struct perf_stat_config *config,
>         struct outstate *os = ctx;
>         char tbuf[1024];
>
> +       /* In case of iostat, print metric header for first root port only */
> +       if (config->iostat_run &&
> +           os->evsel->priv != os->evsel->evlist->selected->priv)
> +               return;
> +
>         if (!valid_only_metric(unit))
>                 return;
>         unit = fixunit(tbuf, os->evsel, unit);
> @@ -936,6 +942,8 @@ static void print_metric_headers(struct perf_stat_config *config,
>                         fputs("time,", config->output);
>                 fputs(aggr_header_csv[config->aggr_mode], config->output);
>         }
> +       if (config->iostat_run && !config->interval && !config->csv_output)
> +               fprintf(config->output, "   port         ");

It's too specific to the current implementation.
Let's make it a callback or a weak function.


>
>         /* Print metrics headers only */
>         evlist__for_each_entry(evlist, counter) {
> @@ -954,6 +962,13 @@ static void print_metric_headers(struct perf_stat_config *config,
>         fputc('\n', config->output);
>  }
>
> +__weak void iostat_prefix(struct perf_stat_config *config __maybe_unused,
> +                         struct evlist *evlist __maybe_unused,
> +                         char *prefix __maybe_unused,
> +                         struct timespec *ts __maybe_unused)
> +{
> +}
> +
>  static void print_interval(struct perf_stat_config *config,
>                            struct evlist *evlist,
>                            char *prefix, struct timespec *ts)
> @@ -966,7 +981,10 @@ static void print_interval(struct perf_stat_config *config,
>         if (config->interval_clear)
>                 puts(CONSOLE_CLEAR);
>
> -       sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec, ts->tv_nsec, config->csv_sep);
> +       if (!config->iostat_run)
> +               sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec,
> +                                               ts->tv_nsec,
> +                                               config->csv_sep);
>
>         if ((num_print_interval == 0 && !config->csv_output) || config->interval_clear) {
>                 switch (config->aggr_mode) {
> @@ -996,6 +1014,7 @@ static void print_interval(struct perf_stat_config *config,
>                                 fprintf(output, "                  counts %*s events\n", unit_width, "unit");
>                         break;
>                 case AGGR_PCIE_PORT:
> +                       fprintf(output, "#           time    port        ");

Ditto.

>                         break;
>                 case AGGR_GLOBAL:
>                 default:
> @@ -1174,6 +1193,10 @@ perf_evlist__print_counters(struct evlist *evlist,
>         int interval = config->interval;
>         struct evsel *counter;
>         char buf[64], *prefix = NULL;
> +       void *perf_device = NULL;
> +
> +       if (config->iostat_run)
> +               evlist->selected = evlist__first(evlist);
>
>         if (interval)
>                 print_interval(config, evlist, prefix = buf, ts);
> @@ -1222,6 +1245,21 @@ perf_evlist__print_counters(struct evlist *evlist,
>                 }
>                 break;
>         case AGGR_PCIE_PORT:

Ditto.  Something like iostat_print_counters().

> +               counter = evlist__first(evlist);
> +               perf_evlist__set_selected(evlist, counter);
> +               iostat_prefix(config, evlist, prefix = buf, ts);
> +               fprintf(config->output, "%s", prefix);
> +               evlist__for_each_entry(evlist, counter) {
> +                       perf_device = evlist->selected->priv;
> +                       if (perf_device && perf_device != counter->priv) {
> +                               perf_evlist__set_selected(evlist, counter);
> +                               iostat_prefix(config, evlist, prefix, ts);
> +                               fprintf(config->output, "\n%s", prefix);
> +                       }
> +                       print_counter_aggr(config, counter, prefix);

I'm not sure but do you assume each counter has different priv?
I don't know if the output is correct (like call iostat_prefix() once
and call print_counter_aggr() twice) when they have same one.

> +                       if ((counter->idx + 1) == evlist->core.nr_entries)
> +                               fputc('\n', config->output);

Can we just move this out of the loop?

Thanks,
Namhyung


> +               }
>                 break;
>         case AGGR_UNSET:
>         default:
> diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
> index 2c41d47f6f83..083a450c6dc7 100644
> --- a/tools/perf/util/stat-shadow.c
> +++ b/tools/perf/util/stat-shadow.c
> @@ -9,6 +9,7 @@
>  #include "expr.h"
>  #include "metricgroup.h"
>  #include <linux/zalloc.h>
> +#include "iostat.h"
>
>  /*
>   * AGGR_GLOBAL: Use CPU 0
> @@ -814,6 +815,12 @@ static void generic_metric(struct perf_stat_config *config,
>                 zfree(&pctx.ids[i].name);
>  }
>
> +__weak void iostat_print_metric(struct perf_stat_config *config __maybe_unused,
> +                               struct evsel *evsel __maybe_unused,
> +                               struct perf_stat_output_ctx *out __maybe_unused)
> +{
> +}
> +
>  void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>                                    struct evsel *evsel,
>                                    double avg, int cpu,
> @@ -829,7 +836,9 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>         struct metric_event *me;
>         int num = 1;
>
> -       if (perf_evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS)) {
> +       if (config->iostat_run) {
> +               iostat_print_metric(config, evsel, out);
> +       } else if (perf_evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS)) {
>                 total = runtime_stat_avg(st, STAT_CYCLES, ctx, cpu);
>
>                 if (total) {
> diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
> index c7544c28c02a..c2a2b28effd6 100644
> --- a/tools/perf/util/stat.h
> +++ b/tools/perf/util/stat.h
> @@ -107,6 +107,7 @@ struct perf_stat_config {
>         bool                     big_num;
>         bool                     no_merge;
>         bool                     walltime_run_table;
> +       bool                     iostat_run;
>         FILE                    *output;
>         unsigned int             interval;
>         unsigned int             timeout;
> --
> 2.19.1
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 3/5] perf stat: Helper functions for PCIe root ports list in iostat mode
  2021-02-03 13:58 ` [PATCH v4 3/5] perf stat: Helper functions for PCIe root ports list in iostat mode Alexander Antonov
@ 2021-02-04 12:32   ` Namhyung Kim
  2021-02-08 11:58     ` Alexander Antonov
  0 siblings, 1 reply; 19+ messages in thread
From: Namhyung Kim @ 2021-02-04 12:32 UTC (permalink / raw)
  To: Alexander Antonov
  Cc: Arnaldo Carvalho de Melo, linux-kernel, Jiri Olsa, Andi Kleen,
	Alexander Shishkin, Mark Rutland, Ian Rogers, Ingo Molnar,
	Peter Zijlstra

On Wed, Feb 3, 2021 at 10:58 PM Alexander Antonov
<alexander.antonov@linux.intel.com> wrote:
>
> Introduce helper functions to control PCIe root ports list.
> These helpers will be used in the follow-up patch.
>
> Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
> ---
>  tools/perf/arch/x86/util/iostat.c | 124 ++++++++++++++++++++++++++++++
>  1 file changed, 124 insertions(+)
>  create mode 100644 tools/perf/arch/x86/util/iostat.c
>
> diff --git a/tools/perf/arch/x86/util/iostat.c b/tools/perf/arch/x86/util/iostat.c
> new file mode 100644
> index 000000000000..961e540106e6
> --- /dev/null
> +++ b/tools/perf/arch/x86/util/iostat.c
> @@ -0,0 +1,124 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * perf iostat
> + *
> + * Copyright (C) 2020, Intel Corporation
> + *
> + * Authors: Alexander Antonov <alexander.antonov@linux.intel.com>
> + */
> +
> +#include <api/fs/fs.h>
> +#include <linux/kernel.h>
> +#include <linux/err.h>
> +#include <limits.h>
> +#include <stdio.h>
> +#include <string.h>
> +#include <errno.h>
> +#include <sys/types.h>
> +#include <sys/stat.h>
> +#include <fcntl.h>
> +#include <dirent.h>
> +#include <unistd.h>
> +#include <stdlib.h>
> +#include <regex.h>
> +#include "util/cpumap.h"
> +#include "util/debug.h"
> +#include "util/iostat.h"
> +#include "util/counts.h"
> +#include "path.h"
> +
> +struct iio_root_port {
> +       u32 domain;
> +       u8 bus;
> +       u8 die;
> +       u8 pmu_idx;
> +       int idx;
> +};
> +
> +struct iio_root_ports_list {
> +       struct iio_root_port **rps;
> +       int nr_entries;
> +};
> +
> +static void iio_root_port_show(FILE *output,
> +                              const struct iio_root_port * const rp)
> +{
> +       if (output && rp)
> +               fprintf(output, "S%d-uncore_iio_%d<%04x:%02x>\n",
> +                       rp->die, rp->pmu_idx, rp->domain, rp->bus);
> +}
> +
> +static struct iio_root_port *iio_root_port_new(u32 domain, u8 bus,
> +                                              u8 die, u8 pmu_idx)
> +{
> +       struct iio_root_port *p = calloc(1, sizeof(*p));
> +
> +       if (p) {
> +               p->domain = domain;
> +               p->bus = bus;
> +               p->die = die;
> +               p->pmu_idx = pmu_idx;
> +       }
> +       return p;
> +}
> +
> +static struct iio_root_ports_list *iio_root_ports_list_new(void)
> +{
> +       struct iio_root_ports_list *list = calloc(1, sizeof(*list));
> +
> +       if (list) {
> +               list->rps = calloc(1, sizeof(struct iio_root_port *));

This seems unnecessary now.

Thanks,
Namhyung


> +               if (!list->rps) {
> +                       free(list);
> +                       list = NULL;
> +               }
> +       }
> +       return list;
> +}
> +
> +static void iio_root_ports_list_free(struct iio_root_ports_list *list)
> +{
> +       int idx;
> +
> +       if (list) {
> +               for (idx = 0; idx < list->nr_entries; idx++)
> +                       free(list->rps[idx]);
> +               free(list->rps);
> +               free(list);
> +       }
> +}
> +
> +static struct iio_root_port *iio_root_port_find_by_notation(
> +       const struct iio_root_ports_list * const list, u32 domain, u8 bus)
> +{
> +       int idx;
> +       struct iio_root_port *rp;
> +
> +       if (list) {
> +               for (idx = 0; idx < list->nr_entries; idx++) {
> +                       rp = list->rps[idx];
> +                       if (rp && rp->domain == domain && rp->bus == bus)
> +                               return rp;
> +               }
> +       }
> +       return NULL;
> +}
> +
> +static int iio_root_ports_list_insert(struct iio_root_ports_list *list,
> +                                     struct iio_root_port * const rp)
> +{
> +       struct iio_root_port **tmp_buf;
> +
> +       if (list && rp) {
> +               rp->idx = list->nr_entries++;
> +               tmp_buf = realloc(list->rps,
> +                                 list->nr_entries * sizeof(*list->rps));
> +               if (!tmp_buf) {
> +                       pr_err("Failed to realloc memory\n");
> +                       return -ENOMEM;
> +               }
> +               tmp_buf[rp->idx] = rp;
> +               list->rps = tmp_buf;
> +       }
> +       return 0;
> +}
> --
> 2.19.1
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 0/5] perf stat: Introduce iostat mode to provide I/O performance metrics
  2021-02-03 20:55 ` [PATCH v4 0/5] perf stat: Introduce iostat mode to provide I/O performance metrics Arnaldo Carvalho de Melo
@ 2021-02-04 12:35   ` Namhyung Kim
  0 siblings, 0 replies; 19+ messages in thread
From: Namhyung Kim @ 2021-02-04 12:35 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Alexander Antonov, linux-kernel, Jiri Olsa, Andi Kleen,
	Alexander Shishkin, Mark Rutland, Ian Rogers, Ingo Molnar,
	Peter Zijlstra

Hi Arnaldo,

On Thu, Feb 4, 2021 at 5:55 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>
> Em Wed, Feb 03, 2021 at 04:58:25PM +0300, Alexander Antonov escreveu:
> > The previous version can be found at:
> > v3: https://lkml.kernel.org/r/20210126080619.30275-1-alexander.antonov@linux.intel.com/
> > Changes in this revision are:
> > v3 -> v4:
> > - Addressed comment from Namhyung Kim:
> >    1. Removed NULL-termination of root ports list
>
> Hi Namhyung,
>
>         Are you ok with this patchkit now?

I've left some more comments.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 1/5] perf stat: Add AGGR_PCIE_PORT mode
  2021-02-04 12:07   ` Namhyung Kim
@ 2021-02-08 11:30     ` Alexander Antonov
  2021-02-10 10:02       ` Namhyung Kim
  0 siblings, 1 reply; 19+ messages in thread
From: Alexander Antonov @ 2021-02-08 11:30 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, linux-kernel, Jiri Olsa, Andi Kleen,
	Alexander Shishkin, Mark Rutland, Ian Rogers, Ingo Molnar,
	Peter Zijlstra

On 2/4/2021 3:07 PM, Namhyung Kim wrote:
> Hello,
>
> On Wed, Feb 3, 2021 at 10:58 PM Alexander Antonov
> <alexander.antonov@linux.intel.com> wrote:
>> Adding AGGR_PCIE_PORT mode to be able to distinguish aggr_mode
>> for root ports in following patches.
> I'm not sure adding the AGGR_PCIE_PORT is the right way.
> In my understanding, the aggr mode is to specify how we aggregate
> counter values of a single event from different cpus.  But this seems
> to aggregate counter values from different events.  Also the new
> mode is basically the same as AGGR_GLOBAL.
>
> As you will add stat_config.iostat_run to distinguish the iostat
> command, probably we just want to use the global aggr mode
> (and it's the default!) and get rid of the AGGR_PCIE_PORT.
>
> Thoughts?
>
> Thanks,
> Namhyung
Hello Namhyung,

Actually, you are right. We aggregate counter values from different 
events of a
single IIO stack (PCIe root port) to calculate metrics for this IO stack.
But the reason is to prevent using of '-e' and '-M' options in 'iostat' mode
because it can be a reason for the mess in the output that can confuse 
users.

There is an idea to use your suggestion for this part:

status = iostat_prepare(...);
if (status < 0)
         goto out;
if (status == IOSTAT_LIST)
         iostat_list(...);
else
         ...

So, we can check if evlist is empty inside iostat_prepare(). If not, print
a warning, for example, "The -e and -M options are not supported. All chosen
events/metrics will be dropped". Then we can free of evlist by using
evlist__delete(), create new one by using evlist__new() and fill the evlist.

In this case the body of iostat_prepare() function would be:

iostat_prepare()
{
     If (!is_evlist_empty) {
         pr_warning();
         evlist__delete();
         evlist__new()
     }

     iostat_event_group();
}

It will allow to get rid of the AGGR_PCIE_PORT.
What do you think?

Thank you,
Alexander

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 2/5] perf stat: Basic support for iostat in perf
  2021-02-04 12:22   ` Namhyung Kim
@ 2021-02-08 11:55     ` Alexander Antonov
  2021-02-10 10:07       ` Namhyung Kim
  0 siblings, 1 reply; 19+ messages in thread
From: Alexander Antonov @ 2021-02-08 11:55 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, linux-kernel, Jiri Olsa, Andi Kleen,
	Alexander Shishkin, Mark Rutland, Ian Rogers, Ingo Molnar,
	Peter Zijlstra


On 2/4/2021 3:22 PM, Namhyung Kim wrote:
> On Wed, Feb 3, 2021 at 10:58 PM Alexander Antonov
> <alexander.antonov@linux.intel.com> wrote:
>> Add basic flow for a new iostat mode in perf. Mode is intended to
>> provide four I/O performance metrics per each PCIe root port: Inbound Read,
>> Inbound Write, Outbound Read, Outbound Write.
>>
>> The actual code to compute the metrics and attribute it to
>> root port is in follow-on patches.
>>
>> Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
>> ---
>>   tools/perf/builtin-stat.c      | 31 ++++++++++++++++++++++++++
>>   tools/perf/util/iostat.h       | 32 +++++++++++++++++++++++++++
>>   tools/perf/util/stat-display.c | 40 +++++++++++++++++++++++++++++++++-
>>   tools/perf/util/stat-shadow.c  | 11 +++++++++-
>>   tools/perf/util/stat.h         |  1 +
>>   5 files changed, 113 insertions(+), 2 deletions(-)
>>   create mode 100644 tools/perf/util/iostat.h
>>
>> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
>> index 60fdb6a0805f..66c913692120 100644
>> --- a/tools/perf/builtin-stat.c
>> +++ b/tools/perf/builtin-stat.c
>> @@ -65,6 +65,7 @@
>>   #include "util/target.h"
>>   #include "util/time-utils.h"
>>   #include "util/top.h"
>> +#include "util/iostat.h"
>>   #include "asm/bug.h"
>>
>>   #include <linux/time64.h>
>> @@ -186,6 +187,7 @@ static struct perf_stat_config stat_config = {
>>          .metric_only_len        = METRIC_ONLY_LEN,
>>          .walltime_nsecs_stats   = &walltime_nsecs_stats,
>>          .big_num                = true,
>> +       .iostat_run             = false,
>>   };
>>
>>   static inline void diff_timespec(struct timespec *r, struct timespec *a,
>> @@ -723,6 +725,14 @@ static int parse_metric_groups(const struct option *opt,
>>          return metricgroup__parse_groups(opt, str, &stat_config.metric_events);
>>   }
>>
>> +__weak int iostat_parse(const struct option *opt __maybe_unused,
>> +                        const char *str __maybe_unused,
>> +                        int unset __maybe_unused)
>> +{
>> +       pr_err("iostat mode is not supported\n");
>> +       return -1;
>> +}
>> +
>>   static struct option stat_options[] = {
>>          OPT_BOOLEAN('T', "transaction", &transaction_run,
>>                      "hardware transaction statistics"),
>> @@ -803,6 +813,8 @@ static struct option stat_options[] = {
>>          OPT_CALLBACK('M', "metrics", &evsel_list, "metric/metric group list",
>>                       "monitor specified metrics or metric groups (separated by ,)",
>>                       parse_metric_groups),
>> +       OPT_CALLBACK_OPTARG(0, "iostat", &evsel_list, &stat_config, "root port",
>> +                           "measure PCIe metrics per root port", iostat_parse),
> Can we make the help string and default argument more generic?
> Something like "measure IO metrics provided by arch/platform"
> and the default value being "default". :)
>
Do you mean using "default" instead of "root port"?
What about the faceless "I/O unit"? :)
>>          OPT_END()
>>   };
>>
>> @@ -1131,6 +1143,12 @@ __weak void arch_topdown_group_warn(void)
>>   {
>>   }
>>
>> +__weak int iostat_list(struct evlist *evlist __maybe_unused,
>> +                       struct perf_stat_config *config __maybe_unused)
>> +{
>> +       return 0;
>> +}
>> +
>>   /*
>>    * Add default attributes, if there were no attributes specified or
>>    * if -d/--detailed, -d -d or -d -d -d is used:
>> @@ -1682,6 +1700,10 @@ static void setup_system_wide(int forks)
>>          }
>>   }
>>
>> +__weak void iostat_release(struct evlist *evlist __maybe_unused)
>> +{
>> +}
>> +
>>   int cmd_stat(int argc, const char **argv)
>>   {
>>          const char * const stat_usage[] = {
>> @@ -1858,6 +1880,12 @@ int cmd_stat(int argc, const char **argv)
>>                  goto out;
>>          }
>>
>> +       if (stat_config.iostat_run) {
>> +               status = iostat_list(evsel_list, &stat_config);
> I think it's unnatural to call iostat_list() unconditionally here.
> How about this?
>
>      status = iostat_prepare(...);
>      if (status < 0)
>          goto out;
>
>      if (status == IOSTAT_LIST)
>          iostat_list(...);
>      else
>          ...
I think it's applicable.
In case of 'list' option we will just print list of root ports and exit.
Also listing of root ports is available in verbose mode. In this case we 
will
print list and start the collection.
>
>> +               if (status || !stat_config.iostat_run)
>> +                       goto out;
>> +       }
>> +
>>          if (add_default_attributes())
>>                  goto out;
>>
>> @@ -2008,6 +2036,9 @@ int cmd_stat(int argc, const char **argv)
>>          perf_stat__exit_aggr_mode();
>>          perf_evlist__free_stats(evsel_list);
>>   out:
>> +       if (stat_config.iostat_run)
>> +               iostat_release(evsel_list);
>> +
>>          zfree(&stat_config.walltime_run);
>>
>>          if (smi_cost && smi_reset)
>> diff --git a/tools/perf/util/iostat.h b/tools/perf/util/iostat.h
>> new file mode 100644
>> index 000000000000..b34ebedfd5e6
>> --- /dev/null
>> +++ b/tools/perf/util/iostat.h
>> @@ -0,0 +1,32 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + * perf iostat
>> + *
>> + * Copyright (C) 2020, Intel Corporation
>> + *
>> + * Authors: Alexander Antonov <alexander.antonov@linux.intel.com>
>> + */
>> +
>> +#ifndef _IOSTAT_H
>> +#define _IOSTAT_H
>> +
>> +#include <subcmd/parse-options.h>
>> +#include "util/stat.h"
>> +#include "util/parse-events.h"
>> +#include "util/evlist.h"
>> +
>> +struct option;
>> +struct perf_stat_config;
>> +struct evlist;
>> +struct timespec;
>> +
>> +int iostat_parse(const struct option *opt, const char *str,
>> +                int unset __maybe_unused);
>> +void iostat_prefix(struct perf_stat_config *config, struct evlist *evlist,
>> +                  char *prefix, struct timespec *ts);
>> +void iostat_print_metric(struct perf_stat_config *config, struct evsel *evsel,
>> +                        struct perf_stat_output_ctx *out);
>> +int iostat_list(struct evlist *evlist, struct perf_stat_config *config);
>> +void iostat_release(struct evlist *evlist);
>> +
>> +#endif /* _IOSTAT_H */
>> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
>> index db1bec115d0b..de78cf6962b9 100644
>> --- a/tools/perf/util/stat-display.c
>> +++ b/tools/perf/util/stat-display.c
>> @@ -16,6 +16,7 @@
>>   #include <linux/ctype.h>
>>   #include "cgroup.h"
>>   #include <api/fs/fs.h>
>> +#include "iostat.h"
>>
>>   #define CNTR_NOT_SUPPORTED     "<not supported>"
>>   #define CNTR_NOT_COUNTED       "<not counted>"
>> @@ -302,6 +303,11 @@ static void print_metric_header(struct perf_stat_config *config,
>>          struct outstate *os = ctx;
>>          char tbuf[1024];
>>
>> +       /* In case of iostat, print metric header for first root port only */
>> +       if (config->iostat_run &&
>> +           os->evsel->priv != os->evsel->evlist->selected->priv)
>> +               return;
>> +
>>          if (!valid_only_metric(unit))
>>                  return;
>>          unit = fixunit(tbuf, os->evsel, unit);
>> @@ -936,6 +942,8 @@ static void print_metric_headers(struct perf_stat_config *config,
>>                          fputs("time,", config->output);
>>                  fputs(aggr_header_csv[config->aggr_mode], config->output);
>>          }
>> +       if (config->iostat_run && !config->interval && !config->csv_output)
>> +               fprintf(config->output, "   port         ");
> It's too specific to the current implementation.
> Let's make it a callback or a weak function.
Okay,
This and other similar blocks will be updated.
>
>>          /* Print metrics headers only */
>>          evlist__for_each_entry(evlist, counter) {
>> @@ -954,6 +962,13 @@ static void print_metric_headers(struct perf_stat_config *config,
>>          fputc('\n', config->output);
>>   }
>>
>> +__weak void iostat_prefix(struct perf_stat_config *config __maybe_unused,
>> +                         struct evlist *evlist __maybe_unused,
>> +                         char *prefix __maybe_unused,
>> +                         struct timespec *ts __maybe_unused)
>> +{
>> +}
>> +
>>   static void print_interval(struct perf_stat_config *config,
>>                             struct evlist *evlist,
>>                             char *prefix, struct timespec *ts)
>> @@ -966,7 +981,10 @@ static void print_interval(struct perf_stat_config *config,
>>          if (config->interval_clear)
>>                  puts(CONSOLE_CLEAR);
>>
>> -       sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec, ts->tv_nsec, config->csv_sep);
>> +       if (!config->iostat_run)
>> +               sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec,
>> +                                               ts->tv_nsec,
>> +                                               config->csv_sep);
>>
>>          if ((num_print_interval == 0 && !config->csv_output) || config->interval_clear) {
>>                  switch (config->aggr_mode) {
>> @@ -996,6 +1014,7 @@ static void print_interval(struct perf_stat_config *config,
>>                                  fprintf(output, "                  counts %*s events\n", unit_width, "unit");
>>                          break;
>>                  case AGGR_PCIE_PORT:
>> +                       fprintf(output, "#           time    port        ");
> Ditto.
>
>>                          break;
>>                  case AGGR_GLOBAL:
>>                  default:
>> @@ -1174,6 +1193,10 @@ perf_evlist__print_counters(struct evlist *evlist,
>>          int interval = config->interval;
>>          struct evsel *counter;
>>          char buf[64], *prefix = NULL;
>> +       void *perf_device = NULL;
>> +
>> +       if (config->iostat_run)
>> +               evlist->selected = evlist__first(evlist);
>>
>>          if (interval)
>>                  print_interval(config, evlist, prefix = buf, ts);
>> @@ -1222,6 +1245,21 @@ perf_evlist__print_counters(struct evlist *evlist,
>>                  }
>>                  break;
>>          case AGGR_PCIE_PORT:
> Ditto.  Something like iostat_print_counters().
>
>> +               counter = evlist__first(evlist);
>> +               perf_evlist__set_selected(evlist, counter);
>> +               iostat_prefix(config, evlist, prefix = buf, ts);
>> +               fprintf(config->output, "%s", prefix);
>> +               evlist__for_each_entry(evlist, counter) {
>> +                       perf_device = evlist->selected->priv;
>> +                       if (perf_device && perf_device != counter->priv) {
>> +                               perf_evlist__set_selected(evlist, counter);
>> +                               iostat_prefix(config, evlist, prefix, ts);
>> +                               fprintf(config->output, "\n%s", prefix);
>> +                       }
>> +                       print_counter_aggr(config, counter, prefix);
> I'm not sure but do you assume each counter has different priv?
> I don't know if the output is correct (like call iostat_prefix() once
> and call print_counter_aggr() twice) when they have same one.
There are 4 counters which are related to single 'priv' field (it's used 
for root
port object in iostat mode). This means if platform has, for example, 5 root
ports we will have 20 counters in summary. And print_counter_aggr() will be
called for each counter.
I call iostat_prefix() before the loop to print first root port and then
iostat_prefix() will be called when next counter is related to other 
root port.
>
>> +                       if ((counter->idx + 1) == evlist->core.nr_entries)
>> +                               fputc('\n', config->output);
> Can we just move this out of the loop?
>
Yes, you are right, we can. I will update it.

Thank you,
Alexander
>
>
>> +               }
>>                  break;
>>          case AGGR_UNSET:
>>          default:
>> diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
>> index 2c41d47f6f83..083a450c6dc7 100644
>> --- a/tools/perf/util/stat-shadow.c
>> +++ b/tools/perf/util/stat-shadow.c
>> @@ -9,6 +9,7 @@
>>   #include "expr.h"
>>   #include "metricgroup.h"
>>   #include <linux/zalloc.h>
>> +#include "iostat.h"
>>
>>   /*
>>    * AGGR_GLOBAL: Use CPU 0
>> @@ -814,6 +815,12 @@ static void generic_metric(struct perf_stat_config *config,
>>                  zfree(&pctx.ids[i].name);
>>   }
>>
>> +__weak void iostat_print_metric(struct perf_stat_config *config __maybe_unused,
>> +                               struct evsel *evsel __maybe_unused,
>> +                               struct perf_stat_output_ctx *out __maybe_unused)
>> +{
>> +}
>> +
>>   void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>>                                     struct evsel *evsel,
>>                                     double avg, int cpu,
>> @@ -829,7 +836,9 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>>          struct metric_event *me;
>>          int num = 1;
>>
>> -       if (perf_evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS)) {
>> +       if (config->iostat_run) {
>> +               iostat_print_metric(config, evsel, out);
>> +       } else if (perf_evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS)) {
>>                  total = runtime_stat_avg(st, STAT_CYCLES, ctx, cpu);
>>
>>                  if (total) {
>> diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
>> index c7544c28c02a..c2a2b28effd6 100644
>> --- a/tools/perf/util/stat.h
>> +++ b/tools/perf/util/stat.h
>> @@ -107,6 +107,7 @@ struct perf_stat_config {
>>          bool                     big_num;
>>          bool                     no_merge;
>>          bool                     walltime_run_table;
>> +       bool                     iostat_run;
>>          FILE                    *output;
>>          unsigned int             interval;
>>          unsigned int             timeout;
>> --
>> 2.19.1
>>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 3/5] perf stat: Helper functions for PCIe root ports list in iostat mode
  2021-02-04 12:32   ` Namhyung Kim
@ 2021-02-08 11:58     ` Alexander Antonov
  0 siblings, 0 replies; 19+ messages in thread
From: Alexander Antonov @ 2021-02-08 11:58 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, linux-kernel, Jiri Olsa, Andi Kleen,
	Alexander Shishkin, Mark Rutland, Ian Rogers, Ingo Molnar,
	Peter Zijlstra


On 2/4/2021 3:32 PM, Namhyung Kim wrote:
> On Wed, Feb 3, 2021 at 10:58 PM Alexander Antonov
> <alexander.antonov@linux.intel.com> wrote:
>> Introduce helper functions to control PCIe root ports list.
>> These helpers will be used in the follow-up patch.
>>
>> Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
>> ---
>>   tools/perf/arch/x86/util/iostat.c | 124 ++++++++++++++++++++++++++++++
>>   1 file changed, 124 insertions(+)
>>   create mode 100644 tools/perf/arch/x86/util/iostat.c
>>
>> diff --git a/tools/perf/arch/x86/util/iostat.c b/tools/perf/arch/x86/util/iostat.c
>> new file mode 100644
>> index 000000000000..961e540106e6
>> --- /dev/null
>> +++ b/tools/perf/arch/x86/util/iostat.c
>> @@ -0,0 +1,124 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * perf iostat
>> + *
>> + * Copyright (C) 2020, Intel Corporation
>> + *
>> + * Authors: Alexander Antonov <alexander.antonov@linux.intel.com>
>> + */
>> +
>> +#include <api/fs/fs.h>
>> +#include <linux/kernel.h>
>> +#include <linux/err.h>
>> +#include <limits.h>
>> +#include <stdio.h>
>> +#include <string.h>
>> +#include <errno.h>
>> +#include <sys/types.h>
>> +#include <sys/stat.h>
>> +#include <fcntl.h>
>> +#include <dirent.h>
>> +#include <unistd.h>
>> +#include <stdlib.h>
>> +#include <regex.h>
>> +#include "util/cpumap.h"
>> +#include "util/debug.h"
>> +#include "util/iostat.h"
>> +#include "util/counts.h"
>> +#include "path.h"
>> +
>> +struct iio_root_port {
>> +       u32 domain;
>> +       u8 bus;
>> +       u8 die;
>> +       u8 pmu_idx;
>> +       int idx;
>> +};
>> +
>> +struct iio_root_ports_list {
>> +       struct iio_root_port **rps;
>> +       int nr_entries;
>> +};
>> +
>> +static void iio_root_port_show(FILE *output,
>> +                              const struct iio_root_port * const rp)
>> +{
>> +       if (output && rp)
>> +               fprintf(output, "S%d-uncore_iio_%d<%04x:%02x>\n",
>> +                       rp->die, rp->pmu_idx, rp->domain, rp->bus);
>> +}
>> +
>> +static struct iio_root_port *iio_root_port_new(u32 domain, u8 bus,
>> +                                              u8 die, u8 pmu_idx)
>> +{
>> +       struct iio_root_port *p = calloc(1, sizeof(*p));
>> +
>> +       if (p) {
>> +               p->domain = domain;
>> +               p->bus = bus;
>> +               p->die = die;
>> +               p->pmu_idx = pmu_idx;
>> +       }
>> +       return p;
>> +}
>> +
>> +static struct iio_root_ports_list *iio_root_ports_list_new(void)
>> +{
>> +       struct iio_root_ports_list *list = calloc(1, sizeof(*list));
>> +
>> +       if (list) {
>> +               list->rps = calloc(1, sizeof(struct iio_root_port *));
> This seems unnecessary now.
>
> Thanks,
> Namhyung
>

Yes, you are right. Will be fixed.

Thank you,
Alexander
>> +               if (!list->rps) {
>> +                       free(list);
>> +                       list = NULL;
>> +               }
>> +       }
>> +       return list;
>> +}
>> +
>> +static void iio_root_ports_list_free(struct iio_root_ports_list *list)
>> +{
>> +       int idx;
>> +
>> +       if (list) {
>> +               for (idx = 0; idx < list->nr_entries; idx++)
>> +                       free(list->rps[idx]);
>> +               free(list->rps);
>> +               free(list);
>> +       }
>> +}
>> +
>> +static struct iio_root_port *iio_root_port_find_by_notation(
>> +       const struct iio_root_ports_list * const list, u32 domain, u8 bus)
>> +{
>> +       int idx;
>> +       struct iio_root_port *rp;
>> +
>> +       if (list) {
>> +               for (idx = 0; idx < list->nr_entries; idx++) {
>> +                       rp = list->rps[idx];
>> +                       if (rp && rp->domain == domain && rp->bus == bus)
>> +                               return rp;
>> +               }
>> +       }
>> +       return NULL;
>> +}
>> +
>> +static int iio_root_ports_list_insert(struct iio_root_ports_list *list,
>> +                                     struct iio_root_port * const rp)
>> +{
>> +       struct iio_root_port **tmp_buf;
>> +
>> +       if (list && rp) {
>> +               rp->idx = list->nr_entries++;
>> +               tmp_buf = realloc(list->rps,
>> +                                 list->nr_entries * sizeof(*list->rps));
>> +               if (!tmp_buf) {
>> +                       pr_err("Failed to realloc memory\n");
>> +                       return -ENOMEM;
>> +               }
>> +               tmp_buf[rp->idx] = rp;
>> +               list->rps = tmp_buf;
>> +       }
>> +       return 0;
>> +}
>> --
>> 2.19.1
>>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 1/5] perf stat: Add AGGR_PCIE_PORT mode
  2021-02-08 11:30     ` Alexander Antonov
@ 2021-02-10 10:02       ` Namhyung Kim
  0 siblings, 0 replies; 19+ messages in thread
From: Namhyung Kim @ 2021-02-10 10:02 UTC (permalink / raw)
  To: Alexander Antonov
  Cc: Arnaldo Carvalho de Melo, linux-kernel, Jiri Olsa, Andi Kleen,
	Alexander Shishkin, Mark Rutland, Ian Rogers, Ingo Molnar,
	Peter Zijlstra

On Mon, Feb 8, 2021 at 8:31 PM Alexander Antonov
<alexander.antonov@linux.intel.com> wrote:
>
> On 2/4/2021 3:07 PM, Namhyung Kim wrote:
> > Hello,
> >
> > On Wed, Feb 3, 2021 at 10:58 PM Alexander Antonov
> > <alexander.antonov@linux.intel.com> wrote:
> >> Adding AGGR_PCIE_PORT mode to be able to distinguish aggr_mode
> >> for root ports in following patches.
> > I'm not sure adding the AGGR_PCIE_PORT is the right way.
> > In my understanding, the aggr mode is to specify how we aggregate
> > counter values of a single event from different cpus.  But this seems
> > to aggregate counter values from different events.  Also the new
> > mode is basically the same as AGGR_GLOBAL.
> >
> > As you will add stat_config.iostat_run to distinguish the iostat
> > command, probably we just want to use the global aggr mode
> > (and it's the default!) and get rid of the AGGR_PCIE_PORT.
> >
> > Thoughts?
> >
> > Thanks,
> > Namhyung
> Hello Namhyung,
>
> Actually, you are right. We aggregate counter values from different
> events of a
> single IIO stack (PCIe root port) to calculate metrics for this IO stack.
> But the reason is to prevent using of '-e' and '-M' options in 'iostat' mode
> because it can be a reason for the mess in the output that can confuse
> users.
>
> There is an idea to use your suggestion for this part:
>
> status = iostat_prepare(...);
> if (status < 0)
>          goto out;
> if (status == IOSTAT_LIST)
>          iostat_list(...);
> else
>          ...
>
> So, we can check if evlist is empty inside iostat_prepare(). If not, print
> a warning, for example, "The -e and -M options are not supported. All chosen
> events/metrics will be dropped". Then we can free of evlist by using
> evlist__delete(), create new one by using evlist__new() and fill the evlist.
>
> In this case the body of iostat_prepare() function would be:
>
> iostat_prepare()
> {
>      If (!is_evlist_empty) {
>          pr_warning();
>          evlist__delete();
>          evlist__new()
>      }
>
>      iostat_event_group();
> }
>
> It will allow to get rid of the AGGR_PCIE_PORT.
> What do you think?

LGTM :)

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 2/5] perf stat: Basic support for iostat in perf
  2021-02-08 11:55     ` Alexander Antonov
@ 2021-02-10 10:07       ` Namhyung Kim
  0 siblings, 0 replies; 19+ messages in thread
From: Namhyung Kim @ 2021-02-10 10:07 UTC (permalink / raw)
  To: Alexander Antonov
  Cc: Arnaldo Carvalho de Melo, linux-kernel, Jiri Olsa, Andi Kleen,
	Alexander Shishkin, Mark Rutland, Ian Rogers, Ingo Molnar,
	Peter Zijlstra

On Mon, Feb 8, 2021 at 8:55 PM Alexander Antonov
<alexander.antonov@linux.intel.com> wrote:
>
>
> On 2/4/2021 3:22 PM, Namhyung Kim wrote:
> > On Wed, Feb 3, 2021 at 10:58 PM Alexander Antonov
> > <alexander.antonov@linux.intel.com> wrote:
> >> Add basic flow for a new iostat mode in perf. Mode is intended to
> >> provide four I/O performance metrics per each PCIe root port: Inbound Read,
> >> Inbound Write, Outbound Read, Outbound Write.
> >>
> >> The actual code to compute the metrics and attribute it to
> >> root port is in follow-on patches.
> >>
> >> Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
> >> ---
> >>   tools/perf/builtin-stat.c      | 31 ++++++++++++++++++++++++++
> >>   tools/perf/util/iostat.h       | 32 +++++++++++++++++++++++++++
> >>   tools/perf/util/stat-display.c | 40 +++++++++++++++++++++++++++++++++-
> >>   tools/perf/util/stat-shadow.c  | 11 +++++++++-
> >>   tools/perf/util/stat.h         |  1 +
> >>   5 files changed, 113 insertions(+), 2 deletions(-)
> >>   create mode 100644 tools/perf/util/iostat.h
> >>
> >> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> >> index 60fdb6a0805f..66c913692120 100644
> >> --- a/tools/perf/builtin-stat.c
> >> +++ b/tools/perf/builtin-stat.c
> >> @@ -65,6 +65,7 @@
> >>   #include "util/target.h"
> >>   #include "util/time-utils.h"
> >>   #include "util/top.h"
> >> +#include "util/iostat.h"
> >>   #include "asm/bug.h"
> >>
> >>   #include <linux/time64.h>
> >> @@ -186,6 +187,7 @@ static struct perf_stat_config stat_config = {
> >>          .metric_only_len        = METRIC_ONLY_LEN,
> >>          .walltime_nsecs_stats   = &walltime_nsecs_stats,
> >>          .big_num                = true,
> >> +       .iostat_run             = false,
> >>   };
> >>
> >>   static inline void diff_timespec(struct timespec *r, struct timespec *a,
> >> @@ -723,6 +725,14 @@ static int parse_metric_groups(const struct option *opt,
> >>          return metricgroup__parse_groups(opt, str, &stat_config.metric_events);
> >>   }
> >>
> >> +__weak int iostat_parse(const struct option *opt __maybe_unused,
> >> +                        const char *str __maybe_unused,
> >> +                        int unset __maybe_unused)
> >> +{
> >> +       pr_err("iostat mode is not supported\n");
> >> +       return -1;
> >> +}
> >> +
> >>   static struct option stat_options[] = {
> >>          OPT_BOOLEAN('T', "transaction", &transaction_run,
> >>                      "hardware transaction statistics"),
> >> @@ -803,6 +813,8 @@ static struct option stat_options[] = {
> >>          OPT_CALLBACK('M', "metrics", &evsel_list, "metric/metric group list",
> >>                       "monitor specified metrics or metric groups (separated by ,)",
> >>                       parse_metric_groups),
> >> +       OPT_CALLBACK_OPTARG(0, "iostat", &evsel_list, &stat_config, "root port",
> >> +                           "measure PCIe metrics per root port", iostat_parse),
> > Can we make the help string and default argument more generic?
> > Something like "measure IO metrics provided by arch/platform"
> > and the default value being "default". :)
> >
> Do you mean using "default" instead of "root port"?
> What about the faceless "I/O unit"? :)

Being a generic command, I cannot expect how it can be used later.
So I'd suggest a more general name.


> >>          OPT_END()
> >>   };
> >>
> >> @@ -1131,6 +1143,12 @@ __weak void arch_topdown_group_warn(void)
> >>   {
> >>   }
> >>
> >> +__weak int iostat_list(struct evlist *evlist __maybe_unused,
> >> +                       struct perf_stat_config *config __maybe_unused)
> >> +{
> >> +       return 0;
> >> +}
> >> +
> >>   /*
> >>    * Add default attributes, if there were no attributes specified or
> >>    * if -d/--detailed, -d -d or -d -d -d is used:
> >> @@ -1682,6 +1700,10 @@ static void setup_system_wide(int forks)
> >>          }
> >>   }
> >>
> >> +__weak void iostat_release(struct evlist *evlist __maybe_unused)
> >> +{
> >> +}
> >> +
> >>   int cmd_stat(int argc, const char **argv)
> >>   {
> >>          const char * const stat_usage[] = {
> >> @@ -1858,6 +1880,12 @@ int cmd_stat(int argc, const char **argv)
> >>                  goto out;
> >>          }
> >>
> >> +       if (stat_config.iostat_run) {
> >> +               status = iostat_list(evsel_list, &stat_config);
> > I think it's unnatural to call iostat_list() unconditionally here.
> > How about this?
> >
> >      status = iostat_prepare(...);
> >      if (status < 0)
> >          goto out;
> >
> >      if (status == IOSTAT_LIST)
> >          iostat_list(...);
> >      else
> >          ...
> I think it's applicable.
> In case of 'list' option we will just print list of root ports and exit.
> Also listing of root ports is available in verbose mode. In this case we
> will
> print list and start the collection.

ok.

> >
> >> +               if (status || !stat_config.iostat_run)
> >> +                       goto out;
> >> +       }
> >> +
> >>          if (add_default_attributes())
> >>                  goto out;
> >>
> >> @@ -2008,6 +2036,9 @@ int cmd_stat(int argc, const char **argv)
> >>          perf_stat__exit_aggr_mode();
> >>          perf_evlist__free_stats(evsel_list);
> >>   out:
> >> +       if (stat_config.iostat_run)
> >> +               iostat_release(evsel_list);
> >> +
> >>          zfree(&stat_config.walltime_run);
> >>
> >>          if (smi_cost && smi_reset)
> >> diff --git a/tools/perf/util/iostat.h b/tools/perf/util/iostat.h
> >> new file mode 100644
> >> index 000000000000..b34ebedfd5e6
> >> --- /dev/null
> >> +++ b/tools/perf/util/iostat.h
> >> @@ -0,0 +1,32 @@
> >> +/* SPDX-License-Identifier: GPL-2.0 */
> >> +/*
> >> + * perf iostat
> >> + *
> >> + * Copyright (C) 2020, Intel Corporation
> >> + *
> >> + * Authors: Alexander Antonov <alexander.antonov@linux.intel.com>
> >> + */
> >> +
> >> +#ifndef _IOSTAT_H
> >> +#define _IOSTAT_H
> >> +
> >> +#include <subcmd/parse-options.h>
> >> +#include "util/stat.h"
> >> +#include "util/parse-events.h"
> >> +#include "util/evlist.h"
> >> +
> >> +struct option;
> >> +struct perf_stat_config;
> >> +struct evlist;
> >> +struct timespec;
> >> +
> >> +int iostat_parse(const struct option *opt, const char *str,
> >> +                int unset __maybe_unused);
> >> +void iostat_prefix(struct perf_stat_config *config, struct evlist *evlist,
> >> +                  char *prefix, struct timespec *ts);
> >> +void iostat_print_metric(struct perf_stat_config *config, struct evsel *evsel,
> >> +                        struct perf_stat_output_ctx *out);
> >> +int iostat_list(struct evlist *evlist, struct perf_stat_config *config);
> >> +void iostat_release(struct evlist *evlist);
> >> +
> >> +#endif /* _IOSTAT_H */
> >> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> >> index db1bec115d0b..de78cf6962b9 100644
> >> --- a/tools/perf/util/stat-display.c
> >> +++ b/tools/perf/util/stat-display.c
> >> @@ -16,6 +16,7 @@
> >>   #include <linux/ctype.h>
> >>   #include "cgroup.h"
> >>   #include <api/fs/fs.h>
> >> +#include "iostat.h"
> >>
> >>   #define CNTR_NOT_SUPPORTED     "<not supported>"
> >>   #define CNTR_NOT_COUNTED       "<not counted>"
> >> @@ -302,6 +303,11 @@ static void print_metric_header(struct perf_stat_config *config,
> >>          struct outstate *os = ctx;
> >>          char tbuf[1024];
> >>
> >> +       /* In case of iostat, print metric header for first root port only */
> >> +       if (config->iostat_run &&
> >> +           os->evsel->priv != os->evsel->evlist->selected->priv)
> >> +               return;
> >> +
> >>          if (!valid_only_metric(unit))
> >>                  return;
> >>          unit = fixunit(tbuf, os->evsel, unit);
> >> @@ -936,6 +942,8 @@ static void print_metric_headers(struct perf_stat_config *config,
> >>                          fputs("time,", config->output);
> >>                  fputs(aggr_header_csv[config->aggr_mode], config->output);
> >>          }
> >> +       if (config->iostat_run && !config->interval && !config->csv_output)
> >> +               fprintf(config->output, "   port         ");
> > It's too specific to the current implementation.
> > Let's make it a callback or a weak function.
> Okay,
> This and other similar blocks will be updated.

ok

> >
> >>          /* Print metrics headers only */
> >>          evlist__for_each_entry(evlist, counter) {
> >> @@ -954,6 +962,13 @@ static void print_metric_headers(struct perf_stat_config *config,
> >>          fputc('\n', config->output);
> >>   }
> >>
> >> +__weak void iostat_prefix(struct perf_stat_config *config __maybe_unused,
> >> +                         struct evlist *evlist __maybe_unused,
> >> +                         char *prefix __maybe_unused,
> >> +                         struct timespec *ts __maybe_unused)
> >> +{
> >> +}
> >> +
> >>   static void print_interval(struct perf_stat_config *config,
> >>                             struct evlist *evlist,
> >>                             char *prefix, struct timespec *ts)
> >> @@ -966,7 +981,10 @@ static void print_interval(struct perf_stat_config *config,
> >>          if (config->interval_clear)
> >>                  puts(CONSOLE_CLEAR);
> >>
> >> -       sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec, ts->tv_nsec, config->csv_sep);
> >> +       if (!config->iostat_run)
> >> +               sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec,
> >> +                                               ts->tv_nsec,
> >> +                                               config->csv_sep);
> >>
> >>          if ((num_print_interval == 0 && !config->csv_output) || config->interval_clear) {
> >>                  switch (config->aggr_mode) {
> >> @@ -996,6 +1014,7 @@ static void print_interval(struct perf_stat_config *config,
> >>                                  fprintf(output, "                  counts %*s events\n", unit_width, "unit");
> >>                          break;
> >>                  case AGGR_PCIE_PORT:
> >> +                       fprintf(output, "#           time    port        ");
> > Ditto.
> >
> >>                          break;
> >>                  case AGGR_GLOBAL:
> >>                  default:
> >> @@ -1174,6 +1193,10 @@ perf_evlist__print_counters(struct evlist *evlist,
> >>          int interval = config->interval;
> >>          struct evsel *counter;
> >>          char buf[64], *prefix = NULL;
> >> +       void *perf_device = NULL;
> >> +
> >> +       if (config->iostat_run)
> >> +               evlist->selected = evlist__first(evlist);
> >>
> >>          if (interval)
> >>                  print_interval(config, evlist, prefix = buf, ts);
> >> @@ -1222,6 +1245,21 @@ perf_evlist__print_counters(struct evlist *evlist,
> >>                  }
> >>                  break;
> >>          case AGGR_PCIE_PORT:
> > Ditto.  Something like iostat_print_counters().
> >
> >> +               counter = evlist__first(evlist);
> >> +               perf_evlist__set_selected(evlist, counter);
> >> +               iostat_prefix(config, evlist, prefix = buf, ts);
> >> +               fprintf(config->output, "%s", prefix);
> >> +               evlist__for_each_entry(evlist, counter) {
> >> +                       perf_device = evlist->selected->priv;
> >> +                       if (perf_device && perf_device != counter->priv) {
> >> +                               perf_evlist__set_selected(evlist, counter);
> >> +                               iostat_prefix(config, evlist, prefix, ts);
> >> +                               fprintf(config->output, "\n%s", prefix);
> >> +                       }
> >> +                       print_counter_aggr(config, counter, prefix);
> > I'm not sure but do you assume each counter has different priv?
> > I don't know if the output is correct (like call iostat_prefix() once
> > and call print_counter_aggr() twice) when they have same one.
> There are 4 counters which are related to single 'priv' field (it's used
> for root
> port object in iostat mode). This means if platform has, for example, 5 root
> ports we will have 20 counters in summary. And print_counter_aggr() will be
> called for each counter.
> I call iostat_prefix() before the loop to print first root port and then
> iostat_prefix() will be called when next counter is related to other
> root port.

Thanks for the explanation.

> >
> >> +                       if ((counter->idx + 1) == evlist->core.nr_entries)
> >> +                               fputc('\n', config->output);
> > Can we just move this out of the loop?
> >
> Yes, you are right, we can. I will update it.

Thanks,
Namhyung


> >
> >
> >> +               }
> >>                  break;
> >>          case AGGR_UNSET:
> >>          default:
> >> diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
> >> index 2c41d47f6f83..083a450c6dc7 100644
> >> --- a/tools/perf/util/stat-shadow.c
> >> +++ b/tools/perf/util/stat-shadow.c
> >> @@ -9,6 +9,7 @@
> >>   #include "expr.h"
> >>   #include "metricgroup.h"
> >>   #include <linux/zalloc.h>
> >> +#include "iostat.h"
> >>
> >>   /*
> >>    * AGGR_GLOBAL: Use CPU 0
> >> @@ -814,6 +815,12 @@ static void generic_metric(struct perf_stat_config *config,
> >>                  zfree(&pctx.ids[i].name);
> >>   }
> >>
> >> +__weak void iostat_print_metric(struct perf_stat_config *config __maybe_unused,
> >> +                               struct evsel *evsel __maybe_unused,
> >> +                               struct perf_stat_output_ctx *out __maybe_unused)
> >> +{
> >> +}
> >> +
> >>   void perf_stat__print_shadow_stats(struct perf_stat_config *config,
> >>                                     struct evsel *evsel,
> >>                                     double avg, int cpu,
> >> @@ -829,7 +836,9 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
> >>          struct metric_event *me;
> >>          int num = 1;
> >>
> >> -       if (perf_evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS)) {
> >> +       if (config->iostat_run) {
> >> +               iostat_print_metric(config, evsel, out);
> >> +       } else if (perf_evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS)) {
> >>                  total = runtime_stat_avg(st, STAT_CYCLES, ctx, cpu);
> >>
> >>                  if (total) {
> >> diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
> >> index c7544c28c02a..c2a2b28effd6 100644
> >> --- a/tools/perf/util/stat.h
> >> +++ b/tools/perf/util/stat.h
> >> @@ -107,6 +107,7 @@ struct perf_stat_config {
> >>          bool                     big_num;
> >>          bool                     no_merge;
> >>          bool                     walltime_run_table;
> >> +       bool                     iostat_run;
> >>          FILE                    *output;
> >>          unsigned int             interval;
> >>          unsigned int             timeout;
> >> --
> >> 2.19.1
> >>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 4/5] perf stat: Enable iostat mode for x86 platforms
  2021-02-03 13:58 ` [PATCH v4 4/5] perf stat: Enable iostat mode for x86 platforms Alexander Antonov
@ 2021-03-09  7:51   ` liuqi (BA)
  2021-03-10 16:19     ` Alexander Antonov
  0 siblings, 1 reply; 19+ messages in thread
From: liuqi (BA) @ 2021-03-09  7:51 UTC (permalink / raw)
  To: Alexander Antonov, acme
  Cc: linux-kernel, jolsa, ak, alexander.shishkin, mark.rutland,
	namhyung, irogers, mingo, peterz

Hi Alexander,

On 2021/2/3 21:58, Alexander Antonov wrote:
> This functionality is based on recently introduced sysfs attributes
> for Intel® Xeon® Scalable processor family (code name Skylake-SP):
> Commit bb42b3d39781 ("perf/x86/intel/uncore: Expose an Uncore unit to
> IIO PMON mapping")
> 
> Mode is intended to provide four I/O performance metrics in MB per each
> PCIe root port:
>   - Inbound Read: I/O devices below root port read from the host memory
>   - Inbound Write: I/O devices below root port write to the host memory
>   - Outbound Read: CPU reads from I/O devices below root port
>   - Outbound Write: CPU writes to I/O devices below root port
> 
> Each metric requiries only one uncore event which increments at every 4B
> transfer in corresponding direction. The formulas to compute metrics
> are generic:
>      #EventCount * 4B / (1024 * 1024)
> 
> Signed-off-by: Alexander Antonov<alexander.antonov@linux.intel.com>
> ---
>   tools/perf/Documentation/perf-iostat.txt |  88 ++++++
>   tools/perf/Makefile.perf                 |   5 +-
>   tools/perf/arch/x86/util/Build           |   1 +
>   tools/perf/arch/x86/util/iostat.c        | 345 +++++++++++++++++++++++
>   tools/perf/command-list.txt              |   1 +
>   tools/perf/perf-iostat.sh                |  12 +
>   6 files changed, 451 insertions(+), 1 deletion(-)
>   create mode 100644 tools/perf/Documentation/perf-iostat.txt
>   create mode 100644 tools/perf/perf-iostat.sh
> 
> diff --git a/tools/perf/Documentation/perf-iostat.txt b/tools/perf/Documentation/perf-iostat.txt
> new file mode 100644
> index 000000000000..165176944031
> --- /dev/null
> +++ b/tools/perf/Documentation/perf-iostat.txt
> @@ -0,0 +1,88 @@
> +perf-iostat(1)
> +===============
> +
> +NAME
> +----
> +perf-iostat - Show I/O performance metrics
> +
> +SYNOPSIS
> +--------
> +[verse]
> +'perf iostat' list
> +'perf iostat' <ports> -- <command> [<options>]
> +
> +DESCRIPTION
> +-----------
> +Mode is intended to provide four I/O performance metrics per each PCIe root port:
> +
> +- Inbound Read   - I/O devices below root port read from the host memory, in MB
> +
> +- Inbound Write  - I/O devices below root port write to the host memory, in MB
> +
> +- Outbound Read  - CPU reads from I/O devices below root port, in MB
> +
> +- Outbound Write - CPU writes to I/O devices below root port, in MB
> +
> +OPTIONS
> +-------
> +<command>...::
> +	Any command you can specify in a shell.
> +
> +list::
> +	List all PCIe root ports.

I noticed that "iostat" commond and cmd_iostat() callback function is 
not registered in cmd_struct in perf.c. So I think "perf iostat list" 
perhaps can not work properly.

I also test this patchset on x86 platform, and here is the log:

root@ubuntu:/home/lq# ./perf iostat list
perf: 'iostat' is not a perf-command. See 'perf --help'.
root@ubuntu:/home/lq# ./perf stat --iostat
^C
  Performance counter stats for 'system wide':

    port             Inbound Read(MB)    Inbound Write(MB)    Outbound 
Read(MB)   Outbound Write(MB)
0000:00                    0                    0                    0 
                  0
0000:80                    0                    0                    0 
                  0
0000:17                    0                    0                    0 
                  0
0000:85                    0                    0                    0 
                  0
0000:3a                    0                    0                    0 
                  0
0000:ae                    0                    0                    0 
                  0
0000:5d                    0                    0                    0 
                  0
0000:d7                    0                    0                    0 
                  0

        0.611303832 seconds time elapsed


root@ubuntu:/home/lq# ./perf stat --iostat=0000:17
^C
  Performance counter stats for 'system wide':

    port             Inbound Read(MB)    Inbound Write(MB)    Outbound 
Read(MB)   Outbound Write(MB)
0000:17                    0                    0                    0 
                  0

        0.521317572 seconds time elapsed

So how does following perf iostat list work, did I miss something?

Thanks,
Qi

> +
> +<ports>::
> +	Select the root ports for monitoring. Comma-separated list is supported.
> +
> +EXAMPLES
> +--------
> +
> +1. List all PCIe root ports (example for 2-S platform):
> +
> +   $ perf iostat list
> +   S0-uncore_iio_0<0000:00>
> +   S1-uncore_iio_0<0000:80>
> +   S0-uncore_iio_1<0000:17>
> +   S1-uncore_iio_1<0000:85>
> +   S0-uncore_iio_2<0000:3a>
> +   S1-uncore_iio_2<0000:ae>
> +   S0-uncore_iio_3<0000:5d>
> +   S1-uncore_iio_3<0000:d7>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 4/5] perf stat: Enable iostat mode for x86 platforms
  2021-03-09  7:51   ` liuqi (BA)
@ 2021-03-10 16:19     ` Alexander Antonov
  2021-03-11  7:02       ` liuqi (BA)
  0 siblings, 1 reply; 19+ messages in thread
From: Alexander Antonov @ 2021-03-10 16:19 UTC (permalink / raw)
  To: liuqi (BA)
  Cc: acme, linux-kernel, jolsa, ak, alexander.shishkin, mark.rutland,
	namhyung, irogers, mingo, peterz


On 3/9/2021 10:51 AM, liuqi (BA) wrote:
> Hi Alexander,
>
> On 2021/2/3 21:58, Alexander Antonov wrote:
>> This functionality is based on recently introduced sysfs attributes
>> for Intel® Xeon® Scalable processor family (code name Skylake-SP):
>> Commit bb42b3d39781 ("perf/x86/intel/uncore: Expose an Uncore unit to
>> IIO PMON mapping")
>>
>> Mode is intended to provide four I/O performance metrics in MB per each
>> PCIe root port:
>>   - Inbound Read: I/O devices below root port read from the host memory
>>   - Inbound Write: I/O devices below root port write to the host memory
>>   - Outbound Read: CPU reads from I/O devices below root port
>>   - Outbound Write: CPU writes to I/O devices below root port
>>
>> Each metric requiries only one uncore event which increments at every 4B
>> transfer in corresponding direction. The formulas to compute metrics
>> are generic:
>>      #EventCount * 4B / (1024 * 1024)
>>
>> Signed-off-by: Alexander Antonov<alexander.antonov@linux.intel.com>
>> ---
>>   tools/perf/Documentation/perf-iostat.txt |  88 ++++++
>>   tools/perf/Makefile.perf                 |   5 +-
>>   tools/perf/arch/x86/util/Build           |   1 +
>>   tools/perf/arch/x86/util/iostat.c        | 345 +++++++++++++++++++++++
>>   tools/perf/command-list.txt              |   1 +
>>   tools/perf/perf-iostat.sh                |  12 +
>>   6 files changed, 451 insertions(+), 1 deletion(-)
>>   create mode 100644 tools/perf/Documentation/perf-iostat.txt
>>   create mode 100644 tools/perf/perf-iostat.sh
>>
>> diff --git a/tools/perf/Documentation/perf-iostat.txt 
>> b/tools/perf/Documentation/perf-iostat.txt
>> new file mode 100644
>> index 000000000000..165176944031
>> --- /dev/null
>> +++ b/tools/perf/Documentation/perf-iostat.txt
>> @@ -0,0 +1,88 @@
>> +perf-iostat(1)
>> +===============
>> +
>> +NAME
>> +----
>> +perf-iostat - Show I/O performance metrics
>> +
>> +SYNOPSIS
>> +--------
>> +[verse]
>> +'perf iostat' list
>> +'perf iostat' <ports> -- <command> [<options>]
>> +
>> +DESCRIPTION
>> +-----------
>> +Mode is intended to provide four I/O performance metrics per each 
>> PCIe root port:
>> +
>> +- Inbound Read   - I/O devices below root port read from the host 
>> memory, in MB
>> +
>> +- Inbound Write  - I/O devices below root port write to the host 
>> memory, in MB
>> +
>> +- Outbound Read  - CPU reads from I/O devices below root port, in MB
>> +
>> +- Outbound Write - CPU writes to I/O devices below root port, in MB
>> +
>> +OPTIONS
>> +-------
>> +<command>...::
>> +    Any command you can specify in a shell.
>> +
>> +list::
>> +    List all PCIe root ports.
>
> I noticed that "iostat" commond and cmd_iostat() callback function is 
> not registered in cmd_struct in perf.c. So I think "perf iostat list" 
> perhaps can not work properly.
>
> I also test this patchset on x86 platform, and here is the log:
>
> root@ubuntu:/home/lq# ./perf iostat list
> perf: 'iostat' is not a perf-command. See 'perf --help'.
> root@ubuntu:/home/lq# ./perf stat --iostat
> ^C
>  Performance counter stats for 'system wide':
>
>    port             Inbound Read(MB)    Inbound Write(MB) Outbound 
> Read(MB)   Outbound Write(MB)
> 0000:00                    0 0                    0                  0
> 0000:80                    0 0                    0                  0
> 0000:17                    0 0                    0                  0
> 0000:85                    0 0                    0                  0
> 0000:3a                    0 0                    0                  0
> 0000:ae                    0 0                    0                  0
> 0000:5d                    0 0                    0                  0
> 0000:d7                    0 0                    0                  0
>
>        0.611303832 seconds time elapsed
>
>
> root@ubuntu:/home/lq# ./perf stat --iostat=0000:17
> ^C
>  Performance counter stats for 'system wide':
>
>    port             Inbound Read(MB)    Inbound Write(MB) Outbound 
> Read(MB)   Outbound Write(MB)
> 0000:17                    0 0                    0                  0
>
>        0.521317572 seconds time elapsed
>
> So how does following perf iostat list work, did I miss something?
>
> Thanks,
> Qi
>
Hello,

The 'iostat' mode uses aliases mechanism in perf same as 'perf archive' and
in this case you don't need to add function callback into cmd_struct.
For example, the command 'perf iostat list' will be converted to
'perf stat --iostat=list'.

After building the perf tool you should have two shell scripts in tools/perf
directory and one of them is executable, for example:
# make -C tools/perf
# ls -l tools/perf/perf-iostat*
-rwxr-xr-x 1 root root 290 Mar 10 18:17 perf-iostat
-rw-r--r-- 1 root root 290 Feb  3 15:14 perf-iostat.sh

It should be possible to run 'perf iostat' from build directory:
# cd tools/perf
# ./perf iostat list
S0-uncore_iio_0<0000:00>
S1-uncore_iio_0<0000:80>
S0-uncore_iio_1<0000:17>
S1-uncore_iio_1<0000:85>
S0-uncore_iio_2<0000:3a>
S1-uncore_iio_2<0000:ae>
S0-uncore_iio_3<0000:5d>
S1-uncore_iio_3<0000:d7>

Also you can copy 'perf-iostat' to ~/libexec/perf-core/ or just run 
'make install'

# make install
# cp perf /usr/bin/
# ls -lh ~/libexec/perf-core/
total 24K
-rwxr-xr-x 1 root root 1.4K Mar 10 18:17 perf-archive
-rwxr-xr-x 1 root root  290 Mar 10 18:17 perf-iostat
-rwxr-xr-x 1 root root 6.3K Mar 10 18:17 perf-with-kcore
drwxr-xr-x 4 root root 4.0K Nov  5  2019 scripts
drwxr-xr-x 4 root root 4.0K Mar 10 18:17 tests
# perf iostat 0000:17 -I 1000 --interval-count 2
#           time    port             Inbound Read(MB)    Inbound 
Write(MB)    Outbound Read(MB)   Outbound Write(MB)
      1.000220341 0000:17                    0 0                    
0                    0
      2.000569569 0000:17                    0 0                    
0                    0

Actually, Arnaldo has explained before how does aliases mechanism work.

I hope it will solve your issue. Otherwise, please email.

Thank you,
Alexander
>> +
>> +<ports>::
>> +    Select the root ports for monitoring. Comma-separated list is 
>> supported.
>> +
>> +EXAMPLES
>> +--------
>> +
>> +1. List all PCIe root ports (example for 2-S platform):
>> +
>> +   $ perf iostat list
>> +   S0-uncore_iio_0<0000:00>
>> +   S1-uncore_iio_0<0000:80>
>> +   S0-uncore_iio_1<0000:17>
>> +   S1-uncore_iio_1<0000:85>
>> +   S0-uncore_iio_2<0000:3a>
>> +   S1-uncore_iio_2<0000:ae>
>> +   S0-uncore_iio_3<0000:5d>
>> +   S1-uncore_iio_3<0000:d7>
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 4/5] perf stat: Enable iostat mode for x86 platforms
  2021-03-10 16:19     ` Alexander Antonov
@ 2021-03-11  7:02       ` liuqi (BA)
  0 siblings, 0 replies; 19+ messages in thread
From: liuqi (BA) @ 2021-03-11  7:02 UTC (permalink / raw)
  To: Alexander Antonov
  Cc: acme, linux-kernel, jolsa, ak, alexander.shishkin, mark.rutland,
	namhyung, irogers, mingo, peterz



On 2021/3/11 0:19, Alexander Antonov wrote:
> 
> On 3/9/2021 10:51 AM, liuqi (BA) wrote:
>> Hi Alexander,
>>
>> On 2021/2/3 21:58, Alexander Antonov wrote:
>>> This functionality is based on recently introduced sysfs attributes
>>> for Intel® Xeon® Scalable processor family (code name Skylake-SP):
>>> Commit bb42b3d39781 ("perf/x86/intel/uncore: Expose an Uncore unit to
>>> IIO PMON mapping")
>>>
>>> Mode is intended to provide four I/O performance metrics in MB per each
>>> PCIe root port:
>>>   - Inbound Read: I/O devices below root port read from the host memory
>>>   - Inbound Write: I/O devices below root port write to the host memory
>>>   - Outbound Read: CPU reads from I/O devices below root port
>>>   - Outbound Write: CPU writes to I/O devices below root port
>>>
>>> Each metric requiries only one uncore event which increments at every 4B
>>> transfer in corresponding direction. The formulas to compute metrics
>>> are generic:
>>>      #EventCount * 4B / (1024 * 1024)
>>>
>>> Signed-off-by: Alexander Antonov<alexander.antonov@linux.intel.com>
>>> ---
>>>   tools/perf/Documentation/perf-iostat.txt |  88 ++++++
>>>   tools/perf/Makefile.perf                 |   5 +-
>>>   tools/perf/arch/x86/util/Build           |   1 +
>>>   tools/perf/arch/x86/util/iostat.c        | 345 +++++++++++++++++++++++
>>>   tools/perf/command-list.txt              |   1 +
>>>   tools/perf/perf-iostat.sh                |  12 +
>>>   6 files changed, 451 insertions(+), 1 deletion(-)
>>>   create mode 100644 tools/perf/Documentation/perf-iostat.txt
>>>   create mode 100644 tools/perf/perf-iostat.sh
>>>
>>> diff --git a/tools/perf/Documentation/perf-iostat.txt 
>>> b/tools/perf/Documentation/perf-iostat.txt
>>> new file mode 100644
>>> index 000000000000..165176944031
>>> --- /dev/null
>>> +++ b/tools/perf/Documentation/perf-iostat.txt
>>> @@ -0,0 +1,88 @@
>>> +perf-iostat(1)
>>> +===============
>>> +
>>> +NAME
>>> +----
>>> +perf-iostat - Show I/O performance metrics
>>> +
>>> +SYNOPSIS
>>> +--------
>>> +[verse]
>>> +'perf iostat' list
>>> +'perf iostat' <ports> -- <command> [<options>]
>>> +
>>> +DESCRIPTION
>>> +-----------
>>> +Mode is intended to provide four I/O performance metrics per each 
>>> PCIe root port:
>>> +
>>> +- Inbound Read   - I/O devices below root port read from the host 
>>> memory, in MB
>>> +
>>> +- Inbound Write  - I/O devices below root port write to the host 
>>> memory, in MB
>>> +
>>> +- Outbound Read  - CPU reads from I/O devices below root port, in MB
>>> +
>>> +- Outbound Write - CPU writes to I/O devices below root port, in MB
>>> +
>>> +OPTIONS
>>> +-------
>>> +<command>...::
>>> +    Any command you can specify in a shell.
>>> +
>>> +list::
>>> +    List all PCIe root ports.
>>
>> I noticed that "iostat" commond and cmd_iostat() callback function is 
>> not registered in cmd_struct in perf.c. So I think "perf iostat list" 
>> perhaps can not work properly.
>>
>> I also test this patchset on x86 platform, and here is the log:
>>
>> root@ubuntu:/home/lq# ./perf iostat list
>> perf: 'iostat' is not a perf-command. See 'perf --help'.
>> root@ubuntu:/home/lq# ./perf stat --iostat
>> ^C
>>  Performance counter stats for 'system wide':
>>
>>    port             Inbound Read(MB)    Inbound Write(MB) Outbound 
>> Read(MB)   Outbound Write(MB)
>> 0000:00                    0 0                    0                  0
>> 0000:80                    0 0                    0                  0
>> 0000:17                    0 0                    0                  0
>> 0000:85                    0 0                    0                  0
>> 0000:3a                    0 0                    0                  0
>> 0000:ae                    0 0                    0                  0
>> 0000:5d                    0 0                    0                  0
>> 0000:d7                    0 0                    0                  0
>>
>>        0.611303832 seconds time elapsed
>>
>>
>> root@ubuntu:/home/lq# ./perf stat --iostat=0000:17
>> ^C
>>  Performance counter stats for 'system wide':
>>
>>    port             Inbound Read(MB)    Inbound Write(MB) Outbound 
>> Read(MB)   Outbound Write(MB)
>> 0000:17                    0 0                    0                  0
>>
>>        0.521317572 seconds time elapsed
>>
>> So how does following perf iostat list work, did I miss something?
>>
>> Thanks,
>> Qi
>>
> Hello,
> 
> The 'iostat' mode uses aliases mechanism in perf same as 'perf archive' and
> in this case you don't need to add function callback into cmd_struct.
> For example, the command 'perf iostat list' will be converted to
> 'perf stat --iostat=list'.
> 
> After building the perf tool you should have two shell scripts in 
> tools/perf
> directory and one of them is executable, for example:
> # make -C tools/perf
> # ls -l tools/perf/perf-iostat*
> -rwxr-xr-x 1 root root 290 Mar 10 18:17 perf-iostat
> -rw-r--r-- 1 root root 290 Feb  3 15:14 perf-iostat.sh
> 
> It should be possible to run 'perf iostat' from build directory:
> # cd tools/perf
> # ./perf iostat list
> S0-uncore_iio_0<0000:00>
> S1-uncore_iio_0<0000:80>
> S0-uncore_iio_1<0000:17>
> S1-uncore_iio_1<0000:85>
> S0-uncore_iio_2<0000:3a>
> S1-uncore_iio_2<0000:ae>
> S0-uncore_iio_3<0000:5d>
> S1-uncore_iio_3<0000:d7>
> 
> Also you can copy 'perf-iostat' to ~/libexec/perf-core/ or just run 
> 'make install'
> 
> # make install
> # cp perf /usr/bin/
> # ls -lh ~/libexec/perf-core/
> total 24K
> -rwxr-xr-x 1 root root 1.4K Mar 10 18:17 perf-archive
> -rwxr-xr-x 1 root root  290 Mar 10 18:17 perf-iostat
> -rwxr-xr-x 1 root root 6.3K Mar 10 18:17 perf-with-kcore
> drwxr-xr-x 4 root root 4.0K Nov  5  2019 scripts
> drwxr-xr-x 4 root root 4.0K Mar 10 18:17 tests
> # perf iostat 0000:17 -I 1000 --interval-count 2
> #           time    port             Inbound Read(MB)    Inbound 
> Write(MB)    Outbound Read(MB)   Outbound Write(MB)
>       1.000220341 0000:17                    0 0 0                    0
>       2.000569569 0000:17                    0 0 0                    0
> 
> Actually, Arnaldo has explained before how does aliases mechanism work.
> 
> I hope it will solve your issue. Otherwise, please email.
> 
> Thank you,
> Alexander

Hi Alexander,

I try it again on x86 platform and it works, thanks a lot:)

Thanks,
Qi
>>> +
>>> +<ports>::
>>> +    Select the root ports for monitoring. Comma-separated list is 
>>> supported.
>>> +
>>> +EXAMPLES
>>> +--------
>>> +
>>> +1. List all PCIe root ports (example for 2-S platform):
>>> +
>>> +   $ perf iostat list
>>> +   S0-uncore_iio_0<0000:00>
>>> +   S1-uncore_iio_0<0000:80>
>>> +   S0-uncore_iio_1<0000:17>
>>> +   S1-uncore_iio_1<0000:85>
>>> +   S0-uncore_iio_2<0000:3a>
>>> +   S1-uncore_iio_2<0000:ae>
>>> +   S0-uncore_iio_3<0000:5d>
>>> +   S1-uncore_iio_3<0000:d7>
>>
> 
> .


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2021-03-11  7:03 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-03 13:58 [PATCH v4 0/5] perf stat: Introduce iostat mode to provide I/O performance metrics Alexander Antonov
2021-02-03 13:58 ` [PATCH v4 1/5] perf stat: Add AGGR_PCIE_PORT mode Alexander Antonov
2021-02-04 12:07   ` Namhyung Kim
2021-02-08 11:30     ` Alexander Antonov
2021-02-10 10:02       ` Namhyung Kim
2021-02-03 13:58 ` [PATCH v4 2/5] perf stat: Basic support for iostat in perf Alexander Antonov
2021-02-04 12:22   ` Namhyung Kim
2021-02-08 11:55     ` Alexander Antonov
2021-02-10 10:07       ` Namhyung Kim
2021-02-03 13:58 ` [PATCH v4 3/5] perf stat: Helper functions for PCIe root ports list in iostat mode Alexander Antonov
2021-02-04 12:32   ` Namhyung Kim
2021-02-08 11:58     ` Alexander Antonov
2021-02-03 13:58 ` [PATCH v4 4/5] perf stat: Enable iostat mode for x86 platforms Alexander Antonov
2021-03-09  7:51   ` liuqi (BA)
2021-03-10 16:19     ` Alexander Antonov
2021-03-11  7:02       ` liuqi (BA)
2021-02-03 13:58 ` [PATCH v4 5/5] perf: Update .gitignore file Alexander Antonov
2021-02-03 20:55 ` [PATCH v4 0/5] perf stat: Introduce iostat mode to provide I/O performance metrics Arnaldo Carvalho de Melo
2021-02-04 12:35   ` Namhyung Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).