linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v1 0/3] Retirement latency perf stat support
@ 2024-04-25 22:34 Ian Rogers
  2024-04-25 22:34 ` [RFC PATCH v1 1/3] perf evsel: Don't open tool events Ian Rogers
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: Ian Rogers @ 2024-04-25 22:34 UTC (permalink / raw)
  To: weilin.wang, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
	Kan Liang, Ze Gao, Leo Yan, Ravi Bangoria, Dmitrii Dolgov,
	Song Liu, James Clark, linux-perf-users, linux-kernel

Support 'R' as a retirement latency modifier on events. When present
the evsel will fork perf record and perf report commands, parsing the
perf report output as the count value. The intent is to do something
similar to Weilin's series:
https://lore.kernel.org/lkml/20240402214436.1409476-1-weilin.wang@intel.com/

While the 'R' and the retirement latency are Intel specific, in the
future I can imagine more evsel like commands that require child
processes. We can make the logic more generic at that point.

The code is untested on hardware that supports retirement latency, and
with metrics with retirement latency in them. The record is also of
sleep and various things need tweaking but I think v1 is good enough
for people to give input.

The first patch stops opening a dummy event for tool events. I came
across this while looking into the issue and we can likely just pick
it first. I kept it in the series for cleanliness sake.

The code has benefitted greatly from Weilin's work and Namhyung's
great review input.

Ian Rogers (3):
  perf evsel: Don't open tool events
  perf parse-events: Add a retirement latency modifier
  perf evsel: Add retirement latency event support

 tools/perf/util/evsel.c        | 186 ++++++++++++++++++++++++++++++++-
 tools/perf/util/evsel.h        |   4 +
 tools/perf/util/parse-events.c |   2 +
 tools/perf/util/parse-events.h |   1 +
 tools/perf/util/parse-events.l |   3 +-
 5 files changed, 192 insertions(+), 4 deletions(-)

-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [RFC PATCH v1 1/3] perf evsel: Don't open tool events
  2024-04-25 22:34 [RFC PATCH v1 0/3] Retirement latency perf stat support Ian Rogers
@ 2024-04-25 22:34 ` Ian Rogers
  2024-04-25 22:34 ` [RFC PATCH v1 2/3] perf parse-events: Add a retirement latency modifier Ian Rogers
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Ian Rogers @ 2024-04-25 22:34 UTC (permalink / raw)
  To: weilin.wang, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
	Kan Liang, Ze Gao, Leo Yan, Ravi Bangoria, Dmitrii Dolgov,
	Song Liu, James Clark, linux-perf-users, linux-kernel

Tool events are set up as software dummy events. Don't open them to
save some work and a file descriptor. The evsel's fds are initialized
to -1, so it suffices to just early return.
---
 tools/perf/util/evsel.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 3536404e9447..2743d40665ff 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2005,6 +2005,11 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
 	int pid = -1, err, old_errno;
 	enum rlimit_action set_rlimit = NO_CHANGE;
 
+	if (evsel__is_tool(evsel)) {
+		/* Tool events are set up as dummy but don't need opening. */
+		return 0;
+	}
+
 	err = __evsel__prepare_open(evsel, cpus, threads);
 	if (err)
 		return err;
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [RFC PATCH v1 2/3] perf parse-events: Add a retirement latency modifier
  2024-04-25 22:34 [RFC PATCH v1 0/3] Retirement latency perf stat support Ian Rogers
  2024-04-25 22:34 ` [RFC PATCH v1 1/3] perf evsel: Don't open tool events Ian Rogers
@ 2024-04-25 22:34 ` Ian Rogers
  2024-04-25 22:34 ` [RFC PATCH v1 3/3] perf evsel: Add retirement latency event support Ian Rogers
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Ian Rogers @ 2024-04-25 22:34 UTC (permalink / raw)
  To: weilin.wang, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
	Kan Liang, Ze Gao, Leo Yan, Ravi Bangoria, Dmitrii Dolgov,
	Song Liu, James Clark, linux-perf-users, linux-kernel

Retirement latency is a separate sampled count used on newer Intel
CPUs.
---
 tools/perf/util/evsel.h        | 1 +
 tools/perf/util/parse-events.c | 2 ++
 tools/perf/util/parse-events.h | 1 +
 tools/perf/util/parse-events.l | 3 ++-
 4 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 517cff431de2..e6726587e1bc 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -98,6 +98,7 @@ struct evsel {
 		bool			bpf_counter;
 		bool			use_config_name;
 		bool			skippable;
+		bool			retire_lat;
 		int			bpf_fd;
 		struct bpf_object	*bpf_obj;
 		struct list_head	config_terms;
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 0f308b4db2b9..9c2a76ec8c99 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1818,6 +1818,8 @@ static int parse_events__modifier_list(struct parse_events_state *parse_state,
 			evsel->weak_group = true;
 		if (mod.bpf)
 			evsel->bpf_counter = true;
+		if (mod.retire_lat)
+			evsel->retire_lat = true;
 	}
 	return 0;
 }
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 5695308efab9..eb94d1247dae 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -201,6 +201,7 @@ struct parse_events_modifier {
 	bool hypervisor : 1;	/* 'h' */
 	bool guest : 1;		/* 'G' */
 	bool host : 1;		/* 'H' */
+	bool retire_lat : 1;	/* 'R' */
 };
 
 int parse_events__modifier_event(struct parse_events_state *parse_state, void *loc,
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 08ea2d845dc3..85015f080240 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -209,6 +209,7 @@ static int modifiers(struct parse_events_state *parse_state, yyscan_t scanner)
 		CASE('W', weak);
 		CASE('e', exclusive);
 		CASE('b', bpf);
+		CASE('R', retire_lat);
 		default:
 			return PE_ERROR;
 		}
@@ -250,7 +251,7 @@ drv_cfg_term	[a-zA-Z0-9_\.]+(=[a-zA-Z0-9_*?\.:]+)?
  * If you add a modifier you need to update check_modifier().
  * Also, the letters in modifier_event must not be in modifier_bp.
  */
-modifier_event	[ukhpPGHSDIWeb]{1,15}
+modifier_event	[ukhpPGHSDIWebR]{1,16}
 modifier_bp	[rwx]{1,3}
 lc_type 	(L1-dcache|l1-d|l1d|L1-data|L1-icache|l1-i|l1i|L1-instruction|LLC|L2|dTLB|d-tlb|Data-TLB|iTLB|i-tlb|Instruction-TLB|branch|branches|bpu|btb|bpc|node)
 lc_op_result	(load|loads|read|store|stores|write|prefetch|prefetches|speculative-read|speculative-load|refs|Reference|ops|access|misses|miss)
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [RFC PATCH v1 3/3] perf evsel: Add retirement latency event support
  2024-04-25 22:34 [RFC PATCH v1 0/3] Retirement latency perf stat support Ian Rogers
  2024-04-25 22:34 ` [RFC PATCH v1 1/3] perf evsel: Don't open tool events Ian Rogers
  2024-04-25 22:34 ` [RFC PATCH v1 2/3] perf parse-events: Add a retirement latency modifier Ian Rogers
@ 2024-04-25 22:34 ` Ian Rogers
  2024-04-25 22:41 ` [RFC PATCH v1 0/3] Retirement latency perf stat support Ian Rogers
  2024-04-26 17:22 ` Liang, Kan
  4 siblings, 0 replies; 7+ messages in thread
From: Ian Rogers @ 2024-04-25 22:34 UTC (permalink / raw)
  To: weilin.wang, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
	Kan Liang, Ze Gao, Leo Yan, Ravi Bangoria, Dmitrii Dolgov,
	Song Liu, James Clark, linux-perf-users, linux-kernel

When a retirement latency event is processed it sets a flag on the
evsel. This change makes it so that when the flag is set evsel
opening, reading and exiting report values from child perf record and
perf report processes.

Something similar was suggested by Namhyung Kim in:
https://lore.kernel.org/lkml/CAM9d7cgdQQn5GYB7t++xuoMdeqPXiEkkcop69+rD06RAnu9-EQ@mail.gmail.com/

This is trying to add support for retirement latency directly in
events rather than through metric changes, as suggested by Weilin Wang in:
https://lore.kernel.org/lkml/20240402214436.1409476-1-weilin.wang@intel.com/
---
 tools/perf/util/evsel.c | 181 +++++++++++++++++++++++++++++++++++++++-
 tools/perf/util/evsel.h |   3 +
 2 files changed, 181 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 2743d40665ff..3f0b4326bac6 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -56,6 +56,7 @@
 #include <internal/xyarray.h>
 #include <internal/lib.h>
 #include <internal/threadmap.h>
+#include <subcmd/run-command.h>
 
 #include <linux/ctype.h>
 
@@ -491,6 +492,156 @@ struct evsel *evsel__newtp_idx(const char *sys, const char *name, int idx)
 }
 #endif
 
+static int evsel__start_retire_latency_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
+					  int cpu_map_idx)
+{
+	char buf[16];
+	int pipefd[2];
+	int err, i, event_len;
+	struct perf_cpu cpu = perf_cpu_map__cpu(cpus, cpu_map_idx);
+	struct child_process *child_record =
+		xyarray__entry(evsel->children, cpu_map_idx, 0);
+	struct child_process *child_report =
+		xyarray__entry(evsel->children, cpu_map_idx, 1);
+	char *event = strdup(evsel__name(evsel));
+	// TODO: the dummy event also won't be used, but there's no option to disable.
+	const char *record_argv[15] = {
+		[0] = "perf",
+		[1] = "record",
+		[2] = "--synth=no",
+		[3] = "-W",
+		[4] = "-o",
+		[5] = "-",
+		[6] = "-e",
+	};
+	const char *report_argv[] = {
+		[0] = "perf",
+		[1] = "report",
+		[2] = "-i",
+		[3] = "-",
+		[4] = "-q",
+		[5] = "-F",
+		[6] = "weight1",
+		NULL,
+	};
+
+	if (!event)
+		return -ENOMEM;
+
+	// Remove the R from the modifiers.
+	event_len = strlen(event);
+	if (event[event_len - 1] == 'R' && event[event_len - 2] == ':') {
+		event[strlen(event) - 2] = '\0';
+	} else if (event[event_len - 1] == 'R' && event[event_len - 2] == '/') {
+		event[strlen(event) - 1] = '\0';
+	} else {
+		for (i = event_len - 1; i > 0; i--) {
+			if (event[i] == 'R') {
+				for (int j = i + 1; j < event_len; j++)
+					event[i] = event[j];
+				event[strlen(event) - 1] = '\0';
+				break;
+			}
+		}
+		if (i == 0)
+			pr_err("Expected retired latency 'R'\n");
+	}
+
+	i = 7;
+	record_argv[i++] = event;
+	if (verbose) {
+		record_argv[i++] = verbose > 1 ? "-vv" : "-v";
+	}
+	if (cpu.cpu >= 0) {
+		record_argv[i++] = "-C";
+		snprintf(buf, sizeof(buf), "%d", cpu.cpu);
+		record_argv[i++] = buf;
+	} else {
+		record_argv[i++] = "-a";
+	}
+	record_argv[i++] = "sleep";
+	// TODO: interval and support for different periods.
+	record_argv[i++] = "0.1";
+
+	if (pipe(pipefd) < 0) {
+		free(event);
+		return -errno;
+	}
+
+	child_record->argv = record_argv;
+	child_record->pid = -1;
+	child_record->no_stdin = 1;
+	if (verbose)
+		child_record->err = fileno(stderr);
+	else
+		child_record->no_stderr = 1;
+	child_record->out = pipefd[1];
+	err = start_command(child_record);
+	free(event);
+	if (err)
+		return err;
+
+	child_report->argv = report_argv;
+	child_report->pid = -1;
+	if (verbose)
+		child_report->err = fileno(stderr);
+	else
+		child_report->no_stderr = 1;
+	child_report->in = pipefd[0];
+	child_report->out = -1;
+	return start_command(child_report);
+}
+
+static int evsel__finish_retire_latency_cpu(struct evsel *evsel, int cpu_map_idx)
+{
+	struct child_process *child_record =
+		xyarray__entry(evsel->children, cpu_map_idx, 0);
+	struct child_process *child_report =
+		xyarray__entry(evsel->children, cpu_map_idx, 1);
+
+	finish_command(child_record);
+	finish_command(child_report);
+	return 0;
+}
+
+static int evsel__read_retire_latency(struct evsel *evsel, int cpu_map_idx, int thread)
+{
+	struct child_process *child_report = xyarray__entry(evsel->children, cpu_map_idx, 1);
+	struct perf_counts_values *count = perf_counts(evsel->counts, cpu_map_idx, thread);
+	char buf[256];
+	int err;
+
+	err = read(child_report->out, buf, sizeof(buf));
+	if (err < 0 || strlen(buf) == 0)
+		return -1;
+
+	count->val = atoll(buf);
+	count->ena = 1;
+	count->run = 1;
+	count->id = 0;
+	count->lost = 0;
+
+	err = evsel__finish_retire_latency_cpu(evsel, cpu_map_idx);
+	if (err)
+		return err;
+
+	/* Restart the counter. */
+	return evsel__start_retire_latency_cpu(evsel, evsel->core.cpus, cpu_map_idx);
+}
+
+static int evsel__finish_retire_latency(struct evsel *evsel)
+{
+	int idx;
+
+	perf_cpu_map__for_each_idx(idx, evsel->core.cpus) {
+		int err = evsel__finish_retire_latency_cpu(evsel, idx);
+
+		if (err)
+			return err;
+	}
+	return 0;
+}
+
 const char *const evsel__hw_names[PERF_COUNT_HW_MAX] = {
 	"cycles",
 	"instructions",
@@ -1463,6 +1614,10 @@ static void evsel__free_config_terms(struct evsel *evsel)
 
 void evsel__exit(struct evsel *evsel)
 {
+	if (evsel->children) {
+		evsel__finish_retire_latency(evsel);
+		zfree(&evsel->children);
+	}
 	assert(list_empty(&evsel->core.node));
 	assert(evsel->evlist == NULL);
 	bpf_counter__destroy(evsel);
@@ -1602,9 +1757,10 @@ static int evsel__read_group(struct evsel *leader, int cpu_map_idx, int thread)
 
 int evsel__read_counter(struct evsel *evsel, int cpu_map_idx, int thread)
 {
-	u64 read_format = evsel->core.attr.read_format;
+	if (evsel->retire_lat)
+		return evsel__read_retire_latency(evsel, cpu_map_idx, thread);
 
-	if (read_format & PERF_FORMAT_GROUP)
+	if (evsel->core.attr.read_format & PERF_FORMAT_GROUP)
 		return evsel__read_group(evsel, cpu_map_idx, thread);
 
 	return evsel__read_one(evsel, cpu_map_idx, thread);
@@ -1819,10 +1975,22 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
 		threads = empty_thread_map;
 	}
 
-	if (evsel->core.fd == NULL &&
+	if (!evsel->retire_lat && evsel->core.fd == NULL &&
 	    perf_evsel__alloc_fd(&evsel->core, perf_cpu_map__nr(cpus), nthreads) < 0)
 		return -ENOMEM;
 
+	if (evsel->retire_lat && evsel->children == NULL) {
+		/*
+		 * Use ylen of 2, [0] is the record and [1] is the report
+		 * command. Currently retirement latency doesn't support
+		 * per-thread mode.
+		 */
+		evsel->children = xyarray__new(perf_cpu_map__nr(cpus), /*ylen=*/2,
+					sizeof(struct child_process));
+		if (!evsel->children)
+			return -ENOMEM;
+	}
+
 	evsel->open_flags = PERF_FLAG_FD_CLOEXEC;
 	if (evsel->cgrp)
 		evsel->open_flags |= PERF_FLAG_PID_CGROUP;
@@ -2033,6 +2201,13 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
 
 	for (idx = start_cpu_map_idx; idx < end_cpu_map_idx; idx++) {
 
+		if (evsel->retire_lat) {
+			err = evsel__start_retire_latency_cpu(evsel, cpus, idx);
+			if (err)
+				return err;
+			continue;
+		}
+
 		for (thread = 0; thread < nthreads; thread++) {
 			int fd, group_fd;
 retry_open:
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index e6726587e1bc..ab7c10e7f063 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -171,6 +171,9 @@ struct evsel {
 
 	/* for missing_features */
 	struct perf_pmu		*pmu;
+
+	/* Used for retire_lat child process. */
+	struct xyarray *children;
 };
 
 struct perf_missing_features {
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH v1 0/3] Retirement latency perf stat support
  2024-04-25 22:34 [RFC PATCH v1 0/3] Retirement latency perf stat support Ian Rogers
                   ` (2 preceding siblings ...)
  2024-04-25 22:34 ` [RFC PATCH v1 3/3] perf evsel: Add retirement latency event support Ian Rogers
@ 2024-04-25 22:41 ` Ian Rogers
  2024-04-26 17:22 ` Liang, Kan
  4 siblings, 0 replies; 7+ messages in thread
From: Ian Rogers @ 2024-04-25 22:41 UTC (permalink / raw)
  To: weilin.wang, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
	Kan Liang, Ze Gao, Leo Yan, Ravi Bangoria, Dmitrii Dolgov,
	Song Liu, James Clark, linux-perf-users, linux-kernel

On Thu, Apr 25, 2024 at 3:34 PM Ian Rogers <irogers@google.com> wrote:
>
> Support 'R' as a retirement latency modifier on events. When present
> the evsel will fork perf record and perf report commands, parsing the
> perf report output as the count value. The intent is to do something
> similar to Weilin's series:
> https://lore.kernel.org/lkml/20240402214436.1409476-1-weilin.wang@intel.com/
>
> While the 'R' and the retirement latency are Intel specific, in the
> future I can imagine more evsel like commands that require child
> processes. We can make the logic more generic at that point.
>
> The code is untested on hardware that supports retirement latency, and
> with metrics with retirement latency in them. The record is also of
> sleep and various things need tweaking but I think v1 is good enough
> for people to give input.
>
> The first patch stops opening a dummy event for tool events. I came
> across this while looking into the issue and we can likely just pick
> it first. I kept it in the series for cleanliness sake.
>
> The code has benefitted greatly from Weilin's work and Namhyung's
> great review input.

I forgot to mention this is based on the tmp.perf-tools-next branch
due to the recent parse events clean ups that have already landed
there:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/log/?h=tmp.perf-tools-next

Thanks,
Ian

> Ian Rogers (3):
>   perf evsel: Don't open tool events
>   perf parse-events: Add a retirement latency modifier
>   perf evsel: Add retirement latency event support
>
>  tools/perf/util/evsel.c        | 186 ++++++++++++++++++++++++++++++++-
>  tools/perf/util/evsel.h        |   4 +
>  tools/perf/util/parse-events.c |   2 +
>  tools/perf/util/parse-events.h |   1 +
>  tools/perf/util/parse-events.l |   3 +-
>  5 files changed, 192 insertions(+), 4 deletions(-)
>
> --
> 2.44.0.769.g3c40516874-goog
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH v1 0/3] Retirement latency perf stat support
  2024-04-25 22:34 [RFC PATCH v1 0/3] Retirement latency perf stat support Ian Rogers
                   ` (3 preceding siblings ...)
  2024-04-25 22:41 ` [RFC PATCH v1 0/3] Retirement latency perf stat support Ian Rogers
@ 2024-04-26 17:22 ` Liang, Kan
  2024-04-26 17:34   ` Ian Rogers
  4 siblings, 1 reply; 7+ messages in thread
From: Liang, Kan @ 2024-04-26 17:22 UTC (permalink / raw)
  To: Ian Rogers, weilin.wang, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Adrian Hunter, Ze Gao, Leo Yan,
	Ravi Bangoria, Dmitrii Dolgov, Song Liu, James Clark,
	linux-perf-users, linux-kernel



On 2024-04-25 6:34 p.m., Ian Rogers wrote:
> Support 'R' as a retirement latency modifier on events. When present
> the evsel will fork perf record and perf report commands, parsing the
> perf report output as the count value. The intent is to do something
> similar to Weilin's series:
> https://lore.kernel.org/lkml/20240402214436.1409476-1-weilin.wang@intel.com/
> 
> While the 'R' and the retirement latency are Intel specific, in the
> future I can imagine more evsel like commands that require child
> processes. We can make the logic more generic at that point.
>

I think in generic what we want is the weight/latency information of the
event. 'W' is already occupied by the weak group. Maybe 'L' is a more
generic name than 'R'. With the event modifier, perf collects and report
the weight/latency information of the event in a perf stat command.

Not just changing the evsel, I think a proper output is still required.
It's possible that an end user can use it without metrics. E.g.,
perf stat -e cycles,instructions:L
A possible generic output maybe

1,931,099,931	cycles
  801,826,458	instructions	# Avg Weight1 1000
				# Avg Weight2 800
				# Avg Weight3 500

Thanks,
Kan

> The code is untested on hardware that supports retirement latency, and
> with metrics with retirement latency in them. The record is also of
> sleep and various things need tweaking but I think v1 is good enough
> for people to give input.
> 
> The first patch stops opening a dummy event for tool events. I came
> across this while looking into the issue and we can likely just pick
> it first. I kept it in the series for cleanliness sake.
> 
> The code has benefitted greatly from Weilin's work and Namhyung's
> great review input.
> 
> Ian Rogers (3):
>   perf evsel: Don't open tool events
>   perf parse-events: Add a retirement latency modifier
>   perf evsel: Add retirement latency event support
> 
>  tools/perf/util/evsel.c        | 186 ++++++++++++++++++++++++++++++++-
>  tools/perf/util/evsel.h        |   4 +
>  tools/perf/util/parse-events.c |   2 +
>  tools/perf/util/parse-events.h |   1 +
>  tools/perf/util/parse-events.l |   3 +-
>  5 files changed, 192 insertions(+), 4 deletions(-)
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH v1 0/3] Retirement latency perf stat support
  2024-04-26 17:22 ` Liang, Kan
@ 2024-04-26 17:34   ` Ian Rogers
  0 siblings, 0 replies; 7+ messages in thread
From: Ian Rogers @ 2024-04-26 17:34 UTC (permalink / raw)
  To: Liang, Kan
  Cc: weilin.wang, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Adrian Hunter, Ze Gao, Leo Yan,
	Ravi Bangoria, Dmitrii Dolgov, Song Liu, James Clark,
	linux-perf-users, linux-kernel

On Fri, Apr 26, 2024 at 10:22 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>
> On 2024-04-25 6:34 p.m., Ian Rogers wrote:
> > Support 'R' as a retirement latency modifier on events. When present
> > the evsel will fork perf record and perf report commands, parsing the
> > perf report output as the count value. The intent is to do something
> > similar to Weilin's series:
> > https://lore.kernel.org/lkml/20240402214436.1409476-1-weilin.wang@intel.com/
> >
> > While the 'R' and the retirement latency are Intel specific, in the
> > future I can imagine more evsel like commands that require child
> > processes. We can make the logic more generic at that point.
> >
>
> I think in generic what we want is the weight/latency information of the
> event. 'W' is already occupied by the weak group. Maybe 'L' is a more
> generic name than 'R'. With the event modifier, perf collects and report
> the weight/latency information of the event in a perf stat command.
>
> Not just changing the evsel, I think a proper output is still required.
> It's possible that an end user can use it without metrics. E.g.,
> perf stat -e cycles,instructions:L
> A possible generic output maybe
>
> 1,931,099,931   cycles
>   801,826,458   instructions    # Avg Weight1 1000
>                                 # Avg Weight2 800
>                                 # Avg Weight3 500

I think this is good but we need to work toward it. This change is
opening a separate perf record per CPU, we should really open one perf
record and then read each counter separately in the perf report
output. We shouldn't really fork a perf record, we should gather
multiple weights, and so on.. There isn't a notion in the current
counts abstraction that you have multiple counts, and that will need
feeding through into all the aggregation code.

Thanks,
Ian

> Thanks,
> Kan
>
> > The code is untested on hardware that supports retirement latency, and
> > with metrics with retirement latency in them. The record is also of
> > sleep and various things need tweaking but I think v1 is good enough
> > for people to give input.
> >
> > The first patch stops opening a dummy event for tool events. I came
> > across this while looking into the issue and we can likely just pick
> > it first. I kept it in the series for cleanliness sake.
> >
> > The code has benefitted greatly from Weilin's work and Namhyung's
> > great review input.
> >
> > Ian Rogers (3):
> >   perf evsel: Don't open tool events
> >   perf parse-events: Add a retirement latency modifier
> >   perf evsel: Add retirement latency event support
> >
> >  tools/perf/util/evsel.c        | 186 ++++++++++++++++++++++++++++++++-
> >  tools/perf/util/evsel.h        |   4 +
> >  tools/perf/util/parse-events.c |   2 +
> >  tools/perf/util/parse-events.h |   1 +
> >  tools/perf/util/parse-events.l |   3 +-
> >  5 files changed, 192 insertions(+), 4 deletions(-)
> >

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-04-26 17:34 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-25 22:34 [RFC PATCH v1 0/3] Retirement latency perf stat support Ian Rogers
2024-04-25 22:34 ` [RFC PATCH v1 1/3] perf evsel: Don't open tool events Ian Rogers
2024-04-25 22:34 ` [RFC PATCH v1 2/3] perf parse-events: Add a retirement latency modifier Ian Rogers
2024-04-25 22:34 ` [RFC PATCH v1 3/3] perf evsel: Add retirement latency event support Ian Rogers
2024-04-25 22:41 ` [RFC PATCH v1 0/3] Retirement latency perf stat support Ian Rogers
2024-04-26 17:22 ` Liang, Kan
2024-04-26 17:34   ` Ian Rogers

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).