linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] perf metrics: Add literal for system TSC frequency
@ 2022-05-27  4:04 Ian Rogers
  2022-05-27  9:49 ` Peter Zijlstra
  0 siblings, 1 reply; 7+ messages in thread
From: Ian Rogers @ 2022-05-27  4:04 UTC (permalink / raw)
  To: perry.taylor, caleb.biggers, kshipra.bopardikar, Kan Liang,
	Zhengjun Xing, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Maxime Coquelin, Alexandre Torgue,
	Andi Kleen, James Clark, John Garry, linux-kernel,
	linux-perf-users
  Cc: Stephane Eranian, Ian Rogers

Such a literal is useful to calculate things like the average frequency
[1]. The TSC frequency isn't exposed by sysfs although some experimental
drivers look to add it [2]. This change computes the value using the
frequency in /proc/cpuinfo which is accurate at least on Intel
processors.

v2. Adds warnings to make clear if things have changed/broken on future
    Intel platforms. It also adds caching and an Intel specific that a
    value is computed.

[1] https://github.com/intel/perfmon-metrics/blob/5ad9ef7056f31075e8178b9f1fb732af183b2c8d/SKX/metrics/perf/skx_metric_perf.json#L11
[2] https://github.com/trailofbits/tsc_freq_khz

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/tests/expr.c | 15 +++++++++++++
 tools/perf/util/expr.c  | 50 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+)

diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
index 5c0032fe93ae..45afe4f24859 100644
--- a/tools/perf/tests/expr.c
+++ b/tools/perf/tests/expr.c
@@ -1,8 +1,10 @@
 // SPDX-License-Identifier: GPL-2.0
 #include "util/debug.h"
 #include "util/expr.h"
+#include "util/header.h"
 #include "util/smt.h"
 #include "tests.h"
+#include <math.h>
 #include <stdlib.h>
 #include <string.h>
 #include <linux/zalloc.h>
@@ -69,6 +71,11 @@ static int test__expr(struct test_suite *t __maybe_unused, int subtest __maybe_u
 	double val, num_cpus, num_cores, num_dies, num_packages;
 	int ret;
 	struct expr_parse_ctx *ctx;
+	bool is_intel = false;
+	char buf[128];
+
+	if (!get_cpuid(buf, sizeof(buf)))
+		is_intel = strstr(buf, "Intel") != NULL;
 
 	TEST_ASSERT_EQUAL("ids_union", test_ids_union(), 0);
 
@@ -175,6 +182,14 @@ static int test__expr(struct test_suite *t __maybe_unused, int subtest __maybe_u
 	if (num_dies) // Some platforms do not have CPU die support, for example s390
 		TEST_ASSERT_VAL("#num_dies >= #num_packages", num_dies >= num_packages);
 
+	if (is_intel) {
+		double system_tsc_freq;
+
+		TEST_ASSERT_VAL("#system_tsc_freq", expr__parse(&system_tsc_freq, ctx,
+								"#system_tsc_freq") == 0);
+		TEST_ASSERT_VAL("!isnan(#system_tsc_freq)", !isnan(system_tsc_freq));
+	}
+
 	/*
 	 * Source count returns the number of events aggregating in a leader
 	 * event including the leader. Check parsing yields an id.
diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
index 675f318ce7c1..f33aeb1e6faa 100644
--- a/tools/perf/util/expr.c
+++ b/tools/perf/util/expr.c
@@ -402,6 +402,51 @@ double expr_id_data__source_count(const struct expr_id_data *data)
 	return data->val.source_count;
 }
 
+/*
+ * Derive the TSC frequency in Hz from the /proc/cpuinfo, for example:
+ * ...
+ * model name      : Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz
+ * ...
+ * will return 3000000000.
+ */
+static double system_tsc_freq(void)
+{
+	static double result;
+	static bool computed;
+	FILE *cpuinfo;
+	char *line = NULL;
+	size_t len = 0;
+
+	if (computed)
+		return result;
+
+	computed = true;
+	result = NAN;
+	cpuinfo = fopen("/proc/cpuinfo", "r");
+	if (!cpuinfo) {
+		pr_err("Failed to read /proc/cpuinfo for TSC frequency");
+		return NAN;
+	}
+	while (getline(&line, &len, cpuinfo) > 0) {
+		if (!strncmp(line, "model name", 10)) {
+			char *pos = strstr(line + 11, " @ ");
+
+			if (pos && sscanf(pos, " @ %lfGHz", &result) == 1) {
+				result *= 1000000000;
+				goto out;
+			}
+		}
+	}
+
+out:
+	if (isnan(result))
+		pr_err("Failed to find TSC frequency in /proc/cpuinfo");
+
+	free(line);
+	fclose(cpuinfo);
+	return result;
+}
+
 double expr__get_literal(const char *literal)
 {
 	static struct cpu_topology *topology;
@@ -417,6 +462,11 @@ double expr__get_literal(const char *literal)
 		goto out;
 	}
 
+	if (!strcasecmp("#system_tsc_freq", literal)) {
+		result = system_tsc_freq();
+		goto out;
+	}
+
 	/*
 	 * Assume that topology strings are consistent, such as CPUs "0-1"
 	 * wouldn't be listed as "0,1", and so after deduplication the number of
-- 
2.36.1.124.g0e6072fb45-goog


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] perf metrics: Add literal for system TSC frequency
  2022-05-27  4:04 [PATCH v2] perf metrics: Add literal for system TSC frequency Ian Rogers
@ 2022-05-27  9:49 ` Peter Zijlstra
  2022-05-27 14:54   ` Andi Kleen
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2022-05-27  9:49 UTC (permalink / raw)
  To: Ian Rogers
  Cc: perry.taylor, caleb.biggers, kshipra.bopardikar, Kan Liang,
	Zhengjun Xing, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Maxime Coquelin, Alexandre Torgue, Andi Kleen, James Clark,
	John Garry, linux-kernel, linux-perf-users, Stephane Eranian,
	Thomas Gleixner

On Thu, May 26, 2022 at 09:04:07PM -0700, Ian Rogers wrote:
> Such a literal is useful to calculate things like the average frequency
> [1]. The TSC frequency isn't exposed by sysfs although some experimental
> drivers look to add it [2]. This change computes the value using the
> frequency in /proc/cpuinfo which is accurate at least on Intel
> processors.
> 
> v2. Adds warnings to make clear if things have changed/broken on future
>     Intel platforms. It also adds caching and an Intel specific that a
>     value is computed.
> 
> [1] https://github.com/intel/perfmon-metrics/blob/5ad9ef7056f31075e8178b9f1fb732af183b2c8d/SKX/metrics/perf/skx_metric_perf.json#L11
> [2] https://github.com/trailofbits/tsc_freq_khz

This all seems bonghits inspired... and perf actually does expose the
tsc frequency. What do you think is in perf_event_mmap_page::time_* ?

> 
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/perf/tests/expr.c | 15 +++++++++++++
>  tools/perf/util/expr.c  | 50 +++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 65 insertions(+)
> 
> diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
> index 5c0032fe93ae..45afe4f24859 100644
> --- a/tools/perf/tests/expr.c
> +++ b/tools/perf/tests/expr.c
> @@ -1,8 +1,10 @@
>  // SPDX-License-Identifier: GPL-2.0
>  #include "util/debug.h"
>  #include "util/expr.h"
> +#include "util/header.h"
>  #include "util/smt.h"
>  #include "tests.h"
> +#include <math.h>
>  #include <stdlib.h>
>  #include <string.h>
>  #include <linux/zalloc.h>
> @@ -69,6 +71,11 @@ static int test__expr(struct test_suite *t __maybe_unused, int subtest __maybe_u
>  	double val, num_cpus, num_cores, num_dies, num_packages;
>  	int ret;
>  	struct expr_parse_ctx *ctx;
> +	bool is_intel = false;
> +	char buf[128];
> +
> +	if (!get_cpuid(buf, sizeof(buf)))
> +		is_intel = strstr(buf, "Intel") != NULL;
>  
>  	TEST_ASSERT_EQUAL("ids_union", test_ids_union(), 0);
>  
> @@ -175,6 +182,14 @@ static int test__expr(struct test_suite *t __maybe_unused, int subtest __maybe_u
>  	if (num_dies) // Some platforms do not have CPU die support, for example s390
>  		TEST_ASSERT_VAL("#num_dies >= #num_packages", num_dies >= num_packages);
>  
> +	if (is_intel) {
> +		double system_tsc_freq;
> +
> +		TEST_ASSERT_VAL("#system_tsc_freq", expr__parse(&system_tsc_freq, ctx,
> +								"#system_tsc_freq") == 0);
> +		TEST_ASSERT_VAL("!isnan(#system_tsc_freq)", !isnan(system_tsc_freq));
> +	}
> +
>  	/*
>  	 * Source count returns the number of events aggregating in a leader
>  	 * event including the leader. Check parsing yields an id.
> diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
> index 675f318ce7c1..f33aeb1e6faa 100644
> --- a/tools/perf/util/expr.c
> +++ b/tools/perf/util/expr.c
> @@ -402,6 +402,51 @@ double expr_id_data__source_count(const struct expr_id_data *data)
>  	return data->val.source_count;
>  }
>  
> +/*
> + * Derive the TSC frequency in Hz from the /proc/cpuinfo, for example:
> + * ...
> + * model name      : Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz
> + * ...
> + * will return 3000000000.
> + */
> +static double system_tsc_freq(void)
> +{
> +	static double result;
> +	static bool computed;
> +	FILE *cpuinfo;
> +	char *line = NULL;
> +	size_t len = 0;
> +
> +	if (computed)
> +		return result;
> +
> +	computed = true;
> +	result = NAN;
> +	cpuinfo = fopen("/proc/cpuinfo", "r");
> +	if (!cpuinfo) {
> +		pr_err("Failed to read /proc/cpuinfo for TSC frequency");
> +		return NAN;
> +	}
> +	while (getline(&line, &len, cpuinfo) > 0) {
> +		if (!strncmp(line, "model name", 10)) {
> +			char *pos = strstr(line + 11, " @ ");
> +
> +			if (pos && sscanf(pos, " @ %lfGHz", &result) == 1) {
> +				result *= 1000000000;
> +				goto out;
> +			}
> +		}
> +	}
> +
> +out:
> +	if (isnan(result))
> +		pr_err("Failed to find TSC frequency in /proc/cpuinfo");
> +
> +	free(line);
> +	fclose(cpuinfo);
> +	return result;
> +}
> +
>  double expr__get_literal(const char *literal)
>  {
>  	static struct cpu_topology *topology;
> @@ -417,6 +462,11 @@ double expr__get_literal(const char *literal)
>  		goto out;
>  	}
>  
> +	if (!strcasecmp("#system_tsc_freq", literal)) {
> +		result = system_tsc_freq();
> +		goto out;
> +	}
> +
>  	/*
>  	 * Assume that topology strings are consistent, such as CPUs "0-1"
>  	 * wouldn't be listed as "0,1", and so after deduplication the number of
> -- 
> 2.36.1.124.g0e6072fb45-goog
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] perf metrics: Add literal for system TSC frequency
  2022-05-27  9:49 ` Peter Zijlstra
@ 2022-05-27 14:54   ` Andi Kleen
  2022-05-27 15:57     ` Ian Rogers
  2022-05-28 14:01     ` Peter Zijlstra
  0 siblings, 2 replies; 7+ messages in thread
From: Andi Kleen @ 2022-05-27 14:54 UTC (permalink / raw)
  To: Peter Zijlstra, Ian Rogers
  Cc: perry.taylor, caleb.biggers, kshipra.bopardikar, Kan Liang,
	Zhengjun Xing, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Maxime Coquelin, Alexandre Torgue, James Clark, John Garry,
	linux-kernel, linux-perf-users, Stephane Eranian,
	Thomas Gleixner


On 5/27/2022 2:49 AM, Peter Zijlstra wrote:
> On Thu, May 26, 2022 at 09:04:07PM -0700, Ian Rogers wrote:
>> Such a literal is useful to calculate things like the average frequency
>> [1]. The TSC frequency isn't exposed by sysfs although some experimental
>> drivers look to add it [2]. This change computes the value using the
>> frequency in /proc/cpuinfo which is accurate at least on Intel
>> processors.

It would be better to use CPUID if available which is far more accurate. 
Also I believe newer Intel CPUs drop the frequency in the brand string.

BTW it also has the problem that if perf script is run on some other 
system to compute metrics it won't work.

>
> This all seems bonghits inspired... and perf actually does expose the
> tsc frequency. What do you think is in perf_event_mmap_page::time_* ?


That's not really available to perf stat, which is the primary metrics user.


-Andi




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] perf metrics: Add literal for system TSC frequency
  2022-05-27 14:54   ` Andi Kleen
@ 2022-05-27 15:57     ` Ian Rogers
  2022-05-28 14:01     ` Peter Zijlstra
  1 sibling, 0 replies; 7+ messages in thread
From: Ian Rogers @ 2022-05-27 15:57 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Peter Zijlstra, perry.taylor, caleb.biggers, kshipra.bopardikar,
	Kan Liang, Zhengjun Xing, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Maxime Coquelin, Alexandre Torgue, James Clark, John Garry,
	linux-kernel, linux-perf-users, Stephane Eranian,
	Thomas Gleixner

On Fri, May 27, 2022 at 7:54 AM Andi Kleen <ak@linux.intel.com> wrote:
>
>
> On 5/27/2022 2:49 AM, Peter Zijlstra wrote:
> > On Thu, May 26, 2022 at 09:04:07PM -0700, Ian Rogers wrote:
> >> Such a literal is useful to calculate things like the average frequency
> >> [1]. The TSC frequency isn't exposed by sysfs although some experimental
> >> drivers look to add it [2]. This change computes the value using the
> >> frequency in /proc/cpuinfo which is accurate at least on Intel
> >> processors.
>
> It would be better to use CPUID if available which is far more accurate.
> Also I believe newer Intel CPUs drop the frequency in the brand string.

But on v1 you also said it was broken for certain architectures. The
point of the tests and warnings is to identify this case.

> BTW it also has the problem that if perf script is run on some other
> system to compute metrics it won't work.

Yep. We don't yet support recording then reporting metrics on a
different system. There would be a bunch to do to support this.

> >
> > This all seems bonghits inspired... and perf actually does expose the
> > tsc frequency. What do you think is in perf_event_mmap_page::time_* ?
>
>
> That's not really available to perf stat, which is the primary metrics user.

So we have 3 approaches:

1) read /proc/cpuinfo: pros: easy in comparison to other approaches.
cons: will break in the future, will need a cross system strategy
2) use cpuid: pros: can be made to work across systems cons: not
available for all Intel architectures
3) mmap: pros: more stable API than cpuinfo vendor string. cons: could
end up with new events purely to gather a TSC frequency and associated
complexity around permissions, etc.

I feel we can bike shed and go between the 3 approaches almost
endlessly. I'd rather get something in and iterate. How the approach
breaks can motivate the next steps and the tests added here will
capture when it does break, not just when a metric is first used. I
mentioned this on V1 and requested feedback, this is just implementing
what I said I'd do to V1.

Thanks,
Ian

> -Andi
>
>
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] perf metrics: Add literal for system TSC frequency
  2022-05-27 14:54   ` Andi Kleen
  2022-05-27 15:57     ` Ian Rogers
@ 2022-05-28 14:01     ` Peter Zijlstra
  2022-05-28 14:50       ` Ian Rogers
  1 sibling, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2022-05-28 14:01 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Ian Rogers, perry.taylor, caleb.biggers, kshipra.bopardikar,
	Kan Liang, Zhengjun Xing, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Maxime Coquelin, Alexandre Torgue, James Clark, John Garry,
	linux-kernel, linux-perf-users, Stephane Eranian,
	Thomas Gleixner

On Fri, May 27, 2022 at 07:54:19AM -0700, Andi Kleen wrote:

> > This all seems bonghits inspired... and perf actually does expose the
> > tsc frequency. What do you think is in perf_event_mmap_page::time_* ?
> 
> 
> That's not really available to perf stat, which is the primary metrics user.

Why not? You can mmap any perf-fd (even software events) and these
fields should be filled out.

It should work on any x86 CPU that has a TSC. The only caveat is that
the kernel must not have marked the TSC unstable.

It could even work for virt -- all you need is for virt to use
native_sched_clock() instead of the paravirt nonsense.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] perf metrics: Add literal for system TSC frequency
  2022-05-28 14:01     ` Peter Zijlstra
@ 2022-05-28 14:50       ` Ian Rogers
  2022-05-28 15:40         ` Peter Zijlstra
  0 siblings, 1 reply; 7+ messages in thread
From: Ian Rogers @ 2022-05-28 14:50 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andi Kleen, perry.taylor, caleb.biggers, kshipra.bopardikar,
	Kan Liang, Zhengjun Xing, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Maxime Coquelin, Alexandre Torgue, James Clark, John Garry,
	linux-kernel, linux-perf-users, Stephane Eranian,
	Thomas Gleixner

On Sat, May 28, 2022 at 7:02 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Fri, May 27, 2022 at 07:54:19AM -0700, Andi Kleen wrote:
>
> > > This all seems bonghits inspired... and perf actually does expose the
> > > tsc frequency. What do you think is in perf_event_mmap_page::time_* ?
> >
> >
> > That's not really available to perf stat, which is the primary metrics user.
>
> Why not? You can mmap any perf-fd (even software events) and these
> fields should be filled out.
>
> It should work on any x86 CPU that has a TSC. The only caveat is that
> the kernel must not have marked the TSC unstable.
>
> It could even work for virt -- all you need is for virt to use
> native_sched_clock() instead of the paravirt nonsense.

It will at least fail if inherit is enabled, no?

Thanks,
Ian

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] perf metrics: Add literal for system TSC frequency
  2022-05-28 14:50       ` Ian Rogers
@ 2022-05-28 15:40         ` Peter Zijlstra
  0 siblings, 0 replies; 7+ messages in thread
From: Peter Zijlstra @ 2022-05-28 15:40 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Andi Kleen, perry.taylor, caleb.biggers, kshipra.bopardikar,
	Kan Liang, Zhengjun Xing, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Maxime Coquelin, Alexandre Torgue, James Clark, John Garry,
	linux-kernel, linux-perf-users, Stephane Eranian,
	Thomas Gleixner

On Sat, May 28, 2022 at 07:50:40AM -0700, Ian Rogers wrote:
> On Sat, May 28, 2022 at 7:02 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Fri, May 27, 2022 at 07:54:19AM -0700, Andi Kleen wrote:
> >
> > > > This all seems bonghits inspired... and perf actually does expose the
> > > > tsc frequency. What do you think is in perf_event_mmap_page::time_* ?
> > >
> > >
> > > That's not really available to perf stat, which is the primary metrics user.
> >
> > Why not? You can mmap any perf-fd (even software events) and these
> > fields should be filled out.
> >
> > It should work on any x86 CPU that has a TSC. The only caveat is that
> > the kernel must not have marked the TSC unstable.
> >
> > It could even work for virt -- all you need is for virt to use
> > native_sched_clock() instead of the paravirt nonsense.
> 
> It will at least fail if inherit is enabled, no?

For per-task events, yes, I suppose it will. I'd forgotten about that
restriction on perf_mmap() :/

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-05-28 15:40 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-27  4:04 [PATCH v2] perf metrics: Add literal for system TSC frequency Ian Rogers
2022-05-27  9:49 ` Peter Zijlstra
2022-05-27 14:54   ` Andi Kleen
2022-05-27 15:57     ` Ian Rogers
2022-05-28 14:01     ` Peter Zijlstra
2022-05-28 14:50       ` Ian Rogers
2022-05-28 15:40         ` Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).