linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] libperf: Add support for scaling counters obtained from the read() system call during multiplexing
@ 2021-08-20  9:39 Shunsuke Nakamura
  2021-08-20  9:39 ` [PATCH 1/3] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing Shunsuke Nakamura
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Shunsuke Nakamura @ 2021-08-20  9:39 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa, namhyung
  Cc: linux-kernel, linux-perf-users

This patch series supports counter scaling when perf_evsel__read() obtains a counter
using the read() system call during multiplexing.

The first patch adds scaling of counters obtained from the read() system call
during multiplexing.

The second patch fixes verbose printing.

The third patch adds a test for the first patch.
This patch is based on Vince's rdpmc_multiplexing.c [1]


[1] https://github.com/deater/perf_event_tests/blob/master/tests/rdpmc/rdpmc_multiplexing.c


Shunsuke Nakamura (3):
  libperf: Add processing to scale the counters obtained during the read
    () system call when multiplexing
  libperf tests: Fix verbose printing
  libperf tests: Add test_stat_multiplexing test

 tools/lib/perf/evsel.c                  |   4 +
 tools/lib/perf/include/internal/tests.h |   2 +
 tools/lib/perf/tests/test-evlist.c      | 138 ++++++++++++++++++++++++
 3 files changed, 144 insertions(+)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/3] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing
  2021-08-20  9:39 [PATCH 0/3] libperf: Add support for scaling counters obtained from the read() system call during multiplexing Shunsuke Nakamura
@ 2021-08-20  9:39 ` Shunsuke Nakamura
  2021-08-23 20:12   ` Rob Herring
  2021-08-20  9:39 ` [PATCH 2/3] libperf tests: Fix verbose printing Shunsuke Nakamura
  2021-08-20  9:39 ` [PATCH 3/3] libperf tests: Add test_stat_multiplexing test Shunsuke Nakamura
  2 siblings, 1 reply; 12+ messages in thread
From: Shunsuke Nakamura @ 2021-08-20  9:39 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa, namhyung
  Cc: linux-kernel, linux-perf-users

perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but
does not scale counters obtained by read() system call.

Add processing to perf_evsel__read() to scale the counters obtained during the
read() system call when multiplexing.

Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
---
 tools/lib/perf/evsel.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
index d8886720e83d..005cf64a1ad7 100644
--- a/tools/lib/perf/evsel.c
+++ b/tools/lib/perf/evsel.c
@@ -18,6 +18,7 @@
 #include <sys/ioctl.h>
 #include <sys/mman.h>
 #include <asm/bug.h>
+#include <linux/math64.h>
 
 void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr,
 		      int idx)
@@ -308,6 +309,9 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
 	if (readn(FD(evsel, cpu, thread), count->values, size) <= 0)
 		return -errno;
 
+	if (count->ena != count->run)
+		count->val = mul_u64_u64_div64(count->val, count->ena, count->run);
+
 	return 0;
 }
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 2/3] libperf tests: Fix verbose printing
  2021-08-20  9:39 [PATCH 0/3] libperf: Add support for scaling counters obtained from the read() system call during multiplexing Shunsuke Nakamura
  2021-08-20  9:39 ` [PATCH 1/3] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing Shunsuke Nakamura
@ 2021-08-20  9:39 ` Shunsuke Nakamura
  2021-08-23 20:26   ` Rob Herring
  2021-08-20  9:39 ` [PATCH 3/3] libperf tests: Add test_stat_multiplexing test Shunsuke Nakamura
  2 siblings, 1 reply; 12+ messages in thread
From: Shunsuke Nakamura @ 2021-08-20  9:39 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa, namhyung
  Cc: linux-kernel, linux-perf-users

libperf's verbose printing checks the -v option every time the macro _T_ START
is called.

Since there are currently four libperf tests registered, the macro _T_ START is
called four times, but verbose printing after the second time is not output.

Resets the index of the element processed by getopt() and fix verbose printing
so that it prints in all tests.

Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
---
 tools/lib/perf/include/internal/tests.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/lib/perf/include/internal/tests.h b/tools/lib/perf/include/internal/tests.h
index 61052099225b..b130a6663ff8 100644
--- a/tools/lib/perf/include/internal/tests.h
+++ b/tools/lib/perf/include/internal/tests.h
@@ -23,6 +23,8 @@ static inline int get_verbose(char **argv, int argc)
 			break;
 		}
 	}
+	optind = 1;
+
 	return verbose;
 }
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 3/3] libperf tests: Add test_stat_multiplexing test
  2021-08-20  9:39 [PATCH 0/3] libperf: Add support for scaling counters obtained from the read() system call during multiplexing Shunsuke Nakamura
  2021-08-20  9:39 ` [PATCH 1/3] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing Shunsuke Nakamura
  2021-08-20  9:39 ` [PATCH 2/3] libperf tests: Fix verbose printing Shunsuke Nakamura
@ 2021-08-20  9:39 ` Shunsuke Nakamura
  2 siblings, 0 replies; 12+ messages in thread
From: Shunsuke Nakamura @ 2021-08-20  9:39 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa, namhyung
  Cc: linux-kernel, linux-perf-users

Adds a test for a counter obtained using read() system call during multiplexing

Committer testing:

  $ sudo make tests V=1 -C tools/lib/perf/
  make: Entering directory '/home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/lib/perf'
  make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=. obj=libperf
  make -C /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/lib/api/ O= libapi.a
  make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=./fd obj=libapi
  make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=./fs obj=libapi
  make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=. obj=tests
  make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=./tests obj=tests
  running static:
  - running tests/test-cpumap.c...OK
  - running tests/test-threadmap.c...OK
  - running tests/test-evlist.c...
  	count = 503074578, run = 261416566, enable = 435415718
  	count = 503295562, run = 261411675, enable = 435412316
  	count = 501603369, run = 261421312, enable = 435408854
  	count = 501245546, run = 261419457, enable = 435405029
  	count = 501849603, run = 261415041, enable = 435400600
  	count = 500801298, run = 261410125, enable = 435395597
  	count = 502116997, run = 261401960, enable = 435389539
  	count = 501797294, run = 261397791, enable = 435382777
  	count = 501249740, run = 261377402, enable = 435374884
  	count = 501754502, run = 260985031, enable = 435366109
  	count = 501659788, run = 260985466, enable = 435354970
  	count = 503670502, run = 260985172, enable = 435345076
  	count = 503209138, run = 260987272, enable = 435335863
  	count = 502772845, run = 260985720, enable = 435314163
  	count = 504045922, run = 260985324, enable = 435303814
     Expected: 502005425
     High: 504045922   Low:  500801298   Average:  502276445
     Average Error = 0.05%
  OK
  - running tests/test-evsel.c...
  	loop = 65536, count = 333260
  	loop = 131072, count = 655861
  	loop = 262144, count = 1315513
  	loop = 524288, count = 2633730
  	loop = 1048576, count = 5270513
  	loop = 65536, count = 407118
  	loop = 131072, count = 786947
  	loop = 262144, count = 1859532
  	loop = 524288, count = 3849286
  	loop = 1048576, count = 7310477
  OK
  running dynamic:
  - running tests/test-cpumap.c...OK
  - running tests/test-threadmap.c...OK
  - running tests/test-evlist.c...
  	count = 502071859, run = 275961923, enable = 461897932
  	count = 501339557, run = 275959274, enable = 461895615
  	count = 501706307, run = 275912910, enable = 461892367
  	count = 501877502, run = 275906600, enable = 461888804
  	count = 501773043, run = 275905934, enable = 461884843
  	count = 502724983, run = 275884848, enable = 461880027
  	count = 502436857, run = 276864452, enable = 461874272
  	count = 502598147, run = 277873765, enable = 461867901
  	count = 502601952, run = 278872100, enable = 461860271
  	count = 502356640, run = 278933007, enable = 461851421
  	count = 503474674, run = 278930678, enable = 461840342
  	count = 503365623, run = 278976959, enable = 461830886
  	count = 503307062, run = 278004451, enable = 461821700
  	count = 502845553, run = 276988055, enable = 461812741
  	count = 501627390, run = 275995291, enable = 461803691
     Expected: 501940978
     High: 503474674   Low:  501339557   Average:  502407143
     Average Error = 0.09%
  OK
  - running tests/test-evsel.c...
  	loop = 65536, count = 328182
  	loop = 131072, count = 661219
  	loop = 262144, count = 1316712
  	loop = 524288, count = 2641030
  	loop = 1048576, count = 5267395
  	loop = 65536, count = 393675
  	loop = 131072, count = 842152
  	loop = 262144, count = 1664160
  	loop = 524288, count = 3421570
  	loop = 1048576, count = 6856783
  OK
  make: Leaving directory '/home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/lib/perf'

Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
---
 tools/lib/perf/tests/test-evlist.c | 138 +++++++++++++++++++++++++++++
 1 file changed, 138 insertions(+)

diff --git a/tools/lib/perf/tests/test-evlist.c b/tools/lib/perf/tests/test-evlist.c
index c67c83399170..c7184d8b6ce9 100644
--- a/tools/lib/perf/tests/test-evlist.c
+++ b/tools/lib/perf/tests/test-evlist.c
@@ -21,6 +21,9 @@
 #include "tests.h"
 #include <internal/evsel.h>
 
+#define EVENT_NUM 15
+#define WAIT_COUNT 100000000UL
+
 static int libperf_print(enum libperf_print_level level,
 			 const char *fmt, va_list ap)
 {
@@ -413,6 +416,140 @@ static int test_mmap_cpus(void)
 	return 0;
 }
 
+static double display_error(long long average,
+		     long long high,
+		     long long low,
+		     long long expected) {
+
+	double error;
+
+	error = (((double)average - expected) / expected) * 100.0;
+
+	__T_VERBOSE("   Expected: %lld\n", expected);
+	__T_VERBOSE("   High: %lld   Low:  %lld   Average:  %lld\n",
+			high, low, average);
+
+	__T_VERBOSE("   Average Error = %.2f%%\n",error);
+
+	return error;
+
+}
+
+static int test_stat_multiplexing(void)
+{
+	struct perf_counts_values expected_counts = { .val = 0 };
+	struct perf_counts_values multi_counts[EVENT_NUM] = {{ .val = 0 },};
+	struct perf_thread_map *threads;
+	struct perf_evlist *evlist;
+	struct perf_evsel *evsel;
+	struct perf_event_attr attr = {
+		.type	     = PERF_TYPE_HARDWARE,
+		.config	     = PERF_COUNT_HW_INSTRUCTIONS,
+		.read_format = PERF_FORMAT_TOTAL_TIME_ENABLED |
+			       PERF_FORMAT_TOTAL_TIME_RUNNING,
+		.disabled    = 1,
+	};
+	int err, i, nonzero=0;
+	unsigned long count;
+	long long max = 0, min = 0, avg = 0;
+	double error = 0.0;
+
+	/* read for non-multiplexing event count */
+	threads = perf_thread_map__new_dummy();
+	__T("failed to create threads", threads);
+
+	perf_thread_map__set_pid(threads, 0, 0);
+
+	evsel = perf_evsel__new(&attr);
+	__T("failed to create evsel", evsel);
+
+	err = perf_evsel__open(evsel, NULL, threads);
+	__T("failed to open evsel", err == 0);
+
+	err = perf_evsel__enable(evsel);
+	__T("failed to enable evsel", err == 0);
+
+	count = WAIT_COUNT;
+	while(count--);
+
+	perf_evsel__read(evsel, 0, 0, &expected_counts);
+	__T("failed to read value for evsel", expected_counts.val != 0);
+	__T("failed to read non-multiplexing event count",
+	    expected_counts.ena == expected_counts.run);
+
+	err = perf_evsel__disable(evsel);
+	__T("failed to enable evsel", err == 0);
+
+	perf_evsel__close(evsel);
+	perf_evsel__delete(evsel);
+
+	perf_thread_map__put(threads);
+
+
+	/* read for multiplexing event count */
+	threads = perf_thread_map__new_dummy();
+	__T("failed to create threads", threads);
+
+	perf_thread_map__set_pid(threads, 0, 0);
+
+	evlist = perf_evlist__new();
+	__T("failed to create evlist", evlist);
+
+	for (i = 0; i < EVENT_NUM; i++) {
+		evsel = perf_evsel__new(&attr);
+		__T("failed to create evsel1", evsel);
+
+		perf_evlist__add(evlist, evsel);
+	}
+	perf_evlist__set_maps(evlist, NULL, threads);
+
+	err = perf_evlist__open(evlist);
+	__T("failed to open evsel", err == 0);
+
+	perf_evlist__enable(evlist);
+
+	count = WAIT_COUNT;
+	while (count--);
+
+	i = 0;
+	perf_evlist__for_each_evsel(evlist, evsel) {
+		perf_evsel__read(evsel, 0, 0, &multi_counts[i]);
+		__T("failed to read value for evsel", multi_counts[i].val != 0);
+		i++;
+	}
+
+	perf_evlist__disable(evlist);
+
+
+	min = multi_counts[0].val;
+	for (i = 0; i < EVENT_NUM; i++) {
+		__T_VERBOSE("\tcount = %lu, run = %lu, enable = %lu\n",
+			    multi_counts[i].val, multi_counts[i].run, multi_counts[i].ena);
+
+		if (multi_counts[i].val > max)
+			max = multi_counts[i].val;
+
+		if (multi_counts[i].val < min)
+			min = multi_counts[i].val;
+
+		avg += multi_counts[i].val;
+
+		if (multi_counts[i].val != 0)
+			nonzero++;
+	}
+	avg = avg / nonzero;
+
+	error = display_error(avg, max, min, expected_counts.val);
+
+	__T("Error out of range!", ((error <= 1.0) && (error >= -1.0)));
+
+	perf_evlist__close(evlist);
+	perf_evlist__delete(evlist);
+
+	perf_thread_map__put(threads);
+	return 0;
+}
+
 int test_evlist(int argc, char **argv)
 {
 	__T_START;
@@ -424,6 +561,7 @@ int test_evlist(int argc, char **argv)
 	test_stat_thread_enable();
 	test_mmap_thread();
 	test_mmap_cpus();
+	test_stat_multiplexing();
 
 	__T_END;
 	return tests_failed == 0 ? 0 : -1;
-- 
2.25.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing
  2021-08-20  9:39 ` [PATCH 1/3] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing Shunsuke Nakamura
@ 2021-08-23 20:12   ` Rob Herring
  2021-08-24 10:11     ` nakamura.shun
  0 siblings, 1 reply; 12+ messages in thread
From: Rob Herring @ 2021-08-23 20:12 UTC (permalink / raw)
  To: Shunsuke Nakamura
  Cc: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, linux-kernel, linux-perf-users

On Fri, Aug 20, 2021 at 06:39:06PM +0900, Shunsuke Nakamura wrote:
> perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but
> does not scale counters obtained by read() system call.
> 
> Add processing to perf_evsel__read() to scale the counters obtained during the
> read() system call when multiplexing.

Which one is right though? Changing what read() returns could break 
users, right? Or are you implying that the RDPMC path is correct and 
read() was not. More likely the former case since I wrote the latter.

> 
> Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
> ---
>  tools/lib/perf/evsel.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
> index d8886720e83d..005cf64a1ad7 100644
> --- a/tools/lib/perf/evsel.c
> +++ b/tools/lib/perf/evsel.c
> @@ -18,6 +18,7 @@
>  #include <sys/ioctl.h>
>  #include <sys/mman.h>
>  #include <asm/bug.h>
> +#include <linux/math64.h>
>  
>  void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr,
>  		      int idx)
> @@ -308,6 +309,9 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
>  	if (readn(FD(evsel, cpu, thread), count->values, size) <= 0)
>  		return -errno;
>  
> +	if (count->ena != count->run)
> +		count->val = mul_u64_u64_div64(count->val, count->ena, count->run);
> +
>  	return 0;
>  }
>  
> -- 
> 2.25.1
> 
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] libperf tests: Fix verbose printing
  2021-08-20  9:39 ` [PATCH 2/3] libperf tests: Fix verbose printing Shunsuke Nakamura
@ 2021-08-23 20:26   ` Rob Herring
  2021-08-24 18:06     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 12+ messages in thread
From: Rob Herring @ 2021-08-23 20:26 UTC (permalink / raw)
  To: Shunsuke Nakamura
  Cc: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, linux-kernel, linux-perf-users

On Fri, Aug 20, 2021 at 06:39:07PM +0900, Shunsuke Nakamura wrote:
> libperf's verbose printing checks the -v option every time the macro _T_ START

__T_START

> is called.
> 
> Since there are currently four libperf tests registered, the macro _T_ START is
> called four times, but verbose printing after the second time is not output.
> 
> Resets the index of the element processed by getopt() and fix verbose printing
> so that it prints in all tests.
> 
> Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
> ---
>  tools/lib/perf/include/internal/tests.h | 2 ++
>  1 file changed, 2 insertions(+)

Acked-by: Rob Herring <robh@kernel.org>

> 
> diff --git a/tools/lib/perf/include/internal/tests.h b/tools/lib/perf/include/internal/tests.h
> index 61052099225b..b130a6663ff8 100644
> --- a/tools/lib/perf/include/internal/tests.h
> +++ b/tools/lib/perf/include/internal/tests.h
> @@ -23,6 +23,8 @@ static inline int get_verbose(char **argv, int argc)
>  			break;
>  		}
>  	}
> +	optind = 1;
> +
>  	return verbose;
>  }
>  
> -- 
> 2.25.1
> 
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing
  2021-08-23 20:12   ` Rob Herring
@ 2021-08-24 10:11     ` nakamura.shun
  2021-08-31  8:58       ` nakamura.shun
  2021-08-31 12:26       ` Rob Herring
  0 siblings, 2 replies; 12+ messages in thread
From: nakamura.shun @ 2021-08-24 10:11 UTC (permalink / raw)
  To: Rob Herring
  Cc: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, linux-kernel, linux-perf-users

Hi, Rob

> On Fri, Aug 20, 2021 at 06:39:06PM +0900, Shunsuke Nakamura wrote:
> > perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but
> > does not scale counters obtained by read() system call.
> > 
> > Add processing to perf_evsel__read() to scale the counters obtained during the
> > read() system call when multiplexing.
> 
> Which one is right though? Changing what read() returns could break 
> users, right? Or are you implying that the RDPMC path is correct and 
> read() was not. More likely the former case since I wrote the latter.

perf_evsel__read() returns both the count obtained by RDPMC and the count obtained
by the read() system call when multiplexed with RDPMC enabled.

That is, there is a mix of scaled and unscaled values.

As Rob says, when this patch is applied, rescaling the count obtained from
perf_evsel__read() during multiplexing will break the count.

I think the easiest solution is to change the value you get from RDPMC to not scale 
and let the user scale it, but I thought it would be a little inconvenient.

Best Regards
Shunsuke

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] libperf tests: Fix verbose printing
  2021-08-23 20:26   ` Rob Herring
@ 2021-08-24 18:06     ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-08-24 18:06 UTC (permalink / raw)
  To: Rob Herring
  Cc: Shunsuke Nakamura, peterz, mingo, mark.rutland,
	alexander.shishkin, jolsa, namhyung, linux-kernel,
	linux-perf-users

Em Mon, Aug 23, 2021 at 03:26:42PM -0500, Rob Herring escreveu:
> On Fri, Aug 20, 2021 at 06:39:07PM +0900, Shunsuke Nakamura wrote:
> > libperf's verbose printing checks the -v option every time the macro _T_ START
> 
> __T_START
> 
> > is called.
> > 
> > Since there are currently four libperf tests registered, the macro _T_ START is
> > called four times, but verbose printing after the second time is not output.
> > 
> > Resets the index of the element processed by getopt() and fix verbose printing
> > so that it prints in all tests.
> > 
> > Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
> > ---
> >  tools/lib/perf/include/internal/tests.h | 2 ++
> >  1 file changed, 2 insertions(+)
> 
> Acked-by: Rob Herring <robh@kernel.org>

Thanks, applied.

Waiting for the conclusion on the discussion for the other two patches.

- Arnaldo


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing
  2021-08-24 10:11     ` nakamura.shun
@ 2021-08-31  8:58       ` nakamura.shun
  2021-08-31 12:26       ` Rob Herring
  1 sibling, 0 replies; 12+ messages in thread
From: nakamura.shun @ 2021-08-31  8:58 UTC (permalink / raw)
  To: Rob Herring
  Cc: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, linux-kernel, linux-perf-users

Hi, Rob

> > On Fri, Aug 20, 2021 at 06:39:06PM +0900, Shunsuke Nakamura wrote:
> > > perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but
> > > does not scale counters obtained by read() system call.
> > > 
> > > Add processing to perf_evsel__read() to scale the counters obtained during the
> > > read() system call when multiplexing.
> > 
> > Which one is right though? Changing what read() returns could break 
> > users, right? Or are you implying that the RDPMC path is correct and 
> > read() was not. More likely the former case since I wrote the latter.
> 
> perf_evsel__read() returns both the count obtained by RDPMC and the count obtained
> by the read() system call when multiplexed with RDPMC enabled.
> 
> That is, there is a mix of scaled and unscaled values.
> 
> As Rob says, when this patch is applied, rescaling the count obtained from
> perf_evsel__read() during multiplexing will break the count.
> 
> I think the easiest solution is to change the value you get from RDPMC to not scale 
> and let the user scale it, but I thought it would be a little inconvenient.

Any comments?

Best Regards
Shunsuke

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing
  2021-08-24 10:11     ` nakamura.shun
  2021-08-31  8:58       ` nakamura.shun
@ 2021-08-31 12:26       ` Rob Herring
  2021-09-07 23:59         ` Ian Rogers
  1 sibling, 1 reply; 12+ messages in thread
From: Rob Herring @ 2021-08-31 12:26 UTC (permalink / raw)
  To: nakamura.shun
  Cc: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, linux-kernel, linux-perf-users

On Tue, Aug 24, 2021 at 5:12 AM nakamura.shun@fujitsu.com
<nakamura.shun@fujitsu.com> wrote:
>
> Hi, Rob
>
> > On Fri, Aug 20, 2021 at 06:39:06PM +0900, Shunsuke Nakamura wrote:
> > > perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but
> > > does not scale counters obtained by read() system call.
> > >
> > > Add processing to perf_evsel__read() to scale the counters obtained during the
> > > read() system call when multiplexing.
> >
> > Which one is right though? Changing what read() returns could break
> > users, right? Or are you implying that the RDPMC path is correct and
> > read() was not. More likely the former case since I wrote the latter.
>
> perf_evsel__read() returns both the count obtained by RDPMC and the count obtained
> by the read() system call when multiplexed with RDPMC enabled.
>
> That is, there is a mix of scaled and unscaled values.
>
> As Rob says, when this patch is applied, rescaling the count obtained from
> perf_evsel__read() during multiplexing will break the count.
>
> I think the easiest solution is to change the value you get from RDPMC to not scale
> and let the user scale it, but I thought it would be a little inconvenient.

Agreed, unless someone else has an opinion. It would be good to do the
scaling in libperf with the optimized math op, but I assume there's
some reason the user may need unscaled values?

Rob

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing
  2021-08-31 12:26       ` Rob Herring
@ 2021-09-07 23:59         ` Ian Rogers
  2021-09-17  8:04           ` nakamura.shun
  0 siblings, 1 reply; 12+ messages in thread
From: Ian Rogers @ 2021-09-07 23:59 UTC (permalink / raw)
  To: Rob Herring
  Cc: nakamura.shun, peterz, mingo, acme, mark.rutland,
	alexander.shishkin, jolsa, namhyung, linux-kernel,
	linux-perf-users

On Tue, Aug 31, 2021 at 5:26 AM Rob Herring <robh@kernel.org> wrote:
>
> On Tue, Aug 24, 2021 at 5:12 AM nakamura.shun@fujitsu.com
> <nakamura.shun@fujitsu.com> wrote:
> >
> > Hi, Rob
> >
> > > On Fri, Aug 20, 2021 at 06:39:06PM +0900, Shunsuke Nakamura wrote:
> > > > perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but
> > > > does not scale counters obtained by read() system call.
> > > >
> > > > Add processing to perf_evsel__read() to scale the counters obtained during the
> > > > read() system call when multiplexing.
> > >
> > > Which one is right though? Changing what read() returns could break
> > > users, right? Or are you implying that the RDPMC path is correct and
> > > read() was not. More likely the former case since I wrote the latter.
> >
> > perf_evsel__read() returns both the count obtained by RDPMC and the count obtained
> > by the read() system call when multiplexed with RDPMC enabled.
> >
> > That is, there is a mix of scaled and unscaled values.
> >
> > As Rob says, when this patch is applied, rescaling the count obtained from
> > perf_evsel__read() during multiplexing will break the count.
> >
> > I think the easiest solution is to change the value you get from RDPMC to not scale
> > and let the user scale it, but I thought it would be a little inconvenient.
>
> Agreed, unless someone else has an opinion. It would be good to do the
> scaling in libperf with the optimized math op, but I assume there's
> some reason the user may need unscaled values?

Hi, something I've mentioned on other threads [1] is that running may
be zero due to multiplexing but enabled be greater. This can lead to a
divide by zero when scaling. Giving the ratio to the caller gives more
information - I may be misunderstanding this thread, apologies if so.

Thanks,
Ian

[1] https://lore.kernel.org/lkml/CAL_JsqKc_qFA59L9e-xXOhE4yBTdVg-Ea9EddimWwqj3XXxhGw@mail.gmail.com/

> Rob

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing
  2021-09-07 23:59         ` Ian Rogers
@ 2021-09-17  8:04           ` nakamura.shun
  0 siblings, 0 replies; 12+ messages in thread
From: nakamura.shun @ 2021-09-17  8:04 UTC (permalink / raw)
  To: Ian Rogers
  Cc: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, Rob Herring, linux-kernel, linux-perf-users

Hi, Ian

> > On Tue, Aug 24, 2021 at 5:12 AM nakamura.shun@fujitsu.com
> > <nakamura.shun@fujitsu.com> wrote:
> > >
> > > Hi, Rob
> > >
> > > > On Fri, Aug 20, 2021 at 06:39:06PM +0900, Shunsuke Nakamura wrote:
> > > > > perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but
> > > > > does not scale counters obtained by read() system call.
> > > > >
> > > > > Add processing to perf_evsel__read() to scale the counters obtained during the
> > > > > read() system call when multiplexing.
> > > >
> > > > Which one is right though? Changing what read() returns could break
> > > > users, right? Or are you implying that the RDPMC path is correct and
> > > > read() was not. More likely the former case since I wrote the latter.
> > >
> > > perf_evsel__read() returns both the count obtained by RDPMC and the count obtained
> > > by the read() system call when multiplexed with RDPMC enabled.
> > >
> > > That is, there is a mix of scaled and unscaled values.
> > >
> > > As Rob says, when this patch is applied, rescaling the count obtained from
> > > perf_evsel__read() during multiplexing will break the count.
> > >
> > > I think the easiest solution is to change the value you get from RDPMC to not scale
> > > and let the user scale it, but I thought it would be a little inconvenient.
> >
> > Agreed, unless someone else has an opinion. It would be good to do the
> > scaling in libperf with the optimized math op, but I assume there's
> > some reason the user may need unscaled values?
> 
> Hi, something I've mentioned on other threads [1] is that running may
> be zero due to multiplexing but enabled be greater. 

Thanks for your comment.
I'll fix it.

> This can lead to a divide by zero when scaling. Giving the ratio to the caller
> gives more information - I may be misunderstanding this thread, apologies if so.

The perf_counts_values contains enabled and running.
So, caller can calculate the ratio.

Best Regards
Shunsuke

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2021-09-17  8:09 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-20  9:39 [PATCH 0/3] libperf: Add support for scaling counters obtained from the read() system call during multiplexing Shunsuke Nakamura
2021-08-20  9:39 ` [PATCH 1/3] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing Shunsuke Nakamura
2021-08-23 20:12   ` Rob Herring
2021-08-24 10:11     ` nakamura.shun
2021-08-31  8:58       ` nakamura.shun
2021-08-31 12:26       ` Rob Herring
2021-09-07 23:59         ` Ian Rogers
2021-09-17  8:04           ` nakamura.shun
2021-08-20  9:39 ` [PATCH 2/3] libperf tests: Fix verbose printing Shunsuke Nakamura
2021-08-23 20:26   ` Rob Herring
2021-08-24 18:06     ` Arnaldo Carvalho de Melo
2021-08-20  9:39 ` [PATCH 3/3] libperf tests: Add test_stat_multiplexing test Shunsuke Nakamura

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).