* [PATCH v5 0/3] libperf: Unify scaling of counters obtained from perf_evsel__read()
@ 2021-12-07  8:22 Shunsuke
  2021-12-07  8:22 ` [PATCH v5 1/3] libperf: Move perf_counts_values__scale to tools/lib/perf Shunsuke
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Shunsuke @ 2021-12-07  8:22 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, robh
  Cc: linux-kernel, linux-perf-users

This patch series makes perf_evsel__read() consistently return
unscaled counter values.
Callers that want scaled values apply perf_counts_values__scale(),
which this series moves from tools/perf/util into libperf.

The first patch moves perf_counts_values__scale from tools/perf/util
to tools/lib/perf so that it can be used with libperf.

The second patch removes the scaling process from
perf_mmap__read_self().

The third patch adds a test that verifies that counters are scaled
correctly when events are multiplexed.
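
As a rough illustration of the caller-side scaling described above, here is
a standalone sketch. It does not link against libperf: struct counts_sketch
and scale_sketch() are hypothetical stand-ins that only mirror the logic of
perf_counts_values__scale() as it appears in patch 1/3.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct perf_counts_values (only the fields used here). */
struct counts_sketch {
	uint64_t val;	/* raw counter value */
	uint64_t ena;	/* time the event was enabled */
	uint64_t run;	/* time the event was actually counting */
};

/*
 * Mirrors the logic of perf_counts_values__scale() from patch 1/3.
 * Returns -1 if the event never ran, 1 if the value was scaled,
 * and 0 if the value was left untouched.
 */
static int scale_sketch(struct counts_sketch *c, bool scale)
{
	if (!scale)
		return 0;

	if (c->run == 0) {
		/* Never scheduled: there is nothing to extrapolate from. */
		c->val = 0;
		return -1;
	}

	if (c->run < c->ena) {
		/* Multiplexed: extrapolate to the full enabled time. */
		c->val = (uint64_t)((double)c->val * c->ena / c->run);
		return 1;
	}

	return 0;
}
```

For example, a raw value of 300 with ena = 1000 and run = 500 is scaled to
600, while the same value with run == ena is returned unchanged.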

---
Previous version at:
https://lore.kernel.org/linux-perf-users/20211129090627.592149-1-nakamura.shun@fujitsu.com/

Changes in v5:
 - Update tools/lib/perf/Documentation/libperf.txt

Changes in v4:
 - Modify type s8 to type __s8

Changes in v3:
 - Move scaling process from tools/perf/util to tools/lib/perf
 - Remove scaling process from perf_mmap__read_self()
 - Remove test to verify that no division by zero occurs

Changes in v2:
 - Fix not to divide by zero when counter scaling
 - Add test to verify that no division by zero occurs


[1] https://github.com/deater/perf_event_tests/blob/master/tests/rdpmc/rdpmc_multiplexing.c


Shunsuke Nakamura (3):
  libperf: Move perf_counts_values__scale to tools/lib/perf
  libperf: Remove scaling process from perf_mmap__read_self()
  libperf tests: Add test_stat_multiplexing test

 tools/lib/perf/Documentation/libperf.txt |   2 +
 tools/lib/perf/evsel.c                   |  19 +++
 tools/lib/perf/include/perf/evsel.h      |   4 +
 tools/lib/perf/libperf.map               |   1 +
 tools/lib/perf/mmap.c                    |   2 -
 tools/lib/perf/tests/test-evlist.c       | 157 +++++++++++++++++++++++
 tools/perf/util/evsel.c                  |  19 ---
 tools/perf/util/evsel.h                  |   3 -
 8 files changed, 183 insertions(+), 24 deletions(-)

-- 
2.25.1



* [PATCH v5 1/3] libperf: Move perf_counts_values__scale to tools/lib/perf
  2021-12-07  8:22 [PATCH v5 0/3] libperf: Unify scaling of counters obtained from perf_evsel__read() Shunsuke
@ 2021-12-07  8:22 ` Shunsuke
  2021-12-07  8:22 ` [PATCH v5 2/3] libperf: Remove scaling process from perf_mmap__read_self() Shunsuke
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Shunsuke @ 2021-12-07  8:22 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, robh
  Cc: linux-kernel, linux-perf-users, Shunsuke Nakamura

From: Shunsuke Nakamura <nakamura.shun@fujitsu.com>

Move perf_counts_values__scale from tools/perf/util to tools/lib/perf
so that it can be used with libperf.

Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
---
 tools/lib/perf/Documentation/libperf.txt |  2 ++
 tools/lib/perf/evsel.c                   | 19 +++++++++++++++++++
 tools/lib/perf/include/perf/evsel.h      |  4 ++++
 tools/lib/perf/libperf.map               |  1 +
 tools/perf/util/evsel.c                  | 19 -------------------
 tools/perf/util/evsel.h                  |  3 ---
 6 files changed, 26 insertions(+), 22 deletions(-)

diff --git a/tools/lib/perf/Documentation/libperf.txt b/tools/lib/perf/Documentation/libperf.txt
index 63ae5e0195ce..dfda92e0f0a0 100644
--- a/tools/lib/perf/Documentation/libperf.txt
+++ b/tools/lib/perf/Documentation/libperf.txt
@@ -141,6 +141,8 @@ SYNOPSIS
   void *perf_evsel__mmap_base(struct perf_evsel *evsel, int cpu, int thread);
   int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
                        struct perf_counts_values *count);
+  void perf_counts_values__scale(struct perf_counts_values *count,
+                                 bool scale, __s8 *pscaled);
   int perf_evsel__enable(struct perf_evsel *evsel);
   int perf_evsel__enable_cpu(struct perf_evsel *evsel, int cpu);
   int perf_evsel__disable(struct perf_evsel *evsel);
diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
index 8441e3e1aaac..782d1466df1f 100644
--- a/tools/lib/perf/evsel.c
+++ b/tools/lib/perf/evsel.c
@@ -431,3 +431,22 @@ void perf_evsel__free_id(struct perf_evsel *evsel)
 	zfree(&evsel->id);
 	evsel->ids = 0;
 }
+
+void perf_counts_values__scale(struct perf_counts_values *count,
+			       bool scale, __s8 *pscaled)
+{
+	__s8 scaled = 0;
+
+	if (scale) {
+		if (count->run == 0) {
+			scaled = -1;
+			count->val = 0;
+		} else if (count->run < count->ena) {
+			scaled = 1;
+			count->val = (u64)((double)count->val * count->ena / count->run);
+		}
+	}
+
+	if (pscaled)
+		*pscaled = scaled;
+}
diff --git a/tools/lib/perf/include/perf/evsel.h b/tools/lib/perf/include/perf/evsel.h
index 60eae25076d3..f401c7484bec 100644
--- a/tools/lib/perf/include/perf/evsel.h
+++ b/tools/lib/perf/include/perf/evsel.h
@@ -4,6 +4,8 @@
 
 #include <stdint.h>
 #include <perf/core.h>
+#include <stdbool.h>
+#include <linux/types.h>
 
 struct perf_evsel;
 struct perf_event_attr;
@@ -39,5 +41,7 @@ LIBPERF_API int perf_evsel__disable_cpu(struct perf_evsel *evsel, int cpu);
 LIBPERF_API struct perf_cpu_map *perf_evsel__cpus(struct perf_evsel *evsel);
 LIBPERF_API struct perf_thread_map *perf_evsel__threads(struct perf_evsel *evsel);
 LIBPERF_API struct perf_event_attr *perf_evsel__attr(struct perf_evsel *evsel);
+LIBPERF_API void perf_counts_values__scale(struct perf_counts_values *count,
+					   bool scale, __s8 *pscaled);
 
 #endif /* __LIBPERF_EVSEL_H */
diff --git a/tools/lib/perf/libperf.map b/tools/lib/perf/libperf.map
index 71468606e8a7..5979bf92d98f 100644
--- a/tools/lib/perf/libperf.map
+++ b/tools/lib/perf/libperf.map
@@ -50,6 +50,7 @@ LIBPERF_0.0.1 {
 		perf_mmap__read_init;
 		perf_mmap__read_done;
 		perf_mmap__read_event;
+		perf_counts_values__scale;
 	local:
 		*;
 };
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index ac0127be0459..656c30b988ce 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1476,25 +1476,6 @@ void evsel__compute_deltas(struct evsel *evsel, int cpu, int thread,
 	count->run = count->run - tmp.run;
 }
 
-void perf_counts_values__scale(struct perf_counts_values *count,
-			       bool scale, s8 *pscaled)
-{
-	s8 scaled = 0;
-
-	if (scale) {
-		if (count->run == 0) {
-			scaled = -1;
-			count->val = 0;
-		} else if (count->run < count->ena) {
-			scaled = 1;
-			count->val = (u64)((double) count->val * count->ena / count->run);
-		}
-	}
-
-	if (pscaled)
-		*pscaled = scaled;
-}
-
 static int evsel__read_one(struct evsel *evsel, int cpu, int thread)
 {
 	struct perf_counts_values *count = perf_counts(evsel->counts, cpu, thread);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 29d49a8c1e92..99aa3363def7 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -195,9 +195,6 @@ static inline int evsel__nr_cpus(struct evsel *evsel)
 	return evsel__cpus(evsel)->nr;
 }
 
-void perf_counts_values__scale(struct perf_counts_values *count,
-			       bool scale, s8 *pscaled);
-
 void evsel__compute_deltas(struct evsel *evsel, int cpu, int thread,
 			   struct perf_counts_values *count);
 
-- 
2.25.1



* [PATCH v5 2/3] libperf: Remove scaling process from perf_mmap__read_self()
  2021-12-07  8:22 [PATCH v5 0/3] libperf: Unify scaling of counters obtained from perf_evsel__read() Shunsuke
  2021-12-07  8:22 ` [PATCH v5 1/3] libperf: Move perf_counts_values__scale to tools/lib/perf Shunsuke
@ 2021-12-07  8:22 ` Shunsuke
  2021-12-07  8:22 ` [PATCH v5 3/3] libperf tests: Add test_stat_multiplexing test Shunsuke
  2021-12-08 21:18 ` [PATCH v5 0/3] libperf: Unify scaling of counters obtained from perf_evsel__read() Jiri Olsa
  3 siblings, 0 replies; 7+ messages in thread
From: Shunsuke @ 2021-12-07  8:22 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, robh
  Cc: linux-kernel, linux-perf-users, Jiri Olsa, Shunsuke Nakamura

From: Shunsuke Nakamura <nakamura.shun@fujitsu.com>

Remove the scaling step from perf_mmap__read_self() so that
perf_evsel__read() consistently returns unscaled counter values.
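
A caller that still wants a scaled value after this change could do the
integer arithmetic itself. The sketch below is only an illustration under
stated assumptions: mul_div_sketch() and scale_raw_sketch() are hypothetical
names, mul_div_sketch() is not the helper used in mmap.c, and it relies on
the GCC/Clang unsigned __int128 extension. It also guards against run == 0,
the division-by-zero case mentioned in the v2 changelog.

```c
#include <stdint.h>

/*
 * Hypothetical helper: computes a * b / c with a 128-bit intermediate
 * so that the product cannot overflow 64 bits.
 * The caller must guarantee c != 0.
 */
static uint64_t mul_div_sketch(uint64_t a, uint64_t b, uint64_t c)
{
	return (uint64_t)(((unsigned __int128)a * b) / c);
}

/* Scale a raw count by ena/run only when the event was multiplexed. */
static uint64_t scale_raw_sketch(uint64_t raw, uint64_t ena, uint64_t run)
{
	if (run == 0)
		return 0;	/* never scheduled: nothing to extrapolate from */
	if (run >= ena)
		return raw;	/* not multiplexed: the raw value is already exact */
	return mul_div_sketch(raw, ena, run);
}
```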

Acked-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
---
 tools/lib/perf/mmap.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/tools/lib/perf/mmap.c b/tools/lib/perf/mmap.c
index c89dfa5f67b3..aaa457904008 100644
--- a/tools/lib/perf/mmap.c
+++ b/tools/lib/perf/mmap.c
@@ -353,8 +353,6 @@ int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count
 		count->ena += delta;
 		if (idx)
 			count->run += delta;
-
-		cnt = mul_u64_u64_div64(cnt, count->ena, count->run);
 	}
 
 	count->val = cnt;
-- 
2.25.1



* [PATCH v5 3/3] libperf tests: Add test_stat_multiplexing test
  2021-12-07  8:22 [PATCH v5 0/3] libperf: Unify scaling of counters obtained from perf_evsel__read() Shunsuke
  2021-12-07  8:22 ` [PATCH v5 1/3] libperf: Move perf_counts_values__scale to tools/lib/perf Shunsuke
  2021-12-07  8:22 ` [PATCH v5 2/3] libperf: Remove scaling process from perf_mmap__read_self() Shunsuke
@ 2021-12-07  8:22 ` Shunsuke
  2021-12-09 23:57   ` Namhyung Kim
  2021-12-08 21:18 ` [PATCH v5 0/3] libperf: Unify scaling of counters obtained from perf_evsel__read() Jiri Olsa
  3 siblings, 1 reply; 7+ messages in thread
From: Shunsuke @ 2021-12-07  8:22 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, robh
  Cc: linux-kernel, linux-perf-users, Jiri Olsa, Shunsuke Nakamura

From: Shunsuke Nakamura <nakamura.shun@fujitsu.com>

Add a test for counters obtained with the read() system call while
events are multiplexed.

Committer testing:

  $ sudo make tests -C ./tools/lib/perf V=1
    make[1]: Entering directory '/home/nakamura/build_work/build_kernel/linux-kernel/linux/tools/lib/perf'
    make -f /home/nakamura/build_work/build_kernel/linux-kernel/linux/tools/build/Makefile.build dir=. obj=libperf
    make -C /home/nakamura/build_work/build_kernel/linux-kernel/linux/tools/lib/api/ O= libapi.a
    make -f /home/nakamura/build_work/build_kernel/linux-kernel/linux/tools/build/Makefile.build dir=./fd obj=libapi
    make -f /home/nakamura/build_work/build_kernel/linux-kernel/linux/tools/build/Makefile.build dir=./fs obj=libapi
    make -f /home/nakamura/build_work/build_kernel/linux-kernel/linux/tools/build/Makefile.build dir=. obj=tests
    make -f /home/nakamura/build_work/build_kernel/linux-kernel/linux/tools/build/Makefile.build dir=./tests obj=tests
    running static:
    - running tests/test-cpumap.c...OK
    - running tests/test-threadmap.c...OK
    - running tests/test-evlist.c...
    Event  0 -- Raw count = 297991478, run = 289848838, enable = 487833990
             Scaled count = 501538569 (59.42%, 289848838/487833990)
    Event  1 -- Raw count = 298202297, run = 289842833, enable = 487828457
             Scaled count = 501898097 (59.41%, 289842833/487828457)
    Event  2 -- Raw count = 298838708, run = 290229672, enable = 487824119
             Scaled count = 502294367 (59.49%, 290229672/487824119)
    Event  3 -- Raw count = 299636384, run = 291223455, enable = 487819672
             Scaled count = 501911916 (59.70%, 291223455/487819672)
    Event  4 -- Raw count = 301039452, run = 292217461, enable = 487814690
             Scaled count = 502541725 (59.90%, 292217461/487814690)
    Event  5 -- Raw count = 301835436, run = 293210150, enable = 487808943
             Scaled count = 502158690 (60.11%, 293210150/487808943)
    Event  6 -- Raw count = 304060357, run = 294621959, enable = 487802339
             Scaled count = 503429390 (60.40%, 294621959/487802339)
    Event  7 -- Raw count = 305283685, run = 295613727, enable = 487794884
             Scaled count = 503751369 (60.60%, 295613727/487794884)
    Event  8 -- Raw count = 305475229, run = 296220779, enable = 487786737
             Scaled count = 503026039 (60.73%, 296220779/487786737)
    Event  9 -- Raw count = 305141917, run = 295602628, enable = 487777537
             Scaled count = 503518435 (60.60%, 295602628/487777537)
    Event 10 -- Raw count = 303495328, run = 294604639, enable = 487765440
             Scaled count = 502485407 (60.40%, 294604639/487765440)
    Event 11 -- Raw count = 302667296, run = 293605945, enable = 487755909
             Scaled count = 502809171 (60.20%, 293605945/487755909)
    Event 12 -- Raw count = 301051839, run = 292174676, enable = 487746418
             Scaled count = 502565650 (59.90%, 292174676/487746418)
    Event 13 -- Raw count = 299861567, run = 291175260, enable = 487737096
             Scaled count = 502287213 (59.70%, 291175260/487737096)
    Event 14 -- Raw count = 299075896, run = 290177159, enable = 487727626
             Scaled count = 502684557 (59.50%, 290177159/487727626)
       Expected: 501627347
       High: 503751369   Low:  297991478   Average:  502593373
       Average Error = 0.19%
    OK
    - running tests/test-evsel.c...
            loop = 65536, count = 328182
            loop = 131072, count = 660212
            loop = 262144, count = 1344434
            loop = 524288, count = 2665921
            loop = 1048576, count = 5292260
            loop = 65536, count = 525695
            loop = 131072, count = 1039025
            loop = 262144, count = 2022367
            loop = 524288, count = 3807896
            loop = 1048576, count = 7026126
    OK
    running dynamic:
    - running tests/test-cpumap.c...OK
    - running tests/test-threadmap.c...OK
    - running tests/test-evlist.c...
    Event  0 -- Raw count = 301261995, run = 297151831, enable = 496168657
             Scaled count = 503031594 (59.89%, 297151831/496168657)
    Event  1 -- Raw count = 301949118, run = 298145404, enable = 496165648
             Scaled count = 502495687 (60.09%, 298145404/496165648)
    Event  2 -- Raw count = 301996384, run = 298170976, enable = 496162496
             Scaled count = 502528051 (60.10%, 298170976/496162496)
    Event  3 -- Raw count = 302266025, run = 298167975, enable = 496158896
             Scaled count = 502978152 (60.10%, 298167975/496158896)
    Event  4 -- Raw count = 302326299, run = 298162895, enable = 496154322
             Scaled count = 503082383 (60.09%, 298162895/496154322)
    Event  5 -- Raw count = 301984135, run = 298160272, enable = 496149190
             Scaled count = 502512232 (60.09%, 298160272/496149190)
    Event  6 -- Raw count = 302227412, run = 298150911, enable = 496142936
             Scaled count = 502926504 (60.09%, 298150911/496142936)
    Event  7 -- Raw count = 302124492, run = 298154219, enable = 496135963
             Scaled count = 502742595 (60.10%, 298154219/496135963)
    Event  8 -- Raw count = 302044822, run = 298146667, enable = 496128143
             Scaled count = 502614830 (60.09%, 298146667/496128143)
    Event  9 -- Raw count = 301592560, run = 298031312, enable = 496119275
             Scaled count = 502047523 (60.07%, 298031312/496119275)
    Event 10 -- Raw count = 300695500, run = 297033588, enable = 496108098
             Scaled count = 502224255 (59.87%, 297033588/496108098)
    Event 11 -- Raw count = 300948104, run = 296983965, enable = 496098673
             Scaled count = 502720593 (59.86%, 296983965/496098673)
    Event 12 -- Raw count = 300864958, run = 296983483, enable = 496089228
             Scaled count = 502572948 (59.86%, 296983483/496089228)
    Event 13 -- Raw count = 301117898, run = 296973717, enable = 496079647
             Scaled count = 503002292 (59.86%, 296973717/496079647)
    Event 14 -- Raw count = 301224163, run = 296977949, enable = 496070093
             Scaled count = 503162942 (59.87%, 296977949/496070093)
       Expected: 501650928
       High: 503162942   Low:  301261995   Average:  502709505
       Average Error = 0.21%
    OK
    - running tests/test-evsel.c...
            loop = 65536, count = 328183
            loop = 131072, count = 740142
            loop = 262144, count = 1339999
            loop = 524288, count = 2696817
            loop = 1048576, count = 5294518
            loop = 65536, count = 517941
            loop = 131072, count = 871035
            loop = 262144, count = 1835805
            loop = 524288, count = 3391920
            loop = 1048576, count = 6891764
    OK
    make[1]: Leaving directory '/home/nakamura/build_work/build_kernel/linux-kernel/linux/tools/lib/perf'
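
The percentages in the log above can be reproduced with the standalone
sketch below. The function names are hypothetical; the formulas follow the
ones used by display_error() and by the "Scaled count" message in the test
added by this patch. Both functions assume a non-zero divisor.

```c
/* Share of the enabled time during which the event was counting, in percent. */
static double running_percent_sketch(unsigned long long run,
				     unsigned long long ena)
{
	return (double)run / (double)ena * 100.0;
}

/* Signed relative error of the average against the expected count, in percent. */
static double error_percent_sketch(long long average, long long expected)
{
	return (((double)average - (double)expected) / (double)expected) * 100.0;
}
```

With the figures from the static run (average 502593373, expected
501627347), error_percent_sketch() gives about 0.19%, which is inside the
+/-1% range that the test accepts.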

Acked-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
---
 tools/lib/perf/tests/test-evlist.c | 157 +++++++++++++++++++++++++++++
 1 file changed, 157 insertions(+)

diff --git a/tools/lib/perf/tests/test-evlist.c b/tools/lib/perf/tests/test-evlist.c
index ce91a582f0e4..064edd0e995c 100644
--- a/tools/lib/perf/tests/test-evlist.c
+++ b/tools/lib/perf/tests/test-evlist.c
@@ -21,6 +21,9 @@
 #include "tests.h"
 #include <internal/evsel.h>
 
+#define EVENT_NUM 15
+#define WAIT_COUNT 100000000UL
+
 static int libperf_print(enum libperf_print_level level,
 			 const char *fmt, va_list ap)
 {
@@ -413,6 +416,159 @@ static int test_mmap_cpus(void)
 	return 0;
 }
 
+static double display_error(long long average,
+			    long long high,
+			    long long low,
+			    long long expected)
+{
+	double error;
+
+	error = (((double)average - expected) / expected) * 100.0;
+
+	__T_VERBOSE("   Expected: %lld\n", expected);
+	__T_VERBOSE("   High: %lld   Low:  %lld   Average:  %lld\n",
+		    high, low, average);
+
+	__T_VERBOSE("   Average Error = %.2f%%\n", error);
+
+	return error;
+}
+
+static int test_stat_multiplexing(void)
+{
+	struct perf_counts_values expected_counts = { .val = 0 };
+	struct perf_counts_values counts[EVENT_NUM] = {{ .val = 0 },};
+	struct perf_thread_map *threads;
+	struct perf_evlist *evlist;
+	struct perf_evsel *evsel;
+	struct perf_event_attr attr = {
+		.type	     = PERF_TYPE_HARDWARE,
+		.config	     = PERF_COUNT_HW_INSTRUCTIONS,
+		.read_format = PERF_FORMAT_TOTAL_TIME_ENABLED |
+			       PERF_FORMAT_TOTAL_TIME_RUNNING,
+		.disabled    = 1,
+	};
+	int err, i, nonzero = 0;
+	unsigned long count;
+	long long max = 0, min = 0, avg = 0;
+	double error = 0.0;
+	__s8 scaled = 0;
+
+	/* read for non-multiplexing event count */
+	threads = perf_thread_map__new_dummy();
+	__T("failed to create threads", threads);
+
+	perf_thread_map__set_pid(threads, 0, 0);
+
+	evsel = perf_evsel__new(&attr);
+	__T("failed to create evsel", evsel);
+
+	err = perf_evsel__open(evsel, NULL, threads);
+	__T("failed to open evsel", err == 0);
+
+	err = perf_evsel__enable(evsel);
+	__T("failed to enable evsel", err == 0);
+
+	/* wait loop */
+	count = WAIT_COUNT;
+	while (count--)
+		;
+
+	perf_evsel__read(evsel, 0, 0, &expected_counts);
+	__T("failed to read value for evsel", expected_counts.val != 0);
+	__T("failed to read non-multiplexing event count",
+	    expected_counts.ena == expected_counts.run);
+
+	err = perf_evsel__disable(evsel);
+	__T("failed to disable evsel", err == 0);
+
+	perf_evsel__close(evsel);
+	perf_evsel__delete(evsel);
+
+	perf_thread_map__put(threads);
+
+	/* read for multiplexing event count */
+	threads = perf_thread_map__new_dummy();
+	__T("failed to create threads", threads);
+
+	perf_thread_map__set_pid(threads, 0, 0);
+
+	evlist = perf_evlist__new();
+	__T("failed to create evlist", evlist);
+
+	for (i = 0; i < EVENT_NUM; i++) {
+		evsel = perf_evsel__new(&attr);
+		__T("failed to create evsel", evsel);
+
+		perf_evlist__add(evlist, evsel);
+	}
+	perf_evlist__set_maps(evlist, NULL, threads);
+
+	err = perf_evlist__open(evlist);
+	__T("failed to open evlist", err == 0);
+
+	perf_evlist__enable(evlist);
+
+	/* wait loop */
+	count = WAIT_COUNT;
+	while (count--)
+		;
+
+	i = 0;
+	perf_evlist__for_each_evsel(evlist, evsel) {
+		perf_evsel__read(evsel, 0, 0, &counts[i]);
+		__T("failed to read value for evsel", counts[i].val != 0);
+		i++;
+	}
+
+	perf_evlist__disable(evlist);
+
+	min = counts[0].val;
+	for (i = 0; i < EVENT_NUM; i++) {
+		__T_VERBOSE("Event %2d -- Raw count = %lu, run = %lu, enable = %lu\n",
+			    i, counts[i].val, counts[i].run, counts[i].ena);
+
+		perf_counts_values__scale(&counts[i], true, &scaled);
+		if (scaled == 1) {
+			__T_VERBOSE("\t Scaled count = %lu (%.2lf%%, %lu/%lu)\n",
+				    counts[i].val,
+				    (double)counts[i].run / (double)counts[i].ena * 100.0,
+				    counts[i].run, counts[i].ena);
+		} else if (scaled == -1) {
+			__T_VERBOSE("\t Not Running\n");
+		} else {
+			__T_VERBOSE("\t Not Scaling\n");
+		}
+
+		if (counts[i].val > max)
+			max = counts[i].val;
+
+		if (counts[i].val < min)
+			min = counts[i].val;
+
+		avg += counts[i].val;
+
+		if (counts[i].val != 0)
+			nonzero++;
+	}
+
+	if (nonzero != 0)
+		avg = avg / nonzero;
+	else
+		avg = 0;
+
+	error = display_error(avg, max, min, expected_counts.val);
+
+	__T("Error out of range!", ((error <= 1.0) && (error >= -1.0)));
+
+	perf_evlist__close(evlist);
+	perf_evlist__delete(evlist);
+
+	perf_thread_map__put(threads);
+
+	return 0;
+}
+
 int test_evlist(int argc, char **argv)
 {
 	__T_START;
@@ -424,6 +580,7 @@ int test_evlist(int argc, char **argv)
 	test_stat_thread_enable();
 	test_mmap_thread();
 	test_mmap_cpus();
+	test_stat_multiplexing();
 
 	__T_END;
 	return tests_failed == 0 ? 0 : -1;
-- 
2.25.1



* Re: [PATCH v5 0/3] libperf: Unify scaling of counters obtained from perf_evsel__read()
  2021-12-07  8:22 [PATCH v5 0/3] libperf: Unify scaling of counters obtained from perf_evsel__read() Shunsuke
                   ` (2 preceding siblings ...)
  2021-12-07  8:22 ` [PATCH v5 3/3] libperf tests: Add test_stat_multiplexing test Shunsuke
@ 2021-12-08 21:18 ` Jiri Olsa
  3 siblings, 0 replies; 7+ messages in thread
From: Jiri Olsa @ 2021-12-08 21:18 UTC (permalink / raw)
  To: Shunsuke
  Cc: peterz, mingo, acme, mark.rutland, alexander.shishkin, namhyung,
	robh, linux-kernel, linux-perf-users

On Tue, Dec 07, 2021 at 05:22:42PM +0900, Shunsuke wrote:
> This patch series unifies the counters that can be obtained from
> perf_evsel__read() to "no scaling".
> The counter scaling will be done using a function moved from
> tools/perf/util.
> 
> The first patch move perf_counts_values__scale from tools/perf/util
> to tools/lib/perf so that it can be used with libperf.
> 
> The second patch removes the scaling process from
> perf_mmap__read_self().
> 
> The third patch adds a verification test to make sure that it scales
> correctly when multiplexed.
> 
> ---
> Previous version at:
> https://lore.kernel.org/linux-perf-users/20211129090627.592149-1-nakamura.shun@fujitsu.com/
> 
> Changes in v5:
>  - Update tools/lib/perf/Documentation/libperf.txt

Acked-by: Jiri Olsa <jolsa@redhat.com>

thanks,
jirka

> 
> Changes in v4:
>  - Modify type s8 to type __s8
> 
> Changes in v3:
>  - Move scaling process from tools/perf/util to tools/lib/perf
>  - Remove scaling process from perf_mmap__read_self()
>  - Remove test to verify that no division by zero occurs
> 
> Changes in v2:
>  - Fix not to divide by zero when counter scaling
>  - Add test to verify that no division by zero occurs
> 
> 
> [1] https://github.com/deater/perf_event_tests/blob/master/tests/rdpmc/rdpmc_multiplexing.c
> 
> 
> Shunsuke Nakamura (3):
>   libperf: Move perf_counts_values__scale to tools/lib/perf
>   libperf: Remove scaling process from perf_mmap__read_self()
>   libperf tests: Add test_stat_multiplexing test
> 
>  tools/lib/perf/Documentation/libperf.txt |   2 +
>  tools/lib/perf/evsel.c                   |  19 +++
>  tools/lib/perf/include/perf/evsel.h      |   4 +
>  tools/lib/perf/libperf.map               |   1 +
>  tools/lib/perf/mmap.c                    |   2 -
>  tools/lib/perf/tests/test-evlist.c       | 157 +++++++++++++++++++++++
>  tools/perf/util/evsel.c                  |  19 ---
>  tools/perf/util/evsel.h                  |   3 -
>  8 files changed, 183 insertions(+), 24 deletions(-)
> 
> -- 
> 2.25.1
> 



* Re: [PATCH v5 3/3] libperf tests: Add test_stat_multiplexing test
  2021-12-07  8:22 ` [PATCH v5 3/3] libperf tests: Add test_stat_multiplexing test Shunsuke
@ 2021-12-09 23:57   ` Namhyung Kim
  2021-12-21  8:18     ` nakamura.shun
  0 siblings, 1 reply; 7+ messages in thread
From: Namhyung Kim @ 2021-12-09 23:57 UTC (permalink / raw)
  To: Shunsuke
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Rob Herring,
	linux-kernel, linux-perf-users, Jiri Olsa

Hello,

On Tue, Dec 7, 2021 at 12:25 AM Shunsuke <nakamura.shun@fujitsu.com> wrote:
>
> From: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
>
> Adds a test for a counter obtained using read() system call during
> multiplexing.
>
> Committer testing:
>
>   $ sudo make tests -C ./tools/lib/perf V=1
>     make[1]: Entering directory '/home/nakamura/build_work/build_kernel/linux-kernel/linux/tools/lib/perf'
>     make -f /home/nakamura/build_work/build_kernel/linux-kernel/linux/tools/build/Makefile.build dir=. obj=libperf
>     make -C /home/nakamura/build_work/build_kernel/linux-kernel/linux/tools/lib/api/ O= libapi.a
>     make -f /home/nakamura/build_work/build_kernel/linux-kernel/linux/tools/build/Makefile.build dir=./fd obj=libapi
>     make -f /home/nakamura/build_work/build_kernel/linux-kernel/linux/tools/build/Makefile.build dir=./fs obj=libapi
>     make -f /home/nakamura/build_work/build_kernel/linux-kernel/linux/tools/build/Makefile.build dir=. obj=tests
>     make -f /home/nakamura/build_work/build_kernel/linux-kernel/linux/tools/build/Makefile.build dir=./tests obj=tests
>     running static:
>     - running tests/test-cpumap.c...OK
>     - running tests/test-threadmap.c...OK
>     - running tests/test-evlist.c...
>     Event  0 -- Raw count = 297991478, run = 289848838, enable = 487833990
>              Scaled count = 501538569 (59.42%, 289848838/487833990)
>     Event  1 -- Raw count = 298202297, run = 289842833, enable = 487828457
>              Scaled count = 501898097 (59.41%, 289842833/487828457)
>     Event  2 -- Raw count = 298838708, run = 290229672, enable = 487824119
>              Scaled count = 502294367 (59.49%, 290229672/487824119)
>     Event  3 -- Raw count = 299636384, run = 291223455, enable = 487819672
>              Scaled count = 501911916 (59.70%, 291223455/487819672)
>     Event  4 -- Raw count = 301039452, run = 292217461, enable = 487814690
>              Scaled count = 502541725 (59.90%, 292217461/487814690)
>     Event  5 -- Raw count = 301835436, run = 293210150, enable = 487808943
>              Scaled count = 502158690 (60.11%, 293210150/487808943)
>     Event  6 -- Raw count = 304060357, run = 294621959, enable = 487802339
>              Scaled count = 503429390 (60.40%, 294621959/487802339)
>     Event  7 -- Raw count = 305283685, run = 295613727, enable = 487794884
>              Scaled count = 503751369 (60.60%, 295613727/487794884)
>     Event  8 -- Raw count = 305475229, run = 296220779, enable = 487786737
>              Scaled count = 503026039 (60.73%, 296220779/487786737)
>     Event  9 -- Raw count = 305141917, run = 295602628, enable = 487777537
>              Scaled count = 503518435 (60.60%, 295602628/487777537)
>     Event 10 -- Raw count = 303495328, run = 294604639, enable = 487765440
>              Scaled count = 502485407 (60.40%, 294604639/487765440)
>     Event 11 -- Raw count = 302667296, run = 293605945, enable = 487755909
>              Scaled count = 502809171 (60.20%, 293605945/487755909)
>     Event 12 -- Raw count = 301051839, run = 292174676, enable = 487746418
>              Scaled count = 502565650 (59.90%, 292174676/487746418)
>     Event 13 -- Raw count = 299861567, run = 291175260, enable = 487737096
>              Scaled count = 502287213 (59.70%, 291175260/487737096)
>     Event 14 -- Raw count = 299075896, run = 290177159, enable = 487727626
>              Scaled count = 502684557 (59.50%, 290177159/487727626)
>        Expected: 501627347
>        High: 503751369   Low:  297991478   Average:  502593373
>        Average Error = 0.19%
>     OK
>     - running tests/test-evsel.c...
>             loop = 65536, count = 328182
>             loop = 131072, count = 660212
>             loop = 262144, count = 1344434
>             loop = 524288, count = 2665921
>             loop = 1048576, count = 5292260
>             loop = 65536, count = 525695
>             loop = 131072, count = 1039025
>             loop = 262144, count = 2022367
>             loop = 524288, count = 3807896
>             loop = 1048576, count = 7026126
>     OK
>     running dynamic:
>     - running tests/test-cpumap.c...OK
>     - running tests/test-threadmap.c...OK
>     - running tests/test-evlist.c...
>     Event  0 -- Raw count = 301261995, run = 297151831, enable = 496168657
>              Scaled count = 503031594 (59.89%, 297151831/496168657)
>     Event  1 -- Raw count = 301949118, run = 298145404, enable = 496165648
>              Scaled count = 502495687 (60.09%, 298145404/496165648)
>     Event  2 -- Raw count = 301996384, run = 298170976, enable = 496162496
>              Scaled count = 502528051 (60.10%, 298170976/496162496)
>     Event  3 -- Raw count = 302266025, run = 298167975, enable = 496158896
>              Scaled count = 502978152 (60.10%, 298167975/496158896)
>     Event  4 -- Raw count = 302326299, run = 298162895, enable = 496154322
>              Scaled count = 503082383 (60.09%, 298162895/496154322)
>     Event  5 -- Raw count = 301984135, run = 298160272, enable = 496149190
>              Scaled count = 502512232 (60.09%, 298160272/496149190)
>     Event  6 -- Raw count = 302227412, run = 298150911, enable = 496142936
>              Scaled count = 502926504 (60.09%, 298150911/496142936)
>     Event  7 -- Raw count = 302124492, run = 298154219, enable = 496135963
>              Scaled count = 502742595 (60.10%, 298154219/496135963)
>     Event  8 -- Raw count = 302044822, run = 298146667, enable = 496128143
>              Scaled count = 502614830 (60.09%, 298146667/496128143)
>     Event  9 -- Raw count = 301592560, run = 298031312, enable = 496119275
>              Scaled count = 502047523 (60.07%, 298031312/496119275)
>     Event 10 -- Raw count = 300695500, run = 297033588, enable = 496108098
>              Scaled count = 502224255 (59.87%, 297033588/496108098)
>     Event 11 -- Raw count = 300948104, run = 296983965, enable = 496098673
>              Scaled count = 502720593 (59.86%, 296983965/496098673)
>     Event 12 -- Raw count = 300864958, run = 296983483, enable = 496089228
>              Scaled count = 502572948 (59.86%, 296983483/496089228)
>     Event 13 -- Raw count = 301117898, run = 296973717, enable = 496079647
>              Scaled count = 503002292 (59.86%, 296973717/496079647)
>     Event 14 -- Raw count = 301224163, run = 296977949, enable = 496070093
>              Scaled count = 503162942 (59.87%, 296977949/496070093)
>        Expected: 501650928
>        High: 503162942   Low:  301261995   Average:  502709505
>        Average Error = 0.21%
>     OK
>     - running tests/test-evsel.c...
>             loop = 65536, count = 328183
>             loop = 131072, count = 740142
>             loop = 262144, count = 1339999
>             loop = 524288, count = 2696817
>             loop = 1048576, count = 5294518
>             loop = 65536, count = 517941
>             loop = 131072, count = 871035
>             loop = 262144, count = 1835805
>             loop = 524288, count = 3391920
>             loop = 1048576, count = 6891764
>     OK
>     make[1]: Leaving directory '/home/nakamura/build_work/build_kernel/linux-kernel/linux/tools/lib/perf'
>
> Acked-by: Jiri Olsa <jolsa@kernel.org>
> Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
> ---
>  tools/lib/perf/tests/test-evlist.c | 157 +++++++++++++++++++++++++++++
>  1 file changed, 157 insertions(+)
>
> diff --git a/tools/lib/perf/tests/test-evlist.c b/tools/lib/perf/tests/test-evlist.c
> index ce91a582f0e4..064edd0e995c 100644
> --- a/tools/lib/perf/tests/test-evlist.c
> +++ b/tools/lib/perf/tests/test-evlist.c
> @@ -21,6 +21,9 @@
>  #include "tests.h"
>  #include <internal/evsel.h>
>
> +#define EVENT_NUM 15
> +#define WAIT_COUNT 100000000UL
> +
>  static int libperf_print(enum libperf_print_level level,
>                          const char *fmt, va_list ap)
>  {
> @@ -413,6 +416,159 @@ static int test_mmap_cpus(void)
>         return 0;
>  }
>
> +static double display_error(long long average,
> +                           long long high,
> +                           long long low,
> +                           long long expected)
> +{
> +       double error;
> +
> +       error = (((double)average - expected) / expected) * 100.0;
> +
> +       __T_VERBOSE("   Expected: %lld\n", expected);
> +       __T_VERBOSE("   High: %lld   Low:  %lld   Average:  %lld\n",
> +                   high, low, average);
> +
> +       __T_VERBOSE("   Average Error = %.2f%%\n", error);
> +
> +       return error;
> +}
> +
> +static int test_stat_multiplexing(void)
> +{
> +       struct perf_counts_values expected_counts = { .val = 0 };
> +       struct perf_counts_values counts[EVENT_NUM] = {{ .val = 0 },};
> +       struct perf_thread_map *threads;
> +       struct perf_evlist *evlist;
> +       struct perf_evsel *evsel;
> +       struct perf_event_attr attr = {
> +               .type        = PERF_TYPE_HARDWARE,
> +               .config      = PERF_COUNT_HW_INSTRUCTIONS,
> +               .read_format = PERF_FORMAT_TOTAL_TIME_ENABLED |
> +                              PERF_FORMAT_TOTAL_TIME_RUNNING,
> +               .disabled    = 1,

It'd be nice if you used a less restrictive event attribute
so that we can test it in a VM or as non-root.

How about using SOFTWARE / CPU_CLOCKS with
exclude_kernel = 1?

Thanks,
Namhyung
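
For reference, a minimal sketch of what the suggested attribute might
look like. It assumes "SOFTWARE / CPU_CLOCKS" refers to
PERF_TYPE_SOFTWARE with PERF_COUNT_SW_CPU_CLOCK from
<linux/perf_event.h>; make_sw_clock_attr() is a hypothetical helper
name and is not part of the patch. Whether such an event still
exercises the multiplexed path is not addressed in this thread.

```c
#include <linux/perf_event.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): build an attribute along the
 * lines of the review suggestion, i.e. a software clock event that
 * excludes kernel-mode counting.
 */
static struct perf_event_attr make_sw_clock_attr(void)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size           = sizeof(attr);
	/* Assumption: "SOFTWARE / CPU_CLOCKS" means these two constants. */
	attr.type           = PERF_TYPE_SOFTWARE;
	attr.config         = PERF_COUNT_SW_CPU_CLOCK;
	/* Same read_format as the quoted test, so ena/run are reported. */
	attr.read_format    = PERF_FORMAT_TOTAL_TIME_ENABLED |
			      PERF_FORMAT_TOTAL_TIME_RUNNING;
	attr.disabled       = 1;
	attr.exclude_kernel = 1;

	return attr;
}
```

The result could be passed to perf_evsel__new() in place of the
hardware attribute used in the quoted test.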

> +       };
> +       int err, i, nonzero = 0;
> +       unsigned long count;
> +       long long max = 0, min = 0, avg = 0;
> +       double error = 0.0;
> +       __s8 scaled = 0;
> +
> +       /* read for non-multiplexing event count */
> +       threads = perf_thread_map__new_dummy();
> +       __T("failed to create threads", threads);
> +
> +       perf_thread_map__set_pid(threads, 0, 0);
> +
> +       evsel = perf_evsel__new(&attr);
> +       __T("failed to create evsel", evsel);
> +
> +       err = perf_evsel__open(evsel, NULL, threads);
> +       __T("failed to open evsel", err == 0);
> +
> +       err = perf_evsel__enable(evsel);
> +       __T("failed to enable evsel", err == 0);
> +
> +       /* wait loop */
> +       count = WAIT_COUNT;
> +       while (count--)
> +               ;
> +
> +       perf_evsel__read(evsel, 0, 0, &expected_counts);
> +       __T("failed to read value for evsel", expected_counts.val != 0);
> +       __T("failed to read non-multiplexing event count",
> +           expected_counts.ena == expected_counts.run);
> +
> +       err = perf_evsel__disable(evsel);
> +       __T("failed to disable evsel", err == 0);
> +
> +       perf_evsel__close(evsel);
> +       perf_evsel__delete(evsel);
> +
> +       perf_thread_map__put(threads);
> +
> +       /* read for multiplexing event count */
> +       threads = perf_thread_map__new_dummy();
> +       __T("failed to create threads", threads);
> +
> +       perf_thread_map__set_pid(threads, 0, 0);
> +
> +       evlist = perf_evlist__new();
> +       __T("failed to create evlist", evlist);
> +
> +       for (i = 0; i < EVENT_NUM; i++) {
> +               evsel = perf_evsel__new(&attr);
> +               __T("failed to create evsel", evsel);
> +
> +               perf_evlist__add(evlist, evsel);
> +       }
> +       perf_evlist__set_maps(evlist, NULL, threads);
> +
> +       err = perf_evlist__open(evlist);
> +       __T("failed to open evsel", err == 0);
> +
> +       perf_evlist__enable(evlist);
> +
> +       /* wait loop */
> +       count = WAIT_COUNT;
> +       while (count--)
> +               ;
> +
> +       i = 0;
> +       perf_evlist__for_each_evsel(evlist, evsel) {
> +               perf_evsel__read(evsel, 0, 0, &counts[i]);
> +               __T("failed to read value for evsel", counts[i].val != 0);
> +               i++;
> +       }
> +
> +       perf_evlist__disable(evlist);
> +
> +       min = counts[0].val;
> +       for (i = 0; i < EVENT_NUM; i++) {
> +               __T_VERBOSE("Event %2d -- Raw count = %lu, run = %lu, enable = %lu\n",
> +                           i, counts[i].val, counts[i].run, counts[i].ena);
> +
> +               perf_counts_values__scale(&counts[i], true, &scaled);
> +               if (scaled == 1) {
> +                       __T_VERBOSE("\t Scaled count = %lu (%.2lf%%, %lu/%lu)\n",
> +                                   counts[i].val,
> +                                   (double)counts[i].run / (double)counts[i].ena * 100.0,
> +                                   counts[i].run, counts[i].ena);
> +               } else if (scaled == -1) {
> +                       __T_VERBOSE("\t Not Running\n");
> +               } else {
> +                       __T_VERBOSE("\t Not Scaling\n");
> +               }
> +
> +               if (counts[i].val > max)
> +                       max = counts[i].val;
> +
> +               if (counts[i].val < min)
> +                       min = counts[i].val;
> +
> +               avg += counts[i].val;
> +
> +               if (counts[i].val != 0)
> +                       nonzero++;
> +       }
> +
> +       if (nonzero != 0)
> +               avg = avg / nonzero;
> +       else
> +               avg = 0;
> +
> +       error = display_error(avg, max, min, expected_counts.val);
> +
> +       __T("Error out of range!", ((error <= 1.0) && (error >= -1.0)));
> +
> +       perf_evlist__close(evlist);
> +       perf_evlist__delete(evlist);
> +
> +       perf_thread_map__put(threads);
> +
> +       return 0;
> +}
> +
>  int test_evlist(int argc, char **argv)
>  {
>         __T_START;
> @@ -424,6 +580,7 @@ int test_evlist(int argc, char **argv)
>         test_stat_thread_enable();
>         test_mmap_thread();
>         test_mmap_cpus();
> +       test_stat_multiplexing();
>
>         __T_END;
>         return tests_failed == 0 ? 0 : -1;
> --
> 2.25.1
>
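
The "Scaled count" lines in the quoted test output are consistent with
extrapolating the raw count by enable/run. For Event 5, for example,
301984135 * 496149190 / 298160272 is approximately 502512232, which is
the value printed. The sketch below illustrates that arithmetic and the
three result codes the test distinguishes (1 = scaled, -1 = not
running, 0 = not scaled). struct counts_sketch and scale_sketch() are
hypothetical names used only for illustration; this is not the libperf
implementation of perf_counts_values__scale().

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in holding only the three fields the test prints. */
struct counts_sketch {
	uint64_t val;	/* raw counter value */
	uint64_t ena;	/* time the event was enabled */
	uint64_t run;	/* time the event was actually counting */
};

/*
 * Scale c->val in place when the event ran for only part of its enabled
 * time. Returns 1 if the value was scaled, -1 if the event never ran
 * (value forced to 0), and 0 if no scaling was requested or needed.
 */
static int scale_sketch(struct counts_sketch *c, bool scale)
{
	if (!scale)
		return 0;

	if (c->run == 0) {
		c->val = 0;
		return -1;
	}

	if (c->run < c->ena) {
		c->val = (uint64_t)((double)c->val * c->ena / c->run);
		return 1;
	}

	return 0;
}
```

With val = 100, ena = 200 and run = 100, the sketch returns 1 and sets
val to 200; with run equal to ena it returns 0 and leaves val unchanged.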

* Re: [PATCH v5 3/3] libperf tests: Add test_stat_multiplexing test
  2021-12-09 23:57   ` Namhyung Kim
@ 2021-12-21  8:18     ` nakamura.shun
  0 siblings, 0 replies; 7+ messages in thread
From: nakamura.shun @ 2021-12-21  8:18 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Rob Herring,
	linux-kernel, linux-perf-users, Jiri Olsa

Hi Namhyung

Sorry for the late reply.

> > +static double display_error(long long average,
> > +                           long long high,
> > +                           long long low,
> > +                           long long expected)
> > +{
> > +       double error;
> > +
> > +       error = (((double)average - expected) / expected) * 100.0;
> > +
> > +       __T_VERBOSE("   Expected: %lld\n", expected);
> > +       __T_VERBOSE("   High: %lld   Low:  %lld   Average:  %lld\n",
> > +                   high, low, average);
> > +
> > +       __T_VERBOSE("   Average Error = %.2f%%\n", error);
> > +
> > +       return error;
> > +}
> > +
> > +static int test_stat_multiplexing(void)
> > +{
> > +       struct perf_counts_values expected_counts = { .val = 0 };
> > +       struct perf_counts_values counts[EVENT_NUM] = {{ .val = 0 },};
> > +       struct perf_thread_map *threads;
> > +       struct perf_evlist *evlist;
> > +       struct perf_evsel *evsel;
> > +       struct perf_event_attr attr = {
> > +               .type        = PERF_TYPE_HARDWARE,
> > +               .config      = PERF_COUNT_HW_INSTRUCTIONS,
> > +               .read_format = PERF_FORMAT_TOTAL_TIME_ENABLED |
> > +                              PERF_FORMAT_TOTAL_TIME_RUNNING,
> > +               .disabled    = 1,
> 
> It'd be nice if you used a less restrictive event attribute
> so that we can test it in a VM or as non-root.
> 
> How about using SOFTWARE / CPU_CLOCKS with
> exclude_kernel = 1?

I'm currently working on adding a new API to libperf, so I will
respond to the above comments around the end of January.


Best Regards
Shunsuke

Thread overview: 7+ messages
2021-12-07  8:22 [PATCH v5 0/3] libperf: Unify scaling of counters obtained from perf_evsel__read() Shunsuke
2021-12-07  8:22 ` [PATCH v5 1/3] libperf: Move perf_counts_values__scale to tools/lib/perf Shunsuke
2021-12-07  8:22 ` [PATCH v5 2/3] libperf: Remove scaling process from perf_mmap__read_self() Shunsuke
2021-12-07  8:22 ` [PATCH v5 3/3] libperf tests: Add test_stat_multiplexing test Shunsuke
2021-12-09 23:57   ` Namhyung Kim
2021-12-21  8:18     ` nakamura.shun
2021-12-08 21:18 ` [PATCH v5 0/3] libperf: Unify scaling of counters obtained from perf_evsel__read() Jiri Olsa
