* [PATCH v2 0/2] libperf: Add support for scaling counters obtained from the read() system call during multiplexing @ 2021-09-22 10:16 Shunsuke Nakamura 2021-09-22 10:16 ` [PATCH v2 1/2] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing Shunsuke Nakamura 2021-09-22 10:16 ` [PATCH v2 2/2] libperf tests: Add test_stat_multiplexing test Shunsuke Nakamura 0 siblings, 2 replies; 12+ messages in thread From: Shunsuke Nakamura @ 2021-09-22 10:16 UTC (permalink / raw) To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa, namhyung Cc: linux-kernel, linux-perf-users This patch series supports counter scaling when perf_evsel__read() obtains a counter using the read() system call during multiplexing. The first patch adds scaling of counters obtained from the read() system call during multiplexing. The second patch adds a test for the first patch. This patch is based on Vince's rdpmc_multiplexing.c [1] --- Changes in v2: - Fix not to divide by zero when counter scaling - Add test to verify that no division by zero occurs [1] https://github.com/deater/perf_event_tests/blob/master/tests/rdpmc/rdpmc_multiplexing.c nakamura shunsuke (2): libperf: Add processing to scale the counters obtained during the read() system call when multiplexing libperf tests: Add test_stat_multiplexing test tools/lib/perf/evsel.c | 6 + tools/lib/perf/tests/test-evlist.c | 183 +++++++++++++++++++++++++++++ 2 files changed, 189 insertions(+) -- 2.27.0 ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v2 1/2] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing 2021-09-22 10:16 [PATCH v2 0/2] libperf: Add support for scaling counters obtained from the read() system call during multiplexing Shunsuke Nakamura @ 2021-09-22 10:16 ` Shunsuke Nakamura 2021-09-22 21:34 ` Jiri Olsa 2021-09-22 10:16 ` [PATCH v2 2/2] libperf tests: Add test_stat_multiplexing test Shunsuke Nakamura 1 sibling, 1 reply; 12+ messages in thread From: Shunsuke Nakamura @ 2021-09-22 10:16 UTC (permalink / raw) To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa, namhyung Cc: linux-kernel, linux-perf-users From: nakamura shunsuke <nakamura.shun@fujitsu.com> perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but does not scale counters obtained by read() system call. Add processing to perf_evsel__read() to scale the counters obtained during the read() system call when multiplexing. Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> --- tools/lib/perf/evsel.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c index 8441e3e1aaac..0ebd1d34436f 100644 --- a/tools/lib/perf/evsel.c +++ b/tools/lib/perf/evsel.c @@ -18,6 +18,7 @@ #include <sys/ioctl.h> #include <sys/mman.h> #include <asm/bug.h> +#include <linux/math64.h> void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr, int idx) @@ -321,6 +322,11 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread, if (readn(*fd, count->values, size) <= 0) return -errno; + if (count->ena != count->run) { + if (count->run != 0) + count->val = mul_u64_u64_div64(count->val, count->ena, count->run); + } + return 0; } -- 2.27.0 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing 2021-09-22 10:16 ` [PATCH v2 1/2] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing Shunsuke Nakamura @ 2021-09-22 21:34 ` Jiri Olsa 2021-09-28 9:53 ` nakamura.shun 0 siblings, 1 reply; 12+ messages in thread From: Jiri Olsa @ 2021-09-22 21:34 UTC (permalink / raw) To: Shunsuke Nakamura Cc: peterz, mingo, acme, mark.rutland, alexander.shishkin, namhyung, linux-kernel, linux-perf-users On Wed, Sep 22, 2021 at 07:16:26PM +0900, Shunsuke Nakamura wrote: > From: nakamura shunsuke <nakamura.shun@fujitsu.com> > > perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but > does not scale counters obtained by read() system call. > > Add processing to perf_evsel__read() to scale the counters obtained during the > read() system call when multiplexing. > > > Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> > --- > tools/lib/perf/evsel.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c > index 8441e3e1aaac..0ebd1d34436f 100644 > --- a/tools/lib/perf/evsel.c > +++ b/tools/lib/perf/evsel.c > @@ -18,6 +18,7 @@ > #include <sys/ioctl.h> > #include <sys/mman.h> > #include <asm/bug.h> > +#include <linux/math64.h> > > void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr, > int idx) > @@ -321,6 +322,11 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread, > if (readn(*fd, count->values, size) <= 0) > return -errno; > > + if (count->ena != count->run) { > + if (count->run != 0) > + count->val = mul_u64_u64_div64(count->val, count->ena, count->run); > + } so I think perf stat expect raw values in there and does the scaling by itself, please check following code: read_counters read_affinity_counters read_counter_cpu read_single_counter evsel__read_counter perf_stat_process_counter process_counter_maps process_counter_values perf_counts_values__scale perhaps we could export perf_counts_values__scale if it'd be any help jirka > + > return 0; > } > > -- > 2.27.0 > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing 2021-09-22 21:34 ` Jiri Olsa @ 2021-09-28 9:53 ` nakamura.shun 2021-10-05 16:36 ` Rob Herring 2021-10-07 17:17 ` Jiri Olsa 0 siblings, 2 replies; 12+ messages in thread From: nakamura.shun @ 2021-09-28 9:53 UTC (permalink / raw) To: Jiri Olsa, Rob Herring Cc: peterz, mingo, acme, mark.rutland, alexander.shishkin, namhyung, irogers, linux-perf-users, linux-kernel Hi Jirka > On Wed, Sep 22, 2021 at 07:16:26PM +0900, Shunsuke Nakamura wrote: > > From: nakamura shunsuke <nakamura.shun@fujitsu.com> > > > > perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but > > does not scale counters obtained by read() system call. > > > > Add processing to perf_evsel__read() to scale the counters obtained during the > > read() system call when multiplexing. > > > > > > Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> > > --- > > tools/lib/perf/evsel.c | 6 ++++++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c > > index 8441e3e1aaac..0ebd1d34436f 100644 > > --- a/tools/lib/perf/evsel.c > +++ b/tools/lib/perf/evsel.c > > @@ -18,6 +18,7 @@ > > #include <sys/ioctl.h> > > #include <sys/mman.h> > > #include <asm/bug.h> > > +#include <linux/math64.h> > > > > void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr, > > int idx) > > @@ -321,6 +322,11 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread, > > if (readn(*fd, count->values, size) <= 0) > > return -errno; > > > > + if (count->ena != count->run) { > > + if (count->run != 0) > > + count->val = mul_u64_u64_div64(count->val, count->ena, count->run); > > + } > > so I think perf stat expect raw values in there and does the > scaling by itself, please check following code: > > read_counters > read_affinity_counters > read_counter_cpu > read_single_counter > evsel__read_counter > > perf_stat_process_counter > process_counter_maps > process_counter_values > perf_counts_values__scale > > > perhaps we could export perf_counts_values__scale if it'd be any help Thank you for your comment. The purpose of this patch is to unify the counters obtained with perf_evsel__read() to scaled or unscaled values. perf_evsel__read() gets counter by perf_mmap__read_self() if RDPMC is available, else gets by readn(). In current implementation, caller gets scaled counter if goes through RDPMC path, otherwise gets unscaled counter via readn() path. However caller cannnot know which path were taken. If caller expects a raw value, I think the RDPMC path should also return an unscaled counter. diff --git a/tools/lib/perf/mmap.c b/tools/lib/perf/mmap.c index c89dfa5..aaa4579 100644 --- a/tools/lib/perf/mmap.c +++ b/tools/lib/perf/mmap.c @@ -353,8 +353,6 @@ int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count count->ena += delta; if (idx) count->run += delta; - - cnt = mul_u64_u64_div64(cnt, count->ena, count->run); } count->val = cnt; Rob, do you have any comments? Best Regards Shunsuke ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing 2021-09-28 9:53 ` nakamura.shun @ 2021-10-05 16:36 ` Rob Herring 2021-10-19 4:59 ` nakamura.shun 2021-10-07 17:17 ` Jiri Olsa 1 sibling, 1 reply; 12+ messages in thread From: Rob Herring @ 2021-10-05 16:36 UTC (permalink / raw) To: nakamura.shun Cc: Jiri Olsa, peterz, mingo, acme, mark.rutland, alexander.shishkin, namhyung, irogers, linux-perf-users, linux-kernel On Tue, Sep 28, 2021 at 7:41 AM nakamura.shun@fujitsu.com <nakamura.shun@fujitsu.com> wrote: > > Hi Jirka > > > On Wed, Sep 22, 2021 at 07:16:26PM +0900, Shunsuke Nakamura wrote: > > > From: nakamura shunsuke <nakamura.shun@fujitsu.com> > > > > > > perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but > > > does not scale counters obtained by read() system call. > > > > > > Add processing to perf_evsel__read() to scale the counters obtained during the > > > read() system call when multiplexing. > > > > > > > > > Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> > > > --- > > > tools/lib/perf/evsel.c | 6 ++++++ > > > 1 file changed, 6 insertions(+) > > > > > > diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c > > > index 8441e3e1aaac..0ebd1d34436f 100644 > > > --- a/tools/lib/perf/evsel.c > > +++ b/tools/lib/perf/evsel.c > > > @@ -18,6 +18,7 @@ > > > #include <sys/ioctl.h> > > > #include <sys/mman.h> > > > #include <asm/bug.h> > > > +#include <linux/math64.h> > > > > > > void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr, > > > int idx) > > > @@ -321,6 +322,11 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread, > > > if (readn(*fd, count->values, size) <= 0) > > > return -errno; > > > > > > + if (count->ena != count->run) { > > > + if (count->run != 0) > > > + count->val = mul_u64_u64_div64(count->val, count->ena, count->run); > > > + } > > > > so I think perf stat expect raw values in there and does the > > scaling by itself, please check following code: > > > > read_counters > > read_affinity_counters > > read_counter_cpu > > read_single_counter > > evsel__read_counter > > > > perf_stat_process_counter > > process_counter_maps > > process_counter_values > > perf_counts_values__scale > > > > > > perhaps we could export perf_counts_values__scale if it'd be any help > > Thank you for your comment. > > The purpose of this patch is to unify the counters obtained with > perf_evsel__read() to scaled or unscaled values. > > perf_evsel__read() gets counter by perf_mmap__read_self() if RDPMC is > available, else gets by readn(). In current implementation, caller > gets scaled counter if goes through RDPMC path, otherwise gets unscaled > counter via readn() path. > > However caller cannnot know which path were taken. > > If caller expects a raw value, I think the RDPMC path should also > return an unscaled counter. > > diff --git a/tools/lib/perf/mmap.c b/tools/lib/perf/mmap.c > index c89dfa5..aaa4579 100644 > --- a/tools/lib/perf/mmap.c > +++ b/tools/lib/perf/mmap.c > @@ -353,8 +353,6 @@ int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count > count->ena += delta; > if (idx) > count->run += delta; > - > - cnt = mul_u64_u64_div64(cnt, count->ena, count->run); > } > > count->val = cnt; > > Rob, do you have any comments? Submit a proper patch with the above. Rob ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing 2021-10-05 16:36 ` Rob Herring @ 2021-10-19 4:59 ` nakamura.shun 0 siblings, 0 replies; 12+ messages in thread From: nakamura.shun @ 2021-10-19 4:59 UTC (permalink / raw) To: Rob Herring Cc: Jiri Olsa, peterz, mingo, acme, mark.rutland, alexander.shishkin, namhyung, irogers, linux-perf-users, linux-kernel Hi Rob Sorry for the late reply. > > > On Wed, Sep 22, 2021 at 07:16:26PM +0900, Shunsuke Nakamura wrote: > > > > From: nakamura shunsuke <nakamura.shun@fujitsu.com> > > > > > > > > perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but > > > > does not scale counters obtained by read() system call. > > > > > > > > Add processing to perf_evsel__read() to scale the counters obtained during the > > > > read() system call when multiplexing. > > > > > > > > > > > > Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> > > > > --- > > > > tools/lib/perf/evsel.c | 6 ++++++ > > > > 1 file changed, 6 insertions(+) > > > > > > > > diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c > > > > index 8441e3e1aaac..0ebd1d34436f 100644 > > > > --- a/tools/lib/perf/evsel.c > > > +++ b/tools/lib/perf/evsel.c > > > > @@ -18,6 +18,7 @@ > > > > #include <sys/ioctl.h> > > > > #include <sys/mman.h> > > > > #include <asm/bug.h> > > > > +#include <linux/math64.h> > > > > > > > > void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr, > > > > int idx) > > > > @@ -321,6 +322,11 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread, > > > > if (readn(*fd, count->values, size) <= 0) > > > > return -errno; > > > > > > > > + if (count->ena != count->run) { > > > > + if (count->run != 0) > > > > + count->val = mul_u64_u64_div64(count->val, count->ena, count->run); > > > > + } > > > > > > so I think perf stat expect raw values in there and does the > > > scaling by itself, please check following code: > > > > > > read_counters > > > read_affinity_counters > > > read_counter_cpu > > > read_single_counter > > > evsel__read_counter > > > > > > perf_stat_process_counter > > > process_counter_maps > > > process_counter_values > > > perf_counts_values__scale > > > > > > > > > perhaps we could export perf_counts_values__scale if it'd be any help > > > > Thank you for your comment. > > > > The purpose of this patch is to unify the counters obtained with > > perf_evsel__read() to scaled or unscaled values. > > > > perf_evsel__read() gets counter by perf_mmap__read_self() if RDPMC is > > available, else gets by readn(). In current implementation, caller > > gets scaled counter if goes through RDPMC path, otherwise gets unscaled > > counter via readn() path. > > > > However caller cannnot know which path were taken. > > > > If caller expects a raw value, I think the RDPMC path should also > > return an unscaled counter. > > > > diff --git a/tools/lib/perf/mmap.c b/tools/lib/perf/mmap.c > > index c89dfa5..aaa4579 100644 > > --- a/tools/lib/perf/mmap.c > > +++ b/tools/lib/perf/mmap.c > > @@ -353,8 +353,6 @@ int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count > > count->ena += delta; > > if (idx) > > count->run += delta; > > - > > - cnt = mul_u64_u64_div64(cnt, count->ena, count->run); > > } > > > > count->val = cnt; > > > > Rob, do you have any comments? > > Submit a proper patch with the above. Thank you for your comment. I will send the v3 patch. Best Regards Shunsuke ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing 2021-09-28 9:53 ` nakamura.shun 2021-10-05 16:36 ` Rob Herring @ 2021-10-07 17:17 ` Jiri Olsa 2021-10-19 5:00 ` nakamura.shun 1 sibling, 1 reply; 12+ messages in thread From: Jiri Olsa @ 2021-10-07 17:17 UTC (permalink / raw) To: nakamura.shun Cc: Rob Herring, peterz, mingo, acme, mark.rutland, alexander.shishkin, namhyung, irogers, linux-perf-users, linux-kernel On Tue, Sep 28, 2021 at 09:53:24AM +0000, nakamura.shun@fujitsu.com wrote: > Hi Jirka > > > On Wed, Sep 22, 2021 at 07:16:26PM +0900, Shunsuke Nakamura wrote: > > > From: nakamura shunsuke <nakamura.shun@fujitsu.com> > > > > > > perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but > > > does not scale counters obtained by read() system call. > > > > > > Add processing to perf_evsel__read() to scale the counters obtained during the > > > read() system call when multiplexing. > > > > > > > > > Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> > > > --- > > > tools/lib/perf/evsel.c | 6 ++++++ > > > 1 file changed, 6 insertions(+) > > > > > > diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c > > > index 8441e3e1aaac..0ebd1d34436f 100644 > > > --- a/tools/lib/perf/evsel.c > > +++ b/tools/lib/perf/evsel.c > > > @@ -18,6 +18,7 @@ > > > #include <sys/ioctl.h> > > > #include <sys/mman.h> > > > #include <asm/bug.h> > > > +#include <linux/math64.h> > > > > > > void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr, > > > int idx) > > > @@ -321,6 +322,11 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread, > > > if (readn(*fd, count->values, size) <= 0) > > > return -errno; > > > > > > + if (count->ena != count->run) { > > > + if (count->run != 0) > > > + count->val = mul_u64_u64_div64(count->val, count->ena, count->run); > > > + } > > > > so I think perf stat expect raw values in there and does the > > scaling by itself, please check following code: > > > > read_counters > > read_affinity_counters > > read_counter_cpu > > read_single_counter > > evsel__read_counter > > > > perf_stat_process_counter > > process_counter_maps > > process_counter_values > > perf_counts_values__scale > > > > > > perhaps we could export perf_counts_values__scale if it'd be any help > > Thank you for your comment. > > The purpose of this patch is to unify the counters obtained with > perf_evsel__read() to scaled or unscaled values. > > perf_evsel__read() gets counter by perf_mmap__read_self() if RDPMC is > available, else gets by readn(). In current implementation, caller > gets scaled counter if goes through RDPMC path, otherwise gets unscaled > counter via readn() path. > > However caller cannnot know which path were taken. > > If caller expects a raw value, I think the RDPMC path should also > return an unscaled counter. > > diff --git a/tools/lib/perf/mmap.c b/tools/lib/perf/mmap.c > index c89dfa5..aaa4579 100644 > --- a/tools/lib/perf/mmap.c > +++ b/tools/lib/perf/mmap.c > @@ -353,8 +353,6 @@ int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count > count->ena += delta; > if (idx) > count->run += delta; > - > - cnt = mul_u64_u64_div64(cnt, count->ena, count->run); perf stat does not mmap counters so this should not be invoked within perf stat.. but we should be consistent and scale after calling perf_evsel__read.. and give user the possibility to get un-scaled counts that perhaps brings new feature.. mmap perf stat counters to invoke the fast reading path for counters.. IIRC it should be matter just to mmap the first 'user' page thanks, jirka > } > > count->val = cnt; > > Rob, do you have any comments? > > Best Regards > Shunsuke ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing 2021-10-07 17:17 ` Jiri Olsa @ 2021-10-19 5:00 ` nakamura.shun 2021-11-08 0:49 ` nakamura.shun 0 siblings, 1 reply; 12+ messages in thread From: nakamura.shun @ 2021-10-19 5:00 UTC (permalink / raw) To: Jiri Olsa Cc: Rob Herring, peterz, mingo, acme, mark.rutland, alexander.shishkin, namhyung, irogers, linux-perf-users, linux-kernel Hi Jirka Sorry for the late reply. > > > On Wed, Sep 22, 2021 at 07:16:26PM +0900, Shunsuke Nakamura wrote: > > > > From: nakamura shunsuke <nakamura.shun@fujitsu.com> > > > > > > > > perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but > > > > does not scale counters obtained by read() system call. > > > > > > > > Add processing to perf_evsel__read() to scale the counters obtained during the > > > > read() system call when multiplexing. > > > > > > > > > > > > Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> > > > > --- > > > > tools/lib/perf/evsel.c | 6 ++++++ > > > > 1 file changed, 6 insertions(+) > > > > > > > > diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c > > > > index 8441e3e1aaac..0ebd1d34436f 100644 > > > > --- a/tools/lib/perf/evsel.c > > > +++ b/tools/lib/perf/evsel.c > > > > @@ -18,6 +18,7 @@ > > > > #include <sys/ioctl.h> > > > > #include <sys/mman.h> > > > > #include <asm/bug.h> > > > > +#include <linux/math64.h> > > > > > > > > void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr, > > > > int idx) > > > > @@ -321,6 +322,11 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread, > > > > if (readn(*fd, count->values, size) <= 0) > > > > return -errno; > > > > > > > > + if (count->ena != count->run) { > > > > + if (count->run != 0) > > > > + count->val = mul_u64_u64_div64(count->val, count->ena, count->run); > > > > + } > > > > > > so I think perf stat expect raw values in there and does the > > > scaling by itself, please check following code: > > > > > > read_counters > > > read_affinity_counters > > > read_counter_cpu > > > read_single_counter > > > evsel__read_counter > > > > > > perf_stat_process_counter > > > process_counter_maps > > > process_counter_values > > > perf_counts_values__scale > > > > > > > > > perhaps we could export perf_counts_values__scale if it'd be any help > > > > Thank you for your comment. > > > > The purpose of this patch is to unify the counters obtained with > > perf_evsel__read() to scaled or unscaled values. > > > > perf_evsel__read() gets counter by perf_mmap__read_self() if RDPMC is > > available, else gets by readn(). In current implementation, caller > > gets scaled counter if goes through RDPMC path, otherwise gets unscaled > > counter via readn() path. > > > > However caller cannnot know which path were taken. > > > > If caller expects a raw value, I think the RDPMC path should also > > return an unscaled counter. > > > > diff --git a/tools/lib/perf/mmap.c b/tools/lib/perf/mmap.c > > index c89dfa5..aaa4579 100644 > > --- a/tools/lib/perf/mmap.c > > +++ b/tools/lib/perf/mmap.c > > @@ -353,8 +353,6 @@ int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count > > count->ena += delta; > > if (idx) > > count->run += delta; > > - > > - cnt = mul_u64_u64_div64(cnt, count->ena, count->run); > > perf stat does not mmap counters so this should not be invoked > within perf stat.. but we should be consistent and scale after > calling perf_evsel__read.. and give user the possibility to get > un-scaled counts > > that perhaps brings new feature.. mmap perf stat counters to invoke > the fast reading path for counters.. IIRC it should be matter just > to mmap the first 'user' page Thank you for your comment. I think it will be good that perf stat supports rdpmc. I will modify the patch. Best Regards Shunsuke ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing 2021-10-19 5:00 ` nakamura.shun @ 2021-11-08 0:49 ` nakamura.shun 2021-11-14 16:16 ` Jiri Olsa 0 siblings, 1 reply; 12+ messages in thread From: nakamura.shun @ 2021-11-08 0:49 UTC (permalink / raw) To: Jiri Olsa Cc: Rob Herring, peterz, mingo, acme, mark.rutland, alexander.shishkin, namhyung, irogers, linux-perf-users, linux-kernel Hi Jirka > > > > On Wed, Sep 22, 2021 at 07:16:26PM +0900, Shunsuke Nakamura wrote: > > > > > From: nakamura shunsuke <nakamura.shun@fujitsu.com> > > > > > > > > > > perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but > > > > > does not scale counters obtained by read() system call. > > > > > > > > > > Add processing to perf_evsel__read() to scale the counters obtained during the > > > > > read() system call when multiplexing. > > > > > > > > > > > > > > > Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> > > > > > --- > > > > > tools/lib/perf/evsel.c | 6 ++++++ > > > > > 1 file changed, 6 insertions(+) > > > > > > > > > > diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c > > > > > index 8441e3e1aaac..0ebd1d34436f 100644 > > > > > --- a/tools/lib/perf/evsel.c > > > > +++ b/tools/lib/perf/evsel.c > > > > > @@ -18,6 +18,7 @@ > > > > > #include <sys/ioctl.h> > > > > > #include <sys/mman.h> > > > > > #include <asm/bug.h> > > > > > +#include <linux/math64.h> > > > > > > > > > > void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr, > > > > > int idx) > > > > > @@ -321,6 +322,11 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread, > > > > > if (readn(*fd, count->values, size) <= 0) > > > > > return -errno; > > > > > > > > > > + if (count->ena != count->run) { > > > > > + if (count->run != 0) > > > > > + count->val = mul_u64_u64_div64(count->val, count->ena, count->run); > > > > > + } > > > > > > > > so I think perf stat expect raw values in there and does the > > > > scaling by itself, please check following code: > > > > > > > > read_counters > > > > read_affinity_counters > > > > read_counter_cpu > > > > read_single_counter > > > > evsel__read_counter > > > > > > > > perf_stat_process_counter > > > > process_counter_maps > > > > process_counter_values > > > > perf_counts_values__scale > > > > > > > > > > > > perhaps we could export perf_counts_values__scale if it'd be any help > > > > > > Thank you for your comment. > > > > > > The purpose of this patch is to unify the counters obtained with > > > perf_evsel__read() to scaled or unscaled values. > > > > > > perf_evsel__read() gets counter by perf_mmap__read_self() if RDPMC is > > > available, else gets by readn(). In current implementation, caller > > > gets scaled counter if goes through RDPMC path, otherwise gets unscaled > > > counter via readn() path. > > > > > > However caller cannnot know which path were taken. > > > > > > If caller expects a raw value, I think the RDPMC path should also > > > return an unscaled counter. > > > > > > diff --git a/tools/lib/perf/mmap.c b/tools/lib/perf/mmap.c > > > index c89dfa5..aaa4579 100644 > > > --- a/tools/lib/perf/mmap.c > > > +++ b/tools/lib/perf/mmap.c > > > @@ -353,8 +353,6 @@ int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count > > > count->ena += delta; > > > if (idx) > > > count->run += delta; > > > - > > > - cnt = mul_u64_u64_div64(cnt, count->ena, count->run); > > > > perf stat does not mmap counters so this should not be invoked > > within perf stat.. but we should be consistent and scale after > > calling perf_evsel__read.. and give user the possibility to get > > un-scaled counts > > > > that perhaps brings new feature.. mmap perf stat counters to invoke > > the fast reading path for counters.. IIRC it should be matter just > > to mmap the first 'user' page > > Thank you for your comment. > I think it will be good that perf stat supports rdpmc. > > I will modify the patch. I think rdpmc cannot measure the command/program specified in perf stat because it measures the calling thread of perf_event_open. If my understanding is wrong, please point it out to me. Best Regards Shunsuke ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/2] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing 2021-11-08 0:49 ` nakamura.shun @ 2021-11-14 16:16 ` Jiri Olsa 0 siblings, 0 replies; 12+ messages in thread From: Jiri Olsa @ 2021-11-14 16:16 UTC (permalink / raw) To: nakamura.shun Cc: Rob Herring, peterz, mingo, acme, mark.rutland, alexander.shishkin, namhyung, irogers, linux-perf-users, linux-kernel On Mon, Nov 08, 2021 at 12:49:24AM +0000, nakamura.shun@fujitsu.com wrote: > Hi Jirka > > > > > > On Wed, Sep 22, 2021 at 07:16:26PM +0900, Shunsuke Nakamura wrote: > > > > > > From: nakamura shunsuke <nakamura.shun@fujitsu.com> > > > > > > > > > > > > perf_evsel__read() scales counters obtained by RDPMC during multiplexing, but > > > > > > does not scale counters obtained by read() system call. > > > > > > > > > > > > Add processing to perf_evsel__read() to scale the counters obtained during the > > > > > > read() system call when multiplexing. > > > > > > > > > > > > > > > > > > Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> > > > > > > --- > > > > > > tools/lib/perf/evsel.c | 6 ++++++ > > > > > > 1 file changed, 6 insertions(+) > > > > > > > > > > > > diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c > > > > > > index 8441e3e1aaac..0ebd1d34436f 100644 > > > > > > --- a/tools/lib/perf/evsel.c > > > > > +++ b/tools/lib/perf/evsel.c > > > > > > @@ -18,6 +18,7 @@ > > > > > > #include <sys/ioctl.h> > > > > > > #include <sys/mman.h> > > > > > > #include <asm/bug.h> > > > > > > +#include <linux/math64.h> > > > > > > > > > > > > void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr, > > > > > > int idx) > > > > > > @@ -321,6 +322,11 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread, > > > > > > if (readn(*fd, count->values, size) <= 0) > > > > > > return -errno; > > > > > > > > > > > > + if (count->ena != count->run) { > > > > > > + if (count->run != 0) > > > > > > + count->val = mul_u64_u64_div64(count->val, count->ena, count->run); > > > > > > + } > > > > > > > > > > so I think perf stat expect raw values in there and does the > > > > > scaling by itself, please check following code: > > > > > > > > > > read_counters > > > > > read_affinity_counters > > > > > read_counter_cpu > > > > > read_single_counter > > > > > evsel__read_counter > > > > > > > > > > perf_stat_process_counter > > > > > process_counter_maps > > > > > process_counter_values > > > > > perf_counts_values__scale > > > > > > > > > > > > > > > perhaps we could export perf_counts_values__scale if it'd be any help > > > > > > > > Thank you for your comment. > > > > > > > > The purpose of this patch is to unify the counters obtained with > > > > perf_evsel__read() to scaled or unscaled values. > > > > > > > > perf_evsel__read() gets counter by perf_mmap__read_self() if RDPMC is > > > > available, else gets by readn(). In current implementation, caller > > > > gets scaled counter if goes through RDPMC path, otherwise gets unscaled > > > > counter via readn() path. > > > > > > > > However caller cannnot know which path were taken. > > > > > > > > If caller expects a raw value, I think the RDPMC path should also > > > > return an unscaled counter. > > > > > > > > diff --git a/tools/lib/perf/mmap.c b/tools/lib/perf/mmap.c > > > > index c89dfa5..aaa4579 100644 > > > > --- a/tools/lib/perf/mmap.c > > > > +++ b/tools/lib/perf/mmap.c > > > > @@ -353,8 +353,6 @@ int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count > > > > count->ena += delta; > > > > if (idx) > > > > count->run += delta; > > > > - > > > > - cnt = mul_u64_u64_div64(cnt, count->ena, count->run); > > > > > > perf stat does not mmap counters so this should not be invoked > > > within perf stat.. but we should be consistent and scale after > > > calling perf_evsel__read.. and give user the possibility to get > > > un-scaled counts > > > > > > that perhaps brings new feature.. mmap perf stat counters to invoke > > > the fast reading path for counters.. IIRC it should be matter just > > > to mmap the first 'user' page > > > > Thank you for your comment. > > I think it will be good that perf stat supports rdpmc. > > > > I will modify the patch. > > I think rdpmc cannot measure the command/program specified in perf stat > because it measures the calling thread of perf_event_open. > If my understanding is wrong, please point it out to me. right, I guess we could use that just for system wide monitoring, where we open counter for each cpu jirka ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v2 2/2] libperf tests: Add test_stat_multiplexing test 2021-09-22 10:16 [PATCH v2 0/2] libperf: Add support for scaling counters obtained from the read() system call during multiplexing Shunsuke Nakamura 2021-09-22 10:16 ` [PATCH v2 1/2] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing Shunsuke Nakamura @ 2021-09-22 10:16 ` Shunsuke Nakamura 2021-10-08 19:11 ` Arnaldo Carvalho de Melo 1 sibling, 1 reply; 12+ messages in thread From: Shunsuke Nakamura @ 2021-09-22 10:16 UTC (permalink / raw) To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa, namhyung Cc: linux-kernel, linux-perf-users From: nakamura shunsuke <nakamura.shun@fujitsu.com> Adds a test for a counter obtained using read() system call during multiplexing Committer testing: $ sudo make tests -C tools/lib/perf/ V=1 make: Entering directory '/home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/lib/perf' make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=. obj=libperf make -C /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/lib/api/ O= libapi.a make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=./fd obj=libapi make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=./fs obj=libapi make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=. obj=tests make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=./tests obj=tests running static: - running tests/test-cpumap.c...OK - running tests/test-threadmap.c...OK - running tests/test-evlist.c... count = 502273264, run = 264105562, enable = 444092845 count = 501850699, run = 264100505, enable = 444088400 count = 499613780, run = 264096660, enable = 444085073 count = 499569398, run = 264501528, enable = 444081268 count = 500717228, run = 265495387, enable = 444077029 count = 501260539, run = 266488226, enable = 444071569 count = 501023910, run = 267469613, enable = 444065303 count = 501188267, run = 268461294, enable = 444058413 count = 503096683, run = 269450511, enable = 444050567 count = 503141904, run = 269587790, enable = 444041735 count = 503454490, run = 268590361, enable = 444030544 count = 504838207, run = 267592624, enable = 444021005 count = 505458594, run = 266594648, enable = 444001280 count = 504269433, run = 265595795, enable = 443991377 count = 502625858, run = 264599374, enable = 443982141 Expected: 502065774 High: 505458594 Low: 499569398 Average: 502292150 Average Error = 0.05% count = 26129, run = 134550, enable = 134550 count = 34035, run = 140540, enable = 140540 count = 40189, run = 141126, enable = 141126 count = 45263, run = 138290, enable = 138290 count = 51381, run = 136966, enable = 136966 count = 57499, run = 134993, enable = 134993 count = 63617, run = 132320, enable = 132320 count = 71714, run = 132047, enable = 132047 count = 77832, run = 128157, enable = 128157 count = 0, run = 0, enable = 123662 count = 0, run = 0, enable = 116753 count = 0, run = 0, enable = 110304 count = 0, run = 0, enable = 103834 count = 0, run = 0, enable = 97465 count = 0, run = 0, enable = 91110 OK - running tests/test-evsel.c... loop = 65536, count = 334303 loop = 131072, count = 656954 loop = 262144, count = 1321090 loop = 524288, count = 2652616 loop = 1048576, count = 5262140 loop = 65536, count = 393726 loop = 131072, count = 989345 loop = 262144, count = 1986206 loop = 524288, count = 3831652 loop = 1048576, count = 7747957 OK running dynamic: - running tests/test-cpumap.c...OK - running tests/test-threadmap.c...OK - running tests/test-evlist.c... count = 501975838, run = 247092568, enable = 415093723 count = 502491858, run = 247087145, enable = 415088830 count = 502571026, run = 247083574, enable = 415085139 count = 502743767, run = 247080183, enable = 415081197 count = 503259008, run = 247779561, enable = 415076990 count = 503169622, run = 248773141, enable = 415072091 count = 503430373, run = 249816445, enable = 415066529 count = 502597951, run = 250808797, enable = 415059863 count = 502211776, run = 251799199, enable = 415051831 count = 501921896, run = 252000721, enable = 415042717 count = 501441904, run = 251309399, enable = 415031152 count = 501224474, run = 250311107, enable = 415021479 count = 500502827, run = 249260579, enable = 415012168 count = 500775622, run = 248259621, enable = 415002830 count = 501377375, run = 247261134, enable = 414993890 Expected: 501953755 High: 503430373 Low: 500502827 Average: 502113021 Average Error = 0.03% count = 26098, run = 131826, enable = 131826 count = 33689, run = 138284, enable = 138284 count = 39808, run = 139152, enable = 139152 count = 45927, run = 138880, enable = 138880 count = 51049, run = 135405, enable = 135405 count = 57168, run = 133406, enable = 133406 count = 63287, run = 130736, enable = 130736 count = 71392, run = 130474, enable = 130474 count = 77794, run = 127203, enable = 127203 count = 0, run = 0, enable = 123172 count = 0, run = 0, enable = 116856 count = 0, run = 0, enable = 110557 count = 0, run = 0, enable = 104282 count = 0, run = 0, enable = 97929 count = 0, run = 0, enable = 91792 OK - running tests/test-evsel.c... loop = 65536, count = 334400 loop = 131072, count = 656908 loop = 262144, count = 1315441 loop = 524288, count = 2657481 loop = 1048576, count = 5258193 loop = 65536, count = 491938 loop = 131072, count = 965755 loop = 262144, count = 1666277 loop = 524288, count = 3516123 loop = 1048576, count = 6759204 OK make: Leaving directory '/home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/lib/perf' Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> --- tools/lib/perf/tests/test-evlist.c | 183 +++++++++++++++++++++++++++++ 1 file changed, 183 insertions(+) diff --git a/tools/lib/perf/tests/test-evlist.c b/tools/lib/perf/tests/test-evlist.c index c67c83399170..ca7a9542f40e 100644 --- a/tools/lib/perf/tests/test-evlist.c +++ b/tools/lib/perf/tests/test-evlist.c @@ -21,6 +21,10 @@ #include "tests.h" #include <internal/evsel.h> +#define EVENT_NUM 15 +#define WAIT_COUNT 100000000UL + + static int libperf_print(enum libperf_print_level level, const char *fmt, va_list ap) { @@ -413,6 +417,184 @@ static int test_mmap_cpus(void) return 0; } +static double display_error(long long average, + long long high, + long long low, + long long expected) { + + double error; + + error = (((double)average - expected) / expected) * 100.0; + + __T_VERBOSE(" Expected: %lld\n", expected); + __T_VERBOSE(" High: %lld Low: %lld Average: %lld\n", + high, low, average); + + __T_VERBOSE(" Average Error = %.2f%%\n",error); + + return error; + +} + +static int test_stat_multiplexing(void) +{ + struct perf_counts_values expected_counts = { .val = 0 }; + struct perf_counts_values multiplexing_counts[EVENT_NUM] = {{ .val = 0 },}; + struct perf_thread_map *threads; + struct perf_evlist *evlist; + struct perf_evsel *evsel; + struct perf_event_attr attr = { + .type = PERF_TYPE_HARDWARE, + .config = PERF_COUNT_HW_INSTRUCTIONS, + .read_format = PERF_FORMAT_TOTAL_TIME_ENABLED | + PERF_FORMAT_TOTAL_TIME_RUNNING, + .disabled = 1, + }; + int err, i, nonzero=0; + unsigned long count; + long long max = 0, min = 0, avg = 0; + double error = 0.0; + + + /* read for non-multiplexing event count */ + threads = perf_thread_map__new_dummy(); + __T("failed to create threads", threads); + + perf_thread_map__set_pid(threads, 0, 0); + + evsel = perf_evsel__new(&attr); + __T("failed to create evsel", evsel); + + err = perf_evsel__open(evsel, NULL, threads); + __T("failed to open evsel", err == 0); + + err = perf_evsel__enable(evsel); + __T("failed to enable evsel", err == 0); + + /* wait loop */ + count = WAIT_COUNT; + while(count--); + + perf_evsel__read(evsel, 0, 0, &expected_counts); + __T("failed to read value for evsel", expected_counts.val != 0); + __T("failed to read non-multiplexing event count", + expected_counts.ena == expected_counts.run); + + err = perf_evsel__disable(evsel); + __T("failed to enable evsel", err == 0); + + perf_evsel__close(evsel); + perf_evsel__delete(evsel); + + perf_thread_map__put(threads); + + + /* read for multiplexing event count */ + threads = perf_thread_map__new_dummy(); + __T("failed to create threads", threads); + + perf_thread_map__set_pid(threads, 0, 0); + + evlist = perf_evlist__new(); + __T("failed to create evlist", evlist); + + for (i = 0; i < EVENT_NUM; i++) { + evsel = perf_evsel__new(&attr); + __T("failed to create evsel1", evsel); + + perf_evlist__add(evlist, evsel); + } + perf_evlist__set_maps(evlist, NULL, threads); + + err = perf_evlist__open(evlist); + __T("failed to open evsel", err == 0); + + perf_evlist__enable(evlist); + + /* wait loop */ + count = WAIT_COUNT; + while(count--); + + i = 0; + perf_evlist__for_each_evsel(evlist, evsel) { + perf_evsel__read(evsel, 0, 0, &multiplexing_counts[i]); + __T("failed to read value for evsel", multiplexing_counts[i].val != 0); + i++; + } + + perf_evlist__disable(evlist); + + + min = multiplexing_counts[0].val; + for (i = 0; i < EVENT_NUM; i++) { + __T_VERBOSE("\tcount = %lu, run = %lu, enable = %lu\n", + multiplexing_counts[i].val, multiplexing_counts[i].run, + multiplexing_counts[i].ena); + + if (multiplexing_counts[i].val > max) + max = multiplexing_counts[i].val; + + if (multiplexing_counts[i].val < min) + min = multiplexing_counts[i].val; + + avg += multiplexing_counts[i].val; + + if (multiplexing_counts[i].val != 0) + nonzero++; + } + + avg = avg / nonzero; + + error = display_error(avg, max, min, expected_counts.val); + + __T("Error out of range!", ((error <= 1.0) && (error >= -1.0))); + + perf_evlist__close(evlist); + perf_evlist__delete(evlist); + + perf_thread_map__put(threads); + + + /* Verify that no division by zero occurs */ + threads = perf_thread_map__new_dummy(); + __T("failed to create threads", threads); + + perf_thread_map__set_pid(threads, 0, 0); + + evlist = perf_evlist__new(); + __T("failed to create evlist", evlist); + + for (i = 0; i < EVENT_NUM; i++) { + evsel = perf_evsel__new(&attr); + __T("failed to create evsel1", evsel); + + perf_evlist__add(evlist, evsel); + } + perf_evlist__set_maps(evlist, NULL, threads); + + err = perf_evlist__open(evlist); + __T("failed to open evsel", err == 0); + + perf_evlist__enable(evlist); + + /* Reproduce run=0 by not executing wait loop. */ + + i = 0; + perf_evlist__for_each_evsel(evlist, evsel) { + perf_evsel__read(evsel, 0, 0, &multiplexing_counts[i]); + __T_VERBOSE("\tcount = %lu, run = %lu, enable = %lu\n", + multiplexing_counts[i].val, multiplexing_counts[i].run, + multiplexing_counts[i].ena); + i++; + } + + perf_evlist__close(evlist); + perf_evlist__delete(evlist); + + perf_thread_map__put(threads); + return 0; +} + int test_evlist(int argc, char **argv) { __T_START; @@ -424,6 +606,7 @@ int test_evlist(int argc, char **argv) test_stat_thread_enable(); test_mmap_thread(); test_mmap_cpus(); + test_stat_multiplexing(); __T_END; return tests_failed == 0 ? 0 : -1; -- 2.27.0 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v2 2/2] libperf tests: Add test_stat_multiplexing test 2021-09-22 10:16 ` [PATCH v2 2/2] libperf tests: Add test_stat_multiplexing test Shunsuke Nakamura @ 2021-10-08 19:11 ` Arnaldo Carvalho de Melo 0 siblings, 0 replies; 12+ messages in thread From: Arnaldo Carvalho de Melo @ 2021-10-08 19:11 UTC (permalink / raw) To: Shunsuke Nakamura Cc: Rob Herring, peterz, mingo, mark.rutland, alexander.shishkin, jolsa, namhyung, linux-kernel, linux-perf-users Em Wed, Sep 22, 2021 at 07:16:27PM +0900, Shunsuke Nakamura escreveu: > From: nakamura shunsuke <nakamura.shun@fujitsu.com> > > Adds a test for a counter obtained using read() system call during multiplexing You forgot to add Rob to the CC list, can I have Acked-by or Reviewed-by? Thanks, - Arnaldo > Committer testing: > > $ sudo make tests -C tools/lib/perf/ V=1 > make: Entering directory '/home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/lib/perf' > make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=. obj=libperf > make -C /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/lib/api/ O= libapi.a > make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=./fd obj=libapi > make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=./fs obj=libapi > make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=. obj=tests > make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=./tests obj=tests > running static: > - running tests/test-cpumap.c...OK > - running tests/test-threadmap.c...OK > - running tests/test-evlist.c... > count = 502273264, run = 264105562, enable = 444092845 > count = 501850699, run = 264100505, enable = 444088400 > count = 499613780, run = 264096660, enable = 444085073 > count = 499569398, run = 264501528, enable = 444081268 > count = 500717228, run = 265495387, enable = 444077029 > count = 501260539, run = 266488226, enable = 444071569 > count = 501023910, run = 267469613, enable = 444065303 > count = 501188267, run = 268461294, enable = 444058413 > count = 503096683, run = 269450511, enable = 444050567 > count = 503141904, run = 269587790, enable = 444041735 > count = 503454490, run = 268590361, enable = 444030544 > count = 504838207, run = 267592624, enable = 444021005 > count = 505458594, run = 266594648, enable = 444001280 > count = 504269433, run = 265595795, enable = 443991377 > count = 502625858, run = 264599374, enable = 443982141 > Expected: 502065774 > High: 505458594 Low: 499569398 Average: 502292150 > Average Error = 0.05% > count = 26129, run = 134550, enable = 134550 > count = 34035, run = 140540, enable = 140540 > count = 40189, run = 141126, enable = 141126 > count = 45263, run = 138290, enable = 138290 > count = 51381, run = 136966, enable = 136966 > count = 57499, run = 134993, enable = 134993 > count = 63617, run = 132320, enable = 132320 > count = 71714, run = 132047, enable = 132047 > count = 77832, run = 128157, enable = 128157 > count = 0, run = 0, enable = 123662 > count = 0, run = 0, enable = 116753 > count = 0, run = 0, enable = 110304 > count = 0, run = 0, enable = 103834 > count = 0, run = 0, enable = 97465 > count = 0, run = 0, enable = 91110 > OK > - running tests/test-evsel.c... > loop = 65536, count = 334303 > loop = 131072, count = 656954 > loop = 262144, count = 1321090 > loop = 524288, count = 2652616 > loop = 1048576, count = 5262140 > loop = 65536, count = 393726 > loop = 131072, count = 989345 > loop = 262144, count = 1986206 > loop = 524288, count = 3831652 > loop = 1048576, count = 7747957 > OK > running dynamic: > - running tests/test-cpumap.c...OK > - running tests/test-threadmap.c...OK > - running tests/test-evlist.c... > count = 501975838, run = 247092568, enable = 415093723 > count = 502491858, run = 247087145, enable = 415088830 > count = 502571026, run = 247083574, enable = 415085139 > count = 502743767, run = 247080183, enable = 415081197 > count = 503259008, run = 247779561, enable = 415076990 > count = 503169622, run = 248773141, enable = 415072091 > count = 503430373, run = 249816445, enable = 415066529 > count = 502597951, run = 250808797, enable = 415059863 > count = 502211776, run = 251799199, enable = 415051831 > count = 501921896, run = 252000721, enable = 415042717 > count = 501441904, run = 251309399, enable = 415031152 > count = 501224474, run = 250311107, enable = 415021479 > count = 500502827, run = 249260579, enable = 415012168 > count = 500775622, run = 248259621, enable = 415002830 > count = 501377375, run = 247261134, enable = 414993890 > Expected: 501953755 > High: 503430373 Low: 500502827 Average: 502113021 > Average Error = 0.03% > count = 26098, run = 131826, enable = 131826 > count = 33689, run = 138284, enable = 138284 > count = 39808, run = 139152, enable = 139152 > count = 45927, run = 138880, enable = 138880 > count = 51049, run = 135405, enable = 135405 > count = 57168, run = 133406, enable = 133406 > count = 63287, run = 130736, enable = 130736 > count = 71392, run = 130474, enable = 130474 > count = 77794, run = 127203, enable = 127203 > count = 0, run = 0, enable = 123172 > count = 0, run = 0, enable = 116856 > count = 0, run = 0, enable = 110557 > count = 0, run = 0, enable = 104282 > count = 0, run = 0, enable = 97929 > count = 0, run = 0, enable = 91792 > OK > - running tests/test-evsel.c... > loop = 65536, count = 334400 > loop = 131072, count = 656908 > loop = 262144, count = 1315441 > loop = 524288, count = 2657481 > loop = 1048576, count = 5258193 > loop = 65536, count = 491938 > loop = 131072, count = 965755 > loop = 262144, count = 1666277 > loop = 524288, count = 3516123 > loop = 1048576, count = 6759204 > OK > make: Leaving directory '/home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/lib/perf' > > > Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> > --- > tools/lib/perf/tests/test-evlist.c | 183 +++++++++++++++++++++++++++++ > 1 file changed, 183 insertions(+) > > diff --git a/tools/lib/perf/tests/test-evlist.c b/tools/lib/perf/tests/test-evlist.c > index c67c83399170..ca7a9542f40e 100644 > --- a/tools/lib/perf/tests/test-evlist.c > +++ b/tools/lib/perf/tests/test-evlist.c > @@ -21,6 +21,10 @@ > #include "tests.h" > #include <internal/evsel.h> > > +#define EVENT_NUM 15 > +#define WAIT_COUNT 100000000UL > + > + > static int libperf_print(enum libperf_print_level level, > const char *fmt, va_list ap) > { > @@ -413,6 +417,184 @@ static int test_mmap_cpus(void) > return 0; > } > > +static double display_error(long long average, > + long long high, > + long long low, > + long long expected) { > + > + double error; > + > + error = (((double)average - expected) / expected) * 100.0; > + > + __T_VERBOSE(" Expected: %lld\n", expected); > + __T_VERBOSE(" High: %lld Low: %lld Average: %lld\n", > + high, low, average); > + > + __T_VERBOSE(" Average Error = %.2f%%\n",error); > + > + return error; > + > +} > + > +static int test_stat_multiplexing(void) > +{ > + struct perf_counts_values expected_counts = { .val = 0 }; > + struct perf_counts_values multiplexing_counts[EVENT_NUM] = {{ .val = 0 },}; > + struct perf_thread_map *threads; > + struct perf_evlist *evlist; > + struct perf_evsel *evsel; > + struct perf_event_attr attr = { > + .type = PERF_TYPE_HARDWARE, > + .config = PERF_COUNT_HW_INSTRUCTIONS, > + .read_format = PERF_FORMAT_TOTAL_TIME_ENABLED | > + PERF_FORMAT_TOTAL_TIME_RUNNING, > + .disabled = 1, > + }; > + int err, i, nonzero=0; > + unsigned long count; > + long long max = 0, min = 0, avg = 0; > + double error = 0.0; > + > + > + /* read for non-multiplexing event count */ > + threads = perf_thread_map__new_dummy(); > + __T("failed to create threads", threads); > + > + perf_thread_map__set_pid(threads, 0, 0); > + > + evsel = perf_evsel__new(&attr); > + __T("failed to create evsel", evsel); > + > + err = perf_evsel__open(evsel, NULL, threads); > + __T("failed to open evsel", err == 0); > + > + err = perf_evsel__enable(evsel); > + __T("failed to enable evsel", err == 0); > + > + /* wait loop */ > + count = WAIT_COUNT; > + while(count--); > + > + perf_evsel__read(evsel, 0, 0, &expected_counts); > + __T("failed to read value for evsel", expected_counts.val != 0); > + __T("failed to read non-multiplexing event count", > + expected_counts.ena == expected_counts.run); > + > + err = perf_evsel__disable(evsel); > + __T("failed to enable evsel", err == 0); > + > + perf_evsel__close(evsel); > + perf_evsel__delete(evsel); > + > + perf_thread_map__put(threads); > + > + > + /* read for multiplexing event count */ > + threads = perf_thread_map__new_dummy(); > + __T("failed to create threads", threads); > + > + perf_thread_map__set_pid(threads, 0, 0); > + > + evlist = perf_evlist__new(); > + __T("failed to create evlist", evlist); > + > + for (i = 0; i < EVENT_NUM; i++) { > + evsel = perf_evsel__new(&attr); > + __T("failed to create evsel1", evsel); > + > + perf_evlist__add(evlist, evsel); > + } > + perf_evlist__set_maps(evlist, NULL, threads); > + > + err = perf_evlist__open(evlist); > + __T("failed to open evsel", err == 0); > + > + perf_evlist__enable(evlist); > + > + /* wait loop */ > + count = WAIT_COUNT; > + while(count--); > + > + i = 0; > + perf_evlist__for_each_evsel(evlist, evsel) { > + perf_evsel__read(evsel, 0, 0, &multiplexing_counts[i]); > + __T("failed to read value for evsel", multiplexing_counts[i].val != 0); > + i++; > + } > + > + perf_evlist__disable(evlist); > + > + > + min = multiplexing_counts[0].val; > + for (i = 0; i < EVENT_NUM; i++) { > + __T_VERBOSE("\tcount = %lu, run = %lu, enable = %lu\n", > + multiplexing_counts[i].val, multiplexing_counts[i].run, > + multiplexing_counts[i].ena); > + > + if (multiplexing_counts[i].val > max) > + max = multiplexing_counts[i].val; > + > + if (multiplexing_counts[i].val < min) > + min = multiplexing_counts[i].val; > + > + avg += multiplexing_counts[i].val; > + > + if (multiplexing_counts[i].val != 0) > + nonzero++; > + } > + > + avg = avg / nonzero; > + > + error = display_error(avg, max, min, expected_counts.val); > + > + __T("Error out of range!", ((error <= 1.0) && (error >= -1.0))); > + > + perf_evlist__close(evlist); > + perf_evlist__delete(evlist); > + > + perf_thread_map__put(threads); > + > + > + /* Verify that no division by zero occurs */ > + threads = perf_thread_map__new_dummy(); > + __T("failed to create threads", threads); > + > + perf_thread_map__set_pid(threads, 0, 0); > + > + evlist = perf_evlist__new(); > + __T("failed to create evlist", evlist); > + > + for (i = 0; i < EVENT_NUM; i++) { > + evsel = perf_evsel__new(&attr); > + __T("failed to create evsel1", evsel); > + > + perf_evlist__add(evlist, evsel); > + } > + perf_evlist__set_maps(evlist, NULL, threads); > + > + err = perf_evlist__open(evlist); > + __T("failed to open evsel", err == 0); > + > + perf_evlist__enable(evlist); > + > + /* Reproduce run=0 by not executing wait loop. */ > + > + i = 0; > + perf_evlist__for_each_evsel(evlist, evsel) { > + perf_evsel__read(evsel, 0, 0, &multiplexing_counts[i]); > + __T_VERBOSE("\tcount = %lu, run = %lu, enable = %lu\n", > + multiplexing_counts[i].val, multiplexing_counts[i].run, > + multiplexing_counts[i].ena); > + i++; > + } > + > + perf_evlist__close(evlist); > + perf_evlist__delete(evlist); > + > + perf_thread_map__put(threads); > + return 0; > +} > + > int test_evlist(int argc, char **argv) > { > __T_START; > @@ -424,6 +606,7 @@ int test_evlist(int argc, char **argv) > test_stat_thread_enable(); > test_mmap_thread(); > test_mmap_cpus(); > + test_stat_multiplexing(); > > __T_END; > return tests_failed == 0 ? 0 : -1; > -- > 2.27.0 -- - Arnaldo ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2021-11-14 16:16 UTC | newest] Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-09-22 10:16 [PATCH v2 0/2] libperf: Add support for scaling counters obtained from the read() system call during multiplexing Shunsuke Nakamura 2021-09-22 10:16 ` [PATCH v2 1/2] libperf: Add processing to scale the counters obtained during the read() system call when multiplexing Shunsuke Nakamura 2021-09-22 21:34 ` Jiri Olsa 2021-09-28 9:53 ` nakamura.shun 2021-10-05 16:36 ` Rob Herring 2021-10-19 4:59 ` nakamura.shun 2021-10-07 17:17 ` Jiri Olsa 2021-10-19 5:00 ` nakamura.shun 2021-11-08 0:49 ` nakamura.shun 2021-11-14 16:16 ` Jiri Olsa 2021-09-22 10:16 ` [PATCH v2 2/2] libperf tests: Add test_stat_multiplexing test Shunsuke Nakamura 2021-10-08 19:11 ` Arnaldo Carvalho de Melo
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).