All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] Tidy user rdpmc documentation and testing
@ 2022-06-08 22:43 Ian Rogers
  2022-06-08 22:43 ` [PATCH 1/4] libperf evsel: Open shouldn't leak fd on failure Ian Rogers
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Ian Rogers @ 2022-06-08 22:43 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Kajol Jain, Andi Kleen, Adrian Hunter, Anshuman Khandual,
	linux-kernel, linux-perf-users, Rob Herring
  Cc: Stephane Eranian, Ian Rogers

libperf's perf_mmap__read_self and the addition of arm64 support mean
that the perf_event.h and the rdpmc perf test have become
stale. Refresh the documentation in perf_event.h, remove the x86 rdpmc
test and port the libperf test as a non-architecture specific test.

Address sanitizer testing showed libperf leaking fds when the
perf_event_open failed, add error paths to handle this.

A comment is added to perf_event.h to avoid a divide by zero when
scaling counts if the running time is 0. This was previously discussed
in this thread:
https://lore.kernel.org/lkml/CAP-5=fVRdqvswtyQMg5cB+ntTGda+SAYskjTQednEH-AeZo13g@mail.gmail.com/

Ian Rogers (4):
  libperf evsel: Open shouldn't leak fd on failure
  perf: Align user space counter reading with code
  perf test: Remove x86 rdpmc test
  perf test: Add user space counter reading tests

 include/uapi/linux/perf_event.h        |  32 +++--
 tools/include/uapi/linux/perf_event.h  |  32 +++--
 tools/lib/perf/evsel.c                 |  17 ++-
 tools/perf/arch/x86/tests/Build        |   1 -
 tools/perf/arch/x86/tests/arch-tests.c |   2 -
 tools/perf/arch/x86/tests/rdpmc.c      | 182 -------------------------
 tools/perf/tests/mmap-basic.c          | 128 ++++++++++++++++-
 7 files changed, 177 insertions(+), 217 deletions(-)
 delete mode 100644 tools/perf/arch/x86/tests/rdpmc.c

-- 
2.36.1.255.ge46751e96f-goog


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/4] libperf evsel: Open shouldn't leak fd on failure
  2022-06-08 22:43 [PATCH 0/4] Tidy user rdpmc documentation and testing Ian Rogers
@ 2022-06-08 22:43 ` Ian Rogers
  2022-06-09 16:47   ` Arnaldo Carvalho de Melo
  2022-06-08 22:43 ` [PATCH 2/4] perf: Align user space counter reading with code Ian Rogers
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Ian Rogers @ 2022-06-08 22:43 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Kajol Jain, Andi Kleen, Adrian Hunter, Anshuman Khandual,
	linux-kernel, linux-perf-users, Rob Herring
  Cc: Stephane Eranian, Ian Rogers

If the perf_event_open fails the fd is opened but the fd is only freed
by closing (not by delete). Typically when an open fails you don't call
close and so this results in a memory leak. To avoid this, add a close
when open fails.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/lib/perf/evsel.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
index c1d58673f6ef..952f3520d5c2 100644
--- a/tools/lib/perf/evsel.c
+++ b/tools/lib/perf/evsel.c
@@ -149,23 +149,30 @@ int perf_evsel__open(struct perf_evsel *evsel, struct perf_cpu_map *cpus,
 			int fd, group_fd, *evsel_fd;
 
 			evsel_fd = FD(evsel, idx, thread);
-			if (evsel_fd == NULL)
-				return -EINVAL;
+			if (evsel_fd == NULL) {
+				err = -EINVAL;
+				goto out;
+			}
 
 			err = get_group_fd(evsel, idx, thread, &group_fd);
 			if (err < 0)
-				return err;
+				goto out;
 
 			fd = sys_perf_event_open(&evsel->attr,
 						 threads->map[thread].pid,
 						 cpu, group_fd, 0);
 
-			if (fd < 0)
-				return -errno;
+			if (fd < 0) {
+				err = -errno;
+				goto out;
+			}
 
 			*evsel_fd = fd;
 		}
 	}
+out:
+	if (err)
+		perf_evsel__close(evsel);
 
 	return err;
 }
-- 
2.36.1.255.ge46751e96f-goog


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/4] perf: Align user space counter reading with code
  2022-06-08 22:43 [PATCH 0/4] Tidy user rdpmc documentation and testing Ian Rogers
  2022-06-08 22:43 ` [PATCH 1/4] libperf evsel: Open shouldn't leak fd on failure Ian Rogers
@ 2022-06-08 22:43 ` Ian Rogers
  2022-06-08 23:24   ` Rob Herring
  2022-06-08 22:43 ` [PATCH 3/4] perf test: Remove x86 rdpmc test Ian Rogers
  2022-06-08 22:43 ` [PATCH 4/4] perf test: Add user space counter reading tests Ian Rogers
  3 siblings, 1 reply; 10+ messages in thread
From: Ian Rogers @ 2022-06-08 22:43 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Kajol Jain, Andi Kleen, Adrian Hunter, Anshuman Khandual,
	linux-kernel, linux-perf-users, Rob Herring
  Cc: Stephane Eranian, Ian Rogers

Align the user space counter reading documentation with the code in
perf_mmap__read_self. Previously the documentation was based on the perf
rdpmc test, but now general purpose code is provided by libperf.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 include/uapi/linux/perf_event.h       | 32 ++++++++++++++++-----------
 tools/include/uapi/linux/perf_event.h | 32 ++++++++++++++++-----------
 2 files changed, 38 insertions(+), 26 deletions(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index d37629dbad72..3b84e0ad0723 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -538,9 +538,13 @@ struct perf_event_mmap_page {
 	 *
 	 *     if (pc->cap_usr_time && enabled != running) {
 	 *       cyc = rdtsc();
-	 *       time_offset = pc->time_offset;
 	 *       time_mult   = pc->time_mult;
 	 *       time_shift  = pc->time_shift;
+	 *       time_offset = pc->time_offset;
+	 *       if (pc->cap_user_time_short) {
+	 *         time_cycles = pc->time_cycles;
+	 *         time_mask = pc->time_mask;
+	 *       }
 	 *     }
 	 *
 	 *     index = pc->index;
@@ -548,6 +552,9 @@ struct perf_event_mmap_page {
 	 *     if (pc->cap_user_rdpmc && index) {
 	 *       width = pc->pmc_width;
 	 *       pmc = rdpmc(index - 1);
+	 *       pmc <<= 64 - width;
+	 *       pmc >>= 64 - width;
+	 *       count += pmc;
 	 *     }
 	 *
 	 *     barrier();
@@ -590,25 +597,24 @@ struct perf_event_mmap_page {
 	 * If cap_usr_time the below fields can be used to compute the time
 	 * delta since time_enabled (in ns) using rdtsc or similar.
 	 *
-	 *   u64 quot, rem;
-	 *   u64 delta;
-	 *
-	 *   quot = (cyc >> time_shift);
-	 *   rem = cyc & (((u64)1 << time_shift) - 1);
-	 *   delta = time_offset + quot * time_mult +
-	 *              ((rem * time_mult) >> time_shift);
+	 *   cyc = time_cycles + ((cyc - time_cycles) & time_mask);
+	 *   delta = time_offset + mul_u64_u32_shr(cyc, time_mult, time_shift);
 	 *
 	 * Where time_offset,time_mult,time_shift and cyc are read in the
 	 * seqcount loop described above. This delta can then be added to
-	 * enabled and possible running (if index), improving the scaling:
+	 * enabled and possible running (if index) to improve the scaling. Due
+	 * to event multiplexing, running maybe zero and so care is needed to
+	 * avoid division by zero.
 	 *
 	 *   enabled += delta;
-	 *   if (index)
+	 *   if (idx)
 	 *     running += delta;
 	 *
-	 *   quot = count / running;
-	 *   rem  = count % running;
-	 *   count = quot * enabled + (rem * enabled) / running;
+	 *   if (running != 0) {
+	 *     quot = count / running;
+	 *     rem  = count % running;
+	 *     count = quot * enabled + (rem * enabled) / running;
+	 *   }
 	 */
 	__u16	time_shift;
 	__u32	time_mult;
diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index d37629dbad72..3b84e0ad0723 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -538,9 +538,13 @@ struct perf_event_mmap_page {
 	 *
 	 *     if (pc->cap_usr_time && enabled != running) {
 	 *       cyc = rdtsc();
-	 *       time_offset = pc->time_offset;
 	 *       time_mult   = pc->time_mult;
 	 *       time_shift  = pc->time_shift;
+	 *       time_offset = pc->time_offset;
+	 *       if (pc->cap_user_time_short) {
+	 *         time_cycles = pc->time_cycles;
+	 *         time_mask = pc->time_mask;
+	 *       }
 	 *     }
 	 *
 	 *     index = pc->index;
@@ -548,6 +552,9 @@ struct perf_event_mmap_page {
 	 *     if (pc->cap_user_rdpmc && index) {
 	 *       width = pc->pmc_width;
 	 *       pmc = rdpmc(index - 1);
+	 *       pmc <<= 64 - width;
+	 *       pmc >>= 64 - width;
+	 *       count += pmc;
 	 *     }
 	 *
 	 *     barrier();
@@ -590,25 +597,24 @@ struct perf_event_mmap_page {
 	 * If cap_usr_time the below fields can be used to compute the time
 	 * delta since time_enabled (in ns) using rdtsc or similar.
 	 *
-	 *   u64 quot, rem;
-	 *   u64 delta;
-	 *
-	 *   quot = (cyc >> time_shift);
-	 *   rem = cyc & (((u64)1 << time_shift) - 1);
-	 *   delta = time_offset + quot * time_mult +
-	 *              ((rem * time_mult) >> time_shift);
+	 *   cyc = time_cycles + ((cyc - time_cycles) & time_mask);
+	 *   delta = time_offset + mul_u64_u32_shr(cyc, time_mult, time_shift);
 	 *
 	 * Where time_offset,time_mult,time_shift and cyc are read in the
 	 * seqcount loop described above. This delta can then be added to
-	 * enabled and possible running (if index), improving the scaling:
+	 * enabled and possible running (if index) to improve the scaling. Due
+	 * to event multiplexing, running maybe zero and so care is needed to
+	 * avoid division by zero.
 	 *
 	 *   enabled += delta;
-	 *   if (index)
+	 *   if (idx)
 	 *     running += delta;
 	 *
-	 *   quot = count / running;
-	 *   rem  = count % running;
-	 *   count = quot * enabled + (rem * enabled) / running;
+	 *   if (running != 0) {
+	 *     quot = count / running;
+	 *     rem  = count % running;
+	 *     count = quot * enabled + (rem * enabled) / running;
+	 *   }
 	 */
 	__u16	time_shift;
 	__u32	time_mult;
-- 
2.36.1.255.ge46751e96f-goog


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/4] perf test: Remove x86 rdpmc test
  2022-06-08 22:43 [PATCH 0/4] Tidy user rdpmc documentation and testing Ian Rogers
  2022-06-08 22:43 ` [PATCH 1/4] libperf evsel: Open shouldn't leak fd on failure Ian Rogers
  2022-06-08 22:43 ` [PATCH 2/4] perf: Align user space counter reading with code Ian Rogers
@ 2022-06-08 22:43 ` Ian Rogers
  2022-06-08 23:29   ` Rob Herring
  2022-06-08 22:43 ` [PATCH 4/4] perf test: Add user space counter reading tests Ian Rogers
  3 siblings, 1 reply; 10+ messages in thread
From: Ian Rogers @ 2022-06-08 22:43 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Kajol Jain, Andi Kleen, Adrian Hunter, Anshuman Khandual,
	linux-kernel, linux-perf-users, Rob Herring
  Cc: Stephane Eranian, Ian Rogers

This test has been superseded by test_stat_user_read in:
tools/lib/perf/tests/test-evsel.c
The updated test doesn't divide-by-0 when running time of a counter is
0. It also supports ARM64.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/arch/x86/tests/Build        |   1 -
 tools/perf/arch/x86/tests/arch-tests.c |   2 -
 tools/perf/arch/x86/tests/rdpmc.c      | 182 -------------------------
 3 files changed, 185 deletions(-)
 delete mode 100644 tools/perf/arch/x86/tests/rdpmc.c

diff --git a/tools/perf/arch/x86/tests/Build b/tools/perf/arch/x86/tests/Build
index 28d793390198..70b5bcbc15df 100644
--- a/tools/perf/arch/x86/tests/Build
+++ b/tools/perf/arch/x86/tests/Build
@@ -2,7 +2,6 @@ perf-$(CONFIG_DWARF_UNWIND) += regs_load.o
 perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
 
 perf-y += arch-tests.o
-perf-y += rdpmc.o
 perf-y += sample-parsing.o
 perf-$(CONFIG_AUXTRACE) += insn-x86.o intel-pt-pkt-decoder-test.o
 perf-$(CONFIG_X86_64) += bp-modify.o
diff --git a/tools/perf/arch/x86/tests/arch-tests.c b/tools/perf/arch/x86/tests/arch-tests.c
index 64fb73d14d2f..04018b8aa85b 100644
--- a/tools/perf/arch/x86/tests/arch-tests.c
+++ b/tools/perf/arch/x86/tests/arch-tests.c
@@ -3,7 +3,6 @@
 #include "tests/tests.h"
 #include "arch-tests.h"
 
-DEFINE_SUITE("x86 rdpmc", rdpmc);
 #ifdef HAVE_AUXTRACE_SUPPORT
 DEFINE_SUITE("x86 instruction decoder - new instructions", insn_x86);
 DEFINE_SUITE("Intel PT packet decoder", intel_pt_pkt_decoder);
@@ -14,7 +13,6 @@ DEFINE_SUITE("x86 bp modify", bp_modify);
 DEFINE_SUITE("x86 Sample parsing", x86_sample_parsing);
 
 struct test_suite *arch_tests[] = {
-	&suite__rdpmc,
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
 	&suite__dwarf_unwind,
 #endif
diff --git a/tools/perf/arch/x86/tests/rdpmc.c b/tools/perf/arch/x86/tests/rdpmc.c
deleted file mode 100644
index 498413ad9c97..000000000000
--- a/tools/perf/arch/x86/tests/rdpmc.c
+++ /dev/null
@@ -1,182 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-#include <errno.h>
-#include <unistd.h>
-#include <stdlib.h>
-#include <signal.h>
-#include <sys/mman.h>
-#include <sys/types.h>
-#include <sys/wait.h>
-#include <linux/string.h>
-#include <linux/types.h>
-#include "perf-sys.h"
-#include "debug.h"
-#include "tests/tests.h"
-#include "cloexec.h"
-#include "event.h"
-#include <internal/lib.h> // page_size
-#include "arch-tests.h"
-
-static u64 rdpmc(unsigned int counter)
-{
-	unsigned int low, high;
-
-	asm volatile("rdpmc" : "=a" (low), "=d" (high) : "c" (counter));
-
-	return low | ((u64)high) << 32;
-}
-
-static u64 rdtsc(void)
-{
-	unsigned int low, high;
-
-	asm volatile("rdtsc" : "=a" (low), "=d" (high));
-
-	return low | ((u64)high) << 32;
-}
-
-static u64 mmap_read_self(void *addr)
-{
-	struct perf_event_mmap_page *pc = addr;
-	u32 seq, idx, time_mult = 0, time_shift = 0;
-	u64 count, cyc = 0, time_offset = 0, enabled, running, delta;
-
-	do {
-		seq = pc->lock;
-		barrier();
-
-		enabled = pc->time_enabled;
-		running = pc->time_running;
-
-		if (enabled != running) {
-			cyc = rdtsc();
-			time_mult = pc->time_mult;
-			time_shift = pc->time_shift;
-			time_offset = pc->time_offset;
-		}
-
-		idx = pc->index;
-		count = pc->offset;
-		if (idx)
-			count += rdpmc(idx - 1);
-
-		barrier();
-	} while (pc->lock != seq);
-
-	if (enabled != running) {
-		u64 quot, rem;
-
-		quot = (cyc >> time_shift);
-		rem = cyc & (((u64)1 << time_shift) - 1);
-		delta = time_offset + quot * time_mult +
-			((rem * time_mult) >> time_shift);
-
-		enabled += delta;
-		if (idx)
-			running += delta;
-
-		quot = count / running;
-		rem = count % running;
-		count = quot * enabled + (rem * enabled) / running;
-	}
-
-	return count;
-}
-
-/*
- * If the RDPMC instruction faults then signal this back to the test parent task:
- */
-static void segfault_handler(int sig __maybe_unused,
-			     siginfo_t *info __maybe_unused,
-			     void *uc __maybe_unused)
-{
-	exit(-1);
-}
-
-static int __test__rdpmc(void)
-{
-	volatile int tmp = 0;
-	u64 i, loops = 1000;
-	int n;
-	int fd;
-	void *addr;
-	struct perf_event_attr attr = {
-		.type = PERF_TYPE_HARDWARE,
-		.config = PERF_COUNT_HW_INSTRUCTIONS,
-		.exclude_kernel = 1,
-	};
-	u64 delta_sum = 0;
-        struct sigaction sa;
-	char sbuf[STRERR_BUFSIZE];
-
-	sigfillset(&sa.sa_mask);
-	sa.sa_sigaction = segfault_handler;
-	sa.sa_flags = 0;
-	sigaction(SIGSEGV, &sa, NULL);
-
-	fd = sys_perf_event_open(&attr, 0, -1, -1,
-				 perf_event_open_cloexec_flag());
-	if (fd < 0) {
-		pr_err("Error: sys_perf_event_open() syscall returned "
-		       "with %d (%s)\n", fd,
-		       str_error_r(errno, sbuf, sizeof(sbuf)));
-		return -1;
-	}
-
-	addr = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd, 0);
-	if (addr == (void *)(-1)) {
-		pr_err("Error: mmap() syscall returned with (%s)\n",
-		       str_error_r(errno, sbuf, sizeof(sbuf)));
-		goto out_close;
-	}
-
-	for (n = 0; n < 6; n++) {
-		u64 stamp, now, delta;
-
-		stamp = mmap_read_self(addr);
-
-		for (i = 0; i < loops; i++)
-			tmp++;
-
-		now = mmap_read_self(addr);
-		loops *= 10;
-
-		delta = now - stamp;
-		pr_debug("%14d: %14Lu\n", n, (long long)delta);
-
-		delta_sum += delta;
-	}
-
-	munmap(addr, page_size);
-	pr_debug("   ");
-out_close:
-	close(fd);
-
-	if (!delta_sum)
-		return -1;
-
-	return 0;
-}
-
-int test__rdpmc(struct test_suite *test __maybe_unused, int subtest __maybe_unused)
-{
-	int status = 0;
-	int wret = 0;
-	int ret;
-	int pid;
-
-	pid = fork();
-	if (pid < 0)
-		return -1;
-
-	if (!pid) {
-		ret = __test__rdpmc();
-
-		exit(ret);
-	}
-
-	wret = waitpid(pid, &status, 0);
-	if (wret < 0 || status)
-		return -1;
-
-	return 0;
-}
-- 
2.36.1.255.ge46751e96f-goog


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/4] perf test: Add user space counter reading tests
  2022-06-08 22:43 [PATCH 0/4] Tidy user rdpmc documentation and testing Ian Rogers
                   ` (2 preceding siblings ...)
  2022-06-08 22:43 ` [PATCH 3/4] perf test: Remove x86 rdpmc test Ian Rogers
@ 2022-06-08 22:43 ` Ian Rogers
  3 siblings, 0 replies; 10+ messages in thread
From: Ian Rogers @ 2022-06-08 22:43 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Kajol Jain, Andi Kleen, Adrian Hunter, Anshuman Khandual,
	linux-kernel, linux-perf-users, Rob Herring
  Cc: Stephane Eranian, Ian Rogers

These tests are based on test_stat_user_read in
tools/lib/perf/tests/test-evsel.c. The tests are modified to skip if
perf_event_open fails or rdpmc isn't supported.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/tests/mmap-basic.c | 128 +++++++++++++++++++++++++++++++++-
 1 file changed, 127 insertions(+), 1 deletion(-)

diff --git a/tools/perf/tests/mmap-basic.c b/tools/perf/tests/mmap-basic.c
index 30bbe144648a..791eebcc5883 100644
--- a/tools/perf/tests/mmap-basic.c
+++ b/tools/perf/tests/mmap-basic.c
@@ -170,14 +170,140 @@ static int test__basic_mmap(struct test_suite *test __maybe_unused, int subtest
 	return err;
 }
 
+static int test_stat_user_read(int event)
+{
+	struct perf_counts_values counts = { .val = 0 };
+	struct perf_thread_map *threads;
+	struct perf_evsel *evsel;
+	struct perf_event_mmap_page *pc;
+	struct perf_event_attr attr = {
+		.type	= PERF_TYPE_HARDWARE,
+		.config	= event,
+#ifdef __aarch64__
+		.config1 = 0x2,		/* Request user access */
+#endif
+	};
+	int err, i, ret = TEST_FAIL;
+	bool opened = false, mapped = false;
+
+	threads = perf_thread_map__new_dummy();
+	TEST_ASSERT_VAL("failed to create threads", threads);
+
+	perf_thread_map__set_pid(threads, 0, 0);
+
+	evsel = perf_evsel__new(&attr);
+	TEST_ASSERT_VAL("failed to create evsel", evsel);
+
+	err = perf_evsel__open(evsel, NULL, threads);
+	if (err) {
+		pr_err("failed to open evsel: %s\n", strerror(-err));
+		if (err == -EACCES)
+			ret = TEST_SKIP;
+		goto out;
+	}
+	opened = true;
+
+	err = perf_evsel__mmap(evsel, 0);
+	if (err) {
+		pr_err("failed to mmap evsel: %s\n", strerror(-err));
+		goto out;
+	}
+	mapped = true;
+
+	pc = perf_evsel__mmap_base(evsel, 0, 0);
+	if (!pc) {
+		pr_err("failed to get mmapped address\n");
+		goto out;
+	}
+
+	if (!pc->cap_user_rdpmc || !pc->index) {
+		pr_err("userspace counter access not %s\n",
+			!pc->cap_user_rdpmc ? "supported" : "enabled");
+		ret = TEST_SKIP;
+		goto out;
+	}
+	if (pc->pmc_width < 32) {
+		pr_err("userspace counter width not set (%d)\n", pc->pmc_width);
+		goto out;
+	}
+
+	perf_evsel__read(evsel, 0, 0, &counts);
+	if (counts.val == 0) {
+		pr_err("failed to read value for evsel\n");
+		goto out;
+	}
+
+	for (i = 0; i < 5; i++) {
+		volatile int count = 0x10000 << i;
+		__u64 start, end, last = 0;
+
+		pr_debug("\tloop = %u, ", count);
+
+		perf_evsel__read(evsel, 0, 0, &counts);
+		start = counts.val;
+
+		while (count--) ;
+
+		perf_evsel__read(evsel, 0, 0, &counts);
+		end = counts.val;
+
+		if ((end - start) < last) {
+			pr_err("invalid counter data: end=%llu start=%llu last= %llu\n",
+				end, start, last);
+			goto out;
+		}
+		last = end - start;
+		pr_debug("count = %llu\n", end - start);
+	}
+	ret = TEST_OK;
+
+out:
+	if (mapped)
+		perf_evsel__munmap(evsel);
+	if (opened)
+		perf_evsel__close(evsel);
+	perf_evsel__delete(evsel);
+
+	perf_thread_map__put(threads);
+	return ret;
+}
+
+static int test__mmap_user_read_instr(struct test_suite *test __maybe_unused,
+				      int subtest __maybe_unused)
+{
+	return test_stat_user_read(PERF_COUNT_HW_INSTRUCTIONS);
+}
+
+static int test__mmap_user_read_cycles(struct test_suite *test __maybe_unused,
+				       int subtest __maybe_unused)
+{
+	return test_stat_user_read(PERF_COUNT_HW_CPU_CYCLES);
+}
+
 static struct test_case tests__basic_mmap[] = {
 	TEST_CASE_REASON("Read samples using the mmap interface",
 			 basic_mmap,
 			 "permissions"),
+	TEST_CASE_REASON("User space counter reading of instructions",
+			 mmap_user_read_instr,
+#if defined(__i386__) || defined(__x86_64__) || defined(__aarch64__)
+			 "permissions"
+#else
+			 "unsupported"
+#endif
+		),
+	TEST_CASE_REASON("User space counter reading of cycles",
+			 mmap_user_read_cycles,
+#if defined(__i386__) || defined(__x86_64__) || defined(__aarch64__)
+			 "permissions"
+#else
+			 "unsupported"
+#endif
+		),
 	{	.name = NULL, }
 };
 
 struct test_suite suite__basic_mmap = {
-	.desc = "Read samples using the mmap interface",
+	.desc = "mmap interface tests",
 	.test_cases = tests__basic_mmap,
 };
-- 
2.36.1.255.ge46751e96f-goog


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/4] perf: Align user space counter reading with code
  2022-06-08 22:43 ` [PATCH 2/4] perf: Align user space counter reading with code Ian Rogers
@ 2022-06-08 23:24   ` Rob Herring
  2022-06-09  0:07     ` Ian Rogers
  0 siblings, 1 reply; 10+ messages in thread
From: Rob Herring @ 2022-06-08 23:24 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Kajol Jain, Andi Kleen, Adrian Hunter, Anshuman Khandual,
	linux-kernel, linux-perf-users, Stephane Eranian

On Wed, Jun 8, 2022 at 4:44 PM Ian Rogers <irogers@google.com> wrote:
>
> Align the user space counter reading documentation with the code in
> perf_mmap__read_self. Previously the documentation was based on the perf
> rdpmc test, but now general purpose code is provided by libperf.

IMO, this copy of not quite right code should just be removed perhaps
with a pointer to perf_mmap__read_self(). It will just get out of date
again. For example, the read loop might get rewritten with restartable
sequences.

> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  include/uapi/linux/perf_event.h       | 32 ++++++++++++++++-----------
>  tools/include/uapi/linux/perf_event.h | 32 ++++++++++++++++-----------
>  2 files changed, 38 insertions(+), 26 deletions(-)
>
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index d37629dbad72..3b84e0ad0723 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -538,9 +538,13 @@ struct perf_event_mmap_page {
>          *
>          *     if (pc->cap_usr_time && enabled != running) {
>          *       cyc = rdtsc();

Kind of x86 specific.

> -        *       time_offset = pc->time_offset;
>          *       time_mult   = pc->time_mult;
>          *       time_shift  = pc->time_shift;
> +        *       time_offset = pc->time_offset;
> +        *       if (pc->cap_user_time_short) {
> +        *         time_cycles = pc->time_cycles;
> +        *         time_mask = pc->time_mask;
> +        *       }

This still misses the need for READ_ONCE().

>          *     }
>          *
>          *     index = pc->index;
> @@ -548,6 +552,9 @@ struct perf_event_mmap_page {
>          *     if (pc->cap_user_rdpmc && index) {
>          *       width = pc->pmc_width;
>          *       pmc = rdpmc(index - 1);
> +        *       pmc <<= 64 - width;
> +        *       pmc >>= 64 - width;
> +        *       count += pmc;
>          *     }
>          *
>          *     barrier();
> @@ -590,25 +597,24 @@ struct perf_event_mmap_page {
>          * If cap_usr_time the below fields can be used to compute the time
>          * delta since time_enabled (in ns) using rdtsc or similar.
>          *
> -        *   u64 quot, rem;
> -        *   u64 delta;
> -        *
> -        *   quot = (cyc >> time_shift);
> -        *   rem = cyc & (((u64)1 << time_shift) - 1);
> -        *   delta = time_offset + quot * time_mult +
> -        *              ((rem * time_mult) >> time_shift);
> +        *   cyc = time_cycles + ((cyc - time_cycles) & time_mask);
> +        *   delta = time_offset + mul_u64_u32_shr(cyc, time_mult, time_shift);

For documentation purposes, the original code was easier to read and
this is just an optimization. What does mul_u64_u32_shr() do exactly?
It's not documented.

>          *
>          * Where time_offset,time_mult,time_shift and cyc are read in the
>          * seqcount loop described above. This delta can then be added to
> -        * enabled and possible running (if index), improving the scaling:
> +        * enabled and possible running (if index) to improve the scaling. Due
> +        * to event multiplexing, running maybe zero and so care is needed to
> +        * avoid division by zero.
>          *
>          *   enabled += delta;
> -        *   if (index)
> +        *   if (idx)
>          *     running += delta;
>          *
> -        *   quot = count / running;
> -        *   rem  = count % running;
> -        *   count = quot * enabled + (rem * enabled) / running;
> +        *   if (running != 0) {
> +        *     quot = count / running;
> +        *     rem  = count % running;
> +        *     count = quot * enabled + (rem * enabled) / running;
> +        *   }
>          */
>         __u16   time_shift;
>         __u32   time_mult;

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/4] perf test: Remove x86 rdpmc test
  2022-06-08 22:43 ` [PATCH 3/4] perf test: Remove x86 rdpmc test Ian Rogers
@ 2022-06-08 23:29   ` Rob Herring
  2022-06-08 23:30     ` Rob Herring
  0 siblings, 1 reply; 10+ messages in thread
From: Rob Herring @ 2022-06-08 23:29 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Kajol Jain, Andi Kleen, Adrian Hunter, Anshuman Khandual,
	linux-kernel, linux-perf-users, Stephane Eranian

On Wed, Jun 8, 2022 at 4:44 PM Ian Rogers <irogers@google.com> wrote:
>
> This test has been superseded by test_stat_user_read in:
> tools/lib/perf/tests/test-evsel.c
> The updated test doesn't divide-by-0 when running time of a counter is
> 0. It also supports ARM64.

Jiri objected to this when I did the same thing[1] as 'perf test'
doesn't run libperf tests.

Rob

[1] https://lore.kernel.org/all/20200831091113.GA406859@krava/

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/4] perf test: Remove x86 rdpmc test
  2022-06-08 23:29   ` Rob Herring
@ 2022-06-08 23:30     ` Rob Herring
  0 siblings, 0 replies; 10+ messages in thread
From: Rob Herring @ 2022-06-08 23:30 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Kajol Jain, Andi Kleen, Adrian Hunter, Anshuman Khandual,
	linux-kernel, linux-perf-users, Stephane Eranian

On Wed, Jun 8, 2022 at 5:29 PM Rob Herring <robh@kernel.org> wrote:
>
> On Wed, Jun 8, 2022 at 4:44 PM Ian Rogers <irogers@google.com> wrote:
> >
> > This test has been superseded by test_stat_user_read in:
> > tools/lib/perf/tests/test-evsel.c
> > The updated test doesn't divide-by-0 when running time of a counter is
> > 0. It also supports ARM64.
>
> Jiri objected to this when I did the same thing[1] as 'perf test'
> doesn't run libperf tests.

NM, I just saw patch 4.

Acked-by: Rob Herring <robh@kernel.org>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/4] perf: Align user space counter reading with code
  2022-06-08 23:24   ` Rob Herring
@ 2022-06-09  0:07     ` Ian Rogers
  0 siblings, 0 replies; 10+ messages in thread
From: Ian Rogers @ 2022-06-09  0:07 UTC (permalink / raw)
  To: Rob Herring
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Kajol Jain, Andi Kleen, Adrian Hunter, Anshuman Khandual,
	linux-kernel, linux-perf-users, Stephane Eranian

On Wed, Jun 8, 2022 at 4:25 PM Rob Herring <robh@kernel.org> wrote:
>
> On Wed, Jun 8, 2022 at 4:44 PM Ian Rogers <irogers@google.com> wrote:
> >
> > Align the user space counter reading documentation with the code in
> > perf_mmap__read_self. Previously the documentation was based on the perf
> > rdpmc test, but now general purpose code is provided by libperf.
>
> IMO, this copy of not quite right code should just be removed perhaps
> with a pointer to perf_mmap__read_self(). It will just get out of date
> again. For example, the read loop might get rewritten with restartable
> sequences.

Thanks. The intent with the rdpmc test and the header is they be in
sync. The test flakes when testing at scale, why I'm here. Having the
code in the header makes it clear this is a Linux API and subject to
the Linux syscall note. I like the code in the header but not that it
is out of sync with the code used currently to read it.

> > Signed-off-by: Ian Rogers <irogers@google.com>
> > ---
> >  include/uapi/linux/perf_event.h       | 32 ++++++++++++++++-----------
> >  tools/include/uapi/linux/perf_event.h | 32 ++++++++++++++++-----------
> >  2 files changed, 38 insertions(+), 26 deletions(-)
> >
> > diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> > index d37629dbad72..3b84e0ad0723 100644
> > --- a/include/uapi/linux/perf_event.h
> > +++ b/include/uapi/linux/perf_event.h
> > @@ -538,9 +538,13 @@ struct perf_event_mmap_page {
> >          *
> >          *     if (pc->cap_usr_time && enabled != running) {
> >          *       cyc = rdtsc();
>
> Kind of x86 specific.

There is a reference to it below and so I let it be, the name rdpmc is
also very x86 but something of a sailed ship.

> > -        *       time_offset = pc->time_offset;
> >          *       time_mult   = pc->time_mult;
> >          *       time_shift  = pc->time_shift;
> > +        *       time_offset = pc->time_offset;
> > +        *       if (pc->cap_user_time_short) {
> > +        *         time_cycles = pc->time_cycles;
> > +        *         time_mask = pc->time_mask;
> > +        *       }
>
> This still misses the need for READ_ONCE().

I'd elided that as it is something of.a kernel API. I can add it back.

> >          *     }
> >          *
> >          *     index = pc->index;
> > @@ -548,6 +552,9 @@ struct perf_event_mmap_page {
> >          *     if (pc->cap_user_rdpmc && index) {
> >          *       width = pc->pmc_width;
> >          *       pmc = rdpmc(index - 1);
> > +        *       pmc <<= 64 - width;
> > +        *       pmc >>= 64 - width;
> > +        *       count += pmc;
> >          *     }
> >          *
> >          *     barrier();
> > @@ -590,25 +597,24 @@ struct perf_event_mmap_page {
> >          * If cap_usr_time the below fields can be used to compute the time
> >          * delta since time_enabled (in ns) using rdtsc or similar.
> >          *
> > -        *   u64 quot, rem;
> > -        *   u64 delta;
> > -        *
> > -        *   quot = (cyc >> time_shift);
> > -        *   rem = cyc & (((u64)1 << time_shift) - 1);
> > -        *   delta = time_offset + quot * time_mult +
> > -        *              ((rem * time_mult) >> time_shift);
> > +        *   cyc = time_cycles + ((cyc - time_cycles) & time_mask);
> > +        *   delta = time_offset + mul_u64_u32_shr(cyc, time_mult, time_shift);
>
> For documentation purposes, the original code was easier to read and
> this is just an optimization. What does mul_u64_u32_shr() do exactly?
> It's not documented.

My concern with expanding the function is the header and the code
aren't in sync and so we're not testing what we're advertising.
Finding the definition is not a huge challenge imo.

Thanks,
Ian

> >          *
> >          * Where time_offset,time_mult,time_shift and cyc are read in the
> >          * seqcount loop described above. This delta can then be added to
> > -        * enabled and possible running (if index), improving the scaling:
> > +        * enabled and possible running (if index) to improve the scaling. Due
> > +        * to event multiplexing, running maybe zero and so care is needed to
> > +        * avoid division by zero.
> >          *
> >          *   enabled += delta;
> > -        *   if (index)
> > +        *   if (idx)
> >          *     running += delta;
> >          *
> > -        *   quot = count / running;
> > -        *   rem  = count % running;
> > -        *   count = quot * enabled + (rem * enabled) / running;
> > +        *   if (running != 0) {
> > +        *     quot = count / running;
> > +        *     rem  = count % running;
> > +        *     count = quot * enabled + (rem * enabled) / running;
> > +        *   }
> >          */
> >         __u16   time_shift;
> >         __u32   time_mult;

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/4] libperf evsel: Open shouldn't leak fd on failure
  2022-06-08 22:43 ` [PATCH 1/4] libperf evsel: Open shouldn't leak fd on failure Ian Rogers
@ 2022-06-09 16:47   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-06-09 16:47 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Kajol Jain, Andi Kleen, Adrian Hunter,
	Anshuman Khandual, linux-kernel, linux-perf-users, Rob Herring,
	Stephane Eranian

Em Wed, Jun 08, 2022 at 03:43:50PM -0700, Ian Rogers escreveu:
> If the perf_event_open fails the fd is opened but the fd is only freed
> by closing (not by delete). Typically when an open fails you don't call
> close and so this results in a memory leak. To avoid this, add a close
> when open fails.

Thanks, applied.

- Arnaldo

 
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/lib/perf/evsel.c | 17 ++++++++++++-----
>  1 file changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
> index c1d58673f6ef..952f3520d5c2 100644
> --- a/tools/lib/perf/evsel.c
> +++ b/tools/lib/perf/evsel.c
> @@ -149,23 +149,30 @@ int perf_evsel__open(struct perf_evsel *evsel, struct perf_cpu_map *cpus,
>  			int fd, group_fd, *evsel_fd;
>  
>  			evsel_fd = FD(evsel, idx, thread);
> -			if (evsel_fd == NULL)
> -				return -EINVAL;
> +			if (evsel_fd == NULL) {
> +				err = -EINVAL;
> +				goto out;
> +			}
>  
>  			err = get_group_fd(evsel, idx, thread, &group_fd);
>  			if (err < 0)
> -				return err;
> +				goto out;
>  
>  			fd = sys_perf_event_open(&evsel->attr,
>  						 threads->map[thread].pid,
>  						 cpu, group_fd, 0);
>  
> -			if (fd < 0)
> -				return -errno;
> +			if (fd < 0) {
> +				err = -errno;
> +				goto out;
> +			}
>  
>  			*evsel_fd = fd;
>  		}
>  	}
> +out:
> +	if (err)
> +		perf_evsel__close(evsel);
>  
>  	return err;
>  }
> -- 
> 2.36.1.255.ge46751e96f-goog

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-06-09 16:47 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-08 22:43 [PATCH 0/4] Tidy user rdpmc documentation and testing Ian Rogers
2022-06-08 22:43 ` [PATCH 1/4] libperf evsel: Open shouldn't leak fd on failure Ian Rogers
2022-06-09 16:47   ` Arnaldo Carvalho de Melo
2022-06-08 22:43 ` [PATCH 2/4] perf: Align user space counter reading with code Ian Rogers
2022-06-08 23:24   ` Rob Herring
2022-06-09  0:07     ` Ian Rogers
2022-06-08 22:43 ` [PATCH 3/4] perf test: Remove x86 rdpmc test Ian Rogers
2022-06-08 23:29   ` Rob Herring
2022-06-08 23:30     ` Rob Herring
2022-06-08 22:43 ` [PATCH 4/4] perf test: Add user space counter reading tests Ian Rogers

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.