All of lore.kernel.org
 help / color / mirror / Atom feed
* [LTP] [PATCH 0/3] perf_event_open02 tweaks
@ 2019-11-18 14:59 Jan Stancek
  2019-11-18 14:59 ` [LTP] [PATCH 1/3] perf_event_open02: migrate to newlib Jan Stancek
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Jan Stancek @ 2019-11-18 14:59 UTC (permalink / raw)
  To: ltp

Motivation for this series is patch 3, because this test
fails pretty reliably on Pentium4 systems.

I talked to Jiri Olsa (one of perf maintainers) and he had his
reservations about how test determines number of HW counters -
by looking at time_enabled/time_running. One idea was to create
a single group and keep adding hw events to it until it fails.
This however didn't work on Pentium4 system either and it failed
on 2nd event. He also pointed out, that test will always succeed
when number of hw counter is over-estimated.

I found one workaround, which seems to work on both Pentium4
and more recent CPUs. It replaces comparison of absolute values
with comparison of increments (see patch 3 for details).

Question about validity of this test remains, but workaround in
patch 3 at least should not make things worse.

Jan Stancek (3):
  perf_event_open02: migrate to newlib
  perf_event_open02: make do_work() run for specified time
  perf_event_open02: workaround for Pentium4

 .../syscalls/perf_event_open/perf_event_open02.c   | 402 +++++++++------------
 1 file changed, 168 insertions(+), 234 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [LTP] [PATCH 1/3] perf_event_open02: migrate to newlib
  2019-11-18 14:59 [LTP] [PATCH 0/3] perf_event_open02 tweaks Jan Stancek
@ 2019-11-18 14:59 ` Jan Stancek
  2019-11-20 12:29   ` Cyril Hrubis
  2019-11-18 14:59 ` [LTP] [PATCH 2/3] perf_event_open02: make do_work() run for specified time Jan Stancek
  2019-11-18 14:59 ` [LTP] [PATCH 3/3] perf_event_open02: workaround for Pentium4 Jan Stancek
  2 siblings, 1 reply; 12+ messages in thread
From: Jan Stancek @ 2019-11-18 14:59 UTC (permalink / raw)
  To: ltp

Migrate code to newlib, no functional changes.

Adjust comment style, replace license text with SPDX.
Reorder includes, drop some unneeded ones.
Rename variable "n" to "totaln".
Move common error handling to function.
Move cpu affinity code to separate function.

Signed-off-by: Jan Stancek <jstancek@redhat.com>
---
 .../syscalls/perf_event_open/perf_event_open02.c   | 356 ++++++++-------------
 1 file changed, 132 insertions(+), 224 deletions(-)

diff --git a/testcases/kernel/syscalls/perf_event_open/perf_event_open02.c b/testcases/kernel/syscalls/perf_event_open/perf_event_open02.c
index c2831aa20d40..584487de8255 100644
--- a/testcases/kernel/syscalls/perf_event_open/perf_event_open02.c
+++ b/testcases/kernel/syscalls/perf_event_open/perf_event_open02.c
@@ -1,132 +1,103 @@
-/******************************************************************************/
-/*                                                                            */
-/* Paul Mackerras <paulus@samba.org>, 2009                                    */
-/*                                                                            */
-/* This program is free software;  you can redistribute it and/or modify      */
-/* it under the terms of the GNU General Public License as published by       */
-/* the Free Software Foundation; either version 2 of the License, or          */
-/* (at your option) any later version.                                        */
-/*                                                                            */
-/* This program is distributed in the hope that it will be useful,            */
-/* but WITHOUT ANY WARRANTY;  without even the implied warranty of            */
-/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See                  */
-/* the GNU General Public License for more details.                           */
-/*                                                                            */
-/* You should have received a copy of the GNU General Public License          */
-/* along with this program;  if not, write to the Free Software               */
-/* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA    */
-/*                                                                            */
-/******************************************************************************/
+// SPDX-License-Identifier: GPL-2.0-or-later
 /*
-Here's a little test program that checks whether software counters
-(specifically, the task clock counter) work correctly when they're in
-a group with hardware counters.
-
-What it does is to create several groups, each with one hardware
-counter, counting instructions, plus a task clock counter.  It needs
-to know an upper bound N on the number of hardware counters you have
-(N defaults to 8), and it creates N+4 groups to force them to be
-multiplexed.  It also creates an overall task clock counter.
-
-Then it spins for a while, and then stops all the counters and reads
-them.  It takes the total of the task clock counters in the groups and
-computes the ratio of that total to the overall execution time from
-the overall task clock counter.
-
-That ratio should be equal to the number of actual hardware counters
-that can count instructions.  If the task clock counters in the groups
-don't stop when their group gets taken off the PMU, the ratio will
-instead be close to N+4.  The program will declare that the test fails
-if the ratio is greater than N (actually, N + 0.0001 to allow for FP
-rounding errors).
-
-Could someone run this on x86 on the latest PCL tree and let me know
-what happens?  I don't have an x86 crash box easily to hand.  On
-powerpc, it passes, but I think that is because I am missing setting
-counter->prev_count in arch/powerpc/kernel/perf_counter.c, and I think
-that means that enabling/disabling a group with a task clock counter
-in it won't work correctly (I'll do a test program for that next).
-
-Usage is: ./performance_counter02  [-v]
-
-The -v flag makes it print out the values of each counter.
-*/
+ * Copyright (c) 2009 Paul Mackerras <paulus@samba.org>
+ * Copyright (c) 2019 Linux Test Project
+ */
+/*
+ * Here's a little test program that checks whether software counters
+ * (specifically, the task clock counter) work correctly when they're in
+ * a group with hardware counters.
+ *
+ * What it does is to create several groups, each with one hardware
+ * counter, counting instructions, plus a task clock counter.  It needs
+ * to know an upper bound N on the number of hardware counters you have
+ * (N defaults to 8), and it creates N+4 groups to force them to be
+ * multiplexed.  It also creates an overall task clock counter.
+ *
+ * Then it spins for a while, and then stops all the counters and reads
+ * them.  It takes the total of the task clock counters in the groups and
+ * computes the ratio of that total to the overall execution time from
+ * the overall task clock counter.
+ *
+ * That ratio should be equal to the number of actual hardware counters
+ * that can count instructions.  If the task clock counters in the groups
+ * don't stop when their group gets taken off the PMU, the ratio will
+ * instead be close to N+4.  The program will declare that the test fails
+ * if the ratio is greater than N (actually, N + 0.0001 to allow for FP
+ * rounding errors).
+ */
 
 #define _GNU_SOURCE
-#include <stdio.h>
+#include <errno.h>
+#include <inttypes.h>
+#include <sched.h>
 #include <stddef.h>
+#include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
-#include <fcntl.h>
-#include <poll.h>
 #include <unistd.h>
-#include <errno.h>
-#include "config.h"
 #include <sys/prctl.h>
 #include <sys/types.h>
-#include <linux/types.h>
-#include <sched.h>
 
-#if HAVE_PERF_EVENT_ATTR
-# include <linux/perf_event.h>
-#endif
-
-#include "test.h"
-#include "safe_macros.h"
+#include "config.h"
+#include "tst_test.h"
 #include "lapi/cpuset.h"
 #include "lapi/syscalls.h"
 
-char *TCID = "perf_event_open02";
-int TST_TOTAL = 1;
-
 #if HAVE_PERF_EVENT_ATTR
+#include <linux/types.h>
+#include <linux/perf_event.h>
 
 #define MAX_CTRS	1000
 #define LOOPS		100000000
 
-static void setup(void);
-static void verify(void);
-static void cleanup(void);
-static void help(void);
+struct read_format {
+	unsigned long long value;
+	/* if PERF_FORMAT_TOTAL_TIME_ENABLED */
+	unsigned long long time_enabled;
+	/* if PERF_FORMAT_TOTAL_TIME_RUNNING */
+	unsigned long long time_running;
+};
 
-static int n, nhw;
-static int verbose;
-static option_t options[] = {
-	{"v", &verbose, NULL},
+static char *verbose;
+static struct tst_option options[] = {
+	{"v", &verbose, "-v\tverbose output"},
 	{NULL, NULL, NULL},
 };
 
-static int tsk0;
-static int hwfd[MAX_CTRS], tskfd[MAX_CTRS];
+static int ntotal, nhw;
+static int tsk0 = -1, hwfd[MAX_CTRS], tskfd[MAX_CTRS];
 
-int main(int ac, char **av)
+static int perf_event_open(struct perf_event_attr *event, pid_t pid,
+	int cpu, int group_fd, unsigned long flags)
 {
-	int lc;
+	int ret;
 
-	tst_parse_opts(ac, av, options, help);
+	ret = tst_syscall(__NR_perf_event_open, event, pid, cpu,
+		group_fd, flags);
 
-	setup();
+	if (ret != -1)
+		return ret;
 
-	for (lc = 0; TEST_LOOPING(lc); lc++) {
-		tst_count = 0;
-		verify();
+	tst_res(TINFO, "perf_event_open event.type: %"PRIu32
+		", event.config: %"PRIu64, (uint32_t)event->type,
+		(uint64_t)event->config);
+	if (errno == ENOENT || errno == ENODEV) {
+		tst_brk(TCONF | TERRNO,
+			"perf_event_open type/config not supported");
 	}
+	tst_brk(TBROK | TERRNO, "perf_event_open failed");
 
-	cleanup();
-	tst_exit();
+	/* unreachable */
+	return -1;
 }
 
-static int perf_event_open(struct perf_event_attr *hw_event, pid_t pid,
-	int cpu, int group_fd, unsigned long flags)
+static void all_counters_set(int state)
 {
-	int ret;
-
-	ret = ltp_syscall(__NR_perf_event_open, hw_event, pid, cpu,
-			  group_fd, flags);
-	return ret;
+	if (prctl(state) == -1)
+		tst_brk(TBROK | TERRNO, "prctl(%d) failed", state);
 }
 
-
 static void do_work(void)
 {
 	int i;
@@ -135,13 +106,6 @@ static void do_work(void)
 		asm volatile (""::"g" (i));
 }
 
-struct read_format {
-	unsigned long long value;
-	/* if PERF_FORMAT_TOTAL_TIME_ENABLED */
-	unsigned long long time_enabled;
-	/* if PERF_FORMAT_TOTAL_TIME_RUNNING */
-	unsigned long long time_running;
-};
 
 #ifndef __s390__
 static int count_hardware_counters(void)
@@ -162,35 +126,17 @@ static int count_hardware_counters(void)
 
 	for (i = 0; i < MAX_CTRS; i++) {
 		fdarry[i] = perf_event_open(&hw_event, 0, -1, -1, 0);
-		if (fdarry[i] == -1) {
-			if (errno == ENOENT || errno == ENODEV) {
-				tst_brkm(TCONF | TERRNO, cleanup,
-				         "PERF_COUNT_HW_INSTRUCTIONS not supported");
-			}
-			tst_brkm(TBROK | TERRNO, cleanup,
-				 "perf_event_open failed@iteration:%d", i);
-		}
-
-		if (prctl(PR_TASK_PERF_EVENTS_ENABLE) == -1) {
-			tst_brkm(TBROK | TERRNO, cleanup,
-				 "prctl(PR_TASK_PERF_EVENTS_ENABLE) failed");
-		}
 
+		all_counters_set(PR_TASK_PERF_EVENTS_ENABLE);
 		do_work();
+		all_counters_set(PR_TASK_PERF_EVENTS_DISABLE);
 
-		if (prctl(PR_TASK_PERF_EVENTS_DISABLE) == -1) {
-			tst_brkm(TBROK | TERRNO, cleanup,
-				 "prctl(PR_TASK_PERF_EVENTS_DISABLE) failed");
-		}
-
-		if (read(fdarry[i], &buf, sizeof(buf)) != sizeof(buf)) {
-			tst_brkm(TBROK | TERRNO, cleanup,
-				 "error reading counter(s)");
-		}
+		if (read(fdarry[i], &buf, sizeof(buf)) != sizeof(buf))
+			tst_brk(TBROK | TERRNO, "error reading counter(s)");
 
-		if (verbose == 1) {
-			printf("at iteration:%d value:%lld time_enabled:%lld "
-			       "time_running:%lld\n", i, buf.value,
+		if (verbose) {
+			tst_res(TINFO, "[%d] value:%lld time_enabled:%lld "
+			       "time_running:%lld", i, buf.value,
 			       buf.time_enabled, buf.time_running);
 		}
 
@@ -213,33 +159,36 @@ static int count_hardware_counters(void)
 	}
 
 	for (i = 0; i <= hwctrs; i++)
-		SAFE_CLOSE(cleanup, fdarry[i]);
+		SAFE_CLOSE(fdarry[i]);
 
 	return hwctrs;
 }
-#endif
+#endif /* __s390__ */
 
-static void setup(void)
+static void bind_to_current_cpu(void)
 {
-	int i;
-	struct perf_event_attr tsk_event, hw_event;
-
 #ifdef HAVE_SCHED_GETCPU
 	int cpu = sched_getcpu();
 	size_t mask_size;
 	cpu_set_t *mask;
 
 	if (cpu == -1)
-		tst_brkm(TBROK | TERRNO, NULL, "sched_getcpu() failed");
+		tst_brk(TBROK | TERRNO, "sched_getcpu() failed");
 
 	mask = CPU_ALLOC(cpu + 1);
 	mask_size = CPU_ALLOC_SIZE(cpu + 1);
 	CPU_ZERO_S(mask_size, mask);
 	CPU_SET(cpu, mask);
 	if (sched_setaffinity(0, mask_size, mask) == -1)
-		tst_brkm(TBROK | TERRNO, NULL, "sched_setaffinity() failed");
+		tst_brk(TBROK | TERRNO, "sched_setaffinity() failed");
 	CPU_FREE(mask);
 #endif
+}
+
+static void setup(void)
+{
+	int i;
+	struct perf_event_attr tsk_event, hw_event;
 
 	/*
 	 * According to perf_event_open's manpage, the official way of
@@ -247,12 +196,9 @@ static void setup(void)
 	 * the existence of the file /proc/sys/kernel/perf_event_paranoid.
 	 */
 	if (access("/proc/sys/kernel/perf_event_paranoid", F_OK) == -1)
-		tst_brkm(TCONF, NULL, "Kernel doesn't have perf_event support");
-
-	tst_sig(NOFORK, DEF_HANDLER, cleanup);
-
-	TEST_PAUSE;
+		tst_brk(TCONF, "Kernel doesn't have perf_event support");
 
+	bind_to_current_cpu();
 #ifdef __s390__
 	/*
 	 * On s390 the "time_enabled" and "time_running" values are always the
@@ -261,10 +207,10 @@ static void setup(void)
 	 * There are distinct/dedicated counters that can be used independently.
 	 * Use the dedicated counter for instructions here.
 	 */
-	n = nhw = 1;
+	ntotal = nhw = 1;
 #else
 	nhw = count_hardware_counters();
-	n = nhw + 4;
+	ntotal = nhw + 4;
 #endif
 
 	memset(&hw_event, 0, sizeof(struct perf_event_attr));
@@ -281,30 +227,10 @@ static void setup(void)
 	hw_event.config =  PERF_COUNT_HW_INSTRUCTIONS;
 
 	tsk0 = perf_event_open(&tsk_event, 0, -1, -1, 0);
-	if (tsk0 == -1) {
-		tst_brkm(TBROK | TERRNO, cleanup, "perf_event_open failed");
-	} else {
-		tsk_event.disabled = 0;
-		for (i = 0; i < n; ++i) {
-			hwfd[i] = perf_event_open(&hw_event, 0, -1, -1, 0);
-			if (hwfd[i] == -1) {
-				if (errno == ENOENT || errno == ENODEV) {
-					tst_brkm(TCONF | TERRNO, cleanup,
-						"PERF_COUNT_HW_INSTRUCTIONS not supported");
-				}
-				tst_brkm(TBROK | TERRNO, cleanup,
-					 "perf_event_open failed");
-			}
-			tskfd[i] = perf_event_open(&tsk_event, 0, -1, hwfd[i], 0);
-			if (tskfd[i] == -1) {
-				if (errno == ENOENT || errno == ENODEV) {
-					tst_brkm(TCONF | TERRNO, cleanup,
-						"PERF_COUNT_SW_TASK_CLOCK not supported");
-				}
-				tst_brkm(TBROK | TERRNO, cleanup,
-					 "perf_event_open failed");
-			}
-		}
+	tsk_event.disabled = 0;
+	for (i = 0; i < ntotal; ++i) {
+		hwfd[i] = perf_event_open(&hw_event, 0, -1, -1, 0);
+		tskfd[i] = perf_event_open(&tsk_event, 0, -1, hwfd[i], 0);
 	}
 }
 
@@ -312,15 +238,13 @@ static void cleanup(void)
 {
 	int i;
 
-	for (i = 0; i < n; i++) {
-		if (hwfd[i] > 0 && close(hwfd[i]) == -1)
-			tst_resm(TWARN | TERRNO, "close(%d) failed", hwfd[i]);
-		if (tskfd[i] > 0 && close(tskfd[i]) == -1)
-			tst_resm(TWARN | TERRNO, "close(%d) failed", tskfd[i]);
+	for (i = 0; i < ntotal; i++) {
+		SAFE_CLOSE(hwfd[i]);
+		SAFE_CLOSE(tskfd[i]);
 	}
 
-	if (tsk0 > 0 && close(tsk0) == -1)
-		tst_resm(TWARN | TERRNO, "close(%d) failed", tsk0);
+	if (tsk0 != -1)
+		SAFE_CLOSE(tsk0);
 }
 
 static void verify(void)
@@ -332,82 +256,66 @@ static void verify(void)
 	struct sched_param sparam = {.sched_priority = 1};
 
 	if (sched_setscheduler(0, SCHED_FIFO, &sparam)) {
-		tst_brkm(TBROK | TERRNO, cleanup,
-			 "sched_setscheduler(0, SCHED_FIFO, ...) failed");
-	}
-
-	if (prctl(PR_TASK_PERF_EVENTS_ENABLE) == -1) {
-		tst_brkm(TBROK | TERRNO, cleanup,
-			 "prctl(PR_TASK_PERF_EVENTS_ENABLE) failed");
+		tst_brk(TBROK | TERRNO,
+			"sched_setscheduler(0, SCHED_FIFO, ...) failed");
 	}
 
+	all_counters_set(PR_TASK_PERF_EVENTS_ENABLE);
 	do_work();
-
 	/* stop groups with hw counters first before tsk0 */
-	for (i = 0; i < n; i++) {
+	for (i = 0; i < ntotal; i++) {
 		ioctl(hwfd[i], PERF_EVENT_IOC_DISABLE);
 		ioctl(tskfd[i], PERF_EVENT_IOC_DISABLE);
 	}
-
-	if (prctl(PR_TASK_PERF_EVENTS_DISABLE) == -1) {
-		tst_brkm(TBROK | TERRNO, cleanup,
-			 "prctl(PR_TASK_PERF_EVENTS_DISABLE) failed");
-	}
+	all_counters_set(PR_TASK_PERF_EVENTS_DISABLE);
 
 	sparam.sched_priority = 0;
 	if (sched_setscheduler(0, SCHED_OTHER, &sparam)) {
-		tst_brkm(TBROK | TERRNO, cleanup,
-			 "sched_setscheduler(0, SCHED_OTHER, ...) failed");
+		tst_brk(TBROK | TERRNO,
+			"sched_setscheduler(0, SCHED_OTHER, ...) failed");
 	}
 
-	if (read(tsk0, &vt0, sizeof(vt0)) != sizeof(vt0)) {
-		tst_brkm(TBROK | TERRNO, cleanup,
-			 "error reading task clock counter");
-	}
+	if (read(tsk0, &vt0, sizeof(vt0)) != sizeof(vt0))
+		tst_brk(TBROK | TERRNO, "error reading task clock counter");
 
-	for (i = 0; i < n; ++i) {
+	for (i = 0; i < ntotal; ++i) {
 		if (read(tskfd[i], &vt[i], sizeof(vt[i])) != sizeof(vt[i]) ||
-		    read(hwfd[i], &vh[i], sizeof(vh[i])) != sizeof(vh[i])) {
-			tst_brkm(TBROK | TERRNO, cleanup,
-				 "error reading counter(s)");
-		}
+		    read(hwfd[i], &vh[i], sizeof(vh[i])) != sizeof(vh[i]))
+			tst_brk(TBROK | TERRNO, "error reading counter(s)");
 		vtsum += vt[i];
 		vhsum += vh[i];
 	}
 
-	tst_resm(TINFO, "overall task clock: %llu", vt0);
-	tst_resm(TINFO, "hw sum: %llu, task clock sum: %llu", vhsum, vtsum);
-
-	if (verbose == 1) {
-		printf("hw counters:");
-		for (i = 0; i < n; ++i)
-			printf(" %llu", vh[i]);
-		printf("\ntask clock counters:");
-		for (i = 0; i < n; ++i)
-			printf(" %llu", vt[i]);
-		printf("\n");
+	tst_res(TINFO, "nhw: %d, overall task clock: %llu", nhw, vt0);
+	tst_res(TINFO, "hw sum: %llu, task clock sum: %llu", vhsum, vtsum);
+
+	if (verbose) {
+		tst_res(TINFO, "hw counters:");
+		for (i = 0; i < ntotal; ++i)
+			tst_res(TINFO, " %llu", vh[i]);
+		tst_res(TINFO, "task clock counters:");
+		for (i = 0; i < ntotal; ++i)
+			tst_res(TINFO, " %llu", vt[i]);
 	}
 
 	ratio = (double)vtsum / vt0;
-	tst_resm(TINFO, "ratio: %lf", ratio);
+	tst_res(TINFO, "ratio: %lf", ratio);
 	if (ratio > nhw + 0.0001) {
-		tst_resm(TFAIL, "test failed (ratio was greater than )");
+		tst_res(TFAIL, "test failed (ratio was greater than %d)", nhw);
 	} else {
-		tst_resm(TPASS, "test passed");
+		tst_res(TPASS, "test passed");
 	}
 }
 
-static void help(void)
-{
-	printf("  -v      Print verbose information\n");
-}
-
-#else
+static struct tst_test test = {
+	.setup = setup,
+	.cleanup = cleanup,
+	.options = options,
+	.test_all = verify,
+	.needs_root = 1,
+};
 
-int main(void)
-{
-	tst_brkm(TCONF, NULL, "This system doesn't have "
-		 "header file:<linux/perf_event.h> or "
-		 "no struct perf_event_attr defined");
-}
+#else /* HAVE_PERF_EVENT_ATTR */
+TST_TEST_TCONF("This system doesn't have <linux/perf_event.h> or "
+	"struct perf_event_attr is not defined.");
 #endif
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [LTP] [PATCH 2/3] perf_event_open02: make do_work() run for specified time
  2019-11-18 14:59 [LTP] [PATCH 0/3] perf_event_open02 tweaks Jan Stancek
  2019-11-18 14:59 ` [LTP] [PATCH 1/3] perf_event_open02: migrate to newlib Jan Stancek
@ 2019-11-18 14:59 ` Jan Stancek
  2019-11-20 12:33   ` Cyril Hrubis
  2019-11-21 19:21   ` [LTP] [PATCH v2 " Jan Stancek
  2019-11-18 14:59 ` [LTP] [PATCH 3/3] perf_event_open02: workaround for Pentium4 Jan Stancek
  2 siblings, 2 replies; 12+ messages in thread
From: Jan Stancek @ 2019-11-18 14:59 UTC (permalink / raw)
  To: ltp

do_work() runtime varies a lot, because it's based on a fixed
number of iterations. Set a timer and run for at least specified
time. We don't need fine accuracy, just some coarse runtime
across all systems. verify() function is using larger value to
get more precision for "ratio" calculation.

Signed-off-by: Jan Stancek <jstancek@redhat.com>
---
 .../syscalls/perf_event_open/perf_event_open02.c   | 36 +++++++++++++++++-----
 1 file changed, 29 insertions(+), 7 deletions(-)

diff --git a/testcases/kernel/syscalls/perf_event_open/perf_event_open02.c b/testcases/kernel/syscalls/perf_event_open/perf_event_open02.c
index 584487de8255..5891694eb894 100644
--- a/testcases/kernel/syscalls/perf_event_open/perf_event_open02.c
+++ b/testcases/kernel/syscalls/perf_event_open/perf_event_open02.c
@@ -31,12 +31,14 @@
 #include <errno.h>
 #include <inttypes.h>
 #include <sched.h>
+#include <signal.h>
 #include <stddef.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>
 #include <sys/prctl.h>
+#include <sys/time.h>
 #include <sys/types.h>
 
 #include "config.h"
@@ -49,7 +51,6 @@
 #include <linux/perf_event.h>
 
 #define MAX_CTRS	1000
-#define LOOPS		100000000
 
 struct read_format {
 	unsigned long long value;
@@ -67,6 +68,7 @@ static struct tst_option options[] = {
 
 static int ntotal, nhw;
 static int tsk0 = -1, hwfd[MAX_CTRS], tskfd[MAX_CTRS];
+static int volatile work_done;
 
 static int perf_event_open(struct perf_event_attr *event, pid_t pid,
 	int cpu, int group_fd, unsigned long flags)
@@ -98,14 +100,34 @@ static void all_counters_set(int state)
 		tst_brk(TBROK | TERRNO, "prctl(%d) failed", state);
 }
 
-static void do_work(void)
+void alarm_handler(int sig LTP_ATTRIBUTE_UNUSED)
+{
+	work_done = 1;
+}
+
+static void do_work(int time_ms)
 {
 	int i;
+	struct sigaction sa;
+	struct itimerval val;
 
-	for (i = 0; i < LOOPS; ++i)
-		asm volatile (""::"g" (i));
-}
+	work_done = 0;
+	memset(&val, 0, sizeof(val));
+	val.it_value.tv_sec = time_ms / 1000;
+	val.it_value.tv_usec = (time_ms % 1000) * 1000;
 
+	sa.sa_handler = alarm_handler;
+	sa.sa_flags = SA_RESETHAND;
+	SAFE_SIGACTION(SIGALRM, &sa, NULL);
+
+	if (setitimer(ITIMER_REAL, &val, NULL))
+		tst_brk(TBROK | TERRNO, "setitimer");
+
+	while (!work_done) {
+		for (i = 0; i < 100000; ++i)
+			asm volatile (""::"g" (i));
+	}
+}
 
 #ifndef __s390__
 static int count_hardware_counters(void)
@@ -128,7 +150,7 @@ static int count_hardware_counters(void)
 		fdarry[i] = perf_event_open(&hw_event, 0, -1, -1, 0);
 
 		all_counters_set(PR_TASK_PERF_EVENTS_ENABLE);
-		do_work();
+		do_work(500);
 		all_counters_set(PR_TASK_PERF_EVENTS_DISABLE);
 
 		if (read(fdarry[i], &buf, sizeof(buf)) != sizeof(buf))
@@ -261,7 +283,7 @@ static void verify(void)
 	}
 
 	all_counters_set(PR_TASK_PERF_EVENTS_ENABLE);
-	do_work();
+	do_work(4000);
 	/* stop groups with hw counters first before tsk0 */
 	for (i = 0; i < ntotal; i++) {
 		ioctl(hwfd[i], PERF_EVENT_IOC_DISABLE);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [LTP] [PATCH 3/3] perf_event_open02: workaround for Pentium4
  2019-11-18 14:59 [LTP] [PATCH 0/3] perf_event_open02 tweaks Jan Stancek
  2019-11-18 14:59 ` [LTP] [PATCH 1/3] perf_event_open02: migrate to newlib Jan Stancek
  2019-11-18 14:59 ` [LTP] [PATCH 2/3] perf_event_open02: make do_work() run for specified time Jan Stancek
@ 2019-11-18 14:59 ` Jan Stancek
  2019-11-20 13:44   ` Cyril Hrubis
  2 siblings, 1 reply; 12+ messages in thread
From: Jan Stancek @ 2019-11-18 14:59 UTC (permalink / raw)
  To: ltp

perf_event_open02 has been observed to fail reliably on Pentium4,
because it detects wrong number of HW counters. This is because
time_enabled and time_running differ by a small value.

On Pentium4, p4_pmu_schedule_events() appears to be called repeatedly
for a group, for example when hwc pointer is being reused by different
cpu [1]. So, a group can fail to get scheduled on pmu on first attempt,
but *sched_in/*sched_out actions still happen. Event state changes
between ACTIVE/INACTIVE couple times, ctx->timestamp is updated as well.
Every time state changes a __perf_update_times() function is called,
which updates time_enabled and time_running. If state is INACTIVE,
only time_enabled is updated.

To workaround that, don't look at/compare absolute values. Instead
let process run for a bit and then look at increments. If both
are of same value, assume that no multiplexing happened and increase
"hwctrs" value.

Also always print time_enabled/time_running values. Number of HW
counters should be low and this data might come handy if a failure
is observed.

[1] 13beacee817d ("perf/x86/p4: Fix counter corruption when using lots of perf groups")

Signed-off-by: Jan Stancek <jstancek@redhat.com>
---
 .../syscalls/perf_event_open/perf_event_open02.c   | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/testcases/kernel/syscalls/perf_event_open/perf_event_open02.c b/testcases/kernel/syscalls/perf_event_open/perf_event_open02.c
index 5891694eb894..e89726dbcfdd 100644
--- a/testcases/kernel/syscalls/perf_event_open/perf_event_open02.c
+++ b/testcases/kernel/syscalls/perf_event_open/perf_event_open02.c
@@ -135,7 +135,7 @@ static int count_hardware_counters(void)
 	struct perf_event_attr hw_event;
 	int i, hwctrs = 0;
 	int fdarry[MAX_CTRS];
-	struct read_format buf;
+	struct read_format buf, buf2, diff;
 
 	memset(&hw_event, 0, sizeof(struct perf_event_attr));
 
@@ -151,16 +151,20 @@ static int count_hardware_counters(void)
 
 		all_counters_set(PR_TASK_PERF_EVENTS_ENABLE);
 		do_work(500);
+		if (read(fdarry[i], &buf, sizeof(buf)) != sizeof(buf))
+			tst_brk(TBROK | TERRNO, "error reading counter(s) #1");
+		do_work(500);
 		all_counters_set(PR_TASK_PERF_EVENTS_DISABLE);
+		if (read(fdarry[i], &buf2, sizeof(buf2)) != sizeof(buf2))
+			tst_brk(TBROK | TERRNO, "error reading counter(s) #2");
 
-		if (read(fdarry[i], &buf, sizeof(buf)) != sizeof(buf))
-			tst_brk(TBROK | TERRNO, "error reading counter(s)");
+		diff.value = buf2.value - buf.value;
+		diff.time_enabled = buf2.time_enabled - buf.time_enabled;
+		diff.time_running = buf2.time_running - buf.time_running;
 
-		if (verbose) {
-			tst_res(TINFO, "[%d] value:%lld time_enabled:%lld "
-			       "time_running:%lld", i, buf.value,
-			       buf.time_enabled, buf.time_running);
-		}
+		tst_res(TINFO, "[%d] value:%lld time_enabled:%lld "
+		       "time_running:%lld", i, diff.value,
+		       diff.time_enabled, diff.time_running);
 
 		/*
 		 * Normally time_enabled and time_running are the same value.
@@ -174,7 +178,7 @@ static int count_hardware_counters(void)
 		 * multiplexing happens and the number of the opened events
 		 * are the number of max available hardware counters.
 		 */
-		if (buf.time_enabled != buf.time_running) {
+		if (diff.time_enabled != diff.time_running) {
 			hwctrs = i;
 			break;
 		}
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [LTP] [PATCH 1/3] perf_event_open02: migrate to newlib
  2019-11-18 14:59 ` [LTP] [PATCH 1/3] perf_event_open02: migrate to newlib Jan Stancek
@ 2019-11-20 12:29   ` Cyril Hrubis
  2019-11-21  8:45     ` Jan Stancek
  0 siblings, 1 reply; 12+ messages in thread
From: Cyril Hrubis @ 2019-11-20 12:29 UTC (permalink / raw)
  To: ltp

Hi!
> @@ -312,15 +238,13 @@ static void cleanup(void)
>  {
>  	int i;
>  
> -	for (i = 0; i < n; i++) {
> -		if (hwfd[i] > 0 && close(hwfd[i]) == -1)
> -			tst_resm(TWARN | TERRNO, "close(%d) failed", hwfd[i]);
> -		if (tskfd[i] > 0 && close(tskfd[i]) == -1)
> -			tst_resm(TWARN | TERRNO, "close(%d) failed", tskfd[i]);
> +	for (i = 0; i < ntotal; i++) {
> +		SAFE_CLOSE(hwfd[i]);
> +		SAFE_CLOSE(tskfd[i]);

Shouldn't we check that the hwfd[i] and tskfd[i] are > 0?

I guess that we may do SAFE_CLOSE(0) repeatedly here in a case that
perf_event_open() failed somewhere in the middle of the setup().

Otherwise this is a nice cleanup, acked.

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [LTP] [PATCH 2/3] perf_event_open02: make do_work() run for specified time
  2019-11-18 14:59 ` [LTP] [PATCH 2/3] perf_event_open02: make do_work() run for specified time Jan Stancek
@ 2019-11-20 12:33   ` Cyril Hrubis
  2019-11-21  8:48     ` Jan Stancek
  2019-11-21 19:21   ` [LTP] [PATCH v2 " Jan Stancek
  1 sibling, 1 reply; 12+ messages in thread
From: Cyril Hrubis @ 2019-11-20 12:33 UTC (permalink / raw)
  To: ltp

> -static void do_work(void)
> +void alarm_handler(int sig LTP_ATTRIBUTE_UNUSED)

static void ?

> +{
> +	work_done = 1;
> +}
> +
> +static void do_work(int time_ms)
>  {
>  	int i;
> +	struct sigaction sa;
> +	struct itimerval val;
>  
> -	for (i = 0; i < LOOPS; ++i)
> -		asm volatile (""::"g" (i));
> -}
> +	work_done = 0;
> +	memset(&val, 0, sizeof(val));
> +	val.it_value.tv_sec = time_ms / 1000;
> +	val.it_value.tv_usec = (time_ms % 1000) * 1000;
>  
> +	sa.sa_handler = alarm_handler;
> +	sa.sa_flags = SA_RESETHAND;
> +	SAFE_SIGACTION(SIGALRM, &sa, NULL);

I would have set up the signal handler just once in the test setup.

> +	if (setitimer(ITIMER_REAL, &val, NULL))
> +		tst_brk(TBROK | TERRNO, "setitimer");
> +
> +	while (!work_done) {
> +		for (i = 0; i < 100000; ++i)
> +			asm volatile (""::"g" (i));
> +	}
> +}

Otherwise it looks great, acked.

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [LTP] [PATCH 3/3] perf_event_open02: workaround for Pentium4
  2019-11-18 14:59 ` [LTP] [PATCH 3/3] perf_event_open02: workaround for Pentium4 Jan Stancek
@ 2019-11-20 13:44   ` Cyril Hrubis
  0 siblings, 0 replies; 12+ messages in thread
From: Cyril Hrubis @ 2019-11-20 13:44 UTC (permalink / raw)
  To: ltp

Hi!
Looks good, acked.

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [LTP] [PATCH 1/3] perf_event_open02: migrate to newlib
  2019-11-20 12:29   ` Cyril Hrubis
@ 2019-11-21  8:45     ` Jan Stancek
  0 siblings, 0 replies; 12+ messages in thread
From: Jan Stancek @ 2019-11-21  8:45 UTC (permalink / raw)
  To: ltp



----- Original Message -----
> Hi!
> > @@ -312,15 +238,13 @@ static void cleanup(void)
> >  {
> >  	int i;
> >  
> > -	for (i = 0; i < n; i++) {
> > -		if (hwfd[i] > 0 && close(hwfd[i]) == -1)
> > -			tst_resm(TWARN | TERRNO, "close(%d) failed", hwfd[i]);
> > -		if (tskfd[i] > 0 && close(tskfd[i]) == -1)
> > -			tst_resm(TWARN | TERRNO, "close(%d) failed", tskfd[i]);
> > +	for (i = 0; i < ntotal; i++) {
> > +		SAFE_CLOSE(hwfd[i]);
> > +		SAFE_CLOSE(tskfd[i]);
> 
> Shouldn't we check that the hwfd[i] and tskfd[i] are > 0?
> 
> I guess that we may do SAFE_CLOSE(0) repeatedly here in a case that
> perf_event_open() failed somewhere in the middle of the setup().

I'll add check for -1, since 0 could be valid (even though it's unlikely).

> 
> Otherwise this is a nice cleanup, acked.

Thanks for review.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [LTP] [PATCH 2/3] perf_event_open02: make do_work() run for specified time
  2019-11-20 12:33   ` Cyril Hrubis
@ 2019-11-21  8:48     ` Jan Stancek
  0 siblings, 0 replies; 12+ messages in thread
From: Jan Stancek @ 2019-11-21  8:48 UTC (permalink / raw)
  To: ltp


----- Original Message -----
> > -static void do_work(void)
> > +void alarm_handler(int sig LTP_ATTRIBUTE_UNUSED)
> 
> static void ?

Yes

> 
> > +{
> > +	work_done = 1;
> > +}
> > +
> > +static void do_work(int time_ms)
> >  {
> >  	int i;
> > +	struct sigaction sa;
> > +	struct itimerval val;
> >  
> > -	for (i = 0; i < LOOPS; ++i)
> > -		asm volatile (""::"g" (i));
> > -}
> > +	work_done = 0;
> > +	memset(&val, 0, sizeof(val));
> > +	val.it_value.tv_sec = time_ms / 1000;
> > +	val.it_value.tv_usec = (time_ms % 1000) * 1000;
> >  
> > +	sa.sa_handler = alarm_handler;
> > +	sa.sa_flags = SA_RESETHAND;
> > +	SAFE_SIGACTION(SIGALRM, &sa, NULL);
> 
> I would have set up the signal handler just once in the test setup.

I'll move it.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [LTP] [PATCH v2 2/3] perf_event_open02: make do_work() run for specified time
  2019-11-18 14:59 ` [LTP] [PATCH 2/3] perf_event_open02: make do_work() run for specified time Jan Stancek
  2019-11-20 12:33   ` Cyril Hrubis
@ 2019-11-21 19:21   ` Jan Stancek
  2019-11-22 13:33     ` Cyril Hrubis
  1 sibling, 1 reply; 12+ messages in thread
From: Jan Stancek @ 2019-11-21 19:21 UTC (permalink / raw)
  To: ltp

do_work() runtime varies a lot, because it's based on a fixed
number of iterations. Use a timer and measure how many iterations
are needed to run for specified time. We don't need fine accuracy,
just some coarse runtime across all systems. verify() function is
using larger value to get more precision for "ratio" calculation.

Signed-off-by: Jan Stancek <jstancek@redhat.com>
---
 .../syscalls/perf_event_open/perf_event_open02.c   | 52 +++++++++++++++++++---
 1 file changed, 45 insertions(+), 7 deletions(-)

Notes for v2
--------------

Problem with v1 is that it hangs on -rt kernel. With FIFO scheduling
signal handler never runs and there's nothing to stop do_work().
Adding sched_yield() to periodically call in do_work() is problem too,
because test spends more time in kernel and it breaks the expectation
for 'ratio'.

So in v2, use timer only to estimate number of (100 thousands) loops
needed to run for specified time. Actual test will use the estimated
number (or multiple of it).

Tested with: 5.3.0-0.rc6.git0.1.fc31.x86_64
	     4.18.0-143.el8.x86_64
	     4.18.0-147.rt24.93.el8.x86_64

diff --git a/testcases/kernel/syscalls/perf_event_open/perf_event_open02.c b/testcases/kernel/syscalls/perf_event_open/perf_event_open02.c
index 2f0caab56c08..87ed7f3d4abe 100644
--- a/testcases/kernel/syscalls/perf_event_open/perf_event_open02.c
+++ b/testcases/kernel/syscalls/perf_event_open/perf_event_open02.c
@@ -31,12 +31,14 @@
 #include <errno.h>
 #include <inttypes.h>
 #include <sched.h>
+#include <signal.h>
 #include <stddef.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>
 #include <sys/prctl.h>
+#include <sys/time.h>
 #include <sys/types.h>
 
 #include "config.h"
@@ -49,7 +51,6 @@
 #include <linux/perf_event.h>
 
 #define MAX_CTRS	1000
-#define LOOPS		100000000
 
 struct read_format {
 	unsigned long long value;
@@ -67,6 +68,8 @@ static struct tst_option options[] = {
 
 static int ntotal, nhw;
 static int tsk0 = -1, hwfd[MAX_CTRS], tskfd[MAX_CTRS];
+static int volatile work_done;
+static unsigned int est_loops;
 
 static int perf_event_open(struct perf_event_attr *event, pid_t pid,
 	int cpu, int group_fd, unsigned long flags)
@@ -98,14 +101,47 @@ static void all_counters_set(int state)
 		tst_brk(TBROK | TERRNO, "prctl(%d) failed", state);
 }
 
-static void do_work(void)
+static void alarm_handler(int sig LTP_ATTRIBUTE_UNUSED)
 {
-	int i;
+	work_done = 1;
+}
+
+static void bench_work(int time_ms)
+{
+	unsigned int i;
+	struct itimerval val;
+	struct sigaction sa;
+
+	memset(&sa, 0, sizeof(sa));
+	sa.sa_handler = alarm_handler;
+	sa.sa_flags = SA_RESETHAND;
+	SAFE_SIGACTION(SIGALRM, &sa, NULL);
+
+	work_done = 0;
+	memset(&val, 0, sizeof(val));
+	val.it_value.tv_sec = time_ms / 1000;
+	val.it_value.tv_usec = (time_ms % 1000) * 1000;
+
+	if (setitimer(ITIMER_REAL, &val, NULL))
+		tst_brk(TBROK | TERRNO, "setitimer");
+
+	while (!work_done) {
+		for (i = 0; i < 100000; ++i)
+			asm volatile (""::"g" (i));
+		est_loops++;
+	}
 
-	for (i = 0; i < LOOPS; ++i)
-		asm volatile (""::"g" (i));
+	tst_res(TINFO, "bench_work estimated loops = %u in %d ms", est_loops, time_ms);
 }
 
+static void do_work(int mult)
+{
+	unsigned long i, j, loops = mult * est_loops;
+
+	for (j = 0; j < loops; j++)
+		for (i = 0; i < 100000; i++)
+			asm volatile (""::"g" (i));
+}
 
 #ifndef __s390__
 static int count_hardware_counters(void)
@@ -128,7 +164,7 @@ static int count_hardware_counters(void)
 		fdarry[i] = perf_event_open(&hw_event, 0, -1, -1, 0);
 
 		all_counters_set(PR_TASK_PERF_EVENTS_ENABLE);
-		do_work();
+		do_work(1);
 		all_counters_set(PR_TASK_PERF_EVENTS_DISABLE);
 
 		if (read(fdarry[i], &buf, sizeof(buf)) != sizeof(buf))
@@ -195,6 +231,8 @@ static void setup(void)
 		tskfd[i] = -1;
 	}
 
+	bench_work(500);
+
 	/*
 	 * According to perf_event_open's manpage, the official way of
 	 * knowing if perf_event_open() support is enabled is checking for
@@ -268,7 +306,7 @@ static void verify(void)
 	}
 
 	all_counters_set(PR_TASK_PERF_EVENTS_ENABLE);
-	do_work();
+	do_work(8);
 	/* stop groups with hw counters first before tsk0 */
 	for (i = 0; i < ntotal; i++) {
 		ioctl(hwfd[i], PERF_EVENT_IOC_DISABLE);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [LTP] [PATCH v2 2/3] perf_event_open02: make do_work() run for specified time
  2019-11-21 19:21   ` [LTP] [PATCH v2 " Jan Stancek
@ 2019-11-22 13:33     ` Cyril Hrubis
  2019-11-25 14:54       ` Jan Stancek
  0 siblings, 1 reply; 12+ messages in thread
From: Cyril Hrubis @ 2019-11-22 13:33 UTC (permalink / raw)
  To: ltp

Hi!
> do_work() runtime varies a lot, because it's based on a fixed
> number of iterations. Use a timer and measure how many iterations
> are needed to run for specified time. We don't need fine accuracy,
> just some coarse runtime across all systems. verify() function is
> using larger value to get more precision for "ratio" calculation.
> 
> Signed-off-by: Jan Stancek <jstancek@redhat.com>
> ---
>  .../syscalls/perf_event_open/perf_event_open02.c   | 52 +++++++++++++++++++---
>  1 file changed, 45 insertions(+), 7 deletions(-)
> 
> Notes for v2
> --------------
> 
> Problem with v1 is that it hangs on -rt kernel. With FIFO scheduling
> signal handler never runs and there's nothing to stop do_work().
> Adding sched_yield() to periodically call in do_work() is problem too,
> because test spends more time in kernel and it breaks the expectation
> for 'ratio'.
> 
> So in v2, use timer only to estimate number of (100 thousands) loops
> needed to run for specified time. Actual test will use the estimated
> number (or multiple of it).
> 
> Tested with: 5.3.0-0.rc6.git0.1.fc31.x86_64
> 	     4.18.0-143.el8.x86_64
> 	     4.18.0-147.rt24.93.el8.x86_64

Looks good, acked.

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [LTP] [PATCH v2 2/3] perf_event_open02: make do_work() run for specified time
  2019-11-22 13:33     ` Cyril Hrubis
@ 2019-11-25 14:54       ` Jan Stancek
  0 siblings, 0 replies; 12+ messages in thread
From: Jan Stancek @ 2019-11-25 14:54 UTC (permalink / raw)
  To: ltp


----- Original Message -----
> > 
> > Tested with: 5.3.0-0.rc6.git0.1.fc31.x86_64
> > 	     4.18.0-143.el8.x86_64
> > 	     4.18.0-147.rt24.93.el8.x86_64
> 
> Looks good, acked.

Series pushed.

I tested also older kernels based on 2.6.32 (RHEL6.10), and recently
released 5.4.


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2019-11-25 14:54 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-18 14:59 [LTP] [PATCH 0/3] perf_event_open02 tweaks Jan Stancek
2019-11-18 14:59 ` [LTP] [PATCH 1/3] perf_event_open02: migrate to newlib Jan Stancek
2019-11-20 12:29   ` Cyril Hrubis
2019-11-21  8:45     ` Jan Stancek
2019-11-18 14:59 ` [LTP] [PATCH 2/3] perf_event_open02: make do_work() run for specified time Jan Stancek
2019-11-20 12:33   ` Cyril Hrubis
2019-11-21  8:48     ` Jan Stancek
2019-11-21 19:21   ` [LTP] [PATCH v2 " Jan Stancek
2019-11-22 13:33     ` Cyril Hrubis
2019-11-25 14:54       ` Jan Stancek
2019-11-18 14:59 ` [LTP] [PATCH 3/3] perf_event_open02: workaround for Pentium4 Jan Stancek
2019-11-20 13:44   ` Cyril Hrubis

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.