* [igt-dev] [PATCH i-g-t 1/2] runner: Refactor timeouting
@ 2020-02-17 14:50 Petri Latvala
  2020-02-17 14:50 ` [igt-dev] [PATCH i-g-t 2/2] runner: Introduce per-test timeouts Petri Latvala
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: Petri Latvala @ 2020-02-17 14:50 UTC (permalink / raw)
  To: igt-dev; +Cc: Petri Latvala

Instead of aiming for inactivity_timeout and splitting that into
suitable intervals for watchdog pinging, replace the whole logic with
one-second select() timeouts. After each wakeup, check whether a
timeout condition has been reached, based on the current time and the
time elapsed since a particular event, be it the last activity or the
signaling of the child processes.

With the refactoring, we gain a couple of new features for free:

- use-watchdog now makes sense even without
inactivity-timeout. Previously use-watchdog was silently ignored if
inactivity-timeout was not set. Now watchdogs are always used when
configured, effectively ensuring the device gets rebooted if
userspace dies without other timeout tracking.

- Killing tests on kernel taint now happens even earlier. Previously,
on an inactive system, we could wait for some tens of seconds before
checking kernel taints.

Signed-off-by: Petri Latvala <petri.latvala@intel.com>
---
 runner/executor.c | 224 +++++++++++++++++++++++-----------------------
 1 file changed, 113 insertions(+), 111 deletions(-)

diff --git a/runner/executor.c b/runner/executor.c
index 3ea5d167..33610c9e 100644
--- a/runner/executor.c
+++ b/runner/executor.c
@@ -93,7 +93,7 @@ static void init_watchdogs(struct settings *settings)
 
 	memset(&watchdogs, 0, sizeof(watchdogs));
 
-	if (!settings->use_watchdog || settings->inactivity_timeout <= 0)
+	if (!settings->use_watchdog)
 		return;
 
 	if (settings->log_level >= LOG_LEVEL_VERBOSE) {
@@ -672,6 +672,69 @@ static void show_kernel_task_state(void)
 	sysrq('t');
 }
 
+static const char *need_to_timeout(struct settings *settings,
+				   int killed,
+				   unsigned long taints,
+				   double time_since_activity,
+				   double time_since_kill)
+{
+	if (killed) {
+		/*
+		 * Timeout after being killed is a hardcoded amount
+		 * depending on which signal we already used. The
+		 * exception is SIGKILL which just immediately bails
+		 * out if the kernel is tainted, because there's
+		 * little to no hope of the process dying gracefully
+		 * or at all.
+		 *
+		 * Note that if killed == SIGKILL, the caller needs
+		 * special handling anyway and should ignore the
+		 * actual string returned.
+		 */
+		const double kill_timeout = killed == SIGKILL ? 20.0 : 120.0;
+
+		if ((killed == SIGKILL && is_tainted(taints)) ||
+		    time_since_kill > kill_timeout)
+			return "Timeout. Killing the current test with SIGKILL.\n";
+
+		/*
+		 * We don't care for the other reasons to timeout if
+		 * we're already killing the test.
+		 */
+		return NULL;
+	}
+
+	/*
+	 * If we're configured to care about taints, kill the
+	 * test if there's a taint.
+	 */
+	if (settings->abort_mask & ABORT_TAINT &&
+	    is_tainted(taints))
+		return "Killing the test because the kernel is tainted.\n";
+
+	if (settings->inactivity_timeout != 0 &&
+	    time_since_activity > settings->inactivity_timeout) {
+		show_kernel_task_state();
+		return "Timeout. Killing the current test with SIGQUIT.\n";
+	}
+
+	return NULL;
+}
+
+static int next_kill_signal(int killed)
+{
+	switch (killed) {
+	case 0:
+		return SIGQUIT;
+	case SIGQUIT:
+		return SIGKILL;
+	case SIGKILL:
+	default:
+		assert(!"Unreachable");
+		return SIGKILL;
+	}
+}
+
 /*
  * Returns:
  *  =0 - Success
@@ -693,18 +756,15 @@ static int monitor_output(pid_t child,
 	ssize_t s;
 	int n, status;
 	int nfds = outfd;
-	int timeout = settings->inactivity_timeout;
-	int timeout_intervals = 1, intervals_left;
-	int wd_extra = 10;
+	const int interval_length = 1;
+	int wd_timeout;
 	int killed = 0; /* 0 if not killed, signal number otherwise */
-	int sigkill_timeout = 120;
-	int sigkill_interval = 20;
-	int sigkill_intervals_left = sigkill_timeout / sigkill_interval;
-	struct timespec time_beg, time_end;
+	struct timespec time_beg, time_now, time_last_activity, time_killed;
 	unsigned long taints = 0;
 	bool aborting = false;
 
 	igt_gettime(&time_beg);
+	time_last_activity = time_killed = time_beg;
 
 	if (errfd > nfds)
 		nfds = errfd;
@@ -714,32 +774,32 @@ static int monitor_output(pid_t child,
 		nfds = sigfd;
 	nfds++;
 
-	if (timeout > 0) {
+	/*
+	 * If we're still alive, we want to kill the test process
+	 * instead of cutting power. Use a healthy 2 minute watchdog
+	 * timeout that gets automatically reduced if the device
+	 * doesn't support it.
+	 *
+	 * watchdogs_set_timeout() is a no-op and returns the given
+	 * timeout if we don't have use_watchdog set in settings.
+	 */
+	wd_timeout = watchdogs_set_timeout(120);
+
+	if (wd_timeout < 120) {
 		/*
-		 * Use original timeout plus some leeway. If we're still
-		 * alive, we want to kill the test process instead of cutting
-		 * power.
+		 * Watchdog timeout smaller, warn the user. With the
+		 * short select() timeout we're using we're able to
+		 * ping the watchdog regardless.
 		 */
-		int wd_timeout = watchdogs_set_timeout(timeout + wd_extra);
-
-		if (wd_timeout < timeout + wd_extra) {
-			/* Watchdog timeout smaller, so ping it more often */
-			if (wd_timeout - wd_extra < 0)
-				wd_extra = wd_timeout / 2;
-			timeout_intervals = timeout / (wd_timeout - wd_extra);
-			timeout /= timeout_intervals;
-
-			if (settings->log_level >= LOG_LEVEL_VERBOSE) {
-				outf("Watchdog doesn't support the timeout we requested (shortened to %d seconds). Using %d intervals of %d seconds.\n",
-				     wd_timeout, timeout_intervals, timeout);
-			}
+		if (settings->log_level >= LOG_LEVEL_VERBOSE) {
+			outf("Watchdog doesn't support the timeout we requested (shortened to %d seconds).\n",
+			     wd_timeout);
 		}
 	}
 
-	intervals_left = timeout_intervals;
-
 	while (outfd >= 0 || errfd >= 0 || sigfd >= 0) {
-		struct timeval tv = { .tv_sec = timeout };
+		const char *timeout_reason;
+		struct timeval tv = { .tv_sec = interval_length };
 
 		FD_ZERO(&set);
 		if (outfd >= 0)
@@ -751,7 +811,7 @@ static int monitor_output(pid_t child,
 		if (sigfd >= 0)
 			FD_SET(sigfd, &set);
 
-		n = select(nfds, &set, NULL, NULL, timeout == 0 ? NULL : &tv);
+		n = select(nfds, &set, NULL, NULL, &tv);
 		ping_watchdogs();
 
 		if (n < 0) {
@@ -759,86 +819,18 @@ static int monitor_output(pid_t child,
 			return -1;
 		}
 
-		/*
-		 * If we're configured to care about taints, kill the
-		 * test if there's a taint. But only if we didn't
-		 * already kill it, and make sure we still process the
-		 * fds select() marked for us.
-		 */
-		if (settings->abort_mask & ABORT_TAINT &&
-		    tainted(&taints) &&
-		    killed == 0) {
-			if (settings->log_level >= LOG_LEVEL_NORMAL) {
-				outf("Killing the test because the kernel is tainted.\n");
-				fflush(stdout);
-			}
-
-			killed = SIGQUIT;
-			if (!kill_child(killed, child))
-				return -1;
-
-			/*
-			 * Now continue the loop and let the
-			 * dying child be handled normally.
-			 */
-			timeout = 20;
-			watchdogs_set_timeout(120);
-			intervals_left = timeout_intervals = 1;
-		} else if (n == 0) {
-			if (--intervals_left)
-				continue;
+		igt_gettime(&time_now);
 
-			switch (killed) {
-			case 0:
-				show_kernel_task_state();
-				if (settings->log_level >= LOG_LEVEL_NORMAL) {
-					outf("Timeout. Killing the current test with SIGQUIT.\n");
-					fflush(stdout);
-				}
-
-				killed = SIGQUIT;
-				if (!kill_child(killed, child))
-					return -1;
-
-				/*
-				 * Now continue the loop and let the
-				 * dying child be handled normally.
-				 */
-				timeout = 20;
-				watchdogs_set_timeout(120);
-				intervals_left = timeout_intervals = 1;
-				break;
-			case SIGQUIT:
-				if (settings->log_level >= LOG_LEVEL_NORMAL) {
-					outf("Timeout. Killing the current test with SIGKILL.\n");
-					fflush(stdout);
-				}
-
-				killed = SIGKILL;
-				if (!kill_child(killed, child))
-					return -1;
-
-				/*
-				 * Allow the test two minutes to die
-				 * on SIGKILL. If it takes more than
-				 * that, we're quite likely in a
-				 * scenario where we want to reboot
-				 * the machine anyway.
-				 */
-				watchdogs_set_timeout(sigkill_timeout);
-				timeout = sigkill_interval;
-				intervals_left = 1; /* Intervals handled separately for sigkill */
-				break;
-			case SIGKILL:
-				if (!is_tainted(taints) && --sigkill_intervals_left) {
-					intervals_left = 1;
-					break;
-				}
+		timeout_reason = need_to_timeout(settings, killed, tainted(&taints),
+						 igt_time_elapsed(&time_last_activity, &time_now),
+						 igt_time_elapsed(&time_killed, &time_now));
 
+		if (timeout_reason) {
+			if (killed == SIGKILL) {
 				/* Nothing that can be done, really. Let's tell the caller we want to abort. */
 
 				if (settings->log_level >= LOG_LEVEL_NORMAL) {
-					errf("Child refuses to die, tainted %lx. Aborting.\n",
+					errf("Child refuses to die, tainted 0x%lx. Aborting.\n",
 					     taints);
 					if (kill(child, 0) && errno == ESRCH)
 						errf("The test process no longer exists, "
@@ -853,15 +845,23 @@ static int monitor_output(pid_t child,
 				return -1;
 			}
 
-			continue;
-		}
+			if (settings->log_level >= LOG_LEVEL_NORMAL) {
+				outf("%s", timeout_reason);
+				fflush(stdout);
+			}
 
-		intervals_left = timeout_intervals;
+			killed = next_kill_signal(killed);
+			if (!kill_child(killed, child))
+				return -1;
+			time_killed = time_now;
+		}
 
 		/* TODO: Refactor these handlers to their own functions */
 		if (outfd >= 0 && FD_ISSET(outfd, &set)) {
 			char *newline;
 
+			time_last_activity = time_now;
+
 			s = read(outfd, buf, sizeof(buf));
 			if (s <= 0) {
 				if (s < 0) {
@@ -929,6 +929,8 @@ static int monitor_output(pid_t child,
 	out_end:
 
 		if (errfd >= 0 && FD_ISSET(errfd, &set)) {
+			time_last_activity = time_now;
+
 			s = read(errfd, buf, sizeof(buf));
 			if (s <= 0) {
 				if (s < 0) {
@@ -945,6 +947,8 @@ static int monitor_output(pid_t child,
 		}
 
 		if (kmsgfd >= 0 && FD_ISSET(kmsgfd, &set)) {
+			time_last_activity = time_now;
+
 			s = read(kmsgfd, buf, sizeof(buf));
 			if (s < 0) {
 				if (errno != EPIPE && errno != EINVAL) {
@@ -995,17 +999,15 @@ static int monitor_output(pid_t child,
 				}
 
 				aborting = true;
-				timeout = 2;
 				killed = SIGQUIT;
 				if (!kill_child(killed, child))
 					return -1;
+				time_killed = time_now;
 
 				continue;
 			}
 
-			igt_gettime(&time_end);
-
-			time = igt_time_elapsed(&time_beg, &time_end);
+			time = igt_time_elapsed(&time_beg, &time_now);
 			if (time < 0.0)
 				time = 0.0;
 
-- 
2.20.1

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev


* [igt-dev] [PATCH i-g-t 2/2] runner: Introduce per-test timeouts
  2020-02-17 14:50 [igt-dev] [PATCH i-g-t 1/2] runner: Refactor timeouting Petri Latvala
@ 2020-02-17 14:50 ` Petri Latvala
  2020-02-18 10:24   ` Chris Wilson
  2020-02-17 16:47 ` [igt-dev] ✗ Fi.CI.BAT: failure for series starting with [i-g-t,1/2] runner: Refactor timeouting Patchwork
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 11+ messages in thread
From: Petri Latvala @ 2020-02-17 14:50 UTC (permalink / raw)
  To: igt-dev; +Cc: Petri Latvala

A new config option, --per-test-timeout, sets a time a single test
cannot exceed without getting itself killed. The time resets when
starting a subtest or a dynamic subtest, so an execution with
--per-test-timeout=20 can indeed go over 20 seconds as long as it
launches a dynamic subtest within that time.

As a bonus, verbose log level from runner now also prints dynamic
subtest begin/result.
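
The reset behaviour described above can be sketched as below. This is
only an illustration under stated assumptions: the struct and function
names are hypothetical, time is passed in as plain seconds, and the two
marker strings are assumed values standing in for the
STARTING_SUBTEST and STARTING_DYNAMIC_SUBTEST macros that the patch
refers to but does not show.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed marker strings; the real definitions live outside this patch. */
#define STARTING_SUBTEST "Starting subtest: "
#define STARTING_DYNAMIC_SUBTEST "Starting dynamic subtest: "

/* Hypothetical bookkeeping: when the current (dynamic) subtest began. */
struct subtest_clock {
	double last_subtest;
};

/* True if the line is longer than 'prefix' and begins with it. */
static bool has_prefix(const char *line, size_t len, const char *prefix)
{
	size_t plen = strlen(prefix);

	return len > plen && !memcmp(line, prefix, plen);
}

/* Restart the per-test clock when a subtest or dynamic subtest begins. */
static void note_line(struct subtest_clock *c, const char *line,
		      size_t len, double now)
{
	if (has_prefix(line, len, STARTING_SUBTEST) ||
	    has_prefix(line, len, STARTING_DYNAMIC_SUBTEST))
		c->last_subtest = now;
}

/* A per_test_timeout of 0 means the check is disabled. */
static bool per_test_timeout_exceeded(int per_test_timeout,
				      const struct subtest_clock *c,
				      double now)
{
	return per_test_timeout != 0 &&
	       now - c->last_subtest > per_test_timeout;
}
```

With a timeout of 20 and a dynamic subtest starting 15 seconds in, the
test is still within its budget at the 30 second mark, since only 15
seconds have passed since the last reset.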

Signed-off-by: Petri Latvala <petri.latvala@intel.com>
---
 runner/executor.c     | 34 +++++++++++++++++++++++++++++++---
 runner/runner_tests.c |  6 ++++++
 runner/settings.c     | 11 +++++++++++
 runner/settings.h     |  1 +
 4 files changed, 49 insertions(+), 3 deletions(-)

diff --git a/runner/executor.c b/runner/executor.c
index 33610c9e..72e45b65 100644
--- a/runner/executor.c
+++ b/runner/executor.c
@@ -676,6 +676,7 @@ static const char *need_to_timeout(struct settings *settings,
 				   int killed,
 				   unsigned long taints,
 				   double time_since_activity,
+				   double time_since_subtest,
 				   double time_since_kill)
 {
 	if (killed) {
@@ -712,10 +713,16 @@ static const char *need_to_timeout(struct settings *settings,
 	    is_tainted(taints))
 		return "Killing the test because the kernel is tainted.\n";
 
+	if (settings->per_test_timeout != 0 &&
+	    time_since_subtest > settings->per_test_timeout) {
+		show_kernel_task_state();
+		return "Per-test timeout exceeded. Killing the current test with SIGQUIT.\n";
+	}
+
 	if (settings->inactivity_timeout != 0 &&
 	    time_since_activity > settings->inactivity_timeout) {
 		show_kernel_task_state();
-		return "Timeout. Killing the current test with SIGQUIT.\n";
+		return "Inactivity timeout exceeded. Killing the current test with SIGQUIT.\n";
 	}
 
 	return NULL;
@@ -759,12 +766,12 @@ static int monitor_output(pid_t child,
 	const int interval_length = 1;
 	int wd_timeout;
 	int killed = 0; /* 0 if not killed, signal number otherwise */
-	struct timespec time_beg, time_now, time_last_activity, time_killed;
+	struct timespec time_beg, time_now, time_last_activity, time_last_subtest, time_killed;
 	unsigned long taints = 0;
 	bool aborting = false;
 
 	igt_gettime(&time_beg);
-	time_last_activity = time_killed = time_beg;
+	time_last_activity = time_last_subtest = time_killed = time_beg;
 
 	if (errfd > nfds)
 		nfds = errfd;
@@ -823,6 +830,7 @@ static int monitor_output(pid_t child,
 
 		timeout_reason = need_to_timeout(settings, killed, tainted(&taints),
 						 igt_time_elapsed(&time_last_activity, &time_now),
+						 igt_time_elapsed(&time_last_subtest, &time_now),
 						 igt_time_elapsed(&time_killed, &time_now));
 
 		if (timeout_reason) {
@@ -893,6 +901,8 @@ static int monitor_output(pid_t child,
 					       linelen - strlen(STARTING_SUBTEST));
 					current_subtest[linelen - strlen(STARTING_SUBTEST)] = '\0';
 
+					time_last_subtest = time_now;
+
 					if (settings->log_level >= LOG_LEVEL_VERBOSE) {
 						fwrite(outbuf, 1, linelen, stdout);
 					}
@@ -921,6 +931,24 @@ static int monitor_output(pid_t child,
 						}
 					}
 				}
+				if (linelen > strlen(STARTING_DYNAMIC_SUBTEST) &&
+				    !memcmp(outbuf, STARTING_DYNAMIC_SUBTEST, strlen(STARTING_DYNAMIC_SUBTEST))) {
+					time_last_subtest = time_now;
+
+					if (settings->log_level >= LOG_LEVEL_VERBOSE) {
+						fwrite(outbuf, 1, linelen, stdout);
+					}
+				}
+				if (linelen > strlen(DYNAMIC_SUBTEST_RESULT) &&
+				    !memcmp(outbuf, DYNAMIC_SUBTEST_RESULT, strlen(DYNAMIC_SUBTEST_RESULT))) {
+					char *delim = memchr(outbuf, ':', linelen);
+
+					if (delim != NULL) {
+						if (settings->log_level >= LOG_LEVEL_VERBOSE) {
+							fwrite(outbuf, 1, linelen, stdout);
+						}
+					}
+				}
 
 				memmove(outbuf, newline + 1, outbufsize - linelen);
 				outbufsize -= linelen;
diff --git a/runner/runner_tests.c b/runner/runner_tests.c
index ed30b3f9..2f4e0abb 100644
--- a/runner/runner_tests.c
+++ b/runner/runner_tests.c
@@ -173,6 +173,7 @@ static void assert_settings_equal(struct settings *one, struct settings *two)
 	igt_assert_eq(one->overwrite, two->overwrite);
 	igt_assert_eq(one->multiple_mode, two->multiple_mode);
 	igt_assert_eq(one->inactivity_timeout, two->inactivity_timeout);
+	igt_assert_eq(one->per_test_timeout, two->per_test_timeout);
 	igt_assert_eq(one->use_watchdog, two->use_watchdog);
 	igt_assert_eqstr(one->test_root, two->test_root);
 	igt_assert_eqstr(one->results_path, two->results_path);
@@ -261,6 +262,7 @@ igt_main
 		igt_assert(!settings->overwrite);
 		igt_assert(!settings->multiple_mode);
 		igt_assert_eq(settings->inactivity_timeout, 0);
+		igt_assert_eq(settings->per_test_timeout, 0);
 		igt_assert_eq(settings->overall_timeout, 0);
 		igt_assert(!settings->use_watchdog);
 		igt_assert(strstr(settings->test_root, "test-root-dir") != NULL);
@@ -378,6 +380,7 @@ igt_main
 		igt_assert(!settings->overwrite);
 		igt_assert(!settings->multiple_mode);
 		igt_assert_eq(settings->inactivity_timeout, 0);
+		igt_assert_eq(settings->per_test_timeout, 0);
 		igt_assert_eq(settings->overall_timeout, 0);
 		igt_assert(!settings->use_watchdog);
 		igt_assert(strstr(settings->test_root, testdatadir) != NULL);
@@ -408,6 +411,7 @@ igt_main
 				       "--overwrite",
 				       "--multiple-mode",
 				       "--inactivity-timeout", "27",
+				       "--per-test-timeout", "72",
 				       "--overall-timeout", "360",
 				       "--use-watchdog",
 				       "--piglit-style-dmesg",
@@ -438,6 +442,7 @@ igt_main
 		igt_assert(settings->overwrite);
 		igt_assert(settings->multiple_mode);
 		igt_assert_eq(settings->inactivity_timeout, 27);
+		igt_assert_eq(settings->per_test_timeout, 72);
 		igt_assert_eq(settings->overall_timeout, 360);
 		igt_assert(settings->use_watchdog);
 		igt_assert(strstr(settings->test_root, "test-root-dir") != NULL);
@@ -827,6 +832,7 @@ igt_main
 					       "--overwrite",
 					       "--multiple-mode",
 					       "--inactivity-timeout", "27",
+					       "--per-test-timeout", "72",
 					       "--overall-timeout", "360",
 					       "--use-watchdog",
 					       "--piglit-style-dmesg",
diff --git a/runner/settings.c b/runner/settings.c
index d601cd11..32840307 100644
--- a/runner/settings.c
+++ b/runner/settings.c
@@ -20,6 +20,7 @@ enum {
 	OPT_PIGLIT_DMESG,
 	OPT_DMESG_WARN_LEVEL,
 	OPT_OVERALL_TIMEOUT,
+	OPT_PER_TEST_TIMEOUT,
 	OPT_HELP = 'h',
 	OPT_NAME = 'n',
 	OPT_DRY_RUN = 'd',
@@ -163,6 +164,10 @@ static const char *usage_str =
 	"  --inactivity-timeout <seconds>\n"
 	"                        Kill the running test after <seconds> of inactivity in\n"
 	"                        the test's stdout, stderr, or dmesg\n"
+	"  --per-test-timeout <seconds>\n"
+	"                        Kill the running test after <seconds>. This timeout is per\n"
+	"                        subtest, or dynamic subtest. In other words, every subtest,\n"
+	"                        even when running in multiple-mode, must finish in <seconds>.\n"
 	"  --overall-timeout <seconds>\n"
 	"                        Don't execute more tests after <seconds> has elapsed\n"
 	"  --use-watchdog        Use hardware watchdog for lethal enforcement of the\n"
@@ -325,6 +330,7 @@ bool parse_options(int argc, char **argv,
 		{"ignore-missing", no_argument, NULL, OPT_IGNORE_MISSING},
 		{"multiple-mode", no_argument, NULL, OPT_MULTIPLE},
 		{"inactivity-timeout", required_argument, NULL, OPT_TIMEOUT},
+		{"per-test-timeout", required_argument, NULL, OPT_PER_TEST_TIMEOUT},
 		{"overall-timeout", required_argument, NULL, OPT_OVERALL_TIMEOUT},
 		{"use-watchdog", no_argument, NULL, OPT_WATCHDOG},
 		{"piglit-style-dmesg", no_argument, NULL, OPT_PIGLIT_DMESG},
@@ -388,6 +394,9 @@ bool parse_options(int argc, char **argv,
 		case OPT_TIMEOUT:
 			settings->inactivity_timeout = atoi(optarg);
 			break;
+		case OPT_PER_TEST_TIMEOUT:
+			settings->per_test_timeout = atoi(optarg);
+			break;
 		case OPT_OVERALL_TIMEOUT:
 			settings->overall_timeout = atoi(optarg);
 			break;
@@ -617,6 +626,7 @@ bool serialize_settings(struct settings *settings)
 	SERIALIZE_LINE(f, settings, overwrite, "%d");
 	SERIALIZE_LINE(f, settings, multiple_mode, "%d");
 	SERIALIZE_LINE(f, settings, inactivity_timeout, "%d");
+	SERIALIZE_LINE(f, settings, per_test_timeout, "%d");
 	SERIALIZE_LINE(f, settings, overall_timeout, "%d");
 	SERIALIZE_LINE(f, settings, use_watchdog, "%d");
 	SERIALIZE_LINE(f, settings, piglit_style_dmesg, "%d");
@@ -662,6 +672,7 @@ bool read_settings_from_file(struct settings *settings, FILE *f)
 		PARSE_LINE(settings, name, val, overwrite, numval);
 		PARSE_LINE(settings, name, val, multiple_mode, numval);
 		PARSE_LINE(settings, name, val, inactivity_timeout, numval);
+		PARSE_LINE(settings, name, val, per_test_timeout, numval);
 		PARSE_LINE(settings, name, val, overall_timeout, numval);
 		PARSE_LINE(settings, name, val, use_watchdog, numval);
 		PARSE_LINE(settings, name, val, piglit_style_dmesg, numval);
diff --git a/runner/settings.h b/runner/settings.h
index 13409f04..5203ec6e 100644
--- a/runner/settings.h
+++ b/runner/settings.h
@@ -38,6 +38,7 @@ struct settings {
 	bool overwrite;
 	bool multiple_mode;
 	int inactivity_timeout;
+	int per_test_timeout;
 	int overall_timeout;
 	bool use_watchdog;
 	char *test_root;
-- 
2.20.1


* [igt-dev] ✗ Fi.CI.BAT: failure for series starting with [i-g-t,1/2] runner: Refactor timeouting
  2020-02-17 14:50 [igt-dev] [PATCH i-g-t 1/2] runner: Refactor timeouting Petri Latvala
  2020-02-17 14:50 ` [igt-dev] [PATCH i-g-t 2/2] runner: Introduce per-test timeouts Petri Latvala
@ 2020-02-17 16:47 ` Patchwork
  2020-02-18 11:29   ` Petri Latvala
  2020-02-18  9:08 ` [igt-dev] [PATCH i-g-t 1/2] " Petri Latvala
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 11+ messages in thread
From: Patchwork @ 2020-02-17 16:47 UTC (permalink / raw)
  To: Petri Latvala; +Cc: igt-dev

== Series Details ==

Series: series starting with [i-g-t,1/2] runner: Refactor timeouting
URL   : https://patchwork.freedesktop.org/series/73539/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_7955 -> IGTPW_4166
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with IGTPW_4166 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in IGTPW_4166, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/index.html

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in IGTPW_4166:

### IGT changes ###

#### Possible regressions ####

  * igt@gem_exec_suspend@basic-s0:
    - fi-kbl-7500u:       NOTRUN -> [INCOMPLETE][1]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/fi-kbl-7500u/igt@gem_exec_suspend@basic-s0.html

  
#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@gem_exec_suspend@basic-s0:
    - {fi-kbl-7560u}:     NOTRUN -> [INCOMPLETE][2]
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/fi-kbl-7560u/igt@gem_exec_suspend@basic-s0.html

  
Known issues
------------

  Here are the changes found in IGTPW_4166 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_close_race@basic-threads:
    - fi-hsw-peppy:       [PASS][3] -> [INCOMPLETE][4] ([i915#694] / [i915#816])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/fi-hsw-peppy/igt@gem_close_race@basic-threads.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/fi-hsw-peppy/igt@gem_close_race@basic-threads.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [i915#694]: https://gitlab.freedesktop.org/drm/intel/issues/694
  [i915#816]: https://gitlab.freedesktop.org/drm/intel/issues/816
  [i915#937]: https://gitlab.freedesktop.org/drm/intel/issues/937


Participating hosts (51 -> 46)
------------------------------

  Additional (2): fi-kbl-7560u fi-kbl-7500u 
  Missing    (7): fi-hsw-4770r fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * CI: CI-20190529 -> None
  * IGT: IGT_5445 -> IGTPW_4166

  CI-20190529: 20190529
  CI_DRM_7955: 429740c5a92b11e3b649edb3ed4fd66237bd323e @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_4166: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/index.html
  IGT_5445: 21e523814d692978d6d04ba85eadd67fcbd88b7e @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/index.html

* Re: [igt-dev] [PATCH i-g-t 1/2] runner: Refactor timeouting
  2020-02-17 14:50 [igt-dev] [PATCH i-g-t 1/2] runner: Refactor timeouting Petri Latvala
  2020-02-17 14:50 ` [igt-dev] [PATCH i-g-t 2/2] runner: Introduce per-test timeouts Petri Latvala
  2020-02-17 16:47 ` [igt-dev] ✗ Fi.CI.BAT: failure for series starting with [i-g-t,1/2] runner: Refactor timeouting Patchwork
@ 2020-02-18  9:08 ` Petri Latvala
  2020-02-18 10:21 ` Chris Wilson
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Petri Latvala @ 2020-02-18  9:08 UTC (permalink / raw)
  To: igt-dev

On Mon, Feb 17, 2020 at 04:50:41PM +0200, Petri Latvala wrote:
> Instead of aiming for inactivity_timeout and splitting that into
> suitable intervals for watchdog pinging, replace the whole logic with
> one-second select() timeouts and checking if we're reaching a timeout
> condition based on current time and the time passed since a particular
> event, be it the last activity or the time of signaling the child
> processes.
> 
> With the refactoring, we gain a couple of new features for free:
> 
> - use-watchdog now makes sense even without
> inactivity-timeout. Previously use-watchdog was silently ignored if
> inactivity-timeout was not set. Now, watchdogs will be used always if
> configured so, effectively ensuring the device gets rebooted if
> userspace dies without other timeout tracking.
> 
> - Killing tests early on kernel taint now happens even
> earlier. Previously on an inactive system we possibly waited for some
> tens of seconds before checking kernel taints.
> 
> Signed-off-by: Petri Latvala <petri.latvala@intel.com>
> ---
>  runner/executor.c | 224 +++++++++++++++++++++++-----------------------
>  1 file changed, 113 insertions(+), 111 deletions(-)
> 
> diff --git a/runner/executor.c b/runner/executor.c
> index 3ea5d167..33610c9e 100644
> --- a/runner/executor.c
> +++ b/runner/executor.c
> @@ -93,7 +93,7 @@ static void init_watchdogs(struct settings *settings)
>  
>  	memset(&watchdogs, 0, sizeof(watchdogs));
>  
> -	if (!settings->use_watchdog || settings->inactivity_timeout <= 0)
> +	if (!settings->use_watchdog)
>  		return;
>  
>  	if (settings->log_level >= LOG_LEVEL_VERBOSE) {
> @@ -672,6 +672,69 @@ static void show_kernel_task_state(void)
>  	sysrq('t');
>  }
>  
> +static const char *need_to_timeout(struct settings *settings,
> +				   int killed,
> +				   unsigned long taints,
> +				   double time_since_activity,
> +				   double time_since_kill)
> +{
> +	if (killed) {
> +		/*
> +		 * Timeout after being killed is a hardcoded amount
> +		 * depending on which signal we already used. The
> +		 * exception is SIGKILL which just immediately bails
> +		 * out if the kernel is tainted, because there's
> +		 * little to no hope of the process dying gracefully
> +		 * or at all.
> +		 *
> +		 * Note that if killed == SIGKILL, the caller needs
> +		 * special handling anyway and should ignore the
> +		 * actual string returned.
> +		 */
> +		const double kill_timeout = killed == SIGKILL ? 20.0 : 120.0;


Executing this code in my head a few times I realized that before this
patch, while we did have the exact same values for the timeout, we
waited forever for a killed test to die as long as it (or the kernel)
produced output within that time. Now we don't. I consider that a
bugfix.
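
To illustrate the point: in the sketch below the escalation deadline
depends only on the time since the signal was sent, so a child that
keeps producing output cannot postpone it. The function names are
hypothetical, and the two durations simply mirror the values in the
hunk quoted above.

```c
#include <signal.h>
#include <stdbool.h>

/* Durations as in the quoted hunk: 20 s after SIGKILL, 120 s otherwise. */
static double kill_timeout(int killed)
{
	return killed == SIGKILL ? 20.0 : 120.0;
}

/*
 * Hypothetical check: has a signalled child overstayed its deadline?
 * Output activity is deliberately not an input.
 */
static bool kill_deadline_passed(int killed, double time_since_kill)
{
	if (!killed)
		return false;

	return time_since_kill > kill_timeout(killed);
}

/*
 * Simulate a signalled child that prints once per second and never
 * exits. Returns the second at which the deadline fires, or -1 if it
 * does not fire within 'limit' seconds.
 */
static int seconds_until_escalation(int killed, int limit)
{
	for (int t = 1; t <= limit; t++) {
		/* Output arrives every second, but it is not consulted. */
		if (kill_deadline_passed(killed, (double)t))
			return t;
	}

	return -1;
}
```

A scheme that restarted its countdown on every line of output would
never reach the deadline in this simulation, which is the behaviour the
refactoring removes.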


-- 
Petri Latvala

* Re: [igt-dev] [PATCH i-g-t 1/2] runner: Refactor timeouting
  2020-02-17 14:50 [igt-dev] [PATCH i-g-t 1/2] runner: Refactor timeouting Petri Latvala
                   ` (2 preceding siblings ...)
  2020-02-18  9:08 ` [igt-dev] [PATCH i-g-t 1/2] " Petri Latvala
@ 2020-02-18 10:21 ` Chris Wilson
  2020-02-18 15:38 ` [igt-dev] ✓ Fi.CI.BAT: success for series starting with [i-g-t,1/2] " Patchwork
  2020-02-19  7:09 ` [igt-dev] ✗ Fi.CI.IGT: failure " Patchwork
  5 siblings, 0 replies; 11+ messages in thread
From: Chris Wilson @ 2020-02-18 10:21 UTC (permalink / raw)
  To: Petri Latvala, igt-dev; +Cc: Petri Latvala

Quoting Petri Latvala (2020-02-17 14:50:41)
> Instead of aiming for inactivity_timeout and splitting that into
> suitable intervals for watchdog pinging, replace the whole logic with
> one-second select() timeouts and checking if we're reaching a timeout
> condition based on current time and the time passed since a particular
> event, be it the last activity or the time of signaling the child
> processes.
> 
> With the refactoring, we gain a couple of new features for free:
> 
> - use-watchdog now makes sense even without
> inactivity-timeout. Previously use-watchdog was silently ignored if
> inactivity-timeout was not set. Now, watchdogs will be used always if
> configured so, effectively ensuring the device gets rebooted if
> userspace dies without other timeout tracking.
> 
> - Killing tests early on kernel taint now happens even
> earlier. Previously on an inactive system we possibly waited for some
> tens of seconds before checking kernel taints.
> 
> Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

* Re: [igt-dev] [PATCH i-g-t 2/2] runner: Introduce per-test timeouts
  2020-02-17 14:50 ` [igt-dev] [PATCH i-g-t 2/2] runner: Introduce per-test timeouts Petri Latvala
@ 2020-02-18 10:24   ` Chris Wilson
  2020-02-18 11:26     ` Petri Latvala
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2020-02-18 10:24 UTC (permalink / raw)
  To: Petri Latvala, igt-dev; +Cc: Petri Latvala

Quoting Petri Latvala (2020-02-17 14:50:42)
> A new config option, --per-test-timeout, sets a time a single test
> cannot exceed without getting itself killed. The time resets when
> starting a subtest or a dynamic subtest, so an execution with
> --per-test-timeout=20 can indeed go over 20 seconds as long as it
> launches a dynamic subtest within that time.
> 
> As a bonus, verbose log level from runner now also prints dynamic
> subtest begin/result.
> 
> Signed-off-by: Petri Latvala <petri.latvala@intel.com>

I wish igt had a pipe back to the runner so there was a clear protocol
for this rather than trying to re-interpret the user output. (With
that, you could pass commands from individual tests back to the runner,
such as changing the dmesg filters, timeouts, etc.)

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

* Re: [igt-dev] [PATCH i-g-t 2/2] runner: Introduce per-test timeouts
  2020-02-18 10:24   ` Chris Wilson
@ 2020-02-18 11:26     ` Petri Latvala
  0 siblings, 0 replies; 11+ messages in thread
From: Petri Latvala @ 2020-02-18 11:26 UTC (permalink / raw)
  To: Chris Wilson; +Cc: igt-dev

On Tue, Feb 18, 2020 at 10:24:12AM +0000, Chris Wilson wrote:
> Quoting Petri Latvala (2020-02-17 14:50:42)
> > A new config option, --per-test-timeout, sets a time a single test
> > cannot exceed without getting itself killed. The time resets when
> > starting a subtest or a dynamic subtest, so an execution with
> > --per-test-timeout=20 can indeed go over 20 seconds as long as it
> > launches a dynamic subtest within that time.
> > 
> > As a bonus, verbose log level from runner now also prints dynamic
> > subtest begin/result.
> > 
> > Signed-off-by: Petri Latvala <petri.latvala@intel.com>
> 
> I wish igt had a pipe back to the runner so there was a clear protocol
> for this rather than trying to re-interpret the user output. (What's
> that you could pass commands from individual tests back to the runner,
> such as changing the dmesg filters, timeouts etc, etc)

Stop spying on my local branches.


-- 
Petri Latvala

* Re: [igt-dev] ✗ Fi.CI.BAT: failure for series starting with [i-g-t,1/2] runner: Refactor timeouting
  2020-02-17 16:47 ` [igt-dev] ✗ Fi.CI.BAT: failure for series starting with [i-g-t,1/2] runner: Refactor timeouting Patchwork
@ 2020-02-18 11:29   ` Petri Latvala
  0 siblings, 0 replies; 11+ messages in thread
From: Petri Latvala @ 2020-02-18 11:29 UTC (permalink / raw)
  To: igt-dev, Lakshminarayana Vudum

On Mon, Feb 17, 2020 at 04:47:08PM +0000, Patchwork wrote:
> == Series Details ==
> 
> Series: series starting with [i-g-t,1/2] runner: Refactor timeouting
> URL   : https://patchwork.freedesktop.org/series/73539/
> State : failure
> 
> == Summary ==
> 
> CI Bug Log - changes from CI_DRM_7955 -> IGTPW_4166
> ====================================================
> 
> Summary
> -------
> 
>   **FAILURE**
> 
>   Serious unknown changes coming with IGTPW_4166 absolutely need to be
>   verified manually.
>   
>   If you think the reported changes have nothing to do with the changes
>   introduced in IGTPW_4166, please notify your bug team to allow them
>   to document this new failure mode, which will reduce false positives in CI.
> 
>   External URL: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/index.html
> 
> Possible new issues
> -------------------
> 
>   Here are the unknown changes that may have been introduced in IGTPW_4166:
> 
> ### IGT changes ###
> 
> #### Possible regressions ####
> 
>   * igt@gem_exec_suspend@basic-s0:
>     - fi-kbl-7500u:       NOTRUN -> [INCOMPLETE][1]
>    [1]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/fi-kbl-7500u/igt@gem_exec_suspend@basic-s0.html
> 


Lakshmi, false positive above. This is the rc2 explosion.



-- 
Petri Latvala



>   
> #### Suppressed ####
> 
>   The following results come from untrusted machines, tests, or statuses.
>   They do not affect the overall result.
> 
>   * igt@gem_exec_suspend@basic-s0:
>     - {fi-kbl-7560u}:     NOTRUN -> [INCOMPLETE][2]
>    [2]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/fi-kbl-7560u/igt@gem_exec_suspend@basic-s0.html
> 
>   
> Known issues
> ------------
> 
>   Here are the changes found in IGTPW_4166 that come from known issues:
> 
> ### IGT changes ###
> 
> #### Issues hit ####
> 
>   * igt@gem_close_race@basic-threads:
>     - fi-hsw-peppy:       [PASS][3] -> [INCOMPLETE][4] ([i915#694] / [i915#816])
>    [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/fi-hsw-peppy/igt@gem_close_race@basic-threads.html
>    [4]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/fi-hsw-peppy/igt@gem_close_race@basic-threads.html
> 
>   
>   {name}: This element is suppressed. This means it is ignored when computing
>           the status of the difference (SUCCESS, WARNING, or FAILURE).
> 
>   [i915#694]: https://gitlab.freedesktop.org/drm/intel/issues/694
>   [i915#816]: https://gitlab.freedesktop.org/drm/intel/issues/816
>   [i915#937]: https://gitlab.freedesktop.org/drm/intel/issues/937
> 
> 
> Participating hosts (51 -> 46)
> ------------------------------
> 
>   Additional (2): fi-kbl-7560u fi-kbl-7500u 
>   Missing    (7): fi-hsw-4770r fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-byt-clapper fi-bdw-samus 
> 
> 
> Build changes
> -------------
> 
>   * CI: CI-20190529 -> None
>   * IGT: IGT_5445 -> IGTPW_4166
> 
>   CI-20190529: 20190529
>   CI_DRM_7955: 429740c5a92b11e3b649edb3ed4fd66237bd323e @ git://anongit.freedesktop.org/gfx-ci/linux
>   IGTPW_4166: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/index.html
>   IGT_5445: 21e523814d692978d6d04ba85eadd67fcbd88b7e @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
> 
> == Logs ==
> 
> For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/index.html

* [igt-dev] ✓ Fi.CI.BAT: success for series starting with [i-g-t,1/2] runner: Refactor timeouting
  2020-02-17 14:50 [igt-dev] [PATCH i-g-t 1/2] runner: Refactor timeouting Petri Latvala
                   ` (3 preceding siblings ...)
  2020-02-18 10:21 ` Chris Wilson
@ 2020-02-18 15:38 ` Patchwork
  2020-02-19  7:09 ` [igt-dev] ✗ Fi.CI.IGT: failure " Patchwork
  5 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2020-02-18 15:38 UTC (permalink / raw)
  To: Petri Latvala; +Cc: igt-dev

== Series Details ==

Series: series starting with [i-g-t,1/2] runner: Refactor timeouting
URL   : https://patchwork.freedesktop.org/series/73539/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_7955 -> IGTPW_4166
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/index.html

Known issues
------------

  Here are the changes found in IGTPW_4166 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_close_race@basic-threads:
    - fi-hsw-peppy:       [PASS][1] -> [INCOMPLETE][2] ([i915#694] / [i915#816])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/fi-hsw-peppy/igt@gem_close_race@basic-threads.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/fi-hsw-peppy/igt@gem_close_race@basic-threads.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [i915#694]: https://gitlab.freedesktop.org/drm/intel/issues/694
  [i915#816]: https://gitlab.freedesktop.org/drm/intel/issues/816
  [i915#937]: https://gitlab.freedesktop.org/drm/intel/issues/937


Participating hosts (51 -> 46)
------------------------------

  Additional (2): fi-kbl-7560u fi-kbl-7500u 
  Missing    (7): fi-hsw-4770r fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * CI: CI-20190529 -> None
  * IGT: IGT_5445 -> IGTPW_4166

  CI-20190529: 20190529
  CI_DRM_7955: 429740c5a92b11e3b649edb3ed4fd66237bd323e @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_4166: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/index.html
  IGT_5445: 21e523814d692978d6d04ba85eadd67fcbd88b7e @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/index.html

* [igt-dev] ✗ Fi.CI.IGT: failure for series starting with [i-g-t,1/2] runner: Refactor timeouting
  2020-02-17 14:50 [igt-dev] [PATCH i-g-t 1/2] runner: Refactor timeouting Petri Latvala
                   ` (4 preceding siblings ...)
  2020-02-18 15:38 ` [igt-dev] ✓ Fi.CI.BAT: success for series starting with [i-g-t,1/2] " Patchwork
@ 2020-02-19  7:09 ` Patchwork
  2020-02-19 10:39   ` Petri Latvala
  5 siblings, 1 reply; 11+ messages in thread
From: Patchwork @ 2020-02-19  7:09 UTC (permalink / raw)
  To: Petri Latvala; +Cc: igt-dev

== Series Details ==

Series: series starting with [i-g-t,1/2] runner: Refactor timeouting
URL   : https://patchwork.freedesktop.org/series/73539/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_7955_full -> IGTPW_4166_full
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with IGTPW_4166_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in IGTPW_4166_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/index.html

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in IGTPW_4166_full:

### IGT changes ###

#### Possible regressions ####

  * igt@gem_exec_suspend@basic-s3:
    - shard-iclb:         NOTRUN -> [INCOMPLETE][1] +4 similar issues
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-iclb1/igt@gem_exec_suspend@basic-s3.html

  
#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * {igt@kms_hdr@bpc-switch-suspend}:
    - shard-iclb:         NOTRUN -> [INCOMPLETE][2]
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-iclb5/igt@kms_hdr@bpc-switch-suspend.html
    - shard-tglb:         NOTRUN -> [INCOMPLETE][3]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-tglb7/igt@kms_hdr@bpc-switch-suspend.html

  
Known issues
------------

  Here are the changes found in IGTPW_4166_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ctx_isolation@rcs0-s3:
    - shard-apl:          [PASS][4] -> [DMESG-WARN][5] ([i915#180]) +2 similar issues
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-apl4/igt@gem_ctx_isolation@rcs0-s3.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-apl3/igt@gem_ctx_isolation@rcs0-s3.html

  * igt@gem_ctx_isolation@vcs1-s3:
    - shard-kbl:          [PASS][6] -> [INCOMPLETE][7] ([fdo#103665])
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-kbl1/igt@gem_ctx_isolation@vcs1-s3.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-kbl3/igt@gem_ctx_isolation@vcs1-s3.html

  * igt@gem_exec_parallel@rcs0-fds:
    - shard-hsw:          [PASS][8] -> [INCOMPLETE][9] ([i915#61] / [i915#694])
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-hsw6/igt@gem_exec_parallel@rcs0-fds.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-hsw4/igt@gem_exec_parallel@rcs0-fds.html

  * igt@gem_exec_parallel@vcs1:
    - shard-iclb:         [PASS][10] -> [SKIP][11] ([fdo#112080]) +2 similar issues
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-iclb1/igt@gem_exec_parallel@vcs1.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-iclb7/igt@gem_exec_parallel@vcs1.html

  * igt@gem_exec_schedule@independent-bsd:
    - shard-iclb:         [PASS][12] -> [SKIP][13] ([fdo#112146]) +1 similar issue
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-iclb3/igt@gem_exec_schedule@independent-bsd.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-iclb2/igt@gem_exec_schedule@independent-bsd.html

  * igt@gem_exec_schedule@pi-shared-iova-bsd2:
    - shard-iclb:         [PASS][14] -> [SKIP][15] ([fdo#109276]) +7 similar issues
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-iclb2/igt@gem_exec_schedule@pi-shared-iova-bsd2.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-iclb3/igt@gem_exec_schedule@pi-shared-iova-bsd2.html

  * igt@gem_linear_blits@normal:
    - shard-hsw:          [PASS][16] -> [FAIL][17] ([i915#694]) +2 similar issues
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-hsw8/igt@gem_linear_blits@normal.html
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-hsw5/igt@gem_linear_blits@normal.html

  * igt@gem_mmap_gtt@basic-write-gtt:
    - shard-snb:          [PASS][18] -> [DMESG-WARN][19] ([i915#478])
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-snb2/igt@gem_mmap_gtt@basic-write-gtt.html
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-snb7/igt@gem_mmap_gtt@basic-write-gtt.html

  * igt@gen9_exec_parse@allowed-all:
    - shard-kbl:          [PASS][20] -> [DMESG-WARN][21] ([i915#716])
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-kbl1/igt@gen9_exec_parse@allowed-all.html
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-kbl1/igt@gen9_exec_parse@allowed-all.html

  * igt@kms_cursor_legacy@2x-long-cursor-vs-flip-legacy:
    - shard-hsw:          [PASS][22] -> [FAIL][23] ([i915#96])
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-hsw2/igt@kms_cursor_legacy@2x-long-cursor-vs-flip-legacy.html
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-hsw8/igt@kms_cursor_legacy@2x-long-cursor-vs-flip-legacy.html

  * igt@kms_plane_lowres@pipe-a-tiling-x:
    - shard-glk:          [PASS][24] -> [FAIL][25] ([i915#899])
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-glk8/igt@kms_plane_lowres@pipe-a-tiling-x.html
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-glk7/igt@kms_plane_lowres@pipe-a-tiling-x.html

  * igt@kms_psr@no_drrs:
    - shard-iclb:         [PASS][26] -> [FAIL][27] ([i915#173])
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-iclb7/igt@kms_psr@no_drrs.html
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-iclb1/igt@kms_psr@no_drrs.html

  * igt@kms_psr@psr2_primary_render:
    - shard-tglb:         [PASS][28] -> [SKIP][29] ([i915#668])
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-tglb5/igt@kms_psr@psr2_primary_render.html
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-tglb2/igt@kms_psr@psr2_primary_render.html

  * igt@kms_vblank@pipe-c-ts-continuation-suspend:
    - shard-kbl:          [PASS][30] -> [DMESG-WARN][31] ([i915#180]) +1 similar issue
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-kbl4/igt@kms_vblank@pipe-c-ts-continuation-suspend.html
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-kbl7/igt@kms_vblank@pipe-c-ts-continuation-suspend.html

  
#### Possible fixes ####

  * igt@gem_ctx_isolation@vcs1-reset:
    - shard-iclb:         [SKIP][32] ([fdo#112080]) -> [PASS][33] +1 similar issue
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-iclb8/igt@gem_ctx_isolation@vcs1-reset.html
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-iclb1/igt@gem_ctx_isolation@vcs1-reset.html

  * igt@gem_ctx_shared@exec-single-timeline-bsd:
    - shard-iclb:         [SKIP][34] ([fdo#110841]) -> [PASS][35]
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-iclb1/igt@gem_ctx_shared@exec-single-timeline-bsd.html
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-iclb7/igt@gem_ctx_shared@exec-single-timeline-bsd.html

  * igt@gem_exec_async@concurrent-writes-bsd:
    - shard-iclb:         [SKIP][36] ([fdo#112146]) -> [PASS][37] +5 similar issues
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-iclb2/igt@gem_exec_async@concurrent-writes-bsd.html
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-iclb3/igt@gem_exec_async@concurrent-writes-bsd.html

  * igt@gem_exec_balancer@hang:
    - shard-tglb:         [TIMEOUT][38] ([fdo#112271]) -> [PASS][39]
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-tglb6/igt@gem_exec_balancer@hang.html
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-tglb5/igt@gem_exec_balancer@hang.html

  * igt@gem_exec_schedule@pi-shared-iova-bsd1:
    - shard-iclb:         [SKIP][40] ([fdo#109276]) -> [PASS][41] +3 similar issues
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-iclb8/igt@gem_exec_schedule@pi-shared-iova-bsd1.html
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-iclb1/igt@gem_exec_schedule@pi-shared-iova-bsd1.html

  * igt@gem_ppgtt@flink-and-close-vma-leak:
    - shard-glk:          [FAIL][42] ([i915#644]) -> [PASS][43]
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-glk1/igt@gem_ppgtt@flink-and-close-vma-leak.html
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-glk2/igt@gem_ppgtt@flink-and-close-vma-leak.html

  * igt@gem_userptr_blits@sync-unmap-after-close:
    - shard-snb:          [DMESG-WARN][44] ([fdo#111870] / [i915#478]) -> [PASS][45]
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-snb2/igt@gem_userptr_blits@sync-unmap-after-close.html
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-snb4/igt@gem_userptr_blits@sync-unmap-after-close.html

  * igt@kms_cursor_crc@pipe-a-cursor-suspend:
    - shard-kbl:          [DMESG-WARN][46] ([i915#180]) -> [PASS][47] +7 similar issues
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-kbl4/igt@kms_cursor_crc@pipe-a-cursor-suspend.html
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-kbl6/igt@kms_cursor_crc@pipe-a-cursor-suspend.html

  * igt@kms_flip@2x-flip-vs-blocking-wf-vblank:
    - shard-hsw:          [INCOMPLETE][48] ([i915#61]) -> [PASS][49]
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-hsw5/igt@kms_flip@2x-flip-vs-blocking-wf-vblank.html
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-hsw2/igt@kms_flip@2x-flip-vs-blocking-wf-vblank.html

  * igt@kms_flip@flip-vs-expired-vblank-interruptible:
    - shard-apl:          [FAIL][50] ([i915#79]) -> [PASS][51]
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-apl7/igt@kms_flip@flip-vs-expired-vblank-interruptible.html
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-apl7/igt@kms_flip@flip-vs-expired-vblank-interruptible.html

  * igt@kms_flip@flip-vs-suspend-interruptible:
    - shard-apl:          [DMESG-WARN][52] ([i915#180]) -> [PASS][53] +3 similar issues
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-apl1/igt@kms_flip@flip-vs-suspend-interruptible.html
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-apl4/igt@kms_flip@flip-vs-suspend-interruptible.html

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-indfb-draw-blt:
    - shard-glk:          [FAIL][54] ([i915#49]) -> [PASS][55] +1 similar issue
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-glk6/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-indfb-draw-blt.html
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-glk8/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-indfb-draw-blt.html

  * igt@kms_plane_lowres@pipe-a-tiling-y:
    - shard-glk:          [FAIL][56] ([i915#899]) -> [PASS][57]
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-glk6/igt@kms_plane_lowres@pipe-a-tiling-y.html
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-glk5/igt@kms_plane_lowres@pipe-a-tiling-y.html

  
#### Warnings ####

  * igt@gem_tiled_blits@interruptible:
    - shard-hsw:          [FAIL][58] ([i915#694]) -> [FAIL][59] ([i915#818])
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-hsw4/igt@gem_tiled_blits@interruptible.html
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-hsw2/igt@gem_tiled_blits@interruptible.html

  * igt@gem_tiled_blits@normal:
    - shard-hsw:          [FAIL][60] ([i915#818]) -> [INCOMPLETE][61] ([i915#61])
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-hsw1/igt@gem_tiled_blits@normal.html
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-hsw1/igt@gem_tiled_blits@normal.html

  * igt@gem_userptr_blits@dmabuf-unsync:
    - shard-snb:          [DMESG-WARN][62] ([fdo#110789] / [fdo#111870] / [i915#478]) -> [DMESG-WARN][63] ([fdo#111870] / [i915#478])
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-snb7/igt@gem_userptr_blits@dmabuf-unsync.html
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-snb6/igt@gem_userptr_blits@dmabuf-unsync.html

  * igt@runner@aborted:
    - shard-kbl:          [FAIL][64] ([i915#974]) -> ([FAIL][65], [FAIL][66]) ([i915#716] / [i915#974])
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7955/shard-kbl1/igt@runner@aborted.html
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-kbl1/igt@runner@aborted.html
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-kbl1/igt@runner@aborted.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#103665]: https://bugs.freedesktop.org/show_bug.cgi?id=103665
  [fdo#109276]: https://bugs.freedesktop.org/show_bug.cgi?id=109276
  [fdo#110789]: https://bugs.freedesktop.org/show_bug.cgi?id=110789
  [fdo#110841]: https://bugs.freedesktop.org/show_bug.cgi?id=110841
  [fdo#111870]: https://bugs.freedesktop.org/show_bug.cgi?id=111870
  [fdo#112080]: https://bugs.freedesktop.org/show_bug.cgi?id=112080
  [fdo#112146]: https://bugs.freedesktop.org/show_bug.cgi?id=112146
  [fdo#112271]: https://bugs.freedesktop.org/show_bug.cgi?id=112271
  [i915#173]: https://gitlab.freedesktop.org/drm/intel/issues/173
  [i915#180]: https://gitlab.freedesktop.org/drm/intel/issues/180
  [i915#478]: https://gitlab.freedesktop.org/drm/intel/issues/478
  [i915#49]: https://gitlab.freedesktop.org/drm/intel/issues/49
  [i915#61]: https://gitlab.freedesktop.org/drm/intel/issues/61
  [i915#644]: https://gitlab.freedesktop.org/drm/intel/issues/644
  [i915#668]: https://gitlab.freedesktop.org/drm/intel/issues/668
  [i915#694]: https://gitlab.freedesktop.org/drm/intel/issues/694
  [i915#716]: https://gitlab.freedesktop.org/drm/intel/issues/716
  [i915#79]: https://gitlab.freedesktop.org/drm/intel/issues/79
  [i915#818]: https://gitlab.freedesktop.org/drm/intel/issues/818
  [i915#899]: https://gitlab.freedesktop.org/drm/intel/issues/899
  [i915#96]: https://gitlab.freedesktop.org/drm/intel/issues/96
  [i915#974]: https://gitlab.freedesktop.org/drm/intel/issues/974


Participating hosts (10 -> 8)
------------------------------

  Missing    (2): pig-skl-6260u pig-glk-j5005 


Build changes
-------------

  * CI: CI-20190529 -> None
  * IGT: IGT_5445 -> IGTPW_4166
  * Piglit: piglit_4509 -> None

  CI-20190529: 20190529
  CI_DRM_7955: 429740c5a92b11e3b649edb3ed4fd66237bd323e @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_4166: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/index.html
  IGT_5445: 21e523814d692978d6d04ba85eadd67fcbd88b7e @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/index.html

* Re: [igt-dev] ✗ Fi.CI.IGT: failure for series starting with [i-g-t,1/2] runner: Refactor timeouting
  2020-02-19  7:09 ` [igt-dev] ✗ Fi.CI.IGT: failure " Patchwork
@ 2020-02-19 10:39   ` Petri Latvala
  0 siblings, 0 replies; 11+ messages in thread
From: Petri Latvala @ 2020-02-19 10:39 UTC (permalink / raw)
  To: igt-dev

On Wed, Feb 19, 2020 at 07:09:53AM +0000, Patchwork wrote:
> == Series Details ==
> 
> Series: series starting with [i-g-t,1/2] runner: Refactor timeouting
> URL   : https://patchwork.freedesktop.org/series/73539/
> State : failure
> 
> == Summary ==
> 
> CI Bug Log - changes from CI_DRM_7955_full -> IGTPW_4166_full
> ====================================================
> 
> Summary
> -------
> 
>   **FAILURE**
> 
>   Serious unknown changes coming with IGTPW_4166_full absolutely need to be
>   verified manually.
>   
>   If you think the reported changes have nothing to do with the changes
>   introduced in IGTPW_4166_full, please notify your bug team to allow them
>   to document this new failure mode, which will reduce false positives in CI.
> 
>   External URL: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/index.html
> 
> Possible new issues
> -------------------
> 
>   Here are the unknown changes that may have been introduced in IGTPW_4166_full:
> 
> ### IGT changes ###
> 
> #### Possible regressions ####
> 
>   * igt@gem_exec_suspend@basic-s3:
>     - shard-iclb:         NOTRUN -> [INCOMPLETE][1] +4 similar issues
>    [1]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4166/shard-iclb1/igt@gem_exec_suspend@basic-s3.html


rc2 fallout, so merged.


-- 
Petri Latvala

end of thread, other threads:[~2020-02-19 10:39 UTC | newest]

Thread overview: 11+ messages
2020-02-17 14:50 [igt-dev] [PATCH i-g-t 1/2] runner: Refactor timeouting Petri Latvala
2020-02-17 14:50 ` [igt-dev] [PATCH i-g-t 2/2] runner: Introduce per-test timeouts Petri Latvala
2020-02-18 10:24   ` Chris Wilson
2020-02-18 11:26     ` Petri Latvala
2020-02-17 16:47 ` [igt-dev] ✗ Fi.CI.BAT: failure for series starting with [i-g-t,1/2] runner: Refactor timeouting Patchwork
2020-02-18 11:29   ` Petri Latvala
2020-02-18  9:08 ` [igt-dev] [PATCH i-g-t 1/2] " Petri Latvala
2020-02-18 10:21 ` Chris Wilson
2020-02-18 15:38 ` [igt-dev] ✓ Fi.CI.BAT: success for series starting with [i-g-t,1/2] " Patchwork
2020-02-19  7:09 ` [igt-dev] ✗ Fi.CI.IGT: failure " Patchwork
2020-02-19 10:39   ` Petri Latvala
