linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH] selftests/eeh: Bump EEH wait time to 60s
@ 2020-01-22  3:11 Oliver O'Halloran
  2020-01-29  5:17 ` Michael Ellerman
  0 siblings, 1 reply; 2+ messages in thread
From: Oliver O'Halloran @ 2020-01-22  3:11 UTC
  To: linuxppc-dev; +Cc: Oliver O'Halloran, Douglas Miller, Steve Best

Some newer cards supported by aacraid can take up to 40s to recover
after an EEH event. This causes spurious failures in the basic EEH
self-test since the current maximum timeout is only 30s.

Fix the immediate issue by bumping the default timeout to 60s,
and allow the wait time to be overridden via an environment variable
(EEH_MAX_WAIT).

Reported-by: Steve Best <sbest@redhat.com>
Suggested-by: Douglas Miller <dougmill@us.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 tools/testing/selftests/powerpc/eeh/eeh-functions.sh | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/powerpc/eeh/eeh-functions.sh b/tools/testing/selftests/powerpc/eeh/eeh-functions.sh
index 26112ab5cdf4..f52ed92b53e7 100755
--- a/tools/testing/selftests/powerpc/eeh/eeh-functions.sh
+++ b/tools/testing/selftests/powerpc/eeh/eeh-functions.sh
@@ -53,9 +53,13 @@ eeh_one_dev() {
 	# is a no-op.
 	echo $dev >/sys/kernel/debug/powerpc/eeh_dev_check
 
-	# Enforce a 30s timeout for recovery. Even the IPR, which is infamously
-	# slow to reset, should recover within 30s.
-	max_wait=30
+	# Default to a 60s timeout when waiting for a device to recover. This
+	# is an arbitrary default which can be overridden by setting the
+	# EEH_MAX_WAIT environmental variable when required.
+
+	# The current record holder for longest recovery time is:
+	#  "Adaptec Series 8 12G SAS/PCIe 3" at 39 seconds
+	max_wait=${EEH_MAX_WAIT:=60}
 
 	for i in `seq 0 ${max_wait}` ; do
 		if pe_ok $dev ; then
-- 
2.21.1
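
A minimal sketch of how the `${EEH_MAX_WAIT:=60}` expansion in the patch
above resolves the timeout. The `resolve_max_wait` function is a
hypothetical helper introduced only for illustration; it is not part of
eeh-functions.sh, which performs the assignment inline in eeh_one_dev().

```shell
#!/bin/sh
# Hypothetical helper (not in eeh-functions.sh): resolves the recovery
# timeout the same way the patched line does.
resolve_max_wait() {
	# ${VAR:=default} expands to VAR if it is set and non-empty.
	# Otherwise it assigns the default to VAR and expands to that.
	max_wait=${EEH_MAX_WAIT:=60}
	echo "$max_wait"
}

# Each case runs in a subshell so the caller's environment is untouched.
( unset EEH_MAX_WAIT; resolve_max_wait )   # prints 60
( EEH_MAX_WAIT=;      resolve_max_wait )   # prints 60 (empty counts as unset)
( EEH_MAX_WAIT=120;   resolve_max_wait )   # prints 120
```

In practice the override would be supplied on the command line of whichever
test script sources eeh-functions.sh, for example
`EEH_MAX_WAIT=120 ./eeh-basic.sh` (assuming that is the script being run).
Note that `:=` also writes the default back into EEH_MAX_WAIT itself, so
later expansions in the same shell see 60 as well.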


* Re: [PATCH] selftests/eeh: Bump EEH wait time to 60s
  2020-01-22  3:11 [PATCH] selftests/eeh: Bump EEH wait time to 60s Oliver O'Halloran
@ 2020-01-29  5:17 ` Michael Ellerman
  0 siblings, 0 replies; 2+ messages in thread
From: Michael Ellerman @ 2020-01-29  5:17 UTC
  To: Oliver O'Halloran, linuxppc-dev
  Cc: Oliver O'Halloran, Douglas Miller, Steve Best

On Wed, 2020-01-22 at 03:11:25 UTC, Oliver O'Halloran wrote:
> Some newer cards supported by aacraid can take up to 40s to recover
> after an EEH event. This causes spurious failures in the basic EEH
> self-test since the current maximum timeout is only 30s.
> 
> Fix the immediate issue by bumping the timeout to a default of 60s,
> and allow the wait time to be specified via an environmental variable
> (EEH_MAX_WAIT).
> 
> Reported-by: Steve Best <sbest@redhat.com>
> Suggested-by: Douglas Miller <dougmill@us.ibm.com>
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>

Applied to the powerpc next branch, thanks.

https://git.kernel.org/powerpc/c/414f50434aa2463202a5b35e844f4125dd1a7101

cheers
