* [PATCH] ACPI, APEI, EINJ: Relax platform response timeout to 1 second.
@ 2021-10-15  3:38 Shuai Xue
From: Shuai Xue @ 2021-10-15  3:38 UTC
  To: linux-kernel, linux-acpi, bp, tony.luck, james.morse, lenb, rjw
  Cc: xueshuai, zhangliguang, zhuo.song

When injecting an error into the platform, the OSPM executes an
EXECUTE_OPERATION action to instruct the platform to begin the injection
operation. The OSPM then busy-waits, repeatedly executing the
CHECK_BUSY_STATUS action until the platform indicates that the operation
is complete. Currently, the platform must respond within 1 millisecond,
which is too strict for some platforms.
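
The polling described above can be sketched in userspace C. This is only
an illustration under stated assumptions, not the driver code: struct
fake_platform, fake_check_busy(), sim_timedout() and
sim_wait_for_platform() are hypothetical names, the delay is simulated
by advancing a counter instead of calling udelay(), and the -1 return
stands in for the I/O error reported to the user.

```c
#include <stdbool.h>
#include <stdint.h>

#define SPIN_UNIT_US 100u	/* one polling step, mirroring SPIN_UNIT */

/* Hypothetical stand-in for the platform: busy until ready_after_us. */
struct fake_platform {
	uint64_t elapsed_us;
	uint64_t ready_after_us;
};

/* Hypothetical stand-in for the CHECK_BUSY_STATUS action. */
static bool fake_check_busy(const struct fake_platform *p)
{
	return p->elapsed_us < p->ready_after_us;
}

/*
 * Modeled on einj_timedout(): spend one SPIN_UNIT of the remaining
 * budget, or return 1 once less than one unit is left.
 */
static int sim_timedout(uint64_t *budget_us, struct fake_platform *p)
{
	if (*budget_us < SPIN_UNIT_US)
		return 1;
	*budget_us -= SPIN_UNIT_US;
	p->elapsed_us += SPIN_UNIT_US;	/* simulated delay, no real sleep */
	return 0;
}

/* Returns 0 if the platform finished in time, -1 if the budget ran out. */
static int sim_wait_for_platform(uint64_t budget_us, uint64_t ready_after_us)
{
	struct fake_platform p = { 0, ready_after_us };

	while (fake_check_busy(&p)) {
		if (sim_timedout(&budget_us, &p))
			return -1;
	}
	return 0;
}
```

With a 1000 us budget, a platform that needs 10 ms makes
sim_wait_for_platform() return -1, while a 1000000 us budget covers
both a 10 ms and a 500 ms response.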

For example, on an Arm platform, when injecting a Processor Correctable
error, the OSPM will warn:
    Firmware does not respond in time.

The following message is also printed on the console:
    echo: write error: Input/output error

On the Arm platform, we observe a wait of about 10 ms for DDR error
injection and about 500 ms for PCIe error injection.

This patch relaxes the response timeout to 1 second and allows the user
to override it through the timeout_default module parameter, whose value
is in microseconds.
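
As a quick check of the arithmetic, the sketch below compares the number
of polling steps before and after the change. The constants are
redefined locally with the same values as the kernel macros so the
snippet compiles in userspace, and max_polls() is a hypothetical helper,
not part of the driver.

```c
#include <stdint.h>

/* Local copies of the kernel constants, same values. */
#define NSEC_PER_MSEC	1000000ULL
#define USEC_PER_SEC	1000000ULL

/* Hypothetical helper: polling steps that fit in a timeout budget. */
static uint64_t max_polls(uint64_t budget, uint64_t spin_unit)
{
	return budget / spin_unit;
}
```

Both max_polls(1 * NSEC_PER_MSEC, 100) and max_polls(1 * USEC_PER_SEC,
100) evaluate to 10000. The number of polls is unchanged; each step
simply waits 100 us instead of 100 ns, so the total budget grows from
1 ms to 1 s.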

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
---
 drivers/acpi/apei/einj.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/acpi/apei/einj.c b/drivers/acpi/apei/einj.c
index 133156759551..fa2386ee37db 100644
--- a/drivers/acpi/apei/einj.c
+++ b/drivers/acpi/apei/einj.c
@@ -14,6 +14,7 @@
 
 #include <linux/kernel.h>
 #include <linux/module.h>
+#include <linux/moduleparam.h>
 #include <linux/init.h>
 #include <linux/io.h>
 #include <linux/debugfs.h>
@@ -28,9 +29,9 @@
 #undef pr_fmt
 #define pr_fmt(fmt) "EINJ: " fmt
 
-#define SPIN_UNIT		100			/* 100ns */
-/* Firmware should respond within 1 milliseconds */
-#define FIRMWARE_TIMEOUT	(1 * NSEC_PER_MSEC)
+#define SPIN_UNIT		100			/* 100us */
+/* Firmware should respond within 1 seconds */
+#define FIRMWARE_TIMEOUT	(1 * USEC_PER_SEC)
 #define ACPI5_VENDOR_BIT	BIT(31)
 #define MEM_ERROR_MASK		(ACPI_EINJ_MEMORY_CORRECTABLE | \
 				ACPI_EINJ_MEMORY_UNCORRECTABLE | \
@@ -40,6 +41,8 @@
  * ACPI version 5 provides a SET_ERROR_TYPE_WITH_ADDRESS action.
  */
 static int acpi5;
+static int timeout_default = FIRMWARE_TIMEOUT;
+module_param(timeout_default, int, 0644);
 
 struct set_error_type_with_address {
 	u32	type;
@@ -176,7 +179,7 @@ static int einj_timedout(u64 *t)
 		return 1;
 	}
 	*t -= SPIN_UNIT;
-	ndelay(SPIN_UNIT);
+	udelay(SPIN_UNIT);
 	touch_nmi_watchdog();
 	return 0;
 }
@@ -403,7 +406,7 @@ static int __einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2,
 			       u64 param3, u64 param4)
 {
 	struct apei_exec_context ctx;
-	u64 val, trigger_paddr, timeout = FIRMWARE_TIMEOUT;
+	u64 val, trigger_paddr, timeout = timeout_default;
 	int rc;
 
 	einj_exec_ctx_init(&ctx);
-- 
2.20.1.12.g72788fdb



end of thread, other threads:[~2021-10-27 18:24 UTC | newest]

Thread overview: 14+ messages
2021-10-15  3:38 [PATCH] ACPI, APEI, EINJ: Relax platform response timeout to 1 second Shuai Xue
2021-10-15 15:37 ` Luck, Tony
2021-10-17  4:06   ` Shuai Xue
2021-10-18 15:40     ` Luck, Tony
2021-10-19 13:33       ` Shuai Xue
2021-10-22 13:44 ` [PATCH v2] " Shuai Xue
2021-10-22 23:54   ` Luck, Tony
2021-10-24  9:10     ` Shuai Xue
2021-10-25 12:49       ` Shuai Xue
2021-10-25 15:59         ` Luck, Tony
2021-10-26  7:28 ` [PATCH v3] " Shuai Xue
2021-10-26 17:05   ` Luck, Tony
2021-10-27  2:18     ` Shuai Xue
2021-10-27 18:24     ` Rafael J. Wysocki
