All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Preserve task state in reentrant calls to ___wait_event
@ 2015-11-06 20:44 Chris Bainbridge
  2015-11-06 23:06 ` Rafael J. Wysocki
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Chris Bainbridge @ 2015-11-06 20:44 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-acpi, lv.zheng, mingo, peterz, rjw, oleg, aystarik

In the ACPI SBS initialisation, a reentrant call to wait_event_timeout()
causes an intermittent boot stall of several minutes usually following
the "Switching to clocksource tsc" message. This stall is caused by:

 1. drivers/acpi/sbshc.c wait_transaction_complete() calls
    wait_event_timeout():

 	if (wait_event_timeout(hc->wait, smb_check_done(hc),
 			       msecs_to_jiffies(timeout)))

 2. ___wait_event sets task state to uninterruptible

 3. ___wait_event calls the "condition" smb_check_done()

 4. smb_check_done (sbshc.c) calls through to ec_read() in
    drivers/acpi/ec.c

 5. ec_guard() is reached which calls wait_event_timeout()

 	if (wait_event_timeout(ec->wait,
 			       ec_transaction_completed(ec),
 			       guard))

    ie. wait_event_timeout() is being called again inside evaluation of
    the previous wait_event_timeout() condition

 5. The EC IRQ handler calls wake_up() and wakes up the sleeping task in
    ec_guard()

 6. The task is now in state running even though the wait "condition" is
    still being evaluated

 7. The "condition" check returns false so ___wait_event calls
    schedule_timeout()

 8. Since the task state is running, the scheduler immediately schedules
    it again

 9. This loop repeats for around 250 seconds event though the original
    wait_event_timeout was only 1000ms.

    This happens because each the call to schedule_timeout() usually
    returns immediately, taking less than 1ms, so the jiffies timeout
    counter is not decremented. The task is now stuck in a running
    state, and so is highly likely to get rescheduled immediately, which
    takes less than a jiffy.

The root problem is that wait_event_timeout() does not preserve the task
state when called by tasks that are not running. We fix this by
preserving and restoring the task state in ___wait_event().

Signed-off-by: Chris Bainbridge <chris.bainbridge@gmail.com>
---
I am assuming here that wait_event_timeout() is supposed to support reentrant
calls. If not, perhaps it should BUG_ON when called with a non-running task
state, and the SBS HC / ACPI EC code needs to be fixed to stop doing this. If
reentrant calls are supposed to work, then this patch will preserve the task
state (there may be a more appropriate way to support reentrant calls than this
exact patch, suggestions/alternatives are welcome, but this does work).
---
 include/linux/wait.h | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/include/linux/wait.h b/include/linux/wait.h
index 1e1bf9f..a847cf8 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -209,11 +209,12 @@ wait_queue_head_t *bit_waitqueue(void *, int);
  * otherwise.
  */
 
-#define ___wait_event(wq, condition, state, exclusive, ret, cmd)	\
+#define ___wait_event(wq, condition, nstate, exclusive, ret, cmd)	\
 ({									\
 	__label__ __out;						\
 	wait_queue_t __wait;						\
 	long __ret = ret;	/* explicit shadow */			\
+	long ostate = current->state;					\
 									\
 	INIT_LIST_HEAD(&__wait.task_list);				\
 	if (exclusive)							\
@@ -222,16 +223,16 @@ wait_queue_head_t *bit_waitqueue(void *, int);
 		__wait.flags = 0;					\
 									\
 	for (;;) {							\
-		long __int = prepare_to_wait_event(&wq, &__wait, state);\
+		long __int = prepare_to_wait_event(&wq, &__wait, nstate);\
 									\
 		if (condition)						\
 			break;						\
 									\
-		if (___wait_is_interruptible(state) && __int) {		\
+		if (___wait_is_interruptible(nstate) && __int) {	\
 			__ret = __int;					\
 			if (exclusive) {				\
 				abort_exclusive_wait(&wq, &__wait,	\
-						     state, NULL);	\
+						     nstate, NULL);	\
 				goto __out;				\
 			}						\
 			break;						\
@@ -240,6 +241,7 @@ wait_queue_head_t *bit_waitqueue(void *, int);
 		cmd;							\
 	}								\
 	finish_wait(&wq, &__wait);					\
+	set_current_state(ostate);					\
 __out:	__ret;								\
 })
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] Preserve task state in reentrant calls to ___wait_event
  2015-11-06 20:44 [PATCH] Preserve task state in reentrant calls to ___wait_event Chris Bainbridge
@ 2015-11-06 23:06 ` Rafael J. Wysocki
  2015-11-12 18:05   ` [PATCH] ACPI / SMBus: Fix boot stalls / high CPU caused by reentrant code Chris Bainbridge
  2015-11-12 19:24   ` [PATCH] Revert "ACPI / SBS: Add 5 us delay to fix SBS hangs on MacBook" Chris Bainbridge
  2015-11-07  8:32 ` [PATCH] Preserve task state in reentrant calls to ___wait_event Chris Bainbridge
  2015-11-07  9:19 ` Peter Zijlstra
  2 siblings, 2 replies; 6+ messages in thread
From: Rafael J. Wysocki @ 2015-11-06 23:06 UTC (permalink / raw)
  To: Chris Bainbridge, peterz
  Cc: linux-kernel, linux-acpi, lv.zheng, mingo, oleg, aystarik

On Friday, November 06, 2015 08:44:08 PM Chris Bainbridge wrote:
> In the ACPI SBS initialisation, a reentrant call to wait_event_timeout()
> causes an intermittent boot stall of several minutes usually following
> the "Switching to clocksource tsc" message. This stall is caused by:
> 
>  1. drivers/acpi/sbshc.c wait_transaction_complete() calls
>     wait_event_timeout():
> 
>  	if (wait_event_timeout(hc->wait, smb_check_done(hc),
>  			       msecs_to_jiffies(timeout)))
> 
>  2. ___wait_event sets task state to uninterruptible
> 
>  3. ___wait_event calls the "condition" smb_check_done()
> 
>  4. smb_check_done (sbshc.c) calls through to ec_read() in
>     drivers/acpi/ec.c
> 
>  5. ec_guard() is reached which calls wait_event_timeout()
> 
>  	if (wait_event_timeout(ec->wait,
>  			       ec_transaction_completed(ec),
>  			       guard))
> 
>     ie. wait_event_timeout() is being called again inside evaluation of
>     the previous wait_event_timeout() condition

Hmm.

This doesn't look quite right to me.

>  5. The EC IRQ handler calls wake_up() and wakes up the sleeping task in
>     ec_guard()
> 
>  6. The task is now in state running even though the wait "condition" is
>     still being evaluated
>
>  7. The "condition" check returns false so ___wait_event calls
>     schedule_timeout()
> 
>  8. Since the task state is running, the scheduler immediately schedules
>     it again
> 
>  9. This loop repeats for around 250 seconds event though the original
>     wait_event_timeout was only 1000ms.
> 
>     This happens because each the call to schedule_timeout() usually
>     returns immediately, taking less than 1ms, so the jiffies timeout
>     counter is not decremented. The task is now stuck in a running
>     state, and so is highly likely to get rescheduled immediately, which
>     takes less than a jiffy.
> 
> The root problem is that wait_event_timeout() does not preserve the task
> state when called by tasks that are not running. We fix this by
> preserving and restoring the task state in ___wait_event().
> 
> Signed-off-by: Chris Bainbridge <chris.bainbridge@gmail.com>
> ---
> I am assuming here that wait_event_timeout() is supposed to support reentrant
> calls. If not, perhaps it should BUG_ON when called with a non-running task
> state, and the SBS HC / ACPI EC code needs to be fixed to stop doing this.

OK, so I'm not sure about that too.  Peter?

I'd fix the SBS HC / ACPI EC code to stop doing this regardless.

> If reentrant calls are supposed to work, then this patch will preserve the task
> state (there may be a more appropriate way to support reentrant calls than this
> exact patch, suggestions/alternatives are welcome, but this does work).

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Preserve task state in reentrant calls to ___wait_event
  2015-11-06 20:44 [PATCH] Preserve task state in reentrant calls to ___wait_event Chris Bainbridge
  2015-11-06 23:06 ` Rafael J. Wysocki
@ 2015-11-07  8:32 ` Chris Bainbridge
  2015-11-07  9:19 ` Peter Zijlstra
  2 siblings, 0 replies; 6+ messages in thread
From: Chris Bainbridge @ 2015-11-07  8:32 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-acpi, lv.zheng, mingo, peterz, rjw, oleg, aystarik

On Fri, Nov 06, 2015 at 08:44:08PM +0000, Chris Bainbridge wrote:
> -#define ___wait_event(wq, condition, state, exclusive, ret, cmd)	\
> +#define ___wait_event(wq, condition, nstate, exclusive, ret, cmd)	\
>  ({									\
>  	__label__ __out;						\
>  	wait_queue_t __wait;						\
>  	long __ret = ret;	/* explicit shadow */			\
> +	long ostate = current->state;					\

XXX

>  									\
>  	INIT_LIST_HEAD(&__wait.task_list);				\
>  	if (exclusive)							\
> @@ -222,16 +223,16 @@ wait_queue_head_t *bit_waitqueue(void *, int);
>  		__wait.flags = 0;					\
>  									\
>  	for (;;) {							\
> -		long __int = prepare_to_wait_event(&wq, &__wait, state);\
> +		long __int = prepare_to_wait_event(&wq, &__wait, nstate);\
>  									\
>  		if (condition)						\
>  			break;						\
>  									\
> -		if (___wait_is_interruptible(state) && __int) {		\
> +		if (___wait_is_interruptible(nstate) && __int) {	\
>  			__ret = __int;					\
>  			if (exclusive) {				\
>  				abort_exclusive_wait(&wq, &__wait,	\
> -						     state, NULL);	\
> +						     nstate, NULL);	\
>  				goto __out;				\
>  			}						\
>  			break;						\
> @@ -240,6 +241,7 @@ wait_queue_head_t *bit_waitqueue(void *, int);
>  		cmd;							\
>  	}								\
>  	finish_wait(&wq, &__wait);					\
> +	set_current_state(ostate);					\

I'm not convinced that this particular code is (or can be) race free in
the general reentrant case. The outer call to ___wait_event will miss 
any wake_up received in the inner call between XXX above (store of
current->state) and this point of restoring the previous state. So if
the inner condition evaluation or some interrupt handler happens to
trigger a wake_up meant for the outer call then it will be lost.

>  __out:	__ret;								\
>  })

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Preserve task state in reentrant calls to ___wait_event
  2015-11-06 20:44 [PATCH] Preserve task state in reentrant calls to ___wait_event Chris Bainbridge
  2015-11-06 23:06 ` Rafael J. Wysocki
  2015-11-07  8:32 ` [PATCH] Preserve task state in reentrant calls to ___wait_event Chris Bainbridge
@ 2015-11-07  9:19 ` Peter Zijlstra
  2 siblings, 0 replies; 6+ messages in thread
From: Peter Zijlstra @ 2015-11-07  9:19 UTC (permalink / raw)
  To: Chris Bainbridge
  Cc: linux-kernel, linux-acpi, lv.zheng, mingo, rjw, oleg, aystarik

On Fri, Nov 06, 2015 at 08:44:08PM +0000, Chris Bainbridge wrote:
> I am assuming here that wait_event_timeout() is supposed to support reentrant
> calls.

Not really. It is sort of allowed, provided the inner one will rarely
block. And therefore the outer one will mostly work.

> If not, perhaps it should BUG_ON when called with a non-running task
> state,

There is a warning in __might_sleep() that catches some of this.

> and the SBS HC / ACPI EC code needs to be fixed to stop doing this.

Yes.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] ACPI / SMBus: Fix boot stalls / high CPU caused by reentrant code
  2015-11-06 23:06 ` Rafael J. Wysocki
@ 2015-11-12 18:05   ` Chris Bainbridge
  2015-11-12 19:24   ` [PATCH] Revert "ACPI / SBS: Add 5 us delay to fix SBS hangs on MacBook" Chris Bainbridge
  1 sibling, 0 replies; 6+ messages in thread
From: Chris Bainbridge @ 2015-11-12 18:05 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: peterz, linux-kernel, linux-acpi, lv.zheng, mingo, oleg,
	aystarik, aaron.lu

In the SBS initialisation, a reentrant call to wait_event_timeout()
causes an intermittent boot stall of several minutes usually following
the "Switching to clocksource tsc" message. Another symptom of this bug
is high CPU usage from programs (Firefox, upowerd) querying the battery
state. This is caused by:

 1. drivers/acpi/sbshc.c wait_transaction_complete() calls
    wait_event_timeout():

 	if (wait_event_timeout(hc->wait, smb_check_done(hc),
 			       msecs_to_jiffies(timeout)))

 2. ___wait_event sets task state to uninterruptible

 3. ___wait_event calls the "condition" smb_check_done()

 4. smb_check_done (sbshc.c) calls through to ec_read() in
    drivers/acpi/ec.c

 5. ec_guard() is reached which calls wait_event_timeout()

 	if (wait_event_timeout(ec->wait,
 			       ec_transaction_completed(ec),
 			       guard))

    ie. wait_event_timeout() is being called again inside evaluation of
    the previous wait_event_timeout() condition

 5. The EC IRQ handler calls wake_up() and wakes up the sleeping task in
    ec_guard()

 6. The task is now in state running even though the wait "condition" is
    still being evaluated

 7. The "condition" check returns false so ___wait_event calls
    schedule_timeout()

 8. Since the task state is running, the scheduler immediately schedules
    it again

 9. This loop usually repeats for around 250 seconds even though the
    original wait_event_timeout was only 1000ms.

    The timeout is incorrect because each call to schedule_timeout()
    usually returns immediately, taking less than 1ms, so the jiffies
    timeout counter is not decremented. The task is now stuck in a
    running state, and so is highly likely to be immediately
    rescheduled, which takes less than a jiffy. The loop will never exit
    if all schedule_timeout() calls take less than a jiffy.

Fix this by replacing SMBus reads in the wait_event_timeout condition
with checks of a boolean value that is updated by the EC query handler.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=107191
Link: https://lkml.org/lkml/2015/11/6/776
Signed-off-by: Chris Bainbridge <chris.bainbridge@gmail.com>
---
 drivers/acpi/sbshc.c | 26 +++++++-------------------
 1 file changed, 7 insertions(+), 19 deletions(-)

diff --git a/drivers/acpi/sbshc.c b/drivers/acpi/sbshc.c
index bf034f8..e290051 100644
--- a/drivers/acpi/sbshc.c
+++ b/drivers/acpi/sbshc.c
@@ -30,6 +30,7 @@ struct acpi_smb_hc {
 	u8 query_bit;
 	smbus_alarm_callback callback;
 	void *context;
+	bool done;
 };
 
 static int acpi_smbus_hc_add(struct acpi_device *device);
@@ -100,27 +101,11 @@ static inline int smb_hc_write(struct acpi_smb_hc *hc, u8 address, u8 data)
 	return ec_write(hc->offset + address, data);
 }
 
-static inline int smb_check_done(struct acpi_smb_hc *hc)
-{
-	union acpi_smb_status status = {.raw = 0};
-	smb_hc_read(hc, ACPI_SMB_STATUS, &status.raw);
-	return status.fields.done && (status.fields.status == SMBUS_OK);
-}
-
 static int wait_transaction_complete(struct acpi_smb_hc *hc, int timeout)
 {
-	if (wait_event_timeout(hc->wait, smb_check_done(hc),
-			       msecs_to_jiffies(timeout)))
+	if (wait_event_timeout(hc->wait, hc->done, msecs_to_jiffies(timeout)))
 		return 0;
-	/*
-	 * After the timeout happens, OS will try to check the status of SMbus.
-	 * If the status is what OS expected, it will be regarded as the bogus
-	 * timeout.
-	 */
-	if (smb_check_done(hc))
-		return 0;
-	else
-		return -ETIME;
+	return -ETIME;
 }
 
 static int acpi_smbus_transaction(struct acpi_smb_hc *hc, u8 protocol,
@@ -135,6 +120,7 @@ static int acpi_smbus_transaction(struct acpi_smb_hc *hc, u8 protocol,
 	}
 
 	mutex_lock(&hc->lock);
+	hc->done = false;
 	if (macbook)
 		udelay(5);
 	if (smb_hc_read(hc, ACPI_SMB_PROTOCOL, &temp))
@@ -235,8 +221,10 @@ static int smbus_alarm(void *context)
 	if (smb_hc_read(hc, ACPI_SMB_STATUS, &status.raw))
 		return 0;
 	/* Check if it is only a completion notify */
-	if (status.fields.done)
+	if (status.fields.done && status.fields.status == SMBUS_OK) {
+		hc->done = true;
 		wake_up(&hc->wait);
+	}
 	if (!status.fields.alarm)
 		return 0;
 	mutex_lock(&hc->lock);
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH] Revert "ACPI / SBS: Add 5 us delay to fix SBS hangs on MacBook"
  2015-11-06 23:06 ` Rafael J. Wysocki
  2015-11-12 18:05   ` [PATCH] ACPI / SMBus: Fix boot stalls / high CPU caused by reentrant code Chris Bainbridge
@ 2015-11-12 19:24   ` Chris Bainbridge
  1 sibling, 0 replies; 6+ messages in thread
From: Chris Bainbridge @ 2015-11-12 19:24 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: peterz, linux-kernel, linux-acpi, lv.zheng, aaron.lu, aystarik

This reverts commit 3349fb64b2927407017d970dd5c4daf3c5ad69f8.

After commit 8848d20cf09c ("ACPI / SMBus: Fix boot stalls / high CPU
caused by reentrant code") the delay is no longer necessary.

Signed-off-by: Chris Bainbridge <chris.bainbridge@gmail.com>
---
 drivers/acpi/sbshc.c | 22 ----------------------
 1 file changed, 22 deletions(-)

diff --git a/drivers/acpi/sbshc.c b/drivers/acpi/sbshc.c
index e2900518cf7e..2fa8304171e0 100644
--- a/drivers/acpi/sbshc.c
+++ b/drivers/acpi/sbshc.c
@@ -14,7 +14,6 @@
 #include <linux/delay.h>
 #include <linux/module.h>
 #include <linux/interrupt.h>
-#include <linux/dmi.h>
 #include "sbshc.h"
 
 #define PREFIX "ACPI: "
@@ -89,8 +88,6 @@ enum acpi_smb_offset {
 	ACPI_SMB_ALARM_DATA = 0x26,	/* 2 bytes alarm data */
 };
 
-static bool macbook;
-
 static inline int smb_hc_read(struct acpi_smb_hc *hc, u8 address, u8 *data)
 {
 	return ec_read(hc->offset + address, data);
@@ -121,8 +118,6 @@ static int acpi_smbus_transaction(struct acpi_smb_hc *hc, u8 protocol,
 
 	mutex_lock(&hc->lock);
 	hc->done = false;
-	if (macbook)
-		udelay(5);
 	if (smb_hc_read(hc, ACPI_SMB_PROTOCOL, &temp))
 		goto end;
 	if (temp) {
@@ -250,29 +245,12 @@ extern int acpi_ec_add_query_handler(struct acpi_ec *ec, u8 query_bit,
 			      acpi_handle handle, acpi_ec_query_func func,
 			      void *data);
 
-static int macbook_dmi_match(const struct dmi_system_id *d)
-{
-	pr_debug("Detected MacBook, enabling workaround\n");
-	macbook = true;
-	return 0;
-}
-
-static struct dmi_system_id acpi_smbus_dmi_table[] = {
-	{ macbook_dmi_match, "Apple MacBook", {
-	  DMI_MATCH(DMI_BOARD_VENDOR, "Apple"),
-	  DMI_MATCH(DMI_PRODUCT_NAME, "MacBook") },
-	},
-	{ },
-};
-
 static int acpi_smbus_hc_add(struct acpi_device *device)
 {
 	int status;
 	unsigned long long val;
 	struct acpi_smb_hc *hc;
 
-	dmi_check_system(acpi_smbus_dmi_table);
-
 	if (!device)
 		return -EINVAL;
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2015-11-12 19:24 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-06 20:44 [PATCH] Preserve task state in reentrant calls to ___wait_event Chris Bainbridge
2015-11-06 23:06 ` Rafael J. Wysocki
2015-11-12 18:05   ` [PATCH] ACPI / SMBus: Fix boot stalls / high CPU caused by reentrant code Chris Bainbridge
2015-11-12 19:24   ` [PATCH] Revert "ACPI / SBS: Add 5 us delay to fix SBS hangs on MacBook" Chris Bainbridge
2015-11-07  8:32 ` [PATCH] Preserve task state in reentrant calls to ___wait_event Chris Bainbridge
2015-11-07  9:19 ` Peter Zijlstra

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.