linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Chris Bainbridge <chris.bainbridge@gmail.com>
To: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: peterz@infradead.org, linux-kernel@vger.kernel.org,
	linux-acpi@vger.kernel.org, lv.zheng@intel.com, mingo@redhat.com,
	oleg@redhat.com, aystarik@gmail.com, aaron.lu@intel.com
Subject: [PATCH] ACPI / SMBus: Fix boot stalls / high CPU caused by reentrant code
Date: Thu, 12 Nov 2015 18:05:37 +0000	[thread overview]
Message-ID: <20151112180537.GA5237@localhost> (raw)
In-Reply-To: <53589361.gQMCOa1UoV@vostro.rjw.lan>

In the SBS initialisation, a reentrant call to wait_event_timeout()
causes an intermittent boot stall of several minutes usually following
the "Switching to clocksource tsc" message. Another symptom of this bug
is high CPU usage from programs (Firefox, upowerd) querying the battery
state. This is caused by:

 1. drivers/acpi/sbshc.c wait_transaction_complete() calls
    wait_event_timeout():

 	if (wait_event_timeout(hc->wait, smb_check_done(hc),
 			       msecs_to_jiffies(timeout)))

 2. ___wait_event sets task state to uninterruptible

 3. ___wait_event calls the "condition" smb_check_done()

 4. smb_check_done (sbshc.c) calls through to ec_read() in
    drivers/acpi/ec.c

 5. ec_guard() is reached which calls wait_event_timeout()

 	if (wait_event_timeout(ec->wait,
 			       ec_transaction_completed(ec),
 			       guard))

    ie. wait_event_timeout() is being called again inside evaluation of
    the previous wait_event_timeout() condition

 5. The EC IRQ handler calls wake_up() and wakes up the sleeping task in
    ec_guard()

 6. The task is now in state running even though the wait "condition" is
    still being evaluated

 7. The "condition" check returns false so ___wait_event calls
    schedule_timeout()

 8. Since the task state is running, the scheduler immediately schedules
    it again

 9. This loop usually repeats for around 250 seconds even though the
    original wait_event_timeout was only 1000ms.

    The timeout is incorrect because each call to schedule_timeout()
    usually returns immediately, taking less than 1ms, so the jiffies
    timeout counter is not decremented. The task is now stuck in a
    running state, and so is highly likely to be immediately
    rescheduled, which takes less than a jiffy. The loop will never exit
    if all schedule_timeout() calls take less than a jiffy.

Fix this by replacing SMBus reads in the wait_event_timeout condition
with checks of a boolean value that is updated by the EC query handler.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=107191
Link: https://lkml.org/lkml/2015/11/6/776
Signed-off-by: Chris Bainbridge <chris.bainbridge@gmail.com>
---
 drivers/acpi/sbshc.c | 26 +++++++-------------------
 1 file changed, 7 insertions(+), 19 deletions(-)

diff --git a/drivers/acpi/sbshc.c b/drivers/acpi/sbshc.c
index bf034f8..e290051 100644
--- a/drivers/acpi/sbshc.c
+++ b/drivers/acpi/sbshc.c
@@ -30,6 +30,7 @@ struct acpi_smb_hc {
 	u8 query_bit;
 	smbus_alarm_callback callback;
 	void *context;
+	bool done;
 };
 
 static int acpi_smbus_hc_add(struct acpi_device *device);
@@ -100,27 +101,11 @@ static inline int smb_hc_write(struct acpi_smb_hc *hc, u8 address, u8 data)
 	return ec_write(hc->offset + address, data);
 }
 
-static inline int smb_check_done(struct acpi_smb_hc *hc)
-{
-	union acpi_smb_status status = {.raw = 0};
-	smb_hc_read(hc, ACPI_SMB_STATUS, &status.raw);
-	return status.fields.done && (status.fields.status == SMBUS_OK);
-}
-
 static int wait_transaction_complete(struct acpi_smb_hc *hc, int timeout)
 {
-	if (wait_event_timeout(hc->wait, smb_check_done(hc),
-			       msecs_to_jiffies(timeout)))
+	if (wait_event_timeout(hc->wait, hc->done, msecs_to_jiffies(timeout)))
 		return 0;
-	/*
-	 * After the timeout happens, OS will try to check the status of SMbus.
-	 * If the status is what OS expected, it will be regarded as the bogus
-	 * timeout.
-	 */
-	if (smb_check_done(hc))
-		return 0;
-	else
-		return -ETIME;
+	return -ETIME;
 }
 
 static int acpi_smbus_transaction(struct acpi_smb_hc *hc, u8 protocol,
@@ -135,6 +120,7 @@ static int acpi_smbus_transaction(struct acpi_smb_hc *hc, u8 protocol,
 	}
 
 	mutex_lock(&hc->lock);
+	hc->done = false;
 	if (macbook)
 		udelay(5);
 	if (smb_hc_read(hc, ACPI_SMB_PROTOCOL, &temp))
@@ -235,8 +221,10 @@ static int smbus_alarm(void *context)
 	if (smb_hc_read(hc, ACPI_SMB_STATUS, &status.raw))
 		return 0;
 	/* Check if it is only a completion notify */
-	if (status.fields.done)
+	if (status.fields.done && status.fields.status == SMBUS_OK) {
+		hc->done = true;
 		wake_up(&hc->wait);
+	}
 	if (!status.fields.alarm)
 		return 0;
 	mutex_lock(&hc->lock);
-- 
2.1.4


  reply	other threads:[~2015-11-12 18:05 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-06 20:44 [PATCH] Preserve task state in reentrant calls to ___wait_event Chris Bainbridge
2015-11-06 23:06 ` Rafael J. Wysocki
2015-11-12 18:05   ` Chris Bainbridge [this message]
2015-11-12 19:24   ` [PATCH] Revert "ACPI / SBS: Add 5 us delay to fix SBS hangs on MacBook" Chris Bainbridge
2015-11-07  8:32 ` [PATCH] Preserve task state in reentrant calls to ___wait_event Chris Bainbridge
2015-11-07  9:19 ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151112180537.GA5237@localhost \
    --to=chris.bainbridge@gmail.com \
    --cc=aaron.lu@intel.com \
    --cc=aystarik@gmail.com \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lv.zheng@intel.com \
    --cc=mingo@redhat.com \
    --cc=oleg@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rjw@rjwysocki.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).