linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] xhci: use iopoll for xhci_handshake
@ 2019-02-27 22:19 Daniel Kurtz
  2019-02-28  7:09 ` Greg Kroah-Hartman
  0 siblings, 1 reply; 4+ messages in thread
From: Daniel Kurtz @ 2019-02-27 22:19 UTC (permalink / raw)
  Cc: rrangel, Daniel Kurtz, Mathias Nyman, Greg Kroah-Hartman,
	open list:USB XHCI DRIVER, open list

In cases such as xhci_abort_cmd_ring(), xhci_handshake() is called with
a spin lock held (and local interrupts disabled) with a huge 5 second
timeout.  This can translates to 5 million calls to udelay(1).  By its
very nature, udelay() is not meant to be precise, it only guarantees to
delay a minimum of 1 microsecond. Therefore the actual delay of
xhci_handshake() can be significantly longer.  If the average udelay(1)
is greater than 2.2 us, the total time in xhci_handshake() - with
interrupts disabled can be > 11 seconds triggering the kernel's soft lockup
detector.

To avoid this, let's replace the open coded io polling loop with one from
iopoll.h that uses a loop timed with the more presumably reliable ktime
infrastructure.

Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
---
 drivers/usb/host/xhci.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 005e65922608e..fde5ff84ba69b 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -16,6 +16,7 @@
 #include <linux/slab.h>
 #include <linux/dmi.h>
 #include <linux/dma-mapping.h>
+#include <linux/iopoll.h>
 
 #include "xhci.h"
 #include "xhci-trace.h"
@@ -69,18 +70,18 @@ static bool td_on_ring(struct xhci_td *td, struct xhci_ring *ring)
 int xhci_handshake(void __iomem *ptr, u32 mask, u32 done, int usec)
 {
 	u32	result;
+	int ret;
 
-	do {
-		result = readl(ptr);
-		if (result == ~(u32)0)		/* card removed */
-			return -ENODEV;
-		result &= mask;
-		if (result == done)
-			return 0;
-		udelay(1);
-		usec--;
-	} while (usec > 0);
-	return -ETIMEDOUT;
+	ret = readl_poll_timeout_atomic(ptr, result,
+			((result & mask) == done || (result == ~(u32)0)),
+			1, usec);
+	if (ret)
+		return ret;
+
+	if (result == ~(u32)0)		/* card removed */
+		return -ENODEV;
+
+	return 0;
 }
 
 /*
-- 
2.21.0.rc2.261.ga7da99ff1b-goog


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] xhci: use iopoll for xhci_handshake
  2019-02-27 22:19 [PATCH] xhci: use iopoll for xhci_handshake Daniel Kurtz
@ 2019-02-28  7:09 ` Greg Kroah-Hartman
  2019-02-28  8:55   ` Mathias Nyman
  2019-02-28 16:49   ` Daniel Kurtz
  0 siblings, 2 replies; 4+ messages in thread
From: Greg Kroah-Hartman @ 2019-02-28  7:09 UTC (permalink / raw)
  To: Daniel Kurtz; +Cc: rrangel, Mathias Nyman, open list:USB XHCI DRIVER, open list

On Wed, Feb 27, 2019 at 03:19:17PM -0700, Daniel Kurtz wrote:
> In cases such as xhci_abort_cmd_ring(), xhci_handshake() is called with
> a spin lock held (and local interrupts disabled) with a huge 5 second
> timeout.  This can translates to 5 million calls to udelay(1).  By its
> very nature, udelay() is not meant to be precise, it only guarantees to
> delay a minimum of 1 microsecond. Therefore the actual delay of
> xhci_handshake() can be significantly longer.  If the average udelay(1)
> is greater than 2.2 us, the total time in xhci_handshake() - with
> interrupts disabled can be > 11 seconds triggering the kernel's soft lockup
> detector.
> 
> To avoid this, let's replace the open coded io polling loop with one from
> iopoll.h that uses a loop timed with the more presumably reliable ktime
> infrastructure.
> 
> Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>

Looks sane to me, nice fixup.

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Is this causing problems on older kernels/devices today such that we
should backport this?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] xhci: use iopoll for xhci_handshake
  2019-02-28  7:09 ` Greg Kroah-Hartman
@ 2019-02-28  8:55   ` Mathias Nyman
  2019-02-28 16:49   ` Daniel Kurtz
  1 sibling, 0 replies; 4+ messages in thread
From: Mathias Nyman @ 2019-02-28  8:55 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Daniel Kurtz, Andrey Smirnov
  Cc: rrangel, Mathias Nyman, open list:USB XHCI DRIVER, open list

On 28.2.2019 9.09, Greg Kroah-Hartman wrote:
> On Wed, Feb 27, 2019 at 03:19:17PM -0700, Daniel Kurtz wrote:
>> In cases such as xhci_abort_cmd_ring(), xhci_handshake() is called with
>> a spin lock held (and local interrupts disabled) with a huge 5 second
>> timeout.  This can translates to 5 million calls to udelay(1).  By its
>> very nature, udelay() is not meant to be precise, it only guarantees to
>> delay a minimum of 1 microsecond. Therefore the actual delay of
>> xhci_handshake() can be significantly longer.  If the average udelay(1)
>> is greater than 2.2 us, the total time in xhci_handshake() - with
>> interrupts disabled can be > 11 seconds triggering the kernel's soft lockup
>> detector.
>>
>> To avoid this, let's replace the open coded io polling loop with one from
>> iopoll.h that uses a loop timed with the more presumably reliable ktime
>> infrastructure.
>>
>> Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
> 
> Looks sane to me, nice fixup.
> 
> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> Is this causing problems on older kernels/devices today such that we
> should backport this?
> 

A very similar patch was submitted some weeks ago by Andrey Smirnov.

https://lore.kernel.org/lkml/20190208014816.21869-1-andrew.smirnov@gmail.com/

His commit message only mentions that readl_poll_timeout_atomic() does the same job,
not about any issues with the loop,  so I was going to send it forward to usb-next
after 5.1-rc (to 5.2).

-Mathias

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] xhci: use iopoll for xhci_handshake
  2019-02-28  7:09 ` Greg Kroah-Hartman
  2019-02-28  8:55   ` Mathias Nyman
@ 2019-02-28 16:49   ` Daniel Kurtz
  1 sibling, 0 replies; 4+ messages in thread
From: Daniel Kurtz @ 2019-02-28 16:49 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: rrangel, Mathias Nyman, open list:USB XHCI DRIVER, open list

On Thu, Feb 28, 2019 at 12:09 AM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Wed, Feb 27, 2019 at 03:19:17PM -0700, Daniel Kurtz wrote:
> > In cases such as xhci_abort_cmd_ring(), xhci_handshake() is called with
> > a spin lock held (and local interrupts disabled) with a huge 5 second
> > timeout.  This can translates to 5 million calls to udelay(1).  By its
> > very nature, udelay() is not meant to be precise, it only guarantees to
> > delay a minimum of 1 microsecond. Therefore the actual delay of
> > xhci_handshake() can be significantly longer.  If the average udelay(1)
> > is greater than 2.2 us, the total time in xhci_handshake() - with
> > interrupts disabled can be > 11 seconds triggering the kernel's soft lockup
> > detector.
> >
> > To avoid this, let's replace the open coded io polling loop with one from
> > iopoll.h that uses a loop timed with the more presumably reliable ktime
> > infrastructure.
> >
> > Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
>
> Looks sane to me, nice fixup.
>
> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>
> Is this causing problems on older kernels/devices today such that we
> should backport this?

We detected that xhci_handshake timing out can lead to softlockup
while debugging a USB issue on a new product.  The xhci_handshake
timeout itself is a symptom of another underlying problem causing some
commands to be aborted.  I don't know if any such underlying problems
exist on other older devices, but the potential is there so a backport
is reasonable.  Although, it may just shift the symptom of an
underlying problem from a softlockup/oops to some other symptom, like
USB just being dead.

-Dan

>
> thanks,
>
> greg k-h

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-02-28 16:49 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-27 22:19 [PATCH] xhci: use iopoll for xhci_handshake Daniel Kurtz
2019-02-28  7:09 ` Greg Kroah-Hartman
2019-02-28  8:55   ` Mathias Nyman
2019-02-28 16:49   ` Daniel Kurtz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).