All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] USB: hub: Don't record a connect-change event during reset-resume
@ 2020-01-31 15:39 Alan Stern
  2020-01-31 19:36 ` Paul Zimmerman
  0 siblings, 1 reply; 3+ messages in thread
From: Alan Stern @ 2020-01-31 15:39 UTC (permalink / raw)
  To: Greg KH; +Cc: Paul Zimmerman, David Heinzelmann, USB list

Paul Zimmerman reports that his USB Bluetooth adapter sometimes
crashes following system resume, when it receives a
Get-Device-Descriptor request while it is busy doing something else.

Such a request was added by commit a4f55d8b8c14 ("usb: hub: Check
device descriptor before resusciation").  It gets sent when the hub
driver's work thread checks whether a connect-change event on an
enabled port really indicates a new device has been connected, as
opposed to an old device momentarily disconnecting and then
reconnecting (which can happen with xHCI host controllers, since they
automatically enable connected ports).

The same kind of thing occurs when a port's power session is lost
during system suspend.  When the system wakes up it sees a
connect-change event on the port, and if the child device's
persist_enabled flag was set then hub_activate() sets the device's
reset_resume flag as well as the port's bit in hub->change_bits.  The
reset-resume code then takes responsibility for checking that the same
device is still attached to the port, and it does this as part of the
device's resume pathway.  By the time the hub driver's work thread
starts up again, the device has already been fully reinitialized and
is busy doing its own thing.  There's no need for the work thread to
do the same check a second time, and in fact this unnecessary check is
what caused the problem that Paul observed.

Note that performing the unnecessary check is not actually a bug.
Devices are supposed to be able to send descriptors back to the host
even when they are busy doing something else.  The underlying cause of
Paul's problem lies in his Bluetooth adapter.  Nevertheless, we
shouldn't perform the same check twice in a row -- and as a nice side
benefit, removing the extra check allows the Bluetooth adapter to work
more reliably.

The work thread performs its check when it sees that the port's bit is
set in hub->change_bits.  In this situation that bit is interpreted as
though a connect-change event had occurred on the port _after_ the
reset-resume, which is not what actually happened.

One possible fix would be to make the reset-resume code clear the
port's bit in hub->change_bits.  But it seems simpler to just avoid
setting the bit during hub_activate() in the first place.  That's what
this patch does.

(Proving that the patch is correct when CONFIG_PM is disabled requires
a little thought.  In that setting hub_activate() will be called only
for initialization and resets, since there won't be any resumes or
reset-resumes.  During initialization and hub resets the hub doesn't
have any child devices, and so this code path never gets executed.)

Reported-and-tested-by: Paul Zimmerman <pauldzim@gmail.com>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://marc.info/?t=157949360700001&r=1&w=2
CC: David Heinzelmann <heinzelmann.david@gmail.com>
CC: <stable@vger.kernel.org>

---

[as1930]


 drivers/usb/core/hub.c |    5 -----
 1 file changed, 5 deletions(-)

Index: usb-devel/drivers/usb/core/hub.c
===================================================================
--- usb-devel.orig/drivers/usb/core/hub.c
+++ usb-devel/drivers/usb/core/hub.c
@@ -1216,11 +1216,6 @@ static void hub_activate(struct usb_hub
 #ifdef CONFIG_PM
 			udev->reset_resume = 1;
 #endif
-			/* Don't set the change_bits when the device
-			 * was powered off.
-			 */
-			if (test_bit(port1, hub->power_bits))
-				set_bit(port1, hub->change_bits);
 
 		} else {
 			/* The power session is gone; tell hub_wq */


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] USB: hub: Don't record a connect-change event during reset-resume
  2020-01-31 15:39 [PATCH] USB: hub: Don't record a connect-change event during reset-resume Alan Stern
@ 2020-01-31 19:36 ` Paul Zimmerman
  2020-01-31 21:04   ` Alan Stern
  0 siblings, 1 reply; 3+ messages in thread
From: Paul Zimmerman @ 2020-01-31 19:36 UTC (permalink / raw)
  To: Alan Stern; +Cc: Greg KH, David Heinzelmann, USB list

On Fri, 31 Jan 2020 10:39:26 -0500 (EST)
Alan Stern <stern@rowland.harvard.edu> wrote:

> Paul Zimmerman reports that his USB Bluetooth adapter sometimes
> crashes following system resume, when it receives a
> Get-Device-Descriptor request while it is busy doing something else.
> 
> Such a request was added by commit a4f55d8b8c14 ("usb: hub: Check
> device descriptor before resusciation").  It gets sent when the hub
> driver's work thread checks whether a connect-change event on an
> enabled port really indicates a new device has been connected, as
> opposed to an old device momentarily disconnecting and then
> reconnecting (which can happen with xHCI host controllers, since they
> automatically enable connected ports).

< snip > 

> Note that performing the unnecessary check is not actually a bug.
> Devices are supposed to be able to send descriptors back to the host
> even when they are busy doing something else.  The underlying cause of
> Paul's problem lies in his Bluetooth adapter.  Nevertheless, we
> shouldn't perform the same check twice in a row -- and as a nice side
> benefit, removing the extra check allows the Bluetooth adapter to work
> more reliably.

Actually, at the time the failure happens, the bluetooth driver is putting
the device into a "manufacturer mode" and downloading a firmware patch to
the device. So I don't think we can fault the device for not responding to
a get-descriptor request at that point. Probably there should be some kind
of locking in the driver while that is being done.

Nevertheless, your patch makes everything work again, so I think it's
"good enough" :)

Thanks,
Paul

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] USB: hub: Don't record a connect-change event during reset-resume
  2020-01-31 19:36 ` Paul Zimmerman
@ 2020-01-31 21:04   ` Alan Stern
  0 siblings, 0 replies; 3+ messages in thread
From: Alan Stern @ 2020-01-31 21:04 UTC (permalink / raw)
  To: Paul Zimmerman; +Cc: Greg KH, David Heinzelmann, USB list

On Fri, 31 Jan 2020, Paul Zimmerman wrote:

> > Note that performing the unnecessary check is not actually a bug.
> > Devices are supposed to be able to send descriptors back to the host
> > even when they are busy doing something else.  The underlying cause of
> > Paul's problem lies in his Bluetooth adapter.  Nevertheless, we
> > shouldn't perform the same check twice in a row -- and as a nice side
> > benefit, removing the extra check allows the Bluetooth adapter to work
> > more reliably.
> 
> Actually, at the time the failure happens, the bluetooth driver is putting
> the device into a "manufacturer mode" and downloading a firmware patch to
> the device. So I don't think we can fault the device for not responding to
> a get-descriptor request at that point. Probably there should be some kind
> of locking in the driver while that is being done.

Heh.  We don't have any locking of that sort in the kernel.  Any user 
at any time can run "lsusb -v" and that program will try to communicate 
with every attached USB device.  There's no way to claim exclusive 
rights to a device.

The fact that firmware loading works at all is more or less a matter of
luck (although the odds are with us).

> Nevertheless, your patch makes everything work again, so I think it's
> "good enough" :)
> 
> Thanks,
> Paul

You're welcome.

Alan Stern


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-01-31 21:04 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-31 15:39 [PATCH] USB: hub: Don't record a connect-change event during reset-resume Alan Stern
2020-01-31 19:36 ` Paul Zimmerman
2020-01-31 21:04   ` Alan Stern

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.