* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
       [not found] <1472132041.13456.11.camel@researchut.com>
@ 2016-08-25 17:17 ` Alan Stern
       [not found]   ` <Pine.LNX.4.44L0.1608251254220.1395-100000-IYeN2dnnYyZXsRXLowluHWD2FQJk+8+b@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Alan Stern @ 2016-08-25 17:17 UTC (permalink / raw)
  To: Ritesh Raj Sarraf, Ulf Hansson; +Cc: USB list, linux-mmc

Ulf:

Ritesh has collected logs showing that his Realtek RTS5129 USB card
reader (drivers/mfd/rtsx_usb.c, drivers/mmc/host/rtsx_usb_sdmmc.c) goes
into runtime autosuspend every 3 seconds and then immediately resumes.  
This sounds like something is failing to call
pm_runtime_mark_last_busy().  He's using a 4.7 kernel.

In addition, the device gets disconnected from the USB bus from time to 
time.  This appears to be a completely separate issue.

For now, I'd like to fix the runtime PM problem.  But I don't know
anything about the mmc core, so perhaps you can help.
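For anyone following along, the runtime PM state discussed above can be
inspected from user space through sysfs. The following is a minimal
sketch, not part of the original report: the helper name
read_runtime_pm() is hypothetical, and the device path assumes the reader
sits on port 1-4, as in the "usb 1-4" kernel log lines quoted in this
thread. Adjust the path for your system.

```python
from pathlib import Path

# Standard runtime PM attributes found under <device>/power/ in sysfs.
PM_ATTRS = (
    "control",                 # "auto" allows runtime suspend, "on" forbids it
    "autosuspend_delay_ms",    # idle time before autosuspend, in milliseconds
    "runtime_status",          # "active", "suspended", ...
    "runtime_active_time",     # total time spent active, in milliseconds
    "runtime_suspended_time",  # total time spent suspended, in milliseconds
)


def read_runtime_pm(device_dir):
    """Return {attribute: value or None} for the runtime PM files of a device."""
    power = Path(device_dir) / "power"
    state = {}
    for name in PM_ATTRS:
        try:
            state[name] = (power / name).read_text().strip()
        except OSError:
            # The attribute is missing or cannot be read on this system.
            state[name] = None
    return state


if __name__ == "__main__":
    # Assumed path; "1-4" is the port seen in the kernel log in this thread.
    for key, value in read_runtime_pm("/sys/bus/usb/devices/1-4").items():
        print(f"{key}: {value}")
```

Sampling runtime_status once a second should make a 3-second
suspend/resume cycle visible without needing a usbmon trace, although
the read itself does not wake the device.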


On Thu, 25 Aug 2016, Ritesh Raj Sarraf wrote:

> > Do you happen to know which driver is being used: the memstick
> > (rtsx_usb_ms) or mmc (rtsx_usb_sdmmc) driver?  I suppose this may 
> > depend on what type of card you insert in the reader.
> > 
> 
> 
> I think it is the rtsx_usb_sdmmc which is in use. I removed the rtsx_usb_ms
> kernel module and still was able to access the sdcard.
> 
> rrs@learner:~$ lsmod | grep usb_ms
> 2016-08-25 / 18:45:52 ♒♒♒  ☹  => 1  
> 
> rrs@learner:~$ lsmod | grep usb_sd
> rtsx_usb_sdmmc         24576  0
> rtsx_usb               24576  1 rtsx_usb_sdmmc
> mmc_core              135168  4 mmc_block,sdhci,sdhci_acpi,rtsx_usb_sdmmc
> 2016-08-25 / 18:45:55 ♒♒♒  ☺  
> 
> 
> The interesting bit is that when I enter the adapter into the reader, I get the
> following error 5 times, and then it can access the card.
> 
> [  496.822613] mmc0: tuning execution failed: -22
> [  496.822629] mmc0: error -22 whilst initialising SD card
> [  501.980908] mmc0: tuning execution failed: -22
> [  501.980922] mmc0: error -22 whilst initialising SD card
> [  507.119953] mmc0: tuning execution failed: -22
> [  507.119968] mmc0: error -22 whilst initialising SD card
> [  513.148143] mmc0: tuning execution failed: -22
> [  513.148157] mmc0: error -22 whilst initialising SD card
> [  518.702215] mmc0: tuning execution failed: -22
> [  518.702222] mmc0: error -22 whilst initialising SD card
> [  524.081122] mmc0: new ultra high speed SDR50 SDHC card at address 0002
> [  524.082596] mmcblk0: mmc0:0002 NCard 14.9 GiB 
> [  524.084240]  mmcblk0: p1
> [  524.306434] FAT-fs (mmcblk0p1): utf8 is not a recommended IO charset for FAT
> filesystems, filesystem will be case sensitive!

I can't tell why those errors occur.  It would require more debugging.  
At least they don't seem to cause any serious problems.

> With your patch applied, the initial errors messages (xhci_hcd 0000:00:14.0: dev
> 4 ep1out scatterlist error -104/-110) are not seen so far.

This is because those errors occur when the device goes into runtime
autosuspend and the computer tries to communicate with it while it is
suspended.  Both things (the autosuspend and the communication attempt)  
are bugs in the drivers.
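As a reading aid for the numeric codes in these logs, here is a small
sketch that maps them to their Linux errno names. The table is limited
to the values that actually appear in this thread, and the function name
decode_errors() is made up for illustration.

```python
import re

# Linux errno values that appear in the kernel logs in this thread.
LINUX_ERRNO = {
    22: "EINVAL",
    71: "EPROTO",
    104: "ECONNRESET",
    110: "ETIMEDOUT",
}

# Matches "error -22" as well as paired codes such as "error -104/-110".
_ERR_RE = re.compile(r"error (-\d+(?:/-\d+)*)")


def decode_errors(line):
    """Return symbolic names for the negative errno values after 'error'."""
    match = _ERR_RE.search(line)
    if not match:
        return []
    codes = [abs(int(part)) for part in match.group(1).split("/")]
    return [LINUX_ERRNO.get(code, f"errno {code}") for code in codes]


if __name__ == "__main__":
    sample = "xhci_hcd 0000:00:14.0: dev 4 ep1out scatterlist error -104/-110"
    print(decode_errors(sample))
```

The names alone do not identify the root cause; they only make the log
lines easier to compare against the driver source.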

> The device does reset (as you had mentioned), but it doesn't seem to have any
> power drain related negative effects.
> 
> 
> rrs@learner:~$ less /var/tmp/dmesg-post-patch.txt  | tail -n 25
> [11922.283067] wlan0: RX AssocResp from 00:40:77:bb:55:12 (capab=0x411 status=0
> aid=1)
> [11922.283743] wlan0: associated
> [11922.283801] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
> [11922.883849] systemd[1]: apt-daily.timer: Adding 8h 36min 18.323966s random
> time.
> [11923.081426] systemd[1]: apt-daily.timer: Adding 2h 23min 22.221062s random
> time.
> [13799.616838] atkbd serio0: Unknown key pressed (translated set 2, code 0xbe on
> isa0060/serio0).
> [13799.616843] atkbd serio0: Use 'setkeycodes e03e <keycode>' to make it known.
> [13799.625901] atkbd serio0: Unknown key released (translated set 2, code 0xbe
> on isa0060/serio0).
> [13799.625905] atkbd serio0: Use 'setkeycodes e03e <keycode>' to make it known.
> [13800.547966] usb 1-4: USB disconnect, device number 15

Spontaneous disconnect followed by reconnect a little later...

> [13801.707137] usb 1-4: new high-speed USB device number 16 using xhci_hcd
> [13801.880788] usb 1-4: New USB device found, idVendor=0bda, idProduct=0129
> [13801.880791] usb 1-4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
> [13801.880792] usb 1-4: Product: USB2.0-CRW
> [13801.880793] usb 1-4: Manufacturer: Generic
> [13801.880794] usb 1-4: SerialNumber: 20100201396000000
> [13802.809031] usb 1-4: USB disconnect, device number 16
> [13803.390459] usb 1-4: new high-speed USB device number 18 using xhci_hcd
> [13808.807084] usb 1-4: new high-speed USB device number 19 using xhci_hcd
> [13808.980827] usb 1-4: New USB device found, idVendor=0bda, idProduct=0129
> [13808.980831] usb 1-4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
> [13808.980833] usb 1-4: Product: USB2.0-CRW
> [13808.980834] usb 1-4: Manufacturer: Generic
> [13808.980835] usb 1-4: SerialNumber: 20100201396000000
> [14367.255033] usb 1-7: reset full-speed USB device number 5 using xhci_hcd
> 2016-08-25 / 18:53:16 ♒♒♒  ☺  
> 
> Note: These resets are seen without any card/adapter in the reader.

The computer probably still wants to communicate with the reader, in 
order to check whether a card has been inserted.  In theory this 
shouldn't be necessary, because the card reader should perform a remote 
wakeup when a card is inserted or removed.  It might not support this 
feature, however -- although your "lsusb -v" output shows that it does 
support remote wakeup.
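The remote wakeup capability reported by "lsusb -v" comes from the
bmAttributes byte of the configuration descriptor, which sysfs also
exposes as a hex string in the device's bmAttributes file. A minimal
sketch of decoding it follows; the function name is hypothetical and the
value 0xa0 is only an illustrative input, not taken from the lsusb
output mentioned above.

```python
# Bit positions in the configuration descriptor's bmAttributes field.
SELF_POWERED = 0x40   # bit 6
REMOTE_WAKEUP = 0x20  # bit 5


def decode_bm_attributes(hex_text):
    """Decode a bmAttributes value given as hex text, e.g. 'a0'."""
    value = int(hex_text.strip(), 16)
    return {
        "self_powered": bool(value & SELF_POWERED),
        "remote_wakeup": bool(value & REMOTE_WAKEUP),
    }


if __name__ == "__main__":
    # Illustrative value: bus-powered device that advertises remote wakeup.
    print(decode_bm_attributes("a0"))
```

Note that advertising the capability is separate from having it
enabled; the device's power/wakeup file in sysfs shows whether wakeup is
currently enabled.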

> > As you mentioned above, there's another aspect to power management 
> > besides runtime PM, namely Link Power Management.  Perhaps the device 
> > can't handle LPM.
> > 
> > You can test this by editing the usb_device_supports_lpm() routine in 
> > drivers/usb/core/hub.c.  If you make it always return 0 immediately, 
> > that will disable LPM for all USB devices.  If the spontaneous 
> > disconnects don't reappear, we'll have the answer.
> > 
> > Alan Stern
> > 
> 
> 
> I'll try this out on a new build and share my results again on this thread.
> 
> As for the patch, will it qualify for inclusion into the mainline kernel ?

You mean the patch that disables autosuspend in the rtsx_usb driver?  
No, it's not a real solution.  We need to figure out why the device
gets autosuspended every 3 seconds and fix that bug, not just eliminate
all support for runtime PM.

Alan Stern



* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
       [not found]   ` <Pine.LNX.4.44L0.1608251254220.1395-100000-IYeN2dnnYyZXsRXLowluHWD2FQJk+8+b@public.gmane.org>
@ 2016-08-30  8:14     ` Ulf Hansson
       [not found]       ` <CAPDyKFq2SYtwWCNhSzQcxj8XdYmAhTqn6mxRKMJ7eKZAk=itWg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Ulf Hansson @ 2016-08-30  8:14 UTC (permalink / raw)
  To: Alan Stern; +Cc: Ritesh Raj Sarraf, USB list, linux-mmc

On 25 August 2016 at 19:17, Alan Stern <stern-nwvwT67g6+6dFdvTe/nMLpVzexx5G7lz@public.gmane.org> wrote:
> Ulf:
>
> Ritesh has collected logs showing that his Realtek RTS5129 USB card
> reader (drivers/mfd/rtsx_usb.c, drivers/mmc/host/rtsx_usb_sdmmc.c) goes
> into runtime autosuspend every 3 seconds and then immediately resumes.
> This sounds like something is failing to call
> pm_runtime_mark_last_busy().  He's using a 4.7 kernel.
>
> In addition, the device gets disconnected from the USB bus from time to
> time.  This appears to be a completely separate issue.
>
> For now, I'd like to fix the runtime PM problem.  But I don't know
> anything about the mmc core, so perhaps you can help.
>

Sorry for the delay! We have had some regressions for 4.8-rc1 in the
mmc block layer. Those problems should be resolved by now.

Judging from the runtime PM issues you describe, the problems could
very well be related, although I don't believe the issue was present
in a 4.7 kernel.

Perhaps you can run a test on a 4.8-rc4 kernel, just to double check.
4.8-rc4 contains the following fixes in the mmc block layer.

commit 7afafc8a44bf ("block: Fix secure erase")
commit 869c554808cc ("mmc: fix use-after-free of struct request")

Kind regards
Uffe

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
       [not found]       ` <CAPDyKFq2SYtwWCNhSzQcxj8XdYmAhTqn6mxRKMJ7eKZAk=itWg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-09-04 11:32         ` Ritesh Raj Sarraf
  2016-09-04 18:01           ` Ritesh Raj Sarraf
  0 siblings, 1 reply; 33+ messages in thread
From: Ritesh Raj Sarraf @ 2016-09-04 11:32 UTC (permalink / raw)
  To: Ulf Hansson, Alan Stern; +Cc: USB list, linux-mmc

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Hello Alan, Ulf.

Sorry for the delayed reply. Last week was a sickly week for me.

On Tue, 2016-08-30 at 10:14 +0200, Ulf Hansson wrote:
> On 25 August 2016 at 19:17, Alan Stern <stern-nwvwT67g6+6dFdvTe/nMLpVzexx5G7lz@public.gmane.org> wrote:
> > Ulf:
> >
> > Ritesh has collected logs showing that his Realtek RTS5129 USB card
> > reader (drivers/mfd/rtsx_usb.c, drivers/mmc/host/rtsx_usb_sdmmc.c) goes
> > into runtime autosuspend every 3 seconds and then immediately resumes.
> > This sounds like something is failing to call
> > pm_runtime_mark_last_busy().  He's using a 4.7 kernel.
> >
> > In addition, the device gets disconnected from the USB bus from time to
> > time.  This appears to be a completely separate issue.
> >
> > For now, I'd like to fix the runtime PM problem.  But I don't know
> > anything about the mmc core, so perhaps you can help.
> >
> 

@Alan,

I noticed that, even with the one-line patch that disables autosuspend
for this driver, I still occasionally hit the driver error messages.

[19608.533058] usb 1-4: reset high-speed USB device number 13 using xhci_hcd
[19608.692846] usb 1-4: Device not responding to setup address.
[19608.898254] usb 1-4: Device not responding to setup address.
[19609.099553] usb 1-4: device not accepting address 13, error -71
[19609.259474] usb 1-4: reset high-speed USB device number 13 using xhci_hcd
[19609.420221] usb 1-4: device descriptor read/64, error -71
[19609.699656] usb 1-4: device descriptor read/all, error -71
[19609.859678] usb 1-4: reset high-speed USB device number 13 using xhci_hcd
[19609.859768] usb 1-4: Device not responding to setup address.
[19610.062891] usb 1-4: Device not responding to setup address.
[19610.266148] usb 1-4: device not accepting address 13, error -71
[19610.426157] usb 1-4: reset high-speed USB device number 13 using xhci_hcd
[19610.426252] usb 1-4: Device not responding to setup address.
[19610.629533] usb 1-4: Device not responding to setup address.
[19610.832791] usb 1-4: device not accepting address 13, error -71
[19610.833374] usb 1-4: USB disconnect, device number 13
[19611.506151] usb 1-4: new high-speed USB device number 14 using xhci_hcd
[19611.666770] usb 1-4: device descriptor read/64, error -71
[19611.929563] usb 1-4: Device not responding to setup address.
[19612.136298] usb 1-4: Device not responding to setup address.
[19612.342800] usb 1-4: device not accepting address 14, error -71
[19612.502791] usb 1-4: new high-speed USB device number 15 using xhci_hcd
[19612.662829] usb 1-4: device descriptor read/64, error -71
[19612.929561] usb 1-4: Device not responding to setup address.
[19613.132883] usb 1-4: Device not responding to setup address.
[19613.336162] usb 1-4: device not accepting address 15, error -71
[19613.496165] usb 1-4: new high-speed USB device number 16 using xhci_hcd
[19613.509674] usb 1-4: device descriptor read/8, error -71
[19613.626406] usb 1-4: device descriptor read/8, error -71
[19613.889528] usb 1-4: new high-speed USB device number 17 using xhci_hcd
[19613.889639] usb 1-4: Device not responding to setup address.
[19614.092899] usb 1-4: Device not responding to setup address.
[19614.296169] usb 1-4: device not accepting address 17, error -71
[19614.296216] usb usb1-port4: unable to enumerate USB device
[20357.431911] SGI XFS with ACLs, security attributes, realtime, no debug
enabled


Earlier, only the device reset errors were seen, but I have now hit
these errors once again.

> Sorry for the delay! We have had some regressions for 4.8 rc1 in the
> mmc block layer. Those problem should be resolved by now.
> 
> By reading from the runtime PM issues you have, the problems could
> very well be related. Although I don't believe the issues was present
> in a 4.7 kernel.
> 

This issue has been around with previous kernels too.

> Perhaps you can run a test on a 4.8 rc4 kernel, just to double check.
> The 4.8 rc4, contains the following fixes in the mmc block layer.
> 
> commit 7afafc8a44bf ("block: Fix secure erase")
> commit 869c554808cc ("mmc: fix use-after-free of struct request")

I'll share my 4.8-rc4 results soon.

Thanks,
Ritesh

- -- 
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."
-----BEGIN PGP SIGNATURE-----

iQIcBAEBCgAGBQJXzAZBAAoJEKY6WKPy4XVpKk4P/0lwMRmZYVhUf19s6yqviMzB
5YPMKCahz5KyPEBP5bdfDW6TCi83UI34yn/FCUj0l4kRRb41zL7JUZ/PYjyFfq3c
2d6viAx4+Qjqn26sZ+PDciEaBKHAQWEu1SYgGfxk+5fmXUc/aLQpY9h+DAHB30xU
H80FQz5Ct6rFwp5TkHmFALSk0Ue1bF/ZJy9AvsOkGZCN+1lYekEh8Ry26RvKsIEc
2i5wJ5CDJXQT1AiV5Jwy6WfiEy8a8FKlxRuo767e9Ftp/sZhUaulhYXsBaJOm6ZT
ckBbib9HEtfKM/nL/Cp3kjBqR7PHWdxeFER9x+r/KWCQYalD1MlbtP48oekFGr7U
h/XqLNuHdaKdfLFctHx4HZ/vQCBQ8XW0dYfpXSnoNOzm7I8qxp5icUZGad5BTxSE
QDbg8A6vW1gZ8dxRVG39+JHVHUxx2AXvUM720aUpRdik8c6+kz/pcIwXPLvboPsY
WGKDiI1Q8EE7/8910+g/qghOI97o0dgasdVY9FZGhYL427BzanAPcXm5aGXeyFJk
5mA+TNASC8gviYC5z7hc0lOouoprNyt1ssp6hZFPmkEsOyKpy6rQrf2VJcJuoYdH
PHsfJFUkM7/t2PygEhRBV266DVCzOR1Ut968T++ErnHmEVkHse94cRNvbXZVYkHJ
XsagbkAA/uxsIdubWV85
=GIR5
-----END PGP SIGNATURE-----


* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
  2016-09-04 11:32         ` Ritesh Raj Sarraf
@ 2016-09-04 18:01           ` Ritesh Raj Sarraf
       [not found]             ` <1473012074.5339.6.camel-7WuBAv+fczCJ8c2fQYRYNw@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Ritesh Raj Sarraf @ 2016-09-04 18:01 UTC (permalink / raw)
  To: Ulf Hansson, Alan Stern; +Cc: USB list, linux-mmc

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On Sun, 2016-09-04 at 17:02 +0530, Ritesh Raj Sarraf wrote:
> > Sorry for the delay! We have had some regressions for 4.8 rc1 in the
> > mmc block layer. Those problem should be resolved by now.
> > 
> > By reading from the runtime PM issues you have, the problems could
> > very well be related. Although I don't believe the issues was present
> > in a 4.7 kernel.
> > 
> 
> This issue has been around with previous kernels too.
> 
> > Perhaps you can run a test on a 4.8 rc4 kernel, just to double check.
> > The 4.8 rc4, contains the following fixes in the mmc block layer.
> > 
> > commit 7afafc8a44bf ("block: Fix secure erase")
> > commit 869c554808cc ("mmc: fix use-after-free of struct request")
> 
> I'll share my 4.8-rc4 results soon.

As I hoped, I was able to reproduce this with 4.8-rc4 too.

[  857.999547] systemd[1]: apt-daily.timer: Adding 10h 21min 53.234725s random
time.
[13071.615285] usb 2-4: USB disconnect, device number 2
[13072.794802] usb 2-4: new high-speed USB device number 7 using xhci_hcd
[13072.925092] usb 2-4: New USB device found, idVendor=0bda, idProduct=0129
[13072.925094] usb 2-4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[13072.925095] usb 2-4: Product: USB2.0-CRW
[13072.925096] usb 2-4: Manufacturer: Generic
[13072.925097] usb 2-4: SerialNumber: 20100201396000000
[13093.499011] usb 2-4: reset high-speed USB device number 7 using xhci_hcd
[13093.612239] usb 2-4: Device not responding to setup address.
[13093.818896] usb 2-4: Device not responding to setup address.
[13094.025491] usb 2-4: device not accepting address 7, error -71
[13094.138844] usb 2-4: reset high-speed USB device number 7 using xhci_hcd
[13094.252178] usb 2-4: device descriptor read/64, error -71
[13094.489037] usb 2-4: device descriptor read/all, error -71
[13094.602374] usb 2-4: reset high-speed USB device number 7 using xhci_hcd
[13094.602453] usb 2-4: Device not responding to setup address.
[13094.808929] usb 2-4: Device not responding to setup address.
[13095.015523] usb 2-4: device not accepting address 7, error -71
[13095.128879] usb 2-4: reset high-speed USB device number 7 using xhci_hcd
[13095.145799] usb 2-4: device descriptor read/all, error 8
[13095.145842] usb 2-4: USB disconnect, device number 7
[13095.492216] usb 2-4: new high-speed USB device number 8 using xhci_hcd
[13095.605642] usb 2-4: Device not responding to setup address.
[13095.812294] usb 2-4: Device not responding to setup address.


- -- 
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."
-----BEGIN PGP SIGNATURE-----

iQIcBAEBCgAGBQJXzGFqAAoJEKY6WKPy4XVpZzcP/jCzROd3+k0oF+cgK4WqKKEN
Wb/LOHezZlyNXnZ4/UxT0ox9Agc9AMQEvx4+77Itm5khQ+H3ZMEQNQg0jSENZ6aT
KjPSC3fUZ93PBGyCzec+CTj4UY49WJ1hZ4dvv7WW7WEuR8gxZxV+Z0VcG7kD8h0E
aOwJ5PVbyNSY7ZCH8NNRPnx5v74C/b+1OlFbRV1SNTUBU06ekeImy5IuvGwF45Gq
cLbKMIjJmUsg/jSol/pqUg1Sewj+z1part7sF5ZeAYM/ntwb4JAWwYvcWNI7VMxW
skT7HqyQ/qKk++UShxGsHYqV8Ec5Rx5tAUTgEH68Y7y4i9YDSTOnqXDgjTANz5RL
yaZS856dGoIs96/mHCjzMV+1NYpTo2aMPO1/JDs6nvKfBtyIVDbTsB8qbdP5vqnb
x1GmzB6j8wDHBTAtJIifkB8XGQfcxbDto2LG6/OnPPN3/aiTMCKB6ZND2HISNGfr
IeN39Ay35zn9epq6D3usdd1TAlF5xqo1djHyFDYfXE0gZsJZiCCs9GC2NaP46ZYp
G7a4O1UpbPKhRq7peyh2f/GzHg+3k1J+PD+I1Us/w2Dj1ObCr7M4ToF70AQ2xISE
DFBuP7dWue6waQup7S3zNLqyP5K101NkKQbSrXJHEhxIaqNFbEZg1D8MuC4I0gHA
i4uloGV6oSg973dwyTiU
=q0wG
-----END PGP SIGNATURE-----



* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
       [not found]             ` <1473012074.5339.6.camel-7WuBAv+fczCJ8c2fQYRYNw@public.gmane.org>
@ 2016-09-04 19:46               ` Alan Stern
  2016-09-05 12:59                 ` Ritesh Raj Sarraf
  0 siblings, 1 reply; 33+ messages in thread
From: Alan Stern @ 2016-09-04 19:46 UTC (permalink / raw)
  To: Ritesh Raj Sarraf; +Cc: Ulf Hansson, USB list, linux-mmc

On Sun, 4 Sep 2016, Ritesh Raj Sarraf wrote:

> > This issue has been around with previous kernels too.
> > 
> > > Perhaps you can run a test on a 4.8 rc4 kernel, just to double check.
> > > The 4.8 rc4, contains the following fixes in the mmc block layer.
> > > 
> > > commit 7afafc8a44bf ("block: Fix secure erase")
> > > commit 869c554808cc ("mmc: fix use-after-free of struct request")
> > 
> > I'll share my 4.8-rc4 results soon.
> 
> As I hoped, I was able to reproduce this with 4.8-rc4 too.
> 
> [  857.999547] systemd[1]: apt-daily.timer: Adding 10h 21min 53.234725s random
> time.
> [13071.615285] usb 2-4: USB disconnect, device number 2
> [13072.794802] usb 2-4: new high-speed USB device number 7 using xhci_hcd
> [13072.925092] usb 2-4: New USB device found, idVendor=0bda, idProduct=0129
> [13072.925094] usb 2-4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
> [13072.925095] usb 2-4: Product: USB2.0-CRW
> [13072.925096] usb 2-4: Manufacturer: Generic
> [13072.925097] usb 2-4: SerialNumber: 20100201396000000
> [13093.499011] usb 2-4: reset high-speed USB device number 7 using xhci_hcd
> [13093.612239] usb 2-4: Device not responding to setup address.
> [13093.818896] usb 2-4: Device not responding to setup address.
> [13094.025491] usb 2-4: device not accepting address 7, error -71
> [13094.138844] usb 2-4: reset high-speed USB device number 7 using xhci_hcd
> [13094.252178] usb 2-4: device descriptor read/64, error -71
> [13094.489037] usb 2-4: device descriptor read/all, error -71
> [13094.602374] usb 2-4: reset high-speed USB device number 7 using xhci_hcd
> [13094.602453] usb 2-4: Device not responding to setup address.
> [13094.808929] usb 2-4: Device not responding to setup address.
> [13095.015523] usb 2-4: device not accepting address 7, error -71
> [13095.128879] usb 2-4: reset high-speed USB device number 7 using xhci_hcd
> [13095.145799] usb 2-4: device descriptor read/all, error 8
> [13095.145842] usb 2-4: USB disconnect, device number 7
> [13095.492216] usb 2-4: new high-speed USB device number 8 using xhci_hcd
> [13095.605642] usb 2-4: Device not responding to setup address.
> [13095.812294] usb 2-4: Device not responding to setup address.

This is not the problem I was discussing with Ulf.  The problem was why
the device kept going into and out of runtime suspend every three
seconds.  The kernel log above does not say whether this was happening.
One way to tell is to look at a usbmon trace (like we did before).
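For readers who want to check a usbmon text trace themselves, here is a
rough sketch of what to look for. It assumes the usual usbmon text
layout (URB tag, timestamp in microseconds, event type, address, then
the setup packet after an "s" marker) and that host-initiated port
suspend and resume on a USB 2 port appear as the hub-class requests
SetPortFeature(PORT_SUSPEND) and ClearPortFeature(PORT_SUSPEND) sent to
the root hub. The function names and the sample lines are made up for
illustration; they are not taken from the trace posted in this thread.

```python
PORT_SUSPEND = 0x0002  # hub-class port feature selector


def port_suspend_events(lines):
    """Return (timestamp_us, action, port) for PORT_SUSPEND set/clear requests."""
    events = []
    for line in lines:
        f = line.split()
        # Expected fields: tag, timestamp, event type, address, "s",
        # bmRequestType, bRequest, wValue, wIndex, wLength, ...
        if len(f) < 10 or f[2] != "S" or f[4] != "s":
            continue
        if not f[3].startswith("Co:"):
            continue  # only control OUT submissions are of interest
        if f[5] != "23" or int(f[7], 16) != PORT_SUSPEND:
            continue  # not a hub port request for PORT_SUSPEND
        if f[6] == "03":
            action = "suspend"  # SET_FEATURE
        elif f[6] == "01":
            action = "resume"   # CLEAR_FEATURE
        else:
            continue
        events.append((int(f[1]), action, int(f[8], 16) & 0xFF))
    return events


def suspend_intervals(events):
    """Return the gaps, in seconds, between consecutive suspend requests."""
    times = [t for t, action, _ in events if action == "suspend"]
    return [(b - a) / 1e6 for a, b in zip(times, times[1:])]


# Made-up sample lines in usbmon text format (port 4 on bus 1).
SAMPLE = [
    "ffff8800d0c5a540 100000000 S Co:1:001:0 s 23 03 0002 0004 0000 0",
    "ffff8800d0c5a540 100000020 C Co:1:001:0 0 0",
    "ffff8800d0c5a540 100050000 S Co:1:001:0 s 23 01 0002 0004 0000 0",
    "ffff8800d0c5a540 103000000 S Co:1:001:0 s 23 03 0002 0004 0000 0",
]

if __name__ == "__main__":
    events = port_suspend_events(SAMPLE)
    print(events)
    print(suspend_intervals(events))
```

A run of suspend requests spaced roughly three seconds apart, each
followed shortly by a resume, would match the behaviour described
above. Resumes triggered by remote wakeup may not show up as a
ClearPortFeature request, so this sketch may undercount them.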

Alan Stern


* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
  2016-09-04 19:46               ` Alan Stern
@ 2016-09-05 12:59                 ` Ritesh Raj Sarraf
       [not found]                   ` <1473080344.10346.4.camel-7WuBAv+fczCJ8c2fQYRYNw@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Ritesh Raj Sarraf @ 2016-09-05 12:59 UTC (permalink / raw)
  To: Alan Stern; +Cc: Ulf Hansson, USB list, linux-mmc

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On Sun, 2016-09-04 at 15:46 -0400, Alan Stern wrote:
> 
> This is not the problem I was discussing with Ulf.  The problem was why
> the device kept going into and out of runtime suspend every three
> seconds.  The kernel log above does not say whether this was happening.
> One way to tell is to look at a usbmon trace (like we did before).

https://people.debian.org/~rrs/tmp/usbmon.txt.gz

This should have the logs you asked for, running on 4.8-rc4.


- -- 
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."
-----BEGIN PGP SIGNATURE-----

iQIcBAEBCgAGBQJXzWwYAAoJEKY6WKPy4XVpFKgP/i76Ek/1e1CYqStgakQUtmPb
eeGuSPFVqvQt4PEvGNvMtXcLMx5IKIEpKbqua7N4/ZfKmf5EA00wUCXzm85MnyAp
fIN9K1uOW1ZepnsViJ/U1bNZvHVDXy/LEfd5me+7JzWsPqaR/KjyZmZ8VSlu+aov
CvBDrt41x9plf1xJBYffHc2KvdM0lYFJjft2Gop9qr/hMhIy8RcJdbylPpnf4rtU
cbm3Ti3DdA+r86MLCKagq50Iow3xcfBEEYZLnn+ucnpTOWz56TyO3+44sjgLgrpy
IDRBlCbgk2RZkm8lpj+d052ssk3MjLBupBFkoRfXfXfear3zLrut2Q51xjeJIINz
oZdmAxozsvGxCtgC6dIb/79clruL8kqdr/TqF/YtwzB6JnFvZEgzBcpAlxFBxTlK
RtvAyPVs4Np1vtTli/Uv1O+odoR0FPhzpV+nDSsea7im6/oMav40UpXvgs72EEUS
cr5SwY8xClfNeemCHFZ/wFeYwPfjPfNU5Lgm/Hkpx6oaOABCnfacF1nWxbIRbuoh
gP9vW/Z0LI4azo940xlWfRiDck5W9IQXeI3v2wXVVIE8TDwxraQzyBXKKPBFxmfX
jTKWLRDDqyOWGcWxLJpMHMOa9rNtAZeiBI7TMbnqZKj6eLsdjuOW76zDhEHnMAtb
dUCOFYRG24arYTUpIVnB
=YG3r
-----END PGP SIGNATURE-----



* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
       [not found]                   ` <1473080344.10346.4.camel-7WuBAv+fczCJ8c2fQYRYNw@public.gmane.org>
@ 2016-09-05 15:58                     ` Alan Stern
       [not found]                       ` <Pine.LNX.4.44L0.1609051157310.25234-100000-pYrvlCTfrz9XsRXLowluHWD2FQJk+8+b@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Alan Stern @ 2016-09-05 15:58 UTC (permalink / raw)
  To: Ritesh Raj Sarraf; +Cc: Ulf Hansson, USB list, linux-mmc

On Mon, 5 Sep 2016, Ritesh Raj Sarraf wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
> 
> On Sun, 2016-09-04 at 15:46 -0400, Alan Stern wrote:
> > 
> > This is not the problem I was discussing with Ulf.  The problem was why
> > the device kept going into and out of runtime suspend every three
> > seconds.  The kernel log above does not say whether this was happening.
> > One way to tell is to look at a usbmon trace (like we did before).
> 
> https://people.debian.org/~rrs/tmp/usbmon.txt.gz
> 
> This should have the logs you asked for, running on 4.8-rc4.

Confirmed, the runtime suspends and resumes are still happening.

Ulf, any insights?

Alan Stern


* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
       [not found]                       ` <Pine.LNX.4.44L0.1609051157310.25234-100000-pYrvlCTfrz9XsRXLowluHWD2FQJk+8+b@public.gmane.org>
@ 2016-09-06  9:42                         ` Ulf Hansson
  2016-09-06 17:08                           ` Ritesh Raj Sarraf
       [not found]                           ` <CAPDyKFpnCXhdoKgoG576teC=y38vbC1x=-ehC_9EWEeKr_K6BQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 2 replies; 33+ messages in thread
From: Ulf Hansson @ 2016-09-06  9:42 UTC (permalink / raw)
  To: Alan Stern, Ritesh Raj Sarraf; +Cc: USB list, linux-mmc

On 5 September 2016 at 17:58, Alan Stern <stern-nwvwT67g6+6dFdvTe/nMLpVzexx5G7lz@public.gmane.org> wrote:
> On Mon, 5 Sep 2016, Ritesh Raj Sarraf wrote:
>
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA512
>>
>> On Sun, 2016-09-04 at 15:46 -0400, Alan Stern wrote:
>> >
>> > This is not the problem I was discussing with Ulf.  The problem was why
>> > the device kept going into and out of runtime suspend every three
>> > seconds.  The kernel log above does not say whether this was happening.
>> > One way to tell is to look at a usbmon trace (like we did before).
>>
>> https://people.debian.org/~rrs/tmp/usbmon.txt.gz
>>
>> This should have the logs you asked for, running on 4.8-rc4.
>
> Confirmed, the runtime suspends and resumes are still happening.
>
> Ulf, any insights?

Alan, Ritesh,

Yes, I am starting to understand more about what goes on here.
Although I need help to test as I don't have the HW.
As you already guessed, I suspect the problem is within the runtime PM
deployment in the drivers/mmc/host/rtsx_usb_sdmmc.c.

Let me start by giving you some background on how the mmc core
deals with runtime PM.

*)
The mmc core manages most of the calls to the pm_runtime_get|put*()
and pm_runtime_mark_last_busy() for the mmc host device. The gets/puts
are done when the core is about to access the mmc host device, via the
mmc host ops driver interface. You may search for calls to the
mmc_claim|release_host() functions to find out when the gets/puts are
done.

**)
The mmc core has also deployed runtime PM for the mmc *card* device,
which has the runtime PM autosuspend feature enabled with a 3s
default timeout. One important point is that the mmc card device has
the mmc host device assigned as its parent device. I am guessing
the reason why you are encountering strange 3s intervals of runtime
PM suspend/resume is related to this.
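
To illustrate the parent/child relationship described in the two points
above, here is a minimal userspace sketch. It is not kernel code: the
toy_* names and the struct are hypothetical, and the 3s autosuspend delay
is left out, so a put suspends at once. It only shows how a reference on
the card device keeps the host, and the host's own parent, resumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of a runtime-PM device; not the kernel's struct device. */
struct toy_dev {
	const char *name;
	struct toy_dev *parent;
	int usage;            /* outstanding toy_get() references */
	int active_children;  /* children that are currently resumed */
	bool suspended;
};

/* Resume a device, resuming its parent chain first. */
static void toy_resume(struct toy_dev *d)
{
	if (!d->suspended)
		return;
	if (d->parent) {
		toy_resume(d->parent);
		d->parent->active_children++;
	}
	d->suspended = false;
}

/* Suspend a device if nothing holds it active, then let the parent try too. */
static void toy_try_suspend(struct toy_dev *d)
{
	if (d->suspended || d->usage > 0 || d->active_children > 0)
		return;
	d->suspended = true;
	if (d->parent) {
		d->parent->active_children--;
		toy_try_suspend(d->parent);
	}
}

/* Take a reference and make sure the device is resumed. */
void toy_get(struct toy_dev *d)
{
	d->usage++;
	toy_resume(d);
}

/* Drop a reference; this toy model has no autosuspend delay. */
void toy_put(struct toy_dev *d)
{
	assert(d->usage > 0); /* catches an unbalanced put */
	d->usage--;
	toy_try_suspend(d);
}
```

With a usb -> host -> card chain that starts out suspended, toy_get() on
the card resumes all three devices, and the matching toy_put() lets all
three suspend again. In this model, periodic activity on the card alone
is enough to cycle the whole chain.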

Now, in this case, the rtsx_usb_sdmmc driver seems to need a bit of
special runtime PM deployment, as the calls to pm_runtime_get|put*()
also control the power to the usb device and thus also the power to
the card. I am guessing that's done via the usb device being assigned
as parent for the mmc host's platform device!?

By reviewing the code of the rtsx_usb_sdmmc driver, particularly how
it calls pm_runtime_get|put(), I am guessing those calls may not be
properly deployed. Perhaps rtsx_usb_sdmmc should convert to use the
usb_autopm_put|get_interface() and friends, although I didn't want to
make that change at this point, so instead I have cooked a patch that
might fix the behaviour.

Ritesh, can you please try it out to see what happens?

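---
 drivers/mmc/host/rtsx_usb_sdmmc.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/drivers/mmc/host/rtsx_usb_sdmmc.c b/drivers/mmc/host/rtsx_usb_sdmmc.c
index 6c71fc9..3d6fe51 100644
--- a/drivers/mmc/host/rtsx_usb_sdmmc.c
+++ b/drivers/mmc/host/rtsx_usb_sdmmc.c
@@ -1138,11 +1138,6 @@ static void sdmmc_set_ios(struct mmc_host *mmc, struct mmc_ios *ios)
        dev_dbg(sdmmc_dev(host), "%s\n", __func__);
        mutex_lock(&ucr->dev_mutex);

-       if (rtsx_usb_card_exclusive_check(ucr, RTSX_USB_SD_CARD)) {
-               mutex_unlock(&ucr->dev_mutex);
-               return;
-       }
-
        sd_set_power_mode(host, ios->power_mode);
        sd_set_bus_width(host, ios->bus_width);
        sd_set_timing(host, ios->timing, &host->ddr_mode);
@@ -1336,7 +1331,7 @@ static void rtsx_usb_init_host(struct rtsx_usb_sdmmc *host)
                MMC_CAP_MMC_HIGHSPEED | MMC_CAP_BUS_WIDTH_TEST |
                MMC_CAP_UHS_SDR12 | MMC_CAP_UHS_SDR25 | MMC_CAP_UHS_SDR50 |
                MMC_CAP_NEEDS_POLL;
-       mmc->caps2 = MMC_CAP2_NO_PRESCAN_POWERUP | MMC_CAP2_FULL_PWR_CYCLE;
+       mmc->caps2 = MMC_CAP2_FULL_PWR_CYCLE;

        mmc->max_current_330 = 400;
        mmc->max_current_180 = 800;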

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
  2016-09-06  9:42                         ` Ulf Hansson
@ 2016-09-06 17:08                           ` Ritesh Raj Sarraf
       [not found]                           ` <CAPDyKFpnCXhdoKgoG576teC=y38vbC1x=-ehC_9EWEeKr_K6BQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 0 replies; 33+ messages in thread
From: Ritesh Raj Sarraf @ 2016-09-06 17:08 UTC (permalink / raw)
  To: Ulf Hansson, Alan Stern; +Cc: USB list, linux-mmc

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Hello Ulf,

On Tue, 2016-09-06 at 11:42 +0200, Ulf Hansson wrote:
> 
> By reviewing the code of the rtsx_usb_sdmmc driver, particularly how
> it calls pm_runtime_get|put() I am guessing those calls may not be
> properly deployed. Perhaps rtsx_usb_sdmmc should convert to use the
> usb_autopm_put|get_interface() and friends, although I didn't want to
> make that change at this point so instead I have cooked a patch that
> might fixes the behaviour.
> 
> Ritesh, can you please try it out to see what happens?

I was able to hit the issue again, with your patch applied. I tried it on top of
the 4.8-rc5 kernel.

I made sure to capture a usbmon trace.
https://people.debian.org/~rrs/tmp/usbmon-ulf.txt.gz


- -- 
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."
-----BEGIN PGP SIGNATURE-----

iQIcBAEBCgAGBQJXzvgmAAoJEKY6WKPy4XVpJ/oQAIR/bC82KdqD3xuQpq8r62RT
fV8DSduNhr394GIDO5PBDSKUAFDKCHL9SyCBx9I5zx1U7OF+IfvhgpRji+XcBsQ5
TOtDAhiiTatKfWl/zLGbVqukNS8QLzyU26w0HRabvK3RZ0lGxBugQ8KneUC8wA/Q
NYWAgio5gN5jpD6BgGEZXo6cdzcdI1HM5kyNxcs4VfhD0zDor4wlrW024458a3Fs
02QAkmyV1aBwR4w/Ntw7D9DXb8sVujGgeEOr+KukzdFOPW+w4JOANv3gregDnKSc
EgIYmtTE95d0jWyJuS5sbZlbf4QOcrv9eVwn86TOdhZHU1XlaqaIEbRPzu15ZtZF
B+zYbWg+TgTYFgTYJoaB8T+CUJMMnNh62IC72LXa5K/C0kjgjlHUzis04XdFmT2q
jYC767M1utXi7a4LcXxw8V8uoV0nnmuvxWzhrcdBxi6+a85BLL8bPKT+nrGxkxB0
Hg7Wmnc9moohpjYD3KlAYR3bW56dDYrGgsj8xRUnplBkUp10g2vhix7bk/q+eUwU
gJghnI/WgEE5hBQabtuqCTCsvMdTXmZVjzMg1k43V5v+UR0a7mKVW0uBWu5cU38p
Ga9ZrNmPEzxOVmeXH3XCB8fCXRgce8ieu/KEPHpvbegia0nX4X7BcCHhK2KGsqnX
Y4f+TvpJw2idUe/cvdtw
=/aP1
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
       [not found]                           ` <CAPDyKFpnCXhdoKgoG576teC=y38vbC1x=-ehC_9EWEeKr_K6BQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-09-07 20:48                             ` Alan Stern
       [not found]                               ` <Pine.LNX.4.44L0.1609071630350.2115-100000-IYeN2dnnYyZXsRXLowluHWD2FQJk+8+b@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Alan Stern @ 2016-09-07 20:48 UTC (permalink / raw)
  To: Ulf Hansson; +Cc: Ritesh Raj Sarraf, USB list, linux-mmc

On Tue, 6 Sep 2016, Ulf Hansson wrote:

> On 5 September 2016 at 17:58, Alan Stern <stern-nwvwT67g6+6dFdvTe/nMLpVzexx5G7lz@public.gmane.org> wrote:
> > On Mon, 5 Sep 2016, Ritesh Raj Sarraf wrote:
> >
> >> -----BEGIN PGP SIGNED MESSAGE-----
> >> Hash: SHA512
> >>
> >> On Sun, 2016-09-04 at 15:46 -0400, Alan Stern wrote:
> >> >
> >> > This is not the problem I was discussing with Ulf.  The problem was why
> >> > the device kept going into and out of runtime suspend every three
> >> > seconds.  The kernel log above does not say whether this was happening.
> >> > One way to tell is to look at a usbmon trace (like we did before).
> >>
> >> https://people.debian.org/~rrs/tmp/usbmon.txt.gz
> >>
> >> This should have the logs you asked for, running on 4.8-rc4.
> >
> > Confirmed, the runtime suspends and resumes are still happening.
> >
> > Ulf, any insights?
> 
> Alan, Ritesh,
> 
> Yes, I am starting to understand more about what goes on here.
> Although I need help to test as I don't have the HW.
> As you already guessed, I suspect the problem is within the runtime PM
> deployment in the drivers/mmc/host/rtsx_usb_sdmmc.c.
> 
> Let me start by first give you some background to how the mmc core
> deals with runtime PM.
> 
> *)
> The mmc core manages most of the calls to the pm_runtime_get|put*()
> and pm_runtime_mark_last_busy() for the mmc host device. The gets/puts
> are done when the core is about to access the mmc host device, via the
> mmc host ops driver interface. You may search for calls to the
> mmc_claim|release_host() functions to find out when the gets/puts are
> done.

Since mmc_claim_host() does call pm_runtime_get_sync(), these runtime
suspends would not occur if the host was claimed.  So we can conclude
that rtsx_usb_sdmmc.c must be doing I/O to the device without claiming
the host.

> **)
> The mmc core have also deployed runtime PM for the mmc *card* device
> and which has the runtime PM autosuspend feature enabled with a 3s
> default timeout. One important point is that the mmc card device has
> the mmc host device assigned as being its parent device. I guessing
> the reason to why you are encountering strange 3s intervals of runtime
> PM suspend/resume is related to this.

That sounds likely.

> Now, in this case, the rtsx_usb_sdmmc driver seems to need a bit of
> special runtime PM deployment, as the calls to pm_runtime_get|put*()
> also controls the power to the usb device and thus also the power to
> the card. I am guessing that's done via the usb device being assigned
> as parent for the mmc host's platform device!?

Yes, in drivers/mfd/mfd-core.c's mfd_add_device() routine, which is 
called indirectly by drivers/mfd/rtsx_usb.c's probe routine.

> By reviewing the code of the rtsx_usb_sdmmc driver, particularly how
> it calls pm_runtime_get|put() I am guessing those calls may not be
> properly deployed. Perhaps rtsx_usb_sdmmc should convert to use the
> usb_autopm_put|get_interface() and friends, although I didn't want to
> make that change at this point so instead I have cooked a patch that
> might fixes the behaviour.

The usb_autopm_* calls are mostly just convenience wrappers around the
pm_runtime_* functions, meant for use with USB drivers.  In fact, you
can't use them with a platform_device.

It seems odd that rtsx_usb_sdmmc calls pm_runtime_put() and
pm_runtime_get_sync() directly instead of using
mmc_claim|release_host().

> Ritesh, can you please try it out to see what happens?
> 
> ---
>  drivers/mmc/host/rtsx_usb_sdmmc.c | 7 +------
>  1 file changed, 1 insertion(+), 6 deletions(-)
> 
> diff --git a/drivers/mmc/host/rtsx_usb_sdmmc.c
> b/drivers/mmc/host/rtsx_usb_sdmmc.c
> index 6c71fc9..3d6fe51 100644
> --- a/drivers/mmc/host/rtsx_usb_sdmmc.c
> +++ b/drivers/mmc/host/rtsx_usb_sdmmc.c
> @@ -1138,11 +1138,6 @@ static void sdmmc_set_ios(struct mmc_host *mmc,
> struct mmc_ios *ios)
>         dev_dbg(sdmmc_dev(host), "%s\n", __func__);
>         mutex_lock(&ucr->dev_mutex);
> 
> -       if (rtsx_usb_card_exclusive_check(ucr, RTSX_USB_SD_CARD)) {
> -               mutex_unlock(&ucr->dev_mutex);
> -               return;
> -       }
> -
>         sd_set_power_mode(host, ios->power_mode);
>         sd_set_bus_width(host, ios->bus_width);
>         sd_set_timing(host, ios->timing, &host->ddr_mode);
> @@ -1336,7 +1331,7 @@ static void rtsx_usb_init_host(struct
> rtsx_usb_sdmmc *host)
>                 MMC_CAP_MMC_HIGHSPEED | MMC_CAP_BUS_WIDTH_TEST |
>                 MMC_CAP_UHS_SDR12 | MMC_CAP_UHS_SDR25 | MMC_CAP_UHS_SDR50 |
>                 MMC_CAP_NEEDS_POLL;
> -       mmc->caps2 = MMC_CAP2_NO_PRESCAN_POWERUP | MMC_CAP2_FULL_PWR_CYCLE;
> +       mmc->caps2 = MMC_CAP2_FULL_PWR_CYCLE;
> 
>         mmc->max_current_330 = 400;
>         mmc->max_current_180 = 800;

These changes don't seem to affect the way rtsx_usb_sdmmc.c handles 
runtime PM.  In particular, the driver doesn't call 
pm_runtime_mark_last_busy() anywhere.
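
For readers unfamiliar with pm_runtime_mark_last_busy(): it records a
"last busy" timestamp, and the autosuspend delay is counted from that
timestamp. A minimal userspace sketch of that arithmetic follows. The
toy_* names and the millisecond units are assumptions for illustration,
not the kernel's implementation, and the sketch does not claim to be the
root cause of the behaviour seen in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the autosuspend bookkeeping of one device. */
struct toy_pm {
	long last_busy_ms;          /* time of the last recorded activity */
	long autosuspend_delay_ms;  /* e.g. 3000 for a 3s delay */
};

/* Record activity "now", restarting the autosuspend countdown. */
void toy_mark_last_busy(struct toy_pm *pm, long now_ms)
{
	assert(pm != NULL);
	pm->last_busy_ms = now_ms;
}

/*
 * Milliseconds that must still pass before the device may autosuspend.
 * A result of 0 means the delay has already expired, so dropping the
 * last reference would suspend the device straight away.
 */
long toy_ms_until_suspend(const struct toy_pm *pm, long now_ms)
{
	long expires_ms;

	assert(pm != NULL);
	expires_ms = pm->last_busy_ms + pm->autosuspend_delay_ms;
	return expires_ms > now_ms ? expires_ms - now_ms : 0;
}
```

If the timestamp is never refreshed, the countdown is measured from stale
activity and has usually expired already, so the delay gives no
protection against an immediate suspend after I/O.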

Alan Stern


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
       [not found]                               ` <Pine.LNX.4.44L0.1609071630350.2115-100000-IYeN2dnnYyZXsRXLowluHWD2FQJk+8+b@public.gmane.org>
@ 2016-09-09 10:54                                 ` Ulf Hansson
       [not found]                                   ` <CAPDyKFr0vEaEbsoPm6YwJD1JOQc=YR=zwi4T6Rr3gCQ4StNuvg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Ulf Hansson @ 2016-09-09 10:54 UTC (permalink / raw)
  To: Alan Stern; +Cc: Ritesh Raj Sarraf, USB list, linux-mmc

On 7 September 2016 at 22:48, Alan Stern <stern-nwvwT67g6+6dFdvTe/nMLpVzexx5G7lz@public.gmane.org> wrote:
> On Tue, 6 Sep 2016, Ulf Hansson wrote:
>
>> On 5 September 2016 at 17:58, Alan Stern <stern-nwvwT67g6+6dFdvTe/nMLpVzexx5G7lz@public.gmane.org> wrote:
>> > On Mon, 5 Sep 2016, Ritesh Raj Sarraf wrote:
>> >
>> >> -----BEGIN PGP SIGNED MESSAGE-----
>> >> Hash: SHA512
>> >>
>> >> On Sun, 2016-09-04 at 15:46 -0400, Alan Stern wrote:
>> >> >
>> >> > This is not the problem I was discussing with Ulf.  The problem was why
>> >> > the device kept going into and out of runtime suspend every three
>> >> > seconds.  The kernel log above does not say whether this was happening.
>> >> > One way to tell is to look at a usbmon trace (like we did before).
>> >>
>> >> https://people.debian.org/~rrs/tmp/usbmon.txt.gz
>> >>
>> >> This should have the logs you asked for, running on 4.8-rc4.
>> >
>> > Confirmed, the runtime suspends and resumes are still happening.
>> >
>> > Ulf, any insights?
>>
>> Alan, Ritesh,
>>
>> Yes, I am starting to understand more about what goes on here.
>> Although I need help to test as I don't have the HW.
>> As you already guessed, I suspect the problem is within the runtime PM
>> deployment in the drivers/mmc/host/rtsx_usb_sdmmc.c.
>>
>> Let me start by first give you some background to how the mmc core
>> deals with runtime PM.
>>
>> *)
>> The mmc core manages most of the calls to the pm_runtime_get|put*()
>> and pm_runtime_mark_last_busy() for the mmc host device. The gets/puts
>> are done when the core is about to access the mmc host device, via the
>> mmc host ops driver interface. You may search for calls to the
>> mmc_claim|release_host() functions to find out when the gets/puts are
>> done.
>
> Since mmc_claim_host() does call pm_runtime_get_sync(), these runtime
> suspends would not occur if the host was claimed.  So we can conclude
> that rtsx_usb_sdmmc.c must be doing I/O to the device without claiming
> the host.

Yes.

So to be clear, it's the responsibility of the mmc core to deal with
mmc_claim|release_host(), not the mmc host driver.

Although, under special circumstances the mmc host driver may still
need to access its device. In these cases it explicitly needs to deal
with pm_runtime_get|put*() itself.

Now, what puzzles me here is that the rtsx_usb_sdmmc driver was
introduced in kernel v3.16.
At that point (and not until v4.1) the mmc core did *not* deal with
runtime PM for mmc host devices via mmc_claim|release_host().

So how could the driver work at that point? :-) Maybe runtime PM has
never worked for this driver!?

>
>> **)
>> The mmc core have also deployed runtime PM for the mmc *card* device
>> and which has the runtime PM autosuspend feature enabled with a 3s
>> default timeout. One important point is that the mmc card device has
>> the mmc host device assigned as being its parent device. I guessing
>> the reason to why you are encountering strange 3s intervals of runtime
>> PM suspend/resume is related to this.
>
> That sounds likely.
>
>> Now, in this case, the rtsx_usb_sdmmc driver seems to need a bit of
>> special runtime PM deployment, as the calls to pm_runtime_get|put*()
>> also controls the power to the usb device and thus also the power to
>> the card. I am guessing that's done via the usb device being assigned
>> as parent for the mmc host's platform device!?
>
> Yes, in drivers/mfd/mfd-core.c's mfd_add_device() routine, which is
> called indirectly by drivers/mfd/rtsx_usb.c's probe routine.
>
>> By reviewing the code of the rtsx_usb_sdmmc driver, particularly how
>> it calls pm_runtime_get|put() I am guessing those calls may not be
>> properly deployed. Perhaps rtsx_usb_sdmmc should convert to use the
>> usb_autopm_put|get_interface() and friends, although I didn't want to
>> make that change at this point so instead I have cooked a patch that
>> might fixes the behaviour.
>
> The usb_autopm_* calls are mostly just convenience wrappers around the
> pm_runtime_* functions, meant for use with USB drivers.  In fact, you
> can't use them with a platform_device.

Okay.

>
> It seems odd that rtsx_usb_sdmmc calls pm_runtime_put() and
> pm_runtime_get_sync() directly instead of using
> mmc_claim|release_host().

Those calls are done when the mmc core calls the mmc host driver's
->set_ios() callback to power up/off the mmc *card*. It has
nothing to do with the mmc host device as such.

If I understand the original author's intent, he wanted to use
these runtime PM calls to control the power to the mmc card.

>
>> Ritesh, can you please try it out to see what happens?
>>
>> ---
>>  drivers/mmc/host/rtsx_usb_sdmmc.c | 7 +------
>>  1 file changed, 1 insertion(+), 6 deletions(-)
>>
>> diff --git a/drivers/mmc/host/rtsx_usb_sdmmc.c
>> b/drivers/mmc/host/rtsx_usb_sdmmc.c
>> index 6c71fc9..3d6fe51 100644
>> --- a/drivers/mmc/host/rtsx_usb_sdmmc.c
>> +++ b/drivers/mmc/host/rtsx_usb_sdmmc.c
>> @@ -1138,11 +1138,6 @@ static void sdmmc_set_ios(struct mmc_host *mmc,
>> struct mmc_ios *ios)
>>         dev_dbg(sdmmc_dev(host), "%s\n", __func__);
>>         mutex_lock(&ucr->dev_mutex);
>>
>> -       if (rtsx_usb_card_exclusive_check(ucr, RTSX_USB_SD_CARD)) {
>> -               mutex_unlock(&ucr->dev_mutex);
>> -               return;
>> -       }
>> -
>>         sd_set_power_mode(host, ios->power_mode);
>>         sd_set_bus_width(host, ios->bus_width);
>>         sd_set_timing(host, ios->timing, &host->ddr_mode);
>> @@ -1336,7 +1331,7 @@ static void rtsx_usb_init_host(struct
>> rtsx_usb_sdmmc *host)
>>                 MMC_CAP_MMC_HIGHSPEED | MMC_CAP_BUS_WIDTH_TEST |
>>                 MMC_CAP_UHS_SDR12 | MMC_CAP_UHS_SDR25 | MMC_CAP_UHS_SDR50 |
>>                 MMC_CAP_NEEDS_POLL;
>> -       mmc->caps2 = MMC_CAP2_NO_PRESCAN_POWERUP | MMC_CAP2_FULL_PWR_CYCLE;
>> +       mmc->caps2 = MMC_CAP2_FULL_PWR_CYCLE;
>>
>>         mmc->max_current_330 = 400;
>>         mmc->max_current_180 = 800;
>
> These changes don't seem to affect the way rtsx_usb_sdmmc.c handles
> runtime PM.  In particular, the driver doesn't call
> pm_runtime_mark_last_busy() anywhere.

This affects the way the core calls the host driver's ->set_ios()
callback. Earlier it was invoked first to do power off then power up.
With this change it starts with power up instead.
I wanted to try this because I suspected the initial state could be wrong.
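
To make the ordering concrete, here is a small sketch based only on the
description above. The toy_* names are hypothetical and this is not the
mmc core's code; it just encodes "with the flag, the first ->set_ios()
request is power off; without it, power up".

```c
#include <assert.h>

/* Hypothetical mirror of the one caps2 bit discussed here. */
#define TOY_CAP2_NO_PRESCAN_POWERUP (1u << 0)

enum toy_power_mode {
	TOY_POWER_OFF,
	TOY_POWER_UP
};

/*
 * Which power mode the core asks for first when the host is started,
 * according to the behaviour described in this message.
 */
enum toy_power_mode toy_first_power_request(unsigned int caps2)
{
	/* This toy model only knows about a single flag. */
	assert((caps2 & ~TOY_CAP2_NO_PRESCAN_POWERUP) == 0);

	if (caps2 & TOY_CAP2_NO_PRESCAN_POWERUP)
		return TOY_POWER_OFF;
	return TOY_POWER_UP;
}
```

Dropping the flag, as the test patch does, therefore changes the first
request from TOY_POWER_OFF to TOY_POWER_UP in this model.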

So here are some other ideas on how to move forward.
1. Run with CONFIG_PM unset to see if we can reproduce the problem.
2. Revert the mmc core to the state we had in 3.16 regarding how it
deals with runtime PM for host devices. That's actually very easy as
we only need to remove the pm_runtime_put|get() calls in
mmc_claim|release_host().

Ritesh, can you try these options?

Neither of the above will actually solve the problem, so I guess we
need to take a closer look anyway to understand why the usb device is
accessed when it is actually runtime suspended.

BTW, Ritesh you could also run a git bisect to find out if/when this
became broken.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
       [not found]                                   ` <CAPDyKFr0vEaEbsoPm6YwJD1JOQc=YR=zwi4T6Rr3gCQ4StNuvg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-09-09 13:14                                     ` Ritesh Raj Sarraf
       [not found]                                       ` <1473426861.9415.2.camel-7WuBAv+fczCJ8c2fQYRYNw@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Ritesh Raj Sarraf @ 2016-09-09 13:14 UTC (permalink / raw)
  To: Ulf Hansson, Alan Stern; +Cc: USB list, linux-mmc

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On Fri, 2016-09-09 at 12:54 +0200, Ulf Hansson wrote:
> This affects the way the core calls the host driver's ->set_ios()
> callback. Earlier it was invoked first to do power off then power up.
> With this change it starts with power up instead.
> I wanted to try this because I suspected the initial state could be wrong.
> 
> So here are some other ideas on how to move forward.
> 1. Run with CONFIG_PM unset to see if we can reproduce the problem.
> 2. Revert back the state in the mmc core we had in 3.16 around how it
> deals with runtime PM for host devices. That's actually very easy as
> we only need to remove the pm_runtime_put|get() calls in
> mmc_claim|release_host().
> 
> Ritesh, can you try these options?
> 

Yes. I can try the above ones now. I'm building the kernel for it.


> Neither of the above will actually solve the problem, so I guess we
> anyway need take a closer look to understand why the usb device is
> accessed when it is actually runtime suspended.
> 
> BTW, Ritesh you could also run a git bisect to find out if/when this
> became broken.

What starting point do you want me to use ? 4.0 ?

- -- 
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."
-----BEGIN PGP SIGNATURE-----

iQIcBAEBCgAGBQJX0rWtAAoJEKY6WKPy4XVpgeYP/3FCabIWEOvKeVJK52B5gviF
+CFz00vNVddMDQjUlbN/JzaA8h5HiLMBaTp/7nBvaMYKYKZjjo8iiNQ/3jjwPUu8
S/XrWPvMUVJUhF42OMKTl4uUpVPo3fI18SYIw4XFt5pNaUFrmBPjZ6S23BhqfLN3
x2BSw9rwZrCmxH4aMpfWmDFzjWvkuJ6grpxPfbz2lx4t7BWGs4wWdTc0yfQgzWDR
SLD/CpPCkVFjnpjRnrYZei3nNo0tS0+qSv1SsRwy8GchRbOo9bK1W8YKnLH/AFLb
kCYcVKCc4mq8iq5e05vjPvdppXpbzKKlhpxxtlqzLjAYx9FbESRN89D9c7V1Ng0q
c40b+lQV7AGYUxaf+tOcErflDvzRquR1dcwcjJizgeO/pOI6uXGWE+/T5j+4q4UJ
cypkxjzPIz8SbipNEPGf2nSMppXC1RRcBGo4247vFZ0QYTC7RAViL6ErRS67kfM9
5eX2iBtO3qNZMxIMqBPu97rLoLHSZ3nFEwNbyyFnNHERZuvLUZrZY208tZYUfZKZ
X18QJKPrqfbb9n+KyyLNTaESm6jhW2yYFdiRs/4LQWniOlLLPa+6GOD7E81044nB
t8LAKtSSxK+qvFVcfqrZSfaivRKQZbYwuk+x3O4vAUDgTpTgmugNaDTYendN3v1B
R809X7cWC6V1kzTaBkK9
=o1rL
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
       [not found]                                       ` <1473426861.9415.2.camel-7WuBAv+fczCJ8c2fQYRYNw@public.gmane.org>
@ 2016-09-09 14:04                                         ` Ritesh Raj Sarraf
       [not found]                                           ` <1473429884.9415.8.camel-7WuBAv+fczCJ8c2fQYRYNw@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Ritesh Raj Sarraf @ 2016-09-09 14:04 UTC (permalink / raw)
  To: Ulf Hansson, Alan Stern; +Cc: USB list, linux-mmc

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On Fri, 2016-09-09 at 18:44 +0530, Ritesh Raj Sarraf wrote:
> On Fri, 2016-09-09 at 12:54 +0200, Ulf Hansson wrote:
> > This affects the way the core calls the host driver's ->set_ios()
> > callback. Earlier it was invoked first to do power off then power up.
> > With this change it starts with power up instead.
> > I wanted to try this because I suspected the initial state could be wrong.
> > 
> > So here are some other ideas on how to move forward.
> > 1. Run with CONFIG_PM unset to see if we can reproduce the problem.
> > 2. Revert back the state in the mmc core we had in 3.16 around how it
> > deals with runtime PM for host devices. That's actually very easy as
> > we only need to remove the pm_runtime_put|get() calls in
> > mmc_claim|release_host().
> > 
> > Ritesh, can you try these options?
> > 
> 
> Yes. I can try the above ones now. I'm building the kernel for it.

For #1, menuconfig doesn't allow me to disable CONFIG_PM in 4.8. I checked
as far back as 4.0, and it still doesn't allow disabling CONFIG_PM.

For #2, I'm building the 4.8-rc5 kernel with the following change. This build
does not include the previous change you had suggested (related to POWER_CYCLE).

Date:   Fri Sep 9 19:28:03 2016 +0530

    Disable pm runtime in mmc core

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index e55cde6..32388d5 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -970,9 +970,6 @@ int __mmc_claim_host(struct mmc_host *host, atomic_t *abort)
        spin_unlock_irqrestore(&host->lock, flags);
        remove_wait_queue(&host->wq, &wait);
 
-       if (pm)
-               pm_runtime_get_sync(mmc_dev(host));
-
        return stop;
 }
 EXPORT_SYMBOL(__mmc_claim_host);
@@ -1000,7 +997,6 @@ void mmc_release_host(struct mmc_host *host)
                spin_unlock_irqrestore(&host->lock, flags);
                wake_up(&host->wq);
                pm_runtime_mark_last_busy(mmc_dev(host));
-               pm_runtime_put_autosuspend(mmc_dev(host));
        }
 }
 EXPORT_SYMBOL(mmc_release_host);


- -- 
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."
-----BEGIN PGP SIGNATURE-----

iQIcBAEBCgAGBQJX0sF8AAoJEKY6WKPy4XVpmDQQAKPpzuw4QaaYdGuoEdZs9tvL
ZIVXOp81QVFg/VC+k8b5JmxcVmyaRrAmlKwKSBUrQqLsRIDROKHz7kAZmABvvmMo
8PC+haslv6o+M/xTd2kZMgYRk0Xj11+Ucr6mTd0BVbTqzD86WZhSmdufeiFWhzjB
aMloMDJ3cYABMIHqPQH5S/+knNhuffKqEEZ1O7jgcc10c/JpwpxaAlefNLh9Qotk
bsb+ptpBE0ggk8gD/tGSx6JZLNFy15JyzE8yuL8LfrZzzW2KU8M4kv94+6BNMqpE
sJ1mapW3zu52Hev9cDpUeTgyVVEOEXJKu9AM626voyxVYrCEwrE4usUcLVsdJH17
p7Rm5gBiEK/Wx+f10CBiFW2HwdE0KmeBgxweprv+E6VXaWFkjoXSJY5DDX5zuxlf
we9onx87IaGVTLN0I7dEcVse/3T3zT8URM/HwFyR6K+PWD0Ioiyoi4GIE96OCIJn
oUahrupOppgUZbr+qn+HULHLXJONWBslZmbS3gjQG282+koy00wquAou+4HznA5z
DBLPaljAaIuIPKxKrIDOcJ3nxBh5eBf0RHXsn9Ho6iWrcpPqYtN2XuDWNhOA/wJB
eEm1TQAMmi83FATZ0qf0n70E1/mv2wrV4nSzHPqeIgtexABWLqhS8fLCEuMW6ESt
zMRmOrou7dzaXfwkGzhC
=2dkl
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
       [not found]                                           ` <1473429884.9415.8.camel-7WuBAv+fczCJ8c2fQYRYNw@public.gmane.org>
@ 2016-09-09 16:15                                             ` Alan Stern
  2016-09-14 14:50                                             ` Ritesh Raj Sarraf
  1 sibling, 0 replies; 33+ messages in thread
From: Alan Stern @ 2016-09-09 16:15 UTC (permalink / raw)
  To: Ritesh Raj Sarraf; +Cc: Ulf Hansson, USB list, linux-mmc

On Fri, 9 Sep 2016, Ritesh Raj Sarraf wrote:

> On Fri, 2016-09-09 at 18:44 +0530, Ritesh Raj Sarraf wrote:
> > On Fri, 2016-09-09 at 12:54 +0200, Ulf Hansson wrote:
> > > This affects the way the core calls the host driver's ->set_ios()
> > > callback. Earlier it was invoked first to do power off then power up.
> > > With this change it starts with power up instead.
> > > I wanted to try this because I suspected the initial state could be wrong.
> > > 
> > > So here are some other ideas on how to move forward.
> > > 1. Run with CONFIG_PM unset to see if we can reproduce the problem.
> > > 2. Revert back the state in the mmc core we had in 3.16 around how it
> > > deals with runtime PM for host devices. That's actually very easy as
> > > we only need to remove the pm_runtime_put|get() calls in
> > > mmc_claim|release_host().
> > > 
> > > Ritesh, can you try these options?
> > > 
> > 
> > Yes. I can try the above ones now. I'm building the kernel for it.
> 
> For #1, menuconfig doesn't allow me to disable CONFIG_PM in 4.8. I checked it
> back up till 4.0, and it still doesn't allow disabling CONFIG_PM.

You can do it, but it would require some pretty far-reaching changes.  
Besides, there's really no point.  If CONFIG_PM isn't enabled then the 
kernel doesn't do any runtime PM at all.

Alan Stern


> For #2, I'm building the 4.8-rc5 kernel with the following change. This build
> does not include the previous change you had suggested (related to POWER_CYCLE)
> 
> Date:   Fri Sep 9 19:28:03 2016 +0530
> 
>     Disable pm runtime in mmc core
> 
> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
> index e55cde6..32388d5 100644
> --- a/drivers/mmc/core/core.c
> +++ b/drivers/mmc/core/core.c
> @@ -970,9 +970,6 @@ int __mmc_claim_host(struct mmc_host *host, atomic_t *abort)
>         spin_unlock_irqrestore(&host->lock, flags);
>         remove_wait_queue(&host->wq, &wait);
>  
> -       if (pm)
> -               pm_runtime_get_sync(mmc_dev(host));
> -
>         return stop;
>  }
>  EXPORT_SYMBOL(__mmc_claim_host);
> @@ -1000,7 +997,6 @@ void mmc_release_host(struct mmc_host *host)
>                 spin_unlock_irqrestore(&host->lock, flags);
>                 wake_up(&host->wq);
>                 pm_runtime_mark_last_busy(mmc_dev(host));
> -               pm_runtime_put_autosuspend(mmc_dev(host));
>         }
>  }
>  EXPORT_SYMBOL(mmc_release_host);


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
       [not found]                                           ` <1473429884.9415.8.camel-7WuBAv+fczCJ8c2fQYRYNw@public.gmane.org>
  2016-09-09 16:15                                             ` Alan Stern
@ 2016-09-14 14:50                                             ` Ritesh Raj Sarraf
       [not found]                                               ` <1473864634.9913.12.camel-7WuBAv+fczCJ8c2fQYRYNw@public.gmane.org>
  1 sibling, 1 reply; 33+ messages in thread
From: Ritesh Raj Sarraf @ 2016-09-14 14:50 UTC (permalink / raw)
  To: Ulf Hansson, Alan Stern; +Cc: USB list, linux-mmc

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Hello Ulf and Alan,

On Fri, 2016-09-09 at 19:34 +0530, Ritesh Raj Sarraf wrote:
> For #2, I'm building the 4.8-rc5 kernel with the following change. This build
> does not include the previous change you had suggested (related to
> POWER_CYCLE)
> 
> Date:   Fri Sep 9 19:28:03 2016 +0530
> 
>     Disable pm runtime in mmc core
> 
> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
> index e55cde6..32388d5 100644
> --- a/drivers/mmc/core/core.c
> +++ b/drivers/mmc/core/core.c
> @@ -970,9 +970,6 @@ int __mmc_claim_host(struct mmc_host *host, atomic_t
> *abort)
>         spin_unlock_irqrestore(&host->lock, flags);
>         remove_wait_queue(&host->wq, &wait);
>  
> -       if (pm)
> -               pm_runtime_get_sync(mmc_dev(host));
> -
>         return stop;
>  }
>  EXPORT_SYMBOL(__mmc_claim_host);
> @@ -1000,7 +997,6 @@ void mmc_release_host(struct mmc_host *host)
>                 spin_unlock_irqrestore(&host->lock, flags);
>                 wake_up(&host->wq);
>                 pm_runtime_mark_last_busy(mmc_dev(host));
> -               pm_runtime_put_autosuspend(mmc_dev(host));
>         }
>  }
>  EXPORT_SYMBOL(mmc_release_host);

I tried these changes on 4.8-rc6 and have seen only 2 resets so far.
I captured the usb trace [1], in case you need it.

[1] https://people.debian.org/~rrs/tmp/4.8-rc6-ulf.txt.gz

- -- 
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."
-----BEGIN PGP SIGNATURE-----

iQIcBAEBCgAGBQJX2WO6AAoJEKY6WKPy4XVpfx4P/3VOWgHw87iZxuHlw+vHUVOR
BB8cdQidh6nVba2UcDXb8uw/oYYYEJZ0FYvvdgKwt/14QNaL1L3jwrLpayUC7AAM
wZA8bhESANx/KoJiZH4GuasnkXfmjXVz2XPOIz/b8qfp4jfreFZfZQHgkIY7cEtE
gO9JArpM/e6FZY/5mEYy1bFnvuuyTfI4Wu5Cm6HmyidT1mfhTM7xvyMTLfM+QRUm
VT7pRhS6oujWO7K2KDMTrcZqgs2AVmjCqZW13F4AnMU/owxvRYTpGyW3hDbi9oIg
Z8ZJ5sHT7jpJ4I/FZJxwaSthIN5aF4wG8UjTFr2EIc+W5Xx/xzYpZP3Qm+HkFnej
n8Uyo0ZMN9+CV6VI2Qgzr+pB5LAfYHjIMobGAzaCzN81MWCVmb/GfiLQhpcgOnoK
TMHVqCqWlkF8W0V5ap+Tc4Ce4Wqj/V+RlQpE01GeOg15DUtgqWXCGbYKspnwwRD2
u2Ivso9G4xDd0VOp9x4zpIdfbAqaSP+DuZZ7EnGVfd30j4ENWsrRTDXdWf63uAx6
GMOpAephwUOZo7WFxwd/q19181YM/emSg+weOMcGxwHvpQ+vTEN+JELD0B0Z6PpI
OvaPROIL8bb5SuPKzKJ55ZRLpv40sjWOglZC1oOhtSg2IMFJwjPcqzaO4frIxl4t
3OCttIZYgpp5kIiW8qV1
=KNnj
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
       [not found]                                               ` <1473864634.9913.12.camel-7WuBAv+fczCJ8c2fQYRYNw@public.gmane.org>
@ 2016-09-14 15:19                                                 ` Alan Stern
  2016-09-15 13:59                                                   ` Ulf Hansson
  0 siblings, 1 reply; 33+ messages in thread
From: Alan Stern @ 2016-09-14 15:19 UTC (permalink / raw)
  To: Ritesh Raj Sarraf; +Cc: Ulf Hansson, USB list, linux-mmc

On Wed, 14 Sep 2016, Ritesh Raj Sarraf wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
> 
> Hello Ulf and Alan,
> 
> On Fri, 2016-09-09 at 19:34 +0530, Ritesh Raj Sarraf wrote:
> > For #2, I'm building the 4.8-rc5 kernel with the following change. This build
> > does not include the previous change you had suggested (related to
> > POWER_CYCLE)
> > 
> > Date:   Fri Sep 9 19:28:03 2016 +0530
> > 
> >     Disable pm runtime in mmc core
> > 
> > diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
> > index e55cde6..32388d5 100644
> > --- a/drivers/mmc/core/core.c
> > +++ b/drivers/mmc/core/core.c
> > @@ -970,9 +970,6 @@ int __mmc_claim_host(struct mmc_host *host, atomic_t
> > *abort)
> >         spin_unlock_irqrestore(&host->lock, flags);
> >         remove_wait_queue(&host->wq, &wait);
> >  
> > -       if (pm)
> > -               pm_runtime_get_sync(mmc_dev(host));
> > -
> >         return stop;
> >  }
> >  EXPORT_SYMBOL(__mmc_claim_host);
> > @@ -1000,7 +997,6 @@ void mmc_release_host(struct mmc_host *host)
> >                 spin_unlock_irqrestore(&host->lock, flags);
> >                 wake_up(&host->wq);
> >                 pm_runtime_mark_last_busy(mmc_dev(host));
> > -               pm_runtime_put_autosuspend(mmc_dev(host));
> >         }
> >  }
> >  EXPORT_SYMBOL(mmc_release_host);
> 
> I tried with these changes on 4.8-rc6 and I only saw 2 resets so far.
> I captured the usb trace [1], just in case if you need it.
> 
> [1] https://people.debian.org/~rrs/tmp/4.8-rc6-ulf.txt.gz

The situation isn't any better.  At the start of the trace, 
the device is in runtime suspend but there are many attempts to 
communicate with it, all of which fail.

Then a little less than an hour after the trace started, the device was 
resumed.  At that point it started working okay.  Until there was a 
spontaneous disconnect.

The device reconnected, but after 3 seconds it was runtime suspended 
again -- and the I/O attempts continued.  Some time later there was 
another runtime resume, and the device began working again.  Until 
another spontaneous disconnect occurred.  And so on...

Ulf, we really need to figure out why the autosuspends are occurring 
and why the I/O doesn't stop while the device is suspended.

Alan Stern


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
  2016-09-14 15:19                                                 ` Alan Stern
@ 2016-09-15 13:59                                                   ` Ulf Hansson
  2016-09-15 14:16                                                     ` Alan Stern
  0 siblings, 1 reply; 33+ messages in thread
From: Ulf Hansson @ 2016-09-15 13:59 UTC (permalink / raw)
  To: Alan Stern; +Cc: Ritesh Raj Sarraf, USB list, linux-mmc

On 14 September 2016 at 17:19, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Wed, 14 Sep 2016, Ritesh Raj Sarraf wrote:
>
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA512
>>
>> Hello Ulf and Alan,
>>
>> On Fri, 2016-09-09 at 19:34 +0530, Ritesh Raj Sarraf wrote:
>> > For #2, I'm building the 4.8-rc5 kernel with the following change. This build
>> > does not include the previous change you had suggested (related to
>> > POWER_CYCLE)
>> >
>> > Date:   Fri Sep 9 19:28:03 2016 +0530
>> >
>> >     Disable pm runtime in mmc core
>> >
>> > diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>> > index e55cde6..32388d5 100644
>> > --- a/drivers/mmc/core/core.c
>> > +++ b/drivers/mmc/core/core.c
>> > @@ -970,9 +970,6 @@ int __mmc_claim_host(struct mmc_host *host, atomic_t
>> > *abort)
>> >         spin_unlock_irqrestore(&host->lock, flags);
>> >         remove_wait_queue(&host->wq, &wait);
>> >
>> > -       if (pm)
>> > -               pm_runtime_get_sync(mmc_dev(host));
>> > -
>> >         return stop;
>> >  }
>> >  EXPORT_SYMBOL(__mmc_claim_host);
>> > @@ -1000,7 +997,6 @@ void mmc_release_host(struct mmc_host *host)
>> >                 spin_unlock_irqrestore(&host->lock, flags);
>> >                 wake_up(&host->wq);
>> >                 pm_runtime_mark_last_busy(mmc_dev(host));
>> > -               pm_runtime_put_autosuspend(mmc_dev(host));
>> >         }
>> >  }
>> >  EXPORT_SYMBOL(mmc_release_host);
>>
>> I tried with these changes on 4.8-rc6 and I only saw 2 resets so far.
>> I captured the usb trace [1], just in case if you need it.
>>
>> [1] https://people.debian.org/~rrs/tmp/4.8-rc6-ulf.txt.gz
>
> The situation isn't any better.  At the start of the trace,
> the device is in runtime suspend but there are many attempts to
> communicate with it, all of which fail.

It's really weird. Has this driver ever worked!? :-)

>
> Then a little less than an hour after the trace started, the device was
> resumed.  At that point it started working okay.  Until there was a
> spontaneous disconnect.
>
> The device reconnected, but after 3 seconds it was runtime suspended
> again -- and the I/O attempts continued.  Some time later there was
> another runtime resume, and the device began working again.  Until
> another spontaneous disconnect occurred.  And so on...
>
> Ulf, we really need to figure out why the autosuspends are occurring
> and why the I/O doesn't stop while the device is suspended.

Okay, let's see.

I had another look at the rtsx_usb_sdmmc driver. It registers an LED
classdev. The LED is updated from a work item, by calling
rtsx_usb_turn_on|off_led(), which does access the USB device. These
calls are not properly managed by runtime PM, so I have fixed them
with the change below:

---
 drivers/mmc/host/rtsx_usb_sdmmc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/mmc/host/rtsx_usb_sdmmc.c b/drivers/mmc/host/rtsx_usb_sdmmc.c
index 6c71fc9..a59c7fa 100644
--- a/drivers/mmc/host/rtsx_usb_sdmmc.c
+++ b/drivers/mmc/host/rtsx_usb_sdmmc.c
@@ -1314,6 +1314,7 @@ static void rtsx_usb_update_led(struct work_struct *work)
                container_of(work, struct rtsx_usb_sdmmc, led_work);
        struct rtsx_ucr *ucr = host->ucr;

+       pm_runtime_get_sync(sdmmc_dev(host));
        mutex_lock(&ucr->dev_mutex);

        if (host->led.brightness == LED_OFF)
@@ -1322,6 +1323,7 @@ static void rtsx_usb_update_led(struct work_struct *work)
                rtsx_usb_turn_on_led(ucr);

        mutex_unlock(&ucr->dev_mutex);
+       pm_runtime_put(sdmmc_dev(host));
 }
 #endif

-- 

However, I doubt the above is the main reason for the issues we see.
Instead, I think the parent device (the USB device) somehow isn't being
properly managed through runtime PM, not because of a wrong deployment
in the mmc core or in the rtsx_usb_driver, but somewhere else.
:-)

I started looking for calls to pm_suspend_ignore_children(dev, true),
which would decouple the usb device from the mmc platform device from
a runtime PM point of view. I found one suspicious case!

drivers/usb/storage/realtek_cr.c:
pm_suspend_ignore_children(&us->pusb_intf->dev, true);

As I am not so familiar with USB, I can't really tell why the above
exists, although perhaps just removing that line would be worth a
try!?

If neither of the above works, the next step could be to start
checking error codes in the mmc core and in the rtsx_usb_sdmmc driver,
from the calls to pm_runtime_get|put() and pm_runtime_enable().
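
A minimal userspace sketch of that error-checking step, written as a toy
model rather than kernel code. The names toy_dev, toy_get_sync, toy_put
and toy_update_led are hypothetical stand-ins. The only kernel behaviour
assumed is that pm_runtime_get_sync() raises the usage count even when
the resume fails, so the caller has to drop the reference on the error
path as well.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a device's runtime PM state; NOT the kernel's struct device. */
struct toy_dev {
	int usage_count;	/* models the runtime PM usage counter */
	bool suspended;		/* models "runtime suspended" vs "active" */
	int resume_error;	/* 0, or the negative errno a failing resume would give */
};

/*
 * Models pm_runtime_get_sync(): the counter is bumped first, then a
 * resume is attempted. Returns 1 if already active, 0 if resumed now,
 * or a negative errno. The counter stays raised on failure.
 */
static int toy_get_sync(struct toy_dev *d)
{
	d->usage_count++;
	if (!d->suspended)
		return 1;
	if (d->resume_error)
		return d->resume_error;
	d->suspended = false;
	return 0;
}

/* Models dropping one runtime PM reference. */
static void toy_put(struct toy_dev *d)
{
	assert(d->usage_count > 0);
	d->usage_count--;
}

/*
 * Hypothetical caller that checks the result instead of ignoring it.
 * *led_writes counts how often the (imaginary) LED register was written.
 */
static int toy_update_led(struct toy_dev *d, int *led_writes)
{
	int ret = toy_get_sync(d);

	if (ret < 0) {
		toy_put(d);	/* balance the count on the error path too */
		return ret;	/* and skip the hardware access */
	}
	(*led_writes)++;	/* stands in for the actual register write */
	toy_put(d);
	return 0;
}
```

With a device whose resume_error is set, toy_update_led() returns that
error, leaves the usage count balanced, and never touches the hardware,
which is the kind of failure the proposed checks would make visible.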

Kind regards
Uffe

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
  2016-09-15 13:59                                                   ` Ulf Hansson
@ 2016-09-15 14:16                                                     ` Alan Stern
  2016-09-16 15:42                                                       ` Ritesh Raj Sarraf
  0 siblings, 1 reply; 33+ messages in thread
From: Alan Stern @ 2016-09-15 14:16 UTC (permalink / raw)
  To: Ulf Hansson; +Cc: Ritesh Raj Sarraf, USB list, linux-mmc

On Thu, 15 Sep 2016, Ulf Hansson wrote:

> > The situation isn't any better.  At the start of the trace,
> > the device is in runtime suspend but there are many attempts to
> > communicate with it, all of which fail.
> 
> It's really weird. Have this driver ever worked!? :-)

Probably not.  Or at least, not with runtime PM.

> > Then a little less than an hour after the trace started, the device was
> > resumed.  At that point it started working okay.  Until there was a
> > spontaneous disconnect.
> >
> > The device reconnected, but after 3 seconds it was runtime suspended
> > again -- and the I/O attempts continued.  Some time later there was
> > another runtime resume, and the device began working again.  Until
> > another spontaneous disconnect occurred.  And so on...
> >
> > Ulf, we really need to figure out why the autosuspends are occurring
> > and why the I/O doesn't stop while the device is suspended.
> 
> Okay, let's see.
> 
> I had another look in the rtsx_usb_sdmmc driver. Apparently it
> registers a led classdev. Updating the led is done from a work, by
> calling rtsx_usb_turn_on|off_led(), which do access the usb device.
> These calls are not properly managed by runtime PM, so I have fixed
> those according to below change:
> 
> ---
>  drivers/mmc/host/rtsx_usb_sdmmc.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/mmc/host/rtsx_usb_sdmmc.c
> b/drivers/mmc/host/rtsx_usb_sdmmc.c
> index 6c71fc9..a59c7fa 100644
> --- a/drivers/mmc/host/rtsx_usb_sdmmc.c
> +++ b/drivers/mmc/host/rtsx_usb_sdmmc.c
> @@ -1314,6 +1314,7 @@ static void rtsx_usb_update_led(struct work_struct *work)
>                 container_of(work, struct rtsx_usb_sdmmc, led_work);
>         struct rtsx_ucr *ucr = host->ucr;
> 
> +       pm_runtime_get_sync(sdmmc_dev(host));
>         mutex_lock(&ucr->dev_mutex);
> 
>         if (host->led.brightness == LED_OFF)
> @@ -1322,6 +1323,7 @@ static void rtsx_usb_update_led(struct work_struct *work)
>                 rtsx_usb_turn_on_led(ucr);
> 
>         mutex_unlock(&ucr->dev_mutex);
> +       pm_runtime_put(sdmmc_dev(host));
>  }
>  #endif
> 
> -- 
> 
> Although, I doubt the above is the main reason to the issues we see.

I don't know -- it could well be the reason.  The symptoms are 
definitely what you would expect to see if some thread was doing I/O 
without calling the pm_runtime_* routines.

> Instead I think somehow the parent device (usb device) isn't being
> properly managed through runtime PM, but not due to wrong deployment
> in the mmc core nor in the rtsx_usb_driver, but at some place else.
> :-)
> 
> I started looking for calls to pm_suspend_ignore_children(dev, true),
> which would decouple the usb device from the mmc platform device from
> a runtime PM point of view. I found one suspicious case!
> 
> drivers/usb/storage/realtek_cr.c:
> pm_suspend_ignore_children(&us->pusb_intf->dev, true);
> 
> As I am not so familiar with USB, I can't really tell why the above
> exists, although perhaps just removing that line would be worth a
> try!?

No, the realtek_cr driver has no connection with this.  It's a
sub-module of the usb_storage driver; it uses the SCSI interface,
not the MMC interface.

> If neither of the above works, the next step could be to start
> checking error codes in the mmc core and in the rtsx_usb_sdmmc driver,
> from the calls to pm_runtime_get|put() and pm_runtime_enable().

Let's see what this patch does.

Alan Stern


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
  2016-09-15 14:16                                                     ` Alan Stern
@ 2016-09-16 15:42                                                       ` Ritesh Raj Sarraf
  2016-09-16 21:40                                                         ` Alan Stern
  0 siblings, 1 reply; 33+ messages in thread
From: Ritesh Raj Sarraf @ 2016-09-16 15:42 UTC (permalink / raw)
  To: Alan Stern, Ulf Hansson; +Cc: USB list, linux-mmc

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Hello Ulf and Alan,

On Thu, 2016-09-15 at 10:16 -0400, Alan Stern wrote:
> > ---
> >  drivers/mmc/host/rtsx_usb_sdmmc.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/mmc/host/rtsx_usb_sdmmc.c
> > b/drivers/mmc/host/rtsx_usb_sdmmc.c
> > index 6c71fc9..a59c7fa 100644
> > --- a/drivers/mmc/host/rtsx_usb_sdmmc.c
> > +++ b/drivers/mmc/host/rtsx_usb_sdmmc.c
> > @@ -1314,6 +1314,7 @@ static void rtsx_usb_update_led(struct work_struct
> *work)
> >                 container_of(work, struct rtsx_usb_sdmmc, led_work);
> >         struct rtsx_ucr *ucr = host->ucr;
> > 
> > +       pm_runtime_get_sync(sdmmc_dev(host));
> >         mutex_lock(&ucr->dev_mutex);
> > 
> >         if (host->led.brightness == LED_OFF)
> > @@ -1322,6 +1323,7 @@ static void rtsx_usb_update_led(struct work_struct
> *work)
> >                 rtsx_usb_turn_on_led(ucr);
> > 
> >         mutex_unlock(&ucr->dev_mutex);
> > +       pm_runtime_put(sdmmc_dev(host));
> >  }
> >  #endif
> > 
> > -- 
> > 
> > Although, I doubt the above is the main reason to the issues we see.
> 
> I don't know -- it could well be the reason.  The symptoms are 
> definitely what you would expect to see if some thread was doing I/O 
> without calling the pm_runtime_* routines.
> 
> > Instead I think somehow the parent device (usb device) isn't being
> > properly managed through runtime PM, but not due to wrong deployment
> > in the mmc core nor in the rtsx_usb_driver, but at some place else.
> > :-)
> > 
> > I started looking for calls to pm_suspend_ignore_children(dev, true),
> > which would decouple the usb device from the mmc platform device from
> > a runtime PM point of view. I found one suspicious case!
> > 
> > drivers/usb/storage/realtek_cr.c:
> > pm_suspend_ignore_children(&us->pusb_intf->dev, true);
> > 
> > As I am not so familiar with USB, I can't really tell why the above
> > exists, although perhaps just removing that line would be worth a
> > try!?
> 
> No, the realtek_cr driver has no connection with this.  It's a
> sub-module of the usb_storage driver; it uses the SCSI interface,
> not the MMC interface.
> 
> > If neither of the above works, the next step could be to start
> > checking error codes in the mmc core and in the rtsx_usb_sdmmc driver,
> > from the calls to pm_runtime_get|put() and pm_runtime_enable().
> 
> Let's see what this patch does.

I was able to hit it again. Please find the usbmon trace at:
https://people.debian.org/~rrs/tmp/usb-4.8.0-rc6ulf1+.log.gz


- -- 
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."
-----BEGIN PGP SIGNATURE-----

iQIcBAEBCgAGBQJX3BLOAAoJEKY6WKPy4XVpXMgP/jTyKOX/SYCPTU9twYldY7LQ
f64hpiWqXOUs+jFYM+BcrF5B5DuXiB1Wm4F3+Xm/QBN3grJD7yBq1nrhv/mAhCr3
y1gFRIbeKfZsEp1vdBov9m1jQCZzzIZlFXPmRGT/8uC/GZTHlgIeSLqBntpq9+yL
MQSE91tLVayVgaOQxpPz+uZ4PTAom19sU21Haa90ECHLKAUTJ9WncQFecjPLHMjb
4SUvgq53V2s1Yo1E85RhtgR6Nrk/Bh7qZEC1NyeganLazGbbsz9YnRcGy58x9Jiq
xmfURTtvG834CnGcGuzcRU09FGPMtXx/u57EYC6mdEMWhSglo0h6YhVxcUOtAhRD
s1gs+a6ToKTDLn6qr0cnIwG27ALyLh41QmzxEpiaZiugIEBzZ/uK3TBjzcul4Huj
v0+x2fSC0SXwGo4P3GAOnHuWUjgj3C1wElP1R3brXfO0aayESUNKzE8V7RbQIWiC
mHewSlKTiPwCr/lchaINTt2TyFcHJWOx90iV10GO5TpMyqho4AzpBpoimItrbx2t
qQJCvGzDLPjr0tPvpeWyJSfBnqCDqbJ44CY3nCFgKhTd3BXp4fDj09eBtNmSiuvu
UdZZxm84FD3BDSNX8k2W9CF81jML/4lzwliJge3uIPrXNDqGSZMxDSpd0u1EFNHf
rEQ/kP1WlArvqButQ5ZN
=fiWV
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
  2016-09-16 15:42                                                       ` Ritesh Raj Sarraf
@ 2016-09-16 21:40                                                         ` Alan Stern
       [not found]                                                           ` <Pine.LNX.4.44L0.1609161729340.1657-100000-IYeN2dnnYyZXsRXLowluHWD2FQJk+8+b@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Alan Stern @ 2016-09-16 21:40 UTC (permalink / raw)
  To: Ritesh Raj Sarraf; +Cc: Ulf Hansson, USB list, linux-mmc

On Fri, 16 Sep 2016, Ritesh Raj Sarraf wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
> 
> Hello Ulf and Alan,
> 
> On Thu, 2016-09-15 at 10:16 -0400, Alan Stern wrote:
> > > ---
> > >  drivers/mmc/host/rtsx_usb_sdmmc.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/drivers/mmc/host/rtsx_usb_sdmmc.c
> > > b/drivers/mmc/host/rtsx_usb_sdmmc.c
> > > index 6c71fc9..a59c7fa 100644
> > > --- a/drivers/mmc/host/rtsx_usb_sdmmc.c
> > > +++ b/drivers/mmc/host/rtsx_usb_sdmmc.c
> > > @@ -1314,6 +1314,7 @@ static void rtsx_usb_update_led(struct work_struct
> > *work)
> > >                 container_of(work, struct rtsx_usb_sdmmc, led_work);
> > >         struct rtsx_ucr *ucr = host->ucr;
> > > 
> > > +       pm_runtime_get_sync(sdmmc_dev(host));
> > >         mutex_lock(&ucr->dev_mutex);
> > > 
> > >         if (host->led.brightness == LED_OFF)
> > > @@ -1322,6 +1323,7 @@ static void rtsx_usb_update_led(struct work_struct
> > *work)
> > >                 rtsx_usb_turn_on_led(ucr);
> > > 
> > >         mutex_unlock(&ucr->dev_mutex);
> > > +       pm_runtime_put(sdmmc_dev(host));
> > >  }
> > >  #endif
> > > 
> > > -- 
> > > 
> > > Although, I doubt the above is the main reason to the issues we see.
> > 
> > I don't know -- it could well be the reason.  The symptoms are 
> > definitely what you would expect to see if some thread was doing I/O 
> > without calling the pm_runtime_* routines.

> I was able to hit it again. Please find the usbmon trace at:
> https://people.debian.org/~rrs/tmp/usb-4.8.0-rc6ulf1+.log.gz

 
We're still getting runtime suspends, but now at 2-second intervals.   
This is partly because the driver isn't calling
pm_runtime_mark_last_busy(), but there may be more to it.  The 2-second 
period is the default autosuspend timeout for USB devices.  However, I 
don't see the activity that rtsx_usb_get_card_status() should produce 
when rtsx_usb_suspend() runs; I don't know why not.
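
To make the timing argument concrete, here is a small userspace sketch
of the autosuspend decision. It is a toy model under stated assumptions,
not the PM core's code: toy_pm, toy_mark_last_busy and
toy_may_autosuspend are hypothetical names, and the model only assumes
that a device becomes eligible for autosuspend once its usage count is
zero and the autosuspend delay has elapsed since the last recorded
activity.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the state the autosuspend decision depends on. */
struct toy_pm {
	int usage_count;		/* outstanding runtime PM references */
	long last_busy_ms;		/* time of the last recorded activity */
	long autosuspend_delay_ms;	/* 2000 for the USB default mentioned above */
};

/* Models pm_runtime_mark_last_busy(): record "the device was just used". */
static void toy_mark_last_busy(struct toy_pm *pm, long now_ms)
{
	pm->last_busy_ms = now_ms;
}

/* True when the model would allow an autosuspend at time now_ms. */
static bool toy_may_autosuspend(const struct toy_pm *pm, long now_ms)
{
	assert(pm->autosuspend_delay_ms >= 0);
	if (pm->usage_count > 0)
		return false;	/* someone still holds a reference */
	return now_ms - pm->last_busy_ms >= pm->autosuspend_delay_ms;
}
```

In this model, I/O that finishes at t=1900 ms without a call to
toy_mark_last_busy() leaves last_busy_ms at its old value, so the device
is still eligible to suspend at t=2000 ms. If the activity is recorded
at t=1900 ms, the device stays awake until t=3900 ms.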

We're also getting occasional I/O attempts while the device is
suspended.  They must be on some other pathway, not the one fixed by
the patch above.  Let's see if we can find out just where they come
from.

Ritesh, please try applying this patch on top of the previous one.  It 
will produce output in the kernel log whenever these bad I/O attempts 
occur.  Also, enable dynamic debugging for the rtsx_usb driver:

	echo 'module rtsx_usb =p' >/sys/kernel/debug/dynamic_debug/control

before starting the test.  (You may need to mount a debugfs filesystem 
on /sys/kernel/debug first.)

Alan Stern



Index: usb-4.x/drivers/usb/core/hcd.c
===================================================================
--- usb-4.x.orig/drivers/usb/core/hcd.c
+++ usb-4.x/drivers/usb/core/hcd.c
@@ -1647,6 +1647,8 @@ int usb_hcd_submit_urb (struct urb *urb,
 		status = map_urb_for_dma(hcd, urb, mem_flags);
 		if (likely(status == 0)) {
 			status = hcd->driver->urb_enqueue(hcd, urb, mem_flags);
+			if (status == -EHOSTUNREACH)
+				dump_stack();
 			if (unlikely(status))
 				unmap_urb_for_dma(hcd, urb);
 		}



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
       [not found]                                                           ` <Pine.LNX.4.44L0.1609161729340.1657-100000-IYeN2dnnYyZXsRXLowluHWD2FQJk+8+b@public.gmane.org>
@ 2016-09-17 11:42                                                             ` Ritesh Raj Sarraf
  2016-09-18  1:42                                                               ` Alan Stern
  0 siblings, 1 reply; 33+ messages in thread
From: Ritesh Raj Sarraf @ 2016-09-17 11:42 UTC (permalink / raw)
  To: Alan Stern; +Cc: Ulf Hansson, USB list, linux-mmc

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Hello Alan,


On Fri, 2016-09-16 at 17:40 -0400, Alan Stern wrote:
> We're still getting runtime suspends, but now at 2-second intervals.   
> This is partly because the driver isn't calling
> pm_runtime_mark_last_busy(), but there may be more to it.  The 2-second 
> period is the default autosuspend timeout for USB devices.  However, I 
> don't see the activity that rtsx_usb_get_card_status() should produce 
> when rtsx_usb_suspend() runs; I don't know why not.
> 
> We're also getting occasional I/O attempts while the device is
> suspended.  They must be on some other pathway, not the one fixed by
> the patch above.  Let's see if we can find out just where they come
> from.
> 
> Ritesh, please try applying this patch on top of the previous one.  It 
> will produce output in the kernel log whenever these bad I/O attempts 
> occur.  Also, enable dynamic debugging for the rtsx_usb driver:
> 

Please find links to the usbmon trace and the kernel trace.

https://people.debian.org/~rrs/tmp/4.8.0-rc6ulf1alan1+.kern.log
https://people.debian.org/~rrs/tmp/usb-4.8.0-rc6ulf1alan1+.log.gz

Thanks.

>         echo 'module rtsx_usb =p' >/sys/kernel/debug/dynamic_debug/control
> 
> before starting the test.  (You may need to mount a debugfs filesystem 
> on /sys/kernel/debug first.)
> 
> Alan Stern
> 
> 
> 
> Index: usb-4.x/drivers/usb/core/hcd.c
> ===================================================================
> --- usb-4.x.orig/drivers/usb/core/hcd.c
> +++ usb-4.x/drivers/usb/core/hcd.c
> @@ -1647,6 +1647,8 @@ int usb_hcd_submit_urb (struct urb *urb,
>                 status = map_urb_for_dma(hcd, urb, mem_flags);
>                 if (likely(status == 0)) {
>                         status = hcd->driver->urb_enqueue(hcd, urb,
> mem_flags);
> +                       if (status == -EHOSTUNREACH)
> +                               dump_stack();
>                         if (unlikely(status))
>                                 unmap_urb_for_dma(hcd, urb);
>                 }
> 
- -- 
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."
-----BEGIN PGP SIGNATURE-----

iQIcBAEBCgAGBQJX3SwNAAoJEKY6WKPy4XVpMaYQAK3p/meduq2SLRKjcLuher/+
U6W0+6t1MJmNLZgArqEYLprGQs8dboDUVuYdOkpDyjsL3oRVc2RFRhKP4n5uyeqf
UOyyJC/Dn8JpW5abQPdQOi8/zeY019P1MPKd/lAvjs+MXOdRvOluwne3KGeVJzrc
nWNt9YMZCvxscbXjJVqNWFh8utg6BVVoJ72sqkHYL6N+cWKwKb4QphgNXbhoPQq3
K7KwsBywQHty/Wf4TXB6n8z/6zR6uNHQjaveboUvidkMhWXYFSVag6ba6ZOsAyoL
nPWvowTTKO3snKh2AzHhJEzgky2EXMoxG5+GzA3nBA9suAapxuS7tTSt5YAShupv
at1FASeb8kyZ7vt3Srq7WQN7OIlER9zvVvapENJHwqATnHAQ+35h+7o1CrZLjDyF
UB5qKWOOQlFmiEfjjs15bLumBQqA4vX7JvYqJxNX7AtOPUXZHvs85eX7S4s18TZ7
OYO5JkS0xcLD6HMvuxzwO+UDS5DYknwma8gAPvHg1mX8QQMOPNkbnd7Igt9kjhMR
ZioeP0xLyyUGwjGErQpzgQwHlk5bMQhQ4iiOxi4nz4aCWLCs2YwzojfwtfDxSYnV
rCSjWEAanG3eRP7iPBtbdrbFmWsjSsavopXXlLPNYDjBjnpm2/IcyIlC/dTKKYJz
0rkc0hreSRuSDMPxbkfV
=d/iL
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
  2016-09-17 11:42                                                             ` Ritesh Raj Sarraf
@ 2016-09-18  1:42                                                               ` Alan Stern
       [not found]                                                                 ` <Pine.LNX.4.44L0.1609172131120.698-100000-pYrvlCTfrz9XsRXLowluHWD2FQJk+8+b@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Alan Stern @ 2016-09-18  1:42 UTC (permalink / raw)
  To: Ritesh Raj Sarraf; +Cc: Ulf Hansson, USB list, linux-mmc

On Sat, 17 Sep 2016, Ritesh Raj Sarraf wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
> 
> Hello Alan,
> 
> 
> On Fri, 2016-09-16 at 17:40 -0400, Alan Stern wrote:
> > We're still getting runtime suspends, but now at 2-second intervals.   
> > This is partly because the driver isn't calling
> > pm_runtime_mark_last_busy(), but there may be more to it.  The 2-second 
> > period is the default autosuspend timeout for USB devices.  However, I 
> > don't see the activity that rtsx_usb_get_card_status() should produce 
> > when rtsx_usb_suspend() runs; I don't know why not.
> > 
> > We're also getting occasional I/O attempts while the device is
> > suspended.  They must be on some other pathway, not the one fixed by
> > the patch above.  Let's see if we can find out just where they come
> > from.
> > 
> > Ritesh, please try applying this patch on top of the previous one.  It 
> > will produce output in the kernel log whenever these bad I/O attempts 
> > occur.  Also, enable dynamic debugging for the rtsx_usb driver:
> > 
> 
> Please find links to the usbmon trace and the kernel trace.
> 
> https://people.debian.org/~rrs/tmp/4.8.0-rc6ulf1alan1+.kern.log
> https://people.debian.org/~rrs/tmp/usb-4.8.0-rc6ulf1alan1+.log.gz

Well, this is pretty clear:

Sep 17 15:55:52 learner kernel: CPU: 1 PID: 535 Comm: rtsx_usb_ms_1 Tainted: G     U          4.8.0-rc6ulf1alan1+ #19
Sep 17 15:55:52 learner kernel: Hardware name: LENOVO 20344/INVALID, BIOS 96CN31WW(V1.17) 07/21/2015
Sep 17 15:55:52 learner kernel:  0000000000000000 ffffffff81314be5 ffff8802476746c0 0000000002400000
Sep 17 15:55:52 learner kernel:  ffffffffa016f719 00000000523bec00 ffff88025f255780 ffff88024feff600
Sep 17 15:55:52 learner kernel:  0000000000018080 0000000000000000 ffff88025f258080 ffffffff815a0e60
Sep 17 15:55:52 learner kernel: Call Trace:
Sep 17 15:55:52 learner kernel:  [<ffffffff81314be5>] ? dump_stack+0x7d/0xb8
Sep 17 15:55:52 learner kernel:  [<ffffffffa016f719>] ? usb_hcd_submit_urb+0x3c9/0xad0 [usbcore]
Sep 17 15:55:52 learner kernel:  [<ffffffff815a0e60>] ? _raw_spin_lock_irqsave+0x20/0x47
Sep 17 15:55:52 learner kernel:  [<ffffffff810d5c8b>] ? lock_timer_base.isra.24+0x7b/0xa0
Sep 17 15:55:52 learner kernel:  [<ffffffff810d5d59>] ? try_to_del_timer_sync+0x49/0x60
Sep 17 15:55:52 learner kernel:  [<ffffffffa017180d>] ? usb_start_wait_urb+0x5d/0x140 [usbcore]
Sep 17 15:55:52 learner kernel:  [<ffffffffa00ee2be>] ? rtsx_usb_send_cmd+0x5e/0x80 [rtsx_usb]
Sep 17 15:55:52 learner kernel:  [<ffffffffa00ee4a7>] ? rtsx_usb_read_register+0x67/0xb0 [rtsx_usb]
Sep 17 15:55:52 learner kernel:  [<ffffffffa0b15ac1>] ? rtsx_usb_detect_ms_card+0x61/0xe0 [rtsx_usb_ms]
Sep 17 15:55:52 learner kernel:  [<ffffffffa0b15a60>] ? rtsx_usb_ms_set_param+0x770/0x770 [rtsx_usb_ms]
Sep 17 15:55:52 learner kernel:  [<ffffffff8108ee0d>] ? kthread+0xbd/0xe0
Sep 17 15:55:52 learner kernel:  [<ffffffff81024741>] ? __switch_to+0x2b1/0x6a0
Sep 17 15:55:52 learner kernel:  [<ffffffff815a118f>] ? ret_from_fork+0x1f/0x40
Sep 17 15:55:52 learner kernel:  [<ffffffff8108ed50>] ? kthread_create_on_node+0x180/0x180

This is the rtsx_usb_detect_ms_card() routine in
drivers/memstick/host/rtsx_usb_ms.c, which runs as a kthread.  It 
doesn't do any runtime PM.  So it looks like the bug is present in both 
the MMC and MemoryStick interfaces.
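
The failure mode in that trace can be illustrated with a short userspace
sketch. This is a toy model, not the rtsx_usb_ms code: toy_reader,
toy_submit, toy_poll_unmanaged and toy_poll_managed are hypothetical
names, and the model assumes only what the debugging patch relies on,
namely that a submission to a runtime-suspended device fails with
-EHOSTUNREACH.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy model of the card reader's runtime PM state. */
struct toy_reader {
	int usage_count;	/* outstanding runtime PM references */
	bool suspended;		/* true while runtime suspended */
};

/* Stands in for submitting a transfer: rejected while suspended. */
static int toy_submit(const struct toy_reader *r)
{
	return r->suspended ? -EHOSTUNREACH : 0;
}

/* A card-detect poll that does no runtime PM, like the kthread in the trace. */
static int toy_poll_unmanaged(struct toy_reader *r)
{
	return toy_submit(r);
}

/* The same poll bracketed by a get/put pair (modelled inline here). */
static int toy_poll_managed(struct toy_reader *r)
{
	int ret;

	r->usage_count++;	/* models taking a reference ... */
	r->suspended = false;	/* ... and resuming the device synchronously */

	ret = toy_submit(r);

	assert(r->usage_count > 0);
	r->usage_count--;	/* models dropping the reference afterwards */
	return ret;
}
```

Starting from a suspended reader, toy_poll_unmanaged() returns
-EHOSTUNREACH while toy_poll_managed() returns 0, which mirrors the
difference a runtime PM reference would make in the polling thread.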

Alan Stern


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
       [not found]                                                                 ` <Pine.LNX.4.44L0.1609172131120.698-100000-pYrvlCTfrz9XsRXLowluHWD2FQJk+8+b@public.gmane.org>
@ 2016-09-19 10:10                                                                   ` Ulf Hansson
  2016-09-19 17:48                                                                     ` Alan Stern
  0 siblings, 1 reply; 33+ messages in thread
From: Ulf Hansson @ 2016-09-19 10:10 UTC (permalink / raw)
  To: Alan Stern; +Cc: Ritesh Raj Sarraf, USB list, linux-mmc

On 18 September 2016 at 03:42, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Sat, 17 Sep 2016, Ritesh Raj Sarraf wrote:
>
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA512
>>
>> Hello Alan,
>>
>>
>> On Fri, 2016-09-16 at 17:40 -0400, Alan Stern wrote:
>> > We're still getting runtime suspends, but now at 2-second intervals.
>> > This is partly because the driver isn't calling
>> > pm_runtime_mark_last_busy(), but there may be more to it.  The 2-second
>> > period is the default autosuspend timeout for USB devices.  However, I
>> > don't see the activity that rtsx_usb_get_card_status() should produce
>> > when rtsx_usb_suspend() runs; I don't know why not.
>> >
>> > We're also getting occasional I/O attempts while the device is
>> > suspended.  They must be on some other pathway, not the one fixed by
>> > the patch above.  Let's see if we can find out just where they come
>> > from.
>> >
>> > Ritesh, please try applying this patch on top of the previous one.  It
>> > will produce output in the kernel log whenever these bad I/O attempts
>> > occur.  Also, enable dynamic debugging for the rtsx_usb driver:
>> >
>>
>> Please find links to the usbmon trace and the kernel trace.
>>
>> https://people.debian.org/~rrs/tmp/4.8.0-rc6ulf1alan1+.kern.log
>> https://people.debian.org/~rrs/tmp/usb-4.8.0-rc6ulf1alan1+.log.gz
>
> Well, this is pretty clear:
>
> Sep 17 15:55:52 learner kernel: CPU: 1 PID: 535 Comm: rtsx_usb_ms_1 Tainted: G     U          4.8.0-rc6ulf1alan1+ #19
> Sep 17 15:55:52 learner kernel: Hardware name: LENOVO 20344/INVALID, BIOS 96CN31WW(V1.17) 07/21/2015
> Sep 17 15:55:52 learner kernel:  0000000000000000 ffffffff81314be5 ffff8802476746c0 0000000002400000
> Sep 17 15:55:52 learner kernel:  ffffffffa016f719 00000000523bec00 ffff88025f255780 ffff88024feff600
> Sep 17 15:55:52 learner kernel:  0000000000018080 0000000000000000 ffff88025f258080 ffffffff815a0e60
> Sep 17 15:55:52 learner kernel: Call Trace:
> Sep 17 15:55:52 learner kernel:  [<ffffffff81314be5>] ? dump_stack+0x7d/0xb8
> Sep 17 15:55:52 learner kernel:  [<ffffffffa016f719>] ? usb_hcd_submit_urb+0x3c9/0xad0 [usbcore]
> Sep 17 15:55:52 learner kernel:  [<ffffffff815a0e60>] ? _raw_spin_lock_irqsave+0x20/0x47
> Sep 17 15:55:52 learner kernel:  [<ffffffff810d5c8b>] ? lock_timer_base.isra.24+0x7b/0xa0
> Sep 17 15:55:52 learner kernel:  [<ffffffff810d5d59>] ? try_to_del_timer_sync+0x49/0x60
> Sep 17 15:55:52 learner kernel:  [<ffffffffa017180d>] ? usb_start_wait_urb+0x5d/0x140 [usbcore]
> Sep 17 15:55:52 learner kernel:  [<ffffffffa00ee2be>] ? rtsx_usb_send_cmd+0x5e/0x80 [rtsx_usb]
> Sep 17 15:55:52 learner kernel:  [<ffffffffa00ee4a7>] ? rtsx_usb_read_register+0x67/0xb0 [rtsx_usb]
> Sep 17 15:55:52 learner kernel:  [<ffffffffa0b15ac1>] ? rtsx_usb_detect_ms_card+0x61/0xe0 [rtsx_usb_ms]
> Sep 17 15:55:52 learner kernel:  [<ffffffffa0b15a60>] ? rtsx_usb_ms_set_param+0x770/0x770 [rtsx_usb_ms]
> Sep 17 15:55:52 learner kernel:  [<ffffffff8108ee0d>] ? kthread+0xbd/0xe0
> Sep 17 15:55:52 learner kernel:  [<ffffffff81024741>] ? __switch_to+0x2b1/0x6a0
> Sep 17 15:55:52 learner kernel:  [<ffffffff815a118f>] ? ret_from_fork+0x1f/0x40
> Sep 17 15:55:52 learner kernel:  [<ffffffff8108ed50>] ? kthread_create_on_node+0x180/0x180
>
> This is the rtsx_usb_detect_ms_card() routine in
> drivers/memstick/host/rtsx_usb_ms.c, which runs as a kthread.  It
> doesn't do any runtime PM.  So it looks like the bug is present in both
> the MMC and MemoryStick interfaces.

I think the problem is even worse in the MemoryStick case, as the
memstick core doesn't help with runtime PM. I am pretty sure there are
other cases when the MemoryStick driver accesses the usb device
without first runtime resuming it.

Of course we could start simple and fix the bug observed above and see
if that solves the reported problem. Alan, do you want to post the
patch or do you want me to?

Kind regards
Uffe
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
  2016-09-19 10:10                                                                   ` Ulf Hansson
@ 2016-09-19 17:48                                                                     ` Alan Stern
       [not found]                                                                       ` <Pine.LNX.4.44L0.1609191340320.1458-100000-IYeN2dnnYyZXsRXLowluHWD2FQJk+8+b@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Alan Stern @ 2016-09-19 17:48 UTC (permalink / raw)
  To: Ulf Hansson, Alex Dubov; +Cc: Ritesh Raj Sarraf, USB list, linux-mmc

On Mon, 19 Sep 2016, Ulf Hansson wrote:

> On 18 September 2016 at 03:42, Alan Stern <stern@rowland.harvard.edu> wrote:

> > Well, this is pretty clear:
> >
> > Sep 17 15:55:52 learner kernel: CPU: 1 PID: 535 Comm: rtsx_usb_ms_1 Tainted: G     U          4.8.0-rc6ulf1alan1+ #19
> > Sep 17 15:55:52 learner kernel: Hardware name: LENOVO 20344/INVALID, BIOS 96CN31WW(V1.17) 07/21/2015
> > Sep 17 15:55:52 learner kernel:  0000000000000000 ffffffff81314be5 ffff8802476746c0 0000000002400000
> > Sep 17 15:55:52 learner kernel:  ffffffffa016f719 00000000523bec00 ffff88025f255780 ffff88024feff600
> > Sep 17 15:55:52 learner kernel:  0000000000018080 0000000000000000 ffff88025f258080 ffffffff815a0e60
> > Sep 17 15:55:52 learner kernel: Call Trace:
> > Sep 17 15:55:52 learner kernel:  [<ffffffff81314be5>] ? dump_stack+0x7d/0xb8
> > Sep 17 15:55:52 learner kernel:  [<ffffffffa016f719>] ? usb_hcd_submit_urb+0x3c9/0xad0 [usbcore]
> > Sep 17 15:55:52 learner kernel:  [<ffffffff815a0e60>] ? _raw_spin_lock_irqsave+0x20/0x47
> > Sep 17 15:55:52 learner kernel:  [<ffffffff810d5c8b>] ? lock_timer_base.isra.24+0x7b/0xa0
> > Sep 17 15:55:52 learner kernel:  [<ffffffff810d5d59>] ? try_to_del_timer_sync+0x49/0x60
> > Sep 17 15:55:52 learner kernel:  [<ffffffffa017180d>] ? usb_start_wait_urb+0x5d/0x140 [usbcore]
> > Sep 17 15:55:52 learner kernel:  [<ffffffffa00ee2be>] ? rtsx_usb_send_cmd+0x5e/0x80 [rtsx_usb]
> > Sep 17 15:55:52 learner kernel:  [<ffffffffa00ee4a7>] ? rtsx_usb_read_register+0x67/0xb0 [rtsx_usb]
> > Sep 17 15:55:52 learner kernel:  [<ffffffffa0b15ac1>] ? rtsx_usb_detect_ms_card+0x61/0xe0 [rtsx_usb_ms]
> > Sep 17 15:55:52 learner kernel:  [<ffffffffa0b15a60>] ? rtsx_usb_ms_set_param+0x770/0x770 [rtsx_usb_ms]
> > Sep 17 15:55:52 learner kernel:  [<ffffffff8108ee0d>] ? kthread+0xbd/0xe0
> > Sep 17 15:55:52 learner kernel:  [<ffffffff81024741>] ? __switch_to+0x2b1/0x6a0
> > Sep 17 15:55:52 learner kernel:  [<ffffffff815a118f>] ? ret_from_fork+0x1f/0x40
> > Sep 17 15:55:52 learner kernel:  [<ffffffff8108ed50>] ? kthread_create_on_node+0x180/0x180
> >
> > This is the rtsx_usb_detect_ms_card() routine in
> > drivers/memstick/host/rtsx_usb_ms.c, which runs as a kthread.  It
> > doesn't do any runtime PM.  So it looks like the bug is present in both
> > the MMC and MemoryStick interfaces.
> 
> I think the problem is even worse in the MemoryStick case, as the
> memstick core doesn't help with runtime PM. I am pretty sure there are
> other cases when the MemoryStick driver accesses the usb device
> without first runtime resuming it.

Maybe we should get a MemoryStick maintainer involved in this thread.  
I CC'ed Alex Dubov.

Alex, the problem here is that drivers/memstick/host/rtsx_usb_ms.c
tries to communicate with the host USB device while it is runtime
suspended.

> Of course we could start simple and fix the bug observed above and see
> if that solves the reported problem. Alan, do you want to post the
> patch or do you want me to?

This ought to help.  Ritesh, please apply this patch on top of the 
two earlier ones and let's see what happens.

Alan Stern



Index: usb-4.x/drivers/memstick/host/rtsx_usb_ms.c
===================================================================
--- usb-4.x.orig/drivers/memstick/host/rtsx_usb_ms.c
+++ usb-4.x/drivers/memstick/host/rtsx_usb_ms.c
@@ -681,6 +681,7 @@ static int rtsx_usb_detect_ms_card(void
 	int err;
 
 	for (;;) {
+		pm_runtime_get_sync(ms_dev(host));
 		mutex_lock(&ucr->dev_mutex);
 
 		/* Check pending MS card changes */
@@ -703,6 +704,7 @@ static int rtsx_usb_detect_ms_card(void
 		}
 
 poll_again:
+		pm_runtime_put(ms_dev(host));
 		if (host->eject)
 			break;
 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
       [not found]                                                                       ` <Pine.LNX.4.44L0.1609191340320.1458-100000-IYeN2dnnYyZXsRXLowluHWD2FQJk+8+b@public.gmane.org>
@ 2016-09-20 12:36                                                                         ` Ritesh Raj Sarraf
  2016-09-20 14:16                                                                           ` Alan Stern
  0 siblings, 1 reply; 33+ messages in thread
From: Ritesh Raj Sarraf @ 2016-09-20 12:36 UTC (permalink / raw)
  To: Alan Stern, Ulf Hansson, Alex Dubov; +Cc: USB list, linux-mmc

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On Mon, 2016-09-19 at 13:48 -0400, Alan Stern wrote:
> 
> This ought to help.  Ritesh, please apply this patch on top of the 
> two earlier ones and let's see what happens.
> 
> Alan Stern
> 
> 

Please find the logs at the following links. On this boot, I did not see any
kernel stack being printed.

https://people.debian.org/~rrs/tmp/4.8.0-rc7ulf1alan2+.kern.log
https://people.debian.org/~rrs/tmp/usb-4.8.0-rc7ulf1alan2+.log


> 
> Index: usb-4.x/drivers/memstick/host/rtsx_usb_ms.c
> ===================================================================
> --- usb-4.x.orig/drivers/memstick/host/rtsx_usb_ms.c
> +++ usb-4.x/drivers/memstick/host/rtsx_usb_ms.c
> @@ -681,6 +681,7 @@ static int rtsx_usb_detect_ms_card(void
>         int err;
>  
>         for (;;) {
> +               pm_runtime_get_sync(ms_dev(host));
>                 mutex_lock(&ucr->dev_mutex);
>  
>                 /* Check pending MS card changes */
> @@ -703,6 +704,7 @@ static int rtsx_usb_detect_ms_card(void
>                 }
>  
>  poll_again:
> +               pm_runtime_put(ms_dev(host));
>                 if (host->eject)
>                         break;
>  
- -- 
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."
-----BEGIN PGP SIGNATURE-----

iQIcBAEBCgAGBQJX4S1BAAoJEKY6WKPy4XVprAUP/jPnuxUAZ+6qKiCVx6qB69d+
wHQDkFOxmlTwTh5GyMa+oxEqvi0shvOKZ/ef7Oz0NA9DSiLonFw4aqSzF7jBRbee
UTKIgnNxHmJC6pdMPXWo5HVLBn6qYtX4pJFX6g1MwmjEDa9pjYWK9p7QzHkrx1GB
Z3X7TcWYk3DJS04GbFO9pMDl0P1phLR2VtnfzQwqtgF/g2fy7USpft1bYIQLQzxb
oOSAEDnTCtpurdAfLWq8OVQbL3rrf+HD3InVtdCZa+lwNSNwNfUZWnKKkS1S1tq+
hgKxvGOTEGunhm6Px6iQUCE9yxsvfmDK2GBxc/a3Tqpcy5ndZv/5laKFhXTt27pa
OuGksYgHCf2vWGHFuYHJH6cQKxgdsnnE7yGwbC8zYnrCT9O3hcLPxbVbzJorWFU0
YMNKt7RYZXrNQss9J4ufkTSLvzbUqsiYJwWH27LbQ5zHC7b9/ebgnMW6JIb1x+2p
iuz6MERvyxVxorG3R260GWSz/5SM/VVnTqzlRUnMHcVAyUHNGPGoqLu5LkrmI2VT
Zwcikip9G3fE79786eKF50X7dp2kU2p+W2bBmcJEWpWV9Vz5PiQdibsiu3CQilKc
QGxrKLp0OSsUvtwb4ceD/RWu7F99F7VCu3f/ohYYS2iciux5sFky+27GfY0fEJ2u
ikPpuK6xNWWSDgaNVVHD
=Nq4a
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
  2016-09-20 12:36                                                                         ` Ritesh Raj Sarraf
@ 2016-09-20 14:16                                                                           ` Alan Stern
       [not found]                                                                             ` <Pine.LNX.4.44L0.1609201012290.1459-100000-IYeN2dnnYyZXsRXLowluHWD2FQJk+8+b@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Alan Stern @ 2016-09-20 14:16 UTC (permalink / raw)
  To: Ritesh Raj Sarraf; +Cc: Ulf Hansson, Alex Dubov, USB list, linux-mmc

On Tue, 20 Sep 2016, Ritesh Raj Sarraf wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
> 
> On Mon, 2016-09-19 at 13:48 -0400, Alan Stern wrote:
> > 
> > This ought to help.  Ritesh, please apply this patch on top of the 
> > two earlier ones and let's see what happens.
> > 
> > Alan Stern
> > 
> > 
> 
> Please find the logs at the following links. On this boot, I did not see any
> kernel stack being printed.
> 
> https://people.debian.org/~rrs/tmp/4.8.0-rc7ulf1alan2+.kern.log
> https://people.debian.org/~rrs/tmp/usb-4.8.0-rc7ulf1alan2+.log

This is a lot better.  No more I/O errors.

We still have irregular suspends and resumes, but that's to be 
expected.  More worrying are the spontaneous disconnects.  They don't 
seem to be related to the suspend/resume activity.

You can disable suspend for this device entirely by doing:

	echo on >/sys/bus/usb/devices/2-4/power/control

I'm afraid that this won't prevent the device from disconnecting
itself, though.  This appears to be some sort of hardware bug that
can't be fixed in software.

Alan Stern


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
       [not found]                                                                             ` <Pine.LNX.4.44L0.1609201012290.1459-100000-IYeN2dnnYyZXsRXLowluHWD2FQJk+8+b@public.gmane.org>
@ 2016-09-20 15:17                                                                               ` Ritesh Raj Sarraf
       [not found]                                                                                 ` <1474384626.21100.6.camel-7WuBAv+fczCJ8c2fQYRYNw@public.gmane.org>
  2016-09-21 11:10                                                                               ` Ritesh Raj Sarraf
  1 sibling, 1 reply; 33+ messages in thread
From: Ritesh Raj Sarraf @ 2016-09-20 15:17 UTC (permalink / raw)
  To: Alan Stern; +Cc: Ulf Hansson, Alex Dubov, USB list, linux-mmc

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Hello Alan,

On Tue, 2016-09-20 at 10:16 -0400, Alan Stern wrote:
> This is a lot better.  No more I/O errors.
> 
> We still have irregular suspends and resumes, but that's to be 
> expected.  More worrying are the spontaneous disconnects.  They don't 
> seem to be related to the suspend/resume activity.
> 
> You can disable suspend for this device entirely by doing:
> 
>         echo on >/sys/bus/usb/devices/2-4/power/control
> 

Yes. But that'd also mean to write that value upon every suspend/resume cycle
because the rtsx usb driver still declares support for autosuspend.
Should that be dropped ?

> I'm afraid that this won't prevent the device from disconnecting
> itself, though.  This appears to be some sort of hardware bug that
> can't be fixed in software.

And that'd mean that upon every reset, the driver will again enable autosuspend
for that driver.


It is an upsetting state for this device but thank you, to all of you, for
helping debug this problem.

- -- 
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."
-----BEGIN PGP SIGNATURE-----

iQIcBAEBCgAGBQJX4VLyAAoJEKY6WKPy4XVprEgQAKdYcFIF1ICNGHkF+oyDe1VY
DqyxTa0vLPbXWsG6GaV4Jdwld+gVKiWIVNKxbLpboBHj/vj0bkzoPJ0z4xI+Oc8p
bu6T5Io6nAaegdGa6XvYIAXk1fIlXEehBFVM6OyzC1EUJAjgYhIDVlG8mqZKxqVp
GGsX1e5UFdS0vCqjYqSxI5IHrqsm1M4lXuwr8ia66qyuSfpg8trizLiWrdiEa7hv
hRtI81XxITvVJ4+2ernO6Y+RO/z6WQLs1SAhXvDEH3RlYh/RoEBpolqMIO8LVWMK
jd4GSsYmMKiG1eJJq3UYM+iPDANIIi4gdO0hf/24vNcsVa8eF5kcKT+bxufRaiB/
5TzZS0RARBdq1+N6bK/wF8lDL4bWy4Sl1mts/dXJaCOlOoLeQ/u/J55K58mDJb2O
gdvEvzD/S9NNeawL2ow4sxaM8EBpeyJtBTyVIbLJVFmetauVs+ClI0uLoNMW/N+u
qMDE8yhiey4ClXa0WmVZHN4qjfNAnW4OSMtrq8+TLa1yhVj/ONNBo8QkfjuKBHwE
ELDrX3/N9JsZo6ZX0CPNVhodvck8ZVKrk8w3jAlcqQ0FlJy5MMFoqGEvqJ1EIfNv
IQ5FKrI08RIA7yw6keTy3nybzyY3MhepLJWxiEVyKRCopiHk96DmEGJmOmor7VIQ
hjgbpCikJ7sXPcn0n97u
=DBja
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
       [not found]                                                                                 ` <1474384626.21100.6.camel-7WuBAv+fczCJ8c2fQYRYNw@public.gmane.org>
@ 2016-09-20 15:43                                                                                   ` Alan Stern
  2016-09-20 15:51                                                                                     ` Ritesh Raj Sarraf
  0 siblings, 1 reply; 33+ messages in thread
From: Alan Stern @ 2016-09-20 15:43 UTC (permalink / raw)
  To: Ritesh Raj Sarraf; +Cc: Ulf Hansson, Alex Dubov, USB list, linux-mmc

On Tue, 20 Sep 2016, Ritesh Raj Sarraf wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
> 
> Hello Alan,
> 
> On Tue, 2016-09-20 at 10:16 -0400, Alan Stern wrote:
> > This is a lot better.  No more I/O errors.
> > 
> > We still have irregular suspends and resumes, but that's to be 
> > expected.  More worrying are the spontaneous disconnects.  They don't 
> > seem to be related to the suspend/resume activity.
> > 
> > You can disable suspend for this device entirely by doing:
> > 
> >         echo on >/sys/bus/usb/devices/2-4/power/control
> > 
> 
> Yes. But that'd also mean to write that value upon every suspend/resume cycle
> because the rtsx usb driver still declares support for autosuspend.
> Should that be dropped ?

No, the value doesn't change across a suspend/resume cycle.

> > I'm afraid that this won't prevent the device from disconnecting
> > itself, though.  This appears to be some sort of hardware bug that
> > can't be fixed in software.
> 
> And that'd mean that upon every reset, the driver will again enable autosuspend
> for that driver.

Yes, that's true.  I'm curious to see if preventing autosuspends will 
get rid of the resets.  My guess is that it won't.

Alan Stern


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
  2016-09-20 15:43                                                                                   ` Alan Stern
@ 2016-09-20 15:51                                                                                     ` Ritesh Raj Sarraf
  0 siblings, 0 replies; 33+ messages in thread
From: Ritesh Raj Sarraf @ 2016-09-20 15:51 UTC (permalink / raw)
  To: Alan Stern; +Cc: Ulf Hansson, Alex Dubov, USB list, linux-mmc

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On Tue, 2016-09-20 at 11:43 -0400, Alan Stern wrote:
> > Yes. But that'd also mean to write that value upon every suspend/resume cycle
> > because the rtsx usb driver still declares support for autosuspend.
> > Should that be dropped ?
> 
> No, the value doesn't change across a suspend/resume cycle.
> 

I just verified, and yes, you are right. The value doesn't change.

> > > I'm afraid that this won't prevent the device from disconnecting
> > > itself, though.  This appears to be some sort of hardware bug that
> > > can't be fixed in software.
> > 
> > And that'd mean that upon every reset, the driver will again enable autosuspend
> > for that driver.
> 
> Yes, that's true.  I'm curious to see if preventing autosuspends will 
> get rid of the resets.  My guess is that it won't.

No. We tried it in the beginning. And the resets were still seen.

Thanks.

- -- 
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."
-----BEGIN PGP SIGNATURE-----

iQIcBAEBCgAGBQJX4Vr/AAoJEKY6WKPy4XVpqSoP/jimRbblWfR56HM3RuK6jLTp
XHm2lJj6LopxC7BDPVF+SIMULlTyPh2hjbF2MVw9yNwsv1nruGQspzZ5VrBWlTsK
8Fs2sJif7Y8tWVFEMZczghrEoHN0KLe5vW4W6rX8xjjH5nL6ljtUeBDT6DyvD7yT
WymQWfObwp6VnjoR3nZ1SzB4DN/oGH10NaMjkk234mTkhU9Pl+UXFmesDdWn8Y64
3l5SpemMbNQaCaa/jyFQBJXu3+OTYVQafHjcl0bb3aRt4sHq5neS5zc/EIjz+Cpo
kqQwpQ6FslvSvamlwwB8mqDalPQZHeIvUNFMjlldpiAs8iCVeMHpolWI/CXCfo+1
BwVv8Kc1VnoMsjZ7uEUQJY9F1Q7YJ+4gFK6WSAhz7B9Na/0ztPJgq0tFYnVQgrwx
zUnLL7jPZZ4Wt8if9UayPtCUCdqHBSIfeoJ7+HMkC6FPt5GGCsrhtZX0u0Onop7F
Ka/VNgpMUNccgPvdqq3zYKyNIaAIUPf0jSyFbwxVXGbCLSZi8f4QmSw7k3BvkqNN
lR+pyqjKbImTpzqk0QT22SGT+4MeQgclbEUkpfA8PaPyb+9uLjgtZgp1ucTlMzOV
c3mXaTzRtSihagSW4hNyqYOINtBnvZp3n2fWDPgjJu+LWGOhpqY7P8mq8gFj77Fp
G49mKNuDiOkPWH3qHJgA
=bGWs
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
       [not found]                                                                             ` <Pine.LNX.4.44L0.1609201012290.1459-100000-IYeN2dnnYyZXsRXLowluHWD2FQJk+8+b@public.gmane.org>
  2016-09-20 15:17                                                                               ` Ritesh Raj Sarraf
@ 2016-09-21 11:10                                                                               ` Ritesh Raj Sarraf
       [not found]                                                                                 ` <1474456212.8192.2.camel-7WuBAv+fczCJ8c2fQYRYNw@public.gmane.org>
  2016-09-21 14:37                                                                                 ` Alan Stern
  1 sibling, 2 replies; 33+ messages in thread
From: Ritesh Raj Sarraf @ 2016-09-21 11:10 UTC (permalink / raw)
  To: Alan Stern; +Cc: Ulf Hansson, Alex Dubov, USB list, linux-mmc

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Hi Alan,

On Tue, 2016-09-20 at 10:16 -0400, Alan Stern wrote:
> This is a lot better.  No more I/O errors.
> 
> We still have irregular suspends and resumes, but that's to be 
> expected.  More worrying are the spontaneous disconnects.  They don't 
> seem to be related to the suspend/resume activity.
> 
> You can disable suspend for this device entirely by doing:
> 
>         echo on >/sys/bus/usb/devices/2-4/power/control
> 
> I'm afraid that this won't prevent the device from disconnecting
> itself, though.  This appears to be some sort of hardware bug that
> can't be fixed in software.

I'm not sure what you were referring to when you said "No more I/O errors".
But I still got these errors today, with all patches applied.

Sep 21 14:58:11 learner kernel: usb 2-4: new high-speed USB device number 98 using xhci_hcd
Sep 21 14:58:18 learner kernel: usb 2-4: new high-speed USB device number 102 using xhci_hcd
Sep 21 14:58:24 learner kernel: usb 2-4: new high-speed USB device number 106 using xhci_hcd
Sep 21 14:58:31 learner kernel: usb 2-4: new high-speed USB device number 114 using xhci_hcd
Sep 21 14:58:41 learner kernel: usb 2-4: new high-speed USB device number 12 using xhci_hcd
Sep 21 14:58:41 learner kernel: usb 2-4: device descriptor read/64, error -71
Sep 21 14:58:41 learner kernel: usb 2-4: device descriptor read/64, error -71
Sep 21 14:58:41 learner kernel: usb 2-4: new high-speed USB device number 13 using xhci_hcd
Sep 21 14:58:41 learner kernel: usb 2-4: device descriptor read/64, error -71
Sep 21 14:58:41 learner kernel: usb 2-4: device descriptor read/64, error -71
Sep 21 14:58:42 learner kernel: usb 2-4: new high-speed USB device number 14 using xhci_hcd
Sep 21 14:58:42 learner kernel: usb 2-4: Device not responding to setup address.
Sep 21 14:58:42 learner kernel: usb 2-4: Device not responding to setup address.
Sep 21 14:58:42 learner kernel: usb 2-4: device not accepting address 14, error -71
Sep 21 14:58:42 learner kernel: usb 2-4: new high-speed USB device number 15 using xhci_hcd
Sep 21 14:58:42 learner kernel: usb 2-4: Device not responding to setup address.
Sep 21 14:58:42 learner kernel: usb 2-4: Device not responding to setup address.
Sep 21 14:58:43 learner kernel: usb 2-4: device not accepting address 15, error -71
Sep 21 14:58:43 learner kernel: usb usb2-port4: unable to enumerate USB device
Sep 21 16:19:39 learner kernel: ahci 0000:00:1f.2: port does not support device sleep
Sep 21 16:19:39 learner kernel: NMI watchdog: enabled on all CPUs, permanently consumes one hw-
Sep 21 16:19:39 learner kernel: EXT4-fs (dm-0): re-mounted. Opts: errors=remount-ro,data=ordere
Sep 21 16:19:39 learner kernel: EXT4-fs (sda6): re-mounted. Opts: data=ordered,commit=0
Sep 21 16:19:39 learner kernel: EXT4-fs (dm-3): re-mounted. Opts: errors=remount-ro,data=writeb
Sep 21 16:19:39 learner kernel: usb 2-4: new high-speed USB device number 16 using xhci_hcd
Sep 21 16:19:39 learner kernel: usb 2-4: device descriptor read/64, error -71
Sep 21 16:19:40 learner kernel: usb 2-4: device descriptor read/64, error -71
Sep 21 16:19:40 learner kernel: usb 2-4: new high-speed USB device number 17 using xhci_hcd
Sep 21 16:19:40 learner kernel: usb 2-4: device descriptor read/64, error -71
Sep 21 16:19:40 learner kernel: usb 2-4: device descriptor read/64, error -71
Sep 21 16:19:40 learner kernel: usb 2-4: new high-speed USB device number 18 using xhci_hcd
Sep 21 16:19:40 learner kernel: usb 2-4: Device not responding to setup address.
Sep 21 16:19:41 learner kernel: usb 2-4: Device not responding to setup address.
Sep 21 16:19:41 learner kernel: usb 2-4: device not accepting address 18, error -71
Sep 21 16:19:41 learner kernel: usb 2-4: new high-speed USB device number 19 using xhci_hcd


- -- 
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."
-----BEGIN PGP SIGNATURE-----

iQIcBAEBCgAGBQJX4mqUAAoJEKY6WKPy4XVpEHMP/RhcDQXxt3LTOpGhJizyqZ4z
7Sm1tcBe/4NKP80nUpiI0geQYHYfRTR93hGKFayp48ULstn8xJ8T3ItZIS0WmZDK
TJcdxXzrkWMGNAGQFjNd9Lk1C7h1IIuo2D5xDuhrpHGMc5y4UVmpPixQRwEnbzG9
zX+PabvummAmlzT1+cRyO10uwpGFzsJ3SDkokjkxZ/aViL+vBU58/qiXIFH1D1hX
KTY8ABZjh4Hnkw07EcQh0xKztEbE/v2wJWPSx4RCPbsRdO5vdKUtOtWB7+1WVAY3
noSrvNWjj0Ntnm0+t4XIid1fDmNumK0EcYe8fDb/GqAuYDTqjcIZ5ANCaVSM/joq
suY7KTXVe44Pol1Bb89lERR49QAkxyKJViNc0bNSkp0+F4u4cDW9o0q6s0X6xw5b
LdAQHQek92IRNmT7v4gYO9bUKUBurqgHuUdi3iYlylbvs8UAzHmOL3nrFBz2GIcG
KQvqmvENy31VIlIMx+k3SipyedG77LIAmxX8bG7Xlu8lSZz3sPkMz7RJYeW0QwQ6
lC2cWiF2cn5K/0eTQPW3MX5H9m5qlq0QGaDrf8kGX6XpRKR3Qsu98L+R+AAmViQ9
kd2eBFzL4JdNVhXgNWrNk5mr0R0D9RB/58YWize3sASPg75zCFQCNOoFTPNZGN1q
edPs8uwkN5O2cy+0ur8n
=uZvc
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
       [not found]                                                                                 ` <1474456212.8192.2.camel-7WuBAv+fczCJ8c2fQYRYNw@public.gmane.org>
@ 2016-09-21 11:17                                                                                   ` Ulf Hansson
       [not found]                                                                                     ` <CAPDyKFrWHaPhubTsPjd7GpZcoQnGM9u1YEiy=iGpb1Qa2rJqPA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Ulf Hansson @ 2016-09-21 11:17 UTC (permalink / raw)
  To: Ritesh Raj Sarraf; +Cc: Alan Stern, Alex Dubov, USB list, linux-mmc

On 21 September 2016 at 13:10, Ritesh Raj Sarraf <rrs-7WuBAv+fczCJ8c2fQYRYNw@public.gmane.org> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
>
> Hi Alan,
>
> On Tue, 2016-09-20 at 10:16 -0400, Alan Stern wrote:
>> This is a lot better.  No more I/O errors.
>>
>> We still have irregular suspends and resumes, but that's to be
>> expected.  More worrying are the spontaneous disconnects.  They don't
>> seem to be related to the suspend/resume activity.
>>
>> You can disable suspend for this device entirely by doing:
>>
>>         echo on >/sys/bus/usb/devices/2-4/power/control
>>
>> I'm afraid that this won't prevent the device from disconnecting
>> itself, though.  This appears to be some sort of hardware bug that
>> can't be fixed in software.
>
> I'm not sure what you were referring to when you said "No more I/O errors".
> But I still got these errors today, with all patches applied.
>
> Sep 21 14:58:11 learner kernel: usb 2-4: new high-speed USB device number 98 using xhci_hcd
> Sep 21 14:58:18 learner kernel: usb 2-4: new high-speed USB device number 102 using xhci_hcd
> Sep 21 14:58:24 learner kernel: usb 2-4: new high-speed USB device number 106 using xhci_hcd
> Sep 21 14:58:31 learner kernel: usb 2-4: new high-speed USB device number 114 using xhci_hcd
> Sep 21 14:58:41 learner kernel: usb 2-4: new high-speed USB device number 12 using xhci_hcd
> Sep 21 14:58:41 learner kernel: usb 2-4: device descriptor read/64, error -71
> Sep 21 14:58:41 learner kernel: usb 2-4: device descriptor read/64, error -71
> Sep 21 14:58:41 learner kernel: usb 2-4: new high-speed USB device number 13 using xhci_hcd
> Sep 21 14:58:41 learner kernel: usb 2-4: device descriptor read/64, error -71
> Sep 21 14:58:41 learner kernel: usb 2-4: device descriptor read/64, error -71
> Sep 21 14:58:42 learner kernel: usb 2-4: new high-speed USB device number 14 using xhci_hcd
> Sep 21 14:58:42 learner kernel: usb 2-4: Device not responding to setup address.
> Sep 21 14:58:42 learner kernel: usb 2-4: Device not responding to setup address.
> Sep 21 14:58:42 learner kernel: usb 2-4: device not accepting address 14, error -71
> Sep 21 14:58:42 learner kernel: usb 2-4: new high-speed USB device number 15 using xhci_hcd
> Sep 21 14:58:42 learner kernel: usb 2-4: Device not responding to setup address.
> Sep 21 14:58:42 learner kernel: usb 2-4: Device not responding to setup address.
> Sep 21 14:58:43 learner kernel: usb 2-4: device not accepting address 15, error -71
> Sep 21 14:58:43 learner kernel: usb usb2-port4: unable to enumerate USB device
> Sep 21 16:19:39 learner kernel: ahci 0000:00:1f.2: port does not support device sleep
> Sep 21 16:19:39 learner kernel: NMI watchdog: enabled on all CPUs, permanently consumes one hw-
> Sep 21 16:19:39 learner kernel: EXT4-fs (dm-0): re-mounted. Opts: errors=remount-ro,data=ordere
> Sep 21 16:19:39 learner kernel: EXT4-fs (sda6): re-mounted. Opts: data=ordered,commit=0
> Sep 21 16:19:39 learner kernel: EXT4-fs (dm-3): re-mounted. Opts: errors=remount-ro,data=writeb
> Sep 21 16:19:39 learner kernel: usb 2-4: new high-speed USB device number 16 using xhci_hcd
> Sep 21 16:19:39 learner kernel: usb 2-4: device descriptor read/64, error -71
> Sep 21 16:19:40 learner kernel: usb 2-4: device descriptor read/64, error -71
> Sep 21 16:19:40 learner kernel: usb 2-4: new high-speed USB device number 17 using xhci_hcd
> Sep 21 16:19:40 learner kernel: usb 2-4: device descriptor read/64, error -71
> Sep 21 16:19:40 learner kernel: usb 2-4: device descriptor read/64, error -71
> Sep 21 16:19:40 learner kernel: usb 2-4: new high-speed USB device number 18 using xhci_hcd
> Sep 21 16:19:40 learner kernel: usb 2-4: Device not responding to setup address.
> Sep 21 16:19:41 learner kernel: usb 2-4: Device not responding to setup address.
> Sep 21 16:19:41 learner kernel: usb 2-4: device not accepting address 18, error -71
> Sep 21 16:19:41 learner kernel: usb 2-4: new high-speed USB device number 19 using xhci_hcd
>

I am pretty sure the memstick driver makes additional accesses to the
usb device without first calling pm_runtime_get_sync(). To rule those
cases out as the cause of the issues, could you try disabling the
memstick driver altogether?

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
       [not found]                                                                                     ` <CAPDyKFrWHaPhubTsPjd7GpZcoQnGM9u1YEiy=iGpb1Qa2rJqPA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-09-21 11:42                                                                                       ` Ritesh Raj Sarraf
  0 siblings, 0 replies; 33+ messages in thread
From: Ritesh Raj Sarraf @ 2016-09-21 11:42 UTC (permalink / raw)
  To: Ulf Hansson; +Cc: Alan Stern, Alex Dubov, USB list, linux-mmc

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Hello Ulf,

On Wed, 2016-09-21 at 13:17 +0200, Ulf Hansson wrote:
> 
> I am pretty sure the memstick driver makes additional accesses to the
> usb device without first calling pm_runtime_get_sync(). To rule those
> cases out as the cause of the issues, could you try disabling the
> memstick driver altogether?

I'm assuming you are referring to the rtsx_usb_ms driver ?

The oddest thing right now is that none of the rtsx modules are
reported as loaded.

rrs@learner:~$ lsmod | grep -i rts
2016-09-21 / 17:07:08 ♒♒♒  ☹  => 1  

whereas the module was built for the kernel, and does load when asked manually.

rrs@learner:~$ less /boot/config-4.8.0-rc7alxb+ 
2016-09-21 / 17:07:54 ♒♒♒  ☺  
rrs@learner:~$ find /lib/modules/4.8.0-rc7alxb+/ | grep rtsx
/lib/modules/4.8.0-rc7alxb+/kernel/drivers/mmc/host/rtsx_usb_sdmmc.ko
/lib/modules/4.8.0-rc7alxb+/kernel/drivers/mmc/host/rtsx_pci_sdmmc.ko
/lib/modules/4.8.0-rc7alxb+/kernel/drivers/mfd/rtsx_usb.ko
/lib/modules/4.8.0-rc7alxb+/kernel/drivers/mfd/rtsx_pci.ko
/lib/modules/4.8.0-rc7alxb+/kernel/drivers/memstick/host/rtsx_pci_ms.ko
/lib/modules/4.8.0-rc7alxb+/kernel/drivers/memstick/host/rtsx_usb_ms.ko
2016-09-21 / 17:08:09 ♒♒♒  ☺  

rrs@learner:~$ sudo modprobe rtsx-usb-sdmmc
2016-09-21 / 17:08:46 ♒♒♒  ☺  

rrs@learner:~$ dmesg | tail -n 5
[ 6870.017311] usb 2-4: Device not responding to setup address.
[ 6870.223471] usb 2-4: device not accepting address 19, error -71
[ 6870.223536] usb usb2-port4: unable to enumerate USB device
[ 7958.474543] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=166356 end=166357) time 3 us, min 1073, max 1079, scanline start 1088, end 1080
[ 9814.785241] usbcore: registered new interface driver rtsx_usb
2016-09-21 / 17:10:07 ♒♒♒  ☺  



- -- 
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."
-----BEGIN PGP SIGNATURE-----

iQIcBAEBCgAGBQJX4nIMAAoJEKY6WKPy4XVpg4gP/Rkp5Wje2vTUHwnrJZzAKd65
V1VWkTyCfpbCTw4nAyMbzDevyoXJf3P+ktrcyQonvoZr/dsd/LOmjSXcjNJaFX0z
Vj+1lGOZeN1mr6vKMbh2y188oU4RyHkeCfg7SNdo1VuhVST/m2jOecCznXuEtpiS
TI66lmre0SNRusKHRQNDtaP4hFW0KDBmtXvc8pnNOL5781qua0pF12VIP5SsqriP
3d/DcqlVR0Lqh7X7wdt5Knp9ilSvsrCgGbimBarRUYnrOrhklJH9UwKXcHWVypzt
M0XvjO7R6JrBUoM6s8EA6gKmNxwlsIrjKFUFlTT6HPLb52fjpYkbYqCRUmqyIJZB
t3uA0hNKcqLav+Fg7ugT6ePAPeVkANDnbrPE+g69KOtM/CEJHbHkLxqznb/0lpMU
+SAp/Jz1CmDLt8M3s8gS9iCUWVrWy1oyDpMQsYIrTYOG6oOEOoTYEf/g5D6PBtY0
r1tD5bU/cZJV61YKer2xRDNu1YbAYkvX3XskFD7DFsnZpCyBXKGZ7gWpWety9kJJ
iBiqvn5Rk7jvL6EhIb/TQ867QLmhQCzZumPClFM4z7b8G2E440vykw5D5sKv91+H
qu9Abcxe0R0T9pxCFRz+//DYlxGvDDlUSyNBkrv6aGS1rNeDY/e4FWbNuEz9UoGn
LNiunOtwzBndajfEFETk
=tEAz
-----END PGP SIGNATURE-----



* Re: xHCI problem? [was Re: Erratic USB device behavior and device loss]
  2016-09-21 11:10                                                                               ` Ritesh Raj Sarraf
       [not found]                                                                                 ` <1474456212.8192.2.camel-7WuBAv+fczCJ8c2fQYRYNw@public.gmane.org>
@ 2016-09-21 14:37                                                                                 ` Alan Stern
  1 sibling, 0 replies; 33+ messages in thread
From: Alan Stern @ 2016-09-21 14:37 UTC (permalink / raw)
  To: Ritesh Raj Sarraf; +Cc: Ulf Hansson, Alex Dubov, USB list, linux-mmc

On Wed, 21 Sep 2016, Ritesh Raj Sarraf wrote:

> Hi Alan,
> 
> On Tue, 2016-09-20 at 10:16 -0400, Alan Stern wrote:
> > This is a lot better.  No more I/O errors.
> > 
> > We still have irregular suspends and resumes, but that's to be 
> > expected.  More worrying are the spontaneous disconnects.  They don't 
> > seem to be related to the suspend/resume activity.
> > 
> > You can disable suspend for this device entirely by doing:
> > 
> >         echo on >/sys/bus/usb/devices/2-4/power/control
> > 
> > I'm afraid that this won't prevent the device from disconnecting
> > itself, though.  This appears to be some sort of hardware bug that
> > can't be fixed in software.
> 
> I'm not sure what you were referring to when you said "No more I/O errors".

I was referring to the attempts at I/O while the device was suspended.  
They didn't occur in your most recent test.
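
For anyone scripting this, the power/control setting quoted above can be
toggled programmatically.  The following is only a minimal sketch,
assuming Python 3 and root privileges on the real sysfs tree; the helper
names and the sysfs_root parameter are hypothetical and exist only so the
code can be exercised against a scratch directory.  The values "on" and
"auto" are the ones the kernel accepts for this attribute.

```python
import os

USB_SYSFS_ROOT = "/sys/bus/usb/devices"  # standard sysfs location


def power_control_path(device, sysfs_root=USB_SYSFS_ROOT):
    """Return the path of the runtime-PM control file for a USB device."""
    return os.path.join(sysfs_root, device, "power", "control")


def set_power_control(device, mode, sysfs_root=USB_SYSFS_ROOT):
    """Write 'on' (never autosuspend) or 'auto' (allow autosuspend)."""
    if mode not in ("on", "auto"):
        raise ValueError("mode must be 'on' or 'auto', got %r" % (mode,))
    path = power_control_path(device, sysfs_root)
    with open(path, "w") as f:
        f.write(mode)
    return path


def get_power_control(device, sysfs_root=USB_SYSFS_ROOT):
    """Read back the current setting, stripped of the trailing newline."""
    with open(power_control_path(device, sysfs_root)) as f:
        return f.read().strip()
```

Calling set_power_control("2-4", "on") would be the equivalent of the
echo command above; as noted there, it only stops autosuspend and is not
expected to help with the disconnects.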

> But I still got these errors today, with all patches applied.
> 
> Sep 21 14:58:11 learner kernel: usb 2-4: new high-speed USB device number 98 using xhci_hcd
> Sep 21 14:58:18 learner kernel: usb 2-4: new high-speed USB device number 102 using xhci_hcd
> Sep 21 14:58:24 learner kernel: usb 2-4: new high-speed USB device number 106 using xhci_hcd
> Sep 21 14:58:31 learner kernel: usb 2-4: new high-speed USB device number 114 using xhci_hcd
> Sep 21 14:58:41 learner kernel: usb 2-4: new high-speed USB device number 12 using xhci_hcd
> Sep 21 14:58:41 learner kernel: usb 2-4: device descriptor read/64, error -71
> Sep 21 14:58:41 learner kernel: usb 2-4: device descriptor read/64, error -71
> Sep 21 14:58:41 learner kernel: usb 2-4: new high-speed USB device number 13 using xhci_hcd

These are a completely different kind of error.  They occurred during a 
reset, which followed one of those spontaneous disconnects.  Probably 
the cause of the disconnect is also the cause of these errors.
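
To see how often each kind of message shows up in a long log, a small
tally along these lines may help.  This is a rough sketch, not something
taken from the thread: the function name and the two regular expressions
are assumptions modelled on the log lines quoted above, and it expects
each log entry on a single line (mail wrapping undone).  For reference,
-71 is -EPROTO on Linux.

```python
import re
from collections import Counter

# A port reset or re-enumeration announces a new device number.
ENUM_RE = re.compile(
    r"usb (?P<port>[\d.-]+): new (?P<speed>[\w-]+) USB device number (?P<num>\d+)"
)
# Any other per-port message that ends in a negative errno.
ERR_RE = re.compile(r"usb (?P<port>[\d.-]+): .*error (?P<errno>-\d+)")


def summarize_usb_log(lines):
    """Count enumerations and errno-bearing errors, keyed by port."""
    counts = Counter()
    for line in lines:
        m = ENUM_RE.search(line)
        if m:
            counts[(m.group("port"), "enumeration")] += 1
            continue
        m = ERR_RE.search(line)
        if m:
            counts[(m.group("port"), "error " + m.group("errno"))] += 1
    return counts
```

Feeding it the output of journalctl -k or dmesg, one line per entry,
gives a per-port count of re-enumerations versus protocol errors.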

Alan Stern



end of thread (last message 2016-09-21 14:37 UTC)

Thread overview: 33+ messages
     [not found] <1472132041.13456.11.camel@researchut.com>
2016-08-25 17:17 ` xHCI problem? [was Re: Erratic USB device behavior and device loss] Alan Stern
     [not found]   ` <Pine.LNX.4.44L0.1608251254220.1395-100000-IYeN2dnnYyZXsRXLowluHWD2FQJk+8+b@public.gmane.org>
2016-08-30  8:14     ` Ulf Hansson
     [not found]       ` <CAPDyKFq2SYtwWCNhSzQcxj8XdYmAhTqn6mxRKMJ7eKZAk=itWg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-09-04 11:32         ` Ritesh Raj Sarraf
2016-09-04 18:01           ` Ritesh Raj Sarraf
     [not found]             ` <1473012074.5339.6.camel-7WuBAv+fczCJ8c2fQYRYNw@public.gmane.org>
2016-09-04 19:46               ` Alan Stern
2016-09-05 12:59                 ` Ritesh Raj Sarraf
     [not found]                   ` <1473080344.10346.4.camel-7WuBAv+fczCJ8c2fQYRYNw@public.gmane.org>
2016-09-05 15:58                     ` Alan Stern
     [not found]                       ` <Pine.LNX.4.44L0.1609051157310.25234-100000-pYrvlCTfrz9XsRXLowluHWD2FQJk+8+b@public.gmane.org>
2016-09-06  9:42                         ` Ulf Hansson
2016-09-06 17:08                           ` Ritesh Raj Sarraf
     [not found]                           ` <CAPDyKFpnCXhdoKgoG576teC=y38vbC1x=-ehC_9EWEeKr_K6BQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-09-07 20:48                             ` Alan Stern
     [not found]                               ` <Pine.LNX.4.44L0.1609071630350.2115-100000-IYeN2dnnYyZXsRXLowluHWD2FQJk+8+b@public.gmane.org>
2016-09-09 10:54                                 ` Ulf Hansson
     [not found]                                   ` <CAPDyKFr0vEaEbsoPm6YwJD1JOQc=YR=zwi4T6Rr3gCQ4StNuvg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-09-09 13:14                                     ` Ritesh Raj Sarraf
     [not found]                                       ` <1473426861.9415.2.camel-7WuBAv+fczCJ8c2fQYRYNw@public.gmane.org>
2016-09-09 14:04                                         ` Ritesh Raj Sarraf
     [not found]                                           ` <1473429884.9415.8.camel-7WuBAv+fczCJ8c2fQYRYNw@public.gmane.org>
2016-09-09 16:15                                             ` Alan Stern
2016-09-14 14:50                                             ` Ritesh Raj Sarraf
     [not found]                                               ` <1473864634.9913.12.camel-7WuBAv+fczCJ8c2fQYRYNw@public.gmane.org>
2016-09-14 15:19                                                 ` Alan Stern
2016-09-15 13:59                                                   ` Ulf Hansson
2016-09-15 14:16                                                     ` Alan Stern
2016-09-16 15:42                                                       ` Ritesh Raj Sarraf
2016-09-16 21:40                                                         ` Alan Stern
     [not found]                                                           ` <Pine.LNX.4.44L0.1609161729340.1657-100000-IYeN2dnnYyZXsRXLowluHWD2FQJk+8+b@public.gmane.org>
2016-09-17 11:42                                                             ` Ritesh Raj Sarraf
2016-09-18  1:42                                                               ` Alan Stern
     [not found]                                                                 ` <Pine.LNX.4.44L0.1609172131120.698-100000-pYrvlCTfrz9XsRXLowluHWD2FQJk+8+b@public.gmane.org>
2016-09-19 10:10                                                                   ` Ulf Hansson
2016-09-19 17:48                                                                     ` Alan Stern
     [not found]                                                                       ` <Pine.LNX.4.44L0.1609191340320.1458-100000-IYeN2dnnYyZXsRXLowluHWD2FQJk+8+b@public.gmane.org>
2016-09-20 12:36                                                                         ` Ritesh Raj Sarraf
2016-09-20 14:16                                                                           ` Alan Stern
     [not found]                                                                             ` <Pine.LNX.4.44L0.1609201012290.1459-100000-IYeN2dnnYyZXsRXLowluHWD2FQJk+8+b@public.gmane.org>
2016-09-20 15:17                                                                               ` Ritesh Raj Sarraf
     [not found]                                                                                 ` <1474384626.21100.6.camel-7WuBAv+fczCJ8c2fQYRYNw@public.gmane.org>
2016-09-20 15:43                                                                                   ` Alan Stern
2016-09-20 15:51                                                                                     ` Ritesh Raj Sarraf
2016-09-21 11:10                                                                               ` Ritesh Raj Sarraf
     [not found]                                                                                 ` <1474456212.8192.2.camel-7WuBAv+fczCJ8c2fQYRYNw@public.gmane.org>
2016-09-21 11:17                                                                                   ` Ulf Hansson
     [not found]                                                                                     ` <CAPDyKFrWHaPhubTsPjd7GpZcoQnGM9u1YEiy=iGpb1Qa2rJqPA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-09-21 11:42                                                                                       ` Ritesh Raj Sarraf
2016-09-21 14:37                                                                                 ` Alan Stern
