linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Kai-Heng Feng <kai.heng.feng@canonical.com>
To: Alan Stern <stern@rowland.harvard.edu>
Cc: AceLan Kao <acelan.kao@canonical.com>,
	Mathias Nyman <mathias.nyman@intel.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Thinh Nguyen <Thinh.Nguyen@synopsys.com>,
	Harry Pan <harry.pan@intel.com>,
	David Heinzelmann <heinzelmann.david@gmail.com>,
	Andrey Konovalov <andreyknvl@google.com>,
	Nicolas Saenz Julienne <nsaenzjulienne@suse.de>,
	Mathieu Malaterre <malat@debian.org>,
	USB list <linux-usb@vger.kernel.org>,
	"Linux-Kernel@Vger. Kernel. Org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] usb: hub: move resume delay at the head of all USB access functions
Date: Fri, 3 Jan 2020 13:26:13 +0800	[thread overview]
Message-ID: <C434F324-5498-4629-8C03-5B1BD01F3A8F@canonical.com> (raw)
In-Reply-To: <Pine.LNX.4.44L0.2001021605410.20954-100000@iolanthe.rowland.org>

Hi Alan,

> On Jan 3, 2020, at 05:37, Alan Stern <stern@rowland.harvard.edu> wrote:
> 
> On Wed, 25 Dec 2019, AceLan Kao wrote:
> 
>> Here[1] are the dmesg and the usbmon log from wireshark, and
>> /sys/kernel/debug/usb/usbmon/0u.
>> 
>> I verified this issue on Dell XPS 13 + Dell Salomon WD19 docking
>> station(plug-in 3 USB disk on it)
>> After s2idle 7 times, 2 usb disks lost. But from wireshark log, the
>> packets look normal, no error.
>> 
>> So, I re-do the test again and log the usbmon/0u output, but it's greek to me.
>> Hope you can help to find some clues in the logs.
>> Thanks.
>> 
>> 1. https://people.canonical.com/~acelan/bugs/usb_issue/
> 
> Maybe Mathias can help figure out what's going on.  It clearly is an 
> xHCI-related problem of some sort.

I think I figured this issue out and I'll send out a patch series soon.

> 
> I can't get much out of these logs.  For one thing, the time period
> covered by the usbmon trace is different from the time period in your
> dmesg log.  When you collect two kinds of logs for a test, it's
> important that they should be collected at the _same_ time!  Otherwise 
> they record different events, which is no use.
> 
> For another, your usbmon trace used the 0u file, but you should have
> used the 2u file instead.  All the errors you get involve devices on 
> bus 2; including other buses in the trace just makes it more confusing 
> and causes more entries to be dropped.
> 
> Also, I don't understand why your Dell Dock devices show up the way
> they do.  The SuperSpeed dock is on bus 2, and the high speed dock is
> on bus 3, which is attached to a different xHCI controller!  A hub
> (which is what a dock is) should show up twice, and both instances
> should be attached to the same controller.

I think this is how Ice Lake xHCI works, the Thunderbolt xHCI is also part of SoC on Ice Lake.

> 
> In any case, it's clear that you've got some problem involving the xHCI
> controller and the SuperSpeed Dell Dock (device 2-1).  Here's an
> excerpt from the dmesg log showing what goes wrong.  This occurs during
> a system resume; apparently the dock's connection drops while the
> system is suspended and there are errors when the system tries to
> re-activate it:
> 
> [  721.507202] usb 2-1: USB disconnect, device number 32
> [  721.507206] usb 2-1.3: USB disconnect, device number 33
> [  721.507209] usb 2-1.3.1: USB disconnect, device number 35
> [  721.540214] usb 2-1.3.3: USB disconnect, device number 36
> [  721.584518] usb 2-1.4: USB disconnect, device number 34
> [  721.647466] xhci_hcd 0000:00:0d.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
> 
> That's the first problem.
> 
> [  721.919433] usb 2-1: new SuperSpeedPlus Gen 2 USB device number 38 using xhci_hcd
> [  721.942597] usb 2-1: New USB device found, idVendor=0bda, idProduct=0487, bcdDevice= 1.47
> [  721.942600] usb 2-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
> [  721.942602] usb 2-1: Product: Dell dock
> [  721.942604] usb 2-1: Manufacturer: Dell Inc.
> [  721.944849] hub 2-1:1.0: USB hub found
> [  721.945232] hub 2-1:1.0: 4 ports detected
> [  722.178395] hub 2-1:1.0: hub_ext_port_status failed (err = -71)
> [  722.178570] usb 2-1-port3: attempt power cycle
> [  722.559212] usb 2-1: USB disconnect, device number 38
> [  722.559311] usb 2-1: Failed to suspend device, error -19
> 
> And that's a second error.  Then the system tries again:
> 
> [  722.831431] usb 2-1: new SuperSpeedPlus Gen 2 USB device number 43 using xhci_hcd
> [  722.854608] usb 2-1: New USB device found, idVendor=0bda, idProduct=0487, bcdDevice= 1.47
> [  722.854611] usb 2-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
> [  722.854614] usb 2-1: Product: Dell dock
> [  722.854616] usb 2-1: Manufacturer: Dell Inc.
> [  722.857034] hub 2-1:1.0: USB hub found
> [  722.857440] hub 2-1:1.0: 4 ports detected
> [  723.090106] hub 2-1:1.0: hub_ext_port_status failed (err = -71)
> [  723.090316] usb 2-1-port3: attempt power cycle
> [  723.471212] usb 2-1: USB disconnect, device number 43
> [  723.471274] usb 2-1: Failed to suspend device, error -19
> 
> And it fails in the same way.  A third try ends up succeeding:
> 
> [  723.743465] usb 2-1: new SuperSpeedPlus Gen 2 USB device number 48 using xhci_hcd
> ...
> 
> I don't see why the device numbers jump around in such a crazy manner.  
> When device 38 disconnects, the next device is number 43.  And when 
> that fails, the next number is 48!  It looks like device numbers are 
> somehow being allocated for the four child devices even though the log 
> doesn't show them being detected.
> 
> Here's a portion of the usbmon trace that apparently corresponds
> somewhat to part of the log above, although it's hard to be certain.  
> In this trace, device 77 on bus 2 is 2-1.  Unlike the log above, it
> only has a child attached to port 3, not to port 4:
> 
> ffff9879b43ff840 291366827 S Ci:2:001:0 s a3 00 0000 0001 0004 4 <
> ffff9879b43ff840 291366846 C Ci:2:001:0 0 4 = 03020000
> ffff9879b2f54780 291469875 S Ci:2:077:0 s a3 00 0000 0001 0004 4 <
> ffff9879b2f54780 291470062 C Ci:2:077:0 0 4 = a0020000
> ffff9879b2f54780 291470069 S Ci:2:077:0 s a3 00 0000 0002 0004 4 <
> ffff9879b2f54780 291470208 C Ci:2:077:0 0 4 = a0020000
> ffff9879b2f54780 291470214 S Ci:2:077:0 s a3 00 0000 0003 0004 4 <
> ffff9879b2f54780 291470387 C Ci:2:077:0 0 4 = 03021000
> ffff9879b2f54780 291470392 S Co:2:077:0 s 23 01 0014 0003 0000 0
> ffff9879b2f54780 291470527 C Co:2:077:0 0 0
> ffff9879b2f54780 291470533 S Ci:2:077:0 s a3 00 0000 0004 0004 4 <
> ffff9879b2f54780 291470709 C Ci:2:077:0 0 4 = a0020000
> ffff9879b43ff480 291573804 S Ii:2:077:1 -115:128 2 <
> ffff9879b2f54780 291573839 S Ci:2:077:0 s a3 00 0000 0003 0004 4 <
> ffff9879b2f54780 291574068 C Ci:2:077:0 0 4 = 03020000
> ffff9879b2f54780 291574166 S Ci:2:077:0 s a3 00 0000 0003 0004 4 <
> ffff9879b2f54780 291574363 C Ci:2:077:0 0 4 = 03020000
> ffff9879b2f54780 291574383 S Co:2:077:0 s 23 03 0004 0003 0000 0
> ffff9879b2f54780 291574580 C Co:2:077:0 0 0
> ffff9879b2f54780 291593804 S Ci:2:077:0 s a3 00 0002 0003 0008 8 <
> ffff9879ae2969c0 291595804 C Ii:2:001:1 0:2048 1 = 02
> ffff9879ae2969c0 291595824 S Ii:2:001:1 -115:2048 4 <
> ffff9879ba3476c0 291595843 S Ci:2:001:0 s a3 00 0000 0001 0004 4 <
> ffff9879ba3476c0 291595857 C Ci:2:001:0 0 4 = c0024100
> 
> As far as I can tell, the device was working fine until at this point
> it disconnected itself.
> 
> ffff9879ba3476c0 291595863 S Co:2:001:0 s 23 01 0010 0001 0000 0
> ffff9879ba3476c0 291595867 C Co:2:001:0 0 0
> ffff9879ba3476c0 291595870 S Co:2:001:0 s 23 01 0019 0001 0000 0
> ffff9879ba3476c0 291595875 C Co:2:001:0 0 0
> ffff9879ba3476c0 291595879 S Co:2:001:0 s 23 03 001c 0001 0000 0
> ffff9879ba3476c0 291595884 C Co:2:001:0 0 0
> ffff9879b2f54780 291600949 C Ci:2:077:0 -71 0
> ffff9879b2f54780 291600997 S Co:2:077:0 s 23 03 0004 0003 0000 0
> ffff9879b2f54780 291601000 E Co:2:077:0 -19 0
> 
> Mathias, any suggestions on how Acelan can track down the reasons for 
> these problems?

There are three issues for this hub:
- It may not be in U3 after suspend
- It may not be in U0 after resume
- It disconnects when setting downstream ports to U0 while the hub itself is not in U0.

I'll send a patch series soon.

Kai-Heng

> 
> Alan Stern


      reply	other threads:[~2020-01-03  5:27 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-20  2:59 [PATCH] usb: hub: move resume delay at the head of all USB access functions AceLan Kao
2019-12-20 15:48 ` Alan Stern
2019-12-23  4:29   ` AceLan Kao
2019-12-24 15:23     ` Alan Stern
2019-12-25  3:15       ` AceLan Kao
2019-12-26  4:13         ` Pan, Harry
2019-12-26  9:07           ` AceLan Kao
2020-01-02 21:37         ` Alan Stern
2020-01-03  5:26           ` Kai-Heng Feng [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=C434F324-5498-4629-8C03-5B1BD01F3A8F@canonical.com \
    --to=kai.heng.feng@canonical.com \
    --cc=Thinh.Nguyen@synopsys.com \
    --cc=acelan.kao@canonical.com \
    --cc=andreyknvl@google.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=harry.pan@intel.com \
    --cc=heinzelmann.david@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=malat@debian.org \
    --cc=mathias.nyman@intel.com \
    --cc=nsaenzjulienne@suse.de \
    --cc=stern@rowland.harvard.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).