From: Mathias Nyman <mathias.nyman@linux.intel.com>
To: Mason <slash.tmp@free.fr>,
	Mathias Nyman <mathias.nyman@intel.com>,
	Felipe Balbi <felipe.balbi@linux.intel.com>,
	linux-pci <linux-pci@vger.kernel.org>,
	linux-usb <linux-usb@vger.kernel.org>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>
Cc: Bjorn Helgaas <helgaas@kernel.org>,
	Alan Stern <stern@rowland.harvard.edu>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Subject: Re: Possible regression between 4.9 and 4.13
Date: Wed, 23 Aug 2017 14:11:38 +0300
Message-ID: <599D62EA.7050100@linux.intel.com>
In-Reply-To: <251c41c0-a4fd-8aae-88e0-5d5928ce45cf@free.fr>

On 23.08.2017 12:31, Mason wrote:
> On 23/08/2017 09:51, Mathias Nyman wrote:
>
>> very likely cause is the more aggressive detection of pci removed xhci hosts
>>
>> See commit d9f11ba9f107aa335091ab8d7ba5eea714e46e8b
>>       xhci: Rework how we handle unresponsive or hoptlug removed hosts
>>
>> It checks if an xhci register read returns 0xffffffff and assumes xhci
>> died in that case.
>>
>> Could you add something like the below to check what is killing the host?
>> Or a BUG()/WARN() in xhci_hc_died() to get a backtrace of who called it.
>
> [   46.525247] usb 2-2: new SuperSpeed USB device number 2 using xhci_hcd
> [   46.565496] usb-storage 2-2:1.0: USB Mass Storage device detected
> [   46.571934] scsi host0: usb-storage 2-2:1.0
> [   47.601227] scsi 0:0:0:0: Direct-Access     Kingston DataTraveler 3.0      PQ: 0 ANSI: 6
> [   47.611340] sd 0:0:0:0: [sda] 15109516 512-byte logical blocks: (7.74 GB/7.20 GiB)
> [   47.621624] sd 0:0:0:0: [sda] Write Protect is off
> [   47.627131] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
> [   47.639637]  sda: sda1
> [   47.648091] sd 0:0:0:0: [sda] Attached SCSI removable disk
> [   58.100306] xhci_hcd 0000:01:00.0: xHCI host controller not responding, assume dead
> [   58.108021] CPU: 0 PID: 939 Comm: kworker/0:2 Tainted: G         C      4.13.0-rc6 #11
> [   58.115976] Hardware name: Sigma Tango DT
> [   58.120016] Workqueue: usb_hub_wq hub_event
> [   58.124241] [<c010f288>] (unwind_backtrace) from [<c010af58>] (show_stack+0x10/0x14)
> [   58.132033] [<c010af58>] (show_stack) from [<c049d714>] (dump_stack+0x84/0x98)
> [   58.139302] [<c049d714>] (dump_stack) from [<c03b090c>] (xhci_hc_died.part.9+0x50/0x23c)
> [   58.147438] [<c03b090c>] (xhci_hc_died.part.9) from [<c03b5d80>] (xhci_hub_control+0xf3c/0x175c)
> [   58.156273] [<c03b5d80>] (xhci_hub_control) from [<c03934a4>] (usb_hcd_submit_urb+0x264/0x814)
> [   58.164932] [<c03934a4>] (usb_hcd_submit_urb) from [<c0394fa4>] (usb_start_wait_urb+0x4c/0xbc)
> [   58.173591] [<c0394fa4>] (usb_start_wait_urb) from [<c03950b4>] (usb_control_msg+0xa0/0xcc)
> [   58.181985] [<c03950b4>] (usb_control_msg) from [<c038bf54>] (usb_clear_port_feature+0x44/0x4c)
> [   58.190730] [<c038bf54>] (usb_clear_port_feature) from [<c038c320>] (hub_port_reset+0x228/0x51c)
> [   58.199561] [<c038c320>] (hub_port_reset) from [<c038fd68>] (hub_event+0x87c/0x108c)
> [   58.207349] [<c038fd68>] (hub_event) from [<c012ecc4>] (process_one_work+0x1d8/0x3f0)
> [   58.215220] [<c012ecc4>] (process_one_work) from [<c012f8d8>] (worker_thread+0x38/0x554)
> [   58.223354] [<c012f8d8>] (worker_thread) from [<c01347d0>] (kthread+0x108/0x138)
> [   58.230789] [<c01347d0>] (kthread) from [<c01076d8>] (ret_from_fork+0x14/0x3c)
> [   58.238056] xhci_hcd 0000:01:00.0: HC died; cleaning up
> [   58.243391] usb 2-2: USB disconnect, device number 2
> --

The xhci driver reads 0xffffffff from a memory-mapped xHCI PORTSC register and bails out in
xhci-hub.c:

	temp = readl(port_array[wIndex]);
	if (temp == ~(u32)0) {
		xhci_hc_died(xhci);
		retval = -ENODEV;
		break;
	}

In this case the register is read when the hub thread asks to clear a port feature.
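
As a rough illustration only (a user-space sketch, not the driver code), the check
boils down to treating an all-ones register read as a missing or unresponsive host.
The names reg_read_looks_dead(), sketch_clear_port_feature() and SKETCH_ENODEV are
made up for this example:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's ENODEV errno value. */
#define SKETCH_ENODEV 19

/*
 * A read from a PCI device that does not respond typically completes
 * with all bits set, so 0xffffffff is taken to mean "host is gone".
 */
static bool reg_read_looks_dead(uint32_t val)
{
	return val == ~(uint32_t)0;
}

/*
 * Hypothetical model of the hub-control path: given the value read
 * from PORTSC, either flag the host as dead and fail, or carry on.
 */
static int sketch_clear_port_feature(uint32_t portsc, bool *host_died)
{
	if (reg_read_looks_dead(portsc)) {
		*host_died = true;	/* models the xhci_hc_died(xhci) call */
		return -SKETCH_ENODEV;
	}

	*host_died = false;
	return 0;
}
```

Note that the check cannot tell a removed controller from one that is merely
unreachable at the time of the read; both look the same from the CPU side.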

Why PORTSC returns 0xffffffff is another question. Could the hub thread be running while the xhci controller is in D3?
Was xhci runtime suspended?
There were some pcieport errors in another log you showed; maybe the PCI devices are not properly recovered
and the registers return 0xffffffff?
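
One way to check the runtime PM state from user space is to read the device's
power/runtime_status attribute in sysfs, for example
/sys/bus/pci/devices/0000:01:00.0/power/runtime_status for the controller in the
log above. A minimal sketch, assuming that path exists on the system;
read_sysfs_line() is a hypothetical helper name:

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of a (sysfs) text attribute into buf and strip
 * the trailing newline. Returns 0 on success, -1 on any failure.
 */
static int read_sysfs_line(const char *path, char *buf, size_t len)
{
	FILE *f;

	if (len == 0)
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

The attribute reports strings such as "active" or "suspended", so comparing the
result with strcmp() is enough to see whether the controller was runtime
suspended when the hub thread ran.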

-Mathias
