linux-usb.vger.kernel.org archive mirror
* USB regression in kernel 6.2.2
@ 2023-03-07 13:21 Brian Morrison
  2023-03-08  9:52 ` Linux regression tracking #adding (Thorsten Leemhuis)
  2023-03-08 15:16 ` Mathias Nyman
  0 siblings, 2 replies; 11+ messages in thread
From: Brian Morrison @ 2023-03-07 13:21 UTC (permalink / raw)
  To: mathias.nyman; +Cc: linux-usb

[-- Attachment #1: Type: text/plain, Size: 1298 bytes --]

Hello Mathias (sorry you're getting this twice).

Re-send after linux-usb list rejection (too big).

Hans de Goede replied to my Fedora kernel bug here:

https://bugzilla.redhat.com/show_bug.cgi?id=2175534

suggesting that I contact you about it and Cc: the linux-usb list.

Starting with kernel-6.2.2-300 on Fedora x86_64 (the first 6.2 kernel on
Fedora 37) I am seeing problems with USB devices on a Renesas
ROM-based USB PCI card, which works normally with kernel-6.1.15-200 and
earlier 6.x kernels. Essentially, the USB 2.0 device tree on this card's
bus is not being enumerated, with the result that my /dev/ttyUSB*
devices are no longer present (these are Silicon Labs CP210x UARTs with
TI PCM290x devices behind them).

I have attached the lsusb -t output for the working and broken cases. I
don't know where the problem lies, but I suspect it's not udev because
the configuration is unchanged; it seems to be in the kernel USB code.

There are further attachments in the bug referred to above. I don't
know if they help, but you can look there if the lsusb output is
insufficient. I can point out that lsmod does show the cp210x module is
loaded, which may provide a clue about where things are failing.

Thanks for reading this, I look forward to hearing your suggestions.

-- 

Brian Morrison


[-- Attachment #2: lsusb_t_6.1.15 --]
[-- Type: application/octet-stream, Size: 2669 bytes --]

/:  Bus 04.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 5000M
/:  Bus 03.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 480M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 12M
        |__ Port 1: Dev 4, If 0, Class=Vendor Specific Class, Driver=cp210x, 12M
        |__ Port 4: Dev 6, If 0, Class=Audio, Driver=snd-usb-audio, 12M
        |__ Port 4: Dev 6, If 1, Class=Audio, Driver=snd-usb-audio, 12M
        |__ Port 4: Dev 6, If 2, Class=Audio, Driver=snd-usb-audio, 12M
        |__ Port 4: Dev 6, If 3, Class=Human Interface Device, Driver=usbhid, 12M
    |__ Port 2: Dev 3, If 0, Class=Audio, Driver=snd-usb-audio, 12M
    |__ Port 2: Dev 3, If 1, Class=Audio, Driver=snd-usb-audio, 12M
    |__ Port 2: Dev 3, If 2, Class=Human Interface Device, Driver=usbhid, 12M
    |__ Port 3: Dev 5, If 0, Class=Video, Driver=uvcvideo, 480M
    |__ Port 3: Dev 5, If 1, Class=Video, Driver=uvcvideo, 480M
    |__ Port 3: Dev 5, If 2, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 3: Dev 5, If 3, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 4: Dev 7, If 0, Class=Hub, Driver=hub/4p, 480M
        |__ Port 1: Dev 8, If 0, Class=Vendor Specific Class, Driver=cp210x, 12M
        |__ Port 2: Dev 9, If 0, Class=Vendor Specific Class, Driver=cp210x, 12M
        |__ Port 4: Dev 10, If 0, Class=Audio, Driver=snd-usb-audio, 12M
        |__ Port 4: Dev 10, If 1, Class=Audio, Driver=snd-usb-audio, 12M
        |__ Port 4: Dev 10, If 2, Class=Audio, Driver=snd-usb-audio, 12M
        |__ Port 4: Dev 10, If 3, Class=Human Interface Device, Driver=usbhid, 12M
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/8p, 10000M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/16p, 480M
    |__ Port 5: Dev 2, If 0, Class=Audio, Driver=snd-usb-audio, 12M
    |__ Port 5: Dev 2, If 1, Class=Audio, Driver=snd-usb-audio, 12M
    |__ Port 5: Dev 2, If 2, Class=Audio, Driver=snd-usb-audio, 12M
    |__ Port 5: Dev 2, If 3, Class=Human Interface Device, Driver=usbhid, 12M
    |__ Port 6: Dev 3, If 0, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 6: Dev 3, If 1, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 6: Dev 3, If 2, Class=Human Interface Device, Driver=usbhid, 480M
    |__ Port 9: Dev 4, If 0, Class=Human Interface Device, Driver=usbhid, 1.5M
    |__ Port 10: Dev 5, If 0, Class=Human Interface Device, Driver=usbhid, 1.5M
    |__ Port 10: Dev 5, If 1, Class=Human Interface Device, Driver=usbhid, 1.5M
    |__ Port 12: Dev 6, If 0, Class=Audio, Driver=snd-usb-audio, 12M
    |__ Port 12: Dev 6, If 1, Class=Audio, Driver=snd-usb-audio, 12M
    |__ Port 12: Dev 6, If 2, Class=Human Interface Device, Driver=usbhid, 12M

[-- Attachment #3: lsusb_t_6.2.2 --]
[-- Type: application/octet-stream, Size: 1493 bytes --]

/:  Bus 04.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 5000M
/:  Bus 03.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/8p, 10000M
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 480M
    |__ Port 3: Dev 5, If 0, Class=Video, Driver=uvcvideo, 480M
    |__ Port 3: Dev 5, If 1, Class=Video, Driver=uvcvideo, 480M
    |__ Port 3: Dev 5, If 2, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 3: Dev 5, If 3, Class=Audio, Driver=snd-usb-audio, 480M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/16p, 480M
    |__ Port 5: Dev 2, If 0, Class=Audio, Driver=snd-usb-audio, 12M
    |__ Port 5: Dev 2, If 1, Class=Audio, Driver=snd-usb-audio, 12M
    |__ Port 5: Dev 2, If 2, Class=Audio, Driver=snd-usb-audio, 12M
    |__ Port 5: Dev 2, If 3, Class=Human Interface Device, Driver=usbhid, 12M
    |__ Port 6: Dev 3, If 0, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 6: Dev 3, If 1, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 6: Dev 3, If 2, Class=Human Interface Device, Driver=usbhid, 480M
    |__ Port 9: Dev 4, If 0, Class=Human Interface Device, Driver=usbhid, 1.5M
    |__ Port 10: Dev 5, If 0, Class=Human Interface Device, Driver=usbhid, 1.5M
    |__ Port 10: Dev 5, If 1, Class=Human Interface Device, Driver=usbhid, 1.5M
    |__ Port 12: Dev 6, If 0, Class=Audio, Driver=snd-usb-audio, 12M
    |__ Port 12: Dev 6, If 1, Class=Audio, Driver=snd-usb-audio, 12M
    |__ Port 12: Dev 6, If 2, Class=Human Interface Device, Driver=usbhid, 12M
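
The two listings above can also be compared mechanically. What follows is a
minimal Python sketch, not part of the original report: it counts interfaces
per bound driver in saved lsusb -t output and reports what is present in the
working listing but absent from the broken one. The function names are
hypothetical, and bus numbers are deliberately ignored because they differ
between the two boots.

```python
import re
from collections import Counter

# Matches a non-root entry of `lsusb -t`, e.g.
#   |__ Port 1: Dev 4, If 0, Class=Vendor Specific Class, Driver=cp210x, 12M
# Root hub lines start with "/:" and are therefore skipped.
ENTRY = re.compile(r"\|__ Port \d+: Dev \d+, If \d+, Class=.+?, Driver=([^,]*),")


def driver_counts(lsusb_t_output: str) -> Counter:
    """Count interfaces per bound driver, ignoring root hubs and bus numbers."""
    return Counter(m.group(1) for m in ENTRY.finditer(lsusb_t_output))


def missing_drivers(working: str, broken: str) -> Counter:
    """Interfaces present in the working listing but absent from the broken one."""
    # Counter subtraction keeps only positive differences.
    return driver_counts(working) - driver_counts(broken)
```

Fed the contents of the two attachments (lsusb_t_6.1.15 and lsusb_t_6.2.2),
missing_drivers() should report the cp210x and hub entries among the
missing interfaces.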


* Re: USB regression in kernel 6.2.2
  2023-03-07 13:21 USB regression in kernel 6.2.2 Brian Morrison
@ 2023-03-08  9:52 ` Linux regression tracking #adding (Thorsten Leemhuis)
  2023-03-08 15:16 ` Mathias Nyman
  1 sibling, 0 replies; 11+ messages in thread
From: Linux regression tracking #adding (Thorsten Leemhuis) @ 2023-03-08  9:52 UTC (permalink / raw)
  To: Brian Morrison, mathias.nyman; +Cc: linux-usb, Linux kernel regressions list

[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few templates
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

On 07.03.23 14:21, Brian Morrison wrote:
> Hello Mathias (sorry you're getting this twice).
> 
> Re-send after linux-usb list rejection (too big).
> 
> Hans de Goede replied to my Fedora kernel bug here:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=2175534
> 
> suggesting that I contact you about it and Cc: the linux-usb list.
> 
> Starting with kernel-6.2.2-300 on Fedora x86_64 (the first 6.2 kernel on
> Fedora 37) I am seeing problems with USB devices on a Renesas
> ROM-based USB PCI card which works normally with kernel-6.1.15-200 and
> earlier 6.x kernels, essentially the USB 2.0 device tree on this card's
> bus is not being enumerated with the result that my /dev/ttyUSB*
> devices are no longer present (these are Silicon Labs CP210x UARTS with
> TI PCM290x devices behind them).
> 
> I have attached the lsusb -t output for the working and broken cases, I
> don't know where the problem lies but I suspect it's not udev because
> the configuration is unchanged, it seems to be in the kernel usb code.
> 
> There are further attachments in the bug referred to above, I don't
> know if they help but you can look there if the lsusb output is
> insufficient, I can point out that lsmod does show the cp210x module is
> loaded which may provide a clue about where things are failing.
> 
> Thanks for reading this, I look forward to hearing your suggestions.

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced v6.1..v6.2
#regzbot title usb: USB 2.0 device tree not enumerated on Renesas
ROM-based USB PCI card
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.


* Re: USB regression in kernel 6.2.2
  2023-03-07 13:21 USB regression in kernel 6.2.2 Brian Morrison
  2023-03-08  9:52 ` Linux regression tracking #adding (Thorsten Leemhuis)
@ 2023-03-08 15:16 ` Mathias Nyman
  2023-03-08 16:45   ` Brian Morrison
  2023-03-09 20:04   ` Brian Morrison
  1 sibling, 2 replies; 11+ messages in thread
From: Mathias Nyman @ 2023-03-08 15:16 UTC (permalink / raw)
  To: Brian Morrison, mathias.nyman; +Cc: linux-usb

On 7.3.2023 15.21, Brian Morrison wrote:
> Hello Mathias (sorry you're getting this twice).
> 
> Re-send after linux-usb list rejection (too big).
> 
> Hans de Goede replied to my Fedora kernel bug here:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=2175534
> 
> suggesting that I contact you about it and Cc: the linux-usb list.
> 
> Starting with kernel-6.2.2-300 on Fedora x86_64 (the first 6.2 kernel on
> Fedora 37) I am seeing problems with USB devices on a Renesas
> ROM-based USB PCI card which works normally with kernel-6.1.15-200 and
> earlier 6.x kernels, essentially the USB 2.0 device tree on this card's
> bus is not being enumerated with the result that my /dev/ttyUSB*
> devices are no longer present (these are Silicon Labs CP210x UARTS with
> TI PCM290x devices behind them).
> 
> I have attached the lsusb -t output for the working and broken cases, I
> don't know where the problem lies but I suspect it's not udev because
> the configuration is unchanged, it seems to be in the kernel usb code.
> 
> There are further attachments in the bug referred to above, I don't
> know if they help but you can look there if the lsusb output is
> insufficient, I can point out that lsmod does show the cp210x module is
> loaded which may provide a clue about where things are failing.
> 
> Thanks for reading this, I look forward to hearing your suggestions.
> 

It looks like those devices initially enumerated fine, but suddenly
disconnected about 19 seconds after boot.

[   19.155556] usb 2-1.1: USB disconnect, device number 4
[   19.155685] cp210x ttyUSB0: cp210x converter now disconnected from ttyUSB0
[   19.159290] usb 2-1.4: USB disconnect, device number 6
[   19.242344] usb 2-1.4: 3:0: failed to get current value for ch 0 (-22)
[   20.100761] usb 2-4.1: USB disconnect, device number 8
[   20.100894] cp210x ttyUSB1: cp210x converter now disconnected from ttyUSB1
[   20.100999] cp210x 2-4.1:1.0: device disconnected
[   20.107188] usb 2-4.2: USB disconnect, device number 9
[   20.107253] cp210x ttyUSB2: cp210x converter now disconnected from ttyUSB2
[   20.107284] cp210x 2-4.2:1.0: device disconnected
[   20.111938] usb 2-4.4: USB disconnect, device number 10
[   20.181363] usb 2-4.4: 3:0: failed to get current value for ch 0 (-22)

Interestingly those are all the devices behind external hubs.
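
That can be checked against a saved dmesg. In USB device names of the form
bus-port.port, a dot in the port chain means at least one hub sits between
the root hub and the device. A minimal Python sketch, assuming only the
"USB disconnect" message format shown above; the function names are
illustrative and do not come from any kernel tool:

```python
import re

# Matches kernel lines such as
#   [   19.155556] usb 2-1.1: USB disconnect, device number 4
DISCONNECT = re.compile(
    r"\[\s*(\d+\.\d+)\] usb (\d+-[\d.]+): USB disconnect, device number (\d+)"
)


def disconnects(dmesg: str):
    """Return (timestamp, device path, device number) for each disconnect."""
    return [(float(t), path, int(num)) for t, path, num in DISCONNECT.findall(dmesg)]


def behind_hub(path: str) -> bool:
    """True if the port chain after 'bus-' has more than one hop."""
    return "." in path.split("-", 1)[1]
```

Applying behind_hub() to every path returned by disconnects() shows whether
all of the dropped devices sit behind an external hub.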

Bisecting this to find the offending commit would be best, but a dmesg
with xhci and usb core dynamic debug enabled could also show why those devices
disconnect.

Adding "usbcore.dyndbg=+p xhci_hcd.dyndbg=+p" to your kernel cmdline
should do this.

Thanks
Mathias



* Re: USB regression in kernel 6.2.2
  2023-03-08 15:16 ` Mathias Nyman
@ 2023-03-08 16:45   ` Brian Morrison
  2023-03-09 20:04   ` Brian Morrison
  1 sibling, 0 replies; 11+ messages in thread
From: Brian Morrison @ 2023-03-08 16:45 UTC (permalink / raw)
  To: linux-usb

On Wed, 8 Mar 2023 17:16:01 +0200
Mathias Nyman <mathias.nyman@linux.intel.com> wrote:

> Looks like that those devices initially enumerated fine, but suddenly
> disconnect about 19 seconds after boot.
> 
> [   19.155556] usb 2-1.1: USB disconnect, device number 4
> [   19.155685] cp210x ttyUSB0: cp210x converter now disconnected from ttyUSB0
> [   19.159290] usb 2-1.4: USB disconnect, device number 6
> [   19.242344] usb 2-1.4: 3:0: failed to get current value for ch 0 (-22)
> [   20.100761] usb 2-4.1: USB disconnect, device number 8
> [   20.100894] cp210x ttyUSB1: cp210x converter now disconnected from ttyUSB1
> [   20.100999] cp210x 2-4.1:1.0: device disconnected
> [   20.107188] usb 2-4.2: USB disconnect, device number 9
> [   20.107253] cp210x ttyUSB2: cp210x converter now disconnected from ttyUSB2
> [   20.107284] cp210x 2-4.2:1.0: device disconnected
> [   20.111938] usb 2-4.4: USB disconnect, device number 10
> [   20.181363] usb 2-4.4: 3:0: failed to get current value for ch 0 (-22)
> 
> Interestingly those are all the devices behind external hubs.

These are in amateur radio gear, providing sound card modems and radio
CAT control on USB2 ports, but they have given no trouble since the
Renesas USB3 PCI card ROM load bug was sorted out a couple of years ago
I think.

> 
> Bisecting this to find the offending commit would be best, but a dmesg
> with xhci and usb core dynamic debug enabled could also show why
> those devices disconnect.
> 
> Adding "usbcore.dyndbg=+p xhci_hcd.dyndbg=+p" to your kernel cmdline
> should do this.

OK, I have done this and attached the dmesg output (which has expanded
by a factor of 3 with the extra debug).

A quick grep reveals these, which are not expected:

[bdm@deangelis ~]$ grep usb_disable_device dmesg_6.2.2_debug.txt
[   18.349015] usb 2-1.1: usb_disable_device nuking all URBs
[   18.587034] usb 2-1.4: usb_disable_device nuking all URBs
[   18.589675] usb 2-1: usb_disable_device nuking non-ep0 URBs
[   19.280599] usb 2-2: usb_disable_device nuking non-ep0 URBs
[   19.288312] usb 2-4.1: usb_disable_device nuking all URBs
[   19.298113] usb 2-4.2: usb_disable_device nuking all URBs
[   19.386494] usb 2-4.4: usb_disable_device nuking all URBs
[   19.390100] usb 2-4: usb_disable_device nuking non-ep0 URBs

which are then followed by:

xhci_drop_endpoint from the xhci_hcd driver which seems expected given
the usb_disable_device being commanded.

That's about as far as I know how to go. I have not sent the attachment
to the linux-usb list because of its size; I have added it to the
Red Hat bug at:

https://bugzilla.redhat.com/show_bug.cgi?id=2175534

I suppose this should get to the list without the attachment.

-- 

Brian Morrison



* Re: USB regression in kernel 6.2.2
  2023-03-08 15:16 ` Mathias Nyman
  2023-03-08 16:45   ` Brian Morrison
@ 2023-03-09 20:04   ` Brian Morrison
  2023-03-12  0:03     ` Brian Morrison
  1 sibling, 1 reply; 11+ messages in thread
From: Brian Morrison @ 2023-03-09 20:04 UTC (permalink / raw)
  To: Mathias Nyman; +Cc: linux-usb, Linux kernel regressions list

On Wed, 8 Mar 2023 17:16:01 +0200
Mathias Nyman <mathias.nyman@linux.intel.com> wrote:
 
> 
> Looks like that those devices initially enumerated fine, but suddenly
> disconnect about 19 seconds after boot.
> 
> [   19.155556] usb 2-1.1: USB disconnect, device number 4
> [   19.155685] cp210x ttyUSB0: cp210x converter now disconnected from ttyUSB0
> [   19.159290] usb 2-1.4: USB disconnect, device number 6
> [   19.242344] usb 2-1.4: 3:0: failed to get current value for ch 0 (-22)
> [   20.100761] usb 2-4.1: USB disconnect, device number 8
> [   20.100894] cp210x ttyUSB1: cp210x converter now disconnected from ttyUSB1
> [   20.100999] cp210x 2-4.1:1.0: device disconnected
> [   20.107188] usb 2-4.2: USB disconnect, device number 9
> [   20.107253] cp210x ttyUSB2: cp210x converter now disconnected from ttyUSB2
> [   20.107284] cp210x 2-4.2:1.0: device disconnected
> [   20.111938] usb 2-4.4: USB disconnect, device number 10
> [   20.181363] usb 2-4.4: 3:0: failed to get current value for ch 0 (-22)
> 
> Interestingly those are all the devices behind external hubs.
> 
> Bisecting this to find the offending commit would be best, but a dmesg
> with xhci and usb core dynamic debug enabled could also show why
> those devices disconnect.
> 
> Adding "usbcore.dyndbg=+p xhci_hcd.dyndbg=+p" to your kernel cmdline
> should do this.

In addition to the debug output I have been looking at the diff between
kernel-6.1 and kernel-6.2 in the /drivers/usb tree, in particular under
/drivers/usb/core/hub.h and /drivers/usb/core/hub.c where the vendor
for this device with VID 0451 is newly listed although its PID is not:

Bus 003 Device 002: ID 0451:2046 Texas Instruments, Inc. TUSB2046 Hub

this device is missing from lsusb output in kernel 6.2.2 but is present
with kernel 6.1.*

In my inexpert way I think it is all tied in to changes from a few
months ago (November 2022) that went into the 6.2-rc kernels, where the
early_stop capability was added to USB enumeration, but I am certainly
not smart enough to identify exactly why the particular combination of
hardware I have is caught up in it. I can see from the extended dmesg
output that certain USB interfaces are unregistered for no obvious
reason, and that once this happens they are invisible to the OS. The
altered USB core code would seem to be a prime suspect as the cause of
this regression.

-- 

Brian Morrison



* Re: USB regression in kernel 6.2.2
  2023-03-09 20:04   ` Brian Morrison
@ 2023-03-12  0:03     ` Brian Morrison
  2023-03-13 10:06       ` Mathias Nyman
  0 siblings, 1 reply; 11+ messages in thread
From: Brian Morrison @ 2023-03-12  0:03 UTC (permalink / raw)
  To: Mathias Nyman; +Cc: linux-usb, Linux kernel regressions list

On Thu, 9 Mar 2023 20:04:15 +0000
Brian Morrison <bdm@fenrir.org.uk> wrote:

> On Wed, 8 Mar 2023 17:16:01 +0200
> Mathias Nyman <mathias.nyman@linux.intel.com> wrote:
>  
> > 
> > Looks like that those devices initially enumerated fine, but
> > suddenly disconnect about 19 seconds after boot.
> > 
> > [   19.155556] usb 2-1.1: USB disconnect, device number 4
> > [   19.155685] cp210x ttyUSB0: cp210x converter now disconnected from ttyUSB0
> > [   19.159290] usb 2-1.4: USB disconnect, device number 6
> > [   19.242344] usb 2-1.4: 3:0: failed to get current value for ch 0 (-22)
> > [   20.100761] usb 2-4.1: USB disconnect, device number 8
> > [   20.100894] cp210x ttyUSB1: cp210x converter now disconnected from ttyUSB1
> > [   20.100999] cp210x 2-4.1:1.0: device disconnected
> > [   20.107188] usb 2-4.2: USB disconnect, device number 9
> > [   20.107253] cp210x ttyUSB2: cp210x converter now disconnected from ttyUSB2
> > [   20.107284] cp210x 2-4.2:1.0: device disconnected
> > [   20.111938] usb 2-4.4: USB disconnect, device number 10
> > [   20.181363] usb 2-4.4: 3:0: failed to get current value for ch 0 (-22)
> > 
> > Interestingly those are all the devices behind external hubs.
> > 
> > Bisecting this to find the offending commit would be best, but a
> > dmesg with xhci and usb core dynamic debug enabled could also show
> > why those devices disconnect.
> > 
> > Adding "usbcore.dyndbg=+p xhci_hcd.dyndbg=+p" to your kernel cmdline
> > should do this.  
> 
> In addition to the debug output I have been looking at the diff
> between kernel-6.1 and kernel-6.2 in the /drivers/usb tree, in
> particular under /drivers/usb/core/hub.h and /drivers/usb/core/hub.c
> where the vendor for this device with VID 0451 is newly listed
> although its PID is not:
> 
> Bus 003 Device 002: ID 0451:2046 Texas Instruments, Inc. TUSB2046 Hub
> 
> this device is missing from lsusb output in kernel 6.2.2 but is
> present with kernel 6.1.*

I was wrong about this, it's the devices on the far side of the TI and
SMSC hub devices that are missing, not the hubs themselves.

> 
> In my inexpert way I think it is all tied in to changes from a few
> months ago (November 2022) that went into the 6.2rc kernels where the
> early_stop capability was added to USB enumeration but I am certainly
> not smart enough to identify exactly why the particular combination of
> hardware I have is caught up in it. I can see from the extended dmesg
> output that certain USB interfaces are unregistered for no obvious
> reason and that once this happens they are invisible to the OS. The
> altered USB core code would seem to be a prime suspect as the cause of
> this regression.
> 

Further testing with kernels 6.1.18 and 6.2.5 is added to the bug entry
here:

https://bugzilla.redhat.com/show_bug.cgi?id=2175534#c12

I don't know how to bisect this with the available Fedora kernels.

-- 

Brian Morrison



* Re: USB regression in kernel 6.2.2
  2023-03-12  0:03     ` Brian Morrison
@ 2023-03-13 10:06       ` Mathias Nyman
  2023-03-14 14:00         ` Brian Morrison
  0 siblings, 1 reply; 11+ messages in thread
From: Mathias Nyman @ 2023-03-13 10:06 UTC (permalink / raw)
  To: Brian Morrison; +Cc: linux-usb, Linux kernel regressions list

On 12.3.2023 2.03, Brian Morrison wrote:
> On Thu, 9 Mar 2023 20:04:15 +0000
> Brian Morrison <bdm@fenrir.org.uk> wrote:
> 
>> On Wed, 8 Mar 2023 17:16:01 +0200
>> Mathias Nyman <mathias.nyman@linux.intel.com> wrote:
>>   
>>>
>>> Looks like that those devices initially enumerated fine, but
>>> suddenly disconnect about 19 seconds after boot.
>>>
>>> [   19.155556] usb 2-1.1: USB disconnect, device number 4
>>> [   19.155685] cp210x ttyUSB0: cp210x converter now disconnected from ttyUSB0
>>> [   19.159290] usb 2-1.4: USB disconnect, device number 6
>>> [   19.242344] usb 2-1.4: 3:0: failed to get current value for ch 0 (-22)
>>> [   20.100761] usb 2-4.1: USB disconnect, device number 8
>>> [   20.100894] cp210x ttyUSB1: cp210x converter now disconnected from ttyUSB1
>>> [   20.100999] cp210x 2-4.1:1.0: device disconnected
>>> [   20.107188] usb 2-4.2: USB disconnect, device number 9
>>> [   20.107253] cp210x ttyUSB2: cp210x converter now disconnected from ttyUSB2
>>> [   20.107284] cp210x 2-4.2:1.0: device disconnected
>>> [   20.111938] usb 2-4.4: USB disconnect, device number 10
>>> [   20.181363] usb 2-4.4: 3:0: failed to get current value for ch 0 (-22)
>>>
>>> Interestingly those are all the devices behind external hubs.
>>>
>>> Bisecting this to find the offending commit would be best, but a
>>> dmesg with xhci and usb core dynamic debug enabled could also show
>>> why those devices disconnect.
>>>
>>> Adding "usbcore.dyndbg=+p xhci_hcd.dyndbg=+p" to your kernel cmdline
>>> should do this.
>>
>> In addition to the debug output I have been looking at the diff
>> between kernel-6.1 and kernel-6.2 in the /drivers/usb tree, in
>> particular under /drivers/usb/core/hub.h and /drivers/usb/core/hub.c
>> where the vendor for this device with VID 0451 is newly listed
>> although its PID is not:
>>
>> Bus 003 Device 002: ID 0451:2046 Texas Instruments, Inc. TUSB2046 Hub
>>
>> this device is missing from lsusb output in kernel 6.2.2 but is
>> present with kernel 6.1.*
> 
> I was wrong about this, it's the devices on the far side of the TI and
> SMSC hub devices that are missing, not the hubs themselves.
> 
>>
>> In my inexpert way I think it is all tied in to changes from a few
>> months ago (November 2022) that went into the 6.2rc kernels where the
>> early_stop capability was added to USB enumeration but I am certainly
>> not smart enough to identify exactly why the particular combination of
>> hardware I have is caught up in it. I can see from the extended dmesg
>> output that certain USB interfaces are unregistered for no obvious
>> reason and that once this happens they are invisible to the OS. The
>> altered USB core code would seem to be a prime suspect as the cause of
>> this regression.
>>
> 
> Further testing with kernels 6.1.18 and 6.2.5 is added to the bug entry
> here:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=2175534#c12
> 
> I don't know how to bisect this with the available Fedora kernels.
> 


In your v6.2 logs the USB bus numbers are interleaved; in the v6.1 logs
they are not. The xhci driver registers two USB buses per host, one
High-Speed and one SuperSpeed.

in v6.2:

[    1.094679] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 1
[    1.094695] xhci_hcd 0000:04:00.0: new USB bus registered, assigned bus number 2
[    1.096690] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 3
[    1.100549] xhci_hcd 0000:04:00.0: new USB bus registered, assigned bus number 4

in 6.1:

[    1.071987] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 1
[    1.073300] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 2
[    1.076445] xhci_hcd 0000:04:00.0: new USB bus registered, assigned bus number 3
[    1.082133] xhci_hcd 0000:04:00.0: new USB bus registered, assigned bus number 4

0000:00:14.0 is your Intel xHC
0000:04:00.0 is your Renesas xHC
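
For anyone wanting to spot this interleaving in a saved dmesg, here is a small
Python sketch. It assumes only the "new USB bus registered" message format
shown above; the function names are made up for illustration.

```python
import re
from collections import defaultdict

# Matches e.g.
#   xhci_hcd 0000:04:00.0: new USB bus registered, assigned bus number 2
REGISTERED = re.compile(
    r"xhci_hcd (\S+): new USB bus registered, assigned bus number (\d+)"
)


def buses_by_host(dmesg: str) -> dict:
    """Map each xHC PCI address to the bus numbers it was assigned, in log order."""
    hosts = defaultdict(list)
    for addr, bus in REGISTERED.findall(dmesg):
        hosts[addr].append(int(bus))
    return dict(hosts)


def interleaved_hosts(dmesg: str) -> list:
    """Hosts whose bus numbers are not consecutive (another host got one in between)."""
    return [
        addr
        for addr, buses in buses_by_host(dmesg).items()
        if max(buses) - min(buses) + 1 != len(buses)
    ]
```

On the v6.2 lines above this flags both hosts; on the v6.1 lines it flags
neither.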

This change could be due to 6.2 commit:

4c2604a9a689 usb: xhci-pci: Set PROBE_PREFER_ASYNCHRONOUS

Not sure why it would cause this regression, but worth testing it.

Can you try to revert that commit?
Or alternatively unbind and rebind the hosts from the xhci driver:

echo 0000:00:14.0 > /sys/bus/pci/drivers/xhci_hcd/unbind
echo 0000:04:00.0 > /sys/bus/pci/drivers/xhci_hcd/unbind

(all your usb devices should now be disconnected)

echo 0000:00:14.0 > /sys/bus/pci/drivers/xhci_hcd/bind
<wait a couple seconds>
echo 0000:04:00.0 > /sys/bus/pci/drivers/xhci_hcd/bind
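
The same sequence as a Python sketch, for anyone who prefers a script. The
sysfs paths and PCI addresses are the ones above; the two-second delay, the
function names and the dry-run default are assumptions made for illustration.
Writing to these files needs root and disconnects every USB device on the
affected hosts, keyboards included.

```python
import time

DRIVER_DIR = "/sys/bus/pci/drivers/xhci_hcd"


def rebind_steps(hosts, delay=2.0):
    """Build the ordered steps: unbind every host, then bind them one by one.

    Each step is ("write", path, value) or ("sleep", seconds).
    """
    steps = [("write", f"{DRIVER_DIR}/unbind", h) for h in hosts]
    for i, h in enumerate(hosts):
        if i:
            # Pause between binds; the length of the pause is an assumption.
            steps.append(("sleep", delay))
        steps.append(("write", f"{DRIVER_DIR}/bind", h))
    return steps


def run(steps, dry_run=True):
    """Print the steps; only touch sysfs when dry_run is False (needs root)."""
    for step in steps:
        print(*step)
        if dry_run:
            continue
        if step[0] == "sleep":
            time.sleep(step[1])
        else:
            with open(step[1], "w") as f:
                f.write(step[2])
```

Calling run(rebind_steps(["0000:00:14.0", "0000:04:00.0"])) only prints the
plan; pass dry_run=False to actually perform it.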

Thanks
Mathias


* Re: USB regression in kernel 6.2.2
  2023-03-13 10:06       ` Mathias Nyman
@ 2023-03-14 14:00         ` Brian Morrison
  2023-03-15 11:19           ` Mathias Nyman
  0 siblings, 1 reply; 11+ messages in thread
From: Brian Morrison @ 2023-03-14 14:00 UTC (permalink / raw)
  To: Mathias Nyman; +Cc: linux-usb, Linux kernel regressions list

On Mon, 13 Mar 2023 12:06:59 +0200
Mathias Nyman <mathias.nyman@linux.intel.com> wrote:
> 
> 
> In your v6.2 logs the usb bus numbers are interleaved, in the v6.1
> they are not. xhci driver registers two usb buses per host, one
> High-Speed and one SuperSpeed.
> 
> in v6.2:
> 
> [    1.094679] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 1
> [    1.094695] xhci_hcd 0000:04:00.0: new USB bus registered, assigned bus number 2
> [    1.096690] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 3
> [    1.100549] xhci_hcd 0000:04:00.0: new USB bus registered, assigned bus number 4
> 
> in 6.1:
> 
> [    1.071987] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 1
> [    1.073300] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 2
> [    1.076445] xhci_hcd 0000:04:00.0: new USB bus registered, assigned bus number 3
> [    1.082133] xhci_hcd 0000:04:00.0: new USB bus registered, assigned bus number 4
> 
> 0000:00:14.0 is your Intel xHC
> 0000:04:00.0 is your Renesas xHC
> 
> This change could be due to 6.2 commit:
> 
> 4c2604a9a689 usb: xhci-pci: Set PROBE_PREFER_ASYNCHRONOUS
> 
> Not sure why it would cause this regression, but worth testing it.

I have now reverted the above commit. It's only the one line in
xhci-pci.c, and it took a couple of hours to rebuild my kernel rpms, which
wasn't too bad.

With this change all of my USB devices are present again and the 3
/dev/ttyUSB* nodes are all present and usable.

I found this in the linux-usb list archives:

https://www.spinics.net/lists/kernel/msg4569289.html

and the first part of this patch series here:

https://www.spinics.net/lists/kernel/msg4569288.html

Should both of these patches be reverted? I assume so but I don't think
I have anything that uses an ehci device to test it.

I know nothing about how this all works other than finding this:

"Note that the end goal is to switch the kernel to use asynchronous
probing by default, so annotating drivers with
PROBE_PREFER_ASYNCHRONOUS is a temporary measure that allows us to
speed up boot process while we are validating the rest of the drivers."

which is at:

https://www.kernel.org/doc/html/v4.14/driver-api/infrastructure.html

so by the looks of it either this driver needs to initialise
synchronously, or there is a further problem which causes the bus
ordering to be wrong. It also seems to be a work in progress, so I
don't know how this will eventually play out.


> 
> Can you try to revert that commit?
> Or alternatively unbind and rebind the hosts from the xhci driver:
> 
> echo 0000:00:14.0 > /sys/bus/pci/drivers/xhci_hcd/unbind
> echo 0000:04:00.0 > /sys/bus/pci/drivers/xhci_hcd/unbind
> 
> (all your usb devices should now be disconnected)
> 
> echo 0000:00:14.0 > /sys/bus/pci/drivers/xhci_hcd/bind
> <wait a couple seconds>
> echo 0000:04:00.0 > /sys/bus/pci/drivers/xhci_hcd/bind

This suggestion only worked on one of the two USB ports. I mention it
only for completeness; the revert above is a 100% fix.

-- 

Brian Morrison



* Re: USB regression in kernel 6.2.2
  2023-03-14 14:00         ` Brian Morrison
@ 2023-03-15 11:19           ` Mathias Nyman
  2023-03-15 14:53             ` Alan Stern
  0 siblings, 1 reply; 11+ messages in thread
From: Mathias Nyman @ 2023-03-15 11:19 UTC (permalink / raw)
  To: Brian Morrison
  Cc: linux-usb, Linux kernel regressions list, Alan Stern,
	Chen Xingdi, Takashi Iwai, Moritz Fischer, Christian Lamparter,
	Vinod Koul

On 14.3.2023 16.00, Brian Morrison wrote:
> On Mon, 13 Mar 2023 12:06:59 +0200
> Mathias Nyman <mathias.nyman@linux.intel.com> wrote:
>>
>>
>> In your v6.2 logs the usb bus numbers are interleaved, in the v6.1
>> they are not. xhci driver registers two usb buses per host, one
>> High-Speed and one SuperSpeed.
>>
>> This change could be due to 6.2 commit:
>>
>> 4c2604a9a689 usb: xhci-pci: Set PROBE_PREFER_ASYNCHRONOUS
>>
>> Not sure why it would cause this regression, but worth testing it.
> 
> I have now reverted the above commit, it's only the one line in
> xhci-pci.c and it took a couple of hours to rebuild my kernel rpms which
> wasn't too bad.
> 
> With this change all of my USB devices are present again and the 3
> /dev/ttyUSB* nodes are all present and usable.
> 

Thanks for testing.
So setting PROBE_PREFER_ASYNCHRONOUS does trigger this issue for Renesas xHCI.

Did everything work on the 6.2 kernel with the devices connected to the
Intel host?

I'd just like to make sure that this is a vendor-specific host issue and not
a generic xhci driver issue.

If we can't quickly figure out the real reason for this then we just have to
revert that patch.

> I found this in the linux-usb list archives:
> 
> https://www.spinics.net/lists/kernel/msg4569289.html
> 
> and the first part of this patch series here:
> 
> https://www.spinics.net/lists/kernel/msg4569288.html
> 
> Should both of these patches be reverted? I assume so but I don't think
> I have anything that uses an ehci device to test it.
> 

Probably just the xhci one. I haven't heard of any ehci issues.

Alan (cc) would know better if there are any new odd ehci issues that can
be traced back to the async probe change.

> I know nothing about how this all works other than finding this:
> 
> "Note that the end goal is to switch the kernel to use asynchronous
> probing by default, so annotating drivers with
> PROBE_PREFER_ASYNCHRONOUS is a temporary measure that allows us to
> speed up boot process while we are validating the rest of the drivers."
> 
> which is at:
> 
> https://www.kernel.org/doc/html/v4.14/driver-api/infrastructure.html
> 
> so by the looks of it either this driver needs to initialise
> synchronously or there is a further problem which causes the bus
> ordering to be wrong but it also seems to be a work in progress so I
> don't know how this will eventually play out.
> 
  
Adding several people who worked on xhci-pci-renesas.c in the hope of figuring
this out.

Thanks
Mathias
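
The "interleaved" bus numbering discussed in this message can be checked
from sysfs. Below is a minimal Python sketch, not something posted in the
thread. It assumes each root hub appears as /sys/bus/usb/devices/usbN and
that the parent of its resolved path identifies the host controller. It
also takes "interleaved" to mean that one host's bus numbers are separated
by another host's bus. The function names are illustrative only.

```python
import glob
import os
import re


def is_interleaved(bus_to_host):
    """Return True if any host's bus numbers are not contiguous.

    bus_to_host maps a USB bus number to an identifier for the host
    controller that registered it (for example a PCI device path).
    """
    seen = set()
    previous = None
    for bus in sorted(bus_to_host):
        host = bus_to_host[bus]
        if host != previous and host in seen:
            # This host already had a run of buses that was interrupted
            # by another host's bus, so the numbering is interleaved.
            return True
        seen.add(host)
        previous = host
    return False


def read_bus_hosts(sysfs_root="/sys/bus/usb/devices"):
    """Map each root hub's bus number to its parent device path (assumed layout)."""
    hosts = {}
    for link in glob.glob(os.path.join(sysfs_root, "usb*")):
        match = re.fullmatch(r"usb(\d+)", os.path.basename(link))
        if match:
            # The root hub directory sits directly under its controller.
            hosts[int(match.group(1))] = os.path.dirname(os.path.realpath(link))
    return hosts


if __name__ == "__main__":
    mapping = read_bus_hosts()
    for bus in sorted(mapping):
        print(f"bus {bus}: {mapping[bus]}")
    print("interleaved:", is_interleaved(mapping))
```

Running it on both the working and the broken kernel and comparing the
output would show whether the two buses of each xHCI host keep adjacent
numbers.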



* Re: USB regression in kernel 6.2.2
  2023-03-15 11:19           ` Mathias Nyman
@ 2023-03-15 14:53             ` Alan Stern
  2023-03-17 18:37               ` Brian Morrison
  0 siblings, 1 reply; 11+ messages in thread
From: Alan Stern @ 2023-03-15 14:53 UTC (permalink / raw)
  To: Mathias Nyman
  Cc: Brian Morrison, linux-usb, Linux kernel regressions list,
	Chen Xingdi, Takashi Iwai, Moritz Fischer, Christian Lamparter,
	Vinod Koul

On Wed, Mar 15, 2023 at 01:19:16PM +0200, Mathias Nyman wrote:
> On 14.3.2023 16.00, Brian Morrison wrote:
> > On Mon, 13 Mar 2023 12:06:59 +0200
> > Mathias Nyman <mathias.nyman@linux.intel.com> wrote:
> > > 
> > > 
> > > In your v6.2 logs the usb bus numbers are interleaved, in the v6.1
> > > they are not. xhci driver registers two usb buses per host, one
> > > High-Speed and one SuperSpeed.
> > > 
> > > This change could be due to 6.2 commit:
> > > 
> > > 4c2604a9a689 usb: xhci-pci: Set PROBE_PREFER_ASYNCHRONOUS
> > > 
> > > Not sure why it would cause this regression, but worth testing it.
> > 
> > I have now reverted the above commit, it's only the one line in
> > xhci-pci.c and it took a couple of hours to rebuild my kernel rpms which
> > wasn't too bad.
> > 
> > With this change all of my USB devices are present again and the 3
> > /dev/ttyUSB* nodes are all present and usable.
> > 
> 
> Thanks for testing.
> So setting PROBE_PREFER_ASYNCHRONOUS does trigger this issue for Renesas xHCI.
> 
> Was it so that with the devices connected to the Intel host everything worked
> on 6.2 kernel?
> 
> Just to make sure that this is a vendor specific host issue and not generic xhci
> driver issue.
> 
> If we can't quickly figure out the real reason for this then we just have to
> revert that patch.
> 
> > I found this in the linux-usb list archives:
> > 
> > https://www.spinics.net/lists/kernel/msg4569289.html
> > 
> > and the first part of this patch series here:
> > 
> > https://www.spinics.net/lists/kernel/msg4569288.html
> > 
> > Should both of these patches be reverted? I assume so but I don't think
> > I have anything that uses an ehci device to test it.
> > 
> 
> Probably just the xhci one. I haven't heard of any ehci issues.
> 
> Alan (cc) would know better if there are any new odd ehci issues that can
> be traced back to the async probe change.

I haven't heard of any problems with EHCI.

Alan Stern


* Re: USB regression in kernel 6.2.2
  2023-03-15 14:53             ` Alan Stern
@ 2023-03-17 18:37               ` Brian Morrison
  0 siblings, 0 replies; 11+ messages in thread
From: Brian Morrison @ 2023-03-17 18:37 UTC (permalink / raw)
  To: Alan Stern
  Cc: Mathias Nyman, linux-usb, Linux kernel regressions list,
	Chen Xingdi, Takashi Iwai, Moritz Fischer, Christian Lamparter,
	Vinod Koul

On Wed, 15 Mar 2023 10:53:23 -0400
Alan Stern <stern@rowland.harvard.edu> wrote:

> On Wed, Mar 15, 2023 at 01:19:16PM +0200, Mathias Nyman wrote:
> > On 14.3.2023 16.00, Brian Morrison wrote:  
> > > On Mon, 13 Mar 2023 12:06:59 +0200
> > > Mathias Nyman <mathias.nyman@linux.intel.com> wrote:  
>  [...]  
> > > 
> > > I have now reverted the above commit, it's only the one line in
> > > xhci-pci.c and it took a couple of hours to rebuild my kernel
> > > rpms which wasn't too bad.
> > > 
> > > With this change all of my USB devices are present again and the 3
> > > /dev/ttyUSB* nodes are all present and usable.
> > >   
> > 
> > Thanks for testing.
> > So setting PROBE_PREFER_ASYNCHRONOUS does trigger this issue for
> > Renesas xHCI.
> > 
> > Was it so that with the devices connected to the Intel host
> > everything worked on 6.2 kernel?
> > 
> > Just to make sure that this is a vendor specific host issue and not
> > generic xhci driver issue.

I will see if I can test this, but it may be difficult. The add-on
Renesas card allows my USB cables (with quite large ferrites to keep RF
out of the PC) to fit in; the Intel host ports are in a different
orientation and so physically too close together for the ferrite-laden
cables to fit.

If I can manage to test it I will report, but don't hold your breath.

> > 
> > If we can't quickly figure out the real reason for this then we
> > just have to revert that patch.

It's certainly working for me, but as I don't know much about how the
xhci driver initialises and finds the two bus host controllers I don't
know about any consequences beyond the boot delay issue that prompted
the async change in the first place.

> >   
> > > I found this in the linux-usb list archives:
> > > 
> > > https://www.spinics.net/lists/kernel/msg4569289.html
> > > 
> > > and the first part of this patch series here:
> > > 
> > > https://www.spinics.net/lists/kernel/msg4569288.html
> > > 
> > > Should both of these patches be reverted? I assume so but I don't
> > > think I have anything that uses an ehci device to test it.
> > >   
> > 
> > Probably just the xhci one. I haven't heard of any ehci issues.
> > 
> > Alan (cc) would know better if there are any new odd ehci issues
> > that can be traced back to the async probe change.  
> 
> I haven't heard of any problems with EHCI.

I think that EHCI and UHCI are older standards; I don't know if the
hardware those drivers work with is still common. I also have a VIA PCI
USB card on another machine that uses the xhci driver, and it's much
older than the machine with the Renesas card. I don't think I have the
hardware that would allow me to test those drivers.

> 
> Alan Stern
> 

Please ask if there is any extra patch you would like me to try.

-- 

Brian Morrison


Thread overview: 11+ messages
2023-03-07 13:21 USB regression in kernel 6.2.2 Brian Morrison
2023-03-08  9:52 ` Linux regression tracking #adding (Thorsten Leemhuis)
2023-03-08 15:16 ` Mathias Nyman
2023-03-08 16:45   ` Brian Morrison
2023-03-09 20:04   ` Brian Morrison
2023-03-12  0:03     ` Brian Morrison
2023-03-13 10:06       ` Mathias Nyman
2023-03-14 14:00         ` Brian Morrison
2023-03-15 11:19           ` Mathias Nyman
2023-03-15 14:53             ` Alan Stern
2023-03-17 18:37               ` Brian Morrison
