From: Martin Kepplinger <martin.kepplinger@puri.sm>
To: Alan Stern <stern@rowland.harvard.edu>
Cc: linux-usb@vger.kernel.org
Subject: Re: USB device disconnects on resume
Date: Thu, 21 Apr 2022 12:38:56 +0200
Message-ID: <4fb8bd5842135a9f723bbe0406ed1afc023c25fe.camel@puri.sm>
In-Reply-To: <YmAbZDd6LJwCCvkB@rowland.harvard.edu>

On Wednesday, 20.04.2022 at 10:40 -0400, Alan Stern wrote:
> On Wed, Apr 20, 2022 at 12:37:36PM +0200, Martin Kepplinger wrote:
> > On Tuesday, 19.04.2022 at 10:32 -0400, Alan Stern wrote:
> > > On Tue, Apr 19, 2022 at 11:28:21AM +0200, Martin Kepplinger wrote:
> > > > hi,
> > > > 
> > > > I'm seeing resets and re-enumerations on runtime-resume for one
> > > > device a lot. It's a modem connected to the USB2642 Microchip
> > > > (SMSC) USB2 hub, that's connected to an xhci HC.
> > > > 
> > > > A remote wakeup *sometimes* makes the hub say "physically
> > > > disconnected" during resume in hub_activate(), and thus sets
> > > > reset_resume. Then the device comes up as low-speed device once,
> > > > which again is not allowed during normal runtime resume, so would
> > > > itself trigger a reset.
> > > 
> > > Does the reset-resume always fail in this way?
> > 
> > Resetting itself doesn't usually fail in the sense that a device
> > would not work anymore after resetting. The problem is that the
> > resets happen in the first place. 90+% of runtime-resumes are fine -
> > auto- and wakeup-resume. Resetting is a major problem though, imagine
> > a modem device being re-enumerated during a phone call or "realtime"
> > data connection. I see that a lot.
> 
> Okay, I see.
> 
> > Let me record what hub.c says when leading up to the reset of 1-1.2
> > (the modem), with logs of a normal runtime resume/suspend cycle
> > included before that, as reference:
> > 
> > 1650447001.174798 pureos kernel: usb 1-1: usb auto-resume
> > 1650447001.242810 pureos kernel: usb 1-1: Waited 0ms for CONNECT
> > 1650447001.247853 pureos kernel: usb 1-1: finish resume
> > 1650447001.249697 pureos kernel: hub 1-1:1.0: hub_resume
> > 1650447001.251409 pureos kernel: usb 1-1-port1: status 0507 change 0000
> > 1650447001.251624 pureos kernel: usb 1-1-port2: status 0507 change 0000
> > 1650447001.251793 pureos kernel: hub 1-1:1.0: state 7 ports 3 chg 0000 evt 0000
> > 1650447001.253052 pureos kernel: usb 1-1.2: usb auto-resume
> 
> What is the cause of this runtime resume?  According to the port
> status above, the 1-1.2 device did not send a wakeup request.

How would I find out? Recording via usbmon is next on my todo list.

> 
> > 1650447001.318845 pureos kernel: usb 1-1.2: Waited 0ms for CONNECT
> > 1650447001.324925 pureos kernel: usb 1-1.2: finish resume
> > 1650447003.831095 pureos kernel: usb 1-1.2: usb auto-suspend, wakeup 1
> > 1650447003.854701 pureos kernel: hub 1-1:1.0: hub_suspend
> > 1650447003.874773 pureos kernel: usb 1-1: usb auto-suspend, wakeup 1
> > 1650447003.922054 pureos kernel: usb 1-1: usb wakeup-resume
> 
> This wakeup occurred only 48 ms after the hub was runtime suspended. 
> But here at least the cause is evident: The hub sent a wakeup request
> because its child (the 1-1.2 modem) disconnected.

fwiw, that wakeup-resume *always* comes about 50 ms after the last
runtime suspend.
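
The gap between the hub's runtime suspend and the following wakeup-resume can be read straight from the journal timestamps. The following is a minimal sketch, not a tool used in this thread: the function name and the sample LOG string are illustrative, and the two log lines are the ones quoted above with their wrapped ends rejoined.

```python
from decimal import Decimal

# Two lines taken from the log quoted in this thread (line wraps rejoined).
LOG = """\
1650447003.874773 pureos kernel: usb 1-1: usb auto-suspend, wakeup 1
1650447003.922054 pureos kernel: usb 1-1: usb wakeup-resume
"""


def suspend_to_wakeup_ms(log_text, device="usb 1-1:"):
    """Return the delay in ms from each auto-suspend to the next wakeup-resume.

    Only lines mentioning `device` are considered. Decimal is used so that
    the microsecond timestamps are subtracted without float rounding.
    """
    suspended_at = None
    gaps = []
    for line in log_text.splitlines():
        stamp, _, rest = line.partition(" ")
        if device not in rest:
            continue
        if "auto-suspend" in rest:
            suspended_at = Decimal(stamp)
        elif "wakeup-resume" in rest and suspended_at is not None:
            gaps.append((Decimal(stamp) - suspended_at) * 1000)
            suspended_at = None
    return gaps


if __name__ == "__main__":
    for gap in suspend_to_wakeup_ms(LOG):
        print(f"wakeup-resume {gap} ms after auto-suspend")
```

Fed a longer journal excerpt, the same function would show whether the delay is really constant across cycles or only roughly so.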

> 
> > 1650447003.942066 pureos kernel: usb 1-1: Waited 0ms for CONNECT
> > 1650447003.945755 pureos kernel: usb 1-1: finish resume
> > 1650447003.947589 pureos kernel: hub 1-1:1.0: hub_resume
> > 1650447003.949226 pureos kernel: usb 1-1-port1: status 0507 change 0000
> > 1650447003.949430 pureos kernel: usb 1-1-port2: status 0101 change 0005
> > 1650447004.058779 pureos kernel: hub 1-1:1.0: state 7 ports 3 chg 0004 evt 0000
> > 1650447004.074089 pureos kernel: usb 1-1.2: usb wakeup-resume
> > 1650447004.094056 pureos kernel: usb 1-1.2: Waited 0ms for CONNECT
> > 1650447004.097255 pureos kernel: usb 1-1.2: finish reset-resume
> > 1650447004.182333 pureos kernel: usb 1-1.2: reset high-speed USB device number 5 using xhci-hcd
> > 1650447004.314425 pureos kernel: usb 1-1-port2: resume, status 0
> > 1650447004.317628 pureos kernel: usb 1-1-port2: status 0101, change 0004, 12 Mb/s
> > 1650447004.318673 pureos kernel: usb 1-1.2: USB disconnect, device number 5
> > 1650447004.323374 pureos kernel: usb 1-1.2: unregistering device
> 
> And it looks like in this case, the reset-resume failed.

Well, at least reset_resume has been set, which I want to avoid.

> 
> > So before resetting, the hub reads
> > "usb 1-1-port2: status 0101 change 0005" instead of normally
> > "usb 1-1-port2: status 0507 change 0000"
> > 
> > but I don't know why. That portstatus/portchange doesn't change over
> > time when I just keep reading portstatus/portchange in
> > hub_activate() in a loop.
> 
> You mean that if the port status and change values are originally
> 0101 and 0005 in hub_activate(), they remain equal to those values?
> And similarly if they are originally 0507 and 0000?
> 
> That is to be expected.  Nothing happens to make those values change 
> until hub_activate() sends some commands to the hub.

I see.
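
For reference, the two port status/change pairs from the log can be decoded bit by bit. The sketch below assumes the USB 2.0 hub class bit layout for wPortStatus and wPortChange (the USB2642 is a USB 2.0 hub; USB 3 hubs use a different layout). The table and helper names are made up for illustration and are not kernel symbols.

```python
# Assumed bit layout: USB 2.0 hub class wPortStatus / wPortChange.
PORT_STATUS_BITS = {
    0x0001: "CONNECTION",
    0x0002: "ENABLE",
    0x0004: "SUSPEND",
    0x0008: "OVERCURRENT",
    0x0010: "RESET",
    0x0100: "POWER",
    0x0200: "LOW_SPEED",
    0x0400: "HIGH_SPEED",
    0x0800: "TEST",
    0x1000: "INDICATOR",
}

PORT_CHANGE_BITS = {
    0x0001: "C_CONNECTION",
    0x0002: "C_ENABLE",
    0x0004: "C_SUSPEND",
    0x0008: "C_OVERCURRENT",
    0x0010: "C_RESET",
}


def decode(word, table):
    """Return the names of all bits of `table` that are set in `word`."""
    return [name for mask, name in sorted(table.items()) if word & mask]


def describe(status_hex, change_hex):
    """Decode the hex strings printed as 'status XXXX change YYYY'."""
    status = int(status_hex, 16)
    change = int(change_hex, 16)
    return decode(status, PORT_STATUS_BITS), decode(change, PORT_CHANGE_BITS)


if __name__ == "__main__":
    for pair in (("0507", "0000"), ("0101", "0005")):
        print(pair, describe(*pair))
```

Under these assumptions, 0507/0000 is a connected, enabled, suspended, powered high-speed port with no change bits latched, while 0101/0005 is a connected and powered port that is no longer enabled, with both a connection change and a suspend change latched.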

Of course this doesn't make much sense, but just so you know: if I just
don't let hub_activate() set udev->reset_resume to 1, then
check_port_resume_type() will do so, and thus again
finish_port_resume() will reset the device by calling
usb_reset_and_verify_device().
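
Why skipping one assignment does not help can be illustrated with a toy model. This is not kernel code and greatly simplifies what hub_activate() and check_port_resume_type() do (the real functions also look at port power, suspend state, warm resets and error codes). It only assumes that both places end up at a reset-resume when the port reports a connected device but the enable bit is clear and persist is allowed. The function name is hypothetical; the two constants mirror the USB 2.0 port status bits.

```python
USB_PORT_STAT_CONNECTION = 0x0001
USB_PORT_STAT_ENABLE = 0x0002


def needs_reset_resume(portstatus, persist_enabled=True):
    """Toy model: a connected but not enabled port leads to a reset-resume.

    Simplifying assumption, not the full kernel logic.
    """
    connected = bool(portstatus & USB_PORT_STAT_CONNECTION)
    enabled = bool(portstatus & USB_PORT_STAT_ENABLE)
    return connected and not enabled and persist_enabled


if __name__ == "__main__":
    for status in (0x0507, 0x0101):
        print(f"status {status:04x}: reset-resume = {needs_reset_resume(status)}")
```

In this model, every check that reads status 0101 reaches the same conclusion, so removing the assignment in one function only moves the decision to the next one that inspects the port.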

> 
> > > > The hub and device are permanently connected on the PCB, so the
> > > > hub is interpreting it in a wrong way.
> > > 
> > > What is the hub interpreting in a wrong way?  Why should a
> > > permanent connection on the PCB have anything to do with whether
> > > the resume signals are misinterpreted?
> > 
> > I only wanted to say that the device (modem in this case) cannot be
> > unplugged - there's no plug. That's all :)
> > 
> > https://elixir.bootlin.com/linux/latest/source/drivers/usb/core/hub.c#L1197
> > interprets my situation as a "removed" device.
> 
> What it means is that the modem was electronically disconnected from
> the USB bus.  In theory this could be the result of a mixup in the
> resume signals, but it's more likely that the modem did this
> deliberately because its firmware crashed.  (Why it should crash
> while it is suspended is a good question, though...)

OK. Assuming such a firmware bug, if I set a new quirk for the device,
do you think I can work around that (in hub.c?) in a way that userspace
doesn't really notice?

In theory, if I know this behaviour in advance, I think I should be
able to somehow wait until the device is ready again instead of
resetting.


> 
> > > > I found an email from Sarah Sharp from 2013 that describes what
> > > > I see: https://marc.info/?l=linux-usb&m=137754385421825&w=2
> > > > There she says:
> > > > 
> > > > "Occasionally, the host controller was sending the SoFs too soon
> > > > on resume, and the device would interpret it as a low-speed
> > > > chirp.  The device would disconnect, and transform from a high
> > > > speed device to a low speed device.  I don't think increasing
> > > > the 10 ms time out will help at all in this case, but you did
> > > > ask what USB device disconnect scenarios I've seen."
> > > 
> > > Read the following messages in that email thread.  Sarah said that
> > > she would fix the SoF signal timing in xhci-hcd ("I agree that
> > > this seems like an xHCI driver issue, and I'll fix it in the
> > > driver").  I have no idea whether this helped the faulty devices;
> > > my guess is that it didn't.
> > 
> > Do you know with what changes she tried to fix that?
> 
> No.  But you ought to be able to see by checking the history for that
> time period.
> 
> Alan Stern



