linux-kernel.vger.kernel.org archive mirror
* [PATCH] Ohci-hcd: fix endless loop (second take)
@ 2004-11-26 10:30 Colin Leroy
  2004-11-26 17:28 ` [linux-usb-devel] " David Brownell
  0 siblings, 1 reply; 11+ messages in thread
From: Colin Leroy @ 2004-11-26 10:30 UTC (permalink / raw)
  To: Linux-kernel; +Cc: Benjamin Herrenschmidt, Greg KH, Andrew Morton

Hi, 

The following patch fixes an endless loop that I hit after suspending
and resuming my iBook with a linux-wlan-ng controller plugged in, then
removing the stick and plugging it back in (getting the "IRQ lossage"
message).

It supersedes the previous one, in which:
 - I hadn't noticed that limit was unsigned,
 - limit was decremented twice too fast,
 - the goto was a bit useless.

Signed-off-by: Colin Leroy <colin@colino.net>
--- a/drivers/usb/host/ohci-hcd.c	2004-11-26 11:28:21.284259057 +0100
+++ b/drivers/usb/host/ohci-hcd.c	2004-11-26 11:28:03.437351150 +0100
@@ -344,7 +344,7 @@
 	int			epnum = ep & USB_ENDPOINT_NUMBER_MASK;
 	unsigned long		flags;
 	struct ed		*ed;
-	unsigned		limit = 1000;
+	int			limit = 1000;
 
 	/* ASSERT:  any requests/urbs are being unlinked */
 	/* ASSERT:  nobody can be submitting urbs for this any more */
@@ -375,6 +375,11 @@
 		spin_unlock_irqrestore (&ohci->lock, flags);
 		set_current_state (TASK_UNINTERRUPTIBLE);
 		schedule_timeout (1);
+		if (limit < 1000) {
+			ohci_warn (ohci, "Can't recover, restarting.\n");
+			ohci_restart(ohci);
+			return;
+		}
 		goto rescan;
 	case ED_IDLE:		/* fully unlinked */
 		if (list_empty (&ed->td_list)) {
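
The "limit was unsigned" point above is a common C pitfall, and a short sketch may make it concrete. The earlier patch is not shown in this thread, so the code below is only an illustration of the general problem, not a reconstruction of that patch; all function names are hypothetical. An unsigned retry budget never becomes negative, so a guard that waits for a negative value can never fire, whereas a signed counter (or a test for zero before decrementing) behaves as intended.

```c
#include <limits.h>
#include <stdbool.h>

/* One step of a retry budget held in an unsigned counter.
 * Decrementing 0 wraps around to UINT_MAX; the value is never negative. */
unsigned step_unsigned(unsigned limit)
{
	return limit - 1u;
}

/* The same step with a signed counter: 0 becomes -1, so a
 * "less than zero" guard can actually trigger. */
int step_signed(int limit)
{
	return limit - 1;
}

/* A guard that is safe for an unsigned budget: test for zero
 * before decrementing instead of looking for a negative value. */
bool budget_exhausted(unsigned limit)
{
	return limit == 0;
}
```

With these helpers, step_unsigned(0) yields UINT_MAX while step_signed(0) yields -1, which is why changing the declaration from unsigned to int alters how such a loop terminates.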


* Re: [linux-usb-devel] [PATCH] Ohci-hcd: fix endless loop (second take)
  2004-11-26 10:30 [PATCH] Ohci-hcd: fix endless loop (second take) Colin Leroy
@ 2004-11-26 17:28 ` David Brownell
  2004-11-26 17:37   ` Colin Leroy
  2004-11-26 18:46   ` David Brownell
  0 siblings, 2 replies; 11+ messages in thread
From: David Brownell @ 2004-11-26 17:28 UTC (permalink / raw)
  To: linux-usb-devel
  Cc: Colin Leroy, Linux-kernel, Benjamin Herrenschmidt, Greg KH,
	Andrew Morton

On Friday 26 November 2004 02:30, Colin Leroy wrote:
> @@ -375,6 +375,11 @@
>  		spin_unlock_irqrestore (&ohci->lock, flags);
>  		set_current_state (TASK_UNINTERRUPTIBLE);
>  		schedule_timeout (1);
> +		if (limit < 1000) {
> +			ohci_warn (ohci, "Can't recover, restarting.\n");
> +			ohci_restart(ohci);
> +			return;
> +		}

So instead of waiting a moment for the ED to finish
its normal processing and move from state ED_UNLINK
into ED_IDLE, you want to always clobber the whole
USB device tree attached to that bus?  That'd happen
quite routinely.

This isn't a good patch either... maybe your best
bet would be to find out why the IRQs stopped getting
delivered.

- Dave
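
The objection above is that the endpoint should be given time to move from ED_UNLINK to ED_IDLE, with drastic recovery reserved for the case where that never happens. A minimal user-space sketch of that shape follows. It is not the driver's code: the types, the simulated endpoint, and the function names are all hypothetical stand-ins, and a real driver would sleep between polls.

```c
/* Hypothetical stand-ins for the endpoint states named in the thread. */
enum sketch_state { SKETCH_ED_UNLINK, SKETCH_ED_IDLE };

enum wait_result { WAIT_IDLE, WAIT_GAVE_UP };

/* Simulated endpoint: reports UNLINK for the first polls_until_idle
 * polls, then IDLE. A negative value means it never becomes idle,
 * which models lost interrupts. */
struct fake_ed {
	int polls_until_idle;
	int polls_seen;
};

enum sketch_state poll_state(struct fake_ed *ed)
{
	ed->polls_seen++;
	if (ed->polls_until_idle >= 0 && ed->polls_seen > ed->polls_until_idle)
		return SKETCH_ED_IDLE;
	return SKETCH_ED_UNLINK;
}

/* Poll up to `limit` times and report failure only once the whole
 * budget is spent, rather than on the first unsuccessful poll. */
enum wait_result wait_for_idle(struct fake_ed *ed, int limit)
{
	while (limit-- > 0) {
		if (poll_state(ed) == SKETCH_ED_IDLE)
			return WAIT_IDLE;
		/* a real driver would sleep briefly here before rescanning */
	}
	return WAIT_GAVE_UP;
}
```

An endpoint that settles after a few polls returns WAIT_IDLE without any recovery action; only one that stays in the unlink state for the entire budget returns WAIT_GAVE_UP, which is the point at which a heavier fallback would be considered.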


* Re: [linux-usb-devel] [PATCH] Ohci-hcd: fix endless loop (second take)
  2004-11-26 17:28 ` [linux-usb-devel] " David Brownell
@ 2004-11-26 17:37   ` Colin Leroy
  2004-11-26 17:57     ` David Brownell
  2004-11-26 18:46   ` David Brownell
  1 sibling, 1 reply; 11+ messages in thread
From: Colin Leroy @ 2004-11-26 17:37 UTC (permalink / raw)
  To: David Brownell
  Cc: linux-usb-devel, Colin Leroy, Linux-kernel,
	Benjamin Herrenschmidt, Greg KH, Andrew Morton

On 26 Nov 2004 at 09h11, David Brownell wrote:

Hi, 

> So instead of waiting a moment for the ED to finish
> its normal processing and move from state ED_UNLINK
> into ED_IDLE, you want to always clobber the whole
> USB device tree attached to that bus?  That'd happen
> quite routinely.

Yeah. Sorry. Also, I just noticed that this patch only seemed
to work because I overlooked the unsigned bit, making my
hack not go through sanitize - which changes ed->state and
thus does not get back to the ED_UNLINK path. Duh... I must
have been tired.
 
> This isn't a good patch either... maybe your best
> bet would be to find out why the IRQs stopped getting
> delivered.

It's probably a linux-wlan-ng issue... What do you think 
of these logs ?

#resume logs... 
#disconnecting the stick:
usb 4-1: USB disconnect, address 2
ohci_hcd 0001:10:1b.1: IRQ INTR_SF lossage
hfa384x_usbin_callback: Fatal, failed to resubmit rx_urb. error=-19
hfa384x_dorrid: ctlx failure=REQ_TIMEOUT
prism2sta_mlmerequest: Failed to read eth1 statistics: error=-5
#reconnecting the stick:
usb 4-1: new full speed USB device using address 3
usb 4-1: control timeout on ep0out

maybe the lwlan driver should catch these and kill the urbs or
something? 
Thanks for your help, I'm not an expert at all in the usb world...
-- 
Colin


* Re: [linux-usb-devel] [PATCH] Ohci-hcd: fix endless loop (second take)
  2004-11-26 17:37   ` Colin Leroy
@ 2004-11-26 17:57     ` David Brownell
  2004-11-26 22:12       ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 11+ messages in thread
From: David Brownell @ 2004-11-26 17:57 UTC (permalink / raw)
  To: Colin Leroy
  Cc: linux-usb-devel, Colin Leroy, Linux-kernel,
	Benjamin Herrenschmidt, Greg KH, Andrew Morton

On Friday 26 November 2004 09:37, Colin Leroy wrote:
> On 26 Nov 2004 at 09h11, David Brownell wrote:
> > This isn't a good patch either... maybe your best
> > bet would be to find out why the IRQs stopped getting
> > delivered.
> 
> It's probably a linux-wlan-ng issue... 

I suspect PPC resume issues myself.


> What do you think  
> of these logs ?
> 
> #resume logs... 
> #disconnecting the stick:
> usb 4-1: USB disconnect, address 2
> ohci_hcd 0001:10:1b.1: IRQ INTR_SF lossage

That does seem to be the first problem; fixing
it (that is, making sure IRQs arrive again!)
should make the rest go away.


> hfa384x_usbin_callback: Fatal, failed to resubmit rx_urb. error=-19
> hfa384x_dorrid: ctlx failure=REQ_TIMEOUT
> prism2sta_mlmerequest: Failed to read eth1 statistics: error=-5

Those look like plausible ways for that driver to
behave.  "-19" == "-ENODEV" for device-gone (you
unplugged it!), though the rest (timeout, EIO)
suggest that WLAN code fault recovery is weird.
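
For readers decoding such log lines, the numeric values can be checked against the errno constants. The helper below is a small sketch with a hypothetical name; it assumes Linux errno numbering, where ENODEV is 19 and EIO is 5, and it treats the logged values as negated errno codes, as the text above does.

```c
#include <errno.h>

/* Map a negated errno value, as printed in the logs above, to its
 * symbolic name. Only the two codes discussed here are covered. */
const char *status_name(int status)
{
	switch (-status) {
	case ENODEV:
		return "ENODEV";	/* device is gone */
	case EIO:
		return "EIO";		/* generic I/O error */
	default:
		return "unknown";
	}
}
```

Calling status_name(-19) gives "ENODEV" and status_name(-5) gives "EIO" under that numbering.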


> #reconnecting the stick:
> usb 4-1: new full speed USB device using address 3
> usb 4-1: control timeout on ep0out

As expected, if IRQs aren't arriving.  Though you
may not be using the latest kernel; it's supposed
to give warnings about IRQ delivery problems after
resume too, not just on initial startup.


> maybe the lwlan driver should catch these and kill the urbs or
> something? 

The only obvious "looks wrong" thing from that WLAN
code is discarding the non-recoverable ENODEV status
in favor of reporting a usually-recoverable (timeout)
then maybe-recoverable (EIO) error.  But that's not
necessarily troublesome here.
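
One way to avoid discarding the non-recoverable status is to keep it "sticky" once seen. The sketch below is purely illustrative and is not taken from the linux-wlan-ng driver; the function name and the policy of letting later errors overwrite earlier recoverable ones are assumptions made for the example.

```c
#include <errno.h>

/* Remember the most significant failure seen so far. Once the device
 * is known to be gone (-ENODEV), later timeout or I/O errors do not
 * replace that status. */
void record_status(int *sticky, int status)
{
	if (status == 0)
		return;		/* success never clears a recorded failure */
	if (*sticky == -ENODEV)
		return;		/* non-recoverable: keep reporting it */
	*sticky = status;
}
```

Feeding it -ENODEV, then -ETIMEDOUT, then -EIO leaves -ENODEV recorded, so the caller still learns that the device was unplugged.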


> Thanks for your help, I'm not an expert at all in the usb world...

Most people aren't... :)

I'm not expert in PPC IRQ delivery, which is where the
root cause of this problem seems to live.  We all have
places where we need help!

- Dave



* Re: [linux-usb-devel] [PATCH] Ohci-hcd: fix endless loop (second take)
  2004-11-26 17:28 ` [linux-usb-devel] " David Brownell
  2004-11-26 17:37   ` Colin Leroy
@ 2004-11-26 18:46   ` David Brownell
  1 sibling, 0 replies; 11+ messages in thread
From: David Brownell @ 2004-11-26 18:46 UTC (permalink / raw)
  To: linux-usb-devel
  Cc: Colin Leroy, Linux-kernel, Benjamin Herrenschmidt, Greg KH,
	Andrew Morton

Colin reported off-line that he's using 2.6.9
rather than 2.6.10-rc2 or newer ... so it's
actually expected that his kernel misbehave
with USB PM.  The workaround, for all 2.6
kernels until very recently, is to rmmod the
HCDs before entering a system sleep state.

I think that starting in 2.6.10 it'll be OK
to leave the USB HCDs loaded during various
PM sleep states ... in at least some common
system configurations.  There are several
hundred different possibilities, it's hard
to test all of them even if you do happen to
have all that hardware!

But for earlier kernels, don't even try that.

- Dave



* Re: [linux-usb-devel] [PATCH] Ohci-hcd: fix endless loop (second take)
  2004-11-26 17:57     ` David Brownell
@ 2004-11-26 22:12       ` Benjamin Herrenschmidt
  2004-11-29  8:04         ` Colin Leroy
  0 siblings, 1 reply; 11+ messages in thread
From: Benjamin Herrenschmidt @ 2004-11-26 22:12 UTC (permalink / raw)
  To: David Brownell
  Cc: Colin Leroy, Linux-USB, Colin Leroy, Linux Kernel list, Greg KH,
	Andrew Morton

On Fri, 2004-11-26 at 09:57 -0800, David Brownell wrote:
> On Friday 26 November 2004 09:37, Colin Leroy wrote:
> > On 26 Nov 2004 at 09h11, David Brownell wrote:
> > > This isn't a good patch either... maybe your best
> > > bet would be to find out why the IRQs stopped getting
> > > delivered.
> > 
> > It's probably a linux-wlan-ng issue... 
> 
> I suspect PPC resume issues myself.

Colin, you didn't tell us which controller it was? The NEC one is a
totally normal off-the-shelf controller coming out of D3. The Apple
ones are a bit special though.
> 
> As expected, if IRQs aren't arriving.  Though you
> may not be using the latest kernel; it's supposed
> to give warnings about IRQ delivery problems after
> resume too, not just on initial startup.

It could be a problem in the code restarting the clocks to the USB cell
in KL (provided it's one of these controllers and not the NEC), which
would need some more delay before restarting things...

> I'm not expert in PPC IRQ delivery, which is where the
> root cause of this problem seems to live.  We all have
> places where we need help!

There is nothing fancy with PPC IRQ delivery. IRQs work on wakeup for
everybody or nobody. It's a problem with the USB chip. (There is no
fancy firmware IRQ routing thing, etc... every device is physically
wired to one of the about 128 IRQ lines of the MPIC).

Ben.




* Re: [linux-usb-devel] [PATCH] Ohci-hcd: fix endless loop (second take)
  2004-11-26 22:12       ` Benjamin Herrenschmidt
@ 2004-11-29  8:04         ` Colin Leroy
  2004-11-29 22:26           ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 11+ messages in thread
From: Colin Leroy @ 2004-11-29  8:04 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: David Brownell, Linux-USB, Colin Leroy, Linux Kernel list,
	Greg KH, Andrew Morton

On 27 Nov 2004 at 09h11, Benjamin Herrenschmidt wrote:

Hi, 

> > > It's probably a linux-wlan-ng issue... 
> > 
> > I suspect PPC resume issues myself.
> 
> Colin, you didn't tell us which controller it was? The NEC one is a
> totally normal off-the-shelf controller coming out of D3. The Apple
> ones are a bit special though.

It's the ibook G4's controller:
[colin@jack ~]$ for i in 1 2 3 4; do cat /sys/bus/usb/devices/usb$i/product; done;
NEC Corporation USB 2.0
Apple Computer Inc. KeyLargo/Intrepid USB (#3)
NEC Corporation USB
NEC Corporation USB (#2)


-- 
Colin


* Re: [linux-usb-devel] [PATCH] Ohci-hcd: fix endless loop (second take)
  2004-11-29  8:04         ` Colin Leroy
@ 2004-11-29 22:26           ` Benjamin Herrenschmidt
  2004-11-29 22:34             ` Colin Leroy
  0 siblings, 1 reply; 11+ messages in thread
From: Benjamin Herrenschmidt @ 2004-11-29 22:26 UTC (permalink / raw)
  To: Colin Leroy
  Cc: David Brownell, Linux-USB, Colin Leroy, Linux Kernel list,
	Greg KH, Andrew Morton

On Mon, 2004-11-29 at 09:04 +0100, Colin Leroy wrote:
> On 27 Nov 2004 at 09h11, Benjamin Herrenschmidt wrote:
> 
> Hi, 
> 
> > > > It's probably a linux-wlan-ng issue... 
> > > 
> > > I suspect PPC resume issues myself.
> > 
> > Colin, you didn't tell us which controller it was? The NEC one is a
> > totally normal off-the-shelf controller coming out of D3. The Apple
> > ones are a bit special though.
> 
> It's the ibook G4's controller:
> [colin@jack ~]$ for i in 1 2 3 4; do cat /sys/bus/usb/devices/usb$i/product; done;
> NEC Corporation USB 2.0
> Apple Computer Inc. KeyLargo/Intrepid USB (#3)
> NEC Corporation USB
> NEC Corporation USB (#2)

Hrm... there is some problem in communication here. I asked you which
controller out of the 3 OHCIs you have in this machine is the culprit,
you give me a list of all of them but without PCI IDs ... From the
archive, I think it was USB bus #4 no ? not sure which of these
controllers it matches. 

The iBook G4 has actually 3 "Apple" OHCI's in KeyLargo/Intrepid but with
2 of them disabled by the firmware (not wired) plus one NEC USB2
controller (which contains 1 EHCI and 2 OHCIs) on the PCI bus. The code
managing their sleep process is very different.

Ben.




* Re: [linux-usb-devel] [PATCH] Ohci-hcd: fix endless loop (second take)
  2004-11-29 22:26           ` Benjamin Herrenschmidt
@ 2004-11-29 22:34             ` Colin Leroy
  2004-11-29 22:43               ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 11+ messages in thread
From: Colin Leroy @ 2004-11-29 22:34 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: David Brownell, Linux-USB, Colin Leroy, Linux Kernel list,
	Greg KH, Andrew Morton

On 30 Nov 2004 at 09h11, Benjamin Herrenschmidt wrote:

Hi, 

> Hrm... there is some problem in communication here. I asked you which
> controller out of the 3 OHCIs you have in this machine is the culprit,
> you give me a list of all of them but without PCI IDs ... From the
> archive, I think it was USB bus #4 no ? not sure which of these
> controllers it matches. 
> 
> The iBook G4 has actually 3 "Apple" OHCI's in KeyLargo/Intrepid but
> with 2 of them disabled by the firmware (not wired) plus one NEC USB2
> controller (which contains 1 EHCI and 2 OHCIs) on the PCI bus. The
> code managing their sleep process is very different.

Sorry, I was away and had a problem of /proc/bus/usb being empty. As my
link was on the wireless stick I couldn't reload usb modules. The
culprit is usb 4-1, I think it would be this one (as the stick is bus
004 device 001):

Bus 004 Device 001: ID 0000:0000
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB              10.01
  bDeviceClass            9 Hub
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0         8
  idVendor           0x0000
  idProduct          0x0000
  bcdDevice            6.02
  iManufacturer           3 Linux 2.6.9 ohci_hcd
  iProduct                2 NEC Corporation USB (#2)
  iSerial                 1 0001:10:1b.1
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           25
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0
    bmAttributes         0xc0
      Self Powered
    MaxPower                0mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0
      bInterfaceProtocol      0
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               none
        wMaxPacketSize          2
        bInterval             255
  Language IDs: (length=4)
     0409 English(US)

-- 
Colin


* Re: [linux-usb-devel] [PATCH] Ohci-hcd: fix endless loop (second take)
  2004-11-29 22:34             ` Colin Leroy
@ 2004-11-29 22:43               ` Benjamin Herrenschmidt
  2004-11-30  0:41                 ` David Brownell
  0 siblings, 1 reply; 11+ messages in thread
From: Benjamin Herrenschmidt @ 2004-11-29 22:43 UTC (permalink / raw)
  To: Colin Leroy
  Cc: David Brownell, Linux-USB, Colin Leroy, Linux Kernel list,
	Greg KH, Andrew Morton

On Mon, 2004-11-29 at 23:34 +0100, Colin Leroy wrote:
> On 30 Nov 2004 at 09h11, Benjamin Herrenschmidt wrote:
> 
> Hi, 
> 
> > Hrm... there is some problem in communication here. I asked you which
> > controller out of the 3 OHCIs you have in this machine is the culprit,
> > you give me a list of all of them but without PCI IDs ... From the
> > archive, I think it was USB bus #4 no ? not sure which of these
> > controllers it matches. 
> > 
> > The iBook G4 has actually 3 "Apple" OHCI's in KeyLargo/Intrepid but
> > with 2 of them disabled by the firmware (not wired) plus one NEC USB2
> > controller (which contains 1 EHCI and 2 OHCIs) on the PCI bus. The
> > code managing their sleep process is very different.
> 
> Sorry, I was away and had a problem of /proc/bus/usb being empty. As my
> link was on the wireless stick I couldn't reload usb modules. The
> culprit is usb 4-1, I think it would be this one (as the stick is bus
> 004 device 001):

Ok, this is a perfectly normal "off the shelf" NEC chip, no
special "Mac" thing in there, it just uses normal PCI PM...

It could be one of the devices not properly dealing with being
suspended, or it could be some delay needing to be increased here or
there in the resume process, difficult to say at this point.

Ben.
 



* Re: [linux-usb-devel] [PATCH] Ohci-hcd: fix endless loop (second take)
  2004-11-29 22:43               ` Benjamin Herrenschmidt
@ 2004-11-30  0:41                 ` David Brownell
  0 siblings, 0 replies; 11+ messages in thread
From: David Brownell @ 2004-11-30  0:41 UTC (permalink / raw)
  To: linux-usb-devel
  Cc: Benjamin Herrenschmidt, Colin Leroy, Colin Leroy,
	Linux Kernel list, Greg KH, Andrew Morton

On Monday 29 November 2004 2:43 pm, Benjamin Herrenschmidt wrote:
> On Mon, 2004-11-29 at 23:34 +0100, Colin Leroy wrote:
> 
> Ok, this is a perfectly normal "off the shelf" NEC chip, no
> special "Mac" thing in there, it just uses normal PCI PM...
> 
> It could be one of the devices not properly dealing with being
> suspended, or it could be some delay needing to be increased here or
> there in the resume process, difficult to say at this point.

Or as I said before, it's probably one of the issues fixed
in the USB PM patches in 2.6.10-rc2 ... really, it's not
even worth testing that with straight 2.6.9 kernels.  

