linux-kernel.vger.kernel.org archive mirror
* [patch 2.6.10] ehci "hc died" on startup (chip bug workaround)
@ 2005-01-05 22:35 David Brownell
  2005-01-07 17:43 ` Greg KH
  0 siblings, 1 reply; 4+ messages in thread
From: David Brownell @ 2005-01-05 22:35 UTC
  To: Greg KH; +Cc: linux-usb-devel, Linux Kernel list

[-- Attachment #1: Type: text/plain, Size: 138 bytes --]

We seem to have tracked some annoying board-coupled EHCI startup
problems to a chip bug, with a simple workaround.  Please merge.

- Dave

[-- Attachment #2: e0105b.patch --]
[-- Type: text/x-diff, Size: 1864 bytes --]

This fixes OSDL bugid #3056 for at least some users, where the EHCI
driver gets a "fatal error" IRQ on startup ... only on certain boards,
starting with the 2.6.6 or 2.6.7 kernels.  These IRQs normally indicate
that an invalid DMA address got passed to the controller, or something
equally nasty and unrecoverable.

But it turns out that some of these controllers (at least ALI and Intel)
are lying.  They're issuing these IRQs without stopping, contrary to the
EHCI spec ... so these IRQs can be recovered from.  Thanks to Christian
Iversen for noticing that his ALI controller would continue operating,
which was the first real break in this annoying case.

This patch tests for these bogus IRQs, and ignores them ... working around
what's clearly a chip bug.  It's not clear why we started triggering that
bug, but at least EHCI is now usable on boards exhibiting this problem.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>


--- 1.112/drivers/usb/host/ehci-hcd.c	2004-12-24 18:03:39 -08:00
+++ edited/drivers/usb/host/ehci-hcd.c	2005-01-04 11:53:29 -08:00
@@ -903,14 +903,20 @@
 
 	/* PCI errors [4.15.2.4] */
 	if (unlikely ((status & STS_FATAL) != 0)) {
-		ehci_err (ehci, "fatal error\n");
+		/* bogus "fatal" IRQs appear on some chips... why?  */
+		status = readl (&ehci->regs->status);
+		dbg_cmd (ehci, "fatal", readl (&ehci->regs->command));
+		dbg_status (ehci, "fatal", status);
+		if (status & STS_HALT) {
+			ehci_err (ehci, "fatal error\n");
 dead:
-		ehci_reset (ehci);
-		writel (0, &ehci->regs->configured_flag);
-		/* generic layer kills/unlinks all urbs, then
-		 * uses ehci_stop to clean up the rest
-		 */
-		bh = 1;
+			ehci_reset (ehci);
+			writel (0, &ehci->regs->configured_flag);
+			/* generic layer kills/unlinks all urbs, then
+			 * uses ehci_stop to clean up the rest
+			 */
+			bh = 1;
+		}
 	}
 
 	if (bh)
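
The check the patch adds can be sketched outside the driver as below. This is only an illustrative sketch, not code from ehci-hcd.c: the helper name fatal_irq_is_real() is hypothetical, and the bit positions are assumed from the USBSTS register layout in the EHCI spec (Host System Error at bit 4, HCHalted at bit 12). It takes the status value that raised the IRQ plus a fresh re-read of the status register, and treats the "fatal" IRQ as genuine only if the controller has actually halted.

```c
#include <stdbool.h>
#include <stdint.h>

/* USBSTS bits, assumed from the EHCI register layout. */
#define STS_FATAL (1u << 4)   /* Host System Error ("fatal error" IRQ) */
#define STS_HALT  (1u << 12)  /* HCHalted: controller has stopped */

/*
 * Hypothetical helper (not part of the driver): decide whether a
 * "fatal error" IRQ should be acted on.
 *
 * irq_status    - status bits seen when the IRQ was taken
 * reread_status - status register value read again afterwards
 *
 * Returns true only when a fatal IRQ was signalled AND the controller
 * reports itself halted, which is what the spec says must accompany a
 * real host system error. A fatal bit without a halt is treated as bogus.
 */
static bool fatal_irq_is_real(uint32_t irq_status, uint32_t reread_status)
{
	if ((irq_status & STS_FATAL) == 0)
		return false;	/* no fatal IRQ at all */

	return (reread_status & STS_HALT) != 0;
}
```

In the patch itself the same decision is made inline: the reset and cleanup path only runs when the re-read status has STS_HALT set, and otherwise the IRQ is logged through the debug helpers and ignored.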


* Re: [patch 2.6.10] ehci "hc died" on startup (chip bug workaround)
  2005-01-05 22:35 [patch 2.6.10] ehci "hc died" on startup (chip bug workaround) David Brownell
@ 2005-01-07 17:43 ` Greg KH
  2005-01-07 18:05   ` David Brownell
  0 siblings, 1 reply; 4+ messages in thread
From: Greg KH @ 2005-01-07 17:43 UTC
  To: David Brownell; +Cc: linux-usb-devel, Linux Kernel list

On Wed, Jan 05, 2005 at 02:35:42PM -0800, David Brownell wrote:
> We seem to have tracked some annoying board-coupled EHCI startup
> problems to a chip bug, with a simple workaround.  Please merge.

Hm, I get a reject from this:
drivers/usb/host/ehci-hcd.c 1.153: 1210 lines
patching file drivers/usb/host/ehci-hcd.c
Hunk #1 FAILED at 903.
1 out of 1 hunk FAILED -- saving rejects to file drivers/usb/host/ehci-hcd.c.rej

What kernel tree is it against?

thanks,

greg k-h


* Re: [patch 2.6.10] ehci "hc died" on startup (chip bug workaround)
  2005-01-07 17:43 ` Greg KH
@ 2005-01-07 18:05   ` David Brownell
  2005-01-07 18:29     ` Greg KH
  0 siblings, 1 reply; 4+ messages in thread
From: David Brownell @ 2005-01-07 18:05 UTC
  To: Greg KH; +Cc: linux-usb-devel, Linux Kernel list

[-- Attachment #1: Type: text/plain, Size: 455 bytes --]

On Friday 07 January 2005 9:43 am, Greg KH wrote:
> On Wed, Jan 05, 2005 at 02:35:42PM -0800, David Brownell wrote:
> > We seem to have tracked some annoying board-coupled EHCI startup
> > problems to a chip bug, with a simple workaround.  Please merge.
> 
> Hm, I get a reject from this:
> ...
> 
> What kernel tree is it against?

Probably my gadget-2.6 tree; here's one that applies against
current 2.5 BK or your USB integration tree.  Sorry!

- Dave

[-- Attachment #2: e0107.patch --]
[-- Type: text/x-diff, Size: 1840 bytes --]

This fixes OSDL bugid #3056 for at least some users, where the EHCI
driver gets a "fatal error" IRQ on startup ... only on certain boards,
starting with the 2.6.6 or 2.6.7 kernels.  These IRQs normally indicate
that an invalid DMA address got passed to the controller, or something
equally nasty and unrecoverable.

But it turns out that some of these controllers (at least ALI and Intel)
are lying.  They're issuing these IRQs without stopping, contrary to the
EHCI spec ... so these IRQs can be recovered from.  Thanks to Christian
Iversen for noticing that his ALI controller would continue operating,
which was the first real break in this annoying case.

This patch tests for these bogus IRQs, and ignores them ... working around
what's clearly a chip bug.  It's not clear why we started triggering that
bug, but at least EHCI is now usable on boards exhibiting this problem.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>


--- xu26/drivers/usb/host/ehci-hcd.c	2004-12-20 15:07:23.000000000 -0800
+++ gadget-2.6/drivers/usb/host/ehci-hcd.c	2005-01-04 12:01:46.000000000 -0800
@@ -883,13 +903,20 @@
 
 	/* PCI errors [4.15.2.4] */
 	if (unlikely ((status & STS_FATAL) != 0)) {
-		ehci_err (ehci, "fatal error\n");
+		/* bogus "fatal" IRQs appear on some chips... why?  */
+		status = readl (&ehci->regs->status);
+		dbg_cmd (ehci, "fatal", readl (&ehci->regs->command));
+		dbg_status (ehci, "fatal", status);
+		if (status & STS_HALT) {
+			ehci_err (ehci, "fatal error\n");
 dead:
-		ehci_reset (ehci);
-		/* generic layer kills/unlinks all urbs, then
-		 * uses ehci_stop to clean up the rest
-		 */
-		bh = 1;
+			ehci_reset (ehci);
+			writel (0, &ehci->regs->configured_flag);
+			/* generic layer kills/unlinks all urbs, then
+			 * uses ehci_stop to clean up the rest
+			 */
+			bh = 1;
+		}
 	}
 
 	if (bh)


* Re: [patch 2.6.10] ehci "hc died" on startup (chip bug workaround)
  2005-01-07 18:05   ` David Brownell
@ 2005-01-07 18:29     ` Greg KH
  0 siblings, 0 replies; 4+ messages in thread
From: Greg KH @ 2005-01-07 18:29 UTC
  To: David Brownell; +Cc: linux-usb-devel, Linux Kernel list

On Fri, Jan 07, 2005 at 10:05:43AM -0800, David Brownell wrote:
> On Friday 07 January 2005 9:43 am, Greg KH wrote:
> > On Wed, Jan 05, 2005 at 02:35:42PM -0800, David Brownell wrote:
> > > We seem to have tracked some annoying board-coupled EHCI startup
> > > problems to a chip bug, with a simple workaround.  Please merge.
> > 
> > Hm, I get a reject from this:
> > ...
> > 
> > What kernel tree is it against?
> 
> Probably my gadget-2.6 tree; here's one that applies against
> current 2.5 BK or your USB integration tree.  Sorry!

Ah, much better, that worked.  Thanks, applied.

greg k-h

