linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* WARNING: at drivers/usb/host/ehci-hcd.c:287
@ 2008-03-03  0:05 Christian Kujau
  2008-03-03 22:38 ` Christian Kujau
  2008-03-04  7:49 ` Andrew Morton
  0 siblings, 2 replies; 17+ messages in thread
From: Christian Kujau @ 2008-03-03  0:05 UTC (permalink / raw)
  To: LKML; +Cc: 0x0007

Hi,

after upgrading to 2.6.25-rc3, kern.log shows:

[ 1535.884848] ------------[ cut here ]------------
[ 1535.884855] WARNING: at drivers/usb/host/ehci-hcd.c:287 ehci_iaa_watchdog+0x7a/0x80()
[ 1535.884858] Modules linked in: sha256_generic xt_tcpudp ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_nat_ftp nf_nat nf_conntrack_ftp xt_conntrack nf_conntrack iptable_filter ip_tables ipt_ULOG x_tables nfsd lockd nfs_acl auth_rpcgss exportfs tun sunrpc fuse twofish_i586 twofish_common eeprom w83l785ts asb100 hwmon_vid hwmon snd_pcm_oss snd_mixer_oss zd1211rw firmware_class snd_intel8x0 snd_ac97_codec mac80211 ac97_bus snd_pcm snd_timer snd soundcore snd_page_alloc cfg80211 pl2303 usbserial i2c_nforce2 i2c_core
[ 1535.884901] Pid: 4272, comm: FahCore_78.exe Not tainted 2.6.25-rc3 #6
[ 1535.884906]  [<c0119f9f>] warn_on_slowpath+0x5f/0x90
[ 1535.884918]  [<f8d868b5>] sta_info_get+0x55/0x60 [mac80211]
[ 1535.884940]  [<c0139285>] __lock_acquire+0x4e5/0xfd0
[ 1535.884950]  [<c0139285>] __lock_acquire+0x4e5/0xfd0
[ 1535.884955]  [<c0165487>] kmem_cache_free+0xa7/0xf0
[ 1535.884964]  [<c0102f17>] restore_nocheck+0x12/0x15
[ 1535.884970]  [<c01388c6>] trace_hardirqs_on+0x66/0x110
[ 1535.884978]  [<c04410df>] _spin_lock_irqsave+0x3f/0x50
[ 1535.884987]  [<c036369a>] ehci_iaa_watchdog+0x7a/0x80
[ 1535.884991]  [<c0122cc8>] run_timer_softirq+0x128/0x190
[ 1535.884998]  [<c02b8e3c>] blk_done_softirq+0x3c/0x70
[ 1535.885004]  [<c0363620>] ehci_iaa_watchdog+0x0/0x80
[ 1535.885010]  [<c011f232>] __do_softirq+0x52/0xa0
[ 1535.885014]  [<c01056c8>] do_softirq+0x98/0xd0
[ 1535.885020]  [<c0144df0>] handle_level_irq+0x0/0xe0
[ 1535.885027]  [<c011f1bd>] irq_exit+0x4d/0x70
[ 1535.885031]  [<c0105790>] do_IRQ+0x90/0xf0
[ 1535.885034]  [<c01388fc>] trace_hardirqs_on+0x9c/0x110
[ 1535.885040]  [<c01038c6>] common_interrupt+0x2e/0x34
[ 1535.885049]  =======================
[ 1535.885051] ---[ end trace b6c2a2178c98c162 ]---


There's no oops, and the box is running fine. The only USB devices in use 
is the zd1211rw WLAN adapter and some mp3 player attached as usb-storage.

I've seen http://lkml.org/lkml/2008/2/13/566 with the same warning and a 
real problem, but nothing happened. Linas, is this freeze still present?

Full dmesg and .config: http://nerdbynature.de/bits/2.6.25-rc3/

Thanks,
Christian.
-- 
BOFH excuse #415:

Maintenance window broken

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: WARNING: at drivers/usb/host/ehci-hcd.c:287
  2008-03-03  0:05 WARNING: at drivers/usb/host/ehci-hcd.c:287 Christian Kujau
@ 2008-03-03 22:38 ` Christian Kujau
  2008-03-04 16:29   ` Alan Stern
  2008-03-04  7:49 ` Andrew Morton
  1 sibling, 1 reply; 17+ messages in thread
From: Christian Kujau @ 2008-03-03 22:38 UTC (permalink / raw)
  To: LKML; +Cc: 0x0007, linux-usb

On Mon, 3 Mar 2008, Christian Kujau wrote:
> after upgrading to 2.6.25-rc3, kern.log shows:
>
> [ 1535.884848] ------------[ cut here ]------------
> [ 1535.884855] WARNING: at drivers/usb/host/ehci-hcd.c:287

Hm, after ~24h of uptime, this message appeared 25 times already.
The WLAN USB device was used the whole time, the usb-storage module
was hardly used over the day:

# grep usb /proc/interrupts
  10:   10693403    XT-PIC-XT        ohci_hcd:usb3, eth0
  11:   10270770    XT-PIC-XT        sym53c8xx, ehci_hcd:usb1
  12:         91    XT-PIC-XT        ohci_hcd:usb2

I even tried to trigger the system freeze[0] by using usb-storage
and reading a lot from it, but no freeze, and the message could
not be triggered either - they pop up every now and then, but too often, 
IMHO.

Can anybody shed some light on this?

Thanks,
Christian.

[0] http://lkml.org/lkml/2008/2/13/566
-- 
BOFH excuse #107:

The keyboard isn't plugged in

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: WARNING: at drivers/usb/host/ehci-hcd.c:287
  2008-03-03  0:05 WARNING: at drivers/usb/host/ehci-hcd.c:287 Christian Kujau
  2008-03-03 22:38 ` Christian Kujau
@ 2008-03-04  7:49 ` Andrew Morton
  2008-03-04  8:01   ` Christian Kujau
  1 sibling, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2008-03-04  7:49 UTC (permalink / raw)
  To: Christian Kujau; +Cc: LKML, 0x0007, linux-usb

On Mon, 3 Mar 2008 01:05:53 +0100 (CET) Christian Kujau <lists@nerdbynature.de> wrote:

> Hi,
> 
> after upgrading to 2.6.25-rc3, kern.log shows:

>From which kernel version did you upgrade?

> [ 1535.884848] ------------[ cut here ]------------
> [ 1535.884855] WARNING: at drivers/usb/host/ehci-hcd.c:287 ehci_iaa_watchdog+0x7a/0x80()
> [ 1535.884858] Modules linked in: sha256_generic xt_tcpudp ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_nat_ftp nf_nat nf_conntrack_ftp xt_conntrack nf_conntrack iptable_filter ip_tables ipt_ULOG x_tables nfsd lockd nfs_acl auth_rpcgss exportfs tun sunrpc fuse twofish_i586 twofish_common eeprom w83l785ts asb100 hwmon_vid hwmon snd_pcm_oss snd_mixer_oss zd1211rw firmware_class snd_intel8x0 snd_ac97_codec mac80211 ac97_bus snd_pcm snd_timer snd soundcore snd_page_alloc cfg80211 pl2303 usbserial i2c_nforce2 i2c_core
> [ 1535.884901] Pid: 4272, comm: FahCore_78.exe Not tainted 2.6.25-rc3 #6
> [ 1535.884906]  [<c0119f9f>] warn_on_slowpath+0x5f/0x90
> [ 1535.884918]  [<f8d868b5>] sta_info_get+0x55/0x60 [mac80211]
> [ 1535.884940]  [<c0139285>] __lock_acquire+0x4e5/0xfd0
> [ 1535.884950]  [<c0139285>] __lock_acquire+0x4e5/0xfd0
> [ 1535.884955]  [<c0165487>] kmem_cache_free+0xa7/0xf0
> [ 1535.884964]  [<c0102f17>] restore_nocheck+0x12/0x15
> [ 1535.884970]  [<c01388c6>] trace_hardirqs_on+0x66/0x110
> [ 1535.884978]  [<c04410df>] _spin_lock_irqsave+0x3f/0x50
> [ 1535.884987]  [<c036369a>] ehci_iaa_watchdog+0x7a/0x80
> [ 1535.884991]  [<c0122cc8>] run_timer_softirq+0x128/0x190
> [ 1535.884998]  [<c02b8e3c>] blk_done_softirq+0x3c/0x70
> [ 1535.885004]  [<c0363620>] ehci_iaa_watchdog+0x0/0x80
> [ 1535.885010]  [<c011f232>] __do_softirq+0x52/0xa0
> [ 1535.885014]  [<c01056c8>] do_softirq+0x98/0xd0
> [ 1535.885020]  [<c0144df0>] handle_level_irq+0x0/0xe0
> [ 1535.885027]  [<c011f1bd>] irq_exit+0x4d/0x70
> [ 1535.885031]  [<c0105790>] do_IRQ+0x90/0xf0
> [ 1535.885034]  [<c01388fc>] trace_hardirqs_on+0x9c/0x110
> [ 1535.885040]  [<c01038c6>] common_interrupt+0x2e/0x34
> [ 1535.885049]  =======================
> [ 1535.885051] ---[ end trace b6c2a2178c98c162 ]---
> 
> 
> There's no oops, and the box is running fine. The only USB devices in use 
> is the zd1211rw WLAN adapter and some mp3 player attached as usb-storage.
> 
> I've seen http://lkml.org/lkml/2008/2/13/566 with the same warning and a 
> real problem, but nothing happened. Linas, is this freeze still present?
> 
> Full dmesg and .config: http://nerdbynature.de/bits/2.6.25-rc3/
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: WARNING: at drivers/usb/host/ehci-hcd.c:287
  2008-03-04  7:49 ` Andrew Morton
@ 2008-03-04  8:01   ` Christian Kujau
  2008-03-04  8:10     ` Andrew Morton
  0 siblings, 1 reply; 17+ messages in thread
From: Christian Kujau @ 2008-03-04  8:01 UTC (permalink / raw)
  To: Andrew Morton; +Cc: LKML, 0x0007, linux-usb

On Mon, 3 Mar 2008, Andrew Morton wrote:
> On Mon, 3 Mar 2008 01:05:53 +0100 (CET) Christian Kujau <lists@nerdbynature.de> wrote:
>> after upgrading to 2.6.25-rc3, kern.log shows:
>
> From which kernel version did you upgrade?

I upgraded from 2.6.24.1. I did not try the last -rc kernels due to lack 
of (free) time :-\

C.
-- 
BOFH excuse #405:

Sysadmins unavailable because they are in a meeting talking about why they are unavailable so much.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: WARNING: at drivers/usb/host/ehci-hcd.c:287
  2008-03-04  8:01   ` Christian Kujau
@ 2008-03-04  8:10     ` Andrew Morton
  0 siblings, 0 replies; 17+ messages in thread
From: Andrew Morton @ 2008-03-04  8:10 UTC (permalink / raw)
  To: Christian Kujau; +Cc: LKML, 0x0007, linux-usb, Rafael J. Wysocki

On Tue, 4 Mar 2008 09:01:13 +0100 (CET) Christian Kujau <lists@nerdbynature.de> wrote:

> On Mon, 3 Mar 2008, Andrew Morton wrote:
> > On Mon, 3 Mar 2008 01:05:53 +0100 (CET) Christian Kujau <lists@nerdbynature.de> wrote:
> >> after upgrading to 2.6.25-rc3, kern.log shows:
> >
> > From which kernel version did you upgrade?
> 
> I upgraded from 2.6.24.1. I did not try the last -rc kernels due to lack 
> of (free) time :-\
> 

OK, thanks.  Rafael, can you please add it to th list?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: WARNING: at drivers/usb/host/ehci-hcd.c:287
  2008-03-03 22:38 ` Christian Kujau
@ 2008-03-04 16:29   ` Alan Stern
  2008-03-04 18:53     ` Christian Kujau
  2008-03-04 20:51     ` David Brownell
  0 siblings, 2 replies; 17+ messages in thread
From: Alan Stern @ 2008-03-04 16:29 UTC (permalink / raw)
  To: David Brownell, Christian Kujau; +Cc: LKML, 0x0007, USB list

On Mon, 3 Mar 2008, Christian Kujau wrote:

> On Mon, 3 Mar 2008, Christian Kujau wrote:
> > after upgrading to 2.6.25-rc3, kern.log shows:
> >
> > [ 1535.884848] ------------[ cut here ]------------
> > [ 1535.884855] WARNING: at drivers/usb/host/ehci-hcd.c:287
> 
> Hm, after ~24h of uptime, this message appeared 25 times already.
> The WLAN USB device was used the whole time, the usb-storage module
> was hardly used over the day:
> 
> # grep usb /proc/interrupts
>   10:   10693403    XT-PIC-XT        ohci_hcd:usb3, eth0
>   11:   10270770    XT-PIC-XT        sym53c8xx, ehci_hcd:usb1
>   12:         91    XT-PIC-XT        ohci_hcd:usb2
> 
> I even tried to trigger the system freeze[0] by using usb-storage
> and reading a lot from it, but no freeze, and the message could
> not be triggered either - they pop up every now and then, but too often, 
> IMHO.
> 
> Can anybody shed some light on this?

Dave, it seems to me that this must be an example of a race between the 
iaa watchdog timer expiring and end_unlink_async() running.  It's not 
good enough for end_unlink_async() to call iaa_watchdog_done(), because 
on an SMP system the timer may already have fired and the watchdog 
routine may be spinning on ehci->lock.

It's even worse, because end_unlink_async() can call 
start_unlink_async().  If that happens, the watchdog routine will think 
the timer has expired when in fact it has just been restarted.

How about replacing the

	WARN_ON(!ehci->reclaim);

line in ehci_iaa_watchdog() with

	if (unlikely(!ehci->reclaim || 
			!HC_IS_RUNNING(ehci_to_hcd(ehci)->state) ||
			timer_pending(&ehci->iaa_watchdog))
		goto done;

where "done:" is added just before the spin_unlock_irqrestore?

Alan Stern


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: WARNING: at drivers/usb/host/ehci-hcd.c:287
  2008-03-04 16:29   ` Alan Stern
@ 2008-03-04 18:53     ` Christian Kujau
  2008-03-04 20:27       ` Alan Stern
  2008-03-04 20:51     ` David Brownell
  1 sibling, 1 reply; 17+ messages in thread
From: Christian Kujau @ 2008-03-04 18:53 UTC (permalink / raw)
  To: Alan Stern; +Cc: David Brownell, LKML, 0x0007, USB list

On Tue, 4 Mar 2008, Alan Stern wrote:
> How about replacing the
>
> 	WARN_ON(!ehci->reclaim);
>
> line in ehci_iaa_watchdog() with
>
> 	if (unlikely(!ehci->reclaim ||
> 			!HC_IS_RUNNING(ehci_to_hcd(ehci)->state) ||
> 			timer_pending(&ehci->iaa_watchdog))
> 		goto done;
>
> where "done:" is added just before the spin_unlock_irqrestore?

I'm not sure, but do you mean:

--- linux-2.6/drivers/usb/host/ehci-hcd.c	2008-02-24 22:25:54.000000000 +0100
+++ linux-2.6/drivers/usb/host/ehci-hcd.c.edited	2008-03-04 19:46:13.000000000 +0100
@@ -284,7 +284,10 @@ static void ehci_iaa_watchdog(unsigned l
  	u32			status, cmd;

  	spin_lock_irqsave (&ehci->lock, flags);
-	WARN_ON(!ehci->reclaim);
+	if (unlikely(!ehci->reclaim ||
+			!HC_IS_RUNNING(ehci_to_hcd(ehci)->state) ||
+			timer_pending(&ehci->iaa_watchdog))
+	goto done;

  	status = ehci_readl(ehci, &ehci->regs->status);
  	cmd = ehci_readl(ehci, &ehci->regs->command);


Thanks,
Christian.
-- 
BOFH excuse #189:

SCSI's too wide.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: WARNING: at drivers/usb/host/ehci-hcd.c:287
  2008-03-04 18:53     ` Christian Kujau
@ 2008-03-04 20:27       ` Alan Stern
  0 siblings, 0 replies; 17+ messages in thread
From: Alan Stern @ 2008-03-04 20:27 UTC (permalink / raw)
  To: Christian Kujau; +Cc: David Brownell, LKML, 0x0007, USB list

On Tue, 4 Mar 2008, Christian Kujau wrote:

> I'm not sure, but do you mean:
> 
> --- linux-2.6/drivers/usb/host/ehci-hcd.c	2008-02-24 22:25:54.000000000 +0100
> +++ linux-2.6/drivers/usb/host/ehci-hcd.c.edited	2008-03-04 19:46:13.000000000 +0100
> @@ -284,7 +284,10 @@ static void ehci_iaa_watchdog(unsigned l
>   	u32			status, cmd;
> 
>   	spin_lock_irqsave (&ehci->lock, flags);
> -	WARN_ON(!ehci->reclaim);
> +	if (unlikely(!ehci->reclaim ||
> +			!HC_IS_RUNNING(ehci_to_hcd(ehci)->state) ||
> +			timer_pending(&ehci->iaa_watchdog))
> +	goto done;
> 
>   	status = ehci_readl(ehci, &ehci->regs->status);
>   	cmd = ehci_readl(ehci, &ehci->regs->command);

Almost.  I meant:

Index: usb-2.6/drivers/usb/host/ehci-hcd.c
===================================================================
--- usb-2.6.orig/drivers/usb/host/ehci-hcd.c
+++ usb-2.6/drivers/usb/host/ehci-hcd.c
@@ -260,7 +260,10 @@ static void ehci_iaa_watchdog(unsigned l
 	u32			status, cmd;
 
 	spin_lock_irqsave (&ehci->lock, flags);
-	WARN_ON(!ehci->reclaim);
+	if (unlikely(!ehci->reclaim ||
+			!HC_IS_RUNNING(ehci_to_hcd(ehci)->state) ||
+			timer_pending(&ehci->iaa_watchdog))
+		goto done;
 
 	status = ehci_readl(ehci, &ehci->regs->status);
 	cmd = ehci_readl(ehci, &ehci->regs->command);
@@ -276,7 +279,7 @@ static void ehci_iaa_watchdog(unsigned l
 		ehci_writel(ehci, cmd & ~CMD_IAAD, &ehci->regs->command);
 		end_unlink_async(ehci);
 	}
-
+ done:
 	spin_unlock_irqrestore(&ehci->lock, flags);
 }
 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: WARNING: at drivers/usb/host/ehci-hcd.c:287
  2008-03-04 16:29   ` Alan Stern
  2008-03-04 18:53     ` Christian Kujau
@ 2008-03-04 20:51     ` David Brownell
  2008-03-04 20:57       ` Alan Stern
  1 sibling, 1 reply; 17+ messages in thread
From: David Brownell @ 2008-03-04 20:51 UTC (permalink / raw)
  To: Alan Stern; +Cc: Christian Kujau, LKML, 0x0007, USB list

On Tuesday 04 March 2008, Alan Stern wrote:
> Dave, it seems to me that this must be an example of a race between the 
> iaa watchdog timer expiring and end_unlink_async() running.  It's not 
> good enough for end_unlink_async() to call iaa_watchdog_done(), because 
> on an SMP system the timer may already have fired and the watchdog 
> routine may be spinning on ehci->lock.

Right.


> How about replacing the
> 
>         WARN_ON(!ehci->reclaim);

Or bettter yet, just removing it entirely.  ISTR doing that in some
patches I've not yet sent for merging, and expect that's what my
preferred fix will be ...

- Dave

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: WARNING: at drivers/usb/host/ehci-hcd.c:287
  2008-03-04 20:51     ` David Brownell
@ 2008-03-04 20:57       ` Alan Stern
  2008-03-04 22:01         ` David Brownell
  0 siblings, 1 reply; 17+ messages in thread
From: Alan Stern @ 2008-03-04 20:57 UTC (permalink / raw)
  To: David Brownell; +Cc: Christian Kujau, LKML, 0x0007, USB list

On Tue, 4 Mar 2008, David Brownell wrote:

> On Tuesday 04 March 2008, Alan Stern wrote:
> > Dave, it seems to me that this must be an example of a race between the 
> > iaa watchdog timer expiring and end_unlink_async() running.  It's not 
> > good enough for end_unlink_async() to call iaa_watchdog_done(), because 
> > on an SMP system the timer may already have fired and the watchdog 
> > routine may be spinning on ehci->lock.
> 
> Right.
> 
> 
> > How about replacing the
> > 
> >         WARN_ON(!ehci->reclaim);
> 
> Or bettter yet, just removing it entirely.  ISTR doing that in some
> patches I've not yet sent for merging, and expect that's what my
> preferred fix will be ...

But removing the WARN_ON won't fix the problem.  The routine still
needs to exit immediately if ehci->reclaim isn't set, the root hub
isn't running, or the timer has been restarted.

Alan Stern


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: WARNING: at drivers/usb/host/ehci-hcd.c:287
  2008-03-04 20:57       ` Alan Stern
@ 2008-03-04 22:01         ` David Brownell
  2008-03-04 23:15           ` Christian Kujau
  0 siblings, 1 reply; 17+ messages in thread
From: David Brownell @ 2008-03-04 22:01 UTC (permalink / raw)
  To: Alan Stern; +Cc: Christian Kujau, LKML, 0x0007, USB list

On Tuesday 04 March 2008, Alan Stern wrote:

> > > How about replacing the
> > > 
> > >         WARN_ON(!ehci->reclaim);
> > 
> > Or bettter yet, just removing it entirely.  ISTR doing that in some
> > patches I've not yet sent for merging, and expect that's what my
> > preferred fix will be ...
> 
> But removing the WARN_ON won't fix the problem.

It fully resolves the $SUBJECT problem.  :)


> The routine still 
> needs to exit immediately if ehci->reclaim isn't set,

It more or less does that already; it does some avoidable
work, but that's not incorrect.


> the root hub isn't running, or the timer has been restarted.

Those are different issues.

- Dave

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: WARNING: at drivers/usb/host/ehci-hcd.c:287
  2008-03-04 22:01         ` David Brownell
@ 2008-03-04 23:15           ` Christian Kujau
  2008-03-05  0:30             ` David Brownell
  0 siblings, 1 reply; 17+ messages in thread
From: Christian Kujau @ 2008-03-04 23:15 UTC (permalink / raw)
  To: David Brownell; +Cc: Alan Stern, LKML, 0x0007, USB list

On Tue, 4 Mar 2008, David Brownell wrote:
>>> Or bettter yet, just removing it entirely.  ISTR doing that in some
>>> patches I've not yet sent for merging, and expect that's what my
>>> preferred fix will be ...
>>
>> But removing the WARN_ON won't fix the problem.
>
> It fully resolves the $SUBJECT problem.  :)

...and if it's a safe thing to do, I'm happy with it. I really have no 
clue why shutting up a WARN_ON has no (future) implications but if you 
say so I trust you :)

So now we're down to just:

--- linux-2.6/drivers/usb/host/ehci-hcd.c	2008-03-05 00:08:33.000000000 +0100
+++ linux-2.6/drivers/usb/host/ehci-hcd.c.edited	2008-03-05 00:09:45.000000000 +0100
@@ -284,7 +284,6 @@ static void ehci_iaa_watchdog(unsigned l
  	u32			status, cmd;

  	spin_lock_irqsave (&ehci->lock, flags);
-	WARN_ON(!ehci->reclaim);

  	status = ehci_readl(ehci, &ehci->regs->status);
  	cmd = ehci_readl(ehci, &ehci->regs->command);

Right?

Thank you both for looking into this issue,
Christian.
-- 
BOFH excuse #285:

Telecommunications is upgrading.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: WARNING: at drivers/usb/host/ehci-hcd.c:287
  2008-03-04 23:15           ` Christian Kujau
@ 2008-03-05  0:30             ` David Brownell
  2008-03-05  1:15               ` Christian Kujau
  0 siblings, 1 reply; 17+ messages in thread
From: David Brownell @ 2008-03-05  0:30 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Alan Stern, LKML, 0x0007, USB list

On Tuesday 04 March 2008, Christian Kujau wrote:
> On Tue, 4 Mar 2008, David Brownell wrote:
> >>> Or bettter yet, just removing it entirely.  ISTR doing that in some
> >>> patches I've not yet sent for merging, and expect that's what my
> >>> preferred fix will be ...
> >>
> >> But removing the WARN_ON won't fix the problem.
> >
> > It fully resolves the $SUBJECT problem.  :)
> 
> ...and if it's a safe thing to do, I'm happy with it. I really have no 
> clue why shutting up a WARN_ON has no (future) implications but if you 
> say so I trust you :)

Carrying that WARN_ON forward was inappropriate, so
removing it is a safe thing to do.

There are other lurking issues though.

- Dave

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: WARNING: at drivers/usb/host/ehci-hcd.c:287
  2008-03-05  0:30             ` David Brownell
@ 2008-03-05  1:15               ` Christian Kujau
  2008-03-05  4:25                 ` David Brownell
  0 siblings, 1 reply; 17+ messages in thread
From: Christian Kujau @ 2008-03-05  1:15 UTC (permalink / raw)
  To: David Brownell; +Cc: Alan Stern, LKML, 0x0007, USB list

On Tue, 4 Mar 2008, David Brownell wrote:
> Carrying that WARN_ON forward was inappropriate, so
> removing it is a safe thing to do.

OK, fine with me. Removing the WARN_ON "fixed" this particular issue, so 
feel free to close http://bugzilla.kernel.org/show_bug.cgi?id=10168 :)

> There are other lurking issues though.

uuh...let's hope they don't wake up?

thanks,
C.
-- 
BOFH excuse #238:

You did wha... oh _dear_....

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: WARNING: at drivers/usb/host/ehci-hcd.c:287
  2008-03-05  1:15               ` Christian Kujau
@ 2008-03-05  4:25                 ` David Brownell
  2008-03-05 22:59                   ` Christian Kujau
  0 siblings, 1 reply; 17+ messages in thread
From: David Brownell @ 2008-03-05  4:25 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Alan Stern, LKML, 0x0007, USB list

On Tuesday 04 March 2008, Christian Kujau wrote:
> On Tue, 4 Mar 2008, David Brownell wrote:

> > There are other lurking issues though.
> 
> uuh...let's hope they don't wake up?

Nah.  See the patch I just attached to

	http://bugzilla.kernel.org/show_bug.cgi?id=10078

... appended here for reference.

- Dave


======== CUT HERE
The recent EHCI driver update to split the IAA watchdog timer out from
the other timers made several things work better, but not everything;
and it created a couple new issues in bugzilla.  Ergo this patch:

  - Handle a should-be-rare SMP race between the watchdog firing
    and (very late) IAA interrupts;

  - Remove a shouldn't-have-been-added WARN_ON() test;

  - Guard against one observed OOPS;

  - If this watchdog fires during clean HC shutdown, it should act
    as a NOP instead of interfering with the shutdown sequence;

  - Guard against silicon errata hypothesized by some vendors:
      * IAA status latch broken, but IAAD cleared OK;
      * IAAD wasn't cleared when IAA status got reported;

The WARN_ON is in bugzilla as 10168; the OOPS as 10078.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
---
 drivers/usb/host/ehci-hcd.c |   60 +++++++++++++++++++++++++++++++++-----------
 1 file changed, 45 insertions(+), 15 deletions(-)

--- g26.orig/drivers/usb/host/ehci-hcd.c	2008-03-04 17:24:22.000000000 -0800
+++ g26/drivers/usb/host/ehci-hcd.c	2008-03-04 20:08:28.000000000 -0800
@@ -257,23 +257,44 @@ static void ehci_iaa_watchdog(unsigned l
 {
 	struct ehci_hcd		*ehci = (struct ehci_hcd *) param;
 	unsigned long		flags;
-	u32			status, cmd;
 
 	spin_lock_irqsave (&ehci->lock, flags);
-	WARN_ON(!ehci->reclaim);
 
-	status = ehci_readl(ehci, &ehci->regs->status);
-	cmd = ehci_readl(ehci, &ehci->regs->command);
-	ehci_dbg(ehci, "IAA watchdog: status %x cmd %x\n", status, cmd);
-
-	/* lost IAA irqs wedge things badly; seen first with a vt8235 */
-	if (ehci->reclaim) {
-		if (status & STS_IAA) {
-			ehci_vdbg (ehci, "lost IAA\n");
+	/* Lost IAA irqs wedge things badly; seen first with a vt8235.
+	 * So we need this watchdog, but must protect it against both
+	 * (a) SMP races against real IAA firing and retriggering, and
+	 * (b) clean HC shutdown, when IAA watchdog was pending.
+	 */
+	if (ehci->reclaim
+			&& !timer_pending(&ehci->iaa_watchdog)
+			&& HC_IS_RUNNING(ehci_to_hcd(ehci)->state)) {
+		u32 cmd, status;
+
+		/* If we get here, IAA is *REALLY* late.  It's barely
+		 * conceivable that the system is so busy that CMD_IAAD
+		 * is still legitimately set, so let's be sure it's
+		 * clear before we read STS_IAA.  (The HC should clear
+		 * CMD_IAAD when it sets STS_IAA.)
+		 */
+		cmd = ehci_readl(ehci, &ehci->regs->command);
+		if (cmd & CMD_IAAD)
+			ehci_writel(ehci, cmd & ~CMD_IAAD,
+					&ehci->regs->command);
+
+		/* If IAA is set here it either legitimately triggered
+		 * before we cleared IAAD above (but _way_ late, so we'll
+		 * still count it as lost) ... or a silicon erratum:
+		 * - VIA seems to set IAA without triggering the IRQ;
+		 * - IAAD potentially cleared without setting IAA.
+		 */
+		status = ehci_readl(ehci, &ehci->regs->status);
+		if ((status & STS_IAA) || !(cmd & CMD_IAAD)) {
 			COUNT (ehci->stats.lost_iaa);
 			ehci_writel(ehci, STS_IAA, &ehci->regs->status);
 		}
-		ehci_writel(ehci, cmd & ~CMD_IAAD, &ehci->regs->command);
+
+		ehci_vdbg(ehci, "IAA watchdog: status %x cmd %x\n",
+				status, cmd);
 		end_unlink_async(ehci);
 	}
 
@@ -607,7 +628,7 @@ static int ehci_run (struct usb_hcd *hcd
 static irqreturn_t ehci_irq (struct usb_hcd *hcd)
 {
 	struct ehci_hcd		*ehci = hcd_to_ehci (hcd);
-	u32			status, pcd_status = 0;
+	u32			status, pcd_status = 0, cmd;
 	int			bh;
 
 	spin_lock (&ehci->lock);
@@ -628,7 +649,7 @@ static irqreturn_t ehci_irq (struct usb_
 
 	/* clear (just) interrupts */
 	ehci_writel(ehci, status, &ehci->regs->status);
-	ehci_readl(ehci, &ehci->regs->command);	/* unblock posted write */
+	cmd = ehci_readl(ehci, &ehci->regs->command);
 	bh = 0;
 
 #ifdef	VERBOSE_DEBUG
@@ -649,8 +670,17 @@ static irqreturn_t ehci_irq (struct usb_
 
 	/* complete the unlinking of some qh [4.15.2.3] */
 	if (status & STS_IAA) {
-		COUNT (ehci->stats.reclaim);
-		end_unlink_async(ehci);
+		/* guard against (alleged) silicon errata */
+		if (cmd & CMD_IAAD) {
+			ehci_writel(ehci, cmd & ~CMD_IAAD,
+					&ehci->regs->command);
+			ehci_dbg(ehci, "IAA with IAAD still set?\n");
+		}
+		if (ehci->reclaim) {
+			COUNT(ehci->stats.reclaim);
+			end_unlink_async(ehci);
+		} else
+			ehci_dbg(ehci, "IAA with nothing to reclaim?\n");
 	}
 
 	/* remote wakeup [4.3.1] */

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: WARNING: at drivers/usb/host/ehci-hcd.c:287
  2008-03-05  4:25                 ` David Brownell
@ 2008-03-05 22:59                   ` Christian Kujau
  2008-03-07 19:51                     ` Christian Kujau
  0 siblings, 1 reply; 17+ messages in thread
From: Christian Kujau @ 2008-03-05 22:59 UTC (permalink / raw)
  To: David Brownell; +Cc: Alan Stern, LKML, 0x0007, USB list

On Tue, 4 Mar 2008, David Brownell wrote:
> Nah.  See the patch I just attached to
> 	http://bugzilla.kernel.org/show_bug.cgi?id=10078

Thanks, I'm compiling (applied to -rc4 with some fuzz, hope that's ok) 
now and will test tomorrow.

C.
-- 
BOFH excuse #269:

Melting hard drives

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: WARNING: at drivers/usb/host/ehci-hcd.c:287
  2008-03-05 22:59                   ` Christian Kujau
@ 2008-03-07 19:51                     ` Christian Kujau
  0 siblings, 0 replies; 17+ messages in thread
From: Christian Kujau @ 2008-03-07 19:51 UTC (permalink / raw)
  To: David Brownell; +Cc: Alan Stern, LKML, 0x0007, USB list

On Wed, 5 Mar 2008, Christian Kujau wrote:
> On Tue, 4 Mar 2008, David Brownell wrote:
>> Nah.  See the patch I just attached to
>> 	http://bugzilla.kernel.org/show_bug.cgi?id=10078
>
> Thanks, I'm compiling (applied to -rc4 with some fuzz, hope that's ok) now 
> and will test tomorrow.

For the record: the WARNINGS are gone and USB is still functional. I'm 
having other (much more severe) problems with -rc4 now, but they seem 
unrelated to your patch.

Thanks,
Christian.
-- 
BOFH excuse #317:

Internet exceeded Luser level, please wait until a luser logs off before attempting to log back on.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2008-03-07 19:51 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-03-03  0:05 WARNING: at drivers/usb/host/ehci-hcd.c:287 Christian Kujau
2008-03-03 22:38 ` Christian Kujau
2008-03-04 16:29   ` Alan Stern
2008-03-04 18:53     ` Christian Kujau
2008-03-04 20:27       ` Alan Stern
2008-03-04 20:51     ` David Brownell
2008-03-04 20:57       ` Alan Stern
2008-03-04 22:01         ` David Brownell
2008-03-04 23:15           ` Christian Kujau
2008-03-05  0:30             ` David Brownell
2008-03-05  1:15               ` Christian Kujau
2008-03-05  4:25                 ` David Brownell
2008-03-05 22:59                   ` Christian Kujau
2008-03-07 19:51                     ` Christian Kujau
2008-03-04  7:49 ` Andrew Morton
2008-03-04  8:01   ` Christian Kujau
2008-03-04  8:10     ` Andrew Morton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).