linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: USB 2.0 with 250Gb disk and insane loads
@ 2003-06-01 14:31 David Brownell
  2003-06-01 14:53 ` Oliver Neukum
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: David Brownell @ 2003-06-01 14:31 UTC (permalink / raw)
  To: alexh; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1232 bytes --]

> I'm trying to nail own a problem here, with my shiny new Maxtor 250Gb
> USB 2.0 disk. Under 2.4 (vanilla, latest 21-preX and 21-preX-acY) the
> disk will mount and talk nicely. As soon as any load hits it, e.g. a
> single cp from my internal CD-ROM to the disk, the mahcine load will
> sky-rocket and at some point within a few minuter hang the machine.
> 
> On 2.5 (vanilla and -mm) the load will show as i/o-wait and at some
> point hang any access to the drive, but the kernel will go on working.

That's a big clue -- nothing in the USB code ever shows up
as "i/o wait".  It can't, since USB is usually built as
modules and things like io_schedule() are, for some odd
reason, never exported for use in modules.  So USB I/O
can't use them, and won't show up as "i/o" ... and that
load must come from some place other than USB.

There are some patches that really ought to get merged
into the 2.4.21 release (2.5.70 has both of these), but
they shouldn't affect anything with an "i/o wait" symptom.

- Dave

[1] ehci-0429 ... applies to 2.4.21 with some fuzz, adds
     i/o watchdog timer, addressing both hardware (on some
     silicon) and software (suspected) irq lossage.

[2] ehci-0519 ... fixes some SMP-only problems.




[-- Attachment #2: ehci-0429.patch --]
[-- Type: text/plain, Size: 6479 bytes --]

--- 1.46/drivers/usb/host/ehci-hcd.c	Wed Mar 19 02:25:01 2003
+++ edited/drivers/usb/host/ehci-hcd.c	Thu Apr 24 13:28:01 2003
@@ -116,8 +116,10 @@
 #define	EHCI_TUNE_MULT_TT	1
 #define	EHCI_TUNE_FLS		2	/* (small) 256 frame schedule */
 
-#define EHCI_WATCHDOG_JIFFIES	(HZ/100)	/* arbitrary; ~10 msec */
+#define EHCI_IAA_JIFFIES	(HZ/100)	/* arbitrary; ~10 msec */
+#define EHCI_IO_JIFFIES		(HZ/10)		/* io watchdog > irq_thresh */
 #define EHCI_ASYNC_JIFFIES	(HZ/20)		/* async idle timeout */
+#define EHCI_SHRINK_JIFFIES	(HZ/200)	/* async qh unlink delay */
 
 /* Initial IRQ latency:  lower than default */
 static int log2_irq_thresh = 0;		// 0 to 6
@@ -266,16 +268,13 @@
 		}
 	}
 
+ 	/* stop async processing after it's idled a bit */
+	if (test_bit (TIMER_ASYNC_OFF, &ehci->actions))
+ 		start_unlink_async (ehci, ehci->async);
+
+	/* ehci could run by timer, without IRQs ... */
 	ehci_work (ehci, NULL);
-	if (ehci->reclaim && !timer_pending (&ehci->watchdog))
-		mod_timer (&ehci->watchdog,
-				jiffies + EHCI_WATCHDOG_JIFFIES);
 
- 	/* stop async processing after it's idled a while */
-	else if (ehci->async_idle) {
- 		start_unlink_async (ehci, ehci->async);
- 		ehci->async_idle = 0;
-	}
 	spin_unlock_irqrestore (&ehci->lock, flags);
 }
 
@@ -658,11 +657,18 @@
  */
 static void ehci_work (struct ehci_hcd *ehci, struct pt_regs *regs)
 {
+	timer_action_done (ehci, TIMER_IO_WATCHDOG);
 	if (ehci->reclaim_ready)
 		end_unlink_async (ehci, regs);
 	scan_async (ehci, regs);
 	if (ehci->next_uframe != -1)
 		scan_periodic (ehci, regs);
+
+	/* the IO watchdog guards against hardware or driver bugs that
+	 * misplace IRQs, and should let us run completely without IRQs.
+	 */
+	if ((ehci->async->qh_next.ptr != 0) || (ehci->periodic_sched != 0))
+		timer_action (ehci, TIMER_IO_WATCHDOG);
 }
 
 /*-------------------------------------------------------------------------*/
--- 1.44/drivers/usb/host/ehci-q.c	Mon Feb 24 03:30:38 2003
+++ edited/drivers/usb/host/ehci-q.c	Thu Apr 24 13:27:50 2003
@@ -704,8 +704,7 @@
 
 	/* (re)start the async schedule? */
 	head = ehci->async;
-	if (ehci->async_idle)
-		del_timer (&ehci->watchdog);
+	timer_action_done (ehci, TIMER_ASYNC_OFF);
 	if (!head->qh_next.qh) {
 		u32	cmd = readl (&ehci->regs->command);
 
@@ -731,8 +730,6 @@
 
 	qh->qh_state = QH_STATE_LINKED;
 	/* qtd completions reported later by interrupt */
-
-	ehci->async_idle = 0;
 }
 
 /*-------------------------------------------------------------------------*/
@@ -914,7 +911,7 @@
 	struct ehci_qh		*qh = ehci->reclaim;
 	struct ehci_qh		*next;
 
-	del_timer (&ehci->watchdog);
+	timer_action_done (ehci, TIMER_IAA_WATCHDOG);
 
 	qh->hw_next = cpu_to_le32 (qh->qh_dma);
 	qh->qh_state = QH_STATE_IDLE;
@@ -939,12 +936,8 @@
 		 * active but idle for a while once it empties.
 		 */
 		if (HCD_IS_RUNNING (ehci->hcd.state)
-				&& ehci->async->qh_next.qh == 0
-				&& !timer_pending (&ehci->watchdog)) {
-			ehci->async_idle = 1;
-			mod_timer (&ehci->watchdog,
-					jiffies + EHCI_ASYNC_JIFFIES);
-		}
+				&& ehci->async->qh_next.qh == 0)
+			timer_action (ehci, TIMER_ASYNC_OFF);
 	}
 
 	if (next)
@@ -979,6 +972,7 @@
 			wmb ();
 			// handshake later, if we need to
 		}
+		timer_action_done (ehci, TIMER_ASYNC_OFF);
 		return;
 	} 
 
@@ -1004,9 +998,8 @@
 	ehci->reclaim_ready = 0;
 	cmd |= CMD_IAAD;
 	writel (cmd, &ehci->regs->command);
-	/* posted write need not be known to HC yet ... */
-
-	mod_timer (&ehci->watchdog, jiffies + EHCI_WATCHDOG_JIFFIES);
+	(void) readl (&ehci->regs->command);
+	timer_action (ehci, TIMER_IAA_WATCHDOG);
 }
 
 /*-------------------------------------------------------------------------*/
@@ -1015,10 +1008,11 @@
 scan_async (struct ehci_hcd *ehci, struct pt_regs *regs)
 {
 	struct ehci_qh		*qh;
-	int			unlink_delay = 0;
+	enum ehci_timer_action	action = TIMER_IO_WATCHDOG;
 
 	if (!++(ehci->stamp))
 		ehci->stamp++;
+	timer_action_done (ehci, TIMER_ASYNC_SHRINK);
 rescan:
 	qh = ehci->async->qh_next.qh;
 	if (likely (qh != 0)) {
@@ -1050,17 +1044,14 @@
 			 */
 			if (list_empty (&qh->qtd_list)) {
 				if (qh->stamp == ehci->stamp)
-					unlink_delay = 1;
-				else if (!ehci->reclaim) {
+					action = TIMER_ASYNC_SHRINK;
+				else if (!ehci->reclaim)
 					start_unlink_async (ehci, qh);
-					unlink_delay = 0;
-				}
 			}
 
 			qh = qh->qh_next.qh;
 		} while (qh);
 	}
-
-	if (unlink_delay && !timer_pending (&ehci->watchdog))
-		mod_timer (&ehci->watchdog, jiffies + EHCI_WATCHDOG_JIFFIES/2);
+	if (action == TIMER_ASYNC_SHRINK)
+		timer_action (ehci, TIMER_ASYNC_SHRINK);
 }
--- 1.19/drivers/usb/host/ehci.h	Mon Feb 24 03:30:38 2003
+++ edited/drivers/usb/host/ehci.h	Thu Apr 24 13:29:02 2003
@@ -52,8 +52,7 @@
 	/* async schedule support */
 	struct ehci_qh		*async;
 	struct ehci_qh		*reclaim;
-	int			reclaim_ready : 1,
-				async_idle : 1;
+	int			reclaim_ready : 1;
 
 	/* periodic schedule support */
 #define	DEFAULT_I_TDPS		1024		/* some HCs can do less */
@@ -83,6 +82,7 @@
 
 	struct timer_list	watchdog;
 	struct notifier_block	reboot_notifier;
+	unsigned long		actions;
 	unsigned		stamp;
 
 	/* irq statistics */
@@ -99,6 +99,53 @@
 
 /* NOTE:  urb->transfer_flags expected to not use this bit !!! */
 #define EHCI_STATE_UNLINK	0x8000		/* urb being unlinked */
+
+enum ehci_timer_action {
+	TIMER_IO_WATCHDOG,
+	TIMER_IAA_WATCHDOG,
+	TIMER_ASYNC_SHRINK,
+	TIMER_ASYNC_OFF,
+};
+
+static inline void
+timer_action_done (struct ehci_hcd *ehci, enum ehci_timer_action action)
+{
+	clear_bit (action, &ehci->actions);
+}
+
+static inline void
+timer_action (struct ehci_hcd *ehci, enum ehci_timer_action action)
+{
+	if (!test_and_set_bit (action, &ehci->actions)) {
+		unsigned long t;
+
+		switch (action) {
+		case TIMER_IAA_WATCHDOG:
+			t = EHCI_IAA_JIFFIES;
+			break;
+		case TIMER_IO_WATCHDOG:
+			t = EHCI_IO_JIFFIES;
+			break;
+		case TIMER_ASYNC_OFF:
+			t = EHCI_ASYNC_JIFFIES;
+			break;
+		// case TIMER_ASYNC_SHRINK:
+		default:
+			t = EHCI_SHRINK_JIFFIES;
+			break;
+		}
+		t += jiffies;
+		// all timings except IAA watchdog can be overridden.
+		// async queue SHRINK often precedes IAA.  while it's ready
+		// to go OFF neither can matter, and afterwards the IO
+		// watchdog stops unless there's still periodic traffic.
+		if (action != TIMER_IAA_WATCHDOG
+				&& t > ehci->watchdog.expires
+				&& timer_pending (&ehci->watchdog))
+			return;
+		mod_timer (&ehci->watchdog, t);
+	}
+}
 
 /*-------------------------------------------------------------------------*/
 

[-- Attachment #3: ehci-0519.patch --]
[-- Type: text/plain, Size: 964 bytes --]

--- 1.21/drivers/usb/host/ehci-q.c	Mon May 12 22:25:05 2003
+++ edited/ehci-q.c	Mon May 19 09:10:48 2003
@@ -958,11 +958,11 @@
 	qh->qh_state = QH_STATE_IDLE;
 	qh->qh_next.qh = 0;
 	qh_put (ehci, qh);			// refcount from reclaim 
-	ehci->reclaim = 0;
-	ehci->reclaim_ready = 0;
 
 	/* other unlink(s) may be pending (in QH_STATE_UNLINK_WAIT) */
 	next = qh->reclaim;
+	ehci->reclaim = next;
+	ehci->reclaim_ready = 0;
 	qh->reclaim = 0;
 
 	qh_completions (ehci, qh, regs);
@@ -981,8 +981,10 @@
 			timer_action (ehci, TIMER_ASYNC_OFF);
 	}
 
-	if (next)
+	if (next) {
+		ehci->reclaim = 0;
 		start_unlink_async (ehci, next);
+	}
 }
 
 /* makes sure the async qh will become idle */
@@ -1086,7 +1088,8 @@
 			if (list_empty (&qh->qtd_list)) {
 				if (qh->stamp == ehci->stamp)
 					action = TIMER_ASYNC_SHRINK;
-				else if (!ehci->reclaim)
+				else if (!ehci->reclaim
+					    && qh->qh_state == QH_STATE_LINKED)
 					start_unlink_async (ehci, qh);
 			}
 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: USB 2.0 with 250Gb disk and insane loads
  2003-06-01 14:31 USB 2.0 with 250Gb disk and insane loads David Brownell
@ 2003-06-01 14:53 ` Oliver Neukum
  2003-06-01 15:17   ` Alexander Hoogerhuis
  2003-06-01 14:57 ` Alexander Hoogerhuis
  2003-06-13 20:19 ` David Brownell
  2 siblings, 1 reply; 12+ messages in thread
From: Oliver Neukum @ 2003-06-01 14:53 UTC (permalink / raw)
  To: David Brownell, alexh; +Cc: linux-kernel

Am Sonntag, 1. Juni 2003 16:31 schrieb David Brownell:
> > I'm trying to nail own a problem here, with my shiny new Maxtor 250Gb
> > USB 2.0 disk. Under 2.4 (vanilla, latest 21-preX and 21-preX-acY) the
> > disk will mount and talk nicely. As soon as any load hits it, e.g. a
> > single cp from my internal CD-ROM to the disk, the mahcine load will
> > sky-rocket and at some point within a few minuter hang the machine.
> >
> > On 2.5 (vanilla and -mm) the load will show as i/o-wait and at some
> > point hang any access to the drive, but the kernel will go on working.
>
> That's a big clue -- nothing in the USB code ever shows up
> as "i/o wait".  It can't, since USB is usually built as
> modules and things like io_schedule() are, for some odd
> reason, never exported for use in modules.  So USB I/O
> can't use them, and won't show up as "i/o" ... and that
> load must come from some place other than USB.

Probably the block layer as it waits for free io slots.
But that doesn't tell us why the requests are not executed.
Where is SCSI timeout kicking in?
Have you tried enabling debugging in storage?
Could you try on USB1.1 only?

	Regards
		Oliver


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: USB 2.0 with 250Gb disk and insane loads
  2003-06-01 14:31 USB 2.0 with 250Gb disk and insane loads David Brownell
  2003-06-01 14:53 ` Oliver Neukum
@ 2003-06-01 14:57 ` Alexander Hoogerhuis
  2003-06-13 20:19 ` David Brownell
  2 siblings, 0 replies; 12+ messages in thread
From: Alexander Hoogerhuis @ 2003-06-01 14:57 UTC (permalink / raw)
  To: David Brownell; +Cc: linux-kernel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Brownell <david-b@pacbell.net> writes:

> > I'm trying to nail own a problem here, with my shiny new Maxtor 250Gb
> > USB 2.0 disk. Under 2.4 (vanilla, latest 21-preX and 21-preX-acY) the
> > disk will mount and talk nicely. As soon as any load hits it, e.g. a
> > single cp from my internal CD-ROM to the disk, the mahcine load will
> > sky-rocket and at some point within a few minuter hang the machine.
> > On 2.5 (vanilla and -mm) the load will show as i/o-wait and at some
> > point hang any access to the drive, but the kernel will go on working.
> 
> That's a big clue -- nothing in the USB code ever shows up
> as "i/o wait".  It can't, since USB is usually built as
> modules and things like io_schedule() are, for some odd
> reason, never exported for use in modules.  So USB I/O
> can't use them, and won't show up as "i/o" ... and that
> load must come from some place other than USB.
> 
> There are some patches that really ought to get merged
> into the 2.4.21 release (2.5.70 has both of these), but
> they shouldn't affect anything with an "i/o wait" symptom.
> 

Here's the output from about one minute of cp'ing to the disk on an
idle machine:

top - 16:51:17 up 16:37,  8 users,  load average: 3.59, 1.43, 0.70
Tasks:  85 total,   3 running,  82 sleeping,   0 stopped,   0 zombie
Cpu(s):   7.0% user,   2.0% system,   0.0% nice,   0.0% idle,  91.0% IO-wait
Mem:    775072k total,   772444k used,     2628k free,     2820k buffers
Swap:   136040k total,    14372k used,   121668k free,   600664k cached

Both the load and the percentage of time marked as "io-wait" is
completely bogus.

I ran procinfo too, to dump irq stats, and it ran with ~2k interrupts
per 10 secs (matching numer of irq's for the ide-system where the file
comes from) and at some point it does, no message in the kernel
log. Currently it's doing about 20 interrupts per 10 seconds and the
cp command cannot be broken off.

Doing a ps -efadl for the cp command yiled this:

0 D alexh    25915  4861  1  75   0 -   845 io_sch 16:49 pts/3 00:00:05 cp

and if it's of any use, it's /proc/$PID/status file contains this:

alexh@lapper /proc/25915 $ cat status
Name:   cp
State:  D (disk sleep)
Tgid:   25915
Pid:    25915
PPid:   4861
TracerPid:      0
Uid:    1000    1000    1000    1000
Gid:    100     100     100     100
FDSize: 256
Groups: 100 5 10 18 19 35 80
VmSize:     3380 kB
VmLck:         0 kB
VmRSS:      1376 kB
VmData:       20 kB
VmStk:        12 kB
VmExe:        48 kB
VmLib:      1228 kB
SigPnd: 0000000000000100
ShdPnd: 0000000000000002
SigBlk: 0000000080000000
SigIgn: 8000000000000000
SigCgt: 0000000000000000
CapInh: 0000000000000000
CapPrm: 00000000ffffffff
CapEff: 00000000fffffeff

and wchan says "io_schedule".

mvh,
A
- -- 
Alexander Hoogerhuis                               | alexh@ihatent.com
CCNP - CCDP - MCNE - CCSE                          | +47 908 21 485
"You have zero privacy anyway. Get over it."  --Scott McNealy
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)
Comment: Processed by Mailcrypt 3.5.8 <http://mailcrypt.sourceforge.net/>

iD8DBQE+2hRcCQ1pa+gRoggRAgtSAJsGsxr3Bo+TG6KPpP7mK4IJ1pphXQCfcBqE
Vy2EVyvseC9L6OmNd6P/3PM=
=U4HH
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: USB 2.0 with 250Gb disk and insane loads
  2003-06-01 14:53 ` Oliver Neukum
@ 2003-06-01 15:17   ` Alexander Hoogerhuis
  2003-06-01 18:21     ` Oliver Neukum
  2003-06-03 11:49     ` Terje Malmedal
  0 siblings, 2 replies; 12+ messages in thread
From: Alexander Hoogerhuis @ 2003-06-01 15:17 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: David Brownell, linux-kernel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Oliver Neukum <oliver@neukum.org> writes:

> Am Sonntag, 1. Juni 2003 16:31 schrieb David Brownell:
> > > I'm trying to nail own a problem here, with my shiny new Maxtor 250Gb
> > > USB 2.0 disk. Under 2.4 (vanilla, latest 21-preX and 21-preX-acY) the
> > > disk will mount and talk nicely. As soon as any load hits it, e.g. a
> > > single cp from my internal CD-ROM to the disk, the mahcine load will
> > > sky-rocket and at some point within a few minuter hang the machine.
> > >
> > > On 2.5 (vanilla and -mm) the load will show as i/o-wait and at some
> > > point hang any access to the drive, but the kernel will go on working.
> >
> > That's a big clue -- nothing in the USB code ever shows up
> > as "i/o wait".  It can't, since USB is usually built as
> > modules and things like io_schedule() are, for some odd
> > reason, never exported for use in modules.  So USB I/O
> > can't use them, and won't show up as "i/o" ... and that
> > load must come from some place other than USB.
> 
> Probably the block layer as it waits for free io slots.
> But that doesn't tell us why the requests are not executed.
> Where is SCSI timeout kicking in?

I'm not seeing any scsi timeouts in the logs.

> Have you tried enabling debugging in storage?

Have it compiled in now, just need to wait abotu an hour for seomthing
to finish before I can boot.

> Could you try on USB1.1 only?
> 

Stuck it in an older machine on USB 1.1 and it foudn the disk fine
(redhat 9, 2.4.20-13.9 kernel on that machine), and ditto result:

19:15:16  up 2 days, 20:23,  4 users,  load average: 6.02, 2.41, 0.89
58 processes: 55 sleeping, 3 running, 0 zombie, 0 stopped
CPU states:   0.2% user   4.0% system   0.0% nice   0.0% iowait  95.8% idle
Mem:   385040k av,  380820k used,    4220k free,       0k shrd,   67368k buff
       224720k active,              69412k inactive
Swap:  521632k av,      80k used,  521552k free                  237452k cached
                                                                                
and generating about 2500 interrupts for the usb controller per 10
seconds and when i finally break it off and give it "sync" it uses
about two minutes with about 4500 per 10 seconds to get it all on
disk. On 2.4 the machine becomes more and more sluggish if I let it go
more than a short minute.

> 	Regards
> 		Oliver
> 

mvh,
A

- -- 
Alexander Hoogerhuis                               | alexh@ihatent.com
CCNP - CCDP - MCNE - CCSE                          | +47 908 21 485
"You have zero privacy anyway. Get over it."  --Scott McNealy
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)
Comment: Processed by Mailcrypt 3.5.8 <http://mailcrypt.sourceforge.net/>

iD8DBQE+2hkXCQ1pa+gRoggRAuSRAKClouXqQkJZYmIN/DxFcoXvXd85yACghtTu
FaQRm6d9wqy4OxOpDG9zmWg=
=cFZM
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: USB 2.0 with 250Gb disk and insane loads
  2003-06-01 15:17   ` Alexander Hoogerhuis
@ 2003-06-01 18:21     ` Oliver Neukum
  2003-06-02  6:51       ` Alexander Hoogerhuis
  2003-06-02 14:03       ` Alexander Hoogerhuis
  2003-06-03 11:49     ` Terje Malmedal
  1 sibling, 2 replies; 12+ messages in thread
From: Oliver Neukum @ 2003-06-01 18:21 UTC (permalink / raw)
  To: Alexander Hoogerhuis; +Cc: David Brownell, linux-kernel


> > Probably the block layer as it waits for free io slots.
> > But that doesn't tell us why the requests are not executed.
> > Where is SCSI timeout kicking in?
>
> I'm not seeing any scsi timeouts in the logs.

So it seems that the driver doesn't fail utterly, but crawls along.
Storage's debugging output should clarify the situation.

[..]
> > Could you try on USB1.1 only?
>
> Stuck it in an older machine on USB 1.1 and it foudn the disk fine
> (redhat 9, 2.4.20-13.9 kernel on that machine), and ditto result:
>
> 19:15:16  up 2 days, 20:23,  4 users,  load average: 6.02, 2.41, 0.89
> 58 processes: 55 sleeping, 3 running, 0 zombie, 0 stopped
> CPU states:   0.2% user   4.0% system   0.0% nice   0.0% iowait  95.8% idle
> Mem:   385040k av,  380820k used,    4220k free,       0k shrd,   67368k
> buff 224720k active,              69412k inactive
> Swap:  521632k av,      80k used,  521552k free                  237452k
> cached
>
> and generating about 2500 interrupts for the usb controller per 10
> seconds and when i finally break it off and give it "sync" it uses
> about two minutes with about 4500 per 10 seconds to get it all on
> disk. On 2.4 the machine becomes more and more sluggish if I let it go
> more than a short minute.

Which 2.4 ?

	Regards
		Oliver


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: USB 2.0 with 250Gb disk and insane loads
  2003-06-01 18:21     ` Oliver Neukum
@ 2003-06-02  6:51       ` Alexander Hoogerhuis
  2003-06-02 14:03       ` Alexander Hoogerhuis
  1 sibling, 0 replies; 12+ messages in thread
From: Alexander Hoogerhuis @ 2003-06-02  6:51 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: David Brownell, linux-kernel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Oliver Neukum <oliver@neukum.org> writes:

> > > Probably the block layer as it waits for free io slots.
> > > But that doesn't tell us why the requests are not executed.
> > > Where is SCSI timeout kicking in?
> >
> > I'm not seeing any scsi timeouts in the logs.
> 
> So it seems that the driver doesn't fail utterly, but crawls along.
> Storage's debugging output should clarify the situation.
> 
> [..]
> > > Could you try on USB1.1 only?
> >
> > Stuck it in an older machine on USB 1.1 and it foudn the disk fine
> > (redhat 9, 2.4.20-13.9 kernel on that machine), and ditto result:
> >
> > 19:15:16  up 2 days, 20:23,  4 users,  load average: 6.02, 2.41, 0.89
> > 58 processes: 55 sleeping, 3 running, 0 zombie, 0 stopped
> > CPU states:   0.2% user   4.0% system   0.0% nice   0.0% iowait  95.8% idle
> > Mem:   385040k av,  380820k used,    4220k free,       0k shrd,   67368k
> > buff 224720k active,              69412k inactive
> > Swap:  521632k av,      80k used,  521552k free                  237452k
> > cached
> >
> > and generating about 2500 interrupts for the usb controller per 10
> > seconds and when i finally break it off and give it "sync" it uses
> > about two minutes with about 4500 per 10 seconds to get it all on
> > disk. On 2.4 the machine becomes more and more sluggish if I let it go
> > more than a short minute.
> 
> Which 2.4 ?
> 

This one was latest RedHat 9 kernel (yes I know, proprietary, but
ditto results on 2.4.21-rc[4|6]-ac[1|2].

Output below from 2.5.70-mm3 (USB 2.0), which leaves the machine in a
survivable state (as oposed to 2.4, which gradually climbs up a tree
and then falls down hard). After enabling debugging in USB storage it
fell over much more quickly.

> 	Regards
> 		Oliver

mvh,
A

alexh@lapper ~ $ dmesg
orage: scsi cmd done, result=0x0
usb-storage: *** thread sleeping.
usb-storage: queuecommand() called
usb-storage: *** thread awakened.
usb-storage: Command WRITE_10 (10 bytes)
usb-storage:  2a 00 1b 9e a4 e7 00 00 08 00
usb-storage: Bulk command S 0x43425355 T 0x135 Trg 0 LUN 0 L 4096 F 0 CL 10
usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
usb-storage: Status code 0; transferred 31/31
usb-storage: -- transfer complete
usb-storage: Bulk command transfer result=0
usb-storage: usb_stor_bulk_transfer_sglist: xfer 4096 bytes, 1 entries
usb-storage: Status code 0; transferred 4096/4096
usb-storage: -- transfer complete
usb-storage: Bulk data transfer result 0x0
usb-storage: Attempting to get CSW...
usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
usb-storage: Status code 0; transferred 13/13
usb-storage: -- transfer complete
usb-storage: Bulk status result = 0
usb-storage: Bulk status Sig 0x53425355 T 0x135 R 0 Stat 0x0
usb-storage: scsi cmd done, result=0x0
usb-storage: *** thread sleeping.
usb-storage: queuecommand() called
usb-storage: *** thread awakened.
usb-storage: Command WRITE_10 (10 bytes)
usb-storage:  2a 00 1b 9e c4 ef 00 00 08 00
usb-storage: Bulk command S 0x43425355 T 0x136 Trg 0 LUN 0 L 4096 F 0 CL 10
usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
usb-storage: Status code 0; transferred 31/31
usb-storage: -- transfer complete
usb-storage: Bulk command transfer result=0
usb-storage: usb_stor_bulk_transfer_sglist: xfer 4096 bytes, 1 entries
usb-storage: Status code 0; transferred 4096/4096
usb-storage: -- transfer complete
usb-storage: Bulk data transfer result 0x0
usb-storage: Attempting to get CSW...
usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
usb-storage: Status code 0; transferred 13/13
usb-storage: -- transfer complete
usb-storage: Bulk status result = 0
usb-storage: Bulk status Sig 0x53425355 T 0x136 R 0 Stat 0x0
usb-storage: scsi cmd done, result=0x0
usb-storage: *** thread sleeping.
usb-storage: queuecommand() called
usb-storage: *** thread awakened.
usb-storage: Command WRITE_10 (10 bytes)
usb-storage:  2a 00 1b 9e e4 f7 00 00 08 00
usb-storage: Bulk command S 0x43425355 T 0x137 Trg 0 LUN 0 L 4096 F 0 CL 10
usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
usb-storage: Status code 0; transferred 31/31
usb-storage: -- transfer complete
usb-storage: Bulk command transfer result=0
usb-storage: usb_stor_bulk_transfer_sglist: xfer 4096 bytes, 1 entries
usb-storage: Status code 0; transferred 4096/4096
usb-storage: -- transfer complete
usb-storage: Bulk data transfer result 0x0
usb-storage: Attempting to get CSW...
usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
usb-storage: Status code 0; transferred 13/13
usb-storage: -- transfer complete
usb-storage: Bulk status result = 0
usb-storage: Bulk status Sig 0x53425355 T 0x137 R 0 Stat 0x0
usb-storage: scsi cmd done, result=0x0
usb-storage: *** thread sleeping.
usb-storage: queuecommand() called
usb-storage: *** thread awakened.
usb-storage: Command WRITE_10 (10 bytes)
usb-storage:  2a 00 1b 9f 04 ff 00 00 08 00
usb-storage: Bulk command S 0x43425355 T 0x138 Trg 0 LUN 0 L 4096 F 0 CL 10
usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
usb-storage: Status code 0; transferred 31/31
usb-storage: -- transfer complete
usb-storage: Bulk command transfer result=0
usb-storage: usb_stor_bulk_transfer_sglist: xfer 4096 bytes, 1 entries
usb-storage: Status code 0; transferred 4096/4096
usb-storage: -- transfer complete
usb-storage: Bulk data transfer result 0x0
usb-storage: Attempting to get CSW...
usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
usb-storage: Status code 0; transferred 13/13
usb-storage: -- transfer complete
usb-storage: Bulk status result = 0
usb-storage: Bulk status Sig 0x53425355 T 0x138 R 0 Stat 0x0
usb-storage: scsi cmd done, result=0x0
usb-storage: *** thread sleeping.
usb-storage: queuecommand() called
usb-storage: *** thread awakened.
usb-storage: Command WRITE_10 (10 bytes)
usb-storage:  2a 00 1b 9f 25 07 00 00 08 00
usb-storage: Bulk command S 0x43425355 T 0x139 Trg 0 LUN 0 L 4096 F 0 CL 10
usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
usb-storage: Status code 0; transferred 31/31
usb-storage: -- transfer complete
usb-storage: Bulk command transfer result=0
usb-storage: usb_stor_bulk_transfer_sglist: xfer 4096 bytes, 1 entries
usb-storage: Status code 0; transferred 4096/4096
usb-storage: -- transfer complete
usb-storage: Bulk data transfer result 0x0
usb-storage: Attempting to get CSW...
usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
usb-storage: Status code 0; transferred 13/13
usb-storage: -- transfer complete
usb-storage: Bulk status result = 0
usb-storage: Bulk status Sig 0x53425355 T 0x139 R 0 Stat 0x0
usb-storage: scsi cmd done, result=0x0
usb-storage: *** thread sleeping.
usb-storage: queuecommand() called
usb-storage: *** thread awakened.
usb-storage: Command WRITE_10 (10 bytes)
usb-storage:  2a 00 1b 9f 45 0f 00 00 08 00
usb-storage: Bulk command S 0x43425355 T 0x13a Trg 0 LUN 0 L 4096 F 0 CL 10
usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
usb-storage: Status code 0; transferred 31/31
usb-storage: -- transfer complete
usb-storage: Bulk command transfer result=0
usb-storage: usb_stor_bulk_transfer_sglist: xfer 4096 bytes, 1 entries
usb-storage: Status code 0; transferred 4096/4096
usb-storage: -- transfer complete
usb-storage: Bulk data transfer result 0x0
usb-storage: Attempting to get CSW...
usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
usb-storage: Status code 0; transferred 13/13
usb-storage: -- transfer complete
usb-storage: Bulk status result = 0
usb-storage: Bulk status Sig 0x53425355 T 0x13a R 0 Stat 0x0
usb-storage: scsi cmd done, result=0x0
usb-storage: *** thread sleeping.
usb-storage: queuecommand() called
usb-storage: *** thread awakened.
usb-storage: Command WRITE_10 (10 bytes)
usb-storage:  2a 00 1b 9f 65 17 00 00 08 00
usb-storage: Bulk command S 0x43425355 T 0x13b Trg 0 LUN 0 L 4096 F 0 CL 10
usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
usb-storage: Status code 0; transferred 31/31
usb-storage: -- transfer complete
usb-storage: Bulk command transfer result=0
usb-storage: usb_stor_bulk_transfer_sglist: xfer 4096 bytes, 1 entries
usb-storage: Status code 0; transferred 4096/4096
usb-storage: -- transfer complete
usb-storage: Bulk data transfer result 0x0
usb-storage: Attempting to get CSW...
usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
usb-storage: Status code 0; transferred 13/13
usb-storage: -- transfer complete
usb-storage: Bulk status result = 0
usb-storage: Bulk status Sig 0x53425355 T 0x13b R 0 Stat 0x0
usb-storage: scsi cmd done, result=0x0
usb-storage: *** thread sleeping.
usb-storage: queuecommand() called
usb-storage: *** thread awakened.
usb-storage: Command WRITE_10 (10 bytes)
usb-storage:  2a 00 1b 9f 85 1f 00 00 08 00
usb-storage: Bulk command S 0x43425355 T 0x13c Trg 0 LUN 0 L 4096 F 0 CL 10
usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
usb-storage: Status code 0; transferred 31/31
usb-storage: -- transfer complete
usb-storage: Bulk command transfer result=0
usb-storage: usb_stor_bulk_transfer_sglist: xfer 4096 bytes, 1 entries
usb-storage: Status code 0; transferred 4096/4096
usb-storage: -- transfer complete
usb-storage: Bulk data transfer result 0x0
usb-storage: Attempting to get CSW...
usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
usb-storage: Status code 0; transferred 13/13
usb-storage: -- transfer complete
usb-storage: Bulk status result = 0
usb-storage: Bulk status Sig 0x53425355 T 0x13c R 0 Stat 0x0
usb-storage: scsi cmd done, result=0x0
usb-storage: *** thread sleeping.
usb-storage: queuecommand() called
usb-storage: *** thread awakened.
usb-storage: Command WRITE_10 (10 bytes)
usb-storage:  2a 00 1b 9f a5 27 00 00 08 00
usb-storage: Bulk command S 0x43425355 T 0x13d Trg 0 LUN 0 L 4096 F 0 CL 10
usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
usb-storage: Status code 0; transferred 31/31
usb-storage: -- transfer complete
usb-storage: Bulk command transfer result=0
usb-storage: usb_stor_bulk_transfer_sglist: xfer 4096 bytes, 1 entries
usb-storage: Status code 0; transferred 4096/4096
usb-storage: -- transfer complete
usb-storage: Bulk data transfer result 0x0
usb-storage: Attempting to get CSW...
usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
usb-storage: Status code 0; transferred 13/13
usb-storage: -- transfer complete
usb-storage: Bulk status result = 0
usb-storage: Bulk status Sig 0x53425355 T 0x13d R 0 Stat 0x0
usb-storage: scsi cmd done, result=0x0
usb-storage: *** thread sleeping.
usb-storage: queuecommand() called
usb-storage: *** thread awakened.
usb-storage: Command WRITE_10 (10 bytes)
usb-storage:  2a 00 1b 9f c5 2f 00 00 08 00
usb-storage: Bulk command S 0x43425355 T 0x13e Trg 0 LUN 0 L 4096 F 0 CL 10
usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
usb-storage: Status code 0; transferred 31/31
usb-storage: -- transfer complete
usb-storage: Bulk command transfer result=0
usb-storage: usb_stor_bulk_transfer_sglist: xfer 4096 bytes, 1 entries
usb-storage: Status code 0; transferred 4096/4096
usb-storage: -- transfer complete
usb-storage: Bulk data transfer result 0x0
usb-storage: Attempting to get CSW...
usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
usb-storage: Status code 0; transferred 13/13
usb-storage: -- transfer complete
usb-storage: Bulk status result = 0
usb-storage: Bulk status Sig 0x53425355 T 0x13e R 0 Stat 0x0
usb-storage: scsi cmd done, result=0x0
usb-storage: *** thread sleeping.
usb-storage: queuecommand() called
usb-storage: *** thread awakened.
usb-storage: Command WRITE_10 (10 bytes)
usb-storage:  2a 00 1b 9f e5 37 00 00 08 00
usb-storage: Bulk command S 0x43425355 T 0x13f Trg 0 LUN 0 L 4096 F 0 CL 10
usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
usb-storage: Status code 0; transferred 31/31
usb-storage: -- transfer complete
usb-storage: Bulk command transfer result=0
usb-storage: usb_stor_bulk_transfer_sglist: xfer 4096 bytes, 1 entries
usb-storage: Status code 0; transferred 4096/4096
usb-storage: -- transfer complete
usb-storage: Bulk data transfer result 0x0
usb-storage: Attempting to get CSW...
usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
usb-storage: Status code 0; transferred 13/13
usb-storage: -- transfer complete
usb-storage: Bulk status result = 0
usb-storage: Bulk status Sig 0x53425355 T 0x13f R 0 Stat 0x0
usb-storage: scsi cmd done, result=0x0
usb-storage: *** thread sleeping.
usb-storage: queuecommand() called
usb-storage: *** thread awakened.
usb-storage: Command WRITE_10 (10 bytes)
usb-storage:  2a 00 1b a0 15 4f 00 00 08 00
usb-storage: Bulk command S 0x43425355 T 0x140 Trg 0 LUN 0 L 4096 F 0 CL 10
usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
usb-storage: Status code 0; transferred 31/31
usb-storage: -- transfer complete
usb-storage: Bulk command transfer result=0
usb-storage: usb_stor_bulk_transfer_sglist: xfer 4096 bytes, 1 entries
usb-storage: Status code 0; transferred 4096/4096
usb-storage: -- transfer complete
usb-storage: Bulk data transfer result 0x0
usb-storage: Attempting to get CSW...
usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
usb-storage: Status code 0; transferred 13/13
usb-storage: -- transfer complete
usb-storage: Bulk status result = 0
usb-storage: Bulk status Sig 0x53425355 T 0x140 R 0 Stat 0x0
usb-storage: scsi cmd done, result=0x0
usb-storage: *** thread sleeping.
usb-storage: queuecommand() called
usb-storage: *** thread awakened.
usb-storage: Command WRITE_10 (10 bytes)
usb-storage:  2a 00 1b a0 39 0f 00 04 00 00
usb-storage: Bulk command S 0x43425355 T 0x141 Trg 0 LUN 0 L 524288 F 0 CL 10
usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
usb-storage: Status code 0; transferred 31/31
usb-storage: -- transfer complete
usb-storage: Bulk command transfer result=0
usb-storage: usb_stor_bulk_transfer_sglist: xfer 524288 bytes, 128 entries
usb-storage: Status code 0; transferred 524288/524288
usb-storage: -- transfer complete
usb-storage: Bulk data transfer result 0x0
usb-storage: Attempting to get CSW...
usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
usb-storage: Status code 0; transferred 13/13
usb-storage: -- transfer complete
usb-storage: Bulk status result = 0
usb-storage: Bulk status Sig 0x53425355 T 0x141 R 0 Stat 0x0
usb-storage: scsi cmd done, result=0x0
usb-storage: *** thread sleeping.
usb-storage: queuecommand() called
usb-storage: *** thread awakened.
usb-storage: Command WRITE_10 (10 bytes)
usb-storage:  2a 00 1b a0 3d 0f 00 04 00 00
usb-storage: Bulk command S 0x43425355 T 0x142 Trg 0 LUN 0 L 524288 F 0 CL 10
usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
usb-storage: Status code 0; transferred 31/31
usb-storage: -- transfer complete
usb-storage: Bulk command transfer result=0
usb-storage: usb_stor_bulk_transfer_sglist: xfer 524288 bytes, 128 entries
usb-storage: Status code 0; transferred 524288/524288
usb-storage: -- transfer complete
usb-storage: Bulk data transfer result 0x0
usb-storage: Attempting to get CSW...
usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
usb-storage: Status code 0; transferred 13/13
usb-storage: -- transfer complete
usb-storage: Bulk status result = 0
usb-storage: Bulk status Sig 0x53425355 T 0x142 R 0 Stat 0x0
usb-storage: scsi cmd done, result=0x0
usb-storage: *** thread sleeping.
usb-storage: queuecommand() called
usb-storage: *** thread awakened.
usb-storage: Command WRITE_10 (10 bytes)
usb-storage:  2a 00 1b a0 41 0f 00 04 00 00
usb-storage: Bulk command S 0x43425355 T 0x143 Trg 0 LUN 0 L 524288 F 0 CL 10
usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
usb-storage: Status code 0; transferred 31/31
usb-storage: -- transfer complete
usb-storage: Bulk command transfer result=0
usb-storage: usb_stor_bulk_transfer_sglist: xfer 524288 bytes, 128 entries
usb-storage: Status code 0; transferred 524288/524288
usb-storage: -- transfer complete
usb-storage: Bulk data transfer result 0x0
usb-storage: Attempting to get CSW...
usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
usb-storage: Status code 0; transferred 13/13
usb-storage: -- transfer complete
usb-storage: Bulk status result = 0
usb-storage: Bulk status Sig 0x53425355 T 0x143 R 0 Stat 0x0
usb-storage: scsi cmd done, result=0x0
usb-storage: *** thread sleeping.
usb-storage: queuecommand() called
usb-storage: *** thread awakened.
usb-storage: Command WRITE_10 (10 bytes)
usb-storage:  2a 00 1b a0 45 0f 00 04 00 00
usb-storage: Bulk command S 0x43425355 T 0x144 Trg 0 LUN 0 L 524288 F 0 CL 10
usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
usb-storage: Status code 0; transferred 31/31
usb-storage: -- transfer complete
usb-storage: Bulk command transfer result=0
usb-storage: usb_stor_bulk_transfer_sglist: xfer 524288 bytes, 128 entries
usb-storage: Status code 0; transferred 524288/524288
usb-storage: -- transfer complete
usb-storage: Bulk data transfer result 0x0
usb-storage: Attempting to get CSW...
usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
usb-storage: usb_storage_command_abort called
usb-storage: usb_stor_stop_transport called
usb-storage: -- cancelling URB
usb-storage: Status code -104; transferred 0/13
usb-storage: -- transfer cancelled
usb-storage: Bulk status result = 3
usb-storage: -- command was aborted
usb-storage: Bulk reset requested


- -- 
Alexander Hoogerhuis                               | alexh@ihatent.com
CCNP - CCDP - MCNE - CCSE                          | +47 908 21 485
"You have zero privacy anyway. Get over it."  --Scott McNealy
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)
Comment: Processed by Mailcrypt 3.5.8 <http://mailcrypt.sourceforge.net/>

iD8DBQE+2vPTCQ1pa+gRoggRApiuAJ4nu4+/eSgQ5TRKwasBCp+0hUW3NgCfc4kQ
w6a0BReU5/srQE62zgdLDEI=
=tGSl
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: USB 2.0 with 250Gb disk and insane loads
  2003-06-01 18:21     ` Oliver Neukum
  2003-06-02  6:51       ` Alexander Hoogerhuis
@ 2003-06-02 14:03       ` Alexander Hoogerhuis
  2003-06-02 17:13         ` David Brownell
  1 sibling, 1 reply; 12+ messages in thread
From: Alexander Hoogerhuis @ 2003-06-02 14:03 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: David Brownell, linux-kernel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Oliver Neukum <oliver@neukum.org> writes:

> > > Probably the block layer as it waits for free io slots.
> > > But that doesn't tell us why the requests are not executed.
> > > Where is SCSI timeout kicking in?
> >
> > I'm not seeing any scsi timeouts in the logs.
> 
> So it seems that the driver doesn't fail utterly, but crawls along.
> Storage's debugging output should clarify the situation.
> 

I had a private reply form a guy that had three of these running
reliably on 2.4.21-pre6, and he noted he'd never done cd->disk
transfers, but across the net. So I did the same.

Results are that it survived a lot longer, I managed to get about
700Mb across at about 8Mb/s (line speed 100mbit half duplex) before it
fell over with this:

usb-storage: Status code 0; transferred 13/13
usb-storage: -- transfer complete
usb-storage: Bulk status result = 0
usb-storage: Bulk status Sig 0x53425355 T 0xc42 R 0 Stat 0x0
usb-storage: scsi cmd done, result=0x0
usb-storage: *** thread sleeping.
usb-storage: queuecommand() called
usb-storage: *** thread awakened.
usb-storage: Command WRITE_10 (10 bytes)
usb-storage:  2a 00 18 f0 34 47 00 04 00 00
usb-storage: Bulk command S 0x43425355 T 0xc43 Trg 0 LUN 0 L 524288 F 0 CL 10
usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
usb-storage: Status code 0; transferred 31/31
usb-storage: -- transfer complete
usb-storage: Bulk command transfer result=0
usb-storage: usb_stor_bulk_transfer_sglist: xfer 524288 bytes, 128 entries
usb-storage: Status code 0; transferred 524288/524288
usb-storage: -- transfer complete
usb-storage: Bulk data transfer result 0x0
usb-storage: Attempting to get CSW...
usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
usb-storage: usb_storage_command_abort called
usb-storage: usb_stor_stop_transport called
usb-storage: -- cancelling URB
usb-storage: Status code -104; transferred 0/13
usb-storage: -- transfer cancelled
usb-storage: Bulk status result = 3
usb-storage: -- command was aborted
usb-storage: Bulk reset requested
usb-storage: Soft reset: clearing bulk-in endpoint halt
usb-storage: Soft reset: clearing bulk-out endpoint halt
usb-storage: Soft reset done
usb-storage: scsi command aborted
usb-storage: *** thread sleeping.
usb-storage: queuecommand() called
usb-storage: *** thread awakened.
usb-storage: Command TEST_UNIT_READY (6 bytes)
usb-storage:  00 00 00 00 00 00
usb-storage: Bulk command S 0x43425355 T 0xc43 Trg 0 LUN 0 L 0 F 0 CL 6
usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
usb-storage: Status code 0; transferred 31/31
usb-storage: -- transfer complete
usb-storage: Bulk command transfer result=0
usb-storage: Attempting to get CSW...
usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
usb-storage: Status code 0; transferred 13/13
usb-storage: -- transfer complete
usb-storage: Bulk status result = 0
usb-storage: Bulk status Sig 0x53425355 T 0xc43 R 0 Stat 0x0
usb-storage: scsi cmd done, result=0x0
usb-storage: *** thread sleeping.

Load only got up to about 3-4 before it fell over.

Apart from that, it seems the speed at which it falls over is depening
on two factors: with/without debugging and speed at which data arrives
for the drive.

Using a ssh-session that limited it to 3Mb/sec and it ran fine for a
long time with interrupts showing at about 5000 a second for the
usb-controller (vs. 1000 for the timer). With the example above that
made it fall over it ran at close to 10k interrupts/sec.

If it makes a difference, transferring from a local CD-ROM to the
drive across USB 2.0 also uses ide-scsi. The 2.4 example I had falling
over used no ide-scsi.

mvh,
A
- -- 
Alexander Hoogerhuis                               | alexh@ihatent.com
CCNP - CCDP - MCNE - CCSE                          | +47 908 21 485
"You have zero privacy anyway. Get over it."  --Scott McNealy
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)
Comment: Processed by Mailcrypt 3.5.8 <http://mailcrypt.sourceforge.net/>

iD8DBQE+21kjCQ1pa+gRoggRAiV4AJ9OAqrgeSP5WzwkonApbZtNPDOd1QCfTz68
81LU78RHkdpIlLNPD6Bgr6g=
=4Lmd
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: USB 2.0 with 250Gb disk and insane loads
  2003-06-02 14:03       ` Alexander Hoogerhuis
@ 2003-06-02 17:13         ` David Brownell
  2003-06-02 17:17           ` Alexander Hoogerhuis
  0 siblings, 1 reply; 12+ messages in thread
From: David Brownell @ 2003-06-02 17:13 UTC (permalink / raw)
  To: Alexander Hoogerhuis; +Cc: Oliver Neukum, linux-kernel

Alexander Hoogerhuis wrote:
> 
> I had a private reply form a guy that had three of these running
> reliably on 2.4.21-pre6, and he noted he'd never done cd->disk
> transfers, but across the net. So I did the same.
> 
> Results are that it survived a lot longer, I managed to get about
> 700Mb across at about 8Mb/s (line speed 100mbit half duplex) before it
> fell over with this:
> 
> ...
> usb-storage: Command WRITE_10 (10 bytes)
> usb-storage:  2a 00 18 f0 34 47 00 04 00 00
> usb-storage: Bulk command S 0x43425355 T 0xc43 Trg 0 LUN 0 L 524288 F 0 CL 10
> usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
> usb-storage: Status code 0; transferred 31/31
> usb-storage: -- transfer complete
> usb-storage: Bulk command transfer result=0


> usb-storage: usb_stor_bulk_transfer_sglist: xfer 524288 bytes, 128 entries
> usb-storage: Status code 0; transferred 524288/524288
> usb-storage: -- transfer complete
> usb-storage: Bulk data transfer result 0x0

That's two successive operations on the OUT endpoint
(two IRQs:  request, then 128 pages) both of which
worked fine, followed by one on the IN endpoint:


> usb-storage: Attempting to get CSW...
> usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
> usb-storage: usb_storage_command_abort called
> usb-storage: usb_stor_stop_transport called
> usb-storage: -- cancelling URB
> usb-storage: Status code -104; transferred 0/13
> usb-storage: -- transfer cancelled
> usb-storage: Bulk status result = 3
> usb-storage: -- command was aborted
> ...

Interesting.  So basically, the failure mode you saw
was that after all the data was (evidently) transferred
OK, usb-storage aborted (for some reason) its fetch
for the transfer status ... and then trouble.

Why did usb-storage abort that IN transfer?  If we
knew that, we'd have a good clue as to what's going
wrong.


> Load only got up to about 3-4 before it fell over.
> 
> Apart from that, it seems the speed at which it falls over is depening
> on two factors: with/without debugging and speed at which data arrives
> for the drive.

Not unrelated.  Turning on usb-storage debug slows down the
rate at which data is handed to the drive.

- Dave



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: USB 2.0 with 250Gb disk and insane loads
  2003-06-02 17:13         ` David Brownell
@ 2003-06-02 17:17           ` Alexander Hoogerhuis
  0 siblings, 0 replies; 12+ messages in thread
From: Alexander Hoogerhuis @ 2003-06-02 17:17 UTC (permalink / raw)
  To: David Brownell; +Cc: Oliver Neukum, linux-kernel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Brownell <david-b@pacbell.net> writes:

> Alexander Hoogerhuis wrote:
> > I had a private reply form a guy that had three of these running
> > reliably on 2.4.21-pre6, and he noted he'd never done cd->disk
> > transfers, but across the net. So I did the same.
> > Results are that it survived a lot longer, I managed to get about
> > 700Mb across at about 8Mb/s (line speed 100mbit half duplex) before it
> > fell over with this:
> > ...
> > usb-storage: Command WRITE_10 (10 bytes)
> > usb-storage:  2a 00 18 f0 34 47 00 04 00 00
> > usb-storage: Bulk command S 0x43425355 T 0xc43 Trg 0 LUN 0 L 524288 F 0 CL 10
> > usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
> > usb-storage: Status code 0; transferred 31/31
> > usb-storage: -- transfer complete
> > usb-storage: Bulk command transfer result=0
> 
> 
> > usb-storage: usb_stor_bulk_transfer_sglist: xfer 524288 bytes, 128 entries
> > usb-storage: Status code 0; transferred 524288/524288
> > usb-storage: -- transfer complete
> > usb-storage: Bulk data transfer result 0x0
> 
> That's two successive operations on the OUT endpoint
> (two IRQs:  request, then 128 pages) both of which
> worked fine, followed by one on the IN endpoint:
> 
> 
> > usb-storage: Attempting to get CSW...
> > usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
> > usb-storage: usb_storage_command_abort called
> > usb-storage: usb_stor_stop_transport called
> > usb-storage: -- cancelling URB
> > usb-storage: Status code -104; transferred 0/13
> > usb-storage: -- transfer cancelled
> > usb-storage: Bulk status result = 3
> > usb-storage: -- command was aborted
> > ...
> 
> Interesting.  So basically, the failure mode you saw
> was that after all the data was (evidently) transferred
> OK, usb-storage aborted (for some reason) its fetch
> for the transfer status ... and then trouble.
> 
> Why did usb-storage abort that IN transfer?  If we
> knew that, we'd have a good clue as to what's going
> wrong.
> 

I'll be happy to do whatever is neccesary to answer that, but I have
no clue off-hand how to get at what the problem is.

> 
> > Load only got up to about 3-4 before it fell over.
> > Apart from that, it seems the speed at which it falls over is
> > depening
> > on two factors: with/without debugging and speed at which data arrives
> > for the drive.
> 
> Not unrelated.  Turning on usb-storage debug slows down the
> rate at which data is handed to the drive.
> 

True, but I was thinking of it keeling over faster with debugging than
without. 

> - Dave
> 

- -A
- -- 
Alexander Hoogerhuis                               | alexh@ihatent.com
CCNP - CCDP - MCNE - CCSE                          | +47 908 21 485
"You have zero privacy anyway. Get over it."  --Scott McNealy
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)
Comment: Processed by Mailcrypt 3.5.8 <http://mailcrypt.sourceforge.net/>

iD8DBQE+24arCQ1pa+gRoggRAgzNAKDICQ+jU3ck4RV4w1dTBt8LgyHCSwCdGVpJ
86HgVilIQpKHdv2uWNhXmo0=
=Xtnx
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: USB 2.0 with 250Gb disk and insane loads
  2003-06-01 15:17   ` Alexander Hoogerhuis
  2003-06-01 18:21     ` Oliver Neukum
@ 2003-06-03 11:49     ` Terje Malmedal
  1 sibling, 0 replies; 12+ messages in thread
From: Terje Malmedal @ 2003-06-03 11:49 UTC (permalink / raw)
  To: Alexander Hoogerhuis; +Cc: Oliver Neukum, David Brownell, linux-kernel


[Alexander Hoogerhuis]
> Stuck it in an older machine on USB 1.1 and it foudn the disk fine
> (redhat 9, 2.4.20-13.9 kernel on that machine), and ditto result:

> 19:15:16  up 2 days, 20:23,  4 users,  load average: 6.02, 2.41, 0.89
> 58 processes: 55 sleeping, 3 running, 0 zombie, 0 stopped
> CPU states:   0.2% user   4.0% system   0.0% nice   0.0% iowait  95.8% idle
> Mem:   385040k av,  380820k used,    4220k free,       0k shrd,   67368k buff
>        224720k active,              69412k inactive
> Swap:  521632k av,      80k used,  521552k free                  237452k cached
                                                                                
> and generating about 2500 interrupts for the usb controller per 10
> seconds and when i finally break it off and give it "sync" it uses
> about two minutes with about 4500 per 10 seconds to get it all on
> disk. On 2.4 the machine becomes more and more sluggish if I let it
> go more than a short minute.

I had the same problem with USB 1.1. I solved it by writing a
LD_PRELOAD-able shared library which overrides write() to do a
fdatasync() on the filehandle after a megabyte worth of writes.

Would be nice to have it fixed in the kernel though.

-- 
 - Terje
malmedal@usit.uio.no

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: USB 2.0 with 250Gb disk and insane loads
  2003-06-01 14:31 USB 2.0 with 250Gb disk and insane loads David Brownell
  2003-06-01 14:53 ` Oliver Neukum
  2003-06-01 14:57 ` Alexander Hoogerhuis
@ 2003-06-13 20:19 ` David Brownell
  2 siblings, 0 replies; 12+ messages in thread
From: David Brownell @ 2003-06-13 20:19 UTC (permalink / raw)
  To: alexh; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 869 bytes --]

>> I'm trying to nail own a problem here, with my shiny new Maxtor 250Gb
>> USB 2.0 disk. Under 2.4 (vanilla, latest 21-preX and 21-preX-acY) the
>> disk will mount and talk nicely. As soon as any load hits it, e.g. a
>> single cp from my internal CD-ROM to the disk, the mahcine load will
>> sky-rocket and at some point within a few minuter hang the machine.

Here's a combined 2.4.21-rc8 patch, which includes:

  - two patches waiting for 2.4.22-pre ("i/o watchdog"
    and "SMP fixes"), as posted earlier;

  - another one, recently circulated for 2.5 ("short reads"
    and cleanup);

  - a bugfix for a hang that 2.4 + usb-storage seemed
    particularly ready to trigger.

I was able to reproduce a load+hang issue without this
patch; add the patch, the problem goes away and 2.4 acts
just as nicely (but not as quickly) as 2.5.

Please give it a try.

- Dave


[-- Attachment #2: ehci24-0613.patch --]
[-- Type: text/plain, Size: 14668 bytes --]

--- 1.8/drivers/usb/host/ehci-dbg.c	Wed Mar 19 02:25:01 2003
+++ edited/drivers/usb/host/ehci-dbg.c	Thu Jun 12 11:17:57 2003
@@ -125,19 +125,28 @@
 #ifdef	DEBUG
 
 static void __attribute__((__unused__))
+dbg_qtd (char *label, struct ehci_hcd *ehci, struct ehci_qtd *qtd)
+{
+	ehci_dbg (ehci, "%s td %p n%08x %08x t%08x p0=%08x\n", label, qtd,
+		cpu_to_le32p (&qtd->hw_next),
+		cpu_to_le32p (&qtd->hw_alt_next),
+		cpu_to_le32p (&qtd->hw_token),
+		cpu_to_le32p (&qtd->hw_buf [0]));
+	if (qtd->hw_buf [1])
+		ehci_dbg (ehci, "  p1=%08x p2=%08x p3=%08x p4=%08x\n",
+			cpu_to_le32p (&qtd->hw_buf [1]),
+			cpu_to_le32p (&qtd->hw_buf [2]),
+			cpu_to_le32p (&qtd->hw_buf [3]),
+			cpu_to_le32p (&qtd->hw_buf [4]));
+}
+
+static void __attribute__((__unused__))
 dbg_qh (char *label, struct ehci_hcd *ehci, struct ehci_qh *qh)
 {
-	dbg ("%s %p n%08x info1 %x info2 %x hw_curr %x qtd_next %x", label,
+	ehci_dbg (ehci, "%s qh %p n%08x info %x %x qtd %x\n", label,
 		qh, qh->hw_next, qh->hw_info1, qh->hw_info2,
-		qh->hw_current, qh->hw_qtd_next);
-	dbg ("  alt+nak+t= %x, token= %x, page0= %x, page1= %x",
-		qh->hw_alt_next, qh->hw_token,
-		qh->hw_buf [0], qh->hw_buf [1]);
-	if (qh->hw_buf [2]) {
-		dbg ("  page2= %x, page3= %x, page4= %x",
-			qh->hw_buf [2], qh->hw_buf [3],
-			qh->hw_buf [4]);
-	}
+		qh->hw_current);
+	dbg_qtd ("overlay", ehci, (struct ehci_qtd *) &qh->hw_qtd_next);
 }
 
 static int __attribute__((__unused__))
@@ -294,8 +303,7 @@
 		return '*';
 	if (token & QTD_STS_HALT)
 		return '-';
-	if (QTD_PID (token) != 1 /* not IN: OUT or SETUP */
-			|| QTD_LENGTH (token) == 0)
+	if (!IS_SHORT_READ (token))
 		return ' ';
 	/* tries to advance through hw_alt_next */
 	return '/';
@@ -317,11 +325,14 @@
 	char			*next = *nextp;
 	char			mark;
 
-	mark = token_mark (qh->hw_token);
+	if (qh->hw_qtd_next == EHCI_LIST_END)	/* NEC does this */
+		mark = '@';
+	else
+		mark = token_mark (qh->hw_token);
 	if (mark == '/') {	/* qh_alt_next controls qh advance? */
 		if ((qh->hw_alt_next & QTD_MASK) == ehci->async->hw_alt_next)
 			mark = '#';	/* blocked */
-		else if (qh->hw_alt_next & cpu_to_le32 (0x01))
+		else if (qh->hw_alt_next == EHCI_LIST_END)
 			mark = '.';	/* use hw_qtd_next */
 		/* else alt_next points to some other qtd */
 	}
@@ -334,7 +345,7 @@
 			(scratch >> 8) & 0x000f,
 			scratch, cpu_to_le32p (&qh->hw_info2),
 			cpu_to_le32p (&qh->hw_token), mark,
-			(cpu_to_le32 (0x8000000) & qh->hw_token)
+			(__constant_cpu_to_le32 (QTD_TOGGLE) & qh->hw_token)
 				? "data0" : "data1",
 			(cpu_to_le32p (&qh->hw_alt_next) >> 1) & 0x0f);
 	size -= temp;
@@ -400,6 +411,8 @@
 	char			*next;
 	struct ehci_qh		*qh;
 
+	*buf = 0;
+
 	pdev = container_of (dev, struct pci_dev, dev);
 	ehci = container_of (pci_get_drvdata (pdev), struct ehci_hcd, hcd);
 	next = buf;
@@ -422,7 +435,7 @@
 	}
 	spin_unlock_irqrestore (&ehci->lock, flags);
 
-	return PAGE_SIZE - size;
+	return strlen (buf);
 }
 static DEVICE_ATTR (async, S_IRUGO, show_async, NULL);
 
@@ -558,7 +571,8 @@
 	/* Capability Registers */
 	i = readw (&ehci->caps->hci_version);
 	temp = snprintf (next, size,
-		"EHCI %x.%02x, hcd state %d (version " DRIVER_VERSION ")\n",
+		"%s\nEHCI %x.%02x, hcd state %d (driver " DRIVER_VERSION ")\n",
+		pdev->dev.name,
 		i >> 8, i & 0x0ff, ehci->hcd.state);
 	size -= temp;
 	next += temp;
--- 1.12/drivers/usb/host/ehci-hcd.c	Wed Mar 19 02:25:01 2003
+++ edited/drivers/usb/host/ehci-hcd.c	Thu Jun 12 11:17:57 2003
@@ -31,6 +31,7 @@
 #include <linux/list.h>
 #include <linux/interrupt.h>
 #include <linux/reboot.h>
+#include <linux/bitops.h> /* for generic_ffs */
 
 #ifdef CONFIG_USB_DEBUG
 	#define DEBUG
@@ -41,11 +42,7 @@
 #include <linux/usb.h>
 
 #include <linux/version.h>
-#if LINUX_VERSION_CODE < KERNEL_VERSION(2,5,32)
 #include "../hcd.h"
-#else
-#include "../core/hcd.h"
-#endif
 
 #include <asm/byteorder.h>
 #include <asm/io.h>
@@ -94,11 +91,11 @@
  * 2001-June	Works with usb-storage and NEC EHCI on 2.4
  */
 
-#define DRIVER_VERSION "2003-Jan-22"
+#define DRIVER_VERSION "2003-Jun-12/2.4"
 #define DRIVER_AUTHOR "David Brownell"
 #define DRIVER_DESC "USB 2.0 'Enhanced' Host Controller (EHCI) Driver"
 
-static const char	hcd_name [] = "ehci-hcd";
+static const char	hcd_name [] = "ehci_hcd";
 
 
 // #define EHCI_VERBOSE_DEBUG
@@ -118,8 +115,10 @@
 #define	EHCI_TUNE_MULT_TT	1
 #define	EHCI_TUNE_FLS		2	/* (small) 256 frame schedule */
 
-#define EHCI_WATCHDOG_JIFFIES	(HZ/100)	/* arbitrary; ~10 msec */
+#define EHCI_IAA_JIFFIES	(HZ/100)	/* arbitrary; ~10 msec */
+#define EHCI_IO_JIFFIES		(HZ/10)		/* io watchdog > irq_thresh */
 #define EHCI_ASYNC_JIFFIES	(HZ/20)		/* async idle timeout */
+#define EHCI_SHRINK_JIFFIES	(HZ/200)	/* async qh unlink delay */
 
 /* Initial IRQ latency:  lower than default */
 static int log2_irq_thresh = 0;		// 0 to 6
@@ -268,16 +267,13 @@
 		}
 	}
 
+ 	/* stop async processing after it's idled a bit */
+	if (test_bit (TIMER_ASYNC_OFF, &ehci->actions))
+ 		start_unlink_async (ehci, ehci->async);
+
+	/* ehci could run by timer, without IRQs ... */
 	ehci_work (ehci, NULL);
-	if (ehci->reclaim && !timer_pending (&ehci->watchdog))
-		mod_timer (&ehci->watchdog,
-				jiffies + EHCI_WATCHDOG_JIFFIES);
 
- 	/* stop async processing after it's idled a while */
-	else if (ehci->async_idle) {
- 		start_unlink_async (ehci, ehci->async);
- 		ehci->async_idle = 0;
-	}
 	spin_unlock_irqrestore (&ehci->lock, flags);
 }
 
@@ -660,11 +656,18 @@
  */
 static void ehci_work (struct ehci_hcd *ehci, struct pt_regs *regs)
 {
+	timer_action_done (ehci, TIMER_IO_WATCHDOG);
 	if (ehci->reclaim_ready)
 		end_unlink_async (ehci, regs);
 	scan_async (ehci, regs);
 	if (ehci->next_uframe != -1)
 		scan_periodic (ehci, regs);
+
+	/* the IO watchdog guards against hardware or driver bugs that
+	 * misplace IRQs, and should let us run completely without IRQs.
+	 */
+	if ((ehci->async->qh_next.ptr != 0) || (ehci->periodic_sched != 0))
+		timer_action (ehci, TIMER_IO_WATCHDOG);
 }
 
 /*-------------------------------------------------------------------------*/
@@ -1039,8 +1042,8 @@
 
 static int __init init (void) 
 {
-	dbg (DRIVER_INFO);
-	dbg ("block sizes: qh %Zd qtd %Zd itd %Zd sitd %Zd",
+	pr_debug ("%s: block sizes: qh %Zd qtd %Zd itd %Zd sitd %Zd\n",
+		hcd_name,
 		sizeof (struct ehci_qh), sizeof (struct ehci_qtd),
 		sizeof (struct ehci_itd), sizeof (struct ehci_sitd));
 
--- 1.13/drivers/usb/host/ehci-q.c	Tue Apr  1 12:23:15 2003
+++ edited/drivers/usb/host/ehci-q.c	Fri Jun 13 12:25:08 2003
@@ -88,7 +88,6 @@
 static inline void
 qh_update (struct ehci_hcd *ehci, struct ehci_qh *qh, struct ehci_qtd *qtd)
 {
-	qh->hw_current = 0;
 	qh->hw_qtd_next = QTD_NEXT (qtd->qtd_dma);
 	qh->hw_alt_next = EHCI_LIST_END;
 
@@ -99,8 +98,6 @@
 
 /*-------------------------------------------------------------------------*/
 
-#define IS_SHORT_READ(token) (QTD_LENGTH (token) != 0 && QTD_PID (token) == 1)
-
 static void qtd_copy_status (
 	struct ehci_hcd *ehci,
 	struct urb *urb,
@@ -314,16 +311,15 @@
 		/* hardware copies qtd out of qh overlay */
 		rmb ();
 		token = le32_to_cpu (qtd->hw_token);
-		stopped = stopped
-			|| (HALT_BIT & qh->hw_token) != 0
-			|| (ehci->hcd.state == USB_STATE_HALT);
 
 		/* always clean up qtds the hc de-activated */
 		if ((token & QTD_STS_ACTIVE) == 0) {
 
-			/* magic dummy for short reads; won't advance */
-			if (IS_SHORT_READ (token)
-					&& !(token & QTD_STS_HALT)
+			if ((token & QTD_STS_HALT) != 0) {
+				stopped = 1;
+
+			/* magic dummy for some short reads; qh won't advance */
+			} else if (IS_SHORT_READ (token)
 					&& (qh->hw_alt_next & QTD_MASK)
 						== ehci->async->hw_alt_next) {
 				stopped = 1;
@@ -331,10 +327,13 @@
 			}
 
 		/* stop scanning when we reach qtds the hc is using */
-		} else if (likely (!stopped)) {
+		} else if (likely (!stopped
+				|| HCD_IS_RUNNING (ehci->hcd.state))) {
 			break;
 
 		} else {
+			stopped = 1;
+
 			/* ignore active urbs unless some previous qtd
 			 * for the urb faulted (including short read) or
 			 * its urb was canceled.  we may patch qh or qtds.
@@ -393,12 +392,20 @@
 	qh->qh_state = state;
 
 	/* update qh after fault cleanup */
-	if (unlikely ((HALT_BIT & qh->hw_token) != 0)) {
-		qh_update (ehci, qh,
-			list_empty (&qh->qtd_list)
-				? qh->dummy
-				: list_entry (qh->qtd_list.next,
-					struct ehci_qtd, qtd_list));
+	if (unlikely (stopped != 0)
+			/* some EHCI 0.95 impls will overlay dummy qtds */ 
+			|| qh->hw_qtd_next == EHCI_LIST_END) {
+		if (list_empty (&qh->qtd_list))
+			end = qh->dummy;
+		else {
+			end = list_entry (qh->qtd_list.next,
+					struct ehci_qtd, qtd_list);
+			/* first qtd may already be partially processed */
+			if (cpu_to_le32 (end->qtd_dma) == qh->hw_current)
+				end = 0;
+		}
+		if (end)
+			qh_update (ehci, qh, end);
 	}
 
 	return count;
@@ -741,8 +748,7 @@
 
 	/* (re)start the async schedule? */
 	head = ehci->async;
-	if (ehci->async_idle)
-		del_timer (&ehci->watchdog);
+	timer_action_done (ehci, TIMER_ASYNC_OFF);
 	if (!head->qh_next.qh) {
 		u32	cmd = readl (&ehci->regs->command);
 
@@ -773,8 +779,6 @@
 
 	qh->qh_state = QH_STATE_LINKED;
 	/* qtd completions reported later by interrupt */
-
-	ehci->async_idle = 0;
 }
 
 /*-------------------------------------------------------------------------*/
@@ -831,9 +835,8 @@
 			}
 		}
 
-		/* FIXME:  changing config or interface setting is not
-		 * supported yet.  preferred fix is for usbcore to tell
-		 * us to clear out each endpoint's state, but...
+		/* NOTE:  changing config or interface setting is not
+		 * supported without the 2.5 endpoint disable logic.
 		 */
 
 		/* usb_clear_halt() means qh data toggle gets reset */
@@ -955,17 +958,17 @@
 	struct ehci_qh		*qh = ehci->reclaim;
 	struct ehci_qh		*next;
 
-	del_timer (&ehci->watchdog);
+	timer_action_done (ehci, TIMER_IAA_WATCHDOG);
 
 	// qh->hw_next = cpu_to_le32 (qh->qh_dma);
 	qh->qh_state = QH_STATE_IDLE;
 	qh->qh_next.qh = 0;
 	qh_put (ehci, qh);			// refcount from reclaim 
-	ehci->reclaim = 0;
-	ehci->reclaim_ready = 0;
 
 	/* other unlink(s) may be pending (in QH_STATE_UNLINK_WAIT) */
 	next = qh->reclaim;
+	ehci->reclaim = next;
+	ehci->reclaim_ready = 0;
 	qh->reclaim = 0;
 
 	qh_completions (ehci, qh, regs);
@@ -980,16 +983,14 @@
 		 * active but idle for a while once it empties.
 		 */
 		if (HCD_IS_RUNNING (ehci->hcd.state)
-				&& ehci->async->qh_next.qh == 0
-				&& !timer_pending (&ehci->watchdog)) {
-			ehci->async_idle = 1;
-			mod_timer (&ehci->watchdog,
-					jiffies + EHCI_ASYNC_JIFFIES);
-		}
+				&& ehci->async->qh_next.qh == 0)
+			timer_action (ehci, TIMER_ASYNC_OFF);
 	}
 
-	if (next)
+	if (next) {
+		ehci->reclaim = 0;
 		start_unlink_async (ehci, next);
+	}
 }
 
 /* makes sure the async qh will become idle */
@@ -1020,6 +1021,7 @@
 			wmb ();
 			// handshake later, if we need to
 		}
+		timer_action_done (ehci, TIMER_ASYNC_OFF);
 		return;
 	} 
 
@@ -1045,9 +1047,8 @@
 	ehci->reclaim_ready = 0;
 	cmd |= CMD_IAAD;
 	writel (cmd, &ehci->regs->command);
-	/* posted write need not be known to HC yet ... */
-
-	mod_timer (&ehci->watchdog, jiffies + EHCI_WATCHDOG_JIFFIES);
+	(void) readl (&ehci->regs->command);
+	timer_action (ehci, TIMER_IAA_WATCHDOG);
 }
 
 /*-------------------------------------------------------------------------*/
@@ -1056,10 +1057,11 @@
 scan_async (struct ehci_hcd *ehci, struct pt_regs *regs)
 {
 	struct ehci_qh		*qh;
-	int			unlink_delay = 0;
+	enum ehci_timer_action	action = TIMER_IO_WATCHDOG;
 
 	if (!++(ehci->stamp))
 		ehci->stamp++;
+	timer_action_done (ehci, TIMER_ASYNC_SHRINK);
 rescan:
 	qh = ehci->async->qh_next.qh;
 	if (likely (qh != 0)) {
@@ -1091,17 +1093,15 @@
 			 */
 			if (list_empty (&qh->qtd_list)) {
 				if (qh->stamp == ehci->stamp)
-					unlink_delay = 1;
-				else if (!ehci->reclaim) {
+					action = TIMER_ASYNC_SHRINK;
+				else if (!ehci->reclaim
+					    && qh->qh_state == QH_STATE_LINKED)
 					start_unlink_async (ehci, qh);
-					unlink_delay = 0;
-				}
 			}
 
 			qh = qh->qh_next.qh;
 		} while (qh);
 	}
-
-	if (unlink_delay && !timer_pending (&ehci->watchdog))
-		mod_timer (&ehci->watchdog, jiffies + EHCI_WATCHDOG_JIFFIES/2);
+	if (action == TIMER_ASYNC_SHRINK)
+		timer_action (ehci, TIMER_ASYNC_SHRINK);
 }
--- 1.7/drivers/usb/host/ehci.h	Mon Mar  3 03:33:58 2003
+++ edited/drivers/usb/host/ehci.h	Thu Jun 12 11:17:57 2003
@@ -52,8 +52,7 @@
 	/* async schedule support */
 	struct ehci_qh		*async;
 	struct ehci_qh		*reclaim;
-	int			reclaim_ready : 1,
-				async_idle : 1;
+	int			reclaim_ready : 1;
 
 	/* periodic schedule support */
 #define	DEFAULT_I_TDPS		1024		/* some HCs can do less */
@@ -83,6 +82,7 @@
 
 	struct timer_list	watchdog;
 	struct notifier_block	reboot_notifier;
+	unsigned long		actions;
 	unsigned		stamp;
 
 	/* irq statistics */
@@ -100,6 +100,53 @@
 /* NOTE:  urb->transfer_flags expected to not use this bit !!! */
 #define EHCI_STATE_UNLINK	0x8000		/* urb being unlinked */
 
+enum ehci_timer_action {
+	TIMER_IO_WATCHDOG,
+	TIMER_IAA_WATCHDOG,
+	TIMER_ASYNC_SHRINK,
+	TIMER_ASYNC_OFF,
+};
+
+static inline void
+timer_action_done (struct ehci_hcd *ehci, enum ehci_timer_action action)
+{
+	clear_bit (action, &ehci->actions);
+}
+
+static inline void
+timer_action (struct ehci_hcd *ehci, enum ehci_timer_action action)
+{
+	if (!test_and_set_bit (action, &ehci->actions)) {
+		unsigned long t;
+
+		switch (action) {
+		case TIMER_IAA_WATCHDOG:
+			t = EHCI_IAA_JIFFIES;
+			break;
+		case TIMER_IO_WATCHDOG:
+			t = EHCI_IO_JIFFIES;
+			break;
+		case TIMER_ASYNC_OFF:
+			t = EHCI_ASYNC_JIFFIES;
+			break;
+		// case TIMER_ASYNC_SHRINK:
+		default:
+			t = EHCI_SHRINK_JIFFIES;
+			break;
+		}
+		t += jiffies;
+		// all timings except IAA watchdog can be overridden.
+		// async queue SHRINK often precedes IAA.  while it's ready
+		// to go OFF neither can matter, and afterwards the IO
+		// watchdog stops unless there's still periodic traffic.
+		if (action != TIMER_IAA_WATCHDOG
+				&& t > ehci->watchdog.expires
+				&& timer_pending (&ehci->watchdog))
+			return;
+		mod_timer (&ehci->watchdog, t);
+	}
+}
+
 /*-------------------------------------------------------------------------*/
 
 /* EHCI register interface, corresponds to EHCI Revision 0.95 specification */
@@ -243,7 +290,10 @@
 	size_t			length;			/* length of buffer */
 } __attribute__ ((aligned (32)));
 
-#define QTD_MASK cpu_to_le32 (~0x1f)	/* mask NakCnt+T in qh->hw_alt_next */
+/* mask NakCnt+T in qh->hw_alt_next */
+#define QTD_MASK __constant_cpu_to_le32 (~0x1f)
+
+#define IS_SHORT_READ(token) (QTD_LENGTH (token) != 0 && QTD_PID (token) == 1)
 
 /*-------------------------------------------------------------------------*/
 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* USB 2.0 with 250Gb disk and insane loads
@ 2003-06-01 13:47 Alexander Hoogerhuis
  0 siblings, 0 replies; 12+ messages in thread
From: Alexander Hoogerhuis @ 2003-06-01 13:47 UTC (permalink / raw)
  To: linux-kernel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I'm trying to nail own a problem here, with my shiny new Maxtor 250Gb
USB 2.0 disk. Under 2.4 (vanilla, latest 21-preX and 21-preX-acY) the
disk will mount and talk nicely. As soon as any load hits it, e.g. a
single cp from my internal CD-ROM to the disk, the mahcine load will
sky-rocket and at some point within a few minuter hang the machine.

On 2.5 (vanilla and -mm) the load will show as i/o-wait and at some
point hang any access to the drive, but the kernel will go on working.

The drive will be dicovered nicely by the hotplug scripts and it will
load the apropriate drivers to make it show up as /dev/sda.

Any ideas how to nail down this problem when 2.4 will say nothing
before it locks hard and 2.5 doesnt say aything but goes off to hang
itself after a while with no furter clue?

mvh,
A
- -- 
Alexander Hoogerhuis                               | alexh@ihatent.com
CCNP - CCDP - MCNE - CCSE                          | +47 908 21 485
"You have zero privacy anyway. Get over it."  --Scott McNealy
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)
Comment: Processed by Mailcrypt 3.5.8 <http://mailcrypt.sourceforge.net/>

iD8DBQE+2gPvCQ1pa+gRoggRAqkZAKCLSHDpOx8f/zL0PFwBsdSdoWjYgACeO3KW
RjW6OBQwUCgJ0V/sprQNSKk=
=HYxq
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2003-06-13 20:03 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-06-01 14:31 USB 2.0 with 250Gb disk and insane loads David Brownell
2003-06-01 14:53 ` Oliver Neukum
2003-06-01 15:17   ` Alexander Hoogerhuis
2003-06-01 18:21     ` Oliver Neukum
2003-06-02  6:51       ` Alexander Hoogerhuis
2003-06-02 14:03       ` Alexander Hoogerhuis
2003-06-02 17:13         ` David Brownell
2003-06-02 17:17           ` Alexander Hoogerhuis
2003-06-03 11:49     ` Terje Malmedal
2003-06-01 14:57 ` Alexander Hoogerhuis
2003-06-13 20:19 ` David Brownell
  -- strict thread matches above, loose matches on Subject: below --
2003-06-01 13:47 Alexander Hoogerhuis

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).