linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/1] BZ#11120: AACRAID driver stalls under high load
@ 2009-06-23 11:25 Andy Whitcroft
  2009-06-23 11:25 ` [PATCH 1/1] Reduce AACRAID hardware queue size Andy Whitcroft
  0 siblings, 1 reply; 9+ messages in thread
From: Andy Whitcroft @ 2009-06-23 11:25 UTC (permalink / raw)
  To: Adaptec OEM Raid Solutions
  Cc: Andy Whitcroft, Mathias Urlichs, James Bottomley, linux-scsi,
	linux-kernel

We have had reports of the AACRAID driver stalling under high load.  This
seems to be related to the amount of concurrent IO that can be pushed
to the controller.  Reducing the maximum queue count for this driver sorts
this out.  For further details see the upstream and Ubuntu bugs:

    http://bugzilla.kernel.org/show_bug.cgi?id=11120
    http://bugs.launchpad.net/bugs/249964

Following this email is a patch from Mathias Urlichs to reduce the
queue size.  This has been tested and confirmed to fix the issue by a
couple of those affected.

Patches against Linus' tree.

-apw

Mathias Urlichs (1):
  Reduce AACRAID hardware queue size

 drivers/scsi/aacraid/aacraid.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)


* [PATCH 1/1] Reduce AACRAID hardware queue size
  2009-06-23 11:25 [PATCH 0/1] BZ#11120: AACRAID driver stalls under high load Andy Whitcroft
@ 2009-06-23 11:25 ` Andy Whitcroft
  2009-06-23 11:50   ` Matthias Urlichs
  2009-06-23 15:31   ` James Bottomley
  0 siblings, 2 replies; 9+ messages in thread
From: Andy Whitcroft @ 2009-06-23 11:25 UTC (permalink / raw)
  To: Adaptec OEM Raid Solutions
  Cc: Andy Whitcroft, Mathias Urlichs, James Bottomley, linux-scsi,
	linux-kernel

From: Mathias Urlichs <matthias@urlichs.de>

BugLink: http://bugzilla.kernel.org/show_bug.cgi?id=11120
BugLink: http://bugs.launchpad.net/bugs/249964

Reduce the hardware queue size for the AACRAID controller.  Otherwise this
controller suffers adapter aborts and SCSI resets under high load:

    aacraid: Host adapter abort request (0,0,2,0)
    aacraid: Host adapter abort request (0,0,3,0)
    aacraid: Host adapter reset request. SCSI hang ?
    aacraid: Host adapter abort request (0,0,0,0)

Signed-Off-By: Mathias Urlichs <matthias@urlichs.de>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
---
 drivers/scsi/aacraid/aacraid.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/aacraid/aacraid.h b/drivers/scsi/aacraid/aacraid.h
index cdbdec9..0d5d036 100644
--- a/drivers/scsi/aacraid/aacraid.h
+++ b/drivers/scsi/aacraid/aacraid.h
@@ -24,7 +24,7 @@
 #define AAC_MAX_LUN		(8)
 
 #define AAC_MAX_HOSTPHYSMEMPAGES (0xfffff)
-#define AAC_MAX_32BIT_SGBCOUNT	((unsigned short)256)
+#define AAC_MAX_32BIT_SGBCOUNT	((unsigned short)127)
 
 /*
  * These macros convert from physical channels to virtual channels
-- 
1.6.3.rc3.199.g24398



* Re: [PATCH 1/1] Reduce AACRAID hardware queue size
  2009-06-23 11:25 ` [PATCH 1/1] Reduce AACRAID hardware queue size Andy Whitcroft
@ 2009-06-23 11:50   ` Matthias Urlichs
  2009-06-23 14:11     ` Andy Whitcroft
  2009-06-23 15:31   ` James Bottomley
  1 sibling, 1 reply; 9+ messages in thread
From: Matthias Urlichs @ 2009-06-23 11:50 UTC (permalink / raw)
  To: Andy Whitcroft
  Cc: Adaptec OEM Raid Solutions, James Bottomley, linux-scsi, linux-kernel


Hi,

Andy Whitcroft:
> From: Mathias Urlichs <matthias@urlichs.de>
> 
Well ... "Matthias", if you please.

-- 
Matthias Urlichs   |   {M:U} IT Design @ m-u-it.de   |  smurf@smurf.noris.de
Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de
v4sw7$Yhw6+8ln7ma7u7L!wl7DUi2e6t3TMWb8HAGen6g3a4s6Mr1p-3/-6 hackerkey.com
 - -
At no time is freedom of speech more precious than when a man hits his
thumb with a hammer.


* Re: [PATCH 1/1] Reduce AACRAID hardware queue size
  2009-06-23 11:50   ` Matthias Urlichs
@ 2009-06-23 14:11     ` Andy Whitcroft
  0 siblings, 0 replies; 9+ messages in thread
From: Andy Whitcroft @ 2009-06-23 14:11 UTC (permalink / raw)
  To: Matthias Urlichs
  Cc: Adaptec OEM Raid Solutions, James Bottomley, linux-scsi, linux-kernel

On Tue, Jun 23, 2009 at 01:50:20PM +0200, Matthias Urlichs wrote:
> Hi,
> 
> Andy Whitcroft:
> > From: Mathias Urlichs <matthias@urlichs.de>
> > 
> Well ... "Matthias", if you please.

Well that is most peculiar, your name is sufficiently alien to my naive
tongue that I am sure I cut-n-pasted it wholesale to there from somewhere.
It looks like I got it from the s-o-b line in the patch:

	Signed-Off-By: Mathias Urlichs <matthias@urlichs.de>

Yeah, it seems you got your name wrong in the original patch on the kernel
bugzilla, and I have propagated it to there:

	http://bugzilla.kernel.org/show_bug.cgi?id=11120#c4

So I guess all of those references need changing.  If these patches need
regenerating please let me know.

-apw


* Re: [PATCH 1/1] Reduce AACRAID hardware queue size
  2009-06-23 11:25 ` [PATCH 1/1] Reduce AACRAID hardware queue size Andy Whitcroft
  2009-06-23 11:50   ` Matthias Urlichs
@ 2009-06-23 15:31   ` James Bottomley
  2009-06-23 15:43     ` Alan Cox
  2009-06-23 16:04     ` Matthias Urlichs
  1 sibling, 2 replies; 9+ messages in thread
From: James Bottomley @ 2009-06-23 15:31 UTC (permalink / raw)
  To: Andy Whitcroft
  Cc: Adaptec OEM Raid Solutions, Mathias Urlichs, linux-scsi, linux-kernel

On Tue, 2009-06-23 at 12:25 +0100, Andy Whitcroft wrote:
> From: Mathias Urlichs <matthias@urlichs.de>
> 
> BugLink: http://bugzilla.kernel.org/show_bug.cgi?id=11120
> BugLink: http://bugs.launchpad.net/bugs/249964
> 
> Reduce the hardware queue size for the AACRAID controller.  This controller
> suffers adapter aborts and scsi resets under high load otherwise:
> 
>     aacraid: Host adapter abort request (0,0,2,0)
>     aacraid: Host adapter abort request (0,0,3,0)
>     aacraid: Host adapter reset request. SCSI hang ?
>     aacraid: Host adapter abort request (0,0,0,0)
> 
> Signed-Off-By: Mathias Urlichs <matthias@urlichs.de>
> Signed-off-by: Andy Whitcroft <apw@canonical.com>
> ---
>  drivers/scsi/aacraid/aacraid.h |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/scsi/aacraid/aacraid.h b/drivers/scsi/aacraid/aacraid.h
> index cdbdec9..0d5d036 100644
> --- a/drivers/scsi/aacraid/aacraid.h
> +++ b/drivers/scsi/aacraid/aacraid.h
> @@ -24,7 +24,7 @@
>  #define AAC_MAX_LUN		(8)
>  
>  #define AAC_MAX_HOSTPHYSMEMPAGES (0xfffff)
> -#define AAC_MAX_32BIT_SGBCOUNT	((unsigned short)256)
> +#define AAC_MAX_32BIT_SGBCOUNT	((unsigned short)127)

So I'm afraid this isn't a proper fix.  It was a diagnostic test to see
if SGBCOUNT was the root cause for this card.

Incidentally, SGBCOUNT isn't queue depth, it's the maximum number of
sectors in an individual transfer.  What we'd need to show for this to be
the fix is that every 32 bit aacraid card is affected, which, given the
paucity of bug reports, I doubt.
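
To make the distinction concrete, here is a minimal sketch that prints both
limits for one disk.  It assumes a SCSI disk whose sysfs directory exposes
device/queue_depth and queue/max_sectors_kb; the show_limits helper and its
optional root argument are hypothetical, added only so the sketch can be
pointed at a scratch directory.

```shell
#!/bin/sh
# show_limits DISK [SYSFS_ROOT]
# Print the queue depth (commands in flight) and the per-request
# transfer size limit (in KB) for one block device.
# SYSFS_ROOT defaults to /sys; it is a parameter only for testing.
show_limits() {
    disk=$1
    root=${2:-/sys}
    base="$root/block/$disk"
    # Number of commands the SCSI layer may queue to the device.
    depth=$(cat "$base/device/queue_depth")
    # Largest single request the block layer will build, in KB.
    max_kb=$(cat "$base/queue/max_sectors_kb")
    echo "queue_depth=$depth max_sectors_kb=$max_kb"
}
```

Lowering the second value shrinks each transfer; it does not change how many
commands are outstanding at once.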

Firstly, Matthias, can you see if on an unmodified aacraid, this fixes
the problem for you:

echo 63 > /sys/block/<disk>/queue/max_sectors_kb

63 is because the parameter is in kb for sysfs, but in number of 512
byte blocks for the driver.  If it does, we can likely just add it to
the udev unusual devices and not bother with a kernel fix.
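
The unit conversion can be sketched as follows.  The sectors_to_kb helper
is a hypothetical name used only for illustration; it assumes 512-byte
sectors and rounds down because max_sectors_kb takes a whole number of KB.

```shell
#!/bin/sh
# sectors_to_kb SECTORS
# Convert a count of 512-byte sectors to whole KB, rounding down.
sectors_to_kb() {
    echo $(( $1 * 512 / 1024 ))
}

# 127 sectors * 512 bytes = 65024 bytes = 63.5 KB, which rounds down to 63.
sectors_to_kb 127
```

By the same arithmetic the unpatched value of 256 sectors corresponds to
128 KB.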

To fix the kernel properly, we'd need to add an AAC_QUIRK for this
adapter, which is a bit more work, so let's see if udev can fix it for us
first ...

James




* Re: [PATCH 1/1] Reduce AACRAID hardware queue size
  2009-06-23 15:31   ` James Bottomley
@ 2009-06-23 15:43     ` Alan Cox
  2009-06-23 16:04     ` Matthias Urlichs
  1 sibling, 0 replies; 9+ messages in thread
From: Alan Cox @ 2009-06-23 15:43 UTC (permalink / raw)
  To: James Bottomley
  Cc: Andy Whitcroft, Adaptec OEM Raid Solutions, Mathias Urlichs,
	linux-scsi, linux-kernel

> Incidentally, SGBCOUNT isn't queue depth, it's the maximum number of
> sectors in an individual transfer.  What we'd need to show for this to be
> the fix is that every 32 bit aacraid card is affected, which, given the
> paucity of bug reports, I doubt.

Weird problems with aacraid hardware are usually linked to a specific
firmware revision, so it is worth trying different firmware versions.

Alan


* Re: [PATCH 1/1] Reduce AACRAID hardware queue size
  2009-06-23 15:31   ` James Bottomley
  2009-06-23 15:43     ` Alan Cox
@ 2009-06-23 16:04     ` Matthias Urlichs
  2009-07-01 15:58       ` Andy Whitcroft
  1 sibling, 1 reply; 9+ messages in thread
From: Matthias Urlichs @ 2009-06-23 16:04 UTC (permalink / raw)
  To: James Bottomley
  Cc: Andy Whitcroft, Adaptec OEM Raid Solutions, linux-scsi, linux-kernel

Hi,

James Bottomley:
> Firstly, Matthias, can you see if on an unmodified aacraid, this fixes
> the problem for you:
> 
> echo 63 > /sys/block/<disk>/queue/max_sectors_kb
> 
Thank you, I'll do that tonight.



* Re: [PATCH 1/1] Reduce AACRAID hardware queue size
  2009-06-23 16:04     ` Matthias Urlichs
@ 2009-07-01 15:58       ` Andy Whitcroft
  2009-07-01 16:28         ` Matthias Urlichs
  0 siblings, 1 reply; 9+ messages in thread
From: Andy Whitcroft @ 2009-07-01 15:58 UTC (permalink / raw)
  To: Matthias Urlichs
  Cc: James Bottomley, Adaptec OEM Raid Solutions, linux-scsi, linux-kernel

On Tue, Jun 23, 2009 at 06:04:08PM +0200, Matthias Urlichs wrote:
> Hi,
> 
> James Bottomley:
> > Firstly, Matthias, can you see if on an unmodified aacraid, this fixes
> > the problem for you:
> > 
> > echo 63 > /sys/block/<disk>/queue/max_sectors_kb
> > 
> Thank you, I'll do that tonight.

How did this one work out for you?  I don't think I saw a reply either
way?

-apw


* Re: [PATCH 1/1] Reduce AACRAID hardware queue size
  2009-07-01 15:58       ` Andy Whitcroft
@ 2009-07-01 16:28         ` Matthias Urlichs
  0 siblings, 0 replies; 9+ messages in thread
From: Matthias Urlichs @ 2009-07-01 16:28 UTC (permalink / raw)
  To: Andy Whitcroft
  Cc: James Bottomley, Adaptec OEM Raid Solutions, linux-scsi, linux-kernel

Hi,

Andy Whitcroft:
> > > echo 63 > /sys/block/<disk>/queue/max_sectors_kb
> > > 
> > Thank you, I'll do that tonight.
> 
> How did this one work out for you?  I don't think I saw a reply either
> way?
> 
Bah. Thanks for the reminder; I totally forgot to send that email.
(Too much other stuff in my life right now. (Mostly good, fortunately.))

Short version: It didn't. The controller now seems dead.
It times out all the time, and is unable to find any disks.

Unfortunately the thing is built-in and can't easily be replaced;
I don't have any warranty info for this machine either, so ...

