* Serverworks CSB5 IDE-DMA Problem (2.4 and 2.6)
@ 2003-10-07 16:21 Stefan Kaltenbrunner
  2003-10-09 19:22 ` Bartlomiej Zolnierkiewicz
  0 siblings, 1 reply; 11+ messages in thread
From: Stefan Kaltenbrunner @ 2003-10-07 16:21 UTC (permalink / raw)
  To: linux-kernel

Hello!

we have a bunch of IBM x305 machines here, which are entry-level 1U servers 
based on a Serverworks CSB5 chipset.
One of those has two 120GB IDE disks in a software RAID1, and its main 
userspace application is a heavily used (mostly insert/update) 
PostgreSQL database. The database generates a lot of sustained 
I/O traffic, and after some minutes (depending on the load; sometimes it 
even works for one or two hours) the kernel generates the following 
messages (2.4.22 and 2.6.0-test6 behave almost identically; the 
error messages are from 2.6.0-test6):


hdc: dma_timer_expiry: dma status == 0x20
hdc: DMA timeout retry
hdc: timeout waiting for DMA
hdc: status timeout: status=0xd0 { Busy }

hdc: drive not ready for command
ide1: reset: success
hdc: dma_timer_expiry: dma status == 0x20
hdc: DMA timeout retry
hdc: timeout waiting for DMA
hdc: status timeout: status=0xd0 { Busy }

hdc: drive not ready for command
ide1: reset: success
hdc: dma_timer_expiry: dma status == 0x20
hdc: DMA timeout retry
hdc: timeout waiting for DMA
hdc: status timeout: status=0xd0 { Busy }

hdc: drive not ready for command
ide1: reset: success
hdc: dma_timer_expiry: dma status == 0x20
hdc: DMA timeout retry
hdc: timeout waiting for DMA
hdc: status timeout: status=0xd0 { Busy }

hdc: drive not ready for command
ide1: reset: success
hda: dma_timer_expiry: dma status == 0x60
hda: DMA timeout retry
hda: timeout waiting for DMA
hda: status timeout: status=0xd0 { Busy }

hdb: DMA disabled
hda: drive not ready for command
ide0: reset: success
blk: queue dfdee200, I/O limit 4095Mb (mask 0xffffffff)
hda: dma_timer_expiry: dma status == 0x20
hda: DMA timeout retry
hda: timeout waiting for DMA
hda: status timeout: status=0xd0 { Busy }

hda: drive not ready for command
ide0: reset: success
hda: dma_timer_expiry: dma status == 0x20
hda: DMA timeout retry
hda: timeout waiting for DMA
hda: status timeout: status=0xd0 { Busy }

hda: drive not ready for command
ide0: reset: success
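
The "dma status" byte in the messages above is the bus-master IDE status
register. The sketch below decodes it using the bit layout from the
SFF-8038i bus-master IDE specification. The table and function names are
made up for illustration and are not taken from the kernel source.

```python
# Bit layout of the bus-master IDE status register (SFF-8038i).
# Names are illustrative, not kernel identifiers.
BM_STATUS_BITS = [
    (0x01, "active"),              # bus-master DMA engine is running
    (0x02, "error"),               # DMA error flagged by the controller
    (0x04, "interrupt"),           # interrupt latched from the drive
    (0x20, "drive0_dma_capable"),  # set by BIOS/driver for the master
    (0x40, "drive1_dma_capable"),  # set by BIOS/driver for the slave
    (0x80, "simplex"),             # only one channel may do DMA at a time
]


def decode_bm_status(value):
    """Return the names of all bits set in a bus-master status byte."""
    return [name for bit, name in BM_STATUS_BITS if value & bit]


for value in (0x20, 0x60):
    print(hex(value), decode_bm_status(value))
```

Under that layout, 0x20 and 0x60 have none of the active, error, or
interrupt bits set; only the "DMA capable" flags are present. This
suggests the controller was idle and had latched no interrupt when the
timer expired, though it does not say why.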


After one of these events, DMA on one of the disks (either hdc or hda) 
gets disabled, the machine becomes heavily overloaded, and the database 
can no longer keep up with the incoming load of database updates.
It's also worth mentioning that the kernel reports "DMA disabled" only 
for hdb, which is the internal CD drive and is completely unused.

I do know that Serverworks IDE has been flaky in the past (especially 
with the CSB4), but I thought this had been fixed in newer chipset 
revisions. Is there anything I can do to solve this problem?

dmesg of the machine in question can be found at 
http://www.kaltenbrunner.cc/files/dmesg.txt



many thanks

Stefan Kaltenbrunner



* Re: Serverworks CSB5 IDE-DMA Problem (2.4 and 2.6)
  2003-10-07 16:21 Serverworks CSB5 IDE-DMA Problem (2.4 and 2.6) Stefan Kaltenbrunner
@ 2003-10-09 19:22 ` Bartlomiej Zolnierkiewicz
  2003-10-09 19:35   ` Marcelo Tosatti
  0 siblings, 1 reply; 11+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2003-10-09 19:22 UTC (permalink / raw)
  To: linux-kernel, mm-mailinglist


Bart, 

Do you have any idea?

Wild guess: Disable APIC? 



* Re: Serverworks CSB5 IDE-DMA Problem (2.4 and 2.6)
  2003-10-09 19:22 ` Bartlomiej Zolnierkiewicz
@ 2003-10-09 19:35   ` Marcelo Tosatti
  2003-10-09 19:46     ` Bartlomiej Zolnierkiewicz
  0 siblings, 1 reply; 11+ messages in thread
From: Marcelo Tosatti @ 2003-10-09 19:35 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz; +Cc: linux-kernel, mm-mailinglist


Sorry, I screwed up, again.

That's me asking, not Bart. Duh.

On Thu, 9 Oct 2003, Bartlomiej Zolnierkiewicz wrote:

> 
> Bart, 
> 
> Do you have any idea ? 
> 
> Wild guess: Disable APIC? 




* Re: Serverworks CSB5 IDE-DMA Problem (2.4 and 2.6)
  2003-10-09 19:35   ` Marcelo Tosatti
@ 2003-10-09 19:46     ` Bartlomiej Zolnierkiewicz
  2003-10-09 20:58       ` Stefan Kaltenbrunner
  0 siblings, 1 reply; 11+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2003-10-09 19:46 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: linux-kernel, mm-mailinglist

On Thursday 09 of October 2003 21:35, Marcelo Tosatti wrote:
> Sorry, I screwed up, again.
>
> That's me asking, not Bart. Duh.

Hehe.

> On Thu, 9 Oct 2003, Bartlomiej Zolnierkiewicz wrote:
> > Bart,
> >
> > Do you have any idea ?
> >
> > Wild guess: Disable APIC?

APIC problem should be fixed, but yes it's better to disable ACPI.

These "timeout due to drive busy" errors need to be resolved.
I don't remember seeing them at all in earlier 2.4.x kernels.
I am now searching through the mail archive to find a starting point
(i.e. 2.4.18 works, 2.4.20 doesn't, etc.).

--bartlomiej



* Re: Serverworks CSB5 IDE-DMA Problem (2.4 and 2.6)
  2003-10-09 19:46     ` Bartlomiej Zolnierkiewicz
@ 2003-10-09 20:58       ` Stefan Kaltenbrunner
  2003-10-09 21:13         ` Bartlomiej Zolnierkiewicz
  0 siblings, 1 reply; 11+ messages in thread
From: Stefan Kaltenbrunner @ 2003-10-09 20:58 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz; +Cc: marcelo.tosatti, linux-kernel

Bartlomiej Zolnierkiewicz wrote:

> APIC problem should be fixed, but yes it's better to disable ACPI.

Not sure if I understand this one right - the dmesg was from the 
2.6.0-test6 kernel, which did have ACPI HT-enum-only compiled in but no 
"local APIC support".
The 2.4.22 kernel, which has the same problem, has neither ACPI nor APIC 
support compiled in - so no, this doesn't seem to be the problem.

> These "timeout due to drive busy" errors need to be resolved.

Yes - I really hope this will be fixed soon. I was forced to add a 
Fibre Channel HBA to this machine today to integrate it into our SAN 
and get the database up to speed again.
However, I'm willing to move the database to the local disks again if you 
want me to test a patch or something along those lines.



Stefan



* Re: Serverworks CSB5 IDE-DMA Problem (2.4 and 2.6)
  2003-10-09 20:58       ` Stefan Kaltenbrunner
@ 2003-10-09 21:13         ` Bartlomiej Zolnierkiewicz
  2003-10-09 21:22           ` Stefan Kaltenbrunner
  0 siblings, 1 reply; 11+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2003-10-09 21:13 UTC (permalink / raw)
  To: Stefan Kaltenbrunner; +Cc: marcelo.tosatti, linux-kernel

On Thursday 09 of October 2003 22:58, Stefan Kaltenbrunner wrote:
> Bartlomiej Zolnierkiewicz wrote:
> > APIC problem should be fixed, but yes it's better to disable ACPI.
>
> Not sure if I understand this one right - the dmesg was from the
> 2.6.0-test6 kernel, which did have ACPI HT-enum-only compiled in but no
> "local APIC support".
> The 2.4.22 kernel, which has the same problem, has neither ACPI nor APIC
> support compiled in - so no, this doesn't seem to be the problem.

Okay.

> > These "timeout due to drive busy" errors need to be resolved.
>
> Yes - I really hope this will be fixed soon. I was forced to add a
> Fibre Channel HBA to this machine today to integrate it into our SAN
> and get the database up to speed again.
> However, I'm willing to move the database to the local disks again if you
> want me to test a patch or something along those lines.

Did some kernel work okay, or is this a new system?

--bartlomiej



* Re: Serverworks CSB5 IDE-DMA Problem (2.4 and 2.6)
  2003-10-09 21:13         ` Bartlomiej Zolnierkiewicz
@ 2003-10-09 21:22           ` Stefan Kaltenbrunner
  2003-10-09 21:29             ` Bartlomiej Zolnierkiewicz
  0 siblings, 1 reply; 11+ messages in thread
From: Stefan Kaltenbrunner @ 2003-10-09 21:22 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz; +Cc: marcelo.tosatti, linux-kernel

Bartlomiej Zolnierkiewicz wrote:

>>>These "timeout due to drive busy" errors need to be resolved.
>>
>>Yes - I really hope this will be fixed soon. I was forced to add a
>>Fibre Channel HBA to this machine today to integrate it into our SAN
>>and get the database up to speed again.
>>However, I'm willing to move the database to the local disks again if you
>>want me to test a patch or something along those lines.
> 
> 
> Did some kernel work okay, or is this a new system?

This is a new system - I can try an older kernel if you can give me some 
hints about how old it should be :-)

Stefan



* Re: Serverworks CSB5 IDE-DMA Problem (2.4 and 2.6)
  2003-10-09 21:22           ` Stefan Kaltenbrunner
@ 2003-10-09 21:29             ` Bartlomiej Zolnierkiewicz
  2003-10-10  8:57               ` Stefan Kaltenbrunner
  0 siblings, 1 reply; 11+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2003-10-09 21:29 UTC (permalink / raw)
  To: Stefan Kaltenbrunner; +Cc: marcelo.tosatti, linux-kernel


2.4.18, 2.4.19 w/o APIC and ACPI

On Thursday 09 of October 2003 23:22, Stefan Kaltenbrunner wrote:
> Bartlomiej Zolnierkiewicz wrote:
> >>>These "timeout due to drive busy" errors need to be resolved.
> >>
> >>Yes - I really hope this will be fixed soon. I was forced to add a
> >>Fibre Channel HBA to this machine today to integrate it into our SAN
> >>and get the database up to speed again.
> >>However, I'm willing to move the database to the local disks again if you
> >>want me to test a patch or something along those lines.
> >
> > Did some kernel work okay, or is this a new system?
>
> This is a new system - I can try an older kernel if you can give me some
> hints about how old it should be :-)
>
> Stefan



* Re: Serverworks CSB5 IDE-DMA Problem (2.4 and 2.6)
  2003-10-09 21:29             ` Bartlomiej Zolnierkiewicz
@ 2003-10-10  8:57               ` Stefan Kaltenbrunner
  2003-10-10  9:27                 ` Bartlomiej Zolnierkiewicz
  0 siblings, 1 reply; 11+ messages in thread
From: Stefan Kaltenbrunner @ 2003-10-10  8:57 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz; +Cc: marcelo.tosatti, linux-kernel

Bartlomiej Zolnierkiewicz wrote:
> 2.4.18, 2.4.19 w/o APIC and ACPI

OK, 2.4.18 (dmesg at http://www.kaltenbrunner.cc/files/dmesg2418.txt) 
seems to work better (although not as fast as I would like), 
but I suspect that:

ide1: Speed warnings UDMA 3/4/5 is not functional.
ide0: Speed warnings UDMA 3/4/5 is not functional.

is quite interesting - if these UDMA modes do not work reliably, why do 
they get enabled with later kernels (not that I would have a problem with 
getting UDMA > 2 working *g*)?


Stefan



* Re: Serverworks CSB5 IDE-DMA Problem (2.4 and 2.6)
  2003-10-10  8:57               ` Stefan Kaltenbrunner
@ 2003-10-10  9:27                 ` Bartlomiej Zolnierkiewicz
  2003-10-13  7:43                   ` Stefan Kaltenbrunner
  0 siblings, 1 reply; 11+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2003-10-10  9:27 UTC (permalink / raw)
  To: Stefan Kaltenbrunner; +Cc: marcelo.tosatti, linux-kernel

On Friday 10 of October 2003 10:57, Stefan Kaltenbrunner wrote:
> Bartlomiej Zolnierkiewicz wrote:
> > 2.4.18, 2.4.19 w/o APIC and ACPI
>
> OK, 2.4.18 (dmesg at http://www.kaltenbrunner.cc/files/dmesg2418.txt)
> seems to work better (although not as fast as I would like),
> but I suspect that:
>
> ide1: Speed warnings UDMA 3/4/5 is not functional.
> ide0: Speed warnings UDMA 3/4/5 is not functional.
>
> is quite interesting - if these UDMA modes do not work reliably, why do
> they get enabled with later kernels (not that I would have a problem with
> getting UDMA > 2 working *g*)?

2.4.22 has 80-wire cable detection for more vendors.
You can try passing "ide0=ata66 ide1=ata66" boot options.
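
To check after a reboot that such options actually reached the kernel,
the boot command line can be read from /proc/cmdline. A minimal Python
sketch follows. The helper name parse_ide_options is hypothetical, and the
script only reports what was passed on the command line, not whether the
IDE driver honoured it.

```python
import re


def parse_ide_options(cmdline):
    """Return {'ide0': 'ata66', ...} for every ideN=value token."""
    options = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        # Keep only tokens of the form ideN=<something>.
        if sep and re.fullmatch(r"ide\d+", key):
            options[key] = value
    return options


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            print(parse_ide_options(f.read()))
    except OSError:
        print("/proc/cmdline is not readable on this system")
```

An empty result means no ideN= override was present on the command line
the kernel booted with.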

--bartlomiej



* Re: Serverworks CSB5 IDE-DMA Problem (2.4 and 2.6)
  2003-10-10  9:27                 ` Bartlomiej Zolnierkiewicz
@ 2003-10-13  7:43                   ` Stefan Kaltenbrunner
  0 siblings, 0 replies; 11+ messages in thread
From: Stefan Kaltenbrunner @ 2003-10-13  7:43 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz; +Cc: Marcelo Tosatti, linux-kernel

Bartlomiej Zolnierkiewicz wrote:
> On Friday 10 of October 2003 10:57, Stefan Kaltenbrunner wrote:
> 
>>Bartlomiej Zolnierkiewicz wrote:
>>
>>>2.4.18, 2.4.19 w/o APIC and ACPI
>>
>>OK, 2.4.18 (dmesg at http://www.kaltenbrunner.cc/files/dmesg2418.txt)
>>seems to work better (although not as fast as I would like),
>>but I suspect that:
>>
>>ide1: Speed warnings UDMA 3/4/5 is not functional.
>>ide0: Speed warnings UDMA 3/4/5 is not functional.
>>
>>is quite interesting - if these UDMA modes do not work reliably, why do
>>they get enabled with later kernels (not that I would have a problem with
>>getting UDMA > 2 working *g*)?
> 
> 
> 2.4.22 has 80-wire cable detection for more vendors.
> You can try passing "ide0=ata66 ide1=ata66" boot options.

Tried that - it worked well during the weekend (low I/O traffic),
but today it broke again:


hda: dma_timer_expiry: dma status == 0x61
hda: error waiting for DMA
hda: dma timeout retry: status=0xd0 { Busy }

hda: DMA disabled
hdb: DMA disabled
ide0: reset: success
blk: queue c034db40, I/O limit 4095Mb (mask 0xffffffff)
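
The drive status byte 0xd0 in the log above can be decoded with the
classic ATA status register bit layout. This is a minimal sketch with
illustrative names; bits 0x04 and 0x02 are listed under their historical
names and are obsolete in later ATA revisions.

```python
# Classic ATA status register bits, most significant first.
ATA_STATUS_BITS = [
    (0x80, "BSY"),   # device is busy
    (0x40, "DRDY"),  # device ready
    (0x20, "DF"),    # device fault (historically "write fault")
    (0x10, "DSC"),   # device seek complete
    (0x08, "DRQ"),   # data request
    (0x04, "CORR"),  # corrected data (obsolete)
    (0x02, "IDX"),   # index (obsolete)
    (0x01, "ERR"),   # error
]


def decode_ata_status(value):
    """Return the names of all bits set in an ATA status byte."""
    return [name for bit, name in ATA_STATUS_BITS if value & bit]


def is_busy(value):
    """True when the BSY bit is set."""
    return bool(value & 0x80)


print(hex(0xD0), decode_ata_status(0xD0), "busy" if is_busy(0xD0) else "idle")
```

As far as the ATA specification goes, the remaining bits are not
meaningful while BSY is set, which would be consistent with the kernel
printing only { Busy } for this value.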


Any more ideas to fix this?

thanks

Stefan

