linux-kernel.vger.kernel.org archive mirror
* HPT374 IDE problem with 2.6.21.* kernels
@ 2007-05-30  9:30 Geller Sandor
  2007-06-01 20:46 ` Andrew Morton
  0 siblings, 1 reply; 15+ messages in thread
From: Geller Sandor @ 2007-05-30  9:30 UTC (permalink / raw)
  To: linux-kernel

Hi,

I saw a similar report yesterday with the subject '2.6.21.1 - 97% wait
time on IDE operations'.

After upgrading from the 2.6.20.7 kernel to 2.6.21.1, my system started to
reset the IDE bus at infrequent intervals. DMA timeout and IDE bus reset
messages appeared in the syslog. I changed the two disks attached to the
HPT374 controller, and it was always the first disk that had problems. I
replaced the cables and plugged the disks into other IDE ports, but it was
only a matter of time before another IDE reset occurred. When I upgraded to
2.6.21.3 the resets became much more frequent, and this time DMA was also
disabled on the first disk. I turned DMA back on manually with hdparm, and
a few seconds of intense I/O activity resulted in another IDE reset.

After reverting to 2.6.20.12 the problem seems to be gone. BTW, I'm using
the PATA driver for the HPT374, not the libata one.

Is this a known problem? Is there a way I can help locate the cause of
the problem?

Regards,

   Geller Sandor <wildy@petra.hos.u-szeged.hu>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: HPT374 IDE problem with 2.6.21.* kernels
  2007-05-30  9:30 HPT374 IDE problem with 2.6.21.* kernels Geller Sandor
@ 2007-06-01 20:46 ` Andrew Morton
  2007-06-01 20:53   ` Sergei Shtylyov
  0 siblings, 1 reply; 15+ messages in thread
From: Andrew Morton @ 2007-06-01 20:46 UTC (permalink / raw)
  To: Geller Sandor; +Cc: linux-kernel, Sergei Shtylyov, linux-ide

On Wed, 30 May 2007 11:30:00 +0200 (CEST)
Geller Sandor <wildy@petra.hos.u-szeged.hu> wrote:

> Hi,
> 
> I saw a similar report yesterday with '2.6.21.1 - 97% wait time on IDE 
> operations' subject.
> 
> After upgrading from 2.6.20.7 kernel to 2.6.21.1 my system started to 
> reset infrequenly the IDE bus. In the syslog DMA timeout, resetting IDE 
> bus messages appeared. I've changed the two disks attached to the HPT374 
> controller, and always the first disk had problems. I've replaced cables, 
> plugged the disks into other IDE ports, but it was only a matter of time 
> to experience an IDE reset. When I upgraded to 2.6.21.3 the resets became 
> much more frequent, this time even DMA was disabled too on the first disk. 
> I turned DMA back Manually with hdparm, and a few seconds of intense IO 
> activity resulted in another IDE reset.
> 
> Reverting back to 2.6.20.12 the problem seems to be gone. BTW I'm using 
> the PATA driver for the HTP374, not the libata one.
> 
> Is this a known problem/ is there a way I can help locating the cause of 
> the problem?
> 

(cc's added)


* Re: HPT374 IDE problem with 2.6.21.* kernels
  2007-06-01 20:46 ` Andrew Morton
@ 2007-06-01 20:53   ` Sergei Shtylyov
  2007-06-01 21:13     ` Geller Sandor
  0 siblings, 1 reply; 15+ messages in thread
From: Sergei Shtylyov @ 2007-06-01 20:53 UTC (permalink / raw)
  To: Geller Sandor; +Cc: Andrew Morton, linux-kernel, linux-ide

Hello.

Andrew Morton wrote:

>>I saw a similar report yesterday with '2.6.21.1 - 97% wait time on IDE 
>>operations' subject.

>>After upgrading from 2.6.20.7 kernel to 2.6.21.1 my system started to 
>>reset infrequenly the IDE bus. In the syslog DMA timeout, resetting IDE 
>>bus messages appeared. I've changed the two disks attached to the HPT374 
>>controller, and always the first disk had problems. I've replaced cables, 
>>plugged the disks into other IDE ports, but it was only a matter of time 
>>to experience an IDE reset. When I upgraded to 2.6.21.3 the resets became 
>>much more frequent, this time even DMA was disabled too on the first disk. 
>>I turned DMA back Manually with hdparm, and a few seconds of intense IO 
>>activity resulted in another IDE reset.

>>Reverting back to 2.6.20.12 the problem seems to be gone. BTW I'm using 
>>the PATA driver for the HTP374, not the libata one.

>>Is this a known problem/ is there a way I can help locating the cause of 
>>the problem?

    Yes, please post the boot log and the IDE reset log too for starters...

> (cc's added)

WBR, Sergei


* Re: HPT374 IDE problem with 2.6.21.* kernels
  2007-06-01 20:53   ` Sergei Shtylyov
@ 2007-06-01 21:13     ` Geller Sandor
  2007-06-01 21:26       ` Sergei Shtylyov
  0 siblings, 1 reply; 15+ messages in thread
From: Geller Sandor @ 2007-06-01 21:13 UTC (permalink / raw)
  To: Sergei Shtylyov; +Cc: Andrew Morton, linux-kernel, linux-ide

On Sat, 2 Jun 2007, Sergei Shtylyov wrote:

> Hello.
>
> Andrew Morton wrote:
>
>>> I saw a similar report yesterday with '2.6.21.1 - 97% wait time on IDE 
>>> operations' subject.
>
>>> After upgrading from 2.6.20.7 kernel to 2.6.21.1 my system started to 
>>> reset infrequenly the IDE bus. In the syslog DMA timeout, resetting IDE 
>>> bus messages appeared. I've changed the two disks attached to the HPT374 
>>> controller, and always the first disk had problems. I've replaced cables, 
>>> plugged the disks into other IDE ports, but it was only a matter of time 
>>> to experience an IDE reset. When I upgraded to 2.6.21.3 the resets became 
>>> much more frequent, this time even DMA was disabled too on the first disk. 
>>> I turned DMA back Manually with hdparm, and a few seconds of intense IO 
>>> activity resulted in another IDE reset.
>
>>> Reverting back to 2.6.20.12 the problem seems to be gone. BTW I'm using 
>>> the PATA driver for the HTP374, not the libata one.
>
>>> Is this a known problem/ is there a way I can help locating the cause of 
>>> the problem?
>
>   Yes, please post the boot log and the IDE reset log too for starters...
>
>> (cc's added)
>
> WBR, Sergei

Hi,

The log of a typical IDE reset is available here:

http://petra.hos.u-szeged.hu/~wildy/syslog.gz

This was the worst case: the IDE bus was reset during system boot.

   Geller Sandor <wildy@petra.hos.u-szeged.hu>


* Re: HPT374 IDE problem with 2.6.21.* kernels
  2007-06-01 21:13     ` Geller Sandor
@ 2007-06-01 21:26       ` Sergei Shtylyov
  2007-06-01 22:41         ` Geller Sandor
  0 siblings, 1 reply; 15+ messages in thread
From: Sergei Shtylyov @ 2007-06-01 21:26 UTC (permalink / raw)
  To: Geller Sandor; +Cc: Andrew Morton, linux-kernel, linux-ide

Hello.

Geller Sandor wrote:

>>>> I saw a similar report yesterday with '2.6.21.1 - 97% wait time on 
>>>> IDE operations' subject.

>>>> After upgrading from 2.6.20.7 kernel to 2.6.21.1 my system started 
>>>> to reset infrequenly the IDE bus. In the syslog DMA timeout, 
>>>> resetting IDE bus messages appeared. I've changed the two disks 
>>>> attached to the HPT374 controller, and always the first disk had 
>>>> problems. I've replaced cables, plugged the disks into other IDE 
>>>> ports, but it was only a matter of time to experience an IDE reset. 
>>>> When I upgraded to 2.6.21.3 the resets became much more frequent, 
>>>> this time even DMA was disabled too on the first disk. I turned DMA 
>>>> back Manually with hdparm, and a few seconds of intense IO activity 
>>>> resulted in another IDE reset.
>>
>>
>>>> Reverting back to 2.6.20.12 the problem seems to be gone. BTW I'm 
>>>> using the PATA driver for the HTP374, not the libata one.
>>
>>
>>>> Is this a known problem/ is there a way I can help locating the 
>>>> cause of the problem?

>>   Yes, please post the boot log and the IDE reset log too for starters...

>>> (cc's added)

> The log of a typical IDE reset is available here:

> http://petra.hos.u-szeged.hu/~wildy/syslog.gz

> This was the worst case: the IDE bus was resetted during the system boot.

    Could you try setting HPT374_ALLOW_ATA133_6 to 0 in 
drivers/ide/pci/hpt366.c and rebuild/reboot the kernel?

MBR, Sergei
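
For readers following along, the experiment above amounts to flipping one
compile-time switch. The sketch below is not the code from
drivers/ide/pci/hpt366.c: only the macro name HPT374_ALLOW_ATA133_6 comes
from this thread. The helper names are hypothetical, and the mapping from
the mode limit to the DPLL clock is an assumption based on the statement
later in the thread that disallowing UDMA6 clocks the chip from the 50 MHz
DPLL.

```c
/* Illustrative sketch only, not the actual hpt366.c code. */

/* The experiment from this thread: change this from 1 to 0 and rebuild. */
#define HPT374_ALLOW_ATA133_6 0

/* Hypothetical helper: highest UDMA mode the driver would offer. */
static int hpt374_max_udma_mode(void)
{
	return HPT374_ALLOW_ATA133_6 ? 6 : 5;
}

/* Hypothetical helper: DPLL clock in MHz implied by that mode limit. */
static int hpt374_dpll_mhz(void)
{
	return hpt374_max_udma_mode() == 6 ? 66 : 50;
}
```

With the macro set to 0, the helpers report a UDMA5 limit and a 50 MHz
DPLL, which is the configuration being tested here.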


* Re: HPT374 IDE problem with 2.6.21.* kernels
  2007-06-01 21:26       ` Sergei Shtylyov
@ 2007-06-01 22:41         ` Geller Sandor
  2007-06-02 23:38           ` Bartlomiej Zolnierkiewicz
  0 siblings, 1 reply; 15+ messages in thread
From: Geller Sandor @ 2007-06-01 22:41 UTC (permalink / raw)
  To: Sergei Shtylyov; +Cc: Andrew Morton, linux-kernel, linux-ide

On Sat, 2 Jun 2007, Sergei Shtylyov wrote:

>> The log of a typical IDE reset is available here:
>
>> http://petra.hos.u-szeged.hu/~wildy/syslog.gz
>
>> This was the worst case: the IDE bus was resetted during the system boot.
>
>   Could you try setting HPT374_ALLOW_ATA133_6 to 0 in 
> drivers/ide/pci/hpt366.c and rebuild/reboot the kernel?


Hi Sergei,

This looks promising. Using a vanilla 2.6.22-rc3 I was able to reproduce
the problem within a few seconds. With the above modification the machine
has been running under heavy disk I/O for 30 minutes without problems...

Regards,

   Geller Sandor <wildy@petra.hos.u-szeged.hu>


* Re: HPT374 IDE problem with 2.6.21.* kernels
  2007-06-01 22:41         ` Geller Sandor
@ 2007-06-02 23:38           ` Bartlomiej Zolnierkiewicz
  2007-06-03 10:37             ` Geller Sandor
  0 siblings, 1 reply; 15+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2007-06-02 23:38 UTC (permalink / raw)
  To: Geller Sandor; +Cc: Sergei Shtylyov, Andrew Morton, linux-kernel, linux-ide


Hi,

On Saturday 02 June 2007, Geller Sandor wrote:
> On Sat, 2 Jun 2007, Sergei Shtylyov wrote:
> 
> >> The log of a typical IDE reset is available here:
> >
> >> http://petra.hos.u-szeged.hu/~wildy/syslog.gz
> >
> >> This was the worst case: the IDE bus was resetted during the system boot.
> >
> >   Could you try setting HPT374_ALLOW_ATA133_6 to 0 in 
> > drivers/ide/pci/hpt366.c and rebuild/reboot the kernel?
> 
> 
> Hi Sergei,
> 
> This looks promising. Using a vanilla 2.6.22-rc3 I was able to reproduce 
> the problem within a few seconds. With the above modification the machine 
> is running under heavy disk I/O without problems since 30 minutes...

Did it fix the problem for good?

Sergei, do we need to disallow UDMA6 completely on HPT374 or
is it only an issue with some problematic devices (=> blacklist)?

Either way we need to fix it somehow for 2.6.22.

Thanks,
Bart


* Re: HPT374 IDE problem with 2.6.21.* kernels
  2007-06-02 23:38           ` Bartlomiej Zolnierkiewicz
@ 2007-06-03 10:37             ` Geller Sandor
  2007-06-03 17:36               ` Sergei Shtylyov
  0 siblings, 1 reply; 15+ messages in thread
From: Geller Sandor @ 2007-06-03 10:37 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Sergei Shtylyov, Andrew Morton, linux-kernel, linux-ide

Hello,

On Sun, 3 Jun 2007, Bartlomiej Zolnierkiewicz wrote:

>
> Hi,
>
> On Saturday 02 June 2007, Geller Sandor wrote:
>> On Sat, 2 Jun 2007, Sergei Shtylyov wrote:
>>
>>>> The log of a typical IDE reset is available here:
>>>
>>>> http://petra.hos.u-szeged.hu/~wildy/syslog.gz
>>>
>>>> This was the worst case: the IDE bus was resetted during the system boot.
>>>
>>>   Could you try setting HPT374_ALLOW_ATA133_6 to 0 in
>>> drivers/ide/pci/hpt366.c and rebuild/reboot the kernel?
>>
>>
>> Hi Sergei,
>>
>> This looks promising. Using a vanilla 2.6.22-rc3 I was able to reproduce
>> the problem within a few seconds. With the above modification the machine
>> is running under heavy disk I/O without problems since 30 minutes...
>
> Did it fix the problem for good?

It seems so far. There hasn't been any problem since I've applied the fix.

> Sergei, do we need to disallow UDMA6 completely on HPT734 or
> is it only an issue with some problematic devices (=> blacklist)?
>
> Either way we need to fix it somehow for 2.6.22.

For the record: this HPT374 is running quite outdated firmware
(1.22) - maybe newer firmware works correctly. I'm going to upgrade the
firmware to the latest version (which was released in 2004...), but
unfortunately I won't have access to this machine for the next 2-3 weeks,
so I can't check this within the release cycle of 2.6.22. If you are
interested, I will post the result of the firmware upgrade.

Regards,

   Sandor


* Re: HPT374 IDE problem with 2.6.21.* kernels
  2007-06-03 10:37             ` Geller Sandor
@ 2007-06-03 17:36               ` Sergei Shtylyov
  2007-06-05  1:00                 ` Bartlomiej Zolnierkiewicz
  2007-06-05 20:08                 ` Sergei Shtylyov
  0 siblings, 2 replies; 15+ messages in thread
From: Sergei Shtylyov @ 2007-06-03 17:36 UTC (permalink / raw)
  To: Geller Sandor
  Cc: Bartlomiej Zolnierkiewicz, Andrew Morton, linux-kernel, linux-ide

Hello.

Geller Sandor wrote:

>>>>> The log of a typical IDE reset is available here:

>>>>> http://petra.hos.u-szeged.hu/~wildy/syslog.gz

>>>>> This was the worst case: the IDE bus was resetted during the system 
>>>>> boot.

>>>>   Could you try setting HPT374_ALLOW_ATA133_6 to 0 in
>>>> drivers/ide/pci/hpt366.c and rebuild/reboot the kernel?

>>> Hi Sergei,

>>> This looks promising. Using a vanilla 2.6.22-rc3 I was able to reproduce
>>> the problem within a few seconds. With the above modification the 
>>> machine
>>> is running under heavy disk I/O without problems since 30 minutes...

>> Did it fix the problem for good?

> It seems so far. There hasn't been any problem since I've applied the fix.

>> Sergei, do we need to disallow UDMA6 completely on HPT734 or
>> is it only an issue with some problematic devices (=> blacklist)?

    Note that I didn't change what the old code was doing in this regard --
although the HPT374 spec does *not* say that UDMA6 is supported, it had been
enabled. What has *really* changed for HPT374 is:

- in 2.6.20-rc1, the driver switched to using the actual 33 MHz timing table
   instead of the old one, which matched 50 MHz (and so was severely
   underclocked);

- in 2.6.21-rc1, the driver switched from 33 MHz PCI to 66 MHz DPLL clock.

    Disallowing UDMA6 would clock the chip with 50 MHz DPLL; however, the
original report claimed that something changed for the worse between 2.6.21.1
and .3, but nothing changed in drivers/ide/ between those releases...

>> Either way we need to fix it somehow for 2.6.22.

> For the record: this HTP374 is running with a quite outdated firmware 
> (1.22) - maybe newer firmwares work correctly. I'm going to upgrade the 
> firmware to the latest one (which was released in 2004...), but 
> unfortunately in the upcoming 2-3 weeks I won't have access to this 
> machine, so I can't check the case within the release cycle of 2.6.22. 
> If you were interested I would post the result of the firmware upgrade.

    I don't think this will matter...

> Regards,
>   Sandor

MBR, Sergei


* Re: HPT374 IDE problem with 2.6.21.* kernels
  2007-06-03 17:36               ` Sergei Shtylyov
@ 2007-06-05  1:00                 ` Bartlomiej Zolnierkiewicz
  2007-06-05 12:45                   ` Sergei Shtylyov
  2007-06-05 20:08                 ` Sergei Shtylyov
  1 sibling, 1 reply; 15+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2007-06-05  1:00 UTC (permalink / raw)
  To: Sergei Shtylyov; +Cc: Geller Sandor, Andrew Morton, linux-kernel, linux-ide


Hello,

On Sunday 03 June 2007, Sergei Shtylyov wrote:
> Geller Sandor wrote:
> Hello.
> 
> >>>>> The log of a typical IDE reset is available here:
> 
> >>>>> http://petra.hos.u-szeged.hu/~wildy/syslog.gz
> 
> >>>>> This was the worst case: the IDE bus was resetted during the system 
> >>>>> boot.
> 
> >>>>   Could you try setting HPT374_ALLOW_ATA133_6 to 0 in
> >>>> drivers/ide/pci/hpt366.c and rebuild/reboot the kernel?
> 
> >>> Hi Sergei,
> 
> >>> This looks promising. Using a vanilla 2.6.22-rc3 I was able to reproduce
> >>> the problem within a few seconds. With the above modification the 
> >>> machine
> >>> is running under heavy disk I/O without problems since 30 minutes...
> 
> >> Did it fix the problem for good?
> 
> > It seems so far. There hasn't been any problem since I've applied the fix.
> 
> >> Sergei, do we need to disallow UDMA6 completely on HPT734 or
> >> is it only an issue with some problematic devices (=> blacklist)?
> 
>     Note that I didn't change what the old code was doing in this regard -- 
> although the HPT374 spec does *not* say that UDMA6 is supported, it had been 
> enabled. What have *really* changed for HPT374 was:
> 
> - in 2.6.20-rc1, the driver switched to using the actual 33 MHz timing table
>    instead of the old one, matching 50 MHz (and so, severely underclocked);
> 
> - in 2.6.2-rc1, the driver switched from 33 MHz PCI to 66 MHz DPLL clock.
> 
>     Disallowing UDMA6 would clock the chip with 50 MHz DPLL, howewer, the 

I felt inspired by this explanation (thanks!) and took a look at
hpt374-opensource-v2.10 vendor driver.  Here is something interesting:

glbdata.c:

...
#ifdef CLOCK_66MHZ
ULONG setting370_66[] = {
        0xd029d5e,  0xd029d26,  0xc829ca6,  0xc829c84,  0xc829c62,
        0x2c829d2c, 0x2c829c66, 0x2c829c62,
        0x1c829c62, 0x1c9a9c62, 0x1c929c62, 0x1c8e9c62, 0x1c8a9c62,
        0x1c8a9c62/*0x1cae9c62*/, 0x1c869c62, 0x1c869c62,
};
...

hpt366.c:

...
static u32 sixty_six_base_hpt37x[] = {
        /* XFER_UDMA_6 */       0x1c869c62,
        /* XFER_UDMA_5 */       0x1cae9c62,     /* 0x1c8a9c62 */
...

So we are using Dual ATA Clock for UDMA5 whereas the vendor driver doesn't
(the only other mode which uses Dual ATA Clock, in both drivers, is the
rarely used UDMA3).

Thanks to this, the UDMA cycle time should equal 22.5ns instead of 30ns
(the spec defines it as 16.8ns, ide_timings[] uses 20ns) when using the
66 MHz DPLL clock.  In theory everything should play nice, but the data
manual for HPT374 contains a weird note that Dual ATA Clock is meant to
implement ATA100 read and write at different clocks (there is no further
explanation of this).

Geller reported that the problems started after migrating from 2.6.20.7 to
2.6.21.1 (the affected disks are using UDMA5), and at the same time the driver
switched from 33 MHz PCI to 66 MHz DPLL clock.  Also, the issue is completely
fixed by using the 50 MHz DPLL clock (the UDMA5 timing for the 50 MHz DPLL
clock is 0x12848242, so the UDMA cycle time equals 20ns and is smaller than
the one obtained using the 66 MHz DPLL clock).

It all makes me wonder whether it is really safe to use Dual ATA Clock for
UDMA5 and whether we should just be using "the official" timing instead...
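
The cycle times quoted above can be reproduced with a small decoder. This
is a sketch under stated assumptions: the field layout (UDMA cycle count
in bits 18..20, Dual ATA Clock select in bit 21) is not given anywhere in
this thread and is assumed here; it is used because it yields the 22.5ns,
30ns and 20ns figures for the three timing words cited. The function and
macro names are hypothetical.

```c
#include <stdint.h>

/* Assumed field layout of an HPT37x timing word (not stated in the thread). */
#define UDMA_CYCLE_SHIFT 18		/* bits 18..20: UDMA cycle count */
#define UDMA_CYCLE_MASK  0x7u
#define DUAL_ATA_CLOCK   (1u << 21)	/* bit 21: count at twice the clock */

/* Nominal clock period in tenths of a nanosecond (0 if unknown). */
static unsigned period_tenths_ns(unsigned mhz)
{
	switch (mhz) {
	case 33: return 300;
	case 50: return 200;
	case 66: return 150;
	default: return 0;
	}
}

/* UDMA cycle time, in tenths of a nanosecond, for a timing word. */
static unsigned udma_cycle_tenths_ns(uint32_t timing, unsigned mhz)
{
	unsigned cycles = (timing >> UDMA_CYCLE_SHIFT) & UDMA_CYCLE_MASK;
	unsigned period = period_tenths_ns(mhz);

	if (timing & DUAL_ATA_CLOCK)
		period /= 2;	/* Dual ATA Clock halves the period */
	return cycles * period;
}
```

Under these assumptions, 0x1cae9c62 at 66 MHz decodes to three half-period
clocks (22.5ns), 0x1c8a9c62 at 66 MHz to two full clocks (30ns), and
0x12848242 at 50 MHz to one full clock (20ns).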

Sergei?

> original report claimed that something has changed to worse between 2.6.21.1 
> and .3 but nothing changed in drivers/ide/ between those releases...

It could be that md changes from 2.6.21.3 have influenced the situation
(by putting more stress on disks etc)...

Thanks,
Bart


* Re: HPT374 IDE problem with 2.6.21.* kernels
  2007-06-05  1:00                 ` Bartlomiej Zolnierkiewicz
@ 2007-06-05 12:45                   ` Sergei Shtylyov
  2007-06-05 14:14                     ` Sergei Shtylyov
  2007-06-08 12:33                     ` Bartlomiej Zolnierkiewicz
  0 siblings, 2 replies; 15+ messages in thread
From: Sergei Shtylyov @ 2007-06-05 12:45 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Geller Sandor, Andrew Morton, linux-kernel, linux-ide

Hello.

Bartlomiej Zolnierkiewicz wrote:

>>>>>>>The log of a typical IDE reset is available here:

>>>>>>>http://petra.hos.u-szeged.hu/~wildy/syslog.gz

>>>>>>>This was the worst case: the IDE bus was resetted during the system 
>>>>>>>boot.

>>>>>>  Could you try setting HPT374_ALLOW_ATA133_6 to 0 in
>>>>>>drivers/ide/pci/hpt366.c and rebuild/reboot the kernel?

>>>>>Hi Sergei,

>>>>>This looks promising. Using a vanilla 2.6.22-rc3 I was able to reproduce
>>>>>the problem within a few seconds. With the above modification the 
>>>>>machine
>>>>>is running under heavy disk I/O without problems since 30 minutes...

>>>>Did it fix the problem for good?

>>>It seems so far. There hasn't been any problem since I've applied the fix.

>>>>Sergei, do we need to disallow UDMA6 completely on HPT734 or
>>>>is it only an issue with some problematic devices (=> blacklist)?

>>    Note that I didn't change what the old code was doing in this regard -- 
>>although the HPT374 spec does *not* say that UDMA6 is supported, it had been 
>>enabled. What have *really* changed for HPT374 was:

>>- in 2.6.20-rc1, the driver switched to using the actual 33 MHz timing table
>>   instead of the old one, matching 50 MHz (and so, severely underclocked);

>>- in 2.6.2-rc1, the driver switched from 33 MHz PCI to 66 MHz DPLL clock.

>>    Disallowing UDMA6 would clock the chip with 50 MHz DPLL, howewer, the 

> I felt inspired by this explanation (thanks!) and took a look at
> hpt374-opensource-v2.10 vendor driver.  Here is something interesting:

> glbdata.c:

> ...
> #ifdef CLOCK_66MHZ
> ULONG setting370_66[] = {
>         0xd029d5e,  0xd029d26,  0xc829ca6,  0xc829c84,  0xc829c62,
>         0x2c829d2c, 0x2c829c66, 0x2c829c62,
>         0x1c829c62, 0x1c9a9c62, 0x1c929c62, 0x1c8e9c62, 0x1c8a9c62,
>         0x1c8a9c62/*0x1cae9c62*/, 0x1c869c62, 0x1c869c62,
> };
> ...

> hpt366.c:

> ...
> static u32 sixty_six_base_hpt37x[] = {
>         /* XFER_UDMA_6 */       0x1c869c62,
>         /* XFER_UDMA_5 */       0x1cae9c62,     /* 0x1c8a9c62 */
> ...

> So we are using Dual ATA Clock for UDMA5 whereas vendor driver doesn't

    This is so in all other HPT drivers (and the HPT371N datasheet has the same
figures -- this chip is the only one supporting UDMA6 and having the default
DPLL clock > 50 MHz).  Note that it means that there's no actual UDMA5, since
the timing exactly matches the one used for UDMA4.
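
The point about UDMA5 matching UDMA4 can be checked directly against the
vendor table quoted earlier. In this sketch the table values are copied
from the thread (with the commented-out alternative dropped); the index
layout (5 PIO entries, 3 MWDMA entries, then UDMA0 upward) is an
assumption inferred from how the rows are grouped, and the helper names
are hypothetical.

```c
#include <stdint.h>

/* Vendor 66 MHz table as quoted in this thread (glbdata.c, setting370_66). */
static const uint32_t setting370_66[16] = {
	0xd029d5e,  0xd029d26,  0xc829ca6,  0xc829c84,  0xc829c62,
	0x2c829d2c, 0x2c829c66, 0x2c829c62,
	0x1c829c62, 0x1c9a9c62, 0x1c929c62, 0x1c8e9c62, 0x1c8a9c62,
	0x1c8a9c62, 0x1c869c62, 0x1c869c62,
};

/* Assumed layout: 5 PIO entries, 3 MWDMA entries, then UDMA0 upward. */
enum { UDMA_BASE = 8 };

/* Timing word the vendor table holds for a UDMA mode (0..7). */
static uint32_t vendor_udma_timing(unsigned mode)
{
	return setting370_66[UDMA_BASE + mode];
}

/* Nonzero when two UDMA modes are programmed with identical timing. */
static int same_vendor_timing(unsigned a, unsigned b)
{
	return vendor_udma_timing(a) == vendor_udma_timing(b);
}
```

With that layout, same_vendor_timing(4, 5) is true: the vendor table
programs UDMA5 with the UDMA4 timing word, while UDMA6 gets a different
one.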

> (the only other mode which uses Dual ATA Clock, in both drivers, is rarely
> used UDMA3).

    And UDMA4 with 50 MHz clock.

> Thanks to this UDMA cycle time should be equal 22.5ns instead of 30ns
> (spec defines it at 16.8ns, ide_timings[] uses 20ns) when using 66 MHz DPLL
> clock.  In theory everything should play nice but the data manual for HPT374

    And it does -- on other chips.

> contains weird note that Dual ATA Clock is meant to implement ATA100 read
> and write at different clocks (there is no more explanation to this).

    That's the thing that keeps confusing me in the other datasheets too --
from my interpretation of their timing figures it seemed to control a 2x ATA
clock multiplier. The HPT370 datasheet just gives different timings and SCR2
values for reads/writes in UDMA5 (I've disabled this mode on HPT370, from
which the read performance only gained -- not sure if it makes sense to
restore the old clock turnaround hack).

> Geller reported that the problems started after migrating from 2.6.20.7 to
> 2.6.21.1 (the affected disks are using UDMA5) and at the same time the driver
> switched from 33 MHz PCI to 66 MHz DPLL clock.  Also the issue is completely
> fixed by using 50 MHz DPLL clock (UDMA5 timing for 50 MHz DPLL clock is
> 0x12848242 so UDMA cycle time equals 20ns and is smaller than the one
> obtained using 66 MHz DPLL clock).


> It all makes me wonder whether it is really safe to use Dual ATA Clock for
> UDMA5 and whether we should just be using "the offical" timing instead...

    Not sure. I had no problems with this on the HPT371N/302 and 371N was 
clocked by 66 MHz DPLL from the start (its default clock is 75 MHz however).
    I'm still holding to my hypothesis that HPT374 simply can't tolerate 66 
MHz DPLL clock, and the UDMA5 timing figures that you've cited seem to prove that.
    I'm going to post a patch today -- how about completely prohibiting UDMA6 
on HPT374?

> Thanks,
> Bart

WBR, Sergei


* Re: HPT374 IDE problem with 2.6.21.* kernels
  2007-06-05 12:45                   ` Sergei Shtylyov
@ 2007-06-05 14:14                     ` Sergei Shtylyov
  2007-06-08 12:33                     ` Bartlomiej Zolnierkiewicz
  1 sibling, 0 replies; 15+ messages in thread
From: Sergei Shtylyov @ 2007-06-05 14:14 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Geller Sandor, Andrew Morton, linux-kernel, linux-ide

Hello, I wrote:

>> I felt inspired by this explanation (thanks!) and took a look at
>> hpt374-opensource-v2.10 vendor driver.  Here is something interesting:

>> glbdata.c:

>> ...
>> #ifdef CLOCK_66MHZ
>> ULONG setting370_66[] = {
>>         0xd029d5e,  0xd029d26,  0xc829ca6,  0xc829c84,  0xc829c62,
>>         0x2c829d2c, 0x2c829c66, 0x2c829c62,
>>         0x1c829c62, 0x1c9a9c62, 0x1c929c62, 0x1c8e9c62, 0x1c8a9c62,
>>         0x1c8a9c62/*0x1cae9c62*/, 0x1c869c62, 0x1c869c62,
>> };
>> ...

>> hpt366.c:

>> ...
>> static u32 sixty_six_base_hpt37x[] = {
>>         /* XFER_UDMA_6 */       0x1c869c62,
>>         /* XFER_UDMA_5 */       0x1cae9c62,     /* 0x1c8a9c62 */
>> ...

>> So we are using Dual ATA Clock for UDMA5 whereas vendor driver doesn't

>    This is so in all other HPT drivers (and HPT371N datasheet has the 
> same figures -- this chip is the only one supporting UDMA6 and having 
> the default DPLL clock > 50 MHz).

    What I meant to say was the only one I have a datasheet for. :-)

MBR, Sergei



* Re: HPT374 IDE problem with 2.6.21.* kernels
  2007-06-03 17:36               ` Sergei Shtylyov
  2007-06-05  1:00                 ` Bartlomiej Zolnierkiewicz
@ 2007-06-05 20:08                 ` Sergei Shtylyov
  1 sibling, 0 replies; 15+ messages in thread
From: Sergei Shtylyov @ 2007-06-05 20:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Geller Sandor, Bartlomiej Zolnierkiewicz, Andrew Morton, linux-ide

Hello, I wrote:

>>>> This looks promising. Using a vanilla 2.6.22-rc3 I was able to 
>>>> reproduce
>>>> the problem within a few seconds. With the above modification the 
>>>> machine
>>>> is running under heavy disk I/O without problems since 30 minutes...

>>> Did it fix the problem for good?

>> It seems so far. There hasn't been any problem since I've applied the 
>> fix.

>>> Sergei, do we need to disallow UDMA6 completely on HPT734 or
>>> is it only an issue with some problematic devices (=> blacklist)?

>    Note that I didn't change what the old code was doing in this regard 
> -- although the HPT374 spec does *not* say that UDMA6 is supported, it 
> had been enabled. What have *really* changed for HPT374 was:

    No, I lied (my memory failed me and in the end I forgot to check
myself).  It was me who enabled it by default (that there should have been no
option to do this is another question). :-<

> - in 2.6.20-rc1, the driver switched to using the actual 33 MHz timing 
> table
>   instead of the old one, matching 50 MHz (and so, severely underclocked);

> - in 2.6.2-rc1, the driver switched from 33 MHz PCI to 66 MHz DPLL clock.

>    Disallowing UDMA6 would clock the chip with 50 MHz DPLL, howewer, the 
> original report claimed that something has changed to worse between 
> 2.6.21.1 and .3 but nothing changed in drivers/ide/ between those 
> releases...

>>> Either way we need to fix it somehow for 2.6.22.

>> For the record: this HTP374 is running with a quite outdated firmware 
>> (1.22) - maybe newer firmwares work correctly. I'm going to upgrade 
>> the firmware to the latest one (which was released in 2004...), but 
>> unfortunately in the upcoming 2-3 weeks I won't have access to this 
>> machine, so I can't check the case within the release cycle of 2.6.22. 
>> If you were interested I would post the result of the firmware upgrade.

>    I don't think this will matter...

>> Regards,
>>   Sandor

MBR, Sergei


* Re: HPT374 IDE problem with 2.6.21.* kernels
  2007-06-05 12:45                   ` Sergei Shtylyov
  2007-06-05 14:14                     ` Sergei Shtylyov
@ 2007-06-08 12:33                     ` Bartlomiej Zolnierkiewicz
  2007-06-09 10:13                       ` Sergei Shtylyov
  1 sibling, 1 reply; 15+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2007-06-08 12:33 UTC (permalink / raw)
  To: Sergei Shtylyov; +Cc: Geller Sandor, Andrew Morton, linux-kernel, linux-ide

On Tuesday 05 June 2007, Sergei Shtylyov wrote:
> Hello.
> 
> Bartlomiej Zolnierkiewicz wrote:
> 
> >>>>>>>The log of a typical IDE reset is available here:
> 
> >>>>>>>http://petra.hos.u-szeged.hu/~wildy/syslog.gz
> 
> >>>>>>>This was the worst case: the IDE bus was resetted during the system 
> >>>>>>>boot.
> 
> >>>>>>  Could you try setting HPT374_ALLOW_ATA133_6 to 0 in
> >>>>>>drivers/ide/pci/hpt366.c and rebuild/reboot the kernel?
> 
> >>>>>Hi Sergei,
> 
> >>>>>This looks promising. Using a vanilla 2.6.22-rc3 I was able to reproduce
> >>>>>the problem within a few seconds. With the above modification the 
> >>>>>machine
> >>>>>is running under heavy disk I/O without problems since 30 minutes...
> 
> >>>>Did it fix the problem for good?
> 
> >>>It seems so far. There hasn't been any problem since I've applied the fix.
> 
> >>>>Sergei, do we need to disallow UDMA6 completely on HPT734 or
> >>>>is it only an issue with some problematic devices (=> blacklist)?
> 
> >>    Note that I didn't change what the old code was doing in this regard -- 
> >>although the HPT374 spec does *not* say that UDMA6 is supported, it had been 
> >>enabled. What have *really* changed for HPT374 was:
> 
> >>- in 2.6.20-rc1, the driver switched to using the actual 33 MHz timing table
> >>   instead of the old one, matching 50 MHz (and so, severely underclocked);
> 
> >>- in 2.6.2-rc1, the driver switched from 33 MHz PCI to 66 MHz DPLL clock.
> 
> >>    Disallowing UDMA6 would clock the chip with 50 MHz DPLL, howewer, the 
> 
> > I felt inspired by this explanation (thanks!) and took a look at
> > hpt374-opensource-v2.10 vendor driver.  Here is something interesting:
> 
> > glbdata.c:
> 
> > ...
> > #ifdef CLOCK_66MHZ
> > ULONG setting370_66[] = {
> >         0xd029d5e,  0xd029d26,  0xc829ca6,  0xc829c84,  0xc829c62,
> >         0x2c829d2c, 0x2c829c66, 0x2c829c62,
> >         0x1c829c62, 0x1c9a9c62, 0x1c929c62, 0x1c8e9c62, 0x1c8a9c62,
> >         0x1c8a9c62/*0x1cae9c62*/, 0x1c869c62, 0x1c869c62,
> > };
> > ...
> 
> > hpt366.c:
> 
> > ...
> > static u32 sixty_six_base_hpt37x[] = {
> >         /* XFER_UDMA_6 */       0x1c869c62,
> >         /* XFER_UDMA_5 */       0x1cae9c62,     /* 0x1c8a9c62 */
> > ...
> 
> > So we are using Dual ATA Clock for UDMA5 whereas vendor driver doesn't
> 
>     This is so in all other HPT drivers (and HPT371N datasheet has the same 
> figures -- this chip is the only one supporting UDMA6 and having the default 
> DPLL clock > 50 MHz).  Note that it means that there's no actual UDMA5 since 
> the timing exactly matches that one used for UDMA4.
> 
> > (the only other mode which uses Dual ATA Clock, in both drivers, is rarely
> > used UDMA3).
> 
>     And UDMA4 with 50 MHz clock.
> 
> > Thanks to this UDMA cycle time should be equal 22.5ns instead of 30ns
> > (spec defines it at 16.8ns, ide_timings[] uses 20ns) when using 66 MHz DPLL
> > clock.  In theory everything should play nice but the data manual for HPT374
> 
>     And it does -- on other chips.

My beautiful theory failed... Oh, well... ;)

> > contains weird note that Dual ATA Clock is meant to implement ATA100 read
> > and write at different clocks (there is no more explanation to this).
> 
>     That's the thing that keeps me confused in the other datasheets too -- 
> from my interpretation of their timing figures it seemed to control 2x ATA 
> clock multipler. HPT370 datasheet just gives different timings and SCR2 values 
> for reads/writes in UDMA5 (I've disabled this mode on HPT370 from which the 
> read performance only gained -- not sure if it makes sense to restore the old 
> clock turnaround hack).
> 
> > Geller reported that the problems started after migrating from 2.6.20.7 to
> > 2.6.21.1 (the affected disks are using UDMA5) and at the same time the driver
> > switched from 33 MHz PCI to 66 MHz DPLL clock.  Also the issue is completely
> > fixed by using 50 MHz DPLL clock (UDMA5 timing for 50 MHz DPLL clock is
> > 0x12848242 so UDMA cycle time equals 20ns and is smaller than the one
> > obtained using 66 MHz DPLL clock).
> 
> 
> > It all makes me wonder whether it is really safe to use Dual ATA Clock for
> > UDMA5 and whether we should just be using "the official" timing instead...
> 
>     Not sure. I had no problems with this on the HPT371N/302 and 371N was 
> clocked by 66 MHz DPLL from the start (its default clock is 75 MHz however).
>     I'm still holding to my hypothesis that HPT374 simply can't tolerate 66 
> MHz DPLL clock, and the UDMA5 timing figures that you've cited seem to prove that.
>     I'm going to post a patch today -- how about completely prohibiting UDMA6 
> on HPT374?

Sounds fine, in case somebody misses it we can introduce something like
hpt374_allow_66mhz_dpll module parameter...

Thanks,
Bart


* Re: HPT374 IDE problem with 2.6.21.* kernels
  2007-06-08 12:33                     ` Bartlomiej Zolnierkiewicz
@ 2007-06-09 10:13                       ` Sergei Shtylyov
  0 siblings, 0 replies; 15+ messages in thread
From: Sergei Shtylyov @ 2007-06-09 10:13 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Geller Sandor, Andrew Morton, linux-kernel, linux-ide

Bartlomiej Zolnierkiewicz wrote:

>>>>>>Sergei, do we need to disallow UDMA6 completely on HPT374 or
>>>>>>is it only an issue with some problematic devices (=> blacklist)?

>>>>   Note that I didn't change what the old code was doing in this regard -- 
>>>>although the HPT374 spec does *not* say that UDMA6 is supported, it had been 
>>>>enabled. What has *really* changed for HPT374 was:

>>>>- in 2.6.20-rc1, the driver switched to using the actual 33 MHz timing table
>>>>  instead of the old one, matching 50 MHz (and so, severely underclocked);

>>>>- in 2.6.21-rc1, the driver switched from 33 MHz PCI to 66 MHz DPLL clock.

>>>>   Disallowing UDMA6 would clock the chip with 50 MHz DPLL, however, the 

>>>I felt inspired by this explanation (thanks!) and took a look at
>>>hpt374-opensource-v2.10 vendor driver.  Here is something interesting:
>>
>>>glbdata.c:
>>
>>>...
>>>#ifdef CLOCK_66MHZ
>>>ULONG setting370_66[] = {
>>>        0xd029d5e,  0xd029d26,  0xc829ca6,  0xc829c84,  0xc829c62,
>>>        0x2c829d2c, 0x2c829c66, 0x2c829c62,
>>>        0x1c829c62, 0x1c9a9c62, 0x1c929c62, 0x1c8e9c62, 0x1c8a9c62,
>>>        0x1c8a9c62/*0x1cae9c62*/, 0x1c869c62, 0x1c869c62,
>>>};
>>>...

>>>hpt366.c:

>>>...
>>>static u32 sixty_six_base_hpt37x[] = {
>>>        /* XFER_UDMA_6 */       0x1c869c62,
>>>        /* XFER_UDMA_5 */       0x1cae9c62,     /* 0x1c8a9c62 */
>>>...

>>>So we are using Dual ATA Clock for UDMA5 whereas vendor driver doesn't

>>    This is so in all other HPT drivers (and HPT371N datasheet has the same 
>>figures -- this chip is the only one supporting UDMA6 and having the default 
>>DPLL clock > 50 MHz).  Note that it means that there's no actual UDMA5 since 
>>the timing exactly matches that one used for UDMA4.

>>>(the only other mode which uses Dual ATA Clock, in both drivers, is rarely
>>>used UDMA3).

>>    And UDMA4 with 50 MHz clock.

>>>Thanks to this UDMA cycle time should be equal 22.5ns instead of 30ns
>>>(spec defines it at 16.8ns, ide_timings[] uses 20ns) when using 66 MHz DPLL
>>>clock.  In theory everything should play nice but the data manual for HPT374

>>    And it does -- on other chips.

> My beautiful theory failed... Oh, well... ;)

    Sigh, if we only knew why HPT decided that UDMA5 timings should be the 
same as UDMA4 -- probably they had some reason...

>>>contains weird note that Dual ATA Clock is meant to implement ATA100 read
>>>and write at different clocks (there is no more explanation to this).

>>    That's the thing that keeps me confused in the other datasheets too -- 
>>from my interpretation of their timing figures it seemed to control 2x ATA 
>>clock multiplier. HPT370 datasheet just gives different timings and SCR2 values 
>>for reads/writes in UDMA5 (I've disabled this mode on HPT370 from which the 
>>read performance only gained -- not sure if it makes sense to restore the old 
>>clock turnaround hack).

    It used to clock the writes from DPLL in UDMA5, and clock UDMA5 reads and 
all other modes from PCI... And the result was dog slow reads in UDMA5 which 
UDMA4 was beating by about 8 MB/s... Maybe that's why UDMA4 timings were used 
for UDMA5 in later chips by HPT -- but the real UDMA5 yielded faster transfer 
speeds than UDMA4 for those chips...

>>>Geller reported that the problems started after migrating from 2.6.20.7 to
>>>2.6.21.1 (the affected disks are using UDMA5) and at the same time the driver
>>>switched from 33 MHz PCI to 66 MHz DPLL clock.  Also the issue is completely
>>>fixed by using 50 MHz DPLL clock (UDMA5 timing for 50 MHz DPLL clock is
>>>0x12848242 so UDMA cycle time equals 20ns and is smaller than the one
>>>obtained using 66 MHz DPLL clock).

>>>It all makes me wonder whether it is really safe to use Dual ATA Clock for
>>>UDMA5 and whether we should just be using "the official" timing instead...

>>    Not sure. I had no problems with this on the HPT371N/302 and 371N was 
>>clocked by 66 MHz DPLL from the start (its default clock is 75 MHz however).

    I meant to say 77... :-)

>>    I'm still holding to my hypothesis that HPT374 simply can't tolerate 66 
>>MHz DPLL clock, and the UDMA5 timing figures that you've cited seem to prove that.
>>    I'm going to post a patch today -- how about completely prohibiting UDMA6 
>>on HPT374?

> Sounds fine, in case somebody misses it we can introduce something like
> hpt374_allow_66mhz_dpll module parameter...

    Don't think anybody will miss it. Anyway, chip spec doesn't say that it's 
supported.

> Thanks,
> Bart

MBR, Sergei
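
The cycle-time figures quoted in this thread (22.5ns, 30ns and 20ns) can be
reproduced from the timing words with a small sketch like the one below. It
assumes a register layout that is not spelled out in the thread: bits 18:20
hold the UDMA cycle time in clocks, and bit 21 selects Dual ATA Clock (half
the clock period). The helper names are hypothetical, and "66 MHz" is taken
as a nominal 15ns period. The transfer-rate helper assumes the usual 2 bytes
per UDMA cycle.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed HPT37x timing register fields (an assumption, not from the thread). */
#define UDMA_CYCLE_SHIFT 18          /* bits 18:20: UDMA cycle time, in clocks */
#define UDMA_CYCLE_MASK  0x7u
#define DUAL_ATA_CLOCK   (1u << 21)  /* bit 21: count at twice the ATA clock   */

/* UDMA cycle time in ns for a timing word and a base clock period in ns. */
static double udma_cycle_ns(uint32_t timing, double clock_period_ns)
{
	unsigned int clocks = (timing >> UDMA_CYCLE_SHIFT) & UDMA_CYCLE_MASK;
	double period = clock_period_ns;

	assert(clock_period_ns > 0.0);

	/* Dual ATA Clock halves the period that the cycle count refers to. */
	if (timing & DUAL_ATA_CLOCK)
		period /= 2.0;

	return clocks * period;
}

/* Nominal burst rate in MB/s, assuming 2 bytes are moved per UDMA cycle. */
static double udma_rate_mbps(double cycle_ns)
{
	assert(cycle_ns > 0.0);
	return 2000.0 / cycle_ns;
}
```

Under these assumptions, udma_cycle_ns(0x1cae9c62, 15.0) gives 22.5 (the
in-kernel UDMA5 entry, Dual ATA Clock set), udma_cycle_ns(0x1c8a9c62, 15.0)
gives 30 (the vendor driver's UDMA5 entry, same as UDMA4), and
udma_cycle_ns(0x12848242, 20.0) gives 20 (UDMA5 with the 50 MHz DPLL clock).
A 30ns cycle corresponds to a nominal rate of about 66.7 MB/s, which is why
the vendor timing amounts to "no actual UDMA5".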



Thread overview: 15+ messages
2007-05-30  9:30 HPT374 IDE problem with 2.6.21.* kernels Geller Sandor
2007-06-01 20:46 ` Andrew Morton
2007-06-01 20:53   ` Sergei Shtylyov
2007-06-01 21:13     ` Geller Sandor
2007-06-01 21:26       ` Sergei Shtylyov
2007-06-01 22:41         ` Geller Sandor
2007-06-02 23:38           ` Bartlomiej Zolnierkiewicz
2007-06-03 10:37             ` Geller Sandor
2007-06-03 17:36               ` Sergei Shtylyov
2007-06-05  1:00                 ` Bartlomiej Zolnierkiewicz
2007-06-05 12:45                   ` Sergei Shtylyov
2007-06-05 14:14                     ` Sergei Shtylyov
2007-06-08 12:33                     ` Bartlomiej Zolnierkiewicz
2007-06-09 10:13                       ` Sergei Shtylyov
2007-06-05 20:08                 ` Sergei Shtylyov
