linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* PATCH: [2.4.21-pre3] IDE error recovery
@ 2003-01-14  2:39 Ross Biro
  2003-01-14  3:44 ` Ross Biro
  2003-01-14  7:23 ` Andre Hedrick
  0 siblings, 2 replies; 5+ messages in thread
From: Ross Biro @ 2003-01-14  2:39 UTC (permalink / raw)
  To: alan, andre, marcelo; +Cc: linux-kernel


Take this patch with a big grain of salt.

The current IDE error recovery code will send a IDLEIMMEDIATE in the 
case when a drive is still busy or has drq set during an error.  This is 
a violation of the ATA spec and hoses up quite a few drives.  The 
prefered form of error recovery seems to be a soft reset followed by a 
set features.  This patch replaces the IDLEIMMEDIATE with a reset, but 
does not do the set features.  The set features should not be necessary, 
but at least one drive vendor does most of their testing with the set 
features.  OK_TO_RESET_CONTROLLER must be set to 1 if this patch is applied.

I've modified the google kernel to add a desired_speed to ide_drive_t 
and check to see if current_speed != desired_speed before every read or 
write request.  This causes a set features after a reset.  I can provide 
patches to do something similiar for 2.4.21, but I'm not sure it's 
appropriate.  There are bugs in the way hdparm -X is handled in 2.4.20, 
and this patch would fix them as well, but it may be too large of a 
patch for a stable kernel.

This also fixes drive_cmd_intr to do a little better checking to make 
sure the device is ready.  The spec requires this, but any device that 
sends an interrupt before it's ready is probably broken.

I have no idea what impact this will have on older devices or non disks.

This patch has only been minimally tested.

----- snip -------
diff -durbB linux-2.4.20-p1/drivers/ide/ide-cd.c 
linux-2.4.20-p2/drivers/ide/ide-cd.c
--- linux-2.4.20-p1/drivers/ide/ide-cd.c    Wed Jan  8 16:25:04 2003
+++ linux-2.4.20-p2/drivers/ide/ide-cd.c    Thu Jan  9 11:38:22 2003
@@ -644,7 +644,11 @@
     }
     if (HWIF(drive)->INB(IDE_STATUS_REG) & (BUSY_STAT|DRQ_STAT)) {
         /* force an abort */
-        HWIF(drive)->OUTB(WIN_IDLEIMMEDIATE,IDE_COMMAND_REG);
+                /* This violates the ATA Spec and causes trouble for
+                   many modern hard drives.  I'm not sure if it is 
necessary
+                   for older devices or even modern cd-roms. -- RAB
+                   HWIF(drive)->OUTB(WIN_IDLEIMMEDIATE,IDE_COMMAND_REG); */
+                rq->errors |= ERROR_RESET;
         }
     if (rq->errors >= ERROR_MAX) {
         DRIVER(drive)->end_request(drive, 0);
diff -durbB linux-2.4.20-p1/drivers/ide/ide-disk.c 
linux-2.4.20-p2/drivers/ide/ide-disk.c
--- linux-2.4.20-p1/drivers/ide/ide-disk.c    Wed Jan  8 16:25:17 2003
+++ linux-2.4.20-p2/drivers/ide/ide-disk.c    Thu Jan  9 11:40:20 2003
@@ -943,7 +943,11 @@
     }
     if (hwif->INB(IDE_STATUS_REG) & (BUSY_STAT|DRQ_STAT)) {
         /* force an abort */
-        hwif->OUTB(WIN_IDLEIMMEDIATE,IDE_COMMAND_REG);
+                /* This violates the ATA Spec and causes trouble for
+                   many modern hard drives.  I'm not sure if it is 
necessary
+                   for older devices or even other modern devices. -- RAB
+                   hwif->OUTB(WIN_IDLEIMMEDIATE,IDE_COMMAND_REG); */
+                rq->errors |= ERROR_RESET;
     }
     if (rq->errors >= ERROR_MAX)
         DRIVER(drive)->end_request(drive, 0);
diff -durbB linux-2.4.20-p1/drivers/ide/ide-io.c 
linux-2.4.20-p2/drivers/ide/ide-io.c
--- linux-2.4.20-p1/drivers/ide/ide-io.c    Wed Jan  8 16:25:37 2003
+++ linux-2.4.20-p2/drivers/ide/ide-io.c    Mon Jan 13 18:01:52 2003
@@ -328,7 +328,12 @@
     }
     if (hwif->INB(IDE_STATUS_REG) & (BUSY_STAT|DRQ_STAT)) {
         /* force an abort */
+                /* This violates the ATA Spec and causes trouble for
+                   many modern hard drives.  I'm not sure if it is 
necessary
+                   for older devices or even other modern ata devices 
-- RAB
         hwif->OUTB(WIN_IDLEIMMEDIATE,IDE_COMMAND_REG);
+                */
+                rq->errors |= ERROR_RESET;
     }
     if (rq->errors >= ERROR_MAX) {
         if (drive->driver != NULL)
@@ -397,18 +402,63 @@
     struct request *rq = HWGROUP(drive)->rq;
     ide_hwif_t *hwif = HWIF(drive);
     u8 *args = (u8 *) rq->buffer;
-    u8 stat = hwif->INB(IDE_STATUS_REG);
+    u8 stat;
     int retries = 10;
 
+        /* The ATA-6 spec requires a 400ns wait when entering the
+           HPIOI1: Check_status State from the HI4 state.  We should
+           only get here coming from the HPIOI0 INTRQ_wait state, but
+           better safe than sorry.  The spec also says a read of the
+           alt status register accomplishes this wait. */
+    if (IDE_CONTROL_REG)
+        (void)hwif->INB(IDE_ALTSTATUS_REG);
+    else
+                ide_delay_400ns();
+
+        /* The ATA-6 spec says that the should enter the check status
+           state when it get's an interrupt for a drive command, so
+           technically, the device can send an interrupt before it has
+           finished.  I would consider any device that does this to be
+           broken, but the spec allows it. --RAB 1/10/03 */
+    stat = hwif->INB(IDE_STATUS_REG);
+        if (stat & BUSY_STAT) {
+                /* We are just supposed to poll from here on out. */
+                if (HWGROUP(drive)->poll_timeout == 0) {
+                        
HWGROUP(drive)->poll_timeout=WAIT_CMD/CMD_POLL_TICKS;
+                } else if (--HWGROUP(drive)->poll_timeout == 0) {
+                        return DRIVER(drive)->error(drive,
+                                                    "drive_cmd timeout",
+                                                    stat);
+                }
+                ide_set_handler(drive, drive_cmd_intr, CMD_POLL_TICKS, 
NULL);
+                return ide_started;
+        }
+
     local_irq_enable();
-    if ((stat & DRQ_STAT) && args && args[3]) {
+
+        /* We want to make sure that the device and the command
+           both agree on wether or not data is returned by the command
+           that was just issued.  If they do not, it is an error. */
+        if (args && args[3]) {
         u8 io_32bit = drive->io_32bit;
+                /* The software expects some data back.  If the drive does
+                   not, then it is an error. */
+                if (!(stat & DRQ_STAT)) {
+                        return DRIVER(drive)->error(drive, "drive_cmd 
no data", stat);
+                }
         drive->io_32bit = 0;
         hwif->ata_input_data(drive, &args[4], args[3] * SECTOR_WORDS);
         drive->io_32bit = io_32bit;
         while (((stat = hwif->INB(IDE_STATUS_REG)) & BUSY_STAT) && 
retries--)
             udelay(100);
+        } else {
+                /* The software does not expect data, so if the drive
+                   returns some, then it's an error. */
+                if (stat & DRQ_STAT) {
+                        return DRIVER(drive)->error(drive, "drive_cmd 
unexpected data", stat);
+                }
     }
+
 
     if (!OK_STAT(stat, READY_STAT, BAD_STAT))
         return DRIVER(drive)->error(drive, "drive_cmd", stat);
diff -durbB linux-2.4.20-p1/drivers/ide/ide-taskfile.c 
linux-2.4.20-p2/drivers/ide/ide-taskfile.c
--- linux-2.4.20-p1/drivers/ide/ide-taskfile.c    Wed Jan  8 16:25:43 2003
+++ linux-2.4.20-p2/drivers/ide/ide-taskfile.c    Fri Jan 10 11:19:07 2003
@@ -485,7 +485,13 @@
     }
     if (hwif->INB(IDE_STATUS_REG) & (BUSY_STAT|DRQ_STAT)) {
         /* force an abort */
-        hwif->OUTB(WIN_IDLEIMMEDIATE, IDE_COMMAND_REG);
+                /* This violates the ATA Spec and causes trouble for
+                   many modern hard drives.  I'm not sure if it is 
necessary
+                   for older devices or even other modern ata devices 
-- RAB
+        hwif->OUTB(WIN_IDLEIMMEDIATE,IDE_COMMAND_REG);
+                */
+                rq->errors |= ERROR_RESET;
+
     }
     if (rq->errors >= ERROR_MAX) {
         DRIVER(drive)->end_request(drive, 0);
diff -durbB linux-2.4.20-p1/include/linux/ide.h 
linux-2.4.20-p2/include/linux/ide.h
--- linux-2.4.20-p1/include/linux/ide.h    Thu Jan  9 15:37:22 2003
+++ linux-2.4.20-p2/include/linux/ide.h    Mon Jan 13 18:04:47 2003
@@ -271,6 +271,10 @@
 #define WAIT_WORSTCASE    (30*HZ)    /* 30sec  - worst case when 
spinning up */
 #define WAIT_CMD    (10*HZ)    /* 10sec  - maximum wait for an IRQ to 
happen */
 #define WAIT_MIN_SLEEP    (2*HZ/100)    /* 20msec - minimum sleep time */
+#define CMD_POLL_TICKS  (1)     /* How long to wait in ticks between
+                                   status checks for a drive cmd
+                                   that set it's interrupt before it
+                                   was complete. */
 
 #define HOST(hwif,chipset)                    \
 {                                \



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: PATCH: [2.4.21-pre3] IDE error recovery
  2003-01-14  2:39 PATCH: [2.4.21-pre3] IDE error recovery Ross Biro
@ 2003-01-14  3:44 ` Ross Biro
  2003-01-14  7:23 ` Andre Hedrick
  1 sibling, 0 replies; 5+ messages in thread
From: Ross Biro @ 2003-01-14  3:44 UTC (permalink / raw)
  To: Ross Biro; +Cc: alan, andre, marcelo, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 9342 bytes --]


Someone said there were word wrapping problems with my mailer.  I'm too 
lazy to get a real mailer, so here's an attachment.

       Ross


Ross Biro wrote:

>
> Take this patch with a big grain of salt.
>
> The current IDE error recovery code will send a IDLEIMMEDIATE in the 
> case when a drive is still busy or has drq set during an error.  This 
> is a violation of the ATA spec and hoses up quite a few drives.  The 
> prefered form of error recovery seems to be a soft reset followed by a 
> set features.  This patch replaces the IDLEIMMEDIATE with a reset, but 
> does not do the set features.  The set features should not be 
> necessary, but at least one drive vendor does most of their testing 
> with the set features.  OK_TO_RESET_CONTROLLER must be set to 1 if 
> this patch is applied.
>
> I've modified the google kernel to add a desired_speed to ide_drive_t 
> and check to see if current_speed != desired_speed before every read 
> or write request.  This causes a set features after a reset.  I can 
> provide patches to do something similiar for 2.4.21, but I'm not sure 
> it's appropriate.  There are bugs in the way hdparm -X is handled in 
> 2.4.20, and this patch would fix them as well, but it may be too large 
> of a patch for a stable kernel.
>
> This also fixes drive_cmd_intr to do a little better checking to make 
> sure the device is ready.  The spec requires this, but any device that 
> sends an interrupt before it's ready is probably broken.
>
> I have no idea what impact this will have on older devices or non disks.
>
> This patch has only been minimally tested.
>
> ----- snip -------
> diff -durbB linux-2.4.20-p1/drivers/ide/ide-cd.c 
> linux-2.4.20-p2/drivers/ide/ide-cd.c
> --- linux-2.4.20-p1/drivers/ide/ide-cd.c    Wed Jan  8 16:25:04 2003
> +++ linux-2.4.20-p2/drivers/ide/ide-cd.c    Thu Jan  9 11:38:22 2003
> @@ -644,7 +644,11 @@
>     }
>     if (HWIF(drive)->INB(IDE_STATUS_REG) & (BUSY_STAT|DRQ_STAT)) {
>         /* force an abort */
> -        HWIF(drive)->OUTB(WIN_IDLEIMMEDIATE,IDE_COMMAND_REG);
> +                /* This violates the ATA Spec and causes trouble for
> +                   many modern hard drives.  I'm not sure if it is 
> necessary
> +                   for older devices or even modern cd-roms. -- RAB
> +                   
> HWIF(drive)->OUTB(WIN_IDLEIMMEDIATE,IDE_COMMAND_REG); */
> +                rq->errors |= ERROR_RESET;
>         }
>     if (rq->errors >= ERROR_MAX) {
>         DRIVER(drive)->end_request(drive, 0);
> diff -durbB linux-2.4.20-p1/drivers/ide/ide-disk.c 
> linux-2.4.20-p2/drivers/ide/ide-disk.c
> --- linux-2.4.20-p1/drivers/ide/ide-disk.c    Wed Jan  8 16:25:17 2003
> +++ linux-2.4.20-p2/drivers/ide/ide-disk.c    Thu Jan  9 11:40:20 2003
> @@ -943,7 +943,11 @@
>     }
>     if (hwif->INB(IDE_STATUS_REG) & (BUSY_STAT|DRQ_STAT)) {
>         /* force an abort */
> -        hwif->OUTB(WIN_IDLEIMMEDIATE,IDE_COMMAND_REG);
> +                /* This violates the ATA Spec and causes trouble for
> +                   many modern hard drives.  I'm not sure if it is 
> necessary
> +                   for older devices or even other modern devices. -- 
> RAB
> +                   hwif->OUTB(WIN_IDLEIMMEDIATE,IDE_COMMAND_REG); */
> +                rq->errors |= ERROR_RESET;
>     }
>     if (rq->errors >= ERROR_MAX)
>         DRIVER(drive)->end_request(drive, 0);
> diff -durbB linux-2.4.20-p1/drivers/ide/ide-io.c 
> linux-2.4.20-p2/drivers/ide/ide-io.c
> --- linux-2.4.20-p1/drivers/ide/ide-io.c    Wed Jan  8 16:25:37 2003
> +++ linux-2.4.20-p2/drivers/ide/ide-io.c    Mon Jan 13 18:01:52 2003
> @@ -328,7 +328,12 @@
>     }
>     if (hwif->INB(IDE_STATUS_REG) & (BUSY_STAT|DRQ_STAT)) {
>         /* force an abort */
> +                /* This violates the ATA Spec and causes trouble for
> +                   many modern hard drives.  I'm not sure if it is 
> necessary
> +                   for older devices or even other modern ata devices 
> -- RAB
>         hwif->OUTB(WIN_IDLEIMMEDIATE,IDE_COMMAND_REG);
> +                */
> +                rq->errors |= ERROR_RESET;
>     }
>     if (rq->errors >= ERROR_MAX) {
>         if (drive->driver != NULL)
> @@ -397,18 +402,63 @@
>     struct request *rq = HWGROUP(drive)->rq;
>     ide_hwif_t *hwif = HWIF(drive);
>     u8 *args = (u8 *) rq->buffer;
> -    u8 stat = hwif->INB(IDE_STATUS_REG);
> +    u8 stat;
>     int retries = 10;
>
> +        /* The ATA-6 spec requires a 400ns wait when entering the
> +           HPIOI1: Check_status State from the HI4 state.  We should
> +           only get here coming from the HPIOI0 INTRQ_wait state, but
> +           better safe than sorry.  The spec also says a read of the
> +           alt status register accomplishes this wait. */
> +    if (IDE_CONTROL_REG)
> +        (void)hwif->INB(IDE_ALTSTATUS_REG);
> +    else
> +                ide_delay_400ns();
> +
> +        /* The ATA-6 spec says that the should enter the check status
> +           state when it get's an interrupt for a drive command, so
> +           technically, the device can send an interrupt before it has
> +           finished.  I would consider any device that does this to be
> +           broken, but the spec allows it. --RAB 1/10/03 */
> +    stat = hwif->INB(IDE_STATUS_REG);
> +        if (stat & BUSY_STAT) {
> +                /* We are just supposed to poll from here on out. */
> +                if (HWGROUP(drive)->poll_timeout == 0) {
> +                        
> HWGROUP(drive)->poll_timeout=WAIT_CMD/CMD_POLL_TICKS;
> +                } else if (--HWGROUP(drive)->poll_timeout == 0) {
> +                        return DRIVER(drive)->error(drive,
> +                                                    "drive_cmd timeout",
> +                                                    stat);
> +                }
> +                ide_set_handler(drive, drive_cmd_intr, 
> CMD_POLL_TICKS, NULL);
> +                return ide_started;
> +        }
> +
>     local_irq_enable();
> -    if ((stat & DRQ_STAT) && args && args[3]) {
> +
> +        /* We want to make sure that the device and the command
> +           both agree on wether or not data is returned by the command
> +           that was just issued.  If they do not, it is an error. */
> +        if (args && args[3]) {
>         u8 io_32bit = drive->io_32bit;
> +                /* The software expects some data back.  If the drive 
> does
> +                   not, then it is an error. */
> +                if (!(stat & DRQ_STAT)) {
> +                        return DRIVER(drive)->error(drive, "drive_cmd 
> no data", stat);
> +                }
>         drive->io_32bit = 0;
>         hwif->ata_input_data(drive, &args[4], args[3] * SECTOR_WORDS);
>         drive->io_32bit = io_32bit;
>         while (((stat = hwif->INB(IDE_STATUS_REG)) & BUSY_STAT) && 
> retries--)
>             udelay(100);
> +        } else {
> +                /* The software does not expect data, so if the drive
> +                   returns some, then it's an error. */
> +                if (stat & DRQ_STAT) {
> +                        return DRIVER(drive)->error(drive, "drive_cmd 
> unexpected data", stat);
> +                }
>     }
> +
>
>     if (!OK_STAT(stat, READY_STAT, BAD_STAT))
>         return DRIVER(drive)->error(drive, "drive_cmd", stat);
> diff -durbB linux-2.4.20-p1/drivers/ide/ide-taskfile.c 
> linux-2.4.20-p2/drivers/ide/ide-taskfile.c
> --- linux-2.4.20-p1/drivers/ide/ide-taskfile.c    Wed Jan  8 16:25:43 
> 2003
> +++ linux-2.4.20-p2/drivers/ide/ide-taskfile.c    Fri Jan 10 11:19:07 
> 2003
> @@ -485,7 +485,13 @@
>     }
>     if (hwif->INB(IDE_STATUS_REG) & (BUSY_STAT|DRQ_STAT)) {
>         /* force an abort */
> -        hwif->OUTB(WIN_IDLEIMMEDIATE, IDE_COMMAND_REG);
> +                /* This violates the ATA Spec and causes trouble for
> +                   many modern hard drives.  I'm not sure if it is 
> necessary
> +                   for older devices or even other modern ata devices 
> -- RAB
> +        hwif->OUTB(WIN_IDLEIMMEDIATE,IDE_COMMAND_REG);
> +                */
> +                rq->errors |= ERROR_RESET;
> +
>     }
>     if (rq->errors >= ERROR_MAX) {
>         DRIVER(drive)->end_request(drive, 0);
> diff -durbB linux-2.4.20-p1/include/linux/ide.h 
> linux-2.4.20-p2/include/linux/ide.h
> --- linux-2.4.20-p1/include/linux/ide.h    Thu Jan  9 15:37:22 2003
> +++ linux-2.4.20-p2/include/linux/ide.h    Mon Jan 13 18:04:47 2003
> @@ -271,6 +271,10 @@
> #define WAIT_WORSTCASE    (30*HZ)    /* 30sec  - worst case when 
> spinning up */
> #define WAIT_CMD    (10*HZ)    /* 10sec  - maximum wait for an IRQ to 
> happen */
> #define WAIT_MIN_SLEEP    (2*HZ/100)    /* 20msec - minimum sleep time */
> +#define CMD_POLL_TICKS  (1)     /* How long to wait in ticks between
> +                                   status checks for a drive cmd
> +                                   that set it's interrupt before it
> +                                   was complete. */
>
> #define HOST(hwif,chipset)                    \
> {                                \
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe 
> linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/




[-- Attachment #2: patch2 --]
[-- Type: application/x-java-vm, Size: 6840 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: PATCH: [2.4.21-pre3] IDE error recovery
  2003-01-14  2:39 PATCH: [2.4.21-pre3] IDE error recovery Ross Biro
  2003-01-14  3:44 ` Ross Biro
@ 2003-01-14  7:23 ` Andre Hedrick
  2003-01-14 17:08   ` Ross Biro
  1 sibling, 1 reply; 5+ messages in thread
From: Andre Hedrick @ 2003-01-14  7:23 UTC (permalink / raw)
  To: Ross Biro; +Cc: alan, marcelo, linux-kernel


Whoa, you need to check whic spec you are refering to first.

 Model=FUJITSU MHJ2181AT, FwRev=D034, SerialNo=01001697
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
 BuffType=unknown, BuffSize=512kB, MaxMultSect=16, MultSect=16
 CurCHS=17475/15/63, CurSects=-78446341, LBA=yes, LBAsects=35433216
 IORDY=yes, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes: pio0 pio1 pio2 pio3 pio4
 DMA modes: mdma0 mdma1 mdma2 udma0 udma1 *udma2 udma3 udma4
 AdvancedPM=yes: disabled (255)
 Drive Supports : Reserved : ATA-2 ATA-3 ATA-4 ATA-5
 Kernel Drive Geometry LogicalCHS=2205/255/63 PhysicalCHS=37495/15/63

This drive claims ATA2-5 support, and thus it accepts the status quo.
Where are you finding the violations?
If the drive vendor states a range of support they are in violation.
This is clearly the point where the driver core needs to have a

	unsinged long drive->support_bits

The mess this is going to create.

Cheers,

On Mon, 13 Jan 2003, Ross Biro wrote:

> 
> Take this patch with a big grain of salt.
> 
> The current IDE error recovery code will send a IDLEIMMEDIATE in the 
> case when a drive is still busy or has drq set during an error.  This is 
> a violation of the ATA spec and hoses up quite a few drives.  The 
> prefered form of error recovery seems to be a soft reset followed by a 
> set features.  This patch replaces the IDLEIMMEDIATE with a reset, but 
> does not do the set features.  The set features should not be necessary, 
> but at least one drive vendor does most of their testing with the set 
> features.  OK_TO_RESET_CONTROLLER must be set to 1 if this patch is applied.
> 
> I've modified the google kernel to add a desired_speed to ide_drive_t 
> and check to see if current_speed != desired_speed before every read or 
> write request.  This causes a set features after a reset.  I can provide 
> patches to do something similiar for 2.4.21, but I'm not sure it's 
> appropriate.  There are bugs in the way hdparm -X is handled in 2.4.20, 
> and this patch would fix them as well, but it may be too large of a 
> patch for a stable kernel.
> 
> This also fixes drive_cmd_intr to do a little better checking to make 
> sure the device is ready.  The spec requires this, but any device that 
> sends an interrupt before it's ready is probably broken.
> 
> I have no idea what impact this will have on older devices or non disks.
> 
> This patch has only been minimally tested.
> 
> ----- snip -------
> diff -durbB linux-2.4.20-p1/drivers/ide/ide-cd.c 
> linux-2.4.20-p2/drivers/ide/ide-cd.c
> --- linux-2.4.20-p1/drivers/ide/ide-cd.c    Wed Jan  8 16:25:04 2003
> +++ linux-2.4.20-p2/drivers/ide/ide-cd.c    Thu Jan  9 11:38:22 2003
> @@ -644,7 +644,11 @@
>      }
>      if (HWIF(drive)->INB(IDE_STATUS_REG) & (BUSY_STAT|DRQ_STAT)) {
>          /* force an abort */
> -        HWIF(drive)->OUTB(WIN_IDLEIMMEDIATE,IDE_COMMAND_REG);
> +                /* This violates the ATA Spec and causes trouble for
> +                   many modern hard drives.  I'm not sure if it is 
> necessary
> +                   for older devices or even modern cd-roms. -- RAB
> +                   HWIF(drive)->OUTB(WIN_IDLEIMMEDIATE,IDE_COMMAND_REG); */
> +                rq->errors |= ERROR_RESET;
>          }
>      if (rq->errors >= ERROR_MAX) {
>          DRIVER(drive)->end_request(drive, 0);
> diff -durbB linux-2.4.20-p1/drivers/ide/ide-disk.c 
> linux-2.4.20-p2/drivers/ide/ide-disk.c
> --- linux-2.4.20-p1/drivers/ide/ide-disk.c    Wed Jan  8 16:25:17 2003
> +++ linux-2.4.20-p2/drivers/ide/ide-disk.c    Thu Jan  9 11:40:20 2003
> @@ -943,7 +943,11 @@
>      }
>      if (hwif->INB(IDE_STATUS_REG) & (BUSY_STAT|DRQ_STAT)) {
>          /* force an abort */
> -        hwif->OUTB(WIN_IDLEIMMEDIATE,IDE_COMMAND_REG);
> +                /* This violates the ATA Spec and causes trouble for
> +                   many modern hard drives.  I'm not sure if it is 
> necessary
> +                   for older devices or even other modern devices. -- RAB
> +                   hwif->OUTB(WIN_IDLEIMMEDIATE,IDE_COMMAND_REG); */
> +                rq->errors |= ERROR_RESET;
>      }
>      if (rq->errors >= ERROR_MAX)
>          DRIVER(drive)->end_request(drive, 0);
> diff -durbB linux-2.4.20-p1/drivers/ide/ide-io.c 
> linux-2.4.20-p2/drivers/ide/ide-io.c
> --- linux-2.4.20-p1/drivers/ide/ide-io.c    Wed Jan  8 16:25:37 2003
> +++ linux-2.4.20-p2/drivers/ide/ide-io.c    Mon Jan 13 18:01:52 2003
> @@ -328,7 +328,12 @@
>      }
>      if (hwif->INB(IDE_STATUS_REG) & (BUSY_STAT|DRQ_STAT)) {
>          /* force an abort */
> +                /* This violates the ATA Spec and causes trouble for
> +                   many modern hard drives.  I'm not sure if it is 
> necessary
> +                   for older devices or even other modern ata devices 
> -- RAB
>          hwif->OUTB(WIN_IDLEIMMEDIATE,IDE_COMMAND_REG);
> +                */
> +                rq->errors |= ERROR_RESET;
>      }
>      if (rq->errors >= ERROR_MAX) {
>          if (drive->driver != NULL)
> @@ -397,18 +402,63 @@
>      struct request *rq = HWGROUP(drive)->rq;
>      ide_hwif_t *hwif = HWIF(drive);
>      u8 *args = (u8 *) rq->buffer;
> -    u8 stat = hwif->INB(IDE_STATUS_REG);
> +    u8 stat;
>      int retries = 10;
>  
> +        /* The ATA-6 spec requires a 400ns wait when entering the
> +           HPIOI1: Check_status State from the HI4 state.  We should
> +           only get here coming from the HPIOI0 INTRQ_wait state, but
> +           better safe than sorry.  The spec also says a read of the
> +           alt status register accomplishes this wait. */
> +    if (IDE_CONTROL_REG)
> +        (void)hwif->INB(IDE_ALTSTATUS_REG);
> +    else
> +                ide_delay_400ns();
> +
> +        /* The ATA-6 spec says that the should enter the check status
> +           state when it get's an interrupt for a drive command, so
> +           technically, the device can send an interrupt before it has
> +           finished.  I would consider any device that does this to be
> +           broken, but the spec allows it. --RAB 1/10/03 */
> +    stat = hwif->INB(IDE_STATUS_REG);
> +        if (stat & BUSY_STAT) {
> +                /* We are just supposed to poll from here on out. */
> +                if (HWGROUP(drive)->poll_timeout == 0) {
> +                        
> HWGROUP(drive)->poll_timeout=WAIT_CMD/CMD_POLL_TICKS;
> +                } else if (--HWGROUP(drive)->poll_timeout == 0) {
> +                        return DRIVER(drive)->error(drive,
> +                                                    "drive_cmd timeout",
> +                                                    stat);
> +                }
> +                ide_set_handler(drive, drive_cmd_intr, CMD_POLL_TICKS, 
> NULL);
> +                return ide_started;
> +        }
> +
>      local_irq_enable();
> -    if ((stat & DRQ_STAT) && args && args[3]) {
> +
> +        /* We want to make sure that the device and the command
> +           both agree on wether or not data is returned by the command
> +           that was just issued.  If they do not, it is an error. */
> +        if (args && args[3]) {
>          u8 io_32bit = drive->io_32bit;
> +                /* The software expects some data back.  If the drive does
> +                   not, then it is an error. */
> +                if (!(stat & DRQ_STAT)) {
> +                        return DRIVER(drive)->error(drive, "drive_cmd 
> no data", stat);
> +                }
>          drive->io_32bit = 0;
>          hwif->ata_input_data(drive, &args[4], args[3] * SECTOR_WORDS);
>          drive->io_32bit = io_32bit;
>          while (((stat = hwif->INB(IDE_STATUS_REG)) & BUSY_STAT) && 
> retries--)
>              udelay(100);
> +        } else {
> +                /* The software does not expect data, so if the drive
> +                   returns some, then it's an error. */
> +                if (stat & DRQ_STAT) {
> +                        return DRIVER(drive)->error(drive, "drive_cmd 
> unexpected data", stat);
> +                }
>      }
> +
>  
>      if (!OK_STAT(stat, READY_STAT, BAD_STAT))
>          return DRIVER(drive)->error(drive, "drive_cmd", stat);
> diff -durbB linux-2.4.20-p1/drivers/ide/ide-taskfile.c 
> linux-2.4.20-p2/drivers/ide/ide-taskfile.c
> --- linux-2.4.20-p1/drivers/ide/ide-taskfile.c    Wed Jan  8 16:25:43 2003
> +++ linux-2.4.20-p2/drivers/ide/ide-taskfile.c    Fri Jan 10 11:19:07 2003
> @@ -485,7 +485,13 @@
>      }
>      if (hwif->INB(IDE_STATUS_REG) & (BUSY_STAT|DRQ_STAT)) {
>          /* force an abort */
> -        hwif->OUTB(WIN_IDLEIMMEDIATE, IDE_COMMAND_REG);
> +                /* This violates the ATA Spec and causes trouble for
> +                   many modern hard drives.  I'm not sure if it is 
> necessary
> +                   for older devices or even other modern ata devices 
> -- RAB
> +        hwif->OUTB(WIN_IDLEIMMEDIATE,IDE_COMMAND_REG);
> +                */
> +                rq->errors |= ERROR_RESET;
> +
>      }
>      if (rq->errors >= ERROR_MAX) {
>          DRIVER(drive)->end_request(drive, 0);
> diff -durbB linux-2.4.20-p1/include/linux/ide.h 
> linux-2.4.20-p2/include/linux/ide.h
> --- linux-2.4.20-p1/include/linux/ide.h    Thu Jan  9 15:37:22 2003
> +++ linux-2.4.20-p2/include/linux/ide.h    Mon Jan 13 18:04:47 2003
> @@ -271,6 +271,10 @@
>  #define WAIT_WORSTCASE    (30*HZ)    /* 30sec  - worst case when 
> spinning up */
>  #define WAIT_CMD    (10*HZ)    /* 10sec  - maximum wait for an IRQ to 
> happen */
>  #define WAIT_MIN_SLEEP    (2*HZ/100)    /* 20msec - minimum sleep time */
> +#define CMD_POLL_TICKS  (1)     /* How long to wait in ticks between
> +                                   status checks for a drive cmd
> +                                   that set it's interrupt before it
> +                                   was complete. */
>  
>  #define HOST(hwif,chipset)                    \
>  {                                \
> 
> 

Andre Hedrick
LAD Storage Consulting Group


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: PATCH: [2.4.21-pre3] IDE error recovery
  2003-01-14  7:23 ` Andre Hedrick
@ 2003-01-14 17:08   ` Ross Biro
  2003-01-14 17:36     ` Ross Biro
  0 siblings, 1 reply; 5+ messages in thread
From: Ross Biro @ 2003-01-14 17:08 UTC (permalink / raw)
  To: Andre Hedrick; +Cc: alan, marcelo, linux-kernel


I knew this was going to be a problem.

ATA-6 no longer allows a device to claim ATA-2 compatibility.  The bit 
is obsolete.
ATA-3 does not allow IDLE IMMEDIATE.  We need to changre the error 
recovery depending on which bits are set.

ATA-6 => use srst
ATA-2 => idle immediate

Anything else is probably srst as well.

One of the drives that has trouble claims ATA-2 through ATA-6, which is 
interesting,  This means this device claims to support ATA-5, hence we 
should be able to interpret the bits in the Identify device according to 
ATA-5 which means it supports ATA-2 which it clearly does not.   Only 1 
of the 3 different drives I've looked at gets it right and does not 
claim ATA-2 support when it does not support it.  I'll drop an email to 
a couple of manufacturers and let them know their drives have a problem.

    Ross


Andre Hedrick wrote:

>Whoa, you need to check whic spec you are refering to first.
>
> Model=FUJITSU MHJ2181AT, FwRev=D034, SerialNo=01001697
> Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
> RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
> BuffType=unknown, BuffSize=512kB, MaxMultSect=16, MultSect=16
> CurCHS=17475/15/63, CurSects=-78446341, LBA=yes, LBAsects=35433216
> IORDY=yes, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
> PIO modes: pio0 pio1 pio2 pio3 pio4
> DMA modes: mdma0 mdma1 mdma2 udma0 udma1 *udma2 udma3 udma4
> AdvancedPM=yes: disabled (255)
> Drive Supports : Reserved : ATA-2 ATA-3 ATA-4 ATA-5
> Kernel Drive Geometry LogicalCHS=2205/255/63 PhysicalCHS=37495/15/63
>
>This drive claims ATA2-5 support, and thus it accepts the status quo.
>Where are you finding the violations?
>If the drive vendor states a range of support they are in violation.
>This is clearly the point where the driver core needs to have a
>
>	unsinged long drive->support_bits
>
>The mess this is going to create.
>
>Cheers,
>
>On Mon, 13 Jan 2003, Ross Biro wrote:
>
>  
>
>>Take this patch with a big grain of salt.
>>
>>The current IDE error recovery code will send a IDLEIMMEDIATE in the 
>>case when a drive is still busy or has drq set during an error.  This is 
>>a violation of the ATA spec and hoses up quite a few drives.  The 
>>prefered form of error recovery seems to be a soft reset followed by a 
>>set features.  This patch replaces the IDLEIMMEDIATE with a reset, but 
>>does not do the set features.  The set features should not be necessary, 
>>but at least one drive vendor does most of their testing with the set 
>>features.  OK_TO_RESET_CONTROLLER must be set to 1 if this patch is applied.
>>
>>I've modified the google kernel to add a desired_speed to ide_drive_t 
>>and check to see if current_speed != desired_speed before every read or 
>>write request.  This causes a set features after a reset.  I can provide 
>>patches to do something similiar for 2.4.21, but I'm not sure it's 
>>appropriate.  There are bugs in the way hdparm -X is handled in 2.4.20, 
>>and this patch would fix them as well, but it may be too large of a 
>>patch for a stable kernel.
>>
>>This also fixes drive_cmd_intr to do a little better checking to make 
>>sure the device is ready.  The spec requires this, but any device that 
>>sends an interrupt before it's ready is probably broken.
>>
>>I have no idea what impact this will have on older devices or non disks.
>>
>>This patch has only been minimally tested.
>>
>>----- snip -------
>>diff -durbB linux-2.4.20-p1/drivers/ide/ide-cd.c 
>>linux-2.4.20-p2/drivers/ide/ide-cd.c
>>--- linux-2.4.20-p1/drivers/ide/ide-cd.c    Wed Jan  8 16:25:04 2003
>>+++ linux-2.4.20-p2/drivers/ide/ide-cd.c    Thu Jan  9 11:38:22 2003
>>@@ -644,7 +644,11 @@
>>     }
>>     if (HWIF(drive)->INB(IDE_STATUS_REG) & (BUSY_STAT|DRQ_STAT)) {
>>         /* force an abort */
>>-        HWIF(drive)->OUTB(WIN_IDLEIMMEDIATE,IDE_COMMAND_REG);
>>+                /* This violates the ATA Spec and causes trouble for
>>+                   many modern hard drives.  I'm not sure if it is 
>>necessary
>>+                   for older devices or even modern cd-roms. -- RAB
>>+                   HWIF(drive)->OUTB(WIN_IDLEIMMEDIATE,IDE_COMMAND_REG); */
>>+                rq->errors |= ERROR_RESET;
>>         }
>>     if (rq->errors >= ERROR_MAX) {
>>         DRIVER(drive)->end_request(drive, 0);
>>diff -durbB linux-2.4.20-p1/drivers/ide/ide-disk.c 
>>linux-2.4.20-p2/drivers/ide/ide-disk.c
>>--- linux-2.4.20-p1/drivers/ide/ide-disk.c    Wed Jan  8 16:25:17 2003
>>+++ linux-2.4.20-p2/drivers/ide/ide-disk.c    Thu Jan  9 11:40:20 2003
>>@@ -943,7 +943,11 @@
>>     }
>>     if (hwif->INB(IDE_STATUS_REG) & (BUSY_STAT|DRQ_STAT)) {
>>         /* force an abort */
>>-        hwif->OUTB(WIN_IDLEIMMEDIATE,IDE_COMMAND_REG);
>>+                /* This violates the ATA Spec and causes trouble for
>>+                   many modern hard drives.  I'm not sure if it is 
>>necessary
>>+                   for older devices or even other modern devices. -- RAB
>>+                   hwif->OUTB(WIN_IDLEIMMEDIATE,IDE_COMMAND_REG); */
>>+                rq->errors |= ERROR_RESET;
>>     }
>>     if (rq->errors >= ERROR_MAX)
>>         DRIVER(drive)->end_request(drive, 0);
>>diff -durbB linux-2.4.20-p1/drivers/ide/ide-io.c 
>>linux-2.4.20-p2/drivers/ide/ide-io.c
>>--- linux-2.4.20-p1/drivers/ide/ide-io.c    Wed Jan  8 16:25:37 2003
>>+++ linux-2.4.20-p2/drivers/ide/ide-io.c    Mon Jan 13 18:01:52 2003
>>@@ -328,7 +328,12 @@
>>     }
>>     if (hwif->INB(IDE_STATUS_REG) & (BUSY_STAT|DRQ_STAT)) {
>>         /* force an abort */
>>+                /* This violates the ATA Spec and causes trouble for
>>+                   many modern hard drives.  I'm not sure if it is 
>>necessary
>>+                   for older devices or even other modern ata devices 
>>-- RAB
>>         hwif->OUTB(WIN_IDLEIMMEDIATE,IDE_COMMAND_REG);
>>+                */
>>+                rq->errors |= ERROR_RESET;
>>     }
>>     if (rq->errors >= ERROR_MAX) {
>>         if (drive->driver != NULL)
>>@@ -397,18 +402,63 @@
>>     struct request *rq = HWGROUP(drive)->rq;
>>     ide_hwif_t *hwif = HWIF(drive);
>>     u8 *args = (u8 *) rq->buffer;
>>-    u8 stat = hwif->INB(IDE_STATUS_REG);
>>+    u8 stat;
>>     int retries = 10;
>> 
>>+        /* The ATA-6 spec requires a 400ns wait when entering the
>>+           HPIOI1: Check_status State from the HI4 state.  We should
>>+           only get here coming from the HPIOI0 INTRQ_wait state, but
>>+           better safe than sorry.  The spec also says a read of the
>>+           alt status register accomplishes this wait. */
>>+    if (IDE_CONTROL_REG)
>>+        (void)hwif->INB(IDE_ALTSTATUS_REG);
>>+    else
>>+                ide_delay_400ns();
>>+
>>+        /* The ATA-6 spec says that the should enter the check status
>>+           state when it get's an interrupt for a drive command, so
>>+           technically, the device can send an interrupt before it has
>>+           finished.  I would consider any device that does this to be
>>+           broken, but the spec allows it. --RAB 1/10/03 */
>>+    stat = hwif->INB(IDE_STATUS_REG);
>>+        if (stat & BUSY_STAT) {
>>+                /* We are just supposed to poll from here on out. */
>>+                if (HWGROUP(drive)->poll_timeout == 0) {
>>+                        
>>HWGROUP(drive)->poll_timeout=WAIT_CMD/CMD_POLL_TICKS;
>>+                } else if (--HWGROUP(drive)->poll_timeout == 0) {
>>+                        return DRIVER(drive)->error(drive,
>>+                                                    "drive_cmd timeout",
>>+                                                    stat);
>>+                }
>>+                ide_set_handler(drive, drive_cmd_intr, CMD_POLL_TICKS, 
>>NULL);
>>+                return ide_started;
>>+        }
>>+
>>     local_irq_enable();
>>-    if ((stat & DRQ_STAT) && args && args[3]) {
>>+
>>+        /* We want to make sure that the device and the command
>>+           both agree on wether or not data is returned by the command
>>+           that was just issued.  If they do not, it is an error. */
>>+        if (args && args[3]) {
>>         u8 io_32bit = drive->io_32bit;
>>+                /* The software expects some data back.  If the drive does
>>+                   not, then it is an error. */
>>+                if (!(stat & DRQ_STAT)) {
>>+                        return DRIVER(drive)->error(drive, "drive_cmd 
>>no data", stat);
>>+                }
>>         drive->io_32bit = 0;
>>         hwif->ata_input_data(drive, &args[4], args[3] * SECTOR_WORDS);
>>         drive->io_32bit = io_32bit;
>>         while (((stat = hwif->INB(IDE_STATUS_REG)) & BUSY_STAT) && 
>>retries--)
>>             udelay(100);
>>+        } else {
>>+                /* The software does not expect data, so if the drive
>>+                   returns some, then it's an error. */
>>+                if (stat & DRQ_STAT) {
>>+                        return DRIVER(drive)->error(drive, "drive_cmd 
>>unexpected data", stat);
>>+                }
>>     }
>>+
>> 
>>     if (!OK_STAT(stat, READY_STAT, BAD_STAT))
>>         return DRIVER(drive)->error(drive, "drive_cmd", stat);
>>diff -durbB linux-2.4.20-p1/drivers/ide/ide-taskfile.c 
>>linux-2.4.20-p2/drivers/ide/ide-taskfile.c
>>--- linux-2.4.20-p1/drivers/ide/ide-taskfile.c    Wed Jan  8 16:25:43 2003
>>+++ linux-2.4.20-p2/drivers/ide/ide-taskfile.c    Fri Jan 10 11:19:07 2003
>>@@ -485,7 +485,13 @@
>>     }
>>     if (hwif->INB(IDE_STATUS_REG) & (BUSY_STAT|DRQ_STAT)) {
>>         /* force an abort */
>>-        hwif->OUTB(WIN_IDLEIMMEDIATE, IDE_COMMAND_REG);
>>+                /* This violates the ATA Spec and causes trouble for
>>+                   many modern hard drives.  I'm not sure if it is 
>>necessary
>>+                   for older devices or even other modern ata devices 
>>-- RAB
>>+        hwif->OUTB(WIN_IDLEIMMEDIATE,IDE_COMMAND_REG);
>>+                */
>>+                rq->errors |= ERROR_RESET;
>>+
>>     }
>>     if (rq->errors >= ERROR_MAX) {
>>         DRIVER(drive)->end_request(drive, 0);
>>diff -durbB linux-2.4.20-p1/include/linux/ide.h 
>>linux-2.4.20-p2/include/linux/ide.h
>>--- linux-2.4.20-p1/include/linux/ide.h    Thu Jan  9 15:37:22 2003
>>+++ linux-2.4.20-p2/include/linux/ide.h    Mon Jan 13 18:04:47 2003
>>@@ -271,6 +271,10 @@
>> #define WAIT_WORSTCASE    (30*HZ)    /* 30sec  - worst case when 
>>spinning up */
>> #define WAIT_CMD    (10*HZ)    /* 10sec  - maximum wait for an IRQ to 
>>happen */
>> #define WAIT_MIN_SLEEP    (2*HZ/100)    /* 20msec - minimum sleep time */
>>+#define CMD_POLL_TICKS  (1)     /* How long to wait in ticks between
>>+                                   status checks for a drive cmd
>>+                                   that set it's interrupt before it
>>+                                   was complete. */
>> 
>> #define HOST(hwif,chipset)                    \
>> {                                \
>>
>>
>>    
>>
>
>Andre Hedrick
>LAD Storage Consulting Group
>
>  
>




^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: PATCH: [2.4.21-pre3] IDE error recovery
  2003-01-14 17:08   ` Ross Biro
@ 2003-01-14 17:36     ` Ross Biro
  0 siblings, 0 replies; 5+ messages in thread
From: Ross Biro @ 2003-01-14 17:36 UTC (permalink / raw)
  To: Ross Biro; +Cc: Andre Hedrick, alan, marcelo, linux-kernel

Ross Biro wrote:

>
> One of the drives that has trouble claims ATA-2 through ATA-6, which 
> is interesting,  This means this device claims to support ATA-5, hence 
> we should be able to interpret the bits in the Identify device 
> according to ATA-5 which means it supports ATA-2 which it clearly does 
> not.   Only 1 of the 3 different drives I've looked at gets it right 
> and does not claim ATA-2 support when it does not support it.  I'll 
> drop an email to a couple of manufacturers and let them know their 
> drives have a problem.

I've just checked the latest drives from 4 different manufacturers.  3 
of them claim support for ATA-2 through ATA-6 and all of them have 
trouble with IDLEIMMEDIATE sent as part of the error recovery.  I've 
already sent an email to 2 of the vendors and I have a meeting already 
scheduled with the third tomorrow.

    Ross





^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2003-01-14 17:29 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-01-14  2:39 PATCH: [2.4.21-pre3] IDE error recovery Ross Biro
2003-01-14  3:44 ` Ross Biro
2003-01-14  7:23 ` Andre Hedrick
2003-01-14 17:08   ` Ross Biro
2003-01-14 17:36     ` Ross Biro

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).