linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* PIIX4: DMA timeout issue
@ 2008-07-29  8:43 gshan
  2008-07-29 13:34 ` Jeff Garzik
  0 siblings, 1 reply; 14+ messages in thread
From: gshan @ 2008-07-29  8:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: gshan

Hello, I sent mails on this issue before, but nobody made response.
There are 2 types of board for me. The hardware configuration for
those 2 boards are almost same except that one of them has 100GB
harddisk, and another type of board has 60GB harddisk. All harddisk
was access through PIIX4. I found this issue on board with 60GB
disk, but can't reproduce it on that with 100GB disk.

When I uncompressed a large file (600MB) to a logic partition with 8GB
size, following errors was report on middle way (5 minutes about from 
start).
Anybody has ideas?

hdc: dma_timer_expiry: dma status == 0x21
hdc: DMA timeout error
hdc: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hdc: DMA disabled
ide1: reset: success

Output at booting as follows:
=====================
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
PIIX4: IDE controller at PCI slot 0001:00:03.1
PIIX4: chipset revision 1
PIIX4: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0x1000-0x1007, BIOS settings: hda:pio, hdb:pio
    ide1: BM-DMA at 0x1008-0x100f, BIOS settings: hdc:pio, hdd:pio
hdc: HTS726060M9ATX0, ATA DISK drive
ide1 at 0x170-0x177,0x376 on irq 47
hdc: max request size: 512KiB
hdc: 117210240 sectors (60011 MB) w/7877KiB Cache, CHS=16383/255/63, 
UDMA(33)
hdc: cache flushes supported
 hdc: hdc1 hdc4 < hdc5 hdc6 hdc7 hdc8 >

More info from proc
==============
/proc/ide/ide1/hdc $ cat settings
name                    value           min             max             mode
----                    -----           ---             ---             ----
acoustic                0               0               254             rw
address                 1               0               2               rw
bios_cyl                16383           0               65535           rw
bios_head               255             0               255             rw
bios_sect               63              0               63              rw
bswap                   0               0               1               r
current_speed           66              0               70              rw
dma_timeout             20000           0               20000           rw
failures                0               0               65535           rw
init_speed              66              0               70              rw
io_32bit                0               0               3               rw
keepsettings            0               0               1               rw
lun                     0               0               7               rw
max_failures            1               0               65535           rw
multcount               0               0               16              rw
nice1                   1               0               1               rw
nowerr                  0               0               1               rw
number                  2               0               3               rw
pio_mode                write-only      0               255             w
unmaskirq               0               0               1               rw
using_dma               1               0               1               rw
wcache                  1               0               1               rw

Output from fdisk
=============
/proc/ide/ide1/hdc $ fdisk /dev/hdc

The number of cylinders for this disk is set to 7296.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): p

Disk /dev/hdc: 60.0 GB, 60011642880 bytes
255 heads, 63 sectors/track, 7296 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x488ea1c3

   Device Boot      Start         End      Blocks   Id  System
/dev/hdc1               1          17      136521   83  Linux
/dev/hdc4              18        7296    58468567+   5  Extended
/dev/hdc5              18        1014     8008371   83  Linux
/dev/hdc6            1015        2011     8008371   83  Linux
/dev/hdc7            2012        3008     8008371   83  Linux
/dev/hdc8            3009        4005     8008371   83  Linux


Regards,
Gavin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PIIX4: DMA timeout issue
  2008-07-29  8:43 PIIX4: DMA timeout issue gshan
@ 2008-07-29 13:34 ` Jeff Garzik
       [not found]   ` <200807291713.29580.bzolnier@gmail.com>
                     ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Jeff Garzik @ 2008-07-29 13:34 UTC (permalink / raw)
  To: gshan; +Cc: linux-kernel, Linux IDE mailing list

gshan wrote:
> Hello, I sent mails on this issue before, but nobody made response.
> There are 2 types of board for me. The hardware configuration for
> those 2 boards are almost same except that one of them has 100GB
> harddisk, and another type of board has 60GB harddisk. All harddisk
> was access through PIIX4. I found this issue on board with 60GB
> disk, but can't reproduce it on that with 100GB disk.
> 
> When I uncompressed a large file (600MB) to a logic partition with 8GB
> size, following errors was report on middle way (5 minutes about from 
> start).
> Anybody has ideas?
> 
> hdc: dma_timer_expiry: dma status == 0x21
> hdc: DMA timeout error
> hdc: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
> ide: failed opcode was: unknown
> hdc: DMA disabled
> ide1: reset: success

Can you try disabling the IDE driver (CONFIG_IDE) and enabling the newer 
libata driver?  (CONFIG_ATA)

	Jeff





^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PIIX4: DMA timeout issue
       [not found]   ` <200807291713.29580.bzolnier@gmail.com>
@ 2008-07-30  0:57     ` gshan
  2008-07-30 19:43       ` Bartlomiej Zolnierkiewicz
  0 siblings, 1 reply; 14+ messages in thread
From: gshan @ 2008-07-30  0:57 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Jeff Garzik, linux-kernel, Linux IDE mailing list, Maciej W. Rozycki


>
> Hmmm?
>
> Maciej was investigating this:
>
> 	http://lkml.org/lkml/2008/5/20/724
>
> but you didn't follow up and started new thread with:
>
> 	http://lkml.org/lkml/2008/7/14/452
>   
I'm sorry for this :-(
> [ It could be that the original IRQ routing issue was fixed
>   (or wasn't really the source of problem) but you didn't tell
>   us about it and ignored previous discussions. ]
>   
The IRQ routing issue has been fixed. IRQ routing system
on my board is somewhat complicated: i8259->mlba (FPGA based
PCI device)->MV64460(system controller). I forgot to set MV64460
IRQ pin to level sensitive and edge sensitive was used, which caused
interrupt lost finally. After changing this to level sensitive, everything
seems fine. Now, the problem isn't interrupt lost, but suspending DMA
operations.
> Also is the problem still present in 2.6.26 or 2.6.27-rc1?
>
> [ I don't see an information about the kernel version that is
>   being used in either of threads? ]
>   
Sorry, I'm using 2.6.21 kernel.
>   
>>> There are 2 types of board for me. The hardware configuration for
>>> those 2 boards are almost same except that one of them has 100GB
>>> harddisk, and another type of board has 60GB harddisk. All harddisk
>>> was access through PIIX4. I found this issue on board with 60GB
>>> disk, but can't reproduce it on that with 100GB disk.
>>>
>>> When I uncompressed a large file (600MB) to a logic partition with 8GB
>>> size, following errors was report on middle way (5 minutes about from 
>>> start).
>>> Anybody has ideas?
>>>
>>> hdc: dma_timer_expiry: dma status == 0x21
>>> hdc: DMA timeout error
>>> hdc: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
>>> ide: failed opcode was: unknown
>>> hdc: DMA disabled
>>> ide1: reset: success
>>>       
>> Can you try disabling the IDE driver (CONFIG_IDE) and enabling the newer 
>> libata driver?  (CONFIG_ATA)
>>
>> 	Jeff
>>     


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PIIX4: DMA timeout issue
  2008-07-29 13:34 ` Jeff Garzik
       [not found]   ` <200807291713.29580.bzolnier@gmail.com>
@ 2008-07-30  0:58   ` gshan
  2008-07-30  9:04   ` gshan
  2 siblings, 0 replies; 14+ messages in thread
From: gshan @ 2008-07-30  0:58 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: linux-kernel, Linux IDE mailing list

Jeff Garzik wrote:
> gshan wrote:
>> Hello, I sent mails on this issue before, but nobody made response.
>> There are 2 types of board for me. The hardware configuration for
>> those 2 boards are almost same except that one of them has 100GB
>> harddisk, and another type of board has 60GB harddisk. All harddisk
>> was access through PIIX4. I found this issue on board with 60GB
>> disk, but can't reproduce it on that with 100GB disk.
>>
>> When I uncompressed a large file (600MB) to a logic partition with 8GB
>> size, following errors was report on middle way (5 minutes about from 
>> start).
>> Anybody has ideas?
>>
>> hdc: dma_timer_expiry: dma status == 0x21
>> hdc: DMA timeout error
>> hdc: dma timeout error: status=0x58 { DriveReady SeekComplete 
>> DataRequest }
>> ide: failed opcode was: unknown
>> hdc: DMA disabled
>> ide1: reset: success
>
> Can you try disabling the IDE driver (CONFIG_IDE) and enabling the 
> newer libata driver?  (CONFIG_ATA)
Jeff, thanks a lot for your suggestions. I will try it if possible.
>
>     Jeff
>
>
>
>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PIIX4: DMA timeout issue
  2008-07-29 13:34 ` Jeff Garzik
       [not found]   ` <200807291713.29580.bzolnier@gmail.com>
  2008-07-30  0:58   ` gshan
@ 2008-07-30  9:04   ` gshan
  2008-07-30 10:49     ` Alan Cox
  2 siblings, 1 reply; 14+ messages in thread
From: gshan @ 2008-07-30  9:04 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: linux-kernel, Linux IDE mailing list


>
> Can you try disabling the IDE driver (CONFIG_IDE) and enabling the 
> newer libata driver?  (CONFIG_ATA)
>
Jeff, I tried moving to ATA driver for PIIX4 today and found ATA need to 
request
2 IRQ lines, each one of them belongs to primary and secondary channel. 
However,
when I make PIIX4 worked under IDE driver, it also need to request one 
IRQ for "ide1".
So it's necessary to disable secondary channel manually?

Another question is my PIIX4 chip has device ID + vendor ID of 
0x71108086. But it finally
was mapped to following ID of piix_pci_tbl[] array. Device of 0x7110 and 
0x7010 should
be quit different, right?

 { 0x8086, 0x7010, PCI_ANY_ID, PCI_ANY_ID, 0, 0, piix_pata_mwdma },

Gavin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PIIX4: DMA timeout issue
  2008-07-30  9:04   ` gshan
@ 2008-07-30 10:49     ` Alan Cox
  2008-07-31  4:30       ` gshan
  0 siblings, 1 reply; 14+ messages in thread
From: Alan Cox @ 2008-07-30 10:49 UTC (permalink / raw)
  To: gshan; +Cc: Jeff Garzik, linux-kernel, Linux IDE mailing list

> Jeff, I tried moving to ATA driver for PIIX4 today and found ATA need to 
> request
> 2 IRQ lines, each one of them belongs to primary and secondary channel. 
> However,
> when I make PIIX4 worked under IDE driver, it also need to request one 
> IRQ for "ide1".
> So it's necessary to disable secondary channel manually?

If the controller is in legacy mode then libata will claim the interrupt
lines for all channels which are enabled even if there are no devices on
them. This is neccessary for device hot or warm plug support as new disks
could appear on empty channels.
> 
> Another question is my PIIX4 chip has device ID + vendor ID of 
> 0x71108086. But it finally
> was mapped to following ID of piix_pci_tbl[] array. Device of 0x7110 and 
> 0x7010 should
> be quit different, right?

Yes - I find tht one very odd - you might need to debug a bit to see what
is occuring if you think this is the case.

Alan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PIIX4: DMA timeout issue
  2008-07-30  0:57     ` gshan
@ 2008-07-30 19:43       ` Bartlomiej Zolnierkiewicz
  2008-07-31  1:00         ` gshan
  0 siblings, 1 reply; 14+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2008-07-30 19:43 UTC (permalink / raw)
  To: gshan
  Cc: Jeff Garzik, linux-kernel, Linux IDE mailing list, Maciej W. Rozycki

On Wednesday 30 July 2008, gshan wrote:

[...]
 
> The IRQ routing issue has been fixed. IRQ routing system
> on my board is somewhat complicated: i8259->mlba (FPGA based
> PCI device)->MV64460(system controller). I forgot to set MV64460
> IRQ pin to level sensitive and edge sensitive was used, which caused
> interrupt lost finally. After changing this to level sensitive, everything
> seems fine. Now, the problem isn't interrupt lost, but suspending DMA
> operations.

Thanks for explaining this.

> > Also is the problem still present in 2.6.26 or 2.6.27-rc1?
> >
> > [ I don't see an information about the kernel version that is
> >   being used in either of threads? ]
> >   
> Sorry, I'm using 2.6.21 kernel.

Would be great to try with some recent one, the issue may have been
fixed already (which is quite likely given a number of bugfixes that
went in recent kernels).

Thanks,
Bart

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PIIX4: DMA timeout issue
  2008-07-30 19:43       ` Bartlomiej Zolnierkiewicz
@ 2008-07-31  1:00         ` gshan
  0 siblings, 0 replies; 14+ messages in thread
From: gshan @ 2008-07-31  1:00 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Jeff Garzik, linux-kernel, Linux IDE mailing list, Maciej W. Rozycki

Bartlomiej Zolnierkiewicz wrote:
> On Wednesday 30 July 2008, gshan wrote:
>
> [...]
>  
>   
>> The IRQ routing issue has been fixed. IRQ routing system
>> on my board is somewhat complicated: i8259->mlba (FPGA based
>> PCI device)->MV64460(system controller). I forgot to set MV64460
>> IRQ pin to level sensitive and edge sensitive was used, which caused
>> interrupt lost finally. After changing this to level sensitive, everything
>> seems fine. Now, the problem isn't interrupt lost, but suspending DMA
>> operations.
>>     
>
> Thanks for explaining this.
>
>   
>>> Also is the problem still present in 2.6.26 or 2.6.27-rc1?
>>>
>>> [ I don't see an information about the kernel version that is
>>>   being used in either of threads? ]
>>>   
>>>       
>> Sorry, I'm using 2.6.21 kernel.
>>     
>
> Would be great to try with some recent one, the issue may have been
> fixed already (which is quite likely given a number of bugfixes that
> went in recent kernels).
>   
I tried to searched in bugzilla.kernel.org and didn't find
there has fix on this issue. Yesterday, Jeff suggested me
to turn to ata driver for PIIX4. I tried this yesterday but still
have some problems to make PIIX4 workable under ata
driver. I will continue working on this today.

Gavin

> Thanks,
> Bart
>   


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PIIX4: DMA timeout issue
  2008-07-30 10:49     ` Alan Cox
@ 2008-07-31  4:30       ` gshan
  2008-08-01  7:43         ` Tejun Heo
  0 siblings, 1 reply; 14+ messages in thread
From: gshan @ 2008-07-31  4:30 UTC (permalink / raw)
  To: Alan Cox; +Cc: Jeff Garzik, linux-kernel, Linux IDE mailing list


>> Another question is my PIIX4 chip has device ID + vendor ID of 
>> 0x71108086. But it finally
>> was mapped to following ID of piix_pci_tbl[] array. Device of 0x7110 and 
>> 0x7010 should
>> be quit different, right?
>>     
>
> Yes - I find tht one very odd - you might need to debug a bit to see what
> is occuring if you think this is the case.
>
> Alan
>   
Alan/Jeff, I tried to install ATA driver on PIIX4. After fixing IRQ number
to 143 for primary and secondary channel. I got following output and
it seems the driver has been loaded sucessfully. However, I couldn't
see any information from /proc/devices. Do you have any ideas?

piix_init: enter
piix_init_one: pci device: �[eh
piix_init_one: vendor=0x00008086, device=0x00007111
piix_init_one: piix_host_priv allocated
piix_init_one: ent->driver_data=0
ata_pci_init_one: n_ports=2
ata_pci_init_one: pci device enabled
ata_pci_init_one: pci device class=0x00010180
ata_pci_init_one: legacy_mode=8
ata_pci_init_one: DMA mode has been set
ata_pci_init_legacy_port: port_mask=0x0000000b
ata_pci_init_legacy_port: ATA_PRIMARY_CMD=0x000001f0, 
ATA_PRIMARY_CTL=0x000003f6
ata_pci_init_legacy_port: ATA_SECONDARY_CMD=0x00000170, 
ATA_SECONDARY_CTL=0x00000376
ata_pci_init_legacy_port: BMDMA=0xfbf00000
, phys=0x00001000, phys_end=0x0000100f
ata_pci_init_legacy_port: probe_ent initialized
ata_pci_init_legacy_port: IRQ line of primary channel is 143
ata_pci_init_legacy_port: IRQ line of secondary channel is 143
ata_pci_init_one: Master set
ata1: PATA max UDMA/33 cmd 0xfbeff1f0 ctl 0xfbeff3f6 bmdma 0xfbf00000 
irq 143
ata2: PATA max UDMA/33 cmd 0xfbeff170 ctl 0xfbeff376 bmdma 0xfbf00008 
irq 143
scsi0 : ata_piix
scsi1 : ata_piix
ata2.00: ATA-6: HTE721010G9AT00, MCZOA51A, max UDMA/100
ata2.00: 195371568 sectors, multi 0: LBA48
ata2.00: configured for UDMA/33
1:0:0:0: Direct-Access ATA HTE721010G9AT00 MCZO PQ: 0 ANSI: 5

/ $ cat /proc/devices
Character devices:
1 mem
2 pty
3 ttyp
4 /dev/vc/0
4 tty
5 /dev/tty
5 /dev/console
5 /dev/ptmx
7 vcs
10 misc
13 input
90 mtd
128 ptm
136 pts
204 ttyMM
205 ttyVC
206 ttySC
253 telica_sysinfo
254 mlba

Block devices:
1 ramdisk
7 loop
31 mtdblock

Gavin



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PIIX4: DMA timeout issue
  2008-07-31  4:30       ` gshan
@ 2008-08-01  7:43         ` Tejun Heo
  2008-08-01  8:38           ` gshan
  2008-08-01  8:54           ` gshan
  0 siblings, 2 replies; 14+ messages in thread
From: Tejun Heo @ 2008-08-01  7:43 UTC (permalink / raw)
  To: gshan; +Cc: Alan Cox, Jeff Garzik, linux-kernel, Linux IDE mailing list

gshan wrote:
> 
>>> Another question is my PIIX4 chip has device ID + vendor ID of
>>> 0x71108086. But it finally
>>> was mapped to following ID of piix_pci_tbl[] array. Device of 0x7110
>>> and 0x7010 should
>>> be quit different, right?
>>>     
>>
>> Yes - I find tht one very odd - you might need to debug a bit to see what
>> is occuring if you think this is the case.
>>
>> Alan
>>   
> Alan/Jeff, I tried to install ATA driver on PIIX4. After fixing IRQ number
> to 143 for primary and secondary channel. I got following output and
> it seems the driver has been loaded sucessfully. However, I couldn't
> see any information from /proc/devices. Do you have any ideas?

You need the SCSI disk driver - sd_mod.ko.

-- 
tejun

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PIIX4: DMA timeout issue
  2008-08-01  7:43         ` Tejun Heo
@ 2008-08-01  8:38           ` gshan
  2008-08-01  8:54           ` gshan
  1 sibling, 0 replies; 14+ messages in thread
From: gshan @ 2008-08-01  8:38 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Alan Cox, Jeff Garzik, linux-kernel, Linux IDE mailing list


>> Alan/Jeff, I tried to install ATA driver on PIIX4. After fixing IRQ number
>> to 143 for primary and secondary channel. I got following output and
>> it seems the driver has been loaded sucessfully. However, I couldn't
>> see any information from /proc/devices. Do you have any ideas?
>>     
>
> You need the SCSI disk driver - sd_mod.ko.
>
>   
Thanks a lot. It seems it starts to work now.

piix_init: enter
piix_init_one: pci device: �\
                             @
piix_init_one: vendor=0x00008086, device=0x00007111
piix_init_one: piix_host_priv allocated
piix_init_one: ent->driver_data=0
ata_pci_init_one: n_ports=2
ata_pci_init_one: pci device enabled
ata_pci_init_one: pci device class=0x00010180
ata_pci_init_one: legacy_mode=8
ata_pci_init_one: DMA mode has been set
ata_pci_init_legacy_port: port_mask=0x0000000b
ata_pci_init_legacy_port: ATA_PRIMARY_CMD=0x000001f0, 
ATA_PRIMARY_CTL=0x000003f6
ata_pci_init_legacy_port: ATA_SECONDARY_CMD=0x00000170, 
ATA_SECONDARY_CTL=0x00000376
ata_pci_init_legacy_port: BMDMA=0xfbf00000
, phys=0x00001000, phys_end=0x0000100f
ata_pci_init_legacy_port: probe_ent initialized
ata_pci_init_legacy_port: IRQ line of primary channel is 143
ata_pci_init_legacy_port: IRQ line of secondary channel is 143
ata_pci_init_one: Master set
ata1: PATA max UDMA/33 cmd 0xfbeff1f0 ctl 0xfbeff3f6 bmdma 0xfbf00000 
irq 143
ata2: PATA max UDMA/33 cmd 0xfbeff170 ctl 0xfbeff376 bmdma 0xfbf00008 
irq 143
scsi0 : ata_piix
scsi1 : ata_piix
ata2.00: ATA-6: HTE721010G9AT00, MCZOA51A, max UDMA/100
ata2.00: 195371568 sectors, multi 0: LBA48
ata2.00: configured for UDMA/33
 1:0:0:0: Direct-Access     ATA      HTE721010G9AT00  MCZO PQ: 0 ANSI: 5
SCSI device sda: 195371568 512-byte hdwr sectors (100030 MB)
sda: Write Protect is off
SCSI device sda: write cache: enabled, read cache: enabled, doesn't 
support DPO or FUA
SCSI device sda: 195371568 512-byte hdwr sectors (100030 MB)
sda: Write Protect is off
SCSI device sda: write cache: enabled, read cache: enabled, doesn't 
support DPO or FUA
 sda: sda1 sda4 < sda5 sda6 sda7 sda8 >
sd 1:0:0:0: Attached scsi disk sda
sd 1:0:0:0: Attached scsi generic sg0 type 0

Gavin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PIIX4: DMA timeout issue
  2008-08-01  7:43         ` Tejun Heo
  2008-08-01  8:38           ` gshan
@ 2008-08-01  8:54           ` gshan
  2008-08-03 14:10             ` Gavin Shan
  1 sibling, 1 reply; 14+ messages in thread
From: gshan @ 2008-08-01  8:54 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Alan Cox, Jeff Garzik, linux-kernel, Linux IDE mailing list


>> Alan/Jeff, I tried to install ATA driver on PIIX4. After fixing IRQ number
>> to 143 for primary and secondary channel. I got following output and
>> it seems the driver has been loaded sucessfully. However, I couldn't
>> see any information from /proc/devices. Do you have any ideas?
>>     
>
> You need the SCSI disk driver - sd_mod.ko.
>
>   
Unfortunately, the timeout issue still happened on ATA driver as IDE.

Command (m for help): p

Disk /dev/sda: 60.0 GB, 60011642880 bytes
255 heads, 63 sectors/track, 7296 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x4892c004

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1          17      136521   83  Linux
/dev/sda4              18        7296    58468567+   5  Extended
/dev/sda5              18        7296    58468536   83  Linux

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
SCSI device sda: 117210240 512-byte hdwr sectors (60012 MB)
sda: Write Protect is off
SCSI device sda: write cache: enabled, read cache: enabled, doesn't 
support DPO or FUA
 sda: sda1 sda4 < sda5 >
Syncing disks.
/ $ mkfs.ext3 -N 655360 /dev/sda5
mke2fs 1.40 (29-Jun-2007)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
657984 inodes, 14617134 blocks
730856 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=0
447 block groups
32768 blocks per group, 32768 fragments per group
1472 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 
2654208,
        4096000, 7962624, 11239424

Writing inode tables: done                           
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 25 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
/ $ mkdir /aaa
/ $ ifconfig rep0 192.168.253.27
mv643xx_eth_if_start: register PHY interrupt rep0 port=1 sw=1 swport=2
/ $ mount -o nolock 192.168.253.26:/home1 /aaa
/ $ cd /disk_root/
/disk_root $ ls
/disk_root $ cd /
/ $ mount /dev/sda5 /disk_root/
kjournald starting.  Commit interval 5 seconds
EXT3 FS on sda5, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
/ $ cd /disk_root/
/disk_root $ tar -zxf /aaa/target_7xx_cge5.2008-07-10.tgz

ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata2.00: cmd 35/00:00:18:ef:28/00:04:01:00:00/e0 tag 0 cdb 0x0 data 
524288 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata2: soft resetting port
ata2.00: configured for UDMA/33
ata2: EH complete
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata2.00: cmd 35/00:00:18:ef:28/00:04:01:00:00/e0 tag 0 cdb 0x0 data 
524288 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata2: soft resetting port
ata2.00: configured for UDMA/33
ata2: EH complete
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata2.00: cmd 35/00:00:18:ef:28/00:04:01:00:00/e0 tag 0 cdb 0x0 data 
524288 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata2: soft resetting port
ata2.00: configured for UDMA/33
ata2: EH complete
ata2.00: limiting speed to UDMA/25:PIO4
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata2.00: cmd 35/00:00:18:ef:28/00:04:01:00:00/e0 tag 0 cdb 0x0 data 
524288 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata2: soft resetting port
ata2.00: configured for UDMA/25
ata2: EH complete
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata2.00: cmd 35/00:00:18:ef:28/00:04:01:00:00/e0 tag 0 cdb 0x0 data 
524288 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata2: soft resetting port
ata2.00: configured for UDMA/25
ata2: EH complete
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata2.00: cmd 35/00:00:18:ef:28/00:04:01:00:00/e0 tag 0 cdb 0x0 data 
524288 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata2: soft resetting port
ata2.00: configured for UDMA/25
sd 1:0:0:0: SCSI error: return code = 0x08000002
sda: Current [descriptor]: sense key=0xb
    ASC=0x0 ASCQ=0x0
Descriptor sense data with sense descriptors (in hex):
        72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
        00 00 00 00
end_request: I/O error, dev sda, sector 19459864
Buffer I/O error on device sda5, logical block 2398337
lost page write due to I/O error on sda5
Buffer I/O error on device sda5, logical block 2398338
lost page write due to I/O error on sda5
Buffer I/O error on device sda5, logical block 2398339
lost page write due to I/O error on sda5
Buffer I/O error on device sda5, logical block 2398340
lost page write due to I/O error on sda5
Buffer I/O error on device sda5, logical block 2398341
lost page write due to I/O error on sda5
Buffer I/O error on device sda5, logical block 2398342
lost page write due to I/O error on sda5
Buffer I/O error on device sda5, logical block 2398343
lost page write due to I/O error on sda5
Buffer I/O error on device sda5, logical block 2398344
lost page write due to I/O error on sda5
Buffer I/O error on device sda5, logical block 2398345
lost page write due to I/O error on sda5
Buffer I/O error on device sda5, logical block 2398346
lost page write due to I/O error on sda5
ata2: EH complete
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata2.00: cmd ca/00:f8:18:f3:28/00:00:00:00:00/e1 tag 0 cdb 0x0 data 
126976 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata2: soft resetting port
ata2.00: configured for UDMA/25
ata2: EH complete
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata2.00: cmd ca/00:f8:18:f3:28/00:00:00:00:00/e1 tag 0 cdb 0x0 data 
126976 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata2: soft resetting port
ata2.00: configured for UDMA/25
ata2: EH complete

Gavin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PIIX4: DMA timeout issue
  2008-08-01  8:54           ` gshan
@ 2008-08-03 14:10             ` Gavin Shan
  2008-08-05  3:41               ` gshan
  0 siblings, 1 reply; 14+ messages in thread
From: Gavin Shan @ 2008-08-03 14:10 UTC (permalink / raw)
  To: gshan, Tejun Heo
  Cc: Alan Cox, Jeff Garzik, linux-kernel, Linux IDE mailing list

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 6844 bytes --]

> Unfortunately, the timeout issue still happened on ATA driver as IDE.
> 
> Command (m for help): p
> 
> Disk /dev/sda: 60.0 GB, 60011642880 bytes
> 255 heads, 63 sectors/track, 7296 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
> Disk identifier: 0x4892c004
> 
>   Device Boot      Start         End      Blocks   Id  System
> /dev/sda1               1          17      136521   83  Linux
> /dev/sda4              18        7296    58468567+   5  Extended
> /dev/sda5              18        7296    58468536   83  Linux
> 
> Command (m for help): w
> The partition table has been altered!
> 
> Calling ioctl() to re-read partition table.
> SCSI device sda: 117210240 512-byte hdwr sectors (60012 MB)
> sda: Write Protect is off
> SCSI device sda: write cache: enabled, read cache: enabled, doesn't 
> support DPO or FUA
> sda: sda1 sda4 < sda5 >
> Syncing disks.
> / $ mkfs.ext3 -N 655360 /dev/sda5
> mke2fs 1.40 (29-Jun-2007)
> Filesystem label=
> OS type: Linux
> Block size=4096 (log=2)
> Fragment size=4096 (log=2)
> 657984 inodes, 14617134 blocks
> 730856 blocks (5.00%) reserved for the super user
> First data block=0
> Maximum filesystem blocks=0
> 447 block groups
> 32768 blocks per group, 32768 fragments per group
> 1472 inodes per group
> Superblock backups stored on blocks:
>        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 
> 2654208,
>        4096000, 7962624, 11239424
> 
> Writing inode tables: done                           
> Creating journal (32768 blocks): done
> Writing superblocks and filesystem accounting information: done
> 
> This filesystem will be automatically checked every 25 mounts or
> 180 days, whichever comes first.  Use tune2fs -c or -i to override.
> / $ mkdir /aaa
> / $ ifconfig rep0 192.168.253.27
> mv643xx_eth_if_start: register PHY interrupt rep0 port=1 sw=1 swport=2
> / $ mount -o nolock 192.168.253.26:/home1 /aaa
> / $ cd /disk_root/
> /disk_root $ ls
> /disk_root $ cd /
> / $ mount /dev/sda5 /disk_root/
> kjournald starting.  Commit interval 5 seconds
> EXT3 FS on sda5, internal journal
> EXT3-fs: mounted filesystem with ordered data mode.
> / $ cd /disk_root/
> /disk_root $ tar -zxf /aaa/target_7xx_cge5.2008-07-10.tgz
> 
> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata2.00: cmd 35/00:00:18:ef:28/00:04:01:00:00/e0 tag 0 cdb 0x0 data 
> 524288 out
>         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> ata2: soft resetting port
> ata2.00: configured for UDMA/33
> ata2: EH complete
> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata2.00: cmd 35/00:00:18:ef:28/00:04:01:00:00/e0 tag 0 cdb 0x0 data 
> 524288 out
>         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> ata2: soft resetting port
> ata2.00: configured for UDMA/33
> ata2: EH complete
> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata2.00: cmd 35/00:00:18:ef:28/00:04:01:00:00/e0 tag 0 cdb 0x0 data 
> 524288 out
>         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> ata2: soft resetting port
> ata2.00: configured for UDMA/33
> ata2: EH complete
> ata2.00: limiting speed to UDMA/25:PIO4
> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata2.00: cmd 35/00:00:18:ef:28/00:04:01:00:00/e0 tag 0 cdb 0x0 data 
> 524288 out
>         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> ata2: soft resetting port
> ata2.00: configured for UDMA/25
> ata2: EH complete
> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata2.00: cmd 35/00:00:18:ef:28/00:04:01:00:00/e0 tag 0 cdb 0x0 data 
> 524288 out
>         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> ata2: soft resetting port
> ata2.00: configured for UDMA/25
> ata2: EH complete
> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata2.00: cmd 35/00:00:18:ef:28/00:04:01:00:00/e0 tag 0 cdb 0x0 data 
> 524288 out
>         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> ata2: soft resetting port
> ata2.00: configured for UDMA/25
> sd 1:0:0:0: SCSI error: return code = 0x08000002
> sda: Current [descriptor]: sense key=0xb
>    ASC=0x0 ASCQ=0x0
> Descriptor sense data with sense descriptors (in hex):
>        72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
>        00 00 00 00
> end_request: I/O error, dev sda, sector 19459864
> Buffer I/O error on device sda5, logical block 2398337
> lost page write due to I/O error on sda5
> Buffer I/O error on device sda5, logical block 2398338
> lost page write due to I/O error on sda5
> Buffer I/O error on device sda5, logical block 2398339
> lost page write due to I/O error on sda5
> Buffer I/O error on device sda5, logical block 2398340
> lost page write due to I/O error on sda5
> Buffer I/O error on device sda5, logical block 2398341
> lost page write due to I/O error on sda5
> Buffer I/O error on device sda5, logical block 2398342
> lost page write due to I/O error on sda5
> Buffer I/O error on device sda5, logical block 2398343
> lost page write due to I/O error on sda5
> Buffer I/O error on device sda5, logical block 2398344
> lost page write due to I/O error on sda5
> Buffer I/O error on device sda5, logical block 2398345
> lost page write due to I/O error on sda5
> Buffer I/O error on device sda5, logical block 2398346
> lost page write due to I/O error on sda5
> ata2: EH complete
> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata2.00: cmd ca/00:f8:18:f3:28/00:00:00:00:00/e1 tag 0 cdb 0x0 data 
> 126976 out
>         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> ata2: soft resetting port
> ata2.00: configured for UDMA/25
> ata2: EH complete
> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata2.00: cmd ca/00:f8:18:f3:28/00:00:00:00:00/e1 tag 0 cdb 0x0 data 
> 126976 out
>         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> ata2: soft resetting port
> ata2.00: configured for UDMA/25
> ata2: EH complete

Alan and Jeff, I didn't receive your reply till now. I'm not sure you are 
still intrested on this issue. If you would like to get more information,
pls let me know. 

I tested it for almost one day and found same issue on 100GB and 60GB
disk. So I have to think about interrupt lost inside i8259. Then I changed
code of i8259 for several time and tested it and unfortunately same issue
was there. I don't have good ideas now. If you would like, I can show you
the code about i8259 interrupt handling. 

The suspecting point of mine on i8259 is: when multiple interrupts are asserted
on slave controller. Non-specific EOI will clear all pending interrupts or just
clear that one in ISR?

Gavin
ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PIIX4: DMA timeout issue
  2008-08-03 14:10             ` Gavin Shan
@ 2008-08-05  3:41               ` gshan
  0 siblings, 0 replies; 14+ messages in thread
From: gshan @ 2008-08-05  3:41 UTC (permalink / raw)
  To: Gavin Shan
  Cc: Tejun Heo, Alan Cox, Jeff Garzik, linux-kernel, Linux IDE mailing list

Gavin Shan wrote:
>> Unfortunately, the timeout issue still happened on ATA driver as IDE.
>>
>> Command (m for help): p
>>
>> Disk /dev/sda: 60.0 GB, 60011642880 bytes
>> 255 heads, 63 sectors/track, 7296 cylinders
>> Units = cylinders of 16065 * 512 = 8225280 bytes
>> Disk identifier: 0x4892c004
>>
>>   Device Boot      Start         End      Blocks   Id  System
>> /dev/sda1               1          17      136521   83  Linux
>> /dev/sda4              18        7296    58468567+   5  Extended
>> /dev/sda5              18        7296    58468536   83  Linux
>>
>> Command (m for help): w
>> The partition table has been altered!
>>
>> Calling ioctl() to re-read partition table.
>> SCSI device sda: 117210240 512-byte hdwr sectors (60012 MB)
>> sda: Write Protect is off
>> SCSI device sda: write cache: enabled, read cache: enabled, doesn't 
>> support DPO or FUA
>> sda: sda1 sda4 < sda5 >
>> Syncing disks.
>> / $ mkfs.ext3 -N 655360 /dev/sda5
>> mke2fs 1.40 (29-Jun-2007)
>> Filesystem label=
>> OS type: Linux
>> Block size=4096 (log=2)
>> Fragment size=4096 (log=2)
>> 657984 inodes, 14617134 blocks
>> 730856 blocks (5.00%) reserved for the super user
>> First data block=0
>> Maximum filesystem blocks=0
>> 447 block groups
>> 32768 blocks per group, 32768 fragments per group
>> 1472 inodes per group
>> Superblock backups stored on blocks:
>>        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 
>> 2654208,
>>        4096000, 7962624, 11239424
>>
>> Writing inode tables: done                           
>> Creating journal (32768 blocks): done
>> Writing superblocks and filesystem accounting information: done
>>
>> This filesystem will be automatically checked every 25 mounts or
>> 180 days, whichever comes first.  Use tune2fs -c or -i to override.
>> / $ mkdir /aaa
>> / $ ifconfig rep0 192.168.253.27
>> mv643xx_eth_if_start: register PHY interrupt rep0 port=1 sw=1 swport=2
>> / $ mount -o nolock 192.168.253.26:/home1 /aaa
>> / $ cd /disk_root/
>> /disk_root $ ls
>> /disk_root $ cd /
>> / $ mount /dev/sda5 /disk_root/
>> kjournald starting.  Commit interval 5 seconds
>> EXT3 FS on sda5, internal journal
>> EXT3-fs: mounted filesystem with ordered data mode.
>> / $ cd /disk_root/
>> /disk_root $ tar -zxf /aaa/target_7xx_cge5.2008-07-10.tgz
>>
>> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> ata2.00: cmd 35/00:00:18:ef:28/00:04:01:00:00/e0 tag 0 cdb 0x0 data 
>> 524288 out
>>         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
>> ata2: soft resetting port
>> ata2.00: configured for UDMA/33
>> ata2: EH complete
>> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> ata2.00: cmd 35/00:00:18:ef:28/00:04:01:00:00/e0 tag 0 cdb 0x0 data 
>> 524288 out
>>         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
>> ata2: soft resetting port
>> ata2.00: configured for UDMA/33
>> ata2: EH complete
>> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> ata2.00: cmd 35/00:00:18:ef:28/00:04:01:00:00/e0 tag 0 cdb 0x0 data 
>> 524288 out
>>         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
>> ata2: soft resetting port
>> ata2.00: configured for UDMA/33
>> ata2: EH complete
>> ata2.00: limiting speed to UDMA/25:PIO4
>> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> ata2.00: cmd 35/00:00:18:ef:28/00:04:01:00:00/e0 tag 0 cdb 0x0 data 
>> 524288 out
>>         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
>> ata2: soft resetting port
>> ata2.00: configured for UDMA/25
>> ata2: EH complete
>> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> ata2.00: cmd 35/00:00:18:ef:28/00:04:01:00:00/e0 tag 0 cdb 0x0 data 
>> 524288 out
>>         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
>> ata2: soft resetting port
>> ata2.00: configured for UDMA/25
>> ata2: EH complete
>> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> ata2.00: cmd 35/00:00:18:ef:28/00:04:01:00:00/e0 tag 0 cdb 0x0 data 
>> 524288 out
>>         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
>> ata2: soft resetting port
>> ata2.00: configured for UDMA/25
>> sd 1:0:0:0: SCSI error: return code = 0x08000002
>> sda: Current [descriptor]: sense key=0xb
>>    ASC=0x0 ASCQ=0x0
>> Descriptor sense data with sense descriptors (in hex):
>>        72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
>>        00 00 00 00
>> end_request: I/O error, dev sda, sector 19459864
>> Buffer I/O error on device sda5, logical block 2398337
>> lost page write due to I/O error on sda5
>> Buffer I/O error on device sda5, logical block 2398338
>> lost page write due to I/O error on sda5
>> Buffer I/O error on device sda5, logical block 2398339
>> lost page write due to I/O error on sda5
>> Buffer I/O error on device sda5, logical block 2398340
>> lost page write due to I/O error on sda5
>> Buffer I/O error on device sda5, logical block 2398341
>> lost page write due to I/O error on sda5
>> Buffer I/O error on device sda5, logical block 2398342
>> lost page write due to I/O error on sda5
>> Buffer I/O error on device sda5, logical block 2398343
>> lost page write due to I/O error on sda5
>> Buffer I/O error on device sda5, logical block 2398344
>> lost page write due to I/O error on sda5
>> Buffer I/O error on device sda5, logical block 2398345
>> lost page write due to I/O error on sda5
>> Buffer I/O error on device sda5, logical block 2398346
>> lost page write due to I/O error on sda5
>> ata2: EH complete
>> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> ata2.00: cmd ca/00:f8:18:f3:28/00:00:00:00:00/e1 tag 0 cdb 0x0 data 
>> 126976 out
>>         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
>> ata2: soft resetting port
>> ata2.00: configured for UDMA/25
>> ata2: EH complete
>> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> ata2.00: cmd ca/00:f8:18:f3:28/00:00:00:00:00/e1 tag 0 cdb 0x0 data 
>> 126976 out
>>         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
>> ata2: soft resetting port
>> ata2.00: configured for UDMA/25
>> ata2: EH complete
>>     
>
> Alan and Jeff, I didn't receive your reply till now. I'm not sure you are 
> still intrested on this issue. If you would like to get more information,
> pls let me know. 
>
> I tested it for almost one day and found same issue on 100GB and 60GB
> disk. So I have to think about interrupt lost inside i8259. Then I changed
> code of i8259 for several time and tested it and unfortunately same issue
> was there. I don't have good ideas now. If you would like, I can show you
> the code about i8259 interrupt handling. 
>
> The suspecting point of mine on i8259 is: when multiple interrupts are asserted
> on slave controller. Non-specific EOI will clear all pending interrupts or just
> clear that one in ISR?
>
> Gavin
>   
I also turned on the debugging options and got following output when 
timeout happened.

ata_scsi_translate: ENTER
scsi_10_lba_len: ten-byte command
ata_sg_setup: ENTER, ata2
ata_sg_setup: 101 sg elements mapped
ata_fill_sg: PRD[0] = (0x1BF7E000, 0x1000)
ata_fill_sg: PRD[1] = (0x1BDFB000, 0x1000)
ata_fill_sg: PRD[2] = (0x1BB05000, 0x1000)
ata_fill_sg: PRD[3] = (0x1B80C000, 0x1000)
ata_fill_sg: PRD[4] = (0x1C251000, 0x1000)
ata_fill_sg: PRD[5] = (0x1BB04000, 0x1000)
ata_fill_sg: PRD[6] = (0x1C289000, 0x1000)
ata_fill_sg: PRD[7] = (0x1C288000, 0x1000)
ata_fill_sg: PRD[8] = (0x1BE95000, 0x1000)
ata_fill_sg: PRD[9] = (0x1BE94000, 0x1000)
ata_fill_sg: PRD[10] = (0x1BE97000, 0x1000)
ata_fill_sg: PRD[11] = (0x1BE96000, 0x1000)
ata_fill_sg: PRD[12] = (0x1BF31000, 0x1000)
ata_fill_sg: PRD[13] = (0x1BF30000, 0x1000)
ata_fill_sg: PRD[14] = (0x1BF33000, 0x1000)
ata_fill_sg: PRD[15] = (0x1B80D000, 0x1000)
ata_fill_sg: PRD[16] = (0x1B803000, 0x1000)
ata_fill_sg: PRD[17] = (0x1B914000, 0x1000)
ata_fill_sg: PRD[18] = (0x1BF7D000, 0x1000)
ata_fill_sg: PRD[19] = (0x1B915000, 0x1000)
ata_fill_sg: PRD[20] = (0x1B80E000, 0x1000)
ata_fill_sg: PRD[21] = (0x1BF7C000, 0x1000)
ata_fill_sg: PRD[22] = (0x1B844000, 0x1000)
ata_fill_sg: PRD[23] = (0x1B80F000, 0x1000)
ata_fill_sg: PRD[24] = (0x1B846000, 0x1000)
ata_fill_sg: PRD[25] = (0x1B845000, 0x1000)
ata_fill_sg: PRD[26] = (0x1B874000, 0x1000)
ata_fill_sg: PRD[27] = (0x1B847000, 0x1000)
ata_fill_sg: PRD[28] = (0x1B876000, 0x1000)
ata_fill_sg: PRD[29] = (0x1B875000, 0x1000)
ata_fill_sg: PRD[30] = (0x1B800000, 0x1000)
ata_fill_sg: PRD[31] = (0x1B877000, 0x1000)
ata_fill_sg: PRD[32] = (0x1BAD2000, 0x2000)
ata_fill_sg: PRD[33] = (0x1B802000, 0x1000)
ata_fill_sg: PRD[34] = (0x1BAC4000, 0x1000)
ata_fill_sg: PRD[35] = (0x1B916000, 0x1000)
ata_fill_sg: PRD[36] = (0x1B801000, 0x1000)
ata_fill_sg: PRD[37] = (0x1B8F0000, 0x1000)
ata_fill_sg: PRD[38] = (0x1B917000, 0x1000)
ata_fill_sg: PRD[39] = (0x1B8F2000, 0x1000)
ata_fill_sg: PRD[40] = (0x1B8F1000, 0x1000)
ata_fill_sg: PRD[41] = (0x1B92C000, 0x1000)
ata_fill_sg: PRD[42] = (0x1B8F3000, 0x1000)
ata_fill_sg: PRD[43] = (0x1B92E000, 0x1000)
ata_fill_sg: PRD[44] = (0x1B92D000, 0x1000)
ata_fill_sg: PRD[45] = (0x1BA70000, 0x1000)
ata_fill_sg: PRD[46] = (0x1B92F000, 0x1000)
ata_fill_sg: PRD[47] = (0x1BAD1000, 0x1000)
ata_fill_sg: PRD[48] = (0x1BAD0000, 0x1000)
ata_fill_sg: PRD[49] = (0x1BAC5000, 0x1000)
ata_fill_sg: PRD[50] = (0x1C714000, 0x1000)
ata_fill_sg: PRD[51] = (0x1BAC7000, 0x1000)
ata_fill_sg: PRD[52] = (0x1BAC6000, 0x1000)
ata_fill_sg: PRD[53] = (0x1BAF1000, 0x1000)
ata_fill_sg: PRD[54] = (0x1BAF0000, 0x1000)
ata_fill_sg: PRD[55] = (0x1BAF3000, 0x1000)
ata_fill_sg: PRD[56] = (0x1BAF2000, 0x1000)
ata_fill_sg: PRD[57] = (0x1C765000, 0x1000)
ata_fill_sg: PRD[58] = (0x1C764000, 0x1000)
ata_fill_sg: PRD[59] = (0x1C767000, 0x1000)
ata_fill_sg: PRD[60] = (0x1C766000, 0x1000)
ata_fill_sg: PRD[61] = (0x1C719000, 0x1000)
ata_fill_sg: PRD[62] = (0x1C718000, 0x1000)
ata_fill_sg: PRD[63] = (0x1C6D2000, 0x1000)
ata_fill_sg: PRD[64] = (0x1BDF5000, 0x1000)
ata_fill_sg: PRD[65] = (0x1C6E9000, 0x2000)
ata_fill_sg: PRD[66] = (0x1BE70000, 0x1000)
ata_fill_sg: PRD[67] = (0x1BDF7000, 0x1000)
ata_fill_sg: PRD[68] = (0x1BE72000, 0x1000)
ata_fill_sg: PRD[69] = (0x1BE71000, 0x1000)
ata_fill_sg: PRD[70] = (0x1BE74000, 0x1000)
ata_fill_sg: PRD[71] = (0x1BE73000, 0x1000)
ata_fill_sg: PRD[72] = (0x1AF3C000, 0x1000)
ata_fill_sg: PRD[73] = (0x1AF43000, 0x1000)
ata_fill_sg: PRD[74] = (0x1B6E4000, 0x1000)
ata_fill_sg: PRD[75] = (0x1B6E3000, 0x1000)
ata_fill_sg: PRD[76] = (0x1B6E6000, 0x1000)
ata_fill_sg: PRD[77] = (0x1B7A2000, 0x1000)
ata_fill_sg: PRD[78] = (0x1B7A4000, 0x1000)
ata_fill_sg: PRD[79] = (0x1B678000, 0x1000)
ata_fill_sg: PRD[80] = (0x1B4C6000, 0x2000)
ata_fill_sg: PRD[81] = (0x1AEE4000, 0x1000)
ata_fill_sg: PRD[82] = (0x1B4F5000, 0x1000)
ata_fill_sg: PRD[83] = (0x1AF03000, 0x1000)
ata_fill_sg: PRD[84] = (0x1AEFC000, 0x1000)
ata_fill_sg: PRD[85] = (0x1B7A6000, 0x1000)
ata_fill_sg: PRD[86] = (0x1B0D5000, 0x1000)
ata_fill_sg: PRD[87] = (0x1B0D1000, 0x1000)
ata_fill_sg: PRD[88] = (0x1B118000, 0x1000)
ata_fill_sg: PRD[89] = (0x1B0D6000, 0x1000)
ata_fill_sg: PRD[90] = (0x1B11C000, 0x1000)
ata_fill_sg: PRD[91] = (0x1B11A000, 0x1000)
ata_fill_sg: PRD[92] = (0x1B11E000, 0x1000)
ata_fill_sg: PRD[93] = (0x1B0D2000, 0x1000)
ata_fill_sg: PRD[94] = (0x1AF21000, 0x1000)
ata_fill_sg: PRD[95] = (0x1AF4F000, 0x1000)
ata_fill_sg: PRD[96] = (0x1AF4D000, 0x1000)
ata_fill_sg: PRD[97] = (0x1AFE8000, 0x1000)
ata_fill_sg: PRD[98] = (0x1AFEA000, 0x1000)
ata_fill_sg: PRD[99] = (0x1AA8C000, 0x1000)
ata_fill_sg: PRD[100] = (0x1A840000, 0x1000)
ata2: ata_dev_select: ENTER, device 0, wait 1
ata_tf_load: hob: feat 0x0 nsect 0x3, lba 0x0 0x0 0x0
ata_tf_load: feat 0x0 nsect 0x40 lba 0xAD 0x75 0xEC
ata_tf_load: device 0xE0
ata_exec_command: ata2: cmd 0x35
ata_scsi_translate: EXIT

ata_scsi_timed_out: ENTER
ata_scsi_timed_out: EXIT, ret=0
ata_scsi_error: ENTER
ata_port_flush_task: ENTER
ata_port_flush_task: flush #1
__ata_port_freeze: ata2 port frozen
ata_eh_autopsy: ENTER
ata_eh_autopsy: EXIT
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata2.00: cmd 35/00:40:ad:75:ec/00:03:00:00:00/e0 tag 0 cdb 0x0 data 
425984 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata_eh_recover: ENTER
__ata_port_freeze: ata2 port frozen
ata2: soft resetting port


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2008-08-05  3:42 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-07-29  8:43 PIIX4: DMA timeout issue gshan
2008-07-29 13:34 ` Jeff Garzik
     [not found]   ` <200807291713.29580.bzolnier@gmail.com>
2008-07-30  0:57     ` gshan
2008-07-30 19:43       ` Bartlomiej Zolnierkiewicz
2008-07-31  1:00         ` gshan
2008-07-30  0:58   ` gshan
2008-07-30  9:04   ` gshan
2008-07-30 10:49     ` Alan Cox
2008-07-31  4:30       ` gshan
2008-08-01  7:43         ` Tejun Heo
2008-08-01  8:38           ` gshan
2008-08-01  8:54           ` gshan
2008-08-03 14:10             ` Gavin Shan
2008-08-05  3:41               ` gshan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).