* Question about PATA Sil680 Cache Line Size and Performance Degradation on ARM XScale
@ 2007-02-21 22:56 Fajun Chen
From: Fajun Chen @ 2007-02-21 22:56 UTC
  To: linux-ide; +Cc: Tejun Heo, alan

Hi Folks,

I've noticed the following code in both pata_sil680.c and the IDE driver siimage.c:
	/* FIXME: double check */
	pci_write_config_byte(pdev, PCI_CACHE_LINE_SIZE, (class_rev) ? 1 : 255);
I was unable to find the recommended setting in the Sil680 documentation.
Could someone explain the rationale behind the code above? Does it need to
be adjusted on different processors for PCI read/write performance?
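
For reference when reading the values 1 and 255 above: in the generic PCI
configuration header, the Cache Line Size register (offset 0x0C) is an
8-bit count of 32-bit dwords, not bytes. The sketch below only shows that
unit conversion. The helper names are hypothetical and come from neither
driver, and whether the Sil680 gives particular values a special meaning
is exactly the open question here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not part of pata_sil680.c or siimage.c. */

/* Convert a CPU cache line size in bytes to the value the generic PCI
 * Cache Line Size register expects (units of 32-bit dwords). */
uint8_t cls_bytes_to_reg(unsigned int line_bytes)
{
	/* Must be a whole number of dwords and fit the 8-bit register. */
	assert(line_bytes % 4 == 0 && line_bytes / 4 <= 0xFF);
	return (uint8_t)(line_bytes / 4);
}

/* Convert a register value back to bytes. */
unsigned int cls_reg_to_bytes(uint8_t reg)
{
	return (unsigned int)reg * 4;
}
```

Assuming a 32-byte cache line on XScale, cls_bytes_to_reg(32) is 0x08.
Read as generic PCI values, the driver's 1 and 255 would mean 4 bytes and
1020 bytes.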

The problem I am investigating is slow I/O with the PATA Sil680 on an ARM
XScale processor (VIVT cache); the same HBA is not slow on i386. Based on
the libata trace below, a Read DMA command takes about 4 ms to finish:
[4294934.196000] ata_scsi_dump_cdb: CDB (1:0,0,0) 28 00 00 0e fa 00 00 00 80
[4294934.196000] ata_scsi_translate: ENTER
[4294934.196000] scsi_10_lba_len: ten-byte command
[4294934.196000] ata_sg_setup: ENTER, ata1
[4294934.196000] ata_sg_setup: 13 sg elements mapped
[4294934.196000] ata_fill_sg: PRD[0] = (0x2C3F000, 0x1000)
[4294934.196000] ata_fill_sg: PRD[1] = (0x2D76000, 0x1000)
[4294934.196000] ata_fill_sg: PRD[2] = (0x2C5B000, 0x1000)
[4294934.196000] ata_fill_sg: PRD[3] = (0x2C98000, 0x1000)
[4294934.196000] ata_fill_sg: PRD[4] = (0x2D5E000, 0x1000)
[4294934.196000] ata_fill_sg: PRD[5] = (0x2D71000, 0x1000)
[4294934.196000] ata_fill_sg: PRD[6] = (0x2D7C000, 0x1000)
[4294934.196000] ata_fill_sg: PRD[7] = (0x2D8B000, 0x1000)
[4294934.196000] ata_fill_sg: PRD[8] = (0x2DA1000, 0x1000)
[4294934.196000] ata_fill_sg: PRD[9] = (0x2D0C000, 0x2000)
[4294934.196000] ata_fill_sg: PRD[10] = (0x33FC000, 0x2000)
[4294934.196000] ata_fill_sg: PRD[11] = (0x2D8C000, 0x2000)
[4294934.196000] ata_fill_sg: PRD[12] = (0x2C06000, 0x1000)
[4294934.196000] ata1: ata_dev_select: ENTER, ata1: device 0, wait 1
[4294934.196000] ata_tf_load_pio: feat 0x0 nsect 0x80 lba 0x0 0xFA 0xE
[4294934.196000] ata_tf_load_pio: device 0xE0
[4294934.196000] ata_exec_command_pio: ata1: cmd 0xC8
[4294934.196000] ata_scsi_translate: EXIT
[4294934.200000] ata_host_intr: ata1: protocol 3 task_state 3
[4294934.200000] ata_host_intr: ata1: host_stat 0x4
[4294934.200000] ata_hsm_move: ata1: protocol 3 task_state 3 (dev_stat 0x50)
[4294934.200000] ata_hsm_move: ata1: dev 0 command complete, drv_stat 0x50
[4294934.200000] ata_sg_clean: unmapping 13 sg elements
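
As a cross-check on the trace above, the request size can be recovered two
ways: from the READ(10) CDB and from the sum of the PRD lengths. The sketch
below does both. The function names are hypothetical and the 512-byte
sector size is an assumption. The CDB dump shows only nine bytes, which is
enough because the tenth byte is the control byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for checking the trace by hand. */

/* READ(10): bytes 2..5 hold the big-endian LBA. */
uint32_t read10_lba(const uint8_t *cdb)
{
	assert(cdb[0] == 0x28);	/* READ(10) opcode */
	return ((uint32_t)cdb[2] << 24) | ((uint32_t)cdb[3] << 16) |
	       ((uint32_t)cdb[4] << 8) | (uint32_t)cdb[5];
}

/* READ(10): bytes 7..8 hold the big-endian transfer length in blocks. */
uint16_t read10_nblocks(const uint8_t *cdb)
{
	assert(cdb[0] == 0x28);
	return (uint16_t)(((uint16_t)cdb[7] << 8) | cdb[8]);
}

/* Sum the byte counts of a PRD table as printed by ata_fill_sg. */
uint32_t prd_total_bytes(const uint32_t *len, size_t n)
{
	uint32_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += len[i];
	return total;
}
```

For the ARM trace this gives LBA 0x0EFA00 and 0x80 (128) blocks, matching
the taskfile line (nsect 0x80, lba 0x0 0xFA 0xE). The 13 PRD entries add up
to 0x10000 bytes, i.e. 128 sectors of 512 bytes, so the command moves
64 KiB in 13 scatter/gather segments.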

I did the same test on i386 with the same PATA Sil680 HBA, and the time
from issuing the command to the completion interrupt drops to around 1 ms
(about 0.75 ms in the trace below):
[  113.494605] ata_scsi_dump_cdb: CDB (5:0,0,0) 28 00 00 0a ad 80 00 00 80
[  113.494674] ata_scsi_translate: ENTER
[  113.494731] scsi_10_lba_len: ten-byte command
[  113.494791] ata_sg_setup: ENTER, ata5
[  113.494847] ata_sg_setup: 2 sg elements mapped
[  113.494907] ata_fill_sg: PRD[0] = (0x1158000, 0x4000)
[  113.494968] ata_fill_sg: PRD[1] = (0x1170000, 0xC000)
[  113.495029] ata5: ata_dev_select: ENTER, ata5: device 0, wait 1
[  113.495125] ata_tf_load_pio: feat 0x0 nsect 0x80 lba 0x80 0xAD 0xA
[  113.495190] ata_tf_load_pio: device 0xE0
[  113.495261] ata_exec_command_pio: ata5: cmd 0xC8
[  113.495324] ata_scsi_translate: EXIT
[  113.496005] ata_host_intr: ata5: protocol 3 task_state 3
[  113.496068] ata_host_intr: ata5: host_stat 0x4
[  113.496135] ata_hsm_move: ata5: protocol 3 task_state 3 (dev_stat 0x50)
[  113.496201] ata_hsm_move: ata5: dev 0 command complete, drv_stat 0x50
[  113.496266] ata_sg_clean: unmapping 2 sg elements
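
To put numbers on the comparison, the sketch below takes the timestamps of
ata_exec_command_pio and the following ata_host_intr from each trace and
computes the elapsed time and the implied throughput for the 64 KiB
request. The function names are hypothetical. Note that the ARM timestamps
in this trace only change in 4 ms steps, so the ARM figure may be limited
by timestamp resolution.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for comparing the two traces. */

/* Elapsed microseconds between two printk timestamps, each given as
 * whole seconds plus microseconds. */
uint64_t elapsed_us(uint64_t s0, uint32_t us0, uint64_t s1, uint32_t us1)
{
	uint64_t t0 = s0 * 1000000u + us0;
	uint64_t t1 = s1 * 1000000u + us1;

	assert(t1 >= t0);
	return t1 - t0;
}

/* Throughput in bytes per second for a transfer that took `us`
 * microseconds (integer division, rounds down). */
uint64_t bytes_per_sec(uint64_t bytes, uint64_t us)
{
	assert(us > 0);
	return bytes * 1000000u / us;
}
```

With the values from the traces, elapsed_us(4294934, 196000, 4294934,
200000) is 4000 us on ARM and elapsed_us(113, 495261, 113, 496005) is
744 us on i386. For 65536 bytes that is about 16.4 MB/s on ARM versus
about 88 MB/s on i386.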

I also observed that the same ATA command (Read DMA) took around 1 ms on
the same test hardware with a SATA Sil3124 HBA.

As part of the experiments, I changed the Sil680 cache line size to
0x08, 0x04, 0x02, etc., but I/O performance did not improve. So what
might be the bottleneck causing the slow I/O on ARM XScale?
Thanks in advance for your help!

Thanks,
Fajun
