* ahci problems with sata disk.
@ 2007-01-14 14:32 kenneth johansson
  2007-01-15  9:13 ` Tejun Heo
  0 siblings, 1 reply; 13+ messages in thread
From: kenneth johansson @ 2007-01-14 14:32 UTC (permalink / raw)
  To: linux-ide

[-- Attachment #1: Type: text/plain, Size: 1160 bytes --]

I changed my BIOS setting for SATA from IDE to AHCI.

This resulted in some "interesting" read throughput.

Plots can be found at http://kenjo.org/~ken/sata/
The plots were made on a live disk, so some noise is expected, but in
AHCI mode the throughput gets stuck at 17 MB/s way too often.

The read was done with dd on the device, and the plot data came from
"vmstat 1".

The kernel version is "2.6.20-5-generic" from Ubuntu Feisty (unstable),
running in 32-bit mode on a Core 2 Duo.

---
00:1f.2 IDE interface: Intel Corporation 82801H (ICH8 Family) 4 port SATA IDE Controller (rev 02) (prog-if 8f [Master SecP SecO PriP PriO])
        Subsystem: ASUSTeK Computer Inc. Unknown device 821a
        Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 21
        I/O ports at ec00 [size=8]
        I/O ports at e880 [size=4]
        I/O ports at e800 [size=8]
        I/O ports at e480 [size=4]
        I/O ports at e400 [size=16]
        I/O ports at e080 [size=16]
        Capabilities: <access denied>
---
Handle 0x0002, DMI type 2, 15 bytes
Base Board Information
        Manufacturer: ASUSTeK Computer INC.
        Product Name: P5B-V
        Version: Rev 1.xx
 ---
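
For anyone wanting to reproduce the measurement described above, here is a
minimal sketch. It times one sequential dd read and prints the rate in
MiB/s. The function name read_throughput and the 1024 MiB default are
illustrative assumptions; the plots themselves were made by watching
"vmstat 1" while dd ran, not with this script.

```shell
# read_throughput SRC [MEBIBYTES]
# Reads MEBIBYTES MiB (default 1024) sequentially from SRC and prints MiB/s.
# Assumes GNU date (for %N) and GNU dd (for bs=1M).
# Note: repeated runs over the same region may be served from the page
# cache and report inflated numbers.
read_throughput() (
    src=$1
    count=${2:-1024}
    start=$(date +%s.%N)
    dd if="$src" of=/dev/null bs=1M count="$count" 2>/dev/null || return 1
    end=$(date +%s.%N)
    awk -v s="$start" -v e="$end" -v mb="$count" \
        'BEGIN { printf "%.1f\n", mb / (e - s) }'
)
```

Reading a raw device such as /dev/sda normally needs root, for example
"read_throughput /dev/sda 2048".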


[-- Attachment #2: ide.png --]
[-- Type: image/png, Size: 5498 bytes --]

[-- Attachment #3: ahci.png --]
[-- Type: image/png, Size: 6217 bytes --]


* Re: ahci problems with sata disk.
  2007-01-14 14:32 ahci problems with sata disk kenneth johansson
@ 2007-01-15  9:13 ` Tejun Heo
  2007-01-15 11:05   ` kenneth johansson
  0 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2007-01-15  9:13 UTC (permalink / raw)
  To: kenneth johansson; +Cc: linux-ide

kenneth johansson wrote:
> I changed my bios setting for SATA from IDE to AHCI.
> 
> This resulted in some "interesting" read throughput. 
> 
> plots can be found at http://kenjo.org/~ken/sata/
> The plots was done on a live disk so some noise is expected but in the
> ahci mode the throughput get stuck at 17 MB way to much.

It's probably not an ahci problem but rather an NCQ implementation problem
in the drive firmware.  Please report the output of 'hdparm -I /dev/sdX',
then try adjusting the queue depth and see what happens.

http://linux-ata.org/faq.html

-- 
tejun


* Re: ahci problems with sata disk.
  2007-01-15  9:13 ` Tejun Heo
@ 2007-01-15 11:05   ` kenneth johansson
  2007-01-15 11:36     ` Alan
                       ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: kenneth johansson @ 2007-01-15 11:05 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-ide

On Mon, 2007-01-15 at 18:13 +0900, Tejun Heo wrote:
> kenneth johansson wrote:
> > I changed my bios setting for SATA from IDE to AHCI.
> > 
> > This resulted in some "interesting" read throughput. 
> > 
> > plots can be found at http://kenjo.org/~ken/sata/
> > The plots was done on a live disk so some noise is expected but in the
> > ahci mode the throughput get stuck at 17 MB way to much.
> 
> It's probably not an ahci problem but more of NCQ implementation problem
> in the drive firmware.  Please report the result of 'hdparm -I /dev/sdX'
> and try adjust queue depth and see what happens.
> 
> http://linux-ata.org/faq.html
> 

It was. When I turn off NCQ with
"echo 1 > /sys/block/sda/device/queue_depth", I get the same performance
as when the BIOS is set to IDE.

I thought that NCQ was intended to increase performance?

Also, the disk is a Western Digital Raptor, and it's probably the most
benchmarked drive one could get, so I was not expecting a problem with
the drive.

-------
ATA device, with non-removable media
        Model Number:       WDC WD1500ADFD-00NLR1                   
        Serial Number:      WD-WMAP41269747
        Firmware Revision:  20.07P20
Standards:
        Used: ATA/ATAPI-7 published, ANSI INCITS 397-2005 
        Supported: 7 6 5 4 
Configuration:
        Logical         max     current
        cylinders       16383   16383
        heads           16      16
        sectors/track   63      63
        --
        CHS current addressable sectors:   16514064
        LBA    user addressable sectors:  268435455
        LBA48  user addressable sectors:  293046768
        device size with M = 1024*1024:      143089 MBytes
        device size with M = 1000*1000:      150039 MBytes (150 GB)
Capabilities:
        LBA, IORDY(can be disabled)
        Queue depth: 32
        Standby timer values: spec'd by Standard, with device specific minimum
        R/W multiple sector transfer: Max = 16  Current = 16
        Recommended acoustic management value: 128, current value: 254
        DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
             Cycle time: min=120ns recommended=120ns
        PIO: pio0 pio1 pio2 pio3 pio4 
             Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
        Enabled Supported:
           *    SMART feature set
                Security Mode feature set
           *    Power Management feature set
           *    Write cache
           *    Look-ahead
           *    Host Protected Area feature set
           *    WRITE_BUFFER command
           *    READ_BUFFER command
           *    NOP cmd
           *    DOWNLOAD_MICROCODE
                Power-Up In Standby feature set
           *    SET_FEATURES required to spinup after power up
                SET_MAX security extension
           *    Automatic Acoustic Management feature set
           *    48-bit Address feature set
           *    Device Configuration Overlay feature set
           *    Mandatory FLUSH_CACHE
           *    FLUSH_CACHE_EXT
           *    SMART error logging
           *    SMART self-test
           *    General Purpose Logging feature set
           *    SATA-I signaling speed (1.5Gb/s)
           *    Native Command Queueing (NCQ)
           *    Host-initiated interface power management
           *    Phy event counters
                DMA Setup Auto-Activate optimization
           *    Software settings preservation
           *    SMART Command Transport (SCT) feature set
           *    SCT Long Sector Access (AC1)
           *    SCT LBA Segment Access (AC2)
           *    SCT Error Recovery Control (AC3)
           *    SCT Features Control (AC4)
           *    SCT Data Tables (AC5)
                unknown 206[12]
Security: 
        Master password revision code = 65534
                supported
        not     enabled
        not     locked
                frozen
        not     expired: security count
        not     supported: enhanced erase
Checksum: correct
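
The queue_depth write used above can be wrapped in a small helper. This is
only a sketch: the function name set_queue_depth, its argument order, and
the optional SYSFS_ROOT parameter (handy for dry runs against a scratch
directory) are assumptions made for illustration. The 1..32 range check
follows the "Queue depth: 32" line in the hdparm output; the kernel may
still refuse values it does not support.

```shell
# set_queue_depth DEPTH [DEV] [SYSFS_ROOT]
# Writes DEPTH to <SYSFS_ROOT>/<DEV>/device/queue_depth and prints the
# value read back. A depth of 1 effectively turns NCQ off.
set_queue_depth() (
    depth=$1
    dev=${2:-sda}
    root=${3:-/sys/block}
    knob="$root/$dev/device/queue_depth"
    case $depth in
        ''|*[!0-9]*) echo "depth must be a number" >&2; return 1 ;;
    esac
    if [ "$depth" -lt 1 ] || [ "$depth" -gt 32 ]; then
        echo "depth must be between 1 and 32" >&2
        return 1
    fi
    if [ ! -w "$knob" ]; then
        echo "cannot write $knob (are you root?)" >&2
        return 1
    fi
    echo "$depth" > "$knob" || return 1
    cat "$knob"
)
```

For example, "set_queue_depth 1 sda" as root reproduces the echo command
quoted above.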




* Re: ahci problems with sata disk.
  2007-01-15 11:05   ` kenneth johansson
@ 2007-01-15 11:36     ` Alan
  2007-01-15 13:50     ` Tejun Heo
  2007-01-16 16:44     ` Andrew Lyon
  2 siblings, 0 replies; 13+ messages in thread
From: Alan @ 2007-01-15 11:36 UTC (permalink / raw)
  To: kenneth johansson; +Cc: Tejun Heo, linux-ide

> Also, the disk is a Western Digital Raptor, and it's probably the most
> benchmarked drive one could get, so I was not expecting a problem with
> the drive.

A lot of early NCQ firmware seems to reduce performance and cause
problems. At least one other raptor is in our "don't NCQ" list in the
kernel drivers.


* Re: ahci problems with sata disk.
  2007-01-15 11:05   ` kenneth johansson
  2007-01-15 11:36     ` Alan
@ 2007-01-15 13:50     ` Tejun Heo
  2007-01-16  1:43       ` kenneth johansson
  2007-01-16 16:44     ` Andrew Lyon
  2 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2007-01-15 13:50 UTC (permalink / raw)
  To: kenneth johansson; +Cc: linux-ide

kenneth johansson wrote:
> On Mon, 2007-01-15 at 18:13 +0900, Tejun Heo wrote:
>> kenneth johansson wrote:
>>> I changed my bios setting for SATA from IDE to AHCI.
>>>
>>> This resulted in some "interesting" read throughput. 
>>>
>>> plots can be found at http://kenjo.org/~ken/sata/
>>> The plots was done on a live disk so some noise is expected but in the
>>> ahci mode the throughput get stuck at 17 MB way to much.
>> It's probably not an ahci problem but more of NCQ implementation problem
>> in the drive firmware.  Please report the result of 'hdparm -I /dev/sdX'
>> and try adjust queue depth and see what happens.
>>
>> http://linux-ata.org/faq.html
>>
> 
> It was. When I turn off NCQ with
> "echo 1 > /sys/block/sda/device/queue_depth", I get the same performance
> as when the BIOS is set to IDE.

Can you play with the queue depth a bit?  E.g., benchmark queue depths of
4, 8 and 16.

> I thought that NCQ was intended to increase performance?

Supposedly.

> Also, the disk is a Western Digital Raptor, and it's probably the most
> benchmarked drive one could get, so I was not expecting a problem with
> the drive.

Being the most benchmarked doesn't make the firmware any better, it seems.
The Raptor Alan mentioned reportedly also locks up after hours of NCQ load.
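
A sweep over several queue depths could be scripted roughly as follows.
This is a sketch under assumptions: the name sweep_queue_depths and the
depth list (the suggested 4, 8 and 16, plus 1 and 2 for reference) are
illustrative, and COMMAND stands for whatever benchmark you trust to print
a single figure on stdout.

```shell
# sweep_queue_depths KNOB COMMAND [ARGS...]
# For each depth: write it to KNOB, run COMMAND, and print
# "<depth><TAB><command output>". The original depth is restored at the end.
sweep_queue_depths() (
    knob=$1
    shift
    original=$(cat "$knob") || return 1
    for depth in 1 2 4 8 16; do
        echo "$depth" > "$knob" || break
        printf '%s\t%s\n' "$depth" "$("$@")"
    done
    echo "$original" > "$knob"    # put the original depth back
)
```

KNOB would normally be /sys/block/sdX/device/queue_depth, and writing to it
needs root.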

-- 
tejun


* Re: ahci problems with sata disk.
  2007-01-15 13:50     ` Tejun Heo
@ 2007-01-16  1:43       ` kenneth johansson
  0 siblings, 0 replies; 13+ messages in thread
From: kenneth johansson @ 2007-01-16  1:43 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-ide

On Mon, 2007-01-15 at 22:50 +0900, Tejun Heo wrote:
> kenneth johansson wrote:
> > On Mon, 2007-01-15 at 18:13 +0900, Tejun Heo wrote:
> >> kenneth johansson wrote:
> >>> I changed my bios setting for SATA from IDE to AHCI.
> >>>
> >>> This resulted in some "interesting" read throughput. 
> >>>
> >>> plots can be found at http://kenjo.org/~ken/sata/
> >>> The plots was done on a live disk so some noise is expected but in the
> >>> ahci mode the throughput get stuck at 17 MB way to much.
> >> It's probably not an ahci problem but more of NCQ implementation problem
> >> in the drive firmware.  Please report the result of 'hdparm -I /dev/sdX'
> >> and try adjust queue depth and see what happens.
> >>
> >> http://linux-ata.org/faq.html
> >>
> > 
> > It was. When I turn off NCQ with
> > "echo 1 > /sys/block/sda/device/queue_depth", I get the same performance
> > as when the BIOS is set to IDE.
> 
> Can you play with queue depth a bit?  e.g. Benchmark queue depth of 4, 8
> and 16.

I did some more tests (http://kenjo.org/~ken/sata/): queue depths of 1 and
maybe 2 work, but everything larger than that has problems.





* Re: ahci problems with sata disk.
  2007-01-15 11:05   ` kenneth johansson
  2007-01-15 11:36     ` Alan
  2007-01-15 13:50     ` Tejun Heo
@ 2007-01-16 16:44     ` Andrew Lyon
  2007-01-16 18:32       ` Mark Lord
  2 siblings, 1 reply; 13+ messages in thread
From: Andrew Lyon @ 2007-01-16 16:44 UTC (permalink / raw)
  To: linux-ide

On 1/15/07, kenneth johansson <ken@kenjo.org> wrote:
> On Mon, 2007-01-15 at 18:13 +0900, Tejun Heo wrote:
> > kenneth johansson wrote:
> > > I changed my bios setting for SATA from IDE to AHCI.
> > >
> > > This resulted in some "interesting" read throughput.
> > >
> > > plots can be found at http://kenjo.org/~ken/sata/
> > > The plots was done on a live disk so some noise is expected but in the
> > > ahci mode the throughput get stuck at 17 MB way to much.
> >
> > It's probably not an ahci problem but more of NCQ implementation problem
> > in the drive firmware.  Please report the result of 'hdparm -I /dev/sdX'
> > and try adjust queue depth and see what happens.
> >
> > http://linux-ata.org/faq.html
> >
>
> It was. When I turn off NCQ with
> "echo 1 > /sys/block/sda/device/queue_depth", I get the same performance
> as when the BIOS is set to IDE.
>
> I thought that NCQ was intended to increase performance?
>
> Also, the disk is a Western Digital Raptor, and it's probably the most
> benchmarked drive one could get, so I was not expecting a problem with
> the drive.

My Raptor drive (WDC WD740ADFD-00, Rev: 20.0) performs worse with NCQ
than without it. Alan Cox has added it to a blacklist, but I don't think
that is in the mainline kernel yet.

Search the list archives for my previous postings about it, including
some hdparm -tT results.

Andy





* Re: ahci problems with sata disk.
  2007-01-16 16:44     ` Andrew Lyon
@ 2007-01-16 18:32       ` Mark Lord
  2007-01-16 20:20         ` Mark Hahn
  0 siblings, 1 reply; 13+ messages in thread
From: Mark Lord @ 2007-01-16 18:32 UTC (permalink / raw)
  To: Andrew Lyon; +Cc: linux-ide, Alan Cox, Tejun Heo

..
>> I thought that NCQ was intended to increase performance?
..
> My Raptor drive (WDC WD740ADFD-00, Rev: 20.0) performs worse with NCQ
> than without it. Alan Cox has added it to a blacklist, but I don't think
> that is in the mainline kernel yet.
> 
> Search the list archives for my previous postings about it, including
> some hdparm -tT results.
> 
> Andy

Raptors have "server style" firmware in them.
When NCQ is turned on, they self-optimize for random seeks
rather than for sequential read-ahead.

My hdparm test is a sequential read-ahead test, so it will
naturally perform worse on a Raptor when NCQ is on.

That doesn't mean NCQ is bad on all Raptors; it just means
it's not good for large sequential reads.

Cheers


* Re: ahci problems with sata disk.
  2007-01-16 18:32       ` Mark Lord
@ 2007-01-16 20:20         ` Mark Hahn
  2007-01-16 22:10           ` Jeff Garzik
  0 siblings, 1 reply; 13+ messages in thread
From: Mark Hahn @ 2007-01-16 20:20 UTC (permalink / raw)
  To: linux-ide

>>> I thought that NCQ was intended to increase performance?

intended to increase _sales_ performance ;)

remember that you've always had command queueing (kernel elevator):
the main difference with NCQ (or SCSI tagged queueing) is when
the disk can out-schedule the kernel.  afaict, this means squeezing
in a rotationally intermediate request along the way.

that intermediate request must be fairly small and should be a read
(for head-settling reasons).

I wonder how often this happens in the real world, given the relatively
small queues the disk has to work with.

> My hdparm test is a sequential read-ahead test, so it will
> naturally perform worse on a Raptor when NCQ is on.

that's a surprisingly naive heuristic, especially since 
NCQ is concerned with just a max of ~4MB of reads, only a smallish
fraction of the available cache.
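
One way to arrive at a figure like ~4MB, offered only as a guess at where
the number comes from: 32 outstanding tags (the queue depth hdparm reports
earlier in this thread) times 128 KiB per request. The 128 KiB request
size is an assumption, not something stated in the thread, and the helper
name max_queued_kib is made up for illustration.

```shell
# max_queued_kib TAGS REQUEST_KIB
# Upper bound, in KiB, on the data covered by TAGS outstanding requests
# of REQUEST_KIB KiB each.
max_queued_kib() (
    echo $(( $1 * $2 ))
)
```

With those assumed inputs, "max_queued_kib 32 128" prints 4096, i.e. 4 MiB.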


* Re: ahci problems with sata disk.
  2007-01-16 20:20         ` Mark Hahn
@ 2007-01-16 22:10           ` Jeff Garzik
  2007-01-16 22:26             ` Eric D. Mudama
  2007-01-17 22:03             ` Jens Axboe
  0 siblings, 2 replies; 13+ messages in thread
From: Jeff Garzik @ 2007-01-16 22:10 UTC (permalink / raw)
  To: Mark Hahn; +Cc: linux-ide, Jens Axboe, Andrew Morton

Mark Hahn wrote:
>>>> I thought that NCQ was intended to increase performance?
> 
> intended to increase _sales_ performance ;)

Yep.


> remember that you've always had command queueing (kernel elevator): the 
> main difference with NCQ (or SCSI tagged queueing) is when
> the disk can out-schedule the kernel.  afaict, this means squeezing
> in a rotationally intermediate request along the way.
> 
> that intermediate request must be fairly small and should be a read
> (for head-settling reasons).
> 
> I wonder how often this happens in the real world, given the relatively
> small queues the disk has to work with.

ISTR either Jens or Andrew ran some numbers, and found that there was 
little utility beyond 4 or 8 tags or so.


>> My hdparm test is a sequential read-ahead test, so it will
>> naturally perform worse on a Raptor when NCQ is on.
> 
> that's a surprisingly naive heuristic, especially since NCQ is concerned 
> with just a max of ~4MB of reads, only a smallish
> fraction of the available cache.

NCQ mainly helps with multiple threads doing reads.  Writes are largely 
asynchronous to the user already (except for fsync-style writes).  You 
want to be able to stuff the disk's internal elevator with as many read 
requests as possible, because reads are very often synchronous -- most 
apps (1) read a block, (2) do something, (3) goto step #1.  The kernel's 
elevator isn't much use in these cases.

	Jeff





* Re: ahci problems with sata disk.
  2007-01-16 22:10           ` Jeff Garzik
@ 2007-01-16 22:26             ` Eric D. Mudama
  2007-01-17 22:03               ` Jens Axboe
  2007-01-17 22:03             ` Jens Axboe
  1 sibling, 1 reply; 13+ messages in thread
From: Eric D. Mudama @ 2007-01-16 22:26 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Mark Hahn, linux-ide, Jens Axboe, Andrew Morton

On 1/16/07, Jeff Garzik <jeff@garzik.org> wrote:
> ISTR either Jens or Andrew ran some numbers, and found that there was
> little utility beyond 4 or 8 tags or so.

Write cache is effectively queueing small writes already, so NCQ
simply brings random read performance closer to writes.

I know on the Maxtor drives with ~16MB of cache, they could do almost
200 ops/s at 7200RPM with their buffer granularity.

Random reads were about 70 ops/s at a depth of 1, and 120 ops/s at a
depth of 32.  Every doubling of queue depth added another level of
performance and brought it closer to that of cached writes (queued or
unqueued).  (An infinite queue depth basically eliminates seek and
rotate time, and leaves your minimum settle criteria as your minimum
operation time.)
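
To put the figures above on a per-operation basis, a small sketch (the
helper names ms_per_op and percent_gain are illustrative assumptions, and
the output is the average time between completed operations, not the
latency of any single request):

```shell
# ms_per_op OPS_PER_SEC
# Average time between completed operations, in milliseconds.
ms_per_op() (
    awk -v ops="$1" 'BEGIN { printf "%.1f\n", 1000 / ops }'
)

# percent_gain BASE NEW
# Relative improvement of NEW over BASE, in percent.
percent_gain() (
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", (b - a) / a * 100 }'
)
```

With the numbers quoted, "ms_per_op 70" gives 14.3, "ms_per_op 120" gives
8.3, and "ms_per_op 200" (the cached-write figure) gives 5.0, while
"percent_gain 70 120" gives 71.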

It really depends a lot on the application, but for mixed random
workloads, a 25-30% performance increase was common in our
testing.  Drives should be able to handle normal streaming workloads
at identical performance, with or without queueing, since the patterns
are so easy to detect.

If done properly, queueing should never hurt performance.  High queue
depths will increase average latency of course, but shouldn't hurt
overall performance.

--eric

> NCQ mainly helps with multiple threads doing reads.  Writes are largely
> asynchronous to the user already (except for fsync-style writes).  You
> want to be able to stuff the disk's internal elevator with as many read
> requests as possible, because reads are very often synchronous -- most
> apps (1) read a block, (2) do something, (3) goto step #1.  The kernel's
> elevator isn't much use in these cases.

True.  And internal to the drive, normal elevator is "meh."  There are
other algorithms for scheduling that perform better.


* Re: ahci problems with sata disk.
  2007-01-16 22:10           ` Jeff Garzik
  2007-01-16 22:26             ` Eric D. Mudama
@ 2007-01-17 22:03             ` Jens Axboe
  1 sibling, 0 replies; 13+ messages in thread
From: Jens Axboe @ 2007-01-17 22:03 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Mark Hahn, linux-ide, Andrew Morton

On Tue, Jan 16 2007, Jeff Garzik wrote:
> Mark Hahn wrote:
> >>>>I thought that NCQ was intended to increase performance?
> >
> >intended to increase _sales_ performance ;)
> 
> Yep.
> 
> 
> >remember that you've always had command queueing (kernel elevator): the 
> >main difference with NCQ (or SCSI tagged queueing) is when
> >the disk can out-schedule the kernel.  afaict, this means squeezing
> >in a rotationally intermediate request along the way.
> >
> >that intermediate request must be fairly small and should be a read
> >(for head-settling reasons).
> >
> >I wonder how often this happens in the real world, given the relatively
> >small queues the disk has to work with.
> 
> ISTR either Jens or Andrew ran some numbers, and found that there was 
> little utility beyond 4 or 8 tags or so.

It entirely depends on the access pattern. For truly random reads,
performance does seem to continue to scale up with increasing drive
queue depths. It may only be a benchmark figure though, as truly random
read workloads probably aren't that common :-)

For anything else, going beyond 4 tags doesn't improve much.

> >>My hdparm test is a sequential read-ahead test, so it will
> >>naturally perform worse on a Raptor when NCQ is on.
> >
> >that's a surprisingly naive heuristic, especially since NCQ is concerned 
> >with just a max of ~4MB of reads, only a smallish
> >fraction of the available cache.
> 
> NCQ mainly helps with multiple threads doing reads.  Writes are largely 
> asynchronous to the user already (except for fsync-style writes).  You 
> want to be able to stuff the disk's internal elevator with as many read 
> requests as possible, because reads are very often synchronous -- most 
> apps (1) read a block, (2) do something, (3) goto step #1.  The kernel's 
> elevator isn't much use in these cases.

Au contraire, this is one of the cases where intelligent IO scheduling
in the kernel makes a ton of difference. It's the primary reason that AS
and CFQ are able to maintain > 90% of disk bandwidth for more than one
process, idling the drive for the duration of step 2 in the sequence
above (step 2 is typically really small, time wise). If the next block
read is close to the first one, that is. If you do that, you will
greatly outperform the same workload pushed to the drive scheduling.
I've done considerable benchmarks on this. Only if the processes are
doing random IO should the IO scheduler punt and push everything to the
drive queue.
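
To see which IO scheduler a disk is currently using, the sysfs file
/sys/block/<dev>/queue/scheduler lists the available schedulers with the
active one in square brackets. A minimal sketch for extracting it (the
function name active_scheduler and the optional SYSFS_ROOT argument are
assumptions for illustration):

```shell
# active_scheduler [DEV] [SYSFS_ROOT]
# Prints the scheduler shown in brackets in <SYSFS_ROOT>/<DEV>/queue/scheduler,
# e.g. "cfq" for a file containing "noop anticipatory deadline [cfq]".
active_scheduler() (
    dev=${1:-sda}
    root=${2:-/sys/block}
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$root/$dev/queue/scheduler"
)
```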

-- 
Jens Axboe



* Re: ahci problems with sata disk.
  2007-01-16 22:26             ` Eric D. Mudama
@ 2007-01-17 22:03               ` Jens Axboe
  0 siblings, 0 replies; 13+ messages in thread
From: Jens Axboe @ 2007-01-17 22:03 UTC (permalink / raw)
  To: Eric D. Mudama
  Cc: Jeff Garzik, Mark Hahn, linux-ide, Jens Axboe, Andrew Morton

On Tue, Jan 16 2007, Eric D. Mudama wrote:

[snip lots of stuff I agree completely with]

> If done properly, queueing should never hurt performance.  High queue
> depths will increase average latency of course, but shouldn't hurt
> overall performance.

It may never hurt performance, but there are common scenarios where you
are much better off not doing queuing even if you could. A good example
of that is a media serving service, where you end up reading a bunch of
files sequentially. It's faster to read chunks of each file sequentially
at depth 1 and move on than to queue a request from each of them and
send them to the drive. On my laptop with an NCQ-enabled drive, the
depth-1 approach outperforms queuing by more than 100%.

> >NCQ mainly helps with multiple threads doing reads.  Writes are
> >largely asynchronous to the user already (except for fsync-style
> >writes).  You want to be able to stuff the disk's internal elevator
> >with as many read requests as possible, because reads are very often
> >synchronous -- most apps (1) read a block, (2) do something, (3) goto
> >step #1.  The kernel's elevator isn't much use in these cases.
> 
> True.  And internal to the drive, normal elevator is "meh."  There are
> other algorithms for scheduling that perform better.

Well Linux doesn't default to using a normal elevator, so it's a moot
point.

-- 
Jens Axboe

