* SSD read latency negatively impacted by large writes (independent of choice of I/O scheduler)
From: Zubin Dittia @ 2009-10-30 23:21 UTC (permalink / raw)
  To: linux-kernel

I've been doing some testing with an Intel X25-E SSD, and noticed that
large writes can severely affect read latency, regardless of which I/O
scheduler or scheduler parameters are in use (this is with kernel
2.6.28-16 from Ubuntu jaunty 9.04).  The test was very simple: I had
two threads running; the first was in a tight loop reading different
4KB sized blocks (and recording the latency of each read) from the SSD
block device file.  While the first thread is doing this, a second
thread does a single big 5MB write to the device.  What I noticed is
that about 30 seconds after the write (which is when the write is
actually written back to the device from buffer cache), I see a very
large spike in read latency: from 200 microseconds to 25 milliseconds.
 This seems to imply that the writes issued by the scheduler are not
being broken up into sufficiently small chunks with interspersed
reads; instead, the whole sequential write seems to be getting issued
while starving reads during that period.  I've noticed the same
behavior with SSDs from another vendor as well, and there the latency
impact was even worse (80 ms).  Playing around with different I/O
schedulers and parameters doesn't seem to help at all.
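
In outline, the test amounted to something like the sketch below (device
path, read region, and sleep times are only illustrative, and error
handling is stripped):

/*
 * Rough sketch of the test: one thread times 4KB reads scattered over
 * the device while the main thread issues a single 5MB buffered write.
 * Device path, region size, and sleeps are illustrative.
 * Build with: gcc -o lattest lattest.c -lpthread -lrt
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define DEV		"/dev/sdb"		/* example device */
#define READ_SIZE	4096
#define WRITE_SIZE	(5 * 1024 * 1024)
#define SPAN		(1024ULL * 1024 * 1024)	/* read offsets within 1GB */

static double now_us(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec * 1e6 + ts.tv_nsec / 1e3;
}

static void *reader(void *arg)
{
	int fd = open(DEV, O_RDONLY);
	char *buf = malloc(READ_SIZE);

	(void)arg;
	for (;;) {
		off_t off = (random() % (SPAN / READ_SIZE)) * READ_SIZE;
		double t0 = now_us();

		pread(fd, buf, READ_SIZE, off);
		printf("read latency: %.0f us\n", now_us() - t0);
	}
	return NULL;
}

int main(void)
{
	int fd = open(DEV, O_WRONLY);
	char *buf = calloc(1, WRITE_SIZE);
	pthread_t t;

	pthread_create(&t, NULL, reader, NULL);
	sleep(5);			/* establish the read-latency baseline */
	write(fd, buf, WRITE_SIZE);	/* one big buffered write; hits the device on writeback */
	sleep(60);			/* keep reading while writeback happens */
	return 0;
}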

The same behavior is exhibited when using O_DIRECT as well (except
that the latency hit is immediate instead of 30 seconds later, as one
would expect).  The only way I was able to reduce the worst-case read
latency was by using O_DIRECT and breaking up the large write into
multiple smaller writes (with one system call per smaller write).  My
theory is that the time between write system calls was enough to allow
reads to squeeze themselves in between the writes.  But, as would be
expected, this does bad things to the sequential write throughput
because of the overhead of multiple system calls.
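
The chunked O_DIRECT workaround was along the lines of this sketch
(device path, chunk size, and alignment are illustrative):

/*
 * Split one 5MB write into smaller O_DIRECT writes, one system call
 * each, so reads can slip in between them.  O_DIRECT needs the buffer,
 * offset, and length aligned to the device's sector size.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define CHUNK	(128 * 1024)
#define TOTAL	(5 * 1024 * 1024)	/* 40 chunks of 128KB */

int main(void)
{
	int fd = open("/dev/sdb", O_WRONLY | O_DIRECT);	/* example device */
	void *buf;
	off_t off;

	posix_memalign(&buf, 4096, CHUNK);	/* aligned buffer for O_DIRECT */
	memset(buf, 0, CHUNK);

	for (off = 0; off < TOTAL; off += CHUNK)
		pwrite(fd, buf, CHUNK, off);	/* one syscall per chunk */

	close(fd);
	return 0;
}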

My question is: have others seen this behavior?  Are there any
tunables that could help (perhaps a parameter that would dictate the
largest size of a write that can be pending to the device at any given
time)?  If not, would it make sense to implement a new I/O scheduler
(or hack an existing one) that does this?

Thanks,
-Zubin


* Re: SSD read latency negatively impacted by large writes (independent of choice of I/O scheduler)
From: Jeff Moyer @ 2009-11-02 14:25 UTC (permalink / raw)
  To: Zubin Dittia; +Cc: linux-kernel

Zubin Dittia <zubin@tintri.com> writes:

> I've been doing some testing with an Intel X25-E SSD, and noticed that
> large writes can severely affect read latency, regardless of which I/O
> scheduler or scheduler parameters are in use (this is with kernel
> 2.6.28-16 from Ubuntu jaunty 9.04).  The test was very simple: I had
> two threads running; the first was in a tight loop reading different
> 4KB sized blocks (and recording the latency of each read) from the SSD
> block device file.  While the first thread is doing this, a second
> thread does a single big 5MB write to the device.  What I noticed is
> that about 30 seconds after the write (which is when the write is
> actually written back to the device from buffer cache), I see a very
> large spike in read latency: from 200 microseconds to 25 milliseconds.
>  This seems to imply that the writes issued by the scheduler are not
> being broken up into sufficiently small chunks with interspersed
> reads; instead, the whole sequential write seems to be getting issued
> while starving reads during that period.  I've noticed the same
> behavior with SSDs from another vendor as well, and there the latency
> impact was even worse (80 ms).  Playing around with different I/O
> schedulers and parameters doesn't seem to help at all.
>
> The same behavior is exhibited when using O_DIRECT as well (except
> that the latency hit is immediate instead of 30 seconds later, as one
> would expect).  The only way I was able to reduce the worst-case read
> latency was by using O_DIRECT and breaking up the large write into
> multiple smaller writes (with one system call per smaller write).  My
> theory is that the time between write system calls was enough to allow
> reads to squeeze themselves in between the writes.  But, as would be
> expected, this does bad things to the sequential write throughput
> because of the overhead of multiple system calls.
>
> My question is: have others seen this behavior?  Are there any
> tunables that could help (perhaps a parameter that would dictate the
> largest size of a write that can be pending to the device at any given
> time).  If not, would it make sense to implement a new I/O scheduler
> (or hack an existing one) which does this.

I haven't verified your findings, but if what you state is true, then
you could try tuning max_sectors_kb for your device.  Making that
smaller will decrease the total amount of I/O that can be queued in the
device at any given time.  There's always a trade-off between bandwidth
and latency, of course.
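
As a minimal sketch (device name and value are only examples; the knob
lives in /sys/block/<dev>/queue/max_sectors_kb), capping requests at
128KB might look like:

/*
 * Cap the size of individual requests sent to sdb at 128KB, i.e. the
 * same effect as writing "128" into /sys/block/sdb/queue/max_sectors_kb.
 */
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
	int fd = open("/sys/block/sdb/queue/max_sectors_kb", O_WRONLY);

	if (fd >= 0) {
		write(fd, "128", 3);
		close(fd);
	}
	return 0;
}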

Cheers,
Jeff


* Re: SSD read latency negatively impacted by large writes (independent of choice of I/O scheduler)
From: Stefan Richter @ 2009-11-02 15:56 UTC (permalink / raw)
  To: Jeff Moyer; +Cc: Zubin Dittia, linux-kernel

Jeff Moyer wrote:
> Zubin Dittia <zubin@tintri.com> writes:
[...]
>> about 30 seconds after the write (which is when the write is
>> actually written back to the device from buffer cache), I see a very
>> large spike in read latency: from 200 microseconds to 25 milliseconds.
>>  This seems to imply that the writes issued by the scheduler are not
>> being broken up into sufficiently small chunks with interspersed
>> reads; instead, the whole sequential write seems to be getting issued
>> while starving reads during that period.
[...]
>> Playing around with different I/O
>> schedulers and parameters doesn't seem to help at all.
[...]
> I haven't verified your findings, but if what you state is true, then
> you could try tuning max_sectors_kb for your device.  Making that
> smaller will decrease the total amount of I/O that can be queued in the
> device at any given time.  There's always a trade-off between bandwidth
> and latency, of course.

Maximum transfer size per request is indeed one factor; another is
queue_depth.  With a deep queue, a read request that arrives between
many write requests will still be held up by the writes already queued
ahead of it.  (Once the scheduler has issued requests to the device
queue, it cannot reorder them any more; only the disk's firmware could
reorder them, if it is sophisticated enough and there are no barriers
in the mix.)
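
For devices whose driver supports changing it, the queue depth can be
lowered through sysfs in the same spirit.  A minimal sketch, with the
device name and value purely illustrative:

/*
 * Reduce the number of commands kept queued in the device, i.e. the
 * same effect as writing "1" into /sys/block/sdb/device/queue_depth
 * (only present and writable when the low-level driver supports it).
 */
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
	int fd = open("/sys/block/sdb/device/queue_depth", O_WRONLY);

	if (fd >= 0) {
		write(fd, "1", 1);
		close(fd);
	}
	return 0;
}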

When transfer size and queue depth are set to values other than the
defaults, the various I/O schedulers should be tested again, since
their behavior may then differ more noticeably than before.
-- 
Stefan Richter
-=====-==--= =-== ---=-
http://arcgraph.de/sr/

