linux-kernel.vger.kernel.org archive mirror
* solid state drive access and context switching
@ 2007-12-03 23:06 Chris Friesen
  2007-12-03 23:06 ` Alan Cox
  0 siblings, 1 reply; 17+ messages in thread
From: Chris Friesen @ 2007-12-03 23:06 UTC (permalink / raw)
  To: linux-kernel


Over on comp.os.linux.development.system someone asked an interesting 
question, and I thought I'd mention it here.

Given a fast low-latency solid state drive, would it ever be beneficial 
to simply wait in the kernel for synchronous read/write calls to 
complete?  The idea is that you could avoid at least two task context 
switches, and if the data access can be completed at less cost than 
those context switches it could be an overall win.

Has anyone played with this concept?

Chris
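
A minimal sketch of the trade-off asked about above, assuming the only
cost of sleeping is the two context switches (away from the task and
back to it). The helper name `polling_wins` and the idea of passing both
costs in nanoseconds are illustrative assumptions, not an existing
kernel interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel API: busy-waiting is the cheaper
 * choice only if the I/O completes in less time than the two context
 * switches that a sleeping wait would cost.
 */
static bool polling_wins(uint64_t io_latency_ns, uint64_t ctx_switch_ns)
{
    assert(ctx_switch_ns > 0);
    return io_latency_ns < 2 * ctx_switch_ns;
}
```

This ignores any useful work another task could have done on the CPU
while the I/O was in flight, so it is an optimistic bound for polling.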




* Re: solid state drive access and context switching
  2007-12-03 23:06 solid state drive access and context switching Chris Friesen
@ 2007-12-03 23:06 ` Alan Cox
  2007-12-04 17:54   ` Jared Hulbert
  2007-12-04 20:52   ` Jeff Garzik
  0 siblings, 2 replies; 17+ messages in thread
From: Alan Cox @ 2007-12-03 23:06 UTC (permalink / raw)
  To: Chris Friesen; +Cc: linux-kernel

> Given a fast low-latency solid state drive, would it ever be beneficial 
> to simply wait in the kernel for synchronous read/write calls to 
> complete?  The idea is that you could avoid at least two task context 
> switches, and if the data access can be completed at less cost than 
> those context switches it could be an overall win.

In certain situations theoretically yes, the kernel is better off
continuing to poll than switching to the idle thread. You can do this to
some extent in a driver already today - just poll rather than sleeping
but respect the reschedule hints and don't do it with irqs masked.
 
> Has anyone played with this concept?

For things like SATA based devices they aren't that fast yet. 

Alan
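
The "poll, but respect the reschedule hints" pattern can be sketched in
userspace as a bounded spin that periodically yields. This is only an
analogy: `completion_t`, `poll_for_completion`, and the spin limits are
hypothetical names chosen for illustration, and `sched_yield()` stands
in for what a driver would do with the kernel's own hints (such as
`need_resched()` / `cond_resched()`).

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical completion flag that a device or another thread sets. */
typedef struct {
    atomic_bool done;
} completion_t;

/*
 * Check the flag up to max_spins times. After every yield_every checks,
 * call sched_yield() so other runnable tasks can get the CPU.
 * Returns true if the operation completed while polling, false if the
 * caller should fall back to a sleeping wait.
 */
static bool poll_for_completion(completion_t *c, unsigned max_spins,
                                unsigned yield_every)
{
    assert(yield_every > 0);
    for (unsigned i = 0; i < max_spins; i++) {
        if (atomic_load_explicit(&c->done, memory_order_acquire))
            return true;
        if ((i + 1) % yield_every == 0)
            sched_yield();
    }
    return atomic_load_explicit(&c->done, memory_order_acquire);
}
```

The bound on the spin count is the important part: once it is exceeded,
the caller gives up polling and sleeps as usual, so a slow device cannot
monopolise the CPU.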


* Re: solid state drive access and context switching
  2007-12-03 23:06 ` Alan Cox
@ 2007-12-04 17:54   ` Jared Hulbert
  2007-12-04 20:35     ` Alan Cox
  2007-12-04 20:46     ` Chris Friesen
  2007-12-04 20:52   ` Jeff Garzik
  1 sibling, 2 replies; 17+ messages in thread
From: Jared Hulbert @ 2007-12-04 17:54 UTC (permalink / raw)
  To: Alan Cox; +Cc: Chris Friesen, linux-kernel

> > Has anyone played with this concept?
>
> For things like SATA based devices they aren't that fast yet.

What is fast enough?

As I understand the basic memory technology, the hard limit is in the
100's of microseconds range for latency.  SATA adds something to that.
I'd be surprised to see latencies on SATA SSDs, as measured at the OS
level, get below 1 millisecond.

What happens when we start placing NAND technology on lower-latency,
higher-bandwidth buses?  I'm guessing we'll get down to that 100's of
microseconds level and an order of magnitude higher bandwidth than
SATA.  Is that fast enough to warrant this more synchronous IO?

Magnetic drives have latencies ~10 milliseconds, current SSD's are an
order of magnitude better (~1 millisecond), new interfaces and
refinements could theoretically get us down one more (~100
microsecond).  I'm guessing the current block driver subsystem would
negate a lot of that latency gain.  Am I wrong?

BTW - This trend toward faster, lower-latency buses is marching
forward.  Two examples: the ioDrive from Fusion IO and Micron's
RAM-module-like SSD concept.


* Re: solid state drive access and context switching
  2007-12-04 17:54   ` Jared Hulbert
@ 2007-12-04 20:35     ` Alan Cox
  2007-12-04 21:54       ` Jared Hulbert
  2007-12-04 20:46     ` Chris Friesen
  1 sibling, 1 reply; 17+ messages in thread
From: Alan Cox @ 2007-12-04 20:35 UTC (permalink / raw)
  To: Jared Hulbert; +Cc: Chris Friesen, linux-kernel

> microseconds level and an order of magnitude higher bandwidth than
> SATA.  Is that fast enough to warrant this more synchronous IO?

See the mtd layer.

> BTW - This trend toward faster, lower latency busses is marching
> forward.  2 examples; the ioDrive from Fusion IO, Micron's RAM-module
> like SSD concept.

Very much so but we can do quite a bit in 10,000 processor cycles ...

Alan


* Re: solid state drive access and context switching
  2007-12-04 17:54   ` Jared Hulbert
  2007-12-04 20:35     ` Alan Cox
@ 2007-12-04 20:46     ` Chris Friesen
  2007-12-04 21:38       ` Jared Hulbert
  1 sibling, 1 reply; 17+ messages in thread
From: Chris Friesen @ 2007-12-04 20:46 UTC (permalink / raw)
  To: Jared Hulbert; +Cc: Alan Cox, linux-kernel

Jared Hulbert wrote:

> Magnetic drives have latencies ~10 milliseconds, current SSD's are an
> order of magnitude better (~1 millisecond), new interfaces and
> refinements could theoretically get us down one more (~100
> microsecond).

They've already done better than that.  Here's a solid state 
drive with a claimed 20 microsecond access time:

http://www.curtisssd.com/products/drives/hyperxclr

Chris


* Re: solid state drive access and context switching
  2007-12-03 23:06 ` Alan Cox
  2007-12-04 17:54   ` Jared Hulbert
@ 2007-12-04 20:52   ` Jeff Garzik
  2007-12-04 21:02     ` Alan Cox
  1 sibling, 1 reply; 17+ messages in thread
From: Jeff Garzik @ 2007-12-04 20:52 UTC (permalink / raw)
  To: Alan Cox; +Cc: Chris Friesen, linux-kernel

Alan Cox wrote:
> For things like SATA based devices they aren't that fast yet. 

You forget the Gigabyte i-RAM.

For others:  the i-RAM is a SATA-based device that plugs into a PCI slot 
on your motherboard (for power), providing RAM+battery backup as fast as 
your SATA bus and DIMMs will go.

	Jeff





* Re: solid state drive access and context switching
  2007-12-04 20:52   ` Jeff Garzik
@ 2007-12-04 21:02     ` Alan Cox
  0 siblings, 0 replies; 17+ messages in thread
From: Alan Cox @ 2007-12-04 21:02 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Chris Friesen, linux-kernel

On Tue, 04 Dec 2007 15:52:20 -0500
Jeff Garzik <jeff@garzik.org> wrote:

> Alan Cox wrote:
> > For things like SATA based devices they aren't that fast yet. 
> 
> You forget the Gigabyte i-RAM.
> 
> For others:  the i-RAM is a SATA-based device that plugs into a PCI slot 
> on your motherboard (for power), providing RAM+battery backup as fast as 
> your SATA bus and DIMMs will go.

Actually even allowing for the iRAM the SATA stuff is way too slow to be
worth using synchronously. The latency is a killer.


* Re: solid state drive access and context switching
  2007-12-04 20:46     ` Chris Friesen
@ 2007-12-04 21:38       ` Jared Hulbert
  0 siblings, 0 replies; 17+ messages in thread
From: Jared Hulbert @ 2007-12-04 21:38 UTC (permalink / raw)
  To: Chris Friesen; +Cc: Alan Cox, linux-kernel

> > refinements could theoretically get us down one more (~100
> > microsecond).
>
> They've already done better than that.  Here's a solid state
> drive with a claimed 20 microsecond access time:
>
> http://www.curtisssd.com/products/drives/hyperxclr

Right.  That looks to be RAM based, which means $$$$ compared to NAND,
so that's not going to break out of a server niche.  I imagine the
latency is the device latency, not the system latency.  By the time you
send the request through the Fibre Channel stack and get the block back
it's gonna be much closer to 100 microseconds.  It's that OS-visible
latency that you've got to design to.


* Re: solid state drive access and context switching
  2007-12-04 20:35     ` Alan Cox
@ 2007-12-04 21:54       ` Jared Hulbert
  2007-12-04 22:45         ` Jörn Engel
  2007-12-04 23:24         ` Alan Cox
  0 siblings, 2 replies; 17+ messages in thread
From: Jared Hulbert @ 2007-12-04 21:54 UTC (permalink / raw)
  To: Alan Cox; +Cc: Chris Friesen, linux-kernel

> > microseconds level and an order of magnitude higher bandwidth than
> > SATA.  Is that fast enough to warrant this more synchronous IO?
>
> See the mtd layer.

Right.  The trend is to hide the nastiness of NAND technology changes
behind controllers.  In general I think this is a good thing.
Basically, the ECC and reliability requirements change very rapidly in
this technology.  Having custom controller hardware to handle this is
faster than handling it in software and makes for a nice modular
interface.  We don't rewrite our SATA drivers and filesystems
every time the magnetic media switches to a new recording scheme, we
just plug it in.  SSD's are going to be like that even if they aren't
SATA.  However, the MTD layer is more about managing the chips
themselves, which is what the controllers are for.

Maybe I'm missing something but I don't see it.  We want a block
interface for these devices, we just need a faster slimmer interface.
Maybe a new mtdblock interface that doesn't do erase would be the
place for it?

> > BTW - This trend toward faster, lower latency busses is marching
> > forward.  2 examples; the ioDrive from Fusion IO, Micron's RAM-module
> > like SSD concept.
>
> Very much so but we can do quite a bit in 10,000 processor cycles ...
>
> Alan
>


* Re: solid state drive access and context switching
  2007-12-04 21:54       ` Jared Hulbert
@ 2007-12-04 22:45         ` Jörn Engel
  2007-12-05  0:03           ` Jared Hulbert
  2007-12-04 23:24         ` Alan Cox
  1 sibling, 1 reply; 17+ messages in thread
From: Jörn Engel @ 2007-12-04 22:45 UTC (permalink / raw)
  To: Jared Hulbert; +Cc: Alan Cox, Chris Friesen, linux-kernel

On Tue, 4 December 2007 13:54:21 -0800, Jared Hulbert wrote:
> 
> Maybe I'm missing something but I don't see it.  We want a block
> interface for these devices, we just need a faster slimmer interface.
> Maybe a new mtdblock interface that doesn't do erase would be the
> place for it?

Doesn't do erase?  MTD has to learn almost all tricks from the block
layer, as devices are becoming high-latency high-bandwidth, compared to
what MTD was designed for.  In order to get any decent performance, we
need asynchronous operations, request queues and caching.

The only useful advantage MTD does have over block devices is an
_explicit_ erase operation.  Did you mean "doesn't do _implicit_ erase"?

Jörn

-- 
It's just what we asked for, but not what we want!
-- anonymous


* Re: solid state drive access and context switching
  2007-12-04 21:54       ` Jared Hulbert
  2007-12-04 22:45         ` Jörn Engel
@ 2007-12-04 23:24         ` Alan Cox
  2007-12-05  0:08           ` Jared Hulbert
  1 sibling, 1 reply; 17+ messages in thread
From: Alan Cox @ 2007-12-04 23:24 UTC (permalink / raw)
  To: Jared Hulbert; +Cc: Chris Friesen, linux-kernel

> Right.  The trend is to hide the nastiness of NAND technology changes
> behind controllers.  In general I think this is a good thing.

You miss the point - any controller you hide it behind almost inevitably
adds enough latency you don't want to use it synchronously.

Alan



* Re: solid state drive access and context switching
  2007-12-04 22:45         ` Jörn Engel
@ 2007-12-05  0:03           ` Jared Hulbert
  0 siblings, 0 replies; 17+ messages in thread
From: Jared Hulbert @ 2007-12-05  0:03 UTC (permalink / raw)
  To: Jörn Engel; +Cc: Alan Cox, Chris Friesen, linux-kernel

> > Maybe I'm missing something but I don't see it.  We want a block
> > interface for these devices, we just need a faster slimmer interface.
> > Maybe a new mtdblock interface that doesn't do erase would be the
> > place for it?
>
> Doesn't do erase?  MTD has to learn almost all tricks from the block
> layer, as devices are becoming high-latency high-bandwidth, compared to
> what MTD was designed for.  In order to get any decent performance, we
> need asynchronous operations, request queues and caching.
>
> The only useful advantage MTD does have over block devices is an
> _explicit_ erase operation.  Did you mean "doesn't do _implicit_ erase"?


You're right.  That's the point I was trying to make, albeit badly: MTD
isn't the place for this.  The fact that more and more of what MTD
is being used for looks a lot like the block layer is a whole
different discussion.


* Re: solid state drive access and context switching
  2007-12-04 23:24         ` Alan Cox
@ 2007-12-05  0:08           ` Jared Hulbert
  2007-12-05  0:24             ` Alan Cox
  0 siblings, 1 reply; 17+ messages in thread
From: Jared Hulbert @ 2007-12-05  0:08 UTC (permalink / raw)
  To: Alan Cox; +Cc: Chris Friesen, linux-kernel

On Dec 4, 2007 3:24 PM, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> > Right.  The trend is to hide the nastiness of NAND technology changes
> > behind controllers.  In general I think this is a good thing.
>
> You miss the point - any controller you hide it behind almost inevitably
> adds enough latency you don't want to use it synchronously.

I think I get it.  We keep saying that the latency is too high.
I agree that most technologies out there have latencies that are too
high.  Again I ask the question, what latencies do we have to hit
before the sync options become worth it?


* Re: solid state drive access and context switching
  2007-12-05  0:08           ` Jared Hulbert
@ 2007-12-05  0:24             ` Alan Cox
  2007-12-05 22:01               ` Jared Hulbert
  0 siblings, 1 reply; 17+ messages in thread
From: Alan Cox @ 2007-12-05  0:24 UTC (permalink / raw)
  To: Jared Hulbert; +Cc: Chris Friesen, linux-kernel

On Tue, 4 Dec 2007 16:08:07 -0800
"Jared Hulbert" <jaredeh@gmail.com> wrote:

> On Dec 4, 2007 3:24 PM, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> > > Right.  The trend is to hide the nastiness of NAND technology changes
> > > behind controllers.  In general I think this is a good thing.
> >
> > You miss the point - any controller you hide it behind almost inevitably
> > adds enough latency you don't want to use it synchronously.
> 
> I think I get it.  We keep saying that the latency is too high.
> I agree that most technologies out there have latencies that are too
> high.  Again I ask the question, what latencies do we have to hit
> before the sync options become worth it?

Probably about 1000 clocks, but it's always going to depend upon the
workload and whether any other work can be done usefully.

Alan
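
To put cycle budgets such as "1000 clocks" or "10,000 processor cycles"
on the same scale as device latencies, a small conversion helps. The
function name `cycles_to_ns` and the clock rates used below are
assumptions for illustration; the thread does not name a specific CPU.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a cycle budget to nanoseconds for a clock rate given in MHz. */
static uint64_t cycles_to_ns(uint64_t cycles, uint64_t clock_mhz)
{
    assert(clock_mhz > 0);
    return cycles * 1000 / clock_mhz;
}
```

With an assumed 1000 MHz clock, `cycles_to_ns(1000, 1000)` gives 1000 ns
(1 microsecond) and `cycles_to_ns(10000, 1000)` gives 10000 ns
(10 microseconds); at 2000 MHz both figures halve.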


* Re: solid state drive access and context switching
  2007-12-05  0:24             ` Alan Cox
@ 2007-12-05 22:01               ` Jared Hulbert
  2007-12-06  3:51                 ` Kyungmin Park
  0 siblings, 1 reply; 17+ messages in thread
From: Jared Hulbert @ 2007-12-05 22:01 UTC (permalink / raw)
  To: Alan Cox; +Cc: Chris Friesen, linux-kernel

> Probably about 1000 clocks, but it's always going to depend upon the
> workload and whether any other work can be done usefully.

Yeah.  Sounds right, in the microsecond range.  Be interesting to see data.

Anybody have ideas on what kind of experiments could confirm this
estimate is right?
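
One possible experiment, sketched here as an assumption rather than
anything proposed in the thread, is to estimate the cost of a context
switch by bouncing a byte between two processes over a pair of pipes.
The function name `pipe_round_trip_ns` is hypothetical. The result only
approximates two context switches if both processes are pinned to the
same CPU (for example with `taskset`), and it also includes pipe
read/write overhead, so it is best read as an upper bound.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Bounce one byte between parent and child `rounds` times and return
 * the mean round-trip time in nanoseconds, or -1 on error.
 */
static int64_t pipe_round_trip_ns(int rounds)
{
    int to_child[2], to_parent[2];
    char byte = 'x';
    struct timespec start, end;
    int ok = 1;

    assert(rounds > 0);
    if (pipe(to_child) != 0 || pipe(to_parent) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: echo each byte back until the parent closes its end. */
        close(to_child[1]);
        close(to_parent[0]);
        while (read(to_child[0], &byte, 1) == 1)
            if (write(to_parent[1], &byte, 1) != 1)
                break;
        _exit(0);
    }

    close(to_child[0]);
    close(to_parent[1]);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < rounds; i++) {
        if (write(to_child[1], &byte, 1) != 1 ||
            read(to_parent[0], &byte, 1) != 1) {
            ok = 0;
            break;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    close(to_child[1]);     /* EOF on the pipe makes the child exit. */
    close(to_parent[0]);
    waitpid(pid, NULL, 0);

    if (!ok)
        return -1;
    int64_t total = (int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL
                  + (end.tv_nsec - start.tv_nsec);
    return total / rounds;
}
```

Comparing this figure with the measured completion latency of the
device gives a first indication of whether busy-waiting could ever pay
off on a given machine.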


* Re: solid state drive access and context switching
  2007-12-05 22:01               ` Jared Hulbert
@ 2007-12-06  3:51                 ` Kyungmin Park
  0 siblings, 0 replies; 17+ messages in thread
From: Kyungmin Park @ 2007-12-06  3:51 UTC (permalink / raw)
  To: Jared Hulbert; +Cc: Alan Cox, Chris Friesen, linux-kernel

Hi,

On Dec 6, 2007 7:01 AM, Jared Hulbert <jaredeh@gmail.com> wrote:
> > Probably about 1000 clocks, but it's always going to depend upon the
> > workload and whether any other work can be done usefully.
>
> Yeah.  Sounds right, in the microsecond range.  Be interesting to see data.
>
> Anybody have ideas on what kind of experiments could confirm this
> estimate is right?

Is this the right place to make writes synchronous?
For now it only concerns SATA.

Thank you,
Kyungmin Park

diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index 3b927be..cce0618 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -3221,6 +3221,13 @@ static inline void __generic_make_request(struct bio *bio
        if (bio_check_eod(bio, nr_sectors))
                goto end_io;

+#if 1
+       /* FIXME simple hack */
+       if (MAJOR(bio->bi_bdev->bd_dev) == 8 && bio_data_dir(bio) == WRITE) {
+               /* WRITE_SYNC */
+               bio->bi_rw |= (1 << BIO_RW_SYNC);
+       }
+#endif
        /*
         * Resolve the mapping until finished. (drivers are
         * still free to implement/resolve their own stacking


* Re: solid state drive access and context switching
       [not found] <fa.4uUCLsuQFjc4FtQYCBYK6kY9TiU@ifi.uio.no>
@ 2007-12-05  1:11 ` Robert Hancock
  0 siblings, 0 replies; 17+ messages in thread
From: Robert Hancock @ 2007-12-05  1:11 UTC (permalink / raw)
  To: Chris Friesen; +Cc: linux-kernel

Chris Friesen wrote:
> 
> Over on comp.os.linux.development.system someone asked an interesting 
> question, and I thought I'd mention it here.
> 
> Given a fast low-latency solid state drive, would it ever be beneficial 
> to simply wait in the kernel for synchronous read/write calls to 
> complete?  The idea is that you could avoid at least two task context 
> switches, and if the data access can be completed at less cost than 
> those context switches it could be an overall win.
> 
> Has anyone played with this concept?

I don't think most SSDs are fast enough that it would really be worth 
avoiding the context switch for.  I could be wrong, though.

-- 
Robert Hancock      Saskatoon, SK, Canada
To email, remove "nospam" from hancockr@nospamshaw.ca
Home Page: http://www.roberthancock.com/



Thread overview: 17+ messages
2007-12-03 23:06 solid state drive access and context switching Chris Friesen
2007-12-03 23:06 ` Alan Cox
2007-12-04 17:54   ` Jared Hulbert
2007-12-04 20:35     ` Alan Cox
2007-12-04 21:54       ` Jared Hulbert
2007-12-04 22:45         ` Jörn Engel
2007-12-05  0:03           ` Jared Hulbert
2007-12-04 23:24         ` Alan Cox
2007-12-05  0:08           ` Jared Hulbert
2007-12-05  0:24             ` Alan Cox
2007-12-05 22:01               ` Jared Hulbert
2007-12-06  3:51                 ` Kyungmin Park
2007-12-04 20:46     ` Chris Friesen
2007-12-04 21:38       ` Jared Hulbert
2007-12-04 20:52   ` Jeff Garzik
2007-12-04 21:02     ` Alan Cox
     [not found] <fa.4uUCLsuQFjc4FtQYCBYK6kY9TiU@ifi.uio.no>
2007-12-05  1:11 ` Robert Hancock
