* Intel Updates SSDs, Supports TRIM, Faster Writes
@ 2009-11-08 17:57 Bill Davidsen
  2009-11-08 22:30 ` Thomas Fjellstrom
  2009-11-09  1:13 ` Majed B.
  0 siblings, 2 replies; 34+ messages in thread
From: Bill Davidsen @ 2009-11-08 17:57 UTC (permalink / raw)
  To: Linux RAID

For those of us playing with use of SSD for journals on ext[34], this 
does have implications for RAID performance.

http://hardware.slashdot.org/story/09/10/27/1427209/Intel-Updates-SSDs-Supports-TRIM-Faster-Writes

-- 
Bill Davidsen <davidsen@tmr.com>
  Unintended results are the well-earned reward for incompetence.




* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-08 17:57 Intel Updates SSDs, Supports TRIM, Faster Writes Bill Davidsen
@ 2009-11-08 22:30 ` Thomas Fjellstrom
  2009-11-09  1:13 ` Majed B.
  1 sibling, 0 replies; 34+ messages in thread
From: Thomas Fjellstrom @ 2009-11-08 22:30 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Linux RAID

On Sun November 8 2009, Bill Davidsen wrote:
> For those of us playing with use of SSD for journals on ext[34], this
> does have implications for RAID performance.
> 
> http://hardware.slashdot.org/story/09/10/27/1427209/Intel-Updates-SSDs-Supports-TRIM-Faster-Writes
> 

Just don't upgrade it till they fix the bugs ;)

-- 
Thomas Fjellstrom
tfjellstrom@shaw.ca


* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-08 17:57 Intel Updates SSDs, Supports TRIM, Faster Writes Bill Davidsen
  2009-11-08 22:30 ` Thomas Fjellstrom
@ 2009-11-09  1:13 ` Majed B.
  2009-11-09 16:37   ` Chris Worley
  2009-11-09 18:42   ` Greg Freemyer
  1 sibling, 2 replies; 34+ messages in thread
From: Majed B. @ 2009-11-09  1:13 UTC (permalink / raw)
  To: Linux RAID

The firmware which introduced the TRIM command was deemed buggy and
has been pulled out.

Are there any filesystems that are TRIM-aware?

On Sun, Nov 8, 2009 at 8:57 PM, Bill Davidsen <davidsen@tmr.com> wrote:
> For those of us playing with use of SSD for journals on ext[34], this does
> have implications for RAID performance.
>
> http://hardware.slashdot.org/story/09/10/27/1427209/Intel-Updates-SSDs-Supports-TRIM-Faster-Writes
>
> --
> Bill Davidsen <davidsen@tmr.com>
>  Unintended results are the well-earned reward for incompetence.
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>



-- 
       Majed B.
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-09  1:13 ` Majed B.
@ 2009-11-09 16:37   ` Chris Worley
  2009-11-09 16:42     ` Majed B.
  2009-11-09 18:42   ` Greg Freemyer
  1 sibling, 1 reply; 34+ messages in thread
From: Chris Worley @ 2009-11-09 16:37 UTC (permalink / raw)
  To: Majed B.; +Cc: Linux RAID

On Sun, Nov 8, 2009 at 6:13 PM, Majed B. <majedb@gmail.com> wrote:
> The firmware which introduced the TRIM command was deemed buggy and
> has been pulled out.
>
> Are there any filesystems that are TRIM-aware?

Ext4. At that level in the kernel it's referred to as "discard"; it's
not TRIM until it reaches the drive as an ATA command (by way of the
SCSI layer, for SATA devices).

Chris

* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-09 16:37   ` Chris Worley
@ 2009-11-09 16:42     ` Majed B.
  2009-11-09 16:59       ` Chris Worley
  0 siblings, 1 reply; 34+ messages in thread
From: Majed B. @ 2009-11-09 16:42 UTC (permalink / raw)
  To: Chris Worley; +Cc: Linux RAID

Well, SATA uses SCSI emulation so I guess that's no problem, right?

On Mon, Nov 9, 2009 at 7:37 PM, Chris Worley <worleys@gmail.com> wrote:
> On Sun, Nov 8, 2009 at 6:13 PM, Majed B. <majedb@gmail.com> wrote:
>> The firmware which introduced the TRIM command was deemed buggy and
>> has been pulled out.
>>
>> Are there any filesystems that are TRIM-aware?
>
> Ext4 (at that level in the kernel, it's referred to as "discard", it's
> not TRIM until it's issued as a SCSI command).
>
> Chris



-- 
       Majed B.

* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-09 16:42     ` Majed B.
@ 2009-11-09 16:59       ` Chris Worley
  2009-11-10  9:42         ` Kasper Sandberg
  2009-11-10 16:36         ` Martin K. Petersen
  0 siblings, 2 replies; 34+ messages in thread
From: Chris Worley @ 2009-11-09 16:59 UTC (permalink / raw)
  To: Majed B.; +Cc: Linux RAID

On Mon, Nov 9, 2009 at 9:42 AM, Majed B. <majedb@gmail.com> wrote:
> Well, SATA uses SCSI emulation so I guess that's no problem, right?

The only problem is that SSDs put Solid State Storage (SSS) behind
SATA/SAS controllers. While compatible w/ old disk technology, that
severely limits performance (i.e. none of these SSDs do even
300MB/s, while SSS drives do 800MB/s).  The initial 2.6.27
drivers and ext4 "discard" worked very well with forward-thinking SSS
not encumbered by old controller technology, but SSDs were not
able to handle it well:

http://lwn.net/Articles/347511/

So it looks like "design by committee" Linux is well behind Windows 7,
while Linux contemplates slowing new technology down to optimize for
ill-designed SSD's.

Be glad "thumb drives" didn't try to be floppy-drive-compatible!!!

Chris

* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-09  1:13 ` Majed B.
  2009-11-09 16:37   ` Chris Worley
@ 2009-11-09 18:42   ` Greg Freemyer
  1 sibling, 0 replies; 34+ messages in thread
From: Greg Freemyer @ 2009-11-09 18:42 UTC (permalink / raw)
  To: Majed B.; +Cc: Linux RAID

On Sun, Nov 8, 2009 at 8:13 PM, Majed B. <majedb@gmail.com> wrote:
> The firmware which introduced the TRIM command was deemed buggy and
> has been pulled out.
>
> Are there any filesystems that are TRIM-aware?
>
> On Sun, Nov 8, 2009 at 8:57 PM, Bill Davidsen <davidsen@tmr.com> wrote:
>> For those of us playing with use of SSD for journals on ext[34], this does
>> have implications for RAID performance.
>>
>> http://hardware.slashdot.org/story/09/10/27/1427209/Intel-Updates-SSDs-Supports-TRIM-Faster-Writes
>>
>> --
>> Bill Davidsen <davidsen@tmr.com>
> --
>       Majed B.

Majed,

There are various ways to address the TRIM issue.

My favorite is to have a once a day (or whatever) process invoked by
cron that scans a filesystem for unused space, then calls trim on all
of the unused chunks.

Mark Lord had this working via fallocate calls from user space a
couple months ago.

See <http://markmail.org/message/rytr4jqx52h2wftm>

fyi: Mark Lord is the hdparm maintainer and I think you can get his
userspace stuff from sourceforge.  I think the kernel code is already
in vanilla.

fyi2: I have not had my hands on a trim capable SSD so I have not
tried Mark's code yet.
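
The batch approach described above (a periodic job that finds unused
space and then trims it) can be sketched roughly as follows. This is
only an illustration, not Mark Lord's code: the extent list, the
function names, and the issue_discard callback are all hypothetical,
and a real tool would obtain the free extents from the filesystem and
send the discard to the device itself.

```python
from typing import Callable, Iterable, List, Tuple

# (start_block, length_in_blocks); a hypothetical representation of free space
Extent = Tuple[int, int]


def coalesce_free_extents(extents: Iterable[Extent]) -> List[Extent]:
    """Merge overlapping or adjacent free extents into maximal ranges."""
    merged: List[Extent] = []
    for start, length in sorted(extents):
        if length <= 0:
            continue  # ignore empty or malformed extents
        if merged and start <= merged[-1][0] + merged[-1][1]:
            # Touches or overlaps the previous range: extend it.
            prev_start, prev_len = merged[-1]
            new_end = max(prev_start + prev_len, start + length)
            merged[-1] = (prev_start, new_end - prev_start)
        else:
            merged.append((start, length))
    return merged


def trim_free_space(
    extents: Iterable[Extent],
    issue_discard: Callable[[int, int], None],
    min_blocks: int = 1,
) -> int:
    """Call issue_discard(start, length) once per merged range.

    Ranges shorter than min_blocks are skipped. Returns the number of
    blocks handed to issue_discard.
    """
    total = 0
    for start, length in coalesce_free_extents(extents):
        if length < min_blocks:
            continue
        issue_discard(start, length)
        total += length
    return total


if __name__ == "__main__":
    # Stand-in for a real discard: just record what would be trimmed.
    issued: List[Extent] = []
    free = [(10, 5), (0, 4), (15, 5), (4, 2), (100, 1)]
    blocks = trim_free_space(free, lambda s, n: issued.append((s, n)), min_blocks=2)
    print(issued, blocks)
```

Merging adjacent extents first keeps the number of discard requests
small, which is the point of running this as an occasional batch job
rather than discarding on every delete.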

Greg

* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-09 16:59       ` Chris Worley
@ 2009-11-10  9:42         ` Kasper Sandberg
  2009-11-10 15:39           ` Chris Worley
  2009-11-10 16:36         ` Martin K. Petersen
  1 sibling, 1 reply; 34+ messages in thread
From: Kasper Sandberg @ 2009-11-10  9:42 UTC (permalink / raw)
  To: Chris Worley; +Cc: Majed B., Linux RAID

On Mon, 2009-11-09 at 09:59 -0700, Chris Worley wrote:
> On Mon, Nov 9, 2009 at 9:42 AM, Majed B. <majedb@gmail.com> wrote:
> > Well, SATA uses SCSI emulation so I guess that's no problem, right?
> 
> The only problem is SSD's put Solid State Storage (SSS) behind
> SATA/SAS controllers... while compatible w/ old disk technology, it
> severely limits performance (i.e. none of these SSD drives do even
> 300MB/s... while SSS drives do 800MB/s).  While the initial 2.6.27
No, around 280MB/s, and obviously they don't do more because of the
simple limitation of the SATA controllers. This also means they don't
need as many channels as other devices.
> drivers and ext4 "discard" worked very well with forward-thinking SSS
> not encumbered by old controller technology... but, SSD's were not
> able to handle it well:
> 
> http://lwn.net/Articles/347511/
> 
> So it looks like "design by committee" Linux is well behind Windows 7,
And how exactly does Windows 7 handle this so much better?
> while Linux contemplates slowing new technology down to optimize for
> ill-designed SSD's.
It does?
> 
> Be glad "thumb drives" didn't try to be floppy-drive-compatible!!!
> 
> Chris

* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10  9:42         ` Kasper Sandberg
@ 2009-11-10 15:39           ` Chris Worley
  2009-11-10 15:43             ` Majed B.
                               ` (2 more replies)
  0 siblings, 3 replies; 34+ messages in thread
From: Chris Worley @ 2009-11-10 15:39 UTC (permalink / raw)
  To: Kasper Sandberg; +Cc: Majed B., Linux RAID

On Tue, Nov 10, 2009 at 2:42 AM, Kasper Sandberg <postmaster@metanurb.dk> wrote:
> On Mon, 2009-11-09 at 09:59 -0700, Chris Worley wrote:
>> On Mon, Nov 9, 2009 at 9:42 AM, Majed B. <majedb@gmail.com> wrote:
>> > Well, SATA uses SCSI emulation so I guess that's no problem, right?
>>
>> The only problem is SSD's put Solid State Storage (SSS) behind
>> SATA/SAS controllers... while compatible w/ old disk technology, it
>> severely limits performance (i.e. none of these SSD drives do even
>> 300MB/s... while SSS drives do 800MB/s).  While the initial 2.6.27
> No, around 280MB/s... and obviously they dont do more, because of the
> simple limitation of the sata controllers.. this also means they dont
> need to do as many channels as other devices..

I'm not sure if you're agreeing or disagreeing here...
280MB/s<300MB/s, due to the "compatibility" based design of SSD's,
while SSS, w/o a legacy controller, can do 800MB/s out of a single
drive.
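
For reference, the 300MB/s ceiling can be checked with a quick
back-of-the-envelope calculation. This assumes the drives sit on a
3 Gbit/s SATA link with 8b/10b line encoding; the thread does not say
which link generation is in use, so that is an assumption, and the
function name is made up for illustration.

```python
def sata_payload_mb_per_s(line_rate_gbit: float) -> float:
    """Usable payload rate in MB/s for a serial link using 8b/10b encoding.

    8b/10b sends 10 bits on the wire for every 8 bits of payload, so
    only 80% of the raw line rate carries data.
    """
    bits_per_s = line_rate_gbit * 1e9
    payload_bits_per_s = bits_per_s * 8 / 10
    return payload_bits_per_s / 8 / 1e6  # bits -> bytes -> MB


if __name__ == "__main__":
    print(sata_payload_mb_per_s(3.0))  # 300.0
```

A drive measured at around 280MB/s is therefore already close to what
such a link can carry before protocol overhead is counted.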

>> drivers and ext4 "discard" worked very well with forward-thinking SSS
>> not encumbered by old controller technology... but, SSD's were not
>> able to handle it well:
>>
>> http://lwn.net/Articles/347511/
>>
>> So it looks like "design by committee" Linux is well behind Windows 7,
> And how exactly does windows 7 handle this so much better?

TRIM is in W7, with NTFS support.  No Linux distro has it.  And by the time
"design by committee" gets through with it, we shouldn't have bothered.

>> while Linux contemplates slowing new technology down to optimize for
>> ill-designed SSD's.
> It does?

Those that speak loudest in the kernel development (and contribute the
most) work for companies like Intel that promote the slower,
controller-based, SSD's.

Chris

* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 15:39           ` Chris Worley
@ 2009-11-10 15:43             ` Majed B.
  2009-11-10 15:58               ` Chris Worley
  2009-11-10 15:48             ` Asdo
  2009-11-10 18:38             ` Kasper Sandberg
  2 siblings, 1 reply; 34+ messages in thread
From: Majed B. @ 2009-11-10 15:43 UTC (permalink / raw)
  To: Linux RAID

Does that mean we won't be able to squeeze the juice out of Intel's
Extreme SSDs on Linux?

What about those of us who use OpenFiler and build their own storage
solutions? We won't be able to provide solutions based on these SSDs
because the kernel support is crap?

I may have clients wanting to mix between SAS/SATA & SSD to load their
main database on the SSDs, but now it seems pointless since the
performance isn't gonna be that great :/

On Tue, Nov 10, 2009 at 6:39 PM, Chris Worley <worleys@gmail.com> wrote:
> On Tue, Nov 10, 2009 at 2:42 AM, Kasper Sandberg <postmaster@metanurb.dk> wrote:
>> On Mon, 2009-11-09 at 09:59 -0700, Chris Worley wrote:
>>> On Mon, Nov 9, 2009 at 9:42 AM, Majed B. <majedb@gmail.com> wrote:
>>> > Well, SATA uses SCSI emulation so I guess that's no problem, right?
>>>
>>> The only problem is SSD's put Solid State Storage (SSS) behind
>>> SATA/SAS controllers... while compatible w/ old disk technology, it
>>> severely limits performance (i.e. none of these SSD drives do even
>>> 300MB/s... while SSS drives do 800MB/s).  While the initial 2.6.27
>> No, around 280MB/s... and obviously they dont do more, because of the
>> simple limitation of the sata controllers.. this also means they dont
>> need to do as many channels as other devices..
>
> I'm not sure if you're agreeing or disagreeing here...
> 280MB/s<300MB/s, due to the "compatibility" based design of SSD's,
> while SSS, w/o a legacy controller, can do 800MB/s out of a single
> drive.
>
>>> drivers and ext4 "discard" worked very well with forward-thinking SSS
>>> not encumbered by old controller technology... but, SSD's were not
>>> able to handle it well:
>>>
>>> http://lwn.net/Articles/347511/
>>>
>>> So it looks like "design by committee" Linux is well behind Windows 7,
>> And how exactly does windows 7 handle this so much better?
>
> TRIM is in W7; NTFS support.  No Linux distro does.  And by the time
> "design by committee" gets through with it,we shouldn't have bothered.
>
>>> while Linux contemplates slowing new technology down to optimize for
>>> ill-designed SSD's.
>> It does?
>
> Those that speak loudest in the kernel development (and contribute the
> most) work for companies like Intel that promote the slower,
> controller-based, SSD's.
>
> Chris



-- 
       Majed B.

* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 15:39           ` Chris Worley
  2009-11-10 15:43             ` Majed B.
@ 2009-11-10 15:48             ` Asdo
  2009-11-10 16:04               ` Chris Worley
  2009-11-10 18:38             ` Kasper Sandberg
  2 siblings, 1 reply; 34+ messages in thread
From: Asdo @ 2009-11-10 15:48 UTC (permalink / raw)
  To: Chris Worley; +Cc: linux-raid

Chris Worley wrote:
> I'm not sure if you're agreeing or disagreeing here...
> 280MB/s<300MB/s, due to the "compatibility" based design of SSD's,
> while SSS, w/o a legacy controller, can do 800MB/s out of a single
> drive.
>   
I have not heard about these SSS you mention.
Do you have a link?

Also, are you sure that the SATA/SCSI layer is the problem? Some hardware
RAIDs can do 800 MB/s sequential, single stream, and they do it through a
SATA/SAS interface to the kernel. If what you say were true, that would
be impossible...


* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 15:43             ` Majed B.
@ 2009-11-10 15:58               ` Chris Worley
  2009-11-10 16:01                 ` Majed B.
  0 siblings, 1 reply; 34+ messages in thread
From: Chris Worley @ 2009-11-10 15:58 UTC (permalink / raw)
  To: Majed B.; +Cc: Linux RAID

On Tue, Nov 10, 2009 at 8:43 AM, Majed B. <majedb@gmail.com> wrote:
> Does that mean we won't be able to squeeze the juice out of Intel's
> Extreme SSDs on Linux?

The limitation is in the design.  You'll be able to get as much
performance as they can offer, given the bad design (of putting SSS
behind legacy controllers).

>
> What about those of us who use OpenFiler and build their own storage
> solutions? We won't be able to provide solutions based on these SSDs
> because the kernel support is crap?

It's sub-optimal, written to make the best of a bad design, limiting
performance of good designs, but not crap.

>
> I may have clients wanting to mix between SAS/SATA & SSD to load their
> main database on the SSDs, but now it seems pointless since the
> performance isn't gonna be that great :/

You can still get much greater performance from SSS designed
correctly.  Just don't do SSD's.

Chris

* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 15:58               ` Chris Worley
@ 2009-11-10 16:01                 ` Majed B.
  2009-11-10 16:15                   ` Robin Hill
  2009-11-10 16:18                   ` Chris Worley
  0 siblings, 2 replies; 34+ messages in thread
From: Majed B. @ 2009-11-10 16:01 UTC (permalink / raw)
  To: Chris Worley; +Cc: Linux RAID

Which disks can provide 2ms response with a read of 250 MB/s and write
of 170 MB/s other than SSDs?!

Are you saying that it doesn't matter whether we use Linux or Windows
with SSDs because the limitation is coming from the disk's controller
itself?

On Tue, Nov 10, 2009 at 6:58 PM, Chris Worley <worleys@gmail.com> wrote:
> On Tue, Nov 10, 2009 at 8:43 AM, Majed B. <majedb@gmail.com> wrote:
>> Does that mean we won't be able to squeeze the juice out of Intel's
>> Extreme SSDs on Linux?
>
> The limitation is in the design.  You'll be able to get as much
> performance as they can offer, given the bad design (of putting SSS
> behind legacy controllers).
>
>>
>> What about those of us who use OpenFiler and build their own storage
>> solutions? We won't be able to provide solutions based on these SSDs
>> because the kernel support is crap?
>
> It's sub-optimal, written to make the best of a bad design, limiting
> performance of good designs, but not crap.
>
>>
>> I may have clients wanting to mix between SAS/SATA & SSD to load their
>> main database on the SSDs, but now it seems pointless since the
>> performance isn't gonna be that great :/
>
> You can still get much greater performance from SSS designed
> correctly.  Just don't do SSD's.
>
> Chris
>>>>> >>> Are there any filesystems that are TRIM-aware?
>>>>> >>
>>>>> >> Ext4 (at that level in the kernel, it's referred to as "discard", it's
>>>>> >> not TRIM until it's issued as a SCSI command).
>>>>> >>
>>>>> >> Chris
>>>>> >>>
>>>>> >>> On Sun, Nov 8, 2009 at 8:57 PM, Bill Davidsen <davidsen@tmr.com> wrote:
>>>>> >>>> For those of us playing with use of SSD for journals on ext[34], this does
>>>>> >>>> have implications for RAID performance.
>>>>> >>>>
>>>>> >>>> http://hardware.slashdot.org/story/09/10/27/1427209/Intel-Updates-SSDs-Supports-TRIM-Faster-Writes
>>>>> >>>>
>



-- 
       Majed B.
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 15:48             ` Asdo
@ 2009-11-10 16:04               ` Chris Worley
  2009-11-11 18:02                 ` Default User
  0 siblings, 1 reply; 34+ messages in thread
From: Chris Worley @ 2009-11-10 16:04 UTC (permalink / raw)
  To: Asdo; +Cc: linux-raid

On Tue, Nov 10, 2009 at 8:48 AM, Asdo <asdo@shiftmail.org> wrote:
> Chris Worley wrote:
>>
>> I'm not sure if you're agreeing or disagreeing here...
>> 280MB/s<300MB/s, due to the "compatibility" based design of SSD's,
>> while SSS, w/o a legacy controller, can do 800MB/s out of a single
>> drive.
>>
>
> I have not heard about these SSS you mention.
> Do you have a link?

All the Fusion-io products (fusionio.com) and TMS's (ramsan.com) RS20
are two examples (not their RAM-based products).  Sun has their
"Sunfire", but I haven't seen that yet.
>
> Also are you sure that the SATA/SCSI layer is the problem? Some hardware
> raids can do 800 MB/s sequential, single stream, and indeed with a SATA/SAS
> interface to the kernel. If what you say was true, that would be
> impossible...

Sequential/streaming performance is a corner case.  There are many
high speed solutions to that (even using rotating media).  I'm talking
random I/O at 128KB blocks at 800MB/s per drive.
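
To put that figure in per-request terms, here is a small worked calculation. It assumes "MB" means 10**6 bytes and "128KB" means 128 * 1024 bytes; the thread does not say which convention is meant, and the function name is illustrative only.

```python
def iops_for_throughput(bytes_per_second: float, block_size_bytes: int) -> float:
    """Requests per second needed to sustain a throughput at a fixed block size."""
    return bytes_per_second / block_size_bytes


# Assumption: MB = 10**6 bytes, KB = 1024 bytes.
random_iops = iops_for_throughput(800 * 10**6, 128 * 1024)
print(f"{random_iops:.0f} random 128KB requests per second")
```

Under those assumptions a single drive has to complete roughly 6,100 random requests every second to sustain the quoted rate.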

Chris
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 16:01                 ` Majed B.
@ 2009-11-10 16:15                   ` Robin Hill
  2009-11-10 16:31                     ` Chris Worley
  2009-11-10 16:18                   ` Chris Worley
  1 sibling, 1 reply; 34+ messages in thread
From: Robin Hill @ 2009-11-10 16:15 UTC (permalink / raw)
  To: Linux RAID

On Tue Nov 10, 2009 at 07:01:02PM +0300, Majed B. wrote:

> Which disks can provide 2ms response with a read of 250 MB/s and write
> of 170 MB/s other than SSDs?!
> 
> Are you saying that it doesn't matter whether we use Linux or Windows
> with SSDs because the limitation is coming from the disk's controller
> itself?
> 
Not exactly - without TRIM support, the drive performance will degrade
over time.  Windows 7 has TRIM implemented for deletes (and formats),
which prevents this degradation.  Initial performance will be the same
on both systems though (as performance is limited by the interface -
SATA 6G is starting to appear though, which will definitely help).

Currently, AFAIK, none of the Linux filesystems support TRIM.  This is
largely due to discussions about the implementation - a TRIM call
requires a full flush of all pending writes (as it's a non-queueable
call) which results in severe performance issues if done synchronously
on deletes.
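
A toy model of why that hurts, not a description of the actual kernel code: assume every non-queueable TRIM forces the queue of pending writes to drain first, and count how many drains a run of deletes causes when TRIM is issued per delete versus once per batch. The function name and the batch size are made up for illustration.

```python
def queue_drains(num_deletes: int, deletes_per_trim: int = 1) -> int:
    """Count queue drains if each TRIM command forces one full drain.

    deletes_per_trim=1 models a synchronous TRIM on every delete;
    larger values model collecting freed ranges and trimming later.
    """
    if num_deletes < 0 or deletes_per_trim < 1:
        raise ValueError("num_deletes must be >= 0 and deletes_per_trim >= 1")
    # Ceiling division: a partial final batch still needs one TRIM.
    return -(-num_deletes // deletes_per_trim)


print(queue_drains(10_000))       # one drain per delete
print(queue_drains(10_000, 512))  # one drain per batch of 512 deletes
```

The model ignores how long each drain takes; it only shows how quickly the number of forced flushes grows when TRIM is tied to individual deletes.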

HTH,
    Robin
-- 
     ___        
    ( ' }     |       Robin Hill        <robin@robinhill.me.uk> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 16:01                 ` Majed B.
  2009-11-10 16:15                   ` Robin Hill
@ 2009-11-10 16:18                   ` Chris Worley
  2009-11-10 18:31                     ` Majed B.
  2009-11-10 18:40                     ` Kasper Sandberg
  1 sibling, 2 replies; 34+ messages in thread
From: Chris Worley @ 2009-11-10 16:18 UTC (permalink / raw)
  To: Majed B.; +Cc: Linux RAID

On Tue, Nov 10, 2009 at 9:01 AM, Majed B. <majedb@gmail.com> wrote:
> Which disks can provide 2ms response with a read of 250 MB/s and write
> of 170 MB/s other than SSDs?!

The drives I use average <50usecs latency at 4KB packets (properly
measured as the complete turn-around time of a single outstanding
I/O), 800MB/s reads and >600MB/s writes at 128KB blocks.
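
A rough sketch of that style of measurement: issue one 4KB read at a time and time each complete round trip. It is illustrative only. It reads through the page cache unless the target is opened with O_DIRECT (omitted here because it needs aligned buffers), so against a regular file it measures the cache rather than the drive. The function name and parameters are assumptions, not anything from the thread.

```python
import os
import random
import time


def single_outstanding_read_latency(path, block_size=4096, samples=1000, seed=0):
    """Return the mean seconds per read with exactly one I/O in flight."""
    rng = random.Random(seed)
    fd = os.open(path, os.O_RDONLY)
    try:
        # lseek to the end works for regular files and for block devices.
        size = os.lseek(fd, 0, os.SEEK_END)
        blocks = size // block_size
        if blocks == 0:
            raise ValueError("target is smaller than one block")
        total = 0.0
        for _ in range(samples):
            offset = rng.randrange(blocks) * block_size
            start = time.perf_counter()
            data = os.pread(fd, block_size, offset)  # returns when the read completes
            total += time.perf_counter() - start
            if len(data) != block_size:
                raise OSError("short read")
        return total / samples
    finally:
        os.close(fd)
```

Because each read is issued only after the previous one returns, the result is a turn-around time rather than a throughput figure divided by queue depth.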

>
> Are you saying that it doesn't matter whether we use Linux or Windows
> with SSDs because the limitation is coming from the disk's controller
> itself?

To some degree, yes, when using SSD's behind a controller, the
controller is the biggest performance issue, and given they use
chicklets for processors, they all hamper performance given the speed
potential of the underlying storage.

As none of the enterprise distros are handling TRIM yet, W7 can claim
it was first.  In Linux, putting together a TRIM-capable kernel is
currently a manual job.  Only ext4 supports it (strangely, FAT
supported it, then the code was pulled... XFS may support it, but I
believe that's still in the works), so you have the additional problem
that ext4 has some maturity issues.  Porting "discard" to ext2/3 would
not be too difficult, especially w/o journal considerations.

Chris
>
> On Tue, Nov 10, 2009 at 6:58 PM, Chris Worley <worleys@gmail.com> wrote:
>> On Tue, Nov 10, 2009 at 8:43 AM, Majed B. <majedb@gmail.com> wrote:
>>> Does that mean we won't be able to squeeze the juice out of Intel's
>>> Extreme SSDs on Linux?
>>
>> The limitation is in the design.  You'll be able to get as much
>> performance as they can offer, given the bad design (of putting SSS
>> behind legacy controllers).
>>
>>>
>>> What about those of us who use OpenFiler and build their own storage
>>> solutions? We won't be able to provide solutions based on these SSDs
>>> because the kernel support is crap?
>>
>> It's sub-optimal, written to make the best of a bad design, limiting
>> performance of good designs, but not crap.
>>
>>>
>>> I may have clients wanting to mix between SAS/SATA & SSD to load their
>>> main database on the SSDs, but now it seems pointless since the
>>> performance isn't gonna be that great :/
>>
>> You can still get much greater performance from SSS designed
>> correctly.  Just don't do SSD's.
>>
>> Chris
>>>
>>> On Tue, Nov 10, 2009 at 6:39 PM, Chris Worley <worleys@gmail.com> wrote:
>>>> On Tue, Nov 10, 2009 at 2:42 AM, Kasper Sandberg <postmaster@metanurb.dk> wrote:
>>>>> On Mon, 2009-11-09 at 09:59 -0700, Chris Worley wrote:
>>>>>> On Mon, Nov 9, 2009 at 9:42 AM, Majed B. <majedb@gmail.com> wrote:
>>>>>> > Well, SATA uses SCSI emulation so I guess that's no problem, right?
>>>>>>
>>>>>> The only problem is SSD's put Solid State Storage (SSS) behind
>>>>>> SATA/SAS controllers... while compatible w/ old disk technology, it
>>>>>> severely limits performance (i.e. none of these SSD drives do even
>>>>>> 300MB/s... while SSS drives do 800MB/s).  While the initial 2.6.27
>>>>> No, around 280MB/s... and obviously they dont do more, because of the
>>>>> simple limitation of the sata controllers.. this also means they dont
>>>>> need to do as many channels as other devices..
>>>>
>>>> I'm not sure if you're agreeing or disagreeing here...
>>>> 280MB/s<300MB/s, due to the "compatibility" based design of SSD's,
>>>> while SSS, w/o a legacy controller, can do 800MB/s out of a single
>>>> drive.
>>>>
>>>>>> drivers and ext4 "discard" worked very well with forward-thinking SSS
>>>>>> not encumbered by old controller technology... but, SSD's were not
>>>>>> able to handle it well:
>>>>>>
>>>>>> http://lwn.net/Articles/347511/
>>>>>>
>>>>>> So it looks like "design by committee" Linux is well behind Windows 7,
>>>>> And how exactly does windows 7 handle this so much better?
>>>>
>>>> TRIM is in W7; NTFS support.  No Linux distro does.  And by the time
>>>> "design by committee" gets through with it,we shouldn't have bothered.
>>>>
>>>>>> while Linux contemplates slowing new technology down to optimize for
>>>>>> ill-designed SSD's.
>>>>> It does?
>>>>
>>>> Those that speak loudest in the kernel development (and contribute the
>>>> most) work for companies like Intel that promote the slower,
>>>> controller-based, SSD's.
>>>>
>>>> Chris
>>>>>>
>>>>>> Be glad "thumb drives" didn't try to be floppy-drive-compatible!!!
>>>>>>
>>>>>> Chris
>>>>>> >
>>>>>> > On Mon, Nov 9, 2009 at 7:37 PM, Chris Worley <worleys@gmail.com> wrote:
>>>>>> >> On Sun, Nov 8, 2009 at 6:13 PM, Majed B. <majedb@gmail.com> wrote:
>>>>>> >>> The firmware which introduced the TRIM command was deemed buggy and
>>>>>> >>> has been pulled out.
>>>>>> >>>
>>>>>> >>> Are there any filesystems that are TRIM-aware?
>>>>>> >>
>>>>>> >> Ext4 (at that level in the kernel, it's referred to as "discard", it's
>>>>>> >> not TRIM until it's issued as a SCSI command).
>>>>>> >>
>>>>>> >> Chris
>>>>>> >>>
>>>>>> >>> On Sun, Nov 8, 2009 at 8:57 PM, Bill Davidsen <davidsen@tmr.com> wrote:
>>>>>> >>>> For those of us playing with use of SSD for journals on ext[34], this does
>>>>>> >>>> have implications for RAID performance.
>>>>>> >>>>
>>>>>> >>>> http://hardware.slashdot.org/story/09/10/27/1427209/Intel-Updates-SSDs-Supports-TRIM-Faster-Writes
>>>>>> >>>>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 16:15                   ` Robin Hill
@ 2009-11-10 16:31                     ` Chris Worley
  0 siblings, 0 replies; 34+ messages in thread
From: Chris Worley @ 2009-11-10 16:31 UTC (permalink / raw)
  To: Linux RAID

On Tue, Nov 10, 2009 at 9:15 AM, Robin Hill <robin@robinhill.me.uk> wrote:
> On Tue Nov 10, 2009 at 07:01:02PM +0300, Majed B. wrote:
>
>> Which disks can provide 2ms response with a read of 250 MB/s and write
>> of 170 MB/s other than SSDs?!
>>
>> Are you saying that it doesn't matter whether we use Linux or Windows
>> with SSDs because the limitation is coming from the disk's controller
>> itself?
>>
> Not exactly - without TRIM support, the drive performance will degrade
> over time.

Not true.  First, it never affects read performance (you didn't
qualify).  Given that SSD's have a management layer that rotating
media doesn't, it is a very complex issue, and dependent on the SSD
management layer's algorithms.  For many of these algorithms, most
typical write usage performance is completely unaffected.  The biggest
performance effect is seen in benchmarks that were designed w/
rotating media in mind and make assumptions that don't apply to real
applications, but that were justifiable given rotating media's
simplicity.  Those assumptions are no longer justifiable given SSD's
management layer; benchmarks must be re-coded to exhibit more
application-like behavior.

>  Windows 7 has TRIM implemented for deletes (and formats),
> which prevents this degradation.  Initial performance will be the same
> on both systems though (as performance is limited by the interface -
> SATA 6G is starting to appear though, which will definitely help).
>
> Currently, AFAIK, none of the Linux filesystems support TRIM.

ext4 supports discard.  I've been using it successfully since 2.6.27.
FAT did too, but it was pulled.

>  This is
> largely due to discussions about the implementation - a TRIM call
> requires a full flush of all pending writes (as it's a non-queueable
> call) which results in severe performance issues if done synchronously
> on deletes.

This only causes performance issues for ill-designed SSD's that put
themselves behind legacy controllers for compatibility reasons.
Conceptually, for SSS, the earlier the TRIM the better (the sooner the
management layer can use that information), no matter the perceived
fragmentation.  The performance issue only arises with the poor
compatibility design, but, given they are the loudest voices in the
Linux community, their design will prevail.

Chris
>
> HTH,
>    Robin
> --
>     ___
>    ( ' }     |       Robin Hill        <robin@robinhill.me.uk> |
>   / / )      | Little Jim says ....                            |
>  // !!       |      "He fallen in de water !!"                 |
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-09 16:59       ` Chris Worley
  2009-11-10  9:42         ` Kasper Sandberg
@ 2009-11-10 16:36         ` Martin K. Petersen
  2009-11-10 17:22           ` Chris Worley
  1 sibling, 1 reply; 34+ messages in thread
From: Martin K. Petersen @ 2009-11-10 16:36 UTC (permalink / raw)
  To: Chris Worley; +Cc: Majed B., Linux RAID

>>>>> "Chris" == Chris Worley <worleys@gmail.com> writes:

Chris> The only problem is SSD's put Solid State Storage (SSS) behind
Chris> SATA/SAS controllers... while compatible w/ old disk technology,
Chris> it severely limits performance (i.e. none of these SSD drives do
Chris> even 300MB/s... while SSS drives do 800MB/s).

You are arguing that the SATA/SCSI protocols are inhibiting factors on
the grounds that PCIe solid state devices are faster.

Performance inside a flash device is gated by the number of channels you
run in parallel.  There is not much point in increasing the number of
channels if your physical interconnect (3Gbps SATA, say) can't handle
the traffic.  Hence the drive towards 6Gbps interconnects and beyond for
both SATA and SAS.
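
The interconnect ceiling is easy to work out. SATA uses 8b/10b encoding, so only 8 of every 10 bits on the wire carry payload. The per-channel flash throughput below is a made-up number purely for illustration; real values vary by NAND and controller.

```python
import math


def sata_payload_mb_per_s(line_rate_gbps: float) -> float:
    """Payload bandwidth in MB/s (10**6 bytes) after 8b/10b encoding."""
    payload_bits_per_s = line_rate_gbps * 1e9 * 8 / 10
    return payload_bits_per_s / 8 / 1e6


def channels_to_saturate(link_mb_per_s: float, channel_mb_per_s: float) -> int:
    """Smallest number of parallel flash channels that fills the link."""
    return math.ceil(link_mb_per_s / channel_mb_per_s)


HYPOTHETICAL_CHANNEL_MB_PER_S = 30.0  # assumption, not a measured figure

for gbps in (3.0, 6.0):
    link = sata_payload_mb_per_s(gbps)
    n = channels_to_saturate(link, HYPOTHETICAL_CHANNEL_MB_PER_S)
    print(f"{gbps:g} Gbps -> {link:.0f} MB/s, saturated by {n} channels")
```

The 300 MB/s payload limit of a 3Gbps link lines up with the roughly 280 MB/s drives mentioned earlier in the thread; past that point extra channels buy nothing until the link itself gets faster.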

Also, not all SSS boards present a memory-style device to the host.
Several shipping SSS boards use a regular SAS HBA backed by multiple
SATA/SAS targets which in turn comprise multiple flash channels.  And
the performance of these devices is absolutely on par with the
memory-based devices.  Without requiring proprietary drivers, and
without reinventing filesystems and I/O stack.

We have been pushing tens of gigabytes per second through the storage
stack for years when connected to arrays which - given their large
non-volatile caches - are virtually indistinguishable from SSDs.  We're
constantly tweaking and tuning.  Jens has done a lot of work to bring
down command latency, I have worked on storage topology which allows us
to uniquely identify the characteristics of the physical storage device
so we can issue I/O in an optimal fashion.

Note that I don't think that memory-based SSS devices are without merit.
But it's baloney to claim that a storage-flavored interface inherently
means bad performance.


Chris> So it looks like "design by committee" Linux is well behind
Chris> Windows 7, while Linux contemplates slowing new technology down
Chris> to optimize for ill-designed SSD's.

We're not slowing anything, nor are we optimizing for ill-designed SSDs.

Because initial TRIM performance was absolutely appalling there was a
lot of discussion about the merits of doing weekly scrubs instead of
issuing TRIM on the fly.  However, Windows 7 shipped issuing TRIM in
realtime which means that all the early SSDs with lame duck DSM
performance are headed straight for the garbage bin.
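
The two strategies being debated can be sketched as follows. This is a simplified illustration, not the kernel's implementation: on-the-fly discard sends one command per freed extent, while a periodic scrub can first merge adjacent or overlapping free extents and send fewer, larger ranges. Extents are (start_block, length) tuples and all names are hypothetical.

```python
def coalesce_extents(extents):
    """Merge overlapping or adjacent (start, length) extents into sorted ranges."""
    merged = []
    for start, length in sorted(extents):
        if length <= 0:
            continue  # ignore empty or malformed extents
        end = start + length
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous range: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e - s) for s, e in merged]


freed = [(100, 8), (108, 8), (300, 16), (116, 4), (310, 10)]
print(len(freed), "commands if discarded on the fly")
print(coalesce_extents(freed), "if batched")
```

The trade-off is the one described above: batching sends fewer commands to a device that handles them slowly, at the cost of the device learning about free space later.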

Furthermore, unlike Windows 7 we can't pretend everything is desktop
class ATA.  We've spent a lot of time making sure that our block layer
discard support works equally well for both ATA DSM (TRIM) as well as
SCSI WRITE SAME and UNMAP used by high-end arrays.  All three commands
have been moving targets and none of them are technically set in stone
in their respective standards bodies yet.
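
On current Linux kernels, the block layer exports whether a device accepts discards through sysfs, regardless of which of the three commands sits underneath. The sketch below assumes the `queue/discard_max_bytes` attribute is present (a value of 0 means discard is unsupported); the helper name and the `sysfs_root` parameter are illustrative.

```python
import os


def discard_capable_devices(sysfs_root="/sys/block"):
    """Map device name -> discard_max_bytes for devices that accept discards."""
    capable = {}
    for name in sorted(os.listdir(sysfs_root)):
        attr = os.path.join(sysfs_root, name, "queue", "discard_max_bytes")
        try:
            with open(attr) as fh:
                max_bytes = int(fh.read().strip())
        except (OSError, ValueError):
            continue  # attribute missing or unreadable: treat as unknown
        if max_bytes > 0:
            capable[name] = max_bytes
    return capable
```

Calling `discard_capable_devices()` on a Linux host lists the devices for which the kernel will pass discards down; the `sysfs_root` argument exists only so the function can be pointed at a test directory.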

So I think it would be a stretch to claim that TRIM is well tested and
stable in the industry.  Intel just pulled their latest X25-M firmware
because of problems with Windows 7...

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 16:36         ` Martin K. Petersen
@ 2009-11-10 17:22           ` Chris Worley
  2009-11-10 20:11             ` Martin K. Petersen
  0 siblings, 1 reply; 34+ messages in thread
From: Chris Worley @ 2009-11-10 17:22 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: Majed B., Linux RAID

On Tue, Nov 10, 2009 at 9:36 AM, Martin K. Petersen
<martin.petersen@oracle.com> wrote:
>>>>>> "Chris" == Chris Worley <worleys@gmail.com> writes:
>
> Chris> The only problem is SSD's put Solid State Storage (SSS) behind
> Chris> SATA/SAS controllers... while compatible w/ old disk technology,
> Chris> it severely limits performance (i.e. none of these SSD drives do
> Chris> even 300MB/s... while SSS drives do 800MB/s).
>
> You are arguing that the SATA/SCSI protocols are inhibiting factors on
> the grounds that PCIe solid state devices are faster.
>
> Performance inside a flash device is gated by the number of channels you
> run in parallel.  There is not much point in increasing the number of
> channels if your physical interconnect (3Gbps SATA, say) can't handle
> the traffic.  Hence the drive towards 6Gbps interconnects and beyond for
> both SATA and SAS.

Absolutely agreed: the SSD manufacturers will limit their NAND
performance given the performance limitations of the controller
front-end.  Also, given their management layer is an on-board ASIC,
they further limit their performance in this design.

>
> Also, not all SSS boards present a memory-style device to the host.
> Several shipping SSS boards use a regular SAS HBA backed by multiple
> SATA/SAS targets which in turn comprise multiple flash channels.  And
> the performance of these devices is absolutely on par with the
> memory-based devices.  Without requiring proprietary drivers, and
> without reinventing filesystems and I/O stack.

I'm not talking about memory-based or -looking devices.  A block
device is all you need, and you don't have to re-write file systems to
put one atop a block device.

Those using legacy controller technology can overcome the issue by
using multiple devices.  We've been talking single device performance.
 I can get 6GB/s using 8 SSS drives.  Scalability is much easier when
you start with really fast individual components.  So, legacy
controllers are still a bad design.

>
> We have been pushing tens of gigabytes per second through the storage
> stack for years when connected to arrays which - given their large
> non-volatile caches - are virtually indistinguishable from SSDs.  We're
> constantly tweaking and tuning.  Jens has done a lot of work to bring
> down command latency, I have worked on storage topology which allows us
> to uniquely identify the characteristics of the physical storage device
> so we can issue I/O in an optimal fashion.

And I do appreciate all your work.  I fear, in this case, discard will
be optimized for the slower technology... we won't be getting all
that's available from it.
>
> Note that I don't think that memory-based SSS devices are without merit.

Let's call it CPU-based.  "Memory-based" sounds like RAM-based
storage... we're not talking about that.

> But it's baloney to claim that a storage-flavored interface inherently
> means bad performance.

You need an epiphany here.  Between the SAS/SATA controllers and the
on-board drive logic, SSD's are a bad design when it comes to
performance.  They are dwarfed, in performance, by CPU-based
controllers.  CPU's have much more performance for handling the
management needed by NAND, and there are so many cores these days
going unused.

SSD's do win the "compatibility" argument.  It's too bad we didn't
invent thumb drives that were floppy compatible ;)

>
>
> Chris> So it looks like "design by committee" Linux is well behind
> Chris> Windows 7, while Linux contemplates slowing new technology down
> Chris> to optimize for ill-designed SSD's.
>
> We're not slowing anything, nor are we optimizing for ill-designed SSDs.
>
> Because initial TRIM performance was absolutely appalling

Only on SSD's behind legacy controllers.  It worked great as-is with SSS.

> there was a
> lot of discussion about the merits of doing weekly scrubs instead of
> issuing TRIM on the fly.  However, Windows 7 shipped issuing TRIM in
> realtime which means that all the early SSDs with lame duck DSM
> performance are headed straight for the garbage bin.

Too bad the legacy design doesn't go with them ;)

Chris
>
> Furthermore, unlike Windows 7 we can't pretend everything is desktop
> class ATA.  We've spent a lot of time making sure that our block layer
> discard support works equally well for both ATA DSM (TRIM) as well as
> SCSI WRITE SAME and UNMAP used by high-end arrays.  All three commands
> have been moving targets and none of them are technically set in stone
> in their respective standards bodies yet.
>
> So I think it would be a stretch to claim that TRIM is well tested and
> stable in the industry.  Intel just pulled their latest X25-M firmware
> because of problems with Windows 7...
>
> --
> Martin K. Petersen      Oracle Linux Engineering
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 16:18                   ` Chris Worley
@ 2009-11-10 18:31                     ` Majed B.
  2009-11-10 23:03                       ` Mathieu Chouquet-Stringer
  2009-11-10 18:40                     ` Kasper Sandberg
  1 sibling, 1 reply; 34+ messages in thread
From: Majed B. @ 2009-11-10 18:31 UTC (permalink / raw)
  To: Linux RAID

Chris,

Do you mind sharing the drive models & controllers you're using that
give you 800 MB/s?

On Tue, Nov 10, 2009 at 7:18 PM, Chris Worley <worleys@gmail.com> wrote:
> On Tue, Nov 10, 2009 at 9:01 AM, Majed B. <majedb@gmail.com> wrote:
>> Which disks can provide 2ms response with a read of 250 MB/s and write
>> of 170 MB/s other than SSDs?!
>
> The drives I use average <50usecs latency at 4KB packets (properly
> measured as the complete turn-around time of a single outstanding
> I/O), 800MB/s reads and >600MB/s writes at 128KB blocks.
>
>>
>> Are you saying that it doesn't matter whether we use Linux or Windows
>> with SSDs because the limitation is coming from the disk's controller
>> itself?
>
> To some degree, yes, when using SSD's behind a controller, the
> controller is the biggest performance issue, and given they use
> chicklets for processors, they all hamper performance given the speed
> potential of the underlying storage.
>
> As none of the enterprise distros are handling TRIM yet, W7 can claim
> it was first, and putting together a TRIM-capable kernel is manual
> currently in Linux, and given only ext4 supports it (strangely, FAT
> supported it, then the code was pulled... XFS may support it, but I
> believe that's still in the works), you have the additional problem
> that ext4 has some maturity issues.  Porting "discard" to ext2/3 would
> not be too difficult, especially w/o journal considerations.
>
> Chris
>>
>> On Tue, Nov 10, 2009 at 6:58 PM, Chris Worley <worleys@gmail.com> wrote:
>>> On Tue, Nov 10, 2009 at 8:43 AM, Majed B. <majedb@gmail.com> wrote:
>>>> Does that mean we won't be able to squeeze the juice out of Intel's
>>>> Extreme SSDs on Linux?
>>>
>>> The limitation is in the design.  You'll be able to get as much
>>> performance as they can offer, given the bad design (of putting SSS
>>> behind legacy controllers).
-- 
       Majed B.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 15:39           ` Chris Worley
  2009-11-10 15:43             ` Majed B.
  2009-11-10 15:48             ` Asdo
@ 2009-11-10 18:38             ` Kasper Sandberg
  2 siblings, 0 replies; 34+ messages in thread
From: Kasper Sandberg @ 2009-11-10 18:38 UTC (permalink / raw)
  To: Chris Worley; +Cc: Majed B., Linux RAID

On Tue, 2009-11-10 at 08:39 -0700, Chris Worley wrote:
> On Tue, Nov 10, 2009 at 2:42 AM, Kasper Sandberg <postmaster@metanurb.dk> wrote:
> > On Mon, 2009-11-09 at 09:59 -0700, Chris Worley wrote:
> >> On Mon, Nov 9, 2009 at 9:42 AM, Majed B. <majedb@gmail.com> wrote:
> >> > Well, SATA uses SCSI emulation so I guess that's no problem, right?
> >>
> >> The only problem is SSD's put Solid State Storage (SSS) behind
> >> SATA/SAS controllers... while compatible w/ old disk technology, it
> >> severely limits performance (i.e. none of these SSD drives do even
> >> 300MB/s... while SSS drives do 800MB/s).  While the initial 2.6.27
> > No, around 280MB/s... and obviously they dont do more, because of the
> > simple limitation of the sata controllers.. this also means they dont
> > need to do as many channels as other devices..
> 
> I'm not sure if you're agreeing or disagreeing here...
> 280MB/s<300MB/s, due to the "compatibility" based design of SSD's,
> while SSS, w/o a legacy controller, can do 800MB/s out of a single
> drive.
A single drive which uses _A LOT_ more channels than the SSDs doing SATA,
and that's why it is faster: you get extra performance, for a price.

> 
> >> drivers and ext4 "discard" worked very well with forward-thinking SSS
> >> not encumbered by old controller technology... but, SSD's were not
> >> able to handle it well:
> >>
> >> http://lwn.net/Articles/347511/
> >>
> >> So it looks like "design by committee" Linux is well behind Windows 7,
> > And how exactly does windows 7 handle this so much better?
> 
> TRIM is in W7; NTFS support.  No Linux distro does.  And by the time
> "design by committee" gets through with it,we shouldn't have bothered.
> 
> >> while Linux contemplates slowing new technology down to optimize for
> >> ill-designed SSD's.
> > It does?
> 
> Those that speak loudest in the kernel development (and contribute the
> most) work for companies like Intel that promote the slower,
> controller-based, SSD's.

These three comments make no sense.
> 
> Chris
> >>
> >> Be glad "thumb drives" didn't try to be floppy-drive-compatible!!!
> >>
> >> Chris
> >> >
> >> > On Mon, Nov 9, 2009 at 7:37 PM, Chris Worley <worleys@gmail.com> wrote:
> >> >> On Sun, Nov 8, 2009 at 6:13 PM, Majed B. <majedb@gmail.com> wrote:
> >> >>> The firmware which introduced the TRIM command was deemed buggy and
> >> >>> has been pulled out.
> >> >>>
> >> >>> Are there any filesystems that are TRIM-aware?
> >> >>
> >> >> Ext4 (at that level in the kernel, it's referred to as "discard", it's
> >> >> not TRIM until it's issued as a SCSI command).
> >> >>
> >> >> Chris
> >> >>>
> >> >>> On Sun, Nov 8, 2009 at 8:57 PM, Bill Davidsen <davidsen@tmr.com> wrote:
> >> >>>> For those of us playing with use of SSD for journals on ext[34], this does
> >> >>>> have implications for RAID performance.
> >> >>>>
> >> >>>> http://hardware.slashdot.org/story/09/10/27/1427209/Intel-Updates-SSDs-Supports-TRIM-Faster-Writes
> >> >>>>
> >> >>>> --
> >> >>>> Bill Davidsen <davidsen@tmr.com>
> >> >>>>  Unintended results are the well-earned reward for incompetence.
> >> >>>>
> >> >>>>
> >> >>>> --
> >> >>>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> >> >>>> the body of a message to majordomo@vger.kernel.org
> >> >>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >> >>>>
> >> >>>
> >> >>>
> >> >>>
> >> >>> --
> >> >>>       Majed B.
> >> >>>
> >> >>
> >> >
> >> >
> >> >
> >> > --
> >> >       Majed B.
> >> >
> >
> >

* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 16:18                   ` Chris Worley
  2009-11-10 18:31                     ` Majed B.
@ 2009-11-10 18:40                     ` Kasper Sandberg
  1 sibling, 0 replies; 34+ messages in thread
From: Kasper Sandberg @ 2009-11-10 18:40 UTC (permalink / raw)
  To: Chris Worley; +Cc: Majed B., Linux RAID

On Tue, 2009-11-10 at 09:18 -0700, Chris Worley wrote:
> On Tue, Nov 10, 2009 at 9:01 AM, Majed B. <majedb@gmail.com> wrote:
> > Which disks can provide 2ms response with a read of 250 MB/s and write
> > of 170 MB/s other than SSDs?!
> 
> The drives I use average <50usecs latency at 4KB packets (properly
> measured as the complete turn-around time of a single outstanding
> I/O), 800MB/s reads and >600MB/s writes at 128KB blocks.
> 
> >
> > Are you saying that it doesn't matter whether we use Linux or Windows
> > with SSDs because the limitation is coming from the disk's controller
> > itself?
> 
> To some degree, yes, when using SSD's behind a controller, the
> controller is the biggest performance issue, and given they use
> chicklets for processors, they all hamper performance given the speed
> potential of the underlying storage.
> 
> As none of the enterprise distros are handling TRIM yet, W7 can claim
> it was first, and putting together a TRIM-capable kernel is manual
Except it wasn't. It may be earlier than the enterprise distros, but
that's not first.
> currently in Linux, and given only ext4 supports it (strangely, FAT
> supported it, then the code was pulled... XFS may support it, but I
> believe that's still in the works), you have the additional problem
> that ext4 has some maturity issues.  Porting "discard" to ext2/3 would
> not be too difficult, especially w/o journal considerations.
And given W7 supports it, it is going to have the same issues that
Linux faces. I don't know what solution Microsoft has chosen, but that
doesn't mean Linux shouldn't choose the best one.

> 
> Chris
> >
> > On Tue, Nov 10, 2009 at 6:58 PM, Chris Worley <worleys@gmail.com> wrote:
> >> On Tue, Nov 10, 2009 at 8:43 AM, Majed B. <majedb@gmail.com> wrote:
> >>> Does that mean we won't be able to squeeze the juice out of Intel's
> >>> Extreme SSDs on Linux?
> >>
> >> The limitation is in the design.  You'll be able to get as much
> >> performance as they can offer, given the bad design (of putting SSS
> >> behind legacy controllers).
> >>
> >>>
> >>> What about those of us who use OpenFiler and build their own storage
> >>> solutions? We won't be able to provide solutions based on these SSDs
> >>> because the kernel support is crap?
> >>
> >> It's sub-optimal, written to make the best of a bad design, limiting
> >> performance of good designs, but not crap.
> >>
> >>>
> >>> I may have clients wanting to mix between SAS/SATA & SSD to load their
> >>> main database on the SSDs, but now it seems pointless since the
> >>> performance isn't gonna be that great :/
> >>
> >> You can still get much greater performance from SSS designed
> >> correctly.  Just don't do SSD's.
> >>
> >> Chris
> >>>
> >>> On Tue, Nov 10, 2009 at 6:39 PM, Chris Worley <worleys@gmail.com> wrote:
> >>>> On Tue, Nov 10, 2009 at 2:42 AM, Kasper Sandberg <postmaster@metanurb.dk> wrote:
> >>>>> On Mon, 2009-11-09 at 09:59 -0700, Chris Worley wrote:
> >>>>>> On Mon, Nov 9, 2009 at 9:42 AM, Majed B. <majedb@gmail.com> wrote:
> >>>>>> > Well, SATA uses SCSI emulation so I guess that's no problem, right?
> >>>>>>
> >>>>>> The only problem is SSD's put Solid State Storage (SSS) behind
> >>>>>> SATA/SAS controllers... while compatible w/ old disk technology, it
> >>>>>> severely limits performance (i.e. none of these SSD drives do even
> >>>>>> 300MB/s... while SSS drives do 800MB/s).  While the initial 2.6.27
> >>>>> No, around 280MB/s... and obviously they dont do more, because of the
> >>>>> simple limitation of the sata controllers.. this also means they dont
> >>>>> need to do as many channels as other devices..
> >>>>
> >>>> I'm not sure if you're agreeing or disagreeing here...
> >>>> 280MB/s<300MB/s, due to the "compatibility" based design of SSD's,
> >>>> while SSS, w/o a legacy controller, can do 800MB/s out of a single
> >>>> drive.
> >>>>
> >>>>>> drivers and ext4 "discard" worked very well with forward-thinking SSS
> >>>>>> not encumbered by old controller technology... but, SSD's were not
> >>>>>> able to handle it well:
> >>>>>>
> >>>>>> http://lwn.net/Articles/347511/
> >>>>>>
> >>>>>> So it looks like "design by committee" Linux is well behind Windows 7,
> >>>>> And how exactly does windows 7 handle this so much better?
> >>>>
> >>>> TRIM is in W7; NTFS support.  No Linux distro does.  And by the time
> >>>> "design by committee" gets through with it,we shouldn't have bothered.
> >>>>
> >>>>>> while Linux contemplates slowing new technology down to optimize for
> >>>>>> ill-designed SSD's.
> >>>>> It does?
> >>>>
> >>>> Those that speak loudest in the kernel development (and contribute the
> >>>> most) work for companies like Intel that promote the slower,
> >>>> controller-based, SSD's.
> >>>>
> >>>> Chris
> >>>>>>
> >>>>>> Be glad "thumb drives" didn't try to be floppy-drive-compatible!!!
> >>>>>>
> >>>>>> Chris
> >>>>>> >
> >>>>>> > On Mon, Nov 9, 2009 at 7:37 PM, Chris Worley <worleys@gmail.com> wrote:
> >>>>>> >> On Sun, Nov 8, 2009 at 6:13 PM, Majed B. <majedb@gmail.com> wrote:
> >>>>>> >>> The firmware which introduced the TRIM command was deemed buggy and
> >>>>>> >>> has been pulled out.
> >>>>>> >>>
> >>>>>> >>> Are there any filesystems that are TRIM-aware?
> >>>>>> >>
> >>>>>> >> Ext4 (at that level in the kernel, it's referred to as "discard", it's
> >>>>>> >> not TRIM until it's issued as a SCSI command).
> >>>>>> >>
> >>>>>> >> Chris
> >>>>>> >>>
> >>>>>> >>> On Sun, Nov 8, 2009 at 8:57 PM, Bill Davidsen <davidsen@tmr.com> wrote:
> >>>>>> >>>> For those of us playing with use of SSD for journals on ext[34], this does
> >>>>>> >>>> have implications for RAID performance.
> >>>>>> >>>>
> >>>>>> >>>> http://hardware.slashdot.org/story/09/10/27/1427209/Intel-Updates-SSDs-Supports-TRIM-Faster-Writes
> >>>>>> >>>>
> >>>>>> >>>> --
> >>>>>> >>>> Bill Davidsen <davidsen@tmr.com>
> >>>>>> >>>>  Unintended results are the well-earned reward for incompetence.
> >>>>>> >>>>
> >>>>>> >>>>
> >>>>>> >>>>
> >>>>>> >>>
> >>>>>> >>>
> >>>>>> >>>
> >>>>>> >>> --
> >>>>>> >>>       Majed B.
> >>>>>> >>>
> >>>>>> >>
> >>>>>> >
> >>>>>> >
> >>>>>> >
> >>>>>> > --
> >>>>>> >       Majed B.
> >>>>>> >
> >>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>>       Majed B.
> >>>
> >>
> >
> >
> >
> > --
> >       Majed B.
> >

* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 17:22           ` Chris Worley
@ 2009-11-10 20:11             ` Martin K. Petersen
  2009-11-10 20:45               ` Chris Worley
  2009-11-10 21:01               ` Greg Freemyer
  0 siblings, 2 replies; 34+ messages in thread
From: Martin K. Petersen @ 2009-11-10 20:11 UTC (permalink / raw)
  To: Chris Worley; +Cc: Martin K. Petersen, Majed B., Linux RAID

>>>>> "Chris" == Chris Worley <worleys@gmail.com> writes:

Chris> I'm not talking about memory-based or -looking devices.  A block
Chris> device is all you need, and you don't have to re-write file
Chris> systems to put one atop a block device.

And a SATA/SCSI-fronted flash disk isn't a block device how?

Do you have any compelling evidence as to why using a protocol like SCSI
is bad?  A SCSI command is typically 16 bytes.  A typical HBA IOCB
slightly bigger but includes the inevitable scatterlist.  We're talking
a pretty dense format for expressing an I/O operation here.
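
To make the 16-byte figure concrete, here is a minimal sketch that packs a
READ(16) command descriptor block. The helper name `build_read16_cdb` and the
example LBA and block count are assumptions made for illustration; the flag,
group and control bytes are simply left at zero.

```python
import struct

READ_16 = 0x88  # SCSI READ(16) operation code


def build_read16_cdb(lba: int, num_blocks: int) -> bytes:
    """Pack a READ(16) CDB. All multi-byte fields are big-endian."""
    flags = 0         # protection/DPO/FUA bits left clear in this sketch
    group_number = 0
    control = 0
    # opcode(1) + flags(1) + LBA(8) + transfer length(4) + group(1) + control(1)
    return struct.pack(">BBQIBB", READ_16, flags, lba, num_blocks,
                       group_number, control)


# Hypothetical request: read 8 blocks starting at LBA 2048.
cdb = build_read16_cdb(lba=2048, num_blocks=8)
print(len(cdb), cdb.hex())
```

The whole request (what to do, where, and how much) fits in those 16 bytes;
the data buffers themselves are described separately by the scatterlist.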

You seem to be arguing that letting a device speak "block" instead of
SCSI would make things faster.  I'm not convinced.  Also, SCSI gives us
a nice way to track outstanding I/Os via command queueing plus much
more.  All in a open, non-vendor-specific format requiring no custom
drivers.  Unlike, say, the SSS board you mentioned elsewhere in this
thread.

On top of that Linux is used all over the place in deployments that have
throughput and IOPS figures above and beyond the numbers you quote here.
Despite "legacy" controllers being in the mix.


Chris> Those using legacy controller technology can overcome the issue
Chris> by using multiple devices.  We've been talking single device
Chris> performance. I can get 6GB/s using 8 SSS drives.

And adding another flash-backed SAS board isn't giving you exactly the
same benefit?


Chris> And I do appreciate all your work.  I fear, in this case, discard
Chris> will be optimized for the slower technology... we won't be
Chris> getting all that's available from it.

Discard isn't "optimized" for anything.  It's a command.  Filesystem
issues it, it gets sent to the storage device (DSM/TRIM, WRITE SAME, or
UNMAP depending on target type).
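
As a rough illustration of "it's a command", the sketch below maps one discard
request onto the three commands named above. It is not kernel code: the
`Target` labels and the `translate_discard` helper are hypothetical names
chosen for this example.

```python
from enum import Enum


class Target(Enum):
    # Hypothetical labels for the kinds of device a discard may reach.
    ATA = "ata"
    SCSI_UNMAP = "scsi-unmap"
    SCSI_WRITE_SAME = "scsi-write-same"


# Command each target type receives for a discard, as listed in the message.
DISCARD_COMMAND = {
    Target.ATA: "DATA SET MANAGEMENT (TRIM)",
    Target.SCSI_UNMAP: "UNMAP",
    Target.SCSI_WRITE_SAME: "WRITE SAME (unmap bit set)",
}


def translate_discard(target: Target, start_lba: int, num_blocks: int):
    """Return (command name, start LBA, block count) for one discard."""
    return (DISCARD_COMMAND[target], start_lba, num_blocks)


print(translate_discard(Target.ATA, 4096, 256))
```

The block range is passed through unchanged; only the command used to express
it differs per target type.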


Chris> CPU's have much more performance for handling the management
Chris> needed by NAND, and there are so many cores these days going
Chris> unused.

You seem to think that the limiting factor in SSD design is the speed of
the ASIC and not the speed of the actual flash chips behind it.


Chris> SSD's do win the "compatibility" argument.  It's too bad we
Chris> didn't invent thumb drives that were floppy compatible ;)

There are many good reasons for that.  drivers/block/floppy.c contains
several of them.  Keep a bag of expletives handy.


>> Because initial TRIM performance was absolutely appalling

Chris> Only on SSD's behind legacy controllers.  It worked great as-is
Chris> with SSS.

Please elaborate.

-- 
Martin K. Petersen	Oracle Linux Engineering


* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 20:11             ` Martin K. Petersen
@ 2009-11-10 20:45               ` Chris Worley
  2009-11-10 22:35                 ` Martin K. Petersen
  2009-11-10 21:01               ` Greg Freemyer
  1 sibling, 1 reply; 34+ messages in thread
From: Chris Worley @ 2009-11-10 20:45 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: Majed B., Linux RAID

On Tue, Nov 10, 2009 at 1:11 PM, Martin K. Petersen
<martin.petersen@oracle.com> wrote:
>>>>>> "Chris" == Chris Worley <worleys@gmail.com> writes:
>
> Chris> I'm not talking about memory-based or -looking devices.  A block
> Chris> device is all you need, and you don't have to re-write file
> Chris> systems to put one atop a block device.
>
> And a SATA/SCSI-fronted flash disk isn't a block device how?

It's not any different.  The previous statement that I was responding
to (you snipped out) had implied that the fs code had to be re-written
for non-SCSI devices.  I was just assuring that was not necessary.
>
> Do you have any compelling evidence as to why using a protocol like SCSI
> is bad?  A SCSI command is typically 16 bytes.  A typical HBA IOCB
> slightly bigger but includes the inevitable scatterlist.  We're talking
> a pretty dense format for expressing an I/O operation here.

I'm not saying the SCSI protocol is bad, I'm saying the SAS/SATA/SCSI
controllers, that have been optimized for years for rotating media,
don't have the compute power to handle the sort of performance
attainable with SSS.

>
> You seem to be arguing that letting a device speak "block" instead of
> SCSI would make things faster.  I'm not convinced.

That's not what I'm saying; the protocol is not the culprit, the
controller is.  But, once you get rid of the controller, and just
speak block device, another level of overhead has been removed.

> Also, SCSI gives us
> a nice way to track outstanding I/Os via command queueing plus much
> more.  All in a open, non-vendor-specific format requiring no custom
> drivers.  Unlike, say, the SSS board you mentioned elsewhere in this
> thread.

At least one of the boards I mentioned I know has command queuing w/o
being a SCSI device.

>
> On top of that Linux is used all over the place in deployments that have
> throughput and IOPS figures above and beyond the numbers you quote here.

I was only quoting single drive specs.  You only scale to really big
numbers if you start with really fast individual components.  I'm sure
you could quote TB/s using rotating media, but you have a lot more
expensive pieces needed to get there.

> Despite "legacy" controllers being in the mix.
>
>
> Chris> Those using legacy controller technology can overcome the issue
> Chris> by using multiple devices.  We've been talking single device
> Chris> performance. I can get 6GB/s using 8 SSS drives.
>
> And adding another flash-backed SAS board isn't giving you exactly the
> same benefit?

Again, scalability is achieved more readily and with less complexity
using faster components.
>
>
> Chris> And I do appreciate all your work.  I fear, in this case, discard
> Chris> will be optimized for the slower technology... we won't be
> Chris> getting all that's available from it.
>
> Discard isn't "optimized" for anything.  It's a command.  Filesystem
> issues it, it gets sent to the storage device (DSM/TRIM, WRITE SAME, or
> UNMAP depending on target type).

Unless you try to coalesce it for a later time, which is what I hear
is being done to compensate for slow controllers.
>
>
> Chris> CPU's have much more performance for handling the management
> Chris> needed by NAND, and there are so many cores these days going
> Chris> unused.
>
> You seem to think that the limiting factor in SSD design is the speed of
> the ASIC and not the speed of the actual flash chips behind it.

True.  They limit the NAND performance based on the lack of
performance of their ASIC and the controller.  That doesn't mean you
can't get a lot better performance out of NAND, it just means they
limited themselves to be compatible, and the kernel will implement a
strategy that will optimize for the poor design.

>
>
> Chris> SSD's do win the "compatibility" argument.  It's too bad we
> Chris> didn't invent thumb drives that were floppy compatible ;)
>
> There are many good reasons for that.  drivers/block/floppy.c contains
> several of them.  Keep a bag of expletives handy.

So you _are_ glad that compatibility was not followed in the move to
USB thumb drives, but you also believe the best way to do SSS was
behind compatible legacy SAS/SATA devices optimized for old rotating
media?
>
>
>>> Because initial TRIM performance was absolutely appalling
>
> Chris> Only on SSD's behind legacy controllers.  It worked great as-is
> Chris> with SSS.
>
> Please elaborate.

I had no performance issues testing w/ the original discard
implementation using SSS.

I'd run IOZone and fill the drive (as I recall ~200GB) w/ files and
benchmark, which, at the end, IOZone would delete all the files
created (in the hundreds), and the delete/discard process was no more
time consuming than just the delete process (for everything on the
drive).  This was w/ the original 2.6.27 and 2.6.28 ext4 "discard"
implementations.

Chris
>
> --
> Martin K. Petersen      Oracle Linux Engineering
>

* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 20:11             ` Martin K. Petersen
  2009-11-10 20:45               ` Chris Worley
@ 2009-11-10 21:01               ` Greg Freemyer
  2009-11-10 21:17                 ` Chris Worley
  2009-11-10 22:56                 ` Martin K. Petersen
  1 sibling, 2 replies; 34+ messages in thread
From: Greg Freemyer @ 2009-11-10 21:01 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: Chris Worley, Majed B., Linux RAID

On Tue, Nov 10, 2009 at 3:11 PM, Martin K. Petersen
<martin.petersen@oracle.com> wrote:
>>>>>> "Chris" == Chris Worley <worleys@gmail.com> writes:
>
<snip>
>
> Chris> And I do appreciate all your work.  I fear, in this case, discard
> Chris> will be optimized for the slower technology... we won't be
> Chris> getting all that's available from it.
>
> Discard isn't "optimized" for anything.  It's a command.  Filesystem
> issues it, it gets sent to the storage device (DSM/TRIM, WRITE SAME, or
> UNMAP depending on target type).
>

Martin,

I'm not sure that is right, but Chris is also wrong.

I'm not sure where it ended up, but the big SSD / discard discussion
of a few months ago talked about 3 kinds of solutions, and I thought
the plan was to support all 3.

1) optimization 1 - A white-listed instant discard feature.  In this
methodology, the filesystems would immediately send discard calls down
to the block layer, which would send them on down the block stack to the
physical devices with very minimal buffering.  It was thought high-end
Intel SSDs would benefit from this model.  It also sounds like SSS
devices would benefit from this per Chris's comments.

Note that this approach is NOT very friendly from a raid 4/5/6
approach.  Those raid levels need to discard full stripes at a time,
so getting a large number of small discards would be painful.

2) optimization 2 - The block layer would accept those small discards,
but accumulate them for a short period.  (less than a second was my
impression).  Then coalesce them into larger discards and send them
down the block stack and eventually to the physical device.

This is slightly better from a raid 4/5/6 perspective, but I suspect
the discard ranges would still be too small.

3) optimization 3 - a background freespace scanner would run from
time to time, scan a filesystem for free blocks, and send
discard / trim commands down to the device.  This is what Mark Lord was
working on.  His solution was primarily in user space and was
controlled by cron.

I believe this is by far the best approach for a raid 4/5/6
implementation, but at the time Mark's implementation was bypassing
the block stack and using SG_IO to directly talk to the physical
devices.  I don't recall any discussion of how MD could participate in
the process.  Thus Mark's solution at the time was not compatible with
md raid 4/5/6 implementations.
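
The coalescing idea in optimization 2 can be sketched as a simple range merge.
This only illustrates the approach as described above; it is not how the
kernel block layer is implemented, and the `coalesce` name and the sample
ranges are assumptions made for the example.

```python
def coalesce(ranges):
    """Merge overlapping or adjacent (start, length) discard ranges."""
    merged = []
    for start, length in sorted(ranges):
        end = start + length
        if merged and start <= merged[-1][1]:
            # Touches or overlaps the previous range: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e - s) for s, e in merged]


# Hypothetical small discards accumulated over a short window (in sectors).
pending = [(0, 8), (8, 8), (64, 16), (72, 16), (200, 8)]
print(coalesce(pending))  # [(0, 16), (64, 24), (200, 8)]
```

Five small requests become three larger ones. For raid 4/5/6 the merged
ranges would still need to cover whole stripes before they are useful.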

Since this is the mdraid mailing list, maybe someone can tell us which
of the above are getting the attention of md devels and if there is
any ongoing effort to support them?

Thanks
Greg

* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 21:01               ` Greg Freemyer
@ 2009-11-10 21:17                 ` Chris Worley
  2009-11-10 22:56                 ` Martin K. Petersen
  1 sibling, 0 replies; 34+ messages in thread
From: Chris Worley @ 2009-11-10 21:17 UTC (permalink / raw)
  To: Greg Freemyer; +Cc: Martin K. Petersen, Majed B., Linux RAID

On Tue, Nov 10, 2009 at 2:01 PM, Greg Freemyer <greg.freemyer@gmail.com> wrote:
> On Tue, Nov 10, 2009 at 3:11 PM, Martin K. Petersen
> <martin.petersen@oracle.com> wrote:
>>>>>>> "Chris" == Chris Worley <worleys@gmail.com> writes:
>>
> <snip>
>>
>> Chris> And I do appreciate all your work.  I fear, in this case, discard
>> Chris> will be optimized for the slower technology... we won't be
>> Chris> getting all that's available from it.
>>
>> Discard isn't "optimized" for anything.  It's a command.  Filesystem
>> issues it, it gets sent to the storage device (DSM/TRIM, WRITE SAME, or
>> UNMAP depending on target type).
>>
>
> Martin,
>
> I'm not sure that is right, but Chris is also wrong.

I've not been happier to hear I'm wrong; I do hope you are right
(there will be a switch for optimal approaches).

Thanks,

Chris

* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 20:45               ` Chris Worley
@ 2009-11-10 22:35                 ` Martin K. Petersen
  2009-11-11 18:17                   ` Chris Worley
  0 siblings, 1 reply; 34+ messages in thread
From: Martin K. Petersen @ 2009-11-10 22:35 UTC (permalink / raw)
  To: Chris Worley; +Cc: Martin K. Petersen, Majed B., Linux RAID

>>>>> "Chris" == Chris Worley <worleys@gmail.com> writes:

Chris> I'm not saying the SCSI protocol is bad, I'm saying the
Chris> SAS/SATA/SCSI controllers, that have been optimized for years for
Chris> rotating media, don't have the compute power to handle the sort
Chris> of performance attainable with SSS.

And I'm saying that at least in the SCSI case that's untrue.  SAS and FC
controllers are optimized for lots and lots of I/O because their main
application is driving large storage arrays which have performance
comparable to the solid state devices you mention.

In fact, many deployments use said SCSI controllers to drive RAM-based
solid state storage devices which are faster than the flash-based
devices we're talking about here.


Chris> Unless you try to coalesce it for a later time, which is what I
Chris> hear is being done to compensate for slow controllers.

We don't coalesce.


Chris> True.  They limit the NAND performance based on the lack of
Chris> performance of their ASIC and the controller.  

Interesting theory.

I'm personally of the conviction that cheap SSDs suffer from amazingly
poor FTL design rather than inherent hardware limitations.

That's something Intel got right with their drives.  The hardware itself
is pretty unremarkable.


Chris> That doesn't mean you can't get a lot better performance out of
Chris> NAND, it just means they limited themselves to be compatible, and
Chris> the kernel will implement a strategy that will optimize for the
Chris> poor design.

You are confusing limitations in interconnect technology with the
properties of the protocols used.  There is no point in adding channels
behind the ASIC to drive 12 Gbps of I/O if your host interface is 1.5
Gbps SATA.  That has nothing to do with whether ATA and SCSI are
suitable protocols.
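
A short worked calculation shows why the host link is the ceiling. It assumes
8b/10b line encoding, which SATA uses, so ten bits on the wire carry one data
byte; the function name and the 12 Gbps back-end figure (taken from the
paragraph above) are used purely for illustration.

```python
def payload_mb_per_s(line_rate_gbps: float) -> float:
    """Usable MB/s on an 8b/10b-encoded link (10 line bits per data byte)."""
    return line_rate_gbps * 1e9 / 10 / 1e6


for gbps in (1.5, 3.0):
    print(f"{gbps} Gbps link -> {payload_mb_per_s(gbps):.0f} MB/s payload")

# A 12 Gbps flash back end behind a 1.5 Gbps host link is 8x oversized.
print(12 / 1.5)
```

This is consistent with the roughly 280MB/s quoted earlier in the thread for
drives on a 3 Gbps SATA link: that figure is already close to the 300 MB/s
the link can carry.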

I'm arguing that at least SCSI is a good protocol for sending
commands to a block device.  Nothing prevents your flash-based block
device from presenting a PCIe SCSI interface to the host and then doing
something completely different in the back.

There's lots of warts in SCSI.  And I personally think that ATA TRIM was
very poorly defined.  But I don't believe that these protocols are
inherently bad for driving storage.  And I don't believe that coming up
with a custom "block" interface will improve anything in the short term.
Heck, the overhead of speaking SCSI is so low that even the thumb drive
you brought up implements it.  At negligible cost.


Chris> but you also believe the best way to do SSS was behind compatible
Chris> legacy SAS/SATA devices optimized for old rotating media?

You're the one claiming these "legacy" devices are optimized for
rotating media.  I'm claiming there's nothing rotating about either
protocol.

Both express "do something to this range of blocks" in 16 bytes or less
+ a scatterlist describing memory.  That's a pretty efficient interface
in my book.


Chris> I'd run IOZone and fill the drive (as I recall ~200GB) w/ files
Chris> and benchmark, which, at the end, IOZone would delete all the
Chris> files created (in the hundreds), and the delete/discard process
Chris> was no more time consuming than just the delete process (for
Chris> everything on the drive).  This was w/ the original 2.6.27 and
Chris> 2.6.28 ext4 "discard" implementations.

And which device was this?  How did it implement discard?

-- 
Martin K. Petersen	Oracle Linux Engineering


* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 21:01               ` Greg Freemyer
  2009-11-10 21:17                 ` Chris Worley
@ 2009-11-10 22:56                 ` Martin K. Petersen
  2009-11-11 17:00                   ` Greg Freemyer
  1 sibling, 1 reply; 34+ messages in thread
From: Martin K. Petersen @ 2009-11-10 22:56 UTC (permalink / raw)
  To: Greg Freemyer; +Cc: Martin K. Petersen, Chris Worley, Majed B., Linux RAID

>>>>> "Greg" == Greg Freemyer <greg.freemyer@gmail.com> writes:

Greg> I'm not sure where it ended up, but the big SSD / discard
Greg> discussion of a few months ago talked about 3 kinds of solutions,
Greg> and I thought the plan was to support all 3.

We don't design for the past.


Greg> 1) optimization 1 - A white-listed instant discard feature.  In
Greg>    this methodology, the filesystems would immediately send
Greg>    discard calls down to the block layer would send them on down
Greg>    the block stack to the physical devices with very minimal
Greg>    buffering.

There's no whitelist.  That's just how it works.

Yes, there were a few crappy devices out there.  Windows 7 issuing TRIM
commands in realtime made them instantly obsolete.  If future devices
suck with Windows 7 nobody will buy them.


Greg> 2) optimization 2 - The block layer would accept those small
Greg>    discards, but accumulate them for a short period.  (less than a
Greg>    second was my impression).  Then coalesce them into larger
Greg>    discards and send them down the block stack and eventually to
Greg>    the physical device.

SSDs are special in that they actually track map state on a per-logical
block basis.  Other thinly provisioned devices track space in units
ranging from 16-32-64KB up to megabytes.

It's up to each block device to track the map space.  The way most
arrays work is that they'll ignore the portions of the request that are
not aligned to and a multiple of their internal allocation unit.

The same applies to MD.  IOW, MD would only unmap the portions of the
discard request that constitute entire stripes.  No keeping state
required.

Jens just queued my patch which allows block devices to communicate
their unmap granularity and alignment to the filesystems.  This means we
can potentially use this to influence filesystem allocators.  For SCSI
arrays these values are queried and passed up the stack.  MD can choose
to manually set the granularity to its stripe size.
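
The "only whole stripes" rule can be sketched as a rounding step: round the
start of the request up and the end down to the device's granularity, and drop
whatever falls outside. This is a sketch under assumptions, not the MD
implementation; `whole_units`, its parameters and the sample numbers are
hypothetical, with all values in sectors.

```python
def whole_units(start, length, granularity, alignment=0):
    """Return the (start, length) part of a discard that covers only complete
    allocation units (e.g. stripes), or None if it covers none."""
    end = start + length
    # Round the start up and the end down to unit boundaries.
    first = alignment + -(-(start - alignment) // granularity) * granularity
    last = alignment + ((end - alignment) // granularity) * granularity
    if last <= first:
        return None
    return (first, last - first)


print(whole_units(100, 1000, 256))   # (256, 768)
print(whole_units(10, 100, 256))     # None: no complete unit is covered
```

No state needs to be kept: the partial units at either end are simply left
mapped, which matches the "ignore the unaligned portions" behaviour described
above.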


Greg> 3) optimization 3 - a background freespace scanner would run from
Greg> time to time that scanned a filesystem for free blocks and send a
Greg> discard / trim command down to the device.  This is what Mark Lord
Greg> was working on.  His solution was primarily in user space and was
Greg> controlled by cron.

I think that's a fine approach for legacy devices.  But as I said I
think Windows 7 will root out all devices with poor TRIM performance
pretty quickly.

-- 
Martin K. Petersen	Oracle Linux Engineering


* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 18:31                     ` Majed B.
@ 2009-11-10 23:03                       ` Mathieu Chouquet-Stringer
  2009-11-11  2:52                         ` Majed B.
  0 siblings, 1 reply; 34+ messages in thread
From: Mathieu Chouquet-Stringer @ 2009-11-10 23:03 UTC (permalink / raw)
  To: "Majed B."; +Cc: Linux RAID

majedb@gmail.com ("Majed B.") writes:
> Chris,
> 
> Do you mind sharing the drive models & controllers you're using that
> give you 800 MB/s?

At work we reviewed different kinds of Fusion-io products (namely ioDrive
and ioDrive-Duo, 80GB and 640GB) and I could easily get 700 MB/s with
more than 100k IOPS (benched using fio)...

http://kb.fusionio.com/KB/a29/verifying-linux-system-performance.aspx

Their results are consistent with what I saw...

I didn't like the binary-like driver though...
-- 
Mathieu Chouquet-Stringer                           mchouque@free.fr
            The sun itself sees not till heaven clears.
	             -- William Shakespeare --


* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 23:03                       ` Mathieu Chouquet-Stringer
@ 2009-11-11  2:52                         ` Majed B.
  0 siblings, 0 replies; 34+ messages in thread
From: Majed B. @ 2009-11-11  2:52 UTC (permalink / raw)
  To: Linux RAID

Thank you Mathieu for the input.

I have seen IBM DS4800 SANs doing 600MB/s-700MB/s using a bunch of
148GB 15k RPM FC disks, though I haven't seen them being benchmarked
for IOPS.

I was reading an article from AnandTech yesterday that compared
rotational media to SSDs and ran some stress tests. In the end,
they concluded that hardware RAID controllers were hampering
performance because they couldn't absorb the number of requests coming
at them from the SSDs. When they switched to software RAID (on Windows
in their test), they got almost double the performance (RAID5, 8
disks).

You can see the numbers here: http://it.anandtech.com/IT/showdoc.aspx?i=3532&p=9
If you're interested in the whole article, you can go to the link
above and go back to the main page using the index.

On sequential read, they achieved 1257 MB/s on a RAID 5 setup. Quite
impressive for those with video streaming applications.

On Wed, Nov 11, 2009 at 2:03 AM, Mathieu Chouquet-Stringer
<mchouque@free.fr> wrote:
> majedb@gmail.com ("Majed B.") writes:
>> Chris,
>>
>> Do you mind sharing the drive models & controllers you're using that
>> give you 800 MB/s?
>
> At work we reviewed different kind of fusion-io products (namely ioDrive
> and ioDrive-Duo, 80GB and 640GB) and I could easily get 700 MB/s with
> more than 100k iops (benched using fio)...
>
> http://kb.fusionio.com/KB/a29/verifying-linux-system-performance.aspx
>
> Their results are consistent with what I saw...
>
> I didn't like the binary like driver though...
> --
> Mathieu Chouquet-Stringer                           mchouque@free.fr
>            The sun itself sees not till heaven clears.
>                     -- William Shakespeare --
>



-- 
       Majed B.
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 22:56                 ` Martin K. Petersen
@ 2009-11-11 17:00                   ` Greg Freemyer
  2009-11-12  5:50                     ` Martin K. Petersen
  0 siblings, 1 reply; 34+ messages in thread
From: Greg Freemyer @ 2009-11-11 17:00 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: Chris Worley, Majed B., Linux RAID

On Tue, Nov 10, 2009 at 5:56 PM, Martin K. Petersen
<martin.petersen@oracle.com> wrote:
>>>>>> "Greg" == Greg Freemyer <greg.freemyer@gmail.com> writes:
>
> Greg> I'm not sure where it ended up, but the big SSD / discard
> Greg> discussion of a few months ago talked about 3 kinds of solutions,
> Greg> and I thought the plan was to support all 3.
>
> We don't design for the past.
>
>
> Greg> 1) optimization 1 - A white-listed instant discard feature.  In
> Greg>    this methodology, the filesystems would immediately send
> Greg>    discard calls down to the block layer would send them on down
> Greg>    the block stack to the physical devices with very minimal
> Greg>    buffering.
>
> There's no whitelist.  That's just how it works.
>
> Yes, there were a few crappy devices out there.  Windows 7 issuing TRIM
> commands in realtime made them instantly obsolete.  If future devices
> suck with Windows 7 nobody will buy them.
>
>
> Greg> 2) optimization 2 - The block layer would accept those small
> Greg>    discards, but accumulate them for a short period.  (less than a
> Greg>    second was my impression).  Then coalesce them into larger
> Greg>    discards and send them down the block stack and eventually to
> Greg>    the physical device.
>
> SSDs are special in that they actually track map state on a per-logical
> block basis.  Other thinly provisioned devices track space in units
> ranging from 16-32-64KB up to megabytes.
>
> It's up to each block device to track the map space.  The way most
> arrays work is that they'll ignore the portions of the request that are
> not aligned to and a multiple of their internal allocation unit.
>
> The same applies to MD.  IOW, MD would only unmap the portions of the
> discard request that constitute entire stripes.  No keeping state
> required.
>
> Jens just queued my patch which allows block devices to communicate
> their unmap granularity and alignment to the filesystems.  This means we
> can potentially use this to influence filesystem allocators.  For SCSI
> arrays these values are queried and passed up the stack.  MD can choose
> to manually set the granularity to its stripe size.
>
>
> Greg> 3) optimization 3 - a background freespace scanner would run from
> Greg> time to time that scanned a filesystem for free blocks and send a
> Greg> discard / trim command down to the device.  This is what Mark Lord
> Greg> was working on.  His solution was primarily in user space and was
> Greg> controlled by cron.
>
> I think that's a fine approach for legacy devices.  But as I said I
> think Windows 7 will root out all devices with poor TRIM performance
> pretty quickly.
>
> --
> Martin K. Petersen      Oracle Linux Engineering
>

Martin,

So for a workload mostly composed of small files residing on an MD raid
4/5/6 setup, how is this supposed to work?  (i.e. Tiffs, small word
docs, pdfs, individual emails, etc.)

Most of the individual files will be less than one stripe wide, so
when they are deleted I gather the discard range will be less than a
stripe and therefore MD would ignore it in the simplest of
implementations.  i.e. Without coalescence at some point, MD will never
forward discards to the hardware.

Thus I would think for that workload, the nightly full freespace scan
and discard would be the best solution.
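
To make the stripe argument concrete, here is a minimal sketch in Python
of "keep only the whole stripes of a discard" and of coalescing small
discards first.  It is not MD code; the function names and the stripe
size of 1024 sectors (4 data disks x 128 KiB chunks, 512-byte sectors)
are assumptions picked for illustration.

```python
def stripe_aligned(start, length, stripe):
    """Return (start, length) of the whole-stripe part of a discard, or None.

    All values are in sectors. Partial stripes at either end are dropped.
    """
    end = start + length
    first = -(-start // stripe) * stripe  # round start up to a stripe boundary
    last = (end // stripe) * stripe       # round end down to a stripe boundary
    if last <= first:
        return None                       # no full stripe covered
    return first, last - first


def coalesce(ranges):
    """Merge overlapping or adjacent (start, length) ranges."""
    merged = []
    for start, length in sorted(ranges):
        if merged and start <= merged[-1][0] + merged[-1][1]:
            prev_start, prev_len = merged[-1]
            new_end = max(prev_start + prev_len, start + length)
            merged[-1] = (prev_start, new_end - prev_start)
        else:
            merged.append((start, length))
    return merged


STRIPE = 1024  # hypothetical full-stripe size in sectors

# Three deleted files, each smaller than a stripe but adjacent on disk.
small = [(1000, 400), (1400, 400), (1800, 400)]

individually = [stripe_aligned(s, n, STRIPE) for s, n in small]
together = [stripe_aligned(s, n, STRIPE) for s, n in coalesce(small)]
print(individually)
print(together)
```

Sent one at a time, none of the three discards covers a full stripe, so
nothing would be forwarded.  Coalesced, they cover one full stripe.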

Thanks
Greg


-- 
Greg Freemyer
Head of EDD Tape Extraction and Processing team
Litigation Triage Solutions Specialist
http://www.linkedin.com/in/gregfreemyer
Preservation and Forensic processing of Exchange Repositories White Paper -
<http://www.norcrossgroup.com/forms/whitepapers/tng_whitepaper_fpe.html>

The Norcross Group
The Intersection of Evidence & Technology
http://www.norcrossgroup.com

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 16:04               ` Chris Worley
@ 2009-11-11 18:02                 ` Default User
  0 siblings, 0 replies; 34+ messages in thread
From: Default User @ 2009-11-11 18:02 UTC (permalink / raw)
  To: Chris Worley; +Cc: linux-raid

Chris Worley wrote:
> On Tue, Nov 10, 2009 at 8:48 AM, Asdo <asdo@shiftmail.org> wrote:
>   
>> I have not heard about these SSS you mention.
>> Do you have a link?
>>     
>
> All the Fusion-io products (fusionio.com) and TMS's (ramsan.com) RS20
> are two examples (not their RAM-based products).  Sun has their
> "Sunfire", but I haven't seen that yet.
>   
I don't know TMS, but I know Fusion-io a bit: it is indeed 10x faster than
an SSD but it is also 10 times more expensive!
If you make a raid-0 of ten SSDs in a good hardware-raid controller,
exported to the OS as a single SCSI disk, I bet you obtain about the
same performance.
Look at this:
http://www.tomshardware.com/reviews/x25-e-ssd-performance,2365.html
On this page
http://www.tomshardware.com/reviews/x25-e-ssd-performance,2365-7.html
the "streaming writes" test looks similar to the benchmark
you want (see the specs), do you agree? Yes, it's 0% random and it's 4
workers... and the blocksize is the one you want.
You'll find the result on the following page. That's 2.2GB/sec with 16
disks. If you imagine it with 8 disks and only 1 controller (the
benchmark uses 2 controllers with a software raid-0 above), it's more
than the speed you want (800MB/sec), and it's with a SCSI interface.
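
The estimate above can be written out as a quick calculation.  This is
only a sketch: it assumes throughput scales linearly with the number of
drives and that a single controller would not become the bottleneck,
neither of which the review confirms.  The function name is made up for
illustration.

```python
def scaled_throughput(measured_mb_s, measured_drives, target_drives):
    """Linear extrapolation of aggregate throughput to another drive count."""
    return measured_mb_s / measured_drives * target_drives


TARGET_MB_S = 800  # the figure being discussed in this thread

# 2.2 GB/s with 16 drives, taken from the linked review.
estimate = scaled_throughput(2200, 16, 8)
print(estimate, estimate > TARGET_MB_S)
```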

What do you think?

>> Also are you sure that the SATA/SCSI layer is the problem? Some hardware
>> raids can do 800 MB/s sequential, single stream, and indeed with a SATA/SAS
>> interface to the kernel. If what you say was true, that would be
>> impossible...
>>     
>
> Sequential/streaming performance is a corner case.  There are many
> high speed solutions to that (even using rotating media).  I'm talking
> random I/O at 128KB blocks at 800MB/s per drive.
>   


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 22:35                 ` Martin K. Petersen
@ 2009-11-11 18:17                   ` Chris Worley
  0 siblings, 0 replies; 34+ messages in thread
From: Chris Worley @ 2009-11-11 18:17 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: Linux RAID

On Tue, Nov 10, 2009 at 3:35 PM, Martin K. Petersen
<martin.petersen@oracle.com> wrote:
>>>>>> "Chris" == Chris Worley <worleys@gmail.com> writes:
>
> Chris> I'm not saying the SCSI protocol is bad, I'm saying the
> Chris> SAS/SATA/SCSI controllers, that have been optimized for years for
> Chris> rotating media, don't have the compute power to handle the sort
> Chris> of performance attainable with SSS.
>
> And I'm saying that at least in the SCSI case that's untrue.  SAS and FC
> controllers are optimized for lots and lots of I/O because their main
> application is driving large storage arrays which have performance
> comparable to the solid state devices you mention.

We're going to have to agree to disagree on this.  My feeling is, you
haven't tried the next generation in I/O performance, only the slow
SSDs currently available, and don't (yet) see the potential for
getting rid of all the hardware layers that evolved around rotating
media.  And when you talk of FC... there's more performance
inhibition.  Slow hardware like 10G Ethernet and FC8 can't keep up
with the performance required for fast SSS I/O.  A single QDR IB port
is a good start, with 3GB/s per port (measured using SRP to export the
drives).  How many FC8 or iSCSI over 10G ports would it take to get
one QDR IB port's performance?  Then start thinking 2x, 4x, 8x, ...
and the complexity of the old hardware becomes daunting when trying to
scale.  Again, to scale easily and with less complexity, you need
your fundamental components to be fast.  SSDs, FC, 10G are last
generation hardware and way too slow.
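
A rough answer to the port-count question, as a sketch.  The 3GB/s QDR
IB figure is the measured number above.  The per-port rates for FC8
(about 800 MB/s per direction) and 10G Ethernet (1250 MB/s raw line
rate) are nominal values assumed here, before any protocol overhead, so
real counts would be no lower than these.

```python
import math

QDR_IB_MB_S = 3000  # measured per-port figure quoted above

# Assumed nominal per-port rates in MB/s, not measurements.
NOMINAL_MB_S = {"FC8": 800, "10GbE": 1250}


def ports_needed(target_mb_s, per_port_mb_s):
    """Smallest whole number of ports whose combined rate reaches the target."""
    return math.ceil(target_mb_s / per_port_mb_s)


for name, rate in NOMINAL_MB_S.items():
    print(name, ports_needed(QDR_IB_MB_S, rate))
```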

You can use a 90's vintage distributed supercomputer with 100's of
processors to run tasks that one CPU can do as fast or faster today...
but many would agree that the new CPU is probably an easier choice.

And again, I'm not attacking the SCSI protocol, just the controller
performance; but getting rid of unnecessary OS software layers (i.e.
when you can directly use a block device), also provides more
performance.  When you've got <50usecs latency storage, every CPU
cycle counts.

<snip>
>
> Chris> I'd run IOZone and fill the drive (as I recall ~200GB) w/ files
> Chris> and benchmark, which, at the end, IOZone would delete all the
> Chris> files created (in the hundreds), and the delete/discard process
> Chris> was no more time consuming than just the delete process (for
> Chris> everything on the drive).  This was w/ the original 2.6.27 and
> Chris> 2.6.28 ext4 "discard" implementations.
>
> And which device was this?  How did it implement discard?

This was a non-GPL driver (as is the management layer for all SSDs),
so I doubt you're interested.

The methodology used was that laid out by David Woodhouse in:

http://lwn.net/Articles/293658/

Basically: 1) register for the discard, 2) decode the write BIOs that
indicate "discard", 3) send completion when done.
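
Those three steps can be modelled in a few lines of Python.  This is a
toy model only, not kernel code and not the driver in question: ToyBio
and ToyDevice are invented names, and "mapped" is just a set of sector
numbers standing in for the device's internal map.

```python
class ToyBio:
    """Stand-in for a block I/O request (not the kernel's struct bio)."""

    def __init__(self, sector, nr_sectors, discard=False):
        self.sector = sector
        self.nr_sectors = nr_sectors
        self.discard = discard
        self.done = False


class ToyDevice:
    """Hypothetical device that tracks which sectors hold live data."""

    def __init__(self):
        self.mapped = set()
        self.supports_discard = False

    def enable_discard(self):
        # Step 1: register for discards.
        self.supports_discard = True

    def submit(self, bio):
        span = range(bio.sector, bio.sector + bio.nr_sectors)
        if bio.discard:
            # Step 2: a write flagged as discard unmaps instead of writing.
            if self.supports_discard:
                self.mapped.difference_update(span)
        else:
            self.mapped.update(span)
        # Step 3: signal completion.
        bio.done = True
        return bio


dev = ToyDevice()
dev.enable_discard()
dev.submit(ToyBio(0, 8))                # ordinary write of sectors 0..7
dev.submit(ToyBio(2, 2, discard=True))  # discard sectors 2..3
print(sorted(dev.mapped))
```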

Chris
>
> --
> Martin K. Petersen      Oracle Linux Engineering
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-11 17:00                   ` Greg Freemyer
@ 2009-11-12  5:50                     ` Martin K. Petersen
  0 siblings, 0 replies; 34+ messages in thread
From: Martin K. Petersen @ 2009-11-12  5:50 UTC (permalink / raw)
  To: Greg Freemyer; +Cc: Martin K. Petersen, Linux RAID

>>>>> "Greg" == Greg Freemyer <greg.freemyer@gmail.com> writes:

Greg> So for a workload mostly composed of small files residing on a MD
Greg> raid 4/5/6 setup, how is this supposed to work.  (ie. Tiffs, small
Greg> word docs, pdfs, individual emails, etc.)

The intent of thin provisioning is not to free up 4KB of space when you
delete an email.  The intent is to free 4GB when you delete a database
or a virtual machine disk image.

In the RAID array space, allocation units bigger than a single stripe
are common.  I'm not aware of any arrays that track sectors or
filesystem block sized chunks.

We're exposing as much information about the mapping granularity as the
hardware is willing to share.  This is done so that filesystems have the
option of laying out data accordingly.  For instance a filesystem can
choose to reuse recently freed blocks and to pack data tightly together
instead of the traditional approach of spreading things out over the
entire LBA range.


Greg> Most of the individual files will be less than one stripe wide, so
Greg> when they are deleted I gather the discard range will be less than
Greg> a stripe and therefore MD would ignore it in the simplest of
Greg> implementations.  ie. Without coalescence at some point, MD will
Greg> never forward discards to the hardware.

Greg> Thus I would think for that workload, the nightly full freespace
Greg> scan and discard would be the best solution.

Well that's certainly possible to implement for small setups.  And less
tedious than tracking individual sector mapping in MD.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2009-11-12  5:50 UTC | newest]

Thread overview: 34+ messages
2009-11-08 17:57 Intel Updates SSDs, Supports TRIM, Faster Writes Bill Davidsen
2009-11-08 22:30 ` Thomas Fjellstrom
2009-11-09  1:13 ` Majed B.
2009-11-09 16:37   ` Chris Worley
2009-11-09 16:42     ` Majed B.
2009-11-09 16:59       ` Chris Worley
2009-11-10  9:42         ` Kasper Sandberg
2009-11-10 15:39           ` Chris Worley
2009-11-10 15:43             ` Majed B.
2009-11-10 15:58               ` Chris Worley
2009-11-10 16:01                 ` Majed B.
2009-11-10 16:15                   ` Robin Hill
2009-11-10 16:31                     ` Chris Worley
2009-11-10 16:18                   ` Chris Worley
2009-11-10 18:31                     ` Majed B.
2009-11-10 23:03                       ` Mathieu Chouquet-Stringer
2009-11-11  2:52                         ` Majed B.
2009-11-10 18:40                     ` Kasper Sandberg
2009-11-10 15:48             ` Asdo
2009-11-10 16:04               ` Chris Worley
2009-11-11 18:02                 ` Default User
2009-11-10 18:38             ` Kasper Sandberg
2009-11-10 16:36         ` Martin K. Petersen
2009-11-10 17:22           ` Chris Worley
2009-11-10 20:11             ` Martin K. Petersen
2009-11-10 20:45               ` Chris Worley
2009-11-10 22:35                 ` Martin K. Petersen
2009-11-11 18:17                   ` Chris Worley
2009-11-10 21:01               ` Greg Freemyer
2009-11-10 21:17                 ` Chris Worley
2009-11-10 22:56                 ` Martin K. Petersen
2009-11-11 17:00                   ` Greg Freemyer
2009-11-12  5:50                     ` Martin K. Petersen
2009-11-09 18:42   ` Greg Freemyer
