* Intel Updates SSDs, Supports TRIM, Faster Writes
@ 2009-11-08 17:57 Bill Davidsen
  2009-11-08 22:30 ` Thomas Fjellstrom
  2009-11-09 1:13 ` Majed B.
  0 siblings, 2 replies; 34+ messages in thread
From: Bill Davidsen @ 2009-11-08 17:57 UTC (permalink / raw)
  To: Linux RAID

For those of us playing with use of SSD for journals on ext[34], this does
have implications for RAID performance.

http://hardware.slashdot.org/story/09/10/27/1427209/Intel-Updates-SSDs-Supports-TRIM-Faster-Writes

-- 
Bill Davidsen <davidsen@tmr.com>
  Unintended results are the well-earned reward for incompetence.

^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-08 17:57 Intel Updates SSDs, Supports TRIM, Faster Writes Bill Davidsen
@ 2009-11-08 22:30 ` Thomas Fjellstrom
  2009-11-09 1:13 ` Majed B.
  1 sibling, 0 replies; 34+ messages in thread
From: Thomas Fjellstrom @ 2009-11-08 22:30 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Linux RAID

On Sun November 8 2009, Bill Davidsen wrote:
> For those of us playing with use of SSD for journals on ext[34], this
> does have implications for RAID performance.
>
> http://hardware.slashdot.org/story/09/10/27/1427209/Intel-Updates-SSDs-Supports-TRIM-Faster-Writes
>

Just don't upgrade it till they fix the bugs ;)

-- 
Thomas Fjellstrom
tfjellstrom@shaw.ca

^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-08 17:57 Intel Updates SSDs, Supports TRIM, Faster Writes Bill Davidsen
  2009-11-08 22:30 ` Thomas Fjellstrom
@ 2009-11-09 1:13 ` Majed B.
  2009-11-09 16:37 ` Chris Worley
  2009-11-09 18:42 ` Greg Freemyer
  1 sibling, 2 replies; 34+ messages in thread
From: Majed B. @ 2009-11-09 1:13 UTC (permalink / raw)
  To: Linux RAID

The firmware which introduced the TRIM command was deemed buggy and
has been pulled out.

Are there any filesystems that are TRIM-aware?

On Sun, Nov 8, 2009 at 8:57 PM, Bill Davidsen <davidsen@tmr.com> wrote:
> For those of us playing with use of SSD for journals on ext[34], this does
> have implications for RAID performance.
>
> http://hardware.slashdot.org/story/09/10/27/1427209/Intel-Updates-SSDs-Supports-TRIM-Faster-Writes

-- 
Majed B.
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-09 1:13 ` Majed B.
@ 2009-11-09 16:37 ` Chris Worley
  2009-11-09 16:42 ` Majed B.
  2009-11-09 18:42 ` Greg Freemyer
  1 sibling, 1 reply; 34+ messages in thread
From: Chris Worley @ 2009-11-09 16:37 UTC (permalink / raw)
  To: Majed B.; +Cc: Linux RAID

On Sun, Nov 8, 2009 at 6:13 PM, Majed B. <majedb@gmail.com> wrote:
> The firmware which introduced the TRIM command was deemed buggy and
> has been pulled out.
>
> Are there any filesystems that are TRIM-aware?

Ext4 (at that level in the kernel, it's referred to as "discard", it's
not TRIM until it's issued as a SCSI command).

Chris

^ permalink raw reply [flat|nested] 34+ messages in thread
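To make the "discard" terminology above a little more concrete: kernels later than the 2.6.27 discussed in this thread expose a per-device sysfs attribute, `queue/discard_max_bytes`, where a value of 0 means the block device does not accept discard requests. (TRIM itself is the ATA command name; on the SCSI side the counterpart is UNMAP.) The sketch below only reads that attribute. The helper name `supports_discard` and its `sysfs_root` parameter are hypothetical, added here for illustration and so the function can be pointed at a fake directory tree.

```python
from pathlib import Path


def supports_discard(device: str, sysfs_root: str = "/sys/block") -> bool:
    """Return True if the kernel reports a non-zero discard limit for `device`.

    `device` is a block device name such as "sda" (an assumed example name).
    """
    attr = Path(sysfs_root) / device / "queue" / "discard_max_bytes"
    try:
        return int(attr.read_text().strip()) > 0
    except (OSError, ValueError):
        # Attribute missing (older kernel, unknown device) or unparsable:
        # report "no discard support" rather than guessing.
        return False
```

Calling `supports_discard("sda")` on a live system would consult `/sys/block/sda/queue/discard_max_bytes`; a False result says nothing about whether the filesystem on top is mounted with discard enabled.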
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-09 16:37 ` Chris Worley
@ 2009-11-09 16:42 ` Majed B.
  2009-11-09 16:59 ` Chris Worley
  0 siblings, 1 reply; 34+ messages in thread
From: Majed B. @ 2009-11-09 16:42 UTC (permalink / raw)
  To: Chris Worley; +Cc: Linux RAID

Well, SATA uses SCSI emulation so I guess that's no problem, right?

On Mon, Nov 9, 2009 at 7:37 PM, Chris Worley <worleys@gmail.com> wrote:
> On Sun, Nov 8, 2009 at 6:13 PM, Majed B. <majedb@gmail.com> wrote:
>> The firmware which introduced the TRIM command was deemed buggy and
>> has been pulled out.
>>
>> Are there any filesystems that are TRIM-aware?
>
> Ext4 (at that level in the kernel, it's referred to as "discard", it's
> not TRIM until it's issued as a SCSI command).
>
> Chris

-- 
Majed B.

^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-09 16:42 ` Majed B.
@ 2009-11-09 16:59 ` Chris Worley
  2009-11-10 9:42 ` Kasper Sandberg
  2009-11-10 16:36 ` Martin K. Petersen
  0 siblings, 2 replies; 34+ messages in thread
From: Chris Worley @ 2009-11-09 16:59 UTC (permalink / raw)
  To: Majed B.; +Cc: Linux RAID

On Mon, Nov 9, 2009 at 9:42 AM, Majed B. <majedb@gmail.com> wrote:
> Well, SATA uses SCSI emulation so I guess that's no problem, right?

The only problem is SSD's put Solid State Storage (SSS) behind
SATA/SAS controllers... while compatible w/ old disk technology, it
severely limits performance (i.e. none of these SSD drives do even
300MB/s... while SSS drives do 800MB/s). The initial 2.6.27
drivers and ext4 "discard" worked very well with forward-thinking SSS
not encumbered by old controller technology, but SSD's were not
able to handle it well:

http://lwn.net/Articles/347511/

So it looks like "design by committee" Linux is well behind Windows 7,
while Linux contemplates slowing new technology down to optimize for
ill-designed SSD's.

Be glad "thumb drives" didn't try to be floppy-drive-compatible!!!

Chris

^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-09 16:59 ` Chris Worley
@ 2009-11-10 9:42 ` Kasper Sandberg
  2009-11-10 15:39 ` Chris Worley
  2009-11-10 16:36 ` Martin K. Petersen
  1 sibling, 1 reply; 34+ messages in thread
From: Kasper Sandberg @ 2009-11-10 9:42 UTC (permalink / raw)
  To: Chris Worley; +Cc: Majed B., Linux RAID

On Mon, 2009-11-09 at 09:59 -0700, Chris Worley wrote:
> On Mon, Nov 9, 2009 at 9:42 AM, Majed B. <majedb@gmail.com> wrote:
> > Well, SATA uses SCSI emulation so I guess that's no problem, right?
>
> The only problem is SSD's put Solid State Storage (SSS) behind
> SATA/SAS controllers... while compatible w/ old disk technology, it
> severely limits performance (i.e. none of these SSD drives do even
> 300MB/s... while SSS drives do 800MB/s). While the initial 2.6.27

No, around 280MB/s... and obviously they don't do more, because of the
simple limitation of the SATA controllers. This also means they don't
need to do as many channels as other devices.

> drivers and ext4 "discard" worked very well with forward-thinking SSS
> not encumbered by old controller technology... but, SSD's were not
> able to handle it well:
>
> http://lwn.net/Articles/347511/
>
> So it looks like "design by committee" Linux is well behind Windows 7,

And how exactly does Windows 7 handle this so much better?

> while Linux contemplates slowing new technology down to optimize for
> ill-designed SSD's.

It does?

^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 9:42 ` Kasper Sandberg
@ 2009-11-10 15:39 ` Chris Worley
  2009-11-10 15:43 ` Majed B.
  ` (2 more replies)
  0 siblings, 3 replies; 34+ messages in thread
From: Chris Worley @ 2009-11-10 15:39 UTC (permalink / raw)
  To: Kasper Sandberg; +Cc: Majed B., Linux RAID

On Tue, Nov 10, 2009 at 2:42 AM, Kasper Sandberg <postmaster@metanurb.dk> wrote:
> On Mon, 2009-11-09 at 09:59 -0700, Chris Worley wrote:
>> On Mon, Nov 9, 2009 at 9:42 AM, Majed B. <majedb@gmail.com> wrote:
>> > Well, SATA uses SCSI emulation so I guess that's no problem, right?
>>
>> The only problem is SSD's put Solid State Storage (SSS) behind
>> SATA/SAS controllers... while compatible w/ old disk technology, it
>> severely limits performance (i.e. none of these SSD drives do even
>> 300MB/s... while SSS drives do 800MB/s). While the initial 2.6.27
> No, around 280MB/s... and obviously they dont do more, because of the
> simple limitation of the sata controllers.. this also means they dont
> need to do as many channels as other devices..

I'm not sure if you're agreeing or disagreeing here...
280MB/s<300MB/s, due to the "compatibility" based design of SSD's,
while SSS, w/o a legacy controller, can do 800MB/s out of a single
drive.

>> drivers and ext4 "discard" worked very well with forward-thinking SSS
>> not encumbered by old controller technology... but, SSD's were not
>> able to handle it well:
>>
>> http://lwn.net/Articles/347511/
>>
>> So it looks like "design by committee" Linux is well behind Windows 7,
> And how exactly does windows 7 handle this so much better?

TRIM is in W7; NTFS supports it. No Linux distro does. And by the time
"design by committee" gets through with it, we shouldn't have bothered.

>> while Linux contemplates slowing new technology down to optimize for
>> ill-designed SSD's.
> It does?

Those that speak loudest in the kernel development (and contribute the
most) work for companies like Intel that promote the slower,
controller-based, SSD's.

Chris

^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 15:39 ` Chris Worley
@ 2009-11-10 15:43 ` Majed B.
  2009-11-10 15:58 ` Chris Worley
  2009-11-10 15:48 ` Asdo
  2009-11-10 18:38 ` Kasper Sandberg
  2 siblings, 1 reply; 34+ messages in thread
From: Majed B. @ 2009-11-10 15:43 UTC (permalink / raw)
  To: Linux RAID

Does that mean we won't be able to squeeze the juice out of Intel's
Extreme SSDs on Linux?

What about those of us who use OpenFiler and build their own storage
solutions? We won't be able to provide solutions based on these SSDs
because the kernel support is crap?

I may have clients wanting to mix between SAS/SATA & SSD to load their
main database on the SSDs, but now it seems pointless since the
performance isn't gonna be that great :/

-- 
Majed B.

^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 15:43 ` Majed B.
@ 2009-11-10 15:58 ` Chris Worley
  2009-11-10 16:01 ` Majed B.
  0 siblings, 1 reply; 34+ messages in thread
From: Chris Worley @ 2009-11-10 15:58 UTC (permalink / raw)
  To: Majed B.; +Cc: Linux RAID

On Tue, Nov 10, 2009 at 8:43 AM, Majed B. <majedb@gmail.com> wrote:
> Does that mean we won't be able to squeeze the juice out of Intel's
> Extreme SSDs on Linux?

The limitation is in the design. You'll be able to get as much
performance as they can offer, given the bad design (of putting SSS
behind legacy controllers).

> What about those of us who use OpenFiler and build their own storage
> solutions? We won't be able to provide solutions based on these SSDs
> because the kernel support is crap?

It's sub-optimal, written to make the best of a bad design, limiting
performance of good designs, but not crap.

> I may have clients wanting to mix between SAS/SATA & SSD to load their
> main database on the SSDs, but now it seems pointless since the
> performance isn't gonna be that great :/

You can still get much greater performance from SSS designed
correctly. Just don't do SSD's.

Chris

^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes 2009-11-10 15:58 ` Chris Worley @ 2009-11-10 16:01 ` Majed B. 2009-11-10 16:15 ` Robin Hill 2009-11-10 16:18 ` Chris Worley 0 siblings, 2 replies; 34+ messages in thread From: Majed B. @ 2009-11-10 16:01 UTC (permalink / raw) To: Chris Worley; +Cc: Linux RAID Which disks can provide 2ms response with a read of 250 MB/s and write of 170 MB/s other than SSDs?! Are you saying that it doesn't matter whether we use Linux or Windows with SSDs because the limitation is coming from the disk's controller itself? On Tue, Nov 10, 2009 at 6:58 PM, Chris Worley <worleys@gmail.com> wrote: > On Tue, Nov 10, 2009 at 8:43 AM, Majed B. <majedb@gmail.com> wrote: >> Does that mean we won't be able to squeeze the juice out of Intel's >> Extreme SSDs on Linux? > > The limitation is in the design. You'll be able to get as much > performance as they can offer, given the bad design (of putting SSS > behind legacy controllers). > >> >> What about those of us who use OpenFiler and build their own storage >> solutions? We won't be able to provide solutions based on these SSDs >> because the kernel support is crap? > > It's sub-optimal, written to make the best of a bad design, limiting > performance of good designs, but not crap. > >> >> I may have clients wanting to mix between SAS/SATA & SSD to load their >> main database on the SSDs, but now it seems pointless since the >> performance isn't gonna be that great :/ > > You can still get much greater performance from SSS designed > correctly. Just don't do SSD's. > > Chris >> >> On Tue, Nov 10, 2009 at 6:39 PM, Chris Worley <worleys@gmail.com> wrote: >>> On Tue, Nov 10, 2009 at 2:42 AM, Kasper Sandberg <postmaster@metanurb.dk> wrote: >>>> On Mon, 2009-11-09 at 09:59 -0700, Chris Worley wrote: >>>>> On Mon, Nov 9, 2009 at 9:42 AM, Majed B. <majedb@gmail.com> wrote: >>>>> > Well, SATA uses SCSI emulation so I guess that's no problem, right? 
>>>>> >>>>> The only problem is SSD's put Solid State Storage (SSS) behind >>>>> SATA/SAS controllers... while compatible w/ old disk technology, it >>>>> severely limits performance (i.e. none of these SSD drives do even >>>>> 300MB/s... while SSS drives do 800MB/s). While the initial 2.6.27 >>>> No, around 280MB/s... and obviously they dont do more, because of the >>>> simple limitation of the sata controllers.. this also means they dont >>>> need to do as many channels as other devices.. >>> >>> I'm not sure if you're agreeing or disagreeing here... >>> 280MB/s<300MB/s, due to the "compatibility" based design of SSD's, >>> while SSS, w/o a legacy controller, can do 800MB/s out of a single >>> drive. >>> >>>>> drivers and ext4 "discard" worked very well with forward-thinking SSS >>>>> not encumbered by old controller technology... but, SSD's were not >>>>> able to handle it well: >>>>> >>>>> http://lwn.net/Articles/347511/ >>>>> >>>>> So it looks like "design by committee" Linux is well behind Windows 7, >>>> And how exactly does windows 7 handle this so much better? >>> >>> TRIM is in W7; NTFS support. No Linux distro does. And by the time >>> "design by committee" gets through with it,we shouldn't have bothered. >>> >>>>> while Linux contemplates slowing new technology down to optimize for >>>>> ill-designed SSD's. >>>> It does? >>> >>> Those that speak loudest in the kernel development (and contribute the >>> most) work for companies like Intel that promote the slower, >>> controller-based, SSD's. >>> >>> Chris >>>>> >>>>> Be glad "thumb drives" didn't try to be floppy-drive-compatible!!! >>>>> >>>>> Chris >>>>> > >>>>> > On Mon, Nov 9, 2009 at 7:37 PM, Chris Worley <worleys@gmail.com> wrote: >>>>> >> On Sun, Nov 8, 2009 at 6:13 PM, Majed B. <majedb@gmail.com> wrote: >>>>> >>> The firmware which introduced the TRIM command was deemed buggy and >>>>> >>> has been pulled out. >>>>> >>> >>>>> >>> Are there any filesystems that are TRIM-aware? 
>>>>> >> >>>>> >> Ext4 (at that level in the kernel, it's referred to as "discard", it's >>>>> >> not TRIM until it's issued as a SCSI command). >>>>> >> >>>>> >> Chris >>>>> >>> >>>>> >>> On Sun, Nov 8, 2009 at 8:57 PM, Bill Davidsen <davidsen@tmr.com> wrote: >>>>> >>>> For those of us playing with use of SSD for journals on ext[34], this does >>>>> >>>> have implications for RAID performance. >>>>> >>>> >>>>> >>>> http://hardware.slashdot.org/story/09/10/27/1427209/Intel-Updates-SSDs-Supports-TRIM-Faster-Writes >>>>> >>>> >>>>> >>>> -- >>>>> >>>> Bill Davidsen <davidsen@tmr.com> >>>>> >>>> Unintended results are the well-earned reward for incompetence. >>>>> >>>> >>>>> >>>> >>>>> >>>> -- >>>>> >>>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >>>>> >>>> the body of a message to majordomo@vger.kernel.org >>>>> >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>>> >>>> >>>>> >>> >>>>> >>> >>>>> >>> >>>>> >>> -- >>>>> >>> Majed B. >>>>> >>> -- >>>>> >>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >>>>> >>> the body of a message to majordomo@vger.kernel.org >>>>> >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>>> >>> >>>>> >> >>>>> > >>>>> > >>>>> > >>>>> > -- >>>>> > Majed B. >>>>> > >>>>> -- >>>>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >>>>> the body of a message to majordomo@vger.kernel.org >>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>> >>>> >>> >> >> >> >> -- >> Majed B. >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > -- Majed B. 
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 16:01 ` Majed B.
@ 2009-11-10 16:15 ` Robin Hill
  2009-11-10 16:31   ` Chris Worley
  2009-11-10 16:18 ` Chris Worley
  1 sibling, 1 reply; 34+ messages in thread
From: Robin Hill @ 2009-11-10 16:15 UTC
To: Linux RAID

On Tue Nov 10, 2009 at 07:01:02PM +0300, Majed B. wrote:

> Which disks can provide 2ms response with a read of 250 MB/s and write
> of 170 MB/s other than SSDs?!
>
> Are you saying that it doesn't matter whether we use Linux or Windows
> with SSDs because the limitation is coming from the disk's controller
> itself?
>
Not exactly - without TRIM support, the drive performance will degrade
over time. Windows 7 has TRIM implemented for deletes (and formats),
which prevents this degradation. Initial performance will be the same
on both systems though (as performance is limited by the interface -
SATA 6G is starting to appear though, which will definitely help).

Currently, AFAIK, none of the Linux filesystems support TRIM. This is
largely due to discussions about the implementation - a TRIM call
requires a full flush of all pending writes (as it's a non-queueable
call) which results in severe performance issues if done synchronously
on deletes.

HTH,
    Robin
-- 
     ___
    ( ' }     |       Robin Hill        <robin@robinhill.me.uk> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |
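As a rough check on the interface ceiling discussed in this message: SATA links use 8b/10b encoding, so every payload byte costs ten bits on the wire. The sketch below converts a nominal line rate into usable payload bandwidth, which is where the roughly 300 MB/s limit of 3 Gbit/s SATA (and 600 MB/s for SATA 6G) comes from. The function name is illustrative and not taken from the thread, and protocol overhead above the encoding layer is ignored, so real drives land somewhat lower.

```python
def sata_payload_mb_per_s(line_rate_gbit: float) -> float:
    """Usable payload bandwidth in MB/s for a SATA link.

    Assumes only the 8b/10b line encoding (10 wire bits per payload byte);
    framing and command overhead are deliberately left out.
    """
    wire_bits_per_second = line_rate_gbit * 1e9
    payload_bits_per_second = wire_bits_per_second * 8 / 10
    return payload_bits_per_second / 8 / 1e6


if __name__ == "__main__":
    for rate in (1.5, 3.0, 6.0):
        mb = sata_payload_mb_per_s(rate)
        print(f"SATA {rate} Gbit/s -> about {mb:.0f} MB/s of payload")
```

Seen this way, a drive reporting around 280 MB/s on a 3 Gbit/s link is already close to what the interconnect can carry.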
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 16:15 ` Robin Hill
@ 2009-11-10 16:31   ` Chris Worley
  0 siblings, 0 replies; 34+ messages in thread
From: Chris Worley @ 2009-11-10 16:31 UTC
To: Linux RAID

On Tue, Nov 10, 2009 at 9:15 AM, Robin Hill <robin@robinhill.me.uk> wrote:
> On Tue Nov 10, 2009 at 07:01:02PM +0300, Majed B. wrote:
>
>> Which disks can provide 2ms response with a read of 250 MB/s and write
>> of 170 MB/s other than SSDs?!
>>
>> Are you saying that it doesn't matter whether we use Linux or Windows
>> with SSDs because the limitation is coming from the disk's controller
>> itself?
>>
> Not exactly - without TRIM support, the drive performance will degrade
> over time.

Not true. First, it never affects read performance (you didn't
qualify). Given that SSDs have a management layer that rotating media
doesn't, it is a very complex issue, and dependent on the SSD
management layer's algorithms. For many of these algorithms, most
typical write usage performance is completely unaffected. The biggest
performance effect is seen in benchmarks that were designed w/
rotating media in mind and that make assumptions that don't apply to
real applications, but that were justifiable given rotating media's
simplicity. Those assumptions are no longer justifiable given the
SSD's management layer; benchmarks must be re-coded to exhibit more
application-like behavior.

> Windows 7 has TRIM implemented for deletes (and formats),
> which prevents this degradation. Initial performance will be the same
> on both systems though (as performance is limited by the interface -
> SATA 6G is starting to appear though, which will definitely help).
>
> Currently, AFAIK, none of the Linux filesystems support TRIM.

ext4 supports discard. I've been using it successfully since 2.6.27.
FAT did too, but it was pulled.

> This is
> largely due to discussions about the implementation - a TRIM call
> requires a full flush of all pending writes (as it's a non-queueable
> call) which results in severe performance issues if done synchronously
> on deletes.

This only causes performance issues for ill-designed SSDs that put
themselves behind legacy controllers for compatibility reasons.
Conceptually, for SSS, the earlier the TRIM the better (the sooner the
management layer can use that information), no matter the perceived
fragmentation. The performance issue only arises with the poor
compatibility design, but, given they are the loudest voices in the
Linux community, their design will prevail.

Chris
>
> HTH,
>    Robin
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes 2009-11-10 16:01 ` Majed B. 2009-11-10 16:15 ` Robin Hill @ 2009-11-10 16:18 ` Chris Worley 2009-11-10 18:31 ` Majed B. 2009-11-10 18:40 ` Kasper Sandberg 1 sibling, 2 replies; 34+ messages in thread From: Chris Worley @ 2009-11-10 16:18 UTC (permalink / raw) To: Majed B.; +Cc: Linux RAID On Tue, Nov 10, 2009 at 9:01 AM, Majed B. <majedb@gmail.com> wrote: > Which disks can provide 2ms response with a read of 250 MB/s and write > of 170 MB/s other than SSDs?! The drives I use average <50usecs latency at 4KB packets (properly measured as the complete turn-around time of a single outstanding I/O), 800MB/s reads and >600MB/s writes at 128KB blocks. > > Are you saying that it doesn't matter whether we use Linux or Windows > with SSDs because the limitation is coming from the disk's controller > itself? To some degree, yes, when using SSD's behind a controller, the controller is the biggest performance issue, and given they use chicklets for processors, they all hamper performance given the speed potential of the underlying storage. As none of the enterprise distros are handling TRIM yet, W7 can claim it was first, and putting together a TRIM-capable kernel is manual currently in Linux, and given only ext4 supports it (strangely, FAT supported it, then the code was pulled... XFS may support it, but I believe that's still in the works), you have the additional problem that ext4 has some maturity issues. Porting "discard" to ext2/3 would not be too difficult, especially w/o journal considerations. Chris > > On Tue, Nov 10, 2009 at 6:58 PM, Chris Worley <worleys@gmail.com> wrote: >> On Tue, Nov 10, 2009 at 8:43 AM, Majed B. <majedb@gmail.com> wrote: >>> Does that mean we won't be able to squeeze the juice out of Intel's >>> Extreme SSDs on Linux? >> >> The limitation is in the design. You'll be able to get as much >> performance as they can offer, given the bad design (of putting SSS >> behind legacy controllers). 
>> >>> >>> What about those of us who use OpenFiler and build their own storage >>> solutions? We won't be able to provide solutions based on these SSDs >>> because the kernel support is crap? >> >> It's sub-optimal, written to make the best of a bad design, limiting >> performance of good designs, but not crap. >> >>> >>> I may have clients wanting to mix between SAS/SATA & SSD to load their >>> main database on the SSDs, but now it seems pointless since the >>> performance isn't gonna be that great :/ >> >> You can still get much greater performance from SSS designed >> correctly. Just don't do SSD's. >> >> Chris >>> >>> On Tue, Nov 10, 2009 at 6:39 PM, Chris Worley <worleys@gmail.com> wrote: >>>> On Tue, Nov 10, 2009 at 2:42 AM, Kasper Sandberg <postmaster@metanurb.dk> wrote: >>>>> On Mon, 2009-11-09 at 09:59 -0700, Chris Worley wrote: >>>>>> On Mon, Nov 9, 2009 at 9:42 AM, Majed B. <majedb@gmail.com> wrote: >>>>>> > Well, SATA uses SCSI emulation so I guess that's no problem, right? >>>>>> >>>>>> The only problem is SSD's put Solid State Storage (SSS) behind >>>>>> SATA/SAS controllers... while compatible w/ old disk technology, it >>>>>> severely limits performance (i.e. none of these SSD drives do even >>>>>> 300MB/s... while SSS drives do 800MB/s). While the initial 2.6.27 >>>>> No, around 280MB/s... and obviously they dont do more, because of the >>>>> simple limitation of the sata controllers.. this also means they dont >>>>> need to do as many channels as other devices.. >>>> >>>> I'm not sure if you're agreeing or disagreeing here... >>>> 280MB/s<300MB/s, due to the "compatibility" based design of SSD's, >>>> while SSS, w/o a legacy controller, can do 800MB/s out of a single >>>> drive. >>>> >>>>>> drivers and ext4 "discard" worked very well with forward-thinking SSS >>>>>> not encumbered by old controller technology... 
but, SSD's were not >>>>>> able to handle it well: >>>>>> >>>>>> http://lwn.net/Articles/347511/ >>>>>> >>>>>> So it looks like "design by committee" Linux is well behind Windows 7, >>>>> And how exactly does windows 7 handle this so much better? >>>> >>>> TRIM is in W7; NTFS support. No Linux distro does. And by the time >>>> "design by committee" gets through with it,we shouldn't have bothered. >>>> >>>>>> while Linux contemplates slowing new technology down to optimize for >>>>>> ill-designed SSD's. >>>>> It does? >>>> >>>> Those that speak loudest in the kernel development (and contribute the >>>> most) work for companies like Intel that promote the slower, >>>> controller-based, SSD's. >>>> >>>> Chris >>>>>> >>>>>> Be glad "thumb drives" didn't try to be floppy-drive-compatible!!! >>>>>> >>>>>> Chris >>>>>> > >>>>>> > On Mon, Nov 9, 2009 at 7:37 PM, Chris Worley <worleys@gmail.com> wrote: >>>>>> >> On Sun, Nov 8, 2009 at 6:13 PM, Majed B. <majedb@gmail.com> wrote: >>>>>> >>> The firmware which introduced the TRIM command was deemed buggy and >>>>>> >>> has been pulled out. >>>>>> >>> >>>>>> >>> Are there any filesystems that are TRIM-aware? >>>>>> >> >>>>>> >> Ext4 (at that level in the kernel, it's referred to as "discard", it's >>>>>> >> not TRIM until it's issued as a SCSI command). >>>>>> >> >>>>>> >> Chris >>>>>> >>> >>>>>> >>> On Sun, Nov 8, 2009 at 8:57 PM, Bill Davidsen <davidsen@tmr.com> wrote: >>>>>> >>>> For those of us playing with use of SSD for journals on ext[34], this does >>>>>> >>>> have implications for RAID performance. >>>>>> >>>> >>>>>> >>>> http://hardware.slashdot.org/story/09/10/27/1427209/Intel-Updates-SSDs-Supports-TRIM-Faster-Writes >>>>>> >>>> >>>>>> >>>> -- >>>>>> >>>> Bill Davidsen <davidsen@tmr.com> >>>>>> >>>> Unintended results are the well-earned reward for incompetence. 
>>>>>> >>>> >>>>>> >>>> >>>>>> >>>> -- >>>>>> >>>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >>>>>> >>>> the body of a message to majordomo@vger.kernel.org >>>>>> >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>>>> >>>> >>>>>> >>> >>>>>> >>> >>>>>> >>> >>>>>> >>> -- >>>>>> >>> Majed B. >>>>>> >>> -- >>>>>> >>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >>>>>> >>> the body of a message to majordomo@vger.kernel.org >>>>>> >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>>>> >>> >>>>>> >> >>>>>> > >>>>>> > >>>>>> > >>>>>> > -- >>>>>> > Majed B. >>>>>> > >>>>>> -- >>>>>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >>>>>> the body of a message to majordomo@vger.kernel.org >>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>>> >>>>> >>>> >>> >>> >>> >>> -- >>> Majed B. >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>> >> > > > > -- > Majed B. > -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 34+ messages in thread
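The message above defines latency as the complete turn-around time of a single outstanding I/O. A minimal sketch of that measurement style follows: it issues one 4 KB read at a time at random offsets and averages the wall-clock time per request. Everything here is an assumption made for illustration (the function name, the temporary test file, the sample count). Because the reads go through the page cache rather than using direct I/O against a raw device, the numbers it prints say nothing about any particular drive. `os.pread` is available on Unix-like systems only.

```python
import os
import random
import tempfile
import time


def mean_qd1_read_latency_us(path, block_size=4096, samples=1000):
    """Average turn-around time, in microseconds, of reads issued one at a time."""
    blocks = os.path.getsize(path) // block_size
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter()
        for _ in range(samples):
            offset = random.randrange(blocks) * block_size
            # pread blocks until this single request completes, so there is
            # never more than one I/O outstanding (queue depth 1).
            os.pread(fd, block_size, offset)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return elapsed / samples * 1e6


if __name__ == "__main__":
    # Hypothetical 1 MiB scratch file; a real test would target the device.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(os.urandom(4096 * 256))
        scratch = f.name
    try:
        latency = mean_qd1_read_latency_us(scratch)
        print(f"mean latency {latency:.1f} us, "
              f"about {1e6 / latency:.0f} IOPS at queue depth 1")
    finally:
        os.unlink(scratch)
```

Keeping the queue depth at one is what makes the result a latency figure rather than a throughput figure; a tool such as fio, mentioned elsewhere in the thread, is the more usual way to run this against real hardware.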
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes 2009-11-10 16:18 ` Chris Worley @ 2009-11-10 18:31 ` Majed B. 2009-11-10 23:03 ` Mathieu Chouquet-Stringer 2009-11-10 18:40 ` Kasper Sandberg 1 sibling, 1 reply; 34+ messages in thread From: Majed B. @ 2009-11-10 18:31 UTC (permalink / raw) To: Linux RAID Chris, Do you mind sharing the drive models & controllers you're using that give you 800 MB/s? On Tue, Nov 10, 2009 at 7:18 PM, Chris Worley <worleys@gmail.com> wrote: > On Tue, Nov 10, 2009 at 9:01 AM, Majed B. <majedb@gmail.com> wrote: >> Which disks can provide 2ms response with a read of 250 MB/s and write >> of 170 MB/s other than SSDs?! > > The drives I use average <50usecs latency at 4KB packets (properly > measured as the complete turn-around time of a single outstanding > I/O), 800MB/s reads and >600MB/s writes at 128KB blocks. > >> >> Are you saying that it doesn't matter whether we use Linux or Windows >> with SSDs because the limitation is coming from the disk's controller >> itself? > > To some degree, yes, when using SSD's behind a controller, the > controller is the biggest performance issue, and given they use > chicklets for processors, they all hamper performance given the speed > potential of the underlying storage. > > As none of the enterprise distros are handling TRIM yet, W7 can claim > it was first, and putting together a TRIM-capable kernel is manual > currently in Linux, and given only ext4 supports it (strangely, FAT > supported it, then the code was pulled... XFS may support it, but I > believe that's still in the works), you have the additional problem > that ext4 has some maturity issues. Porting "discard" to ext2/3 would > not be too difficult, especially w/o journal considerations. > > Chris >> >> On Tue, Nov 10, 2009 at 6:58 PM, Chris Worley <worleys@gmail.com> wrote: >>> On Tue, Nov 10, 2009 at 8:43 AM, Majed B. 
<majedb@gmail.com> wrote: >>>> Does that mean we won't be able to squeeze the juice out of Intel's >>>> Extreme SSDs on Linux? >>> >>> The limitation is in the design. You'll be able to get as much >>> performance as they can offer, given the bad design (of putting SSS >>> behind legacy controllers). -- Majed B. -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 18:31 ` Majed B.
@ 2009-11-10 23:03   ` Mathieu Chouquet-Stringer
  2009-11-11  2:52     ` Majed B.
  0 siblings, 1 reply; 34+ messages in thread
From: Mathieu Chouquet-Stringer @ 2009-11-10 23:03 UTC
To: "Majed B."; +Cc: Linux RAID

majedb@gmail.com ("Majed B.") writes:
> Chris,
>
> Do you mind sharing the drive models & controllers you're using that
> give you 800 MB/s?

At work we reviewed different kinds of Fusion-io products (namely ioDrive
and ioDrive-Duo, 80GB and 640GB) and I could easily get 700 MB/s with
more than 100k IOPS (benched using fio)...

http://kb.fusionio.com/KB/a29/verifying-linux-system-performance.aspx

Their results are consistent with what I saw...

I didn't like the binary-like driver though...
-- 
Mathieu Chouquet-Stringer                             mchouque@free.fr
            The sun itself sees not till heaven clears.
                     -- William Shakespeare --
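The throughput and IOPS figures traded in this thread are linked by block size and by how many requests are in flight. The sketch below works through that arithmetic, assuming a steady state in which Little's law holds (IOPS = outstanding I/Os / latency). The 50 microsecond latency and the 4 KB block size are figures quoted earlier in the thread for a separate measurement and are reused here purely as an illustration; the helper names are made up.

```python
def iops_at_queue_depth(latency_s, queue_depth=1):
    """Little's law: completed I/Os per second for a given latency and depth."""
    return queue_depth / latency_s


def throughput_mb_per_s(iops, block_bytes):
    """Throughput in MB/s for a given IOPS rate and block size."""
    return iops * block_bytes / 1e6


if __name__ == "__main__":
    latency_s = 50e-6      # quoted elsewhere in the thread, illustrative only
    block = 4096           # 4 KB requests

    qd1 = iops_at_queue_depth(latency_s)
    print(f"queue depth 1: {qd1:.0f} IOPS, "
          f"{throughput_mb_per_s(qd1, block):.1f} MB/s")

    # How many requests must be in flight to sustain 100k IOPS at that latency?
    needed_depth = 100_000 * latency_s
    print(f"100k IOPS needs about {needed_depth:.1f} outstanding I/Os")
    print(f"100k IOPS at 4 KB is {throughput_mb_per_s(100_000, block):.1f} MB/s")
```

The point of the exercise is that a headline MB/s number and a headline IOPS number usually come from different block sizes and queue depths, so they should not be compared without knowing the settings of each run.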
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 23:03 ` Mathieu Chouquet-Stringer
@ 2009-11-11  2:52   ` Majed B.
  0 siblings, 0 replies; 34+ messages in thread
From: Majed B. @ 2009-11-11 2:52 UTC
To: Linux RAID

Thank you Mathieu for the input.

I have seen IBM DS4800 SANs doing 600MB/s-700MB/s using a bunch of
148GB 15k RPM FC disks, though I haven't seen them being benchmarked
for IOPS.

I was reading an article from AnandTech yesterday that compared
rotational media to SSDs and they ran some stress tests. In the end,
they concluded that hardware RAID controllers were hampering the
performance because they couldn't absorb the amount of requests coming
at them from the SSDs.

When they switched to software RAID (on Windows in their test), they
got almost double the performance (RAID5, 8 disks).

You can see the numbers here:
http://it.anandtech.com/IT/showdoc.aspx?i=3532&p=9

If you're interested in the whole article, you can go to the link
above and go back to the main page using the index.

On sequential read, they achieved 1257 MB/s on a RAID 5 setup. Quite
impressive for those with video streaming applications.

On Wed, Nov 11, 2009 at 2:03 AM, Mathieu Chouquet-Stringer
<mchouque@free.fr> wrote:
> majedb@gmail.com ("Majed B.") writes:
>> Chris,
>>
>> Do you mind sharing the drive models & controllers you're using that
>> give you 800 MB/s?
>
> At work we reviewed different kind of fusion-io products (namely ioDrive
> and ioDrive-Duo, 80GB and 640GB) and I could easily get 700 MB/s with
> more than 100k iops (benched using fio)...
>
> http://kb.fusionio.com/KB/a29/verifying-linux-system-performance.aspx
>
> Their results are consistent with what I saw...
>
> I didn't like the binary like driver though...

-- 
       Majed B.
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 16:18 ` Chris Worley
  2009-11-10 18:31   ` Majed B.
@ 2009-11-10 18:40   ` Kasper Sandberg
  1 sibling, 0 replies; 34+ messages in thread
From: Kasper Sandberg @ 2009-11-10 18:40 UTC
To: Chris Worley; +Cc: Majed B., Linux RAID

On Tue, 2009-11-10 at 09:18 -0700, Chris Worley wrote:
> On Tue, Nov 10, 2009 at 9:01 AM, Majed B. <majedb@gmail.com> wrote:
> > Which disks can provide 2ms response with a read of 250 MB/s and write
> > of 170 MB/s other than SSDs?!
>
> The drives I use average <50usecs latency at 4KB packets (properly
> measured as the complete turn-around time of a single outstanding
> I/O), 800MB/s reads and >600MB/s writes at 128KB blocks.
>
> >
> > Are you saying that it doesn't matter whether we use Linux or Windows
> > with SSDs because the limitation is coming from the disk's controller
> > itself?
>
> To some degree, yes, when using SSD's behind a controller, the
> controller is the biggest performance issue, and given they use
> chicklets for processors, they all hamper performance given the speed
> potential of the underlying storage.
>
> As none of the enterprise distros are handling TRIM yet, W7 can claim
> it was first, and putting together a TRIM-capable kernel is manual

Except it wasn't; it may be earlier than the enterprise distros, but
that's not first.

> currently in Linux, and given only ext4 supports it (strangely, FAT
> supported it, then the code was pulled... XFS may support it, but I
> believe that's still in the works), you have the additional problem
> that ext4 has some maturity issues. Porting "discard" to ext2/3 would
> not be too difficult, especially w/o journal considerations.

And given W7 supports it, it is going to have the same issues which
Linux faces. I don't know what solution Microsoft has chosen, but that
doesn't mean Linux shouldn't choose the best one.
> > Chris > > > > On Tue, Nov 10, 2009 at 6:58 PM, Chris Worley <worleys@gmail.com> wrote: > >> On Tue, Nov 10, 2009 at 8:43 AM, Majed B. <majedb@gmail.com> wrote: > >>> Does that mean we won't be able to squeeze the juice out of Intel's > >>> Extreme SSDs on Linux? > >> > >> The limitation is in the design. You'll be able to get as much > >> performance as they can offer, given the bad design (of putting SSS > >> behind legacy controllers). > >> > >>> > >>> What about those of us who use OpenFiler and build their own storage > >>> solutions? We won't be able to provide solutions based on these SSDs > >>> because the kernel support is crap? > >> > >> It's sub-optimal, written to make the best of a bad design, limiting > >> performance of good designs, but not crap. > >> > >>> > >>> I may have clients wanting to mix between SAS/SATA & SSD to load their > >>> main database on the SSDs, but now it seems pointless since the > >>> performance isn't gonna be that great :/ > >> > >> You can still get much greater performance from SSS designed > >> correctly. Just don't do SSD's. > >> > >> Chris > >>> > >>> On Tue, Nov 10, 2009 at 6:39 PM, Chris Worley <worleys@gmail.com> wrote: > >>>> On Tue, Nov 10, 2009 at 2:42 AM, Kasper Sandberg <postmaster@metanurb.dk> wrote: > >>>>> On Mon, 2009-11-09 at 09:59 -0700, Chris Worley wrote: > >>>>>> On Mon, Nov 9, 2009 at 9:42 AM, Majed B. <majedb@gmail.com> wrote: > >>>>>> > Well, SATA uses SCSI emulation so I guess that's no problem, right? > >>>>>> > >>>>>> The only problem is SSD's put Solid State Storage (SSS) behind > >>>>>> SATA/SAS controllers... while compatible w/ old disk technology, it > >>>>>> severely limits performance (i.e. none of these SSD drives do even > >>>>>> 300MB/s... while SSS drives do 800MB/s). While the initial 2.6.27 > >>>>> No, around 280MB/s... and obviously they dont do more, because of the > >>>>> simple limitation of the sata controllers.. 
this also means they dont > >>>>> need to do as many channels as other devices.. > >>>> > >>>> I'm not sure if you're agreeing or disagreeing here... > >>>> 280MB/s<300MB/s, due to the "compatibility" based design of SSD's, > >>>> while SSS, w/o a legacy controller, can do 800MB/s out of a single > >>>> drive. > >>>> > >>>>>> drivers and ext4 "discard" worked very well with forward-thinking SSS > >>>>>> not encumbered by old controller technology... but, SSD's were not > >>>>>> able to handle it well: > >>>>>> > >>>>>> http://lwn.net/Articles/347511/ > >>>>>> > >>>>>> So it looks like "design by committee" Linux is well behind Windows 7, > >>>>> And how exactly does windows 7 handle this so much better? > >>>> > >>>> TRIM is in W7; NTFS support. No Linux distro does. And by the time > >>>> "design by committee" gets through with it,we shouldn't have bothered. > >>>> > >>>>>> while Linux contemplates slowing new technology down to optimize for > >>>>>> ill-designed SSD's. > >>>>> It does? > >>>> > >>>> Those that speak loudest in the kernel development (and contribute the > >>>> most) work for companies like Intel that promote the slower, > >>>> controller-based, SSD's. > >>>> > >>>> Chris > >>>>>> > >>>>>> Be glad "thumb drives" didn't try to be floppy-drive-compatible!!! > >>>>>> > >>>>>> Chris > >>>>>> > > >>>>>> > On Mon, Nov 9, 2009 at 7:37 PM, Chris Worley <worleys@gmail.com> wrote: > >>>>>> >> On Sun, Nov 8, 2009 at 6:13 PM, Majed B. <majedb@gmail.com> wrote: > >>>>>> >>> The firmware which introduced the TRIM command was deemed buggy and > >>>>>> >>> has been pulled out. > >>>>>> >>> > >>>>>> >>> Are there any filesystems that are TRIM-aware? > >>>>>> >> > >>>>>> >> Ext4 (at that level in the kernel, it's referred to as "discard", it's > >>>>>> >> not TRIM until it's issued as a SCSI command). 
> >>>>>> >> > >>>>>> >> Chris > >>>>>> >>> > >>>>>> >>> On Sun, Nov 8, 2009 at 8:57 PM, Bill Davidsen <davidsen@tmr.com> wrote: > >>>>>> >>>> For those of us playing with use of SSD for journals on ext[34], this does > >>>>>> >>>> have implications for RAID performance. > >>>>>> >>>> > >>>>>> >>>> http://hardware.slashdot.org/story/09/10/27/1427209/Intel-Updates-SSDs-Supports-TRIM-Faster-Writes > >>>>>> >>>> > >>>>>> >>>> -- > >>>>>> >>>> Bill Davidsen <davidsen@tmr.com> > >>>>>> >>>> Unintended results are the well-earned reward for incompetence. > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> -- > >>>>>> >>>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in > >>>>>> >>>> the body of a message to majordomo@vger.kernel.org > >>>>>> >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html > >>>>>> >>>> > >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > >>>>>> >>> -- > >>>>>> >>> Majed B. > >>>>>> >>> -- > >>>>>> >>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in > >>>>>> >>> the body of a message to majordomo@vger.kernel.org > >>>>>> >>> More majordomo info at http://vger.kernel.org/majordomo-info.html > >>>>>> >>> > >>>>>> >> > >>>>>> > > >>>>>> > > >>>>>> > > >>>>>> > -- > >>>>>> > Majed B. > >>>>>> > > >>>>>> -- > >>>>>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in > >>>>>> the body of a message to majordomo@vger.kernel.org > >>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html > >>>>> > >>>>> > >>>> > >>> > >>> > >>> > >>> -- > >>> Majed B. > >>> -- > >>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in > >>> the body of a message to majordomo@vger.kernel.org > >>> More majordomo info at http://vger.kernel.org/majordomo-info.html > >>> > >> > > > > > > > > -- > > Majed B. 
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 15:39 ` Chris Worley
  2009-11-10 15:43   ` Majed B.
@ 2009-11-10 15:48   ` Asdo
  2009-11-10 16:04     ` Chris Worley
  2009-11-10 18:38   ` Kasper Sandberg
  2 siblings, 1 reply; 34+ messages in thread
From: Asdo @ 2009-11-10 15:48 UTC
To: Chris Worley; +Cc: linux-raid

Chris Worley wrote:
> I'm not sure if you're agreeing or disagreeing here...
> 280MB/s<300MB/s, due to the "compatibility" based design of SSD's,
> while SSS, w/o a legacy controller, can do 800MB/s out of a single
> drive.
>
I have not heard about these SSS you mention.
Do you have a link?

Also, are you sure that the SATA/SCSI layer is the problem? Some hardware
RAIDs can do 800 MB/s sequential, single stream, and indeed with a SATA/SAS
interface to the kernel. If what you say were true, that would be
impossible...
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 15:48 ` Asdo
@ 2009-11-10 16:04   ` Chris Worley
  2009-11-11 18:02     ` Default User
  0 siblings, 1 reply; 34+ messages in thread
From: Chris Worley @ 2009-11-10 16:04 UTC
To: Asdo; +Cc: linux-raid

On Tue, Nov 10, 2009 at 8:48 AM, Asdo <asdo@shiftmail.org> wrote:
> Chris Worley wrote:
>>
>> I'm not sure if you're agreeing or disagreeing here...
>> 280MB/s<300MB/s, due to the "compatibility" based design of SSD's,
>> while SSS, w/o a legacy controller, can do 800MB/s out of a single
>> drive.
>>
>
> I have not heard about these SSS you mention.
> Do you have a link?

All the Fusion-io products (fusionio.com) and TMS's (ramsan.com) RS20
are two examples (not their RAM-based products). Sun has their
"Sunfire", but I haven't seen that yet.

>
> Also are you sure that the SATA/SCSI layer is the problem? Some hardware
> raids can do 800 MB/s sequential, single stream, and indeed with a SATA/SAS
> interface to the kernel. If what you say was true, that would be
> impossible...

Sequential/streaming performance is a corner case. There are many
high speed solutions to that (even using rotating media). I'm talking
random I/O at 128KB blocks at 800MB/s per drive.

Chris
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 16:04 ` Chris Worley
@ 2009-11-11 18:02   ` Default User
  0 siblings, 0 replies; 34+ messages in thread
From: Default User @ 2009-11-11 18:02 UTC
To: Chris Worley; +Cc: linux-raid

Chris Worley wrote:
> On Tue, Nov 10, 2009 at 8:48 AM, Asdo <asdo@shiftmail.org> wrote:
>
>> I have not heard about these SSS you mention.
>> Do you have a link?
>>
>
> All the Fusion-io products (fusionio.com) and TMS's (ramsan.com) RS20
> are two examples (not their RAM-based products). Sun has their
> "Sunfire", but I haven't seen that yet.
>
I don't know TMS; I know Fusion-io a bit: it is indeed 10x faster than
an SSD but it is also 10 times more expensive!

If you make a RAID-0 of ten SSDs on a good hardware RAID controller,
exported to the OS as a single SCSI disk, I bet you obtain about the
same performance.

Look at this:
http://www.tomshardware.com/reviews/x25-e-ssd-performance,2365.html

Looking at this page
http://www.tomshardware.com/reviews/x25-e-ssd-performance,2365-7.html
it seems the "streaming writes" test is similar to the benchmark you
want (see the specs), do you agree? Yes, it's 0% random and it uses 4
workers... and the block size is the one you want.

You find the result on the following page. That's 2.2GB/sec with 16
disks. If you imagine it with 8 disks and only 1 controller (the
benchmark uses 2 controllers with a software RAID-0 above) it's more
than the speed you want (800MB/sec) and it's with a SCSI interface.

What do you think?

>> Also are you sure that the SATA/SCSI layer is the problem? Some hardware
>> raids can do 800 MB/s sequential, single stream, and indeed with a SATA/SAS
>> interface to the kernel. If what you say was true, that would be
>> impossible...
>>
>
> Sequential/streaming performance is a corner case. There are many
> high speed solutions to that (even using rotating media). I'm talking
> random I/O at 128KB blocks at 800MB/s per drive.
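The 8-disk estimate in this message can be written out as a short calculation. This is only a sketch under the message's own assumption that throughput scales linearly with the number of disks and that a single controller does not become the bottleneck. The function name is illustrative, and the 2.2 GB/s figure is taken to mean 2200 MB/s.

```python
def scaled_throughput_mb_per_s(measured_mb_per_s, measured_disks, target_disks):
    """Linear extrapolation of array throughput to a different disk count."""
    per_disk = measured_mb_per_s / measured_disks
    return per_disk * target_disks


if __name__ == "__main__":
    target = 800  # MB/s, the per-drive figure under discussion
    estimate = scaled_throughput_mb_per_s(2200, measured_disks=16, target_disks=8)
    print(f"8-disk estimate: {estimate:.0f} MB/s (target {target} MB/s)")
    print("meets target" if estimate >= target else "below target")
```

Note that the benchmark being extrapolated is a streaming (0% random) workload, which is exactly the case the quoted reply calls a corner case, so the comparison holds for sequential transfers only.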
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes 2009-11-10 15:39 ` Chris Worley 2009-11-10 15:43 ` Majed B. 2009-11-10 15:48 ` Asdo @ 2009-11-10 18:38 ` Kasper Sandberg 2 siblings, 0 replies; 34+ messages in thread From: Kasper Sandberg @ 2009-11-10 18:38 UTC (permalink / raw) To: Chris Worley; +Cc: Majed B., Linux RAID On Tue, 2009-11-10 at 08:39 -0700, Chris Worley wrote: > On Tue, Nov 10, 2009 at 2:42 AM, Kasper Sandberg <postmaster@metanurb.dk> wrote: > > On Mon, 2009-11-09 at 09:59 -0700, Chris Worley wrote: > >> On Mon, Nov 9, 2009 at 9:42 AM, Majed B. <majedb@gmail.com> wrote: > >> > Well, SATA uses SCSI emulation so I guess that's no problem, right? > >> > >> The only problem is SSD's put Solid State Storage (SSS) behind > >> SATA/SAS controllers... while compatible w/ old disk technology, it > >> severely limits performance (i.e. none of these SSD drives do even > >> 300MB/s... while SSS drives do 800MB/s). While the initial 2.6.27 > > No, around 280MB/s... and obviously they dont do more, because of the > > simple limitation of the sata controllers.. this also means they dont > > need to do as many channels as other devices.. > > I'm not sure if you're agreeing or disagreeing here... > 280MB/s<300MB/s, due to the "compatibility" based design of SSD's, > while SSS, w/o a legacy controller, can do 800MB/s out of a single > drive. A single drive which uses _ALOT_ more channels than the SSDs doing sata, and thats why, so you get extra performance, for a price.. > > >> drivers and ext4 "discard" worked very well with forward-thinking SSS > >> not encumbered by old controller technology... but, SSD's were not > >> able to handle it well: > >> > >> http://lwn.net/Articles/347511/ > >> > >> So it looks like "design by committee" Linux is well behind Windows 7, > > And how exactly does windows 7 handle this so much better? > > TRIM is in W7; NTFS support. No Linux distro does. 
And by the time > "design by committee" gets through with it,we shouldn't have bothered. > > >> while Linux contemplates slowing new technology down to optimize for > >> ill-designed SSD's. > > It does? > > Those that speak loudest in the kernel development (and contribute the > most) work for companies like Intel that promote the slower, > controller-based, SSD's. these 3 comments makes no sense.. > > Chris > >> > >> Be glad "thumb drives" didn't try to be floppy-drive-compatible!!! > >> > >> Chris > >> > > >> > On Mon, Nov 9, 2009 at 7:37 PM, Chris Worley <worleys@gmail.com> wrote: > >> >> On Sun, Nov 8, 2009 at 6:13 PM, Majed B. <majedb@gmail.com> wrote: > >> >>> The firmware which introduced the TRIM command was deemed buggy and > >> >>> has been pulled out. > >> >>> > >> >>> Are there any filesystems that are TRIM-aware? > >> >> > >> >> Ext4 (at that level in the kernel, it's referred to as "discard", it's > >> >> not TRIM until it's issued as a SCSI command). > >> >> > >> >> Chris > >> >>> > >> >>> On Sun, Nov 8, 2009 at 8:57 PM, Bill Davidsen <davidsen@tmr.com> wrote: > >> >>>> For those of us playing with use of SSD for journals on ext[34], this does > >> >>>> have implications for RAID performance. > >> >>>> > >> >>>> http://hardware.slashdot.org/story/09/10/27/1427209/Intel-Updates-SSDs-Supports-TRIM-Faster-Writes > >> >>>> > >> >>>> -- > >> >>>> Bill Davidsen <davidsen@tmr.com> > >> >>>> Unintended results are the well-earned reward for incompetence. > >> >>>> > >> >>>> > >> >>>> -- > >> >>>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in > >> >>>> the body of a message to majordomo@vger.kernel.org > >> >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html > >> >>>> > >> >>> > >> >>> > >> >>> > >> >>> -- > >> >>> Majed B. 
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes 2009-11-09 16:59 ` Chris Worley 2009-11-10 9:42 ` Kasper Sandberg @ 2009-11-10 16:36 ` Martin K. Petersen 2009-11-10 17:22 ` Chris Worley 1 sibling, 1 reply; 34+ messages in thread From: Martin K. Petersen @ 2009-11-10 16:36 UTC (permalink / raw) To: Chris Worley; +Cc: Majed B., Linux RAID >>>>> "Chris" == Chris Worley <worleys@gmail.com> writes: Chris> The only problem is SSD's put Solid State Storage (SSS) behind Chris> SATA/SAS controllers... while compatible w/ old disk technology, Chris> it severely limits performance (i.e. none of these SSD drives do Chris> even 300MB/s... while SSS drives do 800MB/s). You are arguing that the SATA/SCSI protocols are inhibiting factors on the grounds that PCIe solid state devices are faster. Performance inside a flash device is gated by the number of channels you run in parallel. There is not much point in increasing the number of channels if your physical interconnect (3Gbps SATA, say) can't handle the traffic. Hence the drive towards 6Gbps interconnects and beyond for both SATA and SAS. Also, not all SSS boards present a memory-style device to the host. Several shipping SSS boards use a regular SAS HBA backed by multiple SATA/SAS targets which again comprise of multiple flash channels. And the performance of these devices is absolutely on par with the memory-based devices. Without requiring proprietary drivers, and without reinventing filesystems and I/O stack. We have been pushing tens of gigabytes per second through the storage stack for years when connected to arrays which - given their large non-volatile caches - are virtually indistinguishable from SSDs. We're constantly tweaking and tuning. Jens has done a lot of work to bring down command latency, I have worked on storage topology which allows us to uniquely identify the characteristics of the physical storage device so we can issue I/O in an optimal fashion. 
Note that I don't think that memory-based SSS devices are without merit. But it's baloney to claim that a storage-flavored interface inherently means bad performance. Chris> So it looks like "design by committee" Linux is well behind Chris> Windows 7, while Linux contemplates slowing new technology down Chris> to optimize for ill-designed SSD's. We're not slowing anything, nor are we optimizing for ill-designed SSDs. Because initial TRIM performance was absolutely appalling there was a lot of discussion about the merits of doing weekly scrubs instead of issuing TRIM on the fly. However, Windows 7 shipped issuing TRIM in realtime which means that all the early SSDs with lame duck DSM performance are headed straight for the garbage bin. Furthermore, unlike Windows 7 we can't pretend everything is desktop class ATA. We've spent a lot of time making sure that our block layer discard support works equally well for ATA DSM (TRIM) and for SCSI WRITE SAME and UNMAP used by high-end arrays. All three commands have been moving targets and none of them are technically set in stone in their respective standards bodies yet. So I think it would be a stretch to claim that TRIM is well tested and stable in the industry. Intel just pulled their latest X25-M firmware because of problems with Windows 7... -- Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 34+ messages in thread
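The interconnect ceiling mentioned in this message can be worked out with a little arithmetic. The sketch below assumes SATA's 8b/10b line coding (10 bits on the wire per payload byte) and ignores framing and protocol overhead, so the results are upper bounds rather than measured throughput. The function name is made up for illustration.

```python
def sata_payload_mb_per_s(line_rate_gbps: float) -> float:
    """Upper bound on payload bandwidth for a SATA link, in MB/s.

    Assumes 8b/10b encoding: every payload byte costs 10 bits on the wire.
    Framing and command overhead are ignored.
    """
    bits_per_second = line_rate_gbps * 1e9
    bytes_per_second = bits_per_second / 10
    return bytes_per_second / 1e6


for rate in (1.5, 3.0, 6.0):
    print(f"{rate} Gbps SATA -> at most {sata_payload_mb_per_s(rate):.0f} MB/s")
```

Under these assumptions a 3 Gbps link tops out at 300 MB/s, which is in line with the roughly 280 MB/s figure quoted earlier in the thread for SATA SSDs.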
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes 2009-11-10 16:36 ` Martin K. Petersen @ 2009-11-10 17:22 ` Chris Worley 2009-11-10 20:11 ` Martin K. Petersen 0 siblings, 1 reply; 34+ messages in thread From: Chris Worley @ 2009-11-10 17:22 UTC (permalink / raw) To: Martin K. Petersen; +Cc: Majed B., Linux RAID On Tue, Nov 10, 2009 at 9:36 AM, Martin K. Petersen <martin.petersen@oracle.com> wrote: >>>>>> "Chris" == Chris Worley <worleys@gmail.com> writes: > > Chris> The only problem is SSD's put Solid State Storage (SSS) behind > Chris> SATA/SAS controllers... while compatible w/ old disk technology, > Chris> it severely limits performance (i.e. none of these SSD drives do > Chris> even 300MB/s... while SSS drives do 800MB/s). > > You are arguing that the SATA/SCSI protocols are inhibiting factors on > the grounds that PCIe solid state devices are faster. > > Performance inside a flash device is gated by the number of channels you > run in parallel. There is not much point in increasing the number of > channels if your physical interconnect (3Gbps SATA, say) can't handle > the traffic. Hence the drive towards 6Gbps interconnects and beyond for > both SATA and SAS. Absolutely agreed: the SSD manufacturers will limit their NAND performance given the performance limitations of the controller front-end. Also, given their management layer is an on-board ASIC, they further limit their performance in this design. > > Also, not all SSS boards present a memory-style device to the host. > Several shipping SSS boards use a regular SAS HBA backed by multiple > SATA/SAS targets which again comprise of multiple flash channels. And > the performance of these devices is absolutely on par with the > memory-based devices. Without requiring proprietary drivers, and > without reinventing filesystems and I/O stack. I'm not talking about memory-based or -looking devices. A block device is all you need, and you don't have to re-write file systems to put one atop a block device. 
Those using legacy controller technology can overcome the issue by using multiple devices. We've been talking single device performance. I can get 6GB/s using 8 SSS drives. Scalability is much easier when you start with really fast individual components. So, legacy controllers are still a bad design. > > We have been pushing tens of gigabytes per second through the storage > stack for years when connected to arrays which - given their large > non-volatile caches - are virtually indistinguishable from SSDs. We're > constantly tweaking and tuning. Jens has done a lot of work to bring > down command latency, I have worked on storage topology which allows us > to uniquely identify the characteristics of the physical storage device > so we can issue I/O in an optimal fashion. And I do appreciate all your work. I fear, in this case, discard will be optimized for the slower technology... we won't be getting all that's available from it. > > Note that I don't think that memory-based SSS devices are without merit. Let's call it CPU-based. "Memory-based" sounds like RAM-based storage... we're not talking about that. > But it's baloney to claim that a storage-flavored interface inherently > means bad performance. You need an epiphany here. Between the SAS/SATA controllers and the on-board drive logic, SSD's are a bad design when it comes to performance. They are dwarfed, in performance, by CPU-based controllers. CPU's have much more performance for handling the management needed by NAND, and there are so many cores these days going unused. SSD's do win the "compatibility" argument. It's too bad we didn't invent thumb drives that were floppy compatible ;) > > > Chris> So it looks like "design by committee" Linux is well behind > Chris> Windows 7, while Linux contemplates slowing new technology down > Chris> to optimize for ill-designed SSD's. > > We're not slowing anything, nor are we optimizing for ill-designed SSDs. 
> > Because initial TRIM performance was absolutely appalling Only on SSD's behind legacy controllers. It worked great as-is with SSS. > there was a > lot of discussion about the merits of doing weekly scrubs instead of > issuing TRIM on the fly. However, Windows 7 shipped issuing TRIM in > realtime which means that all the early SSDs with lame duck DSM > performance are headed straight for the garbage bin. Too bad the legacy design doesn't go with them ;) Chris > > Futhermore, unlike Windows 7 we can't pretend everything is desktop > class ATA. We've spent a lot of time making sure that our block layer > discard support works equally well for both ATA DSM (TRIM) as well as > SCSI WRITE SAME and UNMAP used by high-end arrays. All three commands > have been moving targets and none of them are technically set in stone > in their respective standards bodies yet. > > So I think it would be a stretch to claim that TRIM is well tested and > stable in the industry. intel just pulled their latest X25-M firmware > because of problems with Windows 7... > > -- > Martin K. Petersen Oracle Linux Engineering > -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes 2009-11-10 17:22 ` Chris Worley @ 2009-11-10 20:11 ` Martin K. Petersen 2009-11-10 20:45 ` Chris Worley 2009-11-10 21:01 ` Greg Freemyer 0 siblings, 2 replies; 34+ messages in thread From: Martin K. Petersen @ 2009-11-10 20:11 UTC (permalink / raw) To: Chris Worley; +Cc: Martin K. Petersen, Majed B., Linux RAID >>>>> "Chris" == Chris Worley <worleys@gmail.com> writes: Chris> I'm not talking about memory-based or -looking devices. A block Chris> device is all you need, and you don't have to re-write file Chris> systems to put one atop a block device. And a SATA/SCSI-fronted flash disk isn't a block device how? Do you have any compelling evidence as to why using a protocol like SCSI is bad? A SCSI command is typically 16 bytes. A typical HBA IOCB slightly bigger but includes the inevitable scatterlist. We're talking a pretty dense format for expressing an I/O operation here. You seem to be arguing that letting a device speak "block" instead of SCSI would make things faster. I'm not convinced. Also, SCSI gives us a nice way to track outstanding I/Os via command queueing plus much more. All in a open, non-vendor-specific format requiring no custom drivers. Unlike, say, the SSS board you mentioned elsewhere in this thread. On top of that Linux is used all over the place in deployments that have throughput and IOPS figures above and beyond the numbers you quote here. Despite "legacy" controllers being in the mix. Chris> Those using legacy controller technology can overcome the issue Chris> by using multiple devices. We've been talking single device Chris> performance. I can get 6GB/s using 8 SSS drives. And adding another flash-backed SAS board isn't giving you exactly the same benefit? Chris> And I do appreciate all your work. I fear, in this case, discard Chris> will be optimized for the slower technology... we won't be Chris> getting all that's available from it. Discard isn't "optimized" for anything. 
It's a command. The filesystem issues it, and it gets sent to the storage device (DSM/TRIM, WRITE SAME, or UNMAP depending on target type). Chris> CPU's have much more performance for handling the management Chris> needed by NAND, and there are so many cores these days going Chris> unused. You seem to think that the limiting factor in SSD design is the speed of the ASIC and not the speed of the actual flash chips behind it. Chris> SSD's do win the "compatibility" argument. It's too bad we Chris> didn't invent thumb drives that were floppy compatible ;) There are many good reasons for that. drivers/block/floppy.c contains several of them. Keep a bag of expletives handy. >> Because initial TRIM performance was absolutely appalling Chris> Only on SSD's behind legacy controllers. It worked great as-is Chris> with SSS. Please elaborate. -- Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 34+ messages in thread
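To make the "a SCSI command is typically 16 bytes" point concrete, here is a minimal sketch that packs a READ(16) command descriptor block. The field layout (opcode 0x88, flags byte, 64-bit LBA, 32-bit transfer length, group number, control byte) is the standard READ(16) layout. The helper name and the example LBA and block count are made up for illustration, and the data buffers (the scatterlist) are not part of the command itself.

```python
import struct

READ_16 = 0x88  # SCSI READ(16) operation code


def build_read16_cdb(lba: int, num_blocks: int) -> bytes:
    """Pack a READ(16) CDB: which blocks to read, in 16 bytes.

    Big-endian fields: opcode (1), flags (1), LBA (8),
    transfer length in blocks (4), group number (1), control (1).
    """
    return struct.pack(">BBQIBB", READ_16, 0, lba, num_blocks, 0, 0)


cdb = build_read16_cdb(lba=2048, num_blocks=8)
print(len(cdb), cdb.hex())
```

The whole request, "read 8 blocks starting at block 2048", fits in those 16 bytes, which is the density argument being made above.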
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes 2009-11-10 20:11 ` Martin K. Petersen @ 2009-11-10 20:45 ` Chris Worley 2009-11-10 22:35 ` Martin K. Petersen 2009-11-10 21:01 ` Greg Freemyer 1 sibling, 1 reply; 34+ messages in thread From: Chris Worley @ 2009-11-10 20:45 UTC (permalink / raw) To: Martin K. Petersen; +Cc: Majed B., Linux RAID On Tue, Nov 10, 2009 at 1:11 PM, Martin K. Petersen <martin.petersen@oracle.com> wrote: >>>>>> "Chris" == Chris Worley <worleys@gmail.com> writes: > > Chris> I'm not talking about memory-based or -looking devices. A block > Chris> device is all you need, and you don't have to re-write file > Chris> systems to put one atop a block device. > > And a SATA/SCSI-fronted flash disk isn't a block device how? It's not any different. The previous statement that I was responding to (you snipped out) had implied that the fs code had to be re-written for non-SCSI devices. I was just assuring that was not necessary. > > Do you have any compelling evidence as to why using a protocol like SCSI > is bad? A SCSI command is typically 16 bytes. A typical HBA IOCB > slightly bigger but includes the inevitable scatterlist. We're talking > a pretty dense format for expressing an I/O operation here. I'm not saying the SCSI protocol is bad, I'm saying the SAS/SATA/SCSI controllers, that have been optimized for years for rotating media, don't have the compute power to handle the sort of performance attainable with SSS. > > You seem to be arguing that letting a device speak "block" instead of > SCSI would make things faster. I'm not convinced. That's not what I'm saying; the protocol is not the culprit, the controller is. But, once you get rid of the controller, and just speak block device, another level of overhead has been removed. > Also, SCSI gives us > a nice way to track outstanding I/Os via command queueing plus much > more. All in a open, non-vendor-specific format requiring no custom > drivers. 
Unlike, say, the SSS board you mentioned elsewhere in this > thread. At least one of the boards I mentioned I know has command queuing w/o being a SCSI device. > > On top of that Linux is used all over the place in deployments that have > throughput and IOPS figures above and beyond the numbers you quote here. I was only quoting single drive specs. You only scale to really big numbers if you start with really fast individual components. I'm sure you could quote TB/s using rotating media, but you have a lot more expensive pieces needed to get there. > Despite "legacy" controllers being in the mix. > > > Chris> Those using legacy controller technology can overcome the issue > Chris> by using multiple devices. We've been talking single device > Chris> performance. I can get 6GB/s using 8 SSS drives. > > And adding another flash-backed SAS board isn't giving you exactly the > same benefit? Again, scalability is achieved more readily and with less complexity using faster components. > > > Chris> And I do appreciate all your work. I fear, in this case, discard > Chris> will be optimized for the slower technology... we won't be > Chris> getting all that's available from it. > > Discard isn't "optimized" for anything. It's a command. Filesystem > issues it, it gets sent to the storage device (DSM/TRIM, WRITE SAME, or > UNMAP depending on target type). Unless you try to coalesce it for a later time, which is what I hear is being done to compensate for slow controllers. > > > Chris> CPU's have much more performance for handling the management > Chris> needed by NAND, and there are so many cores these days going > Chris> unused. > > You seem to think that the limiting factor in SSD design is the speed of > the ASIC and not the speed of the actual flash chips behind it. True. They limit the NAND performance based on the lack of performance of their ASIC and the controller. 
That doesn't mean you can't get a lot better performance out of NAND, it just means they limited themselves to be compatible, and the kernel will implement a strategy that will optimize for the poor design. > > > Chris> SSD's do win the "compatibility" argument. It's too bad we > Chris> didn't invent thumb drives that were floppy compatible ;) > > There are many good reasons for that. drivers/block/floppy.c contains a > several of them. Keep a bag of expletives handy. So you _are_ glad that compatibility was not followed in the move to USB thumb drives, but you also believe the best way to do SSS was behind compatible legacy SAS/SATA devices optimized for old rotating media? > > >>> Because initial TRIM performance was absolutely appalling > > Chris> Only on SSD's behind legacy controllers. It worked great as-is > Chris> with SSS. > > Please elaborate. I had no performance issues testing w/ the original discard implementation using SSS. I'd run IOZone and fill the drive (as I recall ~200GB) w/ files and benchmark, which, at the end, IOZone would delete all the files created (in the hundreds), and the delete/discard process was no more time consuming than just the delete process (for everything on the drive). This was w/ the original 2.6.27 and 2.6.28 ext4 "discard" implementations. Chris > > -- > Martin K. Petersen Oracle Linux Engineering > -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes 2009-11-10 20:45 ` Chris Worley @ 2009-11-10 22:35 ` Martin K. Petersen 2009-11-11 18:17 ` Chris Worley 0 siblings, 1 reply; 34+ messages in thread From: Martin K. Petersen @ 2009-11-10 22:35 UTC (permalink / raw) To: Chris Worley; +Cc: Martin K. Petersen, Majed B., Linux RAID >>>>> "Chris" == Chris Worley <worleys@gmail.com> writes: Chris> I'm not saying the SCSI protocol is bad, I'm saying the Chris> SAS/SATA/SCSI controllers, that have been optimized for years for Chris> rotating media, don't have the compute power to handle the sort Chris> of performance attainable with SSS. And I'm saying that at least in the SCSI case that's untrue. SAS and FC controllers are optimized for lots and lots of I/O because their main application is driving large storage arrays which have performance comparable to the solid state devices you mention. In fact, many deployments use said SCSI controllers to drive RAM-based solid state storage devices which are faster than the flash-based devices we're talking about here. Chris> Unless you try to coalesce it for a later time, which is what I Chris> hear is being done to compensate for slow controllers. We don't coalesce. Chris> True. They limit the NAND performance based on the lack of Chris> performance of their ASIC and the controller. Interesting theory. I'm personally of the conviction that cheap SSDs suffer from amazingly poor FTL design rather than inherent hardware limitations. That's something intel got right with their drives. The hardware itself is pretty unremarkable. Chris> That doesn't mean you can't get a lot better performance out of Chris> NAND, it just means they limited themselves to be compatible, and Chris> the kernel will implement a strategy that will optimize for the Chris> poor design. You are confusing limitations in interconnect technology with the properties of the protocols used. 
There is no point in adding channels behind the ASIC to drive 12 Gbps of I/O if your host interface is 1.5 Gbps SATA. That has nothing to do with whether ATA and SCSI are suitable protocols. I'm arguing that at least SCSI is a good protocol for sending commands to a block device. Nothing prevents your flash-based block device from presenting a PCIe SCSI interface to the host and then doing something completely different in the back. There are lots of warts in SCSI. And I personally think that ATA TRIM was very poorly defined. But I don't believe that these protocols are inherently bad for driving storage. And I don't believe that coming up with a custom "block" interface will improve anything in the short term. Heck, the overhead of speaking SCSI is so low that even the thumb drive you brought up implements it. At negligible cost. Chris> but you also believe the best way to do SSS was behind compatible Chris> legacy SAS/SATA devices optimized for old rotating media? You're the one claiming these "legacy" devices are optimized for rotating media. I'm claiming there's nothing rotating about either protocol. Both express "do something to this range of blocks" in 16 bytes or less + a scatterlist describing memory. That's a pretty efficient interface in my book. Chris> I'd run IOZone and fill the drive (as I recall ~200GB) w/ files Chris> and benchmark, which, at the end, IOZone would delete all the Chris> files created (in the hundreds), and the delete/discard process Chris> was no more time consuming than just the delete process (for Chris> everything on the drive). This was w/ the original 2.6.27 and Chris> 2.6.28 ext4 "discard" implementations. And which device was this? How did it implement discard? -- Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes 2009-11-10 22:35 ` Martin K. Petersen @ 2009-11-11 18:17 ` Chris Worley 0 siblings, 0 replies; 34+ messages in thread From: Chris Worley @ 2009-11-11 18:17 UTC (permalink / raw) To: Martin K. Petersen; +Cc: Linux RAID On Tue, Nov 10, 2009 at 3:35 PM, Martin K. Petersen <martin.petersen@oracle.com> wrote: >>>>>> "Chris" == Chris Worley <worleys@gmail.com> writes: > > Chris> I'm not saying the SCSI protocol is bad, I'm saying the > Chris> SAS/SATA/SCSI controllers, that have been optimized for years for > Chris> rotating media, don't have the compute power to handle the sort > Chris> of performance attainable with SSS. > > And I'm saying that at least in the SCSI case that's untrue. SAS and FC > controllers are optimized for lots and lots of I/O because their main > application is driving large storage arrays which have performance > comparable to the solid state devices you mention. We're going to have to agree to disagree on this. My feeling is, you haven't tried the next generation in I/O performance, only the slow SSD's currently available, and don't (yet) see the potential for getting rid of all the hardware layers that evolved around rotating media. And when you talk of FC... there's more performance inhibition. Slow hardware like 10G Ethernet and FC8 can't keep up with the performance required for fast SSS I/O. A single QDR IB port is a good start, with 3GB/s per port (measured using SRP to export the drives). How many FC8 or 10G over iSCSI ports would it take to get one QDR IB ports performance(?)... then start thinking 2x, 4x, 8x, ... and the complexity of the old hardware becomes daunting when trying to scale. Again, to scale easily and with less complexity: you need your fundamental components to be fast. SSD's, FC, 10G are last generation hardware and way too slow. You can use a 90's vintage distributed supercomputer with 100's of processors to run tasks that one CPU can do as fast or faster today... 
but many would agree that the new CPU is probably an easier choice. And again, I'm not attacking the SCSI protocol, just the controller performance; but getting rid of unnecessary OS software layers (i.e. when you can directly use a block device), also provides more performance. When you've got <50usecs latency storage, every CPU cycle counts. <snip> > > Chris> I'd run IOZone and fill the drive (as I recall ~200GB) w/ files > Chris> and benchmark, which, at the end, IOZone would delete all the > Chris> files created (in the hundreds), and the delete/discard process > Chris> was no more time consuming than just the delete process (for > Chris> everything on the drive). This was w/ the original 2.6.27 and > Chris> 2.6.28 ext4 "discard" implementations. > > And which device was this? How did it implement discard? This was a non-GPL driver (as is the management layer for all SSD's), so I doubt you're interested. The methodology used was that laid out by David Woodhouse in: http://lwn.net/Articles/293658/ Basically: 1) register for the discard, 2) decode the write BIOs that indicate "discard", 3) send completion when done. Chris > > -- > Martin K. Petersen Oracle Linux Engineering > -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 34+ messages in thread
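The "how many ports would it take" question in this message is simple division. The sketch below takes the 3 GB/s per QDR IB port figure from the message itself. The FC8 and 10 GbE per-port rates are assumed nominal figures, not numbers from this thread, and real iSCSI or FC throughput would be somewhat lower after protocol overhead.

```python
import math

QDR_IB_MB_S = 3000  # per-port figure quoted in the message above

# Assumed nominal per-direction link rates in MB/s (not from the thread).
ASSUMED_LINK_MB_S = {
    "FC8": 800,
    "10GbE": 1250,
}


def ports_needed(target_mb_s: float, link_mb_s: float) -> int:
    """Number of links required to match a target throughput."""
    return math.ceil(target_mb_s / link_mb_s)


for name, rate in ASSUMED_LINK_MB_S.items():
    print(f"{name}: {ports_needed(QDR_IB_MB_S, rate)} ports to match one QDR IB port")
```

With these assumed rates, matching one QDR IB port takes four FC8 links or three 10 GbE links, before any protocol overhead is counted.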
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes 2009-11-10 20:11 ` Martin K. Petersen 2009-11-10 20:45 ` Chris Worley @ 2009-11-10 21:01 ` Greg Freemyer 2009-11-10 21:17 ` Chris Worley 2009-11-10 22:56 ` Martin K. Petersen 1 sibling, 2 replies; 34+ messages in thread From: Greg Freemyer @ 2009-11-10 21:01 UTC (permalink / raw) To: Martin K. Petersen; +Cc: Chris Worley, Majed B., Linux RAID On Tue, Nov 10, 2009 at 3:11 PM, Martin K. Petersen <martin.petersen@oracle.com> wrote: >>>>>> "Chris" == Chris Worley <worleys@gmail.com> writes: > <snip> > > Chris> And I do appreciate all your work. I fear, in this case, discard > Chris> will be optimized for the slower technology... we won't be > Chris> getting all that's available from it. > > Discard isn't "optimized" for anything. It's a command. Filesystem > issues it, it gets sent to the storage device (DSM/TRIM, WRITE SAME, or > UNMAP depending on target type). > Martin, I'm not sure that is right, but Chris is also wrong. I'm not sure where it ended up, but the big SSD / discard discussion of a few months ago talked about 3 kinds of solutions, and I thought the plan was to support all 3. 1) optimization 1 - A white-listed instant discard feature. In this methodology, the filesystems would immediately send discard calls down to the block layer would send them on down the block stack to the physical devices with very minimal buffering. It was thought high-end Intel SSDs would benefit from this model. It also sounds like SSS devices would benefit from this per Chris's comments. Note that this approach is NOT very friendly from a raid 4/5/6 approach. Those raid levels need to discard full stripes at a time, so getting a large number of small discards would be painful. 2) optimization 2 - The block layer would accept those small discards, but accumulate them for a short period. (less than a second was my impression). 
Then coalesce them into larger discards and send them down the block stack and eventually to the physical device. This is slightly better from a raid 4/5/6 perspective, but I suspect the discard ranges would still be too small. 3) optimization 3 - a background freespace scanner would run from time to time that scanned a filesystem for free blocks and send a discard / trim command down to the device. This is what Mark Lord was working on. His solution was primarily in user space and was controlled by cron. I believe this is by far the best approach for a raid 4/5/6 implementation, but at the time Mark's implementation was bypassing the block stack and using SG_IO to directly talk to the physical devices. I don't recall any discussion of how MD could participate in the process. Thus Mark's solution at the time was not compatible with md raid 4/5/6 implementations. Since this is the mdraid mailing list, maybe someone can tell us which of the above are getting the attention of md devels and if there is any ongoing effort to support them? Thanks Greg -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 34+ messages in thread
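The coalescing idea in option 2 can be sketched as a range merge. This is only an illustration of the concept described above, not how the kernel block layer behaves; the function name and the (start, length) representation in sectors are assumptions made for the example.

```python
def coalesce(ranges):
    """Merge overlapping or adjacent (start, length) discard ranges.

    Returns a sorted list of (start, length) tuples covering the same
    sectors with as few ranges as possible.
    """
    merged = []
    for start, length in sorted(ranges):
        end = start + length
        if merged and start <= merged[-1][1]:
            # Touches or overlaps the previous range: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e - s) for s, e in merged]


small_discards = [(0, 8), (8, 8), (32, 4), (34, 6)]
print(coalesce(small_discards))
```

Four small discards collapse into two larger ones here. Whether the result is large enough to cover a full RAID stripe still depends on how much contiguous space was actually freed.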
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes 2009-11-10 21:01 ` Greg Freemyer @ 2009-11-10 21:17 ` Chris Worley 2009-11-10 22:56 ` Martin K. Petersen 1 sibling, 0 replies; 34+ messages in thread From: Chris Worley @ 2009-11-10 21:17 UTC (permalink / raw) To: Greg Freemyer; +Cc: Martin K. Petersen, Majed B., Linux RAID On Tue, Nov 10, 2009 at 2:01 PM, Greg Freemyer <greg.freemyer@gmail.com> wrote: > On Tue, Nov 10, 2009 at 3:11 PM, Martin K. Petersen > <martin.petersen@oracle.com> wrote: >>>>>>> "Chris" == Chris Worley <worleys@gmail.com> writes: >> > <snip> >> >> Chris> And I do appreciate all your work. I fear, in this case, discard >> Chris> will be optimized for the slower technology... we won't be >> Chris> getting all that's available from it. >> >> Discard isn't "optimized" for anything. It's a command. Filesystem >> issues it, it gets sent to the storage device (DSM/TRIM, WRITE SAME, or >> UNMAP depending on target type). >> > > Martin, > > I'm not sure that is right, but Chris is also wrong. I've not been happier to hear I'm wrong; I do hope you are right (there will be a switch for optimal approaches). Thanks, Chris -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes 2009-11-10 21:01 ` Greg Freemyer 2009-11-10 21:17 ` Chris Worley @ 2009-11-10 22:56 ` Martin K. Petersen 2009-11-11 17:00 ` Greg Freemyer 1 sibling, 1 reply; 34+ messages in thread From: Martin K. Petersen @ 2009-11-10 22:56 UTC (permalink / raw) To: Greg Freemyer; +Cc: Martin K. Petersen, Chris Worley, Majed B., Linux RAID >>>>> "Greg" == Greg Freemyer <greg.freemyer@gmail.com> writes: Greg> I'm not sure where it ended up, but the big SSD / discard Greg> discussion of a few months ago talked about 3 kinds of solutions, Greg> and I thought the plan was to support all 3. We don't design for the past. Greg> 1) optimization 1 - A white-listed instant discard feature. In Greg> this methodology, the filesystems would immediately send Greg> discard calls down to the block layer would send them on down Greg> the block stack to the physical devices with very minimal Greg> buffering. There's no whitelist. That's just how it works. Yes, there were a few crappy devices out there. Windows 7 issuing TRIM commands in realtime made them instantly obsolete. If future devices suck with Windows 7 nobody will buy them. Greg> 2) optimization 2 - The block layer would accept those small Greg> discards, but accumulate them for a short period. (less than a Greg> second was my impression). Then coalesce them into larger Greg> discards and send them down the block stack and eventually to Greg> the physical device. SSDs are special in that they actually track map state on a per-logical block basis. Other thinly provisioned devices track space in units ranging from 16-32-64KB up to megabytes. It's up to each block device to track the map space. The way most arrays work is that they'll ignore the portions of the request that are not aligned to and a multiple of their internal allocation unit. The same applies to MD. IOW, MD would only unmap the portions of the discard request that constitute entire stripes. No keeping state required. 
Jens just queued my patch which allows block devices to communicate their unmap granularity and alignment to the filesystems. This means we can potentially use this to influence filesystem allocators. For SCSI arrays these values are queried and passed up the stack. MD can choose to manually set the granularity to its stripe size. Greg> 3) optimization 3 - a background freespace scanner would run from Greg> time to time that scanned a filesystem for free blocks and send a Greg> discard / trim command down to the device. This is what Mark Lord Greg> was working on. His solution was primarily in user space and was Greg> controlled by cron. I think that's a fine approach for legacy devices. But as I said I think Windows 7 will root out all devices with poor TRIM performance pretty quickly. -- Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 34+ messages in thread
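The "only unmap the portions that constitute entire stripes" behaviour described in this message amounts to rounding the start of a request up and the end down to stripe boundaries. The sketch below is a stateless illustration under stated assumptions: units are sectors, the stripe size of 256 is hypothetical, and the function is not taken from the MD code.

```python
def whole_stripes(start: int, length: int, stripe: int):
    """Return the (start, length) sub-range made of complete stripes.

    Partial stripes at either end are dropped. Returns None when the
    request does not cover even one full stripe.
    """
    first = -(-start // stripe) * stripe        # round start up
    last = (start + length) // stripe * stripe  # round end down
    if last <= first:
        return None
    return first, last - first


STRIPE = 256  # hypothetical stripe size in sectors

print(whole_stripes(100, 1000, STRIPE))  # large discard, partially aligned
print(whole_stripes(10, 100, STRIPE))    # small discard inside one stripe
```

The second call returns None: a discard smaller than one stripe is ignored entirely, which is the situation that matters for workloads made up of small files.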
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-10 22:56 ` Martin K. Petersen
@ 2009-11-11 17:00 ` Greg Freemyer
  2009-11-12 5:50 ` Martin K. Petersen
  0 siblings, 1 reply; 34+ messages in thread
From: Greg Freemyer @ 2009-11-11 17:00 UTC (permalink / raw)
To: Martin K. Petersen; +Cc: Chris Worley, Majed B., Linux RAID

On Tue, Nov 10, 2009 at 5:56 PM, Martin K. Petersen
<martin.petersen@oracle.com> wrote:
>>>>>> "Greg" == Greg Freemyer <greg.freemyer@gmail.com> writes:
>
> Greg> I'm not sure where it ended up, but the big SSD / discard
> Greg> discussion of a few months ago talked about 3 kinds of solutions,
> Greg> and I thought the plan was to support all 3.
>
> We don't design for the past.
>
>
> Greg> 1) optimization 1 - A white-listed instant discard feature. In
> Greg> this methodology, the filesystems would immediately send
> Greg> discard calls down to the block layer, which would send them on
> Greg> down the block stack to the physical devices with very minimal
> Greg> buffering.
>
> There's no whitelist. That's just how it works.
>
> Yes, there were a few crappy devices out there. Windows 7 issuing TRIM
> commands in realtime made them instantly obsolete. If future devices
> suck with Windows 7 nobody will buy them.
>
>
> Greg> 2) optimization 2 - The block layer would accept those small
> Greg> discards, but accumulate them for a short period. (less than a
> Greg> second was my impression). Then coalesce them into larger
> Greg> discards and send them down the block stack and eventually to
> Greg> the physical device.
>
> SSDs are special in that they actually track map state on a per-logical
> block basis. Other thinly provisioned devices track space in units
> ranging from 16-32-64KB up to megabytes.
>
> It's up to each block device to track the map space. The way most
> arrays work is that they'll ignore the portions of the request that are
> not aligned to and a multiple of their internal allocation unit.
>
> The same applies to MD. IOW, MD would only unmap the portions of the
> discard request that constitute entire stripes. No keeping state
> required.
>
> Jens just queued my patch which allows block devices to communicate
> their unmap granularity and alignment to the filesystems. This means we
> can potentially use this to influence filesystem allocators. For SCSI
> arrays these values are queried and passed up the stack. MD can choose
> to manually set the granularity to its stripe size.
>
>
> Greg> 3) optimization 3 - a background freespace scanner would run from
> Greg> time to time that scanned a filesystem for free blocks and send a
> Greg> discard / trim command down to the device. This is what Mark Lord
> Greg> was working on. His solution was primarily in user space and was
> Greg> controlled by cron.
>
> I think that's a fine approach for legacy devices. But as I said I
> think Windows 7 will root out all devices with poor TRIM performance
> pretty quickly.
>
> --
> Martin K. Petersen      Oracle Linux Engineering
>

Martin,

So for a workload mostly composed of small files residing on an MD RAID
4/5/6 setup, how is this supposed to work? (e.g. TIFFs, small Word docs,
PDFs, individual emails, etc.)

Most of the individual files will be less than one stripe wide, so when
they are deleted I gather the discard range will be less than a stripe,
and therefore MD would ignore it in the simplest of implementations.
I.e., without coalescence at some point, MD will never forward discards
to the hardware.

Thus I would think that for that workload, the nightly full freespace
scan and discard would be the best solution.

Thanks
Greg
-- 
Greg Freemyer
Head of EDD Tape Extraction and Processing team
Litigation Triage Solutions Specialist
http://www.linkedin.com/in/gregfreemyer
Preservation and Forensic processing of Exchange Repositories White Paper -
<http://www.norcrossgroup.com/forms/whitepapers/tng_whitepaper_fpe.html>

The Norcross Group
The Intersection of Evidence & Technology
http://www.norcrossgroup.com

^ permalink raw reply	[flat|nested] 34+ messages in thread
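Editorial illustration of the coalescence argument in the message above: a minimal Python sketch, under the assumption that discards are plain (start, length) sector ranges and that only whole stripes are forwarded. The names `coalesce` and `whole_stripes` and the 384-sector stripe are hypothetical; this is not how MD or the block layer is implemented.

```python
def coalesce(ranges):
    """Merge overlapping or adjacent (start, length) ranges into sorted extents."""
    merged = []
    for start, length in sorted(ranges):
        end = start + length
        if merged and start <= merged[-1][1]:
            # Touches or overlaps the previous extent: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e - s) for s, e in merged]


def whole_stripes(extent, stripe):
    """Return the part of an extent that covers complete stripes, or None."""
    start, length = extent
    first = -(-start // stripe) * stripe          # round start up to a boundary
    last = ((start + length) // stripe) * stripe  # round end down to a boundary
    return (first, last - first) if last > first else None


# 48 small deletes of 8 sectors each that together cover exactly one stripe.
pending = [(384 + 8 * i, 8) for i in range(48)]
individually = [whole_stripes(r, 384) for r in pending]       # every entry is None
together = [whole_stripes(e, 384) for e in coalesce(pending)]  # one full stripe
```

Each small discard on its own is dropped, while the same discards merged first yield one stripe-sized request. Whether the merging happens in the block layer within a short window or in a periodic free-space scan is the design question the thread is debating.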
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-11 17:00 ` Greg Freemyer
@ 2009-11-12 5:50 ` Martin K. Petersen
  0 siblings, 0 replies; 34+ messages in thread
From: Martin K. Petersen @ 2009-11-12 5:50 UTC (permalink / raw)
To: Greg Freemyer; +Cc: Martin K. Petersen, Linux RAID

>>>>> "Greg" == Greg Freemyer <greg.freemyer@gmail.com> writes:

Greg> So for a workload mostly composed of small files residing on a MD
Greg> raid 4/5/6 setup, how is this supposed to work? (e.g. Tiffs, small
Greg> word docs, pdfs, individual emails, etc.)

The intent of thin provisioning is not to free up 4KB of space when you
delete an email. The intent is to free 4GB when you delete a database
or a virtual machine disk image.

In the RAID array space, allocation units bigger than a single stripe
are common. I'm not aware of any arrays that track sectors or
filesystem-block-sized chunks.

We're exposing as much information about the mapping granularity as the
hardware is willing to share. This is done so that filesystems have the
option of laying out data accordingly. For instance, a filesystem can
choose to reuse recently freed blocks and to pack data tightly together
instead of the traditional approach of spreading things out over the
entire LBA range.

Greg> Most of the individual files will be less than one stripe wide, so
Greg> when they are deleted I gather the discard range will be less than
Greg> a stripe and therefore MD would ignore it in the simplest of
Greg> implementations. ie. Without coalescence at some point, MD will
Greg> never forward discards to the hardware.

Greg> Thus I would think for that workload, the nightly full freespace
Greg> scan and discard would be the best solution.

Well, that's certainly possible to implement for small setups. And less
tedious than tracking individual sector mapping in MD.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 34+ messages in thread
* Re: Intel Updates SSDs, Supports TRIM, Faster Writes
  2009-11-09 1:13 ` Majed B.
  2009-11-09 16:37 ` Chris Worley
@ 2009-11-09 18:42 ` Greg Freemyer
  1 sibling, 0 replies; 34+ messages in thread
From: Greg Freemyer @ 2009-11-09 18:42 UTC (permalink / raw)
To: Majed B.; +Cc: Linux RAID

On Sun, Nov 8, 2009 at 8:13 PM, Majed B. <majedb@gmail.com> wrote:
> The firmware which introduced the TRIM command was deemed buggy and
> has been pulled out.
>
> Are there any filesystems that are TRIM-aware?
>
> On Sun, Nov 8, 2009 at 8:57 PM, Bill Davidsen <davidsen@tmr.com> wrote:
>> For those of us playing with use of SSD for journals on ext[34], this does
>> have implications for RAID performance.
>>
>> http://hardware.slashdot.org/story/09/10/27/1427209/Intel-Updates-SSDs-Supports-TRIM-Faster-Writes
>>
>> --
>> Bill Davidsen <davidsen@tmr.com>
> --
> Majed B.

Majed,

There are various ways to address the TRIM issue.

My favorite is to have a once-a-day (or whatever) process invoked by
cron that scans a filesystem for unused space, then calls trim on all of
the unused chunks.

Mark Lord had this working via fallocate calls from user space a couple
of months ago.

See <http://markmail.org/message/rytr4jqx52h2wftm>

FYI: Mark Lord is the hdparm maintainer, and I think you can get his
userspace stuff from SourceForge. I think the kernel code is already in
vanilla.

FYI 2: I have not had my hands on a TRIM-capable SSD, so I have not
tried Mark's code yet.

Greg

^ permalink raw reply	[flat|nested] 34+ messages in thread
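Editorial illustration of the periodic free-space scan described in the message above: a minimal Python sketch that walks a block-usage bitmap and collects the free extents a scanner would hand to a discard call. It is a toy model and makes no claim about how Mark Lord's tool works; the function name `free_extents`, the list-of-booleans bitmap, and the `min_len` threshold are assumptions for the example, and no real TRIM command is issued.

```python
def free_extents(in_use, min_len=1):
    """Return (start, length) runs of free blocks in a usage bitmap.

    in_use  -- list of booleans, True where the block holds live data
    min_len -- skip free runs shorter than this many blocks
    """
    extents = []
    run_start = None
    for block, used in enumerate(in_use):
        if not used and run_start is None:
            run_start = block  # a free run begins here
        elif used and run_start is not None:
            if block - run_start >= min_len:
                extents.append((run_start, block - run_start))
            run_start = None
    # Close a free run that reaches the end of the device.
    if run_start is not None and len(in_use) - run_start >= min_len:
        extents.append((run_start, len(in_use) - run_start))
    return extents


bitmap = [True, False, False, True, False, True, False, False, False]
all_free = free_extents(bitmap)
large_only = free_extents(bitmap, min_len=2)
```

Setting `min_len` to the device's allocation unit would let such a scan skip runs that are too small to be unmapped, which is why a scan of the full free space can succeed where individual small-file discards do not.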
end of thread, other threads:[~2009-11-12 5:50 UTC | newest]

Thread overview: 34+ messages
2009-11-08 17:57 Intel Updates SSDs, Supports TRIM, Faster Writes Bill Davidsen
2009-11-08 22:30 ` Thomas Fjellstrom
2009-11-09 1:13 ` Majed B.
2009-11-09 16:37 ` Chris Worley
2009-11-09 16:42 ` Majed B.
2009-11-09 16:59 ` Chris Worley
2009-11-10 9:42 ` Kasper Sandberg
2009-11-10 15:39 ` Chris Worley
2009-11-10 15:43 ` Majed B.
2009-11-10 15:58 ` Chris Worley
2009-11-10 16:01 ` Majed B.
2009-11-10 16:15 ` Robin Hill
2009-11-10 16:31 ` Chris Worley
2009-11-10 16:18 ` Chris Worley
2009-11-10 18:31 ` Majed B.
2009-11-10 23:03 ` Mathieu Chouquet-Stringer
2009-11-11 2:52 ` Majed B.
2009-11-10 18:40 ` Kasper Sandberg
2009-11-10 15:48 ` Asdo
2009-11-10 16:04 ` Chris Worley
2009-11-11 18:02 ` Default User
2009-11-10 18:38 ` Kasper Sandberg
2009-11-10 16:36 ` Martin K. Petersen
2009-11-10 17:22 ` Chris Worley
2009-11-10 20:11 ` Martin K. Petersen
2009-11-10 20:45 ` Chris Worley
2009-11-10 22:35 ` Martin K. Petersen
2009-11-11 18:17 ` Chris Worley
2009-11-10 21:01 ` Greg Freemyer
2009-11-10 21:17 ` Chris Worley
2009-11-10 22:56 ` Martin K. Petersen
2009-11-11 17:00 ` Greg Freemyer
2009-11-12 5:50 ` Martin K. Petersen
2009-11-09 18:42 ` Greg Freemyer