* What's the typical RAID10 setup? @ 2011-01-31 9:41 Mathias Burén 2011-01-31 10:14 ` Robin Hill ` (2 more replies) 0 siblings, 3 replies; 127+ messages in thread From: Mathias Burén @ 2011-01-31 9:41 UTC (permalink / raw) To: Linux-RAID Hi, RAID10 is (or could be) set up in this way, correct? 2 devices in a RAID1 2 devices in another RAID1 Then you run RAID0 on top of them. If you're lucky you can lose 2 devices at most (1 in each RAID1). If you have, say, 6 HDDs, would you create 3 RAID1 volumes? Then create a RAID0 on top of them? How would one go about expanding a 4 HDD RAID10 into a 6 HDD RAID10? Is it "just" a matter of creating a new RAID1 array of the 2 new HDDs, then adding them to the RAID0, then expanding whatever is on that (lvm, xfs, ext4)? Are there any design tips, or caveats? For example, how many disks would you use at most, in a RAID10 setup? Kind regards, // Mathias ^ permalink raw reply [flat|nested] 127+ messages in thread
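The layering described in this question maps directly onto mdadm. A minimal sketch of the 4-disk case, with illustrative device names (adjust metadata and chunk options to taste):

  # two RAID1 pairs
  mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdb /dev/sdc
  mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sdd /dev/sde
  # a RAID0 stripe over the two mirrors
  mdadm --create /dev/md3 --level=0 --raid-devices=2 /dev/md1 /dev/md2
  mkfs.ext4 /dev/md3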
* Re: What's the typical RAID10 setup? 2011-01-31 9:41 What's the typical RAID10 setup? Mathias Burén @ 2011-01-31 10:14 ` Robin Hill 2011-01-31 10:22 ` Mathias Burén 2011-01-31 10:36 ` CoolCold 2011-01-31 20:07 ` Stan Hoeppner 2 siblings, 1 reply; 127+ messages in thread From: Robin Hill @ 2011-01-31 10:14 UTC (permalink / raw) To: Linux-RAID [-- Attachment #1: Type: text/plain, Size: 1760 bytes --] On Mon Jan 31, 2011 at 09:41:43AM +0000, Mathias Burén wrote: > Hi, > > RAID10 is (could be) setup in this way, correct? > > 2 devices in a RAID1 > 2 devices in another RAID1 > > Then you run RAID0 on top of them. If you're lucky you can lose 2 > devices at most (1 in each RAID1). > It could be, yes, or you could just use the RAID10 mode in md, which simplifies the process and offers you a selection of different physical layouts (some of which can offer significant performance benefits, depending on usage). > If you have, say 6 HDDs, would you create 3 RAID1 volumes? Then create > a RAID0 on top of them? > Yes. > How would one go about expanding a 4 HDD RAID10 into a 6 HDD RAID10? > Is it "just" a matter of creating a new RAID1 array of the 2 new HDDs, > then adding them to the RAID0, then expanding whatever is on that > (lvm, xfs, ext4)? > Expansion of RAID0 (or RAID10) is not currently implemented, though there is a workaround for RAID0. The basic steps are to convert to RAID4 with missing parity disk, expand, then convert back to RAID0. It's a bit more complex though as you need to prevent md from recovering the RAID4 array first - the full command process was posted a few days ago though, so a dig through the archives should find them. Proper RAID0 expansion should be in a forthcoming (next?) mdadm release, not sure about RAID10 expansion though. Otherwise, yes, those are the correct steps needed for expanding the array and filesystem. Cheers, Robin -- ___ ( ' } | Robin Hill <robin@robinhill.me.uk> | / / ) | Little Jim says .... | // !! | "He fallen in de water !!" | [-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 127+ messages in thread
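Robin's first suggestion, md's native RAID10 personality, replaces the whole nested stack with a single array. A minimal sketch, assuming six disks with illustrative device names:

  mdadm --create /dev/md0 --level=10 --raid-devices=6 --layout=n2 /dev/sd[b-g]
  # --layout chooses where the copies of each block live; n2 ('near') is the default

The RAID0 expansion workaround he mentions is deliberately not spelled out here: its rough shape is a --grow level change from RAID0 to a degraded RAID4, adding the new disk, reshaping, and converting back to RAID0, but the exact command sequence should be taken from the archived posts Robin refers to.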
* Re: What's the typical RAID10 setup? 2011-01-31 10:14 ` Robin Hill @ 2011-01-31 10:22 ` Mathias Burén 0 siblings, 0 replies; 127+ messages in thread From: Mathias Burén @ 2011-01-31 10:22 UTC (permalink / raw) To: Linux-RAID On 31 January 2011 10:14, Robin Hill <robin@robinhill.me.uk> wrote: > On Mon Jan 31, 2011 at 09:41:43AM +0000, Mathias Burén wrote: > >> Hi, >> >> RAID10 is (could be) setup in this way, correct? >> >> 2 devices in a RAID1 >> 2 devices in another RAID1 >> >> Then you run RAID0 on top of them. If you're lucky you can lose 2 >> devices at most (1 in each RAID1). >> > It could be, yes, or you could just use the RAID10 mode in md, which > simplifies the process and offers you a selection of different physical > layouts (some of which can offer significant performance benefits, > depending on usage). > >> If you have, say 6 HDDs, would you create 3 RAID1 volumes? Then create >> a RAID0 on top of them? >> > Yes. > >> How would one go about expanding a 4 HDD RAID10 into a 6 HDD RAID10? >> Is it "just" a matter of creating a new RAID1 array of the 2 new HDDs, >> then adding them to the RAID0, then expanding whatever is on that >> (lvm, xfs, ext4)? >> > Expansion of RAID0 (or RAID10) is not currently implemented, though > there is a workaround for RAID0. The basic steps are to convert to > RAID4 with missing parity disk, expand, then convert back to RAID0. > It's a bit more complex though as you need to prevent md from recovering > the RAID4 array first - the full command process was posted a few days > ago though, so a dig through the archives should find them. Proper > RAID0 expansion should be in a forthcoming (next?) mdadm release, not > sure about RAID10 expansion though. > > Otherwise, yes, those are the correct steps needed for expanding the > array and filesystem. > > Cheers, > Robin > -- > ___ > ( ' } | Robin Hill <robin@robinhill.me.uk> | > / / ) | Little Jim says .... | > // !! | "He fallen in de water !!" | > Ah, good to know. Because of money and physical space vs usable storage amount, I'm probably going with RAID5 again or RAID6 for my future box. Cheers, // Mathias -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 9:41 What's the typical RAID10 setup? Mathias Burén 2011-01-31 10:14 ` Robin Hill @ 2011-01-31 10:36 ` CoolCold 2011-01-31 15:00 ` Roberto Spadim 2011-01-31 20:07 ` Stan Hoeppner 2 siblings, 1 reply; 127+ messages in thread From: CoolCold @ 2011-01-31 10:36 UTC (permalink / raw) To: Mathias Burén; +Cc: Linux-RAID On Mon, Jan 31, 2011 at 12:41 PM, Mathias Burén <mathias.buren@gmail.com> wrote: > Hi, > > RAID10 is (could be) setup in this way, correct? > > 2 devices in a RAID1 > 2 devices in another RAID1 I usually use LVM striping over two RAID1 arrays for this. > > Then you run RAID0 on top of them. If you're lucky you can lose 2 > devices at most (1 in each RAID1). > > If you have, say 6 HDDs, would you create 3 RAID1 volumes? Then create > a RAID0 on top of them? > > How would one go about expanding a 4 HDD RAID10 into a 6 HDD RAID10? > Is it "just" a matter of creating a new RAID1 array of the 2 new HDDs, > then adding them to the RAID0, then expanding whatever is on that > (lvm, xfs, ext4)? > > Are there any design tips, or caveats? For example, how many disks > would you use at most, in a RAID10 setup? > > Kind regards, > // Mathias > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Best regards, [COOLCOLD-RIPN] -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
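A minimal sketch of CoolCold's LVM-striping variant, assuming the two RAID1 arrays are /dev/md1 and /dev/md2 and using illustrative volume names:

  pvcreate /dev/md1 /dev/md2
  vgcreate vg_data /dev/md1 /dev/md2
  # -i 2 stripes the logical volume across both mirrors, -I is the stripe size in KiB
  lvcreate -n lv_data -i 2 -I 64 -l 100%FREE vg_data
  mkfs.xfs /dev/vg_data/lv_data

One attraction of this layout is that another RAID1 pair can later be added to the volume group with vgextend, although already-written extents are not re-striped across the new pair automatically.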
* Re: What's the typical RAID10 setup? 2011-01-31 10:36 ` CoolCold @ 2011-01-31 15:00 ` Roberto Spadim 2011-01-31 15:21 ` Robin Hill 0 siblings, 1 reply; 127+ messages in thread From: Roberto Spadim @ 2011-01-31 15:00 UTC (permalink / raw) To: CoolCold; +Cc: Mathias Burén, Linux-RAID I think making two very big RAID0 arrays and then a RAID1 on top of them is better. With RAID10 you can also use different layouts (how data is written within the RAID1 part of the RAID10: for example, the first disk writes from the first head to the last, and the second disk from the last head back to the first). 2011/1/31 CoolCold <coolthecold@gmail.com>: > On Mon, Jan 31, 2011 at 12:41 PM, Mathias Burén <mathias.buren@gmail.com> wrote: >> Hi, >> >> RAID10 is (could be) setup in this way, correct? >> >> 2 devices in a RAID1 >> 2 devices in another RAID1 > I usually use LVM striping over two RAID1 arrays for this. >> >> Then you run RAID0 on top of them. If you're lucky you can lose 2 >> devices at most (1 in each RAID1). >> >> If you have, say 6 HDDs, would you create 3 RAID1 volumes? Then create >> a RAID0 on top of them? >> >> How would one go about expanding a 4 HDD RAID10 into a 6 HDD RAID10? >> Is it "just" a matter of creating a new RAID1 array of the 2 new HDDs, >> then adding them to the RAID0, then expanding whatever is on that >> (lvm, xfs, ext4)? >> >> Are there any design tips, or caveats? For example, how many disks >> would you use at most, in a RAID10 setup? >> >> Kind regards, >> // Mathias >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > > > > -- > Best regards, > [COOLCOLD-RIPN] > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 15:00 ` Roberto Spadim @ 2011-01-31 15:21 ` Robin Hill 2011-01-31 15:27 ` Roberto Spadim 2011-01-31 15:30 ` Robin Hill 0 siblings, 2 replies; 127+ messages in thread From: Robin Hill @ 2011-01-31 15:21 UTC (permalink / raw) To: Linux-RAID On Mon Jan 31, 2011 at 01:00:13PM -0200, Roberto Spadim wrote: > i think make two very big raid 0 > and after raid1 > is better > Not really - you increase the failure risk doing this. With this setup, a single drive failure from each RAID0 array will lose you the entire array. With the reverse (RAID0 over RAID1) then you require both drives in the RAID1 to fail in order to lose the array. Of course, with a 4 drive array then the risk is the same (33% with 2 drive failures) but with a 6 drive array it changes to 60% for RAID1 over RAID0 versus 20% for RAID0 over RAID1. Cheers, Robin ^ permalink raw reply [flat|nested] 127+ messages in thread
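Spelling out the arithmetic behind those percentages (conditioning on one drive having already failed): with 6 drives there are 5 candidates for the second failure. Under RAID0 over RAID1 (three mirror pairs), only the dead drive's mirror partner is fatal, giving 1/5 = 20%; under RAID1 over RAID0 (two 3-disk stripes), any of the 3 drives in the surviving stripe is fatal, giving 3/5 = 60%. With 4 drives both layouts give 1/3, roughly 33%.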
* Re: What's the typical RAID10 setup? 2011-01-31 15:21 ` Robin Hill @ 2011-01-31 15:27 ` Roberto Spadim 2011-01-31 15:28 ` Roberto Spadim 2011-01-31 16:55 ` Denis 2011-01-31 15:30 ` Robin Hill 1 sibling, 2 replies; 127+ messages in thread From: Roberto Spadim @ 2011-01-31 15:27 UTC (permalink / raw) To: Linux-RAID Hmm, that's right, though not so much 'increase' (only if you compare RAID0+1 against RAID1+0): putting RAID1 first and RAID0 on top has FEWER points of failure than RAID0 first and RAID1 on top, since the number of points of failure is proportional to the number of RAID1 devices. 2011/1/31 Robin Hill <robin@robinhill.me.uk>: > On Mon Jan 31, 2011 at 01:00:13PM -0200, Roberto Spadim wrote: > >> i think make two very big raid 0 >> and after raid1 >> is better >> > Not really - you increase the failure risk doing this. With this setup, > a single drive failure from each RAID0 array will lose you the entire > array. With the reverse (RAID0 over RAID1) then you require both drives > in the RAID1 to fail in order to lose the array. Of course, with a 4 > drive array then the risk is the same (33% with 2 drive failures) but > with a 6 drive array it changes to 60% for RAID1 over RAID0 versus 20% > for RAID0 over RAID1. > > Cheers, > Robin > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 15:27 ` Roberto Spadim @ 2011-01-31 15:28 ` Roberto Spadim 2011-01-31 15:32 ` Roberto Spadim 2011-01-31 16:55 ` Denis 1 sibling, 1 reply; 127+ messages in thread From: Roberto Spadim @ 2011-01-31 15:28 UTC (permalink / raw) To: Linux-RAID do you have a faster array using raid0+1 or raid1+0? 2011/1/31 Roberto Spadim <roberto@spadim.com.br>: > hum that's right, > but not 'increase' (only if you compare raid0+1 betwen raid1+0) using > raid1 and after raid0 have LESS point of fail between raid 0 and after > raid 1, since the number of point of fail is proportional to number of > raid1 devices. > > 2011/1/31 Robin Hill <robin@robinhill.me.uk>: >> On Mon Jan 31, 2011 at 01:00:13PM -0200, Roberto Spadim wrote: >> >>> i think make two very big raid 0 >>> and after raid1 >>> is better >>> >> Not really - you increase the failure risk doing this. With this setup, >> a single drive failure from each RAID0 array will lose you the entire >> array. With the reverse (RAID0 over RAID1) then you require both drives >> in the RAID1 to fail in order to lose the array. Of course, with a 4 >> drive array then the risk is the same (33% with 2 drive failures) but >> with a 6 drive array it changes to 60% for RAID1 over RAID0 versus 20% >> for RAID0 over RAID1. >> >> Cheers, >> Robin >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 15:28 ` Roberto Spadim @ 2011-01-31 15:32 ` Roberto Spadim 2011-01-31 15:34 ` Roberto Spadim 2011-01-31 15:45 ` Robin Hill 0 siblings, 2 replies; 127+ messages in thread From: Roberto Spadim @ 2011-01-31 15:32 UTC (permalink / raw) To: Linux-RAID rewriting.. using raid10 or raid01 you will have problems if you lose 2 drives too... if you lose two raid 1 devices you loose raid 1... see: disks=4 RAID 1+0 raid1= 1-2(A) ; 3-4(B); 5-6(C) raid0= A-B-C if you lose (A,B or C) your raid0 stop RAID 0+1 raid0= 1-2-3(A) ; 4-5-6(B) raid1= A-B if you lose (1,4 OR 1,5 OR 1,6 OR 2,4 OR 2,5 OR 2,6 OR 3,4 OR 4,5 OR 4,6) your raid0 stop using raid1+0 or raid0+1 you can't lose two disks... 2011/1/31 Roberto Spadim <roberto@spadim.com.br>: > do you have a faster array using raid0+1 or raid1+0? > > 2011/1/31 Roberto Spadim <roberto@spadim.com.br>: >> hum that's right, >> but not 'increase' (only if you compare raid0+1 betwen raid1+0) using >> raid1 and after raid0 have LESS point of fail between raid 0 and after >> raid 1, since the number of point of fail is proportional to number of >> raid1 devices. >> >> 2011/1/31 Robin Hill <robin@robinhill.me.uk>: >>> On Mon Jan 31, 2011 at 01:00:13PM -0200, Roberto Spadim wrote: >>> >>>> i think make two very big raid 0 >>>> and after raid1 >>>> is better >>>> >>> Not really - you increase the failure risk doing this. With this setup, >>> a single drive failure from each RAID0 array will lose you the entire >>> array. With the reverse (RAID0 over RAID1) then you require both drives >>> in the RAID1 to fail in order to lose the array. Of course, with a 4 >>> drive array then the risk is the same (33% with 2 drive failures) but >>> with a 6 drive array it changes to 60% for RAID1 over RAID0 versus 20% >>> for RAID0 over RAID1. >>> >>> Cheers, >>> Robin >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>> >> >> >> >> -- >> Roberto Spadim >> Spadim Technology / SPAEmpresarial >> > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 15:32 ` Roberto Spadim @ 2011-01-31 15:34 ` Roberto Spadim 2011-01-31 15:37 ` Roberto Spadim 2011-01-31 15:45 ` Robin Hill 1 sibling, 1 reply; 127+ messages in thread From: Roberto Spadim @ 2011-01-31 15:34 UTC (permalink / raw) To: Linux-RAID the only way to make it safer, is put more devices on raid1 for example: disks=6 (was wrong on last email) raid0= 1-2(a) 3-4(b) 5-6(c) raid1= a,b,c or raid1=1-2-3(a) 4-5-6(b) raid0=a,b now you can loose tree disks 2011/1/31 Roberto Spadim <roberto@spadim.com.br>: > rewriting.. > using raid10 or raid01 you will have problems if you lose 2 drives too... > if you lose two raid 1 devices you loose raid 1... > see: > > disks=4 > RAID 1+0 > raid1= 1-2(A) ; 3-4(B); 5-6(C) > raid0= A-B-C > if you lose (A,B or C) your raid0 stop > > RAID 0+1 > raid0= 1-2-3(A) ; 4-5-6(B) > raid1= A-B > if you lose (1,4 OR 1,5 OR 1,6 OR 2,4 OR 2,5 OR 2,6 OR 3,4 OR 4,5 OR > 4,6) your raid0 stop > > using raid1+0 or raid0+1 you can't lose two disks... > > > > 2011/1/31 Roberto Spadim <roberto@spadim.com.br>: >> do you have a faster array using raid0+1 or raid1+0? >> >> 2011/1/31 Roberto Spadim <roberto@spadim.com.br>: >>> hum that's right, >>> but not 'increase' (only if you compare raid0+1 betwen raid1+0) using >>> raid1 and after raid0 have LESS point of fail between raid 0 and after >>> raid 1, since the number of point of fail is proportional to number of >>> raid1 devices. >>> >>> 2011/1/31 Robin Hill <robin@robinhill.me.uk>: >>>> On Mon Jan 31, 2011 at 01:00:13PM -0200, Roberto Spadim wrote: >>>> >>>>> i think make two very big raid 0 >>>>> and after raid1 >>>>> is better >>>>> >>>> Not really - you increase the failure risk doing this. With this setup, >>>> a single drive failure from each RAID0 array will lose you the entire >>>> array. With the reverse (RAID0 over RAID1) then you require both drives >>>> in the RAID1 to fail in order to lose the array. Of course, with a 4 >>>> drive array then the risk is the same (33% with 2 drive failures) but >>>> with a 6 drive array it changes to 60% for RAID1 over RAID0 versus 20% >>>> for RAID0 over RAID1. >>>> >>>> Cheers, >>>> Robin >>>> -- >>>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >>>> the body of a message to majordomo@vger.kernel.org >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>> >>> >>> >>> >>> -- >>> Roberto Spadim >>> Spadim Technology / SPAEmpresarial >>> >> >> >> >> -- >> Roberto Spadim >> Spadim Technology / SPAEmpresarial >> > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 15:34 ` Roberto Spadim @ 2011-01-31 15:37 ` Roberto Spadim 0 siblings, 0 replies; 127+ messages in thread From: Roberto Spadim @ 2011-01-31 15:37 UTC (permalink / raw) To: Linux-RAID the main problem is, you can lose one disk if you lose any disk, you should replace all disks and raid1 allow you to make it online (without shutdown your server) that's the main use of raid1 (replica/mirror/redundance) 2011/1/31 Roberto Spadim <roberto@spadim.com.br>: > the only way to make it safer, is put more devices on raid1 > for example: > disks=6 (was wrong on last email) > raid0= 1-2(a) 3-4(b) 5-6(c) > raid1= a,b,c > > or > raid1=1-2-3(a) 4-5-6(b) > raid0=a,b > now you can loose tree disks > > 2011/1/31 Roberto Spadim <roberto@spadim.com.br>: >> rewriting.. >> using raid10 or raid01 you will have problems if you lose 2 drives too... >> if you lose two raid 1 devices you loose raid 1... >> see: >> >> disks=4 >> RAID 1+0 >> raid1= 1-2(A) ; 3-4(B); 5-6(C) >> raid0= A-B-C >> if you lose (A,B or C) your raid0 stop >> >> RAID 0+1 >> raid0= 1-2-3(A) ; 4-5-6(B) >> raid1= A-B >> if you lose (1,4 OR 1,5 OR 1,6 OR 2,4 OR 2,5 OR 2,6 OR 3,4 OR 4,5 OR >> 4,6) your raid0 stop >> >> using raid1+0 or raid0+1 you can't lose two disks... >> >> >> >> 2011/1/31 Roberto Spadim <roberto@spadim.com.br>: >>> do you have a faster array using raid0+1 or raid1+0? >>> >>> 2011/1/31 Roberto Spadim <roberto@spadim.com.br>: >>>> hum that's right, >>>> but not 'increase' (only if you compare raid0+1 betwen raid1+0) using >>>> raid1 and after raid0 have LESS point of fail between raid 0 and after >>>> raid 1, since the number of point of fail is proportional to number of >>>> raid1 devices. >>>> >>>> 2011/1/31 Robin Hill <robin@robinhill.me.uk>: >>>>> On Mon Jan 31, 2011 at 01:00:13PM -0200, Roberto Spadim wrote: >>>>> >>>>>> i think make two very big raid 0 >>>>>> and after raid1 >>>>>> is better >>>>>> >>>>> Not really - you increase the failure risk doing this. With this setup, >>>>> a single drive failure from each RAID0 array will lose you the entire >>>>> array. With the reverse (RAID0 over RAID1) then you require both drives >>>>> in the RAID1 to fail in order to lose the array. Of course, with a 4 >>>>> drive array then the risk is the same (33% with 2 drive failures) but >>>>> with a 6 drive array it changes to 60% for RAID1 over RAID0 versus 20% >>>>> for RAID0 over RAID1. >>>>> >>>>> Cheers, >>>>> Robin >>>>> -- >>>>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >>>>> the body of a message to majordomo@vger.kernel.org >>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>>> >>>> >>>> >>>> >>>> -- >>>> Roberto Spadim >>>> Spadim Technology / SPAEmpresarial >>>> >>> >>> >>> >>> -- >>> Roberto Spadim >>> Spadim Technology / SPAEmpresarial >>> >> >> >> >> -- >> Roberto Spadim >> Spadim Technology / SPAEmpresarial >> > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 15:32 ` Roberto Spadim 2011-01-31 15:34 ` Roberto Spadim @ 2011-01-31 15:45 ` Robin Hill 1 sibling, 0 replies; 127+ messages in thread From: Robin Hill @ 2011-01-31 15:45 UTC (permalink / raw) To: Linux-RAID On Mon Jan 31, 2011 at 01:32:06PM -0200, Roberto Spadim wrote: > rewriting.. > using raid10 or raid01 you will have problems if you lose 2 drives too... > if you lose two raid 1 devices you loose raid 1... > see: > > disks=4 > RAID 1+0 > raid1= 1-2(A) ; 3-4(B); 5-6(C) > raid0= A-B-C > if you lose (A,B or C) your raid0 stop > > RAID 0+1 > raid0= 1-2-3(A) ; 4-5-6(B) > raid1= A-B > if you lose (1,4 OR 1,5 OR 1,6 OR 2,4 OR 2,5 OR 2,6 OR 3,4 OR 4,5 OR > 4,6) your raid0 stop > > using raid1+0 or raid0+1 you can't lose two disks... > Yes you can - it just depends which disks. With the 6-disk case you can lose a maximum of 3 drives, though only a single drive failure will definitely not cause total array failure. For RAID1+0 your 2-drive failure cases are only 1,2 OR 3,4 OR 5,6 - any other pairing will not break the overall array. For RAID0+1 there's 9 failure cases as you point out (except the last two should be 3,5 and 3,6). Cheers, Robin ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 15:27 ` Roberto Spadim 2011-01-31 15:28 ` Roberto Spadim @ 2011-01-31 16:55 ` Denis 2011-01-31 17:31 ` Roberto Spadim 1 sibling, 1 reply; 127+ messages in thread From: Denis @ 2011-01-31 16:55 UTC (permalink / raw) To: Roberto Spadim; +Cc: Linux-RAID 2011/1/31 Roberto Spadim <roberto@spadim.com.br>: > hum that's right, > but not 'increase' (only if you compare raid0+1 betwen raid1+0) using > raid1 and after raid0 have LESS point of fail between raid 0 and after > raid 1, since the number of point of fail is proportional to number of > raid1 devices. In that case, in an occurency of a failury, you will take much longer to rebuild the failed disk, at least, double the time. > > 2011/1/31 Robin Hill <robin@robinhill.me.uk>: >> On Mon Jan 31, 2011 at 01:00:13PM -0200, Roberto Spadim wrote: >> >>> i think make two very big raid 0 >>> and after raid1 >>> is better >>> >> Not really - you increase the failure risk doing this. With this setup, >> a single drive failure from each RAID0 array will lose you the entire >> array. With the reverse (RAID0 over RAID1) then you require both drives >> in the RAID1 to fail in order to lose the array. Of course, with a 4 >> drive array then the risk is the same (33% with 2 drive failures) but >> with a 6 drive array it changes to 60% for RAID1 over RAID0 versus 20% >> for RAID0 over RAID1. >> >> Cheers, >> Robin >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Denis Anjos, www.versatushpc.com.br -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 16:55 ` Denis @ 2011-01-31 17:31 ` Roberto Spadim 2011-01-31 18:35 ` Denis 0 siblings, 1 reply; 127+ messages in thread From: Roberto Spadim @ 2011-01-31 17:31 UTC (permalink / raw) To: Denis; +Cc: Linux-RAID i think that partial failure (raid0 fail) of a mirror, is a fail (since all mirror is repaired and resync) the security is, if you lose all mirrors you have a device so your 'secure' is the number of mirrors, not the number of disks ssd or another type of device... how many mirrors you have here: raid0= 1,2(a) 3,4(b) raid1=a,b 1 mirror (a or b) and here: raid1=1,2(a) 3,4(b) raid0=ab 1 mirror (a or b) let´s think about hard disk? your hard disk have 2 disks? why not make two partition? first partition is disk1, second partition is disk2 mirror it what´s your security? 1 mirror is it security? normaly when a harddisk crash all disks inside it crash but you is secury if only one internal disk fail... that´s the point, how many mirror? the point is with raid1+0 (raid10) we know that disks are fragments (raid1) with raid0+1 we know that disks are a big disk (raid0) the point is, we can´t allow that information stop, we need mirror to be secured (1 is good, 2 better, 3 really better, 4 5 6 7...) you can´t break mirror (not disk) to don´t break mirror have a second mirror (raid0 don´t help here! just raid1) with raid10 you will repair smal size of information (raid1), here sync will cost less time with raid01 you will repair big size of information (raid0), here sync will cost more time 2011/1/31 Denis <denismpa@gmail.com>: > 2011/1/31 Roberto Spadim <roberto@spadim.com.br>: >> hum that's right, >> but not 'increase' (only if you compare raid0+1 betwen raid1+0) using >> raid1 and after raid0 have LESS point of fail between raid 0 and after >> raid 1, since the number of point of fail is proportional to number of >> raid1 devices. > In that case, in an occurency of a failury, you will take much longer > to rebuild the failed disk, at least, double the time. > > >> >> 2011/1/31 Robin Hill <robin@robinhill.me.uk>: >>> On Mon Jan 31, 2011 at 01:00:13PM -0200, Roberto Spadim wrote: >>> >>>> i think make two very big raid 0 >>>> and after raid1 >>>> is better >>>> >>> Not really - you increase the failure risk doing this. With this setup, >>> a single drive failure from each RAID0 array will lose you the entire >>> array. With the reverse (RAID0 over RAID1) then you require both drives >>> in the RAID1 to fail in order to lose the array. Of course, with a 4 >>> drive array then the risk is the same (33% with 2 drive failures) but >>> with a 6 drive array it changes to 60% for RAID1 over RAID0 versus 20% >>> for RAID0 over RAID1. 
>>> >>> Cheers, >>> Robin >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>> >> >> >> >> -- >> Roberto Spadim >> Spadim Technology / SPAEmpresarial >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > > > > -- > Denis Anjos, > www.versatushpc.com.br > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 17:31 ` Roberto Spadim @ 2011-01-31 18:35 ` Denis 2011-01-31 19:15 ` Roberto Spadim 2011-01-31 19:37 ` Phillip Susi 0 siblings, 2 replies; 127+ messages in thread From: Denis @ 2011-01-31 18:35 UTC (permalink / raw) To: Roberto Spadim; +Cc: Linux-RAID 2011/1/31 Roberto Spadim <roberto@spadim.com.br>: > i think that partial failure (raid0 fail) of a mirror, is a fail > (since all mirror is repaired and resync) > the security is, if you lose all mirrors you have a device > so your 'secure' is the number of mirrors, not the number of disks ssd > or another type of device... > how many mirrors you have here: > raid0= 1,2(a) 3,4(b) > raid1=a,b > 1 mirror (a or b) > > and here: > raid1=1,2(a) 3,4(b) > raid0=ab > 1 mirror (a or b) > > let´s think about hard disk? > your hard disk have 2 disks? > why not make two partition? first partition is disk1, second partition is disk2 > mirror it > what´s your security? 1 mirror > is it security? normaly when a harddisk crash all disks inside it > crash but you is secury if only one internal disk fail... > > that´s the point, how many mirror? > the point is > with raid1+0 (raid10) we know that disks are fragments (raid1) > with raid0+1 we know that disks are a big disk (raid0) > the point is, we can´t allow that information stop, we need mirror to > be secured (1 is good, 2 better, 3 really better, 4 5 6 7...) > you can´t break mirror (not disk) to don´t break mirror have a second > mirror (raid0 don´t help here! just raid1) > > with raid10 you will repair smal size of information (raid1), here > sync will cost less time > with raid01 you will repair big size of information (raid0), here > sync will cost more time Roberto, to really understand how much better RAID 10 is than RAID 01, you need to take it down to a mathematical level. I once had the same doubt: "The difference is that the chance of system failure with two drive failures in a RAID 0+1 system with two sets of drives is (n/2)/(n - 1) where n is the total number of drives in the system. The chance of system failure in a RAID 1+0 system with two drives per mirror is 1/(n - 1). So, for example, using a 8 drive system, the chance that losing a second drive would bring down the RAID system is 4/7 with a RAID 0+1 system and 1/7 with a RAID 1+0 system." Another problem is that in the case of a failure of one disk (in the two-set case), with RAID 01 you lose redundancy for ALL your data, while with RAID 10 you lose redundancy for only one mirror pair's worth of data, 2/n of the total, i.e. 1/4 of your data set in the same 8-drive case. And also, with RAID 10 you will have to re-mirror just one disk in the case of a disk failure, while with RAID 01 you will have to re-mirror the whole failed set. -- Denis Anjos, www.versatushpc.com.br -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
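Plugging the thread's 6-drive case into Denis's formulas reproduces the percentages Robin quoted earlier: with n = 6, the chance that a second failure brings the array down is (n/2)/(n - 1) = 3/5 = 60% for RAID 0+1 with two striped sets, and 1/(n - 1) = 1/5 = 20% for RAID 1+0 with two-drive mirrors.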
* Re: What's the typical RAID10 setup? 2011-01-31 18:35 ` Denis @ 2011-01-31 19:15 ` Roberto Spadim 2011-01-31 19:28 ` Keld Jørn Simonsen 2011-01-31 19:37 ` Phillip Susi 1 sibling, 1 reply; 127+ messages in thread From: Roberto Spadim @ 2011-01-31 19:15 UTC (permalink / raw) To: Denis; +Cc: Linux-RAID ok, but lost of a disk = problem with hardware = big problems = mirror failed think about a 'disaster recover' system you can´t lost the main data (you MUST have one 'primary' data source) raid1 don´t have ecc or anyother 'paged' data recover solution (it have just all mirror resync) let´s get back a level... (inside hard disk) if your hard disk have 2 heads, you have a raid0 inside you disk (got the point?) using your math, you should consider head problem (since it make the real read of information) but at raid (1/0) software (firmware) level, you have devices (with out without heads, can be memory or anyother type of adresseable information souce, RAID0 = DEVICE for raid software/firmware, but you have A DEVICE) for raid 1 you have mirrors(a copy of one primary device) if software find 1bit of error inside this mirror(device), you lost the full mirror, 1bit of fail = mirror fail!!!!! it´s not more sync with the main(primary) data source!!!! got the problem? mirror will need a resync if any disk fail (check what fail make you mirror to fail, but i think linux raid1 mirror fail with any disk fail) if you have 4 mirrors you can loose 4 disks (1 disk fail = mirror fail, 2 disk fail = mirror fail, 3 disk fail = mirror fail, any device with fail inside a raid1 device will make the mirror to fail, got? you can have good and bad disks on raid0, but you will have a mirror failed if you have >=1 disk fail inside your raid0) got the point? what´s the probability of your mirror fail? if you use raid0 as mirror any disk of raid0 failed = mirror failed got? you can lose all raid0 but you have just 1 mirror failed! could i be more explicit? you can´t make probability using bit, you must make probability using mirror, since it´s you level of data consistency =] got? 2011/1/31 Denis <denismpa@gmail.com>: > 2011/1/31 Roberto Spadim <roberto@spadim.com.br>: >> i think that partial failure (raid0 fail) of a mirror, is a fail >> (since all mirror is repaired and resync) >> the security is, if you lose all mirrors you have a device >> so your 'secure' is the number of mirrors, not the number of disks ssd >> or another type of device... >> how many mirrors you have here: >> raid0= 1,2(a) 3,4(b) >> raid1=a,b >> 1 mirror (a or b) >> >> and here: >> raid1=1,2(a) 3,4(b) >> raid0=ab >> 1 mirror (a or b) >> >> let´s think about hard disk? >> your hard disk have 2 disks? >> why not make two partition? first partition is disk1, second partition is disk2 >> mirror it >> what´s your security? 1 mirror >> is it security? normaly when a harddisk crash all disks inside it >> crash but you is secury if only one internal disk fail... >> >> that´s the point, how many mirror? >> the point is >> with raid1+0 (raid10) we know that disks are fragments (raid1) >> with raid0+1 we know that disks are a big disk (raid0) >> the point is, we can´t allow that information stop, we need mirror to >> be secured (1 is good, 2 better, 3 really better, 4 5 6 7...) >> you can´t break mirror (not disk) to don´t break mirror have a second >> mirror (raid0 don´t help here! 
just raid1) >> >> with raid10 you will repair smal size of information (raid1), here >> sync will cost less time >> with raid01 you will repair big size of information (raid0), here >> sync will cost more time > > Roberto, to quite understend how better a raid 10 is over raid 01 you > need to take down into a mathematical level: > > once I had the same doubt: > > "The difference is that the chance of system failure with two drive > failures in a RAID 0+1 system with two sets of drives is (n/2)/(n - 1) > where n is the total number of drives in the system. The chance of > system failure in a RAID 1+0 system with two drives per mirror is 1/(n > - 1). So, for example, using a 8 drive system, the chance that losing > a second drive would bring down the RAID system is 4/7 with a RAID 0+1 > system and 1/7 with a RAID 1+0 system." > > > Another problem is that in the case of a failury of one disk ( in a > two sets case), in a raid01 you will loose redundancy for ALL your > data, while in a raid10 you will loose redundancy for 1/[(n/2 > -1)/(n/2)], in the same case 1/4 of your data set. > > And also, in a raid 10 you will have o re-mirror just one disk in the > case of a disk failure, in raid 01 you will have to re-mirror the > whole failed set. > > -- > Denis Anjos, > www.versatushpc.com.br > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 19:15 ` Roberto Spadim @ 2011-01-31 19:28 ` Keld Jørn Simonsen 2011-01-31 19:35 ` Roberto Spadim 2011-01-31 20:17 ` Stan Hoeppner 0 siblings, 2 replies; 127+ messages in thread From: Keld Jørn Simonsen @ 2011-01-31 19:28 UTC (permalink / raw) To: Roberto Spadim; +Cc: Denis, Linux-RAID Top-posting... How is the raid0+1 problem of only 33 % survival for 2 disk with RAID10? I know for RAID10,F2 the implementation in Linux MD is bad. It is only 33 % survival, while it with a probably minor fix could be 66%. But how with RAID10,n2 and RAID10,o2? best regards keld On Mon, Jan 31, 2011 at 05:15:29PM -0200, Roberto Spadim wrote: > ok, but lost of a disk = problem with hardware = big problems = mirror failed > think about a 'disaster recover' system > you can?t lost the main data (you MUST have one 'primary' data source) > > raid1 don?t have ecc or anyother 'paged' data recover solution (it > have just all mirror resync) > > let?s get back a level... (inside hard disk) > if your hard disk have 2 heads, you have a raid0 inside you disk (got > the point?) > using your math, you should consider head problem (since it make the > real read of information) > > but at raid (1/0) software (firmware) level, you have devices (with > out without heads, can be memory or anyother type of adresseable > information souce, RAID0 = DEVICE for raid software/firmware, but you > have A DEVICE) > > for raid 1 you have mirrors(a copy of one primary device) > if software find 1bit of error inside this mirror(device), you lost > the full mirror, 1bit of fail = mirror fail!!!!! it?s not more sync > with the main(primary) data source!!!! > > got the problem? mirror will need a resync if any disk fail (check > what fail make you mirror to fail, but i think linux raid1 mirror fail > with any disk fail) > > if you have 4 mirrors you can loose 4 disks (1 disk fail = mirror > fail, 2 disk fail = mirror fail, 3 disk fail = mirror fail, any device > with fail inside a raid1 device will make the mirror to fail, got? you > can have good and bad disks on raid0, but you will have a mirror > failed if you have >=1 disk fail inside your raid0) > > got the point? > what?s the probability of your mirror fail? > if you use raid0 as mirror > any disk of raid0 failed = mirror failed got? > you can lose all raid0 but you have just 1 mirror failed! > > > could i be more explicit? you can?t make probability using bit, you > must make probability using mirror, since it?s you level of data > consistency > =] got? > > > 2011/1/31 Denis <denismpa@gmail.com>: > > 2011/1/31 Roberto Spadim <roberto@spadim.com.br>: > >> i think that partial failure (raid0 fail) of a mirror, is a fail > >> (since all mirror is repaired and resync) > >> the security is, if you lose all mirrors you have a device > >> so your 'secure' is the number of mirrors, not the number of disks ssd > >> or another type of device... > >> how many mirrors you have here: > >> raid0= 1,2(a) 3,4(b) > >> raid1=a,b > >> 1 mirror (a or b) > >> > >> and here: > >> raid1=1,2(a) 3,4(b) > >> raid0=ab > >> 1 mirror (a or b) > >> > >> let?s think about hard disk? > >> your hard disk have 2 disks? > >> why not make two partition? first partition is disk1, second partition is disk2 > >> mirror it > >> what?s your security? 1 mirror > >> is it security? normaly when a harddisk crash all disks inside it > >> crash but you is secury if only one internal disk fail... > >> > >> that?s the point, how many mirror? 
> >> the point is > >> with raid1+0 (raid10) we know that disks are fragments (raid1) > >> with raid0+1 we know that disks are a big disk (raid0) > >> the point is, we can?t allow that information stop, we need mirror to > >> be secured (1 is good, 2 better, 3 really better, 4 5 6 7...) > >> you can?t break mirror (not disk) to don?t break mirror have a second > >> mirror (raid0 don?t help here! just raid1) > >> > >> with raid10 you will repair smal size of information (raid1), here > >> sync will cost less time > >> with raid01 you will repair big size of information (raid0), here > >> sync will cost more time > > > > Roberto, to quite understend how better a raid 10 is over raid 01 you > > need to take down into a mathematical level: > > > > once I had the same doubt: > > > > "The difference is that the chance of system failure with two drive > > failures in a RAID 0+1 system with two sets of drives is (n/2)/(n - 1) > > where n is the total number of drives in the system. The chance of > > system failure in a RAID 1+0 system with two drives per mirror is 1/(n > > - 1). So, for example, using a 8 drive system, the chance that losing > > a second drive would bring down the RAID system is 4/7 with a RAID 0+1 > > system and 1/7 with a RAID 1+0 system." > > > > > > Another problem is that in the case of a failury of one disk ( in a > > two sets case), in a raid01 you will loose redundancy for ALL your > > data, while in a raid10 you will loose redundancy for 1/[(n/2 > > -1)/(n/2)], in the same case 1/4 of your data set. > > > > And also, in a raid 10 you will have o re-mirror just one disk in the > > case of a disk failure, in raid 01 you will have to re-mirror the > > whole failed set. > > > > -- > > Denis Anjos, > > www.versatushpc.com.br > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
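For reference, the layout names Keld uses are selected when an md RAID10 array is created; a sketch with illustrative device names:

  mdadm --create /dev/md0 --level=10 --raid-devices=4 --layout=n2 /dev/sd[b-e]   # near copies (default)
  mdadm --create /dev/md0 --level=10 --raid-devices=4 --layout=f2 /dev/sd[b-e]   # far copies
  mdadm --create /dev/md0 --level=10 --raid-devices=4 --layout=o2 /dev/sd[b-e]   # offset copies

The trailing digit is the number of copies of each block, so f3, for example, keeps three far copies.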
* Re: What's the typical RAID10 setup? 2011-01-31 19:28 ` Keld Jørn Simonsen @ 2011-01-31 19:35 ` Roberto Spadim 2011-01-31 19:37 ` Roberto Spadim 2011-01-31 20:22 ` Keld Jørn Simonsen 2011-01-31 20:17 ` Stan Hoeppner 1 sibling, 2 replies; 127+ messages in thread From: Roberto Spadim @ 2011-01-31 19:35 UTC (permalink / raw) To: Keld Jørn Simonsen; +Cc: Denis, Linux-RAID the question is: how many mirrors you have? you don´t have a partial mirror (i didn´t found it on raid documentation yet), or you have a working mirror or you don´t have the mirror and must resync to have a running one raid10 = raid1 but the raid1 devices are raid0 if you put raid1 over raid0 or raid0 over raid1 is not a diference of security. just a diference of how many time i will wait to resync the raid1 mirror (a big raid0 you slower than smallers harddisks/ssd devices) the question again: how many mirrors you have? 2011/1/31 Keld Jørn Simonsen <keld@keldix.com>: > Top-posting... > > How is the raid0+1 problem of only 33 % survival for 2 disk with RAID10? > > I know for RAID10,F2 the implementation in Linux MD is bad. > It is only 33 % survival, while it with a probably minor fix could be 66%. > > But how with RAID10,n2 and RAID10,o2? > > best regards > keld > > > On Mon, Jan 31, 2011 at 05:15:29PM -0200, Roberto Spadim wrote: >> ok, but lost of a disk = problem with hardware = big problems = mirror failed >> think about a 'disaster recover' system >> you can?t lost the main data (you MUST have one 'primary' data source) >> >> raid1 don?t have ecc or anyother 'paged' data recover solution (it >> have just all mirror resync) >> >> let?s get back a level... (inside hard disk) >> if your hard disk have 2 heads, you have a raid0 inside you disk (got >> the point?) >> using your math, you should consider head problem (since it make the >> real read of information) >> >> but at raid (1/0) software (firmware) level, you have devices (with >> out without heads, can be memory or anyother type of adresseable >> information souce, RAID0 = DEVICE for raid software/firmware, but you >> have A DEVICE) >> >> for raid 1 you have mirrors(a copy of one primary device) >> if software find 1bit of error inside this mirror(device), you lost >> the full mirror, 1bit of fail = mirror fail!!!!! it?s not more sync >> with the main(primary) data source!!!! >> >> got the problem? mirror will need a resync if any disk fail (check >> what fail make you mirror to fail, but i think linux raid1 mirror fail >> with any disk fail) >> >> if you have 4 mirrors you can loose 4 disks (1 disk fail = mirror >> fail, 2 disk fail = mirror fail, 3 disk fail = mirror fail, any device >> with fail inside a raid1 device will make the mirror to fail, got? you >> can have good and bad disks on raid0, but you will have a mirror >> failed if you have >=1 disk fail inside your raid0) >> >> got the point? >> what?s the probability of your mirror fail? >> if you use raid0 as mirror >> any disk of raid0 failed = mirror failed got? >> you can lose all raid0 but you have just 1 mirror failed! >> >> >> could i be more explicit? you can?t make probability using bit, you >> must make probability using mirror, since it?s you level of data >> consistency >> =] got? 
>> >> >> 2011/1/31 Denis <denismpa@gmail.com>: >> > 2011/1/31 Roberto Spadim <roberto@spadim.com.br>: >> >> i think that partial failure (raid0 fail) of a mirror, is a fail >> >> (since all mirror is repaired and resync) >> >> the security is, if you lose all mirrors you have a device >> >> so your 'secure' is the number of mirrors, not the number of disks ssd >> >> or another type of device... >> >> how many mirrors you have here: >> >> raid0= 1,2(a) 3,4(b) >> >> raid1=a,b >> >> 1 mirror (a or b) >> >> >> >> and here: >> >> raid1=1,2(a) 3,4(b) >> >> raid0=ab >> >> 1 mirror (a or b) >> >> >> >> let?s think about hard disk? >> >> your hard disk have 2 disks? >> >> why not make two partition? first partition is disk1, second partition is disk2 >> >> mirror it >> >> what?s your security? 1 mirror >> >> is it security? normaly when a harddisk crash all disks inside it >> >> crash but you is secury if only one internal disk fail... >> >> >> >> that?s the point, how many mirror? >> >> the point is >> >> with raid1+0 (raid10) we know that disks are fragments (raid1) >> >> with raid0+1 we know that disks are a big disk (raid0) >> >> the point is, we can?t allow that information stop, we need mirror to >> >> be secured (1 is good, 2 better, 3 really better, 4 5 6 7...) >> >> you can?t break mirror (not disk) to don?t break mirror have a second >> >> mirror (raid0 don?t help here! just raid1) >> >> >> >> with raid10 you will repair smal size of information (raid1), here >> >> sync will cost less time >> >> with raid01 you will repair big size of information (raid0), here >> >> sync will cost more time >> > >> > Roberto, to quite understend how better a raid 10 is over raid 01 you >> > need to take down into a mathematical level: >> > >> > once I had the same doubt: >> > >> > "The difference is that the chance of system failure with two drive >> > failures in a RAID 0+1 system with two sets of drives is (n/2)/(n - 1) >> > where n is the total number of drives in the system. The chance of >> > system failure in a RAID 1+0 system with two drives per mirror is 1/(n >> > - 1). So, for example, using a 8 drive system, the chance that losing >> > a second drive would bring down the RAID system is 4/7 with a RAID 0+1 >> > system and 1/7 with a RAID 1+0 system." >> > >> > >> > Another problem is that in the case of a failury of one disk ( in a >> > two sets case), in a raid01 you will loose redundancy for ALL your >> > data, while in a raid10 you will loose redundancy for 1/[(n/2 >> > -1)/(n/2)], in the same case 1/4 of your data set. >> > >> > And also, in a raid 10 you will have o re-mirror just one disk in the >> > case of a disk failure, in raid 01 you will have to re-mirror the >> > whole failed set. 
>> > >> > -- >> > Denis Anjos, >> > www.versatushpc.com.br >> > -- >> > To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> > the body of a message to majordomo@vger.kernel.org >> > More majordomo info at http://vger.kernel.org/majordomo-info.html >> > >> >> >> >> -- >> Roberto Spadim >> Spadim Technology / SPAEmpresarial >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 19:35 ` Roberto Spadim @ 2011-01-31 19:37 ` Roberto Spadim 2011-01-31 20:22 ` Keld Jørn Simonsen 1 sibling, 0 replies; 127+ messages in thread From: Roberto Spadim @ 2011-01-31 19:37 UTC (permalink / raw) To: Keld Jørn Simonsen; +Cc: Denis, Linux-RAID the main line of how to calcule the probability here: YOU CAN´T HAVE A LOST OF INFORMATION! so you can´t allow the MIN(probability to fail) be the secured probability you MUST use the MAX(probability to fail) MAX(probability to fail) = 1 disk failed = 1 mirror failed got? 2011/1/31 Roberto Spadim <roberto@spadim.com.br>: > the question is: > how many mirrors you have? you don´t have a partial mirror (i didn´t > found it on raid documentation yet), or you have a working mirror or > you don´t have the mirror and must resync to have a running one > > raid10 = raid1 > but the raid1 devices are raid0 > if you put raid1 over raid0 or raid0 over raid1 is not a diference of > security. just a diference of how many time i will wait to resync the > raid1 mirror (a big raid0 you slower than smallers harddisks/ssd > devices) > > the question again: > how many mirrors you have? > > 2011/1/31 Keld Jørn Simonsen <keld@keldix.com>: >> Top-posting... >> >> How is the raid0+1 problem of only 33 % survival for 2 disk with RAID10? >> >> I know for RAID10,F2 the implementation in Linux MD is bad. >> It is only 33 % survival, while it with a probably minor fix could be 66%. >> >> But how with RAID10,n2 and RAID10,o2? >> >> best regards >> keld >> >> >> On Mon, Jan 31, 2011 at 05:15:29PM -0200, Roberto Spadim wrote: >>> ok, but lost of a disk = problem with hardware = big problems = mirror failed >>> think about a 'disaster recover' system >>> you can?t lost the main data (you MUST have one 'primary' data source) >>> >>> raid1 don?t have ecc or anyother 'paged' data recover solution (it >>> have just all mirror resync) >>> >>> let?s get back a level... (inside hard disk) >>> if your hard disk have 2 heads, you have a raid0 inside you disk (got >>> the point?) >>> using your math, you should consider head problem (since it make the >>> real read of information) >>> >>> but at raid (1/0) software (firmware) level, you have devices (with >>> out without heads, can be memory or anyother type of adresseable >>> information souce, RAID0 = DEVICE for raid software/firmware, but you >>> have A DEVICE) >>> >>> for raid 1 you have mirrors(a copy of one primary device) >>> if software find 1bit of error inside this mirror(device), you lost >>> the full mirror, 1bit of fail = mirror fail!!!!! it?s not more sync >>> with the main(primary) data source!!!! >>> >>> got the problem? mirror will need a resync if any disk fail (check >>> what fail make you mirror to fail, but i think linux raid1 mirror fail >>> with any disk fail) >>> >>> if you have 4 mirrors you can loose 4 disks (1 disk fail = mirror >>> fail, 2 disk fail = mirror fail, 3 disk fail = mirror fail, any device >>> with fail inside a raid1 device will make the mirror to fail, got? you >>> can have good and bad disks on raid0, but you will have a mirror >>> failed if you have >=1 disk fail inside your raid0) >>> >>> got the point? >>> what?s the probability of your mirror fail? >>> if you use raid0 as mirror >>> any disk of raid0 failed = mirror failed got? >>> you can lose all raid0 but you have just 1 mirror failed! >>> >>> >>> could i be more explicit? 
you can?t make probability using bit, you >>> must make probability using mirror, since it?s you level of data >>> consistency >>> =] got? >>> >>> >>> 2011/1/31 Denis <denismpa@gmail.com>: >>> > 2011/1/31 Roberto Spadim <roberto@spadim.com.br>: >>> >> i think that partial failure (raid0 fail) of a mirror, is a fail >>> >> (since all mirror is repaired and resync) >>> >> the security is, if you lose all mirrors you have a device >>> >> so your 'secure' is the number of mirrors, not the number of disks ssd >>> >> or another type of device... >>> >> how many mirrors you have here: >>> >> raid0= 1,2(a) 3,4(b) >>> >> raid1=a,b >>> >> 1 mirror (a or b) >>> >> >>> >> and here: >>> >> raid1=1,2(a) 3,4(b) >>> >> raid0=ab >>> >> 1 mirror (a or b) >>> >> >>> >> let?s think about hard disk? >>> >> your hard disk have 2 disks? >>> >> why not make two partition? first partition is disk1, second partition is disk2 >>> >> mirror it >>> >> what?s your security? 1 mirror >>> >> is it security? normaly when a harddisk crash all disks inside it >>> >> crash but you is secury if only one internal disk fail... >>> >> >>> >> that?s the point, how many mirror? >>> >> the point is >>> >> with raid1+0 (raid10) we know that disks are fragments (raid1) >>> >> with raid0+1 we know that disks are a big disk (raid0) >>> >> the point is, we can?t allow that information stop, we need mirror to >>> >> be secured (1 is good, 2 better, 3 really better, 4 5 6 7...) >>> >> you can?t break mirror (not disk) to don?t break mirror have a second >>> >> mirror (raid0 don?t help here! just raid1) >>> >> >>> >> with raid10 you will repair smal size of information (raid1), here >>> >> sync will cost less time >>> >> with raid01 you will repair big size of information (raid0), here >>> >> sync will cost more time >>> > >>> > Roberto, to quite understend how better a raid 10 is over raid 01 you >>> > need to take down into a mathematical level: >>> > >>> > once I had the same doubt: >>> > >>> > "The difference is that the chance of system failure with two drive >>> > failures in a RAID 0+1 system with two sets of drives is (n/2)/(n - 1) >>> > where n is the total number of drives in the system. The chance of >>> > system failure in a RAID 1+0 system with two drives per mirror is 1/(n >>> > - 1). So, for example, using a 8 drive system, the chance that losing >>> > a second drive would bring down the RAID system is 4/7 with a RAID 0+1 >>> > system and 1/7 with a RAID 1+0 system." >>> > >>> > >>> > Another problem is that in the case of a failury of one disk ( in a >>> > two sets case), in a raid01 you will loose redundancy for ALL your >>> > data, while in a raid10 you will loose redundancy for 1/[(n/2 >>> > -1)/(n/2)], in the same case 1/4 of your data set. >>> > >>> > And also, in a raid 10 you will have o re-mirror just one disk in the >>> > case of a disk failure, in raid 01 you will have to re-mirror the >>> > whole failed set. 
>>> > >>> > -- >>> > Denis Anjos, >>> > www.versatushpc.com.br >>> > -- >>> > To unsubscribe from this list: send the line "unsubscribe linux-raid" in >>> > the body of a message to majordomo@vger.kernel.org >>> > More majordomo info at http://vger.kernel.org/majordomo-info.html >>> > >>> >>> >>> >>> -- >>> Roberto Spadim >>> Spadim Technology / SPAEmpresarial >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 19:35 ` Roberto Spadim 2011-01-31 19:37 ` Roberto Spadim @ 2011-01-31 20:22 ` Keld Jørn Simonsen 1 sibling, 0 replies; 127+ messages in thread From: Keld Jørn Simonsen @ 2011-01-31 20:22 UTC (permalink / raw) To: Roberto Spadim; +Cc: Keld Jørn Simonsen, Denis, Linux-RAID On Mon, Jan 31, 2011 at 05:35:31PM -0200, Roberto Spadim wrote: > the question is: > how many mirrors you have? you don?t have a partial mirror (i didn?t > found it on raid documentation yet), or you have a working mirror or > you don?t have the mirror and must resync to have a running one > > raid10 = raid1 > but the raid1 devices are raid0 you are confused, please read below. > if you put raid1 over raid0 or raid0 over raid1 is not a diference of > security. just a diference of how many time i will wait to resync the > raid1 mirror (a big raid0 you slower than smallers harddisks/ssd > devices) > > the question again: > how many mirrors you have? My question was really not related to your question, it is a general question for the design of Linux MD RAID10. And please keep terminology clean. RAID10 here on the list is Linux MD RAID10. This is very different from what was called RAID10 five years ago. The term for that is RAID1+0, meaning that you have 2 RAID1 devices, and then you make a RAID0 over the 2 RAID1 devices. Best regards keld > 2011/1/31 Keld Jørn Simonsen <keld@keldix.com>: > > Top-posting... > > > > How is the raid0+1 problem of only 33 % survival for 2 disk with RAID10? > > > > I know for RAID10,F2 the implementation in Linux MD is bad. > > It is only 33 % survival, while it with a probably minor fix could be 66%. > > > > But how with RAID10,n2 and RAID10,o2? > > > > best regards > > keld > > > > > > On Mon, Jan 31, 2011 at 05:15:29PM -0200, Roberto Spadim wrote: > >> ok, but lost of a disk = problem with hardware = big problems = mirror failed > >> think about a 'disaster recover' system > >> you can?t lost the main data (you MUST have one 'primary' data source) > >> > >> raid1 don?t have ecc or anyother 'paged' data recover solution (it > >> have just all mirror resync) > >> > >> let?s get back a level... (inside hard disk) > >> if your hard disk have 2 heads, you have a raid0 inside you disk (got > >> the point?) > >> using your math, you should consider head problem (since it make the > >> real read of information) > >> > >> but at raid (1/0) software (firmware) level, you have devices (with > >> out without heads, can be memory or anyother type of adresseable > >> information souce, RAID0 = DEVICE for raid software/firmware, but you > >> have A DEVICE) > >> > >> for raid 1 you have mirrors(a copy of one primary device) > >> if software find 1bit of error inside this mirror(device), you lost > >> the full mirror, 1bit of fail = mirror fail!!!!! it?s not more sync > >> with the main(primary) data source!!!! > >> > >> got the problem? mirror will need a resync if any disk fail (check > >> what fail make you mirror to fail, but i think linux raid1 mirror fail > >> with any disk fail) > >> > >> if you have 4 mirrors you can loose 4 disks (1 disk fail = mirror > >> fail, 2 disk fail = mirror fail, 3 disk fail = mirror fail, any device > >> with fail inside a raid1 device will make the mirror to fail, got? you > >> can have good and bad disks on raid0, but you will have a mirror > >> failed if you have >=1 disk fail inside your raid0) > >> > >> got the point? > >> what?s the probability of your mirror fail? 
> >> if you use raid0 as mirror > >> any disk of raid0 failed = mirror failed got? > >> you can lose all raid0 but you have just 1 mirror failed! > >> > >> > >> could i be more explicit? you can?t make probability using bit, you > >> must make probability using mirror, since it?s you level of data > >> consistency > >> =] got? > >> > >> > >> 2011/1/31 Denis <denismpa@gmail.com>: > >> > 2011/1/31 Roberto Spadim <roberto@spadim.com.br>: > >> >> i think that partial failure (raid0 fail) of a mirror, is a fail > >> >> (since all mirror is repaired and resync) > >> >> the security is, if you lose all mirrors you have a device > >> >> so your 'secure' is the number of mirrors, not the number of disks ssd > >> >> or another type of device... > >> >> how many mirrors you have here: > >> >> raid0= 1,2(a) 3,4(b) > >> >> raid1=a,b > >> >> 1 mirror (a or b) > >> >> > >> >> and here: > >> >> raid1=1,2(a) 3,4(b) > >> >> raid0=ab > >> >> 1 mirror (a or b) > >> >> > >> >> let?s think about hard disk? > >> >> your hard disk have 2 disks? > >> >> why not make two partition? first partition is disk1, second partition is disk2 > >> >> mirror it > >> >> what?s your security? 1 mirror > >> >> is it security? normaly when a harddisk crash all disks inside it > >> >> crash but you is secury if only one internal disk fail... > >> >> > >> >> that?s the point, how many mirror? > >> >> the point is > >> >> with raid1+0 (raid10) we know that disks are fragments (raid1) > >> >> with raid0+1 we know that disks are a big disk (raid0) > >> >> the point is, we can?t allow that information stop, we need mirror to > >> >> be secured (1 is good, 2 better, 3 really better, 4 5 6 7...) > >> >> you can?t break mirror (not disk) to don?t break mirror have a second > >> >> mirror (raid0 don?t help here! just raid1) > >> >> > >> >> with raid10 you will repair smal size of information (raid1), here > >> >> sync will cost less time > >> >> with raid01 you will repair big size of information (raid0), here > >> >> sync will cost more time > >> > > >> > Roberto, to quite understend how better a raid 10 is over raid 01 you > >> > need to take down into a mathematical level: > >> > > >> > once I had the same doubt: > >> > > >> > "The difference is that the chance of system failure with two drive > >> > failures in a RAID 0+1 system with two sets of drives is (n/2)/(n - 1) > >> > where n is the total number of drives in the system. The chance of > >> > system failure in a RAID 1+0 system with two drives per mirror is 1/(n > >> > - 1). So, for example, using a 8 drive system, the chance that losing > >> > a second drive would bring down the RAID system is 4/7 with a RAID 0+1 > >> > system and 1/7 with a RAID 1+0 system." > >> > > >> > > >> > Another problem is that in the case of a failury of one disk ( in a > >> > two sets case), in a raid01 you will loose redundancy for ALL your > >> > data, while in a raid10 you will loose redundancy for 1/[(n/2 > >> > -1)/(n/2)], in the same case 1/4 of your data set. > >> > > >> > And also, in a raid 10 you will have o re-mirror just one disk in the > >> > case of a disk failure, in raid 01 you will have to re-mirror the > >> > whole failed set. 
> >> > > >> > -- > >> > Denis Anjos, > >> > www.versatushpc.com.br > >> > -- > >> > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > >> > the body of a message to majordomo@vger.kernel.org > >> > More majordomo info at http://vger.kernel.org/majordomo-info.html > >> > > >> > >> > >> > >> -- > >> Roberto Spadim > >> Spadim Technology / SPAEmpresarial > >> -- > >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in > >> the body of a message to majordomo@vger.kernel.org > >> More majordomo info at http://vger.kernel.org/majordomo-info.html > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 19:28 ` Keld Jørn Simonsen 2011-01-31 19:35 ` Roberto Spadim @ 2011-01-31 20:17 ` Stan Hoeppner 2011-01-31 20:37 ` Keld Jørn Simonsen 1 sibling, 1 reply; 127+ messages in thread From: Stan Hoeppner @ 2011-01-31 20:17 UTC (permalink / raw) To: Keld Jørn Simonsen; +Cc: Roberto Spadim, Denis, Linux-RAID Keld Jørn Simonsen put forth on 1/31/2011 1:28 PM: > Top-posting... > > How is the raid0+1 problem of only 33 % survival for 2 disk with RAID10? > > I know for RAID10,F2 the implementation in Linux MD is bad. > It is only 33 % survival, while it with a probably minor fix could be 66%. > > But how with RAID10,n2 and RAID10,o2? I don't care what Neil or anyone says, these "layouts" are _NOT_ RAID 10. If you want to discuss RAID 10, please leave these non-standard Frankenstein "layouts" out of the discussion. Including them only muddies things unnecessarily. Thank you. -- Stan -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 20:17 ` Stan Hoeppner @ 2011-01-31 20:37 ` Keld Jørn Simonsen 2011-01-31 21:20 ` Roberto Spadim 0 siblings, 1 reply; 127+ messages in thread From: Keld Jørn Simonsen @ 2011-01-31 20:37 UTC (permalink / raw) To: Stan Hoeppner; +Cc: Keld Jørn Simonsen, Roberto Spadim, Denis, Linux-RAID On Mon, Jan 31, 2011 at 02:17:37PM -0600, Stan Hoeppner wrote: > Keld Jørn Simonsen put forth on 1/31/2011 1:28 PM: > > Top-posting... > > > > How is the raid0+1 problem of only 33 % survival for 2 disk with RAID10? > > > > I know for RAID10,F2 the implementation in Linux MD is bad. > > It is only 33 % survival, while it with a probably minor fix could be 66%. > > > > But how with RAID10,n2 and RAID10,o2? > > I don't care what Neil or anyone says, these "layouts" are _NOT_ RAID 10. If > you want to discuss RAID 10, please leave these non-standard Frankenstein > "layouts" out of the discussion. Including them only muddies things unnecessarily. Please keep terminology clean, and non-ambigeous. Please refer to the old term RAID10 as RAID1+0, which is also the original and more precise term for that concept of multilevel RAID. RAID10 on this list refers to the RAID10 modules of the Linux kernel. I can concurr that this may be a somewhat misleading term, as it is easily confused with the popular understanding of RAID10, meaning RAID1+0. And I see Linux RAID10 as a family of RAID1 layouts. Indeed RAID10,n2 is almost the same as normal RAID1, and RAID10,o2 is an implementation of a specific layout of the RAID1 standard. RAID10,f2 could easily also be seen as a specific RAID1 layout. But that is the naming of terms that we have to deal with on this Linux kernel list for the RAID modules. best regards Keld -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 20:37 ` Keld Jørn Simonsen @ 2011-01-31 21:20 ` Roberto Spadim 2011-01-31 21:24 ` Mathias Burén 0 siblings, 1 reply; 127+ messages in thread From: Roberto Spadim @ 2011-01-31 21:20 UTC (permalink / raw) To: Keld Jørn Simonsen; +Cc: Stan Hoeppner, Denis, Linux-RAID no matter what raid1 or raid10 system we use raid1 is mirror! let´s think that raid0 = more than one disk (not a single disk)... if a hard disk inside a mirror (raid1) fail (it can be a raid0 or a single disk) the mirror is failed for example: there´s no 25% survival for 2 mirrors with 4 disks! probability, here, is mirror based, not disk based! it´s not a question about linux implementation is a question for generic raid1 (mirror) system (1 failed 2 mirrors = 1 mirror failed but 1 mirror working) you only can have 25% 'survival' if you can use 4 disks, or multiples of 4, for raid1 if your raid0 is broken you don´t have a raid0! you have a broken raid = broken mirror (for raid1)! should i write it again? for raid10 (raid1+0) with 4 disks you can only lost 1 disk! 1 disk lost = 1 raid0 lost = 1 mirror lost! should i write it again? 2011/1/31 Keld Jørn Simonsen <keld@keldix.com>: > On Mon, Jan 31, 2011 at 02:17:37PM -0600, Stan Hoeppner wrote: >> Keld Jørn Simonsen put forth on 1/31/2011 1:28 PM: >> > Top-posting... >> > >> > How is the raid0+1 problem of only 33 % survival for 2 disk with RAID10? >> > >> > I know for RAID10,F2 the implementation in Linux MD is bad. >> > It is only 33 % survival, while it with a probably minor fix could be 66%. >> > >> > But how with RAID10,n2 and RAID10,o2? >> >> I don't care what Neil or anyone says, these "layouts" are _NOT_ RAID 10. If >> you want to discuss RAID 10, please leave these non-standard Frankenstein >> "layouts" out of the discussion. Including them only muddies things unnecessarily. > > Please keep terminology clean, and non-ambigeous. > Please refer to the old term RAID10 as RAID1+0, which is also the > original and more precise term for that concept of multilevel RAID. > > RAID10 on this list refers to the RAID10 modules of the Linux kernel. > > I can concurr that this may be a somewhat misleading term, as it is > easily confused with the popular understanding of RAID10, meaning > RAID1+0. And I see Linux RAID10 as a family of RAID1 layouts. > Indeed RAID10,n2 is almost the same as normal RAID1, and RAID10,o2 > is an implementation of a specific layout of the RAID1 standard. > RAID10,f2 could easily also be seen as a specific RAID1 layout. > > But that is the naming of terms that we have to deal with on this Linux > kernel list for the RAID modules. > > best regards > Keld > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 21:20 ` Roberto Spadim @ 2011-01-31 21:24 ` Mathias Burén 2011-01-31 21:27 ` Jon Nelson 0 siblings, 1 reply; 127+ messages in thread From: Mathias Burén @ 2011-01-31 21:24 UTC (permalink / raw) To: Roberto Spadim; +Cc: Keld Jørn Simonsen, Stan Hoeppner, Denis, Linux-RAID 2011/1/31 Roberto Spadim <roberto@spadim.com.br>: > no matter what raid1 or raid10 system we use > raid1 is mirror! let´s think that raid0 = more than one disk (not a > single disk)... > > if a hard disk inside a mirror (raid1) fail (it can be a raid0 or a > single disk) the mirror is failed > for example: there´s no 25% survival for 2 mirrors with 4 disks! > probability, here, is mirror based, not disk based! > it´s not a question about linux implementation is a question for > generic raid1 (mirror) system (1 failed 2 mirrors = 1 mirror failed > but 1 mirror working) > > you only can have 25% 'survival' if you can use 4 disks, or multiples > of 4, for raid1 > if your raid0 is broken you don´t have a raid0! you have a broken raid > = broken mirror (for raid1)! > > should i write it again? for raid10 (raid1+0) with 4 disks you can > only lost 1 disk! 1 disk lost = 1 raid0 lost = 1 mirror lost! > should i write it again? > > 2011/1/31 Keld Jørn Simonsen <keld@keldix.com>: >> On Mon, Jan 31, 2011 at 02:17:37PM -0600, Stan Hoeppner wrote: >>> Keld Jørn Simonsen put forth on 1/31/2011 1:28 PM: >>> > Top-posting... >>> > >>> > How is the raid0+1 problem of only 33 % survival for 2 disk with RAID10? >>> > >>> > I know for RAID10,F2 the implementation in Linux MD is bad. >>> > It is only 33 % survival, while it with a probably minor fix could be 66%. >>> > >>> > But how with RAID10,n2 and RAID10,o2? >>> >>> I don't care what Neil or anyone says, these "layouts" are _NOT_ RAID 10. If >>> you want to discuss RAID 10, please leave these non-standard Frankenstein >>> "layouts" out of the discussion. Including them only muddies things unnecessarily. >> >> Please keep terminology clean, and non-ambigeous. >> Please refer to the old term RAID10 as RAID1+0, which is also the >> original and more precise term for that concept of multilevel RAID. >> >> RAID10 on this list refers to the RAID10 modules of the Linux kernel. >> >> I can concurr that this may be a somewhat misleading term, as it is >> easily confused with the popular understanding of RAID10, meaning >> RAID1+0. And I see Linux RAID10 as a family of RAID1 layouts. >> Indeed RAID10,n2 is almost the same as normal RAID1, and RAID10,o2 >> is an implementation of a specific layout of the RAID1 standard. >> RAID10,f2 could easily also be seen as a specific RAID1 layout. >> >> But that is the naming of terms that we have to deal with on this Linux >> kernel list for the RAID modules. >> >> best regards >> Keld >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > In a 4 disk RAID1+0 (where you have 2 HDDs in RAID1 (a) and 2 other HDDs in RAID1 (b), then put them together in a RAID0) you can lose a maximum of 2 HDDs, without any data loss. Sure, the "mirror" is "broken", but your data is intact. 
So, you can actually rip out 2 HDDs and still have your data, provided you pull the "right" drives. A single disk can be thought of as a RAID0 (as it's not redundant). // Mathias -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 21:24 ` Mathias Burén @ 2011-01-31 21:27 ` Jon Nelson 2011-01-31 21:47 ` Roberto Spadim ` (2 more replies) 0 siblings, 3 replies; 127+ messages in thread From: Jon Nelson @ 2011-01-31 21:27 UTC (permalink / raw) To: Mathias Burén Cc: Roberto Spadim, Keld Jørn Simonsen, Stan Hoeppner, Denis, Linux-RAID Before this goes any further, why not just reference the excellent Wikipedia article (actually, excellent applies to both Wikipedia *and* the article): http://en.wikipedia.org/wiki/Non-standard_RAID_levels#Linux_MD_RAID_10 The only problem I have with the wikipedia article is the assertion that Linux MD RAID 10 is non-standard. It's as standard as anything else is in this world. -- Jon ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 21:27 ` Jon Nelson @ 2011-01-31 21:47 ` Roberto Spadim 2011-01-31 21:51 ` Roberto Spadim 2011-01-31 22:52 ` Keld Jørn Simonsen 2011-02-01 0:58 ` Stan Hoeppner 2011-02-01 8:46 ` hansbkk 2 siblings, 2 replies; 127+ messages in thread From: Roberto Spadim @ 2011-01-31 21:47 UTC (permalink / raw) To: Jon Nelson Cc: Mathias Burén, Keld Jørn Simonsen, Stan Hoeppner, Denis, Linux-RAID =] hehehe there is no standard for linux, just the linux standard that was implemented :P linux raid10 work and is the same idea of the 'raid10' academic standard i don´t know any raid standard, just hardware based standard you can´t get a smart array(hp) disk and put on a perc(dell) or linux mdadm and wait it will work without tweaking... 2011/1/31 Jon Nelson <jnelson-linux-raid@jamponi.net>: > Before this goes any further, why not just reference the excellent > Wikipedia article (actually, excellent applies to both Wikipedia *and* > the article): > > http://en.wikipedia.org/wiki/Non-standard_RAID_levels#Linux_MD_RAID_10 > > The only problem I have with the wikipedia article is the assertion > that Linux MD RAID 10 is non-standard. It's as standard as anything > else is in this world. > > > -- > Jon > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 21:47 ` Roberto Spadim @ 2011-01-31 21:51 ` Roberto Spadim 2011-01-31 22:50 ` NeilBrown 2011-01-31 22:52 ` Keld Jørn Simonsen 1 sibling, 1 reply; 127+ messages in thread From: Roberto Spadim @ 2011-01-31 21:51 UTC (permalink / raw) To: Jon Nelson Cc: Mathias Burén, Keld Jørn Simonsen, Stan Hoeppner, Denis, Linux-RAID now, a question.... if raid1 is like raid10 (one disk = raid0) why not only one raid1 (raid10) software implementation? for example, if i have 4 disks and i want 4 mirrors. why not work with only raid10? why the option since we have all features of raid1 inside raid10? is it to allow small source code (a small ARM rom)? memory usage? cpu usage? easy to implement? 2011/1/31 Roberto Spadim <roberto@spadim.com.br>: > =] hehehe there is no standard for linux, just the linux standard that > was implemented :P > linux raid10 work and is the same idea of the 'raid10' academic standard > i don´t know any raid standard, just hardware based standard > you can´t get a smart array(hp) disk and put on a perc(dell) or linux > mdadm and wait it will work without tweaking... > > > 2011/1/31 Jon Nelson <jnelson-linux-raid@jamponi.net>: >> Before this goes any further, why not just reference the excellent >> Wikipedia article (actually, excellent applies to both Wikipedia *and* >> the article): >> >> http://en.wikipedia.org/wiki/Non-standard_RAID_levels#Linux_MD_RAID_10 >> >> The only problem I have with the wikipedia article is the assertion >> that Linux MD RAID 10 is non-standard. It's as standard as anything >> else is in this world. >> >> >> -- >> Jon >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 21:51 ` Roberto Spadim @ 2011-01-31 22:50 ` NeilBrown 2011-01-31 22:53 ` Roberto Spadim 0 siblings, 1 reply; 127+ messages in thread From: NeilBrown @ 2011-01-31 22:50 UTC (permalink / raw) To: Roberto Spadim Cc: Jon Nelson, Mathias Burén, Keld Jørn Simonsen, Stan Hoeppner, Denis, Linux-RAID On Mon, 31 Jan 2011 19:51:32 -0200 Roberto Spadim <roberto@spadim.com.br> wrote: > now, a question.... > > if raid1 is like raid10 (one disk = raid0) > why not only one raid1 (raid10) software implementation? > for example, if i have 4 disks and i want 4 mirrors. > why not work with only raid10? why the option since we have all > features of raid1 inside raid10? > is it to allow small source code (a small ARM rom)? memory usage? cpu > usage? easy to implement? It is mostly "historical reasons". RAID1 already existed. When I wrote RAID10 I wanted to keep it separate so as not to break RAID1. I have never had a good reason to merge the two implementations. And RAID1 does have some functionality that RAID10 doesn't, like write-behind. Also RAID1 doesn't have a chunk size. RAID10 does. NeilBrown ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 22:50 ` NeilBrown @ 2011-01-31 22:53 ` Roberto Spadim 2011-01-31 23:10 ` NeilBrown 0 siblings, 1 reply; 127+ messages in thread From: Roberto Spadim @ 2011-01-31 22:53 UTC (permalink / raw) To: NeilBrown Cc: Jon Nelson, Mathias Burén, Keld Jørn Simonsen, Stan Hoeppner, Denis, Linux-RAID uhmmmmm write-behind is nice, raid0 have chunk size, don´t? 2011/1/31 NeilBrown <neilb@suse.de>: > On Mon, 31 Jan 2011 19:51:32 -0200 Roberto Spadim <roberto@spadim.com.br> > wrote: > >> now, a question.... >> >> if raid1 is like raid10 (one disk = raid0) >> why not only one raid1 (raid10) software implementation? >> for example, if i have 4 disks and i want 4 mirrors. >> why not work with only raid10? why the option since we have all >> features of raid1 inside raid10? >> is it to allow small source code (a small ARM rom)? memory usage? cpu >> usage? easy to implement? > > It is mostly "historical reasons". > RAID1 already existed. When I wrote RAID10 I wanted to keep it separate so > as not to break RAID1. I have never had a good reason to merge the two > implementations. > > And RAID1 does have some functionality that RAID10 doesn't, like write-behind. > Also RAID1 doesn't have a chunk size. RAID10 does. > > NeilBrown > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 22:53 ` Roberto Spadim @ 2011-01-31 23:10 ` NeilBrown 2011-01-31 23:14 ` Roberto Spadim 0 siblings, 1 reply; 127+ messages in thread From: NeilBrown @ 2011-01-31 23:10 UTC (permalink / raw) To: Roberto Spadim Cc: Jon Nelson, Mathias Burén, Keld Jørn Simonsen, Stan Hoeppner, Denis, Linux-RAID On Mon, 31 Jan 2011 20:53:07 -0200 Roberto Spadim <roberto@spadim.com.br> wrote: > uhmmmmm write-behind is nice, > raid0 have chunk size, don´t? Yes, but we weren't discussing RAID0 (???) NeilBrown > > 2011/1/31 NeilBrown <neilb@suse.de>: > > On Mon, 31 Jan 2011 19:51:32 -0200 Roberto Spadim <roberto@spadim.com.br> > > wrote: > > > >> now, a question.... > >> > >> if raid1 is like raid10 (one disk = raid0) > >> why not only one raid1 (raid10) software implementation? > >> for example, if i have 4 disks and i want 4 mirrors. > >> why not work with only raid10? why the option since we have all > >> features of raid1 inside raid10? > >> is it to allow small source code (a small ARM rom)? memory usage? cpu > >> usage? easy to implement? > > > > It is mostly "historical reasons". > > RAID1 already existed. When I wrote RAID10 I wanted to keep it separate so > > as not to break RAID1. I have never had a good reason to merge the two > > implementations. > > > > And RAID1 does have some functionality that RAID10 doesn't, like write-behind. > > Also RAID1 doesn't have a chunk size. RAID10 does. > > > > NeilBrown > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 23:10 ` NeilBrown @ 2011-01-31 23:14 ` Roberto Spadim 0 siblings, 0 replies; 127+ messages in thread From: Roberto Spadim @ 2011-01-31 23:14 UTC (permalink / raw) To: NeilBrown Cc: Jon Nelson, Mathias Burén, Keld Jørn Simonsen, Stan Hoeppner, Denis, Linux-RAID raid10 could be used as raid1+raid0 so raid1+raid0 implementation could allow write-behind, raid10 don´t =] 2011/1/31 NeilBrown <neilb@suse.de>: > On Mon, 31 Jan 2011 20:53:07 -0200 Roberto Spadim <roberto@spadim.com.br> > wrote: > >> uhmmmmm write-behind is nice, >> raid0 have chunk size, don´t? > > Yes, but we weren't discussing RAID0 (???) > > NeilBrown > >> >> 2011/1/31 NeilBrown <neilb@suse.de>: >> > On Mon, 31 Jan 2011 19:51:32 -0200 Roberto Spadim <roberto@spadim.com.br> >> > wrote: >> > >> >> now, a question.... >> >> >> >> if raid1 is like raid10 (one disk = raid0) >> >> why not only one raid1 (raid10) software implementation? >> >> for example, if i have 4 disks and i want 4 mirrors. >> >> why not work with only raid10? why the option since we have all >> >> features of raid1 inside raid10? >> >> is it to allow small source code (a small ARM rom)? memory usage? cpu >> >> usage? easy to implement? >> > >> > It is mostly "historical reasons". >> > RAID1 already existed. When I wrote RAID10 I wanted to keep it separate so >> > as not to break RAID1. I have never had a good reason to merge the two >> > implementations. >> > >> > And RAID1 does have some functionality that RAID10 doesn't, like write-behind. >> > Also RAID1 doesn't have a chunk size. RAID10 does. >> > >> > NeilBrown >> > -- >> > To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> > the body of a message to majordomo@vger.kernel.org >> > More majordomo info at http://vger.kernel.org/majordomo-info.html >> > >> >> >> > > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 21:47 ` Roberto Spadim 2011-01-31 21:51 ` Roberto Spadim @ 2011-01-31 22:52 ` Keld Jørn Simonsen 2011-01-31 23:00 ` Roberto Spadim 2011-02-01 10:01 ` David Brown 1 sibling, 2 replies; 127+ messages in thread From: Keld Jørn Simonsen @ 2011-01-31 22:52 UTC (permalink / raw) To: Roberto Spadim Cc: Jon Nelson, Mathias Burén, Keld Jørn Simonsen, Stan Hoeppner, Denis, Linux-RAID On Mon, Jan 31, 2011 at 07:47:05PM -0200, Roberto Spadim wrote: > =] hehehe there is no standard for linux, just the linux standard that > was implemented :P There is a Linux standard, LSB Linux Standard Base ISO/IEC 23360. And then there is the POSIX standard that the Linux kernel and many utilities in GNU/linux follow. POSIX is ISO/IEC 9945. > linux raid10 work and is the same idea of the 'raid10' academic standard raid1+0 and Linux MD raid10 are similar, but significantly different in a number of ways. Linux MD raid10 can run on only 2 drives. Linux raid10,f2 has almost RAID0 striping performance in sequential read. You can have an odd number of drives in raid10. And you can have as many copies as you like in raid10, > i don?t know any raid standard, just hardware based standard There is an organisation that standardizes RAID levels. Unfortunately I cannot find a link right now. The raid10 offset layout is an implementation of one of their specs. > you can?t get a smart array(hp) disk and put on a perc(dell) or linux > mdadm and wait it will work without tweaking... Yes. And? best regards keld > 2011/1/31 Jon Nelson <jnelson-linux-raid@jamponi.net>: > > Before this goes any further, why not just reference the excellent > > Wikipedia article (actually, excellent applies to both Wikipedia *and* > > the article): > > > > http://en.wikipedia.org/wiki/Non-standard_RAID_levels#Linux_MD_RAID_10 > > > > The only problem I have with the wikipedia article is the assertion > > that Linux MD RAID 10 is non-standard. It's as standard as anything > > else is in this world. > > > > > > -- > > Jon > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 22:52 ` Keld Jørn Simonsen @ 2011-01-31 23:00 ` Roberto Spadim 2011-02-01 10:01 ` David Brown 1 sibling, 0 replies; 127+ messages in thread From: Roberto Spadim @ 2011-01-31 23:00 UTC (permalink / raw) To: Keld Jørn Simonsen Cc: Jon Nelson, Mathias Burén, Stan Hoeppner, Denis, Linux-RAID >> =] hehehe there is no standard for linux, just the linux standard that >> was implemented :P it´s a joke =P hhehe >> you can?t get a smart array(hp) disk and put on a perc(dell) or linux >> mdadm and wait it will work without tweaking... they are not standard based?! (they are standard based! before anyone tell...) i was talking about wikipedia writers thinking that linux don´t have a standard, check last email to understand the context: >Before this goes any further, why not just reference the excellent >Wikipedia article (actually, excellent applies to both Wikipedia *and* >the article): >http://en.wikipedia.org/wiki/Non-standard_RAID_levels#Linux_MD_RAID_10 > >The only problem I have with the wikipedia article is the assertion >that Linux MD RAID 10 is non-standard. It's as standard as anything >else is in this world. > > >-- >Jon 2011/1/31 Keld Jørn Simonsen <keld@keldix.com>: > On Mon, Jan 31, 2011 at 07:47:05PM -0200, Roberto Spadim wrote: >> =] hehehe there is no standard for linux, just the linux standard that >> was implemented :P > > There is a Linux standard, LSB Linux Standard Base ISO/IEC 23360. > And then there is the POSIX standard that the Linux kernel and > many utilities in GNU/linux follow. POSIX is ISO/IEC 9945. > >> linux raid10 work and is the same idea of the 'raid10' academic standard > > raid1+0 and Linux MD raid10 are similar, but significantly different > in a number of ways. Linux MD raid10 can run on only 2 drives. > Linux raid10,f2 has almost RAID0 striping performance in sequential read. > You can have an odd number of drives in raid10. > And you can have as many copies as you like in raid10, > >> i don?t know any raid standard, just hardware based standard > > There is an organisation that standardizes RAID levels. > Unfortunately I cannot find a link right now. > The raid10 offset layout is an implementation of one of their specs. > >> you can?t get a smart array(hp) disk and put on a perc(dell) or linux >> mdadm and wait it will work without tweaking... > > Yes. And? > > best regards > keld > >> 2011/1/31 Jon Nelson <jnelson-linux-raid@jamponi.net>: >> > Before this goes any further, why not just reference the excellent >> > Wikipedia article (actually, excellent applies to both Wikipedia *and* >> > the article): >> > >> > http://en.wikipedia.org/wiki/Non-standard_RAID_levels#Linux_MD_RAID_10 >> > >> > The only problem I have with the wikipedia article is the assertion >> > that Linux MD RAID 10 is non-standard. It's as standard as anything >> > else is in this world. 
>> > >> > >> > -- >> > Jon >> > -- >> > To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> > the body of a message to majordomo@vger.kernel.org >> > More majordomo info at http://vger.kernel.org/majordomo-info.html >> > >> >> >> >> -- >> Roberto Spadim >> Spadim Technology / SPAEmpresarial > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 22:52 ` Keld Jørn Simonsen 2011-01-31 23:00 ` Roberto Spadim @ 2011-02-01 10:01 ` David Brown 2011-02-01 13:50 ` Jon Nelson 2011-02-01 16:02 ` Keld Jørn Simonsen 1 sibling, 2 replies; 127+ messages in thread From: David Brown @ 2011-02-01 10:01 UTC (permalink / raw) To: linux-raid On 31/01/2011 23:52, Keld Jørn Simonsen wrote: > raid1+0 and Linux MD raid10 are similar, but significantly different > in a number of ways. Linux MD raid10 can run on only 2 drives. > Linux raid10,f2 has almost RAID0 striping performance in sequential read. > You can have an odd number of drives in raid10. > And you can have as many copies as you like in raid10,

You can make raid10,f2 functionality from raid1+0 by using partitions. For example, to get a raid10,f2 equivalent on two drives, partition them into equal halves. Then make md0 a raid1 mirror of sda1 and sdb2, and md1 a raid1 mirror of sdb1 and sda2. Finally, make md2 a raid0 stripe set of md0 and md1.

If you have three disks, you can do that too:

md0 = raid1(sda1, sdb2)
md1 = raid1(sdb1, sdc2)
md2 = raid1(sdc1, sda2)
md3 = raid0(md0, md1, md2)

As far as I can figure out, the performance should be pretty much the same (although wrapping everything in a single raid10,f2 is more convenient).

For four disks, there are more ways to do it:

Option A:
md0 = raid1(sda1, sdb2)
md1 = raid1(sdb1, sdc2)
md2 = raid1(sdc1, sdd2)
md3 = raid1(sdd1, sda2)
md4 = raid0(md0, md1, md2, md3)

Option B:
md0 = raid1(sda1, sdb2)
md1 = raid1(sdb1, sda2)
md2 = raid1(sdc1, sdd2)
md3 = raid1(sdd1, sdc2)
md4 = raid0(md0, md1, md2, md3)

Option C:
md0 = raid1(sda1, sdc2)
md1 = raid1(sdb1, sdd2)
md2 = raid1(sdc1, sda2)
md3 = raid1(sdd1, sdb2)
md4 = raid0(md0, md1, md2, md3)

"Ordinary" raid 1 + 0 is roughly like this:
md0 = raid1(sda1, sdb1)
md1 = raid1(sda2, sdb2)
md2 = raid1(sdc1, sdd1)
md3 = raid1(sdc2, sdd2)
md4 = raid0(md0, md1, md2, md3)

I don't know which of A, B or C is used for raid10,f2 on four disks - maybe Neil knows? The fun thing here is to try to figure out the performance for these combinations. For large reads, A, B and C will give you much better performance than raid 1 + 0, since you can stream data off all disks in parallel. For most other accesses, I think the performance will be fairly similar, except for medium write sizes (covering between a quarter and half a stripe), which will be faster with C since all four disks can write in parallel. All four arrangements survive any single disk failing, and B, C and raid1+0 have a 66% chance of surviving a second failure.

I don't think there is any way you can get the equivalent of raid10,o2 in this way. But then, I am not sure how much use raid10,o2 actually is - are there any usage patterns for which it is faster than raid10,n2 or raid10,f2? -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
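If anyone wants to try the three-disk construction above, a rough recipe could look like the following. It is purely illustrative: it assumes each drive has been split into two equal partitions, with sdX1 on the outer half and sdX2 on the inner half, and the device names are made up:

  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb2
  mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc2
  mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sdc1 /dev/sda2
  mdadm --create /dev/md3 --level=0 --raid-devices=3 /dev/md0 /dev/md1 /dev/md2

  # For comparison, the single-array form David mentions, on three whole
  # (unpartitioned) disks instead of the stacked arrangement:
  mdadm --create /dev/md4 --level=10 --raid-devices=3 --layout=f2 /dev/sda /dev/sdb /dev/sdc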
* Re: What's the typical RAID10 setup? 2011-02-01 10:01 ` David Brown @ 2011-02-01 13:50 ` Jon Nelson 2011-02-01 14:25 ` Roberto Spadim ` (2 more replies) 2011-02-01 16:02 ` Keld Jørn Simonsen 1 sibling, 3 replies; 127+ messages in thread From: Jon Nelson @ 2011-02-01 13:50 UTC (permalink / raw) To: David Brown; +Cc: linux-raid On Tue, Feb 1, 2011 at 4:01 AM, David Brown <david@westcontrol.com> wrote: > On 31/01/2011 23:52, Keld Jørn Simonsen wrote: >> >> raid1+0 and Linux MD raid10 are similar, but significantly different >> in a number of ways. Linux MD raid10 can run on only 2 drives. >> Linux raid10,f2 has almost RAID0 striping performance in sequential read. >> You can have an odd number of drives in raid10. >> And you can have as many copies as you like in raid10, >> > > You can make raid10,f2 functionality from raid1+0 by using partitions. For > example, to get a raid10,f2 equivalent on two drives, partition them into > equal halves. Then make md0 a raid1 mirror of sda1 and sdb2, and md1 a > raid1 mirror of sdb1 and sda2. Finally, make md2 a raid0 stripe set of md0 > and md1. > > If you have three disks, you can do that too: > > md0 = raid1(sda1, sdb2) > md1 = raid1(sdb1, sdc2) > md2 = raid1(sdc1, sda2) > md3 = raid0(md0, md1, md2) > > As far as I can figure out, the performance should be pretty much the same > (although wrapping everything in a single raid10,f2 is more convenient). The performance will not be the same because. Whenever possible, md reads from the outermost portion of the disk -- theoretically the fastest portion of the disk (by 2 or 3 times as much as the inner tracks) -- and in this way raid10,f2 can actually be faster than raid0. -- Jon -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
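One informal way to see the effect Jon describes is to build the same two scratch disks into the two layouts and compare sequential read throughput. This is only a sketch: it is destructive (scratch disks only), the device names are placeholders, and the initial resync should be allowed to finish before timing:

  mdadm --create /dev/md0 --level=10 --raid-devices=2 --layout=f2 /dev/sdx /dev/sdy
  # wait for the resync shown in /proc/mdstat to complete, then:
  dd if=/dev/md0 of=/dev/null bs=1M count=4096 iflag=direct

  mdadm --stop /dev/md0
  mdadm --create /dev/md0 --level=10 --raid-devices=2 --layout=n2 /dev/sdx /dev/sdy
  # wait for the resync again, then:
  dd if=/dev/md0 of=/dev/null bs=1M count=4096 iflag=direct

On rotating disks the f2 run will typically come out ahead, because the sequential reads are striped across the outer halves of both drives.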
* Re: What's the typical RAID10 setup? 2011-02-01 13:50 ` Jon Nelson @ 2011-02-01 14:25 ` Roberto Spadim 2011-02-01 14:48 ` David Brown 2011-02-01 22:05 ` Stan Hoeppner 2 siblings, 0 replies; 127+ messages in thread From: Roberto Spadim @ 2011-02-01 14:25 UTC (permalink / raw) To: Jon Nelson; +Cc: David Brown, linux-raid nice, english isn´t my language, i´m not offended, i know that we have problem reading my words... thanks =) probability using ´raid world words´, isn´t ´mdadm world words´ since mdadm don´t work with disk or ssd (it work with devices) probability can´t go inside device to try to explain anything without knowing how mdadm works if you want global system probability, don´t call mdadm as a source of probability if you don´t know what it can do. can a failed mirror be used without sync? no another point, after a fail (disk) will your system stop or continue? did you probability consider a fixed point in time or a global scenario? talking about probability, try to explain the context, and how to calculate it (it´s necessary, belive me) using mdadm raid10 how many devices could you lose, for mirror context? 1 mirror, right? losing 1 mirror = losing 1 raid0 disk, right? if ok, make probability (for mdadm world) with this mirrors, not with disks use probability with the most secure results, it´s not a academic probability, it´s a production use software, use secure results. the original question... how could i make probability about security for mdadm software? raid1 raid10 raid 5 raid6 raid0, all raid, maybe the answer could be documentated on raid wiki =), just to don´t get back again in this mail list anyone could help with this part of documentation probability isn´t just numbers, it´s numbers+context a car can be a vehicle, but a vehicle can be a truck too probability numbers are nothing without context 2011/2/1 Jon Nelson <jnelson-linux-raid@jamponi.net>: > On Tue, Feb 1, 2011 at 4:01 AM, David Brown <david@westcontrol.com> wrote: >> On 31/01/2011 23:52, Keld Jørn Simonsen wrote: >>> >>> raid1+0 and Linux MD raid10 are similar, but significantly different >>> in a number of ways. Linux MD raid10 can run on only 2 drives. >>> Linux raid10,f2 has almost RAID0 striping performance in sequential read. >>> You can have an odd number of drives in raid10. >>> And you can have as many copies as you like in raid10, >>> >> >> You can make raid10,f2 functionality from raid1+0 by using partitions. For >> example, to get a raid10,f2 equivalent on two drives, partition them into >> equal halves. Then make md0 a raid1 mirror of sda1 and sdb2, and md1 a >> raid1 mirror of sdb1 and sda2. Finally, make md2 a raid0 stripe set of md0 >> and md1. >> >> If you have three disks, you can do that too: >> >> md0 = raid1(sda1, sdb2) >> md1 = raid1(sdb1, sdc2) >> md2 = raid1(sdc1, sda2) >> md3 = raid0(md0, md1, md2) >> >> As far as I can figure out, the performance should be pretty much the same >> (although wrapping everything in a single raid10,f2 is more convenient). > > The performance will not be the same because. Whenever possible, md > reads from the outermost portion of the disk -- theoretically the > fastest portion of the disk (by 2 or 3 times as much as the inner > tracks) -- and in this way raid10,f2 can actually be faster than > raid0. 
> > > -- > Jon > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-01 13:50 ` Jon Nelson 2011-02-01 14:25 ` Roberto Spadim @ 2011-02-01 14:48 ` David Brown 2011-02-01 15:41 ` Roberto Spadim 2011-02-01 22:05 ` Stan Hoeppner 2 siblings, 1 reply; 127+ messages in thread From: David Brown @ 2011-02-01 14:48 UTC (permalink / raw) To: linux-raid On 01/02/2011 14:50, Jon Nelson wrote: > On Tue, Feb 1, 2011 at 4:01 AM, David Brown<david@westcontrol.com> wrote: >> On 31/01/2011 23:52, Keld Jørn Simonsen wrote: >>> >>> raid1+0 and Linux MD raid10 are similar, but significantly different >>> in a number of ways. Linux MD raid10 can run on only 2 drives. >>> Linux raid10,f2 has almost RAID0 striping performance in sequential read. >>> You can have an odd number of drives in raid10. >>> And you can have as many copies as you like in raid10, >>> >> >> You can make raid10,f2 functionality from raid1+0 by using partitions. For >> example, to get a raid10,f2 equivalent on two drives, partition them into >> equal halves. Then make md0 a raid1 mirror of sda1 and sdb2, and md1 a >> raid1 mirror of sdb1 and sda2. Finally, make md2 a raid0 stripe set of md0 >> and md1. >> >> If you have three disks, you can do that too: >> >> md0 = raid1(sda1, sdb2) >> md1 = raid1(sdb1, sdc2) >> md2 = raid1(sdc1, sda2) >> md3 = raid0(md0, md1, md2) >> >> As far as I can figure out, the performance should be pretty much the same >> (although wrapping everything in a single raid10,f2 is more convenient). > > The performance will not be the same because. Whenever possible, md > reads from the outermost portion of the disk -- theoretically the > fastest portion of the disk (by 2 or 3 times as much as the inner > tracks) -- and in this way raid10,f2 can actually be faster than > raid0. > This would presumably apply to all raid1 arrangements, not just raid10 - when md has a choice to read from more than one place it will prefer the outermost place. In the arrangement I described above, the raid pairs such as md0 each have one have on an inner partition, and one half on an outer partition. /If/ md is smart enough, then it will do the same here and read from the outer partition by preference. The question is, does md determine the "outermost" copy by track number relative to the partition, or by absolute track number on the disk? If it is the former, then I see your point - with my raid 1 + 0 arrangement the innermost and outermost partitions will be viewed the same. If it is the later, then my arrangement will work equally well. On a related note, if you mix an SSD and a HD (partition) in a mirror, will md prefer to read from the SSD first? I know it is possible to use the "write-mostly" flag to force all reads to come from the SSD (assuming it hasn't failed), but it would be nice to get parallel reads from the HD as well whenever the read is large enough or when there are multiple reads in parallel. -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
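The write-mostly arrangement David refers to can be expressed in mdadm roughly as follows; this is a sketch with made-up device names, assuming /dev/sda1 is the SSD and /dev/sdb1 the hard disk partition:

  # Reads are directed to the SSD while the hard disk still receives every write;
  # --write-behind (which requires a bitmap) lets writes to the write-mostly
  # member lag slightly behind.
  mdadm --create /dev/md0 --level=1 --raid-devices=2 \
        --bitmap=internal --write-behind=1024 \
        /dev/sda1 --write-mostly /dev/sdb1

As David notes, this forces reads onto one member rather than balancing them, so it answers the "read from the SSD first" part but not the "parallel reads from the HD as well" part.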
* Re: What's the typical RAID10 setup? 2011-02-01 14:48 ` David Brown @ 2011-02-01 15:41 ` Roberto Spadim 2011-02-03 3:36 ` Drew 0 siblings, 1 reply; 127+ messages in thread From: Roberto Spadim @ 2011-02-01 15:41 UTC (permalink / raw) To: David Brown; +Cc: linux-raid ok, read/write scheduling is another task.. the best algorithm is time based (optimize to minimal time) it´s not the round robin neither the closer head algorithm if you want performace use minimal time to execute (time based) in this topic (many emails) there´s a important thing. resolve the probability problem and make it 'official', include numbers and context the best algorithm for read and write isn´t the question about probability of how many mirros can i lose, how many disks can i lose (maybe can be if we change context to more source based, MAYBE) 2011/2/1 David Brown <david@westcontrol.com>: > On 01/02/2011 14:50, Jon Nelson wrote: >> >> On Tue, Feb 1, 2011 at 4:01 AM, David Brown<david@westcontrol.com> wrote: >>> >>> On 31/01/2011 23:52, Keld Jørn Simonsen wrote: >>>> >>>> raid1+0 and Linux MD raid10 are similar, but significantly different >>>> in a number of ways. Linux MD raid10 can run on only 2 drives. >>>> Linux raid10,f2 has almost RAID0 striping performance in sequential >>>> read. >>>> You can have an odd number of drives in raid10. >>>> And you can have as many copies as you like in raid10, >>>> >>> >>> You can make raid10,f2 functionality from raid1+0 by using partitions. >>> For >>> example, to get a raid10,f2 equivalent on two drives, partition them into >>> equal halves. Then make md0 a raid1 mirror of sda1 and sdb2, and md1 a >>> raid1 mirror of sdb1 and sda2. Finally, make md2 a raid0 stripe set of >>> md0 >>> and md1. >>> >>> If you have three disks, you can do that too: >>> >>> md0 = raid1(sda1, sdb2) >>> md1 = raid1(sdb1, sdc2) >>> md2 = raid1(sdc1, sda2) >>> md3 = raid0(md0, md1, md2) >>> >>> As far as I can figure out, the performance should be pretty much the >>> same >>> (although wrapping everything in a single raid10,f2 is more convenient). >> >> The performance will not be the same because. Whenever possible, md >> reads from the outermost portion of the disk -- theoretically the >> fastest portion of the disk (by 2 or 3 times as much as the inner >> tracks) -- and in this way raid10,f2 can actually be faster than >> raid0. >> > > This would presumably apply to all raid1 arrangements, not just raid10 - > when md has a choice to read from more than one place it will prefer the > outermost place. In the arrangement I described above, the raid pairs such > as md0 each have one have on an inner partition, and one half on an outer > partition. /If/ md is smart enough, then it will do the same here and read > from the outer partition by preference. > > The question is, does md determine the "outermost" copy by track number > relative to the partition, or by absolute track number on the disk? If it > is the former, then I see your point - with my raid 1 + 0 arrangement the > innermost and outermost partitions will be viewed the same. If it is the > later, then my arrangement will work equally well. > > On a related note, if you mix an SSD and a HD (partition) in a mirror, will > md prefer to read from the SSD first? I know it is possible to use the > "write-mostly" flag to force all reads to come from the SSD (assuming it > hasn't failed), but it would be nice to get parallel reads from the HD as > well whenever the read is large enough or when there are multiple reads in > parallel. 
> > > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-01 15:41 ` Roberto Spadim @ 2011-02-03 3:36 ` Drew 2011-02-03 8:18 ` Stan Hoeppner [not found] ` <AANLkTikerSZfhMbkEvGBVyLB=wHDSHLWszoEz5As5Hi4@mail.gmail.com> 0 siblings, 2 replies; 127+ messages in thread From: Drew @ 2011-02-03 3:36 UTC (permalink / raw) To: linux-raid > in this topic (many emails) there's an important thing. resolve the > probability problem and make it 'official', include numbers and > context The probability that has been talked about in this thread is how resilient RAID 0+1 is versus RAID 1+0. The "1 in 3" vs "2 in 3" chance being referred to answers the question "what is the probability that a second drive failure will completely take down my degraded array?" In RAID 0+1 you have the following: (a,b) & (c,d), where each pair is a stripe and the two stripes are mirrored. In RAID 1+0 you have the following: (a,b) & (c,d), where each pair is a mirror and the two pairs are striped. Let's assume one drive, say 'a', fails. We're in a degraded state and should replace the drive. While we're syncing/replacing the new drive, a second disk fails. This second disk can be b, c, or d. In RAID 0+1, which second disk(s) can fail while we can still recover the data? Because of the failure of the RAID 0 pair (a,b), the higher-level RAID 1 is also degraded. Failure of either c or d will cause the mirror to lose its second copy and we're down, hard. This means there is a "2 in 3" (66%) chance that the failure of a second disk will destroy the data in this array. Contrast that with RAID 1+0. With the failure of 'a', the lower-level RAID 1 pair (a,b) is still intact but degraded. The higher-level RAID 0 is still intact. Which disks can we lose and still keep the upper-level RAID 0 intact? Failure of 'b' will cause the whole RAID to go down, whereas the failure of either 'c' or 'd' will result in both lower-level RAID 1's being degraded while the RAID 0 stays intact. This gives us a "1 in 3" (33%) chance that a second disk failure will take down the entire array. You tell me: given a 1 in 3 chance or a 2 in 3 chance that a second disk failure destroys the array, which would you choose? :-) -- Drew "Nothing in life is to be feared. It is only to be understood." --Marie Curie -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
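Stated generally (this simply restates the formula quoted earlier in the thread, not a new derivation): with n drives arranged as mirrored pairs (RAID 1+0), once one drive has failed only its single partner is fatal, so the chance that the next random failure kills the array is 1/(n-1). With two mirrored stripe sets (RAID 0+1), any of the n/2 drives in the surviving stripe set is fatal, giving (n/2)/(n-1). For n=4 these come out to 1/3 and 2/3, matching the cases walked through above; for n=8 they are 1/7 and 4/7.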
* Re: What's the typical RAID10 setup? 2011-02-03 3:36 ` Drew @ 2011-02-03 8:18 ` Stan Hoeppner [not found] ` <AANLkTikerSZfhMbkEvGBVyLB=wHDSHLWszoEz5As5Hi4@mail.gmail.com> 1 sibling, 0 replies; 127+ messages in thread From: Stan Hoeppner @ 2011-02-03 8:18 UTC (permalink / raw) To: Drew; +Cc: linux-raid Drew put forth on 2/2/2011 9:36 PM: > You tell me. Given 1 in 3 odds of surviving a second disk failure or 2 > in 3 odds, which would you choose? :-) This is also why few, if any, hardware RAID vendors offer RAID 0+1. Most (all?) offer only RAID 10. However, due to the RAID level migrations offered by some hardware RAID controllers, a customer can actually end up with a RAID 0+1 array if they go through a specific migration/expansion path. Obviously you'd want to avoid those paths. -- Stan ^ permalink raw reply [flat|nested] 127+ messages in thread
[parent not found: <AANLkTikerSZfhMbkEvGBVyLB=wHDSHLWszoEz5As5Hi4@mail.gmail.com>]
[parent not found: <AANLkTikLyR206x4aMy+veNkWPV67uF9r5dZKGqXJUEqN@mail.gmail.com>]
* Re: What's the typical RAID10 setup? [not found] ` <AANLkTikLyR206x4aMy+veNkWPV67uF9r5dZKGqXJUEqN@mail.gmail.com> @ 2011-02-03 14:35 ` Roberto Spadim 2011-02-03 15:43 ` Keld Jørn Simonsen 0 siblings, 1 reply; 127+ messages in thread From: Roberto Spadim @ 2011-02-03 14:35 UTC (permalink / raw) To: Drew, Linux-RAID =] i think that we can end discussion and conclude that context (test / production) allow or don't allow lucky on probability, what's lucky? for production, lucky = poor disk, for production we don't allow failed disks, we have smart to predict, and when a disk fail we change many disks to prevent another disk fail could we update our raid wiki with some informations about this discussion? 2011/2/3 Drew <drew.kay@gmail.com>: >> for test, raid1 and after raid0 have better probability to don't stop >> raid10, but it's a probability... don't believe in lucky, since it's >> just for test, not production, it doesn't matter... >> >> what i whould implement? for production? anyone, if a disk fail, all >> array should be replaced (if without money replace disk with small >> life) > > A lot of this discussion about failure rates and probabilities is > academic. There are assumptions about each disk having it's own > independent failure probability, which if that can not be predicted > must be assumed to be 50%. At the end of the day I agree that when > the first disk fails the RAID is degraded and one *must* take steps to > remedy that. This discussion is more about why RAID 10 (1+0) is better > then 0+1. > > On our production systems we work with our vendor to ensure the > individual drives we get aren't from the same batch/production run, > thereby mitigating some issues around flaws in specific batches. We > keep spare drives on hand for all three RAID arrays, so as to minimize > the time we're operating in a degraded state. All data on RAID arrays > is backed up nightly to storage which is then mirrored off-site. > > At the end of the day our decision around what RAID type (10/5/6) to > use was based on a balance between performance, safety, & capacity > then on specific failure criteria. RAID 10 backs the iSCSI LUN that > our VMware cluster uses for the individual OSes, and the data > partition for the accounting database server. RAID 5 backs the > partitions we store user data one. And RAID 6 backs the NASes we use > for our backup system. > > RAID 10 was chosen for performance reasons. It doesn't have to > calculate parity on every write so for the OS & database, which do a > lot of small reads & writes, it's faster. For user disks we went with > RAID 5 because we get more space in the array at a small performance > penalty, which is fine as the users have to access the file server > over the LAN and the bottle neck is the pipe between the switch & the > VM, not between the iSCSI SAN & the server. For backups we went with > RAID 6 because the performance & storage penalties for the array were > outweighed by the need for maximum safety. > > > > -- > Drew > > "Nothing in life is to be feared. It is only to be understood." > --Marie Curie > > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-03 14:35 ` Roberto Spadim @ 2011-02-03 15:43 ` Keld Jørn Simonsen 2011-02-03 15:50 ` Roberto Spadim 0 siblings, 1 reply; 127+ messages in thread From: Keld Jørn Simonsen @ 2011-02-03 15:43 UTC (permalink / raw) To: Roberto Spadim; +Cc: Drew, Linux-RAID On Thu, Feb 03, 2011 at 12:35:52PM -0200, Roberto Spadim wrote: > =] i think that we can end discussion and conclude that context (test > / production) allow or don't allow lucky on probability, what's lucky? > for production, lucky = poor disk, for production we don't allow > failed disks, we have smart to predict, and when a disk fail we change > many disks to prevent another disk fail > > could we update our raid wiki with some informations about this discussion? I would like to, but it is a bit complicated. Anyway I think there already is something there on the wiki. And then, for one of the most important raid types in Linux MD, namely raid10, I am not sure what to write. It could be raid1+0, or raid0+1 like, and as far as I kow, it is raid0+1 for F2:-( but I don't know for n2 and o2. The German version on raid at wikipedia has a lot of info on probability http://de.wikipedia.org/wiki/RAID - but it is wrong a number of places. I have tried to correct it, but the German version is moderated, and they don't know what they are writing about. http://de.wikipedia.org/wiki/RAID Best regards Keld > 2011/2/3 Drew <drew.kay@gmail.com>: > >> for test, raid1 and after raid0 have better probability to don't stop > >> raid10, but it's a probability... don't believe in lucky, since it's > >> just for test, not production, it doesn't matter... > >> > >> what i whould implement? for production? anyone, if a disk fail, all > >> array should be replaced (if without money replace disk with small > >> life) > > > > A lot of this discussion about failure rates and probabilities is > > academic. There are assumptions about each disk having it's own > > independent failure probability, which if that can not be predicted > > must be assumed to be 50%. At the end of the day I agree that when > > the first disk fails the RAID is degraded and one *must* take steps to > > remedy that. This discussion is more about why RAID 10 (1+0) is better > > then 0+1. > > > > On our production systems we work with our vendor to ensure the > > individual drives we get aren't from the same batch/production run, > > thereby mitigating some issues around flaws in specific batches. We > > keep spare drives on hand for all three RAID arrays, so as to minimize > > the time we're operating in a degraded state. All data on RAID arrays > > is backed up nightly to storage which is then mirrored off-site. > > > > At the end of the day our decision around what RAID type (10/5/6) to > > use was based on a balance between performance, safety, & capacity > > then on specific failure criteria. RAID 10 backs the iSCSI LUN that > > our VMware cluster uses for the individual OSes, and the data > > partition for the accounting database server. RAID 5 backs the > > partitions we store user data one. And RAID 6 backs the NASes we use > > for our backup system. > > > > RAID 10 was chosen for performance reasons. It doesn't have to > > calculate parity on every write so for the OS & database, which do a > > lot of small reads & writes, it's faster. 
For user disks we went with > > RAID 5 because we get more space in the array at a small performance > > penalty, which is fine as the users have to access the file server > > over the LAN and the bottle neck is the pipe between the switch & the > > VM, not between the iSCSI SAN & the server. For backups we went with > > RAID 6 because the performance & storage penalties for the array were > > outweighed by the need for maximum safety. > > > > > > > > -- > > Drew > > > > "Nothing in life is to be feared. It is only to be understood." > > --Marie Curie > > > > > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-03 15:43 ` Keld Jørn Simonsen @ 2011-02-03 15:50 ` Roberto Spadim 2011-02-03 15:54 ` Roberto Spadim 2011-02-03 16:02 ` Keld Jørn Simonsen 0 siblings, 2 replies; 127+ messages in thread From: Roberto Spadim @ 2011-02-03 15:50 UTC (permalink / raw) To: Keld Jørn Simonsen; +Cc: Drew, Linux-RAID hummm, nice keld (or anyone), do you know someone (with time, not much, total time i think it´s just 2 hours) to try develop modifications on raid1 read_balance function? what modification, today read_balance have distance (current_head - next_head), multiply it by a number at /sys/block/md0/distance_rate, and make add read_size*byte_rate (byte_rate at /sys/block/md0/byte_read_rate), with this, the algorithm will make minimal time, and not minimal distance with this, i can get better read_balance (for ssd) for a second time we could implement device queue time to end (i think we will work about 1 day to get it working with all device schedulers), but it´s not for now 2011/2/3 Keld Jørn Simonsen <keld@keldix.com>: > On Thu, Feb 03, 2011 at 12:35:52PM -0200, Roberto Spadim wrote: >> =] i think that we can end discussion and conclude that context (test >> / production) allow or don't allow lucky on probability, what's lucky? >> for production, lucky = poor disk, for production we don't allow >> failed disks, we have smart to predict, and when a disk fail we change >> many disks to prevent another disk fail >> >> could we update our raid wiki with some informations about this discussion? > > I would like to, but it is a bit complicated. > Anyway I think there already is something there on the wiki. > And then, for one of the most important raid types in Linux MD, > namely raid10, I am not sure what to write. It could be raid1+0, or > raid0+1 like, and as far as I kow, it is raid0+1 for F2:-( > but I don't know for n2 and o2. > > The German version on raid at wikipedia has a lot of info on probability > http://de.wikipedia.org/wiki/RAID - but it is wrong a number of places. > I have tried to correct it, but the German version is moderated, and > they don't know what they are writing about. > http://de.wikipedia.org/wiki/RAID > > Best regards > Keld > >> 2011/2/3 Drew <drew.kay@gmail.com>: >> >> for test, raid1 and after raid0 have better probability to don't stop >> >> raid10, but it's a probability... don't believe in lucky, since it's >> >> just for test, not production, it doesn't matter... >> >> >> >> what i whould implement? for production? anyone, if a disk fail, all >> >> array should be replaced (if without money replace disk with small >> >> life) >> > >> > A lot of this discussion about failure rates and probabilities is >> > academic. There are assumptions about each disk having it's own >> > independent failure probability, which if that can not be predicted >> > must be assumed to be 50%. At the end of the day I agree that when >> > the first disk fails the RAID is degraded and one *must* take steps to >> > remedy that. This discussion is more about why RAID 10 (1+0) is better >> > then 0+1. >> > >> > On our production systems we work with our vendor to ensure the >> > individual drives we get aren't from the same batch/production run, >> > thereby mitigating some issues around flaws in specific batches. We >> > keep spare drives on hand for all three RAID arrays, so as to minimize >> > the time we're operating in a degraded state. All data on RAID arrays >> > is backed up nightly to storage which is then mirrored off-site. 
>> > >> > At the end of the day our decision around what RAID type (10/5/6) to >> > use was based on a balance between performance, safety, & capacity >> > then on specific failure criteria. RAID 10 backs the iSCSI LUN that >> > our VMware cluster uses for the individual OSes, and the data >> > partition for the accounting database server. RAID 5 backs the >> > partitions we store user data one. And RAID 6 backs the NASes we use >> > for our backup system. >> > >> > RAID 10 was chosen for performance reasons. It doesn't have to >> > calculate parity on every write so for the OS & database, which do a >> > lot of small reads & writes, it's faster. For user disks we went with >> > RAID 5 because we get more space in the array at a small performance >> > penalty, which is fine as the users have to access the file server >> > over the LAN and the bottle neck is the pipe between the switch & the >> > VM, not between the iSCSI SAN & the server. For backups we went with >> > RAID 6 because the performance & storage penalties for the array were >> > outweighed by the need for maximum safety. >> > >> > >> > >> > -- >> > Drew >> > >> > "Nothing in life is to be feared. It is only to be understood." >> > --Marie Curie >> > >> > >> >> >> >> -- >> Roberto Spadim >> Spadim Technology / SPAEmpresarial >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-03 15:50 ` Roberto Spadim @ 2011-02-03 15:54 ` Roberto Spadim 2011-02-03 16:02 ` Keld Jørn Simonsen 1 sibling, 0 replies; 127+ messages in thread From: Roberto Spadim @ 2011-02-03 15:54 UTC (permalink / raw) To: Keld Jørn Simonsen; +Cc: Drew, Linux-RAID sorry /sys/block/md0/distance_rate -> /sys/block/md0/md/sda1_distance_rate /sys/block/md0/byte_read_rate ->/sys/block/md0/md/sda1_byte_read_rate 2011/2/3 Roberto Spadim <roberto@spadim.com.br>: > hummm, nice > keld (or anyone), do you know someone (with time, not much, total time > i think it´s just 2 hours) to try develop modifications on raid1 > read_balance function? > what modification, today read_balance have distance (current_head - > next_head), multiply it by a number at /sys/block/md0/distance_rate, > and make add read_size*byte_rate (byte_rate at > /sys/block/md0/byte_read_rate), with this, the algorithm will make > minimal time, and not minimal distance > with this, i can get better read_balance (for ssd) > for a second time we could implement device queue time to end (i think > we will work about 1 day to get it working with all device > schedulers), but it´s not for now > > > 2011/2/3 Keld Jørn Simonsen <keld@keldix.com>: >> On Thu, Feb 03, 2011 at 12:35:52PM -0200, Roberto Spadim wrote: >>> =] i think that we can end discussion and conclude that context (test >>> / production) allow or don't allow lucky on probability, what's lucky? >>> for production, lucky = poor disk, for production we don't allow >>> failed disks, we have smart to predict, and when a disk fail we change >>> many disks to prevent another disk fail >>> >>> could we update our raid wiki with some informations about this discussion? >> >> I would like to, but it is a bit complicated. >> Anyway I think there already is something there on the wiki. >> And then, for one of the most important raid types in Linux MD, >> namely raid10, I am not sure what to write. It could be raid1+0, or >> raid0+1 like, and as far as I kow, it is raid0+1 for F2:-( >> but I don't know for n2 and o2. >> >> The German version on raid at wikipedia has a lot of info on probability >> http://de.wikipedia.org/wiki/RAID - but it is wrong a number of places. >> I have tried to correct it, but the German version is moderated, and >> they don't know what they are writing about. >> http://de.wikipedia.org/wiki/RAID >> >> Best regards >> Keld >> >>> 2011/2/3 Drew <drew.kay@gmail.com>: >>> >> for test, raid1 and after raid0 have better probability to don't stop >>> >> raid10, but it's a probability... don't believe in lucky, since it's >>> >> just for test, not production, it doesn't matter... >>> >> >>> >> what i whould implement? for production? anyone, if a disk fail, all >>> >> array should be replaced (if without money replace disk with small >>> >> life) >>> > >>> > A lot of this discussion about failure rates and probabilities is >>> > academic. There are assumptions about each disk having it's own >>> > independent failure probability, which if that can not be predicted >>> > must be assumed to be 50%. At the end of the day I agree that when >>> > the first disk fails the RAID is degraded and one *must* take steps to >>> > remedy that. This discussion is more about why RAID 10 (1+0) is better >>> > then 0+1. >>> > >>> > On our production systems we work with our vendor to ensure the >>> > individual drives we get aren't from the same batch/production run, >>> > thereby mitigating some issues around flaws in specific batches. 
We >>> > keep spare drives on hand for all three RAID arrays, so as to minimize >>> > the time we're operating in a degraded state. All data on RAID arrays >>> > is backed up nightly to storage which is then mirrored off-site. >>> > >>> > At the end of the day our decision around what RAID type (10/5/6) to >>> > use was based on a balance between performance, safety, & capacity >>> > then on specific failure criteria. RAID 10 backs the iSCSI LUN that >>> > our VMware cluster uses for the individual OSes, and the data >>> > partition for the accounting database server. RAID 5 backs the >>> > partitions we store user data one. And RAID 6 backs the NASes we use >>> > for our backup system. >>> > >>> > RAID 10 was chosen for performance reasons. It doesn't have to >>> > calculate parity on every write so for the OS & database, which do a >>> > lot of small reads & writes, it's faster. For user disks we went with >>> > RAID 5 because we get more space in the array at a small performance >>> > penalty, which is fine as the users have to access the file server >>> > over the LAN and the bottle neck is the pipe between the switch & the >>> > VM, not between the iSCSI SAN & the server. For backups we went with >>> > RAID 6 because the performance & storage penalties for the array were >>> > outweighed by the need for maximum safety. >>> > >>> > >>> > >>> > -- >>> > Drew >>> > >>> > "Nothing in life is to be feared. It is only to be understood." >>> > --Marie Curie >>> > >>> > >>> >>> >>> >>> -- >>> Roberto Spadim >>> Spadim Technology / SPAEmpresarial >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
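As a rough illustration of the read_balance change proposed above: the idea only touches the cost comparison when choosing a mirror. This is a user-space sketch, not a patch against drivers/md/raid1.c; the distance_rate and byte_read_rate knobs are the per-device sysfs attributes suggested in the mails above (e.g. /sys/block/md0/md/sda1_distance_rate) and do not exist in md today.

#include <stdlib.h>

/* Per-mirror tuning values, filled in from the proposed sysfs knobs. */
struct mirror_cost {
    long long head_position;   /* sector the head is expected to sit on */
    long long distance_rate;   /* cost per sector of seek distance      */
    long long byte_read_rate;  /* cost per byte transferred             */
    int       usable;          /* present, in sync, not write-mostly    */
};

/* Pick the mirror with the lowest estimated cost for this read.
 * The current raid1 read_balance minimises seek distance only; adding a
 * transfer-size term lets a slower-but-closer disk lose to a faster one. */
static int pick_mirror(const struct mirror_cost *m, int nmirrors,
                       long long sector, long long bytes)
{
    int best = -1;
    long long best_cost = 0;

    for (int i = 0; i < nmirrors; i++) {
        long long dist, cost;

        if (!m[i].usable)
            continue;
        dist = llabs(sector - m[i].head_position);
        cost = dist * m[i].distance_rate + bytes * m[i].byte_read_rate;
        if (best < 0 || cost < best_cost) {
            best = i;
            best_cost = cost;
        }
    }
    return best;   /* -1 means no usable mirror */
}

Setting distance_rate to 0 for an SSD member makes the choice depend only on transfer cost, while setting byte_read_rate to 0 reproduces the current nearest-head behaviour.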
* Re: What's the typical RAID10 setup? 2011-02-03 15:50 ` Roberto Spadim 2011-02-03 15:54 ` Roberto Spadim @ 2011-02-03 16:02 ` Keld Jørn Simonsen 2011-02-03 16:07 ` Roberto Spadim 2011-02-03 16:16 ` Roberto Spadim 1 sibling, 2 replies; 127+ messages in thread From: Keld Jørn Simonsen @ 2011-02-03 16:02 UTC (permalink / raw) To: Roberto Spadim; +Cc: Keld Jørn Simonsen, Drew, Linux-RAID On Thu, Feb 03, 2011 at 01:50:49PM -0200, Roberto Spadim wrote: > hummm, nice > keld (or anyone), do you know someone (with time, not much, total time > i think it?s just 2 hours) to try develop modifications on raid1 > read_balance function? maybe our very productive Polish friends at Intel could have a look. But then again, I am not sure it is productive. I think raid1 is OK, You could have a look at raid10, where "offset" has been discussed as being the better layout for ssd. > what modification, today read_balance have distance (current_head - > next_head), multiply it by a number at /sys/block/md0/distance_rate, > and make add read_size*byte_rate (byte_rate at > /sys/block/md0/byte_read_rate), with this, the algorithm will make > minimal time, and not minimal distance > with this, i can get better read_balance (for ssd) > for a second time we could implement device queue time to end (i think > we will work about 1 day to get it working with all device > schedulers), but it?s not for now Hmm, I thought you wanted to write new elevator schedulers? best regards keld > > 2011/2/3 Keld Jørn Simonsen <keld@keldix.com>: > > On Thu, Feb 03, 2011 at 12:35:52PM -0200, Roberto Spadim wrote: > >> =] i think that we can end discussion and conclude that context (test > >> / production) allow or don't allow lucky on probability, what's lucky? > >> for production, lucky = poor disk, for production we don't allow > >> failed disks, we have smart to predict, and when a disk fail we change > >> many disks to prevent another disk fail > >> > >> could we update our raid wiki with some informations about this discussion? > > > > I would like to, but it is a bit complicated. > > Anyway I think there already is something there on the wiki. > > And then, for one of the most important raid types in Linux MD, > > namely raid10, I am not sure what to write. It could be raid1+0, or > > raid0+1 like, and as far as I kow, it is raid0+1 for F2:-( > > but I don't know for n2 and o2. > > > > The German version on raid at wikipedia has a lot of info on probability > > http://de.wikipedia.org/wiki/RAID - but it is wrong a number of places. > > I have tried to correct it, but the German version is moderated, and > > they don't know what they are writing about. at least in some places, refusing to correct errors. > > http://de.wikipedia.org/wiki/RAID > > > > Best regards > > Keld > > > >> 2011/2/3 Drew <drew.kay@gmail.com>: > >> >> for test, raid1 and after raid0 have better probability to don't stop > >> >> raid10, but it's a probability... don't believe in lucky, since it's > >> >> just for test, not production, it doesn't matter... > >> >> > >> >> what i whould implement? for production? anyone, if a disk fail, all > >> >> array should be replaced (if without money replace disk with small > >> >> life) > >> > > >> > A lot of this discussion about failure rates and probabilities is > >> > academic. There are assumptions about each disk having it's own > >> > independent failure probability, which if that can not be predicted > >> > must be assumed to be 50%. 
At the end of the day I agree that when > >> > the first disk fails the RAID is degraded and one *must* take steps to > >> > remedy that. This discussion is more about why RAID 10 (1+0) is better > >> > then 0+1. > >> > > >> > On our production systems we work with our vendor to ensure the > >> > individual drives we get aren't from the same batch/production run, > >> > thereby mitigating some issues around flaws in specific batches. We > >> > keep spare drives on hand for all three RAID arrays, so as to minimize > >> > the time we're operating in a degraded state. All data on RAID arrays > >> > is backed up nightly to storage which is then mirrored off-site. > >> > > >> > At the end of the day our decision around what RAID type (10/5/6) to > >> > use was based on a balance between performance, safety, & capacity > >> > then on specific failure criteria. RAID 10 backs the iSCSI LUN that > >> > our VMware cluster uses for the individual OSes, and the data > >> > partition for the accounting database server. RAID 5 backs the > >> > partitions we store user data one. And RAID 6 backs the NASes we use > >> > for our backup system. > >> > > >> > RAID 10 was chosen for performance reasons. It doesn't have to > >> > calculate parity on every write so for the OS & database, which do a > >> > lot of small reads & writes, it's faster. For user disks we went with > >> > RAID 5 because we get more space in the array at a small performance > >> > penalty, which is fine as the users have to access the file server > >> > over the LAN and the bottle neck is the pipe between the switch & the > >> > VM, not between the iSCSI SAN & the server. For backups we went with > >> > RAID 6 because the performance & storage penalties for the array were > >> > outweighed by the need for maximum safety. > >> > > >> > > >> > > >> > -- > >> > Drew > >> > > >> > "Nothing in life is to be feared. It is only to be understood." > >> > --Marie Curie > >> > > >> > > >> > >> > >> > >> -- > >> Roberto Spadim > >> Spadim Technology / SPAEmpresarial > >> -- > >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in > >> the body of a message to majordomo@vger.kernel.org > >> More majordomo info at http://vger.kernel.org/majordomo-info.html > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-03 16:02 ` Keld Jørn Simonsen @ 2011-02-03 16:07 ` Roberto Spadim 2011-02-03 16:16 ` Roberto Spadim 1 sibling, 0 replies; 127+ messages in thread From: Roberto Spadim @ 2011-02-03 16:07 UTC (permalink / raw) To: Keld Jørn Simonsen; +Cc: Drew, Linux-RAID 2011/2/3 Keld Jørn Simonsen <keld@keldix.com>: > On Thu, Feb 03, 2011 at 01:50:49PM -0200, Roberto Spadim wrote: >> hummm, nice >> keld (or anyone), do you know someone (with time, not much, total time >> i think it?s just 2 hours) to try develop modifications on raid1 >> read_balance function? > > maybe our very productive Polish friends at Intel could have a look. > But then again, I am not sure it is productive. I think raid1 is OK, > You could have a look at raid10, where "offset" has been discussed as > being the better layout for ssd. ok, i think that there´s no 'better' layout for ssd since ssd don´t have a variable access time like head on hard disk it´s better because today read balance is optimized for minimal distance (nearest head) but it´s not true that layout for ssd is better >> what modification, today read_balance have distance (current_head - >> next_head), multiply it by a number at /sys/block/md0/distance_rate, >> and make add read_size*byte_rate (byte_rate at >> /sys/block/md0/byte_read_rate), with this, the algorithm will make >> minimal time, and not minimal distance >> with this, i can get better read_balance (for ssd) >> for a second time we could implement device queue time to end (i think >> we will work about 1 day to get it working with all device >> schedulers), but it?s not for now > > Hmm, I thought you wanted to write new elevator schedulers? no, it´s not a elevator, it´s a raid1 read balance, based on time (each sda,sdb,sdc,sdd can have you elevator without problem) it´s like a elevator for raid1 (mirror), some other raid could use it too (raid10, i don´t know if raid5 have mirror, but if yes, could use too) > > best regards > keld thanks, keld -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-03 16:02 ` Keld Jørn Simonsen 2011-02-03 16:07 ` Roberto Spadim @ 2011-02-03 16:16 ` Roberto Spadim 1 sibling, 0 replies; 127+ messages in thread From: Roberto Spadim @ 2011-02-03 16:16 UTC (permalink / raw) To: Keld Jørn Simonsen; +Cc: Drew, Linux-RAID i was trying to search coder for round robin balance, i founded Roy Keene how could we contact him and tell if he could help us? http://www.spinics.net/lists/raid/msg30003.html 2011/2/3 Keld Jørn Simonsen <keld@keldix.com>: > On Thu, Feb 03, 2011 at 01:50:49PM -0200, Roberto Spadim wrote: >> hummm, nice >> keld (or anyone), do you know someone (with time, not much, total time >> i think it?s just 2 hours) to try develop modifications on raid1 >> read_balance function? > > maybe our very productive Polish friends at Intel could have a look. > But then again, I am not sure it is productive. I think raid1 is OK, > You could have a look at raid10, where "offset" has been discussed as > being the better layout for ssd. > >> what modification, today read_balance have distance (current_head - >> next_head), multiply it by a number at /sys/block/md0/distance_rate, >> and make add read_size*byte_rate (byte_rate at >> /sys/block/md0/byte_read_rate), with this, the algorithm will make >> minimal time, and not minimal distance >> with this, i can get better read_balance (for ssd) >> for a second time we could implement device queue time to end (i think >> we will work about 1 day to get it working with all device >> schedulers), but it?s not for now > > Hmm, I thought you wanted to write new elevator schedulers? > > best regards > keld > >> >> 2011/2/3 Keld Jørn Simonsen <keld@keldix.com>: >> > On Thu, Feb 03, 2011 at 12:35:52PM -0200, Roberto Spadim wrote: >> >> =] i think that we can end discussion and conclude that context (test >> >> / production) allow or don't allow lucky on probability, what's lucky? >> >> for production, lucky = poor disk, for production we don't allow >> >> failed disks, we have smart to predict, and when a disk fail we change >> >> many disks to prevent another disk fail >> >> >> >> could we update our raid wiki with some informations about this discussion? >> > >> > I would like to, but it is a bit complicated. >> > Anyway I think there already is something there on the wiki. >> > And then, for one of the most important raid types in Linux MD, >> > namely raid10, I am not sure what to write. It could be raid1+0, or >> > raid0+1 like, and as far as I kow, it is raid0+1 for F2:-( >> > but I don't know for n2 and o2. >> > >> > The German version on raid at wikipedia has a lot of info on probability >> > http://de.wikipedia.org/wiki/RAID - but it is wrong a number of places. >> > I have tried to correct it, but the German version is moderated, and >> > they don't know what they are writing about. > > at least in some places, refusing to correct errors. > >> > http://de.wikipedia.org/wiki/RAID >> > >> > Best regards >> > Keld >> > >> >> 2011/2/3 Drew <drew.kay@gmail.com>: >> >> >> for test, raid1 and after raid0 have better probability to don't stop >> >> >> raid10, but it's a probability... don't believe in lucky, since it's >> >> >> just for test, not production, it doesn't matter... >> >> >> >> >> >> what i whould implement? for production? anyone, if a disk fail, all >> >> >> array should be replaced (if without money replace disk with small >> >> >> life) >> >> > >> >> > A lot of this discussion about failure rates and probabilities is >> >> > academic. 
There are assumptions about each disk having it's own >> >> > independent failure probability, which if that can not be predicted >> >> > must be assumed to be 50%. At the end of the day I agree that when >> >> > the first disk fails the RAID is degraded and one *must* take steps to >> >> > remedy that. This discussion is more about why RAID 10 (1+0) is better >> >> > then 0+1. >> >> > >> >> > On our production systems we work with our vendor to ensure the >> >> > individual drives we get aren't from the same batch/production run, >> >> > thereby mitigating some issues around flaws in specific batches. We >> >> > keep spare drives on hand for all three RAID arrays, so as to minimize >> >> > the time we're operating in a degraded state. All data on RAID arrays >> >> > is backed up nightly to storage which is then mirrored off-site. >> >> > >> >> > At the end of the day our decision around what RAID type (10/5/6) to >> >> > use was based on a balance between performance, safety, & capacity >> >> > then on specific failure criteria. RAID 10 backs the iSCSI LUN that >> >> > our VMware cluster uses for the individual OSes, and the data >> >> > partition for the accounting database server. RAID 5 backs the >> >> > partitions we store user data one. And RAID 6 backs the NASes we use >> >> > for our backup system. >> >> > >> >> > RAID 10 was chosen for performance reasons. It doesn't have to >> >> > calculate parity on every write so for the OS & database, which do a >> >> > lot of small reads & writes, it's faster. For user disks we went with >> >> > RAID 5 because we get more space in the array at a small performance >> >> > penalty, which is fine as the users have to access the file server >> >> > over the LAN and the bottle neck is the pipe between the switch & the >> >> > VM, not between the iSCSI SAN & the server. For backups we went with >> >> > RAID 6 because the performance & storage penalties for the array were >> >> > outweighed by the need for maximum safety. >> >> > >> >> > >> >> > >> >> > -- >> >> > Drew >> >> > >> >> > "Nothing in life is to be feared. It is only to be understood." >> >> > --Marie Curie >> >> > >> >> > >> >> >> >> >> >> >> >> -- >> >> Roberto Spadim >> >> Spadim Technology / SPAEmpresarial >> >> -- >> >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> >> the body of a message to majordomo@vger.kernel.org >> >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > -- >> > To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> > the body of a message to majordomo@vger.kernel.org >> > More majordomo info at http://vger.kernel.org/majordomo-info.html >> > >> >> >> >> -- >> Roberto Spadim >> Spadim Technology / SPAEmpresarial > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-01 13:50 ` Jon Nelson 2011-02-01 14:25 ` Roberto Spadim 2011-02-01 14:48 ` David Brown @ 2011-02-01 22:05 ` Stan Hoeppner 2011-02-01 23:12 ` Roberto Spadim 2011-02-01 23:35 ` Keld Jørn Simonsen 2 siblings, 2 replies; 127+ messages in thread From: Stan Hoeppner @ 2011-02-01 22:05 UTC (permalink / raw) To: Jon Nelson; +Cc: David Brown, linux-raid Jon Nelson put forth on 2/1/2011 7:50 AM: > The performance will not be the same because. Whenever possible, md > reads from the outermost portion of the disk -- theoretically the > fastest portion of the disk (by 2 or 3 times as much as the inner > tracks) -- and in this way raid10,f2 can actually be faster than > raid0. Faster in what regard? I assume you mean purely sequential read, and not random IOPS. The access patterns of the vast majority of workloads are random, so I don't see much real world benefit, if what you say is correct. This might benefit MythTV or similar niche streaming apps. -- Stan ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-01 22:05 ` Stan Hoeppner @ 2011-02-01 23:12 ` Roberto Spadim 2011-02-02 9:25 ` Robin Hill 0 siblings, 1 reply; 127+ messages in thread From: Roberto Spadim @ 2011-02-01 23:12 UTC (permalink / raw) To: Stan Hoeppner; +Cc: Jon Nelson, David Brown, linux-raid
again... the closest-head algorithm (what raid1 uses today) is good for hard disks but isn't good for SSDs (round robin is better there), and the best algorithm is time based (minimize the time to access the data)
2011/2/1 Stan Hoeppner <stan@hardwarefreak.com>: > Jon Nelson put forth on 2/1/2011 7:50 AM: > >> The performance will not be the same because. Whenever possible, md >> reads from the outermost portion of the disk -- theoretically the >> fastest portion of the disk (by 2 or 3 times as much as the inner >> tracks) -- and in this way raid10,f2 can actually be faster than >> raid0. > > Faster in what regard? I assume you mean purely sequential read, and not random > IOPS. The access patterns of the vast majority of workloads are random, so I > don't see much real world benefit, if what you say is correct. This might > benefit MythTV or similar niche streaming apps. > > -- > Stan > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html >
-- Roberto Spadim Spadim Technology / SPAEmpresarial
-- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread

* Re: What's the typical RAID10 setup? 2011-02-01 23:12 ` Roberto Spadim @ 2011-02-02 9:25 ` Robin Hill 2011-02-02 16:00 ` Roberto Spadim 0 siblings, 1 reply; 127+ messages in thread From: Robin Hill @ 2011-02-02 9:25 UTC (permalink / raw) To: Jon Nelson, linux-raid [-- Attachment #1: Type: text/plain, Size: 471 bytes --] On Tue Feb 01, 2011 at 09:12:11PM -0200, Roberto Spadim wrote: > but the best algorithm is time based (minimize time to access data) > And what do you think takes the time accessing the data? In a rotating disk, it's moving the heads - that's why the current strategy is nearest head. In an SSD there's no head movement, so access time should be the same for accessing any data, making it pretty much irrelevant which strategy is used. Cheers, Robin [-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-02 9:25 ` Robin Hill @ 2011-02-02 16:00 ` Roberto Spadim 2011-02-02 16:06 ` Roberto Spadim 0 siblings, 1 reply; 127+ messages in thread From: Roberto Spadim @ 2011-02-02 16:00 UTC (permalink / raw) To: Jon Nelson, linux-raid
time based means the time to do the work. HD: head positioning; SSD: time to send the command to the ROM chip. HD: read/write time (disk speed - rpm); SSD: time to write/read (time for the ROM chip to transfer the bytes). That's time based.
What is fastest for a read? Consider that a time-based balancer must know that a disk is already busy with an I/O and how long until it finishes; this time-to-finish is another term in the algorithm.
For example, NBD (network block device): time to send the read message + time to send the command to the ROM or to position the head; read/write time: the time for the NBD server to return the read/write bytes.
What should the algorithm do? Calculate the total time for all mirrors, including the time to finish the current request (if only one request can be processed at a time; or, if more than one request is allowed, the time until our command can start). After all times are calculated, select the minimal value/device.
That's time based. It's not based on round robin, it's not based on the closest head; it's based on the device's speed to: (1) position the head / send the ROM command, (2) read/write the data (per total bytes read/written), (3) start our request (if the device doesn't allow more than one request at a time, there is no device queue).
The total time per device tells us the best device to read from. If we mix NBD + SSD + HDD (5000rpm) + HDD (7500rpm) + HDD (10000rpm) + HDD (15000rpm), we can get the best read time using this algorithm. The problem? We must run a constant benchmark to obtain the values (1), (2), (3) and calculate good estimates of the time spent in each step.
Summing up: we need a model of each device (simple constants, or a very complex neural network?) and to calculate the time spent per device. Nice?
2011/2/2 Robin Hill <robin@robinhill.me.uk>: > On Tue Feb 01, 2011 at 09:12:11PM -0200, Roberto Spadim wrote: > >> but the best algorithm is time based (minimize time to access data) >> > And what do you think takes the time accessing the data? In a rotating > disk, it's moving the heads - that's why the current strategy is nearest > head. In an SSD there's no head movement, so access time should be the > same for accessing any data, making it pretty much irrelevant which > strategy is used. > > Cheers, > Robin >
-- Roberto Spadim Spadim Technology / SPAEmpresarial
-- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-02 16:00 ` Roberto Spadim @ 2011-02-02 16:06 ` Roberto Spadim 2011-02-02 16:07 ` Roberto Spadim 0 siblings, 1 reply; 127+ messages in thread From: Roberto Spadim @ 2011-02-02 16:06 UTC (permalink / raw) To: Jon Nelson, linux-raid it's cpu/mem consuming if use a complex model, and less cpu/mem consuming if use a single model another idea.... many algorithm.... first execute time based it selected a bug (failed) device execute closest head if selected a bug (failed) device execute round robin if selected a bug (failed) device select first usable non write-mostly if selected a bug (failed) device select first usable write-mostly if end of devices, stop md raid to make this, today... we need a read_algorithm at /sys/block/md0/xxxxxx, to select what algorith to use, write algorithm is based on raid being used.. raid0 make linear and stripe, raid1 make mirror, there's no algorithm to use here... we need some files at /sys/block/md0/xxx to manage 'devices' time model (parameters) we need a adaptive algorithm to update parameters and make it closest possible to real model of 'devices' a raid0 have global parameters, inside raid0 devices have per device parameters a raid1 over raid0, should use raid0 parameters raid0 over devices, should use devices parameters 2011/2/2 Roberto Spadim <roberto@spadim.com.br>: > time based: is the time to: > HD:head positioning , SSD: time to send command to ROM chip > HD:read/write time (disk speed - rpm), SSD: time to write/read (time > to ssd rom chip receive bytes) > that's time based > > what is fast por read? > consider that time based must know that disk is doing a I/O and that > you have a time to end, this time to end is another time in algorithm > > for example: > NBD (network block device) > time to send read message + time to send command to rom or head positioning > read/write time: time to nbd server return the read/write bytes > > what algorithm should do? > calculate all time or all mirrors, including time to end current > request (if only one request could be processed, or if allow more than > 1 request, the time spent to start our command) > after all time calculated, select the minimal value/device > > that's time based > it's not based on round robin > it's not based on closest head > it's based on device speed to: > *(1)position head/send rom command > *(2)read/write time (per total of bytes read/write) > *(3)time to start out request command (if don't allow more than 1 > request per time, don't have a device queue) > > the total time per device will tell us the best device to read > if we mix, nbd + ssd + hdd (5000rpm) + hdd(7500rpm) + hdd(10000rpm) + > hdd(15000rpm) > we can get the best read time using this algorithm > the problem? we must run a constante benchmark to get this values *(1) > *(2) *(3) and calculate good values of time spent on each process > > resuming... whe need a model of each device (simple-constants or very > complex-neural network?), and calculate time spent per device > nice? > > > 2011/2/2 Robin Hill <robin@robinhill.me.uk>: >> On Tue Feb 01, 2011 at 09:12:11PM -0200, Roberto Spadim wrote: >> >>> but the best algorithm is time based (minimize time to access data) >>> >> And what do you think takes the time accessing the data? In a rotating >> disk, it's moving the heads - that's why the current strategy is nearest >> head. 
In an SSD there's no head movement, so access time should be the >> same for accessing any data, making it pretty much irrelevant which >> strategy is used. >> >> Cheers, >> Robin >> > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
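A minimal sketch of the fallback chain described in the mail above (try time based, then closest head, then round robin, then "first usable"). This is user-space C for illustration only; the picker functions are placeholders, not md code, and each is assumed to return -1 when it cannot find a working device:

#include <stdio.h>

#define NO_DEVICE (-1)

/* Placeholder strategies; in the proposal these would be the time-based,
 * nearest-head and round-robin pickers, then "first usable" fallbacks. */
static int pick_time_based(void)         { return NO_DEVICE; /* e.g. model not trained yet */ }
static int pick_nearest_head(void)       { return NO_DEVICE; }
static int pick_round_robin(void)        { return 1; /* pretend mirror 1 is healthy */ }
static int pick_first_usable(void)       { return NO_DEVICE; }
static int pick_first_write_mostly(void) { return NO_DEVICE; }

/* Try each strategy in order; if every one fails there is no readable
 * member left and the caller has to error the request. */
static int read_balance_cascade(void)
{
    int (*strategy[])(void) = {
        pick_time_based,
        pick_nearest_head,
        pick_round_robin,
        pick_first_usable,
        pick_first_write_mostly,
    };

    for (unsigned i = 0; i < sizeof(strategy) / sizeof(strategy[0]); i++) {
        int dev = strategy[i]();
        if (dev != NO_DEVICE)
            return dev;
    }
    return NO_DEVICE;
}

int main(void)
{
    int dev = read_balance_cascade();
    if (dev == NO_DEVICE)
        printf("no usable mirror - the array would have to be stopped\n");
    else
        printf("reading from mirror %d\n", dev);
    return 0;
}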
* Re: What's the typical RAID10 setup? 2011-02-02 16:06 ` Roberto Spadim @ 2011-02-02 16:07 ` Roberto Spadim 2011-02-02 16:10 ` Roberto Spadim 0 siblings, 1 reply; 127+ messages in thread From: Roberto Spadim @ 2011-02-02 16:07 UTC (permalink / raw) To: Jon Nelson, linux-raid check that, read balance is: time based closest head round robin algorithms plus.... failed device problem and write-mostly with time based we can drop write-mosty.... just make the time of that device very high 2011/2/2 Roberto Spadim <roberto@spadim.com.br>: > it's cpu/mem consuming if use a complex model, and less cpu/mem > consuming if use a single model > > another idea.... > many algorithm.... > > first execute time based > it selected a bug (failed) device > execute closest head > if selected a bug (failed) device > execute round robin > if selected a bug (failed) device > select first usable non write-mostly > if selected a bug (failed) device > select first usable write-mostly > if end of devices, stop md raid > > to make this, today... we need a read_algorithm at > /sys/block/md0/xxxxxx, to select what algorith to use, write algorithm > is based on raid being used.. raid0 make linear and stripe, raid1 make > mirror, there's no algorithm to use here... > we need some files at /sys/block/md0/xxx to manage 'devices' time > model (parameters) > we need a adaptive algorithm to update parameters and make it closest > possible to real model of 'devices' > a raid0 have global parameters, inside raid0 devices have per device parameters > a raid1 over raid0, should use raid0 parameters > raid0 over devices, should use devices parameters > > > 2011/2/2 Roberto Spadim <roberto@spadim.com.br>: >> time based: is the time to: >> HD:head positioning , SSD: time to send command to ROM chip >> HD:read/write time (disk speed - rpm), SSD: time to write/read (time >> to ssd rom chip receive bytes) >> that's time based >> >> what is fast por read? >> consider that time based must know that disk is doing a I/O and that >> you have a time to end, this time to end is another time in algorithm >> >> for example: >> NBD (network block device) >> time to send read message + time to send command to rom or head positioning >> read/write time: time to nbd server return the read/write bytes >> >> what algorithm should do? >> calculate all time or all mirrors, including time to end current >> request (if only one request could be processed, or if allow more than >> 1 request, the time spent to start our command) >> after all time calculated, select the minimal value/device >> >> that's time based >> it's not based on round robin >> it's not based on closest head >> it's based on device speed to: >> *(1)position head/send rom command >> *(2)read/write time (per total of bytes read/write) >> *(3)time to start out request command (if don't allow more than 1 >> request per time, don't have a device queue) >> >> the total time per device will tell us the best device to read >> if we mix, nbd + ssd + hdd (5000rpm) + hdd(7500rpm) + hdd(10000rpm) + >> hdd(15000rpm) >> we can get the best read time using this algorithm >> the problem? we must run a constante benchmark to get this values *(1) >> *(2) *(3) and calculate good values of time spent on each process >> >> resuming... whe need a model of each device (simple-constants or very >> complex-neural network?), and calculate time spent per device >> nice? 
>> >> >> 2011/2/2 Robin Hill <robin@robinhill.me.uk>: >>> On Tue Feb 01, 2011 at 09:12:11PM -0200, Roberto Spadim wrote: >>> >>>> but the best algorithm is time based (minimize time to access data) >>>> >>> And what do you think takes the time accessing the data? In a rotating >>> disk, it's moving the heads - that's why the current strategy is nearest >>> head. In an SSD there's no head movement, so access time should be the >>> same for accessing any data, making it pretty much irrelevant which >>> strategy is used. >>> >>> Cheers, >>> Robin >>> >> >> >> >> -- >> Roberto Spadim >> Spadim Technology / SPAEmpresarial >> > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-02 16:07 ` Roberto Spadim @ 2011-02-02 16:10 ` Roberto Spadim 2011-02-02 16:13 ` Roberto Spadim 0 siblings, 1 reply; 127+ messages in thread From: Roberto Spadim @ 2011-02-02 16:10 UTC (permalink / raw) To: Jon Nelson, linux-raid pros against closest head: since we can use raid1 with identical disks (buyed at same time, with near serial numbers) we can have disks with same time to fail using closest head, the more used disk, will fail first failing first we have time to change it (while the second isn't as used as first device) but, think about it... it's like a write-mostly not? 2011/2/2 Roberto Spadim <roberto@spadim.com.br>: > check that, read balance is: > time based > closest head > round robin > algorithms > > plus.... > failed device problem and write-mostly > > with time based we can drop write-mosty.... just make the time of that > device very high > > > 2011/2/2 Roberto Spadim <roberto@spadim.com.br>: >> it's cpu/mem consuming if use a complex model, and less cpu/mem >> consuming if use a single model >> >> another idea.... >> many algorithm.... >> >> first execute time based >> it selected a bug (failed) device >> execute closest head >> if selected a bug (failed) device >> execute round robin >> if selected a bug (failed) device >> select first usable non write-mostly >> if selected a bug (failed) device >> select first usable write-mostly >> if end of devices, stop md raid >> >> to make this, today... we need a read_algorithm at >> /sys/block/md0/xxxxxx, to select what algorith to use, write algorithm >> is based on raid being used.. raid0 make linear and stripe, raid1 make >> mirror, there's no algorithm to use here... >> we need some files at /sys/block/md0/xxx to manage 'devices' time >> model (parameters) >> we need a adaptive algorithm to update parameters and make it closest >> possible to real model of 'devices' >> a raid0 have global parameters, inside raid0 devices have per device parameters >> a raid1 over raid0, should use raid0 parameters >> raid0 over devices, should use devices parameters >> >> >> 2011/2/2 Roberto Spadim <roberto@spadim.com.br>: >>> time based: is the time to: >>> HD:head positioning , SSD: time to send command to ROM chip >>> HD:read/write time (disk speed - rpm), SSD: time to write/read (time >>> to ssd rom chip receive bytes) >>> that's time based >>> >>> what is fast por read? >>> consider that time based must know that disk is doing a I/O and that >>> you have a time to end, this time to end is another time in algorithm >>> >>> for example: >>> NBD (network block device) >>> time to send read message + time to send command to rom or head positioning >>> read/write time: time to nbd server return the read/write bytes >>> >>> what algorithm should do? 
>>> calculate all time or all mirrors, including time to end current >>> request (if only one request could be processed, or if allow more than >>> 1 request, the time spent to start our command) >>> after all time calculated, select the minimal value/device >>> >>> that's time based >>> it's not based on round robin >>> it's not based on closest head >>> it's based on device speed to: >>> *(1)position head/send rom command >>> *(2)read/write time (per total of bytes read/write) >>> *(3)time to start out request command (if don't allow more than 1 >>> request per time, don't have a device queue) >>> >>> the total time per device will tell us the best device to read >>> if we mix, nbd + ssd + hdd (5000rpm) + hdd(7500rpm) + hdd(10000rpm) + >>> hdd(15000rpm) >>> we can get the best read time using this algorithm >>> the problem? we must run a constante benchmark to get this values *(1) >>> *(2) *(3) and calculate good values of time spent on each process >>> >>> resuming... whe need a model of each device (simple-constants or very >>> complex-neural network?), and calculate time spent per device >>> nice? >>> >>> >>> 2011/2/2 Robin Hill <robin@robinhill.me.uk>: >>>> On Tue Feb 01, 2011 at 09:12:11PM -0200, Roberto Spadim wrote: >>>> >>>>> but the best algorithm is time based (minimize time to access data) >>>>> >>>> And what do you think takes the time accessing the data? In a rotating >>>> disk, it's moving the heads - that's why the current strategy is nearest >>>> head. In an SSD there's no head movement, so access time should be the >>>> same for accessing any data, making it pretty much irrelevant which >>>> strategy is used. >>>> >>>> Cheers, >>>> Robin >>>> >>> >>> >>> >>> -- >>> Roberto Spadim >>> Spadim Technology / SPAEmpresarial >>> >> >> >> >> -- >> Roberto Spadim >> Spadim Technology / SPAEmpresarial >> > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-02 16:10 ` Roberto Spadim @ 2011-02-02 16:13 ` Roberto Spadim 2011-02-02 19:44 ` Keld Jørn Simonsen 0 siblings, 1 reply; 127+ messages in thread From: Roberto Spadim @ 2011-02-02 16:13 UTC (permalink / raw) To: Jon Nelson, linux-raid ssd time to make head positioning (latency): <0.1ms hd max time to make head positioning (latency): 10ms ssd rate of read: 270MB/s random/sequential read (excluding latency) check that ssd is BLOCK (4kb mostly) oriented hd rate of read? 130MB/s sequential read? check that hd is BIT oriented write rate? random/sequencial? with these answers we can make a simple 'time' model of read/write, per device (use of raid0 (/dev/md0) is a device!, raid1 too (/dev/md1), raid5 (/dev/md2) ,raid6 (/dev/md3)) any device have this variables... just make a model and use the model to optimize minimal time to execute write/read 2011/2/2 Roberto Spadim <roberto@spadim.com.br>: > pros against closest head: > since we can use raid1 with identical disks (buyed at same time, with > near serial numbers) we can have disks with same time to fail > using closest head, the more used disk, will fail first > failing first we have time to change it (while the second isn't as > used as first device) > > but, think about it... > it's like a write-mostly not? > > > 2011/2/2 Roberto Spadim <roberto@spadim.com.br>: >> check that, read balance is: >> time based >> closest head >> round robin >> algorithms >> >> plus.... >> failed device problem and write-mostly >> >> with time based we can drop write-mosty.... just make the time of that >> device very high >> >> >> 2011/2/2 Roberto Spadim <roberto@spadim.com.br>: >>> it's cpu/mem consuming if use a complex model, and less cpu/mem >>> consuming if use a single model >>> >>> another idea.... >>> many algorithm.... >>> >>> first execute time based >>> it selected a bug (failed) device >>> execute closest head >>> if selected a bug (failed) device >>> execute round robin >>> if selected a bug (failed) device >>> select first usable non write-mostly >>> if selected a bug (failed) device >>> select first usable write-mostly >>> if end of devices, stop md raid >>> >>> to make this, today... we need a read_algorithm at >>> /sys/block/md0/xxxxxx, to select what algorith to use, write algorithm >>> is based on raid being used.. raid0 make linear and stripe, raid1 make >>> mirror, there's no algorithm to use here... >>> we need some files at /sys/block/md0/xxx to manage 'devices' time >>> model (parameters) >>> we need a adaptive algorithm to update parameters and make it closest >>> possible to real model of 'devices' >>> a raid0 have global parameters, inside raid0 devices have per device parameters >>> a raid1 over raid0, should use raid0 parameters >>> raid0 over devices, should use devices parameters >>> >>> >>> 2011/2/2 Roberto Spadim <roberto@spadim.com.br>: >>>> time based: is the time to: >>>> HD:head positioning , SSD: time to send command to ROM chip >>>> HD:read/write time (disk speed - rpm), SSD: time to write/read (time >>>> to ssd rom chip receive bytes) >>>> that's time based >>>> >>>> what is fast por read? >>>> consider that time based must know that disk is doing a I/O and that >>>> you have a time to end, this time to end is another time in algorithm >>>> >>>> for example: >>>> NBD (network block device) >>>> time to send read message + time to send command to rom or head positioning >>>> read/write time: time to nbd server return the read/write bytes >>>> >>>> what algorithm should do? 
>>>> calculate all time or all mirrors, including time to end current >>>> request (if only one request could be processed, or if allow more than >>>> 1 request, the time spent to start our command) >>>> after all time calculated, select the minimal value/device >>>> >>>> that's time based >>>> it's not based on round robin >>>> it's not based on closest head >>>> it's based on device speed to: >>>> *(1)position head/send rom command >>>> *(2)read/write time (per total of bytes read/write) >>>> *(3)time to start out request command (if don't allow more than 1 >>>> request per time, don't have a device queue) >>>> >>>> the total time per device will tell us the best device to read >>>> if we mix, nbd + ssd + hdd (5000rpm) + hdd(7500rpm) + hdd(10000rpm) + >>>> hdd(15000rpm) >>>> we can get the best read time using this algorithm >>>> the problem? we must run a constante benchmark to get this values *(1) >>>> *(2) *(3) and calculate good values of time spent on each process >>>> >>>> resuming... whe need a model of each device (simple-constants or very >>>> complex-neural network?), and calculate time spent per device >>>> nice? >>>> >>>> >>>> 2011/2/2 Robin Hill <robin@robinhill.me.uk>: >>>>> On Tue Feb 01, 2011 at 09:12:11PM -0200, Roberto Spadim wrote: >>>>> >>>>>> but the best algorithm is time based (minimize time to access data) >>>>>> >>>>> And what do you think takes the time accessing the data? In a rotating >>>>> disk, it's moving the heads - that's why the current strategy is nearest >>>>> head. In an SSD there's no head movement, so access time should be the >>>>> same for accessing any data, making it pretty much irrelevant which >>>>> strategy is used. >>>>> >>>>> Cheers, >>>>> Robin >>>>> >>>> >>>> >>>> >>>> -- >>>> Roberto Spadim >>>> Spadim Technology / SPAEmpresarial >>>> >>> >>> >>> >>> -- >>> Roberto Spadim >>> Spadim Technology / SPAEmpresarial >>> >> >> >> >> -- >> Roberto Spadim >> Spadim Technology / SPAEmpresarial >> > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
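Taking the figures quoted above as rough round numbers, the same simple model (queue time + positioning + size/rate) gives: a single 4 KiB random read costs about 0.1 + 0.015 ≈ 0.12 ms on the SSD versus about 10 + 0.03 ≈ 10 ms on the hard disk, close to two orders of magnitude apart; a 64 MB sequential read costs about 237 ms versus about 500 ms, only a factor of two. The ranking can even invert when the SSD already has a deep queue, which is exactly what a per-request time estimate captures and a static preference (such as marking the hard disk write-mostly) does not.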
* Re: What's the typical RAID10 setup? 2011-02-02 16:13 ` Roberto Spadim @ 2011-02-02 19:44 ` Keld Jørn Simonsen 2011-02-02 20:28 ` Roberto Spadim 0 siblings, 1 reply; 127+ messages in thread From: Keld Jørn Simonsen @ 2011-02-02 19:44 UTC (permalink / raw) To: Roberto Spadim; +Cc: Jon Nelson, linux-raid Hmm, Roberto, where are the gains? I think it is hard to make raid1 better than it is today. Normally the driver orders the reads to minimize head movement and loss with rotation latency. Where can we improve that? Also, what about conflicts with the elevator algorithm? There are several scheduling algorithms available, and each has its merits. Will your new scheme work against these? Or is your new scheme just another scheduling algorithm? I think I learned that scheduling is per drive, not per file system. and is it reading or writing or both? Normally we are dependant on the reading, as we cannot process data before we have read them. OTOH writing is less time critical, as nobody is waiting for it. Or is it maximum thruput you want? Or a mix, given some restraints? best regards keld Best regards Keld On Wed, Feb 02, 2011 at 02:13:52PM -0200, Roberto Spadim wrote: > ssd time to make head positioning (latency): <0.1ms > hd max time to make head positioning (latency): 10ms > > ssd rate of read: 270MB/s random/sequential read (excluding latency) > check that ssd is BLOCK (4kb mostly) oriented > hd rate of read? 130MB/s sequential read? check that hd is BIT oriented > write rate? random/sequencial? > > with these answers we can make a simple 'time' model of read/write, > per device (use of raid0 (/dev/md0) is a device!, raid1 too > (/dev/md1), raid5 (/dev/md2) ,raid6 (/dev/md3)) any device have this > variables... > just make a model and use the model to optimize minimal time to > execute write/read > > 2011/2/2 Roberto Spadim <roberto@spadim.com.br>: > > pros against closest head: > > since we can use raid1 with identical disks (buyed at same time, with > > near serial numbers) we can have disks with same time to fail > > using closest head, the more used disk, will fail first > > failing first we have time to change it (while the second isn't as > > used as first device) > > > > but, think about it... > > it's like a write-mostly not? > > > > > > 2011/2/2 Roberto Spadim <roberto@spadim.com.br>: > >> check that, read balance is: > >> time based > >> closest head > >> round robin > >> algorithms > >> > >> plus.... > >> failed device problem and write-mostly > >> > >> with time based we can drop write-mosty.... just make the time of that > >> device very high > >> > >> > >> 2011/2/2 Roberto Spadim <roberto@spadim.com.br>: > >>> it's cpu/mem consuming if use a complex model, and less cpu/mem > >>> consuming if use a single model > >>> > >>> another idea.... > >>> many algorithm.... > >>> > >>> first execute time based > >>> it selected a bug (failed) device > >>> execute closest head > >>> if selected a bug (failed) device > >>> execute round robin > >>> if selected a bug (failed) device > >>> select first usable non write-mostly > >>> if selected a bug (failed) device > >>> select first usable write-mostly > >>> if end of devices, stop md raid > >>> > >>> to make this, today... we need a read_algorithm at > >>> /sys/block/md0/xxxxxx, to select what algorith to use, write algorithm > >>> is based on raid being used.. raid0 make linear and stripe, raid1 make > >>> mirror, there's no algorithm to use here... 
> >>> we need some files at /sys/block/md0/xxx to manage 'devices' time > >>> model (parameters) > >>> we need a adaptive algorithm to update parameters and make it closest > >>> possible to real model of 'devices' > >>> a raid0 have global parameters, inside raid0 devices have per device parameters > >>> a raid1 over raid0, should use raid0 parameters > >>> raid0 over devices, should use devices parameters > >>> > >>> > >>> 2011/2/2 Roberto Spadim <roberto@spadim.com.br>: > >>>> time based: is the time to: > >>>> HD:head positioning , SSD: time to send command to ROM chip > >>>> HD:read/write time (disk speed - rpm), SSD: time to write/read (time > >>>> to ssd rom chip receive bytes) > >>>> that's time based > >>>> > >>>> what is fast por read? > >>>> consider that time based must know that disk is doing a I/O and that > >>>> you have a time to end, this time to end is another time in algorithm > >>>> > >>>> for example: > >>>> NBD (network block device) > >>>> time to send read message + time to send command to rom or head positioning > >>>> read/write time: time to nbd server return the read/write bytes > >>>> > >>>> what algorithm should do? > >>>> calculate all time or all mirrors, including time to end current > >>>> request (if only one request could be processed, or if allow more than > >>>> 1 request, the time spent to start our command) > >>>> after all time calculated, select the minimal value/device > >>>> > >>>> that's time based > >>>> it's not based on round robin > >>>> it's not based on closest head > >>>> it's based on device speed to: > >>>> *(1)position head/send rom command > >>>> *(2)read/write time (per total of bytes read/write) > >>>> *(3)time to start out request command (if don't allow more than 1 > >>>> request per time, don't have a device queue) > >>>> > >>>> the total time per device will tell us the best device to read > >>>> if we mix, nbd + ssd + hdd (5000rpm) + hdd(7500rpm) + hdd(10000rpm) + > >>>> hdd(15000rpm) > >>>> we can get the best read time using this algorithm > >>>> the problem? we must run a constante benchmark to get this values *(1) > >>>> *(2) *(3) and calculate good values of time spent on each process > >>>> > >>>> resuming... whe need a model of each device (simple-constants or very > >>>> complex-neural network?), and calculate time spent per device > >>>> nice? > >>>> > >>>> > >>>> 2011/2/2 Robin Hill <robin@robinhill.me.uk>: > >>>>> On Tue Feb 01, 2011 at 09:12:11PM -0200, Roberto Spadim wrote: > >>>>> > >>>>>> but the best algorithm is time based (minimize time to access data) > >>>>>> > >>>>> And what do you think takes the time accessing the data? In a rotating > >>>>> disk, it's moving the heads - that's why the current strategy is nearest > >>>>> head. In an SSD there's no head movement, so access time should be the > >>>>> same for accessing any data, making it pretty much irrelevant which > >>>>> strategy is used. 
> >>>>> > >>>>> Cheers, > >>>>> Robin > >>>>> > >>>> > >>>> > >>>> > >>>> -- > >>>> Roberto Spadim > >>>> Spadim Technology / SPAEmpresarial > >>>> > >>> > >>> > >>> > >>> -- > >>> Roberto Spadim > >>> Spadim Technology / SPAEmpresarial > >>> > >> > >> > >> > >> -- > >> Roberto Spadim > >> Spadim Technology / SPAEmpresarial > >> > > > > > > > > -- > > Roberto Spadim > > Spadim Technology / SPAEmpresarial > > > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
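The fallback order proposed in the quoted text (time based, then nearest head, then round robin, then any usable device, then a write-mostly device, otherwise give up) can be sketched as below. Every strategy here is a stub that just returns a fixed candidate; this is illustrative user-space code under those assumptions, not the md raid1 read-balance code:

#include <stdbool.h>
#include <stdio.h>

struct mirror { bool faulty; bool write_mostly; };

typedef int (*pick_fn)(const struct mirror *m, int n);

/* Stub strategies: each just names a fixed candidate to show the
 * mechanism; real ones would use timing models, head position, etc. */
static int pick_time_based(const struct mirror *m, int n)   { (void)m; (void)n; return 0; }
static int pick_nearest_head(const struct mirror *m, int n) { (void)m; (void)n; return 1; }
static int pick_round_robin(const struct mirror *m, int n)  { (void)m; return n - 1; }

static int pick_with_fallback(const struct mirror *m, int n)
{
    pick_fn strategies[] = { pick_time_based, pick_nearest_head, pick_round_robin };
    int nstrat = sizeof(strategies) / sizeof(strategies[0]);

    for (int s = 0; s < nstrat; s++) {
        int idx = strategies[s](m, n);
        if (idx >= 0 && idx < n && !m[idx].faulty && !m[idx].write_mostly)
            return idx;                     /* strategy picked a healthy device */
    }
    for (int i = 0; i < n; i++)             /* first usable, not write-mostly */
        if (!m[i].faulty && !m[i].write_mostly)
            return i;
    for (int i = 0; i < n; i++)             /* last resort: a write-mostly member */
        if (!m[i].faulty)
            return i;
    return -1;                              /* nothing readable: stop the array */
}

int main(void)
{
    /* device 0 failed, device 1 failed, device 2 is write-mostly but alive */
    struct mirror m[] = { { true, false }, { true, false }, { false, true } };
    printf("selected mirror: %d\n", pick_with_fallback(m, 3));
    return 0;
}

Run as-is it prints "selected mirror: 2": the first two strategies point at failed devices and the only healthy member is write-mostly, so the last-resort step accepts it.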
* Re: What's the typical RAID10 setup? 2011-02-02 19:44 ` Keld Jørn Simonsen @ 2011-02-02 20:28 ` Roberto Spadim 2011-02-02 21:31 ` Roberto Spadim ` (2 more replies) 0 siblings, 3 replies; 127+ messages in thread From: Roberto Spadim @ 2011-02-02 20:28 UTC (permalink / raw) To: Keld Jørn Simonsen; +Cc: Jon Nelson, linux-raid
Before this, I posted the thread at this page: https://bbs.archlinux.org/viewtopic.php?pid=887267 to keep the traffic on this mailing list down.
2011/2/2 Keld Jørn Simonsen <keld@keldix.com>: > Hmm, Roberto, where are the gains?
It's difficult to pin down... NCQ and the Linux I/O scheduler don't help a mirror set, they help a single device. A new scheduler for mirrors can be added (round robin, nearest head, others).
> I think it is hard to make raid1 better than it is today.
I don't think so, since the head only matters for rotational hard disks, not for solid state disks. Let's leave SSDs aside and consider only hard disks: in an array with a 5000rpm and a 10000rpm disk, will reads be faster from the 10000rpm disk? We don't know the I/O model of either device, but it probably will be faster; yet when it is busy we could use the 5000rpm one instead. That's the point: nearest head alone doesn't help, we also need to know the queue (the list of I/O being processed) and the time to finish the current I/O.
> Normally the driver orders the reads to minimize head movement > and loss with rotation latency. Where can we improve that?
There is no way to improve that, it's very good - but it works per hard disk, not per mirror. Once we know a disk is busy we can use another mirror (another disk with the same information); that's what I want.
> Also, what about conflicts with the elevator algorithm?
The elevators are based on a model of the disk; think of a disk as: Linux elevator + NCQ + the disk itself. The sum of those three pieces of information gives us the timing data to select the best device. Maybe with more complex code (per elevator) we could know the exact time each request will take, but that is a lot of work. For a first model, let's only think about the parameters of our model (Linux elevator + NCQ + disk); in a second version we could add the elevator's own time calculation (does a network block device, NBD, have an elevator at the server side, plus the TCP/IP stack at client and server side, right?).
> There are several scheduling algorithms available, and each has > its merits. Will your new scheme work against these? > Or is your new scheme just another scheduling algorithm?
It's a scheduler for mirrors. Round robin is an algorithm for mirrors, nearest head is an algorithm for mirrors, and my 'new' algorithm would also be for mirrors (if anyone can help me code it for the Linux kernel - I haven't written kernel code yet, only user space). noop, deadline and cfq aren't for mirrors; they deal with the raid0-style problem (linear and striped access, as if your hard disk had more than one head).
> I think I learned that scheduling is per drive, not per file system.
Yes, you learned right! =) /dev/md0 (raid1) is a device with its own scheduling choice (nearest head, round robin), /dev/sda is a device with scheduling (noop, deadline, cfq, others), and /dev/sda1 is a device with scheduling (it sends all I/O directly to /dev/sda).
The new algorithm is only for mirrors (raid1). I don't remember whether raid5/raid6 are mirror based too; if they are, they could be optimized with this algorithm as well. raid0 doesn't have mirrors, but the information is striped across devices (not for linear), which is why it can be faster: it can do parallel reads.
With nearest head we can't always use the best disk; we may use a single disk all the time just because its head is closer, even if it is not the fastest disk. (That's why write-mostly was implemented: we don't use those members for reads, only for writes or when a mirror fails. But it's not perfect for speed; a better algorithm can be made. For identical disks round robin works well, better than nearest head if the members are solid state disks.) OK, under high load maybe nearest mirror is better than this algorithm? Yes, if you only use hard disks. If you mix hard disks + solid state + network block devices + floppy disks + any other device, you don't have the best algorithm for I/O over mirrors.
> and is it reading or writing or both? Normally we are dependant on the > reading, as we cannot process data before we have read them. > OTOH writing is less time critical, as nobody is waiting for it.
It must be implemented on both write and read: on write only for the time bookkeeping, on read to select the best mirror. For writes we must write to all mirrors anyway (synchronous writes are better; async isn't power-failure safe).
> Or is it maximum thruput you want? > Or a mix, given some restraints?
It's maximum performance = the strategy that spends the least time executing the current I/O, based on the time to access the disk, the time to read the bytes, and the time spent waiting for other I/O being executed. That's for mirror selection, not for per-disk I/O: for the disks we can keep the noop, deadline or cfq schedulers, and TCP/IP tweaks for a network block device. A model-identification step must run to tell the mirror-selection algorithm the model of each device - model: time to read X bytes, time to move the head, time to start a read, time to write; times per byte, per KB, per unit. Calculate the time for each mirror and select the one with the minimal value as the device to execute our read.
> > best regards > keld
Thanks Keld. Sorry if I make the mailing list very big.
-- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
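One way to do the model-identification step described above without a constant benchmark is to learn the parameters from requests that have already completed. A sketch, assuming a single smoothed cost-per-KiB figure per mirror is an acceptable model (a deliberate simplification; real devices would need at least separate latency and transfer terms):

#include <stdio.h>

/* Illustrative on-line "model identification": keep an exponentially
 * weighted average of the observed service time per KiB for each
 * mirror, updated after every completed request.  This is only an
 * assumption about how the parameters could be learned, not md code. */
struct mirror_stats {
    double ms_per_kb;        /* smoothed cost of one KiB read     */
    double alpha;            /* smoothing factor, 0 < alpha <= 1  */
};

static void observe(struct mirror_stats *s, double request_kb, double took_ms)
{
    double sample = took_ms / request_kb;
    s->ms_per_kb = s->alpha * sample + (1.0 - s->alpha) * s->ms_per_kb;
}

static double predict_ms(const struct mirror_stats *s, double request_kb)
{
    return s->ms_per_kb * request_kb;
}

int main(void)
{
    struct mirror_stats ssd = { 0.01, 0.2 }, hdd = { 0.10, 0.2 };

    /* Feed in a couple of fake completions; the estimates drift toward
     * each device's real behaviour without any static benchmark run. */
    observe(&ssd, 128, 0.6);     /* 128 KiB served in 0.6 ms */
    observe(&hdd, 128, 11.0);    /* 128 KiB served in 11 ms  */

    printf("predicted 256 KiB read: ssd %.2f ms, hdd %.2f ms\n",
           predict_ms(&ssd, 256), predict_ms(&hdd, 256));
    return 0;
}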
* Re: What's the typical RAID10 setup? 2011-02-02 20:28 ` Roberto Spadim @ 2011-02-02 21:31 ` Roberto Spadim 2011-02-02 22:13 ` Keld Jørn Simonsen 2011-02-03 3:05 ` Stan Hoeppner 2 siblings, 0 replies; 127+ messages in thread From: Roberto Spadim @ 2011-02-02 21:31 UTC (permalink / raw) To: Keld Jørn Simonsen; +Cc: Jon Nelson, linux-raid sorry pour english, it´s not closest head, it´s nearest head 2011/2/2 Roberto Spadim <roberto@spadim.com.br>: > before, this thread i put at this page: > https://bbs.archlinux.org/viewtopic.php?pid=887267 > to make this mail list with less emails > > 2011/2/2 Keld Jørn Simonsen <keld@keldix.com>: >> Hmm, Roberto, where are the gains? > > it´s dificult to talk... NCQ and linux scheduler don´t help a mirror, > they help a single device > a new scheduler for mirrors can be done (round robin, closest head, others) > >> I think it is hard to make raid1 better than it is today. > i don´t think, since head, is just for hard disk (rotational) not for > solid state disks, let´s not talk about ssd, just hard disk? a raid > with 5000rpm and 10000rpm disk, we will have better i/o read with > 10000rpm ? we don´t know the model of i/o for that device, but > probally will be faster, but when it´s busy we could use 5000rpm... > that´s the point, just closest head don´t help, we need know what´s > the queue (list of i/o being processed) and the time to read the > current i/o > >> Normally the driver orders the reads to minimize head movement >> and loss with rotation latency. Where can we improve that? > > no way to improve it, it´s very good! but per hard disk, not per mirror > but since we know it´s busy we can use another mirror (another disk > with same information), that´s what i want > >> Also, what about conflicts with the elevator algorithm? > elevator are based on model of disk, think disk as: linux elevator + > NCQ + disks, the sum of three infomration give us time based > infomrations to select best device > maybe making complex code (per elevator) we could know the time spent > to execute it, but it´s a lot of work, > for the first model, lets think about parameters of our model (linux > elevator + ncq + disks) > a second version we could implement elevator algorithm time > calculation (network block device NBD, have a elevator? at server side > + tcp/ip stack at client and server side, right?) > >> There are several scheduling algorithms available, and each has >> its merits. Will your new scheme work against these? >> Or is your new scheme just another scheduling algorithm? > > it´s a scheduling for mirrors > round balance is a algorithm for mirror > closest head is a algorithm for mirror > my 'new' algorith will be for mirror (if anyone help me coding for > linux kernel hehehe, i didn´t coded for linux kernel yet, just for > user space) > > noop, deadline, cfq isn´t for mirror, these are for raid0 problem > (linear, stripe if you hard disk have more then one head on your hard > disk) > >> I think I learned that scheduling is per drive, not per file system. > yes, you learned right! 
=) > /dev/md0 (raid1) is a device with scheduling (closest head,round robin) > /dev/sda is a device with scheduling (noop, deadline, cfq, others) > /dev/sda1 is a device with scheduling (it send all i/o directly to /dev/sda) > > the new algorithm is just for mirrors (raid1), i dont remeber about > raid5,6 if they are mirror based too, if yes they could be optimized > with this algorithm too > > raid0 don´t have mirrors, but information is per device striped (not > for linear), that´s why it can be faster... can make parallel reads > > with closest head we can´t use best disk, we can use a single disk all > time if it´s head closer, maybe it´s not the fastest disk (that´s why > we implent the write-mostly, we don´t make they usable for read, just > for write or when mirror fail, but it´s not perfect for speed, a > better algorithm can be made, for identical disks, a round robin work > well, better than closest head if it´s a solid state disk) > ok on a high load, maybe closest mirror is better than this algorithm? > yes, if you just use hard disk, if you mix hard disk+solid > state+network block device +floppy disks+any other device, you don´t > have the best algorithm for i/o over mirrors > > >> and is it reading or writing or both? Normally we are dependant on the >> reading, as we cannot process data before we have read them. >> OTOH writing is less time critical, as nobody is waiting for it. > it must be implemented on write and read, write for just time > calculations, read for select the best mirror > for write we must write on all mirrors (sync write is better, async > isn´t power fail safe) > >> Or is it maximum thruput you want? >> Or a mix, given some restraints? > it´s the maximum performace = what´s the better strategy to spent less > time to execute current i/o, based on time to access disk, time to > read bytes, time to wait others i/o being executed > > that´s for mirror select, not for disks i/o > for disks we can use noop, deadline, cfq scheduller (for disks) > tcp/ip tweaks for network block device > > a model identification must execute to tell the mirror select > algorithm what´s the model of each device > model: time to read X bytes, time to move head, time to start a read, > time to write, time time time per byte per kb per units > calcule time and select the minimal value calculated as the device > (mirror) to execute our read > > >> >> best regards >> keld > > thanks keld > > sorry if i make email list very big > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-02 20:28 ` Roberto Spadim 2011-02-02 21:31 ` Roberto Spadim @ 2011-02-02 22:13 ` Keld Jørn Simonsen 2011-02-02 22:26 ` Roberto Spadim 2011-02-03 3:05 ` Stan Hoeppner 2 siblings, 1 reply; 127+ messages in thread From: Keld Jørn Simonsen @ 2011-02-02 22:13 UTC (permalink / raw) To: Roberto Spadim; +Cc: Keld Jørn Simonsen, Jon Nelson, linux-raid Hmm, Roberto, I think we are close to theoretical maximum with some of the raid1/raid10 stuff already. and my nose tells me that we can gain more by minimizing CPU usage. Or maybe using some threading for raid modules - they all run single-threaded. Best regards keld On Wed, Feb 02, 2011 at 06:28:27PM -0200, Roberto Spadim wrote: > before, this thread i put at this page: > https://bbs.archlinux.org/viewtopic.php?pid=887267 > to make this mail list with less emails > > 2011/2/2 Keld Jørn Simonsen <keld@keldix.com>: > > Hmm, Roberto, where are the gains? > > it?s dificult to talk... NCQ and linux scheduler don?t help a mirror, > they help a single device > a new scheduler for mirrors can be done (round robin, closest head, others) > > > I think it is hard to make raid1 better than it is today. > i don?t think, since head, is just for hard disk (rotational) not for > solid state disks, let?s not talk about ssd, just hard disk? a raid > with 5000rpm and 10000rpm disk, we will have better i/o read with > 10000rpm ? we don?t know the model of i/o for that device, but > probally will be faster, but when it?s busy we could use 5000rpm... > that?s the point, just closest head don?t help, we need know what?s > the queue (list of i/o being processed) and the time to read the > current i/o > > > Normally the driver orders the reads to minimize head movement > > and loss with rotation latency. Where can we improve that? > > no way to improve it, it?s very good! but per hard disk, not per mirror > but since we know it?s busy we can use another mirror (another disk > with same information), that?s what i want > > > Also, what about conflicts with the elevator algorithm? > elevator are based on model of disk, think disk as: linux elevator + > NCQ + disks, the sum of three infomration give us time based > infomrations to select best device > maybe making complex code (per elevator) we could know the time spent > to execute it, but it?s a lot of work, > for the first model, lets think about parameters of our model (linux > elevator + ncq + disks) > a second version we could implement elevator algorithm time > calculation (network block device NBD, have a elevator? at server side > + tcp/ip stack at client and server side, right?) > > > There are several scheduling algorithms available, and each has > > its merits. Will your new scheme work against these? > > Or is your new scheme just another scheduling algorithm? > > it?s a scheduling for mirrors > round balance is a algorithm for mirror > closest head is a algorithm for mirror > my 'new' algorith will be for mirror (if anyone help me coding for > linux kernel hehehe, i didn?t coded for linux kernel yet, just for > user space) > > noop, deadline, cfq isn?t for mirror, these are for raid0 problem > (linear, stripe if you hard disk have more then one head on your hard > disk) > > > I think I learned that scheduling is per drive, not per file system. > yes, you learned right! 
=) > /dev/md0 (raid1) is a device with scheduling (closest head,round robin) > /dev/sda is a device with scheduling (noop, deadline, cfq, others) > /dev/sda1 is a device with scheduling (it send all i/o directly to /dev/sda) > > the new algorithm is just for mirrors (raid1), i dont remeber about > raid5,6 if they are mirror based too, if yes they could be optimized > with this algorithm too > > raid0 don?t have mirrors, but information is per device striped (not > for linear), that?s why it can be faster... can make parallel reads > > with closest head we can?t use best disk, we can use a single disk all > time if it?s head closer, maybe it?s not the fastest disk (that?s why > we implent the write-mostly, we don?t make they usable for read, just > for write or when mirror fail, but it?s not perfect for speed, a > better algorithm can be made, for identical disks, a round robin work > well, better than closest head if it?s a solid state disk) > ok on a high load, maybe closest mirror is better than this algorithm? > yes, if you just use hard disk, if you mix hard disk+solid > state+network block device +floppy disks+any other device, you don?t > have the best algorithm for i/o over mirrors > > > > and is it reading or writing or both? Normally we are dependant on the > > reading, as we cannot process data before we have read them. > > OTOH writing is less time critical, as nobody is waiting for it. > it must be implemented on write and read, write for just time > calculations, read for select the best mirror > for write we must write on all mirrors (sync write is better, async > isn?t power fail safe) > > > Or is it maximum thruput you want? > > Or a mix, given some restraints? > it?s the maximum performace = what?s the better strategy to spent less > time to execute current i/o, based on time to access disk, time to > read bytes, time to wait others i/o being executed > > that?s for mirror select, not for disks i/o > for disks we can use noop, deadline, cfq scheduller (for disks) > tcp/ip tweaks for network block device > > a model identification must execute to tell the mirror select > algorithm what?s the model of each device > model: time to read X bytes, time to move head, time to start a read, > time to write, time time time per byte per kb per units > calcule time and select the minimal value calculated as the device > (mirror) to execute our read > > > > > > best regards > > keld > > thanks keld > > sorry if i make email list very big > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-02 22:13 ` Keld Jørn Simonsen @ 2011-02-02 22:26 ` Roberto Spadim 2011-02-03 1:57 ` Roberto Spadim 0 siblings, 1 reply; 127+ messages in thread From: Roberto Spadim @ 2011-02-02 22:26 UTC (permalink / raw) To: Keld Jørn Simonsen; +Cc: Jon Nelson, linux-raid nice, i don´t know if it´s a problem of single thread i think it´s a problem about async read command being executed in parallel i post again at https://bbs.archlinux.org/viewtopic.php?pid=887345 please see the history at the end of page i´m talking about a disk with 5000rpm and a disk with 7000rpm i think we can optimize mirror read algorithm and it´s not very hard for same speed hard disk, near mirror is good for same speed solid state, round robin is good for anyone, time based is good diferences? hard disk: time to position head is high, time to read can be small solid state: time to position is small, time to read is small (some ssd are old, and have small read rate) nbd: time based on server hard/solid disk, and network time, but don´t think in nbd yet 2011/2/2 Keld Jørn Simonsen <keld@keldix.com>: > Hmm, Roberto, I think we are close to theoretical maximum with > some of the raid1/raid10 stuff already. and my nose tells me > that we can gain more by minimizing CPU usage. > Or maybe using some threading for raid modules - they > all run single-threaded. > > Best regards > keld > > > On Wed, Feb 02, 2011 at 06:28:27PM -0200, Roberto Spadim wrote: >> before, this thread i put at this page: >> https://bbs.archlinux.org/viewtopic.php?pid=887267 >> to make this mail list with less emails >> >> 2011/2/2 Keld Jørn Simonsen <keld@keldix.com>: >> > Hmm, Roberto, where are the gains? >> >> it?s dificult to talk... NCQ and linux scheduler don?t help a mirror, >> they help a single device >> a new scheduler for mirrors can be done (round robin, closest head, others) >> >> > I think it is hard to make raid1 better than it is today. >> i don?t think, since head, is just for hard disk (rotational) not for >> solid state disks, let?s not talk about ssd, just hard disk? a raid >> with 5000rpm and 10000rpm disk, we will have better i/o read with >> 10000rpm ? we don?t know the model of i/o for that device, but >> probally will be faster, but when it?s busy we could use 5000rpm... >> that?s the point, just closest head don?t help, we need know what?s >> the queue (list of i/o being processed) and the time to read the >> current i/o >> >> > Normally the driver orders the reads to minimize head movement >> > and loss with rotation latency. Where can we improve that? >> >> no way to improve it, it?s very good! but per hard disk, not per mirror >> but since we know it?s busy we can use another mirror (another disk >> with same information), that?s what i want >> >> > Also, what about conflicts with the elevator algorithm? >> elevator are based on model of disk, think disk as: linux elevator + >> NCQ + disks, the sum of three infomration give us time based >> infomrations to select best device >> maybe making complex code (per elevator) we could know the time spent >> to execute it, but it?s a lot of work, >> for the first model, lets think about parameters of our model (linux >> elevator + ncq + disks) >> a second version we could implement elevator algorithm time >> calculation (network block device NBD, have a elevator? at server side >> + tcp/ip stack at client and server side, right?) >> >> > There are several scheduling algorithms available, and each has >> > its merits. 
Will your new scheme work against these? >> > Or is your new scheme just another scheduling algorithm? >> >> it?s a scheduling for mirrors >> round balance is a algorithm for mirror >> closest head is a algorithm for mirror >> my 'new' algorith will be for mirror (if anyone help me coding for >> linux kernel hehehe, i didn?t coded for linux kernel yet, just for >> user space) >> >> noop, deadline, cfq isn?t for mirror, these are for raid0 problem >> (linear, stripe if you hard disk have more then one head on your hard >> disk) >> >> > I think I learned that scheduling is per drive, not per file system. >> yes, you learned right! =) >> /dev/md0 (raid1) is a device with scheduling (closest head,round robin) >> /dev/sda is a device with scheduling (noop, deadline, cfq, others) >> /dev/sda1 is a device with scheduling (it send all i/o directly to /dev/sda) >> >> the new algorithm is just for mirrors (raid1), i dont remeber about >> raid5,6 if they are mirror based too, if yes they could be optimized >> with this algorithm too >> >> raid0 don?t have mirrors, but information is per device striped (not >> for linear), that?s why it can be faster... can make parallel reads >> >> with closest head we can?t use best disk, we can use a single disk all >> time if it?s head closer, maybe it?s not the fastest disk (that?s why >> we implent the write-mostly, we don?t make they usable for read, just >> for write or when mirror fail, but it?s not perfect for speed, a >> better algorithm can be made, for identical disks, a round robin work >> well, better than closest head if it?s a solid state disk) >> ok on a high load, maybe closest mirror is better than this algorithm? >> yes, if you just use hard disk, if you mix hard disk+solid >> state+network block device +floppy disks+any other device, you don?t >> have the best algorithm for i/o over mirrors >> >> >> > and is it reading or writing or both? Normally we are dependant on the >> > reading, as we cannot process data before we have read them. >> > OTOH writing is less time critical, as nobody is waiting for it. >> it must be implemented on write and read, write for just time >> calculations, read for select the best mirror >> for write we must write on all mirrors (sync write is better, async >> isn?t power fail safe) >> >> > Or is it maximum thruput you want? >> > Or a mix, given some restraints? 
>> it?s the maximum performace = what?s the better strategy to spent less >> time to execute current i/o, based on time to access disk, time to >> read bytes, time to wait others i/o being executed >> >> that?s for mirror select, not for disks i/o >> for disks we can use noop, deadline, cfq scheduller (for disks) >> tcp/ip tweaks for network block device >> >> a model identification must execute to tell the mirror select >> algorithm what?s the model of each device >> model: time to read X bytes, time to move head, time to start a read, >> time to write, time time time per byte per kb per units >> calcule time and select the minimal value calculated as the device >> (mirror) to execute our read >> >> >> > >> > best regards >> > keld >> >> thanks keld >> >> sorry if i make email list very big >> >> >> >> -- >> Roberto Spadim >> Spadim Technology / SPAEmpresarial > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-02 22:26 ` Roberto Spadim @ 2011-02-03 1:57 ` Roberto Spadim 0 siblings, 0 replies; 127+ messages in thread From: Roberto Spadim @ 2011-02-03 1:57 UTC (permalink / raw) To: Keld Jørn Simonsen; +Cc: Jon Nelson, linux-raid i have updated again, some questions are being explained (https://bbs.archlinux.org/viewtopic.php?pid=887345) check that this question (optional io mirror scheduler algorithm) is very old (1+1/2 years, Chris Worley [ Fr, 16 Oktober 2009 21:07 ] [ ID #2019215 ]) http://www.issociate.de/board/post/499463/Load-balancing_mirrors_w/_asymmetric_performance.html 2011/2/2 Roberto Spadim <roberto@spadim.com.br>: > nice, i don´t know if it´s a problem of single thread > i think it´s a problem about async read command being executed in parallel > i post again at https://bbs.archlinux.org/viewtopic.php?pid=887345 > please see the history at the end of page > i´m talking about a disk with 5000rpm and a disk with 7000rpm > i think we can optimize mirror read algorithm and it´s not very hard > for same speed hard disk, near mirror is good > for same speed solid state, round robin is good > for anyone, time based is good > > diferences? > hard disk: time to position head is high, time to read can be small > solid state: time to position is small, time to read is small (some > ssd are old, and have small read rate) > nbd: time based on server hard/solid disk, and network time, but don´t > think in nbd yet > > 2011/2/2 Keld Jørn Simonsen <keld@keldix.com>: >> Hmm, Roberto, I think we are close to theoretical maximum with >> some of the raid1/raid10 stuff already. and my nose tells me >> that we can gain more by minimizing CPU usage. >> Or maybe using some threading for raid modules - they >> all run single-threaded. >> >> Best regards >> keld >> >> >> On Wed, Feb 02, 2011 at 06:28:27PM -0200, Roberto Spadim wrote: >>> before, this thread i put at this page: >>> https://bbs.archlinux.org/viewtopic.php?pid=887267 >>> to make this mail list with less emails >>> >>> 2011/2/2 Keld Jørn Simonsen <keld@keldix.com>: >>> > Hmm, Roberto, where are the gains? >>> >>> it?s dificult to talk... NCQ and linux scheduler don?t help a mirror, >>> they help a single device >>> a new scheduler for mirrors can be done (round robin, closest head, others) >>> >>> > I think it is hard to make raid1 better than it is today. >>> i don?t think, since head, is just for hard disk (rotational) not for >>> solid state disks, let?s not talk about ssd, just hard disk? a raid >>> with 5000rpm and 10000rpm disk, we will have better i/o read with >>> 10000rpm ? we don?t know the model of i/o for that device, but >>> probally will be faster, but when it?s busy we could use 5000rpm... >>> that?s the point, just closest head don?t help, we need know what?s >>> the queue (list of i/o being processed) and the time to read the >>> current i/o >>> >>> > Normally the driver orders the reads to minimize head movement >>> > and loss with rotation latency. Where can we improve that? >>> >>> no way to improve it, it?s very good! but per hard disk, not per mirror >>> but since we know it?s busy we can use another mirror (another disk >>> with same information), that?s what i want >>> >>> > Also, what about conflicts with the elevator algorithm? 
>>> elevator are based on model of disk, think disk as: linux elevator + >>> NCQ + disks, the sum of three infomration give us time based >>> infomrations to select best device >>> maybe making complex code (per elevator) we could know the time spent >>> to execute it, but it?s a lot of work, >>> for the first model, lets think about parameters of our model (linux >>> elevator + ncq + disks) >>> a second version we could implement elevator algorithm time >>> calculation (network block device NBD, have a elevator? at server side >>> + tcp/ip stack at client and server side, right?) >>> >>> > There are several scheduling algorithms available, and each has >>> > its merits. Will your new scheme work against these? >>> > Or is your new scheme just another scheduling algorithm? >>> >>> it?s a scheduling for mirrors >>> round balance is a algorithm for mirror >>> closest head is a algorithm for mirror >>> my 'new' algorith will be for mirror (if anyone help me coding for >>> linux kernel hehehe, i didn?t coded for linux kernel yet, just for >>> user space) >>> >>> noop, deadline, cfq isn?t for mirror, these are for raid0 problem >>> (linear, stripe if you hard disk have more then one head on your hard >>> disk) >>> >>> > I think I learned that scheduling is per drive, not per file system. >>> yes, you learned right! =) >>> /dev/md0 (raid1) is a device with scheduling (closest head,round robin) >>> /dev/sda is a device with scheduling (noop, deadline, cfq, others) >>> /dev/sda1 is a device with scheduling (it send all i/o directly to /dev/sda) >>> >>> the new algorithm is just for mirrors (raid1), i dont remeber about >>> raid5,6 if they are mirror based too, if yes they could be optimized >>> with this algorithm too >>> >>> raid0 don?t have mirrors, but information is per device striped (not >>> for linear), that?s why it can be faster... can make parallel reads >>> >>> with closest head we can?t use best disk, we can use a single disk all >>> time if it?s head closer, maybe it?s not the fastest disk (that?s why >>> we implent the write-mostly, we don?t make they usable for read, just >>> for write or when mirror fail, but it?s not perfect for speed, a >>> better algorithm can be made, for identical disks, a round robin work >>> well, better than closest head if it?s a solid state disk) >>> ok on a high load, maybe closest mirror is better than this algorithm? >>> yes, if you just use hard disk, if you mix hard disk+solid >>> state+network block device +floppy disks+any other device, you don?t >>> have the best algorithm for i/o over mirrors >>> >>> >>> > and is it reading or writing or both? Normally we are dependant on the >>> > reading, as we cannot process data before we have read them. >>> > OTOH writing is less time critical, as nobody is waiting for it. >>> it must be implemented on write and read, write for just time >>> calculations, read for select the best mirror >>> for write we must write on all mirrors (sync write is better, async >>> isn?t power fail safe) >>> >>> > Or is it maximum thruput you want? >>> > Or a mix, given some restraints? 
>>> it?s the maximum performace = what?s the better strategy to spent less >>> time to execute current i/o, based on time to access disk, time to >>> read bytes, time to wait others i/o being executed >>> >>> that?s for mirror select, not for disks i/o >>> for disks we can use noop, deadline, cfq scheduller (for disks) >>> tcp/ip tweaks for network block device >>> >>> a model identification must execute to tell the mirror select >>> algorithm what?s the model of each device >>> model: time to read X bytes, time to move head, time to start a read, >>> time to write, time time time per byte per kb per units >>> calcule time and select the minimal value calculated as the device >>> (mirror) to execute our read >>> >>> >>> > >>> > best regards >>> > keld >>> >>> thanks keld >>> >>> sorry if i make email list very big >>> >>> >>> >>> -- >>> Roberto Spadim >>> Spadim Technology / SPAEmpresarial >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-02 20:28 ` Roberto Spadim 2011-02-02 21:31 ` Roberto Spadim 2011-02-02 22:13 ` Keld Jørn Simonsen @ 2011-02-03 3:05 ` Stan Hoeppner 2011-02-03 3:13 ` Roberto Spadim 2 siblings, 1 reply; 127+ messages in thread From: Stan Hoeppner @ 2011-02-03 3:05 UTC (permalink / raw) To: Roberto Spadim; +Cc: Keld Jørn Simonsen, Jon Nelson, linux-raid Roberto Spadim put forth on 2/2/2011 2:28 PM: > i don´t think, since head, is just for hard disk (rotational) not for > solid state disks, let´s not talk about ssd, just hard disk? a raid > with 5000rpm and 10000rpm disk, we will have better i/o read with Anyone who would mix drives of such disparate spindle speeds within the same array is not concerned with performance. Anyone who has read enough to create their first array knows better than to do this. Why waste effort to optimize such a horrible design decision? > 10000rpm ? we don´t know the model of i/o for that device, but > probally will be faster, but when it´s busy we could use 5000rpm... > that´s the point, just closest head don´t help, we need know what´s > the queue (list of i/o being processed) and the time to read the > current i/o This is just silly. -- Stan -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-03 3:05 ` Stan Hoeppner @ 2011-02-03 3:13 ` Roberto Spadim 2011-02-03 3:17 ` Roberto Spadim 0 siblings, 1 reply; 127+ messages in thread From: Roberto Spadim @ 2011-02-03 3:13 UTC (permalink / raw) To: Stan Hoeppner; +Cc: Keld Jørn Simonsen, Jon Nelson, linux-raid
> Anyone who would mix drives of such disparate spindle speeds within the same > array is not concerned with performance. Anyone who has read enough to create > their first array knows better than to do this.
I don't think so... let's talk about internet links: we can have two internet links (one radio, one ADSL or whatever), and with Linux I can load-balance them based on bandwidth usage, round robin, and many other ideas. It's the same for raid1, except that write speed = the slowest member's write speed; for reads we can get higher speed (with raid1 we can have the same read speed as raid0 - or we should...).
> Why waste effort to optimize such a horrible design decision?
A horrible design decision? Two 1TB SSDs at 1.5GB/s and two 1TB SSDs at 2.0GB/s (they could be from Texas... or OCZ...) in raid1 - is that a horrible design? Or, for hard disks, two 1TB SAS drives at 300MB/s and two 1TB SAS drives at 250MB/s (I never get more than 300MB/s out of hard disks) - is that a horrible design?
> This is just silly.
No, it's an option. We could use round robin, nearest head, and other read-balance algorithms; the closer the model is to the real world, the faster we get. Now, it's true that raid1 write speed is poor (that's why we use raid10), but reads can be as fast as raid0... even faster, since all disks hold the same information. What we need is a good read-balance algorithm (I think for a non-uniform array a time-based one is the best option...).
2011/2/3 Stan Hoeppner <stan@hardwarefreak.com>: > Roberto Spadim put forth on 2/2/2011 2:28 PM: > >> i don´t think, since head, is just for hard disk (rotational) not for >> solid state disks, let´s not talk about ssd, just hard disk? a raid >> with 5000rpm and 10000rpm disk, we will have better i/o read with > > Anyone who would mix drives of such disparate spindle speeds within the same > array is not concerned with performance. Anyone who has read enough to create > their first array knows better than to do this. > > Why waste effort to optimize such a horrible design decision? > >> 10000rpm ? we don´t know the model of i/o for that device, but >> probally will be faster, but when it´s busy we could use 5000rpm... >> that´s the point, just closest head don´t help, we need know what´s >> the queue (list of i/o being processed) and the time to read the >> current i/o > > This is just silly. > > -- > Stan > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html >
-- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
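As an illustration of that claim, assume the two SSD models above can stream reads fully in parallel with no other overhead: plain round robin hands every second request to the 1.5 GB/s member, so the slower device paces the array and aggregate read throughput tops out near 2 × 1.5 = 3.0 GB/s, while splitting requests roughly 3:4 in proportion to member speed keeps both devices busy and approaches 1.5 + 2.0 = 3.5 GB/s - the raid0-like figure a load-aware (or time-based) balancer is aiming for.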
* Re: What's the typical RAID10 setup? 2011-02-03 3:13 ` Roberto Spadim @ 2011-02-03 3:17 ` Roberto Spadim 0 siblings, 0 replies; 127+ messages in thread From: Roberto Spadim @ 2011-02-03 3:17 UTC (permalink / raw) To: Stan Hoeppner; +Cc: Keld Jørn Simonsen, Jon Nelson, linux-raid could we implement first a round robin, and after try another algorithm (time based)? with round robin many problems for ssd can be solved, near head don't help ssd disks since they don't have head, and sequencial read rate is near random read rate 2011/2/3 Roberto Spadim <roberto@spadim.com.br>: >> Anyone who would mix drives of such disparate spindle speeds within the same >> array is not concerned with performance. Anyone who has read enough to create >> their first array knows better than to do this. > i don't think that... let's talk about internet link? we can have two > internet links (1 radio, 1 adsl or another), with linux i can load > balance based on network band use, round robin, and many others ideas > it's same for raid1, except, write speed = slowest write speed, for > read we can get highers speed (with raid1 we can have same speed than > raid0, or, we should...) > >> Why waste effort to optimize such a horrible design decision? > a horrible design decision: > two 1TB ssd with 1,5GB/s and two 1TB ssd with 2,0GB/s (can be from > texas... or ocz...) > using raid1, it's a horrible design? > or... for hard disks > two 1tb sas with 300MB/s and two 1tb sas with 250MB/s (i never get > more speed than 300mb/s with hard disks) > is it a horrible design? > >> This is just silly. > no, it's a option, we could use round robin, near head, and another > read balance algorithms, the closest to read world the faster we get > > now it's true for raid1: write speed is poor (that's why we use raid10) > but read can be as fast as raid0 .... even faster since all disks have > the same information... what we need is a good read balance algorithm > (i think for a non balanced array, a time based is the best option...) > > 2011/2/3 Stan Hoeppner <stan@hardwarefreak.com>: >> Roberto Spadim put forth on 2/2/2011 2:28 PM: >> >>> i don´t think, since head, is just for hard disk (rotational) not for >>> solid state disks, let´s not talk about ssd, just hard disk? a raid >>> with 5000rpm and 10000rpm disk, we will have better i/o read with >> >> Anyone who would mix drives of such disparate spindle speeds within the same >> array is not concerned with performance. Anyone who has read enough to create >> their first array knows better than to do this. >> >> Why waste effort to optimize such a horrible design decision? >> >>> 10000rpm ? we don´t know the model of i/o for that device, but >>> probally will be faster, but when it´s busy we could use 5000rpm... >>> that´s the point, just closest head don´t help, we need know what´s >>> the queue (list of i/o being processed) and the time to read the >>> current i/o >> >> This is just silly. >> >> -- >> Stan >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
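A round-robin selector really is small. The sketch below (illustrative user-space code, not the md driver) just rotates through the non-failed mirrors, which already spreads reads evenly across SSDs, where there is no head position to exploit:

#include <stdio.h>

/* Minimal round-robin mirror selection, the "implement this first"
 * option discussed above: rotate through the working mirrors. */
struct rr_state { unsigned next; };

static int rr_pick(struct rr_state *s, const int *faulty, int nmirrors)
{
    for (int tried = 0; tried < nmirrors; tried++) {
        int idx = s->next++ % nmirrors;
        if (!faulty[idx])
            return idx;              /* skip failed members */
    }
    return -1;                       /* no readable mirror  */
}

int main(void)
{
    struct rr_state s = { 0 };
    int faulty[3] = { 0, 1, 0 };     /* middle mirror has failed */

    for (int i = 0; i < 6; i++)
        printf("request %d -> mirror %d\n", i, rr_pick(&s, faulty, 3));
    return 0;
}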
* Re: What's the typical RAID10 setup? 2011-02-01 22:05 ` Stan Hoeppner 2011-02-01 23:12 ` Roberto Spadim @ 2011-02-01 23:35 ` Keld Jørn Simonsen 1 sibling, 0 replies; 127+ messages in thread From: Keld Jørn Simonsen @ 2011-02-01 23:35 UTC (permalink / raw) To: Stan Hoeppner; +Cc: Jon Nelson, David Brown, linux-raid On Tue, Feb 01, 2011 at 04:05:23PM -0600, Stan Hoeppner wrote: > Jon Nelson put forth on 2/1/2011 7:50 AM: > > > The performance will not be the same because. Whenever possible, md > > reads from the outermost portion of the disk -- theoretically the > > fastest portion of the disk (by 2 or 3 times as much as the inner > > tracks) -- and in this way raid10,f2 can actually be faster than > > raid0. > > Faster in what regard? I assume you mean purely sequential read, and not random > IOPS. The access patterns of the vast majority of workloads are random, so I > don't see much real world benefit, if what you say is correct. This might > benefit MythTV or similar niche streaming apps. It is mostly interesting for workstations, where one user is the sole user of the system But it is also interesting for a server, where the client is interested in the completion time for a single request. The faster processing of a sequential reads is significant, as far as I can tell. Best regards keld ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-01 10:01 ` David Brown 2011-02-01 13:50 ` Jon Nelson @ 2011-02-01 16:02 ` Keld Jørn Simonsen 2011-02-01 16:24 ` Roberto Spadim ` (2 more replies) 1 sibling, 3 replies; 127+ messages in thread From: Keld Jørn Simonsen @ 2011-02-01 16:02 UTC (permalink / raw) To: David Brown; +Cc: linux-raid On Tue, Feb 01, 2011 at 11:01:33AM +0100, David Brown wrote: > On 31/01/2011 23:52, Keld Jørn Simonsen wrote: > >raid1+0 and Linux MD raid10 are similar, but significantly different > >in a number of ways. Linux MD raid10 can run on only 2 drives. > >Linux raid10,f2 has almost RAID0 striping performance in sequential read. > >You can have an odd number of drives in raid10. > >And you can have as many copies as you like in raid10, > > > > You can make raid10,f2 functionality from raid1+0 by using partitions. > For example, to get a raid10,f2 equivalent on two drives, partition them > into equal halves. Then make md0 a raid1 mirror of sda1 and sdb2, and > md1 a raid1 mirror of sdb1 and sda2. Finally, make md2 a raid0 stripe > set of md0 and md1. I don't think you get the striping performance of raid10,f2 with this layout. And that is one of the main advantages of raid10,f2 layout. Have you tried it out? As far as I can see the layout of blocks are not alternating between the disks. You have one raid1 of sda1 and sdb2, there a file is allocated on blocks sequentially on sda1 and then mirrored on sdb2, where it is also sequentially allocated. That gives no striping. > I don't think there is any way you can get the equivalent of raid10,o2 > in this way. But then, I am not sure how much use raid10,o2 actually is > - are there any usage patterns for which it is faster than raid10,n2 or > raid10,f2? In theory raid10,o2 should have better performance on SSD's because of the low latency, and raid10,o2 doing multireading from each drive, which raid0,n2 does not. We lack some evidence from benchmarks, tho. best regards keld -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-01 16:02 ` Keld Jørn Simonsen @ 2011-02-01 16:24 ` Roberto Spadim 2011-02-01 17:56 ` Keld Jørn Simonsen 2011-02-01 20:32 ` Keld Jørn Simonsen 2011-02-01 21:18 ` David Brown 2 siblings, 1 reply; 127+ messages in thread From: Roberto Spadim @ 2011-02-01 16:24 UTC (permalink / raw) To: Keld Jørn Simonsen; +Cc: David Brown, linux-raid thinking about this: > I don't think you get the striping performance of raid10,f2 with this > layout. And that is one of the main advantages of raid10,f2 layout. > Have you tried it out? since you have a raid1, you don´t need striping, you can read from any mirror, the information is the same, raid1 read is as fast as raid0 read, just write is slower (it must read on each mirror) the only problem is raid0 part, or you use linear or stripe, i think raid10 mdadm algorithm use stripe for raid0 part 2011/2/1 Keld Jørn Simonsen <keld@keldix.com>: > On Tue, Feb 01, 2011 at 11:01:33AM +0100, David Brown wrote: >> On 31/01/2011 23:52, Keld Jørn Simonsen wrote: >> >raid1+0 and Linux MD raid10 are similar, but significantly different >> >in a number of ways. Linux MD raid10 can run on only 2 drives. >> >Linux raid10,f2 has almost RAID0 striping performance in sequential read. >> >You can have an odd number of drives in raid10. >> >And you can have as many copies as you like in raid10, >> > >> >> You can make raid10,f2 functionality from raid1+0 by using partitions. >> For example, to get a raid10,f2 equivalent on two drives, partition them >> into equal halves. Then make md0 a raid1 mirror of sda1 and sdb2, and >> md1 a raid1 mirror of sdb1 and sda2. Finally, make md2 a raid0 stripe >> set of md0 and md1. > > I don't think you get the striping performance of raid10,f2 with this > layout. And that is one of the main advantages of raid10,f2 layout. > Have you tried it out? > > As far as I can see the layout of blocks are not alternating between the > disks. You have one raid1 of sda1 and sdb2, there a file is allocated on > blocks sequentially on sda1 and then mirrored on sdb2, where it is also > sequentially allocated. That gives no striping. > >> I don't think there is any way you can get the equivalent of raid10,o2 >> in this way. But then, I am not sure how much use raid10,o2 actually is >> - are there any usage patterns for which it is faster than raid10,n2 or >> raid10,f2? > > In theory raid10,o2 should have better performance on SSD's because of > the low latency, and raid10,o2 doing multireading from each drive, which > raid0,n2 does not. > > We lack some evidence from benchmarks, tho. > > best regards > keld > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-01 16:24 ` Roberto Spadim @ 2011-02-01 17:56 ` Keld Jørn Simonsen 2011-02-01 18:09 ` Roberto Spadim 0 siblings, 1 reply; 127+ messages in thread From: Keld Jørn Simonsen @ 2011-02-01 17:56 UTC (permalink / raw) To: Roberto Spadim; +Cc: Keld Jørn Simonsen, David Brown, linux-raid On Tue, Feb 01, 2011 at 02:24:01PM -0200, Roberto Spadim wrote: > thinking about this: > > > I don't think you get the striping performance of raid10,f2 with this > > layout. And that is one of the main advantages of raid10,f2 layout. > > Have you tried it out? > > since you have a raid1, you don?t need striping, you can read from any > mirror, the information is the same, raid1 read is as fast as raid0 > read, just write is slower (it must read on each mirror) > the only problem is raid0 part, or you use linear or stripe, i think > raid10 mdadm algorithm use stripe for raid0 part well, raid0 is for reading sequentially, about double as fast as raid1. https://raid.wiki.kernel.org/index.php/Performance best regards keld > 2011/2/1 Keld Jørn Simonsen <keld@keldix.com>: > > On Tue, Feb 01, 2011 at 11:01:33AM +0100, David Brown wrote: > >> On 31/01/2011 23:52, Keld Jørn Simonsen wrote: > >> >raid1+0 and Linux MD raid10 are similar, but significantly different > >> >in a number of ways. Linux MD raid10 can run on only 2 drives. > >> >Linux raid10,f2 has almost RAID0 striping performance in sequential read. > >> >You can have an odd number of drives in raid10. > >> >And you can have as many copies as you like in raid10, > >> > > >> > >> You can make raid10,f2 functionality from raid1+0 by using partitions. > >> For example, to get a raid10,f2 equivalent on two drives, partition them > >> into equal halves. Then make md0 a raid1 mirror of sda1 and sdb2, and > >> md1 a raid1 mirror of sdb1 and sda2. Finally, make md2 a raid0 stripe > >> set of md0 and md1. > > > > I don't think you get the striping performance of raid10,f2 with this > > layout. And that is one of the main advantages of raid10,f2 layout. > > Have you tried it out? > > > > As far as I can see the layout of blocks are not alternating between the > > disks. You have one raid1 of sda1 and sdb2, there a file is allocated on > > blocks sequentially on sda1 and then mirrored on sdb2, where it is also > > sequentially allocated. That gives no striping. > > > >> I don't think there is any way you can get the equivalent of raid10,o2 > >> in this way. But then, I am not sure how much use raid10,o2 actually is > >> - are there any usage patterns for which it is faster than raid10,n2 or > >> raid10,f2? > > > > In theory raid10,o2 should have better performance on SSD's because of > > the low latency, and raid10,o2 doing multireading from each drive, which > > raid0,n2 does not. > > > > We lack some evidence from benchmarks, tho. 
> > > > best regards > > keld > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-01 17:56 ` Keld Jørn Simonsen @ 2011-02-01 18:09 ` Roberto Spadim 2011-02-01 20:16 ` Keld Jørn Simonsen 0 siblings, 1 reply; 127+ messages in thread From: Roberto Spadim @ 2011-02-01 18:09 UTC (permalink / raw) To: Keld Jørn Simonsen; +Cc: David Brown, linux-raid ok, but you are wrong... > well, raid0 is for reading sequentially, about double as fast as raid1. > https://raid.wiki.kernel.org/index.php/Performance it´s true for a mix of write/read (write is the time of slowest drive) but if you just read, raid0 and raid1 have the same read time (depend on implementation, but can have the same speed) 2011/2/1 Keld Jørn Simonsen <keld@keldix.com>: > On Tue, Feb 01, 2011 at 02:24:01PM -0200, Roberto Spadim wrote: >> thinking about this: >> >> > I don't think you get the striping performance of raid10,f2 with this >> > layout. And that is one of the main advantages of raid10,f2 layout. >> > Have you tried it out? >> >> since you have a raid1, you don?t need striping, you can read from any >> mirror, the information is the same, raid1 read is as fast as raid0 >> read, just write is slower (it must read on each mirror) >> the only problem is raid0 part, or you use linear or stripe, i think >> raid10 mdadm algorithm use stripe for raid0 part > > well, raid0 is for reading sequentially, about double as fast as raid1. > https://raid.wiki.kernel.org/index.php/Performance > > best regards > keld > >> 2011/2/1 Keld Jørn Simonsen <keld@keldix.com>: >> > On Tue, Feb 01, 2011 at 11:01:33AM +0100, David Brown wrote: >> >> On 31/01/2011 23:52, Keld Jørn Simonsen wrote: >> >> >raid1+0 and Linux MD raid10 are similar, but significantly different >> >> >in a number of ways. Linux MD raid10 can run on only 2 drives. >> >> >Linux raid10,f2 has almost RAID0 striping performance in sequential read. >> >> >You can have an odd number of drives in raid10. >> >> >And you can have as many copies as you like in raid10, >> >> > >> >> >> >> You can make raid10,f2 functionality from raid1+0 by using partitions. >> >> For example, to get a raid10,f2 equivalent on two drives, partition them >> >> into equal halves. Then make md0 a raid1 mirror of sda1 and sdb2, and >> >> md1 a raid1 mirror of sdb1 and sda2. Finally, make md2 a raid0 stripe >> >> set of md0 and md1. >> > >> > I don't think you get the striping performance of raid10,f2 with this >> > layout. And that is one of the main advantages of raid10,f2 layout. >> > Have you tried it out? >> > >> > As far as I can see the layout of blocks are not alternating between the >> > disks. You have one raid1 of sda1 and sdb2, there a file is allocated on >> > blocks sequentially on sda1 and then mirrored on sdb2, where it is also >> > sequentially allocated. That gives no striping. >> > >> >> I don't think there is any way you can get the equivalent of raid10,o2 >> >> in this way. But then, I am not sure how much use raid10,o2 actually is >> >> - are there any usage patterns for which it is faster than raid10,n2 or >> >> raid10,f2? >> > >> > In theory raid10,o2 should have better performance on SSD's because of >> > the low latency, and raid10,o2 doing multireading from each drive, which >> > raid0,n2 does not. >> > >> > We lack some evidence from benchmarks, tho. 
>> > >> > best regards >> > keld >> > -- >> > To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> > the body of a message to majordomo@vger.kernel.org >> > More majordomo info at http://vger.kernel.org/majordomo-info.html >> > >> >> >> >> -- >> Roberto Spadim >> Spadim Technology / SPAEmpresarial >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-01 18:09 ` Roberto Spadim @ 2011-02-01 20:16 ` Keld Jørn Simonsen 0 siblings, 0 replies; 127+ messages in thread From: Keld Jørn Simonsen @ 2011-02-01 20:16 UTC (permalink / raw) To: Roberto Spadim; +Cc: Keld Jørn Simonsen, David Brown, linux-raid On Tue, Feb 01, 2011 at 04:09:03PM -0200, Roberto Spadim wrote: > ok, but you are wrong... > > well, raid0 is for reading sequentially, about double as fast as raid1. > > https://raid.wiki.kernel.org/index.php/Performance > > it's true for a mix of write/read (write is the time of slowest drive) > > but if you just read, raid0 and raid1 have the same read time (depend > on implementation, but can have the same speed) For sequential reading, this is not true. For random reading and writing I agree with you in theory, but benchmarks show that it is not so, at least for Linux RAID, viz the above URL. Have you got benchmarks to shed some light on this? best regards keld ^ permalink raw reply [flat|nested] 127+ messages in thread
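Since both sides keep pointing out that benchmarks are missing, one crude way to test the sequential-read claim is to time a large direct read from each candidate array (an untested sketch; /dev/mdX is a placeholder for the array under test, and iflag=direct is used so the page cache does not distort the result):

  dd if=/dev/mdX of=/dev/null bs=1M count=4096 iflag=direct

Running the same command against a plain raid1, the raid0-over-raid1 stack and a native raid10,f2 array built from the same disks would give comparable throughput figures for this particular (single-stream, sequential) workload.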
* Re: What's the typical RAID10 setup? 2011-02-01 16:02 ` Keld Jørn Simonsen 2011-02-01 16:24 ` Roberto Spadim @ 2011-02-01 20:32 ` Keld Jørn Simonsen 2011-02-01 20:58 ` Roberto Spadim 2011-02-01 21:18 ` David Brown 2 siblings, 1 reply; 127+ messages in thread From: Keld Jørn Simonsen @ 2011-02-01 20:32 UTC (permalink / raw) To: Keld Jørn Simonsen; +Cc: David Brown, linux-raid On Tue, Feb 01, 2011 at 05:02:46PM +0100, Keld Jørn Simonsen wrote: > On Tue, Feb 01, 2011 at 11:01:33AM +0100, David Brown wrote: > > On 31/01/2011 23:52, Keld Jørn Simonsen wrote: > > >raid1+0 and Linux MD raid10 are similar, but significantly different > > >in a number of ways. Linux MD raid10 can run on only 2 drives. > > >Linux raid10,f2 has almost RAID0 striping performance in sequential read. > > >You can have an odd number of drives in raid10. > > >And you can have as many copies as you like in raid10, > > > > > > > You can make raid10,f2 functionality from raid1+0 by using partitions. > > For example, to get a raid10,f2 equivalent on two drives, partition them > > into equal halves. Then make md0 a raid1 mirror of sda1 and sdb2, and > > md1 a raid1 mirror of sdb1 and sda2. Finally, make md2 a raid0 stripe > > set of md0 and md1. > > I don't think you get the striping performance of raid10,f2 with this > layout. And that is one of the main advantages of raid10,f2 layout. > Have you tried it out? > > As far as I can see the layout of blocks are not alternating between the > disks. You have one raid1 of sda1 and sdb2, there a file is allocated on > blocks sequentially on sda1 and then mirrored on sdb2, where it is also > sequentially allocated. That gives no striping. Well, maybe the RAID0 layer provides the adequate striping. I am not sure, but it looks like it could hold in theory. One could try it out. One advantage of this scheme could be improved survival probability when 2 drives fail, e.g. in the case of a 4-drive array. The probability of survival of a running system could then be enhanced from 33% to 66%. One problem could be the choice of always the lowest block number, which is secured in raid10,f2, but not in a raid0 over raid1 (or raid10,n2) scenario. best regards keld -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
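The 33% and 66% figures can be checked by hand, assuming a 4-drive array and exactly two simultaneous drive failures with all pairs equally likely: there are C(4,2) = 6 possible pairs of failed drives. In a stripe over two mirrored pairs, data is lost only when both members of the same mirror fail, which is 2 of the 6 pairs, so the array survives 4/6, about 66%. In a 4-drive far-2 layout, where each drive's chunks are mirrored onto the next drive in the rotation, any of the 4 "adjacent" pairs is fatal and only the 2 non-adjacent pairs are survivable, i.e. about 33%.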
* Re: What's the typical RAID10 setup? 2011-02-01 20:32 ` Keld Jørn Simonsen @ 2011-02-01 20:58 ` Roberto Spadim 2011-02-01 21:04 ` Roberto Spadim 0 siblings, 1 reply; 127+ messages in thread From: Roberto Spadim @ 2011-02-01 20:58 UTC (permalink / raw) To: Keld Jørn Simonsen; +Cc: David Brown, linux-raid > For sequential reading, this is not true. For random reading and > writing I agree with you in theory, but benchmarks show that it is not > so, at least for Linux RAID, viz the above URL. i agree with you, since linux algorith for raid1 is closest head, not round robin or time based there´s some patch on internet (google it: round robin raid1 linux) for roundrobin, but none for time based =( it´s a point of optimization of today raid1 algorithm round robin (may be at this mail list) http://www.spinics.net/lists/raid/msg30003.html 2011/2/1 Keld Jørn Simonsen <keld@keldix.com>: > On Tue, Feb 01, 2011 at 05:02:46PM +0100, Keld Jørn Simonsen wrote: >> On Tue, Feb 01, 2011 at 11:01:33AM +0100, David Brown wrote: >> > On 31/01/2011 23:52, Keld Jørn Simonsen wrote: >> > >raid1+0 and Linux MD raid10 are similar, but significantly different >> > >in a number of ways. Linux MD raid10 can run on only 2 drives. >> > >Linux raid10,f2 has almost RAID0 striping performance in sequential read. >> > >You can have an odd number of drives in raid10. >> > >And you can have as many copies as you like in raid10, >> > > >> > >> > You can make raid10,f2 functionality from raid1+0 by using partitions. >> > For example, to get a raid10,f2 equivalent on two drives, partition them >> > into equal halves. Then make md0 a raid1 mirror of sda1 and sdb2, and >> > md1 a raid1 mirror of sdb1 and sda2. Finally, make md2 a raid0 stripe >> > set of md0 and md1. >> >> I don't think you get the striping performance of raid10,f2 with this >> layout. And that is one of the main advantages of raid10,f2 layout. >> Have you tried it out? >> >> As far as I can see the layout of blocks are not alternating between the >> disks. You have one raid1 of sda1 and sdb2, there a file is allocated on >> blocks sequentially on sda1 and then mirrored on sdb2, where it is also >> sequentially allocated. That gives no striping. > > Well, maybe the RAID0 layer provides the adequate striping. > I am noy sure, but it looks like it could hold in theory. > One could try it out. > > One advantage of this scheme could be improved probability > When 2 drives fail, eg. in the case of a 4 drive array. > The probability of survival of a running system could then > be enhaced form 33 % to 66 %. > > One problem could be the choice of always the lowest block number, which > is secured in raid10,f2, but not in a raid0 over raid1 (or raid10,n2) scenario. > > best regards > keld > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-01 20:58 ` Roberto Spadim @ 2011-02-01 21:04 ` Roberto Spadim 0 siblings, 0 replies; 127+ messages in thread From: Roberto Spadim @ 2011-02-01 21:04 UTC (permalink / raw) To: Keld Jørn Simonsen; +Cc: David Brown, linux-raid the problem for benchmark on linux (mdadm) raid implementation (raid1 vs raid0) is: raid1 read balance(closest head) is diferent from raid0 read balance(stripe) algorithm it´s good but can´t be as fast as raid0 for read intensive (sequential) the problem of closest head algorithm is it can´t paralelize reads like raid0 paralelize 2011/2/1 Roberto Spadim <roberto@spadim.com.br>: >> For sequential reading, this is not true. For random reading and >> writing I agree with you in theory, but benchmarks show that it is not >> so, at least for Linux RAID, viz the above URL. > > i agree with you, since linux algorith for raid1 is closest head, not > round robin or time based > > there´s some patch on internet (google it: round robin raid1 linux) > for roundrobin, but none for time based =( > it´s a point of optimization of today raid1 algorithm > > round robin (may be at this mail list) > http://www.spinics.net/lists/raid/msg30003.html > > 2011/2/1 Keld Jørn Simonsen <keld@keldix.com>: >> On Tue, Feb 01, 2011 at 05:02:46PM +0100, Keld Jørn Simonsen wrote: >>> On Tue, Feb 01, 2011 at 11:01:33AM +0100, David Brown wrote: >>> > On 31/01/2011 23:52, Keld Jørn Simonsen wrote: >>> > >raid1+0 and Linux MD raid10 are similar, but significantly different >>> > >in a number of ways. Linux MD raid10 can run on only 2 drives. >>> > >Linux raid10,f2 has almost RAID0 striping performance in sequential read. >>> > >You can have an odd number of drives in raid10. >>> > >And you can have as many copies as you like in raid10, >>> > > >>> > >>> > You can make raid10,f2 functionality from raid1+0 by using partitions. >>> > For example, to get a raid10,f2 equivalent on two drives, partition them >>> > into equal halves. Then make md0 a raid1 mirror of sda1 and sdb2, and >>> > md1 a raid1 mirror of sdb1 and sda2. Finally, make md2 a raid0 stripe >>> > set of md0 and md1. >>> >>> I don't think you get the striping performance of raid10,f2 with this >>> layout. And that is one of the main advantages of raid10,f2 layout. >>> Have you tried it out? >>> >>> As far as I can see the layout of blocks are not alternating between the >>> disks. You have one raid1 of sda1 and sdb2, there a file is allocated on >>> blocks sequentially on sda1 and then mirrored on sdb2, where it is also >>> sequentially allocated. That gives no striping. >> >> Well, maybe the RAID0 layer provides the adequate striping. >> I am noy sure, but it looks like it could hold in theory. >> One could try it out. >> >> One advantage of this scheme could be improved probability >> When 2 drives fail, eg. in the case of a 4 drive array. >> The probability of survival of a running system could then >> be enhaced form 33 % to 66 %. >> >> One problem could be the choice of always the lowest block number, which >> is secured in raid10,f2, but not in a raid0 over raid1 (or raid10,n2) scenario. 
>> >> best regards >> keld >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
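The difference between the two balancers is easy to observe directly, for anyone who wants to gather the missing numbers: start a single large sequential read from the array and watch per-disk activity in another terminal (a rough observation aid, not a rigorous benchmark):

  iostat -x 1

With the closest-head balancer, a lone sequential reader on raid1 will normally be served from one member disk, while on raid10,f2 both members should show read traffic, each delivering roughly half of the data - which is where the near-RAID0 sequential figures come from.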
* Re: What's the typical RAID10 setup? 2011-02-01 16:02 ` Keld Jørn Simonsen 2011-02-01 16:24 ` Roberto Spadim 2011-02-01 20:32 ` Keld Jørn Simonsen @ 2011-02-01 21:18 ` David Brown 2 siblings, 0 replies; 127+ messages in thread From: David Brown @ 2011-02-01 21:18 UTC (permalink / raw) To: linux-raid On 01/02/11 17:02, Keld Jørn Simonsen wrote: > On Tue, Feb 01, 2011 at 11:01:33AM +0100, David Brown wrote: >> On 31/01/2011 23:52, Keld Jørn Simonsen wrote: >>> raid1+0 and Linux MD raid10 are similar, but significantly different >>> in a number of ways. Linux MD raid10 can run on only 2 drives. >>> Linux raid10,f2 has almost RAID0 striping performance in sequential read. >>> You can have an odd number of drives in raid10. >>> And you can have as many copies as you like in raid10, >>> >> >> You can make raid10,f2 functionality from raid1+0 by using partitions. >> For example, to get a raid10,f2 equivalent on two drives, partition them >> into equal halves. Then make md0 a raid1 mirror of sda1 and sdb2, and >> md1 a raid1 mirror of sdb1 and sda2. Finally, make md2 a raid0 stripe >> set of md0 and md1. > > I don't think you get the striping performance of raid10,f2 with this > layout. And that is one of the main advantages of raid10,f2 layout. > Have you tried it out? No, I haven't tried it yet. I've got four disks in this PC with an empty partition on each specifically for testing such things, but I haven't taken the time to try it properly. But I believe you will get the striping performance - the two raid1 parts are striped together as raid0, and they can both be accessed in parallel. > > As far as I can see the layout of blocks are not alternating between the > disks. You have one raid1 of sda1 and sdb2, there a file is allocated on > blocks sequentially on sda1 and then mirrored on sdb2, where it is also > sequentially allocated. That gives no striping. > Suppose your data blocks are 0, 1, 2, 3, ... where each block is half a raid0 stripe. Then the arrangement of this data on raid10,f2 is: sda: 0 2 4 6 .... 1 3 5 7 .... sdb: 1 3 5 7 .... 0 2 4 6 .... The arrangement inside my md2 is (striped but not mirrored) : md0: 0 2 4 6 .... md1: 1 3 5 7 .... Inside md0 (mirrored) is then: sda1: 0 2 4 6 .... sdb2: 0 2 4 6 .... Inside md1 (mirrored) it is: sdb1: 1 3 5 7 .... sda2: 1 3 5 7 .... Thus inside the disks themselves you have sda: 0 2 4 6 .... 1 3 5 7 .... sdb: 1 3 5 7 .... 0 2 4 6 .... >> I don't think there is any way you can get the equivalent of raid10,o2 >> in this way. But then, I am not sure how much use raid10,o2 actually is >> - are there any usage patterns for which it is faster than raid10,n2 or >> raid10,f2? > > In theory raid10,o2 should have better performance on SSD's because of > the low latency, and raid10,o2 doing multireading from each drive, which > raid0,n2 does not. > I think it should beat raid10,n2 for some things - because of multireading. But I don't see it being faster than raid10,f2, which multi-reads even better. In particular with SSD's, the disadvantage of raid10,f2 - the large head movements on writes - disappears. > We lack some evidence from benchmarks, tho. > Indeed. -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
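For anyone who wants to run the comparison David and Keld discuss, the native counterpart of the nested raid0-over-raid1 construction can be created directly and its layout verified afterwards (an untested sketch; sdX/sdY and the md number are placeholders):

  mdadm --create /dev/md9 --level=10 --layout=f2 --raid-devices=2 /dev/sdX /dev/sdY
  mdadm --detail /dev/md9
  cat /proc/mdstat

mdadm --detail should report the far layout for the array, and /proc/mdstat shows it as well, which makes it easy to confirm that the array being benchmarked really is raid10,f2 rather than the default near layout.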
* Re: What's the typical RAID10 setup? 2011-01-31 21:27 ` Jon Nelson 2011-01-31 21:47 ` Roberto Spadim @ 2011-02-01 0:58 ` Stan Hoeppner 2011-02-01 12:50 ` Roman Mamedov 2011-02-03 11:04 ` Keld Jørn Simonsen 2011-02-01 8:46 ` hansbkk 2 siblings, 2 replies; 127+ messages in thread From: Stan Hoeppner @ 2011-02-01 0:58 UTC (permalink / raw) To: Jon Nelson Cc: Mathias Burén, Roberto Spadim, Keld Jørn Simonsen, Denis, Linux-RAID Jon Nelson put forth on 1/31/2011 3:27 PM: > Before this goes any further, why not just reference the excellent > Wikipedia article (actually, excellent applies to both Wikipedia *and* > the article): > > http://en.wikipedia.org/wiki/Non-standard_RAID_levels#Linux_MD_RAID_10 > > The only problem I have with the wikipedia article is the assertion > that Linux MD RAID 10 is non-standard. It's as standard as anything > else is in this world. Unfortunately there is no organization, no standards body, that defines RAID levels. The use of the word "standard" in the article means "de facto" standard. All of the major and lesser expansion slot card and external RAID controller vendors have been using the same RAID level terminology for two decades. In this case, some have used "RAID 10" while others have used "RAID 1+0" for the same algorithm. Some used to use "RAID 10" and switched to using "RAID 1+0" recently o avoid perceived confusion in the marketplace. NONE of them include a RAID "10" or "1+0" implementation that works with only 3 disks, or even 2 disks. In the hardware industry, RAID "10" or "1+0" always means a stripe of mirrored pairs, 4 drives (devices) being the minimum required. I believe the "non standard" description is right on the money though because Linux mdraid is the only software/hardware solution that offers these other "layouts". The reason I disdain these multiple layouts has nothing to do with their technical merit, but the fact that it's almost impossible to discuss some things because we don't know what the heck each other are talking about. An OP may say "RAID 10" on this list, but is actually referring to one of the layouts that isn't the classic striped mirrors. Such as when discussing how many drives can fail and you're still up and running. That is well defined for traditional RAID 10, but not so well, at least from my perspective, for these "non standard" RAID 10 layouts. -- Stan ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-01 0:58 ` Stan Hoeppner @ 2011-02-01 12:50 ` Roman Mamedov 2011-02-03 11:04 ` Keld Jørn Simonsen 1 sibling, 0 replies; 127+ messages in thread From: Roman Mamedov @ 2011-02-01 12:50 UTC (permalink / raw) To: Stan Hoeppner Cc: Jon Nelson, Mathias Burén, Roberto Spadim, Keld Jørn Simonsen, Denis, Linux-RAID [-- Attachment #1: Type: text/plain, Size: 821 bytes --] On Mon, 31 Jan 2011 18:58:29 -0600 Stan Hoeppner <stan@hardwarefreak.com> wrote: > Unfortunately there is no organization, no standards body, that defines RAID > levels. There actually is/was: http://www.freeopenbook.com/upgrading-repairing-pc/ch07lev1sec6.html "An organization called the RAID Advisory Board (RAB) was formed in July 1992 to standardize, classify, and educate on the subject of RAID. The RAB has developed specifications for RAID, a conformance program for the various RAID levels, and a classification program for RAID hardware." Just who the fsck they are and who gave them the authority to define anything, is another question. http://www.raid-advisory.com/ is mentioned in various sources as RAB's website, currently a placeholder domain with spam. -- With respect, Roman [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-01 0:58 ` Stan Hoeppner 2011-02-01 12:50 ` Roman Mamedov @ 2011-02-03 11:04 ` Keld Jørn Simonsen 2011-02-03 14:17 ` Roberto Spadim 2011-02-03 23:43 ` Stan Hoeppner 1 sibling, 2 replies; 127+ messages in thread From: Keld Jørn Simonsen @ 2011-02-03 11:04 UTC (permalink / raw) To: Stan Hoeppner Cc: Jon Nelson, Mathias Burén, Roberto Spadim, Keld Jørn Simonsen, Denis, Linux-RAID On Mon, Jan 31, 2011 at 06:58:29PM -0600, Stan Hoeppner wrote: > Jon Nelson put forth on 1/31/2011 3:27 PM: > > Before this goes any further, why not just reference the excellent > > Wikipedia article (actually, excellent applies to both Wikipedia *and* > > the article): > > > > http://en.wikipedia.org/wiki/Non-standard_RAID_levels#Linux_MD_RAID_10 > > > > The only problem I have with the wikipedia article is the assertion > > that Linux MD RAID 10 is non-standard. It's as standard as anything > > else is in this world. > > Unfortunately there is no organization, no standards body, that defines RAID > levels. Well there is an organisation that does just that, namely SNIA. http://www.snia.org The RAID levels are defined in DDF - a "SNIA" standard. http://www.snia.org/tech_activities/standards/curr_standards/ddf/ (Info courtesy of Neil Brown) best regards keld ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-03 11:04 ` Keld Jørn Simonsen @ 2011-02-03 14:17 ` Roberto Spadim 2011-02-03 15:54 ` Keld Jørn Simonsen 2011-02-03 23:43 ` Stan Hoeppner 1 sibling, 1 reply; 127+ messages in thread From: Roberto Spadim @ 2011-02-03 14:17 UTC (permalink / raw) To: Keld Jørn Simonsen Cc: Stan Hoeppner, Jon Nelson, Mathias Burén, Denis, Linux-RAID nice, could we put a link to snia ddf at raid wiki? The RAID levels are defined in DDF - a "SNIA" standard. http://www.snia.org/tech_activities/standards/curr_standards/ddf/ 2011/2/3 Keld Jørn Simonsen <keld@keldix.com>: > On Mon, Jan 31, 2011 at 06:58:29PM -0600, Stan Hoeppner wrote: >> Jon Nelson put forth on 1/31/2011 3:27 PM: >> > Before this goes any further, why not just reference the excellent >> > Wikipedia article (actually, excellent applies to both Wikipedia *and* >> > the article): >> > >> > http://en.wikipedia.org/wiki/Non-standard_RAID_levels#Linux_MD_RAID_10 >> > >> > The only problem I have with the wikipedia article is the assertion >> > that Linux MD RAID 10 is non-standard. It's as standard as anything >> > else is in this world. >> >> Unfortunately there is no organization, no standards body, that defines RAID >> levels. > > Well there is an organisation that does just that, namely SNIA. > > http://www.snia.org > > The RAID levels are defined in DDF - a "SNIA" standard. > > http://www.snia.org/tech_activities/standards/curr_standards/ddf/ > > (Info courtesey of Neil Brown) > > best regards > keld > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-03 14:17 ` Roberto Spadim @ 2011-02-03 15:54 ` Keld Jørn Simonsen 2011-02-03 18:39 ` Keld Jørn Simonsen 0 siblings, 1 reply; 127+ messages in thread From: Keld Jørn Simonsen @ 2011-02-03 15:54 UTC (permalink / raw) To: Roberto Spadim Cc: Keld Jørn Simonsen, Stan Hoeppner, Jon Nelson, Mathias Burén, Denis, Linux-RAID On Thu, Feb 03, 2011 at 12:17:39PM -0200, Roberto Spadim wrote: > nice, could we put a link to snia ddf at raid wiki? > > The RAID levels are defined in DDF - a "SNIA" standard. > http://www.snia.org/tech_activities/standards/curr_standards/ddf/ Yes, it was my plan to add it to our wiki pages, and to Wikipedia. As I could not find the info there in the first place, but I knew that there was something like that... best regards keld > > 2011/2/3 Keld Jørn Simonsen <keld@keldix.com>: > > On Mon, Jan 31, 2011 at 06:58:29PM -0600, Stan Hoeppner wrote: > >> Jon Nelson put forth on 1/31/2011 3:27 PM: > >> > Before this goes any further, why not just reference the excellent > >> > Wikipedia article (actually, excellent applies to both Wikipedia *and* > >> > the article): > >> > > >> > http://en.wikipedia.org/wiki/Non-standard_RAID_levels#Linux_MD_RAID_10 > >> > > >> > The only problem I have with the wikipedia article is the assertion > >> > that Linux MD RAID 10 is non-standard. It's as standard as anything > >> > else is in this world. > >> > >> Unfortunately there is no organization, no standards body, that defines RAID > >> levels. > > > > Well there is an organisation that does just that, namely SNIA. > > > > http://www.snia.org > > > > The RAID levels are defined in DDF - a "SNIA" standard. > > > > http://www.snia.org/tech_activities/standards/curr_standards/ddf/ > > > > (Info courtesey of Neil Brown) > > > > best regards > > keld > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-03 15:54 ` Keld Jørn Simonsen @ 2011-02-03 18:39 ` Keld Jørn Simonsen 2011-02-03 18:41 ` Roberto Spadim 0 siblings, 1 reply; 127+ messages in thread From: Keld Jørn Simonsen @ 2011-02-03 18:39 UTC (permalink / raw) To: Keld Jørn Simonsen Cc: Roberto Spadim, Stan Hoeppner, Jon Nelson, Mathias Burén, Denis, Linux-RAID On Thu, Feb 03, 2011 at 04:54:28PM +0100, Keld Jørn Simonsen wrote: > On Thu, Feb 03, 2011 at 12:17:39PM -0200, Roberto Spadim wrote: > > nice, could we put a link to snia ddf at raid wiki? > > > > The RAID levels are defined in DDF - a "SNIA" standard. > > http://www.snia.org/tech_activities/standards/curr_standards/ddf/ > > Yes, it was my plan to add it to our wiki pages, and to Wikipedia. > As I could not find the info there in the first place, but I knew > that there was something like that... OK, I added it to our wiki and to wikipedia, but maybe not in the most appropiate way. best regards keld > > > > > 2011/2/3 Keld Jørn Simonsen <keld@keldix.com>: > > > On Mon, Jan 31, 2011 at 06:58:29PM -0600, Stan Hoeppner wrote: > > >> Jon Nelson put forth on 1/31/2011 3:27 PM: > > >> > Before this goes any further, why not just reference the excellent > > >> > Wikipedia article (actually, excellent applies to both Wikipedia *and* > > >> > the article): > > >> > > > >> > http://en.wikipedia.org/wiki/Non-standard_RAID_levels#Linux_MD_RAID_10 > > >> > > > >> > The only problem I have with the wikipedia article is the assertion > > >> > that Linux MD RAID 10 is non-standard. It's as standard as anything > > >> > else is in this world. > > >> > > >> Unfortunately there is no organization, no standards body, that defines RAID > > >> levels. > > > > > > Well there is an organisation that does just that, namely SNIA. > > > > > > http://www.snia.org > > > > > > The RAID levels are defined in DDF - a "SNIA" standard. > > > > > > http://www.snia.org/tech_activities/standards/curr_standards/ddf/ > > > > > > (Info courtesey of Neil Brown) > > > > > > best regards > > > keld > > > -- > > > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > > > the body of a message to majordomo@vger.kernel.org > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > > > > > > > -- > > Roberto Spadim > > Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-03 18:39 ` Keld Jørn Simonsen @ 2011-02-03 18:41 ` Roberto Spadim 0 siblings, 0 replies; 127+ messages in thread From: Roberto Spadim @ 2011-02-03 18:41 UTC (permalink / raw) To: Keld Jørn Simonsen Cc: Stan Hoeppner, Jon Nelson, Mathias Burén, Denis, Linux-RAID =] heehe no problem 2011/2/3 Keld Jørn Simonsen <keld@keldix.com>: > On Thu, Feb 03, 2011 at 04:54:28PM +0100, Keld Jørn Simonsen wrote: >> On Thu, Feb 03, 2011 at 12:17:39PM -0200, Roberto Spadim wrote: >> > nice, could we put a link to snia ddf at raid wiki? >> > >> > The RAID levels are defined in DDF - a "SNIA" standard. >> > http://www.snia.org/tech_activities/standards/curr_standards/ddf/ >> >> Yes, it was my plan to add it to our wiki pages, and to Wikipedia. >> As I could not find the info there in the first place, but I knew >> that there was something like that... > > OK, I added it to our wiki and to wikipedia, but maybe not in the most > appropiate way. > > best regards > keld >> >> > >> > 2011/2/3 Keld Jørn Simonsen <keld@keldix.com>: >> > > On Mon, Jan 31, 2011 at 06:58:29PM -0600, Stan Hoeppner wrote: >> > >> Jon Nelson put forth on 1/31/2011 3:27 PM: >> > >> > Before this goes any further, why not just reference the excellent >> > >> > Wikipedia article (actually, excellent applies to both Wikipedia *and* >> > >> > the article): >> > >> > >> > >> > http://en.wikipedia.org/wiki/Non-standard_RAID_levels#Linux_MD_RAID_10 >> > >> > >> > >> > The only problem I have with the wikipedia article is the assertion >> > >> > that Linux MD RAID 10 is non-standard. It's as standard as anything >> > >> > else is in this world. >> > >> >> > >> Unfortunately there is no organization, no standards body, that defines RAID >> > >> levels. >> > > >> > > Well there is an organisation that does just that, namely SNIA. >> > > >> > > http://www.snia.org >> > > >> > > The RAID levels are defined in DDF - a "SNIA" standard. >> > > >> > > http://www.snia.org/tech_activities/standards/curr_standards/ddf/ >> > > >> > > (Info courtesey of Neil Brown) >> > > >> > > best regards >> > > keld >> > > -- >> > > To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> > > the body of a message to majordomo@vger.kernel.org >> > > More majordomo info at http://vger.kernel.org/majordomo-info.html >> > > >> > >> > >> > >> > -- >> > Roberto Spadim >> > Spadim Technology / SPAEmpresarial > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-03 11:04 ` Keld Jørn Simonsen 2011-02-03 14:17 ` Roberto Spadim @ 2011-02-03 23:43 ` Stan Hoeppner 2011-02-04 3:49 ` hansbkk 2011-02-04 7:06 ` Keld Jørn Simonsen 1 sibling, 2 replies; 127+ messages in thread From: Stan Hoeppner @ 2011-02-03 23:43 UTC (permalink / raw) To: Keld Jørn Simonsen Cc: Jon Nelson, Mathias Burén, Roberto Spadim, Denis, Linux-RAID Keld Jørn Simonsen put forth on 2/3/2011 5:04 AM: > On Mon, Jan 31, 2011 at 06:58:29PM -0600, Stan Hoeppner wrote: >> Jon Nelson put forth on 1/31/2011 3:27 PM: >>> Before this goes any further, why not just reference the excellent >>> Wikipedia article (actually, excellent applies to both Wikipedia *and* >>> the article): >>> >>> http://en.wikipedia.org/wiki/Non-standard_RAID_levels#Linux_MD_RAID_10 >>> >>> The only problem I have with the wikipedia article is the assertion >>> that Linux MD RAID 10 is non-standard. It's as standard as anything >>> else is in this world. >> >> Unfortunately there is no organization, no standards body, that defines RAID >> levels. > > Well there is an organisation that does just that, namely SNIA. I should have qualified that with "defines RAID levels the entire industry accepts/adopts". Unfortunately SNIA is not a standards body or working group, such as PCI-SIG, or IETF, whose specifications entire industries _do_ accept/adopt. > http://www.snia.org > > The RAID levels are defined in DDF - a "SNIA" standard. Exactly. It's an SNIA standard. Unfortunately SNIA doesn't carry sufficient weight to drive full adoption. I commend them for trying though. > http://www.snia.org/tech_activities/standards/curr_standards/ddf/ > > (Info courtesey of Neil Brown) Please note that the SNIA Disk Data Format document doesn't define RAID 10 at all. Yet there is a single mention of RAID 10 in the entire document: "RAID-1E 0x11 >2 disk RAID-1, similar to RAID-10 but with striping integrated into array" They don't define RAID 10, but they reference it. Thus one can only assume that SNIA _assumes_ RAID 10 is already well defined in industry to reference it in such a manner without previously defining it in the document. Does anyone else find this reference to a RAID level omitted in their definitions a little more than interesting? This RAID 10 omission is especially interesting considering that RAID 10 dominates the storage back ends of Fortune 1000 companies, specifically beneath databases and high transaction load systems such as enterprise mail. They've omitted defining the one RAID level with the best combination of high performance, most resilience, and greatest penetration of the "high end" of computing in the history of RAID. This begs the question: "Why?" Something smells bad here. Does one of the RAID companies own a patent or trademark on "RAID 10"? I'll look into this. It just doesn't make any sense for RAID 10 to be omitted from the SNIA DDF but to be referenced in the manner it is. -- Stan -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-03 23:43 ` Stan Hoeppner @ 2011-02-04 3:49 ` hansbkk 2011-02-04 7:06 ` Keld Jørn Simonsen 1 sibling, 0 replies; 127+ messages in thread From: hansbkk @ 2011-02-04 3:49 UTC (permalink / raw) To: Linux-RAID On Fri, Feb 4, 2011 at 6:43 AM, Stan Hoeppner <stan@hardwarefreak.com> wrote: > Keld Jørn Simonsen put forth on 2/3/2011 5:04 AM: >>> Unfortunately there is no organization, no standards body, that defines RAID >>> levels. > > Something smells bad here. Does one of the RAID companies own a patent or > trademark on "RAID 10"? I'll look into this. It just doesn't make any sense > for RAID 10 to be omitted from the SNIA DDF but to be referenced in the manner > it is. For the sake of practical clarity, would it be possible for the list to simply agree to use RAID1+0 = the "outside-the-mdadm-world" meaning of RAID10 and something like "md RAID10" for "our" RAID10 and to just not use plain "RAID10" at all, to avoid confusion and move on? Others more senior can of course propose other syntax; my goal is to simply avoid the confusion caused by ambiguous terminology, and the unnecessary friction that seems to come up every time these terms are discussed. -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-03 23:43 ` Stan Hoeppner 2011-02-04 3:49 ` hansbkk @ 2011-02-04 7:06 ` Keld Jørn Simonsen 2011-02-04 8:27 ` Stan Hoeppner 1 sibling, 1 reply; 127+ messages in thread From: Keld Jørn Simonsen @ 2011-02-04 7:06 UTC (permalink / raw) To: Stan Hoeppner Cc: Keld Jørn Simonsen, Jon Nelson, Mathias Burén, Roberto Spadim, Denis, Linux-RAID On Thu, Feb 03, 2011 at 05:43:42PM -0600, Stan Hoeppner wrote: > Keld Jørn Simonsen put forth on 2/3/2011 5:04 AM: > > On Mon, Jan 31, 2011 at 06:58:29PM -0600, Stan Hoeppner wrote: > >> Jon Nelson put forth on 1/31/2011 3:27 PM: > >>> Before this goes any further, why not just reference the excellent > >>> Wikipedia article (actually, excellent applies to both Wikipedia *and* > >>> the article): > >>> > >>> http://en.wikipedia.org/wiki/Non-standard_RAID_levels#Linux_MD_RAID_10 > >>> > >>> The only problem I have with the wikipedia article is the assertion > >>> that Linux MD RAID 10 is non-standard. It's as standard as anything > >>> else is in this world. > >> > >> Unfortunately there is no organization, no standards body, that defines RAID > >> levels. > > > > Well there is an organisation that does just that, namely SNIA. > > I should have qualified that with "defines RAID levels the entire industry > accepts/adopts". Unfortunately SNIA is not a standards body or working group, > such as PCI-SIG, or IETF, whose specifications entire industries _do_ accept/adopt. I don't know about SNIA, but I have vast experience with standardisation bodies. I see SNIA as an industry standard standardisation body. I don't know how pervasive that organisation is, but their membership list is impressive http://www.snia.org/member_com/member_directory/ - Looks like everybody in the harddisk business is on board. > Please note that the SNIA Disk Data Format document doesn't define RAID 10 at > all. Yet there is a single mention of RAID 10 in the entire document: > > "RAID-1E 0x11 >2 disk RAID-1, similar to RAID-10 but with striping integrated > into array" > > They don't define RAID 10, but they reference it. Thus one can only assume that > SNIA _assumes_ RAID 10 is already well defined in industry to reference it in > such a manner without previously defining it in the document. > > Does anyone else find this reference to a RAID level omitted in their > definitions a little more than interesting? This RAID 10 omission is especially > interesting considering that RAID 10 dominates the storage back ends of Fortune > 1000 companies, specifically beneath databases and high transaction load systems > such as enterprise mail. > > They've omitted defining the one RAID level with the best combination of high > performance, most resilience, and greatest penetration of the "high end" of > computing in the history of RAID. This begs the question: "Why?" Well RAID1+0 is not the best combination available. I would argue that raid10,f2 is significantly better in a number of areas. > Something smells bad here. Does one of the RAID companies own a patent or > trademark on "RAID 10"? I'll look into this. It just doesn't make any sense > for RAID 10 to be omitted from the SNIA DDF but to be referenced in the manner > it is. It looks like they do define all major basic RAID disk layouts. (except raid10,f2 of cause) . RAID1+0 is a derived format, maybe that is out of scope of the DDF standard. 
Best regards keld -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-04 7:06 ` Keld Jørn Simonsen @ 2011-02-04 8:27 ` Stan Hoeppner 2011-02-04 9:06 ` Keld Jørn Simonsen 2011-02-04 11:34 ` David Brown 0 siblings, 2 replies; 127+ messages in thread From: Stan Hoeppner @ 2011-02-04 8:27 UTC (permalink / raw) To: Keld Jørn Simonsen Cc: Jon Nelson, Mathias Burén, Roberto Spadim, Denis, Linux-RAID Keld Jørn Simonsen put forth on 2/4/2011 1:06 AM: > Well RAID1+0 is not the best combination available. I would argue that > raid10,f2 is significantly better in a number of areas. I'd guess Linux software RAID would be lucky to have 1% of RAID deployments worldwide--very lucky. The other 99%+ are HBA RAID or SAN/NAS "appliances" most often using custom embedded RTOS with the RAID code written in assembler, especially in the case of the HBAs. For everything not Linux mdraid, RAID 10 (aka 1+0) is king of the hill, and has been for 15 years+ >> Something smells bad here. Does one of the RAID companies own a patent or >> trademark on "RAID 10"? I'll look into this. It just doesn't make any sense >> for RAID 10 to be omitted from the SNIA DDF but to be referenced in the manner >> it is. > > It looks like they do define all major basic RAID disk layouts. (except > raid10,f2 of cause) . RAID1+0 is a derived format, maybe that is out of > scope of the DDF standard. "A secondary virtual disk is a VD configured using hybrid RAID levels like RAID10 or RAID50. Its elements are BVDs." So apparently their Disk Data Format specification doesn't include hybrid RAID levels. This makes sense, as the _on disk_ layout of RAID 10 is identical to RAID 1. We apparently need to be looking for other SNIA documents to find their definition of RAID 10. That is what started us down this tunnel isn't it? We're so deep now there's no light and I can't see the path behind me anymore. ;) -- Stan -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-04 8:27 ` Stan Hoeppner @ 2011-02-04 9:06 ` Keld Jørn Simonsen 2011-02-04 10:04 ` Stan Hoeppner 2011-02-04 20:42 ` Keld Jørn Simonsen 2011-02-04 11:34 ` David Brown 1 sibling, 2 replies; 127+ messages in thread From: Keld Jørn Simonsen @ 2011-02-04 9:06 UTC (permalink / raw) To: Stan Hoeppner Cc: Keld Jørn Simonsen, Jon Nelson, Mathias Burén, Roberto Spadim, Denis, Linux-RAID On Fri, Feb 04, 2011 at 02:27:38AM -0600, Stan Hoeppner wrote: > Keld Jørn Simonsen put forth on 2/4/2011 1:06 AM: > > > Well RAID1+0 is not the best combination available. I would argue that > > raid10,f2 is significantly better in a number of areas. > > I'd guess Linux software RAID would be lucky to have 1% of RAID deployments > worldwide--very lucky. The other 99%+ are HBA RAID or SAN/NAS "appliances" most > often using custom embedded RTOS with the RAID code written in assembler, > especially in the case of the HBAs. For everything not Linux mdraid, RAID 10 > (aka 1+0) is king of the hill, and has been for 15 years+ Yes, you are right, Linux MD really has an advantage here:-) > >> Something smells bad here. Does one of the RAID companies own a patent or > >> trademark on "RAID 10"? I'll look into this. It just doesn't make any sense > >> for RAID 10 to be omitted from the SNIA DDF but to be referenced in the manner > >> it is. > > > > It looks like they do define all major basic RAID disk layouts. (except > > raid10,f2 of cause) . RAID1+0 is a derived format, maybe that is out of > > scope of the DDF standard. > > "A secondary virtual disk is a VD configured using hybrid RAID levels like > RAID10 or RAID50. Its elements are BVDs." > > So apparently their Disk Data Format specification doesn't include hybrid RAID > levels. This makes sense, as the _on disk_ layout of RAID 10 is identical to > RAID 1. Yes, raid10 is just a variation of RAID1, actually raid10,n2 is identical on the disk to RAID1, eg for a 2 drive or 4-drive array. > We apparently need to be looking for other SNIA documents to find their > definition of RAID 10. That is what started us down this tunnel isn't it? > We're so deep now there's no light and I can't see the path behind me anymore. ;) I dont think SNIA defines RAID 10, which is a specific Linux MD thing. For RAID1+0, I think it is covered by the DDF standard, as what DDF is aimed at is defining formats on the disks to portably handle RAID. That means that you can move a set of disks used in one manufacturer's configuration to another make's configuration, and it will still work. And it will also work with RAID1+0, as the underlying RAID1 and RAID0 formats are defined in DDF. So no need to add specific RAID1+0 definitions. Also RAID1 may mean different layouts, like the "far" and "offset" layouts, and it would mean an explosion of definitions of RAID1+0 if you should name and standardize all of these variations of RAID1+0 explicitely. best regards keld -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-04 9:06 ` Keld Jørn Simonsen @ 2011-02-04 10:04 ` Stan Hoeppner 2011-02-04 11:15 ` hansbkk 2011-02-04 20:35 ` Keld Jørn Simonsen 2011-02-04 20:42 ` Keld Jørn Simonsen 1 sibling, 2 replies; 127+ messages in thread From: Stan Hoeppner @ 2011-02-04 10:04 UTC (permalink / raw) To: Keld Jørn Simonsen Cc: Jon Nelson, Mathias Burén, Roberto Spadim, Denis, Linux-RAID Keld Jørn Simonsen put forth on 2/4/2011 3:06 AM: >> We apparently need to be looking for other SNIA documents to find their >> definition of RAID 10. That is what started us down this tunnel isn't it? >> We're so deep now there's no light and I can't see the path behind me anymore. ;) > > I dont think SNIA defines RAID 10, which is a specific Linux MD thing. Either you're very young, or don't read what I type, maybe both. ;) RAID 10 has been around for 15+ years. It is not unique to Linux mdraid. The "alternate layouts" of mdraid are unique to Linux/mdraid, but RAID 10 is not unique to mdraid. > For RAID1+0, I think it is covered by the DDF standard RAID 10 and RAID 1+0 are the same thing by two different names. It's not covered in the DDF. I just spent a paragraph explaining why it's not in the DDF, and you agreed with me for Pete's sake! Now you say you think it _is_ in the DDF? I'm getting dizzy from the running in circles... I suggest you actually read the DDF and other documents available on the SNIA website, as I have. -- Stan -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-04 10:04 ` Stan Hoeppner @ 2011-02-04 11:15 ` hansbkk 2011-02-04 13:33 ` Keld Jørn Simonsen 2011-02-04 20:35 ` Keld Jørn Simonsen 1 sibling, 1 reply; 127+ messages in thread From: hansbkk @ 2011-02-04 11:15 UTC (permalink / raw) To: Stan Hoeppner Cc: Keld Jørn Simonsen, Jon Nelson, Mathias Burén, Roberto Spadim, Denis, Linux-RAID On Fri, Feb 4, 2011 at 5:04 PM, Stan Hoeppner <stan@hardwarefreak.com> wrote: > Keld Jørn Simonsen put forth on 2/4/2011 3:06 AM: >> I dont think SNIA defines RAID 10, which is a specific Linux MD thing. > > Either you're very young, or don't read what I type, maybe both. ;) RAID 10 has > been around for 15+ years. It is not unique to Linux mdraid. The "alternate > layouts" of mdraid are unique to Linux/mdraid, but RAID 10 is not unique to mdraid. > >> For RAID1+0, I think it is covered by the DDF standard > > RAID 10 and RAID 1+0 are the same thing by two different names. It's not > > I'm getting dizzy from the running in circles... Dope-slaps all 'round! Again, this problem would just go away if everyone would just refrain from using "RAID 10" or "RAID10" as if they were meaningful - within the context of this list, they just cause confusion. It seems the mdadm people are willing to use RAID 1+0 for "the other kind", and the "outsiders" are willing to use "md RAID10", but both sides would like to claim just plain "RAID10" for themselves, and it's causing endless loops of non-communication! Please reply with better alternatives, but IMO "md raid10" and "raid1+0" are clear enough to be understood by one and all. For those of you sick of my ranting on beating a dead horse, I'm sorry but I just can't help keeping on trying to put this behind us - we're all here for a common cause aren't we? I'm not asking you to change any religious beliefs outside the list, just adopt an enabling convention for discussions here. -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-04 11:15 ` hansbkk @ 2011-02-04 13:33 ` Keld Jørn Simonsen 0 siblings, 0 replies; 127+ messages in thread From: Keld Jørn Simonsen @ 2011-02-04 13:33 UTC (permalink / raw) To: hansbkk Cc: Stan Hoeppner, Keld Jørn Simonsen, Jon Nelson, Mathias Burén, Roberto Spadim, Denis, Linux-RAID On Fri, Feb 04, 2011 at 06:15:44PM +0700, hansbkk@gmail.com wrote: > On Fri, Feb 4, 2011 at 5:04 PM, Stan Hoeppner <stan@hardwarefreak.com> wrote: > > Keld Jørn Simonsen put forth on 2/4/2011 3:06 AM: > > >> I dont think SNIA defines RAID 10, which is a specific Linux MD thing. > > > > Either you're very young, or don't read what I type, maybe both. ;) RAID 10 has > > been around for 15+ years. It is not unique to Linux mdraid. The "alternate > > layouts" of mdraid are unique to Linux/mdraid, but RAID 10 is not unique to mdraid. > > > >> For RAID1+0, I think it is covered by the DDF standard > > > > RAID 10 and RAID 1+0 are the same thing by two different names. It's not > > > > > I'm getting dizzy from the running in circles... > > > Dope-slaps all 'round! > > Again, this problem would just go away if everyone would just refrain > from using "RAID 10" or "RAID10" as if they were meaningful - within > the context of this list, they just cause confusion. > > It seems the mdadm people are willing to use RAID 1+0 for "the other > kind", and the "outsiders" are willing to use "md RAID10", but both > sides would like to claim just plain "RAID10" for themselves, and it's > causing endless loops of non-communication! > > Please reply with better alternatives, but IMO "md raid10" and > "raid1+0" are clear enough to be understood by one and all. > > For those of you sick of my ranting on beating a dead horse, I'm sorry > but I just can't help keeping on trying to put this behind us - we're > all here for a common cause aren't we? > > I'm not asking you to change any religious beliefs outside the list, > just adopt an enabling convention for discussions here. I can agree with what is said here, at least when we are discussing with somebody like Stan who insists on using terminology, that is easily misunderstood. I do think precise terminology is important. Also I have tried to clean up terminology other places, such as our wiki and in articles elsewhere. I tend to write "Linux MD raid10" in those instances. Best regards keld -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-04 10:04 ` Stan Hoeppner 2011-02-04 11:15 ` hansbkk @ 2011-02-04 20:35 ` Keld Jørn Simonsen 1 sibling, 0 replies; 127+ messages in thread From: Keld Jørn Simonsen @ 2011-02-04 20:35 UTC (permalink / raw) To: Stan Hoeppner Cc: Keld Jørn Simonsen, Jon Nelson, Mathias Burén, Roberto Spadim, Denis, Linux-RAID On Fri, Feb 04, 2011 at 04:04:42AM -0600, Stan Hoeppner wrote: > Keld Jørn Simonsen put forth on 2/4/2011 3:06 AM: > > > For RAID1+0, I think it is covered by the DDF standard > > RAID 10 and RAID 1+0 are the same thing by two different names. It's not > covered in the DDF. I agree that it is not described in detail in DDF. But it is covered, as I wrote. RAID 1+0 can be a number of things, as RAID1 can be a number of things. It can be what we know in Linux-land as a RAID0 of MD raid10,n2 or of md raid10,o2, or some other raid1 layout. When you then move a set of RAID 1+0 disks from one RAID device to another, then, using DDF, you can handle that correctly on the new raid device, as you can see what kind of RAID1 the disks are formatted with, thanks to the DDF standard. And you can then also see that the RAID0 is a RAID0, according to the data stored in the RAID description adhering to the DDF standard. So you can safely move RAID 1+0 disks from one RAID device (say a NAS box) to another. Handy if a NAS box breaks down beyond repair, and you have to buy a new one of another make. > I just spent a paragraph explaining why it's not in the > DDF, and you agreed with me for Pete's sake! Now you say you think it _is_ in > the DDF? Implicitly it is there, and for good reasons it is not spelled out. The same story goes for RAID 5+0 and other nested RAID types. It does not make sense to describe the combined nested type; only the component data formats are described. But one could have a brief description of the concepts, and of why the combined term is not adequate terminology for the DDF standard. best regards keld -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-04 9:06 ` Keld Jørn Simonsen 2011-02-04 10:04 ` Stan Hoeppner @ 2011-02-04 20:42 ` Keld Jørn Simonsen 2011-02-04 21:15 ` Stan Hoeppner 1 sibling, 1 reply; 127+ messages in thread From: Keld Jørn Simonsen @ 2011-02-04 20:42 UTC (permalink / raw) To: Keld Jørn Simonsen Cc: Stan Hoeppner, Jon Nelson, Mathias Burén, Roberto Spadim, Denis, Linux-RAID On Fri, Feb 04, 2011 at 10:06:03AM +0100, Keld Jørn Simonsen wrote: > On Fri, Feb 04, 2011 at 02:27:38AM -0600, Stan Hoeppner wrote: > > Keld Jørn Simonsen put forth on 2/4/2011 1:06 AM: > > > > > It looks like they do define all major basic RAID disk layouts. (except > > > raid10,f2 of cause) . RAID1+0 is a derived format, maybe that is out of > > > scope of the DDF standard. > > > > "A secondary virtual disk is a VD configured using hybrid RAID levels like > > RAID10 or RAID50. Its elements are BVDs." > > > > So apparently their Disk Data Format specification doesn't include hybrid RAID > > levels. This makes sense, as the _on disk_ layout of RAID 10 is identical to > > RAID 1. I was puzzled here. I think you mean: "the _on disk_ layout of RAID 10 is identical to RAID 1 and RAID 0" If that is what you meant, I think we agree on most things here. best regards keld -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-04 20:42 ` Keld Jørn Simonsen @ 2011-02-04 21:15 ` Stan Hoeppner 2011-02-04 22:05 ` Keld Jørn Simonsen 0 siblings, 1 reply; 127+ messages in thread From: Stan Hoeppner @ 2011-02-04 21:15 UTC (permalink / raw) To: Keld Jørn Simonsen Cc: Jon Nelson, Mathias Burén, Roberto Spadim, Denis, Linux-RAID Keld Jørn Simonsen put forth on 2/4/2011 2:42 PM: >>> So apparently their Disk Data Format specification doesn't include hybrid RAID >>> levels. This makes sense, as the _on disk_ layout of RAID 10 is identical to >>> RAID 1. > > I was puzzled here. I think you mean: > > "the _on disk_ layout of RAID 10 is identical to RAID 1 and RAID 0" > > If that is what you meant, I think we agree on most things here. I set a trap for you, of sorts. ;) If what you say is true, then RAID 10 should be covered in the DDF, as migration from one device to another isn't possible without the RAID 10 on disk layout being defined in the DDF as with all the other RAID levels. True? -- Stan -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-04 21:15 ` Stan Hoeppner @ 2011-02-04 22:05 ` Keld Jørn Simonsen 2011-02-04 23:03 ` Stan Hoeppner 0 siblings, 1 reply; 127+ messages in thread From: Keld Jørn Simonsen @ 2011-02-04 22:05 UTC (permalink / raw) To: Stan Hoeppner Cc: Keld Jørn Simonsen, Jon Nelson, Mathias Burén, Roberto Spadim, Denis, Linux-RAID On Fri, Feb 04, 2011 at 03:15:39PM -0600, Stan Hoeppner wrote: > Keld Jørn Simonsen put forth on 2/4/2011 2:42 PM: > > >>> So apparently their Disk Data Format specification doesn't include hybrid RAID > >>> levels. This makes sense, as the _on disk_ layout of RAID 10 is identical to > >>> RAID 1. > > > > I was puzzled here. I think you mean: > > > > "the _on disk_ layout of RAID 10 is identical to RAID 1 and RAID 0" > > > > If that is what you meant, I think we agree on most things here. > > I set a trap for you, of sorts. ;) And I deliberately fell into your trap, tongue in cheek :-) Maybe we should avoid traps and fooling around in them, as it just confuses others and wastes our time. I do think there are valid points coming out of our discussion here. I have tried to avoid your ad hominem remarks and traps, and have just tried to be constructive. > If what you say is true, then RAID 10 should > be covered in the DDF, as migration from one device to another isn't possible > without the RAID 10 on disk layout being defined in the DDF as with all the > other RAID levels. True? I am a little puzzled by what you mean here. RAID 1+0 is covered in DDF 2.0 as a description of RAID-1 or RAID-1E, and then striping it according to 4.3.1. I do think they should add something about calling it RAID 1+0 or maybe RAID-1E+0 or some such. And then explain something about the term "RAID10" and why you should rather call it RAID-1+0 or RAID-1E+0, to indicate it is a Secondary RAID level, and to avoid ambiguity. I have given some remarks on RAID10 at http://www.snia.org/tech_activities/feedback/ Best regards keld -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
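A side note for readers of the archive: mdadm itself can write DDF metadata through its external-metadata container support, so the on-disk format being argued about here can be experimented with from the Linux side. This is only a rough sketch under assumptions - the device names are placeholders, and exactly which member RAID levels a DDF container accepts depends on the mdadm version - not a recipe taken from the thread:

  # create a DDF container over two disks (hypothetical devices)
  mdadm --create /dev/md/ddf0 --metadata=ddf --raid-devices=2 /dev/sda /dev/sdb
  # create a RAID1 member array inside that container
  mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/md/ddf0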
* Re: What's the typical RAID10 setup? 2011-02-04 22:05 ` Keld Jørn Simonsen @ 2011-02-04 23:03 ` Stan Hoeppner 2011-02-06 3:59 ` Drew 0 siblings, 1 reply; 127+ messages in thread From: Stan Hoeppner @ 2011-02-04 23:03 UTC (permalink / raw) To: Keld Jørn Simonsen Cc: Jon Nelson, Mathias Burén, Roberto Spadim, Denis, Linux-RAID Keld Jørn Simonsen put forth on 2/4/2011 4:05 PM: > I am a little puzzled with what you mean here. RAID 1+0 is covered <snip> No, it's not. I see no mention of "RAID 1+0" anywhere in SNIA documents. I _do_ see "RAID 10" casually mentioned in many places in their documents. But I've yet to find where they define or bother to minimally explain the term "RAID 10". In technical writing you must define something before you discuss or reference it. -- Stan -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-04 23:03 ` Stan Hoeppner @ 2011-02-06 3:59 ` Drew 2011-02-06 4:27 ` Stan Hoeppner 0 siblings, 1 reply; 127+ messages in thread From: Drew @ 2011-02-06 3:59 UTC (permalink / raw) To: Stan Hoeppner Cc: Keld Jørn Simonsen, Jon Nelson, Mathias Burén, Roberto Spadim, Denis, Linux-RAID > No, it's not. I see no mention of "RAID 1+0" anywhere in SNIA documents. I > _do_ see "RAID 10" casually mentioned in many places in their documents. But > I've yet to find where they define or bother to minimally explain the term "RAID > 10". In technical writing you must define something before you discuss or > reference it. It's not explicitly defined but it's there. Page 84, Section 4.3 Secondary RAID Level "Table 15 lists values used in the Secondary_RAID_Level field of the Virtual Disk Configuration Record (Section 5.9.1) and their definitions. The table defines secondary RAID levels such as Striped, Volume Concatenation, Spanned, and Mirrored for hybrid or multilevel virtual disks. The Secondary_RAID_Level field in the Virtual Disk Configuration Record MUST use the values defined in Table 15." It then goes on to describe how the various secondary RAID levels are composed of "Basic Virtual Disks" (their term for RAID arrays presented as single disks.) As a somewhat related aside I ran into this on an IBM x3650 I was configuring for an office a few months back. IBM explicitly stated support for RAID 10 but the process for setting up the RAID 10 array involved setting up a pair of mirrored disks which the RAID controller then recognized as components it could use to build a RAID 10 array. Reading through the SNIA document and noting IBM & LSI's involvement made me think they may be using the DDF specs in their arrays, which might explain why setting up a RAID 10 array was such an involved affair. -- Drew "Nothing in life is to be feared. It is only to be understood." --Marie Curie -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-06 3:59 ` Drew @ 2011-02-06 4:27 ` Stan Hoeppner 0 siblings, 0 replies; 127+ messages in thread From: Stan Hoeppner @ 2011-02-06 4:27 UTC (permalink / raw) To: Drew Cc: Keld Jørn Simonsen, Jon Nelson, Mathias Burén, Roberto Spadim, Denis, Linux-RAID Drew put forth on 2/5/2011 9:59 PM: >> No, it's not. I see no mention of "RAID 1+0" anywhere in SNIA documents. I >> _do_ see "RAID 10" casually mentioned in many places in their documents. But >> I've yet to find where they define or bother to minimally explain the term "RAID >> 10". In technical writing you must define something before you discuss or >> reference it. > > It's not explicitly defined but it's there. Do you agree that they should give a cursory definition of RAID 10 in the DDF, since they reference it at least once by name in the same document? > Page 84, Section 4.3 Secondary RAID Level > > "Table 15 lists values used in the Secondary_RAID_Level field of the > Virtual Disk Configuration Record > (Section 5.9.1) and their definitions. The table defines secondary > RAID levels such as Striped, Volume > Concatenation, Spanned, and Mirrored for hybrid or multilevel virtual > disks. The Secondary_RAID_Level > field in the Virtual Disk Configuration Record MUST use the values > defined in Table 15." Yes, I obviously read all of this searching for a sign of a RAID 10-like definition. My whole point on this though is that they use the description/phrase "RAID 10" in passing, yet there is no other reference to it in the document. > As a somewhat related aside I ran into this on an IBM x3650 I was > configuring for an office a few months back. IBM explicitly stated > support for RAID 10 but the process for setting up the RAID 10 array > involved setting up a pair of mirrored disks which the RAID controller > then recognized as components it could use to build a RAID 10 array. I believe this is relatively new. Once upon a time LSI made SCSI ASICs and that was about it. Then they bought Mylex, a RAID card company, and then bought the RAID card division of American Megatrends (aka AMI BIOS) to eliminate all the major competition (years later they swallowed 3Ware for the same reason). From the late 90s through the early/mid 2000s creating a RAID 10 with either the Mylex or AMI RAID firmware was a single-step process. However, back then, they didn't offer any "hybrid" RAID levels other than 10--no RAID 50/51 etc. > Reading through the SNIA document and noting IBM & LSI's involvement > made me think they may be using the DDF specs in their arrays, which > might explain why setting up a RAID 10 array was such an involved > affair. I doubt this has anything to do with SNIA. Now that they offer multiple hybrid RAID levels, which are all basically nested stripes, they've created a multi-step process that covers all cases, instead of individual separate code to cover each case. -- Stan ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-04 8:27 ` Stan Hoeppner 2011-02-04 9:06 ` Keld Jørn Simonsen @ 2011-02-04 11:34 ` David Brown 2011-02-04 13:53 ` Keld Jørn Simonsen 1 sibling, 1 reply; 127+ messages in thread From: David Brown @ 2011-02-04 11:34 UTC (permalink / raw) To: linux-raid On 04/02/2011 09:27, Stan Hoeppner wrote: > Keld Jørn Simonsen put forth on 2/4/2011 1:06 AM: > >> Well RAID1+0 is not the best combination available. I would argue that >> raid10,f2 is significantly better in a number of areas. > > I'd guess Linux software RAID would be lucky to have 1% of RAID deployments > worldwide--very lucky. The other 99%+ are HBA RAID or SAN/NAS "appliances" most > often using custom embedded RTOS with the RAID code written in assembler, > especially in the case of the HBAs. For everything not Linux mdraid, RAID 10 > (aka 1+0) is king of the hill, and has been for 15 years+ > I wonder what sort of market penetration small cheap SAN/NAS "appliances" have these days, aimed at the home markets and small offices. These are almost invariably Linux md raid devices, although the user views them as a black-box appliance. However, though they use md raid, they typically don't support RAID10, RAID1+0, RAID10,f2, or anything other than RAID0, RAID1 and RAID5. -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-04 11:34 ` David Brown @ 2011-02-04 13:53 ` Keld Jørn Simonsen 2011-02-04 14:17 ` David Brown 2011-02-04 14:21 ` hansbkk 0 siblings, 2 replies; 127+ messages in thread From: Keld Jørn Simonsen @ 2011-02-04 13:53 UTC (permalink / raw) To: David Brown; +Cc: linux-raid On Fri, Feb 04, 2011 at 12:34:00PM +0100, David Brown wrote: > On 04/02/2011 09:27, Stan Hoeppner wrote: > >Keld Jørn Simonsen put forth on 2/4/2011 1:06 AM: > > > >>Well RAID1+0 is not the best combination available. I would argue that > >>raid10,f2 is significantly better in a number of areas. > > > >I'd guess Linux software RAID would be lucky to have 1% of RAID deployments > >worldwide--very lucky. The other 99%+ are HBA RAID or SAN/NAS > >"appliances" most > >often using custom embedded RTOS with the RAID code written in assembler, > >especially in the case of the HBAs. For everything not Linux mdraid, RAID > >10 > >(aka 1+0) is king of the hill, and has been for 15 years+ > > > > I wonder what sort of market penetration small cheap SAN/NAS > "appliances" have these days, aimed at the home markets and small > offices. These are almost invariably Linux md raid devices, although > the user views them as an black-box appliance. > > However, though they use md raid, they typically don't support RAID10, > RAID1+0, RAID10,f2, or anything other than RAID0, RAID1 and RAID5. I wonder why this is so. (I cannot dispute what you are saying, as I have not got any experience with any small SAN/NAS devices.) Anyway, Linux NAS/SAN devices should run a kernel that is able to run MD raid10 and RAID 1+0 - this has been in the Linux kernel for more than 5 years. For companies that sell Linux NAS/SAN devices, I would have thought that they would have at least one engineer following this list. Maybe they will not disclose themselves, but are there some of you out here? And what kind of support for RAID types is available on your box? And maybe the more advanced stuff is available, but only in some CLI. The configuration web server could be without the more advanced options. But then: why not use options that roughly double the performance compared to competitors? Are the raid 1+0 and md raid10 options available via ssh or other CLI access mechanisms? I know that on routers there is normally a CLI interface. Furthermore, Linux has good penetration in the server market. And I think most people would run servers with RAID if they are doing something serious. So Linux RAID should be more than 1%, at least in the server market. Best regards keld -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-04 13:53 ` Keld Jørn Simonsen @ 2011-02-04 14:17 ` David Brown 2011-02-04 14:21 ` hansbkk 1 sibling, 0 replies; 127+ messages in thread From: David Brown @ 2011-02-04 14:17 UTC (permalink / raw) To: linux-raid On 04/02/2011 14:53, Keld Jørn Simonsen wrote: > On Fri, Feb 04, 2011 at 12:34:00PM +0100, David Brown wrote: >> On 04/02/2011 09:27, Stan Hoeppner wrote: >>> Keld Jørn Simonsen put forth on 2/4/2011 1:06 AM: >>> >>>> Well RAID1+0 is not the best combination available. I would argue that >>>> raid10,f2 is significantly better in a number of areas. >>> >>> I'd guess Linux software RAID would be lucky to have 1% of RAID deployments >>> worldwide--very lucky. The other 99%+ are HBA RAID or SAN/NAS >>> "appliances" most >>> often using custom embedded RTOS with the RAID code written in assembler, >>> especially in the case of the HBAs. For everything not Linux mdraid, RAID >>> 10 >>> (aka 1+0) is king of the hill, and has been for 15 years+ >>> >> >> I wonder what sort of market penetration small cheap SAN/NAS >> "appliances" have these days, aimed at the home markets and small >> offices. These are almost invariably Linux md raid devices, although >> the user views them as an black-box appliance. >> >> However, though they use md raid, they typically don't support RAID10, >> RAID1+0, RAID10,f2, or anything other than RAID0, RAID1 and RAID5. > > I wonder why this is so. (I cannot dispute what you are saying, as I have > not got any experience with any small SAN/NAS devices). > > Anyway, Linux NAS/SAN devices should run a kernel that should be able to > run MD raid10 and RAID 1+0 - as this has been in the Linux kernel > for more than 5 years. > I think it is just a matter of simplifying the interface for the expected use of the target audience. The typical customer of such NAS appliances doesn't know enough about raid to understand the detailed pros and cons of different types, and is unlikely to care about small performance differences. Thus they have the options of raid0 and JBD for maximal space per $, raid1 for two disks with redundancy, and raid5 for more disks with redundancy. They don't have hot spares, raid6, mixing raid levels on different partitions, etc. Keep it simple, and people can use it. Of course, you can always access these devices directly, or with ssh, and re-arrange things as you want. It's only the web-based user interface that is limited. > For c0mpanies that sell Linux NAS/SAN devices, I would have thought that > they would have at least one engineer following this list. > Maybe they will not disclose themselves, but are there some of you out > here? > > And what kind of support of RAID types are available on your box? > > And maybe the more advanced stuff is available, but only in some CLI. > The configuration web server could be without the more advanced oprions. > But then: why not use options that kind of doubles the performance > compared to competitors? > > Are the raid 1+0 and md raid10 options available via some ssh > or other CLI access mechanisms? I know on routers, there are > normally a CLI interface. > > Furthermore, Linux has a good penetration in the server market. > And I think most people would run servers with raids, if they do > something serious. So Linux RAID should be more than 1 %, at least > in the server market. 
> -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-04 13:53 ` Keld Jørn Simonsen 2011-02-04 14:17 ` David Brown @ 2011-02-04 14:21 ` hansbkk 2011-02-06 4:02 ` Drew 1 sibling, 1 reply; 127+ messages in thread From: hansbkk @ 2011-02-04 14:21 UTC (permalink / raw) To: Keld Jørn Simonsen; +Cc: David Brown, linux-raid 2011/2/4 Keld Jørn Simonsen <keld@keldix.com>: > > I wonder why this is so. (I cannot dispute what you are saying, as I have > not got any experience with any small SAN/NAS devices). > > Anyway, Linux NAS/SAN devices should run a kernel that should be able to > run MD raid10 and RAID 1+0 - as this has been in the Linux kernel > for more than 5 years. > 95% of such devices sold have only one drive, a few models have 2 bays, with external connections (usually USB, some e-sata now) only available for in-bound data copying. These are cheap enough that a decent hacker community has sprung up creating replacement opensource firmware, but the hardware limitations are severe enough that IMO the main attraction is a very low power bill for 24/7 convenience. There are some that offer 4 or 5 bays, but IMO anyone knowledgeable enough to make use of higher-end options like md raid10 would take one look at the price tag and bolt for a whitebox+free/opensource solution instead. Even whitebox+Windows Home Server (gasp!) would IMO be both cheaper and better in most cases than getting locked into proprietary hardware *and* software wrapped up together. If you want to investigate further, here are the main brands, I imagine most are running on Linux/mdadm-based firmware: Thecus, Qnap, Synology, Dlink, Buffalo, NetGear (was Infrant) IMO Drobo looks like a very interesting kit, not mdadm-based AFAICT -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-04 14:21 ` hansbkk @ 2011-02-06 4:02 ` Drew 2011-02-06 7:58 ` Keld Jørn Simonsen 0 siblings, 1 reply; 127+ messages in thread From: Drew @ 2011-02-06 4:02 UTC (permalink / raw) To: hansbkk; +Cc: Keld Jørn Simonsen, David Brown, linux-raid > If you want to investigate further, here are the main brands, I > imagine most are running on Linux/mdadm-based firmware: > > Thecus, Qnap, Synology, Dlink, Buffalo, NetGear (was Infrant) I can confirm Buffalo & QNAP are both Linux based. Neither support RAID 10, just 0, 1, 5, and in the QNAP's case, 6. -- Drew "Nothing in life is to be feared. It is only to be understood." --Marie Curie ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-06 4:02 ` Drew @ 2011-02-06 7:58 ` Keld Jørn Simonsen 2011-02-06 12:03 ` Roman Mamedov 0 siblings, 1 reply; 127+ messages in thread From: Keld Jørn Simonsen @ 2011-02-06 7:58 UTC (permalink / raw) To: Drew; +Cc: hansbkk, Keld Jørn Simonsen, David Brown, linux-raid On Sat, Feb 05, 2011 at 08:02:02PM -0800, Drew wrote: > > If you want to investigate further, here are the main brands, I > > imagine most are running on Linux/mdadm-based firmware: > > > > Thecus, Qnap, Synology, Dlink, Buffalo, NetGear (was Infrant) > > I can confirm Buffalo & QNAP are both Linux based. Neither support > RAID 10, just 0, 1, 5, and in the QNAP's case, 6. Do they have a ssh mode, where you can log in and do as you please? best regards Keld ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-06 7:58 ` Keld Jørn Simonsen @ 2011-02-06 12:03 ` Roman Mamedov 2011-02-06 14:30 ` Roberto Spadim 0 siblings, 1 reply; 127+ messages in thread From: Roman Mamedov @ 2011-02-06 12:03 UTC (permalink / raw) To: Keld Jørn Simonsen; +Cc: Drew, hansbkk, David Brown, linux-raid [-- Attachment #1: Type: text/plain, Size: 930 bytes --] On Sun, 6 Feb 2011 08:58:16 +0100 Keld Jørn Simonsen <keld@keldix.com> wrote: > On Sat, Feb 05, 2011 at 08:02:02PM -0800, Drew wrote: > > > If you want to investigate further, here are the main brands, I > > > imagine most are running on Linux/mdadm-based firmware: > > > > > > Thecus, Qnap, Synology, Dlink, Buffalo, NetGear (was Infrant) > > > > I can confirm Buffalo & QNAP are both Linux based. Neither support > > RAID 10, just 0, 1, 5, and in the QNAP's case, 6. > > Do they have a ssh mode, where you can log in and do as you please? Most likely not in the stock configuration, but on some devices you can replace the original firmware with something more advanced. For example, I have installed complete Debian Squeeze on my D-Link DNS-323 box, following these instructions: http://www.cyrius.com/debian/orion/d-link/dns-323/ It has a 500 MHz ARM CPU and 64 MB of RAM. -- With respect, Roman [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-06 12:03 ` Roman Mamedov @ 2011-02-06 14:30 ` Roberto Spadim 0 siblings, 0 replies; 127+ messages in thread From: Roberto Spadim @ 2011-02-06 14:30 UTC (permalink / raw) To: Roman Mamedov Cc: Keld Jørn Simonsen, Drew, hansbkk, David Brown, linux-raid Hehe, nice, guys. I have changed the raid1.c and raid1.h code and added new read_balance functions - could everybody test? Kernel 2.6.37, code at: http://www.spadim.com.br/raid1/ 2011/2/6 Roman Mamedov <rm@romanrm.ru>: > On Sun, 6 Feb 2011 08:58:16 +0100 > Keld Jørn Simonsen <keld@keldix.com> wrote: > >> On Sat, Feb 05, 2011 at 08:02:02PM -0800, Drew wrote: >> > > If you want to investigate further, here are the main brands, I >> > > imagine most are running on Linux/mdadm-based firmware: >> > > >> > > Thecus, Qnap, Synology, Dlink, Buffalo, NetGear (was Infrant) >> > >> > I can confirm Buffalo & QNAP are both Linux based. Neither support >> > RAID 10, just 0, 1, 5, and in the QNAP's case, 6. >> >> Do they have a ssh mode, where you can log in and do as you please? > > Most likely not in the stock configuration, but on some devices you can > replace the original firmware with something more advanced. For example, I > have installed complete Debian Squeeze on my D-Link DNS-323 box, following > these instructions: http://www.cyrius.com/debian/orion/d-link/dns-323/ > It has a 500 MHz ARM CPU and 64 MB of RAM. > > -- > With respect, > Roman > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 21:27 ` Jon Nelson 2011-01-31 21:47 ` Roberto Spadim 2011-02-01 0:58 ` Stan Hoeppner @ 2011-02-01 8:46 ` hansbkk 2 siblings, 0 replies; 127+ messages in thread From: hansbkk @ 2011-02-01 8:46 UTC (permalink / raw) To: Jon Nelson Cc: Mathias Burén, Roberto Spadim, Keld Jørn Simonsen, Stan Hoeppner, Denis, Linux-RAID For regulars here on the list we understand "raid10" to mean (what outsiders call non-standard) "md raid10". To be very clear for everyone coming here, how about we agree to use that - "md raid10" and "raid1+0" to mean the (to outsiders "standard") version, and to just not use plain "raid10" at all as it is ambiguous. My understanding of "standard" = accepted in all the subdomain discussion areas of the overall concept (md raid being a subset of raid). Within the overall topic domain, mdadm-implemented raid is just one flavor, one which in fact many consider substandard and inconsequential - although I and of course most here disagree, that's the way it is. A major advantage is the fact that mdraid is not proprietary but open, and although the meaning of standard may often imply open as opposed to to proprietary, that's not so in this case. On Tue, Feb 1, 2011 at 4:27 AM, Jon Nelson <jnelson-linux-raid@jamponi.net> wrote: > Before this goes any further, why not just reference the excellent > Wikipedia article (actually, excellent applies to both Wikipedia *and* > the article): > > http://en.wikipedia.org/wiki/Non-standard_RAID_levels#Linux_MD_RAID_10 > > The only problem I have with the wikipedia article is the assertion > that Linux MD RAID 10 is non-standard. It's as standard as anything > else is in this world. ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 18:35 ` Denis 2011-01-31 19:15 ` Roberto Spadim @ 2011-01-31 19:37 ` Phillip Susi 2011-01-31 19:41 ` Roberto Spadim 2011-01-31 20:23 ` Stan Hoeppner 1 sibling, 2 replies; 127+ messages in thread From: Phillip Susi @ 2011-01-31 19:37 UTC (permalink / raw) To: Denis; +Cc: Roberto Spadim, Linux-RAID On 1/31/2011 1:35 PM, Denis wrote: > Roberto, to quite understend how better a raid 10 is over raid 01 you > need to take down into a mathematical level: Raid 10 is not raid 1+0. Raid 10 defaults to having 2 duplicate copies, and so can withstand the failure of exactly one disk. If two disks fail, it does not matter which two they are, the array has failed. You can increase it to 3 copies so you can build an array of any size ( 4, 6, 8, whatever ) that can withstand exactly 2 failed disks, in any combination. ^ permalink raw reply [flat|nested] 127+ messages in thread
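For readers skimming the archive, a minimal sketch of the md raid10 behaviour Phillip describes, using made-up device names:

  # default md raid10: two near copies over four disks
  mdadm --create /dev/md0 --level=10 --layout=n2 --raid-devices=4 /dev/sd[abcd]1

  # three copies over five disks - every block exists on three drives,
  # so any two disks can fail without data loss
  mdadm --create /dev/md1 --level=10 --layout=n3 --raid-devices=5 /dev/sd[efghi]1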
* Re: What's the typical RAID10 setup? 2011-01-31 19:37 ` Phillip Susi @ 2011-01-31 19:41 ` Roberto Spadim 2011-01-31 19:46 ` Phillip Susi 0 siblings, 1 reply; 127+ messages in thread From: Roberto Spadim @ 2011-01-31 19:41 UTC (permalink / raw) To: Phillip Susi; +Cc: Denis, Linux-RAID Nice - Linux raid1 allows many mirrors. Does raid10 allow many mirrors, or just one? 2011/1/31 Phillip Susi <psusi@cfl.rr.com>: > On 1/31/2011 1:35 PM, Denis wrote: >> Roberto, to quite understend how better a raid 10 is over raid 01 you >> need to take down into a mathematical level: > > Raid 10 is not raid 1+0. Raid 10 defaults to having 2 duplicate copies, > and so can withstand the failure of exactly one disk. If two disks > fail, it does not matter which two they are, the array has failed. You > can increase it to 3 copies so you can build an array of any size ( 4, > 6, 8, whatever ) that can withstand exactly 2 failed disks, in any > combination. > > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 19:41 ` Roberto Spadim @ 2011-01-31 19:46 ` Phillip Susi 2011-01-31 19:53 ` Roberto Spadim 0 siblings, 1 reply; 127+ messages in thread From: Phillip Susi @ 2011-01-31 19:46 UTC (permalink / raw) To: Roberto Spadim; +Cc: Denis, Linux-RAID On 1/31/2011 2:41 PM, Roberto Spadim wrote: > nice, > linux raid1 allow many mirrors > does raid10 allow many mirrors or just one? See the man page. You can specify any number for near, far, or offset copies. The default is 2 near. ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 19:46 ` Phillip Susi @ 2011-01-31 19:53 ` Roberto Spadim 2011-01-31 22:10 ` Phillip Susi 0 siblings, 1 reply; 127+ messages in thread From: Roberto Spadim @ 2011-01-31 19:53 UTC (permalink / raw) To: Phillip Susi; +Cc: Denis, Linux-RAID Nice - copies (mirrors). If you lose 2 mirrors in a 2-mirror raid, you lose your information, so 2 mirrors failed >= 2 disks failed. You have a probability of failure of 1/2 (50%: 1 failed mirror out of 2 total mirrors). If you have a bigger and bigger disk configuration but still only two mirrors, you stay at 50%. If you add more mirrors: 1/3 (33%: 1 failed, 3 total), 1/4 = 25%, 1/5 = 20%, and so on. Got it? Is this information on Wikipedia? Maybe we could put it there (raid0/raid1/raid10). Failure model: 1 failed disk = 1 failed mirror; 2 disks failed on the same mirror = 1 failed mirror. Probability of a crash: (failed mirrors / total mirrors) != (failed disks / total disks) != (failed disks / total mirrors) != (failed mirrors / total disks) 2011/1/31 Phillip Susi <psusi@cfl.rr.com>: > On 1/31/2011 2:41 PM, Roberto Spadim wrote: >> nice, >> linux raid1 allow many mirrors >> does raid10 allow many mirrors or just one? > > See the man page. You can specify any number for near, far, or offset > copies. The default is 2 near. > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 19:53 ` Roberto Spadim @ 2011-01-31 22:10 ` Phillip Susi 2011-01-31 22:14 ` Denis 0 siblings, 1 reply; 127+ messages in thread From: Phillip Susi @ 2011-01-31 22:10 UTC (permalink / raw) To: Roberto Spadim; +Cc: Denis, Linux-RAID On 1/31/2011 2:53 PM, Roberto Spadim wrote: > if you have a bigger and bigger and bigger disk configuration and only > two mirrors you still with 50% No, it goes down with more disks since you still can only tolerate a single failure, so the more drives you have, the more likely it is that two will fail. ^ permalink raw reply [flat|nested] 127+ messages in thread
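The two sides here are computing different quantities. For an array of m two-disk mirrors (2m disks in total), if exactly two disks have failed at random, the chance that they happen to be the two halves of the same mirror - which is what kills the array - works out to:

  P(array lost | 2 random disks failed) = m / C(2m, 2) = 1 / (2m - 1)

   4 disks (m = 2): 1/3, about 33%
   6 disks (m = 3): 1/5, exactly 20%
  10 disks (m = 5): 1/9, about 11%

That conditional probability falls as mirrors are added, while the chance of ever suffering two concurrent failures in the first place rises with the number of disks, which is the point being made in the reply above.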
* Re: What's the typical RAID10 setup? 2011-01-31 22:10 ` Phillip Susi @ 2011-01-31 22:14 ` Denis 2011-01-31 22:33 ` Roberto Spadim 0 siblings, 1 reply; 127+ messages in thread From: Denis @ 2011-01-31 22:14 UTC (permalink / raw) To: Roberto Spadim; +Cc: Linux-RAID 2011/1/31 Phillip Susi <psusi@cfl.rr.com>: > On 1/31/2011 2:53 PM, Roberto Spadim wrote: >> if you have a bigger and bigger and bigger disk configuration and only >> two mirrors you still with 50% might help http://en.wikipedia.org/wiki/Probability -- Denis Anjos, www.versatushpc.com.br -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 22:14 ` Denis @ 2011-01-31 22:33 ` Roberto Spadim 2011-01-31 22:36 ` Roberto Spadim 0 siblings, 1 reply; 127+ messages in thread From: Roberto Spadim @ 2011-01-31 22:33 UTC (permalink / raw) To: Denis; +Cc: Linux-RAID It's not just a probability problem, it's a question of which variable the probability is based on: if you base it on mirrors you get one result, if you base it on disks you get another. RAID0 is built from whole disks (we don't have a 0.5 RAID0 system; only integer counts make sense, we can't allow fractional numbers in this probability!). The problem here is: a RAID0 isn't divisible, we can't run half a RAID0. We need a fully working RAID0 to have a working mirror; half a disk doesn't help us WORK, it only lets us test and play with data, not do real production work. Why don't we buy disks with bad blocks? See the problem? For tests, OK; for production, NEVER. I had a RAID1 system break last month. I was lucky: two of the four disks were broken, the first RAID0 and the last RAID0, but they were in separate mirrors (the server hit the floor - sorry, boss =( ). Did I lose information? No! The last broken RAID0 disk had never been used =] - I was using concatenated RAID0 (LINEAR); if I had been using striped RAID0 I would have been f****. Result: 4 new disks, 2 working, 2 broken (I opened them! 2 were good, and after opening they were broken hehehe). That's the problem: we can't rely on luck on production servers. Consider computing the probability based on mirrors (integer counts only), not on disks (a fractional number of mirrors). I could have lost all my information, since I broke two mirrors on a two-mirror server... Compute the probability with integers, not fractions, since RAID1 is mirror-based. Another example: I want RAID10 (2 mirrors). If I partition 4 disks (8 partitions, 2 per disk) and build each RAID1 from partitions on the same disk (4 self-mirrored disks), what is the probability then? You tell me: is it mirror-based or disk-based? Since software raid (mdadm) is mirror-based you have 2 mirrors, and you can lose 1 mirror - that's the key. mdadm can't look inside the mirror's component device (only RAID0 can), but you can't call that a good probability. Look at the OCZ Revo-drive: it's an SSD PCI board with two SATA SSD storage devices (120GB RevoDrive = 2x 55GB SSD). If I mirror it onto itself, do I have a fault-tolerant system? Yes! Understand now? For mdadm we have 2 mirrors - don't compute the probability based on disks, compute it based on mirrors! 2011/1/31 Denis <denismpa@gmail.com>: > 2011/1/31 Phillip Susi <psusi@cfl.rr.com>: >> On 1/31/2011 2:53 PM, Roberto Spadim wrote: >>> if you have a bigger and bigger and bigger disk configuration and only >>> two mirrors you still with 50% > might help http://en.wikipedia.org/wiki/Probability > > > -- > Denis Anjos, > www.versatushpc.com.br > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 22:33 ` Roberto Spadim @ 2011-01-31 22:36 ` Roberto Spadim 0 siblings, 0 replies; 127+ messages in thread From: Roberto Spadim @ 2011-01-31 22:36 UTC (permalink / raw) To: Denis; +Cc: Linux-RAID if you go to hardware level, check first.. is your server mirror based? if you lost your UPS what´s the total system security? here (mdadm) we just use mdadm, we don´t need probability for upper layers (filesystem) or bottom layers (hardware) we need probability on devices that´s the layer that mdadm work, linux devices. that´s why we should use mirror based probability! not disk based! 2011/1/31 Roberto Spadim <roberto@spadim.com.br>: > it´s not just a probability problem, it´s a probability based on variable type > if you want probability based on mirror you have a result, if you want > probability based on disk you have another result > raid0 allow disks (but we don´t have a 0,5 raid0 system, we just allow > integers numbers, we can´t allow decimal numbers on this probability!) > > the problem here is: raid0 isn´t divisible, we can´t run a 0,5 raid0 > system. we need a full working raid0 to have a working mirror, a half > disk don´t help us to WORK, just to test and play with data, not to > real production work! why we don´t buy disks with bad blocks? got the > problem? for tests ok, for production NEVER > > i had a raid1 system broken last month, i have luck, two disks of 4 > disks are broken, the first raid0 and the last raid0 but they are on > separated mirrors (the server hit the floor, sorry boss =( ). > > i lost information? no! the last raid0 brolen disk was never used =], > i used concatenate raid0 (LINEAR), if i was using raid0 with stripe i > was f**** > result: 4 new disks, 2 working 2 broken (i opened it! and 2 was good, > after open it they are broken hehehe) > > that´s the problem, we can´t allow luck on production servers, > consider using probability based on mirrors(only integer numbers) not > on disks(decimal number of mirror) > > i could lost all my informations since i broken two mirrors on two > mirror based server... > make probability using integers not decimal numbers... since raid1 is > mirrors based > > another example... > i want raid10 (2mirrors) if i make partition on 4 disks (8 partitions, > 2 per disk) and make raid1 on each partition on same disk (4 disk self > mirrored) what the probability? > > i tell you is it mirror based or disk based? > since software raid (mdadm) is mirror based you have 2 mirrors, you > can lost 1 mirror! that´s the key. for mdadm you can´t go inside > mirror device (just for raid0) but you can´t tell that´s a good > probability > > look OCZ Revo-drive > it´s a ssd pci board with two SATA SSD storage (120gb revodrive = 2x > 55gb ssd) if i make self mirror i have a fault tolerant system? yes! > understand now? for mdadm we have 2 mirrors, don´t make probability > based on disks! make probability based on mirrors! 
> > 2011/1/31 Denis <denismpa@gmail.com>: >> 2011/1/31 Phillip Susi <psusi@cfl.rr.com>: >>> On 1/31/2011 2:53 PM, Roberto Spadim wrote: >>>> if you have a bigger and bigger and bigger disk configuration and only >>>> two mirrors you still with 50% >> might help http://en.wikipedia.org/wiki/Probability >> >> >> -- >> Denis Anjos, >> www.versatushpc.com.br >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > > > > -- > Roberto Spadim > Spadim Technology / SPAEmpresarial > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 19:37 ` Phillip Susi 2011-01-31 19:41 ` Roberto Spadim @ 2011-01-31 20:23 ` Stan Hoeppner 2011-01-31 21:59 ` Phillip Susi 1 sibling, 1 reply; 127+ messages in thread From: Stan Hoeppner @ 2011-01-31 20:23 UTC (permalink / raw) To: Phillip Susi; +Cc: Denis, Roberto Spadim, Linux-RAID Phillip Susi put forth on 1/31/2011 1:37 PM: > Raid 10 is not raid 1+0. Raid 10 defaults to having 2 duplicate copies, Yes, actually, they are two names for the same RAID level. > and so can withstand the failure of exactly one disk. If two disks > fail, it does not matter which two they are, the array has failed. You This is absolutely not correct. In a 10 disk RAID 10 array, exactly 5 disks can fail, as long as no two are in the same mirror pair, and the array will continue to function, with little or no performance degradation. Where are you getting your information? Pretty much everything you stated is wrong... -- Stan ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 20:23 ` Stan Hoeppner @ 2011-01-31 21:59 ` Phillip Susi 2011-01-31 22:08 ` Jon Nelson 2011-02-01 9:20 ` Robin Hill 0 siblings, 2 replies; 127+ messages in thread From: Phillip Susi @ 2011-01-31 21:59 UTC (permalink / raw) To: Stan Hoeppner; +Cc: Denis, Roberto Spadim, Linux-RAID On 1/31/2011 3:23 PM, Stan Hoeppner wrote: >> Raid 10 is not raid 1+0. Raid 10 defaults to having 2 duplicate copies, > > Yes, actually, they are two names for the same RAID level. No, they are not. See the mdadm man page. Raid10 can operate on 3 drives, raid1+0 can not. In theory a raid10 could be done on two disks though mdadm seems to want at least 3. > This is absolutely not correct. In a 10 disk RAID 10 array, exactly 5 disks can > fail, as long as no two are in the same mirror pair, and the array will continue > to function, with little or no performance degradation. That is a raid 0+1, not raid10. > Where are you getting your information? Pretty much everything you stated is > wrong... The mdadm man page. ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 21:59 ` Phillip Susi @ 2011-01-31 22:08 ` Jon Nelson 2011-01-31 22:38 ` Phillip Susi 2011-02-01 9:20 ` Robin Hill 1 sibling, 1 reply; 127+ messages in thread From: Jon Nelson @ 2011-01-31 22:08 UTC (permalink / raw) To: Phillip Susi; +Cc: Stan Hoeppner, Denis, Roberto Spadim, Linux-RAID On Mon, Jan 31, 2011 at 3:59 PM, Phillip Susi <psusi@cfl.rr.com> wrote: > On 1/31/2011 3:23 PM, Stan Hoeppner wrote: >>> Raid 10 is not raid 1+0. Raid 10 defaults to having 2 duplicate copies, >> >> Yes, actually, they are two names for the same RAID level. > > No, they are not. See the mdadm man page. Raid10 can operate on 3 > drives, raid1+0 can not. In theory a raid10 could be done on two disks > though mdadm seems to want at least 3. I operate (and have done so for a long time) a raid10,f2 on two drives. What makes you think it wants at least 3? -- Jon -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 22:08 ` Jon Nelson @ 2011-01-31 22:38 ` Phillip Susi 2011-02-01 10:05 ` David Brown 0 siblings, 1 reply; 127+ messages in thread From: Phillip Susi @ 2011-01-31 22:38 UTC (permalink / raw) To: Jon Nelson; +Cc: Stan Hoeppner, Denis, Roberto Spadim, Linux-RAID On 1/31/2011 5:08 PM, Jon Nelson wrote: > I operate (and have done so for a long time) a raid10,f2 on two drives. > What makes you think it wants at least 3? Neat, I thought I tried that the other day and it complained that raid 5 and 6 ( sic ) need at least 3 drives. Seemed like it was just a bug or oversight. ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-01-31 22:38 ` Phillip Susi @ 2011-02-01 10:05 ` David Brown 0 siblings, 0 replies; 127+ messages in thread From: David Brown @ 2011-02-01 10:05 UTC (permalink / raw) To: linux-raid On 31/01/2011 23:38, Phillip Susi wrote: > On 1/31/2011 5:08 PM, Jon Nelson wrote: >> I operate (and have done so for a long time) a raid10,f2 on two drives. >> What makes you think it wants at least 3? > > Neat, I thought I tried that the other day and it complained that raid 5 > and 6 ( sic ) need at least 3 drives. Seemed like it was just a bug or > oversight. raid5 needs at least 3 disks to make sense (raid5 with 2 disks is the same as raid1), while raid6 needs at least 4 disks. But mdadm raid10 needs only two disks. ^ permalink raw reply [flat|nested] 127+ messages in thread
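For anyone wanting to reproduce the two-drive setup Jon mentions above, the creation is a one-liner (partition names are placeholders):

  mdadm --create /dev/md0 --level=10 --layout=f2 --raid-devices=2 /dev/sda1 /dev/sdb1

Every block ends up on both drives, so the redundancy matches RAID1, while large sequential reads can be striped across the first halves of both disks (typically the faster outer tracks).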
* Re: What's the typical RAID10 setup? 2011-01-31 21:59 ` Phillip Susi 2011-01-31 22:08 ` Jon Nelson @ 2011-02-01 9:20 ` Robin Hill 2011-02-04 16:03 ` Phillip Susi 1 sibling, 1 reply; 127+ messages in thread From: Robin Hill @ 2011-02-01 9:20 UTC (permalink / raw) To: Linux-RAID [-- Attachment #1: Type: text/plain, Size: 1399 bytes --] On Mon Jan 31, 2011 at 04:59:28PM -0500, Phillip Susi wrote: > On 1/31/2011 3:23 PM, Stan Hoeppner wrote: > > This is absolutely not correct. In a 10 disk RAID 10 array, exactly 5 disks can > > fail, as long as no two are in the same mirror pair, and the array will continue > > to function, with little or no performance degradation. > > That is a raid 0+1, not raid10. > No, it's RAID 10 or RAID 1+0. RAID 0+1 would be 2 mirrored pairs of 5-disk RAID 0 arrays, in which case you could only lose 5 disks if they're all from the same RAID 0 array. With RAID 10 or RAID 1+0 (in the case of a 10-disk n2 setup, the physical layout should be exactly the same) then the restriction is, as stated, that no two are "mirrored" (whether that's a separate RAID 1 mirror or just that the two are defined by the RAID 10 layout to contain the same data is irrelevant). > > Where are you getting your information? Pretty much everything you stated is > > wrong... > > The mdadm man page. > The md man page would be better for information on the physical layouts, but I don't see anything on there to support what you're saying here. Cheers, Robin -- ___ ( ' } | Robin Hill <robin@robinhill.me.uk> | / / ) | Little Jim says .... | // !! | "He fallen in de water !!" | [-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-01 9:20 ` Robin Hill @ 2011-02-04 16:03 ` Phillip Susi 2011-02-04 16:22 ` Robin Hill 0 siblings, 1 reply; 127+ messages in thread From: Phillip Susi @ 2011-02-04 16:03 UTC (permalink / raw) To: Linux-RAID FYI, you should get in the habit of using your mail client's reply to all function. I did not see this until now because you did not send me a copy. On 2/1/2011 4:20 AM, Robin Hill wrote: > No, it's RAID 10 or RAID 1+0. RAID 0+1 would be 2 mirrored pairs of > 5-disk RAID 0 arrays, in which case you could only lose 5 disks if In English we read from left to right and top to bottom, so 0+1 means stripe on top of mirror. > The md man page would be better for information on the physical layouts, > but I don't see anything on there to support what you're saying here. The section on raid10 describes the layouts. For a 4 disk array, the default layout of n2 is equivalent to raid 0+1. A 2, 3, or 5 disk array is not even possible with 0+1, but raid10 is quite happy with that. ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-04 16:03 ` Phillip Susi @ 2011-02-04 16:22 ` Robin Hill 2011-02-04 20:35 ` [OT] " Phil Turmel ` (2 more replies) 0 siblings, 3 replies; 127+ messages in thread From: Robin Hill @ 2011-02-04 16:22 UTC (permalink / raw) To: Phillip Susi; +Cc: Linux-RAID [-- Attachment #1: Type: text/plain, Size: 1477 bytes --] On Fri Feb 04, 2011 at 11:03:30AM -0500, Phillip Susi wrote: > FYI, you should get in the habit of using your mail client's reply to > all function. I did not see this until now because you did not send me > a copy. > I'll make an exception in this case, but I generally reply to messages on mailing lists to the list only, and I have no plans to change this. > On 2/1/2011 4:20 AM, Robin Hill wrote: > > No, it's RAID 10 or RAID 1+0. RAID 0+1 would be 2 mirrored pairs of > > 5-disk RAID 0 arrays, in which case you could only lose 5 disks if > > In English we read from left to right and top to bottom, so 0+1 means > stripe on top of mirror. > The vast majority of online sources would disagree with you. See: http://en.wikipedia.org/wiki/Nested_RAID_levels http://www.aput.net/~jheiss/raid10/ http://decipherinfosys.wordpress.com/2008/01/15/difference-between-raid-01-vs-raid-10/ http://www.raid.com/04_01_0_1.html http://www.pcguide.com/ref/hdd/perf/raid/levels/multLevel01-c.html http://www.adrc.com/raid-01.html The order it's written is the order of creation. RAID0+1 = RAID 0, then RAID 1 (mirrored stripes) and RAID1+0 = RAID 1, then RAID 0 (striped mirrors). Cheers, Robin -- ___ ( ' } | Robin Hill <robin@robinhill.me.uk> | / / ) | Little Jim says .... | // !! | "He fallen in de water !!" | [-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 127+ messages in thread
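To make the naming convention concrete, here is the construction order the two terms imply, sketched with mdadm and hypothetical devices. RAID 0+1 builds the stripes first and then mirrors them:

  mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sda /dev/sdb
  mdadm --create /dev/md1 --level=0 --raid-devices=2 /dev/sdc /dev/sdd
  mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/md0 /dev/md1

RAID 1+0 reverses the order: mirrors first, then a stripe over the mirrors (Roberto gives that variant later in the thread). The practical difference is that 0+1 dies as soon as one drive in each stripe set has failed, while 1+0 only dies when both halves of a single mirror are lost.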
* [OT] Re: What's the typical RAID10 setup? 2011-02-04 16:22 ` Robin Hill @ 2011-02-04 20:35 ` Phil Turmel 2011-02-04 20:35 ` Phillip Susi 2011-02-04 21:05 ` Stan Hoeppner 2 siblings, 0 replies; 127+ messages in thread From: Phil Turmel @ 2011-02-04 20:35 UTC (permalink / raw) To: Linux-RAID; +Cc: Phillip Susi Hi Robin, On 02/04/2011 11:22 AM, Robin Hill wrote: > On Fri Feb 04, 2011 at 11:03:30AM -0500, Phillip Susi wrote: > >> FYI, you should get in the habit of using your mail client's reply to >> all function. I did not see this until now because you did not send me >> a copy. >> > I'll make an exception in this case, but I generally reply to messages > on mailing lists to the list only, and I have no plans to change this. You seem to not be aware that this is an open list -- non-subscribers are not only allowed, but encouraged to post, and expect to be included in the To: or CC: lists of replies. In an open list, anything other than reply-to-all is rude to non-subscribers. All lists on kernel.org are open lists. Replying only to the list is perfectly appropriate for closed lists (subscriber only), and many such lists munge the reply-to header to make it easier, but that's another topic... Phil ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-04 16:22 ` Robin Hill 2011-02-04 20:35 ` [OT] " Phil Turmel @ 2011-02-04 20:35 ` Phillip Susi 2011-02-04 21:05 ` Stan Hoeppner 2 siblings, 0 replies; 127+ messages in thread From: Phillip Susi @ 2011-02-04 20:35 UTC (permalink / raw) To: Linux-RAID On 2/4/2011 11:22 AM, Robin Hill wrote: > I'll make an exception in this case, but I generally reply to messages > on mailing lists to the list only, and I have no plans to change this. You should as that is bad netiquette. It breaks threads that are cross posted, and people who post on lists ( or end up getting Cc'd ) without subscribing, which happens a lot on high traffic lists. Reply-to-list is an abomination that violates Internet Mail Standards. See http://david.woodhou.se/reply-to-list.html ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-04 16:22 ` Robin Hill 2011-02-04 20:35 ` [OT] " Phil Turmel 2011-02-04 20:35 ` Phillip Susi @ 2011-02-04 21:05 ` Stan Hoeppner 2011-02-04 21:13 ` Roberto Spadim 2 siblings, 1 reply; 127+ messages in thread From: Stan Hoeppner @ 2011-02-04 21:05 UTC (permalink / raw) To: Phillip Susi, Linux-RAID Robin Hill put forth on 2/4/2011 10:22 AM: > On Fri Feb 04, 2011 at 11:03:30AM -0500, Phillip Susi wrote: > >> FYI, you should get in the habit of using your mail client's reply to >> all function. I did not see this until now because you did not send me >> a copy. >> > I'll make an exception in this case, but I generally reply to messages > on mailing lists to the list only, and I have no plans to change this. > >> On 2/1/2011 4:20 AM, Robin Hill wrote: >>> No, it's RAID 10 or RAID 1+0. RAID 0+1 would be 2 mirrored pairs of >>> 5-disk RAID 0 arrays, in which case you could only lose 5 disks if >> >> In English we read from left to right and top to bottom, so 0+1 means >> stripe on top of mirror. >> > The vast majority of online sources would disagree with you. See: > http://en.wikipedia.org/wiki/Nested_RAID_levels > http://www.aput.net/~jheiss/raid10/ > http://decipherinfosys.wordpress.com/2008/01/15/difference-between-raid-01-vs-raid-10/ > http://www.raid.com/04_01_0_1.html > http://www.pcguide.com/ref/hdd/perf/raid/levels/multLevel01-c.html > http://www.adrc.com/raid-01.html > > The order it's written is the order of creation. RAID0+1 = RAID 0, > then RAID 1 (mirrored stripes) and RAID1+0 = RAID 1, then RAID 0 > (striped mirrors). LSI is the undisputed king of HBA RAID, with all major server OEMs rebadging LSI cards and/or using their storage processor chips on the mobo, including Dell, HP, IBM, Intel, Sun, etc. Note the diagram at the bottom of this PDF showing the layout of RAID 10: http://www.lsi.com/DistributionSystem/AssetDocument/SCG_LSI_SAS_6Gbps_IR_PB_092909.pdf It clearly shows a stripe over two mirrors. -- Stan ^ permalink raw reply [flat|nested] 127+ messages in thread
* Re: What's the typical RAID10 setup? 2011-02-04 21:05 ` Stan Hoeppner @ 2011-02-04 21:13 ` Roberto Spadim 0 siblings, 0 replies; 127+ messages in thread From: Roberto Spadim @ 2011-02-04 21:13 UTC (permalink / raw) To: Stan Hoeppner; +Cc: Phillip Susi, Linux-RAID http://www.lsi.com/DistributionSystem/AssetDocument/SCG_LSI_SAS_6Gbps_IR_PB_092909.pdf raid10 = (you can use Linux md raid10 too, but I'm just explaining raid 1+0):
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdc /dev/sdd
mdadm --create /dev/md2 --level=0 --raid-devices=2 /dev/md0 /dev/md1
2011/2/4 Stan Hoeppner <stan@hardwarefreak.com>: > Robin Hill put forth on 2/4/2011 10:22 AM: >> On Fri Feb 04, 2011 at 11:03:30AM -0500, Phillip Susi wrote: >> >>> FYI, you should get in the habit of using your mail client's reply to >>> all function. I did not see this until now because you did not send me >>> a copy. >>> >> I'll make an exception in this case, but I generally reply to messages >> on mailing lists to the list only, and I have no plans to change this. >> >>> On 2/1/2011 4:20 AM, Robin Hill wrote: >>>> No, it's RAID 10 or RAID 1+0. RAID 0+1 would be 2 mirrored pairs of >>>> 5-disk RAID 0 arrays, in which case you could only lose 5 disks if >>> >>> In English we read from left to right and top to bottom, so 0+1 means >>> stripe on top of mirror. >>> >> The vast majority of online sources would disagree with you. See: >> http://en.wikipedia.org/wiki/Nested_RAID_levels >> http://www.aput.net/~jheiss/raid10/ >> http://decipherinfosys.wordpress.com/2008/01/15/difference-between-raid-01-vs-raid-10/ >> http://www.raid.com/04_01_0_1.html >> http://www.pcguide.com/ref/hdd/perf/raid/levels/multLevel01-c.html >> http://www.adrc.com/raid-01.html >> >> The order it's written is the order of creation. RAID0+1 = RAID 0, >> then RAID 1 (mirrored stripes) and RAID1+0 = RAID 1, then RAID 0 >> (striped mirrors). > > LSI is the undisputed king of HBA RAID, with all major server OEMs rebadging LSI > cards and/or using their storage processor chips on the mobo, including Dell, > HP, IBM, Intel, Sun, etc. Note the diagram at the bottom of this PDF showing > the layout of RAID 10: > > http://www.lsi.com/DistributionSystem/AssetDocument/SCG_LSI_SAS_6Gbps_IR_PB_092909.pdf > > It clearly shows a stripe over two mirrors. > > -- > Stan > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Roberto Spadim Spadim Technology / SPAEmpresarial -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
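For comparison, a minimal sketch of the single-array alternative mentioned earlier in the thread, using md's native RAID10 mode instead of nesting two RAID1s under a RAID0. The device names are the same illustrative /dev/sd[a-d] as above, and the near-2 layout (--layout=n2, the default) is just one reasonable choice, not something specified in the thread:

mdadm --create /dev/md0 --level=10 --layout=n2 --raid-devices=4 /dev/sda /dev/sdb /dev/sdc /dev/sdd

This gives the same usable capacity and the same one-failure-per-mirror tolerance as the nested 1+0 arrangement, but it is created and monitored as a single array, and the far/offset layouts (f2/o2) can offer better sequential read performance depending on the workload.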
* Re: What's the typical RAID10 setup? 2011-01-31 15:21 ` Robin Hill 2011-01-31 15:27 ` Roberto Spadim @ 2011-01-31 15:30 ` Robin Hill 1 sibling, 0 replies; 127+ messages in thread From: Robin Hill @ 2011-01-31 15:30 UTC (permalink / raw) To: Linux-RAID On Mon Jan 31, 2011 at 03:21:51PM +0000, Robin Hill wrote: > On Mon Jan 31, 2011 at 01:00:13PM -0200, Roberto Spadim wrote: > > > i think make two very big raid 0 > > and after raid1 > > is better > > > Not really - you increase the failure risk doing this. With this setup, > a single drive failure from each RAID0 array will lose you the entire > array. With the reverse (RAID0 over RAID1) then you require both drives > in the RAID1 to fail in order to lose the array. Of course, with a 4 > drive array then the risk is the same (33% with 2 drive failures) but > with a 6 drive array it changes to 60% for RAID1 over RAID0 versus 20% > for RAID0 over RAID1. > And I managed to get my maths wrong. Even for a 4-drive array, RAID1 over RAID0 will have a 66% 2-drive failure chance, versus 33% for RAID0 over RAID1. Cheers, Robin -- ___ ( ' } | Robin Hill <robin@robinhill.me.uk> | / / ) | Little Jim says .... | // !! | "He fallen in de water !!" | ^ permalink raw reply [flat|nested] 127+ messages in thread
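To spell out the counting behind those corrected figures: with 4 drives and one already failed, the second failure hits one of the 3 remaining drives. For RAID0 over RAID1 (two mirror pairs), the array is lost only if the second failure happens to be the first drive's mirror partner, i.e. 1 chance in 3 (~33%). For RAID1 over RAID0 (two 2-disk stripes, mirrored), the first failure has already killed one stripe, so the array is lost whenever the second failure lands in the other stripe, i.e. 2 chances in 3 (~66%). The 6-drive numbers quoted above (20% versus 60%) follow from the same counting over the 5 remaining drives (1 in 5 versus 3 in 5).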
* Re: What's the typical RAID10 setup? 2011-01-31 9:41 What's the typical RAID10 setup? Mathias Burén 2011-01-31 10:14 ` Robin Hill 2011-01-31 10:36 ` CoolCold @ 2011-01-31 20:07 ` Stan Hoeppner 2 siblings, 0 replies; 127+ messages in thread From: Stan Hoeppner @ 2011-01-31 20:07 UTC (permalink / raw) To: Mathias Burén; +Cc: Linux-RAID Mathias Burén put forth on 1/31/2011 3:41 AM: > How would one go about expanding a 4 HDD RAID10 into a 6 HDD RAID10? > Is it "just" a matter of creating a new RAID1 array of the 2 new HDDs, > then adding them to the RAID0, then expanding whatever is on that > (lvm, xfs, ext4)? The best way to do this "down the road" expansion is to simply create another 4 drive mdraid 10 array and concatenate it to the first with LVM. My guess is that you're more concerned with usable space than IOPs, so you should probably start with a 4 drive RAID5. Down the road, you could add a 3 drive mdraid 5 and concatenate that with the first array using LVM. This will avoid all kinds of potentially messy, possibly earth shattering mdadm conversions. Running such conversions on live arrays without a reliable backup/restore capability is just asking for trouble. Using concatenation leaves your old and new md arrays totally intact. http://linux.about.com/od/commands/l/blcmdl8_lvm.htm > Are there any design tips, or caveats? For example, how many disks > would you use at most, in a RAID10 setup? As many as you want, or that your hardware can support. The largest RAID 10 array I've setup was 40 drives. That was an FC SAN controller though, not mdraid. -- Stan -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 127+ messages in thread
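A minimal sketch of the LVM concatenation step described above, assuming hypothetical names that are not from the thread: /dev/md0 is the existing array, already a physical volume in volume group vg0 holding logical volume lv_data; /dev/md1 is the newly created second array; and the filesystem is XFS mounted at /srv/data (use resize2fs instead of xfs_growfs for ext4):

pvcreate /dev/md1                        # label the new md array as an LVM physical volume
vgextend vg0 /dev/md1                    # add it to the existing volume group
lvextend -l +100%FREE /dev/vg0/lv_data   # grow the logical volume across the new space
xfs_growfs /srv/data                     # grow the mounted XFS filesystem to fill the LV

Neither md array is reshaped by this, which is the point: the old and new arrays stay intact and only the LVM layer above them changes.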
end of thread, other threads:[~2011-02-06 14:30 UTC | newest] Thread overview: 127+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2011-01-31 9:41 What's the typical RAID10 setup? Mathias Burén 2011-01-31 10:14 ` Robin Hill 2011-01-31 10:22 ` Mathias Burén 2011-01-31 10:36 ` CoolCold 2011-01-31 15:00 ` Roberto Spadim 2011-01-31 15:21 ` Robin Hill 2011-01-31 15:27 ` Roberto Spadim 2011-01-31 15:28 ` Roberto Spadim 2011-01-31 15:32 ` Roberto Spadim 2011-01-31 15:34 ` Roberto Spadim 2011-01-31 15:37 ` Roberto Spadim 2011-01-31 15:45 ` Robin Hill 2011-01-31 16:55 ` Denis 2011-01-31 17:31 ` Roberto Spadim 2011-01-31 18:35 ` Denis 2011-01-31 19:15 ` Roberto Spadim 2011-01-31 19:28 ` Keld Jørn Simonsen 2011-01-31 19:35 ` Roberto Spadim 2011-01-31 19:37 ` Roberto Spadim 2011-01-31 20:22 ` Keld Jørn Simonsen 2011-01-31 20:17 ` Stan Hoeppner 2011-01-31 20:37 ` Keld Jørn Simonsen 2011-01-31 21:20 ` Roberto Spadim 2011-01-31 21:24 ` Mathias Burén 2011-01-31 21:27 ` Jon Nelson 2011-01-31 21:47 ` Roberto Spadim 2011-01-31 21:51 ` Roberto Spadim 2011-01-31 22:50 ` NeilBrown 2011-01-31 22:53 ` Roberto Spadim 2011-01-31 23:10 ` NeilBrown 2011-01-31 23:14 ` Roberto Spadim 2011-01-31 22:52 ` Keld Jørn Simonsen 2011-01-31 23:00 ` Roberto Spadim 2011-02-01 10:01 ` David Brown 2011-02-01 13:50 ` Jon Nelson 2011-02-01 14:25 ` Roberto Spadim 2011-02-01 14:48 ` David Brown 2011-02-01 15:41 ` Roberto Spadim 2011-02-03 3:36 ` Drew 2011-02-03 8:18 ` Stan Hoeppner [not found] ` <AANLkTikerSZfhMbkEvGBVyLB=wHDSHLWszoEz5As5Hi4@mail.gmail.com> [not found] ` <AANLkTikLyR206x4aMy+veNkWPV67uF9r5dZKGqXJUEqN@mail.gmail.com> 2011-02-03 14:35 ` Roberto Spadim 2011-02-03 15:43 ` Keld Jørn Simonsen 2011-02-03 15:50 ` Roberto Spadim 2011-02-03 15:54 ` Roberto Spadim 2011-02-03 16:02 ` Keld Jørn Simonsen 2011-02-03 16:07 ` Roberto Spadim 2011-02-03 16:16 ` Roberto Spadim 2011-02-01 22:05 ` Stan Hoeppner 2011-02-01 23:12 ` Roberto Spadim 2011-02-02 9:25 ` Robin Hill 2011-02-02 16:00 ` Roberto Spadim 2011-02-02 16:06 ` Roberto Spadim 2011-02-02 16:07 ` Roberto Spadim 2011-02-02 16:10 ` Roberto Spadim 2011-02-02 16:13 ` Roberto Spadim 2011-02-02 19:44 ` Keld Jørn Simonsen 2011-02-02 20:28 ` Roberto Spadim 2011-02-02 21:31 ` Roberto Spadim 2011-02-02 22:13 ` Keld Jørn Simonsen 2011-02-02 22:26 ` Roberto Spadim 2011-02-03 1:57 ` Roberto Spadim 2011-02-03 3:05 ` Stan Hoeppner 2011-02-03 3:13 ` Roberto Spadim 2011-02-03 3:17 ` Roberto Spadim 2011-02-01 23:35 ` Keld Jørn Simonsen 2011-02-01 16:02 ` Keld Jørn Simonsen 2011-02-01 16:24 ` Roberto Spadim 2011-02-01 17:56 ` Keld Jørn Simonsen 2011-02-01 18:09 ` Roberto Spadim 2011-02-01 20:16 ` Keld Jørn Simonsen 2011-02-01 20:32 ` Keld Jørn Simonsen 2011-02-01 20:58 ` Roberto Spadim 2011-02-01 21:04 ` Roberto Spadim 2011-02-01 21:18 ` David Brown 2011-02-01 0:58 ` Stan Hoeppner 2011-02-01 12:50 ` Roman Mamedov 2011-02-03 11:04 ` Keld Jørn Simonsen 2011-02-03 14:17 ` Roberto Spadim 2011-02-03 15:54 ` Keld Jørn Simonsen 2011-02-03 18:39 ` Keld Jørn Simonsen 2011-02-03 18:41 ` Roberto Spadim 2011-02-03 23:43 ` Stan Hoeppner 2011-02-04 3:49 ` hansbkk 2011-02-04 7:06 ` Keld Jørn Simonsen 2011-02-04 8:27 ` Stan Hoeppner 2011-02-04 9:06 ` Keld Jørn Simonsen 2011-02-04 10:04 ` Stan Hoeppner 2011-02-04 11:15 ` hansbkk 2011-02-04 13:33 ` Keld Jørn Simonsen 2011-02-04 20:35 ` Keld Jørn Simonsen 2011-02-04 20:42 ` Keld Jørn Simonsen 2011-02-04 21:15 ` Stan Hoeppner 2011-02-04 22:05 ` Keld Jørn Simonsen 2011-02-04 23:03 ` Stan Hoeppner 2011-02-06 3:59 ` 
Drew 2011-02-06 4:27 ` Stan Hoeppner 2011-02-04 11:34 ` David Brown 2011-02-04 13:53 ` Keld Jørn Simonsen 2011-02-04 14:17 ` David Brown 2011-02-04 14:21 ` hansbkk 2011-02-06 4:02 ` Drew 2011-02-06 7:58 ` Keld Jørn Simonsen 2011-02-06 12:03 ` Roman Mamedov 2011-02-06 14:30 ` Roberto Spadim 2011-02-01 8:46 ` hansbkk 2011-01-31 19:37 ` Phillip Susi 2011-01-31 19:41 ` Roberto Spadim 2011-01-31 19:46 ` Phillip Susi 2011-01-31 19:53 ` Roberto Spadim 2011-01-31 22:10 ` Phillip Susi 2011-01-31 22:14 ` Denis 2011-01-31 22:33 ` Roberto Spadim 2011-01-31 22:36 ` Roberto Spadim 2011-01-31 20:23 ` Stan Hoeppner 2011-01-31 21:59 ` Phillip Susi 2011-01-31 22:08 ` Jon Nelson 2011-01-31 22:38 ` Phillip Susi 2011-02-01 10:05 ` David Brown 2011-02-01 9:20 ` Robin Hill 2011-02-04 16:03 ` Phillip Susi 2011-02-04 16:22 ` Robin Hill 2011-02-04 20:35 ` [OT] " Phil Turmel 2011-02-04 20:35 ` Phillip Susi 2011-02-04 21:05 ` Stan Hoeppner 2011-02-04 21:13 ` Roberto Spadim 2011-01-31 15:30 ` Robin Hill 2011-01-31 20:07 ` Stan Hoeppner