* expand raid10
@ 2011-04-13 4:28 Roberto Spadim
2011-04-13 7:15 ` Mathias Burén
0 siblings, 1 reply; 12+ messages in thread
From: Roberto Spadim @ 2011-04-13 4:28 UTC (permalink / raw)
To: Linux-RAID
Hi guys, today I have a 2-disk raid10 far array, with 2 TB per disk and a 2 TB array.
I want to expand it to a 4-disk raid10 far array, with 2 TB per disk and a 4 TB array.
In other words, I will add two more 2 TB disks and I want 2 TB more space.
Could I do this with the raid10 far layout? I'm using ext4 as the filesystem.
How could I expand the ext4 filesystem?
--
Roberto Spadim
Spadim Technology / SPAEmpresarial
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: expand raid10
2011-04-13 4:28 expand raid10 Roberto Spadim
@ 2011-04-13 7:15 ` Mathias Burén
2011-04-13 10:47 ` Roberto Spadim
0 siblings, 1 reply; 12+ messages in thread
From: Mathias Burén @ 2011-04-13 7:15 UTC (permalink / raw)
To: Roberto Spadim; +Cc: Linux-RAID
On 13 April 2011 05:28, Roberto Spadim <roberto@spadim.com.br> wrote:
> hi guys, today i have 2 disks raid10 far, with 2tb each disk/array
> i want to expand it to 4 disks raid10 far, with 2tb each disk and 4tb array
> in other words, i will put more 2 disks of 2tb and i want more 2tb of space
>
> could i do this with raid10 far layout? i´m using ext4 at filesystem
> how could i expand ext4 filesystem?
>
No, you cannot expand RAID0.
// M
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: expand raid10
2011-04-13 7:15 ` Mathias Burén
@ 2011-04-13 10:47 ` Roberto Spadim
2011-04-13 11:10 ` Keld Jørn Simonsen
0 siblings, 1 reply; 12+ messages in thread
From: Roberto Spadim @ 2011-04-13 10:47 UTC (permalink / raw)
To: Mathias Burén; +Cc: Linux-RAID
Could I expand raid10 with another layout?
2011/4/13 Mathias Burén <mathias.buren@gmail.com>:
> On 13 April 2011 05:28, Roberto Spadim <roberto@spadim.com.br> wrote:
>> hi guys, today i have 2 disks raid10 far, with 2tb each disk/array
>> i want to expand it to 4 disks raid10 far, with 2tb each disk and 4tb array
>> in other words, i will put more 2 disks of 2tb and i want more 2tb of space
>>
>> could i do this with raid10 far layout? i´m using ext4 at filesystem
>> how could i expand ext4 filesystem?
>>
>
> No, you cannot expand RAID0.
>
> // M
--
Roberto Spadim
Spadim Technology / SPAEmpresarial
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: expand raid10
2011-04-13 10:47 ` Roberto Spadim
@ 2011-04-13 11:10 ` Keld Jørn Simonsen
2011-04-13 11:17 ` NeilBrown
0 siblings, 1 reply; 12+ messages in thread
From: Keld Jørn Simonsen @ 2011-04-13 11:10 UTC (permalink / raw)
To: Roberto Spadim; +Cc: Mathias Burén, Linux-RAID
On Wed, Apr 13, 2011 at 07:47:26AM -0300, Roberto Spadim wrote:
> raid10 with other layout i could expand?
My understanding is that you currently cannot expand raid10,
but there are things in the works. Expansion of raid10,far
was not on the list from Neil; raid10,near was. But it should be fairly
easy to expand raid10,far. You can just treat one of the copies as your
reference data, and copy that data to the other raid0-like parts of the
array. I wonder if Neil thinks he could leave that as an exercise for
me to implement... I would like to be able to combine it with a
reformat to a more robust layout of raid10,far that in some cases can survive more
than one disk failure.
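
To make the "raid0-like parts" concrete, here is a minimal sketch of where a far layout places each chunk. It assumes the usual description of md's far layout (each copy is a raid0-style stripe in its own section of every device, with each further copy shifted by one device); the function name `far_layout` and the letter notation are only for illustration and are not part of md or mdadm.

```python
def far_layout(num_chunks, num_devices, copies=2):
    """Return one string per device showing where each chunk lands.

    Lower-case letters are the first copy, upper-case letters are the
    later (far) copies. This is a toy model, not md's actual code.
    """
    rows = -(-num_chunks // num_devices)  # rows per copy, rounded up
    disks = [["."] * (rows * copies) for _ in range(num_devices)]
    for chunk in range(num_chunks):
        letter = chr(ord("a") + chunk)
        for copy in range(copies):
            # Each copy is striped like raid0, shifted by one device per copy.
            dev = (chunk + copy) % num_devices
            row = chunk // num_devices + copy * rows
            disks[dev][row] = letter.upper() if copy else letter
    return ["".join(d) for d in disks]


# The same 12 chunks on 2 devices and on 4 devices.
print(far_layout(12, 2))  # ['acegikBDFHJL', 'bdfhjlACEGIK']
print(far_layout(12, 4))  # ['aeiDHL', 'bfjAEI', 'cgkBFJ', 'dhlCGK']
```

Comparing the two results shows why adding devices is more than an append: almost every chunk ends up at a different device or offset, so the data has to be restriped from one of the copies.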
best regards
keld
> 2011/4/13 Mathias Burén <mathias.buren@gmail.com>:
> > On 13 April 2011 05:28, Roberto Spadim <roberto@spadim.com.br> wrote:
> >> hi guys, today i have 2 disks raid10 far, with 2tb each disk/array
> >> i want to expand it to 4 disks raid10 far, with 2tb each disk and 4tb array
> >> in other words, i will put more 2 disks of 2tb and i want more 2tb of space
> >>
> >> could i do this with raid10 far layout? i'm using ext4 at filesystem
> >> how could i expand ext4 filesystem?
> >>
> >
> > No, you cannot expand RAID0.
> >
> > // M
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: expand raid10
2011-04-13 11:10 ` Keld Jørn Simonsen
@ 2011-04-13 11:17 ` NeilBrown
2011-04-13 12:34 ` Keld Jørn Simonsen
2011-04-13 12:34 ` David Brown
0 siblings, 2 replies; 12+ messages in thread
From: NeilBrown @ 2011-04-13 11:17 UTC (permalink / raw)
To: Keld Jørn Simonsen; +Cc: Roberto Spadim, Mathias Burén, Linux-RAID
On Wed, 13 Apr 2011 13:10:16 +0200 Keld Jørn Simonsen <keld@keldix.com> wrote:
> On Wed, Apr 13, 2011 at 07:47:26AM -0300, Roberto Spadim wrote:
> > raid10 with other layout i could expand?
>
> My understanding is that you currently cannot expand raid10.
> but there are things in the works. Expansion of raid10,far
> was not on the list from neil, raid10,near was. But it should be fairly
> easy to expand raid10,far. You can just treat one of the copies as your
> refence data, and copy that data to the other raid0-like parts of the
> array. I wonder if Neil thinks he could leave that as an exersize for
> me to implement... I would like to be able to combine it with a
> reformat to a more robust layout of raid10,far that in some cases can survive more
> than one disk failure.
>
I'm very happy for anyone to offer to implement anything.
I will of course require the code to be of reasonable quality before I accept
it, but I'm also happy to give helpful review comments and guidance.
So don't wait for permission, if you want to try implementing something, just
do it.
Equally if there is something that I particularly want done I won't wait
forever for someone else who says they are working on it. But RAID10 reshape is
a long way from the top of my list.
NeilBrown
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: expand raid10
2011-04-13 11:17 ` NeilBrown
@ 2011-04-13 12:34 ` Keld Jørn Simonsen
2011-04-13 23:28 ` NeilBrown
2011-04-13 12:34 ` David Brown
1 sibling, 1 reply; 12+ messages in thread
From: Keld Jørn Simonsen @ 2011-04-13 12:34 UTC (permalink / raw)
To: NeilBrown
Cc: Keld Jørn Simonsen, Roberto Spadim, Mathias Burén, Linux-RAID
On Wed, Apr 13, 2011 at 09:17:15PM +1000, NeilBrown wrote:
> On Wed, 13 Apr 2011 13:10:16 +0200 Keld Jørn Simonsen <keld@keldix.com> wrote:
>
> > On Wed, Apr 13, 2011 at 07:47:26AM -0300, Roberto Spadim wrote:
> > > raid10 with other layout i could expand?
> >
> > My understanding is that you currently cannot expand raid10.
> > but there are things in the works. Expansion of raid10,far
> > was not on the list from neil, raid10,near was. But it should be fairly
> > easy to expand raid10,far. You can just treat one of the copies as your
> > refence data, and copy that data to the other raid0-like parts of the
> > array. I wonder if Neil thinks he could leave that as an exersize for
> > me to implement... I would like to be able to combine it with a
> > reformat to a more robust layout of raid10,far that in some cases can survive more
> > than one disk failure.
> >
>
> I'm very happy for anyone to offer to implement anything.
>
> I will of course require the code to be of reasonable quality before I accept
> it, but I'm also happy to give helpful review comments and guidance.
>
> So don't wait for permission, if you want to try implementing something, just
> do it.
>
> Equally if there is something that I particularly want done I won't wait for
> ever for someone else who says they are working on it. But RAID10 reshape is
> a long way from the top of my list.
Hi Neil!
Yes, that is how I understand your policy on contributions.
By RAID10 reshaping, do you also mean RAID10 expansion?
In my eyes this is quite important, and something that I have wanted for
a long time. I think it is a quite common task for many Linux MD users.
best regards
keld
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: expand raid10
2011-04-13 11:17 ` NeilBrown
2011-04-13 12:34 ` Keld Jørn Simonsen
@ 2011-04-13 12:34 ` David Brown
2011-04-13 23:36 ` NeilBrown
1 sibling, 1 reply; 12+ messages in thread
From: David Brown @ 2011-04-13 12:34 UTC (permalink / raw)
To: linux-raid
On 13/04/2011 13:17, NeilBrown wrote:
> On Wed, 13 Apr 2011 13:10:16 +0200 Keld Jørn Simonsen<keld@keldix.com> wrote:
>
>> On Wed, Apr 13, 2011 at 07:47:26AM -0300, Roberto Spadim wrote:
>>> raid10 with other layout i could expand?
>>
>> My understanding is that you currently cannot expand raid10.
>> but there are things in the works. Expansion of raid10,far
>> was not on the list from neil, raid10,near was. But it should be fairly
>> easy to expand raid10,far. You can just treat one of the copies as your
>> refence data, and copy that data to the other raid0-like parts of the
>> array. I wonder if Neil thinks he could leave that as an exersize for
>> me to implement... I would like to be able to combine it with a
>> reformat to a more robust layout of raid10,far that in some cases can survive more
>> than one disk failure.
>>
>
> I'm very happy for anyone to offer to implement anything.
>
> I will of course require the code to be of reasonable quality before I accept
> it, but I'm also happy to give helpful review comments and guidance.
>
> So don't wait for permission, if you want to try implementing something, just
> do it.
>
> Equally if there is something that I particularly want done I won't wait for
> ever for someone else who says they are working on it. But RAID10 reshape is
> a long way from the top of my list.
>
I know you have other exciting things on your to-do list - there was
lots in your roadmap thread a while back.
But I'd like to put in a word for raid10,far - it is an excellent choice
of layout for small or medium systems with a combination of redundancy
and near-raid0 speed. It is especially ideal for 2 or 3 disk systems.
The only disadvantage is that it can't be resized or re-shaped. The
algorithm suggested by Keld sounds simple to implement, but it would
leave the disks in a non-redundant state during the resize/reshape.
That would be good enough for some uses (and better than nothing), but
not good enough for all uses. It may also be scalable to include both
resizing (replacing each disk with a bigger one) and adding another disk
to the array.
Currently, it /is/ possible to get an approximate raid10,far layout that
is resizeable and reshapeable. You can divide the member disks into two
partitions and pair them off appropriately in mirrors. Then use these
mirrors to form a degraded raid5 with "parity-last" layout and a missing
last disk - this is, as far as I can see, equivalent to a raid0 layout
but can be re-shaped to more disks and resized to use bigger disks.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: expand raid10
2011-04-13 12:34 ` Keld Jørn Simonsen
@ 2011-04-13 23:28 ` NeilBrown
0 siblings, 0 replies; 12+ messages in thread
From: NeilBrown @ 2011-04-13 23:28 UTC (permalink / raw)
To: Keld Jørn Simonsen; +Cc: Roberto Spadim, Mathias Burén, Linux-RAID
On Wed, 13 Apr 2011 14:34:14 +0200 Keld Jørn Simonsen <keld@keldix.com> wrote:
> On Wed, Apr 13, 2011 at 09:17:15PM +1000, NeilBrown wrote:
> > On Wed, 13 Apr 2011 13:10:16 +0200 Keld Jørn Simonsen <keld@keldix.com> wrote:
> >
> > > On Wed, Apr 13, 2011 at 07:47:26AM -0300, Roberto Spadim wrote:
> > > > raid10 with other layout i could expand?
> > >
> > > My understanding is that you currently cannot expand raid10.
> > > but there are things in the works. Expansion of raid10,far
> > > was not on the list from neil, raid10,near was. But it should be fairly
> > > easy to expand raid10,far. You can just treat one of the copies as your
> > > refence data, and copy that data to the other raid0-like parts of the
> > > array. I wonder if Neil thinks he could leave that as an exersize for
> > > me to implement... I would like to be able to combine it with a
> > > reformat to a more robust layout of raid10,far that in some cases can survive more
> > > than one disk failure.
> > >
> >
> > I'm very happy for anyone to offer to implement anything.
> >
> > I will of course require the code to be of reasonable quality before I accept
> > it, but I'm also happy to give helpful review comments and guidance.
> >
> > So don't wait for permission, if you want to try implementing something, just
> > do it.
> >
> > Equally if there is something that I particularly want done I won't wait for
> > ever for someone else who says they are working on it. But RAID10 reshape is
> > a long way from the top of my list.
>
> Hi Neil!
>
> Yes, that is how I understand your policy on contributions.
>
> Do you by RAID10 reshaping also mean RAID10 expansion?
> In my eyes this is quite important, and something that I have wanted for
> a long time. I think it is a quite common task for many Linux MD users.
>
> best regards
> keld
Yes, by 'reshaping' I mean everything included here:
http://neil.brown.name/blog/20110216044002#11
which includes size changes.
NeilBrown
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: expand raid10
2011-04-13 12:34 ` David Brown
@ 2011-04-13 23:36 ` NeilBrown
2011-04-14 8:16 ` David Brown
2011-04-15 16:52 ` Keld Jørn Simonsen
0 siblings, 2 replies; 12+ messages in thread
From: NeilBrown @ 2011-04-13 23:36 UTC (permalink / raw)
To: David Brown; +Cc: linux-raid
On Wed, 13 Apr 2011 14:34:15 +0200 David Brown <david@westcontrol.com> wrote:
> On 13/04/2011 13:17, NeilBrown wrote:
> > On Wed, 13 Apr 2011 13:10:16 +0200 Keld Jørn Simonsen<keld@keldix.com> wrote:
> >
> >> On Wed, Apr 13, 2011 at 07:47:26AM -0300, Roberto Spadim wrote:
> >>> raid10 with other layout i could expand?
> >>
> >> My understanding is that you currently cannot expand raid10.
> >> but there are things in the works. Expansion of raid10,far
> >> was not on the list from neil, raid10,near was. But it should be fairly
> >> easy to expand raid10,far. You can just treat one of the copies as your
> >> refence data, and copy that data to the other raid0-like parts of the
> >> array. I wonder if Neil thinks he could leave that as an exersize for
> >> me to implement... I would like to be able to combine it with a
> >> reformat to a more robust layout of raid10,far that in some cases can survive more
> >> than one disk failure.
> >>
> >
> > I'm very happy for anyone to offer to implement anything.
> >
> > I will of course require the code to be of reasonable quality before I accept
> > it, but I'm also happy to give helpful review comments and guidance.
> >
> > So don't wait for permission, if you want to try implementing something, just
> > do it.
> >
> > Equally if there is something that I particularly want done I won't wait for
> > ever for someone else who says they are working on it. But RAID10 reshape is
> > a long way from the top of my list.
> >
>
> I know you have other exciting things on your to-do list - there was
> lots in your roadmap thread a while back.
>
> But I'd like to put in a word for raid10,far - it is an excellent choice
> of layout for small or medium systems with a combination of redundancy
> and near-raid0 speed. It is especially ideal for 2 or 3 disk systems.
> The only disadvantage is that it can't be resized or re-shaped. The
> algorithm suggested by Keld sounds simple to implement, but it would
> leave the disks in a non-redundant state during the resize/reshape.
> That would be good enough for some uses (and better than nothing), but
> not good enough for all uses. It may also be scalable to include both
> resizing (replacing each disk with a bigger one) and adding another disk
> to the array.
>
> Currently, it /is/ possible to get an approximate raid10,far layout that
> is resizeable and reshapeable. You can divide the member disks into two
> partitions and pair them off appropriately in mirrors. Then use these
> mirrors to form a degraded raid5 with "parity-last" layout and a missing
> last disk - this is, as far as I can see, equivalent to a raid0 layout
> but can be re-shaped to more disks and resized to use bigger disks.
>
There is an interesting idea in here....
Currently if the devices in an md/raid array with redundancy (1,4,5,6,10) are
of different sizes, they are all treated as being the size of the smallest
device.
However this doesn't really make sense for RAID10-far.
For RAID10-far, it would make sense for the offset where the second slab of
data appears to be not 50% of the smallest device (in the far-2 case), but 50%
of the current device.
Then replacing all the devices in a RAID10-far with larger devices would mean
that the size of the array could then be increased with no further data
rearrangement.
A lot of care would be needed to implement this as the assumption that all
drives are only as big as the smallest is pretty deep. But it could be done
and would be sensible.
That would make point 2 of http://neil.brown.name/blog/20110216044002#11 a
lot simpler.
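
A rough sketch of the arithmetic behind this idea, for the far-2 case. The helper names `slab_offset` and `growable_component` are hypothetical and the sizes are in arbitrary units; this only illustrates the two offset rules described above and is not how md stores or computes anything.

```python
def slab_offset(device_size, smallest_size, per_device):
    """Offset of the second slab on one device (far-2 case).

    per_device=False: 50% of the smallest device (the current rule).
    per_device=True:  50% of this device (the proposed rule).
    """
    basis = device_size if per_device else smallest_size
    return basis // 2


def growable_component(sizes, offsets):
    """Largest per-device component usable without moving the second slab.

    The first slab must fit before the offset and the second slab must
    fit between the offset and the end of the device.
    """
    return min(min(off, size - off) for size, off in zip(sizes, offsets))


# An array built on devices of size 12 has every device replaced by size 18.
old_smallest = 12
new_sizes = [18, 18]

fixed = [slab_offset(s, old_smallest, per_device=False) for s in new_sizes]
local = [slab_offset(s, old_smallest, per_device=True) for s in new_sizes]

print(fixed, growable_component(new_sizes, fixed))  # [6, 6] 6
print(local, growable_component(new_sizes, local))  # [9, 9] 9
```

Under the current rule the second slab stays at offset 6, so the component cannot exceed 6 until that slab is relocated. Under the per-device rule each replacement device is rebuilt with its slab at offset 9, so the component can grow to 9 with no further data movement.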
NeilBrown
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: expand raid10
2011-04-13 23:36 ` NeilBrown
@ 2011-04-14 8:16 ` David Brown
2011-04-15 16:52 ` Keld Jørn Simonsen
1 sibling, 0 replies; 12+ messages in thread
From: David Brown @ 2011-04-14 8:16 UTC (permalink / raw)
To: linux-raid
On 14/04/2011 01:36, NeilBrown wrote:
> On Wed, 13 Apr 2011 14:34:15 +0200 David Brown<david@westcontrol.com> wrote:
>
>> On 13/04/2011 13:17, NeilBrown wrote:
>>> On Wed, 13 Apr 2011 13:10:16 +0200 Keld Jørn Simonsen<keld@keldix.com> wrote:
>>>
>>>> On Wed, Apr 13, 2011 at 07:47:26AM -0300, Roberto Spadim wrote:
>>>>> raid10 with other layout i could expand?
>>>>
>>>> My understanding is that you currently cannot expand raid10.
>>>> but there are things in the works. Expansion of raid10,far
>>>> was not on the list from neil, raid10,near was. But it should be fairly
>>>> easy to expand raid10,far. You can just treat one of the copies as your
>>>> refence data, and copy that data to the other raid0-like parts of the
>>>> array. I wonder if Neil thinks he could leave that as an exersize for
>>>> me to implement... I would like to be able to combine it with a
>>>> reformat to a more robust layout of raid10,far that in some cases can survive more
>>>> than one disk failure.
>>>>
>>>
>>> I'm very happy for anyone to offer to implement anything.
>>>
>>> I will of course require the code to be of reasonable quality before I accept
>>> it, but I'm also happy to give helpful review comments and guidance.
>>>
>>> So don't wait for permission, if you want to try implementing something, just
>>> do it.
>>>
>>> Equally if there is something that I particularly want done I won't wait for
>>> ever for someone else who says they are working on it. But RAID10 reshape is
>>> a long way from the top of my list.
>>>
>>
>> I know you have other exciting things on your to-do list - there was
>> lots in your roadmap thread a while back.
>>
>> But I'd like to put in a word for raid10,far - it is an excellent choice
>> of layout for small or medium systems with a combination of redundancy
>> and near-raid0 speed. It is especially ideal for 2 or 3 disk systems.
>> The only disadvantage is that it can't be resized or re-shaped. The
>> algorithm suggested by Keld sounds simple to implement, but it would
>> leave the disks in a non-redundant state during the resize/reshape.
>> That would be good enough for some uses (and better than nothing), but
>> not good enough for all uses. It may also be scalable to include both
>> resizing (replacing each disk with a bigger one) and adding another disk
>> to the array.
>>
>> Currently, it /is/ possible to get an approximate raid10,far layout that
>> is resizeable and reshapeable. You can divide the member disks into two
>> partitions and pair them off appropriately in mirrors. Then use these
>> mirrors to form a degraded raid5 with "parity-last" layout and a missing
>> last disk - this is, as far as I can see, equivalent to a raid0 layout
>> but can be re-shaped to more disks and resized to use bigger disks.
>>
>
> There is an interesting idea in here....
>
> Currently if the devices in an md/raid array with redundancy (1,4,5,6,10) are
> of difference sizes, they are all treated as being the size of the smallest
> device.
> However this doesn't really make sense for RAID10-far.
>
> For RAID10-far, it would make the offset where the second slab of data
> appeared not be 50% of the smallest device (in the far-2 case), but 50% of
> the current device.
>
> Then replacing all the devices in a RAID10-far with larger devices would mean
> that the size of the array could then be increased with no further data
> rearrangement.
>
> A lot of care would be needed to implement this as the assumption that all
> drives are only as big as the smallest is pretty deep. But it could be done
> and would be sensible.
>
> That would make point 2 of http://neil.brown.name/blog/20110216044002#11 a
> lot simpler.
>
I'd like to share an idea here for a slight change in the metadata, and
an algorithm that I think can be used for resizing raid10,far. I
apologise if I've got my terminology wrong, or if it sounds like I'm
teaching my grandmother to suck eggs.
I think you want to make a distinction between the size of the
underlying device (disk, partition, lvm device, other md raid), the size
of the components actually used, and the position of the mirror copy in
raid10.
I see it as perfectly reasonable to assume that the used component size
is the same for all devices in an array, and that this only changes when
you "grow" the array itself (assuming the underlying devices are
bigger). That's the way raid 1, 4, 5, and 6 work, and I think that
assumption would help make 10 growable. It is also, AFAIU, the reason
normal raid 0 isn't growable - because it doesn't have that restriction.
(Maybe raid0 can be made growable for cases where the component sizes
are the same?)
To make raid10,far resizeable, I think the key is that instead of
"position of second copy" being fixed at 50% of the array component
size, or 50% of the underlying device size, it should be variable. In
fact, not only should it be variable - it should consist of two (start,
length) pairs.
The issue here is that to do a safe grow after resizing the underlying
device (this being the most awkward case), the mirror copy has to be
moved rather than deleted and re-written - otherwise you lose your
redundancy. But if you keep track of two valid regions, it becomes
easier. In the most common case, growing the disk, you would start at
the end. Copy a block from the end of the component part of the mirror
to the appropriate place near the end of the new underlying device.
Update the second (start, length) pair to include this block, and the
first (start, length) pair to remove it. Repeat the process until you
have copied over everything valid and then have a device with a first
data block, then some unused space, then a mirror block, then some
unused space. Once every underlying device is in this shape, then a
"grow" is just a straight sync of the unused space (or you just mark it
in the non-sync bitmap).
Let me try to put it into a picture. I'll label all the real data
blocks by letters, and use "." for unused data blocks. Small letters
and big letters represent the same data in two copies. "*" is for
non-sync bitmap data, or data that must be synced normally (if the
non-sync bitmap functionality is not yet implemented).
The list of numbers after the disks is:
Size of underlying disk, size of component, (start, length), (start, length)
We start with a raid10,far layout:
1: acegikBDFHJL 12, 6, (6, 6), (0, 0)
2: bdfhjlACEGIK 12, 6, (6, 6), (0, 0)
Then we assume disk 2 is grown (either it is an LVM partition, a raid
that is grown, or whatever). Thus we have:
1: acegikBDFHJL 12, 6, (6, 6), (0, 0)
2: bdfhjlACEGIK...... 18, 6, (6, 6), (0, 0)
Rebalancing disk 2 (which may be done as its own operation, or
automatically during a "grow" of the whole array - assuming each
component disk has enough space) goes through steps like this:
2: bdfhjlACEGIK...... 18, 6, (6, 6), (0, 0)
2: bdfhjlACEGIK.IK... 18, 6, (6, 6), (13, 2)
2: bdfhjlACEG...IK... 18, 6, (6, 4), (13, 2)
2: bdfhjlACEG.EGIK... 18, 6, (6, 4), (11, 4)
2: bdfhjlAC...EGIK... 18, 6, (6, 2), (11, 4)
2: bdfhjlAC.ACEGIK... 18, 6, (6, 2), (9, 6)
2: bdfhjl...ACEGIK... 18, 6, (6, 0), (9, 6)
2: bdfhjl...ACEGIK... 18, 6, (9, 6), (0, 0)
With the pair now being:
1: acegikBDFHJL 12, 6, (6, 6), (0, 0)
2: bdfhjl...ACEGIK... 18, 6, (9, 6), (0, 0)
After a similar process with disk 1 we have:
1: acegik...BDFHJL... 18, 6, (9, 6), (0, 0)
2: bdfhjl...ACEGIK... 18, 6, (9, 6), (0, 0)
"Grow" gives you:
1: acegik***BDFHJL*** 18, 9, (9, 9), (0, 0)
2: bdfhjl***ACEGIK*** 18, 9, (9, 9), (0, 0)
A similar sort of sequence is easy to imagine for shrinking partitions.
And when replacing a disk with a new one, this re-shape could easily
be combined with a hot-replace copy.
As far as I can see, this setup with the extra metadata will hold
everything consistent, safe and redundant during the whole operation.
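
The rebalancing steps in the picture above can be replayed with a small simulation. This is only a sketch of the bookkeeping under the assumptions of that picture (one disk, a block size of two, "." for unused space); the function name `rebalance` and the tuple representation of the two (start, length) pairs are made up for illustration.

```python
def rebalance(disk, old_start, length, new_start, step=2):
    """Move the mirror region from old_start to new_start, end first.

    Returns a list of (layout, first_pair, second_pair) states, where
    each pair is a (start, length) description of a valid mirror region.
    """
    disk = list(disk)
    first, second = (old_start, length), (0, 0)
    states = [("".join(disk), first, second)]
    remaining = length  # blocks still valid only in the old region
    while remaining > 0:
        n = min(step, remaining)
        src = old_start + remaining - n
        dst = new_start + remaining - n
        # Copy first, so the block is briefly valid in both regions.
        disk[dst:dst + n] = disk[src:src + n]
        second = (dst, length - remaining + n)
        states.append(("".join(disk), first, second))
        # Only then retire the old location.
        disk[src:src + n] = "." * n
        remaining -= n
        first = (old_start, remaining)
        states.append(("".join(disk), first, second))
    # The new region becomes the only mirror region.
    states.append(("".join(disk), second, (0, 0)))
    return states


for layout, first, second in rebalance("bdfhjlACEGIK......", 6, 6, 9):
    print(layout, first, second)
```

Every intermediate state keeps each mirror block inside at least one of the two recorded regions, which is the property that preserves redundancy during the move.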
Kind regards,
David
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: expand raid10
2011-04-13 23:36 ` NeilBrown
2011-04-14 8:16 ` David Brown
@ 2011-04-15 16:52 ` Keld Jørn Simonsen
2011-04-18 0:46 ` NeilBrown
1 sibling, 1 reply; 12+ messages in thread
From: Keld Jørn Simonsen @ 2011-04-15 16:52 UTC (permalink / raw)
To: NeilBrown; +Cc: David Brown, linux-raid
On Thu, Apr 14, 2011 at 09:36:57AM +1000, NeilBrown wrote:
> On Wed, 13 Apr 2011 14:34:15 +0200 David Brown <david@westcontrol.com> wrote:
>
> > On 13/04/2011 13:17, NeilBrown wrote:
> > > On Wed, 13 Apr 2011 13:10:16 +0200 Keld Jørn Simonsen<keld@keldix.com> wrote:
> > >
> > >> On Wed, Apr 13, 2011 at 07:47:26AM -0300, Roberto Spadim wrote:
> > >>> raid10 with other layout i could expand?
> > >>
> > >> My understanding is that you currently cannot expand raid10.
> > >> but there are things in the works. Expansion of raid10,far
> > >> was not on the list from neil, raid10,near was. But it should be fairly
> > >> easy to expand raid10,far. You can just treat one of the copies as your
> > >> refence data, and copy that data to the other raid0-like parts of the
> > >> array. I wonder if Neil thinks he could leave that as an exersize for
> > >> me to implement... I would like to be able to combine it with a
> > >> reformat to a more robust layout of raid10,far that in some cases can survive more
> > >> than one disk failure.
> > >>
> > >
> > > I'm very happy for anyone to offer to implement anything.
> > >
> > > I will of course require the code to be of reasonable quality before I accept
> > > it, but I'm also happy to give helpful review comments and guidance.
> > >
> > > So don't wait for permission, if you want to try implementing something, just
> > > do it.
> > >
> > > Equally if there is something that I particularly want done I won't wait for
> > > ever for someone else who says they are working on it. But RAID10 reshape is
> > > a long way from the top of my list.
> > >
> >
> > I know you have other exciting things on your to-do list - there was
> > lots in your roadmap thread a while back.
> >
> > But I'd like to put in a word for raid10,far - it is an excellent choice
> > of layout for small or medium systems with a combination of redundancy
> > and near-raid0 speed. It is especially ideal for 2 or 3 disk systems.
> > The only disadvantage is that it can't be resized or re-shaped. The
> > algorithm suggested by Keld sounds simple to implement, but it would
> > leave the disks in a non-redundant state during the resize/reshape.
> > That would be good enough for some uses (and better than nothing), but
> > not good enough for all uses. It may also be scalable to include both
> > resizing (replacing each disk with a bigger one) and adding another disk
> > to the array.
> >
> > Currently, it /is/ possible to get an approximate raid10,far layout that
> > is resizeable and reshapeable. You can divide the member disks into two
> > partitions and pair them off appropriately in mirrors. Then use these
> > mirrors to form a degraded raid5 with "parity-last" layout and a missing
> > last disk - this is, as far as I can see, equivalent to a raid0 layout
> > but can be re-shaped to more disks and resized to use bigger disks.
> >
>
> There is an interesting idea in here....
>
> Currently if the devices in an md/raid array with redundancy (1,4,5,6,10) are
> of difference sizes, they are all treated as being the size of the smallest
> device.
> However this doesn't really make sense for RAID10-far.
>
> For RAID10-far, it would make the offset where the second slab of data
> appeared not be 50% of the smallest device (in the far-2 case), but 50% of
> the current device.
>
> Then replacing all the devices in a RAID10-far with larger devices would mean
> that the size of the array could then be increased with no further data
> rearrangement.
>
> A lot of care would be needed to implement this as the assumption that all
> drives are only as big as the smallest is pretty deep. But it could be done
> and would be sensible.
>
> That would make point 2 of http://neil.brown.name/blog/20110216044002#11 a
> lot simpler.

Hmm, I am not sure I understand. E.g. for the simple case of growing a 2
disk raid10-far to a 3 disk or 4 disk array, how would that be done? I think
you need to rewrite the whole array. But I think you also need to do
that when growing most of the other array types.

Quoting point 2 of http://neil.brown.name/blog/20110216044002#11:

> 2/ Device size of 'far' arrays cannot be changed easily. Increasing
> device size of 'far' would require re-laying out a lot of data. We would
> need to record the 'old' and 'new' sizes which metadata doesn't
> currently allow. If we spent 8 bytes on this we could possibly manage a
> 'reverse reshape' style conversion here.
>
> EDIT: if we stored data on drives a little differently this could be a
> lot easier. Instead of starting the second slab of data at the same
> location on all devices, we start it an appropriate fraction into the
> size of 'this' device, then replacing all devices in a raid10-far with
> larger drives would be very effective. However just increasing the size
> of the device (e.g. using LVM) would not work very well

I am not sure I understand the problem here. Are you saying that there
is no room in the metadata to hold info on the reshaping while it is
in progress?

For a simple grow with more partitions of the same size I see problems
in just keeping the old data. I think that would damage the striping
performance.

And I don't understand what is meant by "we start it an appropriate
fraction" - what fraction would that be? E.g. when growing from 2 to 3 disks?

If you want integrity of the data, understood as always having the
required number of copies available, then you could copy from the end of
the half array and then have a pointer that tells how far the process
has got. There may be some initial problems with consistency, but
maybe there are some recovery areas in the new array data that could be
used for bootstrapping the process - once you are past an initial size,
you are no longer overwriting old data.

Best regards
keld
* Re: expand raid10
2011-04-15 16:52 ` Keld Jørn Simonsen
@ 2011-04-18 0:46 ` NeilBrown
0 siblings, 0 replies; 12+ messages in thread
From: NeilBrown @ 2011-04-18 0:46 UTC (permalink / raw)
To: Keld Jørn Simonsen; +Cc: David Brown, linux-raid
On Fri, 15 Apr 2011 18:52:03 +0200 Keld Jørn Simonsen <keld@keldix.com> wrote:
> On Thu, Apr 14, 2011 at 09:36:57AM +1000, NeilBrown wrote:
> > On Wed, 13 Apr 2011 14:34:15 +0200 David Brown <david@westcontrol.com> wrote:
> >
> > > On 13/04/2011 13:17, NeilBrown wrote:
> > > > On Wed, 13 Apr 2011 13:10:16 +0200 Keld Jørn Simonsen<keld@keldix.com> wrote:
> > > >
> > > >> On Wed, Apr 13, 2011 at 07:47:26AM -0300, Roberto Spadim wrote:
> > > >>> raid10 with other layout i could expand?
> > > >>
> > > >> My understanding is that you currently cannot expand raid10.
> > > >> but there are things in the works. Expansion of raid10,far
> > > >> was not on the list from neil, raid10,near was. But it should be fairly
> > > >> easy to expand raid10,far. You can just treat one of the copies as your
> > > >> reference data, and copy that data to the other raid0-like parts of the
> > > >> array. I wonder if Neil thinks he could leave that as an exercise for
> > > >> me to implement... I would like to be able to combine it with a
> > > >> reformat to a more robust layout of raid10,far that in some cases can survive more
> > > >> than one disk failure.
> > > >>
> > > >
> > > > I'm very happy for anyone to offer to implement anything.
> > > >
> > > > I will of course require the code to be of reasonable quality before I accept
> > > > it, but I'm also happy to give helpful review comments and guidance.
> > > >
> > > > So don't wait for permission, if you want to try implementing something, just
> > > > do it.
> > > >
> > > > Equally if there is something that I particularly want done I won't wait for
> > > > ever for someone else who says they are working on it. But RAID10 reshape is
> > > > a long way from the top of my list.
> > > >
> > >
> > > I know you have other exciting things on your to-do list - there was
> > > lots in your roadmap thread a while back.
> > >
> > > But I'd like to put in a word for raid10,far - it is an excellent choice
> > > of layout for small or medium systems with a combination of redundancy
> > > and near-raid0 speed. It is especially ideal for 2 or 3 disk systems.
> > > The only disadvantage is that it can't be resized or re-shaped. The
> > > algorithm suggested by Keld sounds simple to implement, but it would
> > > leave the disks in a non-redundant state during the resize/reshape.
> > > That would be good enough for some uses (and better than nothing), but
> > > not good enough for all uses. It may also be scalable to include both
> > > resizing (replacing each disk with a bigger one) and adding another disk
> > > to the array.
> > >
> > > Currently, it /is/ possible to get an approximate raid10,far layout that
> > > is resizeable and reshapeable. You can divide the member disks into two
> > > partitions and pair them off appropriately in mirrors. Then use these
> > > mirrors to form a degraded raid5 with "parity-last" layout and a missing
> > > last disk - this is, as far as I can see, equivalent to a raid0 layout
> > > but can be re-shaped to more disks and resized to use bigger disks.
> > >
> >
> > There is an interesting idea in here....
> >
> > Currently if the devices in an md/raid array with redundancy (1,4,5,6,10) are
> > of different sizes, they are all treated as being the size of the smallest
> > device.
> > However this doesn't really make sense for RAID10-far.
> >
> > For RAID10-far, it would make more sense if the second slab of data started
> > not at 50% of the smallest device (in the far-2 case), but at 50% of the
> > current device.
> >
> > Then replacing all the devices in a RAID10-far with larger devices would mean
> > that the size of the array could then be increased with no further data
> > rearrangement.
> >
> > A lot of care would be needed to implement this as the assumption that all
> > drives are only as big as the smallest is pretty deep. But it could be done
> > and would be sensible.
> >
> > That would make point 2 of http://neil.brown.name/blog/20110216044002#11 a
> > lot simpler.
>
> Hmm, I am not sure I understand. E.g. for the simple case of growing a 2
> disk raid10-far to a 3 disk or 4 disk array, how would that be done? I think
> you need to rewrite the whole array. But I think you also need to do
> that when growing most of the other array types.
>
> Quoting point 2 of http://neil.brown.name/blog/20110216044002#11:
>
> > 2/ Device size of 'far' arrays cannot be changed easily. Increasing
> > device size of 'far' would require re-laying out a lot of data. We would
> > need to record the 'old' and 'new' sizes which metadata doesn't
> > currently allow. If we spent 8 bytes on this we could possibly manage a
> > 'reverse reshape' style conversion here.
> >
> > EDIT: if we stored data on drives a little differently this could be a
> > lot easier. Instead of starting the second slab of data at the same
> > location on all devices, we start it an appropriate fraction into the
> > size of 'this' device, then replacing all devices in a raid10-far with
> > larger drives would be very effective. However just increasing the size
> > of the device (e.g. using LVM) would not work very well
>
> I am not sure I understand the problem here. Are you saying that there
> is no room in the metadata to hold info on the reshaping while it is
> processed?

No, though adding stuff to the metadata shouldn't be done lightly.
I'm saying that if we lay out the RAID10-far data on the device a little bit
differently, then making a RAID10-far array make full use of the devices after
replacing all the devices becomes very easy.
>
> For a simple grow with more partitions of the same size I see problems
> in just keeping the old data. I think that would damage the striping
> performance.

The preceding is about increasing the size of individual drives. That is
quite different from adding more drives of the same size.
When you add more drives you certainly have to re-lay out all the stripes.
This isn't conceptually difficult - just a lot of reads and writes and some
care in writing the code to make it safe and efficient.
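
A small sketch of why adding drives means re-laying out nearly everything. It assumes a simplified model of the far layout: chunk c goes to disk (c mod n) in the first slab, and each further copy sits in the next slab, shifted by one disk. The names `far_locations` and `chunks_that_stay` are invented for this illustration and are not md functions.

```python
def far_locations(chunk, n_disks, slab_rows, far_copies=2):
    """Return [(disk, row), ...] for every copy of `chunk` (simplified far layout)."""
    disk, row = chunk % n_disks, chunk // n_disks
    # Copy k lives in slab k, rotated by k disks so copies never share a disk.
    return [((disk + k) % n_disks, k * slab_rows + row) for k in range(far_copies)]


def chunks_that_stay(n_chunks, old_disks, new_disks, slab_rows):
    """Chunks whose copies all keep the same (disk, row) after adding disks."""
    return [
        c for c in range(n_chunks)
        if far_locations(c, old_disks, slab_rows) == far_locations(c, new_disks, slab_rows)
    ]


if __name__ == "__main__":
    # 1000 chunks on 2 disks need 500 rows per slab; grow to 4 disks.
    stay = chunks_that_stay(1000, old_disks=2, new_disks=4, slab_rows=500)
    print(f"{len(stay)} of 1000 chunks keep all copies in place: {stay}")
```

Under this model only the very first chunk keeps both copies where they were when going from 2 to 4 disks, which matches the remark that all the stripes have to be rewritten.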
>
> And I don't understand what is meant by "we start it an appropriate
> fraction" - what fraction would that be? E.g. when growing from 2 to 3 disks?

It doesn't apply to that case. It only applies to growing the size of
individual disks. For far2, the fraction would be 1/2. For far3 it would be
1/3.
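
To make the 1/2 and 1/3 fractions concrete, here is a minimal Python sketch (an illustration only, not md code). The function names and the abstract size units are assumptions made for the example. It compares placing the later slabs relative to the smallest device, as happens today, with placing them at a fraction of each device's own size.

```python
def slab_starts_current(device_sizes, far_copies):
    """Today: every device is treated as if it were the smallest one."""
    slab = min(device_sizes) // far_copies
    return [[k * slab for k in range(far_copies)] for _ in device_sizes]


def slab_starts_proposed(device_sizes, far_copies):
    """Proposed: slab k starts k/far_copies of the way into *this* device."""
    return [[k * size // far_copies for k in range(far_copies)]
            for size in device_sizes]


def usable_slab_length(device_sizes, far_copies):
    """Data per slab is still limited by the smallest device."""
    return min(device_sizes) // far_copies


if __name__ == "__main__":
    mixed = [2000, 3000]   # one drive already replaced by a larger one
    grown = [3000, 3000]   # both drives replaced
    print(slab_starts_current(mixed, 2))    # [[0, 1000], [0, 1000]]
    print(slab_starts_proposed(mixed, 2))   # [[0, 1000], [0, 1500]]
    print(slab_starts_proposed(grown, 2))   # [[0, 1500], [0, 1500]]
    print(usable_slab_length(mixed, 2), usable_slab_length(grown, 2))  # 1000 1500
```

In this model a replaced drive already has its second slab at 1500, so once every drive has been replaced each slab can simply grow from 1000 to 1500 units without moving existing data. With the current placement the second slab would have to move from 1000 to 1500 first. Growing a device in place changes its own size, and therefore its slab starts, which is presumably why the LVM case would not benefit.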
>
> If you want integrity of the data, understood as always having the
> required number of copies available, then you could copy from the end of
> the half array and then have a pointer that tells how far the process
> has got. There may be some initial problems with consistency, but
> maybe there are some recovery areas in the new array data that could be
> used for bootstrapping the process - once you are past an initial size,
> you are no longer overwriting old data.

Yes. The 'pointer' would be the 'reshape_position' value in the metadata.
Data before this has been relocated. Data after this has not... At least
that is how RAID5 works. For RAID10 we might want slightly different ranges.
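
The pointer idea can be sketched as a toy simulation. This is not how md implements reshape: the class `FarReshapeSim`, the dictionary used as storage, and the simplified far layout (copy k in slab k, shifted by k disks) are all assumptions for illustration. Chunks below `reshape_position` are read with the new disk count, all others with the old one.

```python
def far_locations(chunk, n_disks, slab_rows, far_copies=2):
    """Return [(disk, row), ...] for every copy of `chunk` (simplified far layout)."""
    disk, row = chunk % n_disks, chunk // n_disks
    return [((disk + k) % n_disks, k * slab_rows + row) for k in range(far_copies)]


class FarReshapeSim:
    """Toy model of growing a far-2 array from old_disks to new_disks."""

    def __init__(self, payload, old_disks, new_disks, slab_rows):
        self.old_disks, self.new_disks = old_disks, new_disks
        self.slab_rows = slab_rows
        self.n_chunks = len(payload)
        self.reshape_position = 0   # chunks below this use the new layout
        self.store = {}             # (disk, row) -> chunk contents
        for chunk, value in enumerate(payload):
            for loc in far_locations(chunk, old_disks, slab_rows):
                self.store[loc] = value

    def copies(self, chunk):
        """Read every copy of a chunk using whichever layout currently applies."""
        n = self.new_disks if chunk < self.reshape_position else self.old_disks
        return [self.store[loc] for loc in far_locations(chunk, n, self.slab_rows)]

    def step(self):
        """Relocate one chunk, then advance the pointer."""
        chunk = self.reshape_position
        value = self.copies(chunk)[0]
        for loc in far_locations(chunk, self.new_disks, self.slab_rows):
            self.store[loc] = value
        self.reshape_position += 1


if __name__ == "__main__":
    payload = [f"chunk{i}" for i in range(12)]
    sim = FarReshapeSim(payload, old_disks=2, new_disks=3, slab_rows=6)
    while sim.reshape_position < sim.n_chunks:
        sim.step()
        # Both copies of every chunk stay readable after each step.
        assert all(sim.copies(c) == [v, v] for c, v in enumerate(payload))
    print("reshape finished at position", sim.reshape_position)
```

In this model, copying forward from the start only ever overwrites locations that belonged to chunks already relocated (or the chunk being moved itself), so both copies of every chunk remain readable throughout. The sketch does not model a crash in the middle of a step, which is where the first few chunks would need extra care.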
NeilBrown
Thread overview: 12+ messages
2011-04-13 4:28 expand raid10 Roberto Spadim
2011-04-13 7:15 ` Mathias Burén
2011-04-13 10:47 ` Roberto Spadim
2011-04-13 11:10 ` Keld Jørn Simonsen
2011-04-13 11:17 ` NeilBrown
2011-04-13 12:34 ` Keld Jørn Simonsen
2011-04-13 23:28 ` NeilBrown
2011-04-13 12:34 ` David Brown
2011-04-13 23:36 ` NeilBrown
2011-04-14 8:16 ` David Brown
2011-04-15 16:52 ` Keld Jørn Simonsen
2011-04-18 0:46 ` NeilBrown