* Migrating a RAID 5 from 4x2TB to 3x6TB ?
       [not found] <1419435054.589.1433790714774.JavaMail.zimbra@wieser.fr>
@ 2015-06-08 19:28 ` Pierre Wieser
  2015-06-08 20:10   ` Wols Lists
                     ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Pierre Wieser @ 2015-06-08 19:28 UTC (permalink / raw)
  To: linux-raid

Hi all,

I currently have an almost full RAID 5 built with 4 x 2 TB disks.
I wonder if it would be possible to migrate it to a bigger RAID 5
with 3 x 6TB new disks.

I've imagined something like this (rough mdadm commands are sketched after the list):
- successively fail and remove a 2TB disk, add a 4TB disk, and wait for the end of recovery; repeat for three of the 2TB disks
- at the end of this first phase, I have the same ~6TB RAID 5 clean group with 3 x 4TB + 1 x 2TB disks
- declare the last 2 TB disk faulty and remove it
- the RAID 5 group state goes to clean, degraded
- grow the RAID 5 group with --size=max option
- grow the RAID 5 group with --array-size=~12TB option
- last, grow the RAID 5 group with --raid-devices=3 and --backup-file=... options.
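
In mdadm terms, I imagine roughly this (device names are only examples, not my real array):

  mdadm /dev/md0 --fail /dev/sdb1 --remove /dev/sdb1
  mdadm /dev/md0 --add /dev/sde1        # new disk, then wait for the recovery
  # ... repeat fail/remove/add for the other two 2TB disks ...
  mdadm /dev/md0 --fail /dev/sdd1 --remove /dev/sdd1   # the last 2TB disk
  mdadm --grow /dev/md0 --size=max
  mdadm --grow /dev/md0 --array-size=<target size>
  mdadm --grow /dev/md0 --raid-devices=3 --backup-file=/root/md0-grow.bak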

I have tested this on a small test RAID 5 group.
As expected, the last command makes the RAID 5 group begin a reshape operation.
But the reshape stays stuck at zero.

So I have several questions:

- is it even theoretically possible to grow a RAID 5 while decreasing the number of disks?
- do you think the sequence I've imagined is correct?
- why does the reshape operation stay stuck at zero?

Any help or hint would be greatly appreciated :)
Thanks
Regards
Pierre


* Re: Migrating a RAID 5 from 4x2TB to 3x6TB ?
  2015-06-08 19:28 ` Migrating a RAID 5 from 4x2TB to 3x6TB ? Pierre Wieser
@ 2015-06-08 20:10   ` Wols Lists
  2015-06-09 18:33     ` Pierre Wieser
  2015-06-09  5:23   ` Can Jeuleers
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: Wols Lists @ 2015-06-08 20:10 UTC (permalink / raw)
  To: Pierre Wieser, linux-raid

On 08/06/15 20:28, Pierre Wieser wrote:
> Hi all,
> 
> I currently have an almost full RAID 5 built with 4 x 2 TB disks.
> I wonder if it would be possible to migrate it to a bigger RAID 5
> with 3 x 6TB new disks.
> 
Do you have a spare (I presume SATA) disk port?

> I've imagined something like this:
> - successively fail and remove a 2TB disk, add a 4TB disk, and wait for the end of recovery; repeat for three of the 2TB disks

If you've got a spare port, newer versions of mdadm have, I believe, a "clone
and replace" option. Much better than failing and then rebuilding.

If not, is it worth getting a SATA expansion board? If you've not got a
specialist mobo, surely a board is only going to cost a tenner or so,
and quality isn't *that* important seeing as it's only a temporary measure.

> - at the end of this first phase, I have the same ~6TB RAID 5 clean group with 3 x 4TB + 1 x 2TB disks
> - declare the last 2 TB disk faulty and remove it
> - the RAID 5 group state goes to clean, degraded
> - grow the RAID 5 group with --size=max option
> - grow the RAID 5 group with --array-size=~12TB option
> - last, grow the RAID 5 group with --raid-devices=3 and --backup-file=... options.
> 
> > I have tested this on a small test RAID 5 group.
> > As expected, the last command makes the RAID 5 group begin a reshape operation.
> > But the reshape stays stuck at zero.
> > 
> > So I have several questions:
> 
Don't think I've answered any of them, but I might have raised new ones.
I just hope the tip saves you a bit of time.

Cheers,
Wol



* Re: Migrating a RAID 5 from 4x2TB to 3x6TB ?
  2015-06-08 19:28 ` Migrating a RAID 5 from 4x2TB to 3x6TB ? Pierre Wieser
  2015-06-08 20:10   ` Wols Lists
@ 2015-06-09  5:23   ` Can Jeuleers
  2015-06-09 18:46     ` Pierre Wieser
  2015-06-09 18:46     ` Wols Lists
       [not found]   ` <CAOS+5GHzBgx2DuDe0+RLgZj9Q1BZ944i-9q4NEERq66Sk78b2g@mail.gmail.com>
  2015-06-10 19:37   ` Pierre Wieser
  3 siblings, 2 replies; 15+ messages in thread
From: Can Jeuleers @ 2015-06-09  5:23 UTC (permalink / raw)
  To: Pierre Wieser, linux-raid

On 08/06/15 21:28, Pierre Wieser wrote:
> Hi all,
> 
> I currently have an almost full RAID 5 built with 4 x 2 TB disks.
> I wonder if it would be possible to migrate it to a bigger RAID 5
> with 3 x 6TB new disks.

I'd recommend against it:

https://en.wikipedia.org/wiki/RAID#Unrecoverable_read_errors_during_rebuild

Jan



* Re: Migrating a RAID 5 from 4x2TB to 3x6TB ?
       [not found]   ` <CAOS+5GHzBgx2DuDe0+RLgZj9Q1BZ944i-9q4NEERq66Sk78b2g@mail.gmail.com>
@ 2015-06-09 18:26     ` Pierre Wieser
  0 siblings, 0 replies; 15+ messages in thread
From: Pierre Wieser @ 2015-06-09 18:26 UTC (permalink / raw)
  To: Another Sillyname, linux-raid

----- Original Message -----
> I assume the motherboard you're using only has 4 sata ports?

Yes, the motherboard has 4 SATA ports, and there is already a small external
SATA controller. So my current RAID device has 5 disks (4 + spare). For the discussion,
I just omitted the spare device ;)

> On 8 June 2015 at 20:28, Pierre Wieser <pwieser@trychlos.org> wrote:
> > Hi all,
> >
> > I currently have an almost full RAID 5 built with 4 x 2 TB disks.
> > I wonder if it would be possible to migrate it to a bigger RAID 5
> > with 3 x 6TB new disks.


* Re: Migrating a RAID 5 from 4x2TB to 3x6TB ?
  2015-06-08 20:10   ` Wols Lists
@ 2015-06-09 18:33     ` Pierre Wieser
  0 siblings, 0 replies; 15+ messages in thread
From: Pierre Wieser @ 2015-06-09 18:33 UTC (permalink / raw)
  To: Wols Lists; +Cc: linux-raid

----- Original Message -----
> On 08/06/15 20:28, Pierre Wieser wrote:
> > Hi all,
> > 
> > I currently have an almost full RAID 5 built with 4 x 2 TB disks.
> > I wonder if it would be possible to migrate it to a bigger RAID 5
> > with 3 x 6TB new disks.
> > 
> Do you have a spare (I presume SATA) disk port?

I am away from home this evening, but, no, I do not "see" any free sata port.

> > I've imagined something like this:
> > - successively fail and remove a 2TB disk, add a 4TB disk, and wait for the
> > end of recovery; repeat for three of the 2TB disks
> 
> If you've got a spare port, the newer mdadm's have, I believe, a "clone
> and replace" option. Much better than failing then rebuilding.

I was not aware of this option. I understand that it is better because
it prevents me from passing through a "clean, degraded" state? And I expect
that the clone time will be about the same as a recovery?

> If not, is it worth getting a SATA expansion board? If you've not got a
> specialist mobo, surely a board is only going to cost a tenner or so,
> and quality isn't *that* important seeing as it's only a temporary measure.

Yes, you're right: it might be worth purchasing another SATA controller,
as I have free PCIe ports and enough power. Maybe a small issue at the
disk enclosure level. To be checked...

> > - at the end of this first phase, I have the same ~6TB RAID 5 clean group
> > with 3 x 4TB + 1 x 2TB disks
> > - declare the last 2 TB disk faulty and remove it
> > - the RAID 5 group state goes to clean, degraded
> > - grow the RAID 5 group with --size=max option
> > - grow the RAID 5 group with --array-size=~12TB option
> > - last, grow the RAID 5 group with --raid-devices=3 and --backup-file=...
> > options.
> > 
> > I have tested this on a small test RAID 5 group.
> > As expected, the last command makes the RAID 5 group begin a reshape
> > operation.
> > But the reshape stays stuck at zero.
> > 
> > So I have several questions:
> > 
> Don't think I've answered any of them, but I might have raised new ones.
> I just hope the tip saves you a bit of time.

You have certainly opened up new avenues to explore. Thanks.

> Cheers,
> Wol
> 
> 


* Re: Migrating a RAID 5 from 4x2TB to 3x6TB ?
  2015-06-09  5:23   ` Can Jeuleers
@ 2015-06-09 18:46     ` Pierre Wieser
  2015-06-09 19:06       ` Wols Lists
  2015-06-09 19:15       ` Roman Mamedov
  2015-06-09 18:46     ` Wols Lists
  1 sibling, 2 replies; 15+ messages in thread
From: Pierre Wieser @ 2015-06-09 18:46 UTC (permalink / raw)
  To: Can Jeuleers; +Cc: linux-raid

----- Original Message -----
> On 08/06/15 21:28, Pierre Wieser wrote:
> > Hi all,
> > 
> > I currently have an almost full RAID 5 built with 4 x 2 TB disks.
> > I wonder if it would be possible to migrate it to a bigger RAID 5
> > with 3 x 6TB new disks.
> 
> I'd recommend against it:
> 
> https://en.wikipedia.org/wiki/RAID#Unrecoverable_read_errors_during_rebuild

Oops! I was not aware of this issue at all. It happens that I am currently
living dangerously, as I have another 11.5 TB RAID 5 device :( I had already
seen various administration issues when the size increases to this level, but
I thought this was only an issue of volume organization (not thought
through deeply enough, obviously)....

Starting from your link, and searching a bit, I now understand that 10TB
is a maximum limit for desktop-grade disks (regarding UREs at least), and
thus for any element of a RAID device which needs to be scanned at recovery
time.
Apart from my poor English, would you say I'm right about this?

Does linux-raid have any recommendation(s) for managing more than 10TB of data?

I may imagine:
- several smaller RAID 5 devices
- would RAID10 be a valuable solution in your opinion ?

(for the record, all my servers have been migrated to CentOS 7.1)

Nonetheless, I thank you very much for the link, which may prevent me from
losing a big bunch of data!

> Jan
> 
Pierre


* Re: Migrating a RAID 5 from 4x2TB to 3x6TB ?
  2015-06-09  5:23   ` Can Jeuleers
  2015-06-09 18:46     ` Pierre Wieser
@ 2015-06-09 18:46     ` Wols Lists
  2015-06-09 19:06       ` Another Sillyname
                         ` (2 more replies)
  1 sibling, 3 replies; 15+ messages in thread
From: Wols Lists @ 2015-06-09 18:46 UTC (permalink / raw)
  To: Can Jeuleers, Pierre Wieser, linux-raid

On 09/06/15 06:23, Can Jeuleers wrote:
> On 08/06/15 21:28, Pierre Wieser wrote:
>> Hi all,
>>
>> I currently have an almost full RAID 5 built with 4 x 2 TB disks.
>> I wonder if it would be possible to migrate it to a bigger RAID 5
>> with 3 x 6TB new disks.
> 
> I'd recommend against it:
> 
> https://en.wikipedia.org/wiki/RAID#Unrecoverable_read_errors_during_rebuild
> 
> Jan
> 
Please expand! Having read the article, it doesn't seem to say anything
more than what is repeated time and time again on this list - MAKE SURE YOUR
DRIVES ARE DECENT RAID DRIVES.

If you have ERC, then the odd "soft" read error doesn't matter. If you
don't have ERC, then your data is at risk when you replace a drive, and
it doesn't matter how big your drives are, it's the array size that matters.

Cheers,
Wol


* Re: Migrating a RAID 5 from 4x2TB to 3x6TB ?
  2015-06-09 18:46     ` Pierre Wieser
@ 2015-06-09 19:06       ` Wols Lists
  2015-06-09 19:15       ` Roman Mamedov
  1 sibling, 0 replies; 15+ messages in thread
From: Wols Lists @ 2015-06-09 19:06 UTC (permalink / raw)
  To: Pierre Wieser, Can Jeuleers; +Cc: linux-raid

On 09/06/15 19:46, Pierre Wieser wrote:
> ---- Original Message -----
>> On 08/06/15 21:28, Pierre Wieser wrote:
>>> Hi all,
>>>
>>> I currently have an almost full RAID 5 built with 4 x 2 TB disks.
>>> I wonder if it would be possible to migrate it to a bigger RAID 5
>>> with 3 x 6TB new disks.
>>
>> I'd recommend against it:
>>
>> https://en.wikipedia.org/wiki/RAID#Unrecoverable_read_errors_during_rebuild
> 
> Oops! I was not aware of this issue at all. It happens that I am currently
> living dangerously, as I have another 11.5 TB RAID 5 device :( I had already
> seen various administration issues when the size increases to this level, but
> I thought this was only an issue of volume organization (not thought
> through deeply enough, obviously)....
> 
> Starting from your link, and searching a bit, I now understand that 10TB
> is a maximum limit for desktop-grade disks (regarding UREs at least), and
> thus for any element of a RAID device which needs to be scanned at recovery
> time.
> Apart from my poor English, would you say I'm right about this?

I don't think so. You may be lucky, and your drives are better than
average. You may be unlucky, and your drives are worse than average.

DON'T USE DESKTOP GRADE DISKS EXCEPT IN RAID 1. That said, I'm using
Seagate Barracudas which I was planning to upgrade to raid 5 - not a
good idea :-(
> 
> Does linux-raid have any recommendation(s) for managing more than 10TB of data?
> 
> I may imagine:
> - several smaller RAID 5 devices
> - would RAID10 be a valuable solution in your opinion ?

Read the list archive. There's a bunch of stuff about how to mitigate
the problem - mostly by increasing the disk timeout (the problem is,
basically, that the kernel gives up on the disk before the disk finishes its
own error recovery - increase the timeout and the raid layer will see the
disk error and retry).
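
The quick and dirty version, from memory (the archives have the details and
the exact values to use), is to bump the kernel's per-disk command timeout, e.g.:

  echo 180 > /sys/block/sdb/device/timeout   # repeat for each array member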
> 
> (for the record, all my servers have been migrated to CentOS 7.1)
> 
> Nonetheless, I thank you very much for the link, which may prevent me from
> losing a big bunch of data!
> 
"man smartctl" is your friend :-)

You'll have to read up, but try

"smartctl -i /dev/sdx"

That'll tell you if smart is turned on - if it isn't, turn it on.

"smartctl -s on /dev/sdx"

Then do

"smartctl -x /dev/sdx"

and look for anything about ERC or Error Recovery Control. From my
Barracudas I get

SCT Data Table command not supported

SCT Error Recovery Control command not supported

Device Statistics (GP Log 0x04) not supported

OOPS!!! These drives are NOT NOT NOT suitable for raid :-( Everything
will be fine if I increase the raid timeout, but given that the typical
drive timeout is two minutes, the raid timeout needs to be longer than
that which means if I have any soft errors, the rebuild will be horribly
slow.

Looks like my next drives will be WD Reds, they're not much more expensive.

WARNING: If you have to enable SMART, it's supposed to survive a
cold boot, but it doesn't look like it has on my drives, and it's
reported that a lot of drives don't. You need to make sure you have a boot
script that forces it on, and forces ERC on or sets the timeout.
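
As a rough, untested sketch (adjust the device list), something like this in
rc.local or an equivalent boot script:

  for d in sda sdb sdc sdd; do
      smartctl -s on /dev/$d                   # make sure SMART stays enabled
      smartctl -l scterc,70,70 /dev/$d         # try to set 7 second ERC
      echo 180 > /sys/block/$d/device/timeout  # long kernel timeout as a fallback
  done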


All that said, as you can see, desktop drives are fine for raid IF
repeat IF you take the necessary precautions. They're probably fine on a
desktop :-)

Cheers,
Wol



* Re: Migrating a RAID 5 from 4x2TB to 3x6TB ?
  2015-06-09 18:46     ` Wols Lists
@ 2015-06-09 19:06       ` Another Sillyname
  2015-06-09 19:18       ` Can Jeuleers
  2015-06-15 11:31       ` Wilson, Jonathan
  2 siblings, 0 replies; 15+ messages in thread
From: Another Sillyname @ 2015-06-09 19:06 UTC (permalink / raw)
  To: linux-raid

If you can, try to pick up a cheap IBM M1015 RAID controller that you
can configure as JBOD (look to pay no more than £50 or $75).

Flash it to the standard LSI firmware (google "M1015 LSI
firmware"... there are plenty of forum pages on this).

Add your new drives to this controller, build the new raid on this new
controller... then copy the dataset across from the old array.

I personally would NEVER suggest doing a drive migration or resize
on an active live dataset... if anything goes wrong you could
easily kill your data and have no backup.



On 9 June 2015 at 19:46, Wols Lists <antlists@youngman.org.uk> wrote:
> On 09/06/15 06:23, Can Jeuleers wrote:
>> On 08/06/15 21:28, Pierre Wieser wrote:
>>> Hi all,
>>>
>>> I currently have an almost full RAID 5 built with 4 x 2 TB disks.
>>> I wonder if it would be possible to migrate it to a bigger RAID 5
>>> with 3 x 6TB new disks.
>>
>> I'd recommend against it:
>>
>> https://en.wikipedia.org/wiki/RAID#Unrecoverable_read_errors_during_rebuild
>>
>> Jan
>>
> Please expand! Having read the article, it doesn't seem to say anything
> more than what is repeated time and time on this list - MAKE SURE YOUR
> DRIVES ARE DECENT RAID DRIVES.
>
> If you have ERC, then the odd "soft" read error doesn't matter. If you
> don't have ERC, then your data is at risk when you replace a drive, and
> it doesn't matter how big your drives are, it's the array size that matters.
>
> Cheers,
> Wol

* Re: Migrating a RAID 5 from 4x2TB to 3x6TB ?
  2015-06-09 18:46     ` Pierre Wieser
  2015-06-09 19:06       ` Wols Lists
@ 2015-06-09 19:15       ` Roman Mamedov
  1 sibling, 0 replies; 15+ messages in thread
From: Roman Mamedov @ 2015-06-09 19:15 UTC (permalink / raw)
  To: Pierre Wieser; +Cc: Can Jeuleers, linux-raid


On Tue, 9 Jun 2015 20:46:12 +0200 (CEST)
Pierre Wieser <pwieser@trychlos.org> wrote:

> Does linux-raid have any recommendation(s) for managing more than 10TB of data?
> 
> I may imagine:
> - several smaller RAID 5 devices
> - would RAID10 be a valuable solution in your opinion ?

The solution is called RAID6.

And don't go for overly large drives, e.g. 6x2TB is better than 4x3TB.

1) smaller drives might use more proven technology, require less precision, so
might be more reliable (at least according to hearsay and urban legends);

2) you need a considerable array member count anyway, to justify "losing" two
drives for parity in RAID6.
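
Creating it is the same one-liner as any other level, e.g. (example device
names, adjust to your system):

  mdadm --create /dev/md0 --level=6 --raid-devices=6 /dev/sd[b-g]1   # 6x2TB -> ~8TB usable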

"But my enclosure only fits 4 drives". "But I don't have enough SATA ports".

Get a bigger enclosure. Get a second/third SATA controller. :)

-- 
With respect,
Roman


* Re: Migrating a RAID 5 from 4x2TB to 3x6TB ?
  2015-06-09 18:46     ` Wols Lists
  2015-06-09 19:06       ` Another Sillyname
@ 2015-06-09 19:18       ` Can Jeuleers
  2015-06-15 11:31       ` Wilson, Jonathan
  2 siblings, 0 replies; 15+ messages in thread
From: Can Jeuleers @ 2015-06-09 19:18 UTC (permalink / raw)
  To: Wols Lists, Pierre Wieser, linux-raid

On 09/06/15 20:46, Wols Lists wrote:
> Please expand! Having read the article, it doesn't seem to say anything
> more than what is repeated time and time on this list - MAKE SURE YOUR
> DRIVES ARE DECENT RAID DRIVES.

Large RAID5 arrays are a bad idea because of the probability of
unrecoverable read errors occurring during a rebuild, which increases
with the size of the array.

Decent RAID drives will have better reliability than indecent ones
(haha), but in absolute terms the chance of hitting a URE during a rebuild
still increases with array size.
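
As a back-of-the-envelope example (using the commonly quoted desktop-drive
spec of one URE per 1e14 bits read): rebuilding a failed 6TB drive in a
3 x 6TB RAID5 means reading the two surviving disks, about 12 TB, which is
roughly 9.6e13 bits, so the expected number of UREs during that rebuild is
about 9.6e13 / 1e14, i.e. close to 1.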

> If you have ERC, then the odd "soft" read error doesn't matter. If you
> don't have ERC, then your data is at risk when you replace a drive, and
> it doesn't matter how big your drives are, it's the array size that matters.

Indeed. Pierre was proposing to further increase the size of his RAID5
array, and I was advising him against it because of the above.


* Re: Migrating a RAID 5 from 4x2TB to 3x6TB ?
  2015-06-08 19:28 ` Migrating a RAID 5 from 4x2TB to 3x6TB ? Pierre Wieser
                     ` (2 preceding siblings ...)
       [not found]   ` <CAOS+5GHzBgx2DuDe0+RLgZj9Q1BZ944i-9q4NEERq66Sk78b2g@mail.gmail.com>
@ 2015-06-10 19:37   ` Pierre Wieser
  2015-06-15 10:46     ` Wilson, Jonathan
  3 siblings, 1 reply; 15+ messages in thread
From: Pierre Wieser @ 2015-06-10 19:37 UTC (permalink / raw)
  To: linux-raid

Hi all,

> I currently have an almost full RAID 5 built with 4 x 2 TB disks.
> I wonder if it would be possible to migrate it to a bigger RAID 5
> with 3 x 6TB new disks.

Following all the suggestions (and I'd like to thank everyone once again for
their contributions), I've spent some hours reading the mailing list archives,
surfing the web with more appropriate keywords, and so on...

So, here is what I now plan to do:

First, I have cancelled my order for the new 6TB desktop-grade disks,
replacing it with 4TB WD Red Pro drives, plus one 6TB desktop-grade disk (see below for its use).

As the full RAID5 array I planned to migrate is already my backup system,
I cannot rely on a restore :(. So the first thing is to rsync the current
array to the directly attached 6TB disk. I don't think I have a free SATA
port on my motherboard, but at worst I will be able to use the one currently
used for the DVD drive.
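
Something like this, I think (mount points are just placeholders):

  rsync -aHAX --delete /mnt/raid5/ /mnt/backup6tb/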

I've chosen to build new RAID10 arrays.
I've moved away from the RAID6 suggestion due to its known poor write performance,
and also because I'm willing/able to put a bit more money into getting better performance.

The 4x4TB new disks will be partitioned as:
- 512MB to be a RAID1 array mounted as /boot
- 8GB to be a RAID10 array used as swap
- two 25 GB parts to be two RAID10 arrays used as root filesystems
  (plus room for an alternate when upgrading the OS)
- the rest of the disk will be split into four equal parts (about 930 GB
I think), each of which will be a member of a separate data RAID10 array.
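
In mdadm terms, once the partitions exist, that would be something like
(device names and md numbers are only examples):

  mdadm --create /dev/md0 --level=1  --raid-devices=4 /dev/sd[a-d]1   # /boot
  mdadm --create /dev/md1 --level=10 --raid-devices=4 /dev/sd[a-d]2   # swap
  mdadm --create /dev/md2 --level=10 --raid-devices=4 /dev/sd[a-d]3   # root
  mdadm --create /dev/md3 --level=10 --raid-devices=4 /dev/sd[a-d]4   # alternate root
  mdadm --create /dev/md4 --level=10 --raid-devices=4 /dev/sd[a-d]5   # first data array, and so on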

I am aware that this may seem like a waste of space, especially for the
/boot partition. But this scheme will let me:
a) have interchangeable disks: all disks follow the same rules and are partitioned identically
b) replace my system disk, which is not part of any RAID system as of today,
thus actually gaining both a SATA port for the RAID systems and more security
for the boot and root filesystems
c) keep using LVM on top of RAID to take advantage of its
flexibility (so several PVs which may or may not be aggregated later)

Other suggestions included the use of the smartctl tool. I've checked that the
daemon was already running. But I hadn't been using the '-x' option, which I now
understand is hardly optional!

I plan to build these RAID devices outside of the standard CentOS 7 install process
(I'm currently downloading a CentOS Live ISO), thus presenting some predefined
partitions to the installer.

I expect these orders to be delivered in about 5-10 days. So, more news then :)

Thank you all for your help. I'll keep reading the list, which I discovered for
the occasion...

Regards
Pierre


* Re: Migrating a RAID 5 from 4x2TB to 3x6TB ?
  2015-06-10 19:37   ` Pierre Wieser
@ 2015-06-15 10:46     ` Wilson, Jonathan
  2015-06-15 21:45       ` Wols Lists
  0 siblings, 1 reply; 15+ messages in thread
From: Wilson, Jonathan @ 2015-06-15 10:46 UTC (permalink / raw)
  To: Pierre Wieser; +Cc: linux-raid

On Wed, 2015-06-10 at 21:37 +0200, Pierre Wieser wrote:
> Hi all,
> 
> > I currently have an almost full RAID 5 built with 4 x 2 TB disks.
> > I wonder if it would be possible to migrate it to a bigger RAID 5
> > with 3 x 6TB new disks.
> 
> Following all the suggestions (and I'd like to thank everyone once again for
> their contributions), I've spent some hours reading the mailing list archives,
> surfing the web with more appropriate keywords, and so on...
> 
> So, here is what I now plan to do:
> 
> First, I have cancelled my order for the new 6TB desktop-grade disks,
> replacing it with 4TB WD Red Pro drives, plus one 6TB desktop-grade disk (see below for its use).
> 
> As the full RAID5 array I planned to migrate is already my backup system,
> I cannot rely on a restore :(. So the first thing is to rsync the current
> array to the directly attached 6TB disk. I don't think I have a free SATA
> port on my motherboard, but at worst I will be able to use the one currently
> used for the DVD drive.
> 
> I've chosen to build new RAID10 arrays.
> I've moved away from the RAID6 suggestion due to its known poor write performance,
> and also because I'm willing/able to put a bit more money into getting better performance.

First, I am not an expert; the following is based on multiple web sites,
so it is kind of cobbled together, and it uses your setup but is based on my
system as it stands now.

I'm going to make some assumptions here... 
1) the motherboard can see and boot from 4TB drives instead of only
seeing 750G (approx) in the bios; if not, you will need a smaller disk
for the boot/os. 
2) this will be a "bios" install/boot or UEFI with CSM to simulate a
bios install/boot
3) these will be the only disks in the system. 


> 
> The 4x4TB new disks will be partitioned as:

As this will be a clean install, make a 1M partition flagged as "bios
boot" (EF02 in gdisk). This will allow grub2 to install as normal, with its
larger next-stage loader and raid "drivers/ability" going into the "bios
boot" partition. Do this for all 4 disks.
(see #a later)
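
With sgdisk that could look something like this (untested; repeat on all 4
disks, and create the later raid-member partitions the same way with type
FD00):

  sgdisk -n 1:0:+1M -t 1:EF02 /dev/sda    # 1M "bios boot" partition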

> - 512MB to be a RAID1 array mounted as /boot

Out of the 4 drives' 512M partitions, create a 4-way raid1 for /boot
(the grub2 config, kernels & initramfs will live in here) (see #b
later)

> - 8GB to be a RAID10 array used as swap

On the 4 disks, create 17G partitions, then create a 4-disk raid10 far2
array with a 64K chunk. This will give you a swap device of 34G in size
(well over-provisioned, but that doesn't hurt or impact performance). As it's
likely swap access will be in small random amounts, this keeps the disk
write size from being overly large; there is no point in writing/reading 512K chunks
(the current default) for a 4K page swap/memory access. raid10 is fast, and
far2, from what I've read, also improves read/write speed in some
tests (I don't know why, or whether the tests I've seen mentioned on the web
are accurate for the type of access swap will cause, but on my setup I
can get a dd speed of 582M read and 215M write from drives with a single
device speed of about 80-100M as a rough and ready speed test).
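
As an untested sketch, with assumed device names:

  mdadm --create /dev/md/swap --level=10 --layout=f2 --chunk=64 --raid-devices=4 /dev/sd[a-d]2
  mkswap /dev/md/swap
  swapon /dev/md/swap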

> - two 25 GB parts to be two RAID10 arrays used as root filesystem
>   (plus place for an alternate when upgrading the OS)

A 25G partition on all 4 disks into a single raid10 far2 (default 512K
chunk) = 50G for "root"

Duplicate above for a second "root/install" (this might be useful for #b
later also)

> - the rest of the disk will be split into four equal parts (about 930 GB
> I think), each of which will be a member of a separate data RAID10 array.

I would not bother creating 4 smaller partitions on each disk; nothing
will be gained except more complexity, and it may even reduce speeds due to
increased seeking when data doesn't reside exactly on one raid group. LVM
can still sit on top for flexibility later. You could also go for a
4-disk raid6 (which I have), which would give you the same amount of
storage space on creation but would then mean one extra disk = one extra
disk's worth of space, not half, as you add more. (I'm not sure about R/W
speeds; also, while I think it can, I'm not sure if mdadm --grow works
on raid10.)

> 
> I am aware that this may seem like a waste of space, especially for the
> /boot partition. But this scheme will let me:
> a) have interchangeable disks: all disks follow the same rules and are partitioned identically

I have found with GPT/raid etc. that, as time has gone on, I have created
partitions with the same "number" as the md/X numbering. While not
needed, it does mean I know "/dev/md/3" is made up of /dev/sd[a-d]3, so if
at some future point I add more disks and create a new array, I do it by
creating partition number(s) "4" and array /dev/md/4, instead of having a
bunch of partition "1"s with a multitude of differently numbered mdadm
arrays. It gives my brain a kick to remind me that "no, you can't
delete that partition because it doesn't match the array number you are
doing stuff with".

> b) replace my system disk which is not part of any RAID system as of today,
> thus gaining actually both a SATA port for the RAID systems and more security
> for the boot and root filesystems

See my assumption 1: on my old P45DE core2/quad system, linux can happily
see big drives (over 2TB, I think, is the limit) and use all the space as
one large partition or further divided, but the bios could only see a
smaller 750G amount, so it could not boot from my 3TB drives. So while I did
all the partitioning mentioned in my replies (ready for when I upgraded
to newer hardware, which I have since done), I needed a 1TB disk to hold the
"bios boot," "/boot," and "root" to be able to then see the larger
drives. (Strictly speaking, you could probably get away with
just "bios boot" and "/boot" on the smaller disk, and have root on the
larger ones, as grub2 loads the kernel file and initramfs from /boot...
I'm not 100% sure, but I think grub2 can also see and understand larger
disks, so you might be able to install grub2 to the small disk (or a flash
drive), which the bios can then boot from, which can then load the
kernel from the large disks' /boot raid.)

> c) keep using LVM on top of RAID to take advantage of its
> flexibility (so several PVs which may or may not be aggregated later)
> Other suggestions included the use of the smartctl tool. I've checked that the
> daemon was already running. But I hadn't been using the '-x' option, which I now
> understand is hardly optional!
> 
> I plan to build these RAID devices out of CentOS 7 standard install process
> (I'm currently downloading a CentOS Live iso), thus presenting to the install
> some predefined partitions.
> 
> I expect about 5-10 days to get these orders delivered. So more news at this time :)
> 
> Thank you all for your help. I keep reading the list that I discovered for 
> the occasion...

(#a) After installing to sda and booting etc., you can then install grub
on to sd[b,c,d]. This means that, should you lose sda, you can boot from
any of the remaining disks without having to worry about getting a
"live" cd or some such method of recovering the system.

(#b) Should you upgrade to a UEFI motherboard and/or disable CSM: remove
the array on the 4 disks' 512M partitions, mark them as EF00 (EFI
System) in gdisk, format them all as fat32, install the boot loader
& /boot to sda2 to get a working system, replicate sda2 into
sd[b,c,d]2 to allow recovery should sda fail, and use efibootmgr to add
boot entries to NVRAM for disks b2, c2, d2. (I think a grub install to each
disk under uefi should also add boot entries to the uefi NVRAM, but UEFI
is much more of a pain than "bios", with stupid things such as forgetting
its entries if a disk is removed/replaced, so efibootmgr is a tool to
get used to.)
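
Adding such an entry by hand looks roughly like this (untested; the loader
path depends on the distro):

  efibootmgr -c -d /dev/sdb -p 2 -L "CentOS (sdb)" -l '\EFI\centos\grubx64.efi'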



> 
> Regards
> Pierre

Jon.

(all the above is based on my experience/hassles as an "end user/self
learner" and various web searches and posts on this list, so may be
totally different advice from what a systems administrator would give
for a work server set up with way more experience and knowledge of just
what works best, and why, especially system/raid performance which is an
art that an end user doesn't really have to worry about as "its fast
enough/it works ok" usually suffices.)


* Re: Migrating a RAID 5 from 4x2TB to 3x6TB ?
  2015-06-09 18:46     ` Wols Lists
  2015-06-09 19:06       ` Another Sillyname
  2015-06-09 19:18       ` Can Jeuleers
@ 2015-06-15 11:31       ` Wilson, Jonathan
  2 siblings, 0 replies; 15+ messages in thread
From: Wilson, Jonathan @ 2015-06-15 11:31 UTC (permalink / raw)
  To: Wols Lists; +Cc: Can Jeuleers, Pierre Wieser, linux-raid

On Tue, 2015-06-09 at 19:46 +0100, Wols Lists wrote:
> On 09/06/15 06:23, Can Jeuleers wrote:
> > On 08/06/15 21:28, Pierre Wieser wrote:
> >> Hi all,
> >>
> >> I currently have an almost full RAID 5 built with 4 x 2 TB disks.
> >> I wonder if it would be possible to migrate it to a bigger RAID 5
> >> with 3 x 6TB new disks.
> > 
> > I'd recommend against it:
> > 
> > https://en.wikipedia.org/wiki/RAID#Unrecoverable_read_errors_during_rebuild
> > 
> > Jan
> > 
> Please expand! Having read the article, it doesn't seem to say anything
> more than what is repeated time and time on this list - MAKE SURE YOUR
> DRIVES ARE DECENT RAID DRIVES.
> 
> If you have ERC, then the odd "soft" read error doesn't matter. If you
> don't have ERC, then your data is at risk when you replace a drive, and
> it doesn't matter how big your drives are, it's the array size that matters.

TLER doesn't actually affect the raid or its integrity compared to
non-TLER drives (well, strictly it _might_, as drives with TLER might have
better lifespans, might have longer warranties (which suggests a better
life), might have better URE rates, etc.), but the difference in how
mdadm handles things is actually down to the way the block device layer
handles things.

From what I can tell, with TLER the disk just gives up and reports an
error very quickly; this is then passed up the stack to the raid layer,
which then tries to resolve the problem using various methods... a TLER
"error" does not mean the device is kicked; only if mdadm can't resolve
the problem does the device get booted. (I think it tries to recover the
data, then tries to write the recovered data back to the device; only if
this fails does the disk get booted.)

Without TLER the disk tries to sort its own problems out instead of
reporting an error; this might take a long time, and it might try to resolve
the problem forever in one long endless loop. The block layer (sdX)
knows it asked for something to happen; it gets bored and decides it's
taken too long for the disk to return data, so it decides that the disk no
longer exists. It (the device block layer, as far as I can tell) kicks
the disk, then passes on a message to mdadm that the disk is down for the
count and has been booted from the system.

I don't know who sets the block layer timeout, or whether it varies
depending on whether the disk holds a file system or is a raid member, but
someone decided that after a few seconds the device should disappear/be
marked as bad within the system, to prevent the raid from stalling; on a
"normal" disk/file system this means various types of errors, up to and
including a complete crash.

By setting the timeout in the block layer, /sys/block/sdX/device/timeout,
to a high(ish) value, the raid will stall (not a problem for most end
users, a big no-no for a high-end data server with 100s of users relying
on quick responses), or a "normal" disk will "hang", producing a frozen
screen or whatnot for the end user... while a pain, that is better than a
disk being failed, especially if the disk eventually manages to sort the
problem internally and give valid data back instead of the system crashing.

I set the block timeout to 180 seconds (3 mins) on all disks. Disks
with TLER enabled will still give up and send an error message up
the stack in less than 7 seconds; my other "green" drives with no
TLER will try their best to recover, and if they can't they will eventually
pass the error up to the block layer, or after 3 mins the block layer will
report that they timed out, to either mdadm or the file system.
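
For reference, the knobs involved (example device, adjust as needed):

  smartctl -l scterc /dev/sda                # show the drive's ERC setting, if supported
  cat /sys/block/sda/device/timeout          # kernel command timeout in seconds (default 30)
  echo 180 > /sys/block/sda/device/timeout   # what I set on all my disks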

Unlike mdadm and the block device, which can be tuned, a hardware
raid will give up on the drive after 7 seconds and kick it (which is why
you should only use raid/TLER drives in a HW raid). With mdadm, or more
specifically the block device layer, depending on the type of drive, how
much internal (to the disk) error recovery is performed, and how
important response times are, you can use any old disk with mdadm raid
with no problems.
It should also be noted that the same issue would happen without raid: a
pause/hang, or a drive marked as failed and/or the system crashing if the
block layer gives up, or an error message passed up to the file system if
the disk has TLER and is used in a non-raid way... how the file system
handles it is up to the file system.

> 
> Cheers,
> Wol

* Re: Migrating a RAID 5 from 4x2TB to 3x6TB ?
  2015-06-15 10:46     ` Wilson, Jonathan
@ 2015-06-15 21:45       ` Wols Lists
  0 siblings, 0 replies; 15+ messages in thread
From: Wols Lists @ 2015-06-15 21:45 UTC (permalink / raw)
  To: Wilson, Jonathan, Pierre Wieser; +Cc: linux-raid

On 15/06/15 11:46, Wilson, Jonathan wrote:
> On the 4 disks, create 17G partitions then create a 4 disk raid10 far2
> array with 64K chunk. This will give you a swap file of 34G in size
> (well over provisioned, but doesn't hurt or impact performance). As its
> likely swap access will be in small random amounts this means the disk
> write size is not overly large, no point in writing/reading 512K chunks
> (the current default) for a 4K page swap/memory access; raid10 is fast;
> far2 from what I've read also improves the speed of read/writes in some
> tests (I don't know why or if the tests I've seen mentioned on the web
> are accurate for the type of access swap will cause but on my setup I
> can get a dd speed of 582M read and 215M write from drives with a single
> device speed of about 80-100M as a rough and ready speed test).

Bear in mind that linux will all by itself do a raid-0 on your swap
partitions if you ask it to.

I *always* size my swap partitions at twice the mobo's max ram. If you read
the release notes for linux 2.4.early, you'll notice Linus says "if you
have swap, it MUST be twice ram or more" - you got a kernel panic if
you ignored it! People have been saying that rule was
obsolete since before linux was born, and while a lot of water has
passed under the bridge since then, I've seen and heard nothing to tell
me that the fundamentals have changed ... so because my mobo maxes out
at 16GB ram, all my disks have a 32GB swap partition each.

If you want to raid10 that lot, fine, I just set equal priority on all
my swap partitions, and linux will raid-0 it for me.
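
i.e. just give every swap partition the same priority, either with pri= in
fstab or directly (made-up device names):

  swapon -p 1 /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sdd2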

The other thing to bear in mind, it's all very well doing speed tests on
your drive, but if you're hammering swap and backing store at the same
time, your speeds are going to plummet as your drive starts seeking all
over the shop ... mind you, with a decent amount of ram you probably
won't need swap at all.

Cheers,
Wol

