* RAID5 made with assume-clean
@ 2013-02-06 11:52 Wakko Warner
  2013-02-06 13:02 ` Robin Hill
  0 siblings, 1 reply; 5+ messages in thread
From: Wakko Warner @ 2013-02-06 11:52 UTC (permalink / raw)
  To: linux-raid

I was testing different parameters with --assume-clean to avoid the initial
rebuild.  When I settled on the parameters I wanted, I forgot to recreate the
array without --assume-clean.  I have 3 disks in the array.

I thought that I'd run a check on it by doing
echo check > /sys/block/md0/md/sync_action

/proc/mdstat is showing this:
Personalities : [raid1] [raid6] [raid5] [raid4] 
md0 : active raid5 sda1[0] sdb1[1] sdc1[2]
      488018688 blocks super 1.1 level 5, 64k chunk, algorithm 2 [3/3] [UUU]
      [============>........]  check = 61.7% (150688512/244009344) finish=7.1min speed=216592K/sec

unused devices: <none>

The thing is, the drives can only do ~60MB/s, and there is no disk
activity; the activity lights are not lit at all.  What would cause that?
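
Once the check finishes I plan to look at the mismatch count to see
whether the parity really is wrong (assuming the usual md sysfs layout
here):

cat /sys/block/md0/md/mismatch_cnt

A non-zero count would mean the parity left over from --assume-clean
doesn't match the data.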

I was also wondering whether raid5 does a read-modify-write (RMW) on the
parity with 3 disks when the array is written to.

I can recreate the array without --assume-clean if that's the only way I
can get the parity correct, but I'd like to avoid doing that if possible.

I also have backups of the array if it gets trashed.

-- 
 Microsoft has beaten Volkswagen's world record.  Volkswagen only created 22
 million bugs.


* Re: RAID5 made with assume-clean
  2013-02-06 11:52 RAID5 made with assume-clean Wakko Warner
@ 2013-02-06 13:02 ` Robin Hill
  2013-02-06 14:54   ` Wakko Warner
  0 siblings, 1 reply; 5+ messages in thread
From: Robin Hill @ 2013-02-06 13:02 UTC (permalink / raw)
  To: linux-raid

On Wed Feb 06, 2013 at 06:52:58AM -0500, Wakko Warner wrote:

> I was testing different parameters with --assume-clean to avoid the initial
> rebuild.  When I decided on the parameters I wanted, I forgot to create the
> array without --assume-clean.  I have 3 disks in the array.
> 
> I thought that I'd run a check on it by doing
> echo check > /sys/block/md0/md/sync_action
> 
> /proc/mdstat is showing this:
> Personalities : [raid1] [raid6] [raid5] [raid4] 
> md0 : active raid5 sda1[0] sdb1[1] sdc1[2]
>       488018688 blocks super 1.1 level 5, 64k chunk, algorithm 2 [3/3] [UUU]
>       [============>........]  check = 61.7% (150688512/244009344) finish=7.1min speed=216592K/sec
> 
> unused devices: <none>
> 
> The thing is, the drives can only do ~60mb/sec and there is no disk
> activity.  The activity lights are not lit at all.  What would cause that?
> 
What disks are they? I would expect a modern SATA disk to be able to
handle 120MB/s for sequential read, so 220MB/s across the array would be
pretty normal.

> I was also wondering if the raid5 did RMW on the parity with 3 disks when
> the array is written to.
> 
Not sure what the logic is on this. For a 3 disk array it'd need a
single read and 2 writes for a single chunk, whether it's doing RMW or
not. It will probably still do RMW though, as that avoids the
complication of special-casing things. I've had a quick look at the code
and I can't see any special casing (other than for a 2 disk array, where
the same data is written to both).

> I can rebuild the array without assume-clean if that's the only way I can
> get the parity to be correct, but I'd like to avoid doing that if possible.
> 
That's definitely the safest option. If you can verify the data then you
could run a repair and a fsck before verifying/restoring the data, but
that'd take far longer than a simple rebuild and restore.

HTH,
    Robin
-- 
     ___        
    ( ' }     |       Robin Hill        <robin@robinhill.me.uk> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |


* Re: RAID5 made with assume-clean
  2013-02-06 13:02 ` Robin Hill
@ 2013-02-06 14:54   ` Wakko Warner
  2013-02-07 14:29     ` Robin Hill
  0 siblings, 1 reply; 5+ messages in thread
From: Wakko Warner @ 2013-02-06 14:54 UTC (permalink / raw)
  To: linux-raid

Please keep me in CC.

Robin Hill wrote:
> On Wed Feb 06, 2013 at 06:52:58AM -0500, Wakko Warner wrote:
> 
> > I was testing different parameters with --assume-clean to avoid the initial
> > rebuild.  When I decided on the parameters I wanted, I forgot to create the
> > array without --assume-clean.  I have 3 disks in the array.
> > 
> > I thought that I'd run a check on it by doing
> > echo check > /sys/block/md0/md/sync_action
> > 
> > /proc/mdstat is showing this:
> > Personalities : [raid1] [raid6] [raid5] [raid4] 
> > md0 : active raid5 sda1[0] sdb1[1] sdc1[2]
> >       488018688 blocks super 1.1 level 5, 64k chunk, algorithm 2 [3/3] [UUU]
> >       [============>........]  check = 61.7% (150688512/244009344) finish=7.1min speed=216592K/sec
> > 
> > unused devices: <none>
> > 
> > The thing is, the drives can only do ~60mb/sec and there is no disk
> > activity.  The activity lights are not lit at all.  What would cause that?
> > 
> What disks are they? I would expect a modern SATA disk to be able to
> handle 120MB/s for sequential read, so 220 across the array would be
> pretty normal.

They are old WDC 250GB disks.  I did a dd test on them; they do about
60MB/s.  A dd test on the array gives me about 119MB/s.  As I stated, the
activity lights did not even come on during the check; nothing was done.
This is kernel 3.3.0.

> > I was also wondering if the raid5 did RMW on the parity with 3 disks when
> > the array is written to.
> > 
> Not sure what the logic is on this. For a 3 disk array it'd need a
> single read and 2 writes for a single chunk, whether it's doing RMW or
> not. It will probably still do RMW though, as that avoids the
> complication of special-casing things. I've had a quick look at the code
> and I can't see any special casing (other than for a 2 disk array, where
> the same data is written to both).

I know there are two ways to update the parity:
1) Read the other data blocks, calculate the parity, and write the new
   parity block.  Or
2) Read the old parity, read the old data, calculate the new parity, and
   write the new parity.

Obviously, with many disks, #2 is the best option.  With 3 disks, #1 would
be the best option (IMO), especially when the array was created with
--assume-clean.
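
For illustration, roughly how the two approaches compute the new parity
(plain XOR over the chunks in a stripe, ignoring offsets):

#1 (reconstruct-write):  new_parity = XOR of all the other data chunks
#2 (read-modify-write):  new_parity = old_parity XOR old_data XOR new_data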

> > I can rebuild the array without assume-clean if that's the only way I can
> > get the parity to be correct, but I'd like to avoid doing that if possible.
> > 
> That's definitely the safest option. If you can verify the data then you
> could run a repair and a fsck before verifying/restoring the data, but
> that'd take far longer than a simple rebuild and restore.

The data I added was transferred from one LVM PV to this one with pvmove.
One volume was dd'd from the old disk that this array was replacing.  I
saved the raw volume image to another server in case I messed something up
(and the old disk was dying anyway).

What I really want to know is why the check didn't work.  If I recreate
the array, I'll add another drive to the VG, move the volumes off,
recreate the array, and move them back.
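
Roughly what I have in mind (a sketch only; the VG name "myvg" and the
spare disk sdd1 are placeholders, and the mdadm parameters are just the
ones from my mdstat above):

pvcreate /dev/sdd1
vgextend myvg /dev/sdd1
pvmove /dev/md0
vgreduce myvg /dev/md0
mdadm --stop /dev/md0
mdadm --create /dev/md0 --metadata=1.1 --level=5 --raid-devices=3 \
      --chunk=64 /dev/sda1 /dev/sdb1 /dev/sdc1
pvcreate /dev/md0
vgextend myvg /dev/md0
pvmove /dev/sdd1
vgreduce myvg /dev/sdd1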

-- 
 Microsoft has beaten Volkswagen's world record.  Volkswagen only created 22
 million bugs.


* Re: RAID5 made with assume-clean
  2013-02-06 14:54   ` Wakko Warner
@ 2013-02-07 14:29     ` Robin Hill
  2013-02-07 18:14       ` Wakko Warner
  0 siblings, 1 reply; 5+ messages in thread
From: Robin Hill @ 2013-02-07 14:29 UTC (permalink / raw)
  To: Wakko Warner; +Cc: linux-raid

On Wed Feb 06, 2013 at 09:54:46AM -0500, Wakko Warner wrote:

> Please keep me in CC.
> 
Then don't set the Mail-Followup-To header to point to the list;
otherwise you're strongly suggesting that you don't want to be CCed in
the replies.

> Robin Hill wrote:
> > On Wed Feb 06, 2013 at 06:52:58AM -0500, Wakko Warner wrote:
> > 
> > > I was testing different parameters with --assume-clean to avoid the initial
> > > rebuild.  When I decided on the parameters I wanted, I forgot to create the
> > > array without --assume-clean.  I have 3 disks in the array.
> > > 
> > > I thought that I'd run a check on it by doing
> > > echo check > /sys/block/md0/md/sync_action
> > > 
> > > /proc/mdstat is showing this:
> > > Personalities : [raid1] [raid6] [raid5] [raid4] 
> > > md0 : active raid5 sda1[0] sdb1[1] sdc1[2]
> > >       488018688 blocks super 1.1 level 5, 64k chunk, algorithm 2 [3/3] [UUU]
> > >       [============>........]  check = 61.7% (150688512/244009344) finish=7.1min speed=216592K/sec
> > > 
> > > unused devices: <none>
> > > 
> > > The thing is, the drives can only do ~60mb/sec and there is no disk
> > > activity.  The activity lights are not lit at all.  What would cause that?
> > > 
> > What disks are they? I would expect a modern SATA disk to be able to
> > handle 120MB/s for sequential read, so 220 across the array would be
> > pretty normal.
> 
> They are old WDC 250 disks.  I did a DD test on them, they are 60mb/sec.  A
> DD test on the array gives me about 119mb/sec.  As I stated, the activity
> lights did not even come on during the check.  Nothing was done.  This is
> kernel 3.3.0.  
> 
That does seem odd then. I've never seen that happen before. Has the
array been stopped & restarted since the initial creation? It may be
that having created it with --assume-clean is setting something which
short-circuits the check process.
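
One quick thing worth looking at is what md currently reports for the
array, e.g.:

mdadm --detail /dev/md0
cat /sys/block/md0/md/array_state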

> > > I was also wondering if the raid5 did RMW on the parity with 3 disks when
> > > the array is written to.
> > > 
> > Not sure what the logic is on this. For a 3 disk array it'd need a
> > single read and 2 writes for a single chunk, whether it's doing RMW or
> > not. It will probably still do RMW though, as that avoids the
> > complication of special-casing things. I've had a quick look at the code
> > and I can't see any special casing (other than for a 2 disk array, where
> > the same data is written to both).
> 
> I know there's 2 ways to update the parity.
> 1) Read the other data blocks, calculate parity, modify parity block.  Or
> 2) Read the parity, read the old data, calculate new parity, write parity.
> 
> Obviously, with many disks, #2 is the best option.  With 3 disks, #1 would
> be the best option (IMO) especially when created with assume clean.
> 
Performance-wise there's little difference (though #2 may be slightly
quicker as it avoids a seek on one disk), but #1 makes the code more
complex and will only help in really obscure situations.

> > > I can rebuild the array without assume-clean if that's the only way I can
> > > get the parity to be correct, but I'd like to avoid doing that if possible.
> > > 
> > That's definitely the safest option. If you can verify the data then you
> > could run a repair and a fsck before verifying/restoring the data, but
> > that'd take far longer than a simple rebuild and restore.
> 
> The data I added was transfered from one LVM PV to this one with pvmove. 
> One volume was DDd from the old disk that this array was replacing.  I saved
> the raw volume image to another server incase I messed something up (and the
> old disk was dying anyway)
> 
> I really wanted to know the answer to the problem where check didn't work.
> If I recreate the array, I'll add another drive to the VG and move the
> volumes off, recreate and move back.
> 
I'd suggest stopping & restarting the array (if this hasn't been done)
and rerunning the check (or just running a repair straightaway).
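
Something along these lines (device names taken from your mdstat, and
anything using the array should be deactivated first):

mdadm --stop /dev/md0
mdadm --assemble /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1
echo repair > /sys/block/md0/md/sync_action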

Cheers,
    Robin
-- 
     ___        
    ( ' }     |       Robin Hill        <robin@robinhill.me.uk> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |


* Re: RAID5 made with assume-clean
  2013-02-07 14:29     ` Robin Hill
@ 2013-02-07 18:14       ` Wakko Warner
  0 siblings, 0 replies; 5+ messages in thread
From: Wakko Warner @ 2013-02-07 18:14 UTC (permalink / raw)
  To: Robin Hill; +Cc: linux-raid

Robin Hill wrote:
> On Wed Feb 06, 2013 at 09:54:46AM -0500, Wakko Warner wrote:
> 
> > Please keep me in CC.
> > 
> Then don't set the Mail-Followup-To header to point to the list,
> otherwise you're strongly suggesting that you don't want to be CCed in
> the replies.

It wasn't supposed to be set; I never configured mutt to do so.  I've now
set the followup address to my own.  I didn't know it was doing that.
Thanks for letting me know.

> > Robin Hill wrote:
> > > What disks are they? I would expect a modern SATA disk to be able to
> > > handle 120MB/s for sequential read, so 220 across the array would be
> > > pretty normal.
> > 
> > They are old WDC 250 disks.  I did a DD test on them, they are 60mb/sec.  A
> > DD test on the array gives me about 119mb/sec.  As I stated, the activity
> > lights did not even come on during the check.  Nothing was done.  This is
> > kernel 3.3.0.  
> > 
> That does seem odd then. I've never seen that happen before. Has the
> array been stopped & restarted since the initial creation? It may be
> that having created it with --assume-clean is setting something which
> short-circuits the check process.

I'm not sure.  Since the array (md0) is an LVM PV, I added another PV
yesterday, moved the LVs to the new PV, and recreated the array.  It did
do the resync, and all 3 activity lights were active.

> > > Not sure what the logic is on this. For a 3 disk array it'd need a
> > > single read and 2 writes for a single chunk, whether it's doing RMW or
> > > not. It will probably still do RMW though, as that avoids the
> > > complication of special-casing things. I've had a quick look at the code
> > > and I can't see any special casing (other than for a 2 disk array, where
> > > the same data is written to both).
> > 
> > I know there's 2 ways to update the parity.
> > 1) Read the other data blocks, calculate parity, modify parity block.  Or
> > 2) Read the parity, read the old data, calculate new parity, write parity.
> > 
> > Obviously, with many disks, #2 is the best option.  With 3 disks, #1 would
> > be the best option (IMO) especially when created with assume clean.
> > 
> Performance-wise there's little difference (though #2 may be slightly
> quicker as it avoids a seek on one disk), but #1 makes the code more
> complex and will only help in really obscure situations.

I guess that depends.  On a 3-drive array it wouldn't make a difference
in performance, but #1 would always keep the parity accurate (barring disk
errors).  #2 would make a difference in performance with more disks, say
10, with multiple writes going on.
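
Rough I/O counts for rewriting a single chunk on a 10-disk array (ignoring
the full-stripe-write case): #2 is 2 reads + 2 writes = 4 I/Os, while #1
would be 8 reads + 2 writes = 10 I/Os, so #2 clearly wins once the array
gets wide.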

> > > That's definitely the safest option. If you can verify the data then you
> > > could run a repair and a fsck before verifying/restoring the data, but
> > > that'd take far longer than a simple rebuild and restore.
> > 
> > The data I added was transfered from one LVM PV to this one with pvmove. 
> > One volume was DDd from the old disk that this array was replacing.  I saved
> > the raw volume image to another server incase I messed something up (and the
> > old disk was dying anyway)
> > 
> > I really wanted to know the answer to the problem where check didn't work.
> > If I recreate the array, I'll add another drive to the VG and move the
> > volumes off, recreate and move back.
> > 
> I'd suggest stopping & restarting the array (if this hasn't been done)
> and rerunning the check (or just running a repair straightaway).

As stated above, I just rebuilt the array.  The resync speed was around
60MB/s, as expected.  Interestingly, if I issue a check on it, it still
won't touch the disks the entire time it's checking the array.  I have
another array in this machine and it does the same thing.  It may be a
kernel bug in 3.3.0, not sure.
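
Next time I run a check I'll compare what md reports against the actual
device I/O, something like:

cat /proc/mdstat
iostat -x sda sdb sdc 1

If iostat shows no reads while mdstat claims 200MB/s+, that would back up
the kernel bug theory.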

-- 
 Microsoft has beaten Volkswagen's world record.  Volkswagen only created 22
 million bugs.
