* Fwd: Maximizing failed disk replacement on a RAID5 array
       [not found] <BANLkTimBYFhjQ-sC9DhTMO+PG-Ox+A9S2Q@mail.gmail.com>
@ 2011-06-05 14:22 ` Durval Menezes
  2011-06-06 15:02   ` Drew
  0 siblings, 1 reply; 17+ messages in thread
From: Durval Menezes @ 2011-06-05 14:22 UTC (permalink / raw)
To: linux-raid

Hello folks,

A few days ago, the smartd daemon running on my Lucid system at home
(kernel 2.6.32-32-generic, mdadm 2.6.7.1) started warning me about a
few (fewer than 50 so far) offline-uncorrectable and other errors on
one of the 1.5TB HDs in my three-disk RAID5 array. This failing HD is
still online (i.e., it hasn't been kicked out of the array), at least
for now.

I have another disk ready for replacement, and I'm trying to determine
the safest (not necessarily the simplest) way of proceeding.

I understand that, if I do it the "standard" way (i.e., power down the
system, remove the failing disk, add the replacement disk, then boot
up and use "mdadm --add" to add the new disk to the array), I run the
risk of running into unreadable sectors on one of the other two disks,
and then my RAID5 is kaput.

What I would really like to do is to add the new HD to the array
WITHOUT removing the failing HD, somehow sync it with the rest, and
THEN remove the failing HD: that way, a failed read from one of the
two other HDs could possibly be satisfied from the failing HD (unless
EXACTLY that same sector is also unreadable on it, which I find
unlikely), and so avoid losing the whole array in the above case.

So far, the only way I've been able to figure out to do that would be
to convert the array from RAID5 to RAID6, add the new disk, wait for
the array to sync, remove the failing disk, and then convert the array
back from RAID6 to RAID5 (and I'm not really sure that this is a good
idea, or even doable).

So, folks, what do you say? Is there a better way? Any gotchas in the
RAID5->RAID6->RAID5 approach?

Thanks,
--
Durval Menezes.
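[Editor's note: the RAID5->RAID6->RAID5 route proposed above would look roughly like the sketch below. The device names (/dev/md0, /dev/sdd1 for the new disk, /dev/sdc3 for the failing member) are hypothetical, and online level changes need a newer mdadm (3.1 or later) and kernel than the mdadm 2.6.7.1 mentioned above, so treat this as an untested outline. The `run` wrapper only prints each command instead of executing it.]

```shell
#!/bin/sh
# Dry-run sketch: device names are hypothetical, and `run` only prints
# the command, so nothing here touches a real array.
run() { echo "+ $*"; }

# 1. Add the replacement disk as a spare.
run mdadm /dev/md0 --add /dev/sdd1
# 2. Reshape the 3-disk RAID5 into a 4-disk RAID6; the new disk gets
#    synced in as the extra (Q) parity while the failing disk stays present.
run mdadm --grow /dev/md0 --level=6 --raid-devices=4 --backup-file=/root/md0-grow.bak
# 3. Once the reshape finishes, drop the failing disk.
run mdadm /dev/md0 --fail /dev/sdc3 --remove /dev/sdc3
# 4. Reshape back to a 3-disk RAID5.
run mdadm --grow /dev/md0 --level=5 --raid-devices=3 --backup-file=/root/md0-shrink.bak
```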
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: Maximizing failed disk replacement on a RAID5 array
  2011-06-05 14:22 ` Fwd: Maximizing failed disk replacement on a RAID5 array Durval Menezes
@ 2011-06-06 15:02   ` Drew
  2011-06-06 15:20     ` Brad Campbell
  0 siblings, 1 reply; 17+ messages in thread
From: Drew @ 2011-06-06 15:02 UTC (permalink / raw)
To: Durval Menezes; +Cc: linux-raid

> I understand that, if I do it the "standard" way (ie, power down the
> system, remove the failing disk, add the replacement disk, then boot
> up and use "mdadm --add" to add the new disk to the array) I run the
> risk of running into unreadable sectors on one of the other two disks,
> and then my RAID5 is kaput.
>
> What I would really like to do is to be able to add the new HD to the
> array WITHOUT removing the failing HD, somehow sync it with the rest,
> and THEN remove the failing HD: that way, an eventual failed read from
> one of the two other HDs could possibly be satisfied from the failing
> HD (unless EXACTLY that same sector is also unreadable on it, which I
> find unlikely), and so avoid losing the whole array in the above case.

A reshape from RAID5 -> RAID6 -> RAID5 will hammer your disks, so if
either of the other two is ready to die, this will most likely tip
them over the edge.

A far simpler way would be to take the array offline, dd (or
dd_rescue) the old drive's contents onto the new disk, pull the old
disk, and restart the array with the new drive in its place. With luck
you won't need a resync *and* you're not hammering the other two
drives in the process.

--
Drew

"Nothing in life is to be feared. It is only to be understood."
--Marie Curie

"This started out as a hobby and spun horribly out of control."
-Unknown
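[Editor's note: a dry-run sketch of the offline clone Drew describes. Device names are hypothetical (/dev/sdc for the failing disk, /dev/sdd for its replacement, /dev/sda3 and /dev/sdb3 for the healthy members), and the `run` wrapper only prints each command.]

```shell
#!/bin/sh
# Dry-run sketch of the offline-clone approach; nothing is executed.
run() { echo "+ $*"; }

run umount /mnt/array              # quiesce anything using the array
run mdadm --stop /dev/md0          # take the array offline
# Plain dd (not dd_rescue): it aborts on the first read error instead of
# silently copying garbage onto the replacement disk.
run dd if=/dev/sdc of=/dev/sdd bs=1M
# Pull the old disk, then reassemble with the clone in its place.
run mdadm --assemble /dev/md0 /dev/sda3 /dev/sdb3 /dev/sdd3
```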
* Re: Maximizing failed disk replacement on a RAID5 array
  2011-06-06 15:02 ` Drew
@ 2011-06-06 15:20   ` Brad Campbell
  2011-06-06 15:37     ` Drew
  0 siblings, 1 reply; 17+ messages in thread
From: Brad Campbell @ 2011-06-06 15:20 UTC (permalink / raw)
To: Drew; +Cc: Durval Menezes, linux-raid

On 06/06/11 23:02, Drew wrote:
>> I understand that, if I do it the "standard" way (ie, power down the
>> system, remove the failing disk, add the replacement disk, then boot
>> up and use "mdadm --add" to add the new disk to the array) I run the
>> risk of running into unreadable sectors on one of the other two disks,
>> and then my RAID5 is kaput.
>>
>> What I would really like to do is to be able to add the new HD to the
>> array WITHOUT removing the failing HD, somehow sync it with the rest,
>> and THEN remove the failing HD: that way, an eventual failed read from
>> one of the two other HDs could possibly be satisfied from the failing
>> HD (unless EXACTLY that same sector is also unreadable on it, which I
>> find unlikely), and so avoid losing the whole array in the above case.
> A reshape from RAID5 -> RAID6 -> RAID5 will hammer your disks so if
> either of the other two are ready to die, this will most likely tip
> them over the edge.
>
> A far simpler way would be to take the array offline, dd (or
> dd_rescue) the old drive's contents onto the new disk, pull the old
> disk, and restart the array with the new drive in it's place. With
> luck you won't need a resync *and* you're not hammering the other two
> drives in the process.

<afterthought>
Bear with me, I've had a few scotches and this might not be as coherent
as it could be, but I think I spot a very, very fatal flaw in your plan.
</afterthought>

I thought this initially also, except it blows up in the scenario where
the dud sectors are data and not parity.

If you do it the way you suggest and choose dd_rescue in place of dd,
dodgy data from the dud sectors will be replicated as kosher sectors on
the replacement disk (or zero, or random, or whatever).

If you execute a "repair" first, it will strike the dud sectors, see
they are toast, re-calculate them from parity and write them back,
forcing a reallocation. You can then replicate the failing disk using
"dd", *not* dd_rescue. If dd fails due to a read error then you know
that part of your data is likely to be toast on the replaced disk, and
you can go about making provisions for a backup/restore operation using
the original disk (which will likely succeed, as the data read from the
array will be re-built from parity where required).

dd_rescue is a blessing and a curse. It's _very_ good at getting you
access to data that you have no backup of and no other way of getting
back. On the other hand, it will happily go and replicate whatever
trash it happens to get back from the source disk, or skip those
sectors and leave you with an incomplete copy that will leave no trace
of being incomplete until you find chunks missing (like superblocks or
your formula for a zero-cost petroleum replacement).

If your array works but has a badly failing drive, you are far better
off buying some cheap 2TB disks and backing it up, then restoring onto
a re-created array, than chancing losing chunks of data by using a
dd_rescue'd clone disk.

Now, if I'm off the wall and missing something blindingly obvious, feel
free to thump me with a clue bat (it would not be the first time).

I've lost 2 arrays recently: 8TB to a dodgy controller (thanks SIL),
and 2TB to complete idiocy on my part, so I know the sting of lost or
corrupted data.

Brad
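[Editor's note: the "repair" Brad refers to is triggered through md's sysfs interface; on a real array the target file is /sys/block/mdX/md/sync_action. The sketch below writes to a temporary file instead of the real sysfs node so it can run anywhere without an array.]

```shell
#!/bin/sh
# Stand-in for /sys/block/md0/md/sync_action so the sketch is harmless.
SYNC_ACTION=$(mktemp)

# On a real array, writing "repair" here makes md read every stripe and
# rewrite any unreadable sector from parity, forcing the drive to
# reallocate it from its spare pool.
echo repair > "$SYNC_ACTION"

action=$(cat "$SYNC_ACTION")
echo "requested action: $action"
rm -f "$SYNC_ACTION"
```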
* Re: Maximizing failed disk replacement on a RAID5 array
  2011-06-06 15:20 ` Brad Campbell
@ 2011-06-06 15:37   ` Drew
  2011-06-06 15:54     ` Brad Campbell
  0 siblings, 1 reply; 17+ messages in thread
From: Drew @ 2011-06-06 15:37 UTC (permalink / raw)
To: Brad Campbell; +Cc: Durval Menezes, linux-raid

> Now, if I'm off the wall and missing something blindingly obvious feel free
> to thump me with a clue bat (it would not be the first time).
>
> I've lost 2 arrays recently. 8TB to a dodgy controller (thanks SIL), and 2TB
> to complete idiocy on my part, so I know the sting of lost or corrupted
> data.

I think you've covered the process in more detail, including pitfalls,
than I have. :-) Only catch is: where would you find a cheap 2-3TB
drive right now?

I also know the sting of mixing stupidity and dd. ;-) A friend was
helping me do some complex rework with dd on one of my disks. Being
the n00b, I followed his instructions exactly, and he, being the expert
(and assuming I wasn't the n00b I was back then), didn't double-check
my work. Net result was that I backed the MBR/partition table up using
dd, but did so to a partition on the very drive we were working on.
There may have been some alcohol involved (I was in university), the
revised data we inserted failed, and the next thing you know I'm
running Partition Magic (the GNU tools circa 2005 failed to detect
anything) to try and recover the partition table. No backups,
obviously. ;-)

--
Drew

"Nothing in life is to be feared. It is only to be understood."
--Marie Curie
* Re: Maximizing failed disk replacement on a RAID5 array
  2011-06-06 15:37 ` Drew
@ 2011-06-06 15:54   ` Brad Campbell
  2011-06-06 18:06     ` Durval Menezes
  0 siblings, 1 reply; 17+ messages in thread
From: Brad Campbell @ 2011-06-06 15:54 UTC (permalink / raw)
To: Drew; +Cc: Durval Menezes, linux-raid

On 06/06/11 23:37, Drew wrote:
>> Now, if I'm off the wall and missing something blindingly obvious feel free
>> to thump me with a clue bat (it would not be the first time).
>>
>> I've lost 2 arrays recently. 8TB to a dodgy controller (thanks SIL), and 2TB
>> to complete idiocy on my part, so I know the sting of lost or corrupted
>> data.
> I think you've covered the process in more detail, including pitfalls,
> than I have. :-) Only catch is where would you find a cheap 2-3TB
> drive right now?

I bought 10 recently for about $90 each. It's all relative, but I
consider ~$45/TB cheap.

> I also know the sting of mixing stupidity and dd. ;-) A friend was
> helping me do some complex rework with dd on one of my disks. Being
> the n00b I followed his instructions exactly, and him being the expert
> (and assuming I wasn't the n00b I was back then) didn't double check
> my work. Net result was I backed the MBR/Partition Table up using dd,
> but did so to a partition on the drive we were working on. There may
> have been some alcohol involved (I was in University), the revised
> data we inserted failed, and next thing you know I'm running Partition
> Magic (the gnu tools circa 2005 failed to detect anything) to try and
> recover the partition table. No backups obviously. ;-)

Similar to my

dd if=/dev/zero of=/dev/sdb bs=1M count=100

except instead of the target disk, it was a raid array member that was
currently active. To its credit, ext3 and fsck managed to give me most
of my data back, even if I had to spend months intermittently
sorting/renaming inode numbers from lost+found into files and
directories.

I'd like to claim alcohol as a mitigating factor (hell, it gets people
off charges in our court system all the time) but unfortunately I was
just stupid.
* Re: Maximizing failed disk replacement on a RAID5 array
  2011-06-06 15:54 ` Brad Campbell
@ 2011-06-06 18:06   ` Durval Menezes
  2011-06-07 5:03     ` Durval Menezes
  2011-06-07 8:52     ` John Robinson
  0 siblings, 2 replies; 17+ messages in thread
From: Durval Menezes @ 2011-06-06 18:06 UTC (permalink / raw)
To: linux-raid; +Cc: Brad Campbell, Drew

Hello Brad, Drew,

Thanks for reminding me of the hammering a RAID level conversion would
cause. This is certainly a major reason to avoid the
RAID5->RAID6->RAID5 route.

The "repair" has been running here for a few days already, with the
server online, and ought to finish in 24 more hours. So far (thanks to
the automatic rewrite relocation) the number of uncorrectable sectors
being reported by SMART has dropped from 40 to 20, so it seems the
repair is doing its job. Let's just hope the disk has enough spare
sectors to remap all the bad sectors; if it does, a simple "dd" from
the bad disk to its replacement ought to do the job (as you have
indicated).

On the other hand, as this "dd" has to be done with the array offline,
it will entail some downtime (although not as much as having to
restore the whole array from backups)... not ideal, but not too bad
either.

In case worst comes to worst, I have an up-to-date offline backup of
the contents of the whole array, so if something really bad happens, I
have something to restore from.

It would be great to have a
"duplicate-this-bad-old-disk-into-this-shiny-new-disk" functionality,
as it would enable an almost-no-downtime disk replacement with minimum
risk, but it seems we can't have everything... :-0 Maybe it's
something for the wishlist?

About mishaps with "dd", I think everyone who has ever dealt with a
system (not just Linux) on the level we do has at some time gone
through something similar... the last time I remember doing this was
many years ago, before Linux existed, when a few friends and I spent a
wonderful night installing William Jolitz's then-new 386/BSD on a HD
(a process which *required* dd) and trashing its Windows partitions
(which contained the only copy of the graduation thesis of one of us,
due in a few days).

Thanks for all the help,
--
Durval Menezes.

On Mon, Jun 6, 2011 at 12:54 PM, Brad Campbell <brad@fnarfbargle.com> wrote:
>
> On 06/06/11 23:37, Drew wrote:
>>>
>>> Now, if I'm off the wall and missing something blindingly obvious feel free
>>> to thump me with a clue bat (it would not be the first time).
>>>
>>> I've lost 2 arrays recently. 8TB to a dodgy controller (thanks SIL), and 2TB
>>> to complete idiocy on my part, so I know the sting of lost or corrupted
>>> data.
>>
>> I think you've covered the process in more detail, including pitfalls,
>> than I have. :-) Only catch is where would you find a cheap 2-3TB
>> drive right now?
>
> I bought 10 recently for about $90 each. It's all relative, but I consider ~$45 / TB cheap.
>
>> I also know the sting of mixing stupidity and dd. ;-) A friend was
>> helping me do some complex rework with dd on one of my disks. Being
>> the n00b I followed his instructions exactly, and him being the expert
>> (and assuming I wasn't the n00b I was back then) didn't double check
>> my work. Net result was I backed the MBR/Partition Table up using dd,
>> but did so to a partition on the drive we were working on. There may
>> have been some alcohol involved (I was in University), the revised
>> data we inserted failed, and next thing you know I'm running Partition
>> Magic (the gnu tools circa 2005 failed to detect anything) to try and
>> recover the partition table. No backups obviously. ;-)
>
> Similar to my
>
> dd if=/dev/zero of=/dev/sdb bs=1M count=100
>
> except instead of the target disk, it was to a raid array member that
> was currently active. To its credit, ext3 and fsck managed to give me
> most of my data back, even if I had to spend months intermittently
> sorting/renaming inode numbers from lost+found into files and
> directories.
>
> I'd like to claim Alcohol as a mitigating factor (hell, it gets people
> off charges in our court system all the time) but unfortunately I was
> just stupid.
* Re: Maximizing failed disk replacement on a RAID5 array
  2011-06-06 18:06 ` Durval Menezes
@ 2011-06-07 5:03   ` Durval Menezes
  2011-06-07 5:35     ` Brad Campbell
  0 siblings, 1 reply; 17+ messages in thread
From: Durval Menezes @ 2011-06-07 5:03 UTC (permalink / raw)
To: linux-raid; +Cc: Brad Campbell, Drew

Hello Folks,

Just finished the "repair". It completed OK, and over SMART the HD now
shows a "Reallocated_Sector_Ct" of 291 (which shows that many bad
sectors have been remapped), but it's also still reporting 4
"Current_Pending_Sector" and 4 "Offline_Uncorrectable"... which I
think means exactly the same thing, i.e., that there are 4 "active"
(from the HD perspective) sectors on the drive still detected as bad
and not remapped.

I've been thinking about exactly what that means, and I think that
these 4 sectors are either A) outside the RAID partition (not very
probable, as this partition occupies more than 99.99% of the disk,
leaving just a small, less than 105MB area at the beginning), or B)
some kind of metadata or unused space that hasn't been read and
rewritten by the "repair" I've just completed. I've just done a "dd
bs=1024k count=105 </dev/DISK >/dev/null" to account for hypothesis
A), and come out empty: no errors, and the drive still shows 4 bad,
unmapped sectors on SMART.

So, by elimination, it must be either case B) above, or a bug in the
Linux md code (which prevents it from hitting every needed block on
the disk), or a bug in SMART (which makes it report nonexistent bad
sectors). I've just started running a SMART long test on the disk
(which will try to read all of its sectors, reporting the first error
by LBA) to see what happens. If it shows no errors, I will know it's a
SMART bug. If it shows errors, it must be in an unused/metadata block
or a bug in Linux md.
Either way, my plan is then to try a plain "dd" (no dd_rescue, at
least not now) of this failing disk to a new one; if it goes by
without any errors, I will know it's a bug in SMART. If it hits any
errors, I will have the first error's position (from dd's point of
view), and then I will try to dump that specific sector with dd_rescue
and examine it.

I will keep you posted.

Cheers,
--
Durval Menezes.

On Mon, Jun 6, 2011 at 3:06 PM, Durval Menezes <durval.menezes@gmail.com> wrote:
> Hello Brad, Drew,
>
> Thanks for reminding me of the hammering a RAID level conversion would cause.
> This is certainly a major reason to avoid the RAID5->RAID6->RAID5 route.
>
> The "repair" has been running here for a few days already, with the
> server online, and ought to finish in 24 more hours. So far (thanks to
> the automatic rewrite relocation) the number of uncorrectable sectors
> being reported by SMART has dropped from 40 to 20, so it seems the
> repair is doing its job. Lets just hope the disk has enough spare
> sectors to remap all the bad sectors; if it does, a simple "dd" from
> the bad disk to its replacement ought to do the job (as you have
> indicated).
>
> On the other hand, as this "dd" has to be done with the array offline,
> it will entail some downtime (although not as much as having to
> restore the whole array from backups)... not ideal, but not too bad
> either.
>
> In case worst comes to worst, I have an up-to-date offline backup of
> the contents of the whole array, so if something really bad happens, I
> have something to restore from.
>
> It would be great to have a
> "duplicate-this-bad-old-disk-into-this-shiny-new-disk" functionality,
> as it would enable an almost-no-downtime disk replacement with
> minimum risk, but it seems we can't have everything... :-0 Maybe it's
> something for the wishlist?
>
> About mishaps with "dd", I think everyone who ever dealt with a
> system (not just Linux) on the level we do has sometime gone through
> something similar... the last time I remember doing this was many
> years ago, before Linux existed, when me and a few friends spent a
> wonderful night installing William Jolitz' then-new 386/BSD on a HD
> (a process which *required* dd) and trashing its Windows partitions
> (which contained the only copy of the graduation thesis of one of us,
> due in a few days).
>
> Thanks for all the help,
> --
> Durval Menezes.
>
> On Mon, Jun 6, 2011 at 12:54 PM, Brad Campbell <brad@fnarfbargle.com> wrote:
>>
>> On 06/06/11 23:37, Drew wrote:
>>>>
>>>> Now, if I'm off the wall and missing something blindingly obvious feel free
>>>> to thump me with a clue bat (it would not be the first time).
>>>>
>>>> I've lost 2 arrays recently. 8TB to a dodgy controller (thanks SIL), and 2TB
>>>> to complete idiocy on my part, so I know the sting of lost or corrupted
>>>> data.
>>>
>>> I think you've covered the process in more detail, including pitfalls,
>>> than I have. :-) Only catch is where would you find a cheap 2-3TB
>>> drive right now?
>>
>> I bought 10 recently for about $90 each. It's all relative, but I consider ~$45 / TB cheap.
>>
>>> I also know the sting of mixing stupidity and dd. ;-) A friend was
>>> helping me do some complex rework with dd on one of my disks. Being
>>> the n00b I followed his instructions exactly, and him being the expert
>>> (and assuming I wasn't the n00b I was back then) didn't double check
>>> my work. Net result was I backed the MBR/Partition Table up using dd,
>>> but did so to a partition on the drive we were working on. There may
>>> have been some alcohol involved (I was in University), the revised
>>> data we inserted failed, and next thing you know I'm running Partition
>>> Magic (the gnu tools circa 2005 failed to detect anything) to try and
>>> recover the partition table. No backups obviously. ;-)
>>
>> Similar to my
>>
>> dd if=/dev/zero of=/dev/sdb bs=1M count=100
>>
>> except instead of the target disk, it was to a raid array member that
>> was currently active. To its credit, ext3 and fsck managed to give me
>> most of my data back, even if I had to spend months intermittently
>> sorting/renaming inode numbers from lost+found into files and
>> directories.
>>
>> I'd like to claim Alcohol as a mitigating factor (hell, it gets people
>> off charges in our court system all the time) but unfortunately I was
>> just stupid.
>>
* Re: Maximizing failed disk replacement on a RAID5 array
  2011-06-07 5:03 ` Durval Menezes
@ 2011-06-07 5:35   ` Brad Campbell
  2011-06-08 6:58     ` Durval Menezes
  0 siblings, 1 reply; 17+ messages in thread
From: Brad Campbell @ 2011-06-07 5:35 UTC (permalink / raw)
To: Durval Menezes; +Cc: linux-raid, Drew

On 07/06/11 13:03, Durval Menezes wrote:
> Hello Folks,
>
> Just finished the "repair". It completed OK, and over SMART the HD now
> shows a "Reallocated_Sector_Ct" of 291 (which shows that many bad
> sectors have been remapped), but it's also still reporting 4
> "Current_Pending_Sector" and 4 "Offline_Uncorrectable"... which I
> think means exactly the same thing, ie, that there are 4 "active"
> (from the HD perspective) sectors on the drive still detected as bad
> and not remapped.
>
> I've been thinking about exactly what that means, and I think that
> these 4 sectors are either A) outside the RAID partition (not very
> probable as this partition occupies more than 99.99% of the disk,
> leaving just a small, less than 105MB area at the beginning), or B)
> some kind of metadata or unused space that hasn't been read and
> rewritten by the "repair" I've just completed. I've just done a "dd
> bs=1024k count=105 </dev/DISK >/dev/null" to account for the
> hypothesis A), and come out empty: no errors, and the drive still
> shows 4 bad, unmapped sectors on SMART.
>
> So, by elimination, it must be either case B) above, or a bug in the
> linux md code (which prevents it from hitting every needed block on
> the disk), or a bug in SMART (which makes it report nonexistent bad

Try running a SMART long test (smartctl -t long) and it will tell you
whether the sectors are really bad or not.

I've had instances where the firmware still thought that some
previously pending sectors were still pending until I forced a test,
at which time the drive came to its senses and they went away.

I believe if you wait until the drive gets around to doing its
periodic offline data collection you'll see the same thing, but a long
test is nice as it will give you an actual block number for the first
failure (if you have one).
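[Editor's note: Brad's suggestion boils down to something like the following. It is another dry-run sketch (/dev/sdc is the failing disk from upthread, and `run` only prints each command); the real self-test runs inside the drive, so you start it, wait, then read the log back.]

```shell
#!/bin/sh
run() { echo "+ $*"; }

run smartctl -t long /dev/sdc      # start the extended (long) offline self-test
# The drive reports an estimated completion time; check back afterwards:
run smartctl -l selftest /dev/sdc  # self-test log, including LBA_of_first_error
run smartctl -A /dev/sdc           # attribute table: pending/reallocated counts
```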
* Re: Maximizing failed disk replacement on a RAID5 array
  2011-06-07 5:35 ` Brad Campbell
@ 2011-06-08 6:58   ` Durval Menezes
  2011-06-08 7:32     ` Brad Campbell
  0 siblings, 1 reply; 17+ messages in thread
From: Durval Menezes @ 2011-06-08 6:58 UTC (permalink / raw)
To: Brad Campbell; +Cc: linux-raid, Drew

Hello,

On Tue, Jun 7, 2011 at 2:35 AM, Brad Campbell <brad@fnarfbargle.com> wrote:
> On 07/06/11 13:03, Durval Menezes wrote:
>>
>> Hello Folks,
>>
>> Just finished the "repair". It completed OK, and over SMART the HD now
>> shows a "Reallocated_Sector_Ct" of 291 (which shows that many bad
>> sectors have been remapped), but it's also still reporting 4
>> "Current_Pending_Sector" and 4 "Offline_Uncorrectable"... which I
>> think means exactly the same thing, ie, that there are 4 "active"
>> (from the HD perspective) sectors on the drive still detected as bad
>> and not remapped.
>>
>> I've been thinking about exactly what that means, and I think that
>> these 4 sectors are either A) outside the RAID partition (not very
>> probable as this partition occupies more than 99.99% of the disk,
>> leaving just a small, less than 105MB area at the beginning), or B)
>> some kind of metadata or unused space that hasn't been read and
>> rewritten by the "repair" I've just completed. I've just done a "dd
>> bs=1024k count=105 </dev/DISK >/dev/null" to account for the
>> hypothesis A), and come out empty: no errors, and the drive still
>> shows 4 bad, unmapped sectors on SMART.
>>
>> So, by elimination, it must be either case B) above, or a bug in the
>> linux md code (which prevents it from hitting every needed block on
>> the disk), or a bug in SMART (which makes it report nonexistent bad
>>
> Try running a SMART long test (smartctl -t long) and it will tell you
> whether the sectors are really bad or not.
>
> I've had instances where the firmware still thought that some previously
> pending sectors were still pending until I forced a test, at which time the
> drive came to its senses and they went away.
>
> I believe if you wait until the drive gets around to doing its periodic
> offline data collection you'll see the same thing, but a long test is nice
> as it will give you an actual block number for the first failure (if you
> have one)

I did it (smartctl -t long) and it completed (registering an error at
the very end of the disk):

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       10%             9942         2930273794

The SMART attributes table still shows 4 pending/uncorrectable sectors:

197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       4
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       4

Converting the above LBA to a block number, I find 2930273794/2 =
1465136897; as this is a 1.5TB HD, this first error (there are
possibly 3 more) is right at the final 35GB of the media, so it's
inside (near the end of) the RAID partition:

fdisk -l /dev/sdc

Disk /dev/sdc: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x6be6057c

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1           1        8001    4  FAT16 <32M
/dev/sdc2   *           2          14      104422+  83  Linux
/dev/sdc3              15      182401  1465023577+  fd  Linux raid autodetect

Confirming that this block is indeed returning read errors:

dd count=1 bs=1024 skip=1465136897 if=/dev/sdc of=/dev/null
[long delay]
dd: reading `/dev/sdc': Input/output error
0+0 records in
0+0 records out
0 bytes (0 B) copied, 45.1076 s, 0.0 kB/s

Examining one sector before:

dd count=1 bs=1024 skip=1465136896 if=/dev/sdc | hexdump -C
00000000  92 e1 b4 d4 c6 cd 0f 33  db 7c ff a9 be c1 c1 8e  |.......3.|......|
00000010  71 35 fc 55 16 c4 36 ef  59 10 db 20 22 f4 57 99  |q5.U..6.Y.. ".W.|
00000020  31 61 2b 24 e0 98 3c 94  4b 8a 17 93 23 aa e9 96  |1a+$..<.K...#...|
00000030  b0 47 7b 8f 12 c6 52 42  99 0d 72 b4 51 02 5a 8e  |.G{...RB..r.Q.Z.|
00000040  c6 5a ac 86 0b a5 74 9b  13 e7 87 7a db 94 e2 7f  |.Z....t....z....|
00000050  c6 42 75 ba 53 bf 7f 20  fc 9c ad 4b 8f 3c 85 64  |.Bu.S.. ...K.<.d|
00000060  3a b0 ac 41 6e 41 fb 95  03 70 24 7e 2e d5 df 8a  |:..AnA...p$~....|
00000070  f9 dc d1 7d 4a 1e e1 93  9d 39 18 83 6c 9f 9f 79  |...}J....9..l..y|
00000080  53 a3 d1 fb 7f c6 bd 44  8d 0c 40 06 0a 92 f9 7e  |S......D..@....~|
00000090  0c 0e 87 43 66 9d fc 12  2b 0d 7a 34 ba 84 cb 73  |...Cf...+.z4...s|
000000a0  47 3b a4 fa c9 50 d9 96  f9 50 a2 60 17 eb 7c c8  |G;...P...P.`..|.|
000000b0  42 76 59 d0 1e 06 10 a8  3b 89 74 8d b4 04 83 88  |BvY.....;.t.....|
000000c0  d7 9d 3c 82 cf 8f 7d 6e  a2 b6 bf 56 06 c0 aa 7c  |..<...}n...V...||
000000d0  7d 39 ae 0a 67 48 28 b5  07 fd fc ae 49 e4 7a 08  |}9..gH(.....I.z.|
000000e0  8a 37 94 e0 d3 d7 f0 f4  4c 49 3a ed b7 f4 84 95  |.7......LI:.....|
000000f0  3f 0a 4f 6c 47 62 1a f4  70 ca 14 8a 52 6d 4c 1e  |?.OlGb..p...RmL.|
00000100  da 0c 29 17 c1 a4 e1 5c  cb 43 e0 01 45 9c 72 7f  |..)....\.C..E.r.|
00000110  78 b8 19 3f dd 35 c5 50  ff 9b 42 fb 0b d8 61 5a  |x..?.5.P..B...aZ|
00000120  24 2b ae c9 45 e6 e5 e9  04 00 93 bb 53 c0 fd d6  |$+..E.......S...|
00000130  9c ab 69 98 50 f0 5e 98  0d 0b b3 dc cb cb d0 7d  |..i.P.^........}|
00000140  21 70 68 e8 fb 3c 55 fd  2d c6 6c 25 86 dd 9a 4a  |!ph..<U.-.l%...J|
00000150  fc e2 24 a9 fb 9a 6b be  d5 e2 3b e9 a0 b1 61 ad  |..$...k...;...a.|
00000160  1f 9a c8 31 86 91 c6 1f  86 9e 17 35 25 7e 77 42  |...1.......5%~wB|
00000170  37 86 b2 17 08 8e c4 cf  4e e2 64 7d 83 11 05 1e  |7.......N.d}....|
00000180  6b c1 e7 5d 0f e2 c9 f9  0a 0a b1 2b 83 a1 2a a4  |k..].......+..*.|
00000190  1d f8 a6 13 2f e9 45 bb  b7 e2 71 e9 69 ad 3c 47  |..../.E...q.i.<G|
000001a0  3f fa 39 7f 1e 93 0e d2  89 09 dc d2 b3 3b f8 6f  |?.9..........;.o|
000001b0  21 21 72 b6 9e 9d 42 79  fb 78 3c 02 85 7b 1f 4f  |!!r...By.x<..{.O|
000001c0  8b 3c 26 62 8a 58 38 a7  48 31 b9 e2 0c 0d 41 d6  |.<&b.X8.H1....A.|
000001d0  8f 43 95 f0 1f 52 3e 0e  55 8d c0 93 f7 e3 c8 79  |.C...R>.U......y|
000001e0  a2 bc 51 72 87 3c 16 c3  d0 f3 57 a8 e4 48 51 32  |..Qr.<....W..HQ2|
000001f0  00 99 3e 0e 88 a3 fa e3  00 a4 c2 cb 28 7a a1 00  |..>.........(z..|
00000200  a0 b4 1b 6d c4 2a 15 75  a3 f0 24 47 5a d6 54 74  |...m.*.u..$GZ.Tt|
00000210  d0 ad e4 92 b1 99 5d 7a  62 47 b9 54 8f 9e 15 ca  |......]zbG.T....|
00000220  65 09 9e d0 d3 61 51 93  88 4a 46 1e 5c 15 07 ef  |e....aQ..JF.\...|
00000230  b0 92 fa a7 e7 3d e5 36  20 67 d2 24 b7 59 ae f4  |.....=.6 g.$.Y..|
00000240  7c 26 57 90 e1 69 b5 f3  b4 1b 8e e6 07 2e 46 84  ||&W..i........F.|
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 5.0224e-05 s, 20.4 MB/s

Looking at one sector after the error returns similar results.

So, I don't know about you, but the above seems pretty much like data
to me (although it could also be parity). So I have two questions:

1) Can I simply skip over these sectors (using dd_rescue or multiple
dd invocations) when off-line copying the old disk to the new one,
trusting the RAID5 to reconstruct the data correctly from the other 2
disks? Or is it better to simply do the recovery the "traditional" way
(i.e., "fail" the old disk, "add" the new one, and run the risk of a
possible bad sector on one of the two remaining old disks ruining the
show completely and forcing me to recover from backups [I *do* have
up-to-date backups of this array])?

2) Is there a formula, a program or anything that can tell me exactly
what is located at the above sector (i.e., whether it's RAID parity or
a data sector)?

Thanks,
--
Durval Menezes.
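[Editor's note: the unit conversions in the message above are easy to get wrong, so here is the arithmetic spelled out. SMART reports the LBA in 512-byte sectors, while the dd spot checks above use bs=1024, so the LBA must be halved.]

```shell
#!/bin/sh
LBA=2930273794            # 512-byte sector from the SMART self-test log

BLOCK=$((LBA / 2))        # the corresponding 1024-byte dd block (skip= value)
BYTES=$((LBA * 512))      # byte offset from the start of the disk

BEFORE=$((BLOCK - 1))     # 1024-byte block just before the bad one
AFTER=$((BLOCK + 1))      # ... and just after, for the spot checks above

echo "bad 1024-byte block: $BLOCK (byte offset $BYTES)"
echo "spot-check blocks:   $BEFORE and $AFTER"
```

With a 1,500,301,910,016-byte disk, the computed byte offset lands under 2MB from the end of the media, consistent with the self-test reporting an error "at the very end of the disk".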
* Re: Maximizing failed disk replacement on a RAID5 array 2011-06-08 6:58 ` Durval Menezes @ 2011-06-08 7:32 ` Brad Campbell 2011-06-08 7:47 ` Durval Menezes 0 siblings, 1 reply; 17+ messages in thread From: Brad Campbell @ 2011-06-08 7:32 UTC (permalink / raw) To: Durval Menezes; +Cc: linux-raid, Drew On 08/06/11 14:58, Durval Menezes wrote: > 1) can I simply skip over these sectors (using dd_rescue or multiple > dd invocations) when off-line copying the old disk to the new one, > trusting the RAID5 to reconstruct the data correctly from the other 2 Noooooooooooo. As we stated early on, if you do that md will have no idea that the missing data is actually missing, as the drive won't return a read error. Does a repair take long on your machine? I find that a few repair runs generally get me enough re-writes to clear the dud sectors and allow an offline clone. If your dd of the old disk to the new disk aborts with an error, do _not_ under any circumstances (well, unless you have really good backups) do a dd_rescue and just swap the disks. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Maximizing failed disk replacement on a RAID5 array 2011-06-08 7:32 ` Brad Campbell @ 2011-06-08 7:47 ` Durval Menezes 2011-06-08 7:57 ` Brad Campbell 0 siblings, 1 reply; 17+ messages in thread From: Durval Menezes @ 2011-06-08 7:47 UTC (permalink / raw) To: Brad Campbell; +Cc: linux-raid, Drew Hello Brad, On Wed, Jun 8, 2011 at 4:32 AM, Brad Campbell <brad@fnarfbargle.com> wrote: > On 08/06/11 14:58, Durval Menezes wrote: > >> 1) can I simply skip over these sectors (using dd_rescue or multiple >> dd invocations) when off-line copying the old disk to the new one, >> trusting the RAID5 to reconstruct the data correctly from the other 2 > > Noooooooooooo. As we stated early on, if you do that md will have no idea > that the missing data is actually missing, as the drive won't return a read > error. Even if a "repair" (echo "repair" >/sys/block/md1/md/sync_action, checking progress with "cat /proc/mdstat" and completion with "tail -f /var/log/messages | grep md" ) finishes with no errors? > Does a repair take long on your machine? I find that a few repair runs > generally get me enough re-writes to clear the dud sectors and allow an > offline clone. I'm sorry if I did not make myself clear; I've already run both a "repair" on the RAID (see above) and a "smartctl -t long" on the particular disk... I had about 40 bad sectors before, and now have just 4, but these 4 sectors persist as being marked in error... I think the "RAID repair" didn't touch them. Cheers, -- Durval. ^ permalink raw reply [flat|nested] 17+ messages in thread
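For reference, the repair/check cycle described above runs through the md sysfs interface (the file is `sync_action`). This sketch only prints the commands rather than executing them, and `md1` is a placeholder for your array:

```shell
#!/bin/sh
# Dry-run sketch of one md "repair" pass plus the monitoring steps
# mentioned above.  MD is a hypothetical placeholder; drop the echoes
# (pipe the output into `sh`) to actually run it.
MD=md1
repair_steps() {
    echo "echo repair > /sys/block/$MD/md/sync_action"
    echo "cat /proc/mdstat"                      # progress bar while resyncing
    echo "cat /sys/block/$MD/md/sync_action"     # reads 'idle' once finished
    echo "grep md /var/log/messages | tail"      # any read errors md hit and fixed
}
repair_steps
```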
* Re: Maximizing failed disk replacement on a RAID5 array 2011-06-08 7:47 ` Durval Menezes @ 2011-06-08 7:57 ` Brad Campbell [not found] ` <BANLkTi=BuXK4SBGR=FrEcHFC1WohNkUY7g@mail.gmail.com> 2011-06-13 5:56 ` Durval Menezes 0 siblings, 2 replies; 17+ messages in thread From: Brad Campbell @ 2011-06-08 7:57 UTC (permalink / raw) To: Durval Menezes; +Cc: Brad Campbell, linux-raid, Drew On 08/06/11 15:47, Durval Menezes wrote: > I'm sorry if I did not make myself clear; I've already run both a > "repair" on the RAID (see above) and a "smartctl -t long" on the > particular disk... I had about 40 bad sectors before, and now have > just 4, but these 4 sectors persist as being marked in error... I > think the "RAID repair" didn't touch them. Apologies, I obviously missed that fact. I think your best course of action in this case is to test both the other drives with SMART long checks and fail/replace the faulty one. I've never had md not report a repaired sector when performing a repair operation. I'll just re-iterate: if you take the bad sectors away without a good copy of the data on them, md won't know it is supposed to reconstruct those missing sectors. Hrm.. *or*, and this is a big *or*: you could use hdparm to create correctable bad sectors on the copy at the appropriate LBAs, and md should do the right thing, as it will get read errors from those, which will go away when they are re-written. I'd not thought of that before, but it should do the trick. ^ permalink raw reply [flat|nested] 17+ messages in thread
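Brad's hdparm idea could be sketched as below. Everything here is illustrative: the clone device name and the LBA list are hypothetical placeholders (the real LBAs would come from smartctl's self-test log or dmesg), `--make-bad-sector` genuinely destroys the contents of the sector it is pointed at, and the `--yes-i-know-what-i-am-doing` guard flag should be checked against your hdparm version's man page. The script therefore only prints the commands (dry run):

```shell
#!/bin/sh
# Dry-run sketch: after cloning the failing disk sector-for-sector, punch
# correctable bad sectors into the CLONE at the same LBAs, so that md gets
# a read error there and reconstructs those blocks from parity on the
# next read/repair.  CLONE and BAD_LBAS are hypothetical placeholders.
CLONE=/dev/sdX
BAD_LBAS="12345678 12345679 23456780 23456781"

make_bad_sector_cmds() {
    for lba in $BAD_LBAS; do
        # Printed, not executed: this command destroys the sector's contents.
        echo "hdparm --yes-i-know-what-i-am-doing --make-bad-sector $lba $CLONE"
    done
}

make_bad_sector_cmds     # pipe into `sh` only once you are certain
```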
[parent not found: <BANLkTi=BuXK4SBGR=FrEcHFC1WohNkUY7g@mail.gmail.com>]
[parent not found: <4DEF7775.5020407@fnarfbargle.com>]
[parent not found: <BANLkTin8dpbxWfSCG_VoOM_FMmqCkm2mJg@mail.gmail.com>]
* Re: Maximizing failed disk replacement on a RAID5 array [not found] ` <BANLkTin8dpbxWfSCG_VoOM_FMmqCkm2mJg@mail.gmail.com> @ 2011-06-13 5:32 ` Durval Menezes 0 siblings, 0 replies; 17+ messages in thread From: Durval Menezes @ 2011-06-13 5:32 UTC (permalink / raw) To: Linux RAID Hello Folks, On Wed, Jun 8, 2011 at 10:21 AM, Brad Campbell <lists2009@fnarfbargle.com> wrote: > > Best of luck, and let us know how you get on. Just finished the process here. To summarize, seems I've got my array back in a stable state. What I did: 1) Got a good backup of all the data in the array (using "tar") to removable HDs, verified it (using md5sum), and then stored these HDs safely offline; 2) Unmounted the filesystem in the array; 3) inserted the replacement disk on a USB dock, partitioned it, then added it to the array ("mdadm --add"); -> Verified (via "mdadm --detail") that the replacement disk was listed on the array as a "spare"; 4) failed the bad disk in the array ("mdadm --fail") -> At that point, the array immediately started to resync onto the replacement disk; 5) Monitored the resync process via "cat /proc/mdstat": it took roughly 11 hours (I guess because transfer speed to the replacement disk was limited by the USB ~40MB/s speed limit), but it signaled no errors; 6) Verified that the array was really synced ("mdadm --detail") and that there were indeed no errors during the resync ("less /var/log/messages"); 7) removed the bad disk logically from the array ("mdadm --remove"); 8) shut down the machine ("init 0"); 9) removed the bad disk physically from the machine, ejected the replacement disk from the USB dock, and then installed the replacement disk inside the machine; 10) turned the system on: the OS booted, assembled the array and mounted the filesystem in it with no issues; 11) checked (using "md5sum -c" on the md5sum files generated during pass #1 above) that all the data ON THE ARRAY was indeed correct, so in the end I didn't need to restore anything from backup. 
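The replace-via-spare part of the procedure above condenses to a short command sequence. A hedged sketch, printed as a dry run — `/dev/md1`, `/dev/sdc1` (replacement on the USB dock) and `/dev/sdb1` (failing member) are hypothetical placeholders, and each step should be confirmed with `mdadm --detail` before moving on:

```shell
#!/bin/sh
# Dry-run sketch of steps 2-7 of the procedure described above.
# All device names are hypothetical placeholders.
MD=/dev/md1
NEW=/dev/sdc1       # replacement disk, e.g. on a USB dock
BAD=/dev/sdb1       # failing member

replace_steps() {
    echo "umount /mnt/array"            # 2) quiesce the filesystem on $MD
    echo "mdadm $MD --add $NEW"         # 3) new disk joins as a spare
    echo "mdadm $MD --fail $BAD"        # 4) resync onto the spare starts now
    echo "cat /proc/mdstat"             # 5) watch the resync run
    echo "mdadm --detail $MD"           # 6) confirm a clean, error-free sync
    echo "mdadm $MD --remove $BAD"      # 7) drop the failed member
}
replace_steps
```

Note the ordering is what gives the safety margin: because the disk is only failed *after* the spare is in place, the resync starts the instant the old disk drops out, minimizing the window in which the array runs degraded.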
Thanks for all the help, folks, and I pray we have the "hot-replace" functionality implemented soon... it will make for much sounder sleep the next time one of my disks fails... :-) Cheers, -- Durval Menezes. -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Maximizing failed disk replacement on a RAID5 array 2011-06-08 7:57 ` Brad Campbell [not found] ` <BANLkTi=BuXK4SBGR=FrEcHFC1WohNkUY7g@mail.gmail.com> @ 2011-06-13 5:56 ` Durval Menezes 1 sibling, 0 replies; 17+ messages in thread From: Durval Menezes @ 2011-06-13 5:56 UTC (permalink / raw) To: Brad Campbell; +Cc: Brad Campbell, linux-raid, Drew Hello Brad, folks, On Wed, Jun 8, 2011 at 4:57 AM, Brad Campbell <lists2009@fnarfbargle.com> wrote: > On 08/06/11 15:47, Durval Menezes wrote: > >> I'm sorry if I did not make myself clear; I've already run both a >> "repair" on the RAID (see above) and a "smartctl -t long" on the >> particular disk... I had about 40 bad sectors before, and now have >> just 4, but these 4 sectors persist as being marked in error... I >> think the "RAID repair" didn't touch them. [...] > I've never had md not report a repaired sector when performing a repair > operation. Just to keep you posted: after I finished replacing that failing disk, I wrote zeroes to all its sectors (dd bs=16065b </dev/zero >/dev/DISK) and then checked SMART again. Guess what? As expected, the 4 sectors that were being reported in both "Current_Pending_Sector" and "Offline_Uncorrectable" just went away... both counters now report a big round "0". So, it seems my hypothesis that these sectors were not being touched by the md "repair" was indeed correct: once they were written to, they ended up being automatically remapped (or at least cleaned out of the above counters) by the HD firmware. Also, as the array resync reported no errors whatsoever, it seems these sectors were simply not being used by md. Does anyone have a good explanation for that? Inquiring minds want to know... :-) Cheers, -- Durval Menezes. ^ permalink raw reply [flat|nested] 17+ messages in thread
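The before/after SMART comparison described above is easy to script: this sketch pulls the raw values of the two attributes out of `smartctl -A` output (field layout as smartmontools prints it — attribute name in field 2, raw value in the last field). It reads stdin, so it is shown here against a canned two-line sample instead of a real drive:

```shell
#!/bin/sh
# Extract the raw Current_Pending_Sector / Offline_Uncorrectable counts
# from `smartctl -A /dev/sdX` output.  Reads stdin so it can be exercised
# on a canned sample; field positions follow smartmontools' attribute table.
pending_counts() {
    awk '/Current_Pending_Sector|Offline_Uncorrectable/ { print $2 "=" $NF }'
}

# Canned sample lines standing in for real smartctl output:
printf '%s\n%s\n' \
  '197 Current_Pending_Sector  0x0012 100 100 000 Old_age Always - 4' \
  '198 Offline_Uncorrectable   0x0010 100 100 000 Old_age Offline - 4' |
pending_counts
# prints:
#   Current_Pending_Sector=4
#   Offline_Uncorrectable=4
```

Running `smartctl -A /dev/sdX | pending_counts` before and after the zero-fill shows the counters dropping to 0, as reported above.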
* Re: Maximizing failed disk replacement on a RAID5 array 2011-06-06 18:06 ` Durval Menezes 2011-06-07 5:03 ` Durval Menezes @ 2011-06-07 8:52 ` John Robinson 2011-06-10 10:25 ` John Robinson 1 sibling, 1 reply; 17+ messages in thread From: John Robinson @ 2011-06-07 8:52 UTC (permalink / raw) To: Durval Menezes; +Cc: linux-raid, Brad Campbell, Drew On 06/06/2011 19:06, Durval Menezes wrote: [...] > It would be great to have a > "duplicate-this-bad-old-disk-into-this-shiny-new-disk" functionality, > as it would enable an almost-no-downtime disk replacement with > minimum risk, but it seems we can't have everything... :-0 Maybe it's > something for the wishlist? It's already on the wishlist, described as a hot replace. Cheers, John. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Maximizing failed disk replacement on a RAID5 array 2011-06-07 8:52 ` John Robinson @ 2011-06-10 10:25 ` John Robinson 2011-06-11 22:35 ` Durval Menezes 0 siblings, 1 reply; 17+ messages in thread From: John Robinson @ 2011-06-10 10:25 UTC (permalink / raw) To: Linux RAID On 07/06/2011 09:52, John Robinson wrote: > On 06/06/2011 19:06, Durval Menezes wrote: > [...] >> It would be great to have a >> "duplicate-this-bad-old-disk-into-this-shiny-new-disk" functionality, >> as it would enable an almost-no-downtime disk replacement with >> minimum risk, but it seems we can't have everything... :-0 Maybe it's >> something for the wishlist? > > It's already on the wishlist, described as a hot replace. Actually I've been thinking about this. I think I'd rather the hot replace functionality did a normal rebuild from the still-good drives, and only if it came across a read error from those would it attempt to refer to the contents of the known-to-be-failing drive (and then also attempt to repair the read error on the supposedly-still-good drive that gave a read error, as already happens). My rationale for this is as follows: if we want to hot-replace a drive that's known to be failing, we should trust it less than the remaining still-good drives, and treat it with kid gloves. It may be suffering from bit-rot. We'd rather not hit all the bad sectors on the failing drive, because each time we do that we send the drive into 7 seconds (or more, for cheap drives without TLER) of re-reading, plus any Linux-level re-reading there might be. Further, making the known-to-be-failing drive work extra hard (doing the equivalent of dd'ing from it while also still using it to serve its contents as an array member) might make it die completely before we've finished. What will this do for rebuild time? Well, I don't think it'll be any slower. 
On the one hand, you'd think that copying from one drive to another would be faster than a rebuild, because you're only reading 1 drive instead of N-1, but on the other, your array is going to run slowly (pretty much degraded speed) anyway because you're keeping one drive in constant use reading from it, and you risk it becoming much, much slower if you do run into hundreds or thousands of read errors on the failing drive. So overall I think hot-replace should be a normal replace with a possible second source of data/parity. Thoughts? Yes, I know, -ENOPATCH Cheers, John. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Maximizing failed disk replacement on a RAID5 array 2011-06-10 10:25 ` John Robinson @ 2011-06-11 22:35 ` Durval Menezes 0 siblings, 0 replies; 17+ messages in thread From: Durval Menezes @ 2011-06-11 22:35 UTC (permalink / raw) To: John Robinson; +Cc: Linux RAID Hello John, On Fri, Jun 10, 2011 at 7:25 AM, John Robinson <john.robinson@anonymous.org.uk> wrote: > On 07/06/2011 09:52, John Robinson wrote: >> >> On 06/06/2011 19:06, Durval Menezes wrote: >> [...] >>> >>> It would be great to have a >>> "duplicate-this-bad-old-disk-into-this-shiny-new-disk" functionality, >>> as it would enable an almost-no-downtime disk replacement with >>> minimum risk, but it seems we can't have everything... :-0 Maybe it's >>> something for the wishlist? >> >> It's already on the wishlist, described as a hot replace. > > Actually I've been thinking about this. I think I'd rather the hot replace > functionality did a normal rebuild from the still-good drives, and only if > it came across a read error from those would it attempt to refer to the > contents of the known-to-be-failing drive (and then also attempt to repair > the read error on the supposedly-still-good drive that gave a read error, as > already happens). This looks like a very good idea. The old (failing) drive would be kept "on reserve", ready to be accessed for eventual failed sectors on the other old (good) drives... > My rationale for this is as follows: if we want to hot-replace a drive > that's known to be failing, we should trust it less than the remaining > still-good drives, and treat it with kid gloves. It may be suffering from > bit-rot. We'd rather not hit all the bad sectors on the failing drive, > because each time we do that we send the drive into 7 seconds (or more, for > cheap drives without TLER) of re-reading, plus any Linux-level re-reading > there might be. 
Further, making the known-to-be-failing drive work extra > hard (doing the equivalent of dd'ing from it while also still using it to > serve its contents as an array member) might make it die completely before > we've finished. I agree completely. > What will this do for rebuild time? Well, I don't think it'll be any slower. I think it will actually be faster. > On the one hand, you'd think that copying from one drive to another would be > faster than a rebuild, because you're only reading 1 drive instead of N-1, > but on the other, your array is going to run slowly (pretty much degraded > speed) anyway because you're keeping one drive in constant use reading from > it, and you risk it becoming much, much slower if you do run into hundreds > or thousands of read errors on the failing drive. > > So overall I think hot-replace should be a normal replace with a possible > second source of data/parity. Your reasoning sounds good to me. > Thoughts? Only sadness that it's not implemented yet... :-) > Yes, I know, -ENOPATCH Exactly :-) Cheers, -- Durval Menezes. > > Cheers, > > John. ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2011-06-13 5:56 UTC | newest] Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <BANLkTimBYFhjQ-sC9DhTMO+PG-Ox+A9S2Q@mail.gmail.com> 2011-06-05 14:22 ` Fwd: Maximizing failed disk replacement on a RAID5 array Durval Menezes 2011-06-06 15:02 ` Drew 2011-06-06 15:20 ` Brad Campbell 2011-06-06 15:37 ` Drew 2011-06-06 15:54 ` Brad Campbell 2011-06-06 18:06 ` Durval Menezes 2011-06-07 5:03 ` Durval Menezes 2011-06-07 5:35 ` Brad Campbell 2011-06-08 6:58 ` Durval Menezes 2011-06-08 7:32 ` Brad Campbell 2011-06-08 7:47 ` Durval Menezes 2011-06-08 7:57 ` Brad Campbell [not found] ` <BANLkTi=BuXK4SBGR=FrEcHFC1WohNkUY7g@mail.gmail.com> [not found] ` <4DEF7775.5020407@fnarfbargle.com> [not found] ` <BANLkTin8dpbxWfSCG_VoOM_FMmqCkm2mJg@mail.gmail.com> 2011-06-13 5:32 ` Durval Menezes 2011-06-13 5:56 ` Durval Menezes 2011-06-07 8:52 ` John Robinson 2011-06-10 10:25 ` John Robinson 2011-06-11 22:35 ` Durval Menezes