* RAID0 partition set to spare
@ 2011-12-30  8:21 Steve Carlson
  2011-12-30 10:58 ` Jes Sorensen
  0 siblings, 1 reply; 9+ messages in thread
From: Steve Carlson @ 2011-12-30  8:21 UTC (permalink / raw)
  To: linux-raid

I'm an enthusiast and don't have any experience managing raids.  I got
here through a series of steps starting out in basic linux help on
IRC, so bear with me if I don't know how to do something and please
try to be verbose/explicit in responses.

I have a Synology DS207 2 drive NAS that I set up in RAID0 using their
automated tools back in 2008.  I didn't manually set the raid up, so I
have no implicit knowledge on how it's all held together.  I had an
electrician over and went through a series of power cycles, and the
disks were in their current state when he had left (Dec 21ish).  MD2
refuses to assemble without adding SDB3 as a spare.

Synology uses a slimmed down embedded BusyBox install.  I have photos
on the disk that aren't backed up, so I'd really prefer to keep the
data intact, but completely understand I took the risk of losing the
data with RAID0.

I didn't want a 20 page email, so hopefully http links with the
pertinent info will be acceptable.

Marble> uname -a
Linux Marble 2.6.24 #1594 Fri Feb 25 19:00:24 CST 2011 ppc GNU/Linux
synology_ppc824x_207

cat /proc/mdstat
http://dl.dropbox.com/u/2776371/raid/cat_proc_mdstat.txt

mdadm --examine
http://dl.dropbox.com/u/2776371/raid/mdadm_examine_all_drives.txt

smartctl -a /dev/sda
http://dl.dropbox.com/u/2776371/raid/smartctl_a_sda.txt

smartctl -a /dev/sdb
http://dl.dropbox.com/u/2776371/raid/smartctl_a_sdb.txt

cat /var/log/messages | grep 'error'
http://dl.dropbox.com/u/2776371/raid/cat_var_log_messages_pipe_grep_error.txt

mdadm --stop /dev/md2 && mdadm --assemble /dev/md2 /dev/sd[ab]3
http://dl.dropbox.com/u/2776371/raid/stop_assemble_md2_dmesg.txt

mdadm --stop /dev/md2 && mdadm --assemble --force /dev/md2 /dev/sd[ab]3
http://dl.dropbox.com/u/2776371/raid/stop_assemble_force_md2_dmesg.txt

dmesg from boot
http://dl.dropbox.com/u/2776371/raid/boot_dmesg.txt

Please let me know if there's any other info I can provide to help
diagnose the issue.

Thanks in advance.


* Re: RAID0 partition set to spare
  2011-12-30  8:21 RAID0 partition set to spare Steve Carlson
@ 2011-12-30 10:58 ` Jes Sorensen
  2011-12-31  3:03   ` Steve Carlson
  0 siblings, 1 reply; 9+ messages in thread
From: Jes Sorensen @ 2011-12-30 10:58 UTC (permalink / raw)
  To: Steve Carlson; +Cc: linux-raid

On 12/30/11 09:21, Steve Carlson wrote:
> I'm an enthusiast and don't have any experience managing raids.  I got
> here through a series of steps starting out in basic linux help on
> IRC, so bear with me if I don't know how to do something and please
> try to be verbose/explicit in responses.
> 
> I have a Synology DS207 2 drive NAS that I set up in RAID0 using their
> automated tools back in 2008.  I didn't manually set the raid up, so I
> have no implicit knowledge on how it's all held together.  I had an
> electrician over and went through a series of power cycles, and the
> disks were in their current state when he had left (Dec 21ish).  MD2
> refuses to assemble without adding SDB3 as a spare.
> 
> Synology uses a slimmed down embedded BusyBox install.  I have photos
> on the disk that aren't backed up, so I'd really prefer to keep the
> data intact, but completely understand I took the risk of losing the
> data with RAID0.
> 
> I didn't want a 20 page email, so hopefully http links with the
> pertinent info will be acceptable.
> 
> Marble> uname -a
> Linux Marble 2.6.24 #1594 Fri Feb 25 19:00:24 CST 2011 ppc GNU/Linux
> synology_ppc824x_207
> 
> cat /proc/mdstat
> http://dl.dropbox.com/u/2776371/raid/cat_proc_mdstat.txt
> 
> mdadm --examine
> http://dl.dropbox.com/u/2776371/raid/mdadm_examine_all_drives.txt
> 
> smartctl -a /dev/sda
> http://dl.dropbox.com/u/2776371/raid/smartctl_a_sda.txt
> 
> smartctl -a /dev/sdb
> http://dl.dropbox.com/u/2776371/raid/smartctl_a_sdb.txt
> 
> cat /var/log/messages | grep 'error'
> http://dl.dropbox.com/u/2776371/raid/cat_var_log_messages_pipe_grep_error.txt
> 
> mdadm --stop /dev/md2 && mdadm --assemble /dev/md2 /dev/sd[ab]3
> http://dl.dropbox.com/u/2776371/raid/stop_assemble_md2_dmesg.txt
> 
> mdadm --stop /dev/md2 && mdadm --assemble --force /dev/md2 /dev/sd[ab]3
> http://dl.dropbox.com/u/2776371/raid/stop_assemble_force_md2_dmesg.txt
> 
> dmesg from boot
> http://dl.dropbox.com/u/2776371/raid/boot_dmesg.txt
> 
> Please let me know if there's any other info I can provide to help
> diagnose the issue.

Hi Steve,

Posting the error messages like this is great - certainly works for me.

Looking at your error messages, I am not overly optimistic for you
unfortunately :( It looks to me like your sdb has some bad sectors on it
(even though smart claims it passed), and it therefore is unable to read
the raid metadata on /dev/sdb3 :( Note that the error counts on both
drives are *very* high, at least compared to the drives I have here.

Could you try and run this:

mdadm --stop /dev/md2
dmesg -c (this prints the dmesg log and then clears it)
mdadm --assemble /dev/md2 /dev/sd[ab]3

then post the output from dmesg? I'd like to see if you get read errors
at this exact point.

Unless there is a secondary backup of the metadata elsewhere on the
partition, or you can force/trick mdadm into ignoring the metadata on
sdb3 and rely solely on the info found on sda3 (however I am not sure if
this is possible), then I suspect you are out of luck.

Neil may have some ideas for this.

I presume you tried letting the system shut down and cool off completely
before trying to bring it back up? Just in case it has been running very
hot for a long time?

Cheers,
Jes


* Re: RAID0 partition set to spare
  2011-12-30 10:58 ` Jes Sorensen
@ 2011-12-31  3:03   ` Steve Carlson
  2011-12-31 10:08     ` Jes Sorensen
  0 siblings, 1 reply; 9+ messages in thread
From: Steve Carlson @ 2011-12-31  3:03 UTC (permalink / raw)
  To: Jes Sorensen; +Cc: linux-raid

>
> Hi Steve,
>
> Posting the error messages like this is great - certainly works for me.
>
> Looking at your error messages, I am not overly optimistic for you
> unfortunately :( It looks to me like your sdb has some bad sectors on it
> (even though smart claims it passed), and it therefore is unable to read
> the raid metadata on /dev/sdb3 :( Note that the error counts on both
> drives are *very* high, at least compared to the drives I have here.
>
> Could you try and run this:
>
> mdadm --stop /dev/md2
> dmesg -c (this clears the dmesg log)
> mdadm --assemble /dev/md2 /dev/sd[ab]3
>
> then post the output from dmesg? I'd like to see if you get read errors
> at this exact point.
>
> Unless there is a secondary backup of the metadata elsewhere on the
> partition, or you can force/trick mdadm into ignoring the metadata on
> sdb3 and rely solely on the info found on sda3 (however I am not sure if
> this is possible), then I suspect you are out of luck.
>
> Neil may have some ideas for this.
>
> I presume you tried letting the system shut down and cool off completely
> before trying to bring it back up? Just in case it has been running very
> hot for a long time?
>
> Cheers,
> Jes

Hi Jes,

The system has mostly been sitting powered down while I'm not trying
to fix it.  Both drives are reporting temperatures of 37°C right now.

mdadm --stop /dev/md2 && dmesg -c && mdadm --assemble /dev/md2 /dev/sd[ab]3
http://dl.dropbox.com/u/2776371/raid/mdadm_stop_dmesg_c_mdadm_assemble.txt

I also ran extended self tests just to be thorough, and they both came
back error free.  I'm unsure as to whether that puts them in the clear
for bad sectors though.

smartctl --test=long /dev/sda
http://dl.dropbox.com/u/2776371/raid/smartctl_test_long_sda.txt

smartctl --test=long /dev/sdb
http://dl.dropbox.com/u/2776371/raid/smartctl_test_long_sdb.txt
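
A clean extended self-test is one data point; another is the raw value of the sector-related attributes in `smartctl -A` output. The following is a minimal sketch, assuming smartmontools' usual attribute table layout (name in column 2, raw value in column 10). `sector_attrs` is an illustrative helper name, and the attribute names are an assumption because they vary by drive vendor.

```shell
#!/bin/sh
# Sketch only: sector_attrs is an illustrative helper, not part of smartmontools.
# Reads `smartctl -A` output on stdin and prints "NAME RAW_VALUE" for the
# attributes that commonly track remapped or unreadable sectors.
# The attribute names below are an assumption; vendors use different ones.
sector_attrs() {
    awk '$2 == "Reallocated_Sector_Ct" ||
         $2 == "Current_Pending_Sector" ||
         $2 == "Offline_Uncorrectable" { print $2, $10 }'
}

# Usage (needs root and smartmontools on the box):
#   smartctl -A /dev/sdb | sector_attrs
```

Non-zero raw values here would be consistent with the read errors seen in dmesg even when the overall health check reports a pass.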

-Steve


* Re: RAID0 partition set to spare
  2011-12-31  3:03   ` Steve Carlson
@ 2011-12-31 10:08     ` Jes Sorensen
  2011-12-31 23:37       ` Steve Carlson
  0 siblings, 1 reply; 9+ messages in thread
From: Jes Sorensen @ 2011-12-31 10:08 UTC (permalink / raw)
  To: Steve Carlson; +Cc: linux-raid

On 12/31/11 04:03, Steve Carlson wrote:
> Hi Jes,
> 
> The system has mostly been sitting powered down while I'm not trying
> to fix it.  Both drives reporting 37c temps right now.
> 
> mdadm --stop /dev/md2 && dmesg -c && mdadm --assemble /dev/md2 /dev/sd[ab]3
> http://dl.dropbox.com/u/2776371/raid/mdadm_stop_dmesg_c_mdadm_assemble.txt

Hi Steve,

What I meant here is that I would like to see the dmesg output at this
point, to see if there were any read errors reported during the assembly.
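
The ordering matters here: `dmesg -c` prints the log and then clears it, so it has to run before the assemble, and a plain `dmesg` afterwards shows only what the assembly produced. A minimal sketch of that sequence, assuming a POSIX shell. The `run` wrapper, the `DRY_RUN` switch and the function name are illustrative and not part of mdadm. The member partitions are written out instead of using the `sd[ab]3` glob, which names the same two devices.

```shell
#!/bin/sh
# Sketch only: run, DRY_RUN and capture_assembly_log are illustrative names.
DRY_RUN=${DRY_RUN:-1}    # 1 = only trace the commands, 0 = execute them

run() {
    # Trace the command on stderr; execute it only when DRY_RUN=0.
    echo "+ $*" >&2
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

capture_assembly_log() {
    run mdadm --stop /dev/md2
    run dmesg -c > /dev/null    # discard the old messages and clear the buffer
    run mdadm --assemble /dev/md2 /dev/sda3 /dev/sdb3
    run dmesg                   # only messages logged since the clear
}

capture_assembly_log
```

With `DRY_RUN=0` and stdout redirected to a file, the file would hold just the kernel messages from the assemble attempt, which is the part worth posting.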

> I also ran extended self tests just to be thorough, and they both came
> back error free.  I'm unsure as to whether that puts them in the clear
> for bad sectors though.

There were read errors in the dmesg output you posted earlier, which is
why I am wary of it.

Cheers,
Jes


* Re: RAID0 partition set to spare
  2011-12-31 10:08     ` Jes Sorensen
@ 2011-12-31 23:37       ` Steve Carlson
       [not found]         ` <CALFpzo6tUc1UrvJj9T1cWtgNyVjBHP89qnZPq8r_9Qz-m=eWog@mail.gmail.com>
  2012-01-14 21:42         ` Steve Carlson
  0 siblings, 2 replies; 9+ messages in thread
From: Steve Carlson @ 2011-12-31 23:37 UTC (permalink / raw)
  To: Jes Sorensen

> On Sat, Dec 31, 2011 at 4:08 AM, Jes Sorensen wrote:
> Hi Steve,
>
> What I meant here is that I would like to see the dmesg output at this
> point, to see if there were any read errors reported during the assembly.
>
>> I also ran extended self tests just to be thorough, and they both came
>> back error free.  I'm unsure as to whether that puts them in the clear
>> for bad sectors though.
>
> There were read errors in the dmesg output you posted earlier, which is
> why I am wary of it.
>
> Cheers,
> Jes

Jes,

This is /var/log/messages from boot to stopping and reassembling md2.
I think this is what you wanted, I'm not sure what more I can give
you.
http://dl.dropbox.com/u/2776371/raid/mdadm_stop_dmesg_c_mdadm_assemble_var_log_message.txt

-Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: RAID0 partition set to spare
       [not found]         ` <CALFpzo6tUc1UrvJj9T1cWtgNyVjBHP89qnZPq8r_9Qz-m=eWog@mail.gmail.com>
@ 2012-01-01  3:44           ` Marcus Sorensen
  0 siblings, 0 replies; 9+ messages in thread
From: Marcus Sorensen @ 2012-01-01  3:44 UTC (permalink / raw)
  To: Steve Carlson; +Cc: linux-raid

Looks like adding -f (--force) to your assemble command will clear the faulty
flag. I suppose I hadn't looked hard enough. I'd still recommend
taking a dd image of those partitions before playing too much (maybe pipe to
netcat to another machine if you don't have space locally), if you
care a lot about those photos. See "How to do everything with dd":
http://www.linuxquestions.org/linux/answers/Applications_GUI_Multimedia/How_To_Do_Eveything_With_DD
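
A minimal sketch of the imaging step, assuming there is a destination with enough free space to hold the image. `image_partition` and the example paths are hypothetical names. It copies with `dd` and then compares checksums of source and image, so a read error or a short copy shows up as a non-zero exit status. To send the image to another machine, the `dd` output can be piped into netcat instead of a file; the listening side's options differ between netcat implementations, so that part is left out.

```shell
#!/bin/sh
# Sketch only: image_partition and the paths in the usage note are illustrative.

# Copy a block device (or any file) to an image file and confirm they match.
image_partition() {
    src=$1
    img=$2
    # Plain dd stops at the first read error, which is itself useful to know.
    dd if="$src" of="$img" bs=64k || return 1
    # Re-read both sides and compare checksum and length.
    [ "$(cksum < "$src")" = "$(cksum < "$img")" ]
}

# Usage (needs root; /mnt/backup is a hypothetical mount point):
#   image_partition /dev/sda3 /mnt/backup/sda3.img
#   image_partition /dev/sdb3 /mnt/backup/sdb3.img
```

The verification pass reads each partition a second time, which doubles the load on a drive that may already be marginal; dropping the `cksum` line is a reasonable trade if that is a concern.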

On Sat, Dec 31, 2011 at 8:37 PM, Marcus Sorensen <shadowsor@gmail.com> wrote:
> Do you have spare drive space elsewhere or on another system where you can
> save an image of these disk partitions? Then you can play and revert. I would
> be tempted to wipe the md superblocks and recreate the array with identical
> properties. All of the data is intact unless there's a preexisting drive or
> corruption issue.
>
> In the past I have been doing failure testing with raid1s that I know are
> clean, but one side shows as faulty spare because I temporarily yank a cord
> or something. I have not found a way other than recreating the array with
> --assume-clean to clear the faulty flag without a full rebuild, but if there
> is a way then I suspect it would help you as well. As long as that sdb3 is
> marked faulty I don't think you can assemble as there's no place in raid0 to
> recover from.
>
> On Dec 31, 2011 4:37 PM, "Steve Carlson" <stevengcarlson@gmail.com> wrote:
>>
>> > On Sat, Dec 31, 2011 at 4:08 AM, Jes Sorensen wrote:
>> > Hi Steve,
>> >
>> > What I meant here is that I would like to see the dmesg output at this
>> > point, to see if there were any read errors reported during the
>> > assembly.
>> >
>> >> I also ran extended self tests just to be thorough, and they both came
>> >> back error free.  I'm unsure as to whether that puts them in the clear
>> >> for bad sectors though.
>> >
>> > There were read errors in the dmesg output you posted earlier, which is
>> > why I am wary of it.
>> >
>> > Cheers,
>> > Jes
>>
>> Jes,
>>
>> This is /var/log/messages from boot to stopping and reassembling md2.
>> I think this is what you wanted, I'm not sure what more I can give
>> you.
>>
>> http://dl.dropbox.com/u/2776371/raid/mdadm_stop_dmesg_c_mdadm_assemble_var_log_message.txt
>>
>> -Steve

* Re: RAID0 partition set to spare
  2011-12-31 23:37       ` Steve Carlson
       [not found]         ` <CALFpzo6tUc1UrvJj9T1cWtgNyVjBHP89qnZPq8r_9Qz-m=eWog@mail.gmail.com>
@ 2012-01-14 21:42         ` Steve Carlson
  2012-01-15  5:18           ` NeilBrown
  1 sibling, 1 reply; 9+ messages in thread
From: Steve Carlson @ 2012-01-14 21:42 UTC (permalink / raw)
  To: linux-raid

On Sat, Dec 31, 2011 at 5:37 PM, Steve Carlson <stevengcarlson@gmail.com> wrote:
>> On Sat, Dec 31, 2011 at 4:08 AM, Jes Sorensen wrote:
>> Hi Steve,
>>
>> What I meant here is that I would like to see the dmesg output at this
>> point, to see if there were any read errors reported during the assembly.
>>
>>> I also ran extended self tests just to be thorough, and they both came
>>> back error free.  I'm unsure as to whether that puts them in the clear
>>> for bad sectors though.
>>
>> There were read errors in the dmesg output you posted earlier, which is
>> why I am wary of it.
>>
>> Cheers,
>> Jes
>
> Jes,
>
> This is /var/log/messages from boot to stopping and reassembling md2.
> I think this is what you wanted, I'm not sure what more I can give
> you.
> http://dl.dropbox.com/u/2776371/raid/mdadm_stop_dmesg_c_mdadm_assemble_var_log_message.txt
>
> -Steve



Hi all,

This is still unresolved.  What else could I try short of breaking and
rebuilding the array?

Thanks,
Steve

* Re: RAID0 partition set to spare
  2012-01-14 21:42         ` Steve Carlson
@ 2012-01-15  5:18           ` NeilBrown
  2012-01-16  8:04             ` Steve Carlson
  0 siblings, 1 reply; 9+ messages in thread
From: NeilBrown @ 2012-01-15  5:18 UTC (permalink / raw)
  To: Steve Carlson; +Cc: linux-raid

On Sat, 14 Jan 2012 15:42:58 -0600 Steve Carlson <stevengcarlson@gmail.com>
wrote:

> On Sat, Dec 31, 2011 at 5:37 PM, Steve Carlson <stevengcarlson@gmail.com> wrote:
> >> On Sat, Dec 31, 2011 at 4:08 AM, Jes Sorensen wrote:
> >> Hi Steve,
> >>
> >> What I meant here is that I would like to see the dmesg output at this
> >> point, to see if there were any read errors reported during the assembly.
> >>
> >>> I also ran extended self tests just to be thorough, and they both came
> >>> back error free.  I'm unsure as to whether that puts them in the clear
> >>> for bad sectors though.
> >>
> >> There were read errors in the dmesg output you posted earlier, which is
> >> why I am wary of it.
> >>
> >> Cheers,
> >> Jes
> >
> > Jes,
> >
> > This is /var/log/messages from boot to stopping and reassembling md2.
> > I think this is what you wanted, I'm not sure what more I can give
> > you.
> > http://dl.dropbox.com/u/2776371/raid/mdadm_stop_dmesg_c_mdadm_assemble_var_log_message.txt
> >
> > -Steve
> 
> 
> 
> Hi all,
> 
> This is still unresolved.  What else could I try short of breaking and
> rebuilding the array?

I don't know how the array could have got in this state.  IO errors on a
RAID0 don't mark the devices as faulty.
You would have to explicitly
   mdadm /dev/md2 --fail /dev/sdb3
or something like that - and I doubt you did that.

Anyway:
 mdadm -S /dev/md2
 mdadm -C /dev/md2 -e 0.90 -c 64 -l 0 -n 2 /dev/sda3 /dev/sdb3

should get you going again.  All the data should be there except for anything
that the hard drive has decided to keep for itself.
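
For reference, here is the create command in long-option form, as a sketch. `recreate_cmd` is an illustrative helper that only prints the command, so nothing is written to disk. The values (0.90 metadata, 64K chunk, level 0, two devices) are the ones in the command above. The member devices have to be listed in their original order, because RAID0 stripes across them in that order.

```shell
#!/bin/sh
# Sketch only: recreate_cmd is an illustrative helper, not part of mdadm.
# It prints the long-option form of the create command so each parameter can
# be checked (for example against `mdadm --examine /dev/sda3`) before use.
recreate_cmd() {
    md=$1; metadata=$2; chunk_kib=$3; level=$4
    shift 4
    # The remaining arguments are the member devices, in their original order.
    echo "mdadm --create $md --metadata=$metadata --chunk=$chunk_kib" \
         "--level=$level --raid-devices=$# $*"
}

recreate_cmd /dev/md2 0.90 64 0 /dev/sda3 /dev/sdb3
# prints: mdadm --create /dev/md2 --metadata=0.90 --chunk=64 --level=0 --raid-devices=2 /dev/sda3 /dev/sdb3
```

Creating a RAID0 with the same parameters only rewrites the superblocks and does not resync anything, which is why the existing data is expected to survive.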

NeilBrown


* Re: RAID0 partition set to spare
  2012-01-15  5:18           ` NeilBrown
@ 2012-01-16  8:04             ` Steve Carlson
  0 siblings, 0 replies; 9+ messages in thread
From: Steve Carlson @ 2012-01-16  8:04 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

> I don't know how the array could have got in this state.  IO errors on a
> RAID0 don't mark the devices as faulty.
> You would have to explicitly
>   mdadm /dev/md2 --fail /dev/sdb3
> or something like that - and I doubt you did that.
>
> Anyway:
>  mdadm -S /dev/md2
>  mdadm -C /dev/md2 -e 0.90 -c 64 -l 0 -n 2 /dev/sda3 /dev/sdb3
>
> should get you going again.  All the data should be there except for anything
> that the hard drive has decided to keep for itself.
>
> NeilBrown

This fixed it.  Thanks a bunch Neil and Jes!

-Steve
