From: Thomas Fjellstrom <thomas@fjellstrom.ca>
To: NeilBrown <neilb@suse.de>
Cc: linux-raid@vger.kernel.org
Subject: Re: potentially lost largeish raid5 array..
Date: Fri, 23 Sep 2011 02:09:36 -0600	[thread overview]
Message-ID: <201109230209.36209.thomas@fjellstrom.ca> (raw)
In-Reply-To: <201109222322.57040.tfjellstrom@shaw.ca>

On September 22, 2011, Thomas Fjellstrom wrote:
> On September 22, 2011, NeilBrown wrote:
> > On Thu, 22 Sep 2011 22:49:12 -0600 Thomas Fjellstrom
> > <tfjellstrom@shaw.ca>
> > 
> > wrote:
> > > On September 22, 2011, NeilBrown wrote:
> > > > On Thu, 22 Sep 2011 19:50:36 -0600 Thomas Fjellstrom
> > > > <tfjellstrom@shaw.ca>
> > > > 
> > > > wrote:
> > > > > Hi,
> > > > > 
> > > > > I've been struggling with a SAS card recently that has had poor
> > > > > driver support for a long time, and tonight its decided to kick
> > > > > every drive in the array one after the other. Now mdstat shows:
> > > > > 
> > > > > md1 : active raid5 sdf[0](F) sdh[7](F) sdi[6](F) sdj[5](F)
> > > > > sde[3](F) sdd[2](F) sdg[1](F)
> > > > >       5860574208 blocks super 1.1 level 5, 512k chunk, algorithm 2
> > > > >       [7/0] [_______]
> > > > >       bitmap: 3/8 pages [12KB], 65536KB chunk
> > > > > 
> > > > > Does the fact that I'm using a bitmap save my rear here? Or am I
> > > > > hosed? If I'm not hosed, is there a way I can recover the array
> > > > > without rebooting? maybe just a --stop and a --assemble ? If that
> > > > > won't work, will a reboot be ok?
> > > > > 
> > > > > I'd really prefer not to have lost all of my data. Please tell me
> > > > > (please) that it is possible to recover the array. All but sdi are
> > > > > still visible in /dev (I may be able to get it back via hotplug,
> > > > > but it would come back as sdk or something).
> > > > 
> > > > mdadm --stop /dev/md1
> > > > 
> > > > mdadm --examine /dev/sd[fhijedg]
> > > > mdadm --assemble --verbose /dev/md1 /dev/sd[fhijedg]
> > > > 
> > > > Report all output.
> > > > 
> > > > NeilBrown
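[Editorial note: the three commands above form a complete stop/inspect/reassemble sequence. A sketch only; the device names are specific to this thread, so substitute your own array and members.]

```shell
#!/bin/sh
# Recovery sketch based on the advice above. Device names are taken
# from this thread and will differ on other systems. Run as root.
set -e

# 1. Stop the failed array so member superblocks can be re-read.
mdadm --stop /dev/md1

# 2. Examine each member; compare the "Events" counters to spot the
#    drive that fell out of the array first.
mdadm --examine /dev/sd[fhijedg]

# 3. Reassemble from the surviving members. --verbose prints which
#    slot each device lands in, which is worth keeping for the record.
mdadm --assemble --verbose /dev/md1 /dev/sd[fhijedg]
```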
> > > 
> > > Hi, thanks for the help. It seems the SAS card/driver is in a funky
> > > state at the moment. The --stop worked, but --examine just gives "no md
> > > superblock detected", and dmesg reports I/O errors for all drives.
> > 
> > > I've just reloaded the driver, and things seem to have come back:
> > That's good!!
> > 
> > > root@boris:~# mdadm --examine /dev/sd[fhijedg]
> > 
> > ....
> > 
> > sdi has a slightly older event count than the others - its Update time
> > is 1:13 earlier.  So it presumably died first.
> > 
> > > root@boris:~# mdadm --assemble --verbose /dev/md1 /dev/sd[fhijedg]
> > > mdadm: looking for devices for /dev/md1
> > > mdadm: /dev/sdd is identified as a member of /dev/md1, slot 2.
> > > mdadm: /dev/sde is identified as a member of /dev/md1, slot 3.
> > > mdadm: /dev/sdf is identified as a member of /dev/md1, slot 0.
> > > mdadm: /dev/sdg is identified as a member of /dev/md1, slot 1.
> > > mdadm: /dev/sdh is identified as a member of /dev/md1, slot 6.
> > > mdadm: /dev/sdi is identified as a member of /dev/md1, slot 5.
> > > mdadm: /dev/sdj is identified as a member of /dev/md1, slot 4.
> > > mdadm: added /dev/sdg to /dev/md1 as 1
> > > mdadm: added /dev/sdd to /dev/md1 as 2
> > > mdadm: added /dev/sde to /dev/md1 as 3
> > > mdadm: added /dev/sdj to /dev/md1 as 4
> > > mdadm: added /dev/sdi to /dev/md1 as 5
> > > mdadm: added /dev/sdh to /dev/md1 as 6
> > > mdadm: added /dev/sdf to /dev/md1 as 0
> > > mdadm: /dev/md1 has been started with 6 drives (out of 7).
> > > 
> > > 
> > > Now I guess the question is, how to get that last drive back in? would:
> > > 
> > > mdadm --re-add /dev/md1 /dev/sdi
> > > 
> > > work?
> > 
> > re-add should work, yes.  It will use the bitmap info to only update the
> > blocks that need updating - presumably not many.
> > It might be interesting to run
> > 
> >   mdadm -X /dev/sdf
> > 
> > first to see what the bitmap looks like - how many dirty bits and what
> > the event counts are.
> 
> root@boris:~# mdadm -X /dev/sdf
>         Filename : /dev/sdf
>            Magic : 6d746962
>          Version : 4
>             UUID : 7d0e9847:ec3a4a46:32b60a80:06d0ee1c
>           Events : 1241766
>   Events Cleared : 1241740
>            State : OK
>        Chunksize : 64 MB
>           Daemon : 5s flush period
>       Write Mode : Normal
>        Sync Size : 976762368 (931.51 GiB 1000.20 GB)
>           Bitmap : 14905 bits (chunks), 18 dirty (0.1%)
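[Editorial note: the -X output above is why the re-add is cheap: only 18 of 14905 bitmap chunks, at 64 MB each, are dirty. A quick back-of-the-envelope check of those numbers:]

```python
# Figures copied from the mdadm -X output above.
CHUNK_MB = 64          # "Chunksize : 64 MB" (the bitmap chunk, not the RAID chunk)
TOTAL_CHUNKS = 14905   # "Bitmap : 14905 bits (chunks)"
DIRTY_CHUNKS = 18      # "... 18 dirty (0.1%)"

dirty_mb = DIRTY_CHUNKS * CHUNK_MB    # data the re-add must resync
total_mb = TOTAL_CHUNKS * CHUNK_MB    # whole member device
pct = 100.0 * DIRTY_CHUNKS / TOTAL_CHUNKS

print(f"resync {dirty_mb} MB of {total_mb} MB ({pct:.2f}%)")
# resync 1152 MB of 953920 MB (0.12%)
```

So roughly 1.1 GiB gets rewritten instead of a full ~931 GiB rebuild.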
> 
> > But yes: --re-add should make it all happy.
> 
> Very nice. I was quite upset there for a bit. Had to take a walk ;D
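[Editorial note: the re-add step discussed above can be sketched as follows. This assumes the dropped drive reappears under its original name; after a hotplug it may come back as /dev/sdk or similar, so use whatever name the kernel assigns.]

```shell
#!/bin/sh
# Sketch only: re-add the dropped member and watch the bitmap-based
# recovery. Device names are assumptions taken from this thread.
mdadm /dev/md1 --re-add /dev/sdi

# With only a handful of dirty bitmap chunks, recovery should be
# quick; these commands show the progress and the final state.
cat /proc/mdstat
mdadm --detail /dev/md1
```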

I forgot to say, but: Thank you very much :) for the help, and your tireless 
work on md.

> > NeilBrown


-- 
Thomas Fjellstrom
thomas@fjellstrom.ca


Thread overview: 46+ messages
2011-09-23  1:50 potentially lost largeish raid5 array Thomas Fjellstrom
2011-09-23  4:32 ` NeilBrown
2011-09-23  4:49   ` Thomas Fjellstrom
2011-09-23  4:58     ` Roman Mamedov
2011-09-23  5:10       ` Thomas Fjellstrom
2011-09-23  7:06         ` David Brown
2011-09-23  7:37           ` Thomas Fjellstrom
2011-09-23 12:56         ` Stan Hoeppner
2011-09-23 13:28           ` David Brown
2011-09-23 16:22           ` Thomas Fjellstrom
2011-09-23 23:24             ` Stan Hoeppner
2011-09-24  0:11               ` Thomas Fjellstrom
2011-09-24 12:17                 ` Stan Hoeppner
2011-09-24 13:11                   ` (unknown) Tomáš Dulík
2011-09-24 15:16                   ` potentially lost largeish raid5 array David Brown
2011-09-24 16:38                     ` Stan Hoeppner
2011-09-25 13:03                       ` David Brown
2011-09-25 14:39                         ` Stan Hoeppner
2011-09-25 15:18                           ` David Brown
2011-09-25 23:58                             ` Stan Hoeppner
2011-09-26 10:51                               ` David Brown
2011-09-26 19:52                                 ` Stan Hoeppner
2011-09-26 20:29                                   ` David Brown
2011-09-26 23:28                                   ` Krzysztof Adamski
2011-09-27  3:53                                     ` Stan Hoeppner
2011-09-24 17:48                   ` Thomas Fjellstrom
2011-09-24  5:59             ` Mikael Abrahamsson
2011-09-24 17:53               ` Thomas Fjellstrom
2011-09-25 18:07           ` Robert L Mathews
2011-09-26  6:08             ` Mikael Abrahamsson
2011-09-26  2:26           ` Krzysztof Adamski
2011-09-23  5:11     ` NeilBrown
2011-09-23  5:22       ` Thomas Fjellstrom
2011-09-23  8:09         ` Thomas Fjellstrom [this message]
2011-09-23  9:15           ` NeilBrown
2011-09-23 16:26             ` Thomas Fjellstrom
2011-09-25  9:37               ` NeilBrown
2011-09-24 21:57             ` Aapo Laine
2011-09-25  9:18               ` Kristleifur Daðason
2011-09-25 10:10               ` NeilBrown
2011-10-01 23:21                 ` Aapo Laine
2011-10-02 17:00                   ` Aapo Laine
2011-10-05  2:13                     ` NeilBrown
2011-10-05  2:06                   ` NeilBrown
2011-11-05 12:17                 ` Alexander Lyakas
2011-11-06 21:58                   ` NeilBrown
