From: Roger Willcocks <roger@filmlight.ltd.uk>
To: xfs@oss.sgi.com
Subject: Re: xfs_repair of critical volume
Date: Sun, 31 Oct 2010 16:52:13 +0000
Message-ID: <8C4130A7-53BC-460B-8674-1440B479E67D@filmlight.ltd.uk>
In-Reply-To: <75C248E3-2C99-426E-AE7D-9EC543726796@ucsc.edu>

Don't do anything which has the potential to write to your drives until you have a full bit-for-bit copy of the existing volumes.
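As a rough illustration of what a bit-for-bit copy involves, here is a minimal Python sketch. It is not part of the original advice: the function names `image_copy` and `sha256_of`, the 1 MiB block size, and the zero-fill handling of unreadable regions are all assumptions made for the example. It opens the source read-only, copies it block by block, records the offset of any block that could not be read, and returns a SHA-256 digest of what was written so the image can be verified afterwards.

```python
import hashlib
import os


def image_copy(src_path, dst_path, block_size=1024 * 1024):
    """Copy src to dst block by block.

    Returns (bytes_copied, sha256_hex_of_written_data, bad_offsets).
    All names and defaults here are illustrative assumptions.
    """
    digest = hashlib.sha256()
    bad_offsets = []
    copied = 0
    # The source is opened read-only, so this code cannot write to it.
    with open(src_path, "rb", buffering=0) as src, open(dst_path, "wb") as dst:
        size = src.seek(0, os.SEEK_END)  # total size of the source in bytes
        src.seek(0)
        while copied < size:
            want = min(block_size, size - copied)
            try:
                block = src.read(want)
            except OSError:
                # Unreadable region: note where it was, substitute zeros,
                # and continue from the start of the next block.
                bad_offsets.append(copied)
                block = b"\0" * want
                src.seek(copied + want)
            if not block:
                break  # unexpected end of source
            dst.write(block)
            digest.update(block)
            copied += len(block)
        dst.flush()
        os.fsync(dst.fileno())  # ask the OS to push the image to stable storage
    return copied, digest.hexdigest(), bad_offsets


def sha256_of(path, block_size=1024 * 1024):
    """Re-read a file and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

Comparing `sha256_of(dst_path)` with the digest returned by `image_copy` confirms the image was written intact. If `bad_offsets` is non-empty, the image contains zero-filled gaps at those offsets and the digest describes the image rather than the original media.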

In particular, don't run xfs_repair. This is a hardware issue. It can't be fixed with software.

Now stop and think. There's a good chance a professional data repair outfit can get stuff off your failed drives.

So before you go any further:

* carefully label all the drives, note down their serial numbers, and their positions in the array. You need to do this for the 'failed' drives too.

* speak to your raid vendor. They will have seen this before. 

* try and find out why multiple drives failed on both your main and your backup systems. Was it power related? Temperature? Vibration? Or a bad batch of disks?

* speak to the drive manufacturer. They will have seen this before.

Come back to this list and give us an update. This isn't an xfs problem per se, but there are several people here who work regularly with multi-terabyte arrays.


--
Roger



On 31 Oct 2010, at 07:54, Eli Morris wrote:

> Hi,
> 
> I have a large XFS filesystem (60 TB) that is composed of 5 hardware RAID 6 volumes. One of those volumes had several drives fail in a very short time and we lost that volume. However, four of the volumes seem OK. We are in a worse state because our backup unit failed a week later when four drives simultaneously went offline. So we are in a very bad state. I am able to mount the filesystem that consists of the four remaining volumes. I was thinking about running xfs_repair on the filesystem in hopes it would recover all the files that were not on the bad volume (the files on that volume are obviously gone). Since our backup is gone, I'm very concerned about doing anything to lose the data that we still have. I ran xfs_repair with the -n flag and I have a lengthy file of things that program would do to our filesystem. I don't have the expertise to decipher the output and figure out if xfs_repair would fix the filesystem in a way that would retain our remaining data or if it would, let's say, truncate the filesystem at the data loss boundary (our lost volume was the middle one of the five volumes), returning 2/5 of the filesystem or some other undesirable result. I would post the xfs_repair -n output here, but it is more than a megabyte. I'm hoping one of you xfs gurus will take pity on me and let me send you the output to look at or give me an idea as to what they think xfs_repair is likely to do if I should run it or if anyone has any suggestions as to how to get back as much data as possible in this recovery.
> 
> thanks very much,
> 
> Eli
> 
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs
