From: Marc MERLIN <marc@merlins.org>
To: Chris Murphy <lists@colorremedies.com>
Cc: Btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: How to handle a RAID5 arrawy with a failing drive? -> raid5 mostly works, just no rebuilds
Date: Wed, 19 Mar 2014 08:40:31 -0700	[thread overview]
Message-ID: <20140319154031.GP6143@merlins.org> (raw)
In-Reply-To: <CD1D8CED-8BC2-4C16-AE9E-0C47DC954CFA@colorremedies.com>

On Wed, Mar 19, 2014 at 12:32:55AM -0600, Chris Murphy wrote:
> 
> On Mar 19, 2014, at 12:09 AM, Marc MERLIN <marc@merlins.org> wrote:
> > 
> > 7) you can remove a drive from an array, add files, and then if you plug
> >   the drive back in, it apparently gets sucked back into the array.
> > No rebuild happens; you now have an inconsistent array where one drive
> > is not at the same level as the others (I lost all the files I had added
> > after the drive was removed when I added the drive back).
> 
> Seems worthy of a dedicated bug report and keeping an eye on in the future, not good.
 
Since it's not supposed to be working, I didn't file a bug, but I figured
it'd be good for people to know about it in the meantime.

> >> polgara:/mnt/btrfs_backupcopy# btrfs device add -f /dev/mapper/crypt_sdm1 /mnt/btrfs_backupcopy/
> >> polgara:/mnt/btrfs_backupcopy# df -h .
> >> Filesystem              Size  Used Avail Use% Mounted on
> >> /dev/mapper/crypt_sdb1  4.6T  3.0M  4.6T   1% /mnt/btrfs_backupcopy
> > 
> > Oh look it's bigger now. We need to manual rebalance to use the new drive:
> 
> You don't have to. As soon as you add the additional drive, newly allocated chunks will stripe across all available drives. e.g. 1 GB allocations striped across 3x drives, if I add a 4th drive, initially any additional writes are only to the first three drives but once a new data chunk is allocated it gets striped across 4 drives.
 
That's the thing, though. If the bad device hadn't been forcibly removed
(and apparently the only way to do that was to unmount, make the device
node disappear, and remount in degraded mode), it looked to me like btrfs
was still considering the drive part of the array and trying to write to
it.
After adding a drive, I couldn't quite tell whether it was striping over 11
drives or 10, but it felt like, at least at times, it was striping over 11
drives with write failures on the missing drive.
I can't prove it, but I think the new data I was writing was being striped
in degraded mode.
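For what it's worth, one way to check where new writes are actually landing is to compare per-device allocations before and after writing test data. This is only a sketch (the mount point is the one from this thread, and `btrfs device usage` needs a newer btrfs-progs than was current in 2014); these are admin commands that need root and a real btrfs filesystem:

```shell
# Per-device allocated bytes; run before and after writing test data.
btrfs filesystem show /mnt/btrfs_backupcopy

# Newer btrfs-progs can also break usage down by chunk type per device:
btrfs device usage /mnt/btrfs_backupcopy

# If a supposedly-gone device still shows growing allocations, or the
# kernel log shows write errors against it, btrfs is still striping to it.
dmesg | grep -i btrfs
```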

> Sure the whole thing isn't corrupt. But if anything written while degraded vanishes once the missing device is reattached, and you remount normally (non-degraded), that's data loss. Yikes!

Yes, although it's limited: you apparently only lose the new data that was
added after you went into degraded mode, and only if you then add another
drive and write more data.
In real life this shouldn't be too common, even if it is indeed a bug.
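For readers following along, the replace-a-failed-drive sequence being discussed looks roughly like this. It is a sketch only: the device names and mount point are the ones used earlier in the thread, it needs root, and this thread is precisely about the rebuild step not working on raid5 at the time (on later kernels `btrfs replace` is usually the preferred route):

```shell
# Remount degraded so the array comes up without the failed device.
umount /mnt/btrfs_backupcopy
mount -o degraded /dev/mapper/crypt_sdb1 /mnt/btrfs_backupcopy

# Add a replacement device, then drop the one that is no longer present.
btrfs device add -f /dev/mapper/crypt_sdm1 /mnt/btrfs_backupcopy
btrfs device delete missing /mnt/btrfs_backupcopy

# Rebalance so existing chunks are re-striped across all current devices.
btrfs balance start /mnt/btrfs_backupcopy
```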

Cheers,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  


Thread overview: 28+ messages
2014-03-16 15:23 [PATCH] Btrfs: fix incremental send's decision to delay a dir move/rename Filipe David Borba Manana
2014-03-16 17:09 ` [PATCH v2] " Filipe David Borba Manana
2014-03-16 20:37 ` [PATCH v3] " Filipe David Borba Manana
2014-03-16 22:20   ` How to handle a RAID5 arrawy with a failing drive? Marc MERLIN
2014-03-16 22:55     ` Chris Murphy
2014-03-16 23:12       ` Chris Murphy
2014-03-16 23:17         ` Marc MERLIN
2014-03-16 23:23           ` Chris Murphy
2014-03-17  0:51             ` Marc MERLIN
2014-03-17  1:06               ` Chris Murphy
2014-03-17  1:17                 ` Marc MERLIN
2014-03-17  2:56                   ` Chris Murphy
2014-03-17  3:44                     ` Marc MERLIN
2014-03-17  5:12                       ` Chris Murphy
2014-03-17 16:13                         ` Marc MERLIN
2014-03-17 17:38                           ` Chris Murphy
2014-03-16 23:40           ` ronnie sahlberg
2014-03-16 23:20         ` Chris Murphy
2014-03-18  9:02     ` Duncan
2014-03-19  6:09       ` How to handle a RAID5 arrawy with a failing drive? -> raid5 mostly works, just no rebuilds Marc MERLIN
2014-03-19  6:32         ` Chris Murphy
2014-03-19 15:40           ` Marc MERLIN [this message]
2014-03-19 16:53             ` Chris Murphy
2014-03-19 22:40               ` Marc MERLIN
     [not found]                 ` <CAGwxe4jL+L571MtEmeHnTnHQSD7h+2ApfWqycgV-ymXhfMR-JA@mail.gmail.com>
2014-03-20  0:46                   ` Marc MERLIN
2014-03-20  7:37                     ` Tobias Holst
2014-03-23 19:22               ` Marc MERLIN
2014-03-20  7:37             ` Duncan
