From: Fredrik Tolf <fredrik@dolda2000.com>
To: Martin Steigerwald <Martin@lichtvoll.de>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Rebalancing RAID1
Date: Fri, 15 Feb 2013 04:44:43 +0100 (CET)
Message-ID: <alpine.DEB.2.02.1302150435450.8810@shack.dolda2000.com>
In-Reply-To: <201302141544.05747.Martin@lichtvoll.de>

On Thu, 14 Feb 2013, Martin Steigerwald wrote:
> On Wednesday, 13 February 2013, Fredrik Tolf wrote:
> You started the balance after above btrfs fi show command?

I did.

> Then it's obvious to me:
>
> For some reason BTRFS is still trying to write to /dev/sdd, which isn't
> there anymore. That perfectly explains those lost page writes for me. If
> that is the case, this seems to me like a serious bug in BTRFS.

I have now disconnected the drive entirely, so that I can do exactly what
I would have to do if the drive really had failed completely and I had
gotten a replacement in its stead. The system no longer sees any sdd or
sdi. However, I'm still getting kernel messages about being unable to
write to sdd:

Feb 15 04:37:41 nerv kernel: [252822.640560] lost page write due to I/O error on /dev/sdd1
Feb 15 04:37:41 nerv kernel: [252822.644531] btrfs: bdev /dev/sdd1 errs: wr 362195, rd 26, flush 1, corrupt 0, gen 0

I can't say I know what conclusions that leads to with regard to your
observations.

> I'd restart the machine, see that BTRFS is using both devices again and
> then try the balance again.

I mentioned it in another mail, but I'd very much prefer not to do that. 
I'd like to try and solve this as I normally should when a drive fails.

When I'm running btrfs fi show, this is what I'm getting now:

> $ sudo ./btrfs fi show
> Label: none  uuid: 40d346bb-2c77-4a78-8803-1e441bf0aff7
>         Total devices 2 FS bytes used 2.66TB
>         devid    2 size 2.73TB used 2.67TB path /dev/sde1
>         *** Some devices missing

So that's what it should look like when a drive fails, right?

At this point, I'm trying to remove the missing device from the filesystem 
as the Wiki indicates that I should be able to, but alas:

> $ sudo ./btrfs device delete missing /mnt
> ERROR: error removing the device 'missing' - Invalid argument

The dmesg tells me this:

> Feb 15 04:42:22 nerv kernel: [253103.799201] btrfs: unable to go below two devices on raid1

How do I make the filesystem forget the missing device so that I can
replace it? Should I simply add the replacement first, and only after that
remove the missing device?
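
For illustration, the add-first ordering asked about above could be sketched as the dry run below. It only prints the two commands and runs nothing. The device path /dev/sdX1 and the helper name plan_replace are hypothetical placeholders, and whether this ordering actually gets past the "unable to go below two devices on raid1" check on this kernel is an assumption that this message does not confirm.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of executing them.
# plan_replace, /dev/sdX1 and /mnt are illustrative placeholders.
plan_replace() {
    new_dev=$1
    mnt=$2
    # Step 1: add the replacement so the filesystem has two devices again.
    echo "btrfs device add $new_dev $mnt"
    # Step 2: only afterwards ask btrfs to drop the missing device.
    echo "btrfs device delete missing $mnt"
}

plan_replace /dev/sdX1 /mnt
```

Printing the plan first makes it easy to double-check the device path before running anything against a degraded filesystem.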

If the latter, how can I "scratch" the previous btrfs metadata from this
"replacement" drive so that it doesn't try to autoreinsert it into the
filesystem when it is detected? I assume it won't be enough to just zero
the first few sectors of the drive, right?
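
As a rough check of that assumption, here is a minimal sketch that looks for the btrfs signature on a device or image file. It assumes the standard btrfs on-disk layout: the primary superblock starts at 64 KiB and its 8-byte magic "_BHRfS_M" sits 64 bytes into it, at byte offset 65600. The function name btrfs_magic_at_64k is made up for this example, and the demo runs on a scratch image file rather than a real drive.

```shell
#!/bin/sh
# Prints "btrfs" if the primary superblock magic is present, else "none".
# Assumes the standard layout: superblock at 64 KiB, magic at byte 65600.
btrfs_magic_at_64k() {
    dev=$1
    # Read the 8 magic bytes; strip NULs so an empty region compares cleanly.
    magic=$(dd if="$dev" bs=1 skip=65600 count=8 2>/dev/null | tr -d '\0')
    if [ "$magic" = "_BHRfS_M" ]; then
        echo btrfs
    else
        echo none
    fi
}

# Demo on a 128 KiB scratch image carrying a fake signature.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=1024 count=128 2>/dev/null
printf '_BHRfS_M' | dd of="$img" bs=1 seek=65600 conv=notrunc 2>/dev/null
btrfs_magic_at_64k "$img"    # prints "btrfs"

# Zero the first 8 sectors (4096 bytes), as in "the first few sectors".
dd if=/dev/zero of="$img" bs=512 count=8 conv=notrunc 2>/dev/null
btrfs_magic_at_64k "$img"    # still prints "btrfs"
rm -f "$img"
```

Under that layout, zeroing the first few sectors never reaches the signature. Backup superblock copies also exist further into the device (at 64 MiB and 256 GiB on large enough devices), and a signature-aware tool such as wipefs(8) is the more usual way to clear the primary signature.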

Thanks for replying!

--

Fredrik Tolf
