linux-btrfs.vger.kernel.org archive mirror
From: Chris Murphy <lists@colorremedies.com>
To: Vladimir Panteleev <thecybershadow@gmail.com>
Cc: Btrfs BTRFS <linux-btrfs@vger.kernel.org>,
	Qu Wenruo <quwenruo.btrfs@gmx.com>
Subject: Re: "kernel BUG" and segmentation fault with "device delete"
Date: Fri, 5 Jul 2019 20:38:13 -0600
Message-ID: <CAJCQCtS87cQV4PWuDRaQmmY-N03XmGqN2hh8EQv8BqqVGRuxbw@mail.gmail.com>
In-Reply-To: <a4920d21-3c90-9a96-9b44-f90d7b5eed3a@gmail.com>

On Fri, Jul 5, 2019 at 6:05 PM Vladimir Panteleev
<thecybershadow@gmail.com> wrote:

> On 05/07/2019 21.43, Chris Murphy wrote:

> > But I can't tell from the
> > above exactly when each drive was disconnected. In this scenario you
> > need to convert to raid1 first, wait for that to complete successfully
> > before you can do a device remove. That's clear.  Also clear is you
> > must use 'btrfs device remove' and it must complete before that device
> > is disconnected.
>
> Unfortunately as mentioned before that wasn't an option. I was
> performing this operation on a DM snapshot target backed by a file that
> certainly could not fit the result of a RAID10-to-RAID1 rebalance.

Then the total operation isn't possible. Maybe you could have made the
volume a seed, and then create a single device sprout on a new single
target, and later convert that sprout to raid1. But I'm not sure of
the state of multiple device seeds.


>
> > What I've never tried, but the man page implies, is you can specify
> > two devices at one time for 'btrfs device remove' if the profile and
> > the number of devices permits it.
>
> What I found surprising was that "btrfs device delete missing" deletes
> exactly one device, instead of all missing devices. But, that might be
> simply because a device with RAID10 blocks should not have been
> mountable rw with two missing drives in the first place.

It's a really good question for developers if there is a good reason
to permit rw mount of a volume that's missing two or more devices for
raid 1, 10, or 5; and missing three or more for raid6. I cannot think
of a good reason to allow degraded,rw mounts for a raid10 missing two
devices.
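
To make the thresholds above concrete, here is a minimal sketch in Python. It is not how the kernel decides whether to allow a degraded mount; the table and the function name are hypothetical and only encode the per-profile limits stated in the paragraph above.

```python
# Maximum number of missing devices per profile, taken from the thresholds
# in the paragraph above (raid1/raid10/raid5: one; raid6: two).
# Illustrative only: this is NOT the kernel's actual mount logic.
MAX_MISSING = {"raid1": 1, "raid10": 1, "raid5": 1, "raid6": 2}


def degraded_rw_is_reasonable(profile: str, missing: int) -> bool:
    """Return True if a degraded,rw mount stays within the profile's limit."""
    if missing < 0:
        raise ValueError("missing must be non-negative")
    return missing <= MAX_MISSING[profile.lower()]
```

Under this sketch a raid10 volume missing two devices, which is the case in this thread, falls outside the limit.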


> > This is actually worse, potentially because it means there's only one
> > copy of the system chunk on sdd1. It has not been replicated to sdf1,
> > but is on the missing device.
>
> I'm sorry, but that's not right. As I mentioned in my second email, if I
> use btrfs device replace, then it successfully rebuilds all missing
> data. So, there is no lost data with no remaining copies; btrfs is
> simply having some trouble moving it off of that device.
>
> Here is the filesystem info with a loop device replacing the missing drive:
>
> https://dump.thecybershadow.net/9a0c88c3720c55bcf7fee98630c2a8e1/00%3A02%3A17-upload.txt

Wow, that's really interesting. So you did 'btrfs replace start' for
one of the missing drives' devids, with a loop device as the
replacement, and that worked and finished?!

Does this three device volume mount rw and not degraded? I guess it
must have because 'btrfs fi us' worked on it.

        devid    1 size 7.28TiB used 2.71TiB path /dev/sdd1
        devid    2 size 7.28TiB used 22.01GiB path /dev/loop0
        devid    3 size 7.28TiB used 2.69TiB path /dev/sdf1

OK so what happens now if you try to 'btrfs device remove /dev/loop0' ?
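
For comparing per-device usage in a listing like the one above, a small parser can help. This is a rough sketch that assumes the line format stays exactly as pasted here; the regular expression and the helper name are my own, not part of btrfs-progs.

```python
import re

# The three devid lines exactly as pasted above.
LISTING = """\
        devid    1 size 7.28TiB used 2.71TiB path /dev/sdd1
        devid    2 size 7.28TiB used 22.01GiB path /dev/loop0
        devid    3 size 7.28TiB used 2.69TiB path /dev/sdf1
"""

UNITS = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}
LINE = re.compile(
    r"devid\s+(\d+)\s+size\s+([\d.]+)([A-Za-z]+)"
    r"\s+used\s+([\d.]+)([A-Za-z]+)\s+path\s+(\S+)"
)


def parse_devices(text):
    """Turn each devid line into a dict with sizes converted to bytes."""
    devices = []
    for m in LINE.finditer(text):
        devid, size, size_unit, used, used_unit, path = m.groups()
        devices.append({
            "devid": int(devid),
            "size": float(size) * UNITS[size_unit],
            "used": float(used) * UNITS[used_unit],
            "path": path,
        })
    return devices


devices = parse_devices(LISTING)
least_used = min(devices, key=lambda d: d["used"])
```

Here `least_used` is the /dev/loop0 entry, which holds about 22 GiB while the other two devices each hold roughly 2.7 TiB.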


>
> > Depending on degraded operation for this task is the wrong strategy.
> > You needed to 'btrfs device delete/remove' before physically
> > disconnecting these drives.
> >
> > OK you definitely did this incorrectly if you're expecting to
> > disconnect two devices at the same time, and then "btrfs device delete
> > missing" instead of explicitly deleting drives by ID before you
> > physically disconnect them.
>
> I don't disagree in general, however, I did make sure that all data was
> accessible with two devices before proceeding with this endeavor.

Well, there's definitely something screwy if Btrfs needs something on a
missing drive, which is indicated by its refusal to remove it from the
volume, and yet at the same time it's possible to e.g. rsync every file to
/dev/null without any errors. That's a bug somewhere.


> >> OK so what did you do, in order, each command, interleaving the
> >> physical device removals.
>
> Well, at this point, I'm still quite confident that the BTRFS kernel bug
> is unrelated to this entire RAID10 thing, but I'll do so if you like.
> Unfortunately I do not have an exact record of this, but I can do my
> best to reconstruct it from memory.

I'm not a developer, but a dev very well might need a simple
reproducer for this in order to locate the problem. But the call trace
might tell them what they need to know. I'm not sure.


>
> The reason I'm doing this in the first place is that I'm trying to split
> a 4-drive RAID10 array that was getting full. The goal was to move some
> data off of it to a new array, then delete it from its original
> location. I couldn't use rsync because most of the data was in
> snapshots, and I couldn't use btrfs send/receive because it bugs out
> with the old "chown oXXX-XXXXXXX-0 failed: No such file or directory"
> bug. So, my idea was:

I'm not familiar with that bug. That sounds like a receive side bug
not a send side bug. I wonder if receive will continue if you use the
-E 0 option, and the result will just be wrong owner on a few files.
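
As a sketch of what that would look like, the snippet below only builds the command line as a string and runs nothing. The snapshot and destination paths are placeholders, and the helper name is made up. `-E` is the short form of `--max-errors` in 'btrfs receive', where 0 means no limit.

```python
import shlex


def send_receive_pipeline(snapshot: str, dest: str, max_errors: int = 0) -> str:
    """Build (but do not execute) a send | receive pipeline as a shell string."""
    send = ["btrfs", "send", snapshot]
    # -E / --max-errors: stop after this many errors; 0 means unlimited.
    receive = ["btrfs", "receive", "-E", str(max_errors), dest]
    return f"{shlex.join(send)} | {shlex.join(receive)}"


# Placeholder paths, for illustration only.
command = send_receive_pipeline("/mnt/old/snap", "/mnt/new")
```

Whether receive actually gets past the chown failure this way is untested here; the sketch only shows where the option goes.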


>
> 1. Use device mapper to create a COW copy of all four devices, and
> operate on those (make the SATA devices read-only to ensure they're not
> touched)
> 2. Use btrfstune to change the UUID of the new filesystem
> 3. Delete 75%-ish of data off of the COW copy
> 4. Somehow convert the 4-disk RAID10 to 2-disk RAID1 without incurring a
> ton of writes to the COW copies
> 5. dd the contents of the COW copies to two new real disks
> 6. After ensuring the remaining data is safe on the new disks, delete it
> from the original array.
>
> For steps 2 and 3, I needed to specify the exact devices to work with.
> It's possible to specify the device list when mounting with -o device=,
> but for btrfstune, I had to bind-mount a fake partitions file over
> /proc/partitions. I can share the scripts I used for all this if you like.

No, it's fine.

> Have you had a chance to look at the kernel stack trace yet? It looks
> like it's running out of temporary space to perform a relocation. I
> think that is what we should be concentrating on.

I've looked at it but I can't really follow it. The comments in the
code don't really tell me much either other than Btrfs is confused,
and so you're seeing the warning and then error -28. It may really be
running out of global reserve for this operation, I can't really tell.
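
For reference, error -28 can be decoded with Python's standard errno module. This assumes the usual Linux errno numbering; the kernel reports the value negated.

```python
import errno
import os

code = -28  # as reported in the kernel log

# Kernel return codes are negated errno values, so flip the sign first.
name = errno.errorcode[-code]
message = os.strerror(-code)
print(f"{code}: {name} ({message})")
```

On Linux the name is ENOSPC, which is consistent with some kind of space reservation running out, though it does not say which one.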

Qu will understand this better.

-- 
Chris Murphy

