From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: linux-btrfs@vger.kernel.org
Cc: wqu@suse.com
Subject: Re: Corrupted btrfs (LUKS), seeking advice
Date: Tue, 9 Aug 2022 19:37:02 +0800
Message-ID: <6c15d5db-4e87-dd49-d42a-2fcf08157b25@gmx.com>
In-Reply-To: <83bf3b4b-7f4c-387a-b286-9251e3991e34@bluemole.com>



On 2022/8/9 19:23, Michael Zacherl wrote:
> On 8/9/22 01:22, Qu Wenruo wrote:
>> You can try "mount -o ro,rescue=all", which will skip the block group
>> item search, but still there will be other corruptions.
>>
>> I'm more interested in how this happened.
>> The main point here is, the found transid is newer than the on-disk
>> transid, which means metadata COW is not working at all.
>>
>> Are you using unsafe mount options like disabling barriers?
>
> No, this upgraded setup is fairly new and completely stock.
> (and most of that terminology I have to look up anyway)
> Btrfs is new to me and I don't experiment on systems I need for work.
>
> I think what happened is having had mounted the FS twice by accident:
> The former system (Mint 19.3/ext4) has been cloned to a USB-stick which
> I can boot from.
> In one such session I mounted the new btrfs nvme on the old system for
> some data exchange.
> I put the old system to hibernation but forgot to unmount the nvme prior
> to that. :(

Regarding hibernation, I'm not sure about the details, but there have
been corruption reports related to it.

>
> So when booting up the new system from the nvme it was like having had a
> hard shutdown.

A hard shutdown by itself should not cause btrfs any problem; that is
ensured by its mandatory metadata COW.

> So that in itself wouldn't be the problem, I'd think.
> But the other day I again booted from the old system from its
> hibernation with the forgotten nvme mounted.

Oh, I get it now: the problem didn't happen right after hibernation, but
after resuming from hibernation, when writes into the already
out-of-sync fs caused the corruption.

> And that was the killer, I'd say, since a lot of metadata has changed on
> that btrfs meanwhile.

Yes, I believe that's the case.
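
If you want to double-check how far things drifted, the superblock
generation can be read without mounting. The device path below is only a
placeholder for your actual LUKS mapping:

  # read-only superblock dump, no mount needed; replace the device
  # path with your real dm-crypt mapping
  btrfs inspect-internal dump-super /dev/mapper/nvme_crypt | grep -i generation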

>
> On top of it, btrfs is v4.15.1 on the old system; many things just don't
> exist on this version, AFAICT.
> If that made things worse, I can't say.
>
> On 8/9/22 01:24, Qu Wenruo wrote:
>> Try this mount option "-o ro,rescue=all,rescue=nologreplay".
>
> Oh, I thought "nologgreplay" would be included in "all"?

I checked the code, and the latest code does indeed include it.

But that's weird; if it is included, we should not try to replay the log,
so that "start tree-log replay" message should not occur.

Anyway, if rescue=all doesn't work, you may try "btrfs rescue zero-log"
to manually delete the dirty log; a RO mount should then still be
possible.

>
>>> Is this FS repairable to a usable state?
>>
>> Definitely no.
>
> Ouch. But meanwhile I can see the damage I did to it.
> I'm currently abroad, so no access to my regular backup infrastructure.
>
> However, since I've to re-install the new system when I'm back:
> There would be enough space on the new ssd for a second partition, which
> I'd like not to go for.
> Or is there an option for additional redundancy within the same btrfs to
> help in case of such a bad mistake (I'd dearly try to avoid anyway, but
> ...)?

I'm not sure there is any way to prevent such an out-of-sync RW mount
from corrupting the fs.

We can go RAID1 (if you have multiple disks) or DUP (single device,
double copy), but neither of them can handle the case you hit.
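
Just for reference, this is roughly how those profiles would be set up;
device names and mount point are placeholders, and again neither profile
helps against another OS writing to the fs behind its back:

  # single device: duplicate metadata (DUP), single data copy
  mkfs.btrfs -m dup -d single /dev/mapper/nvme_crypt

  # two devices: mirror metadata and data (RAID1)
  mkfs.btrfs -m raid1 -d raid1 /dev/sdX /dev/sdY

  # or convert the metadata of an existing, mounted fs to DUP
  btrfs balance start -mconvert=dup /mountpoint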

Personally speaking, I have never trusted hibernation/suspension,
although mostly for other ACPI-related reasons.
Nowadays I don't touch hibernation/suspension at all, just to avoid any
removable storage corruption.

Thanks,
Qu

>
> Thanks a lot for your time, Michael.
>
>

Thread overview: 12+ messages
2022-08-08 13:06 Corrupted btrfs (LUKS), seeking advice Michael Zacherl
2022-08-08 16:57 ` Chris Murphy
2022-08-08 17:33   ` Michael Zacherl
2022-08-09  0:24     ` Qu Wenruo
2022-08-09 11:23       ` Michael Zacherl
2022-08-09 11:37         ` Qu Wenruo [this message]
2022-08-10  6:10           ` Andrei Borzenkov
2022-08-10  6:29             ` Qu Wenruo
2022-08-10  7:26               ` Goffredo Baroncelli
2022-08-11  6:40                 ` Qu Wenruo
2022-08-14 13:18                   ` Michael Zacherl
2022-08-09  0:22 ` Qu Wenruo
