Linux-BTRFS Archive on lore.kernel.org
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Alexander Wetzel <alexander.wetzel@web.de>,
	linux-btrfs@vger.kernel.org, wqu@suse.com
Subject: Re: [BUG] BTRFS critical corrupt leaf - bisected to 496245cac57e
Date: Sun, 14 Jul 2019 17:49:09 +0800
Message-ID: <6e764f38-a8dd-19e2-e885-3d7561479681@gmx.com> (raw)
In-Reply-To: <057a7561-f691-d7ee-1dea-27acc5ea79cc@web.de>




On 2019/7/14 5:25 PM, Alexander Wetzel wrote:
> 
>>>
>>> filtering for btrfs and removing duplicate lines just shows three uniq
>>> error messages:
>>>   BTRFS critical (device vda3): corrupt leaf: root=300 block=8645398528
>>> slot=4 ino=259223, invalid inode generation: has 139737289170944 expect
>>> [0, 1425224]
>>>   BTRFS critical (device vda3): corrupt leaf: root=300 block=8645398528
>>> slot=4 ino=259223, invalid inode generation: has 139737289170944 expect
>>> [0, 1425225]
>>>   BTRFS critical (device vda3): corrupt leaf: root=300 block=8645398528
>>> slot=4 ino=259223, invalid inode generation: has 139737289170944 expect
>>> [0, 1425227]
>>
>> The generation number is 0x7f171f7ba000, I see no reason why it would
>> make any sense.
>>
>> I see no problem rejecting obviously corrupted item.
>>
>> The problem is:
>> - Is that corrupted item?
>>    At least to me, it looks corrupted just from the dmesg.
>>
>> - How and when this happens
>>    Obviously happened on some older kernel.
>>    V5.2 will report such problem before writing corrupted data back to
>>    disk, at least prevent such problem from happening.
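For reference, the hex value above can be reproduced from the decimal number in the dmesg output on any shell; 0x7f171f7ba000 sits in the usual x86_64 userspace mmap range, which is why it looks more like a leaked pointer than a transid:

```shell
# Convert the bogus generation from dmesg to hex.
# The result, 0x7f171f7ba000, looks like a typical x86_64
# userspace address rather than any plausible transid.
printf '0x%x\n' 139737289170944   # prints 0x7f171f7ba000
```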
> 
> It's probably useless information at that point, but the FS was created
> with a boot image from Debian 8 around Dec 1st 2016 by migrating an also
> freshly created ext4 filesystem to btrfs.

A migrated image could contain something unexpected, but according to the
owner id, this is definitely not the converted subvolume, but a newly
created subvolume/snapshot.

> I'm pretty sure the migration failed with the newer gentoo kernel
> intended for operation - which was sys-kernel/hardened-sources-4.7.10 -
> and I used the Debian boot image for that. (I can piece together all
> kernel versions used from wtmp, but the Debian boot kernel would be
> "guess only".)
> 
> The time stamps like "2016-12-01 21:51:27" in the dump below are
> matching very well to the time I was setting up the system based on the
> few remaining log evidence I have.

I just did a quick grep and blame for inode transid related code.
The latest direct modification to inode transid is 6e17d30bfaf4 ("Btrfs:
fill ->last_trans for delayed inode in btrfs_fill_inode."), which was
upstreamed in v4.1.

Furthermore, at that time we didn't have a good backporting practice,
so that commit lacks a Fixes tag and wasn't backported to most stable
branches.
I don't believe the Debian backport team would have picked it into their
kernels, so if the fs was modified by a kernel older than v4.1, that may
be the cause.
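If you have a kernel git clone handy, `git tag --contains 6e17d30bfaf4` lists every release tag that includes that commit. A minimal demonstration of the idea in a throwaway repo (the repo, commits, and tag names here are made up for illustration):

```shell
# Demo of `git tag --contains` in a scratch repo; against a real kernel
# tree you would simply run: git tag --contains 6e17d30bfaf4
tmp=$(mktemp -d)
cd "$tmp"
git init -q repo && cd repo
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m 'the fix'
git tag v4.1                       # first release containing the fix
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m 'later work'
git tag v4.2
# Every tag whose history contains the first commit is listed:
git tag --contains "$(git rev-list --max-parents=0 HEAD)"
```

Any tag listed (and every later one) ships the commit; a stable branch whose tag is absent never got it.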

> 
>> Please provide the following dump:
>>   # btrfs ins dump-tree -b 8645398528 /dev/vda3
>>
> 
> xar /home/alex # btrfs ins dump-tree -b 8645398528 /dev/vda3
> btrfs-progs v4.19
> leaf 8645398528 items 48 free space 509 generation 1425074 owner 300
> leaf 8645398528 flags 0x1(WRITTEN) backref revision 1
> fs uuid 668c885e-50b9-41d0-a3ce-b653a4d3f87a
> chunk uuid 54c6809b-e261-423f-b4a1-362304e887bd
>         item 0 key (259222 DIR_ITEM 2504220146) itemoff 3960 itemsize 35
>                 location key (259223 INODE_ITEM 0) type FILE
>                 transid 8119256875011 data_len 0 name_len 5
>                 name: .keep

If we were also checking DIR_ITEM/DIR_INDEX transid, the kernel would
fail even more easily.

Those transids make no sense at all.

>         item 1 key (259222 DIR_INDEX 2) itemoff 3925 itemsize 35
>                 location key (259223 INODE_ITEM 0) type FILE
>                 transid 8119256875011 data_len 0 name_len 5
>                 name: .keep
>         item 2 key (259222 DIR_INDEX 3) itemoff 3888 itemsize 37
>                 location key (258830 INODE_ITEM 0) type DIR
>                 transid 2673440063491 data_len 0 name_len 7
>                 name: portage
>         item 3 key (259222 DIR_INDEX 4) itemoff 3851 itemsize 37
>                 location key (3632036 INODE_ITEM 0) type DIR
>                 transid 169620 data_len 0 name_len 7
>                 name: binpkgs
>         item 4 key (259223 INODE_ITEM 0) itemoff 3691 itemsize 160
>                 generation 1 transid 139737289170944 size 0 nbytes 0
>                 block group 0 mode 100644 links 1 uid 0 gid 0 rdev 0
>                 sequence 139737289225400 flags 0x0(none)

The reported transid makes no sense either.

>                 atime 1480625487.0 (2016-12-01 21:51:27)
>                 ctime 1480625487.0 (2016-12-01 21:51:27)
>                 mtime 1480015482.0 (2016-11-24 20:24:42)
>                 otime 0.0 (1970-01-01 01:00:00)
>         item 5 key (259223 INODE_REF 259222) itemoff 3676 itemsize 15
>                 index 2 namelen 5 name: .keep
>         item 6 key (259224 INODE_ITEM 0) itemoff 3516 itemsize 160
>                 generation 1 transid 1733 size 4 nbytes 5

This transid should be correct.

According to the leaf generation, any transid larger than 1425074 must
be incorrect.

So there are a lot of transid errors, not limited to the reported item 4.
There may be so many transid errors that most of your tree blocks would
get modified just to update the transids.

To fix this, I believe it's possible to reset all these inodes' transids
to the leaf transid, but I'm not 100% sure such a fix wouldn't affect
things like send.
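To gauge how widespread the damage is, the dump output can be filtered for transids above the leaf generation. A rough sketch (the awk filter is mine, fed a few sample lines copied from the dump above; on the real system, pipe `btrfs ins dump-tree -b 8645398528 /dev/vda3` into it instead of the here-document):

```shell
# Print any line whose transid exceeds the leaf generation (1425074).
awk '{ for (i = 1; i < NF; i++)
         if ($i == "transid" && $(i+1) + 0 > 1425074) { print; break } }' <<'EOF'
transid 8119256875011 data_len 0 name_len 5
transid 169620 data_len 0 name_len 7
generation 1 transid 139737289170944 size 0 nbytes 0
generation 1 transid 1733 size 4 nbytes 5
EOF
```

Only the two pointer-looking values come out; 169620 and 1733 are plausible transids below the leaf generation and pass the check.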


I totally understand that the solution I'm about to propose sounds
awful, but I'd recommend using a new enough kernel without that check
to copy all the data to another btrfs filesystem.

It could be safer than waiting for btrfs check to be able to repair it.

Thanks,
Qu



Thread overview: 8+ messages
2019-07-13 20:48 Alexander Wetzel
2019-07-14  1:30 ` Qu Wenruo
2019-07-14  9:25   ` Alexander Wetzel
2019-07-14  9:49     ` Qu Wenruo [this message]
2019-07-14 12:07       ` Alexander Wetzel
2019-07-14 12:51         ` Qu Wenruo
2019-07-14 15:40       ` Chris Murphy
2019-07-15  1:07         ` Qu Wenruo
