* Upgraded from Buster to Bullseye, unmountable Btrfs filesystem
@ 2021-11-10  2:17 S.
  2021-11-10  2:33 ` Qu Wenruo
  2021-11-10  2:55 ` S.
  0 siblings, 2 replies; 16+ messages in thread
From: S. @ 2021-11-10  2:17 UTC (permalink / raw)
  To: linux-btrfs

Hi there, I run OpenMediaVault on an old LaCiE NAS with an armel processor. It has two HDDs, with the Debian root on an EXT4 partition and a simple BtrFS RAID-1 using `/dev/sda3` + `/dev/sdb2` for the OpenMediaVault data. Both drives have a few bad blocks, but I assumed that the filesystem was working around them, because it was running fine for almost 2 years on Buster. The SMART report for both drives says `SMART overall-health self-assessment test result: PASSED`. Today I upgraded to OpenMediaVault 6 and Debian Bullseye, and the BtrFS volume is no longer mountable. Here are my attempts thus far:

https://paste.debian.net/1218866/

For the search engines, these are the key errors:

-----------------------------

     [   86.110770] BTRFS critical (device sda3): corrupt leaf: root=9 block=170459136 slot=0, invalid key objectid, have 1101835439474057344 expect to be aligned to 4096
     [   86.125317] BTRFS error (device sda3): block=170459136 read time tree block corruption detected
     [   86.149595] BTRFS critical (device sda3): corrupt leaf: root=9 block=170459136 slot=0, invalid key objectid, have 1101835439474057344 expect to be aligned to 4096
     [   86.164280] BTRFS error (device sda3): block=170459136 read time tree block corruption detected
     [   86.173099] BTRFS warning (device sda3): failed to read root (objectid=9): -5
     [   86.268589] BTRFS critical (device sda3): corrupt leaf: root=9 block=170459136 slot=0, invalid key objectid, have 1101835439474057344 expect to be aligned to 4096
     [   86.283207] BTRFS error (device sda3): block=170459136 read time tree block corruption detected
     [   86.298934] BTRFS critical (device sda3): corrupt leaf: root=9 block=170459136 slot=0, invalid key objectid, have 1101835439474057344 expect to be aligned to 4096
     [   86.313575] BTRFS error (device sda3): block=170459136 read time tree block corruption detected
     [   86.322394] BTRFS warning (device sda3): failed to read root (objectid=9): -5
     [   86.363745] BTRFS error (device sda3): parent transid verify failed on 78086144 wanted 8060 found 8063
     [   86.390901] BTRFS error (device sda3): parent transid verify failed on 78086144 wanted 8060 found 8063
     [   86.414379] BTRFS error (device sda3): parent transid verify failed on 79216640 wanted 8061 found 8063
     [   86.434284] BTRFS error (device sda3): parent transid verify failed on 79216640 wanted 8061 found 8063
     [   86.465467] BTRFS warning (device sda3): couldn't read tree root
     [   86.500332] BTRFS error (device sda3): open_ctree failed

-----------------------------

     root@OpenMediaVault:~# btrfs check /dev/sda3
     Opening filesystem to check...
     Checking filesystem on /dev/sda3
     UUID: 4a057760-998c-4c66-aa6a-2a08c51d5299
     [1/7] checking root items
     [2/7] checking extents
     Invalid key type(EXTENT_ITEM) found in root(UUID_TREE)
     ignoring invalid key
     [3/7] checking free space cache
     [4/7] checking fs roots
     [5/7] checking only csums items (without verifying data)
     [6/7] checking root refs
     [7/7] checking quota groups skipped (not enabled on this FS)
     found 704343715840 bytes used, no error found
     total csum bytes: 686922316
     total tree bytes: 757006336
     total fs tree bytes: 14794752
     total extent tree bytes: 13795328
     btree space waste bytes: 32551835
     file data blocks allocated: 707684626432
      referenced 707430318080

-----------------------------

Any suggestions? Thanks a lot.


* Re: Upgraded from Buster to Bullseye, unmountable Btrfs filesystem
  2021-11-10  2:17 Upgraded from Buster to Bullseye, unmountable Btrfs filesystem S.
@ 2021-11-10  2:33 ` Qu Wenruo
  2021-11-10  2:55 ` S.
  1 sibling, 0 replies; 16+ messages in thread
From: Qu Wenruo @ 2021-11-10  2:33 UTC (permalink / raw)
  To: S., linux-btrfs



On 2021/11/10 10:17, S. wrote:
> Hi there, I run OpenMediaVault on an old LaCiE NAS with an armel
> processor. It has two HDDs, with the Debian root on an EXT4 partition
> and a simple BtrFS RAID-1 using `/dev/sda3` + `/dev/sdb2` for the
> OpenMediaVault data. Both drives have a few bad blocks, but I assumed
> that the filesystem was working around them, because it was running fine
> for almost 2 years on Buster. The SMART report for both drives says
> `SMART overall-health self-assessment test result: PASSED`. Today I
> upgraded to OpenMediaVault 6 and Debian Bullseye, and the BtrFS volume
> is no longer mountable. Here are my attempts thus far:

Newer kernels (mostly since v5.11?) have much stricter sanity checks in
btrfs, so they can detect problems that older kernels did not report.

>
> https://paste.debian.net/1218866/
>
> For the search engines, these are the key errors:
>
> -----------------------------
>
>      [   86.110770] BTRFS critical (device sda3): corrupt leaf: root=9
> block=170459136 slot=0, invalid key objectid, have 1101835439474057344
> expect to be aligned to 4096

The root is the UUID tree, so the objectid is part of a UUID.

The problem is, the key type is corrupted, from your fsck result:

 >    Invalid key type(EXTENT_ITEM) found in root(UUID_TREE)
 >    ignoring invalid key

This bad key type gets caught by tree-checker, and the mount is rejected.
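
As a small illustration of the "expect to be aligned to 4096" part of the
message: the check can be reproduced with plain shell arithmetic. This is
only a sketch; `misalignment` is a hypothetical helper name, and reading
4096 as the filesystem's sector size is my assumption. As far as I
understand it, the alignment rule is applied because the key claims to be
an EXTENT_ITEM, whose objectid is expected to be a byte address.

```sh
# Hypothetical helper: remainder of a key objectid modulo an alignment.
# A non-zero result means the value is not aligned.
misalignment() { echo $(( $1 % $2 )); }

# objectid copied from the dmesg line; 4096 taken from the same message
misalignment 1101835439474057344 4096   # prints 128, so not aligned
```

A real UUID half has no reason to be a multiple of 4096, which fits the
explanation that the objectid itself is fine and the key type is what is
wrong.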

Furthermore, the EXTENT_ITEM_KEY value is 168, while the correct key types
in the UUID tree should be UUID_KEY_SUBVOL (251) or
UUID_KEY_RECEIVED_SUBVOL (252).
So this doesn't look like a simple bit flip.
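
The bit-flip remark can be checked by counting how many bits differ
between the key type that was found and the two valid ones. A minimal
sketch in POSIX shell; `bit_diff` is a hypothetical helper, not part of
any btrfs tool:

```sh
# Hypothetical helper: number of differing bits between two small integers.
bit_diff() {
    x=$(( $1 ^ $2 ))          # XOR leaves a 1 wherever the inputs differ
    n=0
    while [ "$x" -ne 0 ]; do
        n=$(( n + (x & 1) ))  # count the lowest bit
        x=$(( x >> 1 ))       # then drop it
    done
    echo "$n"
}

bit_diff 168 251   # EXTENT_ITEM vs UUID_KEY_SUBVOL: prints 4
bit_diff 168 252   # EXTENT_ITEM vs UUID_KEY_RECEIVED_SUBVOL: prints 3
```

A single flipped bit would give a distance of 1, so neither valid type is
one flip away from 168.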

But I still recommend running a memtest to rule out a memory problem.


For the repair, we can rebuild the UUID tree, but unfortunately btrfs-progs
doesn't have that ability yet.

Meanwhile, you can prepare a build environment for btrfs-progs. I
can soon craft a branch for you that re-initializes the UUID tree, to solve
the problem first.

To be extra safe, please provide the dump of your UUID tree:

# btrfs ins dump-tree -t uuid /dev/sda3

Thanks,
Qu

>      [   86.125317] BTRFS error (device sda3): block=170459136 read time
> tree block corruption detected
>      [   86.149595] BTRFS critical (device sda3): corrupt leaf: root=9
> block=170459136 slot=0, invalid key objectid, have 1101835439474057344
> expect to be aligned to 4096
>      [   86.164280] BTRFS error (device sda3): block=170459136 read time
> tree block corruption detected
>      [   86.173099] BTRFS warning (device sda3): failed to read root
> (objectid=9): -5
>      [   86.268589] BTRFS critical (device sda3): corrupt leaf: root=9
> block=170459136 slot=0, invalid key objectid, have 1101835439474057344
> expect to be aligned to 4096
>      [   86.283207] BTRFS error (device sda3): block=170459136 read time
> tree block corruption detected
>      [   86.298934] BTRFS critical (device sda3): corrupt leaf: root=9
> block=170459136 slot=0, invalid key objectid, have 1101835439474057344
> expect to be aligned to 4096
>      [   86.313575] BTRFS error (device sda3): block=170459136 read time
> tree block corruption detected
>      [   86.322394] BTRFS warning (device sda3): failed to read root
> (objectid=9): -5
>      [   86.363745] BTRFS error (device sda3): parent transid verify
> failed on 78086144 wanted 8060 found 8063
>      [   86.390901] BTRFS error (device sda3): parent transid verify
> failed on 78086144 wanted 8060 found 8063
>      [   86.414379] BTRFS error (device sda3): parent transid verify
> failed on 79216640 wanted 8061 found 8063
>      [   86.434284] BTRFS error (device sda3): parent transid verify
> failed on 79216640 wanted 8061 found 8063
>      [   86.465467] BTRFS warning (device sda3): couldn't read tree root
>      [   86.500332] BTRFS error (device sda3): open_ctree failed
>
> -----------------------------
>
>      root@OpenMediaVault:~# btrfs check /dev/sda3
>      Opening filesystem to check...
>      Checking filesystem on /dev/sda3
>      UUID: 4a057760-998c-4c66-aa6a-2a08c51d5299
>      [1/7] checking root items
>      [2/7] checking extents
>      Invalid key type(EXTENT_ITEM) found in root(UUID_TREE)
>      ignoring invalid key
>      [3/7] checking free space cache
>      [4/7] checking fs roots
>      [5/7] checking only csums items (without verifying data)
>      [6/7] checking root refs
>      [7/7] checking quota groups skipped (not enabled on this FS)
>      found 704343715840 bytes used, no error found
>      total csum bytes: 686922316
>      total tree bytes: 757006336
>      total fs tree bytes: 14794752
>      total extent tree bytes: 13795328
>      btree space waste bytes: 32551835
>      file data blocks allocated: 707684626432
>       referenced 707430318080
>
> -----------------------------
>
> Any suggestions? Thanks a lot.


* Re: Upgraded from Buster to Bullseye, unmountable Btrfs filesystem
  2021-11-10  2:17 Upgraded from Buster to Bullseye, unmountable Btrfs filesystem S.
  2021-11-10  2:33 ` Qu Wenruo
@ 2021-11-10  2:55 ` S.
  2021-11-10  3:34   ` Qu Wenruo
  2021-11-10  4:30   ` S.
  1 sibling, 2 replies; 16+ messages in thread
From: S. @ 2021-11-10  2:55 UTC (permalink / raw)
  To: linux-btrfs

Hi Qu, thank you very much for your fast response!

Regarding memtest, normally in Linux I have never been able to run it from inside an installed Linux system because it needs access to protected kernel memory, and instead it has to run from a live USB or from the memtest86 live testing image. But since this system is a proprietary NAS with uboot and no video interface I don't know how to run a memtest.

> To be extra safe, please provide the dump of your UUID tree:

------------------
root@OpenMediaVault:~# btrfs ins dump-tree -t uuid /dev/sda3
btrfs-progs v5.10.1
uuid tree key (UUID_TREE ROOT_ITEM 0)
leaf 170459136 items 4 free space 16151 generation 366 owner UUID_TREE
leaf 170459136 flags 0x1(WRITTEN) backref revision 1
fs uuid 4a057760-998c-4c66-aa6a-2a08c51d5299
chunk uuid 54b2fa0f-9907-49d1-af33-e172581cd25e
	item 0 key (1101835439474057344 EXTENT_ITEM 56168916570538915) itemoff 16275 itemsize 8
	item 1 key (0x0f4a817e92c9a080 UUID_KEY_SUBVOL 0xc78d54ff9b03a3a8) itemoff 16267 itemsize 8
		subvol_id 5
	item 2 key (0x421cfa6924ef510d UUID_KEY_SUBVOL 0xc8423812cf31288b) itemoff 16259 itemsize 8
		subvol_id 269
	item 3 key (0x45a64f82bc9152f1 UUID_KEY_SUBVOL 0x3859efe8688d6ea4) itemoff 16251 itemsize 8
		subvol_id 274
total bytes 1990110658560
bytes used 704343715840
uuid 4a057760-998c-4c66-aa6a-2a08c51d5299
------------------
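
One thing that can be checked directly from the dump above is how item 0
relates to item 1. dump-tree prints item 0 in decimal and item 1 in hex,
so the sketch below converts the decimal values to hex for comparison. It
only re-does arithmetic on numbers copied from the dump and does not by
itself say what caused the bad key; `to_hex16` is a hypothetical helper
name.

```sh
# Values copied from the dump-tree output above.
bad_objectid=1101835439474057344      # item 0 objectid (decimal)
bad_offset=56168916570538915          # item 0 offset (decimal)
good_offset_hex=c78d54ff9b03a3a8      # item 1 offset (hex, without 0x)

# Hypothetical helper: decimal -> zero-padded 64-bit hex.
to_hex16() { printf '%016x\n' "$1"; }

to_hex16 "$bad_objectid"   # prints 0f4a817e92c9a080, same as item 1's objectid
to_hex16 "$bad_offset"     # prints 00c78d54ff9b03a3, item 1's offset minus its last byte

# Last byte of item 1's offset, shown as a decimal key type.
low_byte=${good_offset_hex#"${good_offset_hex%??}"}
printf '%d\n' "0x$low_byte"   # prints 168, the EXTENT_ITEM type value
```

So item 0 carries the same objectid as the subvol_id 5 entry, its type
value equals the lowest byte of that entry's offset, and its offset is the
remaining bytes of it.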


* Re: Upgraded from Buster to Bullseye, unmountable Btrfs filesystem
  2021-11-10  2:55 ` S.
@ 2021-11-10  3:34   ` Qu Wenruo
  2021-11-10  4:30   ` S.
  1 sibling, 0 replies; 16+ messages in thread
From: Qu Wenruo @ 2021-11-10  3:34 UTC (permalink / raw)
  To: S., linux-btrfs



On 2021/11/10 10:55, S. wrote:
> Hi Qu, thank you very much for your fast response!
>
> Regarding memtest, normally in Linux I have never been able to run it
> from inside an installed Linux system because it needs access to
> protected kernel memory, and instead it has to run from a live USB or
> from the memtest86 live testing image. But since this system is a
> proprietary NAS with uboot and no video interface I don't know how to
> run a memtest.

There is a user space tool, memtester, which pins down a large chunk of
memory and runs its tests in user space.

It cannot test the memory used by the kernel, but it should be
mostly enough for this memory test use case.
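
memtester takes the amount of memory to test and a loop count, as in the
`memtester 200M 1` run reported later in this thread. A rough sketch for
picking a size on a small machine follows; the helper name `suggest_mb`
and the 3/4 safety factor are my own assumptions, not anything memtester
prescribes.

```sh
# Hypothetical helper: suggest a memtester size in MiB from available kB.
# The 3/4 factor is an arbitrary margin to stay clear of the OOM killer.
suggest_mb() { echo $(( $1 * 3 / 4 / 1024 )); }

# On Linux, MemAvailable in /proc/meminfo is reported in kB.
if [ -r /proc/meminfo ]; then
    avail_kb=$(awk '/^MemAvailable:/ { print $2 }' /proc/meminfo)
    if [ -n "$avail_kb" ]; then
        # Only print the command so it can be reviewed before running it.
        echo "memtester $(suggest_mb "$avail_kb")M 1"
    fi
fi
```

Whatever size is chosen, the memory held by the kernel itself stays
untested, as noted above.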

>
>> To be extra safe, please provide the dump of your UUID tree:
>
> ------------------
> root@OpenMediaVault:~# btrfs ins dump-tree -t uuid /dev/sda3
> btrfs-progs v5.10.1
> uuid tree key (UUID_TREE ROOT_ITEM 0)
> leaf 170459136 items 4 free space 16151 generation 366 owner UUID_TREE
> leaf 170459136 flags 0x1(WRITTEN) backref revision 1
> fs uuid 4a057760-998c-4c66-aa6a-2a08c51d5299
> chunk uuid 54b2fa0f-9907-49d1-af33-e172581cd25e
>      item 0 key (1101835439474057344 EXTENT_ITEM 56168916570538915)
> itemoff 16275 itemsize 8

Yep, exactly the problem.

>      item 1 key (0x0f4a817e92c9a080 UUID_KEY_SUBVOL 0xc78d54ff9b03a3a8)
> itemoff 16267 itemsize 8
>          subvol_id 5
>      item 2 key (0x421cfa6924ef510d UUID_KEY_SUBVOL 0xc8423812cf31288b)
> itemoff 16259 itemsize 8
>          subvol_id 269
>      item 3 key (0x45a64f82bc9152f1 UUID_KEY_SUBVOL 0x3859efe8688d6ea4)
> itemoff 16251 itemsize 8
>          subvol_id 274

And you haven't received any subvolume yet, so it will be much easier
to rebuild the tree.

Thanks,
Qu
> total bytes 1990110658560
> bytes used 704343715840
> uuid 4a057760-998c-4c66-aa6a-2a08c51d5299
> ------------------


* Re: Upgraded from Buster to Bullseye, unmountable Btrfs filesystem
  2021-11-10  2:55 ` S.
  2021-11-10  3:34   ` Qu Wenruo
@ 2021-11-10  4:30   ` S.
  2021-11-10  7:01     ` Qu Wenruo
  2021-11-12 15:18     ` S.
  1 sibling, 2 replies; 16+ messages in thread
From: S. @ 2021-11-10  4:30 UTC (permalink / raw)
  To: linux-btrfs

Thanks again for your help Qu.

> There is a user space tool, memtester, which pins down a large chunk of memory, and do tests in user space.

I see, thanks, I'll try that and report back.

> Yep, exactly the problem.

I'm not very familiar with the low level functions of Btrfs. Could you please help me understand what went wrong? Assuming that this isn't a bad RAM issue, is the filesystem in the same state as it was before the OS upgrade, just with a stricter kernel now that doesn't like the bad blocks? Or did the tree get damaged during the OS upgrade process?

> And you haven't yet recevied any subovlume, it would be way much easier to rebuild the tree.

So should I wait for a btrfs-progs update? I'm not sure how to build it for armel. Any idea on the timeframe?
I assume that `btrfs check --repair` is not an option?

Thanks again.


* Re: Upgraded from Buster to Bullseye, unmountable Btrfs filesystem
  2021-11-10  4:30   ` S.
@ 2021-11-10  7:01     ` Qu Wenruo
  2021-11-10 14:01       ` S.
  2021-11-12 15:18     ` S.
  1 sibling, 1 reply; 16+ messages in thread
From: Qu Wenruo @ 2021-11-10  7:01 UTC (permalink / raw)
  To: S., linux-btrfs



On 2021/11/10 12:30, S. wrote:
> Thanks again for your help Qu.
>
>> There is a user space tool, memtester, which pins down a large chunk
>> of memory, and do tests in user space.
>
> I see, thanks, I'll try that and report back.
>
>> Yep, exactly the problem.
>
> I'm not very familiar with the low level functions of Btrfs. Could you
> please help me understand what went wrong?

Btrfs subvolumes have a UUID, mostly used for send/receive.

And btrfs also has a tree to record UUID->subvolume mapping, that's the
UUID tree involved in this case.

The problem here is that in the UUID tree we should only have two types of
keys, (XXXX BTRFS_UUID_KEY_SUBVOL XXXX) and (XXXX
BTRFS_UUID_KEY_RECEIVED_SUBVOL XXXX).

But strangely, your dump-tree output has a key of type EXTENT_ITEM,
which should never show up in the UUID tree.

Now the newer kernel exposes this problem and refuses to mount.


> Assuming that this isn't a
> bad RAM issue, is the the filesystem in the same state as it was before
> the OS upgrade, just with a more strict kernel now that doesn't like the
> bad blocks?

The reason I suspect a RAM problem is that every tree
block in btrfs has its own checksum.

And in your case the checksum for that tree block passed, which means
the corruption did not come from on-disk bit rot, but from something at
runtime, before the checksum was calculated.
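
The reasoning about the checksum can be illustrated with a toy example.
This is only a sketch of the principle: it uses the POSIX `cksum` utility
rather than the crc32c that btrfs uses, and the strings are made-up
stand-ins for a tree block.

```sh
# Hypothetical helper: checksum of a string (cksum prints "<crc> <length>").
csum_of() { printf '%s' "$1" | cksum | cut -d' ' -f1; }

in_memory='objectid type=168 offset'    # already wrong in RAM before the write
stored_csum=$(csum_of "$in_memory")     # the write path checksums what it has

# Read path: the block comes back exactly as written, so the checksum
# matches even though the content was never correct.
if [ "$(csum_of "$in_memory")" = "$stored_csum" ]; then
    echo "checksum ok, content still wrong"
fi

# Bit rot after the write changes the data but not the stored checksum,
# so it is detected.
rotted='objectid type=169 offset'
if [ "$(csum_of "$rotted")" != "$stored_csum" ]; then
    echo "checksum mismatch, bit rot detected"
fi
```

A checksum only proves that the block read back is the block that was
written; it says nothing about whether the written content was sane.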

And it's not caused by the newer kernel, as kernels newer than v5.11 have
write-time sanity checks that prevent such corruption from reaching disk.

> Or did the tree get damaged during the OS upgrade process?

I don't think so, I believe it's some very old corruption, not exposed
until the latest update.

>
>> And you haven't yet recevied any subovlume, it would be way much
>> easier to rebuild the tree.
>
> So should I wait for a btrfs-progs update? I'm not sure how to build it
> for armel. Any idea on the timeframe?
> I assume that `btrfs check --repair` is not an option?

btrfs check --repair is not yet an option.

But there is another option.

If you can revert to the older kernel/distro, then you can mount the fs with
"-o rescan_uuid_tree" to regenerate the UUID tree using the old kernel.

That would rebuild the UUID tree, with no need for a tool in user space.

Thanks,
Qu

>
> Thanks again.


* Re: Upgraded from Buster to Bullseye, unmountable Btrfs filesystem
  2021-11-10  7:01     ` Qu Wenruo
@ 2021-11-10 14:01       ` S.
  2021-11-10 23:46         ` Qu Wenruo
  2021-11-11  5:22         ` S.
  0 siblings, 2 replies; 16+ messages in thread
From: S. @ 2021-11-10 14:01 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs


On 11/10/21 02:01, Qu Wenruo wrote:
> If you can revert to older kernel/distro, then you can mount the fs with
> "-o rescan_uuid" to regenerate the UUID tree using the old kernel.
> 
> Then it would rebuild the UUID tree, no need for a tool in user space.

Thanks very much for the explanation and for the suggestions.
Fortunately the system saved a copy of the old 4.19 kernel and initrd image. You are correct that the old kernel can still mount the filesystem without errors. Then I unmounted it and remounted it with `-o rescan_uuid_tree`. This also appeared to work, as it was able to mount. However, after rebooting into the new Bullseye kernel the filesystem is still unmountable:

------------------------
[  115.250774] BTRFS info (device sda3): flagging fs with big metadata feature
[  115.257773] BTRFS info (device sda3): disk space caching is enabled
[  115.264089] BTRFS info (device sda3): has skinny extents
[  115.414229] BTRFS critical (device sda3): corrupt leaf: root=9 block=170459136 slot=0, invalid key objectid, have 1101835439474057344 expect to be aligned to 4096
[  115.428780] BTRFS error (device sda3): block=170459136 read time tree block corruption detected
[  115.459643] BTRFS critical (device sda3): corrupt leaf: root=9 block=170459136 slot=0, invalid key objectid, have 1101835439474057344 expect to be aligned to 4096
[  115.474296] BTRFS error (device sda3): block=170459136 read time tree block corruption detected
[  115.483109] BTRFS warning (device sda3): failed to read root (objectid=9): -5
[  115.534748] BTRFS error (device sda3): open_ctree failed
------------------------

Any more ideas? Thanks again.


* Re: Upgraded from Buster to Bullseye, unmountable Btrfs filesystem
  2021-11-10 14:01       ` S.
@ 2021-11-10 23:46         ` Qu Wenruo
  2021-11-11  0:18           ` S.
  2021-11-11  5:22         ` S.
  1 sibling, 1 reply; 16+ messages in thread
From: Qu Wenruo @ 2021-11-10 23:46 UTC (permalink / raw)
  To: S., linux-btrfs



On 2021/11/10 22:01, S. wrote:
>
> On 11/10/21 02:01, Qu Wenruo wrote:
>> If you can revert to older kernel/distro, then you can mount the fs with
>> "-o rescan_uuid" to regenerate the UUID tree using the old kernel.
>>
>> Then it would rebuild the UUID tree, no need for a tool in user space.
>
> Thanks very much for the explanation and for the suggestions.
> Fortunately the system saved a copy of the old 4.19 kernel and initrd
> image. You are correct that the old kernel can still boot the filesystem
> without errors. Then I unmounted it and remounted it with `-o
> rescan_uuid_tree`. This also appeared to work, as it was able to mount.
> However, after rebooting into the new Bullseye kernel the filesystem is
> still unmountable:
>
> ------------------------
> [  115.250774] BTRFS info (device sda3): flagging fs with big metadata
> feature
> [  115.257773] BTRFS info (device sda3): disk space caching is enabled
> [  115.264089] BTRFS info (device sda3): has skinny extents
> [  115.414229] BTRFS critical (device sda3): corrupt leaf: root=9
> block=170459136 slot=0, invalid key objectid, have 1101835439474057344
> expect to be aligned to 4096
> [  115.428780] BTRFS error (device sda3): block=170459136 read time tree
> block corruption detected
> [  115.459643] BTRFS critical (device sda3): corrupt leaf: root=9
> block=170459136 slot=0, invalid key objectid, have 1101835439474057344
> expect to be aligned to 4096
> [  115.474296] BTRFS error (device sda3): block=170459136 read time tree
> block corruption detected
> [  115.483109] BTRFS warning (device sda3): failed to read root
> (objectid=9): -5
> [  115.534748] BTRFS error (device sda3): open_ctree failed
> ------------------------

After a deeper look into the UUID rescan code: it doesn't delete the
corrupted keys, it only adds back the correct items.

So we still need btrfs-progs to repair it.

Thus I believe you still need to prepare a build environment for it.

In the worst case, I could try to build a static btrfs-progs for you if
you provide the "uname -a" output.

Thanks,
Qu
>
> Any more ideas? Thanks again.


* Re: Upgraded from Buster to Bullseye, unmountable Btrfs filesystem
  2021-11-10 23:46         ` Qu Wenruo
@ 2021-11-11  0:18           ` S.
  2021-11-11  0:58             ` Qu Wenruo
  0 siblings, 1 reply; 16+ messages in thread
From: S. @ 2021-11-11  0:18 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs

On 11/10/21 18:46, Qu Wenruo wrote:
> So it means, we still need btrfs-progs to repair it.
> 
> Thus I believe you still need to prepare a build environment for it.
> 
> For the worst case, I could try to build a static btrfs-progs for you if
> you could provide the "uname -a" output.

Thanks for your time and patience. If you have time for a static build of btrfs-progs I would be very grateful.
# uname -a
Linux OpenMediaVault 5.10.0-9-marvell #1 Debian 5.10.70-1 (2021-09-30) armv5tel GNU/Linux


* Re: Upgraded from Buster to Bullseye, unmountable Btrfs filesystem
  2021-11-11  0:18           ` S.
@ 2021-11-11  0:58             ` Qu Wenruo
  2021-11-11  2:29               ` S.
       [not found]               ` <e19518ec-a885-4a1d-1dda-a5be645a1d73@gmail.com>
  0 siblings, 2 replies; 16+ messages in thread
From: Qu Wenruo @ 2021-11-11  0:58 UTC (permalink / raw)
  To: S., Qu Wenruo, linux-btrfs



On 2021/11/11 08:18, S. wrote:
> On 11/10/21 18:46, Qu Wenruo wrote:
>> So it means, we still need btrfs-progs to repair it.
>>
>> Thus I believe you still need to prepare a build environment for it.
>>
>> For the worst case, I could try to build a static btrfs-progs for you if
>> you could provide the "uname -a" output.
> 
> Thanks for your time and patience. If you have time for a static build 
> of btrfs-progs I would be very grateful.
> # uname -a
> Linux OpenMediaVault 5.10.0-9-marvell #1 Debian 5.10.70-1 (2021-09-30) 
> armv5tel GNU/Linux
> 

Oh, I was expecting something like aarch64...

I don't have the toolchain for armv5 at hand.

But there is still another way.

Don't mount the fs with the older kernel yet. First make sure the newer
kernel still fails with the same dmesg output, especially a line like:

[  115.428780] BTRFS error (device sda3): block=170459136 read time tree 
block corruption detected

Then use the following command to grab the physical position of the 
corrupted tree block:

# btrfs-map-logical -l 170459136 /dev/sda3

It would show something like:

mirror 1 logical 170459136 physical 178847744 device /dev/test/test
mirror 2 logical 170459136 physical 447283200 device /dev/test/test

Then use the physical values to read the tree block out:

# dd if=/dev/sda3 of=/tmp/mirror1 bs=1 count=16k skip=178847744
# dd if=/dev/sda3 of=/tmp/mirror2 bs=1 count=16k skip=447283200

Then send the files mirror1 and mirror2 to me, so I can modify the tree
block manually.

NOTE: 178847744 and 447283200 must be replaced with the values from your
btrfs-map-logical output; the numbers I used are just examples.
And 16k should be your node size, which you can verify with "btrfs ins
dump-super /dev/sda3 | grep nodesize".
16K has been the default value for a long time.
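
With bs=1, dd copies one byte per read, which works but is slow. When the
physical offset is a multiple of the node size, the same block can be
read in a single nodesize-sized read. The sketch below assumes that
alignment and checks it first; `read_tree_block` is a hypothetical helper,
and the 16384 default is just the node size mentioned above.

```sh
# Hypothetical helper: copy one tree block from <device> at byte offset
# <physical> into <out>, using one nodesize-sized read instead of bs=1.
# usage: read_tree_block <device> <physical> <out> [nodesize]
read_tree_block() {
    dev=$1; phys=$2; out=$3; nodesize=${4:-16384}
    if [ $(( phys % nodesize )) -ne 0 ]; then
        echo "offset $phys is not a multiple of $nodesize, use bs=1 instead" >&2
        return 1
    fi
    # skip= counts input blocks of size bs, so divide the byte offset.
    dd if="$dev" of="$out" bs="$nodesize" count=1 skip=$(( phys / nodesize )) 2>/dev/null
}
```

For the example numbers above, this would be called as
`read_tree_block /dev/sda3 178847744 /tmp/mirror1`, with the device and
offset taken from the matching line of the btrfs-map-logical output.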

Thanks,
Qu



* Re: Upgraded from Buster to Bullseye, unmountable Btrfs filesystem
  2021-11-11  0:58             ` Qu Wenruo
@ 2021-11-11  2:29               ` S.
       [not found]               ` <e19518ec-a885-4a1d-1dda-a5be645a1d73@gmail.com>
  1 sibling, 0 replies; 16+ messages in thread
From: S. @ 2021-11-11  2:29 UTC (permalink / raw)
  To: linux-btrfs

Thanks very much Qu! I just sent you the two mirror files to your personal email.


* Re: Upgraded from Buster to Bullseye, unmountable Btrfs filesystem
  2021-11-10 14:01       ` S.
  2021-11-10 23:46         ` Qu Wenruo
@ 2021-11-11  5:22         ` S.
  2021-11-11  6:20           ` Rosen Penev
  1 sibling, 1 reply; 16+ messages in thread
From: S. @ 2021-11-11  5:22 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs

Oh, and I forgot to report that memtester didn't find any errors. I ran it one more time too, I think I managed to request up to 210M before the OOM-killer intervened:

---------------------------------

root@OpenMediaVault:~# memtester 200M 1
memtester version 4.5.0 (32-bit)
Copyright (C) 2001-2020 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).

pagesize is 4096
pagesizemask is 0xfffff000
want 200MB (209715200 bytes)
got  200MB (209715200 bytes), trying mlock ...locked.
Loop 1/1:
   Stuck Address       : ok
   Random Value        : ok
   Compare XOR         : ok
   Compare SUB         : ok
   Compare MUL         : ok
   Compare DIV         : ok
   Compare OR          : ok
   Compare AND         : ok
   Sequential Increment: ok
   Solid Bits          : ok
   Block Sequential    : ok
   Checkerboard        : ok
   Bit Spread          : ok
   Bit Flip            : ok
   Walking Ones        : ok
   Walking Zeroes      : ok
   8-bit Writes        : ok
   16-bit Writes       : ok

---------------------------------


* Re: Upgraded from Buster to Bullseye, unmountable Btrfs filesystem
  2021-11-11  5:22         ` S.
@ 2021-11-11  6:20           ` Rosen Penev
  2021-11-11 14:26             ` S.
  0 siblings, 1 reply; 16+ messages in thread
From: Rosen Penev @ 2021-11-11  6:20 UTC (permalink / raw)
  To: S.; +Cc: Qu Wenruo, linux-btrfs

On Wed, Nov 10, 2021 at 9:25 PM S. <sb56637@gmail.com> wrote:
>
> Oh, and I forgot to report that memtester didn't find any errors. I ran it one more time too, I think I managed to request up to 210M before the OOM-killer intervened:
I bet the actual issue is some 32-bit problem...
>
> ---------------------------------
>
> root@OpenMediaVault:~# memtester 200M 1
> memtester version 4.5.0 (32-bit)
> Copyright (C) 2001-2020 Charles Cazabon.
> Licensed under the GNU General Public License version 2 (only).
>
> pagesize is 4096
> pagesizemask is 0xfffff000
> want 200MB (209715200 bytes)
> got  200MB (209715200 bytes), trying mlock ...locked.
> Loop 1/1:
>    Stuck Address       : ok
>    Random Value        : ok
>    Compare XOR         : ok
>    Compare SUB         : ok
>    Compare MUL         : ok
>    Compare DIV         : ok
>    Compare OR          : ok
>    Compare AND         : ok
>    Sequential Increment: ok
>    Solid Bits          : ok
>    Block Sequential    : ok
>    Checkerboard        : ok
>    Bit Spread          : ok
>    Bit Flip            : ok
>    Walking Ones        : ok
>    Walking Zeroes      : ok
>    8-bit Writes        : ok
>    16-bit Writes       : ok
>
> ---------------------------------


* Re: Upgraded from Buster to Bullseye, unmountable Btrfs filesystem
  2021-11-11  6:20           ` Rosen Penev
@ 2021-11-11 14:26             ` S.
  0 siblings, 0 replies; 16+ messages in thread
From: S. @ 2021-11-11 14:26 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Rosen Penev

On 11/11/21 01:20, Rosen Penev wrote:
> On Wed, Nov 10, 2021 at 9:25 PM S. <sb56637@gmail.com> wrote:
>>
>> Oh, and I forgot to report that memtester didn't find any errors. I ran it one more time too, I think I managed to request up to 210M before the OOM-killer intervened:
> I bet the actual issue is some 32-bit problem...

That's what I'm thinking too.


* Re: Upgraded from Buster to Bullseye, unmountable Btrfs filesystem
       [not found]                   ` <fdcac254-e169-7aba-7a12-c828aaab3231@gmail.com>
@ 2021-11-12  0:06                     ` Qu Wenruo
  0 siblings, 0 replies; 16+ messages in thread
From: Qu Wenruo @ 2021-11-12  0:06 UTC (permalink / raw)
  To: S., Qu Wenruo, linux-btrfs



On 2021/11/11 21:55, S. wrote:
> On 11/11/21 02:11, Qu Wenruo wrote:
>
>>> root@OpenMediaVault:~# dd if=/dev/sda3 of=/tmp/mirror1 bs=1 count=16k
>>> skip=149487616
>>> 16384+0 records in
>>> 16384+0 records out
>>> 16384 bytes (16 kB, 16 KiB) copied, 0.305711 s, 53.6 kB/s
>>
>> My bad, this copy should be from /dev/sdb2, not /dev/sda3.
>>
>> No wonder the two copies don't match.
>
> Hi, I really appreciate your help to fix this. I am attaching the new
> `mirror1` image, generated like this:
>
> ------------------------
> root@OpenMediaVault:~# dd if=/dev/sdb2 of=/tmp/mirror1 bs=1 count=16k

You missed the "skip=" parameter.

> 16384+0 records in
> 16384+0 records out
> 16384 bytes (16 kB, 16 KiB) copied, 0.274943 s, 59.6 kB/s
> root@OpenMediaVault:~# rsync -avh /tmp/mirror1
> sully@192.168.1.14:/home/sully/Desktop/
> Password:
> sending incremental file list
> mirror1
>
> sent 16.49K bytes  received 35 bytes  3.67K bytes/sec
> total size is 16.38K  speedup is 0.99
> ------------------------
>
>
>> # dd if=mirror2 of=/dev/sdb2 bs=1 count=16k skip=149487616
>> # dd if=mirror2 of=/dev/sda3 bs=1 count=16k skip=170459136

All my bad.

The correct command line should be:

# dd if=mirror2 of=/dev/sdb2 bs=1 count=16k seek=149487616
# dd if=mirror2 of=/dev/sda3 bs=1 count=16k seek=170459136

Thankfully, even with the wrong command, btrfs has enough space reserved
at its beginning, so your fs is still untouched and very safe.
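
The difference between the two options: skip= moves the starting point in
the input (if=), while seek= moves the starting point in the output (of=).
A harmless way to see seek= at work is to try it on scratch files first.
This is only a sketch; `write_at` is a hypothetical helper, and
conv=notrunc is added because the stand-in "disk" is a regular file that
dd would otherwise truncate.

```sh
# Hypothetical helper: write <src> into <dst> starting at byte offset <off>.
write_at() { dd if="$1" of="$2" bs=1 seek="$3" conv=notrunc 2>/dev/null; }

work=$(mktemp -d)
printf 'aaaaaaaaaa' > "$work/disk"    # 10-byte stand-in for the partition
printf 'XY' > "$work/patch"           # stand-in for the repaired tree block

write_at "$work/patch" "$work/disk" 4
cat "$work/disk"; echo                # prints aaaaXYaaaa
rm -rf "$work"
```

Only the two bytes at offset 4 change; everything before and after stays
as it was, which is the behaviour wanted when writing a repaired block
back.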

Thanks,
Qu
>
> After writing out your fixed `mirror2` image with the above commands I
> get these dmesg errors when attempting to mount it:
>
> -------------------------
> [   28.859193] BTRFS info (device sda3): flagging fs with big metadata
> feature
> [   28.866209] BTRFS info (device sda3): disk space caching is enabled
> [   28.872507] BTRFS info (device sda3): has skinny extents
> [   29.053000] BTRFS critical (device sda3): corrupt leaf: root=9
> block=170459136 slot=0, invalid key objectid, have 1101835439474057344
> expect to be aligned to 4096
> [   29.067527] BTRFS error (device sda3): block=170459136 read time tree
> block corruption detected
> [   29.120141] BTRFS critical (device sda3): corrupt leaf: root=9
> block=170459136 slot=0, invalid key objectid, have 1101835439474057344
> expect to be aligned to 4096
> [   29.134781] BTRFS error (device sda3): block=170459136 read time tree
> block corruption detected
> [   29.143568] BTRFS warning (device sda3): failed to read root
> (objectid=9): -5
> [   29.239873] BTRFS error (device sda3): open_ctree failed
> -------------------------
>
> Please let me know if you need anything else. Thanks a lot!


* Re: Upgraded from Buster to Bullseye, unmountable Btrfs filesystem
  2021-11-10  4:30   ` S.
  2021-11-10  7:01     ` Qu Wenruo
@ 2021-11-12 15:18     ` S.
  1 sibling, 0 replies; 16+ messages in thread
From: S. @ 2021-11-12 15:18 UTC (permalink / raw)
  To: linux-btrfs

Here's an update and hopefully the conclusion to this thread:
Qu has been a huge help, and I really appreciate his time and patience. It was ultimately necessary for him to manually repair my tree block, since my NAS uses an old and obscure processor architecture. But in the end he sent me back a repaired tree block file with instructions to write it back out to my disks, and now the filesystem is working fine. Still no idea how this happened, as memtester doesn't find any issues. It's possible that I hit some weird 32-bit quirk, as I imagine that not a lot of people are running Btrfs RAID-1 on 32-bit systems. The good thing that came out of this is a new `clear-uuid-tree` function that Qu is proposing for btrfs-progs, so hopefully there will be an automated way to fix this weird error if it hits anybody in the future. Thanks again for the support!
