* BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
@ 2022-02-27 17:45 Christoph Anton Mitterer
  2022-02-27 23:26 ` Qu Wenruo
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Anton Mitterer @ 2022-02-27 17:45 UTC (permalink / raw)
  To: linux-btrfs

Hey.

This is on 5.16.11, Debian sid.

This filesystem has existed for quite a long time (it is the one from
my main working notebook).

Today I was doing a full backup onto another btrfs, with:
  tar --selinux --xattrs "--xattrs-include=*" --acls --numeric-owner -cf backup.tar /mnt/snapshots/2022-02-27
in which /mnt/snapshots/2022-02-27 is a snapshot from the filesystem
with the issues.

While tar was running (actually, it still is), I got these in the
kernel log:
Feb 27 18:35:10 heisenberg kernel: BTRFS info (device dm-1): disk space caching is enabled
Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum hole found for disk bytenr range [64511299584, 64511303680)
Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum hole found for disk bytenr range [64511303680, 64511307776)
Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum hole found for disk bytenr range [64511307776, 64511311872)
Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum hole found for disk bytenr range [64511311872, 64511315968)
Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum hole found for disk bytenr range [64511315968, 64511320064)
Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum hole found for disk bytenr range [64511320064, 64511324160)
Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum hole found for disk bytenr range [64511324160, 64511328256)
Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum hole found for disk bytenr range [64511328256, 64511332352)
Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum hole found for disk bytenr range [64511332352, 64511336448)
Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum hole found for disk bytenr range [64511336448, 64511340544)
Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum failed root 1583 ino 1354893 off 601640960 csum 0x62c2c721 expected csum 0x00000000 mirror 1
Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): bdev /dev/mapper/system errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum failed root 1583 ino 1354893 off 601645056 csum 0xff51e027 expected csum 0x00000000 mirror 1
Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): bdev /dev/mapper/system errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum failed root 1583 ino 1354893 off 601649152 csum 0x681a44cd expected csum 0x00000000 mirror 1
Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): bdev /dev/mapper/system errs: wr 0, rd 0, flush 0, corrupt 3, gen 0
Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum failed root 1583 ino 1354893 off 601653248 csum 0xbbfad1b7 expected csum 0x00000000 mirror 1
Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): bdev /dev/mapper/system errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum failed root 1583 ino 1354893 off 601657344 csum 0x09ae86f1 expected csum 0x00000000 mirror 1
Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): bdev /dev/mapper/system errs: wr 0, rd 0, flush 0, corrupt 5, gen 0
Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum failed root 1583 ino 1354893 off 601661440 csum 0x09ee43ad expected csum 0x00000000 mirror 1
Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): bdev /dev/mapper/system errs: wr 0, rd 0, flush 0, corrupt 6, gen 0
Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum failed root 1583 ino 1354893 off 601665536 csum 0xaae8fc18 expected csum 0x00000000 mirror 1
Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): bdev /dev/mapper/system errs: wr 0, rd 0, flush 0, corrupt 7, gen 0
Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum failed root 1583 ino 1354893 off 601669632 csum 0xe6d04b46 expected csum 0x00000000 mirror 1
Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): bdev /dev/mapper/system errs: wr 0, rd 0, flush 0, corrupt 8, gen 0
Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum failed root 1583 ino 1354893 off 601673728 csum 0x3e49bf9d expected csum 0x00000000 mirror 1
Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): bdev /dev/mapper/system errs: wr 0, rd 0, flush 0, corrupt 9, gen 0
Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum failed root 1583 ino 1354893 off 601677824 csum 0x08695db5 expected csum 0x00000000 mirror 1
Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): bdev /dev/mapper/system errs: wr 0, rd 0, flush 0, corrupt 10, gen 0

And this in tar:
tar: 2022-02-25_1/home/calestyo/cpu-tests/test-vid-high-res.mkv: File shrank by 334726574 bytes; padding with zeros


1) Any ideas what caused this, or what it means?

2) Can I check whether this is actually the file that caused that
issue?

3) Can I check whether other files are affected?

4) Is it recommended to recreate the filesystem on dm-0?

I have a number of generations of snapshots of that filesystem, also
sent/received onto another btrfs, if that would help with debugging.


Thanks,
Chris


* Re: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
  2022-02-27 17:45 BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22 Christoph Anton Mitterer
@ 2022-02-27 23:26 ` Qu Wenruo
  2022-02-28  0:38   ` Christoph Anton Mitterer
  0 siblings, 1 reply; 14+ messages in thread
From: Qu Wenruo @ 2022-02-27 23:26 UTC (permalink / raw)
  To: Christoph Anton Mitterer, linux-btrfs



On 2022/2/28 01:45, Christoph Anton Mitterer wrote:
> Hey.
>
> This is on 5.16.11, Debian sid.
>
> This filesystem has existed since quite long (is the one from my main
> working notebook).
>
> Today I was doing a full backup onto another btrfs, with:
>    tar --selinux --xattrs "--xattrs-include=*" --acls --numeric-owner -cf backup.tar /mnt/snapshots/2022-02-27
> in which /mnt/snapshots/2022-02-27 is a snapshot from the filesystem
> with the issues.
>
> While tar was (or actually it still is) running I got these in the
> kernel log:
> Feb 27 18:35:10 heisenberg kernel: BTRFS info (device dm-1): disk space caching is enabled
> Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22

I checked the transid:

hex(262166) = 0x40016
hex(22)     = 0x00016

So definitely a bitflip.
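
A quick shell check confirms the two values differ in exactly one bit
(0x40000, i.e. bit 18):

  $ printf '0x%x\n' 262166 22 $((262166 ^ 22))
  0x40016
  0x16
  0x40000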

Please run memtest on the machine.


For the btrfs part, unfortunately the tree-checker itself doesn't yet
have the ability to verify the transid of a tree block being written.

I may add something like a check against last_trans_committed, but that
definitely needs extra care before doing it.

Thanks,
Qu

> Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum hole found for disk bytenr range [64511299584, 64511303680)
> Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
> Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum hole found for disk bytenr range [64511303680, 64511307776)
> Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
> Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum hole found for disk bytenr range [64511307776, 64511311872)
> Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
> Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum hole found for disk bytenr range [64511311872, 64511315968)
> Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
> Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum hole found for disk bytenr range [64511315968, 64511320064)
> Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
> Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum hole found for disk bytenr range [64511320064, 64511324160)
> Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
> Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum hole found for disk bytenr range [64511324160, 64511328256)
> Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
> Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum hole found for disk bytenr range [64511328256, 64511332352)
> Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
> Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum hole found for disk bytenr range [64511332352, 64511336448)
> Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
> Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum hole found for disk bytenr range [64511336448, 64511340544)
> Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum failed root 1583 ino 1354893 off 601640960 csum 0x62c2c721 expected csum 0x00000000 mirror 1
> Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): bdev /dev/mapper/system errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
> Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum failed root 1583 ino 1354893 off 601645056 csum 0xff51e027 expected csum 0x00000000 mirror 1
> Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): bdev /dev/mapper/system errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
> Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum failed root 1583 ino 1354893 off 601649152 csum 0x681a44cd expected csum 0x00000000 mirror 1
> Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): bdev /dev/mapper/system errs: wr 0, rd 0, flush 0, corrupt 3, gen 0
> Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum failed root 1583 ino 1354893 off 601653248 csum 0xbbfad1b7 expected csum 0x00000000 mirror 1
> Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): bdev /dev/mapper/system errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
> Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum failed root 1583 ino 1354893 off 601657344 csum 0x09ae86f1 expected csum 0x00000000 mirror 1
> Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): bdev /dev/mapper/system errs: wr 0, rd 0, flush 0, corrupt 5, gen 0
> Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum failed root 1583 ino 1354893 off 601661440 csum 0x09ee43ad expected csum 0x00000000 mirror 1
> Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): bdev /dev/mapper/system errs: wr 0, rd 0, flush 0, corrupt 6, gen 0
> Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum failed root 1583 ino 1354893 off 601665536 csum 0xaae8fc18 expected csum 0x00000000 mirror 1
> Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): bdev /dev/mapper/system errs: wr 0, rd 0, flush 0, corrupt 7, gen 0
> Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum failed root 1583 ino 1354893 off 601669632 csum 0xe6d04b46 expected csum 0x00000000 mirror 1
> Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): bdev /dev/mapper/system errs: wr 0, rd 0, flush 0, corrupt 8, gen 0
> Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum failed root 1583 ino 1354893 off 601673728 csum 0x3e49bf9d expected csum 0x00000000 mirror 1
> Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): bdev /dev/mapper/system errs: wr 0, rd 0, flush 0, corrupt 9, gen 0
> Feb 27 18:36:52 heisenberg kernel: BTRFS warning (device dm-0): csum failed root 1583 ino 1354893 off 601677824 csum 0x08695db5 expected csum 0x00000000 mirror 1
> Feb 27 18:36:52 heisenberg kernel: BTRFS error (device dm-0): bdev /dev/mapper/system errs: wr 0, rd 0, flush 0, corrupt 10, gen 0
>
> And this in tar:
> tar: 2022-02-25_1/home/calestyo/cpu-tests/test-vid-high-res.mkv: File shrank by 334726574 bytes; padding with zeros
>
>
> 1) Any ideas what caused this respectively what it means?
>
> 2) Can I check whether this is actually the file that caused that
> issue?
>
> 3) Can I check whether other files are affected?
>
> 4) Is it recommended to recreate the filesystem on dm-0?
>
> I would have a number of generations of snaphsots of that filesystem,
> also sent/received onto another btrfs, if that would help anything for
> debugging.
>
>
> Thanks,
> Chris


* Re: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
  2022-02-27 23:26 ` Qu Wenruo
@ 2022-02-28  0:38   ` Christoph Anton Mitterer
  2022-02-28  0:55     ` Qu Wenruo
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Anton Mitterer @ 2022-02-28  0:38 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs

On Mon, 2022-02-28 at 07:26 +0800, Qu Wenruo wrote:
> I checked the transid:
> 
> hex(262166) = 0x40016
> hex(22)     = 0x00016
> 
> So definitely a bitflip.

Hmm... that would be a surprise, since I copy loads of data over that
machine, which is always protected by SHA512 sums.
But it could of course be possible.

I assume you mean the bitflip would have happened when the data was
written? Because reading it reproducibly causes the same issue.

But shouldn't a scrub have noticed that? That file was created around
January 2019, and since then I've run several scrubs at least.


> Please run memtest on the machine.

Will do so later.



Anyway... is it recommended to re-create the fs? Or is deleting the
file enough, if a fsck+scrub finds nothing else?


Thanks,
Chris


* Re: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
  2022-02-28  0:38   ` Christoph Anton Mitterer
@ 2022-02-28  0:55     ` Qu Wenruo
  2022-02-28  5:19       ` Christoph Anton Mitterer
  2022-02-28  5:32       ` Christoph Anton Mitterer
  0 siblings, 2 replies; 14+ messages in thread
From: Qu Wenruo @ 2022-02-28  0:55 UTC (permalink / raw)
  To: Christoph Anton Mitterer, linux-btrfs



On 2022/2/28 08:38, Christoph Anton Mitterer wrote:
> On Mon, 2022-02-28 at 07:26 +0800, Qu Wenruo wrote:
>> I checked the transid:
>>
>> hex(262166) = 0x40016
>> hex(22)     = 0x00016
>>
>> So definitely a bitflip.
>
> Hmm... would be a surprise, since I copy loads of data over that
> machine, which is always protected by some SHA512 sums.
> But could of course be possible.
>
> I assume you mean the bitflip would have happened when the data was
> written? Cause reading it reproducibly causes the same issue.

The corrupted part is a tree block in the checksum tree (ironically).

This corruption makes btrfs unable to read (part of) the checksum tree,
and thus unable to verify a lot of data.

>
> But shouldn't a scrub have noticed that?

Please note that scrub only checks the checksums.

For a memory bitflip, since the block is corrupted in memory, the
checksum will be calculated from the corrupted data; the checksum for
that tree block will therefore still match, so scrub won't detect it.
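
As a toy illustration (using sha256sum as a stand-in for btrfs's
metadata csum, with made-up file names):

  $ printf 'c0rrupted tree block' > block   # bitflip already happened in memory
  $ sha256sum block > stored.sums           # csum computed from the corrupted copy
  $ sha256sum -c stored.sums                # later verification still passes
  block: OK

The checksum only proves the block matches whatever was in memory when
it was checksummed, not that the memory content itself was correct.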

> That file was created around
> January 2019, and since then I've had mad several scrubs at least.
>
>
>> Please run memtest on the machine.
>
> Will do so later.
>
>
>
> Anyway... is it recommended to re-create the fs? Or is deleting the
> file enough, if a fsck+scrub finds nothing else.

The problem is not in the file data, but in that checksum tree block.

Unfortunately there is no good way to reset that bitflip using
btrfs-check.

It's possible to manually reset that generation and re-calculate the
csum to fix the fs.

But it needs to be done manually, as no tool has taken bitflip into
consideration.

Thanks,
Qu

>
>
> Thanks,
> Chris


* Re: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
  2022-02-28  0:55     ` Qu Wenruo
@ 2022-02-28  5:19       ` Christoph Anton Mitterer
  2022-02-28  6:54         ` Qu Wenruo
  2022-02-28  5:32       ` Christoph Anton Mitterer
  1 sibling, 1 reply; 14+ messages in thread
From: Christoph Anton Mitterer @ 2022-02-28  5:19 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs

On Mon, 2022-02-28 at 08:55 +0800, Qu Wenruo wrote:
> The corruption part is a tree block in checksum tree (ironically).
> 
> This corruption makes btrfs unable to read (part of) checksum tree,
> thus
> unable to verify a lot of data.

I see... so, can one find out which files are affected by that part of
the checksum tree?



> Please note that, scrub only checks the checksum.

Sure, and it fails, presumably when encountering that broken block, and
then stops completely:
Feb 28 05:56:11 heisenberg kernel: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
Feb 28 05:56:11 heisenberg kernel: BTRFS info (device dm-0): scrub: not finished on devid 1 with status: -5



> For memory bitflip, since it's corrupted in memory, the checksum will
> be
> calculated using the corrupted data, thus the checksum for that tree
> block will be correct, thus scrub won't detect it.

I thought that would depend on where/when the bitflip happens... i.e. if
it happens on either the data or the csum, after the latter has been
calculated but before both are written.



> The problem is not in the file data, but that checksum tree block.
> 
> Unfortunately there will be no good way to reset that bitflip using
> btrfs-check.
> 
> It's possible to manually reset that generation and re-calculate the
> csum to fix the fs.
> 
> But it needs to be done manually, as no tool has taken bitflip into
> consideration.

So how to do it then?

If I could determine which files are affected, and if it was e.g.
just that one... would deleting it help (assuming that this would also
clear the broken part of the checksum tree)?

And if not... how can I recover? Recursively copying all files to a
fresh fs would also fail, I guess.


And is there a way to read the content of the file while ignoring the
csum error?


Thanks,
Chris.


* Re: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
  2022-02-28  0:55     ` Qu Wenruo
  2022-02-28  5:19       ` Christoph Anton Mitterer
@ 2022-02-28  5:32       ` Christoph Anton Mitterer
  2022-02-28  6:48         ` Qu Wenruo
  1 sibling, 1 reply; 14+ messages in thread
From: Christoph Anton Mitterer @ 2022-02-28  5:32 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs

And I still don't understand this:


On Mon, 2022-02-28 at 08:55 +0800, Qu Wenruo wrote:
> The corruption part is a tree block in checksum tree (ironically).
> 
> This corruption makes btrfs unable to read (part of) checksum tree

You say the damage is in the csum tree... is that checksummed itself
(and the error is noticed just by reading the block in the tree)... or
is it noticed when the (actual) data is compared to the (wrong) data in
the checksum tree and the mismatch is detected?


> thus
> unable to verify a lot of data.

How much is a lot? I copied the whole fs before, when I made the
backup... and I got errors only for that 1382301696 and that one
file... all others, tar read without giving any error.
Is that expected?




> For memory bitflip, since it's corrupted in memory, the checksum will
> be
> calculated using the corrupted data, thus the checksum for that tree
> block will be correct, thus scrub won't detect it.

Could that part of the checksum block have been rewritten recently?
Because I sent/received that data at least once in the past to another
fs... and I would have assumed that any error should have shown up
already back then (it didn't, however)... so the bitflip must have
happened recently... after the affected file had been written to disk
originally (and after its checksum had been written originally).


Thanks,
Chris.


* Re: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
  2022-02-28  5:32       ` Christoph Anton Mitterer
@ 2022-02-28  6:48         ` Qu Wenruo
  2022-02-28 15:24           ` Christoph Anton Mitterer
  0 siblings, 1 reply; 14+ messages in thread
From: Qu Wenruo @ 2022-02-28  6:48 UTC (permalink / raw)
  To: Christoph Anton Mitterer, linux-btrfs



On 2022/2/28 13:32, Christoph Anton Mitterer wrote:
> And I still don't understand this:
>
>
> On Mon, 2022-02-28 at 08:55 +0800, Qu Wenruo wrote:
>> The corruption part is a tree block in checksum tree (ironically).
>>
>> This corruption makes btrfs unable to read (part of) checksum tree
>
> You say the damage is in the csum tree... is that checksummed itself
> (and the error is noticed just by reading the block in the tree)... or
> is it noticed when the (actual data) is compared to the (wrong) data in
> the ckecksum tree and the mismatch is detected.

Btrfs handles checksum differently for metadata (tree block) and data.

For metadata, its header has 32 bytes reserved for checksum, and that's
where the csum of metadata is.
Aka, inlined checksum.

For data, we put data checksum into its own tree, aka the csum tree.
It records the logical -> data checksum mapping.

Currently, if anything goes wrong while searching the csum tree, btrfs
will not even submit the data read.

>
>
>> thus
>> unable to verify a lot of data.
>
> How much is a lot? I copied the whole fs before, when I made the
> backup,.. and I got errors only for that 1382301696 and that one
> file... all others, tar read without giving any error.
> Is that expected?

It really depends.

For the worst case, if the generation mismatch happens at a very high
level, the whole csum tree can be rendered useless.
In that case, almost all data can be affected (although the data on-disk
may still be OK).

For the best case, it's just a leaf that got this corruption.
In that case, if you're using SHA256 and 16K nodesize, you get at most
a 2MiB range which cannot be read.
(Again, on disk data can still be fine)
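
(Back-of-the-envelope arithmetic behind that figure, ignoring leaf and
item headers: a 16K leaf holds about 16384 / 32 = 512 SHA256 csums, each
covering one 4K block, so roughly 512 * 4 KiB = 2 MiB of data. With the
default 4-byte crc32c csums, a single leaf covers closer to 16 MiB.)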

>
>
>
>
>> For memory bitflip, since it's corrupted in memory, the checksum will
>> be
>> calculated using the corrupted data, thus the checksum for that tree
>> block will be correct, thus scrub won't detect it.
>
> Could that part of the checksum block have been rewritten recently?

Depends on the generation. If your current generation (which can be
checked with btrfs ins dump-super) is close to the number 262166, then
it's possible it was rewritten recently.

Thanks,
Qu

> Cause I send/received that data at least once in the past to another
> fs... and I would have assumed that any error should have shown up
> already back then (didn't however).... so the bitflip must have
> happened recently... after the affected file had been written to disk
> originally (and after it's checksum had been written originally).
>
>
> Thanks,
> Chris.


* Re: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
  2022-02-28  5:19       ` Christoph Anton Mitterer
@ 2022-02-28  6:54         ` Qu Wenruo
  0 siblings, 0 replies; 14+ messages in thread
From: Qu Wenruo @ 2022-02-28  6:54 UTC (permalink / raw)
  To: Christoph Anton Mitterer, linux-btrfs



On 2022/2/28 13:19, Christoph Anton Mitterer wrote:
> On Mon, 2022-02-28 at 08:55 +0800, Qu Wenruo wrote:
>> The corruption part is a tree block in checksum tree (ironically).
>>
>> This corruption makes btrfs unable to read (part of) checksum tree,
>> thus
>> unable to verify a lot of data.
>
> I see... so, can one find out which files are affected by that part of
> the checksum tree?

It may not be a single file, but a lot of files.

The csum tree only stores two things: the logical bytenr and its csum.

So we need some work to find out:

1) Which logical bytenr range is in that csum tree block

2) Which files own the logical bytenr range.

>
>
>
>> Please note that, scrub only checks the checksum.
>
> Sure, and it fails, presumably then when encountering that broken block
> and stops completely:
> Feb 28 05:56:11 heisenberg kernel: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
> Feb 28 05:56:11 heisenberg kernel: BTRFS info (device dm-0): scrub: not finished on devid 1 with status: -5
>
>
>
>> For memory bitflip, since it's corrupted in memory, the checksum will
>> be
>> calculated using the corrupted data, thus the checksum for that tree
>> block will be correct, thus scrub won't detect it.
>
> I though that would depend on where/when the bitflip happens... i.e. if
> it happens on either the data or the csum, after the latter has been
> calculated but before both are written.
>
>
>
>> The problem is not in the file data, but that checksum tree block.
>>
>> Unfortunately there will be no good way to reset that bitflip using
>> btrfs-check.
>>
>> It's possible to manually reset that generation and re-calculate the
>> csum to fix the fs.
>>
>> But it needs to be done manually, as no tool has taken bitflip into
>> consideration.
>
> So how to do it then?
>
> If I could determine which files are all affected and if it was e.g.
> just that one,... would deleting it help (assuming that this would also
> clear the broken part of the checksum tree)?

No common operations can help.

But I can craft you a special fix to manually reset the generation of
that offending csum tree block, as a last resort method.

>
> And if not... how can  recover? Recursively copying all files to a
> fresh fs would also fail, I guess.
>
>
> And is there a way to read the content of the file while ignoring the
> csum errro?

We have a way: since v5.11 there is a new mount option,
rescue=idatacsums, which does exactly that and completely ignores data csums.
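
For example (read-only mount, which these rescue options require; the
mount point is just an example):

  mount -o ro,rescue=idatacsums /dev/mapper/system /mnt/rescue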

Thanks,
Qu

>
>
> Thanks,
> Chris.


* Re: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
  2022-02-28  6:48         ` Qu Wenruo
@ 2022-02-28 15:24           ` Christoph Anton Mitterer
  2022-03-01  0:19             ` Qu Wenruo
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Anton Mitterer @ 2022-02-28 15:24 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs

On Mon, 2022-02-28 at 14:48 +0800, Qu Wenruo wrote:
> Btrfs handles checksum differently for metadata (tree block) and
> data.
> 
> For metadata, its header has 32 bytes reserved for checksum, and
> that's
> where the csum of metadata is.
> Aka, inlined checksum.

Ah, I see.


> For the best case, it's just a leave got this corruption.
> In that case, if you're using SHA256 and 16K nodesize, you get at
> most
> 2MiB range which can not be read.
> (Again, on disk data can still be fine)

It would be interesting to see how much is actually affected...
shouldn't it be possible to run something like dd_rescue on it? I mean,
I'd probably get thousands of csum errors, but in the end it should
show me how much of the file is gone.



> 
> Depends on the generation. If your current generation (can be checked
> with btrfs ins dump-super) is close to the number 262166, then it's
> possible it's rewritten recently.


Hmm, I assume it's just the "main" generation field?

Then the number would be *pretty* far off, which makes the whole thing
IMO quite strange... as said, the file was written around 2019... and
it had been sent/received at least once.

So I would expect that the corruption or bit-flip must have happened at
some point after it was first sent/received?


# btrfs inspect-internal dump-super -f /dev/mapper/system  
superblock: bytenr=65536, device=/dev/mapper/system
---------------------------------------------------------
csum_type		0 (crc32c)
csum_size		4
csum			0x80b351a8 [match]
bytenr			65536
flags			0x1
			( WRITTEN )
magic			_BHRfS_M [match]
fsid			1639c139-9033-48a6-80ac-3f7103f5421a
metadata_uuid		1639c139-9033-48a6-80ac-3f7103f5421a
label			system
generation		2233998
root			1743650816
sys_array_size		97
chunk_root_generation	2115390
root_level		1
chunk_root		1048576
chunk_root_level	1
log_root		1749876736
log_root_transid	0
log_root_level		0
total_bytes		1006093991936
bytes_used		658049802240
sectorsize		4096
nodesize		16384
leafsize (deprecated)	16384
stripesize		4096
root_dir		6
num_devices		1
compat_flags		0x0
compat_ro_flags		0x0
incompat_flags		0x161
			( MIXED_BACKREF |
			  BIG_METADATA |
			  EXTENDED_IREF |
			  SKINNY_METADATA )
cache_generation	2233998
uuid_tree_generation	2233998
dev_item.uuid		61917e72-aada-41c5-846b-cd1f73a7fcd8
dev_item.fsid		1639c139-9033-48a6-80ac-3f7103f5421a [match]
dev_item.type		0
dev_item.total_bytes	1006093991936
dev_item.bytes_used	851498237952
dev_item.io_align	4096
dev_item.io_width	4096
dev_item.sector_size	4096
dev_item.devid		1
dev_item.dev_group	0
dev_item.seek_speed	0
dev_item.bandwidth	0
dev_item.generation	0
sys_chunk_array[2048]:
	item 0 key (FIRST_CHUNK_TREE CHUNK_ITEM 1048576)
		length 4194304 owner 2 stripe_len 65536 type SYSTEM|single
		io_align 4096 io_width 4096 sector_size 4096
		num_stripes 1 sub_stripes 0
			stripe 0 devid 1 offset 1048576
			dev_uuid 61917e72-aada-41c5-846b-cd1f73a7fcd8
backup_roots[4]:
	backup 0:
		backup_tree_root:	1743650816	gen: 2233998	level: 1
		backup_chunk_root:	1048576	gen: 2115390	level: 1
		backup_extent_root:	1741438976	gen: 2233998	level: 2
		backup_fs_root:		637187178496	gen: 2232658	level: 0
		backup_dev_root:	1230602240	gen: 2233971	level: 1
		backup_csum_root:	1731887104	gen: 2233998	level: 2
		backup_total_bytes:	1006093991936
		backup_bytes_used:	658049802240
		backup_num_devices:	1

	backup 1:
		backup_tree_root:	1749188608	gen: 2233995	level: 1
		backup_chunk_root:	1048576	gen: 2115390	level: 1
		backup_extent_root:	1747468288	gen: 2233995	level: 2
		backup_fs_root:		637187178496	gen: 2232658	level: 0
		backup_dev_root:	1230602240	gen: 2233971	level: 1
		backup_csum_root:	1711751168	gen: 2233995	level: 2
		backup_total_bytes:	1006093991936
		backup_bytes_used:	658049695744
		backup_num_devices:	1

	backup 2:
		backup_tree_root:	1754923008	gen: 2233996	level: 1
		backup_chunk_root:	1048576	gen: 2115390	level: 1
		backup_extent_root:	1746878464	gen: 2233996	level: 2
		backup_fs_root:		637187178496	gen: 2232658	level: 0
		backup_dev_root:	1230602240	gen: 2233971	level: 1
		backup_csum_root:	1711423488	gen: 2233996	level: 2
		backup_total_bytes:	1006093991936
		backup_bytes_used:	658049695744
		backup_num_devices:	1

	backup 3:
		backup_tree_root:	1761574912	gen: 2233997	level: 1
		backup_chunk_root:	1048576	gen: 2115390	level: 1
		backup_extent_root:	1751040000	gen: 2233997	level: 2
		backup_fs_root:		637187178496	gen: 2232658	level: 0
		backup_dev_root:	1230602240	gen: 2233971	level: 1
		backup_csum_root:	1716748288	gen: 2233997	level: 2
		backup_total_bytes:	1006093991936
		backup_bytes_used:	658049806336
		backup_num_devices:	1



On Mon, 2022-02-28 at 14:54 +0800, Qu Wenruo wrote:
> It may not be a single file, but a lot of files.

Shouldn't I be able to find out simply by copying away each file (like
what I did during yesterday's backup)?
Or something like tar -cf /dev/null /

Every file that tar cannot read should give an error, and I'd see which
are affected?


> As csum tree only stores two things, logical bytenr, and its csum.
> 
> So we need some work to find out:
> 
> 1) Which logical bytenr range is in that csum tree block
> 
> 2) Which files owns the logical bytenr range.

Is this possible already with standard tools?


 
> No common operations can help.
> 
> But I can craft you a special fix to manually reset the generation of
> that offending csum tree block, as a last resort method.

I guess, if you'd say that the above way would work to find out which
file was affected, and if it was only that one (which is not
precious)... then I could simply copy all data off to some external
disk, and just re-create the fs.


If I deleted the affected file(s), would btrfs simply clear the broken
csum block?

 
> We have a way, since v5.11, we have a new mount option,
> rescue=idatacsums, which can do exactly that, completely ignore data
> csums.

Ah :-)


Thanks,
Chris.


PS: I'll start the memtest now, and report back once I have some news.


* Re: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
  2022-02-28 15:24           ` Christoph Anton Mitterer
@ 2022-03-01  0:19             ` Qu Wenruo
  2022-03-01  2:14               ` Christoph Anton Mitterer
  0 siblings, 1 reply; 14+ messages in thread
From: Qu Wenruo @ 2022-03-01  0:19 UTC (permalink / raw)
  To: Christoph Anton Mitterer, linux-btrfs



On 2022/2/28 23:24, Christoph Anton Mitterer wrote:
> On Mon, 2022-02-28 at 14:48 +0800, Qu Wenruo wrote:
>> Btrfs handles checksum differently for metadata (tree block) and
>> data.
>>
>> For metadata, its header has 32 bytes reserved for checksum, and
>> that's
>> where the csum of metadata is.
>> Aka, inlined checksum.
>
> Ah, I see.
>
>
>> For the best case, it's just a leave got this corruption.
>> In that case, if you're using SHA256 and 16K nodesize, you get at
>> most
>> 2MiB range which can not be read.
>> (Again, on disk data can still be fine)
>
> It would be interesting to see how much is actually affected,...
> shouldn't it be possible to run something like dd_rescue on it? I mean
> I'd probably get thousands of csum errors, but in the end it should
> show me how much of the file is gone.

As said, no real file data is damaged.
It's just that we can't read the csums.

So go with rescue=idatacsums, and verify the content if you have a backup.

>>
>> Depends on the generation. If your current generation (can be checked
>> with btrfs ins dump-super) is close to the number 262166, then it's
>> possible it's rewritten recently.
>
>
> Hmm, I assume it's just "main" generation field?

Yep.

>
> Then the number would be *pretty* much off. Which makes the whole thing
> IMO quite strange... as said, the file was written around 2019,... and
> it had been sent/received at least once.
>
> So would expect that the corruption or bit-flip would need to have
> happened at some point after it was first sent/received?

I guess the corruption of that csum tree block happened at that time.

And fortunately that range doesn't get utilized much, thus later
reads/writes won't get interrupted by that corrupted tree block.

...
>
>
> On Mon, 2022-02-28 at 14:54 +0800, Qu Wenruo wrote:>
>> It may not be a single file, but a lot of files.
>
> Shouldn't I be able to find out simply by copying away each file (like
> what I did during yesterday's backup)?

Yep, that's possible.

> Or something like tar -cf /dev/null /
>
> Every file that tar cannot read should give an error, and I'd see which
> are affected?

That's also a way.
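
Something along these lines would read everything and keep the errors
(paths are just examples; note that GNU tar may skip reading file
contents when the archive name is literally /dev/null, so writing the
archive to stdout avoids that):

  tar -cf - /mnt/snapshots/2022-02-27 > /dev/null 2> tar-read-errors.log
  journalctl -k | grep -E 'csum failed|parent transid verify failed'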

>
>
>> As csum tree only stores two things, logical bytenr, and its csum.
>>
>> So we need some work to find out:
>>
>> 1) Which logical bytenr range is in that csum tree block
>>
>> 2) Which files owns the logical bytenr range.
>
> Is this possible already with standard tools?

We have a tool for 2): "btrfs ins logical-resolve", to search for all
the files owning a logical bytenr range.

But we don't have a tool for 1); maybe you can use "btrfs ins dump-tree
-b <bytenr>" to check the content of that corrupted tree block.
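
An untested sketch of both steps (device path taken from earlier in the
thread, <data bytenr> to be filled in from the dump output):

  # 1) see which data bytenr ranges the corrupted csum leaf covers
  btrfs inspect-internal dump-tree -b 1382301696 /dev/mapper/system

  # 2) map a data bytenr back to the file(s) owning it
  btrfs inspect-internal logical-resolve <data bytenr> /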
>
>
>
>> No common operations can help.
>>
>> But I can craft you a special fix to manually reset the generation of
>> that offending csum tree block, as a last resort method.
>
> I guess, if you'd say that the above way would work to find out which
> file was affected, and if it was only that one (which is not
> precious)... than I could simply copy all data off to some external
> disk, an just re-create the fs.
>
>
> If I'd delete the affected file(s) would btrfs simply clear the broken
> csum block?

Nope. That generation mismatch would prevent btrfs from doing any
modification, including CoWing the tree block to a new location.

Thanks,
Qu

>
>
>> We have a way, since v5.11, we have a new mount option,
>> rescue=idatacsums, which can do exactly that, completely ignore data
>> csums.
>
> Ah :-)
>
>
> Thanks,
> Chris.
>
>
> PS: I'll start the memtest now, and report back once I have some news.


* Re: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
  2022-03-01  0:19             ` Qu Wenruo
@ 2022-03-01  2:14               ` Christoph Anton Mitterer
  2022-03-01  2:30                 ` Qu Wenruo
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Anton Mitterer @ 2022-03-01  2:14 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs

Hey.

memtest[0] showed that memory is in fact damaged in some higher region... as you guessed, it's always a single bit flip.



On 1 March 2022 at 01:19:12 CET, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>> It would be interesting to see how much is actually affected,...
>> shouldn't it be possible to run something like dd_rescue on it? I mean
>> I'd probably get thousands of csum errors, but in the end it should
>> show me how much of the file is gone.
>
>As said, no real file is damaged.
>It's just we can get csum.

Sure, I had understood that. What I meant was to find out how much of the file (or, if more were affected, which files) is not guaranteed to be integrity-protected, because its csum data is broken.



>> So would expect that the corruption or bit-flip would need to have
>> happened at some point after it was first sent/received?
>
>I guess the corrupted csum tree block happen at that time.

It's still a bit strange, though, because I most likely had run a scrub since then,  and no errors were found...

But in principle, scrub should notice these corruptions in the csum tree, shouldn't it? 


>And fortunately that range doesn't get much utilized thus later
>read/write won't get interrupted by that corrupted tree block.

That I don't understand. You mean the csum tree isn't read/written in that region (i.e. not unless the associated files are read)... and that's why it went so long unnoticed?



>> Shouldn't I be able to find out simply by copying away each file (like
>> what I did during yesterday's backup)?
>
>Yep, that's possible.
>
>> Or something like tar -cf /dev/null /
>>
>> Every file that tar cannot read should give an error, and I'd see which
>> are affected?
>
>That's also a way.

Ok... if both work to find out which files are affected (in the sense that they cannot be verified because the csum is broken... and thus may or may not be valid), then I guess that's the easiest way for me to recover.



>>> 1) Which logical bytenr range is in that csum tree block
>>>
>>> 2) Which files owns the logical bytenr range.
>>
>> Is this possible already with standard tools?
>
>We have tools for 2), "btrfs ins logical-resolve" to search for all the
>files owning a logical bytenr range.

So in principle, since my tar yesterday produced no further errors, that should result in only that one file.


>> If I'd delete the affected file(s) would btrfs simply clear the broken
>> csum block?
>
>Nope. That generation mismatch would prevent btrfs to do any
>modification including CoW the tree block to a new location.

Ah, OK. 


Thanks,
Chris.

[0] For those who haven't seen it yet, there's pcmemtest (https://github.com/martinwhitaker/pcmemtest), which is a fork of (or based upon) memtest86+... but with UEFI support, which memtest86+ lacks.


* Re: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
  2022-03-01  2:14               ` Christoph Anton Mitterer
@ 2022-03-01  2:30                 ` Qu Wenruo
  2022-03-02  1:38                   ` Christoph Anton Mitterer
  0 siblings, 1 reply; 14+ messages in thread
From: Qu Wenruo @ 2022-03-01  2:30 UTC (permalink / raw)
  To: Christoph Anton Mitterer, Qu Wenruo, linux-btrfs



On 2022/3/1 10:14, Christoph Anton Mitterer wrote:
> Hey.
> 
> memtest[0] showed, that in fact memory is damaged in some higher region... as you've guessed, its always a single but flip.

Btrfs is now a pretty good memtest tool too :)

> 
> 
> 
> On 1 March 2022 at 01:19:12 CET, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>> It would be interesting to see how much is actually affected,...
>>> shouldn't it be possible to run something like dd_rescue on it? I mean
>>> I'd probably get thousands of csum errors, but in the end it should
>>> show me how much of the file is gone.
>>
>> As said, no real file is damaged.
>> It's just we can get csum.
> 
> Sure. I've had understood that. What I've meant was to find out how much of the file (or, if more were affected, which files) was not guaranteed to be integrity protected, because its csum data is broken.
> 
> 
> 
>>> So would expect that the corruption or bit-flip would need to have
>>> happened at some point after it was first sent/received?
>>
>> I guess the corrupted csum tree block happen at that time.
> 
> It's still a bit strange, though, because I most likely had run a scrub since then,  and no errors were found...
> 
> But in principle, scrub should notice these corruptions in the csum tree, shouldn't it?

In theory, it should, especially for csum tree with skinny metadata feature.

In that case we should do a tree search and locate that tree block.

But there is a catch: if the tree block is still cached in memory, we 
may not do a full comprehensive check on it, and thus there may be a 
hole allowing the corruption to sneak in.

Anyway, I need to investigate more to be sure how this happened without 
triggering scrub, and to find a way to make btrfs a more robust memtester :)

Thanks,
Qu
> 
> 
>> And fortunately that range doesn't get much utilized thus later
>> read/write won't get interrupted by that corrupted tree block.
> 
> That I don't understand. You mean the csum tree isn't read/written in that region (i.e. not unless the associated files are read)... and that's why it went so long unnoticed?
> 
> 
> 
>>> Shouldn't I be able to find out simply by copying away each file (like
>>> what I did during yesterday's backup)?
>>
>> Yep, that's possible.
>>
>>> Or something like tar -cf /dev/null /
>>>
>>> Every file that tar cannot read should give an error, and I'd see which
>>> are affected?
>>
>> That's also a way.
> 
> Ok... if both works to find out files are affected (in the sense that they cannot be verified because the csum is broken... and thus may or may not be valid)... then I guess that's the easiest way for me to recover.
> 
> 
> 
>>>> 1) Which logical bytenr range is in that csum tree block
>>>>
>>>> 2) Which files owns the logical bytenr range.
>>>
>>> Is this possible already with standard tools?
>>
>> We have tools for 2), "btrfs ins logical-resolve" to search for all the
>> files owning a logical bytenr range.
> 
> So in principle,  since my tar yesterday brought no further errors,  the should result in only that one file.
> 
> 
>>> If I'd delete the affected file(s) would btrfs simply clear the broken
>>> csum block?
>>
>> Nope. That generation mismatch would prevent btrfs to do any
>> modification including CoW the tree block to a new location.
> 
> Ah, OK.
> 
> 
> Thanks,
> Chris.
> 
> [0] For those who haven't seen yet,  there's pcmemtest (https://github.com/martinwhitaker/pcmemtest) which is a fork off (or based upon memtest86+)... but with UEFI support, which memtest86+ cannot be used with.
> 



* Re: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
  2022-03-01  2:30                 ` Qu Wenruo
@ 2022-03-02  1:38                   ` Christoph Anton Mitterer
  2022-03-02  2:01                     ` Qu Wenruo
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Anton Mitterer @ 2022-03-02  1:38 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs

On Tue, 2022-03-01 at 10:30 +0800, Qu Wenruo wrote:
> In that case we should do a tree search and locate that tree block.

... and ...

> Anyway, I need more investigate to be sure on how this happened
> without 
> triggering scrub, and find a way to make btrfs a more robust
> memtester :)

... is anything still needed from my side here? Or is that something
you just meant for your todo list? Because then I'd recreate the fs in
the next few days.


btw: I tried:
# btrfs inspect-internal logical-resolve -P 1382301696 /
ERROR: logical ino ioctl: No such file or directory

but that fails.


Cheers,
Chris


* Re: BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22
  2022-03-02  1:38                   ` Christoph Anton Mitterer
@ 2022-03-02  2:01                     ` Qu Wenruo
  0 siblings, 0 replies; 14+ messages in thread
From: Qu Wenruo @ 2022-03-02  2:01 UTC (permalink / raw)
  To: Christoph Anton Mitterer, linux-btrfs



On 2022/3/2 09:38, Christoph Anton Mitterer wrote:
> On Tue, 2022-03-01 at 10:30 +0800, Qu Wenruo wrote:
>> In that case we should do a tree search and locate that tree block.
>
> ... and ...
>
>> Anyway, I need more investigate to be sure on how this happened
>> without
>> triggering scrub, and find a way to make btrfs a more robust
>> memtester :)
>
> ... still anything needed to do from my side here? Or is that something
> you just meant for your todo list? Cause then I'd recreate the fs in
> the next days.

Please go ahead, I think I have got all I need.
>
>
> btw: I tried:
> # btrfs inspect-internal logical-resolve -P 1382301696 /
> ERROR: logical ino ioctl: No such file or directory

That bytenr belongs to a btree (metadata) block, thus logical-resolve
will return -ENOENT.
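
(For a metadata bytenr like this one, "btrfs inspect-internal dump-tree
-b 1382301696 <device>" is the inspection tool that applies instead, as
mentioned earlier in the thread.)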

Thanks,
Qu
>
> but that fails.
>
>
> Cheers,
> Chris

