* btrfsck lowmem mode shows corruptions
@ 2017-05-04 17:29 Kai Krakow
  2017-05-05  0:55 ` Qu Wenruo
  0 siblings, 1 reply; 4+ messages in thread
From: Kai Krakow @ 2017-05-04 17:29 UTC
  To: linux-btrfs

Hello!

Since I saw a few kernel freezes lately (due to experimenting with
ck-sources) including some filesystem-related backtraces, I booted my
rescue system to check my btrfs filesystem.

Luckily, it showed no problems. It said everything's fine. But I also
thought: Okay, let's try lowmem mode. And that showed a frighteningly
long list of extent corruptions and unreferenced chunks. Should I worry?

PS: The freezes seem to be related to bfq, switching to deadline solved
these.

Full log attached, here's an excerpt:

---8<---

checking extents
ERROR: chunk[256 4324327424) stripe 0 did not find the related dev extent
ERROR: chunk[256 4324327424) stripe 1 did not find the related dev extent
ERROR: chunk[256 4324327424) stripe 2 did not find the related dev extent
ERROR: chunk[256 7545552896) stripe 0 did not find the related dev extent
ERROR: chunk[256 7545552896) stripe 1 did not find the related dev extent
ERROR: chunk[256 7545552896) stripe 2 did not find the related dev extent
[...]
ERROR: device extent[1, 1094713344, 1073741824] did not find the related chunk
ERROR: device extent[1, 2168455168, 1073741824] did not find the related chunk
ERROR: device extent[1, 3242196992, 1073741824] did not find the related chunk
[...]
ERROR: device extent[2, 608854605824, 1073741824] did not find the related chunk
ERROR: device extent[2, 609928347648, 1073741824] did not find the related chunk
ERROR: device extent[2, 611002089472, 1073741824] did not find the related chunk
[...]
ERROR: device extent[3, 64433946624, 1073741824] did not find the related chunk
ERROR: device extent[3, 65507688448, 1073741824] did not find the related chunk
ERROR: device extent[3, 66581430272, 1073741824] did not find the related chunk
[...]
ERROR: data extent[96316809216 2097152] backref lost
ERROR: data extent[96316809216 2097152] backref lost
ERROR: data extent[96316809216 2097152] backref lost
ERROR: data extent[686074396672 13737984] backref lost
ERROR: data extent[686074396672 13737984] backref lost
ERROR: data extent[686074396672 13737984] backref lost
[...]
ERROR: errors found in extent allocation tree or chunk allocation
checking free space cache
checking fs roots
ERROR: errors found in fs roots
Checking filesystem on /dev/disk/by-label/system
UUID: bc201ce5-8f2b-4263-995a-6641e89d4c88
found 1960075935744 bytes used, error(s) found
total csum bytes: 1673537040
total tree bytes: 4899094528
total fs tree bytes: 2793914368
total extent tree bytes: 190398464
btree space waste bytes: 871743708
file data blocks allocated: 6907169177600
 referenced 1979268648960
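
For readers who want to condense such a log, here is a minimal Python
sketch (not part of btrfs-progs) that groups the ERROR lines by kind and
collapses repeats. The regular expressions are assumptions derived only
from the line formats visible in the excerpt above; the names PATTERNS
and summarize are hypothetical. It also checks that the three dev
extents reported for devid 1 are back-to-back 1 GiB ranges.

```python
import re
from collections import Counter

# A few lines copied from the excerpt above; the "[...]" elisions are omitted.
EXCERPT = """\
ERROR: chunk[256 4324327424) stripe 0 did not find the related dev extent
ERROR: chunk[256 4324327424) stripe 1 did not find the related dev extent
ERROR: chunk[256 4324327424) stripe 2 did not find the related dev extent
ERROR: device extent[1, 1094713344, 1073741824] did not find the related chunk
ERROR: device extent[1, 2168455168, 1073741824] did not find the related chunk
ERROR: device extent[1, 3242196992, 1073741824] did not find the related chunk
ERROR: data extent[96316809216 2097152] backref lost
ERROR: data extent[96316809216 2097152] backref lost
ERROR: data extent[96316809216 2097152] backref lost
"""

# One pattern per message kind seen in the excerpt (assumed formats).
PATTERNS = {
    "chunk_stripe": re.compile(
        r"ERROR: chunk\[(\d+) (\d+)\) stripe (\d+) did not find the related dev extent"),
    "dev_extent": re.compile(
        r"ERROR: device extent\[(\d+), (\d+), (\d+)\] did not find the related chunk"),
    "data_extent": re.compile(
        r"ERROR: data extent\[(\d+) (\d+)\] backref lost"),
}


def summarize(text):
    """Return {kind: Counter mapping each distinct record to its line count}."""
    found = {kind: Counter() for kind in PATTERNS}
    for line in text.splitlines():
        for kind, pattern in PATTERNS.items():
            match = pattern.match(line.strip())
            if match:
                found[kind][tuple(int(g) for g in match.groups())] += 1
                break
    return found


summary = summarize(EXCERPT)
for kind, records in summary.items():
    print(f"{kind}: {len(records)} distinct, {sum(records.values())} lines")

# Each dev extent record is (devid, offset, length); check that the
# extents on devid 1 follow each other without gaps.
extents = sorted(summary["dev_extent"])
contiguous = all(a[1] + a[2] == b[1] for a, b in zip(extents, extents[1:]))
print("devid 1 extents contiguous:", contiguous)
```

The repeated "backref lost" lines collapse to a single record, which
makes the remaining list much shorter to compare by hand.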

-- 
Regards,
Kai

Replies to list-only preferred.

[-- Attachment #2: lowmem.txt.gz --]
[-- Type: application/gzip, Size: 31503 bytes --]


* Re: btrfsck lowmem mode shows corruptions
  2017-05-04 17:29 btrfsck lowmem mode shows corruptions Kai Krakow
@ 2017-05-05  0:55 ` Qu Wenruo
  2017-05-05 18:15   ` Kai Krakow
  0 siblings, 1 reply; 4+ messages in thread
From: Qu Wenruo @ 2017-05-05  0:55 UTC
  To: Kai Krakow, linux-btrfs



At 05/05/2017 01:29 AM, Kai Krakow wrote:
> Hello!
> 
> Since I saw a few kernel freezes lately (due to experimenting with
> ck-sources) including some filesystem-related backtraces, I booted my
> rescue system to check my btrfs filesystem.
> 
> Luckily, it showed no problems. It said everything's fine. But I also
> thought: Okay, let's try lowmem mode. And that showed a frighteningly
> long list of extent corruptions and unreferenced chunks. Should I worry?

Thanks for trying lowmem mode.

Would you please provide the version of btrfs-progs?

IIRC the "ERROR: data extent[96316809216 2097152] backref lost" bug has
been fixed in a recent release.

And for reference, would you please provide the tree dump of your chunk 
and device tree?

This can be done by running:
# btrfs-debug-tree -t device <device>
# btrfs-debug-tree -t chunk <device>

These two dumps contain only the btrfs chunk mapping info, so nothing
sensitive is included.
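
If it helps, the two commands can be wrapped so that both dumps end up
as compressed files ready to attach. This is only a sketch: the helper
names (dump_commands, save_dumps) and the output file names are
assumptions, the only thing taken from this mail is the pair of
btrfs-debug-tree invocations, and running them needs root.

```python
import gzip
import subprocess


def dump_commands(device):
    """Build the two btrfs-debug-tree invocations quoted above."""
    return {
        name: ["btrfs-debug-tree", "-t", name, device]
        for name in ("device", "chunk")
    }


def save_dumps(device, prefix=""):
    """Run both dumps and write <prefix><name>-tree.txt.gz (hypothetical names)."""
    written = []
    for name, argv in dump_commands(device).items():
        # check=True raises CalledProcessError if the tool exits non-zero.
        output = subprocess.run(argv, check=True, capture_output=True).stdout
        path = f"{prefix}{name}-tree.txt.gz"
        with gzip.open(path, "wb") as handle:
            handle.write(output)
        written.append(path)
    return written
```

Calling save_dumps("<device>") with the real device path would then
produce one gzip file per tree.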

Thanks,
Qu

* Re: btrfsck lowmem mode shows corruptions
  2017-05-05  0:55 ` Qu Wenruo
@ 2017-05-05 18:15   ` Kai Krakow
  2017-05-10  3:09     ` Qu Wenruo
  0 siblings, 1 reply; 4+ messages in thread
From: Kai Krakow @ 2017-05-05 18:15 UTC
  To: linux-btrfs

On Fri, 5 May 2017 08:55:10 +0800,
Qu Wenruo <quwenruo@cn.fujitsu.com> wrote:

> At 05/05/2017 01:29 AM, Kai Krakow wrote:
> > Hello!
> > 
> > Since I saw a few kernel freezes lately (due to experimenting with
> > ck-sources) including some filesystem-related backtraces, I booted
> > my rescue system to check my btrfs filesystem.
> > 
> > Luckily, it showed no problems. It said everything's fine. But I
> > also thought: Okay, let's try lowmem mode. And that showed a
> > frighteningly long list of extent corruptions and unreferenced
> > chunks. Should I worry?
> 
> Thanks for trying lowmem mode.
> 
> Would you please provide the version of btrfs-progs?

Sorry... I realized it myself the moment I hit the "send" button.

Here it is:

# btrfs version
btrfs-progs v4.10.2

# uname -a
Linux jupiter 4.10.13-ck #2 SMP PREEMPT Thu May 4 23:44:09 CEST 2017
x86_64 Intel(R) Core(TM) i5-2500K CPU @ 3.30GHz GenuineIntel GNU/Linux

> IIRC the "ERROR: data extent[96316809216 2097152] backref lost" bug
> has been fixed in a recent release.

Is there a patch I could apply?

> And for reference, would you please provide the tree dump of your
> chunk and device tree?
> 
> This can be done by running:
> # btrfs-debug-tree -t device <device>
> # btrfs-debug-tree -t chunk <device>

I'll attach those...

I'd like to note that between the original post and these dumps, I
scrubbed and rebalanced the whole device. I think that would scramble up
some numbers. Also, I took the dumps while the fs was online.

If you want me to do clean dumps of the offline device without
intermediate fs processing, let me know.

Thanks,
Kai

-- 
Regards,
Kai

Replies to list-only preferred.

[-- Attachment #2: chunk-tree.txt.gz --]
[-- Type: application/gzip, Size: 25878 bytes --]

[-- Attachment #3: device-tree.txt.gz --]
[-- Type: application/gzip, Size: 44084 bytes --]


* Re: btrfsck lowmem mode shows corruptions
  2017-05-05 18:15   ` Kai Krakow
@ 2017-05-10  3:09     ` Qu Wenruo
  0 siblings, 0 replies; 4+ messages in thread
From: Qu Wenruo @ 2017-05-10  3:09 UTC
  To: Kai Krakow, linux-btrfs



At 05/06/2017 02:15 AM, Kai Krakow wrote:
> On Fri, 5 May 2017 08:55:10 +0800,
> Qu Wenruo <quwenruo@cn.fujitsu.com> wrote:
> 
>> At 05/05/2017 01:29 AM, Kai Krakow wrote:
>>> Hello!
>>>
>>> Since I saw a few kernel freezes lately (due to experimenting with
>>> ck-sources) including some filesystem-related backtraces, I booted
>>> my rescue system to check my btrfs filesystem.
>>>
>>> Luckily, it showed no problems. It said everything's fine. But I
>>> also thought: Okay, let's try lowmem mode. And that showed a
>>> frighteningly long list of extent corruptions and unreferenced
>>> chunks. Should I worry?
>>
>> Thanks for trying lowmem mode.
>>
>> Would you please provide the version of btrfs-progs?
> 
> Sorry... I realized it myself the moment I hit the "send" button.
> 
> Here it is:
> 
> # btrfs version
> btrfs-progs v4.10.2
> 
> # uname -a
> Linux jupiter 4.10.13-ck #2 SMP PREEMPT Thu May 4 23:44:09 CEST 2017
> x86_64 Intel(R) Core(TM) i5-2500K CPU @ 3.30GHz GenuineIntel GNU/Linux
> 
>> IIRC the "ERROR: data extent[96316809216 2097152] backref lost" bug
>> has been fixed in a recent release.
> 
> Is there a patch I could apply?

That's strange, btrfs-progs v4.10.2 already ships the fix:
ad60ed92d1e0edca2769754c3a50129571a0e49d btrfs-progs: check: lowmem, fix
false alert about backref lost for SHARED_DATA_REF

So this must be another bug.

> 
>> And for reference, would you please provide the tree dump of your
>> chunk and device tree?
>>
>> This can be done by running:
>> # btrfs-debug-tree -t device <device>
>> # btrfs-debug-tree -t chunk <device>
> 
> I'll attach those...
> 
> I'd like to note that between the original post and these dumps, I
> scrubbed and rebalanced the whole device. I think that would scramble up
> some numbers. Also, I took the dumps while the fs was online.

Taking the dumps online is not a big problem.
But the rebalance really changed the numbers, making it quite hard to
match the dumps against the previous check error output.

I'll try to write a script to filter the output, but the first several
chunks do match their dev extents in the device tree.

So I'd better dig into the lowmem check code to find the problem.
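
For reference, the consistency rule that the two error messages suggest
can be sketched as below. This is not the lowmem check code from
btrfs-progs, only an illustration under assumptions: the record types
(Stripe, DevExtent), the function cross_check, and all sample values
are made up, and the records are assumed to be already extracted from
the chunk and device tree dumps.

```python
from typing import NamedTuple


class Stripe(NamedTuple):
    chunk_offset: int  # logical start of the chunk owning this stripe
    devid: int
    dev_offset: int    # physical offset of the stripe on that device


class DevExtent(NamedTuple):
    devid: int
    dev_offset: int
    length: int
    chunk_offset: int  # chunk this dev extent claims to belong to


def cross_check(stripes, dev_extents):
    """Return (stripes with no dev extent, dev extents with no chunk stripe)."""
    extent_keys = {(e.devid, e.dev_offset, e.chunk_offset) for e in dev_extents}
    stripe_keys = {(s.devid, s.dev_offset, s.chunk_offset) for s in stripes}
    orphan_stripes = [
        s for s in stripes
        if (s.devid, s.dev_offset, s.chunk_offset) not in extent_keys
    ]
    orphan_extents = [
        e for e in dev_extents
        if (e.devid, e.dev_offset, e.chunk_offset) not in stripe_keys
    ]
    return orphan_stripes, orphan_extents


# Made-up records, only to exercise the function.
GIB = 1 << 30
stripes = [Stripe(10 * GIB, 1, 1 * GIB), Stripe(10 * GIB, 2, 5 * GIB)]
dev_extents = [DevExtent(1, 1 * GIB, GIB, 10 * GIB),
               DevExtent(3, 7 * GIB, GIB, 20 * GIB)]

orphan_stripes, orphan_extents = cross_check(stripes, dev_extents)
print("stripes without dev extent:", orphan_stripes)
print("dev extents without chunk:", orphan_extents)
```

On a healthy filesystem both lists would be expected to be empty, so a
non-empty result from clean dumps would point at the filesystem, while
an empty one would point at the checker.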

Thanks for the report anyway,
Qu
