* btrfsck crashes
@ 2012-07-08 16:08 Christian Volkmann
  2012-07-09  3:40 ` Anand Jain
  0 siblings, 1 reply; 10+ messages in thread
From: Christian Volkmann @ 2012-07-08 16:08 UTC (permalink / raw)
  To: linux-btrfs

Hi there,

I have a corrupted filesystem. This filesystem crashes btrfsck.

A gdb analysis showed me:
(gdb) bt
#0  0x0000000000402379 in btrfs_header_nritems (eb=0x0) at ctree.h:1426
#1  0x0000000000408c14 in run_next_block (root=0x73fb40, bits=0x740d50, bits_nr=1024, last=0x7fffffffd948, pending=0x7fffffffda40,
     seen=0x7fffffffda50, reada=0x7fffffffda30, nodes=0x7fffffffda20, extent_cache=0x7fffffffda60) at btrfsck.c:2512
#2  0x00000000004099e2 in check_extents (root=0x73fb40) at btrfsck.c:2792
#3  0x0000000000409bec in main (ac=1, av=0x7fffffffdbe8) at btrfsck.c:2853

What I have seen: buf is "0", after read_tree_block.

btrfsck.c:2511 buf = read_tree_block(root, bytenr, size, 0);
           2512 nritems = btrfs_header_nritems(buf);

So the crash happens in ctree.h, in btrfs_header_nritems(buf):
...
static inline u##bits btrfs_##name(struct extent_buffer *eb)            \
{                                                                       \
         struct btrfs_header *h = (struct btrfs_header *)eb->data;       \
         return le##bits##_to_cpu(h->member);                            \
}                                                                       \
...

I suspect the case "eb == 0" is not handled in ctree.h.
Maybe a different fix is required, e.g. hardening btrfsck against a "0" buffer.
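
The failure mode described above can be sketched in isolation. This is a minimal illustration, not btrfs-progs code: `toy_buffer`, `toy_header_nritems`, `toy_read_block` and `toy_nritems_checked` are hypothetical stand-ins for `struct extent_buffer`, the macro-generated accessor, `read_tree_block` and a guarded caller. It assumes only what the backtrace shows, namely that the reader can hand back a NULL buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct extent_buffer: only a data area. */
struct toy_buffer {
	unsigned char data[16];
};

/*
 * Unchecked accessor with the same shape as the macro-generated one:
 * it dereferences eb unconditionally, so eb == NULL crashes here.
 * Assumed toy layout: a little-endian 32-bit item count at offset 0.
 */
static inline uint32_t toy_header_nritems(const struct toy_buffer *eb)
{
	return (uint32_t)eb->data[0] |
	       ((uint32_t)eb->data[1] << 8) |
	       ((uint32_t)eb->data[2] << 16) |
	       ((uint32_t)eb->data[3] << 24);
}

/* Hypothetical reader: returns NULL when the block cannot be read. */
static struct toy_buffer *toy_read_block(struct toy_buffer *blocks,
					 size_t count, size_t index)
{
	if (index >= count)
		return NULL;
	return &blocks[index];
}

/*
 * Guarded caller: checks the read result before touching the header.
 * Returns 0 and fills *nritems on success, -1 if the read failed.
 */
static int toy_nritems_checked(struct toy_buffer *blocks, size_t count,
			       size_t index, uint32_t *nritems)
{
	struct toy_buffer *buf;

	assert(nritems != NULL);
	buf = toy_read_block(blocks, count, index);
	if (buf == NULL)
		return -1;
	*nritems = toy_header_nritems(buf);
	return 0;
}
```

The guard sits in the caller rather than in the accessor, so a failed read is reported as an error rather than silently returning a made-up item count.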

The file system also crashes the kernel on some accesses. I did not follow
this up because the file system is corrupt (using openSUSE Tumbleweed 3.4.4-31-desktop).
Maybe the kernel code also needs checks for this?

Please contact me if I should run further tests with this file system
or try out tools to test a fix (I have developer knowledge).

Another minor issue: btrfsck uses a lot of memory (> 800MB).
But this might be normal.

Best regards,
Christian



PS: In case anyone is interested:
- History and what I tried: the openSUSE btrfsck showed the messages below on the first run.
- /sbin/btrfsck /dev/md3 --repair removed some of the messages, but not the checksum errors.
- The file system is mounted with:
   /backup              btrfs      defaults,compress=zlib,noatime             1 2
- The file system is used to back up some Unix systems, with heavy use of:
   rsync -aH .... --link-dest=...
   So each file should normally have multiple hard links.

===
Is anybody interested in fixing this file system with me,
in order to test btrfsck?

speedy:/home/cv # /sbin/btrfsck /dev/md3
checking extents
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
Csum didn't match
owner ref check failed [2327654400 4096]
ref mismatch on [101138354176 98304] extent item 1, found 0
Incorrect local backref count on 101138354176 root 5 owner 1867898 offset 0 found 0 wanted 1 back 0x1f076d0
backpointer mismatch on [101138354176 98304]
owner ref check failed [101138354176 98304]
ref mismatch on [101138452480 106496] extent item 1, found 0
Incorrect local backref count on 101138452480 root 5 owner 1867899 offset 0 found 0 wanted 1 back 0x6aa85d0
backpointer mismatch on [101138452480 106496]
owner ref check failed [101138452480 106496]
ref mismatch on [101138558976 8192] extent item 1, found 0
Incorrect local backref count on 101138558976 root 5 owner 1867901 offset 0 found 0 wanted 1 back 0x6aa8610
backpointer mismatch on [101138558976 8192]
owner ref check failed [101138558976 8192]
ref mismatch on [101138567168 16384] extent item 1, found 0
Incorrect local backref count on 101138567168 root 5 owner 1867902 offset 0 found 0 wanted 1 back 0x1f8fa80
backpointer mismatch on [101138567168 16384]
owner ref check failed [101138567168 16384]
ref mismatch on [101138583552 16384] extent item 1, found 0
Incorrect local backref count on 101138583552 root 5 owner 1867903 offset 0 found 0 wanted 1 back 0x1f8fac0
backpointer mismatch on [101138583552 16384]
owner ref check failed [101138583552 16384]
Errors found in extent allocation tree
checking fs roots
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
Csum didn't match
Segmentation fault


* Re: btrfsck crashes
  2012-07-08 16:08 btrfsck crashes Christian Volkmann
@ 2012-07-09  3:40 ` Anand Jain
  2012-07-09 21:23   ` Christian Volkmann
  0 siblings, 1 reply; 10+ messages in thread
From: Anand Jain @ 2012-07-09  3:40 UTC (permalink / raw)
  To: Christian Volkmann; +Cc: linux-btrfs



> What I have seen: buf is "0", after read_tree_block.

  Yes, since we are not checking extent_buffer_uptodate for the csum_root_tree,
  a null buf gets passed on. The following patch avoids passing the null
  buffer:
    https://patchwork.kernel.org/patch/1148831/

  However, whether --init-csum-tree will build a good csum still
  depends on the corruption, IMO.
  
-Anand




* Re: btrfsck crashes
  2012-07-09  3:40 ` Anand Jain
@ 2012-07-09 21:23   ` Christian Volkmann
  2012-07-10  6:30     ` Anand Jain
  0 siblings, 1 reply; 10+ messages in thread
From: Christian Volkmann @ 2012-07-09 21:23 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs

Anand Jain wrote:
 >
 >> What I have seen: buf is "0", after read_tree_block.
 >
 >   Yes, since we are not checking extent_buffer_uptodate for the csum_root_tree,
 >   a null buf gets passed on. The following patch avoids passing the null
 >   buffer:
 >     https://patchwork.kernel.org/patch/1148831/
 >
 >   However, whether --init-csum-tree will build a good csum still
 >   depends on the corruption, IMO.
 >
 > -Anand
 >

.)
The patch does not help.
This condition evaluates to false: !extent_buffer_uptodate(info->csum_root->node)

.)
Output of btrfsck built from git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git,
patched at line 3552:

speedy:/tmp/btrfs/btrfs-progs # gdb ./btrfsck
GNU gdb (GDB) SUSE (7.3-41.1.2)
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-suse-linux".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /tmp/btrfs/btrfs-progs/btrfsck...done.
(gdb) r /dev/md3
Starting program: /tmp/btrfs/btrfs-progs/btrfsck /dev/md3
Missing separate debuginfo for /lib64/ld-linux-x86-64.so.2
Try: zypper install -C "debuginfo(build-id)=f20c99249f5a5776e1377d3bd728502e3f455a3f"
Missing separate debuginfo for /lib64/libuuid.so.1
Try: zypper install -C "debuginfo(build-id)=24ae727f9cd5fb29f81b0f965859d3cf4668bf17"
Missing separate debuginfo for /lib64/libc.so.6
Try: zypper install -C "debuginfo(build-id)=7b169b1db50384b70e3e4b4884cd56432d5de796"
checking extents
checksum verify failed on 2327654400 wanted 89AAEA38 found 72
checksum verify failed on 2327654400 wanted 89AAEA38 found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 89AAEA38 found 72
Csum didn't match
owner ref check failed [2327654400 4096]
ref mismatch on [101138354176 98304] extent item 1, found 0
Incorrect local backref count on 101138354176 root 5 owner 1867898 offset 0 found 0 wanted 1 back 0x182ebd20
backpointer mismatch on [101138354176 98304]
owner ref check failed [101138354176 98304]
ref mismatch on [101138452480 106496] extent item 1, found 0
Incorrect local backref count on 101138452480 root 5 owner 1867899 offset 0 found 0 wanted 1 back 0xefb8d0
backpointer mismatch on [101138452480 106496]
owner ref check failed [101138452480 106496]
ref mismatch on [101138558976 8192] extent item 1, found 0
Incorrect local backref count on 101138558976 root 5 owner 1867901 offset 0 found 0 wanted 1 back 0x5a22350
backpointer mismatch on [101138558976 8192]
owner ref check failed [101138558976 8192]
ref mismatch on [101138567168 16384] extent item 1, found 0
Incorrect local backref count on 101138567168 root 5 owner 1867902 offset 0 found 0 wanted 1 back 0x5a22390
backpointer mismatch on [101138567168 16384]
owner ref check failed [101138567168 16384]
ref mismatch on [101138583552 16384] extent item 1, found 0
Incorrect local backref count on 101138583552 root 5 owner 1867903 offset 0 found 0 wanted 1 back 0x19dfaae0
backpointer mismatch on [101138583552 16384]
owner ref check failed [101138583552 16384]
Errors found in extent allocation tree
checking fs roots
checksum verify failed on 2327654400 wanted 89AAEA38 found 72
checksum verify failed on 2327654400 wanted 89AAEA38 found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 89AAEA38 found 72
Csum didn't match

Program received signal SIGSEGV, Segmentation fault.
0x0000000000402264 in btrfs_header_level (eb=0x0) at ctree.h:1540
1540    BTRFS_SETGET_HEADER_FUNCS(header_level, struct btrfs_header, level, 8);
(gdb)


.)
Which git repository should I normally patch against?
The one from the wiki does not seem to be up to date:
  http://git.darksatanic.net/repo/btrfs-progs-unstable.git

In this repository the line numbers do not match:
git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git

.)
What seems strange to me: why does the same "number" 2327654400
appear to want different checksums?

checksum verify failed on 2327654400 wanted 89AAEA38 found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
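
As background to the question above, here is a toy sketch of content-based block checksumming. It is an illustration under stated assumptions, not the btrfs-progs implementation: the 4-byte little-endian checksum prefix and the `toy_*` helpers are hypothetical, and only the CRC-32C routine is a standard algorithm. It shows that a checksum computed from the bytes actually read depends only on those bytes, so two reads of the same block number that return different bytes yield different values. Whether that is what happened on this file system is not established in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CSUM_SIZE 4

/* Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t crc32c(const unsigned char *data, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;

	for (size_t i = 0; i < len; i++) {
		crc ^= data[i];
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* Checksum stored in the first 4 bytes of the toy block (little-endian). */
static uint32_t toy_stored_csum(const unsigned char *block)
{
	return (uint32_t)block[0] | ((uint32_t)block[1] << 8) |
	       ((uint32_t)block[2] << 16) | ((uint32_t)block[3] << 24);
}

/* Checksum computed over everything after the checksum field. */
static uint32_t toy_computed_csum(const unsigned char *block, size_t len)
{
	assert(len >= TOY_CSUM_SIZE);
	return crc32c(block + TOY_CSUM_SIZE, len - TOY_CSUM_SIZE);
}

/* Write the computed checksum into the block's checksum field. */
static void toy_seal_block(unsigned char *block, size_t len)
{
	uint32_t c = toy_computed_csum(block, len);

	block[0] = (unsigned char)(c & 0xff);
	block[1] = (unsigned char)((c >> 8) & 0xff);
	block[2] = (unsigned char)((c >> 16) & 0xff);
	block[3] = (unsigned char)((c >> 24) & 0xff);
}

/* Returns 1 if stored and computed checksums agree, 0 otherwise. */
static int toy_block_csum_ok(const unsigned char *block, size_t len)
{
	if (len < TOY_CSUM_SIZE)
		return 0;
	return toy_stored_csum(block) == toy_computed_csum(block, len);
}
```

Sealing a block and then flipping a single payload bit makes `toy_block_csum_ok` fail and changes the value returned by `toy_computed_csum`.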


Thanks & regards,
Christian





* Re: btrfsck crashes
  2012-07-09 21:23   ` Christian Volkmann
@ 2012-07-10  6:30     ` Anand Jain
  2012-07-10  9:13       ` haveaniceday
  0 siblings, 1 reply; 10+ messages in thread
From: Anand Jain @ 2012-07-10  6:30 UTC (permalink / raw)
  To: Christian Volkmann; +Cc: linux-btrfs


Christian,

  The line numbers are still confusing to me as well. The patch was
  meant to avoid the seg fault when the csum_root node is null, which
  might not be the case here.

  (Assuming the original problem's stack trace is still the same,
  as below.)
---------
>>> (gdb) bt
>>> #0 0x0000000000402379 in btrfs_header_nritems (eb=0x0) at ctree.h:1426
>>> #1 0x0000000000408c14 in run_next_block (root=0x73fb40, bits=0x740d50, bits_nr=1024, last=0x7fffffffd948, pending=0x7fffffffda40,
>>> seen=0x7fffffffda50, reada=0x7fffffffda30, nodes=0x7fffffffda20, extent_cache=0x7fffffffda60) at btrfsck.c:2512
>>> #2 0x00000000004099e2 in check_extents (root=0x73fb40) at btrfsck.c:2792
>>> #3 0x0000000000409bec in main (ac=1, av=0x7fffffffdbe8) at btrfsck.c:2853
----------
>>> What I have seen: buf is "0", after read_tree_block.
>>>
>>> btrfsck.c:2511 buf = read_tree_block(root, bytenr, size, 0);
>>> 2512 nritems = btrfs_header_nritems(buf);
----------


   Another look (ignoring line numbers) suggests that we already have the
   extent_buffer_uptodate check for buf, so buf can't be NULL when
   btrfs_header_nritems is called. That contradicts the above stack
   trace if you are using the latest code, as shown below.
  
http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs.git;a=blob;f=btrfsck.c;h=088b9f427339cde70dd6b1a457aeba5cf190ce34;hb=HEAD

-------
2526 static int run_next_block(struct btrfs_root *root,
::
2585         buf = read_tree_block(root, bytenr, size, 0);
2586         if (!extent_buffer_uptodate(buf)) {
2587                 record_bad_block_io(root->fs_info,
2588                                     extent_cache, bytenr, size);
2589                 free_extent_buffer(buf);
2590                 goto out;
2591         }
2592
2593         nritems = btrfs_header_nritems(buf);  <-- Seg fault ??
-------

   

Thanks,
-Anand






* Re: btrfsck crashes
  2012-07-10  6:30     ` Anand Jain
@ 2012-07-10  9:13       ` haveaniceday
  2012-07-10 11:08         ` haveaniceday
  0 siblings, 1 reply; 10+ messages in thread
From: haveaniceday @ 2012-07-10  9:13 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs




Anand Jain <Anand.Jain@oracle.com> wrote on 10 July 2012 at 08:30:

>
> Christian,
>
>   The line numbers are still confusing to me as well. The patch was
>   meant to avoid the seg fault when the csum_root node is null, which
>   might not be the case here.
>
>   (Assuming the original problem's stack trace is still the same,
>   as below.)

Hi Anand,

I have used git clone
git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git

The stack now looks like below. I suspect the first "0" in path->nodes
causes the problem.
(gdb) p path->nodes
$2 = {0x0, 0x27ad82e0, 0x3893a930, 0x3dc24f0, 0x75efa0, 0x0, 0x0, 0x0}
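
The gdb output above shows slot 0 of path->nodes empty while slots 1 to 4 are populated. A minimal sketch of guarding against such a slot follows. It is not btrfs-progs code: `toy_node`, `toy_path` and both helpers are hypothetical, and the array length of 8 is taken only from the eight entries gdb printed.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_LEVEL 8

/* Hypothetical stand-in for a tree block; only its level matters here. */
struct toy_node {
	int level;
};

/* Hypothetical stand-in for a path: one node pointer per tree level. */
struct toy_path {
	struct toy_node *nodes[TOY_MAX_LEVEL];
};

/* Return the lowest slot that holds a node, or -1 if the path is empty. */
static int toy_lowest_populated_slot(const struct toy_path *path)
{
	assert(path != NULL);
	for (int i = 0; i < TOY_MAX_LEVEL; i++)
		if (path->nodes[i] != NULL)
			return i;
	return -1;
}

/*
 * Guarded level lookup: returns -1 for an out-of-range or empty slot
 * instead of dereferencing a NULL node pointer.
 */
static int toy_node_level_checked(const struct toy_path *path, int slot)
{
	assert(path != NULL);
	if (slot < 0 || slot >= TOY_MAX_LEVEL || path->nodes[slot] == NULL)
		return -1;
	return path->nodes[slot]->level;
}
```

With a path shaped like the one gdb printed, `toy_lowest_populated_slot` reports slot 1 and `toy_node_level_checked` on slot 0 returns -1 rather than faulting.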



speedy:/tmp/btrfs/btrfs-progs # gdb ./btrfsck
GNU gdb (GDB) SUSE (7.3-41.1.2)
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-suse-linux".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /tmp/btrfs/btrfs-progs/btrfsck...done.
(gdb) r /dev/md3
Starting program: /tmp/btrfs/btrfs-progs/btrfsck /dev/md3
Missing separate debuginfo for /lib64/ld-linux-x86-64.so.2
Try: zypper install -C "debuginfo(build-id)=f20c99249f5a5776e1377d3bd728502e3f455a3f"
Missing separate debuginfo for /lib64/libuuid.so.1
Try: zypper install -C "debuginfo(build-id)=24ae727f9cd5fb29f81b0f965859d3cf4668bf17"
Missing separate debuginfo for /lib64/libc.so.6
Try: zypper install -C "debuginfo(build-id)=7b169b1db50384b70e3e4b4884cd56432d5de796"
checking extents
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
Csum didn't match
owner ref check failed [2327654400 4096]
ref mismatch on [101138354176 98304] extent item 1, found 0
Incorrect local backref count on 101138354176 root 5 owner 1867898 offset 0 found 0 wanted 1 back 0x19e025d0
backpointer mismatch on [101138354176 98304]
owner ref check failed [101138354176 98304]
ref mismatch on [101138452480 106496] extent item 1, found 0
Incorrect local backref count on 101138452480 root 5 owner 1867899 offset 0 found 0 wanted 1 back 0x19e02610
backpointer mismatch on [101138452480 106496]
owner ref check failed [101138452480 106496]
ref mismatch on [101138558976 8192] extent item 1, found 0
Incorrect local backref count on 101138558976 root 5 owner 1867901 offset 0 found 0 wanted 1 back 0x5a60f90
backpointer mismatch on [101138558976 8192]
owner ref check failed [101138558976 8192]
ref mismatch on [101138567168 16384] extent item 1, found 0
Incorrect local backref count on 101138567168 root 5 owner 1867902 offset 0 found 0 wanted 1 back 0x5a60fd0
backpointer mismatch on [101138567168 16384]
owner ref check failed [101138567168 16384]
ref mismatch on [101138583552 16384] extent item 1, found 0
Incorrect local backref count on 101138583552 root 5 owner 1867903 offset 0 found 0 wanted 1 back 0x19e04420
backpointer mismatch on [101138583552 16384]
owner ref check failed [101138583552 16384]
Errors found in extent allocation tree
checking fs roots
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
Csum didn't match

Program received signal SIGSEGV, Segmentation fault.
0x0000000000402264 in btrfs_header_level (eb=0x0) at ctree.h:1540
1540    BTRFS_SETGET_HEADER_FUNCS(header_level, struct btrfs_header, level, 8);
(gdb) bt
#0  0x0000000000402264 in btrfs_header_level (eb=0x0) at ctree.h:1540
#1  0x000000000040531f in walk_down_tree (root=0x6f6c540, path=0x7fffffffde70, wc=0x7fffffffdf40,
    level=0x7fffffffdf00) at btrfsck.c:1176
#2  0x0000000000406bf6 in check_fs_root (root=0x6f6c540, root_cache=0x7fffffffe0c0, wc=0x7fffffffdf40)
    at btrfsck.c:1702
#3  0x0000000000406ecb in check_fs_roots (root=0x75ee00, root_cache=0x7fffffffe0c0) at btrfsck.c:1773
#4  0x000000000040b49c in main (ac=1, av=0x7fffffffe1f8) at btrfsck.c:3576
(gdb) l
1535    BTRFS_SETGET_HEADER_FUNCS(header_generation, struct btrfs_header,
1536                              generation, 64);
1537    BTRFS_SETGET_HEADER_FUNCS(header_owner, struct btrfs_header, owner, 64);
1538    BTRFS_SETGET_HEADER_FUNCS(header_nritems, struct btrfs_header, nritems, 32);
1539    BTRFS_SETGET_HEADER_FUNCS(header_flags, struct btrfs_header, flags, 64);
1540    BTRFS_SETGET_HEADER_FUNCS(header_level, struct btrfs_header, level, 8);
1541
1542    static inline int btrfs_header_flag(struct extent_buffer *eb, u64 flag)
1543    {
1544            return (btrfs_header_flags(eb) & flag) == flag;
(gdb) up
#1  0x000000000040531f in walk_down_tree (root=0x6f6c540, path=0x7fffffffde70, wc=0x7fffffffdf40,
    level=0x7fffffffdf00) at btrfsck.c:1176
1176                    if (btrfs_header_level(cur) != *level)
(gdb) l
1171            while (*level >= 0) {
1172                    WARN_ON(*level < 0);
1173                    WARN_ON(*level >= BTRFS_MAX_LEVEL);
1174                    cur = path->nodes[*level];
1175
1176                    if (btrfs_header_level(cur) != *level)
1177                            WARN_ON(1);
1178
1179                    if (path->slots[*level] >= btrfs_header_nritems(cur))
1180                            break;
(gdb) p cur
$1 = (struct extent_buffer *) 0x0
(gdb) p path->nodes
$2 = {0x0, 0x27ad82e0, 0x3893a930, 0x3dc24f0, 0x75efa0, 0x0, 0x0, 0x0}
(gdb) p *level
$3 = 0
(gdb)
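
The session above shows the accessor being called with eb=0x0. As a minimal sketch of why that faults and how a guarded accessor could behave, the following uses simplified stand-in types. All demo_* names are hypothetical and are not part of btrfs-progs; the real accessors are generated by BTRFS_SETGET_HEADER_FUNCS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the btrfs-progs types. */
struct demo_header {
	uint8_t level;
};

struct demo_extent_buffer {
	struct demo_header hdr;	/* header stored inline for brevity */
};

/*
 * Unchecked accessor, same shape as the macro-generated one: it
 * dereferences eb unconditionally, so eb == NULL faults.
 */
static inline int demo_header_level(const struct demo_extent_buffer *eb)
{
	return eb->hdr.level;
}

/* Checked variant: returns -1 when the buffer could not be read. */
static inline int demo_header_level_checked(const struct demo_extent_buffer *eb)
{
	if (eb == NULL)
		return -1;
	return demo_header_level(eb);
}

/*
 * Return the index of the lowest non-NULL entry in a path-like array,
 * or -1 if every entry is NULL.
 */
static int demo_lowest_valid_level(struct demo_extent_buffer *const nodes[],
				   int max)
{
	assert(max >= 0);
	for (int i = 0; i < max; i++)
		if (nodes[i] != NULL)
			return i;
	return -1;
}
```

With an array shaped like the path->nodes printed by gdb (entry 0 NULL, entry 1 populated), demo_lowest_valid_level() returns 1, and demo_header_level_checked() on entry 0 returns -1 instead of faulting.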

Best regards,
Christian





* Re: btrfsck crashes
  2012-07-10  9:13       ` haveaniceday
@ 2012-07-10 11:08         ` haveaniceday
  2012-07-11  7:13           ` Anand Jain
  0 siblings, 1 reply; 10+ messages in thread
From: haveaniceday @ 2012-07-10 11:08 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs

With the patch below, the problem is detected and btrfsck stops with an
assertion failure instead of a SIGSEGV:
...
Csum didn't match
btrfsck: btrfsck.c:1177: walk_down_tree: Assertion `!(1)' failed.
Aborted
...


--- btrfsck.c   2012-07-10 10:23:24.781622144 +0200
+++ btrfsck.c   2012-07-10 12:59:00.120146266 +0200
@@ -1173,7 +1173,7 @@
                WARN_ON(*level >= BTRFS_MAX_LEVEL);
                cur = path->nodes[*level];

-               if (btrfs_header_level(cur) != *level)
+               if (! cur || btrfs_header_level(cur) != *level)
                        WARN_ON(1);

                if (path->slots[*level] >= btrfs_header_nritems(cur))

I also tried to skip past this error with the patch below. The errors reported
after that are listed further down.


--- btrfsck.c   2012-07-10 10:23:24.781622144 +0200
+++ btrfsck.c   2012-07-10 12:36:51.995996771 +0200
@@ -1173,8 +1173,13 @@
                WARN_ON(*level >= BTRFS_MAX_LEVEL);
                cur = path->nodes[*level];

-               if (btrfs_header_level(cur) != *level)
-                       WARN_ON(1);
+               if (cur != 0 ) {
+                       if ( btrfs_header_level(cur) != *level)
+                               WARN_ON(1);
+               }else {
+                       fprintf(stderr, "CVCV path->nodes[*level] is 0!\n");
+                       break;
+               }

                if (path->slots[*level] >= btrfs_header_nritems(cur))
                        break;
@@ -1213,7 +1218,11 @@
                path->slots[*level] = 0;
        }
 out:
+       if ( path->nodes[*level] != 0 ){
        path->slots[*level] = btrfs_header_nritems(path->nodes[*level]);
+       } else {
+       path->slots[*level] = 0;
+       }
        return 0;
 }
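
The patch above breaks out of the loop and still returns 0, so the caller cannot tell that a subtree was skipped. One possible alternative, sketched here with simplified stand-in types, is to report an error code for the missing node. Everything named demo_* or DEMO_* is hypothetical, and the choice of -EIO and -EINVAL is an assumption for illustration, not what btrfsck does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_MAX_LEVEL 8	/* same length as the path->nodes array shown by gdb */

struct demo_node {
	int level;
	int nritems;
};

struct demo_path {
	struct demo_node *nodes[DEMO_MAX_LEVEL];
	int slots[DEMO_MAX_LEVEL];
};

/*
 * Validate the node at the given level before it is used.
 * Returns 0 if usable, -EIO if the block is missing (for example after
 * a failed checksum), -EINVAL for a bad level or a header mismatch.
 */
static int demo_check_level(const struct demo_path *path, int level)
{
	const struct demo_node *cur;

	if (level < 0 || level >= DEMO_MAX_LEVEL)
		return -EINVAL;
	cur = path->nodes[level];
	if (cur == NULL)
		return -EIO;
	if (cur->level != level)
		return -EINVAL;
	return 0;
}

/* Set the slot to "past the end" without touching a NULL node. */
static void demo_finish_level(struct demo_path *path, int level)
{
	assert(level >= 0 && level < DEMO_MAX_LEVEL);
	path->slots[level] = path->nodes[level] ?
			     path->nodes[level]->nritems : 0;
}
```

A caller could then log the failing level and continue with the next root, rather than silently treating the unreadable leaf as empty.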

Next errors I get are:

....
checking fs roots
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
Csum didn't match
CVCV path->nodes[*level] is 0!
root 5 inode 265 errors 2000
        unresolved ref dir 2658782 index 3 namelen 12 name aquota.group filetype 0 error 3
        unresolved ref dir 2914579 index 3 namelen 12 name aquota.group filetype 0 error 3
root 5 inode 266 errors 2000
        unresolved ref dir 2658782 index 4 namelen 11 name aquota.user filetype 0 error 3
        unresolved ref dir 2914579 index 4 namelen 11 name aquota.user filetype 0 error 3
root 5 inode 285 errors 2000
        unresolved ref dir 2658783 index 3 namelen 3 name awk filetype 0 error 3
        unresolved ref dir 2914580 index 3 namelen 3 name awk filetype 0 error 3
root 5 inode 286 errors 2000
        unresolved ref dir 2658783 index 16 namelen 3 name csh filetype 0 error 3
        unresolved ref dir 2914580 index 16 namelen 3 name csh filetype 0 error 3
root 5 inode 287 errors 2000
        unresolved ref dir 2658783 index 27 namelen 13 name dnsdomainname filetype 0 error 3
        unresolved ref dir 2914580 index 27 namelen 13 name dnsdomainname filetype 0 error 3
root 5 inode 288 errors 2000
        unresolved ref dir 2658783 index 28 namelen 10 name domainname filetype 0 error 3
        unresolved ref dir 2914580 index 28 namelen 10 name domainname filetype 0 error 3
root 5 inode 289 errors 2000
        unresolved ref dir 2658783 index 34 namelen 2 name ex filetype 0 error 3
        unresolved ref dir 2914580 index 34 namelen 2 name ex filetype 0 error 3
root 5 inode 290 errors 2000
        unresolved ref dir 2658783 index 48 namelen 2 name ip filetype 0 error 3
        unresolved ref dir 2914580 index 48 namelen 2 name ip filetype 0 error 3
root 5 inode 291 errors 2000
        unresolved ref dir 2658783 index 54 namelen 3 name ksh filetype 0 error 3
        unresolved ref dir 2914580 index 54 namelen 3 name ksh filetype 0 error 3
root 5 inode 292 errors 2000
        unresolved ref dir 2658783 index 63 namelen 4 name mail filetype 0 error 3
        unresolved ref dir 2914580 index 63 namelen 4 name mail filetype 0 error 3
...
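
The "errors 2000" and "error 3" values in this output look like hexadecimal bitmasks. The sketch below decodes such a mask against a table of flag names. The bit assignments are recalled from the I_ERR_* and REF_ERR_* defines in btrfsck.c of that era and are assumptions here; check them against your own source tree before relying on them. The demo_* helpers are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct demo_flag {
	unsigned int bit;
	const char *name;
};

/* Assumed subset of the inode error bits; verify against btrfsck.c. */
static const struct demo_flag demo_inode_flags[] = {
	{ 1u << 0,  "NO_INODE_ITEM" },
	{ 1u << 13, "LINK_COUNT_WRONG" },
};

/* Assumed subset of the backref error bits; verify against btrfsck.c. */
static const struct demo_flag demo_ref_flags[] = {
	{ 1u << 0, "NO_DIR_ITEM" },
	{ 1u << 1, "NO_DIR_INDEX" },
	{ 1u << 2, "NO_INODE_REF" },
};

/*
 * Write the names of all set bits found in tbl into out, separated by
 * '|'. Returns the number of characters written, excluding the NUL.
 * Bits that are not in the table are ignored.
 */
static size_t demo_decode(unsigned int value, const struct demo_flag *tbl,
			  size_t n, char *out, size_t outlen)
{
	size_t used = 0;

	assert(out != NULL && outlen > 0);
	out[0] = '\0';
	for (size_t i = 0; i < n; i++) {
		int w;

		if (!(value & tbl[i].bit))
			continue;
		w = snprintf(out + used, outlen - used, "%s%s",
			     used ? "|" : "", tbl[i].name);
		if (w < 0 || (size_t)w >= outlen - used)
			break;	/* output truncated, stop here */
		used += (size_t)w;
	}
	return used;
}
```

Under these assumed definitions, 0x2000 decodes to LINK_COUNT_WRONG and 0x3 to NO_DIR_ITEM|NO_DIR_INDEX, which would fit a missing leaf in a tree full of hard-linked files.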


* Re: btrfsck crashes
  2012-07-10 11:08         ` haveaniceday
@ 2012-07-11  7:13           ` Anand Jain
  2012-07-11  8:36             ` haveaniceday
  2012-07-12 19:08             ` Christian Volkmann
  0 siblings, 2 replies; 10+ messages in thread
From: Anand Jain @ 2012-07-11  7:13 UTC (permalink / raw)
  To: haveaniceday; +Cc: linux-btrfs



  If this is a deliberate corruption, can you please share the test case?
  If not, have you tried mounting with the recovery option and then running
  scrub? Scrub would be the preferred choice over btrfsck.



On 10/07/12 19:08, haveaniceday@cv-sv.de wrote:
> This code should detect the problem without SIGSEGV but a Assertition.
> ...
> Csum didn't match
> btrfsck: btrfsck.c:1177: walk_down_tree: Assertion `!(1)' failed.
> Aborted
> ...
>
>
> --- btrfsck.c   2012-07-10 10:23:24.781622144 +0200
> +++ btrfsck.c   2012-07-10 12:59:00.120146266 +0200
> @@ -1173,7 +1173,7 @@
>                  WARN_ON(*level>= BTRFS_MAX_LEVEL);
>                  cur = path->nodes[*level];
>
> -               if (btrfs_header_level(cur) != *level)
> +               if (! cur || btrfs_header_level(cur) != *level)
>                          WARN_ON(1);
>
>                  if (path->slots[*level]>= btrfs_header_nritems(cur))
>
>   I tried to skip this error with the code below. The next errors reported are
> also below.
>
>
> --- btrfsck.c   2012-07-10 10:23:24.781622144 +0200
> +++ btrfsck.c   2012-07-10 12:36:51.995996771 +0200
> @@ -1173,8 +1173,13 @@
>                  WARN_ON(*level>= BTRFS_MAX_LEVEL);
>                  cur = path->nodes[*level];
>
> -               if (btrfs_header_level(cur) != *level)
> -                       WARN_ON(1);
> +               if (cur != 0 ) {
> +                       if ( btrfs_header_level(cur) != *level)
> +                               WARN_ON(1);
> +               }else {
> +                       fprintf(stderr, "CVCV path->nodes[*level] is 0!\n");
> +                       break;
> +               }
>
>                  if (path->slots[*level]>= btrfs_header_nritems(cur))
>                          break;
> @@ -1213,7 +1218,11 @@
>                  path->slots[*level] = 0;
>          }
>   out:
> +       if ( path->nodes[*level] != 0 ){
>          path->slots[*level] = btrfs_header_nritems(path->nodes[*level]);
> +       } else {
> +       path->slots[*level] = 0;
> +       }
>          return 0;
>   }
>
> Next errors I get are:
>
> ....
> checking fs roots
> checksum verify failed on 2327654400 wanted 73CDE79C found 72
> checksum verify failed on 2327654400 wanted 73CDE79C found 72
> checksum verify failed on 2327654400 wanted 73CDE79C found 72
> checksum verify failed on 2327654400 wanted 73CDE79C found 72
> Csum didn't match
> CVCV path->nodes[*level] is 0!
> root 5 inode 265 errors 2000
>          unresolved ref dir 2658782 index 3 namelen 12 name aquota.group filetype 0 error 3
>          unresolved ref dir 2914579 index 3 namelen 12 name aquota.group filetype 0 error 3
> root 5 inode 266 errors 2000
>          unresolved ref dir 2658782 index 4 namelen 11 name aquota.user filetype 0 error 3
>          unresolved ref dir 2914579 index 4 namelen 11 name aquota.user filetype 0 error 3
> root 5 inode 285 errors 2000
>          unresolved ref dir 2658783 index 3 namelen 3 name awk filetype 0 error 3
>          unresolved ref dir 2914580 index 3 namelen 3 name awk filetype 0 error 3
> root 5 inode 286 errors 2000
>          unresolved ref dir 2658783 index 16 namelen 3 name csh filetype 0 error 3
>          unresolved ref dir 2914580 index 16 namelen 3 name csh filetype 0 error 3
> root 5 inode 287 errors 2000
>          unresolved ref dir 2658783 index 27 namelen 13 name dnsdomainname filetype 0 error 3
>          unresolved ref dir 2914580 index 27 namelen 13 name dnsdomainname filetype 0 error 3
> root 5 inode 288 errors 2000
>          unresolved ref dir 2658783 index 28 namelen 10 name domainname filetype 0 error 3
>          unresolved ref dir 2914580 index 28 namelen 10 name domainname filetype 0 error 3
> root 5 inode 289 errors 2000
>          unresolved ref dir 2658783 index 34 namelen 2 name ex filetype 0 error 3
>          unresolved ref dir 2914580 index 34 namelen 2 name ex filetype 0 error 3
> root 5 inode 290 errors 2000
>          unresolved ref dir 2658783 index 48 namelen 2 name ip filetype 0 error 3
>          unresolved ref dir 2914580 index 48 namelen 2 name ip filetype 0 error 3
> root 5 inode 291 errors 2000
>          unresolved ref dir 2658783 index 54 namelen 3 name ksh filetype 0 error 3
>          unresolved ref dir 2914580 index 54 namelen 3 name ksh filetype 0 error 3
> root 5 inode 292 errors 2000
>          unresolved ref dir 2658783 index 63 namelen 4 name mail filetype 0 error 3
>          unresolved ref dir 2914580 index 63 namelen 4 name mail filetype 0 error 3
> ...


* Re: btrfsck crashes
  2012-07-11  7:13           ` Anand Jain
@ 2012-07-11  8:36             ` haveaniceday
  2012-07-15 14:05               ` Martin Steigerwald
  2012-07-12 19:08             ` Christian Volkmann
  1 sibling, 1 reply; 10+ messages in thread
From: haveaniceday @ 2012-07-11  8:36 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs


On 11 July 2012 at 09:13, Anand Jain <Anand.Jain@oracle.com> wrote:

>
>
>   If this is a deliberate corruption can you pls share the test-case ?
No. It's a real-life corruption on a file system used to back up some servers.

That's also why basic files like aquota and awk show up in the output.

I expect it would be very hard to build a reproducible test case for this error.
(For the usage pattern, see the PS below.)

>   if not have you tried mount with recovery and the scrub. ? scrub
>   would be preferred choice over btrfsck.
I can scrub this file system.
But isn't this a good opportunity to test recovery? A stable btrfs should
eventually handle corruption like this without a SIGSEGV or further data loss.
I expect a real-life recovery to exercise stranger cases than the test cases
do :)

So it's up to you and the other btrfs developers how far we follow up on this
issue. In the meantime I have made an image of the corrupted file system, so
multiple recovery attempts are possible.

Best regards,
Christian

PS: I would bet that my kind of usage is a very good stress test for btrfs.

- large file system "/backup" btrfs with compress enabled.

Content of the file system:
- ./server1 .... /server5  as directories
- for each server the directory has a structure like this:
  backup-YYYY-DD-MM-HH:M
  New backups are created with:
  rsync -axvH --link-dest=/backup/server'n'/backup-...(old, last dir)..
  server:/   /backup/server'n'/backup-YYY-.../.

This generates files with a large number of hard links.
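
To illustrate why this backup scheme produces high link counts: every backup generation that finds a file unchanged adds one more name for the same inode. The sketch below simulates that with link(2) on a single file. The demo_* functions and the "<src>.gen<i>" naming are made up for illustration and are not what rsync does internally.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return the hard-link count of path, or -1 on error. */
static long demo_link_count(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	return (long)st.st_nlink;
}

/*
 * Simulate n backup generations of one unchanged file: each generation
 * adds another name ("<src>.gen<i>") for the same inode instead of
 * copying the data. Returns 0 on success, -1 on the first failure.
 */
static int demo_add_generations(const char *src, int n)
{
	char name[4096];

	assert(src != NULL && n >= 0);
	for (int i = 0; i < n; i++) {
		int w = snprintf(name, sizeof(name), "%s.gen%d", src, i);

		if (w < 0 || (size_t)w >= sizeof(name))
			return -1;
		if (link(src, name) != 0)
			return -1;
	}
	return 0;
}
```

After three simulated generations the original file reports a link count of 4, so a long backup history leaves each unchanged file with one directory entry per generation.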


* Re: btrfsck crashes
  2012-07-11  7:13           ` Anand Jain
  2012-07-11  8:36             ` haveaniceday
@ 2012-07-12 19:08             ` Christian Volkmann
  1 sibling, 0 replies; 10+ messages in thread
From: Christian Volkmann @ 2012-07-12 19:08 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs

Anand Jain wrote:
>
>
>   If this is a deliberate corruption can you pls share the test-case ?
>   if not have you tried mount with recovery and the scrub. ? scrub
>   would be preferred choice over btrfsck.
>
>
>

Scrub does not fix the problem. (I replaced the real host name with "myhost"
in the log below.)
What seems strange to me: the paths reported for the errors point to nearly the
same file names; only part of the "myhost" differs.

btrfsck still fails with the same crash after the scrub.


speedy:/home/cv # btrfs scrub status /backup.old
scrub status for fa7034c8-86d4-4aa3-9fde-ecd7051ff43c
         scrub started at Thu Jul 12 20:21:08 2012 and finished after 1495 seconds
         total bytes scrubbed: 115.49GiB with 9 errors
         error details: verify=3 csum=6
         corrected errors: 3, uncorrectable errors: 6, unverified errors: 0

Should I continue with any analysis for bug hunting or just reformat and forget?

Best regards,
Christian

[ 5059.168649] btrfs: checksum/header error at logical 1532956672 on dev /dev/md3, sector 7204744: metadata leaf (level 0) in tree 2
[ 5059.168656] btrfs: checksum/header error at logical 1532956672 on dev /dev/md3, sector 7204744: metadata leaf (level 0) in tree 2
[ 5065.581348] btrfs: fixed up error at logical 1532956672 on dev /dev/md3
[ 5065.587844] btrfs: checksum/header error at logical 1532960768 on dev /dev/md3, sector 7204752: metadata leaf (level 0) in tree 2
[ 5065.587851] btrfs: checksum/header error at logical 1532960768 on dev /dev/md3, sector 7204752: metadata leaf (level 0) in tree 2
[ 5065.599317] btrfs: fixed up error at logical 1532960768 on dev /dev/md3
[ 5065.599500] btrfs: checksum/header error at logical 1532964864 on dev /dev/md3, sector 7204760: metadata leaf (level 0) in tree 2
[ 5065.599506] btrfs: checksum/header error at logical 1532964864 on dev /dev/md3, sector 7204760: metadata leaf (level 0) in tree 2
[ 5065.607379] btrfs: fixed up error at logical 1532964864 on dev /dev/md3
[ 5074.964900] btrfs: checksum error at logical 2327654400 on dev /dev/md3, sector 8756888: metadata leaf (level 0) in tree 5
[ 5074.964907] btrfs: checksum error at logical 2327654400 on dev /dev/md3, sector 8756888: metadata leaf (level 0) in tree 5
[ 5075.977763] btrfs: unable to fixup (regular) error at logical 2327654400 on dev /dev/md3
[ 5085.133646] btrfs: checksum error at logical 2327654400 on dev /dev/md3, sector 10854040: metadata leaf (level 0) in tree 5
[ 5085.133653] btrfs: checksum error at logical 2327654400 on dev /dev/md3, sector 10854040: metadata leaf (level 0) in tree 5
[ 5086.148842] btrfs: unable to fixup (regular) error at logical 2327654400 on dev /dev/md3
[ 6436.036292] btrfs: checksum error at logical 139801403392 on dev /dev/md3, sector 331786256, root 5, inode 2960268, offset 1345069056, length 4096, links 1 (path: int-www-mail/int-www-2012-07-05-22_28_57/srv/www/vhosts/"myhost".de/statistics/logs/access_ssl_log.processed)
[ 6436.036300] btrfs: unable to fixup (regular) error at logical 139801403392 on dev /dev/md3
[ 6454.615722] btrfs: checksum error at logical 141661282304 on dev /dev/md3, sector 335418832, root 5, inode 2968078, offset 104292352, length 4096, links 1 (path: int-www-mail/int-www-2012-07-05-22_28_57/srv/www/vhosts/"myhost".no/statistics/logs/error_log)
[ 6454.615736] btrfs: unable to fixup (regular) error at logical 141661282304 on dev /dev/md3
[ 6455.523759] btrfs: checksum error at logical 140794101760 on dev /dev/md3, sector 333725120, root 5, inode 2964438, offset 87449600, length 4096, links 1 (path: int-www-mail/int-www-2012-07-05-22_28_57/srv/www/vhosts/"myhost".fr/statistics/logs/access_log.processed)
[ 6455.523775] btrfs: unable to fixup (regular) error at logical 140794101760 on dev /dev/md3
[ 6475.865387] btrfs: checksum error at logical 143052115968 on dev /dev/md3, sector 338135304, root 5, inode 3000621, offset 1078595584, length 4096, links 1 (path: int-www-mail/int-www-2012-07-05-22_28_57/srv/www/vhosts/"otherhost".com/statistics/logs/access_log.processed)
[ 6475.865403] btrfs: unable to fixup (regular) error at logical 143052115968 on dev /dev/md3
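
The log above reports logical address 2327654400 at two different sectors (8756888 and 10854040), and both fail. A small sketch for converting those sector numbers to byte offsets and measuring their distance follows. It assumes the kernel prints 512-byte sectors; the demo_* helpers are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SECTOR_SIZE 512u	/* assumption: sectors in the log are 512 bytes */

/* Convert a sector number from the log into a byte offset on the device. */
static uint64_t demo_sector_to_bytes(uint64_t sector)
{
	assert(sector <= UINT64_MAX / DEMO_SECTOR_SIZE);	/* no overflow */
	return sector * DEMO_SECTOR_SIZE;
}

/* Distance in bytes between two sectors, regardless of order. */
static uint64_t demo_copy_distance(uint64_t sector_a, uint64_t sector_b)
{
	uint64_t lo = sector_a < sector_b ? sector_a : sector_b;
	uint64_t hi = sector_a < sector_b ? sector_b : sector_a;

	return demo_sector_to_bytes(hi - lo);
}
```

Under that assumption the two locations are 1073741824 bytes (exactly 1 GiB) apart. That would be consistent with two stored copies of the same metadata block both being damaged, which may be why scrub reports it as uncorrectable, although the log itself does not state the metadata profile.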

speedy:/tmp/btrfs/btrfs-progs # ./btrfsck /dev/md3
checking extents
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
Csum didn't match
owner ref check failed [2327654400 4096]
ref mismatch on [101138354176 98304] extent item 1, found 0
Incorrect local backref count on 101138354176 root 5 owner 1867898 offset 0 found 0 wanted 1 back 0x787c260
backpointer mismatch on [101138354176 98304]
owner ref check failed [101138354176 98304]
ref mismatch on [101138452480 106496] extent item 1, found 0
Incorrect local backref count on 101138452480 root 5 owner 1867899 offset 0 found 0 wanted 1 back 0x787c2a0
backpointer mismatch on [101138452480 106496]
owner ref check failed [101138452480 106496]
ref mismatch on [101138558976 8192] extent item 1, found 0
Incorrect local backref count on 101138558976 root 5 owner 1867901 offset 0 found 0 wanted 1 back 0x2d2a700
backpointer mismatch on [101138558976 8192]
owner ref check failed [101138558976 8192]
ref mismatch on [101138567168 16384] extent item 1, found 0
Incorrect local backref count on 101138567168 root 5 owner 1867902 offset 0 found 0 wanted 1 back 0x2d2a740
backpointer mismatch on [101138567168 16384]
owner ref check failed [101138567168 16384]
ref mismatch on [101138583552 16384] extent item 1, found 0
Incorrect local backref count on 101138583552 root 5 owner 1867903 offset 0 found 0 wanted 1 back 0x2d2a780
backpointer mismatch on [101138583552 16384]
owner ref check failed [101138583552 16384]
Errors found in extent allocation tree
checking fs roots
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
Csum didn't match
btrfsck: btrfsck.c:1177: walk_down_tree: Assertion `!(1)' failed.
Aborted


* Re: btrfsck crashes
  2012-07-11  8:36             ` haveaniceday
@ 2012-07-15 14:05               ` Martin Steigerwald
  0 siblings, 0 replies; 10+ messages in thread
From: Martin Steigerwald @ 2012-07-15 14:05 UTC (permalink / raw)
  To: linux-btrfs, haveaniceday; +Cc: Anand Jain

On Wednesday, 11 July 2012, haveaniceday@cv-sv.de wrote:
> PS: I would bet that my kind of usage is a very good stress test for
> btrfs.
> 
> - large file system "/backup" btrfs with compress enabled.
> 
> Content of the file system:
> - ./server1 .... /server5  as directories
> - for each server the directory has a structure like this:
>   backup-YYYY-DD-MM-HH:M
>   New backups are created with:
>   rsync -axvH --link-dest=/backup/server'n'/backup-...(old, last dir)..
>   server:/   /backup/server'n'/backup-YYY-.../.
> 
> This generates files with a large number of hard links.

I am using

btrfs subvolume snapshot -r

after an rsync backup for exactly this case.

Ciao,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


end of thread, other threads:[~2012-07-15 14:05 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-08 16:08 btrfsck crashes Christian Volkmann
2012-07-09  3:40 ` Anand Jain
2012-07-09 21:23   ` Christian Volkmann
2012-07-10  6:30     ` Anand Jain
2012-07-10  9:13       ` haveaniceday
2012-07-10 11:08         ` haveaniceday
2012-07-11  7:13           ` Anand Jain
2012-07-11  8:36             ` haveaniceday
2012-07-15 14:05               ` Martin Steigerwald
2012-07-12 19:08             ` Christian Volkmann
