* Chunk-Recovery fails with alignment error
@ 2017-12-05 19:22 Benjamin Beichler
  2017-12-06  0:42 ` Qu Wenruo
  2017-12-06  6:08 ` Chris Murphy
  0 siblings, 2 replies; 11+ messages in thread
From: Benjamin Beichler @ 2017-12-05 19:22 UTC (permalink / raw)
  To: linux-btrfs

Hi,

My setup is as follows: (1.7 TB drive + 128 GB SSD in bcache) <==>
LUKS device <==> btrfs FS
I have been running Arch Linux with the newest stable kernel, 4.14.

After a reboot last week, my btrfs volume became unmountable because
of checksum errors in the chunk root.

This is the output of check:
"btrfs check /dev/mapper/root

checksum verify failed on 131072 found 1A98EC4A wanted B97166DB
checksum verify failed on 131072 found 1A98EC4A wanted B97166DB
bytenr mismatch, want=131072, have=9229526874648754029
ERROR: cannot read chunk root
ERROR: cannot open file system
"

Most other check/repair commands fail with the same error. The
super-dump can be found here:
https://gist.github.com/anonymous/33bd22696c37355c6cfd093f4c6bd226

After my attempts in the initramfs failed, I used a recent Arch live disk
to compile the newest btrfs-tools and used a recent kernel (4.9) to recover
the chunk tree.

So I ran the chunk-recover command on my drive. The output is here:
https://gist.github.com/anonymous/5359c08734cf81ad3887b635536d9631

For better debugging, I already used gdb to inspect the following error:

"ERROR: tree block bytenr 0 is not aligned to sectorsize 4096"

but the error is raised far too late to inspect the problem.
Since most of my chunks seem to be recoverable, I hope a tree rebuild
is possible, but I don't know how.

Do you have any suggestions?


* Re: Chunk-Recovery fails with alignment error
  2017-12-05 19:22 Chunk-Recovery fails with alignment error Benjamin Beichler
@ 2017-12-06  0:42 ` Qu Wenruo
       [not found]   ` <CABi++uKoMgi3WMw4z+kgJ1G2H_y_2e2Czg0OLwf18g9GmoU2Cg@mail.gmail.com>
  2017-12-06  6:08 ` Chris Murphy
  1 sibling, 1 reply; 11+ messages in thread
From: Qu Wenruo @ 2017-12-06  0:42 UTC (permalink / raw)
  To: Benjamin Beichler, linux-btrfs


On 2017-12-06 03:22, Benjamin Beichler wrote:
> Hi,
> 
> I have a setup as following:  (1,7TB drive + 128GB SSD in Bcache) <==>
> luks device <==> btrfs FS
> I have been running Arch linux with newest stable kernel 4.14.
> 
> After a reboot last week my btrfs volume becomes unmountable, because
> of checksum errors in the chunk root.
> 
> These are the outputs of check:
> "btrfs check /dev/mapper/root
> 
> checksum verify failed on 131072 found 1A98EC4A wanted B97166DB
> checksum verify failed on 131072 found 1A98EC4A wanted B97166DB
> bytenr mismatch, want=131072, have=9229526874648754029
> ERROR: cannot read chunk root
> ERROR: cannot open file system

That's the most serious problem for btrfs.
Without a chunk tree, nothing can be recovered.

> "
> 
> Most other check/repair commands fail with the same error. The
> super-dump can be found here:
> https://gist.github.com/anonymous/33bd22696c37355c6cfd093f4c6bd226

It would be better to run it with the -fa option, which also shows the
system chunk array and the backup roots.

> 
>  After my attempts in initramfs failed I used a recent arch-live disk
> to compile newest btrfs-tools and use recent kernel (4.9) to recover
> the chunk tree.
> 
> Therefore I ran the chunk recover command on my drive.  It looks like
> the following here>
> https://gist.github.com/anonymous/5359c08734cf81ad3887b635536d9631
> 
> for better debugging I already used gdb to inspect the following error:
> 
> "ERROR: tree block bytenr 0 is not aligned to sectorsize 4096"
> 
> but as the exception raise is far to late to inspect the problem.
> Since most of my chunks seems to be recoverable, I hope a tree rebuild
> is possible, but I don't know how.

This seems to be related to __rebuild_device_items(), where
btrfs_insert_item() returns -EIO and triggers BUG_ON().

Instead of that heavy (and mostly overkill) method, we can do a lighter
version by checking every possible tree root in the system chunk array.

That needs the -fa option of the super-dump subcommand.


The basic idea is:
1) Check the backup roots in the output
   If backup_chunk_root is the same as chunk_root (which is 131072 in
   your case) in all backup roots, skip to step 3)

2) Try every backup_chunk_root number with "btrfs check --chunk-root
   <bytenr>"
   If one of them gives a good result (no obvious problem reported),
   run it again with the --repair option to finish the repair.
   If all of them fail, go to step 3)

3) Try every bytenr aligned to 16K in the system chunk
   Run the same "btrfs check --chunk-root <bytenr>" until one gives a
   good result. Then run it with --repair.
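
As a rough sketch of step 3 (this is not btrfs-progs code), the candidate
bytenrs inside one system chunk could be enumerated as below. The chunk
start and length would have to be read from the system chunk array in the
super-dump output; the function name, the constant name and the parameters
are assumptions made only for this illustration, and the sketch assumes
chunk_start + chunk_len does not overflow.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed step size: the 16K alignment mentioned in step 3. */
#define CANDIDATE_STEP (16 * 1024ULL)

/*
 * Hypothetical helper: collect every 16K-aligned logical bytenr in the
 * range [chunk_start, chunk_start + chunk_len).
 * Returns the total number of candidates in the range; at most max_out
 * of them are stored in out.
 */
size_t list_chunk_root_candidates(uint64_t chunk_start, uint64_t chunk_len,
				  uint64_t *out, size_t max_out)
{
	/* Round the start up to the next 16K boundary. */
	uint64_t first = (chunk_start + CANDIDATE_STEP - 1) &
			 ~(CANDIDATE_STEP - 1);
	uint64_t end = chunk_start + chunk_len;
	size_t count = 0;

	for (uint64_t bytenr = first; bytenr < end; bytenr += CANDIDATE_STEP) {
		if (count < max_out)
			out[count] = bytenr;
		count++;
	}
	return count;
}
```

Each value returned this way would then be tried by hand with
"btrfs check --chunk-root <bytenr>", as described in step 3.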

Thanks,
Qu

* Re: Chunk-Recovery fails with alignment error
  2017-12-05 19:22 Chunk-Recovery fails with alignment error Benjamin Beichler
  2017-12-06  0:42 ` Qu Wenruo
@ 2017-12-06  6:08 ` Chris Murphy
  2017-12-06  6:18   ` Qu Wenruo
  1 sibling, 1 reply; 11+ messages in thread
From: Chris Murphy @ 2017-12-06  6:08 UTC (permalink / raw)
  To: Benjamin Beichler; +Cc: Btrfs BTRFS

On Tue, Dec 5, 2017 at 12:22 PM, Benjamin Beichler
<hadrian2002@googlemail.com> wrote:
> Hi,
>
> I have a setup as following:  (1,7TB drive + 128GB SSD in Bcache) <==>
> luks device <==> btrfs FS
> I have been running Arch linux with newest stable kernel 4.14.

There is a known bug in 4.14 that affects all bcache-backed file
systems (maybe more; I think it's not a bcache-specific bug). You need
to downgrade or upgrade your kernel. The fix should be in 4.14.2.

The bug was found quickly enough that I'm surprised 4.14.0 would be in
Arch stable, let alone that it is *still* in stable.


-- 
Chris Murphy


* Re: Chunk-Recovery fails with alignment error
  2017-12-06  6:08 ` Chris Murphy
@ 2017-12-06  6:18   ` Qu Wenruo
  0 siblings, 0 replies; 11+ messages in thread
From: Qu Wenruo @ 2017-12-06  6:18 UTC (permalink / raw)
  To: Chris Murphy, Benjamin Beichler; +Cc: Btrfs BTRFS


On 2017-12-06 14:08, Chris Murphy wrote:
> On Tue, Dec 5, 2017 at 12:22 PM, Benjamin Beichler
> <hadrian2002@googlemail.com> wrote:
>> Hi,
>>
>> I have a setup as following:  (1,7TB drive + 128GB SSD in Bcache) <==>
>> luks device <==> btrfs FS
>> I have been running Arch linux with newest stable kernel 4.14.
> 
> There is a known bug in 4.14 that affects all bcache backed file
> systems (maybe more, I think it's not a bcache specific bug). You need
> to downgrade or upgrade your kernel. Fix should be in 4.14.2.
> 
> The bug was found quickly enough I'm surprised 4.14.0 would be in Arch
> stable, let alone it *still* being in stable.

Although the latest kernel is 4.14.3, it's only several days old.

It seems that Arch's kernel update cycle has sped up.

Normally a kernel stays in the testing repo for a while, and by the time
it moves into core it is usually already at *.2 or later.

Thanks,
Qu


* Re: Chunk-Recovery fails with alignment error
       [not found]         ` <b484d30c-f32d-78cf-7f07-b9cadb43a2d1@gmx.com>
@ 2017-12-09 23:12           ` Benjamin Beichler
  2017-12-10  0:29             ` Qu Wenruo
  0 siblings, 1 reply; 11+ messages in thread
From: Benjamin Beichler @ 2017-12-09 23:12 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs

Hi Qu,

2017-12-07 12:09 GMT+00:00 Qu Wenruo <quwenruo.btrfs@gmx.com>:
>
> Since the btrfs chunk recovery doesn't work and my quick and dirty hack
> doesn't work either, I don't expect much to be recoverable.
>
> Unless we have more detailed info about how and why the BUG_ON() of
> chunk recovery is triggered.
>
> That is to say, it will be quite time-consuming work to use gdb to
> locate the problem and see if any developer (mostly me) could use the
> info to dig further into the problem or fix it.
> (Considering the difference in timezone, I expect at least 8+ weeks to
> get a conclusion)

I'm really pleased that you want to help me. Of course, the current
backtrace was quite useless.
First, I revised the code a bit. Since one run over the 1.7 TB
drive took about 6 h, I thought about saving the state of the chunks
already found, so I simply saved all valid bytenrs to a file. That
reduced the time for scan_one_device to about 30 s. If you think this
could be interesting for the normal version, I could create a patch
for it.
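
A minimal sketch of that caching idea, not the actual change I made: the
valid bytenrs are written to a text file, one per line, after the first
scan and read back on later runs. The function names and the file format
are assumptions for illustration only.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: write count bytenrs to path, one decimal value
 * per line. Returns 0 on success, -1 on any I/O error.
 */
int save_bytenrs(const char *path, const uint64_t *bytenrs, size_t count)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	for (size_t i = 0; i < count; i++) {
		if (fprintf(f, "%" PRIu64 "\n", bytenrs[i]) < 0) {
			fclose(f);
			return -1;
		}
	}
	return fclose(f) == 0 ? 0 : -1;
}

/*
 * Hypothetical helper: read up to max bytenrs from path.
 * Returns the number of values read, or -1 if the file cannot be opened.
 */
long load_bytenrs(const char *path, uint64_t *bytenrs, size_t max)
{
	FILE *f = fopen(path, "r");
	size_t n = 0;

	if (!f)
		return -1;
	while (n < max && fscanf(f, "%" SCNu64, &bytenrs[n]) == 1)
		n++;
	fclose(f);
	return (long)n;
}
```

A real version would also have to make sure the cache belongs to the same
device and is discarded when the device contents change.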

>
> If you really want to do it, please step into the function
> btrfs_insert_item() in __rebuild_device_items() and see at which
> point -EIO is returned.
>
> My guess is the btrfs_search_slot() call in btrfs_insert_empty_items().
>
> If that's true, please call
>
> btrfs_print_tree(root->fs_info->chunk_root, root->fs_info->chunk_root->node, 1)
>
> in gdb, just before the btrfs_search_slot() call above, to show what
> the problem is.
>
Your guess was right. The current stack trace and btrfs_print_tree output
are at: https://gist.github.com/anonymous/2cf40ac1d3ddcbca95177acec78041b2

As you can see, the code at disk-io.c:321 explicitly excludes the
range from 0 to sectorsize and reports it as unaligned. I think the
problem is triggered because the code found a chunk/block at address
zero. Is it possible that chunks/blocks live at address 0, or is this
fuzzy data?

>
> BTW, currently nothing in the chunk tree/super block contains any info about
> your fs, so feel free to share it with the mailing list, where more people may help.
>
I added the list back; I simply forgot it in one of my replies.

> Thanks,
> Qu
>

thanks

Benjamin


* Re: Chunk-Recovery fails with alignment error
  2017-12-09 23:12           ` Benjamin Beichler
@ 2017-12-10  0:29             ` Qu Wenruo
  2017-12-10  0:33               ` Qu Wenruo
  0 siblings, 1 reply; 11+ messages in thread
From: Qu Wenruo @ 2017-12-10  0:29 UTC (permalink / raw)
  To: Benjamin Beichler, linux-btrfs


On 2017-12-10 07:12, Benjamin Beichler wrote:
> Hi Qu,
> 
> 2017-12-07 12:09 GMT+00:00 Qu Wenruo <quwenruo.btrfs@gmx.com>:
>>
>> Since the btrfs chunk recovery doesn't work and my dirty quick hack
>> doesn't work either, I don't expect much to recovery.
>>
>> Unless we have more detailed info about the how and why the BUG_ON() of
>> chunk recovery is triggered.
>>
>> That's to say, it will be a quite time consuming work to use gdb to
>> locate the problem, and see if any developer (mostly me) could use the
>> info to further dig into the problem or fix it.
>> (Considering the difference in timezone, I expect at least 8+ weeks to
>> get a conclusion)
> 
> I'm really pleased that you want to help me, of course the current
> backtrace was quite useless.
> Firstly, I revised the code a bit, and since one run over the 1,7TB
> drive took about 6h, I thought about saving the state of already found
> chunks. I simply saved all bytenr which are valid to a file. The
> consequence was a reduction of the time for scan_one_device to about
> 30s. If you think this could be interesting for the normal version, I
> could create a patch for this.
> 
>>
>> If you really want to do it, please step into the function
>> btrfs_insert_item() in __rebuild_device_items() and to see at which
>> point -EIO is returned.
>>
>> My guess is btrfs_search_slot() call in btrfs_insert_empty_items().
>>
>> If that's true, please call
>>
>> btrfs_print_tree(root->fs_info->chunk_root, root->fs_info->chunk_root->node, 1)
>>
>> in gdb, just before the btrfs_search_slot() call above, to show what's
>> the problem.
>>
> Your guess was right. The current stack trace and btrfs_print_tree is
> under : https://gist.github.com/anonymous/2cf40ac1d3ddcbca95177acec78041b2

The output is very helpful.

I originally thought it was something more serious, but it turns out
to be less serious than I expected.

> 
> As you can see, the code in disk.io:321 explicitly exclude the the
> sector from 0 to sectorsize, and states it is unaligned. I think
> because the code found a chunk/block at address zero, this triggers
> the problem. Is it possible, that there live chunks/blocks at address
> 0 or is this fuzzy data?

0 is completely valid in the btrfs logical address space.

It's the IS_ALIGNED macro which caused the problem.
So it's in fact quite easy to fix.

Always treating 0 as aligned should fix your problem.

Thanks,
Qu


* Re: Chunk-Recovery fails with alignment error
  2017-12-10  0:29             ` Qu Wenruo
@ 2017-12-10  0:33               ` Qu Wenruo
  2017-12-10 21:16                 ` Benjamin Beichler
  0 siblings, 1 reply; 11+ messages in thread
From: Qu Wenruo @ 2017-12-10  0:33 UTC (permalink / raw)
  To: Benjamin Beichler, linux-btrfs


On 2017-12-10 08:29, Qu Wenruo wrote:
> 
> 
> On 2017-12-10 07:12, Benjamin Beichler wrote:
>> Hi Qu,
>>
>> 2017-12-07 12:09 GMT+00:00 Qu Wenruo <quwenruo.btrfs@gmx.com>:
>>>
>>> Since the btrfs chunk recovery doesn't work and my dirty quick hack
>>> doesn't work either, I don't expect much to recovery.
>>>
>>> Unless we have more detailed info about the how and why the BUG_ON() of
>>> chunk recovery is triggered.
>>>
>>> That's to say, it will be a quite time consuming work to use gdb to
>>> locate the problem, and see if any developer (mostly me) could use the
>>> info to further dig into the problem or fix it.
>>> (Considering the difference in timezone, I expect at least 8+ weeks to
>>> get a conclusion)
>>
>> I'm really pleased that you want to help me, of course the current
>> backtrace was quite useless.
>> Firstly, I revised the code a bit, and since one run over the 1,7TB
>> drive took about 6h, I thought about saving the state of already found
>> chunks. I simply saved all bytenr which are valid to a file. The
>> consequence was a reduction of the time for scan_one_device to about
>> 30s. If you think this could be interesting for the normal version, I
>> could create a patch for this.
>>
>>>
>>> If you really want to do it, please step into the function
>>> btrfs_insert_item() in __rebuild_device_items() and to see at which
>>> point -EIO is returned.
>>>
>>> My guess is btrfs_search_slot() call in btrfs_insert_empty_items().
>>>
>>> If that's true, please call
>>>
>>> btrfs_print_tree(root->fs_info->chunk_root, root->fs_info->chunk_root->node, 1)
>>>
>>> in gdb, just before the btrfs_search_slot() call above, to show what's
>>> the problem.
>>>
>> Your guess was right. The current stack trace and btrfs_print_tree is
>> under : https://gist.github.com/anonymous/2cf40ac1d3ddcbca95177acec78041b2
> 
> The output is very helpful.
> 
> I was originally thinking it's something more serious, but it turns out
> to be less serious than my expectation.
> 
>>
>> As you can see, the code in disk.io:321 explicitly exclude the the
>> sector from 0 to sectorsize, and states it is unaligned. I think
>> because the code found a chunk/block at address zero, this triggers
>> the problem. Is it possible, that there live chunks/blocks at address
>> 0 or is this fuzzy data?
> 
> 0 is completely valid in btrfs logical address space.
> 
> It's the IS_ALIGNED macro which caused the problem.
> So it's quite easy to fix in fact.

Sorry, IS_ALIGNED is working as expected.

It's the bytenr < sectorsize line that causes the problem.
Please remove the bytenr < sectorsize check; I'll submit a patch later
to fix it.
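
To illustrate the difference, here is a simplified sketch of the check
before and after removing that line. It is not the literal btrfs-progs
code: the two function names are made up for this example, and the macro
is written in the style of the kernel's IS_ALIGNED() for power-of-two
sizes.

```c
#include <stdint.h>

/* Power-of-two alignment test, in the style of the kernel's IS_ALIGNED(). */
#define IS_ALIGNED(x, a) (((x) & ((uint64_t)(a) - 1)) == 0)

/*
 * Hypothetical name: the check as described in this thread.
 * bytenr 0 is rejected by the "bytenr < sectorsize" part even though
 * 0 is aligned to every sectorsize. Returns 1 if accepted, 0 if not.
 */
int bytenr_ok_old(uint64_t bytenr, uint32_t sectorsize)
{
	if (bytenr < sectorsize || !IS_ALIGNED(bytenr, sectorsize))
		return 0;
	return 1;
}

/*
 * Hypothetical name: the same check with "bytenr < sectorsize" removed.
 * Only alignment matters, so bytenr 0 is accepted.
 */
int bytenr_ok_fixed(uint64_t bytenr, uint32_t sectorsize)
{
	return IS_ALIGNED(bytenr, sectorsize) ? 1 : 0;
}
```

For a power-of-two sectorsize, 0 is the only aligned value below
sectorsize, so dropping the extra comparison changes the result for
bytenr 0 and nothing else; values such as 2048 or 4097 are still rejected
by IS_ALIGNED alone.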

Thanks,
Qu


* Re: Chunk-Recovery fails with alignment error
  2017-12-10  0:33               ` Qu Wenruo
@ 2017-12-10 21:16                 ` Benjamin Beichler
  2017-12-10 23:32                   ` Qu Wenruo
  2017-12-10 23:50                   ` Qu Wenruo
  0 siblings, 2 replies; 11+ messages in thread
From: Benjamin Beichler @ 2017-12-10 21:16 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

The patch made chunk-recover succeed. But I'm out of luck: either the
recovered chunk tree does not work, or other metadata is broken as
well.

Mounting the filesystem fails (dmesg):
BTRFS critical (device dm-0): corrupt node, invalid item slot:
block=16384, root=1, slot=0
BTRFS error (device dm-0): failed to read chunk root
BTRFS error (device dm-0): open_ctree failed

So I tried btrfs check --repair, which this time ran without error:
https://gist.github.com/anonymous/5cf7ad9e187032d2c94db4f91bb62c24

Then I tried btrfs check --init-extent-tree, which produces a lot of
output. I put the beginning here:
https://gist.github.com/anonymous/70e2482646a8235ee2327105d920dadd
At a quick glance, the messages stay similar to the last ones in the
gist, but the messages at the beginning do not repeat. If it helps,
I have the complete compressed log.

Then I tried btrfs recover to get files back, but for many files
(including important data; I filtered the output) I get output like this:
https://gist.github.com/anonymous/1cc7f7ab5af33e76d0bf80960bb300eb

Any new suggestions?


* Re: Chunk-Recovery fails with alignment error
  2017-12-10 21:16                 ` Benjamin Beichler
@ 2017-12-10 23:32                   ` Qu Wenruo
  2017-12-10 23:50                   ` Qu Wenruo
  1 sibling, 0 replies; 11+ messages in thread
From: Qu Wenruo @ 2017-12-10 23:32 UTC (permalink / raw)
  To: Benjamin Beichler; +Cc: linux-btrfs


On 2017-12-11 05:16, Benjamin Beichler wrote:
> The patch let the chunk-recover be successful. But I'm no lucky man,
> the recovered chunk tree does not work or other metadata is also
> broken.
> 
> Mounting the system is not successful (dmesg):
> BTRFS critical (device dm-0): corrupt node, invalid item slot:
> block=16384, root=1, slot=0

What does btrfs dump-tree -b 16384 say?

IIRC it's the same 0 bytenr problem, this time in the kernel.

> BTRFS error (device dm-0): failed to read chunk root
> BTRFS error (device dm-0): open_ctree failed
> 
> Therefore I tried a btrfs check --repair, this time without error:
> https://gist.github.com/anonymous/5cf7ad9e187032d2c94db4f91bb62c24
> 
> Then I tried btrfs check --init-extent-tree and this produces much
> output. I put the beginning into here:
> https://gist.github.com/anonymous/70e2482646a8235ee2327105d920dadd

That's common for --init-extent-tree, but I don't think the extent tree
is related in this case.

Thanks,
Qu

* Re: Chunk-Recovery fails with alignment error
  2017-12-10 21:16                 ` Benjamin Beichler
  2017-12-10 23:32                   ` Qu Wenruo
@ 2017-12-10 23:50                   ` Qu Wenruo
  2017-12-12 16:07                     ` Benjamin Beichler
  1 sibling, 1 reply; 11+ messages in thread
From: Qu Wenruo @ 2017-12-10 23:50 UTC (permalink / raw)
  To: Benjamin Beichler; +Cc: linux-btrfs


On 2017-12-11 05:16, Benjamin Beichler wrote:
> The patch let the chunk-recover be successful. But I'm no lucky man,
> the recovered chunk tree does not work or other metadata is also
> broken.
> 
> Mounting the system is not successful (dmesg):
> BTRFS critical (device dm-0): corrupt node, invalid item slot:
> block=16384, root=1, slot=0
> BTRFS error (device dm-0): failed to read chunk root
> BTRFS error (device dm-0): open_ctree failed

For this error, you could apply this diff to bypass it:

------
diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index ce4ed6ec8f39..355220e21c2e 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -413,12 +413,6 @@ int btrfs_check_node(struct btrfs_root *root, struct extent_buffer *node)
                btrfs_node_key_to_cpu(node, &key, slot);
                btrfs_node_key_to_cpu(node, &next_key, slot + 1);

-               if (!bytenr) {
-                       generic_err(root, node, slot,
-                               "invalid NULL node pointer");
-                       ret = -EUCLEAN;
-                       goto out;
-               }
                if (!IS_ALIGNED(bytenr, root->fs_info->sectorsize)) {
                        generic_err(root, node, slot,
                        "unaligned pointer, have %llu should be aligned to %u",
------

Thanks,
Qu


* Re: Chunk-Recovery fails with alignment error
  2017-12-10 23:50                   ` Qu Wenruo
@ 2017-12-12 16:07                     ` Benjamin Beichler
  0 siblings, 0 replies; 11+ messages in thread
From: Benjamin Beichler @ 2017-12-12 16:07 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

Hi,

Unfortunately, I could not use the patch: I didn't know which
version/tree it was made for, and it does not apply to 4.14 or 4.15.
But since you hinted that the age of my btrfs volume might matter, I
simply tried an old Ubuntu live disk, and it mounted the volume :-)
Then I started a balance of the metadata, hoping it would relocate the
unfavorably placed block, and voilà: the volume is also mountable with
newer kernels. Then I updated my Arch kernel in a chroot, and now my
system is running again.

I have now changed to the DUP metadata profile and am currently running
a scrub; I hope there are no more serious errors.

I don't know whether the bytenr < sectorsize check should be seen as a
regression, or whether the block was simply misplaced by the bcache
bug.

All in all: Many thanks for the help !

kind regards

Benjamin

