* Cannot mount or recover btrfs
@ 2019-12-29 15:05 Raviu
  2019-12-29 15:38 ` Hugo Mills
  2019-12-30  5:38 ` Qu Wenruo
  0 siblings, 2 replies; 5+ messages in thread
From: Raviu @ 2019-12-29 15:05 UTC (permalink / raw)
  To: linux-btrfs

Hi,
My system suddenly crashed, after reboot I cannot mount /home any more.

`uname -a`
Linux moonIk80 4.12.14-lp151.28.36-default #1 SMP Fri Dec 6 13:50:27 UTC 2019 (8f4a495) x86_64 x86_64 x86_64 GNU/Linux

btrfs-progs v5.4

`btrfs fi show`
Label: none  uuid: 378faa6e-8af0-415e-93f7-68b31fb08a29
        Total devices 1 FS bytes used 194.99GiB
        devid    1 size 232.79GiB used 231.79GiB path /dev/mapper/cr_sda4


The device cannot be mounted.
[  188.649876] BTRFS info (device dm-1): disk space caching is enabled
[  188.649878] BTRFS info (device dm-1): has skinny extents
[  188.656364] BTRFS critical (device dm-1): corrupt leaf: root=2 block=294640566272 slot=104, unexpected item end, have 42739 expect 9971
[  188.656374] BTRFS error (device dm-1): failed to read block groups: -5
[  188.700088] BTRFS error (device dm-1): open_ctree failed



`btrfs check /dev/mapper/cr_sda4`
Opening filesystem to check...
incorrect offsets 9971 42739
incorrect offsets 9971 42739
incorrect offsets 9971 42739
ERROR: failed to read block groups: Operation not permitted
ERROR: cannot open file system




* Re: Cannot mount or recover btrfs
  2019-12-29 15:05 Cannot mount or recover btrfs Raviu
@ 2019-12-29 15:38 ` Hugo Mills
  2019-12-30  5:38 ` Qu Wenruo
  1 sibling, 0 replies; 5+ messages in thread
From: Hugo Mills @ 2019-12-29 15:38 UTC (permalink / raw)
  To: Raviu; +Cc: linux-btrfs

On Sun, Dec 29, 2019 at 03:05:14PM +0000, Raviu wrote:
> Hi,
> My system suddenly crashed, after reboot I cannot mount /home any more.
> 
> `uname -a`
> Linux moonIk80 4.12.14-lp151.28.36-default #1 SMP Fri Dec 6 13:50:27 UTC 2019 (8f4a495) x86_64 x86_64 x86_64 GNU/Linux
> 
> btrfs-progs v5.4
> 
> `btrfs fi show`
> Label: none  uuid: 378faa6e-8af0-415e-93f7-68b31fb08a29
>         Total devices 1 FS bytes used 194.99GiB
>         devid    1 size 232.79GiB used 231.79GiB path /dev/mapper/cr_sda4
> 
> 
> The device cannot be mounted.
> [  188.649876] BTRFS info (device dm-1): disk space caching is enabled
> [  188.649878] BTRFS info (device dm-1): has skinny extents
> [  188.656364] BTRFS critical (device dm-1): corrupt leaf: root=2 block=294640566272 slot=104, unexpected item end, have 42739 expect 9971

>>> hex(9971)
'0x26f3'
>>> hex(42739)
'0xa6f3'

   That looks like a single bit error, and it's got a correct checksum
for the incorrect data, which suggests that the error happened while
the metadata was in RAM. The most likely cause here is that your RAM
is bad. (There are other options, but they're also mostly hardware).
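
   The single-bit observation can be checked directly. The following is a
minimal Python sketch, not part of any btrfs tool: `flipped_bits` is a
hypothetical helper, and `zlib.crc32` is used only as a stand-in checksum
to illustrate why a checksum computed after the corruption still validates.

```python
import zlib


def flipped_bits(expected: int, found: int) -> list:
    """Return the positions (0 = least significant) of the bits that differ."""
    diff = expected ^ found
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


expected, found = 9971, 42739

# XOR leaves a 1 only where the two values disagree.
print(hex(expected ^ found), flipped_bits(expected, found))

# If the value is corrupted in RAM *before* the checksum is computed,
# the checksum is taken over the already-wrong bytes and matches them.
good = expected.to_bytes(4, "little")
block = found.to_bytes(4, "little")   # what actually got written
csum = zlib.crc32(block)              # stand-in checksum, computed after the flip

print(zlib.crc32(block) == csum)      # the bad block passes verification
print(block == good)                  # yet it is not the intended data
```

   Exactly one differing bit position is consistent with a bit flip rather
than a larger overwrite.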

   As for fixing it -- first, make really, really sure that your
hardware is OK. After that, I don't think btrfs check will fix it
(although it might). Maybe one of the devs can add something to it to
help.

   Hugo.

> [  188.656374] BTRFS error (device dm-1): failed to read block groups: -5
> [  188.700088] BTRFS error (device dm-1): open_ctree failed
> 
> 
> 
> `btrfs check /dev/mapper/cr_sda4`
> Opening filesystem to check...
> incorrect offsets 9971 42739
> incorrect offsets 9971 42739
> incorrect offsets 9971 42739
> ERROR: failed to read block groups: Operation not permitted
> ERROR: cannot open file system
> 
> 

-- 
Hugo Mills             | I'm on a 30-day diet. So far I've lost 18 days.
hugo@... carfax.org.uk |
http://carfax.org.uk/  |
PGP: E2AB1DE4          |


* Re: Cannot mount or recover btrfs
  2019-12-29 15:05 Cannot mount or recover btrfs Raviu
  2019-12-29 15:38 ` Hugo Mills
@ 2019-12-30  5:38 ` Qu Wenruo
  2020-01-07  7:35   ` Raviu
  1 sibling, 1 reply; 5+ messages in thread
From: Qu Wenruo @ 2019-12-30  5:38 UTC (permalink / raw)
  To: Raviu, linux-btrfs





On 2019/12/29 11:05 PM, Raviu wrote:
> Hi,
> My system suddenly crashed, after reboot I cannot mount /home any more.
> 
> `uname -a`
> Linux moonIk80 4.12.14-lp151.28.36-default #1 SMP Fri Dec 6 13:50:27 UTC 2019 (8f4a495) x86_64 x86_64 x86_64 GNU/Linux
> 
> btrfs-progs v5.4
> 
> `btrfs fi show`
> Label: none  uuid: 378faa6e-8af0-415e-93f7-68b31fb08a29
>         Total devices 1 FS bytes used 194.99GiB
>         devid    1 size 232.79GiB used 231.79GiB path /dev/mapper/cr_sda4
> 
> 
> The device cannot be mounted.
> [  188.649876] BTRFS info (device dm-1): disk space caching is enabled
> [  188.649878] BTRFS info (device dm-1): has skinny extents
> [  188.656364] BTRFS critical (device dm-1): corrupt leaf: root=2 block=294640566272 slot=104, unexpected item end, have 42739 expect 9971

As Hugo has already pointed out, this looks very much like a bit flip.
Thus a memtest is highly recommended.

Also, your kernel is a little old. I'm not sure whether the distro (I guess
it's openSUSE or SLE?) has all the backports, but starting from v5.2 we
have a newer write-time tree-checker that can prevent such a bit flip from
being written back to disk, so we can catch it earlier.
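
A rough sketch of the kind of consistency check behind the "unexpected item
end" message, assuming that item data in a leaf is packed back to back from
the end of the data area, so that each item's end must equal the previous
item's offset (and the first item's end must equal the size of the data
area). The `Item` class, `check_item_ends`, the leaf size, and every number
other than 9971 and 42739 are hypothetical; this is an illustration, not the
kernel's code, and whether the flipped bit was in the offset or the size
field is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class Item:
    offset: int  # start of the item's data within the leaf data area
    size: int    # length of the item's data in bytes


def check_item_ends(items, leaf_data_size):
    """Return (slot, have, expect) for the first inconsistent item, or None."""
    expect = leaf_data_size            # slot 0 must end at the end of the data area
    for slot, item in enumerate(items):
        have = item.offset + item.size
        if have != expect:
            return slot, have, expect
        expect = item.offset           # the next item must end where this one starts
    return None


LEAF_DATA_SIZE = 10200  # hypothetical, chosen only to make the numbers line up

healthy = [Item(10100, 100), Item(9971, 129), Item(9900, 71)]

# Same leaf with bit 15 of the last item's offset flipped (assumed location).
corrupted = [Item(10100, 100), Item(9971, 129), Item(9900 ^ 0x8000, 71)]

print(check_item_ends(healthy, LEAF_DATA_SIZE))
print(check_item_ends(corrupted, LEAF_DATA_SIZE))
```

Running such a check before a block is written, rather than only when it is
read back, is what lets the corruption be rejected while the last good copy
is still on disk.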



This is the extent tree, so in theory you can always salvage the data using
`btrfs-restore`.

But that's a last-resort method.

> [  188.656374] BTRFS error (device dm-1): failed to read block groups: -5
> [  188.700088] BTRFS error (device dm-1): open_ctree failed
> 
> 
> 
> `btrfs check /dev/mapper/cr_sda4`
> Opening filesystem to check...
> incorrect offsets 9971 42739
> incorrect offsets 9971 42739
> incorrect offsets 9971 42739
> ERROR: failed to read block groups: Operation not permitted
> ERROR: cannot open file system
> 
> 
If you can re-compile btrfs-progs, you can try this branch:
https://github.com/adam900710/btrfs-progs/tree/dirty_fix_for_raviu

Then use the compiled btrfs-corrupt-block (I know it's a terrible name)
to fix the fs:
# ./btrfs-corrupt-block -X /dev/dm-1

It should output what it fixed if it found anything.

Thanks,
Qu



* Re: Cannot mount or recover btrfs
  2019-12-30  5:38 ` Qu Wenruo
@ 2020-01-07  7:35   ` Raviu
  2020-01-10 15:49     ` Raviu
  0 siblings, 1 reply; 5+ messages in thread
From: Raviu @ 2020-01-07  7:35 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

I restored over 90% of the data using the restore command and reformatted the disk before I got your email.
But I can confirm that I had RAM corruption: I ran memtest per your recommendation and found that the first of the two modules is bad.
The corruption was severe and repetitive. I just wonder how such severe corruption went unnoticed by the Linux kernel, apart from rare random lockups. I'm really amazed that the apps and the kernel were functioning at all! Data really was being changed in RAM.
I upgraded to vanilla kernel 5.4.6 before running the memtest, so even the latest kernel did not panic over this bad RAM.

Isn't this something that should be fixed?



* Re: Cannot mount or recover btrfs
  2020-01-07  7:35   ` Raviu
@ 2020-01-10 15:49     ` Raviu
  0 siblings, 0 replies; 5+ messages in thread
From: Raviu @ 2020-01-10 15:49 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

Would duplicate metadata have been of any help in my case?
I mean, if my filesystem had been formatted with duplicate metadata.

