* corrupt leaf, bad key order on kernel 5.0
From: Nazar Mokrynskyi @ 2019-04-05 19:11 UTC
  To: linux-btrfs

NOTE: I do not need help with recovery. I have fully automated snapshots, backups and restoration mechanisms; the only purpose of this email is to help developers find the cause of yet another filesystem corruption and hopefully fix it.

Yet another corruption of my root BTRFS filesystem happened today.
Didn't bother to run scrub, balance or check, just created disk image for future investigation and restored everything from backup.

Here is what corruption looks like:
[  274.241339] BTRFS info (device dm-0): disk space caching is enabled
[  274.241344] BTRFS info (device dm-0): has skinny extents
[  274.283238] BTRFS info (device dm-0): enabling ssd optimizations
[  310.436672] BTRFS critical (device dm-0): corrupt leaf: root=268 block=42044719104 slot=123, bad key order, prev (1240717 108 41447424) current (1240717 76 41451520)
[  310.449304] BTRFS critical (device dm-0): corrupt leaf: root=268 block=42044719104 slot=123, bad key order, prev (1240717 108 41447424) current (1240717 76 41451520)
[  310.449309] BTRFS: error (device dm-0) in btrfs_drop_snapshot:9250: errno=-5 IO failure
[  310.449311] BTRFS info (device dm-0): forced readonly
[  311.266789] BTRFS info (device dm-0): delayed_refs has NO entry
[  311.277088] BTRFS error (device dm-0): cleaner transaction attach returned -30
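
For anyone triaging similar logs, here is a minimal sketch of pulling the fields out of the "bad key order" line and decoding the two errno values. The regular expression is an assumption based on this one message only, not a full grammar of btrfs log output, and parse_bad_key_order is a hypothetical helper name. On Linux, -5 is EIO (matching the "IO failure" text) and -30 is EROFS (which follows from the filesystem being forced read-only).

```python
import errno
import re

# Pattern for the "bad key order" message quoted above. This is an assumption
# derived from this single log line, not a complete grammar of btrfs messages.
BAD_KEY_ORDER = re.compile(
    r"corrupt leaf: root=(?P<root>\d+) block=(?P<block>\d+) slot=(?P<slot>\d+), "
    r"bad key order, prev \((?P<prev>\d+ \d+ \d+)\) current \((?P<cur>\d+ \d+ \d+)\)"
)


def parse_bad_key_order(line):
    """Return root, block, slot and both keys as integers, or None if no match."""
    m = BAD_KEY_ORDER.search(line)
    if m is None:
        return None
    return {
        "root": int(m["root"]),
        "block": int(m["block"]),
        "slot": int(m["slot"]),
        # Each key is printed as "objectid type offset".
        "prev": tuple(int(x) for x in m["prev"].split()),
        "current": tuple(int(x) for x in m["cur"].split()),
    }


line = ("[  310.436672] BTRFS critical (device dm-0): corrupt leaf: root=268 "
        "block=42044719104 slot=123, bad key order, prev (1240717 108 41447424) "
        "current (1240717 76 41451520)")
info = parse_bad_key_order(line)
print(info)

# The kernel reports errors as negative errno values; look up their names.
for code in (-5, -30):
    print(code, errno.errorcode[-code])
```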

My system just froze when I was not looking at it, and this is the state it is in now.
The file system survived from March 8th until April 5th, one of the fastest corruptions in my experience.

It looks like this happened while sending an incremental snapshot to the other BTRFS filesystem, since the last snapshot on that one was not read-only, as it otherwise should have been.

I'm on Ubuntu 19.04 with Linux kernel 5.0.5 and btrfs-progs v4.20.2.

My filesystem is on top of LUKS on an NVMe SSD (SM961). I have 3 snapshots created every 15 minutes from 3 subvolumes, with rotation of old snapshots (there can be from tens to hundreds of snapshots at any time).

Mount options: compress=lzo,noatime,ssd

I have a full disk image of the corrupted filesystem and will create Qcow2 snapshots of it, so if you want me to run any experiments, including potentially destructive ones or ones that use custom patches to btrfs-progs, to find the cause of the corruption, I would be happy to help as much as I can.

P.S. I run the latest stable and rc kernels all the time, and during the last 6 months I've had about as many corruptions of different BTRFS filesystems as during the 3 years before that. Really worrying, if you ask me.

-- 
Sincerely, Nazar Mokrynskyi
github.com/nazar-pc


* Re: corrupt leaf, bad key order on kernel 5.0
From: Hugo Mills @ 2019-04-05 19:32 UTC
  To: Nazar Mokrynskyi; +Cc: linux-btrfs

On Fri, Apr 05, 2019 at 10:11:57PM +0300, Nazar Mokrynskyi wrote:
> NOTE: I do not need help with recovery. I have fully automated snapshots, backups and restoration mechanisms; the only purpose of this email is to help developers find the cause of yet another filesystem corruption and hopefully fix it.

   That's good news, at least.

> Yet another corruption of my root BTRFS filesystem happened today.
> Didn't bother to run scrub, balance or check, just created disk image for future investigation and restored everything from backup.
> 
> Here is what corruption looks like:
> [  274.241339] BTRFS info (device dm-0): disk space caching is enabled
> [  274.241344] BTRFS info (device dm-0): has skinny extents
> [  274.283238] BTRFS info (device dm-0): enabling ssd optimizations
> [  310.436672] BTRFS critical (device dm-0): corrupt leaf: root=268 block=42044719104 slot=123, bad key order, prev (1240717 108 41447424) current (1240717 76 41451520)

   "Bad key order" is usually an indicator of faulty RAM -- a piece of
metadata gets loaded into RAM for modification, a bit gets flipped in
it (because the bit is stuck on one value), and then the csum is
computed for the page (including the faulty bit), and written out to
disk. In this case, it's not obvious, but I'd suggest that the second
field of the key has been flipped, as 108 is 0x6c, and 76 is 0x4c --
one bit away from each other.
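
The arithmetic can be checked with a short sketch. It assumes the damaged type field originally held 108, like the key in the previous slot; the variable names are only illustrative.

```python
prev_type = 108      # key type in the previous slot (0x6c)
current_type = 76    # key type in the slot flagged as corrupt (0x4c)

# XOR keeps only the bits that differ between the two values.
diff = prev_type ^ current_type

# Count the set bits: a result of 1 means a single-bit difference.
bits_flipped = bin(diff).count("1")

print(f"{prev_type:#04x} ^ {current_type:#04x} = {diff:#04x}")
print(f"bits differing: {bits_flipped}, bit position: {diff.bit_length() - 1}")
```

The two values differ only in bit 5 (0x20), which is what a single stuck or flipped memory bit would produce.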

   I recommend you check your hardware thoroughly before attempting to
rebuild the FS.

   Hugo.


-- 
Hugo Mills             | I'm always right.
hugo@... carfax.org.uk | But I might be wrong about that.
http://carfax.org.uk/  |
PGP: E2AB1DE4          |

* Re: corrupt leaf, bad key order on kernel 5.0
From: Nazar Mokrynskyi @ 2019-04-05 20:23 UTC
  To: Hugo Mills; +Cc: linux-btrfs

05.04.19 22:32, Hugo Mills wrote:
>> Yet another corruption of my root BTRFS filesystem happened today.
>> Didn't bother to run scrub, balance or check, just created disk image for future investigation and restored everything from backup.
>>
>> Here is what corruption looks like:
>> [  274.241339] BTRFS info (device dm-0): disk space caching is enabled
>> [  274.241344] BTRFS info (device dm-0): has skinny extents
>> [  274.283238] BTRFS info (device dm-0): enabling ssd optimizations
>> [  310.436672] BTRFS critical (device dm-0): corrupt leaf: root=268 block=42044719104 slot=123, bad key order, prev (1240717 108 41447424) current (1240717 76 41451520)
>    "Bad key order" is usually an indicator of faulty RAM -- a piece of
> metadata gets loaded into RAM for modification, a bit gets flipped in
> it (because the bit is stuck on one value), and then the csum is
> computed for the page (including the faulty bit), and written out to
> disk. In this case, it's not obvious, but I'd suggest that the second
> field of the key has been flipped, as 108 is 0x6c, and 76 is 0x4c --
> one bit away from each other.
>
>    I recommend you check your hardware thoroughly before attempting to
> rebuild the FS.
>
>    Hugo.

Hm... this might indeed be related to the RAM being overclocked a bit too much. It worked fine for a long time, but is apparently not 100% stable.
I rolled back the overclock, thanks for the suggestion!

Sincerely, Nazar Mokrynskyi
github.com/nazar-pc


* Re: corrupt leaf, bad key order on kernel 5.0
From: Qu Wenruo @ 2019-04-06 2:17 UTC
  To: Hugo Mills, Nazar Mokrynskyi, linux-btrfs


On 2019/4/6 3:32 AM, Hugo Mills wrote:
> On Fri, Apr 05, 2019 at 10:11:57PM +0300, Nazar Mokrynskyi wrote:
>> NOTE: I do not need help with recovery. I have fully automated snapshots, backups and restoration mechanisms; the only purpose of this email is to help developers find the cause of yet another filesystem corruption and hopefully fix it.
> 
>    That's good news, at least.
> 
>> Yet another corruption of my root BTRFS filesystem happened today.
>> Didn't bother to run scrub, balance or check, just created disk image for future investigation and restored everything from backup.
>>
>> Here is what corruption looks like:
>> [  274.241339] BTRFS info (device dm-0): disk space caching is enabled
>> [  274.241344] BTRFS info (device dm-0): has skinny extents
>> [  274.283238] BTRFS info (device dm-0): enabling ssd optimizations
>> [  310.436672] BTRFS critical (device dm-0): corrupt leaf: root=268 block=42044719104 slot=123, bad key order, prev (1240717 108 41447424) current (1240717 76 41451520)
> 
>    "Bad key order" is usually an indicator of faulty RAM -- a piece of
> metadata gets loaded into RAM for modification, a bit gets flipped in
> it (because the bit is stuck on one value), and then the csum is
> computed for the page (including the faulty bit), and written out to
> disk. In this case, it's not obvious, but I'd suggest that the second
> field of the key has been flipped, as 108 is 0x6c, and 76 is 0x4c --
> one bit away from each other.

Furthermore, 108 is EXTENT_DATA_KEY, a completely valid type, while
there is no key type assigned to 76.
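
A minimal sketch of that reasoning follows. The KEY_TYPES table is only a subset, written from memory of the kernel's btrfs_tree.h, so treat the values as an assumption and verify them against your headers; repair_candidates is a hypothetical helper. Keys compare as (objectid, type, offset), which Python tuples happen to model directly.

```python
# Subset of btrfs key types, recalled from the kernel's btrfs_tree.h.
# This table is an assumption; verify it against your kernel headers.
KEY_TYPES = {
    1: "INODE_ITEM",
    12: "INODE_REF",
    24: "XATTR_ITEM",
    84: "DIR_ITEM",
    96: "DIR_INDEX",
    108: "EXTENT_DATA",
}

# Keys exactly as printed in the log: (objectid, type, offset).
prev = (1240717, 108, 41447424)
current = (1240717, 76, 41451520)


def repair_candidates(prev, current):
    """Try every single-bit flip of the 8-bit type field and keep results
    that are known types and sort strictly after the previous key."""
    objectid, ktype, offset = current
    found = []
    for bit in range(8):
        fixed = (objectid, ktype ^ (1 << bit), offset)
        if fixed[1] in KEY_TYPES and prev < fixed:
            found.append(fixed)
    return found


print(current[1] in KEY_TYPES)            # is 76 a known type in this table?
print(prev < current)                     # does the logged pair sort correctly?
print(repair_candidates(prev, current))   # single-bit repairs that restore order
```

With this table, flipping bit 5 back to give type 108 is the only single-bit repair that is both a known type and keeps the leaf sorted. The two offsets then differ by 4096, which would be consistent with two adjacent EXTENT_DATA items for the same inode.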

> 
>    I recommend you check your hardware thoroughly before attempting to
> rebuild the FS.

Hugo's completely right.

This is very much a symptom of a memory bit flip.

> 
>    Hugo.
> 
>> P.S. I run the latest stable and rc kernels all the time, and during the last 6 months I've had about as many corruptions of different BTRFS filesystems as during the 3 years before that. Really worrying, if you ask me.

This is because btrfs is now much stricter about any possible corruption
(sometimes too strict during the development cycle, but this time it is
a real problem).

I'm afraid it will report more and more problems, but at least next time
it won't cause a mount failure; instead, the transaction will be aborted
before bad data is written to disk.
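
As a rough illustration of the idea of checking before writing, here is a toy model. It is not the kernel's code: CorruptLeaf, check_key_order and write_leaf are hypothetical names, and a Python list stands in for the disk.

```python
class CorruptLeaf(Exception):
    """Raised by the toy checker; stands in for a transaction abort."""


def check_key_order(keys):
    """Raise CorruptLeaf unless the keys are strictly increasing."""
    for slot in range(1, len(keys)):
        if keys[slot - 1] >= keys[slot]:
            raise CorruptLeaf(
                f"slot={slot}, bad key order, "
                f"prev {keys[slot - 1]} current {keys[slot]}"
            )


def write_leaf(keys, disk):
    """Validate first, then 'write' by appending to a list that models the disk."""
    check_key_order(keys)
    disk.append(list(keys))


disk = []
good = [(1240717, 108, 41447424), (1240717, 108, 41451520)]
bad = [(1240717, 108, 41447424), (1240717, 76, 41451520)]

write_leaf(good, disk)
try:
    write_leaf(bad, disk)
except CorruptLeaf as err:
    # The damaged leaf is rejected before it reaches the "disk".
    print("aborted:", err)

print(len(disk))  # only the good leaf was written
```

The point of the design is that the check runs on the in-memory copy, so a bit flipped in RAM is caught while the on-disk copy is still intact.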

Thanks,
Qu
