linux-btrfs.vger.kernel.org archive mirror
* Unrecoverable btrfs corruption (backref bytes do not match extent backref)
@ 2019-01-03 22:06 Nazar Mokrynskyi
  2019-01-04  1:15 ` Chris Murphy
  0 siblings, 1 reply; 10+ messages in thread
From: Nazar Mokrynskyi @ 2019-01-03 22:06 UTC (permalink / raw)
  To: linux-btrfs

Today I faced yet another BTRFS corruption. The only major software change in recent days was the upgrade to Linux 4.20, so this is something to keep in mind.
The previous corruption happened with kernel 4.14, a bit over a year ago I think, so the filesystem was most likely created in December 2017.

This time I saved an image of the corrupted partition, so I can run more experiments on it.

A bit of history:
1) Several days ago I ran scrub and it found some inode errors, which boiled down to the file /var/lib/mysql/ibdata1 (/var/lib/mysql has CoW disabled). I replaced that file with a copy from backup, made a fresh snapshot and removed the old snapshots that contained the corrupted file. Scrub was happy after that.
2) Today everything worked fine until, on reboot, the system dropped into initramfs complaining about something with btrfs...

My setup: BTRFS under full-disk LUKS, no partition table:
/dev/mapper/system / btrfs compress=lzo,noatime,ssd,subvol=/root 0 1

btrfs-progs v4.19.1, Ubuntu 19.04 (development branch with proposed packages enabled), kernel 4.20.0 with ACS override patch

After inspection it turned out that scrub is still perfectly fine, with no complaints; however:

root@ubuntu:~# btrfsck /dev/mapper/luks-739967f1-9770-470a-a031-8d8b8bcdb350
warning, bad space info total_bytes 2155872256 used 2155876352
warning, bad space info total_bytes 3229614080 used 3229618176
warning, bad space info total_bytes 4303355904 used 4303360000
warning, bad space info total_bytes 5377097728 used 5377101824
warning, bad space info total_bytes 6450839552 used 6450843648
warning, bad space info total_bytes 7524581376 used 7524585472
warning, bad space info total_bytes 8598323200 used 8598327296
warning, bad space info total_bytes 9672065024 used 9672069120
warning, bad space info total_bytes 10745806848 used 10745810944
warning, bad space info total_bytes 11819548672 used 11819552768
warning, bad space info total_bytes 12893290496 used 12893294592
warning, bad space info total_bytes 13967032320 used 13967036416
warning, bad space info total_bytes 15040774144 used 15040778240
warning, bad space info total_bytes 16114515968 used 16114520064
warning, bad space info total_bytes 17188257792 used 17188261888
warning, bad space info total_bytes 18261999616 used 18262003712
warning, bad space info total_bytes 19335741440 used 19335745536
warning, bad space info total_bytes 20409483264 used 20409487360
warning, bad space info total_bytes 21483225088 used 21483229184
warning, bad space info total_bytes 22556966912 used 22556971008
warning, bad space info total_bytes 23630708736 used 23630712832
warning, bad space info total_bytes 24704450560 used 24704454656
warning, bad space info total_bytes 25778192384 used 25778196480
warning, bad space info total_bytes 26851934208 used 26851938304
warning, bad space info total_bytes 27925676032 used 27925680128
warning, bad space info total_bytes 28999417856 used 28999421952
warning, bad space info total_bytes 30073159680 used 30073163776
warning, bad space info total_bytes 31146901504 used 31146905600
warning, bad space info total_bytes 32220643328 used 32220647424
Checking filesystem on /dev/mapper/luks-739967f1-9770-470a-a031-8d8b8bcdb350
UUID: 5170aca4-061a-4c6c-ab00-bd7fc8ae6030
checking extents
extent item 3114475520 has multiple extent items
ref mismatch on [3114475520 4096] extent item 1, found 2
backref bytes do not match extent backref, bytenr=3114475520, ref bytes=4096, backref bytes=36864
backpointer mismatch on [3114475520 4096]
ERROR: errors found in extent allocation tree or chunk allocation
checking free space cache
checking fs roots
checking csums
checking root refs
found 39409483813 bytes used, error(s) found
total csum bytes: 35990412
total tree bytes: 2395095040
total fs tree bytes: 2249408512
total extent tree bytes: 96534528
btree space waste bytes: 456622616
file data blocks allocated: 174319587328
 referenced 61677670400
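
For anyone reading the warnings above: in each "bad space info" line, `used` is slightly larger than `total_bytes`. The following is a minimal Python sketch for tabulating that difference from saved btrfsck output. The function name `space_info_overshoot` and the `SAMPLE` variable are hypothetical, chosen only for this illustration, and the sample holds just the first three warning lines. For those three lines the difference comes out as 4096 bytes each. The sketch only does arithmetic on the log and says nothing about the cause.

```python
import re

# First three warning lines copied verbatim from the btrfsck output above.
SAMPLE = """\
warning, bad space info total_bytes 2155872256 used 2155876352
warning, bad space info total_bytes 3229614080 used 3229618176
warning, bad space info total_bytes 4303355904 used 4303360000
"""

# Matches the numeric fields of a "bad space info" warning line.
PATTERN = re.compile(r"bad space info total_bytes (\d+) used (\d+)")


def space_info_overshoot(text):
    """Return a list of (total_bytes, used, used - total_bytes) tuples,
    one per "bad space info" warning found in `text`."""
    rows = []
    for match in PATTERN.finditer(text):
        total = int(match.group(1))
        used = int(match.group(2))
        rows.append((total, used, used - total))
    return rows


if __name__ == "__main__":
    for total, used, excess in space_info_overshoot(SAMPLE):
        print(f"total_bytes={total} used={used} excess={excess}")
```

To run it over a full log, read the saved output from a file and pass its contents to `space_info_overshoot` instead of `SAMPLE`.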

I removed all secondary subvolumes (such as the one for Docker) and all snapshots, but the issue persisted.

Then I tried `btrfs balance`, and it failed with around 10-20% remaining.

I think it was after that that scrub started complaining about errors, or just stopped making progress.

This is the point at which I made the 256G partition image.

Since I have proper backups, I gave `btrfsck --repair` a try just in case it does something useful for once, but it just repeatedly prints the following without any meaningful CPU or disk activity:

ref mismatch on [3114475520 36864] extent item 1, found 2
backref bytes do not match extent backref, bytenr=3114475520, ref bytes=36864, backref bytes=4096
backpointer mismatch on [3114475520 36864]
attempting to repair backref discrepency for bytenr 3114475520
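
Comparing the mismatch line from the plain check with the one from the `--repair` run: both name the same bytenr, but the `ref bytes` and `backref bytes` values appear swapped. The Python sketch below parses the two lines (copied verbatim from the output in this message) to make that comparison explicit. The helper name `parse_mismatch` and the 4096-byte block size used for the "blocks" figure are assumptions made for this illustration; nothing here comes from btrfs-progs itself.

```python
import re

# Mismatch lines copied verbatim from the two runs quoted in this message.
CHECK_LINE = ("backref bytes do not match extent backref, "
              "bytenr=3114475520, ref bytes=4096, backref bytes=36864")
REPAIR_LINE = ("backref bytes do not match extent backref, "
               "bytenr=3114475520, ref bytes=36864, backref bytes=4096")

PATTERN = re.compile(r"bytenr=(\d+), ref bytes=(\d+), backref bytes=(\d+)")

BLOCK_SIZE = 4096  # assumed block size, used only to express sizes in blocks


def parse_mismatch(line):
    """Extract bytenr, ref bytes and backref bytes from one mismatch line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError(f"not a backref mismatch line: {line!r}")
    bytenr, ref_bytes, backref_bytes = (int(g) for g in match.groups())
    return {"bytenr": bytenr, "ref_bytes": ref_bytes,
            "backref_bytes": backref_bytes}


if __name__ == "__main__":
    check = parse_mismatch(CHECK_LINE)
    repair = parse_mismatch(REPAIR_LINE)
    for name, item in (("check", check), ("repair", repair)):
        print(name, item,
              "ref blocks:", item["ref_bytes"] // BLOCK_SIZE,
              "backref blocks:", item["backref_bytes"] // BLOCK_SIZE)
    swapped = (check["ref_bytes"] == repair["backref_bytes"]
               and check["backref_bytes"] == repair["ref_bytes"])
    print("same bytenr:", check["bytenr"] == repair["bytenr"],
          "values swapped:", swapped)
```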

The actual corruption happened in the /docker subvolume, somewhere among regular files. I get segfaults when I try to do anything there, and sometimes the kernel locks up. Other files seem to be perfectly fine.

Removing the /docker subvolume locked up the kernel, if I recall correctly.


If this looks important and you want me to run some commands to check what exactly happened, I can start VMs with this partition image attached and do whatever is needed. I can't send the image anywhere though, since it contains sensitive information.

NOTE: I don't need help with partition or data recovery. I'm used to these kinds of crashes and have backups, so no data was lost.

P.S. I really wish BTRFS would stop accidentally corrupting itself one day.

-- 
Sincerely, Nazar Mokrynskyi
github.com/nazar-pc



Thread overview: 10+ messages
2019-01-03 22:06 Unrecoverable btrfs corruption (backref bytes do not match extent backref) Nazar Mokrynskyi
2019-01-04  1:15 ` Chris Murphy
2019-01-04  1:32   ` Qu Wenruo
2019-01-04  1:36     ` Hans van Kranenburg
2019-01-04  3:43     ` Chris Murphy
2019-01-04  4:31       ` Qu Wenruo
2019-01-04 23:04     ` Nazar Mokrynskyi
2019-01-05  1:18       ` Qu Wenruo
2019-01-05  1:26         ` Nazar Mokrynskyi
2019-01-05  0:13   ` Nazar Mokrynskyi
