* 4.11 relocate crash, null pointer
@ 2017-05-01 17:06                                       ` Marc MERLIN
  2017-05-01 18:08                                         ` 4.11 relocate crash, null pointer + rolling back a filesystem by X hours? Marc MERLIN
  0 siblings, 1 reply; 77+ messages in thread
From: Marc MERLIN @ 2017-05-01 17:06 UTC (permalink / raw)
  To: linux-btrfs

I have a filesystem that sadly got corrupted by a SAS card I just installed yesterday.

I don't think that, in a case like this, there is a way to roll back all
writes across all subvolumes in the last 24H, correct?

Is the best thing to go into each subvolume, delete the recent snapshots, and
rename the one from 24H ago as the current one?

BTRFS warning (device dm-5): failed to load free space cache for block group 6746013696000, rebuilding it now
BTRFS warning (device dm-5): block group 6754603630592 has wrong amount of free space
BTRFS warning (device dm-5): failed to load free space cache for block group 6754603630592, rebuilding it now
BTRFS warning (device dm-5): block group 7125178777600 has wrong amount of free space
BTRFS warning (device dm-5): failed to load free space cache for block group 7125178777600, rebuilding it now
BTRFS error (device dm-5): bad tree block start 3981076597540270796 2899180224512
BTRFS error (device dm-5): bad tree block start 942082474969670243 2899180224512
BTRFS: error (device dm-5) in __btrfs_free_extent:6944: errno=-5 IO failure
BTRFS info (device dm-5): forced readonly
BTRFS: error (device dm-5) in btrfs_run_delayed_refs:2961: errno=-5 IO failure
BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: __del_reloc_root+0x3f/0xa6
PGD 189a0e067
PUD 189a0f067
PMD 0

Oops: 0000 [#1] PREEMPT SMP
Modules linked in: veth ip6table_filter ip6_tables ebtable_nat ebtables ppdev lp xt_addrtype br_netfilter bridge stp llc tun autofs4 softdog binfmt_misc ftdi_sio nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc ipt_REJECT nf_reject_ipv4 xt_conntrack xt_mark xt_nat xt_tcpudp nf_log_ipv4 nf_log_common xt_LOG iptable_mangle iptable_filter lm85 hwmon_vid pl2303 dm_snapshot dm_bufio iptable_nat ip_tables nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_conntrack_ftp ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_nat nf_conntrack x_tables sg st snd_pcm_oss snd_mixer_oss bcache kvm_intel kvm irqbypass snd_hda_codec_realtek snd_cmipci snd_hda_codec_generic snd_hda_intel snd_mpu401_uart snd_hda_codec snd_opl3_lib snd_rawmidi snd_hda_core snd_seq_device snd_hwdep eeepc_wmi snd_pcm asus_wmi rc_ati_x10
 asix snd_timer ati_remote sparse_keymap usbnet rfkill snd hwmon soundcore rc_core evdev libphy tpm_infineon pcspkr i915 parport_pc i2c_i801 input_leds mei_me lpc_ich parport tpm_tis battery usbserial tpm_tis_core tpm wmi e1000e ptp pps_core fuse raid456 multipath mmc_block mmc_core lrw ablk_helper dm_crypt dm_mod async_raid6_recov async_pq async_xor async_memcpy async_tx crc32c_intel blowfish_x86_64 blowfish_common pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd xhci_pci ehci_pci sata_sil24 xhci_hcd mvsas ehci_hcd r8169 usbcore mii libsas scsi_transport_sas thermal fan [last unloaded: ftdi_sio]
CPU: 0 PID: 9056 Comm: btrfs Tainted: G     U          4.11.0-amd64-preempt-sysrq-20170406 #2
Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
task: ffff88374d2a60c0 task.stack: ffffa6f226424000
RIP: 0010:__del_reloc_root+0x3f/0xa6
RSP: 0018:ffffa6f226427a40 EFLAGS: 00210246
RAX: 0000000000000000 RBX: ffff8838ee256000 RCX: 00000000ffffffe2
RDX: 0000000000000001 RSI: ffffffff9f83b410 RDI: ffff8837992da568
RBP: ffffa6f226427a68 R08: 0000000000000000 R09: ffffffff9fd69480
R10: 0000000000000000 R11: 0000000000000000 R12: ffffa6f226427ab0
R13: ffff883768938000 R14: ffff8837992da568 R15: ffff8837992da570
FS:  00007facd18d28c0(0000) GS:ffff883a5e200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000189a10000 CR4: 00000000001406f0
Call Trace:
 free_reloc_roots+0x4f/0x5d
 merge_reloc_roots+0x159/0x1ba
 relocate_block_group+0x410/0x492
 btrfs_relocate_block_group+0x12d/0x253
 btrfs_relocate_chunk+0x3e/0xb1
 btrfs_balance+0xd16/0xf36
 btrfs_ioctl_balance+0x24f/0x2cd
 ? __alloc_pages_nodemask+0x134/0x1e0
 btrfs_ioctl+0x1447/0x1e22
 ? mem_cgroup_charge_statistics+0x1e/0x88
 ? get_page+0x9/0x26
 ? __lru_cache_add+0x2a/0x6c
 ? set_pte_at+0x9/0xd
 ? __handle_mm_fault+0x61d/0xa6f
 vfs_ioctl+0x21/0x38
 ? vfs_ioctl+0x21/0x38
 do_vfs_ioctl+0x4ef/0x537
 ? current_kernel_time64+0x10/0x36
 ? __audit_syscall_entry+0xc2/0xe6
 ? syscall_trace_enter+0x1ac/0x20e
 SyS_ioctl+0x57/0x7b
 do_syscall_64+0x6b/0x7d
 entry_SYSCALL64_slow_path+0x25/0x25
RIP: 0033:0x7facd097ecc7
RSP: 002b:00007ffefd3c3128 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007facd097ecc7
RDX: 00007ffefd3c31b8 RSI: 00000000c4009420 RDI: 0000000000000003
RBP: 00007ffefd3c31b8 R08: 0000000000000003 R09: 0000000000008040
R10: 0000000000000541 R11: 0000000000000206 R12: 0000000000000003
R13: 00007ffefd3c4cc9 R14: 0000000000000001 R15: 0000000000000001
Code: af f0 01 00 00 48 89 fb 4d 8b b5 10 0b 00 00 4d 8d be 70 05 00 00 49 81 c6 68 05 00 00 4c 89 ff e8 0f 44 43 00 48 8b 03 4c 89 f7 <48> 8b 30 e8 0e fc ff ff 48 85 c0 49 89 c4 74 0b 4c 89 f6 48 89
RIP: __del_reloc_root+0x3f/0xa6 RSP: ffffa6f226427a40
CR2: 0000000000000000
---[ end trace 64c3fa4dc953d295 ]---
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x1e000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Rebooting in 20 seconds..
ACPI MEMORY or I/O RESET_REG.

-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  


* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours?
  2017-05-01 17:06                                       ` 4.11 relocate crash, null pointer Marc MERLIN
@ 2017-05-01 18:08                                         ` Marc MERLIN
  2017-05-02  1:50                                           ` Chris Murphy
  2017-05-05  1:13                                           ` Qu Wenruo
  0 siblings, 2 replies; 77+ messages in thread
From: Marc MERLIN @ 2017-05-01 18:08 UTC (permalink / raw)
  To: linux-btrfs; +Cc: clm, bo.li.liu, fdmanana, jbacik, quwenruo, dsterba

So, I forgot to mention that it's my main media and backup server that got
corrupted. Yes, I do actually have a backup of a backup server, but it's
going to take days to recover due to the amount of data to copy back, not
counting lots of manual typing due to the number of subvolumes, btrfs
send/receive relationships and so forth.

Really, I should be able to roll back all writes from the last 24H, run a
check --repair/scrub on top just to be sure, and be back on track.

In the meantime, the good news is that the filesystem doesn't crash the
kernel (the posted crash below) now that I was able to cancel the btrfs balance,
but it goes read only at the drop of a hat, even when I'm trying to delete
recent snapshots and all data that was potentially written in the last 24H.

On Mon, May 01, 2017 at 10:06:41AM -0700, Marc MERLIN wrote:
> I have a filesystem that sadly got corrupted by a SAS card I just installed yesterday.
> 
> I don't think in a case like this, there is there a way to roll back all
> writes across all subvolumes in the last 24H, correct?
> 
> Is the best thing to go in each subvolume, delete the recent snapshots and
> rename the one from 24H as the current one?
 
Well, just like I expected, it's a pain in the rear and this can't even help
fix the top level mountpoint which doesn't have snapshots, so I can't roll
it back.
btrfs should really have an easy way to roll back X hours or days to
recover from garbage written after a known good point, given that it is COW
after all.

Is there a way to do this with check --repair maybe?

In the meantime, I got stuck while trying to delete snapshots:

Let's say I have this:
ID 428 gen 294021 top level 5 path backup
ID 2023 gen 294021 top level 5 path Soft
ID 3021 gen 294051 top level 428 path backup/debian32
ID 4400 gen 294018 top level 428 path backup/debian64
ID 4930 gen 294019 top level 428 path backup/ubuntu

I can easily
Delete subvolume (no-commit): '/mnt/btrfs_pool2/Soft'
and then:
gargamel:/mnt/btrfs_pool2# mv Soft_rw.20170430_01:50:22 Soft
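
A minimal sketch of that delete-and-rename step as a shell function, assuming
the snapshot is writable and sits next to the subvolume at the top of the
pool. The names rollback_subvol, run and DRY_RUN are made up for
illustration; only btrfs subvolume delete and mv are real commands. With
DRY_RUN=1 it only prints what it would do, which seems prudent on a
filesystem that is already damaged.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# rollback_subvol POOL NAME SNAPSHOT
# Deletes POOL/NAME and renames POOL/SNAPSHOT to take its place.
# Stops at the first failing step so the snapshot is never renamed
# on top of a subvolume that could not be deleted.
rollback_subvol() {
    pool=$1
    name=$2
    snap=$3
    run btrfs subvolume delete "$pool/$name" &&
    run mv "$pool/$snap" "$pool/$name"
}
```

Usage, in dry-run mode first:
DRY_RUN=1 rollback_subvol /mnt/btrfs_pool2 Soft Soft_rw.20170430_01:50:22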

But I can't delete backup, which actually is mostly only a directory
containing other things (in hindsight I shouldn't have made that a
subvolume)
Delete subvolume (no-commit): '/mnt/btrfs_pool2/backup'
ERROR: cannot delete '/mnt/btrfs_pool2/backup': Directory not empty

This is because backup has a lot of subvolumes due to btrfs send/receive
relationships.

Is it possible to recover there? Can you reparent subvolumes to a different
subvolume without doing a full copy via btrfs send/receive?

Thanks,
Marc

> BTRFS warning (device dm-5): failed to load free space cache for block group 6746013696000, rebuilding it now
> BTRFS warning (device dm-5): block group 6754603630592 has wrong amount of free space
> BTRFS warning (device dm-5): failed to load free space cache for block group 6754603630592, rebuilding it now
> BTRFS warning (device dm-5): block group 7125178777600 has wrong amount of free space
> BTRFS warning (device dm-5): failed to load free space cache for block group 7125178777600, rebuilding it now
> BTRFS error (device dm-5): bad tree block start 3981076597540270796 2899180224512
> BTRFS error (device dm-5): bad tree block start 942082474969670243 2899180224512
> BTRFS: error (device dm-5) in __btrfs_free_extent:6944: errno=-5 IO failure
> BTRFS info (device dm-5): forced readonly
> BTRFS: error (device dm-5) in btrfs_run_delayed_refs:2961: errno=-5 IO failure
> BUG: unable to handle kernel NULL pointer dereference at           (null)
> IP: __del_reloc_root+0x3f/0xa6
> PGD 189a0e067
> PUD 189a0f067
> PMD 0
> 
> Oops: 0000 [#1] PREEMPT SMP
> Modules linked in: veth ip6table_filter ip6_tables ebtable_nat ebtables ppdev lp xt_addrtype br_netfilter bridge stp llc tun autofs4 softdog binfmt_misc ftdi_sio nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc ipt_REJECT nf_reject_ipv4 xt_conntrack xt_mark xt_nat xt_tcpudp nf_log_ipv4 nf_log_common xt_LOG iptable_mangle iptable_filter lm85 hwmon_vid pl2303 dm_snapshot dm_bufio iptable_nat ip_tables nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_conntrack_ftp ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_nat nf_conntrack x_tables sg st snd_pcm_oss snd_mixer_oss bcache kvm_intel kvm irqbypass snd_hda_codec_realtek snd_cmipci snd_hda_codec_generic snd_hda_intel snd_mpu401_uart snd_hda_codec snd_opl3_lib snd_rawmidi snd_hda_core snd_seq_device snd_hwdep eeepc_wmi snd_pcm asus_wmi rc_ati_x10
>  asix snd_timer ati_remote sparse_keymap usbnet rfkill snd hwmon soundcore rc_core evdev libphy tpm_infineon pcspkr i915 parport_pc i2c_i801 input_leds mei_me lpc_ich parport tpm_tis battery usbserial tpm_tis_core tpm wmi e1000e ptp pps_core fuse raid456 multipath mmc_block mmc_core lrw ablk_helper dm_crypt dm_mod async_raid6_recov async_pq async_xor async_memcpy async_tx crc32c_intel blowfish_x86_64 blowfish_common pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd xhci_pci ehci_pci sata_sil24 xhci_hcd mvsas ehci_hcd r8169 usbcore mii libsas scsi_transport_sas thermal fan [last unloaded: ftdi_sio]
> CPU: 0 PID: 9056 Comm: btrfs Tainted: G     U          4.11.0-amd64-preempt-sysrq-20170406 #2
> Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
> task: ffff88374d2a60c0 task.stack: ffffa6f226424000
> RIP: 0010:__del_reloc_root+0x3f/0xa6
> RSP: 0018:ffffa6f226427a40 EFLAGS: 00210246
> RAX: 0000000000000000 RBX: ffff8838ee256000 RCX: 00000000ffffffe2
> RDX: 0000000000000001 RSI: ffffffff9f83b410 RDI: ffff8837992da568
> RBP: ffffa6f226427a68 R08: 0000000000000000 R09: ffffffff9fd69480
> R10: 0000000000000000 R11: 0000000000000000 R12: ffffa6f226427ab0
> R13: ffff883768938000 R14: ffff8837992da568 R15: ffff8837992da570
> FS:  00007facd18d28c0(0000) GS:ffff883a5e200000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000000 CR3: 0000000189a10000 CR4: 00000000001406f0
> Call Trace:
>  free_reloc_roots+0x4f/0x5d
>  merge_reloc_roots+0x159/0x1ba
>  relocate_block_group+0x410/0x492
>  btrfs_relocate_block_group+0x12d/0x253
>  btrfs_relocate_chunk+0x3e/0xb1
>  btrfs_balance+0xd16/0xf36
>  btrfs_ioctl_balance+0x24f/0x2cd
>  ? __alloc_pages_nodemask+0x134/0x1e0
>  btrfs_ioctl+0x1447/0x1e22
>  ? mem_cgroup_charge_statistics+0x1e/0x88
>  ? get_page+0x9/0x26
>  ? __lru_cache_add+0x2a/0x6c
>  ? set_pte_at+0x9/0xd
>  ? __handle_mm_fault+0x61d/0xa6f
>  vfs_ioctl+0x21/0x38
>  ? vfs_ioctl+0x21/0x38
>  do_vfs_ioctl+0x4ef/0x537
>  ? current_kernel_time64+0x10/0x36
>  ? __audit_syscall_entry+0xc2/0xe6
>  ? syscall_trace_enter+0x1ac/0x20e
>  SyS_ioctl+0x57/0x7b
>  do_syscall_64+0x6b/0x7d
>  entry_SYSCALL64_slow_path+0x25/0x25
> RIP: 0033:0x7facd097ecc7
> RSP: 002b:00007ffefd3c3128 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007facd097ecc7
> RDX: 00007ffefd3c31b8 RSI: 00000000c4009420 RDI: 0000000000000003
> RBP: 00007ffefd3c31b8 R08: 0000000000000003 R09: 0000000000008040
> R10: 0000000000000541 R11: 0000000000000206 R12: 0000000000000003
> R13: 00007ffefd3c4cc9 R14: 0000000000000001 R15: 0000000000000001
> Code: af f0 01 00 00 48 89 fb 4d 8b b5 10 0b 00 00 4d 8d be 70 05 00 00 49 81 c6 68 05 00 00 4c 89 ff e8 0f 44 43 00 48 8b 03 4c 89 f7 <48> 8b 30 e8 0e fc ff ff 48 85 c0 49 89 c4 74 0b 4c 89 f6 48 89
> RIP: __del_reloc_root+0x3f/0xa6 RSP: ffffa6f226427a40
> CR2: 0000000000000000
> ---[ end trace 64c3fa4dc953d295 ]---
> Kernel panic - not syncing: Fatal exception
> Kernel Offset: 0x1e000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> Rebooting in 20 seconds..
> ACPI MEMORY or I/O RESET_REG.
> 

-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  


* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours?
  2017-05-01 18:08                                         ` 4.11 relocate crash, null pointer + rolling back a filesystem by X hours? Marc MERLIN
@ 2017-05-02  1:50                                           ` Chris Murphy
  2017-05-02  3:23                                             ` Marc MERLIN
  2017-05-05  1:13                                           ` Qu Wenruo
  1 sibling, 1 reply; 77+ messages in thread
From: Chris Murphy @ 2017-05-02  1:50 UTC (permalink / raw)
  To: Marc MERLIN
  Cc: Btrfs BTRFS, Chris Mason, bo.li.liu, fdmanana, Josef Bacik,
	Qu Wenruo, David Sterba

What about btrfs check (no repair), without and then also with --mode=lowmem?

In theory I like the idea of a 24 hour rollback; but in normal usage
Btrfs will eventually free up space containing stale and no longer
necessary metadata. Like the chunk tree, it's always changing, so you
get to a point, even with snapshots, that the old state of that tree
is just - gone. A snapshot of an fs tree does not make the chunk tree
frozen in time.

To do what you want, maybe isn't a ton of work if it could be based on
a variation of the existing btrfs seed device code. Call it a "super
snapshot".

I like the idea of triage, where bad parts of the file system can just
be cut off. By comparison, developers of other filesystems would say
this is hardware sabotage and nothing can be done. Btrfs is a bit
deceptive in that it sorta invites the idea that we can use hardware
that isn't proven, and the fs can survive.

In any case, it's a big problem in my mind if no existing tools can
fix a file system of this size. So before making any more changes, make
sure you have a btrfs-image somewhere, even if it's huge. The offline
checker needs to be able to repair it; right now it's all we have for
such a case.
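
A rough sketch of capturing that metadata image, assuming the filesystem is
unmounted and the output path is on a different, healthy filesystem. The
names image_metadata, run and DRY_RUN are made up for illustration;
btrfs-image with -c (compression level) and -t (threads) is the real tool,
but the values 9 and 4 are arbitrary choices here.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# image_metadata DEVICE OUTPUT
# Dumps the filesystem metadata (not file contents) of DEVICE to OUTPUT.
# -c 9 asks for the highest compression level, -t 4 for four threads;
# both are illustrative values, not recommendations.
image_metadata() {
    dev=$1
    out=$2
    run btrfs-image -c 9 -t 4 "$dev" "$out"
}
```

Usage: DRY_RUN=1 image_metadata /dev/mapper/dshelf2 /some/other/fs/dshelf2.img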


Chris Murphy


* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours?
  2017-05-02  1:50                                           ` Chris Murphy
@ 2017-05-02  3:23                                             ` Marc MERLIN
  2017-05-02  4:56                                               ` Chris Murphy
                                                                 ` (2 more replies)
  0 siblings, 3 replies; 77+ messages in thread
From: Marc MERLIN @ 2017-05-02  3:23 UTC (permalink / raw)
  To: Chris Murphy
  Cc: Btrfs BTRFS, Chris Mason, bo.li.liu, fdmanana, Josef Bacik,
	Qu Wenruo, David Sterba

Hi Chris,

Thanks for the reply, much appreciated.

On Mon, May 01, 2017 at 07:50:22PM -0600, Chris Murphy wrote:
> What about btfs check (no repair), without and then also with --mode=lowmem?
> 
> In theory I like the idea of a 24 hour rollback; but in normal usage
> Btrfs will eventually free up space containing stale and no longer
> necessary metadata. Like the chunk tree, it's always changing, so you
> get to a point, even with snapshots, that the old state of that tree
> is just - gone. A snapshot of an fs tree does not make the chunk tree
> frozen in time.
 
Right, of course, I was being way over optimistic here. I kind of forgot
that metadata wasn't COW, my bad.

> In any case, it's a big problem in my mind if no existing tools can
> fix a file system of this size. So before making anymore changes, make
> sure you have a btrfs-image somewhere, even if it's huge. The offline
> checker needs to be able to repair it, right now it's all we have for
> such a case.

The image will be huge, and take maybe 24H to make (last time it took
some silly amount of time like that), and honestly I'm not sure how
useful it'll be.
Outside of the kernel crashing if I do a btrfs balance, and hopefully
the crash report I gave is good enough, the state I'm in is not btrfs'
fault.

If I can't roll back to a reasonably working state, with data loss of a
known quantity that I can recover from backup, I'll have to destroy the
filesystem and recover from scratch, which will take multiple days.
Since I can't wait too long before getting back to a working state, I
think I'm going to try btrfs check --repair after a scrub to get a list
of all the pathnames/inodes that are known to be damaged, and work from
there.
Sounds reasonable?

Also, how is --mode=lowmem useful?

And for re-parenting a sub-subvolume, is that possible?
(I want to delete /sub1/ but I can't because I have /sub1/sub2 that's also a subvolume
and I'm not sure how to re-parent sub2 to somewhere else so that I can subvolume delete
sub1)

In the meantime, a simple check without repair looks like this. It will
likely take many hours to complete:
gargamel:/var/local/space# btrfs check /dev/mapper/dshelf2
Checking filesystem on /dev/mapper/dshelf2
UUID: 03e9a50c-1ae6-4782-ab9c-5f310a98e653
checking extents
checksum verify failed on 3096461459456 found 0E6B7980 wanted FBE5477A
checksum verify failed on 3096461459456 found 0E6B7980 wanted FBE5477A
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
bytenr mismatch, want=2899180224512, have=3981076597540270796
checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5
checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5
checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B
checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B
parent transid verify failed on 1671538819072 wanted 293964 found 293902
parent transid verify failed on 1671538819072 wanted 293964 found 293902
checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0
checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0
checksum verify failed on 1759425052672 found 843B59F1 wanted F0FF7D00
checksum verify failed on 1759425052672 found 843B59F1 wanted F0FF7D00
checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
checksum verify failed on 2898779357184 found 96395131 wanted 433D6E09
checksum verify failed on 2898779357184 found 96395131 wanted 433D6E09
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
bytenr mismatch, want=2899180224512, have=3981076597540270796
checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
(...)
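
Since the same block addresses repeat many times in output like this, a
small sketch for condensing a saved log into one line per address may help.
It assumes the check output was redirected to a file; summarize_check_log
is a made-up name, and only the "checksum verify failed on <bytenr>" line
format shown above is relied on.

```shell
#!/bin/sh
# summarize_check_log LOGFILE
# Counts "checksum verify failed on <bytenr>" lines per bytenr and prints
# "<count> <bytenr>", sorted by bytenr. Other lines (bytenr mismatch,
# parent transid failures, ...) are ignored.
summarize_check_log() {
    awk '/^checksum verify failed on / { count[$5]++ }
         END { for (b in count) print count[b], b }' "$1" |
        sort -k2,2n
}
```

Usage: btrfs check /dev/mapper/dshelf2 > check.log 2>&1; summarize_check_log check.log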

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901


* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours?
  2017-05-02  3:23                                             ` Marc MERLIN
@ 2017-05-02  4:56                                               ` Chris Murphy
  2017-05-02  5:11                                                 ` Marc MERLIN
  2017-05-02 19:59                                                 ` 4.11 relocate crash, null pointer + rolling back a filesystem by X hours? Kai Krakow
  2017-05-02  5:01                                               ` Duncan
  2017-05-05  1:19                                               ` Qu Wenruo
  2 siblings, 2 replies; 77+ messages in thread
From: Chris Murphy @ 2017-05-02  4:56 UTC (permalink / raw)
  To: Marc MERLIN
  Cc: Chris Murphy, Btrfs BTRFS, Chris Mason, bo.li.liu, fdmanana,
	Josef Bacik, Qu Wenruo, David Sterba

On Mon, May 1, 2017 at 9:23 PM, Marc MERLIN <marc@merlins.org> wrote:
> Hi Chris,
>
> Thanks for the reply, much appreciated.
>
> On Mon, May 01, 2017 at 07:50:22PM -0600, Chris Murphy wrote:
>> What about btfs check (no repair), without and then also with --mode=lowmem?
>>
>> In theory I like the idea of a 24 hour rollback; but in normal usage
>> Btrfs will eventually free up space containing stale and no longer
>> necessary metadata. Like the chunk tree, it's always changing, so you
>> get to a point, even with snapshots, that the old state of that tree
>> is just - gone. A snapshot of an fs tree does not make the chunk tree
>> frozen in time.
>
> Right, of course, I was being way over optimistic here. I kind of forgot
> that metadata wasn't COW, my bad.

Well it is COW. But there's more to the file system than fs trees, and
just because an fs tree gets snapshotted doesn't mean all data is
snapshotted. So whether snapshotted or not, there's metadata that
becomes obsolete as the file system is updated, and those areas get
freed up and eventually overwritten.


>
>> In any case, it's a big problem in my mind if no existing tools can
>> fix a file system of this size. So before making anymore changes, make
>> sure you have a btrfs-image somewhere, even if it's huge. The offline
>> checker needs to be able to repair it, right now it's all we have for
>> such a case.
>
> The image will be huge, and take maybe 24H to make (last time it took
> some silly amount of time like that), and honestly I'm not sure how
> useful it'll be.
> Outside of the kernel crashing if I do a btrfs balance, and hopefully
> the crash report I gave is good enough, the state I'm in is not btrfs'
> fault.
>
> If I can't roll back to a reasonably working state, with data loss of a
> known quantity that I can recover from backup, I'll have to destroy and
> filesystem and recover from scratch, which will take multiple days.
> Since I can't wait too long before getting back to a working state, I
> think I'm going to try btrfs check --repair after a scrub to get a list
> of all the pathanmes/inodes that are known to be damaged, and work from
> there.
> Sounds reasonable?

Yes.


>
> Also, how is --mode=lowmem being useful?

Testing. lowmem is a different implementation, so it might find
different things from the regular check.


>
> And for re-parenting a sub-subvolume, is that possible?
> (I want to delete /sub1/ but I can't because I have /sub1/sub2 that's also a subvolume
> and I'm not sure how to re-parent sub2 to somewhere else so that I can subvolume delete
> sub1)

Well you can move sub2 out of sub1 just like a directory and then
delete sub1. If it's read-only it can't be moved, but you can use
btrfs property set to flip its ro property to false temporarily,
move it, then set ro back to true, and it's still fine to use with
btrfs send/receive.
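
A minimal sketch of that sequence, using the /sub1/sub2 layout from the
question. The names move_ro_subvol, run and DRY_RUN are made up for
illustration; btrfs property set -ts <path> ro <value> is the real command.
One assumption worth checking against the btrfs-property man page for your
version: newer btrfs-progs releases may refuse to clear ro on a received
subvolume without a force option, because doing so can affect later
incremental send/receive.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# move_ro_subvol SRC DST
# Clears the read-only property on SRC, moves it to DST, then sets the
# property again on the new path. Stops at the first failing step.
move_ro_subvol() {
    src=$1
    dst=$2
    run btrfs property set -ts "$src" ro false &&
    run mv "$src" "$dst" &&
    run btrfs property set -ts "$dst" ro true
}
```

Usage: DRY_RUN=1 move_ro_subvol /sub1/sub2 /sub2
After the move, sub1 no longer contains a subvolume and can be deleted.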





>
> In the meantime, a simple check without repair looks like this. It will
> likely take many hours to complete:
> gargamel:/var/local/space# btrfs check /dev/mapper/dshelf2
> Checking filesystem on /dev/mapper/dshelf2
> UUID: 03e9a50c-1ae6-4782-ab9c-5f310a98e653
> checking extents
> checksum verify failed on 3096461459456 found 0E6B7980 wanted FBE5477A
> checksum verify failed on 3096461459456 found 0E6B7980 wanted FBE5477A
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> bytenr mismatch, want=2899180224512, have=3981076597540270796
> checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5
> checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5
> checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B
> checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B
> parent transid verify failed on 1671538819072 wanted 293964 found 293902
> parent transid verify failed on 1671538819072 wanted 293964 found 293902
> checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0
> checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0
> checksum verify failed on 1759425052672 found 843B59F1 wanted F0FF7D00
> checksum verify failed on 1759425052672 found 843B59F1 wanted F0FF7D00
> checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
> checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
> checksum verify failed on 2898779357184 found 96395131 wanted 433D6E09
> checksum verify failed on 2898779357184 found 96395131 wanted 433D6E09
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> bytenr mismatch, want=2899180224512, have=3981076597540270796
> checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
> checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
> checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
> checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
> checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071

Not understanding the problem, it's by definition naive for me to
suggest it should go read-only sooner before hosing itself. But I'd
like to think it's possible for Btrfs to look backward every once in a
while for sanity checking, to limit damage should it be occurring even
if the hardware isn't reporting any problems.


-- 
Chris Murphy


* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours?
  2017-05-02  3:23                                             ` Marc MERLIN
  2017-05-02  4:56                                               ` Chris Murphy
@ 2017-05-02  5:01                                               ` Duncan
  2017-05-02 19:53                                                 ` Kai Krakow
  2017-05-23 16:58                                                 ` Marc MERLIN
  2017-05-05  1:19                                               ` Qu Wenruo
  2 siblings, 2 replies; 77+ messages in thread
From: Duncan @ 2017-05-02  5:01 UTC (permalink / raw)
  To: linux-btrfs

Marc MERLIN posted on Mon, 01 May 2017 20:23:46 -0700 as excerpted:

> Also, how is --mode=lowmem being useful?

FWIW, I just watched your talk that's linked from the wiki, and wondered 
what you were doing these days as I hadn't seen any posts from you here 
for a while.

Well, that you're asking that question confirms you've not been following 
the list too closely...  Of course that's understandable as people have 
other stuff to do, but just sayin'.

The answer is... btrfs check in lowmem mode isn't simply lowmem, it's 
also effectively a very nearly entirely rewritten second implementation, 
which has already demonstrated its worth as it has already allowed 
finding and fixing a number of bugs in normal mode check.  Of course 
normal mode check has returned the favor a few times as well, so it is 
now reasonably standard list troubleshooting practice to ask for the 
output from both modes to see what and where they differ, especially if 
it's not something known to be directly fixable by normal mode, which of 
course remains the more mature default.

So even if neither one can actually fix the problem ATM, any differences 
in output both lend important clues to the real problem, and potentially 
help developers to find and fix bugs in one or the other implementation.
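
As a rough sketch of collecting both outputs for comparison, assuming an
unmounted device and a writable log directory: check_both_modes and DRY_RUN
are made-up names, while --mode=original (the default) and --mode=lowmem
are the real btrfs check modes. No --repair is passed, so both runs are
read-only.

```shell
#!/bin/sh
# check_both_modes DEVICE LOGDIR
# Runs a read-only btrfs check in each mode, saving stdout and stderr to
# LOGDIR/check-<mode>.log, then diffs the two logs.
# With DRY_RUN=1 it only prints the commands it would run.
check_both_modes() {
    dev=$1
    logdir=$2
    for mode in original lowmem; do
        log="$logdir/check-$mode.log"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "btrfs check --mode=$mode $dev > $log 2>&1"
        else
            btrfs check --mode="$mode" "$dev" > "$log" 2>&1
        fi
    done
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "diff -u $logdir/check-original.log $logdir/check-lowmem.log"
    else
        diff -u "$logdir/check-original.log" "$logdir/check-lowmem.log"
    fi
}
```

Usage: DRY_RUN=1 check_both_modes /dev/mapper/dshelf2 /var/tmp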

Tho it's worth noting that lowmem mode can be expected to take longer, as 
it favors lower memory usage over speed, just as the mode title suggests 
it will.  On a filesystem as big as yours... it may unfortunately not be 
entirely practical, especially if as you hint there's at least some time 
pressure here, tho it's not extreme.

Of course on-list I'm somewhat known for my arguments propounding the 
notion that any filesystem that's too big to be practically maintained 
(including time necessary to restore from backups, should that be 
necessary for whatever reason) is... too big... and should ideally be 
broken along logical and functional boundaries into a number of 
individual smaller filesystems until such point as each one is found to 
be practically maintainable within a reasonably practical time frame.  
Don't put all the eggs in one basket, and when the bottom of one of those 
baskets inevitably falls out, most of your eggs will be safe in other 
baskets. =:^)

But as someone else (pg, IIRC) on-list is fond of saying, lots of other 
people "know better" (TM).  Whatever.  It's your data, your systems and 
your time, not mine.  I just know what I've found (sometimes finding it 
the hard way!) to work best for me, and TBs on TBs of data on a single 
filesystem, even if it's a backup and is itself backed up, isn't 
something I'd be putting my own faith in, as the time even for a simple 
restore from backups is simply too high for me to consider it at all 
practical. =:^)

> And for re-parenting a sub-subvolume, is that possible?
> (I want to delete /sub1/ but I can't because I have /sub1/sub2 that's
> also a subvolume and I'm not sure how to re-parent sub2 to somewhere
> else so that I can subvolume delete sub1)

As I believe you know my own use-case doesn't deal with subvolumes and 
snapshots, so this may be of limited practicality, but FWIW, the 
sysadmin's guide discussion of snapshot management and special cases 
seems apropos as a first stop, before going further:

https://btrfs.wiki.kernel.org/index.php/SysadminGuide#Managing_Snapshots

Note that toward the bottom of "management" it discusses moving 
subvolumes (which will obviously reparent them), but then below that in 
special cases it says that read-only subvolumes (and thus snapshots) 
cannot be moved, explaining why.


*BUT*, and here's the "go further" part, keep in mind that subvolume
read-only is a property, gettable and settable by btrfs property.

So you should be able to unset the read-only property of a subvolume or 
snapshot, move it, then if desired, set it again.
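
As a rough sketch of that sequence (untested here, as I don't use
subvolumes myself): the paths are hypothetical examples, run and
move_ro_subvol are made-up helper names, and by default the commands
are only printed, not executed.

```shell
# Dry run by default: commands are printed, not executed. Set DRY_RUN=0 to run them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Move a read-only subvolume: show ro, clear it, rename the subvolume, set ro again.
# $1 = current path (hypothetical example: /mnt/pool/sub1/sub2)
# $2 = new path     (hypothetical example: /mnt/pool/sub2)
move_ro_subvol() {
    src=$1
    dst=$2
    run btrfs property get -t subvol "$src" ro &&
    run btrfs property set -t subvol "$src" ro false &&
    run mv "$src" "$dst" &&
    run btrfs property set -t subvol "$dst" ro true
}

# Prints the four commands that would be run
move_ro_subvol /mnt/pool/sub1/sub2 /mnt/pool/sub2
```

With sub2 out of the way, btrfs subvolume delete on sub1 should then be
possible.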

Of course I wouldn't expect send -p to work with such a snapshot, but 
send -c /might/ still work, I'm not actually sure but I'd consider it 
worth trying.  (I'd try -p as well, but expect it to fail...)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours?
  2017-05-02  4:56                                               ` Chris Murphy
@ 2017-05-02  5:11                                                 ` Marc MERLIN
  2017-05-02 18:47                                                   ` btrfs check --repair: failed to repair damaged filesystem, aborting Marc MERLIN
  2017-07-07  5:37                                                   ` ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5 Marc MERLIN
  2017-05-02 19:59                                                 ` 4.11 relocate crash, null pointer + rolling back a filesystem by X hours? Kai Krakow
  1 sibling, 2 replies; 77+ messages in thread
From: Marc MERLIN @ 2017-05-02  5:11 UTC (permalink / raw)
  To: Chris Murphy
  Cc: Btrfs BTRFS, Chris Mason, bo.li.liu, fdmanana, Josef Bacik,
	Qu Wenruo, David Sterba

On Mon, May 01, 2017 at 10:56:06PM -0600, Chris Murphy wrote:
> > Right, of course, I was being way over optimistic here. I kind of forgot
> > that metadata wasn't COW, my bad.
> 
> Well it is COW. But there's more to the file system than fs trees, and
> just because an fs tree gets snapshot doesn't mean all data is
> snapshot. So whether snapshot or not, there's metadata that becomes
> obsolete as the file system is updated and those areas get freed up
> and eventually overwritten.

Got it, thanks for explaining.

> > Also, how is --mode=lowmem being useful?
> 
> Testing. lowmem is a different implementation, so it might find
> different things from the regular check.
 
I see.
I've fired off a scrub -r followed by a check to run overnight. I'll see
whether they finish, assuming the kernel doesn't crash again (yeah, just
to make things simpler, I'm hitting another issue when I/O piles up on
btrfs on top of dmcrypt on top of bcache:
http://lkml.iu.edu/hypermail/linux/kernel/1705.0/00626.html
https://pastebin.com/YqE4riw0
but that's not a bcache bug, just something else getting in the way).

> > And for re-parenting a sub-subvolume, is that possible?
> > (I want to delete /sub1/ but I can't because I have /sub1/sub2 that's also a subvolume
> > and I'm not sure how to re-parent sub2 to somewhere else so that I can subvolume delete
> > sub1)
> 
> Well you can move sub2 out of sub1 just like a directory and then
> delete sub1. If it's read-only it can't be moved, but you can use
> btrfs property get/set ro true/false to temporarily make it not
> read-only, move it, then make it read-only again, and it's still fine
> to use with btrfs send receive.

Ah, I didn't think mv would work from inside a subvolume to outside of a
subvolume without copying data (it doesn't for files) but I guess it
would work for subvolumes, good point.
I'll try that, thanks.

> Not understanding the problem, it's by definition naive for me to
> suggest it should go read-only sooner before hosing itself. But I'd
> like to think it's possible for Btrfs to look backward every once in a
> while for sanity checking, to limit damage should it be occurring even
> if the hardware isn't reporting any problems.

Fair point. To be honest, maybe btrfs could indeed have detected
problems earlier, but ultimately it's not really its fault if bad things
happen when I'm having repeated storage errors underneath. For all I
know, some data got written after getting corrupted and btrfs would not
notice that right away.
Now, I kind of naively thought I could simply unroll all writes done
after a certain point. You pointed out (rightfully so) that it's not
nearly as simple as I was hoping.

So at this point, I think it's just a matter of me providing
check/repair logs if they are useful, and someone looking into this
balance causing a kernel crash, which is IMO the only real thing that
btrfs should reasonably fix.

I'll update the thread when I have more logs and have moved further on
the recovery.

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901

* btrfs check --repair: failed to repair damaged filesystem, aborting
  2017-05-02  5:11                                                 ` Marc MERLIN
@ 2017-05-02 18:47                                                   ` Marc MERLIN
  2017-05-03  6:00                                                     ` Marc MERLIN
  2017-07-07  5:37                                                   ` ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5 Marc MERLIN
  1 sibling, 1 reply; 77+ messages in thread
From: Marc MERLIN @ 2017-05-02 18:47 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS, Chris Mason, Qu Wenruo, David Sterba

(cc trimmed)

The one in debian/unstable crashed:
gargamel:~# btrfs --version
btrfs-progs v4.7.3
gargamel:~# btrfs check --repair /dev/mapper/dshelf2
bytenr mismatch, want=2899180224512, have=3981076597540270796
extent-tree.c:2721: alloc_reserved_tree_block: Assertion `ret` failed.
btrfs[0x43e418]
btrfs[0x43e43f]
btrfs[0x43f276]
btrfs[0x43f46f]
btrfs[0x4407ef]
btrfs[0x440963]
btrfs(btrfs_inc_extent_ref+0x513)[0x44107a]
btrfs[0x420053]
btrfs[0x4265eb]
btrfs(cmd_check+0x1111)[0x427d6d]
btrfs(main+0x12f)[0x40a341]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f6b632e82b1]
btrfs(_start+0x2a)[0x40a37a]

Ok, it's old, let's take git from today:
gargamel:~# btrfs --version
btrfs-progs v4.10.2
As a note, 
gargamel:~# btrfs check --mode=lowmem --repair /dev/mapper/dshelf2
enabling repair mode
ERROR: low memory mode doesn't support repair yet

As a note, a 32bit binary on a 64bit kernel:
gargamel:~# btrfs check --repair /dev/mapper/dshelf2
enabling repair mode
Checking filesystem on /dev/mapper/dshelf2
UUID: 03e9a50c-1ae6-4782-ab9c-5f310a98e653
checking extents
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
bytenr mismatch, want=2899180224512, have=3981076597540270796
checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5
checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5
checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B
checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B
parent transid verify failed on 1671538819072 wanted 293964 found 293902
parent transid verify failed on 1671538819072 wanted 293964 found 293902
checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0
checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0
cmds-check.c:6291: add_data_backref: BUG_ON `!back` triggered, value 1
Aborted

let's try again with a 64bit binary built from git:
(...)
Repaired extent references for 4227617038336
ref mismatch on [4227872751616 4096] extent item 1, found 0
Incorrect local backref count on 4227872751616 parent 3493071667200 owner 0
offset 0 found 0 wanted 1 back 0x56470b18e7f0  
Backref disk bytenr does not match extent record, bytenr=4227872751616, ref
bytenr=0
backpointer mismatch on [4227872751616 4096]
owner ref check failed [4227872751616 4096]
repair deleting extent record: key 4227872751616 168 4096
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
bytenr mismatch, want=2899180224512, have=3981076597540270796
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E  
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
bytenr mismatch, want=2899180224512, have=3981076597540270796
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
bytenr mismatch, want=2899180224512, have=3981076597540270796
Repaired extent references for 4227872751616
ref mismatch on [6674127745024 32768] extent item 0, found 1
Backref 6674127745024 parent 7566652473344 owner 0 offset 0 num_refs 0 not
found in extent tree
Incorrect local backref count on 6674127745024 parent 7566652473344 owner 0
offset 0 found 1 wanted 0 back 0x5648afda0f20  
backpointer mismatch on [6674127745024 32768]
checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C
checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C
checksum verify failed on 6983266418688 found BCBF9E15 wanted 785FF67E
checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C
bytenr mismatch, want=6983266418688, have=13671317608077697645
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
bytenr mismatch, want=2899180224512, have=3981076597540270796
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
bytenr mismatch, want=2899180224512, have=3981076597540270796
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
bytenr mismatch, want=2899180224512, have=3981076597540270796
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
bytenr mismatch, want=2899180224512, have=3981076597540270796
failed to repair damaged filesystem, aborting


So, I'm out of luck now, full wipe and 3-5 day rebuild?

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  

* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours?
  2017-05-02  5:01                                               ` Duncan
@ 2017-05-02 19:53                                                 ` Kai Krakow
  2017-05-23 16:58                                                 ` Marc MERLIN
  1 sibling, 0 replies; 77+ messages in thread
From: Kai Krakow @ 2017-05-02 19:53 UTC (permalink / raw)
  To: linux-btrfs

Am Tue, 2 May 2017 05:01:02 +0000 (UTC)
schrieb Duncan <1i5t5.duncan@cox.net>:

> Of course on-list I'm somewhat known for my arguments propounding the 
> notion that any filesystem that's too big to be practically
> maintained (including time necessary to restore from backups, should
> that be necessary for whatever reason) is... too big... and should
> ideally be broken along logical and functional boundaries into a
> number of individual smaller filesystems until such point as each one
> is found to be practically maintainable within a reasonably practical
> time frame. Don't put all the eggs in one basket, and when the bottom
> of one of those baskets inevitably falls out, most of your eggs will
> be safe in other baskets. =:^)

Hehe... Yes, you're a fan of small filesystems. I'm more from the
opposite camp, preferring one big filesystem so I don't have to mess
around with the size constraints of small filesystems fighting for the
same volume space. It also gives the filesystem a better chance at data
locality, instead of data sitting in totally different parts of the disk
across your fs mounts, and can reduce head movement. Of course, much of
this is not true if you use different devices per filesystem, or use
SSDs, or a SAN where you have no real control over the physical placement
of image stripes anyway. But well...

In an ideal world, subvolumes of btrfs would be totally independent of
each other, only sharing the same volume and dynamically allocating
chunks of space from it. If one is broken, it is simply not usable and
it should be destroyable. A garbage collector would grab the leftover
chunks from the subvolume and free them, and you could recreate this
subvolume from backup. In reality, shared extents will cross subvolume
borders so it is probably not how things could work anytime in the near
or far future.

This idea is more like having thinly provisioned LVM volumes which
allocate space as the filesystems on top need it, much like doing
thinly provisioned images with a VM host system. The problem here is,
unlike subvolumes, those chunks of space could never be given back to
the host as it doesn't know if they are still in use. Of course, there
are implementations available which allow thinning the images by passing
through TRIM from the guest to the host (or by other means of
communication channels between host and guest), but that is usually not
giving good performance, if even supported.

I tried once to exploit this in VirtualBox and hoped it would translate
guest discards into hole punching requests on the host, and it's even
documented to work that way... But (a) it was horribly slow, and (b) it
was incredibly unstable to the point of being useless. OTOH, it's not
announced as a stable feature and has to be enabled by manually editing
the XML config files.

But I still like the idea: Is it possible to make btrfs still work if
one subvolume gets corrupted? Of course it should have ways of telling
the user which other subvolumes are interconnected through shared
extents so those would be also discarded upon corruption cleanup - at
least if those extents couldn't be made any sense of any longer. Since
corruption is an issue mostly of subvolumes being written to, snapshots
should be mostly safe.

Such a feature would also only make sense if btrfs had an online repair
tool. BTW, are there plans for having an online repair tool in the
future? Maybe one that only scans and fixes part of the filesystem
(for obvious performance reasons, wrt Duncan's idea of handling
filesystems), i.e. those parts that the kernel found to be corrupted?
If I could then just delete and restore affected files, this would be
even better than having independent subvolumes like above.

-- 
Regards,
Kai

Replies to list-only preferred.


* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours?
  2017-05-02  4:56                                               ` Chris Murphy
  2017-05-02  5:11                                                 ` Marc MERLIN
@ 2017-05-02 19:59                                                 ` Kai Krakow
  1 sibling, 0 replies; 77+ messages in thread
From: Kai Krakow @ 2017-05-02 19:59 UTC (permalink / raw)
  To: linux-btrfs

Am Mon, 1 May 2017 22:56:06 -0600
schrieb Chris Murphy <lists@colorremedies.com>:

> On Mon, May 1, 2017 at 9:23 PM, Marc MERLIN <marc@merlins.org> wrote:
> > Hi Chris,
> >
> > Thanks for the reply, much appreciated.
> >
> > On Mon, May 01, 2017 at 07:50:22PM -0600, Chris Murphy wrote:  
> >> What about btrfs check (no repair), without and then also with
> >> --mode=lowmem?
> >>
> >> In theory I like the idea of a 24 hour rollback; but in normal
> >> usage Btrfs will eventually free up space containing stale and no
> >> longer necessary metadata. Like the chunk tree, it's always
> >> changing, so you get to a point, even with snapshots, that the old
> >> state of that tree is just - gone. A snapshot of an fs tree does
> >> not make the chunk tree frozen in time.  
> >
> > Right, of course, I was being way over optimistic here. I kind of
> > forgot that metadata wasn't COW, my bad.  
> 
> Well it is COW. But there's more to the file system than fs trees, and
> just because an fs tree gets snapshot doesn't mean all data is
> snapshot. So whether snapshot or not, there's metadata that becomes
> obsolete as the file system is updated and those areas get freed up
> and eventually overwritten.
> 
> 
> >  
> >> In any case, it's a big problem in my mind if no existing tools can
> >> fix a file system of this size. So before making anymore changes,
> >> make sure you have a btrfs-image somewhere, even if it's huge. The
> >> offline checker needs to be able to repair it, right now it's all
> >> we have for such a case.  
> >
> > The image will be huge, and take maybe 24H to make (last time it
> > took some silly amount of time like that), and honestly I'm not
> > sure how useful it'll be.
> > Outside of the kernel crashing if I do a btrfs balance, and
> > hopefully the crash report I gave is good enough, the state I'm in
> > is not btrfs' fault.
> >
> > If I can't roll back to a reasonably working state, with data loss
> > of a known quantity that I can recover from backup, I'll have to
> > destroy the filesystem and recover from scratch, which will take
> > multiple days. Since I can't wait too long before getting back to a
> > working state, I think I'm going to try btrfs check --repair after
> > a scrub to get a list of all the pathnames/inodes that are known to
> > be damaged, and work from there.
> > Sounds reasonable?  
> 
> Yes.
> 
> 
> >
> > Also, how is --mode=lowmem being useful?  
> 
> Testing. lowmem is a different implementation, so it might find
> different things from the regular check.
> 
> 
> >
> > And for re-parenting a sub-subvolume, is that possible?
> > (I want to delete /sub1/ but I can't because I have /sub1/sub2
> > that's also a subvolume and I'm not sure how to re-parent sub2 to
> > somewhere else so that I can subvolume delete sub1)  
> 
> Well you can move sub2 out of sub1 just like a directory and then
> delete sub1. If it's read-only it can't be moved, but you can use
> btrfs property get/set ro true/false to temporarily make it not
> read-only, move it, then make it read-only again, and it's still fine
> to use with btrfs send receive.
> 
> 
> 
> 
> 
> >
> > In the meantime, a simple check without repair looks like this. It
> > will likely take many hours to complete:
> > gargamel:/var/local/space# btrfs check /dev/mapper/dshelf2
> > Checking filesystem on /dev/mapper/dshelf2
> > UUID: 03e9a50c-1ae6-4782-ab9c-5f310a98e653
> > checking extents
> > checksum verify failed on 3096461459456 found 0E6B7980 wanted FBE5477A
> > checksum verify failed on 3096461459456 found 0E6B7980 wanted FBE5477A
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > bytenr mismatch, want=2899180224512, have=3981076597540270796
> > checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5
> > checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5
> > checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B
> > checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B
> > parent transid verify failed on 1671538819072 wanted 293964 found 293902
> > parent transid verify failed on 1671538819072 wanted 293964 found 293902
> > checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0
> > checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0
> > checksum verify failed on 1759425052672 found 843B59F1 wanted F0FF7D00
> > checksum verify failed on 1759425052672 found 843B59F1 wanted F0FF7D00
> > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
> > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
> > checksum verify failed on 2898779357184 found 96395131 wanted 433D6E09
> > checksum verify failed on 2898779357184 found 96395131 wanted 433D6E09
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > bytenr mismatch, want=2899180224512, have=3981076597540270796
> > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
> > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
> > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
> > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
> > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
> 
> Not understanding the problem, it's by definition naive for me to
> suggest it should go read-only sooner before hosing itself. But I'd
> like to think it's possible for Btrfs to look backward every once in a
> while for sanity checking, to limit damage should it be occurring even
> if the hardware isn't reporting any problems.

Would it be possible to make btrfs avoid using parts of the filesystem
it detected corruptions in? Then a still-in-theory online repair tool
could check these parts, maybe repair them (or destroy them upon
request), and make those parts of the fs available again... Such a
repair tool (scanning only known corrupted parts) would probably also
need less memory and time to run.

-- 
Regards,
Kai

Replies to list-only preferred.


* Re: btrfs check --repair: failed to repair damaged filesystem, aborting
  2017-05-02 18:47                                                   ` btrfs check --repair: failed to repair damaged filesystem, aborting Marc MERLIN
@ 2017-05-03  6:00                                                     ` Marc MERLIN
  2017-05-03  6:17                                                       ` Marc MERLIN
  0 siblings, 1 reply; 77+ messages in thread
From: Marc MERLIN @ 2017-05-03  6:00 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS, Chris Mason, Qu Wenruo, David Sterba

David,

I think you maintain btrfs-progs, but I'm not sure if you're in charge 
of check --repair.
Could you comment on the bottom of the mail, namely:
> failed to repair damaged filesystem, aborting
> So, I'm out of luck now, full wipe and 3-5 day rebuild?
 
Thanks,
Marc

Rest:
On Tue, May 02, 2017 at 11:47:22AM -0700, Marc MERLIN wrote:
> (cc trimmed)
> 
> The one in debian/unstable crashed:
> gargamel:~# btrfs --version
> btrfs-progs v4.7.3
> gargamel:~# btrfs check --repair /dev/mapper/dshelf2
> bytenr mismatch, want=2899180224512, have=3981076597540270796
> extent-tree.c:2721: alloc_reserved_tree_block: Assertion `ret` failed.
> btrfs[0x43e418]
> btrfs[0x43e43f]
> btrfs[0x43f276]
> btrfs[0x43f46f]
> btrfs[0x4407ef]
> btrfs[0x440963]
> btrfs(btrfs_inc_extent_ref+0x513)[0x44107a]
> btrfs[0x420053]
> btrfs[0x4265eb]
> btrfs(cmd_check+0x1111)[0x427d6d]
> btrfs(main+0x12f)[0x40a341]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f6b632e82b1]
> btrfs(_start+0x2a)[0x40a37a]
> 
> Ok, it's old, let's take git from today:
> gargamel:~# btrfs --version
> btrfs-progs v4.10.2
> As a note, 
> gargamel:~# btrfs check --mode=lowmem --repair /dev/mapper/dshelf2
> enabling repair mode
> ERROR: low memory mode doesn't support repair yet
> 
> As a note, a 32bit binary on a 64bit kernel:
> gargamel:~# btrfs check --repair /dev/mapper/dshelf2
> enabling repair mode
> Checking filesystem on /dev/mapper/dshelf2
> UUID: 03e9a50c-1ae6-4782-ab9c-5f310a98e653
> checking extents
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> bytenr mismatch, want=2899180224512, have=3981076597540270796
> checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5
> checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5
> checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B
> checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B
> parent transid verify failed on 1671538819072 wanted 293964 found 293902
> parent transid verify failed on 1671538819072 wanted 293964 found 293902
> checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0
> checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0
> cmds-check.c:6291: add_data_backref: BUG_ON `!back` triggered, value 1
> Aborted
> 
> let's try again with a 64bit binary built from git:
> (...)
> Repaired extent references for 4227617038336
> ref mismatch on [4227872751616 4096] extent item 1, found 0
> Incorrect local backref count on 4227872751616 parent 3493071667200 owner 0
> offset 0 found 0 wanted 1 back 0x56470b18e7f0  
> Backref disk bytenr does not match extent record, bytenr=4227872751616, ref
> bytenr=0
> backpointer mismatch on [4227872751616 4096]
> owner ref check failed [4227872751616 4096]
> repair deleting extent record: key 4227872751616 168 4096
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> bytenr mismatch, want=2899180224512, have=3981076597540270796
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E  
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> bytenr mismatch, want=2899180224512, have=3981076597540270796
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> bytenr mismatch, want=2899180224512, have=3981076597540270796
> Repaired extent references for 4227872751616
> ref mismatch on [6674127745024 32768] extent item 0, found 1
> Backref 6674127745024 parent 7566652473344 owner 0 offset 0 num_refs 0 not
> found in extent tree
> Incorrect local backref count on 6674127745024 parent 7566652473344 owner 0
> offset 0 found 1 wanted 0 back 0x5648afda0f20  
> backpointer mismatch on [6674127745024 32768]
> checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C
> checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C
> checksum verify failed on 6983266418688 found BCBF9E15 wanted 785FF67E
> checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C
> bytenr mismatch, want=6983266418688, have=13671317608077697645
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> bytenr mismatch, want=2899180224512, have=3981076597540270796
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> bytenr mismatch, want=2899180224512, have=3981076597540270796
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> bytenr mismatch, want=2899180224512, have=3981076597540270796
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> bytenr mismatch, want=2899180224512, have=3981076597540270796
> failed to repair damaged filesystem, aborting
> 
> 
> So, I'm out of luck now, full wipe and 3-5 day rebuild?
> 
> Thanks,
> Marc
> -- 
> "A mouse is a device used to point at the xterm you want to type in" - A.S.R.
> Microsoft is to operating systems ....
>                                       .... what McDonalds is to gourmet cooking
> Home page: http://marc.merlins.org/  

-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901

* Re: btrfs check --repair: failed to repair damaged filesystem, aborting
  2017-05-03  6:00                                                     ` Marc MERLIN
@ 2017-05-03  6:17                                                       ` Marc MERLIN
  2017-05-03  6:32                                                         ` Roman Mamedov
  0 siblings, 1 reply; 77+ messages in thread
From: Marc MERLIN @ 2017-05-03  6:17 UTC (permalink / raw)
  To: Btrfs BTRFS, Chris Mason, Qu Wenruo, David Sterba

On Tue, May 02, 2017 at 11:00:08PM -0700, Marc MERLIN wrote:
> David,
> 
> I think you maintain btrfs-progs, but I'm not sure if you're in charge 
> of check --repair.
> Could you comment on the bottom of the mail, namely:
> > failed to repair damaged filesystem, aborting
> > So, I'm out of luck now, full wipe and 3-5 day rebuild?
  
Actually, another thought:
Is there, or should there be, a way to repair around the bit that cannot
be repaired?
Separately (or not), can I locate which bits are causing the repair to
fail, and maybe get a pointer to the path/inode so that I can hopefully
just delete those bad data structures (assuming deleting them is even
possible and that the FS won't just go read only as I try to do that)?

Here is the full run if that helps:
https://pastebin.com/STMFHty4

> Thanks,
> Marc
> 
> Rest:
> On Tue, May 02, 2017 at 11:47:22AM -0700, Marc MERLIN wrote:
> > (cc trimmed)
> > 
> > The one in debian/unstable crashed:
> > gargamel:~# btrfs --version
> > btrfs-progs v4.7.3
> > gargamel:~# btrfs check --repair /dev/mapper/dshelf2
> > bytenr mismatch, want=2899180224512, have=3981076597540270796
> > extent-tree.c:2721: alloc_reserved_tree_block: Assertion `ret` failed.
> > btrfs[0x43e418]
> > btrfs[0x43e43f]
> > btrfs[0x43f276]
> > btrfs[0x43f46f]
> > btrfs[0x4407ef]
> > btrfs[0x440963]
> > btrfs(btrfs_inc_extent_ref+0x513)[0x44107a]
> > btrfs[0x420053]
> > btrfs[0x4265eb]
> > btrfs(cmd_check+0x1111)[0x427d6d]
> > btrfs(main+0x12f)[0x40a341]
> > /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f6b632e82b1]
> > btrfs(_start+0x2a)[0x40a37a]
> > 
> > Ok, it's old, let's take git from today:
> > gargamel:~# btrfs --version
> > btrfs-progs v4.10.2
> > As a note, 
> > gargamel:~# btrfs check --mode=lowmem --repair /dev/mapper/dshelf2
> > enabling repair mode
> > ERROR: low memory mode doesn't support repair yet
> > 
> > As a note, a 32bit binary on a 64bit kernel:
> > gargamel:~# btrfs check --repair /dev/mapper/dshelf2
> > enabling repair mode
> > Checking filesystem on /dev/mapper/dshelf2
> > UUID: 03e9a50c-1ae6-4782-ab9c-5f310a98e653
> > checking extents
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > bytenr mismatch, want=2899180224512, have=3981076597540270796
> > checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5
> > checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5
> > checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B
> > checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B
> > parent transid verify failed on 1671538819072 wanted 293964 found 293902
> > parent transid verify failed on 1671538819072 wanted 293964 found 293902
> > checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0
> > checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0
> > cmds-check.c:6291: add_data_backref: BUG_ON `!back` triggered, value 1
> > Aborted
> > 
> > let's try again with a 64bit binary built from git:
> > (...)
> > Repaired extent references for 4227617038336
> > ref mismatch on [4227872751616 4096] extent item 1, found 0
> > Incorrect local backref count on 4227872751616 parent 3493071667200 owner 0
> > offset 0 found 0 wanted 1 back 0x56470b18e7f0  
> > Backref disk bytenr does not match extent record, bytenr=4227872751616, ref
> > bytenr=0
> > backpointer mismatch on [4227872751616 4096]
> > owner ref check failed [4227872751616 4096]
> > repair deleting extent record: key 4227872751616 168 4096
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > bytenr mismatch, want=2899180224512, have=3981076597540270796
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E  
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > bytenr mismatch, want=2899180224512, have=3981076597540270796
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > bytenr mismatch, want=2899180224512, have=3981076597540270796
> > Repaired extent references for 4227872751616
> > ref mismatch on [6674127745024 32768] extent item 0, found 1
> > Backref 6674127745024 parent 7566652473344 owner 0 offset 0 num_refs 0 not
> > found in extent tree
> > Incorrect local backref count on 6674127745024 parent 7566652473344 owner 0
> > offset 0 found 1 wanted 0 back 0x5648afda0f20  
> > backpointer mismatch on [6674127745024 32768]
> > checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C
> > checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C
> > checksum verify failed on 6983266418688 found BCBF9E15 wanted 785FF67E
> > checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C
> > bytenr mismatch, want=6983266418688, have=13671317608077697645
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > bytenr mismatch, want=2899180224512, have=3981076597540270796
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > bytenr mismatch, want=2899180224512, have=3981076597540270796
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > bytenr mismatch, want=2899180224512, have=3981076597540270796
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > bytenr mismatch, want=2899180224512, have=3981076597540270796
> > failed to repair damaged filesystem, aborting
> > 
> > 
> > So, I'm out of luck now, full wipe and 3-5 day rebuild?
> > 
> > Thanks,
> > Marc

-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: btrfs check --repair: failed to repair damaged filesystem, aborting
  2017-05-03  6:17                                                       ` Marc MERLIN
@ 2017-05-03  6:32                                                         ` Roman Mamedov
  2017-05-03 20:40                                                           ` Marc MERLIN
  0 siblings, 1 reply; 77+ messages in thread
From: Roman Mamedov @ 2017-05-03  6:32 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: Btrfs BTRFS, Chris Mason, Qu Wenruo, David Sterba

On Tue, 2 May 2017 23:17:11 -0700
Marc MERLIN <marc@merlins.org> wrote:

> On Tue, May 02, 2017 at 11:00:08PM -0700, Marc MERLIN wrote:
> > David,
> > 
> > I think you maintain btrfs-progs, but I'm not sure if you're in charge 
> > of check --repair.
> > Could you comment on the bottom of the mail, namely:
> > > failed to repair damaged filesystem, aborting
> > > So, I'm out of luck now, full wipe and 3-5 day rebuild?
>   
> Actually, another thought:
> Is there or should there be a way to repair around the bit that cannot
> be repaired?
> Separately, or not, can I locate which bits are causing the repair to
> fail and maybe get a pointer to the path/inode so that I can hopefully
> just delete those bad data structures (assuming deleting them is even
> possible and that the FS won't just go read only as I try to do that)

There is the "btrfs-corrupt-block" tool which helped me to kick Btrfsck
further along its course in a similar "unrepairable" situation.
https://www.spinics.net/lists/linux-btrfs/msg53061.html

In your case it appears like the block 2899180224512 is giving it the most
trouble, so you could start with killing that one. From what I can tell this
tool zeroes out the entire block, so Btrfsck can simply delete the reference
and forget it, rather than repeatedly trying to figure out solutions and
bailing out with "failed to repair damaged filesystem, aborting".
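
A quick way to confirm which block shows up most often is to tally the error
lines in saved check output. The Python sketch below is only an illustration:
the function name and the sample text are made up for this example (the sample
lines are copied from the output quoted in this thread), and it is not part of
btrfs-progs.

```python
import re
from collections import Counter

# The three error shapes that appear in the check output quoted in this thread.
_PATTERNS = (
    re.compile(r"checksum verify failed on (\d+)"),
    re.compile(r"bytenr mismatch, want=(\d+)"),
    re.compile(r"parent transid verify failed on (\d+)"),
)


def count_failing_blocks(log_text):
    """Return a Counter mapping logical bytenr -> number of error lines."""
    counts = Counter()
    for line in log_text.splitlines():
        for pattern in _PATTERNS:
            match = pattern.search(line)
            if match:
                counts[int(match.group(1))] += 1
                break  # count each line once
    return counts


sample = """\
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
bytenr mismatch, want=2899180224512, have=3981076597540270796
checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C
bytenr mismatch, want=6983266418688, have=13671317608077697645
"""

# Most frequently failing blocks first.
for bytenr, hits in count_failing_blocks(sample).most_common():
    print(bytenr, hits)
```

On the full log, the block at the top of that list would be the first
candidate to look at.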

Depending on what was stored in it, you may have either no visible effect, or
a complete filesystem failure, or anything in between. Hence if you want to
experiment with this, find a way to work on writable overlay snapshots (also
described in the linked message).



-- 
With respect,
Roman

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: btrfs check --repair: failed to repair damaged filesystem, aborting
  2017-05-03  6:32                                                         ` Roman Mamedov
@ 2017-05-03 20:40                                                           ` Marc MERLIN
  0 siblings, 0 replies; 77+ messages in thread
From: Marc MERLIN @ 2017-05-03 20:40 UTC (permalink / raw)
  To: Roman Mamedov; +Cc: Btrfs BTRFS, Chris Mason, Qu Wenruo, David Sterba

On Wed, May 03, 2017 at 11:32:26AM +0500, Roman Mamedov wrote:
> > Actually, another thought:
> > Is there or should there be a way to repair around the bit that cannot
> > be repaired?
> > Separately, or not, can I locate which bits are causing the repair to
> > fail and maybe get a pointer to the path/inode so that I can hopefully
> > just delete those bad data structures (assuming deleting them is even
> > possible and that the FS won't just go read only as I try to do that)
> 
> There is the "btrfs-corrupt-block" tool which helped me to kick Btrfsck
> further along its course in a similar "unrepairable" situation.
> https://www.spinics.net/lists/linux-btrfs/msg53061.html
> 
> In your case it appears like the block 2899180224512 is giving it the most
> trouble, so you could start with killing that one. From what I can tell this
> tool zeroes out the entire block, so Btrfsck can simply delete the reference
> and forget it, rather than repeatedly trying to figure out solutions and
> bailing out with "failed to repair damaged filesystem, aborting".
> 
> Depending on what was stored in it, you may have either no visible effect, or
> a complete filesystem failure, or anything in between. Hence if you want to
> experiment with this, find a way to work on writable overlay snapshots (also
> described in the linked message).

Thanks for the tip. This does not seem to have worked at all, though.
Did I do something wrong?

gargamel:/var/local/src/btrfs-progs# ./btrfs-corrupt-block -l  2899180224512 /dev/mapper/dshelf2
mirror 1 logical 2899180224512 physical 2814363009024 device /dev/mapper/dshelf2
corrupting 2899180224512 copy 1
mirror 2 logical 2899180224512 physical 2814899879936 device /dev/mapper/dshelf2
corrupting 2899180224512 copy 2


gargamel:/mnt/btrfs_pool1# btrfs check --repair /dev/mapper/dshelf2
(...)
checksum verify failed on 2899180224512 found F25BEE55 wanted 00000000
checksum verify failed on 2899180224512 found F25BEE55 wanted 00000000
checksum verify failed on 2899180224512 found E5245DBD wanted 00000000
checksum verify failed on 2899180224512 found F25BEE55 wanted 00000000
bytenr mismatch, want=2899180224512, have=0
Repaired extent references for 3566695825408
ref mismatch on [6674127745024 32768] extent item 0, found 1
Backref 6674127745024 parent 7566652473344 owner 0 offset 0 num_refs 0 not found in extent tree
Incorrect local backref count on 6674127745024 parent 7566652473344 owner 0 offset 0 found 1 wanted 0 back 0x555cb4e9ced0
backpointer mismatch on [6674127745024 32768]
checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C
checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C
checksum verify failed on 6983266418688 found BCBF9E15 wanted 785FF67E
checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C
bytenr mismatch, want=6983266418688, have=13671317608077697645
checksum verify failed on 2899180224512 found F25BEE55 wanted 00000000
checksum verify failed on 2899180224512 found F25BEE55 wanted 00000000
checksum verify failed on 2899180224512 found E5245DBD wanted 00000000
checksum verify failed on 2899180224512 found F25BEE55 wanted 00000000
bytenr mismatch, want=2899180224512, have=0
failed to repair damaged filesystem, aborting

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours?
  2017-05-01 18:08                                         ` 4.11 relocate crash, null pointer + rolling back a filesystem by X hours? Marc MERLIN
  2017-05-02  1:50                                           ` Chris Murphy
@ 2017-05-05  1:13                                           ` Qu Wenruo
  1 sibling, 0 replies; 77+ messages in thread
From: Qu Wenruo @ 2017-05-05  1:13 UTC (permalink / raw)
  To: Marc MERLIN, linux-btrfs; +Cc: clm, bo.li.liu, fdmanana, jbacik, dsterba



At 05/02/2017 02:08 AM, Marc MERLIN wrote:
> So, I forgot to mention that it's my main media and backup server that got
> corrupted. Yes, I do actually have a backup of a backup server, but it's
> going to take days to recover due to the amount of data to copy back, not
> counting lots of manual typing due to the number of subvolumes, btrfs
> send/receive relationships and so forth.
> 
> Really, I should be able to roll back all writes from the last 24H, run a
> check --repair/scrub on top just to be sure, and be back on track.
> 
> In the meantime, the good news is that the filesystem doesn't crash the
> kernel (the posted crash below) now that I was able to cancel the btrfs balance,
> but it goes read only at the drop of a hat, even when I'm trying to delete
> recent snapshots and all data that was potentially written in the last 24H
> 
> On Mon, May 01, 2017 at 10:06:41AM -0700, Marc MERLIN wrote:
>> I have a filesystem that sadly got corrupted by a SAS card I just installed yesterday.
>>
>> I don't think that, in a case like this, there is a way to roll back all
>> writes across all subvolumes in the last 24H, correct?

Sorry for the late reply.
I thought this case was already closed, as I see little chance of recovery. :(

No, there is no way to roll back unless you're completely sure that only one
transaction commit happened in the last 24H.
(Well, that is not really possible in the real world.)

Btrfs is only guaranteed to be able to roll back to the *previous* commit.
That is ensured by forced metadata CoW.

Beyond the previous commit, only god knows.

If every metadata CoW write landed somewhere never used by any
earlier metadata, then there is a chance to recover.

But mostly the probability is very low. Some mount options, like ssd, change
the extent allocator's behavior in a way that improves the odds, but it
still needs a lot of luck.
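
To illustrate the reasoning with a toy model: the Python sketch below is NOT
the real btrfs allocator. The class name, the FIFO free list, the fixed tree
size and the block counts are all assumptions made up for this example. It
only shows why the previous commit stays readable while older ones depend on
whether their blocks happened to be reused.

```python
from collections import deque


class ToyCowStore:
    """Toy model only: each commit writes a whole new tree into free blocks."""

    def __init__(self, num_blocks):
        self.owner = [None] * num_blocks      # block -> generation that last wrote it
        self.free = deque(range(num_blocks))  # FIFO free list (an assumption)
        self.trees = {}                       # generation -> blocks of that commit's tree
        self.generation = 0

    def commit(self, tree_size):
        """Write a new tree copy-on-write, then release the previous tree's blocks."""
        if len(self.free) < tree_size:
            raise RuntimeError("toy store is out of free blocks")
        self.generation += 1
        blocks = [self.free.popleft() for _ in range(tree_size)]
        for block in blocks:
            self.owner[block] = self.generation
        self.trees[self.generation] = blocks
        # The old tree's blocks become reusable only after the new commit is
        # complete, so the new commit never overwrites its own predecessor.
        self.free.extend(self.trees.get(self.generation - 1, []))
        return self.generation

    def is_intact(self, generation):
        """True if no block of that commit's tree has been overwritten since."""
        return all(self.owner[b] == generation for b in self.trees[generation])


def run(num_blocks, commits, tree_size=2):
    store = ToyCowStore(num_blocks)
    for _ in range(commits):
        store.commit(tree_size)
    return store


tight = run(num_blocks=4, commits=3)    # little free space: old blocks get reused
roomy = run(num_blocks=100, commits=3)  # lots of free space: old blocks survive
print("tight:", [tight.is_intact(g) for g in (1, 2, 3)])
print("roomy:", [roomy.is_intact(g) for g in (1, 2, 3)])
```

In the tight case commit 1 is already overwritten after commit 3, while
commit 2 (the previous one) is still intact. In the roomy case all three
survive, but only because nothing has reused those blocks yet.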

I will reply with more detailed comments in the btrfs check mail thread.

Thanks,
Qu

>>
>> Is the best thing to go in each subvolume, delete the recent snapshots and
>> rename the one from 24H as the current one?
>   
> Well, just like I expected, it's a pain in the rear and this can't even help
> fix the top level mountpoint which doesn't have snapshots, so I can't roll
> it back.
> btrfs should really have an easy way to roll back X hours, or days to
> recover from garbage written after a known good point, given that it is COW
> after all.
> 
> Is there a way to do this with check --repair maybe?
> 
> In the meantime, I got stuck while trying to delete snapshots:
> 
> Let's say I have this:
> ID 428 gen 294021 top level 5 path backup
> ID 2023 gen 294021 top level 5 path Soft
> ID 3021 gen 294051 top level 428 path backup/debian32
> ID 4400 gen 294018 top level 428 path backup/debian64
> ID 4930 gen 294019 top level 428 path backup/ubuntu
> 
> I can easily
> Delete subvolume (no-commit): '/mnt/btrfs_pool2/Soft'
> and then:
> gargamel:/mnt/btrfs_pool2# mv Soft_rw.20170430_01:50:22 Soft
> 
> But I can't delete backup, which actually is mostly only a directory
> containing other things (in hindsight I shouldn't have made that a
> subvolume)
> Delete subvolume (no-commit): '/mnt/btrfs_pool2/backup'
> ERROR: cannot delete '/mnt/btrfs_pool2/backup': Directory not empty
> 
> This is because backup has a lot of subvolumes due to btrfs send/receive
> relationships.
> 
> Is it possible to recover there? Can you reparent subvolumes to a different
> subvolume without doing a full copy via btrfs send/receive?
> 
> Thanks,
> Marc
> 
>> BTRFS warning (device dm-5): failed to load free space cache for block group 6746013696000, rebuilding it now
>> BTRFS warning (device dm-5): block group 6754603630592 has wrong amount of free space
>> BTRFS warning (device dm-5): failed to load free space cache for block group 6754603630592, rebuilding it now
>> BTRFS warning (device dm-5): block group 7125178777600 has wrong amount of free space
>> BTRFS warning (device dm-5): failed to load free space cache for block group 7125178777600, rebuilding it now
>> BTRFS error (device dm-5): bad tree block start 3981076597540270796 2899180224512
>> BTRFS error (device dm-5): bad tree block start 942082474969670243 2899180224512
>> BTRFS: error (device dm-5) in __btrfs_free_extent:6944: errno=-5 IO failure
>> BTRFS info (device dm-5): forced readonly
>> BTRFS: error (device dm-5) in btrfs_run_delayed_refs:2961: errno=-5 IO failure
>> BUG: unable to handle kernel NULL pointer dereference at           (null)
>> IP: __del_reloc_root+0x3f/0xa6
>> PGD 189a0e067
>> PUD 189a0f067
>> PMD 0
>>
>> Oops: 0000 [#1] PREEMPT SMP
>> Modules linked in: veth ip6table_filter ip6_tables ebtable_nat ebtables ppdev lp xt_addrtype br_netfilter bridge stp llc tun autofs4 softdog binfmt_misc ftdi_sio nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc ipt_REJECT nf_reject_ipv4 xt_conntrack xt_mark xt_nat xt_tcpudp nf_log_ipv4 nf_log_common xt_LOG iptable_mangle iptable_filter lm85 hwmon_vid pl2303 dm_snapshot dm_bufio iptable_nat ip_tables nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_conntrack_ftp ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_nat nf_conntrack x_tables sg st snd_pcm_oss snd_mixer_oss bcache kvm_intel kvm irqbypass snd_hda_codec_realtek snd_cmipci snd_hda_codec_generic snd_hda_intel snd_mpu401_uart snd_hda_codec snd_opl3_lib snd_rawmidi snd_hda_core snd_seq_device snd_hwdep eeepc_wmi snd_pcm asus_wmi rc_ati_x10
>>   asix snd_timer ati_remote sparse_keymap usbnet rfkill snd hwmon soundcore rc_core evdev libphy tpm_infineon pcspkr i915 parport_pc i2c_i801 input_leds mei_me lpc_ich parport tpm_tis battery usbserial tpm_tis_core tpm wmi e1000e ptp pps_core fuse raid456 multipath mmc_block mmc_core lrw ablk_helper dm_crypt dm_mod async_raid6_recov async_pq async_xor async_memcpy async_tx crc32c_intel blowfish_x86_64 blowfish_common pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd xhci_pci ehci_pci sata_sil24 xhci_hcd mvsas ehci_hcd r8169 usbcore mii libsas scsi_transport_sas thermal fan [last unloaded: ftdi_sio]
>> CPU: 0 PID: 9056 Comm: btrfs Tainted: G     U          4.11.0-amd64-preempt-sysrq-20170406 #2
>> Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
>> task: ffff88374d2a60c0 task.stack: ffffa6f226424000
>> RIP: 0010:__del_reloc_root+0x3f/0xa6
>> RSP: 0018:ffffa6f226427a40 EFLAGS: 00210246
>> RAX: 0000000000000000 RBX: ffff8838ee256000 RCX: 00000000ffffffe2
>> RDX: 0000000000000001 RSI: ffffffff9f83b410 RDI: ffff8837992da568
>> RBP: ffffa6f226427a68 R08: 0000000000000000 R09: ffffffff9fd69480
>> R10: 0000000000000000 R11: 0000000000000000 R12: ffffa6f226427ab0
>> R13: ffff883768938000 R14: ffff8837992da568 R15: ffff8837992da570
>> FS:  00007facd18d28c0(0000) GS:ffff883a5e200000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000000000000000 CR3: 0000000189a10000 CR4: 00000000001406f0
>> Call Trace:
>>   free_reloc_roots+0x4f/0x5d
>>   merge_reloc_roots+0x159/0x1ba
>>   relocate_block_group+0x410/0x492
>>   btrfs_relocate_block_group+0x12d/0x253
>>   btrfs_relocate_chunk+0x3e/0xb1
>>   btrfs_balance+0xd16/0xf36
>>   btrfs_ioctl_balance+0x24f/0x2cd
>>   ? __alloc_pages_nodemask+0x134/0x1e0
>>   btrfs_ioctl+0x1447/0x1e22
>>   ? mem_cgroup_charge_statistics+0x1e/0x88
>>   ? get_page+0x9/0x26
>>   ? __lru_cache_add+0x2a/0x6c
>>   ? set_pte_at+0x9/0xd
>>   ? __handle_mm_fault+0x61d/0xa6f
>>   vfs_ioctl+0x21/0x38
>>   ? vfs_ioctl+0x21/0x38
>>   do_vfs_ioctl+0x4ef/0x537
>>   ? current_kernel_time64+0x10/0x36
>>   ? __audit_syscall_entry+0xc2/0xe6
>>   ? syscall_trace_enter+0x1ac/0x20e
>>   SyS_ioctl+0x57/0x7b
>>   do_syscall_64+0x6b/0x7d
>>   entry_SYSCALL64_slow_path+0x25/0x25
>> RIP: 0033:0x7facd097ecc7
>> RSP: 002b:00007ffefd3c3128 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
>> RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007facd097ecc7
>> RDX: 00007ffefd3c31b8 RSI: 00000000c4009420 RDI: 0000000000000003
>> RBP: 00007ffefd3c31b8 R08: 0000000000000003 R09: 0000000000008040
>> R10: 0000000000000541 R11: 0000000000000206 R12: 0000000000000003
>> R13: 00007ffefd3c4cc9 R14: 0000000000000001 R15: 0000000000000001
>> Code: af f0 01 00 00 48 89 fb 4d 8b b5 10 0b 00 00 4d 8d be 70 05 00 00 49 81 c6 68 05 00 00 4c 89 ff e8 0f 44 43 00 48 8b 03 4c 89 f7 <48> 8b 30 e8 0e fc ff ff 48 85 c0 49 89 c4 74 0b 4c 89 f6 48 89
>> RIP: __del_reloc_root+0x3f/0xa6 RSP: ffffa6f226427a40
>> CR2: 0000000000000000
>> ---[ end trace 64c3fa4dc953d295 ]---
>> Kernel panic - not syncing: Fatal exception
>> Kernel Offset: 0x1e000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
>> Rebooting in 20 seconds..
>> ACPI MEMORY or I/O RESET_REG.
>>
>> -- 
>> "A mouse is a device used to point at the xterm you want to type in" - A.S.R.
>> Microsoft is to operating systems ....
>>                                        .... what McDonalds is to gourmet cooking
>> Home page: http://marc.merlins.org/
> 




* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours?
  2017-05-02  3:23                                             ` Marc MERLIN
  2017-05-02  4:56                                               ` Chris Murphy
  2017-05-02  5:01                                               ` Duncan
@ 2017-05-05  1:19                                               ` Qu Wenruo
  2017-05-05  2:10                                                 ` Qu Wenruo
  2017-05-05  2:40                                                 ` Marc MERLIN
  2 siblings, 2 replies; 77+ messages in thread
From: Qu Wenruo @ 2017-05-05  1:19 UTC (permalink / raw)
  To: Marc MERLIN, Chris Murphy
  Cc: Btrfs BTRFS, Chris Mason, bo.li.liu, fdmanana, Josef Bacik, David Sterba



At 05/02/2017 11:23 AM, Marc MERLIN wrote:
> Hi Chris,
> 
> Thanks for the reply, much appreciated.
> 
> On Mon, May 01, 2017 at 07:50:22PM -0600, Chris Murphy wrote:
>> What about btrfs check (no repair), without and then also with --mode=lowmem?
>>
>> In theory I like the idea of a 24 hour rollback; but in normal usage
>> Btrfs will eventually free up space containing stale and no longer
>> necessary metadata. Like the chunk tree, it's always changing, so you
>> get to a point, even with snapshots, that the old state of that tree
>> is just - gone. A snapshot of an fs tree does not make the chunk tree
>> frozen in time.
>   
> Right, of course, I was being way over optimistic here. I kind of forgot
> that metadata wasn't COW, my bad.
> 
>> In any case, it's a big problem in my mind if no existing tools can
>> fix a file system of this size. So before making anymore changes, make
>> sure you have a btrfs-image somewhere, even if it's huge. The offline
>> checker needs to be able to repair it, right now it's all we have for
>> such a case.
> 
> The image will be huge, and take maybe 24H to make (last time it took
> some silly amount of time like that), and honestly I'm not sure how
> useful it'll be.
> Outside of the kernel crashing if I do a btrfs balance, and hopefully
> the crash report I gave is good enough, the state I'm in is not btrfs'
> fault.
> 
> If I can't roll back to a reasonably working state, with data loss of a
> known quantity that I can recover from backup, I'll have to destroy the
> filesystem and recover from scratch, which will take multiple days.
> Since I can't wait too long before getting back to a working state, I
> think I'm going to try btrfs check --repair after a scrub to get a list
> of all the pathnames/inodes that are known to be damaged, and work from
> there.
> Sounds reasonable?
> 
> Also, how is --mode=lowmem being useful?
> 
> And for re-parenting a sub-subvolume, is that possible?
> (I want to delete /sub1/ but I can't because I have /sub1/sub2 that's also a subvolume
> and I'm not sure how to re-parent sub2 to somewhere else so that I can subvolume delete
> sub1)
> 
> In the meantime, a simple check without repair looks like this. It will
> likely take many hours to complete:
> gargamel:/var/local/space# btrfs check /dev/mapper/dshelf2
> Checking filesystem on /dev/mapper/dshelf2
> UUID: 03e9a50c-1ae6-4782-ab9c-5f310a98e653
> checking extents
> checksum verify failed on 3096461459456 found 0E6B7980 wanted FBE5477A
> checksum verify failed on 3096461459456 found 0E6B7980 wanted FBE5477A
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> bytenr mismatch, want=2899180224512, have=3981076597540270796
> checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5
> checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5
> checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B
> checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B
> parent transid verify failed on 1671538819072 wanted 293964 found 293902
> parent transid verify failed on 1671538819072 wanted 293964 found 293902
> checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0
> checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0
> checksum verify failed on 1759425052672 found 843B59F1 wanted F0FF7D00
> checksum verify failed on 1759425052672 found 843B59F1 wanted F0FF7D00
> checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
> checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
> checksum verify failed on 2898779357184 found 96395131 wanted 433D6E09
> checksum verify failed on 2898779357184 found 96395131 wanted 433D6E09
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> bytenr mismatch, want=2899180224512, have=3981076597540270796
> checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
> checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
> checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
> checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
> checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
> (...)

Full output please.

I know it will be long, but the point is that the full output could help us
at least locate where most of the corruption is.

If most of the corruption is only in the extent tree, the chance of recovery
increases hugely.

As the extent tree is just a backref for all allocated extents, it's not
really important if recovery (read) is the primary goal.

But if other trees (fs or subvolume trees important to you) are also
corrupted, I'm afraid your last chance will be "btrfs restore".

Thanks,
Qu

> 
> Thanks,
> Marc
> 




* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours?
  2017-05-05  1:19                                               ` Qu Wenruo
@ 2017-05-05  2:10                                                 ` Qu Wenruo
  2017-05-05  2:40                                                 ` Marc MERLIN
  1 sibling, 0 replies; 77+ messages in thread
From: Qu Wenruo @ 2017-05-05  2:10 UTC (permalink / raw)
  To: Marc MERLIN, Chris Murphy
  Cc: Btrfs BTRFS, Chris Mason, bo.li.liu, fdmanana, Josef Bacik, David Sterba



At 05/05/2017 09:19 AM, Qu Wenruo wrote:
> Full output please.

Sorry for not noticing the link.

[Conclusion]
After checking the full result, some of the fs/subvolume trees are corrupted.

[Details]
Some examples here:

---
ref mismatch on [6674127745024 32768] extent item 0, found 1
Backref 6674127745024 parent 7566652473344 owner 0 offset 0 num_refs 0 
not found in extent tree
Incorrect local backref count on 6674127745024 parent 7566652473344 
owner 0 offset 0 found 1 wanted 0 back 0x5648afda0f20
backpointer mismatch on [6674127745024 32768]
---

The extent at 6674127745024 seems to be a *DATA* extent: it is 32K in size,
while the current default nodesize is 16K and the ancient default nodesize was 4K.

Unless you specified -n 32K at mkfs time, it's a DATA extent.
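
A minimal sketch of the size-based reasoning above, assuming the checker
prints extents as "[bytenr size]" as in the example. The helper name and the
16K default are illustrative assumptions, not part of btrfs-progs:

```python
import re

# Matches lines such as:
#   ref mismatch on [6674127745024 32768] extent item 0, found 1
REF_MISMATCH = re.compile(r"ref mismatch on \[(\d+) (\d+)\]")


def classify_extents(check_output, nodesize=16 * 1024):
    """Return (bytenr, size, kind) for every 'ref mismatch' line.

    Hypothetical helper: a tree block is always exactly nodesize bytes,
    so any other size must be a data extent. A size equal to nodesize
    proves nothing either way and is reported as "unknown".
    """
    results = []
    for line in check_output.splitlines():
        match = REF_MISMATCH.search(line)
        if not match:
            continue
        bytenr, size = int(match.group(1)), int(match.group(2))
        kind = "unknown" if size == nodesize else "data"
        results.append((bytenr, size, kind))
    return results


sample = "ref mismatch on [6674127745024 32768] extent item 0, found 1"
print(classify_extents(sample))
```

An extent whose size equals the nodesize could be either a tree block or
data, so the sketch only rules metadata out; it never confirms it.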

Furthermore, it's a shared data backref: it uses its parent tree
block to do the backref walk.

And its parent tree block is 7566652473344.
But that bytenr can't be found anywhere (including the csum error output),
which is to say that either we can't find that tree block or we can't
reach the tree root for it.

Considering it's a data extent, its owner is either the root tree or an
fs/subvolume tree.


Such cases are everywhere, as I found other extents sized from 4K to 44K,
so I'm pretty sure there must be some fs/subvolume tree corrupted.
(Data extents in the root tree are seldom 4K sized)

So unfortunately, your fs/subvolume trees are also corrupted.
And there is almost no chance of a graceful recovery.

[Alternatives]
I would recommend using "btrfs restore -f <subvolid>" to restore a
specified subvolume.

What we can do is try to dump the tree of a subvolume, manually gather
what's still there, and put it somewhere else.
And that's what btrfs-restore does.

The only good news is that your chunk tree seems to be good, so btrfs-restore
shouldn't encounter too many problems.

Good luck.

Thanks,
Qu

> 
> I know it will be long, but the point here is, full output could help us 
> to at least locate where the most corruption are.
> 
> If most corruption are only in extent tree, the chance to recover will 
> increase hugely.
> 
> As extent tree is just a backref for all allocated extents, it's not 
> really important if recovery (read) is the primary goal.
> 
> But if other tree (fs or subvolume tree important for you) also get 
> corrupted, I'm afraid your last chance will be "btrfs restore" then.
> 
> Thanks,
> Qu
> 
>>
>> Thanks,
>> Marc
>>




* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours?
  2017-05-05  1:19                                               ` Qu Wenruo
  2017-05-05  2:10                                                 ` Qu Wenruo
@ 2017-05-05  2:40                                                 ` Marc MERLIN
  2017-05-05  5:03                                                   ` Qu Wenruo
  1 sibling, 1 reply; 77+ messages in thread
From: Marc MERLIN @ 2017-05-05  2:40 UTC (permalink / raw)
  To: Qu Wenruo
  Cc: Chris Murphy, Btrfs BTRFS, Chris Mason, bo.li.liu, fdmanana,
	Josef Bacik, David Sterba

On Fri, May 05, 2017 at 09:19:29AM +0800, Qu Wenruo wrote:
> Sorry for not noticing the link.
 
no problem, it was only one line amongst many :)
Thanks much for having had a look.

> [Conclusion]
> After checking the full result, some of fs/subvolume trees are corrupted.
> 
> [Details]
> Some example here:
> 
> ---
> ref mismatch on [6674127745024 32768] extent item 0, found 1
> Backref 6674127745024 parent 7566652473344 owner 0 offset 0 num_refs 0 not
> found in extent tree
> Incorrect local backref count on 6674127745024 parent 7566652473344 owner 0
> offset 0 found 1 wanted 0 back 0x5648afda0f20
> backpointer mismatch on [6674127745024 32768]
> ---
> 
> The extent at 6674127745024 seems to be an *DATA* extent.
> While current default nodesize is 16K and ancient default node is 4K.
> 
> Unless you specified -n 32K at mkfs time, it's a DATA extent.

I did not, so you must be right about DATA, which should be good, right,
I don't mind losing data as long as the underlying metadata is correct.

I should have given more data on the FS:

gargamel:/var/local/src/btrfs-progs# btrfs fi df /mnt/btrfs_pool2/
Data, single: total=6.28TiB, used=6.12TiB
System, DUP: total=32.00MiB, used=720.00KiB
Metadata, DUP: total=97.00GiB, used=94.39GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

gargamel:/var/local/src/btrfs-progs# btrfs fi usage /mnt/btrfs_pool2
Overall:
    Device size:                   7.28TiB
    Device allocated:              6.47TiB
    Device unallocated:          824.48GiB
    Device missing:                  0.00B
    Used:                          6.30TiB
    Free (estimated):            994.45GiB      (min: 582.21GiB)
    Data ratio:                       1.00
    Metadata ratio:                   2.00
    Global reserve:              512.00MiB      (used: 0.00B)

Data,single: Size:6.28TiB, Used:6.12TiB
   /dev/mapper/dshelf2     6.28TiB

Metadata,DUP: Size:97.00GiB, Used:94.39GiB
   /dev/mapper/dshelf2   194.00GiB

System,DUP: Size:32.00MiB, Used:720.00KiB
   /dev/mapper/dshelf2    64.00MiB

Unallocated:
   /dev/mapper/dshelf2   824.48GiB
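
As a quick sanity check of the numbers above (a sketch only; the function
name is made up), the raw allocation is each profile's size multiplied by
its ratio, so DUP metadata and system chunks count twice:

```python
def raw_allocated_gib(data_gib, metadata_gib, system_gib,
                      data_ratio=1, metadata_ratio=2, system_ratio=2):
    """Raw bytes allocated on the device, in GiB.

    Hypothetical helper: 'single' stores one copy (ratio 1),
    'DUP' stores two copies on the same device (ratio 2).
    """
    return (data_gib * data_ratio
            + metadata_gib * metadata_ratio
            + system_gib * system_ratio)


# Values taken from the "btrfs fi usage" output above.
data = 6.28 * 1024          # 6.28 TiB of data chunks, single
metadata = 97.00            # 97 GiB of metadata chunks, DUP
system = 32.00 / 1024       # 32 MiB of system chunks, DUP

allocated = raw_allocated_gib(data, metadata, system)
print(f"metadata on disk: {metadata * 2:.2f} GiB")
print(f"device allocated: {allocated / 1024:.2f} TiB")
```

This reproduces the 194.00GiB shown for Metadata,DUP and the 6.47TiB
"Device allocated" figure.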


> Further more, it's a shared data backref, it's using its parent tree block
> to do backref walk.
> 
> And its parent tree block is 7566652473344.
> While such bytenr can't be found anywhere (including csum error output),
> that's to say either we can't find that tree block nor can't reach the tree
> root for it.
> 
> Considering it's data extent, its owner is either root or fs/subvolume tree.
> 
> 
> Such cases are everywhere, as I found other extent sized from 4K to 44K, so
> I'm pretty sure there must be some fs/subvolume tree corrupted.
> (Data extent in root tree is seldom 4K sized)
> 
> So unfortunately, your fs/subvolume trees are also corrupted.
> And almost no chance to do a graceful recovery.
 
So I'm confused here. You're saying my metadata is not corrupted (and in
my case, I have DUP, so I should have 2 copies), but with data blocks
(which are not duped) corrupted, it's also possible to lose the
filesystem in a way that it can't be taken back to a clean state, even
by deleting some corrupted data?

> [Alternatives]
> I would recommend to use "btrfs restore -f <subvolid>" to restore specified
> subvolume.

I don't need to restore data, the data is a backup. It will just take
many days to recreate (plus many hours of typing from me because the
backup updates are automated, but recreating everything, is not
automated)

So if I understand correctly, my metadata is fine (and I guess I have 2
copies, so it would have been unlucky to get both copies corrupted), but
enough data blocks got corrupted that btrfs cannot recover, even by
deleting the corrupted data blocks. Correct?

And is it not possible to clear the corrupted blocks like this?
./btrfs-corrupt-block -l  2899180224512 /dev/mapper/dshelf2
and just accept the lost data but get btrfs check repair to deal with
the deleted blocks and bring the rest back to a clean state?

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901


* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours?
  2017-05-05  2:40                                                 ` Marc MERLIN
@ 2017-05-05  5:03                                                   ` Qu Wenruo
  2017-05-05 15:43                                                     ` Marc MERLIN
  0 siblings, 1 reply; 77+ messages in thread
From: Qu Wenruo @ 2017-05-05  5:03 UTC (permalink / raw)
  To: Marc MERLIN
  Cc: Chris Murphy, Btrfs BTRFS, Chris Mason, bo.li.liu, fdmanana,
	Josef Bacik, David Sterba



At 05/05/2017 10:40 AM, Marc MERLIN wrote:
> On Fri, May 05, 2017 at 09:19:29AM +0800, Qu Wenruo wrote:
>> Sorry for not noticing the link.
>   
> no problem, it was only one line amongst many :)
> Thanks much for having had a look.
> 
>> [Conclusion]
>> After checking the full result, some of fs/subvolume trees are corrupted.
>>
>> [Details]
>> Some example here:
>>
>> ---
>> ref mismatch on [6674127745024 32768] extent item 0, found 1
>> Backref 6674127745024 parent 7566652473344 owner 0 offset 0 num_refs 0 not
>> found in extent tree
>> Incorrect local backref count on 6674127745024 parent 7566652473344 owner 0
>> offset 0 found 1 wanted 0 back 0x5648afda0f20
>> backpointer mismatch on [6674127745024 32768]
>> ---
>>
>> The extent at 6674127745024 seems to be an *DATA* extent.
>> While current default nodesize is 16K and ancient default node is 4K.
>>
>> Unless you specified -n 32K at mkfs time, it's a DATA extent.
> 
> I did not, so you must be right about DATA, which should be good, right,
> I don't mind losing data as long as the underlying metadata is correct.
> 
> I should have given more data on the FS:
> 
> gargamel:/var/local/src/btrfs-progs# btrfs fi df /mnt/btrfs_pool2/
> Data, single: total=6.28TiB, used=6.12TiB
> System, DUP: total=32.00MiB, used=720.00KiB
> Metadata, DUP: total=97.00GiB, used=94.39GiB

Tons of metadata since the fs is so large.

> GlobalReserve, single: total=512.00MiB, used=0.00B
> 
> gargamel:/var/local/src/btrfs-progs# btrfs fi usage /mnt/btrfs_pool2
> Overall:
>      Device size:                   7.28TiB
>      Device allocated:              6.47TiB
>      Device unallocated:          824.48GiB
>      Device missing:                  0.00B
>      Used:                          6.30TiB
>      Free (estimated):            994.45GiB      (min: 582.21GiB)
>      Data ratio:                       1.00
>      Metadata ratio:                   2.00
>      Global reserve:              512.00MiB      (used: 0.00B)
> 
> Data,single: Size:6.28TiB, Used:6.12TiB
>     /dev/mapper/dshelf2     6.28TiB
> 
> Metadata,DUP: Size:97.00GiB, Used:94.39GiB
>     /dev/mapper/dshelf2   194.00GiB
> 
> System,DUP: Size:32.00MiB, Used:720.00KiB
>     /dev/mapper/dshelf2    64.00MiB
> 
> Unallocated:
>     /dev/mapper/dshelf2   824.48GiB
> 
> 
>> Further more, it's a shared data backref, it's using its parent tree block
>> to do backref walk.
>>
>> And its parent tree block is 7566652473344.
>> While such bytenr can't be found anywhere (including csum error output),
>> that's to say either we can't find that tree block nor can't reach the tree
>> root for it.
>>
>> Considering it's data extent, its owner is either root or fs/subvolume tree.
>>
>>
>> Such cases are everywhere, as I found other extent sized from 4K to 44K, so
>> I'm pretty sure there must be some fs/subvolume tree corrupted.
>> (Data extent in root tree is seldom 4K sized)
>>
>> So unfortunately, your fs/subvolume trees are also corrupted.
>> And almost no chance to do a graceful recovery.
>   
> So I'm confused here. You're saying my metadata is not corrupted (and in
> my case, I have DUP, so I should have 2 copies),

Nope, everything I'm talking about here is metadata (tree blocks).
The difference is the owner: either the extent tree or an fs/subvolume tree.

The fsck doesn't check data blocks.

> but with data blocks
> (which are not duped) corrupted, it's also possible to lose the
> filesystem in a way that it can't be taken back to a clean state, even
> by deleting some corrupted data?

No, it can't be repaired by deleting data.

The problem is that the tree blocks (metadata) that refer to these data
blocks are corrupted.

And they are corrupted in such a way that both the extent tree (the tree
that contains extent allocation info) and the fs tree (the tree that
contains real fs info, like inodes and data locations) are corrupted.

So graceful recovery is not possible now.

> 
>> [Alternatives]
>> I would recommend to use "btrfs restore -f <subvolid>" to restore specified
>> subvolume.
> 
> I don't need to restore data, the data is a backup. It will just take
> many days to recreate (plus many hours of typing from me because the
> backup updates are automated, but recreating everything, is not
> automated)
> 
> So if I understand correctly, my metadata is fine (and I guess I have 2
> copies, so it would have been unlucky to get both copies corrupted), but
> enough data blocks got corrupted that btrfs cannot recover, even by
> deleting the corrupted data blocks. Correct?

Unfortunately, no. Even though you have 2 copies, so many tree blocks are
corrupted that, for a lot of them, neither copy matches its checksum.

Just like the following tree block: both copies have the wrong checksum.
---
checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
---
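
A rough way to look for such blocks in a long check log, assuming the
message format shown above. This is only a sketch: the function names are
invented, and treating two different results at one bytenr as "two copies
read" is an assumption about how the checker retries, not something the
tool reports directly:

```python
import re
from collections import defaultdict

CSUM_FAIL = re.compile(
    r"checksum verify failed on (\d+) found ([0-9A-F]{8}) wanted ([0-9A-F]{8})"
)


def distinct_failures(check_output):
    """Map bytenr -> set of distinct (found, wanted) pairs in the log."""
    seen = defaultdict(set)
    for line in check_output.splitlines():
        match = CSUM_FAIL.search(line)
        if match:
            seen[int(match.group(1))].add((match.group(2), match.group(3)))
    return dict(seen)


def blocks_with_two_results(check_output):
    """Bytenrs reporting at least two different failing results."""
    return sorted(b for b, pairs in distinct_failures(check_output).items()
                  if len(pairs) >= 2)


sample = """\
checksum verify failed on 3096461459456 found 0E6B7980 wanted FBE5477A
checksum verify failed on 3096461459456 found 0E6B7980 wanted FBE5477A
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
"""
print(blocks_with_two_results(sample))
```

A bytenr that only ever shows one repeated result is not proven good by
this; the log alone cannot tell whether that was one copy read twice or
two copies with identical contents.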

> 
> And is it not possible to clear the corrupted blocks like this?
> ./btrfs-corrupt-block -l  2899180224512 /dev/mapper/dshelf2
> and just accept the lost data but get btrfs check repair to deal with
> the deleted blocks and bring the rest back to a clean state?

No, that won't help.

Corrupted blocks are corrupted; that command just tries to corrupt
them again.
It won't do the black magic of adjusting tree blocks to avoid them.

That's done in btrfs check (and --repair).
Btrfs check will just skip corrupted tree blocks and continue, while
btrfs check --repair will try to rebuild the tree and avoid corrupted
blocks.

But as you can see, btrfs check can't handle it, due to the complicated
combination of corruption.

So I'm afraid there is no good method to recover.

Thanks,
Qu
> 
> Thanks,
> Marc
> 




* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours?
  2017-05-05  5:03                                                   ` Qu Wenruo
@ 2017-05-05 15:43                                                     ` Marc MERLIN
  2017-05-17 18:23                                                       ` Kai Krakow
  0 siblings, 1 reply; 77+ messages in thread
From: Marc MERLIN @ 2017-05-05 15:43 UTC (permalink / raw)
  To: Qu Wenruo, hurikhan77
  Cc: Chris Murphy, Btrfs BTRFS, Chris Mason, bo.li.liu, fdmanana,
	Josef Bacik, David Sterba

Thanks again for your answer. Obviously even if my filesystem is toast,
it's useful to learn from what happened.

On Fri, May 05, 2017 at 01:03:02PM +0800, Qu Wenruo wrote:
> > > So unfortunately, your fs/subvolume trees are also corrupted.
> > > And almost no chance to do a graceful recovery.
> > So I'm confused here. You're saying my metadata is not corrupted (and in
> > my case, I have DUP, so I should have 2 copies),
> 
> Nope, here I'm all talking about metadata (tree blocks).
> Difference is the owner, either extent tree or fs/subvolume tree.
 
I see. I didn't realize that my filesystem managed to corrupt both
copies of its metadata.

> The fsck doesn't check data blocks.

Right, that's what scrub does, fair enough.

> The problem is, tree blocks (metadata) that refers these data blocks are
> corrupted.
> 
> And they are corrupted in such a way that both extent tree (tree contains
> extent allocation info) and fs tree (tree contains real fs info, like inode
> and data location) are corrupted.
> 
> So graceful recovery is not possible now.

I see, thanks for explaining.

> Unfortunately, no, even you have 2 copies, a lot of tree blocks are
> corrupted that neither copy matches checksum.
 
Thanks for confirming. I guess if I'm having corruption due to a bad
card, it makes sense that both get updated after one another and both
got corrupted for the same reason.

> Corrupted blocks are corrupted, that command is just trying to corrupt it
> again.
> It won't do the black magic to adjust tree blocks to avoid them.
 
I see. You may have seen the earlier message from Kai Krakow who was
able to recover his FS by trying this trick, but I understand it
can't work in all cases.

Thanks again for your answers.
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901


* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours?
  2017-05-05 15:43                                                     ` Marc MERLIN
@ 2017-05-17 18:23                                                       ` Kai Krakow
  0 siblings, 0 replies; 77+ messages in thread
From: Kai Krakow @ 2017-05-17 18:23 UTC (permalink / raw)
  To: Marc MERLIN
  Cc: Qu Wenruo, Chris Murphy, Btrfs BTRFS, Chris Mason, bo.li.liu,
	fdmanana, Josef Bacik, David Sterba

Am Fri, 5 May 2017 08:43:23 -0700
schrieb Marc MERLIN <marc@merlins.org>:

[missing quote of the command]
> > Corrupted blocks are corrupted, that command is just trying to
> > corrupt it again.
> > It won't do the black magic to adjust tree blocks to avoid them.  
>  
> I see. You may have seen the earlier message from Kai Krakow who was
> able to recover his FS by trying this trick, but I understand it
> can't work in all cases.

Huh, what trick? I don't take credit for it... ;-)

The corrupt-block trick must've been someone else...


-- 
Regards,
Kai

Replies to list-only preferred.


* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours?
  2017-05-02  5:01                                               ` Duncan
  2017-05-02 19:53                                                 ` Kai Krakow
@ 2017-05-23 16:58                                                 ` Marc MERLIN
  2017-05-24 10:16                                                   ` Duncan
  1 sibling, 1 reply; 77+ messages in thread
From: Marc MERLIN @ 2017-05-23 16:58 UTC (permalink / raw)
  To: Duncan; +Cc: linux-btrfs

On Tue, May 02, 2017 at 05:01:02AM +0000, Duncan wrote:
> Marc MERLIN posted on Mon, 01 May 2017 20:23:46 -0700 as excerpted:
> 
> > Also, how is --mode=lowmem being useful?
> 
> FWIW, I just watched your talk that's linked from the wiki, and wondered 
> what you were doing these days as I hadn't seen any posts from you here 
> for awhile.
 
First, sorry for the late reply. Because you didn't Cc me in the answer,
it went to a different folder where I only saw your replies now.
Off topic, but basically I'm not dead or anything, I have btrfs working
well enough to not mess with it further because I have many other
hobbies :)  that is unless I put a new SAS card in my server, hit some
corruption bugs, and now I'm back spending days fixing the system.

> Well, that you're asking that question confirms you've not been following 
> the list too closely...  Of course that's understandable as people have 
> other stuff to do, but just sayin'.

That's exactly right. I'm subscribed to way too many lists on way too
many topics to be up to date with all, sadly :(

> Of course on-list I'm somewhat known for my arguments propounding the 
> notion that any filesystem that's too big to be practically maintained 
> (including time necessary to restore from backups, should that be 
> necessary for whatever reason) is... too big... and should ideally be 
> broken along logical and functional boundaries into a number of 
> individual smaller filesystems until such point as each one is found to 
> be practically maintainable within a reasonably practical time frame.  
> Don't put all the eggs in one basket, and when the bottom of one of those 
> baskets inevitably falls out, most of your eggs will be safe in other 
> baskets. =:^)
 
That's a valid point, and in my case, I can back it up/restore, it just
takes a bit of time, but most of the time is manually babysitting all
those subvolumes that I need to recreate by hand with btrfs send/receive
relationships, which all get lost during backup/restore.
This is the most painful part.
What's too big? I've only ever used a filesystem that fits on a raid
of 4 data drives. That value has increased over time, but I don't have a
crazy array of 20+ drives as a single filesystem, or anything.
Since drives have gotten bigger, but not that much faster, I use bcache
to make things more acceptable in speed.

> *BUT*, and here's the "go further" part, keep in mind that subvolume-read-
> only is a property, gettable and settable by btrfs property.
> 
> So you should be able to unset the read-only property of a subvolume or 
> snapshot, move it, then if desired, set it again.
> 
> Of course I wouldn't expect send -p to work with such a snapshot, but 
> send -c /might/ still work, I'm not actually sure but I'd consider it 
> worth trying.  (I'd try -p as well, but expect it to fail...)

That's an interesting point, thanks for making it.
In that case, I did have to destroy and recreate the filesystem since
btrfs check --repair was unable to fix it, but knowing how to reparent
read only subvolumes may be handy in the future, thanks.

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901


* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours?
  2017-05-23 16:58                                                 ` Marc MERLIN
@ 2017-05-24 10:16                                                   ` Duncan
  0 siblings, 0 replies; 77+ messages in thread
From: Duncan @ 2017-05-24 10:16 UTC (permalink / raw)
  To: linux-btrfs

Marc MERLIN posted on Tue, 23 May 2017 09:58:47 -0700 as excerpted:

> That's a valid point, and in my case, I can back it up/restore, it just
> takes a bit of time, but most of the time is manually babysitting all
> those subvolumes that I need to recreate by hand with btrfs send/receive
> relationships, which all get lost during backup/restore.
> This is the most painful part.
> What's too big? I've only ever used a filesystem that fits on a raid
> of 4 data drives. That value has increased over time, but I don't have a
> crazy array of 20+ drives as a single filesystem, or anything.
> Since drives have gotten bigger, but not that much faster, I use bcache
> to make things more acceptable in speed.

What's too big?  That depends on your tolerance for pain, but given the 
scenario of recreating subvolumes by hand with send/receive, I'd 
probably try to break it down so that, while there's the same number of 
snapshots to restore, the number of subvolumes those snapshots are taken 
against is limited.

My own rule of thumb is if it's taking so long that it's a barrier to 
doing it, I really need to either break things down further, or upgrade 
to faster storage.  The latter is why I'm actually looking at upgrading 
my media and second backup set, on spinning rust, to ssd.  Because while 
I used to do backups spinning rust to spinning rust of that size all the 
time, ssds have spoiled me, and now I dread doing the spinning rust 
backups... or restores.   Tho in my case the spinning rust is only a half-
TB, so a pair of half-TB to 1 TB ssds for an upgrade is still cost 
effective.  It's not like I'm going multi-TB, which would still be cost 
prohibitive on SSD, particularly since I want raid1, so doubling the 
number of SSDs.

Meanwhile, what I'd do with that raid of four drives (and /did/ do with 
my 4-drive raid back a few storage generations ago, when 300 GB spinning-
rust disks were still quite big, and what I do with my paired SSDs with 
btrfs now) is partition them up and do raids of partitions on each drive.

One thing that's nice about that is that you can actually do a set of 
backups on a second set of partitions on the same physical devices, 
because the physical device redundancy of the raids covers loss of a 
device, and the separate partitions and raids (btrfs raid1 now) cover the 
fat-finger or simple loss of filesystem risk.  A second set of backups to 
separate devices can then be made just in case, and depending on the 
need, swapped out to off-premises or uploaded to the cloud or whatever, 
but you always have the primary backup at hand to boot to or mount if the 
working copy fails, by simply pointing to the backup partitions and 
filesystem instead of the normal working copy.  For root, I even have a 
grub menu item that switches to the backup copy.  For fstab, I have a 
set of stubs that a script assembles into three copies of fstab, each 
swapping working and backup copies as necessary.  /etc/fstab itself is a 
symlink to the working-copy version, and when running from the backup I 
simply repoint it at the version that mounts the backup copies as 
working.  Or I can mount the root filesystem for maintenance from the 
initramfs and switch the fstab symlink from there, before exiting 
maintenance and booting the main system.
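
For anyone wanting to try the fstab symlink switch, here is a minimal 
Python sketch of repointing a symlink in one atomic rename.  The file 
names fstab.working and fstab.backup and the ".new" suffix are 
assumptions for illustration, not the names used on the system described 
above, and the demo runs in a scratch directory rather than /etc.

```python
import os
import tempfile


def switch_symlink(link_path, new_target):
    """Atomically repoint link_path at new_target by renaming over the old link."""
    tmp_path = link_path + ".new"  # hypothetical temporary name
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)
    os.symlink(new_target, tmp_path)
    # rename(2) replaces the old symlink itself; it does not follow it.
    os.replace(tmp_path, link_path)


def demo():
    """Exercise the switch in a scratch directory, not in /etc."""
    with tempfile.TemporaryDirectory() as etc:
        for name in ("fstab.working", "fstab.backup"):  # hypothetical names
            with open(os.path.join(etc, name), "w") as handle:
                handle.write("# assembled from stubs: " + name + "\n")
        link = os.path.join(etc, "fstab")
        os.symlink("fstab.working", link)
        switch_symlink(link, "fstab.backup")
        return os.readlink(link)


if __name__ == "__main__":
    print(demo())
```

Creating the new link under a temporary name and renaming it into place 
means there is never a moment when the link is missing, which matters if 
the switch is interrupted.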

I learned this "split it up" method the hard way back before mdraid had 
write-intent bitmaps, and I had only two much larger raids, working and 
backup, where if one device dropped out and I brought it back in, I had 
to wait way too long for the huge working raid to resync.  Once I split 
things up by function into multiple raids, most of the time only some of 
them were active, and only one or two of the active ones were actually 
being written at the time and so out of sync.  Resyncing those was fast, 
as they were much smaller than the full-system raids I had been using 
previously.

>> *BUT*, and here's the "go further" part, keep in mind that
>> subvolume-read-
>> only is a property, gettable and settable by btrfs property.
>> 
>> So you should be able to unset the read-only property of a subvolume or
>> snapshot, move it, then if desired, set it again.
>> 
>> Of course I wouldn't expect send -p to work with such a snapshot, but
>> send -c /might/ still work, I'm not actually sure but I'd consider it
>> worth trying.  (I'd try -p as well, but expect it to fail...)
> 
> That's an interesting point, thanks for making it.
> In that case, I did have to destroy and recreate the filesystem since
> btrfs check --repair was unable to fix it, but knowing how to reparent
> read only subvolumes may be handy in the future, thanks.

Hopefully you won't end up testing it any time soon, but if you do, 
please confirm whether my suspicions that send -p won't work after 
toggling and reparenting, but send -c still will, are correct.
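
As a rough sketch of the sequence discussed above (unset read-only, move, 
set read-only again), the following Python only builds and prints the 
commands rather than running them.  The paths are hypothetical, and 
whether send -p or send -c still accepts the snapshot afterwards is 
exactly the open question here, so treat this as untested.

```python
import shlex


def reparent_commands(snapshot, new_path):
    """Return the commands to clear ro, move the snapshot, and set ro again."""
    return [
        ["btrfs", "property", "set", "-ts", snapshot, "ro", "false"],
        ["mv", snapshot, new_path],
        ["btrfs", "property", "set", "-ts", new_path, "ro", "true"],
    ]


if __name__ == "__main__":
    # Hypothetical paths; the commands are printed for review, not executed.
    for command in reparent_commands("/mnt/pool/snap_ro", "/mnt/pool/backup/snap_ro"):
        print(shlex.join(command))
```

Printing first leaves room to check the current state with 
"btrfs property get -ts <path> ro" before changing anything.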

(For those who read this out of thread context where I believe I already 
stated it, my own use-case involves neither snapshots nor send-receive.  
But it'd be useful information to confirm, both for others, and in case I 
suddenly find myself with a different use-case for some reason or other.)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean
@ 2017-06-20 14:39 Marc MERLIN
  2017-06-20 15:23 ` Hugo Mills
  0 siblings, 1 reply; 77+ messages in thread
From: Marc MERLIN @ 2017-06-20 14:39 UTC (permalink / raw)
  To: linux-btrfs

My filesystem got remounted read only, and yet after a lengthy
btrfs check --repair, it ran clean.

Any idea what went wrong?
[846332.992285] WARNING: CPU: 4 PID: 4095 at fs/btrfs/free-space-cache.c:1476 tree_insert_offset+0x78/0xb1
[846333.744721] BTRFS critical (device dm-1): unable to add free space :-17
[847312.529660] BTRFS: Transaction aborted (error -17)
[847313.218391] BTRFS: error (device dm-1) in btrfs_run_delayed_refs:2961: errno=-17 Object already exists
[847313.247668] BTRFS info (device dm-1): forced readonly

gargamel:~# btrfs check --repair /dev/mapper/dshelf2
enabling repair mode
Checking filesystem on /dev/mapper/dshelf2
UUID: 85441c59-ad11-4b25-b1fe-974f9e4acede
checking extents
Fixed 0 roots.
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
checking csums
checking root refs
found 5544539336704 bytes used, no error found
total csum bytes: 5344305964
total tree bytes: 70455754752
total fs tree bytes: 58427670528
total extent tree bytes: 5372461056
btree space waste bytes: 10620592981
file data blocks allocated: 7735818444800
 referenced 6155805896704
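
To put the raw byte counts above in perspective, here is a small Python 
helper that converts them to binary units.  The helper name human_bytes 
is made up for this example; the two figures are the "bytes used" and 
"total tree bytes" values from the check output.

```python
def human_bytes(count):
    """Format a byte count using binary (1024-based) units."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    value = float(count)
    for unit in units:
        if value < 1024 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


if __name__ == "__main__":
    print("bytes used:      ", human_bytes(5544539336704))
    print("total tree bytes:", human_bytes(70455754752))
```

So this is roughly 5 TiB of data with about 65 GiB of metadata trees.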


this is how it went read only:
[846332.977964] ------------[ cut here ]------------
[846332.992285] WARNING: CPU: 4 PID: 4095 at fs/btrfs/free-space-cache.c:1476 tree_insert_offset+0x78/0xb1
[846333.402648] CPU: 4 PID: 4095 Comm: btrfs-transacti Tainted: G     U          4.11.3-amd64-preempt-sysrq-20170406 #5
[846333.434917] Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
[846333.463597] Call Trace:
[846333.469942] usb 2-1-port4: device 2-1.4 not suspended yet
[846333.489639]  dump_stack+0x61/0x7d
[846333.500480]  __warn+0xc2/0xdd
[846333.510956]  warn_slowpath_null+0x1d/0x1f
[846333.524103]  tree_insert_offset+0x78/0xb1
[846333.537337]  link_free_space+0x2c/0x41
[846333.549991]  __btrfs_add_free_space+0x89/0x3aa
[846333.564236]  ? kmem_cache_free+0x3d/0x92
[846333.577702]  btrfs_add_free_space+0x1d/0x1f
[846333.591179]  unpin_extent_range+0xf3/0x2b0
[846333.605220]  btrfs_finish_extent_commit+0xda/0x1d4
[846333.621324]  btrfs_commit_transaction+0x629/0x79a
[846333.637205]  ? add_wait_queue+0x44/0x44
[846333.649680]  transaction_kthread+0xe2/0x178
[846333.663201]  ? btrfs_cleanup_transaction+0x3e8/0x3e8
[846333.679033]  kthread+0xfb/0x100
[846333.690261]  ? init_completion+0x24/0x24
[846333.703239]  ? do_fast_syscall_32+0xb7/0xfe
[846333.717649]  ret_from_fork+0x2c/0x40
[846333.729656] ---[ end trace 27aa532d1886e536 ]---
[846333.744721] BTRFS critical (device dm-1): unable to add free space :-17

[847312.529660] BTRFS: Transaction aborted (error -17)
[847312.912784] CPU: 6 PID: 4094 Comm: btrfs-cleaner Tainted: G     U  W       4.11.3-amd64-preempt-sysrq-20170406 #5
[847312.913132] usb 2-1-port4: device 2-1.4 not suspended yet
[847312.962394] Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
[847312.990936] Call Trace:
[847312.999347]  dump_stack+0x61/0x7d
[847313.010383]  __warn+0xc2/0xdd
[847313.020351]  warn_slowpath_fmt+0x5a/0x76
[847313.033274]  btrfs_run_delayed_refs+0xb1/0x1cc
[847313.047655]  btrfs_should_end_transaction+0x50/0x57
[847313.063910]  btrfs_drop_snapshot+0x38a/0x6c4
[847313.078619]  ? btrfs_kill_all_delayed_nodes+0x5f/0xd7
[847313.094916]  ? _raw_spin_lock+0x15/0x17
[847313.108325]  btrfs_clean_one_deleted_snapshot+0xce/0xdc
[847313.125493]  cleaner_kthread+0x91/0x14b
[847313.138228]  ? btrfs_destroy_pinned_extent+0xd2/0xd2
[847313.154308]  kthread+0xfb/0x100
[847313.164900]  ? init_completion+0x24/0x24
[847313.177781]  ? do_fast_syscall_32+0xb7/0xfe
[847313.191490]  ret_from_fork+0x2c/0x40
[847313.203432] ---[ end trace 27aa532d1886e537 ]---
[847313.218391] BTRFS: error (device dm-1) in btrfs_run_delayed_refs:2961: errno=-17 Object already exists
[847313.247668] BTRFS info (device dm-1): forced readonly

[849789.173126] BTRFS error (device dm-1): parent transid verify failed on 1935589703680 wanted 37959 found 3229
[849789.218675] BTRFS error (device dm-1): parent transid verify failed on 1935589703680 wanted 37959 found 3229

[863279.783590] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
[863279.827526] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
[863279.857797] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
[863279.888096] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
[863279.918393] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
[863279.948740] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
[863279.979033] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
[863280.009362] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
[863280.040438] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
[863280.070966] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634

-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  


* Re: 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean
  2017-06-20 14:39 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean Marc MERLIN
@ 2017-06-20 15:23 ` Hugo Mills
  2017-06-20 15:26   ` Marc MERLIN
  0 siblings, 1 reply; 77+ messages in thread
From: Hugo Mills @ 2017-06-20 15:23 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: linux-btrfs

On Tue, Jun 20, 2017 at 07:39:16AM -0700, Marc MERLIN wrote:
> My filesystem got remounted read only, and yet after a lengthy
> btrfs check --repair, it ran clean.
> 
> Any idea what went wrong?
> [846332.992285] WARNING: CPU: 4 PID: 4095 at fs/btrfs/free-space-cache.c:1476 tree_insert_offset+0x78/0xb1
> [846333.744721] BTRFS critical (device dm-1): unable to add free space :-17
> [847312.529660] BTRFS: Transaction aborted (error -17)
> [847313.218391] BTRFS: error (device dm-1) in btrfs_run_delayed_refs:2961: errno=-17 Object already exists

   Error 17 is EEXIST, so I'd guess (and it is a guess) that it's
trying to add a free space cache record for some space that already
has such a record. This might also match with:

[...]
> gargamel:~# btrfs check --repair /dev/mapper/dshelf2
[...]
> cache and super generation don't match, space cache will be invalidated
[...]

   I'd try clearing the cache (mount with -o clear_cache, once), and
then letting it rebuild.
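
   For reference, the errno lookup can be checked with Python's standard
errno module. This is only a sketch; describe_errno is a name made up
for the example. Note that the kernel log prints btrfs's own wording
("Object already exists"), which may differ from the libc text.

```python
import errno
import os


def describe_errno(code):
    """Map a kernel-style negative errno (e.g. -17) to its symbolic name and text."""
    number = abs(code)
    return errno.errorcode.get(number, "UNKNOWN"), os.strerror(number)


if __name__ == "__main__":
    # -17 is the value reported in "unable to add free space :-17".
    print(describe_errno(-17))
```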

   Hugo.

> checking fs roots
> checking csums
> checking root refs
> found 5544539336704 bytes used, no error found
> total csum bytes: 5344305964
> total tree bytes: 70455754752
> total fs tree bytes: 58427670528
> total extent tree bytes: 5372461056
> btree space waste bytes: 10620592981
> file data blocks allocated: 7735818444800
>  referenced 6155805896704
> 
> 
> this is how it went read only:
> [846332.977964] ------------[ cut here ]------------
> [846332.992285] WARNING: CPU: 4 PID: 4095 at fs/btrfs/free-space-cache.c:1476 tree_insert_offset+0x78/0xb1
> [846333.402648] CPU: 4 PID: 4095 Comm: btrfs-transacti Tainted: G     U          4.11.3-amd64-preempt-sysrq-20170406 #5
> [846333.434917] Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
> [846333.463597] Call Trace:
> [846333.469942] usb 2-1-port4: device 2-1.4 not suspended yet
> [846333.489639]  dump_stack+0x61/0x7d
> [846333.500480]  __warn+0xc2/0xdd
> [846333.510956]  warn_slowpath_null+0x1d/0x1f
> [846333.524103]  tree_insert_offset+0x78/0xb1
> [846333.537337]  link_free_space+0x2c/0x41
> [846333.549991]  __btrfs_add_free_space+0x89/0x3aa
> [846333.564236]  ? kmem_cache_free+0x3d/0x92
> [846333.577702]  btrfs_add_free_space+0x1d/0x1f
> [846333.591179]  unpin_extent_range+0xf3/0x2b0
> [846333.605220]  btrfs_finish_extent_commit+0xda/0x1d4
> [846333.621324]  btrfs_commit_transaction+0x629/0x79a
> [846333.637205]  ? add_wait_queue+0x44/0x44
> [846333.649680]  transaction_kthread+0xe2/0x178
> [846333.663201]  ? btrfs_cleanup_transaction+0x3e8/0x3e8
> [846333.679033]  kthread+0xfb/0x100
> [846333.690261]  ? init_completion+0x24/0x24
> [846333.703239]  ? do_fast_syscall_32+0xb7/0xfe
> [846333.717649]  ret_from_fork+0x2c/0x40
> [846333.729656] ---[ end trace 27aa532d1886e536 ]---
> [846333.744721] BTRFS critical (device dm-1): unable to add free space :-17
> 
> [847312.529660] BTRFS: Transaction aborted (error -17)
> [847312.912784] CPU: 6 PID: 4094 Comm: btrfs-cleaner Tainted: G     U  W       4.11.3-amd64-preempt-sysrq-20170406 #5
> [847312.913132] usb 2-1-port4: device 2-1.4 not suspended yet
> [847312.962394] Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
> [847312.990936] Call Trace:
> [847312.999347]  dump_stack+0x61/0x7d
> [847313.010383]  __warn+0xc2/0xdd
> [847313.020351]  warn_slowpath_fmt+0x5a/0x76
> [847313.033274]  btrfs_run_delayed_refs+0xb1/0x1cc
> [847313.047655]  btrfs_should_end_transaction+0x50/0x57
> [847313.063910]  btrfs_drop_snapshot+0x38a/0x6c4
> [847313.078619]  ? btrfs_kill_all_delayed_nodes+0x5f/0xd7
> [847313.094916]  ? _raw_spin_lock+0x15/0x17
> [847313.108325]  btrfs_clean_one_deleted_snapshot+0xce/0xdc
> [847313.125493]  cleaner_kthread+0x91/0x14b
> [847313.138228]  ? btrfs_destroy_pinned_extent+0xd2/0xd2
> [847313.154308]  kthread+0xfb/0x100
> [847313.164900]  ? init_completion+0x24/0x24
> [847313.177781]  ? do_fast_syscall_32+0xb7/0xfe
> [847313.191490]  ret_from_fork+0x2c/0x40
> [847313.203432] ---[ end trace 27aa532d1886e537 ]---
> [847313.218391] BTRFS: error (device dm-1) in btrfs_run_delayed_refs:2961: errno=-17 Object already exists
> [847313.247668] BTRFS info (device dm-1): forced readonly
> 
> [849789.173126] BTRFS error (device dm-1): parent transid verify failed on 1935589703680 wanted 37959 found 3229
> [849789.218675] BTRFS error (device dm-1): parent transid verify failed on 1935589703680 wanted 37959 found 3229
> 
> [863279.783590] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
> [863279.827526] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
> [863279.857797] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
> [863279.888096] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
> [863279.918393] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
> [863279.948740] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
> [863279.979033] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
> [863280.009362] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
> [863280.040438] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
> [863280.070966] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
> 

-- 
Hugo Mills             | I believe that it's closely correlated with the
hugo@... carfax.org.uk | aeroswine coefficient
http://carfax.org.uk/  |
PGP: E2AB1DE4          |                                       Adrian Bridgett


* Re: 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean
  2017-06-20 15:23 ` Hugo Mills
@ 2017-06-20 15:26   ` Marc MERLIN
  2017-06-20 15:36     ` Hugo Mills
  0 siblings, 1 reply; 77+ messages in thread
From: Marc MERLIN @ 2017-06-20 15:26 UTC (permalink / raw)
  To: Hugo Mills, linux-btrfs

On Tue, Jun 20, 2017 at 03:23:54PM +0000, Hugo Mills wrote:
> On Tue, Jun 20, 2017 at 07:39:16AM -0700, Marc MERLIN wrote:
> > My filesystem got remounted read only, and yet after a lengthy
> > btrfs check --repair, it ran clean.
> > 
> > Any idea what went wrong?
> > [846332.992285] WARNING: CPU: 4 PID: 4095 at fs/btrfs/free-space-cache.c:1476 tree_insert_offset+0x78/0xb1
> > [846333.744721] BTRFS critical (device dm-1): unable to add free space :-17
> > [847312.529660] BTRFS: Transaction aborted (error -17)
> > [847313.218391] BTRFS: error (device dm-1) in btrfs_run_delayed_refs:2961: errno=-17 Object already exists
> 
>    Error 17 is EEXIST, so I'd guess (and it is a guess) that it's
> trying to add a free space cache record for some space that already
> has such a record. This might also match with:
 
Thanks for having a look. Is it a bug, or is it a problem with my storage
subsystem?

> [...]
> > gargamel:~# btrfs check --repair /dev/mapper/dshelf2
> [...]
> > cache and super generation don't match, space cache will be invalidated
> [...]
> 
>    I'd try clearing the cache (mount with -o clear_cache, once), and
> then letting it rebuild.

"space cache will be invalidated " => doesn't that mean that my cache was
already cleared by check --repair, or are you saying I need to clear it
again?

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  


* Re: 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean
  2017-06-20 15:26   ` Marc MERLIN
@ 2017-06-20 15:36     ` Hugo Mills
  2017-06-20 15:44       ` Marc MERLIN
  0 siblings, 1 reply; 77+ messages in thread
From: Hugo Mills @ 2017-06-20 15:36 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: linux-btrfs

On Tue, Jun 20, 2017 at 08:26:48AM -0700, Marc MERLIN wrote:
> On Tue, Jun 20, 2017 at 03:23:54PM +0000, Hugo Mills wrote:
> > On Tue, Jun 20, 2017 at 07:39:16AM -0700, Marc MERLIN wrote:
> > > My filesystem got remounted read only, and yet after a lengthy
> > > btrfs check --repair, it ran clean.
> > > 
> > > Any idea what went wrong?
> > > [846332.992285] WARNING: CPU: 4 PID: 4095 at fs/btrfs/free-space-cache.c:1476 tree_insert_offset+0x78/0xb1
> > > [846333.744721] BTRFS critical (device dm-1): unable to add free space :-17
> > > [847312.529660] BTRFS: Transaction aborted (error -17)
> > > [847313.218391] BTRFS: error (device dm-1) in btrfs_run_delayed_refs:2961: errno=-17 Object already exists
> > 
> >    Error 17 is EEXIST, so I'd guess (and it is a guess) that it's
> > trying to add a free space cache record for some space that already
> > has such a record. This might also match with:
>  
> Thanks for having a look. Is it a bug, or is it a problem with my storage
> subsystem?

   Well, I'd say it's probably a problem with some inconsistent data
on the disk. How that data got there is another matter -- it may be
due to a bug which wrote the inconsistent data some time ago, and has
only now been found out.

> > [...]
> > > gargamel:~# btrfs check --repair /dev/mapper/dshelf2
> > [...]
> > > cache and super generation don't match, space cache will be invalidated
> > [...]
> > 
> >    I'd try clearing the cache (mount with -o clear_cache, once), and
> > then letting it rebuild.
> 
> "space cache will be invalidated " => doesn't that mean that my cache was
> already cleared by check --repair, or are you saying I need to clear it
> again?

   I'm never quite sure about that one. :)

   It can't hurt to clear it manually as well.

   Hugo.

-- 
Hugo Mills             | I believe that it's closely correlated with the
hugo@... carfax.org.uk | aeroswine coefficient
http://carfax.org.uk/  |
PGP: E2AB1DE4          |                                       Adrian Bridgett


* Re: 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean
  2017-06-20 15:36     ` Hugo Mills
@ 2017-06-20 15:44       ` Marc MERLIN
  2017-06-20 23:12         ` Marc MERLIN
  2017-06-21  3:26         ` Chris Murphy
  0 siblings, 2 replies; 77+ messages in thread
From: Marc MERLIN @ 2017-06-20 15:44 UTC (permalink / raw)
  To: Hugo Mills, linux-btrfs

On Tue, Jun 20, 2017 at 03:36:01PM +0000, Hugo Mills wrote:
> > Thanks for having a look. Is it a bug, or is it a problem with my storage
> > subsystem?
> 
>    Well, I'd say it's probably a problem with some inconsistent data
> on the disk. How that data got there is another matter -- it may be
> due to a bug which wrote the inconsistent data some time ago, and has
> only now been found out.
 
Understood.

> > "space cache will be invalidated " => doesn't that mean that my cache was
> > already cleared by check --repair, or are you saying I need to clear it
> > again?
> 
>    I'm never quite sure about that one. :)
> 
>    It can't hurt to clear it manually as well.

Sounds good, done.

In the meantime, I ran into this again:
https://bugzilla.kernel.org/show_bug.cgi?id=195863
btrfs check of a big filesystem kills the kernel due to OOM (but btrfs userspace is not OOM killed)

Is it possible at all for btrfs check to notice that it's taking all the
available RAM in kernel space and is about to crash the system, and to
cancel the check before that happens?
I've already confirmed that it doesn't use swap. I've just had to order new
RAM to upgrade my machine from 24GB to 32GB, but 32GB is max for that
hardware, so hopefully the lowmem repair stuff will work before I hit the
32GB limit next time.

In the meantime, though, it really shouldn't crash your system (potentially
causing more damage in the process because you end up with an unclean
shutdown).
Can anyone look at this?

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  


* Re: 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean
  2017-06-20 15:44       ` Marc MERLIN
@ 2017-06-20 23:12         ` Marc MERLIN
  2017-06-20 23:58           ` Marc MERLIN
                             ` (2 more replies)
  2017-06-21  3:26         ` Chris Murphy
  1 sibling, 3 replies; 77+ messages in thread
From: Marc MERLIN @ 2017-06-20 23:12 UTC (permalink / raw)
  To: Hugo Mills, linux-btrfs

On Tue, Jun 20, 2017 at 08:44:29AM -0700, Marc MERLIN wrote:
> On Tue, Jun 20, 2017 at 03:36:01PM +0000, Hugo Mills wrote:
> > > Thanks for having a look. Is it a bug, or is it a problem with my storage
> > > subsystem?
> > 
> >    Well, I'd say it's probably a problem with some inconsistent data
> > on the disk. How that data got there is another matter -- it may be
> > due to a bug which wrote the inconsistent data some time ago, and has
> > only now been found out.
>  
> Understood.
> 
> > > "space cache will be invalidated " => doesn't that mean that my cache was
> > > already cleared by check --repair, or are you saying I need to clear it
> > > again?
> > 
> >    I'm never quite sure about that one. :)
> > 
> >    It can't hurt to clear it manually as well.
> 
> Sounds good, done.
 
Except it didn't help :(
It worked for a while, and failed again.

It looks like I'm hitting a persistent bug :(

[   86.383988] BTRFS: device label dshelf2 devid 1 transid 37975 /dev/mapper/dshelf2
[   98.232529] BTRFS info (device dm-1): use lzo compression
[   98.251982] BTRFS info (device dm-1): disk space caching is enabled
[   98.274847] BTRFS info (device dm-1): has skinny extents
[  104.171597] BTRFS info (device dm-1): detected SSD devices, enabling SSD mode
[  165.429894] BTRFS error (device dm-1): Duplicate entries in free space cache, dumping
[  165.455673] BTRFS warning (device dm-1): failed to load free space cache for block group 2039601954816, rebuilding it now
[  234.221435] BTRFS warning (device dm-1): block group 2837392130048 has wrong amount of free space
[  234.249264] BTRFS warning (device dm-1): failed to load free space cache for block group 2837392130048, rebuilding it now
[  234.636396] BTRFS warning (device dm-1): block group 2885173641216 has wrong amount of free space
[  234.664015] BTRFS warning (device dm-1): failed to load free space cache for block group 2885173641216, rebuilding it now
[  242.042940] BTRFS warning (device dm-1): block group 3116565004288 has wrong amount of free space
[  242.071207] BTRFS warning (device dm-1): failed to load free space cache for block group 3116565004288, rebuilding it now
[  273.910918] BTRFS warning (device dm-1): block group 3209980542976 has wrong amount of free space
[  273.937625] BTRFS warning (device dm-1): failed to load free space cache for block group 3209980542976, rebuilding it now
[  298.578615] BTRFS warning (device dm-1): block group 2305889927168 has wrong amount of free space
[  298.605250] BTRFS warning (device dm-1): failed to load free space cache for block group 2305889927168, rebuilding it now
[  873.265687] BTRFS: Transaction aborted (error -17)
[  873.948245] BTRFS: error (device dm-1) in btrfs_run_delayed_refs:2961: errno=-17 Object already exists
[  873.978884] BTRFS info (device dm-1): forced readonly

Given that check --repair ran clean when I ran it yesterday after this first happened,
and I then ran  mount -o clear_cache , the cache got rebuilt, and I got the problem again, 
this is not looking good, seems like a persistent bug :-/

I'm now going to remount this with nospace_cache to see if your guess about
space_cache was correct.
Other suggestions also welcome :)
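
In case it helps anyone triaging similar logs, here is a small Python
sketch that pulls the affected block groups and the abort error out of
dmesg text like the above. The regexes only match the message wording
shown in this mail, and the function and constant names are made up for
the example.

```python
import re

# Patterns taken from the message wording in the log excerpt above.
REBUILD_RE = re.compile(
    r"failed to load free space cache for block group (\d+), rebuilding it now"
)
ABORT_RE = re.compile(r"Transaction aborted \(error (-?\d+)\)")


def summarize(log_text):
    """Collect block groups whose cache was rebuilt and any abort error codes."""
    groups = [int(match) for match in REBUILD_RE.findall(log_text)]
    aborts = [int(match) for match in ABORT_RE.findall(log_text)]
    return {"rebuilt_block_groups": groups, "abort_errors": aborts}


SAMPLE = """\
[  165.455673] BTRFS warning (device dm-1): failed to load free space cache for block group 2039601954816, rebuilding it now
[  234.249264] BTRFS warning (device dm-1): failed to load free space cache for block group 2837392130048, rebuilding it now
[  873.265687] BTRFS: Transaction aborted (error -17)
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Comparing the block group list across boots would show whether the same
groups keep failing to load or different ones each time.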

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  


* Re: 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean
  2017-06-20 23:12         ` Marc MERLIN
@ 2017-06-20 23:58           ` Marc MERLIN
  2017-06-21  3:31           ` Chris Murphy
  2017-06-21 12:04           ` 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean Duncan
  2 siblings, 0 replies; 77+ messages in thread
From: Marc MERLIN @ 2017-06-20 23:58 UTC (permalink / raw)
  To: Hugo Mills, linux-btrfs

On Tue, Jun 20, 2017 at 04:12:03PM -0700, Marc MERLIN wrote:
> Given that check --repair ran clean when I ran it yesterday after this first happened,
> and I then ran  mount -o clear_cache , the cache got rebuilt, and I got the problem again, 
> this is not looking good, seems like a persistent bug :-/
> 
> I'm now going to remount this with nospace_cache to see if your guess about
> space_cache was correct.

Now, it seems that disabling the cache is causing some serious hangs:
[ 2055.473113] INFO: task kworker/u16:17:7579 blocked for more than 120 seconds.
[ 2055.496148]       Tainted: G     U          4.11.6-amd64-preempt-sysrq-20170406 #6
[ 2055.520611] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2055.545675] kworker/u16:17  D    0  7579      2 0x00000080
[ 2055.563626] Workqueue: writeback wb_workfn (flush-btrfs-4)
[ 2055.581458] Call Trace:
[ 2055.590154]  __schedule+0x4ef/0x627
[ 2055.602830]  schedule+0x89/0x9a
[ 2055.613618]  io_schedule+0x16/0x38
[ 2055.625324]  wait_on_page_bit_common+0xd8/0x151
[ 2055.640413]  ? inode_to_bdi+0x35/0x35
[ 2055.653701]  __lock_page+0x40/0x42
[ 2055.665431]  lock_page+0x19/0x1c
[ 2055.676315]  extent_write_cache_pages.constprop.31+0x173/0x368
[ 2055.695049]  ? update_load_avg+0x227/0x3c6
[ 2055.708592]  ? update_load_avg+0x3b1/0x3c6
[ 2055.722340]  ? list_add+0x1a/0x34
[ 2055.733520]  ? cfs_rq_throttled.isra.24+0xd/0x1d
[ 2055.748503]  ? update_cfs_shares+0x2e/0xcf
[ 2055.761891]  extent_writepages+0x5b/0x80
[ 2055.774854]  ? __percpu_counter_compare+0x29/0x72
[ 2055.790054]  ? insert_reserved_file_extent.constprop.41+0x28e/0x28e
[ 2055.809869]  btrfs_writepages+0x28/0x2a
[ 2055.822516]  do_writepages+0x20/0x29
[ 2055.834251]  __writeback_single_inode+0x8a/0x328
[ 2055.849159]  ? inode_cgwb_enabled+0xd/0x3b
[ 2055.862521]  writeback_sb_inodes+0x22e/0x400
[ 2055.876310]  __writeback_inodes_wb+0x6e/0xb0
[ 2055.890057]  wb_writeback+0x163/0x2ca
[ 2055.902436]  wb_workfn+0x1f7/0x2bf
[ 2055.913520]  ? wb_workfn+0x1f7/0x2bf
[ 2055.925090]  ? __switch_to+0x2c8/0x45f
[ 2055.937184]  process_one_work+0x193/0x2b0
[ 2055.950034]  ? rescuer_thread+0x2b1/0x2b1
[ 2055.962833]  worker_thread+0x1e9/0x2c1
[ 2055.974826]  ? rescuer_thread+0x2b1/0x2b1
[ 2055.988016]  kthread+0xfb/0x100
[ 2055.998183]  ? init_completion+0x24/0x24
[ 2056.010902]  ? do_syscall_64+0x77/0x7d
[ 2056.022802]  ret_from_fork+0x2c/0x40
[ 2056.034224] INFO: task rsync:27554 blocked for more than 120 seconds.
[ 2056.054213]       Tainted: G     U          4.11.6-amd64-preempt-sysrq-20170406 #6
[ 2056.077611] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2056.101705] rsync           D    0 27554  27526 0x20020080
[ 2056.119102] Call Trace:
[ 2056.127019]  __schedule+0x4ef/0x627
[ 2056.138385]  schedule+0x89/0x9a
[ 2056.148616]  io_schedule+0x16/0x38
[ 2056.159682]  wait_on_page_bit_common+0xd8/0x151
[ 2056.173787]  ? inode_to_bdi+0x35/0x35
[ 2056.185336]  __lock_page+0x40/0x42
[ 2056.196176]  lock_page+0x19/0x1c
[ 2056.206420]  extent_write_cache_pages.constprop.31+0x173/0x368
[ 2056.224786]  ? _raw_read_unlock+0xe/0x1e
[ 2056.237221]  ? btrfs_set_lock_blocking_rw+0x9a/0x9d
[ 2056.252388]  extent_writepages+0x5b/0x80
[ 2056.264687]  ? insert_reserved_file_extent.constprop.41+0x28e/0x28e
[ 2056.284051]  btrfs_writepages+0x28/0x2a
[ 2056.296117]  do_writepages+0x20/0x29
[ 2056.307426]  __filemap_fdatawrite_range+0x97/0xc3
[ 2056.322374]  filemap_flush+0x1c/0x1e
[ 2056.333627]  btrfs_rename2+0x894/0xf6f
[ 2056.345376]  ? capable_wrt_inode_uidgid+0x3f/0x4e
[ 2056.359977]  ? generic_permission+0x11e/0x175
[ 2056.373719]  vfs_rename+0x234/0x391
[ 2056.384805]  ? vfs_rename+0x234/0x391
[ 2056.396341]  SYSC_renameat2+0x327/0x448
[ 2056.408349]  SyS_rename+0x1e/0x20
[ 2056.418806]  do_fast_syscall_32+0xb7/0xfe
[ 2056.431325]  entry_SYSENTER_compat+0x4c/0x5b
[ 2056.444642] RIP: 0023:0xf76feb39
[ 2056.454861] RSP: 002b:00000000ffe177bc EFLAGS: 00000292 ORIG_RAX: 0000000000000026
[ 2056.478081] RAX: ffffffffffffffda RBX: 00000000ffe18890 RCX: 00000000ffe1a890
[ 2056.500019] RDX: 0000000000000001 RSI: 00000000ffe1a890 RDI: 0000000000000003
[ 2056.521948] RBP: 00000000ffe177f8 R08: 0000000000000000 R09: 0000000000000000
[ 2056.543858] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 2056.565809] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
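For anyone collecting these reports from a longer dmesg, here is a rough sketch (not a kernel or btrfs-progs tool) that pulls the task name, PID and timeout out of each "blocked for more than" line. The function name and regex are illustrative assumptions, based only on the message format shown in the log above.

```python
import re

# Matches hung-task reports such as:
#   INFO: task kworker/u16:17:7579 blocked for more than 120 seconds.
# The task name may itself contain colons, so the PID is taken as the
# last colon-separated field before " blocked".
HUNG = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)

def hung_tasks(dmesg_text):
    """Return a list of (comm, pid, seconds) for each hung-task report."""
    return [
        (m["comm"], int(m["pid"]), int(m["secs"]))
        for m in HUNG.finditer(dmesg_text)
    ]
```

Feeding it the two reports above should list the kworker writeback thread and the rsync process, both stuck waiting on a page lock under extent_write_cache_pages.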


Marc

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean
  2017-06-20 15:44       ` Marc MERLIN
  2017-06-20 23:12         ` Marc MERLIN
@ 2017-06-21  3:26         ` Chris Murphy
  2017-06-21  4:06           ` Marc MERLIN
  1 sibling, 1 reply; 77+ messages in thread
From: Chris Murphy @ 2017-06-21  3:26 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: Hugo Mills, Btrfs BTRFS

On Tue, Jun 20, 2017 at 9:44 AM, Marc MERLIN <marc@merlins.org> wrote:

> In the meantime, I ran into this again:
> https://bugzilla.kernel.org/show_bug.cgi?id=195863
> btrfs check of a big filesystem kills the kernel due to OOM (but btrfs userspace is not OOM killed)
>
> Is it achievable at all for btrfs check to realize that it's taking all the
> available RAM in kernel space, is about to crash the system, and cancel the
> check before the system crashes?
> I've already confirmed that it doesn't use swap. I've just had to order new
> RAM to upgrade my machine from 24GB to 32GB, but 32GB is max for that
> hardware, so hopefully the lowmem repair stuff will work before I hit the
> 32GB limit next time.

Right now Btrfs doesn't scale if you have to repair it, because check on
large volumes runs into this memory problem; that is one of the reasons
for the lowmem mode.

It's a separate bug that it OOMs even with swap available. I don't know
why it won't use swap; it should be up to kernel memory management to
deal with this, and I know this works with xfs_repair. I don't know if
the idea is that normal mode will go away in favor of lowmem mode, or if
there are fixes still planned for normal mode. If normal mode is going to
stick around, it needs to be able to use swap, and the same goes for
lowmem mode. Running into a total inability to --repair isn't OK.


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean
  2017-06-20 23:12         ` Marc MERLIN
  2017-06-20 23:58           ` Marc MERLIN
@ 2017-06-21  3:31           ` Chris Murphy
  2017-06-21  3:43             ` Marc MERLIN
  2017-06-21 12:04           ` 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean Duncan
  2 siblings, 1 reply; 77+ messages in thread
From: Chris Murphy @ 2017-06-21  3:31 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: Hugo Mills, Btrfs BTRFS

On Tue, Jun 20, 2017 at 5:12 PM, Marc MERLIN <marc@merlins.org> wrote:

> I'm now going to remount this with nospace_cache to see if your guess about
> space_cache was correct.
> Other suggestions also welcome :)

What results do you get with lowmem mode? It won't repair without
additional patches, but might give a dev a clue what's going on. I
regularly see normal mode check finds no problems, and lowmem mode
finds problems. Lowmem mode is a total rewrite so it's a different
implementation and can find things normal mode won't.


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean
  2017-06-21  3:31           ` Chris Murphy
@ 2017-06-21  3:43             ` Marc MERLIN
  2017-06-21 15:13               ` How to fix errors that check --mode lomem finds, but --mode normal doesn't? Marc MERLIN
  0 siblings, 1 reply; 77+ messages in thread
From: Marc MERLIN @ 2017-06-21  3:43 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Hugo Mills, Btrfs BTRFS

On Tue, Jun 20, 2017 at 09:31:42PM -0600, Chris Murphy wrote:
> On Tue, Jun 20, 2017 at 5:12 PM, Marc MERLIN <marc@merlins.org> wrote:
> 
> > I'm now going to remount this with nospace_cache to see if your guess about
> > space_cache was correct.
> > Other suggestions also welcome :)
> 
> What results do you get with lowmem mode? It won't repair without
> additional patches, but might give a dev a clue what's going on. I
> regularly see normal mode check finds no problems, and lowmem mode
> finds problems. Lowmem mode is a total rewrite so it's a different
> implementation and can find things normal mode won't.

Oh, I kind of forgot that lowmem mode looked for more things than regular
mode.
I will run this tonight and see what it says.

Marc

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean
  2017-06-21  3:26         ` Chris Murphy
@ 2017-06-21  4:06           ` Marc MERLIN
  0 siblings, 0 replies; 77+ messages in thread
From: Marc MERLIN @ 2017-06-21  4:06 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Hugo Mills, Btrfs BTRFS

On Tue, Jun 20, 2017 at 09:26:27PM -0600, Chris Murphy wrote:
> Right now Btrfs isn't scalable if you have to repair it because large
> volumes run into this problem; one of the reasons for the lowmem mode.
> 
> It's a separate bug that it OOMs even with swap, I don't know why it
> won't use that, it should be up to kernel memory management to deal

The thing is that it doesn't even get OOM'ed.
I didn't look at the code, but I'm assuming it must be using kernel RAM
instead of user space RAM, which is why it can't be OOM'ed and why it gets
the kernel to deadlock.

If that is the case, then the user space code should monitor kernel space
usage and cancel the check if it's about to run out of usable RAM (better
than deadlocking the system).
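To make the idea concrete, here is a minimal sketch of such a watchdog done as an external wrapper in Python, not as anything that exists in btrfs-progs. It polls MemAvailable in /proc/meminfo and terminates the child process once it drops below a floor. The function names and the 1 GiB default are assumptions for illustration, and I have not verified that MemAvailable actually tracks the allocations that cause this particular hang.

```python
import subprocess
import time

def mem_available_kib(meminfo_text):
    """Return the MemAvailable value (KiB) from /proc/meminfo text, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1])
    return None

def should_abort(meminfo_text, floor_kib):
    """True when MemAvailable is present and below the floor."""
    avail = mem_available_kib(meminfo_text)
    return avail is not None and avail < floor_kib

def run_with_memory_floor(argv, floor_kib=1024 * 1024, poll_seconds=5.0):
    """Run argv, terminating it if available memory falls under floor_kib."""
    proc = subprocess.Popen(argv)
    while proc.poll() is None:
        with open("/proc/meminfo") as f:
            if should_abort(f.read(), floor_kib):
                proc.terminate()  # SIGTERM; the check is abandoned, not paused
                proc.wait()
                break
        time.sleep(poll_seconds)
    return proc.returncode
```

It would be called as, for example, run_with_memory_floor(["btrfs", "check", "--mode", "lowmem", "/dev/mapper/dshelf2"]). A poll interval of a few seconds may well be too slow if memory is being consumed quickly, so treat the numbers as placeholders.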

Marc

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean
  2017-06-20 23:12         ` Marc MERLIN
  2017-06-20 23:58           ` Marc MERLIN
  2017-06-21  3:31           ` Chris Murphy
@ 2017-06-21 12:04           ` Duncan
  2 siblings, 0 replies; 77+ messages in thread
From: Duncan @ 2017-06-21 12:04 UTC (permalink / raw)
  To: linux-btrfs

Marc MERLIN posted on Tue, 20 Jun 2017 16:12:03 -0700 as excerpted:

> On Tue, Jun 20, 2017 at 08:44:29AM -0700, Marc MERLIN wrote:
>> On Tue, Jun 20, 2017 at 03:36:01PM +0000, Hugo Mills wrote:
>> 
>>>> "space cache will be invalidated " => doesn't that mean that my
>>>> cache was already cleared by check --repair, or are you saying I
>>>> need to clear it again?
>>> 
>>>    I'm never quite sure about that one. :)
>>> 
>>>    It can't hurt to clear it manually as well.
>> 
>> Sounds good, done.
>  
> Except it didn't help :(
> It worked for a while, and failed again.
> 
> It looks like I'm hitting a persistent bug :(

[Omitted free space cache dmesg errors]

> Given that check --repair ran clean when I ran it yesterday after this
> first happened, and I then ran  mount -o clear_cache , the cache got
> rebuilt, and I got the problem again, this is not looking good, seems
> like a persistent bug :-/

Keep in mind this quote from a recent (I'm quoting -progs 4.11) btrfs-
check manpage (reformatted for posting):

>>>>> 

--clear-space-cache v1|v2

completely wipe all free space cache of given type

For free space cache v1, the clear_cache kernel mount option only 
rebuilds the free space cache for block groups that are modified while the
filesystem is mounted with that option. Thus, using this option with v1 
makes it possible to actually clear the entire free space cache.

For free space cache v2, the clear_cache kernel mount option does destroy 
the entire free space cache. This option with v2 provides an alternative
method of clearing the free space cache that doesn’t require mounting the 
filesystem.

<<<<<

Given the dmesg, it seems you're still running the v1 space cache, not the
v2/free space tree (which is fine, I'm conservative enough not to have
switched yet either).  So try the check option instead of the mount option:
with v1, the mount option only rebuilds the cache for block groups modified
while it is active, so it might simply not have caught all the badness.
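As a rough sketch of what that looks like in practice (Python here, purely illustrative): the --clear-space-cache v1 option is the one quoted from the manpage above, while the helper names and the /proc/mounts test are my own assumptions. btrfs check is meant to run on an unmounted filesystem, so the wrapper refuses to go ahead if the device shows up as a mount source.

```python
import subprocess

def is_mounted(device, mounts_text):
    """True if `device` is the source (first column) of any /proc/mounts line."""
    return any(line.split()[:1] == [device] for line in mounts_text.splitlines())

def clear_v1_cache_cmd(device):
    """Build the argv for wiping the whole v1 free space cache."""
    return ["btrfs", "check", "--clear-space-cache", "v1", device]

def clear_v1_cache(device):
    """Run the clear only when the device is not currently mounted."""
    with open("/proc/mounts") as f:
        if is_mounted(device, f.read()):
            raise RuntimeError(f"{device} is mounted; unmount it first")
    return subprocess.run(clear_v1_cache_cmd(device), check=True)
```

Note that the name in /proc/mounts may differ from the path you pass in (dm-N versus /dev/mapper/...), so the mounted test is only a convenience, not a guarantee.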

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 77+ messages in thread

* How to fix errors that check --mode lomem finds, but --mode normal doesn't?
  2017-06-21  3:43             ` Marc MERLIN
@ 2017-06-21 15:13               ` Marc MERLIN
  2017-06-21 23:22                 ` Chris Murphy
  2017-06-22  2:22                 ` Qu Wenruo
  0 siblings, 2 replies; 77+ messages in thread
From: Marc MERLIN @ 2017-06-21 15:13 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Hugo Mills, Btrfs BTRFS

On Tue, Jun 20, 2017 at 08:43:52PM -0700, Marc MERLIN wrote:
> On Tue, Jun 20, 2017 at 09:31:42PM -0600, Chris Murphy wrote:
> > On Tue, Jun 20, 2017 at 5:12 PM, Marc MERLIN <marc@merlins.org> wrote:
> > 
> > > I'm now going to remount this with nospace_cache to see if your guess about
> > > space_cache was correct.
> > > Other suggestions also welcome :)
> > 
> > What results do you get with lowmem mode? It won't repair without
> > additional patches, but might give a dev a clue what's going on. I
> > regularly see normal mode check finds no problems, and lowmem mode
> > finds problems. Lowmem mode is a total rewrite so it's a different
> > implementation and can find things normal mode won't.
> 
> Oh, I kind of forgot that lowmem mode looked for more things than regular
> mode.
> I will run this tonight and see what it says.
 
It's probably still a ways from being finished given how slow lowmem is in
comparison, but sadly it found a bunch of problems which regular mode didn't
find.

I'm pretty bummed. I just spent way too long recreating this filesystem and
the multiple btrfs send/receive relationships from other machines. Took a bit
over a week :(

It looks like the errors are not major (especially if the regular mode
doesn't even see them), but without lowmem --repair, I'm kind of screwed.

I'm wondering if I could/should leave those errors unfixed until lowmem --repair
finally happens, or whether I'm looking at spending another week rebuilding
this filesystem :-/


gargamel:~# btrfs check -p --mode lowmem  /dev/mapper/dshelf2
Checking filesystem on /dev/mapper/dshelf2
UUID: 85441c59-ad11-4b25-b1fe-974f9e4acede
ERROR: extent[3886187384832, 81920] referencer count mismatch (root: 11930, owner: 375444, offset: 1851654144) wanted: 1, have: 4
ERROR: extent[3886189391872, 122880] referencer count mismatch (root: 11712, owner: 863395, offset: 79659008) wanted: 1, have: 2
ERROR: extent[3933249708032, 69632] referencer count mismatch (root: 12424, owner: 6945, offset: 2083389440) wanted: 1, have: 2
ERROR: extent[3933249708032, 69632] referencer count mismatch (root: 12172, owner: 6945, offset: 2083389440) wanted: 1, have: 2
ERROR: extent[4571729862656, 876544] referencer count mismatch (root: 11058, owner: 375442, offset: 907706368) wanted: 6, have: 21
ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11712, owner: 375444, offset: 1848672256) wanted: 3, have: 5
ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11276, owner: 375444, offset: 1848672256) wanted: 3, have: 5
ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11058, owner: 375444, offset: 1848672256) wanted: 3, have: 5
ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11494, owner: 375444, offset: 1848672256) wanted: 3, have: 5
ERROR: extent[4658555617280, 122880] referencer count mismatch (root: 11712, owner: 375444, offset: 1848705024) wanted: 1, have: 3
ERROR: extent[4677858123776, 417792] referencer count mismatch (root: 11712, owner: 863395, offset: 79523840) wanted: 1, have: 3
ERROR: extent[4677858123776, 417792] referencer count mismatch (root: 11930, owner: 863395, offset: 79523840) wanted: 1, have: 3
ERROR: extent[4698380947456, 409600] referencer count mismatch (root: 11930, owner: 375444, offset: 1851596800) wanted: 3, have: 4
ERROR: extent[4720470421504, 667648] referencer count mismatch (root: 11058, owner: 3463478, offset: 2334720) wanted: 2, have: 10
ERROR: extent[4783941246976, 65536] referencer count mismatch (root: 9365, owner: 24493, offset: 4562944) wanted: 2, have: 3
ERROR: extent[5077564477440, 106496] referencer count mismatch (root: 9370, owner: 1602694, offset: 734756864) wanted: 1, have: 2
ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11712, owner: 375441, offset: 910999552) wanted: 16, have: 1864
ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11276, owner: 375441, offset: 910999552) wanted: 867, have: 1865
ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11058, owner: 375441, offset: 910999552) wanted: 126, have: 1872
ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11494, owner: 375441, offset: 910999552) wanted: 866, have: 1864
ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11930, owner: 375441, offset: 910999552) wanted: 861, have: 1859
ERROR: extent[5136649891840, 66781184] referencer count mismatch (root: 11058, owner: 375442, offset: 192659456) wanted: 5, have: 19
ERROR: extent[5136879157248, 134217728] referencer count mismatch (root: 11930, owner: 375442, offset: 394543104) wanted: 10, have: 33
ERROR: extent[5137380671488, 80945152] referencer count mismatch (root: 11058, owner: 375442, offset: 875233280) wanted: 1, have: 21
ERROR: extent[5137380671488, 80945152] referencer count mismatch (root: 11930, owner: 375442, offset: 875233280) wanted: 11, have: 21
ERROR: extent[5138641395712, 524288] referencer count mismatch (root: 11494, owner: 375445, offset: 39845888) wanted: 1, have: 3
ERROR: extent[5190245990400, 53248] referencer count mismatch (root: 11712, owner: 863395, offset: 51118080) wanted: 1, have: 4
ERROR: extent[5190245990400, 53248] referencer count mismatch (root: 11276, owner: 863395, offset: 51118080) wanted: 1, have: 4
ERROR: extent[5190245990400, 53248] referencer count mismatch (root: 11494, owner: 863395, offset: 51118080) wanted: 1, have: 4
ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11276, owner: 863395, offset: 74952704) wanted: 3, have: 5
ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11058, owner: 863395, offset: 74952704) wanted: 3, have: 5
ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11494, owner: 863395, offset: 74952704) wanted: 3, have: 5
ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11930, owner: 863395, offset: 74952704) wanted: 3, have: 5
ERROR: extent[5190275895296, 77824] referencer count mismatch (root: 11712, owner: 863395, offset: 77705216) wanted: 1, have: 6
ERROR: extent[5190275895296, 77824] referencer count mismatch (root: 11276, owner: 863395, offset: 77705216) wanted: 5, have: 6
ERROR: extent[5190275895296, 77824] referencer count mismatch (root: 11058, owner: 863395, offset: 77705216) wanted: 5, have: 6
ERROR: extent[5427326599168, 61440] referencer count mismatch (root: 9370, owner: 1225712, offset: 29753344) wanted: 1, have: 5
ERROR: extent[5456623030272, 24576] referencer count mismatch (root: 11058, owner: 2278892, offset: 786432) wanted: 2, have: 3
ERROR: extent[5851251269632, 134217728] referencer count mismatch (root: 9370, owner: 1602695, offset: 534061056) wanted: 3, have: 4
ERROR: errors found in extent allocation tree or chunk allocation
cache and super generation don't match, space cache will be invalidated
ERROR: root 3857 EXTENT_DATA[108864 4096] interrupt
ERROR: root 3857 EXTENT_DATA[133050 4096] interrupt
ERROR: root 3857 EXTENT_DATA[388570 4096] interrupt
ERROR: root 3857 EXTENT_DATA[729583 4096] interrupt
ERROR: root 3857 EXTENT_DATA[984778 4096] interrupt
ERROR: root 3857 EXTENT_DATA[997394 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1002954 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1007491 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1111463 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1111506 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1111536 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1111536 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1134500 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1136498 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1175965 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1185977 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1190919 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1201340 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1230370 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1230530 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1230530 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1235960 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1248784 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1271827 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1295242 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1406074 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1410780 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1412938 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1413532 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1413532 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1421245 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1423365 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1425985 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1429229 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1430615 4096] interrupt
ERROR: root 3857 EXTENT_DATA[1443769 4096] interrupt
ERROR: root 3860 EXTENT_DATA[599089 4096] interrupt

(not finished, still going on)
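Since the output is long and repetitive, here is a small sketch for summarizing the "referencer count mismatch" lines, for instance to see which roots (subvolumes/snapshots) are involved and how many distinct extents are affected. The regex follows the message format shown above; the field labels (bytenr, length) and function names are my own, not anything from btrfs-progs.

```python
import re
from collections import Counter

# One lowmem-mode report per line, e.g.:
# ERROR: extent[3886187384832, 81920] referencer count mismatch
#   (root: 11930, owner: 375444, offset: 1851654144) wanted: 1, have: 4
MISMATCH = re.compile(
    r"ERROR: extent\[(?P<bytenr>\d+), (?P<length>\d+)\] referencer count mismatch "
    r"\(root: (?P<root>\d+), owner: (?P<owner>\d+), offset: (?P<offset>\d+)\) "
    r"wanted: (?P<wanted>\d+), have: (?P<have>\d+)"
)

def parse_mismatches(text):
    """Return one dict of integer fields per mismatch line found in text."""
    return [
        {key: int(value) for key, value in m.groupdict().items()}
        for m in MISMATCH.finditer(text)
    ]

def summarize(records):
    """Return (errors per root as a Counter, number of distinct extents)."""
    per_root = Counter(r["root"] for r in records)
    distinct_extents = len({r["bytenr"] for r in records})
    return per_root, distinct_extents
```

Saving the check output to a file and passing its contents to parse_mismatches() gives a list that can be grouped however is useful; the same extent showing up under several roots is what you would expect for data shared between snapshots.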

Marc

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't?
  2017-06-21 15:13               ` How to fix errors that check --mode lomem finds, but --mode normal doesn't? Marc MERLIN
@ 2017-06-21 23:22                 ` Chris Murphy
  2017-06-22  0:48                   ` Marc MERLIN
  2017-06-22  2:22                 ` Qu Wenruo
  1 sibling, 1 reply; 77+ messages in thread
From: Chris Murphy @ 2017-06-21 23:22 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: Chris Murphy, Hugo Mills, Btrfs BTRFS, Qu Wenruo

On Wed, Jun 21, 2017 at 9:13 AM, Marc MERLIN <marc@merlins.org> wrote:
> On Tue, Jun 20, 2017 at 08:43:52PM -0700, Marc MERLIN wrote:
>> On Tue, Jun 20, 2017 at 09:31:42PM -0600, Chris Murphy wrote:
>> > On Tue, Jun 20, 2017 at 5:12 PM, Marc MERLIN <marc@merlins.org> wrote:
>> >
>> > > I'm now going to remount this with nospace_cache to see if your guess about
>> > > space_cache was correct.
>> > > Other suggestions also welcome :)
>> >
>> > What results do you get with lowmem mode? It won't repair without
>> > additional patches, but might give a dev a clue what's going on. I
>> > regularly see normal mode check finds no problems, and lowmem mode
>> > finds problems. Lowmem mode is a total rewrite so it's a different
>> > implementation and can find things normal mode won't.
>>
>> Oh, I kind of forgot that lowmem mode looked for more things than regular
>> mode.
>> I will run this tonight and see what it says.
>
> It's probably still a ways from being finished given how slow lowmem is in
> comparison, but sadly it found a bunch of problems which regular mode didn't
> find.
>
> I'm pretty bummed. I just spent way too long recreating this filesystem and
> the multiple btrfs send/receive relationships from other machines. Took a bit
> over a week :(
>
> It looks like the errors are not major (especially if the regular mode
> doesn't even see them), but without lowmem --repair, I'm kind of screwed.
>
> I'm wondering if I could/should leave those errors unfixed until lowmem --repair
> finally happens, or whether I'm looking at spending another week rebuilding
> this filesystem :-/
>
>
> gargamel:~# btrfs check -p --mode lowmem  /dev/mapper/dshelf2
> Checking filesystem on /dev/mapper/dshelf2
> UUID: 85441c59-ad11-4b25-b1fe-974f9e4acede
> [lowmem check output snipped: referencer count mismatch and EXTENT_DATA
> interrupt errors, identical to the parent message]
>
> (not finished, still going on)


I don't know what it means. Maybe Qu has some idea. He might want a
btrfs-image  of this file system to see if it's a bug. There are still
some bugs found with lowmem mode, so these could be bogus messages.
But the file system clearly has problems, the question is why does
such a new file system have these kinds of problems that can't be
fixed by normal repair because they aren't even being detected; or
maybe there is no problem on disk per se, the problem might be a bug.

In which case, off chance going back to a substantially older kernel
might help. Maybe the latest 4.9 series kernel?

-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't?
  2017-06-21 23:22                 ` Chris Murphy
@ 2017-06-22  0:48                   ` Marc MERLIN
  0 siblings, 0 replies; 77+ messages in thread
From: Marc MERLIN @ 2017-06-22  0:48 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Hugo Mills, Btrfs BTRFS, Qu Wenruo

On Wed, Jun 21, 2017 at 05:22:15PM -0600, Chris Murphy wrote:
> I don't know what it means. Maybe Qu has some idea. He might want a
> btrfs-image  of this file system to see if it's a bug. There are still
> some bugs found with lowmem mode, so these could be bogus messages.
> But the file system clearly has problems, the question is why does
> such a new file system have these kinds of problems that can't be
> fixed by normal repair because they aren't even being detected; or
> maybe there is no problem on disk per se, the problem might be a bug.
 
Yes, that's indeed the question I was asking myself too :)
Now, I did have a couple of drives that got kicked out of a (mdadm, not
btrfs) raid array, causing the array to go away while btrfs was trying to
write to it, but my understanding of btrfs write journalling is that the new
data that was being written should have been discarded and I should have
ended up at the previous good state.
AFAIK, I'm pretty sure I didn't get any block layer corruption this time, I
just got a drive effectively pulled from a running array (well 2, one went
to degraded, and the 2nd one killed the array. I re-added them carefully and
correctly in the right order and mdadm rebuilt what it needed using the
write-intent bitmap)

For what it's worth, I've had no end of trouble with SATA/SAS cards and their
4-in-1 SATA cables:
https://www.amazon.com/gp/product/B0050SLTPC/ref=oh_aui_search_detailpage?ie=UTF8&psc=1
https://www.amazon.com/gp/product/B013G4EMH8/ref=oh_aui_search_detailpage?ie=UTF8&psc=1

I have it stable now, but those cables are super sensitive and have caused
drives to get kicked out if they weren't air canned first, and plugged in
just right :-/

> In which case, off chance going back to a substantially older kernel
> might help. Maybe the latest 4.9 series kernel?

If there is reasonable evidence that it will help, I can give it a shot.

Qu, or anyone: btrfs-image is going to take a long time (maybe a day or
more), since I have to use at least -s before I can share the image, and
-ss is even slower from what I remember.

So please suggest the fastest image options I can use. It's a quad
core HT machine, so should I use
btrfs-image -c0 -t8 -s /dev/ image
(I'm assuming -c9 will not be faster and that -ss will be even slower)

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't?
  2017-06-21 15:13               ` How to fix errors that check --mode lomem finds, but --mode normal doesn't? Marc MERLIN
  2017-06-21 23:22                 ` Chris Murphy
@ 2017-06-22  2:22                 ` Qu Wenruo
  2017-06-22  2:53                   ` Marc MERLIN
  1 sibling, 1 reply; 77+ messages in thread
From: Qu Wenruo @ 2017-06-22  2:22 UTC (permalink / raw)
  To: Marc MERLIN, Chris Murphy; +Cc: Hugo Mills, Btrfs BTRFS



At 06/21/2017 11:13 PM, Marc MERLIN wrote:
> On Tue, Jun 20, 2017 at 08:43:52PM -0700, Marc MERLIN wrote:
>> On Tue, Jun 20, 2017 at 09:31:42PM -0600, Chris Murphy wrote:
>>> On Tue, Jun 20, 2017 at 5:12 PM, Marc MERLIN <marc@merlins.org> wrote:
>>>
>>>> I'm now going to remount this with nospace_cache to see if your guess about
>>>> space_cache was correct.
>>>> Other suggestions also welcome :)
>>>
>>> What results do you get with lowmem mode? It won't repair without
>>> additional patches, but might give a dev a clue what's going on. I
>>> regularly see normal mode check finds no problems, and lowmem mode
>>> finds problems. Lowmem mode is a total rewrite so it's a different
>>> implementation and can find things normal mode won't.
>>
>> Oh, I kind of forgot that lowmem mode looked for more things than regular
>> mode.
>> I will run this tonight and see what it says.
>   
> It's probably still a ways from being finished given how slow lowmem is in
> comparison, but sadly it found a bunch of problems which regular mode didn't
> find.
> 
> I'm pretty bummed. I just spent way too long recreating this filesystem and
> the multiple btrfs send/receive relationships from other machines. Took a bit
> over a week :(
> 
> It looks like the errors are not major (especially if the regular mode
> doesn't even see them), but without lowmem --repair, I'm kind of screwed.
> 
> I'm wondering if I could/should leave those errors unfixed until lowmem --repair
> finally happens, or whether I'm looking at spending another week rebuilding
> this filesystem :-/
> 
> 
> gargamel:~# btrfs check -p --mode lowmem  /dev/mapper/dshelf2
> Checking filesystem on /dev/mapper/dshelf2
> UUID: 85441c59-ad11-4b25-b1fe-974f9e4acede
> ERROR: extent[3886187384832, 81920] referencer count mismatch (root: 11930, owner: 375444, offset: 1851654144) wanted: 1, have: 4

This means that in the extent tree, btrfs says there is only one reference
to this extent, but lowmem mode finds 4.

It would be a great help if you could dump the extent tree for it.
# btrfs-debug-tree <dev> | grep -C 10 3886187384832
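
To see which extents are worth dumping first, the mismatch lines can be
grouped per extent. This is only a minimal Python sketch, assuming the check
output was saved as text with one error per line (not wrapped by a mail
client); the function name and the regex are illustrative and not part of
btrfs-progs.

```python
import re
from collections import defaultdict

# Matches lines such as:
# ERROR: extent[3886187384832, 81920] referencer count mismatch
#   (root: 11930, owner: 375444, offset: 1851654144) wanted: 1, have: 4
# (shown wrapped here; the regex expects the whole message on one line)
MISMATCH_RE = re.compile(
    r"extent\[(\d+), (\d+)\] referencer count mismatch "
    r"\(root: (\d+), owner: (\d+), offset: (\d+)\) wanted: (\d+), have: (\d+)"
)


def summarize_mismatches(text):
    """Group mismatch lines by extent bytenr -> list of (root, wanted, have)."""
    by_extent = defaultdict(list)
    for line in text.splitlines():
        m = MISMATCH_RE.search(line)
        if m:
            bytenr, _length, root, _owner, _offset, wanted, have = map(int, m.groups())
            by_extent[bytenr].append((root, wanted, have))
    return dict(by_extent)


# Three lines copied from the check output quoted in this mail.
sample = """\
> ERROR: extent[3886187384832, 81920] referencer count mismatch (root: 11930, owner: 375444, offset: 1851654144) wanted: 1, have: 4
> ERROR: extent[3933249708032, 69632] referencer count mismatch (root: 12424, owner: 6945, offset: 2083389440) wanted: 1, have: 2
> ERROR: extent[3933249708032, 69632] referencer count mismatch (root: 12172, owner: 6945, offset: 2083389440) wanted: 1, have: 2
"""

summary = summarize_mismatches(sample)
for bytenr, refs in summary.items():
    print(bytenr, refs)
```

Each key is an extent bytenr that can be passed to the grep above, and the
roots listed for it are the subvolume trees to dump with -t.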


> ERROR: extent[3886189391872, 122880] referencer count mismatch (root: 11712, owner: 863395, offset: 79659008) wanted: 1, have: 2
> ERROR: extent[3933249708032, 69632] referencer count mismatch (root: 12424, owner: 6945, offset: 2083389440) wanted: 1, have: 2
> ERROR: extent[3933249708032, 69632] referencer count mismatch (root: 12172, owner: 6945, offset: 2083389440) wanted: 1, have: 2
> ERROR: extent[4571729862656, 876544] referencer count mismatch (root: 11058, owner: 375442, offset: 907706368) wanted: 6, have: 21
> ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11712, owner: 375444, offset: 1848672256) wanted: 3, have: 5
> ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11276, owner: 375444, offset: 1848672256) wanted: 3, have: 5
> ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11058, owner: 375444, offset: 1848672256) wanted: 3, have: 5
> ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11494, owner: 375444, offset: 1848672256) wanted: 3, have: 5
> ERROR: extent[4658555617280, 122880] referencer count mismatch (root: 11712, owner: 375444, offset: 1848705024) wanted: 1, have: 3
> ERROR: extent[4677858123776, 417792] referencer count mismatch (root: 11712, owner: 863395, offset: 79523840) wanted: 1, have: 3
> ERROR: extent[4677858123776, 417792] referencer count mismatch (root: 11930, owner: 863395, offset: 79523840) wanted: 1, have: 3
> ERROR: extent[4698380947456, 409600] referencer count mismatch (root: 11930, owner: 375444, offset: 1851596800) wanted: 3, have: 4
> ERROR: extent[4720470421504, 667648] referencer count mismatch (root: 11058, owner: 3463478, offset: 2334720) wanted: 2, have: 10
> ERROR: extent[4783941246976, 65536] referencer count mismatch (root: 9365, owner: 24493, offset: 4562944) wanted: 2, have: 3
> ERROR: extent[5077564477440, 106496] referencer count mismatch (root: 9370, owner: 1602694, offset: 734756864) wanted: 1, have: 2
> ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11712, owner: 375441, offset: 910999552) wanted: 16, have: 1864
> ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11276, owner: 375441, offset: 910999552) wanted: 867, have: 1865
> ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11058, owner: 375441, offset: 910999552) wanted: 126, have: 1872
> ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11494, owner: 375441, offset: 910999552) wanted: 866, have: 1864
> ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11930, owner: 375441, offset: 910999552) wanted: 861, have: 1859
> ERROR: extent[5136649891840, 66781184] referencer count mismatch (root: 11058, owner: 375442, offset: 192659456) wanted: 5, have: 19
> ERROR: extent[5136879157248, 134217728] referencer count mismatch (root: 11930, owner: 375442, offset: 394543104) wanted: 10, have: 33
> ERROR: extent[5137380671488, 80945152] referencer count mismatch (root: 11058, owner: 375442, offset: 875233280) wanted: 1, have: 21
> ERROR: extent[5137380671488, 80945152] referencer count mismatch (root: 11930, owner: 375442, offset: 875233280) wanted: 11, have: 21
> ERROR: extent[5138641395712, 524288] referencer count mismatch (root: 11494, owner: 375445, offset: 39845888) wanted: 1, have: 3
> ERROR: extent[5190245990400, 53248] referencer count mismatch (root: 11712, owner: 863395, offset: 51118080) wanted: 1, have: 4
> ERROR: extent[5190245990400, 53248] referencer count mismatch (root: 11276, owner: 863395, offset: 51118080) wanted: 1, have: 4
> ERROR: extent[5190245990400, 53248] referencer count mismatch (root: 11494, owner: 863395, offset: 51118080) wanted: 1, have: 4
> ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11276, owner: 863395, offset: 74952704) wanted: 3, have: 5
> ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11058, owner: 863395, offset: 74952704) wanted: 3, have: 5
> ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11494, owner: 863395, offset: 74952704) wanted: 3, have: 5
> ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11930, owner: 863395, offset: 74952704) wanted: 3, have: 5
> ERROR: extent[5190275895296, 77824] referencer count mismatch (root: 11712, owner: 863395, offset: 77705216) wanted: 1, have: 6
> ERROR: extent[5190275895296, 77824] referencer count mismatch (root: 11276, owner: 863395, offset: 77705216) wanted: 5, have: 6
> ERROR: extent[5190275895296, 77824] referencer count mismatch (root: 11058, owner: 863395, offset: 77705216) wanted: 5, have: 6
> ERROR: extent[5427326599168, 61440] referencer count mismatch (root: 9370, owner: 1225712, offset: 29753344) wanted: 1, have: 5
> ERROR: extent[5456623030272, 24576] referencer count mismatch (root: 11058, owner: 2278892, offset: 786432) wanted: 2, have: 3
> ERROR: extent[5851251269632, 134217728] referencer count mismatch (root: 9370, owner: 1602695, offset: 534061056) wanted: 3, have: 4
> ERROR: errors found in extent allocation tree or chunk allocation
> cache and super generation don't match, space cache will be invalidated
> ERROR: root 3857 EXTENT_DATA[108864 4096] interrupt

This means that, for root 3857, inode 108864, file offset 4096, there is
a gap before that extent.
In NO_HOLES mode that's allowed, but if the NO_HOLES incompat flag is not
set, this is a problem.

I wonder if this is caused by an inlined compressed file extent.

This can also be dumped with the following command.
# btrfs-debug-tree -t 3857 <dev> | grep -C 10 108864
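
To illustrate what "a gap before that extent" means, here is a minimal Python
sketch of a gap check over the file extents of one inode. It is a simplified
model, not the actual lowmem code: the function name and the
(file_offset, length) tuples are assumptions made for the example, and it
ignores NO_HOLES and inline extents entirely.

```python
def find_gaps(extents):
    """Return uncovered (start, end) byte ranges for one inode.

    extents: iterable of (file_offset, length) pairs in bytes, in any order.
    """
    gaps = []
    expected = 0  # next file offset that should be covered
    for offset, length in sorted(extents):
        if offset > expected:
            # Nothing describes [expected, offset): without NO_HOLES this
            # range would need an explicit hole extent.
            gaps.append((expected, offset))
        expected = max(expected, offset + length)
    return gaps


# Hypothetical inode whose first extent item starts at file offset 4096.
print(find_gaps([(4096, 4096)]))             # [(0, 4096)]
# Contiguous extents starting at 0 leave no gap.
print(find_gaps([(0, 4096), (4096, 8192)]))  # []
```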

Thanks,
Qu

> ERROR: root 3857 EXTENT_DATA[133050 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[388570 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[729583 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[984778 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[997394 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1002954 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1007491 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1111463 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1111506 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1111536 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1111536 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1134500 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1136498 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1175965 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1185977 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1190919 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1201340 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1230370 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1230530 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1230530 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1235960 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1248784 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1271827 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1295242 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1406074 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1410780 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1412938 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1413532 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1413532 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1421245 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1423365 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1425985 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1429229 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1430615 4096] interrupt
> ERROR: root 3857 EXTENT_DATA[1443769 4096] interrupt
> ERROR: root 3860 EXTENT_DATA[599089 4096] interrupt
> 
> (not finished, still going on)
> 
> Marc
> 



^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't?
  2017-06-22  2:22                 ` Qu Wenruo
@ 2017-06-22  2:53                   ` Marc MERLIN
  2017-06-22  4:08                     ` Qu Wenruo
  2017-06-22  4:08                     ` Qu Wenruo
  0 siblings, 2 replies; 77+ messages in thread
From: Marc MERLIN @ 2017-06-22  2:53 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Chris Murphy, Hugo Mills, Btrfs BTRFS

[-- Attachment #1: Type: text/plain, Size: 3830 bytes --]

Ok, first it finished (almost 24H)

(...)
ERROR: root 3862 EXTENT_DATA[18170706 135168] interrupt
ERROR: root 3862 EXTENT_DATA[18170706 1048576] interrupt
ERROR: root 3864 EXTENT_DATA[109336 4096] interrupt
ERROR: errors found in fs roots
found 5544779108352 bytes used, error(s) found
total csum bytes: 5344523140
total tree bytes: 71323041792
total fs tree bytes: 59288403968
total extent tree bytes: 5378260992
btree space waste bytes: 10912166856
file data blocks allocated: 7830914256896
 referenced 6244104495104

Thanks for your reply Qu

On Thu, Jun 22, 2017 at 10:22:57AM +0800, Qu Wenruo wrote:
> >gargamel:~# btrfs check -p --mode lowmem  /dev/mapper/dshelf2
> >Checking filesystem on /dev/mapper/dshelf2
> >UUID: 85441c59-ad11-4b25-b1fe-974f9e4acede
> >ERROR: extent[3886187384832, 81920] referencer count mismatch (root: 
> >11930, owner: 375444, offset: 1851654144) wanted: 1, have: 4
> 
> This means that in extent tree, btrfs says there is only one referring 
> to this extent, but lowmem mode find 4.
> 
> It would provide great help if you could dump extent tree for it.
> # btrfs-debug-tree <dev> | grep -C 10 3886187384832
 
                extent data backref root 11712 objectid 375444 offset 1851572224 count 1
                extent data backref root 11276 objectid 375444 offset 1851572224 count 1
                extent data backref root 11058 objectid 375444 offset 1851572224 count 1
                extent data backref root 11494 objectid 375444 offset 1851572224 count 1
        item 37 key (3886187352064 EXTENT_ITEM 32768) itemoff 11381 itemsize 140
                extent refs 4 gen 32382 flags DATA
                extent data backref root 11712 objectid 375444 offset 1851596800 count 1
                extent data backref root 11276 objectid 375444 offset 1851596800 count 1
                extent data backref root 11058 objectid 375444 offset 1851596800 count 1
                extent data backref root 11494 objectid 375444 offset 1851596800 count 1
        item 38 key (3886187384832 EXTENT_ITEM 81920) itemoff 11212 itemsize 169
                extent refs 16 gen 32382 flags DATA
                extent data backref root 11712 objectid 375444 offset 1851654144 count 4
                extent data backref root 11276 objectid 375444 offset 1851654144 count 4
                extent data backref root 11058 objectid 375444 offset 1851654144 count 3
                extent data backref root 11494 objectid 375444 offset 1851654144 count 4
                extent data backref root 11930 objectid 375444 offset 1851654144 count 1
        item 39 key (3886187466752 EXTENT_ITEM 16384) itemoff 11043 itemsize 169
                extent refs 5 gen 32382 flags DATA
                extent data backref root 11712 objectid 375444 offset 1851744256 count 1
                extent data backref root 11276 objectid 375444 offset 1851744256 count 1
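
As a sanity check on the dump above, the per-root backref counts of item 38
can be added up and compared with its "extent refs" value. A minimal Python
sketch, assuming the btrfs-debug-tree text format shown here; check_refs is a
hypothetical helper, not a btrfs-progs function.

```python
import re

KEY_RE = re.compile(r"item \d+ key \((\d+) (\w+) \d+\)")
REFS_RE = re.compile(r"extent refs (\d+)")
BACKREF_RE = re.compile(
    r"extent data backref root \d+ objectid \d+ offset \d+ count (\d+)"
)


def check_refs(dump, bytenr):
    """Return (extent_refs, sum_of_backref_counts) for the EXTENT_ITEM at bytenr."""
    refs = None
    total = 0
    in_item = False
    for line in dump.splitlines():
        key = KEY_RE.search(line)
        if key:
            # A new item starts; only follow the EXTENT_ITEM we were asked about.
            in_item = key.group(2) == "EXTENT_ITEM" and int(key.group(1)) == bytenr
            continue
        if not in_item:
            continue
        m = REFS_RE.search(line)
        if m:
            refs = int(m.group(1))
        m = BACKREF_RE.search(line)
        if m:
            total += int(m.group(1))
    return refs, total


# Item 38 and the start of item 39, copied from the dump above.
dump = """\
        item 38 key (3886187384832 EXTENT_ITEM 81920) itemoff 11212 itemsize 169
                extent refs 16 gen 32382 flags DATA
                extent data backref root 11712 objectid 375444 offset 1851654144 count 4
                extent data backref root 11276 objectid 375444 offset 1851654144 count 4
                extent data backref root 11058 objectid 375444 offset 1851654144 count 3
                extent data backref root 11494 objectid 375444 offset 1851654144 count 4
                extent data backref root 11930 objectid 375444 offset 1851654144 count 1
        item 39 key (3886187466752 EXTENT_ITEM 16384) itemoff 11043 itemsize 169
                extent refs 5 gen 32382 flags DATA
"""

result = check_refs(dump, 3886187384832)
print(result)  # (16, 16)
```

So the extent item is consistent with its own backrefs (4 + 4 + 3 + 4 + 1 = 16),
and the backref for root 11930 has count 1, which matches the "wanted: 1" in
the check output.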

 
> >ERROR: errors found in extent allocation tree or chunk allocation
> >cache and super generation don't match, space cache will be invalidated
> >ERROR: root 3857 EXTENT_DATA[108864 4096] interrupt
> 
> This means that, for root 3857, inode 108864, file offset 4096, there is 
> a gap before that extent.
> In NO_HOLES mode it's allowed, but if NO_HOLES incompat flag is not set, 
> this should be a problem.
> 
> I wonder if this is a problem caused by inlined compressed file extent.
> 
> This can also be dumped by the following command.
> # btrfs-debug-tree -t 3857 <dev> | grep -C 10 108864

This one is much bigger (192KB), I've bzipped and attached it.

Thanks for having a look, I appreciate it.

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  

[-- Attachment #2: out.bz2 --]
[-- Type: application/octet-stream, Size: 23826 bytes --]

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't?
  2017-06-22  2:53                   ` Marc MERLIN
@ 2017-06-22  4:08                     ` Qu Wenruo
  2017-06-23  4:06                       ` Marc MERLIN
  2017-06-22  4:08                     ` Qu Wenruo
  1 sibling, 1 reply; 77+ messages in thread
From: Qu Wenruo @ 2017-06-22  4:08 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: Chris Murphy, Hugo Mills, Btrfs BTRFS



At 06/22/2017 10:53 AM, Marc MERLIN wrote:
> Ok, first it finished (almost 24H)
> 
> (...)
> ERROR: root 3862 EXTENT_DATA[18170706 135168] interrupt
> ERROR: root 3862 EXTENT_DATA[18170706 1048576] interrupt
> ERROR: root 3864 EXTENT_DATA[109336 4096] interrupt
> ERROR: errors found in fs roots
> found 5544779108352 bytes used, error(s) found
> total csum bytes: 5344523140
> total tree bytes: 71323041792
> total fs tree bytes: 59288403968
> total extent tree bytes: 5378260992
> btree space waste bytes: 10912166856
> file data blocks allocated: 7830914256896
>   referenced 6244104495104
> 
> Thanks for your reply Qu
> 
> On Thu, Jun 22, 2017 at 10:22:57AM +0800, Qu Wenruo wrote:
>>> gargamel:~# btrfs check -p --mode lowmem  /dev/mapper/dshelf2
>>> Checking filesystem on /dev/mapper/dshelf2
>>> UUID: 85441c59-ad11-4b25-b1fe-974f9e4acede
>>> ERROR: extent[3886187384832, 81920] referencer count mismatch (root:
>>> 11930, owner: 375444, offset: 1851654144) wanted: 1, have: 4
>>
>> This means that in extent tree, btrfs says there is only one referring
>> to this extent, but lowmem mode find 4.
>>
>> It would provide great help if you could dump extent tree for it.
>> # btrfs-debug-tree <dev> | grep -C 10 3886187384832
>   
>                  extent data backref root 11712 objectid 375444 offset 1851572224 count 1
>                  extent data backref root 11276 objectid 375444 offset 1851572224 count 1
>                  extent data backref root 11058 objectid 375444 offset 1851572224 count 1
>                  extent data backref root 11494 objectid 375444 offset 1851572224 count 1
>          item 37 key (3886187352064 EXTENT_ITEM 32768) itemoff 11381 itemsize 140
>                  extent refs 4 gen 32382 flags DATA
>                  extent data backref root 11712 objectid 375444 offset 1851596800 count 1
>                  extent data backref root 11276 objectid 375444 offset 1851596800 count 1
>                  extent data backref root 11058 objectid 375444 offset 1851596800 count 1
>                  extent data backref root 11494 objectid 375444 offset 1851596800 count 1
>          item 38 key (3886187384832 EXTENT_ITEM 81920) itemoff 11212 itemsize 169
>                  extent refs 16 gen 32382 flags DATA
>                  extent data backref root 11712 objectid 375444 offset 1851654144 count 4
>                  extent data backref root 11276 objectid 375444 offset 1851654144 count 4
>                  extent data backref root 11058 objectid 375444 offset 1851654144 count 3
>                  extent data backref root 11494 objectid 375444 offset 1851654144 count 4
>                  extent data backref root 11930 objectid 375444 offset 1851654144 count 1
>          item 39 key (3886187466752 EXTENT_ITEM 16384) itemoff 11043 itemsize 169
>                  extent refs 5 gen 32382 flags DATA
>                  extent data backref root 11712 objectid 375444 offset 1851744256 count 1
>                  extent data backref root 11276 objectid 375444 offset 1851744256 count 1

Well, that is only the output from the extent tree.

I was also expecting output from the subvolume (11930) tree.

It can be dumped with
# btrfs-debug-tree -t 11930 <dev> | grep -C 10 3886187384832

But please note that this dump may contain filenames; feel free to
mask them.

Thanks,
Qu

> 
>   
>>> ERROR: errors found in extent allocation tree or chunk allocation
>>> cache and super generation don't match, space cache will be invalidated
>>> ERROR: root 3857 EXTENT_DATA[108864 4096] interrupt
>>
>> This means that, for root 3857, inode 108864, file offset 4096, there is
>> a gap before that extent.
>> In NO_HOLES mode it's allowed, but if NO_HOLES incompat flag is not set,
>> this should be a problem.
>>
>> I wonder if this is a problem caused by inlined compressed file extent.
>>
>> This can also be dumped by the following command.
>> # btrfs-debug-tree -t 3857 <dev> | grep -C 10 108864
> 
> This one is much bigger (192KB), I've bzipped and attached it.

Thanks for this one.
It is indeed caused by an inlined compressed extent.

Lu Fengqi will send a patch fixing it.

Thanks,
Qu

> 
> Thanks for having a look, I appreciate it.
> 
> Marc
> 



^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't?
  2017-06-22  4:08                     ` Qu Wenruo
@ 2017-06-23  4:06                       ` Marc MERLIN
  2017-06-23  8:54                         ` Lu Fengqi
  0 siblings, 1 reply; 77+ messages in thread
From: Marc MERLIN @ 2017-06-23  4:06 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Chris Murphy, Hugo Mills, Btrfs BTRFS

On Thu, Jun 22, 2017 at 12:08:44PM +0800, Qu Wenruo wrote:
> > On Thu, Jun 22, 2017 at 10:22:57AM +0800, Qu Wenruo wrote:
> > > > gargamel:~# btrfs check -p --mode lowmem  /dev/mapper/dshelf2
> > > > Checking filesystem on /dev/mapper/dshelf2
> > > > UUID: 85441c59-ad11-4b25-b1fe-974f9e4acede
> > > > ERROR: extent[3886187384832, 81920] referencer count mismatch (root:
> > > > 11930, owner: 375444, offset: 1851654144) wanted: 1, have: 4
> > > 
> > > This means that in extent tree, btrfs says there is only one referring
> > > to this extent, but lowmem mode find 4.
> > > 
> > > It would provide great help if you could dump extent tree for it.
> > > # btrfs-debug-tree <dev> | grep -C 10 3886187384832
> >                  extent data backref root 11712 objectid 375444 offset 1851572224 count 1
> >                  extent data backref root 11276 objectid 375444 offset 1851572224 count 1
> >                  extent data backref root 11058 objectid 375444 offset 1851572224 count 1
> >                  extent data backref root 11494 objectid 375444 offset 1851572224 count 1
> >          item 37 key (3886187352064 EXTENT_ITEM 32768) itemoff 11381 itemsize 140
> >                  extent refs 4 gen 32382 flags DATA
> >                  extent data backref root 11712 objectid 375444 offset 1851596800 count 1
> >                  extent data backref root 11276 objectid 375444 offset 1851596800 count 1
> >                  extent data backref root 11058 objectid 375444 offset 1851596800 count 1
> >                  extent data backref root 11494 objectid 375444 offset 1851596800 count 1
> >          item 38 key (3886187384832 EXTENT_ITEM 81920) itemoff 11212 itemsize 169
> >                  extent refs 16 gen 32382 flags DATA
> >                  extent data backref root 11712 objectid 375444 offset 1851654144 count 4
> >                  extent data backref root 11276 objectid 375444 offset 1851654144 count 4
> >                  extent data backref root 11058 objectid 375444 offset 1851654144 count 3
> >                  extent data backref root 11494 objectid 375444 offset 1851654144 count 4
> >                  extent data backref root 11930 objectid 375444 offset 1851654144 count 1
> >          item 39 key (3886187466752 EXTENT_ITEM 16384) itemoff 11043 itemsize 169
> >                  extent refs 5 gen 32382 flags DATA
> >                  extent data backref root 11712 objectid 375444 offset 1851744256 count 1
> >                  extent data backref root 11276 objectid 375444 offset 1851744256 count 1
> 
> Well, there is only the output from extent tree.
> 
> I was also expecting output from subvolue (11930) tree.
> 
> It could be done by
> # btrfs-debug-tree -t 11930 | grep -C 10 3886187384832
> 
> But please pay attention that, this dump may contain filenames, feel free to
> mask the filenames.
 
There you go:
gargamel:~# btrfs-debug-tree /dev/mapper/dsh | grep -C 10 3886187384832  
dshelf1@ dshelf2@  
                extent compression 0 (none)
        item 201 key (375444 EXTENT_DATA 1851654144) itemoff 5577 itemsize 53
                extent data disk byte 5613689888768 nr 8192
                extent data offset 0 nr 8192 ram 8192
                extent compression 0 (none)
        item 3 key (375444 EXTENT_DATA 1851744256) itemoff 16071 itemsize 53
                generation 32382 type 1 (regular)
                extent data disk byte 5613689888768 nr 8192
                extent data offset 0 nr 8192 ram 8192
                extent compression 0 (none)
        item 201 key (375444 EXTENT_DATA 1851654144) itemoff 5577 itemsize 53
                generation 32961 type 1 (regular)
                extent data disk byte 4686293291008 nr 16384
                extent data offset 0 nr 16384 ram 16384
                extent compression 0 (none)
        item 202 key (375444 EXTENT_DATA 1851670528) itemoff 5524 itemsize 53
                generation 32382 type 1 (regular)
                extent data disk byte 3886187384832 nr 81920
                extent data offset 16384 nr 8192 ram 81920
                extent compression 0 (none)
        item 203 key (375444 EXTENT_DATA 1851678720) itemoff 5471 itemsize 53
                generation 33534 type 1 (regular)
                extent data disk byte 5540480962560 nr 8192
                extent data offset 0 nr 8192 ram 8192
                extent compression 0 (none)
        item 204 key (375444 EXTENT_DATA 1851686912) itemoff 5418 itemsize 53
                generation 32961 type 1 (regular)
                extent data disk byte 4686293307392 nr 16384
                extent data offset 8192 nr 8192 ram 16384
                extent compression 0 (none)
        item 205 key (375444 EXTENT_DATA 1851695104) itemoff 5365 itemsize 53
                generation 32382 type 1 (regular)
                extent data disk byte 3886187384832 nr 81920
                extent data offset 40960 nr 8192 ram 81920
                extent compression 0 (none)
        item 206 key (375444 EXTENT_DATA 1851703296) itemoff 5312 itemsize 53
                generation 32961 type 1 (regular)
                extent data disk byte 4686293323776 nr 8192
                extent data offset 0 nr 8192 ram 8192
                extent compression 0 (none)
        item 207 key (375444 EXTENT_DATA 1851711488) itemoff 5259 itemsize 53
                generation 32382 type 1 (regular)
                extent data disk byte 3886187384832 nr 81920
                extent data offset 57344 nr 8192 ram 81920
                extent compression 0 (none)
leaf 5715801047040 items 105 free space 8093 generation 36595 owner 11930
fs uuid 85441c59-ad11-4b25-b1fe-974f9e4acede
chunk uuid ed705b7b-2fa6-43f6-a4a1-941c8463ee68
        item 0 key (375444 EXTENT_DATA 1851719680) itemoff 16230 itemsize 53
                generation 34868 type 1 (regular)
                extent data disk byte 5591266127872 nr 8192
                extent data offset 0 nr 8192 ram 8192
                extent compression 0 (none)
        item 1 key (375444 EXTENT_DATA 1851727872) itemoff 16177 itemsize 53
                generation 32382 type 1 (regular)
                extent data disk byte 3886187384832 nr 81920
                extent data offset 73728 nr 8192 ram 81920
                extent compression 0 (none)
        item 2 key (375444 EXTENT_DATA 1851736064) itemoff 16124 itemsize 53
                generation 31782 type 1 (regular)
                extent data disk byte 5922189430784 nr 106496
                extent data offset 81920 nr 8192 ram 106496
                extent compression 0 (none)
        item 3 key (375444 EXTENT_DATA 1851744256) itemoff 16071 itemsize 53
                generation 32382 type 1 (regular)
                extent data disk byte 3886187466752 nr 16384


> Thanks for this one.
> And it is caused by inlined compressed extent.
> 
> Lu Fengqi will send patch fixing it.

I got the patch and will test it, thank you.

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901


* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't?
  2017-06-23  4:06                       ` Marc MERLIN
@ 2017-06-23  8:54                         ` Lu Fengqi
  2017-06-23 16:17                           ` Marc MERLIN
  0 siblings, 1 reply; 77+ messages in thread
From: Lu Fengqi @ 2017-06-23  8:54 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: Qu Wenruo, Chris Murphy, Hugo Mills, Btrfs BTRFS

On 2017-06-23 12:06, Marc MERLIN wrote:
>> Well, there is only the output from extent tree.
>>
>> I was also expecting output from the subvolume (11930) tree.
>>
>> It could be done by
>> # btrfs-debug-tree -t 11930 | grep -C 10 3886187384832
>>
I apologize if this was not made clear.

>> But please pay attention that, this dump may contain filenames, feel free to
>> mask the filenames.
>   
> There you go:
> gargamel:~# btrfs-debug-tree /dev/mapper/dsh | grep -C 10 3886187384832

Could you dump the file tree (11930) with the following command?
# btrfs-debug-tree -t 11930 /dev/mapper/dsh | grep -C 10 3886187384832

I wonder if this extent was referenced by this file tree four times.

Hoping that this will not cause you too much trouble.


-- 
Thanks,
Lu




* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't?
  2017-06-23  8:54                         ` Lu Fengqi
@ 2017-06-23 16:17                           ` Marc MERLIN
  2017-06-24  2:34                             ` Marc MERLIN
  0 siblings, 1 reply; 77+ messages in thread
From: Marc MERLIN @ 2017-06-23 16:17 UTC (permalink / raw)
  To: Lu Fengqi; +Cc: Qu Wenruo, Chris Murphy, Hugo Mills, Btrfs BTRFS

On Fri, Jun 23, 2017 at 04:54:01PM +0800, Lu Fengqi wrote:
> On 2017-06-23 12:06, Marc MERLIN wrote:
> > > Well, there is only the output from extent tree.
> > > 
> > > I was also expecting output from the subvolume (11930) tree.
> > > 
> > > It could be done by
> > > # btrfs-debug-tree -t 11930 | grep -C 10 3886187384832
> > > 
> I apologize if this was not made clear.
> 
> > > But please pay attention that, this dump may contain filenames, feel free to
> > > mask the filenames.
> > There you go:
> > gargamel:~# btrfs-debug-tree /dev/mapper/dsh | grep -C 10 3886187384832
> 
> Could you dump the file tree (11930) with the following command?
> # btrfs-debug-tree -t 11930 /dev/mapper/dsh | grep -C 10 3886187384832

Sure thing, there you go:
		extent data disk byte 5613689888768 nr 8192
		extent data offset 0 nr 8192 ram 8192
		extent compression 0 (none)
	item 201 key (375444 EXTENT_DATA 1851654144) itemoff 5577 itemsize 53
		generation 32961 type 1 (regular)
		extent data disk byte 4686293291008 nr 16384
		extent data offset 0 nr 16384 ram 16384
		extent compression 0 (none)
	item 202 key (375444 EXTENT_DATA 1851670528) itemoff 5524 itemsize 53
		generation 32382 type 1 (regular)
		extent data disk byte 3886187384832 nr 81920
		extent data offset 16384 nr 8192 ram 81920
		extent compression 0 (none)
	item 203 key (375444 EXTENT_DATA 1851678720) itemoff 5471 itemsize 53
		generation 33534 type 1 (regular)
		extent data disk byte 5540480962560 nr 8192
		extent data offset 0 nr 8192 ram 8192
		extent compression 0 (none)
	item 204 key (375444 EXTENT_DATA 1851686912) itemoff 5418 itemsize 53
		generation 32961 type 1 (regular)
		extent data disk byte 4686293307392 nr 16384
		extent data offset 8192 nr 8192 ram 16384
		extent compression 0 (none)
	item 205 key (375444 EXTENT_DATA 1851695104) itemoff 5365 itemsize 53
		generation 32382 type 1 (regular)
		extent data disk byte 3886187384832 nr 81920
		extent data offset 40960 nr 8192 ram 81920
		extent compression 0 (none)
	item 206 key (375444 EXTENT_DATA 1851703296) itemoff 5312 itemsize 53
		generation 32961 type 1 (regular)
		extent data disk byte 4686293323776 nr 8192
		extent data offset 0 nr 8192 ram 8192
		extent compression 0 (none)
	item 207 key (375444 EXTENT_DATA 1851711488) itemoff 5259 itemsize 53
		generation 32382 type 1 (regular)
		extent data disk byte 3886187384832 nr 81920
		extent data offset 57344 nr 8192 ram 81920
		extent compression 0 (none)
leaf 5715801047040 items 105 free space 8093 generation 36595 owner 11930
leaf 5715801047040 flags 0x1(WRITTEN) backref revision 1
fs uuid 85441c59-ad11-4b25-b1fe-974f9e4acede
chunk uuid ed705b7b-2fa6-43f6-a4a1-941c8463ee68
	item 0 key (375444 EXTENT_DATA 1851719680) itemoff 16230 itemsize 53
		generation 34868 type 1 (regular)
		extent data disk byte 5591266127872 nr 8192
		extent data offset 0 nr 8192 ram 8192
		extent compression 0 (none)
	item 1 key (375444 EXTENT_DATA 1851727872) itemoff 16177 itemsize 53
		generation 32382 type 1 (regular)
		extent data disk byte 3886187384832 nr 81920
		extent data offset 73728 nr 8192 ram 81920
		extent compression 0 (none)
	item 2 key (375444 EXTENT_DATA 1851736064) itemoff 16124 itemsize 53
		generation 31782 type 1 (regular)
		extent data disk byte 5922189430784 nr 106496
		extent data offset 81920 nr 8192 ram 106496
		extent compression 0 (none)
	item 3 key (375444 EXTENT_DATA 1851744256) itemoff 16071 itemsize 53
		generation 32382 type 1 (regular)
		extent data disk byte 3886187466752 nr 16384
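
To check how many times this file tree references extent 3886187384832, the
matching EXTENT_DATA items in the dump can be counted mechanically. Below is a
minimal sketch, assuming the text layout shown above; `count_refs` and the two
regexes are hypothetical helpers written for this illustration, not part of
btrfs-progs, and the sample is an excerpt of the dump (items 202 to 205).

```python
import re

# Matches "item N key (OBJECTID TYPE OFFSET)" lines of the dump.
ITEM_RE = re.compile(r"item \d+ key \((\d+) (\w+) (\d+)\)")
# Matches the "extent data disk byte X nr Y" line of a file extent item.
DISK_RE = re.compile(r"extent data disk byte (\d+) nr \d+")

SAMPLE = """
	item 202 key (375444 EXTENT_DATA 1851670528) itemoff 5524 itemsize 53
		generation 32382 type 1 (regular)
		extent data disk byte 3886187384832 nr 81920
		extent data offset 16384 nr 8192 ram 81920
	item 203 key (375444 EXTENT_DATA 1851678720) itemoff 5471 itemsize 53
		generation 33534 type 1 (regular)
		extent data disk byte 5540480962560 nr 8192
		extent data offset 0 nr 8192 ram 8192
	item 204 key (375444 EXTENT_DATA 1851686912) itemoff 5418 itemsize 53
		generation 32961 type 1 (regular)
		extent data disk byte 4686293307392 nr 16384
		extent data offset 8192 nr 8192 ram 16384
	item 205 key (375444 EXTENT_DATA 1851695104) itemoff 5365 itemsize 53
		generation 32382 type 1 (regular)
		extent data disk byte 3886187384832 nr 81920
		extent data offset 40960 nr 8192 ram 81920
"""


def count_refs(dump_text, disk_bytenr):
    """Return the (inode, file offset) keys of EXTENT_DATA items whose
    on-disk extent starts at disk_bytenr."""
    refs = []
    current_key = None
    for line in dump_text.splitlines():
        item = ITEM_RE.search(line)
        if item:
            objectid, item_type, offset = item.groups()
            # Only file extent items are of interest; other types reset the key.
            if item_type == "EXTENT_DATA":
                current_key = (int(objectid), int(offset))
            else:
                current_key = None
            continue
        disk = DISK_RE.search(line)
        if disk and current_key is not None and int(disk.group(1)) == disk_bytenr:
            refs.append(current_key)
    return refs


refs = count_refs(SAMPLE, 3886187384832)
print(len(refs), refs)
```

On the excerpt this reports two references. Counted by hand over the full dump
above, four items point at disk byte 3886187384832: items 202, 205 and 207, and
item 1 of the following leaf.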


Thanks for looking at this.
I have applied your patch and I'm still re-running check in lowmem. It takes about 24H so I'll
post the full results when it's done.

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901


* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't?
  2017-06-23 16:17                           ` Marc MERLIN
@ 2017-06-24  2:34                             ` Marc MERLIN
  2017-06-26 10:46                               ` Lu Fengqi
  0 siblings, 1 reply; 77+ messages in thread
From: Marc MERLIN @ 2017-06-24  2:34 UTC (permalink / raw)
  To: Lu Fengqi; +Cc: Qu Wenruo, Chris Murphy, Hugo Mills, Btrfs BTRFS

On Fri, Jun 23, 2017 at 09:17:50AM -0700, Marc MERLIN wrote:
> Thanks for looking at this.
> I have applied your patch and I'm still re-running check in lowmem. It takes about 24H so I'll
> post the full results when it's done.

Ok, here is the output of the check with btrfs-progs freshly synced from
git, including Lu's just added patch.

I'm happy to give further debug info on why my filesystem is in that state.
Since check --repair sees nothing to repair, suggestions on how to clean those
warnings up would be greatly appreciated, unless they are not going to affect
filesystem operation :)

Thanks,
Marc


Checking filesystem on /dev/mapper/dshelf2
UUID: 85441c59-ad11-4b25-b1fe-974f9e4acede
ERROR: extent[3886187384832, 81920] referencer count mismatch (root: 11930, owner: 375444, offset: 1851654144) wanted: 1, have: 4
ERROR: extent[3886189391872, 122880] referencer count mismatch (root: 11712, owner: 863395, offset: 79659008) wanted: 1, have: 2
ERROR: extent[3886189391872, 122880] referencer count mismatch (root: 11494, owner: 863395, offset: 79659008) wanted: 1, have: 2
ERROR: extent[3886189391872, 122880] referencer count mismatch (root: 11930, owner: 863395, offset: 79659008) wanted: 1, have: 2
ERROR: extent[3933249708032, 69632] referencer count mismatch (root: 12424, owner: 6945, offset: 2083389440) wanted: 1, have: 2
ERROR: extent[3933249708032, 69632] referencer count mismatch (root: 12172, owner: 6945, offset: 2083389440) wanted: 1, have: 2
ERROR: extent[4571729862656, 876544] referencer count mismatch (root: 11058, owner: 375442, offset: 907706368) wanted: 6, have: 21
ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11712, owner: 375444, offset: 1848672256) wanted: 3, have: 5
ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11276, owner: 375444, offset: 1848672256) wanted: 3, have: 5
ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11058, owner: 375444, offset: 1848672256) wanted: 3, have: 5
ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11494, owner: 375444, offset: 1848672256) wanted: 3, have: 5
ERROR: extent[4658555617280, 122880] referencer count mismatch (root: 11712, owner: 375444, offset: 1848705024) wanted: 1, have: 3
ERROR: extent[4658555617280, 122880] referencer count mismatch (root: 11276, owner: 375444, offset: 1848705024) wanted: 1, have: 3
ERROR: extent[4658555617280, 122880] referencer count mismatch (root: 11058, owner: 375444, offset: 1848705024) wanted: 1, have: 3
ERROR: extent[4658555617280, 122880] referencer count mismatch (root: 11494, owner: 375444, offset: 1848705024) wanted: 1, have: 3
ERROR: extent[4677858123776, 417792] referencer count mismatch (root: 11712, owner: 863395, offset: 79523840) wanted: 1, have: 3
ERROR: extent[4677858123776, 417792] referencer count mismatch (root: 11494, owner: 863395, offset: 79523840) wanted: 1, have: 3
ERROR: extent[4677858123776, 417792] referencer count mismatch (root: 11930, owner: 863395, offset: 79523840) wanted: 1, have: 3
ERROR: extent[4698380947456, 409600] referencer count mismatch (root: 11930, owner: 375444, offset: 1851596800) wanted: 3, have: 4
ERROR: extent[4720470421504, 667648] referencer count mismatch (root: 11058, owner: 3463478, offset: 2334720) wanted: 2, have: 10
ERROR: extent[4783941246976, 65536] referencer count mismatch (root: 9365, owner: 24493, offset: 4562944) wanted: 2, have: 3
ERROR: extent[5077564477440, 106496] referencer count mismatch (root: 9370, owner: 1602694, offset: 734756864) wanted: 1, have: 2
ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11712, owner: 375441, offset: 910999552) wanted: 16, have: 1864
ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11276, owner: 375441, offset: 910999552) wanted: 867, have: 1865
ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11058, owner: 375441, offset: 910999552) wanted: 126, have: 1872
ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11494, owner: 375441, offset: 910999552) wanted: 866, have: 1864
ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11930, owner: 375441, offset: 910999552) wanted: 861, have: 1859
ERROR: extent[5136649891840, 66781184] referencer count mismatch (root: 11058, owner: 375442, offset: 192659456) wanted: 5, have: 19
ERROR: extent[5136879157248, 134217728] referencer count mismatch (root: 11930, owner: 375442, offset: 394543104) wanted: 10, have: 33
ERROR: extent[5137380671488, 80945152] referencer count mismatch (root: 11058, owner: 375442, offset: 875233280) wanted: 1, have: 21
ERROR: extent[5137380671488, 80945152] referencer count mismatch (root: 11930, owner: 375442, offset: 875233280) wanted: 11, have: 21
ERROR: extent[5138641395712, 524288] referencer count mismatch (root: 11494, owner: 375445, offset: 39845888) wanted: 1, have: 3
ERROR: extent[5190245990400, 53248] referencer count mismatch (root: 11712, owner: 863395, offset: 51118080) wanted: 1, have: 4
ERROR: extent[5190245990400, 53248] referencer count mismatch (root: 11276, owner: 863395, offset: 51118080) wanted: 1, have: 4
ERROR: extent[5190245990400, 53248] referencer count mismatch (root: 11494, owner: 863395, offset: 51118080) wanted: 1, have: 4
ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11276, owner: 863395, offset: 74952704) wanted: 3, have: 5
ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11058, owner: 863395, offset: 74952704) wanted: 3, have: 5
ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11494, owner: 863395, offset: 74952704) wanted: 3, have: 5
ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11930, owner: 863395, offset: 74952704) wanted: 3, have: 5
ERROR: extent[5190275895296, 77824] referencer count mismatch (root: 11712, owner: 863395, offset: 77705216) wanted: 1, have: 6
ERROR: extent[5190275895296, 77824] referencer count mismatch (root: 11276, owner: 863395, offset: 77705216) wanted: 5, have: 6
ERROR: extent[5190275895296, 77824] referencer count mismatch (root: 11058, owner: 863395, offset: 77705216) wanted: 5, have: 6
ERROR: extent[5427326599168, 61440] referencer count mismatch (root: 9370, owner: 1225712, offset: 29753344) wanted: 1, have: 5
ERROR: extent[5456623030272, 24576] referencer count mismatch (root: 11058, owner: 2278892, offset: 786432) wanted: 2, have: 3
ERROR: extent[5851251269632, 134217728] referencer count mismatch (root: 9370, owner: 1602695, offset: 534061056) wanted: 3, have: 4
ERROR: errors found in extent allocation tree or chunk allocation
cache and super generation don't match, space cache will be invalidated
ERROR: root 3862 EXTENT_DATA[18170706 4096] interrupt
ERROR: root 3862 EXTENT_DATA[18170706 16384] interrupt
ERROR: root 3862 EXTENT_DATA[18170706 20480] interrupt
ERROR: root 3862 EXTENT_DATA[18170706 135168] interrupt
ERROR: root 3862 EXTENT_DATA[18170706 1048576] interrupt
ERROR: errors found in fs roots
found 5544779108352 bytes used, error(s) found
total csum bytes: 5344523140
total tree bytes: 71323041792
total fs tree bytes: 59288403968
total extent tree bytes: 5378260992
btree space waste bytes: 10912166856
file data blocks allocated: 7830914256896
 referenced 6244104495104
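
Since several of the mismatch lines above refer to the same extent from
different roots, grouping them per extent makes the report easier to read. A
minimal sketch, assuming the exact message format printed above; `summarize`
and `MISMATCH_RE` are hypothetical names for this illustration, and the sample
is the first three ERROR lines of the output.

```python
import re
from collections import defaultdict

# Pattern for the "referencer count mismatch" lines as printed above.
MISMATCH_RE = re.compile(
    r"ERROR: extent\[(\d+), (\d+)\] referencer count mismatch "
    r"\(root: (\d+), owner: (\d+), offset: (\d+)\) wanted: (\d+), have: (\d+)"
)

SAMPLE = """\
ERROR: extent[3886187384832, 81920] referencer count mismatch (root: 11930, owner: 375444, offset: 1851654144) wanted: 1, have: 4
ERROR: extent[3886189391872, 122880] referencer count mismatch (root: 11712, owner: 863395, offset: 79659008) wanted: 1, have: 2
ERROR: extent[3886189391872, 122880] referencer count mismatch (root: 11494, owner: 863395, offset: 79659008) wanted: 1, have: 2
"""


def summarize(check_output):
    """Group mismatch lines by (bytenr, length); each entry is (root, wanted, have)."""
    by_extent = defaultdict(list)
    for line in check_output.splitlines():
        m = MISMATCH_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not a mismatch line
        bytenr, length, root, _owner, _offset, wanted, have = map(int, m.groups())
        by_extent[(bytenr, length)].append((root, wanted, have))
    return dict(by_extent)


for (bytenr, length), entries in summarize(SAMPLE).items():
    print(bytenr, length, entries)
```

Feeding it the whole check log (for example read from a saved file) would give
one entry per extent with the roots that disagree about its reference count.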

-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  


* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't?
  2017-06-24  2:34                             ` Marc MERLIN
@ 2017-06-26 10:46                               ` Lu Fengqi
  2017-06-27 23:11                                 ` Marc MERLIN
  0 siblings, 1 reply; 77+ messages in thread
From: Lu Fengqi @ 2017-06-26 10:46 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: Qu Wenruo, Chris Murphy, Hugo Mills, Btrfs BTRFS

On 2017-06-24 10:34, Marc MERLIN wrote:
> On Fri, Jun 23, 2017 at 09:17:50AM -0700, Marc MERLIN wrote:
>> Thanks for looking at this.
>> I have applied your patch and I'm still re-running check in lowmem. It takes about 24H so I'll
>> post the full results when it's done.
> 
> Ok, here is the output of the check with btrfs-progs freshly synced from
> git, including Lu's just added patch.
> 
> I'm happy to give further debug info on why my filesystem is in that state.
> Since check --repair sees nothing to repair, suggestions on how to clean those
> warnings up would be greatly appreciated, unless they are not going to affect
> filesystem operation :)
> 
> Thanks,
> Marc

Thanks for the updated information. I'm sorry that the false alerts made
you nervous.

> 
> ERROR: root 3862 EXTENT_DATA[18170706 4096] interrupt
> ERROR: root 3862 EXTENT_DATA[18170706 16384] interrupt
> ERROR: root 3862 EXTENT_DATA[18170706 20480] interrupt
> ERROR: root 3862 EXTENT_DATA[18170706 135168] interrupt
> ERROR: root 3862 EXTENT_DATA[18170706 1048576] interrupt
> ERROR: errors found in fs roots

However, this looks like another problem. Could you dump this file tree
with the following command?
# btrfs-debug-tree -t 3862 <dev> | grep -C 10 18170706

-- 
Thanks,
Lu




* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't?
  2017-06-26 10:46                               ` Lu Fengqi
@ 2017-06-27 23:11                                 ` Marc MERLIN
  2017-06-28  7:10                                   ` Lu Fengqi
  0 siblings, 1 reply; 77+ messages in thread
From: Marc MERLIN @ 2017-06-27 23:11 UTC (permalink / raw)
  To: Lu Fengqi; +Cc: Qu Wenruo, Chris Murphy, Hugo Mills, Btrfs BTRFS

On Mon, Jun 26, 2017 at 06:46:16PM +0800, Lu Fengqi wrote:
> Thanks for the updated information. I'm sorry that the false alerts made
> you nervous.

If you can help me find out whether those are real errors that I need to fix
(and can't yet since there is no --repair), or whether they are not real
problems, I can ignore them as long as the other check --mode normal runs
clean (it does), we'll be good :)

> >ERROR: root 3862 EXTENT_DATA[18170706 4096] interrupt
> >ERROR: root 3862 EXTENT_DATA[18170706 16384] interrupt
> >ERROR: root 3862 EXTENT_DATA[18170706 20480] interrupt
> >ERROR: root 3862 EXTENT_DATA[18170706 135168] interrupt
> >ERROR: root 3862 EXTENT_DATA[18170706 1048576] interrupt
> >ERROR: errors found in fs roots
> 
> However, this looks like another problem. Could you dump this file tree 
> by the following command?
> # btrfs-debug-tree -t 3862 <dev> | grep -C 10 18170706

gargamel:~# btrfs-debug-tree -t 3862 /dev/mapper/dshelf2 | grep -C 10 18170706
                transid 522 data_len 0 name_len 45
                name: 007b01c69a8d_9582a070_425a45d7@mindspring.com
        item 89 key (835232 DIR_ITEM 1181375325) itemoff 10304 itemsize 64
                location key (877605 INODE_ITEM 0) type FILE
                transid 530 data_len 0 name_len 34
                name: PO.5078346.7051988@codeweavers.com
        item 90 key (835232 DIR_ITEM 1181489720) itemoff 10226 itemsize 78
                location key (873230 INODE_ITEM 0) type FILE
                transid 529 data_len 0 name_len 48
                name: 8-8356087-5CruUNCbuG3Kg9zuO@mail1.fireflypro.com
        item 91 key (835232 DIR_ITEM 1181707066) itemoff 10148 itemsize 78
                location key (869906 INODE_ITEM 0) type FILE
                transid 528 data_len 0 name_len 48
                name: 7-7026088-1uLzZJuFzYD6h4rzV@max.firmalliance.biz
        item 92 key (835232 DIR_ITEM 1181727135) itemoff 10084 itemsize 64
                location key (877380 INODE_ITEM 0) type FILE
                transid 530 data_len 0 name_len 34
                name: NJ.5943286.7059518@codeweavers.com
        item 93 key (835232 DIR_ITEM 1181873033) itemoff 10038 itemsize 46
                location key (859092 INODE_ITEM 0) type FILE
                transid 526 data_len 0 name_len 16
                name: mdadm_detail.0
        item 83 key (2640780 DIR_ITEM 3316050734) itemoff 12739 itemsize 39
                location key (15689752 INODE_ITEM 0) type FILE
                transid 8178 data_len 0 name_len 9
                name: sda3.dd.0
        item 84 key (2640780 DIR_ITEM 3349213389) itemoff 12697 itemsize 42
                location key (2667656 INODE_ITEM 0) type FILE
                transid 885 data_len 0 name_len 12
                name: sdb2.dd.1.gz
        item 85 key (2640780 DIR_ITEM 3351742419) itemoff 12663 itemsize 34
                location key (18170706 INODE_ITEM 0) type FILE
                transid 37866 data_len 0 name_len 4
                name: dm-0
        item 86 key (2640780 DIR_ITEM 3354578455) itemoff 12624 itemsize 39
                location key (13847590 INODE_ITEM 0) type FILE
                transid 2387 data_len 0 name_len 9
                name: sda7.3.gz
        item 87 key (2640780 DIR_ITEM 3361267344) itemoff 12586 itemsize 38
                location key (2667594 INODE_ITEM 0) type FILE
                transid 885 data_len 0 name_len 8
                name: .profile
--
                name: sdc1.dd.4.gz
        item 70 key (2640780 DIR_INDEX 1685) itemoff 13162 itemsize 42
                location key (17548883 INODE_ITEM 0) type FILE
                transid 34469 data_len 0 name_len 12
                name: sdc1.dd.5.gz
        item 71 key (2640780 DIR_INDEX 1687) itemoff 13120 itemsize 42
                location key (17548884 INODE_ITEM 0) type FILE
                transid 34469 data_len 0 name_len 12
                name: sdc1.dd.6.gz
        item 72 key (2640780 DIR_INDEX 2039) itemoff 13086 itemsize 34
                location key (18170706 INODE_ITEM 0) type FILE
                transid 37866 data_len 0 name_len 4
                name: dm-0
        item 73 key (2640780 DIR_INDEX 2041) itemoff 13051 itemsize 35
                location key (18170707 INODE_ITEM 0) type FILE
                transid 37866 data_len 0 name_len 5
                name: fdisk
        item 74 key (2640780 DIR_INDEX 2043) itemoff 13007 itemsize 44
                location key (18170708 INODE_ITEM 0) type FILE
                transid 37866 data_len 0 name_len 14
                name: mdadm_detail.0
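
Every match above is a DIR_ITEM or DIR_INDEX of directory 2640780 whose
location key names inode 18170706, because grep matches the number anywhere on
a line. A narrower filter keyed on the item's own objectid may be easier to
read. This is only a sketch, assuming the dump layout shown above;
`items_for_objectid` is a hypothetical helper, and the sample is two items
copied from the output above.

```python
import re

# Matches "item N key (OBJECTID TYPE OFFSET)" at the start of an item.
ITEM_RE = re.compile(r"^\s*item \d+ key \((\d+) (\w+) (\d+)\)")
# Leaf/node header lines end the previous item.
HEADER_RE = re.compile(r"^(leaf|node) \d+")

SAMPLE = """
        item 85 key (2640780 DIR_ITEM 3351742419) itemoff 12663 itemsize 34
                location key (18170706 INODE_ITEM 0) type FILE
                transid 37866 data_len 0 name_len 4
                name: dm-0
        item 72 key (2640780 DIR_INDEX 2039) itemoff 13086 itemsize 34
                location key (18170706 INODE_ITEM 0) type FILE
                transid 37866 data_len 0 name_len 4
                name: dm-0
"""


def items_for_objectid(dump_text, objectid):
    """Return the dump lines of every item whose key objectid equals objectid."""
    out = []
    keep = False
    for line in dump_text.splitlines():
        m = ITEM_RE.match(line)
        if m:
            keep = int(m.group(1)) == objectid
        elif HEADER_RE.match(line):
            keep = False
        if keep:
            out.append(line)
    return out


# Items owned by the inode itself (INODE_ITEM, EXTENT_DATA, ...) would show up here.
print(len(items_for_objectid(SAMPLE, 18170706)))
# Items owned by the parent directory.
print(len(items_for_objectid(SAMPLE, 2640780)))
```

On this excerpt the filter returns nothing for 18170706 and all eight lines for
2640780, which is the same picture as the grep output: only the parent
directory's entries are present.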

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  


* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't?
  2017-06-27 23:11                                 ` Marc MERLIN
@ 2017-06-28  7:10                                   ` Lu Fengqi
  2017-06-28 14:43                                     ` Marc MERLIN
  0 siblings, 1 reply; 77+ messages in thread
From: Lu Fengqi @ 2017-06-28  7:10 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: Qu Wenruo, Chris Murphy, Hugo Mills, Btrfs BTRFS

On Tue, Jun 27, 2017 at 04:11:46PM -0700, Marc MERLIN wrote:
>On Mon, Jun 26, 2017 at 06:46:16PM +0800, Lu Fengqi wrote:
>> Thanks for the updated information. I'm sorry that the false alerts made
>> you nervous.
>
>If you can help me find out whether those are real errors that I need to fix
>(and can't yet since there is no --repair), or whether they are not real
>problems, I can ignore them as long as the other check --mode normal runs
>clean (it does), we'll be good :)

:)

>
>> >ERROR: root 3862 EXTENT_DATA[18170706 4096] interrupt
>> >ERROR: root 3862 EXTENT_DATA[18170706 16384] interrupt
>> >ERROR: root 3862 EXTENT_DATA[18170706 20480] interrupt
>> >ERROR: root 3862 EXTENT_DATA[18170706 135168] interrupt
>> >ERROR: root 3862 EXTENT_DATA[18170706 1048576] interrupt
>> >ERROR: errors found in fs roots
>> 
>> However, this looks like another problem. Could you dump this file tree 
>> by the following command?
>> # btrfs-debug-tree -t 3862 <dev> | grep -C 10 18170706
>

The output is abnormal: apart from the relevant DIR_ITEM and DIR_INDEX,
I can't find the above-mentioned INODE_ITEM and EXTENT_DATA.
Was the file system online when this command was executed? If so,
please re-run it offline; if not, could you apply my patches and
check it again?

>gargamel:~# btrfs-debug-tree -t 3862 /dev/mapper/dshelf2 | grep -C 10 18170706
>                transid 522 data_len 0 name_len 45
>                name: 007b01c69a8d_9582a070_425a45d7@mindspring.com
>        item 89 key (835232 DIR_ITEM 1181375325) itemoff 10304 itemsize 64
>                location key (877605 INODE_ITEM 0) type FILE
>                transid 530 data_len 0 name_len 34
>                name: PO.5078346.7051988@codeweavers.com
>        item 90 key (835232 DIR_ITEM 1181489720) itemoff 10226 itemsize 78
>                location key (873230 INODE_ITEM 0) type FILE
>                transid 529 data_len 0 name_len 48
>                name: 8-8356087-5CruUNCbuG3Kg9zuO@mail1.fireflypro.com
>        item 91 key (835232 DIR_ITEM 1181707066) itemoff 10148 itemsize 78
>                location key (869906 INODE_ITEM 0) type FILE
>                transid 528 data_len 0 name_len 48
>                name: 7-7026088-1uLzZJuFzYD6h4rzV@max.firmalliance.biz
>        item 92 key (835232 DIR_ITEM 1181727135) itemoff 10084 itemsize 64
>                location key (877380 INODE_ITEM 0) type FILE
>                transid 530 data_len 0 name_len 34
>                name: NJ.5943286.7059518@codeweavers.com
>        item 93 key (835232 DIR_ITEM 1181873033) itemoff 10038 itemsize 46
>                location key (859092 INODE_ITEM 0) type FILE
>                transid 526 data_len 0 name_len 16
>                name: mdadm_detail.0
>        item 83 key (2640780 DIR_ITEM 3316050734) itemoff 12739 itemsize 39
>                location key (15689752 INODE_ITEM 0) type FILE
>                transid 8178 data_len 0 name_len 9
>                name: sda3.dd.0
>        item 84 key (2640780 DIR_ITEM 3349213389) itemoff 12697 itemsize 42
>                location key (2667656 INODE_ITEM 0) type FILE
>                transid 885 data_len 0 name_len 12
>                name: sdb2.dd.1.gz
>        item 85 key (2640780 DIR_ITEM 3351742419) itemoff 12663 itemsize 34
>                location key (18170706 INODE_ITEM 0) type FILE
>                transid 37866 data_len 0 name_len 4
>                name: dm-0
>        item 86 key (2640780 DIR_ITEM 3354578455) itemoff 12624 itemsize 39
>                location key (13847590 INODE_ITEM 0) type FILE
>                transid 2387 data_len 0 name_len 9
>                name: sda7.3.gz
>        item 87 key (2640780 DIR_ITEM 3361267344) itemoff 12586 itemsize 38
>                location key (2667594 INODE_ITEM 0) type FILE
>                transid 885 data_len 0 name_len 8
>                name: .profile
>--
>                name: sdc1.dd.4.gz
>        item 70 key (2640780 DIR_INDEX 1685) itemoff 13162 itemsize 42
>                location key (17548883 INODE_ITEM 0) type FILE
>                transid 34469 data_len 0 name_len 12
>                name: sdc1.dd.5.gz
>        item 71 key (2640780 DIR_INDEX 1687) itemoff 13120 itemsize 42
>                location key (17548884 INODE_ITEM 0) type FILE
>                transid 34469 data_len 0 name_len 12
>                name: sdc1.dd.6.gz
>        item 72 key (2640780 DIR_INDEX 2039) itemoff 13086 itemsize 34
>                location key (18170706 INODE_ITEM 0) type FILE
>                transid 37866 data_len 0 name_len 4
>                name: dm-0
>        item 73 key (2640780 DIR_INDEX 2041) itemoff 13051 itemsize 35
>                location key (18170707 INODE_ITEM 0) type FILE
>                transid 37866 data_len 0 name_len 5
>                name: fdisk
>        item 74 key (2640780 DIR_INDEX 2043) itemoff 13007 itemsize 44
>                location key (18170708 INODE_ITEM 0) type FILE
>                transid 37866 data_len 0 name_len 14
>                name: mdadm_detail.0
>
>Marc
>-- 
>"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
>Microsoft is to operating systems ....
>                                      .... what McDonalds is to gourmet cooking
>Home page: http://marc.merlins.org/  
>
>

-- 
Thanks,
Lu




* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't?
  2017-06-28  7:10                                   ` Lu Fengqi
@ 2017-06-28 14:43                                     ` Marc MERLIN
  2017-05-01 17:06                                       ` 4.11 relocate crash, null pointer Marc MERLIN
  2017-06-29 13:36                                       ` How to fix errors that check --mode lomem finds, but --mode normal doesn't? Lu Fengqi
  0 siblings, 2 replies; 77+ messages in thread
From: Marc MERLIN @ 2017-06-28 14:43 UTC (permalink / raw)
  To: Lu Fengqi; +Cc: Qu Wenruo, Btrfs BTRFS

[-- Attachment #1: Type: text/plain, Size: 740 bytes --]

[cc trimmed]

On Wed, Jun 28, 2017 at 03:10:27PM +0800, Lu Fengqi wrote:
> The output is abnormal: apart from the relevant DIR_ITEM and DIR_INDEX,
> I can't find the above-mentioned INODE_ITEM and EXTENT_DATA.
> Was the file system online when this command was executed? If so,
> please re-run it offline; if not, could you apply my patches and
> check it again?

The filesystem was offline and I had those 2 patches applied.

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901

[-- Attachment #2: p1.patch --]
[-- Type: text/x-diff, Size: 4038 bytes --]

>From lufq.fnst@cn.fujitsu.com Mon Jun 26 03:37:46 2017
From: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
To: <linux-btrfs@vger.kernel.org>
CC: <marc@merlins.org>
Date: Mon, 26 Jun 2017 18:37:24 +0800
Message-ID: <20170626103727.8945-1-lufq.fnst@cn.fujitsu.com>
X-Mailer: git-send-email 2.13.1
MIME-Version: 1.0
Content-Type: text/plain
Subject: [PATCH v3 1/4] btrfs-progs: lowmem check: Fix false alert about file extent interrupt

As Qu mentioned in this thread
(https://www.spinics.net/lists/linux-btrfs/msg64469.html), compression
can cause a regular extent to co-exist with an inlined extent. This
coexistence is confusing, but since it is currently permitted, fix
btrfsck so that it no longer prints a bunch of error messages that will
alarm the user.

When checking file extents, the extent_end of each regular extent is
recorded to detect gaps between regular extents. Normally there is only
one inlined extent, so its extent_end is not needed. However, since a
regular extent can co-exist with an inlined extent, the extent_end of the
inlined extent also needs to be recorded.

Reported-by: Marc MERLIN <marc@merlins.org>
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
---

Changelog:
v2: Just fix reported-by
v3: Output verbose information when a file extent is interrupted

 cmds-check.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index c052f66e..70d2b7f2 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -4782,6 +4782,7 @@ static int check_file_extent(struct btrfs_root *root, struct btrfs_key *fkey,
 				extent_num_bytes, item_inline_len);
 			err |= FILE_EXTENT_ERROR;
 		}
+		*end += extent_num_bytes;
 		*size += extent_num_bytes;
 		return err;
 	}
@@ -4847,8 +4848,8 @@ static int check_file_extent(struct btrfs_root *root, struct btrfs_key *fkey,
 		      root->objectid, fkey->objectid, fkey->offset);
 	} else if (!no_holes && *end != fkey->offset) {
 		err |= FILE_EXTENT_ERROR;
-		error("root %llu EXTENT_DATA[%llu %llu] interrupt",
-		      root->objectid, fkey->objectid, fkey->offset);
+		error("root %llu EXTENT_DATA[%llu %llu] interrupt, should start at %llu",
+		      root->objectid, fkey->objectid, fkey->offset, *end);
 	}
 
 	*end += extent_num_bytes;
-- 
2.13.1
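
[Illustration, not part of the patch] The gap check described in the commit message can be sketched roughly as follows. This is a minimal model under stated assumptions: `struct file_extent`, `enum extent_kind` and `count_gaps()` are hypothetical names invented for this sketch and are not btrfs-progs code. It only shows why an inline extent that does not advance the running end makes every following regular extent look "interrupted".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a file extent item.
 * This is NOT the btrfs-progs structure. */
enum extent_kind { EXTENT_INLINE, EXTENT_REGULAR };

struct file_extent {
	enum extent_kind kind;
	uint64_t offset;    /* file offset where the extent starts */
	uint64_t num_bytes; /* bytes of file data the extent covers */
};

/*
 * Count regular extents that do not start where the previous extent
 * ended. If count_inline is zero, inline extents do not advance the
 * running end (the situation the commit message calls a false alert).
 * If it is nonzero, they do (the behaviour the patch aims for).
 */
static size_t count_gaps(const struct file_extent *ext, size_t n,
			 int count_inline)
{
	uint64_t end = 0;
	size_t gaps = 0;

	assert(ext != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (ext[i].kind == EXTENT_INLINE) {
			/* Inline extents are never gap-checked themselves. */
			if (count_inline)
				end += ext[i].num_bytes;
			continue;
		}
		if (ext[i].offset != end)
			gaps++;
		/* Like the patched function, advance by the extent length. */
		end += ext[i].num_bytes;
	}
	return gaps;
}
```

With one inline extent covering the first 4096 bytes followed by two contiguous regular extents, the sketch reports two gaps when inline extents are ignored and none when they are counted.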






[-- Attachment #3: p2.patch --]
[-- Type: text/x-diff, Size: 3267 bytes --]

>From lufq.fnst@cn.fujitsu.com Mon Jun 26 03:37:41 2017
From: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
To: <linux-btrfs@vger.kernel.org>
CC: <marc@merlins.org>
Date: Mon, 26 Jun 2017 18:37:25 +0800
Message-ID: <20170626103727.8945-2-lufq.fnst@cn.fujitsu.com>
X-Mailer: git-send-email 2.13.1
In-Reply-To: <20170626103727.8945-1-lufq.fnst@cn.fujitsu.com>
References: <20170626103727.8945-1-lufq.fnst@cn.fujitsu.com>
MIME-Version: 1.0
Content-Type: text/plain
Subject: [PATCH v3 2/4] btrfs-progs: lowmem check: Fix false alert about referencer count mismatch

The normal back reference counting doesn't care about extents referred
to by extent data in a shared leaf. The check_extent_data_backref
function needs to skip any leaf whose owner does not match the root_id.

Reported-by: Marc MERLIN <marc@merlins.org>
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
---
 cmds-check.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/cmds-check.c b/cmds-check.c
index 70d2b7f2..f42968cd 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -10692,7 +10692,8 @@ static int check_extent_data_backref(struct btrfs_fs_info *fs_info,
 		leaf = path.nodes[0];
 		slot = path.slots[0];
 
-		if (slot >= btrfs_header_nritems(leaf))
+		if (slot >= btrfs_header_nritems(leaf) ||
+		    btrfs_header_owner(leaf) != root_id)
 			goto next;
 		btrfs_item_key_to_cpu(leaf, &key, slot);
 		if (key.objectid != objectid || key.type != BTRFS_EXTENT_DATA_KEY)
-- 
2.13.1
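
[Illustration, not part of the patch] The owner test added above can be pictured with a small model. This is a sketch under stated assumptions: `struct leaf` and `count_refs()` are hypothetical names made up for this example and do not exist in btrfs-progs. It shows how counting references from a leaf that is shared with, but owned by, another root inflates the count for the root being checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified leaf: its owner root plus the extent
 * bytenrs that its EXTENT_DATA items point at.
 * This is NOT the btrfs-progs structure. */
struct leaf {
	uint64_t owner;
	const uint64_t *extent_bytenr;
	size_t nritems;
};

/*
 * Count references to `bytenr` found while walking the leaves reachable
 * from `root_id`. With check_owner set, leaves owned by a different
 * root are skipped, mirroring the condition the patch adds.
 */
static size_t count_refs(const struct leaf *leaves, size_t nleaves,
			 uint64_t root_id, uint64_t bytenr, int check_owner)
{
	size_t found = 0;

	assert(leaves != NULL || nleaves == 0);
	for (size_t i = 0; i < nleaves; i++) {
		if (check_owner && leaves[i].owner != root_id)
			continue; /* shared leaf owned by another root */
		for (size_t j = 0; j < leaves[i].nritems; j++) {
			if (leaves[i].extent_bytenr[j] == bytenr)
				found++;
		}
	}
	return found;
}
```

In this model, a root that reaches one leaf of its own and one leaf owned by a different root counts an extent referenced from both leaves twice without the owner test and once with it.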






^ permalink raw reply related	[flat|nested] 77+ messages in thread

* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't?
  2017-06-28 14:43                                     ` Marc MERLIN
  2017-05-01 17:06                                       ` 4.11 relocate crash, null pointer Marc MERLIN
@ 2017-06-29 13:36                                       ` Lu Fengqi
  2017-06-29 15:30                                         ` Marc MERLIN
  1 sibling, 1 reply; 77+ messages in thread
From: Lu Fengqi @ 2017-06-29 13:36 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: Qu Wenruo, Btrfs BTRFS

On Wed, Jun 28, 2017 at 07:43:48AM -0700, Marc MERLIN wrote:
>[cc trimmed]
>
>On Wed, Jun 28, 2017 at 03:10:27PM +0800, Lu Fengqi wrote:
>> Because the output is abnormal, except for the relevant DIR_ITEM and
>> DIR_INDEX, I can't find the above-mentioned INODE_ITEM and EXTENT_DATA.
>> I wonder if the file system was online when this command was executed? If
>> so, please re-execute it offline; if not, could you apply my
>> patches and re-check it?
>
>The filesystem was offline and I had those 2 patches applied.

I am afraid I don't know why the inode item disappeared. Besides, if
btrfs-debug-tree can't find the inode item, btrfs check shouldn't report
an extent data interrupt for this inode item. Could you check the disk
again? The error output may have changed.



-- 
Thanks,
Lu



^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't?
  2017-06-29 13:36                                       ` How to fix errors that check --mode lomem finds, but --mode normal doesn't? Lu Fengqi
@ 2017-06-29 15:30                                         ` Marc MERLIN
  2017-06-30 14:59                                           ` Lu Fengqi
  0 siblings, 1 reply; 77+ messages in thread
From: Marc MERLIN @ 2017-06-29 15:30 UTC (permalink / raw)
  To: Lu Fengqi; +Cc: Qu Wenruo, Btrfs BTRFS

On Thu, Jun 29, 2017 at 09:36:15PM +0800, Lu Fengqi wrote:
> On Wed, Jun 28, 2017 at 07:43:48AM -0700, Marc MERLIN wrote:
> >[cc trimmed]
> >
> >On Wed, Jun 28, 2017 at 03:10:27PM +0800, Lu Fengqi wrote:
> >> Because the output is abnormal, except for the relevant DIR_ITEM and
> >> DIR_INDEX, I can't find the above-mentioned INODE_ITEM and EXTENT_DATA.
> >> I wonder if the file system was online when this command was executed? If
> >> so, please re-execute it offline; if not, could you apply my
> >> patches and re-check it?
> >
> >The filesystem was offline and I had those 2 patches applied.
> 
> I am afraid I don't know why the inode item disappeared. Besides, if
> btrfs-debug-tree can't find the inode item, btrfs check shouldn't report
> an extent data interrupt for this inode item. Could you check the disk
> again? The error output may have changed.

I just did but it takes 24H. I just have the results now: 
gargamel:~# btrfs check --mode lowmem  /dev/mapper/dshelf2
Checking filesystem on /dev/mapper/dshelf2
UUID: 85441c59-ad11-4b25-b1fe-974f9e4acede
checking extents
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
ERROR: root 3862 EXTENT_DATA[18170706 4096] interrupt
ERROR: root 3862 EXTENT_DATA[18170706 16384] interrupt
ERROR: root 3862 EXTENT_DATA[18170706 20480] interrupt
ERROR: root 3862 EXTENT_DATA[18170706 135168] interrupt
ERROR: root 3862 EXTENT_DATA[18170706 1048576] interrupt
ERROR: errors found in fs roots
found 5544779124736 bytes used, error(s) found
total csum bytes: 5344523140
total tree bytes: 71323058176
total fs tree bytes: 59288403968
total extent tree bytes: 5378277376
btree space waste bytes: 10912183048
file data blocks allocated: 7830914256896
 referenced 6244104495104


This is looking better, but not 0.
Can I ignore these or should we look into them still?

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't?
  2017-06-29 15:30                                         ` Marc MERLIN
@ 2017-06-30 14:59                                           ` Lu Fengqi
  0 siblings, 0 replies; 77+ messages in thread
From: Lu Fengqi @ 2017-06-30 14:59 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: Qu Wenruo, Btrfs BTRFS

On Thu, Jun 29, 2017 at 08:30:35AM -0700, Marc MERLIN wrote:
>On Thu, Jun 29, 2017 at 09:36:15PM +0800, Lu Fengqi wrote:
>> On Wed, Jun 28, 2017 at 07:43:48AM -0700, Marc MERLIN wrote:
>> >[cc trimmed]
>> >
>> >On Wed, Jun 28, 2017 at 03:10:27PM +0800, Lu Fengqi wrote:
>> >> Because the output is abnormal, except for the relevant DIR_ITEM and
>> >> DIR_INDEX, I can't find the above-mentioned INODE_ITEM and EXTENT_DATA.
>> >> I wonder if the file system was online when this command was executed? If
>> >> so, please re-execute it offline; if not, could you apply my
>> >> patches and re-check it?
>> >
>> >The filesystem was offline and I had those 2 patches applied.
>> 
>> I am afraid I don't know why the inode item disappeared. Besides, if
>> btrfs-debug-tree can't find the inode item, btrfs check shouldn't report
>> an extent data interrupt for this inode item. Could you check the disk
>> again? The error output may have changed.
>
>I just did but it takes 24H. I just have the results now: 
>gargamel:~# btrfs check --mode lowmem  /dev/mapper/dshelf2
>Checking filesystem on /dev/mapper/dshelf2
>UUID: 85441c59-ad11-4b25-b1fe-974f9e4acede
>checking extents
>checking free space cache
>cache and super generation don't match, space cache will be invalidated
>checking fs roots
>ERROR: root 3862 EXTENT_DATA[18170706 4096] interrupt
>ERROR: root 3862 EXTENT_DATA[18170706 16384] interrupt
>ERROR: root 3862 EXTENT_DATA[18170706 20480] interrupt
>ERROR: root 3862 EXTENT_DATA[18170706 135168] interrupt
>ERROR: root 3862 EXTENT_DATA[18170706 1048576] interrupt
>ERROR: errors found in fs roots
>found 5544779124736 bytes used, error(s) found
>total csum bytes: 5344523140
>total tree bytes: 71323058176
>total fs tree bytes: 59288403968
>total extent tree bytes: 5378277376
>btree space waste bytes: 10912183048
>file data blocks allocated: 7830914256896
> referenced 6244104495104
>
>
>This is looking better, but not 0.
>Can I ignore these or should we look into them still?

Personally, I think that since normal mode didn't report any error
related to this inode, these errors may be caused by a bug in lowmem
mode and btrfs-debug-tree.

At your convenience, would you please send me all items for this
inode? I think they could provide some clues about the disappearance
of the inode and the extent interrupt. They can be dumped with the
following command:

# btrfs-debug-tree /dev/mapper/dshelf2 | grep -C 10 18170706

Please note that this dump may contain filenames; feel free to mask
them.
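
[Illustration] If there are many such lines, the inode number to grep for can be pulled out of the check output mechanically. The following is a minimal sketch, assuming the error lines keep exactly the shape shown in the quoted output ("ERROR: root R EXTENT_DATA[I O] interrupt"). `struct interrupt_err` and `parse_interrupt()` are hypothetical names for this example, not part of btrfs-progs.

```c
#include <assert.h>
#include <stdio.h>

/* The three numbers printed in an "interrupt" error line: the root,
 * the inode (key objectid) and the file offset (key offset). */
struct interrupt_err {
	unsigned long long root;
	unsigned long long inode;
	unsigned long long offset;
};

/*
 * Parse one line of check output. Returns 1 and fills *out if the line
 * is an EXTENT_DATA "interrupt" error in the assumed format, 0 otherwise.
 */
static int parse_interrupt(const char *line, struct interrupt_err *out)
{
	int consumed = 0;

	assert(line != NULL && out != NULL);
	/* %n is only stored if the trailing "] interrupt" also matched. */
	if (sscanf(line, "ERROR: root %llu EXTENT_DATA[%llu %llu] interrupt%n",
		   &out->root, &out->inode, &out->offset, &consumed) != 3)
		return 0;
	return consumed > 0;
}
```

Each line of the check output can be passed to `parse_interrupt()`; the `inode` field of a matching line is the number to search for in the btrfs-debug-tree dump.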

Thank you for your assistance.

-- 
Thanks,
Lu



^ permalink raw reply	[flat|nested] 77+ messages in thread

* ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5
  2017-05-02  5:11                                                 ` Marc MERLIN
  2017-05-02 18:47                                                   ` btrfs check --repair: failed to repair damaged filesystem, aborting Marc MERLIN
@ 2017-07-07  5:37                                                   ` Marc MERLIN
  2017-07-07  5:39                                                     ` Marc MERLIN
  1 sibling, 1 reply; 77+ messages in thread
From: Marc MERLIN @ 2017-07-07  5:37 UTC (permalink / raw)
  To: Lu Fengqi; +Cc: Btrfs BTRFS, Josef Bacik, Qu Wenruo, David Sterba

I'm still trying to fix my filesystem.
It seems to work well enough since the damage is apparently localized, but
I'd really want check --repair to actually bring it back to a working
state, but now it's crashing

This is btrfs tools from git from a few days ago

Failed to find [4068943577088, 168, 16384]
btrfs unable to find ref byte nr 4068943577088 parent 0 root 4  owner 1 offset 0
Failed to find [5905106075648, 168, 16384]
btrfs unable to find ref byte nr 5906282119168 parent 0 root 4  owner 0 offset 1
Failed to find [21037056, 168, 16384]
btrfs unable to find ref byte nr 21037056 parent 0 root 3  owner 1 offset 0
Failed to find [21053440, 168, 16384]
btrfs unable to find ref byte nr 21053440 parent 0 root 3  owner 0 offset 1
Failed to find [21299200, 168, 16384]
btrfs unable to find ref byte nr 21299200 parent 0 root 3  owner 0 offset 1
Failed to find [5523931971584, 168, 16384]
btrfs unable to find ref byte nr 5524037566464 parent 0 root 3861  owner 3 offset 0
ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5
btrfs(+0x113cf)[0x5651e60443cf]
btrfs(__btrfs_cow_block+0x576)[0x5651e6045848]
btrfs(btrfs_cow_block+0xea)[0x5651e6045dc6]
btrfs(btrfs_search_slot+0x11df)[0x5651e604969d]
btrfs(+0x59184)[0x5651e608c184]
btrfs(cmd_check+0x2bd4)[0x5651e60987b3]
btrfs(main+0x85)[0x5651e60442c3]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f34f523d2b1]
btrfs(_start+0x2a)[0x5651e6043e3a]


Full log:
enabling repair mode
Checking filesystem on /dev/mapper/dshelf2
UUID: 85441c59-ad11-4b25-b1fe-974f9e4acede
checking extents
Fixed 0 roots.
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
checksum verify failed on 3037243965440 found 179689AF wanted 82B97043
checksum verify failed on 3037243965440 found 179689AF wanted 82B97043
checksum verify failed on 3037243998208 found 60EA5C5B wanted 0CF5948F
checksum verify failed on 3037243998208 found 60EA5C5B wanted 0CF5948F
checksum verify failed on 3037244293120 found 38382803 wanted 39E4F85E
checksum verify failed on 3037244293120 found 38382803 wanted 39E4F85E
checksum verify failed on 3037244342272 found E84F1D8F wanted 472DA98C
checksum verify failed on 3037244342272 found E84F1D8F wanted 472DA98C
checksum verify failed on 3037244669952 found 2F6E4C0E wanted E00BBF09
checksum verify failed on 3037244669952 found 2F6E4C0E wanted E00BBF09
checksum verify failed on 3037248913408 found CE2E4AEE wanted EF22F9CA
checksum verify failed on 3037248913408 found CE2E4AEE wanted EF22F9CA
checksum verify failed on 3037248929792 found C989CB0E wanted E27527BC
checksum verify failed on 3037248929792 found C989CB0E wanted E27527BC
checksum verify failed on 3037247569920 found 05848C79 wanted EF3D5598
checksum verify failed on 3037247569920 found 05848C79 wanted EF3D5598
checksum verify failed on 3037247586304 found 9D1E4E39 wanted F1EC8135
checksum verify failed on 3037247586304 found 9D1E4E39 wanted F1EC8135
checksum verify failed on 3037247619072 found BFE40520 wanted 627DB20D
checksum verify failed on 3037247619072 found BFE40520 wanted 627DB20D
checksum verify failed on 3037249208320 found A6B5775F wanted B1E6C0FC
checksum verify failed on 3037249208320 found A6B5775F wanted B1E6C0FC
checksum verify failed on 3037252534272 found 207AD7DF wanted DE72BDF7
checksum verify failed on 3037252534272 found 207AD7DF wanted DE72BDF7
checksum verify failed on 3111569391616 found 3C623707 wanted D955D668
checksum verify failed on 3111569391616 found 3C623707 wanted D955D668
checksum verify failed on 3111569768448 found 0C129F3C wanted C509003A
checksum verify failed on 3111569768448 found 0C129F3C wanted C509003A
checksum verify failed on 3111569735680 found E94C9D41 wanted 55836DD2
checksum verify failed on 3111569735680 found E94C9D41 wanted 55836DD2
checksum verify failed on 3037253435392 found 8E124EB5 wanted A3291C35
checksum verify failed on 3037253435392 found 8E124EB5 wanted A3291C35
checksum verify failed on 3037253746688 found 2B6A4DCD wanted 4323B339
checksum verify failed on 3037253746688 found 2B6A4DCD wanted 4323B339
checksum verify failed on 3111569702912 found 1048610C wanted 9856BB43
checksum verify failed on 3111569702912 found 1048610C wanted 9856BB43
checksum verify failed on 3111569801216 found CD7AAF82 wanted C1DA44DF
checksum verify failed on 3111569801216 found CD7AAF82 wanted C1DA44DF
checksum verify failed on 3037251878912 found 86FB02F3 wanted 728772CE
checksum verify failed on 3037251878912 found 86FB02F3 wanted 728772CE
checksum verify failed on 3037252861952 found CFD54426 wanted E91774C0
checksum verify failed on 3037252861952 found CFD54426 wanted E91774C0
checksum verify failed on 3037255974912 found E3655B7C wanted 8163FDDE
checksum verify failed on 3037255974912 found E3655B7C wanted 8163FDDE
checksum verify failed on 3037252927488 found E7AD88A3 wanted F6BA5B10
checksum verify failed on 3037252927488 found E7AD88A3 wanted F6BA5B10
checksum verify failed on 3037253500928 found 514A55B2 wanted 3611CD81
checksum verify failed on 3037253500928 found 514A55B2 wanted 3611CD81
checksum verify failed on 3037256105984 found 41ADA274 wanted 8F7F0A0B
checksum verify failed on 3037256105984 found 41ADA274 wanted 8F7F0A0B
Csum didn't match
The following tree block(s) is corrupted in tree 3861:
	tree block bytenr: 1710573748224, level: 1, node key: (1073956, 12, 959325)
Try to repair the btree for root 3861
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Csum didn't match
Failed to find [4068943577088, 168, 16384]
btrfs unable to find ref byte nr 4068943577088 parent 0 root 4  owner 1 offset 0
Failed to find [5905106075648, 168, 16384]
btrfs unable to find ref byte nr 5906282119168 parent 0 root 4  owner 0 offset 1
Failed to find [21037056, 168, 16384]
btrfs unable to find ref byte nr 21037056 parent 0 root 3  owner 1 offset 0
Failed to find [21053440, 168, 16384]
btrfs unable to find ref byte nr 21053440 parent 0 root 3  owner 0 offset 1
Failed to find [21299200, 168, 16384]
btrfs unable to find ref byte nr 21299200 parent 0 root 3  owner 0 offset 1
Failed to find [5523931971584, 168, 16384]
btrfs unable to find ref byte nr 5524037566464 parent 0 root 3861  owner 3 offset 0
ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5
btrfs(+0x113cf)[0x5651e60443cf]
btrfs(__btrfs_cow_block+0x576)[0x5651e6045848]
btrfs(btrfs_cow_block+0xea)[0x5651e6045dc6]
btrfs(btrfs_search_slot+0x11df)[0x5651e604969d]
btrfs(+0x59184)[0x5651e608c184]
btrfs(cmd_check+0x2bd4)[0x5651e60987b3]
btrfs(main+0x85)[0x5651e60442c3]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f34f523d2b1]
btrfs(_start+0x2a)[0x5651e6043e3a]
Aborted
gargamel:~# 
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5
  2017-07-07  5:37                                                   ` ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5 Marc MERLIN
@ 2017-07-07  5:39                                                     ` Marc MERLIN
  2017-07-07  9:33                                                       ` Lu Fengqi
  0 siblings, 1 reply; 77+ messages in thread
From: Marc MERLIN @ 2017-07-07  5:39 UTC (permalink / raw)
  To: Lu Fengqi; +Cc: Btrfs BTRFS, Josef Bacik, Qu Wenruo, David Sterba

On Thu, Jul 06, 2017 at 10:37:18PM -0700, Marc MERLIN wrote:
> I'm still trying to fix my filesystem.
> It seems to work well enough since the damage is apparently localized, but
> I'd really want check --repair to actually bring it back to a working
> state, but now it's crashing
> 
> This is btrfs tools from git from a few days ago
> 
> Failed to find [4068943577088, 168, 16384]
> btrfs unable to find ref byte nr 4068943577088 parent 0 root 4  owner 1 offset 0
> Failed to find [5905106075648, 168, 16384]
> btrfs unable to find ref byte nr 5906282119168 parent 0 root 4  owner 0 offset 1
> Failed to find [21037056, 168, 16384]
> btrfs unable to find ref byte nr 21037056 parent 0 root 3  owner 1 offset 0
> Failed to find [21053440, 168, 16384]
> btrfs unable to find ref byte nr 21053440 parent 0 root 3  owner 0 offset 1
> Failed to find [21299200, 168, 16384]
> btrfs unable to find ref byte nr 21299200 parent 0 root 3  owner 0 offset 1
> Failed to find [5523931971584, 168, 16384]
> btrfs unable to find ref byte nr 5524037566464 parent 0 root 3861  owner 3 offset 0
> ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5
> btrfs(+0x113cf)[0x5651e60443cf]
> btrfs(__btrfs_cow_block+0x576)[0x5651e6045848]
> btrfs(btrfs_cow_block+0xea)[0x5651e6045dc6]
> btrfs(btrfs_search_slot+0x11df)[0x5651e604969d]
> btrfs(+0x59184)[0x5651e608c184]
> btrfs(cmd_check+0x2bd4)[0x5651e60987b3]
> btrfs(main+0x85)[0x5651e60442c3]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f34f523d2b1]
> btrfs(_start+0x2a)[0x5651e6043e3a]

Mmmh, never mind, it seems that the software raid suffered yet another
double disk failure due to some undetermined flakiness in the underlying block
device cabling :-/
That would likely explain the failures here.

 

-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5
  2017-07-07  5:39                                                     ` Marc MERLIN
@ 2017-07-07  9:33                                                       ` Lu Fengqi
  2017-07-07 16:38                                                         ` Marc MERLIN
  0 siblings, 1 reply; 77+ messages in thread
From: Lu Fengqi @ 2017-07-07  9:33 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: Btrfs BTRFS, Josef Bacik, Qu Wenruo, David Sterba

On Thu, Jul 06, 2017 at 10:39:53PM -0700, Marc MERLIN wrote:
>On Thu, Jul 06, 2017 at 10:37:18PM -0700, Marc MERLIN wrote:
>> I'm still trying to fix my filesystem.
>> It seems to work well enough since the damage is apparently localized, but
>> I'd really want check --repair to actually bring it back to a working
>> state, but now it's crashing

I apologise for my late reply. A colleague left, so I recently had to take
over his work.

>> 
>> This is btrfs tools from git from a few days ago
>> 
>> Failed to find [4068943577088, 168, 16384]
>> btrfs unable to find ref byte nr 4068943577088 parent 0 root 4  owner 1 offset 0
>> Failed to find [5905106075648, 168, 16384]
>> btrfs unable to find ref byte nr 5906282119168 parent 0 root 4  owner 0 offset 1
>> Failed to find [21037056, 168, 16384]
>> btrfs unable to find ref byte nr 21037056 parent 0 root 3  owner 1 offset 0
>> Failed to find [21053440, 168, 16384]
>> btrfs unable to find ref byte nr 21053440 parent 0 root 3  owner 0 offset 1
>> Failed to find [21299200, 168, 16384]
>> btrfs unable to find ref byte nr 21299200 parent 0 root 3  owner 0 offset 1
>> Failed to find [5523931971584, 168, 16384]
>> btrfs unable to find ref byte nr 5524037566464 parent 0 root 3861  owner 3 offset 0
>> ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5
>> btrfs(+0x113cf)[0x5651e60443cf]
>> btrfs(__btrfs_cow_block+0x576)[0x5651e6045848]
>> btrfs(btrfs_cow_block+0xea)[0x5651e6045dc6]
>> btrfs(btrfs_search_slot+0x11df)[0x5651e604969d]
>> btrfs(+0x59184)[0x5651e608c184]
>> btrfs(cmd_check+0x2bd4)[0x5651e60987b3]
>> btrfs(main+0x85)[0x5651e60442c3]
>> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f34f523d2b1]
>> btrfs(_start+0x2a)[0x5651e6043e3a]
>
>Mmmh, never mind, it seems that the software raid suffered yet another
>double disk failure due to some undetermined flakiness in the underlying block
>device cabling :-/
>That would likely explain the failures here.

I'm sorry to hear this. Which raid level are you using? Could you recover
from this double disk failure?

-- 
Thanks,
Lu



^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5
  2017-07-07  9:33                                                       ` Lu Fengqi
@ 2017-07-07 16:38                                                         ` Marc MERLIN
  2017-07-09  4:34                                                           ` 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) Marc MERLIN
  0 siblings, 1 reply; 77+ messages in thread
From: Marc MERLIN @ 2017-07-07 16:38 UTC (permalink / raw)
  To: Lu Fengqi; +Cc: Btrfs BTRFS, Josef Bacik, Qu Wenruo, David Sterba

On Fri, Jul 07, 2017 at 05:33:20PM +0800, Lu Fengqi wrote:
> I apologise for my late reply. As a colleague left, I have to take over his
> work recently.
 
No worries.

> >Mmmh, never mind, it seems that the software raid suffered yet another
> >double disk failure due to some undetermined flakiness in the underlying block
> >device cabling :-/
> >That would likely explain the failures here.
> 
> I'm sorry to hear this. Which raid level are you using? Could you recover
> from this double disk failure?

The disks aren't failed, and the array wasn't being written to.
It's just a matter of putting the disks back in the md raid5 array in the
right order.

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  

^ permalink raw reply	[flat|nested] 77+ messages in thread

* 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
  2017-07-07 16:38                                                         ` Marc MERLIN
@ 2017-07-09  4:34                                                           ` Marc MERLIN
  2017-07-09  5:05                                                             ` We really need a better/working btrfs check --repair Marc MERLIN
                                                                               ` (2 more replies)
  0 siblings, 3 replies; 77+ messages in thread
From: Marc MERLIN @ 2017-07-09  4:34 UTC (permalink / raw)
  To: Lu Fengqi; +Cc: Btrfs BTRFS, David Sterba

Sigh,

This is now the 3rd filesystem I have (on 3 different machines) that is
getting corruption of some kind (on 4.11.6).
This is starting to look suspicious :-/

Can I fix this filesystem in some other way?
gargamel:/var/local/scr/host# btrfs check --repair /dev/mapper/crypt_bcache2 
enabling repair mode
Checking filesystem on /dev/mapper/crypt_bcache2
UUID: c4e6f9ca-e9a2-43d7-befa-763fc2cd5a57
checking extents
ref mismatch on [14655689654272 16384] extent item 0, found 1
Backref 14655689654272 parent 15455 root 15455 not found in extent tree
backpointer mismatch on [14655689654272 16384]
owner ref check failed [14655689654272 16384]
repair deleting extent record: key 14655689654272 169 1
adding new tree backref on start 14655689654272 len 16384 parent 0 root 15455
Repaired extent references for 14655689654272
root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
ERROR: failed to repair root items: Invalid argument

Recreating the filesystem is going to take me a week of work, a lot of it
manual, and I'm not feeling very good about doing this since the backup
server this is a backup of is also seeing some (hopefully minor) problems
too.

I really hope there isn't a new corruption problem in 4.11, because when
I'm getting corruption on my laptop, my backup server, and the backup of my
backup server, I'm starting to run out of redundant backups :(
(and I'm not mentioning all the time this is costing me)

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  

^ permalink raw reply	[flat|nested] 77+ messages in thread

* We really need a better/working btrfs check --repair
  2017-07-09  4:34                                                           ` 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) Marc MERLIN
@ 2017-07-09  5:05                                                             ` Marc MERLIN
  2017-07-09  6:34                                                             ` 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) Marc MERLIN
  2017-07-09  7:57                                                             ` Martin Steigerwald
  2 siblings, 0 replies; 77+ messages in thread
From: Marc MERLIN @ 2017-07-09  5:05 UTC (permalink / raw)
  To: Lu Fengqi, Chris Mason; +Cc: Btrfs BTRFS, David Sterba

+Chris

On Sat, Jul 08, 2017 at 09:34:17PM -0700, Marc MERLIN wrote:
> gargamel:/var/local/scr/host# btrfs check --repair /dev/mapper/crypt_bcache2 
> enabling repair mode
> Checking filesystem on /dev/mapper/crypt_bcache2
> UUID: c4e6f9ca-e9a2-43d7-befa-763fc2cd5a57
> checking extents
> ref mismatch on [14655689654272 16384] extent item 0, found 1
> Backref 14655689654272 parent 15455 root 15455 not found in extent tree
> backpointer mismatch on [14655689654272 16384]
> owner ref check failed [14655689654272 16384]
> repair deleting extent record: key 14655689654272 169 1
> adding new tree backref on start 14655689654272 len 16384 parent 0 root 15455
> Repaired extent references for 14655689654272
> root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
> ERROR: failed to repair root items: Invalid argument

On this note, getting hit 3 times on 3 different filesystems, that are not
badly damaged, but in none of those cases can btrfs check --repair put them
in a working state, is really bringing home the problem with lack of proper
fsck.

I understand that some errors are hard to fix without unknown data loss, but
btrfs check --repair should just do what it takes to put the filesystem back
into a consistent state, never mind what data is lost.
Restoring 10 to 20TB of data is getting old and is not really an acceptable
answer as the only way out.
I should not have to recreate a filesystem as the only way to bring it back
to a working state. 

Before Duncan tells me my filesystem is too big, and I should keep to very
small filesystems so that it's less work for each time btrfs gets corrupted
again, and fails again to bring back the filesystem to a usable state after
discarding some data, that's just not an acceptable answer long term, and by
long term honestly I mean now.
I just have data that doesn't segment well and the more small filesystems I
make the more time I'm going to waste managing them all and dealing with
which one gets full first :(

So, whether 4.11 has a corruption problem, or not, please put some resources
behind btrfs check --repair, be it the lowmem mode, or not.

Thank you
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
  2017-07-09  4:34                                                           ` 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) Marc MERLIN
  2017-07-09  5:05                                                             ` We really need a better/working btrfs check --repair Marc MERLIN
@ 2017-07-09  6:34                                                             ` Marc MERLIN
  2017-07-09  7:57                                                             ` Martin Steigerwald
  2 siblings, 0 replies; 77+ messages in thread
From: Marc MERLIN @ 2017-07-09  6:34 UTC (permalink / raw)
  To: Lu Fengqi; +Cc: Btrfs BTRFS, David Sterba

On Sat, Jul 08, 2017 at 09:34:17PM -0700, Marc MERLIN wrote:
> Sigh,
> 
> This is now the 3rd filesystem I have (on 3 different machines) that is
> getting corruption of some kind (on 4.11.6).
> This is starting to look suspicious :-/
> 
> Can I fix this filesystem in some other way?
> gargamel:/var/local/scr/host# btrfs check --repair /dev/mapper/crypt_bcache2 
> enabling repair mode
> Checking filesystem on /dev/mapper/crypt_bcache2
> UUID: c4e6f9ca-e9a2-43d7-befa-763fc2cd5a57
> checking extents
> ref mismatch on [14655689654272 16384] extent item 0, found 1
> Backref 14655689654272 parent 15455 root 15455 not found in extent tree
> backpointer mismatch on [14655689654272 16384]
> owner ref check failed [14655689654272 16384]
> repair deleting extent record: key 14655689654272 169 1
> adding new tree backref on start 14655689654272 len 16384 parent 0 root 15455
> Repaired extent references for 14655689654272
> root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
> ERROR: failed to repair root items: Invalid argument

Mmmh, actually to be fair, this was the 2nd run, I didn't scroll back
enough and missed the first run (doing too many recoveries at once,
I'm getting mixed up).
This first run looks like a lot more things happened:
http://marc.merlins.org/tmp/btrfs_check_ds5.txt

The number of things that went wrong here is very worrisome, given that
there were no issues with those drives and that array has been working
for over a year without problems, until I recently upgraded to 4.11 :(

Now mind you, despite the 21MB of things that got fixed, I still kind of
have the expectation that btrfs check --repair continues and fixes
everything until the filesystem is clean again, just like e2fsck -f
would, but I understand that this filesystem somehow got corrupted to a
point that it's maybe not that simple to do so.

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
  2017-07-09  4:34                                                           ` 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) Marc MERLIN
  2017-07-09  5:05                                                             ` We really need a better/working btrfs check --repair Marc MERLIN
  2017-07-09  6:34                                                             ` 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) Marc MERLIN
@ 2017-07-09  7:57                                                             ` Martin Steigerwald
  2017-07-09  9:16                                                               ` Paul Jones
  2017-07-31 21:07                                                               ` Ivan Sizov
  2 siblings, 2 replies; 77+ messages in thread
From: Martin Steigerwald @ 2017-07-09  7:57 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: Lu Fengqi, Btrfs BTRFS, David Sterba

Hello Marc.

Marc MERLIN - 08.07.17, 21:34:
> Sigh,
> 
> This is now the 3rd filesystem I have (on 3 different machines) that is
> getting corruption of some kind (on 4.11.6).

Anyone else getting corruptions with 4.11?

I will happily switch back to 4.10.17 or even 4.9 if that is the case. I may even
do so just from your reports. Well, yes, I will do exactly that. I am just switching
back to 4.10 for now. Better safe than sorry.

I know how you feel, Marc. I posted about a corruption on one of my backup 
harddisks here some time ago that btrfs check --repair wasn't able to handle. 
I redid that disk from scratch and it took a long, long time.

I agree with you that this has to stop. Before that I will never *ever* 
recommend this to a customer. Ideally no corruptions in stable kernels, 
especially when it's a .6 at the end of the version number. But if so… then 
fixable. Other filesystems like Ext4 and XFS can do it… so this should be 
possible with BTRFS as well.

Thanks,
-- 
Martin

^ permalink raw reply	[flat|nested] 77+ messages in thread

* RE: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
  2017-07-09  7:57                                                             ` Martin Steigerwald
@ 2017-07-09  9:16                                                               ` Paul Jones
  2017-07-09 11:17                                                                 ` Duncan
  2017-07-31 21:07                                                               ` Ivan Sizov
  1 sibling, 1 reply; 77+ messages in thread
From: Paul Jones @ 2017-07-09  9:16 UTC (permalink / raw)
  To: Martin Steigerwald, Marc MERLIN; +Cc: Btrfs BTRFS


> -----Original Message-----
> From: linux-btrfs-owner@vger.kernel.org [mailto:linux-btrfs-
> owner@vger.kernel.org] On Behalf Of Martin Steigerwald
> Sent: Sunday, 9 July 2017 5:58 PM
> To: Marc MERLIN <marc@merlins.org>
> Cc: Lu Fengqi <lufq.fnst@cn.fujitsu.com>; Btrfs BTRFS <linux-
> btrfs@vger.kernel.org>; David Sterba <dsterba@suse.cz>
> Subject: Re: 4.11.6 / more corruption / root 15455 has a root item with a more
> recent gen (33682) compared to the found root node (0)
> 
> Hello Marc.
> 
> Marc MERLIN - 08.07.17, 21:34:
> > Sigh,
> >
> > This is now the 3rd filesystem I have (on 3 different machines) that
> > is getting corruption of some kind (on 4.11.6).
> 
> Anyone else getting corruptions with 4.11?
> 
> I happily switch back to 4.10.17 or even 4.9 if that is the case. I may even do
> so just from your reports. Well, yes, I will do exactly that. I just switch back
> for 4.10 for now. Better be safe, than sorry.

No corruption for me - I've been on 4.11 since about .2 and everything seems fine. Currently on 4.11.8

Paul.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
  2017-07-09  9:16                                                               ` Paul Jones
@ 2017-07-09 11:17                                                                 ` Duncan
  2017-07-09 13:00                                                                   ` Martin Steigerwald
  2017-07-29 19:29                                                                   ` Imran Geriskovan
  0 siblings, 2 replies; 77+ messages in thread
From: Duncan @ 2017-07-09 11:17 UTC (permalink / raw)
  To: linux-btrfs

Paul Jones posted on Sun, 09 Jul 2017 09:16:36 +0000 as excerpted:

>> Marc MERLIN - 08.07.17, 21:34:
>> >
>> > This is now the 3rd filesystem I have (on 3 different machines) that
>> > is getting corruption of some kind (on 4.11.6).
>> 
>> Anyone else getting corruptions with 4.11?
>> 
>> I happily switch back to 4.10.17 or even 4.9 if that is the case. I may
>> even do so just from your reports. Well, yes, I will do exactly that. I
>> just switch back for 4.10 for now. Better be safe, than sorry.
> 
> No corruption for me - I've been on 4.11 since about .2 and everything
> seems fine. Currently on 4.11.8

No corruptions here either. 4.12.0 now, previously 4.12-rc5(ish, git), 
before that 4.11.0.

I have however just upgraded to new ssds then wiped and setup the old 
ones as another backup set, so everything is on brand new filesystems on 
fast ssds, no possibility of old undetected corruption suddenly 
triggering problems.

Also, all my btrfs are raid1 or dup for checksummed redundancy, and 
relatively small, the largest now 80 GiB per device, after the upgrade.  
And my use-case doesn't involve snapshots or subvolumes.  

So any bug that is most likely on older filesystems, say those without 
the no-holes feature, for instance, or that doesn't tend to hit raid1 or 
dup mode, or that is less likely on small filesystems on fast ssds, or 
that triggers most often with reflinks and thus on filesystems with 
snapshots, is unlikely to hit me.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
  2017-07-09 11:17                                                                 ` Duncan
@ 2017-07-09 13:00                                                                   ` Martin Steigerwald
  2017-07-29 19:29                                                                   ` Imran Geriskovan
  1 sibling, 0 replies; 77+ messages in thread
From: Martin Steigerwald @ 2017-07-09 13:00 UTC (permalink / raw)
  To: linux-btrfs

Hello Duncan.

Duncan - 09.07.17, 11:17:
> Paul Jones posted on Sun, 09 Jul 2017 09:16:36 +0000 as excerpted:
> >> Marc MERLIN - 08.07.17, 21:34:
> >> > This is now the 3rd filesystem I have (on 3 different machines) that
> >> > is getting corruption of some kind (on 4.11.6).
> >> 
> >> Anyone else getting corruptions with 4.11?
> >> 
> >> I happily switch back to 4.10.17 or even 4.9 if that is the case. I may
> >> even do so just from your reports. Well, yes, I will do exactly that. I
> >> just switch back for 4.10 for now. Better be safe, than sorry.
> > 
> > No corruption for me - I've been on 4.11 since about .2 and everything
> > seems fine. Currently on 4.11.8
> 
> No corruptions here either. 4.12.0 now, previously 4.12-rc5(ish, git),
> before that 4.11.0.
> 
> I have however just upgraded to new ssds then wiped and setup the old
[…]
> Also, all my btrfs are raid1 or dup for checksummed redundancy, and
> relatively small, the largest now 80 GiB per device, after the upgrade.
> And my use-case doesn't involve snapshots or subvolumes.
> 
> So any bug that is most likely on older filesystems, say those without
> the no-holes feature, for instance, or that doesn't tend to hit raid1 or
> dup mode, or that is less likely on small filesystems on fast ssds, or
> that triggers most often with reflinks and thus on filesystems with
> snapshots, is unlikely to hit me.

Hmmm, the BTRFS filesystems on my laptop are 3 to 5 or even more years old. I'll
stick with 4.10 for now, I think.

The older ones are RAID 1 across two SSDs, the newer one is single device, on 
one SSD.

These filesystems didn't fail me in years and since 4.5 or 4.6 even the "I 
search for free space" kernel hang (hung tasks and all that) is gone as well.

Thanks,
-- 
Martin

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
  2017-07-09 11:17                                                                 ` Duncan
  2017-07-09 13:00                                                                   ` Martin Steigerwald
@ 2017-07-29 19:29                                                                   ` Imran Geriskovan
  2017-07-29 23:38                                                                     ` Duncan
  1 sibling, 1 reply; 77+ messages in thread
From: Imran Geriskovan @ 2017-07-29 19:29 UTC (permalink / raw)
  To: Duncan; +Cc: linux-btrfs

On 7/9/17, Duncan <1i5t5.duncan@cox.net> wrote:
> I have however just upgraded to new ssds then wiped and setup the old
> ones as another backup set, so everything is on brand new filesystems on
> fast ssds, no possibility of old undetected corruption suddenly
> triggering problems.
>
> Also, all my btrfs are raid1 or dup for checksummed redundancy

Do you have any experience/advice/comment regarding
dup data on ssds?

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
  2017-07-29 19:29                                                                   ` Imran Geriskovan
@ 2017-07-29 23:38                                                                     ` Duncan
  2017-07-30 14:54                                                                       ` Imran Geriskovan
  0 siblings, 1 reply; 77+ messages in thread
From: Duncan @ 2017-07-29 23:38 UTC (permalink / raw)
  To: linux-btrfs

Imran Geriskovan posted on Sat, 29 Jul 2017 21:29:46 +0200 as excerpted:

> On 7/9/17, Duncan <1i5t5.duncan@cox.net> wrote:
>> I have however just upgraded to new ssds then wiped and setup the old
>> ones as another backup set, so everything is on brand new filesystems on
>> fast ssds, no possibility of old undetected corruption suddenly
>> triggering problems.
>>
>> Also, all my btrfs are raid1 or dup for checksummed redundancy
> 
> Do you have any experience/advice/comment regarding
> dup data on ssds?

Very good question. =:^)

Limited.  Most of my btrfs are raid1, with dup only used on the device-
respective /boot btrfs (of which there are four, one on each of the two 
ssds that otherwise form the btrfs raid1 pairs, for each of the working 
and backup copy pairs -- I can use BIOS to select any of the four to 
boot), and those are all sub-GiB mixed-bg mode.

So all my dup experience is sub-GiB mixed-blockgroup mode.

Within that limitation, my only btrfs problem has been that at my 
initially chosen size of 256 MiB, mkfs.btrfs at least used to create an 
initial data/metadata chunk of 64 MiB.  Remember, this is dup mode, so 
there's two of them = 128 MiB.  Because there's also a system chunk, that 
means the initial chunk cannot be balanced even with an entirely empty 
filesystem, because there's not enough space to write a second 64 MiB 
chunk duped to 128 MiB.
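
As a rough illustration of that arithmetic (a sketch only, not btrfs code:
the 8 MiB system chunk size, the assumption that it is duplicated too, and
the helper name are all hypothetical), balancing the first chunk needs
enough unallocated space to hold a complete duped copy of it:

```python
MIB = 1024 * 1024


def can_balance_first_chunk(fs_size, chunk_size, system_chunk, copies=2):
    """Return True if a replacement chunk fits in the unallocated space.

    fs_size      -- total filesystem size in bytes
    chunk_size   -- size of the initial mixed data/metadata chunk
    system_chunk -- size of the system chunk (assumed value, see above)
    copies       -- 2 for dup mode
    """
    allocated = copies * (chunk_size + system_chunk)
    unallocated = fs_size - allocated
    # Balance has to write the new copy before the old chunk is freed.
    needed = copies * chunk_size
    return unallocated >= needed


for size_mib in (256, 512):
    ok = can_balance_first_chunk(size_mib * MIB, 64 * MIB, 8 * MIB)
    print(f"{size_mib} MiB filesystem: balance possible = {ok}")
```

With these assumed numbers the 256 MiB case leaves 112 MiB unallocated
against 128 MiB needed, while the 512 MiB case leaves 368 MiB.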

Between that and the 256 MiB in dup mode size meaning under 128 MiB 
usable, and the fact that I routinely run and sometimes need to bisect 
pre-release kernels, I was routinely running out of space, then cleaning 
up, but not being able to do a full cleanup without a blow-away and new 
mkfs.btrfs, because I couldn't balance.

When I recently purchased the second pair of (now larger) ssds in order 
to put everything, including the media and backups that were previously 
still on spinning rust, on ssd, I redid the layout and made the /boots 
512 MiB, still mixed-bg dup mode.  That seems to have solved the problem, 
and I can now rebalance the first mkfs.btrfs-created mixed-bg chunk, as 
it's now small enough that it's less than half the filesystem even when 
duped.

Because it's now 512 MiB, however, I can't say for sure whether the 
previous problem is fixed or not.  That problem was mkfs.btrfs creating an 
initial mixed-bg chunk of a quarter the 256 MiB filesystem size: in dup 
mode that chunk is half the total filesystem size, and with the system 
chunk as well, the other half is partially used, so there's no space to 
write the balance destination chunks.  What I can say is that the 
problem doesn't affect the new 512 MiB size, at least with btrfs-progs 
4.11.x, which is what I used to mkfs.btrfs the new layout.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
  2017-07-29 23:38                                                                     ` Duncan
@ 2017-07-30 14:54                                                                       ` Imran Geriskovan
  2017-07-31  4:53                                                                         ` Duncan
  0 siblings, 1 reply; 77+ messages in thread
From: Imran Geriskovan @ 2017-07-30 14:54 UTC (permalink / raw)
  To: Duncan; +Cc: linux-btrfs

On 7/30/17, Duncan <1i5t5.duncan@cox.net> wrote:
>>> Also, all my btrfs are raid1 or dup for checksummed redundancy

>> Do you have any experience/advice/comment regarding
>> dup data on ssds?

> Very good question. =:^)

> Limited.  Most of my btrfs are raid1, with dup only used on the device-
> respective /boot btrfs (of which there are four, one on each of the two
> ssds that otherwise form the btrfs raid1 pairs, for each of the working
> and backup copy pairs -- I can use BIOS to select any of the four to
> boot), and those are all sub-GiB mixed-bg mode.

Is this a military or deep space device? ;)

> So all my dup experience is sub-GiB mixed-blockgroup mode.
>
> Within that limitation, my only btrfs problem has been that at my
> initially chosen size of 256 MiB, mkfs.btrfs at least used to create an
> initial data/metadata chunk of 64 MiB.  Remember, this is dup mode, so
> there's two of them = 128 MiB.  Because there's also a system chunk, that
> means the initial chunk cannot be balanced even with an entirely empty
> filesystem, because there's not enough space to write a second 64 MiB
> chunk duped to 128 MiB.

For /boot, I've also tried dup data.

But because of the combination of constraints you've mentioned,
I totally gave up trying to have a bulletproof /boot,
as my poor laptop is not as mission critical as your device and
I always have bootable backups and always carry
some bootable sdcards.

Perhaps that has something to do with me kicking
out all systemd, inits, initramfs, mkinitcpio, dracut, etc, etc.

Now the init on /boot is a "19 lines" shell script, including lines
for keymap, hdparm, cryptsetup. And let's not forget this is made
possible by a custom kernel and its reliable buddy syslinux.

Interestingly, my search for reliability started with
"dup data" and ended up here. :)

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
  2017-07-30 14:54                                                                       ` Imran Geriskovan
@ 2017-07-31  4:53                                                                         ` Duncan
  2017-07-31 20:32                                                                           ` Imran Geriskovan
  0 siblings, 1 reply; 77+ messages in thread
From: Duncan @ 2017-07-31  4:53 UTC (permalink / raw)
  To: linux-btrfs

Imran Geriskovan posted on Sun, 30 Jul 2017 16:54:25 +0200 as excerpted:

> On 7/30/17, Duncan <1i5t5.duncan@cox.net> wrote:
>>>> Also, all my btrfs are raid1 or dup for checksummed redundancy
> 
>>> Do you have any experience/advice/comment regarding dup data on ssds?
> 
>> Very good question. =:^)
> 
>> Limited.  Most of my btrfs are raid1, with dup only used on the device-
>> respective /boot btrfs (of which there are four, one on each of the two
>> ssds that otherwise form the btrfs raid1 pairs, for each of the working
>> and backup copy pairs -- I can use BIOS to select any of the four to
>> boot), and those are all sub-GiB mixed-bg mode.
> 
> Is this a military or deep space device? ;)

Just happens to have four physical ssds, two pairs, with everything but 
/boot being paired btrfs raid1.  Because I wanted similar partition 
layout for ease of management, that's a /boot on each one, and because 
bios can only point to one at a time, that's four separate grub installs
[1], each of which is configured to load its own /boot.

While four is a bit much, three can certainly be very useful, because it 
allows a bad grub upgrade to be core-installed to one BIOS-boot 
partition, while allowing me to fat-finger point it to the wrong /boot on 
a second device destroying my ability to boot to it as well, and still 
have a third untouched to boot from.  The fourth is simply bonus insurance 
on that, more by accident due to having two pair than because I really 
needed it.

A minimum of three /boots is also quite convenient for my kernel update 
routine, given I routinely test and sometimes bisect pre-release 
kernels.  The default/working /boot gets the prereleases with a release 
and stable fallback, the first backup the releases and a stable fallback, 
and the secondary backups get updated less frequently, generally when I'm 
doing a / backup cycle as well and there has been either a kernel config 
or system change substantial enough that I'm no longer confident the 
older kernels will work correctly with the updated system.

Of course the same general testing/release/stable /boot system works well 
for other related updates, say to the grub menu (I use grub2's bash-like 
scripting language directly, not the high level stuff which I find too 
difficult to tweak to my liking) or the initrd, which I attach to the 
individual kernels at build-time, so a tested kernel selection is a 
tested initramfs selection as well.

> For /boot, I've also tried dup data.
> 
> But because of combinations of constraints you've mentioned,
> I totally give-up trying to have a bullet proof /boot as my poor laptop
> is not mission critical as your device and as I do always have bootable
> backups and always carry some bootable sdcards.

When I complained about the 64-MiB default mixed-bg mode chunk size on a 
256 MiB filesystem being too big to allow balance in dup mode, a dev 
answered that in theory chunk sizes are supposed to be limited to 1/8 
filesystem size (down to something like a 16 MiB minimum chunk size I 
think, but might be 8 or 32), but something about my setup, likely the 
mixed-bg mode as it's less tested, was short-circuiting that, thus the 
quarter-fs-size 64 MiB chunk sizes, which he agreed didn't make much 
sense on a 256 MiB filesystem in dup mode.
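
FWIW, the arithmetic behind that can be sketched roughly as below.  This is 
only an illustration of the 1/8 rule as it was described to me, not the 
actual mkfs.btrfs logic; the function name is made up, and the 16 MiB 
minimum is my guess (as I said, it might be 8 or 32).

```shell
#!/bin/sh
# Illustrative only: expected chunk size under the "1/8 of filesystem size"
# rule as described in the message. MIN_MIB=16 is an assumption.
MIN_MIB=16

# expected_chunk_mib FS_SIZE_MIB -> prints the expected chunk size in MiB
expected_chunk_mib() {
    chunk=$(( $1 / 8 ))
    if [ "$chunk" -lt "$MIN_MIB" ]; then
        chunk=$MIN_MIB
    fi
    echo "$chunk"
}

# Compare the rule against the 64 MiB chunks actually observed.
for fs in 256 512; do
    echo "fs=${fs}MiB expected=$(expected_chunk_mib "$fs")MiB observed=64MiB"
done
```

So a 256 MiB filesystem should get 32 MiB chunks under that rule, while 
the 64 MiB observed only matches the rule once the filesystem reaches a 
half gig.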

He was able to duplicate the problem, and there seemed no disagreement it 
was a bug, but I'm not sure if mkfs.btrfs was ever patched to fix it, and 
of course now with the bigger half-gig filesystem the same 64-MiB initial 
chunk size is fine.

And my other quarter-gig btrfs, log, is raid1, quarter-gig per device, so 
I'd not see the problem there, mixed-mode or not.  (As mentioned in the 
footnote below, at least in this go-round it's not... more by accident 
than intent.)

Meanwhile, such bugs come with the territory when you're running what 
might be roughly compared at the commercial software level to late beta 
or rc level software, or even initial release, pre-service-release-1, 
level, which I'd argue is a more accurate btrfs comparison at this 
point.  As long as you stay within the known stable areas the danger of 
it eating your data is relatively small now, but the full feature set 
isn't there yet, and some of the features that are there are 
significantly less mature and stable than others.

> Perhaps that has something to do with me kicking out all systemd, inits,
> initramfs, mkinitcpio, dracut, etc, etc.
> 
> Now the init on /boot is a "19 lines" shell script, including lines for
> keymap, hdparm, crytpsetup. And let's not forget this is possible by a
> custom kernel, its reliable buddy syslinux.

FWIW...

I really like grub2, especially its quite flexible bash-like scripting 
language (the higher level stuff intended for normal users just isn't 
flexible enough for me, so I need the scripting language anyway, and once 
I knew that, the higher level stuff only got in the way) and command line 
that allows all sorts of stuff like browsing for kernel commandline 
documentation at the boot prompt that I never imagined possible in a boot 
manager.

And after holding off for awhile, I'm now a cautious adopter and 
supporter of systemd in general, tho I don't use its solutions for 
/everything/ and don't like its extremely aggressive feature expansion.

And after resisting an initr* for years as unnecessary, I've been a 
reluctant adopter since a btrfs raid1 root effectively requires it 
(rootflags=device= doesn't seem to work, for whatever reason, or at least 
didn't when I initially converted to btrfs, so at least a limited initr* 
seems the only viable solution for a btrfs raid1 root).

And I'm using dracut for that, tho quite cut down from its default, with 
a monolithic kernel and only installing necessary dracut modules.

But particularly after the last dracut update pulled in kmod as a 
mandatory dep as it now links against its libs, despite my monolithic 
kernel built without module support, I've been considering similar initr* 
alternatives, including hand-rolling my own initr* build scripts.

Because I'm still not happy having to run an initr* at all, especially 
since there's more "magic" there than I'm particularly comfortable with 
since I like to grok the boot and thus potential recovery process better 
than I do this, and dracut was just the most convenient option at the 
time.

But kmod isn't a /huge/ dep, particularly with the executables and docs 
install-masked so it's only the library, headers and *.pc config file 
installed, and the current dracut solution works /reasonably/ well, so 
finding/creating an alternative isn't particularly high on my priority 
list, and I'll probably never do it unless dracut suddenly decides some 
of its other modules are going to need mandatory deps, or something else 
radically changes the current fragile balance and I really do need that 
currently lacking initr* grok.

> Interestingly my seach for reliability started with "dup data" and ended
> up here. :)

=:^)

---
[1] Grub and partition layout:  I install grub-core (i386-pc) to a raw 
GPT legacy BIOS boot partition.  While this only requires a partition 
size of about a third of a MiB, I use gdisk's default 1 MiB alignment and 
the first MiB is the GPT and the alignment gap, so this first BIOS boot 
partition starts at 1 MiB and must be a whole MiB unit in size.  Because 
I wanted plenty of room, however, and wanted additional partitions a 
minimum of 4 MiB aligned, I configured a 3 MiB BIOS boot partition for 
grub to use, thus accomplishing that 4 MiB alignment for further 
partitions.

The second partition is a currently unused GPT EFI partition for forward 
compatibility, 252 MiB in size so further partitions are quarter-GiB 
aligned.

The third partition is the /boot partition we've been discussing, a half 
GiB in size, thus ending at 3/4 GiB.  It's my only btrfs mixed-mode dup 
in the layout, so a half gig in size but a quarter gig usable.  As 
mentioned, with four physical ssds that's a total of four /boots, each 
pointed at by the grub-core installation in the first partition on the 
corresponding ssd.

Partition 4 is the log partition, a quarter GiB in size as log rotation 
keeps typical usage under 50 MiB, but the quarter gig size means it ends 
on the 1 GiB boundary and further partitions are GiB aligned.  In the 
last layout generation this was a half gig and /boot a quarter gig, but I 
decided /boot could use the extra quarter gig more than log so I traded 
sizes.  This, like all further partitions, is btrfs raid1.  I intended to 
make it mixed-bg mode, as it was in the previous generation layout, but 
forgot the mkfs.btrfs switch for that and it no longer defaults to mixed 
at under a gig, so I got standard mode.  Never-the-less, with raid1 
instead of dup, and low normal usage, the chunk size is small enough that 
balance shouldn't be an issue, and if it is I can always blow it away and 
recreate in mixed mode.
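
As a quick sanity check on the sub-GiB alignment described above, the 
boundaries can be added up with a few lines of shell.  This is just a 
sketch of the arithmetic; the partition labels and the function name are 
made up for illustration, and the sizes are the ones given in this 
footnote.

```shell
#!/bin/sh
# Illustrative only: running start/end offsets (in MiB) for the first four
# partitions. The first MiB holds the GPT plus the alignment gap.
layout() {
    start=1
    for part in "bios_boot:3" "efi:252" "boot:512" "log:256"; do
        name=${part%%:*}    # label before the colon (hypothetical names)
        size=${part##*:}    # size in MiB after the colon
        end=$((start + size))
        echo "$name start=${start}MiB end=${end}MiB"
        start=$end
    done
}

layout
```

The ends come out at 4, 256, 768 and 1024 MiB, which is where the 4 MiB, 
quarter-GiB, 3/4 GiB and 1 GiB boundaries mentioned above come from.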

All further partitions are gig-aligned btrfs raid1 pair-device, three 
copies, working/0 and backups 1 and 2, on two separate pairs of ssds.  
The older pair is 256GB/238GiB with the backup/1 copy, the newer pair is 
1TB/931GiB with working/0 and backup/2.  The partition size and layout is 
identical on all four thru the sub-GiB and first copy, with the second 
copy on the larger pair being a same-sequence same-size repeat of the 
first, beyond the non-duplicated sub-GiB, of course.  So as long as the 
GPT on one of the four remains intact and bootable, I can easily recreate 
the other three.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
  2017-07-31  4:53                                                                         ` Duncan
@ 2017-07-31 20:32                                                                           ` Imran Geriskovan
  2017-08-01  1:36                                                                             ` Duncan
  0 siblings, 1 reply; 77+ messages in thread
From: Imran Geriskovan @ 2017-07-31 20:32 UTC (permalink / raw)
  To: Duncan; +Cc: linux-btrfs

>>>> Do you have any experience/advice/comment regarding dup data on ssds?

>>> Very good question. =:^)

>> Now the init on /boot is a "19 lines" shell script, including lines for
>> keymap, hdparm, crytpsetup. And let's not forget this is possible by a
>> custom kernel and its reliable buddy syslinux.
>
> FWIW...
> And I'm using dracut for that, tho quite cut down from its default, with
> a monolithic kernel and only installing necessary dracut modules.

Just create minimal bootable /boot for running below init.
(Your initramfs/rd is a bloated and packaged version of
this anyway.) Kick the rest. Since you have your own
kernel you are not far away from it.


#!/bin/sh
# This is actually busybox ash or hush. Can't remember now.
# You may compile/customize your busybox as well. Easy.

mount proc /proc -t proc
mount sys  /sys  -t sysfs
mount run  /run  -t tmpfs
mkdir /dev/pts /dev/shm /run/lock
mount devpts /dev/pts -t devpts &
mount shm    /dev/shm -t tmpfs &
mount -o remount,rw,noatime / &

# '&' is for backgrounding/parallel_execution.
# Use responsibly double checking its side effects
# depending on your setup.

hdparm -B 254 /dev/sda &
loadkmap < /boot/trq.bkmap

cryptsetup -T 10 luksOpen /dev/sdXX sdXX
mount /dev/mapper/sdXX /mnt/new_root -t btrfs -o noatime,compress=lzo

cd /mnt/new_root
mount --move /dev  ./dev
mount --move /proc ./proc
mount --move /sys  ./sys
mount --move /run  ./run
pivot_root . boot

exec chroot . busybox init
# Jump to your real roots init. Whatever it may be.


> But particularly after the last dracut update pulled in kmod as a
> mandatory dep as it now links against its libs, despite my monolithic
> kernel built without module support, I've been considering similar initr*
> alternatives, including hand-rolling my own initr* build scripts.
>
> Because I'm still not happy having to run an initr* at all, especially
> since there's more "magic" there than I'm particularly comfortable with
> since I like to grok the boot and thus potential recovery process better
> than I do this, and dracut was just the most convenient option at the
> time.

>> Interestingly my seach for reliability started with "dup data" and ended
>> up here. :)
> =:^)

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
  2017-07-09  7:57                                                             ` Martin Steigerwald
  2017-07-09  9:16                                                               ` Paul Jones
@ 2017-07-31 21:07                                                               ` Ivan Sizov
  2017-07-31 21:17                                                                 ` Marc MERLIN
  1 sibling, 1 reply; 77+ messages in thread
From: Ivan Sizov @ 2017-07-31 21:07 UTC (permalink / raw)
  To: Martin Steigerwald
  Cc: Marc MERLIN, Lu Fengqi, Btrfs BTRFS, David Sterba, Duncan

2017-07-09 10:57 GMT+03:00 Martin Steigerwald <martin@lichtvoll.de>:
> Hello Marc.
>
> Marc MERLIN - 08.07.17, 21:34:
>> Sigh,
>>
>> This is now the 3rd filesystem I have (on 3 different machines) that is
>> getting corruption of some kind (on 4.11.6).
>
> Anyone else getting corruptions with 4.11?
Yes, a lot. There are at least 3 cases, probably I've missed something.
https://www.spinics.net/lists/linux-btrfs/msg67177.html
https://www.spinics.net/lists/linux-btrfs/msg67681.html
https://unix.stackexchange.com/questions/369133/dealing-with-btrfs-ref-backpointer-mismatches-backref-missing/369275

If any additional debug info is needed, I'm ready to provide it.

-- 
Ivan Sizov

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
  2017-07-31 21:07                                                               ` Ivan Sizov
@ 2017-07-31 21:17                                                                 ` Marc MERLIN
  2017-07-31 21:39                                                                   ` Ivan Sizov
  2017-07-31 22:00                                                                   ` Justin Maggard
  0 siblings, 2 replies; 77+ messages in thread
From: Marc MERLIN @ 2017-07-31 21:17 UTC (permalink / raw)
  To: Ivan Sizov
  Cc: Martin Steigerwald, Lu Fengqi, Btrfs BTRFS, David Sterba, Duncan

On Tue, Aug 01, 2017 at 12:07:14AM +0300, Ivan Sizov wrote:
> 2017-07-09 10:57 GMT+03:00 Martin Steigerwald <martin@lichtvoll.de>:
> > Hello Marc.
> >
> > Marc MERLIN - 08.07.17, 21:34:
> >> Sigh,
> >>
> >> This is now the 3rd filesystem I have (on 3 different machines) that is
> >> getting corruption of some kind (on 4.11.6).
> >
> > Anyone else getting corruptions with 4.11?
> Yes, a lot. There are at least 3 cases, probably I've missed something.
> https://www.spinics.net/lists/linux-btrfs/msg67177.html
> https://www.spinics.net/lists/linux-btrfs/msg67681.html
> https://unix.stackexchange.com/questions/369133/dealing-with-btrfs-ref-backpointer-mismatches-backref-missing/369275

Indeed. My main server is happy back on 4.9.36 and while my laptop is
stuck on 4.11 due to other kernel issues that prevent me from going back
to 4.9, it only corrupted a single filesystem so far, and no other ones
that I've noticed yet.
Hopefully that will hold :-/

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
  2017-07-31 21:17                                                                 ` Marc MERLIN
@ 2017-07-31 21:39                                                                   ` Ivan Sizov
  2017-08-01 16:41                                                                     ` Ivan Sizov
  2017-07-31 22:00                                                                   ` Justin Maggard
  1 sibling, 1 reply; 77+ messages in thread
From: Ivan Sizov @ 2017-07-31 21:39 UTC (permalink / raw)
  To: Marc MERLIN
  Cc: Martin Steigerwald, Lu Fengqi, Btrfs BTRFS, David Sterba, Duncan

2017-08-01 0:17 GMT+03:00 Marc MERLIN <marc@merlins.org>:
> On Tue, Aug 01, 2017 at 12:07:14AM +0300, Ivan Sizov wrote:
>> 2017-07-09 10:57 GMT+03:00 Martin Steigerwald <martin@lichtvoll.de>:
>> > Hello Marc.
>> >
>> > Marc MERLIN - 08.07.17, 21:34:
>> >> Sigh,
>> >>
>> >> This is now the 3rd filesystem I have (on 3 different machines) that is
>> >> getting corruption of some kind (on 4.11.6).
>> >
>> > Anyone else getting corruptions with 4.11?
>> Yes, a lot. There are at least 3 cases, probably I've missed something.
>> https://www.spinics.net/lists/linux-btrfs/msg67177.html
>> https://www.spinics.net/lists/linux-btrfs/msg67681.html
>> https://unix.stackexchange.com/questions/369133/dealing-with-btrfs-ref-backpointer-mismatches-backref-missing/369275
>
> Indeed. My main server is happy back on 4.9.36 and while my laptop is
> stuck on 4.11 due to other kernel issues that prevent me from going back
> to 4.9, it only corrupted a single filesystem so far, and no other ones
> that I've noticed yet.
> Hopefully that will hold :-/
>
> Marc
> --
> "A mouse is a device used to point at the xterm you want to type in" - A.S.R.
> Microsoft is to operating systems ....
>                                       .... what McDonalds is to gourmet cooking
> Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901

I want to try mounting and checking FS under Live images with
different kernels tomorrow. Today's Fedora Rawhide image seems to be
built incorrectly. Can you advise me where to get a fresh live image
with 4.12 kernel (it's not important which distro that will be)?

-- 
Ivan Sizov

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
  2017-07-31 21:17                                                                 ` Marc MERLIN
  2017-07-31 21:39                                                                   ` Ivan Sizov
@ 2017-07-31 22:00                                                                   ` Justin Maggard
  2017-08-01  6:38                                                                     ` Marc MERLIN
  1 sibling, 1 reply; 77+ messages in thread
From: Justin Maggard @ 2017-07-31 22:00 UTC (permalink / raw)
  To: Marc MERLIN
  Cc: Ivan Sizov, Martin Steigerwald, Lu Fengqi, Btrfs BTRFS,
	David Sterba, Duncan

On Mon, Jul 31, 2017 at 2:17 PM, Marc MERLIN <marc@merlins.org> wrote:
> On Tue, Aug 01, 2017 at 12:07:14AM +0300, Ivan Sizov wrote:
>> 2017-07-09 10:57 GMT+03:00 Martin Steigerwald <martin@lichtvoll.de>:
>> > Hello Marc.
>> >
>> > Marc MERLIN - 08.07.17, 21:34:
>> >> Sigh,
>> >>
>> >> This is now the 3rd filesystem I have (on 3 different machines) that is
>> >> getting corruption of some kind (on 4.11.6).
>> >
>> > Anyone else getting corruptions with 4.11?
>> Yes, a lot. There are at least 3 cases, probably I've missed something.
>> https://www.spinics.net/lists/linux-btrfs/msg67177.html
>> https://www.spinics.net/lists/linux-btrfs/msg67681.html
>> https://unix.stackexchange.com/questions/369133/dealing-with-btrfs-ref-backpointer-mismatches-backref-missing/369275
>
> Indeed. My main server is happy back on 4.9.36 and while my laptop is
> stuck on 4.11 due to other kernel issues that prevent me from going back
> to 4.9, it only corrupted a single filesystem so far, and no other ones
> that I've noticed yet.
> Hopefully that will hold :-/
>

Marc, do you have quotas enabled?  IIRC, you're a send/receive user.
The combination of quotas and btrfs receive can corrupt your
filesystem, as shown by the xfstest I sent to the list a little while
ago.

-Justin

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
  2017-07-31 20:32                                                                           ` Imran Geriskovan
@ 2017-08-01  1:36                                                                             ` Duncan
  2017-08-01 15:18                                                                               ` Imran Geriskovan
  0 siblings, 1 reply; 77+ messages in thread
From: Duncan @ 2017-08-01  1:36 UTC (permalink / raw)
  To: linux-btrfs

Imran Geriskovan posted on Mon, 31 Jul 2017 22:32:39 +0200 as excerpted:

>>> Now the init on /boot is a "19 lines" shell script, including lines
>>> for keymap, hdparm, crytpsetup. And let's not forget this is possible
>>> by a custom kernel and its reliable buddy syslinux.
>>
>> FWIW...
>> And I'm using dracut for that, tho quite cut down from its default,
>> with a monolithic kernel and only installing necessary dracut modules.
> 
> Just create minimal bootable /boot for running below init.
> (Your initramfs/rd is a bloated and packaged version of this anyway.)
> Kick the rest. Since you a have your own kernel you are not far away
> from it.

Thanks.  You just solved my primary problem of needing to take the time 
to actually research all the steps and in what order I needed to do them, 
for a hand-rolled script. =:^)

Unfortunately, while I've been laid-up the last ~5 days due to a twisted 
knee and have been spending more time on the lists, etc, and would have 
loved to spend a day or so testing and setting this up, I'm back to work 
tomorrow, so I've no idea when I'll actually get to play with this.

But meanwhile, I'm saving your message for reference when the time 
comes.  It should be /very/ useful!  =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
  2017-07-31 22:00                                                                   ` Justin Maggard
@ 2017-08-01  6:38                                                                     ` Marc MERLIN
  0 siblings, 0 replies; 77+ messages in thread
From: Marc MERLIN @ 2017-08-01  6:38 UTC (permalink / raw)
  To: Justin Maggard
  Cc: Ivan Sizov, Martin Steigerwald, Lu Fengqi, Btrfs BTRFS,
	David Sterba, Duncan

On Mon, Jul 31, 2017 at 03:00:53PM -0700, Justin Maggard wrote:
> Marc, do you have quotas enabled?  IIRC, you're a send/receive user.
> The combination of quotas and btrfs receive can corrupt your
> filesystem, as shown by the xfstest I sent to the list a little while
> ago.

Thanks for checking. I do not use quota given the problems I had with
them early on over 2y ago.

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
  2017-08-01  1:36                                                                             ` Duncan
@ 2017-08-01 15:18                                                                               ` Imran Geriskovan
  0 siblings, 0 replies; 77+ messages in thread
From: Imran Geriskovan @ 2017-08-01 15:18 UTC (permalink / raw)
  To: Duncan; +Cc: linux-btrfs

On 8/1/17, Duncan <1i5t5.duncan@cox.net> wrote:
> Imran Geriskovan posted on Mon, 31 Jul 2017 22:32:39 +0200 as excerpted:
>>>> Now the init on /boot is a "19 lines" shell script, including lines
>>>> for keymap, hdparm, crytpsetup. And let's not forget this is possible
>>>> by a custom kernel and its reliable buddy syslinux.

>>> And I'm using dracut for that, tho quite cut down from its default,
>>> with a monolithic kernel and only installing necessary dracut modules.

>> Just create minimal bootable /boot for running below init.
>> (Your initramfs/rd is a bloated and packaged version of this anyway.)
>> Kick the rest. Since you a have your own kernel you are not far away
>> from it.

> Thanks.  You just solved my primary problem of needing to take the time
> to actually research all the steps and in what order I needed to do them,
> for a hand-rolled script. =:^)

It's just a minimal one. But it is a good start. For possible extensions
extract your initramfs and explore it. Dracut is bloated. Try mkinitcpio.

Once you have your self-hosting bootmng, kernel, modules, /boot, init, etc
chain, you'll be shocked to realize you have been spending so much time on
that bullshit while trying to keep them up..

Get to this point in the shortest possible time. Save your precious
time. And reclaim your system's reliability.

For X, you'll still need udev or eudev.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
  2017-07-31 21:39                                                                   ` Ivan Sizov
@ 2017-08-01 16:41                                                                     ` Ivan Sizov
  0 siblings, 0 replies; 77+ messages in thread
From: Ivan Sizov @ 2017-08-01 16:41 UTC (permalink / raw)
  To: Marc MERLIN
  Cc: Martin Steigerwald, Lu Fengqi, Btrfs BTRFS, David Sterba, Duncan

2017-08-01 0:39 GMT+03:00 Ivan Sizov <sivan606@gmail.com>:
> 2017-08-01 0:17 GMT+03:00 Marc MERLIN <marc@merlins.org>:
>> On Tue, Aug 01, 2017 at 12:07:14AM +0300, Ivan Sizov wrote:
>>> 2017-07-09 10:57 GMT+03:00 Martin Steigerwald <martin@lichtvoll.de>:
>>> > Hello Marc.
>>> >
>>> > Marc MERLIN - 08.07.17, 21:34:
>>> >> Sigh,
>>> >>
>>> >> This is now the 3rd filesystem I have (on 3 different machines) that is
>>> >> getting corruption of some kind (on 4.11.6).
>>> >
>>> > Anyone else getting corruptions with 4.11?
>>> Yes, a lot. There are at least 3 cases, probably I've missed something.
>>> https://www.spinics.net/lists/linux-btrfs/msg67177.html
>>> https://www.spinics.net/lists/linux-btrfs/msg67681.html
>>> https://unix.stackexchange.com/questions/369133/dealing-with-btrfs-ref-backpointer-mismatches-backref-missing/369275
>>
>> Indeed. My main server is happy back on 4.9.36 and while my laptop is
>> stuck on 4.11 due to other kernel issues that prevent me from going back
>> to 4.9, it only corrupted a single filesystem so far, and no other ones
>> that I've noticed yet.
>> Hopefully that will hold :-/
>>
>> Marc
>> --
>> "A mouse is a device used to point at the xterm you want to type in" - A.S.R.
>> Microsoft is to operating systems ....
>>                                       .... what McDonalds is to gourmet cooking
>> Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901
>
> I want to try mounting and checking FS under Live images with
> different kernels tomorrow. Today's Fedora Rawhide image seems to be
> built incorrectly. Can you advice me where to get a fresh live image
> with 4.12 kernel (it's not important which distro that will be)?
>
> --
> Ivan Sizov
Mounting problem persists:
on 4.13.0 with btrfs-progs v4.11.1 (latest Fedora Rawhide Live)
on 4.10.0 with btrfs-progs v4.9.1 (Ubuntu 17.04 Live)
on 4.9.0 with btrfs-progs v4.7.3 (Debian 9 Stretch Live)
"btrfs check --readonly" also gives the same output on 4.11, 4.10 and 4.9.

Marc, how did you roll back and fix those errors?

-- 
Ivan Sizov

^ permalink raw reply	[flat|nested] 77+ messages in thread

end of thread, other threads:[~2017-08-01 16:41 UTC | newest]

Thread overview: 77+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-20 14:39 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean Marc MERLIN
2017-06-20 15:23 ` Hugo Mills
2017-06-20 15:26   ` Marc MERLIN
2017-06-20 15:36     ` Hugo Mills
2017-06-20 15:44       ` Marc MERLIN
2017-06-20 23:12         ` Marc MERLIN
2017-06-20 23:58           ` Marc MERLIN
2017-06-21  3:31           ` Chris Murphy
2017-06-21  3:43             ` Marc MERLIN
2017-06-21 15:13               ` How to fix errors that check --mode lomem finds, but --mode normal doesn't? Marc MERLIN
2017-06-21 23:22                 ` Chris Murphy
2017-06-22  0:48                   ` Marc MERLIN
2017-06-22  2:22                 ` Qu Wenruo
2017-06-22  2:53                   ` Marc MERLIN
2017-06-22  4:08                     ` Qu Wenruo
2017-06-23  4:06                       ` Marc MERLIN
2017-06-23  8:54                         ` Lu Fengqi
2017-06-23 16:17                           ` Marc MERLIN
2017-06-24  2:34                             ` Marc MERLIN
2017-06-26 10:46                               ` Lu Fengqi
2017-06-27 23:11                                 ` Marc MERLIN
2017-06-28  7:10                                   ` Lu Fengqi
2017-06-28 14:43                                     ` Marc MERLIN
2017-05-01 17:06                                       ` 4.11 relocate crash, null pointer Marc MERLIN
2017-05-01 18:08                                         ` 4.11 relocate crash, null pointer + rolling back a filesystem by X hours? Marc MERLIN
2017-05-02  1:50                                           ` Chris Murphy
2017-05-02  3:23                                             ` Marc MERLIN
2017-05-02  4:56                                               ` Chris Murphy
2017-05-02  5:11                                                 ` Marc MERLIN
2017-05-02 18:47                                                   ` btrfs check --repair: failed to repair damaged filesystem, aborting Marc MERLIN
2017-05-03  6:00                                                     ` Marc MERLIN
2017-05-03  6:17                                                       ` Marc MERLIN
2017-05-03  6:32                                                         ` Roman Mamedov
2017-05-03 20:40                                                           ` Marc MERLIN
2017-07-07  5:37                                                   ` ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5 Marc MERLIN
2017-07-07  5:39                                                     ` Marc MERLIN
2017-07-07  9:33                                                       ` Lu Fengqi
2017-07-07 16:38                                                         ` Marc MERLIN
2017-07-09  4:34                                                           ` 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) Marc MERLIN
2017-07-09  5:05                                                             ` We really need a better/working btrfs check --repair Marc MERLIN
2017-07-09  6:34                                                             ` 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) Marc MERLIN
2017-07-09  7:57                                                             ` Martin Steigerwald
2017-07-09  9:16                                                               ` Paul Jones
2017-07-09 11:17                                                                 ` Duncan
2017-07-09 13:00                                                                   ` Martin Steigerwald
2017-07-29 19:29                                                                   ` Imran Geriskovan
2017-07-29 23:38                                                                     ` Duncan
2017-07-30 14:54                                                                       ` Imran Geriskovan
2017-07-31  4:53                                                                         ` Duncan
2017-07-31 20:32                                                                           ` Imran Geriskovan
2017-08-01  1:36                                                                             ` Duncan
2017-08-01 15:18                                                                               ` Imran Geriskovan
2017-07-31 21:07                                                               ` Ivan Sizov
2017-07-31 21:17                                                                 ` Marc MERLIN
2017-07-31 21:39                                                                   ` Ivan Sizov
2017-08-01 16:41                                                                     ` Ivan Sizov
2017-07-31 22:00                                                                   ` Justin Maggard
2017-08-01  6:38                                                                     ` Marc MERLIN
2017-05-02 19:59                                                 ` 4.11 relocate crash, null pointer + rolling back a filesystem by X hours? Kai Krakow
2017-05-02  5:01                                               ` Duncan
2017-05-02 19:53                                                 ` Kai Krakow
2017-05-23 16:58                                                 ` Marc MERLIN
2017-05-24 10:16                                                   ` Duncan
2017-05-05  1:19                                               ` Qu Wenruo
2017-05-05  2:10                                                 ` Qu Wenruo
2017-05-05  2:40                                                 ` Marc MERLIN
2017-05-05  5:03                                                   ` Qu Wenruo
2017-05-05 15:43                                                     ` Marc MERLIN
2017-05-17 18:23                                                       ` Kai Krakow
2017-05-05  1:13                                           ` Qu Wenruo
2017-06-29 13:36                                       ` How to fix errors that check --mode lomem finds, but --mode normal doesn't? Lu Fengqi
2017-06-29 15:30                                         ` Marc MERLIN
2017-06-30 14:59                                           ` Lu Fengqi
2017-06-22  4:08                     ` Qu Wenruo
2017-06-21 12:04           ` 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean Duncan
2017-06-21  3:26         ` Chris Murphy
2017-06-21  4:06           ` Marc MERLIN
