* How to Fix 'Error: could not find extent items for root 257'?
@ 2020-02-05 10:18 Chiung-Ming Huang
  2020-02-05 10:29 ` Qu Wenruo
  0 siblings, 1 reply; 16+ messages in thread
From: Chiung-Ming Huang @ 2020-02-05 10:18 UTC (permalink / raw)
  To: linux-btrfs

Hi everyone,

It's a long story, so I'll try to keep it short. My btrfs RAID1 consists of 5 HDDs: 10Tx2, 1Tx1, 2Tx1 and 3Tx1. They are all layered on bcache (1Tx1 SSD as cache) and LUKS. I tried to reorder the stack to `Luks --> Bcache --> SSD --> HDD`, with only one layer of LUKS on bcache, but the attempt failed because of an accidental power-off. Please help me fix it. Thanks.

1. OS: Ubuntu 18.04

2. $ uname -a
Linux rescue 5.3.0-26-generic #28~18.04.1-Ubuntu SMP Wed Dec 18 16:40:14 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

3. $ btrfs --version
btrfs-progs v5.4.1

4. $ btrfs fi show
Label: none  uuid: 0b79cf54-c424-40ed-adca-bd66b38ad57a
    Total devices 5 FS bytes used 496.00KiB
    devid 1 size 9.09TiB used 3.93TiB path /dev/bcache2
    devid 3 size 2.73TiB used 746.00GiB path /dev/bcache3
    devid 5 size 9.09TiB used 7.09TiB path /dev/bcache4
    devid 6 size 931.51GiB used 0.00B path /dev/mapper/disk-1t
    devid 7 size 1.82TiB used 0.00B path /dev/mapper/disk-2t

5.
$ mount /dev/bcache4 /mnt It showed the second part of messages after about 10 seconds and remount it as readonly ------------------ dmesg part 1/2 ------------------ [Wed Feb 5 17:09:04 2020] BTRFS info (device bcache2): disk space caching is enabled [Wed Feb 5 17:09:04 2020] BTRFS info (device bcache2): has skinny extents [Wed Feb 5 17:09:04 2020] BTRFS info (device bcache2): bdev /dev/bcache2 errs: wr 0, rd 0, flush 0, corrupt 266140, gen 8928 [Wed Feb 5 17:09:04 2020] BTRFS info (device bcache2): bdev /dev/bcache3 errs: wr 0, rd 0, flush 0, corrupt 0, gen 3 [Wed Feb 5 17:09:04 2020] BTRFS info (device bcache2): enabling ssd optimizations [Wed Feb 5 17:09:04 2020] BTRFS info (device bcache2): checking UUID tree [Wed Feb 5 17:09:04 2020] BTRFS error (device bcache2): tree level mismatch detected, bytenr=19499133206528 level expected=0 has=2 [Wed Feb 5 17:09:04 2020] BTRFS error (device bcache2): tree level mismatch detected, bytenr=19499133206528 level expected=0 has=2 [Wed Feb 5 17:09:04 2020] BTRFS warning (device bcache2): iterating uuid_tree failed -117 btrfs fi df / ------------------ dmesg part 2/2 ------------------ [Wed Feb 5 17:09:36 2020] BTRFS error (device bcache2): tree block 14963956514816 owner 3 already locked by pid=3187, extent tree corruption detected [Wed Feb 5 17:09:36 2020] ------------[ cut here ]------------ [Wed Feb 5 17:09:36 2020] BTRFS: Transaction aborted (error -117) [Wed Feb 5 17:09:36 2020] WARNING: CPU: 4 PID: 3187 at /build/linux-hwe-3HpQOB/linux-hwe-5.3.0/fs/btrfs/volumes.c:3025 btrfs_remove_chunk+0x76e/0x8a0 [btrfs] [Wed Feb 5 17:09:36 2020] Modules linked in: cmac bnep nls_iso8859_1 snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep edac_mce_amd kvm_amd snd_pcm ccp kvm snd_seq_midi snd_seq_midi_event irqby pass k10temp snd_rawmidi btusb fam15h_power btrtl btbcm btintel joydev nouveau snd_seq input_leds bluetooth snd_seq_device mxm_wmi snd_timer video 
ecdh_generic snd ecc ttm i2c_algo_bit soundcore mac_hid sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 b trfs xor zstd_compress raid6_pq libcrc32c algif_skcipher af_alg dm_crypt hid_logitech_hidpp bcache crc64 hid_logitech_dj hid_generic usbhid hid uas usb_storage nvidia_drm(POE) nvidia_modeset(POE) crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel nvidia(POE) ae s_x86_64 crypto_simd drm_kms_helper syscopyarea cryptd sysfillrect glue_helper sysimgblt fb_sys_fops drm r8169 nvme realtek i2c_piix4 ahci ipmi_devintf nvme_core(E) libahci ipmi_msghandler wmi [Wed Feb 5 17:09:36 2020] CPU: 4 PID: 3187 Comm: btrfs-cleaner Tainted: P W OE 5.3.0-26-generic #28~18.04.1-Ubuntu [Wed Feb 5 17:09:36 2020] Hardware name: MSI MS-7974/970A-G43 PLUS (MS-7974), BIOS V1.1 07/04/2016 [Wed Feb 5 17:09:36 2020] RIP: 0010:btrfs_remove_chunk+0x76e/0x8a0 [btrfs] [Wed Feb 5 17:09:36 2020] Code: 48 8b 50 50 f0 48 0f ba aa 40 ce 00 00 02 8b 45 a0 72 1c 83 f8 fb 0f 84 af 00 00 00 89 c6 48 c7 c7 f0 52 7d c1 e8 72 fa 73 eb <0f> 0b 8b 45 a0 48 8b 7d 90 89 c1 ba d1 0b 00 00 48 c7 c6 90 54 7c [Wed Feb 5 17:09:36 2020] RSP: 0018:ffffa7a5035d3d98 EFLAGS: 00010282 [Wed Feb 5 17:09:36 2020] RAX: 0000000000000000 RBX: 0000000040000000 RCX: 0000000000000006 [Wed Feb 5 17:09:36 2020] RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff97cea7b17440 [Wed Feb 5 17:09:36 2020] RBP: ffffa7a5035d3e48 R08: 00000000000005d3 R09: 0000000000000004 [Wed Feb 5 17:09:36 2020] R10: 00000000ffffffff R11: 0000000000000001 R12: ffff97ce9b647c00 [Wed Feb 5 17:09:36 2020] R13: ffff97ce95c2e800 R14: ffff97ce9c1d03b8 R15: ffff97ce9253ec40 [Wed Feb 5 17:09:36 2020] FS: 0000000000000000(0000) GS:ffff97cea7b00000(0000) knlGS:0000000000000000 [Wed Feb 5 17:09:36 2020] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [Wed Feb 5 17:09:36 2020] CR2: 000055b322010290 CR3: 000000061d628000 CR4: 00000000000406e0 [Wed Feb 5 17:09:36 2020] Call Trace: [Wed Feb 5 17:09:36 2020] 
btrfs_delete_unused_bgs+0x36a/0x490 [btrfs] [Wed Feb 5 17:09:36 2020] cleaner_kthread+0xed/0x130 [btrfs] [Wed Feb 5 17:09:36 2020] kthread+0x121/0x140 [Wed Feb 5 17:09:36 2020] ? __btrfs_btree_balance_dirty+0x60/0x60 [btrfs] [Wed Feb 5 17:09:36 2020] ? kthread_park+0xb0/0xb0 [Wed Feb 5 17:09:36 2020] ret_from_fork+0x22/0x40 [Wed Feb 5 17:09:36 2020] ---[ end trace c34270cb20778d7d ]--- [Wed Feb 5 17:09:36 2020] BTRFS: error (device bcache2) in btrfs_remove_chunk:3025: errno=-117 unknown [Wed Feb 5 17:09:36 2020] BTRFS info (device bcache2): forced readonly [Wed Feb 5 17:09:36 2020] ------------[ cut here ]------------ [Wed Feb 5 17:09:36 2020] WARNING: CPU: 4 PID: 3187 at /build/linux-hwe-3HpQOB/linux-hwe-5.3.0/fs/btrfs/space-info.h:106 btrfs_space_info_update_bytes_may_use.part.10+0x14/0x21 [btrfs] [Wed Feb 5 17:09:36 2020] Modules linked in: cmac bnep nls_iso8859_1 snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep edac_mce_amd kvm_amd snd_pcm ccp kvm snd_seq_midi snd_seq_midi_event irqby pass k10temp snd_rawmidi btusb fam15h_power btrtl btbcm btintel joydev nouveau snd_seq input_leds bluetooth snd_seq_device mxm_wmi snd_timer video ecdh_generic snd ecc ttm i2c_algo_bit soundcore mac_hid sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 b trfs xor zstd_compress raid6_pq libcrc32c algif_skcipher af_alg dm_crypt hid_logitech_hidpp bcache crc64 hid_logitech_dj hid_generic usbhid hid uas usb_storage nvidia_drm(POE) nvidia_modeset(POE) crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel nvidia(POE) ae s_x86_64 crypto_simd drm_kms_helper syscopyarea cryptd sysfillrect glue_helper sysimgblt fb_sys_fops drm r8169 nvme realtek i2c_piix4 ahci ipmi_devintf nvme_core(E) libahci ipmi_msghandler wmi [Wed Feb 5 17:09:36 2020] CPU: 4 PID: 3187 Comm: btrfs-cleaner Tainted: P W OE 5.3.0-26-generic #28~18.04.1-Ubuntu [Wed Feb 5 17:09:36 2020] Hardware name: MSI MS-7974/970A-G43 
PLUS (MS-7974), BIOS V1.1 07/04/2016 [Wed Feb 5 17:09:36 2020] RIP: 0010:btrfs_space_info_update_bytes_may_use.part.10+0x14/0x21 [btrfs] [Wed Feb 5 17:09:36 2020] Code: 74 05 e8 22 a5 6d eb 48 8d 65 d8 5b 41 5a 41 5c 41 5d 41 5e 5d c3 55 48 89 e5 53 48 89 fb 48 c7 c7 e8 a4 7d c1 e8 d2 84 74 eb <0f> 0b 48 c7 43 28 00 00 00 00 5b 5d c3 55 48 89 e5 53 48 89 fb 48 [Wed Feb 5 17:09:36 2020] RSP: 0018:ffffa7a5035d3ca8 EFLAGS: 00010286 [Wed Feb 5 17:09:36 2020] RAX: 0000000000000024 RBX: ffff97ce96ed5800 RCX: 0000000000000006 [Wed Feb 5 17:09:36 2020] RDX: 0000000000000000 RSI: 0000000000000092 RDI: ffff97cea7b17440 [Wed Feb 5 17:09:36 2020] RBP: ffffa7a5035d3cb0 R08: 00000000000005ed R09: 0000000000000004 [Wed Feb 5 17:09:36 2020] R10: 0000000000000002 R11: 0000000000000001 R12: ffff97ce96ed5800 [Wed Feb 5 17:09:36 2020] R13: 0000000000080000 R14: 000000000007c000 R15: ffff97ce9c1d0000 [Wed Feb 5 17:09:36 2020] FS: 0000000000000000(0000) GS:ffff97cea7b00000(0000) knlGS:0000000000000000 [Wed Feb 5 17:09:36 2020] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [Wed Feb 5 17:09:36 2020] CR2: 000055b322010290 CR3: 000000061d628000 CR4: 00000000000406e0 [Wed Feb 5 17:09:36 2020] Call Trace: [Wed Feb 5 17:09:36 2020] btrfs_space_info_add_old_bytes+0x261/0x280 [btrfs] [Wed Feb 5 17:09:36 2020] __btrfs_block_rsv_release+0x16e/0x1a0 [btrfs] [Wed Feb 5 17:09:36 2020] btrfs_trans_release_chunk_metadata+0x35/0x50 [btrfs] [Wed Feb 5 17:09:36 2020] btrfs_create_pending_block_groups+0x13d/0x240 [btrfs] [Wed Feb 5 17:09:36 2020] __btrfs_end_transaction+0x6e/0x1e0 [btrfs] [Wed Feb 5 17:09:36 2020] btrfs_end_transaction+0x10/0x20 [btrfs] [Wed Feb 5 17:09:36 2020] btrfs_delete_unused_bgs+0x28b/0x490 [btrfs] [Wed Feb 5 17:09:36 2020] cleaner_kthread+0xed/0x130 [btrfs] [Wed Feb 5 17:09:36 2020] kthread+0x121/0x140 [Wed Feb 5 17:09:36 2020] ? __btrfs_btree_balance_dirty+0x60/0x60 [btrfs] [Wed Feb 5 17:09:36 2020] ? 
kthread_park+0xb0/0xb0
[Wed Feb 5 17:09:36 2020] ret_from_fork+0x22/0x40
[Wed Feb 5 17:09:36 2020] ---[ end trace c34270cb20778d7e ]---

6. $ btrfs fi df /mnt
Data, RAID1: total=4.21TiB, used=0.00B
Data, single: total=3.30TiB, used=0.00B
System, RAID1: total=32.00MiB, used=0.00B
Metadata, RAID1: total=12.00GiB, used=496.00KiB
Metadata, single: total=8.00GiB, used=0.00B
GlobalReserve, single: total=512.00MiB, used=0.00B

7. $ btrfs check -p /dev/bcache4
Opening filesystem to check...
Checking filesystem on /dev/bcache4
UUID: 0b79cf54-c424-40ed-adca-bd66b38ad57a
Error: could not find extent items for root 257(0:00:00 elapsed, 1199 items checked)
[1/7] checking root items (0:00:00 elapsed, 7748 items checked)
ERROR: failed to repair root items: No such file or directory

8. $ btrfs scrub start -B -R /mnt
The status is aborted because the file system was forcibly remounted read-only.

9. $ lsblk -o NAME,SIZE,TYPE,FSTYPE
sda 931.5G disk bcache
└─bcache0 931.5G disk crypto_LUKS
  └─disk-1t 931.5G crypt btrfs
sdb 232.9G disk
└─sdb6 10G part crypto_LUKS
  └─rescue 10G crypt btrfs
sdc 2.7T disk crypto_LUKS
└─disk-3t 2.7T crypt bcache
  └─bcache3 2.7T disk btrfs
sdd 9.1T disk crypto_LUKS
└─disk-10t 9.1T crypt bcache
  └─bcache2 9.1T disk btrfs
sde 1.8T disk bcache
└─bcache1 1.8T disk crypto_LUKS
  └─disk-2t 1.8T crypt btrfs
sdf 9.1T disk crypto_LUKS
└─disk-10t 9.1T crypt bcache
  └─bcache4 9.1T disk btrfs
nvme0n1 953.9G disk
└─nvme0n1p1 636G part crypto_LUKS
  └─cache 636G crypt bcache
    ├─bcache0 931.5G disk crypto_LUKS
    │ └─disk-1t 931.5G crypt btrfs
    ├─bcache1 1.8T disk crypto_LUKS
    │ └─disk-2t 1.8T crypt btrfs
    ├─bcache2 9.1T disk btrfs
    ├─bcache3 2.7T disk btrfs
    └─bcache4 9.1T disk btrfs

Regards,
Chiung-Ming Huang

^ permalink raw reply [flat|nested] 16+ messages in thread
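The `btrfs fi df` output in item 6 lists both RAID1 and single profiles for Data and for Metadata. As a minimal sketch, a small shell helper can flag block group types that appear with more than one profile. The function name `mixed_profiles` is hypothetical (it is not part of btrfs-progs), and the parsing assumes the `Type, PROFILE: total=..., used=...` line format shown in this message.

```shell
# Hypothetical helper, not part of btrfs-progs.
# Reads `btrfs filesystem df` output on stdin and prints every block group
# type (Data, Metadata, System) that is listed with more than one profile.
mixed_profiles() {
    awk -F'[,:]' '
        /^(Data|Metadata|System),/ {
            type = $1
            profile = $2
            gsub(/^ +| +$/, "", profile)      # trim spaces around the profile name
            if (!((type, profile) in seen)) {
                seen[type, profile] = 1
                if (type in profiles) {
                    profiles[type] = profiles[type] "+" profile
                } else {
                    profiles[type] = profile
                }
                n[type]++
            }
        }
        END {
            # Output order across types is unspecified in awk.
            for (t in n) {
                if (n[t] > 1) {
                    print t ": " profiles[t]
                }
            }
        }
    '
}
```

Piping the output of `btrfs filesystem df /mnt` into the function would print one line for each type that has mixed profiles, and nothing when every type has a single profile.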
* Re: How to Fix 'Error: could not find extent items for root 257'?
  2020-02-05 10:18 How to Fix 'Error: could not find extent items for root 257'? Chiung-Ming Huang
@ 2020-02-05 10:29 ` Qu Wenruo
  2020-02-05 15:29 ` Chiung-Ming Huang
  [not found] ` <CAEOGEKHf9F0VM=au-42MwD63_V8RwtqiskV0LsGpq-c=J_qyPg@mail.gmail.com>
  0 siblings, 2 replies; 16+ messages in thread
From: Qu Wenruo @ 2020-02-05 10:29 UTC (permalink / raw)
  To: Chiung-Ming Huang, linux-btrfs

[-- Attachment #1.1: Type: text/plain, Size: 3711 bytes --]

On 2020/2/5 6:18 PM, Chiung-Ming Huang wrote:
> Hi everyone
>
> It's a long story. I try to describe it shortly. My btrfs RAID1
> includes 5 HDDs, 10Tx2, 1Tx1, 2Tx1 and 3Tx1. They all based on bcache
> (1Tx1 SSD as cache) and luks. I tried to reorder it to ` Luks -->
> Bcache --> SSD --> HDD` with only one layer of luks on bcache. But I
> failed because of power-off accidentally. Please help me to fix it.
> Thanks.
>
> 1. OS: Ubuntu 18.04
>
> 2. $ uname -a
> Linux rescue 5.3.0-26-generic #28~18.04.1-Ubuntu SMP Wed Dec 18
> 16:40:14 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

...

> [Wed Feb 5 17:09:04 2020] BTRFS error (device bcache2): tree level
> mismatch detected, bytenr=19499133206528 level expected=0 has=2
> [Wed Feb 5 17:09:04 2020] BTRFS error (device bcache2): tree level
> mismatch detected, bytenr=19499133206528 level expected=0 has=2
> [Wed Feb 5 17:09:04 2020] BTRFS warning (device bcache2): iterating
> uuid_tree failed -117
> btrfs fi df /
>
> ------------------ dmesg part 2/2 ------------------
>
> [Wed Feb 5 17:09:36 2020] BTRFS error (device bcache2): tree block
> 14963956514816 owner 3 already locked by pid=3187, extent tree
> corruption detected

This shows the problem. Your extent tree is corrupted.

I don't believe the lower storage stack is involved.

Please provide the full history of the fs (from mkfs to the current stage).

...

>
> 7. $ btrfs check -p /dev/bcache4
> Opening filesystem to check...
> Checking filesystem on /dev/bcache4
> UUID: 0b79cf54-c424-40ed-adca-bd66b38ad57a
> Error: could not find extent items for root 257(0:00:00 elapsed, 1199
> items checked)
> [1/7] checking root items (0:00:00 elapsed, 7748
> items checked)
> ERROR: failed to repair root items: No such file or directory

Have you tried btrfs check --repair and then mounting?

Is the dmesg you posted from the first time you hit the problem, or from after running btrfs check --repair?

Also, please post the output of `btrfs check` without --repair; that's the most important info for evaluating how to fix it (if possible).

Thanks,
Qu

>
> 8. $ btrfs scrub start -B -R /mnt
> The status is aborted because the file system was forcely re-mounted readonly.
>
> 9. $ lsblk -o NAME,SIZE,TYPE,FSTYPE
> sda 931.5G disk bcache
> └─bcache0 931.5G disk crypto_LUKS
>   └─disk-1t 931.5G crypt btrfs
> sdb 232.9G disk
> └─sdb6 10G part crypto_LUKS
>   └─rescue 10G crypt btrfs
> sdc 2.7T disk crypto_LUKS
> └─disk-3t 2.7T crypt bcache
>   └─bcache3 2.7T disk btrfs
> sdd 9.1T disk crypto_LUKS
> └─disk-10t 9.1T crypt bcache
>   └─bcache2 9.1T disk btrfs
> sde 1.8T disk bcache
> └─bcache1 1.8T disk crypto_LUKS
>   └─disk-2t 1.8T crypt btrfs
> sdf 9.1T disk crypto_LUKS
> └─disk-10t 9.1T crypt bcache
>   └─bcache4 9.1T disk btrfs
> nvme0n1 953.9G disk
> └─nvme0n1p1 636G part crypto_LUKS
>   └─cache 636G crypt bcache
>     ├─bcache0 931.5G disk crypto_LUKS
>     │ └─disk-1t 931.5G crypt btrfs
>     ├─bcache1 1.8T disk crypto_LUKS
>     │ └─disk-2t 1.8T crypt btrfs
>     ├─bcache2 9.1T disk btrfs
>     ├─bcache3 2.7T disk btrfs
>     └─bcache4 9.1T disk btrfs
>
>
> Regards,
> Chiung-Ming Huang
>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: How to Fix 'Error: could not find extent items for root 257'?
  2020-02-05 10:29 ` Qu Wenruo
@ 2020-02-05 15:29 ` Chiung-Ming Huang
  2020-02-05 19:38 ` Chris Murphy
  [not found] ` <CAEOGEKHf9F0VM=au-42MwD63_V8RwtqiskV0LsGpq-c=J_qyPg@mail.gmail.com>
  1 sibling, 1 reply; 16+ messages in thread
From: Chiung-Ming Huang @ 2020-02-05 15:29 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

Hi Qu Wenruo,

Thanks for your reply and help.

Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Wed, Feb 5, 2020 at 6:29 PM:

server ~$ Decrypted by /etc/crypttab
server ~$ mkfs.btrfs -m raid1 -d raid1 /dev/bcache0 /dev/bcache1 /dev/bcache2 /dev/bcache3 /dev/bcache4
server ~$ mount -o subvol=@defaults,degraded,nossd,subvol /dev/bcache4 / (By /etc/fstab)
server ~$ reboot
server ~$ btrfs balance start /
server ~$ btrfs fi usage /
    ^^^^^^^^^^^^^ Only /dev/sdc (3T), /dev/sdd (10T), /dev/sdf (10T) have data. The first two don't have any data and the last one has a mirror copy of RAID1.
server ~$ Removed /dev/sda (1T), /dev/sde (2T), /dev/sdd (10T) from /etc/crypttab
server ~$ reboot
server ~$ btrfs fi show
    ^^^^^^^^^^^^ /dev/sda (1T), /dev/sde (2T), /dev/sdd (10T) are marked `missing`
server ~$ btrfs balance start -f -sconvert=single -mconvert=single -dconvert=single /
server ~$ btrfs balance cancel /
server ~$ reboot
server ~$ Put luks on bcache and mkfs.btrfs /dev/sda (1T)
server ~$ Put luks on bcache and mkfs.btrfs /dev/sde (2T)
server ~$ Forgot to do it on /dev/sdd (10T)
server ~$ btrfs device remove missing
    ^^^^^^^^^^^^ Ran for about 5 seconds. (Because /dev/sda (1T) is empty?)
server ~$ btrfs device remove missing
    ^^^^^^^^^^^^ Ran for about 5 seconds. (Because /dev/sde (2T) is empty?)
server ~$ btrfs device remove missing
    ^^^^^^^^^^^^ Ran for at least 12 hours before the accidental power-off.

[Switched to the independent rescue OS.]

rescue ~$ Added /dev/sda (1T), /dev/sde (2T), /dev/sdd (10T) back to /etc/crypttab. /dev/sdd (10T) still holds the RAID1 mirror copy from before it was removed from /etc/crypttab.
rescue ~$ reboot
rescue ~$ btrfs check -p --repair /dev/bcache4
    ^^^^^^^^^^^ failed
rescue ~$ mount /dev/bcache4 /mnt
rescue ~$ btrfs check --repair /dev/bcache4
    ^^^^^^^^^^ [4/7] ... Errors ....... fs root
rescue ~$ btrfs scrub start -B /mnt
    ^^^^^^^^^^ Showed a lot of errors and I couldn't ctrl+alt+3, so I rebooted.
rescue ~$ btrfs check --repair /dev/bcache4
    ^^^^^^^^^^ [1/7] checking root items
    Error: could not find extent items for root 257
    ERROR: failed to repair root items: No such file or directory

> ...
> >
> > 7. $ btrfs check -p /dev/bcache4
> > Opening filesystem to check...
> > Checking filesystem on /dev/bcache4
> > UUID: 0b79cf54-c424-40ed-adca-bd66b38ad57a
> > Error: could not find extent items for root 257(0:00:00 elapsed, 1199
> > items checked)
> > [1/7] checking root items (0:00:00 elapsed, 7748
> > items checked)
> > ERROR: failed to repair root items: No such file or directory
>
> Have you tried btrfs check --repair then mount?

Yes.

> Is that mentioned dmesg the first time you hit, not something after

I kept kern.log, but it is about 17M, so I cannot post it here. It also doesn't show which `btrfs` command was running at the time. It contains a lot of `BTRFS critical` and `BTRFS error` messages, but the `BTRFS critical` ones are repeated.
Feb 3 15:38:24 rescue kernel: [ 8731.172674] BTRFS critical (device bcache2): corrupt leaf: root=23146 block=19498503094272 slot=8, invalid key objectid: has 18446744073709551606 expect 6 or [256, 18446744073709551360] or 18446744073709551604 Feb 3 15:38:24 rescue kernel: [ 8731.172860] BTRFS critical (device bcache2): corrupt leaf: root=23146 block=19498503094272 slot=8, invalid key objectid: has 18446744073709551606 expect 6 or [256, 18446744073709551360] or 18446744073709551604 Feb 3 20:19:42 rescue kernel: [25609.592216] BTRFS critical (device bcache2): corrupt leaf: root=23146 block=19498503094272 slot=8, invalid key objectid: has 18446744073709551606 expect 6 or [256, 18446744073709551360] or 18446744073709551604 Feb 3 20:19:42 rescue kernel: [25609.592511] BTRFS critical (device bcache2): corrupt leaf: root=23146 block=19498503094272 slot=8, invalid key objectid: has 18446744073709551606 expect 6 or [256, 18446744073709551360] or 18446744073709551604 Feb 5 17:05:58 rescue kernel: [ 3601.738469] BTRFS critical (device bcache2): unable to find logical 7157918187520 length 4096 Feb 5 17:05:58 rescue kernel: [ 3601.738474] BTRFS critical (device bcache2): unable to find logical 7157918187520 length 4096 Feb 5 17:05:58 rescue kernel: [ 3601.738481] BTRFS critical (device bcache2): unable to find logical 7157918187520 length 16384 Feb 5 17:05:58 rescue kernel: [ 3601.738531] BTRFS critical (device bcache2): unable to find logical 7157918187520 length 4096 Feb 5 17:05:58 rescue kernel: [ 3601.738533] BTRFS critical (device bcache2): unable to find logical 7157918187520 length 4096 Feb 5 17:05:58 rescue kernel: [ 3601.738539] BTRFS critical (device bcache2): unable to find logical 7157918187520 length 16384 .... (repeated 4096, 4096, 16384 these three lines) > btrfs check --repair? > > And `btrfs check` without --repair please, that's the most important > info to evaluate how to fix it (if possible). rescue ~$ btrfs check /dev/bcache4 Opening filesystem to check... 
Checking filesystem on /dev/bcache4
UUID: 0b79cf54-c424-40ed-adca-bd66b38ad57a
[1/7] checking root items
Error: could not find extent items for root 257
ERROR: failed to repair root items: No such file or directory

rescue ~$ btrfs check --repair /dev/bcache4
enabling repair mode
WARNING:

    Do not use --repair unless you are advised to do so by a developer
    or an experienced user, and then only after having accepted that no
    fsck can successfully repair all types of filesystem corruption. Eg.
    some software or hardware bugs can fatally damage a volume.
    The operation will start in 10 seconds.
    Use Ctrl-C to stop it.
10 9 8 7 6 5 4 3 2 1
Starting repair.
Opening filesystem to check...
Checking filesystem on /dev/bcache4
UUID: 0b79cf54-c424-40ed-adca-bd66b38ad57a
[1/7] checking root items
Error: could not find extent items for root 257
ERROR: failed to repair root items: No such file or directory
rescue ~$

> Thanks,
> Qu

Regards,
Chiung-Ming Huang

^ permalink raw reply [flat|nested] 16+ messages in thread
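Because the `BTRFS critical` lines in kern.log repeat many times, a deduplicated summary may be easier to post than a 17M file. The following is a minimal sketch, assuming syslog-style lines like the ones quoted in this message (`... kernel: [ 8731.172674] BTRFS critical ...`). The function name `summarize_btrfs_errors` is hypothetical, and the filter only keeps `critical` and `error` messages.

```shell
# Hypothetical helper: prints each distinct BTRFS critical/error message
# from a kernel log once, in order of first appearance, prefixed with the
# number of times it occurred.
summarize_btrfs_errors() {
    # Keep only critical/error lines ("BTRFS: error" with a colon also matches).
    grep -E 'BTRFS:? (critical|error)' "$1" |
        # Drop the syslog prefix and the kernel timestamp so repeats compare equal.
        sed -E 's/^.*kernel: \[ *[0-9]+\.[0-9]+\] //' |
        awk '
            !seen[$0]++ { order[++n] = $0 }   # remember first-seen order
            { count[$0]++ }
            END {
                for (i = 1; i <= n; i++) {
                    printf "%6d  %s\n", count[order[i]], order[i]
                }
            }
        '
}
```

Calling `summarize_btrfs_errors /var/log/kern.log` (the path is an assumption) keeps the earliest distinct message at the top, which is the one most useful to report.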
* Re: How to Fix 'Error: could not find extent items for root 257'?
  2020-02-05 15:29 ` Chiung-Ming Huang
@ 2020-02-05 19:38 ` Chris Murphy
  2020-02-06 3:11 ` Chiung-Ming Huang
  0 siblings, 1 reply; 16+ messages in thread
From: Chris Murphy @ 2020-02-05 19:38 UTC (permalink / raw)
  To: Chiung-Ming Huang; +Cc: Qu Wenruo, Btrfs BTRFS

On Wed, Feb 5, 2020 at 8:29 AM Chiung-Ming Huang <photon3108@gmail.com> wrote:
>
> server ~$ mount -o subvol=@defaults,degraded,nossd,subvol /dev/bcache4

Is this file system always mounted with the degraded mount option?
It's in /etc/fstab?

--
Chris Murphy

^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: How to Fix 'Error: could not find extent items for root 257'?
  2020-02-05 19:38 ` Chris Murphy
@ 2020-02-06 3:11 ` Chiung-Ming Huang
  0 siblings, 0 replies; 16+ messages in thread
From: Chiung-Ming Huang @ 2020-02-06 3:11 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Qu Wenruo, Btrfs BTRFS

Chris Murphy <lists@colorremedies.com> wrote on Thu, Feb 6, 2020 at 3:38 AM:
>
> On Wed, Feb 5, 2020 at 8:29 AM Chiung-Ming Huang <photon3108@gmail.com> wrote:
> >
> > server ~$ mount -o subvol=@defaults,degraded,nossd,subvol /dev/bcache4
>
> Is this file system always mounted with the degraded mount option?
> It's in /etc/fstab?

Yes, because my btrfs raid1 includes the root directory, '/'. I assumed it would let the system boot successfully even if the raid1 lost some disks. Is that a bad idea? Or does it have any performance issues?

>
> --
> Chris Murphy

Regards,
Chiung-Ming Huang

^ permalink raw reply [flat|nested] 16+ messages in thread
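To see which btrfs entries carry the option Chris asks about, an fstab-format file can be scanned for `degraded`. This is only a minimal sketch: the function name `find_degraded_mounts` is hypothetical, and the parsing assumes whitespace-separated fstab fields with the filesystem type in the third column and the mount options in the fourth.

```shell
# Hypothetical helper: prints "<device> <mountpoint>" for every btrfs entry
# in an fstab-format file whose options include "degraded".
# Returns 0 if at least one such entry exists, 1 otherwise.
find_degraded_mounts() {
    awk '
        /^[ \t]*#/ || NF < 4 { next }         # skip comments and short lines
        $3 == "btrfs" {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) {
                if (opts[i] == "degraded") {
                    print $1, $2
                    found = 1
                    break
                }
            }
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}
```

Running `find_degraded_mounts /etc/fstab` lists the affected mount points, and the exit status makes the check usable in a script.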
[parent not found: <CAEOGEKHf9F0VM=au-42MwD63_V8RwtqiskV0LsGpq-c=J_qyPg@mail.gmail.com>]
[parent not found: <f2ad6b4f-b011-8954-77e1-5162c84f7c1f@gmx.com>]
* Re: How to Fix 'Error: could not find extent items for root 257'?
  [not found] ` <f2ad6b4f-b011-8954-77e1-5162c84f7c1f@gmx.com>
@ 2020-02-06 4:13 ` Chiung-Ming Huang
  2020-02-06 4:35 ` Qu Wenruo
  0 siblings, 1 reply; 16+ messages in thread
From: Chiung-Ming Huang @ 2020-02-06 4:13 UTC (permalink / raw)
  To: Qu Wenruo, Btrfs

[-- Attachment #1: Type: text/plain, Size: 1857 bytes --]

Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Thu, Feb 6, 2020 at 9:13 AM:

> Please keep in mind that, if you post dmesg, the first time such error
> happens is the most important.
> Not something after you modified the fs by btrfs check --repair.

Thanks for your advice. I'll keep it in mind. :)

> >
> > Feb 3 15:38:24 rescue kernel: [ 8731.172674] BTRFS critical (device
> > bcache2): corrupt leaf: root=23146 block=19498503094272 slot=8,
> > invalid key objectid: has 18446744073709551606 expect 6 or [256,
> > 18446744073709551360] or 18446744073709551604
>
> This message is even earlier than your initial report, and it's more
> important.
> This means you have a bad inode item with objectid EXTENT_CSUM_OBJECTID.
>
> This is a bigger problem.

That sounds bad. Is it possible to save the data, or at least part of it?

> Are you sure that is the very first error message you hit?

My .bash_history doesn't record timestamps, so I'm not really sure which critical/error message came right after the first `btrfs check --repair`. I tried to make the log file smaller by excerpting only the btrfs messages before the first critical message; that excerpt is in the attachment. I'm not very familiar with mailing lists. Can you see `btrfs_.log`?

> > rescue ~$ btrfs check /dev/bcache4
> > Opening filesystem to check...
> > Checking filesystem on /dev/bcache4
> > UUID: 0b79cf54-c424-40ed-adca-bd66b38ad57a
> > [1/7] checking root items
> > Error: could not find extent items for root 257
> > ERROR: failed to repair root items: No such file or directory
>
> This part is from a special repair for a regression in 3.17.
> > I guess we should not enable it by default. > That will be another patch for btrfs-progs. Is this patch safe for saving my btrfs? If it is, I can build btrfs-progs. Regards, Chiung-Ming Huang [-- Attachment #2: btrfs_.log --] [-- Type: text/x-log, Size: 25635 bytes --] Jan 28 18:19:26 rescue kernel: [ 23.014317] Btrfs loaded, crc32c=crc32c-intel Jan 28 18:19:26 rescue kernel: [ 23.126873] BTRFS: device fsid 0b79cf54-c424-40ed-adca-bd66b38ad57a devid 5 transid 1401252 /dev/bcache4 Jan 28 18:19:26 rescue kernel: [ 23.126984] BTRFS: device fsid 0b79cf54-c424-40ed-adca-bd66b38ad57a devid 4 transid 1401252 /dev/bcache3 Jan 28 18:19:26 rescue kernel: [ 23.127080] BTRFS: device fsid 0b79cf54-c424-40ed-adca-bd66b38ad57a devid 3 transid 1401252 /dev/bcache2 Jan 28 18:19:26 rescue kernel: [ 23.127181] BTRFS: device fsid 0b79cf54-c424-40ed-adca-bd66b38ad57a devid 2 transid 1401252 /dev/bcache1 Jan 28 18:19:26 rescue kernel: [ 23.127343] BTRFS: device fsid 0b79cf54-c424-40ed-adca-bd66b38ad57a devid 1 transid 1401252 /dev/bcache0 Jan 28 18:19:26 rescue kernel: [ 23.127480] BTRFS: device fsid 01220871-cd98-45c9-8aac-070c4dd97a4f devid 1 transid 344 /dev/mapper/boot Jan 28 18:19:26 rescue kernel: [ 23.127652] BTRFS: device fsid ffd4ee02-4df2-4202-bf3a-169af7a47b18 devid 1 transid 740 /dev/mapper/rescue Jan 28 18:19:26 rescue kernel: [ 23.149243] BTRFS info (device dm-0): disk space caching is enabled Jan 28 18:19:26 rescue kernel: [ 23.149246] BTRFS info (device dm-0): has skinny extents Jan 28 18:19:26 rescue kernel: [ 23.155698] BTRFS info (device dm-0): enabling ssd optimizations Jan 28 18:19:26 rescue kernel: [ 23.547284] BTRFS info (device dm-0): disk space caching is enabled Jan 28 18:19:26 rescue kernel: [ 23.560353] BTRFS warning (device dm-0): swapfile must not be copy-on-write Jan 28 18:19:26 rescue kernel: [ 24.118209] BTRFS info (device dm-0): device fsid ffd4ee02-4df2-4202-bf3a-169af7a47b18 devid 1 moved old:/dev/mapper/rescue new:/dev/dm-0 Jan 28 18:19:26 rescue 
kernel: [ 24.118663] BTRFS info (device dm-0): device fsid ffd4ee02-4df2-4202-bf3a-169af7a47b18 devid 1 moved old:/dev/dm-0 new:/dev/mapper/rescue Jan 28 18:19:26 rescue kernel: [ 24.239166] BTRFS warning (device dm-0): swapfile must not be copy-on-write Jan 28 18:19:33 rescue kernel: [ 32.341375] BTRFS info (device dm-0): device fsid ffd4ee02-4df2-4202-bf3a-169af7a47b18 devid 1 moved old:/dev/mapper/rescue new:/dev/dm-0 Jan 28 18:19:33 rescue kernel: [ 32.346855] BTRFS info (device dm-0): device fsid ffd4ee02-4df2-4202-bf3a-169af7a47b18 devid 1 moved old:/dev/dm-0 new:/dev/mapper/rescue Jan 28 18:24:19 rescue kernel: [ 318.031325] BTRFS info (device dm-1): disk space caching is enabled Jan 28 18:24:19 rescue kernel: [ 318.031327] BTRFS info (device dm-1): has skinny extents Jan 28 18:24:19 rescue kernel: [ 318.042297] BTRFS info (device dm-1): enabling ssd optimizations Feb 3 10:55:12 rescue kernel: [ 19.205264] Btrfs loaded, crc32c=crc32c-intel Feb 3 10:55:12 rescue kernel: [ 19.324218] BTRFS: device fsid 0b79cf54-c424-40ed-adca-bd66b38ad57a devid 1 transid 1415180 /dev/bcache2 Feb 3 10:55:12 rescue kernel: [ 19.324355] BTRFS: device fsid 01220871-cd98-45c9-8aac-070c4dd97a4f devid 1 transid 399 /dev/mapper/boot Feb 3 10:55:12 rescue kernel: [ 19.324492] BTRFS: device fsid ffd4ee02-4df2-4202-bf3a-169af7a47b18 devid 1 transid 791 /dev/mapper/rescue Feb 3 10:55:12 rescue kernel: [ 19.347030] BTRFS info (device dm-0): disk space caching is enabled Feb 3 10:55:12 rescue kernel: [ 19.347031] BTRFS info (device dm-0): has skinny extents Feb 3 10:55:12 rescue kernel: [ 19.353614] BTRFS info (device dm-0): enabling ssd optimizations Feb 3 10:55:12 rescue kernel: [ 19.751089] BTRFS info (device dm-0): disk space caching is enabled Feb 3 10:55:12 rescue kernel: [ 19.763127] BTRFS warning (device dm-0): swapfile must not be copy-on-write Feb 3 10:55:12 rescue kernel: [ 20.542239] BTRFS warning (device dm-0): swapfile must not be copy-on-write Feb 3 10:55:12 rescue kernel: [ 
20.709261] BTRFS info (device dm-0): device fsid ffd4ee02-4df2-4202-bf3a-169af7a47b18 devid 1 moved old:/dev/mapper/rescue new:/dev/dm-0 Feb 3 10:55:12 rescue kernel: [ 20.709619] BTRFS info (device dm-0): device fsid ffd4ee02-4df2-4202-bf3a-169af7a47b18 devid 1 moved old:/dev/dm-0 new:/dev/mapper/rescue Feb 3 10:56:29 rescue kernel: [ 98.214952] BTRFS info (device dm-1): disk space caching is enabled Feb 3 10:56:29 rescue kernel: [ 98.214956] BTRFS info (device dm-1): has skinny extents Feb 3 10:56:29 rescue kernel: [ 98.247149] BTRFS info (device dm-1): enabling ssd optimizations Feb 3 10:59:15 rescue kernel: [ 263.602762] BTRFS info (device bcache2): disk space caching is enabled Feb 3 10:59:15 rescue kernel: [ 263.602764] BTRFS info (device bcache2): has skinny extents Feb 3 10:59:15 rescue kernel: [ 263.603619] BTRFS error (device bcache2): devid 5 uuid d442b477-0233-4a4a-aa71-cb24343b83ee is missing Feb 3 10:59:15 rescue kernel: [ 263.603624] BTRFS error (device bcache2): failed to read the system array: -2 Feb 3 10:59:15 rescue kernel: [ 263.645634] BTRFS error (device bcache2): open_ctree failed Feb 3 10:59:22 rescue kernel: [ 271.229983] BTRFS info (device bcache2): disk space caching is enabled Feb 3 10:59:22 rescue kernel: [ 271.229986] BTRFS info (device bcache2): has skinny extents Feb 3 10:59:22 rescue kernel: [ 271.230996] BTRFS error (device bcache2): devid 5 uuid d442b477-0233-4a4a-aa71-cb24343b83ee is missing Feb 3 10:59:22 rescue kernel: [ 271.231005] BTRFS error (device bcache2): failed to read the system array: -2 Feb 3 10:59:23 rescue kernel: [ 271.301168] BTRFS error (device bcache2): open_ctree failed Feb 3 11:00:23 rescue kernel: [ 332.126013] BTRFS info (device bcache2): allowing degraded mounts Feb 3 11:00:23 rescue kernel: [ 332.126017] BTRFS info (device bcache2): disk space caching is enabled Feb 3 11:00:23 rescue kernel: [ 332.126019] BTRFS info (device bcache2): has skinny extents Feb 3 11:00:23 rescue kernel: [ 332.127151] BTRFS 
warning (device bcache2): devid 5 uuid d442b477-0233-4a4a-aa71-cb24343b83ee is missing Feb 3 11:00:23 rescue kernel: [ 332.142841] BTRFS warning (device bcache2): devid 2 uuid 7c87f647-874d-49aa-83d3-5f4cc126a627 is missing Feb 3 11:00:23 rescue kernel: [ 332.142845] BTRFS warning (device bcache2): devid 3 uuid f9b7fe84-d95b-4db5-9e2b-c34a2d4186e9 is missing Feb 3 11:00:23 rescue kernel: [ 332.142847] BTRFS warning (device bcache2): devid 4 uuid 37706d9f-0672-4dc7-b987-5cc83c6beb48 is missing Feb 3 11:00:23 rescue kernel: [ 332.142849] BTRFS warning (device bcache2): devid 5 uuid d442b477-0233-4a4a-aa71-cb24343b83ee is missing Feb 3 11:00:23 rescue kernel: [ 332.182460] BTRFS warning (device bcache2): failed to read tree root Feb 3 11:00:24 rescue kernel: [ 332.336612] BTRFS error (device bcache2): open_ctree failed Feb 3 11:00:50 rescue kernel: [ 359.095221] BTRFS info (device bcache2): allowing degraded mounts Feb 3 11:00:50 rescue kernel: [ 359.095223] BTRFS info (device bcache2): disk space caching is enabled Feb 3 11:00:50 rescue kernel: [ 359.095224] BTRFS info (device bcache2): has skinny extents Feb 3 11:00:50 rescue kernel: [ 359.096523] BTRFS warning (device bcache2): devid 2 uuid 7c87f647-874d-49aa-83d3-5f4cc126a627 is missing Feb 3 11:00:50 rescue kernel: [ 359.096526] BTRFS warning (device bcache2): devid 3 uuid f9b7fe84-d95b-4db5-9e2b-c34a2d4186e9 is missing Feb 3 11:00:50 rescue kernel: [ 359.096527] BTRFS warning (device bcache2): devid 4 uuid 37706d9f-0672-4dc7-b987-5cc83c6beb48 is missing Feb 3 11:00:50 rescue kernel: [ 359.096529] BTRFS warning (device bcache2): devid 5 uuid d442b477-0233-4a4a-aa71-cb24343b83ee is missing Feb 3 11:00:50 rescue kernel: [ 359.123421] BTRFS warning (device bcache2): failed to read tree root Feb 3 11:00:51 rescue kernel: [ 359.265056] BTRFS error (device bcache2): open_ctree failed Feb 3 11:10:16 rescue kernel: [ 924.586408] BTRFS: device fsid 5330041e-7c11-4090-a99b-e86f32ee2136 devid 1 transid 5 /dev/mapper/tmp Feb 
3 11:11:13 rescue kernel: [ 981.791708] BTRFS info (device dm-5): disk space caching is enabled Feb 3 11:11:13 rescue kernel: [ 981.791710] BTRFS info (device dm-5): has skinny extents Feb 3 11:11:13 rescue kernel: [ 981.791711] BTRFS info (device dm-5): flagging fs with big metadata feature Feb 3 11:11:13 rescue kernel: [ 981.794318] BTRFS info (device dm-5): enabling ssd optimizations Feb 3 11:11:13 rescue kernel: [ 981.794938] BTRFS info (device dm-5): creating UUID tree Feb 3 13:14:43 rescue kernel: [ 18.721825] Btrfs loaded, crc32c=crc32c-intel Feb 3 13:14:43 rescue kernel: [ 18.838106] BTRFS: device fsid 0b79cf54-c424-40ed-adca-bd66b38ad57a devid 1 transid 1415180 /dev/bcache2 Feb 3 13:14:43 rescue kernel: [ 18.838244] BTRFS: device fsid 01220871-cd98-45c9-8aac-070c4dd97a4f devid 1 transid 401 /dev/mapper/boot Feb 3 13:14:43 rescue kernel: [ 18.838379] BTRFS: device fsid ffd4ee02-4df2-4202-bf3a-169af7a47b18 devid 1 transid 1458 /dev/mapper/rescue Feb 3 13:14:43 rescue kernel: [ 18.858902] BTRFS info (device dm-0): disk space caching is enabled Feb 3 13:14:43 rescue kernel: [ 18.858904] BTRFS info (device dm-0): has skinny extents Feb 3 13:14:43 rescue kernel: [ 18.865645] BTRFS info (device dm-0): enabling ssd optimizations Feb 3 13:14:43 rescue kernel: [ 19.293884] BTRFS info (device dm-0): disk space caching is enabled Feb 3 13:14:43 rescue kernel: [ 19.305694] BTRFS warning (device dm-0): swapfile must not be copy-on-write Feb 3 13:14:43 rescue kernel: [ 20.157798] BTRFS warning (device dm-0): swapfile must not be copy-on-write Feb 3 13:14:43 rescue kernel: [ 20.384004] BTRFS info (device dm-0): device fsid ffd4ee02-4df2-4202-bf3a-169af7a47b18 devid 1 moved old:/dev/mapper/rescue new:/dev/dm-0 Feb 3 13:14:43 rescue kernel: [ 20.384394] BTRFS info (device dm-0): device fsid ffd4ee02-4df2-4202-bf3a-169af7a47b18 devid 1 moved old:/dev/dm-0 new:/dev/mapper/rescue Feb 3 13:15:58 rescue kernel: [ 185.577094] BTRFS info (device dm-1): disk space caching is 
enabled Feb 3 13:15:58 rescue kernel: [ 185.577097] BTRFS info (device dm-1): has skinny extents Feb 3 13:15:58 rescue kernel: [ 185.589554] BTRFS info (device dm-1): enabling ssd optimizations Feb 3 13:18:20 rescue kernel: [ 327.032329] BTRFS: device fsid 0b79cf54-c424-40ed-adca-bd66b38ad57a devid 5 transid 1448328 /dev/bcache4 Feb 3 13:18:20 rescue kernel: [ 327.380096] BTRFS: device fsid 0b79cf54-c424-40ed-adca-bd66b38ad57a devid 3 transid 1448328 /dev/bcache3 Feb 3 13:25:15 rescue kernel: [ 742.224425] BTRFS info (device bcache3): allowing degraded mounts Feb 3 13:25:15 rescue kernel: [ 742.224427] BTRFS info (device bcache3): disk space caching is enabled Feb 3 13:25:15 rescue kernel: [ 742.224428] BTRFS info (device bcache3): has skinny extents Feb 3 13:25:15 rescue kernel: [ 742.308375] BTRFS error (device bcache3): super_num_devices 3 mismatch with num_devices 3 found here Feb 3 13:25:15 rescue kernel: [ 742.308383] BTRFS error (device bcache3): failed to read chunk tree: -22 Feb 3 13:25:15 rescue kernel: [ 742.459207] BTRFS error (device bcache3): open_ctree failed Feb 3 13:25:40 rescue kernel: [ 767.040748] BTRFS info (device bcache3): allowing degraded mounts Feb 3 13:25:40 rescue kernel: [ 767.040749] BTRFS info (device bcache3): disk space caching is enabled Feb 3 13:25:40 rescue kernel: [ 767.040750] BTRFS info (device bcache3): has skinny extents Feb 3 13:25:40 rescue kernel: [ 767.061935] BTRFS error (device bcache3): super_num_devices 3 mismatch with num_devices 3 found here Feb 3 13:25:40 rescue kernel: [ 767.061943] BTRFS error (device bcache3): failed to read chunk tree: -22 Feb 3 13:25:40 rescue kernel: [ 767.223721] BTRFS error (device bcache3): open_ctree failed Feb 3 14:16:21 rescue kernel: [ 3808.492183] BTRFS info (device bcache3): allowing degraded mounts Feb 3 14:16:21 rescue kernel: [ 3808.492185] BTRFS info (device bcache3): disk space caching is enabled Feb 3 14:16:21 rescue kernel: [ 3808.492186] BTRFS info (device bcache3): has 
skinny extents Feb 3 14:16:21 rescue kernel: [ 3808.702369] BTRFS error (device bcache3): parent transid verify failed on 15062144499712 wanted 1446883 found 112350 Feb 3 14:16:21 rescue kernel: [ 3808.711258] BTRFS info (device bcache3): read error corrected: ino 0 off 15062144499712 (dev /dev/bcache2 sector 13892444896) Feb 3 14:16:21 rescue kernel: [ 3808.711562] BTRFS info (device bcache3): read error corrected: ino 0 off 15062144503808 (dev /dev/bcache2 sector 13892444904) Feb 3 14:16:21 rescue kernel: [ 3808.711767] BTRFS info (device bcache3): read error corrected: ino 0 off 15062144507904 (dev /dev/bcache2 sector 13892444912) Feb 3 14:16:21 rescue kernel: [ 3808.712058] BTRFS info (device bcache3): read error corrected: ino 0 off 15062144512000 (dev /dev/bcache2 sector 13892444920) Feb 3 14:16:21 rescue kernel: [ 3808.749827] BTRFS error (device bcache3): parent transid verify failed on 15062014951424 wanted 1446322 found 100674 Feb 3 14:16:21 rescue kernel: [ 3808.750423] BTRFS info (device bcache3): read error corrected: ino 0 off 15062014951424 (dev /dev/bcache2 sector 13892191872) Feb 3 14:16:21 rescue kernel: [ 3808.750731] BTRFS info (device bcache3): read error corrected: ino 0 off 15062014955520 (dev /dev/bcache2 sector 13892191880) Feb 3 14:16:21 rescue kernel: [ 3808.750957] BTRFS info (device bcache3): read error corrected: ino 0 off 15062014959616 (dev /dev/bcache2 sector 13892191888) Feb 3 14:16:21 rescue kernel: [ 3808.751217] BTRFS info (device bcache3): read error corrected: ino 0 off 15062014963712 (dev /dev/bcache2 sector 13892191896) Feb 3 14:16:21 rescue kernel: [ 3808.753943] BTRFS info (device bcache3): bdev /dev/bcache3 errs: wr 0, rd 0, flush 0, corrupt 0, gen 3 Feb 3 14:16:22 rescue kernel: [ 3808.913147] BTRFS error (device bcache3): parent transid verify failed on 15062736961536 wanted 1447095 found 1415067 Feb 3 14:16:22 rescue kernel: [ 3808.919641] BTRFS info (device bcache3): read error corrected: ino 0 off 15062736961536 (dev 
/dev/bcache2 sector 13893602048) Feb 3 14:16:22 rescue kernel: [ 3808.920181] BTRFS info (device bcache3): read error corrected: ino 0 off 15062736965632 (dev /dev/bcache2 sector 13893602056) Feb 3 14:16:22 rescue kernel: [ 3809.203444] BTRFS error (device bcache3): parent transid verify failed on 15061822210048 wanted 1448318 found 1415066 Feb 3 14:16:22 rescue kernel: [ 3809.369463] BTRFS error (device bcache3): parent transid verify failed on 15062219669504 wanted 1448318 found 1415066 Feb 3 14:16:22 rescue kernel: [ 3809.371991] BTRFS error (device bcache3): parent transid verify failed on 15062219849728 wanted 1448318 found 1415066 Feb 3 14:16:22 rescue kernel: [ 3809.384999] BTRFS error (device bcache3): parent transid verify failed on 15062220980224 wanted 1448318 found 1415066 Feb 3 14:16:22 rescue kernel: [ 3809.555515] BTRFS info (device bcache3): enabling ssd optimizations Feb 3 14:16:22 rescue kernel: [ 3809.651522] BTRFS info (device bcache3): checking UUID tree Feb 3 14:16:22 rescue kernel: [ 3809.670189] BTRFS error (device bcache3): parent transid verify failed on 15062533734400 wanted 1448319 found 99757 Feb 3 14:16:22 rescue kernel: [ 3809.672604] BTRFS error (device bcache3): parent transid verify failed on 15062480715776 wanted 1448319 found 109527 Feb 3 14:16:22 rescue kernel: [ 3809.678123] BTRFS error (device bcache3): parent transid verify failed on 15062533668864 wanted 1448319 found 99757 Feb 3 14:16:56 rescue kernel: [ 3843.262404] BTRFS error (device bcache3): parent transid verify failed on 19500524929024 wanted 1448324 found 1404193 Feb 3 14:16:56 rescue kernel: [ 3843.262897] BTRFS info (device bcache3): read error corrected: ino 0 off 19500524929024 (dev /dev/bcache2 sector 14113631808) Feb 3 14:16:56 rescue kernel: [ 3843.263192] BTRFS info (device bcache3): read error corrected: ino 0 off 19500524933120 (dev /dev/bcache2 sector 14113631816) Feb 3 14:16:56 rescue kernel: [ 3843.263437] BTRFS info (device bcache3): read error 
corrected: ino 0 off 19500524937216 (dev /dev/bcache2 sector 14113631824) Feb 3 14:16:56 rescue kernel: [ 3843.263638] BTRFS info (device bcache3): read error corrected: ino 0 off 19500524941312 (dev /dev/bcache2 sector 14113631832) Feb 3 14:47:02 rescue kernel: [ 5649.108021] BTRFS: device fsid 56182a66-b7c2-4953-a7db-98717ed02356 devid 1 transid 5 /dev/mapper/disk-wd-1t-0730 Feb 3 14:47:40 rescue kernel: [ 5687.214767] BTRFS info (device bcache2): allowing degraded mounts Feb 3 14:47:40 rescue kernel: [ 5687.214770] BTRFS info (device bcache2): disk space caching is enabled Feb 3 14:47:40 rescue kernel: [ 5687.214771] BTRFS info (device bcache2): has skinny extents Feb 3 14:47:41 rescue kernel: [ 5688.229117] BTRFS info (device bcache2): bdev /dev/bcache3 errs: wr 0, rd 0, flush 0, corrupt 0, gen 3 Feb 3 14:47:42 rescue kernel: [ 5688.933827] BTRFS info (device bcache2): enabling ssd optimizations Feb 3 14:48:07 rescue kernel: [ 5714.056608] BTRFS info (device bcache2): disk added /dev/mapper/disk-wd-1t-0730 Feb 3 14:48:07 rescue kernel: [ 5714.059515] BTRFS info (device bcache2): device fsid 0b79cf54-c424-40ed-adca-bd66b38ad57a devid 6 moved old:/dev/mapper/disk-wd-1t-0730 new:/dev/dm-6 Feb 3 14:48:07 rescue kernel: [ 5714.060355] BTRFS info (device bcache2): device fsid 0b79cf54-c424-40ed-adca-bd66b38ad57a devid 6 moved old:/dev/dm-6 new:/dev/mapper/disk-wd-1t-0730 Feb 3 14:48:49 rescue kernel: [ 5756.214729] BTRFS error (device bcache2): parent transid verify failed on 19500017106944 wanted 1448321 found 1415054 Feb 3 14:48:49 rescue kernel: [ 5756.215355] BTRFS info (device bcache2): read error corrected: ino 0 off 19500017106944 (dev /dev/bcache2 sector 14112639968) Feb 3 14:48:49 rescue kernel: [ 5756.215675] BTRFS info (device bcache2): read error corrected: ino 0 off 19500017111040 (dev /dev/bcache2 sector 14112639976) Feb 3 14:48:49 rescue kernel: [ 5756.215935] BTRFS info (device bcache2): read error corrected: ino 0 off 19500017115136 (dev /dev/bcache2 
sector 14112639984) Feb 3 14:48:49 rescue kernel: [ 5756.216156] BTRFS info (device bcache2): read error corrected: ino 0 off 19500017119232 (dev /dev/bcache2 sector 14112639992) Feb 3 14:48:49 rescue kernel: [ 5756.840866] BTRFS info (device bcache2): disk added /dev/mapper/disk-wd-2t-4979 Feb 3 14:48:49 rescue kernel: [ 5756.843836] BTRFS info (device bcache2): device fsid 0b79cf54-c424-40ed-adca-bd66b38ad57a devid 7 moved old:/dev/mapper/disk-wd-2t-4979 new:/dev/dm-7 Feb 3 14:48:49 rescue kernel: [ 5756.844811] BTRFS info (device bcache2): device fsid 0b79cf54-c424-40ed-adca-bd66b38ad57a devid 7 moved old:/dev/dm-7 new:/dev/mapper/disk-wd-2t-4979 Feb 3 15:06:51 rescue kernel: [ 6838.152819] BTRFS info (device bcache2): allowing degraded mounts Feb 3 15:06:51 rescue kernel: [ 6838.152822] BTRFS info (device bcache2): disk space caching is enabled Feb 3 15:06:51 rescue kernel: [ 6838.152823] BTRFS info (device bcache2): has skinny extents Feb 3 15:06:51 rescue kernel: [ 6838.230152] BTRFS info (device bcache2): bdev /dev/bcache3 errs: wr 0, rd 0, flush 0, corrupt 0, gen 3 Feb 3 15:06:52 rescue kernel: [ 6838.934006] BTRFS info (device bcache2): enabling ssd optimizations Feb 3 15:06:54 rescue kernel: [ 6841.674943] BTRFS error (device bcache2): parent transid verify failed on 15062538846208 wanted 1448319 found 118025 Feb 3 15:06:54 rescue kernel: [ 6841.675400] BTRFS info (device bcache2): read error corrected: ino 0 off 15062538846208 (dev /dev/bcache2 sector 13893215104) Feb 3 15:06:54 rescue kernel: [ 6841.675463] BTRFS info (device bcache2): read error corrected: ino 0 off 15062538850304 (dev /dev/bcache2 sector 13893215112) Feb 3 15:06:54 rescue kernel: [ 6841.675529] BTRFS info (device bcache2): read error corrected: ino 0 off 15062538854400 (dev /dev/bcache2 sector 13893215120) Feb 3 15:06:54 rescue kernel: [ 6841.676223] BTRFS info (device bcache2): read error corrected: ino 0 off 15062538858496 (dev /dev/bcache2 sector 13893215128) Feb 3 15:06:54 rescue 
kernel: [ 6841.766371] BTRFS error (device bcache2): parent transid verify failed on 15062516727808 wanted 1448319 found 116977 Feb 3 15:06:54 rescue kernel: [ 6841.766894] BTRFS info (device bcache2): read error corrected: ino 0 off 15062516727808 (dev /dev/bcache2 sector 13893171904) Feb 3 15:06:54 rescue kernel: [ 6841.766958] BTRFS info (device bcache2): read error corrected: ino 0 off 15062516731904 (dev /dev/bcache2 sector 13893171912) Feb 3 15:06:54 rescue kernel: [ 6841.767022] BTRFS info (device bcache2): read error corrected: ino 0 off 15062516736000 (dev /dev/bcache2 sector 13893171920) Feb 3 15:06:54 rescue kernel: [ 6841.767082] BTRFS info (device bcache2): read error corrected: ino 0 off 15062516740096 (dev /dev/bcache2 sector 13893171928) Feb 3 15:38:23 rescue kernel: [ 8730.213844] BTRFS info (device bcache2): allowing degraded mounts Feb 3 15:38:23 rescue kernel: [ 8730.213846] BTRFS info (device bcache2): disk space caching is enabled Feb 3 15:38:23 rescue kernel: [ 8730.213847] BTRFS info (device bcache2): has skinny extents Feb 3 15:38:23 rescue kernel: [ 8730.225206] BTRFS error (device bcache2): parent transid verify failed on 14963956989952 wanted 1448324 found 80508 Feb 3 15:38:23 rescue kernel: [ 8730.225621] BTRFS info (device bcache2): read error corrected: ino 0 off 14963956989952 (dev /dev/bcache2 sector 534883232) Feb 3 15:38:23 rescue kernel: [ 8730.225829] BTRFS info (device bcache2): read error corrected: ino 0 off 14963956994048 (dev /dev/bcache2 sector 534883240) Feb 3 15:38:23 rescue kernel: [ 8730.225885] BTRFS info (device bcache2): read error corrected: ino 0 off 14963956998144 (dev /dev/bcache2 sector 534883248) Feb 3 15:38:23 rescue kernel: [ 8730.225937] BTRFS info (device bcache2): read error corrected: ino 0 off 14963957002240 (dev /dev/bcache2 sector 534883256) Feb 3 15:38:23 rescue kernel: [ 8730.332134] BTRFS error (device bcache2): parent transid verify failed on 14963956613120 wanted 1446901 found 120202 Feb 3 
15:38:23 rescue kernel: [ 8730.332435] BTRFS info (device bcache2): read error corrected: ino 0 off 14963956613120 (dev /dev/bcache2 sector 534882496) Feb 3 15:38:23 rescue kernel: [ 8730.332493] BTRFS info (device bcache2): read error corrected: ino 0 off 14963956617216 (dev /dev/bcache2 sector 534882504) Feb 3 15:38:23 rescue kernel: [ 8730.332549] BTRFS info (device bcache2): read error corrected: ino 0 off 14963956621312 (dev /dev/bcache2 sector 534882512) Feb 3 15:38:23 rescue kernel: [ 8730.332605] BTRFS info (device bcache2): read error corrected: ino 0 off 14963956625408 (dev /dev/bcache2 sector 534882520) Feb 3 15:38:23 rescue kernel: [ 8730.338227] BTRFS error (device bcache2): parent transid verify failed on 14963956776960 wanted 1446891 found 105104 Feb 3 15:38:23 rescue kernel: [ 8730.338525] BTRFS info (device bcache2): read error corrected: ino 0 off 14963956776960 (dev /dev/bcache2 sector 534882816) Feb 3 15:38:23 rescue kernel: [ 8730.338582] BTRFS info (device bcache2): read error corrected: ino 0 off 14963956781056 (dev /dev/bcache2 sector 534882824) Feb 3 15:38:23 rescue kernel: [ 8730.340420] BTRFS error (device bcache2): parent transid verify failed on 14963956695040 wanted 1446882 found 108615 Feb 3 15:38:23 rescue kernel: [ 8730.341332] BTRFS error (device bcache2): parent transid verify failed on 14963957334016 wanted 1446872 found 94852 Feb 3 15:38:23 rescue kernel: [ 8730.343095] BTRFS error (device bcache2): parent transid verify failed on 14963957383168 wanted 1446734 found 91496 Feb 3 15:38:23 rescue kernel: [ 8730.344103] BTRFS error (device bcache2): parent transid verify failed on 14963957301248 wanted 1446522 found 100496 Feb 3 15:38:23 rescue kernel: [ 8730.345274] BTRFS error (device bcache2): parent transid verify failed on 14963956662272 wanted 1446296 found 99399 Feb 3 15:38:23 rescue kernel: [ 8730.346413] BTRFS error (device bcache2): parent transid verify failed on 14963956645888 wanted 1446287 found 102741 Feb 3 15:38:23 
rescue kernel: [ 8730.347550] BTRFS error (device bcache2): parent transid verify failed on 14963957317632 wanted 1446844 found 97159 Feb 3 15:38:23 rescue kernel: [ 8730.371987] BTRFS error (device bcache2): bad tree block start, want 14963957694464 have 9926189483690973024 Feb 3 15:38:23 rescue kernel: [ 8730.529027] BTRFS info (device bcache2): bdev /dev/bcache3 errs: wr 0, rd 0, flush 0, corrupt 0, gen 3 Feb 3 15:38:24 rescue kernel: [ 8730.974041] BTRFS info (device bcache2): enabling ssd optimizations Feb 3 15:38:24 rescue kernel: [ 8730.974481] BTRFS info (device bcache2): checking UUID tree Feb 3 15:38:24 rescue kernel: [ 8731.172674] BTRFS critical (device bcache2): corrupt leaf: root=23146 block=19498503094272 slot=8, invalid key objectid: has 18446744073709551606 expect 6 or [256, 18446744073709551360] or 18446744073709551604 Feb 3 15:38:24 rescue kernel: [ 8731.172678] BTRFS error (device bcache2): block=19498503094272 read time tree block corruption detected Feb 3 15:38:24 rescue kernel: [ 8731.172860] BTRFS critical (device bcache2): corrupt leaf: root=23146 block=19498503094272 slot=8, invalid key objectid: has 18446744073709551606 expect 6 or [256, 18446744073709551360] or 18446744073709551604 Feb 3 15:38:24 rescue kernel: [ 8731.172862] BTRFS error (device bcache2): block=19498503094272 read time tree block corruption detected Feb 3 15:38:24 rescue kernel: [ 8731.172878] BTRFS warning (device bcache2): iterating uuid_tree failed -5 Feb 3 15:38:40 rescue kernel: [ 8747.300244] BTRFS error (device bcache2): parent transid verify failed on 19499606147072 wanted 1419889 found 1415052 ^ permalink raw reply [flat|nested] 16+ messages in thread
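The log above is dominated by `parent transid verify failed` lines. A minimal sketch for pulling the block address and the wanted/found generations out of such a log is shown here; the function name `transid_gaps` and the sample text are illustrative only and not part of any btrfs tool. A large gap between wanted and found suggests the copy that was read is far out of date.

```python
import re

# Matches kernel lines such as:
#   BTRFS error (device bcache3): parent transid verify failed on
#   15062144499712 wanted 1446883 found 112350
TRANSID_RE = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)


def transid_gaps(log_text):
    """Return (bytenr, wanted, found, wanted - found) for each mismatch."""
    gaps = []
    for match in TRANSID_RE.finditer(log_text):
        bytenr, wanted, found = (int(group) for group in match.groups())
        gaps.append((bytenr, wanted, found, wanted - found))
    return gaps


# Two lines copied from the log above (hypothetical helper input).
sample = (
    "BTRFS error (device bcache3): parent transid verify failed on "
    "15062144499712 wanted 1446883 found 112350\n"
    "BTRFS error (device bcache3): parent transid verify failed on "
    "15062736961536 wanted 1447095 found 1415067\n"
)

for bytenr, wanted, found, gap in transid_gaps(sample):
    print(f"block {bytenr}: wanted {wanted}, found {found}, behind by {gap}")
```

Feeding the whole saved log to `transid_gaps` instead of `sample` would give one tuple per mismatch, which can then be sorted by gap or grouped by timestamp.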
* Re: How to Fix 'Error: could not find extent items for root 257'?
From: Qu Wenruo @ 2020-02-06 4:35 UTC
To: Chiung-Ming Huang, Btrfs

On 2020/2/6 12:13 PM, Chiung-Ming Huang wrote:
> Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Thu, Feb 6, 2020 at 9:13 AM:
>> Please keep in mind that when you post dmesg, the first occurrence of
>> such an error is the most important one,
>> not something logged after you modified the fs with btrfs check --repair.
>
> Thanks for your advice. I'll keep it in mind. :)
>
>>>
>>> Feb 3 15:38:24 rescue kernel: [ 8731.172674] BTRFS critical (device
>>> bcache2): corrupt leaf: root=23146 block=19498503094272 slot=8,
>>> invalid key objectid: has 18446744073709551606 expect 6 or [256,
>>> 18446744073709551360] or 18446744073709551604
>>
>> This message is even earlier than your initial report, and it's more
>> important.
>> It means you have a bad inode item with objectid EXTENT_CSUM_OBJECTID.
>>
>> This is a bigger problem.
>
> That sounds bad. Is it possible to save the data, or part of it?

The metadata is already screwed up. Data may be partly salvageable with
btrfs-restore, or if you can mount the fs read-only.

>> Are you sure that is the very first error message you hit?
>
> My .bash_history doesn't record timestamps, so I'm not sure which
> critical/error message came right after the first `btrfs check --repair`.
> I tried to make the log file smaller and excerpted only the btrfs messages
> before the first critical message in the attachment. I'm not so familiar
> with mailing lists. Can you see `btrfs_.log`?

Got the attachment.

The first strange part is that I see several mount failures caused by
4 or more devices missing.

Then it mounted with devid 1 missing.

After a reboot, you got the full fs mounted without any device missing.

So far so good, but I'm not sure how the degraded mount affects things here.
Soon after that, there are already signs that a degraded mount is
causing problems: num_devices doesn't match.

Furthermore, around 14:16 on Feb 3 there are metadata transid mismatches,
which means some metadata is already far out of date.

At that point btrfs can still read from the other copy, so it's
not a big problem yet.
But it is already poisoning your fs, reducing its stability step by step.

It's btrfs RAID1 that barely saved your fs.
The normal way to handle this is to trigger a full fs scrub to
resilver/resync all RAID1 copies.

And finally you hit the last stage: around 15:38 btrfs can no longer repair
the metadata mismatches caused by multiple split-brain RAID1 situations,
resulting in tons of transid errors that btrfs can't fix.

So from the full dmesg, it looks like the abuse of degraded mounts is
causing the problem.

This shows one shortcoming of the current btrfs RAID implementation: it
doesn't resilver automatically, unlike mdraid, which resilvers before the
array can be accessed.
Btrfs doesn't keep a record of which blocks were written while a device
was missing.

Thus degraded should really be considered a last-resort method for btrfs,
and a manual scrub after all devices are back online is strongly
recommended.

Thanks,
Qu

>>> rescue ~$ btrfs check /dev/bcache4
>>> Opening filesystem to check...
>>> Checking filesystem on /dev/bcache4
>>> UUID: 0b79cf54-c424-40ed-adca-bd66b38ad57a
>>> [1/7] checking root items
>>> Error: could not find extent items for root 257
>>> ERROR: failed to repair root items: No such file or directory
>>
>> This part is from a special repair for a regression in 3.17.
>>
>> I guess we should not enable it by default.
>> That will be another patch for btrfs-progs.
>
> Is this patch safe for saving my btrfs? If it is, I can build btrfs-progs.

Here is the diff; it should be pretty safe:

diff --git a/check/main.c b/check/main.c
index 7db65150048b..bcde157c415d 100644
--- a/check/main.c
+++ b/check/main.c
@@ -10373,7 +10373,8 @@ static int cmd_check(const struct cmd_struct *cmd, int argc, char **argv)
 			ctx.tp = TASK_ROOT_ITEMS;
 			task_start(ctx.info, &ctx.start_time, &ctx.item_count);
 		}
-		ret = repair_root_items(info);
+		if (repair)
+			ret = repair_root_items(info);
 		task_stop(ctx.info);
 		if (ret < 0) {
 			err = !!ret;

Thanks,
Qu

>
> Regards,
> Chiung-Ming Huang
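The huge objectid values in the quoted `corrupt leaf` message are easier to read as signed 64-bit numbers. The sketch below only does that arithmetic; the helper name `as_signed64` is made up for illustration. As far as I know the btrfs headers define EXTENT_CSUM_OBJECTID as -10 cast to an unsigned 64-bit value, which would agree with the identification given in the message above, but treat that mapping as an assumption and check it against your kernel headers.

```python
U64 = 1 << 64  # number of distinct unsigned 64-bit values


def as_signed64(value):
    """Interpret an unsigned 64-bit integer as two's-complement signed."""
    return value - U64 if value >= (1 << 63) else value


# The three large numbers from the "invalid key objectid" message.
for raw in (18446744073709551606, 18446744073709551360, 18446744073709551604):
    print(raw, "->", as_signed64(raw))
```

Read this way, the key found in the leaf has objectid -10, while the checker expected 6, something in the range 256 to -256, or -12.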
* Re: How to Fix 'Error: could not find extent items for root 257'?
From: Chiung-Ming Huang @ 2020-02-06 6:50 UTC
To: Qu Wenruo; +Cc: Btrfs

Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Thu, Feb 6, 2020 at 12:35 PM:
>
> On 2020/2/6 12:13 PM, Chiung-Ming Huang wrote:
> > Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Thu, Feb 6, 2020 at 9:13 AM:
> Got the attachment.
>
> The first strange part is that I see several mount failures caused by
> 4 or more devices missing.
>
> Then it mounted with devid 1 missing.
>
> After a reboot, you got the full fs mounted without any device missing.

That's because /etc/crypttab on the rescue system wasn't set up correctly.
I logged in first and then fixed it.

> So far so good, but I'm not sure how the degraded mount affects things here.
> Soon after that, there are already signs that a degraded mount is
> causing problems: num_devices doesn't match.

Before running `btrfs balance start -f ...` to convert to single, I removed
3 disks from /etc/crypttab on the server system. They are 1TB (empty),
2TB (empty) and 10TB (5TB of data + metadata). The 10TB disk holds one of
the RAID1 copies. I formatted the 1TB and 2TB disks immediately, but not
the 10TB one, just in case. Then I started `btrfs balance ...` and let the
server keep receiving data from the internet.

I assumed that the 10TB disk would simply hold old data and metadata, and
that even if I added it back to the RAID1, btrfs could figure out which
data was new or old and fix it automatically. The server could keep working
in the meantime. It would just waste some disk space, which `btrfs balance`
or `btrfs scrub` would rectify later. Is that true?

`btrfs balance ...` suddenly failed after several hours. The server became
completely unresponsive, including ssh and ctrl+alt+3. After that I powered
it off, booted into the rescue system, fixed /etc/crypttab and brought all
the devices up under /dev on the rescue system. So I think
`super_num_devices 3 mismatch` refers to these 3 disks. (Not sure.)

> So from the full dmesg, it looks like the abuse of degraded mounts is
> causing the problem.

Given the description above, is the conclusion still the same?

> Thus degraded should really be considered a last-resort method for btrfs,
> and a manual scrub after all devices are back online is strongly
> recommended.

Thanks for your analysis and help. 💔 What's done is done. My goal now
is to fix btrfs and save as much data as possible.

Should I unplug the 10TB disk, the one holding the old RAID1 copy, first?
Originally the server had about 6TB of data. The 10TB disk I removed from
/etc/crypttab holds about 5TB of it. I'm worried that my next step may
result in losing this 5TB of data. Or worse, it may be gone already. I tried
to mount btrfs with only this 10TB disk. It didn't work. dmesg showed:

[Thu Feb 6 14:34:03 2020] BTRFS info (device bcache2): allowing degraded mounts
[Thu Feb 6 14:34:03 2020] BTRFS info (device bcache2): disk space caching is enabled
[Thu Feb 6 14:34:03 2020] BTRFS info (device bcache2): has skinny extents
[Thu Feb 6 14:34:03 2020] BTRFS warning (device bcache2): devid 3 uuid f9b7fe84-d95b-4db5-9e2b-c34a2d4186e9 is missing
[Thu Feb 6 14:34:03 2020] BTRFS warning (device bcache2): devid 5 uuid d442b477-0233-4a4a-aa71-cb24343b83ee is missing
[Thu Feb 6 14:34:03 2020] BTRFS warning (device bcache2): devid 6 uuid d18e3182-a3cc-448b-b15b-0a20dc9c8cbe is missing
[Thu Feb 6 14:34:03 2020] BTRFS warning (device bcache2): devid 7 uuid 991286c4-fa81-417a-876d-a0cb10989ded is missing
[Thu Feb 6 14:34:03 2020] BTRFS warning (device bcache2): failed to read tree root
[Thu Feb 6 14:34:03 2020] BTRFS error (device bcache2): open_ctree failed

Based on your analysis, could you give me some advice about the next steps
to save my btrfs RAID? Are they:
1) Apply the patch.
2) `btrfs check --repair /dev/bcache4`

Regards,
Chiung-Ming Huang
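When a degraded mount fails like this, it can help to list exactly which devids the kernel reported as missing. The following is a rough sketch that does so from saved dmesg text; `missing_devices` and the sample string are hypothetical names for illustration, and the pattern only covers the `devid N uuid ... is missing` wording seen in this thread.

```python
import re

# Matches e.g. "devid 3 uuid f9b7fe84-d95b-4db5-9e2b-c34a2d4186e9 is missing"
MISSING_RE = re.compile(r"devid (\d+) uuid ([0-9a-f-]{36}) is missing")


def missing_devices(log_text):
    """Map each devid reported missing to its device UUID."""
    found = {}
    for match in MISSING_RE.finditer(log_text):
        found[int(match.group(1))] = match.group(2)
    return found


# Two warnings copied from the dmesg output above.
sample = (
    "BTRFS warning (device bcache2): devid 3 uuid "
    "f9b7fe84-d95b-4db5-9e2b-c34a2d4186e9 is missing\n"
    "BTRFS warning (device bcache2): devid 5 uuid "
    "d442b477-0233-4a4a-aa71-cb24343b83ee is missing\n"
)

for devid, uuid in sorted(missing_devices(sample).items()):
    print(f"devid {devid} ({uuid}) is missing")
```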
* Re: How to Fix 'Error: could not find extent items for root 257'?
From: Chiung-Ming Huang @ 2020-02-07 3:49 UTC
To: Qu Wenruo; +Cc: Btrfs

Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Thu, Feb 6, 2020 at 12:35 PM:
>
> Here is the diff; it should be pretty safe:
> diff --git a/check/main.c b/check/main.c
> index 7db65150048b..bcde157c415d 100644
> --- a/check/main.c
> +++ b/check/main.c
> @@ -10373,7 +10373,8 @@ static int cmd_check(const struct cmd_struct *cmd, int argc, char **argv)
>  			ctx.tp = TASK_ROOT_ITEMS;
>  			task_start(ctx.info, &ctx.start_time, &ctx.item_count);
>  		}
> -		ret = repair_root_items(info);
> +		if (repair)
> +			ret = repair_root_items(info);
>  		task_stop(ctx.info);
>  		if (ret < 0) {
>  			err = !!ret;
>

I applied this patch and ran `btrfs check /dev/bcache4`. It printed the
following:

Opening filesystem to check...
Checking filesystem on /dev/bcache4
UUID: 0b79cf54-c424-40ed-adca-bd66b38ad57a
[1/7] checking root items
[2/7] checking extents
parent transid verify failed on 7153357357056 wanted 1382980 found 1452673
parent transid verify failed on 7153357357056 wanted 1382980 found 1452673
parent transid verify failed on 7153357357056 wanted 1382980 found 1452673
Ignoring transid failure
leaf parent key incorrect 7153357357056
bad block 7153357357056
ERROR: errors found in extent allocation tree or chunk allocation
[3/7] checking free space cache
cache and super generation don't match, space cache will be invalidated
[4/7] checking fs roots
root 5 root dir 256 not found
root 257 root dir 256 not found
root 258 root dir 256 not found
root 277 root dir 256 not found
root 278 root dir 256 not found
root 279 root dir 256 not found
root 280 root dir 256 not found
root 283 root dir 256 not found
root 286 root dir 256 not found
root 289 root dir 256 not found
root 292 root dir 256 not found
root 295 root dir 256 not found
root 298 root dir 256 not found
root 304 root dir 256 not found
root 307 root dir 256 not found
root 310 root dir 256 not found
root 313 root dir 256 not found
root 316 root dir 256 not found
root 319 root dir 256 not found
root 322 root dir 256 not found
root 325 root dir 256 not found
root 360 root dir 256 not found
root 367 root dir 256 not found
root 370 root dir 256 not found
root 373 root dir 256 not found
root 376 root dir 256 not found
root 380 root dir 256 not found
root 383 root dir 256 not found
root 386 root dir 256 not found
root 389 root dir 256 not found
root 392 root dir 256 not found
root 399 root dir 256 not found
root 402 root dir 256 not found
root 405 root dir 256 not found
root 408 root dir 256 not found
root 411 root dir 256 not found
root 414 root dir 256 not found
root 417 root dir 256 not found
root 420 root dir 256 not found
root 423 root dir 256 not found
root 426 root dir 256 not found
root 429 root dir 256 not found
root 439 root dir 256 not found
root 442 root dir 256 not found
root 445 root dir 256 not found
root 448 root dir 256 not found
root 451 root dir 256 not found
root 513 root dir 256 not found
root 4613 root dir 256 not found
root 4616 root dir 256 not found
root 4619 root dir 256 not found
root 4622 root dir 256 not found
root 4625 root dir 256 not found
root 4628 root dir 256 not found
root 4631 root dir 256 not found
root 4640 root dir 256 not found
root 4643 root dir 256 not found
root 4646 root dir 256 not found
root 4649 root dir 256 not found
root 4652 root dir 256 not found
root 4673 root dir 256 not found
root 18871 root dir 256 not found
root 19354 root dir 256 not found
root 19355 root dir 256 not found
root 19356 root dir 256 not found
root 19375 root dir 256 not found
root 19416 root dir 256 not found
root 19419 root dir 256 not found
root 19422 root dir 256 not found
root 19425 root dir 256 not found
root 19428 root dir 256 not found
root 19432 root dir 256 not found
root 19435 root dir 256 not found
root 19438 root dir 256 not found
root 19441 root dir 256 not found
root 19450 root dir 256 not found
root 19453 root dir 256 not found
root 19456 root dir 256 not found
root 19459 root dir 256 not found
root 19462 root dir 256 not found
root 19465 root dir 256 not found
root 19468 root dir 256 not found
root 19472 root dir 256 not found
root 19473 root dir 256 not found
root 19613 root dir 256 not found
root 19784 root dir 256 not found
root 19812 root dir 256 not found
root 20572 root dir 256 not found
root 20768 root dir 256 not found
root 20771 root dir 256 not found
root 20834 root dir 256 not found
root 20837 root dir 256 not found
root 21438 root dir 256 not found
root 21447 root dir 256 not found
root 21469 root dir 256 not found
root 21470 root dir 256 not found
root 23144 root dir 256 not found
root 23146 root dir 256 not found
root 23147 root dir 256 not found
root 23440 root dir 256 not found
root 23452 root dir 256 not found
root 23460 root dir 256 not found
root 23471 root dir 256 not found
root 23520 root dir 256 not found
root 23521 root dir 256 not found
root 23833 root dir 256 not found
root 23834 root dir 256 not found
root 23854 root dir 256 not found
root 23855 root dir 256 not found
ERROR: errors found in fs roots
found 1902526464 bytes used, error(s) found
total csum bytes: 0
total tree bytes: 6275072
total fs tree bytes: 1032192
total extent tree bytes: 409600
btree space waste bytes: 974245
file data blocks allocated: 1628962816
 referenced 1628962816

Regards,
Chiung-Ming Huang
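The `[4/7] checking fs roots` output above repeats the same complaint for many subvolume roots. A short sketch like the one below can condense saved `btrfs check` output into a list of affected root IDs; the name `roots_missing_root_dir` and the sample text are illustrative assumptions, and the pattern only matches the `root N root dir M not found` wording shown in this thread.

```python
import re

# Matches e.g. "root 257 root dir 256 not found"
ROOT_DIR_RE = re.compile(r"root (\d+) root dir (\d+) not found")


def roots_missing_root_dir(check_output):
    """Return the root IDs whose root directory inode was not found."""
    return [int(match.group(1)) for match in ROOT_DIR_RE.finditer(check_output)]


# A few lines copied from the check output above.
sample = (
    "[4/7] checking fs roots\n"
    "root 5 root dir 256 not found\n"
    "root 257 root dir 256 not found\n"
    "root 258 root dir 256 not found\n"
    "ERROR: errors found in fs roots\n"
)

affected = roots_missing_root_dir(sample)
print(f"{len(affected)} roots affected, first: {affected[0]}, last: {affected[-1]}")
```

Because the pattern does not depend on line breaks, it should also work on output that has been pasted as one long line.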
* Re: How to Fix 'Error: could not find extent items for root 257'?
From: Qu Wenruo @ 2020-02-07 4:00 UTC
To: Chiung-Ming Huang; +Cc: Btrfs

On 2020/2/7 11:49 AM, Chiung-Ming Huang wrote:
> Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Thu, Feb 6, 2020 at 12:35 PM:
>>
>> Here is the diff; it should be pretty safe:
>> diff --git a/check/main.c b/check/main.c
>> index 7db65150048b..bcde157c415d 100644
>> --- a/check/main.c
>> +++ b/check/main.c
>> @@ -10373,7 +10373,8 @@ static int cmd_check(const struct cmd_struct *cmd, int argc, char **argv)
>>  			ctx.tp = TASK_ROOT_ITEMS;
>>  			task_start(ctx.info, &ctx.start_time, &ctx.item_count);
>>  		}
>> -		ret = repair_root_items(info);
>> +		if (repair)
>> +			ret = repair_root_items(info);
>>  		task_stop(ctx.info);
>>  		if (ret < 0) {
>>  			err = !!ret;
>>
>
> I applied this patch and executed `btrfs check /dev/bcache4`. It showed these.
> Opening filesystem to check...
> Checking filesystem on /dev/bcache4
> UUID: 0b79cf54-c424-40ed-adca-bd66b38ad57a
> [1/7] checking root items
> [2/7] checking extents
> parent transid verify failed on 7153357357056 wanted 1382980 found 1452673
> parent transid verify failed on 7153357357056 wanted 1382980 found 1452673
> parent transid verify failed on 7153357357056 wanted 1382980 found 1452673

The extent tree is corrupted by a transid mismatch. That is already bad news.

> Ignoring transid failure > leaf parent key incorrect 7153357357056 > bad block 7153357357056 > ERROR: errors found in extent allocation tree or chunk allocation > [3/7] checking free space cache > cache and super generation don't match, space cache will be invalidated > [4/7] checking fs roots > root 5 root dir 256 not found > root 257 root dir 256 not found > root 258 root dir 256 not found > root 277 root dir 256 not found > root 278 root dir 256 not found > root 279 root dir 256 not found > root 280 root dir 256 not found > root 283 root dir 256 not found > root 286 root dir 256 not found > root 289 root dir 256 not found > root 292 root dir 256 not found > root 295 root dir 256 not found > root 298 root dir 256 not found > root 304 root dir 256 not found > root 307 root dir 256 not found > root 310 root dir 256 not found > root 313 root dir 256 not found > root 316 root dir 256 not found > root 319 root dir 256 not found > root 322 root dir 256 not found > root 325 root dir 256 not found > root 360 root dir 256 not found > root 367 root dir 256 not found > root 370 root dir 256 not found > root 373 root dir 256 not found > root 376 root dir 256 not found > root 380 root dir 256 not found > root 383 root dir 256 not found > root 386 root dir 256 not found > root 389 root dir 256 not found > root 392 root dir 256 not found > root 399 root dir 256 not found > root 402 root dir 256 not found > root 405 root dir 256 not found > root 408 root dir 256 not found > root 411 root dir 256 not found > root 414 root dir 256 not found > root 417 root dir 256 not found > root 420 root dir 256 not found > root 423 root dir 256 not found > root 426 root dir 256 not found > root 429 root dir 256 not found > root 439 root dir 256 not found > root 442 root dir 256 not found > root 445 root dir 256 not found > root 448 root dir 256 not found > root 451 root dir 256 not found > root 513 root dir 256 not found > root 4613 root dir 256 not found > root 4616 root dir 256 not found > 
root 4619 root dir 256 not found > root 4622 root dir 256 not found > root 4625 root dir 256 not found > root 4628 root dir 256 not found > root 4631 root dir 256 not found > root 4640 root dir 256 not found > root 4643 root dir 256 not found > root 4646 root dir 256 not found > root 4649 root dir 256 not found > root 4652 root dir 256 not found > root 4673 root dir 256 not found > root 18871 root dir 256 not found > root 19354 root dir 256 not found > root 19355 root dir 256 not found > root 19356 root dir 256 not found > root 19375 root dir 256 not found > root 19416 root dir 256 not found > root 19419 root dir 256 not found > root 19422 root dir 256 not found > root 19425 root dir 256 not found > root 19428 root dir 256 not found > root 19432 root dir 256 not found > root 19435 root dir 256 not found > root 19438 root dir 256 not found > root 19441 root dir 256 not found > root 19450 root dir 256 not found > root 19453 root dir 256 not found > root 19456 root dir 256 not found > root 19459 root dir 256 not found > root 19462 root dir 256 not found > root 19465 root dir 256 not found > root 19468 root dir 256 not found > root 19472 root dir 256 not found > root 19473 root dir 256 not found > root 19613 root dir 256 not found > root 19784 root dir 256 not found > root 19812 root dir 256 not found > root 20572 root dir 256 not found > root 20768 root dir 256 not found > root 20771 root dir 256 not found > root 20834 root dir 256 not found > root 20837 root dir 256 not found > root 21438 root dir 256 not found > root 21447 root dir 256 not found > root 21469 root dir 256 not found > root 21470 root dir 256 not found > root 23144 root dir 256 not found > root 23146 root dir 256 not found > root 23147 root dir 256 not found > root 23440 root dir 256 not found > root 23452 root dir 256 not found > root 23460 root dir 256 not found > root 23471 root dir 256 not found > root 23520 root dir 256 not found > root 23521 root dir 256 not found > root 23833 root dir 256 not 
found > root 23834 root dir 256 not found > root 23854 root dir 256 not found > root 23855 root dir 256 not found All these subvolumes had a missing root dir. That's not good either. I guess btrfs-restore is your last chance, or RO mount with my rescue=skipbg patchset: https://patchwork.kernel.org/project/linux-btrfs/list/?series=170715 Thanks, Qu > ERROR: errors found in fs roots > found 1902526464 bytes used, error(s) found > total csum bytes: 0 > total tree bytes: 6275072 > total fs tree bytes: 1032192 > total extent tree bytes: 409600 > btree space waste bytes: 974245 > file data blocks allocated: 1628962816 > referenced 1628962816 > > Regards, > Chiung-Ming Huang > [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: How to Fix 'Error: could not find extent items for root 257'?
  2020-02-07  4:00 ` Qu Wenruo
@ 2020-02-07  6:16 ` Chiung-Ming Huang
  2020-02-07  7:16 ` Qu Wenruo
  0 siblings, 1 reply; 16+ messages in thread
From: Chiung-Ming Huang @ 2020-02-07 6:16 UTC (permalink / raw)
To: Qu Wenruo; +Cc: Btrfs

Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Fri, 2020-02-07 at 12:00 PM:
>
> All these subvolumes had a missing root dir. That's not good either.
> I guess btrfs-restore is your last chance, or RO mount with my
> rescue=skipbg patchset:
> https://patchwork.kernel.org/project/linux-btrfs/list/?series=170715
>

Is it possible to reuse the original disks to hold the restored data safely?
I would like to first restore the data of /dev/bcache3 to the new btrfs RAID0
and then add it to the new btrfs RAID0. Does `btrfs restore` need metadata or
anything else on /dev/bcache3 to restore /dev/bcache2 and /dev/bcache4?

/dev/bcache2, ID: 1
   Device size:             9.09TiB
   Device slack:              0.00B
   Data,RAID1:              3.93TiB
   Metadata,RAID1:          2.00GiB
   System,RAID1:           32.00MiB
   Unallocated:             5.16TiB

/dev/bcache3, ID: 3
   Device size:             2.73TiB
   Device slack:              0.00B
   Data,single:           378.00GiB
   Data,RAID1:            355.00GiB
   Metadata,single:         2.00GiB
   Metadata,RAID1:         11.00GiB
   Unallocated:             2.00TiB

/dev/bcache4, ID: 5
   Device size:             9.09TiB
   Device slack:              0.00B
   Data,single:             2.93TiB
   Data,RAID1:              4.15TiB
   Metadata,single:         6.00GiB
   Metadata,RAID1:         11.00GiB
   System,RAID1:           32.00MiB
   Unallocated:             2.00TiB

Regards,
Chiung-Ming Huang

^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: How to Fix 'Error: could not find extent items for root 257'? 2020-02-07 6:16 ` Chiung-Ming Huang @ 2020-02-07 7:16 ` Qu Wenruo 2020-02-10 6:50 ` Chiung-Ming Huang 0 siblings, 1 reply; 16+ messages in thread From: Qu Wenruo @ 2020-02-07 7:16 UTC (permalink / raw) To: Chiung-Ming Huang; +Cc: Btrfs [-- Attachment #1.1: Type: text/plain, Size: 1900 bytes --] On 2020/2/7 2:16 PM, Chiung-Ming Huang wrote: > Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Fri, 2020-02-07 at 12:00 PM: >> >> All these subvolumes had a missing root dir. That's not good either. >> I guess btrfs-restore is your last chance, or RO mount with my >> rescue=skipbg patchset: >> https://patchwork.kernel.org/project/linux-btrfs/list/?series=170715 >> > > Is it possible to use original disks to keep the restored data safely? > I would like > to restore the data of /dev/bcache3 to the new btrfs RAID0 at the first and then > add it to the new btrfs RAID0. Does `btrfs restore` need metadata or something > in /dev/bcache3 to restore /dev/bcache2 and /dev/bcache4? Devid 1 (bcache2) seems OK to be missing, as all its data and metadata are in RAID1. But it's strongly recommended to first test without wiping bcache2, to make sure btrfs-restore can salvage enough data, and only then wipe bcache2.
Thanks, Qu > > /dev/bcache2, ID: 1 > Device size: 9.09TiB > Device slack: 0.00B > Data,RAID1: 3.93TiB > Metadata,RAID1: 2.00GiB > System,RAID1: 32.00MiB > Unallocated: 5.16TiB > > /dev/bcache3, ID: 3 > Device size: 2.73TiB > Device slack: 0.00B > Data,single: 378.00GiB > Data,RAID1: 355.00GiB > Metadata,single: 2.00GiB > Metadata,RAID1: 11.00GiB > Unallocated: 2.00TiB > > /dev/bcache4, ID: 5 > Device size: 9.09TiB > Device slack: 0.00B > Data,single: 2.93TiB > Data,RAID1: 4.15TiB > Metadata,single: 6.00GiB > Metadata,RAID1: 11.00GiB > System,RAID1: 32.00MiB > Unallocated: 2.00TiB > > Regards, > Chiung-Ming Huang > [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: How to Fix 'Error: could not find extent items for root 257'? 2020-02-07 7:16 ` Qu Wenruo @ 2020-02-10 6:50 ` Chiung-Ming Huang 2020-02-10 7:03 ` Qu Wenruo 0 siblings, 1 reply; 16+ messages in thread From: Chiung-Ming Huang @ 2020-02-10 6:50 UTC (permalink / raw) To: Qu Wenruo; +Cc: Btrfs Qu Wenruo <quwenruo.btrfs@gmx.com> 於 2020年2月7日 週五 下午3:16寫道: > > > > On 2020/2/7 下午2:16, Chiung-Ming Huang wrote: > > Qu Wenruo <quwenruo.btrfs@gmx.com> 於 2020年2月7日 週五 下午12:00寫道: > >> > >> All these subvolumes had a missing root dir. That's not good either. > >> I guess btrfs-restore is your last chance, or RO mount with my > >> rescue=skipbg patchset: > >> https://patchwork.kernel.org/project/linux-btrfs/list/?series=170715 > >> > > > > Is it possible to use original disks to keep the restored data safely? > > I would like > > to restore the data of /dev/bcache3 to the new btrfs RAID0 at the first and then > > add it to the new btrfs RAID0. Does `btrfs restore` need metadata or something > > in /dev/bcache3 to restore /dev/bcache2 and /dev/bcache4? > > Devid 1 (bcache 2) seems OK to be missing, as all its data and metadata > are in RAID1. > > But it's strongly recommended to test without wiping bcache2, to make > sure btrfs-restore can salvage enough data, then wipeing bcache2. > > Thanks, > Qu Is it possible to shrink the size of bcache2 btrfs without making everything worse? I need more disk space but I still need bcache2 itself. 
Regards, Chiung-Ming Huang > > > > /dev/bcache2, ID: 1 > > Device size: 9.09TiB > > Device slack: 0.00B > > Data,RAID1: 3.93TiB > > Metadata,RAID1: 2.00GiB > > System,RAID1: 32.00MiB > > Unallocated: 5.16TiB > > > > /dev/bcache3, ID: 3 > > Device size: 2.73TiB > > Device slack: 0.00B > > Data,single: 378.00GiB > > Data,RAID1: 355.00GiB > > Metadata,single: 2.00GiB > > Metadata,RAID1: 11.00GiB > > Unallocated: 2.00TiB > > > > /dev/bcache4, ID: 5 > > Device size: 9.09TiB > > Device slack: 0.00B > > Data,single: 2.93TiB > > Data,RAID1: 4.15TiB > > Metadata,single: 6.00GiB > > Metadata,RAID1: 11.00GiB > > System,RAID1: 32.00MiB > > Unallocated: 2.00TiB > > > > Regards, > > Chiung-Ming Huang > > > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: How to Fix 'Error: could not find extent items for root 257'? 2020-02-10 6:50 ` Chiung-Ming Huang @ 2020-02-10 7:03 ` Qu Wenruo 2020-02-15 3:47 ` Chiung-Ming Huang 0 siblings, 1 reply; 16+ messages in thread From: Qu Wenruo @ 2020-02-10 7:03 UTC (permalink / raw) To: Chiung-Ming Huang; +Cc: Btrfs [-- Attachment #1.1: Type: text/plain, Size: 4049 bytes --] On 2020/2/10 2:50 PM, Chiung-Ming Huang wrote: > Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Fri, 2020-02-07 at 3:16 PM: >> >> >> >> On 2020/2/7 2:16 PM, Chiung-Ming Huang wrote: >>> Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Fri, 2020-02-07 at 12:00 PM: >>>> >>>> All these subvolumes had a missing root dir. That's not good either. >>>> I guess btrfs-restore is your last chance, or RO mount with my >>>> rescue=skipbg patchset: >>>> https://patchwork.kernel.org/project/linux-btrfs/list/?series=170715 >>>> >>> >>> Is it possible to use original disks to keep the restored data safely? >>> I would like >>> to restore the data of /dev/bcache3 to the new btrfs RAID0 at the first and then >>> add it to the new btrfs RAID0. Does `btrfs restore` need metadata or something >>> in /dev/bcache3 to restore /dev/bcache2 and /dev/bcache4? >> >> Devid 1 (bcache 2) seems OK to be missing, as all its data and metadata >> are in RAID1. >> >> But it's strongly recommended to test without wiping bcache2, to make >> sure btrfs-restore can salvage enough data, then wipeing bcache2. >> >> Thanks, >> Qu > > Is it possible to shrink the size of bcache2 btrfs without making > everything worse? > I need more disk space but I still need bcache2 itself. That is kind of possible, but please keep in mind that, even in the best case, it still needs to write a (very small) amount of metadata into the fs, so I can't guarantee that it won't make things worse, or that it is even possible without the fs falling back to RO. You need to dump the device extent tree to determine where the last dev extent is for each device, then shrink to that size.
Some example here:

# btrfs ins dump-tree -t dev /dev/nvme/btrfs
...

    item 6 key (1 DEV_EXTENT 2169503744) itemoff 15955 itemsize 48
        dev extent chunk_tree 3
        chunk_objectid 256 chunk_offset 2169503744 length 1073741824
        chunk_tree_uuid 00000000-0000-0000-0000-000000000000

In the key, 1 is the devid and 2169503744 is where the device extent
starts. 1073741824 is the length of the device extent.

In the above case, the device with devid 1 can be resized to
2169503744 + 1073741824 = 3243245568 bytes without relocating any
data/metadata.

# time btrfs fi resize 1:3243245568 /mnt/btrfs/
Resize '/mnt/btrfs/' of '1:3243245568'

real    0m0.013s
user    0m0.006s
sys     0m0.004s

And the dump-tree shows the same last device extent:
...
    item 6 key (1 DEV_EXTENT 2169503744) itemoff 15955 itemsize 48
        dev extent chunk_tree 3
        chunk_objectid 256 chunk_offset 2169503744 length 1073741824
        chunk_tree_uuid 00000000-0000-0000-0000-000000000000

(Maybe it's a good time to implement something like a fast shrink for
btrfs-progs.)

Of course, after resizing btrfs, you still need to resize bcache, but
that's not related to btrfs (and I am not familiar with bcache either).
Thanks, Qu > > Regards, > Chiung-Ming Huang > > >>> >>> /dev/bcache2, ID: 1 >>> Device size: 9.09TiB >>> Device slack: 0.00B >>> Data,RAID1: 3.93TiB >>> Metadata,RAID1: 2.00GiB >>> System,RAID1: 32.00MiB >>> Unallocated: 5.16TiB >>> >>> /dev/bcache3, ID: 3 >>> Device size: 2.73TiB >>> Device slack: 0.00B >>> Data,single: 378.00GiB >>> Data,RAID1: 355.00GiB >>> Metadata,single: 2.00GiB >>> Metadata,RAID1: 11.00GiB >>> Unallocated: 2.00TiB >>> >>> /dev/bcache4, ID: 5 >>> Device size: 9.09TiB >>> Device slack: 0.00B >>> Data,single: 2.93TiB >>> Data,RAID1: 4.15TiB >>> Metadata,single: 6.00GiB >>> Metadata,RAID1: 11.00GiB >>> System,RAID1: 32.00MiB >>> Unallocated: 2.00TiB >>> >>> Regards, >>> Chiung-Ming Huang >>> >> [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 16+ messages in thread
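The arithmetic in Qu's example (start of the last DEV_EXTENT plus its length) can be automated when a device has many extents. What follows is only a minimal sketch, assuming the text layout of `btrfs ins dump-tree -t dev` output quoted in the message above; the function name `min_device_sizes` is made up for illustration and is not part of btrfs-progs.

```python
import re
from typing import Dict, Optional, Tuple

# Key line of a device extent item: "key (<devid> DEV_EXTENT <start offset>)".
KEY_RE = re.compile(r"key \((\d+) DEV_EXTENT (\d+)\)")
# The extent length appears on a following line as "length <bytes>".
LENGTH_RE = re.compile(r"\blength (\d+)")


def min_device_sizes(dump_text: str) -> Dict[int, int]:
    """Return {devid: end offset in bytes of that device's last extent}.

    Hypothetical helper: it only parses text that looks like the dump-tree
    output quoted in the thread and never touches a filesystem.
    """
    ends: Dict[int, int] = {}
    current: Optional[Tuple[int, int]] = None  # (devid, start) being read
    for line in dump_text.splitlines():
        key = KEY_RE.search(line)
        if key:
            current = (int(key.group(1)), int(key.group(2)))
            continue
        if " key (" in line:
            # Some other item type starts here; stop tracking the extent.
            current = None
            continue
        length = LENGTH_RE.search(line)
        if current is not None and length:
            devid, start = current
            end = start + int(length.group(1))
            ends[devid] = max(ends.get(devid, 0), end)
            current = None
    return ends


# The single item shown in the thread.
SAMPLE = """\
    item 6 key (1 DEV_EXTENT 2169503744) itemoff 15955 itemsize 48
        dev extent chunk_tree 3
        chunk_objectid 256 chunk_offset 2169503744 length 1073741824
        chunk_tree_uuid 00000000-0000-0000-0000-000000000000
"""

if __name__ == "__main__":
    for devid, size in sorted(min_device_sizes(SAMPLE).items()):
        # Prints the argument form used by "btrfs fi resize <devid>:<size>".
        print(f"{devid}:{size}")
```

For the sample item this gives 1:3243245568, the same value passed to `btrfs fi resize` in the thread. The sketch only computes a number; whether a resize is safe on a damaged filesystem remains subject to the caveats given in the message above.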
* Re: How to Fix 'Error: could not find extent items for root 257'? 2020-02-10 7:03 ` Qu Wenruo @ 2020-02-15 3:47 ` Chiung-Ming Huang 2020-02-15 4:29 ` Qu Wenruo 0 siblings, 1 reply; 16+ messages in thread From: Chiung-Ming Huang @ 2020-02-15 3:47 UTC (permalink / raw) To: Qu Wenruo; +Cc: Btrfs Hi Qu Thanks for your reply. That's really helpful. BTW, I just read this url and the mail thread in it. https://unix.stackexchange.com/a/345972 It seems to say if raid1 is degraded and even if rw, it should not be applied any operations other than btrfs-replace or btrfs-balance. Does it mean the degraded raid1 should not be used with both btrfs-replace/balance and the original server rw services at the meantime? For example, I put PostgreSQL DB on btrfs raid1 and I though one of raid1 two copies is my backup. Even if I lost one copy, the service still can keep running by another one immediately. Okay, maybe not immediately. I need to reboot. But waiting 24 hours or longer which depends on the size of data for the completion of btrfs-replace/balance seems not to be a good idea. Regards, Chiung-Ming Huang Regards, Chiung-Ming Huang Qu Wenruo <quwenruo.btrfs@gmx.com> 於 2020年2月10日 週一 下午3:03寫道: > > > > On 2020/2/10 下午2:50, Chiung-Ming Huang wrote: > > Qu Wenruo <quwenruo.btrfs@gmx.com> 於 2020年2月7日 週五 下午3:16寫道: > >> > >> > >> > >> On 2020/2/7 下午2:16, Chiung-Ming Huang wrote: > >>> Qu Wenruo <quwenruo.btrfs@gmx.com> 於 2020年2月7日 週五 下午12:00寫道: > >>>> > >>>> All these subvolumes had a missing root dir. That's not good either. > >>>> I guess btrfs-restore is your last chance, or RO mount with my > >>>> rescue=skipbg patchset: > >>>> https://patchwork.kernel.org/project/linux-btrfs/list/?series=170715 > >>>> > >>> > >>> Is it possible to use original disks to keep the restored data safely? > >>> I would like > >>> to restore the data of /dev/bcache3 to the new btrfs RAID0 at the first and then > >>> add it to the new btrfs RAID0. 
Does `btrfs restore` need metadata or something > >>> in /dev/bcache3 to restore /dev/bcache2 and /dev/bcache4? > >> > >> Devid 1 (bcache 2) seems OK to be missing, as all its data and metadata > >> are in RAID1. > >> > >> But it's strongly recommended to test without wiping bcache2, to make > >> sure btrfs-restore can salvage enough data, then wipeing bcache2. > >> > >> Thanks, > >> Qu > > > > Is it possible to shrink the size of bcache2 btrfs without making > > everything worse? > > I need more disk space but I still need bcache2 itself. > > That is kinda possible, but please keep in mind that, even in the best > case, it still needs to write some (very small amount) metadata into the > fs, thus I can't ensure it won't make things worse, or even possible > without falling back to RO. > > You need to dump the device extent tree, to determine the where the last > dev extent is for each device, then shrink to that size. > > Some example here: > > # btrfs ins dump-tree -t dev /dev/nvme/btrfs > ... > > item 6 key (1 DEV_EXTENT 2169503744) itemoff 15955 itemsize 48 > dev extent chunk_tree 3 > chunk_objectid 256 chunk_offset 2169503744 length 1073741824 > chunk_tree_uuid 00000000-0000-0000-0000-000000000000 > > Here for the key, 1 means devid 1, 2169503744 means where the device > extent starts at. 1073741824 is the length of the device extent. > > In above case, the device with devid 1 can be resized to 2169503744 + > 1073741824, without relocating any data/metadata. > > # time btrfs fi resize 1:3243245568 /mnt/btrfs/ > Resize '/mnt/btrfs/' of '1:3243245568' > > real 0m0.013s > user 0m0.006s > sys 0m0.004s > > And the dump-tree shows the same last device extent: > ... 
> item 6 key (1 DEV_EXTENT 2169503744) itemoff 15955 itemsize 48 > dev extent chunk_tree 3 > chunk_objectid 256 chunk_offset 2169503744 length 1073741824 > chunk_tree_uuid 00000000-0000-0000-0000-000000000000 > > (Maybe it's a good time to implement some like fast shrink for btrfs-progs) > > Of course, after resizing btrfs, you still need to resize bcache, but > that's not related to btrfs (and I am not familiar with bcache either). > > Thanks, > Qu > > > > > Regards, > > Chiung-Ming Huang > > > > > >>> > >>> /dev/bcache2, ID: 1 > >>> Device size: 9.09TiB > >>> Device slack: 0.00B > >>> Data,RAID1: 3.93TiB > >>> Metadata,RAID1: 2.00GiB > >>> System,RAID1: 32.00MiB > >>> Unallocated: 5.16TiB > >>> > >>> /dev/bcache3, ID: 3 > >>> Device size: 2.73TiB > >>> Device slack: 0.00B > >>> Data,single: 378.00GiB > >>> Data,RAID1: 355.00GiB > >>> Metadata,single: 2.00GiB > >>> Metadata,RAID1: 11.00GiB > >>> Unallocated: 2.00TiB > >>> > >>> /dev/bcache4, ID: 5 > >>> Device size: 9.09TiB > >>> Device slack: 0.00B > >>> Data,single: 2.93TiB > >>> Data,RAID1: 4.15TiB > >>> Metadata,single: 6.00GiB > >>> Metadata,RAID1: 11.00GiB > >>> System,RAID1: 32.00MiB > >>> Unallocated: 2.00TiB > >>> > >>> Regards, > >>> Chiung-Ming Huang > >>> > >> > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: How to Fix 'Error: could not find extent items for root 257'? 2020-02-15 3:47 ` Chiung-Ming Huang @ 2020-02-15 4:29 ` Qu Wenruo 0 siblings, 0 replies; 16+ messages in thread From: Qu Wenruo @ 2020-02-15 4:29 UTC (permalink / raw) To: Chiung-Ming Huang; +Cc: Btrfs [-- Attachment #1.1: Type: text/plain, Size: 6913 bytes --] On 2020/2/15 11:47 AM, Chiung-Ming Huang wrote: > Hi Qu > > Thanks for your reply. That's really helpful. BTW, I just read this url and > the mail thread in it. https://unix.stackexchange.com/a/345972 > It seems to say if raid1 is degraded and even if rw, it should not be applied > any operations other than btrfs-replace or btrfs-balance. That would be the best case. > > Does it mean the degraded raid1 should not be used with both > btrfs-replace/balance and the original server rw services at the meantime? No, as long as the fs is still mounted, degraded RAID1 can in fact be pretty safe. At least as I see it, all the problems happen when we try to mount the fs again using a mix of up-to-date disks and an out-of-date disk. For a running degraded fs, btrfs knows which device is missing and just submits reads/writes to the existing devices, and replace/balance can both handle that case. > > For example, I put PostgreSQL DB on btrfs raid1 and I though one of raid1 > two copies is my backup. Even if I lost one copy, the service still can keep > running by another one immediately. Okay, maybe not immediately. I need > to reboot. You'd better not reboot, at least not directly into the normal running state with the bad disk attached. > But waiting 24 hours or longer which depends on the size of data > for the completion of btrfs-replace/balance seems not to be a good idea. Btrfs-replace works just like scrub, in that it only copies/verifies data on a certain disk. It doesn't rewrite/verify the whole fs, but I understand that it can be very slow. For btrfs-replace, you can just run the replace in the background.
Replace has extra protection to avoid data going out of sync. In short, in your case it looks like the problem comes from some of your degraded mounts, which screwed up some metadata blocks because the metadata went out of sync. To avoid such problems, it may be a good idea to let btrfs use the superblock generation to find out which device is out of date, and then either resilver it automatically or at least avoid reading data/metadata from the old device. But that feature will need extra consideration before we even try to implement it. So currently my only practical recommendation would be: if you find one disk failing, remove it completely and make sure it does not show up again before you remount the fs. Then you can safely replace/remount. Thanks, Qu > > Regards, > Chiung-Ming Huang > > Regards, > Chiung-Ming Huang > > > Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Mon, 2020-02-10 at 3:03 PM: >> >> >> >> On 2020/2/10 2:50 PM, Chiung-Ming Huang wrote: >>> Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Fri, 2020-02-07 at 3:16 PM: >>>> >>>> >>>> >>>> On 2020/2/7 2:16 PM, Chiung-Ming Huang wrote: >>>>> Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Fri, 2020-02-07 at 12:00 PM: >>>>>> >>>>>> All these subvolumes had a missing root dir. That's not good either. >>>>>> I guess btrfs-restore is your last chance, or RO mount with my >>>>>> rescue=skipbg patchset: >>>>>> https://patchwork.kernel.org/project/linux-btrfs/list/?series=170715 >>>>>> >>>>> >>>>> Is it possible to use original disks to keep the restored data safely? >>>>> I would like >>>>> to restore the data of /dev/bcache3 to the new btrfs RAID0 at the first and then >>>>> add it to the new btrfs RAID0. Does `btrfs restore` need metadata or something >>>>> in /dev/bcache3 to restore /dev/bcache2 and /dev/bcache4? >>>> >>>> Devid 1 (bcache 2) seems OK to be missing, as all its data and metadata >>>> are in RAID1. >>>> >>>> But it's strongly recommended to test without wiping bcache2, to make >>>> sure btrfs-restore can salvage enough data, then wipeing bcache2.
>>>> >>>> Thanks, >>>> Qu >>> >>> Is it possible to shrink the size of bcache2 btrfs without making >>> everything worse? >>> I need more disk space but I still need bcache2 itself. >> >> That is kinda possible, but please keep in mind that, even in the best >> case, it still needs to write some (very small amount) metadata into the >> fs, thus I can't ensure it won't make things worse, or even possible >> without falling back to RO. >> >> You need to dump the device extent tree, to determine the where the last >> dev extent is for each device, then shrink to that size. >> >> Some example here: >> >> # btrfs ins dump-tree -t dev /dev/nvme/btrfs >> ... >> >> item 6 key (1 DEV_EXTENT 2169503744) itemoff 15955 itemsize 48 >> dev extent chunk_tree 3 >> chunk_objectid 256 chunk_offset 2169503744 length 1073741824 >> chunk_tree_uuid 00000000-0000-0000-0000-000000000000 >> >> Here for the key, 1 means devid 1, 2169503744 means where the device >> extent starts at. 1073741824 is the length of the device extent. >> >> In above case, the device with devid 1 can be resized to 2169503744 + >> 1073741824, without relocating any data/metadata. >> >> # time btrfs fi resize 1:3243245568 /mnt/btrfs/ >> Resize '/mnt/btrfs/' of '1:3243245568' >> >> real 0m0.013s >> user 0m0.006s >> sys 0m0.004s >> >> And the dump-tree shows the same last device extent: >> ... >> item 6 key (1 DEV_EXTENT 2169503744) itemoff 15955 itemsize 48 >> dev extent chunk_tree 3 >> chunk_objectid 256 chunk_offset 2169503744 length 1073741824 >> chunk_tree_uuid 00000000-0000-0000-0000-000000000000 >> >> (Maybe it's a good time to implement some like fast shrink for btrfs-progs) >> >> Of course, after resizing btrfs, you still need to resize bcache, but >> that's not related to btrfs (and I am not familiar with bcache either). 
>> >> Thanks, >> Qu >> >>> >>> Regards, >>> Chiung-Ming Huang >>> >>> >>>>> >>>>> /dev/bcache2, ID: 1 >>>>> Device size: 9.09TiB >>>>> Device slack: 0.00B >>>>> Data,RAID1: 3.93TiB >>>>> Metadata,RAID1: 2.00GiB >>>>> System,RAID1: 32.00MiB >>>>> Unallocated: 5.16TiB >>>>> >>>>> /dev/bcache3, ID: 3 >>>>> Device size: 2.73TiB >>>>> Device slack: 0.00B >>>>> Data,single: 378.00GiB >>>>> Data,RAID1: 355.00GiB >>>>> Metadata,single: 2.00GiB >>>>> Metadata,RAID1: 11.00GiB >>>>> Unallocated: 2.00TiB >>>>> >>>>> /dev/bcache4, ID: 5 >>>>> Device size: 9.09TiB >>>>> Device slack: 0.00B >>>>> Data,single: 2.93TiB >>>>> Data,RAID1: 4.15TiB >>>>> Metadata,single: 6.00GiB >>>>> Metadata,RAID1: 11.00GiB >>>>> System,RAID1: 32.00MiB >>>>> Unallocated: 2.00TiB >>>>> >>>>> Regards, >>>>> Chiung-Ming Huang >>>>> >>>> >> [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 16+ messages in thread
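Qu's last message suggests that the superblock generation could tell which device is out of date. That is described as a possible future kernel feature, not something btrfs did at the time of the thread, but the same comparison can be done by hand as a diagnostic before remounting. What follows is only a rough sketch, assuming the text printed by `btrfs inspect-internal dump-super` has a line starting with `generation` followed by a number; the helper names, device paths and generation numbers are invented for illustration.

```python
import re
from typing import Dict, List

# Match only the plain "generation" field at the start of a line, not
# fields such as "chunk_root_generation" or "cache_generation".
GEN_RE = re.compile(r"^generation\s+(\d+)", re.MULTILINE)


def superblock_generation(dump_super_text: str) -> int:
    """Extract the superblock generation from dump-super style text."""
    match = GEN_RE.search(dump_super_text)
    if match is None:
        raise ValueError("no 'generation' field found")
    return int(match.group(1))


def stale_devices(dumps: Dict[str, str]) -> List[str]:
    """Return the devices whose generation is lower than the newest one.

    `dumps` maps a device path to the dump-super text captured for it.
    Hypothetical helper: it only flags candidates and repairs nothing.
    """
    gens = {dev: superblock_generation(text) for dev, text in dumps.items()}
    if not gens:
        return []
    newest = max(gens.values())
    return sorted(dev for dev, gen in gens.items() if gen < newest)


# Invented sample data: device names and numbers are NOT from the thread.
SAMPLE = {
    "/dev/example-a": "generation\t\t100\nchunk_root_generation\t90\n",
    "/dev/example-b": "generation\t\t100\nchunk_root_generation\t90\n",
    "/dev/example-c": "generation\t\t97\nchunk_root_generation\t90\n",
}

if __name__ == "__main__":
    for dev in stale_devices(SAMPLE):
        print(f"{dev} looks out of date; keep it detached before remounting")
```

A device flagged this way is a candidate for the handling Qu recommends: keep it out of the system until the filesystem has been remounted and the replace is done.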
end of thread, other threads:[~2020-02-15 4:29 UTC | newest]

Thread overview: 16+ messages
2020-02-05 10:18 How to Fix 'Error: could not find extent items for root 257'? Chiung-Ming Huang
2020-02-05 10:29 ` Qu Wenruo
2020-02-05 15:29 ` Chiung-Ming Huang
2020-02-05 19:38 ` Chris Murphy
2020-02-06 3:11 ` Chiung-Ming Huang
[not found] ` <CAEOGEKHf9F0VM=au-42MwD63_V8RwtqiskV0LsGpq-c=J_qyPg@mail.gmail.com>
[not found] ` <f2ad6b4f-b011-8954-77e1-5162c84f7c1f@gmx.com>
2020-02-06 4:13 ` Chiung-Ming Huang
2020-02-06 4:35 ` Qu Wenruo
2020-02-06 6:50 ` Chiung-Ming Huang
2020-02-07 3:49 ` Chiung-Ming Huang
2020-02-07 4:00 ` Qu Wenruo
2020-02-07 6:16 ` Chiung-Ming Huang
2020-02-07 7:16 ` Qu Wenruo
2020-02-10 6:50 ` Chiung-Ming Huang
2020-02-10 7:03 ` Qu Wenruo
2020-02-15 3:47 ` Chiung-Ming Huang
2020-02-15 4:29 ` Qu Wenruo