linux-btrfs.vger.kernel.org archive mirror
* How to Fix 'Error: could not find extent items for root 257'?
@ 2020-02-05 10:18 Chiung-Ming Huang
  2020-02-05 10:29 ` Qu Wenruo
  0 siblings, 1 reply; 16+ messages in thread
From: Chiung-Ming Huang @ 2020-02-05 10:18 UTC (permalink / raw)
  To: linux-btrfs

Hi everyone

It's a long story, so I'll try to keep it short. My btrfs RAID1
consists of 5 HDDs: 2x 10T, 1x 1T, 1x 2T and 1x 3T. They all sit on bcache
(one 1T SSD as cache) and LUKS. I tried to reorder the stack to `Luks -->
Bcache --> SSD --> HDD`, with only one layer of LUKS on bcache, but it
failed because of an accidental power-off. Please help me fix it.
Thanks.

1. OS: Ubuntu 18.04

2. $ uname -a
Linux rescue 5.3.0-26-generic #28~18.04.1-Ubuntu SMP Wed Dec 18
16:40:14 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

3. $ btrfs --version
btrfs-progs v5.4.1

4. $ btrfs fi show
Label: none  uuid: 0b79cf54-c424-40ed-adca-bd66b38ad57a
        Total devices 5 FS bytes used 496.00KiB
        devid    1 size 9.09TiB used 3.93TiB path /dev/bcache2
        devid    3 size 2.73TiB used 746.00GiB path /dev/bcache3
        devid    5 size 9.09TiB used 7.09TiB path /dev/bcache4
        devid    6 size 931.51GiB used 0.00B path /dev/mapper/disk-1t
        devid    7 size 1.82TiB used 0.00B path /dev/mapper/disk-2t

5. $ mount /dev/bcache4 /mnt
The second part of the messages appeared after about 10 seconds, and
the filesystem was remounted read-only.
------------------ dmesg part 1/2 ------------------

[Wed Feb  5 17:09:04 2020] BTRFS info (device bcache2): disk space
caching is enabled
[Wed Feb  5 17:09:04 2020] BTRFS info (device bcache2): has skinny
extents
[Wed Feb  5 17:09:04 2020] BTRFS info (device bcache2): bdev
/dev/bcache2 errs: wr 0, rd 0, flush 0, corrupt 266140, gen 8928
[Wed Feb  5 17:09:04 2020] BTRFS info (device bcache2): bdev
/dev/bcache3 errs: wr 0, rd 0, flush 0, corrupt 0, gen 3
[Wed Feb  5 17:09:04 2020] BTRFS info (device bcache2): enabling ssd
optimizations
[Wed Feb  5 17:09:04 2020] BTRFS info (device bcache2): checking UUID
tree
[Wed Feb  5 17:09:04 2020] BTRFS error (device bcache2): tree level
mismatch detected, bytenr=19499133206528 level expected=0 has=2
[Wed Feb  5 17:09:04 2020] BTRFS error (device bcache2): tree level
mismatch detected, bytenr=19499133206528 level expected=0 has=2
[Wed Feb  5 17:09:04 2020] BTRFS warning (device bcache2): iterating
uuid_tree failed -117
btrfs fi df /

------------------ dmesg part 2/2 ------------------

[Wed Feb  5 17:09:36 2020] BTRFS error (device bcache2): tree block
14963956514816 owner 3 already locked by pid=3187, extent tree
corruption detected
[Wed Feb  5 17:09:36 2020] ------------[ cut here ]------------
[Wed Feb  5 17:09:36 2020] BTRFS: Transaction aborted (error -117)
[Wed Feb  5 17:09:36 2020] WARNING: CPU: 4 PID: 3187 at
/build/linux-hwe-3HpQOB/linux-hwe-5.3.0/fs/btrfs/volumes.c:3025
btrfs_remove_chunk+0x76e/0x8a0 [btrfs]
[Wed Feb  5 17:09:36 2020] Modules linked in: cmac bnep nls_iso8859_1
snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic
ledtrig_audio snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep
edac_mce_amd kvm_amd snd_pcm ccp kvm snd_seq_midi snd_seq_midi_event
irqbypass k10temp snd_rawmidi btusb fam15h_power btrtl btbcm btintel joydev
nouveau snd_seq input_leds bluetooth snd_seq_device mxm_wmi snd_timer
video ecdh_generic snd ecc ttm i2c_algo_bit soundcore mac_hid
sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4
btrfs xor zstd_compress raid6_pq libcrc32c algif_skcipher af_alg
dm_crypt hid_logitech_hidpp bcache crc64 hid_logitech_dj hid_generic
usbhid hid uas usb_storage nvidia_drm(POE) nvidia_modeset(POE)
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
nvidia(POE) aes_x86_64 crypto_simd drm_kms_helper syscopyarea cryptd sysfillrect
glue_helper sysimgblt fb_sys_fops drm r8169 nvme realtek i2c_piix4
ahci ipmi_devintf nvme_core(E) libahci ipmi_msghandler wmi
[Wed Feb  5 17:09:36 2020] CPU: 4 PID: 3187 Comm: btrfs-cleaner
Tainted: P        W  OE     5.3.0-26-generic #28~18.04.1-Ubuntu
[Wed Feb  5 17:09:36 2020] Hardware name: MSI MS-7974/970A-G43 PLUS
(MS-7974), BIOS V1.1 07/04/2016
[Wed Feb  5 17:09:36 2020] RIP: 0010:btrfs_remove_chunk+0x76e/0x8a0 [btrfs]
[Wed Feb  5 17:09:36 2020] Code: 48 8b 50 50 f0 48 0f ba aa 40 ce 00
00 02 8b 45 a0 72 1c 83 f8 fb 0f 84 af 00 00 00 89 c6 48 c7 c7 f0 52
7d c1 e8 72 fa 73 eb <0f> 0b 8b 45 a0 48 8b 7d 90 89 c1 ba d1 0b 00 00
48 c7 c6 90 54 7c
[Wed Feb  5 17:09:36 2020] RSP: 0018:ffffa7a5035d3d98 EFLAGS: 00010282
[Wed Feb  5 17:09:36 2020] RAX: 0000000000000000 RBX: 0000000040000000
RCX: 0000000000000006
[Wed Feb  5 17:09:36 2020] RDX: 0000000000000007 RSI: 0000000000000092
RDI: ffff97cea7b17440
[Wed Feb  5 17:09:36 2020] RBP: ffffa7a5035d3e48 R08: 00000000000005d3
R09: 0000000000000004
[Wed Feb  5 17:09:36 2020] R10: 00000000ffffffff R11: 0000000000000001
R12: ffff97ce9b647c00
[Wed Feb  5 17:09:36 2020] R13: ffff97ce95c2e800 R14: ffff97ce9c1d03b8
R15: ffff97ce9253ec40
[Wed Feb  5 17:09:36 2020] FS:  0000000000000000(0000)
GS:ffff97cea7b00000(0000) knlGS:0000000000000000
[Wed Feb  5 17:09:36 2020] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Wed Feb  5 17:09:36 2020] CR2: 000055b322010290 CR3: 000000061d628000
CR4: 00000000000406e0
[Wed Feb  5 17:09:36 2020] Call Trace:
[Wed Feb  5 17:09:36 2020]  btrfs_delete_unused_bgs+0x36a/0x490 [btrfs]
[Wed Feb  5 17:09:36 2020]  cleaner_kthread+0xed/0x130 [btrfs]
[Wed Feb  5 17:09:36 2020]  kthread+0x121/0x140
[Wed Feb  5 17:09:36 2020]  ? __btrfs_btree_balance_dirty+0x60/0x60 [btrfs]
[Wed Feb  5 17:09:36 2020]  ? kthread_park+0xb0/0xb0
[Wed Feb  5 17:09:36 2020]  ret_from_fork+0x22/0x40
[Wed Feb  5 17:09:36 2020] ---[ end trace c34270cb20778d7d ]---
[Wed Feb  5 17:09:36 2020] BTRFS: error (device bcache2) in
btrfs_remove_chunk:3025: errno=-117 unknown
[Wed Feb  5 17:09:36 2020] BTRFS info (device bcache2): forced readonly
[Wed Feb  5 17:09:36 2020] ------------[ cut here ]------------
[Wed Feb  5 17:09:36 2020] WARNING: CPU: 4 PID: 3187 at
/build/linux-hwe-3HpQOB/linux-hwe-5.3.0/fs/btrfs/space-info.h:106
btrfs_space_info_update_bytes_may_use.part.10+0x14/0x21 [btrfs]
[Wed Feb  5 17:09:36 2020] Modules linked in: cmac bnep nls_iso8859_1
snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic
ledtrig_audio snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep
edac_mce_amd kvm_amd snd_pcm ccp kvm snd_seq_midi snd_seq_midi_event
irqbypass k10temp snd_rawmidi btusb fam15h_power btrtl btbcm btintel joydev
nouveau snd_seq input_leds bluetooth snd_seq_device mxm_wmi snd_timer
video ecdh_generic snd ecc ttm i2c_algo_bit soundcore mac_hid
sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4
btrfs xor zstd_compress raid6_pq libcrc32c algif_skcipher af_alg
dm_crypt hid_logitech_hidpp bcache crc64 hid_logitech_dj hid_generic
usbhid hid uas usb_storage nvidia_drm(POE) nvidia_modeset(POE)
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
nvidia(POE) aes_x86_64 crypto_simd drm_kms_helper syscopyarea cryptd sysfillrect
glue_helper sysimgblt fb_sys_fops drm r8169 nvme realtek i2c_piix4
ahci ipmi_devintf nvme_core(E) libahci ipmi_msghandler wmi
[Wed Feb  5 17:09:36 2020] CPU: 4 PID: 3187 Comm: btrfs-cleaner
Tainted: P        W  OE     5.3.0-26-generic #28~18.04.1-Ubuntu
[Wed Feb  5 17:09:36 2020] Hardware name: MSI MS-7974/970A-G43 PLUS
(MS-7974), BIOS V1.1 07/04/2016
[Wed Feb  5 17:09:36 2020] RIP:
0010:btrfs_space_info_update_bytes_may_use.part.10+0x14/0x21 [btrfs]
[Wed Feb  5 17:09:36 2020] Code: 74 05 e8 22 a5 6d eb 48 8d 65 d8 5b
41 5a 41 5c 41 5d 41 5e 5d c3 55 48 89 e5 53 48 89 fb 48 c7 c7 e8 a4
7d c1 e8 d2 84 74 eb <0f> 0b 48 c7 43 28 00 00 00 00 5b 5d c3 55 48 89
e5 53 48 89 fb 48
[Wed Feb  5 17:09:36 2020] RSP: 0018:ffffa7a5035d3ca8 EFLAGS: 00010286
[Wed Feb  5 17:09:36 2020] RAX: 0000000000000024 RBX: ffff97ce96ed5800
RCX: 0000000000000006
[Wed Feb  5 17:09:36 2020] RDX: 0000000000000000 RSI: 0000000000000092
RDI: ffff97cea7b17440
[Wed Feb  5 17:09:36 2020] RBP: ffffa7a5035d3cb0 R08: 00000000000005ed
R09: 0000000000000004
[Wed Feb  5 17:09:36 2020] R10: 0000000000000002 R11: 0000000000000001
R12: ffff97ce96ed5800
[Wed Feb  5 17:09:36 2020] R13: 0000000000080000 R14: 000000000007c000
R15: ffff97ce9c1d0000
[Wed Feb  5 17:09:36 2020] FS:  0000000000000000(0000)
GS:ffff97cea7b00000(0000) knlGS:0000000000000000
[Wed Feb  5 17:09:36 2020] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Wed Feb  5 17:09:36 2020] CR2: 000055b322010290 CR3: 000000061d628000
CR4: 00000000000406e0
[Wed Feb  5 17:09:36 2020] Call Trace:
[Wed Feb  5 17:09:36 2020]  btrfs_space_info_add_old_bytes+0x261/0x280 [btrfs]
[Wed Feb  5 17:09:36 2020]  __btrfs_block_rsv_release+0x16e/0x1a0 [btrfs]
[Wed Feb  5 17:09:36 2020]  btrfs_trans_release_chunk_metadata+0x35/0x50 [btrfs]
[Wed Feb  5 17:09:36 2020]
btrfs_create_pending_block_groups+0x13d/0x240 [btrfs]
[Wed Feb  5 17:09:36 2020]  __btrfs_end_transaction+0x6e/0x1e0 [btrfs]
[Wed Feb  5 17:09:36 2020]  btrfs_end_transaction+0x10/0x20 [btrfs]
[Wed Feb  5 17:09:36 2020]  btrfs_delete_unused_bgs+0x28b/0x490 [btrfs]
[Wed Feb  5 17:09:36 2020]  cleaner_kthread+0xed/0x130 [btrfs]
[Wed Feb  5 17:09:36 2020]  kthread+0x121/0x140
[Wed Feb  5 17:09:36 2020]  ? __btrfs_btree_balance_dirty+0x60/0x60 [btrfs]
[Wed Feb  5 17:09:36 2020]  ? kthread_park+0xb0/0xb0
[Wed Feb  5 17:09:36 2020]  ret_from_fork+0x22/0x40
[Wed Feb  5 17:09:36 2020] ---[ end trace c34270cb20778d7e ]---

6. $ btrfs fi df /mnt
Data, RAID1: total=4.21TiB, used=0.00B
Data, single: total=3.30TiB, used=0.00B
System, RAID1: total=32.00MiB, used=0.00B
Metadata, RAID1: total=12.00GiB, used=496.00KiB
Metadata, single: total=8.00GiB, used=0.00B
GlobalReserve, single: total=512.00MiB, used=0.00B

7. $ btrfs check -p /dev/bcache4
Opening filesystem to check...
Checking filesystem on /dev/bcache4
UUID: 0b79cf54-c424-40ed-adca-bd66b38ad57a
Error: could not find extent items for root 257(0:00:00 elapsed, 1199
items checked)
[1/7] checking root items                      (0:00:00 elapsed, 7748
items checked)
ERROR: failed to repair root items: No such file or directory

8. $ btrfs scrub start -B -R /mnt
The scrub was aborted because the file system was forcibly remounted read-only.

9. $ lsblk -o NAME,SIZE,TYPE,FSTYPE
sda                     931.5G disk  bcache
└─bcache0               931.5G disk  crypto_LUKS
  └─disk-1t             931.5G crypt btrfs
sdb                     232.9G disk
└─sdb6                     10G part  crypto_LUKS
  └─rescue                 10G crypt btrfs
sdc                       2.7T disk  crypto_LUKS
└─disk-3t                 2.7T crypt bcache
  └─bcache3               2.7T disk  btrfs
sdd                       9.1T disk  crypto_LUKS
└─disk-10t                9.1T crypt bcache
  └─bcache2               9.1T disk  btrfs
sde                       1.8T disk  bcache
└─bcache1                 1.8T disk  crypto_LUKS
  └─disk-2t               1.8T crypt btrfs
sdf                       9.1T disk  crypto_LUKS
└─disk-10t                9.1T crypt bcache
  └─bcache4               9.1T disk  btrfs
nvme0n1                 953.9G disk
└─nvme0n1p1               636G part  crypto_LUKS
  └─cache                 636G crypt bcache
    ├─bcache0           931.5G disk  crypto_LUKS
    │ └─disk-1t         931.5G crypt btrfs
    ├─bcache1             1.8T disk  crypto_LUKS
    │ └─disk-2t           1.8T crypt btrfs
    ├─bcache2             9.1T disk  btrfs
    ├─bcache3             2.7T disk  btrfs
    └─bcache4             9.1T disk  btrfs


Regards,
Chiung-Ming Huang
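
For reference, the `-117` that appears in the kernel messages above
(`iterating uuid_tree failed -117`, `Transaction aborted (error -117)`,
`errno=-117 unknown`) is a negated Linux errno. On Linux, errno 117 is
EUCLEAN ("Structure needs cleaning"), which btrfs returns when it detects
on-disk corruption. The lookup can be sketched in Python as below. The table
LINUX_ERRNO and the helper decode_kernel_errno are hypothetical names made up
for this illustration, and the numbers assume Linux errno numbering.

```python
# Linux errno values that show up in this thread's logs.
# Assumption: Linux numbering; other systems may differ or lack EUCLEAN.
LINUX_ERRNO = {
    2: ("ENOENT", "No such file or directory"),
    117: ("EUCLEAN", "Structure needs cleaning"),
}


def decode_kernel_errno(code):
    """Map a (usually negative) kernel return code to (name, description)."""
    number = abs(code)
    return LINUX_ERRNO.get(number, ("UNKNOWN", f"errno {number} not in table"))


if __name__ == "__main__":
    # -117 comes from the dmesg output, -2 matches the btrfs check error text.
    for raw in (-117, -2):
        name, text = decode_kernel_errno(raw)
        print(f"{raw}: {name} ({text})")
```

The table is deliberately limited to the two codes seen in this report;
anything else is reported as unknown rather than guessed.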


* Re: How to Fix 'Error: could not find extent items for root 257'?
  2020-02-05 10:18 How to Fix 'Error: could not find extent items for root 257'? Chiung-Ming Huang
@ 2020-02-05 10:29 ` Qu Wenruo
  2020-02-05 15:29   ` Chiung-Ming Huang
       [not found]   ` <CAEOGEKHf9F0VM=au-42MwD63_V8RwtqiskV0LsGpq-c=J_qyPg@mail.gmail.com>
  0 siblings, 2 replies; 16+ messages in thread
From: Qu Wenruo @ 2020-02-05 10:29 UTC (permalink / raw)
  To: Chiung-Ming Huang, linux-btrfs


On 2020/2/5 6:18 PM, Chiung-Ming Huang wrote:
> Hi everyone
> 
> It's a long story, so I'll try to keep it short. My btrfs RAID1
> consists of 5 HDDs: 2x 10T, 1x 1T, 1x 2T and 1x 3T. They all sit on bcache
> (one 1T SSD as cache) and LUKS. I tried to reorder the stack to `Luks -->
> Bcache --> SSD --> HDD`, with only one layer of LUKS on bcache, but it
> failed because of an accidental power-off. Please help me fix it.
> Thanks.
> 
> 1. OS: Ubuntu 18.04
> 
> 2. $ uname -a
> Linux rescue 5.3.0-26-generic #28~18.04.1-Ubuntu SMP Wed Dec 18
> 16:40:14 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
...
> [Wed Feb  5 17:09:04 2020] BTRFS error (device bcache2): tree level
> mismatch detected, bytenr=19499133206528 level expected=0 has=2
> [Wed Feb  5 17:09:04 2020] BTRFS error (device bcache2): tree level
> mismatch detected, bytenr=19499133206528 level expected=0 has=2
> [Wed Feb  5 17:09:04 2020] BTRFS warning (device bcache2): iterating
> uuid_tree failed -117
> btrfs fi df /
> 
> ------------------ dmesg part 2/2 ------------------
> 
> [Wed Feb  5 17:09:36 2020] BTRFS error (device bcache2): tree block
> 14963956514816 owner 3 already locked by pid=3187, extent tree
> corruption detected

This shows the problem. Your extent tree is corrupted.

I don't believe the lower storage stack is involved.

Full history of the fs please (from mkfs to the current stage).


...
> 
> 7. $ btrfs check -p /dev/bcache4
> Opening filesystem to check...
> Checking filesystem on /dev/bcache4
> UUID: 0b79cf54-c424-40ed-adca-bd66b38ad57a
> Error: could not find extent items for root 257(0:00:00 elapsed, 1199
> items checked)
> [1/7] checking root items                      (0:00:00 elapsed, 7748
> items checked)
> ERROR: failed to repair root items: No such file or directory

Have you tried btrfs check --repair and then mounting?
Is the dmesg you posted from the first time you hit this, not something after
btrfs check --repair?

And `btrfs check` without --repair please, that's the most important
info to evaluate how to fix it (if possible).

Thanks,
Qu

> 
> 8. $ btrfs scrub start -B -R /mnt
> The scrub was aborted because the file system was forcibly remounted read-only.
> 
> 9. $ lsblk -o NAME,SIZE,TYPE,FSTYPE
> sda                     931.5G disk  bcache
> └─bcache0               931.5G disk  crypto_LUKS
>   └─disk-1t                931.5G crypt btrfs
> sdb                     232.9G disk
> └─sdb6                     10G part  crypto_LUKS
>   └─rescue                 10G crypt btrfs
> sdc                       2.7T disk  crypto_LUKS
> └─disk-3t             2.7T crypt bcache
>   └─bcache3               2.7T disk  btrfs
> sdd                       9.1T disk  crypto_LUKS
> └─disk-10t           9.1T crypt bcache
>   └─bcache2               9.1T disk  btrfs
> sde                       1.8T disk  bcache
> └─bcache1                 1.8T disk  crypto_LUKS
>   └─disk-2t          1.8T crypt btrfs
> sdf                       9.1T disk  crypto_LUKS
> └─disk-10t          9.1T crypt bcache
>   └─bcache4               9.1T disk  btrfs
> nvme0n1                 953.9G disk
> └─nvme0n1p1               636G part  crypto_LUKS
>   └─cache                 636G crypt bcache
>     ├─bcache0           931.5G disk  crypto_LUKS
>     │ └─disk-1t           931.5G crypt btrfs
>     ├─bcache1             1.8T disk  crypto_LUKS
>     │ └─disk-2t          1.8T crypt btrfs
>     ├─bcache2             9.1T disk  btrfs
>     ├─bcache3             2.7T disk  btrfs
>     └─bcache4             9.1T disk  btrfs
> 
> 
> Regards,
> Chiung-Ming Huang
> 




* Re: How to Fix 'Error: could not find extent items for root 257'?
  2020-02-05 10:29 ` Qu Wenruo
@ 2020-02-05 15:29   ` Chiung-Ming Huang
  2020-02-05 19:38     ` Chris Murphy
       [not found]   ` <CAEOGEKHf9F0VM=au-42MwD63_V8RwtqiskV0LsGpq-c=J_qyPg@mail.gmail.com>
  1 sibling, 1 reply; 16+ messages in thread
From: Chiung-Ming Huang @ 2020-02-05 15:29 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

Hi Qu Wenruo

Thanks for your reply and help.

Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Wed, Feb 5, 2020 at 6:29 PM:

server ~$ Decrypted by /etc/crypttab
server ~$ mkfs.btrfs -m raid1 -d raid1 /dev/bcache0 /dev/bcache1
/dev/bcache2 /dev/bcache3 /dev/bcache4
server ~$ mount -o subvol=@defaults,degraded,nossd,subvol /dev/bcache4
/ (By /etc/fstab)
server ~$ reboot
server ~$ btrfs balance start /
server ~$ btrfs fi usage /
^^^^^^^^^^^^^ Only /dev/sdc (3T), /dev/sdd (10T) and /dev/sdf (10T) have
data. The first two don't have any data and the last one has a mirror
copy of RAID1.
server ~$ Removed /dev/sda (1T), /dev/sde (2T), /dev/sdd (10T) from
/etc/crypttab
server ~$ reboot
server ~$ btrfs fi show
^^^^^^^^^^^^ /dev/sda (1T), /dev/sde (2T), /dev/sdd (10T) are marked `missing`
server ~$ btrfs balance start -f -sconvert=single -mconvert=single
-dconvert=single /
server ~$ btrfs balance cancel /
server ~$ reboot
server ~$ Put luks on bcache and mkfs.btrfs /dev/sda (1T)
server ~$ Put luks on bcache and mkfs.btrfs /dev/sde (2T)
server ~$ Forgot to do it on /dev/sdd (10T)
server ~$ btrfs device remove missing
^^^^^^^^^^^^ Ran for about 5 seconds. (Because /dev/sda (1T) is empty?)
server ~$ btrfs device remove missing
^^^^^^^^^^^^ Ran for about 5 seconds. (Because /dev/sde (2T) is empty?)
server ~$ btrfs device remove missing
^^^^^^^^^^^^ Ran for at least 12 hours before the accidental power-off.

[Change to the independent rescue OS.]
rescue ~$ Added /dev/sda (1T), /dev/sde (2T), /dev/sdd (10T) back to
/etc/crypttab. /dev/sdd (10T) still holds the RAID1 mirror copy from
before it was removed from /etc/crypttab.
rescue ~$ reboot
rescue ~$ btrfs check -p --repair /dev/bcache4
^^^^^^^^^^^ failed
rescue ~$ mount /dev/bcache4 /mnt
rescue ~$ btrfs check --repair /dev/bcache4
^^^^^^^^^^ [4/7] ...
Errors ....... fs root
rescue ~$ btrfs scrub start -B /mnt
^^^^^^^^^^ Showed a lot of errors and I couldn't use ctrl+alt+3, so I rebooted.
rescue ~$ btrfs check --repair /dev/bcache4
^^^^^^^^^^ [1/7] checking root items
Error: could not find extent items for root 257
ERROR: failed to repair root items: No such file or directory



> ...
> >
> > 7. $ btrfs check -p /dev/bcache4
> > Opening filesystem to check...
> > Checking filesystem on /dev/bcache4
> > UUID: 0b79cf54-c424-40ed-adca-bd66b38ad57a
> > Error: could not find extent items for root 257(0:00:00 elapsed, 1199
> > items checked)
> > [1/7] checking root items                      (0:00:00 elapsed, 7748
> > items checked)
> > ERROR: failed to repair root items: No such file or directory
>
> Have you tried btrfs check --repair and then mounting?

Yes.

> Is the dmesg you posted from the first time you hit this, not something after

I kept kern.log, but it's about 17M, so I cannot post it here. It also
doesn't show which `btrfs` commands were run in context.
There are a lot of `BTRFS critical` and `BTRFS error` messages in it, but
the `BTRFS critical` ones repeat.

Feb  3 15:38:24 rescue kernel: [ 8731.172674] BTRFS critical (device
bcache2): corrupt leaf: root=23146 block=19498503094272 slot=8,
invalid key objectid: has 18446744073709551606 expect 6 or [256,
18446744073709551360] or 18446744073709551604
Feb  3 15:38:24 rescue kernel: [ 8731.172860] BTRFS critical (device
bcache2): corrupt leaf: root=23146 block=19498503094272 slot=8,
invalid key objectid: has 18446744073709551606 expect 6 or [256,
18446744073709551360] or 18446744073709551604
Feb  3 20:19:42 rescue kernel: [25609.592216] BTRFS critical (device
bcache2): corrupt leaf: root=23146 block=19498503094272 slot=8,
invalid key objectid: has 18446744073709551606 expect 6 or [256,
18446744073709551360] or 18446744073709551604
Feb  3 20:19:42 rescue kernel: [25609.592511] BTRFS critical (device
bcache2): corrupt leaf: root=23146 block=19498503094272 slot=8,
invalid key objectid: has 18446744073709551606 expect 6 or [256,
18446744073709551360] or 18446744073709551604
Feb  5 17:05:58 rescue kernel: [ 3601.738469] BTRFS critical (device
bcache2): unable to find logical 7157918187520 length 4096
Feb  5 17:05:58 rescue kernel: [ 3601.738474] BTRFS critical (device
bcache2): unable to find logical 7157918187520 length 4096
Feb  5 17:05:58 rescue kernel: [ 3601.738481] BTRFS critical (device
bcache2): unable to find logical 7157918187520 length 16384
Feb  5 17:05:58 rescue kernel: [ 3601.738531] BTRFS critical (device
bcache2): unable to find logical 7157918187520 length 4096
Feb  5 17:05:58 rescue kernel: [ 3601.738533] BTRFS critical (device
bcache2): unable to find logical 7157918187520 length 4096
Feb  5 17:05:58 rescue kernel: [ 3601.738539] BTRFS critical (device
bcache2): unable to find logical 7157918187520 length 16384
.... (these three lines, with lengths 4096, 4096 and 16384, keep repeating)

> btrfs check --repair?
>
> And `btrfs check` without --repair please, that's the most important
> info to evaluate how to fix it (if possible).

rescue ~$ btrfs check /dev/bcache4
Opening filesystem to check...
Checking filesystem on /dev/bcache4
UUID: 0b79cf54-c424-40ed-adca-bd66b38ad57a
[1/7] checking root items
Error: could not find extent items for root 257
ERROR: failed to repair root items: No such file or directory

rescue ~$ btrfs check --repair /dev/bcache4
enabling repair mode
WARNING:

        Do not use --repair unless you are advised to do so by a developer
        or an experienced user, and then only after having accepted that no
        fsck can successfully repair all types of filesystem corruption. Eg.
        some software or hardware bugs can fatally damage a volume.
        The operation will start in 10 seconds.
        Use Ctrl-C to stop it.
10 9 8 7 6 5 4 3 2 1
Starting repair.
Opening filesystem to check...
Checking filesystem on /dev/bcache4
UUID: 0b79cf54-c424-40ed-adca-bd66b38ad57a
[1/7] checking root items
Error: could not find extent items for root 257
ERROR: failed to repair root items: No such file or directory
rescue ~$

> Thanks,
> Qu

Regards,
Chiung-Ming Huang
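
Since the full kern.log is too large to post and mostly repeats the same
messages, one way to shrink it is to ignore the timestamps and count
identical BTRFS lines. The following is only a rough Python sketch. It
assumes the syslog layout shown above (`Mon DD HH:MM:SS host kernel:
[uptime] message`) with each message on a single, unwrapped line, and the
function name summarize_btrfs_messages is made up for this illustration.

```python
import re
from collections import Counter

# Captures the message part of lines such as
# "Feb  5 17:05:58 rescue kernel: [ 3601.738469] BTRFS critical (device bcache2): ..."
LINE_RE = re.compile(r"kernel: \[\s*\d+\.\d+\]\s+(BTRFS.*)$")


def summarize_btrfs_messages(lines):
    """Return (message, count) pairs in first-seen order, ignoring timestamps."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line.rstrip("\n"))
        if match:
            counts[match.group(1)] += 1
    # Counter keeps insertion order, so the first occurrence decides the position.
    return list(counts.items())


if __name__ == "__main__":
    sample = [
        "Feb  5 17:05:58 rescue kernel: [ 3601.738469] BTRFS critical (device bcache2): unable to find logical 7157918187520 length 4096",
        "Feb  5 17:05:58 rescue kernel: [ 3601.738474] BTRFS critical (device bcache2): unable to find logical 7157918187520 length 4096",
        "Feb  5 17:05:58 rescue kernel: [ 3601.738481] BTRFS critical (device bcache2): unable to find logical 7157918187520 length 16384",
    ]
    for message, count in summarize_btrfs_messages(sample):
        print(f"{count:6d}  {message}")
```

Keeping first-seen order matters here because the earliest error is usually
the most informative one; only the repeats are collapsed.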


* Re: How to Fix 'Error: could not find extent items for root 257'?
  2020-02-05 15:29   ` Chiung-Ming Huang
@ 2020-02-05 19:38     ` Chris Murphy
  2020-02-06  3:11       ` Chiung-Ming Huang
  0 siblings, 1 reply; 16+ messages in thread
From: Chris Murphy @ 2020-02-05 19:38 UTC (permalink / raw)
  To: Chiung-Ming Huang; +Cc: Qu Wenruo, Btrfs BTRFS

On Wed, Feb 5, 2020 at 8:29 AM Chiung-Ming Huang <photon3108@gmail.com> wrote:
>
> server ~$ mount -o subvol=@defaults,degraded,nossd,subvol /dev/bcache4

Is this file system always mounted with the degraded mount option?
It's in /etc/fstab?


-- 
Chris Murphy


* Re: How to Fix 'Error: could not find extent items for root 257'?
  2020-02-05 19:38     ` Chris Murphy
@ 2020-02-06  3:11       ` Chiung-Ming Huang
  0 siblings, 0 replies; 16+ messages in thread
From: Chiung-Ming Huang @ 2020-02-06  3:11 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Qu Wenruo, Btrfs BTRFS

Chris Murphy <lists@colorremedies.com> wrote on Thu, Feb 6, 2020 at 3:38 AM:
>
> On Wed, Feb 5, 2020 at 8:29 AM Chiung-Ming Huang <photon3108@gmail.com> wrote:
> >
> > server ~$ mount -o subvol=@defaults,degraded,nossd,subvol /dev/bcache4
>
> Is this file system always mounted with the degraded mount option?
> It's in /etc/fstab?

Yes, because my btrfs raid1 includes the root directory, '/'. I assumed
it would let the system boot successfully even if the raid1 lost some
disks.

Is it a bad idea? Or does it cause any performance issues?

>
> --
> Chris Murphy


Regards,
Chiung-Ming Huang


* Re: How to Fix 'Error: could not find extent items for root 257'?
       [not found]     ` <f2ad6b4f-b011-8954-77e1-5162c84f7c1f@gmx.com>
@ 2020-02-06  4:13       ` Chiung-Ming Huang
  2020-02-06  4:35         ` Qu Wenruo
  0 siblings, 1 reply; 16+ messages in thread
From: Chiung-Ming Huang @ 2020-02-06  4:13 UTC (permalink / raw)
  To: Qu Wenruo, Btrfs

Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Thu, Feb 6, 2020 at 9:13 AM:
> Please keep in mind that, if you post dmesg, the first time such error
> happens is the most important.
> Not something after you modified the fs by btrfs check --repair.

Thanks for your advice. I'll keep it in mind. :)


> >
> > Feb  3 15:38:24 rescue kernel: [ 8731.172674] BTRFS critical (device
> > bcache2): corrupt leaf: root=23146 block=19498503094272 slot=8,
> > invalid key objectid: has 18446744073709551606 expect 6 or [256,
> > 18446744073709551360] or 18446744073709551604
>
> This message is even earlier than your initial report, and it's more
> important.
> This means you have a bad inode item with objectid EXTENT_CSUM_OBJECTID.
>
> This is a bigger problem.

That sounds bad. Is it possible to save the data, or at least part of it?
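
The `has 18446744073709551606` value in the quoted message is easier to read
as a signed 64-bit number. The kernel writes several special objectids as
negative values cast to unsigned 64-bit, and -10 is
BTRFS_EXTENT_CSUM_OBJECTID, which matches the reading above. A small Python
sketch of the conversion follows. The helper as_signed64 and the table
SPECIAL_OBJECTIDS are hypothetical names for this illustration, and the
constant names are quoted from the kernel's btrfs tree definitions as
far as I know them.

```python
U64 = 1 << 64


def as_signed64(value):
    """Reinterpret an unsigned 64-bit integer as a signed one."""
    return value - U64 if value >= (1 << 63) else value


# Special objectids that the kernel defines as negative numbers cast to u64.
SPECIAL_OBJECTIDS = {
    -10: "BTRFS_EXTENT_CSUM_OBJECTID",
    -12: "BTRFS_FREE_INO_OBJECTID",
    -256: "BTRFS_LAST_FREE_OBJECTID",
}

if __name__ == "__main__":
    # The three large numbers from the "corrupt leaf" message.
    for raw in (18446744073709551606, 18446744073709551604, 18446744073709551360):
        signed = as_signed64(raw)
        print(raw, "->", signed, SPECIAL_OBJECTIDS.get(signed, "?"))
```

Read this way, the message says the leaf holds a key with objectid -10 where
only 6, the range [256, -256], or -12 would be accepted.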


> Are you sure that is the very first error message you hit?

My .bash_history doesn't record timestamps, so I'm not really sure which
critical/error message came right after the first `btrfs check --repair`.
To make the log file smaller, I excerpted only the btrfs messages up to the
first critical message and put them in the attachment. I'm not very familiar
with mailing lists. Can you see `btrfs_.log`?


> > rescue ~$ btrfs check /dev/bcache4
> > Opening filesystem to check...
> > Checking filesystem on /dev/bcache4
> > UUID: 0b79cf54-c424-40ed-adca-bd66b38ad57a
> > [1/7] checking root items
> > Error: could not find extent items for root 257
> > ERROR: failed to repair root items: No such file or directory
>
> This part is from a special repair for a regression in 3.17.
>
> I guess we should not enable it by default.
> That will be another patch for btrfs-progs.

Is this patch safe for saving my btrfs? If it is, I can build btrfs-progs.


Regards,
Chiung-Ming Huang

[-- Attachment #2: btrfs_.log --]
[-- Type: text/x-log, Size: 25635 bytes --]

Jan 28 18:19:26 rescue kernel: [   23.014317] Btrfs loaded, crc32c=crc32c-intel
Jan 28 18:19:26 rescue kernel: [   23.126873] BTRFS: device fsid 0b79cf54-c424-40ed-adca-bd66b38ad57a devid 5 transid 1401252 /dev/bcache4
Jan 28 18:19:26 rescue kernel: [   23.126984] BTRFS: device fsid 0b79cf54-c424-40ed-adca-bd66b38ad57a devid 4 transid 1401252 /dev/bcache3
Jan 28 18:19:26 rescue kernel: [   23.127080] BTRFS: device fsid 0b79cf54-c424-40ed-adca-bd66b38ad57a devid 3 transid 1401252 /dev/bcache2
Jan 28 18:19:26 rescue kernel: [   23.127181] BTRFS: device fsid 0b79cf54-c424-40ed-adca-bd66b38ad57a devid 2 transid 1401252 /dev/bcache1
Jan 28 18:19:26 rescue kernel: [   23.127343] BTRFS: device fsid 0b79cf54-c424-40ed-adca-bd66b38ad57a devid 1 transid 1401252 /dev/bcache0
Jan 28 18:19:26 rescue kernel: [   23.127480] BTRFS: device fsid 01220871-cd98-45c9-8aac-070c4dd97a4f devid 1 transid 344 /dev/mapper/boot
Jan 28 18:19:26 rescue kernel: [   23.127652] BTRFS: device fsid ffd4ee02-4df2-4202-bf3a-169af7a47b18 devid 1 transid 740 /dev/mapper/rescue
Jan 28 18:19:26 rescue kernel: [   23.149243] BTRFS info (device dm-0): disk space caching is enabled
Jan 28 18:19:26 rescue kernel: [   23.149246] BTRFS info (device dm-0): has skinny extents
Jan 28 18:19:26 rescue kernel: [   23.155698] BTRFS info (device dm-0): enabling ssd optimizations
Jan 28 18:19:26 rescue kernel: [   23.547284] BTRFS info (device dm-0): disk space caching is enabled
Jan 28 18:19:26 rescue kernel: [   23.560353] BTRFS warning (device dm-0): swapfile must not be copy-on-write
Jan 28 18:19:26 rescue kernel: [   24.118209] BTRFS info (device dm-0): device fsid ffd4ee02-4df2-4202-bf3a-169af7a47b18 devid 1 moved old:/dev/mapper/rescue new:/dev/dm-0
Jan 28 18:19:26 rescue kernel: [   24.118663] BTRFS info (device dm-0): device fsid ffd4ee02-4df2-4202-bf3a-169af7a47b18 devid 1 moved old:/dev/dm-0 new:/dev/mapper/rescue
Jan 28 18:19:26 rescue kernel: [   24.239166] BTRFS warning (device dm-0): swapfile must not be copy-on-write
Jan 28 18:19:33 rescue kernel: [   32.341375] BTRFS info (device dm-0): device fsid ffd4ee02-4df2-4202-bf3a-169af7a47b18 devid 1 moved old:/dev/mapper/rescue new:/dev/dm-0
Jan 28 18:19:33 rescue kernel: [   32.346855] BTRFS info (device dm-0): device fsid ffd4ee02-4df2-4202-bf3a-169af7a47b18 devid 1 moved old:/dev/dm-0 new:/dev/mapper/rescue
Jan 28 18:24:19 rescue kernel: [  318.031325] BTRFS info (device dm-1): disk space caching is enabled
Jan 28 18:24:19 rescue kernel: [  318.031327] BTRFS info (device dm-1): has skinny extents
Jan 28 18:24:19 rescue kernel: [  318.042297] BTRFS info (device dm-1): enabling ssd optimizations
Feb  3 10:55:12 rescue kernel: [   19.205264] Btrfs loaded, crc32c=crc32c-intel
Feb  3 10:55:12 rescue kernel: [   19.324218] BTRFS: device fsid 0b79cf54-c424-40ed-adca-bd66b38ad57a devid 1 transid 1415180 /dev/bcache2
Feb  3 10:55:12 rescue kernel: [   19.324355] BTRFS: device fsid 01220871-cd98-45c9-8aac-070c4dd97a4f devid 1 transid 399 /dev/mapper/boot
Feb  3 10:55:12 rescue kernel: [   19.324492] BTRFS: device fsid ffd4ee02-4df2-4202-bf3a-169af7a47b18 devid 1 transid 791 /dev/mapper/rescue
Feb  3 10:55:12 rescue kernel: [   19.347030] BTRFS info (device dm-0): disk space caching is enabled
Feb  3 10:55:12 rescue kernel: [   19.347031] BTRFS info (device dm-0): has skinny extents
Feb  3 10:55:12 rescue kernel: [   19.353614] BTRFS info (device dm-0): enabling ssd optimizations
Feb  3 10:55:12 rescue kernel: [   19.751089] BTRFS info (device dm-0): disk space caching is enabled
Feb  3 10:55:12 rescue kernel: [   19.763127] BTRFS warning (device dm-0): swapfile must not be copy-on-write
Feb  3 10:55:12 rescue kernel: [   20.542239] BTRFS warning (device dm-0): swapfile must not be copy-on-write
Feb  3 10:55:12 rescue kernel: [   20.709261] BTRFS info (device dm-0): device fsid ffd4ee02-4df2-4202-bf3a-169af7a47b18 devid 1 moved old:/dev/mapper/rescue new:/dev/dm-0
Feb  3 10:55:12 rescue kernel: [   20.709619] BTRFS info (device dm-0): device fsid ffd4ee02-4df2-4202-bf3a-169af7a47b18 devid 1 moved old:/dev/dm-0 new:/dev/mapper/rescue
Feb  3 10:56:29 rescue kernel: [   98.214952] BTRFS info (device dm-1): disk space caching is enabled
Feb  3 10:56:29 rescue kernel: [   98.214956] BTRFS info (device dm-1): has skinny extents
Feb  3 10:56:29 rescue kernel: [   98.247149] BTRFS info (device dm-1): enabling ssd optimizations
Feb  3 10:59:15 rescue kernel: [  263.602762] BTRFS info (device bcache2): disk space caching is enabled
Feb  3 10:59:15 rescue kernel: [  263.602764] BTRFS info (device bcache2): has skinny extents
Feb  3 10:59:15 rescue kernel: [  263.603619] BTRFS error (device bcache2): devid 5 uuid d442b477-0233-4a4a-aa71-cb24343b83ee is missing
Feb  3 10:59:15 rescue kernel: [  263.603624] BTRFS error (device bcache2): failed to read the system array: -2
Feb  3 10:59:15 rescue kernel: [  263.645634] BTRFS error (device bcache2): open_ctree failed
Feb  3 10:59:22 rescue kernel: [  271.229983] BTRFS info (device bcache2): disk space caching is enabled
Feb  3 10:59:22 rescue kernel: [  271.229986] BTRFS info (device bcache2): has skinny extents
Feb  3 10:59:22 rescue kernel: [  271.230996] BTRFS error (device bcache2): devid 5 uuid d442b477-0233-4a4a-aa71-cb24343b83ee is missing
Feb  3 10:59:22 rescue kernel: [  271.231005] BTRFS error (device bcache2): failed to read the system array: -2
Feb  3 10:59:23 rescue kernel: [  271.301168] BTRFS error (device bcache2): open_ctree failed
Feb  3 11:00:23 rescue kernel: [  332.126013] BTRFS info (device bcache2): allowing degraded mounts
Feb  3 11:00:23 rescue kernel: [  332.126017] BTRFS info (device bcache2): disk space caching is enabled
Feb  3 11:00:23 rescue kernel: [  332.126019] BTRFS info (device bcache2): has skinny extents
Feb  3 11:00:23 rescue kernel: [  332.127151] BTRFS warning (device bcache2): devid 5 uuid d442b477-0233-4a4a-aa71-cb24343b83ee is missing
Feb  3 11:00:23 rescue kernel: [  332.142841] BTRFS warning (device bcache2): devid 2 uuid 7c87f647-874d-49aa-83d3-5f4cc126a627 is missing
Feb  3 11:00:23 rescue kernel: [  332.142845] BTRFS warning (device bcache2): devid 3 uuid f9b7fe84-d95b-4db5-9e2b-c34a2d4186e9 is missing
Feb  3 11:00:23 rescue kernel: [  332.142847] BTRFS warning (device bcache2): devid 4 uuid 37706d9f-0672-4dc7-b987-5cc83c6beb48 is missing
Feb  3 11:00:23 rescue kernel: [  332.142849] BTRFS warning (device bcache2): devid 5 uuid d442b477-0233-4a4a-aa71-cb24343b83ee is missing
Feb  3 11:00:23 rescue kernel: [  332.182460] BTRFS warning (device bcache2): failed to read tree root
Feb  3 11:00:24 rescue kernel: [  332.336612] BTRFS error (device bcache2): open_ctree failed
Feb  3 11:00:50 rescue kernel: [  359.095221] BTRFS info (device bcache2): allowing degraded mounts
Feb  3 11:00:50 rescue kernel: [  359.095223] BTRFS info (device bcache2): disk space caching is enabled
Feb  3 11:00:50 rescue kernel: [  359.095224] BTRFS info (device bcache2): has skinny extents
Feb  3 11:00:50 rescue kernel: [  359.096523] BTRFS warning (device bcache2): devid 2 uuid 7c87f647-874d-49aa-83d3-5f4cc126a627 is missing
Feb  3 11:00:50 rescue kernel: [  359.096526] BTRFS warning (device bcache2): devid 3 uuid f9b7fe84-d95b-4db5-9e2b-c34a2d4186e9 is missing
Feb  3 11:00:50 rescue kernel: [  359.096527] BTRFS warning (device bcache2): devid 4 uuid 37706d9f-0672-4dc7-b987-5cc83c6beb48 is missing
Feb  3 11:00:50 rescue kernel: [  359.096529] BTRFS warning (device bcache2): devid 5 uuid d442b477-0233-4a4a-aa71-cb24343b83ee is missing
Feb  3 11:00:50 rescue kernel: [  359.123421] BTRFS warning (device bcache2): failed to read tree root
Feb  3 11:00:51 rescue kernel: [  359.265056] BTRFS error (device bcache2): open_ctree failed
Feb  3 11:10:16 rescue kernel: [  924.586408] BTRFS: device fsid 5330041e-7c11-4090-a99b-e86f32ee2136 devid 1 transid 5 /dev/mapper/tmp
Feb  3 11:11:13 rescue kernel: [  981.791708] BTRFS info (device dm-5): disk space caching is enabled
Feb  3 11:11:13 rescue kernel: [  981.791710] BTRFS info (device dm-5): has skinny extents
Feb  3 11:11:13 rescue kernel: [  981.791711] BTRFS info (device dm-5): flagging fs with big metadata feature
Feb  3 11:11:13 rescue kernel: [  981.794318] BTRFS info (device dm-5): enabling ssd optimizations
Feb  3 11:11:13 rescue kernel: [  981.794938] BTRFS info (device dm-5): creating UUID tree
Feb  3 13:14:43 rescue kernel: [   18.721825] Btrfs loaded, crc32c=crc32c-intel
Feb  3 13:14:43 rescue kernel: [   18.838106] BTRFS: device fsid 0b79cf54-c424-40ed-adca-bd66b38ad57a devid 1 transid 1415180 /dev/bcache2
Feb  3 13:14:43 rescue kernel: [   18.838244] BTRFS: device fsid 01220871-cd98-45c9-8aac-070c4dd97a4f devid 1 transid 401 /dev/mapper/boot
Feb  3 13:14:43 rescue kernel: [   18.838379] BTRFS: device fsid ffd4ee02-4df2-4202-bf3a-169af7a47b18 devid 1 transid 1458 /dev/mapper/rescue
Feb  3 13:14:43 rescue kernel: [   18.858902] BTRFS info (device dm-0): disk space caching is enabled
Feb  3 13:14:43 rescue kernel: [   18.858904] BTRFS info (device dm-0): has skinny extents
Feb  3 13:14:43 rescue kernel: [   18.865645] BTRFS info (device dm-0): enabling ssd optimizations
Feb  3 13:14:43 rescue kernel: [   19.293884] BTRFS info (device dm-0): disk space caching is enabled
Feb  3 13:14:43 rescue kernel: [   19.305694] BTRFS warning (device dm-0): swapfile must not be copy-on-write
Feb  3 13:14:43 rescue kernel: [   20.157798] BTRFS warning (device dm-0): swapfile must not be copy-on-write
Feb  3 13:14:43 rescue kernel: [   20.384004] BTRFS info (device dm-0): device fsid ffd4ee02-4df2-4202-bf3a-169af7a47b18 devid 1 moved old:/dev/mapper/rescue new:/dev/dm-0
Feb  3 13:14:43 rescue kernel: [   20.384394] BTRFS info (device dm-0): device fsid ffd4ee02-4df2-4202-bf3a-169af7a47b18 devid 1 moved old:/dev/dm-0 new:/dev/mapper/rescue
Feb  3 13:15:58 rescue kernel: [  185.577094] BTRFS info (device dm-1): disk space caching is enabled
Feb  3 13:15:58 rescue kernel: [  185.577097] BTRFS info (device dm-1): has skinny extents
Feb  3 13:15:58 rescue kernel: [  185.589554] BTRFS info (device dm-1): enabling ssd optimizations
Feb  3 13:18:20 rescue kernel: [  327.032329] BTRFS: device fsid 0b79cf54-c424-40ed-adca-bd66b38ad57a devid 5 transid 1448328 /dev/bcache4
Feb  3 13:18:20 rescue kernel: [  327.380096] BTRFS: device fsid 0b79cf54-c424-40ed-adca-bd66b38ad57a devid 3 transid 1448328 /dev/bcache3
Feb  3 13:25:15 rescue kernel: [  742.224425] BTRFS info (device bcache3): allowing degraded mounts
Feb  3 13:25:15 rescue kernel: [  742.224427] BTRFS info (device bcache3): disk space caching is enabled
Feb  3 13:25:15 rescue kernel: [  742.224428] BTRFS info (device bcache3): has skinny extents
Feb  3 13:25:15 rescue kernel: [  742.308375] BTRFS error (device bcache3): super_num_devices 3 mismatch with num_devices 3 found here
Feb  3 13:25:15 rescue kernel: [  742.308383] BTRFS error (device bcache3): failed to read chunk tree: -22
Feb  3 13:25:15 rescue kernel: [  742.459207] BTRFS error (device bcache3): open_ctree failed
Feb  3 13:25:40 rescue kernel: [  767.040748] BTRFS info (device bcache3): allowing degraded mounts
Feb  3 13:25:40 rescue kernel: [  767.040749] BTRFS info (device bcache3): disk space caching is enabled
Feb  3 13:25:40 rescue kernel: [  767.040750] BTRFS info (device bcache3): has skinny extents
Feb  3 13:25:40 rescue kernel: [  767.061935] BTRFS error (device bcache3): super_num_devices 3 mismatch with num_devices 3 found here
Feb  3 13:25:40 rescue kernel: [  767.061943] BTRFS error (device bcache3): failed to read chunk tree: -22
Feb  3 13:25:40 rescue kernel: [  767.223721] BTRFS error (device bcache3): open_ctree failed
Feb  3 14:16:21 rescue kernel: [ 3808.492183] BTRFS info (device bcache3): allowing degraded mounts
Feb  3 14:16:21 rescue kernel: [ 3808.492185] BTRFS info (device bcache3): disk space caching is enabled
Feb  3 14:16:21 rescue kernel: [ 3808.492186] BTRFS info (device bcache3): has skinny extents
Feb  3 14:16:21 rescue kernel: [ 3808.702369] BTRFS error (device bcache3): parent transid verify failed on 15062144499712 wanted 1446883 found 112350
Feb  3 14:16:21 rescue kernel: [ 3808.711258] BTRFS info (device bcache3): read error corrected: ino 0 off 15062144499712 (dev /dev/bcache2 sector 13892444896)
Feb  3 14:16:21 rescue kernel: [ 3808.711562] BTRFS info (device bcache3): read error corrected: ino 0 off 15062144503808 (dev /dev/bcache2 sector 13892444904)
Feb  3 14:16:21 rescue kernel: [ 3808.711767] BTRFS info (device bcache3): read error corrected: ino 0 off 15062144507904 (dev /dev/bcache2 sector 13892444912)
Feb  3 14:16:21 rescue kernel: [ 3808.712058] BTRFS info (device bcache3): read error corrected: ino 0 off 15062144512000 (dev /dev/bcache2 sector 13892444920)
Feb  3 14:16:21 rescue kernel: [ 3808.749827] BTRFS error (device bcache3): parent transid verify failed on 15062014951424 wanted 1446322 found 100674
Feb  3 14:16:21 rescue kernel: [ 3808.750423] BTRFS info (device bcache3): read error corrected: ino 0 off 15062014951424 (dev /dev/bcache2 sector 13892191872)
Feb  3 14:16:21 rescue kernel: [ 3808.750731] BTRFS info (device bcache3): read error corrected: ino 0 off 15062014955520 (dev /dev/bcache2 sector 13892191880)
Feb  3 14:16:21 rescue kernel: [ 3808.750957] BTRFS info (device bcache3): read error corrected: ino 0 off 15062014959616 (dev /dev/bcache2 sector 13892191888)
Feb  3 14:16:21 rescue kernel: [ 3808.751217] BTRFS info (device bcache3): read error corrected: ino 0 off 15062014963712 (dev /dev/bcache2 sector 13892191896)
Feb  3 14:16:21 rescue kernel: [ 3808.753943] BTRFS info (device bcache3): bdev /dev/bcache3 errs: wr 0, rd 0, flush 0, corrupt 0, gen 3
Feb  3 14:16:22 rescue kernel: [ 3808.913147] BTRFS error (device bcache3): parent transid verify failed on 15062736961536 wanted 1447095 found 1415067
Feb  3 14:16:22 rescue kernel: [ 3808.919641] BTRFS info (device bcache3): read error corrected: ino 0 off 15062736961536 (dev /dev/bcache2 sector 13893602048)
Feb  3 14:16:22 rescue kernel: [ 3808.920181] BTRFS info (device bcache3): read error corrected: ino 0 off 15062736965632 (dev /dev/bcache2 sector 13893602056)
Feb  3 14:16:22 rescue kernel: [ 3809.203444] BTRFS error (device bcache3): parent transid verify failed on 15061822210048 wanted 1448318 found 1415066
Feb  3 14:16:22 rescue kernel: [ 3809.369463] BTRFS error (device bcache3): parent transid verify failed on 15062219669504 wanted 1448318 found 1415066
Feb  3 14:16:22 rescue kernel: [ 3809.371991] BTRFS error (device bcache3): parent transid verify failed on 15062219849728 wanted 1448318 found 1415066
Feb  3 14:16:22 rescue kernel: [ 3809.384999] BTRFS error (device bcache3): parent transid verify failed on 15062220980224 wanted 1448318 found 1415066
Feb  3 14:16:22 rescue kernel: [ 3809.555515] BTRFS info (device bcache3): enabling ssd optimizations
Feb  3 14:16:22 rescue kernel: [ 3809.651522] BTRFS info (device bcache3): checking UUID tree
Feb  3 14:16:22 rescue kernel: [ 3809.670189] BTRFS error (device bcache3): parent transid verify failed on 15062533734400 wanted 1448319 found 99757
Feb  3 14:16:22 rescue kernel: [ 3809.672604] BTRFS error (device bcache3): parent transid verify failed on 15062480715776 wanted 1448319 found 109527
Feb  3 14:16:22 rescue kernel: [ 3809.678123] BTRFS error (device bcache3): parent transid verify failed on 15062533668864 wanted 1448319 found 99757
Feb  3 14:16:56 rescue kernel: [ 3843.262404] BTRFS error (device bcache3): parent transid verify failed on 19500524929024 wanted 1448324 found 1404193
Feb  3 14:16:56 rescue kernel: [ 3843.262897] BTRFS info (device bcache3): read error corrected: ino 0 off 19500524929024 (dev /dev/bcache2 sector 14113631808)
Feb  3 14:16:56 rescue kernel: [ 3843.263192] BTRFS info (device bcache3): read error corrected: ino 0 off 19500524933120 (dev /dev/bcache2 sector 14113631816)
Feb  3 14:16:56 rescue kernel: [ 3843.263437] BTRFS info (device bcache3): read error corrected: ino 0 off 19500524937216 (dev /dev/bcache2 sector 14113631824)
Feb  3 14:16:56 rescue kernel: [ 3843.263638] BTRFS info (device bcache3): read error corrected: ino 0 off 19500524941312 (dev /dev/bcache2 sector 14113631832)
Feb  3 14:47:02 rescue kernel: [ 5649.108021] BTRFS: device fsid 56182a66-b7c2-4953-a7db-98717ed02356 devid 1 transid 5 /dev/mapper/disk-wd-1t-0730
Feb  3 14:47:40 rescue kernel: [ 5687.214767] BTRFS info (device bcache2): allowing degraded mounts
Feb  3 14:47:40 rescue kernel: [ 5687.214770] BTRFS info (device bcache2): disk space caching is enabled
Feb  3 14:47:40 rescue kernel: [ 5687.214771] BTRFS info (device bcache2): has skinny extents
Feb  3 14:47:41 rescue kernel: [ 5688.229117] BTRFS info (device bcache2): bdev /dev/bcache3 errs: wr 0, rd 0, flush 0, corrupt 0, gen 3
Feb  3 14:47:42 rescue kernel: [ 5688.933827] BTRFS info (device bcache2): enabling ssd optimizations
Feb  3 14:48:07 rescue kernel: [ 5714.056608] BTRFS info (device bcache2): disk added /dev/mapper/disk-wd-1t-0730
Feb  3 14:48:07 rescue kernel: [ 5714.059515] BTRFS info (device bcache2): device fsid 0b79cf54-c424-40ed-adca-bd66b38ad57a devid 6 moved old:/dev/mapper/disk-wd-1t-0730 new:/dev/dm-6
Feb  3 14:48:07 rescue kernel: [ 5714.060355] BTRFS info (device bcache2): device fsid 0b79cf54-c424-40ed-adca-bd66b38ad57a devid 6 moved old:/dev/dm-6 new:/dev/mapper/disk-wd-1t-0730
Feb  3 14:48:49 rescue kernel: [ 5756.214729] BTRFS error (device bcache2): parent transid verify failed on 19500017106944 wanted 1448321 found 1415054
Feb  3 14:48:49 rescue kernel: [ 5756.215355] BTRFS info (device bcache2): read error corrected: ino 0 off 19500017106944 (dev /dev/bcache2 sector 14112639968)
Feb  3 14:48:49 rescue kernel: [ 5756.215675] BTRFS info (device bcache2): read error corrected: ino 0 off 19500017111040 (dev /dev/bcache2 sector 14112639976)
Feb  3 14:48:49 rescue kernel: [ 5756.215935] BTRFS info (device bcache2): read error corrected: ino 0 off 19500017115136 (dev /dev/bcache2 sector 14112639984)
Feb  3 14:48:49 rescue kernel: [ 5756.216156] BTRFS info (device bcache2): read error corrected: ino 0 off 19500017119232 (dev /dev/bcache2 sector 14112639992)
Feb  3 14:48:49 rescue kernel: [ 5756.840866] BTRFS info (device bcache2): disk added /dev/mapper/disk-wd-2t-4979
Feb  3 14:48:49 rescue kernel: [ 5756.843836] BTRFS info (device bcache2): device fsid 0b79cf54-c424-40ed-adca-bd66b38ad57a devid 7 moved old:/dev/mapper/disk-wd-2t-4979 new:/dev/dm-7
Feb  3 14:48:49 rescue kernel: [ 5756.844811] BTRFS info (device bcache2): device fsid 0b79cf54-c424-40ed-adca-bd66b38ad57a devid 7 moved old:/dev/dm-7 new:/dev/mapper/disk-wd-2t-4979
Feb  3 15:06:51 rescue kernel: [ 6838.152819] BTRFS info (device bcache2): allowing degraded mounts
Feb  3 15:06:51 rescue kernel: [ 6838.152822] BTRFS info (device bcache2): disk space caching is enabled
Feb  3 15:06:51 rescue kernel: [ 6838.152823] BTRFS info (device bcache2): has skinny extents
Feb  3 15:06:51 rescue kernel: [ 6838.230152] BTRFS info (device bcache2): bdev /dev/bcache3 errs: wr 0, rd 0, flush 0, corrupt 0, gen 3
Feb  3 15:06:52 rescue kernel: [ 6838.934006] BTRFS info (device bcache2): enabling ssd optimizations
Feb  3 15:06:54 rescue kernel: [ 6841.674943] BTRFS error (device bcache2): parent transid verify failed on 15062538846208 wanted 1448319 found 118025
Feb  3 15:06:54 rescue kernel: [ 6841.675400] BTRFS info (device bcache2): read error corrected: ino 0 off 15062538846208 (dev /dev/bcache2 sector 13893215104)
Feb  3 15:06:54 rescue kernel: [ 6841.675463] BTRFS info (device bcache2): read error corrected: ino 0 off 15062538850304 (dev /dev/bcache2 sector 13893215112)
Feb  3 15:06:54 rescue kernel: [ 6841.675529] BTRFS info (device bcache2): read error corrected: ino 0 off 15062538854400 (dev /dev/bcache2 sector 13893215120)
Feb  3 15:06:54 rescue kernel: [ 6841.676223] BTRFS info (device bcache2): read error corrected: ino 0 off 15062538858496 (dev /dev/bcache2 sector 13893215128)
Feb  3 15:06:54 rescue kernel: [ 6841.766371] BTRFS error (device bcache2): parent transid verify failed on 15062516727808 wanted 1448319 found 116977
Feb  3 15:06:54 rescue kernel: [ 6841.766894] BTRFS info (device bcache2): read error corrected: ino 0 off 15062516727808 (dev /dev/bcache2 sector 13893171904)
Feb  3 15:06:54 rescue kernel: [ 6841.766958] BTRFS info (device bcache2): read error corrected: ino 0 off 15062516731904 (dev /dev/bcache2 sector 13893171912)
Feb  3 15:06:54 rescue kernel: [ 6841.767022] BTRFS info (device bcache2): read error corrected: ino 0 off 15062516736000 (dev /dev/bcache2 sector 13893171920)
Feb  3 15:06:54 rescue kernel: [ 6841.767082] BTRFS info (device bcache2): read error corrected: ino 0 off 15062516740096 (dev /dev/bcache2 sector 13893171928)
Feb  3 15:38:23 rescue kernel: [ 8730.213844] BTRFS info (device bcache2): allowing degraded mounts
Feb  3 15:38:23 rescue kernel: [ 8730.213846] BTRFS info (device bcache2): disk space caching is enabled
Feb  3 15:38:23 rescue kernel: [ 8730.213847] BTRFS info (device bcache2): has skinny extents
Feb  3 15:38:23 rescue kernel: [ 8730.225206] BTRFS error (device bcache2): parent transid verify failed on 14963956989952 wanted 1448324 found 80508
Feb  3 15:38:23 rescue kernel: [ 8730.225621] BTRFS info (device bcache2): read error corrected: ino 0 off 14963956989952 (dev /dev/bcache2 sector 534883232)
Feb  3 15:38:23 rescue kernel: [ 8730.225829] BTRFS info (device bcache2): read error corrected: ino 0 off 14963956994048 (dev /dev/bcache2 sector 534883240)
Feb  3 15:38:23 rescue kernel: [ 8730.225885] BTRFS info (device bcache2): read error corrected: ino 0 off 14963956998144 (dev /dev/bcache2 sector 534883248)
Feb  3 15:38:23 rescue kernel: [ 8730.225937] BTRFS info (device bcache2): read error corrected: ino 0 off 14963957002240 (dev /dev/bcache2 sector 534883256)
Feb  3 15:38:23 rescue kernel: [ 8730.332134] BTRFS error (device bcache2): parent transid verify failed on 14963956613120 wanted 1446901 found 120202
Feb  3 15:38:23 rescue kernel: [ 8730.332435] BTRFS info (device bcache2): read error corrected: ino 0 off 14963956613120 (dev /dev/bcache2 sector 534882496)
Feb  3 15:38:23 rescue kernel: [ 8730.332493] BTRFS info (device bcache2): read error corrected: ino 0 off 14963956617216 (dev /dev/bcache2 sector 534882504)
Feb  3 15:38:23 rescue kernel: [ 8730.332549] BTRFS info (device bcache2): read error corrected: ino 0 off 14963956621312 (dev /dev/bcache2 sector 534882512)
Feb  3 15:38:23 rescue kernel: [ 8730.332605] BTRFS info (device bcache2): read error corrected: ino 0 off 14963956625408 (dev /dev/bcache2 sector 534882520)
Feb  3 15:38:23 rescue kernel: [ 8730.338227] BTRFS error (device bcache2): parent transid verify failed on 14963956776960 wanted 1446891 found 105104
Feb  3 15:38:23 rescue kernel: [ 8730.338525] BTRFS info (device bcache2): read error corrected: ino 0 off 14963956776960 (dev /dev/bcache2 sector 534882816)
Feb  3 15:38:23 rescue kernel: [ 8730.338582] BTRFS info (device bcache2): read error corrected: ino 0 off 14963956781056 (dev /dev/bcache2 sector 534882824)
Feb  3 15:38:23 rescue kernel: [ 8730.340420] BTRFS error (device bcache2): parent transid verify failed on 14963956695040 wanted 1446882 found 108615
Feb  3 15:38:23 rescue kernel: [ 8730.341332] BTRFS error (device bcache2): parent transid verify failed on 14963957334016 wanted 1446872 found 94852
Feb  3 15:38:23 rescue kernel: [ 8730.343095] BTRFS error (device bcache2): parent transid verify failed on 14963957383168 wanted 1446734 found 91496
Feb  3 15:38:23 rescue kernel: [ 8730.344103] BTRFS error (device bcache2): parent transid verify failed on 14963957301248 wanted 1446522 found 100496
Feb  3 15:38:23 rescue kernel: [ 8730.345274] BTRFS error (device bcache2): parent transid verify failed on 14963956662272 wanted 1446296 found 99399
Feb  3 15:38:23 rescue kernel: [ 8730.346413] BTRFS error (device bcache2): parent transid verify failed on 14963956645888 wanted 1446287 found 102741
Feb  3 15:38:23 rescue kernel: [ 8730.347550] BTRFS error (device bcache2): parent transid verify failed on 14963957317632 wanted 1446844 found 97159
Feb  3 15:38:23 rescue kernel: [ 8730.371987] BTRFS error (device bcache2): bad tree block start, want 14963957694464 have 9926189483690973024
Feb  3 15:38:23 rescue kernel: [ 8730.529027] BTRFS info (device bcache2): bdev /dev/bcache3 errs: wr 0, rd 0, flush 0, corrupt 0, gen 3
Feb  3 15:38:24 rescue kernel: [ 8730.974041] BTRFS info (device bcache2): enabling ssd optimizations
Feb  3 15:38:24 rescue kernel: [ 8730.974481] BTRFS info (device bcache2): checking UUID tree
Feb  3 15:38:24 rescue kernel: [ 8731.172674] BTRFS critical (device bcache2): corrupt leaf: root=23146 block=19498503094272 slot=8, invalid key objectid: has 18446744073709551606 expect 6 or [256, 18446744073709551360] or 18446744073709551604
Feb  3 15:38:24 rescue kernel: [ 8731.172678] BTRFS error (device bcache2): block=19498503094272 read time tree block corruption detected
Feb  3 15:38:24 rescue kernel: [ 8731.172860] BTRFS critical (device bcache2): corrupt leaf: root=23146 block=19498503094272 slot=8, invalid key objectid: has 18446744073709551606 expect 6 or [256, 18446744073709551360] or 18446744073709551604
Feb  3 15:38:24 rescue kernel: [ 8731.172862] BTRFS error (device bcache2): block=19498503094272 read time tree block corruption detected
Feb  3 15:38:24 rescue kernel: [ 8731.172878] BTRFS warning (device bcache2): iterating uuid_tree failed -5
Feb  3 15:38:40 rescue kernel: [ 8747.300244] BTRFS error (device bcache2): parent transid verify failed on 19499606147072 wanted 1419889 found 1415052


* Re: How to Fix 'Error: could not find extent items for root 257'?
  2020-02-06  4:13       ` Chiung-Ming Huang
@ 2020-02-06  4:35         ` Qu Wenruo
  2020-02-06  6:50           ` Chiung-Ming Huang
  2020-02-07  3:49           ` Chiung-Ming Huang
  0 siblings, 2 replies; 16+ messages in thread
From: Qu Wenruo @ 2020-02-06  4:35 UTC (permalink / raw)
  To: Chiung-Ming Huang, Btrfs





On 2020/2/6 12:13 PM, Chiung-Ming Huang wrote:
> Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Thu, Feb 6, 2020 at 9:13 AM:
>> Please keep in mind that, if you post dmesg, the first time such error
>> happens is the most important.
>> Not something after you modified the fs by btrfs check --repair.
> 
> Thanks for your advice. I'll keep it in mind. :)
> 
> 
>>>
>>> Feb  3 15:38:24 rescue kernel: [ 8731.172674] BTRFS critical (device
>>> bcache2): corrupt leaf: root=23146 block=19498503094272 slot=8,
>>> invalid key objectid: has 18446744073709551606 expect 6 or [256,
>>> 18446744073709551360] or 18446744073709551604
>>
>> This message is even earlier than your initial report, and it's more
>> important.
>> This means you have a bad inode item with objectid EXTENT_CSUM_OBJECTID.
>>
>> This is a bigger problem.
> 
> It sounds bad. Is it possible to save the data or part of them?

Metadata is already screwed up.

Data may be partly saved with btrfs-restore, or if you can mount it read-only.
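
The two salvage routes mentioned above could be tried roughly as follows.
This is only a sketch under assumptions: the function names are made up for
illustration, the device, mount point and target directory are placeholders
passed as arguments, and setting RUN=echo prints each command instead of
executing it.

```shell
# Set RUN=echo to print the commands instead of executing them.
# $1 = any member device of the filesystem, $2 = mount point or target dir.

mount_readonly() {
    # Route 1: a plain read-only mount, then copy files off normally.
    ${RUN:-} mount -o ro "$1" "$2"
}

restore_preview() {
    # Route 2, step 1: btrfs restore reads the unmounted device.
    # -D is a dry run that only lists what would be restored, -v is verbose.
    ${RUN:-} btrfs restore -D -v "$1" "$2"
}

restore_copy() {
    # Route 2, step 2: the real copy. -i keeps going past errors.
    ${RUN:-} btrfs restore -v -i "$1" "$2"
}
```

The target directory for btrfs restore should live on a different, healthy
filesystem with enough free space.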

> 
> 
>> Are you sure that is the very first error message you hit?
> 
> My .bash_history doesn't show timestamps, so I'm not really sure which
> critical/error message came right after the first `btrfs check --repair`.
> To make the log file smaller, I excerpted only the btrfs messages up to
> the first critical message in the attachment. I'm not so familiar with
> mailing lists. Could you see `btrfs_.log`?

Got the attachment.

The first strange part is that I see several mount failures caused by
4 or more devices missing.

Then it mounted with devid 1 missing.

After reboot, you got the full fs mounted without any device missing.
So far so good, but I'm not sure how the degraded mount affects things here.

Soon after that, there is already a sign that some degraded mount is
causing problems: num_devices doesn't match.

Furthermore, around 14:16 on Feb 3, there are metadata transid mismatches,
which means some metadata is already way older than expected.
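
To see how far behind the stale copies are, the "wanted" and "found"
generations can be pulled out of a saved kernel log. A minimal sketch,
assuming one log message per line; the helper name transid_gaps is
hypothetical:

```shell
# Print "bytenr wanted found gap" for every "parent transid verify failed"
# line read from standard input. A large positive gap means the block that
# was read is many generations older than its parent expects.
transid_gaps() {
    awk '/parent transid verify failed on/ {
        for (i = 1; i < NF; i++) {
            if ($i == "on")     bytenr = $(i + 1)
            if ($i == "wanted") wanted = $(i + 1)
            if ($i == "found")  found  = $(i + 1)
        }
        print bytenr, wanted, found, wanted - found
    }'
}

# Example: transid_gaps < btrfs_.log | sort -k4,4n | tail
```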

At that point, btrfs can still try to read from the other copy, thus
it's not a big problem yet.

But that's already poisoning your fs, reducing the stability
step by step. It's only the RAID1 of btrfs that barely saved your fs.
The normal way to handle it is to trigger a full fs scrub to
resilver/resync all RAID1 copies.

And finally, you hit the last stage, where around 15:38 btrfs can't
repair the metadata mismatch caused by multiple split-brain RAID1
situations, causing tons of transid errors that btrfs can't fix.


So from the full dmesg, it looks like the abuse of degraded mounts is
causing the problem.

This shows one shortcoming of the current btrfs RAID implementation: it
doesn't do automatic resilvering, unlike mdraid, which resilvers
before the array can be accessed.
Btrfs doesn't have a record of which blocks were written before a
device went missing.

Thus degraded for btrfs should really be considered as a last-resort
method. And manual scrub after all devices go back online is really
recommended.
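
A manual scrub after all devices are back could look like the sketch
below. It assumes the filesystem is mounted at the path given as the first
argument; the function name resync_raid1 is made up for illustration, and
RUN=echo prints the commands instead of running them.

```shell
# Set RUN=echo to print the commands instead of executing them.
resync_raid1() {
    mnt=$1
    # -B: stay in the foreground until the scrub finishes
    # -d: print statistics for each device separately
    ${RUN:-} btrfs scrub start -B -d "$mnt"
    # Cumulative per-device counters (write, read, flush, corruption,
    # generation errors), the same values dmesg prints as "errs: wr ...".
    ${RUN:-} btrfs device stats "$mnt"
}

# Example: resync_raid1 /mnt
```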

Thanks,
Qu
> 
> 
>>> rescue ~$ btrfs check /dev/bcache4
>>> Opening filesystem to check...
>>> Checking filesystem on /dev/bcache4
>>> UUID: 0b79cf54-c424-40ed-adca-bd66b38ad57a
>>> [1/7] checking root items
>>> Error: could not find extent items for root 257
>>> ERROR: failed to repair root items: No such file or directory
>>
>> This part is from a special repair for a regression in 3.17.
>>
>> I guess we should not enable it by default.
>> That will be another patch for btrfs-progs.
> 
> Is this patch safe for saving my btrfs? If it is, I can build btrfs-progs.

Here is the diff, should be pretty safe:
diff --git a/check/main.c b/check/main.c
index 7db65150048b..bcde157c415d 100644
--- a/check/main.c
+++ b/check/main.c
@@ -10373,7 +10373,8 @@ static int cmd_check(const struct cmd_struct *cmd, int argc, char **argv)
                        ctx.tp = TASK_ROOT_ITEMS;
                        task_start(ctx.info, &ctx.start_time, &ctx.item_count);
                }
-               ret = repair_root_items(info);
+               if (repair)
+                       ret = repair_root_items(info);
                task_stop(ctx.info);
                if (ret < 0) {
                        err = !!ret;


Thanks,
Qu
> 
> 
> Regards,
> Chiung-Ming Huang
> 




* Re: How to Fix 'Error: could not find extent items for root 257'?
  2020-02-06  4:35         ` Qu Wenruo
@ 2020-02-06  6:50           ` Chiung-Ming Huang
  2020-02-07  3:49           ` Chiung-Ming Huang
  1 sibling, 0 replies; 16+ messages in thread
From: Chiung-Ming Huang @ 2020-02-06  6:50 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Btrfs

Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Thu, Feb 6, 2020 at 12:35 PM:
>
>
>
> On 2020/2/6 12:13 PM, Chiung-Ming Huang wrote:
> > Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Thu, Feb 6, 2020 at 9:13 AM:

> Got the attachment.
>
> The first strange part is that I see several mount failures caused by
> 4 or more devices missing.
>
> Then it mounted with devid 1 missing.
>
> After reboot, you got the full fs mounted without any device missing.

That's because /etc/crypttab on the rescue system wasn't set up correctly.
I logged in first and then fixed it.

> So far so good, but I'm not sure how the degraded mount affects things here.

> Soon after that, there is already a sign that some degraded mount is
> causing problems: num_devices doesn't match.

Before running `btrfs balance start -f ...` to convert to single, I removed
3 disks from /etc/crypttab on the server system: 1TB (empty), 2TB (empty),
and 10TB (5TB data + metadata). The 10TB disk holds one of the RAID1
copies. I formatted the 1TB and 2TB disks immediately, but not the 10TB
one, just in case. Then I started `btrfs balance ...` and let the server
keep receiving data from the internet. My thinking was that the 10TB disk
has old data and metadata, and that even if I added it back to the RAID1,
btrfs could figure out which data is new or old and fix it automatically.
The server could keep working in the meantime. It would just waste some
disk space, which would be rectified by `btrfs balance` or `btrfs scrub`
later. Is that true?

`btrfs balance ..` suddenly failed after hours. The server system became
totally unresponsive, including ssh and ctrl+alt+3. After that and a
power-off, I booted into the rescue system, fixed /etc/crypttab, and
brought all the devices up under /dev on the rescue system.

So `super_num_devices 3 mismatch` probably refers to these 3 disks. (Not sure)
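
One way to check this would be to compare the superblocks of the member
devices. The sketch below assumes that the output of
`btrfs inspect-internal dump-super` carries the field names in its first
column; super_summary is a hypothetical helper that keeps only the
generation and device-count lines.

```shell
# Reads `btrfs inspect-internal dump-super` output on stdin and prints only
# the superblock generation and the number of devices it records.
super_summary() {
    awk '$1 == "generation" || $1 == "num_devices" { print $1, $2 }'
}

# Example (device names as listed by `btrfs fi show` earlier in the thread):
# for dev in /dev/bcache2 /dev/bcache3 /dev/bcache4; do
#     echo "== $dev"
#     btrfs inspect-internal dump-super "$dev" | super_summary
# done
```

Devices whose generation lags behind the others hold stale superblocks,
which would match the transid gaps seen in the kernel log.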

> So from the full dmesg, it looks like the abuse of degraded mounts is
> causing the problem.

According to the description I wrote above, is the conclusion still the same?

> Thus degraded for btrfs should really be considered as a last-resort
> method. And manual scrub after all devices go back online is really
> recommended.

Thanks for your analysis and help.

💔 What's done is done. My goal now is to fix the btrfs and save as much
data as possible. Should I first unplug the 10TB disk, which holds one of
the old RAID1 copies? Originally, the server had about 6TB of data. The
10TB disk I removed from '/etc/crypttab' holds about 5TB of it. I'm
worried that my next step may result in the loss of this 5TB of data.
Maybe worse, it's gone already.

I tried to mount the btrfs with only this 10TB disk. It didn't work. dmesg showed:
[Thu Feb  6 14:34:03 2020] BTRFS info (device bcache2): allowing degraded mounts
[Thu Feb  6 14:34:03 2020] BTRFS info (device bcache2): disk space caching is enabled
[Thu Feb  6 14:34:03 2020] BTRFS info (device bcache2): has skinny extents
[Thu Feb  6 14:34:03 2020] BTRFS warning (device bcache2): devid 3 uuid f9b7fe84-d95b-4db5-9e2b-c34a2d4186e9 is missing
[Thu Feb  6 14:34:03 2020] BTRFS warning (device bcache2): devid 5 uuid d442b477-0233-4a4a-aa71-cb24343b83ee is missing
[Thu Feb  6 14:34:03 2020] BTRFS warning (device bcache2): devid 6 uuid d18e3182-a3cc-448b-b15b-0a20dc9c8cbe is missing
[Thu Feb  6 14:34:03 2020] BTRFS warning (device bcache2): devid 7 uuid 991286c4-fa81-417a-876d-a0cb10989ded is missing
[Thu Feb  6 14:34:03 2020] BTRFS warning (device bcache2): failed to read tree root
[Thu Feb  6 14:34:03 2020] BTRFS error (device bcache2): open_ctree failed


Based on your analysis, could you give me some advice about the next
steps to save my btrfs raid? Are they:
1) Apply the patch.
2) Run `btrfs check --repair /dev/bcache4`.


Regards,
Chiung-Ming Huang


* Re: How to Fix 'Error: could not find extent items for root 257'?
  2020-02-06  4:35         ` Qu Wenruo
  2020-02-06  6:50           ` Chiung-Ming Huang
@ 2020-02-07  3:49           ` Chiung-Ming Huang
  2020-02-07  4:00             ` Qu Wenruo
  1 sibling, 1 reply; 16+ messages in thread
From: Chiung-Ming Huang @ 2020-02-07  3:49 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Btrfs

Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Thu, Feb 6, 2020 at 12:35 PM:
>
> Here is the diff, should be pretty safe:
> diff --git a/check/main.c b/check/main.c
> index 7db65150048b..bcde157c415d 100644
> --- a/check/main.c
> +++ b/check/main.c
> @@ -10373,7 +10373,8 @@ static int cmd_check(const struct cmd_struct *cmd, int argc, char **argv)
>                         ctx.tp = TASK_ROOT_ITEMS;
>                         task_start(ctx.info, &ctx.start_time, &ctx.item_count);
>                 }
> -               ret = repair_root_items(info);
> +               if (repair)
> +                       ret = repair_root_items(info);
>                 task_stop(ctx.info);
>                 if (ret < 0) {
>                         err = !!ret;
>

I applied this patch and ran `btrfs check /dev/bcache4`. It showed the following:
Opening filesystem to check...
Checking filesystem on /dev/bcache4
UUID: 0b79cf54-c424-40ed-adca-bd66b38ad57a
[1/7] checking root items
[2/7] checking extents
parent transid verify failed on 7153357357056 wanted 1382980 found 1452673
parent transid verify failed on 7153357357056 wanted 1382980 found 1452673
parent transid verify failed on 7153357357056 wanted 1382980 found 1452673
Ignoring transid failure
leaf parent key incorrect 7153357357056
bad block 7153357357056
ERROR: errors found in extent allocation tree or chunk allocation
[3/7] checking free space cache
cache and super generation don't match, space cache will be invalidated
[4/7] checking fs roots
root 5 root dir 256 not found
root 257 root dir 256 not found
root 258 root dir 256 not found
root 277 root dir 256 not found
root 278 root dir 256 not found
root 279 root dir 256 not found
root 280 root dir 256 not found
root 283 root dir 256 not found
root 286 root dir 256 not found
root 289 root dir 256 not found
root 292 root dir 256 not found
root 295 root dir 256 not found
root 298 root dir 256 not found
root 304 root dir 256 not found
root 307 root dir 256 not found
root 310 root dir 256 not found
root 313 root dir 256 not found
root 316 root dir 256 not found
root 319 root dir 256 not found
root 322 root dir 256 not found
root 325 root dir 256 not found
root 360 root dir 256 not found
root 367 root dir 256 not found
root 370 root dir 256 not found
root 373 root dir 256 not found
root 376 root dir 256 not found
root 380 root dir 256 not found
root 383 root dir 256 not found
root 386 root dir 256 not found
root 389 root dir 256 not found
root 392 root dir 256 not found
root 399 root dir 256 not found
root 402 root dir 256 not found
root 405 root dir 256 not found
root 408 root dir 256 not found
root 411 root dir 256 not found
root 414 root dir 256 not found
root 417 root dir 256 not found
root 420 root dir 256 not found
root 423 root dir 256 not found
root 426 root dir 256 not found
root 429 root dir 256 not found
root 439 root dir 256 not found
root 442 root dir 256 not found
root 445 root dir 256 not found
root 448 root dir 256 not found
root 451 root dir 256 not found
root 513 root dir 256 not found
root 4613 root dir 256 not found
root 4616 root dir 256 not found
root 4619 root dir 256 not found
root 4622 root dir 256 not found
root 4625 root dir 256 not found
root 4628 root dir 256 not found
root 4631 root dir 256 not found
root 4640 root dir 256 not found
root 4643 root dir 256 not found
root 4646 root dir 256 not found
root 4649 root dir 256 not found
root 4652 root dir 256 not found
root 4673 root dir 256 not found
root 18871 root dir 256 not found
root 19354 root dir 256 not found
root 19355 root dir 256 not found
root 19356 root dir 256 not found
root 19375 root dir 256 not found
root 19416 root dir 256 not found
root 19419 root dir 256 not found
root 19422 root dir 256 not found
root 19425 root dir 256 not found
root 19428 root dir 256 not found
root 19432 root dir 256 not found
root 19435 root dir 256 not found
root 19438 root dir 256 not found
root 19441 root dir 256 not found
root 19450 root dir 256 not found
root 19453 root dir 256 not found
root 19456 root dir 256 not found
root 19459 root dir 256 not found
root 19462 root dir 256 not found
root 19465 root dir 256 not found
root 19468 root dir 256 not found
root 19472 root dir 256 not found
root 19473 root dir 256 not found
root 19613 root dir 256 not found
root 19784 root dir 256 not found
root 19812 root dir 256 not found
root 20572 root dir 256 not found
root 20768 root dir 256 not found
root 20771 root dir 256 not found
root 20834 root dir 256 not found
root 20837 root dir 256 not found
root 21438 root dir 256 not found
root 21447 root dir 256 not found
root 21469 root dir 256 not found
root 21470 root dir 256 not found
root 23144 root dir 256 not found
root 23146 root dir 256 not found
root 23147 root dir 256 not found
root 23440 root dir 256 not found
root 23452 root dir 256 not found
root 23460 root dir 256 not found
root 23471 root dir 256 not found
root 23520 root dir 256 not found
root 23521 root dir 256 not found
root 23833 root dir 256 not found
root 23834 root dir 256 not found
root 23854 root dir 256 not found
root 23855 root dir 256 not found
ERROR: errors found in fs roots
found 1902526464 bytes used, error(s) found
total csum bytes: 0
total tree bytes: 6275072
total fs tree bytes: 1032192
total extent tree bytes: 409600
btree space waste bytes: 974245
file data blocks allocated: 1628962816
 referenced 1628962816

Regards,
Chiung-Ming Huang


* Re: How to Fix 'Error: could not find extent items for root 257'?
  2020-02-07  3:49           ` Chiung-Ming Huang
@ 2020-02-07  4:00             ` Qu Wenruo
  2020-02-07  6:16               ` Chiung-Ming Huang
  0 siblings, 1 reply; 16+ messages in thread
From: Qu Wenruo @ 2020-02-07  4:00 UTC (permalink / raw)
  To: Chiung-Ming Huang; +Cc: Btrfs



On 2020/2/7 11:49 AM, Chiung-Ming Huang wrote:
> Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Thu, Feb 6, 2020 at 12:35 PM:
>>
>> Here is the diff, should be pretty safe:
>> diff --git a/check/main.c b/check/main.c
>> index 7db65150048b..bcde157c415d 100644
>> --- a/check/main.c
>> +++ b/check/main.c
>> @@ -10373,7 +10373,8 @@ static int cmd_check(const struct cmd_struct *cmd, int argc, char **argv)
>>                         ctx.tp = TASK_ROOT_ITEMS;
>>                         task_start(ctx.info, &ctx.start_time, &ctx.item_count);
>>                 }
>> -               ret = repair_root_items(info);
>> +               if (repair)
>> +                       ret = repair_root_items(info);
>>                 task_stop(ctx.info);
>>                 if (ret < 0) {
>>                         err = !!ret;
>>
> 
> I applied this patch and executed `btrfs check /dev/bcache4`. It showed the following:
> Opening filesystem to check...
> Checking filesystem on /dev/bcache4
> UUID: 0b79cf54-c424-40ed-adca-bd66b38ad57a
> [1/7] checking root items
> [2/7] checking extents
> parent transid verify failed on 7153357357056 wanted 1382980 found 1452673
> parent transid verify failed on 7153357357056 wanted 1382980 found 1452673
> parent transid verify failed on 7153357357056 wanted 1382980 found 1452673

The extent tree is corrupted by a transid mismatch. That is already bad news.
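For readers following along, here is a minimal sketch (not part of btrfs-progs) that condenses the repeated "parent transid verify failed" lines into one entry per tree block. It assumes only the message format visible in the quoted output; the function name `summarize_transid_failures` is made up for illustration. As far as I understand it, "wanted" is the generation the parent pointer expects and "found" is the generation stored in the block itself, so a larger "found" hints that the pointer comes from an older copy of the metadata.

```python
import re
from collections import Counter

# Message format as it appears in the btrfs check output quoted in this thread.
TRANSID_RE = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)

def summarize_transid_failures(text):
    """Return {(bytenr, wanted, found): count} for each distinct failure."""
    counts = Counter()
    for line in text.splitlines():
        m = TRANSID_RE.search(line)
        if m:
            counts[tuple(int(g) for g in m.groups())] += 1
    return dict(counts)

sample = """\
parent transid verify failed on 7153357357056 wanted 1382980 found 1452673
parent transid verify failed on 7153357357056 wanted 1382980 found 1452673
parent transid verify failed on 7153357357056 wanted 1382980 found 1452673
"""

summary = summarize_transid_failures(sample)
for (bytenr, wanted, found), n in summary.items():
    # The generation gap shows how far apart the two copies are.
    print(f"block {bytenr}: wanted gen {wanted}, found gen {found} "
          f"(gap {found - wanted}), reported {n} times")
```

Here all three messages refer to the same block, so the summary has a single entry.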

> Ignoring transid failure
> leaf parent key incorrect 7153357357056
> bad block 7153357357056
> ERROR: errors found in extent allocation tree or chunk allocation
> [3/7] checking free space cache
> cache and super generation don't match, space cache will be invalidated
> [4/7] checking fs roots
> root 5 root dir 256 not found
> root 257 root dir 256 not found
> root 258 root dir 256 not found
> root 277 root dir 256 not found
> root 278 root dir 256 not found
> root 279 root dir 256 not found
> root 280 root dir 256 not found
> root 283 root dir 256 not found
> root 286 root dir 256 not found
> root 289 root dir 256 not found
> root 292 root dir 256 not found
> root 295 root dir 256 not found
> root 298 root dir 256 not found
> root 304 root dir 256 not found
> root 307 root dir 256 not found
> root 310 root dir 256 not found
> root 313 root dir 256 not found
> root 316 root dir 256 not found
> root 319 root dir 256 not found
> root 322 root dir 256 not found
> root 325 root dir 256 not found
> root 360 root dir 256 not found
> root 367 root dir 256 not found
> root 370 root dir 256 not found
> root 373 root dir 256 not found
> root 376 root dir 256 not found
> root 380 root dir 256 not found
> root 383 root dir 256 not found
> root 386 root dir 256 not found
> root 389 root dir 256 not found
> root 392 root dir 256 not found
> root 399 root dir 256 not found
> root 402 root dir 256 not found
> root 405 root dir 256 not found
> root 408 root dir 256 not found
> root 411 root dir 256 not found
> root 414 root dir 256 not found
> root 417 root dir 256 not found
> root 420 root dir 256 not found
> root 423 root dir 256 not found
> root 426 root dir 256 not found
> root 429 root dir 256 not found
> root 439 root dir 256 not found
> root 442 root dir 256 not found
> root 445 root dir 256 not found
> root 448 root dir 256 not found
> root 451 root dir 256 not found
> root 513 root dir 256 not found
> root 4613 root dir 256 not found
> root 4616 root dir 256 not found
> root 4619 root dir 256 not found
> root 4622 root dir 256 not found
> root 4625 root dir 256 not found
> root 4628 root dir 256 not found
> root 4631 root dir 256 not found
> root 4640 root dir 256 not found
> root 4643 root dir 256 not found
> root 4646 root dir 256 not found
> root 4649 root dir 256 not found
> root 4652 root dir 256 not found
> root 4673 root dir 256 not found
> root 18871 root dir 256 not found
> root 19354 root dir 256 not found
> root 19355 root dir 256 not found
> root 19356 root dir 256 not found
> root 19375 root dir 256 not found
> root 19416 root dir 256 not found
> root 19419 root dir 256 not found
> root 19422 root dir 256 not found
> root 19425 root dir 256 not found
> root 19428 root dir 256 not found
> root 19432 root dir 256 not found
> root 19435 root dir 256 not found
> root 19438 root dir 256 not found
> root 19441 root dir 256 not found
> root 19450 root dir 256 not found
> root 19453 root dir 256 not found
> root 19456 root dir 256 not found
> root 19459 root dir 256 not found
> root 19462 root dir 256 not found
> root 19465 root dir 256 not found
> root 19468 root dir 256 not found
> root 19472 root dir 256 not found
> root 19473 root dir 256 not found
> root 19613 root dir 256 not found
> root 19784 root dir 256 not found
> root 19812 root dir 256 not found
> root 20572 root dir 256 not found
> root 20768 root dir 256 not found
> root 20771 root dir 256 not found
> root 20834 root dir 256 not found
> root 20837 root dir 256 not found
> root 21438 root dir 256 not found
> root 21447 root dir 256 not found
> root 21469 root dir 256 not found
> root 21470 root dir 256 not found
> root 23144 root dir 256 not found
> root 23146 root dir 256 not found
> root 23147 root dir 256 not found
> root 23440 root dir 256 not found
> root 23452 root dir 256 not found
> root 23460 root dir 256 not found
> root 23471 root dir 256 not found
> root 23520 root dir 256 not found
> root 23521 root dir 256 not found
> root 23833 root dir 256 not found
> root 23834 root dir 256 not found
> root 23854 root dir 256 not found
> root 23855 root dir 256 not found

All of these subvolumes have a missing root dir. That's not good either.
I guess btrfs-restore is your last chance, or an RO mount with my
rescue=skipbg patchset:
https://patchwork.kernel.org/project/linux-btrfs/list/?series=170715

Thanks,
Qu

> ERROR: errors found in fs roots
> found 1902526464 bytes used, error(s) found
> total csum bytes: 0
> total tree bytes: 6275072
> total fs tree bytes: 1032192
> total extent tree bytes: 409600
> btree space waste bytes: 974245
> file data blocks allocated: 1628962816
>  referenced 1628962816
> 
> Regards,
> Chiung-Ming Huang
> 



* Re: How to Fix 'Error: could not find extent items for root 257'?
  2020-02-07  4:00             ` Qu Wenruo
@ 2020-02-07  6:16               ` Chiung-Ming Huang
  2020-02-07  7:16                 ` Qu Wenruo
  0 siblings, 1 reply; 16+ messages in thread
From: Chiung-Ming Huang @ 2020-02-07  6:16 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Btrfs

Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Fri, Feb 7, 2020 at 12:00 PM:
>
> All of these subvolumes have a missing root dir. That's not good either.
> I guess btrfs-restore is your last chance, or an RO mount with my
> rescue=skipbg patchset:
> https://patchwork.kernel.org/project/linux-btrfs/list/?series=170715
>

Is it possible to reuse the original disks to store the restored data safely?
I would like to first restore the data of /dev/bcache3 to the new btrfs RAID0,
and then add it to the new btrfs RAID0. Does `btrfs restore` need metadata or
anything else on /dev/bcache3 to restore /dev/bcache2 and /dev/bcache4?

/dev/bcache2, ID: 1
   Device size:             9.09TiB
   Device slack:              0.00B
   Data,RAID1:              3.93TiB
   Metadata,RAID1:          2.00GiB
   System,RAID1:           32.00MiB
   Unallocated:             5.16TiB

/dev/bcache3, ID: 3
   Device size:             2.73TiB
   Device slack:              0.00B
   Data,single:           378.00GiB
   Data,RAID1:            355.00GiB
   Metadata,single:         2.00GiB
   Metadata,RAID1:         11.00GiB
   Unallocated:             2.00TiB

/dev/bcache4, ID: 5
   Device size:             9.09TiB
   Device slack:              0.00B
   Data,single:             2.93TiB
   Data,RAID1:              4.15TiB
   Metadata,single:         6.00GiB
   Metadata,RAID1:         11.00GiB
   System,RAID1:           32.00MiB
   Unallocated:             2.00TiB

Regards,
Chiung-Ming Huang


* Re: How to Fix 'Error: could not find extent items for root 257'?
  2020-02-07  6:16               ` Chiung-Ming Huang
@ 2020-02-07  7:16                 ` Qu Wenruo
  2020-02-10  6:50                   ` Chiung-Ming Huang
  0 siblings, 1 reply; 16+ messages in thread
From: Qu Wenruo @ 2020-02-07  7:16 UTC (permalink / raw)
  To: Chiung-Ming Huang; +Cc: Btrfs



On 2020/2/7 2:16 PM, Chiung-Ming Huang wrote:
> Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Fri, Feb 7, 2020 at 12:00 PM:
>>
>> All of these subvolumes have a missing root dir. That's not good either.
>> I guess btrfs-restore is your last chance, or an RO mount with my
>> rescue=skipbg patchset:
>> https://patchwork.kernel.org/project/linux-btrfs/list/?series=170715
>>
> 
> Is it possible to reuse the original disks to store the restored data safely?
> I would like to first restore the data of /dev/bcache3 to the new btrfs RAID0,
> and then add it to the new btrfs RAID0. Does `btrfs restore` need metadata or
> anything else on /dev/bcache3 to restore /dev/bcache2 and /dev/bcache4?

Devid 1 (bcache2) seems OK to be missing, as all its data and metadata
are in RAID1.

But it's strongly recommended to test without wiping bcache2 first, to make
sure btrfs-restore can salvage enough data, and only then wipe bcache2.
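The reasoning above can be sketched as a small check over the per-device usage listing quoted in this thread: a device is a candidate for removal only if every chunk type on it uses a redundant profile. This is an illustrative sketch, not a btrfs tool. The input is an abbreviated copy of the listing from this thread, the helper name `profiles_per_device` is made up, and a clean result is no guarantee on a filesystem that is already corrupted.

```python
import re

# Abbreviated copy of the per-device usage listing from this thread
# (size, slack and unallocated lines left out).
USAGE = """\
/dev/bcache2, ID: 1
   Data,RAID1:              3.93TiB
   Metadata,RAID1:          2.00GiB
   System,RAID1:           32.00MiB

/dev/bcache3, ID: 3
   Data,single:           378.00GiB
   Data,RAID1:            355.00GiB
   Metadata,single:         2.00GiB
   Metadata,RAID1:         11.00GiB

/dev/bcache4, ID: 5
   Data,single:             2.93TiB
   Data,RAID1:              4.15TiB
   Metadata,single:         6.00GiB
   Metadata,RAID1:         11.00GiB
   System,RAID1:           32.00MiB
"""

DEV_RE = re.compile(r"^(\S+), ID: (\d+)$")
CHUNK_RE = re.compile(r"^\s+(Data|Metadata|System),(\w+):")

def profiles_per_device(text):
    """Map devid -> set of chunk profiles present on that device."""
    result = {}
    current = None
    for line in text.splitlines():
        m = DEV_RE.match(line)
        if m:
            current = int(m.group(2))
            result[current] = set()
            continue
        m = CHUNK_RE.match(line)
        if m and current is not None:
            result[current].add(m.group(2))
    return result

profiles = profiles_per_device(USAGE)
# Devices whose chunks are all RAID1 have a second copy elsewhere.
only_raid1 = sorted(d for d, p in profiles.items() if p == {"RAID1"})
print("devids with RAID1 chunks only:", only_raid1)
```

Devids 3 and 5 both carry `single` chunks, so losing either of them would lose data that has no second copy.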

Thanks,
Qu




* Re: How to Fix 'Error: could not find extent items for root 257'?
  2020-02-07  7:16                 ` Qu Wenruo
@ 2020-02-10  6:50                   ` Chiung-Ming Huang
  2020-02-10  7:03                     ` Qu Wenruo
  0 siblings, 1 reply; 16+ messages in thread
From: Chiung-Ming Huang @ 2020-02-10  6:50 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Btrfs

Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Fri, Feb 7, 2020 at 3:16 PM:
>
>
>
> On 2020/2/7 2:16 PM, Chiung-Ming Huang wrote:
> > Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Fri, Feb 7, 2020 at 12:00 PM:
> >>
> >> All of these subvolumes have a missing root dir. That's not good either.
> >> I guess btrfs-restore is your last chance, or an RO mount with my
> >> rescue=skipbg patchset:
> >> https://patchwork.kernel.org/project/linux-btrfs/list/?series=170715
> >>
> >
> > Is it possible to reuse the original disks to store the restored data safely?
> > I would like to first restore the data of /dev/bcache3 to the new btrfs RAID0,
> > and then add it to the new btrfs RAID0. Does `btrfs restore` need metadata or
> > anything else on /dev/bcache3 to restore /dev/bcache2 and /dev/bcache4?
>
> Devid 1 (bcache2) seems OK to be missing, as all its data and metadata
> are in RAID1.
>
> But it's strongly recommended to test without wiping bcache2 first, to make
> sure btrfs-restore can salvage enough data, and only then wipe bcache2.
>
> Thanks,
> Qu

Is it possible to shrink the size of bcache2 btrfs without making
everything worse?
I need more disk space but I still need bcache2 itself.

Regards,
Chiung-Ming Huang




* Re: How to Fix 'Error: could not find extent items for root 257'?
  2020-02-10  6:50                   ` Chiung-Ming Huang
@ 2020-02-10  7:03                     ` Qu Wenruo
  2020-02-15  3:47                       ` Chiung-Ming Huang
  0 siblings, 1 reply; 16+ messages in thread
From: Qu Wenruo @ 2020-02-10  7:03 UTC (permalink / raw)
  To: Chiung-Ming Huang; +Cc: Btrfs



On 2020/2/10 2:50 PM, Chiung-Ming Huang wrote:
> Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Fri, Feb 7, 2020 at 3:16 PM:
>>
>>
>>
>> On 2020/2/7 2:16 PM, Chiung-Ming Huang wrote:
>>> Qu Wenruo <quwenruo.btrfs@gmx.com> wrote on Fri, Feb 7, 2020 at 12:00 PM:
>>>>
>>>> All of these subvolumes have a missing root dir. That's not good either.
>>>> I guess btrfs-restore is your last chance, or an RO mount with my
>>>> rescue=skipbg patchset:
>>>> https://patchwork.kernel.org/project/linux-btrfs/list/?series=170715
>>>>
>>>
>>> Is it possible to reuse the original disks to store the restored data safely?
>>> I would like to first restore the data of /dev/bcache3 to the new btrfs RAID0,
>>> and then add it to the new btrfs RAID0. Does `btrfs restore` need metadata or
>>> anything else on /dev/bcache3 to restore /dev/bcache2 and /dev/bcache4?
>>
>> Devid 1 (bcache2) seems OK to be missing, as all its data and metadata
>> are in RAID1.
>>
>> But it's strongly recommended to test without wiping bcache2 first, to make
>> sure btrfs-restore can salvage enough data, and only then wipe bcache2.
>>
>> Thanks,
>> Qu
> 
> Is it possible to shrink the size of bcache2 btrfs without making
> everything worse?
> I need more disk space but I still need bcache2 itself.

That is kind of possible, but please keep in mind that, even in the best
case, it still needs to write a (very small) amount of metadata into the
fs, so I can't guarantee that it won't make things worse, or that it is even
possible without the fs falling back to RO.

You need to dump the device extent tree to determine where the last
dev extent ends for each device, then shrink to that size.

Here is an example:

# btrfs ins dump-tree -t dev /dev/nvme/btrfs
...

        item 6 key (1 DEV_EXTENT 2169503744) itemoff 15955 itemsize 48
                dev extent chunk_tree 3
                chunk_objectid 256 chunk_offset 2169503744 length 1073741824
                chunk_tree_uuid 00000000-0000-0000-0000-000000000000

In the key, 1 is the devid and 2169503744 is the offset where the device
extent starts. 1073741824 is the length of the device extent.

In the above case, the device with devid 1 can be resized to 2169503744 +
1073741824 = 3243245568 bytes without relocating any data/metadata.

# time btrfs fi resize 1:3243245568 /mnt/btrfs/
Resize '/mnt/btrfs/' of '1:3243245568'

real    0m0.013s
user    0m0.006s
sys     0m0.004s

And the dump-tree shows the same last device extent:
...
        item 6 key (1 DEV_EXTENT 2169503744) itemoff 15955 itemsize 48
                dev extent chunk_tree 3
                chunk_objectid 256 chunk_offset 2169503744 length 1073741824
                chunk_tree_uuid 00000000-0000-0000-0000-000000000000
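The arithmetic above can be sketched as a small script. It assumes the dump-tree text format shown in this message, takes the highest `start + length` over all DEV_EXTENT items of each devid, and prints a matching resize command line. The function name `min_shrink_sizes` is made up for illustration, and the script only prints the command; it does not run anything.

```python
import re

# One DEV_EXTENT item, copied from the dump-tree output in this message.
DUMP = """\
        item 6 key (1 DEV_EXTENT 2169503744) itemoff 15955 itemsize 48
                dev extent chunk_tree 3
                chunk_objectid 256 chunk_offset 2169503744 length 1073741824
                chunk_tree_uuid 00000000-0000-0000-0000-000000000000
"""

KEY_RE = re.compile(r"key \((\d+) DEV_EXTENT (\d+)\)")
LEN_RE = re.compile(r"chunk_offset \d+ length (\d+)")

def min_shrink_sizes(text):
    """Map devid -> end offset (start + length) of its last dev extent."""
    ends = {}
    pending = None  # (devid, start) of the item currently being read
    for line in text.splitlines():
        m = KEY_RE.search(line)
        if m:
            pending = (int(m.group(1)), int(m.group(2)))
            continue
        m = LEN_RE.search(line)
        if m and pending is not None:
            devid, start = pending
            end = start + int(m.group(1))
            ends[devid] = max(ends.get(devid, 0), end)
            pending = None
    return ends

sizes = min_shrink_sizes(DUMP)
for devid, size in sorted(sizes.items()):
    # <mountpoint> is a placeholder; fill it in by hand.
    print(f"btrfs filesystem resize {devid}:{size} <mountpoint>")
```

With a full dump-tree output as input, every device gets one line, and the value is the smallest size that keeps all of its existing dev extents in place.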

(Maybe it's a good time to implement something like a fast shrink for btrfs-progs.)

Of course, after resizing btrfs, you still need to resize bcache, but
that's not related to btrfs (and I am not familiar with bcache either).

Thanks,
Qu




* Re: How to Fix 'Error: could not find extent items for root 257'?
  2020-02-10  7:03                     ` Qu Wenruo
@ 2020-02-15  3:47                       ` Chiung-Ming Huang
  2020-02-15  4:29                         ` Qu Wenruo
  0 siblings, 1 reply; 16+ messages in thread
From: Chiung-Ming Huang @ 2020-02-15  3:47 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Btrfs

Hi Qu

Thanks for your reply. That's really helpful. BTW, I just read this URL and
the mail thread linked in it: https://unix.stackexchange.com/a/345972
It seems to say that if a raid1 is degraded, even if it is mounted rw, no
operations other than btrfs-replace or btrfs-balance should be applied to it.

Does that mean a degraded raid1 should not run btrfs-replace/balance and the
server's original rw services at the same time?

For example, I put a PostgreSQL DB on btrfs raid1 and I thought one of the two
raid1 copies was my backup. Even if I lost one copy, the service could still keep
running from the other one immediately. Okay, maybe not immediately. I need
to reboot. But waiting 24 hours or longer, depending on the size of the data,
for btrfs-replace/balance to complete does not seem to be a good idea.

Regards,
Chiung-Ming Huang





* Re: How to Fix 'Error: could not find extent items for root 257'?
  2020-02-15  3:47                       ` Chiung-Ming Huang
@ 2020-02-15  4:29                         ` Qu Wenruo
  0 siblings, 0 replies; 16+ messages in thread
From: Qu Wenruo @ 2020-02-15  4:29 UTC (permalink / raw)
  To: Chiung-Ming Huang; +Cc: Btrfs



On 2020/2/15 11:47 AM, Chiung-Ming Huang wrote:
> Hi Qu
> 
> Thanks for your reply. That's really helpful. BTW, I just read this URL and
> the mail thread linked in it: https://unix.stackexchange.com/a/345972
> It seems to say that if a raid1 is degraded, even if it is mounted rw, no
> operations other than btrfs-replace or btrfs-balance should be applied to it.

That would be the best case.

> 
> Does that mean a degraded raid1 should not run btrfs-replace/balance and the
> server's original rw services at the same time?

No. As long as the fs stays mounted, a degraded RAID1 can in fact be
pretty safe.
At least in my experience, all the problems happen when we try to mount the
fs again using a mix of up-to-date disks and an out-of-date disk.

For a running degraded fs, btrfs knows which device is missing, so it just
submits reads/writes to the existing devices, and replace/balance can both
handle that case.

> 
> For example, I put a PostgreSQL DB on btrfs raid1 and I thought one of the two
> raid1 copies was my backup. Even if I lost one copy, the service could still keep
> running from the other one immediately. Okay, maybe not immediately. I need
> to reboot.

You'd better not reboot, or at least not reboot directly into normal
running state with the bad disk attached.

> But waiting 24 hours or longer, depending on the size of the data,
> for btrfs-replace/balance to complete does not seem to be a good idea.

Btrfs-replace works much like scrub: it only copies/verifies the data
on a certain disk. It does not rewrite/verify the whole fs, but I
understand that it can be very slow.

For btrfs-replace, you can just run the replace in the background.
Replace has extra protection to keep data from going out of sync.

In short, in your case it looks like the problem happened somewhere among
your degraded mounts, which screwed up some metadata blocks because the
metadata went out of sync.

To avoid such problems, it may be a good idea to let btrfs use the
superblock generation to find out which device is out of date, and then
re-silver itself or at least avoid reading data/metadata from the old device.
But that feature will need extra consideration before we even try to
implement it.

So currently my only practical recommendation is: if you find one
disk failing, please remove it completely and make sure it never shows
up again before you remount the fs.
Then you can safely replace/remount.
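As a rough illustration of the superblock-generation idea (not an existing btrfs feature), the comparison could look like the sketch below. The device names and generation numbers are hypothetical; in practice the numbers would be read by hand from the `generation` field that `btrfs inspect-internal dump-super` prints for each member device. The helper name `stale_devices` is made up.

```python
def stale_devices(generations):
    """Return devices whose superblock generation is older than the newest."""
    newest = max(generations.values())
    return sorted(dev for dev, gen in generations.items() if gen < newest)

# Hypothetical values: two devices are current, one was left behind.
gens = {
    "/dev/sdX": 1452673,
    "/dev/sdY": 1452673,
    "/dev/sdZ": 1382980,
}

stale = stale_devices(gens)
print("devices that look out of date:", stale)
```

A device flagged this way is one to keep detached until the filesystem has been repaired or the device has been replaced.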

Thanks,
Qu


end of thread, other threads:[~2020-02-15  4:29 UTC | newest]

Thread overview: 16+ messages
2020-02-05 10:18 How to Fix 'Error: could not find extent items for root 257'? Chiung-Ming Huang
2020-02-05 10:29 ` Qu Wenruo
2020-02-05 15:29   ` Chiung-Ming Huang
2020-02-05 19:38     ` Chris Murphy
2020-02-06  3:11       ` Chiung-Ming Huang
     [not found]   ` <CAEOGEKHf9F0VM=au-42MwD63_V8RwtqiskV0LsGpq-c=J_qyPg@mail.gmail.com>
     [not found]     ` <f2ad6b4f-b011-8954-77e1-5162c84f7c1f@gmx.com>
2020-02-06  4:13       ` Chiung-Ming Huang
2020-02-06  4:35         ` Qu Wenruo
2020-02-06  6:50           ` Chiung-Ming Huang
2020-02-07  3:49           ` Chiung-Ming Huang
2020-02-07  4:00             ` Qu Wenruo
2020-02-07  6:16               ` Chiung-Ming Huang
2020-02-07  7:16                 ` Qu Wenruo
2020-02-10  6:50                   ` Chiung-Ming Huang
2020-02-10  7:03                     ` Qu Wenruo
2020-02-15  3:47                       ` Chiung-Ming Huang
2020-02-15  4:29                         ` Qu Wenruo