All of lore.kernel.org
* Crashes running btrfs scrub
@ 2018-03-15 18:58 Mike Stevens
  2018-03-15 20:32 ` waxhead
  2018-03-15 21:15 ` Chris Murphy
  0 siblings, 2 replies; 24+ messages in thread
From: Mike Stevens @ 2018-03-15 18:58 UTC (permalink / raw)
  To: linux-btrfs

First, the required information
 
~ $ uname -a
Linux auswscs9903 3.10.0-693.21.1.el7.x86_64 #1 SMP Wed Mar 7 19:03:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
 ~ $ btrfs --version
btrfs-progs v4.9.1
 ~ $ sudo btrfs fi show
Label: none  uuid: 77afc2bb-f7a8-4ce9-9047-c031f7571150
        Total devices 34 FS bytes used 89.06TiB
        devid    1 size 5.46TiB used 4.72TiB path /dev/sdb
        devid    2 size 5.46TiB used 4.72TiB path /dev/sda
        devid    3 size 5.46TiB used 4.72TiB path /dev/sdx
        devid    4 size 5.46TiB used 4.72TiB path /dev/sdt
        devid    5 size 5.46TiB used 4.72TiB path /dev/sdz
        devid    6 size 5.46TiB used 4.72TiB path /dev/sdv
        devid    7 size 5.46TiB used 4.72TiB path /dev/sdab
        devid    8 size 5.46TiB used 4.72TiB path /dev/sdw
        devid    9 size 5.46TiB used 4.72TiB path /dev/sdad
        devid   10 size 5.46TiB used 4.72TiB path /dev/sdaa
        devid   11 size 5.46TiB used 4.72TiB path /dev/sdr
        devid   12 size 5.46TiB used 4.72TiB path /dev/sdy
        devid   13 size 5.46TiB used 4.72TiB path /dev/sdj
        devid   14 size 5.46TiB used 4.72TiB path /dev/sdaf
        devid   15 size 5.46TiB used 4.72TiB path /dev/sdag
        devid   16 size 5.46TiB used 4.72TiB path /dev/sdh
        devid   17 size 5.46TiB used 4.72TiB path /dev/sdu
        devid   18 size 5.46TiB used 4.72TiB path /dev/sdac
        devid   19 size 5.46TiB used 4.72TiB path /dev/sdk
        devid   20 size 5.46TiB used 4.72TiB path /dev/sdah
        devid   21 size 5.46TiB used 4.72TiB path /dev/sdp
        devid   22 size 5.46TiB used 4.72TiB path /dev/sdae
        devid   23 size 5.46TiB used 4.72TiB path /dev/sdc
        devid   24 size 5.46TiB used 4.72TiB path /dev/sdl
        devid   25 size 5.46TiB used 4.72TiB path /dev/sdo
        devid   26 size 5.46TiB used 4.72TiB path /dev/sdd
        devid   27 size 5.46TiB used 4.72TiB path /dev/sdi
        devid   28 size 5.46TiB used 4.72TiB path /dev/sdn
        devid   29 size 5.46TiB used 4.72TiB path /dev/sds
        devid   30 size 5.46TiB used 4.72TiB path /dev/sdm
        devid   31 size 5.46TiB used 4.72TiB path /dev/sdf
        devid   32 size 5.46TiB used 4.72TiB path /dev/sdq
        devid   33 size 5.46TiB used 4.72TiB path /dev/sdg
        devid   34 size 5.46TiB used 4.72TiB path /dev/sde

 ~ $ sudo btrfs fi df /gpfs_backups
Data, RAID6: total=150.82TiB, used=88.88TiB
System, RAID6: total=512.00MiB, used=19.08MiB
Metadata, RAID6: total=191.00GiB, used=187.38GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

I was running a btrfs balance, which crashed.  Since then, I cannot do anything on the filesystem that does any real i/o without it quickly going read-only.  Running btrfs scrub results in this crash:

Mar 15 11:10:43 auswscs9903 kernel: WARNING: CPU: 1 PID: 4588 at fs/btrfs/extent-tree.c:10367 btrfs_create_pending_block_groups+0x23e/0x240 [btrfs]
Mar 15 11:10:43 auswscs9903 kernel: Modules linked in: nfsv3 nfs fscache mpt3sas mpt2sas raid_class mptctl mptbase binfmt_misc ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 xt_comment xt_multiport xt_conntrack nf_conntrack libcrc32c iptable_filter dm_mirror dm_region_hash dm_log dm_mod iTCO_wdt iTCO_vendor_support btrfs sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd raid6_pq xor pcspkr joydev ses enclosure scsi_transport_sas sg mei_me i2c_i801 mei lpc_ich ioatdma shpchp wmi ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad nfsd nfs_acl lockd auth_rpcgss grace sunrpc ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic ast drm_kms_helper syscopyarea sysfillrect
Mar 15 11:10:43 auswscs9903 kernel: sysimgblt fb_sys_fops ttm drm ahci igb libahci libata crct10dif_pclmul crct10dif_common crc32c_intel megaraid_sas ptp pps_core i2c_algo_bit myri10ge i2c_core dca
Mar 15 11:10:43 auswscs9903 kernel: CPU: 1 PID: 4588 Comm: btrfs Tainted: G        W      ------------   3.10.0-693.21.1.el7.x86_64 #1
Mar 15 11:10:43 auswscs9903 kernel: Hardware name: Supermicro Super Server/X10DRL-i, BIOS 1.1b 09/11/2015
Mar 15 11:10:43 auswscs9903 kernel: Call Trace:
Mar 15 11:10:43 auswscs9903 kernel: [<ffffffff816ae7c8>] dump_stack+0x19/0x1b
Mar 15 11:10:43 auswscs9903 kernel: [<ffffffff8108ae58>] __warn+0xd8/0x100
Mar 15 11:10:43 auswscs9903 kernel: [<ffffffff8108aedf>] warn_slowpath_fmt+0x5f/0x80
Mar 15 11:10:43 auswscs9903 kernel: [<ffffffffc0ac2fd2>] ? btrfs_finish_chunk_alloc+0x222/0x5e0 [btrfs]
Mar 15 11:10:43 auswscs9903 kernel: [<ffffffffc0a7cb7e>] btrfs_create_pending_block_groups+0x23e/0x240 [btrfs]
Mar 15 11:10:43 auswscs9903 kernel: [<ffffffffc0a7d215>] do_chunk_alloc+0x2f5/0x330 [btrfs]
Mar 15 11:10:43 auswscs9903 kernel: [<ffffffffc0a816ee>] btrfs_inc_block_group_ro+0x18e/0x1b0 [btrfs]
Mar 15 11:10:43 auswscs9903 kernel: [<ffffffffc0afad47>] scrub_enumerate_chunks+0x207/0x6a0 [btrfs]
Mar 15 11:10:43 auswscs9903 kernel: [<ffffffff810c79ec>] ? try_to_wake_up+0x18c/0x350
Mar 15 11:10:43 auswscs9903 kernel: [<ffffffff816b2c00>] ? __ww_mutex_lock+0x40/0xa0
Mar 15 11:10:43 auswscs9903 kernel: [<ffffffffc0afc5f3>] btrfs_scrub_dev+0x233/0x5a0 [btrfs]
Mar 15 11:10:43 auswscs9903 kernel: [<ffffffffc0ad2a00>] ? btrfs_ioctl+0xdc0/0x2d30 [btrfs]
Mar 15 11:10:43 auswscs9903 kernel: [<ffffffffc0ad2a59>] btrfs_ioctl+0xe19/0x2d30 [btrfs]
Mar 15 11:10:43 auswscs9903 kernel: [<ffffffffc026b1f1>] ? ext4_filemap_fault+0x41/0x50 [ext4]
Mar 15 11:10:43 auswscs9903 kernel: [<ffffffff81186deb>] ? unlock_page+0x2b/0x30
Mar 15 11:10:43 auswscs9903 kernel: [<ffffffff811b1f16>] ? do_read_fault.isra.44+0xe6/0x130
Mar 15 11:10:43 auswscs9903 kernel: [<ffffffff811e4629>] ? kmem_cache_alloc_node+0x109/0x200
Mar 15 11:10:43 auswscs9903 kernel: [<ffffffff811b6781>] ? handle_mm_fault+0x691/0xfa0
Mar 15 11:10:43 auswscs9903 kernel: [<ffffffff81121930>] ? audit_filter_rules.isra.8+0x280/0xf90
Mar 15 11:10:43 auswscs9903 kernel: [<ffffffff81219e90>] do_vfs_ioctl+0x350/0x560
Mar 15 11:10:43 auswscs9903 kernel: [<ffffffff8121a141>] SyS_ioctl+0xa1/0xc0
Mar 15 11:10:43 auswscs9903 kernel: [<ffffffff816c0715>] system_call_fastpath+0x1c/0x21

As far as I can tell the hardware seems fine.  I've updated from CentOS 7.2 to the most current version, but the problem persists.  What is the best way to address this?

Freundliche Grüße / Best regards,
 
Mike Stevens
Senior Systems Administrator - SC3


________________________________________________________________________
The information contained in this e-mail is for the exclusive use of the 
intended recipient(s) and may be confidential, proprietary, and/or 
legally privileged.  Inadvertent disclosure of this message does not 
constitute a waiver of any privilege.  If you receive this message in 
error, please do not directly or indirectly use, print, copy, forward,
or disclose any part of this message.  Please also delete this e-mail 
and all copies and notify the sender.  Thank you. 
________________________________________________________________________

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Crashes running btrfs scrub
  2018-03-15 18:58 Crashes running btrfs scrub Mike Stevens
@ 2018-03-15 20:32 ` waxhead
  2018-03-15 21:07   ` Mike Stevens
  2018-03-15 21:15 ` Chris Murphy
  1 sibling, 1 reply; 24+ messages in thread
From: waxhead @ 2018-03-15 20:32 UTC (permalink / raw)
  To: Mike Stevens, linux-btrfs

Mike Stevens wrote:
> First, the required information
>   
> ~ $ uname -a
> Linux auswscs9903 3.10.0-693.21.1.el7.x86_64 #1 SMP Wed Mar 7 19:03:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
>   ~ $ btrfs --version
> btrfs-progs v4.9.1
>   ~ $ sudo btrfs fi show
> Label: none  uuid: 77afc2bb-f7a8-4ce9-9047-c031f7571150
>          Total devices 34 FS bytes used 89.06TiB
>          [devids 1-34 snipped; 34 devices of 5.46TiB each, as in the original message]
> 
>   ~ $ sudo btrfs fi df /gpfs_backups
> Data, RAID6: total=150.82TiB, used=88.88TiB
> System, RAID6: total=512.00MiB, used=19.08MiB
> Metadata, RAID6: total=191.00GiB, used=187.38GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
> 
That's a hell of a filesystem. RAID5 and RAID6 are unstable and should 
not be used for anything but throwaway data. You will be happy that you 
value your data enough to have backups... because all sensible sysadmins 
do have backups, correct?! (Do read just about any of Duncan's replies - 
he describes this better than I can).

Also, if you are running kernel ***3.10***, that is nearly antique in 
btrfs terms. As a word of advice, try a more recent kernel (there have 
been lots of patches to raid5/6 since kernel 4.9), and if you ever get 
the filesystem running again then *at least* rebalance the metadata to 
raid1 as quickly as possible, since the raid1 profile (unlike raid5 or 
raid6) works really well.
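For reference, converting just the metadata (and system) chunks to raid1 is a single balance command. This is only a sketch, assuming the /gpfs_backups mount point from the original post and current btrfs-progs option names:

```shell
# Convert metadata and system chunks to raid1; data stays raid6.
# -f is required because raid1 tolerates fewer device failures than raid6,
# so the conversion is treated as reducing redundancy.
btrfs balance start -mconvert=raid1 -sconvert=raid1 -f /gpfs_backups

# Check progress from another shell:
btrfs balance status /gpfs_backups
```

With ~187GiB of metadata in use this will churn for a while, but it only rewrites metadata chunks, not the 89TiB of data.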

PS! I'm not a BTRFS dev, so don't run away just yet. Someone else may 
magically help you recover. Best of luck!

- Waxhead


* RE: Crashes running btrfs scrub
  2018-03-15 20:32 ` waxhead
@ 2018-03-15 21:07   ` Mike Stevens
  2018-03-15 21:22     ` Chris Murphy
  2018-03-16 21:39     ` Liu Bo
  0 siblings, 2 replies; 24+ messages in thread
From: Mike Stevens @ 2018-03-15 21:07 UTC (permalink / raw)
  To: linux-btrfs


> That's a hell of a filesystem. RAID5 and RAID6 are unstable and should 
> not be used for anything but throwaway data. You will be happy that you 
> value your data enough to have backups... because all sensible sysadmins 
> do have backups, correct?! (Do read just about any of Duncan's replies - 
> he describes this better than I can).

It's a backup of a backup of a very large filesystem.  It's nothing I want 
to have to sync again, but it would not be a critical data loss if I did.

> Also, if you are running kernel ***3.10***, that is nearly antique in 
> btrfs terms. As a word of advice, try a more recent kernel (there have 
> been lots of patches to raid5/6 since kernel 4.9), and if you ever get 
> the filesystem running again then *at least* rebalance the metadata to 
> raid1 as quickly as possible, since the raid1 profile (unlike raid5 or 
> raid6) works really well.

Not being in the kernel space much, I did not realize how far behind I was.
I've updated to 4.15.10, which at least produces a different crash.

Mar 15 14:03:06 auswscs9903 kernel: WARNING: CPU: 6 PID: 2720 at fs/btrfs/extent-tree.c:10192 btrfs_create_pending_block_groups+0x1f3/0x260 [btrfs]
Mar 15 14:03:06 auswscs9903 kernel: Modules linked in: nfsv3 nfs fscache mpt3sas raid_class mptctl mptbase binfmt_misc ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 xt_comment xt_multiport xt_conntrack nf_conntrack libcrc32c iptable_filter dm_mirror dm_region_hash dm_log dm_mod dax iTCO_wdt iTCO_vendor_support btrfs ses enclosure scsi_transport_sas xor zstd_decompress zstd_compress xxhash raid6_pq sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd glue_helper cryptd intel_cstate lpc_ich sg intel_rapl_perf pcspkr joydev input_leds i2c_i801 mfd_core mei_me mei ipmi_si ipmi_devintf shpchp wmi ioatdma ipmi_msghandler acpi_power_meter acpi_pad nfsd nfs_acl lockd grace auth_rpcgss sunrpc ip_tables ext4 mbcache
Mar 15 14:03:06 auswscs9903 kernel: jbd2 sd_mod crc32c_intel ast drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ahci ttm libahci igb ptp drm pps_core i2c_algo_bit libata myri10ge megaraid_sas dca
Mar 15 14:03:06 auswscs9903 kernel: CPU: 6 PID: 2720 Comm: btrfs Not tainted 4.15.10-1.el7.elrepo.x86_64 #1
Mar 15 14:03:06 auswscs9903 kernel: Hardware name: Supermicro Super Server/X10DRL-i, BIOS 1.1b 09/11/2015
Mar 15 14:03:06 auswscs9903 kernel: RIP: 0010:btrfs_create_pending_block_groups+0x1f3/0x260 [btrfs]
Mar 15 14:03:06 auswscs9903 kernel: RSP: 0018:ffffc90009c2fae8 EFLAGS: 00010282
Mar 15 14:03:06 auswscs9903 kernel: RAX: 0000000000000000 RBX: 00000000ffffffe5 RCX: 0000000000000006
Mar 15 14:03:06 auswscs9903 kernel: RDX: 0000000000000000 RSI: 0000000000000092 RDI: ffff88103f3969d0
Mar 15 14:03:06 auswscs9903 kernel: RBP: ffffc90009c2fb68 R08: 0000000000000000 R09: 0000000000000525
Mar 15 14:03:06 auswscs9903 kernel: R10: 0000000000000004 R11: 0000000000000524 R12: ffff88100d7c7000
Mar 15 14:03:06 auswscs9903 kernel: R13: ffff880fc6985800 R14: ffff88100d7c6f48 R15: ffff880fc6985920
Mar 15 14:03:06 auswscs9903 kernel: FS:  00007fc1564b6700(0000) GS:ffff88103f380000(0000) knlGS:0000000000000000
Mar 15 14:03:06 auswscs9903 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 15 14:03:06 auswscs9903 kernel: CR2: 00000000016a5330 CR3: 0000000fc6310005 CR4: 00000000001606e0
Mar 15 14:03:06 auswscs9903 kernel: Call Trace:
Mar 15 14:03:06 auswscs9903 kernel: do_chunk_alloc+0x269/0x2e0 [btrfs]
Mar 15 14:03:06 auswscs9903 kernel: ? start_transaction+0xa7/0x450 [btrfs]
Mar 15 14:03:06 auswscs9903 kernel: btrfs_inc_block_group_ro+0x142/0x160 [btrfs]
Mar 15 14:03:06 auswscs9903 kernel: scrub_enumerate_chunks+0x1ad/0x680 [btrfs]
Mar 15 14:03:06 auswscs9903 kernel: ? try_to_wake_up+0x59/0x480
Mar 15 14:03:06 auswscs9903 kernel: btrfs_scrub_dev+0x21d/0x540 [btrfs]
Mar 15 14:03:06 auswscs9903 kernel: ? __check_object_size+0x159/0x190
Mar 15 14:03:06 auswscs9903 kernel: ? _copy_from_user+0x33/0x70
Mar 15 14:03:06 auswscs9903 kernel: btrfs_ioctl+0xf20/0x2110 [btrfs]
Mar 15 14:03:06 auswscs9903 kernel: ? audit_filter_rules.isra.9+0x241/0xe80
Mar 15 14:03:06 auswscs9903 kernel: do_vfs_ioctl+0xaa/0x610
Mar 15 14:03:06 auswscs9903 kernel: ? __audit_syscall_entry+0xac/0xf0
Mar 15 14:03:06 auswscs9903 kernel: ? syscall_trace_enter+0x1cd/0x2b0
Mar 15 14:03:06 auswscs9903 kernel: SyS_ioctl+0x79/0x90
Mar 15 14:03:06 auswscs9903 kernel: do_syscall_64+0x79/0x1b0
Mar 15 14:03:06 auswscs9903 kernel: entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Mar 15 14:03:06 auswscs9903 kernel: RIP: 0033:0x7fc1565a6107
Mar 15 14:03:06 auswscs9903 kernel: RSP: 002b:00007fc1564b5d58 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Mar 15 14:03:06 auswscs9903 kernel: RAX: ffffffffffffffda RBX: 000000000168a3a0 RCX: 00007fc1565a6107
Mar 15 14:03:06 auswscs9903 kernel: RDX: 000000000168a3a0 RSI: 00000000c400941b RDI: 0000000000000003
Mar 15 14:03:06 auswscs9903 kernel: RBP: 0000000000000000 R08: 00007fc1564b6700 R09: 0000000000000000
Mar 15 14:03:06 auswscs9903 kernel: R10: 00007fc1564b6700 R11: 0000000000000246 R12: 00007fc1564b64e0
Mar 15 14:03:06 auswscs9903 kernel: R13: 00007fc1564b69c0 R14: 00007fc1564b6700 R15: 0000000000000001
Mar 15 14:03:06 auswscs9903 kernel: Code: 00 e9 5d ff ff ff 49 8b 44 24 60 f0 0f ba a8 d8 cd 00 00 02 72 17 83 fb fb 74 2d 89 de 48 c7 c7 d8 68 77 a0 31 c0 e8 cd 8f 9b e0 <0f> 0b 89 d9 ba d0 27 00 00 48 c7 c6 60 f7 76 a0 4c 89 e7 e8 18



* Re: Crashes running btrfs scrub
  2018-03-15 18:58 Crashes running btrfs scrub Mike Stevens
  2018-03-15 20:32 ` waxhead
@ 2018-03-15 21:15 ` Chris Murphy
  1 sibling, 0 replies; 24+ messages in thread
From: Chris Murphy @ 2018-03-15 21:15 UTC (permalink / raw)
  To: Mike Stevens; +Cc: linux-btrfs

On Thu, Mar 15, 2018 at 12:58 PM, Mike Stevens
<michael.stevens@bayer.com> wrote:
> First, the required information
>
> ~ $ uname -a
> Linux auswscs9903 3.10.0-693.21.1.el7.x86_64

For a kernel this old you really need to get support from the distro.
This list is upstream, and with such an old kernel the answer from
pretty much any upstream - XFS, ext4, or Btrfs - is the same: if you
want support for a distro kernel, go to the distro. The typical
upstream response is to try the current stable kernel, which is
4.15.10; if you can reproduce the problem there too, then it's a bug.
3.10 isn't even a longterm kernel getting backports, so it's really a
non-starter. Sorry. You can get newer kernels prebuilt from
elrepo.org. They have 4.14.121 and 4.15.10.

http://elrepo.org/linux/kernel/el7/x86_64/RPMS/
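For what it's worth, the usual elrepo route on CentOS 7 looks roughly like this. It is a sketch: the repo release RPM filename and the kernel-ml/kernel-lt package names follow elrepo's conventions, so double-check them against elrepo.org before running:

```shell
# Import elrepo's signing key and install the repo definition
# (release RPM version is an example; check elrepo.org for the current one).
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
yum install https://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm

# kernel-ml is the mainline kernel; kernel-lt is the longterm branch.
yum --enablerepo=elrepo-kernel install kernel-ml
reboot
```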


>Data, RAID6: total=150.82TiB, used=88.88TiB
>System, RAID6: total=512.00MiB, used=19.08MiB
>Metadata, RAID6: total=191.00GiB, used=187.38GiB


Unfortunately no one is really supporting raid6 for production
purposes, distro or even upstream. Upstream is really your only
option, and you really need to be running a newer kernel because so
much in raid5 and raid6 has changed, even aside from Btrfs itself.
There are tens of thousands of lines of changes in the code since 3.18
(EL 7 Btrfs is not really based on the 3.10 tree; I think it's based
on the 3.18 tree, but I don't have Red Hat's decoder ring).


> I was running a btrfs balance, which crashed.  Since then, I cannot do anything on the filesystems that does any real i/o, or it quickly goes read only.

Mount it read-only and update the backups. Then update the kernel and
btrfs-progs. You can use the Fedora btrfs-progs package on EL 7.

Full listing
https://koji.fedoraproject.org/koji/packageinfo?packageID=6398

I recommend this package, only because I'm using it now on Fedora 28.
https://kojipkgs.fedoraproject.org//packages/btrfs-progs/4.15.1/1.fc28/x86_64/btrfs-progs-4.15.1-1.fc28.x86_64.rpm

For what it's worth, scrub is initiated and monitored by btrfs-progs
user space tools, but the real work is in the kernel code.
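Put together, a cautious order of operations might look like this. A sketch only: /dev/sdb is one member device from the fi show output, and the rsync destination is a placeholder:

```shell
# 1. Mount read-only (and skip any paused balance) and refresh the backup
#    before touching anything else.
mount -o ro,skip_balance /dev/sdb /gpfs_backups
rsync -aHAX /gpfs_backups/ /mnt/backup-target/   # destination is a placeholder

# 2. After upgrading kernel and btrfs-progs, remount read-write and retry
#    the scrub in the foreground (-B) with per-device stats (-d).
btrfs scrub start -Bd /gpfs_backups
btrfs scrub status /gpfs_backups
```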


> Running btrfs scrub results in this crash:
>
> Mar 15 11:10:43 auswscs9903 kernel: WARNING: CPU: 1 PID: 4588 at fs/btrfs/extent-tree.c:10367 btrfs_create_pending_block_groups+0x23e/0x240 [btrfs]
> [rest of the trace from the first message snipped]


It looks like it's crashing while allocating new chunks, and then
maybe gets confused about what it's supposed to scrub. The
ext4_filemap_fault is curious. I can't really parse this trace, but I
doubt this is a hardware bug. I think it's a legit bug in the code,
and it's almost certainly fixed in newer kernels, but the only way to
know for sure is to upgrade. 4.14.121 would be the minimum worth
testing, so flip a coin between 4.15.10 and 4.14.121.



-- 
Chris Murphy


* Re: Crashes running btrfs scrub
  2018-03-15 21:07   ` Mike Stevens
@ 2018-03-15 21:22     ` Chris Murphy
       [not found]       ` <6b4f2b33edb44f1ea8cef47ae68960af@MOXDE7.na.bayer.cnb>
  2018-03-16 21:39     ` Liu Bo
  1 sibling, 1 reply; 24+ messages in thread
From: Chris Murphy @ 2018-03-15 21:22 UTC (permalink / raw)
  To: Mike Stevens, Qu Wenruo; +Cc: linux-btrfs

On Thu, Mar 15, 2018 at 3:07 PM, Mike Stevens <michael.stevens@bayer.com> wrote:

> Mar 15 14:03:06 auswscs9903 kernel: WARNING: CPU: 6 PID: 2720 at fs/btrfs/extent-tree.c:10192 btrfs_create_pending_block_groups+0x1f3/0x260 [btrfs]
> [full 4.15.10 trace snipped; see the previous message]
>

Can you post a more complete dmesg rather than snipping it? Is there
anything device- or Btrfs-related in the 5 minutes before this trace
happens? And is it still going read-only?

Also, hopefully the SCT ERC on all these drives is less than the SCSI
driver's default timeout of 30 seconds. You can check with 'smartctl
-l scterc /dev/'. This is critical to ensuring sector failures are
properly fixed up by Btrfs. And honestly I'm not really certain we had
fixup code for raid6 in the 3.18 code, so it's possible some problems
have not been getting fixed up. Any enterprise or NAS drive will have
something like 70 deciseconds for SCT ERC, which is fine; anything
less than 30 seconds is OK. The fixup code is definitely in 4.14 (I
think it's been there since 4.1 or 4.4 for raid56, and since ancient
times for other Btrfs profiles).
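Concretely, checking (and, where the firmware supports it, setting) SCT ERC, plus the kernel-side SCSI timeout for comparison, looks like this, with /dev/sda standing in for each member device:

```shell
# Read the drive's error-recovery-control timers (values in deciseconds).
smartctl -l scterc /dev/sda

# If supported, cap read/write error recovery at 7 seconds (70 deciseconds):
smartctl -l scterc,70,70 /dev/sda

# The SCSI layer's command timeout for the same device, in seconds
# (default is 30):
cat /sys/block/sda/device/timeout
```

Note that scterc settings are typically lost on power cycle, so they usually have to be reapplied at boot (e.g. via a udev rule or rc script).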

Can you do an offline btrfs check without repair? This will probably
take a while... it's a big file system.
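For the record, btrfs check is read-only unless --repair is given, and it can be pointed at any one member device of the array. A sketch using /dev/sdb from the fi show output:

```shell
# The filesystem must be unmounted; without --repair this only reads.
umount /gpfs_backups
# Log the output, since it can be very long on a filesystem this size.
btrfs check /dev/sdb 2>&1 | tee btrfs-check.log
```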

This needs to get the attention of a developer though.

-- 
Chris Murphy


* Re: Crashes running btrfs scrub
       [not found]       ` <6b4f2b33edb44f1ea8cef47ae68960af@MOXDE7.na.bayer.cnb>
@ 2018-03-16 16:00         ` Chris Murphy
  2018-03-16 16:17           ` Mike Stevens
  2018-03-18  2:23         ` Qu Wenruo
  1 sibling, 1 reply; 24+ messages in thread
From: Chris Murphy @ 2018-03-16 16:00 UTC (permalink / raw)
  To: Mike Stevens; +Cc: Qu Wenruo, linux-btrfs

Also, in the meantime, maybe the problem can be prevented by stopping
the balance from resuming at mount time. First umount, then mount with
-o skip_balance.
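Something like this (a sketch, reusing the mount point and one member device from earlier in the thread):

```shell
umount /gpfs_backups
# skip_balance leaves the interrupted balance paused rather than resuming it.
mount -o skip_balance /dev/sdb /gpfs_backups
# Then cancel the paused balance outright:
btrfs balance cancel /gpfs_backups
```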


Chris Murphy


* RE: Crashes running btrfs scrub
  2018-03-16 16:00         ` Chris Murphy
@ 2018-03-16 16:17           ` Mike Stevens
  2018-03-16 16:44             ` Chris Murphy
  0 siblings, 1 reply; 24+ messages in thread
From: Mike Stevens @ 2018-03-16 16:17 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Qu Wenruo, linux-btrfs


> Also, in the meantime, maybe the problem can be prevented by
> preventing the balance from resuming when mounting. First umount then
> mount with -o skip_balance.

Thanks for the suggestion, Chris.  I had already mounted it with skip_balance and then cancelled
the balance.  It will mount, but any significant i/o to the volume causes it to drop to r/o.

Mike Stevens

________________________________________________________________________
The information contained in this e-mail is for the exclusive use of the 
intended recipient(s) and may be confidential, proprietary, and/or 
legally privileged.  Inadvertent disclosure of this message does not 
constitute a waiver of any privilege.  If you receive this message in 
error, please do not directly or indirectly use, print, copy, forward,
or disclose any part of this message.  Please also delete this e-mail 
and all copies and notify the sender.  Thank you. 
________________________________________________________________________

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Crashes running btrfs scrub
  2018-03-16 16:17           ` Mike Stevens
@ 2018-03-16 16:44             ` Chris Murphy
  2018-03-16 18:53               ` Mike Stevens
  0 siblings, 1 reply; 24+ messages in thread
From: Chris Murphy @ 2018-03-16 16:44 UTC (permalink / raw)
  To: Mike Stevens; +Cc: Chris Murphy, Qu Wenruo, linux-btrfs

On Fri, Mar 16, 2018 at 10:17 AM, Mike Stevens
<michael.stevens@bayer.com> wrote:
>> Also, in the meantime, maybe the problem can be prevented by
>> preventing the balance from resuming when mounting. First umount then
>> mount with -o skip_balance.
>
> Thanks for the suggestion, Chris.  I had already mounted it with skip_balance and then cancelled
> the balance.  It will mount, but any significant I/O to the volume causes it to drop to read-only.

It's getting confused and doesn't want to corrupt the file system;
that's a good thing.

Basically it wants to create a block group, but this fails. In the
code, before it gets to the particular failure noted in the call
trace, there are multiple different attempts to allocate a block group
but those are also failing.

But here's the thing - the scrub is still being started or resumed. I
didn't think scrubs were resumed automatically, but you've got

>Mar 15 14:03:06 auswscs9903 kernel: scrub_enumerate_chunks+0x1ad/0x680 [btrfs]
>Mar 15 14:03:06 auswscs9903 kernel: btrfs_scrub_dev+0x21d/0x540 [btrfs]

and

>Mar 15 14:03:06 auswscs9903 kernel: BTRFS warning (device sdag): failed setting block group ro: -30


These are only found in scrub.c

Is there something starting the scrub right away at mount time? Is
there enough time to cancel scrub before it goes read only?

I definitely think there's a bug here somewhere, but it's taking more
than one thing at once to trigger it, so it's a kind of corner case or
it would have been caught sooner.

See if you can prevent scrub from being started, or if it's resuming
on its own for some reason then try to cancel it soon after mount,
hopefully before it goes ro.

Another thing you could try is mounting with nospace_cache. It's a
coin toss whether it will matter, but the fact that it's not able to
create pending bg's makes me wonder if something is awry with the
on-disk free space cache, and this would eliminate that possibility
without having to clear the cache. There's a performance penalty with
nospace_cache, but that's the least of the issues right now.





-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: Crashes running btrfs scrub
  2018-03-16 16:44             ` Chris Murphy
@ 2018-03-16 18:53               ` Mike Stevens
  0 siblings, 0 replies; 24+ messages in thread
From: Mike Stevens @ 2018-03-16 18:53 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Qu Wenruo, linux-btrfs



>>Mar 15 14:03:06 auswscs9903 kernel: BTRFS warning (device sdag): failed setting block group ro: -30


> These are only found in scrub.c

Interesting.  I'm running an offline btrfs check right now, so far extents and free space cache seem to have passed.
If that finishes successfully, I'll try resuming my rsync and see what happens. 

> Another thing you could try is mounting with nospace_cache. This is a
> coin toss if it will matter, but the fact it's not able to create
> pending bg's makes me wonder if possibly something is awry with the on
> disk free space cache, and this would eliminate that possibility
> without having to clear the cache. There's a performance penalty with
> nospace_cache but that's the least of the issues right now.

I'll try this as well if the above fails.  Thanks for the input.

-- 
Mike S


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Crashes running btrfs scrub
  2018-03-15 21:07   ` Mike Stevens
  2018-03-15 21:22     ` Chris Murphy
@ 2018-03-16 21:39     ` Liu Bo
  2018-03-16 21:46       ` Mike Stevens
  1 sibling, 1 reply; 24+ messages in thread
From: Liu Bo @ 2018-03-16 21:39 UTC (permalink / raw)
  To: Mike Stevens; +Cc: linux-btrfs

On Thu, Mar 15, 2018 at 2:07 PM, Mike Stevens <michael.stevens@bayer.com> wrote:
>> That's a hell of a filesystem. RAID5 and RAID5 is unstable and should
>> not be used for anything but throw away data. You will be happy that you
>> value you data enough to have backups.... because all sensible sysadmins
>> do have backups correct?! (Do read just about any of Duncan's replies -
>> he describes this better than me).
>
> It's a backup of a backup of a very large filesystem.  Nothing I want to sync again,
> but not a critical data loss if I have to.
>
>> Also if you are running kernel ***3.10*** that is nearly antique in
>> btrfs terms. As a word of advise, try a more recent kernel (there have
>> been lots of patches to raid5/6 since kernel 4.9) and if you ever get
>> the filesystem running again then *at least* rebalance the metadata to
>> raid1 as quickly as possible as the raid1 profile is (unlike raid5 or
>> raid6) working really well.
>
> Not being in the kernel space much, I did not realize how far behind I was.
> I've updated to 4.15.10 with a different crash at least.
>

Could you please paste the whole dmesg? It looks like it hit
btrfs_abort_transaction(), which should give us more information
about what goes wrong.

thanks,
liubo

> Mar 15 14:03:06 auswscs9903 kernel: WARNING: CPU: 6 PID: 2720 at fs/btrfs/extent-tree.c:10192 btrfs_create_pending_block_groups+0x1f3/0x260 [btrfs]
> Mar 15 14:03:06 auswscs9903 kernel: Modules linked in: nfsv3 nfs fscache mpt3sas raid_class mptctl mptbase binfmt_misc ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 xt_comment xt_multiport xt_conntrack nf_conntrack libcrc32c iptable_filter dm_mirror dm_region_hash dm_log dm_mod dax iTCO_wdt iTCO_vendor_support btrfs ses enclosure scsi_transport_sas xor zstd_decompress zstd_compress xxhash raid6_pq sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd glue_helper cryptd intel_cstate lpc_ich sg intel_rapl_perf pcspkr joydev input_leds i2c_i801 mfd_core mei_me mei ipmi_si ipmi_devintf shpchp wmi ioatdma ipmi_msghandler acpi_power_meter acpi_pad nfsd nfs_acl lockd grace auth_rpcgss sunrpc ip_tables ext4 mbcache
> Mar 15 14:03:06 auswscs9903 kernel: jbd2 sd_mod crc32c_intel ast drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ahci ttm libahci igb ptp drm pps_core i2c_algo_bit libata myri10ge megaraid_sas dca
> Mar 15 14:03:06 auswscs9903 kernel: CPU: 6 PID: 2720 Comm: btrfs Not tainted 4.15.10-1.el7.elrepo.x86_64 #1
> Mar 15 14:03:06 auswscs9903 kernel: Hardware name: Supermicro Super Server/X10DRL-i, BIOS 1.1b 09/11/2015
> Mar 15 14:03:06 auswscs9903 kernel: RIP: 0010:btrfs_create_pending_block_groups+0x1f3/0x260 [btrfs]
> Mar 15 14:03:06 auswscs9903 kernel: RSP: 0018:ffffc90009c2fae8 EFLAGS: 00010282
> Mar 15 14:03:06 auswscs9903 kernel: RAX: 0000000000000000 RBX: 00000000ffffffe5 RCX: 0000000000000006
> Mar 15 14:03:06 auswscs9903 kernel: RDX: 0000000000000000 RSI: 0000000000000092 RDI: ffff88103f3969d0
> Mar 15 14:03:06 auswscs9903 kernel: RBP: ffffc90009c2fb68 R08: 0000000000000000 R09: 0000000000000525
> Mar 15 14:03:06 auswscs9903 kernel: R10: 0000000000000004 R11: 0000000000000524 R12: ffff88100d7c7000
> Mar 15 14:03:06 auswscs9903 kernel: R13: ffff880fc6985800 R14: ffff88100d7c6f48 R15: ffff880fc6985920
> Mar 15 14:03:06 auswscs9903 kernel: FS:  00007fc1564b6700(0000) GS:ffff88103f380000(0000) knlGS:0000000000000000
> Mar 15 14:03:06 auswscs9903 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> Mar 15 14:03:06 auswscs9903 kernel: CR2: 00000000016a5330 CR3: 0000000fc6310005 CR4: 00000000001606e0
> Mar 15 14:03:06 auswscs9903 kernel: Call Trace:
> Mar 15 14:03:06 auswscs9903 kernel: do_chunk_alloc+0x269/0x2e0 [btrfs]
> Mar 15 14:03:06 auswscs9903 kernel: ? start_transaction+0xa7/0x450 [btrfs]
> Mar 15 14:03:06 auswscs9903 kernel: btrfs_inc_block_group_ro+0x142/0x160 [btrfs]
> Mar 15 14:03:06 auswscs9903 kernel: scrub_enumerate_chunks+0x1ad/0x680 [btrfs]
> Mar 15 14:03:06 auswscs9903 kernel: ? try_to_wake_up+0x59/0x480
> Mar 15 14:03:06 auswscs9903 kernel: btrfs_scrub_dev+0x21d/0x540 [btrfs]
> Mar 15 14:03:06 auswscs9903 kernel: ? __check_object_size+0x159/0x190
> Mar 15 14:03:06 auswscs9903 kernel: ? _copy_from_user+0x33/0x70
> Mar 15 14:03:06 auswscs9903 kernel: btrfs_ioctl+0xf20/0x2110 [btrfs]
> Mar 15 14:03:06 auswscs9903 kernel: ? audit_filter_rules.isra.9+0x241/0xe80
> Mar 15 14:03:06 auswscs9903 kernel: do_vfs_ioctl+0xaa/0x610
> Mar 15 14:03:06 auswscs9903 kernel: ? __audit_syscall_entry+0xac/0xf0
> Mar 15 14:03:06 auswscs9903 kernel: ? syscall_trace_enter+0x1cd/0x2b0
> Mar 15 14:03:06 auswscs9903 kernel: SyS_ioctl+0x79/0x90
> Mar 15 14:03:06 auswscs9903 kernel: do_syscall_64+0x79/0x1b0
> Mar 15 14:03:06 auswscs9903 kernel: entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> Mar 15 14:03:06 auswscs9903 kernel: RIP: 0033:0x7fc1565a6107
> Mar 15 14:03:06 auswscs9903 kernel: RSP: 002b:00007fc1564b5d58 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> Mar 15 14:03:06 auswscs9903 kernel: RAX: ffffffffffffffda RBX: 000000000168a3a0 RCX: 00007fc1565a6107
> Mar 15 14:03:06 auswscs9903 kernel: RDX: 000000000168a3a0 RSI: 00000000c400941b RDI: 0000000000000003
> Mar 15 14:03:06 auswscs9903 kernel: RBP: 0000000000000000 R08: 00007fc1564b6700 R09: 0000000000000000
> Mar 15 14:03:06 auswscs9903 kernel: R10: 00007fc1564b6700 R11: 0000000000000246 R12: 00007fc1564b64e0
> Mar 15 14:03:06 auswscs9903 kernel: R13: 00007fc1564b69c0 R14: 00007fc1564b6700 R15: 0000000000000001
> Mar 15 14:03:06 auswscs9903 kernel: Code: 00 e9 5d ff ff ff 49 8b 44 24 60 f0 0f ba a8 d8 cd 00 00 02 72 17 83 fb fb 74 2d 89 de 48 c7 c7 d8 68 77 a0 31 c0 e8 cd 8f 9b e0 <0f> 0b 89 d9 ba d0 27 00 00 48 c7 c6 60 f7 76 a0 4c 89 e7 e8 18
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: Crashes running btrfs scrub
  2018-03-16 21:39     ` Liu Bo
@ 2018-03-16 21:46       ` Mike Stevens
  2018-03-18  0:26         ` Liu Bo
  0 siblings, 1 reply; 24+ messages in thread
From: Mike Stevens @ 2018-03-16 21:46 UTC (permalink / raw)
  To: Liu Bo; +Cc: linux-btrfs


> Could you please paste the whole dmesg, it looks like it hit
> btrfs_abort_transaction(),
> which should give us more information about where goes wrong.

The whole thing is here https://pastebin.com/4ENq2saQ



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Crashes running btrfs scrub
  2018-03-16 21:46       ` Mike Stevens
@ 2018-03-18  0:26         ` Liu Bo
  2018-03-18  6:41           ` Liu Bo
  0 siblings, 1 reply; 24+ messages in thread
From: Liu Bo @ 2018-03-18  0:26 UTC (permalink / raw)
  To: Mike Stevens; +Cc: linux-btrfs

On Fri, Mar 16, 2018 at 2:46 PM, Mike Stevens <michael.stevens@bayer.com> wrote:
>> Could you please paste the whole dmesg, it looks like it hit
>> btrfs_abort_transaction(),
>> which should give us more information about where goes wrong.
>
> The whole thing is here https://pastebin.com/4ENq2saQ

Given this,

[  299.410998] BTRFS: error (device sdag) in
btrfs_create_pending_block_groups:10192: errno=-27 unknown

it refers to -EFBIG, so I think the warning comes from

btrfs_add_system_chunk()
{
...
        if (array_size + item_size + sizeof(disk_key)
                        > BTRFS_SYSTEM_CHUNK_ARRAY_SIZE) {
                mutex_unlock(&fs_info->chunk_mutex);
                return -EFBIG;
        }

If that's the case, we need to check this earlier during mount.

thanks,
liubo

>
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Crashes running btrfs scrub
       [not found]       ` <6b4f2b33edb44f1ea8cef47ae68960af@MOXDE7.na.bayer.cnb>
  2018-03-16 16:00         ` Chris Murphy
@ 2018-03-18  2:23         ` Qu Wenruo
  1 sibling, 0 replies; 24+ messages in thread
From: Qu Wenruo @ 2018-03-18  2:23 UTC (permalink / raw)
  To: Mike Stevens, Chris Murphy; +Cc: linux-btrfs





On 2018-03-16 23:03, Mike Stevens wrote:
>> Can you post a more complete dmesg rather than snipping it? Is there
>> anything device or Btrfs related in the 5 minutes before this trace
>> happens? And is it still going read only?
> 
> It's still going read only after the 4.15.10 update.  Here's a lot more log:
> 

> Mar 15 14:01:14 auswscs9903 MR_MONITOR[1821]: <MRMON243> Controller ID:  0   Fan speed changed on enclosure:   1  Fan#012      1
> Mar 15 14:03:06 auswscs9903 kernel: ------------[ cut here ]------------
> Mar 15 14:03:06 auswscs9903 kernel: WARNING: CPU: 6 PID: 2720 at fs/btrfs/extent-tree.c:10192 btrfs_create_pending_block_groups+0x1f3/0x260 [btrfs]
> Mar 15 14:03:06 auswscs9903 kernel: BTRFS: error (device sdag) in btrfs_create_pending_block_groups:10192: errno=-27 unknown

-27 is EFBIG.

And in the call trace of btrfs_create_pending_block_groups, I think it's
triggered by btrfs_add_system_chunk(), which will return -EFBIG if the
superblock can't contain that many system chunk entries.

Furthermore, you have a lot of disks, which dramatically increases the
superblock's system chunk array usage.

If you put that many devices into one single btrfs and use a profile
that utilizes all disks, like RAID0/10/5/6, it will easily take up all
the space in the superblock.

And since even raid6 can only tolerate 2 disk failures, with so many
disks it won't help much.

It's recommended to use RAID10 for the metadata and system chunk
profiles, and to reduce the device count to a reasonable number.

Thanks,
Qu



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Crashes running btrfs scrub
  2018-03-18  0:26         ` Liu Bo
@ 2018-03-18  6:41           ` Liu Bo
  2018-03-18  7:57             ` Goffredo Baroncelli
  2018-03-18 22:52             ` waxhead
  0 siblings, 2 replies; 24+ messages in thread
From: Liu Bo @ 2018-03-18  6:41 UTC (permalink / raw)
  To: Mike Stevens; +Cc: linux-btrfs

On Sat, Mar 17, 2018 at 5:26 PM, Liu Bo <obuil.liubo@gmail.com> wrote:
> On Fri, Mar 16, 2018 at 2:46 PM, Mike Stevens <michael.stevens@bayer.com> wrote:
>>> Could you please paste the whole dmesg, it looks like it hit
>>> btrfs_abort_transaction(),
>>> which should give us more information about where goes wrong.
>>
>> The whole thing is here https://pastebin.com/4ENq2saQ
>
> Given this,
>
> [  299.410998] BTRFS: error (device sdag) in
> btrfs_create_pending_block_groups:10192: errno=-27 unknown
>
> it refers to -EFBIG, so I think the warning comes from
>
> btrfs_add_system_chunk()
> {
> ...
>         if (array_size + item_size + sizeof(disk_key)
>                         > BTRFS_SYSTEM_CHUNK_ARRAY_SIZE) {
>                 mutex_unlock(&fs_info->chunk_mutex);
>                 return -EFBIG;
>         }
>
> If that's the case, we need to check this earlier during mount.
>

I didn't realize this until now: we do have a limit on how many disks
btrfs can handle in order to make balance/scrub work properly (where
system chunks may be set readonly);

((BTRFS_SYSTEM_CHUNK_ARRAY_SIZE / 2) - sizeof(struct btrfs_chunk)) /
sizeof(struct btrfs_stripe) + 1

is the maximum number of disks btrfs can handle.

Mike,

For now, it looks like we don't have a good way to work around the
warning since the limit is hardcoded in the source, but a more
fine-grained balance is possible if that's what you're looking for.

thanks,
liubo

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Crashes running btrfs scrub
  2018-03-18  6:41           ` Liu Bo
@ 2018-03-18  7:57             ` Goffredo Baroncelli
  2018-03-18 14:46               ` Goffredo Baroncelli
  2018-03-18 22:52             ` waxhead
  1 sibling, 1 reply; 24+ messages in thread
From: Goffredo Baroncelli @ 2018-03-18  7:57 UTC (permalink / raw)
  To: Liu Bo, Mike Stevens; +Cc: linux-btrfs

On 03/18/2018 07:41 AM, Liu Bo wrote:
> ((BTRFS_SYSTEM_CHUNK_ARRAY_SIZE / 2) - sizeof(struct btrfs_chunk)) /
> sizeof(struct btrfs_stripe) + 1

	BTRFS_SYSTEM_CHUNK_ARRAY_SIZE = 2048
	sizeof(struct btrfs_chunk)) = 48
	sizeof(struct btrfs_stripe) = 32

So

	(2048/2-48)/32+1 = 31

If my math is correct
gb
-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Crashes running btrfs scrub
  2018-03-18  7:57             ` Goffredo Baroncelli
@ 2018-03-18 14:46               ` Goffredo Baroncelli
  0 siblings, 0 replies; 24+ messages in thread
From: Goffredo Baroncelli @ 2018-03-18 14:46 UTC (permalink / raw)
  To: Liu Bo, Mike Stevens; +Cc: linux-btrfs

On 03/18/2018 08:57 AM, Goffredo Baroncelli wrote:
> 	BTRFS_SYSTEM_CHUNK_ARRAY_SIZE = 2048
> 	sizeof(struct btrfs_chunk)) = 48
> 	sizeof(struct btrfs_stripe) = 32
> 
> So
> 
> 	(2048/2-48)/32+1 = 31
> 
> If my math is correct

My math was wrong:

	sizeof(struct btrfs_chunk) = 80

so the maximum number of disks is: (2048/2 - 80)/32 + 1 = 30

Does it make sense to put a warning in the btrfs kernel module, and in some btrfs commands like:
- btrfs fi show
- btrfs dev add

and in the wiki? 30 devices is a big number, but not unreachable...


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Crashes running btrfs scrub
  2018-03-18  6:41           ` Liu Bo
  2018-03-18  7:57             ` Goffredo Baroncelli
@ 2018-03-18 22:52             ` waxhead
  2018-03-19  1:51               ` Qu Wenruo
  2018-03-19 18:06               ` Liu Bo
  1 sibling, 2 replies; 24+ messages in thread
From: waxhead @ 2018-03-18 22:52 UTC (permalink / raw)
  To: Liu Bo, Mike Stevens; +Cc: linux-btrfs

Liu Bo wrote:
> On Sat, Mar 17, 2018 at 5:26 PM, Liu Bo <obuil.liubo@gmail.com> wrote:
>> On Fri, Mar 16, 2018 at 2:46 PM, Mike Stevens <michael.stevens@bayer.com> wrote:
>>>> Could you please paste the whole dmesg, it looks like it hit
>>>> btrfs_abort_transaction(),
>>>> which should give us more information about where goes wrong.
>>>
>>> The whole thing is here https://pastebin.com/4ENq2saQ
>>
>> Given this,
>>
>> [  299.410998] BTRFS: error (device sdag) in
>> btrfs_create_pending_block_groups:10192: errno=-27 unknown
>>
>> it refers to -EFBIG, so I think the warning comes from
>>
>> btrfs_add_system_chunk()
>> {
>> ...
>>          if (array_size + item_size + sizeof(disk_key)
>>                          > BTRFS_SYSTEM_CHUNK_ARRAY_SIZE) {
>>                  mutex_unlock(&fs_info->chunk_mutex);
>>                  return -EFBIG;
>>          }
>>
>> If that's the case, we need to check this earlier during mount.
>>
> 
> I didn't realize this until now,  we do have a limitation on up to how
> many disks btrfs could handle, in order to make balance/scrub work
> properly (where system chunks may be set readonly),
> 
> ((BTRFS_SYSTEM_CHUNK_ARRAY_SIZE / 2) - sizeof(struct btrfs_chunk)) /
> sizeof(struct btrfs_stripe) + 1
> 
> will be the number of disks btrfs can handle at most.

Am I understanding this correctly: BTRFS has a limit on the number of
physical devices it can handle?! (max 30 devices?!)

Or is this referring to the number of devices BTRFS can utilize in a
stripe (in which case 30 actually sounds like a high number)?

30 devices is really not that much; heck, you can get 90-disk top-load
JBOD storage chassis these days, and BTRFS does sound like an
attractive choice for things like that.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Crashes running btrfs scrub
  2018-03-18 22:52             ` waxhead
@ 2018-03-19  1:51               ` Qu Wenruo
  2018-03-19 18:06               ` Liu Bo
  1 sibling, 0 replies; 24+ messages in thread
From: Qu Wenruo @ 2018-03-19  1:51 UTC (permalink / raw)
  To: waxhead, Liu Bo, Mike Stevens; +Cc: linux-btrfs





On 2018-03-19 06:52, waxhead wrote:
> Liu Bo wrote:
>> On Sat, Mar 17, 2018 at 5:26 PM, Liu Bo <obuil.liubo@gmail.com> wrote:
>>> On Fri, Mar 16, 2018 at 2:46 PM, Mike Stevens
>>> <michael.stevens@bayer.com> wrote:
>>>>> Could you please paste the whole dmesg, it looks like it hit
>>>>> btrfs_abort_transaction(),
>>>>> which should give us more information about where goes wrong.
>>>>
>>>> The whole thing is here https://pastebin.com/4ENq2saQ
>>>
>>> Given this,
>>>
>>> [  299.410998] BTRFS: error (device sdag) in
>>> btrfs_create_pending_block_groups:10192: errno=-27 unknown
>>>
>>> it refers to -EFBIG, so I think the warning comes from
>>>
>>> btrfs_add_system_chunk()
>>> {
>>> ...
>>>          if (array_size + item_size + sizeof(disk_key)
>>>                          > BTRFS_SYSTEM_CHUNK_ARRAY_SIZE) {
>>>                  mutex_unlock(&fs_info->chunk_mutex);
>>>                  return -EFBIG;
>>>          }
>>>
>>> If that's the case, we need to check this earlier during mount.
>>>
>>
>> I didn't realize this until now,  we do have a limitation on up to how
>> many disks btrfs could handle, in order to make balance/scrub work
>> properly (where system chunks may be set readonly),
>>
>> ((BTRFS_SYSTEM_CHUNK_ARRAY_SIZE / 2) - sizeof(struct btrfs_chunk)) /
>> sizeof(struct btrfs_stripe) + 1
>>
>> will be the number of disks btrfs can handle at most.
> 
> Am I understanding this correct, BTRFS have limit to the number of
> physical devices it can handle?! (max 30 devices?!)

Not exactly.

The system chunk array is only used to store system chunks (the tree
blocks of the chunk tree). For data chunks it's completely unrelated.

And for certain system chunk types (raid1/dup) each chunk is fixed at
2 stripes, so it won't take up much space.

Furthermore, there shouldn't be that many system chunks, as they only
contain the chunk tree.
For a large fs, one chunk can be up to 10G for data or 1G for metadata,
so 10TB only needs about 100 chunk items, which doesn't even require
allocating extra system chunk space.

But if one is using RAID0/5/6/10 it's quite possible to hit that
maximum device number, although that would still be pretty strange.
The correct calculation should be (I hardly ever see more than 2
system chunks in the real world):

2048 = 80 + 32 + 80 + n * 32 (for old mkfs, which still creates a
temporary sys chunk)

or

2048 = 80 + n * 32 (for new mkfs, which doesn't have the temporary sys
chunk)

where n is the total number of disks if using RAID0/10/5/6 as the
metadata profile.

So n would be 58, which is quite large, at least as a mkfs-time limit.

At runtime, though, especially in the relocation case, we need half of
the system chunk array to store the new system chunk, which halves
that number.

But it should still be possible to convert the system chunk profile
from RAID0/10/5/6 to RAID1, which takes only 80 + 32 * 2.

Thanks,
Qu


> 
> Or are this referring to the number of devices BTRFS can utilize in a
> stripe (in which case 30 actually sounds like a high number).
> 
> 30 devices is really not that much, heck you get 90 disks top load JBOD
> storage chassis these days and BTRFS does sound like an attractive
> choice for things like that.



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Crashes running btrfs scrub
  2018-03-18 22:52             ` waxhead
  2018-03-19  1:51               ` Qu Wenruo
@ 2018-03-19 18:06               ` Liu Bo
  2018-03-20 17:44                 ` Mike Stevens
  2018-03-20 20:04                 ` Goffredo Baroncelli
  1 sibling, 2 replies; 24+ messages in thread
From: Liu Bo @ 2018-03-19 18:06 UTC (permalink / raw)
  To: waxhead, kreijack, David Sterba; +Cc: Mike Stevens, linux-btrfs

On Sun, Mar 18, 2018 at 3:52 PM, waxhead <waxhead@dirtcellar.net> wrote:
> Liu Bo wrote:
>>
>> On Sat, Mar 17, 2018 at 5:26 PM, Liu Bo <obuil.liubo@gmail.com> wrote:
>>>
>>> On Fri, Mar 16, 2018 at 2:46 PM, Mike Stevens <michael.stevens@bayer.com>
>>> wrote:
>>>>>
>>>>> Could you please paste the whole dmesg, it looks like it hit
>>>>> btrfs_abort_transaction(),
>>>>> which should give us more information about where goes wrong.
>>>>
>>>>
>>>> The whole thing is here https://pastebin.com/4ENq2saQ
>>>
>>>
>>> Given this,
>>>
>>> [  299.410998] BTRFS: error (device sdag) in
>>> btrfs_create_pending_block_groups:10192: errno=-27 unknown
>>>
>>> it refers to -EFBIG, so I think the warning comes from
>>>
>>> btrfs_add_system_chunk()
>>> {
>>> ...
>>>          if (array_size + item_size + sizeof(disk_key)
>>>                          > BTRFS_SYSTEM_CHUNK_ARRAY_SIZE) {
>>>                  mutex_unlock(&fs_info->chunk_mutex);
>>>                  return -EFBIG;
>>>          }
>>>
>>> If that's the case, we need to check this earlier during mount.
>>>
>>
>> I didn't realize this until now,  we do have a limitation on up to how
>> many disks btrfs could handle, in order to make balance/scrub work
>> properly (where system chunks may be set readonly),
>>
>> ((BTRFS_SYSTEM_CHUNK_ARRAY_SIZE / 2) - sizeof(struct btrfs_chunk)) /
>> sizeof(struct btrfs_stripe) + 1
>>
>> will be the number of disks btrfs can handle at most.
>
>
> Am I understanding this correct, BTRFS have limit to the number of physical
> devices it can handle?! (max 30 devices?!)
>
> Or are this referring to the number of devices BTRFS can utilize in a stripe
> (in which case 30 actually sounds like a high number).
>
> 30 devices is really not that much, heck you get 90 disks top load JBOD
> storage chassis these days and BTRFS does sound like an attractive choice
> for things like that.

So in Mike's case, both metadata and data are configured as raid6, and
the operations he tried, balance and scrub, need to set the existing
block groups read-only (to avoid any further changes being applied
while the operation is running); that's where we got to the point of
needing another system chunk.

However, I think it'd be better to have some warning about this when
doing a) mkfs.btrfs -mraid6, b) btrfs device add.

David, any idea?

thanks,
liubo

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: Crashes running btrfs scrub
  2018-03-19 18:06               ` Liu Bo
@ 2018-03-20 17:44                 ` Mike Stevens
  2018-03-21  2:01                   ` Qu Wenruo
  2018-03-20 20:04                 ` Goffredo Baroncelli
  1 sibling, 1 reply; 24+ messages in thread
From: Mike Stevens @ 2018-03-20 17:44 UTC (permalink / raw)
  To: Liu Bo, waxhead, kreijack, David Sterba; +Cc: linux-btrfs



>> 30 devices is really not that much, heck you get 90 disks top load JBOD
>> storage chassis these days and BTRFS does sound like an attractive choice
>> for things like that.

> So Mike's case is, that both metadata and data are configured as
> raid6, and the operations, balance and scrub, that he tried, need to
> set the existing block group as readonly (in order to avoid any
> further changes being applied during operations are running), then we
> got into the place where another system chunk is needed.

> However, I think it'd be better to have some warnings about this when
> doing a) mkfs.btrfs -mraid6, b) btrfs device add.

> David, any idea?

I'll certainly vote for a warning; I would have set this up differently had I been aware.

My filesystem check seems to have returned successfully:

[root@auswscs9903] ~ # btrfs check --readonly /dev/sdb
Checking filesystem on /dev/sdb
UUID: 77afc2bb-f7a8-4ce9-9047-c031f7571150
checking extents
checking free space cache
checking fs roots
checking csums
checking root refs
found 97926270238720 bytes used err is 0
total csum bytes: 95395030288
total tree bytes: 201223503872
total fs tree bytes: 84484636672
total extent tree bytes: 7195869184
btree space waste bytes: 29627784154
file data blocks allocated: 97756261568512

I've remounted the filesystem and I can at least touch a file.  I'm restarting the rsync that was running when it originally went read only.
What is the next step if it drops back to r/o?

________________________________________________________________________
The information contained in this e-mail is for the exclusive use of the 
intended recipient(s) and may be confidential, proprietary, and/or 
legally privileged.  Inadvertent disclosure of this message does not 
constitute a waiver of any privilege.  If you receive this message in 
error, please do not directly or indirectly use, print, copy, forward,
or disclose any part of this message.  Please also delete this e-mail 
and all copies and notify the sender.  Thank you. 
________________________________________________________________________

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Crashes running btrfs scrub
  2018-03-19 18:06               ` Liu Bo
  2018-03-20 17:44                 ` Mike Stevens
@ 2018-03-20 20:04                 ` Goffredo Baroncelli
  1 sibling, 0 replies; 24+ messages in thread
From: Goffredo Baroncelli @ 2018-03-20 20:04 UTC (permalink / raw)
  To: Liu Bo, waxhead, David Sterba; +Cc: Mike Stevens, linux-btrfs

On 03/19/2018 07:06 PM, Liu Bo wrote:
[...]
> 
> So Mike's case is, that both metadata and data are configured as
> raid6, and the operations, balance and scrub, that he tried, need to
> set the existing block group as readonly (in order to avoid any
> further changes being applied during operations are running), then we
> got into the place where another system chunk is needed.
> 
> However, I think it'd be better to have some warnings about this when
> doing a) mkfs.btrfs -mraid6, b) btrfs device add.

What about a warning in the kernel dmesg during mount?

> 
> David, any idea?
> 
> thanks,
> liubo
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Crashes running btrfs scrub
  2018-03-20 17:44                 ` Mike Stevens
@ 2018-03-21  2:01                   ` Qu Wenruo
  2018-03-21 17:13                     ` Liu Bo
  0 siblings, 1 reply; 24+ messages in thread
From: Qu Wenruo @ 2018-03-21  2:01 UTC (permalink / raw)
  To: Mike Stevens, Liu Bo, waxhead, kreijack, David Sterba; +Cc: linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 2773 bytes --]



On 2018年03月21日 01:44, Mike Stevens wrote:
> 
>>> 30 devices is really not that much, heck you get 90 disks top load JBOD
>>> storage chassis these days and BTRFS does sound like an attractive choice
>>> for things like that.
> 
>> So Mike's case is, that both metadata and data are configured as
>> raid6, and the operations, balance and scrub, that he tried, need to
>> set the existing block group as readonly (in order to avoid any
>> further changes being applied during operations are running), then we
>> got into the place where another system chunk is needed.
> 
>> However, I think it'd be better to have some warnings about this when
>> doing a) mkfs.btrfs -mraid6, b) btrfs device add.
> 
>> David, any idea?
> 
> I'll certainly vote for a warning, I would have set this up differently had I been aware.  
> 
> My filesystem check seems to have returned successfully:
> 
> [root@auswscs9903] ~ # btrfs check --readonly /dev/sdb
> Checking filesystem on /dev/sdb
> UUID: 77afc2bb-f7a8-4ce9-9047-c031f7571150
> checking extents
> checking free space cache
> checking fs roots
> checking csums
> checking root refs
> found 97926270238720 bytes used err is 0
> total csum bytes: 95395030288
> total tree bytes: 201223503872
> total fs tree bytes: 84484636672
> total extent tree bytes: 7195869184
> btree space waste bytes: 29627784154
> file data blocks allocated: 97756261568512
> 
> I've remounted the filesystem and I can at least touch a file.  I'm restarting the rsync that was running when it originally went read only.
> What is the next step if it drops back to r/o?

As already mentioned, if you're using tons of disks and RAID0/10/5/6 as
metadata profile, you can just convert your metadata (or just system) to
RAID1/DUP.

Then there will be more than enough space for system chunk array.
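For reference, a rough sketch of what that conversion could look like,
assuming the filesystem is mounted at /mnt (the mount point is a
placeholder; adjust to your setup):

```shell
# Convert the system chunks to raid1; -f (--force) is required because
# this reduces redundancy relative to raid6.
btrfs balance start -sconvert=raid1 -f /mnt

# Optionally convert all metadata to raid1 as well; data stays raid6.
btrfs balance start -mconvert=raid1 /mnt

# Verify the resulting profiles.
btrfs filesystem df /mnt
```

The convert filters only rewrite the chunk types they name, so the data
chunks are left untouched.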

Thanks,
Qu

> 
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 520 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Crashes running btrfs scrub
  2018-03-21  2:01                   ` Qu Wenruo
@ 2018-03-21 17:13                     ` Liu Bo
  2018-03-22  0:08                       ` Qu Wenruo
  0 siblings, 1 reply; 24+ messages in thread
From: Liu Bo @ 2018-03-21 17:13 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Mike Stevens, waxhead, kreijack, David Sterba, linux-btrfs

On Tue, Mar 20, 2018 at 7:01 PM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
> On 2018年03月21日 01:44, Mike Stevens wrote:
>>
>>>> 30 devices is really not that much, heck you get 90 disks top load JBOD
>>>> storage chassis these days and BTRFS does sound like an attractive choice
>>>> for things like that.
>>
>>> So Mike's case is, that both metadata and data are configured as
>>> raid6, and the operations, balance and scrub, that he tried, need to
>>> set the existing block group as readonly (in order to avoid any
>>> further changes being applied during operations are running), then we
>>> got into the place where another system chunk is needed.
>>
>>> However, I think it'd be better to have some warnings about this when
>>> doing a) mkfs.btrfs -mraid6, b) btrfs device add.
>>
>>> David, any idea?
>>
>> I'll certainly vote for a warning, I would have set this up differently had I been aware.
>>
>> My filesystem check seems to have returned successfully:
>>
>> [root@auswscs9903] ~ # btrfs check --readonly /dev/sdb
>> Checking filesystem on /dev/sdb
>> UUID: 77afc2bb-f7a8-4ce9-9047-c031f7571150
>> checking extents
>> checking free space cache
>> checking fs roots
>> checking csums
>> checking root refs
>> found 97926270238720 bytes used err is 0
>> total csum bytes: 95395030288
>> total tree bytes: 201223503872
>> total fs tree bytes: 84484636672
>> total extent tree bytes: 7195869184
>> btree space waste bytes: 29627784154
>> file data blocks allocated: 97756261568512
>>
>> I've remounted the filesystem and I can at least touch a file.  I'm restarting the rsync that was running when it originally went read only.
>> What is the next step if it drops back to r/o?
>
> As already mentioned, if you're using tons of disks and RAID0/10/5/6 as
> metadata profile, you can just convert your metadata (or just system) to
> RAID1/DUP.
>
> Then there will be more than enough space for system chunk array.
>

It's a chicken-and-egg problem: balance seems to be the only way to
switch raid profiles, but users are stuck here because the balance
itself aborts when it fails to allocate an extra system chunk.

thanks,
liubo

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Crashes running btrfs scrub
  2018-03-21 17:13                     ` Liu Bo
@ 2018-03-22  0:08                       ` Qu Wenruo
  0 siblings, 0 replies; 24+ messages in thread
From: Qu Wenruo @ 2018-03-22  0:08 UTC (permalink / raw)
  To: Liu Bo; +Cc: Mike Stevens, waxhead, kreijack, David Sterba, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 2706 bytes --]



On 2018年03月22日 01:13, Liu Bo wrote:
> On Tue, Mar 20, 2018 at 7:01 PM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>
>>
>> On 2018年03月21日 01:44, Mike Stevens wrote:
>>>
>>>>> 30 devices is really not that much, heck you get 90 disks top load JBOD
>>>>> storage chassis these days and BTRFS does sound like an attractive choice
>>>>> for things like that.
>>>
>>>> So Mike's case is, that both metadata and data are configured as
>>>> raid6, and the operations, balance and scrub, that he tried, need to
>>>> set the existing block group as readonly (in order to avoid any
>>>> further changes being applied during operations are running), then we
>>>> got into the place where another system chunk is needed.
>>>
>>>> However, I think it'd be better to have some warnings about this when
>>>> doing a) mkfs.btrfs -mraid6, b) btrfs device add.
>>>
>>>> David, any idea?
>>>
>>> I'll certainly vote for a warning, I would have set this up differently had I been aware.
>>>
>>> My filesystem check seems to have returned successfully:
>>>
>>> [root@auswscs9903] ~ # btrfs check --readonly /dev/sdb
>>> Checking filesystem on /dev/sdb
>>> UUID: 77afc2bb-f7a8-4ce9-9047-c031f7571150
>>> checking extents
>>> checking free space cache
>>> checking fs roots
>>> checking csums
>>> checking root refs
>>> found 97926270238720 bytes used err is 0
>>> total csum bytes: 95395030288
>>> total tree bytes: 201223503872
>>> total fs tree bytes: 84484636672
>>> total extent tree bytes: 7195869184
>>> btree space waste bytes: 29627784154
>>> file data blocks allocated: 97756261568512
>>>
>>> I've remounted the filesystem and I can at least touch a file.  I'm restarting the rsync that was running when it originally went read only.
>>> What is the next step if it drops back to r/o?
>>
>> As already mentioned, if you're using tons of disks and RAID0/10/5/6 as
>> metadata profile, you can just convert your metadata (or just system) to
>> RAID1/DUP.
>>
>> Then there will be more than enough space for system chunk array.
>>
> 
> It's chicken & egg, balance seems to be the only way to switch raid
> profiles however users are stuck here because balance is aborted due
> to failing to allocate an extra system chunk.

Mount with skip_balance to abort the current balance, then start the
new convert. Since the convert allocates new chunks in the new profile,
the raid1 system chunk should be able to fit into the superblock.
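
As a rough sequence (assuming the filesystem is mounted at /mnt; the
mount point is a placeholder):

```shell
# Remount with skip_balance so the stuck balance does not resume,
# then cancel it.
mount -o remount,skip_balance /mnt
btrfs balance cancel /mnt

# Start the convert; it allocates new chunks in the new profile up
# front, and a raid1 system chunk is small enough to fit in the
# superblock's system chunk array.
btrfs balance start -sconvert=raid1 -f /mnt
```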

Thanks,
Qu

> 
> thanks,
> liubo
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 520 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2018-03-22  0:09 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-15 18:58 Crashes running btrfs scrub Mike Stevens
2018-03-15 20:32 ` waxhead
2018-03-15 21:07   ` Mike Stevens
2018-03-15 21:22     ` Chris Murphy
     [not found]       ` <6b4f2b33edb44f1ea8cef47ae68960af@MOXDE7.na.bayer.cnb>
2018-03-16 16:00         ` Chris Murphy
2018-03-16 16:17           ` Mike Stevens
2018-03-16 16:44             ` Chris Murphy
2018-03-16 18:53               ` Mike Stevens
2018-03-18  2:23         ` Qu Wenruo
2018-03-16 21:39     ` Liu Bo
2018-03-16 21:46       ` Mike Stevens
2018-03-18  0:26         ` Liu Bo
2018-03-18  6:41           ` Liu Bo
2018-03-18  7:57             ` Goffredo Baroncelli
2018-03-18 14:46               ` Goffredo Baroncelli
2018-03-18 22:52             ` waxhead
2018-03-19  1:51               ` Qu Wenruo
2018-03-19 18:06               ` Liu Bo
2018-03-20 17:44                 ` Mike Stevens
2018-03-21  2:01                   ` Qu Wenruo
2018-03-21 17:13                     ` Liu Bo
2018-03-22  0:08                       ` Qu Wenruo
2018-03-20 20:04                 ` Goffredo Baroncelli
2018-03-15 21:15 ` Chris Murphy
