* btrfs metadata has reserved 1T of extra space and balances don't reclaim it
@ 2021-09-29  2:23 Brandon Heisner
  2021-09-29  7:23 ` Forza
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread

From: Brandon Heisner @ 2021-09-29 2:23 UTC (permalink / raw)
To: linux-btrfs

I have a server running CentOS 7 on 4.9.5-1.el7.elrepo.x86_64 #1 SMP
Fri Jan 20 11:34:13 EST 2017 x86_64 x86_64 x86_64 GNU/Linux. It is
version locked to that kernel. The metadata has reserved a full 1T of
disk space, while only using ~38G. I've tried to balance the metadata
to reclaim that space so it can be used for data, but it doesn't work
and gives no errors. It just says it balanced the chunks, but the size
doesn't change. The metadata total is still growing as well, as it
used to be 1.04 and now it is 1.08 with only about 10G more of
metadata used. I've tried doing balances up to 70 or 80 musage, I
think, and the total metadata does not decrease. I've made so many
attempts at balancing that I've probably tried to move 300 chunks or
more. None have resulted in any change to the metadata total like they
do on other servers running btrfs. I first started with a very low
musage, like 10, and then increased it by 10 to see whether that would
balance any chunks out, but with no success.

# /sbin/btrfs balance start -musage=60 -mlimit=30 /opt/zimbra
Done, had to relocate 30 out of 2127 chunks

I can run that command over and over again, or increase the mlimit,
and it never changes the metadata total.

# btrfs fi show /opt/zimbra/
Label: 'Data'  uuid: ece150db-5817-4704-9e84-80f7d8a3b1da
        Total devices 4 FS bytes used 1.48TiB
        devid    1 size 1.46TiB used 1.38TiB path /dev/sde
        devid    2 size 1.46TiB used 1.38TiB path /dev/sdf
        devid    3 size 1.46TiB used 1.38TiB path /dev/sdg
        devid    4 size 1.46TiB used 1.38TiB path /dev/sdh

# btrfs fi df /opt/zimbra/
Data, RAID10: total=1.69TiB, used=1.45TiB
System, RAID10: total=64.00MiB, used=640.00KiB
Metadata, RAID10: total=1.08TiB, used=37.69GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

# btrfs fi us /opt/zimbra/ -T
Overall:
    Device size:           5.82TiB
    Device allocated:      5.54TiB
    Device unallocated:  291.54GiB
    Device missing:          0.00B
    Used:                  2.96TiB
    Free (estimated):    396.36GiB  (min: 396.36GiB)
    Data ratio:               2.00
    Metadata ratio:           2.00
    Global reserve:      512.00MiB  (used: 0.00B)

             Data      Metadata  System
Id Path      RAID10    RAID10    RAID10    Unallocated
-- --------  --------- --------- --------- -----------
 1 /dev/sde  432.75GiB 276.00GiB  16.00MiB   781.65GiB
 2 /dev/sdf  432.75GiB 276.00GiB  16.00MiB   781.65GiB
 3 /dev/sdg  432.75GiB 276.00GiB  16.00MiB   781.65GiB
 4 /dev/sdh  432.75GiB 276.00GiB  16.00MiB   781.65GiB
-- --------  --------- --------- --------- -----------
   Total       1.69TiB   1.08TiB  64.00MiB     3.05TiB
   Used        1.45TiB  37.69GiB 640.00KiB

--
Brandon Heisner
System Administrator
Wolfram Research

^ permalink raw reply	[flat|nested] 11+ messages in thread
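The stepwise approach described above (raising musage by 10 per pass and re-checking) can be scripted; this is only an illustrative sketch, assuming root and a btrfs filesystem mounted at the given path, with a hypothetical function name:

```shell
# Sketch: walk musage upward in steps of 10 and print the metadata
# totals after each pass, so you can see whether "Metadata, ... total="
# ever shrinks. The step list and function name are illustrative.
balance_metadata_stepwise() {
    mnt=$1
    for pct in 10 20 30 40 50 60 70 80; do
        # Rewrite only metadata chunks that are at most pct% full.
        btrfs balance start -musage="$pct" "$mnt" || return 1
        # Show the current space totals after each pass.
        btrfs filesystem df "$mnt"
    done
}
```

Usage would be `balance_metadata_stepwise /opt/zimbra`.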
* Re: btrfs metadata has reserved 1T of extra space and balances don't reclaim it
  2021-09-29  2:23 btrfs metadata has reserved 1T of extra space and balances don't reclaim it Brandon Heisner
@ 2021-09-29  7:23 ` Forza
  2021-09-29 14:34   ` Brandon Heisner
  2021-09-29  8:22 ` Qu Wenruo
  ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread

From: Forza @ 2021-09-29 7:23 UTC (permalink / raw)
To: brandonh, linux-btrfs

---- From: Brandon Heisner <brandonh@wolfram.com> -- Sent: 2021-09-29 - 04:23 ----

> I have a server running CentOS 7 on 4.9.5-1.el7.elrepo.x86_64 #1 SMP
> Fri Jan 20 11:34:13 EST 2017 x86_64 x86_64 x86_64 GNU/Linux. It is
> version locked to that kernel. The metadata has reserved a full 1T of
> disk space, while only using ~38G. I've tried to balance the metadata
> to reclaim that space so it can be used for data, but it doesn't work
> and gives no errors. It just says it balanced the chunks, but the size
> doesn't change. The metadata total is still growing as well, as it
> used to be 1.04 and now it is 1.08 with only about 10G more of
> metadata used. I've tried doing balances up to 70 or 80 musage, I
> think, and the total metadata does not decrease. I've made so many
> attempts at balancing that I've probably tried to move 300 chunks or
> more. None have resulted in any change to the metadata total like they
> do on other servers running btrfs. I first started with a very low
> musage, like 10, and then increased it by 10 to see whether that would
> balance any chunks out, but with no success.
>
> # /sbin/btrfs balance start -musage=60 -mlimit=30 /opt/zimbra
> Done, had to relocate 30 out of 2127 chunks
>
> I can run that command over and over again, or increase the mlimit,
> and it never changes the metadata total.
>
> # btrfs fi show /opt/zimbra/
> Label: 'Data'  uuid: ece150db-5817-4704-9e84-80f7d8a3b1da
>         Total devices 4 FS bytes used 1.48TiB
>         devid    1 size 1.46TiB used 1.38TiB path /dev/sde
>         devid    2 size 1.46TiB used 1.38TiB path /dev/sdf
>         devid    3 size 1.46TiB used 1.38TiB path /dev/sdg
>         devid    4 size 1.46TiB used 1.38TiB path /dev/sdh
>
> # btrfs fi df /opt/zimbra/
> Data, RAID10: total=1.69TiB, used=1.45TiB
> System, RAID10: total=64.00MiB, used=640.00KiB
> Metadata, RAID10: total=1.08TiB, used=37.69GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
>
> # btrfs fi us /opt/zimbra/ -T
> Overall:
>     Device size:           5.82TiB
>     Device allocated:      5.54TiB
>     Device unallocated:  291.54GiB
>     Device missing:          0.00B
>     Used:                  2.96TiB
>     Free (estimated):    396.36GiB  (min: 396.36GiB)
>     Data ratio:               2.00
>     Metadata ratio:           2.00
>     Global reserve:      512.00MiB  (used: 0.00B)
>
>              Data      Metadata  System
> Id Path      RAID10    RAID10    RAID10    Unallocated
> -- --------  --------- --------- --------- -----------
>  1 /dev/sde  432.75GiB 276.00GiB  16.00MiB   781.65GiB
>  2 /dev/sdf  432.75GiB 276.00GiB  16.00MiB   781.65GiB
>  3 /dev/sdg  432.75GiB 276.00GiB  16.00MiB   781.65GiB
>  4 /dev/sdh  432.75GiB 276.00GiB  16.00MiB   781.65GiB
> -- --------  --------- --------- --------- -----------
>    Total       1.69TiB   1.08TiB  64.00MiB     3.05TiB
>    Used        1.45TiB  37.69GiB 640.00KiB
>
> --
> Brandon Heisner
> System Administrator
> Wolfram Research

What are your mount options? Do you by any chance use the metadata_ratio mount option?

https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs(5)#MOUNT_OPTIONS

^ permalink raw reply	[flat|nested] 11+ messages in thread
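One quick way to answer the mount-options question above is to list the options each btrfs mount is actually running with (metadata_ratio would show up there if set). A small illustrative helper, assuming util-linux's findmnt is available:

```shell
# Sketch: print TARGET and OPTIONS for every mounted btrfs filesystem.
# Falls back to /proc/mounts if findmnt is not installed.
show_btrfs_mount_options() {
    findmnt -t btrfs -o TARGET,OPTIONS 2>/dev/null || grep btrfs /proc/mounts
}
```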
* Re: btrfs metadata has reserved 1T of extra space and balances don't reclaim it
  2021-09-29  7:23 ` Forza
@ 2021-09-29 14:34   ` Brandon Heisner
  2021-10-03 11:26     ` Forza
  0 siblings, 1 reply; 11+ messages in thread

From: Brandon Heisner @ 2021-09-29 14:34 UTC (permalink / raw)
To: Forza; +Cc: linux-btrfs

No, I do not use that option. Also, because btrfs does not honor
per-subvolume mount options, I have compression and nodatacow set via
filesystem attributes on the directories that are btrfs subvolumes.

UUID=ece150db-5817-4704-9e84-80f7d8a3b1da /opt/zimbra           btrfs subvol=zimbra,defaults,discard,compress=lzo 0 0
UUID=ece150db-5817-4704-9e84-80f7d8a3b1da /var/log              btrfs subvol=root-var-log,defaults,discard,compress=lzo 0 0
UUID=ece150db-5817-4704-9e84-80f7d8a3b1da /opt/zimbra/db        btrfs subvol=db,defaults,discard,nodatacow 0 0
UUID=ece150db-5817-4704-9e84-80f7d8a3b1da /opt/zimbra/index     btrfs subvol=index,defaults,discard,compress=lzo 0 0
UUID=ece150db-5817-4704-9e84-80f7d8a3b1da /opt/zimbra/store     btrfs subvol=store,defaults,discard,compress=lzo 0 0
UUID=ece150db-5817-4704-9e84-80f7d8a3b1da /opt/zimbra/log       btrfs subvol=log,defaults,discard,compress=lzo 0 0
UUID=ece150db-5817-4704-9e84-80f7d8a3b1da /opt/zimbra/snapshots btrfs subvol=snapshots,defaults,discard,compress=lzo 0 0

----- On Sep 29, 2021, at 2:23 AM, Forza forza@tnonline.net wrote:

> ---- From: Brandon Heisner <brandonh@wolfram.com> -- Sent: 2021-09-29 - 04:23 ----
>
>> I have a server running CentOS 7 on 4.9.5-1.el7.elrepo.x86_64 #1 SMP
>> Fri Jan 20 11:34:13 EST 2017 x86_64 x86_64 x86_64 GNU/Linux. It is
>> version locked to that kernel. The metadata has reserved a full 1T of
>> disk space, while only using ~38G. I've tried to balance the metadata
>> to reclaim that space so it can be used for data, but it doesn't work
>> and gives no errors. It just says it balanced the chunks, but the size
>> doesn't change. The metadata total is still growing as well, as it
>> used to be 1.04 and now it is 1.08 with only about 10G more of
>> metadata used. I've tried doing balances up to 70 or 80 musage, I
>> think, and the total metadata does not decrease. I've made so many
>> attempts at balancing that I've probably tried to move 300 chunks or
>> more. None have resulted in any change to the metadata total like they
>> do on other servers running btrfs. I first started with a very low
>> musage, like 10, and then increased it by 10 to see whether that would
>> balance any chunks out, but with no success.
>>
>> # /sbin/btrfs balance start -musage=60 -mlimit=30 /opt/zimbra
>> Done, had to relocate 30 out of 2127 chunks
>>
>> I can run that command over and over again, or increase the mlimit,
>> and it never changes the metadata total.
>>
>> # btrfs fi show /opt/zimbra/
>> Label: 'Data'  uuid: ece150db-5817-4704-9e84-80f7d8a3b1da
>>         Total devices 4 FS bytes used 1.48TiB
>>         devid    1 size 1.46TiB used 1.38TiB path /dev/sde
>>         devid    2 size 1.46TiB used 1.38TiB path /dev/sdf
>>         devid    3 size 1.46TiB used 1.38TiB path /dev/sdg
>>         devid    4 size 1.46TiB used 1.38TiB path /dev/sdh
>>
>> # btrfs fi df /opt/zimbra/
>> Data, RAID10: total=1.69TiB, used=1.45TiB
>> System, RAID10: total=64.00MiB, used=640.00KiB
>> Metadata, RAID10: total=1.08TiB, used=37.69GiB
>> GlobalReserve, single: total=512.00MiB, used=0.00B
>>
>> # btrfs fi us /opt/zimbra/ -T
>> Overall:
>>     Device size:           5.82TiB
>>     Device allocated:      5.54TiB
>>     Device unallocated:  291.54GiB
>>     Device missing:          0.00B
>>     Used:                  2.96TiB
>>     Free (estimated):    396.36GiB  (min: 396.36GiB)
>>     Data ratio:               2.00
>>     Metadata ratio:           2.00
>>     Global reserve:      512.00MiB  (used: 0.00B)
>>
>>              Data      Metadata  System
>> Id Path      RAID10    RAID10    RAID10    Unallocated
>> -- --------  --------- --------- --------- -----------
>>  1 /dev/sde  432.75GiB 276.00GiB  16.00MiB   781.65GiB
>>  2 /dev/sdf  432.75GiB 276.00GiB  16.00MiB   781.65GiB
>>  3 /dev/sdg  432.75GiB 276.00GiB  16.00MiB   781.65GiB
>>  4 /dev/sdh  432.75GiB 276.00GiB  16.00MiB   781.65GiB
>> -- --------  --------- --------- --------- -----------
>>    Total       1.69TiB   1.08TiB  64.00MiB     3.05TiB
>>    Used        1.45TiB  37.69GiB 640.00KiB
>>
>> --
>> Brandon Heisner
>> System Administrator
>> Wolfram Research
>
> What are your mount options? Do you by any chance use the metadata_ratio mount option?
>
> https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs(5)#MOUNT_OPTIONS

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: btrfs metadata has reserved 1T of extra space and balances don't reclaim it
  2021-09-29 14:34 ` Brandon Heisner
@ 2021-10-03 11:26   ` Forza
  2021-10-03 18:21     ` Zygo Blaxell
  0 siblings, 1 reply; 11+ messages in thread

From: Forza @ 2021-10-03 11:26 UTC (permalink / raw)
To: brandonh; +Cc: linux-btrfs

On 2021-09-29 16:34, Brandon Heisner wrote:
> No, I do not use that option. Also, because btrfs does not honor
> per-subvolume mount options, I have compression and nodatacow set via
> filesystem attributes on the directories that are btrfs subvolumes.
>
> UUID=ece150db-5817-4704-9e84-80f7d8a3b1da /opt/zimbra           btrfs subvol=zimbra,defaults,discard,compress=lzo 0 0
> UUID=ece150db-5817-4704-9e84-80f7d8a3b1da /var/log              btrfs subvol=root-var-log,defaults,discard,compress=lzo 0 0
> UUID=ece150db-5817-4704-9e84-80f7d8a3b1da /opt/zimbra/db        btrfs subvol=db,defaults,discard,nodatacow 0 0
> UUID=ece150db-5817-4704-9e84-80f7d8a3b1da /opt/zimbra/index     btrfs subvol=index,defaults,discard,compress=lzo 0 0
> UUID=ece150db-5817-4704-9e84-80f7d8a3b1da /opt/zimbra/store     btrfs subvol=store,defaults,discard,compress=lzo 0 0
> UUID=ece150db-5817-4704-9e84-80f7d8a3b1da /opt/zimbra/log       btrfs subvol=log,defaults,discard,compress=lzo 0 0
> UUID=ece150db-5817-4704-9e84-80f7d8a3b1da /opt/zimbra/snapshots btrfs subvol=snapshots,defaults,discard,compress=lzo 0 0

It might be worth looking into discard=async (*) or setting up a
regular fstrim job instead of using the discard mount option.

* async discard: "mount -o discard=async" to enable it. Freed extents
  are not discarded immediately, but grouped together and trimmed
  later, with IO rate limiting.
  https://lore.kernel.org/lkml/cover.1580142284.git.dsterba@suse.com/

^ permalink raw reply	[flat|nested] 11+ messages in thread
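The "regular fstrim" alternative suggested above can be as simple as a daily cron entry or the fstrim.timer unit that ships with util-linux. A minimal illustrative sketch, assuming root and a btrfs mount point passed as an argument:

```shell
# Sketch: trim free space on one mount, suitable for a periodic job.
# Mount point and function name are illustrative.
trim_btrfs_mount() {
    mnt=$1
    # -v makes fstrim report how many bytes it trimmed.
    fstrim -v "$mnt"
}
```

On systemd hosts, `systemctl enable --now fstrim.timer` achieves much the same thing on a schedule without any custom scripting.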
* Re: btrfs metadata has reserved 1T of extra space and balances don't reclaim it
  2021-10-03 11:26 ` Forza
@ 2021-10-03 18:21   ` Zygo Blaxell
  0 siblings, 0 replies; 11+ messages in thread

From: Zygo Blaxell @ 2021-10-03 18:21 UTC (permalink / raw)
To: Forza; +Cc: brandonh, linux-btrfs

On Sun, Oct 03, 2021 at 01:26:24PM +0200, Forza wrote:
> On 2021-09-29 16:34, Brandon Heisner wrote:
>> No, I do not use that option. Also, because btrfs does not honor
>> per-subvolume mount options, I have compression and nodatacow set via
>> filesystem attributes on the directories that are btrfs subvolumes.
>>
>> UUID=ece150db-5817-4704-9e84-80f7d8a3b1da /opt/zimbra           btrfs subvol=zimbra,defaults,discard,compress=lzo 0 0
>> UUID=ece150db-5817-4704-9e84-80f7d8a3b1da /var/log              btrfs subvol=root-var-log,defaults,discard,compress=lzo 0 0
>> UUID=ece150db-5817-4704-9e84-80f7d8a3b1da /opt/zimbra/db        btrfs subvol=db,defaults,discard,nodatacow 0 0
>> UUID=ece150db-5817-4704-9e84-80f7d8a3b1da /opt/zimbra/index     btrfs subvol=index,defaults,discard,compress=lzo 0 0
>> UUID=ece150db-5817-4704-9e84-80f7d8a3b1da /opt/zimbra/store     btrfs subvol=store,defaults,discard,compress=lzo 0 0
>> UUID=ece150db-5817-4704-9e84-80f7d8a3b1da /opt/zimbra/log       btrfs subvol=log,defaults,discard,compress=lzo 0 0
>> UUID=ece150db-5817-4704-9e84-80f7d8a3b1da /opt/zimbra/snapshots btrfs subvol=snapshots,defaults,discard,compress=lzo 0 0
>
> It might be worth looking into discard=async (*) or setting up a
> regular fstrim job instead of using the discard mount option.

Brandon's kernel (4.9.5 from 2017) is three years too old to have
working discard=async. Upgrading the kernel would most likely fix the
problems even without changing mount options.

> * async discard: "mount -o discard=async" to enable it. Freed extents
>   are not discarded immediately, but grouped together and trimmed
>   later, with IO rate limiting.
>   https://lore.kernel.org/lkml/cover.1580142284.git.dsterba@suse.com/

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: btrfs metadata has reserved 1T of extra space and balances don't reclaim it
  2021-09-29  2:23 btrfs metadata has reserved 1T of extra space and balances don't reclaim it Brandon Heisner
  2021-09-29  7:23 ` Forza
@ 2021-09-29  8:22 ` Qu Wenruo
  2021-09-29 15:18 ` Andrea Gelmini
  2021-09-29 17:31 ` Zygo Blaxell
  3 siblings, 0 replies; 11+ messages in thread

From: Qu Wenruo @ 2021-09-29 8:22 UTC (permalink / raw)
To: brandonh, linux-btrfs

On 2021/9/29 10:23, Brandon Heisner wrote:
> I have a server running CentOS 7 on 4.9.5-1.el7.elrepo.x86_64 #1 SMP
> Fri Jan 20 11:34:13 EST 2017 x86_64 x86_64 x86_64 GNU/Linux. It is
> version locked to that kernel. The metadata has reserved a full 1T of
> disk space, while only using ~38G. I've tried to balance the metadata
> to reclaim that space so it can be used for data, but it doesn't work
> and gives no errors. It just says it balanced the chunks, but the size
> doesn't change. The metadata total is still growing as well, as it
> used to be 1.04 and now it is 1.08 with only about 10G more of
> metadata used. I've tried doing balances up to 70 or 80 musage, I
> think, and the total metadata does not decrease. I've made so many
> attempts at balancing that I've probably tried to move 300 chunks or
> more. None have resulted in any change to the metadata total like they
> do on other servers running btrfs. I first started with a very low
> musage, like 10, and then increased it by 10 to see whether that would
> balance any chunks out, but with no success.
>
> # /sbin/btrfs balance start -musage=60 -mlimit=30 /opt/zimbra
> Done, had to relocate 30 out of 2127 chunks

One question: did -musage=0 result in any change?

If there are empty metadata block groups, btrfs should be able to
reclaim them without any extra commands.

And is there any dmesg output during the above -musage=0 balance?

Thanks,
Qu

> I can run that command over and over again, or increase the mlimit,
> and it never changes the metadata total.
>
> # btrfs fi show /opt/zimbra/
> Label: 'Data'  uuid: ece150db-5817-4704-9e84-80f7d8a3b1da
>         Total devices 4 FS bytes used 1.48TiB
>         devid    1 size 1.46TiB used 1.38TiB path /dev/sde
>         devid    2 size 1.46TiB used 1.38TiB path /dev/sdf
>         devid    3 size 1.46TiB used 1.38TiB path /dev/sdg
>         devid    4 size 1.46TiB used 1.38TiB path /dev/sdh
>
> # btrfs fi df /opt/zimbra/
> Data, RAID10: total=1.69TiB, used=1.45TiB
> System, RAID10: total=64.00MiB, used=640.00KiB
> Metadata, RAID10: total=1.08TiB, used=37.69GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
>
> # btrfs fi us /opt/zimbra/ -T
> Overall:
>     Device size:           5.82TiB
>     Device allocated:      5.54TiB
>     Device unallocated:  291.54GiB
>     Device missing:          0.00B
>     Used:                  2.96TiB
>     Free (estimated):    396.36GiB  (min: 396.36GiB)
>     Data ratio:               2.00
>     Metadata ratio:           2.00
>     Global reserve:      512.00MiB  (used: 0.00B)
>
>              Data      Metadata  System
> Id Path      RAID10    RAID10    RAID10    Unallocated
> -- --------  --------- --------- --------- -----------
>  1 /dev/sde  432.75GiB 276.00GiB  16.00MiB   781.65GiB
>  2 /dev/sdf  432.75GiB 276.00GiB  16.00MiB   781.65GiB
>  3 /dev/sdg  432.75GiB 276.00GiB  16.00MiB   781.65GiB
>  4 /dev/sdh  432.75GiB 276.00GiB  16.00MiB   781.65GiB
> -- --------  --------- --------- --------- -----------
>    Total       1.69TiB   1.08TiB  64.00MiB     3.05TiB
>    Used        1.45TiB  37.69GiB 640.00KiB

^ permalink raw reply	[flat|nested] 11+ messages in thread
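The check Qu suggests above can be sketched as a pair of commands: reclaim only completely empty metadata block groups, then look at recent kernel messages for balance or relocation errors. Illustrative only; assumes root and a btrfs mount point:

```shell
# Sketch: -musage=0 touches only metadata block groups that are
# completely empty, so it is cheap; balance progress and errors are
# reported by the kernel, hence the dmesg check afterwards.
reclaim_empty_metadata() {
    mnt=$1
    btrfs balance start -musage=0 "$mnt"
    # Inspect the most recent kernel messages for relocation errors.
    dmesg | tail -n 20
}
```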
* Re: btrfs metadata has reserved 1T of extra space and balances don't reclaim it
  2021-09-29  2:23 btrfs metadata has reserved 1T of extra space and balances don't reclaim it Brandon Heisner
  2021-09-29  7:23 ` Forza
  2021-09-29  8:22 ` Qu Wenruo
@ 2021-09-29 15:18 ` Andrea Gelmini
  2021-09-29 16:39   ` Forza
  2021-09-29 17:31 ` Zygo Blaxell
  3 siblings, 1 reply; 11+ messages in thread

From: Andrea Gelmini @ 2021-09-29 15:18 UTC (permalink / raw)
To: brandonh; +Cc: Linux BTRFS

Il giorno mer 29 set 2021 alle ore 04:41 Brandon Heisner
<brandonh@wolfram.com> ha scritto:
>
> I have a server running CentOS 7 on 4.9.5-1.el7.elrepo.x86_64 #1 SMP
> Fri Jan 20 11:34:13 EST 2017 x86_64 x86_64 x86_64 GNU/Linux. It is
> version locked to that kernel. The metadata has reserved a full 1T of
> disk space, while only using ~38G. I've tried to balance the metadata
> to reclaim that space so it can be used for data, but it doesn't work
> and gives no errors. It just says it balanced the chunks, but the size
> doesn't change. The metadata total is still growing as well, as it
> used to be 1.04 and now it is 1.08 with only about 10G more of
> metadata used. I've tried doing balances up to 70 or 80 musage I think, and

Similar situation here: an 18TB single disk with one big snapraid
parity file and a lot of metadata allocated.

I solved it with:
btrfs filesystem defrag -v -r -clzo .
(the compression was useless in my case)

Soon after starting it, I already saw space being reclaimed.

In the end I fell back to exfat to avoid continually
re-reading/re-writing all the data just to avoid "metadata waste".

Ciao,
Gelma

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: btrfs metadata has reserved 1T of extra space and balances don't reclaim it
  2021-09-29 15:18 ` Andrea Gelmini
@ 2021-09-29 16:39   ` Forza
  2021-09-29 18:55     ` Andrea Gelmini
  0 siblings, 1 reply; 11+ messages in thread

From: Forza @ 2021-09-29 16:39 UTC (permalink / raw)
To: Andrea Gelmini, brandonh; +Cc: Linux BTRFS

---- From: Andrea Gelmini <andrea.gelmini@gmail.com> -- Sent: 2021-09-29 - 17:18 ----

> Il giorno mer 29 set 2021 alle ore 04:41 Brandon Heisner
> <brandonh@wolfram.com> ha scritto:
>>
>> I have a server running CentOS 7 on 4.9.5-1.el7.elrepo.x86_64 #1 SMP
>> Fri Jan 20 11:34:13 EST 2017 x86_64 x86_64 x86_64 GNU/Linux. It is
>> version locked to that kernel. The metadata has reserved a full 1T of
>> disk space, while only using ~38G. I've tried to balance the metadata
>> to reclaim that space so it can be used for data, but it doesn't work
>> and gives no errors. It just says it balanced the chunks, but the size
>> doesn't change. The metadata total is still growing as well, as it
>> used to be 1.04 and now it is 1.08 with only about 10G more of
>> metadata used. I've tried doing balances up to 70 or 80 musage I think, and
>
> Similar situation here: an 18TB single disk with one big snapraid
> parity file and a lot of metadata allocated.
>
> I solved it with:
> btrfs filesystem defrag -v -r -clzo .
> (the compression was useless in my case)
>
> Soon after starting it, I already saw space being reclaimed.
>
> In the end I fell back to exfat to avoid continually
> re-reading/re-writing all the data just to avoid "metadata waste".
>
> Ciao,
> Gelma

Maybe the autodefrag mount option might be helpful?

Your problem sounds like partially filled extents rather than
metadata. Typical scenarios where that happens are some databases and
VM images: a file can occupy much more space than its actual data. Use
'compsize' to determine whether this is the case.

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: btrfs metadata has reserved 1T of extra space and balances don't reclaim it
  2021-09-29 16:39 ` Forza
@ 2021-09-29 18:55   ` Andrea Gelmini
  0 siblings, 0 replies; 11+ messages in thread

From: Andrea Gelmini @ 2021-09-29 18:55 UTC (permalink / raw)
To: Forza; +Cc: brandonh, Linux BTRFS

Il giorno mer 29 set 2021 alle ore 18:39 Forza <forza@tnonline.net> ha scritto:
> Maybe the autodefrag mount option might be helpful?

It has been enabled since the beginning.

> Your problem sounds like partially filled extents rather than
> metadata. Typical scenarios where that happens are some databases and
> VM images: a file can occupy much more space than its actual data. Use
> 'compsize' to determine whether this is the case.

I confirm it is one big file with random writes, so I agree about the
extents. But I'm quite confident the same approach can fix the
original question.

Ciao,
Gelma

^ permalink raw reply	[flat|nested] 11+ messages in thread
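The diagnosis discussed in the two messages above (partially filled extents pinned by random overwrites) can be checked per file with compsize, a separate tool not shipped with btrfs-progs. An illustrative sketch; the path and function name are hypothetical:

```shell
# Sketch: compsize reports "Disk usage" (allocated extent space) next
# to "Referenced" (bytes actually reachable from the file). A large gap
# between the two suggests partially filled extents, the situation
# Forza describes for databases and VM images.
check_extent_waste() {
    compsize "$1"
}
```

If the gap is large, a targeted `btrfs filesystem defrag` of just that file (as Andrea did) rewrites the extents, at the cost of breaking shared-extent (reflink/snapshot) sharing.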
* Re: btrfs metadata has reserved 1T of extra space and balances don't reclaim it
  2021-09-29  2:23 btrfs metadata has reserved 1T of extra space and balances don't reclaim it Brandon Heisner
  ` (2 preceding siblings ...)
  2021-09-29 15:18 ` Andrea Gelmini
@ 2021-09-29 17:31 ` Zygo Blaxell
  2021-10-01  7:49   ` Brandon Heisner
  3 siblings, 1 reply; 11+ messages in thread

From: Zygo Blaxell @ 2021-09-29 17:31 UTC (permalink / raw)
To: Brandon Heisner; +Cc: linux-btrfs

On Tue, Sep 28, 2021 at 09:23:01PM -0500, Brandon Heisner wrote:
> I have a server running CentOS 7 on 4.9.5-1.el7.elrepo.x86_64 #1 SMP
> Fri Jan 20 11:34:13 EST 2017 x86_64 x86_64 x86_64 GNU/Linux. It is

That is a really old kernel. I recall there were some anomalous
metadata allocation behaviors with kernels of that age; e.g. running
scrub and balance at the same time would allocate a lot of metadata,
because scrub would lock a metadata block group immediately after it
had been allocated, forcing another metadata block group to be
allocated immediately. The symptom of that bug is very similar to
yours: without warning, hundreds of GB of metadata block groups are
allocated, all empty, during a scrub or balance operation.

Unfortunately I don't have a better solution than "upgrade to a newer
kernel", as that particular bug was solved years ago (along with
hundreds of others).

> version locked to that kernel. The metadata has reserved a full 1T of
> disk space, while only using ~38G. I've tried to balance the metadata
> to reclaim that space so it can be used for data, but it doesn't work
> and gives no errors. It just says it balanced the chunks, but the size
> doesn't change. The metadata total is still growing as well, as it
> used to be 1.04 and now it is 1.08 with only about 10G more of
> metadata used. I've tried doing balances up to 70 or 80 musage, I
> think, and the total metadata does not decrease. I've made so many
> attempts at balancing that I've probably tried to move 300 chunks or
> more. None have resulted in any change to the metadata total like they
> do on other servers running btrfs. I first started with a very low
> musage, like 10, and then increased it by 10 to see whether that would
> balance any chunks out, but with no success.

Have you tried rebooting? The block groups may be stuck in a locked
state in memory or pinned by pending discard requests, in which case
balance won't touch them. For that matter, try turning off discard
(it's usually better to run fstrim once a day anyway, and not use the
discard mount option).

> # /sbin/btrfs balance start -musage=60 -mlimit=30 /opt/zimbra
> Done, had to relocate 30 out of 2127 chunks
>
> I can run that command over and over again, or increase the mlimit,
> and it never changes the metadata total.

I would use just -m here (no filters, only metadata). If it gets the
allocation under control, run 'btrfs balance cancel'; if it doesn't,
let it run all the way to the end. Each balance starts from the last
block group, so you are effectively restarting balance to process the
same 30 block groups over and over here.

> # btrfs fi show /opt/zimbra/
> Label: 'Data'  uuid: ece150db-5817-4704-9e84-80f7d8a3b1da
>         Total devices 4 FS bytes used 1.48TiB
>         devid    1 size 1.46TiB used 1.38TiB path /dev/sde
>         devid    2 size 1.46TiB used 1.38TiB path /dev/sdf
>         devid    3 size 1.46TiB used 1.38TiB path /dev/sdg
>         devid    4 size 1.46TiB used 1.38TiB path /dev/sdh
>
> # btrfs fi df /opt/zimbra/
> Data, RAID10: total=1.69TiB, used=1.45TiB
> System, RAID10: total=64.00MiB, used=640.00KiB
> Metadata, RAID10: total=1.08TiB, used=37.69GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
>
> # btrfs fi us /opt/zimbra/ -T
> Overall:
>     Device size:           5.82TiB
>     Device allocated:      5.54TiB
>     Device unallocated:  291.54GiB
>     Device missing:          0.00B
>     Used:                  2.96TiB
>     Free (estimated):    396.36GiB  (min: 396.36GiB)
>     Data ratio:               2.00
>     Metadata ratio:           2.00
>     Global reserve:      512.00MiB  (used: 0.00B)
>
>              Data      Metadata  System
> Id Path      RAID10    RAID10    RAID10    Unallocated
> -- --------  --------- --------- --------- -----------
>  1 /dev/sde  432.75GiB 276.00GiB  16.00MiB   781.65GiB
>  2 /dev/sdf  432.75GiB 276.00GiB  16.00MiB   781.65GiB
>  3 /dev/sdg  432.75GiB 276.00GiB  16.00MiB   781.65GiB
>  4 /dev/sdh  432.75GiB 276.00GiB  16.00MiB   781.65GiB
> -- --------  --------- --------- --------- -----------
>    Total       1.69TiB   1.08TiB  64.00MiB     3.05TiB
>    Used        1.45TiB  37.69GiB 640.00KiB
>
> --
> Brandon Heisner
> System Administrator
> Wolfram Research

^ permalink raw reply	[flat|nested] 11+ messages in thread
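The advice above can be sketched as two small helpers: one that runs a single unfiltered metadata balance, and one to cancel it early (from another shell) once the Metadata total is back under control. Illustrative only; assumes root and a btrfs mount point:

```shell
# Sketch: -m with no filters rewrites every metadata block group once,
# instead of re-processing the same first 30 chunks on every run.
balance_all_metadata() {
    btrfs balance start -m "$1"
}

# Safe to call while the balance above runs elsewhere: the block group
# currently being relocated finishes, then the balance stops.
cancel_running_balance() {
    btrfs balance cancel "$1"
}
```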
* Re: btrfs metadata has reserved 1T of extra space and balances don't reclaim it
  2021-09-29 17:31 ` Zygo Blaxell
@ 2021-10-01  7:49   ` Brandon Heisner
  0 siblings, 0 replies; 11+ messages in thread

From: Brandon Heisner @ 2021-10-01 7:49 UTC (permalink / raw)
To: Zygo Blaxell; +Cc: linux-btrfs

A reboot of the server did help quite a bit with the problem, though
it is still not fixed completely. I went from having 1.08T reserved
for metadata to "only" 446G reserved. My free space went from 346G to
1010G, so at least I have some breathing room again. I prefer not to
do a defrag, as that breaks all the CoW links and disk usage would
then go up. I haven't tried the balance of all the metadata, which
might be resource intensive.

# btrfs fi us /opt/zimbra/ -T
Overall:
    Device size:            5.82TiB
    Device allocated:       4.36TiB
    Device unallocated:     1.46TiB
    Device missing:           0.00B
    Used:                   3.05TiB
    Free (estimated):    1010.62GiB  (min: 1010.62GiB)
    Data ratio:                2.00
    Metadata ratio:            2.00
    Global reserve:       512.00MiB  (used: 0.00B)

             Data      Metadata  System
Id Path      RAID10    RAID10    RAID10    Unallocated
-- --------  --------- --------- --------- -----------
 1 /dev/sdc  446.25GiB 111.50GiB  32.00MiB   932.63GiB
 2 /dev/sdd  446.25GiB 111.50GiB  32.00MiB   932.63GiB
 3 /dev/sde  446.25GiB 111.50GiB  32.00MiB   932.63GiB
 4 /dev/sdf  446.25GiB 111.50GiB  32.00MiB   932.63GiB
-- --------  --------- --------- --------- -----------
   Total       1.74TiB 446.00GiB 128.00MiB     3.64TiB
   Used        1.49TiB  38.16GiB 464.00KiB

# btrfs fi df /opt/zimbra/
Data, RAID10: total=1.74TiB, used=1.49TiB
System, RAID10: total=128.00MiB, used=464.00KiB
Metadata, RAID10: total=446.00GiB, used=38.19GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

----- On Sep 29, 2021, at 12:31 PM, Zygo Blaxell ce3g8jdj@umail.furryterror.org wrote:

> On Tue, Sep 28, 2021 at 09:23:01PM -0500, Brandon Heisner wrote:
>> I have a server running CentOS 7 on 4.9.5-1.el7.elrepo.x86_64 #1 SMP
>> Fri Jan 20 11:34:13 EST 2017 x86_64 x86_64 x86_64 GNU/Linux. It is
>
> That is a really old kernel. I recall there were some anomalous
> metadata allocation behaviors with kernels of that age; e.g. running
> scrub and balance at the same time would allocate a lot of metadata,
> because scrub would lock a metadata block group immediately after it
> had been allocated, forcing another metadata block group to be
> allocated immediately. The symptom of that bug is very similar to
> yours: without warning, hundreds of GB of metadata block groups are
> allocated, all empty, during a scrub or balance operation.
>
> Unfortunately I don't have a better solution than "upgrade to a newer
> kernel", as that particular bug was solved years ago (along with
> hundreds of others).
>
>> version locked to that kernel. The metadata has reserved a full 1T of
>> disk space, while only using ~38G. I've tried to balance the metadata
>> to reclaim that space so it can be used for data, but it doesn't work
>> and gives no errors. It just says it balanced the chunks, but the size
>> doesn't change. The metadata total is still growing as well, as it
>> used to be 1.04 and now it is 1.08 with only about 10G more of
>> metadata used. I've tried doing balances up to 70 or 80 musage, I
>> think, and the total metadata does not decrease. I've made so many
>> attempts at balancing that I've probably tried to move 300 chunks or
>> more. None have resulted in any change to the metadata total like they
>> do on other servers running btrfs. I first started with a very low
>> musage, like 10, and then increased it by 10 to see whether that would
>> balance any chunks out, but with no success.
>
> Have you tried rebooting? The block groups may be stuck in a locked
> state in memory or pinned by pending discard requests, in which case
> balance won't touch them. For that matter, try turning off discard
> (it's usually better to run fstrim once a day anyway, and not use the
> discard mount option).
>
>> # /sbin/btrfs balance start -musage=60 -mlimit=30 /opt/zimbra
>> Done, had to relocate 30 out of 2127 chunks
>>
>> I can run that command over and over again, or increase the mlimit,
>> and it never changes the metadata total.
>
> I would use just -m here (no filters, only metadata). If it gets the
> allocation under control, run 'btrfs balance cancel'; if it doesn't,
> let it run all the way to the end. Each balance starts from the last
> block group, so you are effectively restarting balance to process the
> same 30 block groups over and over here.
>
>> # btrfs fi show /opt/zimbra/
>> Label: 'Data'  uuid: ece150db-5817-4704-9e84-80f7d8a3b1da
>>         Total devices 4 FS bytes used 1.48TiB
>>         devid    1 size 1.46TiB used 1.38TiB path /dev/sde
>>         devid    2 size 1.46TiB used 1.38TiB path /dev/sdf
>>         devid    3 size 1.46TiB used 1.38TiB path /dev/sdg
>>         devid    4 size 1.46TiB used 1.38TiB path /dev/sdh
>>
>> # btrfs fi df /opt/zimbra/
>> Data, RAID10: total=1.69TiB, used=1.45TiB
>> System, RAID10: total=64.00MiB, used=640.00KiB
>> Metadata, RAID10: total=1.08TiB, used=37.69GiB
>> GlobalReserve, single: total=512.00MiB, used=0.00B
>>
>> # btrfs fi us /opt/zimbra/ -T
>> Overall:
>>     Device size:           5.82TiB
>>     Device allocated:      5.54TiB
>>     Device unallocated:  291.54GiB
>>     Device missing:          0.00B
>>     Used:                  2.96TiB
>>     Free (estimated):    396.36GiB  (min: 396.36GiB)
>>     Data ratio:               2.00
>>     Metadata ratio:           2.00
>>     Global reserve:      512.00MiB  (used: 0.00B)
>>
>>              Data      Metadata  System
>> Id Path      RAID10    RAID10    RAID10    Unallocated
>> -- --------  --------- --------- --------- -----------
>>  1 /dev/sde  432.75GiB 276.00GiB  16.00MiB   781.65GiB
>>  2 /dev/sdf  432.75GiB 276.00GiB  16.00MiB   781.65GiB
>>  3 /dev/sdg  432.75GiB 276.00GiB  16.00MiB   781.65GiB
>>  4 /dev/sdh  432.75GiB 276.00GiB  16.00MiB   781.65GiB
>> -- --------  --------- --------- --------- -----------
>>    Total       1.69TiB   1.08TiB  64.00MiB     3.05TiB
>>    Used        1.45TiB  37.69GiB 640.00KiB
>>
>> --
>> Brandon Heisner
>> System Administrator
>> Wolfram Research

^ permalink raw reply	[flat|nested] 11+ messages in thread
end of thread, other threads:[~2021-10-03 18:21 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-29  2:23 btrfs metadata has reserved 1T of extra space and balances don't reclaim it Brandon Heisner
2021-09-29  7:23 ` Forza
2021-09-29 14:34   ` Brandon Heisner
2021-10-03 11:26     ` Forza
2021-10-03 18:21       ` Zygo Blaxell
2021-09-29  8:22 ` Qu Wenruo
2021-09-29 15:18 ` Andrea Gelmini
2021-09-29 16:39   ` Forza
2021-09-29 18:55     ` Andrea Gelmini
2021-09-29 17:31 ` Zygo Blaxell
2021-10-01  7:49   ` Brandon Heisner