* Switching from spacecache v1 to v2
From: waxhead @ 2020-10-31 0:27 UTC
To: Btrfs BTRFS

A couple of months ago I asked on IRC how to properly switch from version 1
to version 2 of the space cache. I also asked if the space cache v2 was
considered stable.
I only remember what we talked about, and from what I understood it was not
as easy to switch as the wiki may seem to indicate.

We run a box with a btrfs filesystem at 19TB, 9 disks, 11 subvolumes that
contains about 6.5 million files (and this number is growing).

The filesystem has always been mounted with just the default options.

Performance is slow, and it improved when I moved the bulk of the files to
various subvolumes for some reason. The wiki states that performance on very
large filesystems (what is considered large?) may degrade drastically.

I would like to try v2 of the space cache to see if that improves speed a
bit.

So is space cache v2 safe to use?!
And
How do I make the switch properly?
* Re: Switching from spacecache v1 to v2
From: Zygo Blaxell @ 2020-11-01 17:49 UTC
To: waxhead; +Cc: Btrfs BTRFS

On Sat, Oct 31, 2020 at 01:27:57AM +0100, waxhead wrote:
> A couple of months ago I asked on IRC how to properly switch from version 1
> to version 2 of the space cache. I also asked if the space cache v2 was
> considered stable.
> I only remember what we talked about, and from what I understood it was not
> as easy to switch as the wiki may seem to indicate.
>
> We run a box with a btrfs filesystem at 19TB, 9 disks, 11 subvolumes that
> contains about 6.5 million files (and this number is growing).
>
> The filesystem has always been mounted with just the default options.
>
> Performance is slow, and it improved when I moved the bulk of the files to
> various subvolumes for some reason. The wiki states that performance on very
> large filesystems (what is considered large?) may degrade drastically.

The important number for space_cache=v1 performance is the number of block
groups in which some space was allocated or deallocated per transaction
(i.e. the number of block groups that have to be updated on disk),
divided by the speed of the drives (i.e. the number of seeks they can
perform per second).

"Large" could be 100GB if it was on a slow disk with a highly fragmented
workload and low latency requirement.

A 19TB filesystem has up to 19000 block groups and a spinning disk can do
maybe 150 seeks per second, so a worst-case commit could take a couple of
minutes. Delete a few old snapshots, and you'll add enough fragmentation
to touch a significant portion of the block groups, and thus see a lot
of additional latency.

> I would like to try v2 of the space cache to see if that improves speed a
> bit.
>
> So is space cache v2 safe to use?!
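[As an editorial aside: the worst-case figure in the reply above can be
sketched with shell arithmetic, assuming, as the reply does, roughly one
block group per GB and ~150 seeks per second on spinning disks.]

```shell
# Rough worst-case space_cache=v1 commit time for a 19TB filesystem:
# up to ~19000 block groups to rewrite, at ~150 seeks/second.
block_groups=19000
seeks_per_sec=150
echo "worst-case commit: $(( block_groups / seeks_per_sec )) seconds"
# -> worst-case commit: 126 seconds
```

About two minutes, which matches the "couple of minutes" estimate above.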
AFAIK it has been 663 days since the last bug fix specific to free space
tree (a6d8654d885d "Btrfs: fix deadlock when using free space tree due
to block group creation" from 5.0). That fix was backported to earlier
LTS kernels.

We switched to space_cache=v2 for all new filesystems back in 2016, and
upgraded our last legacy machine still running space_cache=v1 in 2019.

I have never considered going back to v1: we have no machines running
v1, I don't run regression tests on new kernels with v1, and I've never
seen a filesystem fail in the field due to v2 (even with the bugs we
now know it had).

IMHO the real question is "is v1 safe to use", given that its design is
based on letting errors happen, then detecting and recovering from them
after they occur (this is the mechanism behind the ubiquitous "failed to
load free space cache for block group %llu, rebuilding it now" message).
v2 prevents the errors from happening in the first place by using the
same btrfs metadata update mechanisms that are used for everything else
in the filesystem.

The problems in v1 may be mostly theoretical. I've never cared enough
about v1 to try a practical experiment to see if btrfs recovers from
these problems correctly (or not). v2 doesn't have those problems even
in theory, and it works, so I use v2 instead.

> And
> How do I make the switch properly?

Unmount the filesystem, mount it once with -o clear_cache,space_cache=v2.
It will take some time to create the tree. After that, no mount option
is needed.

With current kernels it is not possible to upgrade while the filesystem is
online, i.e. to upgrade "/" you have to set rootflags in the bootloader
or boot from external media. That and the long mount time to do the
conversion (which offends systemd's default mount timeout parameters)
are the two major gotchas.

There are some patches for future kernels that will take care of details
like deleting the v1 space cache inodes and other inert parts of the
space_cache=v1 infrastructure.
I would not bother with these
now, and instead let future kernels clean up automatically.
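[Editorial sketch of the procedure described above. The device and mount
point names are placeholders for your own; adjust before use.]

```shell
# One-time conversion from space_cache=v1 to v2, per the advice above.
# /dev/sda and /mnt/data are hypothetical names.
umount /mnt/data

# This mount clears the v1 cache and builds the free space tree; on a
# large filesystem it can take long enough to trip systemd's default
# mount timeout, so run it by hand rather than via fstab.
mount -o clear_cache,space_cache=v2 /dev/sda /mnt/data

# From now on the free space tree is used automatically; no option needed.
umount /mnt/data
mount /dev/sda /mnt/data

# For the root filesystem, pass the same options once via the bootloader
# instead, e.g. appended to the kernel command line:
#   rootflags=clear_cache,space_cache=v2
```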
* Re: Switching from spacecache v1 to v2
From: A L @ 2020-11-02 5:48 UTC
To: Zygo Blaxell, waxhead; +Cc: Btrfs BTRFS

>> And
>> How do I make the switch properly?
>
> Unmount the filesystem, mount it once with -o clear_cache,space_cache=v2.
> It will take some time to create the tree. After that, no mount option
> is needed.
>
> With current kernels it is not possible to upgrade while the filesystem is
> online, i.e. to upgrade "/" you have to set rootflags in the bootloader
> or boot from external media. That and the long mount time to do the
> conversion (which offends systemd's default mount timeout parameters)
> are the two major gotchas.
>
> There are some patches for future kernels that will take care of details
> like deleting the v1 space cache inodes and other inert parts of the
> space_cache=v1 infrastructure. I would not bother with these
> now, and instead let future kernels clean up automatically.

There is also this option according to the man page of btrfs-check:
https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs-check

    --clear-space-cache v1|v2
        completely wipe all free space cache of given type

        For free space cache v1, the clear_cache kernel mount option only
        rebuilds the free space cache for block groups that are modified
        while the filesystem is mounted with that option. Thus, using this
        option with v1 makes it possible to actually clear the entire free
        space cache.

        For free space cache v2, the clear_cache kernel mount option
        destroys the entire free space cache. This option, with v2,
        provides an alternative method of clearing the free space cache
        that doesn’t require mounting the filesystem.

Is there any practical difference to clearing the space cache using mount
options?
For example, would a lot of old space_cache=v1 data remain on-disk after
mounting -o clear_cache,space_cache=v2? Would that affect performance in
any way?

Thanks.
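[Editorial sketch of the offline route from the man-page text quoted
above. Device and mount point names are placeholders; btrfs check must be
run on an unmounted filesystem.]

```shell
# Offline alternative: wipe the entire v1 cache with btrfs check while
# unmounted, then mount with space_cache=v2 to build the free space tree.
umount /mnt/data
btrfs check --clear-space-cache v1 /dev/sda
mount -o space_cache=v2 /dev/sda /mnt/data
```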
* Re: Switching from spacecache v1 to v2
From: Zygo Blaxell @ 2020-11-02 14:40 UTC
To: A L; +Cc: waxhead, Btrfs BTRFS

On Mon, Nov 02, 2020 at 06:48:11AM +0100, A L wrote:
> > > And
> > > How do I make the switch properly?
> > Unmount the filesystem, mount it once with -o clear_cache,space_cache=v2.
> > It will take some time to create the tree. After that, no mount option
> > is needed.
> > [...]
>
> There is also this option according to the man page of btrfs-check:
> https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs-check
>
>     --clear-space-cache v1|v2
>         completely wipe all free space cache of given type
> [...]
>
> Is there any practical difference to clearing the space cache using mount
> options?
It's easier, because mount requires only setting flags. It doesn't
require the additional separate step of running btrfs check.

The kernel will currently recreate parts of the v1 structures when
space_cache=v2 is used, so it will partially cancel out the work btrfs
check does. There is a patch out there to fix that (see "btrfs: skip
space_cache v1 setup when not using it").

> For example, would a lot of old space_cache=v1 data remain on-disk
> after mounting -o clear_cache,space_cache=v2 ?

It does, but the space used is negligible. Future kernels will clean it
up automatically, assuming the patch set lands.

> Would that affect performance in any way?

Unused space cache is inert. It only takes up space, and not very much
of that.

> Thanks.
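[Editorial aside, not from the thread: one way to check afterwards which
cache a filesystem is using. This assumes, as I understand it, that
space_cache=v2 sets the FREE_SPACE_TREE read-only-compat flags in the
superblock; device and mount point names are placeholders.]

```shell
# After conversion, the superblock's compat_ro flags should include
# FREE_SPACE_TREE and FREE_SPACE_TREE_VALID.
btrfs inspect-internal dump-super /dev/sda | grep -i compat_ro_flags

# A mounted filesystem also shows the active option in /proc/mounts:
grep ' /mnt/data ' /proc/mounts
```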
* Re: Switching from spacecache v1 to v2
From: waxhead @ 2020-11-02 17:03 UTC
To: Zygo Blaxell; +Cc: Btrfs BTRFS

Zygo Blaxell wrote:
> On Sat, Oct 31, 2020 at 01:27:57AM +0100, waxhead wrote:
> [...]
>
> Unmount the filesystem, mount it once with -o clear_cache,space_cache=v2.
> It will take some time to create the tree. After that, no mount option
> is needed.
>
> With current kernels it is not possible to upgrade while the filesystem is
> online, i.e. to upgrade "/" you have to set rootflags in the bootloader
> or boot from external media. That and the long mount time to do the
> conversion (which offends systemd's default mount timeout parameters)
> are the two major gotchas.
> There are some patches for future kernels that will take care of details
> like deleting the v1 space cache inodes and other inert parts of the
> space_cache=v1 infrastructure. I would not bother with these
> now, and instead let future kernels clean up automatically.

Well, I did exactly as you said. I mounted the filesystem from a live CD
with -o clear_cache,space_cache=v2 and rebooted back into the system
(yes, the rootfs is btrfs).

Everything I am about to say is of course subjective, but the system is
significantly more snappy now - quite a lot too. So unless the live CD
with kernel 5.9 tuned something magnificent that has a nice effect on
5.8 as well, the change to v2 space cache was significant on our box.

So if I may summarize... COW-ABUNGA! WOW!

Not sure why it had such a profound impact on performance, but perhaps
v2 should be the default?!
end of thread, other threads: [~2020-11-02 17:03 UTC | newest]

Thread overview: 5+ messages
2020-10-31  0:27 Switching from spacecache v1 to v2 waxhead
2020-11-01 17:49 ` Zygo Blaxell
2020-11-02  5:48   ` A L
2020-11-02 14:40     ` Zygo Blaxell
2020-11-02 17:03   ` waxhead