* Understanding BTRFS RAID0 Performance
@ 2018-10-04 21:33 Wilson, Ellis
  2018-10-05  8:45 ` Nikolay Borisov
  2018-10-05 10:40 ` Duncan
  0 siblings, 2 replies; 6+ messages in thread
From: Wilson, Ellis @ 2018-10-04 21:33 UTC
To: Btrfs BTRFS

Hi all,

I'm attempting to understand a roughly 30% degradation in BTRFS RAID0
for large read I/Os across six disks compared with ext4 atop mdadm RAID0.

Specifically, I achieve performance parity with BTRFS in terms of
single-threaded write and read, and multi-threaded write, but poor
performance for multi-threaded read.  The relative discrepancy appears
to grow as one adds disks.  At 6 disks in a RAID0 (yes, I know, and I do
not care about data persistence as I have this solved at a different
layer) I see approximately 1.3GB/s for ext4 atop mdadm, but only about
950MB/s for BTRFS, both using four threads to read and write four
different large files.  Across a large number of my nodes this
aggregates to a sizable performance loss.

This has been a long and winding road for me, but to keep my question
somewhat succinct, I'm down to the level of block tracing, and one thing
that stands out between the two traces is that the number of rather small
read I/Os reaching one of the drives in the test is vastly different for
mdadm RAID0 vs BTRFS, which I think explains (in part at least) the
performance drop-off.  The read queue depth for BTRFS hovers in the upper
single digits while the ext4/mdadm queue depth is towards 20.  I'm unsure
right now if this is related or not.

Benchmark: FIO was used with the following command:

fio --name=read --rw=read --bs=1M --direct=0 --size=16G --numjobs=4
--runtime=120 --group_reporting

The block sizes and counts of I/Os at each size I'm seeing for both
cases come in like the following (my max_sectors_kb is 4096, hence the
typical upper end seen below):

BTRFS:
  Count   Read I/O size (512 B sectors)
  21849   128
     18   640
      9   768
      3   1280
      9   1408
      3   2048
      3   2560
   1011   2688
    507   2816

ext4 on mdadm RAID0:
  Count   Read I/O size (512 B sectors)
      9   8
      3   16
      5   256
      5   768
     19   1024
    716   1536
      5   1592
      5   2504
    695   2560
     24   4096
     21   6656
    477   8192

Before I dive into the BTRFS source or try tracing in a different way, I
wanted to see if this is a well-known artifact of BTRFS RAID0 and, even
better, if there are any tunables available for RAID0 in BTRFS I could
play with.  The man pages for mkfs.btrfs and btrfstune seemed...sparse
in the tuning regard.

Any help or pointers are greatly appreciated!

Thanks,

ellis
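For reference, a per-size histogram of completed reads like the ones above
can be gathered with blktrace/blkparse while the fio run is in progress.
This is only a rough sketch: it assumes /dev/sdb is one of the member
drives (substitute the device actually under test) and that the default
blkparse output field positions hold, with sizes reported in 512-byte
sectors.

  # trace one member drive for the duration of the fio run
  blktrace -d /dev/sdb -w 120 -o sdb

  # count completed ('C') read requests by size, in sectors
  blkparse -i sdb | awk '$6 == "C" && $7 ~ /R/ { cnt[$10]++ }
                         END { for (s in cnt) print cnt[s], s }' | sort -k2 -n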
* Re: Understanding BTRFS RAID0 Performance
  2018-10-04 21:33 Understanding BTRFS RAID0 Performance Wilson, Ellis
@ 2018-10-05  8:45 ` Nikolay Borisov
  2018-10-05 10:40 ` Duncan
  1 sibling, 0 replies; 6+ messages in thread
From: Nikolay Borisov @ 2018-10-05 8:45 UTC
To: Wilson, Ellis, Btrfs BTRFS

On 5.10.2018 00:33, Wilson, Ellis wrote:
> Hi all,
>
> I'm attempting to understand a roughly 30% degradation in BTRFS RAID0
> for large read I/Os across six disks compared with ext4 atop mdadm RAID0.
>
> Specifically, I achieve performance parity with BTRFS in terms of
> single-threaded write and read, and multi-threaded write, but poor
> performance for multi-threaded read.  The relative discrepancy appears
> to grow as one adds disks.  At 6 disks in a RAID0 (yes, I know, and I do
> not care about data persistence as I have this solved at a different
> layer) I see approximately 1.3GB/s for ext4 atop mdadm, but only about
> 950MB/s for BTRFS, both using four threads to read and write four
> different large files.  Across a large number of my nodes this
> aggregates to a sizable performance loss.
>
> This has been a long and winding road for me, but to keep my question
> somewhat succinct, I'm down to the level of block tracing, and one thing
> that stands out between the two traces is that the number of rather small
> read I/Os reaching one of the drives in the test is vastly different for
> mdadm RAID0 vs BTRFS, which I think explains (in part at least) the
> performance drop-off.  The read queue depth for BTRFS hovers in the upper
> single digits while the ext4/mdadm queue depth is towards 20.  I'm unsure
> right now if this is related or not.
>
> Benchmark: FIO was used with the following command:
> fio --name=read --rw=read --bs=1M --direct=0 --size=16G --numjobs=4
> --runtime=120 --group_reporting

Right, so you are doing sequential reads.  Btrfs uses
generic_file_read_iter for its read-related operations, which ends up
calling btrfs_readpage and from there:

btrfs_readpage
 extent_read_full_page
  __extent_read_full_page
   __do_readpage
    submit_extent_page <- Here we have some code which is supposed to
                          detect contiguous bios and merge them

So my first guess would be to instrument the code around the merging
logic and see whether it works as expected and is able to merge the
majority of the bios.

>
> The block sizes and counts of I/Os at each size I'm seeing for both
> cases come in like the following (my max_sectors_kb is 4096, hence the
> typical upper end seen below):
>
> BTRFS:
>   Count   Read I/O size (512 B sectors)
>   21849   128
>      18   640
>       9   768
>       3   1280
>       9   1408
>       3   2048
>       3   2560
>    1011   2688
>     507   2816
>
> ext4 on mdadm RAID0:
>   Count   Read I/O size (512 B sectors)
>       9   8
>       3   16
>       5   256
>       5   768
>      19   1024
>     716   1536
>       5   1592
>       5   2504
>     695   2560
>      24   4096
>      21   6656
>     477   8192
>
> Before I dive into the BTRFS source or try tracing in a different way, I
> wanted to see if this is a well-known artifact of BTRFS RAID0 and, even
> better, if there are any tunables available for RAID0 in BTRFS I could
> play with.  The man pages for mkfs.btrfs and btrfstune seemed...sparse
> in the tuning regard.
>
> Any help or pointers are greatly appreciated!
>
> Thanks,
>
> ellis
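A low-effort way to check whether that merging path is actually producing
large bios (before any block-layer splitting) is to histogram the size of
every bio btrfs submits.  The following is only a sketch: it probes the
generic submit_bio entry point rather than anything btrfs-specific, it
assumes bpftrace with kernel headers available for the struct cast, and the
comm filter may miss submissions done from kworkers.

  # power-of-2 histogram of bio sizes (bytes) handed to the block layer
  bpftrace -e '
    #include <linux/blkdev.h>
    kprobe:submit_bio /comm == "fio"/ {
      @bio_bytes = hist(((struct bio *)arg0)->bi_iter.bi_size);
    }'

On kernels or setups without bpftrace, the block:block_bio_queue tracepoint
via ftrace gives roughly the same per-bio sector counts.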
* Re: Understanding BTRFS RAID0 Performance
  2018-10-04 21:33 Understanding BTRFS RAID0 Performance Wilson, Ellis
  2018-10-05  8:45 ` Nikolay Borisov
@ 2018-10-05 10:40 ` Duncan
  2018-10-05 15:29   ` Wilson, Ellis
  1 sibling, 1 reply; 6+ messages in thread
From: Duncan @ 2018-10-05 10:40 UTC
To: linux-btrfs

Wilson, Ellis posted on Thu, 04 Oct 2018 21:33:29 +0000 as excerpted:

> Hi all,
>
> I'm attempting to understand a roughly 30% degradation in BTRFS RAID0
> for large read I/Os across six disks compared with ext4 atop mdadm
> RAID0.
>
> Specifically, I achieve performance parity with BTRFS in terms of
> single-threaded write and read, and multi-threaded write, but poor
> performance for multi-threaded read.  The relative discrepancy appears
> to grow as one adds disks.

[...]

> Before I dive into the BTRFS source or try tracing in a different way, I
> wanted to see if this is a well-known artifact of BTRFS RAID0 and, even
> better, if there are any tunables available for RAID0 in BTRFS I could
> play with.  The man pages for mkfs.btrfs and btrfstune seemed...sparse
> in the tuning regard.

This is indeed well known for btrfs at this point, as it hasn't been
multi-read-thread optimized yet.  I'm personally more familiar with the
raid1 case, where which of the two copies gets the read is simply
even/odd-PID-based, but AFAIK raid0 isn't particularly optimized either.

The recommended workaround is (as you might expect) btrfs on top of
mdraid.  In fact, while it doesn't apply to your case, btrfs raid1 on top
of mdraid0s is often recommended as an alternative to btrfs raid10, as
that gives you the best of both worlds -- the data and metadata integrity
protection of btrfs checksums and fallback (with writeback of the correct
version) to the other copy if the first copy read fails checksum
verification, with the much better optimized mdraid0 performance.  So it
stands to reason that the same recommendation would apply to raid0 --
just do single-mode btrfs on mdraid0, for better performance than the as
yet unoptimized btrfs raid0.

-- 
Duncan - List replies preferred.  No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
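For anyone wanting to try the layering described above, a minimal sketch
might look like the following.  The device names, chunk size, and mount
point are placeholders for whatever the real setup uses.

  # 6-wide md RAID0 out of the member drives (chunk size in KiB)
  mdadm --create /dev/md0 --level=0 --raid-devices=6 --chunk=512 /dev/sd[b-g]

  # single-mode btrfs on top; data striping is left entirely to md
  mkfs.btrfs -d single -m dup /dev/md0
  mount -o noatime /dev/md0 /mnt/scratch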
* Re: Understanding BTRFS RAID0 Performance
  2018-10-05 10:40 ` Duncan
@ 2018-10-05 15:29   ` Wilson, Ellis
  2018-10-06  0:34     ` Duncan
  0 siblings, 1 reply; 6+ messages in thread
From: Wilson, Ellis @ 2018-10-05 15:29 UTC
To: Duncan, linux-btrfs

On 10/05/2018 06:40 AM, Duncan wrote:
> Wilson, Ellis posted on Thu, 04 Oct 2018 21:33:29 +0000 as excerpted:
>
>> Hi all,
>>
>> I'm attempting to understand a roughly 30% degradation in BTRFS RAID0
>> for large read I/Os across six disks compared with ext4 atop mdadm
>> RAID0.
>>
>> Specifically, I achieve performance parity with BTRFS in terms of
>> single-threaded write and read, and multi-threaded write, but poor
>> performance for multi-threaded read.  The relative discrepancy appears
>> to grow as one adds disks.
>
> [...]
>
>> Before I dive into the BTRFS source or try tracing in a different way, I
>> wanted to see if this is a well-known artifact of BTRFS RAID0 and, even
>> better, if there are any tunables available for RAID0 in BTRFS I could
>> play with.  The man pages for mkfs.btrfs and btrfstune seemed...sparse
>> in the tuning regard.
>
> This is indeed well known for btrfs at this point, as it hasn't been
> multi-read-thread optimized yet.  I'm personally more familiar with the
> raid1 case, where which of the two copies gets the read is simply
> even/odd-PID-based, but AFAIK raid0 isn't particularly optimized either.
>
> The recommended workaround is (as you might expect) btrfs on top of
> mdraid.  In fact, while it doesn't apply to your case, btrfs raid1 on top
> of mdraid0s is often recommended as an alternative to btrfs raid10, as
> that gives you the best of both worlds -- the data and metadata integrity
> protection of btrfs checksums and fallback (with writeback of the correct
> version) to the other copy if the first copy read fails checksum
> verification, with the much better optimized mdraid0 performance.  So it
> stands to reason that the same recommendation would apply to raid0 --
> just do single-mode btrfs on mdraid0, for better performance than the as
> yet unoptimized btrfs raid0.

Thank you very much Duncan.

I failed to mention that I'd tried this before as well, but was hoping to
avoid it as it felt like a kludge, and it didn't give me the big jump I
expected, so I forgot about it.  I retested, and btrfs on mdraid in a
six-wide RAID0 does improve performance slightly -- I see typically
990MB/s, and up to around 1.1GB/s in the best case.  Same options to fio
as in my original email.  Still a ways away from ext4 (which admittedly
may be cheating a bit, since it seems to detect the md0 underneath it and
adjust its stride length accordingly, though I may be over-representing
its intelligence about this).

The I/O sizes improve greatly, to parity with ext4 atop mdraid, but the
queue depth is still fairly low -- even with many processes it rarely
exceeds 5 or 6.  This is true whether I run fio with or without the aio
ioengine.

Is there any tuning in BTRFS that limits the number of outstanding reads
at a time to a small single-digit number, or something else that could be
behind small queue depths?  I can't otherwise imagine what the difference
would be on the read path between ext4 vs btrfs when both are on mdraid.

Thanks again for your insights,

ellis
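For watching the per-device queue depth during runs like these, plain
iostat is usually enough.  A sketch; the relevant column is labelled
avgqu-sz or aqu-sz depending on the sysstat version, and the rows to watch
are md0 and the individual member drives.

  # extended per-device stats, refreshed every second while fio runs;
  # the avgqu-sz / aqu-sz column is the average request queue depth
  iostat -x 1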
* Re: Understanding BTRFS RAID0 Performance
  2018-10-05 15:29 ` Wilson, Ellis
@ 2018-10-06  0:34   ` Duncan
  2018-10-08 12:20     ` Austin S. Hemmelgarn
  0 siblings, 1 reply; 6+ messages in thread
From: Duncan @ 2018-10-06 0:34 UTC
To: linux-btrfs

Wilson, Ellis posted on Fri, 05 Oct 2018 15:29:52 +0000 as excerpted:

> Is there any tuning in BTRFS that limits the number of outstanding reads
> at a time to a small single-digit number, or something else that could be
> behind small queue depths?  I can't otherwise imagine what the difference
> would be on the read path between ext4 vs btrfs when both are on mdraid.

It seems I forgot to directly answer that question in my first reply.
Thanks for restating it.

Btrfs doesn't really expose much performance tuning (yet?), at least
outside the code itself.  There are a few very limited knobs, but they're
just that, few and limited or broad-stroke.

There are mount options like ssd/nossd, ssd_spread/nossd_spread, the
space_cache set of options (see below), flushoncommit/noflushoncommit,
commit=<seconds>, etc (see the btrfs (5) manpage), but nothing really to
influence stride length, etc, or to optimize chunk placement between ssd
and non-ssd devices, for instance.

And there are a few filesystem features, normally set at mkfs.btrfs time
(and thus covered in the mkfs.btrfs manpage), some of which can be tuned
later, but generally the defaults have changed over time to reflect the
best case, and the older variants are there primarily to retain backward
compatibility with old kernels and tools that didn't handle the newer
variants.

That said, as I think about it there are some tunables that may be worth
experimenting with.  Most or all of these are covered in the btrfs (5)
manpage.  (A combined example putting several of them together follows
after this message.)

* Given the large device numbers you mention and raid0, you're likely
dealing with multi-TB-scale filesystems.  At this level, the
space_cache=v2 mount option may be useful.  It's not the default yet as
btrfs check, etc, don't yet handle it, but given your raid0 choice you
may not be concerned about that.  It need only be given once, after which
v2 is "on" for the filesystem until turned off.

* Consider experimenting with the thread_pool=n mount option.  I've seen
very little discussion of this one, but given your interest in
parallelization, it could make a difference.

* Possibly the commit=<seconds> (default 30) mount option.  In theory,
upping this may allow better write merging, tho your interest seems to be
more on the read side, and the commit time has consequences at crash time.

* The autodefrag mount option may be considered if you do a lot of
existing-file updates, as is common with database or VM image files.  Due
to COW this triggers high fragmentation on btrfs, and autodefrag should
help control that.  Note that autodefrag effectively increases the
minimum extent size from 4 KiB to, IIRC, 16 MB, tho it may be less, and
doesn't operate at whole-file size, so larger repeatedly-modified files
will still have some fragmentation, just not as much.  Obviously, you
wouldn't see the read-time effects of this until the filesystem has aged
somewhat, so it may not show up on your benchmarks.

(Another option for such files is setting them nocow or using the
nodatacow mount option, but this turns off checksumming and, if it's on,
compression for those files, and has a few other non-obvious caveats as
well, so isn't something I recommend.  Instead of using nocow, I'd
suggest putting such files on a dedicated traditional non-cow filesystem
such as ext4, and I consider nocow at best a workaround option for those
who prefer to use btrfs as a single big storage pool and thus don't want
to do the dedicated non-cow filesystem for some subset of their files.)

* Not really for reads, but for btrfs and any cow-based filesystem you
almost certainly want the (not btrfs-specific) noatime mount option.

* While it has serious filesystem-integrity implications and thus can't
be responsibly recommended, there is the nobarrier mount option.  But if
you're already running raid0 on a large number of devices you're already
gambling with device stability, and this /might/ be an additional risk
you're willing to take, as it should increase performance.  But for
normal users it's simply not worth the risk, and if you do choose to use
it, it's at your own risk.

* If you're enabling the discard mount option, consider trying with it
off, as it can affect performance if your devices don't support queued
trim.  The alternative is fstrim, presumably scheduled to run once a week
or so.  (The util-linux package includes an fstrim systemd timer and
service set to run once a week.  You can activate that, or an equivalent
cron job if you're not on systemd.)

* For filesystem features you may look at no_holes and skinny_metadata.
These are both quite stable, and at least skinny-metadata is now the
default.  These are normally set at mkfs.btrfs time, but can be modified
later.  Setting them at mkfs time should be more efficient.

* At mkfs.btrfs time, you can set the metadata --nodesize.  The newer
default is 16 KiB, while the old default was the (minimum for amd64/x86)
4 KiB, and the maximum is 64 KiB.  See the mkfs.btrfs manpage for the
details, as there's a tradeoff: smaller sizes increase (metadata)
fragmentation but decrease lock contention, while larger sizes pack more
efficiently and are less fragmented, but updating is more expensive.  The
change in default was because 16 KiB was a win over the old 4 KiB for
most use-cases, but the 32 or 64 KiB options may or may not be, depending
on use-case, and of course if you're bottlenecking on locks, 4 KiB may
still be a win.

Among all those, I'd be especially interested in what thread_pool=n does
or doesn't do for you, both because it specifically mentions
parallelization and because I've seen little discussion of it.

space_cache=v2 may also be a big boost for you, if your filesystems are
the size the 6-device raid0 implies and are at all reasonably populated.

(Metadata) nodesize may or may not make a difference, tho I suspect if so
it'll be mostly on writes (but I'm not familiar with the specifics there
so could be wrong).  I'd be interested to see if it does.

In general I can recommend the no_holes and skinny_metadata features, but
you may well already have them, and the noatime mount option, which you
may well already be using as well.  Similarly, I ensure that all my btrfs
are mounted from first mount with autodefrag, so it's always on as the
filesystem is populated, but I doubt you'll see a difference from that in
your benchmarks unless you're specifically testing an aged filesystem
that would be heavily fragmented on its own.

There's one guy here who has done heavy testing on the ssd stuff and
knows btrfs on-device chunk allocation strategies very well, having come
up with a utilization visualization utility and been the force behind the
relatively recent (4.16-ish) changes to the ssd mount option's allocation
strategy.  He'd be the one to talk to if you're considering diving into
btrfs' on-disk allocation code, etc.

-- 
Duncan - List replies preferred.  No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
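Putting several of the options above on one command line, an experiment
might look roughly like this.  It is only a sketch: /dev/md0 (or the raw
btrfs member devices), the mount point, and the chosen values are
placeholders to adjust, and nobarrier is deliberately left out.

  # features set at mkfs time: 16 KiB nodes (the current default),
  # with no-holes and skinny-metadata enabled explicitly
  mkfs.btrfs -n 16k -O no-holes,skinny-metadata -d single -m dup /dev/md0

  # mount-time knobs discussed above
  mount -o noatime,space_cache=v2,thread_pool=8,commit=60,autodefrag \
        /dev/md0 /mnt/test

  # if the discard mount option is left off, trim periodically instead
  fstrim -v /mnt/test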
* Re: Understanding BTRFS RAID0 Performance
  2018-10-06  0:34 ` Duncan
@ 2018-10-08 12:20   ` Austin S. Hemmelgarn
  0 siblings, 0 replies; 6+ messages in thread
From: Austin S. Hemmelgarn @ 2018-10-08 12:20 UTC
To: linux-btrfs

On 2018-10-05 20:34, Duncan wrote:
> Wilson, Ellis posted on Fri, 05 Oct 2018 15:29:52 +0000 as excerpted:
>
>> Is there any tuning in BTRFS that limits the number of outstanding reads
>> at a time to a small single-digit number, or something else that could be
>> behind small queue depths?  I can't otherwise imagine what the difference
>> would be on the read path between ext4 vs btrfs when both are on mdraid.
>
> It seems I forgot to directly answer that question in my first reply.
> Thanks for restating it.
>
> Btrfs doesn't really expose much performance tuning (yet?), at least
> outside the code itself.  There are a few very limited knobs, but they're
> just that, few and limited or broad-stroke.
>
> There are mount options like ssd/nossd, ssd_spread/nossd_spread, the
> space_cache set of options (see below), flushoncommit/noflushoncommit,
> commit=<seconds>, etc (see the btrfs (5) manpage), but nothing really to
> influence stride length, etc, or to optimize chunk placement between ssd
> and non-ssd devices, for instance.
>
> And there are a few filesystem features, normally set at mkfs.btrfs time
> (and thus covered in the mkfs.btrfs manpage), some of which can be tuned
> later, but generally the defaults have changed over time to reflect the
> best case, and the older variants are there primarily to retain backward
> compatibility with old kernels and tools that didn't handle the newer
> variants.
>
> That said, as I think about it there are some tunables that may be worth
> experimenting with.  Most or all of these are covered in the btrfs (5)
> manpage.
>
> * Given the large device numbers you mention and raid0, you're likely
> dealing with multi-TB-scale filesystems.  At this level, the
> space_cache=v2 mount option may be useful.  It's not the default yet as
> btrfs check, etc, don't yet handle it, but given your raid0 choice you
> may not be concerned about that.  It need only be given once, after which
> v2 is "on" for the filesystem until turned off.
>
> * Consider experimenting with the thread_pool=n mount option.  I've seen
> very little discussion of this one, but given your interest in
> parallelization, it could make a difference.

Probably not as much as you might think.  I'll explain a bit more further
down where this is being mentioned again.

> * Possibly the commit=<seconds> (default 30) mount option.  In theory,
> upping this may allow better write merging, tho your interest seems to be
> more on the read side, and the commit time has consequences at crash time.

Based on my own experience, having a higher commit time doesn't impact
read or write performance much or really help all that much with write
merging.  All it really helps with is minimizing overhead, but it's not
even all that great at doing that.

> * The autodefrag mount option may be considered if you do a lot of
> existing-file updates, as is common with database or VM image files.  Due
> to COW this triggers high fragmentation on btrfs, and autodefrag should
> help control that.  Note that autodefrag effectively increases the
> minimum extent size from 4 KiB to, IIRC, 16 MB, tho it may be less, and
> doesn't operate at whole-file size, so larger repeatedly-modified files
> will still have some fragmentation, just not as much.  Obviously, you
> wouldn't see the read-time effects of this until the filesystem has aged
> somewhat, so it may not show up on your benchmarks.
>
> (Another option for such files is setting them nocow or using the
> nodatacow mount option, but this turns off checksumming and, if it's on,
> compression for those files, and has a few other non-obvious caveats as
> well, so isn't something I recommend.  Instead of using nocow, I'd
> suggest putting such files on a dedicated traditional non-cow filesystem
> such as ext4, and I consider nocow at best a workaround option for those
> who prefer to use btrfs as a single big storage pool and thus don't want
> to do the dedicated non-cow filesystem for some subset of their files.)
>
> * Not really for reads, but for btrfs and any cow-based filesystem you
> almost certainly want the (not btrfs-specific) noatime mount option.

Actually...  This can help a bit for some workloads.  Just like the
commit time, it comes down to a matter of overhead.  Essentially, if you
read a file regularly, then with the default of relatime you've got a
guaranteed write requiring a commit of the metadata tree once every 24
hours.  It's not much to worry about for just one file, but if you're
reading a very large number of files all the time, it can really add up.

> * While it has serious filesystem-integrity implications and thus can't
> be responsibly recommended, there is the nobarrier mount option.  But if
> you're already running raid0 on a large number of devices you're already
> gambling with device stability, and this /might/ be an additional risk
> you're willing to take, as it should increase performance.  But for
> normal users it's simply not worth the risk, and if you do choose to use
> it, it's at your own risk.

Agreed, if you're running RAID0 with this many drives, nobarrier may be
worth it for a little bit of extra performance.  It will make writes a
bit faster, and make them have less impact on concurrent reads.

> * If you're enabling the discard mount option, consider trying with it
> off, as it can affect performance if your devices don't support queued
> trim.  The alternative is fstrim, presumably scheduled to run once a week
> or so.  (The util-linux package includes an fstrim systemd timer and
> service set to run once a week.  You can activate that, or an equivalent
> cron job if you're not on systemd.)

Even if you have queued discard support, you may still be better off
using fstrim instead.  While queuing discards reduces their performance
impact, some device firmware still can't handle them efficiently.  Pretty
much, test both ways and see which works better for your workload.

> * For filesystem features you may look at no_holes and skinny_metadata.
> These are both quite stable, and at least skinny-metadata is now the
> default.  These are normally set at mkfs.btrfs time, but can be modified
> later.  Setting them at mkfs time should be more efficient.
>
> * At mkfs.btrfs time, you can set the metadata --nodesize.  The newer
> default is 16 KiB, while the old default was the (minimum for amd64/x86)
> 4 KiB, and the maximum is 64 KiB.  See the mkfs.btrfs manpage for the
> details, as there's a tradeoff: smaller sizes increase (metadata)
> fragmentation but decrease lock contention, while larger sizes pack more
> efficiently and are less fragmented, but updating is more expensive.  The
> change in default was because 16 KiB was a win over the old 4 KiB for
> most use-cases, but the 32 or 64 KiB options may or may not be, depending
> on use-case, and of course if you're bottlenecking on locks, 4 KiB may
> still be a win.

One caveat here: if you're running on top of another RAID platform, you
can often get a small performance boost by matching the node size to the
chunk size for the underlying RAID layer (so, the chunk size that
replication is done at for replicated RAID, or the amount of data per
disk per stripe for striped stuff).

> Among all those, I'd be especially interested in what thread_pool=n does
> or doesn't do for you, both because it specifically mentions
> parallelization and because I've seen little discussion of it.

There's been little discussion because the default value that gets
selected is actually near optimal in all but the largest systems.  The
default logic is to set this to either the total number of logical cores
in the system or 8, whichever is less.  What this does is actually rather
simple: it's functionally the maximum number of I/O requests that can be
processed concurrently by BTRFS for that volume.

Now, in theory it might sound like increasing this should improve things
here.  The problem with that is that beyond about 8 requests, you start
to see the effects of lock contention a _lot_ more.  If you can find a
way to mitigate the locking issues (check the end of my reply for more
about that), bumping this up _might_ help, but it generally should still
not be more than the number of logical cores in the system (I've done
some testing myself: no matter how well you have lock contention
mitigated, performance gains are at best negligible from using more
threads than logical cores, and at worst you'll make performance
significantly worse).

> space_cache=v2 may also be a big boost for you, if your filesystems are
> the size the 6-device raid0 implies and are at all reasonably populated.
>
> (Metadata) nodesize may or may not make a difference, tho I suspect if so
> it'll be mostly on writes (but I'm not familiar with the specifics there
> so could be wrong).  I'd be interested to see if it does.
>
> In general I can recommend the no_holes and skinny_metadata features, but
> you may well already have them, and the noatime mount option, which you
> may well already be using as well.  Similarly, I ensure that all my btrfs
> are mounted from first mount with autodefrag, so it's always on as the
> filesystem is populated, but I doubt you'll see a difference from that in
> your benchmarks unless you're specifically testing an aged filesystem
> that would be heavily fragmented on its own.
>
> There's one guy here who has done heavy testing on the ssd stuff and
> knows btrfs on-device chunk allocation strategies very well, having come
> up with a utilization visualization utility and been the force behind the
> relatively recent (4.16-ish) changes to the ssd mount option's allocation
> strategy.  He'd be the one to talk to if you're considering diving into
> btrfs' on-disk allocation code, etc.

There are two other recommendations I would make (a short sketch of both
follows this message):

* Stupid as it sounds, depending on your workload, you may actually see
better performance with the single profile than the raid0 profile.
Essentially, if you've got mostly big files that would span multiple
devices in raid0 mode, and you don't have a workload that needs
concurrent access to the same file regularly, you can reduce contention
for access to each individual device by running with the data profile
set to single.

* If you can find some way to logically subdivide your workload, you
should look at creating one subvolume per subdivision.  This will reduce
lock contention (and thus make bumping up the `thread_pool` option
actually have some benefits).
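As an illustration of those last two suggestions, the commands involved
would look roughly like this.  It is a sketch with made-up device names,
mount point, and subvolume names, and the -m raid1 choice for metadata is
just one reasonable option rather than something prescribed above.

  # data in the 'single' profile spread across the six drives,
  # metadata mirrored so a metadata read doesn't depend on one disk
  mkfs.btrfs -d single -m raid1 /dev/sd[b-g]
  mount -o noatime /dev/sdb /mnt/pool

  # one subvolume per logical subdivision of the workload
  btrfs subvolume create /mnt/pool/ingest
  btrfs subvolume create /mnt/pool/scratch
  btrfs subvolume create /mnt/pool/results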