* btrfs filesystem defragment -r -- does it affect subvolumes?

From: Ulli Horlacher
Date: 2017-08-31 7:05 UTC
To: linux-btrfs

When I do a

  btrfs filesystem defragment -r /directory

does it really defragment all files in this directory tree, even if it
contains subvolumes? The man page does not mention subvolumes on this
topic.

I have an older script (written by myself) which runs
"btrfs filesystem defragment -r" on all subvolumes recursively:

  btrfs filesystem defragment -r $m
  for s in $(btrfs subvolume list $m | awk '{ print $NF }'); do
    [[ "$s" =~ ^@ ]] && continue
    [[ $(btrfs subvolume show $m/$s) =~ Flags:.*readonly ]] && continue
    btrfs filesystem defragment -r $m/$s
  done

I wonder why I have done it that way :-}

-- 
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum TIK              Universitaet Stuttgart
E-Mail: horlacher@tik.uni-stuttgart.de
Allmandring 30a, 70569 Stuttgart (Germany)
Tel: ++49-711-68565868         WWW: http://www.tik.uni-stuttgart.de/
* defragmenting best practice?

From: Ulli Horlacher
Date: 2017-09-12 16:28 UTC
To: linux-btrfs

On Thu 2017-08-31 (09:05), Ulli Horlacher wrote:
> When I do a
>   btrfs filesystem defragment -r /directory
> does it really defragment all files in this directory tree, even if it
> contains subvolumes?
> The man page does not mention subvolumes on this topic.

No answer so far :-(

But I found another problem in the man page:

  Defragmenting with Linux kernel versions < 3.9 or >= 3.14-rc2 as well
  as with Linux stable kernel versions >= 3.10.31, >= 3.12.12 or
  >= 3.13.4 will break up the ref-links of COW data (for example files
  copied with cp --reflink, snapshots or de-duplicated data). This may
  cause considerable increase of space usage depending on the broken up
  ref-links.

I am running Ubuntu 16.04 with Linux kernel 4.10 and I have several
snapshots. Should I therefore avoid calling
"btrfs filesystem defragment -r"?

What is the defragmenting best practice? Avoid it completely?
* Re: defragmenting best practice?

From: Austin S. Hemmelgarn
Date: 2017-09-12 17:27 UTC
To: linux-btrfs

On 2017-09-12 12:28, Ulli Horlacher wrote:
> On Thu 2017-08-31 (09:05), Ulli Horlacher wrote:
>> When I do a
>>   btrfs filesystem defragment -r /directory
>> does it really defragment all files in this directory tree, even if
>> it contains subvolumes?
>> The man page does not mention subvolumes on this topic.
>
> No answer so far :-(

I hadn't seen your original mail, otherwise I probably would have
responded. Sorry about that.

On the note of the original question: I'm pretty sure that it does
recursively operate on nested subvolumes. The documentation doesn't say
otherwise, and not doing so would be unintuitive for people who don't
know anything about subvolumes.

> But I found another problem in the man page:
> [the kernel-version warning about broken ref-links]
>
> I am running Ubuntu 16.04 with Linux kernel 4.10 and I have several
> snapshots. Should I therefore avoid calling
> "btrfs filesystem defragment -r"?
>
> What is the defragmenting best practice?

That really depends on what you're doing.

First, you need to understand that defrag won't break _all_ reflinks,
just the particular instances you point it at. So, if you have subvolume
A, and snapshots S1 and S2 of that subvolume, then running defrag on
_just_ subvolume A will break the reflinks between it and the snapshots,
but S1 and S2 will still share whatever data they originally shared with
each other. If you then take a third snapshot of A, it will share data
with A, but not with S1 or S2 (because A is no longer sharing data with
S1 or S2).

Given this behavior, you have three potential cases when talking about
persistent snapshots:

1. You care about minimizing space used, but aren't as worried about
   performance. In this case, the only option is to not run defrag at
   all.
2. You care about performance, but not space usage. In this case,
   defragment everything.
3. You care about both space usage and performance. In this case, I
   would personally suggest defragmenting only the source subvolume (so
   only subvolume A in the above explanation), on a schedule that
   coincides with snapshot rotation. The idea is to defrag just before
   you take a snapshot, at a frequency that gives a good balance between
   space usage and performance. As a general rule, if you take this
   route, start by defragging either monthly if you're doing daily or
   weekly snapshots, or with every fourth snapshot if not, and then
   adjust the interval based on how that impacts your space usage.

Additionally, you can compact free space without defragmenting data or
breaking reflinks by running a full balance on the filesystem.

The tricky part, though, is that differing workloads are impacted
differently by fragmentation. Using just four generic examples:

* Mostly sequential write-focused workloads (like security recording
  systems) tend to be impacted by free-space fragmentation more than
  data fragmentation. Balancing filesystems used for such workloads is
  likely to give a noticeable improvement, but defragmenting probably
  won't give much.
* Mostly sequential read-focused workloads (like a streaming media
  server) tend to be the most impacted by data fragmentation, but
  aren't generally impacted by free-space fragmentation. As a result,
  defrag will help here a lot, but balance won't as much.
* Mostly random write-focused workloads (like most database systems or
  virtual machines) are often impacted by both free-space and data
  fragmentation, and are a pathological case for CoW filesystems.
  Balance and defrag will help here, but they won't help for long.
* Mostly random read-focused workloads (like most non-multimedia
  desktop usage) are not impacted much by either aspect, but on a
  traditional hard drive they can be impacted significantly by how the
  data is spread across the disk. Balance can help here, but only
  because it improves data locality, not because it compacts free
  space.
* Re: defragmenting best practice? 2017-09-12 17:27 ` Austin S. Hemmelgarn @ 2017-09-14 7:54 ` Duncan 2017-09-14 12:28 ` Austin S. Hemmelgarn 0 siblings, 1 reply; 56+ messages in thread From: Duncan @ 2017-09-14 7:54 UTC (permalink / raw) To: linux-btrfs Austin S. Hemmelgarn posted on Tue, 12 Sep 2017 13:27:00 -0400 as excerpted: > The tricky part though is that differing workloads are impacted > differently by fragmentation. Using just four generic examples: > > * Mostly sequential write focused workloads (like security recording > systems) tend to be impacted by free space fragmentation more than data > fragmentation. Balancing filesystems used for such workloads is likely > to give a noticeable improvement, but defragmenting probably won't give > much. > * Mostly sequential read focused workloads (like a streaming media > server) > tend to be the most impacted by data fragmentation, but aren't generally > impacted by free space fragmentation. As a result, defrag will help > here a lot, but balance won't as much. > * Mostly random write focused workloads (like most database systems or > virtual machines) are often impacted by both free space and data > fragmentation, and are a pathological case for CoW filesystems. Balance > and defrag will help here, but they won't help for long. > * Mostly random read focused workloads (like most non-multimedia desktop > usage) are not impacted much by either aspect, but if you're on a > traditional hard drive they can be impacted significantly by how the > data is spread across the disk. Balance can help here, but only because > it improves data locality, not because it compacts free space. This is a very useful analysis, particularly given the examples. Maybe put it on the wiki under the defrag discussion? (Assuming something like it isn't already there. I've not looked in awhile.) -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." 
Richard Stallman ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice?

From: Austin S. Hemmelgarn
Date: 2017-09-14 12:28 UTC
To: linux-btrfs

On 2017-09-14 03:54, Duncan wrote:
> [quote of the four workload examples snipped]
>
> This is a very useful analysis, particularly given the examples.
> Maybe put it on the wiki under the defrag discussion? (Assuming
> something like it isn't already there. I've not looked in a while.)

I've actually been meaning to write up something more thorough about
this online (probably as a Gist). When I finally get around to that
(probably in the next few weeks), I'll try to make sure a link ends up
on the defrag page on the wiki.
* Re: defragmenting best practice?

From: Kai Krakow
Date: 2017-09-14 11:38 UTC
To: linux-btrfs

Am Tue, 12 Sep 2017 18:28:43 +0200 schrieb Ulli Horlacher
<framstag@rus.uni-stuttgart.de>:

> [...]
> I am running Ubuntu 16.04 with Linux kernel 4.10 and I have several
> snapshots. Should I therefore avoid calling
> "btrfs filesystem defragment -r"?
>
> What is the defragmenting best practice?
> Avoid it completely?

You may want to try https://github.com/Zygo/bees. It is a daemon that
watches filesystem generation changes, scans the new blocks, and
recombines duplicates. Of course, this process somewhat defeats the
purpose of defragging in the first place, as it will undo some of the
defragmenting.

I suggest you only ever defragment parts of your main subvolume, or
rely on autodefrag, and let bees handle optimizing the snapshots.

Also, I experimented with adding btrfs support to shake; I'm still
working on better integration but currently lacking time... :-(

Shake is an adaptive defragger which rewrites files. With my current
patches it clones each file, then rewrites it to its original location.
This approach is currently not optimal, as it simply bails out if some
other process is accessing the file, leaving you with an (intact)
temporary copy you need to move back into place manually.

Shake works very well with the idea of detecting how fragmented, how
old, and how far away from an "ideal" position a file is, and it
exploits standard Linux filesystem behavior to place files optimally by
rewriting them. It then records its status per file in extended
attributes. It also works with non-btrfs filesystems.

My patches try to avoid defragging files with shared extents, so this
may help your situation. However, shake will still shuffle files around
if they are too far from their ideal position, thus destroying shared
extents. A future patch could use extent recombining and skip shared
extents in that process. But first I'd like to clean out some of the
rough edges together with the original author of shake.

Look here: https://github.com/unbrice/shake and also check out the pull
requests and comments there. You shouldn't currently run shake
unattended, and only on specific parts of your FS you feel need
defragmenting.

-- 
Regards,
Kai

Replies to list-only preferred.
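[Editorial note: before pointing shake or defrag at a tree, it can help
to see how much of it is reflink-shared, since that is what would be
duplicated. A sketch using `btrfs filesystem du` from btrfs-progs 4.x+;
the path is a placeholder and the exact column layout may vary by
version.]

```shell
# Summarize reflink sharing for a tree: a large "Set shared" value means
# a recursive defragment would unshare (and so duplicate) up to that
# much data. /mnt/data/projects is a placeholder path.
btrfs filesystem du -s /mnt/data/projects |
    awk 'NR > 1 { print "shared:", $3, "of", $1, "total" }'
```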
* Re: defragmenting best practice?

From: Tomasz Kłoczko
Date: 2017-09-14 13:31 UTC
Cc: linux-btrfs

On 14 September 2017 at 12:38, Kai Krakow <hurikhan77@gmail.com> wrote:
[..]
> I suggest you only ever defragment parts of your main subvolume, or
> rely on autodefrag, and let bees handle optimizing the snapshots.
>
> Also, I experimented with adding btrfs support to shake; I'm still
> working on better integration but currently lacking time... :-(
>
> Shake is an adaptive defragger which rewrites files. With my current
> patches it clones each file, then rewrites it to its original
> location. This approach is currently not optimal, as it simply bails
> out if some other process is accessing the file, leaving you with an
> (intact) temporary copy you need to move back into place manually.

If you really want the real and *ideal* distribution of data across a
physical disk, first you need to build a time-travel device. This
device will allow you to put all blocks which need to be read in
perfect order (to read all data sequentially, without seeks). However,
it will only work in the case of spindles, because SSDs have no seek
time.

Please let us know when you have written the drivers/timetravel/ Linux
kernel driver. Once such a driver is available, I promise I'll write
all the necessary btrfs code myself in a matter of a few days (it will
be a piece of cake compared to building such a device).

But seriously: the only scenario in which you may want to reduce
fragmentation is when something needs to allocate a contiguous area
smaller than the total free space but larger than the largest free
chunk. Something like this happens only when a volume is running at
almost 100% allocated space. In such a scenario even your bees cannot
do much, as there may not be enough free space to move other data into
larger chunks to defragment the filesystem's physical space. If your
workload keeps writing new data to the FS, such defragmentation may buy
you (maybe) a few more seconds, and just after that the FS will be 100%
full.

In other words: if someone thinks that such a defragmentation daemon
solves any problem, he or she may be 100% right .. such a person is
only *thinking* that this is the truth.

kloczek

PS. Do you know the first MacGyver rule? "If it ain't broke, don't fix
it." So first show that fragmentation is hurting the latency of access
to btrfs data, and that such an impact can be measured. Before you
start measuring this, you need to learn how to sample, for example,
VFS-layer latency. Do you know how to do this to deliver such proof?

PS2. The same "discussions" about fragmentation took place in the past,
10+ years ago, after ZFS was introduced. Just to let you know: from the
initial ZFS introduction up to now, not a single line of ZFS code has
been written to handle active defragmentation, and no one has been able
to prove that anything needs to be done about active defragmentation in
the case of ZFS. Why? Because everything stands on the shoulders of a
clever enough *allocation algorithm*. Only this and nothing more.

PS3. Please, can we stop this / EOT?

-- 
Tomasz Kłoczko | LinkedIn: http://lnkd.in/FXPWxH
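[Editorial note: the kind of evidence demanded above can be gathered
crudely with `filefrag` from e2fsprogs (extent counts on btrfs are
approximate, especially with compression) plus a cold-cache timed read.
A sketch; the file path is a placeholder, and dropping caches requires
root.]

```shell
# Count extents, then time a cold-cache sequential read of the file.
f=/mnt/data/bigfile.bin

filefrag "$f"                       # reports e.g. "N extents found"

sync
echo 3 > /proc/sys/vm/drop_caches   # root required: drop the page cache
time cat "$f" > /dev/null           # elapsed time is the latency proxy
```

Comparing extent count and elapsed read time before and after a defrag
on the same file gives a rough, workload-specific answer.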
* Re: defragmenting best practice?

From: Kai Krakow
Date: 2017-09-14 15:24 UTC
To: linux-btrfs

Am Thu, 14 Sep 2017 14:31:48 +0100 schrieb Tomasz Kłoczko
<kloczko.tomasz@gmail.com>:

> On 14 September 2017 at 12:38, Kai Krakow <hurikhan77@gmail.com>
> wrote: [..]
>> I suggest you only ever defragment parts of your main subvolume, or
>> rely on autodefrag, and let bees handle optimizing the snapshots.

Please read that again, including the parts you omitted.

> If you really want the real and *ideal* distribution of data across a
> physical disk, first you need to build a time-travel device. [...]
>
> But seriously ..

Seriously: defragmentation on spindles is IMHO not about getting a
perfectly continuous allocation but about providing a better spatial
layout of the files you work with.

Getting e.g. boot files into read order, or at least nearby, improves
boot time a lot. Similar for loading applications. Shake tries to
improve this by rewriting the files - and this works because file
systems (given enough free space) already do a very good job at this.
But constant system updates degrade this order over time.

It really doesn't matter if some big file is laid out in 1 allocation
of 1 GB or in 250 allocations of 4 MB: it really doesn't make a big
difference. Recombining extents into bigger ones, though, can make a
big difference in an aging btrfs, even on SSDs.

Bees is, btw, not about defragmentation: I have some OS containers
running and I want to deduplicate data after updates. It seems to do a
good job here, better than other deduplicators I found. And if some
defrag tool destroyed your snapshot reflinks, bees can also help here.
On its way it may recombine extents, so it may improve fragmentation.
But usually it probably defragments because it needs to split extents
that a defragger combined.

But well, I think getting 100% continuous allocation is really not the
achievement you want, especially when reflinks are a primary concern.

> The only scenario in which you may want to reduce fragmentation is
> when something needs to allocate a contiguous area [...] larger than
> the largest free chunk. [...] In such a scenario even your bees
> cannot do much [...]

Bees does not do that.

> If your workload keeps writing new data to the FS, such
> defragmentation may buy you (maybe) a few more seconds [...]
>
> In other words: if someone thinks that such a defragmentation daemon
> solves any problem [...]

Bees is not about that.

> kloczek
> PS. Do you know the first MacGyver rule? "If it ain't broke, don't
> fix it."

Do you know the saying "think first, then act"?

> So first show that fragmentation is hurting the latency of access to
> btrfs data [...]

You didn't get the point. You only read "defragmentation" and your
alarm lights lit up. You even think bees is a defragmenter. It is
probably more the opposite, because it introduces more fragments in
exchange for more reflinks.

> PS2. The same "discussions" about fragmentation took place in the
> past, 10+ years ago, after ZFS was introduced. [...]

Btrfs has autodefrag to reduce the number of fragments by rewriting
small portions of the file being written to. This is needed, otherwise
the feature wouldn't be there. Why? Have you tried working with 1 GB
files broken into 100,000+ fragments just because of how CoW works?
Try it, there's your latency.

> Why? Because everything stands on the shoulders of a clever enough
> *allocation algorithm*. Only this and nothing more.
> PS3. Please, can we stop this / EOT?

Can we please not start a flame war just because you hate defrag tools?

I think the whole discussion about "defragmenting" should be stopped.
Let's call it "optimizers": if it reduces needed storage space, it
optimizes. And I need a tool for that. Otherwise tell me how btrfs
solves this in-kernel when applications break reflinks by rewriting
data...

If you're on spindles, you want files that are needed at around the
same time to be kept spatially nearby. This improves boot times and
application start times. The file system already does a good job at
this. But for some workloads (like booting) it degrades over time, and
the FS can do nothing about it, because this is just not how package
managers work (or Windows updates; NTFS also uses extent allocation and
as such solves the same problems in a similar way to most Linux
systems). Let the package manager reinstall all files accessed at boot
and it would probably be solved. But who wants that? Btrfs does not
solve this; SSDs do. I'm using bcache for that matter on my local
system. Without SSDs, shake (and other tools) can solve this.

If you are on SSDs and work with almost-full file systems, you may get
back performance by recombining free space. Defragmentation here is not
about files but about free space. This can also be called an optimizer,
then.

I really have no interest in defragmenting a file system to 100%
continuous allocation. That was needed for FAT and small systems
without enough RAM to cache all the file system infrastructure. Today's
systems use extent allocation, and that solves the problem the original
idea of defragmentation came from. When I speak of defragmentation, I
mean something more intelligent, like optimizing the file system layout
for the access patterns you use.

Conclusion: the original question was about defrag best practice with
regard to reflinked snapshots. And I recommended partially against it,
and instead recommended bees, which restores and optimizes the reflinks
and may recombine some of the extents. From my wording, and I apologize
for that, it was probably not completely clear what this means:

[I wrote]
> You may want to try https://github.com/Zygo/bees. It is a daemon that
> watches filesystem generation changes, scans the new blocks, and
> recombines duplicates. Of course, this process somewhat defeats the
> purpose of defragging in the first place, as it will undo some of the
> defragmenting.

It scans for duplicate blocks and recombines them into reflinked
blocks. This is done by recombining extents. For that purpose, extents
that the file system allocated usually need to be broken up again into
smaller chunks. Bees tries to recombine such broken extents back into
bigger ones, but it is not a defragger, seriously! It indeed breaks
extents into smaller chunks.

Later I recommended having a look at shake, which I experimented with.
And I also recommended letting btrfs autodefrag do the work, and only
ever defragmenting very selected parts of the file system he feels need
"defragmentation". My patches to shake try to avoid breaking btrfs
shared extents, so they actually reduce the effect of defragmenting the
FS, because I think keeping reflinked extents is more important. But I
see the main purpose of shake as re-laying-out supplied files into
nearby space. I think it is more important to improve the spatial
locality of files than to have them 100% continuous.

I will try to make my intent more clear next time, but I guess you
won't read it in its entirety anyway. :,-(

-- 
Regards,
Kai

Replies to list-only preferred.
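[Editorial note: the autodefrag option referred to above is enabled per
mount and only affects writes made while it is active. An illustrative
/etc/fstab line; the UUID and mount point are placeholders.]

```
# /etc/fstab sketch: let the kernel rewrite small random writes in the
# background instead of running manual defrag passes.
UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  /mnt/data  btrfs  autodefrag,noatime  0  0
```

It can also be enabled on a mounted filesystem with
`mount -o remount,autodefrag /mnt/data`.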
* Re: defragmenting best practice? 2017-09-14 15:24 ` Kai Krakow @ 2017-09-14 15:47 ` Kai Krakow 2017-09-14 17:48 ` Tomasz Kłoczko 1 sibling, 0 replies; 56+ messages in thread From: Kai Krakow @ 2017-09-14 15:47 UTC (permalink / raw) To: linux-btrfs Am Thu, 14 Sep 2017 17:24:34 +0200 schrieb Kai Krakow <hurikhan77@gmail.com>: Errors corrected, see below... > Am Thu, 14 Sep 2017 14:31:48 +0100 > schrieb Tomasz Kłoczko <kloczko.tomasz@gmail.com>: > > > On 14 September 2017 at 12:38, Kai Krakow <hurikhan77@gmail.com> > > wrote: [..] > > > > > > I suggest you only ever defragment parts of your main subvolume or > > > rely on autodefrag, and let bees do optimizing the snapshots. > > Please read that again including the parts you omitted. > > > > > Also, I experimented with adding btrfs support to shake, still > > > working on better integration but currently lacking time... :-( > > > > > > Shake is an adaptive defragger which rewrites files. With my > > > current patches it clones each file, and then rewrites it to its > > > original location. This approach is currently not optimal as it > > > simply bails out if some other process is accessing the file and > > > leaves you with an (intact) temporary copy you need to move back > > > in place manually. > > > > If you really want to have real and *ideal* distribution of the data > > across physical disk first you need to build time travel device. > > This device will allow you to put all blocks which needs to be read > > in perfect order (to read all data only sequentially without seek). > > However it will be working only in case of spindles because in case > > of SSDs there is no seek time. > > Please let us know when you will write drivers/timetravel/ Linux > > kernel driver. When such driver will be available I promise I'll > > write all necessary btrfs code by myself in matter of few days (it > > will be piece of cake compare to build such device). > > > > But seriously .. 
> > Seriously: Defragmentation on spindles is IMHO not about getting the > perfect continuous allocation but providing better spatial layout of > the files you work with. > > Getting e.g. boot files into read order or at least nearby improves > boot time a lot. Similar for loading applications. Shake tries to > improve this by rewriting the files - and this works because file > systems (given enough free space) already do a very good job at doing > this. But constant system updates degrade this order over time. > > It really doesn't matter if some big file is laid out in 1 allocation > of 1 GB or in 250 allocations of 4MB: It really doesn't make a big > difference. > > Recombining extents into bigger once, tho, can make a big difference > in an aging btrfs, even on SSDs. > > Bees is, btw, not about defragmentation: I have some OS containers > running and I want to deduplicate data after updates. It seems to do a > good job here, better than other deduplicators I found. And if some > defrag tools destroyed your snapshot reflinks, bees can also help > here. On its way it may recombine extents so it may improve > fragmentation. But usually it probably defragments because it needs ^^^^^^^^^^^ It fragments! > to split extents that a defragger combined. > > But well, I think getting 100% continuous allocation is really not the > achievement you want to get, especially when reflinks are a primary > concern. > > > > Only context/scenario when you may want to lower defragmentation is > > when you are something needs to allocate continuous area lower than > > free space and larger than largest free chunk. Something like this > > happens only when volume is working on almost 100% allocated space. > > In such scenario even you bees cannot do to much as it may be not > > enough free space to move some other data in larger chunks to > > defragment FS physical space. > > Bees does not do that. 
> > > > If your workload will be still writing > > new data to FS such defragmentation may give you (maybe) few more > > seconds and just after this FS will be 100% full, > > > > In other words if someone is thinking that such defragmentation > > daemon is solving any problems he/she may be 100% right .. such > > person is only *thinking* that this is truth. > > Bees is not about that. > > > > kloczek > > PS. Do you know first McGyver rule? -> "If it ain't broke, don't fix > > it". > > Do you know the saying "think first, then act"? > > > > So first show that fragmentation is hurting latency of the > > access to btrfs data and it will be possible to measurable such > > impact. Before you will start measuring this you need to learn how o > > sample for example VFS layer latency. Do you know how to do this to > > deliver such proof? > > You didn't get the point. You only read "defragmentation" and your > alarm lights lid up. You even think bees would be a defragmenter. It > probably is more the opposite because it introduces more fragments in > exchange for more reflinks. > > > > PS2. The same "discussions" about fragmentation where in the past > > about +10 years ago after ZFS has been introduced. Just to let you > > know that after initial ZFS introduction up to now was not written > > even single line of ZFS code to handle active fragmentation and no > > one been able to prove that something about active defragmentation > > needs to be done in case of ZFS. > > Btrfs has autodefrag to reduce the number of fragments by rewriting > small portions of the file being written to. This is needed, otherwise > the feature won't be there. Why? Have you tried working with 1GB files > broken into 100000+ of fragments just because of how CoW works? Try, > there's your latency. > > > > Why? Because all stands on the shoulders of enough cleaver > > *allocation algorithm*. Only this and nothing more. > > PS3. Please can we stop this/EOT? 
> > Can we please not start a flame war just because you hate defrag > tools? > > I think the whole discussion about "defragmenting" should be stopped. > Let's call it "optimizers": > > If it reduces needed storage space, it optimizes. And I need a tool > for that. Otherwise tell me how btrfs solves this in-kernel, when > applications break reflinks by rewriting data... > > If you're on spindles you want files that are needed at around the > same time to be kept spatially nearby. This improves boot times and > application start times. The file system already does a good job at > doing this. But for some work loads (like booting) this degrades over > time and the FS can do nothing about it because this is just not how > package managers work (or Windows updates, NTFS also uses extent > allocation and as such solves the same problems in a similar way to > most Linux systems). Let the package manager reinstall all files > accessed at boot and it would probably be solved. But who wants that? > Btrfs does not solve this, SSDs do. Using bcache for that matter on > my local system. Without SSDs, shake (and other tools) can solve this. > > If you are on SSD and work with almost full file systems, you may get > back performance by recombining free space. Defragmentation here is > not about files but free space. This can also be called an optimizer > then. > > > I really have no interest in defragmenting a file system to 100% > continuous allocation. That was needed for FAT and small systems without > enough RAM for caching all the file system infrastructure. Today > systems use extent allocations and that solves the problem where the > original idea of defragmentation came from. When I speak of > defragmentation I mean something more intelligent like optimizing file > system layout for access patterns you use. > > > Conclusion: The original question was about defrag best practice with > regards to reflinked snapshots. 
And I recommended partially against it > and instead recommended bees which restores and optimizes the reflinks > and may recombine some of the extents. From my wording, and I > apologize for that, it was probably not completely clear what this > means: > > [I wrote] > > You may want to try https://github.com/Zygo/bees. It is a daemon > > watching the file system generation changes, scanning the blocks and > > then recombining them. Of course, this process somewhat defeats the > > purpose of defragging in the first place as it will undo some of the > > defragmenting. > > It scans for duplicate blocks and recombines them into reflinked > blocks. This is done by recombining extents. For that purpose, extents > that the file system allocated usually need to be broken up again > into smaller chunks. But bees tries to recombine such broken extents > back into bigger ones. But it is not a defragger, seriously! It indeed > breaks extents into smaller chunks. > > Later I recommended having a look at shake which I experimented with. > And I also recommended letting the btrfs autodefrag do the work and > only ever defragmenting very selected parts of the file system he > feels need "defragmentation". My patches to shake try to avoid > btrfs shared extents so actually they reduce the effect of > defragmenting the FS, because I think keeping reflinked extents is > more important. But I see the main purpose of shake as re-laying out > supplied files into nearby space. I think it is more important to > improve spatial locality of files than having them 100% continuous. > > I will try to make my intent more clear next time but I guess you > won't read it in its entirety anyway. :,-( > > -- Regards, Kai Replies to list-only preferred. ^ permalink raw reply [flat|nested] 56+ messages in thread
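[Editorial note: as a concrete sketch of the "defragment only very selected parts" recommendation above, a selective invocation could look like the following. The paths are placeholders, and per the man-page caveat quoted earlier in the thread, on affected kernels even a selective defragment breaks reflinks for the files it touches.]

```shell
# Hypothetical selective defragmentation, per the recommendation above:
# touch only known-fragmented files instead of the whole tree, so that
# reflinks everywhere else stay intact. Paths are placeholders.
btrfs filesystem defragment -v /var/lib/mysql/ibdata1
# -t sets the target extent size; -r recurses into one chosen directory.
btrfs filesystem defragment -r -v -t 32M /home/user/vm-images
```

This keeps the reflink damage confined to the handful of files that actually suffer from fragmentation (databases, VM images, journal files) instead of the whole subvolume tree.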
* Re: defragmenting best practice? 2017-09-14 15:24 ` Kai Krakow 2017-09-14 15:47 ` Kai Krakow @ 2017-09-14 17:48 ` Tomasz Kłoczko 2017-09-14 18:53 ` Austin S. Hemmelgarn ` (2 more replies) 1 sibling, 3 replies; 56+ messages in thread From: Tomasz Kłoczko @ 2017-09-14 17:48 UTC (permalink / raw) Cc: linux-btrfs On 14 September 2017 at 16:24, Kai Krakow <hurikhan77@gmail.com> wrote: [..] > Getting e.g. boot files into read order or at least nearby improves > boot time a lot. Similar for loading applications. By how much is it possible to improve boot time? Please give some example which I can try to replay, which will show whether we have similar results. I still have one of my laptops with a spindle on a btrfs root fs (and no other FSes in use) so I could confirm whether my numbers are close enough to your numbers. > Shake tries to > improve this by rewriting the files - and this works because file > systems (given enough free space) already do a very good job at doing > this. But constant system updates degrade this order over time. OK. Please prepare some database, import some data whose size will be a few times the unused RAM (best if this multiplication factor is at least 10). Then do some batch of selects measuring the latency distribution of those queries. This will give you some data about not-fragmented data. Then in the next stage apply some number of update queries, then reboot the system or drop all caches, and repeat the same set of selects. After this all you need to do is compare the latency distributions. > It really doesn't matter if some big file is laid out in 1 allocation > of 1 GB or in 250 allocations of 4MB: It really doesn't make a big > difference. > > Recombining extents into bigger ones, though, can make a big difference in > an aging btrfs, even on SSDs. That may be an issue with using extents. 
Again: please show some results of some test unit which anyone will be able to replay and confirm or not that this effect really exists. If the problem really exists and is related to extents you should have a real-scenario explanation why ZFS is not using extents. btrfs is not too far from the classic approach to FS because it still uses allocation structures. This is not the case in the context of ZFS because this technology has no information about what is already allocated. ZFS uses free lists, so by negation whatever is not on the free list is already allocated. I'm not trying to claim that ZFS is better but only to point out that by changing the allocation strategy you may not be blasted by something like an extents bottleneck (which still needs to be proven). There are at least a few very good reasons why it is sometimes necessary to change strategy from allocation structures to free lists. First: ZFS free list management is very similar to the Linux memory SLAB allocator. Did you ever hear that someone needs to do system memory defragmentation because fragmented memory adds some additional latency to memory access? Another consequence is that with growing size of the files and number of files or directories, FS metadata grow exponentially with the size and numbers of such objects. In the case of free lists there is no such growth and all structures grow linearly. Caching free list data in memory takes much less than caching b-trees. The last thing is the effort of deallocating something in an FS with allocation structures vs. with free lists. In the classic approach the number of such operations grows with the depth of the b-trees. In the free list case all that you need to do is compare the ctime of the allocated block with the volume or snapshot ctime to decide whether or not to return the block to the free list. No matter how many snapshots, volumes, files or directories, it will always be *just one compare* of the block or vol/snapshot ctime. 
With the necessity of doing just one compare comes far more predictable behavior of the whole FS and simplicity of the code making such decisions. In other words ZFS internally uses the well-known SLAB allocator, caching some data about the best possible location to allocate allocation units of different power-of-two sizes, like you can see on Linux in /proc/slabinfo in the case of the *kmalloc* SLABs. This is why in the case of ZFS the number of volumes and snapshots has zero impact on the avg speed of interactions over the VFS layer. If you are able to present real impact of fragmentation (again *if*) this may trigger other actions. So far, AFAIK, no one has been able to deliver real numbers or scenarios about such impact. And *if* such impact really exists, one of the solutions may be just to mimic what ZFS is doing (maybe there are other solutions). So please show us a test unit exposing the problem, with measurement methodology, presenting the pathology related to fragmentation. >> Bees is, btw, not about defragmentation: I have some OS containers >> running and I want to deduplicate data after updates. Deduplication done in userspace has natural consequences in the form of security issues. An executable doing such things will need full access to everything and needs to have exposed some API/ABI allowing it to fiddle with the content of the btrfs, which adds a second batch of security-related risks. Try to have a look at how deduplication works in the case of ZFS, without offline deduplication. >> In other words if someone is thinking that such defragmentation daemon >> is solving any problems he/she may be 100% right .. such person is >> only *thinking* that this is truth. > > Bees is not about that. I've been only trying to say that I would be really surprised if bees takes care of such scenarios. >> So first show that fragmentation is hurting latency of the >> access to btrfs data and that it is possible to measure such >> impact. 
Before you start measuring this you need to learn how to >> sample for example VFS layer latency. Do you know how to do this to >> deliver such proof? > > You didn't get the point. You only read "defragmentation" and your > alarm lights lit up. You even think bees would be a defragmenter. It > probably is more the opposite because it introduces more fragments in > exchange for more reflinks. So you are asking to start investing development time in implementing something without proving or demonstrating that the problem is real? No matter how long someone thinks about this, it will change nothing. [..] > Can we please not start a flame war just because you hate defrag tools? Really I have no idea where I wrote that I hate defragmentation. Using ZFS as a working and real example I've only told you that the necessity to reduce fragmentation is NULL if you follow the same path. In your world you are trying to tell me that your keys do not match the lock in the door. I'm only trying to tell you that there are many doors without a keyhole which can be opened and closed. I can only repeat that to trigger some action about defragmentation first you need to *present* some case scenario exposing that the problem is real. I may even believe that you may be right, but engineering is not something to which the term "believe" can be applied. Intuition may always be tricking you here into thinking that as long as the impact is non-zero someone should take care of it. No. If this impact is small enough it can be ignored, the same as we ignore some consequences of quantum physics in our life (the probability that a bucket of water standing on an open fire may freeze instead of boil is, according to quantum physics, always non-zero, and despite this fact no one has been able to observe something like this). In other words you need to show some *real numbers* which will show the SCALE of the issue. kloczek ^ permalink raw reply [flat|nested] 56+ messages in thread
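[Editorial note: the cold-cache latency measurement this message keeps asking for can be sketched in a few lines of shell. This is an assumed harness, not anyone's published methodology; the test file path is a placeholder, and the page-cache drop needs root.]

```shell
# Sketch of a cold-cache sequential-read timing loop: run it once on a
# fresh file, "age" the file system (updates, rewrites), run it again,
# and compare the reported times. drop_caches needs root, so it is
# left commented out here.
measure_read_ms() {
    sync
    # echo 3 > /proc/sys/vm/drop_caches   # uncomment when running as root
    local start_ns end_ns
    start_ns=$(date +%s%N)
    dd if="$1" of=/dev/null bs=1M 2>/dev/null
    end_ns=$(date +%s%N)
    echo $(( (end_ns - start_ns) / 1000000 ))
}
echo "read took $(measure_read_ms /path/to/testfile) ms"
```

Pairing this with `filefrag -v` on the same file before and after aging would relate the latency numbers directly to the extent count, which is the correlation the thread is arguing about.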
* Re: defragmenting best practice? 2017-09-14 17:48 ` Tomasz Kłoczko @ 2017-09-14 18:53 ` Austin S. Hemmelgarn 2017-09-15 2:26 ` Tomasz Kłoczko 2017-09-14 20:17 ` Kai Krakow 2017-09-15 10:54 ` Michał Sokołowski 2 siblings, 1 reply; 56+ messages in thread From: Austin S. Hemmelgarn @ 2017-09-14 18:53 UTC (permalink / raw) To: Tomasz Kłoczko; +Cc: linux-btrfs On 2017-09-14 13:48, Tomasz Kłoczko wrote: > On 14 September 2017 at 16:24, Kai Krakow <hurikhan77@gmail.com> wrote: > [..] >> Getting e.g. boot files into read order or at least nearby improves >> boot time a lot. Similar for loading applications. > > By how much is it possible to improve boot time? > Please give some example which I can try to replay, which will show > that we have similar results. > I still have one of my laptops with a spindle on a btrfs root fs (and > no other FSes in use) so I could confirm that my numbers > are close enough to your numbers. While it's not for BTRFS, a tool called e4rat might be of interest to you regarding this. It reorganizes files on an ext4 filesystem so that stuff used by the boot loader is right at the beginning of the device, and I've known people to get insane performance improvements (on the order of 20x in some pathologically bad cases) in the time taken from the BIOS handing things off to GRUB to GRUB handing execution off to the kernel. > >> Shake tries to >> improve this by rewriting the files - and this works because file >> systems (given enough free space) already do a very good job at doing >> this. But constant system updates degrade this order over time. > > OK. Please prepare some database, import some data whose size will be > a few times the unused RAM (best if this multiplication factor is > at least 10). Then do some batch of selects measuring the latency > distribution of those queries. > This will give you some data about not-fragmented data. 
> Then in the next stage apply some number of update queries, then > reboot the system or drop all caches, and repeat the same set of > selects. > After this all you need to do is compare the latency distributions. > >> It really doesn't matter if some big file is laid out in 1 allocation >> of 1 GB or in 250 allocations of 4MB: It really doesn't make a big >> difference. >> >> Recombining extents into bigger ones, though, can make a big difference in >> an aging btrfs, even on SSDs. > > That may be an issue with using extents. > Again: please show some results of some test unit which anyone will be > able to replay and confirm or not that this effect really exists. This shouldn't need examples. It's trivial math combined with basic knowledge of hardware behavior. Every request to a device has a minimum amount of overhead. On traditional hard drives, this is usually dominated by seek latency, but on SSD's, the request setup, dispatch, and completion are the dominant factor. Assuming you have a 2 micro-second overhead per-request (not an exact number, just chosen for demonstration purposes because it makes the math easy), and a 1GB file, the time difference between reading ten 100MB extents and reading ten thousand 100kB extents is just short of 0.02 seconds, or a factor of about one thousand (which, no surprise here, is the factor of difference between the number of extents). > > If the problem really exists and is related to extents you should have a real > scenario explanation why ZFS is not using extents. Extents have nothing to do with it. What matters is how much of the file data is contiguous (and therefore can be read as a single request) and how smart the FS is about figuring that out. Extents help figure that out, but the primary reason to use them is to save space encoding block allocations within a file (go take a look at how ext2 handles allocations, and then compare that to ext4, the difference is insane in terms of space savings). 
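[Editorial note: the back-of-the-envelope in the message above is easy to check mechanically. The numbers below are the same illustrative assumptions from the message: a 2 µs fixed cost per request and a 1 GB file split into ten vs. ten thousand extents.]

```shell
# Re-deriving the per-request overhead example: assumed 2 us fixed cost
# per request, 1 GB file read as 10 extents vs. 10,000 extents.
overhead_us=2
reqs_few=10        # ten 100 MB extents
reqs_many=10000    # ten thousand 100 kB extents
extra_us=$(( (reqs_many - reqs_few) * overhead_us ))
ratio=$(( reqs_many / reqs_few ))
echo "extra request overhead: ${extra_us} us"   # 19980 us, just short of 0.02 s
echo "request-count factor: ${ratio}x"          # 1000x
```

On a spinning disk, replace the 2 µs with a multi-millisecond seek time and the same request-count factor turns into tens of seconds, which is why the argument is much starker for spindles than for SSDs.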
> btrfs is not to far from classic approach do FS because it srill uses > allocation structures. > This is not the case in context of ZFS because this technology has no > information about what is already allocates. > ZFS uses free lists so by negation whatever is not on free list is > already allocated. > I'm not trying to point that ZFS is better but only point that by > changing allocation strategy you may not be blasted by something like > some extents bottleneck (which sill needs to be proven) > > There are at least few very good reason why it is even necessary to > change sometimes strategy from allocations structures to free lists. > First: ZFS free list management is very similar to known from Linux > memory SLAB allocator. > Did you heard that someone needs to do system memory defragnentation > because fragmented memory adds some additional latency to memory > access? > Other consequence is that with growing size of the files and number of > files or directories FS metadata are growing exponentially with size > and numbers of such objects. In case of free lists there is no such > growth and all structures are growing with linear correlation. > Caching in memory free list data takes much less than caching b-trees. > Last thing is effort on deallocating something in FS with allocation > structure and with free lists. > In classic approach number of such operations is growing with depth of b-trees. > In case free list all hat you need to do is compare ctime of the > allocated block with volume or snapshot ctime to make decision about > return or not block to free list. > No matter how many snapshots, volumes, files or directories allays it > will be *just one compare* of the block or vol/snapshot ctime. > With necessity to do just only one compare comes way better > predictable behavior of whole FS and simplicity of the code making > such decisions. 
> In other words ZFS internally uses well know SLAB allocator with > caching some data about best possible location to allocate some > different sizes allocation unit size multiplied by n^2 like you can > see on Linux in /proc/slabinfo in case of *kmalloc* SLABs. > This is why in case of ZFS number of volumes, snapshots has zero > impact on avg speed of interactions over VFS layer. > > If you will be able present real impact of the fragmentation (again > *if*) this may trigger other actions. > So AFAIK no one been able to deliver real numbers or scenarios about > such impact. > And *if* such impact really exist one of the solutions may be just > mimic what ZFS is doing (maybe there are other solutions). > > So please show us test unit exposing problem with measurement > methodology presenting pathology related to fragmentation. > >> Bees is, btw, not about defragmentation: I have some OS containers >> running and I want to deduplicate data after updates. > > Deduplication done in userspace has natural consequences in form of > security issues. > executable doing such things will need full access to everything and > needs to have exposed some API/ABI allowing fiddle with content of the > btrfs. Which adds second batch of security related risks. > > Try to have look how deduplication is working in case of ZFS without > offline deduplication. You mean how it eats tons of RAM and gives nearly no benefit in most cases compared to just using transparent compression? Online deduplication like ZFS offers has issues too. > >>> In other words if someone is thinking that such defragmentation daemon >>> is solving any problems he/she may be 100% right .. such person is >>> only *thinking* that this is truth. >> >> Bees is not about that. > > I've been only trying to say that I would be really surprised if bees > will be taking care of such scenarios. 
> >>> So first show that fragmentation is hurting latency of the >>> access to btrfs data and it will be possible to measurable such >>> impact. Before you will start measuring this you need to learn how o >>> sample for example VFS layer latency. Do you know how to do this to >>> deliver such proof? >> >> You didn't get the point. You only read "defragmentation" and your >> alarm lights lid up. You even think bees would be a defragmenter. It >> probably is more the opposite because it introduces more fragments in >> exchange for more reflinks. > > So you are asking to start investing in the development time > implementing something without proving or demonstrating that problem > is real? > No matter how long someone will be thinking about this it will change nothing. > > [..] >> Can we please not start a flame war just because you hate defrag tools? > > Really I have no idea where I wrote that I hate defragmentation. > Using ZFS as working and real example I've only told you that > necessity to reduce fragmentation is NULL if you are following exact > path. > In your world you are trying to tell that you keys do not match to the > locker in doors. > I'm only trying to tell you that there are many doors without key hole > which can be opened and closed. > > I can only repeat that to trigger some actions about defragmentation > first you need to *present* some case scenario exposing that the > problem is real. I may even believe you that you may be right but > engineering it is not something is possible to apply "believe" term. > > Intuition always may be tricking you here that as long as impact is > non-zero someone should take care of this. > No. 
if this impact will be enough small this can be ignored as same as > we are ignoring some consequences of the quantum physics in our life > (probability that bucket of water standing on open fire may freeze > instead boil according to quantum physics is always non-zero and > despite this fact no one been able to observe something like this). > In other words you need to show some *real numbers* which will show > SCALE of the issue. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-09-14 18:53 ` Austin S. Hemmelgarn @ 2017-09-15 2:26 ` Tomasz Kłoczko 2017-09-15 12:23 ` Austin S. Hemmelgarn 0 siblings, 1 reply; 56+ messages in thread From: Tomasz Kłoczko @ 2017-09-15 2:26 UTC (permalink / raw) To: linux-btrfs On 14 September 2017 at 19:53, Austin S. Hemmelgarn <ahferroin7@gmail.com> wrote: [..] > While it's not for BTRFS< a tool called e4rat might be of interest to you > regarding this. It reorganizes files on an ext4 filesystem so that stuff > used by the boot loader is right at the beginning of the device, and I've > know people to get insane performance improvements (on the order of 20x in > some pathologicallyb ad cases) in the time taken from the BIOS handing > things off to GRUB to GRUB handing execution off to the kernel. Do you know that what you've just wrote has nothing to do with fragmentation? Intentionally or not you just trying to change the subject. [..] > This shouldn't need examples. It's trivial math combined with basic > knowledge of hardware behavior. Every request to a device has a minimum > amount of overhead. On traditional hard drives, this is usually dominated > by seek latency, but on SSD's, the request setup, dispatch, and completion > are the dominant factor. Assumign you have a 2 micro-second overhead > per-request (not an exact number, just chosen for demonstration purposes > because it makes the math easy), and a 1GB file, the time difference between > reading ten 100MB extents and reading ten thousand 100kB extents is just > short of 0.02 seconds, or a factor of about one thousand (which, no surprise > here, is the factor of difference between the number of extents). So to produce few seconds delay during boot you need to make few hundreds thousands if not millions more IOs and on reading everything using ideal long sequential reads. Almost every package upgrade on rewrite some files in 100% will produce by using COW fully continuous areas per file. You know .. 
there are not so many files in a typical distribution installation to produce such a measurable impact. On my current laptop I have a lot of devel and debug stuff installed and still I have only $ rpm -qal | wc -l 276489 files (of which only a small fraction are ELF DSOs or executables) installed by: $ rpm -qa | wc -l 2314 packages. I can bet that during even a very complicated boot process only a few hundred files will be touched (by read IOs). None of those files will be read sequentially because this is not how executable content is usually loaded into the buffer cache. Simply changing the block device read-ahead may improve boot time enough without putting all blocks in perfect order. All you need is to run early enough "blockdev --setra N" where N is greater than the default 256 blocks. All this can be done without thinking about fragmentation. It seems you don't know that Linux by default reads data from a block dev in chunks of at least 256 blocks (1KB each one) because such an IO size is part of the default RA settings. You can change those settings just for boot time and you will have a way lower number of IOs and still no significant improvement like a few times shorter boot time. Fragmentation will in such a case be a secondary factor. All this could be done without bothering about fragmentation. In other words you are still talking about some theoretically possible results which will be falsified if you try at least once to do some real tests and measurements. The last time I did some boot time measurements, it was about using sequential start of all services vs. maximum parallelization. And yes, by this it was possible to improve boot time by a few times. All without bothering about fragmentation. Current fedora systemd base service definitions can be improved in many places by adding more dependencies and executing many small services in parallel. All those corrections can be done without even thinking about fragmentation. 
Because this base set of systemd services comes with the systemd source code, those improvements can be done for almost all Linux systemd-based distros. kloczek ^ permalink raw reply [flat|nested] 56+ messages in thread
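[Editorial note: one unit detail in the read-ahead discussion above is worth pinning down: `blockdev --getra`/`--setra` count in 512-byte sectors, not 1 KB blocks, so the default of 256 corresponds to a 128 KiB read-ahead window. The sketch below only does that conversion; the device name in the comment is an assumption.]

```shell
# Unit conversion for the read-ahead discussion above: blockdev counts
# read-ahead in 512-byte sectors, so the default of 256 sectors is a
# 128 KiB window per read-ahead request.
ra_sectors=256
ra_kib=$(( ra_sectors * 512 / 1024 ))
echo "default read-ahead window: ${ra_kib} KiB"
# The suggested early-boot tweak would then be something like
# (device name is an assumption, substitute the real root disk):
#   blockdev --setra 4096 /dev/sda
```

Note that larger read-ahead only helps when the data being prefetched is actually contiguous, which is exactly where the fragmentation argument re-enters.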
* Re: defragmenting best practice? 2017-09-15 2:26 ` Tomasz Kłoczko @ 2017-09-15 12:23 ` Austin S. Hemmelgarn 0 siblings, 0 replies; 56+ messages in thread From: Austin S. Hemmelgarn @ 2017-09-15 12:23 UTC (permalink / raw) To: Tomasz Kłoczko, linux-btrfs On 2017-09-14 22:26, Tomasz Kłoczko wrote: > On 14 September 2017 at 19:53, Austin S. Hemmelgarn > <ahferroin7@gmail.com> wrote: > [..] >> While it's not for BTRFS< a tool called e4rat might be of interest to you >> regarding this. It reorganizes files on an ext4 filesystem so that stuff >> used by the boot loader is right at the beginning of the device, and I've >> know people to get insane performance improvements (on the order of 20x in >> some pathologicallyb ad cases) in the time taken from the BIOS handing >> things off to GRUB to GRUB handing execution off to the kernel. > > Do you know that what you've just wrote has nothing to do with fragmentation? > Intentionally or not you just trying to change the subject. As hard as it may be to believe, this _is_ relevant to the part of your reply that I was responding to, namely: > By how much it is possible to improve boot time? Note that discussion of file ordering impacting boot times is almost always centered around the boot loader, and _not_ userspace (because as you choose to focus on in changing the subject for the rest of this message, it's trivially possible to improve performance in userspace with some really simple tweaks). You wanted examples regarding reordering of data in a localized manner improving boot time, so I gave _the_ reference for this on Linux (e4rat is the only publicly available tool I know of that does this). > > [..] >> This shouldn't need examples. It's trivial math combined with basic >> knowledge of hardware behavior. Every request to a device has a minimum >> amount of overhead. 
On traditional hard drives, this is usually dominated >> by seek latency, but on SSD's, the request setup, dispatch, and completion >> are the dominant factor. Assumign you have a 2 micro-second overhead >> per-request (not an exact number, just chosen for demonstration purposes >> because it makes the math easy), and a 1GB file, the time difference between >> reading ten 100MB extents and reading ten thousand 100kB extents is just >> short of 0.02 seconds, or a factor of about one thousand (which, no surprise >> here, is the factor of difference between the number of extents). > > So to produce few seconds delay during boot you need to make few > hundreds thousands if not millions more IOs and on reading everything > using ideal long sequential reads. No, that isn't what I was talking about. Quit taking things out of context and assuming all of someone's reply is about only part of yours. This was responding solely to this: > That it may be an issue with using extents. > Again: please show some results of some test unit which anyone will be > able to reply and confirm or not that this effect really exist. And has nothing to do with boot time. > Almost every package upgrade on rewrite some files in 100% will > produce by using COW fully continuous areas per file. > You know .. there is no so many files in typical distribution > installation to produce such measurable impact. > On my current laptop I have a lot of devel and debug stuff installed > and still I have only > > $ rpm -qal | wc -l > 276489 > > files (from which only small fractions are ELF DSOs or executables) > installed by: > > $ rpm -qa | wc -l > 2314 > > packages. > > I can bet that during even very complicated boot process it will be > touched (by read IOs) only few hundreds files. None of those files > will be read sequentially because this is not how executable content > is usually loaded into the buffer cache. 
Simple change block device > read ahead may improve boot time enough without putting all blocks in > perfect order. All what you need is start enough early "blockdev > --setra N" where N is greater than default 256 blocks. All this can be > done without thinking about fragmentation. As I mentioned above, the primary argument for reordering data for boot is largely related to the boot-loader, which doesn't have intelligent I/O scheduling and doesn't do read ahead, and is primarily about usage with traditional hard drives, where seek latency caused by lack of data locality actually does have a significant (and well documented) impact. > Seems you don't know that Linux by default is reading data from block > dev using at least 256 blocks (1KB each one) chunks because such IO > size is part of default RA settings, You can change those settings > just for boot time and you will have way lower number of IOs and sill > no significant improvement like few times shorter time. Fragmentation > will be in such case secondary factor. > All this could be done without bothering about fragmentation. The block-level read-ahead done by the kernel has near zero impact on performance unless your data is already highly local (not necessarily ordered, but at least all in the same place), which will almost never be the case on BTRFS when dealing with an active data set because of its copy on write semantics. > > In other words still you are talking about some institutionally > possible results which will be falsified if you will try at least one > time do some real tests and measurements. > Last time when I've been doing some boot time measurements it was > about using sequential start of all services vs. maximum > palatalization. And yes by this it was possible to improve boot time > by few times. All without bothering about fragmentation. 
> > Current fedora systemd base services definition can be improved in > many places by add more dependencies and execute many small services > in parallel. All those corrections can be done without even thinking > about fragmentation. Because these base sett of systemd services comes > with systemd source code those improvements can be done for almost all > Linux systemd based distros. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-09-14 17:48 ` Tomasz Kłoczko 2017-09-14 18:53 ` Austin S. Hemmelgarn @ 2017-09-14 20:17 ` Kai Krakow 2017-09-15 10:54 ` Michał Sokołowski 2 siblings, 0 replies; 56+ messages in thread From: Kai Krakow @ 2017-09-14 20:17 UTC (permalink / raw) To: linux-btrfs On Thu, 14 Sep 2017 18:48:54 +0100, Tomasz Kłoczko <kloczko.tomasz@gmail.com> wrote: > On 14 September 2017 at 16:24, Kai Krakow <hurikhan77@gmail.com> > wrote: [..] > > Getting e.g. boot files into read order or at least nearby improves > > boot time a lot. Similar for loading applications. > > By how much is it possible to improve boot time? > Please give some example which I can try to replay, which will show > that we have similar results. > I still have one of my laptops with a spindle on a btrfs root fs (and > no other FSes in use) so I could confirm that my numbers > are close enough to your numbers. I need to create a test setup because this system uses bcache. The difference (according to systemd-analyze) between warm bcache and no bcache at all ranges from 16-30s boot time vs. 3+ minutes boot time. I could turn off bcache, do a boot trace, try to rearrange boot files, boot again. However, that is not very reproducible as the current file layout is not defined. It'd be better to set up a separate machine where I could start over from a "well defined" state before applying optimization steps to see the differences between different strategies. At least readahead is not very helpful, I tested that in the past. It reduces boot time just by a few seconds, maybe 20-30, thus going from 3+ minutes to 2+ minutes. I still have an old laptop lying around: Single spindle, should make a good test scenario. I'll have to see if I can get it back into shape. It will take me some time. > > Shake tries to > > improve this by rewriting the files - and this works because file > > systems (given enough free space) already do a very good job at > > doing this. 
But constant system updates degrade this order over > > time. > > OK. Please prepare some database, import some data whose size will be > a few times the unused RAM (best if this factor is at least 10). Then > run a batch of selects, measuring the latency distribution of those > queries. Well, this is pretty easy. Systemd-journald is a real beast when it comes to cow fragmentation. Results can be easily generated and reproduced. There are long traces of discussions in the systemd mailing list, and I simply decided to make the files nocow right from the start, and that fixed it for me. I can simply revert it and create benchmarks. > This will give you some data about not-fragmented data. Well, I would probably do it the other way around: Generate a fragmented journal file (as that is how journald creates the file over time), then rewrite it in some manner to reduce extents, then run journal operations again on this file. Would it be a problem to turn this around? > Then, as a next stage, try to apply some number of update queries, > reboot the system or drop all caches, and repeat the same set of > selects. > After this, all you need to do is compare the latency distributions. Which tool should be used to measure which latencies? Speaking of latencies: What's of interest here is perceived performance resulting mostly from seek overhead (except probably in the journal file case, which simply overwhelms by the sheer number of extents). I'm not sure if measuring VFS latencies would provide any useful insights here. VFS probably still works fast enough in this case. > > It really doesn't matter if some big file is laid out in 1 > > allocation of 1 GB or in 250 allocations of 4 MB: It really doesn't > > make a big difference. > > > > Recombining extents into bigger ones, though, can make a big > > difference in an aging btrfs, even on SSDs. > That may be an issue with using extents. 
I can't follow why you argue that a file with thousands of extents vs. a file of the same size but only a few extents would make no difference to operate on. And of course this has to do with extents. But btrfs uses extents. Do you suggest using ZFS instead? Due to how cow works, the effect would probably be less or barely noticeable for writes, but read-scanning through the file becomes slow, with clearly more "noise" from the moving heads. > Again: please show some results of some test unit which anyone will > be able to replay and confirm or not that this effect really exists. > > If the problem really exists and is related to extents, you should > have a real scenario explanation of why ZFS is not using extents. That was never the discussion. You brought in the ZFS point. I read about the design reasoning behind ZFS when it appeared and started to gain public interest years back. > btrfs is not too far from the classic approach to FS design because > it still uses allocation structures. > This is not the case in the context of ZFS because this technology > has no information about what is already allocated. What about the btrfs free space tree? Isn't that more or less the same? But I don't believe that makes a significant difference for desktop-sized storage. I think the introduction of the free space tree was due to the performance of many-TB file systems up to petabyte storage (and beyond, of course). > ZFS uses free lists, so by negation whatever is not on a free list is > already allocated. > I'm not trying to point out that ZFS is better, but only that by > changing the allocation strategy you may not be blasted by something > like an extents bottleneck (which still needs to be proven) The reasoning behind using block-oriented allocation probably has more to do with providing efficient vdevs and snapshotting. Using extents for that has some nasty (and obvious) downsides if you think about it, like slack space from only partially shared extents. 
I guess that is why bees rewrites extents and then shares them again using the EXTENT_SAME IOCTL. It generates a lot of writes just to free some unused extent slack. > There are at least a few very good reasons why it is sometimes even > necessary to change strategy from allocation structures to free > lists. > First: ZFS free list management is very similar to the Linux kernel > memory SLAB allocator. > Did you ever hear that someone needs to do system memory > defragmentation because fragmented memory adds additional latency to > memory access? 64-bit systems tend to have enough address space that this is not an issue. But it can easily become an issue if you fill the page tables or use huge pages a lot. There's really something like memory fragmentation but you usually don't defragment memory (and yes, such products existed in the past for unnamed popular "OS"es but that is snake oil). And I can totally follow why free lists are better here, you don't need to explain that. BTW: Do you really compare RAM to spindle storage now? Latency for RAM access is clearly more an electrical than a mechanical problem and also very predictable and thus static, like it is with SSDs. > Another consequence is that, with growing file sizes and numbers of > files or directories, FS metadata grows exponentially with the size > and number of such objects. I'm not sure if this holds true for every implementation out there. You can make it pretty linear if you wanted to (but you don't). > In the case of free lists there is no such growth, and all structures > grow with linear correlation. Why is that so? Can you illustrate examples? Well, of course lists are linear, trees are not. But lists become slow. So if you implement free lists as trees, I don't think that growth is strictly linear. That's just not how trees work. And a list will become slow at some point. BTW: The slab memory allocator indeed has to handle fragmentation issues. And it can become slow if used in the wrong ways. 
Slab uses a triple linked list to keep track of allocations: full items, free items and mixed items (items that hold both allocated and free objects). I think you can compare btrfs chunks and extents to how slab manages memory. A full btrfs chunk would be tracked as a full slab item, a free chunk as a free item, and the rest is mixed. Inserting objects into slab would compare to btrfs extents. You will have some slack because you cannot optimally fit all different-sized extents into a chunk. If you deallocate objects (thus remove an extent), you'll get fragmented free space. I think btrfs knows pretty well where such free space exists, and it can find it. But if it has to start looking in the mixed case, it will be harder to find fitting space (especially an optimal fit). Slab struggles with the same problem. But it has to move no disk heads for this. And I think slab sorts objects into different size buckets to alleviate such problems where possible. I think even ZFS differentiates block sizes into different buckets for more performant and optimal handling. Btrfs has to try to fit it with a lot of strategies to optimize this: Will the extent grow shortly? Should I allocate now or later? Maybe later would provide a better fit? It is a good strategy for most workloads, but it doesn't play well with CoW. > Caching free list data in memory takes much less than caching > b-trees. The last thing is the effort of deallocating something in an > FS with allocation structures vs. with free lists. > In the classic approach the number of such operations grows with the > depth of the b-trees. In the free-list case, all you need to do is > compare the ctime of the allocated block with the volume or snapshot > ctime to decide whether or not to return the block to the free list. As noted above, I can follow why this was chosen. But that's not the topic here. Btrfs has b-trees - that's what it is. It's not ZFS. It's not ext4. It is btrfs. 
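As an aside, the slab-style bookkeeping described above can be shown in a minimal Python model of an allocator with per-size free lists. This is purely illustrative - none of it is actual kernel SLAB, btrfs or ZFS code - but it shows why allocating from size buckets needs no best-fit search over extents:

```python
# Illustrative model only: a slab/free-list style allocator that keeps
# one free list per size bucket (like kmalloc-64, kmalloc-128, ...).

class SlabAllocator:
    def __init__(self, bucket_sizes):
        # One free list per bucket, ordered smallest to largest.
        self.buckets = {size: [] for size in sorted(bucket_sizes)}

    def free(self, addr, size):
        # Return the block to the smallest bucket that fits it.
        for bucket in self.buckets:
            if size <= bucket:
                self.buckets[bucket].append(addr)
                return
        raise ValueError("no bucket fits size %d" % size)

    def alloc(self, size):
        # Allocation is a constant-time pop from the first fitting
        # non-empty bucket -- no tree walk, no best-fit extent scan.
        for bucket, free_list in self.buckets.items():
            if size <= bucket and free_list:
                return free_list.pop()
        return None  # a real allocator would grow a new slab here

slab = SlabAllocator([64, 128, 256])
slab.free(0x1000, 60)
slab.free(0x2000, 100)
print(hex(slab.alloc(50)))   # -> 0x1000, served from the 64-byte bucket
print(hex(slab.alloc(90)))   # -> 0x2000, served from the 128-byte bucket
```

The trade-off is the slack visible above: a 50-byte request consumes a whole 64-byte slot, which is exactly the "you cannot optimally fit all different-sized extents" point.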
You say "btrfs needs no defragmentation, it makes no difference in speed" but now you list the many flaws and performance downsides of things different to ZFS. So maybe there is a benefit in coalescing many small extents back into few big extents? Or there is a benefit in coalescing free space all over the place into fewer chunks as "btrfs balance" would do it? Why are there these tools if it makes no difference to have them? When there was no strong benefit, why did anyone bother with the effort of programming this and putting infrastructure into the kernel for it when the kernel is already clearly very complex? Why did anyone program different file systems? We could have gone with ext4, or xfs (which starts to support reflinks already). What's the point of autodefrag when it's not needed? > No matter how many snapshots, volumes, files or directories allays it > will be *just one compare* of the block or vol/snapshot ctime. > With necessity to do just only one compare comes way better > predictable behavior of whole FS and simplicity of the code making > such decisions. You almost completely convinced me to ditch btrfs and use ZFS and recommend it to everyone who feels the urge to "defragment" even only one if her/his files... How much RAM do I need again for ZFS to operate with good performance? > In other words ZFS internally uses well know SLAB allocator with > caching some data about best possible location to allocate some > different sizes allocation unit size multiplied by n^2 like you can > see on Linux in /proc/slabinfo in case of *kmalloc* SLABs. > This is why in case of ZFS number of volumes, snapshots has zero > impact on avg speed of interactions over VFS layer. I'm feeling the whole discussion only started because you think performance perception solely comes from VFS latencies. Is that so? > If you will be able present real impact of the fragmentation (again > *if*) this may trigger other actions. 
I'm starting to guess that the numbers I'd present are not convincing for you because you only want to see VFS latencies. Please think of something imaginary: Perceived performance *whoosh* Sure, I can throw lots of RAM at the problem. I can throw SSDs at the problem. I can introduce HBAs with huge caching capabilities. I can throw ZFS with L2ARC and ZIL at it. Plus huge amounts of RAM. It's all no problem, we actually do that for high-performance, high-cost enterprise server machines. But the ordinary desktop user can probably not afford that. > So AFAIK no one has been able to deliver real numbers or scenarios > about such impact. > And *if* such impact really exists, one of the solutions may be just > to mimic what ZFS is doing (maybe there are other solutions). No. Probably not. You cannot just replace btrfs infrastructure with something else and still call it btrfs. And also, there would be no migration path. And then: ZFS on Linux is already there. If I want ZFS, I use it, and do not invest effort to make something else into ZFS. Remember the rules: If it's not broken, don't fix it. And also use the tools that fit best. When we are faced with what is here, and it improves things as a one-shot solution for an acceptable period of time - why not use it? I mean, MacGyver would also use that bubble gum to glue the lighter to a stick, and not walk to the next super glue store to get the one and only valid way to glue lighters to sticks. The bubble gum will do long enough to temporarily solve the problem. > So please show us a test unit exposing the problem, with a > measurement methodology presenting the pathology related to > fragmentation. Yeah, I get it: Fragmentation is a non-issue. > > Bees is, btw, not about defragmentation: I have some OS containers > > running and I want to deduplicate data after updates. > Deduplication done in userspace has natural consequences in the form > of security issues. Yes, of course. It needs proper isolation. 
The kernel is already very "bloated", do you really want another worker process doing complicated things running directly in kernel space? This naturally introduces stability issues (which, btw, also introduce security issues). What about providing better interfaces for exactly such operations? > executable doing such things will need full access to everything and > needs to have exposed some API/ABI allowing fiddle with content of the > btrfs. Which adds second batch of security related risks. It depends on how much other interfaces such a process exposes. You can use proper process isolation. And maybe you shouldn't run it on untrusted machines. But then again: Personally, I'd not store sensitive information there. If security is your concern, then don't bloat the kernel with such things, and then simply don't run it. Every extra process running can be a security issue. Everyone knows that. > Try to have look how deduplication is working in case of ZFS without > offline deduplication. I didn't investigate the inner workings but I know it needs lots of RAM. > >> In other words if someone is thinking that such defragmentation > >> daemon is solving any problems he/she may be 100% right .. such > >> person is only *thinking* that this is truth. > > > > Bees is not about that. > > I've been only trying to say that I would be really surprised if bees > will be taking care of such scenarios. It at least tries to not be totally inefficient and as far as I read the code and docs, it removes extent slack by recombining and resplitting extents using data-safe kernel operations. But not for the sake of defragmenting. > >> So first show that fragmentation is hurting latency of the > >> access to btrfs data and it will be possible to measurable such > >> impact. Before you will start measuring this you need to learn how > >> o sample for example VFS layer latency. Do you know how to do this > >> to deliver such proof? > > > > You didn't get the point. 
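For what it's worth, the scanning half of an out-of-band deduplicator like bees can be sketched roughly as below. This is a hypothetical, heavily simplified Python sketch: it only finds candidate duplicate blocks by content hash; the real tool keeps a persistent hash table and then lets the kernel byte-compare and share the extents via the dedupe-range ioctl, so sharing never happens on a hash match alone.

```python
import hashlib

BLOCK = 4096  # dedupe works on block-aligned ranges

def duplicate_blocks(paths):
    """Map content-hash -> list of (path, offset) for equal-sized blocks.

    Only hashes seen more than once are returned: these are the
    candidates a real deduplicator would hand to the kernel for
    verification and extent sharing.
    """
    seen = {}
    for path in paths:
        with open(path, "rb") as f:
            offset = 0
            while True:
                block = f.read(BLOCK)
                if not block:
                    break
                digest = hashlib.sha256(block).hexdigest()
                seen.setdefault(digest, []).append((path, offset))
                offset += len(block)
    return {h: locs for h, locs in seen.items() if len(locs) > 1}
```

Running this over two container images would list every shared 4 KiB block; the dangerous part (actually rewriting extents) stays inside the kernel, which is the data-safety argument made above.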
You only read "defragmentation" and your > > alarm lights lit up. You even think bees would be a defragmenter. It > > probably is more the opposite, because it introduces more fragments > > in exchange for more reflinks. > > So you are asking to start investing development time in implementing > something without proving or demonstrating that the problem is real? No, you did ask for it between the lines. You are talking about latencies of single accesses. That is probably no problem. BTW: You don't need to prove that to me. But - personal experience - when searching the system journal takes me 30-40s, and after I defragmented the file it takes just 3-4 seconds? What does this have to do with VFS layer latencies? Nothing! I'm even in the same boat with you, saying that the many file accesses are still all low-latency at the VFS layer. But boy, there are so many more of them! That is perceived performance. Fragmentation makes a performance difference. It takes no scientific approach to believe that. The fix is already implemented: defrag the extents. The kernel has an IOCTL for this. Now, leverage the tools for it: To fasten a screw, you use a screwdriver. You don't build it yourself, you take it from your toolbox. The screw is already there, the screwdriver is there. Nothing to invent. MacGyver wouldn't build one himself when one was already lying around. > No matter how long someone will be thinking about this, it will > change nothing. Probably the right conclusion. So let's take the tools that are here, or switch to a better-fitting file system (which, btw, is also a tool that is available). > [..] > > Can we please not start a flame war just because you hate defrag > > tools? > > Really, I have no idea where I wrote that I hate defragmentation. > Using ZFS as a working and real example, I've only told you that the > necessity to reduce fragmentation is NULL if you follow the exact > path. Yes, I'll provide data for systemd journal access. 
And please, not another thread about that application. > In your world you are trying to tell that you keys do not match to the > locker in doors. No, the key is just under the carpet. Use it, and turn it in the right direction. > I'm only trying to tell you that there are many doors without key hole > which can be opened and closed. That is insecure. *scnr > I can only repeat that to trigger some actions about defragmentation > first you need to *present* some case scenario exposing that the > problem is real. I may even believe you that you may be right but > engineering it is not something is possible to apply "believe" term. Okay, no more hints about useful software because btrfs already has everything you ever need. Seriously, I didn't ask for fixing anything in btrfs. I hinted two tools that the OP could benefit from when using snapshots and handling fragmented files and asking for best practice. And I didn't recommend to defragment the whole filesystem all day long because it will give you a speed boost of 100+%. You jumped the train and said that defragmentation is never needed, because btrfs does all this perfectly already, while later telling how much better zfs does everything, then telling that extent allocation is the problem. Ah yes, we get to the point... But well, that's a non-issue because VFS latencies are not the problem except I scientifically prove it. No one wanted to go so far and deep. Really. Fragmented files with lots of small extents? Defragment this file. Did it help? Yes, okay that's your tool, the problem comes from the CoW nature. Also, please use bees if you are planning to defrag files part of the snapshot reflinks or undo operations. Maybe btrfs doesn't fit your workload then. If no, okay let's look at the underlying problem. Now it's time to do all this scientific stuff and so on. But this has totally been hijacked with no chance for the OP to follow this thread sanely. 
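To make the "perceived performance" point above concrete, here is a toy seek-cost model. It is not a benchmark and the constants are made-up (though typical for a 7200 rpm disk); it only shows why reading the same file through one extent vs. ten thousand extents behaves so differently, much like the 3-4s vs. 30-40s journal anecdote:

```python
# Toy model (not a benchmark): estimate the read time of a file from
# its extent list, charging one seek per discontiguous extent.

SEEK_MS = 10.0          # assumed average seek + rotational latency
TRANSFER_MB_S = 120.0   # assumed sequential throughput

def read_time_ms(extent_sizes_mb):
    seeks = len(extent_sizes_mb) * SEEK_MS
    transfer = sum(extent_sizes_mb) / TRANSFER_MB_S * 1000.0
    return seeks + transfer

# The same 100 MB journal file, laid out two ways:
contiguous = read_time_ms([100])            # 1 extent
fragmented = read_time_ms([0.01] * 10000)   # 10000 tiny extents
print("contiguous: %.0f ms" % contiguous)   # -> contiguous: 843 ms
print("fragmented: %.0f ms" % fragmented)   # -> fragmented: 100833 ms
```

The transfer term is identical in both cases; the two-orders-of-magnitude gap is pure seek overhead, which is also why the same fragmentation matters far less on an SSD.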
> Intuition always may be tricking you here that as long as impact is > non-zero someone should take care of this. Yes, if access to the file is slow, I rewrite it with some tool, and now it's fast. I must have been totally tricked. God, how dare I to measure the time with a clock and not some block tracing debug tool from the kernel... And if I rearrange boot files on a spindle and the system comes up in 30s now like a fresh build instead of in 2 minutes... I must have been tricked. Okay, it was Windows. But really, tell me: What does Windows do what Linux wouldn't do during boot? Read files? Nah... I can deduce that it has an effect even on Linux, I'm just still into finding and making the right tool for it while meanwhile I circumvented it with bcache. And please, I don't use those shiny snake oil defraggers with even counterproductive effects on the file system. I'm not a dumb non-tech reader born this millenium, I'm not clicking those click-bait articles "defragment your harddrive for speed". I'm looking into the technical workings behind this (and other stuff), since almost 30 years. There are only very very few tools available that do defrag right. And I know exactly 2, one for NTFS, one for ext3. But in the FOSS world, I can at least improve that. But maybe I shouldn't even try, because there is no problem. And there's nothing to fix. > No. if this impact will be enough small this can be ignored as same as > we are ignoring some consequences of the quantum physics in our life > (probability that bucket of water standing on open fire may freeze > instead boil according to quantum physics is always non-zero and > despite this fact no one been able to observe something like this). > In other words you need to show some *real numbers* which will show > SCALE of the issue. Quantum physics is - literally - when you try to plug your USB thumb drive and it doesn't fit, turn it around, try again, and it doesn't fit, then look at it and try again, and it fits. 
And that is a perfect example of what the Schrödinger experiment really stands for. Try that with your water example, it won't work so easily. ;-) -- Regards, Kai Replies to list-only preferred. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-09-14 17:48 ` Tomasz Kłoczko 2017-09-14 18:53 ` Austin S. Hemmelgarn 2017-09-14 20:17 ` Kai Krakow @ 2017-09-15 10:54 ` Michał Sokołowski 2017-09-15 11:13 ` Peter Grandi 2017-09-15 13:07 ` Tomasz Kłoczko 2 siblings, 2 replies; 56+ messages in thread From: Michał Sokołowski @ 2017-09-15 10:54 UTC (permalink / raw) To: Tomasz Kłoczko; +Cc: Linux fs Btrfs [-- Attachment #1: Type: text/plain, Size: 687 bytes --] On 09/14/2017 07:48 PM, Tomasz Kłoczko wrote: > On 14 September 2017 at 16:24, Kai Krakow <hurikhan77@gmail.com> wrote: > [..] >> > Getting e.g. boot files into read order or at least nearby improves >> > boot time a lot. Similar for loading applications. > [...] > Just please some example which I can try to replay which ill be > showing that we have similar results. Case #1 2x 7200 rpm HDD -> md raid 1 -> host BTRFS rootfs -> qemu cow2 storage -> guest BTRFS filesystem SQL table row insertions per second: 1-2 Case #2 2x 7200 rpm HDD -> md raid 1 -> host BTRFS rootfs -> qemu raw storage -> guest EXT4 filesystem SQL table row insertions per second: 10-15 [-- Attachment #2: S/MIME Cryptographic Signature --] [-- Type: application/pkcs7-signature, Size: 3849 bytes --] ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-09-15 10:54 ` Michał Sokołowski @ 2017-09-15 11:13 ` Peter Grandi 2017-09-15 13:07 ` Tomasz Kłoczko 1 sibling, 0 replies; 56+ messages in thread From: Peter Grandi @ 2017-09-15 11:13 UTC (permalink / raw) To: Linux fs Btrfs > Case #1 > 2x 7200 rpm HDD -> md raid 1 -> host BTRFS rootfs -> qemu cow2 storage > -> guest BTRFS filesystem > SQL table row insertions per second: 1-2 "Doctor, if I stab my hand with a fork it hurts a lot: can you cure that?" > Case #2 > 2x 7200 rpm HDD -> md raid 1 -> host BTRFS rootfs -> qemu raw > storage -> guest EXT4 filesystem > SQL table row insertions per second: 10-15 "Doctor, I can't run as fast with a backpack full of bricks as without it: can you cure that?" :-) ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-09-15 10:54 ` Michał Sokołowski 2017-09-15 11:13 ` Peter Grandi @ 2017-09-15 13:07 ` Tomasz Kłoczko 2017-09-15 14:11 ` Michał Sokołowski 1 sibling, 1 reply; 56+ messages in thread From: Tomasz Kłoczko @ 2017-09-15 13:07 UTC (permalink / raw) To: Michał Sokołowski; +Cc: Linux fs Btrfs On 15 September 2017 at 11:54, Michał Sokołowski <michal@sarach.com.pl> wrote: [..] >> Just please some example which I can try to replay which will be >> showing that we have similar results. > > Case #1 > 2x 7200 rpm HDD -> md raid 1 -> host BTRFS rootfs -> qemu cow2 storage > -> guest BTRFS filesystem > SQL table row insertions per second: 1-2 > > Case #2 > 2x 7200 rpm HDD -> md raid 1 -> host BTRFS rootfs -> qemu raw storage -> > guest EXT4 filesystem > SQL table row insertions per second: 10-15 Q -1) Why are you comparing btrfs against ext4 on top of btrfs, which is doing its own COW operations at the bottom of such a sandwich, if we are SUPPOSED to be talking about the impact of fragmentation on top of btrfs? Q 0) What do you think you are measuring here? Q 1) How did you produce those time measurements? The time command? Looking at a watch? Q 2) Why are there ranges of timings? Did you repeat some operations a few times (how many times, and with or without dropping caches or doing reboots)? Q 3) What kind of SQL engine? With what kind of settings? With what kind of tables (indexes? foreign keys?)? What kind of transaction semantics? Q 4) Where is the example set of inserts which I can replay in my setup? Did you drop caches before the batch of inserts? (Do you know that every insert generates some number of read IOs as well, so whether something is already cached before the batch of inserts is *crucial*?) Did you restart the SQL engine? Q 5) Have both tests been executed on the same box? If not, which versions of the kernel(s) have been used? Q 6) Effectively, how many IOs have been done during those tests? How did you measure those numbers (dtrace? perf? systemtap?) Q 7) Why are you running your tests over qemu? Was anything else running on the host system during those tests? . . . I can probably make this list of questions 2 or 3 times longer. kloczek -- Tomasz Kłoczko | LinkedIn: http://lnkd.in/FXPWxH ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-09-15 13:07 ` Tomasz Kłoczko @ 2017-09-15 14:11 ` Michał Sokołowski 2017-09-15 16:35 ` Peter Grandi 2017-09-15 17:08 ` Kai Krakow 0 siblings, 2 replies; 56+ messages in thread From: Michał Sokołowski @ 2017-09-15 14:11 UTC (permalink / raw) To: Tomasz Kłoczko; +Cc: Linux fs Btrfs [-- Attachment #1: Type: text/plain, Size: 2877 bytes --] On 09/15/2017 03:07 PM, Tomasz Kłoczko wrote: > [...] > Case #1 > 2x 7200 rpm HDD -> md raid 1 -> host BTRFS rootfs -> qemu cow2 storage > -> guest BTRFS filesystem > SQL table row insertions per second: 1-2 > > Case #2 > 2x 7200 rpm HDD -> md raid 1 -> host BTRFS rootfs -> qemu raw storage -> > guest EXT4 filesystem > SQL table row insertions per second: 10-15 > Q -1) Why are you comparing btrfs against ext4 on top of btrfs, which > is doing its own COW operations at the bottom of such a sandwich, if > we are SUPPOSED to be talking about the impact of fragmentation on > top of btrfs? Tomasz, you seem to be convinced that fragmentation does not matter. I found that this (admittedly extremely bad) example says otherwise. > Q 0) What do you think you are measuring here? CoW fragmentation's impact on SQL write performance. > Q 1) How did you produce those time measurements? The time command? > Looking at a watch? The time command (real) on a bash script inserting 1000 rows (an index and a 128B random string). > Q 2) Why are there ranges of timings? Did you repeat some operations > a few times (how many times, and with or without dropping caches or > doing reboots)? Yes, we repeated it. With and without flushing the cache (it didn't seem to have any impact). I cannot remember whether there were any reboots. The big time ranges are there because I don't have the exact numbers on me. It was a quick and dirty task to find, prove and remove a performance bottleneck at minimal cost. AFAIR, removing the cow2 storage and the guest BTRFS storage gave us a ~10 times boost. 
Surprisingly for us this boost seems to be consistent (it does not degrade noticeably over time - 2 months from the change). > Q 3) What kind of SQL engine? with what kind of settings? with what > kind of tables? (indexes? foreign keys?) What kind of transactions > semantics? PostgreSQL and MySQL both gave us those results. * > Q 4) where is the example set of inserts which I can replay in my > setup? did you drop caches before batch of inserts? (do you know that > every insert generates as well some number of read IOs so information > is something is already cached before batch of inserts is *crucial*) > Did you restart SQL engine? > Q 5) are both test have been executed on the same box? if not which > one version of the kernel(s) have been used? Same distribution, machine and kernel. * > Q 6) ) effectively how many IOs have been done during those tests? how > did you measured those numbers (dtrace? perf? systemtap?) I didn't check that. * > Q7) why you are running your tests over qemu? Is it anything more > running on the host system during those tests? Because of "production" environment location. No, there was not. *) If you're really interested in (which I doubt), then I can put example environment somewhere and gather more data. [-- Attachment #2: S/MIME Cryptographic Signature --] [-- Type: application/pkcs7-signature, Size: 3849 bytes --] ^ permalink raw reply [flat|nested] 56+ messages in thread
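The insert workload described above (1000 rows of an index plus a 128-byte random string, timed with the time command) can be sketched portably as below. This is an illustrative stand-in, not the original script: SQLite from the Python standard library replaces the PostgreSQL/MySQL engines actually tested, so absolute numbers will differ, but the one-commit-per-row write pattern that exposes CoW fragmentation is the same.

```python
import os
import sqlite3
import time

def insert_benchmark(db_path, rows=1000):
    """Time `rows` single-row inserts and return the rows/second rate."""
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS t (id INTEGER PRIMARY KEY, payload BLOB)"
    )
    start = time.monotonic()
    for i in range(rows):
        # One transaction per row: every commit forces a flush, which
        # is the small-random-write pattern that badly fragments a
        # CoW-backed database file over time.
        with con:
            con.execute("INSERT INTO t VALUES (?, ?)", (i, os.urandom(128)))
    elapsed = time.monotonic() - start
    con.close()
    return rows / elapsed
```

Running `insert_benchmark("bench.db")` once with the file datacow and once with it nocow (`chattr +C` on an empty file first) would isolate the btrfs effect without the qemu layers in between.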
* Re: defragmenting best practice? 2017-09-15 14:11 ` Michał Sokołowski @ 2017-09-15 16:35 ` Peter Grandi 2017-09-15 17:08 ` Kai Krakow 1 sibling, 0 replies; 56+ messages in thread From: Peter Grandi @ 2017-09-15 16:35 UTC (permalink / raw) To: Linux fs Btrfs [ ... ] >>>> Case #1 >>>> 2x 7200 rpm HDD -> md raid 1 -> host BTRFS rootfs >>>> -> qemu cow2 storage -> guest BTRFS filesystem >>>> SQL table row insertions per second: 1-2 >>>> Case #2 >>>> 2x 7200 rpm HDD -> md raid 1 -> host BTRFS rootfs >>>> -> qemu raw storage -> guest EXT4 filesystem >>>> SQL table row insertions per second: 10-15 [ ... ] >> Q 0) what do you think that you measure here? > Cow's fragmentation impact on SQL write performance. That's not what you are measuring; you are measuring the impact on speed of configurations "designed" (perhaps unintentionally) for maximum flexibility, lowest cost, and complete disregard for speed. [ ... ] > It was quick and dirty task to find, prove and remove > performance bottleneck at minimal cost. This is based on the usual confusion between "performance" (the result of several tradeoffs) and "speed". When you report "row insertions per second" you are reporting a rate, that is a "speed", not "performance", which is always multi-dimensional. http://www.sabi.co.uk/blog/15-two.html?151023#151023 In the cases above speed is low, but I think that, taking into account flexibility and cost, performance is pretty good. > AFAIR removing storage cow2 and guest BTRFS storage gave us ~ > 10 times boost. "Oh doctor, if I stop stabbing my hand with a fork it no longer hurts, but running while carrying a rucksack full of bricks is still slower than with a rucksack full of feathers". [ ... ] ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-09-15 14:11 ` Michał Sokołowski 2017-09-15 16:35 ` Peter Grandi @ 2017-09-15 17:08 ` Kai Krakow 2017-09-15 19:10 ` Tomasz Kłoczko 1 sibling, 1 reply; 56+ messages in thread From: Kai Krakow @ 2017-09-15 17:08 UTC (permalink / raw) To: linux-btrfs Am Fri, 15 Sep 2017 16:11:50 +0200 schrieb Michał Sokołowski <michal@sarach.com.pl>: > On 09/15/2017 03:07 PM, Tomasz Kłoczko wrote: > > [...] > > Case #1 > > 2x 7200 rpm HDD -> md raid 1 -> host BTRFS rootfs -> qemu cow2 > > storage -> guest BTRFS filesystem > > SQL table row insertions per second: 1-2 > > > > Case #2 > > 2x 7200 rpm HDD -> md raid 1 -> host BTRFS rootfs -> qemu raw > > storage -> guest EXT4 filesystem > > SQL table row insertions per second: 10-15 > > Q -1) why you are comparing btrfs against ext4 on top of the btrfs > > which is doing own COW operations on bottom of such sandwiches .. > > if we SUPPOSE to be talking about impact of the fragmentation on > > top of btrfs? > > Tomasz, > you seem to be convinced that fragmentation does not matter. I found > this (extremely bad, true) example says otherwise. Sorry to jump this, but did you at least set the qemu image to nocow? Otherwise this example is totally flawed because you're testing qemu storage layer mostly and not btrfs. A better test would've been to test qemu raw on btrfs cow vs on btrfs nocow, with both the same file system inside the qemu image. But you are modifying multiple parameters at once during the test, and I expect then everyone has a huge impact on performance but only one is specific to btrfs which you apparently did not test this way. Personally, running qemu cow2 on btrfs cow really helps nothing except really bad performance. Make one of both layers nocow and it should become better. If you want to give some better numbers, please reduce this test to just one cow layer, the one at the top layer: btrfs host fs. 
Copy the image somewhere else to restore from, and ensure (using filefrag) that the starting situation matches each test run. Don't change any parameters of the qemu layer at each test. And run a file system inside which doesn't do any fancy stuff, like ext2 or ext3 without journal. Use qemu raw storage. Then test again with cow vs nocow on the host side. Create a nocow copy of your image (use size of the source image for truncate): # rm -f qemu-image-nocow.raw # touch qemu-image-nocow.raw # chattr +C -c qemu-image-nocow.raw # dd if=source-image.raw of=qemu-image-nocow.raw bs=1M # btrfs fi defrag -f qemu-image-nocow.raw # filefrag -v qemu-image-nocow.raw Create a cow copy of your image: # rm -f qemu-image-cow.raw # touch qemu-image-cow.raw # chattr -C -c qemu-image-cow.raw # dd if=source-image.raw of=qemu-image-cow.raw bs=1M # btrfs fi defrag -f qemu-image-cow.raw # filefrag -v qemu-image-cow.raw Given that host btrfs is mounted datacow,compress=none and without autodefrag, and you don't touch the source image contents during tests. Now run your test script inside both qemu machines, take your measurements and check fragmentation again after the run. filefrag should report no more fragments than before the test for the first test, but should report a magnitude more for the second test. Now copy (cp) both one at a time to a new file and measure the time. It should be slower for the highly fragmented version. Don't forget to run tests with and without flushed caches so we get cold and warm numbers. In this scenario, qemu would only be the application to modify the raw image files and you're actually testing the impact of fragmentation of btrfs. You could also make a reflink copy of the nocow test image and do a third test to see that it introduces fragmentation now, tho probably much lower than for the cow test image. You can verify the numbers with filefrag. 
According to Tomasz, your tests should not run at vastly different speeds because fragmentation has no impact on performance, quod est demonstrandum... I think we will not get to the "erat" part. -- Regards, Kai Replies to list-only preferred. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-09-15 17:08 ` Kai Krakow @ 2017-09-15 19:10 ` Tomasz Kłoczko 2017-09-20 6:38 ` Dave 2017-09-20 7:34 ` Dmitry Kudriavtsev 0 siblings, 2 replies; 56+ messages in thread From: Tomasz Kłoczko @ 2017-09-15 19:10 UTC (permalink / raw) To: Linux fs Btrfs

On 15 September 2017 at 18:08, Kai Krakow <hurikhan77@gmail.com> wrote:
[..]
> According to Tomasz, your tests should not run at vastly different
> speeds because fragmentation has no impact on performance, quod est
> demonstrandum... I think we will not get to the "erat" part.

No, that is not precisely what I'm trying to say. However, seeing that there is no precise, fully repeatable methodology for performing the proposed test, I have huge doubts about whether the reported effect has anything to do with fragmentation, or whether it is a side effect of using COW (which allows gluing a number of random updates together into larger sequential write IOs).

kloczek
--
Tomasz Kłoczko LinkedIn: http://lnkd.in/FXPWxH

^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-09-15 19:10 ` Tomasz Kłoczko @ 2017-09-20 6:38 ` Dave 2017-09-20 11:46 ` Austin S. Hemmelgarn ` (2 more replies) 2017-09-20 7:34 ` Dmitry Kudriavtsev 1 sibling, 3 replies; 56+ messages in thread From: Dave @ 2017-09-20 6:38 UTC (permalink / raw) To: Linux fs Btrfs >On Thu 2017-08-31 (09:05), Ulli Horlacher wrote: >> When I do a >> btrfs filesystem defragment -r /directory >> does it defragment really all files in this directory tree, even if it >> contains subvolumes? >> The man page does not mention subvolumes on this topic. > >No answer so far :-( > >But I found another problem in the man-page: > > Defragmenting with Linux kernel versions < 3.9 or >= 3.14-rc2 as well as > with Linux stable kernel versions >= 3.10.31, >= 3.12.12 or >= 3.13.4 > will break up the ref-links of COW data (for example files copied with > cp --reflink, snapshots or de-duplicated data). This may cause > considerable increase of space usage depending on the broken up > ref-links. > >I am running Ubuntu 16.04 with Linux kernel 4.10 and I have several >snapshots. >Therefore, I better should avoid calling "btrfs filesystem defragment -r"? > >What is the defragmenting best practice? >Avoid it completly? My question is the same as the OP in this thread, so I came here to read the answers before asking. However, it turns out that I still need to ask something. Should I ask here or start a new thread? (I'll assume here, since the topic is the same.) Based on the answers here, it sounds like I should not run defrag at all. However, I have a performance problem I need to solve, so if I don't defrag, I need to do something else. Here's my scenario. Some months ago I built an over-the-top powerful desktop computer / workstation and I was looking forward to really fantastic performance improvements over my 6 year old Ubuntu machine. I installed Arch Linux on BTRFS on the new computer (on an SSD). To my shock, it was no faster than my old machine. 
I focused a lot on Firefox performance because I use Firefox a lot and that was one of the applications in which I was most looking forward to better performance. I tried everything I could think of and everything recommended to me in various forums (except switching to Windows) and the performance remained very disappointing.

Then today I read the following:

Gotchas - btrfs Wiki
https://btrfs.wiki.kernel.org/index.php/Gotchas

    Fragmentation: Files with a lot of random writes can become heavily
    fragmented (10000+ extents) causing excessive multi-second spikes of
    CPU load on systems with an SSD or large amount a RAM. On desktops
    this primarily affects application databases (including Firefox).
    Workarounds include manually defragmenting your home directory using
    btrfs fi defragment. Auto-defragment (mount option autodefrag)
    should solve this problem.

Upon reading that I am wondering if fragmentation in the Firefox profile is part of my issue. That's one thing I never tested previously. (BTW, this system has 256 GB of RAM and 20 cores.)

Furthermore, the same BTRFS Wiki page mentions the performance penalties of many snapshots. I am keeping 30 to 50 snapshots of the volume that contains the Firefox profile.

Would these two things be enough to turn top-of-the-line hardware into a mediocre-performing desktop system? (The system performs fine on benchmarks -- it's real-life usage, particularly with Firefox, where it is disappointing.)

After reading the info here, I am wondering if I should make a new subvolume just for my Firefox profile(s) and not use COW and/or not keep snapshots on it and mount it with the autodefrag option.

As part of this strategy, I could send snapshots to another disk using btrfs send-receive. That way I would have the benefits of snapshots (which are important to me), but by not keeping any snapshots on the live subvolume I could avoid the performance problems.

What would you guys do in this situation?
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-09-20 6:38 ` Dave @ 2017-09-20 11:46 ` Austin S. Hemmelgarn 2017-09-21 20:10 ` Kai Krakow 2017-09-21 11:09 ` Duncan 2017-09-21 19:28 ` Sean Greenslade 2 siblings, 1 reply; 56+ messages in thread From: Austin S. Hemmelgarn @ 2017-09-20 11:46 UTC (permalink / raw) To: Dave, Linux fs Btrfs On 2017-09-20 02:38, Dave wrote: >> On Thu 2017-08-31 (09:05), Ulli Horlacher wrote: >>> When I do a >>> btrfs filesystem defragment -r /directory >>> does it defragment really all files in this directory tree, even if it >>> contains subvolumes? >>> The man page does not mention subvolumes on this topic. >> >> No answer so far :-( >> >> But I found another problem in the man-page: >> >> Defragmenting with Linux kernel versions < 3.9 or >= 3.14-rc2 as well as >> with Linux stable kernel versions >= 3.10.31, >= 3.12.12 or >= 3.13.4 >> will break up the ref-links of COW data (for example files copied with >> cp --reflink, snapshots or de-duplicated data). This may cause >> considerable increase of space usage depending on the broken up >> ref-links. >> >> I am running Ubuntu 16.04 with Linux kernel 4.10 and I have several >> snapshots. >> Therefore, I better should avoid calling "btrfs filesystem defragment -r"? >> >> What is the defragmenting best practice? >> Avoid it completly? > > My question is the same as the OP in this thread, so I came here to > read the answers before asking. However, it turns out that I still > need to ask something. Should I ask here or start a new thread? (I'll > assume here, since the topic is the same.) > > Based on the answers here, it sounds like I should not run defrag at > all. However, I have a performance problem I need to solve, so if I > don't defrag, I need to do something else. > > Here's my scenario. Some months ago I built an over-the-top powerful > desktop computer / workstation and I was looking forward to really > fantastic performance improvements over my 6 year old Ubuntu machine. 
> I installed Arch Linux on BTRFS on the new computer (on an SSD). To my > shock, it was no faster than my old machine. I focused a lot on > Firefox performance because I use Firefox a lot and that was one of > the applications in which I was most looking forward to better > performance. > > I tried everything I could think of and everything recommended to me > in various forums (except switching to Windows) and the performance > remained very disappointing. Switching to Windows won't help any more than switching to ext4 would. If you were running Chrome, it might (Chrome actually has better performance on Windows than Linux by a small margin last time I checked), but Firefox gets pretty much the same performance on both platforms. > > Then today I read the following: > > Gotchas - btrfs Wiki > https://btrfs.wiki.kernel.org/index.php/Gotchas > > Fragmentation: Files with a lot of random writes can become > heavily fragmented (10000+ extents) causing excessive multi-second > spikes of CPU load on systems with an SSD or large amount a RAM. On > desktops this primarily affects application databases (including > Firefox). Workarounds include manually defragmenting your home > directory using btrfs fi defragment. Auto-defragment (mount option > autodefrag) should solve this problem. > > Upon reading that I am wondering if fragmentation in the Firefox > profile is part of my issue. That's one thing I never tested > previously. (BTW, this system has 256 GB of RAM and 20 cores.) Almost certainly. Most modern web browsers are brain-dead and insist on using SQLite databases (or traditional DB files) for everything, including the cache, and the usage for the cache in particular kills performance when fragmentation is an issue. > > Furthermore, on the same BTRFS Wiki page, it mentions the performance > penalties of many snapshots. I am keeping 30 to 50 snapshots of the > volume that contains the Firefox profile. 
> > Would these two things be enough to turn top-of-the-line hardware into > a mediocre-preforming desktop system? (The system performs fine on > benchmarks -- it's real life usage, particularly with Firefox where it > is disappointing.) Even ignoring fragmentation and reflink issues (it's reflinks, not snapshots that are the issue, snapshots just have tons of reflinks), BTRFS is slower than ext4 or XFS simply because of the fact that it's doing way more work. The difference should have limited impact on an SSD if you get a handle on the other issues though. > > After reading the info here, I am wondering if I should make a new > subvolume just for my Firefox profile(s) and not use COW and/or not > keep snapshots on it and mount it with the autodefrag option. > > As part of this strategy, I could send snapshots to another disk using > btrfs send-receive. That way I would have the benefits of snapshots > (which are important to me), but by not keeping any snapshots on the > live subvolume I could avoid the performance problems. > > What would you guys do in this situation? Personally? Use Chrome or Chromium and turn on the simple cache backend (chrome://flags/#enable-simple-cache-backend) which doesn't have issues with fragmentation because it doesn't use a database file to store the cache and lets the filesystem handle the allocations. The difference in performance in Chrome itself from flipping this switch is pretty amazing to be honest. They're also faster than Firefox in general in my experience, but that's a separate discussion. From a practical perspective though, if you're using the profile sync feature in Firefox, you don't need the checksumming of BTRFS and shouldn't need snapshots either (at least, not for that), so through some symlink trickery you could put your Firefox profile on another filesystem (same for Thunderbird, which has the same issues). 
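The "symlink trickery" mentioned above could look something like the sketch below. The profile name (abc123.default) and paths are hypothetical, and temp directories stand in for $HOME and the other filesystem so the commands can be tried harmlessly; with the real directories, do this while Firefox is closed.

```shell
#!/bin/sh
# Relocate a Firefox profile to another filesystem and leave a symlink
# behind so Firefox still finds it at the usual path.
set -e
home=$(mktemp -d)    # stand-in for the real $HOME (on btrfs)
other=$(mktemp -d)   # stand-in for the other (non-snapshotted) filesystem
mkdir -p "$home/.mozilla/firefox/abc123.default"
echo "example pref data" > "$home/.mozilla/firefox/abc123.default/prefs.js"
# Move the profile over, then symlink it back into place:
mv "$home/.mozilla/firefox/abc123.default" "$other/firefox-profile"
ln -s "$other/firefox-profile" "$home/.mozilla/firefox/abc123.default"
# The profile is still reachable through the original path:
cat "$home/.mozilla/firefox/abc123.default/prefs.js"
rm -rf "$home" "$other"
```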
Alternatively, if you can afford to have your space usage effectively multiplied by the number of snapshots, defragment the FS after every snapshot. That will deal both with the performance issues from fragmentation, and the performance issues from reflinks (because defrag breaks reflinks). ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-09-20 11:46 ` Austin S. Hemmelgarn @ 2017-09-21 20:10 ` Kai Krakow 2017-09-21 23:30 ` Dave ` (2 more replies) 0 siblings, 3 replies; 56+ messages in thread From: Kai Krakow @ 2017-09-21 20:10 UTC (permalink / raw) To: linux-btrfs

Am Wed, 20 Sep 2017 07:46:52 -0400 schrieb "Austin S. Hemmelgarn" <ahferroin7@gmail.com>:

> > Fragmentation: Files with a lot of random writes can become
> > heavily fragmented (10000+ extents) causing excessive multi-second
> > spikes of CPU load on systems with an SSD or large amount a RAM. On
> > desktops this primarily affects application databases (including
> > Firefox). Workarounds include manually defragmenting your home
> > directory using btrfs fi defragment. Auto-defragment (mount option
> > autodefrag) should solve this problem.
> >
> > Upon reading that I am wondering if fragmentation in the Firefox
> > profile is part of my issue. That's one thing I never tested
> > previously. (BTW, this system has 256 GB of RAM and 20 cores.)
> Almost certainly. Most modern web browsers are brain-dead and insist
> on using SQLite databases (or traditional DB files) for everything,
> including the cache, and the usage for the cache in particular kills
> performance when fragmentation is an issue.

At least in Chrome, you can turn on the simple cache backend, which, I think, uses many small files instead of one huge one. This suits btrfs much better:

chrome://flags/#enable-simple-cache-backend

And then I suggest also doing this (as your login user):

$ cd $HOME
$ mv .cache .cache.old
$ mkdir .cache
$ lsattr +C .cache
$ rsync -av .cache.old/ .cache/
$ rm -Rf .cache.old

This makes the caches of most applications nocow. Chrome performance was completely fixed for me by doing this.

I'm not sure where Firefox puts its cache, I only use it on very rare occasions. But I think it was going to .cache/mozilla last time I looked at it.

You may want to close all apps before converting the cache directory.
Also, I don't see any downsides to making this nocow. That directory could just as easily be completely volatile. If something breaks because it is no longer protected by data csums, just clean it out.

--
Regards,
Kai

Replies to list-only preferred.

^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-09-21 20:10 ` Kai Krakow @ 2017-09-21 23:30 ` Dave 2017-09-21 23:58 ` Kai Krakow 2017-09-22 11:22 ` Austin S. Hemmelgarn 2 siblings, 0 replies; 56+ messages in thread From: Dave @ 2017-09-21 23:30 UTC (permalink / raw) To: Linux fs Btrfs These are great suggestions. I will test several of them (or all of them) and report back with my results once I have done the testing. Thank you! This is a fantastic mailing list. P.S. I'm inclined to stay with Firefox, but I will definitely test Chromium vs Firefox after making a series of changes based on the suggestions here. I would hate to see the market lose the option of Firefox because everyone goes to Chrome/Chromium. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-09-21 20:10 ` Kai Krakow 2017-09-21 23:30 ` Dave @ 2017-09-21 23:58 ` Kai Krakow 2017-09-22 11:22 ` Austin S. Hemmelgarn 2 siblings, 0 replies; 56+ messages in thread From: Kai Krakow @ 2017-09-21 23:58 UTC (permalink / raw) To: linux-btrfs Am Thu, 21 Sep 2017 22:10:13 +0200 schrieb Kai Krakow <hurikhan77@gmail.com>: > Am Wed, 20 Sep 2017 07:46:52 -0400 > schrieb "Austin S. Hemmelgarn" <ahferroin7@gmail.com>: > > > > Fragmentation: Files with a lot of random writes can become > > > heavily fragmented (10000+ extents) causing excessive multi-second > > > spikes of CPU load on systems with an SSD or large amount a RAM. > > > On desktops this primarily affects application databases > > > (including Firefox). Workarounds include manually defragmenting > > > your home directory using btrfs fi defragment. Auto-defragment > > > (mount option autodefrag) should solve this problem. > > > > > > Upon reading that I am wondering if fragmentation in the Firefox > > > profile is part of my issue. That's one thing I never tested > > > previously. (BTW, this system has 256 GB of RAM and 20 cores.) > > Almost certainly. Most modern web browsers are brain-dead and > > insist on using SQLite databases (or traditional DB files) for > > everything, including the cache, and the usage for the cache in > > particular kills performance when fragmentation is an issue. > > At least in Chrome, you can turn on simple cache backend, which, I > think, is using many small instead of one huge file. This suit btrfs > much better: > > chrome://flags/#enable-simple-cache-backend > > > And then I suggest also doing this (as your login user): > > $ cd $HOME > $ mv .cache .cache.old > $ mkdir .cache > $ lsattr +C .cache Oops, of course that's chattr, not lsattr > $ rsync -av .cache.old/ .cache/ > $ rm -Rf .cache.old > > This makes caches for most applications nocow. Chrome performance was > completely fixed for me by doing this. 
> > I'm not sure where Firefox puts its cache, I only use it on very rare > occasions. But I think it's going to .cache/mozilla last time looked > at it. > > You may want to close all apps before converting the cache directory. > > Also, I don't see any downsides in making this nocow. That directory > could easily be also completely volatile. If something breaks due to > no longer protected by data csum, just clean it out. -- Regards, Kai Replies to list-only preferred. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-09-21 20:10 ` Kai Krakow 2017-09-21 23:30 ` Dave 2017-09-21 23:58 ` Kai Krakow @ 2017-09-22 11:22 ` Austin S. Hemmelgarn 2017-09-22 20:29 ` Marc Joliet 2 siblings, 1 reply; 56+ messages in thread From: Austin S. Hemmelgarn @ 2017-09-22 11:22 UTC (permalink / raw) To: linux-btrfs

On 2017-09-21 16:10, Kai Krakow wrote:
> Am Wed, 20 Sep 2017 07:46:52 -0400
> schrieb "Austin S. Hemmelgarn" <ahferroin7@gmail.com>:
>
>>> Fragmentation: Files with a lot of random writes can become
>>> heavily fragmented (10000+ extents) causing excessive multi-second
>>> spikes of CPU load on systems with an SSD or large amount a RAM. On
>>> desktops this primarily affects application databases (including
>>> Firefox). Workarounds include manually defragmenting your home
>>> directory using btrfs fi defragment. Auto-defragment (mount option
>>> autodefrag) should solve this problem.
>>>
>>> Upon reading that I am wondering if fragmentation in the Firefox
>>> profile is part of my issue. That's one thing I never tested
>>> previously. (BTW, this system has 256 GB of RAM and 20 cores.)
>> Almost certainly. Most modern web browsers are brain-dead and insist
>> on using SQLite databases (or traditional DB files) for everything,
>> including the cache, and the usage for the cache in particular kills
>> performance when fragmentation is an issue.
>
> At least in Chrome, you can turn on simple cache backend, which, I
> think, is using many small instead of one huge file. This suit btrfs
> much better:

That's correct. The traditional cache in Chrome and Chromium uses a single SQLite database for storing all the cache data and metadata (just like Firefox did last time I checked).
The simple cache backend instead uses the filesystem to handle allocations and uses directory hashing to speed up look ups of items, which actually means that even without BTRFS involved, it will usually be faster (both because it allows concurrent access unlike SQLite, and because it's generally faster to parse a multi-level directory hash than an SQL statement). > > chrome://flags/#enable-simple-cache-backend > > > And then I suggest also doing this (as your login user): > > $ cd $HOME > $ mv .cache .cache.old > $ mkdir .cache > $ lsattr +C .cache > $ rsync -av .cache.old/ .cache/ > $ rm -Rf .cache.old > > This makes caches for most applications nocow. Chrome performance was > completely fixed for me by doing this. > > I'm not sure where Firefox puts its cache, I only use it on very rare > occasions. But I think it's going to .cache/mozilla last time looked > at it. I'm pretty sure that is correct. > > You may want to close all apps before converting the cache directory. At a minimum, you'll have to restart them to get them to use the new location. > > Also, I don't see any downsides in making this nocow. That directory > could easily be also completely volatile. If something breaks due to no > longer protected by data csum, just clean it out. Indeed, anything that is storing data here that can't be regenerated from some other source is asking for trouble, sane backup systems don't include ~/.cache, and it's quite often one of the first things recommended for deletion when trying to free up disk space. ^ permalink raw reply [flat|nested] 56+ messages in thread
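The multi-level directory hashing described above can be illustrated with a toy sketch (purely illustrative -- this is not Chrome's actual on-disk scheme): hash the cache key, then use the first hex characters of the hash to fan entries out over subdirectories, so no single directory or database file grows huge and lookups go straight to the right bucket.

```shell
#!/bin/sh
# Toy directory-hashed cache: store an entry by key, then look it up.
set -e
cache=$(mktemp -d)
key="https://example.com/logo.png"

# Store: hash the key, use the first 2 hex chars as the bucket directory.
h=$(printf '%s' "$key" | sha1sum | cut -c1-40)
sub=$(printf '%s' "$h" | cut -c1-2)
mkdir -p "$cache/$sub"
printf 'cached body\n' > "$cache/$sub/$h"

# Lookup: recompute the hash from the key and read the bucket directly,
# with no central index file to open or lock.
g=$(printf '%s' "$key" | sha1sum | cut -c1-40)
cat "$cache/$(printf '%s' "$g" | cut -c1-2)/$g"

rm -rf "$cache"
```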
* Re: defragmenting best practice? 2017-09-22 11:22 ` Austin S. Hemmelgarn @ 2017-09-22 20:29 ` Marc Joliet 0 siblings, 0 replies; 56+ messages in thread From: Marc Joliet @ 2017-09-22 20:29 UTC (permalink / raw) To: linux-btrfs

Am Freitag, 22. September 2017, 13:22:52 CEST schrieb Austin S. Hemmelgarn:

> > I'm not sure where Firefox puts its cache, I only use it on very rare
> > occasions. But I think it's going to .cache/mozilla last time looked
> > at it.
>
> I'm pretty sure that is correct.

FWIW, on my system Firefox's cache looks like this:

% du -hsc (find .cache/mozilla/firefox/ -type f) | wc -l
9008
% du -hsc (find .cache/mozilla/firefox/ -type f) | sort -h | tail
5,4M  .cache/mozilla/firefox/cb236e4s.default-1464421886682/cache2/entries/83CEC8ADA08D9A9658458AB872BE107A216E71C6
5,5M  .cache/mozilla/firefox/cb236e4s.default-1464421886682/cache2/entries/C60061B33D3BB91ED45951C922BAA1BB40022CB7
5,7M  .cache/mozilla/firefox/cb236e4s.default-1464421886682/cache2/entries/0900D9EA8E3222EB8690348C2482C69308B15A20
5,7M  .cache/mozilla/firefox/cb236e4s.default-1464421886682/cache2/entries/F8E90D121B884360E36BCB1735CC5A8B1B7A744B
5,8M  .cache/mozilla/firefox/cb236e4s.default-1464421886682/cache2/entries/903C4CD01ABD74E353C7484C6E21A053AAC5DCC2
6,1M  .cache/mozilla/firefox/cb236e4s.default-1464421886682/cache2/entries/3A0D4193B009700155811D14A28DBE38C37C0067
6,1M  .cache/mozilla/firefox/cb236e4s.default-1464421886682/startupCache/scriptCache-current.bin
6,5M  .cache/mozilla/firefox/cb236e4s.default-1464421886682/cache2/entries/304405168662C3624D57AF98A74345464F32A0DB
8,8M  .cache/mozilla/firefox/ik7qsfwb.Temp/cache2/entries/BD7CA4125B3AA87D6B16C987741F33C65DBFFFDD
427M  insgesamt

So lots of files, many of which are (I suppose) relatively large, but they do not look "everything in one database" large to me. (This is with Firefox 55.0.2.)
--
Marc Joliet
--
"People who think they know everything really annoy those of us who know we don't" - Bjarne Stroustrup

^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-09-20 6:38 ` Dave 2017-09-20 11:46 ` Austin S. Hemmelgarn @ 2017-09-21 11:09 ` Duncan 2017-10-31 21:47 ` Dave 2017-09-21 19:28 ` Sean Greenslade 2 siblings, 1 reply; 56+ messages in thread From: Duncan @ 2017-09-21 11:09 UTC (permalink / raw) To: linux-btrfs Dave posted on Wed, 20 Sep 2017 02:38:13 -0400 as excerpted: > Here's my scenario. Some months ago I built an over-the-top powerful > desktop computer / workstation and I was looking forward to really > fantastic performance improvements over my 6 year old Ubuntu machine. I > installed Arch Linux on BTRFS on the new computer (on an SSD). To my > shock, it was no faster than my old machine. I focused a lot on Firefox > performance because I use Firefox a lot and that was one of the > applications in which I was most looking forward to better performance. > > I tried everything I could think of and everything recommended to me in > various forums (except switching to Windows) and the performance > remained very disappointing. > > Then today I read the following: > > Gotchas - btrfs Wiki https://btrfs.wiki.kernel.org/index.php/Gotchas > > Fragmentation: Files with a lot of random writes can become > heavily fragmented (10000+ extents) causing excessive multi-second > spikes of CPU load on systems with an SSD or large amount a RAM. On > desktops this primarily affects application databases (including > Firefox). Workarounds include manually defragmenting your home directory > using btrfs fi defragment. Auto-defragment (mount option autodefrag) > should solve this problem. > > Upon reading that I am wondering if fragmentation in the Firefox profile > is part of my issue. That's one thing I never tested previously. (BTW, > this system has 256 GB of RAM and 20 cores.) > > Furthermore, on the same BTRFS Wiki page, it mentions the performance > penalties of many snapshots. I am keeping 30 to 50 snapshots of the > volume that contains the Firefox profile. 
> > Would these two things be enough to turn top-of-the-line hardware into a
> > mediocre-preforming desktop system? (The system performs fine on
> > benchmarks -- it's real life usage, particularly with Firefox where it
> > is disappointing.)
> >
> > After reading the info here, I am wondering if I should make a new
> > subvolume just for my Firefox profile(s) and not use COW and/or not keep
> > snapshots on it and mount it with the autodefrag option.
> >
> > As part of this strategy, I could send snapshots to another disk using
> > btrfs send-receive. That way I would have the benefits of snapshots
> > (which are important to me), but by not keeping any snapshots on the
> > live subvolume I could avoid the performance problems.
> >
> > What would you guys do in this situation?

[FWIW this is my second try at a reply, my first being way too detailed and going off into the weeds somewhere, so I killed it.]

That's an interesting scenario indeed, and perhaps I can help, since my config isn't near as high end as yours, but I run firefox on btrfs on ssds, and have no performance complaints. The difference is very likely due to one or more of the following (FWIW I'd suggest a 4-3-1-2 order, tho only 1 and 2 are really btrfs related):

1) I make sure I consistently mount with autodefrag, from the first mount after the filesystem is created in order to first populate it, on. The filesystem never gets fragmented, forcing writes to highly fragmented free space, in the first place. (With the past and current effect of the ssd mount option under discussion to change, it's possible I'll get more fragmentation in the future after ssd doesn't try so hard to find reasonably large free-space chunks to write into, but it has been fine so far.)

2) Subvolumes and snapshots seemed to me more trouble than they were worth, particularly since it's the same filesystem anyway, and if it's damaged, it'll take all the subvolumes and snapshots with it.
So I don't use them, preferring instead to use real partitioning and more smaller fully separate filesystems, some of which aren't mounted by default (and root mounted read-only by default), so there's little chance they'll be damaged in a crash or filesystem bug damage scenario. And if there /is/ any damage, it's much more limited in scope since all my data eggs aren't in the same basket, so maintenance such as btrfs check and scrub take far less time (and check needs far less memory) than they would were it one big pool with snapshots. And if recovery fails too, the backups are likewise small filesystems the same size as the working copies, so copying the data back over takes far less time as well (not to mention making the backups takes less time in the first place, so it's easier to regularly update them).

3) Austin mentioned the firefox cache. I honestly wouldn't know on it, since I have firefox configured to use a tmpfs for its cache, so it operates at memory speed and gets cleared along with its memory at every reboot or tmpfs umount. My inet speed is fast enough I don't really need cache anyway, but it's nice to have it, operating at memory speed, within a single boot session... and to have it cleared on reboot.

4) This one was the biggest one for me for awhile. Is firefox running in multi-process mode? If you don't know, go to about:support, and look in the Application Basics section, at the Multiprocess Windows entry and the Web Content Processes entry. When you have multiple windows open it should show something like 2/2 (for two windows open, tho you won't get 20/20 for 20 windows open) for windows, and n/7 (tho I believe the default is 4 instead of 7, I've upped mine) for content processes, with n going up toward 7 (or 4) if you have multiple tabs/windows open playing video or the like. If you're stuck at a single process that'll be a *BIG* drag on performance, particularly when playing youtube full-screen or the like.
There are various reasons you might get stuck at a single process, including extensions that aren't compatible with "electrolysis" (aka e10s, this being the mozilla code name for multi-process firefox), and the one that was my problem after I ensured all my extensions were e10s compatible -- I was trying to run the upstream firefox binary, which is now pulseaudio-only (no more direct alsa support), with apulse as a pulseaudio substitute, and apulse is apparently single-process-only (forcing multi-process would crash the tabs as soon as I tried navigating away from about:whatever to anything remote). Once I figured that out I switched back to using the gentoo firefox ebuild and enabling the alsa USE flag instead of pulseaudio there. That got multiprocess working, and it was *MUCH* more responsive, as I figured it should be! =:^)

If you find you're stuck at single process (remember, check with at least two windows open) and need help with it, yell. Because it'll make a *HUGE* difference.

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman

^ permalink raw reply [flat|nested] 56+ messages in thread
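For reference, the tmpfs cache Duncan describes (point 3) could be wired up with an fstab entry along these lines -- the mount point, size, and uid/gid here are assumptions to adapt, and Firefox can be pointed below it via the browser.cache.disk.parent_directory preference in about:config:

```
# /etc/fstab -- example tmpfs mount for a volatile Firefox cache
# (size, uid, gid are placeholders; adjust for your user)
tmpfs  /home/user/.cache/mozilla  tmpfs  noatime,size=512m,uid=1000,gid=1000,mode=0700  0  0
```

Being tmpfs, the contents live in RAM and vanish at umount or reboot, which matches the "cleared on reboot" behavior described above.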
* Re: defragmenting best practice? 2017-09-21 11:09 ` Duncan @ 2017-10-31 21:47 ` Dave 2017-10-31 23:06 ` Peter Grandi ` (2 more replies) 0 siblings, 3 replies; 56+ messages in thread From: Dave @ 2017-10-31 21:47 UTC (permalink / raw) To: Linux fs Btrfs

I'm following up on all the suggestions regarding Firefox performance on BTRFS. I have time to make these changes now, but I am having trouble figuring out what to do. The constraints are:

1. BTRFS snapshots have proven to be too useful (and too important to our overall IT approach) to forego.
2. We do not see any practical alternative (for us) to the incremental backup strategy (https://btrfs.wiki.kernel.org/index.php/Incremental_Backup).
3. We have large amounts of storage space (and can add more), but not enough to break all reflinks on all snapshots.
4. We can transfer snapshots to backup storage (and thereby retain minimal snapshots on the live volume).
5. Our team is standardized on Firefox. (Switching to Chromium is not an option for us.)
6. Firefox profile sync has not worked well for us in the past, so we don't use it.
7. Our machines generally have plenty of RAM, so we could put the Firefox cache (and maybe profile) into RAM using a technique such as https://wiki.archlinux.org/index.php/Firefox/Profile_on_RAM. However, profile persistence is important.

The most common recommendations were to switch to Chromium, defragment, and not use snapshots. As the constraints above illustrate, we cannot do those things. The tentative solution I have come up with is:

1. Continue using snapshots, but retain the minimal number possible on the live volume. Move historical snapshots to a backup device using btrfs send-receive. (https://btrfs.wiki.kernel.org/index.php/Incremental_Backup)
2. Put $HOME/.cache on a separate BTRFS subvolume that is mounted nocow -- it will NOT be snapshotted.
3. Put most of $HOME on a "home" volume but separate all user documents onto another volume (i.e., "documents").
3.a.
The "home" volume will retain only the one most recent snapshot on that live volume. (More backup history will be retained on a backup volume.) This home volume can be defragmented. With one snapshot, that will double our space usage, which is acceptable.
3.b. The documents volume will be snapshotted hourly, and 36 hourly snapshots plus daily, weekly and monthly snapshots retained. Therefore it will NOT be defragmented, as that would not be practical or space-wise possible.
3.c. The root volume (operating system, etc.) will follow a strategy similar to home, but will also retain pre- and post-update snapshots.
4. Put the Firefox cache in RAM.
5. If needed, consider putting the Firefox profile in RAM.
6. Make sure Firefox is running in multi-process mode. (Duncan's instructions, while greatly appreciated and very useful, left me slightly confused about pulseaudio's compatibility with multi-process mode.)
7. Check various Firefox performance tweaks such as these: https://wiki.archlinux.org/index.php/Firefox/Tweaks

Can anyone guess whether this will be sufficient to solve our severe performance problems? Do these steps make sense? Will any of these steps lead to new problems? Should I proceed to give them a try? Or can anyone suggest a better set of steps to test?

Notes: In regard to snapshots, we must retain about 36 hourly snapshots of user documents, for example. We have to have pre- and post-package-upgrade snapshots from at least the most recent operating system & application package update. And we have to retain several daily, weekly and monthly snapshots of system directories and some other locations. Most of these snapshots can be retained on backup storage devices.

Regarding Firefox profile sync, it does not have an intelligent method for resolving conflicts, for example. We found too many unexpected changes when using sync, so we do not use it.
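The retention idea in 3.a (keep only the newest snapshot on the live volume) reduces to a small pruning loop. Below is a self-contained sketch in which plain directories stand in for btrfs snapshots so it can be tried anywhere; with real snapshots, the `rm -r` would be `btrfs subvolume delete`, run only after the snapshot has been sent to the backup device. The date-stamped naming is an assumption chosen so a lexical sort equals a chronological sort.

```shell
#!/bin/sh
# Keep only the newest snapshot in a snapshot directory, pruning the rest.
set -e
snapdir=$(mktemp -d)
mkdir "$snapdir/home-2017-10-29" "$snapdir/home-2017-10-30" "$snapdir/home-2017-10-31"
# List snapshots oldest-first, drop the newest, delete the rest:
ls -1 "$snapdir" | sort | head -n -1 | while read s; do
    rm -r "$snapdir/$s"     # real life: btrfs subvolume delete "$snapdir/$s"
done
remaining=$(ls -1 "$snapdir")
echo "kept: $remaining"     # -> kept: home-2017-10-31
rm -rf "$snapdir"
```

`head -n -1` ("all but the last line") is GNU coreutils behavior, which matches the Ubuntu/Arch systems discussed in this thread.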
On Thu, Sep 21, 2017 at 7:09 AM, Duncan <1i5t5.duncan@cox.net> wrote: > Dave posted on Wed, 20 Sep 2017 02:38:13 -0400 as excerpted: > >> Here's my scenario. Some months ago I built an over-the-top powerful >> desktop computer / workstation and I was looking forward to really >> fantastic performance improvements over my 6 year old Ubuntu machine. I >> installed Arch Linux on BTRFS on the new computer (on an SSD). To my >> shock, it was no faster than my old machine. I focused a lot on Firefox >> performance because I use Firefox a lot and that was one of the >> applications in which I was most looking forward to better performance. >> >> I tried everything I could think of and everything recommended to me in >> various forums (except switching to Windows) and the performance >> remained very disappointing. >> >> Then today I read the following: >> >> Gotchas - btrfs Wiki https://btrfs.wiki.kernel.org/index.php/Gotchas >> >> Fragmentation: Files with a lot of random writes can become >> heavily fragmented (10000+ extents) causing excessive multi-second >> spikes of CPU load on systems with an SSD or large amount a RAM. On >> desktops this primarily affects application databases (including >> Firefox). Workarounds include manually defragmenting your home directory >> using btrfs fi defragment. Auto-defragment (mount option autodefrag) >> should solve this problem. >> >> Upon reading that I am wondering if fragmentation in the Firefox profile >> is part of my issue. That's one thing I never tested previously. (BTW, >> this system has 256 GB of RAM and 20 cores.) >> >> Furthermore, on the same BTRFS Wiki page, it mentions the performance >> penalties of many snapshots. I am keeping 30 to 50 snapshots of the >> volume that contains the Firefox profile. >> >> Would these two things be enough to turn top-of-the-line hardware into a >> mediocre-preforming desktop system? 
(The system performs fine on >> benchmarks -- it's real life usage, particularly with Firefox where it >> is disappointing.) >> >> After reading the info here, I am wondering if I should make a new >> subvolume just for my Firefox profile(s) and not use COW and/or not keep >> snapshots on it and mount it with the autodefrag option. >> >> As part of this strategy, I could send snapshots to another disk using >> btrfs send-receive. That way I would have the benefits of snapshots >> (which are important to me), but by not keeping any snapshots on the >> live subvolume I could avoid the performance problems. >> >> What would you guys do in this situation? > > [FWIW this is my second try at a reply, my first being way too detailed > and going off into the weeds somewhere, so I killed it.] > > That's an interesting scenario indeed, and perhaps I can help, since my > config isn't near as high end as yours, but I run firefox on btrfs on > ssds, and have no performance complaints. The difference is very likely > due to one or more of the following (FWIW I'd suggest a 4-3-1-2 order, > tho only 1 and 2 are really btrfs related): > > 1) I make sure I consistently mount with autodefrag, from the first mount > after the filesystem is created in order to first populate it, on. The > filesystem never gets fragmented enough to force writes into highly fragmented > free space in the first place. (With the past and current effect of the > ssd mount option under discussion to change, it's possible I'll get more > fragmentation in the future after ssd doesn't try so hard to find > reasonably large free-space chunks to write into, but it has been fine so > far.) > > 2) Subvolumes and snapshots seemed to me more trouble than they were > worth, particularly since it's the same filesystem anyway, and if it's > damaged, it'll take all the subvolumes and snapshots with it.
So I don't > use them, preferring instead to use real partitioning and more, smaller > fully separate filesystems, some of which aren't mounted by default (and > root mounted read-only by default), so there's little chance they'll be > damaged in a crash or filesystem bug damage scenario. And if there /is/ > any damage, it's much more limited in scope since all my data eggs aren't > in the same basket, so maintenance such as btrfs check and scrub take far > less time (and check far less memory) than they would were it one big > pool with snapshots. And if recovery fails too, the backups are likewise > small filesystems the same size as the working copies, so copying the > data back over takes far less time as well (not to mention making the > backups takes less time in the first place, so it's easier to regularly > update them). > > 3) Austin mentioned the firefox cache. I honestly wouldn't know on it, > since I have firefox configured to use a tmpfs for its cache, so it > operates at memory speed and gets cleared along with its memory at every > reboot or tmpfs umount. My inet speed is fast enough I don't really need > cache anyway, but it's nice to have it, operating at memory speed, within > a single boot session... and to have it cleared on reboot. > > > 4) This one was the biggest one for me for a while. > > Is firefox running in multi-process mode? If you don't know, go to > about:support, and look in the Application Basics section, at the > Multiprocess Windows entry and the Web Content Processes entry. When you > have multiple windows open it should show something like 2/2 (for two > windows open, tho you won't get 20/20 for 20 windows open) for windows, > and n/7 (tho I believe the default is 4 instead of 7, I've upped mine) > for content processes, with n going up toward 7 (or 4) if you have > multiple tabs/windows open playing video or the like.
> > If you're stuck at a single process that'll be a *BIG* drag on > performance, particularly when playing youtube full-screen or the like. > There are various reasons you might get stuck at a single process, > including extensions that aren't compatible with "electrolysis" (aka e10s, > this being the mozilla code name for multi-process firefox), and the one > that was my problem after I ensured all my extensions were e10s > compatible -- I was trying to run the upstream firefox binary, which is > now pulseaudio-only (no more direct alsa support), with apulse as a > pulseaudio substitute, and apulse is apparently single-process-only > (forcing multi-process would crash the tabs as soon as I tried navigating > away from about:whatever to anything remote). > > Once I figured that out I switched back to using the gentoo firefox ebuild > and enabling the alsa USE flag instead of pulseaudio there. That got > multiprocess working, and it was *MUCH* more responsive, as I figured > it should be! =:^) > > If you find you're stuck at single process (remember, check with at least > two windows open) and need help with it, yell. Because it'll make a > *HUGE* difference. > > -- > Duncan - List replies preferred. No HTML msgs. > "Every nonfree program has a lord, a master -- > and if you use the program, he is your master." Richard Stallman > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 56+ messages in thread
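Duncan's tmpfs cache setup (point 3 above) can be reproduced either with an fstab entry or with a Firefox preference. The paths, size, and uid below are assumptions for illustration; browser.cache.disk.parent_directory is a real Firefox preference name:

```shell
# Option A: mount a small tmpfs over the disk-cache directory
# (an /etc/fstab line; path, size and uid are assumptions):
#
#   tmpfs  /home/dave/.cache/mozilla  tmpfs  noatime,size=512m,uid=1000  0  0
#
# Option B: point Firefox itself at an existing RAM-backed path,
# via user.js in the profile directory (target path is an assumption):
#
#   user_pref("browser.cache.disk.parent_directory", "/dev/shm/firefox-cache");
```

Either way the cache lives in RAM and is discarded on reboot, matching the behaviour Duncan describes.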
* Re: defragmenting best practice? 2017-10-31 21:47 ` Dave @ 2017-10-31 23:06 ` Peter Grandi 2017-11-01 0:37 ` Dave [not found] ` <CAH=dxU47-52-asM5vJ_-qOpEpjZczHw7vQzgi1-TeKm58++zBQ@mail.gmail.com> 2017-11-01 7:43 ` Sean Greenslade 2017-11-01 13:31 ` Duncan 2 siblings, 2 replies; 56+ messages in thread From: Peter Grandi @ 2017-10-31 23:06 UTC (permalink / raw) To: Linux fs Btrfs > I'm following up on all the suggestions regarding Firefox performance > on BTRFS. [ ... ] I haven't read that yet, so maybe I am missing something, but I use Firefox with Btrfs all the time and I haven't got issues. [ ... ] > 1. BTRFS snapshots have proven to be too useful (and too important to > our overall IT approach) to forego. [ ... ] > 3. We have large amounts of storage space (and can add more), but not > enough to break all reflinks on all snapshots. Firefox profiles get fragmented only in the databases containes in them, and they are tiny, as in dozens of MB. That's usually irrelevant. Also nothing forces you to defragment a whole filesystem, you can just defragment individual files or directories by using 'find' with it. My top "$HOME" fragmented files are the aKregator RSS feed databases, usually a few hundred fragments each, and the '.sqlite' files for Firefox. Occasionally like just now I do this: tree$ sudo filefrag .firefox/default/*.sqlite | sort -t: -k 2n | tail -4 .firefox/default/cleanup.sqlite: 43 extents found .firefox/default/content-prefs.sqlite: 67 extents found .firefox/default/formhistory.sqlite: 87 extents found .firefox/default/places.sqlite: 3879 extents found tree$ sudo btrfs fi defrag .firefox/default/*.sqlite tree$ sudo filefrag .firefox/default/*.sqlite | sort -t: -k 2n | tail -4 .firefox/default/webappsstore.sqlite: 1 extent found .firefox/default/favicons.sqlite: 2 extents found .firefox/default/kinto.sqlite: 2 extents found .firefox/default/places.sqlite: 44 extents found > 2. 
Put $HOME/.cache on a separate BTRFS subvolume that is mounted > nocow -- it will NOT be snapshotted The cache can be simply deleted, and usually files in it are not updated in place, so don't get fragmented, so no worry. Also, you can declare the '.firefox/default/' directory to be NOCOW, and that "just works". I haven't even bothered with that. ^ permalink raw reply [flat|nested] 56+ messages in thread
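Peter's NOCOW suggestion for the profile directory is a one-liner. One caveat assumed from general btrfs behaviour (not stated above): the +C attribute only applies to files created after the flag is set, so existing databases have to be re-created to pick it up. The directory name follows Peter's example layout:

```shell
# Mark the profile directory NOCOW; new files created inside it will
# then be written without copy-on-write.
chattr +C "$HOME/.firefox/default"
lsattr -d "$HOME/.firefox/default"   # flags should now include 'C'

# Existing files keep their old extents; copy them back in to convert:
#   cp -a places.sqlite places.sqlite.new && mv places.sqlite.new places.sqlite
```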
* Re: defragmenting best practice? 2017-10-31 23:06 ` Peter Grandi @ 2017-11-01 0:37 ` Dave 2017-11-01 12:21 ` Austin S. Hemmelgarn ` (2 more replies) [not found] ` <CAH=dxU47-52-asM5vJ_-qOpEpjZczHw7vQzgi1-TeKm58++zBQ@mail.gmail.com> 1 sibling, 3 replies; 56+ messages in thread From: Dave @ 2017-11-01 0:37 UTC (permalink / raw) To: Linux fs Btrfs; +Cc: Peter Grandi On Tue, Oct 31, 2017 at 7:06 PM, Peter Grandi <pg@btrfs.list.sabi.co.uk> wrote: > > Also nothing forces you to defragment a whole filesystem, you > can just defragment individual files or directories by using > 'find' with it. Thanks for that info. When defragmenting individual files on a BTRFS filesystem with COW, I assume reflinks between that file and all snapshots are broken. So if there are 30 snapshots on that volume, that one file will suddenly take up 30 times more space... Is that correct? Or are the reflinks only broken between the live file and the latest snapshot? Or is it something between, based on how many times the file has changed? > > My top "$HOME" fragmented files are the aKregator RSS feed > databases, usually a few hundred fragments each, and the > '.sqlite' files for Firefox. Occasionally like just now I do > this: > > tree$ sudo filefrag .firefox/default/*.sqlite | sort -t: -k 2n | tail -4 > .firefox/default/cleanup.sqlite: 43 extents found > .firefox/default/content-prefs.sqlite: 67 extents found > .firefox/default/formhistory.sqlite: 87 extents found > .firefox/default/places.sqlite: 3879 extents found > > tree$ sudo btrfs fi defrag .firefox/default/*.sqlite > > tree$ sudo filefrag .firefox/default/*.sqlite | sort -t: -k 2n | tail -4 > .firefox/default/webappsstore.sqlite: 1 extent found > .firefox/default/favicons.sqlite: 2 extents found > .firefox/default/kinto.sqlite: 2 extents found > .firefox/default/places.sqlite: 44 extents found That's a very helpful example. Can you also give an example of using find, as you suggested above? 
I'm generally familiar with using find to execute specific commands, but an example is appreciated in this case. > > 2. Put $HOME/.cache on a separate BTRFS subvolume that is mounted nocow -- it will NOT be snapshotted > Also, you can declare the '.firefox/default/' directory to be NOCOW, and that "just works". The cache is in a separate location from the profiles, as I'm sure you know. The reason I suggested a separate BTRFS subvolume for $HOME/.cache is that this will prevent the cache files for all applications (for that user) from being included in the snapshots. We take frequent snapshots and (afaik) it makes no sense to include cache in backups or snapshots. The easiest way I know to exclude cache from BTRFS snapshots is to put it on a separate subvolume. I assumed this would make several things related to snapshots more efficient too. As far as the Firefox profile being declared NOCOW, as soon as we take the first snapshot, I understand that it will become COW again. So I don't see any point in making it NOCOW. Thanks for your reply. I appreciate any other feedback or suggestions. Background: I'm not sure why our Firefox performance is so terrible but here's my original post from Sept 20. (I could repost the earlier replies too if needed.) I've been waiting to have a window of opportunity to try to fix our Firefox performance again, and now I have that chance. >On Thu 2017-08-31 (09:05), Ulli Horlacher wrote: >> When I do a >> btrfs filesystem defragment -r /directory >> does it defragment really all files in this directory tree, even if it >> contains subvolumes? >> The man page does not mention subvolumes on this topic. 
> >No answer so far :-( > >But I found another problem in the man-page: > > Defragmenting with Linux kernel versions < 3.9 or >= 3.14-rc2 as well as > with Linux stable kernel versions >= 3.10.31, >= 3.12.12 or >= 3.13.4 > will break up the ref-links of COW data (for example files copied with > cp --reflink, snapshots or de-duplicated data). This may cause > considerable increase of space usage depending on the broken up > ref-links. > >I am running Ubuntu 16.04 with Linux kernel 4.10 and I have several >snapshots. >Therefore, I better should avoid calling "btrfs filesystem defragment -r"? > >What is the defragmenting best practice? >Avoid it completly? My question is the same as the OP in this thread, so I came here to read the answers before asking. Based on the answers here, it sounds like I should not run defrag at all. However, I have a performance problem I need to solve, so if I don't defrag, I need to do something else. Here's my scenario. Some months ago I built an over-the-top powerful desktop computer / workstation and I was looking forward to really fantastic performance improvements over my 6 year old Ubuntu machine. I installed Arch Linux on BTRFS on the new computer (on an SSD). To my shock, it was no faster than my old machine. I focused a lot on Firefox performance because I use Firefox a lot and that was one of the applications in which I was most looking forward to better performance. I tried everything I could think of and everything recommended to me in various forums (except switching to Windows) and the performance remained very disappointing. Then today I read the following: Gotchas - btrfs Wiki https://btrfs.wiki.kernel.org/index.php/Gotchas Fragmentation: Files with a lot of random writes can become heavily fragmented (10000+ extents) causing excessive multi-second spikes of CPU load on systems with an SSD or large amount a RAM. On desktops this primarily affects application databases (including Firefox). 
Workarounds include manually defragmenting your home directory using btrfs fi defragment. Auto-defragment (mount option autodefrag) should solve this problem. Upon reading that I am wondering if fragmentation in the Firefox profile is part of my issue. That's one thing I never tested previously. (BTW, this system has 256 GB of RAM and 20 cores.) Furthermore, on the same BTRFS Wiki page, it mentions the performance penalties of many snapshots. I am keeping 30 to 50 snapshots of the volume that contains the Firefox profile. Would these two things be enough to turn top-of-the-line hardware into a mediocre-preforming desktop system? (The system performs fine on benchmarks -- it's real life usage, particularly with Firefox where it is disappointing.) After reading the info here, I am wondering if I should make a new subvolume just for my Firefox profile(s) and not use COW and/or not keep snapshots on it and mount it with the autodefrag option. As part of this strategy, I could send snapshots to another disk using btrfs send-receive. That way I would have the benefits of snapshots (which are important to me), but by not keeping any snapshots on the live subvolume I could avoid the performance problems. What would you guys do in this situation? ^ permalink raw reply [flat|nested] 56+ messages in thread
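The per-file approach Peter suggested earlier in the thread can be driven by find plus filefrag, defragmenting only files above some fragmentation threshold. The 100-extent cutoff is an arbitrary assumption, and the parsing assumes filenames without colons:

```shell
# Defragment only heavily fragmented regular files under $HOME.
# -xdev keeps find from crossing into other mounted filesystems.
find "$HOME" -xdev -type f -exec filefrag {} + 2>/dev/null \
  | awk -F: '{ if ($2 + 0 > 100) print $1 }' \
  | while IFS= read -r f; do
        btrfs filesystem defragment -- "$f"
    done
```

This touches only the worst offenders (places.sqlite and friends), so the ref-link breakage is limited to a handful of small files rather than the whole filesystem.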
* Re: defragmenting best practice? 2017-11-01 0:37 ` Dave @ 2017-11-01 12:21 ` Austin S. Hemmelgarn 2017-11-02 1:39 ` Dave 2017-11-01 17:48 ` Peter Grandi 2017-11-02 21:16 ` Kai Krakow 2 siblings, 1 reply; 56+ messages in thread From: Austin S. Hemmelgarn @ 2017-11-01 12:21 UTC (permalink / raw) To: Dave, Linux fs Btrfs; +Cc: Peter Grandi On 2017-10-31 20:37, Dave wrote: > On Tue, Oct 31, 2017 at 7:06 PM, Peter Grandi <pg@btrfs.list.sabi.co.uk> wrote: >> >> Also nothing forces you to defragment a whole filesystem, you >> can just defragment individual files or directories by using >> 'find' with it. > > Thanks for that info. When defragmenting individual files on a BTRFS > filesystem with COW, I assume reflinks between that file and all > snapshots are broken. So if there are 30 snapshots on that volume, > that one file will suddenly take up 30 times more space... Is that > correct? Or are the reflinks only broken between the live file and the > latest snapshot? Or is it something between, based on how many times > the file has changed? Only that file will be split, all the other reflinks will be preserved, so it will only take up twice the space in your example. Reflinks are at the block level, and don't have a single origin point where they can all be broken at once. It's just like having multiple hardlinks to a file, and then replacing one of them via a rename. The rename will break _that_ hardlink, but not any of the others. In fact, the simplest way to explain reflinks is block-level hard links that automatically break when the block is updated. > >> >> My top "$HOME" fragmented files are the aKregator RSS feed >> databases, usually a few hundred fragments each, and the >> '.sqlite' files for Firefox. 
Occasionally like just now I do >> this: >> >> tree$ sudo filefrag .firefox/default/*.sqlite | sort -t: -k 2n | tail -4 >> .firefox/default/cleanup.sqlite: 43 extents found >> .firefox/default/content-prefs.sqlite: 67 extents found >> .firefox/default/formhistory.sqlite: 87 extents found >> .firefox/default/places.sqlite: 3879 extents found >> >> tree$ sudo btrfs fi defrag .firefox/default/*.sqlite >> >> tree$ sudo filefrag .firefox/default/*.sqlite | sort -t: -k 2n | tail -4 >> .firefox/default/webappsstore.sqlite: 1 extent found >> .firefox/default/favicons.sqlite: 2 extents found >> .firefox/default/kinto.sqlite: 2 extents found >> .firefox/default/places.sqlite: 44 extents found > > That's a very helpful example. > > Can you also give an example of using find, as you suggested above? > I'm generally familiar with using find to execute specific commands, > but an example is appreciated in this case. > >>> 2. Put $HOME/.cache on a separate BTRFS subvolume that is mounted nocow -- it will NOT be snapshotted > >> Also, you can declare the '.firefox/default/' directory to be NOCOW, and that "just works". > > The cache is in a separate location from the profiles, as I'm sure you > know. The reason I suggested a separate BTRFS subvolume for > $HOME/.cache is that this will prevent the cache files for all > applications (for that user) from being included in the snapshots. We > take frequent snapshots and (afaik) it makes no sense to include cache > in backups or snapshots. The easiest way I know to exclude cache from > BTRFS snapshots is to put it on a separate subvolume. I assumed this > would make several things related to snapshots more efficient too. Yes, it will, and it will save space long-term as well since $HOME/.cache is usually the most frequently modified location in $HOME. In addition to not including this in the snapshots, it may also improve performance. 
Each subvolume is its own tree, with its own locking, which means that you can generally improve parallel access performance by splitting the workload across multiple subvolumes. Whether it will actually provide any real benefit in that respect is heavily dependent on the exact workload, however, but it won't hurt performance. > > As far as the Firefox profile being declared NOCOW, as soon as we take > the first snapshot, I understand that it will become COW again. So I > don't see any point in making it NOCOW. When snapshotting NOCOW files, exactly one COW operation happens for each block as it gets written. In your case, this may not matter (most people don't change settings on a sub-hourly basis), but in cases where changes are very frequent relative to snapshots, it can have a big impact to only COW once instead of all the time. > > Thanks for your reply. I appreciate any other feedback or suggestions. > > Background: I'm not sure why our Firefox performance is so terrible > but here's my original post from Sept 20. (I could repost the earlier > replies too if needed.) I've been waiting to have a window of > opportunity to try to fix our Firefox performance again, and now I > have that chance. > >> On Thu 2017-08-31 (09:05), Ulli Horlacher wrote: >>> When I do a >>> btrfs filesystem defragment -r /directory >>> does it defragment really all files in this directory tree, even if it >>> contains subvolumes? >>> The man page does not mention subvolumes on this topic. >> >> No answer so far :-( >> >> But I found another problem in the man-page: >> >> Defragmenting with Linux kernel versions < 3.9 or >= 3.14-rc2 as well as >> with Linux stable kernel versions >= 3.10.31, >= 3.12.12 or >= 3.13.4 >> will break up the ref-links of COW data (for example files copied with >> cp --reflink, snapshots or de-duplicated data). This may cause >> considerable increase of space usage depending on the broken up >> ref-links.
>> >> I am running Ubuntu 16.04 with Linux kernel 4.10 and I have several >> snapshots. >> Therefore, I better should avoid calling "btrfs filesystem defragment -r"? >> >> What is the defragmenting best practice? >> Avoid it completly? > > My question is the same as the OP in this thread, so I came here to > read the answers before asking. > > Based on the answers here, it sounds like I should not run defrag at > all. However, I have a performance problem I need to solve, so if I > don't defrag, I need to do something else. > > Here's my scenario. Some months ago I built an over-the-top powerful > desktop computer / workstation and I was looking forward to really > fantastic performance improvements over my 6 year old Ubuntu machine. > I installed Arch Linux on BTRFS on the new computer (on an SSD). To my > shock, it was no faster than my old machine. I focused a lot on > Firefox performance because I use Firefox a lot and that was one of > the applications in which I was most looking forward to better > performance. > > I tried everything I could think of and everything recommended to me > in various forums (except switching to Windows) and the performance > remained very disappointing. > > Then today I read the following: > > Gotchas - btrfs Wiki > https://btrfs.wiki.kernel.org/index.php/Gotchas > > Fragmentation: Files with a lot of random writes can become > heavily fragmented (10000+ extents) causing excessive multi-second > spikes of CPU load on systems with an SSD or large amount a RAM. On > desktops this primarily affects application databases (including > Firefox). Workarounds include manually defragmenting your home > directory using btrfs fi defragment. Auto-defragment (mount option > autodefrag) should solve this problem. > > Upon reading that I am wondering if fragmentation in the Firefox > profile is part of my issue. That's one thing I never tested > previously. (BTW, this system has 256 GB of RAM and 20 cores.) 
> > Furthermore, on the same BTRFS Wiki page, it mentions the performance > penalties of many snapshots. I am keeping 30 to 50 snapshots of the > volume that contains the Firefox profile. > > Would these two things be enough to turn top-of-the-line hardware into > a mediocre-preforming desktop system? (The system performs fine on > benchmarks -- it's real life usage, particularly with Firefox where it > is disappointing.) > > After reading the info here, I am wondering if I should make a new > subvolume just for my Firefox profile(s) and not use COW and/or not > keep snapshots on it and mount it with the autodefrag option. > > As part of this strategy, I could send snapshots to another disk using > btrfs send-receive. That way I would have the benefits of snapshots > (which are important to me), but by not keeping any snapshots on the > live subvolume I could avoid the performance problems. > > What would you guys do in this situation? > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 56+ messages in thread
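Austin's hard-link analogy for reflinks can be observed directly with ordinary files on any filesystem. This sketch only illustrates the analogy (rename breaks one name's association and leaves the others intact), not btrfs reflinks themselves:

```shell
# Two hard links share one inode; replacing one name via rename breaks
# only that name's association, much like one reflink breaking on update.
d=$(mktemp -d)
echo v1 > "$d/orig"
ln "$d/orig" "$d/link"      # second name for the same data

echo v2 > "$d/new"
mv "$d/new" "$d/orig"       # atomically repoint the name 'orig'

cat "$d/orig"               # prints: v2
cat "$d/link"               # prints: v1 (the other link is untouched)
```

Defragmenting a file behaves like the rename here: the defragmented copy gets new extents, while every snapshot's reflinks to the old extents stay shared among themselves.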
* Re: defragmenting best practice? 2017-11-01 12:21 ` Austin S. Hemmelgarn @ 2017-11-02 1:39 ` Dave 2017-11-02 11:07 ` Austin S. Hemmelgarn 2017-11-03 5:58 ` Marat Khalili 0 siblings, 2 replies; 56+ messages in thread From: Dave @ 2017-11-02 1:39 UTC (permalink / raw) To: Linux fs Btrfs; +Cc: Peter Grandi, Austin S. Hemmelgarn On Wed, Nov 1, 2017 at 8:21 AM, Austin S. Hemmelgarn <ahferroin7@gmail.com> wrote: >> The cache is in a separate location from the profiles, as I'm sure you >> know. The reason I suggested a separate BTRFS subvolume for >> $HOME/.cache is that this will prevent the cache files for all >> applications (for that user) from being included in the snapshots. We >> take frequent snapshots and (afaik) it makes no sense to include cache >> in backups or snapshots. The easiest way I know to exclude cache from >> BTRFS snapshots is to put it on a separate subvolume. I assumed this >> would make several things related to snapshots more efficient too. > > Yes, it will, and it will save space long-term as well since $HOME/.cache is > usually the most frequently modified location in $HOME. In addition to not > including this in the snapshots, it may also improve performance. Each > subvolume is it's own tree, with it's own locking, which means that you can > generally improve parallel access performance by splitting the workload > across multiple subvolumes. Whether it will actually provide any real > benefit in that respect is heavily dependent on the exact workload however, > but it won't hurt performance. I'm going to make this change now. What would be a good way to implement this so that the change applies to the $HOME/.cache of each user? The simple way would be to create a new subvolume for each existing user and mount it at $HOME/.cache in /etc/fstab, hard coding that mount location for each user. I don't mind doing that as there are only 4 users to consider. 
One minor concern is that it adds an unexpected step to the process of creating a new user. Is there a better way? ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-11-02 1:39 ` Dave @ 2017-11-02 11:07 ` Austin S. Hemmelgarn 2017-11-03 2:59 ` Dave 2017-11-03 5:58 ` Marat Khalili 1 sibling, 1 reply; 56+ messages in thread From: Austin S. Hemmelgarn @ 2017-11-02 11:07 UTC (permalink / raw) To: Dave, Linux fs Btrfs; +Cc: Peter Grandi On 2017-11-01 21:39, Dave wrote: > On Wed, Nov 1, 2017 at 8:21 AM, Austin S. Hemmelgarn > <ahferroin7@gmail.com> wrote: > >>> The cache is in a separate location from the profiles, as I'm sure you >>> know. The reason I suggested a separate BTRFS subvolume for >>> $HOME/.cache is that this will prevent the cache files for all >>> applications (for that user) from being included in the snapshots. We >>> take frequent snapshots and (afaik) it makes no sense to include cache >>> in backups or snapshots. The easiest way I know to exclude cache from >>> BTRFS snapshots is to put it on a separate subvolume. I assumed this >>> would make several things related to snapshots more efficient too. >> >> Yes, it will, and it will save space long-term as well since $HOME/.cache is >> usually the most frequently modified location in $HOME. In addition to not >> including this in the snapshots, it may also improve performance. Each >> subvolume is it's own tree, with it's own locking, which means that you can >> generally improve parallel access performance by splitting the workload >> across multiple subvolumes. Whether it will actually provide any real >> benefit in that respect is heavily dependent on the exact workload however, >> but it won't hurt performance. > > I'm going to make this change now. What would be a good way to > implement this so that the change applies to the $HOME/.cache of each > user? > > The simple way would be to create a new subvolume for each existing > user and mount it at $HOME/.cache in /etc/fstab, hard coding that > mount location for each user. I don't mind doing that as there are > only 4 users to consider. 
One minor concern is that it adds an > unexpected step to the process of creating a new user. Is there a > better way? > The easiest option is to just make sure nobody is logged in and run the following shell script fragment: for dir in /home/* ; do rm -rf $dir/.cache btrfs subvolume create $dir/.cache done And then add something to the user creation scripts to create that subvolume. This approach won't pollute /etc/fstab, will still exclude the directory from snapshots, and doesn't require any hugely creative work to integrate with user creation and deletion. In general, the contents of the .cache directory are just that, cached data. Provided nobody is actively accessing it, it's perfectly safe to just nuke the entire directory (I actually do this on a semi-regular basis on my systems just because it helps save space). In fact, based on the FreeDesktop.org standards, if this does break anything, it's a bug in the software in question. ^ permalink raw reply [flat|nested] 56+ messages in thread
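A slightly more defensive variant of the loop above (my own embellishment, not Austin's script) quotes the paths and refuses to touch a user who has an active session:

```shell
# Recreate each user's ~/.cache as a btrfs subvolume, skipping users
# who are currently logged in.
for dir in /home/*/ ; do
    user=$(basename "$dir")
    if who | grep -qw "$user" ; then
        echo "skipping $user: currently logged in" >&2
        continue
    fi
    rm -rf -- "${dir}.cache"
    btrfs subvolume create "${dir}.cache"
done
```

The trailing slash in the glob restricts the loop to directories, and `--` protects against home-directory names starting with a dash.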
* Re: defragmenting best practice? 2017-11-02 11:07 ` Austin S. Hemmelgarn @ 2017-11-03 2:59 ` Dave 2017-11-03 7:12 ` Kai Krakow 0 siblings, 1 reply; 56+ messages in thread From: Dave @ 2017-11-03 2:59 UTC (permalink / raw) To: Linux fs Btrfs; +Cc: Austin S. Hemmelgarn On Thu, Nov 2, 2017 at 7:07 AM, Austin S. Hemmelgarn <ahferroin7@gmail.com> wrote: > On 2017-11-01 21:39, Dave wrote: >> I'm going to make this change now. What would be a good way to >> implement this so that the change applies to the $HOME/.cache of each >> user? >> >> The simple way would be to create a new subvolume for each existing >> user and mount it at $HOME/.cache in /etc/fstab, hard coding that >> mount location for each user. I don't mind doing that as there are >> only 4 users to consider. One minor concern is that it adds an >> unexpected step to the process of creating a new user. Is there a >> better way? >> > The easiest option is to just make sure nobody is logged in and run the > following shell script fragment: > > for dir in /home/* ; do > rm -rf $dir/.cache > btrfs subvolume create $dir/.cache > done > > And then add something to the user creation scripts to create that > subvolume. This approach won't pollute /etc/fstab, will still exclude the > directory from snapshots, and doesn't require any hugely creative work to > integrate with user creation and deletion. > > In general, the contents of the .cache directory are just that, cached data. > Provided nobody is actively accessing it, it's perfectly safe to just nuke > the entire directory... I like this suggestion. Thank you. I had intended to mount the .cache subvolumes with the NODATACOW option. However, with this approach, I won't be explicitly mounting the .cache subvolumes. Is it possible to use "chattr +C $dir/.cache" in that loop even though it is a subvolume? And, is setting the .cache directory to NODATACOW the right choice given this scenario? 
From earlier comments, I believe it is, but I want to be sure I understood correctly. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-11-03 2:59 ` Dave @ 2017-11-03 7:12 ` Kai Krakow 0 siblings, 0 replies; 56+ messages in thread From: Kai Krakow @ 2017-11-03 7:12 UTC (permalink / raw) To: linux-btrfs Am Thu, 2 Nov 2017 22:59:36 -0400 schrieb Dave <davestechshop@gmail.com>: > On Thu, Nov 2, 2017 at 7:07 AM, Austin S. Hemmelgarn > <ahferroin7@gmail.com> wrote: > > On 2017-11-01 21:39, Dave wrote: > >> I'm going to make this change now. What would be a good way to > >> implement this so that the change applies to the $HOME/.cache of > >> each user? > >> > >> The simple way would be to create a new subvolume for each existing > >> user and mount it at $HOME/.cache in /etc/fstab, hard coding that > >> mount location for each user. I don't mind doing that as there are > >> only 4 users to consider. One minor concern is that it adds an > >> unexpected step to the process of creating a new user. Is there a > >> better way? > >> > > The easiest option is to just make sure nobody is logged in and run > > the following shell script fragment: > > > > for dir in /home/* ; do > > rm -rf $dir/.cache > > btrfs subvolume create $dir/.cache > > done > > > > And then add something to the user creation scripts to create that > > subvolume. This approach won't pollute /etc/fstab, will still > > exclude the directory from snapshots, and doesn't require any > > hugely creative work to integrate with user creation and deletion. > > > > In general, the contents of the .cache directory are just that, > > cached data. Provided nobody is actively accessing it, it's > > perfectly safe to just nuke the entire directory... > > I like this suggestion. Thank you. I had intended to mount the .cache > subvolumes with the NODATACOW option. However, with this approach, I > won't be explicitly mounting the .cache subvolumes. Is it possible to > use "chattr +C $dir/.cache" in that loop even though it is a > subvolume? 
And, is setting the .cache directory to NODATACOW the right > choice given this scenario? From earlier comments, I believe it is, > but I want to be sure I understood correctly. It is important to apply "chattr +C" to the _empty_ directory, because even if used recursively, it won't apply to already existing, non-empty files. But the +C attribute is inherited by newly created files and directories, so simply follow the "chattr +C on empty directory" rule and you're all set. BTW: You cannot mount subvolumes from an already mounted btrfs device with different mount options. That is currently not implemented (except for maybe a very few options). So the fstab approach probably wouldn't have helped you (depending on your partition layout). You can simply create subvolumes within the location needed and they are implicitly mounted. Then change the particular subvolume's cow behavior with chattr. -- Regards, Kai Replies to list-only preferred. ^ permalink raw reply [flat|nested] 56+ messages in thread
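[Editorial note: the "chattr +C only on an empty directory" rule above can be wrapped with a guard so it is never applied to a directory that already has contents (in which case existing files would silently stay COW). This is only a sketch; the CHATTR variable is overridable purely so the command can be inspected on a non-btrfs filesystem.]

```shell
# Set NOCOW on a directory, refusing if it is not empty.
# Newly created files then inherit the +C flag, per Kai's explanation.
set_nocow() {
    dir="$1"
    # refuse if the directory already has entries; +C would not
    # retroactively apply to their existing data
    if [ -n "$(ls -A "$dir")" ]; then
        echo "refusing: $dir is not empty" >&2
        return 1
    fi
    "${CHATTR:-chattr}" +C "$dir"
}

# e.g. for the per-user cache subvolumes: set_nocow /home/alice/.cache
```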
* Re: defragmenting best practice? 2017-11-02 1:39 ` Dave 2017-11-02 11:07 ` Austin S. Hemmelgarn @ 2017-11-03 5:58 ` Marat Khalili 2017-11-03 7:19 ` Kai Krakow 1 sibling, 1 reply; 56+ messages in thread From: Marat Khalili @ 2017-11-03 5:58 UTC (permalink / raw) To: Dave, Linux fs Btrfs; +Cc: Peter Grandi, Austin S. Hemmelgarn On 02/11/17 04:39, Dave wrote: > I'm going to make this change now. What would be a good way to > implement this so that the change applies to the $HOME/.cache of each > user? I'd make each user's .cache a symlink (should work but if it won't then bind mount) to a per-user directory in some separately mounted volume with necessary options. -- With Best Regards, Marat Khalili ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-11-03 5:58 ` Marat Khalili @ 2017-11-03 7:19 ` Kai Krakow 0 siblings, 0 replies; 56+ messages in thread From: Kai Krakow @ 2017-11-03 7:19 UTC (permalink / raw) To: linux-btrfs Am Fri, 3 Nov 2017 08:58:22 +0300 schrieb Marat Khalili <mkh@rqc.ru>: > On 02/11/17 04:39, Dave wrote: > > I'm going to make this change now. What would be a good way to > > implement this so that the change applies to the $HOME/.cache of > > each user? > I'd make each user's .cache a symlink (should work but if it won't > then bind mount) to a per-user directory in some separately mounted > volume with necessary options. On a systemd system, each user already has a private tmpfs location at /run/user/$(id -u). You could add to the central login script: # CACHE_DIR="/run/user/$(id -u)/cache" # mkdir -p "$CACHE_DIR" && ln -snf "$CACHE_DIR" "$HOME/.cache" You should not run this as root (because of mkdir -p). You could wrap it into an if statement: # if [ "$(whoami)" != "root" ]; then # ... # fi -- Regards, Kai Replies to list-only preferred. ^ permalink raw reply [flat|nested] 56+ messages in thread
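[Editorial note: a cleaned-up, parameterized version of that login-script fragment is sketched below. It assumes a systemd layout where a per-user tmpfs exists under /run/user/UID; note that a tmpfs-backed cache is discarded at logout/reboot, which is usually acceptable for cache data per the earlier discussion.]

```shell
# Point $HOME/.cache at a directory on the per-user tmpfs.
# $1 = home directory, $2 = per-user runtime dir (normally $XDG_RUNTIME_DIR,
# i.e. /run/user/$(id -u) on systemd systems).
setup_cache_link() {
    home="$1" runtime="$2"
    mkdir -p "$runtime/cache" || return 1
    ln -snf "$runtime/cache" "$home/.cache"
}

# From a central login script, skipping root as Kai suggests:
# [ "$(id -u)" -ne 0 ] && setup_cache_link "$HOME" "/run/user/$(id -u)"
```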
* Re: defragmenting best practice? 2017-11-01 0:37 ` Dave 2017-11-01 12:21 ` Austin S. Hemmelgarn @ 2017-11-01 17:48 ` Peter Grandi 2017-11-02 0:09 ` Dave 2017-11-02 0:43 ` Peter Grandi 2017-11-02 21:16 ` Kai Krakow 2 siblings, 2 replies; 56+ messages in thread From: Peter Grandi @ 2017-11-01 17:48 UTC (permalink / raw) To: Linux fs Btrfs > When defragmenting individual files on a BTRFS filesystem with > COW, I assume reflinks between that file and all snapshots are > broken. So if there are 30 snapshots on that volume, that one > file will suddenly take up 30 times more space... [ ... ] Defragmentation works by effectively making a copy of the file contents (simplistic view), so the end result is one copy with 29 reflinked contents, and one copy with defragmented contents. > Can you also give an example of using find, as you suggested > above? [ ... ] Well, one way is to use 'find' as a filtering replacement for 'defrag' option '-r', as in for example: find "$HOME" -xdev '(' -name '*.sqlite' -o -name '*.mk4' ')' \ -type f -print0 | xargs -0 btrfs fi defrag Another one is to find the most fragmented files first or all files of at least 1M with at least, say, 100 fragments as in: find "$HOME" -xdev -type f -size +1M -print0 | xargs -0 filefrag \ | perl -n -e 'print "$1\0" if (m/(.*): ([0-9]+) extents/ && $1 > 100)' \ | xargs -0 btrfs fi defrag But there are many 'find' web pages and that is not quite a Btrfs related topic. > [ ... ] The easiest way I know to exclude cache from > BTRFS snapshots is to put it on a separate subvolume. I assumed this > would make several things related to snapshots more efficient too. Only slightly.
> Background: I'm not sure why our Firefox performance is so terrible As I always say, "performance" is not the same as "speed", and probably your Firefox "performance" is sort of OKish even if the "speed" is terrible, and neither is likely related to the profile or the cache being on Btrfs: most JavaScript based sites are awfully horrible regardless of browser: http://www.sabi.co.uk/blog/13-two.html?130817#130817 and if Firefox makes a special contribution it tends to leak memory in several odd but common cases: https://utcc.utoronto.ca/~cks/space/blog/web/FirefoxResignedToLeaks?showcomments Plus it tends to cache too much, e.g. recently closed tabs. But Firefox is not special because most web browsers are not designed to run for a long time without a restart, and Chromium/Chrome simply have a different set of problem sites. Maybe the new "Quantum" Firefox 57 will improve matters because it has a far more restrictive plugin API. The overall problem is insoluble; hipster UX designers will be the second against the wall when the revolution comes :-). ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-11-01 17:48 ` Peter Grandi @ 2017-11-02 0:09 ` Dave 2017-11-02 11:17 ` Austin S. Hemmelgarn 2017-11-02 0:43 ` Peter Grandi 1 sibling, 1 reply; 56+ messages in thread From: Dave @ 2017-11-02 0:09 UTC (permalink / raw) To: Linux fs Btrfs; +Cc: Peter Grandi On Wed, Nov 1, 2017 at 1:48 PM, Peter Grandi <pg@btfs.list.sabi.co.uk> wrote: >> When defragmenting individual files on a BTRFS filesystem with >> COW, I assume reflinks between that file and all snapshots are >> broken. So if there are 30 snapshots on that volume, that one >> file will suddenly take up 30 times more space... [ ... ] > > Defragmentation works by effectively making a copy of the file > contents (simplistic view), so the end result is one copy with > 29 reflinked contents, and one copy with defragmented contents. The clarification is much appreciated. >> Can you also give an example of using find, as you suggested >> above? [ ... ] > > Well, one way is to use 'find' as a filtering replacement for > 'defrag' option '-r', as in for example: > > find "$HOME" -xdev '(' -name '*.sqlite' -o -name '*.mk4' ')' \ > -type f -print0 | xargs -0 btrfs fi defrag > > Another one is to find the most fragmented files first or all > files of at least 1M with with at least say 100 fragments as in: > > find "$HOME" -xdev -type f -size +1M -print0 | xargs -0 filefrag \ > | perl -n -e 'print "$1\0" if (m/(.*): ([0-9]+) extents/ && $1 > 100)' \ > | xargs -0 btrfs fi defrag > > But there are many 'find' web pages and that is not quite a > Btrfs related topic. Your examples were perfect. I have experience using find in similar ways. I can take it from there. :-) >> Background: I'm not sure why our Firefox performance is so terrible > > As I always say, "performance" is not the same as "speed", and > probably your Firefox "performance" is sort of OKish even if the > "speed" is terrile, and neither is likely related to the profile > or the cache being on Btrfs. Here's what happened. 
Two years ago I installed Kubuntu (with Firefox) on two desktop computers. One machine performed fine. Like you said, "sort of OKish" and that's what we expect with the current state of Linux. The other machine was substantially worse. We ran side-by-side real-world tests on these two machines for months. Initially I did a lot of testing, troubleshooting and reconfiguration trying to get the second machine to perform as well as the first. I never had success. At first I thought it was related to the GPU (or driver). Then I thought it was because the first machine used the z170 chipset and the second was X99 based. But that wasn't it. I have never solved the problem and I have been coming back to it periodically these last two years. In that time I have tried different distros from opensuse to Arch, and a lot of different hardware. Furthermore, my new machines have the same performance problem. The most interesting example is a high end machine with 256 GB of RAM. It showed substantially worse desktop application performance than any other computer here. All are running the exact same version of Firefox with the exact same add-ons. (The installations are carbon copies of each other.) What originally caught my attention was earlier information in this thread: Am Wed, 20 Sep 2017 07:46:52 -0400 schrieb "Austin S. Hemmelgarn" <ahferroin7@gmail.com>: > > Fragmentation: Files with a lot of random writes can become > > heavily fragmented (10000+ extents) causing excessive multi-second > > spikes of CPU load on systems with an SSD or large amount of RAM. On > > desktops this primarily affects application databases (including > > Firefox). Workarounds include manually defragmenting your home > > directory using btrfs fi defragment. Auto-defragment (mount option > > autodefrag) should solve this problem. > > > > Upon reading that I am wondering if fragmentation in the Firefox > > profile is part of my issue. That's one thing I never tested > > previously.
(BTW, this system has 256 GB of RAM and 20 cores.) > Almost certainly. Most modern web browsers are brain-dead and insist > on using SQLite databases (or traditional DB files) for everything, > including the cache, and the usage for the cache in particular kills > performance when fragmentation is an issue. It turns out that the first machine (which performed well enough) was the last one which was installed using LVM + EXT4. The second machine (the one with the original performance problem) and all subsequent machines have used BTRFS. And the worst performing machine was the one with the most RAM and a fast NVMe drive and top of the line hardware. While Firefox and Linux in general have their performance "issues", that's not relevant here. I'm comparing the same distros, same Firefox versions, same Firefox add-ons, etc. I eventually tested many hardware configurations: different CPU's, motherboards, GPU's, SSD's, RAM, etc. The only remaining difference I can find is that the computer with acceptable performance uses LVM + EXT4 while all the others use BTRFS. With all the great feedback I have gotten here, I'm now ready to retest this after implementing all the BTRFS-related suggestions I have received. Maybe that will solve the problem or maybe this mystery will continue... ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-11-02 0:09 ` Dave @ 2017-11-02 11:17 ` Austin S. Hemmelgarn 2017-11-02 18:09 ` Dave 0 siblings, 1 reply; 56+ messages in thread From: Austin S. Hemmelgarn @ 2017-11-02 11:17 UTC (permalink / raw) To: Dave, Linux fs Btrfs; +Cc: Peter Grandi On 2017-11-01 20:09, Dave wrote: > On Wed, Nov 1, 2017 at 1:48 PM, Peter Grandi <pg@btfs.list.sabi.co.uk> wrote: >>> When defragmenting individual files on a BTRFS filesystem with >>> COW, I assume reflinks between that file and all snapshots are >>> broken. So if there are 30 snapshots on that volume, that one >>> file will suddenly take up 30 times more space... [ ... ] >> >> Defragmentation works by effectively making a copy of the file >> contents (simplistic view), so the end result is one copy with >> 29 reflinked contents, and one copy with defragmented contents. > > The clarification is much appreciated. > >>> Can you also give an example of using find, as you suggested >>> above? [ ... ] >> >> Well, one way is to use 'find' as a filtering replacement for >> 'defrag' option '-r', as in for example: >> >> find "$HOME" -xdev '(' -name '*.sqlite' -o -name '*.mk4' ')' \ >> -type f -print0 | xargs -0 btrfs fi defrag >> >> Another one is to find the most fragmented files first or all >> files of at least 1M with with at least say 100 fragments as in: >> >> find "$HOME" -xdev -type f -size +1M -print0 | xargs -0 filefrag \ >> | perl -n -e 'print "$1\0" if (m/(.*): ([0-9]+) extents/ && $1 > 100)' \ >> | xargs -0 btrfs fi defrag >> >> But there are many 'find' web pages and that is not quite a >> Btrfs related topic. > > Your examples were perfect. I have experience using find in similar > ways. I can take it from there. 
:-) > >>> Background: I'm not sure why our Firefox performance is so terrible >> >> As I always say, "performance" is not the same as "speed", and >> probably your Firefox "performance" is sort of OKish even if the >> "speed" is terrile, and neither is likely related to the profile >> or the cache being on Btrfs. > > Here's what happened. Two years ago I installed Kubuntu (with Firefox) > on two desktop computers. One machine performed fine. Like you said, > "sort of OKish" and that's what we expect with the current state of > Linux. The other machine was substantially worse. We ran side-by-side > real-world tests on these two machines for months. > > Initially I did a lot of testing, troubleshooting and reconfiguration > trying to get the second machine to perform as well as the first. I > never had success. At first I thought it was related to the GPU (or > driver). Then I thought it was because the first machine used the z170 > chipset and the second was X99 based. But that wasn't it. I have never > solved the problem and I have been coming back to it periodically > these last two years. In that time I have tried different distros from > opensuse to Arch, and a lot of different hardware. > > Furthermore, my new machines have the same performance problem. The > most interesting example is a high end machine with 256 GB of RAM. It > showed substantially worse desktop application performance than any > other computer here. All are running the exact same version of Firefox > with the exact same add-ons. (The installations are carbon copies of > each other.) > > What originally caught my attention was earlier information in this thread: > > Am Wed, 20 Sep 2017 07:46:52 -0400 > schrieb "Austin S. Hemmelgarn" <ahferroin7@gmail.com>: > >>> Fragmentation: Files with a lot of random writes can become >>> heavily fragmented (10000+ extents) causing excessive multi-second >>> spikes of CPU load on systems with an SSD or large amount a RAM. 
On >>> desktops this primarily affects application databases (including >>> Firefox). Workarounds include manually defragmenting your home >>> directory using btrfs fi defragment. Auto-defragment (mount option >>> autodefrag) should solve this problem. >>> >>> Upon reading that I am wondering if fragmentation in the Firefox >>> profile is part of my issue. That's one thing I never tested >>> previously. (BTW, this system has 256 GB of RAM and 20 cores.) >> Almost certainly. Most modern web browsers are brain-dead and insist >> on using SQLite databases (or traditional DB files) for everything, >> including the cache, and the usage for the cache in particular kills >> performance when fragmentation is an issue. > > It turns out the the first machine (which performed well enough) was > the last one which was installed using LVM + EXT4. The second machine > (the one with the original performance problem) and all subsequent > machines have used BTRFS. > > And the worst performing machine was the one with the most RAM and a > fast NVMe drive and top of the line hardware. Somewhat nonsensically, I'll bet that NVMe is a contributing factor in this particular case. NVMe has particularly bad performance with the old block IO schedulers (though it is NVMe, so it should still be better than a SATA or SAS SSD), and the new blk-mq framework just got scheduling support in 4.12, and only got reasonably good scheduling options in 4.13. I doubt it's the entirety of the issue, but it's probably part of it. > > While Firefox and Linux in general have their performance "issues", > that's not relevant here. I'm comparing the same distros, same Firefox > versions, same Firefox add-ons, etc. I eventually tested many hardware > configurations: different CPU's, motherboards, GPU's, SSD's, RAM, etc. > The only remaining difference I can find is that the computer with > acceptable performance uses LVM + EXT4 while all the others use BTRFS. 
> > With all the great feedback I have gotten here, I'm now ready to > retest this after implementing all the BTRFS-related suggestions I > have received. Maybe that will solve the problem or maybe this mystery > will continue... Hmm, if you're only using SSD's, that may partially explain things. I don't remember if it was mentioned earlier in this thread, but you might try adding 'nossd' to the mount options. The 'ssd' mount option (which gets set automatically if the device reports as non-rotational) impacts how the block allocator works, and that can have a pretty insane impact on performance. Additionally, independently from that, try toggling the 'discard' mount option. If you have it enabled, disable it, if you have it disabled, enable it. Inline discards can be very expensive on some hardware, especially older SSD's, and discards happen pretty frequently in a COW filesystem. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-11-02 11:17 ` Austin S. Hemmelgarn @ 2017-11-02 18:09 ` Dave 2017-11-02 18:37 ` Austin S. Hemmelgarn 0 siblings, 1 reply; 56+ messages in thread From: Dave @ 2017-11-02 18:09 UTC (permalink / raw) To: Linux fs Btrfs; +Cc: Austin S. Hemmelgarn On Thu, Nov 2, 2017 at 7:17 AM, Austin S. Hemmelgarn <ahferroin7@gmail.com> wrote: >> And the worst performing machine was the one with the most RAM and a >> fast NVMe drive and top of the line hardware. > > Somewhat nonsensically, I'll bet that NVMe is a contributing factor in this > particular case. NVMe has particularly bad performance with the old block > IO schedulers (though it is NVMe, so it should still be better than a SATA > or SAS SSD), and the new blk-mq framework just got scheduling support in > 4.12, and only got reasonably good scheduling options in 4.13. I doubt it's > the entirety of the issue, but it's probably part of it. Thanks for that news. Based on that, I assume the advice here (to use noop for NVMe) is now outdated? https://stackoverflow.com/a/27664577/463994 Is the solution as simple as running a kernel >= 4.13? Or do I need to specify which scheduler to use? I just checked one computer: uname -a Linux morpheus 4.13.5-1-ARCH #1 SMP PREEMPT Fri Oct 6 09:58:47 CEST 2017 x86_64 GNU/Linux $ sudo find /sys -name scheduler -exec grep . {} + /sys/devices/pci0000:00/0000:00:1d.0/0000:08:00.0/nvme/nvme0/nvme0n1/queue/scheduler:[none] mq-deadline kyber bfq From this article, it sounds like (maybe) I should use kyber. I see kyber listed in the output above, so I assume that means it is available. I also think [none] is the current scheduler being used, as it is in brackets.
I checked this: https://www.kernel.org/doc/Documentation/block/switching-sched.txt Based on that, I assume I would do this at runtime: echo kyber > /sys/devices/pci0000:00/0000:00:1d.0/0000:08:00.0/nvme/nvme0/nvme0n1/queue/scheduler I assume this is equivalent: echo kyber > /sys/block/nvme0n1/queue/scheduler How would I set it permanently at boot time? >> While Firefox and Linux in general have their performance "issues", >> that's not relevant here. I'm comparing the same distros, same Firefox >> versions, same Firefox add-ons, etc. I eventually tested many hardware >> configurations: different CPU's, motherboards, GPU's, SSD's, RAM, etc. >> The only remaining difference I can find is that the computer with >> acceptable performance uses LVM + EXT4 while all the others use BTRFS. >> >> With all the great feedback I have gotten here, I'm now ready to >> retest this after implementing all the BTRFS-related suggestions I >> have received. Maybe that will solve the problem or maybe this mystery >> will continue... > > Hmm, if you're only using SSD's, that may partially explain things. I don't > remember if it was mentioned earlier in this thread, but you might try > adding 'nossd' to the mount options. The 'ssd' mount option (which gets set > automatically if the device reports as non-rotational) impacts how the block > allocator works, and that can have a pretty insane impact on performance. I will test the "nossd" mount option. > Additionally, independently from that, try toggling the 'discard' mount > option. If you have it enabled, disable it, if you have it disabled, enable > it. Inline discards can be very expensive on some hardware, especially > older SSD's, and discards happen pretty frequently in a COW filesystem. I have been following this advice, so I have never enabled discard for an NVMe drive. Do you think it is worth testing? 
Solid State Drives/NVMe - ArchWiki https://wiki.archlinux.org/index.php/Solid_State_Drives/NVMe Discards: Note: Although continuous TRIM is an option (albeit not recommended) for SSDs, NVMe devices should not be issued discards. ^ permalink raw reply [flat|nested] 56+ messages in thread
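[Editorial note: the runtime "echo kyber > .../queue/scheduler" step above can be wrapped with a small sanity check that the requested scheduler is actually offered by the kernel for that queue. This is a sketch; the file path is a parameter only so the logic can be exercised against an ordinary file on a system where you don't want to touch sysfs.]

```shell
# Switch the I/O scheduler for a block-device queue, but only if the
# requested name appears in the kernel's list of available schedulers
# (the sysfs file lists them all, with the active one in brackets).
set_scheduler() {
    sched="$1" file="$2"
    if ! grep -qw "$sched" "$file"; then
        echo "scheduler $sched not available in $file" >&2
        return 1
    fi
    echo "$sched" > "$file"
}

# e.g. set_scheduler kyber /sys/block/nvme0n1/queue/scheduler
```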
* Re: defragmenting best practice? 2017-11-02 18:09 ` Dave @ 2017-11-02 18:37 ` Austin S. Hemmelgarn 0 siblings, 0 replies; 56+ messages in thread From: Austin S. Hemmelgarn @ 2017-11-02 18:37 UTC (permalink / raw) To: Dave, Linux fs Btrfs On 2017-11-02 14:09, Dave wrote: > On Thu, Nov 2, 2017 at 7:17 AM, Austin S. Hemmelgarn > <ahferroin7@gmail.com> wrote: > >>> And the worst performing machine was the one with the most RAM and a >>> fast NVMe drive and top of the line hardware. >> >> Somewhat nonsensically, I'll bet that NVMe is a contributing factor in this >> particular case. NVMe has particularly bad performance with the old block >> IO schedulers (though it is NVMe, so it should still be better than a SATA >> or SAS SSD), and the new blk-mq framework just got scheduling support in >> 4.12, and only got reasonably good scheduling options in 4.13. I doubt it's >> the entirety of the issue, but it's probably part of it. > > Thanks for that news. Based on that, I assume the advice here (to use > noop for NVMe) is now outdated? > https://stackoverflow.com/a/27664577/463994 > > Is the solution as simple as running a kernel >= 4.13? Or do I need to > specify which scheduler to use? > > I just checked one computer: > > uname -a > Linux morpheus 4.13.5-1-ARCH #1 SMP PREEMPT Fri Oct 6 09:58:47 CEST > 2017 x86_64 GNU/Linux > > $ sudo find /sys -name scheduler -exec grep . {} + > /sys/devices/pci0000:00/0000:00:1d.0/0000:08:00.0/nvme/nvme0/nvme0n1/queue/scheduler:[none] > mq-deadline kyber bfq > > From this article, it sounds like (maybe) I should use kyber. I see > kyber listed in the output above, so I assume that means it is > available. I also think [none] is the current scheduler being used, as > it is in brackets. 
> > I checked this: > https://www.kernel.org/doc/Documentation/block/switching-sched.txt > Based on that, I assume I would do this at runtime: > > echo kyber > /sys/devices/pci0000:00/0000:00:1d.0/0000:08:00.0/nvme/nvme0/nvme0n1/queue/scheduler > > I assume this is equivalent: > > echo kyber > /sys/block/nvme0n1/queue/scheduler > > How would I set it permanently at boot time? It's kind of complicated overall. As of 4.14, there are four options for the blk-mq path. The 'none' scheduler is the old behavior prior to 4.13, and does no scheduling. 'mq-deadline' is the default AFAIK, and behaves like the old deadline I/O scheduler (not sure if it supports I/O priorities). 'bfq' is a blk-mq port of a scheduler originally designed to replace the default CFQ scheduler from the old block layer. 'kyber' I know essentially nothing about, I never saw the patches on LKML (not sure if I just missed them, or they only went to topic lists), and I've not tried it myself. I have no personal experience with anything but the none scheduler on NVMe devices, so I can't really comment much more than saying that I've seen a huge difference on the SATA SSD's I use, first when the deadline scheduler became the default and then again when I switched to BFQ on my systems, and the fact that I've seen reports of using the deadline scheduler improving things on NVMe. As far as setting it at boot time, there's currently no kernel configuration option to set a default like there is for the old block interface, and I don't know of any kernel command line option to set it either, but a udev rule setting it as an attribute works reliably. I'm using something like the following to set all my SATA devices to use BFQ by default: KERNEL=="sd?", SUBSYSTEM=="block", ACTION=="add", ATTR{queue/scheduler}="bfq"
I eventually tested many hardware >>> configurations: different CPU's, motherboards, GPU's, SSD's, RAM, etc. >>> The only remaining difference I can find is that the computer with >>> acceptable performance uses LVM + EXT4 while all the others use BTRFS. >>> >>> With all the great feedback I have gotten here, I'm now ready to >>> retest this after implementing all the BTRFS-related suggestions I >>> have received. Maybe that will solve the problem or maybe this mystery >>> will continue... >> >> Hmm, if you're only using SSD's, that may partially explain things. I don't >> remember if it was mentioned earlier in this thread, but you might try >> adding 'nossd' to the mount options. The 'ssd' mount option (which gets set >> automatically if the device reports as non-rotational) impacts how the block >> allocator works, and that can have a pretty insane impact on performance. > > I will test the "nossd" mount option. If you're not seeing any difference on the newest kernels (I hadn't realized you were running 4.13 on anything), you might not see any impact from doing this. I'd also suggest running a full balance prior to testing _after_ switching the option, part of the performance impact is due to the resultant on-disk layout. > >> Additionally, independently from that, try toggling the 'discard' mount >> option. If you have it enabled, disable it, if you have it disabled, enable >> it. Inline discards can be very expensive on some hardware, especially >> older SSD's, and discards happen pretty frequently in a COW filesystem. > > I have been following this advice, so I have never enabled discard for > an NVMe drive. Do you think it is worth testing? > > Solid State Drives/NVMe - ArchWiki > https://wiki.archlinux.org/index.php/Solid_State_Drives/NVMe > > Discards: > Note: Although continuous TRIM is an option (albeit not recommended) > for SSDs, NVMe devices should not be issued discards. 
I've never heard this particular advice before, and it offers no source for the claim. I have seen Intel's advice that they quote below that before though, and would tend to agree with it for most users. The part that makes this all complicated is that different devices handle batched discards (what the Arch people call 'Periodic TRIM') and on-demand discards (what the Arch people call 'Continuous TRIM') differently. Some devices (especially old ones) do better with batched discards, while others seem to do better with on-demand discards. On top of that, there's significant variance based on the actual workload (including that from the filesystem itself). Based on my own experience using BTRFS on SATA SSD's, it's usually better to do batched discards unless you only write to the filesystem infrequently, because: 1. Each COW operation triggers an associated discard (this can seriously kill your performance). 2. Because old copies of blocks get discarded immediately, it's much harder to recover a damaged filesystem. There are some odd exceptions though. If for example you're running BTRFS on a ramdisk or ZRAM device, you should just use on-demand discards, as that will free up memory immediately. ^ permalink raw reply [flat|nested] 56+ messages in thread
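[Editorial note: an NVMe adaptation of Austin's SATA udev rule might look like the following. Both the filename and the kernel-name glob are assumptions on my part (NVMe namespace devices show up as nvmeXnY); substitute mq-deadline or kyber per the scheduler discussion above.]

```
# /etc/udev/rules.d/60-ioschedulers.rules  (hypothetical filename)
# Same approach as the SATA rule, adapted for NVMe namespace devices.
KERNEL=="nvme[0-9]*n[0-9]*", SUBSYSTEM=="block", ACTION=="add", ATTR{queue/scheduler}="kyber"
```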
* Re: defragmenting best practice? 2017-11-01 17:48 ` Peter Grandi 2017-11-02 0:09 ` Dave @ 2017-11-02 0:43 ` Peter Grandi 1 sibling, 0 replies; 56+ messages in thread From: Peter Grandi @ 2017-11-02 0:43 UTC (permalink / raw) To: Linux fs Btrfs > Another one is to find the most fragmented files first or all > files of at least 1M with with at least say 100 fragments as in: > find "$HOME" -xdev -type f -size +1M -print0 | xargs -0 filefrag \ > | perl -n -e 'print "$1\0" if (m/(.*): ([0-9]+) extents/ && $1 > 100)' \ > | xargs -0 btrfs fi defrag That should have "&& $2 > 100". ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-11-01 0:37 ` Dave 2017-11-01 12:21 ` Austin S. Hemmelgarn 2017-11-01 17:48 ` Peter Grandi @ 2017-11-02 21:16 ` Kai Krakow 2017-11-03 2:47 ` Dave 2 siblings, 1 reply; 56+ messages in thread From: Kai Krakow @ 2017-11-02 21:16 UTC (permalink / raw) To: linux-btrfs Am Tue, 31 Oct 2017 20:37:27 -0400 schrieb Dave <davestechshop@gmail.com>: > > Also, you can declare the '.firefox/default/' directory to be > > NOCOW, and that "just works". > > The cache is in a separate location from the profiles, as I'm sure you > know. The reason I suggested a separate BTRFS subvolume for > $HOME/.cache is that this will prevent the cache files for all > applications (for that user) from being included in the snapshots. We > take frequent snapshots and (afaik) it makes no sense to include cache > in backups or snapshots. The easiest way I know to exclude cache from > BTRFS snapshots is to put it on a separate subvolume. I assumed this > would make several things related to snapshots more efficient too. > > As far as the Firefox profile being declared NOCOW, as soon as we take > the first snapshot, I understand that it will become COW again. So I > don't see any point in making it NOCOW. Ah well, not really. The files and directories will still be nocow - however, the next write to any such file after a snapshot will make a cow operation. So you still see the fragmentation effect but to a much lesser extent. But the files itself will remain in nocow format. You may want to try btrfs autodefrag mount option and see if it improves things (tho, the effect may take days or weeks to apply if you didn't enable it right from the creation of the filesystem). Also, autodefrag will probably unshare reflinks on your snapshots. You may be able to use bees[1] to work against this effect. Its interaction with autodefrag is not well tested but it works fine for me. 
Also, bees is able to reduce some of the fragmentation during deduplication because it will rewrite extents back into bigger chunks (but only for duplicated data). [1]: https://github.com/Zygo/bees -- Regards, Kai Replies to list-only preferred. ^ permalink raw reply [flat|nested] 56+ messages in thread
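[Editor's note: for reference, the autodefrag option Kai mentions is a mount option. A minimal sketch of enabling it, assuming a hypothetical /etc/fstab entry — the UUID, subvolume name, and mount point below are placeholders, not taken from the thread:]

```
# /etc/fstab (sketch; UUID, subvol and mount point are placeholders)
UUID=0123-example  /home  btrfs  subvol=@home,autodefrag,noatime  0 0
```

It can also be turned on for a live system with `mount -o remount,autodefrag /home`. As Kai notes above, it only acts on writes made while the option is active, so pre-existing fragmentation remains until files are rewritten or defragmented manually.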
* Re: defragmenting best practice? 2017-11-02 21:16 ` Kai Krakow @ 2017-11-03 2:47 ` Dave 2017-11-03 7:26 ` Kai Krakow 0 siblings, 1 reply; 56+ messages in thread From: Dave @ 2017-11-03 2:47 UTC (permalink / raw) To: Linux fs Btrfs; +Cc: Kai Krakow On Thu, Nov 2, 2017 at 5:16 PM, Kai Krakow <hurikhan77@gmail.com> wrote: > > You may want to try btrfs autodefrag mount option and see if it > improves things (tho, the effect may take days or weeks to apply if you > didn't enable it right from the creation of the filesystem). > > Also, autodefrag will probably unshare reflinks on your snapshots. You > may be able to use bees[1] to work against this effect. Its interaction > with autodefrag is not well tested but it works fine for me. Also, bees > is able to reduce some of the fragmentation during deduplication > because it will rewrite extents back into bigger chunks (but only for > duplicated data). > > [1]: https://github.com/Zygo/bees I will look into bees. And yes, I plan to try autodefrag. (I already have it enabled now.) However, I need to understand something about how btrfs send-receive works in regard to reflinks and fragmentation. Say I have 2 snapshots on my live volume. The earlier one of them has already been sent to another block device by btrfs send-receive (full backup). Now defrag runs on the live volume and breaks some percentage of the reflinks. At this point I do an incremental btrfs send-receive using "-p" (or "-c") with the diff going to the same other block device where the prior snapshot was already sent. Will reflinks be "made whole" (restored) on the receiving block device? Or is the state of the source volume replicated so closely that reflink status is the same on the target? Also, is fragmentation reduced on the receiving block device? My expectation is that fragmentation would be reduced and duplication would be reduced too. In other words, does send-receive result in defragmentation and deduplication too? 
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-11-03 2:47 ` Dave @ 2017-11-03 7:26 ` Kai Krakow 2017-11-03 11:30 ` Austin S. Hemmelgarn 0 siblings, 1 reply; 56+ messages in thread From: Kai Krakow @ 2017-11-03 7:26 UTC (permalink / raw) To: linux-btrfs Am Thu, 2 Nov 2017 22:47:31 -0400 schrieb Dave <davestechshop@gmail.com>: > On Thu, Nov 2, 2017 at 5:16 PM, Kai Krakow <hurikhan77@gmail.com> > wrote: > > > > > You may want to try btrfs autodefrag mount option and see if it > > improves things (tho, the effect may take days or weeks to apply if > > you didn't enable it right from the creation of the filesystem). > > > > Also, autodefrag will probably unshare reflinks on your snapshots. > > You may be able to use bees[1] to work against this effect. Its > > interaction with autodefrag is not well tested but it works fine > > for me. Also, bees is able to reduce some of the fragmentation > > during deduplication because it will rewrite extents back into > > bigger chunks (but only for duplicated data). > > > > [1]: https://github.com/Zygo/bees > > I will look into bees. And yes, I plan to try autodefrag. (I already > have it enabled now.) However, I need to understand something about > how btrfs send-receive works in regard to reflinks and fragmentation. > > Say I have 2 snapshots on my live volume. The earlier one of them has > already been sent to another block device by btrfs send-receive (full > backup). Now defrag runs on the live volume and breaks some percentage > of the reflinks. At this point I do an incremental btrfs send-receive > using "-p" (or "-c") with the diff going to the same other block > device where the prior snapshot was already sent. > > Will reflinks be "made whole" (restored) on the receiving block > device? Or is the state of the source volume replicated so closely > that reflink status is the same on the target? > > Also, is fragmentation reduced on the receiving block device? 
> > My expectation is that fragmentation would be reduced and duplication > would be reduced too. In other words, does send-receive result in > defragmentation and deduplication too? As far as I understand, btrfs send/receive doesn't create an exact mirror. It just replays the block operations between generation numbers. That is: If it finds new blocks referenced between generations, it will write a _new_ block to the destination. So, no, it won't reduce fragmentation or duplication. It just keeps reflinks intact as long as such extents weren't touched within the generation range. Otherwise they are rewritten as new extents. As such, autodefrag and deduplication processes will probably increase duplication at the destination. A developer may have a better clue, tho. -- Regards, Kai Replies to list-only preferred. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-11-03 7:26 ` Kai Krakow @ 2017-11-03 11:30 ` Austin S. Hemmelgarn 0 siblings, 0 replies; 56+ messages in thread From: Austin S. Hemmelgarn @ 2017-11-03 11:30 UTC (permalink / raw) To: linux-btrfs On 2017-11-03 03:26, Kai Krakow wrote: > Am Thu, 2 Nov 2017 22:47:31 -0400 > schrieb Dave <davestechshop@gmail.com>: > >> On Thu, Nov 2, 2017 at 5:16 PM, Kai Krakow <hurikhan77@gmail.com> >> wrote: >> >>> >>> You may want to try btrfs autodefrag mount option and see if it >>> improves things (tho, the effect may take days or weeks to apply if >>> you didn't enable it right from the creation of the filesystem). >>> >>> Also, autodefrag will probably unshare reflinks on your snapshots. >>> You may be able to use bees[1] to work against this effect. Its >>> interaction with autodefrag is not well tested but it works fine >>> for me. Also, bees is able to reduce some of the fragmentation >>> during deduplication because it will rewrite extents back into >>> bigger chunks (but only for duplicated data). >>> >>> [1]: https://github.com/Zygo/bees >> >> I will look into bees. And yes, I plan to try autodefrag. (I already >> have it enabled now.) However, I need to understand something about >> how btrfs send-receive works in regard to reflinks and fragmentation. >> >> Say I have 2 snapshots on my live volume. The earlier one of them has >> already been sent to another block device by btrfs send-receive (full >> backup). Now defrag runs on the live volume and breaks some percentage >> of the reflinks. At this point I do an incremental btrfs send-receive >> using "-p" (or "-c") with the diff going to the same other block >> device where the prior snapshot was already sent. >> >> Will reflinks be "made whole" (restored) on the receiving block >> device? Or is the state of the source volume replicated so closely >> that reflink status is the same on the target? >> >> Also, is fragmentation reduced on the receiving block device? 
>> >> My expectation is that fragmentation would be reduced and duplication >> would be reduced too. In other words, does send-receive result in >> defragmentation and deduplication too? > > As far as I understand, btrfs send/receive doesn't create an exact > mirror. It just replays the block operations between generation > numbers. That is: If it finds new blocks referenced between > generations, it will write a _new_ block to the destination. That is mostly correct, except it's not a block level copy. To put it in a heavily simplified manner, send/receive will recreate the subvolume using nothing more than basic file manipulation syscalls (write(), chown(), chmod(), etc), the clone ioctl, and some extra logic to figure out the correct location to clone from. IOW, it's functionally equivalent to using rsync to copy the data, and then deduplicating, albeit a bit smarter about when to deduplicate (and more efficient in that respect). > > So, no, it won't reduce fragmentation or duplication. It just keeps > reflinks intact as long as such extents weren't touched within the > generation range. Otherwise they are rewritten as new extents. A received subvolume will almost always be less fragmented than the source, since everything is received serially, and each file is written out one at a time. > > Autodefrag and deduplication processes will as such probably increase > duplication at the destination. A developer may have a better clue, tho. In theory, yes, but in practice, not so much. Autodefrag generally operates on very small blocks of data (64k IIRC), and I'm pretty sure it has some heuristic that only triggers it on small random writes, so depending on the workload, it may not be triggering much (for example, it often won't trigger on cache directories, since those almost never have files rewritten in place). ^ permalink raw reply [flat|nested] 56+ messages in thread
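[Editor's note: to make the scenario under discussion concrete, here is a rough sketch of the full-plus-incremental send/receive cycle Dave describes, using `-p` to name the parent snapshot. All paths are hypothetical, and the commands need root on a real btrfs filesystem, so this is illustrative only:]

```
# Initial full backup: a read-only snapshot is sent in its entirety.
btrfs subvolume snapshot -r /mnt/data /mnt/data/.snapshots/snap1
btrfs send /mnt/data/.snapshots/snap1 | btrfs receive /mnt/backup

# Later, incremental: only the differences between snap1 and snap2
# cross the pipe; unchanged data is recreated on the receiver by
# cloning from the already-received snap1.
btrfs subvolume snapshot -r /mnt/data /mnt/data/.snapshots/snap2
btrfs send -p /mnt/data/.snapshots/snap1 /mnt/data/.snapshots/snap2 | btrfs receive /mnt/backup
```

As Austin explains above, the receiver rebuilds files serially with ordinary writes plus clone calls, so the received copy is typically less fragmented than the source, while reflink sharing is preserved only for extents untouched between the two snapshots.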
[parent not found: <CAH=dxU47-52-asM5vJ_-qOpEpjZczHw7vQzgi1-TeKm58++zBQ@mail.gmail.com>]
* Re: defragmenting best practice? [not found] ` <CAH=dxU47-52-asM5vJ_-qOpEpjZczHw7vQzgi1-TeKm58++zBQ@mail.gmail.com> @ 2017-12-11 5:18 ` Dave 2017-12-11 6:10 ` Timofey Titovets 0 siblings, 1 reply; 56+ messages in thread From: Dave @ 2017-12-11 5:18 UTC (permalink / raw) To: Linux fs Btrfs On Tue, Oct 31, 2017 someone wrote: > > > > 2. Put $HOME/.cache on a separate BTRFS subvolume that is mounted > > nocow -- it will NOT be snapshotted I did exactly this. It serves the purpose of avoiding snapshots. However, today I saw the following at https://wiki.archlinux.org/index.php/Btrfs Note: From Btrfs Wiki Mount options: within a single file system, it is not possible to mount some subvolumes with nodatacow and others with datacow. The mount option of the first mounted subvolume applies to any other subvolumes. That makes me think my nodatacow mount option on $HOME/.cache is not effective. True? (My subjective performance results have not been as good as hoped for with the tweaks I have tried so far.) ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-12-11 5:18 ` Dave @ 2017-12-11 6:10 ` Timofey Titovets 0 siblings, 0 replies; 56+ messages in thread From: Timofey Titovets @ 2017-12-11 6:10 UTC (permalink / raw) To: Dave; +Cc: Linux fs Btrfs 2017-12-11 8:18 GMT+03:00 Dave <davestechshop@gmail.com>: > On Tue, Oct 31, 2017 someone wrote: >> >> >> > 2. Put $HOME/.cache on a separate BTRFS subvolume that is mounted >> > nocow -- it will NOT be snapshotted > > I did exactly this. It serves the purpose of avoiding snapshots. > However, today I saw the following at > https://wiki.archlinux.org/index.php/Btrfs > > Note: From Btrfs Wiki Mount options: within a single file system, it > is not possible to mount some subvolumes with nodatacow and others > with datacow. The mount option of the first mounted subvolume applies > to any other subvolumes. > > That makes me think my nodatacow mount option on $HOME/.cache is not > effective. True? > > (My subjective performance results have not been as good as hoped for > with the tweaks I have tried so far.) > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html True. For specific directories that you want to mark as no-CoW, you need to use chattr instead, like:

rm -rf ~/.cache
mkdir ~/.cache
chattr +C ~/.cache

-- Have a nice day, Timofey. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-10-31 21:47 ` Dave 2017-10-31 23:06 ` Peter Grandi @ 2017-11-01 7:43 ` Sean Greenslade 2017-11-01 13:31 ` Duncan 2 siblings, 0 replies; 56+ messages in thread From: Sean Greenslade @ 2017-11-01 7:43 UTC (permalink / raw) To: Dave; +Cc: Linux fs Btrfs On Tue, Oct 31, 2017 at 05:47:54PM -0400, Dave wrote: > I'm following up on all the suggestions regarding Firefox performance > on BTRFS. > > <SNIP> > > 5. Firefox profile sync has not worked well for us in the past, so we > don't use it. > 6. Our machines generally have plenty of RAM so we could put the > Firefox cache (and maybe profile) into RAM using a technique such as > https://wiki.archlinux.org/index.php/Firefox/Profile_on_RAM. However, > profile persistence is important. > 4. Put the Firefox cache in RAM > > 5. If needed, consider putting the Firefox profile in RAM Have you looked into profile-sync-daemon? https://wiki.archlinux.org/index.php/profile-sync-daemon It basically does the "keep the profile in RAM but also sync it to HDD" for you. I've used it for years, it works quite well. --Sean ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-10-31 21:47 ` Dave 2017-10-31 23:06 ` Peter Grandi 2017-11-01 7:43 ` Sean Greenslade @ 2017-11-01 13:31 ` Duncan 2017-11-01 23:36 ` Dave 2 siblings, 1 reply; 56+ messages in thread From: Duncan @ 2017-11-01 13:31 UTC (permalink / raw) To: linux-btrfs Dave posted on Tue, 31 Oct 2017 17:47:54 -0400 as excerpted: > 6. Make sure Firefox is running in multi-process mode. (Duncan's > instructions, while greatly appreciated and very useful, left me > slightly confused about pulseaudio's compatibility with multi-process > mode.) Just to clarify: There's no problem with native pulseaudio and firefox multi-process mode. As that's what most people will be using, and what firefox upstream ships for, chances are very high that you're just fine there, tho there's some small chance you have some other problem. My specific problem was that I do *NOT* have pulseaudio installed here, as I've never found I needed it and it adds more complication to my configuration than the limited benefit I'd get out of it justifies. Straight alsa has been fine for me. (Explanatory note: Being on gentoo/~amd64, aka testing, I do a lot more updating than stable users, and because it's gentoo, all those updates are built from sources, so every single extra package I have installed has a very real cost in terms of repeated update builds over time. Put a bit differently, building and updating from sources tends to rather strongly encourage the best security practice of only installing what you actually need, because you have to rebuild it at every update. And I don't need pulseaudio enough to be worth the cost to keep it updated, so I don't have it installed. It really is that simple. Binary-based distro users have rather trivial update costs in comparison, so having a few extra packages installed that they don't actually use, isn't such a big deal for them. 
Which is of course fortunate, since dependencies are often determined at build-time, and binary-based distros tend to enable relatively more of them because /someone/ uses them, even if it's a minority, so they tend to carry around more dependencies than the normal user will need, simply to support the few that do. And because the cost is relatively lower, users, except for the ones that pay enough attention to the security aspect of the wider attack surface, don't generally care as much as they would if they were forced to build and update all of them from sources!) So when firefox upstream dropped support for alsa and began requiring pulseaudio for users that actually wanted their browser to play sound, I had two choices. I could try to find a workaround that would fake firefox into believing that I had pulseaudio, or I could switch back to building firefox from sources instead of simply installing the upstream provided binaries, since gentoo's firefox build scripts still have the alsa support option that upstream firefox refused to support or ship any longer. As with most people and their browsers, firefox is the most security- exposed app I run, and it sometimes takes gentoo a few days after an upstream firefox release to get a working build out, during which users waiting on gentoo's package build are exposed to already widely known and patched by upstream security issues. That was more risk than I wanted to take, thus my choice of switching to the upstream firefox binaries in the first place, since they were available, indeed, autoupdated, on release day. Additionally, a firefox build takes awhile, much longer than most other packages, and now requires rust, itself an expensive to build package (tho fortunately it doesn't upgrade on the fast cycle that firefox does). 
So I wasn't particularly happy about being forced back to waiting for gentoo to get around to updating its firefox builds several days after upstream, and then taking the time to build them myself, making it worthwhile to look for a workaround. And as it happens, there's a /sort/ of workaround called apulse, a much simpler and smaller package than pulseaudio itself, that's basically just a pulseaudio API wrapper around alsa. And when I first installed apulse and tested firefox with it, sure enough, I got firefox sound back! =:^) I thought I had my workaround and that it was a satisfactory solution. Unfortunately, apulse appears not to be multi-process-safe, and as firefox went more and more multi-process in the announcements, etc, at first I couldn't figure out what was keeping firefox single-process for me. After some research on the web, I found the settings to /force/ firefox multi-process, and tried them. But firefox would then only work in local mode (about: pages, basically). Every time I tried to actually go to a normal URL, the multi-process tabs would crash before it rendered a thing! The original firefox UI shell was still running, but with an error message indicating the tab crash instead of the page I wanted. After some troubleshooting I figured out it was apulse. If I moved the apulse library out of the way so firefox couldn't find it, I could browse the web in multiprocess mode just fine... except I was of course missing audio again. =:^( So apulse wasn't the workaround for upstream firefox now requiring pulseaudio that I thought it was, since apulse wouldn't work with multi-process, and I had to switch back to gentoo's firefox build from sources in order to get the alsa support that upstream had dropped, after all. 
Thus, it wasn't pulseaudio that was the problem with multiprocess, but the fact that firefox had dropped alsa and was forcing pulseaudio on Linux if you wanted audio at all, and the fact that the apulse workaround I thought I had, didn't work with multiprocess. So it was apulse that was the problem, and pulseaudio was only involved because firefox dropping direct alsa support and forcing pulseaudio was what had me installing apulse as an attempted workaround. Meanwhile, my intent with the original mention wasn't that apulse was likely your problem, that's relatively unlikely, but that you might have some /other/ problem, say a not electrolysis-enabled (aka e10s: e, ten letters, s) extension. Back when I posted that, a not e10s-enabled extension was actually quite likely, as e10s was still rather new. It's probably somewhat less so now, and firefox is of course on to the next big change, dropping the old "legacy chrome" extension support, in favor of the newer and generally Chromium-compatible WebExtensions/WE API, with firefox 57, to be released mid-month (Nov 14). But assuming you're still seeing firefox performance issues, I'm still guessing that it's likely to be /something/ forcing single-process, as I /know/ how much of a difference that can make from experience. So I'd definitely check it, and if you're not getting multi-process, the firefox about:support page should show it in the application basics section, multiprocess windows, and if that's working, web content processes, entries. With luck it'll tell you why it's disabled if it is, saying something about incompatible extensions or the like, tho I had to do a bit more troubleshooting to find the problem with apulse. If with multiple firefox windows open you're seeing 2/2 (or higher) in the multiprocess windows entry, and n/4 (the default, here I forced a higher 7) in the web content processes entry, then you're good to go in this regard, and the problem must be elsewhere. 
-- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-11-01 13:31 ` Duncan @ 2017-11-01 23:36 ` Dave 0 siblings, 0 replies; 56+ messages in thread From: Dave @ 2017-11-01 23:36 UTC (permalink / raw) To: Linux fs Btrfs On Wed, Nov 1, 2017 at 9:31 AM, Duncan <1i5t5.duncan@cox.net> wrote: > Dave posted on Tue, 31 Oct 2017 17:47:54 -0400 as excerpted: > >> 6. Make sure Firefox is running in multi-process mode. (Duncan's >> instructions, while greatly appreciated and very useful, left me >> slightly confused about pulseaudio's compatibility with multi-process >> mode.) > > Just to clarify: > > There's no problem with native pulseaudio and firefox multi-process > mode. Thank you for clarifying. And I appreciate your detailed explanation. > Back when I posted that, a not e10s-enabled extension was actually quite > likely, as e10s was still rather new. It's probably somewhat less so > now, and firefox is of course on to the next big change, dropping the old > "legacy chrome" extension support, in favor of the newer and generally > Chromium-compatible WebExtensions/WE API, with firefox 57, to be released > mid-month (Nov 14). I am now running Firefox 57 beta and I'll be doing my testing with that using only WebExtensions. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-09-20 6:38 ` Dave 2017-09-20 11:46 ` Austin S. Hemmelgarn 2017-09-21 11:09 ` Duncan @ 2017-09-21 19:28 ` Sean Greenslade 2 siblings, 0 replies; 56+ messages in thread From: Sean Greenslade @ 2017-09-21 19:28 UTC (permalink / raw) To: Dave, Linux fs Btrfs On September 19, 2017 11:38:13 PM PDT, Dave <davestechshop@gmail.com> wrote: >>On Thu 2017-08-31 (09:05), Ulli Horlacher wrote: > <snip> >Here's my scenario. Some months ago I built an over-the-top powerful >desktop computer / workstation and I was looking forward to really >fantastic performance improvements over my 6 year old Ubuntu machine. >I installed Arch Linux on BTRFS on the new computer (on an SSD). To my >shock, it was no faster than my old machine. I focused a lot on >Firefox performance because I use Firefox a lot and that was one of >the applications in which I was most looking forward to better >performance. > > <snip> > >What would you guys do in this situation? Check out profile sync daemon: https://wiki.archlinux.org/index.php/profile-sync-daemon It keeps the active profile files in a ramfs, periodically syncing them back to disk. It works quite well on my 7 year old netbook. --Sean ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: defragmenting best practice? 2017-09-15 19:10 ` Tomasz Kłoczko 2017-09-20 6:38 ` Dave @ 2017-09-20 7:34 ` Dmitry Kudriavtsev 1 sibling, 0 replies; 56+ messages in thread From: Dmitry Kudriavtsev @ 2017-09-20 7:34 UTC (permalink / raw) To: linux-btrfs I've had a very similar issue with the performance of my laptop dropping to very low levels, eventually solved by uninstalling Snapper, deleting snapshots, and then defragmenting the drive. This seems to be a common concern, I also had it happen on my desktop. Dmitry --- Thank you, Dmitry Kudriavtsev https://dkudriavtsev.xyz inexpensivecomputers.net ⠀⠀⠀⠀⠀⠀⠀⣸⣧⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⣰⣿⣿⣆⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⣀⡙⠿⣿⣿⣆⠀⠀⠀⠀⠀Hey, did you hear about that cool new OS? It's called ⠀⠀⠀⠀⣰⣿⣿⣷⣿⣿⣿⣆⠀⠀⠀⠀Arch Linux. I use Arch Linux. Have you ever used Arch ⠀⠀⠀⣰⣿⣿⣿⡿⢿⣿⣿⣿⣆⠀⠀⠀Linux? You should use Arch Linux. Everyone uses Arch! ⠀⠀⣰⣿⣿⣿⡏⠀⠀⢹⣿⣿⠿⡆⠀⠀Check out i3wm too! ⠀⣰⣿⣿⣿⡿⠇⠀⠀⠸⢿⣿⣷⣦⣄⠀ ⣼⠿⠛⠉⠀⠀⠀⠀⠀⠀⠀⠀⠉⠛⠿⣦ September 19 2017 11:38 PM, "Dave" <davestechshop@gmail.com> wrote: >> On Thu 2017-08-31 (09:05), Ulli Horlacher wrote: >>> When I do a >>> btrfs filesystem defragment -r /directory >>> does it defragment really all files in this directory tree, even if it >>> contains subvolumes? >>> The man page does not mention subvolumes on this topic. >> >> No answer so far :-( >> >> But I found another problem in the man-page: >> >> Defragmenting with Linux kernel versions < 3.9 or >= 3.14-rc2 as well as >> with Linux stable kernel versions >= 3.10.31, >= 3.12.12 or >= 3.13.4 >> will break up the ref-links of COW data (for example files copied with >> cp --reflink, snapshots or de-duplicated data). This may cause >> considerable increase of space usage depending on the broken up >> ref-links. >> >> I am running Ubuntu 16.04 with Linux kernel 4.10 and I have several >> snapshots. >> Therefore, I better should avoid calling "btrfs filesystem defragment -r"? >> >> What is the defragmenting best practice? >> Avoid it completly? 
> > My question is the same as the OP in this thread, so I came here to > read the answers before asking. However, it turns out that I still > need to ask something. Should I ask here or start a new thread? (I'll > assume here, since the topic is the same.) > > Based on the answers here, it sounds like I should not run defrag at > all. However, I have a performance problem I need to solve, so if I > don't defrag, I need to do something else. > > Here's my scenario. Some months ago I built an over-the-top powerful > desktop computer / workstation and I was looking forward to really > fantastic performance improvements over my 6 year old Ubuntu machine. > I installed Arch Linux on BTRFS on the new computer (on an SSD). To my > shock, it was no faster than my old machine. I focused a lot on > Firefox performance because I use Firefox a lot and that was one of > the applications in which I was most looking forward to better > performance. > > I tried everything I could think of and everything recommended to me > in various forums (except switching to Windows) and the performance > remained very disappointing. > > Then today I read the following: > > Gotchas - btrfs Wiki > https://btrfs.wiki.kernel.org/index.php/Gotchas > > Fragmentation: Files with a lot of random writes can become > heavily fragmented (10000+ extents) causing excessive multi-second > spikes of CPU load on systems with an SSD or large amount a RAM. On > desktops this primarily affects application databases (including > Firefox). Workarounds include manually defragmenting your home > directory using btrfs fi defragment. Auto-defragment (mount option > autodefrag) should solve this problem. > > Upon reading that I am wondering if fragmentation in the Firefox > profile is part of my issue. That's one thing I never tested > previously. (BTW, this system has 256 GB of RAM and 20 cores.) > > Furthermore, on the same BTRFS Wiki page, it mentions the performance > penalties of many snapshots. 
I am keeping 30 to 50 snapshots of the > volume that contains the Firefox profile. > > Would these two things be enough to turn top-of-the-line hardware into > a mediocre-performing desktop system? (The system performs fine on > benchmarks -- it's real life usage, particularly with Firefox where it > is disappointing.) > > After reading the info here, I am wondering if I should make a new > subvolume just for my Firefox profile(s) and not use COW and/or not > keep snapshots on it and mount it with the autodefrag option. > > As part of this strategy, I could send snapshots to another disk using > btrfs send-receive. That way I would have the benefits of snapshots > (which are important to me), but by not keeping any snapshots on the > live subvolume I could avoid the performance problems. > > What would you guys do in this situation? > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 56+ messages in thread
end of thread, other threads:[~2017-12-11 6:11 UTC | newest] Thread overview: 56+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-08-31 7:05 btrfs filesystem defragment -r -- does it affect subvolumes? Ulli Horlacher 2017-09-12 16:28 ` defragmenting best practice? Ulli Horlacher 2017-09-12 17:27 ` Austin S. Hemmelgarn 2017-09-14 7:54 ` Duncan 2017-09-14 12:28 ` Austin S. Hemmelgarn 2017-09-14 11:38 ` Kai Krakow 2017-09-14 13:31 ` Tomasz Kłoczko 2017-09-14 15:24 ` Kai Krakow 2017-09-14 15:47 ` Kai Krakow 2017-09-14 17:48 ` Tomasz Kłoczko 2017-09-14 18:53 ` Austin S. Hemmelgarn 2017-09-15 2:26 ` Tomasz Kłoczko 2017-09-15 12:23 ` Austin S. Hemmelgarn 2017-09-14 20:17 ` Kai Krakow 2017-09-15 10:54 ` Michał Sokołowski 2017-09-15 11:13 ` Peter Grandi 2017-09-15 13:07 ` Tomasz Kłoczko 2017-09-15 14:11 ` Michał Sokołowski 2017-09-15 16:35 ` Peter Grandi 2017-09-15 17:08 ` Kai Krakow 2017-09-15 19:10 ` Tomasz Kłoczko 2017-09-20 6:38 ` Dave 2017-09-20 11:46 ` Austin S. Hemmelgarn 2017-09-21 20:10 ` Kai Krakow 2017-09-21 23:30 ` Dave 2017-09-21 23:58 ` Kai Krakow 2017-09-22 11:22 ` Austin S. Hemmelgarn 2017-09-22 20:29 ` Marc Joliet 2017-09-21 11:09 ` Duncan 2017-10-31 21:47 ` Dave 2017-10-31 23:06 ` Peter Grandi 2017-11-01 0:37 ` Dave 2017-11-01 12:21 ` Austin S. Hemmelgarn 2017-11-02 1:39 ` Dave 2017-11-02 11:07 ` Austin S. Hemmelgarn 2017-11-03 2:59 ` Dave 2017-11-03 7:12 ` Kai Krakow 2017-11-03 5:58 ` Marat Khalili 2017-11-03 7:19 ` Kai Krakow 2017-11-01 17:48 ` Peter Grandi 2017-11-02 0:09 ` Dave 2017-11-02 11:17 ` Austin S. Hemmelgarn 2017-11-02 18:09 ` Dave 2017-11-02 18:37 ` Austin S. Hemmelgarn 2017-11-02 0:43 ` Peter Grandi 2017-11-02 21:16 ` Kai Krakow 2017-11-03 2:47 ` Dave 2017-11-03 7:26 ` Kai Krakow 2017-11-03 11:30 ` Austin S. 
Hemmelgarn [not found] ` <CAH=dxU47-52-asM5vJ_-qOpEpjZczHw7vQzgi1-TeKm58++zBQ@mail.gmail.com> 2017-12-11 5:18 ` Dave 2017-12-11 6:10 ` Timofey Titovets 2017-11-01 7:43 ` Sean Greenslade 2017-11-01 13:31 ` Duncan 2017-11-01 23:36 ` Dave 2017-09-21 19:28 ` Sean Greenslade 2017-09-20 7:34 ` Dmitry Kudriavtsev