* Defragmenting to recover wasted space
@ 2019-11-07 14:03 Nate Eldredge
  2019-11-07 16:50 ` Remi Gauvin
  0 siblings, 1 reply; 6+ messages in thread
From: Nate Eldredge @ 2019-11-07 14:03 UTC
  To: linux-btrfs

I had a confusing issue on a btrfs filesystem, where the amount of space
used according to `df', `btrfs fi usage', etc, was about 50% higher than
the total reported by `du' or `btrfs fi du', about 185 GB vs 125 GB,
meaning that about 60 GB was somehow wasted.

I ruled out all the usual suspects (deleted files still open, files under
mount points, etc) and eventually fixed the issue by doing
`btrfs fi defrag' on a directory containing a few big files (Virtualbox
disk images). This is on Ubuntu 19.04, currently using kernel 5.0.0-32.

So everything is good now, but I have questions:

1. What causes this? I saw some references to "unused extents" but it
wasn't clear how that happens, or why they wouldn't be freed through
normal operation. Are there certain usage patterns that exacerbate it?

2. Is this documented? I didn't see it mentioned anywhere in the
documentation, and defragmenting was just a random thing to try, based on
a few hints in various blogs and mailing lists. Luckily it worked, but
otherwise I'm not sure how I could have known that defragmenting was the
solution.

3. Is this reasonable? With all the other filesystems I've used, space
that isn't occupied by your files is available for use, minus a
reasonable amount of overhead for metadata etc, without needing any
special administrative chores. Should I take it that I can't expect this
from btrfs, and I have to plan to defragment occasionally to keep the
disk from filling up?

4. If this is not normal, and if I'm able to reproduce it, what
information should I gather for a bug report?

5. Is there a better way to detect this kind of wastage, to distinguish
it from more mundane causes (deleted files still open, etc) and see how
much space could be recovered? In particular, is there a way to tell
which files are most affected, so that I can just defragment those?

Thanks very much for any information or pointers.

Here is info about the filesystem, if it matters. This is from after the
defrag. It has two subvolumes and no snapshots.

# uname -a
Linux moneta 5.0.0-32-generic #34-Ubuntu SMP Wed Oct 2 02:06:48 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
# btrfs --version
btrfs-progs v4.20.2
# btrfs fi show /
Label: none  uuid: [xxx]
        Total devices 1 FS bytes used 127.83GiB
        devid    1 size 227.29GiB used 197.02GiB path /dev/mapper/nvme0n1p3_crypt
# btrfs fi df /
Data, single: total=194.01GiB, used=127.03GiB
System, single: total=4.00MiB, used=48.00KiB
Metadata, single: total=3.01GiB, used=817.80MiB
GlobalReserve, single: total=182.75MiB, used=0.00B

Prior to the defrag, the `used=' number in `btrfs fi df' was about
185 GiB.

-- 
Nate Eldredge
nate@thatsmathematics.com

* Re: Defragmenting to recover wasted space
  2019-11-07 14:03 Defragmenting to recover wasted space Nate Eldredge
@ 2019-11-07 16:50 ` Remi Gauvin
  2019-11-07 19:41   ` Nate Eldredge
  0 siblings, 1 reply; 6+ messages in thread
From: Remi Gauvin @ 2019-11-07 16:50 UTC
  To: Nate Eldredge

On 2019-11-07 9:03 a.m., Nate Eldredge wrote:

> 1. What causes this? I saw some references to "unused extents" but it
> wasn't clear how that happens, or why they wouldn't be freed through
> normal operation. Are there certain usage patterns that exacerbate it?

VirtualBox image files are subject to many, many small writes (just
booting Windows, for example, can create well over 5000 file fragments).
When the image file is new, the extents will be very large. In BTRFS,
extents are immutable. When a small write creates a new 4K COW extent,
the old 4K remains part of the old extent as well. This situation
persists until all the data in the old extent has been rewritten. Only
when none of that data is referenced anymore is the extent freed.

> 5. Is there a better way to detect this kind of wastage, to distinguish
> it from more mundane causes (deleted files still open, etc) and see how
> much space could be recovered? In particular, is there a way to tell
> which files are most affected, so that I can just defragment those?

Generally speaking, files that are subject to many random writes are
few, and you should be well aware of the larger ones where this might be
an issue (virtual image files, large databases, etc.). These files
should be defragmented frequently. I don't see any reason not to run
defrag over the whole subvolume, but if you want to search for files
with absurd numbers of fragments, you can always use the find command to
search for files, run the filefrag command on them, then use whatever
tools you like to search the output for files with thousands of
fragments.
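
Remi's find/filefrag suggestion could be sketched roughly as below. This is
only an illustration, not something posted in the thread: the function name
rank_by_extents, the example path and the 100M size cutoff are made up, and it
assumes filefrag prints its usual one-line summary ("PATH: N extents found").
Note also that a high fragment count is only a hint that a file may be pinning
old extents; it does not measure how much space defragmenting would recover.

```shell
#!/bin/sh
# Read filefrag summary lines ("PATH: N extents found") on stdin and
# print "N<TAB>PATH", most fragmented first. Other lines are ignored.
# rank_by_extents is a hypothetical helper name, not an existing tool.
rank_by_extents() {
    awk '/: [0-9]+ extents? found$/ {
        n = $(NF - 2)                        # extent count: third field from the end
        sub(/: [0-9]+ extents? found$/, "")  # strip the suffix, leaving only the path
        printf "%d\t%s\n", n, $0
    }' | sort -rn
}

# Example use (path and size threshold are placeholders):
#   find /path/to/images -xdev -type f -size +100M -exec filefrag {} + \
#       | rank_by_extents | head -n 20
```

The files at the top of the list are the natural candidates for a targeted
`btrfs fi defrag', instead of defragmenting the whole subvolume.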

* Re: Defragmenting to recover wasted space
  2019-11-07 16:50 ` Remi Gauvin
@ 2019-11-07 19:41   ` Nate Eldredge
  2019-11-08  8:01     ` Qu Wenruo
  0 siblings, 1 reply; 6+ messages in thread
From: Nate Eldredge @ 2019-11-07 19:41 UTC
  To: Remi Gauvin; +Cc: linux-btrfs

On Thu, 7 Nov 2019, Remi Gauvin wrote:

> On 2019-11-07 9:03 a.m., Nate Eldredge wrote:
>
>> 1. What causes this? I saw some references to "unused extents" but it
>> wasn't clear how that happens, or why they wouldn't be freed through
>> normal operation. Are there certain usage patterns that exacerbate it?
>
> VirtualBox image files are subject to many, many small writes (just
> booting Windows, for example, can create well over 5000 file fragments).
> When the image file is new, the extents will be very large. In BTRFS,
> extents are immutable. When a small write creates a new 4K COW extent,
> the old 4K remains part of the old extent as well. This situation
> persists until all the data in the old extent has been rewritten. Only
> when none of that data is referenced anymore is the extent freed.

Thanks, Remi. This is very helpful in understanding what is going on.
In particular, I didn't realize that extents are immutable even when
there is only one reference to them (I have no snapshots or reflinks to
these files).

I guess this also means that in the worst case, if I want to overwrite
the entire file "in place" in a random order, I actually need additional
free space equal to the file's size, until I get around to defragging.
That's rather counterintuitive for somebody used to traditional
filesystems.

>> 5. Is there a better way to detect this kind of wastage, to distinguish
>> it from more mundane causes (deleted files still open, etc) and see how
>> much space could be recovered? In particular, is there a way to tell
>> which files are most affected, so that I can just defragment those?
>
> Generally speaking, files that are subject to many random writes are
> few, and you should be well aware of the larger ones where this might be
> an issue (virtual image files, large databases, etc.). These files
> should be defragmented frequently. I don't see any reason not to run
> defrag over the whole subvolume, but if you want to search for files
> with absurd numbers of fragments, you can always use the find command to
> search for files, run the filefrag command on them, then use whatever
> tools you like to search the output for files with thousands of
> fragments.

Okay. Defragmenting is kind of inconvenient, though, and I suppose it
involves some extra wear on the SSD since data is really being moved.
There's also the issue, as I understand it, that defragmenting will
break up existing reflinks, which in some other situations I may really
want to keep.

In fact, it seems that somehow what I really want is for the file to be
*completely* fragmented, so that every write replaces an extent and
frees the old one. On an SSD I don't really care if the data blocks are
actually contiguous. It seems perverse, but even if there is more
overhead, it might be worth it when I don't have a lot of free space to
spare. I don't suppose there is any way to arrange that?

Thanks again!

-- 
Nate Eldredge
nate@thatsmathematics.com

* Re: Defragmenting to recover wasted space
  2019-11-07 19:41   ` Nate Eldredge
@ 2019-11-08  8:01     ` Qu Wenruo
  2019-11-08 15:24       ` Nate Eldredge
  0 siblings, 1 reply; 6+ messages in thread
From: Qu Wenruo @ 2019-11-08 8:01 UTC
  To: Nate Eldredge, Remi Gauvin; +Cc: linux-btrfs

On 2019/11/8 3:41 AM, Nate Eldredge wrote:
> On Thu, 7 Nov 2019, Remi Gauvin wrote:
>
>> On 2019-11-07 9:03 a.m., Nate Eldredge wrote:
>>
>>> 1. What causes this? I saw some references to "unused extents" but it
>>> wasn't clear how that happens, or why they wouldn't be freed through
>>> normal operation. Are there certain usage patterns that exacerbate it?
>>
>> VirtualBox image files are subject to many, many small writes (just
>> booting Windows, for example, can create well over 5000 file fragments).
>> When the image file is new, the extents will be very large. In BTRFS,
>> extents are immutable. When a small write creates a new 4K COW extent,
>> the old 4K remains part of the old extent as well. This situation
>> persists until all the data in the old extent has been rewritten. Only
>> when none of that data is referenced anymore is the extent freed.
>
> Thanks, Remi. This is very helpful in understanding what is going on.
> In particular, I didn't realize that extents are immutable even when
> there is only one reference to them (I have no snapshots or reflinks to
> these files).
>
> I guess this also means that in the worst case, if I want to overwrite
> the entire file "in place" in a random order, I actually need additional
> free space equal to the file's size, until I get around to defragging.
> That's rather counterintuitive for somebody used to traditional
> filesystems.
>
>>> 5. Is there a better way to detect this kind of wastage, to distinguish
>>> it from more mundane causes (deleted files still open, etc) and see how
>>> much space could be recovered? In particular, is there a way to tell
>>> which files are most affected, so that I can just defragment those?
>>
>> Generally speaking, files that are subject to many random writes are
>> few, and you should be well aware of the larger ones where this might be
>> an issue (virtual image files, large databases, etc.). These files
>> should be defragmented frequently. I don't see any reason not to run
>> defrag over the whole subvolume, but if you want to search for files
>> with absurd numbers of fragments, you can always use the find command to
>> search for files, run the filefrag command on them, then use whatever
>> tools you like to search the output for files with thousands of
>> fragments.
>
> Okay. Defragmenting is kind of inconvenient, though, and I suppose it
> involves some extra wear on the SSD since data is really being moved.
> There's also the issue, as I understand it, that defragmenting will
> break up existing reflinks, which in some other situations I may really
> want to keep.
>
> In fact, it seems that somehow what I really want is for the file to be
> *completely* fragmented, so that every write replaces an extent and
> frees the old one. On an SSD I don't really care if the data blocks are
> actually contiguous. It seems perverse, but even if there is more
> overhead, it might be worth it when I don't have a lot of free space to
> spare. I don't suppose there is any way to arrange that?

In fact, you can just go nodatacow.
Furthermore, the nodatacow attr can be applied to a directory so that any
newer file will just inherit the nodatacow attr.

In that case, any overwrite will not be COWed (as long as there is no
snapshot for it), thus no space is wasted.

Thanks,
Qu

>
> Thanks again!
>
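
One way the per-directory nodatacow setup Qu describes might look in practice
is sketched below. The thread does not name the commands; this assumes the
attribute is set with `chattr +C' and inspected with `lsattr', and that a GNU
cp recent enough to accept --reflink=never is available. The helper names
(run, make_nocow_copy) and all paths are hypothetical. The copy step is there
because the attribute is only reliably honoured for files that are new or
empty when they receive it, so an existing image has to be recreated inside
the directory rather than flagged in place.

```shell
#!/bin/sh
# run CMD...: execute the command, or only print it when DRY_RUN=1.
# (Hypothetical wrapper so the sketch can be previewed without side effects.)
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# make_nocow_copy SRC DIR: create DIR, set the No_COW attribute on it, and
# copy SRC into it so that the new copy inherits the attribute.
make_nocow_copy() {
    src=$1    # existing file, e.g. a VM disk image
    dir=$2    # directory that will carry the C attribute
    run mkdir -p "$dir" &&
    run chattr +C "$dir" &&                  # files created here later inherit No_COW
    run cp --reflink=never "$src" "$dir/" && # force a real data copy, not a reflink
    run lsattr -d "$dir"                     # show the attributes of the directory itself
}

# Preview only (placeholder paths):
#   DRY_RUN=1; make_nocow_copy /vm/disk.vdi /vm/nocow
```

As Qu notes, a snapshot of such files still forces COW for the first overwrite
of each block, and the trade-offs raised later in the thread (no checksum
verification, for example) apply to everything stored this way.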

* Re: Defragmenting to recover wasted space
  2019-11-08  8:01     ` Qu Wenruo
@ 2019-11-08 15:24       ` Nate Eldredge
  2019-11-08 15:53         ` Remi Gauvin
  0 siblings, 1 reply; 6+ messages in thread
From: Nate Eldredge @ 2019-11-08 15:24 UTC
  To: Qu Wenruo; +Cc: Remi Gauvin, linux-btrfs

On Fri, 8 Nov 2019, Qu Wenruo wrote:

> In fact, you can just go nodatacow.
> Furthermore, nodatacow attr can be applied to a directory so that any
> newer file will just inherit the nodatacow attr.
>
> In that case, any overwrite will not be COWed (as long as there is no
> snapshot for it), thus no space wasted.

Aha, I didn't know about that feature. Thanks, that is exactly what I
want.

-- 
Nate Eldredge
nate@thatsmathematics.com

* Re: Defragmenting to recover wasted space
  2019-11-08 15:24       ` Nate Eldredge
@ 2019-11-08 15:53         ` Remi Gauvin
  0 siblings, 0 replies; 6+ messages in thread
From: Remi Gauvin @ 2019-11-08 15:53 UTC
  To: Nate Eldredge; +Cc: linux-btrfs

On 2019-11-08 10:24 a.m., Nate Eldredge wrote:
> On Fri, 8 Nov 2019, Qu Wenruo wrote:
>
>> In fact, you can just go nodatacow.
>> Furthermore, nodatacow attr can be applied to a directory so that any
>> newer file will just inherit the nodatacow attr.
>>
>> In that case, any overwrite will not be COWed (as long as there is no
>> snapshot for it), thus no space wasted.
>
> Aha, I didn't know about that feature. Thanks, that is exactly what I
> want.
>

I would advise caution with this approach. With nodatacow you give up
all of the features that would make you want to use BTRFS in the first
place (no checksum verification, for example).

And if used in conjunction with BTRFS RAID, BTRFS behavior is, in terms
of RAID, outright psychotic. In case of an unclean shutdown while data
was being written, the RAID copies will be inconsistent, and BTRFS will
never synchronize them (short of a full re-balance). What data gets read
will just randomly depend on which device BTRFS is reading from.

If you would rather forgo the benefits of BTRFS for better performance
or to avoid fragmentation issues, why not carve out an XFS / EXT4
partition?