Defragmenting to recover wasted space

* Defragmenting to recover wasted space
@ 2019-11-07 14:03 Nate Eldredge
  2019-11-07 16:50 ` Remi Gauvin
  0 siblings, 1 reply; 6+ messages in thread
From: Nate Eldredge @ 2019-11-07 14:03 UTC (permalink / raw)
  To: linux-btrfs

I had a confusing issue on a btrfs filesystem, where the amount of space 
used according to `df', `btrfs fi usage', etc, was about 50% higher than 
the total reported by `du' or `btrfs fi du', about 185 GB vs 125 GB, 
meaning that about 60 GB was somehow wasted.  I ruled out all the usual 
suspects (deleted files still open, files under mount points, etc) and 
eventually fixed the issue by doing `btrfs fi defrag` on a directory 
containing a few big files (Virtualbox disk images).

This is on Ubuntu 19.04, currently using kernel 5.0.0-32.

So everything is good now, but I have questions:

1. What causes this?  I saw some references to "unused extents" but it 
wasn't clear how that happens, or why they wouldn't be freed through 
normal operation.  Are there certain usage patterns that exacerbate it?

2. Is this documented?  I didn't see it mentioned anywhere in the 
documentation, and defragmenting was just a random thing to try, based on 
a few hints in various blogs and mailing lists.  Luckily it worked, but 
otherwise I'm not sure how I could have known that defragmenting was the 
solution.

3. Is this reasonable?  With all the other filesystems I've used, space 
that isn't occupied by your files is available for use, minus a reasonable 
amount of overhead for metadata etc, without needing any special 
administrative chores.  Should I take it that I can't expect this from 
btrfs, and I have to plan to defragment occasionally to keep the disk from 
filling up?

4. If this is not normal, and if I'm able to reproduce it, what 
information should I gather for a bug report?

5. Is there a better way to detect this kind of wastage, to distinguish it 
from more mundane causes (deleted files still open, etc) and see how much 
space could be recovered? In particular, is there a way to tell which 
files are most affected, so that I can just defragment those?

Thanks very much for any information or pointers.

Here is info about the filesystem, if it matters.  This is from after the 
defrag.  It has two subvolumes and no snapshots.

# uname -a
Linux moneta 5.0.0-32-generic #34-Ubuntu SMP Wed Oct 2 02:06:48 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
# btrfs --version
btrfs-progs v4.20.2 
# btrfs fi show /
Label: none  uuid: [xxx]
 	Total devices 1 FS bytes used 127.83GiB
 	devid    1 size 227.29GiB used 197.02GiB path /dev/mapper/nvme0n1p3_crypt

# btrfs fi df /
Data, single: total=194.01GiB, used=127.03GiB
System, single: total=4.00MiB, used=48.00KiB
Metadata, single: total=3.01GiB, used=817.80MiB
GlobalReserve, single: total=182.75MiB, used=0.00B

Prior to the defrag, the `used=` number in `btrfs fi df` was about 185 
GiB.

-- 
Nate Eldredge
nate@thatsmathematics.com

^ permalink raw reply	[flat|nested] 6+ messages in thread