From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: "Niccolò Belli" <darkbasic@linuxsystems.it>
Cc: David Sterba <dsterba@suse.cz>, linux-btrfs@vger.kernel.org
Subject: Re: Any chance to get snapshot-aware defragmentation?
Date: Mon, 21 May 2018 09:15:46 -0400
Message-ID: <ebc2865d-2784-0be7-79e2-fe56b624baa5@gmail.com>
In-Reply-To: <b8ac31e3-7c9b-44a7-a286-08f1642e25c6@linuxsystems.it>

On 2018-05-19 04:54, Niccolò Belli wrote:
> On Friday, 18 May 2018 20:33:53 CEST, Austin S. Hemmelgarn wrote:
>> With a bit of work, it's possible to handle things sanely.  You can 
>> deduplicate data from snapshots, even if they are read-only (you need 
>> to pass the `-A` option to duperemove and run it as root), so it's 
>> perfectly reasonable to only defrag the main subvolume, and then 
>> deduplicate the snapshots against that (so that they end up all being 
>> reflinks to the main subvolume).  Of course, this won't work if you're 
>> short on space, but if you're dealing with snapshots, you should have 
>> enough space that this will work (because even without defrag, it's 
>> fully possible for something to cause the snapshots to suddenly take 
>> up a lot more space).
> 
> Been there, tried that. Unfortunately, even if I skip the defrag, a simple
> 
> duperemove -drhA --dedupe-options=noblock --hashfile=rootfs.hash rootfs
> 
> is going to eat more space than was previously available (probably 
> due to autodefrag?).
It's not autodefrag (that doesn't trigger on use of the EXTENT_SAME 
ioctl).  There are two things involved here:

* BTRFS has somewhat odd and inefficient handling of partial extents. 
When part of an extent becomes unused (because of a CLONE ioctl, or an 
EXTENT_SAME ioctl, or something similar), that part stays allocated 
until the whole extent is unused.
* You're using the default deduplication block size (128k), which is 
larger than your filesystem block size (which is at most 64k, most 
likely 16k, but might be 4k if it's an old filesystem), so deduplicating 
can split extents.

Because of this, if a duplicate region happens to overlap the front of 
an already shared extent, and the end of said shared extent isn't 
aligned with the deduplication block size, the EXTENT_SAME call will 
deduplicate the first part, creating a new shared extent, but not the 
tail end of the existing shared region.  All of that original shared 
extent then sticks around, so you end up using more space than before.

Additionally, if only part of an extent is duplicated, then that area of 
the extent will stay allocated, because the rest of the extent is still 
referenced (so you won't necessarily see any actual space savings).
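To make the partial-extent behavior concrete, here is a rough toy model in Python.  It is only a sketch of the rule described above, not how BTRFS actually tracks extents: the `Extent` class, the `dedupe_front` helper, and the 192k extent size are all made up for illustration.

```python
KIB = 1024

class Extent:
    """Toy extent: stays fully allocated while any range of it is referenced."""

    def __init__(self, length):
        self.length = length
        self.refs = []  # (offset, length) ranges still referenced by files

    def allocated(self):
        # The rule described above: nothing is freed until no references remain.
        return self.length if self.refs else 0

    def referenced(self):
        # Bytes covered by at least one reference (overlaps counted once).
        covered, seen_end = 0, 0
        for off, ln in sorted(self.refs):
            start, end = max(off, seen_end), off + ln
            if end > start:
                covered += end - start
                seen_end = end
        return covered


def dedupe_front(extent, ref, block):
    """Repoint the first `block` bytes of `ref` elsewhere; keep the tail here."""
    off, ln = ref
    extent.refs.remove(ref)
    if ln > block:
        extent.refs.append((off + block, ln - block))


# Hypothetical case: one file references a whole 192k extent, and a 128k
# dedupe block replaces only the front of it.
old = Extent(192 * KIB)
old.refs.append((0, 192 * KIB))
dedupe_front(old, (0, 192 * KIB), 128 * KIB)

pinned = old.allocated() - old.referenced()
print(f"allocated={old.allocated() // KIB}K "
      f"referenced={old.referenced() // KIB}K pinned={pinned // KIB}K")
```

In this model the 64k tail keeps the whole 192k extent allocated, so the 128k that was "deduplicated" frees nothing until the tail reference goes away too.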

You can mitigate this by telling duperemove to use the same block size 
as your filesystem using the `-b` option.  Note that using a smaller 
block size will also slow down the deduplication process and greatly 
increase the size of the hash file.
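For a rough sense of the cost, the number of blocks that have to be hashed grows in inverse proportion to the block size.  The Python sketch below only counts blocks; the 100 GiB data size is a made-up example, `blocks_to_hash` is a hypothetical helper, and the real hash file size depends on how duperemove stores each record, which is not modeled here.

```python
KIB = 1024
GIB = 1024 ** 3

def blocks_to_hash(data_bytes, block_size):
    """Number of fixed-size blocks needed to cover data_bytes (rounded up)."""
    return -(-data_bytes // block_size)

data = 100 * GIB  # hypothetical amount of data being scanned
baseline = blocks_to_hash(data, 128 * KIB)

for block_size in (128 * KIB, 64 * KIB, 16 * KIB, 4 * KIB):
    count = blocks_to_hash(data, block_size)
    print(f"{block_size // KIB:>4}k blocks: {count:>10} "
          f"({count // baseline}x the 128k count)")
```

Going from 128k to 4k blocks means 32 times as many blocks to hash and record, which is where both the slowdown and the hash file growth come from.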
