From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: linux-btrfs@vger.kernel.org
Subject: Re: defragmenting best practice?
Date: Fri, 3 Nov 2017 07:30:23 -0400
Message-ID: <5f81ce65-614c-0e45-f1f3-0e3ec5beb5a9@gmail.com>
In-Reply-To: <20171103082624.5a2e12e9@jupiter.sol.kaishome.de>

On 2017-11-03 03:26, Kai Krakow wrote:
> Am Thu, 2 Nov 2017 22:47:31 -0400
> schrieb Dave <davestechshop@gmail.com>:
> 
>> On Thu, Nov 2, 2017 at 5:16 PM, Kai Krakow <hurikhan77@gmail.com>
>> wrote:
>>
>>>
>>> You may want to try btrfs autodefrag mount option and see if it
>>> improves things (tho, the effect may take days or weeks to apply if
>>> you didn't enable it right from the creation of the filesystem).
>>>
>>> Also, autodefrag will probably unshare reflinks on your snapshots.
>>> You may be able to use bees[1] to work against this effect. Its
>>> interaction with autodefrag is not well tested but it works fine
>>> for me. Also, bees is able to reduce some of the fragmentation
>>> during deduplication because it will rewrite extents back into
>>> bigger chunks (but only for duplicated data).
>>>
>>> [1]: https://github.com/Zygo/bees
>>
>> I will look into bees. And yes, I plan to try autodefrag. (I already
>> have it enabled now.) However, I need to understand something about
>> how btrfs send-receive works in regard to reflinks and fragmentation.
>>
>> Say I have 2 snapshots on my live volume. The earlier one of them has
>> already been sent to another block device by btrfs send-receive (full
>> backup). Now defrag runs on the live volume and breaks some percentage
>> of the reflinks. At this point I do an incremental btrfs send-receive
>> using "-p" (or "-c") with the diff going to the same other block
>> device where the prior snapshot was already sent.
>>
>> Will reflinks be "made whole" (restored) on the receiving block
>> device? Or is the state of the source volume replicated so closely
>> that reflink status is the same on the target?
>>
>> Also, is fragmentation reduced on the receiving block device?
>>
>> My expectation is that fragmentation would be reduced and duplication
>> would be reduced too. In other words, does send-receive result in
>> defragmentation and deduplication too?
> 
> As far as I understand, btrfs send/receive doesn't create an exact
> mirror. It just replays the block operations between generation
> numbers. That is: If it finds new blocks referenced between
> generations, it will write a _new_ block to the destination.
That is mostly correct, except that it's not a block-level copy.  Heavily
simplified, send/receive recreates the subvolume using nothing more than
basic file manipulation syscalls (write(), chown(), chmod(), etc.), the
clone ioctl, and some extra logic to figure out the correct location to
clone from.  In other words, it's functionally equivalent to using rsync
to copy the data and then deduplicating, albeit a bit smarter about when
to deduplicate (and more efficient in that respect).
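
To make the description above concrete, here is a minimal Python sketch
of a receiver replaying a list of operations.  It is a toy in-memory
model only: ToyReceiver, replay() and the tuple layout of ops are
hypothetical names invented for illustration, and none of this is the
real send stream format or the btrfs-progs implementation.  Where the
toy copies bytes in clone(), a real receive would ask the filesystem to
share the extents instead.

```python
from dataclasses import dataclass, field


@dataclass
class ToyReceiver:
    """In-memory stand-in for the receiving subvolume (hypothetical)."""
    files: dict = field(default_factory=dict)   # path -> bytearray
    modes: dict = field(default_factory=dict)   # path -> permission bits
    shared_bytes: int = 0                       # bytes cloned, not re-sent

    def write(self, path, offset, data):
        # Plain data write, like write() at a given offset.
        buf = self.files.setdefault(path, bytearray())
        if len(buf) < offset:
            buf.extend(b"\0" * (offset - len(buf)))
        buf[offset:offset + len(data)] = data

    def chmod(self, path, mode):
        self.modes[path] = mode

    def clone(self, src, src_off, length, dst, dst_off):
        # The toy copies the bytes; a real receive would use the clone
        # ioctl here so that both files share the same extents.
        data = bytes(self.files[src][src_off:src_off + length])
        self.write(dst, dst_off, data)
        self.shared_bytes += length


def replay(receiver, ops):
    """Apply each (operation name, *arguments) tuple in order."""
    for name, *args in ops:
        getattr(receiver, name)(*args)
    return receiver


ops = [
    ("write", "a.bin", 0, b"hello world"),
    ("chmod", "a.bin", 0o644),
    ("clone", "a.bin", 0, 5, "b.bin", 0),   # reuse data already received
    ("write", "b.bin", 5, b"!"),            # only the changed part is sent
]
rx = replay(ToyReceiver(), ops)
```

The point of the sketch is that only the two write operations carry
data; the clone operation refers back to bytes the receiver already has.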
> 
> So, no, it won't reduce fragmentation or duplication. It just keeps
> reflinks intact as long as such extents weren't touched within the
> generation range. Otherwise they are rewritten as new extents.
A received subvolume will almost always be less fragmented than the 
source, since everything is received serially, and each file is written 
out one at a time.
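
A rough way to see why serial writes help is a toy allocator that hands
out the next free block for every write.  This is a sketch under strong
assumptions: allocate() and count_extents() are hypothetical helpers,
and the real btrfs allocator (delayed allocation, CoW, free space
handling) is far more involved than a counter.

```python
def count_extents(blocks):
    """Count runs of consecutive block numbers in a file's block list."""
    if not blocks:
        return 0
    breaks = sum(1 for prev, cur in zip(blocks, blocks[1:]) if cur != prev + 1)
    return 1 + breaks


def allocate(write_order):
    """Give each write the next free block; return file name -> block list."""
    layout = {}
    for next_block, name in enumerate(write_order):
        layout.setdefault(name, []).append(next_block)
    return layout


# Source side: three files grow concurrently, so their blocks interleave.
interleaved = allocate(["a", "b", "c"] * 4)
# Receive side: each file is written out completely before the next one.
serial = allocate(["a"] * 4 + ["b"] * 4 + ["c"] * 4)

source_extents = {n: count_extents(b) for n, b in interleaved.items()}
received_extents = {n: count_extents(b) for n, b in serial.items()}
```

In this model every interleaved file ends up in four separate extents,
while every serially written file occupies a single one.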
> 
> Autodefrag and deduplication processes will as such probably increase
> duplication at the destination. A developer may have a better clue, tho.
In theory, yes, but in practice, not so much.  Autodefrag generally
operates on very small blocks of data (64k IIRC), and I'm pretty sure it
has some heuristic that only triggers on small random writes.  So,
depending on the workload, it may not trigger much (for example, it
often won't trigger on cache directories, since those almost never have
files rewritten in place).
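
The kind of heuristic described above could look roughly like the
following predicate.  This is purely illustrative and is not the kernel
logic: would_flag_for_defrag() is a hypothetical function, and the
64 KiB limit is simply the "64k IIRC" figure from this message, taken as
an assumption.

```python
SMALL_WRITE_LIMIT = 64 * 1024  # assumption: the "64k IIRC" figure above


def would_flag_for_defrag(file_size, offset, length, limit=SMALL_WRITE_LIMIT):
    """Toy predicate: a small write that lands inside existing data."""
    is_small = length < limit
    overwrites_existing = offset < file_size
    return is_small and overwrites_existing


# A 4 KiB update in the middle of a 1 GiB database file.
db_update = would_flag_for_defrag(file_size=1 << 30, offset=512 << 20, length=4096)
# A cache file written once from scratch: nothing is rewritten in place.
cache_write = would_flag_for_defrag(file_size=0, offset=0, length=4096)
# A large sequential rewrite: the write is not small.
big_rewrite = would_flag_for_defrag(file_size=1 << 30, offset=0, length=8 << 20)
```

Under these assumptions only the database-style update is flagged,
which matches the cache directory example: files that are written once
and never rewritten in place give such a heuristic nothing to act on.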
