From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: linux-btrfs@vger.kernel.org
Subject: Re: Add device while rebalancing
Date: Mon, 25 Apr 2016 09:02:04 -0400
Message-ID: <571E154C.9060604@gmail.com>
In-Reply-To: <pan$193a9$b381fd2b$1c0f329f$7b14e34b@cox.net>

On 2016-04-25 08:43, Duncan wrote:
> Austin S. Hemmelgarn posted on Mon, 25 Apr 2016 07:18:10 -0400 as
> excerpted:
>
>> On 2016-04-23 01:38, Duncan wrote:
>>>
>>> And again with snapshotting operations. Making a snapshot is normally
>>> nearly instantaneous, but there's a scaling issue if you have too many
>>> per filesystem (try to keep it under 2000 snapshots per filesystem
>>> total, if possible, and definitely keep it under 10K or some operations
>>> will slow down substantially), and deleting snapshots is more work, so
>>> while you should ordinarily automatically thin down snapshots if you're
>>> automatically making them quite frequently (say daily or more
>>> frequently), you may want to put the snapshot deletion, at least, on
>>> hold while you scrub or balance or device delete or replace.
>
>> I would actually recommend putting all snapshot operations on hold, as
>> well as most writes to the filesystem, while doing a balance or device
>> deletion. The more writes you have while doing those, the longer they
>> take, and the less likely that you end up with a good on-disk layout of
>> the data.
>
> The thing with snapshot writing is that all snapshot creation effectively
> does is a bit of metadata writing. What snapshots primarily do is lock
> existing extents in place (down within their chunk, with the higher chunk
> level being the scope at which balance works), that would otherwise be
> COWed elsewhere with the existing extent deleted on change, or simply
> deleted on file delete. A snapshot simply adds a reference to the
> current version, so that deletion, either directly or from the COW, never
> happens, and to do that simply requires a relatively small metadata write.
Unless I'm mistaken about the internals of BTRFS (which might be the
case), creating a snapshot has to update reference counts on every
single extent in every single file in the snapshot. For something small
this isn't much, but if you are snapshotting something big (say, an
entire system with all the data in one subvolume), it can amount to
multiple MB of writes, and it gets even worse if you have no shared
extents to begin with (which is still pretty typical). On some of the
systems I manage at work, snapshotting a terabyte of data can result in
10-20 MB of writes to disk (that figure came from a partition
containing mostly small files that were just big enough that they
didn't fit inline in the metadata blocks). This is of course still
significantly faster than copying everything, but it's not free either.
>
> So while I agree in general that more writes means balances taking
> longer, snapshot creation writes are pretty tiny in the scheme of things,
> and won't affect the balance much, compared to larger writes you'll very
> possibly still be doing unless you really do suspend pretty much all
> write operations to that filesystem during the balance.
In general, yes, except for the case of running with mostly full
metadata chunks, where a snapshot might trigger a further chunk
allocation, which in turn can throw off the balanced layout. Balance
always allocates new chunks and doesn't write into existing ones, so if
you're writing enough to allocate a new chunk while a balance is
happening:
1. That chunk may or may not get considered by the balance code (I'm not
100% certain about this, but I believe it will be ignored by any balance
running at the time it gets allocated).
2. You run the risk of ending up with a nearly empty chunk whose
contents could have been packed into another existing chunk.
Snapshots are not likely to trigger this, but it is still possible,
especially if you're taking lots of snapshots in a short period of time.
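
One way to gauge whether a filesystem is in the "mostly full metadata
chunks" case before starting a balance is to compare the allocated and
used metadata figures reported by `btrfs filesystem df`. The Python below
is only a sketch under stated assumptions: the sample text is made up to
resemble that command's output rather than captured from a real system,
and the function name and the 90% threshold are arbitrary choices for
illustration.

```python
import re

# Binary unit multipliers, as printed in `btrfs filesystem df` output.
_UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}

# Matches lines such as "Metadata, RAID1: total=4.00GiB, used=3.80GiB".
_LINE = re.compile(
    r"^(?P<kind>\w+), (?P<profile>\w+): "
    r"total=(?P<total>[\d.]+)(?P<tu>[KMGT]iB|B), "
    r"used=(?P<used>[\d.]+)(?P<uu>[KMGT]iB|B)\s*$"
)


def metadata_fullness(df_output):
    """Return used/total for the first Metadata line, or None if absent."""
    for line in df_output.splitlines():
        m = _LINE.match(line.strip())
        if m and m.group("kind") == "Metadata":
            total = float(m.group("total")) * _UNITS[m.group("tu")]
            used = float(m.group("used")) * _UNITS[m.group("uu")]
            return used / total if total else None
    return None


# Illustrative sample only (hypothetical numbers, not real measurements).
SAMPLE = """\
Data, RAID1: total=800.00GiB, used=750.00GiB
System, RAID1: total=32.00MiB, used=128.00KiB
Metadata, RAID1: total=4.00GiB, used=3.80GiB
"""

ratio = metadata_fullness(SAMPLE)
# Hypothetical threshold: above 90%, further writes (snapshots included)
# are more likely to force a new metadata chunk allocation.
if ratio is not None and ratio > 0.90:
    print("metadata chunks are %.0f%% full" % (ratio * 100))
```

A high ratio does not prove that a snapshot will trigger an allocation;
it only indicates that there is little slack left in the chunks that are
already allocated.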
>
> But as I said, snapshot deletions are an entirely different story, as
> then all those previously locked in place extents are potentially freed,
> and the filesystem must do a lot of work to figure out which ones it can
> actually free and free them, vs. ones that still have other references
> which therefore cannot yet be freed.
Most of the issue here with balance is that you potentially end up doing
unnecessary work, and how much can't be known until it's done.
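
For automated snapshot thinning, one option is to have the cleanup job
check for a running balance and skip that pass if one is found. The
following is a minimal sketch, not a tested tool. It assumes that
`btrfs balance status <mountpoint>` exits with status 0 only when no
balance is running, and it treats every other outcome (a balance in
progress, an error, or a missing `btrfs` binary) as a reason to hold
off. The function name and the injectable `run` parameter are
hypothetical and exist only to keep the example testable.

```python
import subprocess


def safe_to_delete_snapshots(mountpoint, run=subprocess.run):
    """Return True only if `btrfs balance status` reports no balance.

    Assumption: exit status 0 means no balance is running on the
    filesystem. Any non-zero status is handled conservatively and
    results in False, so deletions are deferred when in doubt.
    """
    try:
        result = run(
            ["btrfs", "balance", "status", mountpoint],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # btrfs-progs not installed or not executable: state unknown.
        return False
    return result.returncode == 0
```

A cron-driven cleanup script could call
`safe_to_delete_snapshots("/mnt/data")` (the path is a placeholder) and
exit early when it returns False, leaving the deletions for the next
run. This only covers balance; scrub, device delete, and replace would
need their own checks.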