* Is autodefrag recommended?
@ 2017-09-04 9:31 Marat Khalili
2017-09-04 10:23 ` Henk Slager
` (4 more replies)
0 siblings, 5 replies; 12+ messages in thread
From: Marat Khalili @ 2017-09-04 9:31 UTC (permalink / raw)
To: linux-btrfs
Hello list,
good time of the day,
More than once I see mentioned in this list that autodefrag option
solves problems with no apparent drawbacks, but it's not the default.
Can you recommend to just switch it on indiscriminately on all
installations?
I'm currently on kernel 4.4, can switch to 4.10 if necessary (it's
Ubuntu that gives us this strange choice, no idea why it's not 4.9).
Only spinning rust here, no SSDs.
--
With Best Regards,
Marat Khalili
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Is autodefrag recommended?
2017-09-04 9:31 Is autodefrag recommended? Marat Khalili
@ 2017-09-04 10:23 ` Henk Slager
2017-09-04 10:34 ` Duncan
` (3 subsequent siblings)
4 siblings, 0 replies; 12+ messages in thread
From: Henk Slager @ 2017-09-04 10:23 UTC (permalink / raw)
To: Marat Khalili; +Cc: linux-btrfs
On Mon, Sep 4, 2017 at 11:31 AM, Marat Khalili <mkh@rqc.ru> wrote:
> Hello list,
> good time of the day,
>
> More than once I see mentioned in this list that autodefrag option solves
> problems with no apparent drawbacks, but it's not the default. Can you
> recommend to just switch it on indiscriminately on all installations?
Of course it has drawbacks; what your trade-off is depends on the
use-cases on the filesystem. If the filesystem was created a long
time ago and has 4k leaves, then on HDD you get excessive
fragmentation over time, with 4k blocks scattered all over the disk
for a file with a lot of random writes (standard CoW for the whole
fs), like a 50G vm image: easily 500k extents.
With autodefrag on from the moment the fs is created, most
extent/block sizes will be 128k or 256k, and the number of extents
for the same vm image is roughly 50k. So statistically the average
block size is not 4k but 128k, which at least means less free-space
fragmentation (I use SSD caching of the HDD; otherwise even those
50k extents result in totally unacceptable performance). For the
newer standard 16k leaves it is more or less the same story.
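For anyone who wants to check their own images, extent counts like the
ones above can be read off filefrag's summary line. A minimal sketch
(assuming e2fsprogs' usual "N extents found" output format; on btrfs
the count is only approximate for shared or compressed extents):

```python
import re
import subprocess

def extent_count(path):
    # Parse filefrag's "<file>: N extents found" summary line.
    # Assumes e2fsprogs output format; btrfs counts are approximate.
    out = subprocess.run(["filefrag", path],
                         capture_output=True, text=True).stdout
    m = re.search(r"(\d+) extents? found", out)
    return int(m.group(1)) if m else None

def avg_extent_size(size_bytes, extents):
    # Average extent size implied by file size and extent count.
    return size_bytes // extents

# A 50 GiB image in ~500k extents averages ~105 KiB per extent;
# at ~50k extents (autodefrag from the start) it is ~1 MiB.
```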
The drawbacks for me are:
1. I use nightly differential snapshotting for backup/replication
over a metered mobile network link, and autodefrag causes a certain
amount of unnecessary fake content difference, because send|receive
is CoW-based. The amount of extra data volume it causes is still
acceptable. If I let the guest OS defragment its fs inside the vm
image, for example, then the data volume per day becomes
unacceptable.
2. It causes extra HDD activity, and so noise, power consumption
etc., which might be unacceptable for some use-cases.
> I'm currently on kernel 4.4, can switch to 4.10 if necessary (it's Ubuntu
> that gives us this strange choice, no idea why it's not 4.9). Only spinning
> rust here, no SSDs.
Kernel 4.4 is new enough w.r.t. autodefrag, but if you can switch to
4.8 or newer, I would do so.
* Re: Is autodefrag recommended?
2017-09-04 9:31 Is autodefrag recommended? Marat Khalili
2017-09-04 10:23 ` Henk Slager
@ 2017-09-04 10:34 ` Duncan
2017-09-04 11:09 ` Henk Slager
2017-09-04 10:54 ` Hugo Mills
` (2 subsequent siblings)
4 siblings, 1 reply; 12+ messages in thread
From: Duncan @ 2017-09-04 10:34 UTC (permalink / raw)
To: linux-btrfs
Marat Khalili posted on Mon, 04 Sep 2017 12:31:54 +0300 as excerpted:
> Hello list,
> good time of the day,
>
> More than once I see mentioned in this list that autodefrag option
> solves problems with no apparent drawbacks, but it's not the default.
> Can you recommend to just switch it on indiscriminately on all
> installations?
>
> I'm currently on kernel 4.4, can switch to 4.10 if necessary (it's
> Ubuntu that gives us this strange choice, no idea why it's not 4.9).
> Only spinning rust here, no SSDs.
AFAIK autodefrag is recommended in general, but may not be for certain
specific use-cases.
* Because the mechanism involves watching written files for fragmentation
and scheduling areas that are too fragmented for later rewrite, if the
filesystem is operating at near capacity already, adding the extra load
of the defragmenting rewrites may actually reduce throughput and increase
latency, at least short term. (Longer term the additional fragmentation
from /not/ using it will become a factor and reduce throughput even more.)
* Users just turning autodefrag on after not using it for an extended
period, thus having an already highly fragmented filesystem, may well see
a period of higher latencies and lower throughput until the system
"catches up" and has defragged frequently written-to files. (This is
avoided with a policy of always having it on from the first time the
filesystem is mounted, so it's on at initial filesystem population.)
* As with many issues on a COW-based filesystem such as btrfs, it's the
frequently written-into (aka internal-rewrite-pattern) files that are
the biggest test case. Files written all at once and never rewritten
in place (for file safety, many editors make a temporary copy, fsync
it, and then atomically replace the original with a rename, so these
are not in-place rewrites) don't tend to be an issue, unless the
filesystem is already fragmented enough at write time that the file
must be fragmented as it is initially written.
In general, this internal-rewrite-pattern is most commonly seen for
database files and virtual machines, with systemd's journal files likely
being the most common example of the former -- they're NOT the common
append-only log file format that so-called "legacy" text-based log files
tend to be. Also extremely common are the browser database files used by
both gecko and webkit based browsers (and browser-based apps such as
thunderbird).
* Autodefrag works very well when these internal-rewrite-pattern files
are relatively small, say a quarter GiB or less, but, again with near-
capacity throughput, not necessarily so well with larger databases or VM
images of a GiB or larger. (The quarter-gig to gig size is intermediate,
not as often a problem and not a problem for many, but it can be for
slower devices, while those on fast ssds may not see a problem until
sizes reach multiple GiB.)
For larger internal-rewrite-pattern files, again starting at a gig or
so depending on device speed as well as rewrite activity, where both
fragmentation AND performance are issues, the NOCOW file attribute may
be useful, though there are side effects (losing btrfs checksumming
and compression functionality, interaction with btrfs snapshotting
forcing CoW, etc).
However, COW-based filesystems in general, including btrfs, are not
going to perform well with this use-case, and those operating large
DBs or VMs with performance considerations may find more traditional
filesystems (or even operating on the bare device, bypassing the
filesystem layer entirely) a better match for their needs. I
personally consider NOCOW an unacceptable compromise, losing many of
the advantages that make btrfs so nice in general, so IMO it's then
better to just use a different filesystem better suited to that
use-case.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: Is autodefrag recommended?
2017-09-04 9:31 Is autodefrag recommended? Marat Khalili
2017-09-04 10:23 ` Henk Slager
2017-09-04 10:34 ` Duncan
@ 2017-09-04 10:54 ` Hugo Mills
2017-09-05 11:45 ` Austin S. Hemmelgarn
2017-09-05 12:36 ` A L
2017-09-05 14:01 ` Is autodefrag recommended? -- re-duplication??? Marat Khalili
4 siblings, 1 reply; 12+ messages in thread
From: Hugo Mills @ 2017-09-04 10:54 UTC (permalink / raw)
To: Marat Khalili; +Cc: linux-btrfs
On Mon, Sep 04, 2017 at 12:31:54PM +0300, Marat Khalili wrote:
> Hello list,
> good time of the day,
>
> More than once I see mentioned in this list that autodefrag option
> solves problems with no apparent drawbacks, but it's not the
> default. Can you recommend to just switch it on indiscriminately on
> all installations?
>
> I'm currently on kernel 4.4, can switch to 4.10 if necessary (it's
> Ubuntu that gives us this strange choice, no idea why it's not 4.9).
> Only spinning rust here, no SSDs.
autodefrag effectively works by taking a small region around every
write or cluster of writes and making that into a stand-alone extent.
This has two consequences:
- You end up duplicating more data than is strictly necessary. This
is, IIRC, something like 128 KiB for a write.
- There's an I/O overhead for enabling autodefrag, because it's
increasing the amount of data written.
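A back-of-envelope sketch of that overhead (the per-write region size
is an assumption, taken from the ~128 KiB figure above, and
overlapping writes would share the cost in practice):

```python
KIB = 1024

def bytes_written(writes, write_kib, autodefrag, region_kib=128):
    # Rough upper bound: with autodefrag each scattered write also
    # rewrites a ~region_kib chunk around it (assumed figure).
    extra = region_kib if autodefrag else 0
    return writes * (write_kib + extra) * KIB

# 10,000 scattered 4 KiB writes: ~40 MiB of user data without
# autodefrag, on the order of 1.3 GiB touched with it on.
```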
Hugo.
--
Hugo Mills | The future isn't what it used to be.
hugo@... carfax.org.uk |
http://carfax.org.uk/ |
PGP: E2AB1DE4 |
* Re: Is autodefrag recommended?
2017-09-04 10:34 ` Duncan
@ 2017-09-04 11:09 ` Henk Slager
2017-09-04 22:27 ` Duncan
0 siblings, 1 reply; 12+ messages in thread
From: Henk Slager @ 2017-09-04 11:09 UTC (permalink / raw)
To: Duncan; +Cc: linux-btrfs
On Mon, Sep 4, 2017 at 12:34 PM, Duncan <1i5t5.duncan@cox.net> wrote:
> * Autodefrag works very well when these internal-rewrite-pattern files
> are relatively small, say a quarter GiB or less, but, again with near-
> capacity throughput, not necessarily so well with larger databases or VM
> images of a GiB or larger. (The quarter-gig to gig size is intermediate,
> not as often a problem and not a problem for many, but it can be for
> slower devices, while those on fast ssds may not see a problem until
> sizes reach multiple GiB.)
I have seen you state this before, about some quarter-GiB file size
or so, but it is irrelevant; it is simply not how it works. See
Hugo's explanation for how it works. I can post/store an actual
filefrag output of a vm image that has been around for 2 years on
one of my btrfs filesystems; then you can do some statistics on it
and see from there how it works.
* Re: Is autodefrag recommended?
2017-09-04 11:09 ` Henk Slager
@ 2017-09-04 22:27 ` Duncan
0 siblings, 0 replies; 12+ messages in thread
From: Duncan @ 2017-09-04 22:27 UTC (permalink / raw)
To: linux-btrfs
Henk Slager posted on Mon, 04 Sep 2017 13:09:24 +0200 as excerpted:
> On Mon, Sep 4, 2017 at 12:34 PM, Duncan <1i5t5.duncan@cox.net> wrote:
>
>> * Autodefrag works very well when these internal-rewrite-pattern files
>> are relatively small, say a quarter GiB or less, but, again with near-
>> capacity throughput, not necessarily so well with larger databases or
>> VM images of a GiB or larger. (The quarter-gig to gig size is
>> intermediate,
>> not as often a problem and not a problem for many, but it can be for
>> slower devices, while those on fast ssds may not see a problem until
>> sizes reach multiple GiB.)
>
> I have seen you stating this before about some quarter GiB filesize or
> so, but it is irrelevant, it is simply not how it works. See explanation
> of Hugo for how it works. I can post/store an actual filefrag output of
> a vm image that is around for 2 years on the one of my btrfs fs, then
> you can do some statistics on it and see from there how it works.
FWIW...
I believe it did work that way (whole-file autodefrag) at one point,
because back in the early kernel 3.x era at least, we had complaints
about autodefrag performance with larger internal-rewrite-pattern
files, where the larger the file the worse the performance, and the
documentation mentioned something about it being appropriate for
small files but less so for large files.
But I also believe you're correct that it no longer works that way
(if it ever did; maybe the complaints were due to some unrelated side
effect, and in any case I've not seen any for quite some time now),
and hasn't since before anything we're still trying to reasonably
support on this list (IOW, back two LTS kernel series, so to 4.4).
So I should drop the size factor, or at least mention that it's not
nearly the problem it once was.
Thanks for forcing the reckoning. =:^)
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: Is autodefrag recommended?
2017-09-04 10:54 ` Hugo Mills
@ 2017-09-05 11:45 ` Austin S. Hemmelgarn
2017-09-05 12:49 ` Henk Slager
0 siblings, 1 reply; 12+ messages in thread
From: Austin S. Hemmelgarn @ 2017-09-05 11:45 UTC (permalink / raw)
To: Hugo Mills, Marat Khalili, linux-btrfs
On 2017-09-04 06:54, Hugo Mills wrote:
> On Mon, Sep 04, 2017 at 12:31:54PM +0300, Marat Khalili wrote:
>> Hello list,
>> good time of the day,
>>
>> More than once I see mentioned in this list that autodefrag option
>> solves problems with no apparent drawbacks, but it's not the
>> default. Can you recommend to just switch it on indiscriminately on
>> all installations?
>>
>> I'm currently on kernel 4.4, can switch to 4.10 if necessary (it's
>> Ubuntu that gives us this strange choice, no idea why it's not 4.9).
>> Only spinning rust here, no SSDs.
>
> autodefrag effectively works by taking a small region around every
> write or cluster of writes and making that into a stand-alone extent.
I was under the impression that it had some kind of 'random access'
detection heuristic, and only triggered if that flagged the write
patterns as 'random'.
>
> This has two consequences:
>
> - You end up duplicating more data than is strictly necessary. This
> is, IIRC, something like 128 KiB for a write.
FWIW, I'm pretty sure you can mitigate this first issue by running a
regular defrag on a semi-regular basis (monthly is what I would
probably suggest).
>
> - There's an I/O overhead for enabling autodefrag, because it's
> increasing the amount of data written.
And the second issue may not be as much of a problem as it sounds.
The region being rewritten gets written out sequentially, so
autodefrag increases the amount of data written, but in most cases it
probably won't increase the I/O request count to the device by much.
If you care mostly about raw bandwidth this could still have an
impact, but if you care about IOPS it probably won't matter much
unless you're already running the device at peak capacity.
* Re: Is autodefrag recommended?
2017-09-04 9:31 Is autodefrag recommended? Marat Khalili
` (2 preceding siblings ...)
2017-09-04 10:54 ` Hugo Mills
@ 2017-09-05 12:36 ` A L
2017-09-05 14:01 ` Is autodefrag recommended? -- re-duplication??? Marat Khalili
4 siblings, 0 replies; 12+ messages in thread
From: A L @ 2017-09-05 12:36 UTC (permalink / raw)
To: linux-btrfs
There is a drawback in that defragmentation re-duplicates data that
was previously deduplicated or shared with snapshots/subvolumes.
---- From: Marat Khalili <mkh@rqc.ru> -- Sent: 2017-09-04 - 11:31 ----
> Hello list,
> good time of the day,
>
> More than once I see mentioned in this list that autodefrag option
> solves problems with no apparent drawbacks, but it's not the default.
> Can you recommend to just switch it on indiscriminately on all
> installations?
>
> I'm currently on kernel 4.4, can switch to 4.10 if necessary (it's
> Ubuntu that gives us this strange choice, no idea why it's not 4.9).
> Only spinning rust here, no SSDs.
>
> --
>
> With Best Regards,
> Marat Khalili
* Re: Is autodefrag recommended?
2017-09-05 11:45 ` Austin S. Hemmelgarn
@ 2017-09-05 12:49 ` Henk Slager
2017-09-05 13:00 ` Austin S. Hemmelgarn
0 siblings, 1 reply; 12+ messages in thread
From: Henk Slager @ 2017-09-05 12:49 UTC (permalink / raw)
To: Austin S. Hemmelgarn; +Cc: linux-btrfs
On Tue, Sep 5, 2017 at 1:45 PM, Austin S. Hemmelgarn
<ahferroin7@gmail.com> wrote:
>> - You end up duplicating more data than is strictly necessary. This
>> is, IIRC, something like 128 KiB for a write.
>
> FWIW, I'm pretty sure you can mitigate this first issue by running a regular
> defrag on a semi-regular basis (monthly is what I would probably suggest).
No, both autodefrag and regular defrag duplicate data, so if you keep
snapshots around for weeks or months, it can eat up a significant
amount of space.
* Re: Is autodefrag recommended?
2017-09-05 12:49 ` Henk Slager
@ 2017-09-05 13:00 ` Austin S. Hemmelgarn
0 siblings, 0 replies; 12+ messages in thread
From: Austin S. Hemmelgarn @ 2017-09-05 13:00 UTC (permalink / raw)
To: Henk Slager; +Cc: linux-btrfs
On 2017-09-05 08:49, Henk Slager wrote:
> On Tue, Sep 5, 2017 at 1:45 PM, Austin S. Hemmelgarn
> <ahferroin7@gmail.com> wrote:
>
>>> - You end up duplicating more data than is strictly necessary. This
>>> is, IIRC, something like 128 KiB for a write.
>>
>> FWIW, I'm pretty sure you can mitigate this first issue by running a regular
>> defrag on a semi-regular basis (monthly is what I would probably suggest).
>
> No, both autodefrag and regular defrag duplicate data, so if you keep
> snapshots around for weeks or months, it can eat up a significant
> amount of space.
>
I'm not talking about data duplication due to broken reflinks; I'm
talking about data duplication due to how partial extent rewrites are
handled in BTRFS.
As a more illustrative example, suppose you've got a 256k file that
has just one extent. Such a file requires 256k of space for the data.
Now rewrite the range from 128k to 192k. The file now technically
takes up 320k, because the region you rewrote is still allocated in
the original extent.
I know that sub-extent-size reflinks are handled like this (in the
above example, if you instead use the CLONE ioctl to create a new
file reflinking that 64k range, then delete the original, the
remaining 192k of space in the extent ends up unreferenced, but is
kept around until the referenced region itself is no longer
referenced; the easiest way to ensure that is to either rewrite the
whole file or defragment it), and I'm pretty sure from reading the
code that mid-extent writes are handled this way too, in which case a
full defrag can reclaim that space.
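A toy model of that bookkeeping, under the assumption just described
(an extent stays charged in full while any byte of it is referenced):

```python
K = 1024

def allocated(extents):
    # extents: list of (extent_length, referenced_bytes) pairs.
    # An extent stays fully allocated while anything references it.
    return sum(length for length, refs in extents if refs > 0)

# A 256k file in a single extent:
before = [(256 * K, 256 * K)]
assert allocated(before) == 256 * K

# Rewrite the 128k..192k range: a new 64k extent is written, and the
# original loses 64k of references but remains fully allocated.
after = [(256 * K, 192 * K), (64 * K, 64 * K)]
assert allocated(after) == 320 * K   # matches the 320k figure above
```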
* Re: Is autodefrag recommended? -- re-duplication???
2017-09-04 9:31 Is autodefrag recommended? Marat Khalili
` (3 preceding siblings ...)
2017-09-05 12:36 ` A L
@ 2017-09-05 14:01 ` Marat Khalili
2017-09-05 14:39 ` Hugo Mills
4 siblings, 1 reply; 12+ messages in thread
From: Marat Khalili @ 2017-09-05 14:01 UTC (permalink / raw)
To: linux-btrfs
Cc: Henk Slager, Duncan, Hugo Mills, Austin S. Hemmelgarn, Henk Slager, A L
Dear experts,
My first reaction to just switching autodefrag on was positive, but
the mentions of re-duplication are very scary. The main use of BTRFS
here is backup snapshots, so re-duplication would be disastrous.
To stick to a concrete example: let there be two files, 4KB and 4GB
in size, each referenced 100 times in read-only snapshots, with some
4KB of each file rewritten every night before another snapshot is
created (let's ignore snapshot deletion here). AFAIU, 8KB of
additional space (+metadata) will be allocated each night without
autodefrag. With autodefrag, will it perhaps be 4KB+128KB, or
something much worse?
--
With Best Regards,
Marat Khalili
* Re: Is autodefrag recommended? -- re-duplication???
2017-09-05 14:01 ` Is autodefrag recommended? -- re-duplication??? Marat Khalili
@ 2017-09-05 14:39 ` Hugo Mills
0 siblings, 0 replies; 12+ messages in thread
From: Hugo Mills @ 2017-09-05 14:39 UTC (permalink / raw)
To: Marat Khalili; +Cc: linux-btrfs, Henk Slager, Duncan, Austin S. Hemmelgarn, A L
On Tue, Sep 05, 2017 at 05:01:10PM +0300, Marat Khalili wrote:
> Dear experts,
>
> At first reaction to just switching autodefrag on was positive, but
> mentions of re-duplication are very scary. Main use of BTRFS here is
> backup snapshots, so re-duplication would be disastrous.
>
> In order to stick to concrete example, let there be two files, 4KB
> and 4GB in size, referenced in read-only snapshots 100 times each,
> and some 4KB of both files are rewritten each night and then another
> snapshot is created (let's ignore snapshots deletion here). AFAIU
> 8KB of additional space (+metadata) will be allocated each night
> without autodefrag. With autodefrag will it be perhaps 4KB+128KB or
> something much worse?
I'm going for 132 KiB (4+128).
Of course, if there are two 4 KiB writes close together, then
there's less overhead, as they'll share the rewritten range.
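The arithmetic, treating the ~128 KiB defrag region above as an
assumption rather than a guaranteed constant:

```python
KIB = 1024

def nightly_growth(write_kib, autodefrag, region_kib=128):
    # Space newly allocated by one small rewrite of a snapshotted
    # file: the CoW'd block itself, plus (with autodefrag) roughly
    # a region_kib rewrite around it. region_kib=128 is the
    # from-memory figure quoted above.
    return (write_kib + (region_kib if autodefrag else 0)) * KIB

assert nightly_growth(4, autodefrag=False) == 4 * KIB    # 4 KiB
assert nightly_growth(4, autodefrag=True) == 132 * KIB   # 4 + 128
```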
Hugo.
--
Hugo Mills | Once is happenstance; twice is coincidence; three
hugo@... carfax.org.uk | times is enemy action.
http://carfax.org.uk/ |
PGP: E2AB1DE4 |