* Recommendations for balancing as part of regular maintenance?
@ 2018-01-08 15:55 Austin S. Hemmelgarn
2018-01-08 16:20 ` ein
` (2 more replies)
0 siblings, 3 replies; 34+ messages in thread
From: Austin S. Hemmelgarn @ 2018-01-08 15:55 UTC (permalink / raw)
To: Btrfs BTRFS
So, for a while now I've been recommending small filtered balances to
people as part of regular maintenance for BTRFS filesystems under the
logic that it does help in some cases and can't really hurt (and if done
right, is really inexpensive in terms of resources). This ended up
integrated partially in the info text next to the BTRFS charts on
netdata's dashboard, and someone has now pointed out (correctly I might
add) that this is at odds with the BTRFS FAQ entry on balances.
For reference, here's the bit about it in netdata:
You can keep your volume healthy by running the `btrfs balance` command
on it regularly (check `man btrfs-balance` for more info).
And here's the FAQ entry:
Q: Do I need to run a balance regularly?
A: In general usage, no. A full unfiltered balance typically takes a
long time, and will rewrite huge amounts of data unnecessarily. You may
wish to run a balance on metadata only (see Balance_Filters) if you find
you have very large amounts of metadata space allocated but unused, but
this should be a last resort.
I've commented in the issue in netdata's issue tracker that I feel that
the FAQ entry could be better worded (strictly speaking, you don't
_need_ to run balances regularly, but it's usually a good idea).
Looking at both though, I think they could probably both be improved,
but I would like to get some input here on what people actually think
the best current practices are regarding this (and ideally why they feel
that way) before I go and change anything.
So, on that note, how does anybody else out there feel about this? Is
balancing regularly with filters restricting things to small numbers of
mostly empty chunks a good thing for regular maintenance or not?
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-08 15:55 Recommendations for balancing as part of regular maintenance? Austin S. Hemmelgarn
@ 2018-01-08 16:20 ` ein
2018-01-08 16:34 ` Austin S. Hemmelgarn
2018-01-10 21:37 ` waxhead
2018-01-12 18:24 ` Austin S. Hemmelgarn
2 siblings, 1 reply; 34+ messages in thread
From: ein @ 2018-01-08 16:20 UTC (permalink / raw)
To: Austin S. Hemmelgarn, Btrfs BTRFS
On 01/08/2018 04:55 PM, Austin S. Hemmelgarn wrote:
> [...]
>
> And here's the FAQ entry:
>
> Q: Do I need to run a balance regularly?
>
> A: In general usage, no. A full unfiltered balance typically takes a
> long time, and will rewrite huge amounts of data unnecessarily. You may
> wish to run a balance on metadata only (see Balance_Filters) if you find
> you have very large amounts of metadata space allocated but unused, but
> this should be a last resort.
IMHO a couple more sentences would make the answer more useful:
1. An example `btrfs balance` command, with a note to check the man page
first.
2. An explanation of which use cases may cause 'large amounts of metadata
space allocated but unused'.
--
PGP Public Key (RSA/4096b):
ID: 0xF2C6EA10
SHA-1: 51DA 40EE 832A 0572 5AD8 B3C0 7AFF 69E1 F2C6 EA10
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-08 16:20 ` ein
@ 2018-01-08 16:34 ` Austin S. Hemmelgarn
2018-01-08 18:17 ` Graham Cobb
0 siblings, 1 reply; 34+ messages in thread
From: Austin S. Hemmelgarn @ 2018-01-08 16:34 UTC (permalink / raw)
To: ein, Btrfs BTRFS
On 2018-01-08 11:20, ein wrote:
> On 01/08/2018 04:55 PM, Austin S. Hemmelgarn wrote:
>> [...]
>>
>> And here's the FAQ entry:
>>
>> Q: Do I need to run a balance regularly?
>>
>> A: In general usage, no. A full unfiltered balance typically takes a
>> long time, and will rewrite huge amounts of data unnecessarily. You may
>> wish to run a balance on metadata only (see Balance_Filters) if you find
>> you have very large amounts of metadata space allocated but unused, but
>> this should be a last resort.
>
> IMHO a couple more sentences would make the answer more useful:
> 1. An example `btrfs balance` command, with a note to check the man page
> first.
> 2. An explanation of which use cases may cause 'large amounts of metadata
> space allocated but unused'.
>
That's kind of what I was thinking as well, but I'm hesitant to get too
heavily into stuff along the lines of 'for use case X, do 1, for use
case Y, do 2, etc', as that tends to result in pigeonholing (people just
go with what sounds closest to their use case instead of trying to
figure out what actually is best for their use case).
Ideally, I think it should be as generic as reasonably possible,
possibly something along the lines of:
A: While not strictly necessary, running regular filtered balances (for
example `btrfs balance start -dusage=50 -dlimit=2 -musage=50 -mlimit=4`,
see `man btrfs-balance` for more info on what the options mean) can help
keep a volume healthy by mitigating the things that typically cause
ENOSPC errors. Full balances by contrast are long and expensive
operations, and should be done only as a last resort.
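As a concrete illustration of how such a filtered balance might be
scheduled (this is only a hypothetical sketch, not part of the proposed
FAQ text; the mount point, timing, and cron layout are placeholders):

```shell
# Hypothetical /etc/cron.d entry running the filtered balance above
# every Sunday at 03:00; replace /mountpoint with the actual volume.
# m h dom mon dow user command
0 3 * * 0 root /sbin/btrfs balance start -dusage=50 -dlimit=2 -musage=50 -mlimit=4 /mountpoint
```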
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-08 16:34 ` Austin S. Hemmelgarn
@ 2018-01-08 18:17 ` Graham Cobb
2018-01-08 18:34 ` Austin S. Hemmelgarn
2018-01-10 4:38 ` Duncan
0 siblings, 2 replies; 34+ messages in thread
From: Graham Cobb @ 2018-01-08 18:17 UTC (permalink / raw)
To: Btrfs BTRFS
On 08/01/18 16:34, Austin S. Hemmelgarn wrote:
> Ideally, I think it should be as generic as reasonably possible,
> possibly something along the lines of:
>
> A: While not strictly necessary, running regular filtered balances (for
> example `btrfs balance start -dusage=50 -dlimit=2 -musage=50 -mlimit=4`,
> see `man btrfs-balance` for more info on what the options mean) can help
> keep a volume healthy by mitigating the things that typically cause
> ENOSPC errors. Full balances by contrast are long and expensive
> operations, and should be done only as a last resort.
That recommendation is similar to what I do and it works well for my use
case. I would recommend it to anyone with my usage, but cannot say how
well it would work for other uses. In my case, I run balances like that
once a week: some weeks nothing happens, other weeks 5 or 10 blocks may
get moved.
For reference, my use case is for two separate btrfs filesystems each on
a single large disk (so no RAID) -- the disks are 6TB and 12TB, both
around 80% used -- one is my main personal data disk, the other is my
main online backup disk.
The data disk receives all email delivery (so lots of small files,
coming and going), stores TV programs as PVR storage (many GB sized
files, each one written once, which typically stick around for a while
and eventually get deleted) and is where I do my software development
(sources and build objects). No (significant) database usage. I am
guessing this is pretty typical personal user usage (although it doesn't
store any operating system files). The only unusual thing is that I have
it set up as about 20 subvolumes, and each one has frequent snapshots
(maybe 200 or so subvolumes in total at any time).
The online backup disk receives backups from all my systems in three
main forms: btrfs snapshots (send/receive), rsnapshot copies (rsync),
and DAR archives. Most get updated daily. It contains several hundred
snapshots (most received from the data disk).
It would be interesting to hear if similar balancing is seen as useful
for other very different cases (RAID use, databases or VM disks, etc).
Graham
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-08 18:17 ` Graham Cobb
@ 2018-01-08 18:34 ` Austin S. Hemmelgarn
2018-01-08 20:29 ` Martin Raiber
2018-01-10 4:38 ` Duncan
1 sibling, 1 reply; 34+ messages in thread
From: Austin S. Hemmelgarn @ 2018-01-08 18:34 UTC (permalink / raw)
To: Graham Cobb, Btrfs BTRFS
On 2018-01-08 13:17, Graham Cobb wrote:
> On 08/01/18 16:34, Austin S. Hemmelgarn wrote:
>> Ideally, I think it should be as generic as reasonably possible,
>> possibly something along the lines of:
>>
>> A: While not strictly necessary, running regular filtered balances (for
>> example `btrfs balance start -dusage=50 -dlimit=2 -musage=50 -mlimit=4`,
>> see `man btrfs-balance` for more info on what the options mean) can help
>> keep a volume healthy by mitigating the things that typically cause
>> ENOSPC errors. Full balances by contrast are long and expensive
>> operations, and should be done only as a last resort.
>
> That recommendation is similar to what I do and it works well for my use
> case. I would recommend it to anyone with my usage, but cannot say how
> well it would work for other uses. In my case, I run balances like that
> once a week: some weeks nothing happens, other weeks 5 or 10 blocks may
> get moved.
>
> For reference, my use case is for two separate btrfs filesystems each on
> a single large disk (so no RAID) -- the disks are 6TB and 12TB, both
> around 80% used -- one is my main personal data disk, the other is my
> main online backup disk.
>
> The data disk receives all email delivery (so lots of small files,
> coming and going), stores TV programs as PVR storage (many GB sized
> files, each one written once, which typically stick around for a while
> and eventually get deleted) and is where I do my software development
> (sources and build objects). No (significant) database usage. I am
> guessing this is pretty typical personal user usage (although it doesn't
> store any operating system files). The only unusual thing is that I have
> it set up as about 20 subvolumes, and each one has frequent snapshots
> (maybe 200 or so subvolumes in total at any time).
>
> The online backup disk receives backups from all my systems in three
> main forms: btrfs snapshots (send/receive), rsnapshot copies (rsync),
> and DAR archives. Most get updated daily. It contains several hundred
> snapshots (most received from the data disk).
>
> It would be interesting to hear if similar balancing is seen as useful
> for other very different cases (RAID use, databases or VM disks, etc).
In my own usage I've got a pretty varied mix of other stuff going on.
All my systems are Gentoo, so system updates mean that I'm building
software regularly (though on most of the systems that happens on tmpfs
in RAM). I run a home server with a dozen low-use QEMU VMs and a bunch
of transient test VMs, all of which currently store their disk images
raw on top of BTRFS (which is actually handling all of it pretty well,
though that may be thanks to all the VMs using PV-SCSI for their
disks). I also run a BOINC client system that sees pretty heavy
filesystem usage, and have a lot of personal files that get synced
regularly across systems, and all of this is on raid1 with essentially
no snapshots. For me the balance command I mentioned above, run daily,
seems to help, even if it doesn't move much most of the time on most
filesystems, and the actual balance operations take at most a few
seconds most of the time (I've got reasonably nice SSDs in everything).
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-08 18:34 ` Austin S. Hemmelgarn
@ 2018-01-08 20:29 ` Martin Raiber
2018-01-09 8:33 ` Marat Khalili
0 siblings, 1 reply; 34+ messages in thread
From: Martin Raiber @ 2018-01-08 20:29 UTC (permalink / raw)
To: Btrfs BTRFS
On 08.01.2018 19:34 Austin S. Hemmelgarn wrote:
> On 2018-01-08 13:17, Graham Cobb wrote:
>> On 08/01/18 16:34, Austin S. Hemmelgarn wrote:
>>> Ideally, I think it should be as generic as reasonably possible,
>>> possibly something along the lines of:
>>>
>>> A: While not strictly necessary, running regular filtered balances (for
>>> example `btrfs balance start -dusage=50 -dlimit=2 -musage=50
>>> -mlimit=4`,
>>> see `man btrfs-balance` for more info on what the options mean) can
>>> help
>>> keep a volume healthy by mitigating the things that typically cause
>>> ENOSPC errors. Full balances by contrast are long and expensive
>>> operations, and should be done only as a last resort.
>>
>> That recommendation is similar to what I do and it works well for my use
>> case. I would recommend it to anyone with my usage, but cannot say how
>> well it would work for other uses. In my case, I run balances like that
>> once a week: some weeks nothing happens, other weeks 5 or 10 blocks may
>> get moved.
>
> In my own usage I've got a pretty varied mix of other stuff going on.
> All my systems are Gentoo, so system updates mean that I'm building
> software regularly (though on most of the systems that happens on
> tmpfs in RAM), I run a home server with a dozen low use QEMU VM's and
> a bunch of transient test VM's, all of which I'm currently storing
> disk images for raw on top of BTRFS (which is actually handling all of
> it pretty well, though that may be thanks to all the VM's using
> PV-SCSI for their disks), I run a BOINC client system that sees pretty
> heavy filesystem usage, and have a lot of personal files that get
> synced regularly across systems, and all of this is on raid1 with
> essentially no snapshots. For me the balance command I mentioned
> above run daily seems to help, even if the balance doesn't move much
> most of the time on most filesystems, and the actual balance
> operations take at most a few seconds most of the time (I've got
> reasonably nice SSD's in everything).
There have been reports of (rare) corruption caused by balance (won't be
detected by a scrub) here on the mailing list. So I would stay away
from btrfs balance unless it is absolutely needed (ENOSPC), and while it
is run I would try not to do anything else wrt writes simultaneously.
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-08 20:29 ` Martin Raiber
@ 2018-01-09 8:33 ` Marat Khalili
2018-01-09 12:46 ` Austin S. Hemmelgarn
0 siblings, 1 reply; 34+ messages in thread
From: Marat Khalili @ 2018-01-09 8:33 UTC (permalink / raw)
To: Martin Raiber, Austin S. Hemmelgarn; +Cc: Btrfs BTRFS
On 08/01/18 19:34, Austin S. Hemmelgarn wrote:
> A: While not strictly necessary, running regular filtered balances
> (for example `btrfs balance start -dusage=50 -dlimit=2 -musage=50
> -mlimit=4`, see `man btrfs-balance` for more info on what the options
> mean) can help keep a volume healthy by mitigating the things that
> typically cause ENOSPC errors.
The choice of words is not very fortunate IMO. In my view, a volume
ceasing to be "healthy" during normal operation presumes some bugs (or
at least shortcomings) in the filesystem code. In that case I'd prefer
to have a detailed understanding of the situation before copy-pasting
commands from wiki pages. Remember, most users don't run cutting-edge
kernels and tools, preferring LTS distribution releases instead, so one
size might not fit all.
On 08/01/18 23:29, Martin Raiber wrote:
> There have been reports of (rare) corruption caused by balance (won't be
> detected by a scrub) here on the mailing list. So I would stay away
> from btrfs balance unless it is absolutely needed (ENOSPC), and while it
> is run I would try not to do anything else wrt writes simultaneously.
This is my opinion too as a normal user, based upon reading this list
and own attempts to recover from ENOSPC. I'd rather re-create filesystem
from scratch, or at least make full verified backup before attempting to
fix problems with balance.
--
With Best Regards,
Marat Khalili
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-09 8:33 ` Marat Khalili
@ 2018-01-09 12:46 ` Austin S. Hemmelgarn
2018-01-10 3:49 ` Duncan
0 siblings, 1 reply; 34+ messages in thread
From: Austin S. Hemmelgarn @ 2018-01-09 12:46 UTC (permalink / raw)
To: Marat Khalili, Martin Raiber; +Cc: Btrfs BTRFS
On 2018-01-09 03:33, Marat Khalili wrote:
> On 08/01/18 19:34, Austin S. Hemmelgarn wrote:
>> A: While not strictly necessary, running regular filtered balances
>> (for example `btrfs balance start -dusage=50 -dlimit=2 -musage=50
>> -mlimit=4`, see `man btrfs-balance` for more info on what the options
>> mean) can help keep a volume healthy by mitigating the things that
>> typically cause ENOSPC errors.
>
> The choice of words is not very fortunate IMO. In my view volume
> stopping being "healthy" during normal operation presumes some bugs (at
> least shortcomings) in the filesystem code. In this case I'd prefer to
> have detailed understanding of the situation before copy-pasting
> commands from wiki pages. Remember, most users don't run cutting-edge
> kernels and tools, preferring LTS distribution releases instead, so one
> size might not fit all.
I will not dispute that the tendency of BTRFS to end up in bad
situations is a shortcoming of the filesystem code. However, that isn't
likely to change any time soon (fixing it is going to be a lot of work
that will likely reduce performance for quite a few people), so there is
absolutely no reason that people should not be trying to mitigate the
problem.
As far as the exact command, the one I quoted has worked for at least 2
years worth of btrfs-progs and kernels, and I think far longer than that
(the usage and limit filters were implemented pretty early on). I agree
that detailed knowledge would be better, but that doesn't exactly fit
with the concept of a FAQ in most cases, and most people really don't
care about the details as long as it works.
>
> On 08/01/18 23:29, Martin Raiber wrote:
>> There have been reports of (rare) corruption caused by balance (won't be
>> detected by a scrub) here on the mailing list. So I would stay away
>> from btrfs balance unless it is absolutely needed (ENOSPC), and while it
>> is run I would try not to do anything else wrt writes simultaneously.
>
> This is my opinion too as a normal user, based upon reading this list
> and own attempts to recover from ENOSPC. I'd rather re-create filesystem
> from scratch, or at least make full verified backup before attempting to
> fix problems with balance.
While I'm generally of the same opinion (and I have a feeling most other
people who have been server admins are too), it's not a very
user-friendly position to recommend. Keep in mind that many (probably
most) users don't keep proper backups, and just targeting 'sensible'
people as your primary audience is a bad idea. Balance also needs to
work at at least a basic level regardless, simply because you can't
always just nuke the volume and rebuild it from scratch.
Personally though, I don't think I've ever seen issues with balance
corrupting data, and I don't recall seeing complaints about it either
(though I would love to see some links that prove me wrong).
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-09 12:46 ` Austin S. Hemmelgarn
@ 2018-01-10 3:49 ` Duncan
2018-01-10 16:30 ` Tom Worster
0 siblings, 1 reply; 34+ messages in thread
From: Duncan @ 2018-01-10 3:49 UTC (permalink / raw)
To: linux-btrfs
Austin S. Hemmelgarn posted on Tue, 09 Jan 2018 07:46:48 -0500 as
excerpted:
>> On 08/01/18 23:29, Martin Raiber wrote:
>>> There have been reports of (rare) corruption caused by balance (won't
>>> be detected by a scrub) here on the mailing list. So I would stay
>>> away from btrfs balance unless it is absolutely needed (ENOSPC), and
>>> while it is run I would try not to do anything else wrt writes
>>> simultaneously.
>>
>> This is my opinion too as a normal user, based upon reading this list
>> and own attempts to recover from ENOSPC. I'd rather re-create
>> filesystem from scratch, or at least make full verified backup before
>> attempting to fix problems with balance.
> While I'm generally of the same opinion (and I have a feeling most other
> people who have been server admins are too), it's not a very user
> friendly position to recommend that. Keep in mind that many (probably
> most) users don't keep proper backups, and just targeting 'sensible'
> people as your primary audience is a bad idea. It also needs to work at
> at least a basic level anyway though simply because you can't always
> just nuke the volume and rebuild it from scratch.
>
> Personally though, I don't think I've ever seen issues with balance
> corrupting data, and I don't recall seeing complaints about it either
> (though I would love to see some links that prove me wrong).
AFAIK, such corruption reports re balance aren't really balance, per se,
at all.
Instead, what I've seen in nearly all cases is a number of filesystem
maintenance commands involving heavy I/O colliding, that is, being run
at the same time, possibly because some of them are scheduled and the
admin didn't take the scheduled commands into account when issuing
others manually.
I don't believe anyone would recommend running balance, scrub,
snapshot-deletion, and backups (rsync or btrfs send/receive being the
common ones) all at the same time, or even two or more at the same
time, if for no other reason than that they're all IO-intensive.
Running just /one/ of them at a time is hard /enough/ on the system and
on the performance of anything else running at the same time, even when
all components are fully stable and mature (and as we all know, btrfs
is stabilizing, but not yet fully stable and mature). Yet that's what
these sorts of reports invariably involve.
Of course, btrfs certainly /should/ be able to handle more than one of
these at once without corruption, because anything else is a bug.
But btrfs /is/ still stabilizing and maturing, and it's precisely this
sort of rare corner-case race-condition bug, where more than one
extremely heavy-IO filesystem maintenance command is run at the same
time, that tends to be the last to be found and fixed. Because they
/are/ rare corner cases, often depending on race conditions, they tend
to be rarely reported and then extremely difficult to duplicate, so
that's exactly the type of bug that tends to remain around at this
point.
So rather than discouraging a sane-filtered regular balance (which I'll
discuss in a different reply), I'd suggest that the more sane
recommendation is to be aware of other major-IO filesystem maintenance
commands (not just btrfs commands but rsync-based backups, etc, too,
rsync being demanding enough on its own to have triggered a number of
btrfs bug reports and fixes over the years), including scheduled
commands, and to only run one at a time.
IOW, don't do a balance if your scheduled backup or snapshot-deletion is
about to kick in. One at a time is stressful enough on the filesystem
and hardware, don't compound the problem trying to do two or more at once!
So assuming a weekly schedule, do one a day of balance, scrub,
snapshot-deletion, and backups (after ensuring that none of them takes
over a day; balance in particular could take over a day at TiB-scale+
if not sanely filtered, particularly if quotas are enabled, due to the
scaling issues of that feature). And if any of those are scheduled
daily or more frequently, space the scheduling appropriately and ensure
they're done before starting the next task.
And keep in mind the scheduled tasks when running things manually, so as
not to collide there either.
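The one-heavy-job-per-day idea could be sketched as a cron fragment
like the following (purely illustrative: the mount point, times, usage
values, and the two local cleanup/backup scripts are placeholders for
whatever a given site actually uses, not tools that exist):

```shell
# Hypothetical /etc/cron.d fragment: one heavy-IO maintenance job per
# day, so none of them ever run concurrently.
# m h dom mon dow user command
0 2 * * 1 root /sbin/btrfs balance start -dusage=25 -musage=25 /data
0 2 * * 3 root /sbin/btrfs scrub start -B /data
0 2 * * 5 root /usr/local/bin/prune-snapshots /data   # site-specific script
0 2 * * 6 root /usr/local/bin/run-backups /data       # site-specific script
```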
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-08 18:17 ` Graham Cobb
2018-01-08 18:34 ` Austin S. Hemmelgarn
@ 2018-01-10 4:38 ` Duncan
2018-01-10 12:41 ` Austin S. Hemmelgarn
2018-01-11 20:12 ` Hans van Kranenburg
1 sibling, 2 replies; 34+ messages in thread
From: Duncan @ 2018-01-10 4:38 UTC (permalink / raw)
To: linux-btrfs
Graham Cobb posted on Mon, 08 Jan 2018 18:17:13 +0000 as excerpted:
> On 08/01/18 16:34, Austin S. Hemmelgarn wrote:
>> Ideally, I think it should be as generic as reasonably possible,
>> possibly something along the lines of:
>>
>> A: While not strictly necessary, running regular filtered balances (for
>> example `btrfs balance start -dusage=50 -dlimit=2 -musage=50
>> -mlimit=4`,
>> see `man btrfs-balance` for more info on what the options mean) can
>> help keep a volume healthy by mitigating the things that typically
>> cause ENOSPC errors. Full balances by contrast are long and expensive
>> operations, and should be done only as a last resort.
>
> That recommendation is similar to what I do and it works well for my use
> case. I would recommend it to anyone with my usage, but cannot say how
> well it would work for other uses. In my case, I run balances like that
> once a week: some weeks nothing happens, other weeks 5 or 10 blocks may
> get moved.
Why 50% usage, and why the rather low limits?
OK, so it rarely makes sense to go over 50% usage when the intent of
the balance is to return chunks to the unallocated pool: at 50% the
payback ratio is one free chunk for two processed, it gets worse after
that, and MUCH worse after ~67-75%, where the ratios are 1:3 and 1:4
respectively. But why so high, especially for a suggested scheduled/
routine command?
I'd suggest a rather lower usage value, say 20/25/34%, for favorable
payback ratios of 5:1, 4:1, and 3:1. That should be reasonable for a
generic recommendation for scheduled/routine balances. If that's not
enough, people can do more manually or increase the values from the
generic recommendation for their specific use-case.
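The arithmetic behind those payback ratios can be sketched with a toy
shell snippet (an illustration of the ratios only, not a btrfs tool):
if every chunk matched by `-dusage=U` were exactly U% full, rewriting
100 such chunks would repack their data into U full chunks and return
the other 100-U to the unallocated pool.

```shell
#!/bin/sh
# Toy model: chunks matched by -dusage=U are assumed exactly U% full.
# Rewriting 100 of them repacks the data into U full chunks, so the
# remaining 100-U chunks go back to the unallocated pool.
freed_per_100() {
    echo $((100 - $1))
}

for u in 20 25 34 50 67 75; do
    echo "usage=${u}%: ~$(freed_per_100 "$u") of every 100 rewritten chunks freed"
done
```

Real chunks are at most U% full rather than exactly, so these are
worst-case numbers for the filter.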
And I'd suggest either no limits or (for kernels that can handle it,
4.4+, which at this point is everything within our recommended support
range of the last two LTSs, thus now 4.9 earliest, anyway) range-limits,
say 2..20, so it won't bother if there's less than enough to clear at
least one chunk within the usage target (but see the observed behavior
change noted below), but will do more than the low 2-4 in the above
suggested limits if there is. With the lower usage= values, processing
should take less time per chunk, and if there's no more that fit the
usage filter it won't use the higher range anyway, so the limit can and
should be higher.
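A command along those lines might look like the following (an untested
sketch: the 25% usage value and the 2..20 range are just the numbers
discussed above, and /mountpoint is a placeholder):

```shell
# Needs kernel 4.4+ for the N..M limit syntax: do nothing if fewer
# than 2 chunks match the usage filter, process at most 20 if more do.
btrfs balance start -dusage=25 -dlimit=2..20 -musage=25 -mlimit=2..20 /mountpoint
```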
Meanwhile, for any recommendation of balance, I'd suggest also mentioning
the negative effect that enabled quotas have on balance times, probably
with a link to a fuller discussion where I'd suggest disabling them due
to the scaling issues if the use-case doesn't require them, and if that's
not possible due to the use-case, to at least consider temporarily
disabling quotas before doing a balance so as to speed it up, after which
they can be enabled again. (I'm not sure if a manual quota rescan is
required to update them at that point, or not. I don't use quotas here
or I'd test.)
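The temporary-disable approach might look something like this (a
sketch only; /mountpoint and the filter values are placeholders, and
whether the final rescan is actually required is exactly the open
question above):

```shell
#!/bin/sh
# Sketch: drop quotas for the duration of a filtered balance so the
# balance avoids qgroup accounting overhead, then re-enable them.
# Requires root and a mounted btrfs volume.
set -e
VOL=/mountpoint

btrfs quota disable "$VOL"
btrfs balance start -dusage=25 -musage=25 "$VOL"
btrfs quota enable "$VOL"
# Whether a manual rescan is needed after re-enabling is kernel-
# dependent; -w waits for completion.  It may fail harmlessly if a
# rescan was already started by the enable.
btrfs quota rescan -w "$VOL" || true
```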
And an additional observation...
I'm on ssd here and run many rather small independent btrfs instead of
fewer larger ones, so I'm used to keeping an eye on usage, tho I've never
found the need to schedule balances, partly because on ssd with
relatively small btrfs, balances are fast enough they're not a problem to
do "while I wait".
And I've definitely noticed an effect since the ssd option stopped
using the 2 MiB spreading algorithm in 4.14. Chunk usage was generally
stable before that, and I only occasionally needed to run balance to
clear out empty chunks; now, balance with the usage filter will
apparently actively fill in empty space in existing chunks. Previously,
a usage-filtered balance that only rewrote one chunk didn't actually
free anything, simply allocating a new chunk to replace the one it
freed, so at least two chunks needed to be rewritten to actually return
space to unallocated. Now, a usage-filtered rewrite of only a single
chunk routinely frees the allocated space, because it writes that small
bit of data from the freed chunk into existing free space in other
chunks.
At least I /presume/ that new balance-usage behavior is due to the ssd
changes. Maybe it's due to other patches. Either way, it's an
interesting and useful change. =:^)
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-10 4:38 ` Duncan
@ 2018-01-10 12:41 ` Austin S. Hemmelgarn
2018-01-11 20:12 ` Hans van Kranenburg
1 sibling, 0 replies; 34+ messages in thread
From: Austin S. Hemmelgarn @ 2018-01-10 12:41 UTC (permalink / raw)
To: linux-btrfs
On 2018-01-09 23:38, Duncan wrote:
> Graham Cobb posted on Mon, 08 Jan 2018 18:17:13 +0000 as excerpted:
>
>> On 08/01/18 16:34, Austin S. Hemmelgarn wrote:
>>> Ideally, I think it should be as generic as reasonably possible,
>>> possibly something along the lines of:
>>>
>>> A: While not strictly necessary, running regular filtered balances (for
>>> example `btrfs balance start -dusage=50 -dlimit=2 -musage=50
>>> -mlimit=4`,
>>> see `man btrfs-balance` for more info on what the options mean) can
>>> help keep a volume healthy by mitigating the things that typically
>>> cause ENOSPC errors. Full balances by contrast are long and expensive
>>> operations, and should be done only as a last resort.
>>
>> That recommendation is similar to what I do and it works well for my use
>> case. I would recommend it to anyone with my usage, but cannot say how
>> well it would work for other uses. In my case, I run balances like that
>> once a week: some weeks nothing happens, other weeks 5 or 10 blocks may
>> get moved.
>
>
> Why 50% usage, and why the rather low limits?
>
> OK, so it rarely makes sense to go over 50% usage when the intent of the
> balance is to return chunks to the unallocated pool, because at 50% the
> payback ratio is one free chunk for two processed and it gets worse after
> that and MUCH worse after ~67-75%, where the ratios are 1:3 and 1:4
> respectively, but why so high especially for a suggested scheduled/
> routine command?
Largely because that's what I use myself, and I know it works reliably.
In my case, I use a large number of small filesystems, don't delete very
large amounts of data very often, and run the command daily, so it's not
very likely that a large number of chunks are going to be below half
full, and therefore it made sense for me to just limit it to a small
number of half full chunks so that it completes quickly.
>
> I'd suggest a rather lower usage value, say 20/25/34%, for favorable
> payback ratios of 5:1, 4:1, and 3:1. That should be reasonable for a
> generic recommendation for scheduled/routine balances. If that's not
> enough, people can do more manually or increase the values from the
> generic recommendation for their specific use-case.
That's probably a good idea, though I'd likely go for about 25% as a
generic recommendation (much lower and you're not likely to process any
chunks at all most of the time, since BTRFS will back-fill things; much
higher and the ratio becomes rather unfavorable).
>
> And I'd suggest either no limits or (for kernels that can handle it,
> 4.4+, which at this point is everything within our recommended support
> range of the last two LTSs, thus now 4.9 earliest, anyway) range-limits,
> say 2..20, so it won't bother if there's less than enough to clear at
> least one chunk within the usage target (but see the observed behavior
> change noted below), but will do more than the low 2-4 in the above
> suggested limits if there is. With the lower usage= values, processing
> should take less time per chunk, and if there's no more that fit the
> usage filter it won't use the higher range anyway, so the limit can and
> should be higher.
Good point on the limits too, though I would say that we should probably
comment specifically on the fact that you need 4.4 or newer for the
range support (there are still people dealing with much older kernels
out there, think of embedded life-cycles for example).
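Put together, the suggestions so far might look like the following. This is a sketch, not a tested recommendation: the mountpoint is a placeholder, the `2..20` range form of `limit=` needs kernel 4.4+ as noted above, and `DRY_RUN=echo` is a guard of my own so the script prints the command instead of running it.

```shell
#!/bin/sh
# Sketch of a routine filtered balance: only rewrite data chunks that are
# at most 25% full, and between 2 and 20 of them per run.  With the
# default DRY_RUN=echo the command is printed, not executed; clear it to
# actually run.  A similar -musage/-mlimit pair could be added for metadata.
MNT=${MNT:-/mnt/data}        # placeholder mountpoint
DRY_RUN=${DRY_RUN:-echo}

$DRY_RUN btrfs balance start -dusage=25 -dlimit=2..20 "$MNT"
```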
>
>
> Meanwhile, for any recommendation of balance, I'd suggest also mentioning
> the negative effect that enabled quotas have on balance times, probably
> with a link to a fuller discussion where I'd suggest disabling them due
> to the scaling issues if the use-case doesn't require them, and if that's
> not possible due to the use-case, to at least consider temporarily
> disabling quotas before doing a balance so as to speed it up, after which
> they can be enabled again. (I'm not sure if a manual quota rescan is
> required to update them at that point, or not. I don't use quotas here
> or I'd test.)
Also a good point!
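A sketch of that workaround might look like this (assuming quotas can safely be toggled on the filesystem in question; as noted above it's unverified whether a rescan is needed afterwards, and the mountpoint is a placeholder):

```shell
#!/bin/sh
# Sketch: temporarily disable quotas around a balance to avoid the qgroup
# accounting overhead, then re-enable them.  DRY_RUN=echo prints the
# commands instead of executing them.
MNT=${MNT:-/mnt/data}        # placeholder mountpoint
DRY_RUN=${DRY_RUN:-echo}

$DRY_RUN btrfs quota disable "$MNT"
$DRY_RUN btrfs balance start -dusage=25 "$MNT"
$DRY_RUN btrfs quota enable "$MNT"
# Possibly needed to refresh qgroup numbers (unverified, per the thread):
$DRY_RUN btrfs quota rescan "$MNT"
```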
>
>
> And an additional observation...
>
> I'm on ssd here and run many rather small independent btrfs instead of
> fewer larger ones, so I'm used to keeping an eye on usage, tho I've never
> found the need to schedule balances, partly because on ssd with
> relatively small btrfs, balances are fast enough they're not a problem to
> do "while I wait".
In my case, they're pretty darn fast too; I just don't like having to
remember to run them by hand (that is the main appeal of automation,
after all).
>
> And I've definitely noticed an effect since the ssd option stopped using
> the 2 MiB spreading algorithm in 4.14. In particular, while chunk usage
> was generally stable before that and I only occasionally needed to run
> balance to clear out empty chunks, now, balance with the usage filter
> will apparently actively fill in empty space in existing chunks.
> Previously, a usage-filtered balance that only rewrote one chunk didn't
> actually free anything, since it simply allocated a new chunk to replace
> the one it freed, so at least two chunks needed to be rewritten to
> actually return space to unallocated...
>
> Now, usage-filtered rewrites of only a single chunk routinely frees the
> allocated space, because it writes that small bit of data in the freed
> chunk into existing free space in other chunks.
>
> At least I /presume/ that new balance-usage behavior is due to the ssd
> changes. Maybe it's due to other patches. Either way, it's an
> interesting and useful change. =:^)
I'm pretty sure it's due to the 'ssd' option change. The way it was
coded previously made the allocator rather averse to back-filling free
space, and balance just sends stuff back through the allocator again
(other than the filtering, that is quite literally all it does), so a
change to the allocator's behavior will change balance behavior too.
Regardless, this is also a good point that should probably be added to
the FAQ. Given this, it might also be worth recommending that people
with SSDs who have upgraded to 4.14 run a much more aggressive
filtered balance (thinking 50% usage and no limit filter) to repack
things a bit more efficiently.
Overall, I'm starting to think that the best option here is to update
the FAQ entry, and then have netdata's help text point to the FAQ entry
instead of trying to contain the same info.
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-10 3:49 ` Duncan
@ 2018-01-10 16:30 ` Tom Worster
2018-01-10 17:01 ` Austin S. Hemmelgarn
0 siblings, 1 reply; 34+ messages in thread
From: Tom Worster @ 2018-01-10 16:30 UTC (permalink / raw)
To: linux-btrfs
On 9 Jan 2018, at 22:49, Duncan wrote:
> AFAIK, such corruption reports re balance aren't really balance, per
> se, at all.
>
> Instead, what I've seen in nearly all cases is a number of filesystem
> maintenance commands involving heavy I/O colliding, that is, being run
> at the same time
I hope there is consensus on this because it might be the key to
resolving the contradictions that appear to me in the following
propositions that all seem plausible/reasonable:
- Depletion of unallocated space (DoUS, apologies for coining the term
if there already is one) is a property of BTRFS even if the volume's
capacity is more than enough for the files on it.
- To a user that isn't a BTRFS expert, DoUS can be unexpected, its
advance can be surprisingly fast and it can become severe.
- BTRFS does not recycle allocated but unused space to the unallocated
pool.
- Resolving severe DoUS involves either running `btrfs balance` or
recreating the filesystem from, e.g. backups.
- People have reported that `btrfs balance` sometimes causes filesystem
corruption.
- Some experienced users say that, to resolve a problem with DoUS, they
would rather recreate the filesystem than run balance.
- Some experienced users say you should stop all other use of the
filesystem while running balance.
- Some experts recommend running balance regularly, even once a day, to
prevent DoUS.
Without some satisfactory way to resolve the contradictions, I'm not
sure how to proceed. For example, I'm not willing to offload the
workload from each filesystem once a day for prophylactic balance. And
I'm not going to let balance run unattended if those more experienced
than me say it's known to corrupt filesystems. The best I can do is
monitor DoUS and respond ad hoc. Or I can use a different fs type.
But if Duncan is right (which, for me, is practically the same as
consensus on the proposition) that problems with corruption while
running balance are associated with heavy coincident IO activity, then I
can see a reasonable way forwards. I can even see how general
recommendations for BTRFS maintenance might develop.
Tom
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-10 16:30 ` Tom Worster
@ 2018-01-10 17:01 ` Austin S. Hemmelgarn
2018-01-10 18:33 ` Tom Worster
2018-01-11 8:51 ` Duncan
0 siblings, 2 replies; 34+ messages in thread
From: Austin S. Hemmelgarn @ 2018-01-10 17:01 UTC (permalink / raw)
To: Tom Worster, linux-btrfs
On 2018-01-10 11:30, Tom Worster wrote:
> On 9 Jan 2018, at 22:49, Duncan wrote:
>
>> AFAIK, such corruption reports re balance aren't really balance, per se,
>> at all.
>>
>> Instead, what I've seen in nearly all cases is a number of filesystem
>> maintenance commands involving heavy I/O colliding, that is, being run at
>> the same time
>
> I hope there is consensus on this because it might be the key to
> resolving the contradictions that appear to me in the following
> propositions that all seem plausible/reasonable:
>
> - Depletion of unallocated space (DoUS, apologies for coining the term
> if there already is one) is a property of BTRFS even if the volume's
> capacity is more than enough for the files on it.
Strictly speaking this particular statement is only true in that there
are still probably bugs in the allocator. The goal is for this to never
be a significant problem as long as you have a reasonable amount of free
space (reasonable being enough for at least a couple of chunks to be
allocated).
Also, for future reference, the term we typically use is ENOSPC, as
that's the symbolic name for the error code you get when this happens
(or when your filesystem is just normally full), but I actually kind of
like your name for it too, it conveys the exact condition being
discussed in a way that should be a bit easier for non-technical types
to understand.
>
> - To a user that isn't a BTRFS expert, DoUS can be unexpected, its
> advance can be surprisingly fast and it can become severe.
Absolutely correct, and actually true even for a number of BTRFS
'experts' (no, seriously, I know of a number of cases where this caught
'experts' (including myself) by surprise simply because they ran into a
corner case they had never dealt with or found a bug in the allocator).
>
> - BTRFS does not recycle allocated but unused space to the unallocated
> pool.
Kind of.
The regular BTRFS allocator will (usually) preferentially avoid using
blocks of free space smaller than a given size for new allocations.
Without the 'ssd' mount option set, or when using Linux kernel version
4.14 or newer, the minimum size is 64 KiB, so it's generally not too bad
unless you are regularly dealing with lots of small files that change
very frequently. With the 'ssd' mount option set on Linux kernels prior
to 4.14, the minimum size is 2 MiB, which tends to result in really poor
space utilization, though it's still mostly an issue with volumes
holding lots of small files that change frequently or see lots of small
changes to large files.
However, this does not mean that that space will always be unused. If
space gets tight, BTRFS will use that previously allocated space to its
fullest, and it will reuse it in other circumstances too.
>
> - Resolving severe DoUS involves either running `btrfs balance` or
> recreating the filesystem from, e.g. backups.
In most cases yes, though it is sometimes possible to resolve simply by
dropping snapshots if you have a lot of them and then deleting some files.
>
> - People have reported that `btrfs balance` sometimes causes filesystem
> corruption.
As I commented, I've not heard about this specifically, and I'm inclined
to agree with Duncan's assessment that it's probably from people running
multiple low-level maintenance operations happening concurrently
(running two or more balances at the same time is known to be able to
cause this type of corruption, and as a result there's locking in the
kernel to prevent you from running more than one balance at a time on a
filesystem).
>
> - Some experienced users say that, to resolve a problem with DoUS, they
> would rather recreate the filesystem than run balance.
This is kind of independent of BTRFS. A lot of seasoned system
administrators are going to be more likely to just rebuild a broken
filesystem from scratch if possible than repair it simply because it's
more reliable and generally guaranteed to fix the issue. It largely
comes down to the mentality of the individual, and how confident they
are that they can fix a problem in a reasonable amount of time without
causing damage elsewhere.
>
> - Some experienced users say you should stop all other use of the
> filesystem while running balance.
I've never seen any evidence that this is actually needed, but it does
make the balance operation finish faster. Strictly speaking, it
shouldn't be needed at all (that's part of the point of having CoW
semantics in the filesystem, it makes it easier to handle maintenance
on-line).
>
> - Some experts recommend running balance regularly, even once a day, to
> prevent DoUS.
>
> Without some satisfactory way to resolve the contradictions, I'm not
> sure how to proceed. For example, I'm not willing to offload the
> workload from each filesystem once a day for prophylactic balance. And
> I'm not going to let balance run unattended if those more experienced
> than me say it's known to corrupt filesystems. The best I can do is
> monitor DoUS and respond ad hoc. Or I can use a different fs type.
It may be worth seriously looking at whether you actually _need_ BTRFS
for your use case. In general, unless you need at least one of its
features, and either can't get that feature with ZFS or just want to
avoid using ZFS, you are likely better-off for the time being using
another filesystem.
In my case for example, I _really_ want to avoid dealing with ZFS on
Linux because of how it impacts what kernel versions I use and the fact
that I don't trust the proprietary NVIDIA drivers to get along with it,
and I need the checksumming and online transformation features
(reshaping, profile conversion, device replacement, etc) of BTRFS. If
it weren't for all of that, I would not be using BTRFS at all.
>
> But if Duncan is right (which, for me, is practically the same as
> consensus on the proposition) that problems with corruption while
> running balance are associated with heavy coincident IO activity, then I
> can see a reasonable way forwards. I can even see how general
> recommendations for BTRFS maintenance might develop.
As I commented above, I would tend to believe Duncan is right in this
case (both because it makes sense, and because he seems to generally be
right about this type of thing). That said, I really do think that
normal user I/O is probably not the issue, but low-level filesystem
operations are. Even so, there is no reason that BTRFS shouldn't either:
1. Handle this just fine without causing corruption.
or:
2. Extend the mutex used to prevent concurrent balances to cover other
operations that might cause issues (that is, make it so you can't scrub
a filesystem while it's being balanced, or defragment it, or whatever else).
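Until the kernel does either of those, userspace can at least serialize its own maintenance jobs. A sketch using an exclusive flock(1) around each operation (the lock path and mountpoint are placeholders, and `DRY_RUN=echo` prints the commands instead of running them):

```shell
#!/bin/sh
# Sketch: run low-level maintenance operations under one exclusive lock
# so cron jobs for scrub, balance, defrag, etc. can never overlap.
MNT=${MNT:-/mnt/data}                  # placeholder mountpoint
LOCK=${LOCK:-/tmp/btrfs-maint.lock}    # placeholder per-filesystem lock
DRY_RUN=${DRY_RUN:-echo}               # clear to actually execute

flock -x "$LOCK" $DRY_RUN btrfs scrub start -B "$MNT"
flock -x "$LOCK" $DRY_RUN btrfs balance start -dusage=25 "$MNT"
```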
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-10 17:01 ` Austin S. Hemmelgarn
@ 2018-01-10 18:33 ` Tom Worster
2018-01-10 20:44 ` Timofey Titovets
2018-01-11 8:51 ` Duncan
1 sibling, 1 reply; 34+ messages in thread
From: Tom Worster @ 2018-01-10 18:33 UTC (permalink / raw)
To: Austin S. Hemmelgarn; +Cc: linux-btrfs
On 10 Jan 2018, at 12:01, Austin S. Hemmelgarn wrote:
> On 2018-01-10 11:30, Tom Worster wrote:
>
> Also, for future reference, the term we typically use is ENOSPC, as
> that's the symbolic name for the error code you get when this happens
> (or when your filesystem is just normally full), but I actually kind
> of like your name for it too, it conveys the exact condition being
> discussed in a way that should be a bit easier for non-technical types
> to understand.
IIUC, ENOSPC is _exhaustion_ of unallocated space, which is a specific
case of depletion.
I sought a term to refer to the phenomenon of unallocated space
shrinking beyond what filesystem use would demand and how it ratchets
down. Hence a sysop needs to manage DoUS. ENOSPC is likely a failure of
such management.
>> - Some experienced users say that, to resolve a problem with DoUS,
>> they would rather recreate the filesystem than run balance.
> This is kind of independent of BTRFS.
Yes. I mentioned it only because it was, to me, a striking statement of
lack of confidence in balance.
>> But if Duncan is right (which, for me, is practically the same as
>> consensus on the proposition) that problems with corruption while
>> running balance are associated with heavy coincident IO activity,
>> then I can see a reasonable way forwards. I can even see how general
>> recommendations for BTRFS maintenance might develop.
> As I commented above, I would tend to believe Duncan is right in this
> case (both because it makes sense, and because he seems to generally
> be right about this type of thing). That said, I really do think that
> normal user I/O is probably not the issue, but low-level filesystem
> operations are. That said, there is no reason that BTRFS shouldn't
> either:
> 1. Handle this just fine without causing corruption.
> or:
> 2. Extend the mutex used to prevent concurrent balances to cover other
> operations that might cause issues (that is, make it so you can't
> scrub a filesystem while it's being balanced, or defragment it, or
> whatever else).
Yes, but backtracking a bit, I think there's another really important
point here. Assuming Duncan's right, it's not so hard to develop
guidelines for general BTRFS management that include DoUS among other
topics. Duncan's other email today contains or implies quite a lot of
those guidelines.
Or, to put it another way, it's enough for me. I think I know what to do
now. And that much could be written down for the benefit of others.
Tom
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-10 18:33 ` Tom Worster
@ 2018-01-10 20:44 ` Timofey Titovets
2018-01-11 13:00 ` Austin S. Hemmelgarn
0 siblings, 1 reply; 34+ messages in thread
From: Timofey Titovets @ 2018-01-10 20:44 UTC (permalink / raw)
To: Tom Worster; +Cc: Austin S. Hemmelgarn, linux-btrfs
2018-01-10 21:33 GMT+03:00 Tom Worster <fsb@thefsb.org>:
> On 10 Jan 2018, at 12:01, Austin S. Hemmelgarn wrote:
>
>> On 2018-01-10 11:30, Tom Worster wrote:
>>
>> Also, for future reference, the term we typically use is ENOSPC, as that's
>> the symbolic name for the error code you get when this happens (or when your
>> filesystem is just normally full), but I actually kind of like your name for
>> it too, it conveys the exact condition being discussed in a way that should
>> be a bit easier for non-technical types to understand.
>
>
> Iiuc, ENOSPC is _exhaustion_ of unallocated space, which is a specific case
> of depletion.
>
> I sought a term to refer to the phenomenon of unallocated space shrinking
> beyond what filesystem use would demand and how it ratchets down. Hence a
> sysop needs to manage DoUS. ENOSPC is likely a failure of such management.
>
>
>>> - Some experienced users say that, to resolve a problem with DoUS, they
>>> would rather recreate the filesystem than run balance.
>>
>> This is kind of independent of BTRFS.
>
>
> Yes. I mentioned it only because it was, to me, a striking statement of lack
> of confidence in balance.
>
>
>>> But if Duncan is right (which, for me, is practically the same as
>>> consensus on the proposition) that problems with corruption while running
>>> balance are associated with heavy coincident IO activity, then I can see a
>>> reasonable way forwards. I can even see how general recommendations for
>>> BTRFS maintenance might develop.
>>
>> As I commented above, I would tend to believe Duncan is right in this case
>> (both because it makes sense, and because he seems to generally be right
>> about this type of thing). That said, I really do think that normal user
>> I/O is probably not the issue, but low-level filesystem operations are.
>> That said, there is no reason that BTRFS shouldn't either:
>> 1. Handle this just fine without causing corruption.
>> or:
>> 2. Extend the mutex used to prevent concurrent balances to cover other
>> operations that might cause issues (that is, make it so you can't scrub a
>> filesystem while it's being balanced, or defragment it, or whatever else).
>
>
> Yes, but backtracking a bit, I think there's another really important point
> here. Assuming Duncan's right, it's not so hard to develop guidelines for
> general BTRFS management that include DoUS among other topics. Duncan's
> other email today contains or implies quite a lot of those guidelines.
>
> Or, to put it another way, it's enough for me. I think I know what to do
> now. And that much could be written down for the benefit of others.
>
> Tom
>
My two cents,
I have about ~50 different systems
(VCS systems, MySQL DBs, web servers, Elastic Search nodes, etc.).
All of them run btrfs only and run fine, even with automatic snapshot
rotation on some of them
(btrfs makes my life easier and I like it).
Most of them are small VMs, from 3GiB to 512GiB (I use compression everywhere).
None of them need balance; the only thing I care about is
always keeping some unallocated space on them.
Most of them settle at some stable used/allocated/unallocated ratio.
I.e., as I see it from this conversation's point of view:
we run balance to reallocate data -> make more unallocated space,
but if someone already has plenty of it, that's useless, no?
E.g. I have 60% allocated by data/metadata chunks on my notebook,
and only 40% really used by data; even when I have 90% allocated
and 85% used, I don't run into ENOSPC problems (256GiB ssd).
And if I run balance, I run it only to fight the btrfs discard processing
bug which leads to trimming only unallocated space (probably fixed already).
So if we talk about "regular" running of balance, maybe it makes sense
to check free space first, i.e. only if the system has some percentage of
space allocated, like 80%,
and has plenty of allocated-but-unused space, will a balance be needed, no?
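A check like the one described above can be sketched as a small script. The 80%/50% thresholds and the byte counts are illustrative assumptions, not recommendations:

```shell
#!/bin/sh
# Sketch: decide whether a balance looks worthwhile from three byte counts
# (device size, allocated, used -- as reported by `btrfs filesystem usage`).
# "Worthwhile" here means: most of the device is allocated, but less than
# half of that allocation actually holds data.  Thresholds are illustrative.
needs_balance() {
    total=$1 allocated=$2 used=$3
    awk -v t="$total" -v a="$allocated" -v u="$used" \
        'BEGIN { exit !(a / t > 0.80 && u / a < 0.50) }'
}

GiB=1073741824
if needs_balance $((256 * GiB)) $((230 * GiB)) $((100 * GiB)); then
    echo "balance recommended"
else
    echo "allocation looks healthy"
fi
```

A monitoring system could run such a check daily and only trigger a filtered balance when it fires, rather than balancing unconditionally.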
(I'm not saying that btrfs has no problems; I see some rare hateful bugs
on some systems, but most of them are internal btrfs problems
or problems with the interaction between btrfs and applications.)
Thanks.
--
Have a nice day,
Timofey.
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-08 15:55 Recommendations for balancing as part of regular maintenance? Austin S. Hemmelgarn
2018-01-08 16:20 ` ein
@ 2018-01-10 21:37 ` waxhead
2018-01-11 12:50 ` Austin S. Hemmelgarn
2018-01-11 19:56 ` Hans van Kranenburg
2018-01-12 18:24 ` Austin S. Hemmelgarn
2 siblings, 2 replies; 34+ messages in thread
From: waxhead @ 2018-01-10 21:37 UTC (permalink / raw)
To: Austin S. Hemmelgarn, Btrfs BTRFS
Austin S. Hemmelgarn wrote:
> So, for a while now I've been recommending small filtered balances to
> people as part of regular maintenance for BTRFS filesystems under the
> logic that it does help in some cases and can't really hurt (and if done
> right, is really inexpensive in terms of resources). This ended up
> integrated partially in the info text next to the BTRFS charts on
> netdata's dashboard, and someone has now pointed out (correctly I might
> add) that this is at odds with the BTRFS FAQ entry on balances.
>
> For reference, here's the bit about it in netdata:
>
> You can keep your volume healthy by running the `btrfs balance` command
> on it regularly (check `man btrfs-balance` for more info).
>
>
> And here's the FAQ entry:
>
> Q: Do I need to run a balance regularly?
>
> A: In general usage, no. A full unfiltered balance typically takes a
> long time, and will rewrite huge amounts of data unnecessarily. You may
> wish to run a balance on metadata only (see Balance_Filters) if you find
> you have very large amounts of metadata space allocated but unused, but
> this should be a last resort.
>
>
> I've commented in the issue in netdata's issue tracker that I feel that
> the FAQ entry could be better worded (strictly speaking, you don't
> _need_ to run balances regularly, but it's usually a good idea). Looking
> at both though, I think they could probably both be improved, but I
> would like to get some input here on what people actually think the best
> current practices are regarding this (and ideally why they feel that
> way) before I go and change anything.
>
> So, on that note, how does anybody else out there feel about this? Is
> balancing regularly with filters restricting things to small numbers of
> mostly empty chunks a good thing for regular maintenance or not?
> --
As just a regular user, I would think that the first thing you would
need is an analysis that can tell you whether it is a good idea to
balance or not in the first place.
Scrub seems like a great place to start - e.g. scrub could auto-analyze
and report back the need to balance. I also think that scrub should
optionally autobalance if needed.
Balance may not be needed, but if one can determine that balancing would
speed things up a bit, I don't see why this can't be offered as an
option that is scheduled automatically. Ideally there should be a "scrub
and polish" option that would scrub, balance and perhaps even defragment
in one go.
In fact, the way I see it, btrfs should ideally keep track of each
data/metadata chunk by itself: it should know when each chunk was last
affected by a scrub, balance, defrag, etc., and perform the required
operations on its own based on a configuration or similar. Some may
disagree for good reasons, but this is my wishlist for a
filesystem :) e.g. a pool that just works and only annoys you with the
need to replace a bad disk every now and then :)
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-10 17:01 ` Austin S. Hemmelgarn
2018-01-10 18:33 ` Tom Worster
@ 2018-01-11 8:51 ` Duncan
1 sibling, 0 replies; 34+ messages in thread
From: Duncan @ 2018-01-11 8:51 UTC (permalink / raw)
To: linux-btrfs
Austin S. Hemmelgarn posted on Wed, 10 Jan 2018 12:01:42 -0500 as
excerpted:
>> - Some experienced users say that, to resolve a problem with DoUS, they
>> would rather recreate the filesystem than run balance.
> This is kind of independent of BTRFS. A lot of seasoned system
> administrators are going to be more likely to just rebuild a broken
> filesystem from scratch if possible than repair it simply because it's
> more reliable and generally guaranteed to fix the issue. It largely
> comes down to the mentality of the individual, and how confident they
> are that they can fix a problem in a reasonable amount of time without
> causing damage elsewhere.
Specific to this one...
I'm known around here for harping on the backup point (hold on, I'll
explain how that ties in). A/the sysadmin's first rule of backups: The
(true) value of your data is defined not by any arbitrary claims, but by
how many backups of that data you consider it worth having. No backups
defines the data as of only trivial value, worth less than the time/
trouble/resources necessary to make that backup.
It therefore follows that in the event of a data mishap, a sysadmin can
always rest happy, because regardless of what might have been lost,
whatever their actions defined as of *MOST* value, either the data if it
was backed up, or the time/trouble/resources that would have otherwise
gone into that backup if not, was *ALWAYS* saved.
Understanding that puts an entirely different spin on backups and data
mishaps, taking a lot of the pressure off when things /do/ go wrong,
because one understands that the /true/ value of that data was defined
long before, and now we're simply dealing with the results of our
decision to define it that way, only playing out the story we set up for
ourselves long before.
But how does that apply to the current discussion?
Simply this way: For someone understanding the above, repair is never a
huge problem or priority, because the data was either of such trivial
value as to make it no big deal, or there were backups, thus making this
particular instance of the data, and the necessity of repair, no big deal.
Once /that/ is understood, the question of repair vs. rebuild from
scratch (or even simply fail-over to the hot-spare and send the old
filesystem component devices to be tested for reuse or recycle) becomes
purely one of efficiency, and the answer ends up being pretty
predictable, because rebuild from scratch and restore from backup should
be near 100% reliable on a reasonable/predictable time frame, vs.
/attempting/ a repair with unknown likelihood of success and a much
/less/ predictable time frame, especially since there's a non-trivial
chance one will have to fall back to the rebuild from scratch and backups
method anyway, after repair attempts fail.
Once one is thinking in those terms and already has backups accordingly,
even for home or other one-off systems where actual formatting and
restore from backups is going to be manual and thus will take longer than
a trivial fix, the practical limits on the extent to which one is
willing to go to get a fix are pretty narrow, and while one might try a
couple fixes if they're easy and quick enough, beyond that it very
quickly becomes restore from backups time if the data was considered
valuable enough to be worth making them, or simply throw it away and
start over if the data wasn't considered valuable enough to be worth
making a backup in the first place.
So it's really independent of btrfs and not a reflection on the reliability
of balance, etc, at all. It's simply a reflection of understanding the
realities of possible repair... or not and having to replace anyway...
without a good estimate on the time required either way... vs. a (near)
100% guaranteed fix and back in business, in a relatively tightly
predictable timeframe. Couple that with the possibility that a repair
may leave other problems latent and ready to be exposed later, while
starting over from scratch gives you a "clean starting point", and it's
pretty much a no-brainer, regardless of the filesystem... or whatever
else (hardware, software layers other than the filesystem) may be in use.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-10 21:37 ` waxhead
@ 2018-01-11 12:50 ` Austin S. Hemmelgarn
2018-01-11 19:56 ` Hans van Kranenburg
1 sibling, 0 replies; 34+ messages in thread
From: Austin S. Hemmelgarn @ 2018-01-11 12:50 UTC (permalink / raw)
To: waxhead, Btrfs BTRFS
On 2018-01-10 16:37, waxhead wrote:
> Austin S. Hemmelgarn wrote:
>> So, for a while now I've been recommending small filtered balances to
>> people as part of regular maintenance for BTRFS filesystems under the
>> logic that it does help in some cases and can't really hurt (and if done
>> right, is really inexpensive in terms of resources). This ended up
>> integrated partially in the info text next to the BTRFS charts on
>> netdata's dashboard, and someone has now pointed out (correctly I might
>> add) that this is at odds with the BTRFS FAQ entry on balances.
>>
>> For reference, here's the bit about it in netdata:
>>
>> You can keep your volume healthy by running the `btrfs balance` command
>> on it regularly (check `man btrfs-balance` for more info).
>>
>>
>> And here's the FAQ entry:
>>
>> Q: Do I need to run a balance regularly?
>>
>> A: In general usage, no. A full unfiltered balance typically takes a
>> long time, and will rewrite huge amounts of data unnecessarily. You may
>> wish to run a balance on metadata only (see Balance_Filters) if you find
>> you have very large amounts of metadata space allocated but unused, but
>> this should be a last resort.
>>
>>
>> I've commented in the issue in netdata's issue tracker that I feel that
>> the FAQ entry could be better worded (strictly speaking, you don't
>> _need_ to run balances regularly, but it's usually a good idea). Looking
>> at both though, I think they could probably both be improved, but I
>> would like to get some input here on what people actually think the best
>> current practices are regarding this (and ideally why they feel that
>> way) before I go and change anything.
>>
>> So, on that note, how does anybody else out there feel about this? Is
>> balancing regularly with filters restricting things to small numbers of
>> mostly empty chunks a good thing for regular maintenance or not?
>> --
> As just a regular user I would think that the first thing you would need
> is an analyze that can tell you if it is a good idea to balance or not
> in the first place.
In an ideal situation, the only reason it should ever be a bad idea to
run a balance is the performance impact (which is of course why we have
filters). Beyond that though, there's too much involved for even a
computer to reliably tell you if it will be beneficial to run a balance
or not. It depends not just on how the data looks on the filesystem,
but also on how you are going to be using the filesystem in the near
future. For example, if you've got a number of large blocks of empty
space within data chunks, it might make sense to balance, but not if
you're likely to be adding a bunch of new files in the very near future
(they will just end up packed into that empty space in existing chunks,
and your actual layout on disk shouldn't be all that different from if
you had run a balance).
>
> Scrub seems like a great place to start - e.g. scrub could auto-analyze
> and report back need to balance. I also think that scrub should
> optionally autobalance if needed.
>
> Balance may not be needed, but if one can determine that balancing would
> speed up things a bit I don't see why this as an option can't be
> scheduled automatically. Ideally there should be a "scrub and polish"
> option that would scrub, balance and perhaps even defragment in one go.
In this case, the recommendation isn't as much about speed as it is
about trying to keep things from getting into a state where you get
ENOSPC but conventional tools report lots of free space. As a general
rule, unless things are pathologically bad to begin with, balancing a
filesystem won't have any measurable impact on performance.
>
> In fact, the way I see it btrfs should idealy by itself keep track on
> each data/metadata chunk and it should know , when was this chunk last
> affected by a scrub, balance, defrag etc and perform the required
> operations by itself based on a configuration or similar. Some may
> disagree for good reasons , but for me this is my wishlist for a
> filesystem :) e.g. a pool that just works and only annoys you with the
> need of replacing a bad disk every now and then :)
Long-term, that type of thing is a goal, but I doubt that we're going
to go that far with automation (even ZFS doesn't go that far; you still
have to schedule scrubs and similar things).
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-10 20:44 ` Timofey Titovets
@ 2018-01-11 13:00 ` Austin S. Hemmelgarn
0 siblings, 0 replies; 34+ messages in thread
From: Austin S. Hemmelgarn @ 2018-01-11 13:00 UTC (permalink / raw)
To: Timofey Titovets, Tom Worster; +Cc: linux-btrfs
On 2018-01-10 15:44, Timofey Titovets wrote:
> 2018-01-10 21:33 GMT+03:00 Tom Worster <fsb@thefsb.org>:
>> On 10 Jan 2018, at 12:01, Austin S. Hemmelgarn wrote:
>>
>>> On 2018-01-10 11:30, Tom Worster wrote:
>>>
>>> Also, for future reference, the term we typically use is ENOSPC, as that's
>>> the symbolic name for the error code you get when this happens (or when your
>>> filesystem is just normally full), but I actually kind of like your name for
>>> it too, it conveys the exact condition being discussed in a way that should
>>> be a bit easier for non-technical types to understand.
>>
>>
>> Iiuc, ENOSPC is _exhaustion_ of unallocated space, which is a specific case
>> of depletion.
>>
>> I sought a term to refer to the phenomenon of unallocated space shrinking
>> beyond what filesystem use would demand and how it ratchets down. Hence a
>> sysop needs to manage DoUS. ENOSPC is likely a failure of such management.
>>
>>
>>>> - Some experienced users say that, to resolve a problem with DoUS, they
>>>> would rather recreate the filesystem than run balance.
>>>
>>> This is kind of independent of BTRFS.
>>
>>
>> Yes. I mentioned it only because it was, to me, a striking statement of lack
>> of confidence in balance.
>>
>>
>>>> But if Duncan is right (which, for me, is practically the same as
>>>> consensus on the proposition) that problems with corruption while running
>>>> balance are associated with heavy coincident IO activity, then I can see a
>>>> reasonable way forwards. I can even see how general recommendations for
>>>> BTRFS maintenance might develop.
>>>
>>> As I commented above, I would tend to believe Duncan is right in this case
>>> (both because it makes sense, and because he seems to generally be right
>>> about this type of thing). That said, I really do think that normal user
>>> I/O is probably not the issue, but low-level filesystem operations are.
>>> That said, there is no reason that BTRFS shouldn't either:
>>> 1. Handle this just fine without causing corruption.
>>> or:
>>> 2. Extend the mutex used to prevent concurrent balances to cover other
>>> operations that might cause issues (that is, make it so you can't scrub a
>>> filesystem while it's being balanced, or defragment it, or whatever else).
>>
>>
>> Yes, but backtracking a bit, I think there's another really important point
>> here. Assuming Duncan's right, it's not so hard to develop guidelines for
>> general BTRFS management that include DoUS among other topics. Duncan's
>> other email today contains or implies quite a lot of those guidelines.
>>
>> Or, to put it another way, it's enough for me. I think I know what to do
>> now. And that much could be written down for the benefit of others.
>>
>> Tom
>>
>>
>
> My two cents,
> I've about ~50 different systems
> (VCS Systems, MySQL DB, Web Servers, Elastic Search nodes & etc.).
> All running btrfs only and run fine, even with auto snapshot rotating
> on some of them,
> (btrfs make my life easier and i like it).
>
> Most of them are small VMs From 3GiB..512GiB (I use compression everywhere).
> And no one of them need balance, only that i care,
> i try have always some unallocated space on it.
>
> Most of them are stuck with some used/allocated/unallocated ratio.
>
> I.e. as i see from conversation point of view.
> We run balance for reallocate data -> make more unallocated space,
> but if someone have plenty of it, that useless, no?
Not exactly. In terms of reactive, after-the-fact maintenance, that's
most of what it's used for. For preventative maintenance (which is
what the recommendations I'm trying to work out are about), it can be
used to help avoid ending up in such situations in the first place.
IOW, balancing in small increments on a regular basis can be used as a
prophylactic measure to help keep things from getting into a state
where you have a bunch of free space in one type of chunk, but need
more space in another type of chunk and can't allocate any chunks of
that second type. That is the most common case of ENOSPC problems: df
and statvfs() both show lots of free space, but certain VFS ops
reliably return ENOSPC. While you're unlikely to end up in such a
state if you keep a reasonable amount of space unallocated, it is still
possible, and even aside from that, balancing can help keep the load
evenly distributed across a multi-device volume.
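To make that failure mode concrete, here's a small standalone sketch
(all the numbers are hypothetical, not read from a real filesystem) of
how a volume can report lots of free space while still being unable to
allocate the one chunk it needs:

```python
# Hypothetical allocation state of a 100 GiB btrfs volume, in GiB.
TOTAL = 100
data_allocated = 98      # space reserved for data chunks
data_used = 60           # data actually stored in those chunks
meta_allocated = 2       # space reserved for metadata chunks
meta_used = 2            # metadata chunks are completely full

unallocated = TOTAL - data_allocated - meta_allocated

# df/statvfs-style "free space" is the unused room inside data chunks
# plus unallocated space, so the volume looks healthy...
free_reported = (data_allocated - data_used) + unallocated
print(free_reported)     # 38 (GiB reported free)

# ...but writing new metadata needs a new metadata chunk (say
# 0.25 GiB), and there is no unallocated space to carve it from,
# so the operation fails with ENOSPC despite "38 GiB free".
METADATA_CHUNK = 0.25
print(unallocated >= METADATA_CHUNK)   # False -> ENOSPC
```

A filtered balance compacts the mostly-empty data chunks back into
unallocated space before the metadata side ever hits this wall.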
>
> ex. I've 60% allocated by data/meta data chunks on my notebook,
> And only 40% are really used by data, even then i have 90% allocated,
> and 85% used, i don't face into ENOSPC problems. (256GiB ssd).
>
> And if i run balance, i run it only to fight with btrfs discard processing bug,
> which leads to trim only unallocated space (probably fixed already).
Yes, it is fixed in mainline, though I forget what kernel version the
fix went into (I think 4.9 and newer have it, but I'm not sure).
>
> So if we talk about "regular" running of balance, may be that make a sense
> To check free space, i.e. if system have some percentage of space
> allocated, like 80%,
> and have plenty of allocated/unused space, only then balance will be needed, no?
>
> (I'm not say that btrfs have no problems, i see some rare hateful bugs,
> on some systems, but most of them are internal btrfs problems
> or problems with coop of btrfs with applications).
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-10 21:37 ` waxhead
2018-01-11 12:50 ` Austin S. Hemmelgarn
@ 2018-01-11 19:56 ` Hans van Kranenburg
1 sibling, 0 replies; 34+ messages in thread
From: Hans van Kranenburg @ 2018-01-11 19:56 UTC (permalink / raw)
To: waxhead, Austin S. Hemmelgarn, Btrfs BTRFS
On 01/10/2018 10:37 PM, waxhead wrote:
> As just a regular user I would think that the first thing you would need
> is an analyze that can tell you if it is a good idea to balance or not
> in the first place.
Tooling to create that is available. Btrfs allows you to read a lot of
different data to analyze, and then you can experiment with your own
algorithms to find out which blockgroup you're going to feed to balance
next.
There are two language options:
* C -> in this case you're extending btrfs-progs to build new tools
* Python -> python-btrfs has everything in it to quickly throw together
things like this, and examples are available with the source.
For example:
- balance_least_used.py -> balance starting from the least used chunk,
working up towards a max of X% used.
- show_free_space_fragmentation.py -> find out which chunks have badly
fragmented free space. Remember: if you have a 1GiB chunk with usage
50%, that doesn't tell you if it has only a handful of extents, filling
up the first 500MiB, with the rest empty, or if it's thousands of
alternating pieces of 4KiB used space and 4KiB free space. ;-] [0]
In the same way you can program something new, like a balance algorithm
that cleans up blocks with high free space fragmentation first.
Or, another thing you could do is first count the number of extents in
a block group and factor that into the algorithm. Balancing a block
group with a few extents is much faster than one with thousands of
extents and a lot of reflinks, like highly deduped data.
Or... look at generation of metadata to find out which parts of data on
your disk have been touched recently, and which weren't... Too many fun
things to play around with. \:D/
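To give a feel for what such an algorithm looks like, here's a tiny
standalone sketch of the least-used-first selection (the chunk tuples
are made up; the real python-btrfs library would supply this data and
feed the selected addresses to the balance ioctl):

```python
# Sketch of a "balance least used first" policy, in the spirit of
# balance_least_used.py.  The chunk list below is hypothetical.
chunks = [
    # (virtual address, length in bytes, used bytes)
    (0x100000000, 1 << 30, 900 << 20),   # ~88% used
    (0x140000000, 1 << 30, 100 << 20),   # ~10% used
    (0x180000000, 1 << 30, 300 << 20),   # ~29% used
]

MAX_USAGE_PCT = 50  # stop once chunks are more than X% used

def usage_pct(chunk):
    _, length, used = chunk
    return 100 * used / length

# Work from the least used chunk upward, as the example script does.
candidates = sorted(chunks, key=usage_pct)
to_balance = [c for c in candidates if usage_pct(c) <= MAX_USAGE_PCT]
for vaddr, _, _ in to_balance:
    # Real code would hand each vaddr to balance via the vrange filter.
    print(hex(vaddr))
```

Swapping the sort key for a free-space-fragmentation score or an
extent count gives you the other variants described above.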
As always, the first thing to do is make sure you're on 4.14 or else
use nossd; otherwise you might keep shoveling data around forever.
And if your filesystem has been treated badly by <4.14 kernel in ssd
mode for a long time, then first get that cleaned up:
https://www.spinics.net/lists/linux-btrfs/msg70622.html
> Scrub seems like a great place to start - e.g. scrub could auto-analyze
> and report back need to balance. I also think that scrub should
> optionally autobalance if needed.
>
> Balance may not be needed, but if one can determine that balancing would
> speed up things a bit I don't see why this as an option can't be
> scheduled automatically. Ideally there should be a "scrub and polish"
> option that would scrub, balance and perhaps even defragment in one go.
>
> In fact, the way I see it btrfs should idealy by itself keep track on
> each data/metadata chunk and it should know , when was this chunk last
> affected by a scrub, balance, defrag etc and perform the required
> operations by itself based on a configuration or similar. Some may
> disagree for good reasons , but for me this is my wishlist for a
> filesystem :) e.g. a pool that just works and only annoys you with the
> need of replacing a bad disk every now and then :)
I don't think these kind of things will ever end up in kernel code.
[0] There's a version in the devel branch in git that also works without
free space tree, taking a slower detour via the extent tree.
--
Hans van Kranenburg
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-10 4:38 ` Duncan
2018-01-10 12:41 ` Austin S. Hemmelgarn
@ 2018-01-11 20:12 ` Hans van Kranenburg
1 sibling, 0 replies; 34+ messages in thread
From: Hans van Kranenburg @ 2018-01-11 20:12 UTC (permalink / raw)
To: Duncan, linux-btrfs
On 01/10/2018 05:38 AM, Duncan wrote:
> [...]
>
> And I've definitely noticed an effect since the ssd option stopped using
> the 2 MiB spreading algorithm in 4.14.
Glad to hear. :-)
> In particular, while chunk usage
> was generally stable before that and I only occasionally needed to run
> balance to clear out empty chunks, now, balance with the usage filter
> will apparently actively fill in empty space in existing chunks, so while
> previously a usage-filtered balance that only rewrote one chunk didn't
> actually free anything, simply allocating a new chunk to replace the one
> it freed, so at least two chunks needed rewritten to actually free space
> back to unallocated...
>
> Now, usage-filtered rewrites of only a single chunk routinely frees the
> allocated space, because it writes that small bit of data in the freed
> chunk into existing free space in other chunks.
And that back-filling of existing chunks indeed also means a decrease
in total work that needs to be done. But it probably also means that
the free space gaps you have/had were rather small, so they got ignored
in the past; large free gaps would always get data written into them by
balance. It probably also means that now that you're on 4.14, far
fewer of these small free space gaps will be left behind, because
they're immediately reused by new small writes.
> At least I /presume/ that new balance-usage behavior is due to the ssd
> changes.
Most probably, yes.
> Maybe it's due to other patches. Either way, it's an
> interesting and useful change. =:^)
--
Hans van Kranenburg
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-08 15:55 Recommendations for balancing as part of regular maintenance? Austin S. Hemmelgarn
2018-01-08 16:20 ` ein
2018-01-10 21:37 ` waxhead
@ 2018-01-12 18:24 ` Austin S. Hemmelgarn
2018-01-12 19:26 ` Tom Worster
2018-01-13 22:09 ` Chris Murphy
2 siblings, 2 replies; 34+ messages in thread
From: Austin S. Hemmelgarn @ 2018-01-12 18:24 UTC (permalink / raw)
To: Btrfs BTRFS
On 2018-01-08 10:55, Austin S. Hemmelgarn wrote:
> So, for a while now I've been recommending small filtered balances to
> people as part of regular maintenance for BTRFS filesystems under the
> logic that it does help in some cases and can't really hurt (and if done
> right, is really inexpensive in terms of resources). This ended up
> integrated partially in the info text next to the BTRFS charts on
> netdata's dashboard, and someone has now pointed out (correctly I might
> add) that this is at odds with the BTRFS FAQ entry on balances.
>
> For reference, here's the bit about it in netdata:
>
> You can keep your volume healthy by running the `btrfs balance` command
> on it regularly (check `man btrfs-balance` for more info).
>
>
> And here's the FAQ entry:
>
> Q: Do I need to run a balance regularly?
>
> A: In general usage, no. A full unfiltered balance typically takes a
> long time, and will rewrite huge amounts of data unnecessarily. You may
> wish to run a balance on metadata only (see Balance_Filters) if you find
> you have very large amounts of metadata space allocated but unused, but
> this should be a last resort.
>
>
> I've commented in the issue in netdata's issue tracker that I feel that
> the FAQ entry could be better worded (strictly speaking, you don't
> _need_ to run balances regularly, but it's usually a good idea). Looking
> at both though, I think they could probably both be improved, but I
> would like to get some input here on what people actually think the best
> current practices are regarding this (and ideally why they feel that
> way) before I go and change anything.
>
> So, on that note, how does anybody else out there feel about this? Is
> balancing regularly with filters restricting things to small numbers of
> mostly empty chunks a good thing for regular maintenance or not?
OK, I've gotten a lot of good feedback on this, and the general
consensus seems to be:
* If we're going to recommend regular balance, we should explain how it
actually helps things.
* We should mention the performance interactions with qgroups, as well
as warning people off of running other things like scrubs or defrag
concurrently.
* The filters should be reasonably tame in terms of chunk selection.
* BTRFS should ideally get smarter about this kind of thing so the user
doesn't have to be.
To that end, I propose the following text for the FAQ:
Q: Do I need to run a balance regularly?
A: While not strictly necessary for normal operations, running a
filtered balance regularly can help prevent your filesystem from ending
up with ENOSPC issues. The following command run daily on each BTRFS
volume should be more than sufficient for most users:
`btrfs balance start -dusage=25 -dlimit=2..10 -musage=25 -mlimit=2..10`
If you are running a kernel older than version 4.4 and can't upgrade,
the following should be used instead:
`btrfs balance start -dusage=25 -musage=25`
Both of these commands effectively compact partially full chunks on the
filesystem, freeing up unallocated space from which new chunks can be
allocated. For more information on what the options actually mean,
check out `man btrfs-balance`.
When run regularly, both of these should complete extremely fast on most
BTRFS volumes. Note that these may run significantly slower on volumes
which have quotas enabled. Additionally, it's best to make sure other
things aren't putting a lot of load on the filesystem while running a
balance, so try to make sure this doesn't run at the same time as a
scrub or defrag.
A full, unfiltered balance (one without any options passed in) is
completely unnecessary for normal usage of a filesystem.
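For anyone who wants to script this, here's a rough sketch of a wrapper
(the mountpoint, the choice of Python, and the kernel-version check are
my own assumptions, not part of the proposed FAQ text):

```python
import subprocess

def build_balance_cmd(mountpoint, kernel=(4, 14)):
    """Build the filtered balance command from the proposed FAQ text.

    Kernels older than 4.4 can't use the limit filter's range
    syntax, so fall back to the usage-only variant there.
    """
    if kernel >= (4, 4):
        filters = ["-dusage=25", "-dlimit=2..10",
                   "-musage=25", "-mlimit=2..10"]
    else:
        filters = ["-dusage=25", "-musage=25"]
    return ["btrfs", "balance", "start"] + filters + [mountpoint]

if __name__ == "__main__":
    # Hypothetical mountpoint; run this once a day from cron or a timer.
    cmd = build_balance_cmd("/mnt/data")
    print(" ".join(cmd))
    # subprocess.run(cmd, check=True)  # uncomment on a real system
```

A daily cron entry or systemd timer invoking this matches the
recommendation above.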
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-12 18:24 ` Austin S. Hemmelgarn
@ 2018-01-12 19:26 ` Tom Worster
2018-01-12 19:43 ` Austin S. Hemmelgarn
2018-01-13 22:09 ` Chris Murphy
1 sibling, 1 reply; 34+ messages in thread
From: Tom Worster @ 2018-01-12 19:26 UTC (permalink / raw)
To: Austin S. Hemmelgarn; +Cc: Btrfs BTRFS
On 12 Jan 2018, at 13:24, Austin S. Hemmelgarn wrote:
> OK, I've gotten a lot of good feedback on this, and the general
> consensus seems to be:
>
> * If we're going to recommend regular balance, we should explain how
> it actually helps things.
> * We should mention the performance interactions with qgroups, as well
> as warning people off of running other things like scrubs or defrag
> concurrently.
> * The filters should be reasonably tame in terms of chunk selection.
> * BTRFS should ideally get smarter about this kind of thing so the
> user doesn't have to be.
>
> To that end, I propose the following text for the FAQ:
>
> Q: Do I need to run a balance regularly?
>
> A: While not strictly necessary for normal operations, running a
> filtered balance regularly can help prevent your filesystem from
> ending up with ENOSPC issues. The following command run daily on each
> BTRFS volume should be more than sufficient for most users:
>
> `btrfs balance start -dusage=25 -dlimit=2..10 -musage=25
> -mlimit=2..10`
>
> If you are running a kernel older than version 4.4 and can't upgrade,
> the following should be used instead:
>
> `btrfs balance start -dusage=25 -musage=25`
>
> Both of these commands will effectively compact partially full chunks
> on the filesystem so that new chunks have more space to be allocated.
> For more information on what the commands actually mean, check out
> `man btrfs-balance`
>
> When run regularly, both of these should complete extremely fast on
> most BTRFS volumes. Note that these may run significantly slower on
> volumes which have quotas enabled. Additionally, it's best to make
> sure other things aren't putting a lot of load on the filesystem while
> running a balance, so try to make sure this doesn't run at the same
> time as a scrub or defrag.
>
> A full, unfiltered balance (one without any options passed in) is
> completely unnecessary for normal usage of a filesystem.
Hi Austin,
From the discussion we've had I have the impression there might be
another way to answer this FAQ that is as valid as this one: monitor
usage. You may never need to balance, or hardly ever.
Your suggestion for regular balance has clear advantages: It's
set-and-forget. And to apply the advice the user doesn't need to
understand the allocator, interpret usage stats, or figure out filters.
On the other hand, not needing to run balance is quite appealing. So is
avoiding yet another cron/timer job if I don't really need it.
Question about your proposed text, "When run regularly, both of these
should complete extremely fast..." Does that imply that it might not be
fast if, say, you've never run balance before on a filesystem with a lot
of unused space in allocated chunks?
Tom
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-12 19:26 ` Tom Worster
@ 2018-01-12 19:43 ` Austin S. Hemmelgarn
0 siblings, 0 replies; 34+ messages in thread
From: Austin S. Hemmelgarn @ 2018-01-12 19:43 UTC (permalink / raw)
To: Tom Worster; +Cc: Btrfs BTRFS
On 2018-01-12 14:26, Tom Worster wrote:
> On 12 Jan 2018, at 13:24, Austin S. Hemmelgarn wrote:
>
>> OK, I've gotten a lot of good feedback on this, and the general
>> consensus seems to be:
>>
>> * If we're going to recommend regular balance, we should explain how
>> it actually helps things.
>> * We should mention the performance interactions with qgroups, as well
>> as warning people off of running other things like scrubs or defrag
>> concurrently.
>> * The filters should be reasonably tame in terms of chunk selection.
>> * BTRFS should ideally get smarter about this kind of thing so the
>> user doesn't have to be.
>>
>> To that end, I propose the following text for the FAQ:
>>
>> Q: Do I need to run a balance regularly?
>>
>> A: While not strictly necessary for normal operations, running a
>> filtered balance regularly can help prevent your filesystem from
>> ending up with ENOSPC issues. The following command run daily on each
>> BTRFS volume should be more than sufficient for most users:
>>
>> `btrfs balance start -dusage=25 -dlimit=2..10 -musage=25 -mlimit=2..10`
>>
>> If you are running a kernel older than version 4.4 and can't upgrade,
>> the following should be used instead:
>>
>> `btrfs balance start -dusage=25 -musage=25`
>>
>> Both of these commands will effectively compact partially full chunks
>> on the filesystem so that new chunks have more space to be allocated.
>> For more information on what the commands actually mean, check out
>> `man btrfs-balance`
>>
>> When run regularly, both of these should complete extremely fast on
>> most BTRFS volumes. Note that these may run significantly slower on
>> volumes which have quotas enabled. Additionally, it's best to make
>> sure other things aren't putting a lot of load on the filesystem while
>> running a balance, so try to make sure this doesn't run at the same
>> time as a scrub or defrag.
>>
>> A full, unfiltered balance (one without any options passed in) is
>> completely unnecessary for normal usage of a filesystem.
>
> Hi Austin,
>
> From the discussion we've had I have the impression there might be
> another way to answer this FAQ that is as valid as this one: monitor
> usage. You may never need to balance, or hardly ever.
>
> Your suggestion for regular balance has clear advantages: It's
> set-and-forget. And to apply the advice the user doesn't need to
> understand the allocator, interpret usage stats, or figure out filters.
>
> On the other hand, not needing to run balance is quite appealing. So is
> avoiding yet another cron/timer job if I don't really need it.
While I can understand some people may prefer that, it's also a lot more
error prone, and I personally don't think it's a good idea to suggest
it. Recommendations where the user is required to regularly do
something themself and evaluate the results tend to be bad ideas from a
support perspective. Yeah, it may be fine for you and me to just
monitor things manually, but it's not really a good idea for someone who
doesn't have any understanding of how things work under the hood. It's
also not likely to be what most distros end up standardizing on as best
practices for handling this, and not matching up with that will look bad.
>
> Question about your proposed text, "When run regularly, both of these
> should complete extremely fast..." Does that imply that it might not be
> fast if, say, you've never run balance before on a filesystem with a lot
> of unused space in allocated chunks?
There's a very high potential that it will take a long time on the first
run (or the first couple if things are particularly bad), but the
filters still limit exactly how much data will get moved. The
recommended command with the limit filters, for example, shouldn't ever
move more than about 15GB of data, and it will only do that much on
pathologically bad multi-TB volumes; thus it shouldn't take more than a
few minutes on SSDs, or about 15-20 on traditional hard drives.
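The arithmetic behind that bound, assuming the common maximum chunk
sizes of 1GiB for data and 256MiB for metadata (actual chunk sizes vary
with filesystem size and profile):

```python
# Rough worst case for the recommended filters.  Chunk sizes here
# are the common maximums, which is an assumption, not a guarantee.
GIB = 1024 ** 3
DATA_CHUNK = 1 * GIB
META_CHUNK = 256 * 1024 ** 2
LIMIT = 10  # -dlimit=2..10 / -mlimit=2..10 cap chunks processed at 10

worst_case = LIMIT * DATA_CHUNK + LIMIT * META_CHUNK
print(worst_case / GIB)  # 12.5 (GiB), i.e. roughly the ~15GB above
```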
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-12 18:24 ` Austin S. Hemmelgarn
2018-01-12 19:26 ` Tom Worster
@ 2018-01-13 22:09 ` Chris Murphy
2018-01-15 13:43 ` Austin S. Hemmelgarn
2018-01-15 18:23 ` Tom Worster
1 sibling, 2 replies; 34+ messages in thread
From: Chris Murphy @ 2018-01-13 22:09 UTC (permalink / raw)
To: Austin S. Hemmelgarn; +Cc: Btrfs BTRFS
On Fri, Jan 12, 2018 at 11:24 AM, Austin S. Hemmelgarn
<ahferroin7@gmail.com> wrote:
> To that end, I propose the following text for the FAQ:
>
> Q: Do I need to run a balance regularly?
>
> A: While not strictly necessary for normal operations, running a filtered
> balance regularly can help prevent your filesystem from ending up with
> ENOSPC issues. The following command run daily on each BTRFS volume should
> be more than sufficient for most users:
>
> `btrfs balance start -dusage=25 -dlimit=2..10 -musage=25 -mlimit=2..10`
Daily? Seems excessive.
I've got multiple Btrfs file systems that I haven't balanced, full or
partial, in a year. And I have no problems. One is a laptop which
accumulates snapshots until roughly 25% free space remains and then
most of the snapshots are deleted, except the most recent few, all at
one time. I'm not experiencing any problems so far. The other is a NAS
and its multiple copies, with maybe 100-200 snapshots. One backup
volume is 99% full, there's no more unallocated free space, I delete
snapshots only to make room for btrfs send receive to keep pushing the
most recent snapshot from the main volume to the backup. Again no
problems.
I really think suggestions this broad are just going to paper over
bugs or design flaws; we won't see as many bug reports, and then real
problems won't get fixed.
I also think the time-based method is too subjective. What about the
layout means a balance is needed? And if it's really a suggestion, why
isn't there a cron or systemd unit that just does this for the user,
in btrfs-progs, working and enabled by default? I really do not like
all this hand-holding of Btrfs; it's not going to make it better.
> A full, unfiltered balance (one without any options passed in) is completely
> unnecessary for normal usage of a filesystem.
That's good advice.
--
Chris Murphy
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-13 22:09 ` Chris Murphy
@ 2018-01-15 13:43 ` Austin S. Hemmelgarn
2018-01-15 18:23 ` Tom Worster
1 sibling, 0 replies; 34+ messages in thread
From: Austin S. Hemmelgarn @ 2018-01-15 13:43 UTC (permalink / raw)
To: Chris Murphy; +Cc: Btrfs BTRFS
On 2018-01-13 17:09, Chris Murphy wrote:
> On Fri, Jan 12, 2018 at 11:24 AM, Austin S. Hemmelgarn
> <ahferroin7@gmail.com> wrote:
>
>
>> To that end, I propose the following text for the FAQ:
>>
>> Q: Do I need to run a balance regularly?
>>
>> A: While not strictly necessary for normal operations, running a filtered
>> balance regularly can help prevent your filesystem from ending up with
>> ENOSPC issues. The following command run daily on each BTRFS volume should
>> be more than sufficient for most users:
>>
>> `btrfs balance start -dusage=25 -dlimit=2..10 -musage=25 -mlimit=2..10`
>
>
> Daily? Seems excessive.
For handling of chunks that are only 25% full and capping it at 10
chunks processed each for data and metadata? That's only (assuming I
remember the max chunk size correctly) about 15GB of data being moved at
the absolute most, and that will likely only happen in pathologically
bad cases. In most cases it should be either nothing or
about 768MB being shuffled around, and even on traditional hard drives
that should complete insanely fast (barring impact from very large
numbers of snapshots or use of qgroups).
If there are no chunks that match (or only one chunk), this finishes in
at most a second with near zero disk I/O. If exactly two match (which
should be the common case for most users when it matches at all), it
should take at most a few seconds to complete, even on traditional hard
drives. If more match, it will of course take longer, but it should be
pretty rare that more than two match.
Given that, it really doesn't seem all that excessive to me. As a point
of comparison, automated X.509 certificate renewal checks via certbot
take more resources to perform when there's not a renewal due than this
balance command takes when there's nothing to work on, and it's
absolutely standard to run the X.509 checks daily despite the fact that
weekly checks would still give no worse security (certbot will renew
things well before they expire).
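For reference, the common-case 768MB figure falls out of the filters
directly (assuming 1GiB data chunks; the 25% cutoff is the -dusage=25
filter):

```python
# Common case: a small handful of 1 GiB data chunks right at the 25%
# usage cutoff.  Chunk size and count are assumptions matching the
# discussion above.
MIB = 1024 ** 2
chunk_size = 1024 * MIB     # 1 GiB data chunk
usage_cutoff = 0.25         # -dusage=25 only touches chunks <= 25% full
chunks_matched = 3          # a typical small handful

data_rewritten = chunks_matched * chunk_size * usage_cutoff
print(data_rewritten / MIB)  # 768.0 (MiB shuffled around)
```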
>
> I've got multiple Btrfs file systems that I haven't balanced, full or
> partial, in a year. And I have no problems. One is a laptop which
> accumulates snapshots until roughly 25% free space remains and then
> most of the snapshots are deleted, except the most recent few, all at
> one time. I'm not experiencing any problems so far. The other is a NAS
> and it's multiple copies, with maybe 100-200 snapshots. One backup
> volume is 99% full, there's no more unallocated free space, I delete
> snapshots only to make room for btrfs send receive to keep pushing the
> most recent snapshot from the main volume to the backup. Again no
> problems.
In the first case, you're dealing with a special configuration that
makes most of this irrelevant most of the time (as I'm assuming things
change _enough_ between snapshots that dumping most of them will
completely empty out most of the chunks they were stored in).
In the second I'd have to say you've been lucky. I've personally never
run a volume that close to full with BTRFS without balancing regularly
and not had some kind of issue.
>
> I really think suggestions this broad are just going to paper over
> bugs or design flaws, we won't see as many bug reports and then real
> problems won't get fixed.
So maybe we should fix things so that this is never needed? Yes, it's a
workaround for a well known and documented design flaw (and yes, I
consider the whole two-level allocator's handling of free space
exhaustion to be a design flaw), but I don't see any patches forthcoming
to fix it, so if we want to keep users around, we need to provide some
way for them to mitigate the problems it can cause (otherwise we won't
find any bugs because we won't have any users).
>
> I also thing the time based method is too subjective. What about the
> layout means a balance is needed? And if it's really a suggestion, why
> isn't there a chron or systemd unit that just does this for the user,
> in btrfs-progs, working and enabled by default? I really do not like
> all this hand holding of Btrfs, it's not going to make it better.
For a filesystem you really have two generic possibilities for use cases:
1. It's designed for general purpose usage. Doesn't really excel at any
thing in particular, but isn't really bad at anything either.
2. It's designed for a very specific use case. Does an amazing job for
that particular use case and possibly for some similar ones, and may or
may not do a reasonable job for other use cases.
Your comments here seem to imply that BTRFS falls under the second case,
which is odd since most everything else I've seen implies that BTRFS
fits the first case (or is trying to at least). In either case though,
you need to provide something to deal with this particular design flaw.
In the first case, you _need_ to make it as easy as possible for people
who have no understanding of computers to use. While needing balances
from time to time is not exactly in-line with that, requiring people to
try and judge based on the numbers whether or not a balance is warranted
is even less in-line with it. By just telling people to automate it and
give reasonable filters to the balance command, we remove the guesswork
entirely, and make things far easier for people.
In the second case, it's generally more acceptable to require more work
of the user, but making baseline prophylactic maintenance something that
you can't trivially automate is still a bad idea (imagine how popular
ZFS would be if you could only run scrubs manually).
That said, if you can find or write up a script that reliably does the
math to check if a balance is needed and then actually runs it if it is,
I would be more than happy to recommend that in the FAQ instead.
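For illustration, here's a rough sketch of the kind of script I have in
mind. The thresholds, the filter values, and the parsing of `btrfs
filesystem usage -b` output are all assumptions for illustration, not a
vetted policy; check them against your btrfs-progs version before
relying on anything like this:

```shell
#!/bin/bash
# Hypothetical sketch: run a filtered balance only when the allocation
# figures suggest it might help. Thresholds are illustrative guesses.

# needs_balance TOTAL ALLOCATED USED (all in the same unit): true when
# most of the device is allocated to chunks but much of that allocation
# is empty (the classic precursor to ENOSPC on BTRFS).
needs_balance() {
    local total=$1 allocated=$2 used=$3
    [ "$(( allocated * 100 / total ))" -gt 85 ] &&
        [ "$(( used * 100 / allocated ))" -lt 50 ]
}

if command -v btrfs >/dev/null 2>&1 && [ -n "${1:-}" ]; then
    mnt=$1
    # Field positions assumed from 'btrfs filesystem usage -b' output
    # ("Device size:", "Device allocated:", "Used:"); verify locally.
    read -r total alloc used < <(btrfs filesystem usage -b "$mnt" 2>/dev/null |
        awk '/Device size:/{t=$3} /Device allocated:/{a=$3}
             /^ *Used:/{u=$2; exit} END{print t, a, u}')
    if needs_balance "$total" "$alloc" "$used"; then
        btrfs balance start -dusage=25 -dlimit=2..10 \
                            -musage=25 -mlimit=2..10 "$mnt"
    fi
fi
```

Run daily from cron against each mounted volume; with no argument (or
no btrfs binary) it does nothing, so it's safe to deploy broadly.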
>
>> A full, unfiltered balance (one without any options passed in) is completely
>> unnecessary for normal usage of a filesystem.
>
> That's good advice.
And so far it seems to be the one thing that everyone agrees on ;).
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-13 22:09 ` Chris Murphy
2018-01-15 13:43 ` Austin S. Hemmelgarn
@ 2018-01-15 18:23 ` Tom Worster
2018-01-16 6:45 ` Chris Murphy
1 sibling, 1 reply; 34+ messages in thread
From: Tom Worster @ 2018-01-15 18:23 UTC (permalink / raw)
To: Chris Murphy; +Cc: Austin S. Hemmelgarn, Btrfs BTRFS
On 13 Jan 2018, at 17:09, Chris Murphy wrote:
> On Fri, Jan 12, 2018 at 11:24 AM, Austin S. Hemmelgarn
> <ahferroin7@gmail.com> wrote:
>
>
>> To that end, I propose the following text for the FAQ:
>>
>> Q: Do I need to run a balance regularly?
>>
>> A: While not strictly necessary for normal operations, running a
>> filtered
>> balance regularly can help prevent your filesystem from ending up
>> with
>> ENOSPC issues. The following command run daily on each BTRFS volume
>> should
>> be more than sufficient for most users:
>>
>> `btrfs balance start -dusage=25 -dlimit=2..10 -musage=25
>> -mlimit=2..10`
>
>
> Daily? Seems excessive.
>
> I've got multiple Btrfs file systems that I haven't balanced, full or
> partial, in a year. And I have no problems. One is a laptop which
> accumulates snapshots until roughly 25% free space remains and then
> most of the snapshots are deleted, except the most recent few, all at
> one time. I'm not experiencing any problems so far. The other is a NAS
and its multiple copies, with maybe 100-200 snapshots. One backup
> volume is 99% full, there's no more unallocated free space, I delete
> snapshots only to make room for btrfs send receive to keep pushing the
> most recent snapshot from the main volume to the backup. Again no
> problems.
>
> I really think suggestions this broad are just going to paper over
> bugs or design flaws, we won't see as many bug reports and then real
> problems won't get fixed.
This is just an answer to a FAQ. This is not Austin or anyone else
trying to tell you or anyone else that you should do this. It should
be clear that there is an implied caveat along the lines of: "There are
other ways to manage allocation besides regular balancing. This
recommendation is a For-Dummies-kinda default that should work well
enough if you don't have another strategy better adapted to your
situation." If this implication is not obvious enough then we can add
something explicit.
> I also think the time based method is too subjective. What about the
> layout means a balance is needed? And if it's really a suggestion, why
> isn't there a cron or systemd unit that just does this for the user,
> in btrfs-progs, working and enabled by default?
As a newcomer to BTRFS, I was astonished to learn that it demands each
user figure out some workaround for what is, in my judgement, a required
but missing feature, i.e. a defect, a bug. At present the docs are
pretty confusing for someone trying to deal with it on their own.
Unless some better fix is in the works, this _should_ be a systemd unit
or something. Until then, please put it in FAQ.
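To make the suggestion concrete, here is a sketch of what such a unit
pair might look like. The unit names, schedule, and filter values are
illustrative placeholders, not anything actually shipped by a
distribution or by btrfs-progs:

```ini
# /etc/systemd/system/btrfs-balance.service (hypothetical)
[Unit]
Description=Filtered btrfs balance on /
ConditionPathIsMountPoint=/

[Service]
Type=oneshot
ExecStart=/usr/bin/btrfs balance start -dusage=25 -dlimit=2..10 -musage=25 -mlimit=2..10 /

# /etc/systemd/system/btrfs-balance.timer (hypothetical)
[Unit]
Description=Run a filtered btrfs balance weekly

[Timer]
OnCalendar=weekly
Persistent=true

[Install]
WantedBy=timers.target
```

Enabling it would then just be `systemctl enable --now
btrfs-balance.timer`; templated per-mount instances are left out for
brevity.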
> I really do not like
> all this hand holding of Btrfs, it's not going to make it better.
Maybe it won't but, absent better proposals, and given the nature of the
problem, this kind of hand-holding is only fair to the user.
Tom
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-15 18:23 ` Tom Worster
@ 2018-01-16 6:45 ` Chris Murphy
2018-01-16 11:02 ` Andrei Borzenkov
2018-01-16 12:57 ` Austin S. Hemmelgarn
0 siblings, 2 replies; 34+ messages in thread
From: Chris Murphy @ 2018-01-16 6:45 UTC (permalink / raw)
To: Tom Worster; +Cc: Chris Murphy, Austin S. Hemmelgarn, Btrfs BTRFS
On Mon, Jan 15, 2018 at 11:23 AM, Tom Worster <fsb@thefsb.org> wrote:
> On 13 Jan 2018, at 17:09, Chris Murphy wrote:
>
>> On Fri, Jan 12, 2018 at 11:24 AM, Austin S. Hemmelgarn
>> <ahferroin7@gmail.com> wrote:
>>
>>
>>> To that end, I propose the following text for the FAQ:
>>>
>>> Q: Do I need to run a balance regularly?
>>>
>>> A: While not strictly necessary for normal operations, running a filtered
>>> balance regularly can help prevent your filesystem from ending up with
>>> ENOSPC issues. The following command run daily on each BTRFS volume
>>> should
>>> be more than sufficient for most users:
>>>
>>> `btrfs balance start -dusage=25 -dlimit=2..10 -musage=25 -mlimit=2..10`
>>
>>
>>
>> Daily? Seems excessive.
>>
>> I've got multiple Btrfs file systems that I haven't balanced, full or
>> partial, in a year. And I have no problems. One is a laptop which
>> accumulates snapshots until roughly 25% free space remains and then
>> most of the snapshots are deleted, except the most recent few, all at
>> one time. I'm not experiencing any problems so far. The other is a NAS
> and its multiple copies, with maybe 100-200 snapshots. One backup
>> volume is 99% full, there's no more unallocated free space, I delete
>> snapshots only to make room for btrfs send receive to keep pushing the
>> most recent snapshot from the main volume to the backup. Again no
>> problems.
>>
>> I really think suggestions this broad are just going to paper over
>> bugs or design flaws, we won't see as many bug reports and then real
>> problems won't get fixed.
>
>
> This is just an answer to a FAQ. This is not Austin or anyone else trying to
> tell you or anyone else that you should do this. It should be clear that
> there is an implied caveat along the lines of: "There are other ways to
> manage allocation besides regular balancing. This recommendation is a
> For-Dummies-kinda default that should work well enough if you don't have
> another strategy better adapted to your situation." If this implication is
> not obvious enough then we can add something explicit.
It's an upstream answer to a frequently asked question. It's rather
official, or about as close as it gets to it.
>
>
>> I also think the time based method is too subjective. What about the
>> layout means a balance is needed? And if it's really a suggestion, why
>> isn't there a cron or systemd unit that just does this for the user,
>> in btrfs-progs, working and enabled by default?
>
>
> As a newcomer to BTRFS, I was astonished to learn that it demands each user
> figure out some workaround for what is, in my judgement, a required but
> missing feature, i.e. a defect, a bug. At present the docs are pretty
> confusing for someone trying to deal with it on their own.
>
> Unless some better fix is in the works, this _should_ be a systemd unit or
> something. Until then, please put it in FAQ.
At least openSUSE has had a systemd unit for a long time now, but the
last time I checked (a bit over a year ago) it was disabled by default. Why?
And insofar as I'm aware, openSUSE users aren't having big problems
related to lack of balancing, they have problems due to the lack of
balancing combined with schizo snapper defaults, which are these days
masked somewhat by turning on quotas so snapper can be more accurate
about cleaning up.
Basically the scripted balance tells me two things:
a. Something is broken (still)
b. None of the developers has time to investigate coherent bug reports
about a. and fix/refine it.
And therefore papering over the problem is all we have. Basically it's
a sledgehammer approach.
The main person working on ENOSPC stuff is Josef, so I'd run this by
him and make sure this papering over bugs is something he agrees with.
>
>
>> I really do not like
>> all this hand holding of Btrfs, it's not going to make it better.
>
>
> Maybe it won't but, absent better proposals, and given the nature of the
> problem, this kind of hand-holding is only fair to the user.
This is hardly the biggest gotcha with Btrfs. I'm fine with the idea
of papering over design flaws and long standing bugs with user space
work arounds. I just want everyone on the same page about it, so it's
not some big surprise it's happening. As far as I know, none of the
developers regularly looks at the Btrfs wiki.
And I think the best way of communicating:
a. this is busted, and it sucks
b. here's a proposed user space work around, so users aren't so pissed off.
Is to try and get it into btrfs-progs, and enabled by default, because
that will get in front of at least one developer.
--
Chris Murphy
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-16 6:45 ` Chris Murphy
@ 2018-01-16 11:02 ` Andrei Borzenkov
2018-01-16 12:57 ` Austin S. Hemmelgarn
1 sibling, 0 replies; 34+ messages in thread
From: Andrei Borzenkov @ 2018-01-16 11:02 UTC (permalink / raw)
To: Chris Murphy; +Cc: Tom Worster, Austin S. Hemmelgarn, Btrfs BTRFS
On Tue, Jan 16, 2018 at 9:45 AM, Chris Murphy <lists@colorremedies.com> wrote:
...
>>
>> Unless some better fix is in the works, this _should_ be a systemd unit or
>> something. Until then, please put it in FAQ.
>
> At least openSUSE has had a systemd unit for a long time now, but the
> last time I checked (a bit over a year ago) it was disabled by default. Why?
>
It is now enabled by default on Tumbleweed and hence likely on SLE/Leap 15.
> And insofar as I'm aware, openSUSE users aren't having big problems
> related to lack of balancing, they have problems due to the lack of
> balancing combined with schizo snapper defaults, which are these days
> masked somewhat by turning on quotas so snapper can be more accurate
> about cleaning up.
>
Not only that, but also by making the snapshot policy less aggressive -
now (in Tumbleweed/Leap 42.3) periodic snapshots are turned off by
default; only configuration changes via YaST and package updates via
zypper trigger snapshot creation.
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-16 6:45 ` Chris Murphy
2018-01-16 11:02 ` Andrei Borzenkov
@ 2018-01-16 12:57 ` Austin S. Hemmelgarn
1 sibling, 0 replies; 34+ messages in thread
From: Austin S. Hemmelgarn @ 2018-01-16 12:57 UTC (permalink / raw)
To: Chris Murphy, Tom Worster; +Cc: Btrfs BTRFS
On 2018-01-16 01:45, Chris Murphy wrote:
> On Mon, Jan 15, 2018 at 11:23 AM, Tom Worster <fsb@thefsb.org> wrote:
>> On 13 Jan 2018, at 17:09, Chris Murphy wrote:
>>
>>> On Fri, Jan 12, 2018 at 11:24 AM, Austin S. Hemmelgarn
>>> <ahferroin7@gmail.com> wrote:
>>>
>>>> To that end, I propose the following text for the FAQ:
>>>>
>>>> Q: Do I need to run a balance regularly?
>>>>
>>>> A: While not strictly necessary for normal operations, running a filtered
>>>> balance regularly can help prevent your filesystem from ending up with
>>>> ENOSPC issues. The following command run daily on each BTRFS volume
>>>> should
>>>> be more than sufficient for most users:
>>>>
>>>> `btrfs balance start -dusage=25 -dlimit=2..10 -musage=25 -mlimit=2..10`
>>>
>>> Daily? Seems excessive.
>>>
>>> I've got multiple Btrfs file systems that I haven't balanced, full or
>>> partial, in a year. And I have no problems. One is a laptop which
>>> accumulates snapshots until roughly 25% free space remains and then
>>> most of the snapshots are deleted, except the most recent few, all at
>>> one time. I'm not experiencing any problems so far. The other is a NAS
>>> and its multiple copies, with maybe 100-200 snapshots. One backup
>>> volume is 99% full, there's no more unallocated free space, I delete
>>> snapshots only to make room for btrfs send receive to keep pushing the
>>> most recent snapshot from the main volume to the backup. Again no
>>> problems.
>>>
>>> I really think suggestions this broad are just going to paper over
>>> bugs or design flaws, we won't see as many bug reports and then real
>>> problems won't get fixed.
>>
>> This is just an answer to a FAQ. This is not Austin or anyone else trying to
>> tell you or anyone else that you should do this. It should be clear that
>> there is an implied caveat along the lines of: "There are other ways to
>> manage allocation besides regular balancing. This recommendation is a
>> For-Dummies-kinda default that should work well enough if you don't have
>> another strategy better adapted to your situation." If this implication is
>> not obvious enough then we can add something explicit.
>
> It's an upstream answer to a frequently asked question. It's rather
> official, or about as close as it gets to it.
>
>>
>>> I also think the time based method is too subjective. What about the
>>> layout means a balance is needed? And if it's really a suggestion, why
>>> isn't there a cron or systemd unit that just does this for the user,
>>> in btrfs-progs, working and enabled by default?
>>
>> As a newcomer to BTRFS, I was astonished to learn that it demands each user
>> figure out some workaround for what is, in my judgement, a required but
>> missing feature, i.e. a defect, a bug. At present the docs are pretty
>> confusing for someone trying to deal with it on their own.
>>
>> Unless some better fix is in the works, this _should_ be a systemd unit or
>> something. Until then, please put it in FAQ.
>
> At least openSUSE has had a systemd unit for a long time now, but the
> last time I checked (a bit over a year ago) it was disabled by default. Why?
>
> And insofar as I'm aware, openSUSE users aren't having big problems
> related to lack of balancing, they have problems due to the lack of
> balancing combined with schizo snapper defaults, which are these days
> masked somewhat by turning on quotas so snapper can be more accurate
> about cleaning up.
And in turn causing other issues because of the quotas, but that's
getting OT...
>
> Basically the scripted balance tells me two things:
> a. Something is broken (still)
> b. None of the developers has time to investigate coherent bug reports
> about a. and fix/refine it.
I don't entirely agree here. The issue is essentially inherent in the
very design of the two-stage allocator itself, so it's not really
something that can just be fixed by some simple surface patch. The only
real options I see to fix it are either:
1. Redesign the allocator
or:
2. figure out some way to handle this generically and automatically.
The first case is pretty much immediately out because it will almost
certainly require a breaking change in the on-disk format. The second
is extremely challenging to do right, and likely to cause some
significant controversy among list regulars (I for one don't want the FS
doing stuff behind my back that impacts performance, and I have a
feeling that quite a lot of other people here don't either).
Given that, I would say time is only a (probably small) part of it.
This is not an easy thing to fix given the current situation, and
difficult problems tend to sit around with no progress for very long
periods of time in open source development.
>
> And therefore papering over the problem is all we have. Basically it's
> a sledgehammer approach.
How exactly is this any different than requiring a user to manually
scrub things to check data that's not being actively used? Or requiring
manual invocation of defragmentation? Or even batch deduplication?
All of those are manually triggered solutions to 'problems' with the
filesystem, just like this is. The only difference is that people are
used to needing to manually defrag disks, and reasonably used to the
need for manual scrubs (and don't seem to care much about dedupe), while
doing something like this to keep the allocator happy is absolutely
alien to them (despite being no different conceptually in that respect
from defrag, just operating at a different level).
>
> The main person working on ENOSPC stuff is Josef, so I'd run this by
> him and make sure this papering over bugs is something he agrees with.
I agree that Josef's input would be nice to have, as he really does
appear to be the authority on this type of thing.
I would also love to hear from someone at Facebook about their
experience with this type of thing, as they probably have the largest
current deployment of BTRFS around.
>
>>
>>> I really do not like
>>> all this hand holding of Btrfs, it's not going to make it better.
>>
>> Maybe it won't but, absent better proposals, and given the nature of the
>> problem, this kind of hand-holding is only fair to the user.
>
> This is hardly the biggest gotcha with Btrfs. I'm fine with the idea
> of papering over design flaws and long standing bugs with user space
> work arounds. I just want everyone on the same page about it, so it's
> not some big surprise it's happening. As far as I know, none of the
> developers regularly looks at the Btrfs wiki.
>
> And I think the best way of communicating:
> a. this is busted, and it sucks
> b. here's a proposed user space work around, so users aren't so pissed off.
>
> Is to try and get it into btrfs-progs, and enabled by default, because
> that will get in front of at least one developer.
Maybe it's time someone writes up a BCP document and includes that as a
man page bundled with btrfs-progs? That would get much better developer
visibility, would be much easier to keep current, and would probably
cover the biggest issue with our documentation currently (it's great for
technical people, but somewhat horrendous for new users without
technical background). We've already essentially got the beginnings of
such a document between the FAQ and the Gotchas page on the wiki.
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-09 12:23 ` Austin S. Hemmelgarn
@ 2018-01-09 14:16 ` Tom Worster
0 siblings, 0 replies; 34+ messages in thread
From: Tom Worster @ 2018-01-09 14:16 UTC (permalink / raw)
To: Austin S. Hemmelgarn; +Cc: linux-btrfs
On 9 Jan 2018, at 7:23, Austin S. Hemmelgarn wrote:
> On 2018-01-08 16:43, Tom Worster wrote:
>>
>> Given the documentation and the usage stats, I did not know what
>> options to use with balance. I spent some time reading and
>> researching and trying to understand the filters and how they should
>> relate to my situation. Eventually I abandoned that effort and ran
>> balance without options.
> Hopefully the explanation I gave on the filters in the Github issue
> helped some. In this case though, it sounds like running a filtered
> balance probably wouldn't have saved you much over a full one.
Yes, it helped. Hugo's email helped too. I now have a better
understanding of balance filters.
At the same time, Hugo's email and others in this thread added to my
belief that I'm now managing systems with a filesystem I'm unqualified
to use.
Tom
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-08 21:43 Tom Worster
2018-01-08 22:18 ` Hugo Mills
@ 2018-01-09 12:23 ` Austin S. Hemmelgarn
2018-01-09 14:16 ` Tom Worster
1 sibling, 1 reply; 34+ messages in thread
From: Austin S. Hemmelgarn @ 2018-01-09 12:23 UTC (permalink / raw)
To: Tom Worster, linux-btrfs
On 2018-01-08 16:43, Tom Worster wrote:
> On 01/08/2018 04:55 PM, Austin S. Hemmelgarn wrote:
>
>> On 2018-01-08 11:20, ein wrote:
>>
>> > On 01/08/2018 04:55 PM, Austin S. Hemmelgarn wrote:
>> >
>> > > [...]
>> > >
>> > > And here's the FAQ entry:
>> > >
>> > > Q: Do I need to run a balance regularly?
>> > >
>> > > A: In general usage, no. A full unfiltered balance typically takes a
>> > > long time, and will rewrite huge amounts of data unnecessarily. You may
>> > > wish to run a balance on metadata only (see Balance_Filters) if you find
>> > > you have very large amounts of metadata space allocated but unused, but
>> > > this should be a last resort.
>> >
>> > IMHO three more sentences and the answer would be more useful:
>> > 1. A BTRFS balance command example, with a note to check the man page first.
>> > 2. What use cases may cause 'large amounts of metadata space allocated
>> > but unused'.
>>
>> That's kind of what I was thinking as well, but I'm hesitant to get
>> too heavily into stuff along the lines of 'for use case X, do 1, for
>> use case Y, do 2, etc', as that tends to result in pigeonholing
>> (people just go with what sounds closest to their use case instead of
>> trying to figure out what actually is best for their use case).
>>
>> Ideally, I think it should be as generic as reasonably possible,
>> possibly something along the lines of:
>>
>> A: While not strictly necessary, running regular filtered balances
>> (for example `btrfs balance start -dusage=50 -dlimit=2 -musage=50
>> -mlimit=4`, see `man btrfs-balance` for more info on what the options
>> mean) can help keep a volume healthy by mitigating the things that
>> typically cause ENOSPC errors. Full balances by contrast are long and
>> expensive operations, and should be done only as a last resort.
>
> As the BTRFS noob who started the conversation on netdata's Github
> issues, I'd like to describe my experience.
>
> I got an alert that unallocated space on a BTRFS filesystem on one host
> was low. A netdata caption suggested btrfs-balance and directed me to
> its man page. But I found it hard to understand since I don't know how
> BTRFS works or its particular terminology. The FAQ was easier to
> understand but didn't help me find a solution to my problem.
>
> It's a 420GiB NVMe with single data and metadata. It has a MariaDB
> datadir with an OLTP workload and a small GlusterFS brick for
> replicating filesystem with little activity. I recall that unallocated
> space was under 2G, metadata allocation was low, a few G and about 1/3
> used. Data allocation was very large, almost everything else, with ~25%
> used.
>
> Given the documentation and the usage stats, I did not know what options
> to use with balance. I spent some time reading and researching and
> trying to understand the filters and how they should relate to my
> situation. Eventually I abandoned that effort and ran balance without
> options.
Hopefully the explanation I gave on the filters in the Github issue
helped some. In this case though, it sounds like running a filtered
balance probably wouldn't have saved you much over a full one.
>
> While general recommendations about running balance would be welcome,
> what I needed was a dummy's guide to what the output of btrfs usage
> _means_ and how to use balance to tackle problems with it.
This really is a great point. Our documentation does a decent job as a
reference for people who already have some idea what they're doing, but
it really is worthless for people who have no prior experience.
>
> The other mystery is how the data allocation became so large.
The most common case is that you had a lot of data on the device, and
then deleted most of it. Unless a chunk becomes completely empty
(either because the data that was in it becomes completely unused, or
because a balance moved all the data), it won't be automatically deleted
by the kernel, so it's not unusual for filesystems that have been very
active (especially if they have the 'ssd' mount option set, which
happens automatically on most SSDs and a lot of other things the kernel
marks as not being rotational media) to have a reasonably large amount
of empty space scattered around the data chunks.
* Re: Recommendations for balancing as part of regular maintenance?
2018-01-08 21:43 Tom Worster
@ 2018-01-08 22:18 ` Hugo Mills
2018-01-09 12:23 ` Austin S. Hemmelgarn
1 sibling, 0 replies; 34+ messages in thread
From: Hugo Mills @ 2018-01-08 22:18 UTC (permalink / raw)
To: Tom Worster; +Cc: linux-btrfs
On Mon, Jan 08, 2018 at 04:43:02PM -0500, Tom Worster wrote:
> On 01/08/2018 04:55 PM, Austin S. Hemmelgarn wrote:
>
> >On 2018-01-08 11:20, ein wrote:
> >
> >> On 01/08/2018 04:55 PM, Austin S. Hemmelgarn wrote:
> >>
> >> > [...]
> >> >
> >> > And here's the FAQ entry:
> >> >
> >> > Q: Do I need to run a balance regularly?
> >> >
> >> > A: In general usage, no. A full unfiltered balance typically takes a
> >> > long time, and will rewrite huge amounts of data unnecessarily. You may
> >> > wish to run a balance on metadata only (see Balance_Filters) if you find
> >> > you have very large amounts of metadata space allocated but unused, but
> >> > this should be a last resort.
> >>
> >> IMHO three more sentences and the answer would be more useful:
> >> 1. A BTRFS balance command example, with a note to check the man page first.
> >> 2. What use cases may cause 'large amounts of metadata space allocated
> >> but unused'.
> >
> >That's kind of what I was thinking as well, but I'm hesitant to
> >get too heavily into stuff along the lines of 'for use case X, do
> >1, for use case Y, do 2, etc', as that tends to result in
> >pigeonholing (people just go with what sounds closest to their use
> >case instead of trying to figure out what actually is best for
> >their use case).
> >
> >Ideally, I think it should be as generic as reasonably possible,
> >possibly something along the lines of:
> >
> >A: While not strictly necessary, running regular filtered balances
> >(for example `btrfs balance start -dusage=50 -dlimit=2 -musage=50
> >-mlimit=4`, see `man btrfs-balance` for more info on what the
> >options mean) can help keep a volume healthy by mitigating the
> >things that typically cause ENOSPC errors. Full balances by
> >contrast are long and expensive operations, and should be done
> >only as a last resort.
>
> As the BTRFS noob who started the conversation on netdata's Github
> issues, I'd like to describe my experience.
>
> I got an alert that unallocated space on a BTRFS filesystem on one
> host was low. A netdata caption suggested btrfs-balance and directed
> me to its man page. But I found it hard to understand since I don't
> know how BTRFS works or its particular terminology. The FAQ was
> easier to understand but didn't help me find a solution to my
> problem.
The information is there in the FAQ, but only under headings that
you'd find if you'd actually hit the problems, rather than being warned
that the problems might be happening (which is your situation):
https://btrfs.wiki.kernel.org/index.php/FAQ#Help.21_Btrfs_claims_I.27m_out_of_space.2C_but_it_looks_like_I_should_have_lots_left.21
> It's a 420GiB NVMe with single data and metadata. It has a MariaDB
> datadir with an OLTP workload and a small GlusterFS brick for
> replicating filesystem with little activity. I recall that
> unallocated space was under 2G, metadata allocation was low, a few G
> and about 1/3 used. Data allocation was very large, almost
> everything else, with ~25% used.
>
> Given the documentation and the usage stats, I did not know what
> options to use with balance. I spent some time reading and
> researching and trying to understand the filters and how they should
> relate to my situation. Eventually I abandoned that effort and ran
> balance without options.
That'll certainly work, although it's wasteful of I/O bandwidth and
time.
> While general recommendations about running balance would be
> welcome, what I needed was a dummy's guide to what the output of
> btrfs usage _means_ and how to use balance to tackle problems with
> it.
In this kind of situation, it's generally recommended to balance
data chunks only (because that's where the overallocation usually
happens). There's not much point in balancing everything, so the
question is how much work to do... Ideally, you want to end up
compacting everything into the smallest number of chunks, which will
be the number of GiB of actual data.
There's a couple of ways to limit the work done. One way is to only
pick the chunks less than some threshold fraction used. This is the
usage=N option (-dusage=30, for example). It allows you to do (in
theory) the minimum amount of actual balance work needed. Drawbacks are
that you don't know how many such chunks there are for any given N, so
you end up searching manually for an appropriate N.
The other way is to tell balance exactly how many chunks it should
operate on. This is the limit=N option. This gives you precise control
over the number of chunks to balance, but doesn't specify which
chunks, so you may end up moving N GiB of data (whereas usage=N could
move much less actual data).
Personally, I recommend using limit=N, where N is something like
(Allocated - Used)*3/4 GiB.
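As a rough sketch of that rule of thumb (the parsing below assumes
GiB units in the usual "Data, single: total=..., used=..." line of
`btrfs filesystem df` output, and the mount point is a placeholder,
so treat it as illustrative only):

```shell
#!/bin/bash
# chunk_limit ALLOC_GIB USED_GIB: prints N = (Allocated - Used) * 3/4,
# per the recommendation above (integer GiB in, chunk count out).
chunk_limit() {
    echo $(( ($1 - $2) * 3 / 4 ))
}

if command -v btrfs >/dev/null 2>&1 && [ -n "${1:-}" ]; then
    mnt=$1
    # e.g. "Data, single: total=400.00GiB, used=100.00GiB"
    read -r alloc used < <(btrfs filesystem df "$mnt" | awk -F'[=,]' \
        '/^Data/{gsub(/GiB/, ""); print int($3), int($5); exit}')
    # Compact data chunks only, capped at the computed chunk count.
    btrfs balance start -dlimit="$(chunk_limit "$alloc" "$used")" "$mnt"
fi
```

With 400 GiB allocated and 100 GiB used, that balances at most 225
data chunks rather than rewriting the whole allocation.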
Note the caveat below, which is that using "ssd" mount option on
earlier kernels could prevent the balance from doing a decent job.
> The other mystery is how the data allocation became so large.
You have a non-rotational device. That means that it'd be mounted
automatically with the "ssd" mount option. Up to 4.13 (or 4.14, I
always forget), the behaviour of "ssd" leads to highly fragmented
allocation of extents, which in turn results in new data chunks being
allocated when there's theoretically loads of space available to use
(but which it may not be practical to use, due to the fragmented free
space).
After 4.13 (or 4.14), the "ssd" mount option has been fixed, and it
no longer has the bad long-term effects that we've seen before, but it
won't deal with the existing fragmented free space without a data
balance.
If you're running an older kernel, it's definitely recommended to
mount all filesystems with "nossd" to avoid these issues.
Hugo.
--
Hugo Mills | As long as you're getting different error messages,
hugo@... carfax.org.uk | you're making progress.
http://carfax.org.uk/ |
PGP: E2AB1DE4 |
* Re: Recommendations for balancing as part of regular maintenance?
@ 2018-01-08 21:43 Tom Worster
2018-01-08 22:18 ` Hugo Mills
2018-01-09 12:23 ` Austin S. Hemmelgarn
0 siblings, 2 replies; 34+ messages in thread
From: Tom Worster @ 2018-01-08 21:43 UTC (permalink / raw)
To: linux-btrfs
On 01/08/2018 04:55 PM, Austin S. Hemmelgarn wrote:
> On 2018-01-08 11:20, ein wrote:
>
> > On 01/08/2018 04:55 PM, Austin S. Hemmelgarn wrote:
> >
> > > [...]
> > >
> > > And here's the FAQ entry:
> > >
> > > Q: Do I need to run a balance regularly?
> > >
> > > A: In general usage, no. A full unfiltered balance typically takes a
> > > long time, and will rewrite huge amounts of data unnecessarily. You may
> > > wish to run a balance on metadata only (see Balance_Filters) if you find
> > > you have very large amounts of metadata space allocated but unused, but
> > > this should be a last resort.
> >
> > IMHO three more sentences and the answer would be more useful:
> > 1. A BTRFS balance command example, with a note to check the man page first.
> > 2. What use cases may cause 'large amounts of metadata space allocated
> > but unused'.
>
> That's kind of what I was thinking as well, but I'm hesitant to get
> too heavily into stuff along the lines of 'for use case X, do 1, for
> use case Y, do 2, etc', as that tends to result in pigeonholing
> (people just go with what sounds closest to their use case instead of
> trying to figure out what actually is best for their use case).
>
> Ideally, I think it should be as generic as reasonably possible,
> possibly something along the lines of:
>
> A: While not strictly necessary, running regular filtered balances
> (for example `btrfs balance start -dusage=50 -dlimit=2 -musage=50
> -mlimit=4`, see `man btrfs-balance` for more info on what the options
> mean) can help keep a volume healthy by mitigating the things that
> typically cause ENOSPC errors. Full balances by contrast are long and
> expensive operations, and should be done only as a last resort.
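If it helps to make the suggestion above concrete, the filtered balance could be wrapped in a small script for regular maintenance. This is only a sketch: the thresholds are the example values from the proposed FAQ text, the mountpoint is hypothetical, and you should check `man btrfs-balance` before adopting any of it.

```python
# Sketch of a periodic filtered balance, using the example thresholds
# from the proposed FAQ answer above (-dusage=50 -dlimit=2 -musage=50
# -mlimit=4). The values and the mountpoint are illustrative, not
# recommendations; tune them for your own volume.
import subprocess

def filtered_balance(mountpoint, dusage=50, dlimit=2,
                     musage=50, mlimit=4, dry_run=True):
    """Build (and optionally run) a small filtered balance command."""
    cmd = ["btrfs", "balance", "start",
           f"-dusage={dusage}", f"-dlimit={dlimit}",
           f"-musage={musage}", f"-mlimit={mlimit}",
           mountpoint]
    if not dry_run:
        # Needs root and an actual btrfs mountpoint.
        subprocess.run(cmd, check=True)
    return cmd

# Show what would be run, without touching anything.
print(" ".join(filtered_balance("/srv/vol")))
```

Run from cron or a systemd timer with `dry_run=False`, such a job only rewrites a handful of mostly-empty chunks, so it stays cheap compared to a full balance.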
As the BTRFS noob who started the conversation on netdata's Github
issues, I'd like to describe my experience.
I got an alert that unallocated space on a BTRFS filesystem on one host
was low. A netdata caption suggested btrfs-balance and directed me to
its man page. But I found it hard to understand since I don't know how
BTRFS works or its particular terminology. The FAQ was easier to
understand but didn't help me find a solution to my problem.
It's a 420GiB NVMe with single data and metadata. It holds a MariaDB
datadir with an OLTP workload and a small GlusterFS brick for a
replicated filesystem with little activity. I recall that unallocated
space was under 2G; metadata allocation was low, a few G with about 1/3
used; data allocation was very large, almost everything else, with ~25%
used.
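As a back-of-the-envelope check, those figures already say roughly how much space was sitting in allocated-but-unused data chunks, the space a balance can hand back. Treat my recollected numbers as approximate; this is only order-of-magnitude arithmetic:

```python
# Rough arithmetic from the approximate figures above.
device = 420.0       # GiB, total device size
unallocated = 2.0    # GiB, space not yet allocated to any chunk
metadata = 3.0       # GiB, metadata chunks ("a few G")

data = device - metadata - unallocated   # ~415 GiB of data chunks
data_used = 0.25 * data                  # ~25% of those chunks in use
reclaimable = data - data_used           # allocated but unused data space

# Roughly 311 GiB could be returned to unallocated by a balance.
print(round(reclaimable))
```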
Given the documentation and the usage stats, I did not know what options
to use with balance. I spent some time reading and researching and
trying to understand the filters and how they should relate to my
situation. Eventually I abandoned that effort and ran balance without
options.
While general recommendations about running balance would be welcome,
what I needed was a dummy's guide to what the output of btrfs usage
_means_ and how to use balance to tackle problems with it.
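To illustrate what such a dummy's guide might start from, here is a sketch that pulls the key numbers out of the "Overall:" section of `btrfs filesystem usage` output. The sample text is made up for illustration (real output varies by btrfs-progs version), and the key quantity for balance decisions is the gap between allocated and used:

```python
# Decode the "Overall:" section of `btrfs filesystem usage` output.
# SAMPLE is illustrative, not real output; the format varies by version.
import re

SAMPLE = """\
Overall:
    Device size:                 420.00GiB
    Device allocated:            418.00GiB
    Device unallocated:            2.00GiB
    Used:                        105.00GiB
"""

def parse_overall(text):
    """Return {field name: size in GiB} for the indented Overall: lines."""
    fields = {}
    for m in re.finditer(r"^\s+([A-Za-z ]+):\s+([\d.]+)GiB", text, re.M):
        fields[m.group(1).strip()] = float(m.group(2))
    return fields

stats = parse_overall(SAMPLE)
# "Slack" here means space allocated to chunks but not holding data;
# that is what small filtered balances return to the unallocated pool.
slack = stats["Device allocated"] - stats["Used"]
print(f"unallocated={stats['Device unallocated']}GiB slack={slack}GiB")
```

When unallocated is nearly zero but slack is large, as in my case, that is the situation where a balance helps; when both are small, the filesystem is genuinely full and no balance will fix it.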
The other mystery is how the data allocation became so large.
Tom
^ permalink raw reply [flat|nested] 34+ messages in thread
end of thread, other threads:[~2018-01-16 12:57 UTC | newest]
Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-08 15:55 Recommendations for balancing as part of regular maintenance? Austin S. Hemmelgarn
2018-01-08 16:20 ` ein
2018-01-08 16:34 ` Austin S. Hemmelgarn
2018-01-08 18:17 ` Graham Cobb
2018-01-08 18:34 ` Austin S. Hemmelgarn
2018-01-08 20:29 ` Martin Raiber
2018-01-09 8:33 ` Marat Khalili
2018-01-09 12:46 ` Austin S. Hemmelgarn
2018-01-10 3:49 ` Duncan
2018-01-10 16:30 ` Tom Worster
2018-01-10 17:01 ` Austin S. Hemmelgarn
2018-01-10 18:33 ` Tom Worster
2018-01-10 20:44 ` Timofey Titovets
2018-01-11 13:00 ` Austin S. Hemmelgarn
2018-01-11 8:51 ` Duncan
2018-01-10 4:38 ` Duncan
2018-01-10 12:41 ` Austin S. Hemmelgarn
2018-01-11 20:12 ` Hans van Kranenburg
2018-01-10 21:37 ` waxhead
2018-01-11 12:50 ` Austin S. Hemmelgarn
2018-01-11 19:56 ` Hans van Kranenburg
2018-01-12 18:24 ` Austin S. Hemmelgarn
2018-01-12 19:26 ` Tom Worster
2018-01-12 19:43 ` Austin S. Hemmelgarn
2018-01-13 22:09 ` Chris Murphy
2018-01-15 13:43 ` Austin S. Hemmelgarn
2018-01-15 18:23 ` Tom Worster
2018-01-16 6:45 ` Chris Murphy
2018-01-16 11:02 ` Andrei Borzenkov
2018-01-16 12:57 ` Austin S. Hemmelgarn
2018-01-08 21:43 Tom Worster
2018-01-08 22:18 ` Hugo Mills
2018-01-09 12:23 ` Austin S. Hemmelgarn
2018-01-09 14:16 ` Tom Worster