From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: Tom Worster <fsb@thefsb.org>, linux-btrfs@vger.kernel.org
Subject: Re: Recommendations for balancing as part of regular maintenance?
Date: Tue, 9 Jan 2018 07:23:06 -0500
Message-ID: <33ee8802-df5a-b153-6e24-2b38be597846@gmail.com>
In-Reply-To: <C11AC11F-C4CC-41B8-BE2F-52486E42C987@thefsb.org>

On 2018-01-08 16:43, Tom Worster wrote:
> On 01/08/2018 04:55 PM, Austin S. Hemmelgarn wrote:
> 
>> On 2018-01-08 11:20, ein wrote:
>>
>> > On 01/08/2018 04:55 PM, Austin S. Hemmelgarn wrote:
>> >
>> > > [...]
>> > >
>> > > And here's the FAQ entry:
>> > >
>> > > Q: Do I need to run a balance regularly?
>> > >
>> > > A: In general usage, no. A full unfiltered balance typically takes a
>> > > long time, and will rewrite huge amounts of data unnecessarily. You may
>> > > wish to run a balance on metadata only (see Balance_Filters) if you find
>> > > you have very large amounts of metadata space allocated but unused, but
>> > > this should be a last resort.
>> >
>> > IMHO three more sentences would make the answer more useful:
>> > 1. A BTRFS balance command example, with a note to check the man page first.
>> > 2. What use cases may cause 'large amounts of metadata space allocated
>> > but unused'.
>>
>> That's kind of what I was thinking as well, but I'm hesitant to get 
>> too heavily into stuff along the lines of 'for use case X, do 1, for 
>> use case Y, do 2, etc', as that tends to result in pigeonholing 
>> (people just go with what sounds closest to their use case instead of 
>> trying to figure out what actually is best for their use case).
>>
>> Ideally, I think it should be as generic as reasonably possible, 
>> possibly something along the lines of:
>>
>> A: While not strictly necessary, running regular filtered balances 
>> (for example `btrfs balance start -dusage=50 -dlimit=2 -musage=50 
>> -mlimit=4`, see `man btrfs-balance` for more info on what the options 
>> mean) can help keep a volume healthy by mitigating the things that 
>> typically cause ENOSPC errors. Full balances by contrast are long and 
>> expensive operations, and should be done only as a last resort.
> 
> As the BTRFS noob who started the conversation on netdata's GitHub 
> issues, I'd like to describe my experience.
> 
> I got an alert that unallocated space on a BTRFS filesystem on one host 
> was low. A netdata caption suggested btrfs-balance and directed me to 
> its man page. But I found it hard to understand since I don't know how 
> BTRFS works or its particular terminology. The FAQ was easier to 
> understand but didn't help me find a solution to my problem.
> 
> It's a 420GiB NVMe with single data and metadata. It has a MariaDB 
> datadir with an OLTP workload and a small GlusterFS brick for 
> replicating a filesystem with little activity. I recall that unallocated 
> space was under 2G; metadata allocation was low, a few G, and about 1/3 
> used. Data allocation was very large, almost everything else, with ~25% 
> used.
> 
> Given the documentation and the usage stats, I did not know what options 
> to use with balance. I spent some time reading and researching and 
> trying to understand the filters and how they should relate to my 
> situation. Eventually I abandoned that effort and ran balance without 
> options.
Hopefully the explanation I gave on the filters in the GitHub issue 
helped some.  In this case though, it sounds like running a filtered 
balance probably wouldn't have saved you much time over a full one.
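
As a rough back-of-the-envelope check of why, here is a sketch using the
figures you quoted.  The 3 GiB metadata allocation is my assumption for
'a few G', and everything is rounded to whole GiB, so treat the output as
an estimate only:

```shell
#!/bin/sh
# Rough estimate of how much of the data allocation is slack.
# Figures come from the description above; metadata_gib is an assumption.
device_gib=420        # total device size
unallocated_gib=2     # "under 2G" unallocated, rounded up
metadata_gib=3        # assumed value for "a few G" of metadata allocation
data_used_pct=25      # "~25% used"

data_alloc_gib=$(( device_gib - unallocated_gib - metadata_gib ))
data_used_gib=$(( data_alloc_gib * data_used_pct / 100 ))   # integer division
data_slack_gib=$(( data_alloc_gib - data_used_gib ))

echo "data allocated:       ${data_alloc_gib} GiB"
echo "data used:            ${data_used_gib} GiB"
echo "slack in data chunks: ${data_slack_gib} GiB"
```

With data chunks only about a quarter full on average, a filter like
-dusage=50 would likely have matched most of them anyway, so the filtered
run would have done nearly as much work as the full one.  The actual
per-chunk distribution is unknown, so this is only an approximation.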
> 
> While general recommendations about running balance would be welcome, 
> what I needed was a dummy's guide to what the output of btrfs usage 
> _means_ and how to use balance to tackle problems with it.
This really is a great point.  Our documentation does a decent job as a 
reference for people who already have some idea what they're doing, but 
it is close to worthless for people who have no prior experience.
> 
> The other mystery is how the data allocation became so large.
The most common case is that you had a lot of data on the device, and 
then deleted most of it.  The kernel only deletes a chunk automatically 
once it becomes completely empty (either because the data that was in it 
becomes completely unused, or because a balance moved all the data).  So 
it's not unusual for filesystems that have been very active to have a 
reasonably large amount of empty space scattered around the data chunks, 
especially if they have the 'ssd' mount option set, which happens 
automatically on most SSDs and a lot of other things the kernel marks as 
not being rotational media.
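
One common way to reclaim that scattered space without a full balance is
to run data-only balances with a progressively higher usage filter, so
the emptiest chunks get compacted first.  The following is a minimal
sketch, not an official procedure: the balance_plan function name, the
/mnt/data mount point, and the particular thresholds are placeholders
made up for illustration.  It only prints the commands so they can be
reviewed before anything is run:

```shell
#!/bin/sh
# Print (but do not run) a series of data-only balances with an
# increasing usage filter. Each -dusage=N pass only considers data
# chunks whose usage is below N percent (0 matches completely empty ones).
balance_plan() {
    mnt="$1"                       # mount point of the btrfs volume
    for pct in 0 10 25 50; do      # illustrative thresholds, not a recommendation
        echo "btrfs balance start -dusage=${pct} ${mnt}"
    done
}

# Example: show the plan for a hypothetical mount point.
balance_plan /mnt/data
```

Checking `btrfs filesystem usage /mnt/data` before and after each pass
shows how much space went back to unallocated; see `man btrfs-balance`
for the exact filter semantics.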
