From: Goffredo Baroncelli <kreijack@libero.it>
To: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>,
Josef Bacik <josef@toxicpanda.com>
Cc: David Sterba <dsterba@suse.cz>,
Sinnamohideen Shafeeq <shafeeqs@panasas.com>,
Paul Jones <paul@pauljones.id.au>,
"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: [RFC][V8][PATCH 0/5] btrfs: allocation_hint mode
Date: Tue, 14 Dec 2021 20:03:45 +0100 [thread overview]
Message-ID: <71e523dc-2854-ca9b-9eee-e36b0bd5c2cb@libero.it> (raw)
In-Reply-To: <YbfN8gXHsZ6KZuil@hungrycats.org>
On 12/13/21 23:49, Zygo Blaxell wrote:
> On Mon, Dec 13, 2021 at 04:15:14PM -0500, Josef Bacik wrote:
>> On Mon, Dec 13, 2021 at 08:54:24PM +0100, Goffredo Baroncelli wrote:
>>> Gentle ping :-)
>>>
>>> Are there anyone of the mains developer interested in supporting this patch ?
>>>
>>> I am open to improve it if required.
>>>
>>
>> Sorry I missed this go by. I like the interface, we don't have a use for
>> device->type yet, so this fits nicely.
>>
>> I don't see the btrfs-progs patches in my inbox, and these don't apply, so
>> you'll definitely need to refresh for a proper review, but looking at these
>> patches they seem sane enough, and I like the interface. I'd like to hear
>> Zygo's opinion as well.
>
> I've been running earlier versions with modifications since summer 2020,
> and this version mostly unmodified (rebase changes only) since it was
> posted. It seems to work, even in corner cases like converting balances,
> replacing drives, and running out of space. The "running out of space"
> experience is on btrfs is weird at the best of times, and these patches
> add some more new special cases, but it doesn't behave in ways that
> would surprise a sysadmin familiar with how btrfs chunk allocation works.
>
> One major piece that's missing is adjusting the statvfs (aka df)
> available blocks field so that it doesn't include unallocated space on
> any metadata-only devices. Right now all the unallocated space on
> metadata-only devices is counted as free even though it's impossible to
> put a data block there, so anything that is triggered automatically
> on "f_bavail < some_threshold" will be confused.
>
> I don't think that piece has to block the rest of the patch series--if
> you're not using the feature, df gives the right number (or at least the
> same number it gave before), and if you are using the feature, you can
> subtract the unavailable data space until a later patch comes along to
> fix it.
>
> I like
>
> echo data_only > /sys/fs/btrfs/$uuid/devinfo/3/type
Only to be clear, for now you can pass a numeric value to "type". Not a text
like your example.
However I want to put on the table another option: to not expose all the
"type" field, but only the "allocation policy"; we can add a new sysfs field
called "allocation policy" that internally change the dev_item->type field.
It is not only a "cosmetic" change. If we want to change the allocation
policy, now the correct way is:
- read the type field
- change the "allocation policy" bits
- write the type field
Which is race 'prone'
For now it is not a problem, because type contains only the allocation bits.
But in future when the type field will contains further properties this could
be a problem.
>
> more than patching btrfs-progs so I can use
>
> btrfs prop set /dev/... allocation_hint data_only
>
> but I admit that might be because I'm weird.
I prefer the echo approach too; however it is not very ergonomics in conjunction
to sudo....
>
>> If we're going to use device->type for this, and since we don't have a user of
>> device->type, I'd also like you to go ahead and re-name ->type to
>> ->allocation_policy, that way it's clear what we're using it for now.
>>
>> I'd also like some xfstests to validate the behavior so we're sure we're testing
>> this. I'd want 1 test to just test the mechanics, like mkfs with different
>> policies and validate they're set right, change policies, add/remove disks with
>> different policies.
>>
>> Then a second test to do something like fsstress with each set of allocation
>> policies to validate that we did actually allocate from the correct disks. For
>> this please also test with compression on to make sure the test validation works
>> for both normal allocation and compression (ie it doesn't assume writing 5gib of
>> data == 5 gib of data usage, as compression chould give you a different value).
>>
>> With that in place I think this is the correct way to implement this feature.
>> Thanks,
>>
>> Josef
--
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
next prev parent reply other threads:[~2021-12-14 19:03 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-10-24 15:31 [RFC][V8][PATCH 0/5] btrfs: allocation_hint mode Goffredo Baroncelli
2021-10-24 15:31 ` [PATCH 1/4] btrfs: add flags to give an hint to the chunk allocator Goffredo Baroncelli
2021-10-24 15:31 ` [PATCH 2/4] btrfs: export dev_item.type in /sys/fs/btrfs/<uuid>/devinfo/<devid>/type Goffredo Baroncelli
2021-10-24 15:31 ` [PATCH 3/4] btrfs: change the DEV_ITEM 'type' field via sysfs Goffredo Baroncelli
2021-10-24 15:31 ` [PATCH 4/4] btrfs: add allocator_hint mode Goffredo Baroncelli
2021-12-17 15:58 ` Hans van Kranenburg
2021-12-17 18:28 ` Goffredo Baroncelli
2021-12-17 19:41 ` Zygo Blaxell
2021-12-18 9:07 ` Goffredo Baroncelli
2021-12-18 22:48 ` Zygo Blaxell
2021-12-19 0:03 ` Graham Cobb
2021-12-19 2:30 ` Zygo Blaxell
2021-12-13 9:39 ` [RFC][V8][PATCH 0/5] btrfs: allocation_hint mode Paul Jones
2021-12-13 19:54 ` Goffredo Baroncelli
2021-12-13 21:15 ` Josef Bacik
2021-12-13 22:49 ` Zygo Blaxell
2021-12-14 14:31 ` Josef Bacik
2021-12-14 19:03 ` Goffredo Baroncelli [this message]
2021-12-14 20:04 ` Zygo Blaxell
2021-12-14 20:34 ` Josef Bacik
2021-12-14 20:41 ` Goffredo Baroncelli
2021-12-15 13:58 ` Josef Bacik
2021-12-15 18:53 ` Goffredo Baroncelli
2021-12-16 0:56 ` Josef Bacik
2021-12-17 5:40 ` Zygo Blaxell
2021-12-17 14:48 ` Josef Bacik
2021-12-17 16:31 ` Zygo Blaxell
2021-12-17 18:08 ` Goffredo Baroncelli
2021-12-16 2:30 ` Paul Jones
2021-12-14 1:03 ` Sinnamohideen, Shafeeq
2021-12-14 18:53 ` Goffredo Baroncelli
2021-12-14 20:35 ` Josef Bacik
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=71e523dc-2854-ca9b-9eee-e36b0bd5c2cb@libero.it \
--to=kreijack@libero.it \
--cc=ce3g8jdj@umail.furryterror.org \
--cc=dsterba@suse.cz \
--cc=josef@toxicpanda.com \
--cc=kreijack@inwind.it \
--cc=linux-btrfs@vger.kernel.org \
--cc=paul@pauljones.id.au \
--cc=shafeeqs@panasas.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).