linux-btrfs.vger.kernel.org archive mirror
From: Goffredo Baroncelli <kreijack@libero.it>
To: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>,
	Josef Bacik <josef@toxicpanda.com>
Cc: David Sterba <dsterba@suse.cz>,
	Sinnamohideen Shafeeq <shafeeqs@panasas.com>,
	Paul Jones <paul@pauljones.id.au>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: [RFC][V8][PATCH 0/5] btrfs: allocation_hint mode
Date: Tue, 14 Dec 2021 20:03:45 +0100
Message-ID: <71e523dc-2854-ca9b-9eee-e36b0bd5c2cb@libero.it>
In-Reply-To: <YbfN8gXHsZ6KZuil@hungrycats.org>

On 12/13/21 23:49, Zygo Blaxell wrote:
> On Mon, Dec 13, 2021 at 04:15:14PM -0500, Josef Bacik wrote:
>> On Mon, Dec 13, 2021 at 08:54:24PM +0100, Goffredo Baroncelli wrote:
>>> Gentle ping :-)
>>>
>>> Is any of the main developers interested in supporting this patch?
>>>
>>> I am open to improve it if required.
>>>
>>
>> Sorry I missed this go by.  I like the interface, we don't have a use for
>> device->type yet, so this fits nicely.
>>
>> I don't see the btrfs-progs patches in my inbox, and these don't apply, so
>> you'll definitely need to refresh for a proper review, but looking at these
>> patches they seem sane enough, and I like the interface.  I'd like to hear
>> Zygo's opinion as well.
> 
> I've been running earlier versions with modifications since summer 2020,
> and this version mostly unmodified (rebase changes only) since it was
> posted.  It seems to work, even in corner cases like converting balances,
> replacing drives, and running out of space.  The "running out of space"
> experience on btrfs is weird at the best of times, and these patches
> add some more new special cases, but it doesn't behave in ways that
> would surprise a sysadmin familiar with how btrfs chunk allocation works.
> 
> One major piece that's missing is adjusting the statvfs (aka df)
> available blocks field so that it doesn't include unallocated space on
> any metadata-only devices.  Right now all the unallocated space on
> metadata-only devices is counted as free even though it's impossible to
> put a data block there, so anything that is triggered automatically
> on "f_bavail < some_threshold" will be confused.
> 
> I don't think that piece has to block the rest of the patch series--if
> you're not using the feature, df gives the right number (or at least the
> same number it gave before), and if you are using the feature, you can
> subtract the unavailable data space until a later patch comes along to
> fix it.
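
As a rough illustration of that interim workaround, a monitoring script
could do the subtraction itself. This is only a sketch: the helper name
usable_data_avail is hypothetical, both inputs are assumed to be byte
counts obtained elsewhere, and it assumes a "single" data profile so that
raw unallocated bytes map 1:1 to usable bytes.

```shell
#!/bin/sh
# usable_data_avail AVAIL_BYTES META_ONLY_UNALLOC_BYTES
# AVAIL_BYTES:             available space as reported by df/statvfs, in bytes
# META_ONLY_UNALLOC_BYTES: unallocated bytes on metadata-only devices
# Prints the space that can really hold data, never less than 0.
usable_data_avail() {
    avail=$1
    meta_only=$2
    usable=$(( avail - meta_only ))
    if [ "$usable" -lt 0 ]; then
        usable=0
    fi
    printf '%s\n' "$usable"
}
```

A threshold check would then compare against this value instead of the
raw f_bavail figure.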
> 
> I like
> 
> 	echo data_only > /sys/fs/btrfs/$uuid/devinfo/3/type

Just to be clear: for now you can only pass a numeric value to "type", not a
text string as in your example.

However, I want to put another option on the table: do not expose the whole
"type" field, but only the "allocation policy". We can add a new sysfs field
called "allocation policy" that internally changes the dev_item->type field.

It is not only a "cosmetic" change. If we want to change the allocation
policy, the correct way today is:
- read the type field
- change the "allocation policy" bits
- write the type field

which is race-prone.

For now this is not a problem, because type contains only the allocation bits.
But in the future, when the type field contains further properties, this could
become a problem.
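
To make the race concrete, here is a minimal sketch of that
read-modify-write sequence as a user-space script would have to do it.
The 2-bit mask, the helper name set_alloc_policy and the use of an
arbitrary file path in place of the sysfs attribute are assumptions made
for illustration; the real bit layout is whatever the patches define.

```shell
#!/bin/sh
# Assumption: the allocation policy occupies the low 2 bits of "type".
ALLOC_MASK=3

# set_alloc_policy FILE POLICY
# Replaces only the policy bits of the value stored in FILE.
# Nothing protects the window between the read and the write: if another
# writer updates FILE in between, its change is silently overwritten.
set_alloc_policy() {
    file=$1
    policy=$2
    old=$(cat "$file")                                        # read the type field
    new=$(( ($old & ~$ALLOC_MASK) | ($policy & $ALLOC_MASK) ))  # change the policy bits
    printf '%s\n' "$new" > "$file"                            # write the type field
}
```

A dedicated "allocation policy" attribute would let the kernel do the
masking itself under its own locking, so user space would issue a single
write.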

> 
> more than patching btrfs-progs so I can use
> 
> 	btrfs prop set /dev/... allocation_hint data_only
> 
> but I admit that might be because I'm weird.

I prefer the echo approach too; however, it is not very ergonomic in conjunction
with sudo...
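
The awkward part is that in "sudo echo ... > file" the redirection is
performed by the calling, unprivileged shell, not by sudo. The usual
workaround is to let tee open the file. A small sketch, where the helper
name write_sysfs and the SUDO override variable are my own invention:

```shell
#!/bin/sh
# write_sysfs VALUE PATH
# Writes VALUE to PATH with elevated privileges. tee runs under sudo, so
# tee (not the calling shell) opens PATH for writing.
# Set SUDO to an empty string when already running as root.
write_sysfs() {
    printf '%s\n' "$1" | ${SUDO-sudo} tee "$2" > /dev/null
}

# Example (the numeric value is only a placeholder):
#   write_sysfs 1 "/sys/fs/btrfs/$uuid/devinfo/3/type"
```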

> 
>> If we're going to use device->type for this, and since we don't have a user of
>> device->type, I'd also like you to go ahead and re-name ->type to
>> ->allocation_policy, that way it's clear what we're using it for now.
>>
>> I'd also like some xfstests to validate the behavior so we're sure we're testing
>> this.  I'd want 1 test to just test the mechanics, like mkfs with different
>> policies and validate they're set right, change policies, add/remove disks with
>> different policies.
>>
>> Then a second test to do something like fsstress with each set of allocation
>> policies to validate that we did actually allocate from the correct disks.  For
>> this please also test with compression on to make sure the test validation works
>> for both normal allocation and compression (ie it doesn't assume writing 5GiB of
>> data == 5GiB of data usage, as compression could give you a different value).
>>
>> With that in place I think this is the correct way to implement this feature.
>> Thanks,
>>
>> Josef


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5
