From: Josef Bacik <josef@toxicpanda.com>
To: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Cc: kreijack@inwind.it, David Sterba <dsterba@suse.cz>,
Sinnamohideen Shafeeq <shafeeqs@panasas.com>,
Paul Jones <paul@pauljones.id.au>,
"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: [RFC][V8][PATCH 0/5] btrfs: allocation_hint mode
Date: Tue, 14 Dec 2021 15:34:24 -0500
Message-ID: <Ybj/0ITsCQTBLkQF@localhost.localdomain>
In-Reply-To: <Ybj40IuxdaAy75Ue@hungrycats.org>

On Tue, Dec 14, 2021 at 03:04:32PM -0500, Zygo Blaxell wrote:
> On Tue, Dec 14, 2021 at 08:03:45PM +0100, Goffredo Baroncelli wrote:
> > On 12/13/21 23:49, Zygo Blaxell wrote:
> > > On Mon, Dec 13, 2021 at 04:15:14PM -0500, Josef Bacik wrote:
> > > > On Mon, Dec 13, 2021 at 08:54:24PM +0100, Goffredo Baroncelli wrote:
> > > > > Gentle ping :-)
> > > > >
> > > > > > Are any of the main developers interested in supporting this patch?
> > > > > >
> > > > > > I am open to improving it if required.
> > > > >
> > > >
> > > > Sorry I missed this go by. I like the interface, we don't have a use for
> > > > device->type yet, so this fits nicely.
> > > >
> > > > I don't see the btrfs-progs patches in my inbox, and these don't apply, so
> > > > you'll definitely need to refresh for a proper review, but looking at these
> > > > patches they seem sane enough, and I like the interface. I'd like to hear
> > > > Zygo's opinion as well.
> > >
> > > I've been running earlier versions with modifications since summer 2020,
> > > and this version mostly unmodified (rebase changes only) since it was
> > > posted. It seems to work, even in corner cases like converting balances,
> > > replacing drives, and running out of space. The "running out of space"
> > > > experience on btrfs is weird at the best of times, and these patches
> > > add some more new special cases, but it doesn't behave in ways that
> > > would surprise a sysadmin familiar with how btrfs chunk allocation works.
> > >
> > > One major piece that's missing is adjusting the statvfs (aka df)
> > > available blocks field so that it doesn't include unallocated space on
> > > any metadata-only devices. Right now all the unallocated space on
> > > metadata-only devices is counted as free even though it's impossible to
> > > put a data block there, so anything that is triggered automatically
> > > on "f_bavail < some_threshold" will be confused.
> > >
> > > I don't think that piece has to block the rest of the patch series--if
> > > you're not using the feature, df gives the right number (or at least the
> > > same number it gave before), and if you are using the feature, you can
> > > subtract the unavailable data space until a later patch comes along to
> > > fix it.
> > >
> > > I like
> > >
> > > echo data_only > /sys/fs/btrfs/$uuid/devinfo/3/type
> >
> > Just to be clear, for now you can only pass a numeric value to "type", not
> > text as in your example.
> >
> > However I want to put another option on the table: do not expose the whole
> > "type" field, but only the "allocation policy"; we can add a new sysfs field
> > called "allocation policy" that internally changes the dev_item->type field.
> >
> > It is not only a "cosmetic" change. If we want to change the allocation
> > policy, now the correct way is:
> > - read the type field
> > - change the "allocation policy" bits
> > - write the type field
> >
> > Which is race-prone.
> >
> > For now it is not a problem, because type contains only the allocation bits.
> > But in the future, when the type field contains further properties, this could
> > be a problem.
>
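The three steps listed above (read the type field, change the allocation policy bits, write it back) can be sketched as below. This is an illustration only: the mask width and the function name are hypothetical and are not taken from the patches.

```python
# Hypothetical: assume the allocation-policy bits are the low three bits.
ALLOCATION_HINT_MASK = 0b111


def with_allocation_hint(type_value, hint):
    """Return type_value with only the allocation-policy bits replaced."""
    if hint & ~ALLOCATION_HINT_MASK:
        raise ValueError("hint does not fit in the allocation-policy bits")
    # Keep every bit outside the mask, then set the new policy bits.
    return (type_value & ~ALLOCATION_HINT_MASK) | hint


# Userspace would do: old = read(type); new = with_allocation_hint(old, hint);
# write(type, new). If another writer changes a different bit between the
# read and the write, that change is silently overwritten.
```

A dedicated per-property file avoids the problem because the kernel can do the masking itself while holding whatever lock protects the field.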
> Yeah, keep the interface very narrow, don't hand out access to random bits.
>
> If the kernel supports additional bits, it should support additional
> sysfs filenames to go with them. Or it could put all the supported
> options in the sysfs field, like block IO schedulers do, so you could
> find this in the file by reading it:
>
> [prefer_data] prefer_metadata metadata_only data_only
>
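Reading a file in that scheduler-style format is straightforward. A minimal sketch, assuming the format is exactly one line of space-separated names with the active one in square brackets (the helper name is made up for illustration):

```python
def parse_bracketed_choices(text):
    """Split a scheduler-style sysfs line into (active, all_options)."""
    active = None
    options = []
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]  # strip the brackets around the active entry
            active = token
        options.append(token)
    return active, options


active, options = parse_bracketed_choices(
    "[prefer_data] prefer_metadata metadata_only data_only")
print(active)   # prefer_data
print(options)  # ['prefer_data', 'prefer_metadata', 'metadata_only', 'data_only']
```

A tool could use the returned list to validate a requested name before writing it back to the file.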
> > > more than patching btrfs-progs so I can use
> > >
> > > btrfs prop set /dev/... allocation_hint data_only
> > >
> > > but I admit that might be because I'm weird.
> >
> > I prefer the echo approach too; however it is not very ergonomic in conjunction
> > with sudo....
>
> For /proc/sys/* we have the 'sysctl' tool, so you can write 'sysctl
> vm.drop_caches=1' or 'sudo sysctl vm.drop_caches=1'. For some reason
> we don't have this for sysfs (or maybe it's just Debian...?) so we have
> to write things like 'echo foo | sudo tee /sys/fs/...'.
>
> Of course btrfs-progs could always open the
> /sys/fs/btrfs/.../allocation_policy file and write to it. But if we're
> modifying btrfs-progs then we could use the ioctl interface anyway.
>
> I don't have a strong preference for either sysfs or ioctl, nor am I
> opposed to simply implementing both. I'll let someone who does have
> such a preference make their case.

I think echo'ing a name into sysfs is better than bits for sure. However I want
the ability to set the device properties via a btrfs-progs command offline so I
can set up the storage and then mount the file system. I want

1) The sysfs interface so you can change things on the fly. This stays
   persistent of course, so the way it works is perfect.
2) The btrfs-progs command sets it on offline devices. If you point it at a
   live mounted fs it can simply use the sysfs thing to do it live.
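
The dispatch in (2) could look roughly like this. It is a sketch under stated assumptions, not a description of btrfs-progs: the function name is hypothetical, the caller is assumed to supply a mapping from device path to (filesystem UUID, devid) for devices that belong to mounted filesystems, and the sysfs path is the "type" file quoted earlier in this thread. The offline write itself is not shown.

```python
def plan_hint_update(device, hint, mounted):
    """Decide how to apply an allocation hint to a device.

    mounted: dict mapping device path -> (fs_uuid, devid) for devices that
    are part of a mounted filesystem (assumption: built by the caller).
    Returns a (method, target, hint) tuple describing what to do.
    """
    if device in mounted:
        fs_uuid, devid = mounted[device]
        # Live filesystem: write the hint through sysfs.
        target = f"/sys/fs/btrfs/{fs_uuid}/devinfo/{devid}/type"
        return ("sysfs", target, hint)
    # Unmounted device: the tool would update the on-disk dev item directly.
    return ("offline", device, hint)
```

For example, plan_hint_update("/dev/sdb", "data_only", {}) selects the offline path because the device is not listed as mounted.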

Does this seem reasonable? Thanks,

Josef