linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: kreijack@inwind.it
Cc: Josef Bacik <josef@toxicpanda.com>, linux-btrfs@vger.kernel.org
Subject: Re: [RFC][PATCH V5] btrfs: preferred_metadata: preferred device for metadata
Date: Thu, 21 Jan 2021 13:54:00 -0500	[thread overview]
Message-ID: <20210121185400.GH28049@hungrycats.org> (raw)
In-Reply-To: <765cec4e-b989-081b-2ad7-e2d1c9cf7f55@libero.it>

On Thu, Jan 21, 2021 at 07:16:05PM +0100, Goffredo Baroncelli wrote:
> On 1/20/21 5:02 PM, Josef Bacik wrote:
> > On 1/17/21 1:54 PM, Goffredo Baroncelli wrote:
> > > 
> > > Hi all,
> > > 
> > > This is an RFC; I wrote this patch because I find the idea interesting
> > > even though it adds more complication to the chunk allocator.
> > > 
> > > The basic idea is to store the metadata chunk in the fasters disks.
> > > The fasters disk are marked by the "preferred_metadata" flag.
> > > 
> > > BTRFS when allocate a new metadata/system chunk, selects the
> > > "preferred_metadata" disks, otherwise it selectes the non
> > > "preferred_metadata" disks. The intial patch allowed to use the other
> > > kind of disk in case a set is full.
> > > 
> > > This patches set is based on v5.11-rc2.
> > > 
> > > For now, the only user of this patch that I am aware is Zygo.
> > > However he asked to further constraint the allocation: i.e. avoid to
> > > allocated metadata on a not "preferred_metadata"
> > > disk. So I extended the patch adding 4 modes to operate.
> > > 
> > > This is enabled passing the option "preferred_metadata=<mode>" at
> > > mount time.
> > > 
> > 
> > I'll echo Zygo's hatred for mount options.  The more complicated policy decisions belong in properties and sysfs knobs, not mount options.
> > 
> I tend to agree. However adding a filesystem property can be done in a second time. I don't think that this a problem. However I prefer to make the patch smaller.
> 
> Anyway I have to point out that we need a way to change the allocation
> policy without changing the metadata otherwise we risk to be in the
> loop of exhausting metadata space: - how we can increase the space for
> metadata if we don't have space for metadata but I need to allocate
> few block of metadata....
> 
> What I mean is that even if we store the setting as filesystem
> properties (and definitely we have to do), we need a way to override
> in an emergency scenario.

There are no new issues introduced by this change, thus no requirement
for a mount option to deal with new issues.

The same issue comes up when changing RAID profile, or removing devices,
or when existing devices simply fill up.  Part of the solution is the
global reserve, which ensures we can always create a transaction to modify
a few metadata pages.

Part of the solution is a run-time check to ensure we have min_devs for
active RAID profiles whenever we change a device policy to reject data
or metadata (see btrfs_check_raid_min_devices).  This is currently
implemented for the device remove ioctl, and a similar check will
be needed for the device property set ioctl for preferred_metadata.
That part is missing in v5 of this patch and will have to be added,
though even now it works most of the time without.

v5 is also missing changes to the df statvfs code to deal with metadata
only devices.  At this stage it's an RFC patch, so that's OK, but it
will also need to be fixed.  We presume these will be addressed in future
versions.  Again, it works now, but 'df' will give the wrong number.

None of the above requirements is addressed by a mount option, and
the mount option adds new requirements that we don't want.

> > And then for the properties themselves, presumably we'll want to
> add other FS wide properties in the future.  I'm not against adding
> new actual keys and items to the tree itself, but is there a way
> we could use our existing property infrastructure that we use for
> compression, and simply store the xattrs in the tree root?  It looks
> like we're just toggling a policy decision, and we don't actually
> need the other properties in the item you've created, so why not
> just a btrfs.preferred_metadata property with the value stored in it,
> dropped into the tree_root so it can be read on mount?  Thanks,
> 
> What if the root subvolume is not mounted ? 

Same as device add or remove--if the filesystem isn't mounted, you can't
make any changes.

Note that all the required properties are per-device, so really you just
need any open FD on the filesystem.  (I think Josef didn't read that far
down).

The per-device policy storage can go in dev_root (tree 4) along with the
device stats data, if we don't want to use btrfs_device::type.  You'd still
need an ioctl to get to it.

Or maybe I'm misreading Josef here, and his idea is to make the per-device
configuration a string blob that can be set by putting an xattr on the
root subvol?  I'm not sure that's better, but it'll work.

> Yes we can create a further
> api to store/retrive this kind of metadata without mounting the root
> subvolume, but doing so in what it would be different than adding a
> key to the root fs like the default subvolume ioctl does ?

> > 
> > Josef
> 
> 
> -- 
> gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
> Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5
> 

  reply	other threads:[~2021-01-21 19:06 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-17 18:54 [RFC][PATCH V5] btrfs: preferred_metadata: preferred device for metadata Goffredo Baroncelli
2021-01-17 18:54 ` [PATCH 1/5] Add an ioctl to set the device properties Goffredo Baroncelli
2021-01-17 18:54 ` [PATCH 2/5] Add flags for dedicated metadata disks Goffredo Baroncelli
2021-01-17 18:54 ` [PATCH 3/5] Export dev_item.type in sysfs /sys/fs/btrfs/<uuid>/devinfo/<devid>/type Goffredo Baroncelli
2021-01-17 18:54 ` [PATCH 4/5] btrfs: add preferred_metadata option Goffredo Baroncelli
2021-01-17 18:54 ` [PATCH 5/5] btrfs: add preferred_metadata mode mount option Goffredo Baroncelli
2021-01-18  3:05   ` kernel test robot
2021-01-19 23:12 ` [RFC][PATCH V5] btrfs: preferred_metadata: preferred device for metadata Zygo Blaxell
2021-01-21  8:31   ` Martin Svec
2021-01-20 16:02 ` Josef Bacik
2021-01-20 16:15   ` Johannes Thumshirn
2021-01-20 16:17     ` Josef Bacik
2021-01-20 16:20   ` Zygo Blaxell
2021-01-21 18:16   ` Goffredo Baroncelli
2021-01-21 18:54     ` Zygo Blaxell [this message]
2021-01-22 18:31       ` Goffredo Baroncelli
2021-01-22 22:42         ` Zygo Blaxell
2021-01-23 14:55           ` Graham Cobb
2021-01-23 17:21             ` Zygo Blaxell
2021-01-23 17:44               ` Graham Cobb
2021-01-24  4:00                 ` Zygo Blaxell
2021-01-24 20:05                 ` Goffredo Baroncelli
2021-01-25 15:21 ` Josef Bacik
2023-01-15 17:00   ` Goffredo Baroncelli
2023-01-15 17:05     ` Goffredo Baroncelli
2023-01-16  8:20       ` Paul Jones

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210121185400.GH28049@hungrycats.org \
    --to=ce3g8jdj@umail.furryterror.org \
    --cc=josef@toxicpanda.com \
    --cc=kreijack@inwind.it \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).