linux-btrfs.vger.kernel.org archive mirror
From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: Josef Bacik <josef@toxicpanda.com>
Cc: kreijack@inwind.it, David Sterba <dsterba@suse.cz>,
	Sinnamohideen Shafeeq <shafeeqs@panasas.com>,
	Paul Jones <paul@pauljones.id.au>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: [RFC][V8][PATCH 0/5] btrfs: allocation_hint mode
Date: Fri, 17 Dec 2021 00:40:55 -0500
Message-ID: <Ybwi58Uivf29oGhw@hungrycats.org>
In-Reply-To: <YbqOwN7SW7NWm5/S@localhost.localdomain>

On Wed, Dec 15, 2021 at 07:56:32PM -0500, Josef Bacik wrote:
> On Wed, Dec 15, 2021 at 07:53:40PM +0100, Goffredo Baroncelli wrote:
> > On 12/15/21 14:58, Josef Bacik wrote:
> > > On Tue, Dec 14, 2021 at 09:41:21PM +0100, Goffredo Baroncelli wrote:
> > > > On 12/14/21 21:34, Josef Bacik wrote:
> > > > > On Tue, Dec 14, 2021 at 03:04:32PM -0500, Zygo Blaxell wrote:
> > > > > > On Tue, Dec 14, 2021 at 08:03:45PM +0100, Goffredo Baroncelli wrote:
> > > > 
> > > > > > 
> > > > > > I don't have a strong preference for either sysfs or ioctl, nor am I
> > > > > > opposed to simply implementing both.  I'll let someone who does have
> > > > > > such a preference make their case.
> > > > > 
> > > > > I think echo'ing a name into sysfs is better than bits for sure.  However I want
> > > > > the ability to set the device properties via a btrfs-progs command offline so I
> > > > > can setup the storage and then mount the file system.  I want
> > > > > 
> > > > > 1) The sysfs interface so you can change things on the fly.  This stays
> > > > >      persistent of course, so the way it works is perfect.
> > > > > 
> > > > > 2) The btrfs-progs command sets it on offline devices.  If you point it at a
> > > > >      live mounted fs it can simply use the sysfs thing to do it live.
> > > > 
> > > > #2 is currently not implemented. However, I think that we should do it.
> > > > 
> > > > The problem is that we need to update both:
> > > > 
> > > > - the superblock		(simple)
> > > > - the dev_item item		(not so simple...)
> > > > 
> > > > What about using only bits from the superblock to store this property ?
> > > 
> > > I'm looking at the patches and you only are updating the dev_item, am I missing
> > > something for the super block?
> > 
> > When btrfs writes the superblocks (see write_all_supers() in disk-io.c), it copies
> > the dev_item fields (contained in the fs_info->fs_devices->devices list) into each
> > superblock before updating it.
> > 
> 
> Oh right.  Still, I hope we're doing this correctly in btrfs-progs, if not
> that's a problem.
> 
> > > 
> > > For offline all you would need to do is do the normal open_ctree,
> > > btrfs_search_slot to the item and update the device item type, that's
> > > straightforward.
> > > 
> > > For online if you use btrfs prop you can see if the fs is mounted and just find
> > > the sysfs file to modify and do it that way.
> > > 
> > > But this also brings up another point, we're going to want a compat bit for
> > > this.  It doesn't make the fs unusable for old kernels, so just a normal
> > > BTRFS_FS_COMPAT_<whatever> flag is fine.  If the setting gets set you set the
> > > compat flag.
> > 
> > Why do we need a "compat" bit? New kernels know how to treat the dev_item_type field.
> > Old kernels ignore it. The worst case is that a filesystem may require a balance
> > before reaching a good shape (i.e. the metadata on SSD and the data on a spinning disk).
> 
> So you can do the validation below, tho I'm thinking I care about it less, if we
> just make sure that type is correct regardless of the compat bit then that's
> fine.  Thanks,

In theory if you get stuck in an impossible allocation situation (like all
your disks are data-only and you run out of metadata space) then one way
to recover from it is to mount with an old kernel which doesn't respect
the type bits.  Another way to recover would be to flip the type bits
while the filesystem is offline with btrfs-progs.  A third way would be to
have a mount option for newer kernels to ignore the allocation bits like
old kernels do (yes I know I already said I didn't like that idea).
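
To make the stuck case concrete, here is a toy model of device selection.
It is not the code from the patch set: the hint names, the Device class and
the ignore_hints switch are all made up for illustration.  ignore_hints=True
stands in for both an old kernel and the hypothetical mount option.

```python
from dataclasses import dataclass

# Hypothetical hint names; the patch set stores a value in dev_item.type.
DATA_ONLY = "data_only"
METADATA_ONLY = "metadata_only"
NO_HINT = "none"


@dataclass
class Device:
    devid: int
    hint: str
    unallocated: int  # bytes not yet assigned to any chunk


def candidates(devices, chunk_kind, ignore_hints=False):
    """Return the devices that may host a new chunk of chunk_kind.

    chunk_kind is "data" or "metadata".  With ignore_hints=True every
    device with free space qualifies, which models a kernel that does
    not look at the hints at all.
    """
    excluded = {"data": METADATA_ONLY, "metadata": DATA_ONLY}[chunk_kind]
    return [
        d for d in devices
        if d.unallocated > 0 and (ignore_hints or d.hint != excluded)
    ]


# Every disk is data-only, so a metadata chunk has nowhere to go.
devs = [Device(1, DATA_ONLY, 10 << 30), Device(2, DATA_ONLY, 10 << 30)]
stuck = candidates(devs, "metadata")
rescue = candidates(devs, "metadata", ignore_hints=True)
```

In this model "stuck" comes back empty while "rescue" contains both devices,
which is the whole point of keeping a way to ignore the hints.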

If we have a bit that says "old kernels don't mount this filesystem any
more" then we lose one of those recovery options, and the other options
aren't implemented yet.
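
For reference, a rough sketch of how the three kinds of feature bits behave
at mount time.  Unknown compat bits are ignored, unknown compat_ro bits
allow only a read-only mount, and unknown incompat bits refuse the mount.
The function and the ALLOC_HINT bit value below are illustrative only, not
kernel code and not an assigned flag.

```python
from enum import Enum


class MountResult(Enum):
    READ_WRITE = "rw"
    READ_ONLY = "ro only"
    REFUSED = "refused"


def mount_decision(known_compat_ro, known_incompat, sb_compat_ro, sb_incompat):
    """Decide how a kernel may mount, given the feature bits it understands.

    Plain compat bits never block a mount, so they are not passed in.
    """
    if sb_incompat & ~known_incompat:
        return MountResult.REFUSED
    if sb_compat_ro & ~known_compat_ro:
        return MountResult.READ_ONLY
    return MountResult.READ_WRITE


ALLOC_HINT = 1 << 5  # hypothetical bit, for illustration only

# An old kernel (knows no such bit) meeting the flag in each position:
as_compat = mount_decision(0, 0, 0, 0)             # compat bit is invisible here
as_compat_ro = mount_decision(0, 0, ALLOC_HINT, 0)
as_incompat = mount_decision(0, 0, 0, ALLOC_HINT)
```

Only the first case keeps the "mount with an old kernel" escape hatch fully
intact, which is why a plain compat flag is the least harmful choice here.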

While I think of it, the metadata reservation system eventually needs
to know that it can't use data-only devices for metadata, the same way
that df eventually needs to know about metadata-only devices.
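
A back-of-the-envelope sketch of that accounting, assuming a simple model
where each device has one hint and some unallocated space.  The hint names
and the allocatable() helper are invented for this example, and RAID
profiles are ignored entirely.

```python
# Hypothetical hint names, not taken from the patch set.
DATA_ONLY = "data_only"
METADATA_ONLY = "metadata_only"
NO_HINT = "none"


def allocatable(devices, chunk_kind):
    """Sum the unallocated space that chunk_kind is allowed to use.

    devices is a list of (hint, unallocated_gib) pairs and chunk_kind
    is "data" or "metadata".
    """
    excluded = {"data": METADATA_ONLY, "metadata": DATA_ONLY}[chunk_kind]
    return sum(free for hint, free in devices if hint != excluded)


# Unallocated space in GiB: one big data-only disk, one small
# metadata-only SSD, and one disk with no hint set.
devices = [(DATA_ONLY, 8000), (METADATA_ONLY, 100), (NO_HINT, 200)]

naive_total = sum(free for _, free in devices)
for_metadata = allocatable(devices, "metadata")
for_data = allocatable(devices, "data")
```

The naive total is what a hint-unaware reservation or df would report; the
two per-kind figures are smaller, and the metadata one is much smaller.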


> Josef
> 
