From: Hugo Mills <hugo@carfax.org.uk>
To: Austin S Hemmelgarn <ahferroin7@gmail.com>
Cc: dsterba@suse.cz,
	"Holger Hoffstätte" <holger.hoffstaette@googlemail.com>,
	linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] btrfs: Fix no space bug caused by removing bg
Date: Tue, 22 Sep 2015 15:39:30 +0000
Message-ID: <20150922153930.GK5918@carfax.org.uk>
In-Reply-To: <56016BB5.6060101@gmail.com>


On Tue, Sep 22, 2015 at 10:54:45AM -0400, Austin S Hemmelgarn wrote:
> On 2015-09-22 10:36, Hugo Mills wrote:
> >On Tue, Sep 22, 2015 at 04:23:33PM +0200, David Sterba wrote:
> >>On Tue, Sep 22, 2015 at 01:41:31PM +0000, Hugo Mills wrote:
> >>>On Tue, Sep 22, 2015 at 03:36:43PM +0200, Holger Hoffstätte wrote:
> >>>>On 09/22/15 14:59, Jeff Mahoney wrote:
> >>>>(snip)
> >>>>>So if the way we want to prevent the loss of raid type info is by
> >>>>>maintaining the last block group allocated with that raid type, fine,
> >>>>>but that's a separate discussion.  Personally, I think keeping 1GB
> >>>>
> >>>>At this point I'm much more surprised to learn that the RAID type can
> >>>>apparently get "lost" in the first place, and is not persisted
> >>>>separately. I mean..wat?
> >>>
> >>>    It's always been like that, unfortunately.
> >>>
> >>>    The code tries to use the RAID type that's already present to work
> >>>out what the next allocation should be. If there aren't any chunks in
> >>>the FS, the configuration is lost, because it's not stored anywhere
> >>>else. It's one of the things that tripped me up badly when I was
> >>>failing to rewrite the chunk allocator last year.
> >>
> >>Yeah, right now there's no persistent default for the allocator. I'm
> >>still hoping that the object properties will magically solve that.
> >
> >    There's no obvious place that filesystem-wide properties can be
> >stored, though. There's a userspace tool to manipulate the few current
> >FS-wide properties, but that's all special-cased to use the
> >"historical" ioctls for those properties, with no generalisation of a
> >property store, or even (IIRC) any external API for them.
> >
> >    We're nominally using xattrs in the btrfs: namespace on directories
> >and files, and presumably on the top directory of a subvolume for
> >subvol-wide properties, but it's not clear where the FS-wide values
> >should go: in the top directory of subvolid=5 would be confusing,
> >because then you couldn't separate the properties for *that subvol*
> >from the ones for the whole FS (say, the default replication policy,
> >where you might want the top subvol to have different properties from
> >everything else).
> Possibly do special names for the defaults and store them there?  In
> general, I personally see little value in having some special
> 'default' properties however.

   That would work.

> The way I would expect things to work is that a new subvolume
> inherits its properties from its parent (if it's a snapshot),

   Definitely this.

> or
> from the next higher subvolume it's nested in.

   I don't think I like this. I'm not quite sure why, though, at the
moment.

   It definitely makes the process at the start of allocating a new
block group much more complex: you have to walk back up through an
arbitrary depth of nested subvols to find the one that's actually got
a replication policy record in it. (Because after this feature is
brought in, there will be lots of filesystems without per-subvol
replication policies in them, and we have to have some way of dealing
with those as well).

   With an FS default policy, you only need check the current subvol,
and then fall back to the FS default if that's not found.
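
   As a rough illustration of the two lookup strategies compared above,
here is a minimal sketch. Everything in it is hypothetical: the Subvol
class, its fields and both function names are invented for the example
and do not correspond to any btrfs structure or API. It only shows how
many subvols each strategy has to examine before it has an answer.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Subvol:
    """Hypothetical stand-in for a subvolume; not a real btrfs structure."""
    name: str
    policy: Optional[str] = None        # e.g. "raid1"; None = no policy record
    parent: Optional["Subvol"] = None   # the subvol this one is nested in


def policy_by_walking_up(subvol: Subvol) -> Tuple[Optional[str], int]:
    """Nested inheritance: climb until some ancestor has a policy record.

    Returns (policy, number of subvols examined). The policy is None if
    no subvol in the chain carries a record at all.
    """
    examined = 0
    node: Optional[Subvol] = subvol
    while node is not None:
        examined += 1
        if node.policy is not None:
            return node.policy, examined
        node = node.parent
    return None, examined


def policy_with_fs_default(subvol: Subvol,
                           fs_default: str) -> Tuple[str, int]:
    """FS-wide default: check the current subvol only, then fall back."""
    if subvol.policy is not None:
        return subvol.policy, 1
    return fs_default, 1


if __name__ == "__main__":
    # Three levels of nesting below a top subvol that holds the only record.
    top = Subvol("top", policy="raid1")
    a = Subvol("a", parent=top)
    b = Subvol("b", parent=a)
    c = Subvol("c", parent=b)

    print(policy_by_walking_up(c))             # examines c, b, a, top
    print(policy_with_fs_default(c, "raid1"))  # examines c only
```

   With the walk, the cost grows with nesting depth, and a filesystem
that has no policy records anywhere yields no answer at all; with an FS
default, the lookup is a single check plus a guaranteed fallback.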

   These things are, I think, likely to be lightly used: I would be
reasonably surprised to find more than two or possibly three storage
policies in use on any given system with a sane sysadmin.

   I'm actually not sure what the interactions of multiple storage
policies are going to be like. It's entirely possible, particularly
with some of the more exotic (but useful) suggestions I've thought of,
that the behaviour of the FS is dependent on the order in which the
block groups are allocated. (i.e. "20 GiB to subvol-A, then 20 GiB to
subvol-B" results in different behaviour than "1 GiB to subvol-A then
1 GiB to subvol-B and repeat"). I tried some simple Monte-Carlo
simulations, but I didn't get any concrete results out of them before
the end of the train journey. :)

>  This would obviate
> the need for some special 'default' properties, and would be
> relatively intuitive behavior for a significant majority of people.

   Of course, you shouldn't be nesting subvolumes anyway. It makes
it much harder to manage them.

   Hugo.

-- 
Hugo Mills             | What's a Nazgûl like you doing in a place like this?
hugo@... carfax.org.uk |
http://carfax.org.uk/  |
PGP: E2AB1DE4          |                                                Illiad


