From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>, linux-xfs@vger.kernel.org
Subject: Re: [RFC PATCH 0/14] xfs: Towards thin provisioning aware filesystems
Date: Mon, 6 Nov 2017 08:01:00 -0500
Message-ID: <20171106130100.GA30884@bfoster.bfoster>
In-Reply-To: <20171105225028.GB5858@dastard>

On Mon, Nov 06, 2017 at 09:50:28AM +1100, Dave Chinner wrote:
> On Fri, Nov 03, 2017 at 07:36:23AM -0400, Brian Foster wrote:
> > On Thu, Nov 02, 2017 at 07:47:40PM -0700, Darrick J. Wong wrote:
> > > FWIW the way I've been modelling this patch series in my head is that we
> > > format an arbitrarily large filesystem (m_LBA_size) address space on a
> > > thinp, feed statfs an "adjusted" size (m_usable_size) which restricts
> > > how much space we can allocate, and now growfs increases or decreases
> > > the adjusted size without having to relocate anything or mess with the
> > > address space.  If the adjusted size ever exceeds the address space
> > > size, then we tack on more AGs like we've always done.  From that POV,
> > > there's no need to physically shrink (i.e. relocate) anything (and we
> > > can leave that for later/never).
> 
> [...]
> 
> > For example, suppose we had an absolutely crude, barebones implementation
> > of physical shrink right now that basically trimmed the amount of space
> > from the end of the fs iff those AGs were completely empty and otherwise
> > returned -EBUSY. There is no other userspace support, etc. As such, this
> > hypothetical feature is extremely limited to being usable immediately
> > after a growfs and thus probably has no use case other than "undo my
> > accidental growfs."
> > 
> > If we had that right now, _then_ what would the logical shrink interface
> > look like?
> 
> Absolutely no different to what I'm proposing we do right now. That
> is, the behaviour of the "shrink to size X" ioctl is determined by
> the feature bit in the superblock.  Hence if the thinspace feature
> is set we do a thin shrink, and if it is not set we do a physical
> shrink. i.e. grow/shrink behaviour is defined by the kernel
> implementation, not the user or the interface.
> 

I don't buy that argument at all. ;) What you describe above may be
reasonable for the current situation where shrink doesn't actually exist
(or thin comes first), but the above example assumes that there is at
least one simple and working physical shrink use case wired up to the
existing interface already. What you suggest means we would change the
behavior of a _working_ interface to do something completely different
based on a feature bit in the filesystem.

> As it is, I still can't see a use case or compelling reason for
> physically shrinking a thin filesystem. What's the use case that
> leads you to think that we need to physically shrink a thin
> filesystem, Brian? Let's get that on the table first, rather than
> waste time discussing hypothetical what-if's....
> 

I think you're missing the point. I'm really not that concerned about
whether we ultimately allow physical shrink or not on thin filesystems.
We can make that decision down the road. As mentioned previously, the
decision to support physical shrink on a thin enabled filesystem is
distinct from preserving the ability to do so via the current interface.

My concern is that the decision to override this interface has the
potential to create a mess later, both with the kernel interface and
from the perspective of xfsprogs usability. I don't want to try and
repeat the weird corner cases where I think that could materialize
because I can't seem to get that across well enough for you to consider
the requisite conditions. Instead, perhaps I'll just try to describe
what I think this interface should look like at a high level...

- Define a new version of struct xfs_growfs_data that includes a
  new_blocks field, new_usable_blocks field and imaxpct. Also include
  whatever mechanical changes are necessary to rev. the interface (i.e.,
  version number, padding, etc.).

  This means that the updated growfs interface supports the ability to
  physically and logically grow/shrink independent from whatever feature
  decisions we make in the future. This also means the logical
  grow/shrink interface is a bit more flexible because we don't have to
  enforce logical grow on physical grow. Finally, if we ever do support
  physical and logical shrink together, the potential for having to
  consider whether a growfs_data->newblocks shrink command means logical
  shrink because it comes from an older (but post-thin) xfsprogs or
  physical shrink because it comes from some 3rd party application goes
  away completely. The kernel interface is clear and well-defined.
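
To make that concrete, here is a rough sketch of what such a structure
and its sanity checks might look like. None of this is from a posted
patch: the struct name, field order, version value, pad size and the
rule that usable blocks may not exceed physical blocks in a single
request are all assumptions for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical v2 layout; all names and sizes here are illustrative. */
struct growfs_data_v2 {
	uint32_t version;           /* interface revision, 2 in this sketch */
	uint32_t imaxpct;           /* max inode space percentage */
	uint64_t new_blocks;        /* physical (address space) size, fs blocks */
	uint64_t new_usable_blocks; /* logical (usable) size, fs blocks */
	uint64_t pad[4];            /* reserved for future use, must be zero */
};

/* No implicit holes, so the layout is the same for 32 and 64 bit callers. */
static_assert(sizeof(struct growfs_data_v2) == 56, "unexpected padding");

enum growfs_action { GROWFS_NONE, GROWFS_GROW, GROWFS_SHRINK };

static enum growfs_action growfs_classify(uint64_t cur, uint64_t req)
{
	if (req > cur)
		return GROWFS_GROW;
	if (req < cur)
		return GROWFS_SHRINK;
	return GROWFS_NONE;
}

/*
 * Validate a request and report the physical and logical actions
 * separately. Returns 0 on success or a negative errno.
 */
static int growfs_v2_check(const struct growfs_data_v2 *in,
			   uint64_t cur_blocks, uint64_t cur_usable,
			   enum growfs_action *phys,
			   enum growfs_action *logical)
{
	if (in->version != 2)
		return -EINVAL;
	for (int i = 0; i < 4; i++)
		if (in->pad[i] != 0)
			return -EINVAL;
	/* Assumption: usable space cannot exceed the address space. */
	if (in->new_usable_blocks > in->new_blocks)
		return -EINVAL;

	*phys = growfs_classify(cur_blocks, in->new_blocks);
	*logical = growfs_classify(cur_usable, in->new_usable_blocks);
	return 0;
}
```

The point of the sketch is only that the caller states the physical and
logical targets explicitly, so the kernel never has to infer which kind
of shrink was meant from a feature bit.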

For userspace, we have at least a couple options:

- Implement the same behavior as you've already proposed: physically and
  logically grow together, logical shrink only when hasthin == true,
  otherwise return an error. The point here is that using a more
  flexible kernel interface doesn't preclude/enforce how we decide to
  expose this feature in xfsprogs, and afaict doesn't introduce any new
  backwards compatibility issues since we have to update xfsprogs
  anyways. Older xfsprogs would retain the same behavior it has today.

Or...

- Create a separate logical grow/shrink parameter in xfs_growfs (i.e., a
  -T param analogous to -D for physical blocks). This ensures that
  logical shrink/grow can execute independent from physical shrink/grow,
  that there is no potential for confusion over logical vs. physical
  shrink on thin filesystems and in particular, that if we do ever
  support physical shrink but do _not_ support it on thin fs', that
  there is a distinct physical shrink command that _will return an error
  from the kernel_ even after userspace grows support for physical
  shrink, rather than succeed doing something other than what the user
  might have anticipated.

  Note that even with a separate thin parameter, I think we could still
  consider support of logical+physical grow via the xfs_growfs -d
  parameter.
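
As a toy model of the semantics I'm after with separate parameters, the
sketch below keeps the physical and logical resize paths distinct. It is
not kernel or xfsprogs code. The struct, the function names, the chosen
errno values and the assumption that physical shrink is unsupported
everywhere are all hypothetical.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical in-memory view of the sizes under discussion. */
struct fs_sizes {
	bool     has_thin;       /* thinspace feature bit set */
	uint64_t blocks;         /* physical address space, fs blocks */
	uint64_t usable_blocks;  /* logical usable space, fs blocks */
};

/* Analogue of a physical (-D style) request. */
static int resize_physical(struct fs_sizes *fs, uint64_t new_blocks)
{
	/*
	 * Assumption: physical shrink is not implemented. It fails
	 * loudly and never falls back to a logical shrink.
	 */
	if (new_blocks < fs->blocks)
		return -EOPNOTSUPP;

	fs->blocks = new_blocks;
	/* On a non-thin fs the usable size tracks the physical size. */
	if (!fs->has_thin)
		fs->usable_blocks = new_blocks;
	return 0;
}

/* Analogue of a logical (-T style) request. */
static int resize_logical(struct fs_sizes *fs, uint64_t new_usable)
{
	if (!fs->has_thin)
		return -EINVAL;
	/* Usable space has to fit inside the existing address space. */
	if (new_usable > fs->blocks)
		return -EINVAL;

	fs->usable_blocks = new_usable;
	return 0;
}
```

With this split, a user who asks for a physical shrink on a thin fs gets
an error and has no reason to go on and shrink the underlying volume.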

I find the first option unwise because it similarly muddles the
userspace syntax of any future physical shrink. For example, consider a
user who comes along after we support physical shrink (particularly if
we don't support it on thin fs), runs the "shrink the data section"
command mistakenly thinking it freed up block address space, and then
runs the corresponding lvresize command to shrink the volume because the
fs shrink succeeded. That leaves the user in a data loss scenario.

That is primarily due to user error of course, but maybe the user in
that example simply picked the wrong volume out of tens or hundreds of
others and didn't realize it was thin in the first place. I'm sure we'll
have documentation and whatnot to absolve us from blame, but IMO this is
all much more usable with distinct interfaces for physical vs. logical
adjustments.

In summary, my arguments here consist mostly of a collection of red
flags that I see rather than hard incompatibilities or specific use
cases I want to support. The problematic situations change depending on
whether we decide to support physical shrink on thin fs or not and so
it's not really possible or important to try and pin them all down.
OTOH, it's also quite possible that none of them ever materialize at
all.

If they do, I'm pretty sure we could find ways to address each one
individually as we progress, or document potentially confusing behavior
appropriately, etc. The larger point is that I think much of this simply
goes away with a cleaner interface. IMO, this boils down to what I think
is just a matter of practicing good software engineering and system/user
interface design.

Brian

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
