From: Gao Xiang <hsiangkao@redhat.com>
To: Brian Foster <bfoster@redhat.com>
Cc: "Darrick J. Wong" <djwong@kernel.org>,
	linux-xfs@vger.kernel.org,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	Eric Sandeen <sandeen@sandeen.net>,
	Dave Chinner <david@fromorbit.com>,
	Christoph Hellwig <hch@infradead.org>
Subject: Re: [PATCH v6 6/7] xfs: support shrinking unused space in the last AG
Date: Thu, 4 Feb 2021 21:58:30 +0800
Message-ID: <20210204135830.GD149518@xiangao.remote.csb>
In-Reply-To: <20210204123303.GA3716033@bfoster>

Hi Brian,

On Thu, Feb 04, 2021 at 07:33:03AM -0500, Brian Foster wrote:
> On Thu, Feb 04, 2021 at 03:02:17AM +0800, Gao Xiang wrote:

....

> > > 
> > > Long question:
> > > 
> > > The reason why we use (nb - dblocks) is because growfs is an all or
> > > nothing operation -- either we succeed in writing new empty AGs and
> > > inflating the (former) last AG of the fs, or we don't do anything at
> > > all.  We don't allow partial growing; if we did, then delta would be
> > > relevant here.  I think we get away with not needing to run transactions
> > > for each AG because those new AGs are inaccessible until we commit the
> > > new agcount/dblocks, right?
> > > 
> > > In your design for the fs shrinker, do you anticipate being able to
> > > eliminate all the eligible AGs in a single transaction?  Or do you
> > > envision only tackling one AG at a time?  And can we be partially
> > > successful with a shrink?  e.g. we succeed at eliminating the last AG,
> > > but then the one before that isn't empty and so we bail out, but by that
> > > point we did actually make the fs a little bit smaller.
> > 
> > Thanks for your question. I'm about to sleep, but I'll try to answer
> > your question here.
> > 
> > As for my current experiment / understanding, I think eliminating all
> > the empty AGs + shrinking the tail AG in a single transaction is possible;
> > that is what I have done for now:
> >  1) check that the remaining AGs are empty (from the nagcount AG to the
> >     oagcount - 1 AG) and mark them all inactive (AGs frozen);
> >  2) consume an extent from the (nagcount - 1) AG;
> >  3) decrease agcount from oagcount to nagcount.
> > 
> > Both 2) and 3) can be done in the same transaction, and after 1) the state
> > of such empty AGs is fixed as well. So the on-disk fs and runtime states
> > are all updated atomically.
> > 
> > > 
> > > There's this comment at the bottom of xfs_growfs_data() that says that
> > > we can return error codes if the secondary sb update fails, even if the
> > > new size is already live.  This convinces me that it's always been the
> > > case that callers of the growfs ioctl are supposed to re-query the fs
> > > geometry afterwards to find out if the fs size changed, even if the
> > > ioctl itself returns an error... which implies that partial grow/shrink
> > > are a possibility.
> > > 
> > 
> > I didn't realize that possibility, but if my understanding is correct,
> > the process works as described above, so by design there is no need for
> > incremental shrinking. But it also supports incremental shrinking if
> > users invoke the ioctl multiple times.
> > 
> 
> This was one of the things I wondered about on an earlier version of
> this work; whether we wanted shrink to be deliberately incremental or
> not. I suspect that somewhat applies to even this version without AG
> truncation because technically we could allocate as much as possible out
> of the end of the last AG and shrink by that amount. My initial thought was
> that if the implementation is going to be opportunistic (i.e., we
> provide no help to actually free up targeted space), perhaps an
> incremental implementation is a useful means to allow the operation to
> make progress. E.g., run a shrink, observe it didn't fully complete,
> shuffle around some files, repeat, etc. 
> 
> IIRC, one of the downsides of that sort of approach is any use case
> where the goal is an underlying storage device resize. I suppose an
> underlying device resize could also be opportunistic, but it seems more
> likely to me that use case would prefer an all or nothing approach,
> particularly if associated userspace tools don't really know how to
> handle a partially successful fs shrink. Do we have any idea how other
> tools/fs' behave in this regard (I thought ext4 supported shrink)? FWIW,
> it also seems potentially annoying to ask for a largish shrink only for
> the tool to hand back something relatively tiny.
> 
> Based on your design description, it occurs to me that perhaps the ideal
> outcome is an implementation that supports a fully atomic all-or-nothing
> shrink (assuming this is reasonably possible), but supports an optional
> incremental mode specified by the interface. IOW, if we have the ability
> to perform all-or-nothing, then it _seems_ like a minor interface
> enhancement to support incremental on top of that as opposed to the
> other way around. Therefore, perhaps that should be the initial goal
> until shown to be too complex or otherwise problematic..?
> 

I cannot say too much about this, but my current observation is that
shrinking the tail AG [+ empty AGs (optional)] in one transaction
is practical (I don't see any barrier so far [1]). I'm implementing
an atomic all-or-nothing truncation, and userspace can use it either
in an all-or-nothing way (I saw Dave's spaceman work before) or in an
incremental way (using a binary search approach and multiple ioctls)...
In principle, supporting the ioctl with an extra partial shrinking
feature is practical as well (but additional work might be needed).
Also, I'm not sure it's user-friendly, since most end users probably
want an all-or-nothing shrinking result (at least in the fs truncation
step).

btw, afaik (my understanding is limited), Ext4 shrinking is an offline
approach, so it's somewhat easier to implement (no need to consider
any runtime impact), and it is also considered an all-or-nothing
truncation (although it also supports -M to shrink the filesystem to
the minimum size, I think that can be implemented with multiple
all-or-nothing shrink ioctls...)

Thanks,
Gao Xiang

[1] it's somewhat outdated, but I'd like to finish this tail AG patchset
first
https://git.kernel.org/pub/scm/linux/kernel/git/xiang/linux.git/log/?h=xfs/shrink2

> Brian
>


