All of lore.kernel.org
 help / color / mirror / Atom feed
From: Gao Xiang <hsiangkao@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [RFC PATCH 1/4] xfs: support deactivating AGs
Date: Thu, 15 Apr 2021 15:08:33 +0800	[thread overview]
Message-ID: <20210415070833.GD1864610@xiangao.remote.csb> (raw)
In-Reply-To: <20210415062824.GN63242@dread.disaster.area>

Hi Dave,

On Thu, Apr 15, 2021 at 04:28:24PM +1000, Dave Chinner wrote:
> On Thu, Apr 15, 2021 at 12:28:37PM +0800, Gao Xiang wrote:
> > Hi Dave,
> > 
> > On Thu, Apr 15, 2021 at 01:42:55PM +1000, Dave Chinner wrote:
> > > On Thu, Apr 15, 2021 at 03:52:37AM +0800, Gao Xiang wrote:
> > > > To get rid of paralleled requests related to AGs which are pending
> > > > for shrinking, mark these perags as inactive rather than playing
> > > > with per-ag structures theirselves.
> > > > 
> > > > Since in that way, a per-ag lock can be used to stablize the inactive
> > > > status together with agi/agf buffer lock (which is much easier than
> > > > adding more complicated perag_{get, put} pairs..) Also, Such per-ags
> > > > can be released / reused when unmountfs / growfs.
> > > > 
> > > > On the read side, pag_inactive_rwsem can be unlocked immediately after
> > > > the agf or agi buffer lock is acquired. However, pag_inactive_rwsem
> > > > can only be unlocked after the agf/agi buffer locks are all acquired
> > > > with the inactive status on the write side.
> > > > 
> > > > XXX: maybe there are some missing cases.
> > > > 
> > > > Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
> > > > ---
> > > >  fs/xfs/libxfs/xfs_ag.c     | 16 +++++++++++++---
> > > >  fs/xfs/libxfs/xfs_alloc.c  | 12 +++++++++++-
> > > >  fs/xfs/libxfs/xfs_ialloc.c | 26 +++++++++++++++++++++++++-
> > > >  fs/xfs/xfs_mount.c         |  2 ++
> > > >  fs/xfs/xfs_mount.h         |  6 ++++++
> > > >  5 files changed, 57 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c
> > > > index c68a36688474..ba5702e5c9ad 100644
> > > > --- a/fs/xfs/libxfs/xfs_ag.c
> > > > +++ b/fs/xfs/libxfs/xfs_ag.c
> > > > @@ -676,16 +676,24 @@ xfs_ag_get_geometry(
> > > >  	if (agno >= mp->m_sb.sb_agcount)
> > > >  		return -EINVAL;
> > > >  
> > > > +	pag = xfs_perag_get(mp, agno);
> > > > +	down_read(&pag->pag_inactive_rwsem);
> > > 
> > > No need to encode the lock type in the lock name. We know it's a
> > > rwsem from the lock API functions...
> > > 
> > > > +
> > > > +	if (pag->pag_inactive) {
> > > > +		error = -EBUSY;
> > > > +		up_read(&pag->pag_inactive_rwsem);
> > > > +		goto out;
> > > > +	}
> > > 
> > > This looks kinda heavyweight. Having to take a rwsem whenever we do
> > > a perag lookup to determine if we can access the perag completely
> > > defeats the purpose of xfs_perag_get() being a lightweight, lockless
> > > operation.
> > 
> > I'm not sure if it has some regression since write lock will be only
> > taken when shrinking (shrinking is a rare operation), for most cases
> > which is much similiar to perag radix root I think.
> 
> It's still an extra pair of atomic operation per xfs_perag_get/put()
> call pairs. pag_inactive being true is the slow/rare path, so we
> should be trying to keep the overhead of detecting that path out of
> all our fast paths...
> 
> Indeed, xfs_perag_get() already shows up on profiles because of the
> number of calls we make to it, so adding an extra atomic operation
> for this operation is going to be noticable in terms of CPU usage,
> if nothing else.
> 
> > The locking logic is that, when pag->pag_inactive = false -> true,
> > the write lock, AGF/AGI locks all have to be taken in advance.
> > 
> > > 
> > > I suspect what we really want here is active/passive references like
> > > are used for the superblock, and an API that hides the
> > > implementation from all the callers.
> > 
> > If my understanding is correct, my own observation these months is
> > that the current XFS codebase is not well suitable to accept !pag
> > (due to many logic assumes pag structure won't go away, since some
> >  are indexed/passed by agno rather than some pag reference count).
> 
> Maybe so, but that's exactly what this patch is addressing - it's
> adding a way to detect that perag has "gone away"i and should not be
> referenced any more.
> 
> It wasn't until I saw this patch that I realised that there is a way
> that we can, in fact, safely handle perags being freed by ensuring
> that the RCU lookup fails for these "active" references....
> 
> > Even I think we could introduce some active references, but handle
> > the cover range is still a big project. The current approach assumes
> > pag won't go away except for umounting and blocks allocation / imap/
> > ... paths to access that.
> 
> The way I described should work just fine - nobody should be
> accessing the per-ag without a reference gained through a lookup in
> some way.  Buffers carry a passive reference, because the per-ag
> teardown will do the teardown of those references before the perag
> is freed.  Everything else carries an active reference and so
> teardown cannot begin until all active references go away.
> 
> > My current thought is that we could implement it in that way as the
> > first step (in order to land the shrinking functionality to let
> > end-users benefit of this), and by the codebase evolves, it can be
> > transformed to a more gentle way.
> 
> I think converting this patchset to active/passive references ias
> I've described solves the problem entirely - there's no "evolving"
> needed as we can solve it with this one structural change...

Since currently even xfs_perag_put() reaches zero, it won't free
the per-ag anyway (it may just use to mark the pointer is no longer
used in the context? not sure what's the exact use of the such pairs),
so in practice I think after active/passive references are introduced,
there is still the only one real reference count that works for the
per-ag lifetime management and currently it doesn't manage whole
lifetime at all...

So (my own understanding is) I think in practice, that approachs
would be somewhat equal to relocate/rearrange xfs_perag_get()/put()
pair to manage the perag lifetime instead. and maybe use xfs_perag_ptr()
to access perag when some reference count is available.

Thanks,
Gao Xiang

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 


  reply	other threads:[~2021-04-15  7:08 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-14 19:52 [RFC PATCH 0/4] xfs: support shrinking empty AGs Gao Xiang
2021-04-14 19:52 ` [RFC PATCH 1/4] xfs: support deactivating AGs Gao Xiang
2021-04-15  3:42   ` Dave Chinner
2021-04-15  4:28     ` Gao Xiang
2021-04-15  6:28       ` Dave Chinner
2021-04-15  7:08         ` Gao Xiang [this message]
2021-04-15  8:44           ` Dave Chinner
2021-04-14 19:52 ` [RFC PATCH 2/4] xfs: check ag is empty Gao Xiang
2021-04-15  3:52   ` Dave Chinner
2021-04-15  4:34     ` Gao Xiang
2021-04-14 19:52 ` [RFC PATCH 3/4] xfs: introduce max_agcount Gao Xiang
2021-04-15  3:59   ` Dave Chinner
2021-04-14 19:52 ` [RFC PATCH 4/4] xfs: support shrinking empty AGs Gao Xiang
2021-04-15  4:25   ` Dave Chinner
2021-04-15  5:22     ` Gao Xiang
2021-04-15  8:33       ` Dave Chinner
2021-04-15 17:00         ` Darrick J. Wong
2021-04-15 21:24           ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210415070833.GD1864610@xiangao.remote.csb \
    --to=hsiangkao@redhat.com \
    --cc=david@fromorbit.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.