All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 11/22] xfs: add a perag to the btree cursor
Date: Wed, 12 May 2021 20:49:21 -0700	[thread overview]
Message-ID: <20210513034921.GQ8582@magnolia> (raw)
In-Reply-To: <20210513010750.GC2893@dread.disaster.area>

On Thu, May 13, 2021 at 11:07:50AM +1000, Dave Chinner wrote:
> On Wed, May 12, 2021 at 05:55:08PM -0700, Darrick J. Wong wrote:
> > On Thu, May 13, 2021 at 10:12:16AM +1000, Dave Chinner wrote:
> > > On Wed, May 12, 2021 at 03:40:06PM -0700, Darrick J. Wong wrote:
> > > > On Thu, May 06, 2021 at 05:20:43PM +1000, Dave Chinner wrote:
> > > > > From: Dave Chinner <dchinner@redhat.com>
> > > > > 
> > > > > Which will eventually completely replace the agno in it.
> > > > > 
> > > > > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > > > > ---
> > > > >  fs/xfs/libxfs/xfs_alloc.c          | 25 +++++++++++++++----------
> > > > >  fs/xfs/libxfs/xfs_alloc_btree.c    | 13 ++++++++++---
> > > > >  fs/xfs/libxfs/xfs_alloc_btree.h    |  3 ++-
> > > > >  fs/xfs/libxfs/xfs_btree.c          |  2 ++
> > > > >  fs/xfs/libxfs/xfs_btree.h          |  4 +++-
> > > > >  fs/xfs/libxfs/xfs_ialloc.c         | 16 ++++++++--------
> > > > >  fs/xfs/libxfs/xfs_ialloc_btree.c   | 15 +++++++++++----
> > > > >  fs/xfs/libxfs/xfs_ialloc_btree.h   |  7 ++++---
> > > > >  fs/xfs/libxfs/xfs_refcount.c       |  4 ++--
> > > > >  fs/xfs/libxfs/xfs_refcount_btree.c | 17 ++++++++++++-----
> > > > >  fs/xfs/libxfs/xfs_refcount_btree.h |  2 +-
> > > > >  fs/xfs/libxfs/xfs_rmap.c           |  6 +++---
> > > > >  fs/xfs/libxfs/xfs_rmap_btree.c     | 17 ++++++++++++-----
> > > > >  fs/xfs/libxfs/xfs_rmap_btree.h     |  2 +-
> > > > >  fs/xfs/scrub/agheader_repair.c     | 20 +++++++++++---------
> > > > >  fs/xfs/scrub/bmap.c                |  2 +-
> > > > >  fs/xfs/scrub/common.c              | 12 ++++++------
> > > > >  fs/xfs/scrub/repair.c              |  5 +++--
> > > > >  fs/xfs/xfs_discard.c               |  2 +-
> > > > >  fs/xfs/xfs_fsmap.c                 |  6 +++---
> > > > >  fs/xfs/xfs_reflink.c               |  2 +-
> > > > >  21 files changed, 112 insertions(+), 70 deletions(-)
> > > > > 
> > > > > diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> > > > > index ce31c00dbf6f..7ec4af6bf494 100644
> > > > > --- a/fs/xfs/libxfs/xfs_alloc.c
> > > > > +++ b/fs/xfs/libxfs/xfs_alloc.c
> > > > > @@ -776,7 +776,8 @@ xfs_alloc_cur_setup(
> > > > >  	 */
> > > > >  	if (!acur->cnt)
> > > > >  		acur->cnt = xfs_allocbt_init_cursor(args->mp, args->tp,
> > > > > -					args->agbp, args->agno, XFS_BTNUM_CNT);
> > > > > +						args->agbp, args->agno,
> > > > > +						args->pag, XFS_BTNUM_CNT);
> > > > 
> > > > If we still have to pass the AG[FI] buffer into the _init_cursor
> > > > functions, why not get the perag reference from the xfs_buf and
> > > > eliminate the agno/pag parameter?  It looks like cursors get their own
> > > > active reference to the perag, so I think only the _stage_cursor
> > > > function needs to be passed a perag structure, right?
> > > 
> > > Because when I convert this to active/passive perag references, the
> > > buffers only have a passive reference and they can't be converted to
> > > active references.
> > >
> > > Active references provide the barrier that prevents high
> > > level code from accessing/entering the AG while a shrink (or other
> > > offline type event) is in the process of tearing down that AG. The
> > > process of tearing down the AG still may require the ability to
> > > read/write to the AG metadata (e.g. checking the AG is fully
> > > empty), so we still need the buffer cache to work while in this
> > > transient offline state. Hence we need passive reference counts for
> > > the buffers, because having cached buffers should not impact on the
> > > functioning of the high level "don't use this AG anymore" barrier.
> > 
> > Agreed; it's past time to shift the world towards the idea that one
> > grabs the incore object and only then starts grabbing buffers.  But even
> > if you've done it correctly:
> > 
> > 	pag = xfs_perag_get_active(agno);
> > 	xfs_alloc_read_agf(tp, ..., &agfbp);
> > 
> > Why not pass in just the buffer?
> > 
> > 	cur = xfs_rmapbt_init_cursor(tp, agfbp, pag);
> > 
> > Is the idea here simply that we don't want to give people the idea that
> > they can just read the agf and pass it to xfs_rmapbt_init_cursor?  In
> > other words, the third parameter is there to give the people the
> > impression that pag and agfbp are both equally important tokens, and
> > that callers must obtain both and pass them in?
> 
> I'm angling to hide the agfbp entirely here - move the allocation
> serialisation to the perag using a mutex rwsem, not the AGF buffer.
> We don't actually need the AGF until we have to modify it, so it's
> lock hold time during allocation can be cut way down if it is not
> used to serialise the entire allocation operation. This would also
> get us lockdep coverage for agi/agf locking, etc.

Hmmm.  Recently I've been wrangling with a concurrency problem between
scrub and io threads running a series of different AG btree updates
(e.g. bunmap, which runs a deferred rmap-delete and then a deferred
refcount-decrease).  Because the only way we serialize access to the AGs
is through the AG buffer locks and we don't maintain that lock across
the transaction roll between defer items, it's possible that scrub can
step in and grab the buffer lock between items, which it then marks as
cross-reference corruption because the number of rmaps for a given block
doesn't match the number of references in the refcount btree.  The log
intent items guarantee consistency (and the defer items are queue in the
correct order to avoid accidental freeing) but this is suboptimal.

Right now I've "fixed" it by adding a per-ag counter of pending intents
and adapting scrub to lock the AG buffers, check the counter, and if
it's nonzero, it'll cycle the buffers and try again.  This works, but
it's a little silly.  Depending on how the AG mutex works, this could
fix that problem for scrub.

> Also, I'm intending that the perag init functions which currently
> take a agno and then do a lookup for the perag to initialise it will
> take a perag, not a agno. We almost always already have the perag
> when we do a buffer read to initialise the perag...

<nod>

> > No monkeying around
> > with the buffer's passive reference by higher level code, at all, ever?
> 
> Except in the buffer cache itself or callouts directly from the
> buffer cache that operation under the buffer's existence guarantee
> for the perag.

Got it.

> > I guess I can live with that.  I might just be micro-optimising function
> > call parameter counts.  :P
> 
> Oh, I'm planning on chopping them down, but going the other way.
> Cache the ag bps in the perag so we don't ahve to do buffer cache
> lookups to find them, nor do we have to pass them around anywhere
> that we pass a perag....

That's gonna hurt on explodefs filesystems with a million AGs... ;)

Ok, I'm satisfied.
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

  reply	other threads:[~2021-05-13  3:49 UTC|newest]

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-06  7:20 [RFC 00/22] xfs: initial agnumber -> perag conversions for shrink Dave Chinner
2021-05-06  7:20 ` [PATCH 01/22] xfs: move xfs_perag_get/put to xfs_ag.[ch] Dave Chinner
2021-05-10 12:52   ` Brian Foster
2021-05-11  7:18     ` Dave Chinner
2021-05-10 22:28   ` Darrick J. Wong
2021-05-06  7:20 ` [PATCH 02/22] xfs: prepare for moving perag definitions and support to libxfs Dave Chinner
2021-05-10 12:53   ` Brian Foster
2021-05-11  7:19     ` Dave Chinner
2021-05-06  7:20 ` [PATCH 03/22] xfs: move perag structure and setup to libxfs/xfs_ag.[ch] Dave Chinner
2021-05-10 22:26   ` Darrick J. Wong
2021-05-10 23:38     ` Dave Chinner
2021-05-06  7:20 ` [PATCH 04/22] xfs: make for_each_perag... a first class citizen Dave Chinner
2021-05-10 12:53   ` Brian Foster
2021-05-11  7:35     ` Dave Chinner
2021-05-11 12:29       ` Brian Foster
2021-05-11 21:33         ` Dave Chinner
2021-05-12 21:58           ` Darrick J. Wong
2021-05-06  7:20 ` [PATCH 05/22] xfs: convert raw ag walks to use for_each_perag Dave Chinner
2021-05-10 12:54   ` Brian Foster
2021-05-06  7:20 ` [PATCH 06/22] xfs: convert xfs_iwalk to use perag references Dave Chinner
2021-05-10 13:41   ` Brian Foster
2021-05-12 22:08   ` Darrick J. Wong
2021-05-06  7:20 ` [PATCH 07/22] xfs: convert secondary superblock walk to use perags Dave Chinner
2021-05-10 13:41   ` Brian Foster
2021-05-12 22:09   ` Darrick J. Wong
2021-05-06  7:20 ` [PATCH 08/22] xfs: pass perags through to the busy extent code Dave Chinner
2021-05-11 12:29   ` Brian Foster
2021-05-12 22:13   ` Darrick J. Wong
2021-05-06  7:20 ` [PATCH 09/22] xfs: push perags through the ag reservation callouts Dave Chinner
2021-05-11 12:29   ` Brian Foster
2021-05-13  0:29     ` Dave Chinner
2021-05-12 22:16   ` Darrick J. Wong
2021-05-06  7:20 ` [PATCH 10/22] xfs: pass perags around in fsmap data dev functions Dave Chinner
2021-05-11 12:30   ` Brian Foster
2021-05-12 22:23   ` Darrick J. Wong
2021-05-18  1:00     ` Dave Chinner
2021-05-06  7:20 ` [PATCH 11/22] xfs: add a perag to the btree cursor Dave Chinner
2021-05-11 12:30   ` Brian Foster
2021-05-11 20:51     ` Darrick J. Wong
2021-05-11 21:52       ` Dave Chinner
2021-05-12 12:49         ` Brian Foster
2021-05-12 22:41           ` Darrick J. Wong
2021-05-12 22:40   ` Darrick J. Wong
2021-05-13  0:12     ` Dave Chinner
2021-05-13  0:55       ` Darrick J. Wong
2021-05-13  1:07         ` Dave Chinner
2021-05-13  3:49           ` Darrick J. Wong [this message]
2021-05-06  7:20 ` [PATCH 12/22] xfs: convert rmap btree cursor to using a perag Dave Chinner
2021-05-11 12:30   ` Brian Foster
2021-05-12 22:45   ` Darrick J. Wong
2021-05-13  3:54     ` Darrick J. Wong
2021-05-06  7:20 ` [PATCH 13/22] xfs: convert refcount btree cursor to use perags Dave Chinner
2021-05-11 12:30   ` Brian Foster
2021-05-12 22:47   ` Darrick J. Wong
2021-05-06  7:20 ` [PATCH 14/22] xfs: convert allocbt cursors " Dave Chinner
2021-05-11 12:30   ` Brian Foster
2021-05-13  3:55   ` Darrick J. Wong
2021-05-06  7:20 ` [PATCH 15/22] xfs: use perag for ialloc btree cursors Dave Chinner
2021-05-11 12:30   ` Brian Foster
2021-05-13  3:55   ` Darrick J. Wong
2021-05-06  7:20 ` [PATCH 16/22] xfs: remove agno from btree cursor Dave Chinner
2021-05-11 12:34   ` Brian Foster
2021-05-11 22:02     ` Dave Chinner
2021-05-12 22:52   ` Darrick J. Wong
2021-05-06  7:20 ` [PATCH 17/22] xfs: simplify xfs_dialloc_select_ag() return values Dave Chinner
2021-05-12 12:49   ` Brian Foster
2021-05-12 22:55   ` Darrick J. Wong
2021-05-06  7:20 ` [PATCH 18/22] xfs: collapse AG selection for inode allocation Dave Chinner
2021-05-12 12:52   ` Brian Foster
2021-05-18  1:21     ` Dave Chinner
2021-05-12 23:11   ` Darrick J. Wong
2021-05-18  1:29     ` Dave Chinner
2021-05-06  7:20 ` [PATCH 19/22] xfs: get rid of xfs_dir_ialloc() Dave Chinner
2021-05-06 11:26   ` kernel test robot
2021-05-06 11:26     ` kernel test robot
2021-05-06 11:26   ` [RFC PATCH] xfs: xfs_dialloc_ag can be static kernel test robot
2021-05-06 11:26     ` kernel test robot
2021-05-12 12:52   ` [PATCH 19/22] xfs: get rid of xfs_dir_ialloc() Brian Foster
2021-05-12 23:19   ` Darrick J. Wong
2021-05-06  7:20 ` [PATCH 20/22] xfs: inode allocation can use a single perag instance Dave Chinner
2021-05-12 12:52   ` Brian Foster
2021-05-12 23:19   ` Darrick J. Wong
2021-05-06  7:20 ` [PATCH 21/22] xfs: clean up and simplify xfs_dialloc() Dave Chinner
2021-05-12 21:49   ` Darrick J. Wong
2021-05-18  1:46     ` Dave Chinner
2021-05-06  7:20 ` [PATCH 22/22] xfs: use perag through unlink processing Dave Chinner
2021-05-12 21:37   ` Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210513034921.GQ8582@magnolia \
    --to=djwong@kernel.org \
    --cc=david@fromorbit.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.