linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Brian Foster <bfoster@redhat.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 10/10] xfs: refactor INUMBERS to use iwalk functions
Date: Tue, 11 Jun 2019 13:01:41 -0400	[thread overview]
Message-ID: <20190611170137.GC12395@bfoster> (raw)
In-Reply-To: <20190611162127.GT1871505@magnolia>

On Tue, Jun 11, 2019 at 09:21:28AM -0700, Darrick J. Wong wrote:
> On Tue, Jun 11, 2019 at 11:08:51AM -0400, Brian Foster wrote:
> > On Tue, Jun 04, 2019 at 02:50:33PM -0700, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > Now that we have generic functions to walk inode records, refactor the
> > > INUMBERS implementation to use it.
> > > 
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > ---
> > >  fs/xfs/xfs_ioctl.c   |   20 ++++--
> > >  fs/xfs/xfs_ioctl.h   |    2 +
> > >  fs/xfs/xfs_ioctl32.c |   35 ++++------
> > >  fs/xfs/xfs_itable.c  |  168 ++++++++++++++++++++------------------------------
> > >  fs/xfs/xfs_itable.h  |   22 +------
> > >  fs/xfs/xfs_iwalk.c   |  161 ++++++++++++++++++++++++++++++++++++++++++++++--
> > >  fs/xfs/xfs_iwalk.h   |   12 ++++
> > 
> > It looks like there's a decent amount of xfs_iwalk code changes in this
> > patch. That should probably be a separate patch, at minimum to have a
> > separate changelog to document the changes/updates required for
> > inumbers.
> 
> <nod> I'll break out the iwalk changes into a separate patch so that
> this one only has the changes needed to wire up the ioctl.
> 
> > >  7 files changed, 262 insertions(+), 158 deletions(-)
> > > 
> > > 
> > ...
> > > diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c
> > > index 06abe5c9c0ee..bade54d6ac64 100644
> > > --- a/fs/xfs/xfs_itable.c
> > > +++ b/fs/xfs/xfs_itable.c
> > > @@ -259,121 +259,85 @@ xfs_bulkstat(
> > >  	return error;
> > >  }
> > >  
> > > -int
> > > -xfs_inumbers_fmt(
> > > -	void			__user *ubuffer, /* buffer to write to */
> > > -	const struct xfs_inogrp	*buffer,	/* buffer to read from */
> > > -	long			count,		/* # of elements to read */
> > > -	long			*written)	/* # of bytes written */
> > > +struct xfs_inumbers_chunk {
> > > +	inumbers_fmt_pf		formatter;
> > > +	struct xfs_ibulk	*breq;
> > > +};
> > > +
> > > +/*
> > > + * INUMBERS
> > > + * ========
> > > + * This is how we export inode btree records to userspace, so that XFS tools
> > > + * can figure out where inodes are allocated.
> > > + */
> > > +
> > > +/*
> > > + * Format the inode group structure and report it somewhere.
> > > + *
> > > + * Similar to xfs_bulkstat_one_int, lastino is the inode cursor as we walk
> > > + * through the filesystem so we move it forward unless there was a runtime
> > > + * error.  If the formatter tells us the buffer is now full we also move the
> > > + * cursor forward and abort the walk.
> > > + */
> > > +STATIC int
> > > +xfs_inumbers_walk(
> > > +	struct xfs_mount	*mp,
> > > +	struct xfs_trans	*tp,
> > > +	xfs_agnumber_t		agno,
> > > +	const struct xfs_inobt_rec_incore *irec,
> > > +	void			*data)
> > >  {
> > > -	if (copy_to_user(ubuffer, buffer, count * sizeof(*buffer)))
> > > -		return -EFAULT;
> > > -	*written = count * sizeof(*buffer);
> > > -	return 0;
> > > +	struct xfs_inogrp	inogrp = {
> > > +		.xi_startino	= XFS_AGINO_TO_INO(mp, agno, irec->ir_startino),
> > > +		.xi_alloccount	= irec->ir_count - irec->ir_freecount,
> > > +		.xi_allocmask	= ~irec->ir_free,
> > > +	};
> > 
> > Not related to this patch, but I'm wondering if we should be using
> > xfs_inobt_irec_to_allocmask() here to mask off holes from the resulting
> > allocation bitmap. Eh, I guess it's misleading either way..
> 
> Holes were supposed to be marked in ir_free also, right?
> 

Hm, yeah I think so. So right now holes would be reported as free via
inumbers, which is probably what we want. Must have been thinking about
this backwards... disregard!

Brian

> So (assuming the irec isn't corrupt) we should be protected against
> reporting a hole as an "allocated" inode, right?
> 
> > > +	struct xfs_inumbers_chunk *ic = data;
> > > +	xfs_agino_t		agino;
> > > +	int			error;
> > > +
> > > +	error = ic->formatter(ic->breq, &inogrp);
> > > +	if (error && error != XFS_IBULK_BUFFER_FULL)
> > > +		return error;
> > > +	if (error == XFS_IBULK_BUFFER_FULL)
> > > +		error = XFS_INOBT_WALK_ABORT;
> > > +
> > > +	agino = irec->ir_startino + XFS_INODES_PER_CHUNK;
> > > +	ic->breq->startino = XFS_AGINO_TO_INO(mp, agno, agino);
> > > +	return error;
> > >  }
> > >  
> > ...
> > > diff --git a/fs/xfs/xfs_iwalk.c b/fs/xfs/xfs_iwalk.c
> > > index c4a9c4c246b7..3a35d1cf7e14 100644
> > > --- a/fs/xfs/xfs_iwalk.c
> > > +++ b/fs/xfs/xfs_iwalk.c
> > ...
> > > @@ -286,7 +298,7 @@ xfs_iwalk_ag_start(
> > >  	 * have to deal with tearing down the cursor to walk the records.
> > >  	 */
> > >  	error = xfs_iwalk_grab_ichunk(*curpp, agino, &icount,
> > > -			&iwag->recs[iwag->nr_recs]);
> > > +			&iwag->recs[iwag->nr_recs], trim);
> > 
> > Hmm, it looks like we could lift the lookup from xfs_iwalk_grab_ichunk()
> > up into xfs_iwalk_ag_start() and avoid needing to pass trim down
> > multiple levels. In fact, if we're not trimming the record we don't need
> > to grab the record at all in this path. We could do the lookup (setting
> > has_more) then bounce right up to the core walker algorithm, right? If
> > so, that seems a bit cleaner in terms of only using special cased code
> > when it's actually necessary.
> 
> Right.
> 
> > 
> > >  	if (error)
> > >  		return error;
> > >  	if (icount)
> > ...
> > > @@ -561,3 +574,135 @@ xfs_iwalk_threaded(
> > ...
> > > +/*
> > > + * Walk all inode btree records in a single AG, from @iwag->startino to the end
> > > + * of the AG.
> > > + */
> > > +STATIC int
> > > +xfs_inobt_walk_ag(
> > > +	struct xfs_iwalk_ag		*iwag)
> > > +{
> > > +	struct xfs_mount		*mp = iwag->mp;
> > > +	struct xfs_trans		*tp = iwag->tp;
> > > +	struct xfs_buf			*agi_bp = NULL;
> > > +	struct xfs_btree_cur		*cur = NULL;
> > > +	xfs_agnumber_t			agno;
> > > +	xfs_agino_t			agino;
> > > +	int				has_more;
> > > +	int				error = 0;
> > > +
> > > +	/* Set up our cursor at the right place in the inode btree. */
> > > +	agno = XFS_INO_TO_AGNO(mp, iwag->startino);
> > > +	agino = XFS_INO_TO_AGINO(mp, iwag->startino);
> > > +	error = xfs_iwalk_ag_start(iwag, agno, agino, &cur, &agi_bp, &has_more,
> > > +			false);
> > > +
> > > +	while (!error && has_more && !xfs_pwork_want_abort(&iwag->pwork)) {
> > > +		struct xfs_inobt_rec_incore	*irec;
> > > +
> > > +		cond_resched();
> > > +
> > > +		/* Fetch the inobt record. */
> > > +		irec = &iwag->recs[iwag->nr_recs];
> > > +		error = xfs_inobt_get_rec(cur, irec, &has_more);
> > > +		if (error || !has_more)
> > > +			break;
> > > +
> > > +		/*
> > > +		 * If there's space in the buffer for more records, increment
> > > +		 * the btree cursor and grab more.
> > > +		 */
> > > +		if (++iwag->nr_recs < iwag->sz_recs) {
> > > +			error = xfs_btree_increment(cur, 0, &has_more);
> > > +			if (error || !has_more)
> > > +				break;
> > > +			continue;
> > > +		}
> > > +
> > > +		/*
> > > +		 * Otherwise, we need to save cursor state and run the callback
> > > +		 * function on the cached records.  The run_callbacks function
> > > +		 * is supposed to return a cursor pointing to the record where
> > > +		 * we would be if we had been able to increment like above.
> > > +		 */
> > > +		error = xfs_iwalk_run_callbacks(iwag, xfs_inobt_walk_ag_recs,
> > > +				agno, &cur, &agi_bp, &has_more);
> > > +	}
> > > +
> > > +	xfs_iwalk_del_inobt(tp, &cur, &agi_bp, error);
> > > +
> > > +	/* Walk any records left behind in the cache. */
> > > +	if (iwag->nr_recs == 0 || error || xfs_pwork_want_abort(&iwag->pwork))
> > > +		return error;
> > > +
> > > +	return xfs_inobt_walk_ag_recs(iwag);
> > > +}
> > 
> > Similar comments apply here as for the previous xfs_iwalk_ag() patch.
> > Though looking at it, the only differences here are the lack of free
> > inode check, readahead and the callback function (particularly once you
> > consider the partial completion refactoring we discussed earlier). I
> > think this could all be generalized with a single flag such that there's
> > no need for separate xfs_[inobt|iwalk]_ag() functions.
> 
> Yep.  Already refactoring that. :)
> 
> > Hmmm.. perhaps we could use a flag or separate function pointers in
> > struct xfs_iwalk_ag to accomplish the same thing all the way up through
> > the record walker functions. IOW, xfs_iwalk_ag_recs() looks like it
> > could call either callback based on which is defined in the
> > xfs_iwalk_ag.
> 
> <nod>
> 
> > This could all be done as a separate patch of course, if that's easier.
> 
> I might just go back and remove the function pointer from run_callbacks
> in the earlier patches...
> 
> > 
> > > +
> > > +/*
> > > + * Walk all inode btree records in the filesystem starting from @startino.  The
> > > + * @inobt_walk_fn will be called for each btree record, being passed the incore
> > > + * record and @data.  @max_prefetch controls how many inobt records we try to
> > > + * cache ahead of time.
> > > + */
> > > +int
> > > +xfs_inobt_walk(
> > > +	struct xfs_mount	*mp,
> > > +	struct xfs_trans	*tp,
> > > +	xfs_ino_t		startino,
> > > +	xfs_inobt_walk_fn	inobt_walk_fn,
> > > +	unsigned int		max_prefetch,
> > > +	void			*data)
> > > +{
> > > +	struct xfs_iwalk_ag	iwag = {
> > > +		.mp		= mp,
> > > +		.tp		= tp,
> > > +		.inobt_walk_fn	= inobt_walk_fn,
> > > +		.data		= data,
> > > +		.startino	= startino,
> > > +		.pwork		= XFS_PWORK_SINGLE_THREADED,
> > > +	};
> > > +	xfs_agnumber_t		agno = XFS_INO_TO_AGNO(mp, startino);
> > > +	int			error;
> > > +
> > > +	ASSERT(agno < mp->m_sb.sb_agcount);
> > > +
> > > +	xfs_iwalk_set_prefetch(&iwag, max_prefetch * XFS_INODES_PER_CHUNK);
> > 
> > A brief comment above this line would be helpful. Something like:
> > 
> > 	/* translate inumbers record count to inode count */
> 
> Done.  Thanks for the review!
> 
> --D
> 
> > Brian
> > 
> > > +	error = xfs_iwalk_alloc(&iwag);
> > > +	if (error)
> > > +		return error;
> > > +
> > > +	for (; agno < mp->m_sb.sb_agcount; agno++) {
> > > +		error = xfs_inobt_walk_ag(&iwag);
> > > +		if (error)
> > > +			break;
> > > +		iwag.startino = XFS_AGINO_TO_INO(mp, agno + 1, 0);
> > > +	}
> > > +
> > > +	xfs_iwalk_free(&iwag);
> > > +	return error;
> > > +}
> > > diff --git a/fs/xfs/xfs_iwalk.h b/fs/xfs/xfs_iwalk.h
> > > index 76d8f87a39ef..20bee93d4676 100644
> > > --- a/fs/xfs/xfs_iwalk.h
> > > +++ b/fs/xfs/xfs_iwalk.h
> > > @@ -18,4 +18,16 @@ int xfs_iwalk_threaded(struct xfs_mount *mp, xfs_ino_t startino,
> > >  		xfs_iwalk_fn iwalk_fn, unsigned int max_prefetch, bool poll,
> > >  		void *data);
> > >  
> > > +/* Walk all inode btree records in the filesystem starting from @startino. */
> > > +typedef int (*xfs_inobt_walk_fn)(struct xfs_mount *mp, struct xfs_trans *tp,
> > > +				 xfs_agnumber_t agno,
> > > +				 const struct xfs_inobt_rec_incore *irec,
> > > +				 void *data);
> > > +/* Return value (for xfs_inobt_walk_fn) that aborts the walk immediately. */
> > > +#define XFS_INOBT_WALK_ABORT	(XFS_IWALK_ABORT)
> > > +
> > > +int xfs_inobt_walk(struct xfs_mount *mp, struct xfs_trans *tp,
> > > +		xfs_ino_t startino, xfs_inobt_walk_fn inobt_walk_fn,
> > > +		unsigned int max_prefetch, void *data);
> > > +
> > >  #endif /* __XFS_IWALK_H__ */
> > > 

      reply	other threads:[~2019-06-11 17:01 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-04 21:49 [PATCH v2 00/10] xfs: refactor and improve inode iteration Darrick J. Wong
2019-06-04 21:49 ` [PATCH 01/10] xfs: create simplified inode walk function Darrick J. Wong
2019-06-10 13:58   ` Brian Foster
2019-06-10 16:59     ` Darrick J. Wong
2019-06-10 17:55       ` Brian Foster
2019-06-10 23:11         ` Darrick J. Wong
2019-06-11 22:33           ` Dave Chinner
2019-06-11 23:05             ` Darrick J. Wong
2019-06-12 12:13               ` Brian Foster
2019-06-12 16:53                 ` Darrick J. Wong
2019-06-12 17:54               ` Darrick J. Wong
2019-06-04 21:49 ` [PATCH 02/10] xfs: convert quotacheck to use the new iwalk functions Darrick J. Wong
2019-06-10 13:58   ` Brian Foster
2019-06-10 17:10     ` Darrick J. Wong
2019-06-11 23:23     ` Dave Chinner
2019-06-12  0:32       ` Darrick J. Wong
2019-06-12 12:55         ` Brian Foster
2019-06-12 23:33           ` Dave Chinner
2019-06-13 18:34             ` Brian Foster
2019-06-04 21:49 ` [PATCH 03/10] xfs: bulkstat should copy lastip whenever userspace supplies one Darrick J. Wong
2019-06-10 13:59   ` Brian Foster
2019-06-04 21:49 ` [PATCH 04/10] xfs: convert bulkstat to new iwalk infrastructure Darrick J. Wong
2019-06-10 14:02   ` Brian Foster
2019-06-10 17:38     ` Darrick J. Wong
2019-06-10 18:29       ` Brian Foster
2019-06-10 23:42         ` Darrick J. Wong
2019-06-04 21:49 ` [PATCH 05/10] xfs: move bulkstat ichunk helpers to iwalk code Darrick J. Wong
2019-06-10 14:02   ` Brian Foster
2019-06-04 21:50 ` [PATCH 06/10] xfs: change xfs_iwalk_grab_ichunk to use startino, not lastino Darrick J. Wong
2019-06-10 19:32   ` Brian Foster
2019-06-04 21:50 ` [PATCH 07/10] xfs: clean up long conditionals in xfs_iwalk_ichunk_ra Darrick J. Wong
2019-06-10 19:32   ` Brian Foster
2019-06-04 21:50 ` [PATCH 08/10] xfs: multithreaded iwalk implementation Darrick J. Wong
2019-06-10 19:40   ` Brian Foster
2019-06-11  1:10     ` Darrick J. Wong
2019-06-11 13:13       ` Brian Foster
2019-06-11 15:29         ` Darrick J. Wong
2019-06-11 17:00           ` Brian Foster
2019-06-04 21:50 ` [PATCH 09/10] xfs: poll waiting for quotacheck Darrick J. Wong
2019-06-11 15:07   ` Brian Foster
2019-06-11 16:06     ` Darrick J. Wong
2019-06-11 17:01       ` Brian Foster
2019-06-04 21:50 ` [PATCH 10/10] xfs: refactor INUMBERS to use iwalk functions Darrick J. Wong
2019-06-11 15:08   ` Brian Foster
2019-06-11 16:21     ` Darrick J. Wong
2019-06-11 17:01       ` Brian Foster [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190611170137.GC12395@bfoster \
    --to=bfoster@redhat.com \
    --cc=darrick.wong@oracle.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).