All of lore.kernel.org
 help / color / mirror / Atom feed
From: Brian Foster <bfoster@redhat.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 10/10] xfs: refactor INUMBERS to use iwalk functions
Date: Tue, 11 Jun 2019 13:01:41 -0400	[thread overview]
Message-ID: <20190611170137.GC12395@bfoster> (raw)
In-Reply-To: <20190611162127.GT1871505@magnolia>

On Tue, Jun 11, 2019 at 09:21:28AM -0700, Darrick J. Wong wrote:
> On Tue, Jun 11, 2019 at 11:08:51AM -0400, Brian Foster wrote:
> > On Tue, Jun 04, 2019 at 02:50:33PM -0700, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > Now that we have generic functions to walk inode records, refactor the
> > > INUMBERS implementation to use it.
> > > 
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > ---
> > >  fs/xfs/xfs_ioctl.c   |   20 ++++--
> > >  fs/xfs/xfs_ioctl.h   |    2 +
> > >  fs/xfs/xfs_ioctl32.c |   35 ++++------
> > >  fs/xfs/xfs_itable.c  |  168 ++++++++++++++++++++------------------------------
> > >  fs/xfs/xfs_itable.h  |   22 +------
> > >  fs/xfs/xfs_iwalk.c   |  161 ++++++++++++++++++++++++++++++++++++++++++++++--
> > >  fs/xfs/xfs_iwalk.h   |   12 ++++
> > 
> > It looks like there's a decent amount of xfs_iwalk code changes in this
> > patch. That should probably be a separate patch, at minimum to have a
> > separate changelog to document the changes/updates required for
> > inumbers.
> 
> <nod> I'll break out the iwalk changes into a separate patch so that
> this one only has the changes needed to wire up the ioctl.
> 
> > >  7 files changed, 262 insertions(+), 158 deletions(-)
> > > 
> > > 
> > ...
> > > diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c
> > > index 06abe5c9c0ee..bade54d6ac64 100644
> > > --- a/fs/xfs/xfs_itable.c
> > > +++ b/fs/xfs/xfs_itable.c
> > > @@ -259,121 +259,85 @@ xfs_bulkstat(
> > >  	return error;
> > >  }
> > >  
> > > -int
> > > -xfs_inumbers_fmt(
> > > -	void			__user *ubuffer, /* buffer to write to */
> > > -	const struct xfs_inogrp	*buffer,	/* buffer to read from */
> > > -	long			count,		/* # of elements to read */
> > > -	long			*written)	/* # of bytes written */
> > > +struct xfs_inumbers_chunk {
> > > +	inumbers_fmt_pf		formatter;
> > > +	struct xfs_ibulk	*breq;
> > > +};
> > > +
> > > +/*
> > > + * INUMBERS
> > > + * ========
> > > + * This is how we export inode btree records to userspace, so that XFS tools
> > > + * can figure out where inodes are allocated.
> > > + */
> > > +
> > > +/*
> > > + * Format the inode group structure and report it somewhere.
> > > + *
> > > + * Similar to xfs_bulkstat_one_int, lastino is the inode cursor as we walk
> > > + * through the filesystem so we move it forward unless there was a runtime
> > > + * error.  If the formatter tells us the buffer is now full we also move the
> > > + * cursor forward and abort the walk.
> > > + */
> > > +STATIC int
> > > +xfs_inumbers_walk(
> > > +	struct xfs_mount	*mp,
> > > +	struct xfs_trans	*tp,
> > > +	xfs_agnumber_t		agno,
> > > +	const struct xfs_inobt_rec_incore *irec,
> > > +	void			*data)
> > >  {
> > > -	if (copy_to_user(ubuffer, buffer, count * sizeof(*buffer)))
> > > -		return -EFAULT;
> > > -	*written = count * sizeof(*buffer);
> > > -	return 0;
> > > +	struct xfs_inogrp	inogrp = {
> > > +		.xi_startino	= XFS_AGINO_TO_INO(mp, agno, irec->ir_startino),
> > > +		.xi_alloccount	= irec->ir_count - irec->ir_freecount,
> > > +		.xi_allocmask	= ~irec->ir_free,
> > > +	};
> > 
> > Not related to this patch, but I'm wondering if we should be using
> > xfs_inobt_irec_to_allocmask() here to mask off holes from the resulting
> > allocation bitmap. Eh, I guess it's misleading either way..
> 
> Holes were supposed to be marked in ir_free also, right?
> 

Hm, yeah I think so. So right now holes would be reported as free via
inumbers, which is probably what we want. Must have been thinking about
this backwards... disregard!

Brian

> So (assuming the irec isn't corrupt) we should be protected against
> reporting a hole as an "allocated" inode, right?
> 
> > > +	struct xfs_inumbers_chunk *ic = data;
> > > +	xfs_agino_t		agino;
> > > +	int			error;
> > > +
> > > +	error = ic->formatter(ic->breq, &inogrp);
> > > +	if (error && error != XFS_IBULK_BUFFER_FULL)
> > > +		return error;
> > > +	if (error == XFS_IBULK_BUFFER_FULL)
> > > +		error = XFS_INOBT_WALK_ABORT;
> > > +
> > > +	agino = irec->ir_startino + XFS_INODES_PER_CHUNK;
> > > +	ic->breq->startino = XFS_AGINO_TO_INO(mp, agno, agino);
> > > +	return error;
> > >  }
> > >  
> > ...
> > > diff --git a/fs/xfs/xfs_iwalk.c b/fs/xfs/xfs_iwalk.c
> > > index c4a9c4c246b7..3a35d1cf7e14 100644
> > > --- a/fs/xfs/xfs_iwalk.c
> > > +++ b/fs/xfs/xfs_iwalk.c
> > ...
> > > @@ -286,7 +298,7 @@ xfs_iwalk_ag_start(
> > >  	 * have to deal with tearing down the cursor to walk the records.
> > >  	 */
> > >  	error = xfs_iwalk_grab_ichunk(*curpp, agino, &icount,
> > > -			&iwag->recs[iwag->nr_recs]);
> > > +			&iwag->recs[iwag->nr_recs], trim);
> > 
> > Hmm, it looks like we could lift the lookup from xfs_iwalk_grab_ichunk()
> > up into xfs_iwalk_ag_start() and avoid needing to pass trim down
> > multiple levels. In fact, if we're not trimming the record we don't need
> > to grab the record at all in this path. We could do the lookup (setting
> > has_more) then bounce right up to the core walker algorithm, right? If
> > so, that seems a bit cleaner in terms of only using special cased code
> > when it's actually necessary.
> 
> Right.
> 
> > 
> > >  	if (error)
> > >  		return error;
> > >  	if (icount)
> > ...
> > > @@ -561,3 +574,135 @@ xfs_iwalk_threaded(
> > ...
> > > +/*
> > > + * Walk all inode btree records in a single AG, from @iwag->startino to the end
> > > + * of the AG.
> > > + */
> > > +STATIC int
> > > +xfs_inobt_walk_ag(
> > > +	struct xfs_iwalk_ag		*iwag)
> > > +{
> > > +	struct xfs_mount		*mp = iwag->mp;
> > > +	struct xfs_trans		*tp = iwag->tp;
> > > +	struct xfs_buf			*agi_bp = NULL;
> > > +	struct xfs_btree_cur		*cur = NULL;
> > > +	xfs_agnumber_t			agno;
> > > +	xfs_agino_t			agino;
> > > +	int				has_more;
> > > +	int				error = 0;
> > > +
> > > +	/* Set up our cursor at the right place in the inode btree. */
> > > +	agno = XFS_INO_TO_AGNO(mp, iwag->startino);
> > > +	agino = XFS_INO_TO_AGINO(mp, iwag->startino);
> > > +	error = xfs_iwalk_ag_start(iwag, agno, agino, &cur, &agi_bp, &has_more,
> > > +			false);
> > > +
> > > +	while (!error && has_more && !xfs_pwork_want_abort(&iwag->pwork)) {
> > > +		struct xfs_inobt_rec_incore	*irec;
> > > +
> > > +		cond_resched();
> > > +
> > > +		/* Fetch the inobt record. */
> > > +		irec = &iwag->recs[iwag->nr_recs];
> > > +		error = xfs_inobt_get_rec(cur, irec, &has_more);
> > > +		if (error || !has_more)
> > > +			break;
> > > +
> > > +		/*
> > > +		 * If there's space in the buffer for more records, increment
> > > +		 * the btree cursor and grab more.
> > > +		 */
> > > +		if (++iwag->nr_recs < iwag->sz_recs) {
> > > +			error = xfs_btree_increment(cur, 0, &has_more);
> > > +			if (error || !has_more)
> > > +				break;
> > > +			continue;
> > > +		}
> > > +
> > > +		/*
> > > +		 * Otherwise, we need to save cursor state and run the callback
> > > +		 * function on the cached records.  The run_callbacks function
> > > +		 * is supposed to return a cursor pointing to the record where
> > > +		 * we would be if we had been able to increment like above.
> > > +		 */
> > > +		error = xfs_iwalk_run_callbacks(iwag, xfs_inobt_walk_ag_recs,
> > > +				agno, &cur, &agi_bp, &has_more);
> > > +	}
> > > +
> > > +	xfs_iwalk_del_inobt(tp, &cur, &agi_bp, error);
> > > +
> > > +	/* Walk any records left behind in the cache. */
> > > +	if (iwag->nr_recs == 0 || error || xfs_pwork_want_abort(&iwag->pwork))
> > > +		return error;
> > > +
> > > +	return xfs_inobt_walk_ag_recs(iwag);
> > > +}
> > 
> > Similar comments apply here as for the previous xfs_iwalk_ag() patch.
> > Though looking at it, the only differences here are the lack of free
> > inode check, readahead and the callback function (particularly once you
> > consider the partial completion refactoring we discussed earlier). I
> > think this could all be generalized with a single flag such that there's
> > no need for separate xfs_[inobt|iwalk]_ag() functions.
> 
> Yep.  Already refactoring that. :)
> 
> > Hmmm.. perhaps we could use a flag or separate function pointers in
> > struct xfs_iwalk_ag to accomplish the same thing all the way up through
> > the record walker functions. IOW, xfs_iwalk_ag_recs() looks like it
> > could call either callback based on which is defined in the
> > xfs_iwalk_ag.
> 
> <nod>
> 
> > This could all be done as a separate patch of course, if that's easier.
> 
> I might just go back and remove the function pointer from run_callbacks
> in the earlier patches...
> 
> > 
> > > +
> > > +/*
> > > + * Walk all inode btree records in the filesystem starting from @startino.  The
> > > + * @inobt_walk_fn will be called for each btree record, being passed the incore
> > > + * record and @data.  @max_prefetch controls how many inobt records we try to
> > > + * cache ahead of time.
> > > + */
> > > +int
> > > +xfs_inobt_walk(
> > > +	struct xfs_mount	*mp,
> > > +	struct xfs_trans	*tp,
> > > +	xfs_ino_t		startino,
> > > +	xfs_inobt_walk_fn	inobt_walk_fn,
> > > +	unsigned int		max_prefetch,
> > > +	void			*data)
> > > +{
> > > +	struct xfs_iwalk_ag	iwag = {
> > > +		.mp		= mp,
> > > +		.tp		= tp,
> > > +		.inobt_walk_fn	= inobt_walk_fn,
> > > +		.data		= data,
> > > +		.startino	= startino,
> > > +		.pwork		= XFS_PWORK_SINGLE_THREADED,
> > > +	};
> > > +	xfs_agnumber_t		agno = XFS_INO_TO_AGNO(mp, startino);
> > > +	int			error;
> > > +
> > > +	ASSERT(agno < mp->m_sb.sb_agcount);
> > > +
> > > +	xfs_iwalk_set_prefetch(&iwag, max_prefetch * XFS_INODES_PER_CHUNK);
> > 
> > A brief comment above this line would be helpful. Something like:
> > 
> > 	/* translate inumbers record count to inode count */
> 
> Done.  Thanks for the review!
> 
> --D
> 
> > Brian
> > 
> > > +	error = xfs_iwalk_alloc(&iwag);
> > > +	if (error)
> > > +		return error;
> > > +
> > > +	for (; agno < mp->m_sb.sb_agcount; agno++) {
> > > +		error = xfs_inobt_walk_ag(&iwag);
> > > +		if (error)
> > > +			break;
> > > +		iwag.startino = XFS_AGINO_TO_INO(mp, agno + 1, 0);
> > > +	}
> > > +
> > > +	xfs_iwalk_free(&iwag);
> > > +	return error;
> > > +}
> > > diff --git a/fs/xfs/xfs_iwalk.h b/fs/xfs/xfs_iwalk.h
> > > index 76d8f87a39ef..20bee93d4676 100644
> > > --- a/fs/xfs/xfs_iwalk.h
> > > +++ b/fs/xfs/xfs_iwalk.h
> > > @@ -18,4 +18,16 @@ int xfs_iwalk_threaded(struct xfs_mount *mp, xfs_ino_t startino,
> > >  		xfs_iwalk_fn iwalk_fn, unsigned int max_prefetch, bool poll,
> > >  		void *data);
> > >  
> > > +/* Walk all inode btree records in the filesystem starting from @startino. */
> > > +typedef int (*xfs_inobt_walk_fn)(struct xfs_mount *mp, struct xfs_trans *tp,
> > > +				 xfs_agnumber_t agno,
> > > +				 const struct xfs_inobt_rec_incore *irec,
> > > +				 void *data);
> > > +/* Return value (for xfs_inobt_walk_fn) that aborts the walk immediately. */
> > > +#define XFS_INOBT_WALK_ABORT	(XFS_IWALK_ABORT)
> > > +
> > > +int xfs_inobt_walk(struct xfs_mount *mp, struct xfs_trans *tp,
> > > +		xfs_ino_t startino, xfs_inobt_walk_fn inobt_walk_fn,
> > > +		unsigned int max_prefetch, void *data);
> > > +
> > >  #endif /* __XFS_IWALK_H__ */
> > > 

      reply	other threads:[~2019-06-11 17:01 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-04 21:49 [PATCH v2 00/10] xfs: refactor and improve inode iteration Darrick J. Wong
2019-06-04 21:49 ` [PATCH 01/10] xfs: create simplified inode walk function Darrick J. Wong
2019-06-10 13:58   ` Brian Foster
2019-06-10 16:59     ` Darrick J. Wong
2019-06-10 17:55       ` Brian Foster
2019-06-10 23:11         ` Darrick J. Wong
2019-06-11 22:33           ` Dave Chinner
2019-06-11 23:05             ` Darrick J. Wong
2019-06-12 12:13               ` Brian Foster
2019-06-12 16:53                 ` Darrick J. Wong
2019-06-12 17:54               ` Darrick J. Wong
2019-06-04 21:49 ` [PATCH 02/10] xfs: convert quotacheck to use the new iwalk functions Darrick J. Wong
2019-06-10 13:58   ` Brian Foster
2019-06-10 17:10     ` Darrick J. Wong
2019-06-11 23:23     ` Dave Chinner
2019-06-12  0:32       ` Darrick J. Wong
2019-06-12 12:55         ` Brian Foster
2019-06-12 23:33           ` Dave Chinner
2019-06-13 18:34             ` Brian Foster
2019-06-04 21:49 ` [PATCH 03/10] xfs: bulkstat should copy lastip whenever userspace supplies one Darrick J. Wong
2019-06-10 13:59   ` Brian Foster
2019-06-04 21:49 ` [PATCH 04/10] xfs: convert bulkstat to new iwalk infrastructure Darrick J. Wong
2019-06-10 14:02   ` Brian Foster
2019-06-10 17:38     ` Darrick J. Wong
2019-06-10 18:29       ` Brian Foster
2019-06-10 23:42         ` Darrick J. Wong
2019-06-04 21:49 ` [PATCH 05/10] xfs: move bulkstat ichunk helpers to iwalk code Darrick J. Wong
2019-06-10 14:02   ` Brian Foster
2019-06-04 21:50 ` [PATCH 06/10] xfs: change xfs_iwalk_grab_ichunk to use startino, not lastino Darrick J. Wong
2019-06-10 19:32   ` Brian Foster
2019-06-04 21:50 ` [PATCH 07/10] xfs: clean up long conditionals in xfs_iwalk_ichunk_ra Darrick J. Wong
2019-06-10 19:32   ` Brian Foster
2019-06-04 21:50 ` [PATCH 08/10] xfs: multithreaded iwalk implementation Darrick J. Wong
2019-06-10 19:40   ` Brian Foster
2019-06-11  1:10     ` Darrick J. Wong
2019-06-11 13:13       ` Brian Foster
2019-06-11 15:29         ` Darrick J. Wong
2019-06-11 17:00           ` Brian Foster
2019-06-04 21:50 ` [PATCH 09/10] xfs: poll waiting for quotacheck Darrick J. Wong
2019-06-11 15:07   ` Brian Foster
2019-06-11 16:06     ` Darrick J. Wong
2019-06-11 17:01       ` Brian Foster
2019-06-04 21:50 ` [PATCH 10/10] xfs: refactor INUMBERS to use iwalk functions Darrick J. Wong
2019-06-11 15:08   ` Brian Foster
2019-06-11 16:21     ` Darrick J. Wong
2019-06-11 17:01       ` Brian Foster [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190611170137.GC12395@bfoster \
    --to=bfoster@redhat.com \
    --cc=darrick.wong@oracle.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.