All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 1/4] mkfs: stop zeroing old superblocks excessively
Date: Fri, 7 Sep 2018 10:04:32 +1000	[thread overview]
Message-ID: <20180907000432.GH27618@dastard> (raw)
In-Reply-To: <20180906133108.GA3311@bfoster>

On Thu, Sep 06, 2018 at 09:31:08AM -0400, Brian Foster wrote:
> On Wed, Sep 05, 2018 at 06:19:29PM +1000, Dave Chinner wrote:
....
> > @@ -1220,15 +1221,68 @@ zero_old_xfs_structures(
> >  
> >  	/*
> >  	 * block size and basic geometry seems alright, zero the secondaries.
> > +	 *
> > +	 * Don't be insane when it comes to overwriting really large filesystems
> > +	 * as it could take millions of IOs to zero every secondary
> > +	 * superblock. If we are remaking a huge filesystem, then do the
> > +	 * zeroing, but if we are replacing it with a small one (typically done
> > +	 * in test environments, limit the zeroing to:
> > +	 *
> > +	 *	- around the range of the new filesystem
> > +	 *	- the middle of the old filesystem
> > +	 *	- the end of the old filesystem
> > +	 *
> > +	 * Killing the middle and end of the old filesystem will prevent repair
> > +	 * from finding it with it's fast secondary sb scan algorithm. The slow
> > +	 * scan algorithm will then confirm the small filesystem geometry by
> > +	 * brute force scans.
> >  	 */
> >  	memset(buf, 0, new_sb->sb_sectsize);
> > +
> > +	/* this carefully avoids integer overflows */
> > +	end = sb.sb_dblocks;
> > +	if (sb.sb_agcount > 10000 &&
> > +	    new_sb->sb_dblocks < end / 10)
> > +		end = new_sb->sb_dblocks * 10;
> 
> ... but what's with the 10k agcount cutoff? Just a number out of a hat
> to demonstrate the improvement..?

yeah, I pulled it from a hat, but mainly so it only triggers the new
"partial zeroing" code on really large devices that had a large
filesystem and now we're making a much smaller filesystem on it.

There's no real reason for it except for the fact I haven't done the
verification needed to make this the default behaviour (hence the
RFCRAP status :).

> Given that you've done the work to rough in an AIO buffered write
> mechanism for mkfs, have you considered whether we can find a way to
> apply that mechanism here?

It would have be another AIO context because this loop doesn't
use xfs_bufs.

> I'm guessing that the result of using AIO
> wouldn't be quite as impressive of not doing I/O at all, but as you've
> already noted, this algorithmic optimization is more targeted at test
> environments than production ones. The AIO approach sounds like it could
> be more broadly beneficial, even if not quite as fast in those
> particular test cases.

I just don't think it's worth the hassle as, like you said, it's
easier just to avoid the IO altogether.

> 
> >  	off = 0;
> > -	for (i = 1; i < sb.sb_agcount; i++)  {
> > +	for (i = 1; i < sb.sb_agcount && off < end; i++)  {
> > +		off += sb.sb_agblocks;
> > +		if (pwrite(xi->dfd, buf, new_sb->sb_sectsize,
> > +					off << sb.sb_blocklog) == -1)
> > +			break;
> > +	}
> > +
> > +	if (end == sb.sb_dblocks)
> > +		return;
> > +
> > +	/*
> > +	 * Trash the middle 1000 AGs of the old fs, which we know has at least
> > +	 * 10000 AGs at this point. Cast to make sure we are doing 64bit
> > +	 * multiplies, otherwise off gets truncated to 32 bit. I hate C.
> > +	 */
> > +	i = (sb.sb_agcount / 2) - 500;
> > +	off = (xfs_off_t)sb.sb_agblocks * i;
> > +	off = (xfs_off_t)sb.sb_agblocks * ((sb.sb_agcount / 2) - 500);
> 
> Looks like a couple lines of dead code there.

Yup, didn't clean it up properly, did I?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2018-09-07  4:43 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-05  8:19 [RFCRAP PATCH 0/4 v2] mkfs.xfs IO scalability Dave Chinner
2018-09-05  8:19 ` [PATCH 1/4] mkfs: stop zeroing old superblocks excessively Dave Chinner
2018-09-06 13:31   ` Brian Foster
2018-09-07  0:04     ` Dave Chinner [this message]
2018-09-07 11:05       ` Brian Foster
2018-09-05  8:19 ` [PATCH 2/4] mkfs: rework AG header initialisation ordering Dave Chinner
2018-09-06 13:31   ` Brian Foster
2018-09-07  0:08     ` Dave Chinner
2018-09-05  8:19 ` [PATCH 3/4] mkfs: introduce new delayed write buffer list Dave Chinner
2018-09-06 13:32   ` Brian Foster
2018-09-07  0:21     ` Dave Chinner
2018-09-05  8:19 ` [PATCH 4/4] mkfs: Use AIO for batched writeback Dave Chinner
2018-09-06 13:32   ` Brian Foster
2018-09-07  0:30     ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180907000432.GH27618@dastard \
    --to=david@fromorbit.com \
    --cc=bfoster@redhat.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.