linux-xfs.vger.kernel.org archive mirror
From: "Darrick J. Wong" <djwong@kernel.org>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 4/9] xfs: cover the log during log quiesce
Date: Tue, 19 Jan 2021 09:51:54 -0800
Message-ID: <20210119175154.GO3134581@magnolia>
In-Reply-To: <20210107195336.GB845369@bfoster>

On Thu, Jan 07, 2021 at 02:53:36PM -0500, Brian Foster wrote:
> On Thu, Jan 07, 2021 at 11:04:08AM -0800, Darrick J. Wong wrote:
> > On Wed, Jan 06, 2021 at 12:41:22PM -0500, Brian Foster wrote:
> > > The log quiesce mechanism historically terminates by marking the log
> > > clean with an unmount record. The primary objective is to indicate
> > > that log recovery is no longer required after the quiesce has
> > > flushed all in-core changes and written back filesystem metadata.
> > > While this is perfectly fine, it is somewhat hacky as currently used
> > > in certain contexts. For example, filesystem freeze quiesces (i.e.
> > > cleans) the log and immediately redirties it with a dummy superblock
> > > transaction to ensure that log recovery runs in the event of a
> > > crash.
> > > 
> > > While this functions correctly, cleaning the log from freeze context
> > > is clearly superfluous given the current redirtying behavior.
> > > Instead, the desired behavior can be achieved by simply covering the
> > > log. This effectively retires all on-disk log items from the active
> > > range of the log by issuing two synchronous and sequential dummy
> > > superblock update transactions that serve to update the on-disk log
> > > head and tail. The subtle difference is that the log technically
> > > remains dirty due to the lack of an unmount record, though recovery
> > > is effectively a no-op due to the content of the checkpoints being
> > > clean (i.e. the unmodified on-disk superblock).
> > > 
> > > Log covering currently runs in the background and only triggers once
> > > the filesystem and log have idled. The purpose of the background
> > > mechanism is to prevent log recovery from replaying the most
> > > recently logged items long after those items may have been written
> > > back. In the quiesce path, the log has been deliberately idled by
> > > forcing the log and pushing the AIL until empty in a context where
> > > no further mutable filesystem operations are allowed. Therefore, we
> > > can cover the log as the final step in the log quiesce codepath to
> > > reflect that all previously active items have been successfully
> > > written back.
> > > 
> > > This facilitates selective log covering from certain contexts (i.e.
> > > freeze) that only seek to quiesce, but not necessarily clean the
> > > log. Note that as a side effect of this change, log covering now
> > > occurs when cleaning the log as well. This is harmless, facilitates
> > > subsequent cleanups, and is mostly temporary as various operations
> > > switch to use explicit log covering.
> > > 
> > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > ---
> > >  fs/xfs/xfs_log.c | 49 +++++++++++++++++++++++++++++++++++++++++++++---
> > >  fs/xfs/xfs_log.h |  2 +-
> > >  2 files changed, 47 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > > index 1b3227a033ad..f7b23044723d 100644
> > > --- a/fs/xfs/xfs_log.c
> > > +++ b/fs/xfs/xfs_log.c
> > > @@ -91,6 +91,9 @@ STATIC int
> > >  xlog_iclogs_empty(
> > >  	struct xlog		*log);
> > >  
> > > +static int
> > > +xfs_log_cover(struct xfs_mount *);
> > > +
> > >  static void
> > >  xlog_grant_sub_space(
> > >  	struct xlog		*log,
> > > @@ -936,10 +939,9 @@ xfs_log_unmount_write(
> > >   * To do this, we first need to shut down the background log work so it is not
> > >   * trying to cover the log as we clean up. We then need to unpin all objects in
> > >   * the log so we can then flush them out. Once they have completed their IO and
> > > - * run the callbacks removing themselves from the AIL, we can write the unmount
> > > - * record.
> > > + * run the callbacks removing themselves from the AIL, we can cover the log.
> > >   */
> > > -void
> > > +int
> > >  xfs_log_quiesce(
> > >  	struct xfs_mount	*mp)
> > >  {
> > > @@ -957,6 +959,8 @@ xfs_log_quiesce(
> > >  	xfs_wait_buftarg(mp->m_ddev_targp);
> > >  	xfs_buf_lock(mp->m_sb_bp);
> > >  	xfs_buf_unlock(mp->m_sb_bp);
> > > +
> > > +	return xfs_log_cover(mp);
> > >  }
> > >  
> > >  void
> > > @@ -1092,6 +1096,45 @@ xfs_log_need_covered(
> > >  	return needed;
> > >  }
> > >  
> > > +/*
> > > + * Explicitly cover the log. This is similar to background log covering but
> > > + * intended for usage in quiesce codepaths. The caller is responsible to ensure
> > > + * the log is idle and suitable for covering. The CIL, iclog buffers and AIL
> > > + * must all be empty.
> > > + */
> > > +static int
> > > +xfs_log_cover(
> > > +	struct xfs_mount	*mp)
> > > +{
> > > +	struct xlog		*log = mp->m_log;
> > > +	int			error = 0;
> > > +
> > > +	ASSERT((xlog_cil_empty(log) && xlog_iclogs_empty(log) &&
> > > +	        !xfs_ail_min_lsn(log->l_ailp)) ||
> > > +	       XFS_FORCED_SHUTDOWN(mp));
> > > +
> > > +	if (!xfs_log_writable(mp))
> > > +		return 0;
> > > +
> > > +	/*
> > > +	 * To cover the log, commit the superblock twice (at most) in
> > > +	 * independent checkpoints. The first serves as a reference for the
> > > +	 * tail pointer. The sync transaction and AIL push empties the AIL and
> > > +	 * updates the in-core tail to the LSN of the first checkpoint. The
> > > +	 * second commit updates the on-disk tail with the in-core LSN,
> > > +	 * covering the log. Push the AIL one more time to leave it empty, as
> > > +	 * we found it.
> > > +	 */
> > 
> > Hm.  At first I looked at _need_covered and wondered how this could work
> > properly if we are in state DONE or DONE2, because this not-quite
> > predicate returns zero in that case.
> > 
> > I think it's the case that the only way the log can end up in DONE state
> > is if the background log worker had previously been in NEED, written the
> > first of the dummy transactions, moved the state to DONE, and waited for
> > xlog_covered_state to move the log from DONE to NEED2.  Similarly, the
> > log can only be in DONE2 state if the background worker wrote the second
> > dummy and is now waiting for xlog_covered_state to move the log from
> > DONE2 to IDLE.
> > 
> > Since xfs_log_quiesce cancelled the log worker and waited for it to
> > finish before calling xfs_log_cover, the covering state here can only be
> > IDLE, NEED, or NEED2, right?  And hence the while loop pushes the log to
> > IDLE no matter where it is now, right?
> > 
> 
> Yeah, we're in a quiescent context at this point where no other
> transactions are running, the in-core structures should be drained and
> the background log worker cancelled, etc. With regard to the background
> log worker, I don't think it should ever actually see the DONE or DONE2
> states as it sets those states and immediately issues the synchronous sb
> transaction. Therefore, the commit should have changed the state from
> DONE to NEED2 or NEED (if other items happened to land in the log)
> before it returns.
> 
> That said, I suppose it wouldn't be that surprising if some odd timing
> scenario or combination of external superblock commits could cause the
> background log worker to see a DONE state. I haven't fully audited for
> that, but regardless it would appropriately do nothing and that
> shouldn't be an issue from the quiesce context due to the runtime being
> pretty much shut down by this point.

<nod> I think that makes sense.  :)

--D

> > (I also wondered why this isn't a do-while loop but patch 6 addresses
> > that.)
> > 
> 
> Right, that changes due to the lazy sb counter logic.
> 
> Brian
> 
> > --D
> > 
> > > +	while (xfs_log_need_covered(mp)) {
> > > +		error = xfs_sync_sb(mp, true);
> > > +		if (error)
> > > +			break;
> > > +		xfs_ail_push_all_sync(mp->m_ail);
> > > +	}
> > > +
> > > +	return error;
> > > +}
> > > +
> > >  /*
> > >   * We may be holding the log iclog lock upon entering this routine.
> > >   */
> > > diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h
> > > index b0400589f824..044e02cb8921 100644
> > > --- a/fs/xfs/xfs_log.h
> > > +++ b/fs/xfs/xfs_log.h
> > > @@ -138,7 +138,7 @@ void	xlog_cil_process_committed(struct list_head *list);
> > >  bool	xfs_log_item_in_current_chkpt(struct xfs_log_item *lip);
> > >  
> > >  void	xfs_log_work_queue(struct xfs_mount *mp);
> > > -void	xfs_log_quiesce(struct xfs_mount *mp);
> > > +int	xfs_log_quiesce(struct xfs_mount *mp);
> > >  void	xfs_log_clean(struct xfs_mount *mp);
> > >  bool	xfs_log_check_lsn(struct xfs_mount *, xfs_lsn_t);
> > >  bool	xfs_log_in_recovery(struct xfs_mount *);
> > > -- 
> > > 2.26.2
> > > 
> > 
> 
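
The covering-state reasoning quoted above (the log worker is cancelled,
so the state can only be IDLE, NEED, or NEED2, and the while loop drives
it to IDLE) can be sketched as a small standalone C model. This is a toy,
not the kernel code: the enum and function names are hypothetical, the
transitions are only the ones described in this discussion, and it
assumes nothing else lands in the log between the dummy commits.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the covering states named in the thread (hypothetical names). */
enum cover_state {
	COVER_IDLE,
	COVER_NEED,
	COVER_DONE,
	COVER_NEED2,
	COVER_DONE2,
};

/*
 * Models the "not-quite predicate": it reports that covering is needed
 * only in NEED or NEED2, and as a side effect moves the state to DONE
 * or DONE2. It returns false for IDLE, DONE and DONE2.
 */
static bool need_covered(enum cover_state *s)
{
	switch (*s) {
	case COVER_NEED:
		*s = COVER_DONE;
		return true;
	case COVER_NEED2:
		*s = COVER_DONE2;
		return true;
	default:
		return false;
	}
}

/*
 * Models the effect of one synchronous dummy superblock commit when no
 * other items are in the log: DONE -> NEED2, DONE2 -> IDLE.
 */
static void commit_dummy_sb(enum cover_state *s)
{
	if (*s == COVER_DONE)
		*s = COVER_NEED2;
	else if (*s == COVER_DONE2)
		*s = COVER_IDLE;
}

/*
 * Same shape as the while loop in the patch: keep committing while the
 * predicate says covering is needed. Returns the number of commits.
 */
static int cover(enum cover_state *s)
{
	int commits = 0;

	while (need_covered(s)) {
		commit_dummy_sb(s);
		commits++;
	}
	return commits;
}
```

In this model a start state of NEED takes two commits, NEED2 takes one,
and IDLE takes none, and all three end in IDLE. A start state of DONE or
DONE2 makes the loop do nothing and leaves the state unchanged, which
matches the "it would appropriately do nothing" remark above.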
