From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 2/5] xfs: external logs need to flush data device
Date: Thu, 22 Jul 2021 16:10:19 -0700
Message-ID: <20210722231019.GO559212@magnolia>
In-Reply-To: <20210722214539.GP664593@dread.disaster.area>

On Fri, Jul 23, 2021 at 07:45:39AM +1000, Dave Chinner wrote:
> On Thu, Jul 22, 2021 at 11:14:45AM -0700, Darrick J. Wong wrote:
> > On Thu, Jul 22, 2021 at 11:53:32AM +1000, Dave Chinner wrote:
> > > From: Dave Chinner <dchinner@redhat.com>
> > > 
> > > The recent journal flush/FUA changes replaced the flushing of the
> > > data device on every iclog write with an up-front async data device
> > > cache flush. Unfortunately, the assumption on which this was based
> > > has been proven incorrect by the flush vs log tail update
> > > ordering issue. As the fix for that issue uses the
> > > XLOG_ICL_NEED_FLUSH flag to indicate that the data device needs a
> > > cache flush, we now need to (once again) ensure that an iclog write
> > > to an external log that needs a cache flush actually issues a
> > > cache flush to the data device as well as the log device.
> > > 
> > > Fixes: eef983ffeae7 ("xfs: journal IO cache flush reductions")
> > > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > > ---
> > >  fs/xfs/xfs_log.c | 19 +++++++++++--------
> > >  1 file changed, 11 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > > index 96434cc4df6e..a3c4d48195d9 100644
> > > --- a/fs/xfs/xfs_log.c
> > > +++ b/fs/xfs/xfs_log.c
> > > @@ -827,13 +827,6 @@ xlog_write_unmount_record(
> > >  	/* account for space used by record data */
> > >  	ticket->t_curr_res -= sizeof(ulf);
> > >  
> > > -	/*
> > > -	 * For external log devices, we need to flush the data device cache
> > > -	 * first to ensure all metadata writeback is on stable storage before we
> > > -	 * stamp the tail LSN into the unmount record.
> > > -	 */
> > > -	if (log->l_targ != log->l_mp->m_ddev_targp)
> > > -		blkdev_issue_flush(log->l_mp->m_ddev_targp->bt_bdev);
> > >  	return xlog_write(log, &vec, ticket, NULL, NULL, XLOG_UNMOUNT_TRANS);
> > >  }
> > >  
> > > @@ -1796,10 +1789,20 @@ xlog_write_iclog(
> > >  	 * metadata writeback and causing priority inversions.
> > >  	 */
> > >  	iclog->ic_bio.bi_opf = REQ_OP_WRITE | REQ_META | REQ_SYNC | REQ_IDLE;
> > > -	if (iclog->ic_flags & XLOG_ICL_NEED_FLUSH)
> > > +	if (iclog->ic_flags & XLOG_ICL_NEED_FLUSH) {
> > >  		iclog->ic_bio.bi_opf |= REQ_PREFLUSH;
> > > +		/*
> > > +		 * For external log devices, we also need to flush the data
> > > +		 * device cache first to ensure all metadata writeback covered
> > > +		 * by the LSN in this iclog is on stable storage. This is slow,
> > > +		 * but it *must* complete before we issue the external log IO.
> > 
> > I'm a little confused about what's going on here.  We're about to write
> > a log record to disk, with h_tail_lsn reflecting the tail of the log and
> > h_lsn reflecting the current head of the log (i.e. this record).
> > 
> > If the log tail has moved forward since the last log record was written
> > and this fs has an external log, we need to flush the data device
> > because the AIL could have written logged items back into the filesystem
> > and we need to ensure those items have been persisted before we write to
> > the log the fact that the tail moved forward.  The AIL itself doesn't
> > issue cache flushes (nor does it need to), so that's why we do that
> > here.
> > 
> > Why don't we need a flush like this if only FUA is set?  Is it not
> > possible to write a checkpoint that fits within a single iclog after the
> > log tail has moved forward?
> 
> Yes, it is, and that race condition is exactly what the next
> patch in the series addresses. The data device cache flush is
> issued before we start writing the checkpoint to the iclogs. If the
> log tail moves after that flush was issued, then we detect that when
> releasing the commit iclog and set the XLOG_ICL_NEED_FLUSH flag on
> it. That will then trigger this code to issue a data device cache
> flush....

Aha, yeah, I noticed that after scanning the next few patches.
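
To make the detection step concrete, here is a minimal user-space sketch
of the idea as described in the reply above. It is not the kernel code:
the struct, the function name and the ICL_NEED_FLUSH bit are hypothetical
stand-ins, and the real check lives in the next patch of the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's XLOG_ICL_NEED_FLUSH bit. */
#define ICL_NEED_FLUSH	(1u << 0)

/* Toy iclog: only the flag word matters for this sketch. */
struct toy_iclog {
	unsigned int	flags;
};

/*
 * Model of releasing the commit iclog.
 *
 * tail_at_flush: log tail sampled when the up-front data device cache
 *                flush was issued (assumed name).
 * tail_now:      log tail at the time the commit iclog is released.
 *
 * If the tail moved in between, metadata written back after the flush
 * may not be stable yet, so the iclog is marked as needing a flush.
 * Returns true if the flag was set by this call.
 */
static bool
toy_release_commit_iclog(
	struct toy_iclog	*iclog,
	uint64_t		tail_at_flush,
	uint64_t		tail_now)
{
	if (tail_now != tail_at_flush) {
		iclog->flags |= ICL_NEED_FLUSH;
		return true;
	}
	return false;
}
```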

> IOWs, for external logs, the XLOG_ICL_NEED_FLUSH flag indicates that
> both the data device and the log device need a cache flush, rather
> than just the log device. I think it could be split into two flags,
> but then my head explodes thinking about log forces and trying to
> determine what type of flush is implied (and what flags we'd need to
> set) when we return log_flushed = true....

Maybe later when we're not focussed on recovery failures.
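
For reference, a minimal user-space sketch of the single-flag semantics
described above: with an external log, one NEED_FLUSH flag implies a data
device flush followed by a preflush write to the log device. All names
here are hypothetical; the model only records the order of operations and
assumes each recorded step completes before the next one is issued.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's XLOG_ICL_NEED_FLUSH bit. */
#define ICL_NEED_FLUSH	(1u << 0)

/* Operations the toy write path can issue, in the order they are issued. */
enum toy_op {
	OP_DATA_FLUSH,		/* cache flush on the data device */
	OP_LOG_PREFLUSH_WRITE,	/* log write preceded by a log device flush */
	OP_LOG_WRITE,		/* plain log write, no cache flush */
};

struct toy_trace {
	enum toy_op	ops[2];	/* at most two ops per iclog write */
	int		n;
};

/*
 * Model of the iclog write path. When the flag is set and the log is
 * external, the data device flush is recorded first, then the log
 * write with its own preflush. An internal log only needs the latter
 * because data and log share one device cache.
 */
static void
toy_write_iclog(
	struct toy_trace	*t,
	unsigned int		flags,
	bool			external_log)
{
	if (flags & ICL_NEED_FLUSH) {
		if (external_log)
			t->ops[t->n++] = OP_DATA_FLUSH;
		t->ops[t->n++] = OP_LOG_PREFLUSH_WRITE;
	} else {
		t->ops[t->n++] = OP_LOG_WRITE;
	}
}
```

Splitting the flag in two would mean every caller that sets or tests it
has to decide which device(s) it means, which is the complexity Dave is
pointing at.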

In the meantime, I'm satisfied enough to
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

