From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 3/5] xfs: journal IO cache flush reductions
Date: Fri, 29 Jan 2021 08:46:53 +1100
Message-ID: <20210128214653.GQ4662@dread.disaster.area>
In-Reply-To: <20210128151205.GC2599027@bfoster>

On Thu, Jan 28, 2021 at 10:12:05AM -0500, Brian Foster wrote:
> On Thu, Jan 28, 2021 at 03:41:52PM +1100, Dave Chinner wrote:
> ...
> > 
> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > ---
> >  fs/xfs/xfs_log.c      | 34 ++++++++++++++++++++++------------
> >  fs/xfs/xfs_log_priv.h |  3 +++
> >  2 files changed, 25 insertions(+), 12 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > index c5e3da23961c..8de93893e0e6 100644
> > --- a/fs/xfs/xfs_log.c
> > +++ b/fs/xfs/xfs_log.c
> ...
> > @@ -2464,9 +2465,18 @@ xlog_write(
> >  		ASSERT(log_offset <= iclog->ic_size - 1);
> >  		ptr = iclog->ic_datap + log_offset;
> >  
> > -		/* start_lsn is the first lsn written to. That's all we need. */
> > -		if (!*start_lsn)
> > +		/*
> > +		 * Start_lsn is the first lsn written to. That's all the caller
> > +		 * needs to have returned. Setting it indicates the first iclog
> > +		 * of a new checkpoint or the commit record for a checkpoint, so
> > +		 * also mark the iclog as requiring a pre-flush to ensure all
> > +		 * metadata writeback or journal IO in the checkpoint is
> > +		 * correctly ordered against this new log write.
> > +		 */
> > +		if (!*start_lsn) {
> >  			*start_lsn = be64_to_cpu(iclog->ic_header.h_lsn);
> > +			iclog->ic_flags |= XLOG_ICL_NEED_FLUSH;
> > +		}
> 
> My understanding is that one of the reasons for the preflush per iclog
> approach is that we don't have any submission -> completion ordering
> guarantees across iclogs. This is why we explicitly order commit record
> completions and whatnot, to ensure the important bits are ordered
> correctly. The fact we implement that ordering ourselves suggests that
> PREFLUSH|FUA itself do not provide such ordering, though that's not
> something I've investigated.

PREFLUSH provides ordering between completed IOs and the IO to be
submitted. It does not provide any ordering guarantees against IO
currently in flight, so the application needs to wait for the IOs it
needs to order against to complete before issuing an IO with
PREFLUSH.

i.e. PREFLUSH provides a "many" completion to "single" submission
ordering guarantee on stable storage.

REQ_FUA only guarantees that when the write IO completes, it is on
stable storage. It does not provide ordering guarantees against any
IO in flight, nor IOs submitted while it is in flight. Once it
completes, however, it is guaranteed that any later IO submission
will hit stable storage after that IO.

i.e. REQ_FUA provides a "single" completion to "many" submission
ordering guarantee on stable storage.
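
To make the two guarantees concrete, here is a toy userspace model of a
device with a volatile write cache. It is only a sketch and not kernel
code: the toy_* names and TOY_* flags are made up for illustration and
merely stand in for REQ_PREFLUSH and REQ_FUA. It assumes a completed
write sits in the cache until a flush, and that a flush only covers
writes that have already completed.

```c
#include <assert.h>

#define TOY_MAX_IO   8
#define TOY_PREFLUSH 0x1u	/* stands in for REQ_PREFLUSH (illustrative) */
#define TOY_FUA      0x2u	/* stands in for REQ_FUA (illustrative) */

enum toy_state { TOY_INFLIGHT, TOY_IN_CACHE, TOY_STABLE };

struct toy_dev {
	enum toy_state	state[TOY_MAX_IO];
	int		nr;
};

/* Submit a write: it stays in flight until the device completes it. */
static int toy_submit(struct toy_dev *dev)
{
	assert(dev->nr < TOY_MAX_IO);
	dev->state[dev->nr] = TOY_INFLIGHT;
	return dev->nr++;
}

/* Cache flush: only writes that have already completed become stable. */
static void toy_flush(struct toy_dev *dev)
{
	for (int i = 0; i < dev->nr; i++)
		if (dev->state[i] == TOY_IN_CACHE)
			dev->state[i] = TOY_STABLE;
}

/* Complete a write, honouring the toy PREFLUSH and FUA flags. */
static void toy_complete(struct toy_dev *dev, int id, unsigned int flags)
{
	if (flags & TOY_PREFLUSH)
		toy_flush(dev);		/* covers completed writes only */
	dev->state[id] = (flags & TOY_FUA) ? TOY_STABLE : TOY_IN_CACHE;
}

/* Returns 0 if the model behaves as described in the text above. */
static int toy_demo(void)
{
	struct toy_dev dev = { .nr = 0 };
	int a = toy_submit(&dev);
	int b = toy_submit(&dev);
	int c;

	toy_complete(&dev, a, 0);	/* a has completed, sits in cache */
	c = toy_submit(&dev);		/* b is still in flight here */
	toy_complete(&dev, c, TOY_PREFLUSH | TOY_FUA);

	if (dev.state[a] != TOY_STABLE)		/* flushed by c's PREFLUSH */
		return 1;
	if (dev.state[c] != TOY_STABLE)		/* stable on completion: FUA */
		return 2;
	if (dev.state[b] != TOY_INFLIGHT)	/* in flight: not ordered */
		return 3;

	toy_complete(&dev, b, 0);	/* b only reaches the cache */
	return dev.state[b] == TOY_IN_CACHE ? 0 : 4;
}
```

In toy_demo(), write a is covered by c's PREFLUSH because it completed
before c was issued, while b was still in flight and gets no ordering
at all. That is why the submitter has to wait for the IOs it cares
about before issuing the PREFLUSH write.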

> In any event, if the purpose of the PREFLUSH is to ensure that metadata
> in the targeted LSN range is committed to stable storage, and we have no
> submission ordering guarantees across non-commit record iclogs, what
> prevents a subsequent iclog from the same checkpoint from completing
> before the first iclog with a PREFLUSH?

Fair point. I suspect that we should just do an explicit cache flush
before we start the checkpoint, and then we don't have to worry
about REQ_PREFLUSH for the first iclog in the checkpoint at all.

Actually, I wonder if we can pipeline that - submit an async cache
flush bio as soon as we enter the push work, then once we're ready
to call xlog_write() having pulled the hundreds of thousands of log
vectors off the CIL, we wait on the cache flush bio to complete.
This gets around the first iclog in a long checkpoint requiring
cache flushing or FUA. It also means that if there is a single
iclog for the checkpoint, we only need a FUA write as the cache
flush has already been done...
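
Roughly, the ordering described above could look like the userspace
sketch below. None of this is XFS code: every toy_* name is
hypothetical, the "async flush" is just a pair of booleans, and the
flags chosen for the last iclog of a multi-iclog checkpoint
(PREFLUSH plus FUA on the commit record) are an assumption drawn from
the patch comment, not something settled in this thread.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PREFLUSH 0x1u	/* stands in for REQ_PREFLUSH (illustrative) */
#define TOY_FUA      0x2u	/* stands in for REQ_FUA (illustrative) */

/* Hypothetical stand-in for an async cache flush bio. */
struct toy_flush {
	bool	submitted;
	bool	done;
};

/* Kick off the cache flush as soon as the push work starts. */
static void toy_flush_submit(struct toy_flush *f)
{
	f->submitted = true;
	f->done = false;
}

/* Wait for the flush just before the first iclog write is issued. */
static void toy_flush_wait(struct toy_flush *f)
{
	assert(f->submitted);
	f->done = true;
}

/*
 * Pick the flags for iclog idx of a checkpoint spanning nr_iclogs.
 * Assumes the up-front flush has completed, so the first iclog needs
 * neither a cache flush nor FUA unless it is also the last one.
 */
static unsigned int toy_iclog_flags(const struct toy_flush *f,
				    int idx, int nr_iclogs)
{
	assert(f->done);
	if (idx != nr_iclogs - 1)
		return 0;			/* not the final iclog */
	if (nr_iclogs == 1)
		return TOY_FUA;			/* flush already done */
	return TOY_PREFLUSH | TOY_FUA;		/* assumed commit record flags */
}

/* Model one push; returns the flags used for the final iclog. */
static unsigned int toy_push(int nr_iclogs)
{
	struct toy_flush f;
	unsigned int flags = 0;

	toy_flush_submit(&f);
	/* ... pull the log vectors off the CIL while the flush runs ... */
	toy_flush_wait(&f);

	for (int i = 0; i < nr_iclogs; i++)
		flags = toy_iclog_flags(&f, i, nr_iclogs);
	return flags;
}
```

With this, toy_push(1) ends up with a FUA-only write, and a longer
checkpoint issues its first iclog with no flush flags at all because
the flush cost was overlapped with the CIL processing.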

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
