From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 04/10] xfs: fix ordering violation between cache flushes and tail updates
Date: Tue, 27 Jul 2021 07:44:49 +1000
Message-ID: <20210726214449.GR664593@dread.disaster.area>
In-Reply-To: <20210726173521.GB559142@magnolia>

On Mon, Jul 26, 2021 at 10:35:21AM -0700, Darrick J. Wong wrote:
> On Mon, Jul 26, 2021 at 04:07:10PM +1000, Dave Chinner wrote:
> > --- a/fs/xfs/xfs_log.c
> > +++ b/fs/xfs/xfs_log.c
> > @@ -489,12 +489,20 @@ xfs_log_reserve(
> >  
> >  /*
> >   * Flush iclog to disk if this is the last reference to the given iclog and the
> > - * it is in the WANT_SYNC state.
> > + * it is in the WANT_SYNC state. If the caller passes in a non-zero
> 
> I've noticed that the log code isn't always consistent about special
> looking LSNs -- some places use NULLCOMMITLSN, some places opencode
> (xfs_lsn_t)-1, and other code uses zero.  Is there some historical
> reason for having these distinct values?  Or do they actually mean
> separate things?

It depends on the use case. I resisted converting the (xfs_lsn_t)-1s
in and around this patchset to NULLCOMMITLSN just to keep the noise
down. Wherever we use -1 as an LSN, we should really use
NULLCOMMITLSN.

As for comparing to zero, that works because the LSN is highly
unlikely to have 64 bit overflow in the lifetime of the journal. And
if it does overflow, it's unlikely that we'll overflow exactly to
(0,0) as a meaningful LSN.

However, I think we've probably got bigger problems around xfs_lsn_t
overflowing, such as it being defined as signed rather than
unsigned. I also suspect that XFS_LSN_CMP() fails if we overflow LSNs
back to zero and compare high cycle (old) with low cycle (new) LSNs.
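
To make the overflow concern concrete, here is a minimal userspace
sketch. It assumes an LSN is a signed 64 bit value that packs a 32 bit
cycle number in the high half and a 32 bit block number in the low
half, and that the comparison orders by cycle first and block second.
All model_* names are hypothetical stand-ins for illustration, not the
kernel's definitions.

```c
#include <stdint.h>

/* Hypothetical model of an LSN: signed 64 bit, cycle in the high half. */
typedef int64_t model_lsn_t;

/* Stand-in for the "-1 as an LSN" sentinel discussed above. */
#define MODEL_NULLCOMMITLSN ((model_lsn_t)-1)

static inline uint32_t model_cycle(model_lsn_t lsn)
{
	return (uint32_t)((uint64_t)lsn >> 32);
}

static inline uint32_t model_block(model_lsn_t lsn)
{
	return (uint32_t)lsn;
}

static inline model_lsn_t model_assign_lsn(uint32_t cycle, uint32_t block)
{
	return (model_lsn_t)(((uint64_t)cycle << 32) | block);
}

/*
 * Assumed comparison: cycle first, then block.
 * Returns -1 if a is older than b, 1 if newer, 0 if equal.
 */
static inline int model_lsn_cmp(model_lsn_t a, model_lsn_t b)
{
	if (model_cycle(a) != model_cycle(b))
		return model_cycle(a) < model_cycle(b) ? -1 : 1;
	if (model_block(a) != model_block(b))
		return model_block(a) < model_block(b) ? -1 : 1;
	return 0;
}
```

Under these assumptions, an LSN written just before the cycle counter
wraps compares as newer than one written just after the wrap, which is
the ordering failure suspected above. Note also that the all-ones
cycle/block pair is indistinguishable from the -1 sentinel, and (0,0)
is indistinguishable from a zero "not set" value.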

So, really, right now I've ostriched this issue because I'm still
trying to work through g/482 failures and I suspect we need to change
the definition of an LSN to fix these issues...

> > + * @old_tail_lsn, then we need to check if the log tail is different to the
> > + * caller's value. If it is different, this indicates that the log tail has
> > + * moved since the caller sampled the log tail and issued a cache flush and so
> > + * there may be metadata on disk that we need to flush before this iclog is
> 
> "If the caller passes in a non-zero @old_tail_lsn and the current log
> tail does not match, there may be metadata on disk that must be
> persisted before this iclog is written.  To satisfy that requirement,
> set the XLOG_ICL_NEED_FLUSH flag as a condition for writing this iclog
> with the new log tail value." ?

Ok.
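
For reference, the rule in the suggested comment can be sketched in
userspace C as below. This is only an illustration of the wording
above, not the patch itself: struct model_iclog, its fields,
MODEL_ICL_NEED_FLUSH and model_release_iclog() are hypothetical
stand-ins, and zero is assumed to mean "caller did not sample the
tail".

```c
#include <stdint.h>

/* Hypothetical stand-in for the XLOG_ICL_NEED_FLUSH flag. */
#define MODEL_ICL_NEED_FLUSH (1u << 0)

/* Hypothetical, heavily reduced model of an iclog. */
struct model_iclog {
	uint32_t flags;
	int64_t  tail_lsn;	/* tail value this iclog will be written with */
};

/*
 * If the caller sampled the log tail (non-zero old_tail_lsn) and the
 * current tail no longer matches, request a cache flush before the
 * iclog is written with the new tail value.
 */
static void model_release_iclog(struct model_iclog *iclog,
				int64_t cur_tail_lsn,
				int64_t old_tail_lsn)
{
	if (old_tail_lsn && cur_tail_lsn != old_tail_lsn)
		iclog->flags |= MODEL_ICL_NEED_FLUSH;
	iclog->tail_lsn = cur_tail_lsn;
}
```

A caller that passes zero, or whose sampled tail still matches the
current tail, leaves the flag untouched in this model.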

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
