From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 4/4] xfs: introduce inode unlink log item
Date: Thu, 2 Jul 2020 08:25:53 -0400 [thread overview]
Message-ID: <20200702122553.GC55314@bfoster> (raw)
In-Reply-To: <20200701222428.GX2005@dread.disaster.area>
On Thu, Jul 02, 2020 at 08:24:28AM +1000, Dave Chinner wrote:
> On Wed, Jul 01, 2020 at 10:32:19AM -0400, Brian Foster wrote:
> > On Tue, Jun 23, 2020 at 07:50:15PM +1000, Dave Chinner wrote:
> > > From: Dave Chinner <dchinner@redhat.com>
> > >
> > > Tracking dirty inodes via cluster buffers creates lock ordering
> > > issues with logging unlinked inode updates direct to the cluster
> > > buffer. The unlinked inode list is unordered, so we can lock cluster
> > > buffers in random orders and that causes deadlocks.
> > >
> > > To solve this problem, we really want to dealy locking the cluster
> > > buffers until the pre-commit phase where we can order the buffers
> > > correctly along with all the other inode cluster buffers that are
> > > locked by the transaction. However, to do this we need to be able to
> > > tell the transaction which inodes need to have there unlinked list
> > > updated and what it should be updated to.
> > >
> > > We can delay the buffer update to the pre-commit phase based on the
> > > fact taht all unlinked inode list updates are serialised by the AGI
> > > buffer. It will be locked into the transaction before the list
> > > update starts, and will remain locked until the transaction commits.
> > > Hence we can lock and update the cluster buffers safely any time
> > > during the transaction and we are still safe from other racing
> > > unlinked list updates.
> > >
> > > The iunlink log item currently only exists in memory. we need a log
> > > item to attach information to the transaction, but it's context
> > > is completely owned by the transaction. Hence it is never formatted
> > > or inserted into the CIL, nor is it seen by the journal, the AIL or
> > > log recovery.
> > >
> > > This makes it a very simple log item, and the changes makes results
> > > in adding addition buffer log items to the transaction. Hence once
> > > the iunlink log item has run it's pre-commit operation, it can be
> > > dropped by the transaction and released.
> > >
> > > The creation of this in-memory intent does not prevent us from
> > > extending it in future to the journal to replace buffer based
> > > logging of the unlinked list. Changing the format of the items we
> > > write to the on disk journal is beyond the scope of this patchset,
> > > hence we limit it to being in-memory only.
> > >
> > > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > > ---
> > > fs/xfs/Makefile | 1 +
> > > fs/xfs/xfs_inode.c | 70 +++----------------
> > > fs/xfs/xfs_inode_item.c | 3 +-
> > > fs/xfs/xfs_iunlink_item.c | 141 ++++++++++++++++++++++++++++++++++++++
> > > fs/xfs/xfs_iunlink_item.h | 24 +++++++
> > > fs/xfs/xfs_super.c | 10 +++
> > > 6 files changed, 189 insertions(+), 60 deletions(-)
> > > create mode 100644 fs/xfs/xfs_iunlink_item.c
> > > create mode 100644 fs/xfs/xfs_iunlink_item.h
> > >
...
> > > diff --git a/fs/xfs/xfs_iunlink_item.c b/fs/xfs/xfs_iunlink_item.c
> > > new file mode 100644
> > > index 000000000000..83f1dc81133b
> > > --- /dev/null
> > > +++ b/fs/xfs/xfs_iunlink_item.c
> > > @@ -0,0 +1,141 @@
> > ...
> > > +
> > > +static const struct xfs_item_ops xfs_iunlink_item_ops = {
> > > + .flags = XFS_ITEM_RELEASE_WHEN_COMMITTED,
> > > + .iop_release = xfs_iunlink_item_release,
> >
> > Presumably we need the release callback for transaction abort, but the
> > flag looks unnecessary. That triggers a release on commit to the on-disk
> > log, which IIUC should never happen for this item.
>
> You are probably right - I didn't look that further than "it should
> be freed at commit time" and the flag name implies it is freed at
> commit time.
>
> Which, of course, then raises the question: "Which commit are we
> talking about here?". But because it's RFC work at this point I
> didn't bother chasing that detail down because the code worked and I
> had other things to do.....
>
I think it's funcionally harmless since this item would never be in a
situation where that flag has an effect. FWIW, I was never a big fan of
the iop_release() factoring and it hasn't really grown on me since. I
routinely have to double/triple check these callbacks where I didn't
before, especially when we have to consider things like this behavior
altering flag and cases where some of the other callbacks route back
into the release callback, etc.
Brian
> > > + .iop_sort = xfs_iunlink_item_sort,
> > > + .iop_precommit = xfs_iunlink_item_precommit,
> > > +};
> > > +
> > > +
> > > +/*
> > > + * Initialize the inode log item for a newly allocated (in-core) inode.
> > > + *
> > > + * Inode extents can only reside within an AG. Hence specify the starting
> > > + * block for the inode chunk by offset within an AG as well as the
> > > + * length of the allocated extent.
> > > + *
> > > + * This joins the item to the transaction and marks it dirty so
> > > + * that we don't need a separate call to do this, nor does the
> > > + * caller need to know anything about the iunlink item.
> > > + */
> >
> > Looks like some copy/paste remnants in the comment.
>
> Yup, I did just copy-pasta at lot of stuff around here...
>
> Cheers,
>
> Dave.
>
> --
> Dave Chinner
> david@fromorbit.com
>
prev parent reply other threads:[~2020-07-02 12:26 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-06-23 9:50 [PATCH 0/4] [RFC] xfs: in memory inode unlink log items Dave Chinner
2020-06-23 9:50 ` [PATCH 1/4] xfs: xfs_iflock is no longer a completion Dave Chinner
2020-06-24 15:36 ` Brian Foster
2020-07-01 5:48 ` Dave Chinner
2020-06-23 9:50 ` [PATCH 2/4] xfs: add log item precommit operation Dave Chinner
2020-06-30 18:06 ` Darrick J. Wong
2020-07-01 14:30 ` Brian Foster
2020-07-01 22:02 ` Dave Chinner
2020-06-23 9:50 ` [PATCH 3/4] xfs: track unlinked inodes in core inode Dave Chinner
2020-07-01 8:59 ` Gao Xiang
2020-07-01 22:06 ` Dave Chinner
2020-07-01 14:31 ` Brian Foster
2020-07-01 22:18 ` Dave Chinner
2020-07-02 12:24 ` Brian Foster
2020-07-07 14:39 ` Gao Xiang
2020-06-23 9:50 ` [PATCH 4/4] xfs: introduce inode unlink log item Dave Chinner
2020-06-30 18:19 ` Darrick J. Wong
2020-06-30 22:31 ` Gao Xiang
2020-07-01 6:26 ` Dave Chinner
2020-07-01 14:32 ` Brian Foster
2020-07-01 22:24 ` Dave Chinner
2020-07-02 12:25 ` Brian Foster [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200702122553.GC55314@bfoster \
--to=bfoster@redhat.com \
--cc=david@fromorbit.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).