linux-kernel.vger.kernel.org archive mirror
From: Dave Chinner <david@fromorbit.com>
To: Ira Weiny <ira.weiny@intel.com>
Cc: linux-kernel@vger.kernel.org,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Christoph Hellwig <hch@lst.de>,
	"Theodore Y. Ts'o" <tytso@mit.edu>, Jan Kara <jack@suse.cz>,
	linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH V4 09/13] fs/xfs: Add write aops lock to xfs layer
Date: Wed, 26 Feb 2020 09:59:41 +1100
Message-ID: <20200225225941.GO10776@dread.disaster.area>
In-Reply-To: <20200225211228.GB15810@iweiny-DESK2.sc.intel.com>

On Tue, Feb 25, 2020 at 01:12:28PM -0800, Ira Weiny wrote:
> On Tue, Feb 25, 2020 at 09:32:45AM +1100, Dave Chinner wrote:
> > On Mon, Feb 24, 2020 at 11:57:36AM -0800, Ira Weiny wrote:
> > > On Mon, Feb 24, 2020 at 11:34:55AM +1100, Dave Chinner wrote:
> > > > On Thu, Feb 20, 2020 at 04:41:30PM -0800, ira.weiny@intel.com wrote:
> > > > > From: Ira Weiny <ira.weiny@intel.com>
> > > > > 
> > > 
> > > [snip]
> > > 
> > > > > 
> > > > > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> > > > > index 35df324875db..5b014c428f0f 100644
> > > > > --- a/fs/xfs/xfs_inode.c
> > > > > +++ b/fs/xfs/xfs_inode.c
> > > > > @@ -142,12 +142,12 @@ xfs_ilock_attr_map_shared(
> > > > >   *
> > > > >   * Basic locking order:
> > > > >   *
> > > > > - * i_rwsem -> i_mmap_lock -> page_lock -> i_ilock
> > > > > + * s_dax_sem -> i_rwsem -> i_mmap_lock -> page_lock -> i_ilock
> > > > >   *
> > > > >   * mmap_sem locking order:
> > > > >   *
> > > > >   * i_rwsem -> page lock -> mmap_sem
> > > > > - * mmap_sem -> i_mmap_lock -> page_lock
> > > > > + * s_dax_sem -> mmap_sem -> i_mmap_lock -> page_lock
> > > > >   *
> > > > >   * The difference in mmap_sem locking order mean that we cannot hold the
> > > > >   * i_mmap_lock over syscall based read(2)/write(2) based IO. These IO paths can
> > > > > @@ -182,6 +182,9 @@ xfs_ilock(
> > > > >  	       (XFS_ILOCK_SHARED | XFS_ILOCK_EXCL));
> > > > >  	ASSERT((lock_flags & ~(XFS_LOCK_MASK | XFS_LOCK_SUBCLASS_MASK)) == 0);
> > > > >  
> > > > > +	if (lock_flags & XFS_DAX_EXCL)
> > > > > +		inode_aops_down_write(VFS_I(ip));
> > > > 
> > > > I largely don't see the point of adding this to xfs_ilock/iunlock.
> > > > 
> > > > It's only got one caller, so I don't see much point in adding it to
> > > > an interface that has over a hundred other call sites that don't
> > > > need or use this lock. just open code it where it is needed in the
> > > > ioctl code.
> > > 
> > > I know it seems overkill but if we don't do this we need to code a flag to be
> > > returned from xfs_ioctl_setattr_dax_invalidate().  This flag is then used in
> > > xfs_ioctl_setattr_get_trans() to create the transaction log item which can then
> > > be properly used to unlock the lock in xfs_inode_item_release()
> > > 
> > > I don't know of a cleaner way to communicate to xfs_inode_item_release() to
> > > unlock i_aops_sem after the transaction is complete.
> > 
> > We manually unlock inodes after transactions in many cases -
> > anywhere we do a rolling transaction, the inode locks do not get
> > released by the transaction. Hence for a one-off case like this it
> > doesn't really make sense to push all this infrastructure into the
> > transaction subsystem. Especially as we can manually lock before and
> > unlock after the transaction context without any real complexity.
> 
> So does xfs_trans_commit() operate synchronously?

What do you mean by "synchronously", and what are you expecting to
occur (a)synchronously with respect to filesystem objects and/or
on-disk state?

Keep in mind that the xfs transaction subsystem is a complex
asynchronous IO engine full of feedback loops and resource
management, so asking if something is "synchronous" without any
other context is a difficult question to answer :)

> I want to understand this better because I have fought with a lot of ABBA
> issues with these locks.  So...  can I hold the lock until after
> xfs_trans_commit() and safely unlock it there... because the XFS_MMAPLOCK_EXCL,
> XFS_IOLOCK_EXCL, and XFS_ILOCK_EXCL will be released at that point?  Thus
> preserving the following lock order.

See how operations like xfs_create, xfs_unlink, etc work. They don't
specify flags to xfs_trans_ijoin(), and so the transaction commits don't
automatically unlock the inode. This is necessary so that rolling
transactions are executed atomically w.r.t. inode access - no-one
can lock and access the inode while a multi-commit rolling
transaction on the inode is on-going.

In this case it's just a single commit and we don't need to keep
it locked after the change is made, so we can unlock the inode
on commit. So for the XFS internal locks the code is fine and
doesn't need to change. We just need to wrap the VFS aops lock (if
we keep it) around the outside of all the XFS locking until the
transaction commits and unlocks the XFS locks...
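
The two patterns described above can be sketched as a small userspace
model. This is only an illustration under stated assumptions, not XFS
code: every toy_* name is hypothetical, the locks are plain flags in a
single thread, and the unlock_on_commit argument stands in for passing
lock flags when the inode is joined to the transaction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these are NOT the kernel's types or functions. */
struct toy_inode {
	bool aops_locked;	/* stands in for the VFS aops lock */
	bool xfs_locked;	/* stands in for the XFS inode locks */
};

struct toy_trans {
	struct toy_inode *ip;	/* inode joined to this transaction */
	bool unlock_on_commit;	/* were lock flags passed at join time? */
};

static void toy_lock(bool *lock)
{
	assert(!*lock);		/* a double lock would be a deadlock */
	*lock = true;
}

static void toy_unlock(bool *lock)
{
	assert(*lock);
	*lock = false;
}

/* Join a locked inode; unlock_on_commit models passing lock flags. */
static void toy_trans_ijoin(struct toy_trans *tp, struct toy_inode *ip,
			    bool unlock_on_commit)
{
	assert(ip->xfs_locked);
	tp->ip = ip;
	tp->unlock_on_commit = unlock_on_commit;
}

/* Commit; release the XFS locks only if the join asked for it. */
static void toy_trans_commit(struct toy_trans *tp)
{
	if (tp->unlock_on_commit)
		toy_unlock(&tp->ip->xfs_locked);
	tp->ip = NULL;
}

/* Rolling style: two commits, inode stays locked throughout. */
static bool toy_rolling_update(struct toy_inode *ip)
{
	struct toy_trans tp = { 0 };
	bool held_between_commits;

	toy_lock(&ip->xfs_locked);
	toy_trans_ijoin(&tp, ip, false);
	toy_trans_commit(&tp);
	held_between_commits = ip->xfs_locked;
	toy_trans_ijoin(&tp, ip, false);
	toy_trans_commit(&tp);
	toy_unlock(&ip->xfs_locked);	/* manual unlock by the caller */
	return held_between_commits;
}

/* Single commit: aops lock wraps the XFS locking and the commit. */
static bool toy_setattr_dax(struct toy_inode *ip)
{
	struct toy_trans tp = { 0 };
	bool aops_held_after_commit;

	toy_lock(&ip->aops_locked);		/* outermost, taken first */
	toy_lock(&ip->xfs_locked);
	toy_trans_ijoin(&tp, ip, true);
	toy_trans_commit(&tp);			/* drops the XFS locks */
	aops_held_after_commit = ip->aops_locked && !ip->xfs_locked;
	toy_unlock(&ip->aops_locked);		/* released last */
	return aops_held_after_commit;
}
```

In this model toy_rolling_update() keeps the inode locked across both
commits and unlocks it by hand, while toy_setattr_dax() lets the single
commit drop the inner locks and releases the outer lock afterwards, so
the outer-before-inner ordering is preserved without teaching the
transaction code about the outer lock.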

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
