nvdimm.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Ross Zwisler <zwisler@kernel.org>
Cc: Jan Kara <jack@suse.cz>,
	darrick.wong@oracle.com, linux-nvdimm@lists.01.org,
	linux-xfs <linux-xfs@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	lczerner@redhat.com, linux-ext4 <linux-ext4@vger.kernel.org>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH v4 0/2] ext4: fix DAX dma vs truncate/hole-punch
Date: Mon, 6 Aug 2018 13:55:50 +1000	[thread overview]
Message-ID: <20180806035550.GE7395@dastard> (raw)
In-Reply-To: <CAOxpaSWCT+XRtGxGQT_PVMcn79csMxbqb+cf4yycbV3eqjjnkQ@mail.gmail.com>

On Fri, Jul 27, 2018 at 10:28:51AM -0600, Ross Zwisler wrote:
> + fsdevel and the xfs list.
> 
> On Wed, Jul 25, 2018 at 4:28 PM Ross Zwisler
> <ross.zwisler@linux.intel.com> wrote:
> > On Wed, Jul 11, 2018 at 10:17:41AM +0200, Jan Kara wrote:
> > > On Tue 10-07-18 13:10:29, Ross Zwisler wrote:
> > > > Changes since v3:
> > > >  * Added an ext4_break_layouts() call to ext4_insert_range() to ensure
> > > >    that the {ext4,xfs}_break_layouts() calls have the same meaning.
> > > >    (Dave, Darrick and Jan)
> > >
> > > How about the occasional WARN_ON_ONCE you mention below. Were you able to
> > > hunt them down?
> >
> > The root cause of this issue is that while the ei->i_mmap_sem provides
> > synchronization between ext4_break_layouts() and page faults, it doesn't
> > provide synchronize us with the direct I/O path.  This exact same issue exists
> > in XFS AFAICT, with the synchronization tool there being the XFS_MMAPLOCK.

That's not correct for XFS.

Truncate/holepunch in XFS hold *both* the MMAPLOCK and the IOLOCK in
exclusive mode. Holding the MMAPLOCK serialises against page faults,
holding the IOLOCK serialises against buffered IO and Direct IO
submission. The inode_dio_wait() call that truncate/holepunch does
after gaining the IOLOCK ensures that we wait for all direct IO in
progress to complete (e.g. AIO) before the truncate/holepunch goes
ahead. e.g:

STATIC long
xfs_file_fallocate(
        struct file             *file,
        int                     mode,
        loff_t                  offset,
        loff_t                  len)
{
.....
        uint                    iolock = XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL;
.....
        xfs_ilock(ip, iolock);
        error = xfs_break_layouts(inode, &iolock, BREAK_UNMAP);
.....
	if (mode & FALLOC_FL_PUNCH_HOLE) {
		error = xfs_free_file_space(ip, offset, len);


and then xfs_free_file_space calls xfs_flush_unmap_range() which
waits for direct IO to complete, flushes all the dirty cached pages
and then invalidates the page cache before doing any extent
manipulation. The cannot be any IO or page faults in progress after
this point.....

truncate (through xfs_vn_setattr() essentially does the same thing,
except that it's called with the IOLOCK already held exclusively
(remember the IOLOCK is the vfs inode->i_rwsem).

> > This allows the direct I/O path to do I/O and raise & lower page->_refcount
> > while we're executing a truncate/hole punch.  This leads to us trying to free
> > a page with an elevated refcount.

I don't see how this is possible in XFS - maybe I'm missing
something, but "direct IO submission during truncate" is not
something that should ever be happening in XFS, DAX or not.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

  reply	other threads:[~2018-08-06  3:55 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-10 19:10 Ross Zwisler
2018-07-10 19:10 ` [PATCH v4 1/2] dax: dax_layout_busy_page() warn on !exceptional Ross Zwisler
2018-08-10 19:52   ` Eric Sandeen
2018-08-10 20:33     ` Theodore Y. Ts'o
2018-08-11  2:10       ` Theodore Y. Ts'o
2018-08-13 10:12         ` Jan Kara
2018-08-13 12:46           ` Theodore Y. Ts'o
2018-08-24 15:44           ` Jan Kara
2018-08-27 16:09           ` Jan Kara
2018-07-10 19:10 ` [PATCH v4 2/2] ext4: handle layout changes to pinned DAX mappings Ross Zwisler
2018-07-11  8:17 ` [PATCH v4 0/2] ext4: fix DAX dma vs truncate/hole-punch Jan Kara
2018-07-11 15:41   ` Ross Zwisler
2018-07-25 22:28   ` Ross Zwisler
2018-07-27 16:28     ` Ross Zwisler
2018-08-06  3:55       ` Dave Chinner [this message]
2018-08-06 15:49         ` Christoph Hellwig
2018-08-06 22:25           ` Dave Chinner
2018-08-07  8:45       ` Jan Kara
2018-09-10 22:18         ` Eric Sandeen
2018-09-11 15:14           ` Jan Kara
2018-09-11 15:20             ` Jan Kara
2018-09-11 17:28               ` Theodore Y. Ts'o
2018-09-11 18:21                 ` Eric Sandeen
2018-07-31 19:44 ` Ross Zwisler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180806035550.GE7395@dastard \
    --to=david@fromorbit.com \
    --cc=darrick.wong@oracle.com \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=lczerner@redhat.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=zwisler@kernel.org \
    --subject='Re: [PATCH v4 0/2] ext4: fix DAX dma vs truncate/hole-punch' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).