All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ross Zwisler <ross.zwisler@linux.intel.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>,
	linux-kernel@vger.kernel.org,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Jan Kara <jack@suse.com>, Matthew Wilcox <willy@linux.intel.com>,
	linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Dan Williams <dan.j.williams@intel.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	linux-nvdimm@lists.01.org,
	Matthew Wilcox <matthew.r.wilcox@intel.com>
Subject: Re: [PATCH v2 1/2] dax: dax_pfn_mkwrite() truncate race check
Date: Wed, 14 Oct 2015 11:26:26 -0600	[thread overview]
Message-ID: <20151014172626.GA14123@linux.intel.com> (raw)
In-Reply-To: <20151014052550.GL31326@dastard>

On Wed, Oct 14, 2015 at 04:25:50PM +1100, Dave Chinner wrote:
> On Tue, Oct 13, 2015 at 04:25:36PM -0600, Ross Zwisler wrote:
> > Update dax_pfn_mkwrite() so that it validates i_size before returning.
> > This is necessary to ensure that the page fault has not raced with truncate
> > and is now pointing to a region beyond the end of the current file.
> > 
> > This change is based on a similar outstanding patch for XFS from Dave
> > Chinner entitled "xfs: add ->pfn_mkwrite support for DAX".
> > 
> > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> > Cc: Dave Chinner <david@fromorbit.com>
> > ---
> >  fs/dax.c | 13 +++++++++++--
> >  1 file changed, 11 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/dax.c b/fs/dax.c
> > index 131fd35a..82be6e4 100644
> > --- a/fs/dax.c
> > +++ b/fs/dax.c
> > @@ -693,12 +693,21 @@ EXPORT_SYMBOL_GPL(dax_pmd_fault);
> >   */
> >  int dax_pfn_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> >  {
> > -	struct super_block *sb = file_inode(vma->vm_file)->i_sb;
> > +	struct inode *inode = file_inode(vma->vm_file);
> > +	struct super_block *sb = inode->i_sb;
> > +	int ret = VM_FAULT_NOPAGE;
> > +	loff_t size;
> >  
> >  	sb_start_pagefault(sb);
> >  	file_update_time(vma->vm_file);
> > +
> > +	/* check that the faulting page hasn't raced with truncate */
> > +	size = (i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT;
> > +	if (vmf->pgoff >= size)
> > +		ret = VM_FAULT_SIGBUS;
> > +
> >  	sb_end_pagefault(sb);
> > -	return VM_FAULT_NOPAGE;
> > +	return ret;
> 
> This is still racy, as the read of the inode size is not serialised
> against or ordered by any locks held by truncate. The check in XFS
> is serialised against truncate by the XFS_MMAPLOCK and the generic
> DAX code does not have such a mechanism to rely on. Hence I'd
> suggest that the correct thing to do here is remove
> dax_pfn_mkwrite() and force filesystems to implement their own
> truncate-safe variants of ->pfn_mkwrite.

Agreed, let's just drop this patch and remove dax_pfn_mkwrite().  The last
user of this function was ext4, and that usage goes away with Jan's latest
patch set.

WARNING: multiple messages have this Message-ID (diff)
From: Ross Zwisler <ross.zwisler@linux.intel.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>,
	linux-kernel@vger.kernel.org,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Jan Kara <jack@suse.com>, Matthew Wilcox <willy@linux.intel.com>,
	linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Dan Williams <dan.j.williams@intel.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	linux-nvdimm@ml01.01.org,
	Matthew Wilcox <matthew.r.wilcox@intel.com>
Subject: Re: [PATCH v2 1/2] dax: dax_pfn_mkwrite() truncate race check
Date: Wed, 14 Oct 2015 11:26:26 -0600	[thread overview]
Message-ID: <20151014172626.GA14123@linux.intel.com> (raw)
In-Reply-To: <20151014052550.GL31326@dastard>

On Wed, Oct 14, 2015 at 04:25:50PM +1100, Dave Chinner wrote:
> On Tue, Oct 13, 2015 at 04:25:36PM -0600, Ross Zwisler wrote:
> > Update dax_pfn_mkwrite() so that it validates i_size before returning.
> > This is necessary to ensure that the page fault has not raced with truncate
> > and is now pointing to a region beyond the end of the current file.
> > 
> > This change is based on a similar outstanding patch for XFS from Dave
> > Chinner entitled "xfs: add ->pfn_mkwrite support for DAX".
> > 
> > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> > Cc: Dave Chinner <david@fromorbit.com>
> > ---
> >  fs/dax.c | 13 +++++++++++--
> >  1 file changed, 11 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/dax.c b/fs/dax.c
> > index 131fd35a..82be6e4 100644
> > --- a/fs/dax.c
> > +++ b/fs/dax.c
> > @@ -693,12 +693,21 @@ EXPORT_SYMBOL_GPL(dax_pmd_fault);
> >   */
> >  int dax_pfn_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> >  {
> > -	struct super_block *sb = file_inode(vma->vm_file)->i_sb;
> > +	struct inode *inode = file_inode(vma->vm_file);
> > +	struct super_block *sb = inode->i_sb;
> > +	int ret = VM_FAULT_NOPAGE;
> > +	loff_t size;
> >  
> >  	sb_start_pagefault(sb);
> >  	file_update_time(vma->vm_file);
> > +
> > +	/* check that the faulting page hasn't raced with truncate */
> > +	size = (i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT;
> > +	if (vmf->pgoff >= size)
> > +		ret = VM_FAULT_SIGBUS;
> > +
> >  	sb_end_pagefault(sb);
> > -	return VM_FAULT_NOPAGE;
> > +	return ret;
> 
> This is still racy, as the read of the inode size is not serialised
> against or ordered by any locks held by truncate. The check in XFS
> is serialised against truncate by the XFS_MMAPLOCK and the generic
> DAX code does not have such a mechanism to rely on. Hence I'd
> suggest that the correct thing to do here is remove
> dax_pfn_mkwrite() and force filesystems to implement their own
> truncate-safe variants of ->pfn_mkwrite.

Agreed, let's just drop this patch and remove dax_pfn_mkwrite().  The last
user of this function was ext4, and that usage goes away with Jan's latest
patch set.

  parent reply	other threads:[~2015-10-14 17:26 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-13 22:25 [PATCH v2 0/2] Add updated DAX locking to ext2 Ross Zwisler
2015-10-13 22:25 ` Ross Zwisler
2015-10-13 22:25 ` [PATCH v2 1/2] dax: dax_pfn_mkwrite() truncate race check Ross Zwisler
2015-10-13 22:25   ` Ross Zwisler
2015-10-14  5:25   ` Dave Chinner
2015-10-14  5:25     ` Dave Chinner
2015-10-14  8:40     ` Jan Kara
2015-10-14  8:40       ` Jan Kara
2015-10-14 22:53       ` Dave Chinner
2015-10-14 22:53         ` Dave Chinner
2015-10-16  7:55         ` Jan Kara
2015-10-16  7:55           ` Jan Kara
2015-10-14 17:26     ` Ross Zwisler [this message]
2015-10-14 17:26       ` Ross Zwisler
2015-10-13 22:25 ` [PATCH v2 2/2] ext2: Add locking for DAX faults Ross Zwisler
2015-10-13 22:25   ` Ross Zwisler
2015-10-14  8:51   ` Jan Kara
2015-10-14  8:51     ` Jan Kara
2015-10-14 15:31     ` Ross Zwisler
2015-10-14 15:31       ` Ross Zwisler
2015-10-19 12:47       ` Jan Kara
2015-10-19 12:47         ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151014172626.GA14123@linux.intel.com \
    --to=ross.zwisler@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=dan.j.williams@intel.com \
    --cc=david@fromorbit.com \
    --cc=jack@suse.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=matthew.r.wilcox@intel.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.