From: Ross Zwisler <ross.zwisler@linux.intel.com>
To: linux-kernel@vger.kernel.org
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>,
	Dan Williams <dan.j.williams@intel.com>,
	Dave Chinner <david@fromorbit.com>, Jan Kara <jack@suse.com>,
	Matthew Wilcox <willy@linux.intel.com>,
	linux-fsdevel@vger.kernel.org, linux-nvdimm@lists.01.org
Subject: [PATCH v3 5/5] dax: fix clearing of holes in __dax_pmd_fault()
Date: Fri, 22 Jan 2016 14:36:13 -0700
Message-ID: <1453498573-6328-6-git-send-email-ross.zwisler@linux.intel.com>
In-Reply-To: <1453498573-6328-1-git-send-email-ross.zwisler@linux.intel.com>

When the user reads from a DAX hole via mmap, we service the page faults
using zero-filled page cache pages.  These zero pages are also placed into
the address_space radix tree.  When we get our first write for that range,
we can allocate a PMD page's worth of DAX storage to replace the hole.

When this happens we need to unmap the zero pages and remove them from the
radix tree.  Prior to this patch we were unmapping *all* storage in our
PMD's range, which is incorrect: on non-allocating page faults it removed
DAX entries as well.

Instead, keep track of when get_block() actually gives us new storage so
that we only remove zero pages that were covering holes.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: Jan Kara <jack@suse.cz>
---
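The control flow described above can be modelled in userspace.  This is
only a sketch, not the kernel code: every name in it (model_file,
model_get_block(), model_pmd_fault(), the slot_state values) is
hypothetical, and the page cache is reduced to a single state plus a
counter of how often the range was cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel state; none of these exist in fs/dax.c. */
enum slot_state { SLOT_EMPTY, SLOT_ZERO_PAGES, SLOT_DAX };

struct model_file {
	bool has_storage;      /* would a non-allocating lookup find blocks? */
	enum slot_state cache; /* what the radix tree holds for this PMD range */
	int truncations;       /* how many times the range has been cleared */
};

/* Models get_block(): allocates only when 'create' is set, then reports
 * whether the range is backed by storage. */
static bool model_get_block(struct model_file *f, bool create)
{
	if (!f->has_storage && create)
		f->has_storage = true;
	return f->has_storage;
}

/* Models the patched fault path; returns true if storage was allocated. */
static bool model_pmd_fault(struct model_file *f, bool write)
{
	bool alloc = false;
	bool mapped;

	assert(f != NULL);

	/* First pass: look up only, never allocate. */
	mapped = model_get_block(f, false);

	/* Second pass: allocate only for a write fault over a hole. */
	if (!mapped && write) {
		mapped = model_get_block(f, true);
		alloc = true;
	}

	/* Clear the range only when this fault replaced a hole. */
	if (alloc) {
		f->cache = SLOT_EMPTY;
		f->truncations++;
	}

	/* Storage gets a DAX entry; a hole that was read gets zero pages. */
	f->cache = mapped ? SLOT_DAX : SLOT_ZERO_PAGES;
	return alloc;
}
```

In this model, calling model_pmd_fault() on a hole with write set returns
true and bumps truncations once; later faults on the same range return
false and leave the DAX entry alone, which is the behaviour the commit
message asks for.
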
 fs/dax.c | 32 ++++++++++++++++++++++----------
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index a2ed009..206650f 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -791,9 +791,9 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 	bool write = flags & FAULT_FLAG_WRITE;
 	struct block_device *bdev;
 	pgoff_t size, pgoff;
-	loff_t lstart, lend;
 	sector_t block;
 	int error, result = 0;
+	bool alloc = false;
 
 	/* dax pmd mappings require pfn_t_devmap() */
 	if (!IS_ENABLED(CONFIG_FS_DAX_PMD))
@@ -831,10 +831,17 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 	block = (sector_t)pgoff << (PAGE_SHIFT - blkbits);
 
 	bh.b_size = PMD_SIZE;
-	if (get_block(inode, block, &bh, write) != 0)
+
+	if (get_block(inode, block, &bh, 0) != 0)
 		return VM_FAULT_SIGBUS;
+
+	if (!buffer_mapped(&bh) && write) {
+		if (get_block(inode, block, &bh, 1) != 0)
+			return VM_FAULT_SIGBUS;
+		alloc = true;
+	}
+
 	bdev = bh.b_bdev;
-	i_mmap_lock_read(mapping);
 
 	/*
 	 * If the filesystem isn't willing to tell us the length of a hole,
@@ -843,15 +850,20 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 	 */
 	if (!buffer_size_valid(&bh) || bh.b_size < PMD_SIZE) {
 		dax_pmd_dbg(&bh, address, "allocated block too small");
-		goto fallback;
+		return VM_FAULT_FALLBACK;
+	}
+
+	/*
+	 * If we allocated new storage, make sure no process has any
+	 * zero pages covering this hole
+	 */
+	if (alloc) {
+		loff_t lstart = pgoff << PAGE_SHIFT;
+		loff_t lend = lstart + PMD_SIZE - 1; /* inclusive */
+
+		truncate_pagecache_range(inode, lstart, lend);
 	}
 
-	/* make sure no process has any zero pages covering this hole */
-	lstart = pgoff << PAGE_SHIFT;
-	lend = lstart + PMD_SIZE - 1; /* inclusive */
-	i_mmap_unlock_read(mapping);
-	unmap_mapping_range(mapping, lstart, PMD_SIZE, 0);
-	truncate_inode_pages_range(mapping, lstart, lend);
 	i_mmap_lock_read(mapping);
 
 	/*
-- 
2.5.0
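
The byte range handed to truncate_pagecache_range() in the last hunk is
inclusive at both ends.  A small userspace sketch of that arithmetic
follows, assuming 4 KiB pages and a 2 MiB PMD (typical for x86-64, but
an assumption here); MODEL_PAGE_SHIFT, MODEL_PMD_SIZE, byte_range and
pmd_range() are illustrative names, not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12                /* assumed: 4 KiB pages */
#define MODEL_PMD_SIZE (INT64_C(1) << 21)  /* assumed: 2 MiB PMD */

struct byte_range {
	int64_t start; /* first byte of the range */
	int64_t end;   /* last byte of the range, inclusive */
};

/* Converts a page offset into the inclusive byte range of one PMD. */
static struct byte_range pmd_range(uint64_t pgoff)
{
	struct byte_range r;

	/* Reject offsets whose byte address would not fit in int64_t. */
	assert(pgoff <= ((uint64_t)INT64_MAX >> MODEL_PAGE_SHIFT));

	r.start = (int64_t)(pgoff << MODEL_PAGE_SHIFT);
	r.end = r.start + MODEL_PMD_SIZE - 1;
	return r;
}
```

With these assumed sizes, page offset 512 maps to bytes 2097152 through
4194303.  The sketch widens pgoff to 64 bits before shifting; whether the
kernel expression needs a similar cast depends on the width of pgoff_t on
the target architecture.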

