linux-nvdimm.lists.01.org archive mirror
* [RFC][PATCH] dax: Do not try to clear poison for partial pages
@ 2020-01-29 21:03 Vivek Goyal
  2020-01-31  5:42 ` Christoph Hellwig
  2020-02-05 20:26 ` jane.chu
  0 siblings, 2 replies; 7+ messages in thread
From: Vivek Goyal @ 2020-01-29 21:03 UTC (permalink / raw)
  To: Dan Williams, vishal.l.verma, Christoph Hellwig
  Cc: linux-nvdimm, linux-fsdevel

I am looking into getting rid of the dependency on the block device in
the dax path. One such place is __dax_zero_page_range(), which checks
whether the range being zeroed is aligned to the block device's logical
block size, and if so calls blkdev_issue_zeroout() instead of doing a
memset(). Calling blkdev_issue_zeroout() also clears bad blocks and
poison, if any, in that range.

This path is used by iomap_zero_range(), which in turn is used by
filesystems to zero partial filesystem blocks. For zeroing full
filesystem blocks we already seem to be calling blkdev_issue_zeroout(),
which clears bad blocks and poison in that range.

So this code currently only seems to help with partial filesystem
block zeroing, and only when the truncation/hole-punch happens on a
device logical block size boundary.

To avoid using blkdev_issue_zeroout() in this path, I proposed another
patch here which adds another dax operation to zero out a range and
clear poison.

https://lore.kernel.org/linux-fsdevel/20200123165249.GA7664@redhat.com/

Thinking more about it, that might be overkill for solving this problem.

How about we simply do not clear poison/bad blocks when a partial page
is being zeroed? IOW, users will need to hole-punch a whole filesystem
block's worth of data, which will free that block; it will be zeroed
some other time, clearing the poison in the process. For a partial fs
block truncation/punch_hole, we don't clear poison.
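From userspace, the contract proposed here could be exercised with
something like the following sketch (a hypothetical helper, not part of
any API; the 4096-byte block size is an assumption, real code should
use st_blksize):

```c
/* Sketch of the proposed contract: to have poison cleared, userspace
 * must punch a hole covering whole filesystem blocks; a sub-block
 * truncate/punch would take the memset() path that no longer clears
 * poison. BLK (4096) is an assumed block size; query st_blksize in
 * real code. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/falloc.h>
#include <sys/types.h>

#define BLK 4096UL

static int punch_whole_blocks(int fd, off_t offset, off_t len)
{
	/* Round inward to whole-block boundaries: only these blocks are
	 * freed and later reallocated as clean, zeroed blocks. */
	off_t start = (offset + BLK - 1) & ~(off_t)(BLK - 1);
	off_t end = (offset + len) & ~(off_t)(BLK - 1);

	if (end <= start)
		return 0;	/* nothing block-aligned to punch */
	return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
			 start, end - start);
}
```

A sub-block range (say, 100 bytes in the middle of a block) rounds
inward to nothing and is left alone, which matches the "partial zeroing
does not clear poison" behavior being proposed.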

If the above interface is acceptable, then we can get rid of the code
which tries to call blkdev_issue_zeroout() in the iomap_zero_range()
path, and we don't have to implement another dax operation.

Looking for some feedback on this.

Vivek
 
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/dax.c |   50 +++++++++++++++-----------------------------------
 1 file changed, 15 insertions(+), 35 deletions(-)

Index: redhat-linux/fs/dax.c
===================================================================
--- redhat-linux.orig/fs/dax.c	2020-01-29 15:19:18.551902448 -0500
+++ redhat-linux/fs/dax.c	2020-01-29 15:40:56.584824549 -0500
@@ -1044,47 +1044,27 @@ static vm_fault_t dax_load_hole(struct x
 	return ret;
 }
 
-static bool dax_range_is_aligned(struct block_device *bdev,
-				 unsigned int offset, unsigned int length)
-{
-	unsigned short sector_size = bdev_logical_block_size(bdev);
-
-	if (!IS_ALIGNED(offset, sector_size))
-		return false;
-	if (!IS_ALIGNED(length, sector_size))
-		return false;
-
-	return true;
-}
-
 int __dax_zero_page_range(struct block_device *bdev,
 		struct dax_device *dax_dev, sector_t sector,
 		unsigned int offset, unsigned int size)
 {
-	if (dax_range_is_aligned(bdev, offset, size)) {
-		sector_t start_sector = sector + (offset >> 9);
-
-		return blkdev_issue_zeroout(bdev, start_sector,
-				size >> 9, GFP_NOFS, 0);
-	} else {
-		pgoff_t pgoff;
-		long rc, id;
-		void *kaddr;
-
-		rc = bdev_dax_pgoff(bdev, sector, PAGE_SIZE, &pgoff);
-		if (rc)
-			return rc;
-
-		id = dax_read_lock();
-		rc = dax_direct_access(dax_dev, pgoff, 1, &kaddr, NULL);
-		if (rc < 0) {
-			dax_read_unlock(id);
-			return rc;
-		}
-		memset(kaddr + offset, 0, size);
-		dax_flush(dax_dev, kaddr + offset, size);
+	pgoff_t pgoff;
+	long rc, id;
+	void *kaddr;
+
+	rc = bdev_dax_pgoff(bdev, sector, PAGE_SIZE, &pgoff);
+	if (rc)
+		return rc;
+
+	id = dax_read_lock();
+	rc = dax_direct_access(dax_dev, pgoff, 1, &kaddr, NULL);
+	if (rc < 0) {
 		dax_read_unlock(id);
+		return rc;
 	}
+	memset(kaddr + offset, 0, size);
+	dax_flush(dax_dev, kaddr + offset, size);
+	dax_read_unlock(id);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(__dax_zero_page_range);
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org


* Re: [RFC][PATCH] dax: Do not try to clear poison for partial pages
  2020-01-29 21:03 [RFC][PATCH] dax: Do not try to clear poison for partial pages Vivek Goyal
@ 2020-01-31  5:42 ` Christoph Hellwig
  2020-02-05 20:26 ` jane.chu
  1 sibling, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2020-01-31  5:42 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: Christoph Hellwig, linux-nvdimm, linux-fsdevel

On Wed, Jan 29, 2020 at 04:03:37PM -0500, Vivek Goyal wrote:
> I am looking into getting rid of dependency on block device in dax
> path. One such place is __dax_zero_page_range() which checks if
> range being zeroed is aligned to block device logical size, then
> it calls bdev_issue_zeroout() instead of doing memset(). Calling
> blkdev_issue_zeroout() also clears bad blocks and poison if any
> in that range.
> 
> This path is used by iomap_zero_range() which in-turn is being
> used by filesystems to zero partial filesystem system blocks.
> For zeroing full filesystem blocks we seem to be calling
> blkdev_issue_zeroout() which clears bad blocks and poison in that
> range.
> 
> So this code currently only seems to help with partial filesystem
> block zeroing. That too only if truncation/hole_punch happens on
> device logical block size boundary.
> 
> To avoid using blkdev_issue_zeroout() in this path, I proposed another
> patch here which adds another dax operation to zero out a range and
> clear poison.

I'll have to defer to Dan and others on the poison clearing issue,
as it keeps confusing me every time I look into how it is supposed
to work...  But in the end we'll need a path that doesn't rely on
the block device to clear poison anyway, so I think a method will
ultimately be needed.  That being said, this patch looks nice :)


* Re: [RFC][PATCH] dax: Do not try to clear poison for partial pages
  2020-01-29 21:03 [RFC][PATCH] dax: Do not try to clear poison for partial pages Vivek Goyal
  2020-01-31  5:42 ` Christoph Hellwig
@ 2020-02-05 20:26 ` jane.chu
  2020-02-06  0:37   ` Dan Williams
  1 sibling, 1 reply; 7+ messages in thread
From: jane.chu @ 2020-02-05 20:26 UTC (permalink / raw)
  To: Vivek Goyal, Dan Williams, vishal.l.verma, Christoph Hellwig
  Cc: linux-nvdimm, linux-fsdevel

Hello,

I haven't seen a response to this proposal; I'm unsure if there is a
different but related discussion ongoing...

I'd like to express my wish: please make it easier for pmem
applications when possible.

If the kernel does not clear poison when it could legitimately do so,
applications have to go to great lengths to clear poison themselves.
For cloud pmem applications that have an upper bound on error recovery
time, not clearing poison while zeroing out is quite undesirable.

Thanks!
-jane

On 1/29/20 1:03 PM, Vivek Goyal wrote:
> I am looking into getting rid of dependency on block device in dax
> path. One such place is __dax_zero_page_range() which checks if
> range being zeroed is aligned to block device logical size, then
> it calls bdev_issue_zeroout() instead of doing memset(). Calling
> blkdev_issue_zeroout() also clears bad blocks and poison if any
> in that range.
> 
> This path is used by iomap_zero_range() which in-turn is being
> used by filesystems to zero partial filesystem system blocks.
> For zeroing full filesystem blocks we seem to be calling
> blkdev_issue_zeroout() which clears bad blocks and poison in that
> range.
> 
> So this code currently only seems to help with partial filesystem
> block zeroing. That too only if truncation/hole_punch happens on
> device logical block size boundary.
> 
> To avoid using blkdev_issue_zeroout() in this path, I proposed another
> patch here which adds another dax operation to zero out a range and
> clear poison.
> 
> https://lore.kernel.org/linux-fsdevel/20200123165249.GA7664@redhat.com/
> 
> Thinking more about it, it might be an overkill to solve this problem.
> 
> How about if we just not clear poison/bad blocks when a partial page
> is being zeroed. IOW, users will need to hole_punch whole filesystem
> block worth of data which will free that block and it will be zeroed
> some other time and clear poison in the process. For partial fs block
> truncation/punch_hole, we don't clear poison.
> 
> If above interface is acceptable, then we can get rid of code which
> tries to call blkdev_issue_zeroout() in iomap_zero_range() path and
> we don't have to implement another dax operation.
> 
> Looking for some feedback on this.
> 
> Vivek


* Re: [RFC][PATCH] dax: Do not try to clear poison for partial pages
  2020-02-05 20:26 ` jane.chu
@ 2020-02-06  0:37   ` Dan Williams
  2020-02-18 19:50     ` Jeff Moyer
  0 siblings, 1 reply; 7+ messages in thread
From: Dan Williams @ 2020-02-06  0:37 UTC (permalink / raw)
  To: Jane Chu; +Cc: Christoph Hellwig, linux-nvdimm, linux-fsdevel

On Wed, Feb 5, 2020 at 12:27 PM <jane.chu@oracle.com> wrote:
>
> Hello,
>
> I haven't seen response to this proposal, unsure if there is a different
> but related discussion ongoing...
>
> I'd like to express my wish: please make it easier for the pmem
> applications when possible.
>
> If kernel does not clear poison when it could legitimately do so,

The only path where this happens today is the write() syscall in dax
mode; otherwise fallocate(PUNCH_HOLE) is currently the only guaranteed
way to trigger error clearing from userspace (outside of sending raw
commands to the device).

> applications have to go through lengths to clear poisons.
> For Cloud pmem applications that have upper bound on error recovery
> time, not clearing poison while zeroing-out is quite undesirable.

The complicating factor in all of this is the alignment requirement
for clearing and the inability of native cpu instructions to clear
errors. On current platforms, talking to firmware is required, and that
interface may require 256-byte block clearing. This is why the
implementation glommed on to clearing errors on block-I/O path writes:
we at least knew that all of those I/Os were 512-byte aligned.

This gets better with cpus that support the movdir64b instruction, in
that case there is still a 64-byte alignment requirement, but there's
no need to talk to the BIOS and therefore no need to talk to a driver.
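The alignment constraint can be made concrete with a small sketch (the
helper name is made up; the 256- and 64-byte units are the ones
mentioned above) showing how an arbitrary byte range shrinks to the
sub-range whose errors could actually be cleared:

```c
/* Sketch: shrink an arbitrary byte range to the largest sub-range whose
 * start and length are multiples of the clear-error unit (256 bytes for
 * current firmware interfaces, 64 for movdir64b, per the discussion
 * above). unit must be a power of two. Bytes outside the returned
 * sub-range would remain poisoned. */
#include <stdbool.h>
#include <stddef.h>

struct clear_range {
	size_t start;
	size_t len;
};

static bool clearable_subrange(size_t start, size_t len, size_t unit,
			       struct clear_range *out)
{
	size_t head = (start + unit - 1) & ~(unit - 1);	/* round start up */
	size_t tail = (start + len) & ~(unit - 1);	/* round end down */

	if (tail <= head)
		return false;	/* too small/misaligned to clear anything */
	out->start = head;
	out->len = tail - head;
	return true;
}
```

With the same 600-byte range, a 64-byte unit can clear twice as many
bytes as a 256-byte unit, which is the improvement movdir64b-class
hardware brings.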

So we have this awkward dependency on block-device I/O semantics only
because it happened to organize I/O in a way that supports error
clearing.

Right now the kernel does not install a pte on faults that land on a
page with known poison, but only because the error clearing path is so
convoluted and could only claim that fallocate(PUNCH_HOLE) cleared
errors because that was guaranteed to send 512-byte aligned zeros
down the block-I/O path when the fs blocks got reallocated. In a world
where native cpu instructions can clear errors, the dax write() syscall
case could be covered (modulo 64-byte alignment), and the kernel could
just let the page be mapped so that the application could attempt its
own fine-grained clearing without calling back into the kernel.


* Re: [RFC][PATCH] dax: Do not try to clear poison for partial pages
  2020-02-06  0:37   ` Dan Williams
@ 2020-02-18 19:50     ` Jeff Moyer
  2020-02-18 20:45       ` jane.chu
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff Moyer @ 2020-02-18 19:50 UTC (permalink / raw)
  To: Dan Williams; +Cc: Christoph Hellwig, linux-nvdimm, linux-fsdevel

Dan Williams <dan.j.williams@intel.com> writes:

> Right now the kernel does not install a pte on faults that land on a
> page with known poison, but only because the error clearing path is so
> convoluted and could only claim that fallocate(PUNCH_HOLE) cleared
> errors because that was guaranteed to send 512-byte aligned zero's
> down the block-I/O path when the fs-blocks got reallocated. In a world
> where native cpu instructions can clear errors the dax write() syscall
> case could be covered (modulo 64-byte alignment), and the kernel could
> just let the page be mapped so that the application could attempt it's
> own fine-grained clearing without calling back into the kernel.

I'm not sure we'd want to allow mapping the PTEs even if there was
support for clearing errors via CPU instructions.  Any load from a
poisoned page will result in an MCE, and there exists the possibility
that you will hit an unrecoverable error (Processor Context Corrupt).
It's just safer to catch these cases by not mapping the page, and
forcing recovery through the driver.

-Jeff


* Re: [RFC][PATCH] dax: Do not try to clear poison for partial pages
  2020-02-18 19:50     ` Jeff Moyer
@ 2020-02-18 20:45       ` jane.chu
  2020-02-18 22:50         ` jane.chu
  0 siblings, 1 reply; 7+ messages in thread
From: jane.chu @ 2020-02-18 20:45 UTC (permalink / raw)
  To: Jeff Moyer, Dan Williams; +Cc: Christoph Hellwig, linux-nvdimm, linux-fsdevel

On 2/18/20 11:50 AM, Jeff Moyer wrote:
> Dan Williams <dan.j.williams@intel.com> writes:
> 
>> Right now the kernel does not install a pte on faults that land on a
>> page with known poison, but only because the error clearing path is so
>> convoluted and could only claim that fallocate(PUNCH_HOLE) cleared
>> errors because that was guaranteed to send 512-byte aligned zero's
>> down the block-I/O path when the fs-blocks got reallocated. In a world
>> where native cpu instructions can clear errors the dax write() syscall
>> case could be covered (modulo 64-byte alignment), and the kernel could
>> just let the page be mapped so that the application could attempt it's
>> own fine-grained clearing without calling back into the kernel.
> 
> I'm not sure we'd want to do allow mapping the PTEs even if there was
> support for clearing errors via CPU instructions.  Any load from a
> poisoned page will result in an MCE, and there exists the possiblity
> that you will hit an unrecoverable error (Processor Context Corrupt).
> It's just safer to catch these cases by not mapping the page, and
> forcing recovery through the driver.
> 
> -Jeff
> 

I'm still in the process of trying a number of things before making an
attempt to respond to Dan's response. But I'm too slow, so I'd like
to share some concerns I have here.

If poison in a file is consumed, the signal handler does the repair
and recovery as follows: punch a hole of at least 4K, then pwrite the
correct data into the 'hole', then resume the operation.
However, because the newly allocated pmem block (due to the pwrite
into the 'hole') is a different, clean physical pmem block while the
poisoned block remains unfixed, we have a provisioning problem:
  1. DCPMM is expensive, hence users are likely to provide little
spare provisioning;
  2. there is no API between the dax filesystem and the pmem driver
for clearing poison at each legitimate point, such as when the
filesystem allocates a pmem block or zeroes out a range.
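The repair flow described above (punch at least a 4K hole, then pwrite
the good data back) might be sketched as follows. This is a hypothetical
helper with an assumed 4K page size, not an endorsement of the
provisioning cost it incurs:

```c
/* Sketch of the recovery sequence described above: punch at least a
 * page-sized hole over the poisoned offset, then pwrite the correct
 * data back. The rewritten extent lands on a different, clean physical
 * block; the poisoned physical block stays quarantined, which is the
 * provisioning concern. PAGE (4096) is an assumption. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/falloc.h>
#include <unistd.h>

#define PAGE 4096UL

static int repair_range(int fd, off_t off, const void *good, size_t len)
{
	off_t start = off & ~(off_t)(PAGE - 1);
	off_t hole = ((off - start) + (off_t)len + PAGE - 1) &
		     ~(off_t)(PAGE - 1);

	/* Free the poisoned block(s); the clean replacement is
	 * allocated by the pwrite below. */
	if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
		      start, hole))
		return -1;
	return pwrite(fd, good, len, off) == (ssize_t)len ? 0 : -1;
}
```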

As DCPMM is used for its performance and capacity in cloud
applications, this means the performance code paths include the error
handling and recovery code paths...

With respect to the new cpu instruction, my concern is about the API,
including the error blast radius as reported in the signal payload.
Is there a venue where we could discuss this in more detail?

Regards,
-jane




* Re: [RFC][PATCH] dax: Do not try to clear poison for partial pages
  2020-02-18 20:45       ` jane.chu
@ 2020-02-18 22:50         ` jane.chu
  0 siblings, 0 replies; 7+ messages in thread
From: jane.chu @ 2020-02-18 22:50 UTC (permalink / raw)
  To: Jeff Moyer, Dan Williams; +Cc: Christoph Hellwig, linux-nvdimm, linux-fsdevel

On 2/18/20 12:45 PM, jane.chu@oracle.com wrote:
> 
> I'm still in the process of trying a number of things before making an
> attempt to respond to Dan's response. But I'm too slow, so I'd like
> to share some concerns I have here.
> 
> If a poison in a file is consumed, and the signal handle does the
> repair and recover as follow: punch a hole the size at least 4K, then
> pwrite the correct data in to the 'hole', then resume the operation.
> However, because the newly allocated pmem block (due to pwrite to the 
> 'hole') is a different clean physical pmem block while the poisoned
> block remain unfixed, so we have a provisioning problem, because
>   1. DCPMEM is expensive hence there is likely little provision being
> provided by users;
>   2. lack up API between dax-filesystem and pmem driver for clearing
> poison at each legitimate point, such as when the filesystem tries
> to allocate a pmem block, or zeroing out a range.
>
> As DCPMM is used for its performance and capacity in cloud application,
> which translates to that the performance code paths include the error
> handling and recovery code path...
> 
> With respect to the new cpu instruction, my concern is about the API 
> including the error blast radius as reported in the signal payload.
> Is there a venue where we could discuss more in detail ?

For all the quarantined poison blocks, it's not practical to clear the
poison via ndctl/libndctl at per-namespace granularity, for fear of
poison that occurred in valid pmem blocks while the data was at rest.

How do we ultimately clear poison in a dax-fs in the current framework?
It seems to me poison needs to be cleared on the go, automatically.

Regards,
-jane

> 
> Regards,
> -jane
> 
> 
> 

