linux-nvdimm.lists.01.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 0/6] dax/pmem: Provide a dax operation to zero page range
@ 2020-02-28 16:34 Vivek Goyal
  2020-02-28 16:34 ` [PATCH v6 1/6] pmem: Add functions for reading/writing page to/from pmem Vivek Goyal
                   ` (8 more replies)
  0 siblings, 9 replies; 17+ messages in thread
From: Vivek Goyal @ 2020-02-28 16:34 UTC (permalink / raw)
  To: linux-fsdevel, linux-nvdimm, hch, dan.j.williams; +Cc: david, dm-devel

Hi,

This is V6 of patches. These patches are also available at.

Changes since V5:

- Dan Williams preferred ->zero_page_range() to only accept PAGE_SIZE
  aligned request and clear poison only on page size aligned zeroing. So
  I changed it accordingly. 

- Dropped all the modifications which were required to support arbitrary
  range zeroing with-in a page.

- This patch series also fixes the issue where "truncate -s 512 foo.txt"
  will fail if first sector of file is poisoned. Currently it succeeds
  and filesystem expectes whole of the filesystem block to be free of
  poison at the end of the operation.

Christoph, I have dropped your Reviewed-by tag on 1-2 patches because
these patches changed substantially. Especially signature of of
dax zero_page_range() helper.

Thanks
Vivek

Vivek Goyal (6):
  pmem: Add functions for reading/writing page to/from pmem
  dax, pmem: Add a dax operation zero_page_range
  s390,dcssblk,dax: Add dax zero_page_range operation to dcssblk driver
  dm,dax: Add dax zero_page_range operation
  dax: Use new dax zero page method for zeroing a page
  dax,iomap: Add helper dax_iomap_zero() to zero a range

 drivers/dax/super.c           | 20 ++++++++
 drivers/md/dm-linear.c        | 18 +++++++
 drivers/md/dm-log-writes.c    | 17 ++++++
 drivers/md/dm-stripe.c        | 23 +++++++++
 drivers/md/dm.c               | 30 +++++++++++
 drivers/nvdimm/pmem.c         | 97 ++++++++++++++++++++++-------------
 drivers/s390/block/dcssblk.c  | 15 ++++++
 fs/dax.c                      | 59 ++++++++++-----------
 fs/iomap/buffered-io.c        |  9 +---
 include/linux/dax.h           | 21 +++-----
 include/linux/device-mapper.h |  3 ++
 11 files changed, 221 insertions(+), 91 deletions(-)

-- 
2.20.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v6 1/6] pmem: Add functions for reading/writing page to/from pmem
  2020-02-28 16:34 [PATCH v6 0/6] dax/pmem: Provide a dax operation to zero page range Vivek Goyal
@ 2020-02-28 16:34 ` Vivek Goyal
  2020-02-29  8:04   ` Pankaj Gupta
  2020-02-28 16:34 ` [PATCH v6 2/6] dax, pmem: Add a dax operation zero_page_range Vivek Goyal
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Vivek Goyal @ 2020-02-28 16:34 UTC (permalink / raw)
  To: linux-fsdevel, linux-nvdimm, hch, dan.j.williams
  Cc: david, dm-devel, Christoph Hellwig

This splits pmem_do_bvec() into pmem_do_read() and pmem_do_write().
pmem_do_write() will be used by pmem zero_page_range() as well. Hence
sharing the same code.

Suggested-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 drivers/nvdimm/pmem.c | 86 +++++++++++++++++++++++++------------------
 1 file changed, 50 insertions(+), 36 deletions(-)

diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 4eae441f86c9..075b11682192 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -136,9 +136,25 @@ static blk_status_t read_pmem(struct page *page, unsigned int off,
 	return BLK_STS_OK;
 }
 
-static blk_status_t pmem_do_bvec(struct pmem_device *pmem, struct page *page,
-			unsigned int len, unsigned int off, unsigned int op,
-			sector_t sector)
+static blk_status_t pmem_do_read(struct pmem_device *pmem,
+			struct page *page, unsigned int page_off,
+			sector_t sector, unsigned int len)
+{
+	blk_status_t rc;
+	phys_addr_t pmem_off = sector * 512 + pmem->data_offset;
+	void *pmem_addr = pmem->virt_addr + pmem_off;
+
+	if (unlikely(is_bad_pmem(&pmem->bb, sector, len)))
+		return BLK_STS_IOERR;
+
+	rc = read_pmem(page, page_off, pmem_addr, len);
+	flush_dcache_page(page);
+	return rc;
+}
+
+static blk_status_t pmem_do_write(struct pmem_device *pmem,
+			struct page *page, unsigned int page_off,
+			sector_t sector, unsigned int len)
 {
 	blk_status_t rc = BLK_STS_OK;
 	bool bad_pmem = false;
@@ -148,34 +164,25 @@ static blk_status_t pmem_do_bvec(struct pmem_device *pmem, struct page *page,
 	if (unlikely(is_bad_pmem(&pmem->bb, sector, len)))
 		bad_pmem = true;
 
-	if (!op_is_write(op)) {
-		if (unlikely(bad_pmem))
-			rc = BLK_STS_IOERR;
-		else {
-			rc = read_pmem(page, off, pmem_addr, len);
-			flush_dcache_page(page);
-		}
-	} else {
-		/*
-		 * Note that we write the data both before and after
-		 * clearing poison.  The write before clear poison
-		 * handles situations where the latest written data is
-		 * preserved and the clear poison operation simply marks
-		 * the address range as valid without changing the data.
-		 * In this case application software can assume that an
-		 * interrupted write will either return the new good
-		 * data or an error.
-		 *
-		 * However, if pmem_clear_poison() leaves the data in an
-		 * indeterminate state we need to perform the write
-		 * after clear poison.
-		 */
-		flush_dcache_page(page);
-		write_pmem(pmem_addr, page, off, len);
-		if (unlikely(bad_pmem)) {
-			rc = pmem_clear_poison(pmem, pmem_off, len);
-			write_pmem(pmem_addr, page, off, len);
-		}
+	/*
+	 * Note that we write the data both before and after
+	 * clearing poison.  The write before clear poison
+	 * handles situations where the latest written data is
+	 * preserved and the clear poison operation simply marks
+	 * the address range as valid without changing the data.
+	 * In this case application software can assume that an
+	 * interrupted write will either return the new good
+	 * data or an error.
+	 *
+	 * However, if pmem_clear_poison() leaves the data in an
+	 * indeterminate state we need to perform the write
+	 * after clear poison.
+	 */
+	flush_dcache_page(page);
+	write_pmem(pmem_addr, page, page_off, len);
+	if (unlikely(bad_pmem)) {
+		rc = pmem_clear_poison(pmem, pmem_off, len);
+		write_pmem(pmem_addr, page, page_off, len);
 	}
 
 	return rc;
@@ -197,8 +204,12 @@ static blk_qc_t pmem_make_request(struct request_queue *q, struct bio *bio)
 
 	do_acct = nd_iostat_start(bio, &start);
 	bio_for_each_segment(bvec, bio, iter) {
-		rc = pmem_do_bvec(pmem, bvec.bv_page, bvec.bv_len,
-				bvec.bv_offset, bio_op(bio), iter.bi_sector);
+		if (op_is_write(bio_op(bio)))
+			rc = pmem_do_write(pmem, bvec.bv_page, bvec.bv_offset,
+				iter.bi_sector, bvec.bv_len);
+		else
+			rc = pmem_do_read(pmem, bvec.bv_page, bvec.bv_offset,
+				iter.bi_sector, bvec.bv_len);
 		if (rc) {
 			bio->bi_status = rc;
 			break;
@@ -223,9 +234,12 @@ static int pmem_rw_page(struct block_device *bdev, sector_t sector,
 	struct pmem_device *pmem = bdev->bd_queue->queuedata;
 	blk_status_t rc;
 
-	rc = pmem_do_bvec(pmem, page, hpage_nr_pages(page) * PAGE_SIZE,
-			  0, op, sector);
-
+	if (op_is_write(op))
+		rc = pmem_do_write(pmem, page, 0, sector,
+				   hpage_nr_pages(page) * PAGE_SIZE);
+	else
+		rc = pmem_do_read(pmem, page, 0, sector,
+				   hpage_nr_pages(page) * PAGE_SIZE);
 	/*
 	 * The ->rw_page interface is subtle and tricky.  The core
 	 * retries on any error, so we can only invoke page_endio() in
-- 
2.20.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v6 2/6] dax, pmem: Add a dax operation zero_page_range
  2020-02-28 16:34 [PATCH v6 0/6] dax/pmem: Provide a dax operation to zero page range Vivek Goyal
  2020-02-28 16:34 ` [PATCH v6 1/6] pmem: Add functions for reading/writing page to/from pmem Vivek Goyal
@ 2020-02-28 16:34 ` Vivek Goyal
  2020-02-29  9:21   ` Pankaj Gupta
  2020-02-28 16:34 ` [PATCH v6 3/6] s390,dcssblk,dax: Add dax zero_page_range operation to dcssblk driver Vivek Goyal
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Vivek Goyal @ 2020-02-28 16:34 UTC (permalink / raw)
  To: linux-fsdevel, linux-nvdimm, hch, dan.j.williams; +Cc: david, dm-devel

Add a dax operation zero_page_range, to zero a page. This will also clear any
known poison in the page being zeroed.

As of now, zeroing of one page is allowed in a single call. There
are no callers which are trying to zero more than a page in a single call.
Once we grow the callers which zero more than a page in single call, we
can add that support. Primary reason for not doing that yet is that this
will add little complexity in dm implementation where a range might be
spanning multiple underlying targets and one will have to split the range
into multiple sub ranges and call zero_page_range() on individual targets.

Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 drivers/dax/super.c   | 20 ++++++++++++++++++++
 drivers/nvdimm/pmem.c | 11 +++++++++++
 include/linux/dax.h   |  4 ++++
 3 files changed, 35 insertions(+)

diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index 0aa4b6bc5101..e498daf3c0d7 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -344,6 +344,26 @@ size_t dax_copy_to_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
 }
 EXPORT_SYMBOL_GPL(dax_copy_to_iter);
 
+int dax_zero_page_range(struct dax_device *dax_dev, pgoff_t pgoff,
+			size_t nr_pages)
+{
+	if (!dax_alive(dax_dev))
+		return -ENXIO;
+
+	if (!dax_dev->ops->zero_page_range)
+		return -EOPNOTSUPP;
+	/*
+	 * There are no callers that want to zero more than one page as of now.
+	 * Once users are there, this check can be removed after the
+	 * device mapper code has been updated to split ranges across targets.
+	 */
+	if (nr_pages != 1)
+		return -EIO;
+
+	return dax_dev->ops->zero_page_range(dax_dev, pgoff, nr_pages);
+}
+EXPORT_SYMBOL_GPL(dax_zero_page_range);
+
 #ifdef CONFIG_ARCH_HAS_PMEM_API
 void arch_wb_cache_pmem(void *addr, size_t size);
 void dax_flush(struct dax_device *dax_dev, void *addr, size_t size)
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 075b11682192..5b774ddd0efb 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -282,6 +282,16 @@ static const struct block_device_operations pmem_fops = {
 	.revalidate_disk =	nvdimm_revalidate_disk,
 };
 
+static int pmem_dax_zero_page_range(struct dax_device *dax_dev, pgoff_t pgoff,
+				    size_t nr_pages)
+{
+	struct pmem_device *pmem = dax_get_private(dax_dev);
+
+	return blk_status_to_errno(pmem_do_write(pmem, ZERO_PAGE(0), 0,
+				   PFN_PHYS(pgoff) >> SECTOR_SHIFT,
+				   PAGE_SIZE));
+}
+
 static long pmem_dax_direct_access(struct dax_device *dax_dev,
 		pgoff_t pgoff, long nr_pages, void **kaddr, pfn_t *pfn)
 {
@@ -313,6 +323,7 @@ static const struct dax_operations pmem_dax_ops = {
 	.dax_supported = generic_fsdax_supported,
 	.copy_from_iter = pmem_copy_from_iter,
 	.copy_to_iter = pmem_copy_to_iter,
+	.zero_page_range = pmem_dax_zero_page_range,
 };
 
 static const struct attribute_group *pmem_attribute_groups[] = {
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 328c2dbb4409..71735c430c05 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -34,6 +34,8 @@ struct dax_operations {
 	/* copy_to_iter: required operation for fs-dax direct-i/o */
 	size_t (*copy_to_iter)(struct dax_device *, pgoff_t, void *, size_t,
 			struct iov_iter *);
+	/* zero_page_range: required operation. Zero page range   */
+	int (*zero_page_range)(struct dax_device *, pgoff_t, size_t);
 };
 
 extern struct attribute_group dax_attribute_group;
@@ -199,6 +201,8 @@ size_t dax_copy_from_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
 		size_t bytes, struct iov_iter *i);
 size_t dax_copy_to_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
 		size_t bytes, struct iov_iter *i);
+int dax_zero_page_range(struct dax_device *dax_dev, pgoff_t pgoff,
+			size_t nr_pages);
 void dax_flush(struct dax_device *dax_dev, void *addr, size_t size);
 
 ssize_t dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
-- 
2.20.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v6 3/6] s390,dcssblk,dax: Add dax zero_page_range operation to dcssblk driver
  2020-02-28 16:34 [PATCH v6 0/6] dax/pmem: Provide a dax operation to zero page range Vivek Goyal
  2020-02-28 16:34 ` [PATCH v6 1/6] pmem: Add functions for reading/writing page to/from pmem Vivek Goyal
  2020-02-28 16:34 ` [PATCH v6 2/6] dax, pmem: Add a dax operation zero_page_range Vivek Goyal
@ 2020-02-28 16:34 ` Vivek Goyal
  2020-02-28 16:34 ` [PATCH v6 4/6] dm,dax: Add dax zero_page_range operation Vivek Goyal
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Vivek Goyal @ 2020-02-28 16:34 UTC (permalink / raw)
  To: linux-fsdevel, linux-nvdimm, hch, dan.j.williams
  Cc: david, dm-devel, linux-s390, Gerald Schaefer

Add dax operation zero_page_range for dcssblk driver.

CC: linux-s390@vger.kernel.org
Suggested-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 drivers/s390/block/dcssblk.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c
index 63502ca537eb..ab3e2898816c 100644
--- a/drivers/s390/block/dcssblk.c
+++ b/drivers/s390/block/dcssblk.c
@@ -57,11 +57,26 @@ static size_t dcssblk_dax_copy_to_iter(struct dax_device *dax_dev,
 	return copy_to_iter(addr, bytes, i);
 }
 
+static int dcssblk_dax_zero_page_range(struct dax_device *dax_dev,
+				       pgoff_t pgoff, size_t nr_pages)
+{
+	long rc;
+	void *kaddr;
+
+	rc = dax_direct_access(dax_dev, pgoff, nr_pages, &kaddr, NULL);
+	if (rc < 0)
+		return rc;
+	memset(kaddr, 0, nr_pages << PAGE_SHIFT);
+	dax_flush(dax_dev, kaddr, nr_pages << PAGE_SHIFT);
+	return 0;
+}
+
 static const struct dax_operations dcssblk_dax_ops = {
 	.direct_access = dcssblk_dax_direct_access,
 	.dax_supported = generic_fsdax_supported,
 	.copy_from_iter = dcssblk_dax_copy_from_iter,
 	.copy_to_iter = dcssblk_dax_copy_to_iter,
+	.zero_page_range = dcssblk_dax_zero_page_range,
 };
 
 struct dcssblk_dev_info {
-- 
2.20.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v6 4/6] dm,dax: Add dax zero_page_range operation
  2020-02-28 16:34 [PATCH v6 0/6] dax/pmem: Provide a dax operation to zero page range Vivek Goyal
                   ` (2 preceding siblings ...)
  2020-02-28 16:34 ` [PATCH v6 3/6] s390,dcssblk,dax: Add dax zero_page_range operation to dcssblk driver Vivek Goyal
@ 2020-02-28 16:34 ` Vivek Goyal
  2020-03-31 19:34   ` Dan Williams
  2020-02-28 16:34 ` [PATCH v6 5/6] dax: Use new dax zero page method for zeroing a page Vivek Goyal
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Vivek Goyal @ 2020-02-28 16:34 UTC (permalink / raw)
  To: linux-fsdevel, linux-nvdimm, hch, dan.j.williams; +Cc: david, dm-devel

This patch adds support for dax zero_page_range operation to dm targets.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 drivers/md/dm-linear.c        | 18 ++++++++++++++++++
 drivers/md/dm-log-writes.c    | 17 +++++++++++++++++
 drivers/md/dm-stripe.c        | 23 +++++++++++++++++++++++
 drivers/md/dm.c               | 30 ++++++++++++++++++++++++++++++
 include/linux/device-mapper.h |  3 +++
 5 files changed, 91 insertions(+)

diff --git a/drivers/md/dm-linear.c b/drivers/md/dm-linear.c
index 8d07fdf63a47..e1db43446327 100644
--- a/drivers/md/dm-linear.c
+++ b/drivers/md/dm-linear.c
@@ -201,10 +201,27 @@ static size_t linear_dax_copy_to_iter(struct dm_target *ti, pgoff_t pgoff,
 	return dax_copy_to_iter(dax_dev, pgoff, addr, bytes, i);
 }
 
+static int linear_dax_zero_page_range(struct dm_target *ti, pgoff_t pgoff,
+				      size_t nr_pages)
+{
+	int ret;
+	struct linear_c *lc = ti->private;
+	struct block_device *bdev = lc->dev->bdev;
+	struct dax_device *dax_dev = lc->dev->dax_dev;
+	sector_t dev_sector, sector = pgoff * PAGE_SECTORS;
+
+	dev_sector = linear_map_sector(ti, sector);
+	ret = bdev_dax_pgoff(bdev, dev_sector, nr_pages << PAGE_SHIFT, &pgoff);
+	if (ret)
+		return ret;
+	return dax_zero_page_range(dax_dev, pgoff, nr_pages);
+}
+
 #else
 #define linear_dax_direct_access NULL
 #define linear_dax_copy_from_iter NULL
 #define linear_dax_copy_to_iter NULL
+#define linear_dax_zero_page_range NULL
 #endif
 
 static struct target_type linear_target = {
@@ -226,6 +243,7 @@ static struct target_type linear_target = {
 	.direct_access = linear_dax_direct_access,
 	.dax_copy_from_iter = linear_dax_copy_from_iter,
 	.dax_copy_to_iter = linear_dax_copy_to_iter,
+	.dax_zero_page_range = linear_dax_zero_page_range,
 };
 
 int __init dm_linear_init(void)
diff --git a/drivers/md/dm-log-writes.c b/drivers/md/dm-log-writes.c
index 99721c76225d..8ea20b56b4d6 100644
--- a/drivers/md/dm-log-writes.c
+++ b/drivers/md/dm-log-writes.c
@@ -994,10 +994,26 @@ static size_t log_writes_dax_copy_to_iter(struct dm_target *ti,
 	return dax_copy_to_iter(lc->dev->dax_dev, pgoff, addr, bytes, i);
 }
 
+static int log_writes_dax_zero_page_range(struct dm_target *ti, pgoff_t pgoff,
+					  size_t nr_pages)
+{
+	int ret;
+	struct log_writes_c *lc = ti->private;
+	sector_t sector = pgoff * PAGE_SECTORS;
+
+	ret = bdev_dax_pgoff(lc->dev->bdev, sector, nr_pages << PAGE_SHIFT,
+			     &pgoff);
+	if (ret)
+		return ret;
+	return dax_zero_page_range(lc->dev->dax_dev, pgoff,
+				   nr_pages << PAGE_SHIFT);
+}
+
 #else
 #define log_writes_dax_direct_access NULL
 #define log_writes_dax_copy_from_iter NULL
 #define log_writes_dax_copy_to_iter NULL
+#define log_writes_dax_zero_page_range NULL
 #endif
 
 static struct target_type log_writes_target = {
@@ -1016,6 +1032,7 @@ static struct target_type log_writes_target = {
 	.direct_access = log_writes_dax_direct_access,
 	.dax_copy_from_iter = log_writes_dax_copy_from_iter,
 	.dax_copy_to_iter = log_writes_dax_copy_to_iter,
+	.dax_zero_page_range = log_writes_dax_zero_page_range,
 };
 
 static int __init dm_log_writes_init(void)
diff --git a/drivers/md/dm-stripe.c b/drivers/md/dm-stripe.c
index 63bbcc20f49a..fa813c0f993d 100644
--- a/drivers/md/dm-stripe.c
+++ b/drivers/md/dm-stripe.c
@@ -360,10 +360,32 @@ static size_t stripe_dax_copy_to_iter(struct dm_target *ti, pgoff_t pgoff,
 	return dax_copy_to_iter(dax_dev, pgoff, addr, bytes, i);
 }
 
+static int stripe_dax_zero_page_range(struct dm_target *ti, pgoff_t pgoff,
+				      size_t nr_pages)
+{
+	int ret;
+	sector_t dev_sector, sector = pgoff * PAGE_SECTORS;
+	struct stripe_c *sc = ti->private;
+	struct dax_device *dax_dev;
+	struct block_device *bdev;
+	uint32_t stripe;
+
+	stripe_map_sector(sc, sector, &stripe, &dev_sector);
+	dev_sector += sc->stripe[stripe].physical_start;
+	dax_dev = sc->stripe[stripe].dev->dax_dev;
+	bdev = sc->stripe[stripe].dev->bdev;
+
+	ret = bdev_dax_pgoff(bdev, dev_sector, nr_pages << PAGE_SHIFT, &pgoff);
+	if (ret)
+		return ret;
+	return dax_zero_page_range(dax_dev, pgoff, nr_pages);
+}
+
 #else
 #define stripe_dax_direct_access NULL
 #define stripe_dax_copy_from_iter NULL
 #define stripe_dax_copy_to_iter NULL
+#define stripe_dax_zero_page_range NULL
 #endif
 
 /*
@@ -486,6 +508,7 @@ static struct target_type stripe_target = {
 	.direct_access = stripe_dax_direct_access,
 	.dax_copy_from_iter = stripe_dax_copy_from_iter,
 	.dax_copy_to_iter = stripe_dax_copy_to_iter,
+	.dax_zero_page_range = stripe_dax_zero_page_range,
 };
 
 int __init dm_stripe_init(void)
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index b89f07ee2eff..aa72d9e757c1 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1198,6 +1198,35 @@ static size_t dm_dax_copy_to_iter(struct dax_device *dax_dev, pgoff_t pgoff,
 	return ret;
 }
 
+static int dm_dax_zero_page_range(struct dax_device *dax_dev, pgoff_t pgoff,
+				  size_t nr_pages)
+{
+	struct mapped_device *md = dax_get_private(dax_dev);
+	sector_t sector = pgoff * PAGE_SECTORS;
+	struct dm_target *ti;
+	int ret = -EIO;
+	int srcu_idx;
+
+	ti = dm_dax_get_live_target(md, sector, &srcu_idx);
+
+	if (!ti)
+		goto out;
+	if (WARN_ON(!ti->type->dax_zero_page_range)) {
+		/*
+		 * ->zero_page_range() is mandatory dax operation. If we are
+		 *  here, something is wrong.
+		 */
+		dm_put_live_table(md, srcu_idx);
+		goto out;
+	}
+	ret = ti->type->dax_zero_page_range(ti, pgoff, nr_pages);
+
+ out:
+	dm_put_live_table(md, srcu_idx);
+
+	return ret;
+}
+
 /*
  * A target may call dm_accept_partial_bio only from the map routine.  It is
  * allowed for all bio types except REQ_PREFLUSH, REQ_OP_ZONE_RESET,
@@ -3199,6 +3228,7 @@ static const struct dax_operations dm_dax_ops = {
 	.dax_supported = dm_dax_supported,
 	.copy_from_iter = dm_dax_copy_from_iter,
 	.copy_to_iter = dm_dax_copy_to_iter,
+	.zero_page_range = dm_dax_zero_page_range,
 };
 
 /*
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index 475668c69dbc..af48d9da3916 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -141,6 +141,8 @@ typedef long (*dm_dax_direct_access_fn) (struct dm_target *ti, pgoff_t pgoff,
 		long nr_pages, void **kaddr, pfn_t *pfn);
 typedef size_t (*dm_dax_copy_iter_fn)(struct dm_target *ti, pgoff_t pgoff,
 		void *addr, size_t bytes, struct iov_iter *i);
+typedef int (*dm_dax_zero_page_range_fn)(struct dm_target *ti, pgoff_t pgoff,
+		size_t nr_pages);
 #define PAGE_SECTORS (PAGE_SIZE / 512)
 
 void dm_error(const char *message);
@@ -195,6 +197,7 @@ struct target_type {
 	dm_dax_direct_access_fn direct_access;
 	dm_dax_copy_iter_fn dax_copy_from_iter;
 	dm_dax_copy_iter_fn dax_copy_to_iter;
+	dm_dax_zero_page_range_fn dax_zero_page_range;
 
 	/* For internal device-mapper use. */
 	struct list_head list;
-- 
2.20.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v6 5/6] dax: Use new dax zero page method for zeroing a page
  2020-02-28 16:34 [PATCH v6 0/6] dax/pmem: Provide a dax operation to zero page range Vivek Goyal
                   ` (3 preceding siblings ...)
  2020-02-28 16:34 ` [PATCH v6 4/6] dm,dax: Add dax zero_page_range operation Vivek Goyal
@ 2020-02-28 16:34 ` Vivek Goyal
  2020-02-28 16:34 ` [PATCH v6 6/6] dax,iomap: Add helper dax_iomap_zero() to zero a range Vivek Goyal
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Vivek Goyal @ 2020-02-28 16:34 UTC (permalink / raw)
  To: linux-fsdevel, linux-nvdimm, hch, dan.j.williams; +Cc: david, dm-devel

Use new dax native zero page method for zeroing page if I/O is page
aligned. Otherwise fall back to direct_access() + memcpy().

This gets rid of one of the depenendency on block device in dax path.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/dax.c | 53 +++++++++++++++++++++++------------------------------
 1 file changed, 23 insertions(+), 30 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index 35da144375a0..98ba3756163a 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1038,47 +1038,40 @@ static vm_fault_t dax_load_hole(struct xa_state *xas,
 	return ret;
 }
 
-static bool dax_range_is_aligned(struct block_device *bdev,
-				 unsigned int offset, unsigned int length)
-{
-	unsigned short sector_size = bdev_logical_block_size(bdev);
-
-	if (!IS_ALIGNED(offset, sector_size))
-		return false;
-	if (!IS_ALIGNED(length, sector_size))
-		return false;
-
-	return true;
-}
-
 int __dax_zero_page_range(struct block_device *bdev,
 		struct dax_device *dax_dev, sector_t sector,
 		unsigned int offset, unsigned int size)
 {
-	if (dax_range_is_aligned(bdev, offset, size)) {
-		sector_t start_sector = sector + (offset >> 9);
+	pgoff_t pgoff;
+	long rc, id;
+	void *kaddr;
+	bool page_aligned = false;
 
-		return blkdev_issue_zeroout(bdev, start_sector,
-				size >> 9, GFP_NOFS, 0);
-	} else {
-		pgoff_t pgoff;
-		long rc, id;
-		void *kaddr;
 
-		rc = bdev_dax_pgoff(bdev, sector, PAGE_SIZE, &pgoff);
-		if (rc)
-			return rc;
+	if (IS_ALIGNED(sector << SECTOR_SHIFT, PAGE_SIZE) &&
+	    IS_ALIGNED(size, PAGE_SIZE))
+		page_aligned = true;
+
+	rc = bdev_dax_pgoff(bdev, sector, PAGE_SIZE, &pgoff);
+	if (rc)
+		return rc;
 
-		id = dax_read_lock();
+	id = dax_read_lock();
+
+	if (page_aligned)
+		rc = dax_zero_page_range(dax_dev, pgoff, size >> PAGE_SHIFT);
+	else
 		rc = dax_direct_access(dax_dev, pgoff, 1, &kaddr, NULL);
-		if (rc < 0) {
-			dax_read_unlock(id);
-			return rc;
-		}
+	if (rc < 0) {
+		dax_read_unlock(id);
+		return rc;
+	}
+
+	if (!page_aligned) {
 		memset(kaddr + offset, 0, size);
 		dax_flush(dax_dev, kaddr + offset, size);
-		dax_read_unlock(id);
 	}
+	dax_read_unlock(id);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(__dax_zero_page_range);
-- 
2.20.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v6 6/6] dax,iomap: Add helper dax_iomap_zero() to zero a range
  2020-02-28 16:34 [PATCH v6 0/6] dax/pmem: Provide a dax operation to zero page range Vivek Goyal
                   ` (4 preceding siblings ...)
  2020-02-28 16:34 ` [PATCH v6 5/6] dax: Use new dax zero page method for zeroing a page Vivek Goyal
@ 2020-02-28 16:34 ` Vivek Goyal
  2020-02-29  9:16 ` [PATCH v6 0/6] dax/pmem: Provide a dax operation to zero page range Pankaj Gupta
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Vivek Goyal @ 2020-02-28 16:34 UTC (permalink / raw)
  To: linux-fsdevel, linux-nvdimm, hch, dan.j.williams
  Cc: david, dm-devel, Christoph Hellwig

Add a helper dax_ioamp_zero() to zero a range. This patch basically
merges __dax_zero_page_range() and iomap_dax_zero().

Suggested-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/dax.c               | 16 ++++++++--------
 fs/iomap/buffered-io.c |  9 +--------
 include/linux/dax.h    | 17 +++--------------
 3 files changed, 12 insertions(+), 30 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index 98ba3756163a..11b16729b86f 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1038,10 +1038,10 @@ static vm_fault_t dax_load_hole(struct xa_state *xas,
 	return ret;
 }
 
-int __dax_zero_page_range(struct block_device *bdev,
-		struct dax_device *dax_dev, sector_t sector,
-		unsigned int offset, unsigned int size)
+int dax_iomap_zero(loff_t pos, unsigned offset, unsigned size,
+		   struct iomap *iomap)
 {
+	sector_t sector = iomap_sector(iomap, pos & PAGE_MASK);
 	pgoff_t pgoff;
 	long rc, id;
 	void *kaddr;
@@ -1052,16 +1052,17 @@ int __dax_zero_page_range(struct block_device *bdev,
 	    IS_ALIGNED(size, PAGE_SIZE))
 		page_aligned = true;
 
-	rc = bdev_dax_pgoff(bdev, sector, PAGE_SIZE, &pgoff);
+	rc = bdev_dax_pgoff(iomap->bdev, sector, PAGE_SIZE, &pgoff);
 	if (rc)
 		return rc;
 
 	id = dax_read_lock();
 
 	if (page_aligned)
-		rc = dax_zero_page_range(dax_dev, pgoff, size >> PAGE_SHIFT);
+		rc = dax_zero_page_range(iomap->dax_dev, pgoff,
+					 size >> PAGE_SHIFT);
 	else
-		rc = dax_direct_access(dax_dev, pgoff, 1, &kaddr, NULL);
+		rc = dax_direct_access(iomap->dax_dev, pgoff, 1, &kaddr, NULL);
 	if (rc < 0) {
 		dax_read_unlock(id);
 		return rc;
@@ -1069,12 +1070,11 @@ int __dax_zero_page_range(struct block_device *bdev,
 
 	if (!page_aligned) {
 		memset(kaddr + offset, 0, size);
-		dax_flush(dax_dev, kaddr + offset, size);
+		dax_flush(iomap->dax_dev, kaddr + offset, size);
 	}
 	dax_read_unlock(id);
 	return 0;
 }
-EXPORT_SYMBOL_GPL(__dax_zero_page_range);
 
 static loff_t
 dax_iomap_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 7c84c4c027c4..6f750da545e5 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -974,13 +974,6 @@ static int iomap_zero(struct inode *inode, loff_t pos, unsigned offset,
 	return iomap_write_end(inode, pos, bytes, bytes, page, iomap, srcmap);
 }
 
-static int iomap_dax_zero(loff_t pos, unsigned offset, unsigned bytes,
-		struct iomap *iomap)
-{
-	return __dax_zero_page_range(iomap->bdev, iomap->dax_dev,
-			iomap_sector(iomap, pos & PAGE_MASK), offset, bytes);
-}
-
 static loff_t
 iomap_zero_range_actor(struct inode *inode, loff_t pos, loff_t count,
 		void *data, struct iomap *iomap, struct iomap *srcmap)
@@ -1000,7 +993,7 @@ iomap_zero_range_actor(struct inode *inode, loff_t pos, loff_t count,
 		bytes = min_t(loff_t, PAGE_SIZE - offset, count);
 
 		if (IS_DAX(inode))
-			status = iomap_dax_zero(pos, offset, bytes, iomap);
+			status = dax_iomap_zero(pos, offset, bytes, iomap);
 		else
 			status = iomap_zero(inode, pos, offset, bytes, iomap,
 					srcmap);
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 71735c430c05..d7af5d243f24 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -13,6 +13,7 @@
 typedef unsigned long dax_entry_t;
 
 struct iomap_ops;
+struct iomap;
 struct dax_device;
 struct dax_operations {
 	/*
@@ -214,20 +215,8 @@ vm_fault_t dax_finish_sync_fault(struct vm_fault *vmf,
 int dax_delete_mapping_entry(struct address_space *mapping, pgoff_t index);
 int dax_invalidate_mapping_entry_sync(struct address_space *mapping,
 				      pgoff_t index);
-
-#ifdef CONFIG_FS_DAX
-int __dax_zero_page_range(struct block_device *bdev,
-		struct dax_device *dax_dev, sector_t sector,
-		unsigned int offset, unsigned int length);
-#else
-static inline int __dax_zero_page_range(struct block_device *bdev,
-		struct dax_device *dax_dev, sector_t sector,
-		unsigned int offset, unsigned int length)
-{
-	return -ENXIO;
-}
-#endif
-
+int dax_iomap_zero(loff_t pos, unsigned offset, unsigned size,
+			struct iomap *iomap);
 static inline bool dax_mapping(struct address_space *mapping)
 {
 	return mapping->host && IS_DAX(mapping->host);
-- 
2.20.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 1/6] pmem: Add functions for reading/writing page to/from pmem
  2020-02-28 16:34 ` [PATCH v6 1/6] pmem: Add functions for reading/writing page to/from pmem Vivek Goyal
@ 2020-02-29  8:04   ` Pankaj Gupta
  2020-03-04 16:16     ` Christoph Hellwig
  2020-03-04 18:44     ` Vivek Goyal
  0 siblings, 2 replies; 17+ messages in thread
From: Pankaj Gupta @ 2020-02-29  8:04 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: linux-fsdevel, linux-nvdimm, hch, david, dm-devel, Christoph Hellwig

On Fri, 28 Feb 2020 at 17:35, Vivek Goyal <vgoyal@redhat.com> wrote:
>
> This splits pmem_do_bvec() into pmem_do_read() and pmem_do_write().
> pmem_do_write() will be used by pmem zero_page_range() as well. Hence
> sharing the same code.
>
> Suggested-by: Christoph Hellwig <hch@infradead.org>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> ---
>  drivers/nvdimm/pmem.c | 86 +++++++++++++++++++++++++------------------
>  1 file changed, 50 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
> index 4eae441f86c9..075b11682192 100644
> --- a/drivers/nvdimm/pmem.c
> +++ b/drivers/nvdimm/pmem.c
> @@ -136,9 +136,25 @@ static blk_status_t read_pmem(struct page *page, unsigned int off,
>         return BLK_STS_OK;
>  }
>
> -static blk_status_t pmem_do_bvec(struct pmem_device *pmem, struct page *page,
> -                       unsigned int len, unsigned int off, unsigned int op,
> -                       sector_t sector)
> +static blk_status_t pmem_do_read(struct pmem_device *pmem,
> +                       struct page *page, unsigned int page_off,
> +                       sector_t sector, unsigned int len)
> +{
> +       blk_status_t rc;
> +       phys_addr_t pmem_off = sector * 512 + pmem->data_offset;

minor nit,  maybe 512 is replaced by macro? Looks like its used at multiple
places, maybe can keep at is for now.

> +       void *pmem_addr = pmem->virt_addr + pmem_off;
> +
> +       if (unlikely(is_bad_pmem(&pmem->bb, sector, len)))
> +               return BLK_STS_IOERR;
> +
> +       rc = read_pmem(page, page_off, pmem_addr, len);
> +       flush_dcache_page(page);
> +       return rc;
> +}
> +
> +static blk_status_t pmem_do_write(struct pmem_device *pmem,
> +                       struct page *page, unsigned int page_off,
> +                       sector_t sector, unsigned int len)
>  {
>         blk_status_t rc = BLK_STS_OK;
>         bool bad_pmem = false;
> @@ -148,34 +164,25 @@ static blk_status_t pmem_do_bvec(struct pmem_device *pmem, struct page *page,
>         if (unlikely(is_bad_pmem(&pmem->bb, sector, len)))
>                 bad_pmem = true;
>
> -       if (!op_is_write(op)) {
> -               if (unlikely(bad_pmem))
> -                       rc = BLK_STS_IOERR;
> -               else {
> -                       rc = read_pmem(page, off, pmem_addr, len);
> -                       flush_dcache_page(page);
> -               }
> -       } else {
> -               /*
> -                * Note that we write the data both before and after
> -                * clearing poison.  The write before clear poison
> -                * handles situations where the latest written data is
> -                * preserved and the clear poison operation simply marks
> -                * the address range as valid without changing the data.
> -                * In this case application software can assume that an
> -                * interrupted write will either return the new good
> -                * data or an error.
> -                *
> -                * However, if pmem_clear_poison() leaves the data in an
> -                * indeterminate state we need to perform the write
> -                * after clear poison.
> -                */
> -               flush_dcache_page(page);
> -               write_pmem(pmem_addr, page, off, len);
> -               if (unlikely(bad_pmem)) {
> -                       rc = pmem_clear_poison(pmem, pmem_off, len);
> -                       write_pmem(pmem_addr, page, off, len);
> -               }
> +       /*
> +        * Note that we write the data both before and after
> +        * clearing poison.  The write before clear poison
> +        * handles situations where the latest written data is
> +        * preserved and the clear poison operation simply marks
> +        * the address range as valid without changing the data.
> +        * In this case application software can assume that an
> +        * interrupted write will either return the new good
> +        * data or an error.
> +        *
> +        * However, if pmem_clear_poison() leaves the data in an
> +        * indeterminate state we need to perform the write
> +        * after clear poison.
> +        */
> +       flush_dcache_page(page);
> +       write_pmem(pmem_addr, page, page_off, len);
> +       if (unlikely(bad_pmem)) {
> +               rc = pmem_clear_poison(pmem, pmem_off, len);
> +               write_pmem(pmem_addr, page, page_off, len);
>         }
>
>         return rc;
> @@ -197,8 +204,12 @@ static blk_qc_t pmem_make_request(struct request_queue *q, struct bio *bio)
>
>         do_acct = nd_iostat_start(bio, &start);
>         bio_for_each_segment(bvec, bio, iter) {
> -               rc = pmem_do_bvec(pmem, bvec.bv_page, bvec.bv_len,
> -                               bvec.bv_offset, bio_op(bio), iter.bi_sector);
> +               if (op_is_write(bio_op(bio)))
> +                       rc = pmem_do_write(pmem, bvec.bv_page, bvec.bv_offset,
> +                               iter.bi_sector, bvec.bv_len);
> +               else
> +                       rc = pmem_do_read(pmem, bvec.bv_page, bvec.bv_offset,
> +                               iter.bi_sector, bvec.bv_len);
>                 if (rc) {
>                         bio->bi_status = rc;
>                         break;
> @@ -223,9 +234,12 @@ static int pmem_rw_page(struct block_device *bdev, sector_t sector,
>         struct pmem_device *pmem = bdev->bd_queue->queuedata;
>         blk_status_t rc;
>
> -       rc = pmem_do_bvec(pmem, page, hpage_nr_pages(page) * PAGE_SIZE,
> -                         0, op, sector);
> -
> +       if (op_is_write(op))
> +               rc = pmem_do_write(pmem, page, 0, sector,
> +                                  hpage_nr_pages(page) * PAGE_SIZE);
> +       else
> +               rc = pmem_do_read(pmem, page, 0, sector,
> +                                  hpage_nr_pages(page) * PAGE_SIZE);
>         /*
>          * The ->rw_page interface is subtle and tricky.  The core
>          * retries on any error, so we can only invoke page_endio() in
> --
> 2.20.1

Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> _______________________________________________
> Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
> To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 0/6] dax/pmem: Provide a dax operation to zero page range
  2020-02-28 16:34 [PATCH v6 0/6] dax/pmem: Provide a dax operation to zero page range Vivek Goyal
                   ` (5 preceding siblings ...)
  2020-02-28 16:34 ` [PATCH v6 6/6] dax,iomap: Add helper dax_iomap_zero() to zero a range Vivek Goyal
@ 2020-02-29  9:16 ` Pankaj Gupta
  2020-03-10 12:18 ` Vivek Goyal
  2020-04-01 16:11 ` [PATCH v6 7/6] dax: Move mandatory ->zero_page_range() check in alloc_dax() Vivek Goyal
  8 siblings, 0 replies; 17+ messages in thread
From: Pankaj Gupta @ 2020-02-29  9:16 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: linux-fsdevel, linux-nvdimm, hch, david, dm-devel

Hi Vivek,

>
> Hi,
>
> This is V6 of patches. These patches are also available at.

Looks like cover letter is missing the motivation for the patchset.
Though I found it in previous posting. Its good to add it in the series
for anyone joining the discussion at later stages.

Thanks,
Pankaj

>
> Changes since V5:
>
> - Dan Williams preferred ->zero_page_range() to only accept PAGE_SIZE
>   aligned request and clear poison only on page size aligned zeroing. So
>   I changed it accordingly.
>
> - Dropped all the modifications which were required to support arbitrary
>   range zeroing with-in a page.
>
> - This patch series also fixes the issue where "truncate -s 512 foo.txt"
>   will fail if first sector of file is poisoned. Currently it succeeds
>   and filesystem expectes whole of the filesystem block to be free of
>   poison at the end of the operation.
>
> Christoph, I have dropped your Reviewed-by tag on 1-2 patches because
> these patches changed substantially. Especially signature of of
> dax zero_page_range() helper.
>
> Thanks
> Vivek
>
> Vivek Goyal (6):
>   pmem: Add functions for reading/writing page to/from pmem
>   dax, pmem: Add a dax operation zero_page_range
>   s390,dcssblk,dax: Add dax zero_page_range operation to dcssblk driver
>   dm,dax: Add dax zero_page_range operation
>   dax: Use new dax zero page method for zeroing a page
>   dax,iomap: Add helper dax_iomap_zero() to zero a range
>
>  drivers/dax/super.c           | 20 ++++++++
>  drivers/md/dm-linear.c        | 18 +++++++
>  drivers/md/dm-log-writes.c    | 17 ++++++
>  drivers/md/dm-stripe.c        | 23 +++++++++
>  drivers/md/dm.c               | 30 +++++++++++
>  drivers/nvdimm/pmem.c         | 97 ++++++++++++++++++++++-------------
>  drivers/s390/block/dcssblk.c  | 15 ++++++
>  fs/dax.c                      | 59 ++++++++++-----------
>  fs/iomap/buffered-io.c        |  9 +---
>  include/linux/dax.h           | 21 +++-----
>  include/linux/device-mapper.h |  3 ++
>  11 files changed, 221 insertions(+), 91 deletions(-)
>
> --
> 2.20.1
> _______________________________________________
> Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
> To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 2/6] dax, pmem: Add a dax operation zero_page_range
  2020-02-28 16:34 ` [PATCH v6 2/6] dax, pmem: Add a dax operation zero_page_range Vivek Goyal
@ 2020-02-29  9:21   ` Pankaj Gupta
  0 siblings, 0 replies; 17+ messages in thread
From: Pankaj Gupta @ 2020-02-29  9:21 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: linux-fsdevel, linux-nvdimm, hch, david, dm-devel

>
> Add a dax operation zero_page_range, to zero a page. This will also clear any
> known poison in the page being zeroed.
>
> As of now, zeroing of one page is allowed in a single call. There
> are no callers which are trying to zero more than a page in a single call.
> Once we grow the callers which zero more than a page in single call, we
> can add that support. Primary reason for not doing that yet is that this
> will add little complexity in dm implementation where a range might be
> spanning multiple underlying targets and one will have to split the range
> into multiple sub ranges and call zero_page_range() on individual targets.
>
> Suggested-by: Christoph Hellwig <hch@infradead.org>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> ---
>  drivers/dax/super.c   | 20 ++++++++++++++++++++
>  drivers/nvdimm/pmem.c | 11 +++++++++++
>  include/linux/dax.h   |  4 ++++
>  3 files changed, 35 insertions(+)
>
> diff --git a/drivers/dax/super.c b/drivers/dax/super.c
> index 0aa4b6bc5101..e498daf3c0d7 100644
> --- a/drivers/dax/super.c
> +++ b/drivers/dax/super.c
> @@ -344,6 +344,26 @@ size_t dax_copy_to_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
>  }
>  EXPORT_SYMBOL_GPL(dax_copy_to_iter);
>
> +int dax_zero_page_range(struct dax_device *dax_dev, pgoff_t pgoff,
> +                       size_t nr_pages)
> +{
> +       if (!dax_alive(dax_dev))
> +               return -ENXIO;
> +
> +       if (!dax_dev->ops->zero_page_range)
> +               return -EOPNOTSUPP;
> +       /*
> +        * There are no callers that want to zero more than one page as of now.
> +        * Once users are there, this check can be removed after the
> +        * device mapper code has been updated to split ranges across targets.
> +        */
> +       if (nr_pages != 1)
> +               return -EIO;
> +
> +       return dax_dev->ops->zero_page_range(dax_dev, pgoff, nr_pages);
> +}
> +EXPORT_SYMBOL_GPL(dax_zero_page_range);
> +
>  #ifdef CONFIG_ARCH_HAS_PMEM_API
>  void arch_wb_cache_pmem(void *addr, size_t size);
>  void dax_flush(struct dax_device *dax_dev, void *addr, size_t size)
> diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
> index 075b11682192..5b774ddd0efb 100644
> --- a/drivers/nvdimm/pmem.c
> +++ b/drivers/nvdimm/pmem.c
> @@ -282,6 +282,16 @@ static const struct block_device_operations pmem_fops = {
>         .revalidate_disk =      nvdimm_revalidate_disk,
>  };
>
> +static int pmem_dax_zero_page_range(struct dax_device *dax_dev, pgoff_t pgoff,
> +                                   size_t nr_pages)
> +{
> +       struct pmem_device *pmem = dax_get_private(dax_dev);
> +
> +       return blk_status_to_errno(pmem_do_write(pmem, ZERO_PAGE(0), 0,
> +                                  PFN_PHYS(pgoff) >> SECTOR_SHIFT,
> +                                  PAGE_SIZE));
> +}
> +
>  static long pmem_dax_direct_access(struct dax_device *dax_dev,
>                 pgoff_t pgoff, long nr_pages, void **kaddr, pfn_t *pfn)
>  {
> @@ -313,6 +323,7 @@ static const struct dax_operations pmem_dax_ops = {
>         .dax_supported = generic_fsdax_supported,
>         .copy_from_iter = pmem_copy_from_iter,
>         .copy_to_iter = pmem_copy_to_iter,
> +       .zero_page_range = pmem_dax_zero_page_range,
>  };
>
>  static const struct attribute_group *pmem_attribute_groups[] = {
> diff --git a/include/linux/dax.h b/include/linux/dax.h
> index 328c2dbb4409..71735c430c05 100644
> --- a/include/linux/dax.h
> +++ b/include/linux/dax.h
> @@ -34,6 +34,8 @@ struct dax_operations {
>         /* copy_to_iter: required operation for fs-dax direct-i/o */
>         size_t (*copy_to_iter)(struct dax_device *, pgoff_t, void *, size_t,
>                         struct iov_iter *);
> +       /* zero_page_range: required operation. Zero page range   */
> +       int (*zero_page_range)(struct dax_device *, pgoff_t, size_t);
>  };
>
>  extern struct attribute_group dax_attribute_group;
> @@ -199,6 +201,8 @@ size_t dax_copy_from_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
>                 size_t bytes, struct iov_iter *i);
>  size_t dax_copy_to_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
>                 size_t bytes, struct iov_iter *i);
> +int dax_zero_page_range(struct dax_device *dax_dev, pgoff_t pgoff,
> +                       size_t nr_pages);
>  void dax_flush(struct dax_device *dax_dev, void *addr, size_t size);
>
>  ssize_t dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
> --
> 2.20.1

Zeroing single page seems right approach for now.
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> _______________________________________________
> Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
> To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 1/6] pmem: Add functions for reading/writing page to/from pmem
  2020-02-29  8:04   ` Pankaj Gupta
@ 2020-03-04 16:16     ` Christoph Hellwig
  2020-03-04 18:44     ` Vivek Goyal
  1 sibling, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2020-03-04 16:16 UTC (permalink / raw)
  To: Pankaj Gupta
  Cc: linux-fsdevel, linux-nvdimm, hch, david, dm-devel, Christoph Hellwig

On Sat, Feb 29, 2020 at 09:04:00AM +0100, Pankaj Gupta wrote:
> > +       phys_addr_t pmem_off = sector * 512 + pmem->data_offset;
> 
> minor nit,  maybe 512 is replaced by macro? Looks like its used at multiple
> places, maybe can keep at is for now.

That would be the existing SECTOR_SIZE macro.
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 1/6] pmem: Add functions for reading/writing page to/from pmem
  2020-02-29  8:04   ` Pankaj Gupta
  2020-03-04 16:16     ` Christoph Hellwig
@ 2020-03-04 18:44     ` Vivek Goyal
  1 sibling, 0 replies; 17+ messages in thread
From: Vivek Goyal @ 2020-03-04 18:44 UTC (permalink / raw)
  To: Pankaj Gupta
  Cc: linux-fsdevel, linux-nvdimm, hch, david, dm-devel, Christoph Hellwig

On Sat, Feb 29, 2020 at 09:04:00AM +0100, Pankaj Gupta wrote:
> On Fri, 28 Feb 2020 at 17:35, Vivek Goyal <vgoyal@redhat.com> wrote:
> >
> > This splits pmem_do_bvec() into pmem_do_read() and pmem_do_write().
> > pmem_do_write() will be used by pmem zero_page_range() as well. Hence
> > sharing the same code.
> >
> > Suggested-by: Christoph Hellwig <hch@infradead.org>
> > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> > ---
> >  drivers/nvdimm/pmem.c | 86 +++++++++++++++++++++++++------------------
> >  1 file changed, 50 insertions(+), 36 deletions(-)
> >
> > diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
> > index 4eae441f86c9..075b11682192 100644
> > --- a/drivers/nvdimm/pmem.c
> > +++ b/drivers/nvdimm/pmem.c
> > @@ -136,9 +136,25 @@ static blk_status_t read_pmem(struct page *page, unsigned int off,
> >         return BLK_STS_OK;
> >  }
> >
> > -static blk_status_t pmem_do_bvec(struct pmem_device *pmem, struct page *page,
> > -                       unsigned int len, unsigned int off, unsigned int op,
> > -                       sector_t sector)
> > +static blk_status_t pmem_do_read(struct pmem_device *pmem,
> > +                       struct page *page, unsigned int page_off,
> > +                       sector_t sector, unsigned int len)
> > +{
> > +       blk_status_t rc;
> > +       phys_addr_t pmem_off = sector * 512 + pmem->data_offset;
> 
> minor nit,  maybe 512 is replaced by macro? Looks like its used at multiple
> places, maybe can keep at is for now.

This came from existing code. If I end up spinning this patch series
again, I will replace it with (sector << SECTOR_SHIFT).

Thanks
Vivek
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 0/6] dax/pmem: Provide a dax operation to zero page range
  2020-02-28 16:34 [PATCH v6 0/6] dax/pmem: Provide a dax operation to zero page range Vivek Goyal
                   ` (6 preceding siblings ...)
  2020-02-29  9:16 ` [PATCH v6 0/6] dax/pmem: Provide a dax operation to zero page range Pankaj Gupta
@ 2020-03-10 12:18 ` Vivek Goyal
  2020-04-01 16:11 ` [PATCH v6 7/6] dax: Move mandatory ->zero_page_range() check in alloc_dax() Vivek Goyal
  8 siblings, 0 replies; 17+ messages in thread
From: Vivek Goyal @ 2020-03-10 12:18 UTC (permalink / raw)
  To: linux-fsdevel, linux-nvdimm, hch, dan.j.williams; +Cc: david, dm-devel

On Fri, Feb 28, 2020 at 11:34:50AM -0500, Vivek Goyal wrote:
> Hi,
> 
> This is V6 of patches. These patches are also available at.

Hi Dan,

Ping. Does this patch series look fine to you?

Vivek

> 
> Changes since V5:
> 
> - Dan Williams preferred ->zero_page_range() to only accept PAGE_SIZE
>   aligned request and clear poison only on page size aligned zeroing. So
>   I changed it accordingly. 
> 
> - Dropped all the modifications which were required to support arbitrary
>   range zeroing with-in a page.
> 
> - This patch series also fixes the issue where "truncate -s 512 foo.txt"
>   will fail if first sector of file is poisoned. Currently it succeeds
>   and filesystem expectes whole of the filesystem block to be free of
>   poison at the end of the operation.
> 
> Christoph, I have dropped your Reviewed-by tag on 1-2 patches because
> these patches changed substantially. Especially signature of of
> dax zero_page_range() helper.
> 
> Thanks
> Vivek
> 
> Vivek Goyal (6):
>   pmem: Add functions for reading/writing page to/from pmem
>   dax, pmem: Add a dax operation zero_page_range
>   s390,dcssblk,dax: Add dax zero_page_range operation to dcssblk driver
>   dm,dax: Add dax zero_page_range operation
>   dax: Use new dax zero page method for zeroing a page
>   dax,iomap: Add helper dax_iomap_zero() to zero a range
> 
>  drivers/dax/super.c           | 20 ++++++++
>  drivers/md/dm-linear.c        | 18 +++++++
>  drivers/md/dm-log-writes.c    | 17 ++++++
>  drivers/md/dm-stripe.c        | 23 +++++++++
>  drivers/md/dm.c               | 30 +++++++++++
>  drivers/nvdimm/pmem.c         | 97 ++++++++++++++++++++++-------------
>  drivers/s390/block/dcssblk.c  | 15 ++++++
>  fs/dax.c                      | 59 ++++++++++-----------
>  fs/iomap/buffered-io.c        |  9 +---
>  include/linux/dax.h           | 21 +++-----
>  include/linux/device-mapper.h |  3 ++
>  11 files changed, 221 insertions(+), 91 deletions(-)
> 
> -- 
> 2.20.1
> 
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 4/6] dm,dax: Add dax zero_page_range operation
  2020-02-28 16:34 ` [PATCH v6 4/6] dm,dax: Add dax zero_page_range operation Vivek Goyal
@ 2020-03-31 19:34   ` Dan Williams
  2020-04-03  0:49     ` Dan Williams
  2020-04-03  1:02     ` Mike Snitzer
  0 siblings, 2 replies; 17+ messages in thread
From: Dan Williams @ 2020-03-31 19:34 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: linux-fsdevel, linux-nvdimm, Christoph Hellwig, david,
	device-mapper development, Mike Snitzer

[ Add Mike ]

On Fri, Feb 28, 2020 at 8:35 AM Vivek Goyal <vgoyal@redhat.com> wrote:
>
> This patch adds support for dax zero_page_range operation to dm targets.

Mike,

Sorry, I should have pinged you earlier, but could you take a look at
this patch and ack it if it looks ok to go through the nvdimm tree
with the rest of the series?

>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> ---
>  drivers/md/dm-linear.c        | 18 ++++++++++++++++++
>  drivers/md/dm-log-writes.c    | 17 +++++++++++++++++
>  drivers/md/dm-stripe.c        | 23 +++++++++++++++++++++++
>  drivers/md/dm.c               | 30 ++++++++++++++++++++++++++++++
>  include/linux/device-mapper.h |  3 +++
>  5 files changed, 91 insertions(+)
>
> diff --git a/drivers/md/dm-linear.c b/drivers/md/dm-linear.c
> index 8d07fdf63a47..e1db43446327 100644
> --- a/drivers/md/dm-linear.c
> +++ b/drivers/md/dm-linear.c
> @@ -201,10 +201,27 @@ static size_t linear_dax_copy_to_iter(struct dm_target *ti, pgoff_t pgoff,
>         return dax_copy_to_iter(dax_dev, pgoff, addr, bytes, i);
>  }
>
> +static int linear_dax_zero_page_range(struct dm_target *ti, pgoff_t pgoff,
> +                                     size_t nr_pages)
> +{
> +       int ret;
> +       struct linear_c *lc = ti->private;
> +       struct block_device *bdev = lc->dev->bdev;
> +       struct dax_device *dax_dev = lc->dev->dax_dev;
> +       sector_t dev_sector, sector = pgoff * PAGE_SECTORS;
> +
> +       dev_sector = linear_map_sector(ti, sector);
> +       ret = bdev_dax_pgoff(bdev, dev_sector, nr_pages << PAGE_SHIFT, &pgoff);
> +       if (ret)
> +               return ret;
> +       return dax_zero_page_range(dax_dev, pgoff, nr_pages);
> +}
> +
>  #else
>  #define linear_dax_direct_access NULL
>  #define linear_dax_copy_from_iter NULL
>  #define linear_dax_copy_to_iter NULL
> +#define linear_dax_zero_page_range NULL
>  #endif
>
>  static struct target_type linear_target = {
> @@ -226,6 +243,7 @@ static struct target_type linear_target = {
>         .direct_access = linear_dax_direct_access,
>         .dax_copy_from_iter = linear_dax_copy_from_iter,
>         .dax_copy_to_iter = linear_dax_copy_to_iter,
> +       .dax_zero_page_range = linear_dax_zero_page_range,
>  };
>
>  int __init dm_linear_init(void)
> diff --git a/drivers/md/dm-log-writes.c b/drivers/md/dm-log-writes.c
> index 99721c76225d..8ea20b56b4d6 100644
> --- a/drivers/md/dm-log-writes.c
> +++ b/drivers/md/dm-log-writes.c
> @@ -994,10 +994,26 @@ static size_t log_writes_dax_copy_to_iter(struct dm_target *ti,
>         return dax_copy_to_iter(lc->dev->dax_dev, pgoff, addr, bytes, i);
>  }
>
> +static int log_writes_dax_zero_page_range(struct dm_target *ti, pgoff_t pgoff,
> +                                         size_t nr_pages)
> +{
> +       int ret;
> +       struct log_writes_c *lc = ti->private;
> +       sector_t sector = pgoff * PAGE_SECTORS;
> +
> +       ret = bdev_dax_pgoff(lc->dev->bdev, sector, nr_pages << PAGE_SHIFT,
> +                            &pgoff);
> +       if (ret)
> +               return ret;
> +       return dax_zero_page_range(lc->dev->dax_dev, pgoff,
> +                                  nr_pages << PAGE_SHIFT);
> +}
> +
>  #else
>  #define log_writes_dax_direct_access NULL
>  #define log_writes_dax_copy_from_iter NULL
>  #define log_writes_dax_copy_to_iter NULL
> +#define log_writes_dax_zero_page_range NULL
>  #endif
>
>  static struct target_type log_writes_target = {
> @@ -1016,6 +1032,7 @@ static struct target_type log_writes_target = {
>         .direct_access = log_writes_dax_direct_access,
>         .dax_copy_from_iter = log_writes_dax_copy_from_iter,
>         .dax_copy_to_iter = log_writes_dax_copy_to_iter,
> +       .dax_zero_page_range = log_writes_dax_zero_page_range,
>  };
>
>  static int __init dm_log_writes_init(void)
> diff --git a/drivers/md/dm-stripe.c b/drivers/md/dm-stripe.c
> index 63bbcc20f49a..fa813c0f993d 100644
> --- a/drivers/md/dm-stripe.c
> +++ b/drivers/md/dm-stripe.c
> @@ -360,10 +360,32 @@ static size_t stripe_dax_copy_to_iter(struct dm_target *ti, pgoff_t pgoff,
>         return dax_copy_to_iter(dax_dev, pgoff, addr, bytes, i);
>  }
>
> +static int stripe_dax_zero_page_range(struct dm_target *ti, pgoff_t pgoff,
> +                                     size_t nr_pages)
> +{
> +       int ret;
> +       sector_t dev_sector, sector = pgoff * PAGE_SECTORS;
> +       struct stripe_c *sc = ti->private;
> +       struct dax_device *dax_dev;
> +       struct block_device *bdev;
> +       uint32_t stripe;
> +
> +       stripe_map_sector(sc, sector, &stripe, &dev_sector);
> +       dev_sector += sc->stripe[stripe].physical_start;
> +       dax_dev = sc->stripe[stripe].dev->dax_dev;
> +       bdev = sc->stripe[stripe].dev->bdev;
> +
> +       ret = bdev_dax_pgoff(bdev, dev_sector, nr_pages << PAGE_SHIFT, &pgoff);
> +       if (ret)
> +               return ret;
> +       return dax_zero_page_range(dax_dev, pgoff, nr_pages);
> +}
> +
>  #else
>  #define stripe_dax_direct_access NULL
>  #define stripe_dax_copy_from_iter NULL
>  #define stripe_dax_copy_to_iter NULL
> +#define stripe_dax_zero_page_range NULL
>  #endif
>
>  /*
> @@ -486,6 +508,7 @@ static struct target_type stripe_target = {
>         .direct_access = stripe_dax_direct_access,
>         .dax_copy_from_iter = stripe_dax_copy_from_iter,
>         .dax_copy_to_iter = stripe_dax_copy_to_iter,
> +       .dax_zero_page_range = stripe_dax_zero_page_range,
>  };
>
>  int __init dm_stripe_init(void)
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index b89f07ee2eff..aa72d9e757c1 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1198,6 +1198,35 @@ static size_t dm_dax_copy_to_iter(struct dax_device *dax_dev, pgoff_t pgoff,
>         return ret;
>  }
>
> +static int dm_dax_zero_page_range(struct dax_device *dax_dev, pgoff_t pgoff,
> +                                 size_t nr_pages)
> +{
> +       struct mapped_device *md = dax_get_private(dax_dev);
> +       sector_t sector = pgoff * PAGE_SECTORS;
> +       struct dm_target *ti;
> +       int ret = -EIO;
> +       int srcu_idx;
> +
> +       ti = dm_dax_get_live_target(md, sector, &srcu_idx);
> +
> +       if (!ti)
> +               goto out;
> +       if (WARN_ON(!ti->type->dax_zero_page_range)) {
> +               /*
> +                * ->zero_page_range() is mandatory dax operation. If we are
> +                *  here, something is wrong.
> +                */
> +               dm_put_live_table(md, srcu_idx);
> +               goto out;
> +       }
> +       ret = ti->type->dax_zero_page_range(ti, pgoff, nr_pages);
> +
> + out:
> +       dm_put_live_table(md, srcu_idx);
> +
> +       return ret;
> +}
> +
>  /*
>   * A target may call dm_accept_partial_bio only from the map routine.  It is
>   * allowed for all bio types except REQ_PREFLUSH, REQ_OP_ZONE_RESET,
> @@ -3199,6 +3228,7 @@ static const struct dax_operations dm_dax_ops = {
>         .dax_supported = dm_dax_supported,
>         .copy_from_iter = dm_dax_copy_from_iter,
>         .copy_to_iter = dm_dax_copy_to_iter,
> +       .zero_page_range = dm_dax_zero_page_range,
>  };
>
>  /*
> diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
> index 475668c69dbc..af48d9da3916 100644
> --- a/include/linux/device-mapper.h
> +++ b/include/linux/device-mapper.h
> @@ -141,6 +141,8 @@ typedef long (*dm_dax_direct_access_fn) (struct dm_target *ti, pgoff_t pgoff,
>                 long nr_pages, void **kaddr, pfn_t *pfn);
>  typedef size_t (*dm_dax_copy_iter_fn)(struct dm_target *ti, pgoff_t pgoff,
>                 void *addr, size_t bytes, struct iov_iter *i);
> +typedef int (*dm_dax_zero_page_range_fn)(struct dm_target *ti, pgoff_t pgoff,
> +               size_t nr_pages);
>  #define PAGE_SECTORS (PAGE_SIZE / 512)
>
>  void dm_error(const char *message);
> @@ -195,6 +197,7 @@ struct target_type {
>         dm_dax_direct_access_fn direct_access;
>         dm_dax_copy_iter_fn dax_copy_from_iter;
>         dm_dax_copy_iter_fn dax_copy_to_iter;
> +       dm_dax_zero_page_range_fn dax_zero_page_range;
>
>         /* For internal device-mapper use. */
>         struct list_head list;
> --
> 2.20.1
>
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v6 7/6] dax: Move mandatory ->zero_page_range() check in alloc_dax()
  2020-02-28 16:34 [PATCH v6 0/6] dax/pmem: Provide a dax operation to zero page range Vivek Goyal
                   ` (7 preceding siblings ...)
  2020-03-10 12:18 ` Vivek Goyal
@ 2020-04-01 16:11 ` Vivek Goyal
  8 siblings, 0 replies; 17+ messages in thread
From: Vivek Goyal @ 2020-04-01 16:11 UTC (permalink / raw)
  To: linux-fsdevel, linux-nvdimm, hch, dan.j.williams
  Cc: david, dm-devel, gerald.schaefer

zero_page_range() dax operation is mandatory for dax devices. Right now
that check happens in dax_zero_page_range() function. Dan thinks that's
too late and its better to do the check earlier in alloc_dax().

I also modified alloc_dax() to return pointer with error code in it in
case of failure. Right now it returns NULL and caller assumes failure
happened due to -ENOMEM. But with this ->zero_page_range() check, I
need to return -EINVAL instead.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 drivers/dax/bus.c            |    4 +++-
 drivers/dax/super.c          |   14 +++++++++-----
 drivers/md/dm.c              |    2 +-
 drivers/nvdimm/pmem.c        |    4 ++--
 drivers/s390/block/dcssblk.c |    5 +++--
 5 files changed, 18 insertions(+), 11 deletions(-)

Index: redhat-linux/drivers/dax/super.c
===================================================================
--- redhat-linux.orig/drivers/dax/super.c	2020-04-01 12:03:39.911439769 -0400
+++ redhat-linux/drivers/dax/super.c	2020-04-01 12:05:31.727439769 -0400
@@ -349,9 +349,6 @@ int dax_zero_page_range(struct dax_devic
 {
 	if (!dax_alive(dax_dev))
 		return -ENXIO;
-
-	if (!dax_dev->ops->zero_page_range)
-		return -EOPNOTSUPP;
 	/*
 	 * There are no callers that want to zero more than one page as of now.
 	 * Once users are there, this check can be removed after the
@@ -571,9 +568,16 @@ struct dax_device *alloc_dax(void *priva
 	dev_t devt;
 	int minor;
 
+	if (ops && !ops->zero_page_range) {
+		pr_debug("%s: error: device does not provide dax"
+			 " operation zero_page_range()\n",
+			 __host ? __host : "Unknown");
+		return ERR_PTR(-EINVAL);
+	}
+
 	host = kstrdup(__host, GFP_KERNEL);
 	if (__host && !host)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 
 	minor = ida_simple_get(&dax_minor_ida, 0, MINORMASK+1, GFP_KERNEL);
 	if (minor < 0)
@@ -596,7 +600,7 @@ struct dax_device *alloc_dax(void *priva
 	ida_simple_remove(&dax_minor_ida, minor);
  err_minor:
 	kfree(host);
-	return NULL;
+	return ERR_PTR(-ENOMEM);
 }
 EXPORT_SYMBOL_GPL(alloc_dax);
 
Index: redhat-linux/drivers/nvdimm/pmem.c
===================================================================
--- redhat-linux.orig/drivers/nvdimm/pmem.c	2020-04-01 12:03:39.911439769 -0400
+++ redhat-linux/drivers/nvdimm/pmem.c	2020-04-01 12:05:31.729439769 -0400
@@ -487,9 +487,9 @@ static int pmem_attach_disk(struct devic
 	if (is_nvdimm_sync(nd_region))
 		flags = DAXDEV_F_SYNC;
 	dax_dev = alloc_dax(pmem, disk->disk_name, &pmem_dax_ops, flags);
-	if (!dax_dev) {
+	if (IS_ERR(dax_dev)) {
 		put_disk(disk);
-		return -ENOMEM;
+		return PTR_ERR(dax_dev);
 	}
 	dax_write_cache(dax_dev, nvdimm_has_cache(nd_region));
 	pmem->dax_dev = dax_dev;
Index: redhat-linux/drivers/dax/bus.c
===================================================================
--- redhat-linux.orig/drivers/dax/bus.c	2020-04-01 12:03:39.911439769 -0400
+++ redhat-linux/drivers/dax/bus.c	2020-04-01 12:05:31.729439769 -0400
@@ -421,8 +421,10 @@ struct dev_dax *__devm_create_dev_dax(st
 	 * device outside of mmap of the resulting character device.
 	 */
 	dax_dev = alloc_dax(dev_dax, NULL, NULL, DAXDEV_F_SYNC);
-	if (!dax_dev)
+	if (IS_ERR(dax_dev)) {
+		rc = PTR_ERR(dax_dev);
 		goto err;
+	}
 
 	/* a device_dax instance is dead while the driver is not attached */
 	kill_dax(dax_dev);
Index: redhat-linux/drivers/s390/block/dcssblk.c
===================================================================
--- redhat-linux.orig/drivers/s390/block/dcssblk.c	2020-04-01 12:03:39.911439769 -0400
+++ redhat-linux/drivers/s390/block/dcssblk.c	2020-04-01 12:05:31.730439769 -0400
@@ -695,8 +695,9 @@ dcssblk_add_store(struct device *dev, st
 
 	dev_info->dax_dev = alloc_dax(dev_info, dev_info->gd->disk_name,
 			&dcssblk_dax_ops, DAXDEV_F_SYNC);
-	if (!dev_info->dax_dev) {
-		rc = -ENOMEM;
+	if (IS_ERR(dev_info->dax_dev)) {
+		rc = PTR_ERR(dev_info->dax_dev);
+		dev_info->dax_dev = NULL;
 		goto put_dev;
 	}
 
Index: redhat-linux/drivers/md/dm.c
===================================================================
--- redhat-linux.orig/drivers/md/dm.c	2020-04-01 12:03:39.911439769 -0400
+++ redhat-linux/drivers/md/dm.c	2020-04-01 12:05:31.732439769 -0400
@@ -2005,7 +2005,7 @@ static struct mapped_device *alloc_dev(i
 	if (IS_ENABLED(CONFIG_DAX_DRIVER)) {
 		md->dax_dev = alloc_dax(md, md->disk->disk_name,
 					&dm_dax_ops, 0);
-		if (!md->dax_dev)
+		if (IS_ERR(md->dax_dev))
 			goto bad;
 	}
 
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 4/6] dm,dax: Add dax zero_page_range operation
  2020-03-31 19:34   ` Dan Williams
@ 2020-04-03  0:49     ` Dan Williams
  2020-04-03  1:02     ` Mike Snitzer
  1 sibling, 0 replies; 17+ messages in thread
From: Dan Williams @ 2020-04-03  0:49 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: linux-fsdevel, linux-nvdimm, Christoph Hellwig, david,
	device-mapper development, Mike Snitzer

On Tue, Mar 31, 2020 at 12:34 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> [ Add Mike ]
>
> On Fri, Feb 28, 2020 at 8:35 AM Vivek Goyal <vgoyal@redhat.com> wrote:
> >
> > This patch adds support for dax zero_page_range operation to dm targets.
>
> Mike,
>
> Sorry, I should have pinged you earlier, but could you take a look at
> this patch and ack it if it looks ok to go through the nvdimm tree
> with the rest of the series?

I'm going to proceed with pushing this into -next to get more exposure
and check back later next week about pushing it to Linus. It looks
good to me and it unblocks Vivek on his virtio-fs work, but still
don't want to push device-mapper bits without device-mapper maintainer
Ack.
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 4/6] dm,dax: Add dax zero_page_range operation
  2020-03-31 19:34   ` Dan Williams
  2020-04-03  0:49     ` Dan Williams
@ 2020-04-03  1:02     ` Mike Snitzer
  1 sibling, 0 replies; 17+ messages in thread
From: Mike Snitzer @ 2020-04-03  1:02 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-fsdevel, linux-nvdimm, Christoph Hellwig, david,
	device-mapper development

On Tue, Mar 31 2020 at  3:34pm -0400,
Dan Williams <dan.j.williams@intel.com> wrote:

> [ Add Mike ]
> 
> On Fri, Feb 28, 2020 at 8:35 AM Vivek Goyal <vgoyal@redhat.com> wrote:
> >
> > This patch adds support for dax zero_page_range operation to dm targets.
> 
> Mike,
> 
> Sorry, I should have pinged you earlier, but could you take a look at
> this patch and ack it if it looks ok to go through the nvdimm tree
> with the rest of the series?

Yes, looks fine to me.

Acked-by: Mike Snitzer <snitzer@redhat.com>
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2020-04-03  1:02 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-28 16:34 [PATCH v6 0/6] dax/pmem: Provide a dax operation to zero page range Vivek Goyal
2020-02-28 16:34 ` [PATCH v6 1/6] pmem: Add functions for reading/writing page to/from pmem Vivek Goyal
2020-02-29  8:04   ` Pankaj Gupta
2020-03-04 16:16     ` Christoph Hellwig
2020-03-04 18:44     ` Vivek Goyal
2020-02-28 16:34 ` [PATCH v6 2/6] dax, pmem: Add a dax operation zero_page_range Vivek Goyal
2020-02-29  9:21   ` Pankaj Gupta
2020-02-28 16:34 ` [PATCH v6 3/6] s390,dcssblk,dax: Add dax zero_page_range operation to dcssblk driver Vivek Goyal
2020-02-28 16:34 ` [PATCH v6 4/6] dm,dax: Add dax zero_page_range operation Vivek Goyal
2020-03-31 19:34   ` Dan Williams
2020-04-03  0:49     ` Dan Williams
2020-04-03  1:02     ` Mike Snitzer
2020-02-28 16:34 ` [PATCH v6 5/6] dax: Use new dax zero page method for zeroing a page Vivek Goyal
2020-02-28 16:34 ` [PATCH v6 6/6] dax,iomap: Add helper dax_iomap_zero() to zero a range Vivek Goyal
2020-02-29  9:16 ` [PATCH v6 0/6] dax/pmem: Provide a dax operation to zero page range Pankaj Gupta
2020-03-10 12:18 ` Vivek Goyal
2020-04-01 16:11 ` [PATCH v6 7/6] dax: Move mandatory ->zero_page_range() check in alloc_dax() Vivek Goyal

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).