nvdimm.lists.linux.dev archive mirror
* [RFC PATCH v2 0/3] realize dax_operations for dm-snapshot
@ 2018-11-21  3:26 Huaisheng Ye
  2018-11-21  3:27 ` [RFC PATCH v2 1/3] dm: enable " Huaisheng Ye
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Huaisheng Ye @ 2018-11-21  3:26 UTC (permalink / raw)
  To: linux-nvdimm, agk, snitzer, dm-devel, dan.j.williams, willy,
	zwisler, jack, dave.jiang, vishal.l.verma
  Cc: linux-fsdevel, linux-kernel, chengnt, Huaisheng Ye

From: Huaisheng Ye <yehs1@lenovo.com>

Changes
v1->v2:
	Add NULL definitions for origin_dax_direct_access and
	origin_dax_copy_from/to_iter to avoid a build error
	when CONFIG_DAX_DRIVER is not enabled.

[v1]: https://lkml.org/lkml/2018/11/20/759

This patch series implements the dax_operations for dm-snapshot on
persistent memory devices.

The following steps verify the functionality.

1. configure the persistent memory in fs-dax mode and create a namespace with ndctl;
2. find the resulting block device in /dev;
  # ndctl list
  {
    "dev":"namespace0.0",
    "mode":"fsdax",
    "map":"dev",
    "size":132118478848,
    "sector_size":512,
    "blockdev":"pmem0",
    "name":"yhs_pmem0",
    "numa_node":0
  },
3. create the logical volume lv_pmem (4G here) for testing;
   # pvcreate /dev/pmem0
   # vgcreate vg_pmem /dev/pmem0
   # lvcreate  -L 4G -n lv_pmem vg_pmem
4. create a filesystem (ext2 or ext4) on /dev/vg_pmem/lv_pmem;
   # mkfs.ext2 -b 4096 /dev/vg_pmem/lv_pmem
5. mount the logical volume with DAX enabled;
   # mkdir /mnt/lv_pmem
   # mount -o dax /dev/vg_pmem/lv_pmem /mnt/lv_pmem/
6. copy some files to /mnt/lv_pmem;
   # cp linear_table03.log /mnt/lv_pmem/
   # cp test0.log /mnt/lv_pmem/
7. create a snapshot for the test (limited to 1G here);
   # lvcreate -L 1G -n snap_pmem -s /dev/vg_pmem/lv_pmem
8. modify the copied files with vim, or copy in more new files;
   # vim /mnt/lv_pmem/test0.log
9. unmount the pmem device;
   # umount /mnt/lv_pmem/
10. merge the snapshot back into the origin;
   # lvconvert --merge /dev/vg_pmem/snap_pmem
11. mount the pmem device again and check the file contents;
   # mount -o dax /dev/vg_pmem/lv_pmem /mnt/lv_pmem/

Huaisheng Ye (3):
  dm: enable dax_operations for dm-snapshot
  dm: expand hc_map in mapped_device for lack of map
  dm: expand valid types for dm-ioctl

 drivers/md/dm-core.h  |  1 +
 drivers/md/dm-ioctl.c |  4 +++-
 drivers/md/dm-snap.c  | 51 +++++++++++++++++++++++++++++++++++++++++++++++++--
 drivers/md/dm.c       | 15 +++++++++++++++
 4 files changed, 68 insertions(+), 3 deletions(-)

-- 
1.8.3.1


* [RFC PATCH v2 1/3] dm: enable dax_operations for dm-snapshot
  2018-11-21  3:26 [RFC PATCH v2 0/3] realize dax_operations for dm-snapshot Huaisheng Ye
@ 2018-11-21  3:27 ` Huaisheng Ye
  2018-11-21  3:27 ` [RFC PATCH v2 2/3] dm: expand hc_map in mapped_device for lack of map Huaisheng Ye
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Huaisheng Ye @ 2018-11-21  3:27 UTC (permalink / raw)
  To: linux-nvdimm, agk, snitzer, dm-devel, dan.j.williams, willy,
	zwisler, jack, dave.jiang, vishal.l.verma
  Cc: linux-fsdevel, linux-kernel, chengnt, Huaisheng Ye

From: Huaisheng Ye <yehs1@lenovo.com>

Implement origin_dax_direct_access and add origin_dax_copy_to_iter and
origin_dax_copy_from_iter so that the dm-snapshot origin target supports
DAX operations.

Below is the call trace of origin_dax_copy_to_iter, which calls into
dm-linear (the -real device) for the underlying dax_operation.

[518597.924019] Call Trace:
[518597.927320]  dump_stack+0x65/0x7e
[518597.931592]  origin_dax_copy_to_iter+0x51/0x78 [dm_snapshot]
[518597.938494]  dm_dax_copy_to_iter+0x86/0xc5 [dm_mod]
[518597.944519]  dax_copy_to_iter+0x27/0x29
[518597.949371]  dax_iomap_actor+0x264/0x326
[518597.954308]  ? trace_event_raw_event_dax_pmd_load_hole_class+0xd0/0xd0
[518597.962159]  iomap_apply+0xc7/0x128
[518597.966609]  ? trace_event_raw_event_dax_pmd_load_hole_class+0xd0/0xd0
[518597.974447]  dax_iomap_rw+0x66/0xa8
[518597.978893]  ? trace_event_raw_event_dax_pmd_load_hole_class+0xd0/0xd0
[518597.986741]  ext2_file_read_iter+0x4f/0x83 [ext2]
[518597.992552]  __vfs_read+0x130/0x168
[518597.997001]  vfs_read+0x92/0x146
[518598.001155]  ksys_read+0x4f/0xa5
[518598.005308]  __x64_sys_read+0x16/0x18
[518598.009942]  do_syscall_64+0x88/0x15f
[518598.014575]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: Huaisheng Ye <yehs1@lenovo.com>
---
 drivers/md/dm-snap.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 49 insertions(+), 2 deletions(-)

diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c
index ae4b33d..714e26f 100644
--- a/drivers/md/dm-snap.c
+++ b/drivers/md/dm-snap.c
@@ -19,6 +19,7 @@
 #include <linux/vmalloc.h>
 #include <linux/log2.h>
 #include <linux/dm-kcopyd.h>
+#include <linux/dax.h>
 
 #include "dm.h"
 
@@ -2316,13 +2317,57 @@ static int origin_map(struct dm_target *ti, struct bio *bio)
 	return do_origin(o->dev, bio);
 }
 
+#if IS_ENABLED(CONFIG_DAX_DRIVER)
 static long origin_dax_direct_access(struct dm_target *ti, pgoff_t pgoff,
 		long nr_pages, void **kaddr, pfn_t *pfn)
 {
-	DMWARN("device does not support dax.");
-	return -EIO;
+	long ret = 0;
+	struct dm_origin *o = ti->private;
+	struct block_device *bdev = o->dev->bdev;
+	struct dax_device *dax_dev = o->dev->dax_dev;
+	sector_t sector = pgoff * PAGE_SECTORS;
+
+	ret = bdev_dax_pgoff(bdev, sector, nr_pages * PAGE_SIZE, &pgoff);
+	if (ret)
+		return ret;
+
+	return dax_direct_access(dax_dev, pgoff, nr_pages, kaddr, pfn);
+}
+
+static size_t origin_dax_copy_from_iter(struct dm_target *ti, pgoff_t pgoff,
+		void *addr, size_t bytes, struct iov_iter *i)
+{
+	struct dm_origin *o = ti->private;
+	struct block_device *bdev = o->dev->bdev;
+	struct dax_device *dax_dev = o->dev->dax_dev;
+	sector_t sector = pgoff * PAGE_SECTORS;
+
+	if (bdev_dax_pgoff(bdev, sector, ALIGN(bytes, PAGE_SIZE), &pgoff))
+		return 0;
+
+	return dax_copy_from_iter(dax_dev, pgoff, addr, bytes, i);
 }
 
+static size_t origin_dax_copy_to_iter(struct dm_target *ti, pgoff_t pgoff,
+		void *addr, size_t bytes, struct iov_iter *i)
+{
+	struct dm_origin *o = ti->private;
+	struct block_device *bdev = o->dev->bdev;
+	struct dax_device *dax_dev = o->dev->dax_dev;
+	sector_t sector = pgoff * PAGE_SECTORS;
+
+	if (bdev_dax_pgoff(bdev, sector, ALIGN(bytes, PAGE_SIZE), &pgoff))
+		return 0;
+
+	return dax_copy_to_iter(dax_dev, pgoff, addr, bytes, i);
+}
+
+#else
+#define origin_dax_direct_access NULL
+#define origin_dax_copy_from_iter NULL
+#define origin_dax_copy_to_iter NULL
+#endif
+
 /*
  * Set the target "max_io_len" field to the minimum of all the snapshots'
  * chunk sizes.
@@ -2383,6 +2428,8 @@ static int origin_iterate_devices(struct dm_target *ti,
 	.status  = origin_status,
 	.iterate_devices = origin_iterate_devices,
 	.direct_access = origin_dax_direct_access,
+	.dax_copy_to_iter = origin_dax_copy_to_iter,
+	.dax_copy_from_iter = origin_dax_copy_from_iter,
 };
 
 static struct target_type snapshot_target = {
-- 
1.8.3.1
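
Annotation: all three origin_dax_* helpers in the patch above start with
the same offset translation: the target-relative page offset is turned
into a sector (pgoff * PAGE_SECTORS) and then passed to bdev_dax_pgoff
to obtain a page offset on the underlying dax device. The following is
only a userspace sketch of that arithmetic. It assumes 4 KiB pages and
512-byte sectors, and model_dax_pgoff and model_origin_pgoff are
hypothetical stand-ins written for illustration, not kernel functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed constants: 4 KiB pages and 512-byte sectors. */
#define MODEL_PAGE_SIZE    4096ULL
#define MODEL_SECTOR_SIZE  512ULL
#define MODEL_PAGE_SECTORS (MODEL_PAGE_SIZE / MODEL_SECTOR_SIZE)

/*
 * Hypothetical stand-in for bdev_dax_pgoff(): translate a sector that is
 * relative to a block device starting at part_start_sect into a page
 * offset on the whole dax device. Returns 0 on success, or -1 if the
 * byte offset or the size is not page aligned.
 */
static int model_dax_pgoff(uint64_t part_start_sect, uint64_t sector,
                           size_t size, uint64_t *pgoff)
{
    uint64_t phys_off = (part_start_sect + sector) * MODEL_SECTOR_SIZE;

    if (phys_off % MODEL_PAGE_SIZE || size % MODEL_PAGE_SIZE)
        return -1;
    *pgoff = phys_off / MODEL_PAGE_SIZE;
    return 0;
}

/* Mirrors the first few lines of the origin_dax_* helpers in the patch. */
static int model_origin_pgoff(uint64_t part_start_sect, uint64_t target_pgoff,
                              size_t nr_pages, uint64_t *dev_pgoff)
{
    uint64_t sector = target_pgoff * MODEL_PAGE_SECTORS;

    return model_dax_pgoff(part_start_sect, sector,
                           nr_pages * MODEL_PAGE_SIZE, dev_pgoff);
}
```

With a device that starts at sector 2048 (256 pages in), target page 3
maps to device page 259, while a start sector that is not page aligned
is rejected.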


* [RFC PATCH v2 2/3] dm: expand hc_map in mapped_device for lack of map
  2018-11-21  3:26 [RFC PATCH v2 0/3] realize dax_operations for dm-snapshot Huaisheng Ye
  2018-11-21  3:27 ` [RFC PATCH v2 1/3] dm: enable " Huaisheng Ye
@ 2018-11-21  3:27 ` Huaisheng Ye
  2018-11-21  3:27 ` [RFC PATCH v2 3/3] dm: expand valid types for dm-ioctl Huaisheng Ye
  2018-11-25 20:59 ` [RFC PATCH v2 0/3] realize dax_operations for dm-snapshot Dan Williams
  3 siblings, 0 replies; 6+ messages in thread
From: Huaisheng Ye @ 2018-11-21  3:27 UTC (permalink / raw)
  To: linux-nvdimm, agk, snitzer, dm-devel, dan.j.williams, willy,
	zwisler, jack, dave.jiang, vishal.l.verma
  Cc: linux-fsdevel, linux-kernel, chengnt, Huaisheng Ye

From: Huaisheng Ye <yehs1@lenovo.com>

Sometimes the table installed by dm_swap_table is not yet available
when it is needed.

For example, while the dm-snapshot origin is being constructed, the
table_load ioctl checks whether the lower device origin-real supports
direct access. At that point, however, dev_suspend has not yet assigned
origin-real's md a live struct dm_table pointer.

So add hc_map to struct mapped_device, so that the table address can be
taken directly from struct hash_cell in this case.

Below is the call trace; dm_dax_direct_access calls
dm_dax_get_live_target to get the struct dm_table pointer.

[  213.975827] Call Trace:
[  213.975832]  dump_stack+0x5a/0x73
[  213.975840]  dm_dax_direct_access+0x12b/0x1b0 [dm_mod]
[  213.975845]  dax_direct_access+0x2d/0x60
[  213.975848]  __bdev_dax_supported+0x162/0x2a0
[  213.975851]  ? dump_stack+0x5a/0x73
[  213.975859]  device_supports_dax+0x15/0x20 [dm_mod]
[  213.975867]  dm_table_supports_dax.isra.13+0x7d/0xa0 [dm_mod]
[  213.975875]  dm_table_complete+0x3fb/0x750 [dm_mod]
[  213.975883]  table_load+0x19a/0x390 [dm_mod]
[  213.975891]  ? retrieve_status+0x1c0/0x1c0 [dm_mod]
[  213.975898]  ctl_ioctl+0x1d8/0x450 [dm_mod]
[  213.975909]  dm_ctl_ioctl+0xa/0x10 [dm_mod]
[  213.975913]  do_vfs_ioctl+0xa9/0x620
[  213.975918]  ? syscall_trace_enter+0x1c9/0x2b0
[  213.975923]  ksys_ioctl+0x60/0x90
[  213.975927]  __x64_sys_ioctl+0x16/0x20
[  213.975931]  do_syscall_64+0x5b/0x180
[  213.975936]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: Huaisheng Ye <yehs1@lenovo.com>
---
 drivers/md/dm-core.h  |  1 +
 drivers/md/dm-ioctl.c |  1 +
 drivers/md/dm.c       | 15 +++++++++++++++
 3 files changed, 17 insertions(+)

diff --git a/drivers/md/dm-core.h b/drivers/md/dm-core.h
index 224d445..5577d90 100644
--- a/drivers/md/dm-core.h
+++ b/drivers/md/dm-core.h
@@ -40,6 +40,7 @@ struct mapped_device {
 	 * dereference.
 	 */
 	void __rcu *map;
+	struct dm_table *hc_map;
 
 	unsigned long flags;
 
diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
index f666778..a27016e 100644
--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -1365,6 +1365,7 @@ static int table_load(struct file *filp, struct dm_ioctl *param, size_t param_si
 	if (hc->new_map)
 		old_map = hc->new_map;
 	hc->new_map = t;
+	hc->md->hc_map = hc->new_map;
 	up_write(&_hash_lock);
 
 	param->flags |= DM_INACTIVE_PRESENT_FLAG;
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index c510179..19b48bb 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1080,6 +1080,21 @@ static struct dm_target *dm_dax_get_live_target(struct mapped_device *md,
 	struct dm_target *ti;
 
 	map = dm_get_live_table(md, srcu_idx);
+	if (!map) {
+		/* Sometimes dm_swap_table has not run yet.
+		 *
+		 * For example, while the dm-snapshot origin is being
+		 * constructed, the table_load ioctl checks whether the
+		 * lower device origin-real supports direct access, but
+		 * origin-real's md has not yet been assigned a live
+		 * struct dm_table pointer.
+		 * So use hc_map to get the address from struct hash_cell
+		 * directly.
+		 */
+		DMINFO("failed to get map, use hc_map instead");
+		map = md->hc_map;
+	}
+
 	if (!map)
 		return NULL;
 
-- 
1.8.3.1


* [RFC PATCH v2 3/3] dm: expand valid types for dm-ioctl
  2018-11-21  3:26 [RFC PATCH v2 0/3] realize dax_operations for dm-snapshot Huaisheng Ye
  2018-11-21  3:27 ` [RFC PATCH v2 1/3] dm: enable " Huaisheng Ye
  2018-11-21  3:27 ` [RFC PATCH v2 2/3] dm: expand hc_map in mapped_device for lack of map Huaisheng Ye
@ 2018-11-21  3:27 ` Huaisheng Ye
  2018-11-25 20:59 ` [RFC PATCH v2 0/3] realize dax_operations for dm-snapshot Dan Williams
  3 siblings, 0 replies; 6+ messages in thread
From: Huaisheng Ye @ 2018-11-21  3:27 UTC (permalink / raw)
  To: linux-nvdimm, agk, snitzer, dm-devel, dan.j.williams, willy,
	zwisler, jack, dave.jiang, vishal.l.verma
  Cc: linux-fsdevel, linux-kernel, chengnt, Huaisheng Ye

From: Huaisheng Ye <yehs1@lenovo.com>

If the origin device of dm-snapshot is mounted with DAX, then when the
snapshot is merged back into the origin, table_load (called while
snapshot-merge is constructed) checks whether the new dm_table's type
matches the existing md's type.
The existing type is DM_TYPE_DAX_BIO_BASED, but the newly created table
is DM_TYPE_BIO_BASED. So the set of valid transitions in is_valid_type
needs to be expanded.

Signed-off-by: Huaisheng Ye <yehs1@lenovo.com>
---
 drivers/md/dm-ioctl.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
index a27016e..158d657 100644
--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -1295,7 +1295,8 @@ static int populate_table(struct dm_table *table,
 static bool is_valid_type(enum dm_queue_mode cur, enum dm_queue_mode new)
 {
 	if (cur == new ||
-	    (cur == DM_TYPE_BIO_BASED && new == DM_TYPE_DAX_BIO_BASED))
+	    (cur == DM_TYPE_BIO_BASED && new == DM_TYPE_DAX_BIO_BASED) ||
+	    (cur == DM_TYPE_DAX_BIO_BASED && new == DM_TYPE_BIO_BASED))
 		return true;
 
 	return false;
-- 
1.8.3.1
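
Annotation: the effect of the one-line change above can be checked with
a small userspace model. This is a sketch only: enum model_queue_mode is
a simplified, hypothetical stand-in for the kernel's enum dm_queue_mode
(fewer members, arbitrary values), and model_is_valid_type copies the
patched condition from the diff.

```c
#include <stdbool.h>

/*
 * Simplified stand-in for enum dm_queue_mode. Only the two modes the
 * patch discusses plus one unrelated mode are modelled.
 */
enum model_queue_mode {
    MODEL_TYPE_BIO_BASED,
    MODEL_TYPE_REQUEST_BASED,
    MODEL_TYPE_DAX_BIO_BASED,
};

/* Same condition as is_valid_type() after the patch is applied. */
static bool model_is_valid_type(enum model_queue_mode cur,
                                enum model_queue_mode new_type)
{
    if (cur == new_type)
        return true;
    if (cur == MODEL_TYPE_BIO_BASED && new_type == MODEL_TYPE_DAX_BIO_BASED)
        return true;
    /* Transition newly allowed by this patch: DAX back to plain bio. */
    if (cur == MODEL_TYPE_DAX_BIO_BASED && new_type == MODEL_TYPE_BIO_BASED)
        return true;
    return false;
}
```

In this model the DAX to bio-based transition needed by snapshot-merge
is accepted, while unrelated transitions such as bio-based to
request-based are still rejected.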


* Re: [RFC PATCH v2 0/3] realize dax_operations for dm-snapshot
  2018-11-21  3:26 [RFC PATCH v2 0/3] realize dax_operations for dm-snapshot Huaisheng Ye
                   ` (2 preceding siblings ...)
  2018-11-21  3:27 ` [RFC PATCH v2 3/3] dm: expand valid types for dm-ioctl Huaisheng Ye
@ 2018-11-25 20:59 ` Dan Williams
  2018-11-28 14:27   ` [External] " Huaisheng HS1 Ye
  3 siblings, 1 reply; 6+ messages in thread
From: Dan Williams @ 2018-11-25 20:59 UTC (permalink / raw)
  To: Huaisheng Ye
  Cc: Jan Kara, Mike Snitzer, linux-nvdimm, NingTing Cheng,
	Linux Kernel Mailing List, Matthew Wilcox,
	device-mapper development, zwisler, linux-fsdevel,
	Alasdair Kergon

On Tue, Nov 20, 2018 at 7:27 PM Huaisheng Ye <yehs2007@zoho.com> wrote:
>
> From: Huaisheng Ye <yehs1@lenovo.com>
>
> Changes
> v1->v2:
>         Add NULL definitions for origin_dax_direct_access and
>         origin_dax_copy_from/to_iter to avoid a build error
>         when CONFIG_DAX_DRIVER is not enabled.
>
> [v1]: https://lkml.org/lkml/2018/11/20/759
>
> This patch series implements the dax_operations for dm-snapshot on
> persistent memory devices.

How does this interact with mmap write faults if the mapping is dax
and the page needs to be cow'd?
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


* RE: [External]  Re: [RFC PATCH v2 0/3] realize dax_operations for dm-snapshot
  2018-11-25 20:59 ` [RFC PATCH v2 0/3] realize dax_operations for dm-snapshot Dan Williams
@ 2018-11-28 14:27   ` Huaisheng HS1 Ye
  0 siblings, 0 replies; 6+ messages in thread
From: Huaisheng HS1 Ye @ 2018-11-28 14:27 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jan Kara, Mike Snitzer, linux-nvdimm, NingTing Cheng,
	Linux Kernel Mailing List, Matthew Wilcox,
	device-mapper development, zwisler, linux-fsdevel,
	Alasdair Kergon

From: Dan Williams <dan.j.williams@intel.com>
Sent: Monday, November 26, 2018 5:00 AM
>
> On Tue, Nov 20, 2018 at 7:27 PM Huaisheng Ye <yehs2007@zoho.com> wrote:
> >
> > From: Huaisheng Ye <yehs1@lenovo.com>
> >
> > Changes
> > v1->v2:
> >         Add NULL definitions for origin_dax_direct_access and
> >         origin_dax_copy_from/to_iter to avoid a build error
> >         when CONFIG_DAX_DRIVER is not enabled.
> >
> > [v1]: https://lkml.org/lkml/2018/11/20/759
> >
> > This patch series implements the dax_operations for dm-snapshot on
> > persistent memory devices.
> 
> How does this interact with mmap write faults if the mapping is dax
> and the page needs to be cow'd?

Hi Dan,

Based on my understanding, dm-snapshot cannot work with DAX file
mappings.

When a file is mapped with DAX, userspace gets a pointer for accessing
persistent memory directly through the page fault.
Here is the call trace.

[ 7677.897354] Call Trace:
[ 7677.900538]  dump_stack+0x67/0x82
[ 7677.904695]  __pmem_direct_access+0xa9/0x101 [nd_pmem]
[ 7677.910922]  ? linear_ctr+0x12a/0x12a [dm_mod]
[ 7677.916340]  pmem_dax_direct_access+0x30/0x37 [nd_pmem]
[ 7677.922641]  dax_direct_access+0x30/0x58
[ 7677.927480]  linear_dax_direct_access+0x66/0x71 [dm_mod]
[ 7677.933848]  dm_dax_direct_access+0x9c/0xf1 [dm_mod]
[ 7677.939856]  ? origin_dax_copy_from_iter+0x88/0x88 [dm_snapshot]
[ 7677.947032]  dax_direct_access+0x30/0x58
[ 7677.951876]  origin_dax_direct_access+0x6a/0x75 [dm_snapshot]
[ 7677.958753]  dm_dax_direct_access+0x9c/0xf1 [dm_mod]
[ 7677.964738]  dax_direct_access+0x30/0x58
[ 7677.969542]  dax_iomap_pfn+0x84/0x10d
[ 7677.974061]  dax_iomap_pte_fault+0x4a9/0x773
[ 7677.979271]  dax_iomap_fault+0x21/0x36
[ 7677.983895]  ext2_dax_fault+0x70/0x9a [ext2]
[ 7677.989061]  __do_fault+0x1d/0x74
[ 7677.993159]  __handle_mm_fault+0xf04/0x17a4
[ 7677.998225]  handle_mm_fault+0x1a0/0x204
[ 7678.003035]  __do_page_fault+0x39b/0x4d3
[ 7678.007839]  do_page_fault+0xfc/0x11b
[ 7678.012316]  ? page_fault+0x8/0x30
[ 7678.016498]  page_fault+0x1e/0x30

An application in userspace can read or write the file content directly
through mmap with DAX.
dm-snapshot works in kernel space, so it has no chance to notice these
modifications to the filesystem.
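
Annotation: the access pattern in question can be sketched in userspace
as follows. This is a minimal illustration, not a DAX test: on an
ordinary file the stores land in the page cache, whereas on a file
mapped from a filesystem mounted with -o dax they would go to persistent
memory without passing through the block layer once the page is mapped
writable. The function name mmap_store_and_check and the file path used
with it are hypothetical.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Store msg into the first page of the file at path through a shared
 * mapping, then read it back with pread(). Returns 0 if the data read
 * back matches, or -1 on any error.
 */
static int mmap_store_and_check(const char *path, const char *msg)
{
    size_t len = strlen(msg) + 1;
    long page = sysconf(_SC_PAGESIZE);
    char buf[256];
    int ret = -1;
    void *p;
    int fd;

    if (page <= 0 || len > sizeof(buf) || len > (size_t)page)
        return -1;

    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, page) != 0)
        goto out_close;

    p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)
        goto out_close;

    /* Plain CPU stores: no write() system call and no bio is issued here. */
    memcpy(p, msg, len);
    if (msync(p, (size_t)page, MS_SYNC) != 0)
        goto out_unmap;

    if (pread(fd, buf, len, 0) == (ssize_t)len && memcmp(buf, msg, len) == 0)
        ret = 0;

out_unmap:
    munmap(p, (size_t)page);
out_close:
    close(fd);
    return ret;
}
```

The memcpy is the step a bio-based target such as dm-snapshot never
observes when the mapping is DAX, which is the case the question above
is about.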

I think in this case userspace should perhaps take responsibility for
the snapshot when a DAX file mapping is used, much as NVML takes
responsibility for flushing cache lines with CLFLUSH.

Correct me if anything is inaccurate.

Cheers,
Huaisheng Ye
