All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/2] Align mmap address for DAX pmd mappings
@ 2016-04-14 16:48 ` Toshi Kani
  0 siblings, 0 replies; 35+ messages in thread
From: Toshi Kani @ 2016-04-14 16:48 UTC (permalink / raw)
  To: akpm, dan.j.williams, viro
  Cc: tytso, linux-nvdimm, jack, david, linux-kernel, linux-mm,
	adilger.kernel, linux-fsdevel, kirill.shutemov

When CONFIG_FS_DAX_PMD is set, DAX supports mmap() using pmd page
size.  This feature relies on both mmap virtual address and FS
block (i.e. physical address) to be aligned by the pmd page size.
Users can use mkfs options to specify FS to align block allocations.
However, aligning mmap address requires code changes to existing
applications for providing a pmd-aligned address to mmap().

For instance, fio with "ioengine=mmap" performs I/Os with mmap() [1].
It calls mmap() with a NULL address, which needs to be changed to
provide a pmd-aligned address for testing with DAX pmd mappings.
Changing all applications that call mmap() with NULL is undesirable.

This patch-set extends filesystems to align an mmap address for
a DAX file so that unmodified applications can use DAX pmd mappings.

[1]: https://github.com/axboe/fio/blob/master/engines/mmap.c

v3:
 - Check overflow condition to offset + length. (Matthew Wilcox)
 - Remove indent by using gotos. (Matthew Wilcox)
 - Define dax_get_unmapped_area to NULL when CONFIG_FS_DAX is unset.
   (Matthew Wilcox)
 - Squash all filesystem patches together. (Matthew Wilcox)

v2:
 - Change filesystems to provide their get_unmapped_area().
   (Matthew Wilcox)
 - Add more description about the benefit. (Matthew Wilcox)

---
Toshi Kani (2):
 1/2 dax: add dax_get_unmapped_area for pmd mappings
 2/2 ext2/4, xfs, blk: call dax_get_unmapped_area() for DAX pmd mappings

---
 fs/block_dev.c      |  1 +
 fs/dax.c            | 43 +++++++++++++++++++++++++++++++++++++++++++
 fs/ext2/file.c      |  1 +
 fs/ext4/file.c      |  1 +
 fs/xfs/xfs_file.c   |  1 +
 include/linux/dax.h |  3 +++
 6 files changed, 50 insertions(+)
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v3 0/2] Align mmap address for DAX pmd mappings
@ 2016-04-14 16:48 ` Toshi Kani
  0 siblings, 0 replies; 35+ messages in thread
From: Toshi Kani @ 2016-04-14 16:48 UTC (permalink / raw)
  To: akpm, dan.j.williams, viro
  Cc: willy, ross.zwisler, kirill.shutemov, david, jack, tytso,
	adilger.kernel, linux-nvdimm, linux-fsdevel, linux-mm,
	linux-kernel

When CONFIG_FS_DAX_PMD is set, DAX supports mmap() using pmd page
size.  This feature relies on both mmap virtual address and FS
block (i.e. physical address) to be aligned by the pmd page size.
Users can use mkfs options to specify FS to align block allocations.
However, aligning mmap address requires code changes to existing
applications for providing a pmd-aligned address to mmap().

For instance, fio with "ioengine=mmap" performs I/Os with mmap() [1].
It calls mmap() with a NULL address, which needs to be changed to
provide a pmd-aligned address for testing with DAX pmd mappings.
Changing all applications that call mmap() with NULL is undesirable.

This patch-set extends filesystems to align an mmap address for
a DAX file so that unmodified applications can use DAX pmd mappings.

[1]: https://github.com/axboe/fio/blob/master/engines/mmap.c

v3:
 - Check overflow condition to offset + length. (Matthew Wilcox)
 - Remove indent by using gotos. (Matthew Wilcox)
 - Define dax_get_unmapped_area to NULL when CONFIG_FS_DAX is unset.
   (Matthew Wilcox)
 - Squash all filesystem patches together. (Matthew Wilcox)

v2:
 - Change filesystems to provide their get_unmapped_area().
   (Matthew Wilcox)
 - Add more description about the benefit. (Matthew Wilcox)

---
Toshi Kani (2):
 1/2 dax: add dax_get_unmapped_area for pmd mappings
 2/2 ext2/4, xfs, blk: call dax_get_unmapped_area() for DAX pmd mappings

---
 fs/block_dev.c      |  1 +
 fs/dax.c            | 43 +++++++++++++++++++++++++++++++++++++++++++
 fs/ext2/file.c      |  1 +
 fs/ext4/file.c      |  1 +
 fs/xfs/xfs_file.c   |  1 +
 include/linux/dax.h |  3 +++
 6 files changed, 50 insertions(+)

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v3 0/2] Align mmap address for DAX pmd mappings
@ 2016-04-14 16:48 ` Toshi Kani
  0 siblings, 0 replies; 35+ messages in thread
From: Toshi Kani @ 2016-04-14 16:48 UTC (permalink / raw)
  To: akpm, dan.j.williams, viro
  Cc: willy, ross.zwisler, kirill.shutemov, david, jack, tytso,
	adilger.kernel, linux-nvdimm, linux-fsdevel, linux-mm,
	linux-kernel

When CONFIG_FS_DAX_PMD is set, DAX supports mmap() using pmd page
size.  This feature relies on both mmap virtual address and FS
block (i.e. physical address) to be aligned by the pmd page size.
Users can use mkfs options to specify FS to align block allocations.
However, aligning mmap address requires code changes to existing
applications for providing a pmd-aligned address to mmap().

For instance, fio with "ioengine=mmap" performs I/Os with mmap() [1].
It calls mmap() with a NULL address, which needs to be changed to
provide a pmd-aligned address for testing with DAX pmd mappings.
Changing all applications that call mmap() with NULL is undesirable.

This patch-set extends filesystems to align an mmap address for
a DAX file so that unmodified applications can use DAX pmd mappings.

[1]: https://github.com/axboe/fio/blob/master/engines/mmap.c

v3:
 - Check overflow condition to offset + length. (Matthew Wilcox)
 - Remove indent by using gotos. (Matthew Wilcox)
 - Define dax_get_unmapped_area to NULL when CONFIG_FS_DAX is unset.
   (Matthew Wilcox)
 - Squash all filesystem patches together. (Matthew Wilcox)

v2:
 - Change filesystems to provide their get_unmapped_area().
   (Matthew Wilcox)
 - Add more description about the benefit. (Matthew Wilcox)

---
Toshi Kani (2):
 1/2 dax: add dax_get_unmapped_area for pmd mappings
 2/2 ext2/4, xfs, blk: call dax_get_unmapped_area() for DAX pmd mappings

---
 fs/block_dev.c      |  1 +
 fs/dax.c            | 43 +++++++++++++++++++++++++++++++++++++++++++
 fs/ext2/file.c      |  1 +
 fs/ext4/file.c      |  1 +
 fs/xfs/xfs_file.c   |  1 +
 include/linux/dax.h |  3 +++
 6 files changed, 50 insertions(+)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v3 1/2] dax: add dax_get_unmapped_area for pmd mappings
  2016-04-14 16:48 ` Toshi Kani
  (?)
@ 2016-04-14 16:48   ` Toshi Kani
  -1 siblings, 0 replies; 35+ messages in thread
From: Toshi Kani @ 2016-04-14 16:48 UTC (permalink / raw)
  To: akpm, dan.j.williams, viro
  Cc: tytso, linux-nvdimm, jack, david, linux-kernel, linux-mm,
	adilger.kernel, linux-fsdevel, kirill.shutemov

When CONFIG_FS_DAX_PMD is set, DAX supports mmap() using pmd page
size.  This feature relies on both mmap virtual address and FS
block (i.e. physical address) to be aligned by the pmd page size.
Users can use mkfs options to specify FS to align block allocations.
However, aligning mmap address requires code changes to existing
applications for providing a pmd-aligned address to mmap().

For instance, fio with "ioengine=mmap" performs I/Os with mmap() [1].
It calls mmap() with a NULL address, which needs to be changed to
provide a pmd-aligned address for testing with DAX pmd mappings.
Changing all applications that call mmap() with NULL is undesirable.

Add dax_get_unmapped_area(), which can be called by filesystem's
get_unmapped_area to align an mmap address by the pmd size for
a DAX file.  It calls the default handler, mm->get_unmapped_area(),
to find a range and then aligns it for a DAX file.

[1]: https://github.com/axboe/fio/blob/master/engines/mmap.c
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
---
 fs/dax.c            |   43 +++++++++++++++++++++++++++++++++++++++++++
 include/linux/dax.h |    3 +++
 2 files changed, 46 insertions(+)

diff --git a/fs/dax.c b/fs/dax.c
index 75ba46d..f8ddd27 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1158,3 +1158,46 @@ int dax_truncate_page(struct inode *inode, loff_t from, get_block_t get_block)
 	return dax_zero_page_range(inode, from, length, get_block);
 }
 EXPORT_SYMBOL_GPL(dax_truncate_page);
+
+/**
+ * dax_get_unmapped_area - handle get_unmapped_area for a DAX file
+ * @filp: The file being mmap'd, if not NULL
+ * @addr: The mmap address. If NULL, the kernel assigns the address
+ * @len: The mmap size in bytes
+ * @pgoff: The page offset in the file where the mapping starts from.
+ * @flags: The mmap flags
+ *
+ * This function can be called by a filesystem for get_unmapped_area().
+ * When a target file is a DAX file, it aligns the mmap address at the
+ * beginning of the file by the pmd size.
+ */
+unsigned long dax_get_unmapped_area(struct file *filp, unsigned long addr,
+		unsigned long len, unsigned long pgoff, unsigned long flags)
+{
+	unsigned long off, off_end, off_pmd, len_pmd, addr_pmd;
+
+	if (!IS_ENABLED(CONFIG_FS_DAX_PMD) ||
+	    !filp || addr || !IS_DAX(filp->f_mapping->host))
+		goto out;
+
+	off = pgoff << PAGE_SHIFT;
+	off_end = off + len;
+	off_pmd = round_up(off, PMD_SIZE);  /* pmd-aligned offset */
+
+	if ((off_end <= off_pmd) || ((off_end - off_pmd) < PMD_SIZE))
+		goto out;
+
+	len_pmd = len + PMD_SIZE;
+	if ((off + len_pmd) < off)
+		goto out;
+
+	addr_pmd = current->mm->get_unmapped_area(filp, addr, len_pmd,
+						  pgoff, flags);
+	if (!IS_ERR_VALUE(addr_pmd)) {
+		addr_pmd += (off - addr_pmd) & (PMD_SIZE - 1);
+		return addr_pmd;
+	}
+out:
+	return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
+}
+EXPORT_SYMBOL_GPL(dax_get_unmapped_area);
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 636dd59..184b171 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -17,12 +17,15 @@ int __dax_fault(struct vm_area_struct *, struct vm_fault *, get_block_t,
 
 #ifdef CONFIG_FS_DAX
 struct page *read_dax_sector(struct block_device *bdev, sector_t n);
+unsigned long dax_get_unmapped_area(struct file *filp, unsigned long addr,
+		unsigned long len, unsigned long pgoff, unsigned long flags);
 #else
 static inline struct page *read_dax_sector(struct block_device *bdev,
 		sector_t n)
 {
 	return ERR_PTR(-ENXIO);
 }
+#define dax_get_unmapped_area	NULL
 #endif
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 1/2] dax: add dax_get_unmapped_area for pmd mappings
@ 2016-04-14 16:48   ` Toshi Kani
  0 siblings, 0 replies; 35+ messages in thread
From: Toshi Kani @ 2016-04-14 16:48 UTC (permalink / raw)
  To: akpm, dan.j.williams, viro
  Cc: willy, ross.zwisler, kirill.shutemov, david, jack, tytso,
	adilger.kernel, linux-nvdimm, linux-fsdevel, linux-mm,
	linux-kernel, Toshi Kani

When CONFIG_FS_DAX_PMD is set, DAX supports mmap() using pmd page
size.  This feature relies on both mmap virtual address and FS
block (i.e. physical address) to be aligned by the pmd page size.
Users can use mkfs options to specify FS to align block allocations.
However, aligning mmap address requires code changes to existing
applications for providing a pmd-aligned address to mmap().

For instance, fio with "ioengine=mmap" performs I/Os with mmap() [1].
It calls mmap() with a NULL address, which needs to be changed to
provide a pmd-aligned address for testing with DAX pmd mappings.
Changing all applications that call mmap() with NULL is undesirable.

Add dax_get_unmapped_area(), which can be called by filesystem's
get_unmapped_area to align an mmap address by the pmd size for
a DAX file.  It calls the default handler, mm->get_unmapped_area(),
to find a range and then aligns it for a DAX file.

[1]: https://github.com/axboe/fio/blob/master/engines/mmap.c
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
---
 fs/dax.c            |   43 +++++++++++++++++++++++++++++++++++++++++++
 include/linux/dax.h |    3 +++
 2 files changed, 46 insertions(+)

diff --git a/fs/dax.c b/fs/dax.c
index 75ba46d..f8ddd27 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1158,3 +1158,46 @@ int dax_truncate_page(struct inode *inode, loff_t from, get_block_t get_block)
 	return dax_zero_page_range(inode, from, length, get_block);
 }
 EXPORT_SYMBOL_GPL(dax_truncate_page);
+
+/**
+ * dax_get_unmapped_area - handle get_unmapped_area for a DAX file
+ * @filp: The file being mmap'd, if not NULL
+ * @addr: The mmap address. If NULL, the kernel assigns the address
+ * @len: The mmap size in bytes
+ * @pgoff: The page offset in the file where the mapping starts from.
+ * @flags: The mmap flags
+ *
+ * This function can be called by a filesystem for get_unmapped_area().
+ * When a target file is a DAX file, it aligns the mmap address at the
+ * beginning of the file by the pmd size.
+ */
+unsigned long dax_get_unmapped_area(struct file *filp, unsigned long addr,
+		unsigned long len, unsigned long pgoff, unsigned long flags)
+{
+	unsigned long off, off_end, off_pmd, len_pmd, addr_pmd;
+
+	if (!IS_ENABLED(CONFIG_FS_DAX_PMD) ||
+	    !filp || addr || !IS_DAX(filp->f_mapping->host))
+		goto out;
+
+	off = pgoff << PAGE_SHIFT;
+	off_end = off + len;
+	off_pmd = round_up(off, PMD_SIZE);  /* pmd-aligned offset */
+
+	if ((off_end <= off_pmd) || ((off_end - off_pmd) < PMD_SIZE))
+		goto out;
+
+	len_pmd = len + PMD_SIZE;
+	if ((off + len_pmd) < off)
+		goto out;
+
+	addr_pmd = current->mm->get_unmapped_area(filp, addr, len_pmd,
+						  pgoff, flags);
+	if (!IS_ERR_VALUE(addr_pmd)) {
+		addr_pmd += (off - addr_pmd) & (PMD_SIZE - 1);
+		return addr_pmd;
+	}
+out:
+	return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
+}
+EXPORT_SYMBOL_GPL(dax_get_unmapped_area);
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 636dd59..184b171 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -17,12 +17,15 @@ int __dax_fault(struct vm_area_struct *, struct vm_fault *, get_block_t,
 
 #ifdef CONFIG_FS_DAX
 struct page *read_dax_sector(struct block_device *bdev, sector_t n);
+unsigned long dax_get_unmapped_area(struct file *filp, unsigned long addr,
+		unsigned long len, unsigned long pgoff, unsigned long flags);
 #else
 static inline struct page *read_dax_sector(struct block_device *bdev,
 		sector_t n)
 {
 	return ERR_PTR(-ENXIO);
 }
+#define dax_get_unmapped_area	NULL
 #endif
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 1/2] dax: add dax_get_unmapped_area for pmd mappings
@ 2016-04-14 16:48   ` Toshi Kani
  0 siblings, 0 replies; 35+ messages in thread
From: Toshi Kani @ 2016-04-14 16:48 UTC (permalink / raw)
  To: akpm, dan.j.williams, viro
  Cc: willy, ross.zwisler, kirill.shutemov, david, jack, tytso,
	adilger.kernel, linux-nvdimm, linux-fsdevel, linux-mm,
	linux-kernel, Toshi Kani

When CONFIG_FS_DAX_PMD is set, DAX supports mmap() using pmd page
size.  This feature relies on both mmap virtual address and FS
block (i.e. physical address) to be aligned by the pmd page size.
Users can use mkfs options to specify FS to align block allocations.
However, aligning mmap address requires code changes to existing
applications for providing a pmd-aligned address to mmap().

For instance, fio with "ioengine=mmap" performs I/Os with mmap() [1].
It calls mmap() with a NULL address, which needs to be changed to
provide a pmd-aligned address for testing with DAX pmd mappings.
Changing all applications that call mmap() with NULL is undesirable.

Add dax_get_unmapped_area(), which can be called by filesystem's
get_unmapped_area to align an mmap address by the pmd size for
a DAX file.  It calls the default handler, mm->get_unmapped_area(),
to find a range and then aligns it for a DAX file.

[1]: https://github.com/axboe/fio/blob/master/engines/mmap.c
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
---
 fs/dax.c            |   43 +++++++++++++++++++++++++++++++++++++++++++
 include/linux/dax.h |    3 +++
 2 files changed, 46 insertions(+)

diff --git a/fs/dax.c b/fs/dax.c
index 75ba46d..f8ddd27 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1158,3 +1158,46 @@ int dax_truncate_page(struct inode *inode, loff_t from, get_block_t get_block)
 	return dax_zero_page_range(inode, from, length, get_block);
 }
 EXPORT_SYMBOL_GPL(dax_truncate_page);
+
+/**
+ * dax_get_unmapped_area - handle get_unmapped_area for a DAX file
+ * @filp: The file being mmap'd, if not NULL
+ * @addr: The mmap address. If NULL, the kernel assigns the address
+ * @len: The mmap size in bytes
+ * @pgoff: The page offset in the file where the mapping starts from.
+ * @flags: The mmap flags
+ *
+ * This function can be called by a filesystem for get_unmapped_area().
+ * When a target file is a DAX file, it aligns the mmap address at the
+ * beginning of the file by the pmd size.
+ */
+unsigned long dax_get_unmapped_area(struct file *filp, unsigned long addr,
+		unsigned long len, unsigned long pgoff, unsigned long flags)
+{
+	unsigned long off, off_end, off_pmd, len_pmd, addr_pmd;
+
+	if (!IS_ENABLED(CONFIG_FS_DAX_PMD) ||
+	    !filp || addr || !IS_DAX(filp->f_mapping->host))
+		goto out;
+
+	off = pgoff << PAGE_SHIFT;
+	off_end = off + len;
+	off_pmd = round_up(off, PMD_SIZE);  /* pmd-aligned offset */
+
+	if ((off_end <= off_pmd) || ((off_end - off_pmd) < PMD_SIZE))
+		goto out;
+
+	len_pmd = len + PMD_SIZE;
+	if ((off + len_pmd) < off)
+		goto out;
+
+	addr_pmd = current->mm->get_unmapped_area(filp, addr, len_pmd,
+						  pgoff, flags);
+	if (!IS_ERR_VALUE(addr_pmd)) {
+		addr_pmd += (off - addr_pmd) & (PMD_SIZE - 1);
+		return addr_pmd;
+	}
+out:
+	return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
+}
+EXPORT_SYMBOL_GPL(dax_get_unmapped_area);
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 636dd59..184b171 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -17,12 +17,15 @@ int __dax_fault(struct vm_area_struct *, struct vm_fault *, get_block_t,
 
 #ifdef CONFIG_FS_DAX
 struct page *read_dax_sector(struct block_device *bdev, sector_t n);
+unsigned long dax_get_unmapped_area(struct file *filp, unsigned long addr,
+		unsigned long len, unsigned long pgoff, unsigned long flags);
 #else
 static inline struct page *read_dax_sector(struct block_device *bdev,
 		sector_t n)
 {
 	return ERR_PTR(-ENXIO);
 }
+#define dax_get_unmapped_area	NULL
 #endif
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 2/2] ext2/4, xfs, blk: call dax_get_unmapped_area() for DAX pmd mappings
  2016-04-14 16:48 ` Toshi Kani
  (?)
@ 2016-04-14 16:48   ` Toshi Kani
  -1 siblings, 0 replies; 35+ messages in thread
From: Toshi Kani @ 2016-04-14 16:48 UTC (permalink / raw)
  To: akpm, dan.j.williams, viro
  Cc: tytso, linux-nvdimm, jack, david, linux-kernel, linux-mm,
	adilger.kernel, linux-fsdevel, kirill.shutemov

To support DAX pmd mappings with unmodified applications,
filesystems need to align an mmap address by the pmd size.

Call dax_get_unmapped_area() from f_op->get_unmapped_area.

Note, there is no change in behavior for a non-DAX file.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
---
 fs/block_dev.c    |    1 +
 fs/ext2/file.c    |    1 +
 fs/ext4/file.c    |    1 +
 fs/xfs/xfs_file.c |    1 +
 4 files changed, 4 insertions(+)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 20a2c02..52518e0 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1798,6 +1798,7 @@ const struct file_operations def_blk_fops = {
 	.write_iter	= blkdev_write_iter,
 	.mmap		= blkdev_mmap,
 	.fsync		= blkdev_fsync,
+	.get_unmapped_area = dax_get_unmapped_area,
 	.unlocked_ioctl	= block_ioctl,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	= compat_blkdev_ioctl,
diff --git a/fs/ext2/file.c b/fs/ext2/file.c
index c1400b1..dbf11eb 100644
--- a/fs/ext2/file.c
+++ b/fs/ext2/file.c
@@ -172,6 +172,7 @@ const struct file_operations ext2_file_operations = {
 	.open		= dquot_file_open,
 	.release	= ext2_release_file,
 	.fsync		= ext2_fsync,
+	.get_unmapped_area = dax_get_unmapped_area,
 	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 };
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index fa2208b..6c268f8 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -708,6 +708,7 @@ const struct file_operations ext4_file_operations = {
 	.open		= ext4_file_open,
 	.release	= ext4_release_file,
 	.fsync		= ext4_sync_file,
+	.get_unmapped_area = dax_get_unmapped_area,
 	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.fallocate	= ext4_fallocate,
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 569938a..1e409a6 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1708,6 +1708,7 @@ const struct file_operations xfs_file_operations = {
 	.open		= xfs_file_open,
 	.release	= xfs_file_release,
 	.fsync		= xfs_file_fsync,
+	.get_unmapped_area = dax_get_unmapped_area,
 	.fallocate	= xfs_file_fallocate,
 };
 
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 2/2] ext2/4, xfs, blk: call dax_get_unmapped_area() for DAX pmd mappings
@ 2016-04-14 16:48   ` Toshi Kani
  0 siblings, 0 replies; 35+ messages in thread
From: Toshi Kani @ 2016-04-14 16:48 UTC (permalink / raw)
  To: akpm, dan.j.williams, viro
  Cc: willy, ross.zwisler, kirill.shutemov, david, jack, tytso,
	adilger.kernel, linux-nvdimm, linux-fsdevel, linux-mm,
	linux-kernel, Toshi Kani

To support DAX pmd mappings with unmodified applications,
filesystems need to align an mmap address by the pmd size.

Call dax_get_unmapped_area() from f_op->get_unmapped_area.

Note, there is no change in behavior for a non-DAX file.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
---
 fs/block_dev.c    |    1 +
 fs/ext2/file.c    |    1 +
 fs/ext4/file.c    |    1 +
 fs/xfs/xfs_file.c |    1 +
 4 files changed, 4 insertions(+)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 20a2c02..52518e0 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1798,6 +1798,7 @@ const struct file_operations def_blk_fops = {
 	.write_iter	= blkdev_write_iter,
 	.mmap		= blkdev_mmap,
 	.fsync		= blkdev_fsync,
+	.get_unmapped_area = dax_get_unmapped_area,
 	.unlocked_ioctl	= block_ioctl,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	= compat_blkdev_ioctl,
diff --git a/fs/ext2/file.c b/fs/ext2/file.c
index c1400b1..dbf11eb 100644
--- a/fs/ext2/file.c
+++ b/fs/ext2/file.c
@@ -172,6 +172,7 @@ const struct file_operations ext2_file_operations = {
 	.open		= dquot_file_open,
 	.release	= ext2_release_file,
 	.fsync		= ext2_fsync,
+	.get_unmapped_area = dax_get_unmapped_area,
 	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 };
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index fa2208b..6c268f8 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -708,6 +708,7 @@ const struct file_operations ext4_file_operations = {
 	.open		= ext4_file_open,
 	.release	= ext4_release_file,
 	.fsync		= ext4_sync_file,
+	.get_unmapped_area = dax_get_unmapped_area,
 	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.fallocate	= ext4_fallocate,
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 569938a..1e409a6 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1708,6 +1708,7 @@ const struct file_operations xfs_file_operations = {
 	.open		= xfs_file_open,
 	.release	= xfs_file_release,
 	.fsync		= xfs_file_fsync,
+	.get_unmapped_area = dax_get_unmapped_area,
 	.fallocate	= xfs_file_fallocate,
 };
 

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 2/2] ext2/4, xfs, blk: call dax_get_unmapped_area() for DAX pmd mappings
@ 2016-04-14 16:48   ` Toshi Kani
  0 siblings, 0 replies; 35+ messages in thread
From: Toshi Kani @ 2016-04-14 16:48 UTC (permalink / raw)
  To: akpm, dan.j.williams, viro
  Cc: willy, ross.zwisler, kirill.shutemov, david, jack, tytso,
	adilger.kernel, linux-nvdimm, linux-fsdevel, linux-mm,
	linux-kernel, Toshi Kani

To support DAX pmd mappings with unmodified applications,
filesystems need to align an mmap address by the pmd size.

Call dax_get_unmapped_area() from f_op->get_unmapped_area.

Note, there is no change in behavior for a non-DAX file.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
---
 fs/block_dev.c    |    1 +
 fs/ext2/file.c    |    1 +
 fs/ext4/file.c    |    1 +
 fs/xfs/xfs_file.c |    1 +
 4 files changed, 4 insertions(+)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 20a2c02..52518e0 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1798,6 +1798,7 @@ const struct file_operations def_blk_fops = {
 	.write_iter	= blkdev_write_iter,
 	.mmap		= blkdev_mmap,
 	.fsync		= blkdev_fsync,
+	.get_unmapped_area = dax_get_unmapped_area,
 	.unlocked_ioctl	= block_ioctl,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	= compat_blkdev_ioctl,
diff --git a/fs/ext2/file.c b/fs/ext2/file.c
index c1400b1..dbf11eb 100644
--- a/fs/ext2/file.c
+++ b/fs/ext2/file.c
@@ -172,6 +172,7 @@ const struct file_operations ext2_file_operations = {
 	.open		= dquot_file_open,
 	.release	= ext2_release_file,
 	.fsync		= ext2_fsync,
+	.get_unmapped_area = dax_get_unmapped_area,
 	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 };
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index fa2208b..6c268f8 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -708,6 +708,7 @@ const struct file_operations ext4_file_operations = {
 	.open		= ext4_file_open,
 	.release	= ext4_release_file,
 	.fsync		= ext4_sync_file,
+	.get_unmapped_area = dax_get_unmapped_area,
 	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.fallocate	= ext4_fallocate,
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 569938a..1e409a6 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1708,6 +1708,7 @@ const struct file_operations xfs_file_operations = {
 	.open		= xfs_file_open,
 	.release	= xfs_file_release,
 	.fsync		= xfs_file_fsync,
+	.get_unmapped_area = dax_get_unmapped_area,
 	.fallocate	= xfs_file_fallocate,
 };
 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
  2016-04-14 16:48 ` Toshi Kani
@ 2016-04-16  5:05   ` Andrew Morton
  -1 siblings, 0 replies; 35+ messages in thread
From: Andrew Morton @ 2016-04-16  5:05 UTC (permalink / raw)
  To: Toshi Kani
  Cc: dan.j.williams, viro, willy, ross.zwisler, kirill.shutemov,
	david, jack, tytso, adilger.kernel, linux-nvdimm, linux-fsdevel,
	linux-mm, linux-kernel

On Thu, 14 Apr 2016 10:48:29 -0600 Toshi Kani <toshi.kani@hpe.com> wrote:

> When CONFIG_FS_DAX_PMD is set, DAX supports mmap() using pmd page
> size.  This feature relies on both mmap virtual address and FS
> block (i.e. physical address) to be aligned by the pmd page size.
> Users can use mkfs options to specify FS to align block allocations.
> However, aligning mmap address requires code changes to existing
> applications for providing a pmd-aligned address to mmap().
> 
> For instance, fio with "ioengine=mmap" performs I/Os with mmap() [1].
> It calls mmap() with a NULL address, which needs to be changed to
> provide a pmd-aligned address for testing with DAX pmd mappings.
> Changing all applications that call mmap() with NULL is undesirable.
> 
> This patch-set extends filesystems to align an mmap address for
> a DAX file so that unmodified applications can use DAX pmd mappings.

Matthew sounded unconvinced about the need for this patchset, but I
must say that

: The point is that we do not need to modify existing applications for using
: DAX PMD mappings.
: 
: For instance, fio with "ioengine=mmap" performs I/Os with mmap(). 
: https://github.com/caius/fio/blob/master/engines/mmap.c
: 
: With this change, unmodified fio can be used for testing with DAX PMD
: mappings.  There are many examples like this, and I do not think we want
: to modify all applications that we want to evaluate/test with.

sounds pretty convincing?


And if we go ahead with this, it looks like 4.7 material to me - it
affects ABI and we want to get that stabilized asap.  What do people
think?

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
@ 2016-04-16  5:05   ` Andrew Morton
  0 siblings, 0 replies; 35+ messages in thread
From: Andrew Morton @ 2016-04-16  5:05 UTC (permalink / raw)
  To: Toshi Kani
  Cc: dan.j.williams, viro, willy, ross.zwisler, kirill.shutemov,
	david, jack, tytso, adilger.kernel, linux-nvdimm, linux-fsdevel,
	linux-mm, linux-kernel

On Thu, 14 Apr 2016 10:48:29 -0600 Toshi Kani <toshi.kani@hpe.com> wrote:

> When CONFIG_FS_DAX_PMD is set, DAX supports mmap() using pmd page
> size.  This feature relies on both mmap virtual address and FS
> block (i.e. physical address) to be aligned by the pmd page size.
> Users can use mkfs options to specify FS to align block allocations.
> However, aligning mmap address requires code changes to existing
> applications for providing a pmd-aligned address to mmap().
> 
> For instance, fio with "ioengine=mmap" performs I/Os with mmap() [1].
> It calls mmap() with a NULL address, which needs to be changed to
> provide a pmd-aligned address for testing with DAX pmd mappings.
> Changing all applications that call mmap() with NULL is undesirable.
> 
> This patch-set extends filesystems to align an mmap address for
> a DAX file so that unmodified applications can use DAX pmd mappings.

Matthew sounded unconvinced about the need for this patchset, but I
must say that

: The point is that we do not need to modify existing applications for using
: DAX PMD mappings.
: 
: For instance, fio with "ioengine=mmap" performs I/Os with mmap(). 
: https://github.com/caius/fio/blob/master/engines/mmap.c
: 
: With this change, unmodified fio can be used for testing with DAX PMD
: mappings.  There are many examples like this, and I do not think we want
: to modify all applications that we want to evaluate/test with.

sounds pretty convincing?


And if we go ahead with this, it looks like 4.7 material to me - it
affects ABI and we want to get that stabilized asap.  What do people
think?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
  2016-04-16  5:05   ` Andrew Morton
@ 2016-04-18 20:26     ` Jan Kara
  -1 siblings, 0 replies; 35+ messages in thread
From: Jan Kara @ 2016-04-18 20:26 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Toshi Kani, dan.j.williams, viro, willy, ross.zwisler,
	kirill.shutemov, david, jack, tytso, adilger.kernel,
	linux-nvdimm, linux-fsdevel, linux-mm, linux-kernel

On Fri 15-04-16 22:05:31, Andrew Morton wrote:
> On Thu, 14 Apr 2016 10:48:29 -0600 Toshi Kani <toshi.kani@hpe.com> wrote:
> 
> > When CONFIG_FS_DAX_PMD is set, DAX supports mmap() using pmd page
> > size.  This feature relies on both mmap virtual address and FS
> > block (i.e. physical address) to be aligned by the pmd page size.
> > Users can use mkfs options to specify FS to align block allocations.
> > However, aligning mmap address requires code changes to existing
> > applications for providing a pmd-aligned address to mmap().
> > 
> > For instance, fio with "ioengine=mmap" performs I/Os with mmap() [1].
> > It calls mmap() with a NULL address, which needs to be changed to
> > provide a pmd-aligned address for testing with DAX pmd mappings.
> > Changing all applications that call mmap() with NULL is undesirable.
> > 
> > This patch-set extends filesystems to align an mmap address for
> > a DAX file so that unmodified applications can use DAX pmd mappings.
> 
> Matthew sounded unconvinced about the need for this patchset, but I
> must say that
> 
> : The point is that we do not need to modify existing applications for using
> : DAX PMD mappings.
> : 
> : For instance, fio with "ioengine=mmap" performs I/Os with mmap(). 
> : https://github.com/caius/fio/blob/master/engines/mmap.c
> : 
> : With this change, unmodified fio can be used for testing with DAX PMD
> : mappings.  There are many examples like this, and I do not think we want
> : to modify all applications that we want to evaluate/test with.
> 
> sounds pretty convincing?
> 
> 
> And if we go ahead with this, it looks like 4.7 material to me - it
> affects ABI and we want to get that stabilized asap.  What do people
> think?

So I think Mathew didn't question the patch set as a whole. I think we all
agree that we should align the virtual address we map to so that PMD
mappings can be used. What Mathew was questioning was whether we really
need to play tricks when logical offset in the file where mmap is starting
is not aligned (and similarly for map length). Whether allowing PMD
mappings for unaligned file offsets is worth the complication is IMO a
valid question.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
@ 2016-04-18 20:26     ` Jan Kara
  0 siblings, 0 replies; 35+ messages in thread
From: Jan Kara @ 2016-04-18 20:26 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Toshi Kani, dan.j.williams, viro, willy, ross.zwisler,
	kirill.shutemov, david, jack, tytso, adilger.kernel,
	linux-nvdimm, linux-fsdevel, linux-mm, linux-kernel

On Fri 15-04-16 22:05:31, Andrew Morton wrote:
> On Thu, 14 Apr 2016 10:48:29 -0600 Toshi Kani <toshi.kani@hpe.com> wrote:
> 
> > When CONFIG_FS_DAX_PMD is set, DAX supports mmap() using pmd page
> > size.  This feature relies on both mmap virtual address and FS
> > block (i.e. physical address) to be aligned by the pmd page size.
> > Users can use mkfs options to specify FS to align block allocations.
> > However, aligning mmap address requires code changes to existing
> > applications for providing a pmd-aligned address to mmap().
> > 
> > For instance, fio with "ioengine=mmap" performs I/Os with mmap() [1].
> > It calls mmap() with a NULL address, which needs to be changed to
> > provide a pmd-aligned address for testing with DAX pmd mappings.
> > Changing all applications that call mmap() with NULL is undesirable.
> > 
> > This patch-set extends filesystems to align an mmap address for
> > a DAX file so that unmodified applications can use DAX pmd mappings.
> 
> Matthew sounded unconvinced about the need for this patchset, but I
> must say that
> 
> : The point is that we do not need to modify existing applications for using
> : DAX PMD mappings.
> : 
> : For instance, fio with "ioengine=mmap" performs I/Os with mmap(). 
> : https://github.com/caius/fio/blob/master/engines/mmap.c
> : 
> : With this change, unmodified fio can be used for testing with DAX PMD
> : mappings.  There are many examples like this, and I do not think we want
> : to modify all applications that we want to evaluate/test with.
> 
> sounds pretty convincing?
> 
> 
> And if we go ahead with this, it looks like 4.7 material to me - it
> affects ABI and we want to get that stabilized asap.  What do people
> think?

So I think Mathew didn't question the patch set as a whole. I think we all
agree that we should align the virtual address we map to so that PMD
mappings can be used. What Mathew was questioning was whether we really
need to play tricks when logical offset in the file where mmap is starting
is not aligned (and similarly for map length). Whether allowing PMD
mappings for unaligned file offsets is worth the complication is IMO a
valid question.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 1/2] dax: add dax_get_unmapped_area for pmd mappings
  2016-04-14 16:48   ` Toshi Kani
  (?)
@ 2016-04-18 20:47     ` Jan Kara
  -1 siblings, 0 replies; 35+ messages in thread
From: Jan Kara @ 2016-04-18 20:47 UTC (permalink / raw)
  To: Toshi Kani
  Cc: tytso, jack, linux-nvdimm, david, linux-kernel, linux-mm,
	adilger.kernel, viro, linux-fsdevel, akpm, kirill.shutemov

On Thu 14-04-16 10:48:30, Toshi Kani wrote:
> +
> +/**
> + * dax_get_unmapped_area - handle get_unmapped_area for a DAX file
> + * @filp: The file being mmap'd, if not NULL
> + * @addr: The mmap address. If NULL, the kernel assigns the address
> + * @len: The mmap size in bytes
> + * @pgoff: The page offset in the file where the mapping starts from.
> + * @flags: The mmap flags
> + *
> + * This function can be called by a filesystem for get_unmapped_area().
> + * When a target file is a DAX file, it aligns the mmap address at the
> + * beginning of the file by the pmd size.
> + */
> +unsigned long dax_get_unmapped_area(struct file *filp, unsigned long addr,
> +		unsigned long len, unsigned long pgoff, unsigned long flags)
> +{
> +	unsigned long off, off_end, off_pmd, len_pmd, addr_pmd;

I think we need to use 'loff_t' for the offsets for things to work on
32-bits.

> +	if (!IS_ENABLED(CONFIG_FS_DAX_PMD) ||
> +	    !filp || addr || !IS_DAX(filp->f_mapping->host))
> +		goto out;
> +
> +	off = pgoff << PAGE_SHIFT;

And here we need to type to loff_t before the shift...

> +	off_end = off + len;
> +	off_pmd = round_up(off, PMD_SIZE);  /* pmd-aligned offset */
> +
> +	if ((off_end <= off_pmd) || ((off_end - off_pmd) < PMD_SIZE))

None of these parenthesis is actually needed (and IMHO they make the code
less readable, not more).

> +		goto out;
> +
> +	len_pmd = len + PMD_SIZE;
> +	if ((off + len_pmd) < off)
> +		goto out;
> +
> +	addr_pmd = current->mm->get_unmapped_area(filp, addr, len_pmd,
> +						  pgoff, flags);
> +	if (!IS_ERR_VALUE(addr_pmd)) {
> +		addr_pmd += (off - addr_pmd) & (PMD_SIZE - 1);
> +		return addr_pmd;

Otherwise the patch looks good to me.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 1/2] dax: add dax_get_unmapped_area for pmd mappings
@ 2016-04-18 20:47     ` Jan Kara
  0 siblings, 0 replies; 35+ messages in thread
From: Jan Kara @ 2016-04-18 20:47 UTC (permalink / raw)
  To: Toshi Kani
  Cc: akpm, dan.j.williams, viro, willy, ross.zwisler, kirill.shutemov,
	david, jack, tytso, adilger.kernel, linux-nvdimm, linux-fsdevel,
	linux-mm, linux-kernel

On Thu 14-04-16 10:48:30, Toshi Kani wrote:
> +
> +/**
> + * dax_get_unmapped_area - handle get_unmapped_area for a DAX file
> + * @filp: The file being mmap'd, if not NULL
> + * @addr: The mmap address. If NULL, the kernel assigns the address
> + * @len: The mmap size in bytes
> + * @pgoff: The page offset in the file where the mapping starts from.
> + * @flags: The mmap flags
> + *
> + * This function can be called by a filesystem for get_unmapped_area().
> + * When a target file is a DAX file, it aligns the mmap address at the
> + * beginning of the file by the pmd size.
> + */
> +unsigned long dax_get_unmapped_area(struct file *filp, unsigned long addr,
> +		unsigned long len, unsigned long pgoff, unsigned long flags)
> +{
> +	unsigned long off, off_end, off_pmd, len_pmd, addr_pmd;

I think we need to use 'loff_t' for the offsets for things to work on
32-bits.

> +	if (!IS_ENABLED(CONFIG_FS_DAX_PMD) ||
> +	    !filp || addr || !IS_DAX(filp->f_mapping->host))
> +		goto out;
> +
> +	off = pgoff << PAGE_SHIFT;

And here we need to type to loff_t before the shift...

> +	off_end = off + len;
> +	off_pmd = round_up(off, PMD_SIZE);  /* pmd-aligned offset */
> +
> +	if ((off_end <= off_pmd) || ((off_end - off_pmd) < PMD_SIZE))

None of these parenthesis is actually needed (and IMHO they make the code
less readable, not more).

> +		goto out;
> +
> +	len_pmd = len + PMD_SIZE;
> +	if ((off + len_pmd) < off)
> +		goto out;
> +
> +	addr_pmd = current->mm->get_unmapped_area(filp, addr, len_pmd,
> +						  pgoff, flags);
> +	if (!IS_ERR_VALUE(addr_pmd)) {
> +		addr_pmd += (off - addr_pmd) & (PMD_SIZE - 1);
> +		return addr_pmd;

Otherwise the patch looks good to me.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 1/2] dax: add dax_get_unmapped_area for pmd mappings
@ 2016-04-18 20:47     ` Jan Kara
  0 siblings, 0 replies; 35+ messages in thread
From: Jan Kara @ 2016-04-18 20:47 UTC (permalink / raw)
  To: Toshi Kani
  Cc: akpm, dan.j.williams, viro, willy, ross.zwisler, kirill.shutemov,
	david, jack, tytso, adilger.kernel, linux-nvdimm, linux-fsdevel,
	linux-mm, linux-kernel

On Thu 14-04-16 10:48:30, Toshi Kani wrote:
> +
> +/**
> + * dax_get_unmapped_area - handle get_unmapped_area for a DAX file
> + * @filp: The file being mmap'd, if not NULL
> + * @addr: The mmap address. If NULL, the kernel assigns the address
> + * @len: The mmap size in bytes
> + * @pgoff: The page offset in the file where the mapping starts from.
> + * @flags: The mmap flags
> + *
> + * This function can be called by a filesystem for get_unmapped_area().
> + * When a target file is a DAX file, it aligns the mmap address at the
> + * beginning of the file by the pmd size.
> + */
> +unsigned long dax_get_unmapped_area(struct file *filp, unsigned long addr,
> +		unsigned long len, unsigned long pgoff, unsigned long flags)
> +{
> +	unsigned long off, off_end, off_pmd, len_pmd, addr_pmd;

I think we need to use 'loff_t' for the offsets for things to work on
32-bits.

> +	if (!IS_ENABLED(CONFIG_FS_DAX_PMD) ||
> +	    !filp || addr || !IS_DAX(filp->f_mapping->host))
> +		goto out;
> +
> +	off = pgoff << PAGE_SHIFT;

And here we need to type to loff_t before the shift...

> +	off_end = off + len;
> +	off_pmd = round_up(off, PMD_SIZE);  /* pmd-aligned offset */
> +
> +	if ((off_end <= off_pmd) || ((off_end - off_pmd) < PMD_SIZE))

None of these parenthesis is actually needed (and IMHO they make the code
less readable, not more).

> +		goto out;
> +
> +	len_pmd = len + PMD_SIZE;
> +	if ((off + len_pmd) < off)
> +		goto out;
> +
> +	addr_pmd = current->mm->get_unmapped_area(filp, addr, len_pmd,
> +						  pgoff, flags);
> +	if (!IS_ERR_VALUE(addr_pmd)) {
> +		addr_pmd += (off - addr_pmd) & (PMD_SIZE - 1);
> +		return addr_pmd;

Otherwise the patch looks good to me.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 1/2] dax: add dax_get_unmapped_area for pmd mappings
  2016-04-18 20:47     ` Jan Kara
  (?)
@ 2016-04-19  2:36       ` Toshi Kani
  -1 siblings, 0 replies; 35+ messages in thread
From: Toshi Kani @ 2016-04-19  2:36 UTC (permalink / raw)
  To: Jan Kara
  Cc: tytso, linux-nvdimm, david, linux-kernel, linux-mm,
	adilger.kernel, viro, linux-fsdevel, akpm, kirill.shutemov

On 4/18/2016 4:47 PM, Jan Kara wrote:
> On Thu 14-04-16 10:48:30, Toshi Kani wrote:
>> +
>> +/**
>> + * dax_get_unmapped_area - handle get_unmapped_area for a DAX file
>> + * @filp: The file being mmap'd, if not NULL
>> + * @addr: The mmap address. If NULL, the kernel assigns the address
>> + * @len: The mmap size in bytes
>> + * @pgoff: The page offset in the file where the mapping starts from.
>> + * @flags: The mmap flags
>> + *
>> + * This function can be called by a filesystem for get_unmapped_area().
>> + * When a target file is a DAX file, it aligns the mmap address at the
>> + * beginning of the file by the pmd size.
>> + */
>> +unsigned long dax_get_unmapped_area(struct file *filp, unsigned long addr,
>> +		unsigned long len, unsigned long pgoff, unsigned long flags)
>> +{
>> +	unsigned long off, off_end, off_pmd, len_pmd, addr_pmd;
> I think we need to use 'loff_t' for the offsets for things to work on
> 32-bits.

Agreed. Will change to loff_t.

>> +	if (!IS_ENABLED(CONFIG_FS_DAX_PMD) ||
>> +	    !filp || addr || !IS_DAX(filp->f_mapping->host))
>> +		goto out;
>> +
>> +	off = pgoff << PAGE_SHIFT;
> And here we need to type to loff_t before the shift...

Right.

>> +	off_end = off + len;
>> +	off_pmd = round_up(off, PMD_SIZE);  /* pmd-aligned offset */
>> +
>> +	if ((off_end <= off_pmd) || ((off_end - off_pmd) < PMD_SIZE))
> None of these parenthesis is actually needed (and IMHO they make the code
> less readable, not more).

OK.  Will remove the parenthesis.

>> +		goto out;
>> +
>> +	len_pmd = len + PMD_SIZE;
>> +	if ((off + len_pmd) < off)
>> +		goto out;
>> +
>> +	addr_pmd = current->mm->get_unmapped_area(filp, addr, len_pmd,
>> +						  pgoff, flags);
>> +	if (!IS_ERR_VALUE(addr_pmd)) {
>> +		addr_pmd += (off - addr_pmd) & (PMD_SIZE - 1);
>> +		return addr_pmd;
> Otherwise the patch looks good to me.

Great. Thanks Jan!
-Toshi
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 1/2] dax: add dax_get_unmapped_area for pmd mappings
@ 2016-04-19  2:36       ` Toshi Kani
  0 siblings, 0 replies; 35+ messages in thread
From: Toshi Kani @ 2016-04-19  2:36 UTC (permalink / raw)
  To: Jan Kara
  Cc: akpm, dan.j.williams, viro, willy, ross.zwisler, kirill.shutemov,
	david, tytso, adilger.kernel, linux-nvdimm@lists.01.org,
	linux-fsdevel, linux-mm, linux-kernel

On 4/18/2016 4:47 PM, Jan Kara wrote:
> On Thu 14-04-16 10:48:30, Toshi Kani wrote:
>> +
>> +/**
>> + * dax_get_unmapped_area - handle get_unmapped_area for a DAX file
>> + * @filp: The file being mmap'd, if not NULL
>> + * @addr: The mmap address. If NULL, the kernel assigns the address
>> + * @len: The mmap size in bytes
>> + * @pgoff: The page offset in the file where the mapping starts from.
>> + * @flags: The mmap flags
>> + *
>> + * This function can be called by a filesystem for get_unmapped_area().
>> + * When a target file is a DAX file, it aligns the mmap address at the
>> + * beginning of the file by the pmd size.
>> + */
>> +unsigned long dax_get_unmapped_area(struct file *filp, unsigned long addr,
>> +		unsigned long len, unsigned long pgoff, unsigned long flags)
>> +{
>> +	unsigned long off, off_end, off_pmd, len_pmd, addr_pmd;
> I think we need to use 'loff_t' for the offsets for things to work on
> 32-bits.

Agreed. Will change to loff_t.

>> +	if (!IS_ENABLED(CONFIG_FS_DAX_PMD) ||
>> +	    !filp || addr || !IS_DAX(filp->f_mapping->host))
>> +		goto out;
>> +
>> +	off = pgoff << PAGE_SHIFT;
> And here we need to type to loff_t before the shift...

Right.

>> +	off_end = off + len;
>> +	off_pmd = round_up(off, PMD_SIZE);  /* pmd-aligned offset */
>> +
>> +	if ((off_end <= off_pmd) || ((off_end - off_pmd) < PMD_SIZE))
> None of these parenthesis is actually needed (and IMHO they make the code
> less readable, not more).

OK.  Will remove the parenthesis.

>> +		goto out;
>> +
>> +	len_pmd = len + PMD_SIZE;
>> +	if ((off + len_pmd) < off)
>> +		goto out;
>> +
>> +	addr_pmd = current->mm->get_unmapped_area(filp, addr, len_pmd,
>> +						  pgoff, flags);
>> +	if (!IS_ERR_VALUE(addr_pmd)) {
>> +		addr_pmd += (off - addr_pmd) & (PMD_SIZE - 1);
>> +		return addr_pmd;
> Otherwise the patch looks good to me.

Great. Thanks Jan!
-Toshi

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 1/2] dax: add dax_get_unmapped_area for pmd mappings
@ 2016-04-19  2:36       ` Toshi Kani
  0 siblings, 0 replies; 35+ messages in thread
From: Toshi Kani @ 2016-04-19  2:36 UTC (permalink / raw)
  To: Jan Kara
  Cc: akpm, dan.j.williams, viro, willy, ross.zwisler, kirill.shutemov,
	david, tytso, adilger.kernel, linux-nvdimm, linux-fsdevel,
	linux-mm, linux-kernel

On 4/18/2016 4:47 PM, Jan Kara wrote:
> On Thu 14-04-16 10:48:30, Toshi Kani wrote:
>> +
>> +/**
>> + * dax_get_unmapped_area - handle get_unmapped_area for a DAX file
>> + * @filp: The file being mmap'd, if not NULL
>> + * @addr: The mmap address. If NULL, the kernel assigns the address
>> + * @len: The mmap size in bytes
>> + * @pgoff: The page offset in the file where the mapping starts from.
>> + * @flags: The mmap flags
>> + *
>> + * This function can be called by a filesystem for get_unmapped_area().
>> + * When a target file is a DAX file, it aligns the mmap address at the
>> + * beginning of the file by the pmd size.
>> + */
>> +unsigned long dax_get_unmapped_area(struct file *filp, unsigned long addr,
>> +		unsigned long len, unsigned long pgoff, unsigned long flags)
>> +{
>> +	unsigned long off, off_end, off_pmd, len_pmd, addr_pmd;
> I think we need to use 'loff_t' for the offsets for things to work on
> 32-bits.

Agreed. Will change to loff_t.

>> +	if (!IS_ENABLED(CONFIG_FS_DAX_PMD) ||
>> +	    !filp || addr || !IS_DAX(filp->f_mapping->host))
>> +		goto out;
>> +
>> +	off = pgoff << PAGE_SHIFT;
> And here we need to type to loff_t before the shift...

Right.

>> +	off_end = off + len;
>> +	off_pmd = round_up(off, PMD_SIZE);  /* pmd-aligned offset */
>> +
>> +	if ((off_end <= off_pmd) || ((off_end - off_pmd) < PMD_SIZE))
> None of these parenthesis is actually needed (and IMHO they make the code
> less readable, not more).

OK.  Will remove the parenthesis.

>> +		goto out;
>> +
>> +	len_pmd = len + PMD_SIZE;
>> +	if ((off + len_pmd) < off)
>> +		goto out;
>> +
>> +	addr_pmd = current->mm->get_unmapped_area(filp, addr, len_pmd,
>> +						  pgoff, flags);
>> +	if (!IS_ERR_VALUE(addr_pmd)) {
>> +		addr_pmd += (off - addr_pmd) & (PMD_SIZE - 1);
>> +		return addr_pmd;
> Otherwise the patch looks good to me.

Great. Thanks Jan!
-Toshi

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
  2016-04-18 20:26     ` Jan Kara
@ 2016-04-19 18:23       ` Matthew Wilcox
  -1 siblings, 0 replies; 35+ messages in thread
From: Matthew Wilcox @ 2016-04-19 18:23 UTC (permalink / raw)
  To: Jan Kara
  Cc: Andrew Morton, Toshi Kani, dan.j.williams, viro, ross.zwisler,
	kirill.shutemov, david, tytso, adilger.kernel, linux-nvdimm,
	linux-fsdevel, linux-mm, linux-kernel

On Mon, Apr 18, 2016 at 10:26:10PM +0200, Jan Kara wrote:
> On Fri 15-04-16 22:05:31, Andrew Morton wrote:
> > On Thu, 14 Apr 2016 10:48:29 -0600 Toshi Kani <toshi.kani@hpe.com> wrote:
> > 
> > > When CONFIG_FS_DAX_PMD is set, DAX supports mmap() using pmd page
> > > size.  This feature relies on both mmap virtual address and FS
> > > block (i.e. physical address) to be aligned by the pmd page size.
> > > Users can use mkfs options to specify FS to align block allocations.
> > > However, aligning mmap address requires code changes to existing
> > > applications for providing a pmd-aligned address to mmap().
> > > 
> > > For instance, fio with "ioengine=mmap" performs I/Os with mmap() [1].
> > > It calls mmap() with a NULL address, which needs to be changed to
> > > provide a pmd-aligned address for testing with DAX pmd mappings.
> > > Changing all applications that call mmap() with NULL is undesirable.
> > > 
> > > This patch-set extends filesystems to align an mmap address for
> > > a DAX file so that unmodified applications can use DAX pmd mappings.
> > 
> > Matthew sounded unconvinced about the need for this patchset, but I
> > must say that
> > 
> > : The point is that we do not need to modify existing applications for using
> > : DAX PMD mappings.
> > : 
> > : For instance, fio with "ioengine=mmap" performs I/Os with mmap(). 
> > : https://github.com/caius/fio/blob/master/engines/mmap.c
> > : 
> > : With this change, unmodified fio can be used for testing with DAX PMD
> > : mappings.  There are many examples like this, and I do not think we want
> > : to modify all applications that we want to evaluate/test with.
> > 
> > sounds pretty convincing?
> > 
> > 
> > And if we go ahead with this, it looks like 4.7 material to me - it
> > affects ABI and we want to get that stabilized asap.  What do people
> > think?
> 
> So I think Mathew didn't question the patch set as a whole. I think we all
> agree that we should align the virtual address we map to so that PMD
> mappings can be used. What Mathew was questioning was whether we really
> need to play tricks when logical offset in the file where mmap is starting
> is not aligned (and similarly for map length). Whether allowing PMD
> mappings for unaligned file offsets is worth the complication is IMO a
> valid question.

I was questioning the approach as a whole ... since we have userspace
already doing this in the form of NVML, do we really need the kernel to
do this for us?

Now, a further wrinkle.  We have two competing patch sets (from Kirill
and Hugh) which are going to give us THP for page cache filesystems.
I would suggest that this is not DAX functionality but rather VFS
functionality to opportunistically align all mmaps on files which are
reasonably likely to be able to use THP.

I hadn't thought about this until earlier today, and I'm sorry I didn't
raise it further.  Perhaps we can do a lightning session on this later
today at LSFMM since all six (Toshi, Andrew, Jan, Hugh, Kirill and myself)
are here.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
@ 2016-04-19 18:23       ` Matthew Wilcox
  0 siblings, 0 replies; 35+ messages in thread
From: Matthew Wilcox @ 2016-04-19 18:23 UTC (permalink / raw)
  To: Jan Kara
  Cc: Andrew Morton, Toshi Kani, dan.j.williams, viro, ross.zwisler,
	kirill.shutemov, david, tytso, adilger.kernel, linux-nvdimm,
	linux-fsdevel, linux-mm, linux-kernel

On Mon, Apr 18, 2016 at 10:26:10PM +0200, Jan Kara wrote:
> On Fri 15-04-16 22:05:31, Andrew Morton wrote:
> > On Thu, 14 Apr 2016 10:48:29 -0600 Toshi Kani <toshi.kani@hpe.com> wrote:
> > 
> > > When CONFIG_FS_DAX_PMD is set, DAX supports mmap() using pmd page
> > > size.  This feature relies on both mmap virtual address and FS
> > > block (i.e. physical address) to be aligned by the pmd page size.
> > > Users can use mkfs options to specify FS to align block allocations.
> > > However, aligning mmap address requires code changes to existing
> > > applications for providing a pmd-aligned address to mmap().
> > > 
> > > For instance, fio with "ioengine=mmap" performs I/Os with mmap() [1].
> > > It calls mmap() with a NULL address, which needs to be changed to
> > > provide a pmd-aligned address for testing with DAX pmd mappings.
> > > Changing all applications that call mmap() with NULL is undesirable.
> > > 
> > > This patch-set extends filesystems to align an mmap address for
> > > a DAX file so that unmodified applications can use DAX pmd mappings.
> > 
> > Matthew sounded unconvinced about the need for this patchset, but I
> > must say that
> > 
> > : The point is that we do not need to modify existing applications for using
> > : DAX PMD mappings.
> > : 
> > : For instance, fio with "ioengine=mmap" performs I/Os with mmap(). 
> > : https://github.com/caius/fio/blob/master/engines/mmap.c
> > : 
> > : With this change, unmodified fio can be used for testing with DAX PMD
> > : mappings.  There are many examples like this, and I do not think we want
> > : to modify all applications that we want to evaluate/test with.
> > 
> > sounds pretty convincing?
> > 
> > 
> > And if we go ahead with this, it looks like 4.7 material to me - it
> > affects ABI and we want to get that stabilized asap.  What do people
> > think?
> 
> So I think Mathew didn't question the patch set as a whole. I think we all
> agree that we should align the virtual address we map to so that PMD
> mappings can be used. What Mathew was questioning was whether we really
> need to play tricks when logical offset in the file where mmap is starting
> is not aligned (and similarly for map length). Whether allowing PMD
> mappings for unaligned file offsets is worth the complication is IMO a
> valid question.

I was questioning the approach as a whole ... since we have userspace
already doing this in the form of NVML, do we really need the kernel to
do this for us?

Now, a further wrinkle.  We have two competing patch sets (from Kirill
and Hugh) which are going to give us THP for page cache filesystems.
I would suggest that this is not DAX functionality but rather VFS
functionality to opportunistically align all mmaps on files which are
reasonably likely to be able to use THP.

I hadn't thought about this until earlier today, and I'm sorry I didn't
raise it further.  Perhaps we can do a lightning session on this later
today at LSFMM since all six (Toshi, Andrew, Jan, Hugh, Kirill and myself)
are here.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
  2016-04-19 18:23       ` Matthew Wilcox
@ 2016-04-21  3:10         ` Toshi Kani
  -1 siblings, 0 replies; 35+ messages in thread
From: Toshi Kani @ 2016-04-21  3:10 UTC (permalink / raw)
  To: Matthew Wilcox, Jan Kara
  Cc: Andrew Morton, dan.j.williams, viro, ross.zwisler,
	kirill.shutemov, david, tytso, adilger.kernel, linux-nvdimm,
	linux-fsdevel, linux-mm, linux-kernel

On 4/19/2016 2:23 PM, Matthew Wilcox wrote:
> On Mon, Apr 18, 2016 at 10:26:10PM +0200, Jan Kara wrote:
>> On Fri 15-04-16 22:05:31, Andrew Morton wrote:
>>> On Thu, 14 Apr 2016 10:48:29 -0600 Toshi Kani <toshi.kani@hpe.com> wrote:
>>>
>>>> When CONFIG_FS_DAX_PMD is set, DAX supports mmap() using pmd page
>>>> size.  This feature relies on both mmap virtual address and FS
>>>> block (i.e. physical address) to be aligned by the pmd page size.
>>>> Users can use mkfs options to specify FS to align block allocations.
>>>> However, aligning mmap address requires code changes to existing
>>>> applications for providing a pmd-aligned address to mmap().
>>>>
>>>> For instance, fio with "ioengine=mmap" performs I/Os with mmap() [1].
>>>> It calls mmap() with a NULL address, which needs to be changed to
>>>> provide a pmd-aligned address for testing with DAX pmd mappings.
>>>> Changing all applications that call mmap() with NULL is undesirable.
>>>>
>>>> This patch-set extends filesystems to align an mmap address for
>>>> a DAX file so that unmodified applications can use DAX pmd mappings.
>>> Matthew sounded unconvinced about the need for this patchset, but I
>>> must say that
>>>
>>> : The point is that we do not need to modify existing applications for using
>>> : DAX PMD mappings.
>>> :
>>> : For instance, fio with "ioengine=mmap" performs I/Os with mmap().
>>> : https://github.com/caius/fio/blob/master/engines/mmap.c
>>> :
>>> : With this change, unmodified fio can be used for testing with DAX PMD
>>> : mappings.  There are many examples like this, and I do not think we want
>>> : to modify all applications that we want to evaluate/test with.
>>>
>>> sounds pretty convincing?
>>>
>>>
>>> And if we go ahead with this, it looks like 4.7 material to me - it
>>> affects ABI and we want to get that stabilized asap.  What do people
>>> think?
>> So I think Mathew didn't question the patch set as a whole. I think we all
>> agree that we should align the virtual address we map to so that PMD
>> mappings can be used. What Mathew was questioning was whether we really
>> need to play tricks when logical offset in the file where mmap is starting
>> is not aligned (and similarly for map length). Whether allowing PMD
>> mappings for unaligned file offsets is worth the complication is IMO a
>> valid question.
> I was questioning the approach as a whole ... since we have userspace
> already doing this in the form of NVML, do we really need the kernel to
> do this for us?
>
> Now, a further wrinkle.  We have two competing patch sets (from Kirill
> and Hugh) which are going to give us THP for page cache filesystems.
> I would suggest that this is not DAX functionality but rather VFS
> functionality to opportunistically align all mmaps on files which are
> reasonably likely to be able to use THP.
>
> I hadn't thought about this until earlier today, and I'm sorry I didn't
> raise it further.  Perhaps we can do a lightning session on this later
> today at LSFMM since all six (Toshi, Andrew, Jan, Hugh, Kirill and myself)
> are here.

How about moving the function (as is) to mm/huge_memory.c, rename it to
get_hugepage_unmapped_area(), which is defined to NULL in huge_mm.h
when TRANSPARENT_HUGEPAGE is unset?

Thanks,
-Toshi

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
@ 2016-04-21  3:10         ` Toshi Kani
  0 siblings, 0 replies; 35+ messages in thread
From: Toshi Kani @ 2016-04-21  3:10 UTC (permalink / raw)
  To: Matthew Wilcox, Jan Kara
  Cc: Andrew Morton, dan.j.williams, viro, ross.zwisler,
	kirill.shutemov, david, tytso, adilger.kernel, linux-nvdimm,
	linux-fsdevel, linux-mm, linux-kernel

On 4/19/2016 2:23 PM, Matthew Wilcox wrote:
> On Mon, Apr 18, 2016 at 10:26:10PM +0200, Jan Kara wrote:
>> On Fri 15-04-16 22:05:31, Andrew Morton wrote:
>>> On Thu, 14 Apr 2016 10:48:29 -0600 Toshi Kani <toshi.kani@hpe.com> wrote:
>>>
>>>> When CONFIG_FS_DAX_PMD is set, DAX supports mmap() using pmd page
>>>> size.  This feature relies on both mmap virtual address and FS
>>>> block (i.e. physical address) to be aligned by the pmd page size.
>>>> Users can use mkfs options to specify FS to align block allocations.
>>>> However, aligning mmap address requires code changes to existing
>>>> applications for providing a pmd-aligned address to mmap().
>>>>
>>>> For instance, fio with "ioengine=mmap" performs I/Os with mmap() [1].
>>>> It calls mmap() with a NULL address, which needs to be changed to
>>>> provide a pmd-aligned address for testing with DAX pmd mappings.
>>>> Changing all applications that call mmap() with NULL is undesirable.
>>>>
>>>> This patch-set extends filesystems to align an mmap address for
>>>> a DAX file so that unmodified applications can use DAX pmd mappings.
>>> Matthew sounded unconvinced about the need for this patchset, but I
>>> must say that
>>>
>>> : The point is that we do not need to modify existing applications for using
>>> : DAX PMD mappings.
>>> :
>>> : For instance, fio with "ioengine=mmap" performs I/Os with mmap().
>>> : https://github.com/caius/fio/blob/master/engines/mmap.c
>>> :
>>> : With this change, unmodified fio can be used for testing with DAX PMD
>>> : mappings.  There are many examples like this, and I do not think we want
>>> : to modify all applications that we want to evaluate/test with.
>>>
>>> sounds pretty convincing?
>>>
>>>
>>> And if we go ahead with this, it looks like 4.7 material to me - it
>>> affects ABI and we want to get that stabilized asap.  What do people
>>> think?
>> So I think Mathew didn't question the patch set as a whole. I think we all
>> agree that we should align the virtual address we map to so that PMD
>> mappings can be used. What Mathew was questioning was whether we really
>> need to play tricks when logical offset in the file where mmap is starting
>> is not aligned (and similarly for map length). Whether allowing PMD
>> mappings for unaligned file offsets is worth the complication is IMO a
>> valid question.
> I was questioning the approach as a whole ... since we have userspace
> already doing this in the form of NVML, do we really need the kernel to
> do this for us?
>
> Now, a further wrinkle.  We have two competing patch sets (from Kirill
> and Hugh) which are going to give us THP for page cache filesystems.
> I would suggest that this is not DAX functionality but rather VFS
> functionality to opportunistically align all mmaps on files which are
> reasonably likely to be able to use THP.
>
> I hadn't thought about this until earlier today, and I'm sorry I didn't
> raise it further.  Perhaps we can do a lightning session on this later
> today at LSFMM since all six (Toshi, Andrew, Jan, Hugh, Kirill and myself)
> are here.

How about moving the function (as is) to mm/huge_memory.c, rename it to
get_hugepage_unmapped_area(), which is defined to NULL in huge_mm.h
when TRANSPARENT_HUGEPAGE is unset?

Thanks,
-Toshi

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
  2016-04-21  3:10         ` Toshi Kani
@ 2016-04-21  7:06           ` Matthew Wilcox
  -1 siblings, 0 replies; 35+ messages in thread
From: Matthew Wilcox @ 2016-04-21  7:06 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Jan Kara, Andrew Morton, dan.j.williams, viro, ross.zwisler,
	kirill.shutemov, david, tytso, adilger.kernel, linux-nvdimm,
	linux-fsdevel, linux-mm, linux-kernel

On Wed, Apr 20, 2016 at 11:10:25PM -0400, Toshi Kani wrote:
> How about moving the function (as is) to mm/huge_memory.c, rename it to
> get_hugepage_unmapped_area(), which is defined to NULL in huge_mm.h
> when TRANSPARENT_HUGEPAGE is unset?

Great idea.  Perhaps it should look something like this?

unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
                unsigned long len, unsigned long pgoff, unsigned long flags)
{
        loff_t off, off_end, off_pmd;
        unsigned long len_pmd, addr_pmd;

        if (addr)
                goto out;
        if (IS_DAX(filp->f_mapping->host) && !IS_ENABLED(CONFIG_FS_DAX_PMD))
                goto out;
        /* Kirill, please fill in the right condition here for THP pagecache */

        off = (loff_t)pgoff << PAGE_SHIFT;
        off_end = off + len;
        off_pmd = round_up(off, PMD_SIZE);      /* pmd-aligned start offset */

        if ((off_end <= off_pmd) || ((off_end - off_pmd) < PMD_SIZE))
                goto out;

        len_pmd = len + PMD_SIZE;
        if ((off + len_pmd) < off)
                goto out;

        addr_pmd = current->mm->get_unmapped_area(filp, NULL, len_pmd,
                                                pgoff, flags);
        if (!IS_ERR_VALUE(addr_pmd)) {
                addr_pmd += (off - addr_pmd) & (PMD_SIZE - 1);
                return addr_pmd;
        }
 out:
        return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
}

 - I deleted the check for filp == NULL.  It can't be NULL ... this is a
   file_operation ;-)
 - Why is len_pmd len + PMD_SIZE instead of round_up(len, PMD_SIZE)?
 - I'm still in two minds about passing 'addr' to the first call to
   get_unmapped_area() instead of NULL.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
@ 2016-04-21  7:06           ` Matthew Wilcox
  0 siblings, 0 replies; 35+ messages in thread
From: Matthew Wilcox @ 2016-04-21  7:06 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Jan Kara, Andrew Morton, dan.j.williams, viro, ross.zwisler,
	kirill.shutemov, david, tytso, adilger.kernel, linux-nvdimm,
	linux-fsdevel, linux-mm, linux-kernel

On Wed, Apr 20, 2016 at 11:10:25PM -0400, Toshi Kani wrote:
> How about moving the function (as is) to mm/huge_memory.c, rename it to
> get_hugepage_unmapped_area(), which is defined to NULL in huge_mm.h
> when TRANSPARENT_HUGEPAGE is unset?

Great idea.  Perhaps it should look something like this?

unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
                unsigned long len, unsigned long pgoff, unsigned long flags)
{
        loff_t off, off_end, off_pmd;
        unsigned long len_pmd, addr_pmd;

        if (addr)
                goto out;
        if (IS_DAX(filp->f_mapping->host) && !IS_ENABLED(CONFIG_FS_DAX_PMD))
                goto out;
        /* Kirill, please fill in the right condition here for THP pagecache */

        off = (loff_t)pgoff << PAGE_SHIFT;
        off_end = off + len;
        off_pmd = round_up(off, PMD_SIZE);      /* pmd-aligned start offset */

        if ((off_end <= off_pmd) || ((off_end - off_pmd) < PMD_SIZE))
                goto out;

        len_pmd = len + PMD_SIZE;
        if ((off + len_pmd) < off)
                goto out;

        addr_pmd = current->mm->get_unmapped_area(filp, NULL, len_pmd,
                                                pgoff, flags);
        if (!IS_ERR_VALUE(addr_pmd)) {
                addr_pmd += (off - addr_pmd) & (PMD_SIZE - 1);
                return addr_pmd;
        }
 out:
        return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
}

 - I deleted the check for filp == NULL.  It can't be NULL ... this is a
   file_operation ;-)
 - Why is len_pmd len + PMD_SIZE instead of round_up(len, PMD_SIZE)?
 - I'm still in two minds about passing 'addr' to the first call to
   get_unmapped_area() instead of NULL.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
  2016-04-21  7:06           ` Matthew Wilcox
@ 2016-04-21 20:21             ` Mike Kravetz
  -1 siblings, 0 replies; 35+ messages in thread
From: Mike Kravetz @ 2016-04-21 20:21 UTC (permalink / raw)
  To: Matthew Wilcox, Toshi Kani
  Cc: Jan Kara, linux-nvdimm, david, linux-kernel, linux-mm,
	adilger.kernel, viro, linux-fsdevel, tytso, Andrew Morton,
	kirill.shutemov

On 04/21/2016 12:06 AM, Matthew Wilcox wrote:
> On Wed, Apr 20, 2016 at 11:10:25PM -0400, Toshi Kani wrote:
>> How about moving the function (as is) to mm/huge_memory.c, rename it to
>> get_hugepage_unmapped_area(), which is defined to NULL in huge_mm.h
>> when TRANSPARENT_HUGEPAGE is unset?
> 
> Great idea.  Perhaps it should look something like this?
> 
> unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
>                 unsigned long len, unsigned long pgoff, unsigned long flags)
> {

Might want to keep the future possibility of PUD_SIZE THP in mind?
-- 
Mike Kravetz

>         loff_t off, off_end, off_pmd;
>         unsigned long len_pmd, addr_pmd;
> 
>         if (addr)
>                 goto out;
>         if (IS_DAX(filp->f_mapping->host) && !IS_ENABLED(CONFIG_FS_DAX_PMD))
>                 goto out;
>         /* Kirill, please fill in the right condition here for THP pagecache */
> 
>         off = (loff_t)pgoff << PAGE_SHIFT;
>         off_end = off + len;
>         off_pmd = round_up(off, PMD_SIZE);      /* pmd-aligned start offset */
> 
>         if ((off_end <= off_pmd) || ((off_end - off_pmd) < PMD_SIZE))
>                 goto out;
> 
>         len_pmd = len + PMD_SIZE;
>         if ((off + len_pmd) < off)
>                 goto out;
> 
>         addr_pmd = current->mm->get_unmapped_area(filp, NULL, len_pmd,
>                                                 pgoff, flags);
>         if (!IS_ERR_VALUE(addr_pmd)) {
>                 addr_pmd += (off - addr_pmd) & (PMD_SIZE - 1);
>                 return addr_pmd;
>         }
>  out:
>         return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
> }
> 
>  - I deleted the check for filp == NULL.  It can't be NULL ... this is a
>    file_operation ;-)
>  - Why is len_pmd len + PMD_SIZE instead of round_up(len, PMD_SIZE)?
>  - I'm still in two minds about passing 'addr' to the first call to
>    get_unmapped_area() instead of NULL.
> _______________________________________________
> Linux-nvdimm mailing list
> Linux-nvdimm@lists.01.org
> https://lists.01.org/mailman/listinfo/linux-nvdimm
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
@ 2016-04-21 20:21             ` Mike Kravetz
  0 siblings, 0 replies; 35+ messages in thread
From: Mike Kravetz @ 2016-04-21 20:21 UTC (permalink / raw)
  To: Matthew Wilcox, Toshi Kani
  Cc: Jan Kara, linux-nvdimm, david, linux-kernel, linux-mm,
	adilger.kernel, viro, linux-fsdevel, tytso, Andrew Morton,
	kirill.shutemov

On 04/21/2016 12:06 AM, Matthew Wilcox wrote:
> On Wed, Apr 20, 2016 at 11:10:25PM -0400, Toshi Kani wrote:
>> How about moving the function (as is) to mm/huge_memory.c, rename it to
>> get_hugepage_unmapped_area(), which is defined to NULL in huge_mm.h
>> when TRANSPARENT_HUGEPAGE is unset?
> 
> Great idea.  Perhaps it should look something like this?
> 
> unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
>                 unsigned long len, unsigned long pgoff, unsigned long flags)
> {

Might want to keep the future possibility of PUD_SIZE THP in mind?
-- 
Mike Kravetz

>         loff_t off, off_end, off_pmd;
>         unsigned long len_pmd, addr_pmd;
> 
>         if (addr)
>                 goto out;
>         if (IS_DAX(filp->f_mapping->host) && !IS_ENABLED(CONFIG_FS_DAX_PMD))
>                 goto out;
>         /* Kirill, please fill in the right condition here for THP pagecache */
> 
>         off = (loff_t)pgoff << PAGE_SHIFT;
>         off_end = off + len;
>         off_pmd = round_up(off, PMD_SIZE);      /* pmd-aligned start offset */
> 
>         if ((off_end <= off_pmd) || ((off_end - off_pmd) < PMD_SIZE))
>                 goto out;
> 
>         len_pmd = len + PMD_SIZE;
>         if ((off + len_pmd) < off)
>                 goto out;
> 
>         addr_pmd = current->mm->get_unmapped_area(filp, NULL, len_pmd,
>                                                 pgoff, flags);
>         if (!IS_ERR_VALUE(addr_pmd)) {
>                 addr_pmd += (off - addr_pmd) & (PMD_SIZE - 1);
>                 return addr_pmd;
>         }
>  out:
>         return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
> }
> 
>  - I deleted the check for filp == NULL.  It can't be NULL ... this is a
>    file_operation ;-)
>  - Why is len_pmd len + PMD_SIZE instead of round_up(len, PMD_SIZE)?
>  - I'm still in two minds about passing 'addr' to the first call to
>    get_unmapped_area() instead of NULL.
> _______________________________________________
> Linux-nvdimm mailing list
> Linux-nvdimm@lists.01.org
> https://lists.01.org/mailman/listinfo/linux-nvdimm
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
  2016-04-21  7:06           ` Matthew Wilcox
@ 2016-04-21 23:35             ` Toshi Kani
  -1 siblings, 0 replies; 35+ messages in thread
From: Toshi Kani @ 2016-04-21 23:35 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Jan Kara, Andrew Morton, dan.j.williams, viro, ross.zwisler,
	kirill.shutemov, david, tytso, adilger.kernel, linux-nvdimm,
	linux-fsdevel, linux-mm, linux-kernel


On 4/21/2016 3:06 AM, Matthew Wilcox wrote:
> On Wed, Apr 20, 2016 at 11:10:25PM -0400, Toshi Kani wrote:
>> How about moving the function (as is) to mm/huge_memory.c, rename it to
>> get_hugepage_unmapped_area(), which is defined to NULL in huge_mm.h
>> when TRANSPARENT_HUGEPAGE is unset?
> Great idea.  Perhaps it should look something like this?

Yes, it looks good! I will use it. :-)

>
> unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
>                  unsigned long len, unsigned long pgoff, unsigned long flags)
> {
>          loff_t off, off_end, off_pmd;
>          unsigned long len_pmd, addr_pmd;
>
>          if (addr)
>                  goto out;
>          if (IS_DAX(filp->f_mapping->host) && !IS_ENABLED(CONFIG_FS_DAX_PMD))
>                  goto out;
>          /* Kirill, please fill in the right condition here for THP pagecache */
>
>          off = (loff_t)pgoff << PAGE_SHIFT;
>          off_end = off + len;
>          off_pmd = round_up(off, PMD_SIZE);      /* pmd-aligned start offset */
>
>          if ((off_end <= off_pmd) || ((off_end - off_pmd) < PMD_SIZE))
>                  goto out;
>
>          len_pmd = len + PMD_SIZE;
>          if ((off + len_pmd) < off)
>                  goto out;
>
>          addr_pmd = current->mm->get_unmapped_area(filp, NULL, len_pmd,
>                                                  pgoff, flags);
>          if (!IS_ERR_VALUE(addr_pmd)) {
>                  addr_pmd += (off - addr_pmd) & (PMD_SIZE - 1);
>                  return addr_pmd;
>          }
>   out:
>          return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
> }
>
>   - I deleted the check for filp == NULL.  It can't be NULL ... this is a
>     file_operation ;-)

Right.

>   - Why is len_pmd len + PMD_SIZE instead of round_up(len, PMD_SIZE)?

The length is padded with an extra-PMD size so that any assigned address 
'addr_pmd'
can be aligned by PMD.  IOW, it does not make an assumption that 
addr_pmd is aligned
by the length.

>   - I'm still in two minds about passing 'addr' to the first call to
>     get_unmapped_area() instead of NULL.

When 'addr' is specified, we need to use 'len' since user may be 
managing free VMA
range by itself.  So, I think falling back with the original args is 
correct.

Thanks,
-Toshi

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
@ 2016-04-21 23:35             ` Toshi Kani
  0 siblings, 0 replies; 35+ messages in thread
From: Toshi Kani @ 2016-04-21 23:35 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Jan Kara, Andrew Morton, dan.j.williams, viro, ross.zwisler,
	kirill.shutemov, david, tytso, adilger.kernel, linux-nvdimm,
	linux-fsdevel, linux-mm, linux-kernel


On 4/21/2016 3:06 AM, Matthew Wilcox wrote:
> On Wed, Apr 20, 2016 at 11:10:25PM -0400, Toshi Kani wrote:
>> How about moving the function (as is) to mm/huge_memory.c, rename it to
>> get_hugepage_unmapped_area(), which is defined to NULL in huge_mm.h
>> when TRANSPARENT_HUGEPAGE is unset?
> Great idea.  Perhaps it should look something like this?

Yes, it looks good! I will use it. :-)

>
> unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
>                  unsigned long len, unsigned long pgoff, unsigned long flags)
> {
>          loff_t off, off_end, off_pmd;
>          unsigned long len_pmd, addr_pmd;
>
>          if (addr)
>                  goto out;
>          if (IS_DAX(filp->f_mapping->host) && !IS_ENABLED(CONFIG_FS_DAX_PMD))
>                  goto out;
>          /* Kirill, please fill in the right condition here for THP pagecache */
>
>          off = (loff_t)pgoff << PAGE_SHIFT;
>          off_end = off + len;
>          off_pmd = round_up(off, PMD_SIZE);      /* pmd-aligned start offset */
>
>          if ((off_end <= off_pmd) || ((off_end - off_pmd) < PMD_SIZE))
>                  goto out;
>
>          len_pmd = len + PMD_SIZE;
>          if ((off + len_pmd) < off)
>                  goto out;
>
>          addr_pmd = current->mm->get_unmapped_area(filp, NULL, len_pmd,
>                                                  pgoff, flags);
>          if (!IS_ERR_VALUE(addr_pmd)) {
>                  addr_pmd += (off - addr_pmd) & (PMD_SIZE - 1);
>                  return addr_pmd;
>          }
>   out:
>          return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
> }
>
>   - I deleted the check for filp == NULL.  It can't be NULL ... this is a
>     file_operation ;-)

Right.

>   - Why is len_pmd len + PMD_SIZE instead of round_up(len, PMD_SIZE)?

The length is padded with an extra-PMD size so that any assigned address 
'addr_pmd'
can be aligned by PMD.  IOW, it does not make an assumption that 
addr_pmd is aligned
by the length.

>   - I'm still in two minds about passing 'addr' to the first call to
>     get_unmapped_area() instead of NULL.

When 'addr' is specified, we need to use 'len' since user may be 
managing free VMA
range by itself.  So, I think falling back with the original args is 
correct.

Thanks,
-Toshi

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
  2016-04-21 20:21             ` Mike Kravetz
@ 2016-04-21 23:43               ` Toshi Kani
  -1 siblings, 0 replies; 35+ messages in thread
From: Toshi Kani @ 2016-04-21 23:43 UTC (permalink / raw)
  To: Mike Kravetz, Matthew Wilcox
  Cc: Jan Kara, linux-nvdimm, david, linux-kernel, linux-mm,
	adilger.kernel, viro, linux-fsdevel, tytso, Andrew Morton,
	kirill.shutemov


On 4/21/2016 4:21 PM, Mike Kravetz wrote:
> On 04/21/2016 12:06 AM, Matthew Wilcox wrote:
>> On Wed, Apr 20, 2016 at 11:10:25PM -0400, Toshi Kani wrote:
>>> How about moving the function (as is) to mm/huge_memory.c, rename it to
>>> get_hugepage_unmapped_area(), which is defined to NULL in huge_mm.h
>>> when TRANSPARENT_HUGEPAGE is unset?
>> Great idea.  Perhaps it should look something like this?
>>
>> unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
>>                  unsigned long len, unsigned long pgoff, unsigned long flags)
>> {
> Might want to keep the future possibility of PUD_SIZE THP in mind?

Yes, this is why the func name does not say 'pmd'. It can be extended to 
support
PUD_SIZE in future.

Thanks,
-Toshi

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
@ 2016-04-21 23:43               ` Toshi Kani
  0 siblings, 0 replies; 35+ messages in thread
From: Toshi Kani @ 2016-04-21 23:43 UTC (permalink / raw)
  To: Mike Kravetz, Matthew Wilcox
  Cc: Jan Kara, linux-nvdimm, david, linux-kernel, linux-mm,
	adilger.kernel, viro, linux-fsdevel, tytso, Andrew Morton,
	kirill.shutemov


On 4/21/2016 4:21 PM, Mike Kravetz wrote:
> On 04/21/2016 12:06 AM, Matthew Wilcox wrote:
>> On Wed, Apr 20, 2016 at 11:10:25PM -0400, Toshi Kani wrote:
>>> How about moving the function (as is) to mm/huge_memory.c, rename it to
>>> get_hugepage_unmapped_area(), which is defined to NULL in huge_mm.h
>>> when TRANSPARENT_HUGEPAGE is unset?
>> Great idea.  Perhaps it should look something like this?
>>
>> unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
>>                  unsigned long len, unsigned long pgoff, unsigned long flags)
>> {
> Might want to keep the future possibility of PUD_SIZE THP in mind?

Yes, this is why the func name does not say 'pmd'. It can be extended to 
support
PUD_SIZE in future.

Thanks,
-Toshi

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
  2016-04-21 23:43               ` Toshi Kani
@ 2016-04-22  0:22                 ` Matthew Wilcox
  -1 siblings, 0 replies; 35+ messages in thread
From: Matthew Wilcox @ 2016-04-22  0:22 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Mike Kravetz, Jan Kara, linux-nvdimm, david, linux-kernel,
	linux-mm, adilger.kernel, viro, linux-fsdevel, tytso,
	Andrew Morton, kirill.shutemov

On Thu, Apr 21, 2016 at 07:43:39PM -0400, Toshi Kani wrote:
> On 4/21/2016 4:21 PM, Mike Kravetz wrote:
> >Might want to keep the future possibility of PUD_SIZE THP in mind?
> 
> Yes, this is why the func name does not say 'pmd'. It can be extended to
> support
> PUD_SIZE in future.

Sure ... but what does that look like?  I think it should look a little
like this:

unsigned long __thp_get_unmapped_area(struct file *filp, unsigned long len,
                        loff_t off, unsigned long flags, unsigned long size);
{
        unsigned long addr;
        loff_t off_end = off + len;
        loff_t off_align = round_up(off, size);
        unsigned long len_size;

        if ((off_end <= off_align) || ((off_end - off_align) < size))
                return NULL;

        len_size = len + size;
        if ((len_size < len) || (off + len_size) < off)
                return NULL;

        addr = current->mm->get_unmapped_area(filp, NULL, len_size,
                                                off >> PAGE_SHIFT, flags);
        if (IS_ERR_VALUE(addr))
                return NULL;
 
        addr += (off - addr) & (size - 1);
        return addr;
}

unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
                unsigned long len, unsigned long pgoff, unsigned long flags)
{
        loff_t off = (loff_t)pgoff << PAGE_SHIFT;

        if (addr)
                goto out;
        if (IS_DAX(filp->f_mapping->host) && !IS_ENABLED(CONFIG_FS_DAX_PMD))
                goto out;
        /* Kirill, please fill in the right condition here for THP pagecache */

        addr = __thp_get_unmapped_area(filp, len, off, flags, PUD_SIZE);
        if (addr)
                return addr;
        addr = __thp_get_unmapped_area(filp, len, off, flags, PMD_SIZE);
        if (addr)
                return addr;

 out:
        return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
}

By the way, I added an extra check here, when we add len and size
(PMD_SIZE in the original), we need to make sure that doesn't wrap.
NB: I'm not even compiling these suggestions, just throwing them out
here as ideas to be criticised.

Also, len_size is a stupid name, but I can't think of a better one.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
@ 2016-04-22  0:22                 ` Matthew Wilcox
  0 siblings, 0 replies; 35+ messages in thread
From: Matthew Wilcox @ 2016-04-22  0:22 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Mike Kravetz, Jan Kara, linux-nvdimm, david, linux-kernel,
	linux-mm, adilger.kernel, viro, linux-fsdevel, tytso,
	Andrew Morton, kirill.shutemov

On Thu, Apr 21, 2016 at 07:43:39PM -0400, Toshi Kani wrote:
> On 4/21/2016 4:21 PM, Mike Kravetz wrote:
> >Might want to keep the future possibility of PUD_SIZE THP in mind?
> 
> Yes, this is why the func name does not say 'pmd'. It can be extended to
> support
> PUD_SIZE in future.

Sure ... but what does that look like?  I think it should look a little
like this:

unsigned long __thp_get_unmapped_area(struct file *filp, unsigned long len,
                        loff_t off, unsigned long flags, unsigned long size);
{
        unsigned long addr;
        loff_t off_end = off + len;
        loff_t off_align = round_up(off, size);
        unsigned long len_size;

        if ((off_end <= off_align) || ((off_end - off_align) < size))
                return NULL;

        len_size = len + size;
        if ((len_size < len) || (off + len_size) < off)
                return NULL;

        addr = current->mm->get_unmapped_area(filp, NULL, len_size,
                                                off >> PAGE_SHIFT, flags);
        if (IS_ERR_VALUE(addr))
                return NULL;
 
        addr += (off - addr) & (size - 1);
        return addr;
}

unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
                unsigned long len, unsigned long pgoff, unsigned long flags)
{
        loff_t off = (loff_t)pgoff << PAGE_SHIFT;

        if (addr)
                goto out;
        if (IS_DAX(filp->f_mapping->host) && !IS_ENABLED(CONFIG_FS_DAX_PMD))
                goto out;
        /* Kirill, please fill in the right condition here for THP pagecache */

        addr = __thp_get_unmapped_area(filp, len, off, flags, PUD_SIZE);
        if (addr)
                return addr;
        addr = __thp_get_unmapped_area(filp, len, off, flags, PMD_SIZE);
        if (addr)
                return addr;

 out:
        return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
}

By the way, I added an extra check here, when we add len and size
(PMD_SIZE in the original), we need to make sure that doesn't wrap.
NB: I'm not even compiling these suggestions, just throwing them out
here as ideas to be criticised.

Also, len_size is a stupid name, but I can't think of a better one.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
  2016-04-22  0:22                 ` Matthew Wilcox
@ 2016-04-22  0:59                   ` Toshi Kani
  -1 siblings, 0 replies; 35+ messages in thread
From: Toshi Kani @ 2016-04-22  0:59 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Mike Kravetz, Jan Kara, linux-nvdimm, david, linux-kernel,
	linux-mm, adilger.kernel, viro, linux-fsdevel, tytso,
	Andrew Morton, kirill.shutemov


On 4/21/2016 8:22 PM, Matthew Wilcox wrote:
> On Thu, Apr 21, 2016 at 07:43:39PM -0400, Toshi Kani wrote:
>> On 4/21/2016 4:21 PM, Mike Kravetz wrote:
>>> Might want to keep the future possibility of PUD_SIZE THP in mind?
>> Yes, this is why the func name does not say 'pmd'. It can be extended to
>> support
>> PUD_SIZE in future.
> Sure ... but what does that look like?  I think it should look a little
> like this:

Yes, I had something similar in mind, too.  Do you want me to use this 
version without the call with PUD_SIZE?

>
> unsigned long __thp_get_unmapped_area(struct file *filp, unsigned long len,
>                          loff_t off, unsigned long flags, unsigned long size);
> {
>          unsigned long addr;
>          loff_t off_end = off + len;
>          loff_t off_align = round_up(off, size);
>          unsigned long len_size;
>
>          if ((off_end <= off_align) || ((off_end - off_align) < size))
>                  return NULL;
>
>          len_size = len + size;
>          if ((len_size < len) || (off + len_size) < off)
>                  return NULL;
>
>          addr = current->mm->get_unmapped_area(filp, NULL, len_size,
>                                                  off >> PAGE_SHIFT, flags);
>          if (IS_ERR_VALUE(addr))
>                  return NULL;
>   
>          addr += (off - addr) & (size - 1);
>          return addr;
> }
>
> unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
>                  unsigned long len, unsigned long pgoff, unsigned long flags)
> {
>          loff_t off = (loff_t)pgoff << PAGE_SHIFT;
>
>          if (addr)
>                  goto out;
>          if (IS_DAX(filp->f_mapping->host) && !IS_ENABLED(CONFIG_FS_DAX_PMD))
>                  goto out;
>          /* Kirill, please fill in the right condition here for THP pagecache */
>
>          addr = __thp_get_unmapped_area(filp, len, off, flags, PUD_SIZE);
>          if (addr)
>                  return addr;
>          addr = __thp_get_unmapped_area(filp, len, off, flags, PMD_SIZE);
>          if (addr)
>                  return addr;
>
>   out:
>          return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
> }
>
> By the way, I added an extra check here, when we add len and size
> (PMD_SIZE in the original), we need to make sure that doesn't wrap.
> NB: I'm not even compiling these suggestions, just throwing them out
> here as ideas to be criticised.

Yes, I agree with the extra check.  Thanks for pointing this out.

>
> Also, len_size is a stupid name, but I can't think of a better one.

How about len_pad?

Thanks,
-Toshi

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
@ 2016-04-22  0:59                   ` Toshi Kani
  0 siblings, 0 replies; 35+ messages in thread
From: Toshi Kani @ 2016-04-22  0:59 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Mike Kravetz, Jan Kara, linux-nvdimm, david, linux-kernel,
	linux-mm, adilger.kernel, viro, linux-fsdevel, tytso,
	Andrew Morton, kirill.shutemov


On 4/21/2016 8:22 PM, Matthew Wilcox wrote:
> On Thu, Apr 21, 2016 at 07:43:39PM -0400, Toshi Kani wrote:
>> On 4/21/2016 4:21 PM, Mike Kravetz wrote:
>>> Might want to keep the future possibility of PUD_SIZE THP in mind?
>> Yes, this is why the func name does not say 'pmd'. It can be extended to
>> support
>> PUD_SIZE in future.
> Sure ... but what does that look like?  I think it should look a little
> like this:

Yes, I had something similar in mind, too.  Do you want me to use this 
version without the call with PUD_SIZE?

>
> unsigned long __thp_get_unmapped_area(struct file *filp, unsigned long len,
>                          loff_t off, unsigned long flags, unsigned long size);
> {
>          unsigned long addr;
>          loff_t off_end = off + len;
>          loff_t off_align = round_up(off, size);
>          unsigned long len_size;
>
>          if ((off_end <= off_align) || ((off_end - off_align) < size))
>                  return NULL;
>
>          len_size = len + size;
>          if ((len_size < len) || (off + len_size) < off)
>                  return NULL;
>
>          addr = current->mm->get_unmapped_area(filp, NULL, len_size,
>                                                  off >> PAGE_SHIFT, flags);
>          if (IS_ERR_VALUE(addr))
>                  return NULL;
>   
>          addr += (off - addr) & (size - 1);
>          return addr;
> }
>
> unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
>                  unsigned long len, unsigned long pgoff, unsigned long flags)
> {
>          loff_t off = (loff_t)pgoff << PAGE_SHIFT;
>
>          if (addr)
>                  goto out;
>          if (IS_DAX(filp->f_mapping->host) && !IS_ENABLED(CONFIG_FS_DAX_PMD))
>                  goto out;
>          /* Kirill, please fill in the right condition here for THP pagecache */
>
>          addr = __thp_get_unmapped_area(filp, len, off, flags, PUD_SIZE);
>          if (addr)
>                  return addr;
>          addr = __thp_get_unmapped_area(filp, len, off, flags, PMD_SIZE);
>          if (addr)
>                  return addr;
>
>   out:
>          return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
> }
>
> By the way, I added an extra check here, when we add len and size
> (PMD_SIZE in the original), we need to make sure that doesn't wrap.
> NB: I'm not even compiling these suggestions, just throwing them out
> here as ideas to be criticised.

Yes, I agree with the extra check.  Thanks for pointing this out.

>
> Also, len_size is a stupid name, but I can't think of a better one.

How about len_pad?

Thanks,
-Toshi

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2016-04-22  0:59 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-14 16:48 [PATCH v3 0/2] Align mmap address for DAX pmd mappings Toshi Kani
2016-04-14 16:48 ` Toshi Kani
2016-04-14 16:48 ` Toshi Kani
2016-04-14 16:48 ` [PATCH v3 1/2] dax: add dax_get_unmapped_area for " Toshi Kani
2016-04-14 16:48   ` Toshi Kani
2016-04-14 16:48   ` Toshi Kani
2016-04-18 20:47   ` Jan Kara
2016-04-18 20:47     ` Jan Kara
2016-04-18 20:47     ` Jan Kara
2016-04-19  2:36     ` Toshi Kani
2016-04-19  2:36       ` Toshi Kani
2016-04-19  2:36       ` Toshi Kani
2016-04-14 16:48 ` [PATCH v3 2/2] ext2/4, xfs, blk: call dax_get_unmapped_area() for DAX " Toshi Kani
2016-04-14 16:48   ` Toshi Kani
2016-04-14 16:48   ` Toshi Kani
2016-04-16  5:05 ` [PATCH v3 0/2] Align mmap address " Andrew Morton
2016-04-16  5:05   ` Andrew Morton
2016-04-18 20:26   ` Jan Kara
2016-04-18 20:26     ` Jan Kara
2016-04-19 18:23     ` Matthew Wilcox
2016-04-19 18:23       ` Matthew Wilcox
2016-04-21  3:10       ` Toshi Kani
2016-04-21  3:10         ` Toshi Kani
2016-04-21  7:06         ` Matthew Wilcox
2016-04-21  7:06           ` Matthew Wilcox
2016-04-21 20:21           ` Mike Kravetz
2016-04-21 20:21             ` Mike Kravetz
2016-04-21 23:43             ` Toshi Kani
2016-04-21 23:43               ` Toshi Kani
2016-04-22  0:22               ` Matthew Wilcox
2016-04-22  0:22                 ` Matthew Wilcox
2016-04-22  0:59                 ` Toshi Kani
2016-04-22  0:59                   ` Toshi Kani
2016-04-21 23:35           ` Toshi Kani
2016-04-21 23:35             ` Toshi Kani

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.