From: Toshi Kani <toshi.kani@hpe.com>
To: mingo@kernel.org, bp@suse.de, hpa@zytor.com, tglx@linutronix.de
Cc: linux-nvdimm@lists.01.org, x86@kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	kirill.shutemov@linux.intel.com
Subject: [PATCH] x86 get_unmapped_area: Add PMD alignment for DAX PMD mmap
Date: Wed,  6 Apr 2016 07:58:09 -0600
Message-ID: <1459951089-14911-1-git-send-email-toshi.kani@hpe.com>

When CONFIG_FS_DAX_PMD is set, DAX supports mmap() using PMD page
size.  This feature requires both the mmap virtual address and the FS
block data (i.e. physical address) to be aligned to the PMD page
size.  Users can use mkfs options to make the FS align its block
allocations.  However, aligning the mmap() address requires application
changes to mmap() calls, such as:

 -  /* let the kernel assign a mmap addr */
 -  mptr = mmap(NULL, fsize, PROT_READ|PROT_WRITE, FLAGS, fd, 0);

 +  /* 1. obtain a PMD-aligned virtual address */
 +  ret = posix_memalign(&mptr, PMD_SIZE, fsize);
 +  if (!ret)
 +    free(mptr);  /* 2. release the virt addr */
 +
 +  /* 3. then pass the PMD-aligned virt addr to mmap() */
 +  mptr = mmap(mptr, fsize, PROT_READ|PROT_WRITE, FLAGS, fd, 0);

These changes add an unnecessary dependency on DAX and the PMD page
size to application code.  The kernel should assign an mmap address
appropriate for the operation.
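
For reference, a self-contained userspace form of such a workaround
might look like the sketch below.  It is not the snippet above: it
swaps the posix_memalign()/free() trick for an over-sized PROT_NONE
reservation followed by a MAP_FIXED mapping, so the aligned range
stays reserved between the steps.  The helper name map_pmd_aligned(),
the hard-coded 2 MiB PMD_SIZE (the x86-64 value), and the use of
MAP_SHARED in place of FLAGS are assumptions made for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for MAP_ANONYMOUS */
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumption: 2 MiB PMD page size, as on x86-64. */
#define PMD_SIZE (2UL << 20)

/*
 * Hypothetical helper: map len bytes of fd, starting at file offset 0,
 * at a PMD-aligned virtual address.  Returns MAP_FAILED on error.
 */
static void *map_pmd_aligned(int fd, size_t len)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t map_len = (len + page - 1) & ~(page - 1);
	size_t rsv_len = map_len + PMD_SIZE;
	uintptr_t start, aligned;
	char *rsv;
	void *mptr;

	/* 1. reserve address space with room for alignment slack */
	rsv = mmap(NULL, rsv_len, PROT_NONE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (rsv == MAP_FAILED)
		return MAP_FAILED;

	/* 2. round the start of the reservation up to a PMD boundary */
	start = (uintptr_t)rsv;
	aligned = (start + PMD_SIZE - 1) & ~(uintptr_t)(PMD_SIZE - 1);

	/* 3. map the file over the aligned part of the reservation */
	mptr = mmap((void *)aligned, len, PROT_READ | PROT_WRITE,
		    MAP_SHARED | MAP_FIXED, fd, 0);
	if (mptr == MAP_FAILED) {
		munmap(rsv, rsv_len);
		return MAP_FAILED;
	}

	/* 4. release the unused slack on both sides of the mapping */
	if (aligned > start)
		munmap(rsv, aligned - start);
	munmap((char *)mptr + map_len, PMD_SIZE - (aligned - start));

	return mptr;
}
```

Either form leaves the application carrying PMD_SIZE and alignment
logic of its own.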

Change arch_get_unmapped_area() and arch_get_unmapped_area_topdown()
to request PMD_SIZE alignment when the request is for a DAX file and
its mapping range is large enough to use a PMD page.
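
The "large enough" test can be illustrated with a small userspace
model of the arithmetic in the diff.  In this sketch, off stands in
for info.align_offset (assumed here to be the byte offset of the
mapping in the file) and len is the mapping length.  The names
range_has_pmd_page() and round_up_pow2() and the 2 MiB PMD_SIZE are
assumptions made for the example.

```c
#include <stdbool.h>

/* Assumption: 2 MiB PMD page size, as on x86-64. */
#define PMD_SIZE (2UL << 20)

/* Round x up to a multiple of a, where a is a power of two. */
static unsigned long round_up_pow2(unsigned long x, unsigned long a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Hypothetical model of the check: does the range [off, off + len)
 * contain at least one whole PMD-aligned, PMD-sized chunk?
 */
static bool range_has_pmd_page(unsigned long off, unsigned long len)
{
	unsigned long off_end = off + len;
	unsigned long off_pmd = round_up_pow2(off, PMD_SIZE);

	return off_end > off_pmd && off_end - off_pmd >= PMD_SIZE;
}
```

Under these assumptions, a 2 MiB mapping at offset 0 qualifies, a
2 MiB mapping at offset 4 KiB does not (it contains no aligned 2 MiB
chunk), and a 4 MiB mapping at offset 4 KiB does.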

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/kernel/sys_x86_64.c |   14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
index 10e0272..a294c66 100644
--- a/arch/x86/kernel/sys_x86_64.c
+++ b/arch/x86/kernel/sys_x86_64.c
@@ -157,6 +157,13 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 		info.align_mask = get_align_mask();
 		info.align_offset += get_align_bits();
 	}
+	if (filp && IS_ENABLED(CONFIG_FS_DAX_PMD) && IS_DAX(file_inode(filp))) {
+		unsigned long off_end = info.align_offset + len;
+		unsigned long off_pmd = round_up(info.align_offset, PMD_SIZE);
+
+		if ((off_end > off_pmd) && ((off_end - off_pmd) >= PMD_SIZE))
+			info.align_mask |= (PMD_SIZE - 1);
+	}
 	return vm_unmapped_area(&info);
 }
 
@@ -200,6 +207,13 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 		info.align_mask = get_align_mask();
 		info.align_offset += get_align_bits();
 	}
+	if (filp && IS_ENABLED(CONFIG_FS_DAX_PMD) && IS_DAX(file_inode(filp))) {
+		unsigned long off_end = info.align_offset + len;
+		unsigned long off_pmd = round_up(info.align_offset, PMD_SIZE);
+
+		if ((off_end > off_pmd) && ((off_end - off_pmd) >= PMD_SIZE))
+			info.align_mask |= (PMD_SIZE - 1);
+	}
 	addr = vm_unmapped_area(&info);
 	if (!(addr & ~PAGE_MASK))
 		return addr;
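
Why OR-ing PMD_SIZE - 1 into info.align_mask is enough can be seen
from a userspace model of the alignment step.  The sketch assumes the
bottom-up search in vm_unmapped_area() adjusts a candidate gap start
as "gap_start += (align_offset - gap_start) & align_mask"; the
function name align_gap_start() is made up for this example, and the
2 MiB PMD_SIZE is the x86-64 value.

```c
/* Assumption: 2 MiB PMD page size, as on x86-64. */
#define PMD_SIZE (2UL << 20)

/*
 * Hypothetical model: move gap_start up to the next address whose
 * bits under align_mask equal those of align_offset.
 */
static unsigned long align_gap_start(unsigned long gap_start,
				     unsigned long align_offset,
				     unsigned long align_mask)
{
	return gap_start + ((align_offset - gap_start) & align_mask);
}
```

With align_mask covering PMD_SIZE - 1, the returned address is
congruent to the file offset modulo PMD_SIZE, so PMD-aligned file
offsets land on PMD-aligned virtual addresses.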
Thread overview:
2016-04-06 13:58 Toshi Kani [this message]
2016-04-06 16:50 ` Matthew Wilcox
2016-04-06 17:44   ` Toshi Kani
2016-04-07 17:41     ` Matthew Wilcox
2016-04-07 21:20       ` Toshi Kani
