* [PATCH] mm: fix mprotect() behaviour on VM_LOCKED VMAs
From: Kirill A. Shutemov @ 2015-04-17 12:20 UTC
  To: Andrew Morton; +Cc: linux-mm, linux-api, linux-kernel, Kirill A. Shutemov

On mlock(2) we trigger COW on private writable VMAs to avoid faults in
the future.

mm/gup.c:
	long populate_vma_page_range(struct vm_area_struct *vma,
			unsigned long start, unsigned long end, int *nonblocking)
	{
	...
		 * We want to touch writable mappings with a write fault in order
		 * to break COW, except for shared mappings because these don't COW
		 * and we would not want to dirty them for nothing.
		 */
		if ((vma->vm_flags & (VM_WRITE | VM_SHARED)) == VM_WRITE)
			gup_flags |= FOLL_WRITE;

But we miss this case when we make a VM_LOCKED VMA writable via
mprotect(2). The test case:

	#define _GNU_SOURCE
	#include <fcntl.h>
	#include <stdio.h>
	#include <stdlib.h>
	#include <unistd.h>
	#include <sys/mman.h>
	#include <sys/resource.h>
	#include <sys/stat.h>
	#include <sys/time.h>
	#include <sys/types.h>

	#define PAGE_SIZE 4096

	int main(int argc, char **argv)
	{
		struct rusage usage;
		long before;
		char *p;
		int fd;

		/* Create a file and populate first page of page cache */
		fd = open("/tmp", O_TMPFILE | O_RDWR, S_IRUSR | S_IWUSR);
		write(fd, "1", 1);

		/* Create a *read-only* *private* mapping of the file */
		p = mmap(NULL, PAGE_SIZE, PROT_READ, MAP_PRIVATE, fd, 0);

		/*
		 * Since the mapping is read-only, mlock() will populate the mapping
		 * with PTEs pointing to page cache without triggering COW.
		 */
		mlock(p, PAGE_SIZE);

		/*
		 * Mapping became read-write, but it's still populated with PTEs
		 * pointing to page cache.
		 */
		mprotect(p, PAGE_SIZE, PROT_READ | PROT_WRITE);

		getrusage(RUSAGE_SELF, &usage);
		before = usage.ru_minflt;

		/* Trigger COW: fault in mlock()ed VMA. */
		*p = 1;

		getrusage(RUSAGE_SELF, &usage);
		printf("faults: %ld\n", usage.ru_minflt - before);

		return 0;
	}

	$ ./test
	faults: 1

Let's fix it by populating the VMA in mprotect_fixup() when this condition
holds. We don't care about population errors, just as we don't in other
similar cases, e.g. mremap.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 mm/mprotect.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 88584838e704..911fb9070b2b 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -29,6 +29,8 @@
 #include <asm/cacheflush.h>
 #include <asm/tlbflush.h>
 
+#include "internal.h"
+
 /*
  * For a prot_numa update we only hold mmap_sem for read so there is a
  * potential race with faulting where a pmd was temporarily none. This
@@ -322,6 +324,15 @@ success:
 	change_protection(vma, start, end, vma->vm_page_prot,
 			  dirty_accountable, 0);
 
+	/*
+	 * Private VM_LOCKED VMA become writable: trigger COW to avoid major
+	 * fault on access.
+	 */
+	if ((oldflags & (VM_WRITE | VM_SHARED | VM_LOCKED)) == VM_LOCKED &&
+			(newflags & VM_WRITE)) {
+		populate_vma_page_range(vma, start, end, NULL);
+	}
+
 	vm_stat_account(mm, oldflags, vma->vm_file, -nrpages);
 	vm_stat_account(mm, newflags, vma->vm_file, nrpages);
 	perf_event_mmap(vma);
-- 
2.1.4



* Re: [PATCH] mm: fix mprotect() behaviour on VM_LOCKED VMAs
From: Kirill A. Shutemov @ 2015-04-22 18:55 UTC
  To: Andrew Morton; +Cc: linux-mm, linux-api, linux-kernel

Ping?

-- 
 Kirill A. Shutemov

