* [PATCH] mm/mmap: Check for RLIMIT_AS before unmapping
@ 2013-04-02  9:54 Cyril Hrubis
  2013-04-02 10:56   ` Mel Gorman
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Cyril Hrubis @ 2013-04-02  9:54 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3910 bytes --]

This patch fixes a corner case for MAP_FIXED when the requested mapping
length is larger than the rlimit for virtual memory. In such a case any
overlapping mappings are unmapped before we check for the limit and
return ENOMEM.

The check is moved before the loop that unmaps overlapping parts of
existing mappings. When we are about to hit the limit (currently mapped
pages + len > limit) we scan for overlapping pages and check again,
accounting for them.

This fixes the situation where a userspace program expects that the
previous mappings are preserved after the mmap() syscall has returned
with an error. (POSIX clearly states that a successful mapping shall
replace any previous mappings.)

This corner case was found and can be tested with the LTP testcase:

testcases/open_posix_testsuite/conformance/interfaces/mmap/24-2.c

In this case the mmap(), which is clearly over the current limit, unmaps
the dynamic libraries and the testcase segfaults right after returning
to userspace.

I've also looked at the second instance of the unmapping loop, in
do_brk(). do_brk() is called from the brk() syscall and from vm_brk().
The brk() syscall checks for overlapping mappings and bails out when
there are any (so it can't be triggered from the brk() syscall).
vm_brk() is called only from binfmt handlers, so it shouldn't be
triggered unless a binfmt handler created overlapping mappings.

Signed-off-by: Cyril Hrubis <chrubis@suse.cz>
---
 mm/mmap.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 46 insertions(+), 4 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 2664a47..e755080 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -33,6 +33,7 @@
 #include <linux/uprobes.h>
 #include <linux/rbtree_augmented.h>
 #include <linux/sched/sysctl.h>
+#include <linux/kernel.h>
 
 #include <asm/uaccess.h>
 #include <asm/cacheflush.h>
@@ -543,6 +544,34 @@ static int find_vma_links(struct mm_struct *mm, unsigned long addr,
 	return 0;
 }
 
+static unsigned long count_vma_pages_range(struct mm_struct *mm,
+		unsigned long addr, unsigned long end)
+{
+	unsigned long nr_pages = 0;
+	struct vm_area_struct *vma;
+
+	/* Find first overlapping mapping */
+	vma = find_vma_intersection(mm, addr, end);
+	if (!vma)
+		return 0;
+
+	nr_pages = (min(end, vma->vm_end) -
+		max(addr, vma->vm_start)) >> PAGE_SHIFT;
+
+	/* Iterate over the rest of the overlaps */
+	for (vma = vma->vm_next; vma; vma = vma->vm_next) {
+		unsigned long overlap_len;
+
+		if (vma->vm_start > end)
+			break;
+
+		overlap_len = min(end, vma->vm_end) - vma->vm_start;
+		nr_pages += overlap_len >> PAGE_SHIFT;
+	}
+
+	return nr_pages;
+}
+
 void __vma_link_rb(struct mm_struct *mm, struct vm_area_struct *vma,
 		struct rb_node **rb_link, struct rb_node *rb_parent)
 {
@@ -1433,6 +1462,23 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 	unsigned long charged = 0;
 	struct inode *inode =  file ? file_inode(file) : NULL;
 
+	/* Check against address space limit. */
+	if (!may_expand_vm(mm, len >> PAGE_SHIFT)) {
+		unsigned long nr_pages;
+
+		/*
+		 * MAP_FIXED may remove pages of mappings that intersects with
+		 * requested mapping. Account for the pages it would unmap.
+		 */
+		if (!(vm_flags & MAP_FIXED))
+			return -ENOMEM;
+
+		nr_pages = count_vma_pages_range(mm, addr, addr + len);
+
+		if (!may_expand_vm(mm, (len >> PAGE_SHIFT) - nr_pages))
+			return -ENOMEM;
+	}
+
 	/* Clear old maps */
 	error = -ENOMEM;
 munmap_back:
@@ -1442,10 +1488,6 @@ munmap_back:
 		goto munmap_back;
 	}
 
-	/* Check against address space limit. */
-	if (!may_expand_vm(mm, len >> PAGE_SHIFT))
-		return -ENOMEM;
-
 	/*
 	 * Private writable mapping: check memory availability
 	 */
-- 
1.8.1.5

See also the attached tarball with a testsuite that exercises the newly
added codepaths (all testcases except the second, which tests that this
patch works, succeed both before and after this patch).
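
The arithmetic of the new check can also be followed outside the kernel.
The block below is only a userspace sketch, not the patch itself:
struct model_vma, model_count_pages_range(), model_may_expand() and
model_check_limit() are hypothetical stand-ins for the VMA list,
count_vma_pages_range(), may_expand_vm() and the new code in
mmap_region(). It assumes 4 KiB pages, a sorted array of non-overlapping
page-aligned ranges, and -1 in place of -ENOMEM.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12UL	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for a VMA: the half-open range [start, end). */
struct model_vma {
	unsigned long start;
	unsigned long end;
};

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/* Count the pages of vmas[] that intersect [addr, end). */
static unsigned long model_count_pages_range(const struct model_vma *vmas,
		size_t n, unsigned long addr, unsigned long end)
{
	unsigned long nr_pages = 0;
	size_t i;

	assert(addr < end);

	for (i = 0; i < n; i++) {
		unsigned long lo = max_ul(addr, vmas[i].start);
		unsigned long hi = min_ul(end, vmas[i].end);

		/* Only ranges that really overlap contribute pages. */
		if (lo < hi)
			nr_pages += (hi - lo) >> MODEL_PAGE_SHIFT;
	}

	return nr_pages;
}

/* Model of the limit test: may npages more pages be mapped? */
static int model_may_expand(unsigned long total_pages,
		unsigned long limit_pages, unsigned long npages)
{
	return total_pages + npages <= limit_pages;
}

/*
 * Two-step check: try the plain request first; for a fixed mapping,
 * retry after subtracting the pages the request would replace.
 * Returns 0 when the mapping may proceed, -1 otherwise.
 */
static int model_check_limit(const struct model_vma *vmas, size_t n,
		unsigned long total_pages, unsigned long limit_pages,
		unsigned long addr, unsigned long len, int fixed)
{
	unsigned long req = len >> MODEL_PAGE_SHIFT;
	unsigned long nr_pages;

	if (model_may_expand(total_pages, limit_pages, req))
		return 0;

	if (!fixed)
		return -1;

	nr_pages = model_count_pages_range(vmas, n, addr, addr + len);

	return model_may_expand(total_pages, limit_pages, req - nr_pages) ?
		0 : -1;
}
```

With two ranges covering five pages in total and a limit of seven pages,
a fixed four-page request that overlaps two already mapped pages passes
the second check (5 + 4 - 2 = 7), while the same request without fixed
placement is refused. In both cases nothing is unmapped before the
decision is made.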

-- 
Cyril Hrubis
chrubis@suse.cz

[-- Attachment #2: mm.tar.bz2 --]
[-- Type: application/x-bzip2, Size: 2284 bytes --]

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] mm/mmap: Check for RLIMIT_AS before unmapping
  2013-04-02  9:54 [PATCH] mm/mmap: Check for RLIMIT_AS before unmapping Cyril Hrubis
@ 2013-04-02 10:56   ` Mel Gorman
  2013-04-02 12:29 ` Wanpeng Li
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: Mel Gorman @ 2013-04-02 10:56 UTC (permalink / raw)
  To: Cyril Hrubis; +Cc: Andrew Morton, linux-mm, linux-kernel

On Tue, Apr 02, 2013 at 11:54:03AM +0200, Cyril Hrubis wrote:
> This patch fixes corner case for MAP_FIXED when requested mapping length
> is larger than rlimit for virtual memory. In such case any overlapping
> mappings are unmapped before we check for the limit and return ENOMEM.
> 
> The check is moved before the loop that unmaps overlapping parts of
> existing mappings. When we are about to hit the limit (currently mapped
> pages + len > limit) we scan for overlapping pages and check again
> accounting for them.
> 
> This fixes situation when userspace program expects that the previous
> mappings are preserved after the mmap() syscall has returned with error.
> (POSIX clearly states that successful mapping shall replace any
> previous mappings.)
> 
> This corner case was found and can be tested with LTP testcase:
> 
> testcases/open_posix_testsuite/conformance/interfaces/mmap/24-2.c
> 
> In this case the mmap, which is clearly over current limit, unmaps
> dynamic libraries and the testcase segfaults right after returning into
> userspace.
> 
> I've also looked at the second instance of the unmapping loop in the
> do_brk(). The do_brk() is called from brk() syscall and from vm_brk().
> The brk() syscall checks for overlapping mappings and bails out when
> there are any (so it can't be triggered from the brk syscall). The
> vm_brk() is called only from binfmt handlers so it shouldn't be
> triggered unless binfmt handler created overlapping mappings.
> 
> Signed-off-by: Cyril Hrubis <chrubis@suse.cz>

Reviewed-by: Mel Gorman <mgorman@suse.de>

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] mm/mmap: Check for RLIMIT_AS before unmapping
  2013-04-02  9:54 [PATCH] mm/mmap: Check for RLIMIT_AS before unmapping Cyril Hrubis
  2013-04-02 10:56   ` Mel Gorman
  2013-04-02 12:29 ` Wanpeng Li
@ 2013-04-02 12:29 ` Wanpeng Li
  2013-04-11 22:57   ` Andrew Morton
  3 siblings, 0 replies; 10+ messages in thread
From: Wanpeng Li @ 2013-04-02 12:29 UTC (permalink / raw)
  To: Cyril Hrubis; +Cc: Andrew Morton, linux-mm, linux-kernel

On Tue, Apr 02, 2013 at 11:54:03AM +0200, Cyril Hrubis wrote:
>This patch fixes corner case for MAP_FIXED when requested mapping length
>is larger than rlimit for virtual memory. In such case any overlapping
>mappings are unmapped before we check for the limit and return ENOMEM.
>
>The check is moved before the loop that unmaps overlapping parts of
>existing mappings. When we are about to hit the limit (currently mapped
>pages + len > limit) we scan for overlapping pages and check again
>accounting for them.
>
>This fixes situation when userspace program expects that the previous
>mappings are preserved after the mmap() syscall has returned with error.
>(POSIX clearly states that successful mapping shall replace any
>previous mappings.)
>
>This corner case was found and can be tested with LTP testcase:
>
>testcases/open_posix_testsuite/conformance/interfaces/mmap/24-2.c
>
>In this case the mmap, which is clearly over current limit, unmaps
>dynamic libraries and the testcase segfaults right after returning into
>userspace.
>
>I've also looked at the second instance of the unmapping loop in the
>do_brk(). The do_brk() is called from brk() syscall and from vm_brk().
>The brk() syscall checks for overlapping mappings and bails out when
>there are any (so it can't be triggered from the brk syscall). The
>vm_brk() is called only from binfmt handlers so it shouldn't be
>triggered unless binfmt handler created overlapping mappings.
>

Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>

>Signed-off-by: Cyril Hrubis <chrubis@suse.cz>
>---
> mm/mmap.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++----
> 1 file changed, 46 insertions(+), 4 deletions(-)
>
>diff --git a/mm/mmap.c b/mm/mmap.c
>index 2664a47..e755080 100644
>--- a/mm/mmap.c
>+++ b/mm/mmap.c
>@@ -33,6 +33,7 @@
> #include <linux/uprobes.h>
> #include <linux/rbtree_augmented.h>
> #include <linux/sched/sysctl.h>
>+#include <linux/kernel.h>
>
> #include <asm/uaccess.h>
> #include <asm/cacheflush.h>
>@@ -543,6 +544,34 @@ static int find_vma_links(struct mm_struct *mm, unsigned long addr,
> 	return 0;
> }
>
>+static unsigned long count_vma_pages_range(struct mm_struct *mm,
>+		unsigned long addr, unsigned long end)
>+{
>+	unsigned long nr_pages = 0;
>+	struct vm_area_struct *vma;
>+
>+	/* Find first overlapping mapping */
>+	vma = find_vma_intersection(mm, addr, end);
>+	if (!vma)
>+		return 0;
>+
>+	nr_pages = (min(end, vma->vm_end) -
>+		max(addr, vma->vm_start)) >> PAGE_SHIFT;
>+
>+	/* Iterate over the rest of the overlaps */
>+	for (vma = vma->vm_next; vma; vma = vma->vm_next) {
>+		unsigned long overlap_len;
>+
>+		if (vma->vm_start > end)
>+			break;
>+
>+		overlap_len = min(end, vma->vm_end) - vma->vm_start;
>+		nr_pages += overlap_len >> PAGE_SHIFT;
>+	}
>+
>+	return nr_pages;
>+}
>+
> void __vma_link_rb(struct mm_struct *mm, struct vm_area_struct *vma,
> 		struct rb_node **rb_link, struct rb_node *rb_parent)
> {
>@@ -1433,6 +1462,23 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
> 	unsigned long charged = 0;
> 	struct inode *inode =  file ? file_inode(file) : NULL;
>
>+	/* Check against address space limit. */
>+	if (!may_expand_vm(mm, len >> PAGE_SHIFT)) {
>+		unsigned long nr_pages;
>+
>+		/*
>+		 * MAP_FIXED may remove pages of mappings that intersects with
>+		 * requested mapping. Account for the pages it would unmap.
>+		 */
>+		if (!(vm_flags & MAP_FIXED))
>+			return -ENOMEM;
>+
>+		nr_pages = count_vma_pages_range(mm, addr, addr + len);
>+
>+		if (!may_expand_vm(mm, (len >> PAGE_SHIFT) - nr_pages))
>+			return -ENOMEM;
>+	}
>+
> 	/* Clear old maps */
> 	error = -ENOMEM;
> munmap_back:
>@@ -1442,10 +1488,6 @@ munmap_back:
> 		goto munmap_back;
> 	}
>
>-	/* Check against address space limit. */
>-	if (!may_expand_vm(mm, len >> PAGE_SHIFT))
>-		return -ENOMEM;
>-
> 	/*
> 	 * Private writable mapping: check memory availability
> 	 */
>-- 
>1.8.1.5
>
>See also a testsuite that exercises the newly added codepaths which is
>attached as a tarball (All testcases minus the second that tests
>that this patch works succeeds both before and after this patch).
>
>-- 
>Cyril Hrubis
>chrubis@suse.cz


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] mm/mmap: Check for RLIMIT_AS before unmapping
  2013-04-02  9:54 [PATCH] mm/mmap: Check for RLIMIT_AS before unmapping Cyril Hrubis
@ 2013-04-11 22:57   ` Andrew Morton
  2013-04-02 12:29 ` Wanpeng Li
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2013-04-11 22:57 UTC (permalink / raw)
  To: Cyril Hrubis; +Cc: linux-mm, linux-kernel

On Tue, 2 Apr 2013 11:54:03 +0200 Cyril Hrubis <chrubis@suse.cz> wrote:

> This patch fixes corner case for MAP_FIXED when requested mapping length
> is larger than rlimit for virtual memory. In such case any overlapping
> mappings are unmapped before we check for the limit and return ENOMEM.
> 
> The check is moved before the loop that unmaps overlapping parts of
> existing mappings. When we are about to hit the limit (currently mapped
> pages + len > limit) we scan for overlapping pages and check again
> accounting for them.
> 
> This fixes situation when userspace program expects that the previous
> mappings are preserved after the mmap() syscall has returned with error.
> (POSIX clearly states that successful mapping shall replace any
> previous mappings.)
> 
> This corner case was found and can be tested with LTP testcase:
> 
> testcases/open_posix_testsuite/conformance/interfaces/mmap/24-2.c
> 
> In this case the mmap, which is clearly over current limit, unmaps
> dynamic libraries and the testcase segfaults right after returning into
> userspace.
> 
> I've also looked at the second instance of the unmapping loop in the
> do_brk(). The do_brk() is called from brk() syscall and from vm_brk().
> The brk() syscall checks for overlapping mappings and bails out when
> there are any (so it can't be triggered from the brk syscall). The
> vm_brk() is called only from binfmt handlers so it shouldn't be
> triggered unless binfmt handler created overlapping mappings.
> 
> ...
>
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -33,6 +33,7 @@
>  #include <linux/uprobes.h>
>  #include <linux/rbtree_augmented.h>
>  #include <linux/sched/sysctl.h>
> +#include <linux/kernel.h>
>  
>  #include <asm/uaccess.h>
>  #include <asm/cacheflush.h>
> @@ -543,6 +544,34 @@ static int find_vma_links(struct mm_struct *mm, unsigned long addr,
>  	return 0;
>  }
>  
> +static unsigned long count_vma_pages_range(struct mm_struct *mm,
> +		unsigned long addr, unsigned long end)
> +{
> +	unsigned long nr_pages = 0;
> +	struct vm_area_struct *vma;
> +
> +	/* Find first overlapping mapping */
> +	vma = find_vma_intersection(mm, addr, end);
> +	if (!vma)
> +		return 0;
> +
> +	nr_pages = (min(end, vma->vm_end) -
> +		max(addr, vma->vm_start)) >> PAGE_SHIFT;

urgh, these things always make my head spin.  Is it guaranteed that
end, vm_end, addr and vm_start are all multiples of PAGE_SIZE?  If not,
we have a problem don't we?



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] mm/mmap: Check for RLIMIT_AS before unmapping
  2013-04-11 22:57   ` Andrew Morton
@ 2013-04-12 13:42     ` chrubis
  -1 siblings, 0 replies; 10+ messages in thread
From: chrubis @ 2013-04-12 13:42 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, linux-kernel

Hi!
> > +static unsigned long count_vma_pages_range(struct mm_struct *mm,
> > +		unsigned long addr, unsigned long end)
> > +{
> > +	unsigned long nr_pages = 0;
> > +	struct vm_area_struct *vma;
> > +
> > +	/* Find first overlapping mapping */
> > +	vma = find_vma_intersection(mm, addr, end);
> > +	if (!vma)
> > +		return 0;
> > +
> > +	nr_pages = (min(end, vma->vm_end) -
> > +		max(addr, vma->vm_start)) >> PAGE_SHIFT;
> 
> urgh, these things always make my head spin.  Is it guaranteed that
> end, vm_end, addr and vm_start are all multiples of PAGE_SIZE?  If not,
> we have a problem don't we?

Yes, it takes a bit of concentration before one can say what the code
does; unfortunately this is the most readable variant I've come up
with.

The len is page aligned right at the start of do_mmap_pgoff() (end
is addr + len). The addr should be aligned in get_unmapped_area();
the codepath is more complicated to follow, but it seems to end
up in one of the arch_get_unmapped_area* functions, and these make sure
the address is aligned.

Moreover, the mmap() manual page says that the addr passed to mmap() is
page aligned (although I tend to check the code rather than the docs).

And for the vmas, I believe these are page aligned by definition; correct
me if I'm wrong.
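
To make the concern concrete, here is a small userspace sketch of why
the alignment matters for the (end - start) >> PAGE_SHIFT computation.
It is an illustration only, assuming 4 KiB pages; the sketch_* names are
hypothetical helpers, not kernel functions.

```c
#include <assert.h>

#define SKETCH_PAGE_SHIFT 12UL	/* assumption: 4 KiB pages */
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/* Non-zero when x is a multiple of the page size. */
static int sketch_is_page_aligned(unsigned long x)
{
	return (x & (SKETCH_PAGE_SIZE - 1)) == 0;
}

/* Round x up to the next page boundary (no-op when already aligned). */
static unsigned long sketch_page_align(unsigned long x)
{
	return (x + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/*
 * Pages in [start, end) computed by shifting the byte length. This is
 * exact only when both bounds are page aligned; otherwise the shift
 * silently drops the partial pages.
 */
static unsigned long sketch_span_pages(unsigned long start, unsigned long end)
{
	assert(start <= end);
	return (end - start) >> SKETCH_PAGE_SHIFT;
}
```

For aligned bounds such as 0x1000 and 0x3000 the shift yields the exact
count of two pages. For 0x1800 and 0x2800 it yields one page even though
the range touches two, which is the kind of undercount that would be a
problem if any of the four values could be unaligned.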

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] mm/mmap: Check for RLIMIT_AS before unmapping
@ 2013-03-25 13:24 Cyril Hrubis
  0 siblings, 0 replies; 10+ messages in thread
From: Cyril Hrubis @ 2013-03-25 13:24 UTC (permalink / raw)
  To: linux-mm; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3910 bytes --]

This patch fixes a corner case for MAP_FIXED when the requested mapping
length is larger than the rlimit for virtual memory. In such a case any
overlapping mappings are unmapped before we check for the limit and
return ENOMEM.

The check is moved before the loop that unmaps overlapping parts of
existing mappings. When we are about to hit the limit (currently mapped
pages + len > limit) we scan for overlapping pages and check again,
accounting for them.

This fixes the situation where a userspace program expects that the
previous mappings are preserved after the mmap() syscall has returned
with an error. (POSIX clearly states that a successful mapping shall
replace any previous mappings.)

This corner case was found and can be tested with the LTP testcase:

testcases/open_posix_testsuite/conformance/interfaces/mmap/24-2.c

In this case the mmap(), which is clearly over the current limit, unmaps
the dynamic libraries and the testcase segfaults right after returning
to userspace.

I've also looked at the second instance of the unmapping loop, in
do_brk(). do_brk() is called from the brk() syscall and from vm_brk().
The brk() syscall checks for overlapping mappings and bails out when
there are any (so it can't be triggered from the brk() syscall).
vm_brk() is called only from binfmt handlers, so it shouldn't be
triggered unless a binfmt handler created overlapping mappings.

Signed-off-by: Cyril Hrubis <chrubis@suse.cz>
---
 mm/mmap.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 46 insertions(+), 4 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 2664a47..e755080 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -33,6 +33,7 @@
 #include <linux/uprobes.h>
 #include <linux/rbtree_augmented.h>
 #include <linux/sched/sysctl.h>
+#include <linux/kernel.h>
 
 #include <asm/uaccess.h>
 #include <asm/cacheflush.h>
@@ -543,6 +544,34 @@ static int find_vma_links(struct mm_struct *mm, unsigned long addr,
 	return 0;
 }
 
+static unsigned long count_vma_pages_range(struct mm_struct *mm,
+		unsigned long addr, unsigned long end)
+{
+	unsigned long nr_pages = 0;
+	struct vm_area_struct *vma;
+
+	/* Find first overlapping mapping */
+	vma = find_vma_intersection(mm, addr, end);
+	if (!vma)
+		return 0;
+
+	nr_pages = (min(end, vma->vm_end) -
+		max(addr, vma->vm_start)) >> PAGE_SHIFT;
+
+	/* Iterate over the rest of the overlaps */
+	for (vma = vma->vm_next; vma; vma = vma->vm_next) {
+		unsigned long overlap_len;
+
+		if (vma->vm_start > end)
+			break;
+
+		overlap_len = min(end, vma->vm_end) - vma->vm_start;
+		nr_pages += overlap_len >> PAGE_SHIFT;
+	}
+
+	return nr_pages;
+}
+
 void __vma_link_rb(struct mm_struct *mm, struct vm_area_struct *vma,
 		struct rb_node **rb_link, struct rb_node *rb_parent)
 {
@@ -1433,6 +1462,23 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 	unsigned long charged = 0;
 	struct inode *inode =  file ? file_inode(file) : NULL;
 
+	/* Check against address space limit. */
+	if (!may_expand_vm(mm, len >> PAGE_SHIFT)) {
+		unsigned long nr_pages;
+
+		/*
+		 * MAP_FIXED may remove pages of mappings that intersects with
+		 * requested mapping. Account for the pages it would unmap.
+		 */
+		if (!(vm_flags & MAP_FIXED))
+			return -ENOMEM;
+
+		nr_pages = count_vma_pages_range(mm, addr, addr + len);
+
+		if (!may_expand_vm(mm, (len >> PAGE_SHIFT) - nr_pages))
+			return -ENOMEM;
+	}
+
 	/* Clear old maps */
 	error = -ENOMEM;
 munmap_back:
@@ -1442,10 +1488,6 @@ munmap_back:
 		goto munmap_back;
 	}
 
-	/* Check against address space limit. */
-	if (!may_expand_vm(mm, len >> PAGE_SHIFT))
-		return -ENOMEM;
-
 	/*
 	 * Private writable mapping: check memory availability
 	 */
-- 
1.8.1.5

See also the attached tarball with a testsuite that exercises the newly
added codepaths (all testcases except the second, which tests that this
patch works, succeed both before and after this patch).

-- 
Cyril Hrubis
chrubis@suse.cz

[-- Attachment #2: mm.tar.bz2 --]
[-- Type: application/x-bzip2, Size: 2284 bytes --]

^ permalink raw reply related	[flat|nested] 10+ messages in thread
