* [PATCH -mm 0/6] mm: scalable and unified arch_get_unmapped_area
@ 2012-06-18 14:31 ` Rik van Riel
  0 siblings, 0 replies; 30+ messages in thread
From: Rik van Riel @ 2012-06-18 14:31 UTC (permalink / raw)
  To: linux-mm
  Cc: akpm, aarcange, peterz, minchan, kosaki.motohiro, andi, hnaz,
	mel, linux-kernel

A long time ago, we decided to limit the number of VMAs per
process to 64k. As it turns out, there actually are programs
using tens of thousands of VMAs.

The linear search in arch_get_unmapped_area and
arch_get_unmapped_area_topdown can become a real bottleneck
for those programs.

This patch series fixes the scalability issue by tracking
the size of each free hole in the VMA rbtree and propagating
that information up the tree, as sketched below.
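
To illustrate the idea, here is a minimal userspace sketch (not the
code from this series; the struct and function names are made up for
the example). Every node caches the largest free gap anywhere in its
subtree, so a search can skip whole subtrees that cannot possibly
contain a hole of the requested size:

#include <stddef.h>

/* Illustrative only: each node caches the largest gap in its subtree. */
struct node {
	struct node *left, *right;	/* ordered by address */
	unsigned long gap_before;	/* free space just below this area */
	unsigned long subtree_max_gap;	/* max gap_before in this subtree */
};

static unsigned long max_gap(struct node *n)
{
	return n ? n->subtree_max_gap : 0;
}

/* Recompute the cached value after this node or a child changed;
 * the caller propagates the update towards the root. */
static void update_subtree_max_gap(struct node *n)
{
	unsigned long m = n->gap_before;

	if (max_gap(n->left) > m)
		m = max_gap(n->left);
	if (max_gap(n->right) > m)
		m = max_gap(n->right);
	n->subtree_max_gap = m;
}

/* Find the lowest hole of at least len bytes without a linear walk:
 * only descend into subtrees whose cached maximum is large enough. */
static struct node *lowest_fit(struct node *n, unsigned long len)
{
	while (n) {
		if (max_gap(n->left) >= len)
			n = n->left;
		else if (n->gap_before >= len)
			return n;
		else
			n = n->right;
	}
	return NULL;
}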

Another major goal is to move the bulk of the
arch_get_unmapped_area(_topdown) functionality into one generic
set of functions, so the large per-architecture copies can be
eliminated in favour of a few much smaller architecture-specific
helpers.

In this version I have removed only the x86, ARM and MIPS
arch-specific code, and the diffstat already looks fairly
promising:

 arch/arm/include/asm/pgtable.h    |    6 
 arch/arm/mm/init.c                |    3 
 arch/arm/mm/mmap.c                |  217 ------------------
 arch/mips/include/asm/page.h      |    2 
 arch/mips/include/asm/pgtable.h   |    7 
 arch/mips/mm/mmap.c               |  177 --------------
 arch/x86/include/asm/elf.h        |    3 
 arch/x86/include/asm/pgtable_64.h |    4 
 arch/x86/kernel/sys_x86_64.c      |  200 ++--------------
 arch/x86/vdso/vma.c               |    2 
 include/linux/mm_types.h          |    8 
 include/linux/sched.h             |   13 +
 mm/internal.h                     |    5 
 mm/mmap.c                         |  455 ++++++++++++++++++++++++++++++--------
 14 files changed, 420 insertions(+), 682 deletions(-)

TODO:
- eliminate arch-specific functions for more architectures
- integrate hugetlbfs alignment (with Andi Kleen's patch?)

Performance

A benchmark that allocates tens of thousands of VMAs, unmaps
them, and mmaps more in a loop shows promising results.

Vanilla 3.4 kernel:
$ ./agua_frag_test_64
..........

Min Time (ms): 6
Avg. Time (ms): 294.0000
Max Time (ms): 609
Std Dev (ms): 113.1664
Standard deviation exceeds 10

With patches:
$ ./agua_frag_test_64
..........

Min Time (ms): 14
Avg. Time (ms): 38.0000
Max Time (ms): 60
Std Dev (ms): 3.9312
All checks pass

The total run time of the test goes down by about a factor
of 4.  More importantly, the worst-case time of the loop
(which is what really hurt some applications) goes down by
about a factor of 10.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH -mm 1/6] mm: get unmapped area from VMA tree
  2012-06-18 14:31 ` Rik van Riel
@ 2012-06-18 14:31   ` Rik van Riel
  -1 siblings, 0 replies; 30+ messages in thread
From: Rik van Riel @ 2012-06-18 14:31 UTC (permalink / raw)
  To: linux-mm
  Cc: akpm, aarcange, peterz, minchan, kosaki.motohiro, andi, hnaz,
	mel, linux-kernel, Rik van Riel, Rik van Riel

From: Rik van Riel <riel@surriel.com>

Change the generic implementations of arch_get_unmapped_area(_topdown)
to use the free space info in the VMA rbtree. This makes it possible
to find free address space in O(log(N)) complexity.

For bottom-up allocations, we pick the lowest hole that is large
enough for our allocation. For topdown allocations, we pick the
highest hole of sufficient size.

For topdown allocations, we need to keep track of the highest
mapped VMA address, because it might be below mm->mmap_base,
and we only keep track of free space to the left of each VMA
in the VMA tree.  It is tempting to try to keep track of
the free space to the right of each VMA when running in
topdown mode, but that gets us into trouble on x86, where
a process can switch allocation direction in the middle of
execve.
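
For example (hypothetical numbers, shown as a standalone sketch rather
than patch code): if the topmost VMA ends well below mmap_base, the gap
above it belongs to no VMA's free-space-to-the-left, so the topdown
code checks it explicitly before walking the tree.

#include <stdio.h>

int main(void)
{
	/* Hypothetical values, for illustration only. */
	unsigned long mmap_base   = 0x7ffff7fff000UL; /* topdown upper limit */
	unsigned long highest_vma = 0x7ffff7dd1000UL; /* end of topmost VMA  */
	unsigned long len         = 0x200000UL;       /* 2MB request         */

	/* The hole between highest_vma and mmap_base is not represented
	 * in the augmented tree, so test it before the rbtree walk. */
	if (mmap_base - len > highest_vma)
		printf("fast path: map at %#lx\n", mmap_base - len);
	else
		printf("fall back to the rbtree walk\n");
	return 0;
}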

We have to leave mm->free_area_cache and mm->cached_hole_size
in place for now, because the architecture-specific versions
still use them.

Signed-off-by: Rik van Riel <riel@redhat.com>
---
 include/linux/mm_types.h |    1 +
 mm/mmap.c                |  270 +++++++++++++++++++++++++++++++---------------
 2 files changed, 184 insertions(+), 87 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index bf56d66..8ccb4e1 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -307,6 +307,7 @@ struct mm_struct {
 	unsigned long task_size;		/* size of task vm space */
 	unsigned long cached_hole_size; 	/* if non-zero, the largest hole below free_area_cache */
 	unsigned long free_area_cache;		/* first hole of size cached_hole_size or larger */
+	unsigned long highest_vma;		/* highest vma end address */
 	pgd_t * pgd;
 	atomic_t mm_users;			/* How many users with user space? */
 	atomic_t mm_count;			/* How many references to "struct mm_struct" (users count as 1) */
diff --git a/mm/mmap.c b/mm/mmap.c
index 1963ef9..40c848e 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -4,6 +4,7 @@
  * Written by obz.
  *
  * Address space accounting code	<alan@lxorguk.ukuu.org.uk>
+ * Rbtree get_unmapped_area Copyright (C) 2012  Rik van Riel
  */
 
 #include <linux/slab.h>
@@ -250,6 +251,17 @@ static void adjust_free_gap(struct vm_area_struct *vma)
 	rb_augment_erase_end(&vma->vm_rb, vma_rb_augment_cb, NULL);
 }
 
+static unsigned long node_free_hole(struct rb_node *node)
+{
+	struct vm_area_struct *vma;
+
+	if (!node)
+		return 0;
+
+	vma = container_of(node, struct vm_area_struct, vm_rb);
+	return vma->free_gap;
+}
+
 /*
  * Unlink a file-based vm structure from its prio_tree, to hide
  * vma from rmap and vmtruncate before freeing its page tables.
@@ -386,12 +398,16 @@ void validate_mm(struct mm_struct *mm)
 	int bug = 0;
 	int i = 0;
 	struct vm_area_struct *tmp = mm->mmap;
+	unsigned long highest_address = 0;
 	while (tmp) {
 		if (tmp->free_gap != max_free_space(&tmp->vm_rb))
 			printk("free space %lx, correct %lx\n", tmp->free_gap, max_free_space(&tmp->vm_rb)), bug = 1;
+		highest_address = tmp->vm_end;
 		tmp = tmp->vm_next;
 		i++;
 	}
+	if (highest_address != mm->highest_vma)
+		printk("mm->highest_vma %lx, found %lx\n", mm->highest_vma, highest_address), bug = 1;
 	if (i != mm->map_count)
 		printk("map_count %d vm_next %d\n", mm->map_count, i), bug = 1;
 	i = browse_rb(&mm->mm_rb);
@@ -449,6 +465,9 @@ void __vma_link_rb(struct mm_struct *mm, struct vm_area_struct *vma,
 	/* Propagate the new free gap between next and us up the tree. */
 	if (vma->vm_next)
 		adjust_free_gap(vma->vm_next);
+	else
+		/* This is the VMA with the highest address. */
+		mm->highest_vma = vma->vm_end;
 }
 
 static void __vma_link_file(struct vm_area_struct *vma)
@@ -648,6 +667,8 @@ again:			remove_next = 1 + (end > next->vm_end);
 	vma->vm_start = start;
 	vma->vm_end = end;
 	vma->vm_pgoff = pgoff;
+	if (!next)
+		mm->highest_vma = end;
 	if (adjust_next) {
 		next->vm_start += adjust_next << PAGE_SHIFT;
 		next->vm_pgoff += adjust_next;
@@ -1456,13 +1477,29 @@ unacct_error:
  * This function "knows" that -ENOMEM has the bits set.
  */
 #ifndef HAVE_ARCH_UNMAPPED_AREA
+struct rb_node *continue_next_right(struct rb_node *node)
+{
+	struct rb_node *prev;
+
+	while ((prev = node) && (node = rb_parent(node))) {
+		if (prev == node->rb_right)
+			continue;
+
+		if (node->rb_right)
+			return node->rb_right;
+	}
+
+	return NULL;
+}
+
 unsigned long
 arch_get_unmapped_area(struct file *filp, unsigned long addr,
 		unsigned long len, unsigned long pgoff, unsigned long flags)
 {
 	struct mm_struct *mm = current->mm;
-	struct vm_area_struct *vma;
-	unsigned long start_addr;
+	struct vm_area_struct *vma = NULL;
+	struct rb_node *rb_node;
+	unsigned long lower_limit = TASK_UNMAPPED_BASE;
 
 	if (len > TASK_SIZE)
 		return -ENOMEM;
@@ -1477,40 +1514,76 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 		    (!vma || addr + len <= vma->vm_start))
 			return addr;
 	}
-	if (len > mm->cached_hole_size) {
-	        start_addr = addr = mm->free_area_cache;
-	} else {
-	        start_addr = addr = TASK_UNMAPPED_BASE;
-	        mm->cached_hole_size = 0;
-	}
 
-full_search:
-	for (vma = find_vma(mm, addr); ; vma = vma->vm_next) {
-		/* At this point:  (!vma || addr < vma->vm_end). */
-		if (TASK_SIZE - len < addr) {
-			/*
-			 * Start a new search - just in case we missed
-			 * some holes.
-			 */
-			if (start_addr != TASK_UNMAPPED_BASE) {
-				addr = TASK_UNMAPPED_BASE;
-			        start_addr = addr;
-				mm->cached_hole_size = 0;
-				goto full_search;
+	/* Find the left-most free area of sufficient size. */
+	for (addr = 0, rb_node = mm->mm_rb.rb_node; rb_node; ) {
+		unsigned long vma_start;
+		int found_here = 0;
+
+		vma = rb_to_vma(rb_node);
+
+		if (vma->vm_start > len) {
+			if (!vma->vm_prev) {
+				/* This is the left-most VMA. */
+				if (vma->vm_start - len >= lower_limit) {
+					addr = lower_limit;
+					goto found_addr;
+				}
+			} else {
+				/* Is this hole large enough? Remember it. */
+				vma_start = max(vma->vm_prev->vm_end, lower_limit);
+				if (vma->vm_start - len >= vma_start) {
+					addr = vma_start;
+					found_here = 1;
+				}
 			}
-			return -ENOMEM;
 		}
-		if (!vma || addr + len <= vma->vm_start) {
-			/*
-			 * Remember the place where we stopped the search:
-			 */
-			mm->free_area_cache = addr + len;
-			return addr;
+
+		/* Go left if it looks promising. */
+		if (node_free_hole(rb_node->rb_left) >= len &&
+					vma->vm_start - len >= lower_limit) {
+			rb_node = rb_node->rb_left;
+			continue;
 		}
-		if (addr + mm->cached_hole_size < vma->vm_start)
-		        mm->cached_hole_size = vma->vm_start - addr;
-		addr = vma->vm_end;
+
+		if (!found_here && node_free_hole(rb_node->rb_right) >= len) {
+			/* Last known hole is to the right of this subtree. */
+			rb_node = rb_node->rb_right;
+			continue;
+		} else if (!addr) {
+			rb_node = continue_next_right(rb_node);
+			continue;
+		}
+
+		/* This is the left-most hole. */
+		goto found_addr;
 	}
+
+	/*
+	 * There is not enough space to the left of any VMA.
+	 * Check the far right-hand side of the VMA tree.
+	 */
+	rb_node = mm->mm_rb.rb_node;
+	while (rb_node->rb_right)
+		rb_node = rb_node->rb_right;
+	vma = rb_to_vma(rb_node);
+	addr = vma->vm_end;
+
+	/*
+	 * The right-most VMA ends below the lower limit. Can only happen
+	 * if a binary personality loads the stack below the executable.
+	 */
+	if (addr < lower_limit)
+		addr = lower_limit;
+
+ found_addr:
+	if (TASK_SIZE - len < addr)
+		return -ENOMEM;
+
+	/* This "free area" was not really free. Tree corrupted? */
+	VM_BUG_ON(find_vma_intersection(mm, addr, addr+len));
+
+	return addr;
 }
 #endif	
 
@@ -1528,14 +1601,31 @@ void arch_unmap_area(struct mm_struct *mm, unsigned long addr)
  * stack's low limit (the base):
  */
 #ifndef HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
+struct rb_node *continue_next_left(struct rb_node *node)
+{
+	struct rb_node *prev;
+
+	while ((prev = node) && (node = rb_parent(node))) {
+		if (prev == node->rb_left)
+			continue;
+
+		if (node->rb_left)
+			return node->rb_left;
+	}
+
+	return NULL;
+}
+
 unsigned long
 arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 			  const unsigned long len, const unsigned long pgoff,
 			  const unsigned long flags)
 {
-	struct vm_area_struct *vma;
+	struct vm_area_struct *vma = NULL;
 	struct mm_struct *mm = current->mm;
-	unsigned long addr = addr0, start_addr;
+	unsigned long addr = addr0;
+	struct rb_node *rb_node = NULL;
+	unsigned long upper_limit = mm->mmap_base;
 
 	/* requested length too big for entire address space */
 	if (len > TASK_SIZE)
@@ -1553,68 +1643,65 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 			return addr;
 	}
 
-	/* check if free_area_cache is useful for us */
-	if (len <= mm->cached_hole_size) {
- 	        mm->cached_hole_size = 0;
- 		mm->free_area_cache = mm->mmap_base;
- 	}
+	/* requested length too big; prevent integer underflow below */
+	if (len > upper_limit)
+		return -ENOMEM;
 
-try_again:
-	/* either no address requested or can't fit in requested address hole */
-	start_addr = addr = mm->free_area_cache;
+	/*
+	 * Does the highest VMA end far enough below the upper limit
+	 * of our search space?
+	 */
+	if (upper_limit - len > mm->highest_vma) {
+		addr = upper_limit - len;
+		goto found_addr;
+	}
 
-	if (addr < len)
-		goto fail;
+	/* Find the right-most free area of sufficient size. */
+	for (addr = 0, rb_node = mm->mm_rb.rb_node; rb_node; ) {
+		unsigned long vma_start;
+		int found_here = 0;
 
-	addr -= len;
-	do {
-		/*
-		 * Lookup failure means no vma is above this address,
-		 * else if new region fits below vma->vm_start,
-		 * return with success:
-		 */
-		vma = find_vma(mm, addr);
-		if (!vma || addr+len <= vma->vm_start)
-			/* remember the address as a hint for next time */
-			return (mm->free_area_cache = addr);
+		vma = container_of(rb_node, struct vm_area_struct, vm_rb);
 
- 		/* remember the largest hole we saw so far */
- 		if (addr + mm->cached_hole_size < vma->vm_start)
- 		        mm->cached_hole_size = vma->vm_start - addr;
+		/* Is this hole large enough? Remember it. */
+		vma_start = min(vma->vm_start, upper_limit);
+		if (vma_start > len) {
+			if (!vma->vm_prev ||
+			    (vma_start - len >= vma->vm_prev->vm_end)) {
+				addr = vma_start - len;
+				found_here = 1;
+			}
+		}
 
-		/* try just below the current vma->vm_start */
-		addr = vma->vm_start-len;
-	} while (len < vma->vm_start);
+		/* Go right if it looks promising. */
+		if (node_free_hole(rb_node->rb_right) >= len) {
+			if (upper_limit - len > vma->vm_end) {
+				rb_node = rb_node->rb_right;
+				continue;
+			}
+		}
 
-fail:
-	/*
-	 * if hint left us with no space for the requested
-	 * mapping then try again:
-	 *
-	 * Note: this is different with the case of bottomup
-	 * which does the fully line-search, but we use find_vma
-	 * here that causes some holes skipped.
-	 */
-	if (start_addr != mm->mmap_base) {
-		mm->free_area_cache = mm->mmap_base;
-		mm->cached_hole_size = 0;
-		goto try_again;
+		if (!found_here && node_free_hole(rb_node->rb_left) >= len) {
+			/* Last known hole is to the right of this subtree. */
+			rb_node = rb_node->rb_left;
+			continue;
+		} else if (!addr) {
+			rb_node = continue_next_left(rb_node);
+			continue;
+		}
+
+		/* This is the right-most hole. */
+		goto found_addr;
 	}
 
-	/*
-	 * A failed mmap() very likely causes application failure,
-	 * so fall back to the bottom-up function here. This scenario
-	 * can happen with large stack limits and large mmap()
-	 * allocations.
-	 */
-	mm->cached_hole_size = ~0UL;
-  	mm->free_area_cache = TASK_UNMAPPED_BASE;
-	addr = arch_get_unmapped_area(filp, addr0, len, pgoff, flags);
-	/*
-	 * Restore the topdown base:
-	 */
-	mm->free_area_cache = mm->mmap_base;
-	mm->cached_hole_size = ~0UL;
+	return -ENOMEM;
+
+ found_addr:
+	if (TASK_SIZE - len < addr)
+		return -ENOMEM;
+
+	/* This "free area" was not really free. Tree corrupted? */
+	VM_BUG_ON(find_vma_intersection(mm, addr, addr+len));
 
 	return addr;
 }
@@ -1828,6 +1915,8 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
 				vma->vm_end = address;
 				if (vma->vm_next)
 					adjust_free_gap(vma->vm_next);
+				if (!vma->vm_next)
+					vma->vm_mm->highest_vma = vma->vm_end;
 				perf_event_mmap(vma);
 			}
 		}
@@ -2013,6 +2102,13 @@ detach_vmas_to_be_unmapped(struct mm_struct *mm, struct vm_area_struct *vma,
 	*insertion_point = vma;
 	if (vma)
 		vma->vm_prev = prev;
+	else {
+		/* We just unmapped the highest VMA. */
+		if (prev)
+			mm->highest_vma = prev->vm_end;
+		else
+			mm->highest_vma = 0;
+	}
 	if (vma)
 		rb_augment_erase_end(&vma->vm_rb, vma_rb_augment_cb, NULL);
 	tail_vma->vm_next = NULL;
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH -mm 2/6] Allow each architecture to specify the address range that can be used for this allocation.
  2012-06-18 14:31 ` Rik van Riel
@ 2012-06-18 14:31   ` Rik van Riel
  -1 siblings, 0 replies; 30+ messages in thread
From: Rik van Riel @ 2012-06-18 14:31 UTC (permalink / raw)
  To: linux-mm
  Cc: akpm, aarcange, peterz, minchan, kosaki.motohiro, andi, hnaz,
	mel, linux-kernel, Rik van Riel, Rik van Riel

From: Rik van Riel <riel@surriel.com>

On x86-64, this is used to implement MMAP_32BIT semantics.

On PPC and IA64, allocations using certain page sizes need to be
restricted to certain virtual address ranges. This callback could
be used to implement such address restrictions with minimal hassle.
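
As a rough sketch of how an architecture could use the new hook
(the restriction and the constants below are made up; the real
x86-64 and generic versions are in the hunks that follow), the
idea is to define HAVE_ARCH_GET_ADDRESS_RANGE and fill in the
bounds for the requested direction:

/* Hypothetical override, written against the interface this patch
 * adds. An architecture that must keep some mappings below 4GB
 * could clamp the search range like this. */
#define HAVE_ARCH_GET_ADDRESS_RANGE

void arch_get_address_range(unsigned long flags, unsigned long *begin,
		unsigned long *end, enum mmap_allocation_direction direction)
{
	if (flags & MAP_32BIT) {		/* illustrative restriction */
		*begin = 0x40000000;		/* 1GB */
		*end   = 0x100000000UL;		/* 4GB */
	} else if (direction == ALLOC_UP) {	/* mirrors the generic fallback */
		*begin = TASK_UNMAPPED_BASE;
		*end   = TASK_SIZE;
	} else {				/* ALLOC_DOWN */
		*begin = 0;
		*end   = current->mm->mmap_base;
	}
}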

Signed-off-by: Rik van Riel <riel@redhat.com>
---
 arch/mips/mm/mmap.c               |    8 ++----
 arch/x86/include/asm/pgtable_64.h |    1 +
 arch/x86/kernel/sys_x86_64.c      |   11 ++++++---
 include/linux/sched.h             |    7 ++++++
 mm/mmap.c                         |   38 ++++++++++++++++++++++++++++++++++--
 5 files changed, 53 insertions(+), 12 deletions(-)

diff --git a/arch/mips/mm/mmap.c b/arch/mips/mm/mmap.c
index 302d779..3f8af17 100644
--- a/arch/mips/mm/mmap.c
+++ b/arch/mips/mm/mmap.c
@@ -61,8 +61,6 @@ static inline unsigned long COLOUR_ALIGN_DOWN(unsigned long addr,
 	((((addr) + shm_align_mask) & ~shm_align_mask) +	\
 	 (((pgoff) << PAGE_SHIFT) & shm_align_mask))
 
-enum mmap_allocation_direction {UP, DOWN};
-
 static unsigned long arch_get_unmapped_area_common(struct file *filp,
 	unsigned long addr0, unsigned long len, unsigned long pgoff,
 	unsigned long flags, enum mmap_allocation_direction dir)
@@ -107,7 +105,7 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
 			return addr;
 	}
 
-	if (dir == UP) {
+	if (dir == ALLOC_UP) {
 		addr = mm->mmap_base;
 		if (do_color_align)
 			addr = COLOUR_ALIGN(addr, pgoff);
@@ -204,7 +202,7 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr0,
 	unsigned long len, unsigned long pgoff, unsigned long flags)
 {
 	return arch_get_unmapped_area_common(filp,
-			addr0, len, pgoff, flags, UP);
+			addr0, len, pgoff, flags, ALLOC_UP);
 }
 
 /*
@@ -216,7 +214,7 @@ unsigned long arch_get_unmapped_area_topdown(struct file *filp,
 	unsigned long flags)
 {
 	return arch_get_unmapped_area_common(filp,
-			addr0, len, pgoff, flags, DOWN);
+			addr0, len, pgoff, flags, ALLOC_DOWN);
 }
 
 void arch_pick_mmap_layout(struct mm_struct *mm)
diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index 975f709..8af36f6 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -169,6 +169,7 @@ extern void cleanup_highmap(void);
 
 #define HAVE_ARCH_UNMAPPED_AREA
 #define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
+#define HAVE_ARCH_GET_ADDRESS_RANGE
 
 #define pgtable_cache_init()   do { } while (0)
 #define check_pgt_cache()      do { } while (0)
diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
index b4d3c39..2595a5e 100644
--- a/arch/x86/kernel/sys_x86_64.c
+++ b/arch/x86/kernel/sys_x86_64.c
@@ -95,8 +95,8 @@ out:
 	return error;
 }
 
-static void find_start_end(unsigned long flags, unsigned long *begin,
-			   unsigned long *end)
+void arch_get_address_range(unsigned long flags, unsigned long *begin,
+		unsigned long *end, enum mmap_allocation_direction direction)
 {
 	if (!test_thread_flag(TIF_ADDR32) && (flags & MAP_32BIT)) {
 		unsigned long new_begin;
@@ -114,9 +114,12 @@ static void find_start_end(unsigned long flags, unsigned long *begin,
 			if (new_begin)
 				*begin = new_begin;
 		}
-	} else {
+	} else if (direction == ALLOC_UP) {
 		*begin = TASK_UNMAPPED_BASE;
 		*end = TASK_SIZE;
+	} else /* direction == ALLOC_DOWN */ {
+		*begin = 0;
+		*end = current->mm->mmap_base;
 	}
 }
 
@@ -132,7 +135,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 	if (flags & MAP_FIXED)
 		return addr;
 
-	find_start_end(flags, &begin, &end);
+	arch_get_address_range(flags, &begin, &end, ALLOC_UP);
 
 	if (len > end)
 		return -ENOMEM;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 4059c0f..fc76318 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -388,7 +388,14 @@ extern int sysctl_max_map_count;
 #include <linux/aio.h>
 
 #ifdef CONFIG_MMU
+enum mmap_allocation_direction {
+	ALLOC_UP,
+	ALLOC_DOWN
+};
 extern void arch_pick_mmap_layout(struct mm_struct *mm);
+extern void
+arch_get_address_range(unsigned long flags, unsigned long *begin,
+		unsigned long *end, enum mmap_allocation_direction direction);
 extern unsigned long
 arch_get_unmapped_area(struct file *, unsigned long, unsigned long,
 		       unsigned long, unsigned long);
diff --git a/mm/mmap.c b/mm/mmap.c
index 40c848e..92cf0bf 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1465,6 +1465,20 @@ unacct_error:
 	return error;
 }
 
+#ifndef HAVE_ARCH_GET_ADDRESS_RANGE
+void arch_get_address_range(unsigned long flags, unsigned long *begin,
+		unsigned long *end, enum mmap_allocation_direction direction)
+{
+	if (direction == ALLOC_UP) {
+		*begin = TASK_UNMAPPED_BASE;
+		*end = TASK_SIZE;
+	} else /* direction == ALLOC_DOWN */ {
+		*begin = 0;
+		*end = current->mm->mmap_base;
+	}
+}
+#endif
+
 /* Get an address range which is currently unmapped.
  * For shmat() with addr=0.
  *
@@ -1499,7 +1513,9 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma = NULL;
 	struct rb_node *rb_node;
-	unsigned long lower_limit = TASK_UNMAPPED_BASE;
+	unsigned long lower_limit, upper_limit;
+
+	arch_get_address_range(flags, &lower_limit, &upper_limit, ALLOC_UP);
 
 	if (len > TASK_SIZE)
 		return -ENOMEM;
@@ -1546,6 +1562,13 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 			continue;
 		}
 
+		/* We have gone too far right, and can not go left. */
+		if (vma->vm_end + len > upper_limit) {
+			if (!addr)
+				return -ENOMEM;
+			goto found_addr;
+		}
+
 		if (!found_here && node_free_hole(rb_node->rb_right) >= len) {
 			/* Last known hole is to the right of this subtree. */
 			rb_node = rb_node->rb_right;
@@ -1625,7 +1648,9 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 	struct mm_struct *mm = current->mm;
 	unsigned long addr = addr0;
 	struct rb_node *rb_node = NULL;
-	unsigned long upper_limit = mm->mmap_base;
+	unsigned long lower_limit, upper_limit;
+
+	arch_get_address_range(flags, &lower_limit, &upper_limit, ALLOC_DOWN);
 
 	/* requested length too big for entire address space */
 	if (len > TASK_SIZE)
@@ -1644,7 +1669,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 	}
 
 	/* requested length too big; prevent integer underflow below */
-	if (len > upper_limit)
+	if (len > upper_limit - lower_limit)
 		return -ENOMEM;
 
 	/*
@@ -1681,6 +1706,13 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 			}
 		}
 
+		/* We have gone too far left, and can not go right. */
+		if (vma->vm_start < lower_limit + len) {
+			if (!addr)
+				return -ENOMEM;
+			goto found_addr;
+		}
+
 		if (!found_here && node_free_hole(rb_node->rb_left) >= len) {
 			/* Last known hole is to the right of this subtree. */
 			rb_node = rb_node->rb_left;
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH -mm 3/6] Fix the x86-64 page colouring code to take pgoff into account and use that code as the basis for a generic page colouring code.
  2012-06-18 14:31 ` Rik van Riel
@ 2012-06-18 14:31   ` Rik van Riel
  -1 siblings, 0 replies; 30+ messages in thread
From: Rik van Riel @ 2012-06-18 14:31 UTC (permalink / raw)
  To: linux-mm
  Cc: akpm, aarcange, peterz, minchan, kosaki.motohiro, andi, hnaz,
	mel, linux-kernel, Rik van Riel, Rik van Riel

From: Rik van Riel <riel@surriel.com>

Teach the generic arch_get_unmapped_area(_topdown) code to call the
page colouring code.
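
The colouring rule being generalised here is small enough to show
with made-up numbers (a userspace sketch, not the patch code): a
shared mapping is placed so that the low bits of its virtual address
match the low bits of its file offset, for whatever colour
granularity the architecture's shm_align_mask describes.

#include <stdio.h>

int main(void)
{
	/* Illustrative values only: 16KB colour granularity, 4KB pages. */
	unsigned long shm_align_mask = 0x3fff;
	unsigned long page_shift = 12;
	unsigned long pgoff = 3;		  /* file offset, in pages */
	unsigned long addr  = 0x7f0000001000UL;  /* candidate address     */

	/* Bottom-up alignment: round up to a colour boundary, then add
	 * the colour bits that correspond to the file offset. */
	unsigned long aligned = (addr + shm_align_mask) & ~shm_align_mask;
	aligned += (pgoff << page_shift) & shm_align_mask;

	printf("candidate %#lx -> coloured %#lx\n", addr, aligned);
	printf("matches offset colour: %s\n",
	       ((aligned ^ (pgoff << page_shift)) & shm_align_mask) == 0 ?
	       "yes" : "no");
	return 0;
}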

Signed-off-by: Rik van Riel <riel@redhat.com>
---
 arch/mips/include/asm/page.h      |    2 -
 arch/mips/include/asm/pgtable.h   |    1 +
 arch/x86/include/asm/elf.h        |    3 -
 arch/x86/include/asm/pgtable_64.h |    1 +
 arch/x86/kernel/sys_x86_64.c      |   35 +++++++++-----
 arch/x86/vdso/vma.c               |    2 +-
 include/linux/sched.h             |    8 +++-
 mm/mmap.c                         |   91 ++++++++++++++++++++++++++++++++-----
 8 files changed, 111 insertions(+), 32 deletions(-)

diff --git a/arch/mips/include/asm/page.h b/arch/mips/include/asm/page.h
index da9bd7d..459cc25 100644
--- a/arch/mips/include/asm/page.h
+++ b/arch/mips/include/asm/page.h
@@ -63,8 +63,6 @@ extern void build_copy_page(void);
 extern void clear_page(void * page);
 extern void copy_page(void * to, void * from);
 
-extern unsigned long shm_align_mask;
-
 static inline unsigned long pages_do_alias(unsigned long addr1,
 	unsigned long addr2)
 {
diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
index b2202a6..f133a4c 100644
--- a/arch/mips/include/asm/pgtable.h
+++ b/arch/mips/include/asm/pgtable.h
@@ -415,6 +415,7 @@ int phys_mem_access_prot_allowed(struct file *file, unsigned long pfn,
  */
 #define HAVE_ARCH_UNMAPPED_AREA
 #define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
+#define HAVE_ARCH_ALIGN_ADDR
 
 /*
  * No page table caches to initialise
diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index 5939f44..dc2d0bf 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -358,8 +358,6 @@ static inline int mmap_is_ia32(void)
 enum align_flags {
 	ALIGN_VA_32	= BIT(0),
 	ALIGN_VA_64	= BIT(1),
-	ALIGN_VDSO	= BIT(2),
-	ALIGN_TOPDOWN	= BIT(3),
 };
 
 struct va_alignment {
@@ -368,5 +366,4 @@ struct va_alignment {
 } ____cacheline_aligned;
 
 extern struct va_alignment va_align;
-extern unsigned long align_addr(unsigned long, struct file *, enum align_flags);
 #endif /* _ASM_X86_ELF_H */
diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index 8af36f6..8408ccd 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -170,6 +170,7 @@ extern void cleanup_highmap(void);
 #define HAVE_ARCH_UNMAPPED_AREA
 #define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
 #define HAVE_ARCH_GET_ADDRESS_RANGE
+#define HAVE_ARCH_ALIGN_ADDR
 
 #define pgtable_cache_init()   do { } while (0)
 #define check_pgt_cache()      do { } while (0)
diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
index 2595a5e..ac0afb8 100644
--- a/arch/x86/kernel/sys_x86_64.c
+++ b/arch/x86/kernel/sys_x86_64.c
@@ -25,31 +25,40 @@
  * @flags denotes the allocation direction - bottomup or topdown -
  * or vDSO; see call sites below.
  */
-unsigned long align_addr(unsigned long addr, struct file *filp,
-			 enum align_flags flags)
+unsigned long arch_align_addr(unsigned long addr, struct file *filp,
+			unsigned long pgoff, unsigned long flags,
+			enum mmap_allocation_direction direction)
 {
-	unsigned long tmp_addr;
+	unsigned long tmp_addr = PAGE_ALIGN(addr);
 
 	/* handle 32- and 64-bit case with a single conditional */
 	if (va_align.flags < 0 || !(va_align.flags & (2 - mmap_is_ia32())))
-		return addr;
+		return tmp_addr;
 
-	if (!(current->flags & PF_RANDOMIZE))
-		return addr;
+	/* Always allow MAP_FIXED. Colouring is a performance thing only. */
+	if (flags & MAP_FIXED)
+		return tmp_addr;
 
-	if (!((flags & ALIGN_VDSO) || filp))
-		return addr;
+	if (!(current->flags & PF_RANDOMIZE))
+		return tmp_addr;
 
-	tmp_addr = addr;
+	if (!(filp || direction == ALLOC_VDSO))
+		return tmp_addr;
 
 	/*
 	 * We need an address which is <= than the original
 	 * one only when in topdown direction.
 	 */
-	if (!(flags & ALIGN_TOPDOWN))
+	if (direction == ALLOC_UP)
 		tmp_addr += va_align.mask;
 
 	tmp_addr &= ~va_align.mask;
+	tmp_addr += ((pgoff << PAGE_SHIFT) & va_align.mask);
+
+	if (direction == ALLOC_DOWN && tmp_addr > addr) {
+		tmp_addr -= va_align.mask;
+		tmp_addr &= ~va_align.mask;
+	}
 
 	return tmp_addr;
 }
@@ -159,7 +168,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 
 full_search:
 
-	addr = align_addr(addr, filp, 0);
+	addr = arch_align_addr(addr, filp, pgoff, flags, ALLOC_UP);
 
 	for (vma = find_vma(mm, addr); ; vma = vma->vm_next) {
 		/* At this point:  (!vma || addr < vma->vm_end). */
@@ -186,7 +195,7 @@ full_search:
 			mm->cached_hole_size = vma->vm_start - addr;
 
 		addr = vma->vm_end;
-		addr = align_addr(addr, filp, 0);
+		addr = arch_align_addr(addr, filp, pgoff, flags, ALLOC_UP);
 	}
 }
 
@@ -235,7 +244,7 @@ try_again:
 
 	addr -= len;
 	do {
-		addr = align_addr(addr, filp, ALIGN_TOPDOWN);
+		addr = arch_align_addr(addr, filp, pgoff, flags, ALLOC_DOWN);
 
 		/*
 		 * Lookup failure means no vma is above this address,
diff --git a/arch/x86/vdso/vma.c b/arch/x86/vdso/vma.c
index 00aaf04..83e0355 100644
--- a/arch/x86/vdso/vma.c
+++ b/arch/x86/vdso/vma.c
@@ -141,7 +141,7 @@ static unsigned long vdso_addr(unsigned long start, unsigned len)
 	 * unaligned here as a result of stack start randomization.
 	 */
 	addr = PAGE_ALIGN(addr);
-	addr = align_addr(addr, NULL, ALIGN_VDSO);
+	addr = arch_align_addr(addr, NULL, 0, 0, ALLOC_VDSO);
 
 	return addr;
 }
diff --git a/include/linux/sched.h b/include/linux/sched.h
index fc76318..18f9326 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -390,12 +390,18 @@ extern int sysctl_max_map_count;
 #ifdef CONFIG_MMU
 enum mmap_allocation_direction {
 	ALLOC_UP,
-	ALLOC_DOWN
+	ALLOC_DOWN,
+	ALLOC_VDSO,
 };
 extern void arch_pick_mmap_layout(struct mm_struct *mm);
 extern void
 arch_get_address_range(unsigned long flags, unsigned long *begin,
 		unsigned long *end, enum mmap_allocation_direction direction);
+extern unsigned long shm_align_mask;
+extern unsigned long
+arch_align_addr(unsigned long addr, struct file *filp,
+		unsigned long pgoff, unsigned long flags,
+		enum mmap_allocation_direction direction);
 extern unsigned long
 arch_get_unmapped_area(struct file *, unsigned long, unsigned long,
 		       unsigned long, unsigned long);
diff --git a/mm/mmap.c b/mm/mmap.c
index 92cf0bf..0314cb1 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1465,6 +1465,51 @@ unacct_error:
 	return error;
 }
 
+#ifndef HAVE_ARCH_ALIGN_ADDR
+/* Each architecture is responsible for setting this to the required value. */
+unsigned long shm_align_mask = PAGE_SIZE - 1;
+EXPORT_SYMBOL(shm_align_mask);
+
+unsigned long arch_align_addr(unsigned long addr, struct file *filp,
+			unsigned long pgoff, unsigned long flags,
+			enum mmap_allocation_direction direction)
+{
+	unsigned long tmp_addr = PAGE_ALIGN(addr);
+
+	if (shm_align_mask <= PAGE_SIZE)
+		return tmp_addr;
+
+	/* Allow MAP_FIXED without MAP_SHARED at any address. */
+	if ((flags & (MAP_FIXED|MAP_SHARED)) == MAP_FIXED)
+		return tmp_addr;
+
+	/* Enforce page colouring for any file or MAP_SHARED mapping. */
+	if (!(filp || (flags & MAP_SHARED)))
+		return tmp_addr;
+
+	/*
+	 * We need an address which is <= than the original
+	 * one only when in topdown direction.
+	 */
+	if (direction == ALLOC_UP)
+		tmp_addr += shm_align_mask;
+
+	tmp_addr &= ~shm_align_mask;
+	tmp_addr += ((pgoff << PAGE_SHIFT) & shm_align_mask);
+
+	/*
+	 * When aligning down, make sure we did not accidentally go up.
+	 * The caller will check for underflow.
+	 */
+	if (direction == ALLOC_DOWN && tmp_addr > addr) {
+		tmp_addr -= shm_align_mask;
+		tmp_addr &= ~shm_align_mask;
+	}
+
+	return tmp_addr;
+}
+#endif
+
 #ifndef HAVE_ARCH_GET_ADDRESS_RANGE
 void arch_get_address_range(unsigned long flags, unsigned long *begin,
 		unsigned long *end, enum mmap_allocation_direction direction)
@@ -1513,18 +1558,22 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma = NULL;
 	struct rb_node *rb_node;
-	unsigned long lower_limit, upper_limit;
+	unsigned long lower_limit, upper_limit, tmp_addr;
 
 	arch_get_address_range(flags, &lower_limit, &upper_limit, ALLOC_UP);
 
 	if (len > TASK_SIZE)
 		return -ENOMEM;
 
-	if (flags & MAP_FIXED)
+	if (flags & MAP_FIXED) {
+		tmp_addr = arch_align_addr(addr, filp, pgoff, flags, ALLOC_UP);
+		if (tmp_addr != PAGE_ALIGN(addr))
+			return -EINVAL;
 		return addr;
+	}
 
 	if (addr) {
-		addr = PAGE_ALIGN(addr);
+		addr = arch_align_addr(addr, filp, pgoff, flags, ALLOC_UP);
 		vma = find_vma(mm, addr);
 		if (TASK_SIZE - len >= addr &&
 		    (!vma || addr + len <= vma->vm_start))
@@ -1533,7 +1582,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 
 	/* Find the left-most free area of sufficient size. */
 	for (addr = 0, rb_node = mm->mm_rb.rb_node; rb_node; ) {
-		unsigned long vma_start;
+		unsigned long vma_start, tmp_addr;
 		int found_here = 0;
 
 		vma = rb_to_vma(rb_node);
@@ -1541,13 +1590,17 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 		if (vma->vm_start > len) {
 			if (!vma->vm_prev) {
 				/* This is the left-most VMA. */
-				if (vma->vm_start - len >= lower_limit) {
-					addr = lower_limit;
+				tmp_addr = arch_align_addr(lower_limit, filp,
+						pgoff, flags, ALLOC_UP);
+				if (vma->vm_start - len >= tmp_addr) {
+					addr = tmp_addr;
 					goto found_addr;
 				}
 			} else {
 				/* Is this hole large enough? Remember it. */
 				vma_start = max(vma->vm_prev->vm_end, lower_limit);
+				vma_start = arch_align_addr(vma_start, filp,
+						pgoff, flags, ALLOC_UP);
 				if (vma->vm_start - len >= vma_start) {
 					addr = vma_start;
 					found_here = 1;
@@ -1599,6 +1652,8 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 	if (addr < lower_limit)
 		addr = lower_limit;
 
+	addr = arch_align_addr(addr, filp, pgoff, flags, ALLOC_UP);
+
  found_addr:
 	if (TASK_SIZE - len < addr)
 		return -ENOMEM;
@@ -1656,12 +1711,17 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 	if (len > TASK_SIZE)
 		return -ENOMEM;
 
-	if (flags & MAP_FIXED)
+	if (flags & MAP_FIXED) {
+		unsigned long tmp_addr;
+		tmp_addr = arch_align_addr(addr, filp, pgoff, flags, ALLOC_DOWN);
+		if (tmp_addr != PAGE_ALIGN(addr))
+			return -EINVAL;
 		return addr;
+	}
 
 	/* requesting a specific address */
 	if (addr) {
-		addr = PAGE_ALIGN(addr);
+		addr = arch_align_addr(addr, filp, pgoff, flags, ALLOC_DOWN);
 		vma = find_vma(mm, addr);
 		if (TASK_SIZE - len >= addr &&
 				(!vma || addr + len <= vma->vm_start))
@@ -1678,7 +1738,9 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 	 */
 	if (upper_limit - len > mm->highest_vma) {
 		addr = upper_limit - len;
-		goto found_addr;
+		addr = arch_align_addr(addr, filp, pgoff, flags, ALLOC_DOWN);
+		if (addr > mm->highest_vma)
+			goto found_addr;
 	}
 
 	/* Find the right-most free area of sufficient size. */
@@ -1691,9 +1753,14 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 		/* Is this hole large enough? Remember it. */
 		vma_start = min(vma->vm_start, upper_limit);
 		if (vma_start > len) {
-			if (!vma->vm_prev ||
-			    (vma_start - len >= vma->vm_prev->vm_end)) {
-				addr = vma_start - len;
+			unsigned long tmp_addr = vma_start - len;
+			tmp_addr = arch_align_addr(tmp_addr, filp,
+						   pgoff, flags, ALLOC_DOWN);
+			/* No underflow? Does it still fit the hole? */
+			if (tmp_addr && tmp_addr <= vma_start - len &&
+					(!vma->vm_prev ||
+					 tmp_addr >= vma->vm_prev->vm_end)) {
+				addr = tmp_addr;
 				found_here = 1;
 			}
 		}
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH -mm 3/6] Fix the x86-64 page colouring code to take pgoff into account and use that code as the basis for a generic page colouring code.
@ 2012-06-18 14:31   ` Rik van Riel
  0 siblings, 0 replies; 30+ messages in thread
From: Rik van Riel @ 2012-06-18 14:31 UTC (permalink / raw)
  To: linux-mm
  Cc: akpm, aarcange, peterz, minchan, kosaki.motohiro, andi, hnaz,
	mel, linux-kernel, Rik van Riel, Rik van Riel

From: Rik van Riel <riel@surriel.com>

Teach the generic arch_get_unmapped_area(_topdown) code to call the
page colouring code.
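
To illustrate what the colouring does, the bottom-up alignment arithmetic
used by the generic arch_align_addr() below can be reproduced in userspace.
This is only a sketch: the 16KB aliasing window and the helper name are
illustrative, real kernels take the mask from shm_align_mask.

/*
 * Userspace sketch of the colour-alignment arithmetic the generic
 * arch_align_addr() applies when allocating bottom-up (ALLOC_UP).
 * SHM_ALIGN_MASK is an assumed example value (16KB window); the
 * kernel gets it from shm_align_mask.
 */
#include <stdio.h>
#include <assert.h>

#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define PAGE_ALIGN(x)	(((x) + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1))
#define SHM_ALIGN_MASK	((1UL << 14) - 1)	/* assumed: 16KB SHMLBA */

static unsigned long colour_align_up(unsigned long addr, unsigned long pgoff)
{
	unsigned long tmp = PAGE_ALIGN(addr);

	/* Round up to the aliasing window, then add the offset's colour. */
	tmp += SHM_ALIGN_MASK;
	tmp &= ~SHM_ALIGN_MASK;
	tmp += (pgoff << PAGE_SHIFT) & SHM_ALIGN_MASK;
	return tmp;
}

int main(void)
{
	unsigned long addr = 0x12345000UL, pgoff = 3;
	unsigned long aligned = colour_align_up(addr, pgoff);

	/* Result is >= the hint and shares the file offset's colour. */
	assert(aligned >= addr);
	assert(((aligned - (pgoff << PAGE_SHIFT)) & SHM_ALIGN_MASK) == 0);
	printf("%#lx -> %#lx (pgoff %lu)\n", addr, aligned, pgoff);
	return 0;
}

Keeping the returned address congruent to the file offset modulo the
aliasing window is what prevents cache aliases for shared mappings on
VIPT-cache machines.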

Signed-off-by: Rik van Riel <riel@redhat.com>
---
 arch/mips/include/asm/page.h      |    2 -
 arch/mips/include/asm/pgtable.h   |    1 +
 arch/x86/include/asm/elf.h        |    3 -
 arch/x86/include/asm/pgtable_64.h |    1 +
 arch/x86/kernel/sys_x86_64.c      |   35 +++++++++-----
 arch/x86/vdso/vma.c               |    2 +-
 include/linux/sched.h             |    8 +++-
 mm/mmap.c                         |   91 ++++++++++++++++++++++++++++++++-----
 8 files changed, 111 insertions(+), 32 deletions(-)

diff --git a/arch/mips/include/asm/page.h b/arch/mips/include/asm/page.h
index da9bd7d..459cc25 100644
--- a/arch/mips/include/asm/page.h
+++ b/arch/mips/include/asm/page.h
@@ -63,8 +63,6 @@ extern void build_copy_page(void);
 extern void clear_page(void * page);
 extern void copy_page(void * to, void * from);
 
-extern unsigned long shm_align_mask;
-
 static inline unsigned long pages_do_alias(unsigned long addr1,
 	unsigned long addr2)
 {
diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
index b2202a6..f133a4c 100644
--- a/arch/mips/include/asm/pgtable.h
+++ b/arch/mips/include/asm/pgtable.h
@@ -415,6 +415,7 @@ int phys_mem_access_prot_allowed(struct file *file, unsigned long pfn,
  */
 #define HAVE_ARCH_UNMAPPED_AREA
 #define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
+#define HAVE_ARCH_ALIGN_ADDR
 
 /*
  * No page table caches to initialise
diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index 5939f44..dc2d0bf 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -358,8 +358,6 @@ static inline int mmap_is_ia32(void)
 enum align_flags {
 	ALIGN_VA_32	= BIT(0),
 	ALIGN_VA_64	= BIT(1),
-	ALIGN_VDSO	= BIT(2),
-	ALIGN_TOPDOWN	= BIT(3),
 };
 
 struct va_alignment {
@@ -368,5 +366,4 @@ struct va_alignment {
 } ____cacheline_aligned;
 
 extern struct va_alignment va_align;
-extern unsigned long align_addr(unsigned long, struct file *, enum align_flags);
 #endif /* _ASM_X86_ELF_H */
diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index 8af36f6..8408ccd 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -170,6 +170,7 @@ extern void cleanup_highmap(void);
 #define HAVE_ARCH_UNMAPPED_AREA
 #define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
 #define HAVE_ARCH_GET_ADDRESS_RANGE
+#define HAVE_ARCH_ALIGN_ADDR
 
 #define pgtable_cache_init()   do { } while (0)
 #define check_pgt_cache()      do { } while (0)
diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
index 2595a5e..ac0afb8 100644
--- a/arch/x86/kernel/sys_x86_64.c
+++ b/arch/x86/kernel/sys_x86_64.c
@@ -25,31 +25,40 @@
  * @flags denotes the allocation direction - bottomup or topdown -
  * or vDSO; see call sites below.
  */
-unsigned long align_addr(unsigned long addr, struct file *filp,
-			 enum align_flags flags)
+unsigned long arch_align_addr(unsigned long addr, struct file *filp,
+			unsigned long pgoff, unsigned long flags,
+			enum mmap_allocation_direction direction)
 {
-	unsigned long tmp_addr;
+	unsigned long tmp_addr = PAGE_ALIGN(addr);
 
 	/* handle 32- and 64-bit case with a single conditional */
 	if (va_align.flags < 0 || !(va_align.flags & (2 - mmap_is_ia32())))
-		return addr;
+		return tmp_addr;
 
-	if (!(current->flags & PF_RANDOMIZE))
-		return addr;
+	/* Always allow MAP_FIXED. Colouring is a performance thing only. */
+	if (flags & MAP_FIXED)
+		return tmp_addr;
 
-	if (!((flags & ALIGN_VDSO) || filp))
-		return addr;
+	if (!(current->flags & PF_RANDOMIZE))
+		return tmp_addr;
 
-	tmp_addr = addr;
+	if (!(filp || direction == ALLOC_VDSO))
+		return tmp_addr;
 
 	/*
 	 * We need an address which is <= than the original
 	 * one only when in topdown direction.
 	 */
-	if (!(flags & ALIGN_TOPDOWN))
+	if (direction == ALLOC_UP)
 		tmp_addr += va_align.mask;
 
 	tmp_addr &= ~va_align.mask;
+	tmp_addr += ((pgoff << PAGE_SHIFT) & va_align.mask);
+
+	if (direction == ALLOC_DOWN && tmp_addr > addr) {
+		tmp_addr -= va_align.mask;
+		tmp_addr &= ~va_align.mask;
+	}
 
 	return tmp_addr;
 }
@@ -159,7 +168,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 
 full_search:
 
-	addr = align_addr(addr, filp, 0);
+	addr = arch_align_addr(addr, filp, pgoff, flags, ALLOC_UP);
 
 	for (vma = find_vma(mm, addr); ; vma = vma->vm_next) {
 		/* At this point:  (!vma || addr < vma->vm_end). */
@@ -186,7 +195,7 @@ full_search:
 			mm->cached_hole_size = vma->vm_start - addr;
 
 		addr = vma->vm_end;
-		addr = align_addr(addr, filp, 0);
+		addr = arch_align_addr(addr, filp, pgoff, flags, ALLOC_UP);
 	}
 }
 
@@ -235,7 +244,7 @@ try_again:
 
 	addr -= len;
 	do {
-		addr = align_addr(addr, filp, ALIGN_TOPDOWN);
+		addr = arch_align_addr(addr, filp, pgoff, flags, ALLOC_DOWN);
 
 		/*
 		 * Lookup failure means no vma is above this address,
diff --git a/arch/x86/vdso/vma.c b/arch/x86/vdso/vma.c
index 00aaf04..83e0355 100644
--- a/arch/x86/vdso/vma.c
+++ b/arch/x86/vdso/vma.c
@@ -141,7 +141,7 @@ static unsigned long vdso_addr(unsigned long start, unsigned len)
 	 * unaligned here as a result of stack start randomization.
 	 */
 	addr = PAGE_ALIGN(addr);
-	addr = align_addr(addr, NULL, ALIGN_VDSO);
+	addr = arch_align_addr(addr, NULL, 0, 0, ALLOC_VDSO);
 
 	return addr;
 }
diff --git a/include/linux/sched.h b/include/linux/sched.h
index fc76318..18f9326 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -390,12 +390,18 @@ extern int sysctl_max_map_count;
 #ifdef CONFIG_MMU
 enum mmap_allocation_direction {
 	ALLOC_UP,
-	ALLOC_DOWN
+	ALLOC_DOWN,
+	ALLOC_VDSO,
 };
 extern void arch_pick_mmap_layout(struct mm_struct *mm);
 extern void
 arch_get_address_range(unsigned long flags, unsigned long *begin,
 		unsigned long *end, enum mmap_allocation_direction direction);
+extern unsigned long shm_align_mask;
+extern unsigned long
+arch_align_addr(unsigned long addr, struct file *filp,
+		unsigned long pgoff, unsigned long flags,
+		enum mmap_allocation_direction direction);
 extern unsigned long
 arch_get_unmapped_area(struct file *, unsigned long, unsigned long,
 		       unsigned long, unsigned long);
diff --git a/mm/mmap.c b/mm/mmap.c
index 92cf0bf..0314cb1 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1465,6 +1465,51 @@ unacct_error:
 	return error;
 }
 
+#ifndef HAVE_ARCH_ALIGN_ADDR
+/* Each architecture is responsible for setting this to the required value. */
+unsigned long shm_align_mask = PAGE_SIZE - 1;
+EXPORT_SYMBOL(shm_align_mask);
+
+unsigned long arch_align_addr(unsigned long addr, struct file *filp,
+			unsigned long pgoff, unsigned long flags,
+			enum mmap_allocation_direction direction)
+{
+	unsigned long tmp_addr = PAGE_ALIGN(addr);
+
+	if (shm_align_mask <= PAGE_SIZE)
+		return tmp_addr;
+
+	/* Allow MAP_FIXED without MAP_SHARED at any address. */
+	if ((flags & (MAP_FIXED|MAP_SHARED)) == MAP_FIXED)
+		return tmp_addr;
+
+	/* Enforce page colouring for any file or MAP_SHARED mapping. */
+	if (!(filp || (flags & MAP_SHARED)))
+		return tmp_addr;
+
+	/*
+	 * We need an address which is <= the original one
+	 * only when allocating in the topdown direction.
+	 */
+	if (direction == ALLOC_UP)
+		tmp_addr += shm_align_mask;
+
+	tmp_addr &= ~shm_align_mask;
+	tmp_addr += ((pgoff << PAGE_SHIFT) & shm_align_mask);
+
+	/*
+	 * When aligning down, make sure we did not accidentally go up.
+	 * The caller will check for underflow.
+	 */
+	if (direction == ALLOC_DOWN && tmp_addr > addr) {
+		tmp_addr -= shm_align_mask;
+		tmp_addr &= ~shm_align_mask;
+	}
+
+	return tmp_addr;
+}
+#endif
+
 #ifndef HAVE_ARCH_GET_ADDRESS_RANGE
 void arch_get_address_range(unsigned long flags, unsigned long *begin,
 		unsigned long *end, enum mmap_allocation_direction direction)
@@ -1513,18 +1558,22 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma = NULL;
 	struct rb_node *rb_node;
-	unsigned long lower_limit, upper_limit;
+	unsigned long lower_limit, upper_limit, tmp_addr;
 
 	arch_get_address_range(flags, &lower_limit, &upper_limit, ALLOC_UP);
 
 	if (len > TASK_SIZE)
 		return -ENOMEM;
 
-	if (flags & MAP_FIXED)
+	if (flags & MAP_FIXED) {
+		tmp_addr = arch_align_addr(addr, filp, pgoff, flags, ALLOC_UP);
+		if (tmp_addr != PAGE_ALIGN(addr))
+			return -EINVAL;
 		return addr;
+	}
 
 	if (addr) {
-		addr = PAGE_ALIGN(addr);
+		addr = arch_align_addr(addr, filp, pgoff, flags, ALLOC_UP);
 		vma = find_vma(mm, addr);
 		if (TASK_SIZE - len >= addr &&
 		    (!vma || addr + len <= vma->vm_start))
@@ -1533,7 +1582,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 
 	/* Find the left-most free area of sufficient size. */
 	for (addr = 0, rb_node = mm->mm_rb.rb_node; rb_node; ) {
-		unsigned long vma_start;
+		unsigned long vma_start, tmp_addr;
 		int found_here = 0;
 
 		vma = rb_to_vma(rb_node);
@@ -1541,13 +1590,17 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 		if (vma->vm_start > len) {
 			if (!vma->vm_prev) {
 				/* This is the left-most VMA. */
-				if (vma->vm_start - len >= lower_limit) {
-					addr = lower_limit;
+				tmp_addr = arch_align_addr(lower_limit, filp,
+						pgoff, flags, ALLOC_UP);
+				if (vma->vm_start - len >= tmp_addr) {
+					addr = tmp_addr;
 					goto found_addr;
 				}
 			} else {
 				/* Is this hole large enough? Remember it. */
 				vma_start = max(vma->vm_prev->vm_end, lower_limit);
+				vma_start = arch_align_addr(vma_start, filp,
+						pgoff, flags, ALLOC_UP);
 				if (vma->vm_start - len >= vma_start) {
 					addr = vma_start;
 					found_here = 1;
@@ -1599,6 +1652,8 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 	if (addr < lower_limit)
 		addr = lower_limit;
 
+	addr = arch_align_addr(addr, filp, pgoff, flags, ALLOC_UP);
+
  found_addr:
 	if (TASK_SIZE - len < addr)
 		return -ENOMEM;
@@ -1656,12 +1711,17 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 	if (len > TASK_SIZE)
 		return -ENOMEM;
 
-	if (flags & MAP_FIXED)
+	if (flags & MAP_FIXED) {
+		unsigned long tmp_addr;
+		tmp_addr = arch_align_addr(addr, filp, pgoff, flags, ALLOC_DOWN);
+		if (tmp_addr != PAGE_ALIGN(addr))
+			return -EINVAL;
 		return addr;
+	}
 
 	/* requesting a specific address */
 	if (addr) {
-		addr = PAGE_ALIGN(addr);
+		addr = arch_align_addr(addr, filp, pgoff, flags, ALLOC_DOWN);
 		vma = find_vma(mm, addr);
 		if (TASK_SIZE - len >= addr &&
 				(!vma || addr + len <= vma->vm_start))
@@ -1678,7 +1738,9 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 	 */
 	if (upper_limit - len > mm->highest_vma) {
 		addr = upper_limit - len;
-		goto found_addr;
+		addr = arch_align_addr(addr, filp, pgoff, flags, ALLOC_DOWN);
+		if (addr > mm->highest_vma)
+			goto found_addr;
 	}
 
 	/* Find the right-most free area of sufficient size. */
@@ -1691,9 +1753,14 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 		/* Is this hole large enough? Remember it. */
 		vma_start = min(vma->vm_start, upper_limit);
 		if (vma_start > len) {
-			if (!vma->vm_prev ||
-			    (vma_start - len >= vma->vm_prev->vm_end)) {
-				addr = vma_start - len;
+			unsigned long tmp_addr = vma_start - len;
+			tmp_addr = arch_align_addr(tmp_addr, filp,
+						   pgoff, flags, ALLOC_DOWN);
+			/* No underflow? Does it still fit the hole? */
+			if (tmp_addr && tmp_addr <= vma_start - len &&
+					(!vma->vm_prev ||
+					 tmp_addr >= vma->vm_prev->vm_end)) {
+				addr = tmp_addr;
 				found_here = 1;
 			}
 		}
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH -mm 4/6] mm: remove x86 arch_get_unmapped_area(_topdown)
  2012-06-18 14:31 ` Rik van Riel
@ 2012-06-18 14:31   ` Rik van Riel
  -1 siblings, 0 replies; 30+ messages in thread
From: Rik van Riel @ 2012-06-18 14:31 UTC (permalink / raw)
  To: linux-mm
  Cc: akpm, aarcange, peterz, minchan, kosaki.motohiro, andi, hnaz,
	mel, linux-kernel, Rik van Riel, Rik van Riel

From: Rik van Riel <riel@surriel.com>

The generic arch_get_unmapped_area(_topdown) should now be able
to do everything x86 needs.  Remove the x86 specific functions.

TODO: make the hugetlbfs arch_get_unmapped_area call the generic
code with proper alignment info.
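
What the generic search cannot know by itself is x86 address-range
policy such as MAP_32BIT; that is exactly what the remaining
arch_get_address_range() hook keeps supplying. A quick userspace
sanity check of that policy follows; it is not part of the patch,
and the 2GB figure comes from the mmap(2) man page.

/* Check that a MAP_32BIT mapping really lands in the low 2GB. */
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
	size_t len = 1 << 20;
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_32BIT, -1, 0);

	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	printf("MAP_32BIT mapping at %p (%s the 2GB boundary)\n", p,
	       (unsigned long)p < (1UL << 31) ? "below" : "above");
	munmap(p, len);
	return 0;
}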

Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Rik van Riel <riel@redhat.com>
---
 arch/x86/include/asm/pgtable_64.h |    2 -
 arch/x86/kernel/sys_x86_64.c      |  162 -------------------------------------
 2 files changed, 0 insertions(+), 164 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index 8408ccd..0ff6500 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -167,8 +167,6 @@ static inline int pgd_large(pgd_t pgd) { return 0; }
 extern int kern_addr_valid(unsigned long addr);
 extern void cleanup_highmap(void);
 
-#define HAVE_ARCH_UNMAPPED_AREA
-#define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
 #define HAVE_ARCH_GET_ADDRESS_RANGE
 #define HAVE_ARCH_ALIGN_ADDR
 
diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
index ac0afb8..0243c58 100644
--- a/arch/x86/kernel/sys_x86_64.c
+++ b/arch/x86/kernel/sys_x86_64.c
@@ -131,165 +131,3 @@ void arch_get_address_range(unsigned long flags, unsigned long *begin,
 		*end = current->mm->mmap_base;
 	}
 }
-
-unsigned long
-arch_get_unmapped_area(struct file *filp, unsigned long addr,
-		unsigned long len, unsigned long pgoff, unsigned long flags)
-{
-	struct mm_struct *mm = current->mm;
-	struct vm_area_struct *vma;
-	unsigned long start_addr;
-	unsigned long begin, end;
-
-	if (flags & MAP_FIXED)
-		return addr;
-
-	arch_get_address_range(flags, &begin, &end, ALLOC_UP);
-
-	if (len > end)
-		return -ENOMEM;
-
-	if (addr) {
-		addr = PAGE_ALIGN(addr);
-		vma = find_vma(mm, addr);
-		if (end - len >= addr &&
-		    (!vma || addr + len <= vma->vm_start))
-			return addr;
-	}
-	if (((flags & MAP_32BIT) || test_thread_flag(TIF_ADDR32))
-	    && len <= mm->cached_hole_size) {
-		mm->cached_hole_size = 0;
-		mm->free_area_cache = begin;
-	}
-	addr = mm->free_area_cache;
-	if (addr < begin)
-		addr = begin;
-	start_addr = addr;
-
-full_search:
-
-	addr = arch_align_addr(addr, filp, pgoff, flags, ALLOC_UP);
-
-	for (vma = find_vma(mm, addr); ; vma = vma->vm_next) {
-		/* At this point:  (!vma || addr < vma->vm_end). */
-		if (end - len < addr) {
-			/*
-			 * Start a new search - just in case we missed
-			 * some holes.
-			 */
-			if (start_addr != begin) {
-				start_addr = addr = begin;
-				mm->cached_hole_size = 0;
-				goto full_search;
-			}
-			return -ENOMEM;
-		}
-		if (!vma || addr + len <= vma->vm_start) {
-			/*
-			 * Remember the place where we stopped the search:
-			 */
-			mm->free_area_cache = addr + len;
-			return addr;
-		}
-		if (addr + mm->cached_hole_size < vma->vm_start)
-			mm->cached_hole_size = vma->vm_start - addr;
-
-		addr = vma->vm_end;
-		addr = arch_align_addr(addr, filp, pgoff, flags, ALLOC_UP);
-	}
-}
-
-
-unsigned long
-arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
-			  const unsigned long len, const unsigned long pgoff,
-			  const unsigned long flags)
-{
-	struct vm_area_struct *vma;
-	struct mm_struct *mm = current->mm;
-	unsigned long addr = addr0, start_addr;
-
-	/* requested length too big for entire address space */
-	if (len > TASK_SIZE)
-		return -ENOMEM;
-
-	if (flags & MAP_FIXED)
-		return addr;
-
-	/* for MAP_32BIT mappings we force the legact mmap base */
-	if (!test_thread_flag(TIF_ADDR32) && (flags & MAP_32BIT))
-		goto bottomup;
-
-	/* requesting a specific address */
-	if (addr) {
-		addr = PAGE_ALIGN(addr);
-		vma = find_vma(mm, addr);
-		if (TASK_SIZE - len >= addr &&
-				(!vma || addr + len <= vma->vm_start))
-			return addr;
-	}
-
-	/* check if free_area_cache is useful for us */
-	if (len <= mm->cached_hole_size) {
-		mm->cached_hole_size = 0;
-		mm->free_area_cache = mm->mmap_base;
-	}
-
-try_again:
-	/* either no address requested or can't fit in requested address hole */
-	start_addr = addr = mm->free_area_cache;
-
-	if (addr < len)
-		goto fail;
-
-	addr -= len;
-	do {
-		addr = arch_align_addr(addr, filp, pgoff, flags, ALLOC_DOWN);
-
-		/*
-		 * Lookup failure means no vma is above this address,
-		 * else if new region fits below vma->vm_start,
-		 * return with success:
-		 */
-		vma = find_vma(mm, addr);
-		if (!vma || addr+len <= vma->vm_start)
-			/* remember the address as a hint for next time */
-			return mm->free_area_cache = addr;
-
-		/* remember the largest hole we saw so far */
-		if (addr + mm->cached_hole_size < vma->vm_start)
-			mm->cached_hole_size = vma->vm_start - addr;
-
-		/* try just below the current vma->vm_start */
-		addr = vma->vm_start-len;
-	} while (len < vma->vm_start);
-
-fail:
-	/*
-	 * if hint left us with no space for the requested
-	 * mapping then try again:
-	 */
-	if (start_addr != mm->mmap_base) {
-		mm->free_area_cache = mm->mmap_base;
-		mm->cached_hole_size = 0;
-		goto try_again;
-	}
-
-bottomup:
-	/*
-	 * A failed mmap() very likely causes application failure,
-	 * so fall back to the bottom-up function here. This scenario
-	 * can happen with large stack limits and large mmap()
-	 * allocations.
-	 */
-	mm->cached_hole_size = ~0UL;
-	mm->free_area_cache = TASK_UNMAPPED_BASE;
-	addr = arch_get_unmapped_area(filp, addr0, len, pgoff, flags);
-	/*
-	 * Restore the topdown base:
-	 */
-	mm->free_area_cache = mm->mmap_base;
-	mm->cached_hole_size = ~0UL;
-
-	return addr;
-}
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH -mm 4/6] mm: remove x86 arch_get_unmapped_area(_topdown)
@ 2012-06-18 14:31   ` Rik van Riel
  0 siblings, 0 replies; 30+ messages in thread
From: Rik van Riel @ 2012-06-18 14:31 UTC (permalink / raw)
  To: linux-mm
  Cc: akpm, aarcange, peterz, minchan, kosaki.motohiro, andi, hnaz,
	mel, linux-kernel, Rik van Riel, Rik van Riel

From: Rik van Riel <riel@surriel.com>

The generic arch_get_unmapped_area(_topdown) should now be able
to do everything x86 needs.  Remove the x86 specific functions.

TODO: make the hugetlbfs arch_get_unmapped_area call the generic
code with proper alignment info.

Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Rik van Riel <riel@redhat.com>
---
 arch/x86/include/asm/pgtable_64.h |    2 -
 arch/x86/kernel/sys_x86_64.c      |  162 -------------------------------------
 2 files changed, 0 insertions(+), 164 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index 8408ccd..0ff6500 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -167,8 +167,6 @@ static inline int pgd_large(pgd_t pgd) { return 0; }
 extern int kern_addr_valid(unsigned long addr);
 extern void cleanup_highmap(void);
 
-#define HAVE_ARCH_UNMAPPED_AREA
-#define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
 #define HAVE_ARCH_GET_ADDRESS_RANGE
 #define HAVE_ARCH_ALIGN_ADDR
 
diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
index ac0afb8..0243c58 100644
--- a/arch/x86/kernel/sys_x86_64.c
+++ b/arch/x86/kernel/sys_x86_64.c
@@ -131,165 +131,3 @@ void arch_get_address_range(unsigned long flags, unsigned long *begin,
 		*end = current->mm->mmap_base;
 	}
 }
-
-unsigned long
-arch_get_unmapped_area(struct file *filp, unsigned long addr,
-		unsigned long len, unsigned long pgoff, unsigned long flags)
-{
-	struct mm_struct *mm = current->mm;
-	struct vm_area_struct *vma;
-	unsigned long start_addr;
-	unsigned long begin, end;
-
-	if (flags & MAP_FIXED)
-		return addr;
-
-	arch_get_address_range(flags, &begin, &end, ALLOC_UP);
-
-	if (len > end)
-		return -ENOMEM;
-
-	if (addr) {
-		addr = PAGE_ALIGN(addr);
-		vma = find_vma(mm, addr);
-		if (end - len >= addr &&
-		    (!vma || addr + len <= vma->vm_start))
-			return addr;
-	}
-	if (((flags & MAP_32BIT) || test_thread_flag(TIF_ADDR32))
-	    && len <= mm->cached_hole_size) {
-		mm->cached_hole_size = 0;
-		mm->free_area_cache = begin;
-	}
-	addr = mm->free_area_cache;
-	if (addr < begin)
-		addr = begin;
-	start_addr = addr;
-
-full_search:
-
-	addr = arch_align_addr(addr, filp, pgoff, flags, ALLOC_UP);
-
-	for (vma = find_vma(mm, addr); ; vma = vma->vm_next) {
-		/* At this point:  (!vma || addr < vma->vm_end). */
-		if (end - len < addr) {
-			/*
-			 * Start a new search - just in case we missed
-			 * some holes.
-			 */
-			if (start_addr != begin) {
-				start_addr = addr = begin;
-				mm->cached_hole_size = 0;
-				goto full_search;
-			}
-			return -ENOMEM;
-		}
-		if (!vma || addr + len <= vma->vm_start) {
-			/*
-			 * Remember the place where we stopped the search:
-			 */
-			mm->free_area_cache = addr + len;
-			return addr;
-		}
-		if (addr + mm->cached_hole_size < vma->vm_start)
-			mm->cached_hole_size = vma->vm_start - addr;
-
-		addr = vma->vm_end;
-		addr = arch_align_addr(addr, filp, pgoff, flags, ALLOC_UP);
-	}
-}
-
-
-unsigned long
-arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
-			  const unsigned long len, const unsigned long pgoff,
-			  const unsigned long flags)
-{
-	struct vm_area_struct *vma;
-	struct mm_struct *mm = current->mm;
-	unsigned long addr = addr0, start_addr;
-
-	/* requested length too big for entire address space */
-	if (len > TASK_SIZE)
-		return -ENOMEM;
-
-	if (flags & MAP_FIXED)
-		return addr;
-
-	/* for MAP_32BIT mappings we force the legact mmap base */
-	if (!test_thread_flag(TIF_ADDR32) && (flags & MAP_32BIT))
-		goto bottomup;
-
-	/* requesting a specific address */
-	if (addr) {
-		addr = PAGE_ALIGN(addr);
-		vma = find_vma(mm, addr);
-		if (TASK_SIZE - len >= addr &&
-				(!vma || addr + len <= vma->vm_start))
-			return addr;
-	}
-
-	/* check if free_area_cache is useful for us */
-	if (len <= mm->cached_hole_size) {
-		mm->cached_hole_size = 0;
-		mm->free_area_cache = mm->mmap_base;
-	}
-
-try_again:
-	/* either no address requested or can't fit in requested address hole */
-	start_addr = addr = mm->free_area_cache;
-
-	if (addr < len)
-		goto fail;
-
-	addr -= len;
-	do {
-		addr = arch_align_addr(addr, filp, pgoff, flags, ALLOC_DOWN);
-
-		/*
-		 * Lookup failure means no vma is above this address,
-		 * else if new region fits below vma->vm_start,
-		 * return with success:
-		 */
-		vma = find_vma(mm, addr);
-		if (!vma || addr+len <= vma->vm_start)
-			/* remember the address as a hint for next time */
-			return mm->free_area_cache = addr;
-
-		/* remember the largest hole we saw so far */
-		if (addr + mm->cached_hole_size < vma->vm_start)
-			mm->cached_hole_size = vma->vm_start - addr;
-
-		/* try just below the current vma->vm_start */
-		addr = vma->vm_start-len;
-	} while (len < vma->vm_start);
-
-fail:
-	/*
-	 * if hint left us with no space for the requested
-	 * mapping then try again:
-	 */
-	if (start_addr != mm->mmap_base) {
-		mm->free_area_cache = mm->mmap_base;
-		mm->cached_hole_size = 0;
-		goto try_again;
-	}
-
-bottomup:
-	/*
-	 * A failed mmap() very likely causes application failure,
-	 * so fall back to the bottom-up function here. This scenario
-	 * can happen with large stack limits and large mmap()
-	 * allocations.
-	 */
-	mm->cached_hole_size = ~0UL;
-	mm->free_area_cache = TASK_UNMAPPED_BASE;
-	addr = arch_get_unmapped_area(filp, addr0, len, pgoff, flags);
-	/*
-	 * Restore the topdown base:
-	 */
-	mm->free_area_cache = mm->mmap_base;
-	mm->cached_hole_size = ~0UL;
-
-	return addr;
-}
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH -mm 5/6] remove MIPS arch_get_unmapped_area code
  2012-06-18 14:31 ` Rik van Riel
@ 2012-06-18 14:31   ` Rik van Riel
  -1 siblings, 0 replies; 30+ messages in thread
From: Rik van Riel @ 2012-06-18 14:31 UTC (permalink / raw)
  To: linux-mm
  Cc: akpm, aarcange, peterz, minchan, kosaki.motohiro, andi, hnaz,
	mel, linux-kernel, Rik van Riel, Ralf Baechle, sjhill,
	Rik van Riel

From: Rik van Riel <riel@surriel.com>

Remove all the MIPS specific arch_get_unmapped_area(_topdown) and
page colouring code, now that the generic code should be able to
handle things.

Untested, because I do not have any MIPS systems.
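
The one behavioural detail worth preserving from the removed code is
the MAP_FIXED case: a fixed shared mapping is rejected unless its
address has the same cache colour as the file offset. A standalone
sketch of that predicate, with an assumed 16KB aliasing window and an
illustrative helper name:

/* The aliasing check the removed MIPS MAP_FIXED path performed, and
 * which the generic code must keep enforcing: address and file offset
 * must agree modulo the aliasing window. */
#include <stdio.h>

#define PAGE_SHIFT	12
#define SHM_ALIGN_MASK	((1UL << 14) - 1)	/* example: 16KB window */

static int fixed_shared_ok(unsigned long addr, unsigned long pgoff)
{
	return ((addr - (pgoff << PAGE_SHIFT)) & SHM_ALIGN_MASK) == 0;
}

int main(void)
{
	/* colour matches: 0x10007000 - 0x3000 is a multiple of 16KB */
	printf("%d\n", fixed_shared_ok(0x10007000UL, 3));	/* prints 1 */
	/* colour mismatch: would alias in a VIPT cache -> -EINVAL */
	printf("%d\n", fixed_shared_ok(0x10008000UL, 3));	/* prints 0 */
	return 0;
}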

Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: sjhill@mips.com
Signed-off-by: Rik van Riel <riel@redhat.com>
---
 arch/mips/include/asm/pgtable.h |    8 --
 arch/mips/mm/mmap.c             |  175 ---------------------------------------
 2 files changed, 0 insertions(+), 183 deletions(-)

diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
index f133a4c..5f9c49a 100644
--- a/arch/mips/include/asm/pgtable.h
+++ b/arch/mips/include/asm/pgtable.h
@@ -410,14 +410,6 @@ int phys_mem_access_prot_allowed(struct file *file, unsigned long pfn,
 #endif
 
 /*
- * We provide our own get_unmapped area to cope with the virtual aliasing
- * constraints placed on us by the cache architecture.
- */
-#define HAVE_ARCH_UNMAPPED_AREA
-#define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
-#define HAVE_ARCH_ALIGN_ADDR
-
-/*
  * No page table caches to initialise
  */
 #define pgtable_cache_init()	do { } while (0)
diff --git a/arch/mips/mm/mmap.c b/arch/mips/mm/mmap.c
index 3f8af17..ac342bd 100644
--- a/arch/mips/mm/mmap.c
+++ b/arch/mips/mm/mmap.c
@@ -15,9 +15,6 @@
 #include <linux/random.h>
 #include <linux/sched.h>
 
-unsigned long shm_align_mask = PAGE_SIZE - 1;	/* Sane caches */
-EXPORT_SYMBOL(shm_align_mask);
-
 /* gap between mmap and stack */
 #define MIN_GAP (128*1024*1024UL)
 #define MAX_GAP ((TASK_SIZE)/6*5)
@@ -45,178 +42,6 @@ static unsigned long mmap_base(unsigned long rnd)
 	return PAGE_ALIGN(TASK_SIZE - gap - rnd);
 }
 
-static inline unsigned long COLOUR_ALIGN_DOWN(unsigned long addr,
-					      unsigned long pgoff)
-{
-	unsigned long base = addr & ~shm_align_mask;
-	unsigned long off = (pgoff << PAGE_SHIFT) & shm_align_mask;
-
-	if (base + off <= addr)
-		return base + off;
-
-	return base - off;
-}
-
-#define COLOUR_ALIGN(addr, pgoff)				\
-	((((addr) + shm_align_mask) & ~shm_align_mask) +	\
-	 (((pgoff) << PAGE_SHIFT) & shm_align_mask))
-
-static unsigned long arch_get_unmapped_area_common(struct file *filp,
-	unsigned long addr0, unsigned long len, unsigned long pgoff,
-	unsigned long flags, enum mmap_allocation_direction dir)
-{
-	struct mm_struct *mm = current->mm;
-	struct vm_area_struct *vma;
-	unsigned long addr = addr0;
-	int do_color_align;
-
-	if (unlikely(len > TASK_SIZE))
-		return -ENOMEM;
-
-	if (flags & MAP_FIXED) {
-		/* Even MAP_FIXED mappings must reside within TASK_SIZE */
-		if (TASK_SIZE - len < addr)
-			return -EINVAL;
-
-		/*
-		 * We do not accept a shared mapping if it would violate
-		 * cache aliasing constraints.
-		 */
-		if ((flags & MAP_SHARED) &&
-		    ((addr - (pgoff << PAGE_SHIFT)) & shm_align_mask))
-			return -EINVAL;
-		return addr;
-	}
-
-	do_color_align = 0;
-	if (filp || (flags & MAP_SHARED))
-		do_color_align = 1;
-
-	/* requesting a specific address */
-	if (addr) {
-		if (do_color_align)
-			addr = COLOUR_ALIGN(addr, pgoff);
-		else
-			addr = PAGE_ALIGN(addr);
-
-		vma = find_vma(mm, addr);
-		if (TASK_SIZE - len >= addr &&
-		    (!vma || addr + len <= vma->vm_start))
-			return addr;
-	}
-
-	if (dir == ALLOC_UP) {
-		addr = mm->mmap_base;
-		if (do_color_align)
-			addr = COLOUR_ALIGN(addr, pgoff);
-		else
-			addr = PAGE_ALIGN(addr);
-
-		for (vma = find_vma(current->mm, addr); ; vma = vma->vm_next) {
-			/* At this point:  (!vma || addr < vma->vm_end). */
-			if (TASK_SIZE - len < addr)
-				return -ENOMEM;
-			if (!vma || addr + len <= vma->vm_start)
-				return addr;
-			addr = vma->vm_end;
-			if (do_color_align)
-				addr = COLOUR_ALIGN(addr, pgoff);
-		 }
-	 } else {
-		/* check if free_area_cache is useful for us */
-		if (len <= mm->cached_hole_size) {
-			mm->cached_hole_size = 0;
-			mm->free_area_cache = mm->mmap_base;
-		}
-
-		/*
-		 * either no address requested, or the mapping can't fit into
-		 * the requested address hole
-		 */
-		addr = mm->free_area_cache;
-		if (do_color_align) {
-			unsigned long base =
-				COLOUR_ALIGN_DOWN(addr - len, pgoff);
-			addr = base + len;
-		}
-
-		/* make sure it can fit in the remaining address space */
-		if (likely(addr > len)) {
-			vma = find_vma(mm, addr - len);
-			if (!vma || addr <= vma->vm_start) {
-				/* cache the address as a hint for next time */
-				return mm->free_area_cache = addr - len;
-			}
-		}
-
-		if (unlikely(mm->mmap_base < len))
-			goto bottomup;
-
-		addr = mm->mmap_base - len;
-		if (do_color_align)
-			addr = COLOUR_ALIGN_DOWN(addr, pgoff);
-
-		do {
-			/*
-			 * Lookup failure means no vma is above this address,
-			 * else if new region fits below vma->vm_start,
-			 * return with success:
-			 */
-			vma = find_vma(mm, addr);
-			if (likely(!vma || addr + len <= vma->vm_start)) {
-				/* cache the address as a hint for next time */
-				return mm->free_area_cache = addr;
-			}
-
-			/* remember the largest hole we saw so far */
-			if (addr + mm->cached_hole_size < vma->vm_start)
-				mm->cached_hole_size = vma->vm_start - addr;
-
-			/* try just below the current vma->vm_start */
-			addr = vma->vm_start - len;
-			if (do_color_align)
-				addr = COLOUR_ALIGN_DOWN(addr, pgoff);
-		} while (likely(len < vma->vm_start));
-
-bottomup:
-		/*
-		 * A failed mmap() very likely causes application failure,
-		 * so fall back to the bottom-up function here. This scenario
-		 * can happen with large stack limits and large mmap()
-		 * allocations.
-		 */
-		mm->cached_hole_size = ~0UL;
-		mm->free_area_cache = TASK_UNMAPPED_BASE;
-		addr = arch_get_unmapped_area(filp, addr0, len, pgoff, flags);
-		/*
-		 * Restore the topdown base:
-		 */
-		mm->free_area_cache = mm->mmap_base;
-		mm->cached_hole_size = ~0UL;
-
-		return addr;
-	}
-}
-
-unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr0,
-	unsigned long len, unsigned long pgoff, unsigned long flags)
-{
-	return arch_get_unmapped_area_common(filp,
-			addr0, len, pgoff, flags, ALLOC_UP);
-}
-
-/*
- * There is no need to export this but sched.h declares the function as
- * extern so making it static here results in an error.
- */
-unsigned long arch_get_unmapped_area_topdown(struct file *filp,
-	unsigned long addr0, unsigned long len, unsigned long pgoff,
-	unsigned long flags)
-{
-	return arch_get_unmapped_area_common(filp,
-			addr0, len, pgoff, flags, ALLOC_DOWN);
-}
-
 void arch_pick_mmap_layout(struct mm_struct *mm)
 {
 	unsigned long random_factor = 0UL;
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH -mm 5/6] remove MIPS arch_get_unmapped_area code
@ 2012-06-18 14:31   ` Rik van Riel
  0 siblings, 0 replies; 30+ messages in thread
From: Rik van Riel @ 2012-06-18 14:31 UTC (permalink / raw)
  To: linux-mm
  Cc: akpm, aarcange, peterz, minchan, kosaki.motohiro, andi, hnaz,
	mel, linux-kernel, Rik van Riel, Ralf Baechle, sjhill,
	Rik van Riel

From: Rik van Riel <riel@surriel.com>

Remove all the MIPS specific arch_get_unmapped_area(_topdown) and
page colouring code, now that the generic code should be able to
handle things.

Untested, because I do not have any MIPS systems.

Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: sjhill@mips.com
Signed-off-by: Rik van Riel <riel@redhat.com>
---
 arch/mips/include/asm/pgtable.h |    8 --
 arch/mips/mm/mmap.c             |  175 ---------------------------------------
 2 files changed, 0 insertions(+), 183 deletions(-)

diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
index f133a4c..5f9c49a 100644
--- a/arch/mips/include/asm/pgtable.h
+++ b/arch/mips/include/asm/pgtable.h
@@ -410,14 +410,6 @@ int phys_mem_access_prot_allowed(struct file *file, unsigned long pfn,
 #endif
 
 /*
- * We provide our own get_unmapped area to cope with the virtual aliasing
- * constraints placed on us by the cache architecture.
- */
-#define HAVE_ARCH_UNMAPPED_AREA
-#define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
-#define HAVE_ARCH_ALIGN_ADDR
-
-/*
  * No page table caches to initialise
  */
 #define pgtable_cache_init()	do { } while (0)
diff --git a/arch/mips/mm/mmap.c b/arch/mips/mm/mmap.c
index 3f8af17..ac342bd 100644
--- a/arch/mips/mm/mmap.c
+++ b/arch/mips/mm/mmap.c
@@ -15,9 +15,6 @@
 #include <linux/random.h>
 #include <linux/sched.h>
 
-unsigned long shm_align_mask = PAGE_SIZE - 1;	/* Sane caches */
-EXPORT_SYMBOL(shm_align_mask);
-
 /* gap between mmap and stack */
 #define MIN_GAP (128*1024*1024UL)
 #define MAX_GAP ((TASK_SIZE)/6*5)
@@ -45,178 +42,6 @@ static unsigned long mmap_base(unsigned long rnd)
 	return PAGE_ALIGN(TASK_SIZE - gap - rnd);
 }
 
-static inline unsigned long COLOUR_ALIGN_DOWN(unsigned long addr,
-					      unsigned long pgoff)
-{
-	unsigned long base = addr & ~shm_align_mask;
-	unsigned long off = (pgoff << PAGE_SHIFT) & shm_align_mask;
-
-	if (base + off <= addr)
-		return base + off;
-
-	return base - off;
-}
-
-#define COLOUR_ALIGN(addr, pgoff)				\
-	((((addr) + shm_align_mask) & ~shm_align_mask) +	\
-	 (((pgoff) << PAGE_SHIFT) & shm_align_mask))
-
-static unsigned long arch_get_unmapped_area_common(struct file *filp,
-	unsigned long addr0, unsigned long len, unsigned long pgoff,
-	unsigned long flags, enum mmap_allocation_direction dir)
-{
-	struct mm_struct *mm = current->mm;
-	struct vm_area_struct *vma;
-	unsigned long addr = addr0;
-	int do_color_align;
-
-	if (unlikely(len > TASK_SIZE))
-		return -ENOMEM;
-
-	if (flags & MAP_FIXED) {
-		/* Even MAP_FIXED mappings must reside within TASK_SIZE */
-		if (TASK_SIZE - len < addr)
-			return -EINVAL;
-
-		/*
-		 * We do not accept a shared mapping if it would violate
-		 * cache aliasing constraints.
-		 */
-		if ((flags & MAP_SHARED) &&
-		    ((addr - (pgoff << PAGE_SHIFT)) & shm_align_mask))
-			return -EINVAL;
-		return addr;
-	}
-
-	do_color_align = 0;
-	if (filp || (flags & MAP_SHARED))
-		do_color_align = 1;
-
-	/* requesting a specific address */
-	if (addr) {
-		if (do_color_align)
-			addr = COLOUR_ALIGN(addr, pgoff);
-		else
-			addr = PAGE_ALIGN(addr);
-
-		vma = find_vma(mm, addr);
-		if (TASK_SIZE - len >= addr &&
-		    (!vma || addr + len <= vma->vm_start))
-			return addr;
-	}
-
-	if (dir == ALLOC_UP) {
-		addr = mm->mmap_base;
-		if (do_color_align)
-			addr = COLOUR_ALIGN(addr, pgoff);
-		else
-			addr = PAGE_ALIGN(addr);
-
-		for (vma = find_vma(current->mm, addr); ; vma = vma->vm_next) {
-			/* At this point:  (!vma || addr < vma->vm_end). */
-			if (TASK_SIZE - len < addr)
-				return -ENOMEM;
-			if (!vma || addr + len <= vma->vm_start)
-				return addr;
-			addr = vma->vm_end;
-			if (do_color_align)
-				addr = COLOUR_ALIGN(addr, pgoff);
-		 }
-	 } else {
-		/* check if free_area_cache is useful for us */
-		if (len <= mm->cached_hole_size) {
-			mm->cached_hole_size = 0;
-			mm->free_area_cache = mm->mmap_base;
-		}
-
-		/*
-		 * either no address requested, or the mapping can't fit into
-		 * the requested address hole
-		 */
-		addr = mm->free_area_cache;
-		if (do_color_align) {
-			unsigned long base =
-				COLOUR_ALIGN_DOWN(addr - len, pgoff);
-			addr = base + len;
-		}
-
-		/* make sure it can fit in the remaining address space */
-		if (likely(addr > len)) {
-			vma = find_vma(mm, addr - len);
-			if (!vma || addr <= vma->vm_start) {
-				/* cache the address as a hint for next time */
-				return mm->free_area_cache = addr - len;
-			}
-		}
-
-		if (unlikely(mm->mmap_base < len))
-			goto bottomup;
-
-		addr = mm->mmap_base - len;
-		if (do_color_align)
-			addr = COLOUR_ALIGN_DOWN(addr, pgoff);
-
-		do {
-			/*
-			 * Lookup failure means no vma is above this address,
-			 * else if new region fits below vma->vm_start,
-			 * return with success:
-			 */
-			vma = find_vma(mm, addr);
-			if (likely(!vma || addr + len <= vma->vm_start)) {
-				/* cache the address as a hint for next time */
-				return mm->free_area_cache = addr;
-			}
-
-			/* remember the largest hole we saw so far */
-			if (addr + mm->cached_hole_size < vma->vm_start)
-				mm->cached_hole_size = vma->vm_start - addr;
-
-			/* try just below the current vma->vm_start */
-			addr = vma->vm_start - len;
-			if (do_color_align)
-				addr = COLOUR_ALIGN_DOWN(addr, pgoff);
-		} while (likely(len < vma->vm_start));
-
-bottomup:
-		/*
-		 * A failed mmap() very likely causes application failure,
-		 * so fall back to the bottom-up function here. This scenario
-		 * can happen with large stack limits and large mmap()
-		 * allocations.
-		 */
-		mm->cached_hole_size = ~0UL;
-		mm->free_area_cache = TASK_UNMAPPED_BASE;
-		addr = arch_get_unmapped_area(filp, addr0, len, pgoff, flags);
-		/*
-		 * Restore the topdown base:
-		 */
-		mm->free_area_cache = mm->mmap_base;
-		mm->cached_hole_size = ~0UL;
-
-		return addr;
-	}
-}
-
-unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr0,
-	unsigned long len, unsigned long pgoff, unsigned long flags)
-{
-	return arch_get_unmapped_area_common(filp,
-			addr0, len, pgoff, flags, ALLOC_UP);
-}
-
-/*
- * There is no need to export this but sched.h declares the function as
- * extern so making it static here results in an error.
- */
-unsigned long arch_get_unmapped_area_topdown(struct file *filp,
-	unsigned long addr0, unsigned long len, unsigned long pgoff,
-	unsigned long flags)
-{
-	return arch_get_unmapped_area_common(filp,
-			addr0, len, pgoff, flags, ALLOC_DOWN);
-}
-
 void arch_pick_mmap_layout(struct mm_struct *mm)
 {
 	unsigned long random_factor = 0UL;
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH -mm 6/6] remove ARM arch_get_unmapped_area functions
  2012-06-18 14:31 ` Rik van Riel
@ 2012-06-18 14:31   ` Rik van Riel
  -1 siblings, 0 replies; 30+ messages in thread
From: Rik van Riel @ 2012-06-18 14:31 UTC (permalink / raw)
  To: linux-mm
  Cc: akpm, aarcange, peterz, minchan, kosaki.motohiro, andi, hnaz,
	mel, linux-kernel, Rik van Riel, Russell King, Rik van Riel

From: Rik van Riel <riel@surriel.com>

Remove the ARM special variants of arch_get_unmapped_area since the
generic functions should now be able to handle everything.

Untested because I have no ARM hardware.
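
With shm_align_mask set to SHMLBA - 1, two MAP_SHARED mappings of the
same file page should land at addresses that are congruent modulo
SHMLBA. A rough userspace check of that property follows; it is only
meaningful where SHMLBA is larger than the page size, elsewhere it
degenerates to page alignment and passes trivially.

/* Map the same file offset twice MAP_SHARED and compare colours. */
#include <stdio.h>
#include <sys/mman.h>
#include <sys/shm.h>	/* SHMLBA */
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
	int fd = open("/tmp/colour-test", O_RDWR | O_CREAT, 0600);
	void *a, *b;

	if (fd < 0 || ftruncate(fd, SHMLBA) < 0) {
		perror("setup");
		return 1;
	}
	a = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	b = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (a == MAP_FAILED || b == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	printf("a=%p b=%p same colour: %s\n", a, b,
	       (((unsigned long)a ^ (unsigned long)b) & (SHMLBA - 1)) == 0 ?
	       "yes" : "no");
	unlink("/tmp/colour-test");
	return 0;
}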

Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Rik van Riel <riel@redhat.com>
---
 arch/arm/include/asm/pgtable.h |    6 -
 arch/arm/mm/init.c             |    3 +
 arch/arm/mm/mmap.c             |  217 +---------------------------------------
 3 files changed, 4 insertions(+), 222 deletions(-)

diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index f66626d..6754183 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -296,12 +296,6 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 #include <asm-generic/pgtable.h>
 
 /*
- * We provide our own arch_get_unmapped_area to cope with VIPT caches.
- */
-#define HAVE_ARCH_UNMAPPED_AREA
-#define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
-
-/*
  * remap a physical page `pfn' of size `size' with page protection `prot'
  * into virtual address `from'
  */
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index f54d592..534dd96 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -600,6 +600,9 @@ void __init mem_init(void)
 	extern u32 itcm_end;
 #endif
 
+	/* Tell the page colouring code what we need. */
+	shm_align_mask = SHMLBA - 1;
+
 	max_mapnr   = pfn_to_page(max_pfn + PHYS_PFN_OFFSET) - mem_map;
 
 	/* this will put all unused low memory onto the freelists */
diff --git a/arch/arm/mm/mmap.c b/arch/arm/mm/mmap.c
index ce8cb19..2b1f881 100644
--- a/arch/arm/mm/mmap.c
+++ b/arch/arm/mm/mmap.c
@@ -11,21 +11,7 @@
 #include <linux/random.h>
 #include <asm/cachetype.h>
 
-static inline unsigned long COLOUR_ALIGN_DOWN(unsigned long addr,
-					      unsigned long pgoff)
-{
-	unsigned long base = addr & ~(SHMLBA-1);
-	unsigned long off = (pgoff << PAGE_SHIFT) & (SHMLBA-1);
-
-	if (base + off <= addr)
-		return base + off;
-
-	return base - off;
-}
-
-#define COLOUR_ALIGN(addr,pgoff)		\
-	((((addr)+SHMLBA-1)&~(SHMLBA-1)) +	\
-	 (((pgoff)<<PAGE_SHIFT) & (SHMLBA-1)))
+unsigned long shm_align_mask = SHMLBA - 1;
 
 /* gap between mmap and stack */
 #define MIN_GAP (128*1024*1024UL)
@@ -54,207 +40,6 @@ static unsigned long mmap_base(unsigned long rnd)
 	return PAGE_ALIGN(TASK_SIZE - gap - rnd);
 }
 
-/*
- * We need to ensure that shared mappings are correctly aligned to
- * avoid aliasing issues with VIPT caches.  We need to ensure that
- * a specific page of an object is always mapped at a multiple of
- * SHMLBA bytes.
- *
- * We unconditionally provide this function for all cases, however
- * in the VIVT case, we optimise out the alignment rules.
- */
-unsigned long
-arch_get_unmapped_area(struct file *filp, unsigned long addr,
-		unsigned long len, unsigned long pgoff, unsigned long flags)
-{
-	struct mm_struct *mm = current->mm;
-	struct vm_area_struct *vma;
-	unsigned long start_addr;
-	int do_align = 0;
-	int aliasing = cache_is_vipt_aliasing();
-
-	/*
-	 * We only need to do colour alignment if either the I or D
-	 * caches alias.
-	 */
-	if (aliasing)
-		do_align = filp || (flags & MAP_SHARED);
-
-	/*
-	 * We enforce the MAP_FIXED case.
-	 */
-	if (flags & MAP_FIXED) {
-		if (aliasing && flags & MAP_SHARED &&
-		    (addr - (pgoff << PAGE_SHIFT)) & (SHMLBA - 1))
-			return -EINVAL;
-		return addr;
-	}
-
-	if (len > TASK_SIZE)
-		return -ENOMEM;
-
-	if (addr) {
-		if (do_align)
-			addr = COLOUR_ALIGN(addr, pgoff);
-		else
-			addr = PAGE_ALIGN(addr);
-
-		vma = find_vma(mm, addr);
-		if (TASK_SIZE - len >= addr &&
-		    (!vma || addr + len <= vma->vm_start))
-			return addr;
-	}
-	if (len > mm->cached_hole_size) {
-	        start_addr = addr = mm->free_area_cache;
-	} else {
-	        start_addr = addr = mm->mmap_base;
-	        mm->cached_hole_size = 0;
-	}
-
-full_search:
-	if (do_align)
-		addr = COLOUR_ALIGN(addr, pgoff);
-	else
-		addr = PAGE_ALIGN(addr);
-
-	for (vma = find_vma(mm, addr); ; vma = vma->vm_next) {
-		/* At this point:  (!vma || addr < vma->vm_end). */
-		if (TASK_SIZE - len < addr) {
-			/*
-			 * Start a new search - just in case we missed
-			 * some holes.
-			 */
-			if (start_addr != TASK_UNMAPPED_BASE) {
-				start_addr = addr = TASK_UNMAPPED_BASE;
-				mm->cached_hole_size = 0;
-				goto full_search;
-			}
-			return -ENOMEM;
-		}
-		if (!vma || addr + len <= vma->vm_start) {
-			/*
-			 * Remember the place where we stopped the search:
-			 */
-			mm->free_area_cache = addr + len;
-			return addr;
-		}
-		if (addr + mm->cached_hole_size < vma->vm_start)
-		        mm->cached_hole_size = vma->vm_start - addr;
-		addr = vma->vm_end;
-		if (do_align)
-			addr = COLOUR_ALIGN(addr, pgoff);
-	}
-}
-
-unsigned long
-arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
-			const unsigned long len, const unsigned long pgoff,
-			const unsigned long flags)
-{
-	struct vm_area_struct *vma;
-	struct mm_struct *mm = current->mm;
-	unsigned long addr = addr0;
-	int do_align = 0;
-	int aliasing = cache_is_vipt_aliasing();
-
-	/*
-	 * We only need to do colour alignment if either the I or D
-	 * caches alias.
-	 */
-	if (aliasing)
-		do_align = filp || (flags & MAP_SHARED);
-
-	/* requested length too big for entire address space */
-	if (len > TASK_SIZE)
-		return -ENOMEM;
-
-	if (flags & MAP_FIXED) {
-		if (aliasing && flags & MAP_SHARED &&
-		    (addr - (pgoff << PAGE_SHIFT)) & (SHMLBA - 1))
-			return -EINVAL;
-		return addr;
-	}
-
-	/* requesting a specific address */
-	if (addr) {
-		if (do_align)
-			addr = COLOUR_ALIGN(addr, pgoff);
-		else
-			addr = PAGE_ALIGN(addr);
-		vma = find_vma(mm, addr);
-		if (TASK_SIZE - len >= addr &&
-				(!vma || addr + len <= vma->vm_start))
-			return addr;
-	}
-
-	/* check if free_area_cache is useful for us */
-	if (len <= mm->cached_hole_size) {
-		mm->cached_hole_size = 0;
-		mm->free_area_cache = mm->mmap_base;
-	}
-
-	/* either no address requested or can't fit in requested address hole */
-	addr = mm->free_area_cache;
-	if (do_align) {
-		unsigned long base = COLOUR_ALIGN_DOWN(addr - len, pgoff);
-		addr = base + len;
-	}
-
-	/* make sure it can fit in the remaining address space */
-	if (addr > len) {
-		vma = find_vma(mm, addr-len);
-		if (!vma || addr <= vma->vm_start)
-			/* remember the address as a hint for next time */
-			return (mm->free_area_cache = addr-len);
-	}
-
-	if (mm->mmap_base < len)
-		goto bottomup;
-
-	addr = mm->mmap_base - len;
-	if (do_align)
-		addr = COLOUR_ALIGN_DOWN(addr, pgoff);
-
-	do {
-		/*
-		 * Lookup failure means no vma is above this address,
-		 * else if new region fits below vma->vm_start,
-		 * return with success:
-		 */
-		vma = find_vma(mm, addr);
-		if (!vma || addr+len <= vma->vm_start)
-			/* remember the address as a hint for next time */
-			return (mm->free_area_cache = addr);
-
-		/* remember the largest hole we saw so far */
-		if (addr + mm->cached_hole_size < vma->vm_start)
-			mm->cached_hole_size = vma->vm_start - addr;
-
-		/* try just below the current vma->vm_start */
-		addr = vma->vm_start - len;
-		if (do_align)
-			addr = COLOUR_ALIGN_DOWN(addr, pgoff);
-	} while (len < vma->vm_start);
-
-bottomup:
-	/*
-	 * A failed mmap() very likely causes application failure,
-	 * so fall back to the bottom-up function here. This scenario
-	 * can happen with large stack limits and large mmap()
-	 * allocations.
-	 */
-	mm->cached_hole_size = ~0UL;
-	mm->free_area_cache = TASK_UNMAPPED_BASE;
-	addr = arch_get_unmapped_area(filp, addr0, len, pgoff, flags);
-	/*
-	 * Restore the topdown base:
-	 */
-	mm->free_area_cache = mm->mmap_base;
-	mm->cached_hole_size = ~0UL;
-
-	return addr;
-}
-
 void arch_pick_mmap_layout(struct mm_struct *mm)
 {
 	unsigned long random_factor = 0UL;
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH -mm 6/6] remove ARM arch_get_unmapped_area functions
@ 2012-06-18 14:31   ` Rik van Riel
  0 siblings, 0 replies; 30+ messages in thread
From: Rik van Riel @ 2012-06-18 14:31 UTC (permalink / raw)
  To: linux-mm
  Cc: akpm, aarcange, peterz, minchan, kosaki.motohiro, andi, hnaz,
	mel, linux-kernel, Rik van Riel, Russell King, Rik van Riel

From: Rik van Riel <riel@surriel.com>

Remove the ARM special variants of arch_get_unmapped_area since the
generic functions should now be able to handle everything.

Untested because I have no ARM hardware.

Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Rik van Riel <riel@redhat.com>
---
 arch/arm/include/asm/pgtable.h |    6 -
 arch/arm/mm/init.c             |    3 +
 arch/arm/mm/mmap.c             |  217 +---------------------------------------
 3 files changed, 4 insertions(+), 222 deletions(-)

diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index f66626d..6754183 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -296,12 +296,6 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 #include <asm-generic/pgtable.h>
 
 /*
- * We provide our own arch_get_unmapped_area to cope with VIPT caches.
- */
-#define HAVE_ARCH_UNMAPPED_AREA
-#define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
-
-/*
  * remap a physical page `pfn' of size `size' with page protection `prot'
  * into virtual address `from'
  */
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index f54d592..534dd96 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -600,6 +600,9 @@ void __init mem_init(void)
 	extern u32 itcm_end;
 #endif
 
+	/* Tell the page colouring code what we need. */
+	shm_align_mask = SHMLBA - 1;
+
 	max_mapnr   = pfn_to_page(max_pfn + PHYS_PFN_OFFSET) - mem_map;
 
 	/* this will put all unused low memory onto the freelists */
diff --git a/arch/arm/mm/mmap.c b/arch/arm/mm/mmap.c
index ce8cb19..2b1f881 100644
--- a/arch/arm/mm/mmap.c
+++ b/arch/arm/mm/mmap.c
@@ -11,21 +11,7 @@
 #include <linux/random.h>
 #include <asm/cachetype.h>
 
-static inline unsigned long COLOUR_ALIGN_DOWN(unsigned long addr,
-					      unsigned long pgoff)
-{
-	unsigned long base = addr & ~(SHMLBA-1);
-	unsigned long off = (pgoff << PAGE_SHIFT) & (SHMLBA-1);
-
-	if (base + off <= addr)
-		return base + off;
-
-	return base - off;
-}
-
-#define COLOUR_ALIGN(addr,pgoff)		\
-	((((addr)+SHMLBA-1)&~(SHMLBA-1)) +	\
-	 (((pgoff)<<PAGE_SHIFT) & (SHMLBA-1)))
+unsigned long shm_align_mask = SHMLBA - 1;
 
 /* gap between mmap and stack */
 #define MIN_GAP (128*1024*1024UL)
@@ -54,207 +40,6 @@ static unsigned long mmap_base(unsigned long rnd)
 	return PAGE_ALIGN(TASK_SIZE - gap - rnd);
 }
 
-/*
- * We need to ensure that shared mappings are correctly aligned to
- * avoid aliasing issues with VIPT caches.  We need to ensure that
- * a specific page of an object is always mapped at a multiple of
- * SHMLBA bytes.
- *
- * We unconditionally provide this function for all cases, however
- * in the VIVT case, we optimise out the alignment rules.
- */
-unsigned long
-arch_get_unmapped_area(struct file *filp, unsigned long addr,
-		unsigned long len, unsigned long pgoff, unsigned long flags)
-{
-	struct mm_struct *mm = current->mm;
-	struct vm_area_struct *vma;
-	unsigned long start_addr;
-	int do_align = 0;
-	int aliasing = cache_is_vipt_aliasing();
-
-	/*
-	 * We only need to do colour alignment if either the I or D
-	 * caches alias.
-	 */
-	if (aliasing)
-		do_align = filp || (flags & MAP_SHARED);
-
-	/*
-	 * We enforce the MAP_FIXED case.
-	 */
-	if (flags & MAP_FIXED) {
-		if (aliasing && flags & MAP_SHARED &&
-		    (addr - (pgoff << PAGE_SHIFT)) & (SHMLBA - 1))
-			return -EINVAL;
-		return addr;
-	}
-
-	if (len > TASK_SIZE)
-		return -ENOMEM;
-
-	if (addr) {
-		if (do_align)
-			addr = COLOUR_ALIGN(addr, pgoff);
-		else
-			addr = PAGE_ALIGN(addr);
-
-		vma = find_vma(mm, addr);
-		if (TASK_SIZE - len >= addr &&
-		    (!vma || addr + len <= vma->vm_start))
-			return addr;
-	}
-	if (len > mm->cached_hole_size) {
-	        start_addr = addr = mm->free_area_cache;
-	} else {
-	        start_addr = addr = mm->mmap_base;
-	        mm->cached_hole_size = 0;
-	}
-
-full_search:
-	if (do_align)
-		addr = COLOUR_ALIGN(addr, pgoff);
-	else
-		addr = PAGE_ALIGN(addr);
-
-	for (vma = find_vma(mm, addr); ; vma = vma->vm_next) {
-		/* At this point:  (!vma || addr < vma->vm_end). */
-		if (TASK_SIZE - len < addr) {
-			/*
-			 * Start a new search - just in case we missed
-			 * some holes.
-			 */
-			if (start_addr != TASK_UNMAPPED_BASE) {
-				start_addr = addr = TASK_UNMAPPED_BASE;
-				mm->cached_hole_size = 0;
-				goto full_search;
-			}
-			return -ENOMEM;
-		}
-		if (!vma || addr + len <= vma->vm_start) {
-			/*
-			 * Remember the place where we stopped the search:
-			 */
-			mm->free_area_cache = addr + len;
-			return addr;
-		}
-		if (addr + mm->cached_hole_size < vma->vm_start)
-		        mm->cached_hole_size = vma->vm_start - addr;
-		addr = vma->vm_end;
-		if (do_align)
-			addr = COLOUR_ALIGN(addr, pgoff);
-	}
-}
-
-unsigned long
-arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
-			const unsigned long len, const unsigned long pgoff,
-			const unsigned long flags)
-{
-	struct vm_area_struct *vma;
-	struct mm_struct *mm = current->mm;
-	unsigned long addr = addr0;
-	int do_align = 0;
-	int aliasing = cache_is_vipt_aliasing();
-
-	/*
-	 * We only need to do colour alignment if either the I or D
-	 * caches alias.
-	 */
-	if (aliasing)
-		do_align = filp || (flags & MAP_SHARED);
-
-	/* requested length too big for entire address space */
-	if (len > TASK_SIZE)
-		return -ENOMEM;
-
-	if (flags & MAP_FIXED) {
-		if (aliasing && flags & MAP_SHARED &&
-		    (addr - (pgoff << PAGE_SHIFT)) & (SHMLBA - 1))
-			return -EINVAL;
-		return addr;
-	}
-
-	/* requesting a specific address */
-	if (addr) {
-		if (do_align)
-			addr = COLOUR_ALIGN(addr, pgoff);
-		else
-			addr = PAGE_ALIGN(addr);
-		vma = find_vma(mm, addr);
-		if (TASK_SIZE - len >= addr &&
-				(!vma || addr + len <= vma->vm_start))
-			return addr;
-	}
-
-	/* check if free_area_cache is useful for us */
-	if (len <= mm->cached_hole_size) {
-		mm->cached_hole_size = 0;
-		mm->free_area_cache = mm->mmap_base;
-	}
-
-	/* either no address requested or can't fit in requested address hole */
-	addr = mm->free_area_cache;
-	if (do_align) {
-		unsigned long base = COLOUR_ALIGN_DOWN(addr - len, pgoff);
-		addr = base + len;
-	}
-
-	/* make sure it can fit in the remaining address space */
-	if (addr > len) {
-		vma = find_vma(mm, addr-len);
-		if (!vma || addr <= vma->vm_start)
-			/* remember the address as a hint for next time */
-			return (mm->free_area_cache = addr-len);
-	}
-
-	if (mm->mmap_base < len)
-		goto bottomup;
-
-	addr = mm->mmap_base - len;
-	if (do_align)
-		addr = COLOUR_ALIGN_DOWN(addr, pgoff);
-
-	do {
-		/*
-		 * Lookup failure means no vma is above this address,
-		 * else if new region fits below vma->vm_start,
-		 * return with success:
-		 */
-		vma = find_vma(mm, addr);
-		if (!vma || addr+len <= vma->vm_start)
-			/* remember the address as a hint for next time */
-			return (mm->free_area_cache = addr);
-
-		/* remember the largest hole we saw so far */
-		if (addr + mm->cached_hole_size < vma->vm_start)
-			mm->cached_hole_size = vma->vm_start - addr;
-
-		/* try just below the current vma->vm_start */
-		addr = vma->vm_start - len;
-		if (do_align)
-			addr = COLOUR_ALIGN_DOWN(addr, pgoff);
-	} while (len < vma->vm_start);
-
-bottomup:
-	/*
-	 * A failed mmap() very likely causes application failure,
-	 * so fall back to the bottom-up function here. This scenario
-	 * can happen with large stack limits and large mmap()
-	 * allocations.
-	 */
-	mm->cached_hole_size = ~0UL;
-	mm->free_area_cache = TASK_UNMAPPED_BASE;
-	addr = arch_get_unmapped_area(filp, addr0, len, pgoff, flags);
-	/*
-	 * Restore the topdown base:
-	 */
-	mm->free_area_cache = mm->mmap_base;
-	mm->cached_hole_size = ~0UL;
-
-	return addr;
-}
-
 void arch_pick_mmap_layout(struct mm_struct *mm)
 {
 	unsigned long random_factor = 0UL;
-- 
1.7.7.6

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH -mm 3/6] Fix the x86-64 page colouring code to take pgoff into account and use that code as the basis for a generic page colouring code.
  2012-06-18 14:31   ` Rik van Riel
@ 2012-06-18 16:30     ` Andi Kleen
  -1 siblings, 0 replies; 30+ messages in thread
From: Andi Kleen @ 2012-06-18 16:30 UTC (permalink / raw)
  To: Rik van Riel
  Cc: linux-mm, akpm, aarcange, peterz, minchan, kosaki.motohiro, hnaz,
	mel, linux-kernel, Rik van Riel

Rik van Riel <riel@redhat.com> writes:

> From: Rik van Riel <riel@surriel.com>
>
> Teach the generic arch_get_unmapped_area(_topdown) code to call the
> page colouring code.

What tree is that against? I cannot find x86 page colouring code in next
or mainline.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH -mm 3/6] Fix the x86-64 page colouring code to take pgoff into account and use that code as the basis for a generic page colouring code.
  2012-06-18 16:30     ` Andi Kleen
@ 2012-06-18 16:45       ` Rik van Riel
  -1 siblings, 0 replies; 30+ messages in thread
From: Rik van Riel @ 2012-06-18 16:45 UTC (permalink / raw)
  To: Andi Kleen
  Cc: linux-mm, akpm, aarcange, peterz, minchan, kosaki.motohiro, hnaz,
	mel, linux-kernel, Rik van Riel

On 06/18/2012 12:30 PM, Andi Kleen wrote:
> Rik van Riel<riel@redhat.com>  writes:
>
>> From: Rik van Riel<riel@surriel.com>
>>
>> Teach the generic arch_get_unmapped_area(_topdown) code to call the
>> page colouring code.
>
> What tree is that against? I cannot find x86 page colouring code in next
> or mainline.

This is against mainline.

See align_addr in arch/x86/kernel/sys_x86_64.c and the
call sites in arch_get_unmapped_area(_topdown).

On certain AMD chips, Linux tries to get certain
allocations aligned to avoid cache aliasing issues.
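
Roughly, "taking pgoff into account" means the colouring code should not
just round the virtual address up to the aliasing boundary, but also give
it the same colour as the file offset being mapped. A minimal sketch of
that idea (illustrative only, assuming an ARM-style SHMLBA aliasing
granularity; colour_align() is a made-up name, not the mainline
align_addr() helper):

	/* Hypothetical sketch, not the actual kernel helper. */
	static unsigned long colour_align(unsigned long addr, unsigned long pgoff)
	{
		/* Round up to the next aliasing boundary... */
		unsigned long base = (addr + SHMLBA - 1) & ~(SHMLBA - 1);
		/* ...then add the colour of the file offset being mapped. */
		unsigned long colour = (pgoff << PAGE_SHIFT) & (SHMLBA - 1);

		return base + colour;
	}

With something like this, two mappings of the same file offset always land
on the same colour, so they cannot alias in a virtually indexed cache.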

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH -mm 3/6] Fix the x86-64 page colouring code to take pgoff into account and use that code as the basis for a generic page colouring code.
  2012-06-18 16:45       ` Rik van Riel
@ 2012-06-18 18:16         ` Borislav Petkov
  -1 siblings, 0 replies; 30+ messages in thread
From: Borislav Petkov @ 2012-06-18 18:16 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Andi Kleen, linux-mm, akpm, aarcange, peterz, minchan,
	kosaki.motohiro, hnaz, mel, linux-kernel, Rik van Riel

On Mon, Jun 18, 2012 at 12:45:48PM -0400, Rik van Riel wrote:
> >What tree is that against? I cannot find x86 page colouring code in next
> >or mainline.
> 
> This is against mainline.

Which mainline do you mean exactly?

1/6 doesn't apply on top of current mainline and by "current" I mean
v3.5-rc3-57-g39a50b42f702.

-- 
Regards/Gruss,
Boris.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH -mm 3/6] Fix the x86-64 page colouring code to take pgoff into account and use that code as the basis for a generic page colouring code.
  2012-06-18 18:16         ` Borislav Petkov
@ 2012-06-18 19:00           ` Rik van Riel
  -1 siblings, 0 replies; 30+ messages in thread
From: Rik van Riel @ 2012-06-18 19:00 UTC (permalink / raw)
  To: Borislav Petkov, Andi Kleen, linux-mm, akpm, aarcange, peterz,
	minchan, kosaki.motohiro, hnaz, mel, linux-kernel, Rik van Riel

On 06/18/2012 02:16 PM, Borislav Petkov wrote:
> On Mon, Jun 18, 2012 at 12:45:48PM -0400, Rik van Riel wrote:
>>> What tree is that against? I cannot find x86 page colouring code in next
>>> or mainline.
>>
>> This is against mainline.
>
> Which mainline do you mean exactly?
>
> 1/6 doesn't apply on top of current mainline and by "current" I mean
> v3.5-rc3-57-g39a50b42f702.

I git pulled on Friday, then used guilt to apply and rediff
all the patches. I pull from here:

git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6

I see no g39a50b... commit after pulling that tree here.
Do you have any local changes by chance?

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH -mm 3/6] Fix the x86-64 page colouring code to take pgoff into account and use that code as the basis for a generic page colouring code.
  2012-06-18 18:16         ` Borislav Petkov
@ 2012-06-18 19:02           ` Rik van Riel
  -1 siblings, 0 replies; 30+ messages in thread
From: Rik van Riel @ 2012-06-18 19:02 UTC (permalink / raw)
  To: Borislav Petkov, Andi Kleen, linux-mm, akpm, aarcange, peterz,
	minchan, kosaki.motohiro, hnaz, mel, linux-kernel, Rik van Riel

On 06/18/2012 02:16 PM, Borislav Petkov wrote:
> On Mon, Jun 18, 2012 at 12:45:48PM -0400, Rik van Riel wrote:
>>> What tree is that against? I cannot find x86 page colouring code in next
>>> or mainline.
>>
>> This is against mainline.
>
> Which mainline do you mean exactly?
>
> 1/6 doesn't apply on top of current mainline and by "current" I mean
> v3.5-rc3-57-g39a50b42f702.

After pulling in the latest patches, including that
39a50b... commit, all patches still apply here when
I type guilt push -a.

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH -mm 3/6] Fix the x86-64 page colouring code to take pgoff into account and use that code as the basis for a generic page colouring code.
  2012-06-18 19:02           ` Rik van Riel
@ 2012-06-18 20:37             ` Borislav Petkov
  -1 siblings, 0 replies; 30+ messages in thread
From: Borislav Petkov @ 2012-06-18 20:37 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Andi Kleen, linux-mm, akpm, aarcange, peterz, minchan,
	kosaki.motohiro, hnaz, mel, linux-kernel, Rik van Riel

On Mon, Jun 18, 2012 at 03:02:54PM -0400, Rik van Riel wrote:
> On 06/18/2012 02:16 PM, Borislav Petkov wrote:
> >On Mon, Jun 18, 2012 at 12:45:48PM -0400, Rik van Riel wrote:
> >>>What tree is that against? I cannot find x86 page colouring code in next
> >>>or mainline.
> >>
> >>This is against mainline.
> >
> >Which mainline do you mean exactly?
> >
> >1/6 doesn't apply on top of current mainline and by "current" I mean
> >v3.5-rc3-57-g39a50b42f702.
> 
> After pulling in the latest patches, including that
> 39a50b... commit, all patches still apply here when
> I type guilt push -a.

That's strange.

I'm also pulling from

git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6

Btw, if I had local changes, the top commit id would've changed, right?
So I wouldn't have had 39a50b anymore.

Just in case, I tried applying 1/6 on another repository and it still
doesn't apply:

$ patch -p1 --dry-run -i /tmp/riel.01
patching file include/linux/mm_types.h
Hunk #1 succeeded at 300 (offset -7 lines).
patching file mm/mmap.c
Hunk #2 succeeded at 206 with fuzz 1 (offset -45 lines).
Hunk #3 FAILED at 398.
Hunk #4 FAILED at 461.
Hunk #5 succeeded at 603 (offset -57 lines).
Hunk #6 succeeded at 1404 (offset -66 lines).
Hunk #7 succeeded at 1441 (offset -66 lines).
Hunk #8 succeeded at 1528 (offset -66 lines).
Hunk #9 succeeded at 1570 (offset -66 lines).
Hunk #10 FAILED at 1908.
Hunk #11 FAILED at 2093.
4 out of 11 hunks FAILED -- saving rejects to file mm/mmap.c.rej

riel.01 is the mail saved from mutt so it should be fine.

Now let's look at the first failing hunk:

Mainline has:

void validate_mm(struct mm_struct *mm)
{
	int bug = 0;
	int i = 0;
	struct vm_area_struct *tmp = mm->mmap;
	while (tmp) {
		tmp = tmp->vm_next;
		i++;
	}
	if (i != mm->map_count)
		printk("map_count %d vm_next %d\n", mm->map_count, i), bug = 1;
	i = browse_rb(&mm->mm_rb);
	if (i != mm->map_count)
		printk("map_count %d rb %d\n", mm->map_count, i), bug = 1;
	BUG_ON(bug);
}

--
and your patch has some new ifs in it:

@@ -386,12 +398,16 @@ void validate_mm(struct mm_struct *mm)
 	int bug = 0;
 	int i = 0;
 	struct vm_area_struct *tmp = mm->mmap;
+	unsigned long highest_address = 0;
 	while (tmp) {
 		if (tmp->free_gap != max_free_space(&tmp->vm_rb))
 			printk("free space %lx, correct %lx\n", tmp->free_gap, max_free_space(&tmp->vm_rb)), bug = 1;

			^^^^^^^^^^^^^^

I think this if-statement is the problem. It is not present in mainline
and this patch doesn't add it, so it must come from an earlier patch,
probably one in your queue?

+		highest_address = tmp->vm_end;
 		tmp = tmp->vm_next;
 		i++;
 	}
+	if (highest_address != mm->highest_vma)
+		printk("mm->highest_vma %lx, found %lx\n", mm->highest_vma, highest_address), bug = 1;

 	if (i != mm->map_count)
 		printk("map_count %d vm_next %d\n", mm->map_count, i), bug = 1;
 	i = browse_rb(&mm->mm_rb);
--

I haven't looked at the other failing hunks...

Thanks.

-- 
Regards/Gruss,
    Boris.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH -mm 3/6] Fix the x86-64 page colouring code to take pgoff into account and use that code as the basis for a generic page colouring code.
  2012-06-18 20:37             ` Borislav Petkov
@ 2012-06-18 22:03               ` Rik van Riel
  -1 siblings, 0 replies; 30+ messages in thread
From: Rik van Riel @ 2012-06-18 22:03 UTC (permalink / raw)
  To: Borislav Petkov, Andi Kleen, linux-mm, akpm, aarcange, peterz,
	minchan, kosaki.motohiro, hnaz, mel, linux-kernel, Rik van Riel

On 06/18/2012 04:37 PM, Borislav Petkov wrote:

> and your patch has some new ifs in it:
>
> @@ -386,12 +398,16 @@ void validate_mm(struct mm_struct *mm)
>   	int bug = 0;
>   	int i = 0;
>   	struct vm_area_struct *tmp = mm->mmap;
> +	unsigned long highest_address = 0;
>   	while (tmp) {
>   		if (tmp->free_gap != max_free_space(&tmp->vm_rb))
>   			printk("free space %lx, correct %lx\n", tmp->free_gap, max_free_space(&tmp->vm_rb)), bug = 1;
>
> 			^^^^^^^^^^^^^^
>
> I think this if-statement is the problem. It is not present in mainline
> and this patch doesn't add it, so it must come from an earlier patch,
> probably one in your queue?

Argh! I see the problem now.

guilt-patchbomb sent everything from my second patch onwards,
not my first patch :(

Let me resend the series properly; I have 7 patches, not 6.

I am having a bad email day...

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [[PATCH -mm] 3/6] Fix the x86-64 page colouring code to take pgoff into account and use that code as the basis for a generic page colouring code.
       [not found] <1340029247-6949-1-git-send-email-riel@surriel.com>
@ 2012-06-18 14:20   ` Rik van Riel
  0 siblings, 0 replies; 30+ messages in thread
From: Rik van Riel @ 2012-06-18 14:20 UTC (permalink / raw)
  To: linux-mm
  Cc: akpm, aarcange, peterz, minchan, kosaki.motohiro, andi, hnaz,
	mel, linux-kernel, knoel, Rik van Riel, Rik van Riel

<<< No Message Collected >>>

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2012-06-18 22:04 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-06-18 14:31 [PATCH -mm 0/6] mm: scalable and unified arch_get_unmapped_area Rik van Riel
2012-06-18 14:31 ` Rik van Riel
2012-06-18 14:31 ` [PATCH -mm 1/6] mm: get unmapped area from VMA tree Rik van Riel
2012-06-18 14:31   ` Rik van Riel
2012-06-18 14:31 ` [PATCH -mm 2/6] Allow each architecture to specify the address range that can be used for this allocation Rik van Riel
2012-06-18 14:31   ` Rik van Riel
2012-06-18 14:31 ` [PATCH -mm 3/6] Fix the x86-64 page colouring code to take pgoff into account and use that code as the basis for a generic page colouring code Rik van Riel
2012-06-18 14:31   ` Rik van Riel
2012-06-18 16:30   ` Andi Kleen
2012-06-18 16:30     ` Andi Kleen
2012-06-18 16:45     ` Rik van Riel
2012-06-18 16:45       ` Rik van Riel
2012-06-18 18:16       ` Borislav Petkov
2012-06-18 18:16         ` Borislav Petkov
2012-06-18 19:00         ` Rik van Riel
2012-06-18 19:00           ` Rik van Riel
2012-06-18 19:02         ` Rik van Riel
2012-06-18 19:02           ` Rik van Riel
2012-06-18 20:37           ` Borislav Petkov
2012-06-18 20:37             ` Borislav Petkov
2012-06-18 22:03             ` Rik van Riel
2012-06-18 22:03               ` Rik van Riel
2012-06-18 14:31 ` [PATCH -mm 4/6] mm: remove x86 arch_get_unmapped_area(_topdown) Rik van Riel
2012-06-18 14:31   ` Rik van Riel
2012-06-18 14:31 ` [PATCH -mm 5/6] remove MIPS arch_get_unmapped_area code Rik van Riel
2012-06-18 14:31   ` Rik van Riel
2012-06-18 14:31 ` [PATCH -mm 6/6] remove ARM arch_get_unmapped_area functions Rik van Riel
2012-06-18 14:31   ` Rik van Riel
     [not found] <1340029247-6949-1-git-send-email-riel@surriel.com>
2012-06-18 14:20 ` [[PATCH -mm] 3/6] Fix the x86-64 page colouring code to take pgoff into account and use that code as the basis for a generic page colouring code Rik van Riel
2012-06-18 14:20   ` Rik van Riel

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.