linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH] proc: clear_refs: do not clear reserved pages
@ 2012-01-13 15:13 Will Deacon
  2012-01-13 15:35 ` Russell King - ARM Linux
  2012-01-13 22:55 ` Nicolas Pitre
  0 siblings, 2 replies; 8+ messages in thread
From: Will Deacon @ 2012-01-13 15:13 UTC (permalink / raw)
  To: linux-kernel, linux-mm, linux-arm-kernel
  Cc: moussaba, Will Deacon, David Rientjes, Andrew Morton, Nicolas Pitre

/proc/pid/clear_refs is used to clear the Referenced and YOUNG bits for
pages and corresponding page table entries of the task with PID pid,
which includes any special mappings inserted into the page tables in
order to provide things like vDSOs and user helper functions.

On ARM this causes a problem because the vectors page is mapped as a
global mapping and since ec706dab ("ARM: add a vma entry for the user
accessible vector page"), a VMA is also inserted into each task for this
page to aid unwinding through signals and syscall restarts. Since the
vectors page is required for handling faults, clearing the YOUNG bit
(and subsequently writing a faulting pte) means that we lose the vectors
page *globally* and cannot fault it back in. This results in a system
deadlock on the next exception.

This patch avoids clearing the aforementioned bits for reserved pages,
therefore leaving the vectors page intact on ARM. Since reserved pages
are not candidates for swap, this change should not have any impact on
the usefulness of clear_refs.

Cc: David Rientjes <rientjes@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nicolas Pitre <nico@fluxnic.net>
Reported-by: Moussa Ba <moussaba@micron.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---

An aside: if you want to see this problem in action, just run:

$ echo 1 > /proc/self/clear_refs

on an ARM platform (as any user) and watch your system hang. I think this
has been the case since 2.6.37, so I'll CC stable once people are happy
with the fix.

 fs/proc/task_mmu.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index e418c5a..7dcd2a2 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -518,6 +518,9 @@ static int clear_refs_pte_range(pmd_t *pmd, unsigned long addr,
 		if (!page)
 			continue;
 
+		if (PageReserved(page))
+			continue;
+
 		/* Clear accessed and referenced bits. */
 		ptep_test_and_clear_young(vma, addr, pte);
 		ClearPageReferenced(page);
-- 
1.7.4.1
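
For readers without the kernel source to hand, the effect of the added check can be sketched in user space. This is only a toy model under stated assumptions: `struct toy_page` and `toy_clear_refs()` are hypothetical names invented for illustration, and the three booleans merely stand in for PageReserved(), the page's Referenced flag and the pte YOUNG bit. It is not the code in fs/proc/task_mmu.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page and its pte; not the kernel's layout. */
struct toy_page {
	bool reserved;   /* models PageReserved(page) */
	bool referenced; /* models the page's Referenced flag */
	bool young;      /* models the pte YOUNG bit */
};

/*
 * Clear the young and referenced state of every non-reserved page,
 * mirroring the "if (PageReserved(page)) continue;" check in the patch.
 * Returns the number of pages that were actually cleared.
 */
size_t toy_clear_refs(struct toy_page *pages, size_t n)
{
	size_t cleared = 0;

	assert(pages != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (pages[i].reserved)
			continue; /* leave reserved pages (e.g. vectors) alone */
		pages[i].young = false;
		pages[i].referenced = false;
		cleared++;
	}
	return cleared;
}
```

In this model a reserved entry keeps both bits set after the call, which is the property the patch relies on to keep the vectors page mapped.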



* Re: [RFC PATCH] proc: clear_refs: do not clear reserved pages
  2012-01-13 15:13 [RFC PATCH] proc: clear_refs: do not clear reserved pages Will Deacon
@ 2012-01-13 15:35 ` Russell King - ARM Linux
  2012-01-13 22:43   ` Andrew Morton
  2012-01-13 22:55 ` Nicolas Pitre
  1 sibling, 1 reply; 8+ messages in thread
From: Russell King - ARM Linux @ 2012-01-13 15:35 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, linux-mm, linux-arm-kernel, Andrew Morton,
	Nicolas Pitre, moussaba, David Rientjes

On Fri, Jan 13, 2012 at 03:13:07PM +0000, Will Deacon wrote:
> /proc/pid/clear_refs is used to clear the Referenced and YOUNG bits for
> pages and corresponding page table entries of the task with PID pid,
> which includes any special mappings inserted into the page tables in
> order to provide things like vDSOs and user helper functions.
> 
> On ARM this causes a problem because the vectors page is mapped as a
> global mapping and since ec706dab ("ARM: add a vma entry for the user
> accessible vector page"), a VMA is also inserted into each task for this
> page to aid unwinding through signals and syscall restarts. Since the
> vectors page is required for handling faults, clearing the YOUNG bit
> (and subsequently writing a faulting pte) means that we lose the vectors
> page *globally* and cannot fault it back in. This results in a system
> deadlock on the next exception.
> 
> This patch avoids clearing the aforementioned bits for reserved pages,
> therefore leaving the vectors page intact on ARM. Since reserved pages
> are not candidates for swap, this change should not have any impact on
> the usefulness of clear_refs.

Having just looked at mm/swapfile.c, what ensures that we don't try to swap
the vectors page out?

I thought that VM_IO or VM_RESERVED once guaranteed that the vma wouldn't
be scanned, but I don't see anything in there which tests these flags.
As a result, it seems to me that the original patch is wrong, and we need
to keep the vectors page completely out of the vma list to prevent it
ever being made old.

Maybe the MM gurus can comment?


* Re: [RFC PATCH] proc: clear_refs: do not clear reserved pages
  2012-01-13 15:35 ` Russell King - ARM Linux
@ 2012-01-13 22:43   ` Andrew Morton
  0 siblings, 0 replies; 8+ messages in thread
From: Andrew Morton @ 2012-01-13 22:43 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Will Deacon, linux-kernel, linux-mm, linux-arm-kernel,
	Nicolas Pitre, moussaba, David Rientjes, Hugh Dickins

On Fri, 13 Jan 2012 15:35:56 +0000
Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:

> On Fri, Jan 13, 2012 at 03:13:07PM +0000, Will Deacon wrote:
> > /proc/pid/clear_refs is used to clear the Referenced and YOUNG bits for
> > pages and corresponding page table entries of the task with PID pid,
> > which includes any special mappings inserted into the page tables in
> > order to provide things like vDSOs and user helper functions.
> > 
> > On ARM this causes a problem because the vectors page is mapped as a
> > global mapping and since ec706dab ("ARM: add a vma entry for the user
> > accessible vector page"), a VMA is also inserted into each task for this
> > page to aid unwinding through signals and syscall restarts. Since the
> > vectors page is required for handling faults, clearing the YOUNG bit
> > (and subsequently writing a faulting pte) means that we lose the vectors
> > page *globally* and cannot fault it back in. This results in a system
> > deadlock on the next exception.
> > 
> > This patch avoids clearing the aforementioned bits for reserved pages,
> > therefore leaving the vectors page intact on ARM. Since reserved pages
> > are not candidates for swap, this change should not have any impact on
> > the usefulness of clear_refs.
> 
> Having just looked at mm/swapfile.c, what ensures that we don't try to swap
> the vectors page out?

Scanning and swapout access the page via the VM's page LRU lists.
These magical pages shouldn't be on the LRU, so the VM won't pester
them.  If those pages did somehow get added to the LRU then yes,
there's a problem.

> I thought that VM_IO or VM_RESERVED once guaranteed that the vma wouldn't
> be scanned, but I don't see anything in there which tests these flags.
> As a result, it seems to me that the original patch is wrong, and we need
> to keep the vectors page completely out of the vma list to prevent it
> ever being made old.
> 
> Maybe the MM gurus can comment?

The MM guru is Hugh.


* Re: [RFC PATCH] proc: clear_refs: do not clear reserved pages
  2012-01-13 15:13 [RFC PATCH] proc: clear_refs: do not clear reserved pages Will Deacon
  2012-01-13 15:35 ` Russell King - ARM Linux
@ 2012-01-13 22:55 ` Nicolas Pitre
  2012-01-14 17:36   ` Hugh Dickins
  1 sibling, 1 reply; 8+ messages in thread
From: Nicolas Pitre @ 2012-01-13 22:55 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, linux-mm, linux-arm-kernel, moussaba,
	David Rientjes, Andrew Morton

On Fri, 13 Jan 2012, Will Deacon wrote:

> /proc/pid/clear_refs is used to clear the Referenced and YOUNG bits for
> pages and corresponding page table entries of the task with PID pid,
> which includes any special mappings inserted into the page tables in
> order to provide things like vDSOs and user helper functions.
> 
> On ARM this causes a problem because the vectors page is mapped as a
> global mapping and since ec706dab ("ARM: add a vma entry for the user
> accessible vector page"), a VMA is also inserted into each task for this
> page to aid unwinding through signals and syscall restarts. Since the
> vectors page is required for handling faults, clearing the YOUNG bit
> (and subsequently writing a faulting pte) means that we lose the vectors
> page *globally* and cannot fault it back in. This results in a system
> deadlock on the next exception.
> 
> This patch avoids clearing the aforementioned bits for reserved pages,
> therefore leaving the vectors page intact on ARM. Since reserved pages
> are not candidates for swap, this change should not have any impact on
> the usefulness of clear_refs.
> 
> Cc: David Rientjes <rientjes@google.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Nicolas Pitre <nico@fluxnic.net>
> Reported-by: Moussa Ba <moussaba@micron.com>
> Signed-off-by: Will Deacon <will.deacon@arm.com>

Given Andrew's answer, this should be fine wrt Russell's concern.

Acked-by: Nicolas Pitre <nico@linaro.org>

> An aside: if you want to see this problem in action, just run:
> 
> $ echo 1 > /proc/self/clear_refs
> 
> on an ARM platform (as any user) and watch your system hang. I think this
> has been the case since 2.6.37, so I'll CC stable once people are happy
> with the fix.
> 
>  fs/proc/task_mmu.c |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index e418c5a..7dcd2a2 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -518,6 +518,9 @@ static int clear_refs_pte_range(pmd_t *pmd, unsigned long addr,
>  		if (!page)
>  			continue;
>  
> +		if (PageReserved(page))
> +			continue;
> +
>  		/* Clear accessed and referenced bits. */
>  		ptep_test_and_clear_young(vma, addr, pte);
>  		ClearPageReferenced(page);
> -- 
> 1.7.4.1
> 


* Re: [RFC PATCH] proc: clear_refs: do not clear reserved pages
  2012-01-13 22:55 ` Nicolas Pitre
@ 2012-01-14 17:36   ` Hugh Dickins
  2012-01-15 15:07     ` Will Deacon
  0 siblings, 1 reply; 8+ messages in thread
From: Hugh Dickins @ 2012-01-14 17:36 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Will Deacon, linux-kernel, linux-mm, linux-arm-kernel, moussaba,
	David Rientjes, Andrew Morton, Russell King - ARM Linux

On Fri, 13 Jan 2012, Nicolas Pitre wrote:
> On Fri, 13 Jan 2012, Will Deacon wrote:
> 
> > /proc/pid/clear_refs is used to clear the Referenced and YOUNG bits for
> > pages and corresponding page table entries of the task with PID pid,
> > which includes any special mappings inserted into the page tables in
> > order to provide things like vDSOs and user helper functions.
> > 
> > On ARM this causes a problem because the vectors page is mapped as a
> > global mapping and since ec706dab ("ARM: add a vma entry for the user
> > accessible vector page"), a VMA is also inserted into each task for this
> > page to aid unwinding through signals and syscall restarts. Since the
> > vectors page is required for handling faults, clearing the YOUNG bit
> > (and subsequently writing a faulting pte) means that we lose the vectors
> > page *globally* and cannot fault it back in. This results in a system
> > deadlock on the next exception.
> > 
> > This patch avoids clearing the aforementioned bits for reserved pages,
> > therefore leaving the vectors page intact on ARM. Since reserved pages
> > are not candidates for swap, this change should not have any impact on
> > the usefulness of clear_refs.
> > 
> > Cc: David Rientjes <rientjes@google.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Nicolas Pitre <nico@fluxnic.net>
> > Reported-by: Moussa Ba <moussaba@micron.com>
> > Signed-off-by: Will Deacon <will.deacon@arm.com>
> 
> Given Andrew's answer, this should be fine wrt Russell's concern.
> 
> Acked-by: Nicolas Pitre <nico@linaro.org>

Yes, it should be okay as an urgent fix for -stable.
But going forward, I doubt it's the right answer: comments below.

> 
> > An aside: if you want to see this problem in action, just run:
> > 
> > $ echo 1 > /proc/self/clear_refs
> > 
> > on an ARM platform (as any user) and watch your system hang. I think this
> > has been the case since 2.6.37, so I'll CC stable once people are happy
> > with the fix.
> > 
> >  fs/proc/task_mmu.c |    3 +++
> >  1 files changed, 3 insertions(+), 0 deletions(-)
> > 
> > diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> > index e418c5a..7dcd2a2 100644
> > --- a/fs/proc/task_mmu.c
> > +++ b/fs/proc/task_mmu.c
> > @@ -518,6 +518,9 @@ static int clear_refs_pte_range(pmd_t *pmd, unsigned long addr,

What got me worried was the line just above the context shown below:
    		page = vm_normal_page(vma, addr, ptent);
> >  		if (!page)
> >  			continue;

This is not a normal page, and it's worrying that vm_normal_page() did
not catch it: I wonder how many other places that could be a problem
(but I have not actually identified any).

vm_normal_page() doesn't catch it because at the time it was written,
we thought we were on the point of removing both PageReserved and
VM_RESERVED (both of whose meanings are imprecise), and there was no
need for it to check either of them.  But nobody found time to do the
final (not entirely trivial) cleanup, removing the definitions.

Maybe ec706dab added a need for it to check one of those; though you
can understand my reluctance to spread PageReserved any further than
it goes already.  I was looking for VM_ flags which might serve you
better, when I thought...

This is a horrible hack vma, which is very liable to introduce bugs
of this nature, because not many people are at all aware of it.
But we've had a horrible hack vma for years, the gate_vma (see
mm/memory.c), and that seems to share many characteristics with your
vectors page (most notably, being in kernel not user address space).

Please, going forward, can you delete your vectors page code, and
use the gate_vma for it?  Extending it a little if it somehow does
not satisfy your need.  Or else can you please explain (ec706dab
does not) why the gate_vma does not suit you.

I'm not saying the horrible hack gate_vma mechanism is any safer
than yours (the latest bug in it was fixed all of 13 days ago).
But I am saying that one horrible hack is safer than two.

> >  
> > +		if (PageReserved(page))
> > +			continue;

Let's note in passing that this does change the "behaviour" of clear_refs
on the ZERO_PAGE; but it doesn't make any functional difference, we just
need to be aware of it, in case someone tries examining /proc/pid/smaps
after /proc/pid/clear_refs, and complains that some pages are left marked
referenced which were cleared before.  Doesn't make a real difference.
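
A rough way to observe that difference from user space is to total the "Referenced:" fields of /proc/pid/smaps before and after writing to clear_refs. The helper below is a minimal sketch, assuming the usual "Referenced:  N kB" line format; `sum_referenced_kb()` is a hypothetical name, not an existing API.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Sum the "Referenced:" fields (in kB) of smaps-formatted text read
 * from an already opened stream. Lines that do not start with
 * "Referenced:" are ignored.
 */
unsigned long sum_referenced_kb(FILE *smaps)
{
	char line[512];
	unsigned long total = 0, kb;

	assert(smaps != NULL);

	while (fgets(line, sizeof(line), smaps)) {
		/* Whitespace in the format matches any run of blanks. */
		if (sscanf(line, "Referenced: %lu kB", &kb) == 1)
			total += kb;
	}
	return total;
}
```

Calling it on a stream opened with fopen("/proc/self/smaps", "r") once before and once after the clear_refs write would show which mappings still report referenced pages.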

> > +
> >  		/* Clear accessed and referenced bits. */
> >  		ptep_test_and_clear_young(vma, addr, pte);
> >  		ClearPageReferenced(page);
> > -- 
> > 1.7.4.1

Hugh


* Re: [RFC PATCH] proc: clear_refs: do not clear reserved pages
  2012-01-14 17:36   ` Hugh Dickins
@ 2012-01-15 15:07     ` Will Deacon
  2012-01-16  4:19       ` Nicolas Pitre
  0 siblings, 1 reply; 8+ messages in thread
From: Will Deacon @ 2012-01-15 15:07 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Nicolas Pitre, linux-kernel, linux-mm, linux-arm-kernel,
	moussaba, David Rientjes, Andrew Morton,
	Russell King - ARM Linux

Hi Hugh,

Thanks for the explanation.

On Sat, Jan 14, 2012 at 05:36:37PM +0000, Hugh Dickins wrote:
> On Fri, 13 Jan 2012, Nicolas Pitre wrote:
> > Given Andrew's answer, this should be fine wrt Russell's concern.
> > 
> > Acked-by: Nicolas Pitre <nico@linaro.org>
> 
> Yes, it should be okay as an urgent fix for -stable.
> But going forward, I doubt it's the right answer: comments below.

Ok great, getting this into -stable ASAP would be much appreciated. Can
somebody pick it up please?

> Please, going forward, can you delete your vectors page code, and
> use the gate_vma for it?  Extending it a little if it somehow does
> not satisfy your need.  Or else can you please explain (ec706dab
> does not) why the gate_vma does not suit you.
> 
> I'm not saying the horrible hack gate_vma mechanism is any safer
> than yours (the latest bug in it was fixed all of 13 days ago).
> But I am saying that one horrible hack is safer than two.

Something like what I've got below seems to do the trick, and clear_refs
also seems to behave when it's presented with the gate_vma. If Russell is
happy with the approach, we can move to the gate_vma in the future.

Thanks,

Will


    ARM: vectors: use gate_vma for vectors user mapping
    
    The current user mapping for the vectors page is inserted as a `horrible
    hack vma' into each task via arch_setup_additional_pages. This causes
    problems with the MM subsystem and vm_normal_page, as described here:
    
    https://lkml.org/lkml/2012/1/14/55
    
    Following the suggestion from Hugh in the above thread, this patch uses
    the gate_vma for the vectors user mapping, therefore consolidating
    the horrible hack VMAs into one.
    
    Signed-off-by: Will Deacon <will.deacon@arm.com>

diff --git a/arch/arm/include/asm/elf.h b/arch/arm/include/asm/elf.h
index 0e9ce8d..38050b1 100644
--- a/arch/arm/include/asm/elf.h
+++ b/arch/arm/include/asm/elf.h
@@ -130,8 +130,4 @@ struct mm_struct;
 extern unsigned long arch_randomize_brk(struct mm_struct *mm);
 #define arch_randomize_brk arch_randomize_brk
 
-extern int vectors_user_mapping(void);
-#define arch_setup_additional_pages(bprm, uses_interp) vectors_user_mapping()
-#define ARCH_HAS_SETUP_ADDITIONAL_PAGES
-
 #endif
diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
index ca94653..e851aa3 100644
--- a/arch/arm/include/asm/page.h
+++ b/arch/arm/include/asm/page.h
@@ -151,6 +151,8 @@ extern void __cpu_copy_user_highpage(struct page *to, struct page *from,
 #define clear_page(page)	memset((void *)(page), 0, PAGE_SIZE)
 extern void copy_page(void *to, const void *from);
 
+#define __HAVE_ARCH_GATE_AREA 1
+
 #include <asm/pgtable-2level-types.h>
 
 #endif /* CONFIG_MMU */
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index 3d0c6fb..c13b8f6 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -493,22 +493,40 @@ unsigned long arch_randomize_brk(struct mm_struct *mm)
 #ifdef CONFIG_MMU
 /*
  * The vectors page is always readable from user space for the
- * atomic helpers and the signal restart code.  Let's declare a mapping
- * for it so it is visible through ptrace and /proc/<pid>/mem.
+ * atomic helpers and the signal restart code. Insert it into the
+ * gate_vma so that it is visible through ptrace and /proc/<pid>/mem.
  */
+static struct vm_area_struct gate_vma;
 
-int vectors_user_mapping(void)
+static int __init gate_vma_init(void)
 {
-	struct mm_struct *mm = current->mm;
-	return install_special_mapping(mm, 0xffff0000, PAGE_SIZE,
-				       VM_READ | VM_EXEC |
-				       VM_MAYREAD | VM_MAYEXEC |
-				       VM_ALWAYSDUMP | VM_RESERVED,
-				       NULL);
+	gate_vma.vm_start	= 0xffff0000;
+	gate_vma.vm_end		= 0xffff0000 + PAGE_SIZE;
+	gate_vma.vm_page_prot	= PAGE_READONLY_EXEC;
+	gate_vma.vm_flags	= VM_READ | VM_EXEC |
+				  VM_MAYREAD | VM_MAYEXEC |
+				  VM_ALWAYSDUMP;
+	return 0;
+}
+arch_initcall(gate_vma_init);
+
+struct vm_area_struct *get_gate_vma(struct mm_struct *mm)
+{
+	return &gate_vma;
+}
+
+int in_gate_area(struct mm_struct *mm, unsigned long addr)
+{
+	return (addr >= gate_vma.vm_start) && (addr < gate_vma.vm_end);
+}
+
+int in_gate_area_no_mm(unsigned long addr)
+{
+	return in_gate_area(NULL, addr);
 }
 
 const char *arch_vma_name(struct vm_area_struct *vma)
 {
-	return (vma->vm_start == 0xffff0000) ? "[vectors]" : NULL;
+	return (vma == &gate_vma) ? "[vectors]" : NULL;
 }
 #endif
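
As a side note on the range test in in_gate_area() above: it is a half-open interval check, so vm_start is inside the gate area and vm_end is not. A small user-space sketch of that check follows; `struct toy_range`, `toy_gate` and `toy_in_range()` are hypothetical names, and the 4096-byte page size is an assumption made only for this example.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_VECTORS_BASE 0xffff0000UL /* address used in the patch */
#define TOY_PAGE_SIZE    4096UL       /* assumed page size */

/* Half-open address range [start, end). */
struct toy_range {
	unsigned long start;
	unsigned long end;
};

/* Models gate_vma.vm_start and gate_vma.vm_end for a one-page mapping. */
static const struct toy_range toy_gate = {
	TOY_VECTORS_BASE,
	TOY_VECTORS_BASE + TOY_PAGE_SIZE,
};

/* True when addr lies inside [start, end), as in_gate_area() tests. */
bool toy_in_range(const struct toy_range *r, unsigned long addr)
{
	assert(r->start < r->end);
	return addr >= r->start && addr < r->end;
}
```

With these values the last byte of the page, 0xffff0fff, is inside the range and 0xffff1000 is the first address outside it.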



* Re: [RFC PATCH] proc: clear_refs: do not clear reserved pages
  2012-01-15 15:07     ` Will Deacon
@ 2012-01-16  4:19       ` Nicolas Pitre
  2012-01-16 10:06         ` Will Deacon
  0 siblings, 1 reply; 8+ messages in thread
From: Nicolas Pitre @ 2012-01-16  4:19 UTC (permalink / raw)
  To: Will Deacon
  Cc: Hugh Dickins, linux-kernel, linux-mm, linux-arm-kernel, moussaba,
	David Rientjes, Andrew Morton, Russell King - ARM Linux

On Sun, 15 Jan 2012, Will Deacon wrote:

> Hi Hugh,
> 
> Thanks for the explanation.
> 
> On Sat, Jan 14, 2012 at 05:36:37PM +0000, Hugh Dickins wrote:
> > I'm not saying the horrible hack gate_vma mechanism is any safer
> > than yours (the latest bug in it was fixed all of 13 days ago).
> > But I am saying that one horrible hack is safer than two.

Absolutely.

> Something like what I've got below seems to do the trick, and clear_refs
> also seems to behave when it's presented with the gate_vma. If Russell is
> happy with the approach, we can move to the gate_vma in the future.

I like it much better, although I haven't tested it fully yet.

However, your patch is missing the worst part of the current ARM hack,
which I would be glad to see go, as follows:

diff --git a/arch/arm/include/asm/mmu_context.h b/arch/arm/include/asm/mmu_context.h
index 71605d9f8e..876e545297 100644
--- a/arch/arm/include/asm/mmu_context.h
+++ b/arch/arm/include/asm/mmu_context.h
@@ -18,6 +18,7 @@
 #include <asm/cacheflush.h>
 #include <asm/cachetype.h>
 #include <asm/proc-fns.h>
+#include <asm-generic/mm_hooks.h>
 
 void __check_kvm_seq(struct mm_struct *mm);
 
@@ -133,32 +135,4 @@ switch_mm(struct mm_struct *prev, struct mm_struct *next,
 #define deactivate_mm(tsk,mm)	do { } while (0)
 #define activate_mm(prev,next)	switch_mm(prev, next, NULL)
 
-/*
- * We are inserting a "fake" vma for the user-accessible vector page so
- * gdb and friends can get to it through ptrace and /proc/<pid>/mem.
- * But we also want to remove it before the generic code gets to see it
- * during process exit or the unmapping of it would  cause total havoc.
- * (the macro is used as remove_vma() is static to mm/mmap.c)
- */
-#define arch_exit_mmap(mm) \
-do { \
-	struct vm_area_struct *high_vma = find_vma(mm, 0xffff0000); \
-	if (high_vma) { \
-		BUG_ON(high_vma->vm_next);  /* it should be last */ \
-		if (high_vma->vm_prev) \
-			high_vma->vm_prev->vm_next = NULL; \
-		else \
-			mm->mmap = NULL; \
-		rb_erase(&high_vma->vm_rb, &mm->mm_rb); \
-		mm->mmap_cache = NULL; \
-		mm->map_count--; \
-		remove_vma(high_vma); \
-	} \
-} while (0)
-
-static inline void arch_dup_mmap(struct mm_struct *oldmm,
-				 struct mm_struct *mm)
-{
-}
-
 #endif


Nicolas


* Re: [RFC PATCH] proc: clear_refs: do not clear reserved pages
  2012-01-16  4:19       ` Nicolas Pitre
@ 2012-01-16 10:06         ` Will Deacon
  0 siblings, 0 replies; 8+ messages in thread
From: Will Deacon @ 2012-01-16 10:06 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Hugh Dickins, linux-kernel, linux-mm, linux-arm-kernel, moussaba,
	David Rientjes, Andrew Morton, Russell King - ARM Linux

On Mon, Jan 16, 2012 at 04:19:43AM +0000, Nicolas Pitre wrote:
> On Sun, 15 Jan 2012, Will Deacon wrote:
> > Something like what I've got below seems to do the trick, and clear_refs
> > also seems to behave when it's presented with the gate_vma. If Russell is
> > happy with the approach, we can move to the gate_vma in the future.
> 
> I like it much better, although I haven't tested it fully yet.
> 
> However, your patch is missing the worst part of the current ARM hack,
> which I would be glad to see go, as follows:
> 
> diff --git a/arch/arm/include/asm/mmu_context.h b/arch/arm/include/asm/mmu_context.h
> index 71605d9f8e..876e545297 100644
> --- a/arch/arm/include/asm/mmu_context.h
> +++ b/arch/arm/include/asm/mmu_context.h
> @@ -18,6 +18,7 @@
>  #include <asm/cacheflush.h>
>  #include <asm/cachetype.h>
>  #include <asm/proc-fns.h>
> +#include <asm-generic/mm_hooks.h>
>  
>  void __check_kvm_seq(struct mm_struct *mm);
>  
> @@ -133,32 +135,4 @@ switch_mm(struct mm_struct *prev, struct mm_struct *next,
>  #define deactivate_mm(tsk,mm)	do { } while (0)
>  #define activate_mm(prev,next)	switch_mm(prev, next, NULL)
>  
> -/*
> - * We are inserting a "fake" vma for the user-accessible vector page so
> - * gdb and friends can get to it through ptrace and /proc/<pid>/mem.
> - * But we also want to remove it before the generic code gets to see it
> - * during process exit or the unmapping of it would  cause total havoc.
> - * (the macro is used as remove_vma() is static to mm/mmap.c)
> - */
> -#define arch_exit_mmap(mm) \
> -do { \
> -	struct vm_area_struct *high_vma = find_vma(mm, 0xffff0000); \
> -	if (high_vma) { \
> -		BUG_ON(high_vma->vm_next);  /* it should be last */ \
> -		if (high_vma->vm_prev) \
> -			high_vma->vm_prev->vm_next = NULL; \
> -		else \
> -			mm->mmap = NULL; \
> -		rb_erase(&high_vma->vm_rb, &mm->mm_rb); \
> -		mm->mmap_cache = NULL; \
> -		mm->map_count--; \
> -		remove_vma(high_vma); \
> -	} \
> -} while (0)
> -
> -static inline void arch_dup_mmap(struct mm_struct *oldmm,
> -				 struct mm_struct *mm)
> -{
> -}
> -
>  #endif

Nice, I missed those hunks! I'm more than happy to include this for v2
(which I'll just post to the ARM list). I'll also give this some testing on
the boards that I have.

Thanks,

Will

