* [PATCH] mm: protect against concurrent vma expansion
@ 2012-12-01  6:56 Michel Lespinasse
  2012-12-03 23:01 ` Andrew Morton
  0 siblings, 1 reply; 11+ messages in thread
From: Michel Lespinasse @ 2012-12-01  6:56 UTC
  To: linux-mm, Rik van Riel, Hugh Dickins, Andrew Morton; +Cc: linux-kernel

expand_stack() runs with a shared mmap_sem lock. Because of this, there
could be multiple concurrent stack expansions in the same mm, which may
cause problems in the vma gap update code.

I propose to solve this by taking the mm->page_table_lock around such vma
expansions, in order to avoid the concurrency issue. We only have to worry
about concurrent expand_stack() calls here, since we hold a shared mmap_sem
lock and all vma modifications other than expand_stack() are done under
an exclusive mmap_sem lock.

I previously tried to achieve the same effect by making sure all
growable vmas in a given mm would share the same anon_vma, which we
already lock here. However this turned out to be difficult - all of the
schemes I tried for refcounting the growable anon_vma and clearing it
turned out ugly. So, I'm now proposing only the minimal fix.

Signed-off-by: Michel Lespinasse <walken@google.com>

---
 mm/mmap.c |   14 ++++++++++++++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 9ed3a06242a0..e44fe876a7e3 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2069,6 +2069,11 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
 		if (vma->vm_pgoff + (size >> PAGE_SHIFT) >= vma->vm_pgoff) {
 			error = acct_stack_growth(vma, size, grow);
 			if (!error) {
+				/*
+				 * page_table_lock to protect against
+				 * concurrent vma expansions
+				 */
+				spin_lock(&vma->vm_mm->page_table_lock);
 				anon_vma_interval_tree_pre_update_vma(vma);
 				vma->vm_end = address;
 				anon_vma_interval_tree_post_update_vma(vma);
@@ -2076,6 +2081,8 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
 					vma_gap_update(vma->vm_next);
 				else
 					vma->vm_mm->highest_vm_end = address;
+				spin_unlock(&vma->vm_mm->page_table_lock);
+
 				perf_event_mmap(vma);
 			}
 		}
@@ -2126,11 +2133,18 @@ int expand_downwards(struct vm_area_struct *vma,
 		if (grow <= vma->vm_pgoff) {
 			error = acct_stack_growth(vma, size, grow);
 			if (!error) {
+				/*
+				 * page_table_lock to protect against
+				 * concurrent vma expansions
+				 */
+				spin_lock(&vma->vm_mm->page_table_lock);
 				anon_vma_interval_tree_pre_update_vma(vma);
 				vma->vm_start = address;
 				vma->vm_pgoff -= grow;
 				anon_vma_interval_tree_post_update_vma(vma);
 				vma_gap_update(vma);
+				spin_unlock(&vma->vm_mm->page_table_lock);
+
 				perf_event_mmap(vma);
 			}
 		}
-- 
1.7.7.3


* Re: [PATCH] mm: protect against concurrent vma expansion
  2012-12-01  6:56 [PATCH] mm: protect against concurrent vma expansion Michel Lespinasse
@ 2012-12-03 23:01 ` Andrew Morton
  2012-12-04  0:35   ` Michel Lespinasse
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2012-12-03 23:01 UTC
  To: Michel Lespinasse; +Cc: linux-mm, Rik van Riel, Hugh Dickins, linux-kernel

On Fri, 30 Nov 2012 22:56:27 -0800
Michel Lespinasse <walken@google.com> wrote:

> expand_stack() runs with a shared mmap_sem lock. Because of this, there
> could be multiple concurrent stack expansions in the same mm, which may
> cause problems in the vma gap update code.
> 
> I propose to solve this by taking the mm->page_table_lock around such vma
> expansions, in order to avoid the concurrency issue. We only have to worry
> about concurrent expand_stack() calls here, since we hold a shared mmap_sem
> lock and all vma modifications other than expand_stack() are done under
> an exclusive mmap_sem lock.
> 
> I previously tried to achieve the same effect by making sure all
> growable vmas in a given mm would share the same anon_vma, which we
> already lock here. However this turned out to be difficult - all of the
> schemes I tried for refcounting the growable anon_vma and clearing it
> turned out ugly. So, I'm now proposing only the minimal fix.
> 

I think I don't understand the problem fully.  Let me demonstrate:

a) vma_lock_anon_vma() doesn't take a lock which is specific to
   "this" anon_vma.  It takes anon_vma->root->mutex.  That mutex is
   shared with vma->vm_next, yes?  If so, we have no problem here? 
   (which makes me suspect that the race lies other than where I think
   it lies).

b) I can see why a broader lock is needed in expand_upwards(): it
   plays with a different vma: vma->vm_next.  But expand_downwards()
   doesn't do that - it only alters "this" vma.  So I'd have thought
   that vma_lock_anon_vma("this" vma) would be sufficient.


What are the performance costs of this change?


* Re: [PATCH] mm: protect against concurrent vma expansion
  2012-12-03 23:01 ` Andrew Morton
@ 2012-12-04  0:35   ` Michel Lespinasse
  2012-12-04  0:43     ` Andrew Morton
  0 siblings, 1 reply; 11+ messages in thread
From: Michel Lespinasse @ 2012-12-04  0:35 UTC
  To: Andrew Morton; +Cc: linux-mm, Rik van Riel, Hugh Dickins, linux-kernel

On Mon, Dec 3, 2012 at 3:01 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Fri, 30 Nov 2012 22:56:27 -0800
> Michel Lespinasse <walken@google.com> wrote:
>
>> expand_stack() runs with a shared mmap_sem lock. Because of this, there
>> could be multiple concurrent stack expansions in the same mm, which may
>> cause problems in the vma gap update code.
>>
>> I propose to solve this by taking the mm->page_table_lock around such vma
>> expansions, in order to avoid the concurrency issue. We only have to worry
>> about concurrent expand_stack() calls here, since we hold a shared mmap_sem
>> lock and all vma modifications other than expand_stack() are done under
>> an exclusive mmap_sem lock.
>>
>> I previously tried to achieve the same effect by making sure all
>> growable vmas in a given mm would share the same anon_vma, which we
>> already lock here. However this turned out to be difficult - all of the
>> schemes I tried for refcounting the growable anon_vma and clearing it
>> turned out ugly. So, I'm now proposing only the minimal fix.
>
> I think I don't understand the problem fully.  Let me demonstrate:
>
> a) vma_lock_anon_vma() doesn't take a lock which is specific to
>    "this" anon_vma.  It takes anon_vma->root->mutex.  That mutex is
>    shared with vma->vm_next, yes?  If so, we have no problem here?
>    (which makes me suspect that the race lies other than where I think
>    it lies).

So, the first thing I need to mention is that this fix is NOT for any
problem that has been reported (and in particular, not for Sasha's
trinity fuzzing issue). It's just me looking at the code and noticing
I haven't gotten locking right for the case of concurrent stack
expansion.

Regarding vma and vma->vm_next sharing the same root anon_vma mutex -
this will often be the case, but not always. find_mergeable_anon_vma()
will try to make it so, but it could fail if there was another vma
in-between at the time the stack's anon_vmas got assigned (either a
non-stack vma that later gets unmapped, or another stack vma that
didn't get its own anon_vma assigned yet).
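
To illustrate, here is a sketch of the relevant reuse condition (the
helper name is made up for illustration; the real test,
anon_vma_compatible() in mm/mmap.c, also compares mempolicy, vm_file
and vm_flags):

	/*
	 * Sketch: anon_vma reuse is only attempted between directly
	 * adjacent vmas, so a vma sitting between two stacks at the
	 * time their anon_vmas get assigned defeats the sharing.
	 */
	static int stacks_can_share_anon_vma(struct vm_area_struct *a,
					     struct vm_area_struct *b)
	{
		return a->vm_end == b->vm_start;	/* adjacency */
	}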

> b) I can see why a broader lock is needed in expand_upwards(): it
>    plays with a different vma: vma->vm_next.  But expand_downwards()
>    doesn't do that - it only alters "this" vma.  So I'd have thought
>    that vma_lock_anon_vma("this" vma) would be sufficient.

The issue there is that vma_gap_update() accesses vma->vm_prev, so the
situation is actually symmetrical with expand_upwards().
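
For reference, the gap maintained for each vma is computed along these
lines (a simplified sketch; the real helper, vma_compute_subtree_gap()
in mm/mmap.c, also folds in the rb_subtree_gap of the rbtree children):

	/*
	 * The gap below a vma depends on this vma's vm_start *and* on
	 * its predecessor's vm_end, so expand_downwards() on vma races
	 * with expand_upwards() on vma->vm_prev.
	 */
	static unsigned long vma_gap(struct vm_area_struct *vma)
	{
		unsigned long gap = vma->vm_start;

		if (vma->vm_prev)
			gap -= vma->vm_prev->vm_end;
		return gap;
	}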

> What are the performance costs of this change?

It's expected to be small. glibc doesn't use expandable stacks for the
threads it creates, so having multiple growable stacks is actually
uncommon (another reason why the problem hasn't been observed in
practice). Because of this, I don't expect the page table lock to get
bounced between threads, so the cost of taking it should be small
(compared to the cost of delivering the #PF, let alone handling it in
software).

But yes, the initial idea of forcing all growable vmas in an mm to
share the same root anon_vma sounded much more appealing at first.
Unfortunately I haven't been able to make that work in a simple enough
way to be comfortable submitting it this late in the release cycle :/

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.


* Re: [PATCH] mm: protect against concurrent vma expansion
  2012-12-04  0:35   ` Michel Lespinasse
@ 2012-12-04  0:43     ` Andrew Morton
  2012-12-04 14:48       ` Michel Lespinasse
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2012-12-04  0:43 UTC
  To: Michel Lespinasse; +Cc: linux-mm, Rik van Riel, Hugh Dickins, linux-kernel

On Mon, 3 Dec 2012 16:35:01 -0800
Michel Lespinasse <walken@google.com> wrote:

> On Mon, Dec 3, 2012 at 3:01 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
> > On Fri, 30 Nov 2012 22:56:27 -0800
> > Michel Lespinasse <walken@google.com> wrote:
> >
> >> expand_stack() runs with a shared mmap_sem lock. Because of this, there
> >> could be multiple concurrent stack expansions in the same mm, which may
> >> cause problems in the vma gap update code.
> >>
> >> I propose to solve this by taking the mm->page_table_lock around such vma
> >> expansions, in order to avoid the concurrency issue. We only have to worry
> >> about concurrent expand_stack() calls here, since we hold a shared mmap_sem
> >> lock and all vma modifications other than expand_stack() are done under
> >> an exclusive mmap_sem lock.
> >>
> >> I previously tried to achieve the same effect by making sure all
> >> growable vmas in a given mm would share the same anon_vma, which we
> >> already lock here. However this turned out to be difficult - all of the
> >> schemes I tried for refcounting the growable anon_vma and clearing it
> >> turned out ugly. So, I'm now proposing only the minimal fix.
> >
> > I think I don't understand the problem fully.  Let me demonstrate:
> >
> > a) vma_lock_anon_vma() doesn't take a lock which is specific to
> >    "this" anon_vma.  It takes anon_vma->root->mutex.  That mutex is
> >    shared with vma->vm_next, yes?  If so, we have no problem here?
> >    (which makes me suspect that the race lies other than where I think
> >    it lies).
> 
> So, the first thing I need to mention is that this fix is NOT for any
> problem that has been reported (and in particular, not for Sasha's
> trinity fuzzing issue). It's just me looking at the code and noticing
> I haven't gotten locking right for the case of concurrent stack
> expansion.
> 
> Regarding vma and vma->vm_next sharing the same root anon_vma mutex -
> this will often be the case, but not always. find_mergeable_anon_vma()
> will try to make it so, but it could fail if there was another vma
> in-between at the time the stack's anon_vmas got assigned (either a
> non-stack vma that later gets unmapped, or another stack vma that
> didn't get its own anon_vma assigned yet).
> 
> > b) I can see why a broader lock is needed in expand_upwards(): it
> >    plays with a different vma: vma->vm_next.  But expand_downwards()
> >    doesn't do that - it only alters "this" vma.  So I'd have thought
> >    that vma_lock_anon_vma("this" vma) would be sufficient.
> 
> The issue there is that vma_gap_update() accesses vma->vm_prev, so the
> situation is actually symmetrical with expand_upwards().
> 
> > What are the performance costs of this change?
> 
> It's expected to be small. glibc doesn't use expandable stacks for the
> threads it creates, so having multiple growable stacks is actually
> uncommon (another reason why the problem hasn't been observed in
> practice). Because of this, I don't expect the page table lock to get
> bounced between threads, so the cost of taking it should be small
> (compared to the cost of delivering the #PF, let alone handling it in
> software).
> 
> But yes, the initial idea of forcing all growable vmas in an mm to
> share the same root anon_vma sounded much more appealing at first.
> Unfortunately I haven't been able to make that work in a simple enough
> way to be comfortable submitting it this late in the release cycle :/

hm, OK.  Could you please cook up a new changelog which explains these
things to the next puzzled reader and send it along?

Ingo is playing in the same area with "mm/rmap: Convert the struct
anon_vma::mutex to an rwsem", but as that patch changes
vma_lock_anon_vma() to use down_write(), I expect it won't affect
anything.  But please check it over.



* Re: [PATCH] mm: protect against concurrent vma expansion
  2012-12-04  0:43     ` Andrew Morton
@ 2012-12-04 14:48       ` Michel Lespinasse
  2012-12-20  1:56         ` Simon Jeons
  0 siblings, 1 reply; 11+ messages in thread
From: Michel Lespinasse @ 2012-12-04 14:48 UTC
  To: Andrew Morton; +Cc: linux-mm, Rik van Riel, Hugh Dickins, linux-kernel

expand_stack() runs with a shared mmap_sem lock. Because of this, there
could be multiple concurrent stack expansions in the same mm, which may
cause problems in the vma gap update code.

I propose to solve this by taking the mm->page_table_lock around such vma
expansions, in order to avoid the concurrency issue. We only have to worry
about concurrent expand_stack() calls here, since we hold a shared mmap_sem
lock and all vma modifications other than expand_stack() are done under
an exclusive mmap_sem lock.

I previously tried to achieve the same effect by making sure all
growable vmas in a given mm would share the same anon_vma, which we
already lock here. However this turned out to be difficult - all of the
schemes I tried for refcounting the growable anon_vma and clearing it
turned out ugly. So, I'm now proposing only the minimal fix.

The overhead of taking the page table lock during stack expansion is
expected to be small: glibc doesn't use expandable stacks for the
threads it creates, so having multiple growable stacks is actually
uncommon and we don't expect the page table lock to get bounced
between threads.

Signed-off-by: Michel Lespinasse <walken@google.com>

---
 mm/mmap.c |   28 ++++++++++++++++++++++++++++
 1 files changed, 28 insertions(+), 0 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 9ed3a06242a0..2b7d9e78a569 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2069,6 +2069,18 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
 		if (vma->vm_pgoff + (size >> PAGE_SHIFT) >= vma->vm_pgoff) {
 			error = acct_stack_growth(vma, size, grow);
 			if (!error) {
+				/*
+				 * vma_gap_update() doesn't support concurrent
+				 * updates, but we only hold a shared mmap_sem
+				 * lock here, so we need to protect against
+				 * concurrent vma expansions.
+				 * vma_lock_anon_vma() doesn't help here, as
+				 * we don't guarantee that all growable vmas
+				 * in a mm share the same root anon vma.
+				 * So, we reuse mm->page_table_lock to guard
+				 * against concurrent vma expansions.
+				 */
+				spin_lock(&vma->vm_mm->page_table_lock);
 				anon_vma_interval_tree_pre_update_vma(vma);
 				vma->vm_end = address;
 				anon_vma_interval_tree_post_update_vma(vma);
@@ -2076,6 +2088,8 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
 					vma_gap_update(vma->vm_next);
 				else
 					vma->vm_mm->highest_vm_end = address;
+				spin_unlock(&vma->vm_mm->page_table_lock);
+
 				perf_event_mmap(vma);
 			}
 		}
@@ -2126,11 +2140,25 @@ int expand_downwards(struct vm_area_struct *vma,
 		if (grow <= vma->vm_pgoff) {
 			error = acct_stack_growth(vma, size, grow);
 			if (!error) {
+				/*
+				 * vma_gap_update() doesn't support concurrent
+				 * updates, but we only hold a shared mmap_sem
+				 * lock here, so we need to protect against
+				 * concurrent vma expansions.
+				 * vma_lock_anon_vma() doesn't help here, as
+				 * we don't guarantee that all growable vmas
+				 * in a mm share the same root anon vma.
+				 * So, we reuse mm->page_table_lock to guard
+				 * against concurrent vma expansions.
+				 */
+				spin_lock(&vma->vm_mm->page_table_lock);
 				anon_vma_interval_tree_pre_update_vma(vma);
 				vma->vm_start = address;
 				vma->vm_pgoff -= grow;
 				anon_vma_interval_tree_post_update_vma(vma);
 				vma_gap_update(vma);
+				spin_unlock(&vma->vm_mm->page_table_lock);
+
 				perf_event_mmap(vma);
 			}
 		}
-- 
1.7.7.3


* Re: [PATCH] mm: protect against concurrent vma expansion
  2012-12-04 14:48       ` Michel Lespinasse
@ 2012-12-20  1:56         ` Simon Jeons
  2012-12-20  3:01           ` Michel Lespinasse
  0 siblings, 1 reply; 11+ messages in thread
From: Simon Jeons @ 2012-12-20  1:56 UTC
  To: Michel Lespinasse
  Cc: Andrew Morton, linux-mm, Rik van Riel, Hugh Dickins, linux-kernel

On Tue, 2012-12-04 at 06:48 -0800, Michel Lespinasse wrote:
> expand_stack() runs with a shared mmap_sem lock. Because of this, there
> could be multiple concurrent stack expansions in the same mm, which may
> cause problems in the vma gap update code.
> 
> I propose to solve this by taking the mm->page_table_lock around such vma
> expansions, in order to avoid the concurrency issue. We only have to worry
> about concurrent expand_stack() calls here, since we hold a shared mmap_sem
> lock and all vma modifications other than expand_stack() are done under
> an exclusive mmap_sem lock.

Hi Michel and Andrew,

One question.

I found that the main callsite of expand_stack() is the #PF handler,
but it holds mmap_sem each time before calling expand_stack(); how can
holding a *shared* mmap_sem happen?

> 
> I previously tried to achieve the same effect by making sure all
> growable vmas in a given mm would share the same anon_vma, which we
> already lock here. However this turned out to be difficult - all of the
> schemes I tried for refcounting the growable anon_vma and clearing it
> turned out ugly. So, I'm now proposing only the minimal fix.
> 
> The overhead of taking the page table lock during stack expansion is
> expected to be small: glibc doesn't use expandable stacks for the
> threads it creates, so having multiple growable stacks is actually
> uncommon and we don't expect the page table lock to get bounced
> between threads.
> 
> Signed-off-by: Michel Lespinasse <walken@google.com>
> 
> ---
>  mm/mmap.c |   28 ++++++++++++++++++++++++++++
>  1 files changed, 28 insertions(+), 0 deletions(-)
> 
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 9ed3a06242a0..2b7d9e78a569 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2069,6 +2069,18 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
>  		if (vma->vm_pgoff + (size >> PAGE_SHIFT) >= vma->vm_pgoff) {
>  			error = acct_stack_growth(vma, size, grow);
>  			if (!error) {
> +				/*
> +				 * vma_gap_update() doesn't support concurrent
> +				 * updates, but we only hold a shared mmap_sem
> +				 * lock here, so we need to protect against
> +				 * concurrent vma expansions.
> +				 * vma_lock_anon_vma() doesn't help here, as
> +				 * we don't guarantee that all growable vmas
> +				 * in a mm share the same root anon vma.
> +				 * So, we reuse mm->page_table_lock to guard
> +				 * against concurrent vma expansions.
> +				 */
> +				spin_lock(&vma->vm_mm->page_table_lock);
>  				anon_vma_interval_tree_pre_update_vma(vma);
>  				vma->vm_end = address;
>  				anon_vma_interval_tree_post_update_vma(vma);
> @@ -2076,6 +2088,8 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
>  					vma_gap_update(vma->vm_next);
>  				else
>  					vma->vm_mm->highest_vm_end = address;
> +				spin_unlock(&vma->vm_mm->page_table_lock);
> +
>  				perf_event_mmap(vma);
>  			}
>  		}
> @@ -2126,11 +2140,25 @@ int expand_downwards(struct vm_area_struct *vma,
>  		if (grow <= vma->vm_pgoff) {
>  			error = acct_stack_growth(vma, size, grow);
>  			if (!error) {
> +				/*
> +				 * vma_gap_update() doesn't support concurrent
> +				 * updates, but we only hold a shared mmap_sem
> +				 * lock here, so we need to protect against
> +				 * concurrent vma expansions.
> +				 * vma_lock_anon_vma() doesn't help here, as
> +				 * we don't guarantee that all growable vmas
> +				 * in a mm share the same root anon vma.
> +				 * So, we reuse mm->page_table_lock to guard
> +				 * against concurrent vma expansions.
> +				 */
> +				spin_lock(&vma->vm_mm->page_table_lock);
>  				anon_vma_interval_tree_pre_update_vma(vma);
>  				vma->vm_start = address;
>  				vma->vm_pgoff -= grow;
>  				anon_vma_interval_tree_post_update_vma(vma);
>  				vma_gap_update(vma);
> +				spin_unlock(&vma->vm_mm->page_table_lock);
> +
>  				perf_event_mmap(vma);
>  			}
>  		}




* Re: [PATCH] mm: protect against concurrent vma expansion
  2012-12-20  1:56         ` Simon Jeons
@ 2012-12-20  3:01           ` Michel Lespinasse
  2013-01-04  0:40             ` Simon Jeons
  0 siblings, 1 reply; 11+ messages in thread
From: Michel Lespinasse @ 2012-12-20  3:01 UTC
  To: Simon Jeons
  Cc: Andrew Morton, linux-mm, Rik van Riel, Hugh Dickins, linux-kernel

Hi Simon,

On Wed, Dec 19, 2012 at 5:56 PM, Simon Jeons <simon.jeons@gmail.com> wrote:
> One question.
>
> I found that the main callsite of expand_stack() is the #PF handler,
> but it holds mmap_sem each time before calling expand_stack(); how can
> holding a *shared* mmap_sem happen?

The #PF handler calls down_read(&mm->mmap_sem) before calling expand_stack().

I think I'm just confusing you with my terminology; shared lock ==
read lock == several readers might hold it at once (I'd say they share
it)
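
As a simplified sketch of that path (modeled on arch/x86/mm/fault.c of
that era; error paths and the stack-pointer sanity check are elided):

	down_read(&mm->mmap_sem);		/* shared, i.e. read lock */
	vma = find_vma(mm, address);
	if (!vma)
		goto bad_area;
	if (address < vma->vm_start) {
		/* not inside any vma - maybe just below a stack? */
		if (!(vma->vm_flags & VM_GROWSDOWN))
			goto bad_area;
		if (expand_stack(vma, address))	/* read lock still held */
			goto bad_area;
	}
	handle_mm_fault(mm, vma, address, flags);
	up_read(&mm->mmap_sem);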

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.


* Re: [PATCH] mm: protect against concurrent vma expansion
  2012-12-20  3:01           ` Michel Lespinasse
@ 2013-01-04  0:40             ` Simon Jeons
  2013-01-04  0:50               ` Michel Lespinasse
  2013-01-04  2:49               ` Al Viro
  0 siblings, 2 replies; 11+ messages in thread
From: Simon Jeons @ 2013-01-04  0:40 UTC
  To: Michel Lespinasse
  Cc: Andrew Morton, linux-mm, Rik van Riel, Hugh Dickins, linux-kernel

On Wed, 2012-12-19 at 19:01 -0800, Michel Lespinasse wrote:
> Hi Simon,
> 
> On Wed, Dec 19, 2012 at 5:56 PM, Simon Jeons <simon.jeons@gmail.com> wrote:
> > One question.
> >
> > I found that the main callsite of expand_stack() is the #PF handler,
> > but it holds mmap_sem each time before calling expand_stack(); how can
> > holding a *shared* mmap_sem happen?
> 
> The #PF handler calls down_read(&mm->mmap_sem) before calling expand_stack().
> 
> I think I'm just confusing you with my terminology; shared lock ==
> read lock == several readers might hold it at once (I'd say they share
> it)

Sorry for my late response. 

Since expand_stack() will modify the vma, why hold a read lock here?

> 




* Re: [PATCH] mm: protect against concurrent vma expansion
  2013-01-04  0:40             ` Simon Jeons
@ 2013-01-04  0:50               ` Michel Lespinasse
  2013-01-04  1:18                 ` Simon Jeons
  2013-01-04  2:49               ` Al Viro
  1 sibling, 1 reply; 11+ messages in thread
From: Michel Lespinasse @ 2013-01-04  0:50 UTC
  To: Simon Jeons
  Cc: Andrew Morton, linux-mm, Rik van Riel, Hugh Dickins, linux-kernel

On Thu, Jan 3, 2013 at 4:40 PM, Simon Jeons <simon.jeons@gmail.com> wrote:
> On Wed, 2012-12-19 at 19:01 -0800, Michel Lespinasse wrote:
>> Hi Simon,
>>
>> On Wed, Dec 19, 2012 at 5:56 PM, Simon Jeons <simon.jeons@gmail.com> wrote:
>> > One question.
>> >
>> > I found that the main callsite of expand_stack() is the #PF handler,
>> > but it holds mmap_sem each time before calling expand_stack(); how can
>> > holding a *shared* mmap_sem happen?
>>
>> The #PF handler calls down_read(&mm->mmap_sem) before calling expand_stack().
>>
>> I think I'm just confusing you with my terminology; shared lock ==
>> read lock == several readers might hold it at once (I'd say they share
>> it)
>
> Sorry for my late response.
>
> Since expand_stack() will modify the vma, why hold a read lock here?

Well, it'd be much nicer if we had a write lock, I think. But we
didn't know when taking the lock that we'd end up having to expand
stacks.

What happens is that page faults don't generally modify vmas, so they
get a read lock (just to know what vma the fault is happening in) and
then fault in the page.

expand_stack() is the one exception to that - after getting the read
lock as usual, we notice that the fault is not in any vma right now,
but it's close enough to an expandable vma.

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.


* Re: [PATCH] mm: protect against concurrent vma expansion
  2013-01-04  0:50               ` Michel Lespinasse
@ 2013-01-04  1:18                 ` Simon Jeons
  0 siblings, 0 replies; 11+ messages in thread
From: Simon Jeons @ 2013-01-04  1:18 UTC
  To: Michel Lespinasse
  Cc: Andrew Morton, linux-mm, Rik van Riel, Hugh Dickins, linux-kernel

On Thu, 2013-01-03 at 16:50 -0800, Michel Lespinasse wrote:
> On Thu, Jan 3, 2013 at 4:40 PM, Simon Jeons <simon.jeons@gmail.com> wrote:
> > On Wed, 2012-12-19 at 19:01 -0800, Michel Lespinasse wrote:
> >> Hi Simon,
> >>
> >> On Wed, Dec 19, 2012 at 5:56 PM, Simon Jeons <simon.jeons@gmail.com> wrote:
> >> > One question.
> >> >
> >> > I found that the main callsite of expand_stack() is the #PF handler,
> >> > but it holds mmap_sem each time before calling expand_stack(); how can
> >> > holding a *shared* mmap_sem happen?
> >>
> >> The #PF handler calls down_read(&mm->mmap_sem) before calling expand_stack().
> >>
> >> I think I'm just confusing you with my terminology; shared lock ==
> >> read lock == several readers might hold it at once (I'd say they share
> >> it)
> >
> > Sorry for my late response.
> >
> > Since expand_stack() will modify the vma, why hold a read lock here?
> 
> Well, it'd be much nicer if we had a write lock, I think. But, we
> didn't know when taking the lock that we'd end up having to expand
> stacks.
> 
> What happens is that page faults don't generally modify vmas, so they
> get a read lock (just to know what vma the fault is happening in) and
> then fault in the page.
> 

Thanks for your quick explanation. 

> expand_stack() is the one exception to that - after getting the read
> lock as usual, we notice that the fault is not in any vma right now,
> but it's close enough to an expandable vma.

Does this scenario only occur for the userspace stack?

> 




* Re: [PATCH] mm: protect against concurrent vma expansion
  2013-01-04  0:40             ` Simon Jeons
  2013-01-04  0:50               ` Michel Lespinasse
@ 2013-01-04  2:49               ` Al Viro
  1 sibling, 0 replies; 11+ messages in thread
From: Al Viro @ 2013-01-04  2:49 UTC
  To: Simon Jeons
  Cc: Michel Lespinasse, Andrew Morton, linux-mm, Rik van Riel,
	Hugh Dickins, linux-kernel

On Thu, Jan 03, 2013 at 06:40:05PM -0600, Simon Jeons wrote:
> On Wed, 2012-12-19 at 19:01 -0800, Michel Lespinasse wrote:
> > Hi Simon,
> > 
> > On Wed, Dec 19, 2012 at 5:56 PM, Simon Jeons <simon.jeons@gmail.com> wrote:
> > > One question.
> > >
> > > I found that the main callsite of expand_stack() is the #PF handler,
> > > but it holds mmap_sem each time before calling expand_stack(); how can
> > > holding a *shared* mmap_sem happen?
> > 
> > The #PF handler calls down_read(&mm->mmap_sem) before calling expand_stack().
> > 
> > I think I'm just confusing you with my terminology; shared lock ==
> > read lock == several readers might hold it at once (I'd say they share
> > it)
> 
> Sorry for my late response. 
> 
> Since expand_stack() will modify the vma, why hold a read lock here?

To prevent that vma from being ripped out.

