From: Simon Jeons <simon.jeons@gmail.com>
To: Michel Lespinasse <walken@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, Rik van Riel <riel@redhat.com>,
	Hugh Dickins <hughd@google.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: protect against concurrent vma expansion
Date: Wed, 19 Dec 2012 20:56:34 -0500
Message-ID: <1355968594.1415.4.camel@kernel-VirtualBox>
In-Reply-To: <20121204144820.GA13916@google.com>

On Tue, 2012-12-04 at 06:48 -0800, Michel Lespinasse wrote:
> expand_stack() runs with a shared mmap_sem lock. Because of this, there
> could be multiple concurrent stack expansions in the same mm, which may
> cause problems in the vma gap update code.
> 
> I propose to solve this by taking the mm->page_table_lock around such vma
> expansions, in order to avoid the concurrency issue. We only have to worry
> about concurrent expand_stack() calls here, since we hold a shared mmap_sem
> lock and all vma modifications other than expand_stack() are done under
> an exclusive mmap_sem lock.

Hi Michel and Andrew,

One question.

I found that the main call site of expand_stack() is the page fault
(#PF) path, and that path takes mmap_sem every time before it calls
expand_stack(). So how can expand_stack() end up running with only a
*shared* mmap_sem held?

> 
> I previously tried to achieve the same effect by making sure all
> growable vmas in a given mm would share the same anon_vma, which we
> already lock here. However this turned out to be difficult - all of the
> schemes I tried for refcounting the growable anon_vma and clearing
> turned out ugly. So, I'm now proposing only the minimal fix.
> 
> The overhead of taking the page table lock during stack expansion is
> expected to be small: glibc doesn't use expandable stacks for the
> threads it creates, so having multiple growable stacks is actually
> uncommon and we don't expect the page table lock to get bounced
> between threads.
> 
> Signed-off-by: Michel Lespinasse <walken@google.com>
> 
> ---
>  mm/mmap.c |   28 ++++++++++++++++++++++++++++
>  1 files changed, 28 insertions(+), 0 deletions(-)
> 
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 9ed3a06242a0..2b7d9e78a569 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2069,6 +2069,18 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
>  		if (vma->vm_pgoff + (size >> PAGE_SHIFT) >= vma->vm_pgoff) {
>  			error = acct_stack_growth(vma, size, grow);
>  			if (!error) {
> +				/*
> +				 * vma_gap_update() doesn't support concurrent
> +				 * updates, but we only hold a shared mmap_sem
> +				 * lock here, so we need to protect against
> +				 * concurrent vma expansions.
> +				 * vma_lock_anon_vma() doesn't help here, as
> +				 * we don't guarantee that all growable vmas
> +				 * in a mm share the same root anon vma.
> +				 * So, we reuse mm->page_table_lock to guard
> +				 * against concurrent vma expansions.
> +				 */
> +				spin_lock(&vma->vm_mm->page_table_lock);
>  				anon_vma_interval_tree_pre_update_vma(vma);
>  				vma->vm_end = address;
>  				anon_vma_interval_tree_post_update_vma(vma);
> @@ -2076,6 +2088,8 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
>  					vma_gap_update(vma->vm_next);
>  				else
>  					vma->vm_mm->highest_vm_end = address;
> +				spin_unlock(&vma->vm_mm->page_table_lock);
> +
>  				perf_event_mmap(vma);
>  			}
>  		}
> @@ -2126,11 +2140,25 @@ int expand_downwards(struct vm_area_struct *vma,
>  		if (grow <= vma->vm_pgoff) {
>  			error = acct_stack_growth(vma, size, grow);
>  			if (!error) {
> +				/*
> +				 * vma_gap_update() doesn't support concurrent
> +				 * updates, but we only hold a shared mmap_sem
> +				 * lock here, so we need to protect against
> +				 * concurrent vma expansions.
> +				 * vma_lock_anon_vma() doesn't help here, as
> +				 * we don't guarantee that all growable vmas
> +				 * in a mm share the same root anon vma.
> +				 * So, we reuse mm->page_table_lock to guard
> +				 * against concurrent vma expansions.
> +				 */
> +				spin_lock(&vma->vm_mm->page_table_lock);
>  				anon_vma_interval_tree_pre_update_vma(vma);
>  				vma->vm_start = address;
>  				vma->vm_pgoff -= grow;
>  				anon_vma_interval_tree_post_update_vma(vma);
>  				vma_gap_update(vma);
> +				spin_unlock(&vma->vm_mm->page_table_lock);
> +
>  				perf_event_mmap(vma);
>  			}
>  		}



