From: Pekka Enberg <penberg@kernel.org>
To: Christoph Lameter <cl@linux.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	RT <linux-rt-users@vger.kernel.org>,
	Clark Williams <clark@redhat.com>, John Kacur <jkacur@gmail.com>,
	"Luis Claudio R. Goncalves" <lgoncalv@redhat.com>,
	Joonsoo Kim <js1304@gmail.com>,
	Glauber Costa <glommer@parallels.com>,
	linux-mm@kvack.org, David Rientjes <rientjes@google.com>,
	elezegarcia@gmail.com
Subject: Re: FIX [2/2] slub: tid must be retrieved from the percpu area of the current processor.
Date: Fri, 1 Feb 2013 12:24:47 +0200
Message-ID: <CAOJsxLFQt+Yq-n5QABgGczUjiaAGCJMwHZJwzWnpAKDCtKvabA@mail.gmail.com>
In-Reply-To: <0000013c695fbea7-9472355c-ccb3-4aa3-ba3d-2ecd6afb2e5a-000000@email.amazonses.com>

On Wed, Jan 23, 2013 at 11:45 PM, Christoph Lameter <cl@linux.com> wrote:
> As Steven Rostedt has pointed out: Rescheduling could occur on a different processor
> after the determination of the per cpu pointer and before the tid is retrieved.
> This could result in allocation from the wrong node in slab_alloc.
>
> The effect is much more severe in slab_free() where we could free to the freelist
> of the wrong page.
>
> The window for something like that occurring is pretty small but it is possible.
>
> Signed-off-by: Christoph Lameter <cl@linux.com>

Okay, makes sense. Has anyone triggered this in practice? Do we want
to tag this for -stable?

>
> Index: linux/mm/slub.c
> ===================================================================
> --- linux.orig/mm/slub.c        2013-01-23 15:06:39.805154107 -0600
> +++ linux/mm/slub.c     2013-01-23 15:24:47.656868067 -0600
> @@ -2331,13 +2331,18 @@ static __always_inline void *slab_alloc_
>
>         s = memcg_kmem_get_cache(s, gfpflags);
>  redo:
> -
>         /*
>          * Must read kmem_cache cpu data via this cpu ptr. Preemption is
>          * enabled. We may switch back and forth between cpus while
>          * reading from one cpu area. That does not matter as long
>          * as we end up on the original cpu again when doing the cmpxchg.
> +        *
> +        * Preemption is disabled for the retrieval of the tid because that
> +        * must occur from the current processor. We cannot allow rescheduling
> +        * on a different processor between the determination of the pointer
> +        * and the retrieval of the tid.
>          */
> +       preempt_disable();
>         c = __this_cpu_ptr(s->cpu_slab);
>
>         /*
> @@ -2347,7 +2352,7 @@ redo:
>          * linked list in between.
>          */
>         tid = c->tid;
> -       barrier();
> +       preempt_enable();
>
>         object = c->freelist;
>         page = c->page;
> @@ -2594,10 +2599,11 @@ redo:
>          * data is retrieved via this pointer. If we are on the same cpu
>          * during the cmpxchg then the free will succeed.
>          */
> +       preempt_disable();
>         c = __this_cpu_ptr(s->cpu_slab);
>
>         tid = c->tid;
> -       barrier();
> +       preempt_enable();
>
>         if (likely(page == c->page)) {
>                 set_freepointer(s, object, c->freelist);
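
For readers following the comment in the quoted patch ("That does not
matter as long as we end up on the original cpu again when doing the
cmpxchg"), here is a minimal user-space sketch of a tid-validated
optimistic update. It is a model only, not kernel code: struct cpu_area,
areas[], init_areas(), try_commit() and model_alloc() are hypothetical
names, and the model assumes each CPU's tid starts at its CPU number and
advances by a step of at least NR_CPUS, so that no two areas ever hold
the same tid. It does not try to reproduce the race the patch closes; it
only shows why the snapshot has to describe the area that the final
compare-and-exchange operates on.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS  2
#define TID_STEP 2	/* model assumption: a step >= NR_CPUS keeps tids unique per CPU */

/* Hypothetical stand-in for the per-cpu slab data; not the kernel layout. */
struct cpu_area {
	unsigned long tid;	/* transaction id, advanced on every successful commit */
	int freelist;		/* models the head of the per-cpu freelist */
};

static struct cpu_area areas[NR_CPUS];

static void init_areas(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		areas[cpu].tid = (unsigned long)cpu;	/* tid starts at the cpu number */
		areas[cpu].freelist = 10;		/* same value everywhere on purpose */
	}
}

/*
 * Models the cmpxchg step: update commit_cpu's area only if both the
 * freelist and the tid still match the earlier snapshot.
 */
static bool try_commit(int commit_cpu, int old_free, unsigned long old_tid,
		       int new_free)
{
	assert(commit_cpu >= 0 && commit_cpu < NR_CPUS);

	struct cpu_area *c = &areas[commit_cpu];

	if (c->freelist != old_free || c->tid != old_tid)
		return false;

	c->freelist = new_free;
	c->tid = old_tid + TID_STEP;
	return true;
}

/* Take the snapshot via snap_cpu's area, then commit on commit_cpu's area. */
static bool model_alloc(int snap_cpu, int commit_cpu)
{
	assert(snap_cpu >= 0 && snap_cpu < NR_CPUS);

	struct cpu_area *c = &areas[snap_cpu];
	unsigned long tid = c->tid;
	int object = c->freelist;

	return try_commit(commit_cpu, object, tid, object + 1);
}
```

In this model, after init_areas(), model_alloc(0, 1) returns false even
though both freelists hold the same value, because the tid snapshot
belongs to CPU 0. model_alloc(0, 0) returns true and advances CPU 0's
tid by TID_STEP.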
>
