All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tejun Heo <tj@kernel.org>
To: Christoph Lameter <cl@linux-foundation.org>,
	David Howells <dhowells@redhat.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Oleg Nesterov <oleg@redhat.com>,
	linux-kernel@vger.kernel.org
Subject: [PATCH RFC] percpu: add data dependency barrier in percpu accessors and operations
Date: Thu, 12 Jun 2014 09:56:30 -0400	[thread overview]
Message-ID: <20140612135630.GA23606@htj.dyndns.org> (raw)

>From 2ff90ab638d50d6191ba3a3564b53fccb78bd4cd Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Wed, 11 Jun 2014 20:47:02 -0400

percpu areas are zeroed on allocation and, by its nature, accessed
from multiple cpus.  Consider the following scenario.

 p = NULL;

	CPU-1				CPU-2
 p = alloc_percpu()		if (p)
					WARN_ON(this_cpu_read(*p));

As all percpu areas are zeroed on allocation, CPU-2 should never
trigger the WARN; however, there's no barrier between zeroing of the
percpu regions and the assignment of @p or between testing of @p and
dereferencing it and CPU-2 may see garbage value from before the
clearing and trigger the warning.

Note that this may happen even on archs which don't require data
dependency barriers.  While CPU-2 woudln't reorder percpu area access
above the testing of @p, nothing prevents CPU-1 from scheduling
zeroing after the assignment of @p.

This patch fixes the issue by adding a smp_wmb() before a newly
allocated percpu pointer is returned and a smp_read_barrier_depends()
in __verify_pcpu_ptr() which is guaranteed to be invoked at least once
before every percpu access.  It currently may be invoked multiple
times per operation which isn't optimal.  Future patches will update
the definitions so that the macro is invoked only once per access.

It can be argued that smp_read_barrier_depends() is the responsibility
of the caller; however, discerning local and remote accesses gets very
confusing with percpu areas (initialized remotely but local to this
cpu and vice-versa) and data dependency barrier is free on all archs
except for alpha, so I think it's better to include it as part of
percpu accessors and operations.

I wonder whether we also need a smp_wmb() for other __GFP_ZERO
allocations.  There isn't one and requiring the users to perform
smp_wmb() to ensure the propagation of zeroing seems too subtle.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
---
 include/linux/percpu-defs.h | 15 ++++++++++++---
 mm/percpu.c                 |  9 +++++++++
 2 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/include/linux/percpu-defs.h b/include/linux/percpu-defs.h
index a5fc7d0..b91bb37 100644
--- a/include/linux/percpu-defs.h
+++ b/include/linux/percpu-defs.h
@@ -1,6 +1,8 @@
 #ifndef _LINUX_PERCPU_DEFS_H
 #define _LINUX_PERCPU_DEFS_H
 
+#include <asm/barrier.h>
+
 /*
  * Base implementations of per-CPU variable declarations and definitions, where
  * the section in which the variable is to be placed is provided by the
@@ -19,9 +21,15 @@
 	__attribute__((section(".discard"), unused))
 
 /*
- * Macro which verifies @ptr is a percpu pointer without evaluating
- * @ptr.  This is to be used in percpu accessors to verify that the
- * input parameter is a percpu pointer.
+ * This macro serves two purposes.  It verifies @ptr is a percpu pointer
+ * without evaluating @ptr and provides the data dependency barrier paired
+ * with smp_wmb() at the end of the allocation path so that the memory
+ * clearing in the allocation path is visible to all percpu accsses.
+ *
+ * The existence of the data dependency barrier is guaranteed and percpu
+ * users can take advantage of it - e.g. percpu area updates followed by
+ * smp_wmb() and then a percpu pointer assignment are guaranteed to be
+ * visible to accessors which access through the assigned percpu pointer.
  *
  * + 0 is required in order to convert the pointer type from a
  * potential array type to a pointer to a single item of the array.
@@ -29,6 +37,7 @@
 #define __verify_pcpu_ptr(ptr)	do {					\
 	const void __percpu *__vpp_verify = (typeof((ptr) + 0))NULL;	\
 	(void)__vpp_verify;						\
+	smp_read_barrier_depends();					\
 } while (0)
 
 /*
diff --git a/mm/percpu.c b/mm/percpu.c
index 2ddf9a9..bd3cab8 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -816,6 +816,15 @@ area_found:
 	/* return address relative to base address */
 	ptr = __addr_to_pcpu_ptr(chunk->base_addr + off);
 	kmemleak_alloc_percpu(ptr, size);
+
+	/*
+	 * The following wmb is paired with the data dependency barrier in
+	 * the percpu accessors and guarantees that the zeroing of the
+	 * percpu areas in pcpu_populate_chunk() is visible to all percpu
+	 * accesses through the returned percpu pointer.  The accessors get
+	 * their data dependency barrier from __verify_pcpu_ptr().
+	 */
+	smp_wmb();
 	return ptr;
 
 fail_unlock:
-- 
1.9.3

             reply	other threads:[~2014-06-12 13:56 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-12 13:56 Tejun Heo [this message]
2014-06-12 15:34 ` [PATCH RFC] percpu: add data dependency barrier in percpu accessors and operations Paul E. McKenney
2014-06-12 15:52   ` Tejun Heo
2014-06-17 14:41     ` Paul E. McKenney
2014-06-17 15:27       ` Tejun Heo
2014-06-17 15:56         ` Christoph Lameter
2014-06-17 16:00           ` Tejun Heo
2014-06-17 16:05             ` Tejun Heo
2014-06-17 16:28               ` Christoph Lameter
     [not found]                 ` <CA+55aFxHr8JXwDR-4g4z1mkXvZRtY=OosYcUMPZRD2upfooS1w@mail.gmail.com>
2014-06-17 18:47                   ` Christoph Lameter
2014-06-17 18:55                     ` Paul E. McKenney
2014-06-17 19:39                       ` Christoph Lameter
2014-06-17 19:47                         ` Tejun Heo
2014-06-17 19:56                         ` Paul E. McKenney
2014-06-19 20:39                           ` Christoph Lameter
2014-06-17 16:57         ` Paul E. McKenney
2014-06-17 18:56           ` Tejun Heo
2014-06-17 19:42             ` Christoph Lameter
2014-06-17 20:44               ` Tejun Heo
2014-07-09  0:55         ` Rusty Russell
2014-07-14 11:39           ` Paul E. McKenney
2014-07-14 15:22             ` Christoph Lameter
2014-07-15 10:11               ` Paul E. McKenney
2014-07-15 14:06                 ` Christoph Lameter
2014-07-15 14:32                   ` Paul E. McKenney
2014-07-15 15:06                     ` Christoph Lameter
2014-07-15 15:41                       ` Linus Torvalds
2014-07-15 16:12                         ` Christoph Lameter
     [not found]                           ` <CA+55aFxU166V5-vH4vmK9OBdTZKyede=71RjjbOVSN9Qh+Se+A@mail.gmail.com>
2014-07-15 17:45                             ` Paul E. McKenney
2014-07-15 17:41                       ` Paul E. McKenney
2014-07-16 14:40                         ` Christoph Lameter
2014-07-15 11:50             ` Rusty Russell
2014-06-17 19:27 ` Christoph Lameter
2014-06-17 19:40   ` Paul E. McKenney
2014-06-19 20:42     ` Christoph Lameter
2014-06-19 20:46       ` Tejun Heo
2014-06-19 21:11         ` Christoph Lameter
2014-06-19 21:15           ` Tejun Heo
2014-06-20 15:23             ` Christoph Lameter
2014-06-20 15:52               ` Tejun Heo
2014-06-19 20:51       ` Paul E. McKenney
2014-06-20 15:29         ` Christoph Lameter
2014-06-20 15:50           ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140612135630.GA23606@htj.dyndns.org \
    --to=tj@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=cl@linux-foundation.org \
    --cc=dhowells@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.