* [this_cpu_xx 00/11] Introduce this_cpu_xx operations
@ 2009-06-05 19:18 cl
  2009-06-05 19:18 ` [this_cpu_xx 01/11] Introduce this_cpu_ptr() and generic this_cpu_* operations cl
                   ` (10 more replies)
  0 siblings, 11 replies; 29+ messages in thread
From: cl @ 2009-06-05 19:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, mingo, rusty, davem

The patchset introduces various operations that allow efficient access
to per cpu variables for the current processor. Currently there is
no way in the core to calculate the address of the instance
of a per cpu variable without a table lookup through

	per_cpu_ptr(x, smp_processor_id())

The patchset introduces a way to calculate the address using the offset
that is available in arch specific ways (register or special memory
locations) using

	this_cpu_ptr(x)

In addition, operations are provided that can operate on per cpu
pointers. This is necessary so that the addresses generated by the
new per cpu allocator can be used with per cpu RMW instructions.

The arch-provided RMW instructions can be used to avoid having to switch
off preemption and interrupts for per cpu counter updates.
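
For illustration, a minimal sketch of the difference for a counter
update ("counters" is a made-up name standing for a pointer returned
by the per cpu allocator):

	/* Before: pin the cpu across the read-modify-write. */
	get_cpu();			/* disables preemption */
	(*per_cpu_ptr(counters, smp_processor_id()))++;
	put_cpu();			/* reenables preemption */

	/* After: a single operation; an arch like x86 can fold the
	 * cpu relocation and the increment into one instruction. */
	this_cpu_inc(*counters);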

One caveat with this patchset is that it currently does not work on S/390.
Tejun Heo has a patchset that fixes the SHIFT_PERCPU_PTR issues on that
platform. That patch is required before S/390 will work.

The patchset will reduce the code size and increase the speed of
operations on dynamically allocated per cpu based statistics.

The patches show how this could be done. There are many other places in
the code where these macros could be beneficial.

---
 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [this_cpu_xx 01/11] Introduce this_cpu_ptr() and generic this_cpu_* operations
  2009-06-05 19:18 [this_cpu_xx 00/11] Introduce this_cpu_xx operations cl
@ 2009-06-05 19:18 ` cl
  2009-06-10  5:12   ` Tejun Heo
  2009-06-17  8:19   ` Tejun Heo
  2009-06-05 19:18 ` [this_cpu_xx 02/11] Use this_cpu operations for SNMP statistics cl
                   ` (9 subsequent siblings)
  10 siblings, 2 replies; 29+ messages in thread
From: cl @ 2009-06-05 19:18 UTC (permalink / raw)
  To: linux-kernel
  Cc: Tejun Heo, David Howells, Ingo Molnar, Rusty Russell,
	Eric Dumazet, davem

[-- Attachment #1: this_cpu_ptr_intro --]
[-- Type: text/plain, Size: 8214 bytes --]

this_cpu_ptr(xx) = per_cpu_ptr(xx, smp_processor_id()).

The problem with per_cpu_ptr(x, smp_processor_id()) is that it requires
an array lookup to find the offset for the cpu. Processors typically
have the offset for the current cpu area in some kind of (arch dependent)
efficiently accessible register or memory location.

We can use that instead of doing the array lookup to speed up the
determination of the address of the percpu variable. This is particularly
significant because these lookups occur in performance critical paths
of the core kernel.
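
As a rough sketch (simplified from asm-generic/percpu.h):

	/* Generic path: index a table with the cpu number. */
	#define per_cpu_ptr(ptr, cpu) \
		SHIFT_PERCPU_PTR((ptr), per_cpu_offset((cpu)))

	/*
	 * this_cpu_ptr: the offset of the current cpu comes straight
	 * from a register or segment base (x86: %fs/%gs) or a fixed
	 * memory location, avoiding the __per_cpu_offset[] lookup.
	 */
	#define this_cpu_ptr(ptr) SHIFT_PERCPU_PTR(ptr, my_cpu_offset)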

This optimization is a prerequisite to the introduction of per processor
atomic operations for the core code. Atomic per processor operations
implicitly do the offset calculation to the current per cpu area in a
single instruction. All the locations touched by this patchset are potential
candidates for atomic per cpu operations.

this_cpu_ptr comes in two flavors. The preemption context matters since we
are referring to the currently executing processor. In many cases we must
ensure that the processor does not change while a code segment is executed;
a short usage sketch follows the two forms below.

__this_cpu_ptr 	-> Do not check for preemption context
this_cpu_ptr	-> Check preemption context
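
A minimal usage sketch (struct and field names are made up; assume
stats was returned by the per cpu allocator):

	struct my_stats *st;

	/* Preemption already off (e.g. in a softirq): the cheaper,
	 * unchecked form is fine. */
	st = __this_cpu_ptr(stats);

	/* Preemptible context: disable preemption around the access.
	 * this_cpu_ptr() can then (with CONFIG_DEBUG_PREEMPT) verify
	 * that the cpu cannot change under us. */
	preempt_disable();
	st = this_cpu_ptr(stats);
	st->events++;
	preempt_enable();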


Provide generic functions that are used if an arch does not define optimized
this_cpu operations. The functions also come in the two flavors. The first
parameter is a scalar that is pointed to by a pointer acquired through
allocpercpu or by taking the address of a per cpu variable.

The operations are guaranteed to be atomic vs preemption if they modify
the scalar (unless they are prefixed by __ in which case they do not need
to be). The calculation of the per cpu offset is also guaranteed to be
atomic. A usage sketch follows the list below.

this_cpu_read(scalar)
this_cpu_write(scalar, value)
this_cpu_add(scalar, value)
this_cpu_sub(scalar, value)
this_cpu_inc(scalar)
this_cpu_dec(scalar)
this_cpu_and(scalar, value)
this_cpu_or(scalar, value)
this_cpu_xor(scalar, value)
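
The promised sketch, with a dynamically allocated per cpu structure
(type, field and constant names are made up):

	struct my_mib {
		unsigned long mibs[NR_MIB_FIELDS];
	};

	struct my_mib *mib = alloc_percpu(struct my_mib);

	/* Safe from preemptible context: the offset calculation and
	 * the RMW are atomic vs preemption. */
	this_cpu_inc(mib->mibs[SOME_FIELD]);
	this_cpu_add(mib->mibs[SOME_FIELD], 16);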

The arches can override the defaults and provide atomic per cpu operations.
These atomic operations must provide both the relocation (x86 does it
through a segment override) and the operation on the data in a single
instruction. Otherwise preemption needs to be disabled and there is no
gain from providing arch implementations.

A third variant, prefixed by irqsafe_, is provided. These variants are safe
against hardware interrupts on the *same* processor (all per cpu atomic
primitives are *always* *only* providing safety for code running on the
*same* processor!). The operation needs to be implemented by the hardware
in such a way that it is a single RMW instruction that is processed either
entirely before or entirely after an interrupt.
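
A sketch of where the irqsafe_ variants matter (the stats structure is
made up):

	/* A counter that is also incremented from a hardware irq
	 * handler on this cpu: */
	irqsafe_cpu_inc(st->rx_errors);

	/* this_cpu_inc() would only disable preemption in the generic
	 * fallback; an interrupt arriving between the load and the
	 * store could then lose an update. */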

cc: David Howells <dhowells@redhat.com>
cc: Tejun Heo <tj@kernel.org>
cc: Ingo Molnar <mingo@elte.hu>
cc: Rusty Russell <rusty@rustcorp.com.au>
cc: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>

---
 include/asm-generic/percpu.h |    5 +
 include/linux/percpu.h       |  144 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 149 insertions(+)

Index: linux-2.6/include/linux/percpu.h
===================================================================
--- linux-2.6.orig/include/linux/percpu.h	2009-06-04 13:38:28.000000000 -0500
+++ linux-2.6/include/linux/percpu.h	2009-06-04 14:15:51.000000000 -0500
@@ -176,4 +176,148 @@ do {									\
 # define percpu_xor(var, val)		__percpu_generic_to_op(var, (val), ^=)
 #endif
 
+
+/*
+ * Optimized manipulation for memory allocated through the per cpu
+ * allocator or for addresses taken from per cpu variables.
+ *
+ * The first group is used for accesses that must be done in a
+ * preemption safe way since the calling context is not known to
+ * be preempt safe.
+ */
+#ifndef this_cpu_read
+# define this_cpu_read(pcp)						\
+  ({									\
+	*this_cpu_ptr(&(pcp));						\
+  })
+#endif
+
+#define _this_cpu_generic_to_op(pcp, val, op)				\
+do {									\
+	preempt_disable();						\
+	*__this_cpu_ptr(&pcp) op val;					\
+	preempt_enable_no_resched();					\
+} while (0)
+
+#ifndef this_cpu_write
+# define this_cpu_write(pcp, val)	__this_cpu_write((pcp), (val))
+#endif
+
+#ifndef this_cpu_add
+# define this_cpu_add(pcp, val)		_this_cpu_generic_to_op((pcp), (val), +=)
+#endif
+
+#ifndef this_cpu_sub
+# define this_cpu_sub(pcp, val)		this_cpu_add((pcp), -(val))
+#endif
+
+#ifndef this_cpu_inc
+# define this_cpu_inc(pcp)		this_cpu_add((pcp), 1)
+#endif
+
+#ifndef this_cpu_dec
+# define this_cpu_dec(pcp)		this_cpu_sub((pcp), 1)
+#endif
+
+#ifndef this_cpu_and
+# define this_cpu_and(pcp, val)		_this_cpu_generic_to_op((pcp), (val), &=)
+#endif
+
+#ifndef this_cpu_or
+# define this_cpu_or(pcp, val)		_this_cpu_generic_to_op((pcp), (val), |=)
+#endif
+
+#ifndef this_cpu_xor
+# define this_cpu_xor(pcp, val)		_this_cpu_generic_to_op((pcp), (val), ^=)
+#endif
+
+
+/*
+ * Generic percpu operations that do not require preemption handling.
+ * Either we do not care about races or the caller has the
+ * responsibility of handling preemption issues.
+ */
+#ifndef __this_cpu_read
+# define __this_cpu_read(pcp)						\
+  ({									\
+	*__this_cpu_ptr(&(pcp));					\
+  })
+#endif
+
+#define __this_cpu_generic_to_op(pcp, val, op)				\
+do {									\
+	*__this_cpu_ptr(&(pcp)) op val;					\
+} while (0)
+
+#ifndef __this_cpu_write
+# define __this_cpu_write(pcp, val)	__this_cpu_generic_to_op((pcp), (val), =)
+#endif
+
+#ifndef __this_cpu_add
+# define __this_cpu_add(pcp, val)	__this_cpu_generic_to_op((pcp), (val), +=)
+#endif
+
+#ifndef __this_cpu_sub
+# define __this_cpu_sub(pcp, val)	__this_cpu_add((pcp), -(val))
+#endif
+
+#ifndef __this_cpu_inc
+# define __this_cpu_inc(pcp)		__this_cpu_add((pcp), 1)
+#endif
+
+#ifndef __this_cpu_dec
+# define __this_cpu_dec(pcp)		__this_cpu_sub((pcp), 1)
+#endif
+
+#ifndef __this_cpu_and
+# define __this_cpu_and(pcp, val)	__this_cpu_generic_to_op((pcp), (val), &=)
+#endif
+
+#ifndef __this_cpu_or
+# define __this_cpu_or(pcp, val)	__this_cpu_generic_to_op((pcp), (val), |=)
+#endif
+
+#ifndef __this_cpu_xor
+# define __this_cpu_xor(pcp, val)	__this_cpu_generic_to_op((pcp), (val), ^=)
+#endif
+
+/*
+ * IRQ safe versions
+ */
+#define irqsafe_cpu_generic_to_op(pcp, val, op)				\
+do {									\
+	unsigned long flags;						\
+	local_irq_save(flags);						\
+	*__this_cpu_ptr(&(pcp)) op val;					\
+	local_irq_restore(flags);					\
+} while (0)
+
+#ifndef irqsafe_cpu_add
+# define irqsafe_cpu_add(pcp, val)	irqsafe_cpu_generic_to_op((pcp), (val), +=)
+#endif
+
+#ifndef irqsafe_cpu_sub
+# define irqsafe_cpu_sub(pcp, val)	irqsafe_cpu_add((pcp), -(val))
+#endif
+
+#ifndef irqsafe_cpu_inc
+# define irqsafe_cpu_inc(pcp)	irqsafe_cpu_add((pcp), 1)
+#endif
+
+#ifndef irqsafe_cpu_dec
+# define irqsafe_cpu_dec(pcp)	irqsafe_cpu_sub((pcp), 1)
+#endif
+
+#ifndef irqsafe_cpu_and
+# define irqsafe_cpu_and(pcp, val)	irqsafe_cpu_generic_to_op((pcp), (val), &=)
+#endif
+
+#ifndef irqsafe_cpu_or
+# define irqsafe_cpu_or(pcp, val)	irqsafe_cpu_generic_to_op((pcp), (val), |=)
+#endif
+
+#ifndef irqsafe_cpu_xor
+# define irqsafe_cpu_xor(pcp, val)	irqsafe_cpu_generic_to_op((pcp), (val), ^=)
+#endif
+
 #endif /* __LINUX_PERCPU_H */
Index: linux-2.6/include/asm-generic/percpu.h
===================================================================
--- linux-2.6.orig/include/asm-generic/percpu.h	2009-06-04 13:38:28.000000000 -0500
+++ linux-2.6/include/asm-generic/percpu.h	2009-06-04 13:47:10.000000000 -0500
@@ -56,6 +56,9 @@ extern unsigned long __per_cpu_offset[NR
 #define __raw_get_cpu_var(var) \
 	(*SHIFT_PERCPU_PTR(&per_cpu_var(var), __my_cpu_offset))
 
+#define this_cpu_ptr(ptr) SHIFT_PERCPU_PTR(ptr, my_cpu_offset)
+#define __this_cpu_ptr(ptr) SHIFT_PERCPU_PTR(ptr, __my_cpu_offset)
+
 
 #ifdef CONFIG_HAVE_SETUP_PER_CPU_AREA
 extern void setup_per_cpu_areas(void);
@@ -66,6 +69,8 @@ extern void setup_per_cpu_areas(void);
 #define per_cpu(var, cpu)			(*((void)(cpu), &per_cpu_var(var)))
 #define __get_cpu_var(var)			per_cpu_var(var)
 #define __raw_get_cpu_var(var)			per_cpu_var(var)
+#define this_cpu_ptr(ptr) per_cpu_ptr(ptr, 0)
+#define __this_cpu_ptr(ptr) this_cpu_ptr(ptr)
 
 #endif	/* SMP */
 

-- 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [this_cpu_xx 02/11] Use this_cpu operations for SNMP statistics
  2009-06-05 19:18 [this_cpu_xx 00/11] Introduce this_cpu_xx operations cl
  2009-06-05 19:18 ` [this_cpu_xx 01/11] Introduce this_cpu_ptr() and generic this_cpu_* operations cl
@ 2009-06-05 19:18 ` cl
  2009-06-05 19:18 ` [this_cpu_xx 03/11] Use this_cpu operations for NFS statistics cl
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 29+ messages in thread
From: cl @ 2009-06-05 19:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, mingo, rusty, davem

[-- Attachment #1: this_cpu_snmp --]
[-- Type: text/plain, Size: 2040 bytes --]

SNMP statistics macros can be significantly simplified.
This will also reduce code size if the arch supports these operations
in hardware.

Signed-off-by: Christoph Lameter <cl@linux-foundation.org>

---
 include/net/snmp.h |   37 ++++++++++++-------------------------
 1 file changed, 12 insertions(+), 25 deletions(-)

Index: linux-2.6/include/net/snmp.h
===================================================================
--- linux-2.6.orig/include/net/snmp.h	2009-06-03 13:48:02.000000000 -0500
+++ linux-2.6/include/net/snmp.h	2009-06-03 14:27:51.000000000 -0500
@@ -136,29 +136,16 @@ struct linux_xfrm_mib {
 #define SNMP_STAT_BHPTR(name)	(name[0])
 #define SNMP_STAT_USRPTR(name)	(name[1])
 
-#define SNMP_INC_STATS_BH(mib, field) 	\
-	(per_cpu_ptr(mib[0], raw_smp_processor_id())->mibs[field]++)
-#define SNMP_INC_STATS_USER(mib, field) \
-	do { \
-		per_cpu_ptr(mib[1], get_cpu())->mibs[field]++; \
-		put_cpu(); \
-	} while (0)
-#define SNMP_INC_STATS(mib, field) 	\
-	do { \
-		per_cpu_ptr(mib[!in_softirq()], get_cpu())->mibs[field]++; \
-		put_cpu(); \
-	} while (0)
-#define SNMP_DEC_STATS(mib, field) 	\
-	do { \
-		per_cpu_ptr(mib[!in_softirq()], get_cpu())->mibs[field]--; \
-		put_cpu(); \
-	} while (0)
-#define SNMP_ADD_STATS_BH(mib, field, addend) 	\
-	(per_cpu_ptr(mib[0], raw_smp_processor_id())->mibs[field] += addend)
-#define SNMP_ADD_STATS_USER(mib, field, addend) 	\
-	do { \
-		per_cpu_ptr(mib[1], get_cpu())->mibs[field] += addend; \
-		put_cpu(); \
-	} while (0)
-
+#define SNMP_INC_STATS_BH(mib, field)	\
+			__this_cpu_inc(mib[0]->mibs[field])
+#define SNMP_INC_STATS_USER(mib, field)	\
+			this_cpu_inc(mib[1]->mibs[field])
+#define SNMP_INC_STATS(mib, field)	\
+			this_cpu_inc(mib[!in_softirq()]->mibs[field])
+#define SNMP_DEC_STATS(mib, field)	\
+		 	this_cpu_dec(mib[!in_softirq()]->mibs[field])
+#define SNMP_ADD_STATS_BH(mib, field, addend)	\
+			__this_cpu_add(mib[0]->mibs[field], addend)
+#define SNMP_ADD_STATS_USER(mib, field, addend)	\
+			this_cpu_add(mib[1]->mibs[field], addend)
 #endif

-- 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [this_cpu_xx 03/11] Use this_cpu operations for NFS statistics
  2009-06-05 19:18 [this_cpu_xx 00/11] Introduce this_cpu_xx operations cl
  2009-06-05 19:18 ` [this_cpu_xx 01/11] Introduce this_cpu_ptr() and generic this_cpu_* operations cl
  2009-06-05 19:18 ` [this_cpu_xx 02/11] Use this_cpu operations for SNMP statistics cl
@ 2009-06-05 19:18 ` cl
  2009-06-05 19:18 ` [this_cpu_xx 04/11] Use this_cpu ops for network statistics cl
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 29+ messages in thread
From: cl @ 2009-06-05 19:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, mingo, rusty, davem

[-- Attachment #1: this_cpu_nfs --]
[-- Type: text/plain, Size: 1910 bytes --]

Simplify NFS statistics and allow the use of optimized
arch instructions.

Signed-off-by: Christoph Lameter <cl@linux-foundation.org>

---
 fs/nfs/iostat.h |   30 +++++++++---------------------
 1 file changed, 9 insertions(+), 21 deletions(-)

Index: linux-2.6/fs/nfs/iostat.h
===================================================================
--- linux-2.6.orig/fs/nfs/iostat.h	2009-06-04 17:40:55.000000000 -0500
+++ linux-2.6/fs/nfs/iostat.h	2009-06-05 12:51:54.000000000 -0500
@@ -25,13 +25,7 @@ struct nfs_iostats {
 static inline void nfs_inc_server_stats(const struct nfs_server *server,
 					enum nfs_stat_eventcounters stat)
 {
-	struct nfs_iostats *iostats;
-	int cpu;
-
-	cpu = get_cpu();
-	iostats = per_cpu_ptr(server->io_stats, cpu);
-	iostats->events[stat]++;
-	put_cpu_no_resched();
+	this_cpu_inc(server->io_stats->events[stat]);
 }
 
 static inline void nfs_inc_stats(const struct inode *inode,
@@ -44,13 +38,13 @@ static inline void nfs_add_server_stats(
 					enum nfs_stat_bytecounters stat,
 					unsigned long addend)
 {
-	struct nfs_iostats *iostats;
-	int cpu;
-
-	cpu = get_cpu();
-	iostats = per_cpu_ptr(server->io_stats, cpu);
-	iostats->bytes[stat] += addend;
-	put_cpu_no_resched();
+	/*
+	 * bytes is larger than word size on 32 bit platforms.
+	 * Thus we cannot use this_cpu_add() here.
+	 */
+	preempt_disable();
+	*this_cpu_ptr(&server->io_stats->bytes[stat]) += addend;
+	preempt_enable_no_resched();
 }
 
 static inline void nfs_add_stats(const struct inode *inode,
@@ -65,13 +59,7 @@ static inline void nfs_add_fscache_stats
 					 enum nfs_stat_fscachecounters stat,
 					 unsigned long addend)
 {
-	struct nfs_iostats *iostats;
-	int cpu;
-
-	cpu = get_cpu();
-	iostats = per_cpu_ptr(NFS_SERVER(inode)->io_stats, cpu);
-	iostats->fscache[stat] += addend;
-	put_cpu_no_resched();
+	this_cpu_add(NFS_SERVER(inode)->io_stats->fscache[stat], addend);
 }
 #endif
 

-- 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [this_cpu_xx 04/11] Use this_cpu ops for network statistics
  2009-06-05 19:18 [this_cpu_xx 00/11] Introduce this_cpu_xx operations cl
                   ` (2 preceding siblings ...)
  2009-06-05 19:18 ` [this_cpu_xx 03/11] Use this_cpu operations for NFS statistics cl
@ 2009-06-05 19:18 ` cl
  2009-06-08 11:27   ` Robin Holt
  2009-06-05 19:18 ` [this_cpu_xx 05/11] this_cpu_ptr: Straight transformations cl
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 29+ messages in thread
From: cl @ 2009-06-05 19:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, mingo, rusty, davem

[-- Attachment #1: this_cpu_net --]
[-- Type: text/plain, Size: 1848 bytes --]

Signed-off-by: Christoph Lameter <cl@linux-foundation.org>

---
 include/net/neighbour.h              |    7 +------
 include/net/netfilter/nf_conntrack.h |    9 ++-------
 2 files changed, 3 insertions(+), 13 deletions(-)

Index: linux-2.6/include/net/neighbour.h
===================================================================
--- linux-2.6.orig/include/net/neighbour.h	2009-06-03 16:23:29.000000000 -0500
+++ linux-2.6/include/net/neighbour.h	2009-06-03 16:36:23.000000000 -0500
@@ -89,12 +89,7 @@ struct neigh_statistics
 	unsigned long unres_discards;	/* number of unresolved drops */
 };
 
-#define NEIGH_CACHE_STAT_INC(tbl, field)				\
-	do {								\
-		preempt_disable();					\
-		(per_cpu_ptr((tbl)->stats, smp_processor_id())->field)++; \
-		preempt_enable();					\
-	} while (0)
+#define NEIGH_CACHE_STAT_INC(tbl, field) this_cpu_inc((tbl)->stats->field)
 
 struct neighbour
 {
Index: linux-2.6/include/net/netfilter/nf_conntrack.h
===================================================================
--- linux-2.6.orig/include/net/netfilter/nf_conntrack.h	2009-06-03 16:23:29.000000000 -0500
+++ linux-2.6/include/net/netfilter/nf_conntrack.h	2009-06-03 16:37:17.000000000 -0500
@@ -291,14 +291,9 @@ extern int nf_conntrack_set_hashsize(con
 extern unsigned int nf_conntrack_htable_size;
 extern unsigned int nf_conntrack_max;
 
-#define NF_CT_STAT_INC(net, count)	\
-	(per_cpu_ptr((net)->ct.stat, raw_smp_processor_id())->count++)
+#define NF_CT_STAT_INC(net, count) __this_cpu_inc((net)->ct.stat->count)
 #define NF_CT_STAT_INC_ATOMIC(net, count)		\
-do {							\
-	local_bh_disable();				\
-	per_cpu_ptr((net)->ct.stat, raw_smp_processor_id())->count++;	\
-	local_bh_enable();				\
-} while (0)
+	this_cpu_inc((net)->ct.stat->count)
 
 #define MODULE_ALIAS_NFCT_HELPER(helper) \
         MODULE_ALIAS("nfct-helper-" helper)

-- 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [this_cpu_xx 05/11] this_cpu_ptr: Straight transformations
  2009-06-05 19:18 [this_cpu_xx 00/11] Introduce this_cpu_xx operations cl
                   ` (3 preceding siblings ...)
  2009-06-05 19:18 ` [this_cpu_xx 04/11] Use this_cpu ops for network statistics cl
@ 2009-06-05 19:18 ` cl
  2009-06-05 19:18 ` [this_cpu_xx 06/11] Eliminate get/put_cpu cl
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 29+ messages in thread
From: cl @ 2009-06-05 19:18 UTC (permalink / raw)
  To: linux-kernel
  Cc: Tejun Heo, David Howells, Ingo Molnar, Rusty Russell,
	Eric Dumazet, davem

[-- Attachment #1: this_cpu_ptr_straight_transforms --]
[-- Type: text/plain, Size: 4260 bytes --]

Use this_cpu_ptr and __this_cpu_ptr in locations where straight
transformations are possible because per_cpu_ptr is used with
either smp_processor_id() or raw_smp_processor_id().

cc: David Howells <dhowells@redhat.com>
cc: Tejun Heo <tj@kernel.org>
cc: Ingo Molnar <mingo@elte.hu>
cc: Rusty Russell <rusty@rustcorp.com.au>
cc: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>

---
 drivers/infiniband/hw/ehca/ehca_irq.c       |    3 +--
 drivers/net/chelsio/sge.c                   |    5 ++---
 drivers/net/loopback.c                      |    2 +-
 fs/ext4/mballoc.c                           |    2 +-
 include/net/netfilter/nf_conntrack_ecache.h |    2 +-
 5 files changed, 6 insertions(+), 8 deletions(-)

Index: linux-2.6/drivers/net/chelsio/sge.c
===================================================================
--- linux-2.6.orig/drivers/net/chelsio/sge.c	2009-06-03 12:27:07.000000000 -0500
+++ linux-2.6/drivers/net/chelsio/sge.c	2009-06-03 12:38:53.000000000 -0500
@@ -1378,7 +1378,7 @@ static void sge_rx(struct sge *sge, stru
 	}
 	__skb_pull(skb, sizeof(*p));
 
-	st = per_cpu_ptr(sge->port_stats[p->iff], smp_processor_id());
+	st = this_cpu_ptr(sge->port_stats[p->iff]);
 
 	skb->protocol = eth_type_trans(skb, adapter->port[p->iff].dev);
 	if ((adapter->flags & RX_CSUM_ENABLED) && p->csum == 0xffff &&
@@ -1780,8 +1780,7 @@ int t1_start_xmit(struct sk_buff *skb, s
 {
 	struct adapter *adapter = dev->ml_priv;
 	struct sge *sge = adapter->sge;
-	struct sge_port_stats *st = per_cpu_ptr(sge->port_stats[dev->if_port],
-						smp_processor_id());
+	struct sge_port_stats *st = this_cpu_ptr(sge->port_stats[dev->if_port]);
 	struct cpl_tx_pkt *cpl;
 	struct sk_buff *orig_skb = skb;
 	int ret;
Index: linux-2.6/drivers/net/loopback.c
===================================================================
--- linux-2.6.orig/drivers/net/loopback.c	2009-06-03 12:27:07.000000000 -0500
+++ linux-2.6/drivers/net/loopback.c	2009-06-03 12:38:53.000000000 -0500
@@ -78,7 +78,7 @@ static int loopback_xmit(struct sk_buff 
 
 	/* it's OK to use per_cpu_ptr() because BHs are off */
 	pcpu_lstats = dev->ml_priv;
-	lb_stats = per_cpu_ptr(pcpu_lstats, smp_processor_id());
+	lb_stats = this_cpu_ptr(pcpu_lstats);
 	lb_stats->bytes += skb->len;
 	lb_stats->packets++;
 
Index: linux-2.6/fs/ext4/mballoc.c
===================================================================
--- linux-2.6.orig/fs/ext4/mballoc.c	2009-06-03 12:27:07.000000000 -0500
+++ linux-2.6/fs/ext4/mballoc.c	2009-06-03 12:38:53.000000000 -0500
@@ -4210,7 +4210,7 @@ static void ext4_mb_group_or_file(struct
 	 * per cpu locality group is to reduce the contention between block
 	 * request from multiple CPUs.
 	 */
-	ac->ac_lg = per_cpu_ptr(sbi->s_locality_groups, raw_smp_processor_id());
+	ac->ac_lg = __this_cpu_ptr(sbi->s_locality_groups);
 
 	/* we're going to use group allocation */
 	ac->ac_flags |= EXT4_MB_HINT_GROUP_ALLOC;
Index: linux-2.6/include/net/netfilter/nf_conntrack_ecache.h
===================================================================
--- linux-2.6.orig/include/net/netfilter/nf_conntrack_ecache.h	2009-06-03 12:27:07.000000000 -0500
+++ linux-2.6/include/net/netfilter/nf_conntrack_ecache.h	2009-06-03 12:38:53.000000000 -0500
@@ -39,7 +39,7 @@ nf_conntrack_event_cache(enum ip_conntra
 	struct nf_conntrack_ecache *ecache;
 
 	local_bh_disable();
-	ecache = per_cpu_ptr(net->ct.ecache, raw_smp_processor_id());
+	ecache = __this_cpu_ptr(net->ct.ecache);
 	if (ct != ecache->ct)
 		__nf_ct_event_cache_init(ct);
 	ecache->events |= event;
Index: linux-2.6/drivers/infiniband/hw/ehca/ehca_irq.c
===================================================================
--- linux-2.6.orig/drivers/infiniband/hw/ehca/ehca_irq.c	2009-06-03 12:27:07.000000000 -0500
+++ linux-2.6/drivers/infiniband/hw/ehca/ehca_irq.c	2009-06-03 12:38:53.000000000 -0500
@@ -827,8 +827,7 @@ static void __cpuinit take_over_work(str
 		cq = list_entry(cct->cq_list.next, struct ehca_cq, entry);
 
 		list_del(&cq->entry);
-		__queue_comp_task(cq, per_cpu_ptr(pool->cpu_comp_tasks,
-						  smp_processor_id()));
+		__queue_comp_task(cq, this_cpu_ptr(pool->cpu_comp_tasks));
 	}
 
 	spin_unlock_irqrestore(&cct->task_lock, flags_cct);

-- 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [this_cpu_xx 06/11] Eliminate get/put_cpu
  2009-06-05 19:18 [this_cpu_xx 00/11] Introduce this_cpu_xx operations cl
                   ` (4 preceding siblings ...)
  2009-06-05 19:18 ` [this_cpu_xx 05/11] this_cpu_ptr: Straight transformations cl
@ 2009-06-05 19:18 ` cl
  2009-06-05 19:34   ` Dan Williams
  2009-06-05 19:18 ` [this_cpu_xx 07/11] xfs_icsb_modify_counters does not need "cpu" variable cl
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 29+ messages in thread
From: cl @ 2009-06-05 19:18 UTC (permalink / raw)
  To: linux-kernel
  Cc: Tejun Heo, Dan Williams, Eric Biederman, Stephen Hemminger,
	Trond Myklebust, Herbert Xu, David L Stevens, mingo, rusty,
	davem

[-- Attachment #1: this_cpu_ptr_eliminate_get_put_cpu --]
[-- Type: text/plain, Size: 4482 bytes --]

There are cases where we can use this_cpu_ptr() and, as a result,
no longer need to determine the currently executing cpu.

In those places no get/put_cpu combination is needed anymore.
The local cpu variable can be eliminated.

Preemption still needs to be disabled and enabled since the
modifications of the per cpu variables are not atomic. There may
be multiple per cpu variables modified and those must all
be from the same processor.

cc: Dan Williams <dan.j.williams@intel.com>
cc: Eric Biederman <ebiederm@aristanetworks.com>
cc: Stephen Hemminger <shemminger@vyatta.com>
cc: Trond Myklebust <Trond.Myklebust@netapp.com>
cc: Herbert Xu <herbert@gondor.apana.org.au>
cc: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>

---
 drivers/dma/dmaengine.c |   36 +++++++++++++-----------------------
 drivers/net/veth.c      |    7 +++----
 2 files changed, 16 insertions(+), 27 deletions(-)

Index: linux-2.6/drivers/dma/dmaengine.c
===================================================================
--- linux-2.6.orig/drivers/dma/dmaengine.c	2009-06-04 13:38:15.000000000 -0500
+++ linux-2.6/drivers/dma/dmaengine.c	2009-06-04 14:19:43.000000000 -0500
@@ -326,14 +326,7 @@ arch_initcall(dma_channel_table_init);
  */
 struct dma_chan *dma_find_channel(enum dma_transaction_type tx_type)
 {
-	struct dma_chan *chan;
-	int cpu;
-
-	cpu = get_cpu();
-	chan = per_cpu_ptr(channel_table[tx_type], cpu)->chan;
-	put_cpu();
-
-	return chan;
+	return this_cpu_read(channel_table[tx_type]->chan);
 }
 EXPORT_SYMBOL(dma_find_channel);
 
@@ -803,7 +796,6 @@ dma_async_memcpy_buf_to_buf(struct dma_c
 	struct dma_async_tx_descriptor *tx;
 	dma_addr_t dma_dest, dma_src;
 	dma_cookie_t cookie;
-	int cpu;
 	unsigned long flags;
 
 	dma_src = dma_map_single(dev->dev, src, len, DMA_TO_DEVICE);
@@ -822,10 +814,10 @@ dma_async_memcpy_buf_to_buf(struct dma_c
 	tx->callback = NULL;
 	cookie = tx->tx_submit(tx);
 
-	cpu = get_cpu();
-	per_cpu_ptr(chan->local, cpu)->bytes_transferred += len;
-	per_cpu_ptr(chan->local, cpu)->memcpy_count++;
-	put_cpu();
+	preempt_disable();
+	__this_cpu_add(chan->local->bytes_transferred, len);
+	__this_cpu_inc(chan->local->memcpy_count);
+	preempt_enable();
 
 	return cookie;
 }
@@ -852,7 +844,6 @@ dma_async_memcpy_buf_to_pg(struct dma_ch
 	struct dma_async_tx_descriptor *tx;
 	dma_addr_t dma_dest, dma_src;
 	dma_cookie_t cookie;
-	int cpu;
 	unsigned long flags;
 
 	dma_src = dma_map_single(dev->dev, kdata, len, DMA_TO_DEVICE);
@@ -869,10 +860,10 @@ dma_async_memcpy_buf_to_pg(struct dma_ch
 	tx->callback = NULL;
 	cookie = tx->tx_submit(tx);
 
-	cpu = get_cpu();
-	per_cpu_ptr(chan->local, cpu)->bytes_transferred += len;
-	per_cpu_ptr(chan->local, cpu)->memcpy_count++;
-	put_cpu();
+	preempt_disable();
+	__this_cpu_add(chan->local->bytes_transferred, len);
+	__this_cpu_inc(chan->local->memcpy_count);
+	preempt_enable();
 
 	return cookie;
 }
@@ -901,7 +892,6 @@ dma_async_memcpy_pg_to_pg(struct dma_cha
 	struct dma_async_tx_descriptor *tx;
 	dma_addr_t dma_dest, dma_src;
 	dma_cookie_t cookie;
-	int cpu;
 	unsigned long flags;
 
 	dma_src = dma_map_page(dev->dev, src_pg, src_off, len, DMA_TO_DEVICE);
@@ -919,10 +909,10 @@ dma_async_memcpy_pg_to_pg(struct dma_cha
 	tx->callback = NULL;
 	cookie = tx->tx_submit(tx);
 
-	cpu = get_cpu();
-	per_cpu_ptr(chan->local, cpu)->bytes_transferred += len;
-	per_cpu_ptr(chan->local, cpu)->memcpy_count++;
-	put_cpu();
+	preempt_disable();
+	__this_cpu_add(chan->local->bytes_transferred, len);
+	__this_cpu_inc(chan->local->memcpy_count);
+	preempt_enable();
 
 	return cookie;
 }
Index: linux-2.6/drivers/net/veth.c
===================================================================
--- linux-2.6.orig/drivers/net/veth.c	2009-06-04 13:38:15.000000000 -0500
+++ linux-2.6/drivers/net/veth.c	2009-06-04 14:18:00.000000000 -0500
@@ -153,7 +153,7 @@ static int veth_xmit(struct sk_buff *skb
 	struct net_device *rcv = NULL;
 	struct veth_priv *priv, *rcv_priv;
 	struct veth_net_stats *stats, *rcv_stats;
-	int length, cpu;
+	int length;
 
 	skb_orphan(skb);
 
@@ -161,9 +161,8 @@ static int veth_xmit(struct sk_buff *skb
 	rcv = priv->peer;
 	rcv_priv = netdev_priv(rcv);
 
-	cpu = smp_processor_id();
-	stats = per_cpu_ptr(priv->stats, cpu);
-	rcv_stats = per_cpu_ptr(rcv_priv->stats, cpu);
+	stats = this_cpu_ptr(priv->stats);
+	rcv_stats = this_cpu_ptr(rcv_priv->stats);
 
 	if (!(rcv->flags & IFF_UP))
 		goto tx_drop;

-- 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [this_cpu_xx 07/11] xfs_icsb_modify_counters does not need "cpu" variable
  2009-06-05 19:18 [this_cpu_xx 00/11] Introduce this_cpu_xx operations cl
                   ` (5 preceding siblings ...)
  2009-06-05 19:18 ` [this_cpu_xx 06/11] Eliminate get/put_cpu cl
@ 2009-06-05 19:18 ` cl
  2009-06-05 19:22   ` Christoph Hellwig
  2009-06-05 19:18 ` [this_cpu_xx 08/11] Use this_cpu_ptr in crypto subsystem cl
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 29+ messages in thread
From: cl @ 2009-06-05 19:18 UTC (permalink / raw)
  To: linux-kernel
  Cc: Tejun Heo, Christoph Hellwig, Olaf Weber, mingo, rusty, davem

[-- Attachment #1: this_cpu_ptr_xfs --]
[-- Type: text/plain, Size: 1449 bytes --]

The xfs_icsb_modify_counters() function no longer needs the cpu variable
if we use this_cpu_ptr() and we can get rid of get/put_cpu().

cc: Christoph Hellwig <hch@lst.de>
Acked-by: Olaf Weber <olaf@sgi.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>

---
 fs/xfs/xfs_mount.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

Index: linux-2.6/fs/xfs/xfs_mount.c
===================================================================
--- linux-2.6.orig/fs/xfs/xfs_mount.c	2009-05-28 15:03:50.000000000 -0500
+++ linux-2.6/fs/xfs/xfs_mount.c	2009-05-28 15:09:05.000000000 -0500
@@ -2320,12 +2320,12 @@ xfs_icsb_modify_counters(
 {
 	xfs_icsb_cnts_t	*icsbp;
 	long long	lcounter;	/* long counter for 64 bit fields */
-	int		cpu, ret = 0;
+	int		ret = 0;
 
 	might_sleep();
 again:
-	cpu = get_cpu();
-	icsbp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, cpu);
+	preempt_disable();
+	icsbp = (xfs_icsb_cnts_t *)this_cpu_ptr(mp->m_sb_cnts);
 
 	/*
 	 * if the counter is disabled, go to slow path
@@ -2369,11 +2369,11 @@ again:
 		break;
 	}
 	xfs_icsb_unlock_cntr(icsbp);
-	put_cpu();
+	preempt_enable();
 	return 0;
 
 slow_path:
-	put_cpu();
+	preempt_enable();
 
 	/*
 	 * serialise with a mutex so we don't burn lots of cpu on
@@ -2421,7 +2421,7 @@ slow_path:
 
 balance_counter:
 	xfs_icsb_unlock_cntr(icsbp);
-	put_cpu();
+	preempt_enable();
 
 	/*
 	 * We may have multiple threads here if multiple per-cpu

-- 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [this_cpu_xx 08/11] Use this_cpu_ptr in crypto subsystem
  2009-06-05 19:18 [this_cpu_xx 00/11] Introduce this_cpu_xx operations cl
                   ` (6 preceding siblings ...)
  2009-06-05 19:18 ` [this_cpu_xx 07/11] xfs_icsb_modify_counters does not need "cpu" variable cl
@ 2009-06-05 19:18 ` cl
  2009-06-05 19:18 ` [this_cpu_xx 09/11] X86 optimized this_cpu operations cl
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 29+ messages in thread
From: cl @ 2009-06-05 19:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, Huang Ying, mingo, rusty, davem

[-- Attachment #1: this_cpu_ptr_crypto --]
[-- Type: text/plain, Size: 912 bytes --]

Just a slight optimization that removes one array lookup.
The processor number is needed for other things as well so the
get/put_cpu cannot be removed.

Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>

---
 crypto/cryptd.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6/crypto/cryptd.c
===================================================================
--- linux-2.6.orig/crypto/cryptd.c	2009-05-27 11:55:20.000000000 -0500
+++ linux-2.6/crypto/cryptd.c	2009-05-27 11:56:55.000000000 -0500
@@ -93,7 +93,7 @@ static int cryptd_enqueue_request(struct
 	struct cryptd_cpu_queue *cpu_queue;
 
 	cpu = get_cpu();
-	cpu_queue = per_cpu_ptr(queue->cpu_queue, cpu);
+	cpu_queue = this_cpu_ptr(queue->cpu_queue);
 	err = crypto_enqueue_request(&cpu_queue->queue, request);
 	queue_work_on(cpu, kcrypto_wq, &cpu_queue->work);
 	put_cpu();

-- 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [this_cpu_xx 09/11] X86 optimized this_cpu operations
  2009-06-05 19:18 [this_cpu_xx 00/11] Introduce this_cpu_xx operations cl
                   ` (7 preceding siblings ...)
  2009-06-05 19:18 ` [this_cpu_xx 08/11] Use this_cpu_ptr in crypto subsystem cl
@ 2009-06-05 19:18 ` cl
  2009-06-05 19:18 ` [this_cpu_xx 10/11] Use this_cpu ops for vm statistics cl
  2009-06-05 19:18 ` [this_cpu_xx 11/11] RCU: Use this_cpu operations cl
  10 siblings, 0 replies; 29+ messages in thread
From: cl @ 2009-06-05 19:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, mingo, rusty, davem

[-- Attachment #1: this_cpu_x86_ops --]
[-- Type: text/plain, Size: 2253 bytes --]

Basically the existing percpu ops can be used. However, we do not pass a
reference to a percpu variable in. Instead the address of a percpu variable
is provided.

The preempt, non-preempt and irqsafe operations all generate the same code.
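
For example (a code generation sketch, not part of the patch; the
exact assembly depends on compiler and configuration):

	this_cpu_inc(x);

	/* ends up, via percpu_to_op("add", ...), as roughly
	 *
	 *	addq $1, %gs:x
	 *
	 * The %gs segment override does the cpu relocation and the add
	 * is a single RMW instruction, so neither preemption nor
	 * interrupts need to be disabled around it.
	 */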

Signed-off-by: Christoph Lameter <cl@linux-foundation.org>

---
 arch/x86/include/asm/percpu.h |   22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

Index: linux-2.6/arch/x86/include/asm/percpu.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/percpu.h	2009-06-04 13:38:01.000000000 -0500
+++ linux-2.6/arch/x86/include/asm/percpu.h	2009-06-04 14:21:22.000000000 -0500
@@ -140,6 +140,28 @@ do {							\
 #define percpu_or(var, val)	percpu_to_op("or", per_cpu__##var, val)
 #define percpu_xor(var, val)	percpu_to_op("xor", per_cpu__##var, val)
 
+#define __this_cpu_read(pcp)		percpu_from_op("mov", pcp)
+#define __this_cpu_write(pcp, val)	percpu_to_op("mov", (pcp), val)
+#define __this_cpu_add(pcp, val)	percpu_to_op("add", (pcp), val)
+#define __this_cpu_sub(pcp, val)	percpu_to_op("sub", (pcp), val)
+#define __this_cpu_and(pcp, val)	percpu_to_op("and", (pcp), val)
+#define __this_cpu_or(pcp, val)		percpu_to_op("or", (pcp), val)
+#define __this_cpu_xor(pcp, val)	percpu_to_op("xor", (pcp), val)
+
+#define this_cpu_read(pcp)		percpu_from_op("mov", (pcp))
+#define this_cpu_write(pcp, val)	percpu_to_op("mov", (pcp), val)
+#define this_cpu_add(pcp, val)		percpu_to_op("add", (pcp), val)
+#define this_cpu_sub(pcp, val)		percpu_to_op("sub", (pcp), val)
+#define this_cpu_and(pcp, val)		percpu_to_op("and", (pcp), val)
+#define this_cpu_or(pcp, val)		percpu_to_op("or", (pcp), val)
+#define this_cpu_xor(pcp, val)		percpu_to_op("xor", (pcp), val)
+
+#define irqsafe_cpu_add(pcp, val)	percpu_to_op("add", (pcp), val)
+#define irqsafe_cpu_sub(pcp, val)	percpu_to_op("sub", (pcp), val)
+#define irqsafe_cpu_and(pcp, val)	percpu_to_op("and", (pcp), val)
+#define irqsafe_cpu_or(pcp, val)	percpu_to_op("or", (pcp), val)
+#define irqsafe_cpu_xor(pcp, val)	percpu_to_op("xor", (pcp), val)
+
 /* This is not atomic against other CPUs -- CPU preemption needs to be off */
 #define x86_test_and_clear_bit_percpu(bit, var)				\
 ({									\

-- 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [this_cpu_xx 10/11] Use this_cpu ops for vm statistics.
  2009-06-05 19:18 [this_cpu_xx 00/11] Introduce this_cpu_xx operations cl
                   ` (8 preceding siblings ...)
  2009-06-05 19:18 ` [this_cpu_xx 09/11] X86 optimized this_cpu operations cl
@ 2009-06-05 19:18 ` cl
  2009-06-05 19:18 ` [this_cpu_xx 11/11] RCU: Use this_cpu operations cl
  10 siblings, 0 replies; 29+ messages in thread
From: cl @ 2009-06-05 19:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, mingo, rusty, davem

[-- Attachment #1: this_cpu_vmstats --]
[-- Type: text/plain, Size: 1307 bytes --]

Signed-off-by: Christoph Lameter <cl@linux-foundation.org>

---
 include/linux/vmstat.h |   10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

Index: linux-2.6/include/linux/vmstat.h
===================================================================
--- linux-2.6.orig/include/linux/vmstat.h	2009-06-04 14:23:32.000000000 -0500
+++ linux-2.6/include/linux/vmstat.h	2009-06-04 14:36:04.000000000 -0500
@@ -75,24 +75,22 @@ DECLARE_PER_CPU(struct vm_event_state, v
 
 static inline void __count_vm_event(enum vm_event_item item)
 {
-	__get_cpu_var(vm_event_states).event[item]++;
+	__this_cpu_inc(per_cpu_var(vm_event_states).event[item]);
 }
 
 static inline void count_vm_event(enum vm_event_item item)
 {
-	get_cpu_var(vm_event_states).event[item]++;
-	put_cpu();
+	this_cpu_inc(per_cpu_var(vm_event_states).event[item]);
 }
 
 static inline void __count_vm_events(enum vm_event_item item, long delta)
 {
-	__get_cpu_var(vm_event_states).event[item] += delta;
+	__this_cpu_add(per_cpu_var(vm_event_states).event[item], delta);
 }
 
 static inline void count_vm_events(enum vm_event_item item, long delta)
 {
-	get_cpu_var(vm_event_states).event[item] += delta;
-	put_cpu();
+	this_cpu_add(per_cpu_var(vm_event_states).event[item], delta);
 }
 
 extern void all_vm_events(unsigned long *);

-- 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [this_cpu_xx 11/11] RCU: Use this_cpu operations
  2009-06-05 19:18 [this_cpu_xx 00/11] Introduce this_cpu_xx operations cl
                   ` (9 preceding siblings ...)
  2009-06-05 19:18 ` [this_cpu_xx 10/11] Use this_cpu ops for vm statistics cl
@ 2009-06-05 19:18 ` cl
  2009-06-10 17:42   ` Paul E. McKenney
  10 siblings, 1 reply; 29+ messages in thread
From: cl @ 2009-06-05 19:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, mingo, rusty, davem

[-- Attachment #1: this_cpu_rcu --]
[-- Type: text/plain, Size: 2825 bytes --]

RCU does not do dynamic allocations but it increments per cpu variables
a lot. These operations result in a move to a register and then back
to memory. This patch makes RCU use the inc/dec instructions on x86
that do not need a register.
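
Roughly, on x86-64 (a sketch; the exact code depends on the compiler):

	/* __get_cpu_var(rcu_dyntick_sched).dynticks++ :
	 *
	 *	movq %gs:off, %rax	# load counter into a register
	 *	incq %rax
	 *	movq %rax, %gs:off	# store it back
	 *
	 * __this_cpu_inc(per_cpu_var(rcu_dyntick_sched).dynticks) :
	 *
	 *	addq $1, %gs:off	# one RMW, no scratch register
	 */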

Signed-off-by: Christoph Lameter <cl@linux-foundation.org>

---
 kernel/rcupreempt.c |    4 ++--
 kernel/rcutorture.c |    8 ++++----
 2 files changed, 6 insertions(+), 6 deletions(-)

Index: linux-2.6/kernel/rcutorture.c
===================================================================
--- linux-2.6.orig/kernel/rcutorture.c	2009-06-04 14:26:42.000000000 -0500
+++ linux-2.6/kernel/rcutorture.c	2009-06-04 14:38:05.000000000 -0500
@@ -709,13 +709,13 @@ static void rcu_torture_timer(unsigned l
 		/* Should not happen, but... */
 		pipe_count = RCU_TORTURE_PIPE_LEN;
 	}
-	++__get_cpu_var(rcu_torture_count)[pipe_count];
+	__this_cpu_inc(per_cpu_var(rcu_torture_count)[pipe_count]);
 	completed = cur_ops->completed() - completed;
 	if (completed > RCU_TORTURE_PIPE_LEN) {
 		/* Should not happen, but... */
 		completed = RCU_TORTURE_PIPE_LEN;
 	}
-	++__get_cpu_var(rcu_torture_batch)[completed];
+	__this_cpu_inc(per_cpu_var(rcu_torture_batch)[completed]);
 	preempt_enable();
 	cur_ops->readunlock(idx);
 }
@@ -764,13 +764,13 @@ rcu_torture_reader(void *arg)
 			/* Should not happen, but... */
 			pipe_count = RCU_TORTURE_PIPE_LEN;
 		}
-		++__get_cpu_var(rcu_torture_count)[pipe_count];
+		__this_cpu_inc(per_cpu_var(rcu_torture_count)[pipe_count]);
 		completed = cur_ops->completed() - completed;
 		if (completed > RCU_TORTURE_PIPE_LEN) {
 			/* Should not happen, but... */
 			completed = RCU_TORTURE_PIPE_LEN;
 		}
-		++__get_cpu_var(rcu_torture_batch)[completed];
+		__this_cpu_inc(per_cpu_var(rcu_torture_batch)[completed]);
 		preempt_enable();
 		cur_ops->readunlock(idx);
 		schedule();
Index: linux-2.6/kernel/rcupreempt.c
===================================================================
--- linux-2.6.orig/kernel/rcupreempt.c	2009-06-04 14:28:53.000000000 -0500
+++ linux-2.6/kernel/rcupreempt.c	2009-06-04 14:39:35.000000000 -0500
@@ -173,7 +173,7 @@ void rcu_enter_nohz(void)
 	static DEFINE_RATELIMIT_STATE(rs, 10 * HZ, 1);
 
 	smp_mb(); /* CPUs seeing ++ must see prior RCU read-side crit sects */
-	__get_cpu_var(rcu_dyntick_sched).dynticks++;
+	__this_cpu_inc(per_cpu_var(rcu_dyntick_sched).dynticks);
 	WARN_ON_RATELIMIT(__get_cpu_var(rcu_dyntick_sched).dynticks & 0x1, &rs);
 }
 
@@ -181,7 +181,7 @@ void rcu_exit_nohz(void)
 {
 	static DEFINE_RATELIMIT_STATE(rs, 10 * HZ, 1);
 
-	__get_cpu_var(rcu_dyntick_sched).dynticks++;
+	__this_cpu_inc(per_cpu_var(rcu_dyntick_sched).dynticks);
 	smp_mb(); /* CPUs seeing ++ must see later RCU read-side crit sects */
 	WARN_ON_RATELIMIT(!(__get_cpu_var(rcu_dyntick_sched).dynticks & 0x1),
 				&rs);

-- 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [this_cpu_xx 07/11] xfs_icsb_modify_counters does not need "cpu" variable
  2009-06-05 19:18 ` [this_cpu_xx 07/11] xfs_icsb_modify_counters does not need "cpu" variable cl
@ 2009-06-05 19:22   ` Christoph Hellwig
  2009-06-05 19:36     ` Christoph Lameter
  0 siblings, 1 reply; 29+ messages in thread
From: Christoph Hellwig @ 2009-06-05 19:22 UTC (permalink / raw)
  To: cl
  Cc: linux-kernel, Tejun Heo, Christoph Hellwig, Olaf Weber, mingo,
	rusty, davem

On Fri, Jun 05, 2009 at 03:18:26PM -0400, cl@linux-foundation.org wrote:
> The xfs_icsb_modify_counters() function no longer needs the cpu variable
> if we use this_cpu_ptr() and we can get rid of get/put_cpu().

Looks good to me.  While you're at it you might also remove the
superfluous cast of the this_cpu_ptr return value.


Reviewed-by: Christoph Hellwig <hch@lst.de>

Btw, any reason this_cpu_ptr doesn't do the preempt_disable itself
and has something paired to reverse it?


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [this_cpu_xx 06/11] Eliminate get/put_cpu
  2009-06-05 19:18 ` [this_cpu_xx 06/11] Eliminate get/put_cpu cl
@ 2009-06-05 19:34   ` Dan Williams
  2009-06-09 14:02     ` Sosnowski, Maciej
  0 siblings, 1 reply; 29+ messages in thread
From: Dan Williams @ 2009-06-05 19:34 UTC (permalink / raw)
  To: cl
  Cc: linux-kernel, Tejun Heo, Eric Biederman, Stephen Hemminger,
	Trond Myklebust, Herbert Xu, David L Stevens, mingo, rusty,
	davem, Sosnowski, Maciej

[ added Maciej to the cc ]

On Fri, Jun 5, 2009 at 12:18 PM, <cl@linux-foundation.org> wrote:
> There are cases where we can use this_cpu_ptr() and, as a result,
> no longer need to determine the currently executing cpu.
>
> In those places no get/put_cpu combination is needed anymore.
> The local cpu variable can be eliminated.
>
> Preemption still needs to be disabled and enabled since the
> modifications of the per cpu variables are not atomic. There may
> be multiple per cpu variables modified and those must all
> be from the same processor.
>
> cc: Dan Williams <dan.j.williams@intel.com>

Acked-by: Dan Williams <dan.j.williams@intel.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [this_cpu_xx 07/11] xfs_icsb_modify_counters does not need "cpu" variable
  2009-06-05 19:22   ` Christoph Hellwig
@ 2009-06-05 19:36     ` Christoph Lameter
  0 siblings, 0 replies; 29+ messages in thread
From: Christoph Lameter @ 2009-06-05 19:36 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, Tejun Heo, Olaf Weber, mingo, rusty, davem

On Fri, 5 Jun 2009, Christoph Hellwig wrote:

> Looks good to me.  While you're at it you might also remove the
> superfluous cast of the this_cpu_ptr return value.

Ok.

> Reviewed-by: Christoph Hellwig <hch@lst.de>
>
> Btw, any reason this_cpu_ptr doesn't do the preempt_disable itself
> and has something paired to reverse it?

Would break the symmetry with the atomic per cpu ops introduced in the
same patch. Putting preempt side effects and RMWs together is making
things a bit complicated.

Also if the caller manages the preemption explicitly (like this piece of
code) then it may be better to have separate statements for clarity.




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [this_cpu_xx 04/11] Use this_cpu ops for network statistics
  2009-06-05 19:18 ` [this_cpu_xx 04/11] Use this_cpu ops for network statistics cl
@ 2009-06-08 11:27   ` Robin Holt
  2009-06-08 20:49     ` Christoph Lameter
  0 siblings, 1 reply; 29+ messages in thread
From: Robin Holt @ 2009-06-08 11:27 UTC (permalink / raw)
  To: cl; +Cc: linux-kernel, Tejun Heo, mingo, rusty, davem

On Fri, Jun 05, 2009 at 03:18:23PM -0400, cl@linux-foundation.org wrote:
...
> --- linux-2.6.orig/include/net/netfilter/nf_conntrack.h	2009-06-03 16:23:29.000000000 -0500
...
>  #define NF_CT_STAT_INC_ATOMIC(net, count)		\
> -do {							\
> -	local_bh_disable();				\
> -	per_cpu_ptr((net)->ct.stat, raw_smp_processor_id())->count++;	\
> -	local_bh_enable();				\
> -} while (0)
> +	this_cpu_inc((net)->ct.stat->count)

Why not put this on one line?
#define NF_CT_STAT_INC_ATOMIC(net, count)  this_cpu_inc((net)->ct.stat->count)

Robin

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [this_cpu_xx 04/11] Use this_cpu ops for network statistics
  2009-06-08 11:27   ` Robin Holt
@ 2009-06-08 20:49     ` Christoph Lameter
  2009-06-08 20:54       ` Ingo Molnar
  0 siblings, 1 reply; 29+ messages in thread
From: Christoph Lameter @ 2009-06-08 20:49 UTC (permalink / raw)
  To: Robin Holt; +Cc: linux-kernel, Tejun Heo, mingo, rusty, davem

On Mon, 8 Jun 2009, Robin Holt wrote:

> > +	this_cpu_inc((net)->ct.stat->count)
>
> Why not put this on one line?
> #define NF_CT_STAT_INC_ATOMIC(net, count)  this_cpu_inc((net)->ct.stat->count)

Would be more than 78 characters in a line.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [this_cpu_xx 04/11] Use this_cpu ops for network statistics
  2009-06-08 20:49     ` Christoph Lameter
@ 2009-06-08 20:54       ` Ingo Molnar
  0 siblings, 0 replies; 29+ messages in thread
From: Ingo Molnar @ 2009-06-08 20:54 UTC (permalink / raw)
  To: Christoph Lameter, Andy Whitcroft
  Cc: Robin Holt, linux-kernel, Tejun Heo, rusty, davem


* Christoph Lameter <cl@linux-foundation.org> wrote:

> On Mon, 8 Jun 2009, Robin Holt wrote:
> 
> > > +	this_cpu_inc((net)->ct.stat->count)
> >
> > Why not put this on one line?
> > #define NF_CT_STAT_INC_ATOMIC(net, count)  this_cpu_inc((net)->ct.stat->count)
> 
> Would be more than 78 characters in a line.

I think you can ignore such types of warnings in general. I think 
checkpatch should be silent up to 90 chars or so, if there's not 
more than say 3 tabs in that line. (if there's a lot of tabs that 
means the indentation level is wrong.)

	Ingo

^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: [this_cpu_xx 06/11] Eliminate get/put_cpu
  2009-06-05 19:34   ` Dan Williams
@ 2009-06-09 14:02     ` Sosnowski, Maciej
  0 siblings, 0 replies; 29+ messages in thread
From: Sosnowski, Maciej @ 2009-06-09 14:02 UTC (permalink / raw)
  To: Williams, Dan J, cl
  Cc: linux-kernel, Tejun Heo, Eric Biederman, Stephen Hemminger,
	Trond Myklebust, Herbert Xu, David L Stevens, mingo, rusty,
	davem

Dan Williams wrote:
> [ added Maciej to the cc ]

Thanks Dan.

> 
> On Fri, Jun 5, 2009 at 12:18 PM, <cl@linux-foundation.org> wrote:
>> There are cases where we can use this_cpu_ptr() and, as a result,
>> no longer need to determine the currently executing cpu.
>> 
>> In those places no get/put_cpu combination is needed anymore.
>> The local cpu variable can be eliminated.
>> 
>> Preemption still needs to be disabled and enabled since the
>> modifications of the per cpu variables are not atomic. There may
>> be multiple per cpu variables modified and those must all
>> be from the same processor.
>> 
>> cc: Dan Williams <dan.j.williams@intel.com>
> 
> Acked-by: Dan Williams <dan.j.williams@intel.com>

Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [this_cpu_xx 01/11] Introduce this_cpu_ptr() and generic this_cpu_* operations
  2009-06-05 19:18 ` [this_cpu_xx 01/11] Introduce this_cpu_ptr() and generic this_cpu_* operations cl
@ 2009-06-10  5:12   ` Tejun Heo
  2009-06-11 15:10     ` Christoph Lameter
  2009-06-17  8:19   ` Tejun Heo
  1 sibling, 1 reply; 29+ messages in thread
From: Tejun Heo @ 2009-06-10  5:12 UTC (permalink / raw)
  To: cl
  Cc: linux-kernel, David Howells, Ingo Molnar, Rusty Russell,
	Eric Dumazet, davem

Hello,

cl@linux-foundation.org wrote:
...
> The operations are guaranteed to be atomic vs preemption if they modify
> the scalar (unless they are prefixed by __ in which case they do not need
> to be). The calculation of the per cpu offset is also guaranteed to be atomic.
> 
> this_cpu_read(scalar)
> this_cpu_write(scalar, value)
> this_cpu_add(scalar, value)
> this_cpu_sub(scalar, value)
> this_cpu_inc(scalar)
> this_cpu_dec(scalar)
> this_cpu_and(scalar, value)
> this_cpu_or(scalar, value)
> this_cpu_xor(scalar, value)

Looks good to me.  The only qualm I have is that I wish these macros
took a pointer instead of the symbol name directly.  Currently that's not
possible due to the per_cpu__ appending thing but those should go with
Rusty's patches and the same ops should be usable for both static and
dynamic ones.  One problem which may occur with such a scheme is when
the arch+compiler can't handle indirect dereferencing atomically.  At
any rate, it's a separate issue and we can deal with it later.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [this_cpu_xx 11/11] RCU: Use this_cpu operations
  2009-06-05 19:18 ` [this_cpu_xx 11/11] RCU: Use this_cpu operations cl
@ 2009-06-10 17:42   ` Paul E. McKenney
  0 siblings, 0 replies; 29+ messages in thread
From: Paul E. McKenney @ 2009-06-10 17:42 UTC (permalink / raw)
  To: cl; +Cc: linux-kernel, Tejun Heo, mingo, rusty, davem

On Fri, Jun 05, 2009 at 03:18:30PM -0400, cl@linux-foundation.org wrote:
> RCU does not do dynamic allocations but it increments per cpu variables
> a lot. These operations result in a move to a register and then back
> to memory. This patch makes RCU use the inc/dec instructions on x86
> that do not need a register.

Looks good to me!

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
> 
> ---
>  kernel/rcupreempt.c |    4 ++--
>  kernel/rcutorture.c |    8 ++++----
>  2 files changed, 6 insertions(+), 6 deletions(-)
> 
> Index: linux-2.6/kernel/rcutorture.c
> ===================================================================
> --- linux-2.6.orig/kernel/rcutorture.c	2009-06-04 14:26:42.000000000 -0500
> +++ linux-2.6/kernel/rcutorture.c	2009-06-04 14:38:05.000000000 -0500
> @@ -709,13 +709,13 @@ static void rcu_torture_timer(unsigned l
>  		/* Should not happen, but... */
>  		pipe_count = RCU_TORTURE_PIPE_LEN;
>  	}
> -	++__get_cpu_var(rcu_torture_count)[pipe_count];
> +	__this_cpu_inc(per_cpu_var(rcu_torture_count)[pipe_count]);
>  	completed = cur_ops->completed() - completed;
>  	if (completed > RCU_TORTURE_PIPE_LEN) {
>  		/* Should not happen, but... */
>  		completed = RCU_TORTURE_PIPE_LEN;
>  	}
> -	++__get_cpu_var(rcu_torture_batch)[completed];
> +	__this_cpu_inc(per_cpu_var(rcu_torture_batch)[completed]);
>  	preempt_enable();
>  	cur_ops->readunlock(idx);
>  }
> @@ -764,13 +764,13 @@ rcu_torture_reader(void *arg)
>  			/* Should not happen, but... */
>  			pipe_count = RCU_TORTURE_PIPE_LEN;
>  		}
> -		++__get_cpu_var(rcu_torture_count)[pipe_count];
> +		__this_cpu_inc(per_cpu_var(rcu_torture_count)[pipe_count]);
>  		completed = cur_ops->completed() - completed;
>  		if (completed > RCU_TORTURE_PIPE_LEN) {
>  			/* Should not happen, but... */
>  			completed = RCU_TORTURE_PIPE_LEN;
>  		}
> -		++__get_cpu_var(rcu_torture_batch)[completed];
> +		__this_cpu_inc(per_cpu_var(rcu_torture_batch)[completed]);
>  		preempt_enable();
>  		cur_ops->readunlock(idx);
>  		schedule();
> Index: linux-2.6/kernel/rcupreempt.c
> ===================================================================
> --- linux-2.6.orig/kernel/rcupreempt.c	2009-06-04 14:28:53.000000000 -0500
> +++ linux-2.6/kernel/rcupreempt.c	2009-06-04 14:39:35.000000000 -0500
> @@ -173,7 +173,7 @@ void rcu_enter_nohz(void)
>  	static DEFINE_RATELIMIT_STATE(rs, 10 * HZ, 1);
> 
>  	smp_mb(); /* CPUs seeing ++ must see prior RCU read-side crit sects */
> -	__get_cpu_var(rcu_dyntick_sched).dynticks++;
> +	__this_cpu_inc(per_cpu_var(rcu_dyntick_sched).dynticks);
>  	WARN_ON_RATELIMIT(__get_cpu_var(rcu_dyntick_sched).dynticks & 0x1, &rs);
>  }
> 
> @@ -181,7 +181,7 @@ void rcu_exit_nohz(void)
>  {
>  	static DEFINE_RATELIMIT_STATE(rs, 10 * HZ, 1);
> 
> -	__get_cpu_var(rcu_dyntick_sched).dynticks++;
> +	__this_cpu_inc(per_cpu_var(rcu_dyntick_sched).dynticks);
>  	smp_mb(); /* CPUs seeing ++ must see later RCU read-side crit sects */
>  	WARN_ON_RATELIMIT(!(__get_cpu_var(rcu_dyntick_sched).dynticks & 0x1),
>  				&rs);
> 
> -- 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [this_cpu_xx 01/11] Introduce this_cpu_ptr() and generic this_cpu_* operations
  2009-06-10  5:12   ` Tejun Heo
@ 2009-06-11 15:10     ` Christoph Lameter
  2009-06-12  2:09       ` Tejun Heo
  0 siblings, 1 reply; 29+ messages in thread
From: Christoph Lameter @ 2009-06-11 15:10 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-kernel, David Howells, Ingo Molnar, Rusty Russell,
	Eric Dumazet, davem

On Wed, 10 Jun 2009, Tejun Heo wrote:

> Looks good to me.  The only qualm I have is that I wish these macros
> took a pointer instead of the symbol name directly.  Currently that's not

They take the address of the scalar. No symbol name is involved.

> possible due to the per_cpu__ appending thing but those should go with
> Rusty's patches and the same ops should be usable for both static and
> dynamic ones.  One problem which may occur with such a scheme is when

They are usable for both as the following patches show.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [this_cpu_xx 01/11] Introduce this_cpu_ptr() and generic this_cpu_* operations
  2009-06-11 15:10     ` Christoph Lameter
@ 2009-06-12  2:09       ` Tejun Heo
  2009-06-12 14:18         ` Christoph Lameter
  0 siblings, 1 reply; 29+ messages in thread
From: Tejun Heo @ 2009-06-12  2:09 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: linux-kernel, David Howells, Ingo Molnar, Rusty Russell,
	Eric Dumazet, davem

Christoph Lameter wrote:
> On Wed, 10 Jun 2009, Tejun Heo wrote:
> 
>> Looks good to me.  The only qualm I have is that I wish these macros
>> took a pointer instead of the symbol name directly.  Currently that's not
> 
> They take the address of the scalar. No symbol name is involved.
> 
>> possible due to the per_cpu__ appending thing but those should go with
>> Rusty's patches and the same ops should be usable for both static and
>> dynamic ones.  One problem which may occur with such a scheme is when
> 
> They are usable for both as the following patches show.

Oops, sorry about that.  Got confused there.  :-)

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [this_cpu_xx 01/11] Introduce this_cpu_ptr() and generic this_cpu_* operations
  2009-06-12  2:09       ` Tejun Heo
@ 2009-06-12 14:18         ` Christoph Lameter
  2009-06-17  8:09           ` Tejun Heo
  0 siblings, 1 reply; 29+ messages in thread
From: Christoph Lameter @ 2009-06-12 14:18 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-kernel, David Howells, Ingo Molnar, Rusty Russell,
	Eric Dumazet, davem

On Fri, 12 Jun 2009, Tejun Heo wrote:

> > They are usable for both as the following patches show.
>
> Oops, sorry about that.  Got confused there.  :-)

Reviewed-by's or so would be appreciated. I almost got the allocators
converted to use the ops as well but I want the simple stuff to be merged
first.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [this_cpu_xx 01/11] Introduce this_cpu_ptr() and generic this_cpu_* operations
  2009-06-12 14:18         ` Christoph Lameter
@ 2009-06-17  8:09           ` Tejun Heo
  0 siblings, 0 replies; 29+ messages in thread
From: Tejun Heo @ 2009-06-17  8:09 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: linux-kernel, David Howells, Ingo Molnar, Rusty Russell,
	Eric Dumazet, davem

Christoph Lameter wrote:
> On Fri, 12 Jun 2009, Tejun Heo wrote:
> 
>>> They are usable for both as the following patches show.
>> Oops, sorry about that.  Got confused there.  :-)
> 
> Reviewed-by's or so would be appreciated. I almost got the allocators
> converted to use the ops as well but I want the simple stuff to be merged
> first.

Sorry about the late reply.  Was hiding in my hole.  Will reply to the
original posting.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [this_cpu_xx 01/11] Introduce this_cpu_ptr() and generic this_cpu_* operations
  2009-06-05 19:18 ` [this_cpu_xx 01/11] Introduce this_cpu_ptr() and generic this_cpu_* operations cl
  2009-06-10  5:12   ` Tejun Heo
@ 2009-06-17  8:19   ` Tejun Heo
  2009-06-17 18:41     ` Christoph Lameter
  1 sibling, 1 reply; 29+ messages in thread
From: Tejun Heo @ 2009-06-17  8:19 UTC (permalink / raw)
  To: cl
  Cc: linux-kernel, David Howells, Ingo Molnar, Rusty Russell,
	Eric Dumazet, davem

Hello,

cl@linux-foundation.org wrote:
> +#ifndef this_cpu_write
> +# define this_cpu_write(pcp, val)	__this_cpu_write((pcp), (val))
> +#endif

Is this safe?  The write itself would always be atomic but this means that
a percpu variable may change its value while a thread is holding the
processor by disabling preemption.  I.e.,

0. v contains A for cpu0

1. task0 on cpu0 does this_cpu_write(v, B), looks up cpu but gets
   preempted out.

2. task1 gets scheduled on cpu1, disables preemption and does
   __this_cpu_read(v) and gets A and goes on with preemption disabled.

3. task0 gets scheduled on cpu1 and executes the assignment.

4. task1 does __this_cpu_read(v) again and oops gets B this time.

Please note that this can also happen between addition or other
modifying ops and cause an incorrect result.
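
A sketch to make the window concrete (pointer form for illustration,
not the actual macro expansion; v is a hypothetical per-cpu int):

	int *p = per_cpu_ptr(&per_cpu_var(v), smp_processor_id());
	/* <-- preempted and migrated here: the task no longer runs
	 *     on the cpu whose area p points into */
	*p = B;	/* the store lands in another cpu's area, under the
		 * feet of readers there that disabled preemption */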

Also, these macros deprecate the percpu_OP() macros, right?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [this_cpu_xx 01/11] Introduce this_cpu_ptr() and generic this_cpu_* operations
  2009-06-17  8:19   ` Tejun Heo
@ 2009-06-17 18:41     ` Christoph Lameter
  2009-06-18  1:08       ` Tejun Heo
  2009-06-18  3:01       ` Rusty Russell
  0 siblings, 2 replies; 29+ messages in thread
From: Christoph Lameter @ 2009-06-17 18:41 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-kernel, David Howells, Ingo Molnar, Rusty Russell,
	Eric Dumazet, davem

On Wed, 17 Jun 2009, Tejun Heo wrote:

> cl@linux-foundation.org wrote:
> > +#ifndef this_cpu_write
> > +# define this_cpu_write(pcp, val)	__this_cpu_write((pcp), (val))
> > +#endif
>
> Is this safe?  The write itself would always be atomic but this means that
> a percpu variable may change its value while a thread is holding the
> processor by disabling preemption.  I.e.,
>
> 0. v contains A for cpu0
>
> 1. task0 on cpu0 does this_cpu_write(v, B), looks up cpu but gets
>    preempted out.
>
> 2. task1 gets scheduled on cpu1, disables preemption and does
>    __this_cpu_read(v) and gets A and goes on with preemption disabled.
>
> 3. task0 gets scheduled on cpu1 and executes the assignment.
>
> 4. task1 does __this_cpu_read(v) again and oops gets B this time.
>
> Please note that this can also happen between addition or other
> modifying ops and cause an incorrect result.

Per cpu operations are only safe for the current processor. One issue
is that the store after rescheduling may not go to the current
processor's per cpu instance but to the prior cpu's. At that point
another thread may be running on the prior cpu and be disturbed, as you
point out. So it needs a preempt disable there too.
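
For the generic fallbacks, something along these lines would close the
window (a sketch only; __this_cpu_ptr is from patch 01, the helper
name is made up):

	#define _this_cpu_generic_to_op(pcp, val, op)			\
	do {								\
		preempt_disable();					\
		/* address lookup and RMW now happen on one cpu */	\
		*__this_cpu_ptr(&(pcp)) op (val);			\
		preempt_enable();					\
	} while (0)

	#define this_cpu_add(pcp, val) _this_cpu_generic_to_op((pcp), (val), +=)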

> Also, these macros deprecate the percpu_OP() macros, right?

They are different. percpu_OP() macros require a percpu variable name
to be passed.

this_cpu_* macros require a reference to a variable in a
structure allocated with the new per cpu allocator.

It is possible to simply pass the full variable name of a percpu variable
to this_cpu_* macros. See the vm statistics patch.

It uses

	per_cpu_var(per_cpu_name_without_prefix)

to generate the full name.
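
For example, a hypothetical event counter converted the same way would
look like:

	DEFINE_PER_CPU(long, nr_foo_events);

	static inline void count_foo_event(void)
	{
		this_cpu_inc(per_cpu_var(nr_foo_events));
	}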


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [this_cpu_xx 01/11] Introduce this_cpu_ptr() and generic this_cpu_* operations
  2009-06-17 18:41     ` Christoph Lameter
@ 2009-06-18  1:08       ` Tejun Heo
  2009-06-18  3:01       ` Rusty Russell
  1 sibling, 0 replies; 29+ messages in thread
From: Tejun Heo @ 2009-06-18  1:08 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: linux-kernel, David Howells, Ingo Molnar, Rusty Russell,
	Eric Dumazet, davem

Hello,

Christoph Lameter wrote:
> On Wed, 17 Jun 2009, Tejun Heo wrote:
>> Please note that this can also happen between addition or other
>> modifying ops and cause incorrect result.
> 
> Per cpu operations are only safe for the current processor. One issue
> is that the store after rescheduling may not go to the current
> processor's per cpu instance but to the prior cpu's. At that point
> another thread may be running on the prior cpu and be disturbed, as you
> point out. So it needs a preempt disable there too.

Yeap, to summarize, the problem is that the address determination and
the actual memory write aren't atomic with respect to preemption.

>> Also, these macros deprecate the percpu_OP() macros, right?
> 
> They are different. percpu_OP() macros require a percpu variable name
> to be passed.
> 
> this_cpu_* macros require a reference to a variable in a
> structure allocated with the new per cpu allocator.
> 
> It is possible to simply pass the full variable name of a percpu variable
> to this_cpu_* macros. See the vm statistics patch.
> 
> It uses
> 
> 	per_cpu_var(per_cpu_name_without_prefix)
> 
> to generate the full name.

Yeap, I guess it's about time to resurrect Rusty's drop-per_cpu_
prefix patch; then, we can truly handle static and dynamic variables
in the same manner.
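
A sketch of that end state, with hypothetical variables:

	DEFINE_PER_CPU(int, hits);	/* static per cpu */
	int *dhits = alloc_percpu(int);	/* dynamic per cpu */

	this_cpu_inc(hits);		/* no per_cpu_var() wrapper */
	this_cpu_inc(*dhits);		/* same op for both */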

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [this_cpu_xx 01/11] Introduce this_cpu_ptr() and generic this_cpu_* operations
  2009-06-17 18:41     ` Christoph Lameter
  2009-06-18  1:08       ` Tejun Heo
@ 2009-06-18  3:01       ` Rusty Russell
  1 sibling, 0 replies; 29+ messages in thread
From: Rusty Russell @ 2009-06-18  3:01 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Tejun Heo, linux-kernel, David Howells, Ingo Molnar, Eric Dumazet, davem

On Thu, 18 Jun 2009 04:11:17 am Christoph Lameter wrote:
> It is possible to simply pass the full variable name of a percpu variable
> to this_cpu_* macros. See the vm statistics patch.
>
> It uses
>
> 	per_cpu_var(per_cpu_name_without_prefix)
>
> to generate the full name.

I have a patch to rip out the prefixes and use sparse annotations instead; I'll
dig it out...

OK, it was a series of three.  Probably bitrotted, but here they are:

alloc_percpu: rename percpu vars which cause name clashes.

Currently DECLARE_PER_CPU vars have per_cpu__ prefixed to them, and
this effectively puts them in a separate namespace.  No surprise that
they clash with other names when that prefix is removed.

There may be others I've missed, but if so the transform is simple.
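
To make the clash concrete, take the cell interrupt case from the diff
below: the type, the local variable and the per cpu variable all use
the name "iic", which only resolves sanely while the real symbol is
per_cpu__iic:

	static DEFINE_PER_CPU(struct iic, iic);

	static void iic_eoi(unsigned int irq)
	{
		struct iic *iic = &__get_cpu_var(iic);

		out_be64(&iic->regs->prio, iic->eoi_stack[--iic->eoi_ptr]);
	}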

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
---
 arch/ia64/kernel/crash.c                |    4 ++--
 arch/ia64/kernel/setup.c                |    8 ++++----
 arch/mn10300/kernel/kprobes.c           |    2 +-
 arch/powerpc/platforms/cell/interrupt.c |   14 +++++++-------
 arch/x86/include/asm/processor.h        |    2 +-
 arch/x86/include/asm/timer.h            |    5 +++--
 arch/x86/kernel/cpu/common.c            |    4 ++--
 arch/x86/kernel/dumpstack_64.c          |    2 +-
 arch/x86/kernel/tsc.c                   |    4 ++--
 arch/x86/kvm/svm.c                      |   14 +++++++-------
 drivers/cpufreq/cpufreq.c               |   16 ++++++++--------
 drivers/s390/net/netiucv.c              |    8 ++++----
 kernel/lockdep.c                        |   11 ++++++-----
 kernel/sched.c                          |   14 ++++++++------
 kernel/softirq.c                        |    4 ++--
 kernel/softlockup.c                     |   20 ++++++++++----------
 mm/slab.c                               |    8 ++++----
 mm/vmstat.c                             |    6 +++---
 18 files changed, 75 insertions(+), 71 deletions(-)

diff --git a/arch/ia64/kernel/crash.c b/arch/ia64/kernel/crash.c
--- a/arch/ia64/kernel/crash.c
+++ b/arch/ia64/kernel/crash.c
@@ -50,7 +50,7 @@ final_note(void *buf)
 
 extern void ia64_dump_cpu_regs(void *);
 
-static DEFINE_PER_CPU(struct elf_prstatus, elf_prstatus);
+static DEFINE_PER_CPU(struct elf_prstatus, elf_prstatus_pcpu);
 
 void
 crash_save_this_cpu(void)
@@ -59,7 +59,7 @@ crash_save_this_cpu(void)
 	unsigned long cfm, sof, sol;
 
 	int cpu = smp_processor_id();
-	struct elf_prstatus *prstatus = &per_cpu(elf_prstatus, cpu);
+	struct elf_prstatus *prstatus = &per_cpu(elf_prstatus_pcpu, cpu);
 
 	elf_greg_t *dst = (elf_greg_t *)&(prstatus->pr_reg);
 	memset(prstatus, 0, sizeof(*prstatus));
diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -939,7 +939,7 @@ cpu_init (void)
 	unsigned long num_phys_stacked;
 	pal_vm_info_2_u_t vmi;
 	unsigned int max_ctx;
-	struct cpuinfo_ia64 *cpu_info;
+	struct cpuinfo_ia64 *cpuinfo;
 	void *cpu_data;
 
 	cpu_data = per_cpu_init();
@@ -972,15 +972,15 @@ cpu_init (void)
 	 * depends on the data returned by identify_cpu().  We break the dependency by
 	 * accessing cpu_data() through the canonical per-CPU address.
 	 */
-	cpu_info = cpu_data + ((char *) &__ia64_per_cpu_var(cpu_info) - __per_cpu_start);
-	identify_cpu(cpu_info);
+	cpuinfo = cpu_data + ((char *) &__ia64_per_cpu_var(cpu_info) - __per_cpu_start);
+	identify_cpu(cpuinfo);
 
 #ifdef CONFIG_MCKINLEY
 	{
 #		define FEATURE_SET 16
 		struct ia64_pal_retval iprv;
 
-		if (cpu_info->family == 0x1f) {
+		if (cpuinfo->family == 0x1f) {
 			PAL_CALL_PHYS(iprv, PAL_PROC_GET_FEATURES, 0, FEATURE_SET, 0);
 			if ((iprv.status == 0) && (iprv.v0 & 0x80) && (iprv.v2 & 0x80))
 				PAL_CALL_PHYS(iprv, PAL_PROC_SET_FEATURES,
diff --git a/arch/mn10300/kernel/kprobes.c b/arch/mn10300/kernel/kprobes.c
--- a/arch/mn10300/kernel/kprobes.c
+++ b/arch/mn10300/kernel/kprobes.c
@@ -39,7 +39,7 @@ static kprobe_opcode_t current_kprobe_ss
 static kprobe_opcode_t current_kprobe_ss_buf[MAX_INSN_SIZE + 2];
 static unsigned long current_kprobe_bp_addr;
 
-DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
+DEFINE_PER_CPU(struct kprobe *, current_kprobe_pcpu) = NULL;
 
 
 /* singlestep flag bits */
diff --git a/arch/powerpc/platforms/cell/interrupt.c b/arch/powerpc/platforms/cell/interrupt.c
--- a/arch/powerpc/platforms/cell/interrupt.c
+++ b/arch/powerpc/platforms/cell/interrupt.c
@@ -54,7 +54,7 @@ struct iic {
 	struct device_node *node;
 };
 
-static DEFINE_PER_CPU(struct iic, iic);
+static DEFINE_PER_CPU(struct iic, iic_pcpu);
 #define IIC_NODE_COUNT	2
 static struct irq_host *iic_host;
 
@@ -82,7 +82,7 @@ static void iic_unmask(unsigned int irq)
 
 static void iic_eoi(unsigned int irq)
 {
-	struct iic *iic = &__get_cpu_var(iic);
+	struct iic *iic = &__get_cpu_var(iic_pcpu);
 	out_be64(&iic->regs->prio, iic->eoi_stack[--iic->eoi_ptr]);
 	BUG_ON(iic->eoi_ptr < 0);
 }
@@ -146,7 +146,7 @@ static unsigned int iic_get_irq(void)
 	struct iic *iic;
 	unsigned int virq;
 
-	iic = &__get_cpu_var(iic);
+	iic = &__get_cpu_var(iic_pcpu);
 	*(unsigned long *) &pending =
 		in_be64((u64 __iomem *) &iic->regs->pending_destr);
 	if (!(pending.flags & CBE_IIC_IRQ_VALID))
@@ -161,12 +161,12 @@ static unsigned int iic_get_irq(void)
 
 void iic_setup_cpu(void)
 {
-	out_be64(&__get_cpu_var(iic).regs->prio, 0xff);
+	out_be64(&__get_cpu_var(iic_pcpu).regs->prio, 0xff);
 }
 
 u8 iic_get_target_id(int cpu)
 {
-	return per_cpu(iic, cpu).target_id;
+	return per_cpu(iic_pcpu, cpu).target_id;
 }
 
 EXPORT_SYMBOL_GPL(iic_get_target_id);
@@ -181,7 +181,7 @@ static inline int iic_ipi_to_irq(int ipi
 
 void iic_cause_IPI(int cpu, int mesg)
 {
-	out_be64(&per_cpu(iic, cpu).regs->generate, (0xf - mesg) << 4);
+	out_be64(&per_cpu(iic_pcpu, cpu).regs->generate, (0xf - mesg) << 4);
 }
 
 struct irq_host *iic_get_irq_host(int node)
@@ -350,7 +350,7 @@ static void __init init_one_iic(unsigned
 	/* XXX FIXME: should locate the linux CPU number from the HW cpu
 	 * number properly. We are lucky for now
 	 */
-	struct iic *iic = &per_cpu(iic, hw_cpu);
+	struct iic *iic = &per_cpu(iic_pcpu, hw_cpu);
 
 	iic->regs = ioremap(addr, sizeof(struct cbe_iic_thread_regs));
 	BUG_ON(iic->regs == NULL);
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -377,7 +377,7 @@ union thread_xstate {
 };
 
 #ifdef CONFIG_X86_64
-DECLARE_PER_CPU(struct orig_ist, orig_ist);
+DECLARE_PER_CPU(struct orig_ist, orig_ist_pcpu);
 #endif
 
 extern void print_cpu_info(struct cpuinfo_x86 *);
diff --git a/arch/x86/include/asm/timer.h b/arch/x86/include/asm/timer.h
--- a/arch/x86/include/asm/timer.h
+++ b/arch/x86/include/asm/timer.h
@@ -42,13 +42,14 @@ extern int no_timer_check;
  *			-johnstul@us.ibm.com "math is hard, lets go shopping!"
  */
 
-DECLARE_PER_CPU(unsigned long, cyc2ns);
+DECLARE_PER_CPU(unsigned long, percpu_cyc2ns);
 
 #define CYC2NS_SCALE_FACTOR 10 /* 2^10, carefully chosen */
 
 static inline unsigned long long __cycles_2_ns(unsigned long long cyc)
 {
-	return cyc * per_cpu(cyc2ns, smp_processor_id()) >> CYC2NS_SCALE_FACTOR;
+	return cyc * per_cpu(percpu_cyc2ns, smp_processor_id()) >>
+		CYC2NS_SCALE_FACTOR;
 }
 
 static inline unsigned long long cycles_2_ns(unsigned long long cyc)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -955,7 +955,7 @@ unsigned long kernel_eflags;
  * Copies of the original ist values from the tss are only accessed during
  * debugging, no special alignment required.
  */
-DEFINE_PER_CPU(struct orig_ist, orig_ist);
+DEFINE_PER_CPU(struct orig_ist, orig_ist_pcpu);
 
 #else
 
@@ -980,7 +980,7 @@ void __cpuinit cpu_init(void)
 {
 	int cpu = stack_smp_processor_id();
 	struct tss_struct *t = &per_cpu(init_tss, cpu);
-	struct orig_ist *orig_ist = &per_cpu(orig_ist, cpu);
+	struct orig_ist *orig_ist = &per_cpu(orig_ist_pcpu, cpu);
 	unsigned long v;
 	char *estacks = NULL;
 	struct task_struct *me;
diff --git a/arch/x86/kernel/dumpstack_64.c b/arch/x86/kernel/dumpstack_64.c
--- a/arch/x86/kernel/dumpstack_64.c
+++ b/arch/x86/kernel/dumpstack_64.c
@@ -40,7 +40,7 @@ static unsigned long *in_exception_stack
 	 * 'stack' is in one of them:
 	 */
 	for (k = 0; k < N_EXCEPTION_STACKS; k++) {
-		unsigned long end = per_cpu(orig_ist, cpu).ist[k];
+		unsigned long end = per_cpu(orig_ist_pcpu, cpu).ist[k];
 		/*
 		 * Is 'stack' above this exception frame's end?
 		 * If yes then skip to the next frame.
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -570,7 +570,7 @@ EXPORT_SYMBOL(recalibrate_cpu_khz);
  *                      -johnstul@us.ibm.com "math is hard, lets go shopping!"
  */
 
-DEFINE_PER_CPU(unsigned long, cyc2ns);
+DEFINE_PER_CPU(unsigned long, percpu_cyc2ns);
 
 static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
 {
@@ -580,7 +580,7 @@ static void set_cyc2ns_scale(unsigned lo
 	local_irq_save(flags);
 	sched_clock_idle_sleep_event();
 
-	scale = &per_cpu(cyc2ns, cpu);
+	scale = &per_cpu(percpu_cyc2ns, cpu);
 
 	rdtscll(tsc_now);
 	ns_now = __cycles_2_ns(tsc_now);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -90,7 +90,7 @@ struct svm_cpu_data {
 	struct page *save_area;
 };
 
-static DEFINE_PER_CPU(struct svm_cpu_data *, svm_data);
+static DEFINE_PER_CPU(struct svm_cpu_data *, svm_data_pcpu);
 static uint32_t svm_features;
 
 struct svm_init_data {
@@ -275,7 +275,7 @@ static void svm_hardware_enable(void *ga
 		printk(KERN_ERR "svm_cpu_init: err EOPNOTSUPP on %d\n", me);
 		return;
 	}
-	svm_data = per_cpu(svm_data, me);
+	svm_data = per_cpu(svm_data_pcpu, me);
 
 	if (!svm_data) {
 		printk(KERN_ERR "svm_cpu_init: svm_data is NULL on %d\n",
@@ -301,12 +301,12 @@ static void svm_cpu_uninit(int cpu)
 static void svm_cpu_uninit(int cpu)
 {
 	struct svm_cpu_data *svm_data
-		= per_cpu(svm_data, raw_smp_processor_id());
+		= per_cpu(svm_data_pcpu, raw_smp_processor_id());
 
 	if (!svm_data)
 		return;
 
-	per_cpu(svm_data, raw_smp_processor_id()) = NULL;
+	per_cpu(svm_data_pcpu, raw_smp_processor_id()) = NULL;
 	__free_page(svm_data->save_area);
 	kfree(svm_data);
 }
@@ -325,7 +325,7 @@ static int svm_cpu_init(int cpu)
 	if (!svm_data->save_area)
 		goto err_1;
 
-	per_cpu(svm_data, cpu) = svm_data;
+	per_cpu(svm_data_pcpu, cpu) = svm_data;
 
 	return 0;
 
@@ -1508,7 +1508,7 @@ static void reload_tss(struct kvm_vcpu *
 {
 	int cpu = raw_smp_processor_id();
 
-	struct svm_cpu_data *svm_data = per_cpu(svm_data, cpu);
+	struct svm_cpu_data *svm_data = per_cpu(svm_data_pcpu, cpu);
 	svm_data->tss_desc->type = 9; /* available 32/64-bit TSS */
 	load_TR_desc();
 }
@@ -1517,7 +1517,7 @@ static void pre_svm_run(struct vcpu_svm 
 {
 	int cpu = raw_smp_processor_id();
 
-	struct svm_cpu_data *svm_data = per_cpu(svm_data, cpu);
+	struct svm_cpu_data *svm_data = per_cpu(svm_data_pcpu, cpu);
 
 	svm->vmcb->control.tlb_ctl = TLB_CONTROL_DO_NOTHING;
 	if (svm->vcpu.cpu != cpu ||
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -62,14 +62,14 @@ static DEFINE_SPINLOCK(cpufreq_driver_lo
  * - Governor routines that can be called in cpufreq hotplug path should not
  *   take this sem as top level hotplug notifier handler takes this.
  */
-static DEFINE_PER_CPU(int, policy_cpu);
+static DEFINE_PER_CPU(int, policy_cpu_pcpu);
 static DEFINE_PER_CPU(struct rw_semaphore, cpu_policy_rwsem);
 
 #define lock_policy_rwsem(mode, cpu)					\
 int lock_policy_rwsem_##mode						\
 (int cpu)								\
 {									\
-	int policy_cpu = per_cpu(policy_cpu, cpu);			\
+	int policy_cpu = per_cpu(policy_cpu_pcpu, cpu);			\
 	BUG_ON(policy_cpu == -1);					\
 	down_##mode(&per_cpu(cpu_policy_rwsem, policy_cpu));		\
 	if (unlikely(!cpu_online(cpu))) {				\
@@ -88,7 +88,7 @@ EXPORT_SYMBOL_GPL(lock_policy_rwsem_writ
 
 void unlock_policy_rwsem_read(int cpu)
 {
-	int policy_cpu = per_cpu(policy_cpu, cpu);
+	int policy_cpu = per_cpu(policy_cpu_pcpu, cpu);
 	BUG_ON(policy_cpu == -1);
 	up_read(&per_cpu(cpu_policy_rwsem, policy_cpu));
 }
@@ -96,7 +96,7 @@ EXPORT_SYMBOL_GPL(unlock_policy_rwsem_re
 
 void unlock_policy_rwsem_write(int cpu)
 {
-	int policy_cpu = per_cpu(policy_cpu, cpu);
+	int policy_cpu = per_cpu(policy_cpu_pcpu, cpu);
 	BUG_ON(policy_cpu == -1);
 	up_write(&per_cpu(cpu_policy_rwsem, policy_cpu));
 }
@@ -822,7 +822,7 @@ static int cpufreq_add_dev(struct sys_de
 	cpumask_copy(policy->cpus, cpumask_of(cpu));
 
 	/* Initially set CPU itself as the policy_cpu */
-	per_cpu(policy_cpu, cpu) = cpu;
+	per_cpu(policy_cpu_pcpu, cpu) = cpu;
 	lock_policy_rwsem_write(cpu);
 
 	init_completion(&policy->kobj_unregister);
@@ -866,7 +866,7 @@ static int cpufreq_add_dev(struct sys_de
 
 			/* Set proper policy_cpu */
 			unlock_policy_rwsem_write(cpu);
-			per_cpu(policy_cpu, cpu) = managed_policy->cpu;
+			per_cpu(policy_cpu_pcpu, cpu) = managed_policy->cpu;
 
 			if (lock_policy_rwsem_write(cpu) < 0)
 				goto err_out_driver_exit;
@@ -929,7 +929,7 @@ static int cpufreq_add_dev(struct sys_de
 	spin_lock_irqsave(&cpufreq_driver_lock, flags);
 	for_each_cpu(j, policy->cpus) {
 		per_cpu(cpufreq_cpu_data, j) = policy;
-		per_cpu(policy_cpu, j) = policy->cpu;
+		per_cpu(policy_cpu_pcpu, j) = policy->cpu;
 	}
 	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
@@ -1937,7 +1937,7 @@ static int __init cpufreq_core_init(void
 	int cpu;
 
 	for_each_possible_cpu(cpu) {
-		per_cpu(policy_cpu, cpu) = -1;
+		per_cpu(policy_cpu_pcpu, cpu) = -1;
 		init_rwsem(&per_cpu(cpu_policy_rwsem, cpu));
 	}
 	return 0;
diff --git a/drivers/s390/net/netiucv.c b/drivers/s390/net/netiucv.c
--- a/drivers/s390/net/netiucv.c
+++ b/drivers/s390/net/netiucv.c
@@ -98,7 +98,7 @@ MODULE_DESCRIPTION ("Linux for S/390 IUC
 		debug_event(iucv_dbf_##name,level,(void*)(addr),len); \
 	} while (0)
 
-DECLARE_PER_CPU(char[256], iucv_dbf_txt_buf);
+DECLARE_PER_CPU(char[256], iucv_dbf_txt_buf_pcpu);
 
 /* Allow to sort out low debug levels early to avoid wasted sprints */
 static inline int iucv_dbf_passes(debug_info_t *dbf_grp, int level)
@@ -110,11 +110,11 @@ static inline int iucv_dbf_passes(debug_
 	do { \
 		if (iucv_dbf_passes(iucv_dbf_##name, level)) { \
 			char* iucv_dbf_txt_buf = \
-					get_cpu_var(iucv_dbf_txt_buf); \
+					get_cpu_var(iucv_dbf_txt_buf_pcpu); \
 			sprintf(iucv_dbf_txt_buf, text); \
 			debug_text_event(iucv_dbf_##name, level, \
 						iucv_dbf_txt_buf); \
-			put_cpu_var(iucv_dbf_txt_buf); \
+			put_cpu_var(iucv_dbf_txt_buf_pcpu); \
 		} \
 	} while (0)
 
@@ -462,7 +462,7 @@ static debug_info_t *iucv_dbf_data = NUL
 static debug_info_t *iucv_dbf_data = NULL;
 static debug_info_t *iucv_dbf_trace = NULL;
 
-DEFINE_PER_CPU(char[256], iucv_dbf_txt_buf);
+DEFINE_PER_CPU(char[256], iucv_dbf_txt_buf_pcpu);
 
 static void iucv_unregister_dbf_views(void)
 {
diff --git a/kernel/lockdep.c b/kernel/lockdep.c
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -135,7 +135,8 @@ static inline struct lock_class *hlock_c
 }
 
 #ifdef CONFIG_LOCK_STAT
-static DEFINE_PER_CPU(struct lock_class_stats[MAX_LOCKDEP_KEYS], lock_stats);
+static DEFINE_PER_CPU(struct lock_class_stats[MAX_LOCKDEP_KEYS],
+		      percpu_lock_stats);
 
 static int lock_point(unsigned long points[], unsigned long ip)
 {
@@ -181,7 +182,7 @@ struct lock_class_stats lock_stats(struc
 	memset(&stats, 0, sizeof(struct lock_class_stats));
 	for_each_possible_cpu(cpu) {
 		struct lock_class_stats *pcs =
-			&per_cpu(lock_stats, cpu)[class - lock_classes];
+			&per_cpu(percpu_lock_stats, cpu)[class - lock_classes];
 
 		for (i = 0; i < ARRAY_SIZE(stats.contention_point); i++)
 			stats.contention_point[i] += pcs->contention_point[i];
@@ -208,7 +209,7 @@ void clear_lock_stats(struct lock_class 
 
 	for_each_possible_cpu(cpu) {
 		struct lock_class_stats *cpu_stats =
-			&per_cpu(lock_stats, cpu)[class - lock_classes];
+			&per_cpu(percpu_lock_stats, cpu)[class - lock_classes];
 
 		memset(cpu_stats, 0, sizeof(struct lock_class_stats));
 	}
@@ -218,12 +219,12 @@ void clear_lock_stats(struct lock_class 
 
 static struct lock_class_stats *get_lock_stats(struct lock_class *class)
 {
-	return &get_cpu_var(lock_stats)[class - lock_classes];
+	return &get_cpu_var(percpu_lock_stats)[class - lock_classes];
 }
 
 static void put_lock_stats(struct lock_class_stats *stats)
 {
-	put_cpu_var(lock_stats);
+	put_cpu_var(percpu_lock_stats);
 }
 
 static void lock_release_holdtime(struct held_lock *hlock)
diff --git a/kernel/sched.c b/kernel/sched.c
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -315,12 +315,14 @@ struct task_group root_task_group;
 /* Default task group's sched entity on each cpu */
 static DEFINE_PER_CPU(struct sched_entity, init_sched_entity);
 /* Default task group's cfs_rq on each cpu */
-static DEFINE_PER_CPU(struct cfs_rq, init_cfs_rq) ____cacheline_aligned_in_smp;
+static DEFINE_PER_CPU(struct cfs_rq, percpu_init_cfs_rq)
+	____cacheline_aligned_in_smp;
 #endif /* CONFIG_FAIR_GROUP_SCHED */
 
 #ifdef CONFIG_RT_GROUP_SCHED
 static DEFINE_PER_CPU(struct sched_rt_entity, init_sched_rt_entity);
-static DEFINE_PER_CPU(struct rt_rq, init_rt_rq) ____cacheline_aligned_in_smp;
+static DEFINE_PER_CPU(struct rt_rq, percpu_init_rt_rq)
+	____cacheline_aligned_in_smp;
 #endif /* CONFIG_RT_GROUP_SCHED */
 #else /* !CONFIG_USER_SCHED */
 #define root_task_group init_task_group
@@ -7213,14 +7215,14 @@ struct static_sched_domain {
  */
 #ifdef CONFIG_SCHED_SMT
 static DEFINE_PER_CPU(struct static_sched_domain, cpu_domains);
-static DEFINE_PER_CPU(struct static_sched_group, sched_group_cpus);
+static DEFINE_PER_CPU(struct static_sched_group, sched_group_cpus_pcpu);
 
 static int
 cpu_to_cpu_group(int cpu, const struct cpumask *cpu_map,
 		 struct sched_group **sg, struct cpumask *unused)
 {
 	if (sg)
-		*sg = &per_cpu(sched_group_cpus, cpu).sg;
+		*sg = &per_cpu(sched_group_cpus_pcpu, cpu).sg;
 	return cpu;
 }
 #endif /* CONFIG_SCHED_SMT */
@@ -8408,7 +8410,7 @@ void __init sched_init(void)
 		 * tasks in rq->cfs (i.e init_task_group->se[] != NULL).
 		 */
 		init_tg_cfs_entry(&init_task_group,
-				&per_cpu(init_cfs_rq, i),
+				&per_cpu(percpu_init_cfs_rq, i),
 				&per_cpu(init_sched_entity, i), i, 1,
 				root_task_group.se[i]);
 
@@ -8423,7 +8425,7 @@ void __init sched_init(void)
 #elif defined CONFIG_USER_SCHED
 		init_tg_rt_entry(&root_task_group, &rq->rt, NULL, i, 0, NULL);
 		init_tg_rt_entry(&init_task_group,
-				&per_cpu(init_rt_rq, i),
+				&per_cpu(percpu_init_rt_rq, i),
 				&per_cpu(init_sched_rt_entity, i), i, 1,
 				root_task_group.rt_se[i]);
 #endif
diff --git a/kernel/softirq.c b/kernel/softirq.c
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -602,7 +602,7 @@ void __init softirq_init(void)
 	open_softirq(HI_SOFTIRQ, tasklet_hi_action);
 }
 
-static int ksoftirqd(void * __bind_cpu)
+static int run_ksoftirqd(void *__bind_cpu)
 {
 	set_current_state(TASK_INTERRUPTIBLE);
 
@@ -714,7 +714,7 @@ static int __cpuinit cpu_callback(struct
 	switch (action) {
 	case CPU_UP_PREPARE:
 	case CPU_UP_PREPARE_FROZEN:
-		p = kthread_create(ksoftirqd, hcpu, "ksoftirqd/%d", hotcpu);
+		p = kthread_create(run_ksoftirqd, hcpu, "ksoftirqd/%d", hotcpu);
 		if (IS_ERR(p)) {
 			printk("ksoftirqd for %i failed\n", hotcpu);
 			return NOTIFY_BAD;
diff --git a/kernel/softlockup.c b/kernel/softlockup.c
--- a/kernel/softlockup.c
+++ b/kernel/softlockup.c
@@ -95,28 +95,28 @@ void softlockup_tick(void)
 void softlockup_tick(void)
 {
 	int this_cpu = smp_processor_id();
-	unsigned long touch_timestamp = per_cpu(touch_timestamp, this_cpu);
-	unsigned long print_timestamp;
+	unsigned long touch_ts = per_cpu(touch_timestamp, this_cpu);
+	unsigned long print_ts;
 	struct pt_regs *regs = get_irq_regs();
 	unsigned long now;
 
 	/* Is detection switched off? */
 	if (!per_cpu(watchdog_task, this_cpu) || softlockup_thresh <= 0) {
 		/* Be sure we don't false trigger if switched back on */
-		if (touch_timestamp)
+		if (touch_ts)
 			per_cpu(touch_timestamp, this_cpu) = 0;
 		return;
 	}
 
-	if (touch_timestamp == 0) {
+	if (touch_ts == 0) {
 		__touch_softlockup_watchdog();
 		return;
 	}
 
-	print_timestamp = per_cpu(print_timestamp, this_cpu);
+	print_ts = per_cpu(print_timestamp, this_cpu);
 
 	/* report at most once a second */
-	if (print_timestamp == touch_timestamp || did_panic)
+	if (print_ts == touch_ts || did_panic)
 		return;
 
 	/* do not print during early bootup: */
@@ -131,18 +131,18 @@ void softlockup_tick(void)
 	 * Wake up the high-prio watchdog task twice per
 	 * threshold timespan.
 	 */
-	if (now > touch_timestamp + softlockup_thresh/2)
+	if (now > touch_ts + softlockup_thresh/2)
 		wake_up_process(per_cpu(watchdog_task, this_cpu));
 
 	/* Warn about unreasonable delays: */
-	if (now <= (touch_timestamp + softlockup_thresh))
+	if (now <= (touch_ts + softlockup_thresh))
 		return;
 
-	per_cpu(print_timestamp, this_cpu) = touch_timestamp;
+	per_cpu(print_timestamp, this_cpu) = touch_ts;
 
 	spin_lock(&print_lock);
 	printk(KERN_ERR "BUG: soft lockup - CPU#%d stuck for %lus! [%s:%d]\n",
-			this_cpu, now - touch_timestamp,
+			this_cpu, now - touch_ts,
 			current->comm, task_pid_nr(current));
 	print_modules();
 	print_irqtrace_events(current);
diff --git a/mm/slab.c b/mm/slab.c
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -933,17 +933,17 @@ static void next_reap_node(void)
  */
 static void __cpuinit start_cpu_timer(int cpu)
 {
-	struct delayed_work *reap_work = &per_cpu(reap_work, cpu);
+	struct delayed_work *reap = &per_cpu(reap_work, cpu);
 
 	/*
 	 * When this gets called from do_initcalls via cpucache_init(),
 	 * init_workqueues() has already run, so keventd will be setup
 	 * at that time.
 	 */
-	if (keventd_up() && reap_work->work.func == NULL) {
+	if (keventd_up() && reap->work.func == NULL) {
 		init_reap_node(cpu);
-		INIT_DELAYED_WORK(reap_work, cache_reap);
-		schedule_delayed_work_on(cpu, reap_work,
+		INIT_DELAYED_WORK(reap, cache_reap);
+		schedule_delayed_work_on(cpu, reap,
 					__round_jiffies_relative(HZ, cpu));
 	}
 }
diff --git a/mm/vmstat.c b/mm/vmstat.c
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -903,10 +903,10 @@ static void vmstat_update(struct work_st
 
 static void __cpuinit start_cpu_timer(int cpu)
 {
-	struct delayed_work *vmstat_work = &per_cpu(vmstat_work, cpu);
+	struct delayed_work *vw = &per_cpu(vmstat_work, cpu);
 
-	INIT_DELAYED_WORK_DEFERRABLE(vmstat_work, vmstat_update);
-	schedule_delayed_work_on(cpu, vmstat_work, HZ + cpu);
+	INIT_DELAYED_WORK_DEFERRABLE(vw, vmstat_update);
+	schedule_delayed_work_on(cpu, vw, HZ + cpu);
 }
 
 /*
alloc_percpu: remove per_cpu__ prefix.

Now that the return from alloc_percpu is compatible with the address
of per-cpu vars, it makes sense to hand around the address of per-cpu
variables.  To make this sane, we remove the per_cpu__ prefix we
created to stop people accidentally using these vars directly.

Now that we have sparse, we can use that (next patch).
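
A before/after sketch with a hypothetical variable:

	DEFINE_PER_CPU(int, counter);

	/* before: the emitted symbol is per_cpu__counter, so a bare
	 * "counter++" fails to link and accidental direct use is
	 * impossible */
	/* after: the emitted symbol is plain "counter"; direct use now
	 * compiles, so sparse (next patch) takes over the policing */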

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
---
 arch/alpha/include/asm/percpu.h         |    4 ++--
 arch/cris/arch-v10/kernel/entry.S       |    2 +-
 arch/cris/arch-v32/mm/mmu.S             |    2 +-
 arch/ia64/include/asm/percpu.h          |    4 ++--
 arch/ia64/kernel/ia64_ksyms.c           |    4 ++--
 arch/ia64/mm/discontig.c                |    2 +-
 arch/parisc/lib/fixup.S                 |    8 ++++----
 arch/powerpc/platforms/pseries/hvCall.S |    2 +-
 arch/sparc/kernel/rtrap_64.S            |    8 ++++----
 arch/x86/include/asm/percpu.h           |   20 ++++++++++----------
 arch/x86/kernel/entry_64.S              |    4 ++--
 arch/x86/kernel/head_32.S               |    2 +-
 arch/x86/kernel/head_64.S               |    2 +-
 arch/x86/xen/xen-asm_32.S               |    4 ++--
 include/asm-generic/percpu.h            |    2 +-
 include/linux/percpu.h                  |   12 ++++++------
 16 files changed, 41 insertions(+), 41 deletions(-)

diff --git a/arch/alpha/include/asm/percpu.h b/arch/alpha/include/asm/percpu.h
--- a/arch/alpha/include/asm/percpu.h
+++ b/arch/alpha/include/asm/percpu.h
@@ -7,7 +7,7 @@
  * Determine the real variable name from the name visible in the
  * kernel sources.
  */
-#define per_cpu_var(var) per_cpu__##var
+#define per_cpu_var(var) var
 
 #ifdef CONFIG_SMP
 
@@ -43,7 +43,7 @@ extern unsigned long __per_cpu_offset[NR
 	unsigned long __ptr, tmp_gp;			\
 	asm (  "br	%1, 1f		  	      \n\
 	1:	ldgp	%1, 0(%1)	    	      \n\
-		ldq %0, per_cpu__" #var"(%1)\t!literal"		\
+		ldq %0, "#var"(%1)\t!literal"		\
 		: "=&r"(__ptr), "=&r"(tmp_gp));		\
 	(typeof(&per_cpu_var(var)))(__ptr + (offset)); })
 
diff --git a/arch/cris/arch-v10/kernel/entry.S b/arch/cris/arch-v10/kernel/entry.S
--- a/arch/cris/arch-v10/kernel/entry.S
+++ b/arch/cris/arch-v10/kernel/entry.S
@@ -358,7 +358,7 @@ 1:	btstq	12, $r1		   ; Refill?
 1:	btstq	12, $r1		   ; Refill?
 	bpl	2f
 	lsrq	24, $r1     ; Get PGD index (bit 24-31)
-	move.d  [per_cpu__current_pgd], $r0 ; PGD for the current process
+	move.d  [current_pgd], $r0 ; PGD for the current process
 	move.d	[$r0+$r1.d], $r0   ; Get PMD
 	beq	2f
 	nop
diff --git a/arch/cris/arch-v32/mm/mmu.S b/arch/cris/arch-v32/mm/mmu.S
--- a/arch/cris/arch-v32/mm/mmu.S
+++ b/arch/cris/arch-v32/mm/mmu.S
@@ -115,7 +115,7 @@ 3:	; Probably not in a loop, continue no
 #ifdef CONFIG_SMP
 	move    $s7, $acr	; PGD
 #else
-	move.d  per_cpu__current_pgd, $acr ; PGD
+	move.d  current_pgd, $acr ; PGD
 #endif
 	; Look up PMD in PGD
 	lsrq	24, $r0	; Get PMD index into PGD (bit 24-31)
diff --git a/arch/ia64/include/asm/percpu.h b/arch/ia64/include/asm/percpu.h
--- a/arch/ia64/include/asm/percpu.h
+++ b/arch/ia64/include/asm/percpu.h
@@ -9,7 +9,7 @@
 #define PERCPU_ENOUGH_ROOM PERCPU_PAGE_SIZE
 
 #ifdef __ASSEMBLY__
-# define THIS_CPU(var)	(per_cpu__##var)  /* use this to mark accesses to per-CPU variables... */
+# define THIS_CPU(var)	(var)  /* use this to mark accesses to per-CPU variables... */
 #else /* !__ASSEMBLY__ */
 
 
@@ -39,7 +39,7 @@ extern void *per_cpu_init(void);
  * On the positive side, using __ia64_per_cpu_var() instead of __get_cpu_var() is slightly
  * more efficient.
  */
-#define __ia64_per_cpu_var(var)	per_cpu__##var
+#define __ia64_per_cpu_var(var)	var
 
 #include <asm-generic/percpu.h>
 
diff --git a/arch/ia64/kernel/ia64_ksyms.c b/arch/ia64/kernel/ia64_ksyms.c
--- a/arch/ia64/kernel/ia64_ksyms.c
+++ b/arch/ia64/kernel/ia64_ksyms.c
@@ -29,9 +29,9 @@ EXPORT_SYMBOL(max_low_pfn);	/* defined b
 #endif
 
 #include <asm/processor.h>
-EXPORT_SYMBOL(per_cpu__cpu_info);
+EXPORT_SYMBOL(cpu_info);
 #ifdef CONFIG_SMP
-EXPORT_SYMBOL(per_cpu__local_per_cpu_offset);
+EXPORT_SYMBOL(local_per_cpu_offset);
 #endif
 
 #include <asm/uaccess.h>
diff --git a/arch/ia64/mm/discontig.c b/arch/ia64/mm/discontig.c
--- a/arch/ia64/mm/discontig.c
+++ b/arch/ia64/mm/discontig.c
@@ -360,7 +360,7 @@ static void __init initialize_pernode_da
 		cpu = 0;
 		node = node_cpuid[cpu].nid;
 		cpu0_cpu_info = (struct cpuinfo_ia64 *)(__phys_per_cpu_start +
-			((char *)&per_cpu__cpu_info - __per_cpu_start));
+			((char *)&cpu_info - __per_cpu_start));
 		cpu0_cpu_info->node_data = mem_data[node].node_data;
 	}
 #endif /* CONFIG_SMP */
diff --git a/arch/parisc/lib/fixup.S b/arch/parisc/lib/fixup.S
--- a/arch/parisc/lib/fixup.S
+++ b/arch/parisc/lib/fixup.S
@@ -36,8 +36,8 @@
 #endif
 	/* t2 = &__per_cpu_offset[smp_processor_id()]; */
 	LDREGX \t2(\t1),\t2 
-	addil LT%per_cpu__exception_data,%r27
-	LDREG RT%per_cpu__exception_data(%r1),\t1
+	addil LT%exception_data,%r27
+	LDREG RT%exception_data(%r1),\t1
 	/* t1 = &__get_cpu_var(exception_data) */
 	add,l \t1,\t2,\t1
 	/* t1 = t1->fault_ip */
@@ -46,8 +46,8 @@
 #else
 	.macro  get_fault_ip t1 t2
 	/* t1 = &__get_cpu_var(exception_data) */
-	addil LT%per_cpu__exception_data,%r27
-	LDREG RT%per_cpu__exception_data(%r1),\t2
+	addil LT%exception_data,%r27
+	LDREG RT%exception_data(%r1),\t2
 	/* t1 = t2->fault_ip */
 	LDREG EXCDATA_IP(\t2), \t1
 	.endm
diff --git a/arch/powerpc/platforms/pseries/hvCall.S b/arch/powerpc/platforms/pseries/hvCall.S
--- a/arch/powerpc/platforms/pseries/hvCall.S
+++ b/arch/powerpc/platforms/pseries/hvCall.S
@@ -55,7 +55,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_PURR);				
 	/* calculate address of stat structure r4 = opcode */	\
 	srdi	r4,r4,2;		/* index into array */	\
 	mulli	r4,r4,HCALL_STAT_SIZE;				\
-	LOAD_REG_ADDR(r7, per_cpu__hcall_stats);		\
+	LOAD_REG_ADDR(r7, hcall_stats);				\
 	add	r4,r4,r7;					\
 	ld	r7,PACA_DATA_OFFSET(r13); /* per cpu offset */	\
 	add	r4,r4,r7;					\
diff --git a/arch/sparc/kernel/rtrap_64.S b/arch/sparc/kernel/rtrap_64.S
--- a/arch/sparc/kernel/rtrap_64.S
+++ b/arch/sparc/kernel/rtrap_64.S
@@ -149,11 +149,11 @@ rtrap_irq:
 rtrap_irq:
 rtrap:
 #ifndef CONFIG_SMP
-		sethi			%hi(per_cpu____cpu_data), %l0
-		lduw			[%l0 + %lo(per_cpu____cpu_data)], %l1
+		sethi			%hi(__cpu_data), %l0
+		lduw			[%l0 + %lo(__cpu_data)], %l1
 #else
-		sethi			%hi(per_cpu____cpu_data), %l0
-		or			%l0, %lo(per_cpu____cpu_data), %l0
+		sethi			%hi(__cpu_data), %l0
+		or			%l0, %lo(__cpu_data), %l0
 		lduw			[%l0 + %g5], %l1
 #endif
 		cmp			%l1, 0
diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -66,13 +66,13 @@ DECLARE_PER_CPU(struct x8664_pda, pda);
  */
 #ifdef CONFIG_SMP
 #define PER_CPU(var, reg)				\
-	movl %fs:per_cpu__##this_cpu_off, reg;		\
-	lea per_cpu__##var(reg), reg
-#define PER_CPU_VAR(var)	%fs:per_cpu__##var
+	movl %fs:this_cpu_off, reg;			\
+	lea var(reg), reg
+#define PER_CPU_VAR(var)	%fs:var
 #else /* ! SMP */
 #define PER_CPU(var, reg)			\
-	movl $per_cpu__##var, reg
-#define PER_CPU_VAR(var)	per_cpu__##var
+	movl $var, reg
+#define PER_CPU_VAR(var)	var
 #endif	/* SMP */
 
 #else /* ...!ASSEMBLY */
@@ -162,11 +162,11 @@ do {							\
 	ret__;						\
 })
 
-#define x86_read_percpu(var) percpu_from_op("mov", per_cpu__##var)
-#define x86_write_percpu(var, val) percpu_to_op("mov", per_cpu__##var, val)
-#define x86_add_percpu(var, val) percpu_to_op("add", per_cpu__##var, val)
-#define x86_sub_percpu(var, val) percpu_to_op("sub", per_cpu__##var, val)
-#define x86_or_percpu(var, val) percpu_to_op("or", per_cpu__##var, val)
+#define x86_read_percpu(var) percpu_from_op("mov", var)
+#define x86_write_percpu(var, val) percpu_to_op("mov", var, val)
+#define x86_add_percpu(var, val) percpu_to_op("add", var, val)
+#define x86_sub_percpu(var, val) percpu_to_op("sub", var, val)
+#define x86_or_percpu(var, val) percpu_to_op("or", var, val)
 #endif /* !__ASSEMBLY__ */
 #endif /* !CONFIG_X86_64 */
 
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -1073,9 +1073,9 @@ ENTRY(\sym)
 	movq %rsp,%rdi		/* pt_regs pointer */
 	xorl %esi,%esi		/* no error code */
 	movq %gs:pda_data_offset, %rbp
-	subq $EXCEPTION_STKSZ, per_cpu__init_tss + TSS_ist + (\ist - 1) * 8(%rbp)
+	subq $EXCEPTION_STKSZ, init_tss + TSS_ist + (\ist - 1) * 8(%rbp)
 	call \do_sym
-	addq $EXCEPTION_STKSZ, per_cpu__init_tss + TSS_ist + (\ist - 1) * 8(%rbp)
+	addq $EXCEPTION_STKSZ, init_tss + TSS_ist + (\ist - 1) * 8(%rbp)
 	jmp paranoid_exit	/* %ebx: no swapgs flag */
 	CFI_ENDPROC
 END(\sym)
diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
--- a/arch/x86/kernel/head_32.S
+++ b/arch/x86/kernel/head_32.S
@@ -702,7 +702,7 @@ idt_descr:
 	.word 0				# 32 bit align gdt_desc.address
 ENTRY(early_gdt_descr)
 	.word GDT_ENTRIES*8-1
-	.long per_cpu__gdt_page		/* Overwritten for secondary CPUs */
+	.long gdt_page			/* Overwritten for secondary CPUs */
 
 /*
  * The boot_gdt must mirror the equivalent in setup.S and is
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -401,7 +401,7 @@ NEXT_PAGE(level2_spare_pgt)
 	.globl early_gdt_descr
 early_gdt_descr:
 	.word	GDT_ENTRIES*8-1
-	.quad   per_cpu__gdt_page
+	.quad   gdt_page
 
 ENTRY(phys_base)
 	/* This must match the first entry in level2_kernel_pgt */
diff --git a/arch/x86/xen/xen-asm_32.S b/arch/x86/xen/xen-asm_32.S
--- a/arch/x86/xen/xen-asm_32.S
+++ b/arch/x86/xen/xen-asm_32.S
@@ -164,9 +164,9 @@ ENTRY(xen_iret)
 	GET_THREAD_INFO(%eax)
 	movl TI_cpu(%eax),%eax
 	movl __per_cpu_offset(,%eax,4),%eax
-	mov per_cpu__xen_vcpu(%eax),%eax
+	mov xen_vcpu(%eax),%eax
 #else
-	movl per_cpu__xen_vcpu, %eax
+	movl xen_vcpu, %eax
 #endif
 
 	/* check IF state we're restoring */
diff --git a/include/asm-generic/percpu.h b/include/asm-generic/percpu.h
--- a/include/asm-generic/percpu.h
+++ b/include/asm-generic/percpu.h
@@ -7,7 +7,7 @@
  * Determine the real variable name from the name visible in the
  * kernel sources.
  */
-#define per_cpu_var(var) per_cpu__##var
+#define per_cpu_var(var) var
 
 #ifdef CONFIG_SMP
 
diff --git a/include/linux/percpu.h b/include/linux/percpu.h
--- a/include/linux/percpu.h
+++ b/include/linux/percpu.h
@@ -11,7 +11,7 @@
 #ifdef CONFIG_SMP
 #define DEFINE_PER_CPU(type, name)					\
 	__attribute__((__section__(".data.percpu")))			\
-	PER_CPU_ATTRIBUTES __typeof__(type) per_cpu__##name
+	PER_CPU_ATTRIBUTES __typeof__(type) name
 
 #ifdef MODULE
 #define SHARED_ALIGNED_SECTION ".data.percpu"
@@ -21,15 +21,15 @@
 
 #define DEFINE_PER_CPU_SHARED_ALIGNED(type, name)			\
 	__attribute__((__section__(SHARED_ALIGNED_SECTION)))		\
-	PER_CPU_ATTRIBUTES __typeof__(type) per_cpu__##name		\
+	PER_CPU_ATTRIBUTES __typeof__(type) name			\
 	____cacheline_aligned_in_smp
 
 #define DEFINE_PER_CPU_PAGE_ALIGNED(type, name)			\
 	__attribute__((__section__(".data.percpu.page_aligned")))	\
-	PER_CPU_ATTRIBUTES __typeof__(type) per_cpu__##name
+	PER_CPU_ATTRIBUTES __typeof__(type) name
 #else
 #define DEFINE_PER_CPU(type, name)					\
-	PER_CPU_ATTRIBUTES __typeof__(type) per_cpu__##name
+	PER_CPU_ATTRIBUTES __typeof__(type) name
 
 #define DEFINE_PER_CPU_SHARED_ALIGNED(type, name)		      \
 	DEFINE_PER_CPU(type, name)
@@ -38,8 +38,8 @@
 	DEFINE_PER_CPU(type, name)
 #endif
 
-#define EXPORT_PER_CPU_SYMBOL(var) EXPORT_SYMBOL(per_cpu__##var)
-#define EXPORT_PER_CPU_SYMBOL_GPL(var) EXPORT_SYMBOL_GPL(per_cpu__##var)
+#define EXPORT_PER_CPU_SYMBOL(var) EXPORT_SYMBOL(var)
+#define EXPORT_PER_CPU_SYMBOL_GPL(var) EXPORT_SYMBOL_GPL(var)
 
 #ifndef PERCPU_ENOUGH_ROOM
 extern unsigned int percpu_reserve;
alloc_percpu: use __percpu annotation for sparse.

Add __percpu for sparse.

We have to make __kernel "__attribute__((address_space(0)))" so we can
cast to it.
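
What sparse then catches (an illustrative fragment):

	struct foo { int x; };
	struct foo __percpu *p = alloc_percpu(struct foo);

	p->x = 1;			/* sparse: dereference of noderef expression */
	per_cpu_ptr(p, cpu)->x = 1;	/* ok: RELOC_PERCPU casts __percpu away */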

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Al Viro <viro@zeniv.linux.org.uk>
---
 include/asm-generic/percpu.h |   19 ++++++++++++-------
 include/linux/compiler.h     |    4 +++-
 include/linux/percpu.h       |    8 ++++----
 3 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/include/asm-generic/percpu.h b/include/asm-generic/percpu.h
--- a/include/asm-generic/percpu.h
+++ b/include/asm-generic/percpu.h
@@ -45,7 +45,9 @@ extern unsigned long __per_cpu_offset[NR
  * Only S390 provides its own means of moving the pointer.
  */
 #ifndef SHIFT_PERCPU_PTR
-#define SHIFT_PERCPU_PTR(__p, __offset)	RELOC_HIDE((__p), (__offset))
+/* Weird cast keeps both GCC and sparse happy. */
+#define SHIFT_PERCPU_PTR(__p, __offset)	\
+	((typeof(*__p) __kernel __force *)RELOC_HIDE((__p), (__offset)))
 #endif
 
 /*
@@ -61,16 +63,19 @@ extern unsigned long __per_cpu_offset[NR
 	(*SHIFT_PERCPU_PTR(&per_cpu_var(var), __my_cpu_offset))
 
 /* Use RELOC_HIDE: some arch's SHIFT_PERCPU_PTR really want an identifier. */
+#define RELOC_PERCPU(addr, off) \
+	((typeof(*addr) __kernel __force *)RELOC_HIDE((addr), (off)))
+
 /**
  * per_cpu_ptr - get a pointer to a particular cpu's allocated memory
- * @ptr: the pointer returned from alloc_percpu
+ * @ptr: the pointer returned from alloc_percpu, or &per-cpu var
  * @cpu: the cpu whose memory you want to access
  *
  * Similar to per_cpu(), except for dynamic memory.
  * cpu_possible(@cpu) must be true.
  */
 #define per_cpu_ptr(ptr, cpu) \
-	RELOC_HIDE((ptr), (per_cpu_offset(cpu)))
+	RELOC_PERCPU((ptr), (per_cpu_offset(cpu)))
 
 /**
  * __get_cpu_ptr - get a pointer to this cpu's allocated memory
@@ -78,8 +83,8 @@ extern unsigned long __per_cpu_offset[NR
  *
  * Similar to __get_cpu_var(), except for dynamic memory.
  */
-#define __get_cpu_ptr(ptr) RELOC_HIDE(ptr, my_cpu_offset)
-#define __raw_get_cpu_ptr(ptr) RELOC_HIDE(ptr, __my_cpu_offset)
+#define __get_cpu_ptr(ptr) RELOC_PERCPU(ptr, my_cpu_offset)
+#define __raw_get_cpu_ptr(ptr) RELOC_PERCPU(ptr, __my_cpu_offset)
 
 #ifdef CONFIG_HAVE_SETUP_PER_CPU_AREA
 extern void setup_per_cpu_areas(void);
@@ -100,7 +105,7 @@ extern void setup_per_cpu_areas(void);
 #define PER_CPU_ATTRIBUTES
 #endif
 
-#define DECLARE_PER_CPU(type, name) extern PER_CPU_ATTRIBUTES \
-					__typeof__(type) per_cpu_var(name)
+#define DECLARE_PER_CPU(type, name) \
+	extern PER_CPU_ATTRIBUTES __percpu __typeof__(type) per_cpu_var(name)
 
 #endif /* _ASM_GENERIC_PERCPU_H_ */
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -5,7 +5,7 @@
 
 #ifdef __CHECKER__
 # define __user		__attribute__((noderef, address_space(1)))
-# define __kernel	/* default address space */
+# define __kernel	__attribute__((address_space(0)))
 # define __safe		__attribute__((safe))
 # define __force	__attribute__((force))
 # define __nocast	__attribute__((nocast))
@@ -15,6 +15,7 @@
 # define __acquire(x)	__context__(x,1)
 # define __release(x)	__context__(x,-1)
 # define __cond_lock(x,c)	((c) ? ({ __acquire(x); 1; }) : 0)
+# define __percpu	__attribute__((noderef, address_space(3)))
 extern void __chk_user_ptr(const volatile void __user *);
 extern void __chk_io_ptr(const volatile void __iomem *);
 #else
@@ -32,6 +33,7 @@ extern void __chk_io_ptr(const volatile 
 # define __acquire(x) (void)0
 # define __release(x) (void)0
 # define __cond_lock(x,c) (c)
+# define __percpu
 #endif
 
 #ifdef __KERNEL__
diff --git a/include/linux/percpu.h b/include/linux/percpu.h
--- a/include/linux/percpu.h
+++ b/include/linux/percpu.h
@@ -11,7 +11,7 @@
 #ifdef CONFIG_SMP
 #define DEFINE_PER_CPU(type, name)					\
 	__attribute__((__section__(".data.percpu")))			\
-	PER_CPU_ATTRIBUTES __typeof__(type) name
+	PER_CPU_ATTRIBUTES __typeof__(type) __percpu name
 
 #ifdef MODULE
 #define SHARED_ALIGNED_SECTION ".data.percpu"
@@ -21,15 +21,15 @@
 
 #define DEFINE_PER_CPU_SHARED_ALIGNED(type, name)			\
 	__attribute__((__section__(SHARED_ALIGNED_SECTION)))		\
-	PER_CPU_ATTRIBUTES __typeof__(type) name			\
+	PER_CPU_ATTRIBUTES __typeof__(type) __percpu name		\
 	____cacheline_aligned_in_smp
 
 #define DEFINE_PER_CPU_PAGE_ALIGNED(type, name)			\
 	__attribute__((__section__(".data.percpu.page_aligned")))	\
-	PER_CPU_ATTRIBUTES __typeof__(type) name
+	PER_CPU_ATTRIBUTES __typeof__(type) __percpu name
 #else
 #define DEFINE_PER_CPU(type, name)					\
-	PER_CPU_ATTRIBUTES __typeof__(type) name
+	PER_CPU_ATTRIBUTES __typeof__(type) __percpu name
 
 #define DEFINE_PER_CPU_SHARED_ALIGNED(type, name)		      \
 	DEFINE_PER_CPU(type, name)


^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2009-06-18  4:11 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-06-05 19:18 [this_cpu_xx 00/11] Introduce this_cpu_xx operations cl
2009-06-05 19:18 ` [this_cpu_xx 01/11] Introduce this_cpu_ptr() and generic this_cpu_* operations cl
2009-06-10  5:12   ` Tejun Heo
2009-06-11 15:10     ` Christoph Lameter
2009-06-12  2:09       ` Tejun Heo
2009-06-12 14:18         ` Christoph Lameter
2009-06-17  8:09           ` Tejun Heo
2009-06-17  8:19   ` Tejun Heo
2009-06-17 18:41     ` Christoph Lameter
2009-06-18  1:08       ` Tejun Heo
2009-06-18  3:01       ` Rusty Russell
2009-06-05 19:18 ` [this_cpu_xx 02/11] Use this_cpu operations for SNMP statistics cl
2009-06-05 19:18 ` [this_cpu_xx 03/11] Use this_cpu operations for NFS statistics cl
2009-06-05 19:18 ` [this_cpu_xx 04/11] Use this_cpu ops for network statistics cl
2009-06-08 11:27   ` Robin Holt
2009-06-08 20:49     ` Christoph Lameter
2009-06-08 20:54       ` Ingo Molnar
2009-06-05 19:18 ` [this_cpu_xx 05/11] this_cpu_ptr: Straight transformations cl
2009-06-05 19:18 ` [this_cpu_xx 06/11] Eliminate get/put_cpu cl
2009-06-05 19:34   ` Dan Williams
2009-06-09 14:02     ` Sosnowski, Maciej
2009-06-05 19:18 ` [this_cpu_xx 07/11] xfs_icsb_modify_counters does not need "cpu" variable cl
2009-06-05 19:22   ` Christoph Hellwig
2009-06-05 19:36     ` Christoph Lameter
2009-06-05 19:18 ` [this_cpu_xx 08/11] Use this_cpu_ptr in crypto subsystem cl
2009-06-05 19:18 ` [this_cpu_xx 09/11] X86 optimized this_cpu operations cl
2009-06-05 19:18 ` [this_cpu_xx 10/11] Use this_cpu ops for vm statistics cl
2009-06-05 19:18 ` [this_cpu_xx 11/11] RCU: Use this_cpu operations cl
2009-06-10 17:42   ` Paul E. McKenney
